Confidence, uncertainty, and trust in AI affect how humans make decisions

In 2019, as the Department of Defense considered adopting AI ethics principles, the Defense Innovation Unit held a series of meetings across the U.S. to gather opinions from experts and the public. At one such meeting in Silicon Valley, Stanford University professor Herb Lin said he was concerned about people trusting AI too easily and argued that any application of AI should include a confidence score indicating the algorithm's degree of certainty.

"AI systems should not only be the best possible. Sometimes they should say 'I have no idea what I'm doing here, don't trust me.' That's going to be really important," he said.
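To make the idea concrete, here is a minimal sketch of a classifier that reports a confidence score and declines to answer when that score is low. It is not taken from any system Lin described. The function names, the labels, and the 0.8 threshold are all hypothetical, and the softmax probability used as the score is only a rough stand-in for certainty, since such scores are often poorly calibrated in practice.

```python
import math

def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(logits, labels, threshold=0.8):
    """Return (label, confidence), or (None, confidence) to abstain.

    The threshold is an illustrative assumption, not a recommended value.
    """
    probs = softmax(logits)
    confidence = max(probs)
    best = labels[probs.index(confidence)]
    if confidence < threshold:
        # The "don't trust me" case: hand the decision back to a person.
        return None, confidence
    return best, confidence

# Hypothetical raw scores for two cases: one clear, one ambiguous.
for scores in ([4.0, 0.0], [0.1, 0.0]):
    label, confidence = predict_with_confidence(scores, ["disease", "healthy"])
    answer = label if label is not None else "no answer, defer to a human"
    print(f"{answer} (confidence {confidence:.2f})")
```

In this sketch the second case falls below the threshold, so the system abstains rather than offering a near coin-flip guess as if it were a finding.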

The concern Lin raised is an important one: People can be manipulated by artificial intelligence, with cute robots a classic example of the human tendency to trust machines. Understanding how people alter their decision-making when presented with the output of an AI system is critical as the technology is adopted to augment human activity in a range of high-stakes settings -- from courts of law to hospitals and the battlefield.

Health, human trust, and machine learning

Health care applications of AI are among the fastest-growing of any sector of the economy. According to the State of AI report McKinsey published in late 2020, companies in health care, automotive, and manufacturing were most likely to report having increased investments in AI last year.

While the marriage of health care and AI can offer many benefits, the stakes don't get much higher than human health, and there are a number of obstacles to building robust, trustworthy systems. Diagnosis may be one of the most popular health care applications, but it's also susceptible to automation bias, which occurs when people become overreliant on answers generated by AI. A 2017 analysis of existing literature found numerous examples of automation bias in health care, typically involving diagnosis. Trusting AI systems may be an intractable human bias.

To consider how AI can manipulate human decision-making, some AI researchers are focused on understanding the degree to which people are influenced when AI predictions are matched with confidence or uncertainty metrics.

A year ago, a team at IBM Research performed experiments to assess how much showing people an AI prediction with a confidence score would impact their trust levels and overall accuracy when predicting a person's annual income. The study found that sharing a confidence score did increase human levels of trust. But the researchers had expected confidence scores to improve human decision-making, which did not turn out to be the case.

"The fact that showing confidence improved trust and trust calibration but failed to improve the AI-assisted accuracy is puzzling, and it rejects our hypothesis that showing AI confidence score improves accuracy of AI-assisted predictions," their paper reads.

Himabindu Lakkaraju is an assistant professor at Harvard University and is currently working with doctors in Boston at Brigham and Women's Hospital and Massachusetts General Hospital. She thinks the case of doctors using AI to determine diagnosis and treatment plans is a great application for uncertainty measurements.

"If your models are inaccurate, the risk that people might be over-trusting them is very real," Lakkaraju said. "We're trying to see how these kinds of tools can help first of all with how should we train doctors to read these kinds of measures, like posterior predictive distributions or other forms of uncertainty, how they can leverage this information, and if this kind of information will help them make better diagnosis decisions and treatment recommendation decisions."

Across a number of research projects over the past year, Lakkaraju and colleagues have considered how sharing information like the algorithm's level of uncertainty impacts human trust. The researchers have conducted experiments with a range of skilled workers, from health care workers and legal professionals to machine learning experts, and even people who know a lot about apartment rental rates.

AI and machine learning applications in health care

Machine learning has already infiltrated health care in a variety of ways, with both positive and negative results. The 2018 AI Index report notes that computer vision has advanced to the point that AI fueled with large amounts of training data can identify skin cancer as accurately as a panel of doctors. Some of the best-known machine learning systems in health care research diagnose disease or generate treatment plans. In early 2019, Google AI researchers working with Northwestern Medicine created an AI model capable of detecting lung cancer from screening tests better than human radiologists with an average of eight years of experience. Around the same time, MIT CSAIL and Massachusetts General Hospital published a paper in the journal Radiology about a system capable of predicting the onset of breast cancer five years in advance using mammography exam imagery.

COVID-19 triggered the creation of computer vision for recognizing the novel coronavirus in CT scans, though doctors do not yet consider this an acceptable form of diagnosis. However, the American College of Radiology is working on projects that use machine learning to detect COVID-19 from CT scans.

AI is also being integrated into health care through smart hospitals with a range of sensors and edge devices. In May 2020, Nvidia introduced Clara Guardian software that uses machine learning to monitor distances between people and assist health care workers with contactless patient monitoring. Nvidia is also working with hospitals on federated learning applications to combine insights from multiple datasets while preserving privacy and to reduce the amount of data needed to train generative networks.

Despite these advances, experts have identified serious risks associated with putting too much trust in predictions made by these types of AI models. A study published in Science last fall that assessed an algorithm used by U.S. hospitals found that millions of Black patients received a lower standard of care than white patients. More recently, an algorithm Stanford Medicine used to dispense COVID-19 vaccines prioritized some hospital administration executives over residents working directly with patients.

And a Google Health study introduced in April 2020 in partnership with the Ministry of Public Health in Thailand shows just how inaccurate AI can be when taken from a lab setting into the real world.

As an example of the opportunities AI offers, deep learning can extend screening tools to people who do not have access to a specialist. There are fewer than 2,000 medical specialists in Thailand trained to carry out diabetic retinopathy screenings and an estimated 4.5 million patients with diabetes. Because of this shortage of experts, the traditional diabetic retinopathy screening approach in Thailand can take up to 10 weeks, and AI researchers sought to improve both speed and scale. Results of the study carried out at 11 Thai clinics in 2018 and 2019 were published in a paper accepted for the ACM Human Factors in Computing Systems conference in April 2020.

While the program had some success, it also revealed hurdles associated with moving from the lab to a clinical setting. Google's AI was intended to deliver results in 10 minutes with more than 90% accuracy, but the real-world performance came up short. Analysis of model predictions carried out in the first six months in nearly a dozen clinics in Thailand found that 21% of images failed to meet the quality standards of the model, which had been trained with high-quality imagery in a lab. The study also found that socio-environmental factors like the need to quickly screen patients for a number of diseases and inadequate lighting in clinics can negatively impact model performance.

Black box AI

There's still hope that machine learning can help doctors accurately diagnose diseases faster or screen patients who don't have access to health care specialists. But the problem of flawed systems influencing human decision-making is amplified by black box algorithms whose results defy explanation. A 2019 study from Stanford University researchers suggests that "the fragility of neural network interpretation could be a much broader problem."

Before studying how uncertainty can influence the way doctors make decisions, in 2019 Lakkaraju and University of Pennsylvania research assistant Osbert Bastani created an AI system designed to mislead people.

For the experiment, researchers purposely created an untrustworthy AI system for bail decisions. They made the system after surveying law students who knew how pretrial bail hearings work to identify traits they associate with untrustworthy bail algorithms. The idea was for the system to hide the fact that it was basing decisions on untrustworthy traits.

The students agreed that race and gender are two of the least trustworthy metrics you can use when creating a bail risk assessment algorithm, so the untrustworthy system didn't use those as stated reasons for a recommendation. However, in the United States, a history of segregation and racist housing policy means zip codes can often serve as a substitute for race.

"We find that user trust can be manipulated by high-fidelity, misleading explanations. These misleading explanations exist since prohibited features (e.g., race or gender) can be reconstructed based on correlated features (e.g., zip code). Thus, adversarial actors can fool end users into trusting an untrustworthy black box [system] -- e.g., one that employs prohibited attributes to make decisions," the study reads.
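The proxy effect the researchers describe can be illustrated with a small sketch on synthetic data. This is not the method from the study. The zip codes, the two groups labeled "A" and "B", and the assumption that 90% of each group lives in its own pair of zip codes are all invented for illustration, and a simple majority vote per zip code stands in for whatever a real model might learn.

```python
import random
from collections import Counter, defaultdict

def make_synthetic_records(n=2000, seed=0):
    """Build (zip_code, group) pairs under an assumed 90% segregation rate."""
    rng = random.Random(seed)
    zips = {"A": ["10001", "10002"], "B": ["20001", "20002"]}
    records = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        other = "B" if group == "A" else "A"
        # Hypothetical assumption: 90% of people live in their group's zip codes.
        pool = zips[group] if rng.random() < 0.9 else zips[other]
        records.append((rng.choice(pool), group))
    return records

def fit_proxy(records):
    """Map each zip code to the group that appears there most often."""
    counts = defaultdict(Counter)
    for zip_code, group in records:
        counts[zip_code][group] += 1
    return {z: c.most_common(1)[0][0] for z, c in counts.items()}

def proxy_accuracy(records, proxy):
    """Fraction of records whose group is recovered from zip code alone."""
    hits = sum(1 for zip_code, group in records if proxy[zip_code] == group)
    return hits / len(records)

records = make_synthetic_records()
proxy = fit_proxy(records)
accuracy = proxy_accuracy(records, proxy)
print(f"Group recovered from zip code alone: {accuracy:.0%}")
```

Because the segregation rate is built into the synthetic data, the recovery rate lands near 90%. The point is only that a system can avoid naming a prohibited attribute while still relying on a feature that encodes it.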

The experiment confirmed the researchers' hypothesis and showed how easily humans can be manipulated by black box algorithms.

"Our results demonstrate that the misleading explanations generated using our approach can in fact increase user trust by 9.8 times. Our findings have far-reaching implications both for research on machine learning interpretability and real-world applications of ML."

Final thoughts

To make AI systems more accurate, Microsoft Research and others say professionals in fields like health care should become part of the machine learning development process. In an interview with VentureBeat in 2019, Microsoft Business AI VP Gurdeep Pall called working with human professionals in different fields the next frontier of machine learning.

But regardless of how AI is trained, studies have shown people from any profession, level of education, or background can suffer from automation bias, and black box systems only exacerbate the situation.

As a potential solution to people's willingness to trust black box deep learning systems, in 2019 Lakkaraju introduced Model Understanding through Subspace Explanations (MUSE), a framework for interpretable AI that allows people to ask natural language questions about a model. In a small study where MUSE was used to interpret the activity of black box algorithms, participants were more likely to prefer explanations from MUSE over those provided by other popular frameworks for the same purpose. Another study centered on critical AI systems and human decision-making also found interpretability to be important, calling for the use of a rapid test calibration process so people can better trust results.

A study Lakkaraju conducted with apartment rental listings last fall demonstrated that people with a background in machine learning had an advantage over non-experts when it came to understanding uncertainty curves, but that showing uncertainty scores to both groups had an equalizing effect on their resilience to AI predictions.

Research by Lakkaraju and others is only beginning to answer questions about the way explanations about confidence or uncertainty scores affect people's trust in AI. While the solution may never be as simple as choosing between showing uncertainty measures or confidence metrics, awareness of the pitfalls could go some way toward helping protect humans from machine learning's limitations. Just as we judge other people before deciding how much to trust them, it seems only proper to place more trust in an AI system that has been created, maintained, and protected in ways scientifically proven to improve outcomes for people.