Uncertainty in the context of AI can be difficult to grasp at first. At a high level, uncertainty means working with imperfect or incomplete information, but there are countless potential sources of uncertainty. Some, like missing information, unreliable information, conflicting information, noisy information, and confusing information, are especially challenging to address without a grasp of the causes. Even the best-trained AI systems can’t be right 100% of the time. And in the enterprise, stakeholders must find ways to estimate and measure uncertainty to the extent possible.
It turns out uncertainty isn’t necessarily a bad thing, provided it can be communicated clearly. Consider this example from machine learning engineer Dirk Elsinghorst: An AI is trained to classify animals on a safari to help safari-goers remain safe. The model trains with available data, giving animals a “risky” or “safe” classification. But because it never encounters a tiger during training, it classifies tigers as safe, drawing a comparison between the stripes on tigers and on zebras. If the model were able to communicate uncertainty, humans could intervene to alter the outcome.
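As a rough illustration of what communicating uncertainty could look like in the safari example, the sketch below defers to a human whenever the classifier’s top probability falls below a threshold. The function name, the 0.8 threshold, and the probabilities are all hypothetical, and the approach assumes the model’s probabilities are reasonably calibrated, which is not guaranteed for animals unlike anything seen in training.

```python
def classify_with_deferral(probs, threshold=0.8):
    """Return (decision, confidence); defer when the top probability is low.

    `probs` maps each label to a predicted probability. Both the label
    names and the threshold are illustrative assumptions.
    """
    label = max(probs, key=probs.get)   # most likely label
    confidence = probs[label]
    if confidence < threshold:
        return "needs human review", confidence
    return label, confidence


# Made-up model outputs for a familiar and an unfamiliar animal.
zebra_result = classify_with_deferral({"safe": 0.97, "risky": 0.03})
tiger_result = classify_with_deferral({"safe": 0.55, "risky": 0.45})
print(zebra_result)
print(tiger_result)
```

Here the zebra prediction passes through unchanged, while the hesitant tiger prediction is flagged for a person to check rather than reported as “safe.”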
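There are two common types of uncertainty in AI: aleatoric and epistemic. Aleatoric uncertainty accounts for chance, like differences in an environment and the skill levels of people capturing training data. Epistemic uncertainty is part of the model itself: it reflects what the model has not learned, whether because the training data is limited or because the model is too simple in design for the problem, and unlike aleatoric uncertainty it can in principle be reduced with more data or a better model.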
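One common way to separate the two types in practice, sketched here as an assumption rather than as a method the sources above describe, is to train several models and compare their predicted probabilities. The entropy of the averaged prediction is treated as total uncertainty, the average entropy of the individual predictions as the aleatoric part, and the difference (the disagreement between models) as the epistemic part. The function names and example probabilities are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty into (total, aleatoric, epistemic).

    `member_probs` holds one probability distribution per ensemble member,
    all for the same input.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[j] for m in member_probs) / n for j in range(k)]
    total = entropy(mean)                                  # entropy of the average
    aleatoric = sum(entropy(m) for m in member_probs) / n  # average entropy
    return total, aleatoric, total - aleatoric


# Members agree the outcome is a coin flip: the noise is in the data.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members are individually certain but contradict each other: the
# uncertainty comes from the models, not from the data.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
print(agree, disagree)
```

Both cases have the same total uncertainty, but only the second could be reduced by gathering more data or improving the models.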
Observations, or sample data, from a domain or environment often contain variability. Typically referred to as “noise,” variability can be due to natural causes or an error, and it impacts not only the measurements AI learns from but the predictions it makes.
In the case of a dataset used to train AI to predict species of flowers, for instance, noise could take the form of flowers that are larger or smaller than normal, or of typos made when recording the measurements of petals and stems.
Another source of uncertainty arises from incomplete coverage of a domain. In statistics, samples are randomly collected, and bias is to some extent unavoidable. Data scientists need to arrive at a level of variance and bias that ensures the data is representative of the task a model will be used for.
Extending the flower-classifying example, a developer might choose to measure the size of randomly selected flowers in a single garden. The scope is limited to one garden, which might not be representative of gardens in other cities, states, countries, or continents.
As Machine Learning Mastery’s Jason Brownlee writes: “There will always be some unobserved cases. There will be part of the problem domain for which we do not have coverage. No matter how well we encourage our models to generalize, we can only hope that we can cover the cases in the training dataset and the salient cases that are not.”
Yet another source of uncertainty is error. A model will always have some error, introduced during the data prep, training, or prediction stages. Error could refer to imperfect predictions or to omission, where details are left out or abstracted. Omission can be desirable: selecting simpler models, as opposed to models that are highly specialized to the training data, often helps a model generalize to new cases and perform better.
Given all the sources of uncertainty, how can it be managed, particularly in an enterprise environment? Probability and statistics can help reveal variability in noisy observations. They can also shed light on the scope of observations and quantify the variance in the performance of predictive models when applied to new data.
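To make the last point concrete, here is a minimal sketch of one standard statistical tool, the percentile bootstrap, used to put an interval around a model’s measured accuracy. The evaluation results (85 correct out of 100) are invented for illustration, and the function name and defaults are assumptions.

```python
import random


def bootstrap_accuracy_interval(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for accuracy.

    `correct` is a list of 1s (right) and 0s (wrong), one per test example.
    """
    rng = random.Random(seed)
    n = len(correct)
    accuracies = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and re-measure accuracy.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        accuracies.append(sum(sample) / n)
    accuracies.sort()
    lower = accuracies[int((alpha / 2) * n_resamples)]
    upper = accuracies[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up evaluation results: 85 correct predictions out of 100.
correct = [1] * 85 + [0] * 15
low, high = bootstrap_accuracy_interval(correct)
print(f"accuracy 0.85, approximate 95% interval: [{low:.2f}, {high:.2f}]")
```

Reporting the interval alongside the headline number shows how much the measured accuracy could shift with a different sample of test data.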
The fundamental problem is that models assume the data they’ll see in the future will look like the data they’ve seen in the past. Fortunately, several approaches can reliably “sample” a model to understand its overall confidence. Historically, these approaches have been slow, but researchers at MIT and elsewhere are devising new ways to estimate uncertainty from only one or a few runs of a model.
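The slower, sampling-based style of approach can be sketched with a bootstrap ensemble: fit many copies of a simple model on resampled data, then look at how much their predictions spread for a given input. This is an illustrative stand-in, not the MIT method mentioned above, and the synthetic data, function names, and parameters are all assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def ensemble_spread(xs, ys, x_new, n_models=200, seed=0):
    """Mean and standard deviation of bootstrap-ensemble predictions at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:      # skip degenerate resamples
            continue
        slope, intercept = fit_line(bx, [ys[i] for i in idx])
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic training data covering x from 0.0 to 4.9.
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.5) for x in xs]

_, spread_inside = ensemble_spread(xs, ys, x_new=2.5)    # like the training data
_, spread_outside = ensemble_spread(xs, ys, x_new=20.0)  # far outside it
print(spread_inside, spread_outside)
```

The ensemble members agree closely near the data they were trained on and diverge for the far-away input, which is the signal that the future no longer looks like the past. The cost is 200 model fits per estimate, which is why faster single-run methods are attractive.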
“We’re starting to see a lot more of these [neural network] models trickle out of the research lab and into the real world, into situations that are touching humans with potentially life-threatening consequences,” Alexander Amini, who recently presented research on a new method to estimate uncertainty in AI-assisted decision-making, said in a statement. “Any user of the method, whether it’s a doctor or a person in the passenger seat of a vehicle, needs to be aware of any risk or uncertainty associated with that decision.” He envisions the system not only quickly flagging uncertainty, but also using it to make more conservative decisions in risky scenarios, like when an autonomous vehicle approaches an intersection. “Any field that is going to have deployable machine learning ultimately needs to have reliable uncertainty awareness.”
Earlier this year, IBM open-sourced Uncertainty Quantification 360 (UQ360), a toolkit focused on enabling AI to understand and communicate its uncertainty. UQ360 offers a set of algorithms and a taxonomy to quantify uncertainty, as well as capabilities to measure and improve uncertainty quantification (UQ). For every UQ algorithm provided in the UQ360 Python package, a user can make a choice of an appropriate style of communication by following IBM’s guidance on communicating UQ estimates, from descriptions to visualizations.
“Common explainability techniques shed light on how AI works, but UQ exposes limits and potential failure points,” IBM research staff members Prasanna Sattigeri and Q. Vera Liao note in a blog post. “Users of a house price prediction model would like to know the margin of error of the model predictions to estimate their gains or losses. Similarly, a product manager may notice that an AI model predicts a new feature A will perform better than a new feature B on average, but to see its worst-case effects on KPIs, the manager would also need to know the margin of error in the predictions.”
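A margin of error like the one in the house price example can be estimated in several ways. The sketch below uses one simple option, assumed here for illustration and not taken from UQ360: take the absolute errors on held-out sales and pick the size that covers a chosen fraction of them, with a small-sample correction in the style of split conformal prediction. The prices (in thousands of dollars) and the function name are made up.

```python
import math


def margin_of_error(y_true, y_pred, coverage=0.9):
    """Margin m such that about `coverage` of held-out errors fall within +/- m."""
    residuals = sorted(abs(t - p) for t, p in zip(y_true, y_pred))
    n = len(residuals)
    # The (n + 1) factor is a finite-sample correction; cap the rank at n.
    rank = min(n, math.ceil((n + 1) * coverage))
    return residuals[rank - 1]


# Hypothetical held-out sales: actual vs. predicted price, in $1,000s.
actual = [310, 455, 289, 520, 402, 365, 298, 610, 445, 380]
predicted = [300, 470, 295, 500, 410, 350, 305, 640, 440, 395]

margin = margin_of_error(actual, predicted, coverage=0.9)
new_prediction = 420
print(f"predicted {new_prediction}, plausible range "
      f"{new_prediction - margin} to {new_prediction + margin}")
```

With only ten held-out examples the margin is driven by the single worst error, so a real deployment would want a much larger held-out set.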
In a recent study, Harvard University assistant professor Himabindu Lakkaraju found that showing uncertainty metrics to both people with a background in machine learning and non-experts had an equalizing effect on their resilience to AI predictions. While fostering trust in AI may never be as simple as providing metrics, awareness of the pitfalls could go some way toward protecting people from machine learning’s limitations, a critical aim in the business domain.