Explainable AI could reduce the impact of biased algorithms

On May 25, 2018, the General Data Protection Regulation (GDPR) comes into effect across the EU, requiring sweeping changes to how organizations handle personal data. And GDPR standards have real teeth: For the most serious violations, organizations face a penalty of up to €20 million or 4 percent of global annual revenue, whichever is greater.

With the Cambridge Analytica scandal fresh on people's minds, many hope that GDPR will become a model for a new standard of data privacy around the world. We've already heard some industry leaders calling for Facebook to apply GDPR standards to its business in non-EU countries, even though the law doesn't require it.

But privacy is only one aspect of the debate around the use of data-driven systems. The increasing prevalence of machine learning-enabled systems introduces a host of issues, including one with an impact on society that could be huge but remains unquantifiable: bias.

We generally expect computers to be more objective and unbiased than humans. However, the past several years have seen multiple controversies over ML-enabled systems yielding biased or discriminatory results. In 2016, for example, ProPublica reported that ML algorithms that U.S. courts use to gauge defendants' likelihood of recidivism were more likely to label black defendants as high risk than white defendants from similar backgrounds. This was true even though the system wasn't explicitly fed any data on defendants' race. The question is whether the net effect of ML-enabled systems is to make the world fairer and more efficient or to amplify human biases to superhuman scale.

Many important decisions in our lives are made by systems of some kind, whether those systems consist of people, machines, or a combination. Many of these existing systems are biased in both obvious and subtle ways. The increasing role of ML in decision-making systems, from banking to bail, gives us an opportunity to build better, less biased systems, but it also carries the risk of reinforcing these problems. That's in part why GDPR recognizes what could be considered a "right to explanation" for all citizens -- meaning that users can demand an explanation for any "legal or similarly significant" decisions made by machines. There is hope that the right to explanation will give the victims of "discrimination-by-algorithm" recourse to human authorities, thereby mitigating the effect of such biases.

But generating those types of explanations -- that is, creating explainable AI -- is complicated. Even where such explanations exist, some critics claim it's unclear whether they counter bias or merely mask it.

So will explainable AI -- and, by extension, GDPR -- make technology fairer? And if not, what alternatives do we have to safeguard against bias as the use of ML becomes more widespread?

Machines learn discrimination

Discussions of bias are often oversimplified to terms like "racist algorithms." But the problem isn't the algorithms themselves; it's the data that research teams feed them. For example, collecting data from the past is a common starting point for data science projects -- but "[historical] data is often biased in ways that we don't want to transfer to the future," says Joey Gonzalez, assistant professor in the Department of Electrical Engineering and Computer Science at the University of California at Berkeley and a founding member of UC Berkeley's RISE Lab.

For example, let's say a company builds a model that decides which job applicants its recruiters should invite to interview, and trains it on a dataset that includes the resumes of all applicants the company has invited to interview for similar positions in the past. If the company's HR staff have historically rejected applications from former stay-at-home parents attempting to return to the workforce -- an unfortunately common practice -- training may produce a model that excludes job applicants who have long employment gaps. That model would then disproportionately reject women (who still make up the majority of stay-at-home parents), even if gender isn't one of the characteristics in its training dataset. The ML-enabled system thus ends up amplifying existing human historical bias.
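
The hiring scenario above can be sketched in a few lines of Python. This is a toy illustration, not a real recruiting model: the applicant data is synthetic, every probability is an invented assumption, and the names `make_applicants`, `train_gap_model`, and `invite_rate_by_gender` are hypothetical. The "model" never sees gender, yet its decisions still differ by gender because employment gaps act as a proxy.

```python
import random

def make_applicants(n, seed=0):
    """Generate synthetic applicants. All probabilities are invented for illustration."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        gender = rng.choice(["woman", "man"])
        # Assumption: women are more likely to have a long employment gap.
        gap_prob = 0.30 if gender == "woman" else 0.05
        has_gap = rng.random() < gap_prob
        # Assumption: past recruiters rarely invited anyone with a gap,
        # regardless of gender.
        invite_prob = 0.05 if has_gap else 0.60
        invited = rng.random() < invite_prob
        applicants.append({"gender": gender, "has_gap": has_gap, "invited": invited})
    return applicants

def train_gap_model(history):
    """'Train' on has_gap only: learn the historical invite rate for each value."""
    rates = {}
    for value in (True, False):
        group = [a for a in history if a["has_gap"] == value]
        rates[value] = sum(a["invited"] for a in group) / len(group)
    # Recommend an invite when the learned historical rate is at least 50 percent.
    return lambda applicant: rates[applicant["has_gap"]] >= 0.5

def invite_rate_by_gender(model, applicants):
    """Share of each gender the model recommends; gender is used only for auditing."""
    result = {}
    for gender in ("woman", "man"):
        group = [a for a in applicants if a["gender"] == gender]
        result[gender] = sum(model(a) for a in group) / len(group)
    return result

history = make_applicants(5000, seed=0)
model = train_gap_model(history)
new_applicants = make_applicants(5000, seed=1)
rates = invite_rate_by_gender(model, new_applicants)
print(rates)
```

Under these assumptions, the printed invite rate for women comes out lower than for men, even though the model's only input is `has_gap`. Removing a sensitive attribute from the training data does not remove its influence when another feature is correlated with it.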

This is where explainable AI could come in. If human operators could check in on the "reasoning" an algorithm used to make decisions about members of high-risk groups, they might be able to correct for bias before it has a serious impact.

Making machines explain themselves

Since the behavior of an ML system is fueled by the data it learned from, it works differently from a standard computer program where humans explicitly write every line of code. Humans can measure the accuracy of an ML-enabled system, but visibility into how such a system actually makes decisions is limited. Think of it as analogous to a human brain. We generally know that human brains think due to the complex firing of neurons across specific areas, but we don't know exactly how that relates to particular decisions. That's why when we want to know why a human being made a decision, we don't look inside their head -- we ask them to justify their decision based on their experience or the data at hand.

Explainable AI asks ML algorithms to justify their decision-making in a similar way. For example, in 2016 researchers from the University of Washington built an explanation technique called LIME that they tested on the Inception Network, a popular image classification neural net built by Google. Instead of looking at which of the Inception Network's "neurons" fire when it makes an image classification decision, LIME searches for an explanation in the image itself. It blacks out different parts of the original image and feeds the resulting "perturbed" images back through Inception, checking to see which perturbations throw the algorithm off the furthest.

By doing this, LIME can attribute the Inception Network's classification decision to specific features of the original picture. For example, for an image of a tree frog, LIME found that erasing parts of the frog's face made it much harder for the Inception Network to identify the image, showing that much of the original classification decision was based on the frog's face.
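
The perturb-and-compare idea described above can be sketched as follows. This is a simplified occlusion test, not the LIME library or its full method (LIME also fits a local surrogate model over many random perturbations). The 4x4 "image" and `toy_classifier` are hypothetical stand-ins for a real picture and a real network such as Inception; the classifier is deliberately rigged to look only at the top-left corner, which plays the role of the frog's face.

```python
def toy_classifier(image):
    """Hypothetical stand-in for a real model: returns a score in [0, 1].

    It only looks at the top-left 2x2 block of a 4x4 image (the 'face').
    """
    face = [image[r][c] for r in range(2) for c in range(2)]
    return sum(face) / len(face)

def occlude(image, top, left, size):
    """Return a copy of the image with a size x size patch set to 0 (black)."""
    copy = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            copy[r][c] = 0.0
    return copy

def patch_importance(image, classifier, size=2):
    """Black out each patch in turn and record how far the score drops."""
    baseline = classifier(image)
    drops = {}
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            perturbed = occlude(image, top, left, size)
            drops[(top, left)] = baseline - classifier(perturbed)
    return drops

# A uniformly bright 4x4 image, so every patch starts out identical.
image = [[1.0] * 4 for _ in range(4)]
drops = patch_importance(image, toy_classifier)
most_important = max(drops, key=drops.get)
print(most_important, drops)
```

Here the patch at `(0, 0)` produces the largest drop, because that is the only region the toy classifier reads. With a real model the same loop would reveal which regions of the picture the classification actually depends on.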

Feature attribution methods like LIME don't fully explain an algorithm's decisions, and they don't work equally well for every type of ML model. However, at least where image classification is concerned, they're a step in the right direction. Image classification is one of the most popular tasks for cutting-edge ML research. Algorithms for solving this task have been embroiled in controversies over bias before. In 2015, a black software developer reported that Google Photos labeled images of him and his black friend as "gorillas." It's not hard to see how explanation techniques like LIME could mitigate this kind of bias: The human operator of an image classification algorithm could override classification decisions for which the algorithm's "explanation" didn't pass muster -- and, if necessary, tune or adjust the algorithm.

This ability of human operators to evaluate algorithms' explanations of their decisions might be even more crucial when it comes to facial recognition technology. AI-based facial recognition systems in the United States tend to identify black people's faces less accurately than white people's (possibly because they are trained on datasets of mostly white people's portraits). This increases the likelihood that black people, already disproportionately vulnerable to arrest, will be misidentified by police surveillance cameras, and thus be suspected of crimes they did not commit. Better human oversight of the "reasoning" of such systems might help avoid such undesirable results.
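
One basic form of oversight is to measure accuracy separately for each demographic group rather than only in aggregate. The sketch below assumes a hypothetical audit log of `(group, true_identity, predicted_identity)` records; the group names, identities, and results are invented for illustration and do not come from any real system.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute the fraction of correct identifications for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, prediction in records:
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical audit records: (group, true identity, predicted identity).
records = [
    ("group_a", "id_1", "id_1"),
    ("group_a", "id_2", "id_2"),
    ("group_a", "id_3", "id_3"),
    ("group_a", "id_4", "id_9"),
    ("group_b", "id_5", "id_5"),
    ("group_b", "id_6", "id_2"),
    ("group_b", "id_7", "id_1"),
    ("group_b", "id_8", "id_8"),
]

per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

An overall accuracy figure would hide the difference between the two groups in this made-up log. Reporting the per-group numbers and the gap between them makes the disparity visible to a human reviewer.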

A human problem

While explainable AI and feature attribution for neural nets are promising developments, eliminating bias in AI ultimately comes down to one thing: data. If the data an algorithm is trained on doesn't fairly reflect the entire population that developers want to serve, bias is likely to occur. Where the training data reflects historical injustices or deeply ingrained inequalities, the algorithm will learn, and subsequently perpetuate or even amplify, those harmful patterns. And while GDPR and similar regulations put some controls on how organizations use data, they don't do much to keep those same organizations from using already biased datasets.

Ultimately, it's the responsibility of the organization that owns the data to collect, store, and use that data wisely and fairly. Algorithmic developments help, but the obligation to overcome bias lies with the designers and operators of these decision-making systems, not with the mathematical structures, software, or hardware. In this sense, reducing bias in machine-learning algorithms doesn't just require advances in artificial intelligence -- it also requires advances in our understanding of human diversity.

"It is an incredibly hard problem," acknowledged Gonzalez. "But by getting very smart people thinking about this problem and trying to codify a better approach or at least state what the approach is, I think that will help make progress."

Those "very smart people" should not just be data scientists. In order to develop fair and accountable AI, technologists need the help of sociologists, psychologists, anthropologists, and other experts who can offer insight into the ways bias affects human lives -- and what we can do to ensure that bias does not make ML-enabled systems harmful. Technology doesn't solve social problems by itself. But by collaborating across disciplines, researchers and developers can take steps to create ML-enabled technology that contributes to a fairer society.

Damon Civin is principal data scientist on the strategy team at Arm, where he works to make the data streams from connected IoT devices useful to the world through machine learning.
