How to make sure your 'AI for good' project actually does good

Artificial intelligence has been front and center in recent months. The global pandemic has pushed governments and private companies worldwide to propose AI solutions for everything from analyzing cough sounds to deploying disinfecting robots in hospitals. These efforts are part of a wider trend that has been picking up momentum: the deployment of projects by companies, governments, universities, and research institutes aiming to use AI for societal good. The goal of most of these programs is to deploy cutting-edge AI technologies to solve critical issues such as poverty, hunger, crime, and climate change, under the "AI for good" umbrella.

But what makes an AI project good? Is it the "goodness" of the domain of application, be it health, education, or environment? Is it the problem being solved (e.g. predicting natural disasters or detecting cancer earlier)? Is it the potential positive impact on society, and if so, how is that quantified? Or is it simply the good intentions of the person behind the project? The lack of a clear definition of AI for good opens the door to misunderstandings, misinterpretations, and considerable confusion.

AI has the potential to help us address some of humanity's biggest challenges, like poverty and climate change. However, like any technological tool, it is agnostic to the context of application, the intended end-user, and the specificity of the data. For that reason, it can end up having both beneficial and detrimental consequences.

In this post, I'll outline what can go right and what can go wrong in AI for good projects, and suggest some best practices for designing and deploying them.

Success stories

AI has been used to generate lasting positive impact in a variety of applications in recent years. For example, Statistics for Social Good out of Stanford University has been a beacon of interdisciplinary work at the nexus of data science and social good. In the last few years, it has piloted a variety of projects in different domains, from matching nonprofits with donors and volunteers to investigating inequities in palliative care. Its bottom-up approach, which connects potential problem partners with data analysts, helps these organizations find solutions to their most pressing problems. The Statistics for Social Good team covers a lot of ground with limited manpower. It documents all of its findings on its website, curates datasets, and runs outreach initiatives both locally and abroad.

Another positive example is the Computational Sustainability Network, a research group applying computational techniques to sustainability challenges such as conservation, poverty mitigation, and renewable energy. This group adopts a complementary approach for matching computational problem classes like optimization and spatiotemporal prediction with sustainability challenges such as bird preservation, electricity usage disaggregation and marine disease monitoring. This top-down approach works well given that members of the network are experts in these techniques and so are well-suited to deploy and fine-tune solutions to the specific problems at hand. For over a decade, members of CompSustNet have been creating connections between the world of sustainability and that of computing, facilitating data sharing and building trust. Their interdisciplinary approach to sustainability exemplifies the kind of positive impacts AI techniques can have when applied mindfully and coherently to specific real-world problems.

Even more recent examples include the use of AI in the fight against COVID-19. In fact, a plethora of AI approaches have emerged to address various aspects of the pandemic, from molecular modeling of potential vaccines to tracking misinformation on social media — I helped write a survey article about these in recent months. Some of these tools, while built with good intentions, had inadvertent consequences. However, others produced positive lasting impacts, especially several solutions created in partnership with hospitals and health providers. For instance, a group of researchers at the University of Cambridge developed the COVID-19 Capacity Planning and Analysis System tool to help hospitals with resource and critical care capacity planning. The system, whose deployment across hospitals was coordinated with the U.K.’s National Health Service, can analyze information gathered in hospitals about patients to determine which of them require ventilation and intensive care. The collected data was percolated up to the regional level, enabling cross-referencing and resource allocation between the different hospitals and health centers. Since the system is used at all levels of care, the compiled patient information could not only help save lives but also influence policy-making and government decisions.

Unintended consequences

Despite the best intentions of the project instigators, applications of AI towards social good can sometimes have unexpected (and sometimes dire) repercussions. A prime example is the now-infamous COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) project, which various justice systems in the United States deployed. The aim of the system was to help judges assess the risk of inmate recidivism and to lighten the load on the overflowing incarceration system. Yet the tool’s recidivism risk score was calculated using factors not necessarily tied to criminal behavior, such as substance abuse and stability. After an in-depth ProPublica investigation of the tool in 2016 revealed the software’s undeniable bias against Black defendants, usage of the system was stonewalled. COMPAS's shortcomings should serve as a cautionary tale for black-box algorithmic decision-making in the criminal justice system and other areas of government, and efforts must be made to not repeat these mistakes in the future.

More recently, another well-intentioned AI tool for predictive scoring spurred much debate with regard to the U.K. A-level exams. Students must complete these exams in their final year of school in order to be accepted to universities, but they were cancelled this year due to the ongoing COVID-19 pandemic. The government therefore endeavored to use machine learning to predict how the students would have done on their exams had they taken them, and these estimates were then going to be used to make university admission decisions. Two inputs were used for this prediction: any given student's grades during the 2020 year, and the historical record of grades in the school the student attended. This meant that a high-achieving student in a top-tier school would have an excellent prediction score, whereas a high-achieving student in a more average institution would get a lower score, despite both students having equivalent grades. As a result, two times as many students from private schools received top grades compared to public schools, and over 39% of students were downgraded from the cumulative average they had achieved in the months of the school year before the automatic assessment. After weeks of protests and threats of legal action by parents of students across the country, the government backed down and announced that it would use the average grade proposed by teachers instead. Nonetheless, this automatic assessment serves as a stern reminder of the existing inequalities within the education system, which were amplified through algorithmic decision-making.
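To make the mechanism concrete, here is a minimal sketch of how mixing a school's historical record into an individual prediction can pull apart two students with identical grades. It is not the model the U.K. government used; the function name, the equal 50/50 weighting, and all the numbers are hypothetical and chosen only for illustration.

```python
def predict_exam_score(student_grade: float,
                       school_history_mean: float,
                       school_weight: float = 0.5) -> float:
    """Toy blend of a student's own grade with their school's past average.

    ASSUMPTION: a simple weighted average stands in for the real model,
    which the article does not describe in detail.
    """
    if not 0.0 <= school_weight <= 1.0:
        raise ValueError("school_weight must be between 0 and 1")
    return (1.0 - school_weight) * student_grade + school_weight * school_history_mean


# Two hypothetical students with the same grade of 90 out of 100.
top_tier_prediction = predict_exam_score(90.0, school_history_mean=85.0)
average_school_prediction = predict_exam_score(90.0, school_history_mean=60.0)

print(top_tier_prediction)        # student at a historically strong school
print(average_school_prediction)  # equally strong student at an average school
```

In this toy setup, any nonzero weight on the school's history penalizes the student at the historically weaker school, and setting `school_weight=0.0` removes the gap entirely.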

While the goals of COMPAS and the U.K. government were not ill-intentioned, these cases highlight the fact that AI projects do not always have the intended outcome. In the best case, these misfires can still validate our perception of AI as a tool for positive impact even if they haven't solved any concrete problems. In the worst case, they experiment on vulnerable populations and result in harm.

Improving AI for good

Best practices in AI for good fall into two general categories -- asking the right questions and including the right people.

1. Asking the right questions

Before jumping head-first into a project intending to apply AI for good, there are a few questions you should ask. The first one is: What is the problem, exactly? It is impossible to solve the real problem at hand, whether it be poverty, climate change, or overcrowded correctional facilities. So projects inevitably involve solving what is, in fact, a proxy problem: detecting poverty from satellite imagery, identifying extreme weather events, producing a recidivism risk score. There is also often a lack of adequate data for the proxy problem, so you rely on surrogate data, such as average GDP per census block, extreme climate events over the last decade, or historical data regarding inmates committing crimes when on parole. But what happens when the GDP does not tell the whole story about income, when climate events are progressively becoming more extreme and unpredictable, or when police data is biased? You end up with AI solutions that optimize the wrong metric, make erroneous assumptions, and have unintended negative consequences.

It is also crucial to reflect upon whether AI is the appropriate solution. More often than not, AI solutions are too complex, too expensive, and too technologically demanding to be deployed in many environments. It is therefore of paramount importance to take into account the context and constraints of deployment, the intended audience, and even more straightforward things like whether or not there is a reliable energy grid present at the time of deployment. Things that we take for granted in our own lives and surroundings can be very challenging in other regions and geographies.

Finally, given the current ubiquity and accessibility of machine learning and deep learning approaches, you may take for granted that they are the best solution for any problem, no matter its nature and complexity. Deep neural networks are undoubtedly powerful in certain use cases when a large amount of high-quality, task-relevant data is available, but those conditions are rarely the norm in AI-for-good projects. Instead, teams should prioritize simpler and more straightforward approaches, such as random forests or Bayesian networks, before jumping to a neural network with millions of parameters. Simpler approaches also have the added value of being more easily interpretable than deep learning, which is a useful characteristic in real-world contexts where the end users are often not AI specialists.
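As a rough illustration of what "simple and interpretable" can mean in practice, the sketch below fits a decision stump, a single threshold on a single feature. This is a deliberately smaller stand-in, not a random forest or a Bayesian network: it is the one-split building block of the trees a random forest combines. The feature (a rainfall deficit), the labels (crop failure), and the data are all hypothetical.

```python
from typing import List, Tuple


def fit_stump(values: List[float], labels: List[int]) -> Tuple[float, int]:
    """Find the threshold on one feature that misclassifies the fewest examples.

    The rule is: predict 1 when value >= threshold, otherwise predict 0.
    Returns (best_threshold, number_of_training_errors).
    """
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and the same length")

    candidates = sorted(set(values))
    best_threshold, best_errors = candidates[0], len(labels) + 1
    for threshold in candidates:
        # Count how many examples this rule would get wrong.
        errors = sum(
            (1 if v >= threshold else 0) != y
            for v, y in zip(values, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


# Hypothetical data: rainfall deficit (fraction) and whether the crop failed.
rainfall_deficit = [0.1, 0.2, 0.35, 0.5, 0.6, 0.8]
crop_failed = [0, 0, 0, 1, 1, 1]

threshold, errors = fit_stump(rainfall_deficit, crop_failed)
print(f"Predict failure when deficit >= {threshold} ({errors} training errors)")
```

The whole model is one sentence that a field partner can read, question, and correct, which is the kind of check a network with millions of parameters does not offer out of the box.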

Generally speaking, here are some questions you should answer before developing an AI-for-good project:

Who will define the problem to be solved?
Is AI the right solution for the problem?
Where will the data come from?
What metrics will be used for measuring progress?
Who will use the solution?
Who will maintain the technology?
Who will make the ultimate decision based on the model's predictions?
Who or what will be held accountable if the AI has unintended consequences?

While there is no guaranteed right answer to any of the questions above, they are a good sanity check before deploying such a complex and impactful technology as AI when vulnerable people and precarious situations are involved. In addition, AI researchers must be transparent about the nature and limitations of the data they are using. AI requires large amounts of data, and ingrained in that data are the inherent inequities and imperfections that exist within our society and social structures. These can disproportionately impact any system trained on the data, leading to applications that amplify existing biases and marginalization. It is therefore critical to analyze all aspects of the data and ask the questions listed above from the very start of your research.

When you are promoting a project, be clear about its scope and limitations; don't just focus on the potential benefits it can deliver. As with any AI project, it is important to be transparent about the approach you are using, the reasoning behind this approach, and the advantages and disadvantages of the final model. External assessments should be carried out at different stages of the project to identify potential issues before they percolate through the project. These should cover aspects such as ethics and bias, but also potential human rights violations, and the feasibility of the proposed solution.

2. Including the right people

AI solutions are not deployed in a vacuum or in a research laboratory but involve real people who should be given a voice and ownership of the AI that is being deployed to "help" them -- and not just at the deployment phase of the project. In fact, it is vital to include non-governmental organizations (NGOs) and charities, since they have real-world knowledge of the problem at different levels and a clear idea of the solutions they require. They can also help deploy AI solutions so they have the biggest impact -- populations trust organizations such as the Red Cross, sometimes more than local governments. NGOs can also give precious feedback about how the AI is performing and propose improvements. This is essential, as AI-for-good solutions should include and empower local stakeholders who are close to the problem and to the populations affected by it. This should be done at all stages of the research and development process, from problem scoping to deployment. The two examples of successful AI-for-good initiatives I cited above (CompSustNet and Statistics for Social Good) do just that, by including people from diverse, interdisciplinary backgrounds and engaging them in a meaningful way around impactful projects.

In order to have inclusive and global AI, we need to engage new voices, cultures, and ideas. Traditionally, the dominant discourse of AI is rooted in Western hubs like Silicon Valley and continental Europe. However, AI-for-good projects are often deployed in other geographical areas and target populations in developing countries. Limiting the creation of AI projects to outside perspectives does not provide a clear picture of the problems and challenges faced in these regions, so it is important to engage with local actors and stakeholders. Also, AI-for-good projects are rarely a one-shot deal; you will need domain knowledge to ensure they are functioning properly in the long term. You will also need to commit time and effort toward the regular maintenance and upkeep of the technology supporting your AI-for-good project.

Projects aiming to use AI to make a positive impact on the world are often received with enthusiasm, but they should also be subject to extra scrutiny. The strategies I've presented in this post merely serve as a guiding framework. Much work still needs to be done as we move forward with AI-for-good projects, but we have reached a point in AI innovation where we are increasingly having these discussions and reflecting on the relationship between AI and societal needs and benefits. If these discussions turn into actionable results, AI will finally live up to its potential to be a positive force in our society.

Thank you to Brigitte Tousignant for her help in editing this article.

Sasha Luccioni is a postdoctoral researcher at MILA, a Montreal-based research institute focused on artificial intelligence for social good.