AI Weekly: AI-driven optimism about the pandemic's end is a health hazard

As the pandemic reaches new heights, with nearly 12 million cases and 260,000 deaths recorded in the U.S. to date, a glimmer of hope is on the horizon. Moderna and pharmaceutical giant Pfizer, which are developing vaccines to fight the virus, have released preliminary data suggesting their vaccines are around 95% effective. Manufacturing and distribution are expected to ramp up as soon as the companies seek and receive approval from the U.S. Food and Drug Administration. Representatives from Moderna and Pfizer say the first doses could be available as early as December.

But even if the majority of Americans agree to vaccination, the pandemic won't come to a sudden end. Merck CEO Kenneth Frazier and others caution that drugs to treat or prevent COVID-19, the condition caused by the virus, aren't silver bullets. In all likelihood, we will need to wear masks and practice social distancing well into 2021, not only because vaccines probably won't be widely available until mid-2021, but because studies will need to be conducted after each vaccine's release to monitor for potential side effects. Scientists will need still more time to determine the vaccines' efficacy, or level of protection against the coronavirus.

In this time of uncertainty, it's tempting to turn to soothsayers for comfort. In April, researchers from Singapore University of Technology and Design released a model they claimed could estimate the life cycle of COVID-19. After feeding in data -- including confirmed infections, tests conducted, and the total number of deaths recorded -- the model predicted that the pandemic would end this December.

The reality is far grimmer. The U.S. topped 2,000 deaths in a single day this week, the most since the devastating initial wave in the spring. The country is now averaging over 50% more deaths per day than it was two weeks ago, along with nearly 70% more cases per day.

It's possible -- likely, even -- that the data the Singapore University team used to train their model was incomplete, imbalanced, or otherwise severely flawed. They used a COVID-19 dataset assembled by research organization Our World in Data that comprised confirmed cases and deaths collected by the European Centre for Disease Prevention and Control and testing statistics published in official reports. Hedging their bets, the model's creators warned that prediction accuracy depended on the quality of the data, which is often unreliable and reported differently around the world.

While AI can be a useful tool when used sparingly and with sound judgment, putting blind faith in these kinds of predictions leads to poor decision-making. In something of a case in point, a recent study from researchers at Stanford and Carnegie Mellon found that certain U.S. voting demographics, including people of color and older voters, are less likely to be represented in mobility data used by the U.S. Centers for Disease Control and Prevention, the California Governor's Office, and numerous cities across the country to analyze the effectiveness of social distancing. This oversight means policymakers who rely on models trained with the data could fail to establish pop-up testing sites or allocate medical equipment where it's needed most.

The fact that AI and the data it's trained on tend to exhibit bias is not a revelation. Studies investigating popular computer vision, natural language processing, and election-predicting algorithms have arrived at the same conclusion time and time again. For example, much of the data used to train AI algorithms for disease diagnosis perpetuates inequalities, in part due to companies' reluctance to release code, datasets, and techniques. But with a disease as widespread as COVID-19, the effect of these models is amplified a thousandfold, as is the impact of government- and organization-level decisions informed by them. That's why it's crucial to avoid putting stock in AI predictions of the pandemic's end, particularly if they result in unwarranted optimism.

"If not properly addressed, propagating these biases under the mantle of AI has the potential to exaggerate the health disparities faced by minority populations already bearing the highest disease burden," wrote the coauthors of a recent paper in the Journal of the American Medical Informatics Association. They argued that biased models may exacerbate the disproportionate impact of the pandemic on people of color. "These tools are built from biased data reflecting biased health care systems and are thus themselves also at high risk of bias -- even if explicitly excluding sensitive attributes such as race or gender."

We would do well to heed their words.

For AI coverage, send news tips to Khari Johnson and Kyle Wiggers -- and be sure to subscribe to the AI Weekly newsletter and bookmark The Machine.

Thanks for reading,

Kyle Wiggers

AI Staff Writer
