ProBeat: A plea to the machine learning for health community

The room was packed at the annual Machine Learning and the Market for Intelligence conference in Toronto last week. Now in its fifth year, the event has a lengthy name that matches the depth of its discussions. But one speaker and her talk stood out to me in particular: Marzyeh Ghassemi, who also happens to be a veteran of Alphabet's Verily, presented "Machine Learning From Our Mistakes."

Ghassemi, an assistant professor at the University of Toronto, talked about the importance of predicting actionable insights in health care, the regulation of algorithms, and practice data versus knowledge data. But at the very end, saving the best for last, she emphasized the importance of treating health data as a resource.

Here is how she closed her talk:

We have to make a decision about whether we, as a machine learning for health community, want to be like speech or vision. You cannot do state of the art machine learning in speech in academia anymore. Because, 10 years ago, all of the data became owned by companies. Amazon, Microsoft, Google -- they have all the speech data and they're not giving it to you. But the vision community decided to open-source the vision data. And so, machine learning departments around the world don't hire speech people anymore. Those people are in companies, and we can't train them in academia because the data doesn't exist. But you can train state of the art vision people who can then audit these deployed models. And so I think we need to make a choice to move into what the vision community has done so that we can create, audit, and deploy fair state of the art machine learning models in health.

Ghassemi nailed the problem. We talk about natural language processing improvements in terms of accuracy and error rates achieved by the tech giants. The conversation is almost exclusively focused on which virtual assistant -- Alexa, Bixby, Cortana, Google Assistant, Siri, etc. -- can understand your speech best or least.

Meanwhile, if you partake in any conversation regarding computer vision, you'll almost immediately stumble upon someone testing their own tool. Facebook, Twitter, and YouTube are full of videos showing off basics like object detection that anyone can learn to build themselves. Even tech companies like Alphabet's Waymo, GM's Cruise, Lyft, and Uber have open-sourced various self-driving car data and tools.

We need to free our health data so that it follows the same path as vision data. It needs to include as many people as possible, as many data points as possible, and be anonymized, of course. That's much easier said than done. Having machines understand everything about our bodies is much more complex, and more revealing, than simply what we hear and what we see. It's also much more critical. One of the most important areas in which humans can make advancements is health care.

Ghassemi joked a lot during her talk to drive her various points home. For example, she shared how she tackles conversations with doctors who doubt machine learning.

I'd like to just say -- small dig to some of my doctor friends. Whereas algorithms and devices are regulated by the FDA, doctors are self-regulated. *laughter* When somebody tells me 'so what do you do when the thing kills something?' I'm like, 'what do you do when you kill something?' *more laughter* So this is a thing. It's happening. We should probably decide how we want to regulate it.

Doctors are burnt out. In fact, Ghassemi noted that many doctors say they're so burnt out that they don't have time to be empathetic. The need for technology to lighten their load is dire.

Ghassemi thus argues we need to get data from as many sources as we can and use machine learning to arm doctors with reliable and actionable insights. Instead of relying on a single doctor to diagnose you, wouldn't you rather have a doctor armed with a machine learning model trained on health care data of people with the same symptoms? That's the dream, anyway.

We have asked the conference for a video of Ghassemi's talk and will put it at the top of this article when we get it.

ProBeat is a column in which Emil rants about whatever crosses him that week.
