Facebook open-sources AI for clinical trial eligibility

As trials for novel coronavirus drugs, therapies, and vaccines kick off in countries around the world, Facebook researchers are making available a library of AI models that transform clinical trial screening criteria into a machine-readable format. They say that the code enables trials to be searched by their health requirements, making it easier for people to discover trials and determine their own eligibility.

Clinical trials are crucial for understanding diseases and testing new treatments, but in the U.S., they often face challenges recruiting enough participants and establishing diversity in their populations. Approximately 80% of clinical trials are delayed or closed because of problems with recruitment, with 9 out of 10 trials requiring the original timeline to be doubled in order to meet enrollment goals, according to a Tufts University study.

The National Library of Medicine (NLM) aimed to rectify this with ClinicalTrials.gov, a public database of over 330,000 clinical studies in the U.S. and globally, which provides filters for eligibility criteria like patient age, gender, ethnicity, trial location, language literacy, technology access, pregnancy, and study condition. Unfortunately, the criteria are written in free-text descriptions with complex and esoteric medical language, and a large volume of new trials (more than 32,000) is added every year.

The Facebook researchers employed a multi-stage approach to extract criteria from ClinicalTrials.gov. Given a trial, one of their models splits the text into "inclusion" and "exclusion" blocks, with the former outlining criteria a participant must satisfy and the latter listing the criteria that make a participant ineligible. The model then automatically identifies the concepts (entities or attributes, like "leukemia" or "BMI") and constraints (eligibility requirements) associated with each trial, using NLM's Medical Subject Headings, a vocabulary designed as a taxonomy for biomedical research literature, as its primary source of knowledge.
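To make the first stage concrete, here is a minimal sketch of splitting free-text criteria into inclusion and exclusion blocks. It is a rule-based stand-in, not the researchers' model: the header pattern, the function name split_criteria, and the sample text are all assumptions made for illustration, and real ClinicalTrials.gov entries are far less regular.

```python
import re

# Assumed header format; real trial descriptions vary widely.
HEADER = re.compile(r"^\s*(inclusion|exclusion)\s+criteria\s*:?\s*$", re.IGNORECASE)


def split_criteria(text):
    """Split free-text eligibility criteria into inclusion and exclusion lists."""
    blocks = {"inclusion": [], "exclusion": []}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # A header line switches the block that later lines belong to.
            current = match.group(1).lower()
            continue
        # Drop leading bullet characters and surrounding whitespace.
        item = line.strip().lstrip("-*• ").strip()
        if item and current:
            blocks[current].append(item)
    return blocks


# Hypothetical sample text, written for this example only.
SAMPLE = """Inclusion Criteria:
- Age 18 to 65 years
- Diagnosis of leukemia

Exclusion Criteria:
- Pregnancy
- BMI greater than 40
"""

if __name__ == "__main__":
    print(split_criteria(SAMPLE))
```

A learned model would presumably be needed for entries that lack clean headers or that mix both kinds of criteria in one paragraph.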

The researchers trained a separate model to extract the entity mentions from trial text (e.g., "treatment," "chronic disease," "allergy") and categorize them by class. They used another model, trained on descriptions and aggregated, de-duplicated eligibility criteria of all trials in ClinicalTrials.gov (as of May 2019), to perform named entity linking, which connects entity mentions to concepts in the knowledge base. A third model helped to recognize and predict criteria for attributes in the knowledge base, while a relationship extraction model determined whether the trial must accept or reject subjects, given a trial and some extracted entity.
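The output of these stages can be pictured as a list of machine-readable relations, each tying a knowledge-base concept to an accept-or-reject decision. The sketch below is a toy version under loud assumptions: the knowledge base, its "KB:" identifiers, and the names Relation, link_entities, and extract_relations are hypothetical. It swaps the trained models for substring lookup and infers accept or reject from the block a criterion appears in.

```python
from dataclasses import dataclass

# Hypothetical mini knowledge base: surface form -> (concept id, entity class).
# The ids are placeholders, not real Medical Subject Headings identifiers.
KNOWLEDGE_BASE = {
    "leukemia": ("KB:0001", "chronic disease"),
    "penicillin allergy": ("KB:0002", "allergy"),
    "chemotherapy": ("KB:0003", "treatment"),
}


@dataclass(frozen=True)
class Relation:
    concept_id: str
    entity_class: str
    mention: str
    accept: bool  # True: the trial requires it; False: the trial rejects it


def link_entities(criterion):
    """Toy entity linking: match known surface forms by substring."""
    lowered = criterion.lower()
    return [
        (mention, concept_id, entity_class)
        for mention, (concept_id, entity_class) in KNOWLEDGE_BASE.items()
        if mention in lowered
    ]


def extract_relations(blocks):
    """Turn inclusion/exclusion criteria into accept/reject relations."""
    relations = []
    for block_name, criteria in blocks.items():
        accept = block_name == "inclusion"
        for criterion in criteria:
            for mention, concept_id, entity_class in link_entities(criterion):
                relations.append(Relation(concept_id, entity_class, mention, accept))
    return relations


if __name__ == "__main__":
    example = {
        "inclusion": ["Diagnosis of leukemia"],
        "exclusion": ["Known penicillin allergy"],
    }
    for relation in extract_relations(example):
        print(relation)
```

Substring lookup misses synonyms, abbreviations, and negation, which is presumably why the researchers trained dedicated models for each step.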

In experiments, the models met or exceeded the performance of a state-of-the-art baseline system, according to the researchers. They note that the extracted relations could be used to build a search interface for clinical trials that would allow an organization specializing in a certain treatment area to implement a question flow that matches people with clinical trials based on their medical condition and history.
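As a rough illustration of such a search interface, the sketch below filters trials by a person's self-reported conditions once the criteria are in structured form. Everything in it is assumed for the example: the Trial class, the eligible_trials function, the trial identifiers, and the concept names are hypothetical, and a real question flow would also need to handle numeric constraints such as age or BMI ranges.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    trial_id: str
    required: set = field(default_factory=set)  # concepts a participant must have
    excluded: set = field(default_factory=set)  # concepts that rule a participant out


def eligible_trials(patient_concepts, trials):
    """Return ids of trials whose structured criteria the patient satisfies."""
    matches = []
    for trial in trials:
        has_all_required = trial.required <= patient_concepts
        has_no_excluded = not (trial.excluded & patient_concepts)
        if has_all_required and has_no_excluded:
            matches.append(trial.trial_id)
    return matches


if __name__ == "__main__":
    # Hypothetical trials and concepts, written for this example only.
    trials = [
        Trial("TRIAL-A", required={"leukemia"}, excluded={"pregnancy"}),
        Trial("TRIAL-B", required={"leukemia", "chemotherapy"}),
        Trial("TRIAL-C", excluded={"leukemia"}),
    ]
    print(eligible_trials({"leukemia"}, trials))
```

Set operations keep the matching logic short: a trial matches when its required concepts are a subset of the patient's and its excluded concepts do not overlap them.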

"These types of discovery use cases may help address recruitment challenges that clinical trials need to overcome," the researchers wrote. "The difficulty that nonmedical audiences have in understanding trial details and eligibility criteria is a driver of these issues, and it reduces the ability of researchers to engage potentially eligible participants ... We hope [our] work helps these communities provide better ways for patients from all backgrounds to access clinical trials."

The new toolset isn't Facebook's first foray into clinical trials work. In 2017, the company reportedly hosted pharmaceutical marketers to learn about targeting users for their clinical trials, following a summit to pitch Facebook's platform as an alternative to television and print media drug ads.
