A team of researchers from Harvard and Université de Montréal today launched an AI-powered, interactive platform designed to facilitate COVID-19 vaccine development. It’s built atop an algorithm, CAMAP, that generates predictions for potential vaccine targets, enabling researchers to identify which parts of the virus (epitopes) are more likely to be exposed at the surface of infected cells.

Project lead Dr. Tariq Daouda, who worked alongside researchers with doctorates in machine learning, immunobiologists, and bioinformaticians to build the platform, hopes it will reduce the time and expense involved in creating vaccine candidates. Fewer than 12% of all drugs entering clinical trials end up in pharmacies, and it takes at least 10 years for medicines to complete the journey from discovery to the marketplace. Clinical trials alone take six to seven years on average, and the cost of R&D comes to roughly $2.6 billion, according to the Pharmaceutical Research and Manufacturers of America.

CAMAP, which Daouda developed while obtaining his Ph.D. at the Université de Montréal, was originally applied to cancer immunotherapy. But its aptitude for learning immune system patterns made it an ideal fit for revealing viruses’ weaknesses.

“The COVID-19 pandemic stresses the need to accelerate the design of vaccines and therapies to reduce the human and economic impact of global pandemics,” said Daouda in a statement. “People infected with COVID-19 tend to have [fewer] immune cells, making it difficult to get enough infected cells to study them appropriately in a lab, and because they are so rare, labs are in competition with each other to obtain them.”

The platform doesn’t synthesize vaccine candidates itself, but its predictions could be used to generate a list of epitope targets to test. The trained CAMAP model draws on a data set of 1.5 million candidate epitopes and their metadata, including approximately 78,000 from SARS-CoV-2 and SARS-CoV-1 (two related coronaviruses) and 104,000 from normal human sequences, all hosted on ArangoDB’s Oasis service.
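To make the idea of turning predictions into a list of targets concrete, here is a minimal Python sketch of that workflow: slice a protein sequence into candidate peptides, score each one, and keep the top few. It is an illustration only and does not reflect how CAMAP or the platform works internally. The function names, the 9-residue window, and the scoring rule are all assumptions; in particular, `placeholder_score` is a hypothetical stand-in for a trained model’s prediction.

```python
from typing import Callable, List, Tuple


def candidate_peptides(protein: str, length: int = 9) -> List[str]:
    """Return every contiguous window of `length` residues in the sequence.

    The default window of 9 residues is an assumption made for this sketch.
    """
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def placeholder_score(peptide: str) -> float:
    """Hypothetical stand-in for a model prediction (NOT CAMAP's method).

    Returns the fraction of residues that belong to a small hydrophobic set,
    purely so the ranking step below has something to sort on.
    """
    hydrophobic = set("AILMFVWY")
    return sum(residue in hydrophobic for residue in peptide) / len(peptide)


def shortlist(
    protein: str,
    score: Callable[[str], float],
    top_n: int = 3,
    length: int = 9,
) -> List[Tuple[float, str]]:
    """Score every candidate peptide and return the `top_n` highest scoring."""
    scored = [(score(p), p) for p in candidate_peptides(protein, length)]
    # Sort by score only, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Made-up sequence used only to exercise the functions.
    example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for value, peptide in shortlist(example, placeholder_score):
        print(f"{peptide}\t{value:.2f}")
```

In practice the scoring function would be replaced by calls to a trained model, and the resulting shortlist is what researchers would take into the lab for testing.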

A recent preprint paper published by researchers at NEC OncoImmunity and NEC Laboratories Europe describes work to identify COVID-19 vaccine candidates from epitopes. The team repurposed an algorithm similar to CAMAP to analyze COVID-19 sequences and isolate epitopes with optimal immune responses, which they believe could inform the development of vaccines for both current and future strains.

“[The platform] makes code the petri dish, utilizing open source technologies to connect machine learning to biomedicine to help accelerate learnings and findings,” said Daouda. The platform, which also provides visualizations that allow researchers to plot results and use them for further research, is available on GitHub. A public API is available, and Daouda’s team plans to introduce new tools in the future.

In addition to ArangoDB, the listed sponsors include Digital Ocean and Slack. Both ArangoDB and Digital Ocean have pledged free cloud hosting and database management to the project.

