VentureBeat presents: AI Unleashed - An exclusive executive event for enterprise data leaders. Network and learn with industry peers. Learn More

Let the OSS Enterprise newsletter guide your open source journey! Sign up here.

DeepMind this week open-sourced AlphaFold 2, its AI system that predicts the shape of proteins, to accompany the publication of a paper in the journal Nature. With the codebase now available, DeepMind says it hopes to broaden access for researchers and organizations in the health care and life science fields.

The recipe for proteins — large molecules consisting of amino acids that are the fundamental building blocks of tissues, muscles, hair, enzymes, antibodies, and other essential parts of living organisms — are encoded in DNA. It’s these genetic definitions that circumscribe their three-dimensional structures, which in turn determine their capabilities. But protein “folding,” as it’s called, is notoriously difficult to figure out from a corresponding genetic sequence alone. DNA contains only information about chains of amino acid residues and not those chains’ final form.

In December 2018, DeepMind attempted to tackle the challenge of protein folding with AlphaFold, the product of two years of work. The Alphabet subsidiary said at the time that AlphaFold could predict structures more precisely than prior solutions. Its successor, AlphaFold 2, announced in December 2020, improved on this to outgun competing protein-folding-predicting methods for a second time. In the results from the 14th Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 had average errors comparable to the width of an atom (or 0.1 of a nanometer), competitive with the results from experimental methods.


AI Unleashed

An exclusive invite-only evening of insights and networking, designed for senior enterprise executives overseeing data stacks and strategies.


Learn More

DeepMind AlphaFold

AlphaFold draws inspiration from the fields of biology, physics, and machine learning.  It takes advantage of the fact that a folded protein can be thought of as a “spatial graph,” where amino acid residues (amino acids contained within a peptide or protein) are nodes and edges connect the residues in close proximity. AlphaFold leverages an AI algorithm that attempts to interpret the structure of this graph while reasoning over the implicit graph it’s building using evolutionarily related sequences, multiple sequence alignment, and a representation of amino acid residue pairs.

In the open source release, DeepMind says it significantly streamlined AlphaFold 2. Whereas the system took days of computing time to generate structures for some entries to CASP, the open source version is about 16 times faster. It can generate structures in minutes to hours, depending on the size of the protein.

Real-world applications

DeepMind makes the case that AlphaFold, if further refined, could be applied to previously intractable problems in the field of protein folding, including those related to epidemiological efforts. Last year, the company predicted several protein structures of SARS-CoV-2, including ORF3a, whose makeup was formerly a mystery. At CASP14, DeepMind predicted the structure of another coronavirus protein, ORF8, that has since been confirmed by experimentalists.

Beyond aiding the pandemic response, DeepMind expects AlphaFold will be used to explore the hundreds of millions of proteins for which science currently lacks models. Since DNA specifies the amino acid sequences that comprise protein structures, advances in genomics have made it possible to read protein sequences from the natural world, with 180 million protein sequences and counting in the publicly available Universal Protein database. In contrast, given the experimental work needed to translate from sequence to structure, only around 170,000 protein structures are in the Protein Data Bank.

DeepMind says it’s committed to making AlphaFold available “at scale” and collaborating with partners to explore new frontiers, like how multiple proteins form complexes and interact with DNA, RNA, and small molecules. Earlier this year, the company announced a new partnership with the Geneva-based Drugs for Neglected Diseases initiative, a nonprofit pharmaceutical organization that hopes to use AlphaFold to identify compounds to treat conditions for which medications remain elusive.

VentureBeat's mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.