This AI system can design RNA

RNA, or ribonucleic acid, is present in all living cells. It acts as a messenger, carrying instructions from DNA (deoxyribonucleic acid) that dictate how proteins in the body are synthesized. And when it doesn't work as it should, it can severely affect neurological, cardiovascular, and muscular regulatory processes, resulting in effects like tumors, insulin resistance, and motor skill impairment.

That's why researchers at the University of Freiburg's Department of Computer Science developed an AI system -- LEARNA -- that can learn to design RNA molecules for study. It's described in a new paper ("Learning to Design RNA") published this week on the preprint server arXiv.org.

"Designing RNA molecules has garnered recent interest in medicine, synthetic biology, biotechnology and bioinformatics since many functional RNA molecules were shown to be involved in regulatory processes for transcription, epigenetics, and translation," the researchers wrote. "Here, we propose a new algorithm for the RNA design problem."

As the paper's authors explain, RNA's function depends on its structural properties. The real challenge -- sometimes described as RNA inverse folding -- is identifying patterns and sequences in RNA that cause it to fold into a specified structure.
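To make the design target concrete, here is a minimal Python sketch. It assumes the target structure is written in dot-bracket notation, a common convention in which matching parentheses mark paired positions and dots mark unpaired ones; the article itself does not specify a format. The function names and the set of allowed pairs (Watson-Crick plus G-U wobble) are illustrative choices, not taken from the paper.

```python
# Illustrative sketch: names and the allowed-pair set are assumptions.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure):
    """Return the sorted (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """True if every paired position holds nucleotides that can pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))
```

Compatibility is only a necessary condition. A sequence solves the design problem only if it actually folds into the target, which is what makes inverse folding hard.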

The researchers' approach relies on deep reinforcement learning (RL), an artificial intelligence (AI) training technique that uses rewards to drive agents toward goals. It trains a policy network that sequentially predicts the entire RNA sequence. The system generates this sequence, folds it, and uses the distance between the resulting structure and the target structure as the reward signal for the AI agent.
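The generate-fold-compare loop can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the distance is taken to be a Hamming distance over dot-bracket strings, the reward normalization is a simple choice made here, and `toy_fold` is a hypothetical stand-in for a real folding algorithm that only pairs the two ends of the sequence inward.

```python
# Illustrative sketch only. toy_fold is a hypothetical placeholder,
# NOT a real RNA folding algorithm.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def toy_fold(sequence):
    """Pair the ends inward while they are complementary (toy model)."""
    structure = ["."] * len(sequence)
    i, j = 0, len(sequence) - 1
    # Leave at least three unpaired positions for the hairpin loop.
    while j - i > 3 and sequence[i] + sequence[j] in PAIRS:
        structure[i], structure[j] = "(", ")"
        i += 1
        j -= 1
    return "".join(structure)


def hamming_distance(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def reward(sequence, target, fold=toy_fold):
    """Return 1.0 for a perfect match, less as the folded structure drifts."""
    predicted = fold(sequence)
    return 1.0 - hamming_distance(predicted, target) / len(target)
```

Under these assumptions, a candidate that folds exactly into the target earns a reward of 1.0, and each mismatched position lowers it. In practice the `fold` argument would be swapped for a real structure-prediction routine.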

Meanwhile, a second version of LEARNA -- appropriately dubbed Meta-LEARNA -- learns a single policy across many RNA design problems, one that is directly applicable to new RNA design problems. That is, it learns a tailor-made generative model that builds RNA sequence samples by choosing actions to place nucleotides -- the chemical building blocks of RNA and DNA -- for a given RNA target structure.
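Placing nucleotides one action at a time might look like the following Python sketch. Everything here is hypothetical scaffolding: the `policy` interface (a function returning four probabilities per position) and the `uniform_policy` placeholder are assumptions for illustration, whereas the system described in the paper uses a trained neural network and may order or group its actions differently.

```python
import random

NUCLEOTIDES = "ACGU"


def uniform_policy(target, position):
    """Hypothetical placeholder: ignores the target and picks uniformly."""
    return [0.25, 0.25, 0.25, 0.25]


def design_sequence(target, policy, rng):
    """Build a candidate sequence by sampling one nucleotide per position."""
    sequence = []
    for position in range(len(target)):
        # The policy returns one probability per nucleotide in NUCLEOTIDES.
        probabilities = policy(target, position)
        nucleotide = rng.choices(NUCLEOTIDES, weights=probabilities, k=1)[0]
        sequence.append(nucleotide)
    return "".join(sequence)


if __name__ == "__main__":
    candidate = design_sequence("((...))", uniform_policy, random.Random(0))
    print(candidate)
```

A learned policy would replace `uniform_policy`, shifting probability toward nucleotides that make the folded result match the target structure.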

"To the best of our knowledge, this is the first application of architecture search ... to RL [and] the first application of [architecture search] to metalearning," the researchers wrote.

After meta-learning across 8,000 different RNA target structures for one hour on a machine with 20 processor cores, Meta-LEARNA managed to solve up to 65 percent of the target structures in the Eterna100 benchmark. (Eterna100, for the uninitiated, is a collection of 100 target structures created by players of Eterna, an online open laboratory that tasks players with creating sequences that fold to specific structures.) Moreover, it only needed 90 seconds to achieve results on par with any other method, and achieved state-of-the-art performance in three minutes.

Meanwhile, on another benchmark -- Rfam-Taneda -- Meta-LEARNA produced results as good as state-of-the-art methods in 10 seconds, and surpassed those methods in accuracy after 1 minute.

The results -- much like those achieved by Alphabet subsidiary DeepMind's protein-folding AlphaFold system earlier this year -- bode well for future work in RNA biology.

"Comprehensive empirical results ... show that our approach achieves new state-of-the-art performance on all benchmarks while also being orders of magnitudes faster in reaching the previous state-of-the-art performance," the researchers wrote.