Messenger RNA (mRNA)-based therapeutics have shown great promise for preventing and treating many intractable or genetic diseases. Compared with conventional approaches, these therapeutics are faster to design and produce, more flexible, and more cost-effective. Nevertheless, RNA is susceptible to base-catalyzed hydrolysis, which makes the molecule chemically unstable and in turn limits the stability of mRNA therapeutics. Designing mRNA sequences that are more resistant to hydrolysis could be a solution to this problem, but predicting RNA degradation remains a challenging task.
In recent work, Rhiju Das and colleagues leveraged a dual-crowdsourcing strategy to develop highly accurate models of RNA degradation. The strategy combined Eterna, an online video game platform for solving scientific problems such as mRNA design, with Kaggle, a well-known platform for machine learning competitions. Briefly, a total of 150 Eterna participants designed a variety of RNA sequences, and the authors used two experimental methods, In-line-seq and PERSIST-seq, to synthesize the sequences and measure their degradation rates. A competition was then hosted on Kaggle to create models that could accurately describe the degradation of these sequences; both training and blind test data were generated for the competition. The competition attracted more than 1,600 teams and lasted three weeks.
Remarkably, many Kaggle entries substantially outperformed the baseline models, and one of the most widely used community-developed featurization approaches was a graph-based distance embedding. The top two Kaggle models were then assessed on a dataset that was not available at the time of the competition, and both were shown to outperform the baseline models. All in all, the study demonstrated how the power of crowds can be used to speed up scientific developments for problems that must be solved on short timescales, such as the design of stable COVID-19 mRNA vaccines.
Nat. Mach. Intell. 4, 1174–1184 (2022)