Can AlphaFold be used? New MIT research: not much better than random guessing, still needs to improve

After AlphaGo defeated Ke Jie, then the world’s No. 1 Go player, in 2017, the arrival of AlphaFold 2 in 2020 put artificial intelligence (AI) back in the spotlight.

Two years later, where does AlphaFold stand?

In July this year, DeepMind and EMBL-EBI used AlphaFold to predict the structures of almost all known proteins on Earth: 214 million protein structures from more than 1 million species. The release, which can be called a major leap for biology, sparked heated discussion on social media.

However, life scientists, the field’s “insiders”, have mixed opinions on what AlphaFold has achieved.

Last month, American drug discovery chemist Derek Lowe poured cold water on AlphaFold. In an article titled “Why AlphaFold won’t revolutionise drug discovery,” Lowe wrote that AlphaFold’s entire computational technique is built on finding analogies to known structures, and that in the absence of comparable structures, there is little AlphaFold can do.

Now, in a new study, a team of researchers from MIT, Harvard and the Broad Institute has once again revealed the limitations of AlphaFold.

The research team hoped to use AlphaFold-predicted structures to find drugs that bind to specific bacterial proteins, but found that existing models did not perform well at this task. “In fact, their predictions are not much better than chance.”

The related research paper, titled “Benchmarking AlphaFold-enabled molecular docking predictions for antibiotic discovery”, has been published in the scientific journal Molecular Systems Biology.

“Breakthroughs such as AlphaFold are expanding the possibilities of computational drug discovery efforts, but these developments need to be combined with advances in other aspects of modeling that are part of drug discovery efforts,” said James Collins, an MIT professor and corresponding author of the paper.

Insufficient accuracy

Few new antibiotics have been developed over the past few decades, largely because current methods of screening potential drugs are too expensive and time-consuming. A promising new strategy is the use of computational models to enable faster and cheaper drug discovery.

Previously, AlphaFold had accurately predicted protein structures from their amino acid sequences, a breakthrough that excited scientists in the search for new antibiotics.

According to reports, the new research is part of the Collins lab’s recently launched Antibiotics-AI Project, which aims to use artificial intelligence to discover and design new antibiotics.

In this work, the research team used the protein structures generated by AlphaFold to explore whether existing models can accurately predict the interaction of bacterial proteins with antimicrobial compounds.

If the answer is yes, scientists can use this type of model to conduct large-scale screening of new compounds that can target proteins that were previously untargetable. This will enable the development of antibiotics with unprecedented mechanisms of action, a key task in addressing the antibiotic resistance crisis.

To test the feasibility of this strategy, Collins’ team decided to study the interaction of 296 essential proteins from E. coli with 218 antimicrobial compounds, including antibiotics such as tetracyclines.

They used molecular docking simulations to analyze how these compounds interact with E. coli proteins, predicting how strongly the two molecules bind together based on their shape and physical properties.

Such simulations have been applied successfully in studies that screen large numbers of compounds against a single protein target to identify those that bind best. But when the team tried to screen many compounds against many potential targets, the predictions were much less accurate.

By comparing the model’s predictions with the actual interactions of 12 essential proteins, as measured in laboratory experiments, the research team found that the model’s false-positive rate was similar to its true-positive rate. This suggests that the model cannot consistently identify true interactions between existing drugs and their targets.
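
To make that comparison concrete, here is a minimal sketch in Python of how the two rates are computed. The function name `rates` and the toy labels are hypothetical illustrations, not data or code from the study.

```python
def rates(predicted, actual):
    """Return (true_positive_rate, false_positive_rate) for two boolean lists.

    predicted[i] is True if the model predicts that pair i interacts;
    actual[i] is True if the interaction was confirmed experimentally.
    """
    true_pos = sum(p and a for p, a in zip(predicted, actual))
    false_pos = sum(p and not a for p, a in zip(predicted, actual))
    positives = sum(actual)
    negatives = len(actual) - positives
    tpr = true_pos / positives if positives else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return tpr, fpr


# Hypothetical toy data: 4 real interactions and 4 non-interactions.
actual = [True, True, True, True, False, False, False, False]
# A model that flags half of each group cannot tell the groups apart.
predicted = [True, True, False, False, True, True, False, False]

tpr, fpr = rates(predicted, actual)
print(f"true-positive rate = {tpr}, false-positive rate = {fpr}")
```

When the two rates are equal, a "binds" prediction is just as likely to land on a non-interacting pair as on a real one, so the prediction carries no information about which pairs truly interact.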

In addition, the research team found that the models performed poorly as measured by auROC (area under the receiver operating characteristic curve), a metric commonly used to evaluate computational models.

In this regard, Collins said: “Using these standard molecular docking simulations, we obtained an auROC value of about 0.5, which is a number that indicates that the performance of the model is no better than the performance of random guessing.” The research team found similar results when they applied this modeling approach to experimentally determined protein structures.
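
Why does 0.5 mean random guessing? The auROC can be read as the probability that a randomly chosen true interaction receives a higher score than a randomly chosen non-interaction. The sketch below, in Python, illustrates this with simulated scores. The function `auroc` and all the data are hypothetical illustrations, not the study's code or results.

```python
import random


def auroc(scores, labels):
    """Pairwise auROC: the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data: 200 true interactions among 2,000 compound-protein pairs.
labels = [i % 10 == 0 for i in range(2000)]

# Scores unrelated to the labels, i.e. random guessing.
random_scores = [rng.random() for _ in labels]

# Scores that tend to be higher for true interactions.
informative_scores = [rng.random() + (0.5 if y else 0.0) for y in labels]

print("random guessing:", round(auroc(random_scores, labels), 3))
print("informative model:", round(auroc(informative_scores, labels), 3))
```

Scores that ignore the labels land close to 0.5, while scores that favor true interactions move toward 1.0, the value a perfect ranking would reach.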

“The structures predicted by AlphaFold appear to be about the same as those determined experimentally, but if we are to use AlphaFold effectively and broadly in drug discovery, we need to get better at molecular docking models,” Collins said.

Better predictions

The research team said that one possible reason for the model’s poor performance is that the protein structures fed into it are static, whereas in biological systems proteins are dynamic and often change their conformations.

Felix Wong, co-author of the paper, described the machine-learning models that the team applied on top of the docking simulations: “The machine learning model learns not only the shape of known interactions, but also the chemical and physical properties of known interactions, and then uses this information to re-evaluate the docking predictions. The data show that these additional models can help us get a higher ratio of true positives to false positives.”

However, the research team says that further improvements are needed before this type of model can be used to successfully identify new drugs. One possible approach is to incorporate more data into model training, including the biophysical and biochemical properties of proteins, their different conformations, and how these features affect their binding to potential drug compounds.

With further progress, Collins believes, scientists may be able to use AI-generated protein structures to discover not only new antibiotics but also drugs to treat a variety of diseases, including cancer.

“We are optimistic that these techniques will become increasingly important in drug discovery as modeling methods improve and computational power increases. However, we still have a long way to go.”
