For some years now, John McGeehan, a biologist and director of the Center for Enzyme Innovation in Portsmouth, UK, has been searching for a molecule that could break down the 150 million tons of soda bottles and other plastic waste strewn across the globe.
His task is that of the most demanding locksmith: to pinpoint the chemical compounds that on their own will twist and fold into the microscopic shape that can fit perfectly into the molecules of a plastic bottle and split them apart, like a key opening a door.
Determining the exact chemical composition of any given enzyme is a fairly simple challenge these days. But identifying its 3D shape can involve years of biochemical experimentation. So last fall, after reading that an artificial intelligence lab in London called DeepMind had built a system that automatically predicts the shapes of enzymes and other proteins, McGeehan asked the lab if it could help with his project.
He sent DeepMind a list of seven enzymes. A few days later, the lab returned shapes for all seven. “This moved us a year ahead of where we were, if not two,” McGeehan said.
Now, any biochemist can speed their work in much the same way. DeepMind recently released the predicted shapes of over 350,000 proteins, the microscopic mechanisms that drive the behaviour of bacteria, viruses, the human body and all other living things. This database includes the 3D structures for all proteins expressed by the human genome, as well as those for proteins that appear in some other organisms, such as the mouse, fruit fly and E. coli.
“This can take you ahead in time — influence the way you are thinking about problems and help solve them faster,” said Gira Bhabha, an assistant professor in the department of cell biology at New York University, US. “Whether you study neuroscience or immunology, this can be useful.”
This new knowledge is its own sort of key: if scientists can determine the shape of a protein, they can determine how other molecules will bind to it. This might reveal, say, how bacteria resist antibiotics — and how to counter that resistance. Bacteria resist antibiotics by expressing certain proteins; if scientists were able to identify the shapes of these proteins, they could develop new antibiotics that suppress them.
When McGeehan sent DeepMind his list of seven enzymes, he told the lab that he had already identified shapes for two of them, but he did not say which two. This was a way of testing how well the system worked; AlphaFold, DeepMind's AI system, passed the test, correctly predicting both shapes.
It was even more remarkable that the predictions arrived within days. McGeehan later learned that AlphaFold had completed the task in just a few hours.
AlphaFold predicts protein structures using what is called a neural network, a mathematical system that can learn tasks by analysing vast amounts of data — in this case, thousands of known proteins and their shapes — and extrapolating into the unknown.
This is the same technology that identifies the commands you bark into your smartphone, that recognises faces in the photos you post to Facebook, and that translates one language into another on Google Translate. But many believe AlphaFold is one of the technology’s most powerful applications.
“It shows that AI can do useful things amid the complexity of the real world,” said Jack Clark, one of the authors of the AI Index, an effort to track the progress of AI technology across the globe.
As McGeehan discovered, it can be remarkably accurate. AlphaFold can predict the shape of a protein with an accuracy that rivals physical experiments about 63 per cent of the time, according to independent benchmark tests that compare its predictions to known protein structures. Most experts had assumed that a technology this powerful was still years away.
“I thought it would take another 10 years,” said Randy Read, a professor at the University of Cambridge, UK. “This was a complete change.”
Even before DeepMind began openly sharing its technology and data, AlphaFold was feeding a wide range of projects. Researchers at the University of Colorado, US, are using the technology to understand how bacteria such as E. coli and salmonella develop a resistance to antibiotics, and to develop ways of combating this resistance. At the University of California, San Francisco, US, researchers have used the tool to improve their understanding of the coronavirus.
The coronavirus wreaks havoc through 26 proteins. With help from AlphaFold, the researchers have improved their understanding of one key protein and are hoping the technology can help increase their understanding of the other 25.
If this comes too late to have an impact on the current pandemic, it could help in preparing for the next one. “A better understanding of these proteins will help us not only target this virus but other viruses,” said Kliment Verba, one of the researchers in San Francisco.
The possibilities are myriad.