Over the past several months, Baker’s team has been working with biologists who were previously stuck trying to figure out the shape of the proteins they were studying. “There’s a lot of pretty interesting biological research that’s been really ramped up,” he says. A public database of hundreds of thousands of ready-made protein shapes should be an even bigger accelerator.
“It sounds surprisingly impressive,” says Tom Ellis, a synthetic biologist at Imperial College London who studies the yeast genome and is excited to try the database. But he cautions that most of the predicted shapes have yet to be verified in the lab.
In the new version of AlphaFold, each prediction comes with a confidence score that the tool uses to indicate how close it thinks the predicted shape is to reality. Using this measure, DeepMind found that AlphaFold predicted the shapes of 36% of human proteins with an accuracy that is correct down to the level of individual atoms. That is good enough for drug development, Hassabis says.
Previously, after decades of work, only 17% of the proteins in the human body had their structures identified in the laboratory. If AlphaFold’s predictions are as accurate as DeepMind says, the tool has more than doubled that number in just a few weeks.
Even predictions that are not entirely accurate at the atomic level are still useful. For more than half of the proteins in the human body, AlphaFold predicted a shape that should be good enough for researchers to understand the protein’s function. The rest of AlphaFold’s current predictions are either incorrect or concern the one-third of the proteins in the human body that have no structure until they bind to others. “They are flexible,” says Hassabis.
“The fact that it can be applied at this level of quality is an impressive thing,” says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software to predict the structure of proteins. He also points out that having structures for most of the proteins in an organism will make it possible to study how these proteins work as a system, and not just in isolation. “This is what strikes me as the most exciting,” he says.
DeepMind is releasing its tools and predictions for free and will not say whether it intends to make money from them in the future. It is not ruling out the possibility, however. To set up and manage the database, DeepMind has partnered with the European Molecular Biology Laboratory, an international research institution that already hosts a large database of information on proteins.
For now, AlQuraishi is eager to see what researchers do with the new data. “It’s pretty spectacular,” he says. “I don’t think any of us thought we would be here so quickly. It is mind-boggling.”