Last week, Google DeepMind launched AlphaFold 3, a significant update to its predecessor, AlphaFold 2. The latter has profoundly influenced structural biology and biomedicine in recent years while also attracting some skepticism. While AlphaFold 2 could accurately predict the structures of vast swathes of proteins, AlphaFold 3 ups the ante. It offers higher accuracy and the ability to predict the structure and interactions of various biomolecules, including proteins, DNA, RNA, ligands, ions and antibody-antigen complexes.
Google DeepMind has also debuted AlphaFold Server, an easy-to-use platform available for non-commercial research use that enables users to quickly model structures composed of proteins, DNA, RNA and a selection of ligands, ions and chemical modifications.
Architectural evolution and technical advances
As AlphaFold iterations evolve, so does their architecture. The first version of the platform used a convolutional neural network (CNN) as its main component, a mainstay of deep learning in computer vision. Its successor, AlphaFold 2, replaced the CNN with a transformer-based architecture, a design also widely used in generative AI.
In its most recent update, AlphaFold 3 adopts a diffusion-based approach, which enables the model to learn protein structure at multiple scales and directly predict raw atom coordinates. “Now we’re really starting to get into a realm where we’re almost at molecular dynamics,” said Patrick Bangert, SVP data, analytics and AI at Searce, a cloud solutions and technology services provider. “It’s obviously still an AI algorithm and not a physics model, but I think it’s starting to almost become a hybrid of that in the sense that it’s getting very close to the nitty gritty details.”
Improved accuracy and performance
A Nature paper, published early as it undergoes further editing, explains that AlphaFold 3 boasts a 58% success rate on ligand binding prediction, compared to 24% for AutoDock Vina, an open-source molecular docking program. AlphaFold 3 also outperformed RoseTTAFold2NA, a machine learning-based tool for predicting 3D structures of protein-RNA and protein-DNA complexes. Additionally, AlphaFold 3 demonstrates better performance in predicting antibody-antigen interfaces compared to its predecessor, AlphaFold-Multimer 2.3.
“Obviously, the point is that the accuracy is improving, which is what we want. All the users of AlphaFold don’t really care about speed, performance, latency, or any of those usual IT topics,” Bangert said. “What they really care about is accuracy. And that’s why this move beyond essentially just a language model into molecular dynamics is really a good one.”
Turning the drug discovery lottery into a game of better odds
While R&D productivity rates in biopharma have declined in recent decades, blockbuster drugs still deliver substantial returns. The industry’s reliance on a small fraction of drugs to subsidize the rest underscores the high-stakes nature of pharmaceutical development. “It is basically playing roulette, right? You just have to play roulette enough many times before you hit the jackpot,” Bangert said.
AlphaFold’s computational approach offers significant advantages over traditional medicinal chemistry methods. “Prior to AlphaFold, what you had to do to try out a new drug is you came up with an idea yourself as a human being,” Bangert said. “And you tried that out in the lab physically and you had to go through a few thousand physical tests with each new attempt.” Using AlphaFold, drug developers can now conduct these experiments computationally at a fraction of the cost of time-consuming, resource-intensive traditional methods.
The benefits of AlphaFold have not gone unnoticed by the scientific community. Bangert states, “AlphaFold is being taken seriously. Companies are actually using it at production levels because it’s now proven in the scientific literature beyond a doubt that it’s helpful.” The technology reduces cost, staff, and time to market significantly. “Prior to AlphaFold, it was a lottery. With AlphaFold, it’s a lottery but with better odds,” Bangert adds.
Startups gain realistic drug discovery odds with AlphaFold, but Big Pharma still has key advantages
Thanks to AlphaFold, startups with roughly $5–10 million in VC funding, which can be sufficient to cover the costs of infrastructure, talent acquisition and initial research, have “a realistic chance at developing a new drug.”
But Big Pharma companies tend to accumulate vast datasets over the years while forging partnerships with organizations. “It’s the data that’s the real gatekeeper,” Bangert said.
Despite this challenge and the significant costs of building the necessary foundation, AlphaFold is one to two orders of magnitude less expensive than traditional methods in medicinal chemistry. “We’re talking about a completely game-changing technology as far as drug discovery is concerned,” Bangert said.
Deploying AlphaFold at scale
Despite the clear advantages, deploying AlphaFold at scale still requires a significant investment. “AlphaFold itself is like the heart but you need the lungs and the liver and the kidneys and blood vessels and all that as well,” Bangert said. Deploying AlphaFold 3 at a production scale to accelerate drug discovery involves assembling an ecosystem of IT infrastructure, data pipelines and domain expertise.
In general, data strategy has not been the strong suit of many biopharma firms. A recent survey from the Pistoia Alliance highlighted the challenges faced by life sciences experts in implementing AI. Some 70% acknowledged AI’s potential in the industry but noted struggles with data integrity and interoperability. The survey found that 63% of respondents were concerned about poor data quality.
Significant infrastructure and data requirements
Given these data challenges, implementing advanced tools like AlphaFold requires meticulous attention to detail and a significant capital outlay. “The pipeline of tools that you require both before and after AlphaFold to make this into a commercial proposition for a pharmaceutical company or a healthcare company is fairly serious,” Bangert stressed.
Serious implementation requires access to a robust hardware architecture, including servers, virtual machines and orchestration tools like Kubernetes. Additionally, companies need a large data storage system with efficient retrieval software to handle the significant amount of data generated before and after using AlphaFold.
As described earlier, data preparation is another core aspect of deploying AlphaFold at scale. “You need to do a lot of data preparation prior to having the data in a form and in the right state of cleanliness before you can stick it into AlphaFold and get a meaningful result,” Bangert explained. With AlphaFold, researchers may need to generate and test numerous variants of a molecule, which requires substantial data handling capacity, Bangert said. “You might be running through 100,000 possible variants of a solution. And that means you have to generate those 100,000 variants in the first place,” he added. “You have to store them, run each one through AlphaFold, get a result, store the result.”
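To make the bookkeeping Bangert describes more concrete, here is a minimal Python sketch of the generate, store, run, store loop. It is an illustration under stated assumptions, not a description of any company's pipeline: the variants are simple single-residue substitutions of a protein sequence, the storage is a local SQLite database, and `placeholder_score` is a hypothetical stand-in that does not call AlphaFold at all.

```python
import sqlite3

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard residues


def single_point_variants(sequence):
    """Yield every sequence that differs from `sequence` at exactly one position."""
    for i, original in enumerate(sequence):
        for residue in AMINO_ACIDS:
            if residue != original:
                yield sequence[:i] + residue + sequence[i + 1:]


def placeholder_score(sequence):
    """Hypothetical stand-in for a structure-prediction call.

    This does NOT call AlphaFold. It returns a dummy number in [0, 1)
    so the surrounding bookkeeping can be exercised end to end.
    """
    return (sum(ord(c) for c in sequence) % 100) / 100.0


def run_pipeline(seed_sequence, db_path=":memory:"):
    """Generate variants, store them, score each one, and store the results."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS variants ("
        "id INTEGER PRIMARY KEY, sequence TEXT UNIQUE, score REAL)"
    )
    # Generate the variants and store them before any prediction runs.
    conn.executemany(
        "INSERT OR IGNORE INTO variants (sequence) VALUES (?)",
        ((v,) for v in single_point_variants(seed_sequence)),
    )
    # Run each stored variant through the (placeholder) predictor.
    pending = conn.execute(
        "SELECT id, sequence FROM variants WHERE score IS NULL"
    ).fetchall()
    # Store each result next to the variant it belongs to.
    conn.executemany(
        "UPDATE variants SET score = ? WHERE id = ?",
        ((placeholder_score(seq), vid) for vid, seq in pending),
    )
    conn.commit()
    return conn
```

Even this toy version shows why storage and retrieval matter: a four-residue seed already yields 4 × 19 = 76 variants, and each one needs a stored input and a stored result.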
Building a small talented team to generate a big impact
Another consideration for successful AI projects is talent. This is especially true for organizations deploying AlphaFold at scale. “Now, you don’t need a lot of staff,” Bangert said. “We’re not talking about dozens of people. We’re more like talking about maybe five people-ish, plus or minus.” But those five people need to be “absolutely stellar researchers.” And it would be ideal if about half of the team were biochemical experts while the other half were highly trained AI specialists who “got along well and spoke each other’s language,” Bangert said. “That is absolutely critical.”
Fostering cross-functional collaboration
Yet fostering cross-functional collaboration requires open-mindedness and a willingness to learn from both sides. “AI is always just a tool. So the people who use it have to both know how to use the tool and know what the real-life application is all about,” Bangert stressed. At the same time, AI experts should be receptive to the biochemists’ insights, while biochemists must be open to adopting new AI-driven workflows.
Organizations must also manage the transition to AI-driven workflows carefully. “There is a change management project at the basis of every AI project where you have to convince people and change the human process to adopt this new tool,” Bangert noted.
Education plays a crucial role in this transition. “As AI people, we need to educate and the others need to be willing to learn or need to accept that there is something to learn,” Bangert emphasized. This education should target both the technical aspects of the platform and the practical implications for day-to-day work. “What’s really important is the interfaces. How do I prepare the stuff that goes into AI? What do I do with the outputs of AI? How do I have to change my team structure, my business structure to be able to accommodate this tool?” Bangert asked.
Drug developers don’t need to become AI experts to wield the technology effectively. Bangert drew an analogy to electricity. “You know that flicking on a light switch will turn on the light, even if you can’t explain the electrodynamics behind it. The same applies to AI,” he said. “But you do know how to use it.”