
A Boltz-2-generated antibody complex made by the author. Credit: Wohlwend, J., Corso, G., Passaro, S. et al. “Boltz-1 Democratizing Biomolecular Interaction Modeling”, 2024, https://gcorso.github.io/assets/boltz1.pdf; Tamarind Bio. State of the art computational tools for biology. (2024). https://www.tamarind.bio/.
Free-energy perturbation (FEP) has long been pharma’s gold standard for gauging how tightly a small molecule binds its protein target, yet each simulation can take 6 to 24 hours and cost hundreds of dollars. Boltz-2, a new open-source model from MIT and Recursion available on GitHub, delivers FEP-class accuracy in about 18 seconds on a single GPU. That’s roughly 1,000 times faster and over 10,000 times less expensive.
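The speed claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted above (6 to 24 hours per FEP simulation, about 18 seconds per Boltz-2 prediction); the function name `speedup` is purely illustrative. The cost ratio is left out because the article gives no per-run dollar figure for Boltz-2.

```python
def speedup(baseline_hours: float, model_seconds: float) -> float:
    """Ratio of baseline wall-clock time to model wall-clock time."""
    return baseline_hours * 3600.0 / model_seconds


# Figures quoted in the article: FEP takes 6 to 24 hours, Boltz-2 about 18 seconds.
low = speedup(6, 18)    # shortest quoted FEP run
high = speedup(24, 18)  # longest quoted FEP run

print(f"Speedup range: {low:,.0f}x to {high:,.0f}x")
```

Under these inputs the ratio works out to between 1,200 and 4,800, so "roughly 1,000 times faster" sits at the conservative end of the range.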
This shift in project economics has a direct impact on discovery timelines. “We’ve combined that with our generative AI models and have examples of programs in late discovery where we were able to go from start to finish in a matter of 18 months, instead of the 42-month industry average,” said Najat Khan, Chief R&D Officer at Recursion, during a press briefing. “We synthesize a couple of hundred molecules to get to the one that goes into the clinic, versus the industry average of 5,000 to 10,000 compounds.”
The developers validated the model’s performance on multiple benchmarks. On the FEP+ benchmark for hit-to-lead analysis, it achieved a Pearson correlation of 0.62, within striking distance of full physics simulations (0.72). In a blind test on Roche’s internal targets for the CASP16 competition, it scored 0.65, outperforming the next-best competitor. “In just 20 seconds, Boltz-2 reaches the performance of FEP that usually takes from 6 to 12 hours, pretty much changing the game in a hit-to-lead setting,” noted MIT researcher Saro Passaro.
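For readers unfamiliar with the metric, a Pearson correlation measures how closely predicted affinities track measured ones on a linear scale, with 1.0 meaning perfect agreement. Here is a minimal sketch of the calculation in plain Python. The affinity values are made up for illustration and are not benchmark data, and this is not the developers' evaluation code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the root sum of squared deviations per series.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical binding free energies in kcal/mol (illustrative only).
predicted = [-9.1, -7.4, -8.2, -6.0, -10.3]
measured = [-8.7, -7.9, -7.5, -6.4, -9.8]

r = pearson_r(predicted, measured)
print(f"Pearson r = {r:.2f}")
```

A score of 0.62 versus 0.72 therefore means Boltz-2's ranking of compounds is somewhat noisier than full FEP, not that its absolute errors are a fixed amount larger.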
At its core, Boltz-2’s advance stems from a novel architecture. It also addresses a long-standing challenge in drug discovery. “Delivering this capacity for affinity was an open problem for decades…it requires very novel machine learning to develop this technique,” explained Regina Barzilay, an MIT Distinguished Professor for AI and Health.
The model features a joint head that predicts both a 3D binding pose and the binding free energy (ΔG) in a single pass. It also uses physical steering, a technique that applies a Feynman-Kac potential during prediction to eliminate steric clashes and geometry errors, yielding physically plausible complexes in “nearly 100% of cases.” In addition, controllability features allow medicinal chemists to impose constraints such as pocket masks or contact lists, which is useful for accelerating structure-activity-relationship (SAR) cycles.
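To make ΔG concrete, the standard thermodynamic relation ΔG = RT ln(Kd / c°) links a binding free energy to a dissociation constant. The sketch below assumes ΔG is expressed in kcal/mol, a temperature of 298.15 K, and a 1 M standard state. The article does not specify the units or format of Boltz-2's affinity output, so the function names and inputs here are illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dg_to_kd(dg_kcal_per_mol: float, temp_k: float = 298.15) -> float:
    """Dissociation constant in mol/L from a binding free energy (1 M standard state)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_k))


def kd_to_dg(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temp_k * math.log(kd_molar)


# Illustrative input: a binder with a free energy of -10 kcal/mol.
kd = dg_to_kd(-10.0)
print(f"Kd = {kd * 1e9:.0f} nM")

# Free-energy change that corresponds to a tenfold change in Kd.
per_decade = R_KCAL * 298.15 * math.log(10)
print(f"{per_decade:.2f} kcal/mol per tenfold change in Kd")
```

At room temperature, each tenfold improvement in Kd corresponds to roughly 1.4 kcal/mol, which is why small errors in predicted ΔG matter when ranking close analogs in a SAR series.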
Such functionality allows high-fidelity affinity prediction to move from a late, expensive validation step to an early triage tool. “This changes the paradigm to using this much, much earlier,” Khan said. “Boltz-1, [the predecessor of Boltz-2], has quickly become the most widely adopted co-folding model in the industry,” added Gabriele Corso, one of the lead developers. Boltz-1 has more than 1,000 users on Slack alone, owing to its ability to approach AlphaFold 3-level accuracy in some areas.
The Boltz-2 model has limitations: it struggles, for instance, with large induced-fit motions and is less reliable for data-sparse protein classes. Even so, it enables chemistry teams to focus wet-lab and synthesis budgets on a smaller, more promising set of molecules. The model, weights and training pipeline are all available under a permissive MIT license.