Drug Discovery and Development

From Silicon Valley to the lab of tomorrow: Synfini’s leap to large chemistry models

By Brian Buntz | November 18, 2023

Conceptual visualization of a molecular structure, not representative of a specific molecule. [Image courtesy of Adobe Stock]

During the COVID-19 pandemic in late 2020, the storied Silicon Valley institution SRI International secured a $4.3 million DARPA contract to develop a tool for generating therapeutic small molecules to combat biological threats. Not just known for innovations like the computer mouse and Siri, SRI International is also responsible for Synfini, a multimodal chemistry model akin to large language models like ChatGPT. In September 2023, the company spun out Synfini as an independent entity.

“The core group and project started about seven years ago at SRI International, and began as part of the DARPA Make-It program,” said Peter Madrid, co-founder and head of scientific development at Synfini. DARPA’s Make-It initiative aimed to automate small molecule discovery and synthesis. Thanks to the support, Synfini developed AI-based approaches to plan and optimize the production of synthetic molecules.

Through the work with DARPA, SRI International developed several key technologies. One was a synthetic planning tool called SynRoute, which uses a large database of chemistry data combined with AI to come up with the reaction steps and a strategic plan for synthesizing molecules. Another was AutoSyn, a flow chemistry hardware platform for reliable multi-step synthesis that successfully produced grams of material. The technology suite also includes SynBuild and SynPlan for efficient molecular design and synthesis planning, and SynDB, a comprehensive and evolving chemistry data repository at the core of the system.

Peter Madrid

Before Synfini could spread its wings as an independent entity, its technology had already attracted the attention of two Big Pharma giants, Sanofi and a J&J subsidiary. In 2021, Synfini entered into a research collaboration with Sanofi to tap the Synfini platform in the discovery and development of candidates across multiple high-profile drug-discovery programs. A year later, it inked a deal with Janssen Pharmaceutica NV with the aim of tapping the AI-guided, automated synthetic chemistry system for small molecule drug discovery.

Crouching data, hidden patterns

Imagine, in the near future, a chemist instructing a generative AI lab automation platform: “I want a drug candidate that binds to target A with potency X but does not bind to target B.” The seemingly straightforward request launches a sophisticated process to explore the molecular landscape, seeking compounds with the specified characteristics. The result saves the chemist hours or days of work.

Nathan Collins

“Large language models have generated incredible excitement in the AI field, primarily because you have this wealth of data available through the internet and texts throughout history,” said Nathan Collins, head of strategic alliances and development at Synfini. “The concept with large chemistry models is to use similar models but incorporate different modalities, such as the fairly limited sets of chemistry data available, and methods to rapidly and automatically generate data to fill those holes.” This approach aims to provide more data for improved AI models and better predictions on the next round of chemistry.

If the overarching goal is getting a drug to the clinic, breaking that process into bite-sized chunks can help drive efficiency. “I think AI has the ability to fill the stop gap between finding hits against targets and getting to preclinical candidates,” Collins said.

The goal is to streamline scientific workflows. “For AI to be genuinely effective and useful, it must assist scientists in a hands-on and interactive way in a way that they understand, enhancing their ingenuity rather than attempting to replace them,” Collins said. “Ultimately, our approach is a chemistry-first approach, not an AI-first approach,” Madrid said. The company touts a “chemist-first” user interface designed to provide an intuitive and collaborative environment to drive efficient scientific exploration.

Enhancing efficiency in drug discovery with AI

The benefits of the system need to be apparent almost immediately. “It has to be something that delights them in terms of feedback on the tasks they have right in front of them,” Collins said.

The initial stages of drug discovery are traditionally tedious, involving hit identification, lead optimization, and preclinical testing. These phases require meticulous laboratory work and extensive data analysis to refine potential drug candidates before advancing to clinical trials. “Our goal is to use AI to bridge this gap effectively, in a way that augments the work of professionals in the field, not trying to replace them,” Collins said.

“Many existing AI tools have been trying to get past the way people work,” Collins said. “Our approach, with Peter and myself bringing our chemistry perspective and our experience at SRI, working with experts in this field, is to create tools that leverage us to get things done in ways that the pharmaceutical industry can understand and show that there’s a real advantage to it.”

Madrid said the company’s technology, much like LLMs, can be “really useful as long as the person receiving the results can judge their correctness.” A chemist might look at an AI-proposed solution and dismiss it as absurd in a second. But through a few rapid iterations, the researcher might continue exploring and find some powerful ideas. “Many AI solutions currently take days or even a week to generate solutions,” Madrid said. “If a chemist is going to dismiss many of those, it’s not really improving efficiency. So, I believe the interactivity and the timeliness of these tools are critical aspects that are not addressed by many of the current solutions.”

Nuts and bolts of a large chemistry model

Synfini’s primary AI is a pre-trained transformer model, similar to many popular LLMs. “One unique aspect though is the neuro-symbolic AI, where we can code medicinal chemistry concepts on shape and drug interactions with a protein receptor, and measure those within our model,” Madrid said. The company uses this concept to rapidly filter out many of the false designs that a traditional generative AI might produce. “This addresses the challenge where generative AI models can sometimes be overly creative, suggesting designs that medicinal chemists would instantly dismiss and not want to make in the lab,” he added. The company can incorporate logical assertions about the structures into its model as part of the generative process.
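To make the idea of logical assertions concrete, here is a minimal sketch of rule-based filtering applied to generated candidates. It is not Synfini’s implementation, which the article does not describe in detail. The `Candidate` fields, the rule labels, and the example values are all hypothetical; the two numeric thresholds are borrowed from Lipinski’s rule of five purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    """A generated design with a few precomputed descriptors (hypothetical)."""
    name: str
    mol_weight: float            # molecular weight in daltons
    h_bond_donors: int           # count of hydrogen-bond donors
    contacts_key_residue: bool   # does the predicted pose touch the desired residue?


Rule = Callable[[Candidate], bool]

# Each rule is a logical assertion a chemist might want every design to satisfy.
# Thresholds are illustrative assumptions, not values from the article.
RULES: dict[str, Rule] = {
    "mol_weight <= 500": lambda c: c.mol_weight <= 500.0,
    "h_bond_donors <= 5": lambda c: c.h_bond_donors <= 5,
    "contacts key residue": lambda c: c.contacts_key_residue,
}


def failed_rules(candidate: Candidate) -> list[str]:
    """Return the labels of every assertion the candidate violates."""
    return [label for label, rule in RULES.items() if not rule(candidate)]


def filter_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that satisfy all assertions."""
    return [c for c in candidates if not failed_rules(c)]


# Made-up example designs.
generated = [
    Candidate("cand-1", 342.4, 2, True),
    Candidate("cand-2", 712.9, 3, True),    # too heavy
    Candidate("cand-3", 298.3, 1, False),   # misses the desired contact
]

kept = filter_candidates(generated)
print([c.name for c in kept])        # ['cand-1']
print(failed_rules(generated[1]))    # ['mol_weight <= 500']
```

Reporting which assertion failed, rather than silently dropping a design, fits the interactive workflow the company describes: a chemist can see why a suggestion was rejected and adjust the rules.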

Madrid notes that the approach can create a sub-neural network that measures a feature of a compound, typically involving how it interacts with its receptor, or it could be a physical property. “Along with predictions for just the pure biological activity, it will predict whether or not it’s going to interact in the way that the medicinal chemist wants it to,” Madrid said. “That’s part of the overall scoring and function for prioritizing the candidates.”
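One simple way to picture a combined scoring function is a weighted sum of the predicted activity and the predicted likelihood of the desired interaction. The sketch below is only an assumption about how such a score could be assembled; the article does not give Synfini’s actual formula, and the weights, field names, and example values are hypothetical.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """Model outputs for one candidate, each scaled to the range 0..1 (hypothetical)."""
    name: str
    activity: float      # predicted biological activity
    interaction: float   # predicted chance of the desired receptor interaction


def priority_score(p: Prediction, w_activity: float = 0.6, w_interaction: float = 0.4) -> float:
    """Weighted sum of the two predictions; the weights are illustrative assumptions."""
    return w_activity * p.activity + w_interaction * p.interaction


def rank(predictions: list[Prediction]) -> list[Prediction]:
    """Order candidates from highest to lowest priority score."""
    return sorted(predictions, key=priority_score, reverse=True)


# Made-up example predictions.
predictions = [
    Prediction("cand-A", activity=0.9, interaction=0.2),  # potent, wrong binding mode
    Prediction("cand-B", activity=0.7, interaction=0.9),  # less potent, right binding mode
    Prediction("cand-C", activity=0.5, interaction=0.5),
]

ranked = rank(predictions)
print([p.name for p in ranked])  # ['cand-B', 'cand-A', 'cand-C']
```

In this toy example the most potent candidate drops to second place because it is unlikely to interact with the receptor in the way the chemist wants.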

Synfini taps embeddings and ML to enhance drug synthesis capabilities

Madrid elaborated on how the system uses embeddings, a machine learning technique in which complex data types are mapped into a defined space of continuous vectors. A vector, in this context, is essentially a list of numbers that can be used to represent various characteristics of the data. “We actually use a 3D embedding space, which also allows us to capture parts of structure that 2D methods just can’t, but each point in that 3D volume, it’s kind of a voxel, and it’s represented as a vector.” Because vectors are often long strings of numbers, they can translate to substantial file sizes. “The datasets are huge,” Madrid said. The company has spent considerable time through trial and error determining which vector sizes are optimal.
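A little arithmetic shows why voxel-based representations grow so quickly. The sketch below maps a 3D coordinate to a voxel index and estimates the storage for a dense grid in which every voxel holds a vector. The grid side, resolution, vector length, and data type are all hypothetical; the article does not disclose the sizes Synfini uses.

```python
import math

Point = tuple[float, float, float]


def voxel_index(point: Point, origin: Point, resolution: float) -> tuple[int, ...]:
    """Map a 3D coordinate to the integer index of the voxel that contains it."""
    return tuple(math.floor((p - o) / resolution) for p, o in zip(point, origin))


def grid_bytes(side: int, vector_dim: int, bytes_per_value: int = 4) -> int:
    """Storage for a dense side x side x side grid with one vector per voxel."""
    return side ** 3 * vector_dim * bytes_per_value


# Hypothetical setup: a 16-angstrom cube sampled at 0.5 angstrom gives 32 voxels per side.
origin = (-8.0, -8.0, -8.0)
idx = voxel_index((1.2, -0.4, 3.9), origin, resolution=0.5)
print(idx)  # (18, 15, 23)

# Hypothetical 64-number vector per voxel, stored as 32-bit floats.
per_molecule = grid_bytes(side=32, vector_dim=64)
print(per_molecule)              # 8388608 bytes, i.e. 8 MiB per molecule
print(per_molecule * 1_000_000)  # 8388608000000 bytes for a million molecules
```

Under these assumed sizes, a million molecules would need roughly 8.4 terabytes before any compression, and doubling the vector length doubles that figure, which helps explain why choosing the vector size takes trial and error.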

The company has also developed a machine learning algorithm for predicting the synthesizability of molecules. “This is something else we’re able to incorporate into this process,” Collins said. “So, we’re not only generating molecules, but also generating molecules with predetermined properties that our medicinal chemists would like to see. Further, we’re able to incorporate synthesizability into that, again, using AI tools based on data generated with our system. This ensures that we can actually make these molecules on our automated platforms.”

About The Author

Brian Buntz

As the pharma and biotech editor at WTWH Media, Brian has almost two decades of experience in B2B media, with a focus on healthcare and technology. While he has long maintained a keen interest in AI, more recently Brian has made data analysis a central focus, and is exploring tools ranging from NLP and clustering to predictive analytics.

Throughout his 18-year tenure, Brian has covered an array of life science topics, including clinical trials, medical devices, and drug discovery and development. Prior to WTWH, he held the title of content director at Informa, where he focused on topics such as connected devices, cybersecurity, AI and Industry 4.0. A dedicated decade at UBM saw Brian providing in-depth coverage of the medical device sector. Engage with Brian on LinkedIn or drop him an email at bbuntz@wtwhmedia.com.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media