Drug Discovery and Development


Why AI alone won’t resolve drug discovery challenges

By Brian Buntz | June 21, 2023

A digital representation of a protein structure. [Image courtesy of NIH]

Big Pharma and researchers are sharpening their focus on AI to speed drug discovery. But the path to fully AI-driven drug discovery faces substantial hurdles, according to Adityo Prakash, CEO of Verseon. “When it comes to drug discovery, AI has a data problem with which the pharmaceutical industry has not yet come to terms,” he said.

Prakash explained there is simply “not enough data” to rely on AI as the primary means of small molecule drug discovery. He discussed these challenges further in an article in American Pharmaceutical Review titled “Exploring New Chemical Space for the Treatments of Tomorrow.”

The limitations and challenges of AI-first methods

Even with high-throughput screening to automate the process of testing pre-synthesized drug candidates against disease-associated target proteins, the pharma industry has managed to generate experimental data for fewer than ten million distinct chemicals out of a billion trillion trillion (10^33) unique drug-like compounds synthesizable under the rules of organic chemistry. And “AI-first” methods can only be trained on that existing experimental data. It is like exploring a drop of water in an ocean, he said.

In addition, even within that limited experimental data, a sizable portion is of questionable quality and often not reproducible, he said. A 2022 Patterns article and a preprint on arXiv reached similar conclusions regarding the need for high-quality data. To that list, these articles added the lack of interoperability and the curse of dimensionality. The latter refers to how a machine learning model's appetite for data scales with its inputs: as the dimensionality (number of features) grows, the amount of data needed to learn and predict accurately grows exponentially.
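The curse of dimensionality can be made concrete with a small sketch. It assumes, purely for illustration, that each feature is split into 10 equal bins and that a model needs at least one example per grid cell; the function name and the bin count are hypothetical and do not come from the cited articles.

```python
def cells_to_cover(bins_per_feature: int, n_features: int) -> int:
    """Count the grid cells formed when every feature is split into equal bins.

    With one example per cell as a (very rough) data requirement, this is
    the minimum number of examples needed to see every region once.
    """
    return bins_per_feature ** n_features


# Illustrative assumption: 10 bins per feature.
for n_features in (1, 2, 5, 10, 20):
    print(f"{n_features:>2} features -> {cells_to_cover(10, n_features):,} cells")
```

Each added feature multiplies the number of cells by the bin count, so the requirement grows exponentially rather than linearly.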

Another of Prakash’s core points is the lack of a particular data type critical for machine learning: negative data from lab experiments and clinical trials. In drug discovery, “failures” are published far less often than positive findings. “Negative data is as important as positive data for training an ML model,” he said.
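A toy sketch can show why a model needs failures as well as successes. It assumes a single hypothetical "binding score" per compound, with active compounds scoring higher than inactive ones; the function, the scores and the labels are invented for illustration and are not Verseon's method or real assay data.

```python
from typing import List, Optional


def learn_threshold(scores: List[float], labels: List[int]) -> Optional[float]:
    """Learn a cutoff separating inactive (0) from active (1) compounds.

    The cutoff is placed midway between the highest inactive score and the
    lowest active score. If either class is missing from the training data,
    the boundary cannot be located and None is returned.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        return None  # no evidence about where activity stops
    return (max(inactives) + min(actives)) / 2


# Only published "successes": the model cannot tell where activity ends.
print(learn_threshold([0.75, 0.875], [1, 1]))

# Adding negative results pins the boundary down.
print(learn_threshold([0.75, 0.875, 0.25, 0.125], [1, 1, 0, 0]))
```

Real models are far more complex, but the same principle applies: without negative examples, nothing in the training data constrains where the boundary between active and inactive compounds lies.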

AlphaFold: A game of evolutionary guesswork

It is tempting to believe that AI’s success in other areas will translate into success in drug discovery, especially given recent advances in the field. One of the most noteworthy AI tools to emerge in recent years is AlphaFold, a groundbreaking tool from DeepMind that can accurately predict protein structures. Before AlphaFold debuted in 2021, researchers relied on the Worldwide Protein Data Bank (PDB) to provide experimentally determined protein structures for about 90% of disease programs. AlphaFold augments the PDB data by predicting structures for proteins whose structure was previously unknown.

But the application of AI in such a complex field is not without its challenges. The effectiveness of AI is partly a function of the quality and size of its training sets. DeepMind designed AlphaFold to predict previously unknown protein structures, but it did not leave the model to operate without sufficient foundational data. Several large databases offered vast numbers of known protein structures and their amino acid sequences across multiple species. This wealth of information, combined with various evolutionary rules that preserve the structure and function of proteins across species, provided a robust training ground for AlphaFold and played a central role in its success, Prakash noted.

Drug discovery is different from protein folding in ways that make it a far more daunting challenge for AI. Knowing a protein’s structure is a first step. But discovering new drugs requires understanding how that protein will bind to a novel drug structure. “But there is no binding data for truly novel compounds,” Prakash pointed out. And tools like AlphaFold cannot make these predictions.

Understanding the challenges inherent in small molecule drug discovery

Prakash provides an eye-opening statistic to illustrate the current limitations of AI in drug discovery, estimating that we have data on only 0.000000000000000000000001% of the drug-like chemical universe. “It is virtually impossible for current AI approaches to find breakthrough novel drugs unaided,” he explained. “And when you look at the companies who purportedly ‘discovered’ drugs with AI, you find that most have developed small ‘me-too’ modifications of well-trodden drug structures.”
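The scale of that gap can be checked with a few lines of arithmetic. The sketch below takes the two figures given earlier in the article, fewer than ten million tested chemicals and a billion trillion trillion (10^33) possible drug-like compounds, and treats the upper bound of ten million as the tested count; the variable and function names are illustrative.

```python
from fractions import Fraction

TESTED_COMPOUNDS = 10**7   # "fewer than ten million", taken as an upper bound
DRUG_LIKE_SPACE = 10**33   # "a billion trillion trillion"


def coverage_percent(tested: int, total: int) -> Fraction:
    """Return the share of the chemical space with data, as an exact percentage."""
    return Fraction(tested, total) * 100


pct = coverage_percent(TESTED_COMPOUNDS, DRUG_LIKE_SPACE)
print(f"Coverage: about {float(pct):.0e} percent")
```

Under these assumptions the coverage works out to one part in 10^24 of a percent, in line with the tiny percentage Prakash cites.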

Given the prohibitive time and cost involved in experimentally generating binding data for these novel drug structures, researchers must computationally derive the required data by simulating protein-drug interactions using highly accurate molecular-physics models. “Available garden-variety physics models won’t do,” Prakash said. Scientists need further advances in physics simulations to generate the data required to train next-generation AI models. Follow-up experimental data on promising novel structures can then further enhance these AI models, allowing researchers to systematically design and optimize novel drugs.

Ultimately, AI is one tool of many, not a cure-all, Prakash concluded. Thoughtfully integrating advances in AI, physics, chemistry and biology is required to explore the ‘uncharted chemical ocean’ of potential new small molecule drugs. “Small molecule drug discovery requires progress across diverse fields and smart application of the resulting integrated tools,” he noted.

“AI is not magical,” Prakash asserted. “We must understand where it’s valuable and where it’s not.”

Reflecting on the future of drug discovery in an era of fast-moving technology, Prakash said: “If you don’t use AI, you’ll be left behind. But the key to success is building and using all the other complementary tools required to generate the required high-quality data. AI acts on data.”


Filed Under: Data science, Drug Discovery, Industry 4.0, machine learning and AI
Tagged With: AI, AlphaFold, data quality, drug discovery, high-throughput screening, machine learning, protein folding, small molecule drugs, Verseon
 

About The Author

Brian Buntz

As the pharma and biotech editor at WTWH Media, Brian has almost two decades of experience in B2B media, with a focus on healthcare and technology. While he has long maintained a keen interest in AI, more recently Brian has made data analysis a central focus and is exploring tools ranging from NLP and clustering to predictive analytics.

Throughout his 18-year tenure, Brian has covered an array of life science topics, including clinical trials, medical devices, and drug discovery and development. Prior to WTWH, he held the title of content director at Informa, where he focused on topics such as connected devices, cybersecurity, AI and Industry 4.0. A dedicated decade at UBM saw Brian providing in-depth coverage of the medical device sector. Engage with Brian on LinkedIn or drop him an email at bbuntz@wtwhmedia.com.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media