In a recent interview, Connell provided a sense of what to expect from generative AI in drug discovery, touching on how these tools could automate mundane tasks, streamline complex scientific workflows and speed drug discovery.
The promises and pitfalls of generative AI in drug discovery
In scientific research, generative AI, of which LLMs are an example, can partly automate tasks such as summarizing academic papers, solving math problems, coding, ensuring quality compliance and annotating molecules and proteins.
But the journey toward tapping the potential of LLMs in scientific research has not been without obstacles. One high-profile stumble in applying an LLM to scientific research was Galactica, a model developed by Meta (formerly Facebook), which was pulled three days after its introduction. A glaring shortcoming was its inability to consistently distinguish fact from fiction. In a similar vein, ChatGPT is prone to hallucinating at times. And Nature reported that research abstracts written by ChatGPT fooled scientists.
Yet generative AI has been making inroads in drug discovery for the past decade. In 2022, biotech firms using an AI-first approach had more than 150 small-molecule drugs in discovery and more than 15 in clinical trials. The AI-fueled pipeline has been expanding at an annual clip of almost 40%, according to BCG.
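To put a growth rate of roughly 40% a year in perspective, the short Python sketch below works out the doubling time it implies. It assumes the rate stays constant and rounds "almost 40%" to 0.40. The helper names and the projection from a starting count of 150 are illustrative assumptions, not figures from BCG.

```python
import math
from typing import List


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def project(start: float, annual_growth_rate: float, years: int) -> List[float]:
    """Compound a starting count forward; index 0 is the starting year."""
    return [start * (1 + annual_growth_rate) ** y for y in range(years + 1)]


# Assumption: "almost 40%" rounded to 0.40 and held constant.
rate = 0.40
print(f"Doubling time: {doubling_time_years(rate):.2f} years")

# Hypothetical projection from a starting count of 150, for illustration only.
for year, count in enumerate(project(150, rate, 3)):
    print(f"year {year}: ~{count:.0f}")
```

At that pace, a pipeline would double roughly every two years, which is why a growth rate of this size stands out.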
Meanwhile, drug discovery upstarts like Insilico Medicine and Atom Bioworks are deploying ChatGPT as an interface for target discovery platforms. LLM applications such as Google DeepMind’s Med-PaLM show promise in answering medical queries. Additionally, NVIDIA has expanded its Clara Discovery platform, a GPU-backed computational drug discovery platform that fuses AI, data analytics, simulation and visualization to support cross-disciplinary workflows in drug design. It can handle an array of biological data formats: chemistry and protein sequences as well as DNA and RNA sequences. NVIDIA has also introduced BioNeMo, a cloud-based supercomputing framework focusing on the training and deployment of large biomolecular language models.
Raising the efficiency floor
Automating iterative loops: Drug discovery is something like a knowledge-intensive, iterative marathon backed by calculated gambles. While not a silver bullet, generative AI offers the potential to automate and optimize some of the iterations in the process, pointing toward more efficient pathways for discovering new drugs. As Connell suggested, “If we can streamline the drug discovery process with AI, then that lowers the bar to doing good science.” He continued: “This kind of AI could jump to the answer, and you kind of skip a lot of the iterative work in between.”
Navigating complex datasets: Generative AI’s capacity to handle unstructured scientific data could be a game changer as well. It can significantly simplify tasks for scientists, improving both the speed and quality of analysis. “I think that AI has the potential to greatly increase the efficiency of the drug discovery process,” Connell said. In addition, generalist medical AI, as Nature highlighted, could potentially help interpret diverse and large sets of medical data.
The company Aria Pharmaceuticals has developed an AI-based platform to find promising drug candidates. Its technology considers genetic, proteomic and metabolomic data to predict the efficacy of drug candidates. The company credits the approach with the discovery of a promising candidate for the treatment of rheumatoid arthritis, which is now in preclinical development.
Reducing human errors and variability: The human-centric nature of traditional drug discovery processes may lead to variability and errors. AI has the potential to provide a consistent and precise approach, reducing these issues and resulting in a more robust drug discovery process. Because drug discovery is iterative, a failure at any stage often means returning to a previous stage or even starting from scratch. As Connell explains, integrating AI into drug discovery has the potential to significantly reduce such setbacks, allowing for a more efficient process and lower costs. Asked how generative AI would affect the efficiency of drug discovery, Connell said the trend is going to raise the efficiency baseline, the “floor.”
Bridging knowledge gaps: The expertise required in fields like chemistry and data science can be overwhelming for individual scientists to master. Incorporating AI into research tools makes those tools more accessible to scientists, reducing the barriers to entry and improving efficiency. “Scientists could do more if they could actually have access to scalable compute and advanced algorithms, but it’s too much for most scientists to know how to do that as well as their science,” Connell said. He further emphasized how much potential is lost when chemists and data scientists work in isolation. Instead, he advocates a more integrated approach: With AI tools directly in their hands, scientists can conceive an idea, test it and, once it is validated, advance it along the pipeline.
Enhancing decision-making: By tapping AI in the drug discovery process, researchers could access a broader solution space, improving the decision-making process about which problems to tackle and how to approach them. Data science is also carving out a growing role in experimental chemistry, as Nature has noted. These trends could help provide a clearer path through the drug development pipeline.
Reducing friction: Connell draws attention to the traditionally human-centric and sequential nature of drug discovery processes. Each step often relies on a hand-off, where one expert passes their findings to the next through a report. This introduces significant friction and variability, especially when junior members are involved, potentially jeopardizing the project’s success. Connell emphasized how AI can help alleviate these issues by bridging knowledge gaps, stating, “AI can level that knowledge. The weakest link phenomenon becomes less of a factor — you have a lot fewer weak links.” He further explains how AI can automate tasks and navigate complex, unstructured data, effectively streamlining the process: “I can express things in natural language and I can get technical things done behind the scenes, like querying my unstructured, complex chemical data. That’s not something that’s easy to create a database for.”
Scalability and enhanced collaboration: The capability of AI to handle vast amounts of data and rapidly scale up processes presents a major opportunity in drug discovery. As Connell put it, “You want to put this power into the hands of the scientist who’s got the ideas and knows how to validate them.”
Collaboration is an integral part of scientific discovery, and AI can bolster this aspect by promoting interdisciplinary collaboration. “One of the things we work on is reengineering the workflow or the ecosystem,” Connell noted. “It’s almost about reshaping the people, the process, the technology, the data.”
How generative AI in drug discovery can raise the innovation ceiling
Augmented innovation and creativity: Emphasizing the elastic capabilities of AI, Connell points out that researchers are often inspired to explore new ideas, particularly when they observe something unique in their data. He envisions that AI, by automating mundane and labor-intensive tasks, could free scientists to concentrate on the creative, innovative aspects of their work. This has the potential to lead to more novel and groundbreaking discoveries in the field. Additionally, as transformer-based models advance, this process may accelerate thanks to the emergent properties they display when trained on extensive and diverse data sources, such as biomedical literature, genomic data and chemical structures.
Learning from AlphaFold and beyond: Though not technically an example of generative AI, AlphaFold, a deep-learning system that predicts protein structures with high accuracy, exemplifies how AI can enhance efficiency in drug discovery. By offering rapid access to protein structure information, it addresses one of the most challenging and time-consuming steps in the drug discovery process, especially for biologics. Knowledge of protein structures, however, is also prized in traditional small-molecule drug discovery.
Aside from AlphaFold, various AI approaches, especially machine learning techniques such as random forests, support vector machines, neural networks and deep learning, have successfully predicted absorption, distribution, metabolism, excretion and toxicity (ADME-Tox) properties during early drug discovery phases.
Enhancing decision-making: Describing what is at stake when a step in the traditional process goes wrong, Connell says, “And if it fails, at any point, you may have to go back to the previous loop all the way back to the beginning. And so that’s where the 15 years of expense comes in.”
Future-readiness and adaptability: Embracing generative AI requires organizations to be prepared for digital transformation, which means future-proofing their operations and staying adaptable to multimodal applications of the technology. “The importance of being future-proof or digitally ready is escalating,” Connell said. Because this area is notably data-intensive, the life science firms making strategic pivots with emergent technologies like generative AI are those that are data-ready, he concluded.