The dynamic appears to be changing, particularly in drug discovery, thanks in part to burgeoning data science maturity, the runaway success of generative AI (gen-AI), growing appetite for the cloud across healthcare and continued computational breakthroughs.
Tech companies are also realizing the diversity and uniqueness of healthcare data. “In less than a decade, [healthcare] will become the largest data generation industry,” said Kimberly Powell, vice president and general manager of healthcare at NVIDIA. “That’s one reason why large tech companies are interested. Just the storing and managing of the data — that’s what clouds do.”
Companies, ranging from hyperscale cloud vendors to hardware-focused behemoths, are working to develop advanced computational platforms tailored to the specific needs of healthcare.
It takes a village to map molecules and proteins
Additionally, tech companies have become more aligned, taking more of an ecosystem approach in positioning their technologies to life science organizations. Just last week, for instance, NVIDIA announced the availability of BioNeMo, its generative AI platform for drug discovery, via AWS. Now generally available, the platform integrates deep learning techniques for small molecule and protein structure prediction and generation. It features models such as DeepMind’s AlphaFold 2 and ESMFold, the latter developed by the protein team at Meta’s Fundamental AI Research (FAIR) group. BioNeMo also offers access to ESM-1nv and ESM-2, large language models used for protein property prediction. Other models round out the offering.
The AWS-NVIDIA partnership makes BioNeMo available via AWS ParallelCluster, a cluster management tool for high-performance computing, and Amazon SageMaker, a machine learning service. ParallelCluster allows companies to deploy high-performance computing (HPC) clusters, which can prove useful for compute-hungry AI model training. “You want to have more of an HPC environment because, if GPUs are scattered all over the world, the training job is going to be much less efficient and potentially cost you a lot of money,” Powell said.
From DeepMind’s AlphaFold to Meta’s ESM-2: Powerful AI tools in protein modeling
Powell also describes Meta’s ESM-2 as “a super-powerful model” for protein structure prediction, comparing it to DeepMind’s influential AlphaFold 2. She notes that it is cost-effective to run, which potentially allows for broader applications, such as using embeddings more extensively to predict protein properties, including homology. In biology, homology refers to similarity in the sequence or structure of proteins owing to shared ancestry. “It helps to interpret evolutionary information about proteins,” Powell said. “It is applicable to drug discovery, but also to understanding genetic disease.”
ESM-2’s efficiency is notable: it completes predictions up to 60 times faster than AlphaFold 2. Its predictions, however, tend to be slightly less accurate than the near-experimental precision of AlphaFold 2.
With the total number of possible protein sequences practically infinite, as Nature has observed, the white space for targeted and personalized drug discovery is mind-boggling. There are “proteins out there that nature has never seen before,” Powell said. The availability of generative AI tools and cloud-accessible HPC resources therefore opens up significant potential not only in drug discovery but also in understanding genetic diseases and in protein engineering.
Still, stumbling blocks remain. While nearly all life science companies now have some form of AI strategy, many struggle with execution, often because of regulatory concerns, cybersecurity risks and the complexities of data integration.
Tailoring AI tools to individual needs
One central consideration for the adoption of generative AI tools in drug discovery is the ability of researchers to customize tools based on their own needs. “BioNeMo allows companies like Genentech to customize these models and make them their own,” Powell said. (NVIDIA and Genentech have recently unveiled a multi-year alliance involving BioNeMo.) “We want to give every enterprise or tech bio company the ability to build from scratch very sophisticated foundation models or customize and adapt the models we’ve already provided with their own data,” Powell said.
Recognizing the varied expertise in data science and AI among scientists, the platform aims to be approachable and intuitive. “We’ve included training recipes to simplify deploying training jobs and automatically scaling across thousands of GPUs,” Powell said. This approach also facilitates seamless collaboration between seasoned data scientists and domain experts by reducing complexity in the AI deployment process.
In addition to its user-friendliness, the platform introduces a feature known as “validation in the loop” to help optimize AI model accuracy. Powell described this as a live process where “you’ll stop the training, get the checkpoints, do a little mini downstream task training to see if the model is gaining in performance.” The feature is meant to confirm that the model is developing correctly and that its accuracy improves as training proceeds. NVIDIA created the feature based on early-access feedback, tailoring it to “accelerate the journey for domain scientists who don’t want to spend months becoming AI experts.”
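The loop Powell describes (pause training, take a checkpoint, run a small downstream task, compare scores) can be illustrated with a toy sketch in plain Python. This is not BioNeMo code and does not use any NVIDIA API: the two-weight "model", the synthetic data and every function name below are hypothetical stand-ins chosen only to show the control flow.

```python
import random

# Toy illustration only: a two-weight linear "encoder" stands in for a large
# model, and a threshold classifier stands in for the mini downstream task.
# All names and data here are hypothetical; none come from BioNeMo.

def make_data(n, rng):
    """Synthetic samples: 2-D inputs with target 2*x1 - x2."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        data.append((x, 2.0 * x[0] - x[1]))
    return data

def embed(weights, x):
    """The 'embedding' is the encoder's scalar output."""
    return weights[0] * x[0] + weights[1] * x[1]

def train_step(weights, batch, lr):
    """One gradient step on mean squared error (the 'pretraining' task)."""
    grad = [0.0, 0.0]
    for x, target in batch:
        err = embed(weights, x) - target
        grad[0] += 2 * err * x[0] / len(batch)
        grad[1] += 2 * err * x[1] / len(batch)
    weights[0] -= lr * grad[0]
    weights[1] -= lr * grad[1]

def downstream_accuracy(checkpoint, probe_train, probe_test):
    """Fit a tiny threshold probe on frozen embeddings, then score it."""
    pos = [embed(checkpoint, x) for x, t in probe_train if t > 0]
    neg = [embed(checkpoint, x) for x, t in probe_train if t <= 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    correct = sum(
        1 for x, t in probe_test
        if (sign * (embed(checkpoint, x) - threshold) > 0) == (t > 0)
    )
    return correct / len(probe_test)

def train_with_validation(seed=0, total_steps=400, validate_every=50):
    rng = random.Random(seed)
    train = make_data(2000, rng)
    probe_train, probe_test = make_data(200, rng), make_data(200, rng)
    weights = [0.0, 1.0]          # deliberately poor starting point
    history, best = [], (None, -1.0)
    for step in range(1, total_steps + 1):
        train_step(weights, rng.sample(train, 16), lr=0.01)
        if step % validate_every == 0:
            checkpoint = list(weights)   # snapshot; training state is untouched
            acc = downstream_accuracy(checkpoint, probe_train, probe_test)
            history.append((step, acc))
            if acc > best[1]:
                best = (checkpoint, acc)
    return history, best

if __name__ == "__main__":
    history, (best_weights, best_acc) = train_with_validation()
    for step, acc in history:
        print(f"step {step:4d}  downstream accuracy {acc:.3f}")
```

In this sketch the downstream score is only logged and used to keep the best checkpoint. A real workflow might also use it to stop a run early or adjust hyperparameters, but that is an assumption beyond what the article describes.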
Gen-AI traction could signal tipping point in drug discovery and beyond
Highlighting that Big Tech companies are making long-term investments in AI in healthcare and expanding collaborations with partners across the life sciences ecosystem, Powell describes something of a tipping point for AI in the industry. “The realization that this generative AI stuff is very useful is at the top of everybody’s mind,” Powell said. Consulting firms have expressed similar sentiments. Research from Accenture suggests that gen-AI may affect roughly half of life sciences work hours, with about half the time currently spent on work activities at a biopharma company either automated or augmented. McKinsey agrees that gen-AI, across industries, had a breakout year in 2023, with one-third of McKinsey Global Survey respondents using the technology regularly in at least one business function.
McKinsey also notes that the healthcare sector is among those significantly affected by generative AI given its heavy reliance on knowledge work. In essence, the global management consulting firm underscores that the integration of gen-AI is not just a fad but a fundamental shift in how many industries, including healthcare, operate and innovate. Makers of pharmaceutical and medical products could see value gains of as much as 5%. “And more and more, results are starting to be realized,” Powell said, referring to generative AI projects in healthcare. “And so it’s turning corporate strategies into true results.”