While the topic of AI in drug discovery has received considerable attention in recent years, mature deployments of techniques such as machine learning in the industry remain rare.
“The chemistry domain is qualitatively different from any other problem that machine learning has exhibited real success in,” said Jason Rolfe, CTO of Variational AI (Vancouver, Canada).
A dataset involving FDA-approved drugs that have been tested in humans would be orders of magnitude smaller than the sort of datasets that underlie Generative Pre-trained Transformer 3 (GPT-3), a language model from OpenAI, an AI research company co-founded by Elon Musk.
High-throughput screening can generate substantially larger datasets. The PubChem database is billed by NIH as the “largest collection of freely accessible chemical information,” but its data can be noisy. Many compounds that appear active in an initial screen likely won’t be validated in a secondary screen because of factors such as aggregation, sample contamination and assay interference.
“These datasets are intrinsically more difficult to work with than something like ImageNet, which has been the workhorse for much of the architectural development in machine learning,” Rolfe said.
With ImageNet, a visual database containing more than 14 million annotated images, the data are relatively clean. “Some images are misclassified, but it’ll be something like a Shih Tzu classified as a Pomeranian,” Rolfe said.
By contrast, noise is “rampant in pharmacological data,” Rolfe said.
Drug discovery is “a very challenging domain to work in, but it has outsized promise,” Rolfe said.
With the cost of developing a new drug frequently reaching billions of dollars and failure rates high, “anything that can reduce that by even a fraction would be of extreme value to society,” Rolfe noted.
Variational AI applies machine learning to drug discovery, generating small molecules that become assets licensed to biopharma companies.
The biopharma industry is in the “first or second inning” in terms of adopting techniques such as machine learning for drug discovery, said Handol Kim, CEO of Variational AI.
Skepticism about AI’s promise in drug discovery has begun to fade during the pandemic, Kim said. “Pharma companies are realizing this could be a new potential modality not unlike biotech in the 1970s or 1980s,” he added.
In addition, “a lot of pharma companies are now investing in hiring people to specifically work on AI for drug discovery companies,” Kim said.