Researchers at Purdue University have devised chemical reactivity flowcharts and novel machine learning models that could accelerate drug discovery.
Although the use of machine learning for drug development has grown, researchers’ ability to understand machine learning recommendations has been limited.
In a paper published in Organic Letters, Purdue professor Gaurav Chopra noted that the machine learning (ML) models involved in the research were “human understandable.” The models could improve chemists’ ability to interpret chemical reaction outcomes in organic and process chemistry.
“The ML research presented here can be used today within pharma to develop their discovery, process chemistry and developmental pipelines and enhance efficiency,” Chopra said.
The machine learning models rely on chemical reactivity flowcharts trained on a small number of chemical reactions. “Of course, more data helps, but the statistically robust machine learning models we developed help with less data, which I think will help pharma readily integrate this in their pipelines with minimal cost compared to developing large programs for infrastructure,” Chopra said. “We experimentally tested our models with less data including testing the new experiment that was based on the chemical reactivity flowchart.”
The researchers hope the machine learning models will improve drug developers’ ability to interpret chemical reactivity, especially that of N-sulfonyl imines, a class of chemical compounds containing a carbon-nitrogen double bond.
The Purdue researchers chose N-sulfonyl imines owing to their widespread use in various chemical transformations in organic chemistry.
Machine learning models have traditionally depended on experimental training datasets that take considerable manual experimentation to build.
The Purdue researchers assert that it is possible to train machine learning models using relatively small but targeted datasets.
To test their approach, the researchers applied it to a fast multi-component reaction of acyclic or cyclic N-sulfonyl imines and predicted the reaction outcomes.
The researchers used density functional theory to explain what they termed the “fast and peculiar” reactivity mechanism of N-sulfonyl imines and to shed light on the compounds’ transition states and intermediates.
To explain the machine learning models’ predictions of reaction outcomes, the researchers developed a chemical reactivity flowchart.
The researchers state that the machine learning approach can be extended to any multi-component reaction, or other chemical reaction, used to synthesize a library of chemical compounds.
“I think that the chemical reactivity flowcharts will help chemists not follow dead-end leads and suggest new experiments to do for the particular reaction of their interest during drug development,” Chopra concluded. “The major point here is that you don’t need large-scale data and these models can be used readily for day-to-day chemistry or reaction discovery or development versus using large-scale robotics.”