
[Photo, rendering, and screenshot credits: Innophore GmbH 2017-2021]
For Innophore’s Christian Gruber, Ph.D., zeroing in on the specific “binding sites” within proteins—rather than the proteins themselves—is the key to faster, more targeted drug discovery. “Our main principle is that proteins matter, but binding sites matter even more,” Gruber said. “Any given protein may have multiple binding sites for potential drugs, so a big part of our work is comparing all those possible sites to ensure specificity and reduce side effects.” By examining binding sites across entire genomes, the company aims to identify promising targets more quickly and refine treatments before they ever reach a patient.
[Video from Innophore GmbH]
Validating speed and precision in crisis
“Our pharmaceutical clients detect off-target interactions at roughly twice the rate of other in silico methods,” said Gruber. Innophore’s platform proved its mettle during COVID-19 by identifying the SARS-CoV-2 main protease as a prime drug target by mid-January 2020—months before Pfizer’s Paxlovid targeted the same site. The company analyzed the virus’s genome within hours of its release, using binding-site comparisons to prioritize the protease for antiviral development. This rapid response underscored the technology’s ability to map interactions in emerging pathogens, offering a blueprint for future pandemic readiness. The saga illustrated how an AI-driven approach to binding-site analysis can compress drug discovery timelines from years to weeks when paired with scalable compute — in this case provided by AWS.
“By mid-January 2020, we suggested the main protease of SARS-CoV-2 was the most promising drug-binding site,” Gruber noted. “Paxlovid later targeted that exact site, which demonstrates how quickly we can highlight critical targets in a pandemic setting.”
Not long after, Wired published an article titled “Biotech drops the proprieties and goes hog-wild for sharing” mentioning Innophore and other biotech efforts to combat the pandemic. “I’m also proud Wired featured us—I’ve been reading it since I was a kid. We got a lot of attention during and after the pandemic, thanks to our research and collaborations with proven outcomes,” Gruber said.

Innophore group picture in Graz, February 2020 [Innophore GmbH]
Early collaboration with NVIDIA
That same speed was boosted by Innophore’s early collaboration with NVIDIA, which placed the Austrian startup among the first handful of users of the BioNeMo platform. “We’ve been among the very first people that were using BioNeMo,” Gruber explained. “It was us, AstraZeneca, and a few partners.” By integrating BioNeMo’s cloud APIs into its Catalophore platform, and by co-developing AI-driven approaches like CavitOmiX (a plugin for Schrödinger’s PyMOL), Innophore can accelerate safety screening and drug design predictions, jumping from hundreds of predictions per second to 5 million.
Gruber and colleagues at Innophore and NVIDIA recently explained in Scientific Data how they combined homology-based and AI-driven modeling tools to predict the 3D structures of 42,042 distinct human proteins. By using NVIDIA’s BioNeMo platform—integrating AlphaFold 2, OpenFold, and ESMFold—alongside Innophore’s CavitOmiX technology, the team assembled a dataset designed for maximum coverage and consistency across the human proteome.
Human proteins underpin both health and disease, and having reliable 3D models of these molecules can dramatically accelerate drug discovery. While earlier efforts like AlphaFold paved the way for structural predictions, certain proteins or binding sites remained incomplete or low-confidence. By merging multiple AI prediction engines with Innophore’s binding-site analysis, this dataset fills those gaps and provides a more robust foundation for identifying potential drug targets—or off-target interactions—across the entire proteome.

Christian Gruber, Ph.D.
“Our dataset is offered in both unedited and edited formats for diverse research requirements. The unedited version contains structures as generated by the different prediction methods, whereas the edited version contains refinements, including a dataset of structures without low prediction-confidence regions and structures in complex with predicted ligands based on homologs in the PDB.” Low-confidence regions can negatively influence downstream analyses of the affected models, such as molecular docking. The team therefore provided an additional dataset in which those low-confidence regions have been removed from the structures.
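To make the idea of stripping low-confidence regions concrete, here is a minimal Python sketch. It is not Innophore's pipeline. It assumes the common convention of AlphaFold-style models, in which the per-residue confidence score (pLDDT, 0 to 100) is written into the B-factor column of each PDB atom record, and it assumes a cutoff of 70, which is a widely used rule of thumb rather than a value stated in the article. The function names `make_atom_line` and `strip_low_confidence` are hypothetical.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, plddt,
                   xyz=(0.0, 0.0, 0.0)):
    """Build one fixed-width PDB ATOM record (hypothetical helper for the demo).

    The confidence score is written to columns 61-66, the B-factor field.
    """
    x, y, z = xyz
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{plddt:6.2f}")


def strip_low_confidence(pdb_text, min_plddt=70.0):
    """Drop ATOM/HETATM records whose B-factor field is below min_plddt.

    Assumption: the B-factor field holds pLDDT, and every atom of a residue
    carries the same value, so filtering atoms removes whole residues.
    All other record types (HEADER, TER, END, ...) pass through unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            plddt = float(line[60:66])  # columns 61-66, zero-based slice
            if plddt < min_plddt:
                continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    # Three toy residues; only the middle one falls below the assumed cutoff.
    model = "\n".join([
        make_atom_line(1, "CA", "MET", "A", 1, 92.5),
        make_atom_line(2, "CA", "GLY", "A", 2, 41.0),
        make_atom_line(3, "CA", "SER", "A", 3, 78.3),
        "END",
    ])
    print(strip_low_confidence(model))
```

Removing residues this way leaves gaps in the chain, so a downstream tool such as a docking program would see the remaining confident segments as separate fragments.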
This dataset also has potential for driving progress in machine learning applications in protein structure and function research: “The availability of comprehensive structural data is fundamental for the advancement of AI-driven tools, e.g., RFdiffusion, which is instrumental in the design of novel proteins.” In 2024, DeepMind unveiled AlphaFold 3, which can accurately model more than 99% of molecular types in the Protein Data Bank.
This comprehensive protein structure dataset exemplifies Innophore’s broader strategy of combining multiple computational approaches to enhance drug discovery precision. Now, the team plans to scale these methods further—incorporating personalized genomes, multiple organisms, and more robust structural data to guide next-generation drug discovery approaches. Innophore aims to refine large-scale, binding-site–focused models capable of accelerating treatments for emerging diseases and patient-specific needs alike. “We decided early on that if customers can verify our results in their own facilities, they can adopt the findings faster,” Gruber said.

[Rendering credit: Innophore GmbH 2017-2021]