
Boltzmann machine example. The diagram illustrates a simple Boltzmann machine with visible units (v) and hidden units (h). [Wikipedia]
“Think of a reduction in sample size from 674 to 400 — if each patient costs $500,000 to treat on a trial, that’s millions of dollars saved,” said Alyssa Vanderbeek, a product manager at Unlearn.ai. But the company’s ambitions go beyond mere cost-cutting. By creating digital twins of patients, Unlearn hopes to cut the number of people exposed to placebos, speed drug development timelines, and spot subgroups that respond best to a given therapy.
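To put a number on "millions," the arithmetic behind Vanderbeek's example can be worked through directly. The short sketch below uses only the figures quoted above and assumes, as a simplification, that every patient costs the same flat amount; real per-patient costs vary by trial and indication.

```python
# Figures quoted in the article; the flat per-patient cost is a simplifying assumption.
original_sample_size = 674
reduced_sample_size = 400
cost_per_patient_usd = 500_000

patients_avoided = original_sample_size - reduced_sample_size
savings_usd = patients_avoided * cost_per_patient_usd

print(f"Patients avoided: {patients_avoided}")      # Patients avoided: 274
print(f"Estimated savings: ${savings_usd:,}")       # Estimated savings: $137,000,000
```

Under those assumptions, enrolling 274 fewer patients works out to roughly $137 million.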
One pillar of Unlearn’s technology is the Neural Boltzmann Machine (NBM), the engine behind its digital twins. Unlike typical predictive models that tend to focus on a single outcome, the NBM can simultaneously predict dozens of clinical outcomes, from lab results and vital signs to complex biomarkers, while also capturing the relationships between those outcomes. This gives researchers a richer understanding of a patient’s likely disease trajectory. Rather than offering a single prediction, the NBM generates a range of possible outcomes. That range is key to Unlearn’s ability to optimize clinical trials for lower cost or improved power.
Unlearn pairs the NBM with PROCOVA (PROgnostic COVariate Adjustment), a method that incorporates digital twin predictions into clinical trial analysis. PROCOVA has gained the support of major regulatory bodies, including the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). This regulatory acceptance is crucial for wider adoption within the pharmaceutical industry.
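Why would feeding a prediction into the analysis shrink a trial? A rough way to see it is to simulate many small trials and compare two estimates of the treatment effect: a plain difference in means, and one adjusted for a baseline prediction of each patient's outcome. The sketch below is an illustration only, not Unlearn's implementation. It assumes a simple linear adjustment (ordinary least squares with the prediction as a covariate), and the simulated `score`, the noise levels, the effect size, and all function names are hypothetical.

```python
import random
import statistics


def mean(values):
    return sum(values) / len(values)


def simulate_trial(rng, n_per_arm=100, true_effect=0.5):
    """Simulate one two-arm trial as (treated, score, outcome) rows."""
    rows = []
    for treated in (0, 1):
        for _ in range(n_per_arm):
            score = rng.gauss(0.0, 1.0)  # hypothetical digital-twin prediction
            noise = rng.gauss(0.0, 0.5)  # variation the prediction cannot explain
            outcome = score + true_effect * treated + noise
            rows.append((treated, score, outcome))
    return rows


def unadjusted_effect(rows):
    """Plain difference in mean outcome between the two arms."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return mean(treated) - mean(control)


def adjusted_effect(rows):
    """Treatment coefficient from OLS of outcome on treatment and score."""
    sxy = sxx = 0.0
    arm_means = {}
    for arm in (0, 1):
        xs = [x for t, x, _ in rows if t == arm]
        ys = [y for t, _, y in rows if t == arm]
        mx, my = mean(xs), mean(ys)
        arm_means[arm] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx  # pooled within-arm slope on the score
    (mx0, my0), (mx1, my1) = arm_means[0], arm_means[1]
    return (my1 - my0) - slope * (mx1 - mx0)


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [simulate_trial(rng) for _ in range(300)]
unadjusted = [unadjusted_effect(rows) for rows in trials]
adjusted = [adjusted_effect(rows) for rows in trials]

var_unadjusted = statistics.variance(unadjusted)
var_adjusted = statistics.variance(adjusted)
print(f"Variance of unadjusted estimate: {var_unadjusted:.4f}")
print(f"Variance of adjusted estimate:   {var_adjusted:.4f}")
```

In this toy setup the adjusted estimate varies less from trial to trial because the prediction soaks up outcome variation unrelated to treatment. A less noisy estimate is what allows a trial to reach the same statistical power with fewer patients; how much is gained in practice depends on how well the prediction tracks real outcomes.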
Putting data pieces back together again
Developing sophisticated models like Unlearn’s requires addressing data harmonization challenges, and the company has dedicated significant resources to them. “We’ve spent about 80% of our tech stack and pipelines and work streams just on taking in data, cleaning it, harmonizing it, understanding it, and using it to build these models,” explained Vanderbeek.
Unlike traditional approaches that often build study-specific models, Unlearn focuses on creating general disease-specific models to broaden their applicability. “With our digital twin generators, these are general disease-specific models that are not necessarily specific to a trial,” Vanderbeek added.
Data privacy is paramount, with Unlearn adhering to strict protocols to protect client information.
Focusing on usability
Introducing cutting-edge AI solutions into the traditionally conservative pharmaceutical industry presents its own set of challenges. The industry’s if-it-ain’t-broke-don’t-fix-it habits have historically made the adoption of new technologies daunting. Unlearn sees this as an opening for agile startups and is focusing on ease of use. “Our primary customers have a role called clinical development lead or medical director. They’re not technical at all. This interface really resonates with them,” explained Dao.
Making precision medicine a reality

Angela Dao
Unlearn’s vision extends beyond improving the efficiency of clinical trials; it aligns with the broader movement towards precision medicine. “It’s like this idea of precision medicine, right? Tailoring to an individual’s specific disease characteristics,” Vanderbeek remarked. By tapping digital twins, Unlearn can help identify which patients are most likely to respond to certain treatments, enabling more personalized and effective therapeutic strategies.
Addressing diversity in clinical data is another critical aspect of Unlearn’s mission. By training its models on diverse datasets that span ages, genders, ethnic backgrounds, and disease severities, Unlearn aims to improve the generalizability of clinical trial outcomes across different populations. “That diversity problem in the data is a huge issue across the board for clinical trials, because they’re like, what, 90-something percent Caucasian up until kind of recently,” Vanderbeek pointed out. Sophisticated ML models, paired with disparate data sources, can shed light on which patients respond to a given therapy and which don’t. “Can you incorporate those into the very rigorous analyses that go into subgroup discoveries?” Vanderbeek asked.
“Subgroup discovery in general is of great interest in the pharma space, because the hope is that even if you have a drug that did not work in your population, maybe there’s some subpopulation for whom it’s effective,” said Dao.