Drug Discovery and Development

Better data can mean faster drug development, improved clinical trials and better healthcare

By Brian Buntz | May 30, 2024

What’s the difference between using a multimodal generative AI system to plan a vacation itinerary and using one to guide cancer treatment? For the former, data quality is a definite plus, but for the latter, it’s indispensable.

“If I ask ChatGPT to plan a vacation for me and it spits out something, I might sit there and second-guess it. ‘Is that restaurant really in that town?’” said Dr. Mitesh Rao, CEO of OMNY Health, a company founded in 2017 to synthesize and contextualize healthcare data. In the high-stakes world of healthcare, a “hallucination could mean someone gets hurt,” Rao said. Or worse. “If you can’t trace [AI output] back to the fundamental truth, as a provider or a clinician, you have to question that.”

OMNY Health works with life science clients, including those in pharma and medical devices, who have their own version of healthcare data headaches. “I basically started OMNY Health out of frustration,” Rao said. While working in senior roles related to patient safety and health services at prominent hospitals, Rao saw firsthand how data silos hindered research and potentially impacted the quality of care. “Pharma, med device and analytics companies kept coming to us wanting data, and every time we were doing it, it was piecemeal, like reinventing the wheel each time,” Rao said. Although he was involved in a spectrum of research collaborations, each felt “bespoke.” “My goal was, how do we build relationships at scale? To do that, you really need infrastructure. The EMRs weren’t going to provide that,” Rao said. “You need something that’s not tied to the underlying IT infrastructure, something that can actually connect and serve as those pipes.”

The multifaceted challenges and impacts of healthcare data quality

The problems with data quality in healthcare are multifaceted. Healthcare data is often stuck in a patchwork of disparate systems spanning everything from electronic medical records (EMRs) and lab systems to pharmacy databases and insurance claims. Add data from genomics, proteomics, clinical trials, research papers and beyond, and the complexity ramps up. Each data type has its own vernacular, making it difficult to get even a unified view of a single patient’s health journey. “The data is kind of everywhere,” Rao said. “There’s so much heterogeneity. It takes a lot to actually take that clinical data and transform it into usable, research-grade, regulatory-grade data and evidence,” he added.

In machine learning, merging and concatenating separate datasets can shed light on how different variables interrelate. But in healthcare, the array of data formats, terminologies and coding practices can be a stumbling block. Even data within a single medical institution can be siloed in separate systems, sometimes requiring doctors to switch computers to access complete patient information.
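To make the merging problem concrete, here is a minimal Python sketch of combining records from two siloed systems. Everything in it is an assumption made for illustration: the field names (`mrn`, `patient_id`, `gender`), the sex-code mapping and the sample rows are hypothetical and do not describe OMNY Health’s pipeline or any real EMR schema.

```python
from collections import defaultdict

# Hypothetical rows from two siloed systems; field names and codes are illustrative only.
emr_rows = [
    {"mrn": " a-001 ", "sex": "F", "dx": "E11.9"},
    {"mrn": "A-002", "sex": "M", "dx": "I10"},
]
lab_rows = [
    {"patient_id": "a-001", "gender": "2", "test": "HbA1c", "value": 7.2},
    {"patient_id": "A-003", "gender": "1", "test": "LDL", "value": 130.0},
]

# Assumed mapping from each system's local sex codes to one shared vocabulary.
SEX_CODES = {"F": "female", "M": "male", "2": "female", "1": "male"}


def normalize_id(raw):
    """Strip whitespace and upper-case so ' a-001 ' and 'A-001' match."""
    return str(raw).strip().upper()


def new_patient():
    """Empty unified record; 'sources' tracks which systems contributed."""
    return {"sex": "unknown", "diagnoses": [], "labs": [], "sources": set()}


def merge_sources(emr, labs):
    """Build one view per patient from both systems."""
    patients = defaultdict(new_patient)
    for row in emr:
        patient = patients[normalize_id(row["mrn"])]
        patient["sex"] = SEX_CODES.get(row["sex"], "unknown")
        patient["diagnoses"].append(row["dx"])
        patient["sources"].add("emr")
    for row in labs:
        patient = patients[normalize_id(row["patient_id"])]
        # Only fill in sex from the lab system if the EMR did not supply it.
        if patient["sex"] == "unknown":
            patient["sex"] = SEX_CODES.get(row["gender"], "unknown")
        patient["labs"].append((row["test"], row["value"]))
        patient["sources"].add("lab")
    return dict(patients)


unified = merge_sources(emr_rows, lab_rows)
```

Even in this toy case, the join only works after identifiers and vocabularies are reconciled, and two of the three patients end up with a partial view because they appear in only one system.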

Given the chaotic and sometimes hurried nature of clinical care, data quality is not always consistent, with missing values, errors and inconsistencies sometimes complicating analysis. “If you’ve seen the data coming out of one EMR within one hospital system, you know it’s often pretty messy and not very useful,” Rao said.
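A rough sketch of how such gaps and errors might be surfaced before analysis is shown below. The vitals rows, field names and plausibility ranges are hypothetical and chosen only to illustrate the idea; real clinical validation rules would come from domain experts.

```python
def profile_quality(rows, required_fields, valid_ranges):
    """Count missing and out-of-range values per field.

    Assumes non-missing values are numeric; valid_ranges maps a field
    name to an inclusive (low, high) pair.
    """
    report = {field: {"missing": 0, "out_of_range": 0} for field in required_fields}
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                report[field]["missing"] += 1
                continue
            bounds = valid_ranges.get(field)
            if bounds is not None:
                low, high = bounds
                if not (low <= value <= high):
                    report[field]["out_of_range"] += 1
    return report


# Hypothetical vitals entered during routine care.
vitals = [
    {"patient_id": "A-001", "heart_rate": 72, "temp_c": 36.8},
    {"patient_id": "A-002", "heart_rate": None, "temp_c": 98.6},  # Fahrenheit typed into a Celsius field
    {"patient_id": "A-003", "heart_rate": 640, "temp_c": ""},     # probably a typo for 64
]

# Illustrative plausibility ranges, not clinical guidance.
ranges = {"heart_rate": (20, 300), "temp_c": (30.0, 45.0)}

report = profile_quality(vitals, ["heart_rate", "temp_c"], ranges)
```

A report like this does not fix anything by itself, but it shows where curation effort is needed before the data is used for research or fed to a model.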

On top of that, healthcare data is inherently sensitive, and protecting patient privacy is a moral and legal priority. Companies working with this data must adhere to regulations such as HIPAA in the U.S. and GDPR in Europe. “There has to be a very strong moral and ethical compass in how the data is handled,” Rao said.

While data volume, along with ever-more powerful compute and algorithms, has helped fuel the current AI boom, simply having more data isn’t always better. Raw information, no matter how voluminous, is of limited value without careful curation and contextualization. As Rao put it, “People will start to realize that not all data is equal, not all data has come from the same source — quality, depth, timeliness, comprehensiveness, those are the important pieces.”

Implications for biopharma and the high stakes in healthcare

For biopharma companies, these data challenges have direct implications for drug development and clinical trials. Identifying promising drug targets, recruiting the right patients, and demonstrating efficacy all rely on accurate, comprehensive data. Fragmented or unreliable information can lead to problems with everything from target selection to clinical trial design.

For example, in recent decades, several major pharmaceutical companies have faced setbacks or warnings from regulators as a result of data snags, including data documentation problems and concerns about the accuracy and completeness of clinical trial data.

The increasing use of AI and machine learning in drug discovery and clinical trials further raises the stakes for data quality. The risk of AI hallucinations or faulty outputs from poor data inputs could lead biopharma companies down costly dead ends or even put patient safety at risk.

In healthcare, where lives are at stake, the consequences of poor data can be particularly severe. “That’s the thing when healthcare uses large language models — you need to provide data and you need provenance,” Rao said. “You need to be able to know that this output ties to a specific episode of care.”
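One way to picture that requirement is a check that every AI-generated statement points back to at least one known episode of care. The Python sketch below is an illustration under stated assumptions: the `SourceRef` and `Claim` types, the episode identifiers and the `untraceable` helper are all hypothetical and are not taken from OMNY Health or from any FDA guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    """Pointer to the record a statement was derived from (hypothetical schema)."""
    episode_id: str
    system: str


@dataclass(frozen=True)
class Claim:
    """One statement in a model's output, with the sources it cites."""
    text: str
    sources: tuple = ()


def untraceable(claims, known_episodes):
    """Return claims that cite no source or cite an episode that is not on record."""
    flagged = []
    for claim in claims:
        has_unknown = any(ref.episode_id not in known_episodes for ref in claim.sources)
        if not claim.sources or has_unknown:
            flagged.append(claim)
    return flagged


# Hypothetical episodes of care and model output.
known_episodes = {"EP-1001", "EP-1002"}
claims = [
    Claim("HbA1c was 7.2% at the last visit.", (SourceRef("EP-1001", "lab"),)),
    Claim("Patient responded well to therapy."),                            # no source cited
    Claim("Patient was admitted in March.", (SourceRef("EP-9999", "emr"),)),  # unknown episode
]

flagged = untraceable(claims, known_episodes)
```

In this sketch, the second and third claims would be held back for clinician review because they cannot be tied to a specific, known episode of care.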

OMNY Health’s growth and alignment with FDA priorities

OMNY Health was founded on the principle that better data can lead to better healthcare outcomes. Fast forward to today, and OMNY has records covering more than 78 million patients across the U.S. The company has forged partnerships with companies such as Atropos Health and Datavant, focusing on real-world data for applications spanning clinical trial research and drug development. The company is on track to surpass 100 million patients by the end of the year.

OMNY’s focus on data provenance and traceability aligns with the FDA’s increasing demands in this area. “Look at the way the FDA is going now around data — they want to see provenance,” Rao said. “They want to know the source of the data. They want proof.”

The explosive popularity of generative AI, along with the possibilities of AI hallucinations and the misuse of AI for misinformation, is set to “raise the bar” for data quality. Rao noted that “people will start to realize that not all data is equal.”

Ultimately, this push for higher data standards comes back to the core values of medicine itself. “It’s funny — physicians at our core, we’re data-driven. It’s how we’re trained,” Rao said. “Data often speaks to us in a way that is convincing. From early in our training, we’re taught that the literature, publications, research-grade output that has been vetted and peer-reviewed — that’s sort of the Bible that we can follow.” In an era where algorithms increasingly inform medical decisions, that same commitment to relying on trusted, verifiable data is no longer aspirational — it’s essential.


Filed Under: clinical trials, Data science, Drug Discovery, machine learning and AI
Tagged With: biopharma, clinical trials, data quality, drug discovery, Healthcare Data, real-world data, RWE
 

About The Author

Brian Buntz

As the pharma and biotech editor at WTWH Media, Brian has almost two decades of experience in B2B media, with a focus on healthcare and technology. While he has long maintained a keen interest in AI, more recently Brian has made data analysis a central focus, and is exploring tools ranging from NLP and clustering to predictive analytics.

Throughout his 18-year tenure, Brian has covered an array of life science topics, including clinical trials, medical devices, and drug discovery and development. Prior to WTWH, he held the title of content director at Informa, where he focused on topics such as connected devices, cybersecurity, AI and Industry 4.0. A dedicated decade at UBM saw Brian providing in-depth coverage of the medical device sector. Engage with Brian on LinkedIn or drop him an email at bbuntz@wtwhmedia.com.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media