Drug Discovery and Development

ChatRWD outperforms tech giants in medical question-answering

By Brian Buntz | July 30, 2024

When it comes to medical AI, the biggest names aren’t necessarily delivering the best results. While tech giants race to build ever-larger language models, a new preprint reports that a smaller player is outperforming the industry heavyweights on clinical accuracy and physician trust.

Putting large language model-based systems to the test

Researchers put five generative AI systems to the test, evaluating their ability to provide reliable, actionable medical advice. Nine independent physicians rated the systems’ answers to 50 clinical questions on relevance, reliability, and actionability.

The widely known large language models (LLMs) – ChatGPT-4, Claude 3 Opus, and Gemini Pro 1.5 – struggled to deliver trustworthy answers. They provided relevant and evidence-based responses for only 2% to 10% of the questions. In addition, these LLMs frequently “hallucinated” citations, with 25% to 47% of their cited sources being either fictitious or entirely irrelevant to the question at hand.

OpenEvidence, a system based on retrieval-augmented generation (RAG), fared better, delivering relevant and evidence-based answers to 24% of questions. But ChatRWD, an AI-powered chat-to-database application from Atropos Health, fared best, providing relevant and evidence-based answers to 58% of questions.

The need for trustworthy, clinical-grade generative AI

Brigham Hyde, Ph.D.

“I believe the whole generative AI space in healthcare is moving towards quality and trust, which is what we’ve been focusing on from the beginning,” said Dr. Brigham Hyde, CEO of Atropos Health, the developer of ChatRWD.

Many healthcare professionals have been exploring off-the-shelf large language models for roughly the past six months to a year, Hyde said. “And the hallucination problem is very real,” he added. “The goal is to find a way to use LLMs that offer convenience and speed while maintaining trust and accuracy, which is the holy grail for healthcare providers.”

Hyde notes that the version of ChatRWD used in the study is an early version. “We will not [formally] launch until our accuracy is in the 90-percent range,” he said. “And even then, we recommend that clinicians use it as a tool to inform their decisions.” He emphasized that it is still important for clinicians using such technology to tap experts to help contextualize the results.

The shift from convenience to trust

“I think what’s happening now is a shift from convenience to trust,” Hyde said. “As physicians, we’re being inundated with messages from these LLMs, and in our setting, where we’re producing evidence that could inform a treatment decision for a patient, we simply can’t afford a 20% error rate.”

Other companies are also working on more medically accurate AI systems. One example is Google’s Med-PaLM 2, which has shown promising results on medical exams and in answering consumer health questions. Other tech companies, such as IBM and Microsoft, have similar initiatives.

Atropos is unique in its focus on providing rapid, high-quality real-world evidence (RWE) to support clinical decision-making and research in healthcare.

“Even the clinical trials we do run often exclude a significant portion of patients – around 70% – who have comorbidities,” Dr. Hyde explained. “And guess what? That’s about 70% of the patients doctors see every day.”

ChatRWD aims to bridge this critical gap by providing clinicians with rapid access to real-world evidence. “Once they [clinicians] input their query, we return a new study in under three minutes,” Hyde explained. This stands in contrast to traditional methods, where such comparative effectiveness studies can take six to eight weeks to conduct and often require large teams and significant resources. “Now you’ve got an individual user, with no programming ability, no statistics ability, just asking the question and being led through these steps,” he added.


Filed Under: clinical trials, Drug Discovery and Development, machine learning and AI
Tagged With: AI in healthcare, Atropos Health, ChatRWD, clinical trials, drug development, medical decision-making, real-world evidence
 

About The Author

Brian Buntz

As the pharma and biotech editor at WTWH Media, Brian has almost two decades of experience in B2B media, with a focus on healthcare and technology. While he has long maintained a keen interest in AI, more recently Brian has made data analysis a central focus, and is exploring tools ranging from NLP and clustering to predictive analytics.

Throughout his 18-year tenure, Brian has covered an array of life science topics, including clinical trials, medical devices, and drug discovery and development. Prior to WTWH, he held the title of content director at Informa, where he focused on topics such as connected devices, cybersecurity, AI and Industry 4.0. A dedicated decade at UBM saw Brian providing in-depth coverage of the medical device sector. Engage with Brian on LinkedIn or drop him an email at bbuntz@wtwhmedia.com.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media