In June 2025, the Food and Drug Administration rolled out Elsa, an agency-wide generative AI assistant that officials say is already helping to speed the review of new drugs and devices and shrink weeks of paperwork into minutes. It is a vision that could reshape how lifesaving therapies reach patients, but Elsa’s first six months have exposed growing pains, including internal pushback and documented hallucinations. They have also surfaced a more fundamental problem regulators have long overlooked: fragmented, disparate, and misaligned data standards.
Today’s new drug applications arrive at the FDA as massive, unstructured documents that can exceed 100,000 pages. Protocols, manufacturing data, and trial documentation are stitched together in largely incompatible formats with inconsistent terminology. A safety event labeled as “nausea” in one trial might appear as “gastrointestinal disorder” in another. Even when companies rely on shared dictionaries like MedDRA, they often use different versions, making it impossible to compare like with like.
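A toy sketch makes the problem concrete. The terms and the harmonization map below are hypothetical illustrations, not actual MedDRA content: the point is only that identical safety findings coded under different labels are invisible to naive comparison until they are normalized to one vocabulary.

```python
# Hypothetical event records from two trials. Terms and dictionary
# versions are illustrative only, not real MedDRA entries.
trial_a = [{"event": "nausea", "dictionary": "MedDRA 23.0"}]
trial_b = [{"event": "gastrointestinal disorder", "dictionary": "MedDRA 26.1"}]

# Without a shared vocabulary, naive matching finds no overlap.
raw_match = {e["event"] for e in trial_a} & {e["event"] for e in trial_b}
print(raw_match)  # set() -- nothing comparable

# A hypothetical harmonization map to one preferred term. Here we assume
# both labels describe the same underlying finding.
PREFERRED_TERM = {
    "nausea": "Nausea",
    "gastrointestinal disorder": "Nausea",
}
normalized_a = {PREFERRED_TERM[e["event"]] for e in trial_a}
normalized_b = {PREFERRED_TERM[e["event"]] for e in trial_b}
print(normalized_a & normalized_b)  # {'Nausea'} -- now comparable
```

In real submissions that mapping work is done by hand, trial by trial, which is exactly the labor the article describes.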
This fragmentation drives up costs and slows reviews. Analyses of pharmaceutical R&D suggest that the inflation-adjusted cost of bringing a new drug to market has roughly doubled every nine years for about half a century, a trend dubbed Eroom’s Law in a 2012 paper as a deliberate inversion of Moore’s Law. While most industries use technology to become faster and cheaper, pharmaceutical R&D has mostly moved in the opposite direction.
The inefficiencies are staggering. Protocol amendments, which occur in 57 percent of clinical trials, take three to six months to process and cost upward of $535,000 in Phase III studies. A single dosing change can ripple across dozens of disconnected documents: statistical plans, consent forms, case report forms. FDA reviewers are left to manually reconcile these updates across thousands of pages.
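The ripple effect exists because each document duplicates the same value. A minimal sketch, with invented field names, shows the alternative: when downstream artifacts reference a single structured source of truth, one amendment propagates automatically instead of being re-reconciled across thousands of pages.

```python
# Hypothetical sketch: the dose is defined once, and downstream documents
# render from it rather than restating it.
study = {"dose_mg": 200}

def consent_form(s):
    # Consent language generated from the structured value.
    return f"Participants will receive {s['dose_mg']} mg once daily."

def stats_plan(s):
    # Statistical analysis plan text generated from the same value.
    return f"Analyses assume a {s['dose_mg']} mg dose."

study["dose_mg"] = 150  # a single protocol amendment
print(consent_form(study))  # both documents now reflect the new dose
print(stats_plan(study))
```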
Existing standards, where they exist, are applied inconsistently. The Clinical Data Interchange Standards Consortium (CDISC) has developed a suite of models to structure clinical trial data, yet adoption varies across sponsors and is often limited to only certain standards. Similar issues appear in manufacturing data. Specifications for identical analytical tests might be written differently across batches, while stability data arrive in custom spreadsheets that fail to integrate with FDA import systems.
The result has real human consequences. When the FDA’s Oncology Center of Excellence reviewed checkpoint inhibitor submissions across cancer types, reviewers spent months harmonizing criteria that should have been comparable. Instead of using AI to detect treatment patterns, scientists were forced into data archaeology. Every month lost in this process is a month patients wait for access to potentially lifesaving treatments.
The solution is regulatory willpower to fix the inputs themselves. The FDA does not need to wait for industry consensus. Even incremental requirements now would give AI a real chance to succeed.
First, accept digital protocols. Many sponsors already design trials with digital tools that capture every protocol element, from visit schedules to endpoint definitions, as structured data. But for FDA submission, they are forced to flatten them into Word documents. The agency then uses natural language processing to extract the information, essentially asking AI to decode what was already digital. Standards like ICH M11 and the CDISC Unified Study Definition Model (USDM) provide harmonized frameworks for defining and exchanging digital protocol elements in a consistent and machine-readable format, but they have been in development for years and still are not required. By requiring, or at least strongly recommending, structured digital protocols now, the FDA could save months of wasted effort.
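The difference between a flattened document and a structured protocol fits in a few lines. The field names below are simplified illustrations, not the actual ICH M11 or CDISC USDM schema.

```python
# Hypothetical, simplified protocol element -- illustrative structure only,
# not the real USDM or ICH M11 data model.
protocol = {
    "studyId": "ABC-123",
    "endpoints": [
        {"type": "primary", "measure": "overall survival", "timepointWeeks": 52}
    ],
    "visitSchedule": [
        {"visit": 1, "week": 0, "procedures": ["consent", "baseline labs"]},
        {"visit": 2, "week": 4, "procedures": ["dosing", "safety labs"]},
    ],
}

# Structured data can be queried directly; no NLP extraction required.
dosing_weeks = [v["week"] for v in protocol["visitSchedule"]
                if "dosing" in v["procedures"]]
print(dosing_weeks)  # [4]

# The same element flattened into prose, as in a Word submission:
flattened = "Visit 2 occurs at Week 4 and includes dosing and safety labs."
# Recovering the schedule from this sentence requires the NLP round trip
# the article describes.
```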
Second, establish a shared vocabulary for regulatory data. Differences in definitions can reflect valid scientific or trial-specific reasons, but gaps remain across disease definitions, biomarkers, and outcome measures. CDISC controlled terminology and collaboration with NCI Enterprise Vocabulary Services have made headway, yet adoption remains limited without mandates to drive implementation. The FDA should work with NIH and industry to promote common ontologies where feasible, helping sponsors speak the same language in ways that let AI detect meaningful patterns across trials without forcing uniformity where it is inappropriate.
Third, create incentives that accelerate adoption. Priority review timelines for digital submissions. Fee reductions for early adopters. Public metrics showing how much faster structured submissions move through review. Just as electronic health record adoption accelerated after the HITECH Act’s incentives, regulatory carrots can make structured data standards the industry default.
The building blocks already exist. CDISC standards cover most clinical data types. HL7 FHIR enables healthcare data exchange. What is missing is an FDA mandate to harmonize, a step that would spur all stakeholders to address the gaps in legacy systems.
Every day of delay costs sponsors millions in lost revenue and costs patients access to therapies that could extend or save their lives. The FDA is right to embrace AI as a tool for accelerating drug reviews. But sophisticated algorithms cannot rescue a broken foundation. If regulators want AI to deliver, they must first require digital protocols and standardized data. Only then will AI have a real chance to transform drug development.
Angie Maurer is VP of Clinical Solutions at Faro Health and a clinical research executive with more than two decades of experience designing and running clinical trials. She focuses on protocol digitization, data standards, and using technology to improve trial quality and feasibility.
Filed Under: clinical trials, Data science



