Drug Discovery and Development

Accelerating R&D with FAIR data

By Jim Olson | August 4, 2022

In 2016, the FAIR Guiding Principles for scientific data management and stewardship were published, laying out guideposts for scholarly data producers to make their data discoverable and usable in the future. The FAIR principles seek to ensure that data is Findable, Accessible, Interoperable, and Reusable. At the time of their publication, they articulated and centralized many points of discussion among data scientists. Since then, academia has widely referenced and applied the FAIR principles.

Of course, collaboration and data reuse are fundamental to academic research. But the ideas of FAIRness can actually bring value to any research organization. That includes life sciences companies with vast archives of proprietary data. For example, a pharma company may expect that its data will always stay within its own walls. But effective management and reuse of data can have dramatic benefits within the enterprise. By applying FAIR to leverage data assets already in their archives more efficiently, organizations can empower their R&D teams and drive faster innovation.   

How FAIR data can enable machine learning

Much of the potential for innovation is in the realm of machine learning and the possibilities it offers for transforming R&D. But as anyone who’s had a role in digital transformation initiatives can tell you, the “transformation” required to effectively leverage machine learning is easier said than done. Enterprise-wide FAIRification can be a big part of the solution.

By one widely cited estimate, about 80% of a data scientist’s time is spent on mundane data access and management tasks. In a machine learning effort for pharma, these tasks might include consolidating data from different locations, ensuring compliance with privacy and security regulations, uniting diverse data types, and standardizing and labeling the data. These tasks are not the highest and best use of scientists’ time. They’re also unlikely to contribute to job satisfaction, something research organizations must keep in mind given the shortage of data scientists.

To make data FAIR at scale while also reducing this “data wrangling” burden, pharma companies need tools for efficient data management and curation. That begins with a data infrastructure that can do the heavy work of automating data ingestion and curation and housing the data itself. Automation is a critical piece of the process, not only to reduce the errors and inconsistencies that manual curation can introduce but also to shorten the timeline and free scientists for the high-impact work of training their algorithms.

This automated curation is possible for both tabular data and complex data such as imaging. Tools can unearth these assets from archives, then read and extract metadata to make them more easily discoverable within the enterprise. Medical imaging offers great value to R&D teams, with rich information in each asset. When properly managed, images can also be combined with related data like EHRs, radiology reports, and clinical reports. Such data sources offer a wealth of information for researchers, provided they have the tools to handle the scale and complexity of this diverse data.
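As a rough illustration of the "read and extract metadata" step, the Python sketch below walks an archive directory, records basic file-level metadata, and builds a small searchable catalog. Everything in it is an assumption made for illustration: the function names (`extract_metadata`, `build_catalog`, `find`), the record fields, and the extension-to-type mapping are hypothetical, and a real curation tool would also parse format-specific headers (for example, DICOM tags), which this sketch does not attempt.

```python
import hashlib
from pathlib import Path

# Hypothetical mapping from file extension to a coarse data type.
DATA_TYPE_BY_SUFFIX = {
    ".dcm": "imaging",
    ".nii": "imaging",
    ".csv": "tabular",
    ".txt": "report",
}


def extract_metadata(path: Path) -> dict:
    """Return a file-level metadata record for one archived asset."""
    # Reads the whole file into memory; fine for a sketch, but large
    # imaging files would call for chunked hashing instead.
    data = path.read_bytes()
    return {
        "path": str(path),
        "data_type": DATA_TYPE_BY_SUFFIX.get(path.suffix.lower(), "unknown"),
        "size_bytes": len(data),
        # A content hash gives each asset a stable identifier and helps
        # spot duplicate copies scattered across an archive.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def build_catalog(root: Path) -> list[dict]:
    """Walk the archive under `root` and collect one record per file."""
    return [extract_metadata(p) for p in sorted(root.rglob("*")) if p.is_file()]


def find(catalog: list[dict], **criteria) -> list[dict]:
    """Return the records whose fields equal every given criterion."""
    return [r for r in catalog if all(r.get(k) == v for k, v in criteria.items())]
```

With a catalog built this way, a call such as `find(catalog, data_type="imaging")` returns every imaging record without anyone opening the files by hand, which is the kind of discoverability the Findable principle is after.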

A single source of truth for data

Leveraging these concepts can help enterprises establish a single source of truth for data and metadata within the organization. Accomplishing this objective can provide researchers with a central repository from which they can easily reference data. Compare this concept with today’s typical practice for leveraging imaging data, in which a data scientist has to write a SQL script and submit it to a contract research organization (CRO) or an internal information security group, then wait weeks to receive the requested data. In contrast, a modern, enterprise-scale platform can put rich, diverse information at scientists’ fingertips while maintaining compliance and data security.
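To make the contrast concrete, here is a minimal sketch of what a self-service lookup against a central metadata index might look like, with an in-memory SQLite database standing in for the repository. The `assets` table, its columns, the `deidentified` flag, and the sample rows are all invented for illustration; they do not describe any particular platform, and a real system would enforce access control and compliance checks well beyond a single filter.

```python
import sqlite3

# Hypothetical sample rows: (asset_id, study, data_type, deidentified).
SAMPLE_ROWS = [
    ("img-001", "STUDY-A", "imaging", 1),
    ("img-002", "STUDY-A", "imaging", 1),
    ("img-003", "STUDY-A", "imaging", 0),
    ("rep-001", "STUDY-A", "report", 1),
    ("img-004", "STUDY-B", "imaging", 1),
]


def make_index(rows):
    """Create an in-memory metadata index and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assets ("
        "asset_id TEXT PRIMARY KEY, study TEXT, "
        "data_type TEXT, deidentified INTEGER)"
    )
    conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def find_assets(conn, study, data_type):
    """Return IDs of de-identified assets matching a study and data type."""
    cur = conn.execute(
        "SELECT asset_id FROM assets "
        "WHERE study = ? AND data_type = ? AND deidentified = 1 "
        "ORDER BY asset_id",
        (study, data_type),
    )
    return [row[0] for row in cur.fetchall()]
```

In this toy setup, `find_assets(make_index(SAMPLE_ROWS), "STUDY-A", "imaging")` returns `["img-001", "img-002"]`: the researcher gets an immediate answer, and the asset that has not been de-identified is filtered out by the query itself rather than by a manual review.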

Where can researchers go with terabytes or petabytes of data made easily accessible? More automation can be applied in analysis pipelines and data processing to accelerate clinical trials and streamline discovery. Internal “de-siloing” can foster more collaborations within the enterprise and help it reduce costs by leveraging existing assets. And external partnerships can be improved as well, with data that is more readily useful to outside organizations. 

The near future will bring unprecedented volumes of data within scientists’ reach. In this landscape, collaboration and interoperability will be critical to success. The FAIR principles offer life sciences organizations a roadmap to succeeding in a more open environment and to unlocking greater value within their existing research assets. 

Jim Olson is CEO of Flywheel, a biomedical research informatics platform. The company uses cloud-scale computing infrastructure to address the increasing complexity of modern computational science and machine learning. Jim is a “builder” at his core. His passion is developing teams and growing companies. Jim has over 35 years of leadership experience in technology, digital product development, business strategy, high-growth companies, and healthcare. He has worked for both large companies and startups, including West Publishing (now Thomson Reuters), Iconoculture, Livio Health Group, and Stella/Blue Cross Blue Shield of Minnesota.


Filed Under: clinical trials, Drug Discovery
Tagged With: FAIR data, machine learning
 

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media