Drug Discovery and Development

How AI and the cloud can transform R&D workflows and fuel collaboration

By Brian Buntz | June 19, 2023

Collaboration is a cornerstone of R&D

[Image courtesy of vectorfusionart/Adobe Stock]

The rise of cloud-based systems and AI-assisted analysis has dramatically transformed the landscape of scientific research and collaboration. One striking example is the mRNA company Moderna, which used the cloud to develop and deliver its first clinical batch of a COVID-19 vaccine candidate for phase 1 trials in a mere 42 days after the initial viral sequencing.

While enabling real-time international collaboration, this new paradigm has also introduced novel challenges. Simon Adar, CEO of Code Ocean, encountered the struggles of cross-geographical R&D collaboration firsthand during his PhD work at Cornell University. While file-sharing systems provided some relief, they fell short when it came to coordinating code, data, software and troubleshooting across different geographies. “It wasn’t enough, because when you have code and data, you also need all the software dependencies for that code to work,” Adar said.

“In the end, collaboration occurred primarily through Word documents, which was disappointing,” Adar said. “This experience fueled my desire to improve collaboration in research projects.”

Adar’s experience led to the creation of Code Ocean in 2015, a computational research platform designed to streamline and enhance the research process. “This vision became a research project during my postdoc at Cornell, which later evolved into a company. This initiative aligns with the ‘Open Science Library’ from Nature and Code Ocean, where researchers can browse code without having to install it locally, thereby eliminating many of the mentioned challenges in scientific collaboration,” Adar explained.

The company aims to alleviate inefficiencies through a five-point strategy centered on:

  1. Cloud-based technologies.
  2. Centralized asset management.
  3. Standardized analysis workflows.
  4. A streamlined research management lifecycle.
  5. Self-serve options for the research community.

One of Code Ocean’s technologies, known as a ‘capsule’, offers a complete, self-contained computational environment that includes everything necessary to reproduce computational research. “We have a user interface, so you don’t need to be a Docker expert for this. Our UI generates its own Docker file,” Adar said. This approach provides access to all necessary components along with a timeline of various versions. This method allows scientists, computational biologists, discovery IT members and bioinformaticians to keep tabs on different iterations and changes.
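To make the idea of a generated Docker file concrete, here is a minimal Python sketch of how a tool might turn a short environment description into Dockerfile text. It is an illustration only, not Code Ocean's implementation: the `EnvironmentSpec` class, the `render_dockerfile` function, the base image and the package versions are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentSpec:
    """Hypothetical environment description (not Code Ocean's actual schema)."""
    base_image: str
    pip_packages: dict = field(default_factory=dict)  # package name -> pinned version
    entrypoint: str = "run.py"


def render_dockerfile(spec: EnvironmentSpec) -> str:
    """Render Dockerfile text in which every Python package is pinned."""
    lines = [f"FROM {spec.base_image}", "WORKDIR /code"]
    if spec.pip_packages:
        # Sort so the same spec always yields the same file.
        pins = " ".join(
            f"{name}=={version}"
            for name, version in sorted(spec.pip_packages.items())
        )
        lines.append(f"RUN pip install --no-cache-dir {pins}")
    lines.append("COPY . /code")
    lines.append(f'CMD ["python", "{spec.entrypoint}"]')
    return "\n".join(lines) + "\n"


# Example values are assumptions chosen for illustration.
spec = EnvironmentSpec(
    base_image="python:3.11-slim",
    pip_packages={"pandas": "2.2.2", "numpy": "1.26.4"},
)
dockerfile_text = render_dockerfile(spec)
print(dockerfile_text)
```

Pinning exact versions and sorting the package list are the design choices that matter here: two collaborators who start from the same spec get the same file, which is the property a versioned timeline depends on.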

Cloud adoption in biopharma companies: A generational distinction

Adar notes a stark contrast in the digital strategies of older and younger biopharma companies, particularly in their rate of cloud adoption. He estimates that upwards of 90% of companies established in the past decade are integrating cloud-based technologies as a core part of their infrastructure. Conversely, established biopharma companies have traditionally relied on on-premises data centers and legacy systems. “But even established companies are starting to think about the cloud,” Adar said. “It’s much more modern, requires less upfront capital investment and you can scale up and down according to your demands.”

Given the continued adoption of the cloud and its ability to save researchers time and effort in setting up and maintaining computing environments, a growing number of research tools have emerged to support seamless collaboration and reproducibility. Examples include Docker, Binder and Google’s Colaboratory, which help researchers tackle different facets of scientific investigation. Yet researchers still often struggle with issues like software compatibility, especially when using open-source projects on different systems. “For instance, it can be challenging to get open-source projects working for a variety of reasons,” Adar said. “One common issue is software compatibility, as researchers might be using different systems, such as Mac, Linux or Windows and Python versions.”

Code Ocean lineage graph showing result provenance of a bulk RNA-seq pipeline. A visual representation of the input data, capsules, and pipeline used to generate the final result, "Star Output GSE157194". Each node of the lineage graph includes a direct link to the asset used for analysis.

Cloud-based tools supporting research collaboration and reproducibility

To address these challenges, cloud-based services such as Code Ocean and the multi-language computational notebook platform Nextjournal have emerged as options for supporting reproducible research. Code Ocean aims to capture all information needed to re-execute an analysis. This allows researchers to share fully reproducible analyses with reviewers, collaborators and the public.

In the modern research landscape, reproducibility of results presents significant challenges. These range from the complexity of experimental procedures to the difficulty of sharing vast amounts of data and analysis methods, a gap that cloud-based services of this kind have set out to close in recent years.

Code Ocean, for example, enables the encapsulation of all computational details required for an analysis, from the dataset used to the specific versions of software libraries, in a ‘compute capsule.’ Nextjournal offers the ability to seamlessly interweave code, text and data in a single document.  
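As a rough illustration of what capturing "the dataset used" and "the specific versions of software libraries" can involve, the Python sketch below records a checksum for each data file and the installed version of each named package. It is a simplified stand-in, not the format of a Code Ocean capsule; the `build_manifest` function, the manifest layout and the sample file are assumptions made for this example.

```python
import hashlib
import json
import platform
import tempfile
from importlib import metadata
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_files, packages):
    """Record interpreter version, package versions and data checksums."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "packages": versions,
        "data": {Path(p).name: sha256_of_file(Path(p)) for p in data_files},
    }


# Demonstration with a throwaway file; real inputs would be the study's data.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "reads.txt"
    sample.write_bytes(b"ACGT")
    manifest = build_manifest([sample], ["definitely-not-a-real-package-xyz"])

print(json.dumps(manifest, indent=2))
```

A collaborator who rebuilds the same manifest on another machine can compare the two: a differing checksum points to changed data, and a differing version points to a changed dependency.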

Adar emphasized the importance of distinguishing between data sources and the storage technologies used to store and process them. “It’s essential to have code that can access and process this data correctly within different workflows and pipelines, regardless of the storage method used,” he pointed out.
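One common way to keep analysis code independent of the storage method is to write it against a small interface instead of a specific backend. The Python sketch below shows the pattern under stated assumptions: `DataStore`, `LocalDirStore`, `InMemoryStore` and `count_fasta_records` are hypothetical names, and the in-memory class merely stands in for a cloud bucket so the example runs without any cloud account.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class DataStore(Protocol):
    """Anything that can return the bytes stored under a key."""

    def read_bytes(self, key: str) -> bytes: ...


class LocalDirStore:
    """Reads keys as file paths relative to a directory on disk."""

    def __init__(self, root):
        self.root = Path(root)

    def read_bytes(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryStore:
    """Stand-in for an object store such as a cloud bucket (illustrative only)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read_bytes(self, key: str) -> bytes:
        return self._objects[key]


def count_fasta_records(store: DataStore, key: str) -> int:
    """Analysis step that only knows about the DataStore interface."""
    text = store.read_bytes(key).decode("utf-8")
    return sum(1 for line in text.splitlines() if line.startswith(">"))


fasta = b">seq1\nACGT\n>seq2\nGGCC\n"

# The same function runs unchanged against both backends.
memory_count = count_fasta_records(InMemoryStore({"sample.fa": fasta}), "sample.fa")
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sample.fa").write_bytes(fasta)
    local_count = count_fasta_records(LocalDirStore(tmp), "sample.fa")

print(memory_count, local_count)
```

Swapping the backend then means writing one new class with a `read_bytes` method, while the pipeline steps that consume the data stay untouched.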

The role of AI and cloud-based services in navigating the data deluge

As the volume of scientific research continues to swell, the growing adoption of AI and cloud technologies can help R&D professionals mine insights. In an article on pharma R&D, McKinsey estimates that big-data-informed decision-making could unlock up to $100 billion in value annually across the U.S. healthcare system.

In this vein, Adar described a project Code Ocean is undertaking to help scientists. “For instance, one of our projects with a customer explores how we can assist scientists to become more self-reliant in their research,” he said. “Traditionally, scientists who are skilled in programming can code interfaces to databases and visualize their data, but many lack these coding capabilities.”

In this context, Code Ocean is using AI agents, autonomous software charged with performing defined tasks. Such agents could potentially automate coding tasks, enabling scientists to focus on their core research questions. As Adar points out, “Though coding scientists are still necessary, our approach offers a more specific and tailored solution to repetitive tasks. By creating different agents for distinct use cases, we can supply scientists with an efficient tool.” This approach suggests a future of scientific research where AI doesn’t replace scientists, but rather works alongside them, amplifying their capabilities.

But successfully wading through the data deluge in this environment requires a clear understanding of the distinctions between data sources and the tools designed to harness them, as Adar points out.

For instance, cloud storage services like Amazon’s S3 and Google Cloud Storage can provide efficient storage and computational resources for a variety of workloads. But these storage buckets are only pieces of a larger puzzle. While they serve as the backbone for data storage and processing, they do not offer a comprehensive data management solution on their own.

In addition, he underscores the importance of reproducibility in the current research landscape. “Code Ocean aims to capture all information needed to re-execute an analysis,” he said. “This allows researchers to share fully reproducible analyses with reviewers, collaborators and the public.”


Filed Under: Cell & gene therapy, Data science, Industry 4.0, machine learning and AI, Regulatory affairs
Tagged With: AI in Pharma, biopharma technology, cloud-based collaboration, code ocean, data management, reproducible research
 

About The Author

Brian Buntz

As the pharma and biotech editor at WTWH Media, Brian has almost two decades of experience in B2B media, with a focus on healthcare and technology. While he has long maintained a keen interest in AI, more recently Brian has made data analysis a central focus, and is exploring tools ranging from NLP and clustering to predictive analytics.

Throughout his 18-year tenure, Brian has covered an array of life science topics, including clinical trials, medical devices, and drug discovery and development. Prior to WTWH, he held the title of content director at Informa, where he focused on topics such as connected devices, cybersecurity, AI and Industry 4.0. A dedicated decade at UBM saw Brian providing in-depth coverage of the medical device sector. Engage with Brian on LinkedIn or drop him an email at bbuntz@wtwhmedia.com.

Copyright © 2025 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media