Amazon Web Services (AWS) recently announced a significant expansion of Amazon Omics at the annual AWS Life Sciences Executive Symposium in Boston. Amazon Omics, which the company introduced last year, helps life science organizations to store, query and analyze genomic, transcriptomic and other omics data.
While other tools such as Qlucore Omics Explorer, Genospace, StrandOmics, Signals Translational and the publicly-funded Galaxy platform exist, Amazon Omics differentiates its offering with a focus on high scale, security and ease of integration in production environments.
Omics data encompasses genomics, transcriptomics and other related fields, providing a comprehensive understanding of the genetic, transcriptional and functional elements of biological systems. Such data, instrumental for researchers studying biological systems, plays a vital role in modern drug discovery and development.
One of the new Amazon Omics features automatically parses complex key/value pairs in variant call format (VCF) files, an often challenging task for users.
“A lot of valuable information about a variant is stored in the INFO column in a variant call format (VCF) file,” noted an Amazon Omics spokesperson over email. This column format, which traditionally includes a key/value pair of strings, can make complex pairs challenging to parse further. “For example, VEP uses additional delimiter within that string. Previously, to accurately parse it, customers would have to implement string manipulation functions (e.g., split) and then figure out how to map it back to what the location in the string meant,” the spokesperson continued. “With this feature, Amazon Omics takes care of that for customers and automatically maps the string to the correct value on import. Customers no longer need to then do any string manipulation to get value out of this data.”
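To make the spokesperson's point concrete, here is a minimal sketch of the kind of manual string handling customers previously had to write. It assumes the common VEP convention of a `CSQ` entry in the INFO column whose annotations are comma-separated and whose sub-fields are pipe-delimited. The function names, the key list and the sample INFO string are hypothetical illustrations, not part of Amazon Omics or VEP; in a real file the key order comes from the VCF header.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a key -> value dict (bare flags map to True)."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_csq(csq_value: str, csq_keys: list) -> list:
    """Map each pipe-delimited annotation onto the keys declared for CSQ."""
    return [dict(zip(csq_keys, ann.split("|"))) for ann in csq_value.split(",")]


# Hypothetical key order; a real pipeline would read it from the VCF header.
CSQ_KEYS = ["Allele", "Consequence", "IMPACT", "SYMBOL"]

# Hypothetical INFO value with one flag (DB) and two VEP-style annotations.
info = "DP=42;DB;CSQ=A|missense_variant|MODERATE|BRCA2,A|intron_variant|MODIFIER|BRCA2"

parsed = parse_info(info)
annotations = parse_csq(parsed["CSQ"], CSQ_KEYS)
```

The positional mapping in `parse_csq` is the fragile step: the code only works if the key list matches the order used when the file was annotated, which is the bookkeeping the new import feature is said to handle automatically.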
Amazon Omics introduces Ready2Run workflows
In an attempt to differentiate Amazon Omics in the multi-omics ecosystem, AWS adopts a comprehensive managed service approach, combined with features like Ready2Run workflows. The company notes that these prebuilt, preconfigured workflows, sourced from third-party providers such as Sentieon and NVIDIA as well as from open-source pipelines, simplify data analysis.
The Ready2Run workflows are designed with flexibility, enabling everything from the conversion of base calls into FASTQ files to the execution of secondary and tertiary analyses like gene expression, variant calling, or even predicting protein structures. Minimal user input is required, and these workflows are priced on a per-run basis.

Facilitating the implementation of bioinformatics workflows, Amazon Omics offers both private and Ready2Run configurations.
Amazon Omics also supports NVIDIA Parabricks Ready2Run workflows, a suite of accelerated genomic analysis applications. These workflows, including GPU-accelerated GATK and DeepVariant, are designed to efficiently process large volumes of sequencing data from whole genome sequencing, and offer improved speed over CPU-based tools.
Among the organizations using the new capabilities of Amazon Omics are Kite Pharma, a Gilead Sciences subsidiary known for its work in cancer immunotherapy, and Columbia University Medical Center, a leader in medical research and education. Kite Pharma, for instance, is finding significant utility in the Amazon Omics Ready2Run workflows for scRNAseq, which analyze single-cell RNA sequencing data, according to Jenny Wei, the company’s senior director and head of R&D informatics and technology.
A competitive landscape
Despite its ongoing expansion and enriched service offering, Amazon Omics operates within a competitive landscape that includes offerings like Microsoft Genomics. Contrary to some misconceptions, Microsoft Immunomics is actually a research team and not a competing product.
Amazon Omics aims to differentiate itself through its genomics analysis tools, including support for Workflow Description Language (WDL) and Nextflow, two workflow languages popular in the bioinformatics community for defining and executing complex computational pipelines. This is particularly significant for analyzing omics data, which often involves multiple data processing steps and dependencies.
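The idea of steps and dependencies can be illustrated with a short sketch. This is neither WDL nor Nextflow syntax, and it is not how Amazon Omics schedules work; it only shows, using Python's standard `graphlib` module, how a workflow engine can derive a valid execution order from declared dependencies. The step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
pipeline = {
    "align_reads": {"convert_to_fastq"},
    "call_variants": {"align_reads"},
    "annotate_variants": {"call_variants"},
    "quantify_expression": {"align_reads"},
}

# static_order() yields every step only after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())

for step in order:
    print(step)
```

Workflow languages let users declare these relationships once and leave ordering, parallelism and retries to the engine.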
Additionally, Amazon Omics boasts an analytic store capable of handling both genomic variants and annotations, facilitating genomic analyses. The former refers to differences in the DNA sequence among individuals, while annotations provide additional information about such variants, including their potential effects on genes and proteins.
Amazon Omics is available in several AWS regions, including U.S. East (N. Virginia), U.S. West (Oregon), Europe (Ireland, London, Frankfurt), and Asia Pacific (Singapore).
As part of its recent announcement, Amazon Omics also emphasized the inclusion of GPU support in Omics workflows, direct upload capabilities to Omics Storage, automatic variant data parsing, and integration with Amazon EventBridge.
The company explains that “event-driven architectures, enabled through EventBridge integration, are beneficial for these organizations as it offers automatic notifications, eliminating the need to constantly check for updates.” According to an AWS statement, “This feature is advantageous for organizations seeking to reduce turnaround times and automate integrations between Amazon Omics and other applications, such as sample tracking and/or Laboratory Information Management System (LIMS) software.”
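As a rough illustration of the event-driven pattern described here, the sketch below reacts to a status-change event rather than polling for updates. The `source` and `detail` keys follow the general EventBridge event envelope, but the `aws.omics` source value, the `runId` and `status` fields, and the status names are assumptions made for illustration and should be checked against the service documentation. The handler and its return strings are hypothetical stand-ins for a real LIMS integration.

```python
def handle_run_event(event: dict) -> str:
    """Decide what to do with a workflow status event (illustrative only)."""
    # Assumed source value; ignore events from anything else.
    if event.get("source") != "aws.omics":
        return "ignore: unrelated event source"

    detail = event.get("detail", {})
    run_id = detail.get("runId", "unknown")    # assumed field name
    status = detail.get("status", "UNKNOWN")   # assumed field name

    if status == "COMPLETED":
        return f"notify LIMS: run {run_id} finished"
    if status == "FAILED":
        return f"alert on-call: run {run_id} failed"
    return f"ignore: run {run_id} is {status}"


# Hypothetical event payload for demonstration.
sample_event = {
    "source": "aws.omics",
    "detail-type": "Run Status Change",
    "detail": {"runId": "1234567", "status": "COMPLETED"},
}
message = handle_run_event(sample_event)
```

In practice a function like this would be the target of an EventBridge rule, so the downstream system is updated when a run changes state and no component needs to poll for results.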
These unique features could be of considerable interest to drug developers, potentially hastening the drug discovery and development process. For instance, the streamlined process of data analysis via Amazon Omics Ready2Run workflows, coupled with the integration with Amazon EventBridge, could improve the efficiency of drug testing and approval procedures.
The company believes that the introduction of Ready2Run workflows, GPU support in Omics workflows, direct upload to Omics Storage, automatic variant data parsing, and Amazon EventBridge integration has the potential to amplify research capabilities and speed up scientific discoveries across healthcare and life sciences organizations.