Researchers around the globe can now access thousands of pediatric cancer genomic datasets, thanks to a new cloud-based initiative.
St. Jude Children’s Research Hospital has launched St. Jude Cloud, an online, publicly available data-sharing and collaboration platform that gives researchers access to the world’s largest public repository of pediatric cancer genomics data.
Scott Newman, PhD, group lead for bioinformatics analysis in the St. Jude Department of Computational Biology, presented St. Jude Cloud on April 15 at the 2018 American Association for Cancer Research (AACR) Annual Meeting. The cloud was developed as a collaboration between St. Jude, DNAnexus and Microsoft to provide accelerated data mining, analysis and visualization capabilities in a secure cloud-based environment.
“Sharing research and scientific discoveries is vital to advancing cures and saving lives, especially in rare diseases like pediatric cancer,” James Downing, MD, the St. Jude president and chief executive officer, said in a statement. “St. Jude has shared data and resources since its founding, and collaboration with researchers across the world is at the core of our mission. St. Jude Cloud offers researchers access to genomics data and analysis tools that will drive faster progress toward cures for catastrophic diseases of childhood.”
The new cloud will allow scientists to comb through more than 5,000 whole-genome, 5,000 whole-exome and 1,200 RNA-Seq datasets from more than 5,000 pediatric cancer patients and survivors. The number of datasets, which are stored on Microsoft Azure, is expected to double by next year.
The majority of the data has been derived from three sources: the St. Jude-Washington University Pediatric Cancer Genome Project, designed to understand the genetic origins of childhood cancers; the Genomes for Kids clinical trial, focused on moving whole-genome sequencing into the clinic; and the St. Jude Lifetime Cohort study (St. Jude LIFE), which conducts comprehensive clinical evaluations on thousands of pediatric cancer survivors throughout their lives.
The cloud also features a collection of bioinformatics tools that help both experts and non-specialists gain novel insights from the data. The tools include validated data analysis pipelines and interactive visualizations that make it easier to work with large datasets. The data can be shared privately within the platform.
Researchers will be able to explore the data or their own results using interactive visualizations powered by ProteinPaint—a genomic visualization engine created at St. Jude that allows users to navigate through the genome and identify genetic changes linked to cancer development.
A St. Jude scientist was able to use the St. Jude Cloud to replicate, in just a few days, experimental findings that originally took the research team more than two years to make.
“St. Jude Cloud is a powerful resource to drive global research and discovery forward,” Jinghui Zhang, PhD, chair of the St. Jude Department of Computational Biology and co-leader of the St. Jude Cloud project, said in a statement. “Providing genomic sequencing data to the global research community and making complex computational analysis pipelines easily accessible will lead to progress in eradicating childhood cancer.
“St. Jude has been committed to sequencing and understanding pediatric cancer genomes for nearly a decade, and we will continue to generate and share data with the research community in the future.”