Machine Learning and Image Processing on the CGC: Tools For Success

Machine learning is becoming ubiquitous in the bioinformatics space: applying machine learning algorithms to the analysis of proteomics, genomics, and other -omics datasets has yielded a wealth of analyses and interpretations not easily achievable by conventional methods. The CGC offers many helpful features for users performing machine learning (ML) …

Written by Dan Ventre PhD, Soner Koc, and Ana Stankovic

The Data Repository Service API on Seven Bridges: Towards Global Interoperability

For the first time, CAVATICA users can import datasets such as TOPMed from NHLBI BioData Catalyst powered by Seven Bridges onto the CAVATICA platform for analysis. Likewise, users on NHLBI BioData Catalyst can import Kids First datasets from CAVATICA in the same manner. Before now, there …

Written by Daniel Ventre PhD

Enabling Workflow Reproducibility in the Cloud with New Pipelines from the Genomic Data Commons

When analyzing genomic data, there is a vast range of bioinformatics tools and workflows to choose from. However, making an informed selection from so many options can be overwhelming, even within a relatively narrow topic, such as harmonization to a reference genome. One approach to selecting the right tool for …

Written by Manisha Ray

Bioinformatics Workflow Portability is Critical to Achieving Reproducibility

With the explosion of genomic data in recent years, the number of bioinformatics workflows has seen a corresponding proliferation. Researchers and developers now have a wealth of analysis options, from building their own tools to taking advantage of those developed by others. However, a workflow developed in one environment may …

Written by Manisha Ray

The Cancer Genomics Cloud: collaborative, reproducible, democratized (and now citable!)

Last week we published our paper “The Cancer Genomics Cloud: Collaborative, Reproducible, and Democratized—A New Paradigm in Large-Scale Computational Research” in Cancer Research as part of their special issue on computer resources. Congratulations to everyone who’s worked on the Cancer Genomics Cloud, and many, many thanks to the team at …

Written by Nick

Reducing Bioinformatic Analysis Costs with AWS Spot Instances

Although genome sequencing costs have dropped dramatically over the past few years, analyzing large amounts of genomic data remains expensive. As the scale of genomic projects continues to grow, cost-efficient bioinformatic analysis is key to gaining insight from the estimated 100 million to 2 billion human genomes that will be …

Written by Jessica Lau

CloudNeo: CWL Brings Cancer Genomics to the Cloud

A cloud-based workflow for patient-specific tumor neoantigens: CloudNeo, a computational workflow for identifying patient-specific tumor neoantigens from Next-Generation Sequencing (NGS) data, was recently published in Bioinformatics. Originating from Jeffrey Chuang’s lab at The Jackson Laboratory, CloudNeo is a neoantigen prioritization workflow designed specifically for the cloud. The authors have made the CloudNeo workflow available on the Seven […]

Written by Patrick

Custom interactive analysis on all Seven Bridges environments

This week we released Data Cruncher, an interactive analysis tool available on the Seven Bridges Platform, Cancer Genomics Cloud, and CAVATICA. By enabling researchers to apply custom scripts in JupyterLab to data stored in the cloud, Data Cruncher supports interactive and collaborative bioinformatic analysis at scale. Bringing custom interactive analysis to the cloud: Although some […]

Written by Jessica Lau

Optimizing novoBreak on the Cancer Genomics Cloud

Last week, we released novoBreak, an exciting new bioinformatics tool, on the Cancer Genomics Cloud (CGC). The tool, described by Zechen Chong and colleagues in Nature Methods, is a novel approach for detecting breakpoints in cancer genomes with high precision and sensitivity. In this post, we explain how Zechen used the Publish your app …

Written by Jessica Lau

