A first look at GATK4 on the Seven Bridges Platform

One of the big take-away messages from the Bio-IT World Conference this year was the Broad Institute’s announcement that they plan to fully open source their GATK4 software. By transitioning to a BSD 3-Clause licence, GATK4 becomes fully open for commercial use without a separate commercial licence, which should particularly …

Written by Nick

Reducing bioinformatic analysis costs with AWS Spot instances

Although genome sequencing costs have dropped dramatically over the past few years, analyzing large amounts of genomic data remains expensive. As the scale of genomic projects continues to grow, cost-efficient bioinformatic analysis is key to gaining insight from the estimated 100 million to 2 billion human genomes that will be …

Written by Jessica Lau

CloudNeo: CWL Brings Cancer Genomics to the Cloud

A cloud-based workflow for patient-specific tumor neoantigens. CloudNeo, a computational workflow for identifying patient-specific tumor neoantigens from Next-Generation Sequencing (NGS) data, was recently published in Bioinformatics. Originating from Jeffrey Chuang’s lab at The Jackson Laboratory, CloudNeo is a neoantigen prioritization workflow designed specifically for the cloud. The authors have made the CloudNeo workflow available on the Seven […]

Written by Patrick

Sequence Bloom Trees, Part I: Motivation and principles

Modern bioinformatics involves a lot of searching datasets, like The Cancer Genome Atlas (TCGA), that contain data from many experiments. Wanting to do this efficiently raises not only data management problems but also algorithmic ones. Searching a dataset like TCGA in hopes of figuring out which experiments contain a given …

Written by Nate

Custom interactive analysis on all Seven Bridges environments

This week we released Data Cruncher, an interactive analysis tool available on the Seven Bridges Platform, Cancer Genomics Cloud, and Cavatica. By enabling researchers to apply custom scripts in JupyterLab to data stored in the cloud, Data Cruncher supports interactive and collaborative bioinformatic analysis at scale. Bringing custom interactive analysis to the cloud. Although some […]

Written by Jessica Lau

Optimizing novoBreak on the Cancer Genomics Cloud

Last week, we released novoBreak, an exciting new piece of bioinformatics software, on the Cancer Genomics Cloud (CGC). The tool, described by Zechen Chong and colleagues in Nature Methods, takes a novel approach to detecting breakpoints in cancer genomes with high precision and sensitivity. In this post, we explain how Zechen used the Publish your app …

Written by Jessica Lau

Cavatica wins Bio-IT People’s Choice Award

Cavatica is the data sharing platform for pediatric disease. Seven Bridges and our partners at The Children’s Brain Tumor Tissue Consortium (CBTTC) and the Pacific Pediatric Neuro-Oncology Consortium (PNOC) were honored with a Bio-IT World People’s Choice award for Cavatica, the collaborative analysis and data sharing platform for pediatric diseases. This award follows Seven Bridges’ previous Best […]

Written by Patrick

Identifying viral sequences in TCGA data using Kraken and Centrifuge

Image adapted from Kim et al. Genome Res. 26, 1721–1729 (2016). Next-Generation Sequencing has opened up the field of metagenomics. In metagenomic studies, a sample often contains a complex ecosystem of different microorganisms. The key challenge in these experiments is disentangling the identities of unknown species from millions of sequencing reads. Bioinformatic tools for metagenomics are designed to […]

Written by Patrick

Reference bias: Challenges and solutions

Ahead of the Bio-IT World Conference & Expo in Boston this week, we take a look at the role of reference genomes in ensuring accurate genomic analysis. In standard next-generation sequencing analyses, DNA is fragmented and sequenced. The sequenced reads are then aligned to a reference genome for the species. …

Written by Jessica Lau

