Release notes

June 10th, 2024

Recently published apps

ClairS (0.2.0) and ClairS-TO (0.1.0), somatic small variant callers for long-read data (for matched tumor-normal pairs and tumor-only data, respectively), have been published to the Seven Bridges Platform.

May 13th, 2024

Recently published apps 

We have published the GCTA 1.94.1 tool on the Seven Bridges Platform. GCTA (Genome-wide Complex Trait Analysis) is a suite of tools for various genetic analyses using genome-wide data. It was initially developed to estimate the proportion of phenotypic variance explained by all genome-wide SNPs for a complex trait, but has been greatly extended for many other analyses of data from genome-wide association studies (GWASs).

May 10th, 2024

Recently published apps

snM3C pipeline

The snM3C pipeline is designed for profiling 3D genome structure and DNA methylation in single cell data as a part of the Human Cell Atlas and the WARP BRAIN Initiative.

The snM3C pipeline performs:

  • Demultiplexing (by the Demultiplexing custom tool)
  • Reads sorting (by the Sort custom tool)
  • Reads trimming (by Cutadapt)
  • Paired-end reads alignment (by Hisat-3n)
  • Separating unmapped, uniquely aligned, and multi-aligned reads (by Separate unmapped reads wrapped around a custom script)
  • Splitting unmapped reads by enzyme cut site (by Split unmapped reads wrapped around a custom script)
  • Alignment of the unmapped, single-end reads (by Hisat-3n)
  • Removing the overlapping reads (by Remove overlap read parts wrapped around a custom script)
  • Merging mapped reads from single- and paired-end alignments (by Samtools Merge)
  • Removing duplicate reads (by Picard MarkDuplicates)
  • Calling chromatin contacts (by Call chromatin contacts wrapped around a custom script)
  • Creating ALLC files (by Allcools bam-to-allc)
  • Creating summary output (by Allcools extract-allc)

All tools are wrapped specifically for the workflow and use a retagged us.gcr.io/broad-gotc-prod/m3c-yap-hisat:1.0.0-2.2.1 Docker image.

DeepSomatic 1.6.1

DeepSomatic is an extension of DeepVariant for calling somatic variants from matched tumor-normal data. The tool is still in active development and only WGS data is currently supported.

SortMeRNA 4.3.6

SortMeRNA is a local sequence alignment tool for filtering, mapping and OTU clustering. The main applications of SortMeRNA are filtering rRNA from metatranscriptomic data, OTU-picking and taxonomy assignment available through QIIME v1.9+.

dupRadar 1.32.0

The dupRadar tool is intended for duplication rate quality control of RNA-Seq data. It gives insight into the duplication problem by graphically relating the expression level of each gene to the duplication rate observed for that gene.
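As a rough illustration of the relationship described above, the Python sketch below computes an expression level (reads per kilobase) and a duplication rate for each gene from read counts. The function name, the input layout and the example numbers are hypothetical; this is not the dupRadar API, which is an R/Bioconductor package.

```python
def duplication_profile(genes):
    """Return (gene, reads_per_kb, duplication_rate) tuples sorted by expression.

    `genes` maps a gene name to a dict with the hypothetical keys
    length_bp, all_reads and duplicate_reads.
    """
    profile = []
    for name, g in genes.items():
        if g["all_reads"] == 0:
            continue  # no reads: the duplication rate is undefined
        reads_per_kb = g["all_reads"] / (g["length_bp"] / 1000)
        dup_rate = g["duplicate_reads"] / g["all_reads"]
        profile.append((name, reads_per_kb, dup_rate))
    # Sort by expression so the trend is easy to read top to bottom.
    return sorted(profile, key=lambda row: row[1])


# Made-up example counts, for illustration only.
example = {
    "geneA": {"length_bp": 2000, "all_reads": 100, "duplicate_reads": 5},
    "geneB": {"length_bp": 1000, "all_reads": 4000, "duplicate_reads": 3000},
    "geneC": {"length_bp": 1500, "all_reads": 0, "duplicate_reads": 0},
}

for name, rpk, dup in duplication_profile(example):
    print(f"{name}\tRPK={rpk:.1f}\tdup_rate={dup:.2f}")
```

Plotting the second column against the third gives the kind of expression-versus-duplication view the tool is built around.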

April 8th, 2024

Recently published apps

Here are the new apps published in our Public Apps gallery:

  • ASCAT 3.1.2 tools (ASCAT prepareTargetedSeq, ASCAT prepareHTS and ASCAT). ASCAT prepareTargetedSeq prepares SNP references for ASCAT processing of targeted sequencing data. ASCAT prepareHTS prepares sequencing data (WGS, WES or targeted) for ASCAT. ASCAT infers tumor ploidy, purity and allele-specific copy number profiles.
  • JAFFAL 2.3 tool. JAFFAL is used to detect fusion genes from long-read (PacBio and ONT) transcriptome sequencing with high accuracy, overcoming the challenges posed by higher error rates in long-read data.
  • Ballgown 2.34.0 toolkit. Ballgown is a package designed to facilitate flexible differential expression analysis of RNA-Seq data. It also provides functions to organize, visualize, and analyze the expression measurements for transcriptome assembly.

Apps with version updates

  • StringTie 2.2.1 toolkit. StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. StringTie Merge tool merges/assembles GTF/GFF transcript files into a non-redundant set of transcripts. This tool should be used after StringTie transcript assembling of each sample in the experiment.

April 1st, 2024

Recently published apps

We published the following ASCAT 3.1.2 tools in our Public Apps gallery:

  • ASCAT prepareTargetedSeq prepares SNP references for ASCAT processing of targeted sequencing data. 
  • ASCAT prepareHTS prepares sequencing data (WGS, WES or targeted) for ASCAT.
  • ASCAT infers tumor ploidy, purity and allele-specific copy number profiles.

Recently updated apps

We updated the following apps from the MSIsensor v0.6 toolkit:

  • MSIsensor scan – a tool for cataloging homopolymer and microsatellite sites in the reference genome. It prepares the reference for MSIsensor msi.

  • MSIsensor msi – a tool for detecting and scoring somatic microsatellite changes. Designed to work with paired tumor-normal data.
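To make the idea of cataloging homopolymer sites concrete, here is a minimal Python sketch that scans a sequence for runs of a single repeated base. It is an illustration only, not MSIsensor's implementation: the function name and the `min_length` threshold are hypothetical, and microsatellites with longer repeat units are not handled.

```python
import re


def find_homopolymers(sequence, min_length=5):
    """Return (start, base, run_length) for each homopolymer run.

    `start` is a 0-based position; `min_length` is a hypothetical
    threshold chosen for this example.
    """
    # One base followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(sequence.upper())
    ]


# Toy reference: a run of six A bases and a run of seven T bases.
reference = "ACGT" + "A" * 6 + "GC" + "T" * 7 + "CG"

for start, base, length in find_homopolymers(reference):
    print(start, base, length)
```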

March 11th, 2024

Recently published apps

We’ve published the following new apps on the Seven Bridges Platform:

  • FusionInspector (v2.8.0), a tool that performs validation of fusion transcript predictions. FusionInspector is a part of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT). It takes a list of potential fusion genes (obtained by executing any fusion transcript prediction tool), extracts the genomic regions corresponding to the fusion partners, and creates mini-fusion-contigs that hold the gene pairs in the suggested fused orientation. The original reads are then aligned to these putative fusion contigs. In the fusion-gene context, fusion-supporting reads that would typically align as split reads or discordant pairs should align as concordant ‘normal’ reads. Reads that span fragments and reads that contain fusion breakpoints supporting each fusion are recognized, reported, and scored accordingly.
  • Arriba (v2.4.0), a tool for the detection of gene fusions from RNA-Seq data. Arriba is designed to work with STAR aligner-processed data, and the post-alignment runtime is typically a few minutes long. Arriba does not require reducing the --alignIntronMax parameter of STAR to identify fusions resulting from focal deletions, in contrast to many other fusion detection methods that are based on STAR. Its intended application was in the context of clinical research. As such, high sensitivity and fast runtimes were crucial design requirements. Arriba can identify structural rearrangements other than gene fusions that may have clinical significance. These include viral integration sites, internal tandem duplications, whole exon duplications, and truncations of genes (i.e., breakpoints in introns and intergenic regions).
  • Arriba draw_fusions.R is an R script that comes with the Arriba gene fusions detection tool. This script produces visualizations of the transcripts involved in predicted fusions that are suitable for publication in terms of quality. For every predicted fusion, it creates a single page in the output PDF file. Each page contains information about the fusion partners, their orientation, the retained exons in the fusion transcript, statistics about the number of supporting reads, and, if the fusion_transcript column has a value, an excerpt of the sequence around the breakpoint.
  • Parabricks RNA Pipeline, utilizing Parabricks toolkit v4.2.0. It is used for SNP and indel discovery from RNA-Seq input data.
  • STAARpipeline PheWAS v0.9.6, which is used for analyzing WGS/WES data in PheWAS.
  • SRA to DRS converter, a workflow that converts Sequence Read Archive (SRA) metadata into Data Repository Service (DRS) URIs. It addresses the challenge researchers face in efficiently accessing and utilizing large genomic datasets stored in SRA format. By simplifying and automating this conversion, the app facilitates quicker and more effective genomic data analysis, thus accelerating research in fields such as disease study and genetic discovery.

Recently updated apps

We also updated the following apps:

  • DESeq2 tool (v1.40.1), which performs differential gene expression analysis across two or more study conditions using negative binomial generalized linear models. It analyzes estimated read counts from several samples, each belonging to one of the conditions under study, searching for systematic changes between conditions, as compared to within-condition variability.

February 14th, 2024

AlphaFold Visualizer – new public project on the Seven Bridges Platform

We published the AlphaFold Visualizer (v0.1.5) as a public project on the Seven Bridges Platform containing an interactive analysis for the visualization of AlphaFold results. AlphaFold is an AI tool for 3D protein structure prediction. As it produces protein structure models as its output, a secondary analysis is needed for the interpretation and visualization of results. This analysis must be interactive, so it has been developed as the Data Studio AlphaFold Visualizer analysis and it represents a great addition to the publicly available app.

Recently published apps

We have also published new workflows for processing Nanopore data:

  • ONT Flowcell Processing – aligns (Minimap2), sorts (Samtools) and quality checks (NanoPlot, Samtools Flagstat, Mosdepth, GATK ComputeLongReadMetrics) input Nanopore data from a single flowcell.
  • ONT WGS Variant Calling – merges (Sambamba), calls variants (Clair3, Sniffles2) and quality checks (Mosdepth, NanoPlot) input BAM files from Nanopore data.

January 3rd, 2024

Recently updated apps

We have updated the following tools on the Seven Bridges Platform: 

  • Exomiser 13.3.0 – used to identify candidate causative variants from WES or WGS patient VCF data and phenotype HPO terms. 
  • PharmCAT 2.8.3 toolkit: 
    • PharmCAT VCF Preprocess – prepares an input VCF file for PharmCAT. 
    • PharmCAT – takes a single-sample VCF file and returns a report with guideline variants. 
  • Sambamba 1.0.1 toolkit: 
    • Sambamba Index – creates a BAI or FAI index for the provided BAM/FASTA file. 
    • Sambamba Slice – copies a slice (region) of the coordinate sorted and indexed input file in BAM or FASTA format. 
    • Sambamba Sort – sorts alignments in BAM format. 
    • Sambamba Markdup – marks or removes duplicate reads from an input BAM file. 
    • Sambamba Flagstat – creates read flag statistics from a BAM file. 
    • Sambamba Merge – merges alignments in BAM format. 
    • Sambamba View – inspects and filters alignments in SAM/BAM format. 
  • Clair3 1.0.4 – calls small germline variants from data generated by Nanopore, PacBio or Illumina sequencing technologies. 
  • cuteSV 2.1.0 – calls structural variation from sorted long read alignments. 
  • Twelve tools from the Bismark 0.24.1 toolkit: 
    • Bismark – takes files with bisulfite-treated reads and aligns them to a specified bisulfite genome. 
    • Bismark Methylation Extractor – extracts the methylation call for every Cytosine in Bismark result files. 
    • Bismark Genome Preparation – converts the specified reference genome into bisulfite converted genome. 
    • Bismark Deduplicate – removes duplicate Bisulfite-Sequencing (BS-Seq) reads from an alignment file. 
    • Bismark2BedGraph – generates bedGraph and coverage files sorted by chromosomal position. 
    • Bismark2Report – uses Bismark alignment, deduplication and methylation reports to generate a graphical HTML report. 
    • Bismark2Summary – uses Bismark report files of several samples to generate a graphical summary HTML report. 
    • Bismark Bam2Nuc – calculates the mono- and di-nucleotide coverage and compares it to the average genomic sequence composition. 
    • Bismark Coverage2Cytosine – generates a cytosine methylation report for a genome of interest. 
    • Bismark Filter Non Conversion – filters incomplete bisulfite conversion in non-CG context in Bismark BAM files. 
    • Bismark Methylation Consistency – splits BAM files based on methylation consistency. 
    • Bismark NOMe Filtering – filters reads in a yacht file (output of Bismark Methylation Extractor). 
  • Bismark Analysis 0.24.1 workflow for analyzing DNA methylation, a type of epigenetic modification. The workflow processes reads from Whole Genome Bisulfite Sequencing (WGBS) and Reduced Representation Bisulfite Sequencing (RRBS). While suitable for any input size, it excels with larger input samples. 
  • Three tools from the Salmon 1.10.1 toolkit: 
    • Salmon Index – builds an index necessary for the Salmon Quant – Reads and Salmon Alevin tools. 
    • Salmon Quant – Reads – infers transcript abundance estimates from RNA-seq data, using Selective Alignment (SA) for mapping. 
    • Salmon Quant – Alignment – estimates transcript abundance from aligned RNA-seq data using the Variational Bayesian EM algorithm. 
  • Salmon Workflow 1.10.1 for estimating transcript abundances from RNA-Seq data using Selective Alignment for mapping. The workflow enables creation of the necessary index for quantification and provides the capability to process multiple samples at once. It also creates an expression matrix at both the transcript and gene levels, aggregating expression results across all samples. 
  • ENCODE ChIP-Seq Pipeline (v2.2.1). This workflow represents the ENCODE transcription factor and histone ChIP-Seq analysis pipelines. ChIP-Seq Analysis studies chromatin modifications and binding patterns of transcription factors and other proteins. It combines chromatin immunoprecipitation (ChIP) assays with standard NGS sequencing. The workflow consists of read mapping with duplicate removal, post-alignment QC, cross-correlation analysis, peak calling with blacklist filtering, and a statistical framework applied to the replicated peaks at the end to assess the concordance of biological replicates.
  • ENCODE ATAC-Seq Pipeline. ATAC-Seq analysis performs quality control and signal processing, producing alignments and measures of enrichment. The Assay for Transposase-Accessible Chromatin followed by sequencing (ATAC-Seq) experiment provides genome-wide profiles of chromatin accessibility. Briefly, the ATAC-seq method works as follows: loaded transposase inserts sequencing primers into open chromatin sites across the genome, and reads are then sequenced. The ends of the reads mark open chromatin sites.
    The workflow is based on the ENCODE ATAC-seq pipeline, developed by the ENCODE Consortium. The four major steps of the ATAC-Seq analysis are pre-alignment quality control, alignment, post-alignment processing and advanced ATAC-Seq-specific quality control, and peak calling in order to identify accessible regions (which is the basis for advanced downstream analysis).
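For the Bismark Genome Preparation tool listed above, the idea of a bisulfite-converted genome can be sketched in a few lines of Python. This is a simplified illustration, not Bismark's code: the function name is hypothetical, and the sketch only shows the two in-silico conversions (C to T, and G to A) applied to a reference sequence, leaving out the indexing of the converted genomes.

```python
def bisulfite_convert(sequence):
    """Return the C->T and G->A converted versions of a sequence."""
    seq = sequence.upper()
    c_to_t = seq.replace("C", "T")  # conversion as seen on the forward strand
    g_to_a = seq.replace("G", "A")  # reverse-strand conversion, written on the forward strand
    return c_to_t, g_to_a


# Toy sequence, for illustration only.
c_to_t, g_to_a = bisulfite_convert("ACGTcg")
print(c_to_t)
print(g_to_a)
```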

November 27th, 2023

Single and Global logout flows defined by the SAML protocol are now available for SSO

Users who access the Seven Bridges Platform through Single Sign-On (SSO) can now perform Single (IdP-initiated) logout to log out of multiple SSO sessions in a single click. It is also now possible to initiate the Global (SP-initiated) logout flow from the Seven Bridges Platform.

Recently published apps  

We have published the following tools in our Public Apps gallery:  

  • Tximport, a tool that imports and summarizes transcript-level estimates for transcript and gene-level analysis based on the tximport R/Bioconductor package. It is designed to simplify the import of transcript-level abundances, estimated counts, and effective lengths from a variety of upstream tools, for downstream transcript-level or gene-level analysis.  
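The gene-level summarization step can be pictured with the small Python sketch below, which sums transcript-level estimated counts per gene using a transcript-to-gene mapping. It is a simplified illustration and not the tximport API (tximport is an R/Bioconductor package); the function name, the dictionary inputs and the example values are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts."""
    gene_counts = defaultdict(float)
    for transcript, count in tx_counts.items():
        gene = tx2gene.get(transcript)
        if gene is None:
            continue  # transcripts missing from the mapping are skipped
        gene_counts[gene] += count
    return dict(gene_counts)


# Made-up estimated counts and mapping, for illustration only.
tx_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0, "tx4": 1.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

print(summarize_to_gene(tx_counts, tx2gene))
```

Skipping unmapped transcripts is a choice made for this sketch; a real analysis would normally report them.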

Three tools from the SplAdder (3.0.4) toolkit:  

  • SplAdder build constructs splicing graphs and extracts alternative splicing events.  
  • SplAdder test tests for differential usage of alternative events between two groups of samples.  
  • SplAdder viz generates visual overviews of splicing graphs and alternative events. SplAdder viz uses results generated by SplAdder build or SplAdder test to create plots.  

Five tools from the Qualimap 2.3 toolkit:  

  • Qualimap Multi-sample BAM QC reports QC metrics computed in BAM QC analysis combined for multiple samples.  
  • Qualimap Compute counts calculates how many reads are mapped to each region of interest.  
  • Qualimap RNA-seq QC reports quality control metrics and bias estimations for whole transcriptome sequencing.  
  • Qualimap Counts QC analyzes count data to assess differential expression between two or more conditions.  
  • Qualimap BAM QC reports information for the evaluation of the quality of the provided alignment data.  

Six tools from the RSeQC 5.0.1 toolkit:  

  • RSeQC read distribution calculates how mapped reads were distributed over genomic features.  
  • RSeQC junction annotation compares detected splice junctions to the reference gene model.  
  • RSeQC inner distance is used to calculate the inner distance (or insert size) between two paired RNA reads.  
  • RSeQC infer experiment is designed to estimate how RNA-seq data is configured.  
  • RSeQC read duplication calculates read duplication rate determined by mapping position or sequence of the read.  
  • RSeQC bam stat summarizes mapping statistics of the provided alignment file.  

The Tidyproteomics 1.5.2 toolkit:  

  • Expression analysis – used for proteomics differential expression analysis.  
  • Data input and summary – used for loading protein or peptide data.  
  • Data subsetting and summary – used for subsetting protein or peptide data.  
  • Abundance normalization – used for abundance normalization of the protein or peptide data.  
  • Enrichment analysis – used for enrichment analysis after differential expression analysis.  
  • Annotate data – used for proteomics data annotation before enrichment analysis.  

Improved storage cost breakdown  

To support accurate allocation of storage costs within a division, we have implemented a per-project storage breakdown on the Seven Bridges Platform.

The new solution improves the accuracy of the per-project storage breakdown, which facilitates cost allocation, and eliminates storage size discrepancies between per-project and per-division calculations. This benefits both users with administrative roles in divisions and Platform users in general, as it increases transparency and provides better insight into storage cost distribution.

Recently updated apps  

TopHat2, a tool that aligns RNA-Seq reads to a genome to identify exon-exon splice junctions, has been updated to version 2.2.1 and upgraded to CWL version 1.2 (it was previously available in CWL draft-2).
