Improvements: Data Cruncher – JupyterLab Beta
We have made further improvements to reduce analysis initialization time. Spinning up a Data Cruncher analysis should now take no more than four minutes, regardless of the chosen compute resources.
Updates to the TCGA, TARGET, and CCLE datasets
As part of Seven Bridges’ ongoing partnership with the National Cancer Institute (NCI), authorized researchers can access valuable public datasets generated by the TCGA, TARGET, and CCLE initiatives through the Seven Bridges Platform. Seven Bridges collaborates with the NCI Genomic Data Commons (GDC) on an ongoing basis to keep the datasets available through our Platform aligned with those in the GDC. In keeping with this, updated versions of the TCGA, TARGET, and CCLE datasets have been released on the Seven Bridges Platform. As of July 10, the legacy TCGA and CCLE datasets available through our Platform are fully aligned with those in the GDC Legacy Archive, and the TCGA GRCh38 and TARGET GRCh38 datasets are fully aligned with GDC Data Release 11.0. Please note that this update is not currently available if you use the Seven Bridges Platform with AWS EU as the cloud infrastructure provider.
As a result of these dataset updates, some users who previously copied files from these datasets to their projects will no longer be able to access a subset of those files. Before releasing the new dataset versions, Seven Bridges contacted the owners of all affected projects. Users who were not contacted by Seven Bridges or by the owner of a project on which they collaborate can expect uninterrupted access to all TCGA, TARGET, and CCLE files in their current projects.
Improvements: Set null or empty values for app settings
When defining app settings prior to execution, you are now able to set null or empty values for the available inputs. This is possible using the two new buttons placed next to the inputs under the Define App Settings tab.
The Set null option is available both for simple inputs (such as strings or numeric values) and complex ones (such as arrays, records or maps).
Set empty is available as a button only for complex inputs (arrays, records and maps). For most simple app parameters, such as strings, an empty value is set by removing the value from the input field and leaving the field blank.
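To illustrate the distinction, here is a minimal sketch of how null and empty values differ once inputs are serialized to JSON. The input names (`reference_name`, `sample_id`, `intervals`, `read_group`) are hypothetical, and the mapping of Set null to JSON `null` and Set empty to an empty array or record is an assumption for illustration, not a documented storage format of the Platform.

```python
import json

# Hypothetical app inputs; the names are illustrative only.
inputs = {
    "reference_name": None,  # assumed result of "Set null" on a simple input
    "sample_id": "",         # blank input field: an empty string
    "intervals": [],         # assumed result of "Set empty" on an array input
    "read_group": {},        # assumed result of "Set empty" on a record or map
}


def is_null(value):
    """True only when no value is set at all."""
    return value is None


def is_empty(value):
    """True for a value that is set but contains nothing."""
    return value is not None and len(value) == 0


# Serialize with sorted keys so the output order is stable.
serialized = json.dumps(inputs, sort_keys=True)
print(serialized)
```

In this sketch a null input carries no value at all, while an empty input is still a value of the expected type (string, array, or record) that happens to contain nothing.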
We have also introduced support for displaying certain more complex input types, such as arrays of records, that are now also available through the visual interface.
New: Variant Browser (BETA)
Variant Browser is under active development with features being added successively. For any suggestions or bug reports, please contact our support team at email@example.com.
Variant Browser is an application for genome analysis and interpretation that will allow researchers and clinicians to quickly annotate and accurately prioritize variants and genes involved in a disease. Variant Browser bridges the gap between sequencing/raw data management and clinical management, providing robust genome interpretation to expedite diagnosis and accelerate the understanding of the genetic basis of disease, drug response, and health.
During the BETA stage, you can interpret demo VCF files that are readily available inside the app. The available options include filtering with configurable criteria, opening the Genome Browser directly at the position of the selected variant, and displaying general statistics for the currently applied filters. If you want to interpret your own VCF files, please contact us at firstname.lastname@example.org.
Future development will include an integrated flow for annotation and interpretation of VCF files using the Variant Browser, with annotated files provided by annotation apps on the Seven Bridges Platform (such as VEP annotation workflow or SnpEff), or uploaded by the users. Additionally, there will be more detailed and elaborate filtering options, along with more comprehensive display options for relevant analysis information.
Try the Variant Browser right now by accessing it in your project’s Interactive Analysis section and read more about it in the documentation.
Variant Browser resources
Several resources for creating the SQLite databases that can then be browsed with Variant Browser are already available on the Seven Bridges Platform. These are annotation workflows based on VEP (Variant Effect Predictor) and SnpEff. The workflows for annotating a single VCF file and converting it to an SQLite database are VEP annotation & DB conversion and SnpEff annotation & DB conversion. For trio analysis, use Trios: SnpEff annotation & DB conversion or Trios: VEP annotation & DB conversion. All of these workflows are available in the Variant Browser public project.
Release: Genome Browser official release
Genome Browser is no longer in the beta stage and has been officially released. The most important achievement in this release is guaranteed accuracy based on manual review of fields in the standard file compared to the most commonly used genome viewer.
The latest additions to Genome Browser also include a history of the last ten positions within the file, easier insertion of markers that takes fewer clicks than before, and the ability to select a different reference file instead of the one preselected based on the headers in the loaded BAM file.
We have also increased the maximum number of simultaneous BAM tracks from three to twenty. A higher number of tracks might affect the loading time, but the tracks will work properly as long as the browser memory limit is not exceeded.
To try out the new features, access the Genome Browser in your project’s Interactive Analysis section.
We have introduced the option to archive apps that you don’t intend to use in a project. Archiving will hide an app from the apps list, workflow editor and other relevant places across the Platform. However, this will not affect reproducibility, as you will be able to rerun existing tasks containing the app.
Archived apps can be displayed by changing the Status filter to Archived in the apps list within a project. To restore an app, click Restore next to the desired archived app.
Find out more about app archiving in our documentation.
Multi-Instance Whole Genome Sequencing GATK4.0 Workflow
We’ve published the Multi-Instance Whole Genome Sequencing BWA/GATK 4.0 workflow. This workflow keeps a similar price, with a total execution time up to 3.0 times shorter than that of a single-instance implementation. The Multi-Instance Whole Genome Sequencing workflow processes a 30x whole genome in as little as 3 hours without any additional computational resources such as GPUs or FPGAs. The differences in precision, recall, and F-score between the multi-instance and single-instance workflows are lower than 0.001%, which is expected due to stochastic effects.
See the workflow on the Seven Bridges Platform.
New: Spatial Transcriptomics Pipeline with Spotty
Spatial Transcriptomics is a method that allows visualization and quantitative analysis of the transcriptome in individual tissue sections. Spatial Transcriptomics Pipeline with Spotty is a workflow for processing raw sequence data (paired FASTQ files) generated using the Spatial Transcriptomics technology. The workflow produces a table of spatially distributed gene counts for downstream analysis.
This workflow consists of two nested workflows:
- The ST Pipeline, which performs demultiplexing and decoding of the RNA-Seq reads.
- The Spotty workflow with post-processing, which performs automatic spot identification and pairing between the fluorescent (Cy3) and tissue (HE) images. Spotty is proprietary and available only on Seven Bridges Platform environments.
Read more about the workflow on our blog.
New: Data Cruncher Interactive Analyses – Public project
As a part of the effort to grow a comprehensive set of platform features and capabilities, we have developed several Data Cruncher Interactive Analyses as an additional resource that should help users mitigate challenges related to interpretation of data obtained through secondary analysis. These Data Cruncher analyses can be found in the Data Cruncher Interactive Analyses public project.
The project contains five analyses:
- Ballgown Interactive Analysis
- VCF visualization Interactive Analysis
- Structural variation Interactive Analysis
- ChIP-seq Interactive Analysis
- Microbiome Differential Abundance Analysis
Each Interactive Analysis comes with explanations of analysis steps and a corresponding set of files needed for successful execution.
New: Supported instances update
The new generation of AWS EC2 Compute Optimized instances (C5) and General Purpose instances (M5) is now also available in task executions and Data Cruncher analyses on eu.sbgenomics.com. See the full list of supported AWS EU instances in our documentation.
New: API bulk actions
Due to popular demand, we have created several new API bulk calls. The calls allow users to perform an operation on up to 100 files within a single API call, which counts only once against the API rate limit. This results in significantly faster completion of operations involving a large number of files.
Currently we have bulk calls implemented for the following operations:
- bulk imports from a volume
- bulk exports to a volume
- bulk get file details (stat files)
- bulk delete files
- bulk edit files
The calls have also been implemented in our Python client, with Java and R client implementations to follow in the near future.
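Because each bulk call accepts up to 100 files, a longer list of files has to be split into batches before it is submitted. The following is a minimal sketch of that batching step. The helper names (`chunked`, `run_in_bulk`) and the stand-in function `fake_bulk_get_details` are hypothetical and are not part of the Seven Bridges API or its Python client; in real use, the stand-in would be replaced by whichever bulk call your client provides.

```python
from typing import Callable, Dict, Iterator, List, Sequence

MAX_BULK_SIZE = 100  # per-call limit stated in the release notes above


def chunked(items: Sequence[str], size: int = MAX_BULK_SIZE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_in_bulk(
    file_ids: Sequence[str],
    bulk_call: Callable[[List[str]], List[Dict]],
) -> List[Dict]:
    """Apply a bulk operation batch by batch and collect all results."""
    results: List[Dict] = []
    for batch in chunked(file_ids):
        results.extend(bulk_call(batch))
    return results


def fake_bulk_get_details(batch: List[str]) -> List[Dict]:
    """Hypothetical stand-in for a real bulk call; returns one record per file."""
    return [{"id": file_id} for file_id in batch]


file_ids = [f"file-{n}" for n in range(250)]
batches = list(chunked(file_ids))
details = run_in_bulk(file_ids, fake_bulk_get_details)
print(len(batches), [len(batch) for batch in batches], len(details))
```

With 250 file IDs, this sketch issues three bulk calls instead of 250 single-file calls.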