Release notes

November 12th, 2019

Improvements: Search by ID through multiple datasets at once

We have improved the existing Search by ID feature so that a single search now applies across all available datasets. To run the search, click Search by ID on the Data Browser’s dataset selection screen. The search returns sets of matched entities from all available datasets and lets you select an entity (or a combination of entities) to start the Data Browser with. It covers every available UUID and ID, belonging to either an entity or a property, while retaining the existing ability to search by file name.

October 28th, 2019

New: Recently published apps

Several new CWL1.0 apps have been published to the Public Apps Gallery:

New BROAD Best Practices workflows: Data Pre-processing and Germline SNPs and Indels Variant Calling, both in version 4.1.0.0. These workflows are built according to BROAD’s best practices, following their WDL scripts. Together they produce analysis-ready BAM files and VCF files with germline mutations.

eQTL analysis workflows: FastQTL and MatrixEQTL – Expression quantitative trait loci (eQTLs) are genomic variants related to variation in the expression levels of mRNAs. These loci can be either cis (in the neighborhood of a gene transcription start site, TSS) or trans (distant from it). The eQTL analysis workflows with FastQTL and MatrixEQTL are designed for fast eQTL analysis on large datasets, using standard mapping methods that test the linkage between variation in expression and genetic polymorphisms. FastQTL works with cis loci, while MatrixEQTL works with both cis and trans loci. These workflows are available on the Platform. They start from standard bioinformatics file formats (VCFs and gene expression results) and produce a comprehensive set of plots, reports and results that make eQTL analysis easier to interpret.

NanoStringQCPro 1.10.0: NanoString® has introduced the nCounter technology for direct counting of molecules in samples, which enables direct detection of specific RNA, DNA and protein molecules. It provides highly robust data across clinically relevant samples while reducing hands-on time and simplifying analysis. The NanoStringQCPro app performs basic QC steps and data normalization of NanoString mRNA gene expression data.

October 21st, 2019

New: Added support for Amazon EC2 P3 GPU Instances

We have added support for the Amazon EC2 P3 GPU instance family to the Seven Bridges Platform. Amazon EC2 P3 instances deliver high-performance computing in the cloud with up to 8 NVIDIA® V100 Tensor Core GPUs and up to 100 Gbps of networking throughput. These instances deliver up to one petaflop of mixed-precision performance per instance to significantly accelerate machine learning and high-performance computing applications.

NVIDIA drivers come preinstalled and optimized according to Amazon best practices for the specific instance family, and are accessible from within the Docker container.

The following instances have been added:

  • p3.2xlarge
  • p3.8xlarge
  • p3.16xlarge
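
As a rough illustration of how the three sizes differ, the sketch below picks the smallest listed P3 instance that provides a requested number of GPUs. The GPU counts (1, 4 and 8) follow Amazon's published P3 specifications. The smallest_p3_for helper is hypothetical and is not part of any Seven Bridges API.

```python
# GPU counts per instance type, per Amazon's published P3 specifications.
P3_GPUS = {
    "p3.2xlarge": 1,
    "p3.8xlarge": 4,
    "p3.16xlarge": 8,
}


def smallest_p3_for(gpus_needed):
    """Return the smallest listed P3 instance with at least `gpus_needed` GPUs.

    Hypothetical helper, for illustration only.
    """
    if gpus_needed < 1:
        raise ValueError("gpus_needed must be at least 1")
    # Sort candidates by GPU count so the first match is the smallest fit.
    for name, gpus in sorted(P3_GPUS.items(), key=lambda item: item[1]):
        if gpus >= gpus_needed:
            return name
    raise ValueError("no listed P3 instance has %d GPUs" % gpus_needed)


print(smallest_p3_for(2))
```

A tool that needs two GPUs would land on p3.8xlarge here, since the listed sizes step from one GPU straight to four.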

September 30th, 2019

New: Define Compute Resources per Task Run

When creating a task via the visual interface, you can now set the top-level instance type and the maximum number of parallel instances for your execution without having to create a new version of the app. Learn more about setting execution hints at the task level in our documentation.
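
For readers who script their task runs, the idea of per-task compute settings can be sketched as a small settings dictionary built separately for each run. This is an illustration only: the key names instance_type and max_parallel_instances, the build_execution_settings helper and the instance values are assumptions made for this sketch, so check the documentation for the exact fields your client expects.

```python
def build_execution_settings(instance_type=None, max_parallel_instances=None):
    """Build a task-level settings dict, leaving out anything not set.

    The key names are assumptions made for this sketch.
    """
    settings = {}
    if instance_type is not None:
        settings["instance_type"] = instance_type
    if max_parallel_instances is not None:
        if max_parallel_instances < 1:
            raise ValueError("max_parallel_instances must be at least 1")
        settings["max_parallel_instances"] = max_parallel_instances
    return settings


# Two runs of the same app version with different compute resources.
small_run = build_execution_settings("c4.2xlarge", 1)
gpu_run = build_execution_settings("p3.2xlarge", 2)
print(small_run)
print(gpu_run)
```

The point of the design is that the app itself stays untouched: only the settings attached to each task differ.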

August 26th, 2019

New: Access task secondary files via the API

You can now use our sevenbridges-python client to access secondary files for task inputs and outputs.

New and improved functionality:

  1. API users can now see exactly which files were used as secondary files for inputs.
  2. The Python client can now easily get those files via a simple call, as shown in the example below.
  3. All of this is also supported for CWL 1.x tools and workflows, where the secondary files can be defined as JS expressions.

An example using the sevenbridges-python API client:

import sevenbridges as sb

# Load credentials from the 'default' profile and create the API client.
config = sb.Config(profile='default')
api = sb.Api(config=config)

# Fetch the task, then read secondary files directly from its inputs and outputs.
task = api.tasks.get('439221a0-27c8-47a3-bcac-fcc5f44f82a8')
output_secondary_files = task.outputs['my_output'].secondary_files
input_secondary_files = task.inputs['my_input'].secondary_files
print(output_secondary_files)
print(input_secondary_files)

Please note that secondary files are captured from a task's inputs or outputs, not from the file system. The secondary_files property is therefore available only on a file pulled from the task itself, not on one that has been reloaded from the file system or instantiated directly via the api.files.get(<FILE_ID>) call or a similar one. The only supported way of getting secondary files is the one shown above: they need to be captured as soon as possible from the input file.
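
When a task has many ports, the same pattern can be applied to all of them at once. The sketch below is a hypothetical helper, not part of sevenbridges-python: it assumes the inputs or outputs can be iterated like a dictionary, that a port may hold either one file or a list of files, and that only file objects carry the secondary_files attribute. Stand-in objects replace real task data so that it runs without Platform credentials.

```python
from types import SimpleNamespace


def collect_secondary_files(ports):
    """Map each port name to the secondary files found on its file value(s).

    `ports` is a dict-like mapping of port name to a file object or a list
    of file objects (an assumption about the shape of task inputs/outputs).
    """
    collected = {}
    for name, value in ports.items():
        # Treat a single value and a list of values the same way.
        files = value if isinstance(value, (list, tuple)) else [value]
        found = []
        for f in files:
            # Non-file values (strings, numbers, None) lack the attribute.
            found.extend(getattr(f, "secondary_files", None) or [])
        if found:
            collected[name] = found
    return collected


# Stand-in objects instead of real task outputs.
bam = SimpleNamespace(name="sample.bam", secondary_files=["sample.bam.bai"])
report = SimpleNamespace(name="qc.html")
fake_outputs = {"aligned": bam, "threads": 8, "reports": [report]}
print(collect_secondary_files(fake_outputs))
```

With real data, the helper would be called on the task's inputs or outputs right after the task is fetched, in line with the note above about capturing secondary files from the task itself.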

Learn more about the sevenbridges-python API client.

Whole Genome Sequencing – Quality Control – CWL1.0 Workflow

Data quality control (QC) is an important component of NGS projects, especially with relatively costly whole genome sequencing (WGS). Timely QC can identify and account for issues with the starting biological material (DNA contamination or sample swaps), the sequencing process or bioinformatic pipelines used for processing.

Whole Genome Sequencing – Quality Control – CWL1.0 Workflow is intended as a general-purpose QC flow for users processing WGS data, regardless of the number of samples. It is meant to offer plots that end users can easily inspect visually, as well as structured data output suitable for aggregation and parsing in an automated setup. Because keeping the cost and duration of single-sample tasks to a minimum can matter in large-scale sequencing projects, the workflow is designed to be modular, with nodes that can be turned on or off on request, or segments skipped completely (based on input data availability, for example).

August 19th, 2019

Improvements: Export files to a volume within the same region

It is now possible to mount volumes from all supported cloud providers and regions in read-write (RW) mode on the Seven Bridges Platform. Files can be exported to volumes that are in the same location (cloud provider and region) as the file being exported, which prevents the export from incurring additional data transfer costs.
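
The same-location rule can be sketched as a simple comparison of provider and region. Everything in this example is hypothetical: the Location record, the eligible_volumes helper and the volume names are made up for illustration and do not reflect how the Platform stores this information.

```python
from collections import namedtuple

# Hypothetical location record: cloud provider plus region.
Location = namedtuple("Location", ["provider", "region"])


def eligible_volumes(file_location, volumes):
    """Return the names of volumes in the same provider and region as the file."""
    return sorted(
        name for name, location in volumes.items() if location == file_location
    )


volumes = {
    "results-east": Location("aws", "us-east-1"),
    "results-west": Location("aws", "us-west-2"),
    "archive-gcs": Location("google", "us-east1"),
}
print(eligible_volumes(Location("aws", "us-east-1"), volumes))
```

Both the provider and the region have to match, so a volume in the same provider but a different region is not an export target in this sketch.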

August 14th, 2019

Release: GDC Datasets version update

As of August 7, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 18.

August 5th, 2019

New and improved API calls for user management

New and improved API calls for enterprise users enable you to:

  1. List all users from a division with filtering based on the role field,
  2. Get role information for a user in a division,
  3. List all teams, not only the ones you are a member of.

The changes will enable you to create various API scripts to answer questions like:

  • Who has access to what?
  • How much money was spent on compute per team?
  • Do I have some external collaborators on my enterprise that I forgot about?
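
Once the user, team and cost data has been fetched, answering these questions is mostly a matter of filtering and grouping. The sketch below works on hypothetical, hand-written records rather than live API responses: the field names, the role values (such as external_collaborator) and the one-team-per-user simplification are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records, shaped like what a script might assemble from the
# user, team and billing calls. Field names and role values are assumptions.
members = [
    {"username": "alice", "role": "admin", "team": "oncology"},
    {"username": "bob", "role": "member", "team": "oncology"},
    {"username": "carol", "role": "external_collaborator", "team": "rare-disease"},
]
task_costs = [
    {"run_by": "alice", "compute_cost": 12.50},
    {"run_by": "bob", "compute_cost": 7.25},
    {"run_by": "carol", "compute_cost": 3.00},
]


def users_with_role(members, role):
    """List usernames holding the given role, in alphabetical order."""
    return sorted(m["username"] for m in members if m["role"] == role)


def compute_cost_per_team(members, task_costs):
    """Sum compute cost per team, assuming each user belongs to one team."""
    team_of = {m["username"]: m["team"] for m in members}
    totals = defaultdict(float)
    for cost in task_costs:
        totals[team_of[cost["run_by"]]] += cost["compute_cost"]
    return dict(totals)


print(users_with_role(members, "external_collaborator"))
print(compute_cost_per_team(members, task_costs))
```

In a real script a user may belong to several teams, so the per-team totals would need a rule for how shared costs are attributed.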

Examples using the sevenbridges-python API client are included in the full release note.

July 15th, 2019

Release: GDC Datasets version update

As of July 10, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 17.

Release: CPTAC-3 data release

With this release, controlled-access data from the CPTAC-3 project is available on the Platform for search and filtering in the Data Browser and through the API. This set contains protected WGS, WXS, and RNA-Seq data, and access to it requires approval from dbGaP. The data was collected within the CPTAC (Clinical Proteomic Tumor Analysis Consortium) program, in its third phase, labelled CPTAC-3. The program focused on collecting proteomics data for patients with a particular cancer type, but data collection has also been expanded to genomic data, particularly for lung, kidney, and uterine carcinoma. The primary source for this genomic data is the GDC.
