Release: GDC Datasets version update
As of July 10, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 17.
Release: CPTAC-3 data release
With this release, controlled-access data from the CPTAC-3 project is available on the Platform for search and filtering in the Data Browser and through the API. The set contains protected WGS, WXS, and RNA-Seq data, and access to it requires approval from dbGaP. The data was collected in the third phase of the CPTAC (Clinical Proteomic Tumor Analysis Consortium) program, labelled CPTAC-3. The program focused on collecting proteomics data for patients with a particular cancer type, but data collection has also been expanded to genomic data, particularly for lung, kidney, and uterine carcinoma. The primary source for this genomic data is the GDC. Read more.
Supported browsers update
Internet Explorer is no longer a supported browser on the Seven Bridges Platform. If you try to access the Platform using Internet Explorer, you will see a message stating that you are using an unsupported browser and suggesting that you switch to a supported one.
We have also updated the minimum required versions for the supported browsers:
Recently published apps
The following apps have been ported to CWL 1.0 and are now available as CWL 1.0 apps in the Public Apps gallery:
- Optitype 1.2
- VEP annotation workflow 90.5
- Ensembl-VEP 90.5
Writing rate limit-efficient API scripts
The API rate limit caps the number of calls you can send to the Seven Bridges API within a predefined time frame: 1,000 requests within 5 minutes. Once this limit is reached, the API server accepts no further calls until the 5-minute interval ends.
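To make the numbers above concrete, here is a minimal client-side sketch of a call budget. It is not part of the Seven Bridges Python client; the class name `SlidingWindowBudget` and its methods are hypothetical. It uses a rolling window, which is an assumption on the conservative side: a script that never exceeds 1,000 calls in any rolling 5 minutes also never exceeds 1,000 calls in any fixed 5-minute interval.

```python
import time
from collections import deque


class SlidingWindowBudget:
    """Hypothetical client-side guard tracking calls in a rolling time window."""

    def __init__(self, max_calls=1000, window_seconds=300, clock=time.monotonic):
        self.max_calls = max_calls            # 1,000 requests ...
        self.window_seconds = window_seconds  # ... within 5 minutes
        self.clock = clock                    # injectable for testing
        self._stamps = deque()                # timestamps still inside the window

    def _expire(self, now):
        # Drop timestamps that have left the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def wait_time(self):
        """Seconds to wait before the next call fits the budget (0.0 if it fits now)."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])

    def record(self):
        """Register one call that is about to be sent."""
        now = self.clock()
        self._expire(now)
        self._stamps.append(now)
```

A script would call `time.sleep(budget.wait_time())` and then `budget.record()` immediately before each API request, so it pauses on its own schedule instead of having calls rejected by the server.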
Write your API scripts with this rate limit in mind and keep the number of API calls to the Seven Bridges Platform to a minimum. This way, you avoid reaching the rate limit, and your scripts run without delays caused by server-side throttling.
We have published new documentation that helps you make your API scripts rate limit-efficient. Code snippets demonstrate the recommended use of the Seven Bridges Python client to minimize API calls for common tasks, including finding projects, iterating over query result sets, importing files from volumes, exporting files to volumes, updating file metadata, copying files between projects, deleting files, and submitting tasks for execution.
If you have ever experienced errors or delays due to the Seven Bridges API rate limit, please give this new content a read to learn how to make your API calls (not) count.
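One general way to spend fewer calls on per-file work is to group items and send one request per group instead of one per item. The sketch below only illustrates that arithmetic and is not taken from the new documentation: `send_batch` is a hypothetical callable standing in for whatever bulk operation your client offers, and the batch size of 100 is an assumed limit that you should check against the API reference.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def process_in_batches(items, send_batch, batch_size=100):
    """Call `send_batch` once per chunk and return the number of calls made.

    `send_batch` is a hypothetical stand-in for a bulk operation;
    `batch_size=100` is an assumed per-call item limit.
    """
    calls = 0
    for batch in chunked(items, batch_size):
        send_batch(batch)
        calls += 1
    return calls
```

With these assumptions, handling 250 files takes 3 calls rather than 250, which leaves far more of the 1,000-request budget for the rest of the script.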
New: Seven Bridges Automation Tools and Services
The new Seven Bridges Automation Tools and Services enable biotechnology and biopharmaceutical companies to increase productivity by bringing a diverse set of users into one environment. Scripts written with the Python Automation Development Kit (ADK) automatically gain concurrency, dependency management, memoization, retries, execution logs, and much more, enabling developers to focus on business logic and ultimately, reduce their lines of code by up to 80%. Within the same environment, end users are now able to process complex analysis workflows with the push of a button, share results instantly, and achieve total reproducibility.
Visit sevenbridges.com/automation for more details.
Supported instances update
You can now use next generation AWS Memory Optimized instances (R5) in task executions and Data Cruncher analyses. R5 instances support the high memory requirements of certain applications to increase performance and reduce latency.
Learn more about supported instance types.
New: New CWL web editor is now live
We have released an updated version of our CWL web editor. This release integrates the functionality of our desktop editor, Rabix Composer, with the Seven Bridges Platform.
- Ability to edit both CWL sbg:draft-2 and v1.0 apps and select the CWL version before creating an app.
- Streamlined tool editor with all parameter sections available in a single continuous page. It’s now easier to have an overview of the way the tool was built.
- Optimized design of the workflow editor to facilitate working with large and complex workflows. When selecting a node, you’ll be able to see what its direct connections are.
- CWL Code editor in addition to the Visual Editor. For people who feel comfortable with writing raw JSON, we have enabled the code editor. All changes that you make in the code editor will be reflected in the visual editor and vice versa.
- Better documentation. We have expanded the documentation with our best practices for more advanced users.
If you have been using the Seven Bridges Platform for a while, don’t worry: the legacy editor remains available in case you are not ready to make this transition right now. Note that the legacy editor can only be used to edit sbg:draft-2 CWL apps.
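For readers who have not seen raw CWL in JSON form, the sketch below prints a minimal CWL v1.0 `CommandLineTool` description of the kind the code editor works with. The tool id, input name, and command are made up for illustration, and apps on the Platform may carry additional fields that are not shown here.

```python
import json

# A minimal CWL v1.0 CommandLineTool, written as a Python dict for illustration.
# The id, the "message" input, and the echo command are hypothetical.
tool = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "id": "echo_example",
    "baseCommand": "echo",
    "inputs": {
        "message": {
            "type": "string",
            # Place the value as the first argument after the base command.
            "inputBinding": {"position": 1},
        }
    },
    "outputs": {},
}

# Serialize to the raw JSON text a code editor would display.
raw_json = json.dumps(tool, indent=2)
print(raw_json)
```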
If you store your files on AWS US East (N. Virginia) and/or AWS US West (Oregon) regions, the Seven Bridges Platform now allows you to manage all your work from a single space and spin up chosen computation resources at the location where your data lives. By letting you choose your Project location, Seven Bridges provides an environment for much easier control of costs caused by potential data transfers between different regions. This way you can optimize your workload for a specific use case, and Seven Bridges will continue to provide full transparency over data transfer costs charged by cloud infrastructure providers. Read more.
Recently published apps
BROAD Best Practices RNA-Seq
This workflow represents the GATK Best Practices for SNP and INDEL calling on RNA-Seq data. Starting from an unmapped BAM file, it performs alignment to the reference genome, followed by marking duplicates, reassigning mapping qualities, base recalibration, variant calling and variant filtering. We used Broad’s best practice script in WDL format as a reference to create the BROAD Best Practices RNA-Seq Variant Calling 184.108.40.206 workflow in CWL version 1.0.
BROAD Best Practices Somatic CNV Panel Workflow
BROAD Best Practices Somatic CNV Panel Workflow is used for creating a panel of normals (PON) given a group of normal samples. Using read coverage collected over specified intervals, this workflow creates a panel of normals HDF5 file which is used in BROAD Best Practices Somatic CNV Pair Workflow for standardizing and denoising read counts. This workflow represents a CWL implementation of Broad’s best practice CNV panel WDL workflow.
BROAD Best Practices Somatic CNV Pair Workflow
BROAD Best Practices Somatic CNV Pair Workflow is used for detecting copy number variants (CNVs) as well as allelic segments. Given a tumor and optional matched normal sample, as well as panel of normals (PON) file, this workflow models and calls CNV segments. This workflow represents a CWL implementation of Broad’s best practice CNV pair WDL workflow.
When defining task execution settings, you can now enable memoization. Achieve significant time and cost optimization of your project workload by letting the Platform reuse existing results of your previous runs. Memoization can be enabled at project or task level, where the task-level setting overrides the project-level one.
Learn more from our documentation.
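The idea behind memoization and the override rule above can be sketched in a few lines. This is a conceptual illustration only, not the Platform's implementation: the function names are hypothetical, and the way a previous run is matched to a new one (here, a hash of the app identifier and its inputs) is an assumption.

```python
import hashlib
import json


def memoization_enabled(project_setting, task_setting=None):
    """Task-level setting wins when given; otherwise fall back to the project."""
    return project_setting if task_setting is None else task_setting


def cache_key(app_id, inputs):
    """Stable fingerprint of an app and its inputs (assumed key scheme)."""
    payload = json.dumps({"app": app_id, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(app_id, inputs, execute, cache, enabled):
    """Reuse a stored result when memoization is on and the key matches."""
    key = cache_key(app_id, inputs)
    if enabled and key in cache:
        return cache[key], True   # result reused, nothing executed
    result = execute(inputs)
    cache[key] = result
    return result, False          # result freshly computed
```

Calling `run_step` twice with the same app and inputs executes the work once when memoization is enabled, which is where the time and cost savings come from.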
New: Multiple dataset selection for querying in Data Browser
You can now start a query in the Data Browser by selecting more than one dataset. This lets you search the selected datasets by common entities (e.g. Case) and property values (e.g. male for the gender property) and get combined results from all of them. In addition, you can select only those instances of an entity (e.g. Case identifiers) that are common to all selected datasets, if such instances exist. These can then be filtered further by other entities and properties that the common instances have in the selected datasets.
Additionally, the List (table) view on the Data Browser page has been removed and priority has been given to the Detail page.
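The difference between combined results and common instances in a multi-dataset query can be pictured with plain set operations. The sketch below is only an analogy, assuming each dataset is a mapping from case identifier to properties; the dataset names, case identifiers, and function names are placeholders and do not reflect the Data Browser's internals or real data.

```python
def combined_cases(datasets, **filters):
    """Union of case IDs matching the property filters in any selected dataset."""
    matches = set()
    for cases in datasets.values():
        for case_id, props in cases.items():
            if all(props.get(k) == v for k, v in filters.items()):
                matches.add(case_id)
    return matches


def common_cases(datasets):
    """Case IDs present in every selected dataset."""
    id_sets = [set(cases) for cases in datasets.values()]
    return set.intersection(*id_sets) if id_sets else set()


# Made-up records; dataset names and case IDs are placeholders, not real data.
datasets = {
    "dataset_a": {"case-1": {"gender": "male"}, "case-2": {"gender": "female"}},
    "dataset_b": {"case-2": {"gender": "female"}, "case-3": {"gender": "male"}},
}
```

Here `combined_cases(datasets, gender="male")` gathers matches from both datasets, while `common_cases(datasets)` keeps only the identifiers that appear in each of them.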
Improvements: Improved organization of Public Reference Files
The Public Reference Files gallery has been renamed to Public Files and split into two categories: Public Reference Files, which holds all common reference files, and Public Test Files, which contains common test samples. Both categories can be accessed from the Data menu on the main menu bar.
This change does not affect the API, so all related API calls remain the same.
New: Added support for GPU Instances
The first family of GPU instances we’re introducing is Amazon EC2 P2. P2 Instances are powerful, scalable instances that provide GPU-based parallel compute capabilities. Designed for general-purpose GPU compute applications using CUDA and OpenCL, these instances are ideally suited for machine learning, molecular modeling, genomics, rendering, and other workloads requiring massive parallel floating point processing power. NVIDIA drivers come preinstalled and optimized according to the Amazon best practice for the specific instance family and are accessible from the Docker container.
The following instances have been added:
Learn more about GPU instances on the Platform.