If you store your files in the AWS US East (N. Virginia) and/or AWS US West (Oregon) regions, the Seven Bridges Platform now allows you to manage all your work from a single space and spin up your chosen computation resources in the location where your data lives. By letting you choose your project location, Seven Bridges makes it much easier to control the costs of potential data transfers between regions. This way you can optimize your workload for a specific use case, and Seven Bridges will continue to provide full transparency over data transfer costs charged by cloud infrastructure providers. Read more.
Recently published apps
BROAD Best Practices RNA-Seq
This workflow represents the GATK Best Practices for SNP and INDEL calling on RNA-Seq data. Starting from an unmapped BAM file, it performs alignment to the reference genome, followed by marking duplicates, reassigning mapping qualities, base recalibration, variant calling and variant filtering. We used Broad’s best practice script in WDL format as a reference to create the BROAD Best Practices RNA-Seq Variant Calling 220.127.116.11 workflow in CWL version 1.0.
BROAD Best Practices Somatic CNV Panel Workflow
BROAD Best Practices Somatic CNV Panel Workflow is used for creating a panel of normals (PON) given a group of normal samples. Using read coverage collected over specified intervals, this workflow creates a panel of normals HDF5 file which is used in BROAD Best Practices Somatic CNV Pair Workflow for standardizing and denoising read counts. This workflow represents a CWL implementation of Broad’s best practice CNV panel WDL workflow.
BROAD Best Practices Somatic CNV Pair Workflow
BROAD Best Practices Somatic CNV Pair Workflow is used for detecting copy number variants (CNVs) as well as allelic segments. Given a tumor sample and an optional matched normal sample, as well as a panel of normals (PON) file, this workflow models and calls CNV segments. This workflow represents a CWL implementation of Broad’s best practice CNV pair WDL workflow.
When defining task execution settings, you can now enable memoization. Achieve significant time and cost optimization of your project workload by letting the Platform reuse existing results of your previous runs. Memoization can be enabled at project or task level, where the task-level setting overrides the project-level one.
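The idea behind memoization can be illustrated with a minimal Python sketch. It is not the Platform's implementation: the in-memory cache, the key format, and the names `memo_key`, `memoization_enabled`, and `run_with_memoization` are all hypothetical and chosen only for illustration.

```python
import hashlib
import json

# Hypothetical in-memory store of previous results, keyed by app and inputs.
_cache = {}


def memo_key(app_id, inputs):
    """Build a deterministic key from the app identifier and its inputs."""
    payload = json.dumps({"app": app_id, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def memoization_enabled(project_setting, task_setting=None):
    """The task-level setting, when given, overrides the project-level one."""
    return project_setting if task_setting is None else task_setting


def run_with_memoization(app_id, inputs, execute, enabled=True):
    """Return (result, reused).

    When memoization is enabled and the same app has already run on the
    same inputs, the stored result is returned instead of executing again.
    """
    key = memo_key(app_id, inputs)
    if enabled and key in _cache:
        return _cache[key], True
    result = execute(inputs)
    _cache[key] = result  # remember the result for later runs
    return result, False
```

In this sketch a second call with the same app and inputs returns the stored result without invoking `execute` again, which is where the time and cost savings would come from.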
Learn more from our documentation.
New: Multiple dataset selection for querying in Data Browser
You can now start a query in Data Browser by selecting more than one dataset. This allows you to search the selected datasets by common entities (e.g. Case) and property values (e.g. male for the gender property), and get combined results from all of them. In addition, you can select only those instances of an entity (e.g. Case identifier) that are common to all selected datasets, if such instances are available. These can then be filtered further by other entities or properties that the common instances have in the selected datasets.
Additionally, the List (table) view on the Data Browser page has been removed and priority has been given to the Detail page.
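The difference between combined results and common instances can be sketched with plain Python sets. The toy datasets, identifiers, and function names below are hypothetical and do not reflect how Data Browser stores or queries data.

```python
# Hypothetical toy datasets: each maps a Case identifier to its properties.
dataset_a = {"case-1": {"gender": "male"}, "case-2": {"gender": "female"}}
dataset_b = {"case-1": {"gender": "male"}, "case-3": {"gender": "male"}}


def combined_matches(datasets, prop, value):
    """Union of Case identifiers whose property equals value in any dataset."""
    found = set()
    for dataset in datasets:
        found |= {cid for cid, props in dataset.items() if props.get(prop) == value}
    return found


def common_instances(datasets):
    """Case identifiers that are present in every selected dataset."""
    id_sets = [set(dataset) for dataset in datasets]
    return set.intersection(*id_sets) if id_sets else set()
```

Here the combined query behaves like a set union across datasets, while restricting to common instances behaves like a set intersection.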
Improvements: Improved organization of Public Reference Files
The Public Reference Files gallery has been renamed to Public Files and split into two categories, Public Reference Files and Public Test Files, where the former holds all common reference files, while the latter contains common test samples. Both of these categories can be accessed from the Data menu on the main menu bar.
This change does not affect the API, so all related API calls remain the same.
New: Added support for GPU Instances
The first family of GPU instances we’re introducing is Amazon EC2 P2. P2 Instances are powerful, scalable instances that provide GPU-based parallel compute capabilities. Designed for general-purpose GPU compute applications using CUDA and OpenCL, these instances are ideally suited for machine learning, molecular modeling, genomics, rendering, and other workloads requiring massive parallel floating point processing power. NVIDIA drivers come preinstalled and optimized according to the Amazon best practice for the specific instance family and are accessible from the Docker container.
The following instances have been added:
Learn more about GPU instances on the Platform.
Spot Instances enabled by default on project creation
In order to promote execution cost optimization, Spot Instances are now enabled by default when creating a new project through the visual interface or the API, unless you have specifically set otherwise. This setting can later be changed from the project settings page, or overridden per task on the draft task page. Learn more about Spot Instances on the Platform.
New: Support for asynchronous bulk actions through the API
As a part of adding full support for folders and improving scalability, we have introduced asynchronous file system actions through the API. Currently supported actions are copy and delete, and these are enabled for both files and folders. There are five new API endpoints for async bulk actions which can be used for issuing copy and delete commands and for getting job statuses. This enables the following API actions which weren’t possible before:
- Copy folder (along with the files it contains and the underlying folder structure)
- Bulk copy of files and folders into different paths (project root or specific folder inside the project)
- Delete non-empty folder
Learn more from our documentation.
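Because these actions are asynchronous, a client typically submits a job and then checks its status until it completes. The sketch below shows only that polling pattern. The `get_status` callable and the state names `RUNNING`, `FINISHED`, and `FAILED` are assumptions made for illustration, not the documented API; consult the API documentation for the actual endpoints and response fields.

```python
import time


def wait_for_job(get_status, job_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a terminal state.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current state of the job as a string.
    """
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in ("FINISHED", "FAILED"):  # assumed terminal state names
            return status
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {status} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.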
Improvements: Improved layout of the draft task page
In order to streamline the preparation process for task execution, both file inputs and app settings will now be available as two columns under the same tab named Task Inputs on the draft task page. Spot Instance configuration will be moved to the second tab on the draft task page, named Execution Settings. This tab will also serve as the central and unique location for all settings related to task execution that will be added in the future.
In order to enhance our comprehensive security framework and provide our enterprise customers with additional options for securing their data and analyses on the Seven Bridges Platform, we are introducing the following security enhancements:
- Multi-factor authentication
- Shorter idle session logout time
Multi-factor authentication (MFA) significantly decreases the risk of compromising user accounts. It is an additional layer of protection beyond your password and combines something you already know (password) with something you have (mobile phone). You will have the option to use your email or an authentication app as the second step in the authentication procedure, or use backup codes if the preferred authentication method is not available. We also offer the popular “remember me on this computer” option that increases the usability of this security feature, as you will not be asked for the second factor on a specific computer during a specific time period. Moreover, in case of problems with the login procedure, there will be an option to contact support directly from the login screen. Learn more about MFA from User documentation and Administrator documentation.
We also enable the enterprise administrator to:
- Define whether multi-factor authentication is enforced for all users in an enterprise
- Define whether the “remember me on this computer” option is available, as well as the number of days after which the user will be asked for the second factor again.
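The interaction between these two administrator settings can be sketched as a small decision function. The function and parameter names are hypothetical; this is an illustration of the described behavior, not the Platform's authentication code.

```python
from datetime import datetime, timedelta


def second_factor_required(last_verified, now, remember_enabled, remember_days):
    """Decide whether to ask for the second factor on this computer.

    last_verified is when the second factor was last accepted on this
    computer (None if never). remember_enabled and remember_days stand in
    for the administrator-defined settings.
    """
    if not remember_enabled or last_verified is None:
        return True
    # Ask again once the remembered period has fully elapsed.
    return now - last_verified >= timedelta(days=remember_days)
```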
Shorter idle session logout time
We have reduced the default idle session logout time to 1 hour on the Seven Bridges Platform (both USA and EU installations), thus enabling our enterprise customers to be compliant with the required security standards. This means that users who have been inactive for the defined period of time will be logged out. This will affect all Platform users.
We would also like to remind you of our Single sign-on solution, which enables our enterprise users to integrate their own SAML-based single sign-on solution with the Platform. If you are an enterprise user, this lets you improve security by minimizing the number of logins you need to manage for your users, reuse your existing security infrastructure, and improve the user experience. If you wish to set this up, feel free to contact our support.
Recently published apps
Metagenomics WGS Functional Profiling – HUMAnN2
HUMAnN2 (the HMP Unified Metabolic Analysis Network) is a tool used for efficiently and accurately determining the presence/absence and abundance of metabolic pathways in a microbial community from metagenomic sequencing data. It introduces a novel tiered search algorithm that provides highly accurate profiles for characterized members of microbial communities, with fallback to translated search for uncharacterized members.
The Metagenomics WGS Functional Profiling – HUMAnN2 workflow provides a complete functional profiling analysis of input samples and is designed to analyze several metagenomics samples in parallel.
New: Data Cruncher – RStudio (beta)
In addition to JupyterLab, Data Cruncher now supports one more development environment, RStudio. You can choose between the two environments when setting up your Data Cruncher analysis.
Also, file saving rules have been deprecated, so all analysis files will be automatically saved in your analysis workspace on the Platform, regardless of their size or extension.
Learn more about RStudio in Data Cruncher from our documentation.
New: Human Cell Atlas Preview Datasets public project
Human Cell Atlas Preview Datasets are now available as a public project on the Seven Bridges Platform. The project contains the files from the first three single-cell sequencing datasets released to the research community as “Human Cell Atlas Preview Datasets”. The available datasets are:
- Census of Immune Cells
- Ischaemic Sensitivity of Human Tissue
- Melanoma Infiltration of Stromal and Immune Cells
The Human Cell Atlas
Launched in 2016, the Human Cell Atlas (HCA) is an international collaborative effort to catalog all the cells in the human body in terms of their distinctive patterns of gene expression, physiological states, developmental trajectories, and location, in order to understand how genetic variants impact disease risk, define drug toxicities, discover better therapies, and advance regenerative medicine. Learn more.
Improvements: Folders as task inputs and outputs
When selecting inputs for a task, you will now be able to select an entire folder for input ports that are set up to take folders as input values. This means that such input ports will take all files from the root of the selected folder and its subfolders. Folders can now also be displayed as app outputs, provided that the app itself is configured to produce output data in folder(s). This feature is available for CWL 1.0 apps only.
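As a local analogy for "all files from the root of the selected folder and its subfolders", the following Python sketch collects files recursively from a directory on disk. It illustrates the traversal only; the function name is hypothetical and nothing here calls the Platform.

```python
from pathlib import Path


def files_in_folder(folder):
    """Return every file in the folder root and all of its subfolders, sorted."""
    root = Path(folder)
    # rglob("*") walks the whole tree; directories themselves are skipped.
    return sorted(path for path in root.rglob("*") if path.is_file())
```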
Improvements: Computation backend improvements
We are making some improvements to our computation backend. These changes impact sbg:draft-2 tasks only, mostly bringing some of their behaviors and capabilities in line with CWL 1.0 tasks.
The following are the changes:
Scattering improvement: When running sbg:draft-2 workflows, you might notice a runtime improvement in some workflows that use chained scattering. No action is needed on your part and you can continue running your apps as usual.
Docker entrypoints: The executor for sbg:draft-2 apps now honors Docker image entrypoints. If your sbg:draft-2 app refers to a Docker image that has a defined entrypoint, this previously ignored Docker feature will now be active. Please keep this change in mind when running the app and let us know if you notice any unexpected behavior.
Stage multiple files with the same name: If multiple files with the same name are provided from an upstream tool to a staged input port of a downstream tool in a workflow, staging will now succeed because the files are renamed automatically. Please keep this in mind if you rely on file names during the processing steps in your workflows.
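To see why automatic renaming matters for workflows that rely on file names, consider the following Python sketch. The numeric-prefix scheme and the function name are assumptions made purely for illustration; the naming scheme the Platform actually applies is not specified here.

```python
from collections import Counter
from pathlib import PurePosixPath


def stage_names(paths):
    """Assign staging names so that repeated base names no longer clash.

    Assumed scheme: the first file keeps its name and each later file with
    the same base name gets a numeric prefix. A real implementation would
    also need to avoid clashing with files already carrying such a prefix.
    """
    seen = Counter()
    staged = []
    for path in paths:
        name = PurePosixPath(path).name
        seen[name] += 1
        count = seen[name]
        staged.append(name if count == 1 else f"{count - 1}_{name}")
    return staged
```

A downstream step that looks for an exact file name would find only the first file under this scheme, so matching on a pattern or on file metadata is more robust than matching on the literal name.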
If you have any questions related to these changes, please contact our Support team at firstname.lastname@example.org.
Recently published apps
DeepVariant 0.7.2 is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is highly accurate, robust, flexible and easy to use. To use DeepVariant on the Seven Bridges Platform, simply supply it with reads and a reference, and select the desired model (WGS or WES).