We’ve implemented a system to help you generate queries with a meaningful sample size of data. For instance, if you add more than three entities (e.g. Case, Sample, File) without applying any filters, the Data Browser prompts you to add one or more properties with specific values to narrow your query before continuing, as shown below.
The Data Browser similarly prompts you to add filters if you create a query that has symmetrical branches without any filters applied, as shown below.
Learn more about adding filters to your query.
[Improvements] Table of results improvements
The table below the Data Browser contains further details about your query. In addition to the List view and Analytics view, we’ve introduced a Details view.
The Details view displays details about a selected entity in the context of your query and consists of three panels. The first panel details the inbound and outbound connections for the selected entity in the context of your query. The second panel displays a list of UUIDs corresponding to that entity that match your query. If a specific UUID within the second panel is selected, the third panel displays the metadata corresponding to it.
This new Details view enables you to explore the connections and relations between entities in your query more easily and thus ensure that your query is identifying the data you need to meet your research goals.
Learn more about the Details view.
[Improvements] Support for Amazon Web Services Spot instances
Seven Bridges has introduced support for Spot instances on the Amazon Web Services (AWS) deployment of the Platform. Spot instance support can be selected as the default for a project and as an option for each task execution. Selecting Spot instances can dramatically reduce execution costs: our testing indicates execution cost savings of over 75% on common workflows.
Because of how AWS manages Spot instances, they can be interrupted while tasks are running. If a Spot instance is interrupted, Seven Bridges’ job retry functionality automatically restarts the interrupted jobs and any remaining unfinished jobs on an On-Demand instance to prevent further interruptions. Such an interruption may reduce the cost savings from using a Spot instance and can result in a longer overall runtime, but the reliability of task execution is unaffected.
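The cost trade-off described above can be illustrated with a rough estimate. This is only a sketch under a simplified model: the hourly rates are hypothetical, the function name `estimate_task_cost` is made up for illustration, and the Platform's actual billing may differ. Work done before an interruption is billed at the Spot rate, and the remaining work (plus any redone work) at the On-Demand rate.

```python
def estimate_task_cost(total_hours, on_demand_rate, spot_rate,
                       interrupted_after=None, redo_hours=0.0):
    """Estimate task cost under a simplified Spot-then-On-Demand model.

    Rates are hypothetical dollars per instance-hour.
    interrupted_after: hours run on Spot before an interruption, or None
        if the Spot instance is never interrupted.
    redo_hours: extra hours spent re-running interrupted jobs on On-Demand.
    """
    if interrupted_after is None:
        # The whole task runs on the Spot instance.
        return total_hours * spot_rate
    spot_part = interrupted_after * spot_rate
    on_demand_hours = (total_hours - interrupted_after) + redo_hours
    return spot_part + on_demand_hours * on_demand_rate


# Hypothetical 10-hour task: $1.00/h On-Demand, $0.25/h Spot.
on_demand_only = 10 * 1.00
uninterrupted = estimate_task_cost(10, on_demand_rate=1.00, spot_rate=0.25)
interrupted = estimate_task_cost(10, on_demand_rate=1.00, spot_rate=0.25,
                                 interrupted_after=4, redo_hours=1)
```

With these made-up numbers, an uninterrupted Spot run costs a quarter of the On-Demand price, while an interruption after four hours still costs less than running On-Demand from the start but takes longer overall.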
[New] SBFS [Beta release]
SBFS is a command line tool which enables interaction with Platform project files that are mounted as a local file system.
Use SBFS to make project files available on a local file system and thus as accessible as any other locally available file. This eliminates the need for downloading complete files to a local machine, which is especially useful when working with large files exceeding the size of a local disk. With SBFS, parts of a file are accessible without necessitating a complete file download and users can perform interactive analyses on a local machine (or server instance) without needing to bring their tool to the Platform.
SBFS is available for Linux and macOS operating systems, and the beta version is available for download from the new Data tools page.
Learn more from the SBFS documentation.
TARGET GRCh38 Dataset
The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) dataset provides genomic, transcriptomic, and epigenomic data from patients representing several childhood cancers and serves as a valuable complement to the existing genomic and multi-omic datasets available on the Seven Bridges Platform via the CGC. The complete TARGET GRCh38 dataset is now available on the Platform. It includes both Open Data, accessible to all researchers, and Controlled Data, to which access is regulated by the Database of Genotypes and Phenotypes (dbGaP). This dataset can be queried using the Data Browser to generate custom cohorts from within this dataset as well as cohorts derived from multiple similarly aligned datasets such as the TARGET GRCh38 and TCGA GRCh38 datasets.
The Platform now provides access to mass spectrometry data that were generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) as part of the TCGA initiative to characterize and quantify the proteome of cancer samples. This dataset represents 335 samples from patients with Breast Invasive Carcinoma, Colon Adenocarcinoma, Ovarian Serous Cystadenocarcinoma, and Rectum Adenocarcinoma for whom matched genomic data are available. The dataset can be queried using the Data Browser to generate custom cohorts from within this dataset as well as multi-omic cohorts across the TCGA GRCh38 genomic and CPTAC proteomic datasets.
To maximize the accessibility and value of the multi-omic datasets available on the Platform, the Data Browser now enables cross-dataset queries for datasets with harmonized metadata. This allows researchers to use the Data Browser to identify cohorts of interest across multiple genomic datasets such as the GRCh38 alignments of TCGA and TARGET.
Learn more about cross-dataset queries through a sample query.
[New] Elastic Block Store customization feature
On August 1, we released a feature that lets you customize the amount of Elastic Block Store (EBS) disk space attached to different Amazon instance configurations. EBS customization is useful for bioinformatics workflows because it lets you optimize your computation by giving you greater control over storage requirements and costs.
To learn more about EBS customization, please see our documentation.
Passing through EBS charges from AWS
Along with the introduction of EBS customization, the way you are billed for Elastic Block Store (EBS) and Amazon Web Services (AWS) costs on the Platform will change. Starting on August 1, you will see an additional charge on the task overview page and on your bill when you use EBS disk space. Information about instance and EBS usage costs is available in the tooltip next to the total price on the task overview page. Before August 1, you will see a $0 charge for attached disk space.
Why will I be charged more when I use EBS disk space?
When AWS introduced EBS, they changed their pricing structure to charge separately for compute and storage. AWS charges ~$0.10 per GB*h/month for EBS disk space. Until now, Seven Bridges has paid the costs of EBS usage on your behalf; starting on August 1, along with the capability to fully customize EBS, we will begin passing through EBS costs. Our policy is to be completely transparent about your AWS charges and not to charge a premium for access to AWS services through the Platform.
Check out our documentation for more information on EBS charging.
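As a rough illustration of how such a charge might add up, the sketch below estimates the EBS cost of a single task. It rests on assumptions that are not confirmed here: that the rate of roughly $0.10 applies per GB per month, that it is prorated by the hours the disk is attached, and that a month is counted as 30 days. The function name `estimate_ebs_cost` is hypothetical; refer to the EBS charging documentation for the actual calculation.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day billing month


def estimate_ebs_cost(size_gb, hours_attached, rate_per_gb_month=0.10):
    """Estimate EBS cost assuming a per-GB monthly rate prorated by the hour."""
    return size_gb * rate_per_gb_month * hours_attached / HOURS_PER_MONTH


# Hypothetical task: 1024 GB of attached disk space for 5 hours.
cost = estimate_ebs_cost(size_gb=1024, hours_attached=5)
print(f"Estimated EBS charge: ${cost:.2f}")
```

Under these assumptions, attached disk space is a small fraction of the cost of a typical task, and it scales linearly with both disk size and runtime.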
[New] Simons Genome Diversity Project (SGDP) Dataset
The Simons Genome Diversity Project (SGDP) Open Access dataset contains complete genome sequences from 130 diverse human populations. It is the largest dataset of diverse, high-quality human genome sequences ever reported and includes many deeply divergent human populations that are not well-represented in other datasets, which makes the SGDP dataset ideal for interrogating the genomic landscape of different populations.
This dataset is available for analysis under the Public projects tab of the top navigation bar. You won’t pay for storage of the raw data files: copy the entire project, or selected files, into your own project on the Platform to conduct further analysis. Or, analyze SGDP data alongside your own data.
Learn more about using the SGDP public project.
[New] Export to Volumes: Copy-only parameter
We’ve added an Advance Access copy-only parameter to the Start an Export job request for Volumes. Advance Access means that, while the parameter is fully operational, it is subject to change. If this parameter, copy_only, is set to true, the specified file will be copied to a volume but the source file will remain on the Platform. Learn more from our documentation.
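As a minimal sketch of what such a request might look like, the snippet below builds (but does not send) an export request with copy_only enabled. The base URL, the `/storage/exports` path, the `X-SBG-Auth-Token` header, the body layout, and the choice to pass copy_only as a query parameter are all assumptions made for illustration, as is the helper name `build_export_request`; consult the API documentation for the authoritative request format.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.sbgenomics.com/v2"  # assumed base URL


def build_export_request(token, file_id, volume_id, location, copy_only=False):
    """Build (without sending) a hypothetical Start an Export job request."""
    url = f"{API_BASE}/storage/exports"
    if copy_only:
        # Assumption: copy_only is passed as a query parameter.
        url += "?" + urlencode({"copy_only": "true"})
    body = {
        "source": {"file": file_id},
        "destination": {"volume": volume_id, "location": location},
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-SBG-Auth-Token": token,  # assumed authentication header
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder identifiers; replace with real values before sending.
request = build_export_request(
    "YOUR_TOKEN", "FILE_ID", "owner/volume_name", "exports/sample.bam",
    copy_only=True,
)
print(request.get_method(), request.full_url)
```

Passing the built request to `urllib.request.urlopen` would submit it; with copy_only left at its default, the same helper builds a regular export request.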
[Improvements] Interactive Analysis improvements
The Seven Bridges Platform features interactive visualization toolkits to help you interpret the results of analyses on the Platform and to assess the quality of the data obtained. We’ve reorganized the Interactive Analysis space into a dashboard which displays all available apps as well as the number of files from your analysis which can be viewed with each app. Easily switch between apps using the app switcher feature in the app’s header. Collaboration is also easier than ever: share exactly what you see in an app via the URL in the navigation bar. Collaborators can follow the URL to see the exact files you’re working on and the same displays you see.
Currently available apps include the Genome Browser and VCF Benchmarking. Learn more about these apps below.
Use the VCF Benchmarking app to benchmark a set of variant calls. Select and open multiple .bench_sqlite report files from the File browser page in your project. Then, continue to the VCF Benchmarking app to see selected files, including all values and a chart for each of the reports. This makes it possible to compare multiple files.
The Genome Browser is our in-house genome browser for exploring alignment files. We’ve improved the Genome Browser to behave like an app in the Interactive Analysis space. Core functions remain the same.
[Improvements] Data Browser: Copy all files
Choose to copy files from a specific File node or from all File nodes on the Data Browser canvas. Learn more about this feature.
We’ve released a new version of the Python bindings for our API, sevenbridges-python 0.7.0, which includes some major updates.
Note that both the format for storing your API credentials and the location of the configuration file have changed. Please update your configuration file before using it in your API scripts.
The old configuration file will continue to work if you do not update the API bindings.
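For orientation, the sketch below reads a credentials profile from an INI-style configuration. The format shown is an assumption, since these notes do not specify it: the profile name `default` and the keys `api_endpoint` and `auth_token` are illustrative, and the helper `load_profile` is hypothetical rather than part of the bindings. Check the sevenbridges-python documentation for the exact file location and key names.

```python
import configparser

# Assumed layout of the new credentials file (illustrative values only).
EXAMPLE_CREDENTIALS = """
[default]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = YOUR_AUTH_TOKEN
"""


def load_profile(text, profile="default"):
    """Return (endpoint, token) for a named profile in INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[profile]  # raises KeyError if the profile is missing
    return section["api_endpoint"], section["auth_token"]


endpoint, token = load_profile(EXAMPLE_CREDENTIALS)
print(endpoint)
```

Keeping credentials in named profiles like this makes it easy to switch between environments without editing scripts.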
Export and import manifest files
Take advantage of new options, Export metadata manifest and Import metadata manifest, from the drop-down menu of the Files tab within a project, as shown below.
Select Export metadata manifest to export your project files’ metadata as an editable manifest file. Use this manifest file to modify the file metadata. Alternatively, use Export metadata manifest from filtered files to export and modify the metadata for only the subset of your project files that matches the filters you’ve applied.
Select Import metadata manifest to import a manifest file to the Platform and apply the metadata contained in that file to all the files in your project.
Note that a manifest file is formatted as a .CSV file.
Learn more about this feature from our Knowledge Center.
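Because a manifest is a CSV file, it can be edited in bulk with a short script before it is imported again. The sketch below fills in one metadata column for several files. The column names (`name`, `sample_id`, `experimental_strategy`) and the file names are made up for illustration, and `set_column` is a hypothetical helper; an exported manifest may use different headers.

```python
import csv
import io

# Illustrative manifest content; a real export may use different columns.
EXPORTED_MANIFEST = """name,sample_id,experimental_strategy
reads_1.fastq,,WGS
reads_2.fastq,,WGS
"""


def set_column(manifest_text, column, values_by_name):
    """Set one metadata column for the files listed in values_by_name."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    rows = list(reader)
    for row in rows:
        if row["name"] in values_by_name:
            row[column] = values_by_name[row["name"]]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


updated = set_column(
    EXPORTED_MANIFEST,
    "sample_id",
    {"reads_1.fastq": "S1", "reads_2.fastq": "S2"},
)
print(updated)
```

The edited text can then be saved as a .csv file and applied to the project with Import metadata manifest.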
Seven Bridges Sonar
Seven Bridges Sonar is a novel platform which allows users to rapidly explore and answer questions spanning genomic and phenomic dimensions and learn from data as it accumulates. On Monday, Dec. 19, we are releasing the Advanced Access (AA) version of Sonar. This is a mix of fully-functional software and clickable mocks.
Task Execution: documentation update
Check out our new documentation on advanced methods to optimize analysis cost and execution time. Learn more about the following:
- Correctly set a tool’s CPU and memory requirements as you create or port tools to the Platform.
- Parallelize critical tools to help the Platform select the right instance for the job as you develop workflows.
Used correctly, these techniques help accelerate your workflows and lower their cost.
Use manifest files to set metadata in the Seven Bridges Uploader
Use a manifest file to upload large batches of files along with their metadata in the Seven Bridges Uploader. This is similar to the functionality already present in the Command Line Uploader. Learn more from our documentation.
Filter by and set custom metadata
Custom metadata fields are visible via the visual interface. You can set the values for these fields from the Files tab of your project or from an individual file’s page. Use these custom metadata fields for filtering alongside preset fields. Note that you cannot create new metadata fields on the visual interface. However, you can create custom metadata fields using the API.
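As a minimal sketch of creating a custom field through the API, the snippet below builds (but does not send) a request that sets a new metadata key on a file. The base URL, the `/files/{file_id}/metadata` path, the use of PATCH, and the `X-SBG-Auth-Token` header are assumptions for illustration; the field name `library_batch` and the helper `build_metadata_patch` are hypothetical. See the API documentation for the exact request.

```python
import json
from urllib.request import Request

API_BASE = "https://api.sbgenomics.com/v2"  # assumed base URL


def build_metadata_patch(token, file_id, metadata):
    """Build (without sending) a hypothetical request to update file metadata."""
    return Request(
        f"{API_BASE}/files/{file_id}/metadata",
        data=json.dumps(metadata).encode("utf-8"),
        headers={
            "X-SBG-Auth-Token": token,  # assumed authentication header
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Placeholder identifiers and a made-up custom field.
request = build_metadata_patch(
    "YOUR_TOKEN", "FILE_ID", {"library_batch": "batch-07"}
)
print(request.get_method(), request.full_url)
```

Once a custom field has been set this way, it should appear in the visual interface alongside the preset fields and be available for filtering.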