The prevalence of next-generation sequencing (NGS) data in regulatory submissions has risen sharply over the past few years, prompting a collaboration between the Food and Drug Administration (FDA) and George Washington University to develop the BioCompute Object (BCO) as a standard for communicating NGS data.
The BioCompute Object community aims to reduce the time and effort required to exchange and understand genomic data analyses by standardizing the reporting of key analysis domains, including the data provenance domain, usability domain, execution domain, verification kit, and error domain. By providing specific recommendations on how to describe and document analysis workflows, and by outlining a structure for sharing and communicating descriptions of NGS analyses for review, the BioCompute Object is expected to reduce the time it takes to explain complex analyses during regulatory review. Shorter regulatory reviews, in turn, could allow new drugs to reach the market sooner.
The BioCompute Object will provide seven structured domains that describe complex NGS analysis workflows in sufficient detail. Here are some examples of those domains and the benefits they provide.
- The identification and provenance domain includes name, digital signature, BCO ID, version, authors and publication status. These high-level identifiers will support search capabilities as BioCompute databases are created.
- The usability domain is a free-text description that allows authors to describe the appropriate scientific use of the BCO and to highlight unique aspects of the BioCompute Object implementation.
- The parametric domain explicitly requires workflow developers to identify parameters that can adversely affect the output of the workflow. It gives workflow developers a place to record their experience refining workflow parameters, which can be time-consuming, and to provide specific guidance, and potentially cautions, on setting those parameters.
- The BioCompute Object execution domain includes a structured description of the information required to execute a BioCompute Object. Unique to the BioCompute Object is the validation kit, which provides the inputs, outputs, and settings required to verify that a BioCompute Object is implemented correctly. The validation kit is expected to directly facilitate the reproducibility of NGS analyses.
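To make the domains listed above more concrete, the following Python sketch shows how such an object might be laid out and how a validation kit could be used to check an implementation. It is only an illustration under stated assumptions: the key names (`provenance_domain`, `validation_kit`, and so on), the `toy_workflow` function, and the use of a SHA-256 digest to compare outputs are all hypothetical and are not taken from the BioCompute Object specification.

```python
import hashlib
import json


def digest(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def toy_workflow(inputs, settings):
    """Hypothetical stand-in for an NGS pipeline: list bases seen often enough."""
    read = inputs["read"]
    counts = {base: read.count(base) for base in "ACGT"}
    passing = sorted(b for b, n in counts.items() if n >= settings["min_count"])
    return {"passing_bases": passing}


# Illustrative layout only; these key names are assumptions, not the real schema.
example_bco = {
    "provenance_domain": {
        "name": "Toy base-counting workflow",
        "bco_id": "EXAMPLE_000001",
        "version": "1.0",
        "authors": ["A. Researcher"],
        "publication_status": "draft",
    },
    "usability_domain": "Free-text description of the intended scientific use.",
    "parametric_domain": {
        # Parameters that affect the output, with the developer's guidance.
        "min_count": {"value": 2, "note": "Lower values admit rare bases."},
    },
    "execution_domain": {"entry_point": "toy_workflow"},
    "validation_kit": {
        "inputs": {"read": "AACCGT"},
        "settings": {"min_count": 2},
        # Digest of the output the authors obtained with these inputs and settings.
        "expected_output_digest": digest({"passing_bases": ["A", "C"]}),
    },
}


def verify_with_kit(bco, workflow):
    """Re-run the workflow on the kit's inputs and compare output digests."""
    kit = bco["validation_kit"]
    actual = workflow(kit["inputs"], kit["settings"])
    return digest(actual) == kit["expected_output_digest"]


if __name__ == "__main__":
    print("implementation verified:", verify_with_kit(example_bco, toy_workflow))
```

Comparing digests rather than full outputs keeps the kit small, but it is a design choice of this sketch; a real kit could just as well ship the expected output files themselves.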
The ability to include other commonly used standards is built into the design of the BioCompute Object so that it can evolve with the NGS analysis community. The current iterations of the BioCompute Object are designed to be both human and machine readable, which is expected to support quick adoption with limited technology development. As support for the BioCompute Object grows, the specification allows for including other standards such as Common Workflow Language (CWL), Fast Healthcare Interoperability Resources (FHIR), Research Objects (RO), ontologies, data repositories, and the Semantic Web. This allows the BioCompute Object to incorporate the standardization taking place throughout the NGS community.
More information on the BioCompute Object can be found in the manuscript titled “Enabling precision medicine via standard communication of HTS provenance, analysis, and results” or on the BioCompute website.