Making the Impossible, Possible: Harnessing the Cloud to Uncover Rare Somatic Variations in Cancer Patients

Limited by computational resources, the Khiabanian lab at Rutgers needed a breakthrough.  

The lab was looking at rare changes in blood cells linked to a condition called CHIP (clonal hematopoiesis of indeterminate potential). This condition has been associated with certain types of treatment-induced cancers as well as resistance to cancer treatments. Most people study CHIP by genomic sequencing of whole blood samples. However, the Khiabanian lab believed they might detect these changes in trace blood cells present in tumor tissues. If true, this would allow them to use many existing datasets that were not originally collected with CHIP in mind as untapped resources for their research.  

Their secret weapon was a powerful software pipeline developed in their lab: MERIT and Backtrack. The catch? MERIT showed exciting promise in small tests but proved too computationally demanding for Rutgers’ High-Performance Computing (HPC) resources.

“Even just running 100 samples at a time caused the HPC to choke,” recalls Dr. Vaidhyanathan Mahaganapathy, who carried out this work as a graduate student for his dissertation.

Enter Velsera’s Cancer Genomics Cloud (CGC). Designed and built by Velsera scientists and engineers and funded by the National Cancer Institute, the CGC debuted in 2014 and has since enabled over 8,000 researchers to access high-powered, scalable cloud computation resources and valuable public datasets, including The Cancer Genome Atlas (TCGA), Human Tumor Atlas Network (HTAN), and Childhood Cancer Data Initiative (CCDI). Access to the CGC removed the computational barriers, and the project moved at light speed. With no need to download and handle cumbersome files, the researchers could move straight to analysis. Dr. Mahaganapathy analyzed 16,500 whole exome sequences of case-paired tumor and normal blood samples across 113 genes in just ten days, a process that would have taken substantially longer (weeks, if not months!) on HPC.

How did Velsera help?  

Using the Common Workflow visual editor available on Velsera’s cloud computing platforms, Dr. Mahaganapathy converted the MERIT pipeline into a cloud-based analytical tool. The data he wanted, from The Cancer Genome Atlas (TCGA), was already on the CGC platform, which saved him from transferring and storing large volumes of data. Once everything was in place, the Velsera team made sure Dr. Mahaganapathy could run a fleet of 500 cloud instances in parallel.

In the end, the Khiabanian lab got the results they were looking for: proof positive that CHIP variants can be detected in solid-tissue tumor samples. The groundwork laid by Dr. Mahaganapathy’s research enables future studies to tap into this methodology, potentially changing the way we research and treat CHIP-associated cancers.

Reflecting on this groundbreaking progress, Dr. Mahaganapathy stated, “Without the CGC, this would have been impossible.” He also credited the unwavering support from the Velsera team during this endeavor. “I cannot explain the level of support I would get [from the Velsera staff]. They got me through the last moments of my PhD.” 

Today, thanks to this collaboration, the Khiabanian lab is performing additional analyses on CHIP-related mutations that were identified in tumor tissue for the first time, and the MERIT pipeline is available to all cancer researchers on the CGC.

The combination of this lab’s novel workflow and our powerful cloud computing environment has unlocked the potential hidden in hundreds of existing datasets. Using these tools, researchers can work to understand what the “indeterminate potential” of these variants might mean for the progression, treatment, and prognosis of different cancers. That could lead to better targeting of existing treatments, or even the development of new ones.

Learn more about Vaidhy’s story from his webinar: “A computational framework for detecting low abundance clonal hematopoiesis in large-scale tumor sequence datasets.”