Virginia Tech computer engineers are developing open-source software in a multi-institution National Cancer Institute (NCI) effort to use grid-enabled intelligent computing to change the very nature of cancer research.
NCI and its more than 50 cancer centers are developing what NCI calls “the World Wide Web of cancer research,” a network that lets scientists and physicians integrate different forms of data in their own laboratories and tap data from other researchers around the world. The $14.9 million effort is building the cancer Biomedical Informatics Grid (caBIG™) to provide an open-source, open-access network where researchers can share tools, standards, data, applications, and technologies.
The initiative is the largest bioinformatics engineering effort to date. It is being developed in response to the explosion of information and knowledge about cancers and their treatment. Researchers’ ability to harness that information for research advances has been limited by incompatible forms of data, a lack of technology standards, and the sheer volume of information to be analyzed.
The NCI caBIG™ website states, “This means that researchers often operate in a near-vacuum, or a silo approach, without the benefit of outside information. They cannot easily share data or tools, or benefit from the innovative technologies developed by others.”
The pilot caBIG™ development is divided into six workspaces, including clinical trial management systems, tissue banks and pathology tools, integrative cancer research, vocabularies and common data elements, and architecture. Researchers from ECE’s Computational Bioinformatics and BioImaging Laboratory (CBIL) have teamed with the Georgetown University Medical Center on a one-year grant of about $800,000 to develop software for the integrative cancer research workspace. Virginia Tech’s portion of the grant is about $229,960.
The Georgetown/Tech team is adapting its Visual and Statistical Data Analyzer (VISDA) software to the caBIG™ architecture so cancer researchers can analyze both their own molecular expression data and “grid” data, said Yue (Joseph) Wang, the Virginia Tech project lead and CBIL director. “Our VISDA software helps researchers discover phenotypic and gene patterns from molecular profiling datasets, including microarray gene expression data, proteomics data, and clinical data,” he said. VISDA is a C++/Java tool for cluster modeling, discovery, and visualization.
Nearly 30 universities, along with national laboratories, government organizations, and industrial partners, are involved in the caBIG™ effort. Virginia Tech and MIT are the only university participants that do not have medical schools, Wang said.