Cancer is notoriously different for every patient. Many men who get prostate cancer die of other causes before the cancer grows harmful. Other prostate cancers are aggressive, grow quickly, and spread to other parts of the body. Ovarian cancer, although noted for its lethal character, also has variable degrees of aggressiveness. If doctors could know the aggressiveness of each patient’s cancer, treatment could be tailored for better results.
A team of ECE researchers led by Yue (Joseph) Wang has helped identify biomarkers that can differentiate between aggressive and slow-growing prostate cancers and between different levels of aggressiveness in ovarian cancers.
Wang’s group at ECE’s Computational Bioinformatics and Bio-imaging Laboratory (CBIL) applies electrical and computer engineering methods, along with computational and modeling tools, to solve biomedical problems in cancer and diabetes and to gain a deeper understanding of the structure and function of the genome. Wang collaborates with medical researchers at Georgetown, Johns Hopkins, and Wake Forest universities as well as with Children’s National Hospital.
CBIL is based at the Advanced Research Institute (ARI) in Arlington, Va., and its faculty and students conduct about $1.2 million in research each year.
Biomarkers for aggressive cancers
The biomarkers for aggressive prostate and ovarian cancers are based on different copy numbers of a segment of DNA. Humans should normally have two copies of each strand of DNA, Wang explains, one from each parent. “This is a nice, built-in redundancy that engineers can appreciate. It helps keep our system stable,” he says. “If one copy gets deleted for some reason, you still have a copy to sustain normal function.”
Researchers continue to delve deeper into the human genome and recently discovered that DNA segments can be partially or completely deleted (or, in contrast, amplified) and that many differences can exist between populations and individuals. When these changes are inherited, they are called copy number variations (CNVs); when they arise through local mutations after birth, they are called copy number alterations (CNAs).
[Figure: Two potential patterns of metastatic cancer spread. The team reports in a Spring 2009 Nature Medicine article that their results show most, if not all, metastatic prostate cancers have clonal origins.]
Copy number changes can cause disease. Wang gives the example of a critical cancer suppressor gene in the human body: “If that gene or part of that gene has been deleted, the person’s risk of cancer increases. If one copy is deleted, the person may be OK, but if both get deleted, the person will almost definitely get cancer.”
Cancer genes called oncogenes also exhibit copy number changes. “Once a person has cancer, the cancer evolves and the oncogene gets amplified. A gene may have three, four, or five copies,” he says. “We have seen either some deletions or amplification in different DNA segments.”
The segments contain sequences of nucleotide base pairs, whose measured hybridization intensities can be treated as signals by electrical and computer engineers. Wang’s team applies signal processing, pattern recognition, and computational modeling to analyze the sequences in the cancer genes to extract the gene copy number differences. “What we do is to identify where those changes occur and how often,” he says.
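To make the idea of treating hybridization intensities as signals concrete, here is a minimal sketch of one generic way copy number changes can be located. It is not the CBIL team's actual pipeline. The function names, the probe intensities, the three-probe median filter, and the ±0.3 log2 thresholds are all assumptions chosen for illustration.

```python
import math
from statistics import median

def log2_ratios(tumor, reference):
    """Per-probe log2(tumor / reference) intensity ratio; 0 means no change."""
    return [math.log2(t / r) for t, r in zip(tumor, reference)]

def median_smooth(values, window=3):
    """Sliding-median filter that suppresses single-probe noise (odd window)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def call_segments(ratios, loss=-0.3, gain=0.3):
    """Group consecutive probes with the same call into (start, end, call) runs."""
    calls = ["loss" if r <= loss else "gain" if r >= gain else "neutral"
             for r in ratios]
    segments, start = [], 0
    for i in range(1, len(calls) + 1):
        # Close the current run at the end of the data or when the call changes.
        if i == len(calls) or calls[i] != calls[start]:
            segments.append((start, i - 1, calls[start]))
            start = i
    return segments

# Made-up intensities for 15 probes along one chromosome (illustration only).
reference = [100] * 15
tumor = [100, 105, 98, 50, 52, 48, 51, 100, 97, 210, 200, 190, 205, 102, 99]

segments = call_segments(median_smooth(log2_ratios(tumor, reference)))
for start, end, call in segments:
    print(f"probes {start}-{end}: {call}")
```

In this toy profile, a run of probes at roughly half the reference intensity is reported as a loss and a run at roughly double as a gain, which is the "where and how often" question in miniature. Real analyses use far more probes and more careful statistical segmentation.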
Wang is collaborating with three groups on this problem. Working with Dr. Steve Bova at Johns Hopkins, the team is investigating patterns of copy number changes among metastatic prostate cancers across subjects. “Our research shows that the copy number changes are structural markers,” he says. The results of this research will be published in an upcoming issue of Nature Medicine.
Wang is also working with Dr. Ie-Ming Shih at Johns Hopkins on ovarian cancer. The team assembled a group of patients who have either highly aggressive or less aggressive ovarian cancers. “Our analysis clearly revealed that the copy number changes or the mathematically defined genomic instability index clearly can differentiate between aggressive and less aggressive forms of ovarian cancer,” he says. The research will be published in an upcoming issue of Cancer Research.
Wang is involved in a third study of copy number analyses with a group at Wake Forest University led by Dr. Jianfeng Xu. Instead of studying metastasis, they are exploring whether they can identify the genetic origin of prostate cancer, that is, the consensus regions of change in relation to either oncogenes or cancer suppressor genes.
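The article does not say how the genomic instability index is defined, so the following is only a stand-in: a simple, commonly used proxy that scores a tumor by the fraction of measured probes whose copy number departs from normal. The function name, the 0.3 threshold, and both sample profiles are hypothetical.

```python
def fraction_altered(log2_ratios, threshold=0.3):
    """Share of probes whose |log2 ratio| meets the threshold (0.0 to 1.0).

    An illustrative proxy only; not the index used in the study.
    """
    if not log2_ratios:
        raise ValueError("need at least one probe")
    altered = sum(1 for r in log2_ratios if abs(r) >= threshold)
    return altered / len(log2_ratios)

# Made-up log2 copy number ratios for two tumors (illustration only).
quiet_tumor = [0.0, 0.1, -0.05, 0.02, -0.9, 0.04, 0.0, 0.08, -0.1, 0.03]
unstable_tumor = [1.1, 0.9, -1.0, 0.02, -0.9, 0.8, 0.0, 1.2, -1.1, 0.03]

print("quiet tumor:   ", fraction_altered(quiet_tumor))
print("unstable tumor:", fraction_altered(unstable_tumor))
```

A single number like this lets tumors be ranked and compared across patients, which is what makes an index useful as a candidate marker of aggressiveness.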
Next generation genome
Wang’s work with copy number changes stems from a broader effort called the next generation genome. “When the mapping of the human genome was completed, we thought that the basic building blocks were known and we just needed to figure out how they worked,” Wang says. With microarray technologies, researchers began studying downstream effects, such as gene expression, proteomics, metabolism, and changes in the cell.
The molecules that make up DNA are called nucleotides, represented by the letters A, C, G, and T. When a single nucleotide at a given position differs between two or more genomes, the difference is called a single nucleotide polymorphism, or SNP (pronounced snip).
“We hoped we could use reverse engineering and the vast new measurement capability to understand the initiation and progression of disease. It turns out it may not be so simple,” he says. “In the past eight years, biomedical researchers have made many discoveries, but unfortunately, all the biomarkers from molecular analysis can only explain less than 15 percent of common diseases.”
Now, many researchers are going back to the genome again, studying sequencing. “We’ve come full circle,” Wang says. Next-generation sequencing is moving in multiple directions. One is copy number variation. Another is the investigation of differences at a single nucleotide base pair of DNA. When a single nucleotide, an A, C, G, or T in the genome, differs between members of a population, it is called a single nucleotide polymorphism, or SNP (pronounced snip).
Genetic information as signals
Wang’s team has several projects using bioinformatics to analyze SNPs. With Wake Forest researchers, they are working to understand the influence of SNPs on cardiovascular disease (with a team led by Dr. David Herrington) and on diabetes in populations with a high incidence of the disease, such as African Americans and Hispanics (with a team led by Donald Bowden). They are also working with Eric Hoffman at Children’s National Hospital in Washington, D.C., to study diabetes in young children. “Because of the morbidity of metabolic syndrome, diabetes is increasingly important. We want to understand it and find ways to prevent and treat it,” Wang says.
The difficulty with SNP research is its complexity, Wang says, noting that there are an estimated 3 billion nucleotides in the human genome. “When we first worked analyzing genes with microarrays, we dealt with about 30,000 genes and had to find the 50 genes responsible for the issue being studied,” he says. Gene chip technology has changed dramatically in the past five years, to the point where a SNP microarray can process 1 million SNPs, “and we have to find 10-15 causal SNPs. It’s extremely difficult if not impossible.”
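A little arithmetic shows why the search is so hard. The sketch below assumes each of the 1 million SNPs is tested separately at the conventional 0.05 significance level and then applies a Bonferroni correction, a standard and conservative adjustment. It is used here only for illustration, not as a description of the team's statistics, and the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Tests expected to look significant by chance alone if no SNP matters."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the overall error rate near alpha."""
    return alpha / n_tests

n_snps = 1_000_000  # SNPs on one microarray, per the article

print(f"{expected_false_positives(n_snps):,.0f} SNPs expected to pass p < 0.05 by chance")
print(f"Bonferroni per-SNP threshold: {bonferroni_threshold(n_snps):.0e}")
```

With tens of thousands of chance hits expected at the usual cutoff, the 10 to 15 causal SNPs are easily buried, and the much stricter corrected threshold demands very large studies to detect them.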
The combined effects of SNPs add to the complexity of the problem. “When we find two, or three, or five SNPs working together, we can identify their nonlinear, higher order interactions within relevant biological pathways. We are still at the beginning of understanding the form and order of the complex gene-gene or gene-environment interactions,” Wang says.
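A toy example can show why interacting SNPs are hard to spot one at a time. In the sketch below, a made-up condition appears only when a person carries a variant at exactly one of two SNPs. The data, the variable names, and this exclusive-or rule are all hypothetical and deliberately extreme.

```python
from collections import defaultdict

def risk_by(groups, outcomes):
    """Fraction of affected individuals within each distinct group key."""
    totals = defaultdict(lambda: [0, 0])  # key -> [affected count, group size]
    for key, outcome in zip(groups, outcomes):
        totals[key][0] += outcome
        totals[key][1] += 1
    return {key: affected / n for key, (affected, n) in totals.items()}

# 100 simulated people; 1 = carries the variant, 0 = does not.
snp_a = [0, 0, 1, 1] * 25
snp_b = [0, 1, 0, 1] * 25
# Hypothetical rule: affected only when exactly one variant is present.
affected = [a ^ b for a, b in zip(snp_a, snp_b)]

single_snp_risk = risk_by(snp_a, affected)
pair_risk = risk_by(list(zip(snp_a, snp_b)), affected)

print("risk by SNP A alone:", single_snp_risk)
print("risk by SNP pair:   ", pair_risk)
```

Looked at alone, each SNP shows the same risk in carriers and non-carriers, yet the pair predicts the outcome perfectly. Checking every pair among 1 million SNPs means roughly 500 billion combinations, before three-way or five-way interactions are even considered.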
A SNP sequence, or signal, is intrinsically digital, according to Wang. “They are not continuous. This makes it a good, firm playground for us electrical and computer engineers,” he says. Sometimes a single SNP, or even a subset of SNPs, cannot explain the whole issue, so researchers look at how the larger sequence has been altered. “We need to look at how and which part of the genomic waveform or sequence they have changed,” he says.
Wang’s dream is personalized medicine, in which doctors can precisely determine how an individual patient’s cancer or other disease will behave, then target a precise treatment plan based on expected outcomes. He is working to help achieve that dream, but also enjoys the fun of solving a mystery. “I have so much curiosity about the biological system,” he says.
Decoding the genome’s software
That curiosity is leading him into a third area of research called epigenetics, which he describes as the software of biology. “The genome, the DNA, the downstream effects on gene expressions, proteins, and metabolic processes, those are all the hardware,” he says. “We know little about the software that works on the combined genetic and environmental base to determine various complex phenotypes.”
Wang and his wife have sons who are identical twins. “They have identical DNA and come from the same cell, but they are not the same person. This is an indication that the hardware of the genome in the cell nuclei does not determine everything,” he says. He is interested in helping to figure out the mechanisms that guide the human genome to develop a complete organism.
“This is going to be challenging,” he says. He describes two mechanisms that researchers currently suspect might explain how the software works with the hardware to determine which characteristics are expressed by the genome in a particular cell, organism, or individual. One mechanism, called methylation, could control whether a gene is turned on or off. The other mechanism, called histone modification, can affect a gene’s activity level. “This is the fine-tuning mechanism,” he says.
Wang is developing a project with Leena Clarke of Georgetown University Hospital to investigate the effect of epigenetics on breast cancer. “We want to figure out how the software works so that a prevention strategy can be developed,” he says. The epigenetic effect would occur during the developmental phases. “Epigenetics are more active during youth when the human genome is trying to develop a fully mature human being.” During puberty, epigenetic mechanisms are active in transforming stem cells into mature mammary glands.
“We believe that if the development during puberty is complete, which is normal development, there is much less risk of breast cancer later,” he explains. “Some girls have denser breasts, indicating that development wasn’t as complete. Why is that a higher risk? Because that particular person has a lot of stem cells left over.” Stem cells can easily become any kind of cell, including cancer cells.
The transformation from stem cell into mammary gland cell would be heavily affected by epigenetics. “If the regulators (DNA methylation or histone modification) malfunction, we end up with enough leftover stem cells to cause problems later,” he says. “We are trying to investigate which markers, which segment of DNA, is affected by methylation or modifications.”
Wang describes how many areas of investigation have opened up for electrical and computer engineers in biomedical research. “We face an unprecedented situation. Our ability to acquire information about the biological players is growing every day, but we don’t yet have the computational intelligence or the engineering approach to handle those data,” he says.
His team uses every tool available, from high-performance computing to system identification and signal processing. Recently, they also began exploring whether game theory might help answer some questions. “Virginia Tech ECEs are pioneers in applying game theory to cognitive radio and wireless networks. Perhaps we can also apply these concepts to genomic issues,” he explains.
“There is a considerable gap between the ever-growing information and our scientific ability to analyze those data,” he says. He enjoys the challenge every day. “I like to trace back to the root of the mystery.”