Unlocking the code
By: Dr. Guillaume Paré
Medical biochemist, Hamilton Health Sciences
Senior Scientist, Population Health Research Institute
Associate Professor, Department of Pathology and Molecular Medicine, McMaster University
April marks the 15th anniversary of the completion of the Human Genome Project: a thirteen-year journey led by scientists from around the globe to map all of the genes of the human species, and one of the biggest undertakings in scientific history.
When the project began in 1990, the goal was not only to understand genetic factors in human disease, but also to set the course for new treatments and prevention methods for ailments like cancer and heart disease. It was a costly project, with a price tag of about 3 billion dollars, but the investment was worthwhile: it ushered in a whole new era of discovery in the pursuit of global health.
The last decade and a half has also seen an acceleration in technological advancement. Our genetics lab at HHS is now fully automated – extracting, mapping, sequencing and copying DNA – allowing our team to quickly generate substantial amounts of data to support many of the large-scale, global research studies that we lead from right here in Hamilton. These advancements have happened so quickly that, today, our lab could do the same work as the Human Genome Project in less than one week for only a few thousand dollars.
The results of these extraordinary advances are clear. We’re now able to classify more than 1,800 genetic diseases, including genetic forms of common diseases such as breast cancer, Alzheimer’s, diabetes and heart disease. Furthermore, scientists have been able to develop tests that better predict a family member’s genetic risk, allowing healthcare providers to help prevent the development of potentially devastating diseases.
But we’re also left with a new set of challenges. Although we’ve unlocked the mystery of our genetic make-up, we’re now dealing with the surge of genetic data that modern technology allows us to generate. Indeed, the bottleneck in research projects is often data analysis, meaning that the role of “scientist” is evolving into that of “data scientist”.
In recent years, we’ve learned that it’s not always a single gene that is responsible for disease; different variations in our genetic makeup can also add up to risk. So, what does that mean in terms of data? To start, every person has around 20,500 genes, but within each gene are subtle differences called genetic variants. We’ll record anywhere from 500,000 to 1,000,000 genetic variants per person. Then, multiply that by the thousands of people in a research study. Fifteen years ago, this would have been simply impossible: not only would it have taken years to complete by hand, but we also didn’t have the computing power to store the data. Currently, we have over 200 terabytes of genetic data stored at HHS. That’s the equivalent of 40 million minutes of high-quality YouTube videos!
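For readers who like to see the arithmetic, here is a minimal back-of-the-envelope sketch in Python. The study size, per-variant storage cost, and video bitrate in it are illustrative assumptions, not figures from our lab; only the 500,000–1,000,000 variant range and the 200-terabyte archive come from the article itself.

    # Back-of-the-envelope arithmetic for the data volumes described above.
    # The study size, per-variant storage cost, and video bitrate are
    # illustrative assumptions, not figures from the lab.

    VARIANTS_PER_PERSON = 1_000_000   # upper end of the 500,000-1,000,000 range
    PARTICIPANTS = 50_000             # hypothetical large-scale study
    BYTES_PER_VARIANT = 4             # assumed compact genotype encoding

    study_bytes = VARIANTS_PER_PERSON * PARTICIPANTS * BYTES_PER_VARIANT
    print(f"One hypothetical study: {study_bytes / 1e12:.1f} TB of genotype calls")

    # Sanity-check the YouTube comparison: 200 TB spread over 40 million minutes
    ARCHIVE_TB = 200
    VIDEO_MINUTES = 40_000_000
    mb_per_minute = ARCHIVE_TB * 1e6 / VIDEO_MINUTES  # 1 TB = 1e6 MB
    print(f"Implied video quality: about {mb_per_minute:.0f} MB per minute")

Under these assumptions a single study already runs to a fraction of a terabyte of raw genotype calls, and the YouTube comparison works out to roughly 5 MB per minute of video.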
Building on these data and sophisticated analysis techniques, my team at HHS’ Population Health Research Institute introduced a test earlier this year that can predict an individual’s genetic risk of early heart disease with high accuracy by combining information from a large number of variants. This discovery stands on the shoulders of the Human Genome Project and of large-scale international collaborations aiming to decipher the genetic underpinnings of heart disease.
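For the curious, the sketch below illustrates the general idea behind combining information from many variants: a polygenic risk score, which is a weighted sum of an individual’s risk-allele counts. The variant IDs, weights, and genotypes are invented for illustration; this is not the actual test, nor its variants or weights.

    # Generic polygenic-risk-score sketch: a weighted sum of risk-allele counts.
    # All variant IDs, weights, and genotypes here are invented for illustration;
    # they are not the variants or weights used in the actual test.

    # Per-variant effect sizes (e.g. log-odds per risk allele) from a
    # hypothetical genome-wide study
    weights = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.08}

    # One person's genotypes: risk-allele count (0, 1, or 2) at each variant
    genotypes = {"rs0001": 2, "rs0002": 1, "rs0003": 0}

    # The score is the dot product of genotypes and weights; in practice the
    # sum runs over many thousands of variants, not three.
    score = sum(weights[v] * genotypes[v] for v in weights)
    print(f"Polygenic risk score: {score:.2f}")

In this style of analysis, individuals whose scores fall in the upper tail of the population distribution would be flagged as carrying high genetic risk.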
As is the essence of research, the original question that the Human Genome Project sought to answer opened the floodgates to countless new ones. The endeavour shed new light on the most precious part of humanity, but we still haven’t fully cracked our genetic code. We’ve made immense progress in the last 15 years, but there’s much more we need to learn.
This is what drives us as researchers and clinicians: to always find new and better ways to prevent disease in people around the world. It’s an ambition we all share; it’s in our DNA.