Research in the CGL

Collins Genomics Lab

The ultimate goal of the CGL is to blunt the public health burden of cancer by translating the knowledge gained from genomics research into better prevention and screening strategies for intercepting cancers while they are still at an early stage and before they advance to become aggressive, life-threatening malignancies.

Our research scrutinizes the molecular and cellular blueprints encoded within the human genome to unlock fundamental knowledge about how, why, when, and where cancer occurs over the human lifespan.

We employ a variety of techniques to achieve this goal. The primary tools in our toolkit include genome informatics, statistics / machine learning, epigenomics, and multimodal data fusion / integration. We often partner with clinicians to study specific patient populations and with molecular biologists to mechanistically validate our findings.

You can review our full list of published research, also available on Google Scholar or PubMed.

Otherwise, read on below to learn more about our current research topics.

WGS discovery

Unraveling cancer predisposition with population-scale genomics

Which inherited genetic factors lead to cancer predisposition?

In the words of former NIH Director Francis Collins, cancer is "a disease of the genome," not only because tumor cells are driven by acquired somatic mutations, but also because the genetic variation we inherit at birth accounts for roughly 15–30% of our lifetime risk of most cancers. Yet despite decades of cutting-edge research in cancer genetics, most sources of this inherited cancer risk remain opaque, even in families with strong histories of disease.

What genetic risk factors for cancer could we be missing? In the CGL, we believe they lie in forms of variation that have been largely inaccessible to prior approaches: structural, repetitive, and rare noncoding variants. To unearth these hidden risk factors, we interrogate vast, petabyte-scale genomic datasets from tens of thousands of individuals with and without cancer using specialized algorithms designed to capture these overlooked forms of genetic variation. We then apply rigorous statistical modeling to these data to build a more complete portrait of the genetic architecture of cancer, one that integrates all sources of genetic risk across mutational frequencies (common vs. rare), sizes (short vs. structural), and contexts (coding vs. noncoding).

We are applying these population-scale genomic strategies widely across cancer types, but have a focused interest in cancers with especially early onset and/or poor patient outcomes, like ovarian, pancreatic, esophageal, colorectal, lung, and brain cancer. By illuminating the full spectrum of inherited genetic risk, we aim to transform how cancer is predicted, prevented, and treated by linking fundamental discoveries in genome biology to tangible advances in clinical oncology.

Tumorigenesis schematic

Decoding early tumorigenesis through the lens of germline genetics

How do tumors form?

Every cancer cell—no matter how aberrant or mutated—was once a normal human cell carrying a patient's inherited genome. The entire evolutionary history of each tumor unfolds against this inherited genetic backdrop. In the CGL, we are investigating tumor initiation through the lens of germline genetics, asking how inherited variation shapes the emergence of precancerous lesions and their progression to malignant disease. 

Our primary aim within this topic is to disentangle the complex interplay between each patient's germline genetics and the somatic alterations their cells acquire during tumorigenesis. We are focusing on gastrointestinal cancers as an ideal model for these questions, as precancerous GI lesions are common in the population and can be sampled during routine screening procedures such as colonoscopy. We are also extending these approaches to other contexts, including fusion-driven cancers and highly rearranged pediatric solid tumors.

By revealing how inherited and acquired genetics conspire to initiate tumors, we aim to identify new opportunities for cancer prevention, not only through improved screening and risk stratification, but also by uncovering protective factors that could be harnessed to block progression from precancer to invasive disease. Such factors could open the door to pharmaceutical prevention strategies as transformative as statins have been for heart disease.

Early detection schematic

Genome-informed strategies for intercepting early-onset cancers

What is the role of genomics in clinical cancer interception?

Each year, nearly 60,000 Americans under age 50 are diagnosed with cancer. These early-onset cases are more often linked to inherited predisposition syndromes and a family history of cancer than later-onset cases are. Yet not everyone who has a family history of cancer or carries a cancer predisposition variant will develop cancer, and many high-risk individuals remain unidentified until their cancers are diagnosed at an advanced stage.

In the CGL, we envision a future where genome-based cancer risk prediction is performed universally at birth and personalized screening schedules are seamlessly integrated into routine primary care.

To move toward this vision, we focus on familial cancers and hereditary syndromes such as Lynch, Li-Fraumeni, and Von Hippel–Lindau. We are particularly interested in cryptic, "secondary" genetic factors (such as structural variants, noncoding regulatory variants, or polygenic risk) that might modify penetrance or clinical presentation within these families and explain why some high-risk individuals develop aggressive cancers early while others remain disease-free.

Ultimately, we aim to unite the genomic insights gained from our research with advances in liquid biopsies, medical imaging, and electronic health records to develop predictive models that can not only identify who will develop cancer but also anticipate when and where it will arise, enabling personalized screening to catch tumors as early as possible.