Genomics is unlocking how DNA and viruses come together to shape chronic disease

Written by:

Slavé Petrovski

Vice President, Centre for Genomics Research (CGR), AstraZeneca

Ryan Dhindsa

Assistant Professor, Pathology & Immunology, Baylor College of Medicine; Principal Investigator, Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital

Caleb Lareau

Assistant Member, Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center; Assistant Professor, Computational Biology and Medicine, Weill Cornell Medicine

A virus you most likely carry could hold clues to your future health. Epstein–Barr virus (EBV) infects nearly everyone, and for most people it hides quietly in the body for life. But for some, this common virus does far more than linger: it is linked to an increased risk of chronic diseases such as lupus, COPD, heart disease, and certain cancers.

In our recent Nature paper, we describe a novel method used to analyse genome sequence and health data from 750,000 people within UK Biobank and the All of Us Research Program, making it the largest EBV study ever conducted. In collaboration with Memorial Sloan Kettering Cancer Center and Baylor College of Medicine, we uncovered 22 specific genetic variants that are linked to both higher active levels of EBV and chronic disease risk, revealing how our unique DNA can influence EBV activity and shape our future risk of chronic disease. These insights could allow us to use EBV and other viruses as potential biomarkers for chronic disease and identify who is most vulnerable, which in the future may help inform how we better predict, prevent, and ultimately treat chronic diseases.




Understanding Epstein–Barr virus (EBV) and chronic disease

EBV lives inside more than 90% of adults for life, but its impact is far from uniform. For many, the virus lies dormant and never causes negative health effects, but for others, EBV can reactivate and raise the risk of chronic disease.

Although the scientific and medical communities have long recognised EBV’s reach, they have lacked the scale and technology to understand why only some individuals go on to develop chronic conditions after an EBV infection. By combining population genomics, virology, and chronic disease research, we revealed a connection between an individual’s DNA, active EBV, and long-term health outcomes.

Unlocking EBV insights through population genomics

We developed a novel computational approach that reimagines how existing libraries of human genomic and health data can be used. Modern DNA sequencing, the process of reading the billions of letters that make up our genetic code, often picks up traces of viral DNA that are usually dismissed as “noise” and discarded in standard analyses. We saw untapped potential instead.

By developing computational models capable of processing millions of gigabytes of sequencing data, we created a new metric called EBV DNAemia. This measure reflects high levels of EBV genetic material circulating in the blood, found in about 10% of people previously infected, and allows us to uncover links between viral burden, genetic differences, and specific health outcomes.
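To make the idea concrete, here is a minimal Python sketch of how viral reads found in human sequencing data might be turned into a per-person flag. It is an illustration only, not the method from the paper: the `Sample` class, the reads-per-million normalisation, the `threshold_rpm` cut-off, and the example counts are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical per-person summary of one sequencing run."""
    sample_id: str
    ebv_reads: int    # reads that align to an EBV reference genome
    total_reads: int  # all reads sequenced for this person


def ebv_reads_per_million(sample: Sample) -> float:
    """Normalise the EBV read count by sequencing depth."""
    if sample.total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return sample.ebv_reads * 1_000_000 / sample.total_reads


def has_dnaemia(sample: Sample, threshold_rpm: float = 1.0) -> bool:
    """Flag a sample whose normalised EBV signal meets an assumed cut-off."""
    return ebv_reads_per_million(sample) >= threshold_rpm


# Made-up counts, used only to show the calculation.
samples = [
    Sample("A", ebv_reads=0, total_reads=800_000_000),
    Sample("B", ebv_reads=2_400, total_reads=800_000_000),
]
flags = {s.sample_id: has_dnaemia(s) for s in samples}
```

Normalising by total reads keeps deeply sequenced samples from looking artificially "high"; the real analysis may define and calibrate the metric differently.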

For example, people with higher EBV levels have about a 50% higher risk of rheumatoid arthritis and nearly twice the risk of COPD compared with those with lower EBV levels. Insight into the genetic drivers behind this could help us understand who is at greater risk of longer-term disease burden, while also informing the next wave of research into therapeutic and potentially early intervention strategies.
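As a worked illustration of what "about 50% higher risk" means, the sketch below computes a simple risk ratio from group counts. The counts are invented for the example, and the `risk_ratio` helper is hypothetical; the published estimates may be based on a different statistic (for instance an odds ratio or hazard ratio adjusted for other factors).

```python
def risk_ratio(cases_high: int, total_high: int,
               cases_low: int, total_low: int) -> float:
    """Risk in the higher-EBV group divided by risk in the lower-EBV group."""
    if min(total_high, total_low, cases_low) <= 0:
        raise ValueError("totals and cases_low must be positive")
    # Cross-multiplied form of (cases_high / total_high) / (cases_low / total_low).
    return (cases_high * total_low) / (cases_low * total_high)


# Invented counts: 30 of 1,000 people with higher EBV levels develop the
# disease, versus 20 of 1,000 people with lower EBV levels.
rr = risk_ratio(cases_high=30, total_high=1_000, cases_low=20, total_low=1_000)
percent_higher = (rr - 1) * 100
```

With these made-up numbers the ratio is 1.5, which corresponds to a 50% higher risk; a ratio close to 2 would correspond to "nearly twice the risk".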

Uncovering the 22 genetic variants that shape EBV control

Our analysis uncovered 22 genetic variants that are linked to high levels of EBV DNA as well as higher disease rates. Many of the genes we identified sit in the immune system’s command centre, the Human Leukocyte Antigen (HLA) region of the genome. This region is a set of genes that controls how viral fragments are presented to immune cells to elicit an immune response. Variants in genes like ERAP1 and ERAP2 can influence how well viral fragments are presented: strong presentation helps contain EBV, while weak presentation lets it linger and could raise future health risks.

We also found links to genes outside the HLA region that are involved in T-cell activation and interferon pathways, such as PTPN22 and SH2B3, both of which are known drivers of autoimmune diseases. These findings reveal overlapping genetic pathways that govern both EBV control and autoimmune risk, highlighting immune-regulating genes as potential biomarkers and therapeutic targets.

From risk detection to prevention: the future of genomics and health

Mapping viral levels and understanding their genetic drivers could be a critical step toward the future of chronic disease care. And importantly, our innovative computational method allows us to extend beyond EBV to study the long-term effects of similar viruses within existing genomic sequencing data.

Today, we treat diseases after they appear. Tomorrow, insights like these could help shift us from traditionally reactive ‘sick care’ systems to true ‘health care’ systems – systems where we can detect disease and even predict risk years in advance and intervene before illness takes hold.

Crucially, this progress reflects the power of industry–academia collaboration. By uniting deep scientific expertise with large-scale data and cutting-edge technology, we are helping accelerate discoveries that aren’t possible in isolation. The answers to questions that once seemed out of reach are now within sight—bringing us closer to a future where genomics guides not just treatment, but lifelong health management.

