Epstein-Barr virus (EBV), first discovered in Burkitt's lymphoma (BL), is a class 1 carcinogen that is now associated with a wide range of hematologic and epithelial cancers, including nasopharyngeal carcinoma (NPC), gastric cancer, Hodgkin lymphoma and some AIDS-associated B cell lymphomas. Although almost all BL cases in Africa and NPC cases in China are EBV-positive, consistent with a direct role for EBV in tumor causation, the precise mechanisms of causation remain to be elucidated.

Of interest, EBV is ubiquitous and causes asymptomatic lifelong infection. Up to 95% of the population in the developing world is infected at an early age. In contrast, the geographical patterns of EBV-associated cancers and their peak ages of incidence vary. For example, BL incidence is highest in equatorial Africa, where peak risk occurs in children aged 5-9 years. By contrast, NPC incidence is highest in Southern China and in parts of northern Africa, with peak risk in the elderly. These variations have led to speculation about the presence of EBV variants with different penetration and expression. Previous studies examining this question have focused on genetic variation in one or only a few EBV genes at a time, precluding firm conclusions about genome-wide variation.

Whole EBV genome analysis in tumor and non-diseased tissue from the same individual, as well as from healthy individuals in at-risk populations, may facilitate discovery of sequence heterogeneity that might be associated with cancer risk. Since the first EBV genome was published, 23 whole EBV genomes have been sequenced, including those from 3 BL cell lines, 5 immortalized B cell lines from normal blood donors (B95-8 plus WT-EBV), and 13 NPC biopsies. To date, however, no EBV genomes have been sequenced directly from BL biopsies or from healthy individuals from the same region.

The goal of this study is to obtain a comprehensive assessment of the EBV genome in BL tissue, and to determine how these sequences differ from EBV genomes in the matched non-tumor reservoir of the same individuals and from EBV genomes in healthy individuals from the same regions.

We have available 50 BL biopsies, 37 representing endemic BL from Africa and 13 from South America, together with normal tissue from healthy individuals from the same regions. Here we report preliminary data from whole-genome sequences of EBV from six BL biopsies from West Africa and South America, obtained by high-throughput sequencing (HTS) on the Illumina MiSeq platform. With WT-EBV as the reference, EBV genomes in the BL biopsies showed considerable variation, ranging from 550 to 1200 variations per genome (Fig 1). Most were single nucleotide variations. Insertions and deletions ranged between 15 and 51 per genome. As many as one third of the variations resulted in amino acid changes. Surprisingly, some of the BL biopsies contained EBV genomes with heterozygous reads, suggesting that mutations continued to arise in the EBV genome after clonal expansion of BL cells. Novel variations were observed in BZLF1, suggesting a possible influence of variation on regulation of the EBV lytic cycle. Using an in-house EBV genome database prepared for comparative analysis, which contained the genomic DNA sequences of the 23 published EBV genomes, sequence comparison and phylogenetic analysis showed a much greater sequence diversity among EBV sequences from BL biopsies at the whole-genome level. Based on these results, we propose to extend EBV genome-wide sequencing to the remaining BL biopsies and to paired normal subjects, in order to determine which variations are commonly associated with BL and to understand how these EBV genomic variations contribute to BL pathogenesis in different geographic areas.

Fig 1

Distribution of variations across the EBV genome in select BL biopsies

Disclosures

No relevant conflicts of interest to declare.

