Background

Multiple myeloma is a malignancy of antibody-secreting, terminally differentiated B cells with a predilection for the bone marrow. Considering that microbial infections are common triggers for antibody production, we hypothesize that known and/or previously undiscovered viruses may be present in myeloma tissues and may influence myeloma prognosis, even if the viruses themselves do not cause myeloma.

Virus-harboring tissues should presumably contain both human and pathogen-derived DNA/RNA, but viruses may go undetected if they are present at frequencies too low for standard assays. High-throughput sequencing, followed by computational subtraction of human sequence reads, should therefore identify non-human (and candidate pathogen) sequences. The Multiple Myeloma Research Consortium has generated such a dataset, thus providing a unique opportunity to address this question.

Methods

We have developed a computational subtraction pipeline, PathSeq, for pathogen discovery. It consists of three main modules. Quality filtering is carried out in the “pre-subtraction module”. In the “subtraction module”, the quality-filtered reads are aligned to a series of human genome and transcriptome databases using the BWA aligner and BLAST. In the “post-subtraction module”, residual nucleotide and conceptually translated sequences are aligned to microbial databases. We also assemble large contigs from unmapped reads that show no significant alignment similarity to any sequence in the reference databases. Such contigs may be suggestive of novel microorganisms.
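
The three-module flow described above can be illustrated with a toy sketch. This is not the PathSeq implementation: it replaces the BWA and BLAST alignment steps with a crude shared k-mer test so that the example runs without external tools, and every function name, threshold, and parameter below is an assumption made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def quality_filter(reads: List[str], min_len: int = 20,
                   max_n_fraction: float = 0.1) -> List[str]:
    """Pre-subtraction (toy): drop short reads and reads rich in ambiguous bases."""
    kept = []
    for read in reads:
        read = read.upper()
        if len(read) >= min_len and read.count("N") / len(read) <= max_n_fraction:
            kept.append(read)
    return kept


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references: List[str], k: int) -> Set[str]:
    """Union of k-mers over a list of reference sequences."""
    index: Set[str] = set()
    for ref in references:
        index |= kmers(ref.upper(), k)
    return index


def matches(read: str, index: Set[str], k: int, min_fraction: float) -> bool:
    """Stand-in for an aligner: does enough of the read's k-mer content hit the index?"""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return False
    shared = sum(1 for km in read_kmers if km in index)
    return shared / len(read_kmers) >= min_fraction


def subtract_host(reads: List[str], host_refs: List[str], k: int = 12,
                  min_fraction: float = 0.5) -> List[str]:
    """Subtraction (toy): discard reads that look like host sequence."""
    host_index = build_index(host_refs, k)
    return [r for r in reads if not matches(r, host_index, k, min_fraction)]


def classify_residual(reads: List[str], microbial_refs: Dict[str, str], k: int = 12,
                      min_fraction: float = 0.5) -> Tuple[Dict[str, List[str]], List[str]]:
    """Post-subtraction (toy): assign residual reads to a microbial reference or leave unmapped."""
    indexes = {name: build_index([seq], k) for name, seq in microbial_refs.items()}
    assigned: Dict[str, List[str]] = {name: [] for name in microbial_refs}
    unmapped: List[str] = []
    for read in reads:
        hit = next((name for name, idx in indexes.items()
                    if matches(read, idx, k, min_fraction)), None)
        if hit is None:
            unmapped.append(read)  # candidates for assembly into contigs
        else:
            assigned[hit].append(read)
    return assigned, unmapped


def run_pipeline(reads: List[str], host_refs: List[str],
                 microbial_refs: Dict[str, str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Chain the three toy modules in the order described in the text."""
    filtered = quality_filter(reads)
    residual = subtract_host(filtered, host_refs)
    return classify_residual(residual, microbial_refs)
```

Calling `run_pipeline(reads, host_refs, microbial_refs)` returns the reads assigned to each microbial reference together with the reads that match nothing. In the workflow described above, that second group would be passed to an assembler to build contigs, a step this sketch leaves out.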

Whole genome sequencing data obtained from an initial set of 26 tumor-normal sample pairs (reported in Chapman et al, Nature 2010) were analyzed by PathSeq. In addition, analysis of mRNA sequencing data from 84 tumors is ongoing. Comparative marker selection analysis was also performed on array-based gene expression profiling data from 20 of the initial 26 tumors.

Results

PathSeq analyses revealed the presence of Epstein-Barr virus (EBV) sequence reads in 7 cases and human herpesvirus 6B (HHV-6B) reads in 2 cases. In most instances, the viral sequences were found in both the tumor and the normal sample from the same patient. The lowest viral-to-human read ratio was 1 in 1 billion, demonstrating the sensitivity of the PathSeq algorithm for pathogen identification.

Comparative marker selection analysis of gene expression profiles from the 7 tumors with EBV reads versus the 13 tumors without viral sequences revealed a tentative virus-associated signature. SHISA2, which encodes a protein that modulates FGF and Wnt signaling, was the gene whose expression was most significantly associated with virus positivity.
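
For readers unfamiliar with this kind of two-class comparison, the following is a minimal sketch of one common way to rank genes between two groups: a signal-to-noise statistic with a label-permutation p-value. The abstract does not state which test statistic or settings were used, so the statistic, the function names, and the parameters here are assumptions, and no expression values from the study are reproduced.

```python
import random
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def signal_to_noise(a: Sequence[float], b: Sequence[float]) -> float:
    """(mean_a - mean_b) / (sd_a + sd_b); returns 0.0 if both groups are constant."""
    denom = stdev(a) + stdev(b)
    return (mean(a) - mean(b)) / denom if denom > 0 else 0.0


def permutation_p_value(a: Sequence[float], b: Sequence[float],
                        n_perm: int = 1000, seed: int = 0) -> float:
    """Two-sided p-value from shuffling class labels (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs(signal_to_noise(a, b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(signal_to_noise(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def rank_markers(expression: Dict[str, List[float]], is_positive: List[bool],
                 n_perm: int = 1000, seed: int = 0) -> List[Tuple[str, float, float]]:
    """Return (gene, statistic, p-value) tuples sorted by absolute statistic, largest first.

    `expression` maps a gene name to one value per sample; `is_positive` flags
    the samples in the virus-positive group. Each group needs at least 2 samples.
    """
    results = []
    for gene, values in expression.items():
        pos = [v for v, flag in zip(values, is_positive) if flag]
        neg = [v for v, flag in zip(values, is_positive) if not flag]
        results.append((gene,
                        signal_to_noise(pos, neg),
                        permutation_p_value(pos, neg, n_perm, seed)))
    results.sort(key=lambda item: abs(item[1]), reverse=True)
    return results
```

With 7 positive and 13 negative samples, as in the comparison above, `is_positive` would be a list of 20 flags. Because thousands of genes are tested at once, the per-gene p-values would also need a multiple-testing correction, which this sketch omits.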

Conclusions

We have described the identification of EBV and HHV-6B sequences in DNA samples obtained from myeloma tissues. Our data are preliminary, and it is possible that the viral sequences are simply vestiges of inactive herpesviral infection present in the tumor microenvironment. Therefore, these findings need to be confirmed by our ongoing PathSeq analysis of RNA sequencing data. In addition, efforts are ongoing to identify novel pathogens, as described above.

If host-viral relationships influence the prognosis of myeloma, knowledge of them would have translational utility whether or not causation is proven. Therefore, if viral sequences are found in the RNA sequencing data, we plan to compare our data with published prognosis-related gene expression signatures in myeloma.

Disclosures:

No relevant conflicts of interest to declare.
