A tissue level atlas of the healthy human virome


Association of HCV infection with robust interferon responses in the liver and the onset of hepatitis

Advances in next-generation sequencing (NGS) methods in recent decades have made comprehensive surveys of a variety of microorganisms possible. Metagenomic analyses have explored microorganisms, including bacteria, phages, and viruses, in a variety of environments on Earth, such as the oceans and soils. Perhaps the most deeply surveyed microbiome is that of humans. The Human Microbiome Project aims to characterize bacteria, viruses, and other microorganisms in the human body. Because bacteria often live outside human cells, the human bacterial microbiome can be collected by non-destructive sampling and has been well described in multiple organs and sample types, including the skin, oral cavity, and gastrointestinal tract (including feces). A major theme emerging from these studies is that, while certain microbial species are associated with pathology, many components of the microbiome likely play a symbiotic role in maintaining human health (reviewed in).

Many viruses are clearly human pathogens: human immunodeficiency virus (HIV) and influenza virus are the causative agents of human diseases, and Epstein-Barr virus (EBV; also known as human herpesvirus 4 [HHV-4]) and hepatitis C virus (HCV) can drive oncogenesis. On the other hand, as with the human bacterial microbiome, some viruses can chronically infect a broad range of human tissues without overt pathology. Previous studies have suggested that these viruses can nonetheless have detrimental effects. For example, human respiratory syncytial virus and human rhinoviruses may play an important role in the inception of childhood asthma and atopic asthma, respectively. Conversely, there are a few examples of viruses that exhibit protective effects. For example, GB virus C (GBV-C, also known as hepatitis G virus [HGV]) infection can be protective against HIV infection and improve survival.

In experiments using animal models, latent infection with murine gammaherpesvirus, which is genetically related to EBV in humans, confers symbiotic protection from bacterial infection. Thus, virus infections are associated with multiple aspects of human health, and revealing the human “virome” would be beneficial for understanding the hidden mutualism and/or conflict between humans and viruses. In the largest study of the human virome to date, Simon et al. performed meta-transcriptomic analysis of more than 17,000 human RNA-sequencing (RNA-Seq) datasets. However, the datasets used in that study were related to human diseases. Moustafa et al. used DNA-sequencing data obtained by whole-genome sequencing of blood samples from more than 8000 “healthy” subjects.

The authors detected 19 human viruses in the blood; notably, however, their approach was “blind” to RNA viruses. Previous studies of the human virome have used specimens that are relatively easy to obtain from healthy individuals (e.g., blood or skin), because it is technically difficult to obtain somatic tissues from inside the body (e.g., brain and internal organs) of healthy individuals for virome investigation. Moreover, it remains unclear which viruses infect the various tissues of healthy individuals and how these infections influence human gene expression and perturb the homeostasis of these tissues. To characterize the virome in the human body, we performed meta-transcriptomic analysis using the RNA-Seq dataset provided by the Genotype-Tissue Expression (GTEx) Project. We detected 39 viruses in a variety of human tissues, revealing both expected and unexpected associations between viral infections and human gene expression and human disease.

Strong association of a human gene expression pattern with the presence of HHV-7 transcripts in the stomach

Importantly, our approach allows us to quantify both viral RNA and host gene expression in the same sample (Fig). We thus analyzed these two quantities to explore the effect of each virus on human gene expression patterns, comparing virus-positive samples to virus-negative samples in the same tissue. We first focused on HCV, which was specifically detected in the liver (Fig). HCV is classified into the family Flaviviridae, genus Hepacivirus, and possesses a positive-sense single-stranded RNA (~10 kb) genome.
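The comparison of virus-positive and virus-negative samples within one tissue can be illustrated with a minimal sketch. This is not the authors' differential expression method, which this section does not specify; it only shows the basic idea of splitting samples by virus status and computing a log2 fold change for one gene. The sample IDs, expression values, and the function name `log2_fold_change` are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(positive, negative, pseudocount=1.0):
    """log2 ratio of mean expression in virus-positive over virus-negative samples.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((mean(positive) + pseudocount) / (mean(negative) + pseudocount))


# Hypothetical expression values (e.g., TPM) of one gene in one tissue.
expression = {"s1": 7.0, "s2": 15.0, "s3": 1.0, "s4": 3.0}
# Hypothetical set of samples in which viral reads were detected.
virus_positive = {"s1", "s2"}

pos = [value for sample, value in expression.items() if sample in virus_positive]
neg = [value for sample, value in expression.items() if sample not in virus_positive]

lfc = log2_fold_change(pos, neg)  # (11 + 1) / (2 + 1) = 4, so log2 = 2.0
```

A real analysis would additionally model count dispersion and correct for multiple testing across genes, which this sketch omits.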

Interface hepatitis and bridging fibrosis were also observed in two (13SLX and 139TS) and one (13SLX) case(s), respectively (Fig. 3b, left). These histological findings are compatible with hepatitis. In contrast, two HCV-negative liver samples did not show morphological features suggestive of hepatitis (Fig. 3b, right). Although hepatic disease did not cause their death, our findings suggest that HCV infection contributed to undiagnosed hepatitis in these three individuals. HCV infection is sensed by cellular pathogen recognition receptors (PRRs) and can cause interferon (IFN) production.

Unrecognized limbic encephalitis associated with HSV-1

Based on the following two hallmarks, histological observations suggesting hepatitis (Fig. 3b) and upregulation of ISGs (Fig. 3c and d), the detection of HCV reads offers a good opportunity to validate the specificity and sensitivity of our analytical pipeline. To this end, we compared our pipeline with three other computational pipelines, Kraken, CLARK, and Kaiju, which have been previously reported to identify viral sequences from NGS data. As shown in Additional file 7: Figure S2, all three pipelines detected HCV reads in the three HCV-positive samples (sample IDs: ZAB4, 13SLX and 139TS) identified by our pipeline. However, CLARK and Kraken detected large numbers of HCV reads in all liver samples (Additional file: Figure S2). Given that the worldwide prevalence of HCV infection is less than 5%, these results suggest that the output of CLARK and Kraken contains many false-positive hits and that these pipelines are less specific, at least for HCV in the dataset used in this study.

On the other hand, the result obtained by Kaiju was similar to that obtained by our pipeline (Additional file 7: Figure S2). Collectively, these results strongly support the validity of our analytical pipeline.

We next focused on EBV, the most broadly detected virus in multiple individuals (Fig. 2, right). EBV is classified into the family Herpesviridae, genus Lymphocryptovirus, and possesses a double-stranded circular DNA (~172 kb) genome. EBV is known as the causative agent of infectious mononucleosis and some malignant diseases such as Burkitt lymphoma and post-transplant lymphoproliferative disorder. However, more than 90% of adults are positive for EBV, leading to the concept that EBV chronically infects humans without causing serious disease [19]. Although EBV preferentially infects human leukocytes, particularly B cells, EBV was broadly detected in various tissues (Fig. left). This result may be attributed to the residual blood in each tissue even after the perfusion process, or to tissue-resident cells from the B cell lineage. To address the biological impact of EBV infection, we particularly focused on the spleen and peripheral blood, where leukocytes, including B cells, naturally reside or circulate in high abundance. The spleen was the tissue with the highest proportion of EBV-positive samples (Fig. 2, middle). We detected DEGs that were upregulated in EBV-positive samples compared to EBV-negative samples and performed GO enrichment analysis.

In both the blood (Fig.a) and spleen (Fig.b), genes encoding a variety of immunoglobulins were upregulated, leading to enrichment of GO terms such as “complement activation.” This suggested that the abundance of B cells may be increased in EBV-positive samples. To further address this point, we performed deconvolution analysis, which estimates the proportion of different immune cell types from bulk RNA-Seq data. The proportion of plasma cells, which produce antibodies in high abundance, was significantly increased in EBV-positive samples compared to EBV-negative samples in both the blood (Fig. c; see also Additional file: Figure S3) and spleen (Fig.d; see also Additional file Figure S4).

It has been reported that productive EBV replication (also known as “lytic infection”) is initiated during B cell differentiation into plasma cells. Moreover, a recent study suggested that EBV infection reprograms the gene expression pattern of infected B cells to resemble that of plasmablasts and early plasma cells. Our findings correspond well to these experimental observations, further supporting the biological validity of our investigations. Moreover, our results raise the possibility that the “healthy” human virome, in particular EBV, may influence the spectrum of B cell lymphoproliferative disorders, such as monoclonal gammopathy of unknown significance, in different individuals.

HSV-1 is classified into the family Herpesviridae, genus Simplexvirus, and possesses a double-stranded circular DNA (~ 150 kb) genome. HSV-1 infects a broad range of cell types and tissues and causes cold sores and genital herpes infections. After primary infection, HSV-1 latently infects nerve cells. It sporadically reactivates, leading to recurrent symptoms. HSV-1 encephalitis is a severe and often fatal condition caused by HSV-1 infection. Surprisingly, we detected high levels of HSV-1 transcripts in the brain (Fig. left) of one individual (sample ID: X4EP) (Fig.a).

TTV infection in many human tissues without inducing an IFN response

HHV-7-positive samples were strikingly enriched in cluster 1 (Fig. 7a, middle; 73 out of the 76 HHV-7-positive samples; 96.1%). These results suggest that the HHV-7 transcription status is strongly associated with the global human gene expression pattern in the stomach. The difference in the global human transcriptome between clusters 1 and 2 may reflect different anatomical regions, and HHV-7 may predominantly infect the stomach region categorized as cluster. HHV-7 preferentially infects CD4+ T cells. The results of deconvolution analysis of the transcriptome of both HHV-7-positive and HHV-7-negative samples were consistent with the presence of resting memory CD4+ T cells in the stomach (Fig.a, bottom). Additionally, transcripts attributable to plasma cells were relatively abundant in cluster 1, while those attributable to myeloid cells (monocytes and M2 macrophages) and mast cells were abundant in cluster 2 (Fig.a, bottom).
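Deconvolution analysis of the kind referred to above estimates cell-type proportions by expressing a bulk expression profile as a non-negative mixture of reference cell-type profiles. The text here does not name the tool that was used, so the following is only a simplified stand-in: a non-negative least-squares fit solved by projected gradient descent. The signature values, marker genes, cell-type order, and the function name `deconvolve` are hypothetical.

```python
def deconvolve(signature, mixture, iterations=2000):
    """Estimate non-negative cell-type fractions f minimizing ||S f - m||^2.

    signature: list of gene rows, each holding one expression value per cell type.
    mixture:   bulk expression values, one per gene.
    Fractions are rescaled to sum to 1 before being returned.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # 1 / trace(S^T S) is never larger than 1 / (largest eigenvalue of S^T S),
    # so this step size keeps the descent stable.
    step = 1.0 / sum(v * v for row in signature for v in row)
    f = [1.0 / n_types] * n_types
    for _ in range(iterations):
        residual = [sum(row[j] * f[j] for j in range(n_types)) - mixture[i]
                    for i, row in enumerate(signature)]
        gradient = [sum(signature[i][j] * residual[i] for i in range(n_genes))
                    for j in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        f = [max(0.0, f[j] - step * gradient[j]) for j in range(n_types)]
    total = sum(f)
    return [x / total for x in f] if total > 0 else f


# Hypothetical signature matrix: 4 marker genes x 3 cell types
# (columns: plasma cells, CD4+ T cells, mast cells).
signature = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
true_fractions = [0.5, 0.3, 0.2]
# Simulated bulk profile built from the known fractions.
mixture = [sum(s * t for s, t in zip(row, true_fractions)) for row in signature]

estimated = deconvolve(signature, mixture)
```

On this noise-free toy input the estimated fractions should match the simulated ones closely; real bulk RNA-Seq data are noisy, and published deconvolution tools use more robust regression and curated signature matrices.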

These findings suggest that HHV-7 infection is associated with the pattern of leukocytes residing in the stomach. We further performed GO analysis on the DEGs between these two clusters. In addition to the GO terms associated with tissue-resident leukocytes (e.g., “phagocytosis” and “immune response”), GO terms such as “digestion” were highly ranked among the genes upregulated in cluster 1 compared to cluster 2 (Fig. 7b). In fact, the expression levels of some genes encoding enzymes and proteins that play critical roles in digestion were significantly higher in cluster 1 than in cluster 2. For instance, calpains (CAPN8 and CAPN9) [60] and pepsinogens (PGA3–5) digest proteins and peptides, and the signals mediated by cholecystokinin receptors (CCKAR and CCKBR) aid the digestion of proteins and lipids.

Somatostatin (SST) and its receptor (SSTR1) control the secretion of gastric acids, and trefoil factors (TFF1 and TFF2) help protect and repair the gastrointestinal mucosa. These genes are expressed in stomach cells (i.e., secretory epithelial cells) but not in leukocytes. Thus, our results suggest that HHV-7 infection may be associated with increased expression of transcripts important for stomach function (Fig.b). Another possibility is that HHV-7 is highly transcribed in the stomach region where digestive genes are highly expressed. Further study of the potentially mutualistic relationship between HHV-7 and humans and its influence on digestive function is needed.
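GO analysis of a DEG list, as used above, is commonly framed as an over-representation test: given how many genes carry a GO term in the background, how surprising is the number of DEGs that carry it? The sketch below uses a one-sided hypergeometric test for this purpose. It is an assumption that a test of this kind applies here, since the text does not state which GO tool or statistic was used, and all counts and the function name `hypergeom_sf` are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible overlaps contribute nothing to the sum.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts for a single GO term such as "digestion".
background_genes = 20000   # genes tested in the tissue
term_genes = 150           # background genes annotated with the term
deg_count = 500            # genes upregulated in one cluster
overlap = 20               # upregulated genes annotated with the term

p_value = hypergeom_sf(overlap, background_genes, term_genes, deg_count)
```

With these illustrative counts, the expected overlap by chance is under four genes, so an overlap of 20 yields a very small p-value. In practice the test is repeated for every GO term, and the resulting p-values require multiple-testing correction.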


Authors: Ryuichi Kumata, Jumpei Ito, Kenta Takahashi, Tadaki Suzuki and Kei Sato