Figure 1.
Pax5+/–-associated genetic susceptibility to pB-ALL shapes a specific gut microbiota. (A) A diagram of the study design is shown. Pax5+/– (gray) and WT (white) mice were born in SPF facilities (light blue). The lifespan (in weeks) of an individual mouse is indicated by a horizontal bar. A green bar indicates that the mouse remained healthy throughout the experiment; a red end label indicates development of pB-ALL. After weaning, 6 mice were cohoused per cage (blue-shaded horizontal boxes). Three Pax5+/– and 3 WT mice were housed together in cages 1 to 4 (mixed-genotype cages), and mice of the same genotype were housed in cages 5 to 8 (same-genotype cages). At ∼6.3 weeks of age, mice in cages 3, 4, 7, and 8 were transferred to conventional facilities (CFs), which provide a natural infectious environment (gray). Fecal pellets for microbial profiling were collected 1 month (blue vertical lines) and 10 months (black vertical lines) after the beginning of cohousing. Dotted vertical lines correspond to samples that were excluded because of failed quality control (cage 5; first time point; mice V443 and V493). (B) Exposure of mice to a naturally infectious environment altered their gut microbiome composition over time. Pairwise unweighted UniFrac distances (beta diversity) were computed for all cohousing samples. The distance matrix was ordinated via principal coordinates analysis (PCoA) into 3 dimensions and visualized via EMPeror.66 Axes indicate the percentage of explained variance. Cones represent microbiomes of Pax5+/– mice, whereas spheres represent those of WT mice. First-time-point samples are drawn with smaller shapes than second-time-point samples. A large shift over time is apparent for CFs (blue) compared with the much smaller difference in SPF facilities (red). (C-F) Heterozygous loss of Pax5 shaped a specific gut microbiota.
For each combination of time point (1 month or 10 months of cohousing) and facility (CF or SPF facility), pairwise PERMANOVA tests with 999 permutations were applied to samples from same-genotype cages (5, 6, 7, 8) to test for differences in beta diversity between mouse genotypes. Boxes visualize unweighted UniFrac distances between samples of the same genotype (ie, intragroup distances; blue for Pax5+/– and green for WT) and distances across genotypes (mustard, intergroup; ie, all pairs in which 1 partner has a Pax5+/– genotype and the other partner has a WT genotype). The boxes show the quartiles of the dataset, and the whiskers (error bars) show the rest of the distribution, except for points classified as outliers, ie, those lying more than 1.5 times the interquartile range beyond the quartiles. The P value is from the PERMANOVA test. Note that PERMANOVA requires a minimum of 5 samples per group, a requirement that is narrowly missed at the first time point for SPF facilities. The boxes and the PERMANOVA tests are based on the same data. (G-H) Pax5+/– and WT genotypes were accurately predicted using machine learning. Genotype was predicted from relative abundances of 40 V4-ASVs (G) or 4 full-length rRNA features (H). The full rarefied feature table contained relative abundances of 3983 V4-ASVs for 502 samples from the abx and the cohousing cohorts (excluding samples from mixed-genotype cohousing cages and the antibiotics treatment phase) and was randomly split at a 1:1 ratio into a training set and a testing set. For training, 168 Pax5+/– and 83 WT samples were used, and for testing, 161 Pax5+/– and 90 WT samples were used. In a first pass, a random forest classifier (1000 trees; scikit-learn library) was trained to estimate the importance of each of the 3983 V4-ASVs in predicting the mouse genotype (Pax5+/– or WT). Starting with the single most important V4-ASV, we then tested the accuracy of predicting the mouse genotype with a second random forest.
Further V4-ASVs were then added to the random forest in stages to improve overall accuracy. Accuracy saturated at 96.8% with only the top 40 V4-ASVs. The confusion matrix showed that only 8 samples were predicted to have the wrong genotype. Panel H visualizes results for the same prediction strategy applied to the full-length PacBio ASVs (60 Pax5+/– and 4 WT mouse samples were used for training, and 57 Pax5+/– and 8 WT samples were used for testing). Accuracy for predicting a mouse’s genotype based on only the top 4 full-length ASVs was 100%. (I-J) The taxonomic profile of the top 4 full-length ASVs that differentiate between the genotypes is shown. The full-length ASV composition of each sample is visualized as 1 stacked bar. Samples were grouped by housing facility (CF in panel I, SPF facility in panel J) and then by genotype (Pax5+/– on the left and WT on the right). The maximal combined relative abundance of the 4 full-length ASVs was 2%. A naive Bayesian classifier was applied against the Greengenes 13.8 reference to assign taxonomic labels to the full-length ASVs, which are annotated in the legend. For PacBio full-length 16S rRNA sequencing, all stool samples from mice housed in infectious environments until week 50 were used.
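
The staged prediction strategy described for panels G and H could be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the function name staged_feature_selection, its parameters, the random seed, and the toy feature table are all assumptions made for the example. Only the overall scheme (a random 1:1 split, a first random forest to rank features, then forests retrained on a growing set of top-ranked features) follows the caption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def staged_feature_selection(X, y, max_features=10, n_trees=1000, seed=0):
    """Rank features with one forest, then score forests on the top-k features.

    Hypothetical helper for illustration; returns the feature ranking and the
    test accuracy obtained with the top 1, 2, ..., max_features features.
    """
    # Random 1:1 split into a training and a testing set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=seed
    )

    # First pass: estimate the importance of every feature.
    ranker = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    ranker.fit(X_train, y_train)
    order = np.argsort(ranker.feature_importances_)[::-1]

    # Second pass: start with the single most important feature and add
    # one feature per stage, recording accuracy on the held-out samples.
    accuracies = []
    for k in range(1, max_features + 1):
        top_k = order[:k]
        clf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        clf.fit(X_train[:, top_k], y_train)
        accuracies.append(clf.score(X_test[:, top_k], y_test))
    return order, accuracies


# Synthetic stand-in for a relative-abundance table (assumption: 200 samples,
# 50 features, two genotype labels encoded as 0 and 1).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.random((200, 50))
X[:, 3] += 2.0 * y   # strongly informative toy feature
X[:, 7] += 0.5 * y   # weakly informative toy feature

# Fewer trees than the 1000 named in the caption, only to keep the demo fast.
order, accuracies = staged_feature_selection(X, y, max_features=5, n_trees=100)
for k, acc in enumerate(accuracies, start=1):
    print(f"top {k} feature(s): test accuracy = {acc:.3f}")
```

Plotting the accuracies against the number of features would show where accuracy saturates, which is how a cutoff such as the top 40 V4-ASVs might be chosen. The split here is a plain random split; whether the original analysis stratified by genotype is not stated in the caption.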

