Electronic Supplementary Material

Microbial diversity decline and community response are decoupled from increased respiration in warmed tropical forest soil

Overview

This page contains the Supplementary Information for the manuscript Microbial diversity decline and community response are decoupled from increased respiration in warmed tropical forest soil. It is basically the same information contained in the published version of the online material except here, all tables are embedded in the page and the information can be download directly.

Note

The source code for the PDF version of this page (publication Electronic Supplementary Material)—including all figures, tables, and data sets—can be found on the project GitHub repo. The code and data needed to reproduce each figure can here.

Content Description

Supplmentary Tables

Supplmentary Table 1 | Publicly available data and data products.
Supplmentary Table 2 | Sample Details.
Supplmentary Table 3 | Primer sequences for 16S rRNA & ITS gene amplification.
Supplmentary Table 4 | Tracking read changes through DADA2 workflow (16S rRNA).
Supplmentary Table 5 | Tracking read changes through DADA2 workflow (ITS).
Supplmentary Table 6 | Results of different filtering approaches (16S rRNA).
Supplmentary Table 7 | Results of different filtering approaches (ITS).
Supplmentary Table 8 | Hill numbers (16S rRNA) for samples in the unfiltered & filtered data sets.
Supplmentary Table 9 | Summary of significant tests (16S rRNA) for unfiltered & filtered datas sets.
Supplmentary Table 10 | Hill numbers (ITS) for samples in the unfiltered & filtered data sets.
Supplmentary Table 11 | Summary of significant tests (ITS) for the unfiltered & filtered data sets.
Supplmentary Table 12 | Results of beta dispersion & permutation test for homogeneity of multivariate dispersions (16S rRNA).
Supplmentary Table 13 | Results of beta dispersion & permutation test for homogeneity of multivariate dispersions (ITS).
Supplmentary Table 14 | Summary of beta diversity significant tests for the unfiltered & filtered data sets.
Supplmentary Table 15 | Mantel tests for 16S rRNA & ITS data compared to each of the three metadata groups.

Supplmentary Figures

Supplmentary Figure 1 | Top bacterial/archaeal phyla (unfiltered data).
Supplmentary Figure 2 | Top fungal orders (unfiltered data).
Supplmentary Figure 3 | Alpha diversity (16S rRNA) of the unfiltered & filtered datas sets.
Supplmentary Figure 4 | Alpha diversity (ITS) of the unfiltered & filtered datas sets.
Supplmentary Figure 5 | Autocorrelation plots (16S rRNA).
Supplmentary Figure 6 | Autocorrelation plots (ITS).
Supplmentary Figure 7 | Acidobacteriota family plots.
Supplmentary Figure 8 | Actinobacteriota family plots.
Supplmentary Figure 9 | Alphaproteobacteria family plots.
Supplmentary Figure 10 | Gammaproteobacteria family plots.
Supplmentary Figure 11 | Bacteroidota family plots.
Supplmentary Figure 12 | Firmicutes family plots.
Supplmentary Figure 13 | Myxococcota family plots.
Supplmentary Figure 14 | Verrucomicrobiota family plots

Supplmentary Datasets

Supplmentary Dataset 1 | Output from the 16S rRNA DADA2 workflow containing information for each ASVs.
Supplmentary Dataset 2 | Output from the ITS DADA2 workflow containing information for each ASVs.
Supplmentary Dataset 3 | Complete metadata information collected in this study.
Supplmentary Dataset 4 | Differentially abundant (DA) ASVs from the 16S rRNA data identified using ISA.
Supplmentary Dataset 5 | Differentially abundant (DA) ASVs from the 16S rRNA data identified using LEfSe.
Supplmentary Dataset 6 | Differentially abundant (DA) ASVs from the ITS data identified using ISA.
Supplmentary Dataset 7 | Differentially abundant (DA) ASVs from the ITS data identified using LEfSe.

Data & Code Availability

We provide additional data products and code through online repositories (Supplementary Table 1). For further details, please see the Data Availability page on the project website.

Supplementary Table 1 | Publicly available data and data products. ENA refers to the European Nucleotide Archive

Supplementary Methods

Sample naming

Samples were named by combining the plot number (P01–P10) with the treatment (C = control, W = warming), the temperature (0 = no warming, 3 = +3°C warming, and 8 = +8°C warming), and the plot pairing designation (A—E). For example, P07_W3D is the sample from plot #7 that was warmed by +3°C. This sample is part of group D which contains P07_W8D (warmed by +8°C) and P08_C0D (the control sample for this group) (Supplementary Table 2).

Supplementary Table 2 | Sample Details.

Processing microbial community data

DNA was extracted using the DNeasy Powersoil kit (Qiagen) and bacterial and fungal communities were amplified using a two-stage PCR protocol where the locus specific primers used for PCR1 included the Illumina sequencing primer sequence on their 5’ ends. For bacteria, we amplified the V4 hypervariable region of the 16S rRNA gene using the 515F–806R (Caporaso et al. 2011) primer pair (Supplementary Table 3). For fungi, we amplified the first internal transcribed spacer (ITS1) region of the rRNA operon, using the primers ITS1F (Gardes and Bruns 1993) and ITS2 (White et al. 1990) (Supplementary Table 3).

Supplementary Table 3 | Primer sequences for 16S rRNA & ITS gene amplification.

We used Platinum 2X Mastermix (Thermo) in PCR reactions with a final volume of 12.5µl with 25 cycles using a 50°C annealing temperature for both loci. PCR2 used 2µl of PCR1 as template and added on remaining Illumina adaptors and index sequences. PCR2 products were cleaned and normalized using PCR Normalization plates (CharmBiotech, USA) and the pooled libraries concentrated using AMPure beads (Beckman Coulter, USA). Libraries were sequenced on an Illumina MiSeq with 250bp paired end reads.

Reads in the 16S rRNA and ITS data sets were trimmed of forward and reverse primers using cutadapt (Martin 2011) (v1.18) following an initial filtering step that removed reads with ambiguous bases. Primer sequences with more than 12% error rate (–error-rate = 0.12) were discarded. Reads were then processed using DADA2 (Callahan et al. 2016) (v1.16.0) within the R environment (Team, n.d.) (v4.1.0). Reads were dropped from the data set if they had three or more expected errors (maxEE = 2), at least one base with very low quality (truncQ = 2), or at least one position with an unspecified nucleotide (maxN = 0). Based on visual inspection of quality plots, only the forward reads from the 16S rRNA data were retained, while both forward and reverse reads were retained in the ITS data set. Remaining reads were dereplicated before inferring amplicon sequence variants (ASVs).

We used ASVs over traditional OTUs because ASVs provide single nucleotide resolution, thus providing more detailed resolution when examining treatment effects. Paired-end reads (ITS only) were merged and read pairs that did not match exactly across at least 12 base pairs (minOverlap = 12) were discarded. For the 16S rRNA data we retained amplicons between 230 and 235 base pairs and for the ITS data we retained amplicons between 100 and 450 base pairs. Reads were then screened for chimeras (method = consensus). Taxonomy for the 16S rRNA data set was assigned to each ASV using the naive Bayesian classifier (Wang et al. 2007) against the Silva reference database (Quast et al. 2012) (Silva_nr_v138_train_set version 138). For taxonomic classification of the ITS data set, we used the naive Bayesian classifier (Wang et al. 2007) against the UNITE (Nilsson et al. 2019) database, specifically the UNITE general FASTA release for Fungi (v. 04.02.2020) (Abarenkov et al. 2020). Supplementary Dataset1 and Supplementary Dataset2 contain the ASV table, taxonomic assignments, and unique sequences for the 16S rRNA and ITS data sets, respectively. The complete DADA2 workflow is available here: https://sweltr.github.io/high-temp/dada2.html.

Prior to community analysis of the 16S rRNA data set, ASVs classified as chloroplasts, mitochondria, or Eukaryota, or ASVs that remained unclassified (i.e., NA) at the kingdom level, were removed from the data set using the phyloseq package (McMurdie and Holmes 2013) (v1.36.0) in the R environment (Team, n.d.). No such curation was performed on the ITS data set since all ASVs could be classified to kingdom level (Fungi). The complete data set preparation workflow is available here: https://sweltr.github.io/high-temp/data-prep.html.

Filtering

We applied three complementary methods of prevalence filtering to both the 16S rRNA and ITS data sets. The complete filtering workflow is available here: https://sweltr.github.io/high-temp/filtering.html.

i) Sample-wise filtering with arbitrary functions. Here we used the genefilter_sample function from the phyloseq package (McMurdie and Holmes 2013) (v1.36.0) where ASVs represented by fewer than 5 reads and/or present in less that 20% of samples, were removed.

ii) PERFect (PERmutation Filtering test for microbiome data) (Smirnova, Huzurbazar, and Jafari 2019) (v0.2.4) filtering. Here we used the function PERFect_sim with the alpha parameter set to 0.05 for the 16S rRNA data and 0.1 for the ITS data.

iii) PIME (Prevalence Interval for Microbiome Evaluation) (Roesch et al. 2020) (v0.1.0) filtering. Here we first rarefied all samples to even depths (per the developer’s recommendation) and then split the data sets by predictor variable (temperature treatment) using the pime.split.by.variable function. Next, we calculated the prevalence intervals with the function pime.prevalence and then used the function pime.best.prevalence to calculate the best prevalence. The best prevalence interval was selected when the out-of-bag (OOB) error rate first reached zero or close to zero. The most prevalent ASVs (at the best prevalence interval) were retained from each split. Splits were then merged to obtain the final PIME filtered data set.

Alpha diversity estimates

To account for presence of rare sequence variants caused by sequencing errors (or other technical artifacts), we used Hill numbers (Alberdi and Gilbert 2019a) for estimates of alpha diversity. Hill numbers allow the weight put on rare versus abundant sequence variants to be scaled while providing intuitive comparisons of diversity levels using effective number of ASVs as a measuring unit. This approach allows for balancing the over representation of rare ASVs that might be inflated due to sequencing errors. To calculate Hill numbers, we used the R package hilldiv (Alberdi and Gilbert 2019b). We calculated three metrics that put less or more weight on common ASVs: (i) Observed richness, where q-value = 0; (ii) Shannon exponential, which weighs ASVs by their frequency, where q-value = 1; and (iii) Simpson multiplicative inverse, which over weighs abundant ASVs, where q-value = 2.

We present all three metrics of alpha diversity, while acknowledging that each metric of alpha diversity is based on (measured) relative abundance and can be subject to bias due to variation in extraction efficiency and sequencing depth, especially when detecting rare taxa (or ASVs) (Bálint et al. 2016). However, we also recognize that each metric provides complementary information, by varying in their relative sensitivity towards rare and common species (Roswell, Dushoff, and Winfree 2021). We therefore interpret alpha diversity metrics in terms of changes in diversity due to changes in rarer ASVs (observed richness) and due to changes in more proportionally abundant ASVs (Shannon and inverse Simpson).

Next, we assessed whether the alpha diversity estimates were normally distributed using both the Shapiro-Wilk Normality test (Shapiro and Wilk 1965) and the Bartlett Test of Homogeneity of Variances (Bartlett 1937). If the p-values from both tests were not significant (p > 0.05), we accepted the null hypothesis that the results were normally distributed. If one or both of the the p-values were significant (p < 0.05), we rejected the null hypothesis. For parametric data we tested for significance across treatments using ANOVA followed by Tukey post-hoc test. For non-parametric data we tested for significance across treatments using Kruskal-Wallis followed by Dunn test with Benjamini-Hochberg correction. All tests were performed using the vegan package (Oksanen et al. 2012) in R (Team, n.d.). The complete alpha diversity workflow is available here: https://sweltr.github.io/high-temp/alpha.html.

Beta diversity estimates

To test for significance between temperature treatments, we performed the following steps for the 16S rRNA and ITS data sets. First, we transformed the sample counts to relative abundance. We then generated distance matrices using the phyloseq (McMurdie and Holmes 2013) function phyloseq::distance. For the 16S rRNA data, we used unweighted and weighted UniFrac (Chen et al. 2012). For the ITS data, we used Jensen-Shannon Divergence (Fuglede and Topsoe 2004) and Bray-Curtis dissimilarity (Bray and Curtis 1957). Next, we calculated beta dispersion using the betadisper function from the vegan package (Oksanen et al. 2012). Then we used the function permutest to run a Permutation test for homogeneity of multivariate dispersions (Legendre, Oksanen, and ter-Braak 2011). If beta dispersion tests were not significant, we ran a PERMANOVA (Legendre and Anderson 1999) using adonis (PERMANOVA assumes equal dispersion), otherwise we used Analysis of Similarity (Clarke 1993) (anosim), both available in the vegan package (Oksanen et al. 2012). Ordination plots were generated for each distance matrix using Principal Coordinate Analysis (Gower 1966) (PCoA). The complete beta diversity workflow is available here: https://sweltr.github.io/high-temp/beta.html.

Diffentially abundant ASVs

Diffentially abundant ASVs across temperature treatments (PIME filtered data sets) were identified using (i) the labdsv package (Roberts and Roberts 2016) (v.2.0-1)—to run Dufrene-Legendre Indicator Species Analysis (ISA)—and (ii) the microbiomeMarker package (Cao 2020) (v.0.0.1.9000) to run linear discriminant analysis (LDA) effect size (LEfSe) (Segata et al. 2011). ISA calculates the indicator value (fidelity and relative abundance) of ASVs in treatment groups. For the 16S rRNA and ITS data, we set the p-value cutoff to 0.5. For all other parameters the default values were used. For the LEfSe analysis we used pre-sample normalization of the sum of the values to 1e+06 (norm = "CPM"), set the LDA score cutoff to 2 (lda_cutoff = 2), set the p-value cutoff of Wilcoxon test to 0.05 (wilcoxon_cutoff = 0.05), and the p-value cutoff of Kruskal-Wallis test to 0.05 (kw_cutoff = 0.05). The complete diffentially abundant workflow is available here: https://sweltr.github.io/high-temp/da.html.

Multivariate analysis

Here we compare the environmental metadata (Supplementary Dataset3) with both the PIME filtered 16S rRNA and the ITS community data. The complete multivariate workflow is available here: https://sweltr.github.io/high-temp/metadata.html. Each workflow contained the same major steps:

  1. Metadata normality tests: We used Shapiro-Wilk Normality Test (Shapiro and Wilk 1965) to test whether each metadata parameter was normally distributed.

  2. Normalize parameters: Use the R package bestNormalize (Peterson and Cavanaugh 2019; Peterson 2021) and default parameters to find and execute the best normalizing transformation for non-parametric metadata parameters identified in step #1. The function tested the following normalizing transformations: arcsinh, Box-Cox, Yeo-Johnson, Ordered Quantile (ORQ) normalization, log transformation, square-root, and exponential. Once the non-parametric parameters were transformed, we reran the normality tests.

  3. Split metadata parameters into groups: Next we split the metadata parameters into three groups: a) environmental and edaphic properties; b) microbial functional responses; and c) temperature adaptation properties.

    1. Environmental and edaphic properties: AST, H2O, N, P, Al, Ca, Fe, K, Mg, Mn, Na, TEB, ECEC, pH, NH4, NO3, PO4, DOC, DON, DOCN.

    2. Microbial functional responses: micC, micN, micP, micCN, micCP, micNP, AGase, BGase, BPase, CEase, Pase, Nase, Sase, XYase, LPase, PXase, CO2, enzCN, enzCP, enzNP.

    3. Temperature adaptation: AGQ10, BGQ10, BPQ10, CEQ10, PQ10, NQ10, SQ10, XYQ10, LPQ10, PXQ10, CUEcn, CUEcp, NUE, PUE, Tmin, SI.

The following parameters were collected but not used in the analysis: minPO4, minNH4, minNO3, minTIN.

  1. Autocorrelation tests: Then we tested all possible pair-wise comparisons of the normalized metadata (from step #2) for each group (step #3) to identify potential autocorrelated parameters.

  2. Remove autocorrelated parameters from each group. Based on the results of step #4, we removed autocorrelated parameters.

  3. Dissimilarity correlation tests: Here we used Mantel Tests to determine if any metadata groups were significantly correlated with community data. We generated Bray-Curtis (Bray and Curtis 1957) distance matrices for the community data and Euclidean distance matrices for each metadata group. Then we performed Mantel tests (Mantel 1967; Legendre and Legendre 2012) for all comparisons using the function mantel from the vegan (Oksanen et al. 2012) package.

  4. Best subset of variables: Then we determined which metadata parameters (from each group) were the most strongly correlated with the community data. For this we used the bioenv function from the vegan (Oksanen et al. 2012) package where method = "spearman", index = "bray", and metric = "euclidean".

  5. Distance-based Redundancy Analysis (dbRDA): Finally, we performed ordination analysis of samples plus metadata vector overlays using capscale, also from the vegan package. For each of the three metadata subsets, we perform the following steps:

    1. Run rankindex (Faith, Minchin, and Belbin 1987) in the vegan package to compare metadata and community dissimilarity indices for gradient detection. This aids in the selection of the best dissimilarity metric to use. Here we tested the following metrics: Euclidean, Manhattan, Gower, Bray–Curtis, and Kulczynski.
    2. Run capscale in the vegan package for distance-based redundancy analysis.
    3. Run envfit to fit environmental parameters onto the ordination. This function basically calculates correlation scores between the metadata parameters and the ordination axes.
    4. Select metadata parameters significant for bioenv (see above) and/or envfit analyses.
    5. Plot the ordination and vector overlays.

Supplementary Results

Tracking reads through DADA2 workflow

16S rRNA data

The processed and curated 16S rRNA data set contained 937,761 high-quality reads, with a range of 25,151–86,600 reads per sample (mean 62,443). Modelling and error correcting amplicon errors inferred 20,332 ASVs, 19% of which were doubleton ASVs. After removing reads classified as mitochondria, chloroplast, or Eukaryota, the data set contained 20,173 ASVs and 936,640 reads—–with a range of 25,088–-86,600 reads (mean 62,443) and 813–-3065 (mean 2063) ASVs per sample (Supplementary Table 4).

Supplementary Table 4 | Tracking read changes through DADA2 workflow (16S rRNA).

Column descriptions. 1raw: number of initial reads; 2pre filt: after removing reads with ambiguous bases; 3cut: after removing primers; 4filter: after filtering; 5denoiseF: forward reads after denoising; 6nonchim: after removing chimeras; 7final: final read totals; 8asvs: total ASVs.

ITS data

The processed and curated ITS data set contained 491,143 high-quality reads, with a range of 80–64,636 reads per sample (mean 32,743). Modelling and error correcting amplicon errors inferred 3357 ASVs, 2.1% of which were doubleton ASVs. After removing 2 samples with low read counts (< 500 reads), the data set contained 3357 ASVs and 490,767 reads—–with a range of 9172–-64,636 reads (mean 37,751) and 335–1017 (mean 757) ASVs per sample (Supplementary Table 5).

Supplementary Table 5 | Tracking read changes through DADA2 workflow (ITS).

Column descriptions. 1raw: number of initial reads; 2pre filt: after removing reads with ambiguous bases; 3cut: after removing primers; 4filter: after filtering; 5denoiseF: forward reads after denoising; 6denoiseR: reverse reads after denoising; 7merge: after merging paired-end reads 8nonchim: after removing chimeras; 9final: final read totals; 10asvs: total ASVs. *These samples were removed from final analysis.

Filtering

16S rRNA data

The curated 16S rRNA data set contained 936,640 reads and 20,173 ASVs. Applying arbitrary filtering—removing ASVs represented by fewer than 5 reads and/or present in less than 20% of samples—reduced the number of reads by 22% and the number of ASVs by 90%. Similarly, PERFect filtering reduced the number of reads by 22% and the number of ASVs by 92% (Supplementary Table 6).

Rarefying the FULL data set to even read depths (25,088 reads/sample) prior to PIME filtering removed 2741 ASVs, reducing the number of reads by 60% and the number of ASVs by 14%. Baseline noise detection indicated the out-of-bag (OOB) error rate was 0.67. Splitting the rarefied data by predictor variable (i.e., temperature treatment) resulted in a total of 7883 ASVs for the Control group, 7173 ASVs for the +3°C group, and 6027 ASVs for the +8°C group. The Best Prevalence interval, calculated using the function pime.best.prevalence, was Prevalence = 50%, the first interval where the OOB error rate dropped to 0%. At this interval (compared to the FULL data set), the number of reads was reduced by 75% and the number of ASVs by 95% (Supplementary Table 6).

Supplementary Table 6 | Results of different filtering approaches (16S rRNA).

ITS data

The curated ITS data set contained 490,767 reads and 3355 ASVs. Applying arbitrary filtering—removing ASVs represented by fewer than 5 reads and/or present in less that 20% of samples—reduced the number of reads by 21% and the number of ASVs by 76%. PERFect filtering reduced the number of reads by 26% and the number of ASVs by 91% (Supplementary Table 7).

Rarefying the FULL data set to even read depths (9172 reads/sample) prior to PIME filtering removed 298 ASVs, reducing the number of reads by 76% and the number of ASVs by 9%. Baseline noise detection indicated the out-of-bag (OOB) error rate was 0.39. Splitting the rarefied data by predictor variable (i.e., temperature treatment) yielded a total of 1932 ASVs for the Control group, 1682 ASVs for the +3°C group, and 1306 ASVs for the +8°C group. The Best Prevalence interval, calculated using the function pime.best.prevalence, was Prevalence = 55%, the first interval where the OOB error rate dropped to 0%. At this interval (compared to the FULL data set), the number of reads was reduced by 86% and the number of ASVs by 87% (Supplementary Table 7).

Supplementary Table 7 | Results of different filtering approaches (ITS).

Taxonomic diversity of microbial communities

Bacterial diversity was largely comprised of Proteobacteria (Alpha, Gamma), Acidobacteriota, Actinobacteriota, Bacteroidota, Firmicutes, Myxococcota, Verrucomicrobiota, Chloroflexi, and Planctomycetota. Methylomirabilota and Crenarchaeota were the dominanmt phyla of Archaeal diversity (Supplementary Figure 1). For a breakdown of dominant bacterial phyla by family, see . Fungal diversity was largely comprised of Ascomycota, Basidiomycota, and Glomeromycota (Supplementary Figure 2). The complete taxonomic workflow is available here: https://sweltr.github.io/high-temp/taxa.html.

Alpha diversity of microbial communities

Shapiro-Wilk Normality and Bartlett tests indicated all data was normally distributed except for Shannon exponential estimates of the 16S rRNA PIME filtered data. Differences in alpha diversity assessed using analysis of variance (ANOVA) followed by Tukey HSD post hoc tests (normally distributed data) or Kruskal-Wallis followed by Dunn test with Benjamini-Hochberg correction (non-normally distributed data).

16S rRNA data

Supplementary Table 8 contains the results of alpha diversity estimates for the unfiltered data set plus the arbitrary, PERFect, and PIME filtered data sets. In Supplementary Table 9 we report the results of the Shapiro-Wilk Normality and Bartlett tests for each Hill number as well as the results of post-hoc analysis. Alpha diversity plots comparing the different filtering methods for each Hill number can be found in Supplementary Figure 3.

Supplementary Table 8 | Hill numbers (16S rRNA) for samples in the unfiltered & filtered data sets.

FULL = unfiltered data set; FILT = arbitrary filtering where nreads > 5 and prevalence > 20%; PERFect = PERFect filtering; PIME = PIME filtering

Supplementary Table 9 | Summary of significant tests (16S rRNA) for unfiltered & filtered datas sets.

Column descriptions. 1metric: Hill number; 2data set: FULL = unfiltered data set; FILT = arbitrary filtering where nreads > 5 and prevalence > 20%; PERFect = PERFect filtering; PIME = PIME filtering; 3pval_shap: p-value of Shapiro-Wilk Normality test; 4pval_bart: p-value of Bartlett Test of Homogeneity of Variances; 5method: Selected significance test; 6posthoc method: Selected posthoc test; 7posthoc pval: Posthoc p-value

ITS data

Supplementary Table 10 contains the results of alpha diversity estimates for the unfiltered data set plus the arbitrary, PERFect, and PIME filtered data sets. In Supplementary Table 11 we report the results of the Shapiro-Wilk Normality and Bartlett tests for each Hill number as well as the results of post-hoc analysis. Alpha diversity plots comparing the different filtering methods for each Hill number can be found in Supplementary Figure 4.

Supplementary Table 10 | Hill numbers (ITS) for samples in the unfiltered & filtered data sets.

FULL = unfiltered data set; FILT = arbitrary filtering where nreads > 5 and prevalence > 20%; PERFect = PERFect filtering; PIME = PIME filtering

Supplementary Table 11 | Summary of significant tests (ITS) for the unfiltered & filtered data sets.

Column descriptions. 1metric: Hill number; 2data set: FULL = unfiltered data set; FILT = arbitrary filtering where nreads > 5 and prevalence > 20%; PERFect = PERFect filtering; PIME = PIME filtering; 3pval_shap: p-value of Shapiro-Wilk Normality test; 4pval_bart: p-value of Bartlett Test of Homogeneity of Variances; 5method: Selected significance test; 6posthoc method: Selected posthoc test; 7posthoc pval: Posthoc p-value

Beta diversity of microbial communities

To test for significance between treatment groups, we calculated the beta dispersion (using the betadisper function, vegan package) for each 16S rRNA distance matrix (unweighted and weighted UniFrac) and each ITS distance matrix (Jensen-Shannon and Bray-Curtis). We then used the function permutest to run a Permutation Test for Homogeneity of multivariate dispersions against the results of each beta dispersion test (Supplementary Table 12, Supplementary Table 13). If the results were not significant (i.e., p-value > 0.05) we ran a PERMANOVA using adonis (PERMANOVA assumes equal dispersion), otherwise we used Analysis of Similarity (ANOSIM). Supplementary Table 14 contains the results of the significance tests.

Supplementary Table 12 | Results of beta dispersion & permutation test for homogeneity of multivariate dispersions (16S rRNA).

Supplementary Table 13 | Results of beta dispersion & permutation test for homogeneity of multivariate dispersions (ITS).

Supplementary Table 14 | Summary of beta diversity significant tests for the unfiltered & filtered data sets.

Differentially abundant ASVs

Indicator Species Analysis (ISA) of the 16S rRNA data set identified 251 differentially abundant (DA) ASVs. Of those, 154 ASVs were enriched in the Control samples, 82 in the +3°C treatment, and 15 in the +8°C treatment (Supplementary Dataset4). Linear discriminant analysis (LDA) effect size (LEfSe) identified 676 DA ASVs with an LDA score > 2.0 and a p-value < 0.05. Of those, 355 ASVs were enriched in the Control samples, 227 in the +3°C treatment, and 94 in the +8°C treatment (Supplementary Dataset5).

ISA of the ITS data set identified 203 DA ASVs. Of those, 54 ASVs were enriched in the Control samples, 95 in the +3°C treatment, and 54 in the +8°C treatment (Supplementary Dataset6). LEfSe identified 228 DA ASVs with an LDA score > 2.0 and a p-value < 0.05. Of those, 52 ASVs were enriched in the Control samples, 107 in the +3°C treatment, and 69 in the +8°C treatment (Supplementary Dataset7).

Multivariate analysis

Normality tests & parameter normalization

We used Shapiro-Wilk Normality Test (Shapiro and Wilk 1965) to determine which of the 61 metadata parameters were or were not normally distributed. For the 16S rRNA data we needed to transform 25 metadata parameters (p-value < 0.05) and for the ITS data, 21 metadata parameters needed transformation (p-value < 0.05). Please see the project website for the specific parameters that were transformed and the method of transformation used in each case (https://sweltr.github.io/high-temp/metadata.html). For both community data sets, bestNormalize was unable to find a suitable transformation for Al and Fe. This is likely because there was very little variation in these parameters and/or there were too few significant digits.

Removing autocorrelated parameters

Based on autocorrelation tests between the metadata and community data (Supplementary Figure 5, Supplementary Figure 6), we removed the following parameters:

Environmental and edaphic properties: TEB, DON, Na, Al, Ca.
Microbial functional responses: micN, micNP, enzCN, enzCP, BPase, CEase, LPase, Nase, Pase.
Temperature adaptation: NUE, PUE, SI.

In addition, we removed PQ10 (temperature adaptation) from the ITS analysis based on the autocorrelation tests.

Dissimilarity correlation tests

We used Mantel Tests to determine if any metadata groups were significantly correlated with 16S rRNA or ITS community data (Supplementary Table 15).

Supplementary Table 15 | Mantel tests for 16S rRNA & ITS data compared to each of the three metadata groups. Significant differences denoted by p-values < 0.05.

Best subset of variables

The bioenv function found the following metadata parameters (normalized with autocorrelated data removed) significantly correlated with community data (results of Mantel tests shown in parentheses).

Environmental and edaphic properties

16S rRNA: AST (r = 1.0, p = 0.001).
ITS: AST (r = 1.0, p = 0.001).

Microbial functional responses

16S rRNA: AGase (r = 0.559, p = 0.001), enzNP (r = 0.462, p = 0.006), Sase (r = 0.614, p = 0.001), PXase (r = 0.612, p = 0.001), XYase (r = 0.456, p = 0.002).
ITS: enzNP (r = 0.553, p = 0.001), PXase (r = 0.685, p = 0.001), XYase (r = 0.505, p = 0.002).

Temperature adaptation

16S rRNA: CUEcp (r = 0.325, p = 0.013), LPQ10 (r = 0.377, p = 0.005), PQ10 (r = 0.518, p = 0.001), SQ10 (r = 0.440, p = 0.001), and Tmin (r = 0.404, p = 0.005).
ITS: XYQ10 (r = 0.726, p = 0.001), Tmin (r = 0.616, p = 0.001).

Distance-based Redundancy Analysis (dbRDA)

In all cases (i.e., both community data sets against each of the three metadata subsets), rankindex (Faith, Minchin, and Belbin 1987) indicated that Bray-Curtis was best dissimilarity metric to use. Based on these results, we set dist = "bray" for each dbRDA analysis using capscale. Due to issue pertaining to degrees of freedom, we needed to remove some metadata parameters from specific groups. From the 16S rRNA analysis, we removed Mg and Mn (environmental and edaphic properties). From the ITS analysis, we removed Mg, Mn, Na, Al, Fe, and K (environmental and edaphic properties) and SQ10 (temperature adaptation). Next, we used the vegan function envfit to fit environmental parameters onto the ordination. This function calculates correlation scores between metadata parameters and ordination axes. envfit found the following parameters were significantly correlated with community data (Goodness of fit statistic/squared correlation coefficient and empirical p-values for each variable shown in parentheses).

Environmental and edaphic properties

16S rRNA: AST (r2 =0.829, p = 0.001), H2O (r2 =0.519, p = 0.010), DOC (r2 =0.446, p = 0.024).
ITS: AST (r2 =0.485, p = 0.037), DOC (r2 =0.535, p = 0.028).

Microbial functional responses

16S rRNA: AGase (r2 = 0.444, p = 0.026), BGase (r2 = 0.560, p = 0.007), Sase (r2 = 0.737, p = 0.002), XYase (r2 = 0.519, p = 0.009), PXase (r2 = 0.764, p = 0.001), CO2 (r2 = 0.504, p = 0.013), enzNP (r2 = 0.624, p = 0.004).
ITS: micP (r2 = 0.693, p = 0.002), micCP (r2 = 0.583, p = 0.016), AGase (r2 = 0.506, p = 0.037), PXase (r2 = 0.500, p = 0.035), enzNP (r2 = 0.547, p = 0.014).

Temperature adaptation

16S rRNA: SQ10 (r2 = 0.496, p = 0.015), XYQ10 (r2 = 0.373, p = 0.049), LPQ10 (r2 = 0.413, p = 0.041), Tmin (r2 = 0.446, p = 0.030).
ITS: XYQ10 (r2 = 0.617, p = 0.010), CUEcp (r2 = 0.479, p = 0.035), Tmin (r2 = 0.475, p = 0.028).

Appendices

Appendix 1: Supplementary Datasets

For this study, Supplementary Datasets are text files that were too large to include in the Supplementary Material. The individual files can be downloaded from the journal’s website. Below are descriptions for each Supplementary Data item.

Supplementary Dataset1

Description: Output from the 16S rRNA DADA2 workflow before manual curation. Table is a tab delimited text file containing information for 20,332 ASVs. The first column is the unique ASV ID, followed by the read counts for each sample, ASV taxonomic lineage (Kingdom to Genus), and finally the unique ASV sequence.

Supplementary Dataset2

Description: Output from the ITS DADA2 workflow before manual curation. Table is a tab delimited text file containing information for 3357 ASVs. The first column is the unique ASV ID, followed by the read counts for each sample, ASV taxonomic lineage (Kingdom to Genus), and finally the unique ASV sequence.

Supplementary Dataset3

Description: Complete metadata information collected in this study. Tab delimited text file containing data for 61 metadata parameters (before normalization) associated with each sample. The first column is the sample ID, followed plot number (1–10), treatment (control or warm), temperature (0°C, +3°C, +8°C), plot pair ID (A–E), and collection season (W = rainy season). Subsequent columns contain values for all metadata parameters.

Supplementary Dataset 3 | Complete metadata information collected in this study.

Supplementary Dataset4

Description: Differentially abundant (DA) ASVs from the 16S rRNA data identified using Indicator Species Analysis (ISA) against the PIME filtered data set. Tab delimited text file of all 251 DA ASVs between temperature treatments.

Supplementary Dataset 4 | Differentially abundant (DA) ASVs from the 16S rRNA data identified using ISA.

Description of table headers:

  • ASV_ID ASV name.
  • group Sample group ASV is enriched in.
  • indval Indicator value from Dufrene-Legendre Indicator Species Analysis.
  • pval p-value from Dufrene-Legendre Indicator Species Analysis.
  • freq Total number of samples where ASV was detected.
  • freq_C0 Total number of Control samples where ASV was detected.
  • freq_W3 Total number of +3°C samples where ASV was detected.
  • freq_W8 Total number of +8°C samples where ASV was detected.
  • reads_total Total reads in data set.
  • reads_C0 Total reads in Control samples.
  • reads_W3 Total reads in +3°C samples.
  • reads_W8 Total reads in +8°C samples.

The remaining columns contain lineage information for each ASV followed by its’ unique sequence.

Supplementary Dataset5

Description: Differentially abundant (DA) ASVs from the 16S rRNA data identified using linear discriminant analysis (LDA) effect size (LEfSe) against the PIME filtered data set. Tab delimited text file of all 676 DA ASVs between temperature treatments.

Supplementary Dataset 5 | Differentially abundant (DA) ASVs from the 16S rRNA data identified using LEfSe.

Description of table headers:

  • ASV_ID ASV name.
  • group Sample group ASV is enriched in.
  • lda Linear discriminant analysis (LDA) scores.
  • pval p-value from LEfSe analysis.
  • freq Total number of samples where ASV was detected.
  • freq_C0 Total number of Control samples where ASV was detected.
  • freq_W3 Total number of +3°C samples where ASV was detected.
  • freq_W8 Total number of +8°C samples where ASV was detected.
  • reads_total Total reads in data set.
  • reads_C0 Total reads in Control samples.
  • reads_W3 Total reads in +3°C samples.
  • reads_W8 Total reads in +8°C samples.

The remaining columns contain lineage information for each ASV followed by its’ unique sequence.

Supplementary Dataset6

Description: Differentially abundant (DA) ASVs from the ITS data identified using Indicator Species Analysis (ISA) against the PIME filtered data set. Tab delimited text file of all 203 DA ASVs between temperature treatments.

Supplementary Dataset 6 | Differentially abundant (DA) ASVs from the ITS data identified using ISA.

Description of table headers:

  • ASV_ID ASV name.
  • group Sample group ASV is enriched in.
  • indval Indicator value from Dufrene-Legendre Indicator Species Analysis.
  • pval p-value from Dufrene-Legendre Indicator Species Analysis.
  • freq Total number of samples where ASV was detected.
  • freq_C0 Total number of Control samples where ASV was detected.
  • freq_W3 Total number of +3°C samples where ASV was detected.
  • freq_W8 Total number of +8°C samples where ASV was detected.
  • reads_total Total reads in data set.
  • reads_C0 Total reads in Control samples.
  • reads_W3 Total reads in +3°C samples.
  • reads_W8 Total reads in +8°C samples.

The remaining columns contain lineage information for each ASV followed by its’ unique sequence.

Supplementary Dataset7

Description: Differentially abundant (DA) ASVs from the ITS data identified using linear discriminant analysis (LDA) effect size (LEfSe) against the PIME filtered data set. Tab delimited text file of all 228 DA ASVs between temperature treatments.

Supplementary Dataset 7 | Differentially abundant (DA) ASVs from the ITS data identified using LEfSe.

Description of table headers:

  • ASV_ID ASV name.
  • group Sample group ASV is enriched in.
  • lda Linear discriminant analysis (LDA) scores.
  • pval p-value from LEfSe analysis.
  • freq Total number of samples where ASV was detected.
  • freq_C0 Total number of Control samples where ASV was detected.
  • freq_W3 Total number of +3°C samples where ASV was detected.
  • freq_W8 Total number of +8°C samples where ASV was detected.
  • reads_total Total reads in data set.
  • reads_C0 Total reads in Control samples.
  • reads_W3 Total reads in +3°C samples.
  • reads_W8 Total reads in +8°C samples.

The remaining columns contain lineage information for each ASV followed by its’ unique sequence.

Appendix 2: Family-level bacterial charts

Top twelve (12) families of abundant bacterial phyla. Remaining taxa are grouped in Other. In cases where ASVs could not be classified to family level, abundance data was calculated for the next highest taxonomic rank, denoted by the prefix rank abbreviation plus underscore (e.g., c_ is Class). As above, relative abundance of taxa based on the full, unfiltered data set. Left plots show taxa collapsed by temperature treatment while right plots show individual samples faceted by temperature treatment. Taxa are ordered first by rank and then alphabetically. The same color palette displayed in the same order was used for each plot.

Source Code

The source code for this page can be accessed on GitHub by clicking this link.

Last updated on

[1] "2022-06-29 07:19:49 EST"

References

Abarenkov, Kessy, Allan Zirk, Timo Piirmann, Raivo Pöhönen, Filipp Ivanov, R. Henrik Nilsson, and Urmas Kõljalg. 2020. “UNITE General FASTA Release for Fungi.” https://dx.doi.org/10.15156/BIO/786368.
Alberdi, Antton, and M Thomas P Gilbert. 2019a. “A Guide to the Application of Hill Numbers to DNA-Based Diversity Analyses.” Molecular Ecology Resources 19 (4): 804–17. https://doi.org/10.1111/1755‐0998.13014.
———. 2019b. “Hilldiv: An r Package for the Integral Analysis of Diversity Based on Hill Numbers.” bioRxiv, 545665. https://doi.org/10.1101/545665.
Bálint, Miklós, Mohammad Bahram, A Murat Eren, Karoline Faust, Jed A Fuhrman, Björn Lindahl, Robert B O’Hara, et al. 2016. “Millions of Reads, Thousands of Taxa: Microbial Community Structure and Associations Analyzed via Marker Genes.” FEMS Microbiology Reviews 40 (5): 686–700. https://doi.org/10.1093/femsre/fuw017.
Bartlett, Maurice Stevenson. 1937. “Properties of Sufficiency and Statistical Tests.” Proceedings of the Royal Society of London. Series A-Mathematical and Physical Sciences 160 (901): 268–82. https://doi.org/10.1098/rspa.1937.0109.
Bray, J Roger, and John T Curtis. 1957. “An Ordination of the Upland Forest Communities of Southern Wisconsin.” Ecological Monographs 27 (4): 326–49. https://doi.org/10.2307/1942268 .
Callahan, Benjamin J, Paul J McMurdie, Michael J Rosen, Andrew W Han, Amy Jo A Johnson, and Susan P Holmes. 2016. “Dada2: High-Resolution Sample Inference from Illumina Amplicon Data.” Nature Methods 13 (7): 581. https://doi.org/10.1038/nmeth.3869.
Cao, Yang. 2020. “microbiomeMarker: Microbiome Biomarker Analysis.” R Package Version 0.0.1.9000. https://doi.org/10.5281/zenodo.3749415.
Caporaso, J Gregory, Christian L Lauber, William A Walters, Donna Berg-Lyons, Catherine A Lozupone, Peter J Turnbaugh, Noah Fierer, and Rob Knight. 2011. “Global Patterns of 16s rRNA Diversity at a Depth of Millions of Sequences Per Sample.” Proceedings of the National Academy of Sciences 108: 4516–22. https://doi.org/https://doi.org/10.1073/pnas.1000080107.
Chen, Jun, Kyle Bittinger, Emily S Charlson, Christian Hoffmann, James Lewis, Gary D Wu, Ronald G Collman, Frederic D Bushman, and Hongzhe Li. 2012. “Associating Microbiome Composition with Environmental Covariates Using Generalized UniFrac Distances.” Bioinformatics 28 (16): 2106–13. https://doi.org/10.1093/bioinformatics/bts342.
Clarke, K Robert. 1993. “Non-Parametric Multivariate Analyses of Changes in Community Structure.” Australian Journal of Ecology 18 (1): 117–43. https://doi.org/10.1111/j.1442-9993.1993.tb00438.x.
Faith, Daniel P, Peter R Minchin, and Lee Belbin. 1987. “Compositional Dissimilarity as a Robust Measure of Ecological Distance.” Vegetatio 69 (1): 57–68. https://doi.org/10.1007/BF00038687.
Fuglede, Bent, and Flemming Topsoe. 2004. “Jensen-Shannon Divergence and Hilbert Space Embedding.” In International Symposium on Information Theory, 2004. ISIT 2004. Proceedings., 31. IEEE.
Gardes, Monique, and Thomas D Bruns. 1993. “ITS Primers with Enhanced Specificity for Basidiomycetes-Application to the Identification of Mycorrhizae and Rusts.” Molecular Ecology 2 (2): 113–18. https://doi.org/10.1111/j.1365-294X.1993.tb00005.x.
Gower, John C. 1966. “Some Distance Properties of Latent Root and Vector Methods Used in Multivariate Analysis.” Biometrika 53 (3-4): 325–38. https://doi.org/10.1093/biomet/53.3-4.325.
Legendre, Pierre, and Marti J Anderson. 1999. “Distance-Based Redundancy Analysis: Testing Multispecies Responses in Multifactorial Ecological Experiments.” Ecological Mmonographs 69 (1): 1–24. https://doi.org/10.1890/0012-9615(1999)069[0001:DBRATM]2.0.CO;2.
Legendre, Pierre, and Louis Legendre. 2012. Numerical Ecology. Elsevier.
Legendre, Pierre, Jari Oksanen, and Cajo JF ter-Braak. 2011. “Testing the Significance of Canonical Axes in Redundancy Analysis.” Methods in Ecology and Evolution 2 (3): 269–77. https://doi.org/10.1111/j.2041-210X.2010.00078.x.
Mantel, Nathan. 1967. “The Detection of Disease Clustering and a Generalized Regression Approach.” Cancer Research 27 (2 Part 1): 209–20.
Martin, Marcel. 2011. “Cutadapt Removes Adapter Sequences from High-Throughput Sequencing Reads.” EMBnet. Journal 17 (1): 10–12. https://doi.org/10.14806/ej.17.1.200.
McMurdie, Paul J, and Susan Holmes. 2013. “Phyloseq: An r Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data.” PLoS One 8 (4): e61217. https://doi.org/10.1371/journal.pone.0061217.
Nilsson, Rolf Henrik, Karl-Henrik Larsson, Andy F S Taylor, Johan Bengtsson-Palme, Thomas S Jeppesen, Dmitry Schigel, Peter Kennedy, et al. 2019. “The UNITE Database for Molecular Identification of Fungi: Handling Dark Taxa and Parallel Taxonomic Classifications.” Nucleic Acids Research 47 (D1): D259–64. https://doi.org/10.1093/nar/gky1022.
Oksanen, Jari, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, Peter R. Minchin, R. B. O’Hara, Gavin L. Simpson, Peter Sólymos, M. Henry H. Stevens, and Helene Wagner. 2012. “Vegan: Community Ecology Package.” R Package Version 2 (0). https://cran.r-project.org/web/packages/vegan/index.html.
Peterson, Ryan A. 2021. “Finding Optimal Normalizing Transformations via bestNormalize.” The R Journal.
Peterson, Ryan A, and Joseph E Cavanaugh. 2019. “Ordered Quantile Normalization: A Semiparametric Transformation Built for the Cross-Validation Era.” Journal of Applied Statistics. https://doi.org/10.1080/02664763.2019.1630372.
Quast, Christian, Elmar Pruesse, Pelin Yilmaz, Jan Gerken, Timmy Schweer, Pablo Yarza, Jörg Peplies, and Frank Oliver Glöckner. 2012. “The SILVA Ribosomal RNA Gene Database Project: Improved Data Processing and Web-Based Tools.” Nucleic Acids Research 41 (D1): D590–96. https://doi.org/10.1093/nar/gks1219.
Roberts, David W, and Maintainer David W Roberts. 2016. “Package ‘Labdsv’.” Ordination and Multivariate 775. https://cran.r-project.org/web/packages/labdsv/index.html.
Roesch, Luiz Fernando W, Priscila T Dobbler, Victor S Pylro, Bryan Kolaczkowski, Jennifer C Drew, and Eric W Triplett. 2020. “PIME: A Package for Discovery of Novel Differences Among Microbial Communities.” Molecular Ecology Resources 20 (2): 415–28. https://doi.org/10.1111/1755-0998.13116.
Roswell, Michael, Jonathan Dushoff, and Rachael Winfree. 2021. “A Conceptual Guide to Measuring Species Diversity.” Oikos 130 (3): 321–38. https://doi.org/10.1111/oik.07202.
Segata, Nicola, Jacques Izard, Levi Waldron, Dirk Gevers, Larisa Miropolsky, Wendy S Garrett, and Curtis Huttenhower. 2011. “Metagenomic Biomarker Discovery and Explanation.” Genome Biology 12 (6): 1–18. https://doi.org/10.1186/gb-2011-12-6-r60.
Shapiro, Samuel Sanford, and Martin B Wilk. 1965. “An Analysis of Variance Test for Normality (Complete Samples).” Biometrika 52 (3/4): 591–611. https://doi.org/10.2307/2333709 .
Smirnova, Ekaterina, Snehalata Huzurbazar, and Farhad Jafari. 2019. “PERFect: PERmutation Filtering Test for Microbiome Data.” Biostatistics 20 (4): 615–31. https://doi.org/10.1093/biostatistics/kxy020.
Team, R Core. n.d. “R: A Language and Environment for Statistical Computing.” https://www.R-project.org/.
Wang, Qiong, George M Garrity, James M Tiedje, and James R Cole. 2007. “Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy.” Applied and Environmental Microbiology 73 (16): 5261–67. https://doi.org/10.1128/AEM.00062-07.
White, Thomas J, Thomas Bruns, SJWT Lee, John Taylor, et al. 1990. “Amplification and Direct Sequencing of Fungal Ribosomal RNA Genes for Phylogenetics.” PCR Protocols: A Guide to Methods and Applications 18 (1): 315–22.