Click here for setup information.

knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
set.seed(119)
library(phyloseq); packageVersion("phyloseq")
library(Biostrings); packageVersion("Biostrings")
pacman::p_load(tidyverse, 
               microbiome, phytools, phangorn, reactable, Matrix, 
               pairwiseAdonis, naniar, downloadthis,
               labdsv, patchwork, agricolae, captioner, microeco,
               install = FALSE, update = FALSE)

#pacman::p_depends(agricolae, local = TRUE)  
#pacman::p_depends_reverse(agricolae, local = TRUE)  

options(scipen=999)
knitr::opts_current$get(c(
  "cache",
  "cache.path",
  "cache.rebuild",
  "dependson",
  "autodep"
))

Synopsis

This workflow contains beta diversity assessments for the high temperature data sets. In order to run the workflow, you either need to first run the DADA2 Workflow for 2018 High Temp samples, the Data Preparation workflow, and the Alpha diversity workflow or download the files linked below. See the Data Availability page for complete details.

Workflow Input

Files needed to run this workflow can be downloaded from figshare.

16s rRNA

In order to test for significance between sample groups, we must perform the following steps:

Transform sample counts to relative abundance.
Create distance matrices based on some metrics, for example wunifrac, unifrac, and jsd. For this we use the function phyloseq::distance.
Next, calculate beta dispersion using the betadisper function from the vegan package (Oksanen et al. 2012).
Then, use the function permutest to run a Permutation test for homogeneity of multivariate dispersions.
If the beta dispersion tests are not significant we will run a PERMANOVA using adonis (since PERMANOVA assumes equal dispersion), otherwise we will use Analysis of Similarity (ANOSIM), both available in the vegan package (Oksanen et al. 2012).

Steps 2–4 are combined in a for loop that tests all three distance metrics.

Before we begin, we need to first transform sample counts to relative abundance and generate some new trees.

Code

ssu_samp_ps <- c("ssu18_ps_work", "ssu18_ps_filt", 
             "ssu18_ps_perfect", "ssu18_ps_pime")
for (i in ssu_samp_ps) {
     tmp_name <- purrr::map_chr(i, ~ paste0(., "_prop"))
     tmp_get <- get(i)
     tmp_ps <- transform_sample_counts(tmp_get, function(otu) 
                                          1e5 * otu/sum(otu))
     tmp_ps@phy_tree <- NULL
     tmp_ps <- prune_samples(sample_sums(tmp_ps) > 0, tmp_ps)
     tmp_tree <- rtree(ntaxa(tmp_ps), rooted = TRUE, 
                       tip.label = taxa_names(tmp_ps))
     tmp_ps <- merge_phyloseq(tmp_ps, sample_data, tmp_tree)
     print(tmp_name)
     assign(tmp_name, tmp_ps)
     rm(list = ls(pattern = "tmp_"))
}

Beta Dispersion

Here we create distance matrices for each metric, calculate the beta dispersion, and run a permutation test for homogeneity of multivariate dispersions.

Code

ssu_dist <- c("jsd", "unifrac", "wunifrac")
for (i in ssu_samp_ps) {
    for (d in ssu_dist){
       tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_prop")))
       tmp_samp <- data.frame(sample_data(tmp_get))
       tmp_df <- phyloseq::distance(tmp_get, method = d)
       tmp_df_name <- purrr::map_chr(d, ~ paste0(i, "_beta_dist_", .))
       assign(tmp_df_name, tmp_df)
       tmp_df2 <- betadisper(tmp_df, tmp_samp$TEMP, bias.adjust = TRUE)
       tmp_df_name2 <- purrr::map_chr(d, ~ paste0(i, "_beta_dispersion_", .))
       assign(tmp_df_name2, tmp_df2)
       tmp_df3 <- permutest(tmp_df2, pairwise = TRUE,
                            permutations = 1000, binary = FALSE)
       tmp_df_name3 <- purrr::map_chr(d, ~ paste0(i, "_permutest_", .))
       assign(tmp_df_name3, tmp_df3)
       rm(list = ls(pattern = "tmp_"))
    }
}
objects()

Detailed results of Beta Dispersion & Permutation tests


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.018639 0.0093195 2.9882   1000 0.1089
Residuals 12 0.037425 0.0031188                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3      8
0          0.465534 0.1638
3 0.412469          0.0599
8 0.159005 0.075347       

#################################################### 
BETA DISPERSION significance test unifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq    Mean Sq      F N.Perm  Pr(>F)   
Groups     2 0.0062292 0.00311458 5.3484   1000 0.00999 **
Residuals 12 0.0069881 0.00058234                         
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.831169 0.012
3 0.727926          0.032
8 0.033203 0.048150      

#################################################### 
BETA DISPERSION significance test wunifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.018783 0.0093917 7.3178   1000 0.005994 **
Residuals 12 0.015401 0.0012834                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.957043 0.015
3 0.938214          0.004
8 0.024555 0.019472


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm  Pr(>F)  
Groups     2 0.019867 0.0099333 3.0824   1000 0.08891 .
Residuals 12 0.038671 0.0032226                        
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3      8
0          0.854146 0.0859
3 0.843567          0.0869
8 0.112793 0.090799       

#################################################### 
BETA DISPERSION significance test unifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.025430 0.0127152 7.0788   1000 0.005994 **
Residuals 12 0.021555 0.0017962                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.583417 0.011
3 0.568432          0.018
8 0.018174 0.023627      

#################################################### 
BETA DISPERSION significance test wunifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.023408 0.0117041 6.7591   1000 0.004995 **
Residuals 12 0.020779 0.0017316                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.565435 0.008
3 0.528192          0.013
8 0.023244 0.032852


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm  Pr(>F)  
Groups     2 0.027097 0.0135487 4.5971   1000 0.02098 *
Residuals 12 0.035367 0.0029472                        
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.778222 0.044
3 0.771551          0.027
8 0.061934 0.045555      

#################################################### 
BETA DISPERSION significance test unifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.034618 0.0173088 9.4781   1000 0.003996 **
Residuals 12 0.021914 0.0018262                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
          0         3     8
0           0.5684316 0.008
3 0.5529721           0.008
8 0.0088715 0.0096445      

#################################################### 
BETA DISPERSION significance test wunifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.020738 0.0103692 7.5898   1000 0.002997 **
Residuals 12 0.016395 0.0013662                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.547453 0.005
3 0.516927          0.003
8 0.027061 0.017081


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.007284 0.0036418 0.9199   1000 0.4356
Residuals 12 0.047509 0.0039591                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.84515 0.3876
3 0.82435         0.3407
8 0.37620 0.31969       

#################################################### 
BETA DISPERSION significance test unifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.016868 0.0084341 1.4936   1000 0.2398
Residuals 12 0.067761 0.0056468                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.97902 0.2617
3 0.97444         0.2058
8 0.24180 0.21934       

#################################################### 
BETA DISPERSION significance test wunifrac distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.025802 0.0129010 8.6861   1000 0.008991 **
Residuals 12 0.017823 0.0014852                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3     8
0          0.982018 0.006
3 0.982289          0.004
8 0.015248 0.014089

Remember, if the beta dispersion p-value is greater than 0.05 we use PERMANOVA, otherwise we use ANOSIM.

(16S rRNA) Table 1 | Summary of Beta Dispersion Tests for unfiltered & filtered data sets against various distance metrics. The test column shows the method chosen (based on the test results) to assess differences in beta diversity. If p-values are significant (red, p-value < 0.05) we used ANOSIM, otherwise we used PERMANOVA (blue, p-value > 0.05).

Now we to create sample data frames for each data set.

Code

for (i in ssu_samp_ps) {
     tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_prop")))
     tmp_samp <- data.frame(sample_data(tmp_get))
     tmp_name <- purrr::map_chr(i, ~ paste0(., "_sampledf"))
     assign(tmp_name, tmp_samp)
     rm(list = ls(pattern = "tmp_"))
}
objects(pattern="_sampledf")

Significance Tests

Depending on the test, we need different data structures. For adonis we use the sample metadata and for anosim we will start with a phyloseq object. So for each data set we create two variables: adonis_sampledf and anosim_data

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

ssu18_ps_work_adonis_jsd <-  adonis(ssu18_ps_work_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
ssu18_ps_work_adonis2_jsd <- adonis2(ssu18_ps_work_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the unifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_work_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_work_anosim_unifrac <-
  anosim(phyloseq::distance(ssu18_ps_work_prop, "unifrac"),
         grouping = ssu18_ps_work_groups)

Is the p-value of the wunifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_work_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_work_anosim_wunifrac <-
  anosim(phyloseq::distance(ssu18_ps_work_prop, "wunifrac"),
         grouping = ssu18_ps_work_groups)

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

ssu18_ps_filt_adonis_jsd <-  adonis(ssu18_ps_filt_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
ssu18_ps_filt_adonis2_jsd <- adonis2(ssu18_ps_filt_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the unifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_filt_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_filt_anosim_unifrac <-
  anosim(phyloseq::distance(ssu18_ps_filt_prop, "unifrac"),
         grouping = ssu18_ps_filt_groups)

Is the p-value of the wunifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_filt_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_filt_anosim_wunifrac <-
  anosim(phyloseq::distance(ssu18_ps_filt_prop, "wunifrac"),
         grouping = ssu18_ps_filt_groups)

Is the p-value of the jsd distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_perfect_anosim_jsd <-
  anosim(phyloseq::distance(ssu18_ps_perfect_prop, "jsd"),
         grouping = ssu18_ps_perfect_groups)

Is the p-value of the unifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_perfect_anosim_unifrac <-
  anosim(phyloseq::distance(ssu18_ps_perfect_prop, "unifrac"),
         grouping = ssu18_ps_perfect_groups)

Is the p-value of the wunifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_perfect_anosim_wunifrac <-
  anosim(phyloseq::distance(ssu18_ps_perfect_prop, "wunifrac"),
         grouping = ssu18_ps_perfect_groups)

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

ssu18_ps_pime_adonis_jsd <-  adonis(ssu18_ps_pime_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
ssu18_ps_pime_adonis2_jsd <- adonis2(ssu18_ps_pime_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the unifrac distance metric less than 0.05? FALSE.
Then we use ADONIS.

ssu18_ps_pime_adonis_unifrac <-  adonis(ssu18_ps_pime_beta_dist_unifrac ~ TEMP,
                                     data = adonis_sampledf, permutations = 1000)
ssu18_ps_pime_adonis2_unifrac <- adonis2(ssu18_ps_pime_beta_dist_unifrac ~ TEMP,
                                      data = adonis_sampledf, permutations = 1000)

Is the p-value of the wunifrac distance metric less than 0.05? TRUE.
Then we use ANOSIM.

ssu18_ps_pime_groups <- get_variable(anosim_data, "TEMP")
ssu18_ps_pime_anosim_wunifrac <-
  anosim(phyloseq::distance(ssu18_ps_pime_prop, "wunifrac"),
         grouping = ssu18_ps_pime_groups)

Detailed results of Significance tests


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = ssu18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F   Pr(>F)   
TEMP      2  0.21809 0.23826 1.8767 0.003996 **
Residual 12  0.69727 0.76174                   
Total    14  0.91536 1.00000                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***ANOSIM for Unweighted UniFrac distance, `unifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_work_prop, "unifrac"),      grouping = ssu18_ps_work_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.2356 
      Significance: 0.003 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
   90%    95%  97.5%    99% 
0.0873 0.1130 0.1396 0.1725 

Dissimilarity ranks between and within classes:
        0%   25%  50%   75% 100%  N
Between  1 32.50 58.0 81.00  103 75
0        2 10.25 17.5 33.25   56 10
3        4 14.75 26.5 41.00   65 10
8       39 69.75 85.5 94.25  105 10


 ***ANOSIM for Weighted-UniFrac distance, `wunifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_work_prop, "wunifrac"),      grouping = ssu18_ps_work_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.3218 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
  90%   95% 97.5%   99% 
0.090 0.120 0.149 0.176 

Dissimilarity ranks between and within classes:
        0%   25%  50%    75% 100%  N
Between  2 36.50 59.0  81.50  102 75
0        3  9.75 20.5  35.00   50 10
3        1 14.50 19.0  27.75   48 10
8       17 66.75 80.5 102.00  105 10


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = ssu18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F  Pr(>F)   
TEMP      2  0.15050 0.29049 2.4566 0.00999 **
Residual 12  0.36758 0.70951                  
Total    14  0.51808 1.00000                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***ANOSIM for Unweighted UniFrac distance, `unifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_filt_prop, "unifrac"),      grouping = ssu18_ps_filt_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.2427 
      Significance: 0.009 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
  90%   95% 97.5%   99% 
0.109 0.136 0.169 0.225 

Dissimilarity ranks between and within classes:
        0%   25%  50%   75% 100%  N
Between  2 35.00 58.0 81.50  102 75
0        1  7.75 17.5 34.75   59 10
3       13 23.00 28.5 34.50   57 10
8       11 66.50 81.5 99.50  105 10


 ***ANOSIM for Weighted-UniFrac distance, `wunifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_filt_prop, "wunifrac"),      grouping = ssu18_ps_filt_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.3618 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
  90%   95% 97.5%   99% 
0.103 0.140 0.173 0.216 

Dissimilarity ranks between and within classes:
        0%   25% 50%   75% 100%  N
Between  3 37.50  61 82.50  104 75
0        1  7.25  14 27.25   36 10
3        2 13.00  27 37.75   43 10
8       23 59.25  76 99.00  105 10


 ***ANOSIM for Jensen-Shannon Divergence, `jsd`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_perfect_prop, "jsd"),      grouping = ssu18_ps_perfect_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.2996 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
   90%    95%  97.5%    99% 
0.0999 0.1450 0.1770 0.2081 

Dissimilarity ranks between and within classes:
        0%  25%  50%   75% 100%  N
Between  2 33.5 61.0 83.50  104 75
0        5 15.5 19.5 44.75   59 10
3        1  9.5 23.0 29.75   57 10
8       39 53.5 75.0 95.50  105 10


 ***ANOSIM for Unweighted UniFrac distance, `unifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_perfect_prop, "unifrac"),      grouping = ssu18_ps_perfect_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.2053 
      Significance: 0.007 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
   90%    95%  97.5%    99% 
0.0908 0.1191 0.1441 0.1814 

Dissimilarity ranks between and within classes:
        0%   25% 50%    75% 100%  N
Between  5 31.00  59  81.50  102 75
0        1  4.75  14  36.25   56 10
3        4 21.75  34  43.50   53 10
8       52 70.50  80 100.00  105 10


 ***ANOSIM for Weighted-UniFrac distance, `wunifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_perfect_prop, "wunifrac"),      grouping = ssu18_ps_perfect_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.3378 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
   90%    95%  97.5%    99% 
0.0942 0.1213 0.1538 0.1831 

Dissimilarity ranks between and within classes:
        0%   25%  50%   75% 100%  N
Between  2 36.00 61.0 80.50  105 75
0        1 20.00 26.5 35.50   40 10
3        3  6.25 15.5 20.75   42 10
8       14 61.00 92.5 99.25  104 10


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = ssu18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F   Pr(>F)    
TEMP      2  0.23994 0.59595 8.8496 0.000999 ***
Residual 12  0.16268 0.40405                    
Total    14  0.40262 1.00000                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***PERMANOVA for Unweighted UniFrac distance, `unifrac`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = ssu18_ps_pime_beta_dist_unifrac ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F   Pr(>F)    
TEMP      2  0.78789 0.64853 11.071 0.000999 ***
Residual 12  0.42699 0.35147                    
Total    14  1.21488 1.00000                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***ANOSIM for Weighted-UniFrac distance, `wunifrac`***


Call:
anosim(x = phyloseq::distance(ssu18_ps_pime_prop, "wunifrac"),      grouping = ssu18_ps_pime_groups) 
Dissimilarity: 

ANOSIM statistic R: 0.5316 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
  90%   95% 97.5%   99% 
0.114 0.157 0.207 0.262 

Dissimilarity ranks between and within classes:
        0%   25% 50%  75% 100%  N
Between  9 41.00  62 85.5  105 75
0        1  8.75  17 28.5   37 10
3        2  5.50  17 26.5   39 10
8        4 50.50  73 81.0   97 10

Summaries

Here is a quick summary of significance tests for the data sets against three distance matrices.

(16S rRNA) Table 2 | Summary of significant tests for the unfiltered & filtered data sets. P-values in red indicate significant differences (p-value < 0.05).

ITS

In order to test for significance between sample groups, we must perform the following steps:

Transform sample counts to relative abundance.
Create distance matrices based on some metrics, for example gower, bray, and jsd. For this we use the function phyloseq::distance.
Next, calculate beta dispersion using the betadisper function from the vegan package (Oksanen et al. 2012).
Then, use the function permutest to run a Permutation test for homogeneity of multivariate dispersions.
If the beta dispersion tests are not significant we will run a PERMANOVA using adonis (since PERMANOVA aitsmes equal dispersion), otherwise we will use Analysis of Similarity (ANOSIM), both available in the vegan package (Oksanen et al. 2012).

Steps 2–4 are combined in a for loop that tests all three distance metrics.

Before we begin, we need to first transform sample counts to relative abundance and generate some new trees.

Code

its_samp_ps <- c("its18_ps_work", "its18_ps_filt", 
             "its18_ps_perfect", "its18_ps_pime")
for (i in its_samp_ps) {
     tmp_name <- purrr::map_chr(i, ~ paste0(., "_prop"))
     tmp_get <- get(i)
     tmp_ps <- transform_sample_counts(tmp_get, function(otu) 
                                          1e5 * otu/sum(otu))
     tmp_ps@phy_tree <- NULL
     tmp_ps <- prune_samples(sample_sums(tmp_ps) > 0, tmp_ps)
     tmp_ps <- merge_phyloseq(tmp_ps, sample_data)
     print(tmp_name)
     assign(tmp_name, tmp_ps)
     rm(list = ls(pattern = "tmp_"))
}

Beta Dispersion

Here we create distance matrices for each metric, calculate the beta dispersion, and run a permutation test for homogeneity of multivariate dispersions.

Code

its_dist <- c("jsd", "bray", "gower")
for (i in its_samp_ps) {
    for (d in its_dist){
       tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_prop")))
       tmp_samp <- data.frame(sample_data(tmp_get))
       tmp_df <- phyloseq::distance(tmp_get, method = d)
       tmp_df_name <- purrr::map_chr(d, ~ paste0(i, "_beta_dist_", .))
       assign(tmp_df_name, tmp_df)
       tmp_df2 <- betadisper(tmp_df, tmp_samp$TEMP, bias.adjust = TRUE)
       tmp_df_name2 <- purrr::map_chr(d, ~ paste0(i, "_beta_dispersion_", .))
       assign(tmp_df_name2, tmp_df2)
       tmp_df3 <- permutest(tmp_df2, pairwise = TRUE,
                            permutations = 1000, binary = FALSE)
       tmp_df_name3 <- purrr::map_chr(d, ~ paste0(i, "_permutest_", .))
       assign(tmp_df_name3, tmp_df3)
       rm(list = ls(pattern = "tmp_"))
    }
}
objects()

Detailed results of Beta Dispersion & Permutation tests


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq   Mean Sq     F N.Perm Pr(>F)
Groups     2 0.0016626 0.0008313 0.272   1000 0.7522
Residuals 10 0.0305600 0.0030560                    

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.65734 0.7612
3 0.65549         0.4925
8 0.75785 0.49503       

#################################################### 
BETA DISPERSION significance test bray distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.002828 0.0014140 0.3342   1000 0.7093
Residuals 10 0.042306 0.0042306                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.51548 0.8501
3 0.53577         0.4525
8 0.85014 0.48472       

#################################################### 
BETA DISPERSION significance test gower distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.0029899 0.0014950 1.3038   1000 0.3247
Residuals 10 0.0114662 0.0011466                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.78921 0.2058
3 0.75923         0.3506
8 0.22060 0.33265


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.001023 0.0005116 0.0745   1000 0.9101
Residuals 10 0.068640 0.0068640                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.68831 0.9301
3 0.67472         0.8062
8 0.92769 0.81555       

#################################################### 
BETA DISPERSION significance test bray distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq  Mean Sq      F N.Perm Pr(>F)
Groups     2 0.002408 0.001204 0.1338   1000 0.8801
Residuals 10 0.090000 0.009000                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.58042 0.9061
3 0.58847         0.7582
8 0.90501 0.75350       

#################################################### 
BETA DISPERSION significance test gower distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.0058376 0.0029188 2.2494   1000 0.1419
Residuals 10 0.0129758 0.0012976                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3      8
0          0.061938 0.4555
3 0.074044          0.2847
8 0.464903 0.264060


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.001059 0.0005293 0.1168   1000 0.8831
Residuals 10 0.045317 0.0045317                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.69231 0.9131
3 0.71118         0.6683
8 0.92002 0.67220       

#################################################### 
BETA DISPERSION significance test bray distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.002522 0.0012611 0.2056   1000 0.8062
Residuals 10 0.061340 0.0061340                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.57443 0.9950
3 0.58316         0.6224
8 0.99819 0.62424       

#################################################### 
BETA DISPERSION significance test gower distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.0023433 0.0011716 2.0201   1000 0.2138
Residuals 10 0.0058000 0.0005800                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
         0        3      8
0          0.868132 0.1029
3 0.869601          0.2677
8 0.091826 0.229457


#################################################### 
BETA DISPERSION significance test jsd distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.007678 0.0038389 0.3436   1000 0.7163
Residuals 10 0.111712 0.0111712                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.47253 0.4805
3 0.49362         0.9760
8 0.47799 0.97911       

#################################################### 
BETA DISPERSION significance test bray distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.008276 0.0041378 0.2627   1000 0.7602
Residuals 10 0.157522 0.0157522                     

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
        0       3      8
0         0.46753 0.5874
3 0.51164         0.8821
8 0.58541 0.88566       

#################################################### 
BETA DISPERSION significance test gower distance 
####################################################

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 1000

Response: Distances
          Df    Sum Sq   Mean Sq      F N.Perm   Pr(>F)   
Groups     2 0.0165873 0.0082936 15.047   1000 0.001998 **
Residuals 10 0.0055118 0.0005512                          
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Pairwise comparisons:
(Observed p-value below diagonal, permuted p-value above diagonal)
           0          3     8
0            0.05594406 0.002
3 0.04179289            0.040
8 0.00072543 0.03074504

Remember, if the beta dispersion p-value is greater than 0.05 we use PERMANOVA, otherwise we use ANOSIM.

(ITS) Table 1 | Summary of Beta Dispersion Tests for unfiltered & filtered data sets against various distance metrics. The test column shows the method chosen (based on the test results) to assess differences in beta diversity. If p-values are significant (red, p-value < 0.05) we used ANOSIM, otherwise we used PERMANOVA (blue, p-value > 0.05).

Now we to create sample data frames for each data set.

Code

for (i in its_samp_ps) {
     tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_prop")))
     tmp_samp <- data.frame(sample_data(tmp_get))
     tmp_name <- purrr::map_chr(i, ~ paste0(., "_sampledf"))
     assign(tmp_name, tmp_samp)
     rm(list = ls(pattern = "tmp_"))
}
objects(pattern="_sampledf")

Significance Tests

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_work_adonis_jsd <-  adonis(its18_ps_work_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_work_adonis2_jsd <- adonis2(its18_ps_work_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the bray distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_work_adonis_bray <-  adonis(its18_ps_work_beta_dist_bray ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_work_adonis2_bray <- adonis2(its18_ps_work_beta_dist_bray ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the gower distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_work_adonis_gower <-  adonis(its18_ps_work_beta_dist_gower ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_work_adonis2_gower <- adonis2(its18_ps_work_beta_dist_gower ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_filt_adonis_jsd <-  adonis(its18_ps_filt_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_filt_adonis2_jsd <- adonis2(its18_ps_filt_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the bray distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_filt_adonis_bray <-  adonis(its18_ps_filt_beta_dist_bray ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_filt_adonis2_bray <- adonis2(its18_ps_filt_beta_dist_bray ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the gower distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_filt_adonis_gower <-  adonis(its18_ps_filt_beta_dist_gower ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_filt_adonis2_gower <- adonis2(its18_ps_filt_beta_dist_gower ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_perfect_adonis_jsd <-  adonis(its18_ps_perfect_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_perfect_adonis2_jsd <- adonis2(its18_ps_perfect_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the bray distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_perfect_adonis_bray <-  adonis(its18_ps_perfect_beta_dist_bray ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_perfect_adonis2_bray <- adonis2(its18_ps_perfect_beta_dist_bray ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the gower distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_perfect_adonis_gower <-  adonis(its18_ps_perfect_beta_dist_gower ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_perfect_adonis2_gower <- adonis2(its18_ps_perfect_beta_dist_gower ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the jsd distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_pime_adonis_jsd <-  adonis(its18_ps_pime_beta_dist_jsd ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_pime_adonis2_jsd <- adonis2(its18_ps_pime_beta_dist_jsd ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the bray distance metric less than 0.05? FALSE.
Then we use ADONIS.

its18_ps_pime_adonis_bray <-  adonis(its18_ps_pime_beta_dist_bray ~ TEMP,
                                 data = adonis_sampledf, permutations = 1000)
its18_ps_pime_adonis2_bray <- adonis2(its18_ps_pime_beta_dist_bray ~ TEMP,
                                  data = adonis_sampledf, permutations = 1000)

Is the p-value of the gower distance metric less than 0.05? TRUE.
Then we use ANOSIM.

its18_ps_pime_groups <- get_variable(anosim_data, "TEMP")
its18_ps_pime_anosim_gower <-
  anosim(phyloseq::distance(its18_ps_pime_prop, "gower"),
         grouping = its18_ps_pime_groups)

Detailed results of Significance tests.


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = its18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs     R2      F  Pr(>F)  
TEMP      2  0.38087 0.2287 1.4826 0.03796 *
Residual 10  1.28448 0.7713                 
Total    12  1.66534 1.0000                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***PERMANOVA for Bray-Curtis dissimilarity, `bray`***

       Df        SumOfSqs            R2               F        
 Min.   : 2   Min.   :0.9003   Min.   :0.2070   Min.   :1.305  
 1st Qu.: 6   1st Qu.:2.1744   1st Qu.:0.5000   1st Qu.:1.305  
 Median :10   Median :3.4484   Median :0.7930   Median :1.305  
 Mean   : 8   Mean   :2.8992   Mean   :0.6667   Mean   :1.305  
 3rd Qu.:11   3rd Qu.:3.8986   3rd Qu.:0.8965   3rd Qu.:1.305  
 Max.   :12   Max.   :4.3488   Max.   :1.0000   Max.   :1.305  
                                                NA's   :2      
     Pr(>F)       
 Min.   :0.05095  
 1st Qu.:0.05095  
 Median :0.05095  
 Mean   :0.05095  
 3rd Qu.:0.05095  
 Max.   :0.05095  
 NA's   :2


 ***PERMANOVA for Gower distance, `gower`***

       Df        SumOfSqs             R2               F        
 Min.   : 2   Min.   :0.05219   Min.   :0.2016   Min.   :1.263  
 1st Qu.: 6   1st Qu.:0.12941   1st Qu.:0.5000   1st Qu.:1.263  
 Median :10   Median :0.20664   Median :0.7984   Median :1.263  
 Mean   : 8   Mean   :0.17255   Mean   :0.6667   Mean   :1.263  
 3rd Qu.:11   3rd Qu.:0.23273   3rd Qu.:0.8992   3rd Qu.:1.263  
 Max.   :12   Max.   :0.25882   Max.   :1.0000   Max.   :1.263  
                                                 NA's   :2      
     Pr(>F)       
 Min.   :0.00999  
 1st Qu.:0.00999  
 Median :0.00999  
 Mean   :0.00999  
 3rd Qu.:0.00999  
 Max.   :0.00999  
 NA's   :2


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = its18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F  Pr(>F)  
TEMP      2  0.36598 0.25398 1.7022 0.02498 *
Residual 10  1.07499 0.74602                 
Total    12  1.44097 1.00000                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***PERMANOVA for Bray-Curtis dissimilarity, `bray`***

       Df        SumOfSqs            R2               F        
 Min.   : 2   Min.   :0.8899   Min.   :0.2216   Min.   :1.423  
 1st Qu.: 6   1st Qu.:2.0078   1st Qu.:0.5000   1st Qu.:1.423  
 Median :10   Median :3.1257   Median :0.7784   Median :1.423  
 Mean   : 8   Mean   :2.6770   Mean   :0.6667   Mean   :1.423  
 3rd Qu.:11   3rd Qu.:3.5706   3rd Qu.:0.8892   3rd Qu.:1.423  
 Max.   :12   Max.   :4.0155   Max.   :1.0000   Max.   :1.423  
                                                NA's   :2      
     Pr(>F)       
 Min.   :0.03996  
 1st Qu.:0.03996  
 Median :0.03996  
 Mean   :0.03996  
 3rd Qu.:0.03996  
 Max.   :0.03996  
 NA's   :2


 ***PERMANOVA for Gower distance, `gower`***

       Df        SumOfSqs            R2               F        
 Min.   : 2   Min.   :0.1076   Min.   :0.2416   Min.   :1.593  
 1st Qu.: 6   1st Qu.:0.2225   1st Qu.:0.5000   1st Qu.:1.593  
 Median :10   Median :0.3375   Median :0.7584   Median :1.593  
 Mean   : 8   Mean   :0.2967   Mean   :0.6667   Mean   :1.593  
 3rd Qu.:11   3rd Qu.:0.3913   3rd Qu.:0.8792   3rd Qu.:1.593  
 Max.   :12   Max.   :0.4451   Max.   :1.0000   Max.   :1.593  
                                                NA's   :2      
     Pr(>F)        
 Min.   :0.006993  
 1st Qu.:0.006993  
 Median :0.006993  
 Mean   :0.006993  
 3rd Qu.:0.006993  
 Max.   :0.006993  
 NA's   :2


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = its18_ps_perfect_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2      F  Pr(>F)  
TEMP      2  0.38311 0.23194 1.5099 0.04296 *
Residual 10  1.26864 0.76806                 
Total    12  1.65176 1.00000                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***PERMANOVA for Bray-Curtis dissimilarity, `bray`***

       Df        SumOfSqs            R2               F        
 Min.   : 2   Min.   :0.8938   Min.   :0.2056   Min.   :1.294  
 1st Qu.: 6   1st Qu.:2.1734   1st Qu.:0.5000   1st Qu.:1.294  
 Median :10   Median :3.4530   Median :0.7944   Median :1.294  
 Mean   : 8   Mean   :2.8978   Mean   :0.6667   Mean   :1.294  
 3rd Qu.:11   3rd Qu.:3.8999   3rd Qu.:0.8972   3rd Qu.:1.294  
 Max.   :12   Max.   :4.3468   Max.   :1.0000   Max.   :1.294  
                                                NA's   :2      
     Pr(>F)       
 Min.   :0.07293  
 1st Qu.:0.07293  
 Median :0.07293  
 Mean   :0.07293  
 3rd Qu.:0.07293  
 Max.   :0.07293  
 NA's   :2


 ***PERMANOVA for Gower distance, `gower`***

       Df        SumOfSqs            R2               F        
 Min.   : 2   Min.   :0.0647   Min.   :0.2319   Min.   :1.509  
 1st Qu.: 6   1st Qu.:0.1395   1st Qu.:0.5000   1st Qu.:1.509  
 Median :10   Median :0.2143   Median :0.7681   Median :1.509  
 Mean   : 8   Mean   :0.1860   Mean   :0.6667   Mean   :1.509  
 3rd Qu.:11   3rd Qu.:0.2466   3rd Qu.:0.8841   3rd Qu.:1.509  
 Max.   :12   Max.   :0.2790   Max.   :1.0000   Max.   :1.509  
                                                NA's   :2      
     Pr(>F)        
 Min.   :0.001998  
 1st Qu.:0.001998  
 Median :0.001998  
 Mean   :0.001998  
 3rd Qu.:0.001998  
 Max.   :0.001998  
 NA's   :2


 ***PERMANOVA for Jensen-Shannon Divergence, `jsd`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = its18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs     R2      F   Pr(>F)    
TEMP      2  0.54824 0.4002 3.3361 0.000999 ***
Residual 10  0.82167 0.5998                    
Total    12  1.36991 1.0000                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***PERMANOVA for Bray-Curtis dissimilarity, `bray`***

Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 1000

adonis2(formula = its18_ps_pime_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000)
         Df SumOfSqs      R2     F   Pr(>F)    
TEMP      2   1.1998 0.30872 2.233 0.000999 ***
Residual 10   2.6865 0.69128                   
Total    12   3.8862 1.00000                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 ***ANOSIM for Gower distance, `gower`***


Call:
anosim(x = phyloseq::distance(its18_ps_pime_prop, "gower"), grouping = its18_ps_pime_groups) 
Dissimilarity: gower 

ANOSIM statistic R: 0.6542 
      Significance: 0.001 

Permutation: free
Number of permutations: 999

Upper quantiles of permutations (null model):
  90%   95% 97.5%   99% 
0.151 0.221 0.265 0.346 

Dissimilarity ranks between and within classes:
        0%   25%  50%   75% 100%  N
Between  4 31.50 48.5 64.25   78 56
0       22 25.00 34.5 40.75   57 10
3        8 12.75 15.5 16.75   19  6
8        1  2.25  4.0  5.75    7  6

Summaries

Here is a quick summary of significance tests for the data sets against three distance matrices.

(ITS) Table 2 | Summary of significant tests for the unfiltered & filtered data sets. P-values in red indicate significant differences (p-value < 0.05).

Ordinations

Now we can visualize the Jensen-Shannon Divergence, unweighted-UniFrac, and weighted-UniFrac distance matrices to access dissimilarity among sample. Here we ordinate using Principal Coordinate Analysis (PCoA) but we could also test other methods like NMDS, CCA, etc. We will also test the whether there are any differences in ordination between the phyloseq package and the microeco package.

Plot Code

Code to reproduce plots

We begin with the code for the 16S rRNA data.

First the code for ordination implementation in phyloseq.

ssu18_data_sets <- c("ssu18_ps_work", "ssu18_ps_filt", "ssu18_ps_perfect", "ssu18_ps_pime")
ssu_dist <- c("jsd", "unifrac", "wunifrac")
for (samp_ps in ssu18_data_sets) {
for (d in ssu_dist){
     tmp_get <- get(purrr::map_chr(samp_ps, ~ paste0(., "_prop")))
     ord_meths <- c("NMDS", "PCoA", "CCA", "DCA") # MDS = PCoA, "CCA", "DCA", "DPCoA", "RDA"
     tmp_plist <- plyr::llply(as.list(ord_meths), function(i, physeq, d) {
        ordi = ordinate(physeq, method = i, distance = d)
        plot_ordination(physeq, ordi, "samples", color = "TEMP")
   }, tmp_get, d)

  names(tmp_plist) <- ord_meths

  tmp_df <- plyr::ldply(tmp_plist, function(x){
      df = x$data[, 1:2]
      colnames(df) = c("Axis_1", "Axis_2")
      return(cbind(df, x$data))})
  names(tmp_df)[1] = "method"
  
  tmp_plot <- ggplot(tmp_df, aes(Axis_1, Axis_2, color = TEMP, shape = TEMP, fill = TEMP))
  tmp_plot <- tmp_plot + geom_point(size = 4)
  tmp_plot <- tmp_plot + facet_wrap(~method, scales = "free")
  tmp_plot <- tmp_plot + scale_colour_manual(values = swel_col)
  
  tmp_df_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", .))
  tmp_plist_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_", ., "_plist"))
  tmp_plot_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", ., "_plot"))
  tmp_list <- list("tmp_df_name" = tmp_df, tmp_plist_name = tmp_plist, tmp_plot_name = tmp_plot)
  assign(paste0(samp_ps, "_",  d, "_ord_results"), tmp_list)
  rm(list = ls(pattern = "_tmp"))
 }
}

plist_name <- objects(pattern="_ord_results")
plot_num <- c(1,2,3,4)
for (i in plist_name) {
  for (j in plot_num) {
       tmp_get_i <- get(i)$tmp_plist_name
       tmp_ord <- names(tmp_get_i)[j]
       tmp_name <- stringr::str_replace(i, "ord_results", tmp_ord)
       tmp_plot <- tmp_get_i[[j]] + scale_colour_manual(values = swel_col)
       tmp_plot <- tmp_plot + geom_point(size = 4, aes(shape = TEMP)) +
         theme(legend.position = "bottom")
       tmp_plot$labels$shape <- "TEMP"
       assign(tmp_name, tmp_plot)
       rm(list = ls(pattern = "tmp_"))
  }
}

And now the code for ordination implementation in microeco.

microeco_path <- "files/beta/microeco/"
for (i in ssu18_data_sets) {
  tmp_dataset <- get(purrr::map_chr(i, ~paste0(., "_me")))
  tmp_dataset$cal_betadiv(unifrac = TRUE)
#### CODE TO ADD JSD DISTANCE ####  
  tmp_jsd <- phyloseq::distance(get(i), method = "jsd") 
  tmp_jsd <- forceSymmetric(as.matrix(tmp_jsd), uplo = "L")
  tmp_jsd <- as.matrix(tmp_jsd)
  tmp_dataset$beta_diversity$jsd <- tmp_jsd
#### #### #### #### #### #### #### ####  
  dir.create(paste(microeco_path, i, "/metrics/", sep = ""), recursive = TRUE)
  tmp_dataset$save_betadiv(dirpath = paste(microeco_path, i, "/metrics/", sep = ""))
  rm(list = ls(pattern = "tmp_"))
}

class(ssu18_ps_perfect_me$beta_diversity)

[1] "list"

summary(ssu18_ps_perfect_me$beta_diversity)

              Length Class  Mode   
bray          225    -none- numeric
jaccard       225    -none- numeric
wei_unifrac   225    -none- numeric
unwei_unifrac 225    -none- numeric
jsd           225    -none- numeric

Here I made a custom “function” to run the analysis, plot the graphs, save graph objects, and save plots (as .png and .pdf files). I am sure an actual programmer would be shocked, but it works.

microeco_beta_plot <- function(choose_input, choose_metric, choose_ord) {  
  tmp_dataset <- get(purrr::map_chr(choose_input, ~paste0(., "_me")))
  tmp_t1 <- trans_beta$new(dataset = tmp_dataset, group = "TEMP", measure = choose_metric)
  tmp_t1$cal_ordination(ordination = choose_ord)
  tmp_t1_ord_plot <- tmp_t1$plot_ordination(plot_color = "TEMP", 
                                            plot_shape = "TEMP", 
                                            plot_group_ellipse = FALSE, 
                                            color_values = swel_col, 
                                            shape_values = c(16, 17, 15)) + 
                       geom_point(size = 4) +  theme(legend.position = "bottom")
  tmp_t1$cal_group_distance()
  tmp_t1_within_group_plot <- tmp_t1$plot_group_distance(distance_pair_stat = TRUE, color_values = swel_col)
  tmp_t1$res_group_distance
  tmp_t1$cal_group_distance(within_group = FALSE)
  tmp_t1_btwn_group_plot <- tmp_t1$plot_group_distance(distance_pair_stat = TRUE, color_values = swel_col) 

###### SET names
  tmp_name_ord <- paste(choose_input, "_me_", choose_metric, "_", choose_ord, sep = "")
  tmp_name_wg <- paste(choose_input, "_me_wg_", choose_metric, "_", choose_ord, sep = "")
  tmp_name_bg <- paste(choose_input, "_me_bg_", choose_metric, "_", choose_ord, sep = "")
  
  assign(tmp_name_ord, tmp_t1_ord_plot, envir = parent.frame() )
  assign(tmp_name_wg, tmp_t1_within_group_plot, envir = parent.frame() )
  assign(tmp_name_bg, tmp_t1_btwn_group_plot, envir = parent.frame() )
###### SET paths
  dir.create(paste(microeco_path, choose_input, "/ordination/", sep = ""), recursive = TRUE)
  tmp_path <- paste(microeco_path, choose_input, "/ordination/", sep = "")
  tmp_path_ord <- paste(tmp_path, "ord_plots/", sep = "")
  tmp_path_wg <- paste(tmp_path, "within_group/", sep = "")
  tmp_path_bg <- paste(tmp_path, "between_group/", sep = "")
###### SAVE plots  
  ggplot2::ggsave(tmp_t1_ord_plot, path = paste(tmp_path_ord, sep = ""), filename = paste0(tmp_name_ord, ".png", sep = ""))
  ggplot2::ggsave(tmp_t1_ord_plot, path = paste(tmp_path_ord, sep = ""), filename = paste0(tmp_name_ord, ".pdf", sep = ""))                    
  ggplot2::ggsave(tmp_t1_within_group_plot, path = paste(tmp_path_wg, sep = ""), filename = paste0(tmp_name_wg, ".png", sep = ""))
  ggplot2::ggsave(tmp_t1_within_group_plot, path = paste(tmp_path_wg, sep = ""), filename = paste0(tmp_name_wg, ".pdf", sep = ""))                  
  ggplot2::ggsave(tmp_t1_btwn_group_plot, path = paste(tmp_path_bg, sep = ""), filename = paste0(tmp_name_bg, ".png", sep = ""))
  ggplot2::ggsave(tmp_t1_btwn_group_plot, path = paste(tmp_path_bg, sep = ""), filename = paste0(tmp_name_bg, ".pdf", sep = ""))                  
  rm(list = ls(pattern = "tmp_"))
}

And then run it like so:

microeco_beta_plot(choose_input = "ssu18_ps_pime", choose_metric = "bray", choose_ord = "PCoA")

It would be super cool to have a function that looped through a bunch of data sets, distance metrics, and ordination methods.

for (j in 1:length(get(paste("ssu18_ps_work", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("ssu18_ps_work", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "ssu18_ps_work", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("ssu18_ps_filt", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("ssu18_ps_filt", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "ssu18_ps_filt", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("ssu18_ps_perfect", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("ssu18_ps_perfect", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "ssu18_ps_perfect", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("ssu18_ps_pime", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("ssu18_ps_pime", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "ssu18_ps_pime", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

And now the ITS code.

its18_data_sets <- c("its18_ps_work", "its18_ps_filt", "its18_ps_perfect", "its18_ps_pime")
its_dist <- c("jsd", "bray", "gower")
for (samp_ps in its18_data_sets) {
for (d in its_dist){
     tmp_get <- get(purrr::map_chr(samp_ps, ~ paste0(., "_prop")))
     ord_meths <- c("NMDS", "PCoA", "CCA", "DCA") # MDS = PCoA, "CCA", "DCA", "DPCoA", "RDA"
     tmp_plist <- plyr::llply(as.list(ord_meths), function(i, physeq, d) {
        ordi = ordinate(physeq, method = i, distance = d)
        plot_ordination(physeq, ordi, "samples", color = "TEMP")
   }, tmp_get, d)

  names(tmp_plist) <- ord_meths

  tmp_df <- plyr::ldply(tmp_plist, function(x){
      df = x$data[, 1:2]
      colnames(df) = c("Axis_1", "Axis_2")
      return(cbind(df, x$data))})
  names(tmp_df)[1] = "method"
  
  tmp_plot <- ggplot(tmp_df, aes(Axis_1, Axis_2, color = TEMP, shape = TEMP, fill = TEMP))
  tmp_plot <- tmp_plot + geom_point(size = 4)
  tmp_plot <- tmp_plot + facet_wrap(~method, scales = "free")
  tmp_plot <- tmp_plot + scale_colour_manual(values = swel_col)
  
  tmp_df_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", .))
  tmp_plist_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_", ., "_plist"))
  tmp_plot_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", ., "_plot"))
  tmp_list <- list("tmp_df_name" = tmp_df, tmp_plist_name = tmp_plist, tmp_plot_name = tmp_plot)
  assign(paste0(samp_ps, "_",  d, "_ord_results"), tmp_list)
  rm(list = ls(pattern = "_tmp"))
 }
}

plist_name <- objects(pattern="_ord_results")
plot_num <- c(1,2,3,4)
for (i in plist_name) {
  for (j in plot_num) {
       tmp_get_i <- get(i)$tmp_plist_name
       tmp_ord <- names(tmp_get_i)[j]
       tmp_name <- stringr::str_replace(i, "ord_results", tmp_ord)
       tmp_plot <- tmp_get_i[[j]] + scale_colour_manual(values = swel_col)
       tmp_plot <- tmp_plot + geom_point(size = 4, aes(shape = TEMP)) +
         theme(legend.position = "bottom")
       tmp_plot$labels$shape <- "TEMP"
       assign(tmp_name, tmp_plot)
       rm(list = ls(pattern = "tmp_"))
  }
}

And now the code for ordination implementation in microeco.

microeco_path <- "files/beta/microeco/"
for (i in its18_data_sets) {
  tmp_dataset <- get(purrr::map_chr(i, ~paste0(., "_me")))
  tmp_dataset$cal_betadiv(unifrac = FALSE)
#### CODE TO ADD JSD DISTANCE ####  
  tmp_jsd <- phyloseq::distance(get(i), method = "jsd") 
  tmp_jsd <- forceSymmetric(as.matrix(tmp_jsd), uplo = "L")
  tmp_jsd <- as.matrix(tmp_jsd)
  tmp_dataset$beta_diversity$jsd <- tmp_jsd
#### CODE TO ADD GOWER DISTANCE ####  
  tmp_gower <- phyloseq::distance(get(i), method = "gower") 
  tmp_gower <- forceSymmetric(as.matrix(tmp_gower), uplo = "L")
  tmp_gower <- as.matrix(tmp_gower)
  tmp_dataset$beta_diversity$gower <- tmp_gower
#### #### #### #### #### #### #### ####  
  dir.create(paste(microeco_path, i, "/metrics/", sep = ""), recursive = TRUE)
  tmp_dataset$save_betadiv(dirpath = paste(microeco_path, i, "/metrics/", sep = ""))
  rm(list = ls(pattern = "tmp_"))
}

class(its18_ps_perfect_me$beta_diversity)

[1] "list"

summary(its18_ps_perfect_me$beta_diversity)

        Length Class  Mode   
bray    169    -none- numeric
jaccard 169    -none- numeric
jsd     169    -none- numeric
gower   169    -none- numeric

Here we call the custom function described above.

for (j in 1:length(get(paste("its18_ps_work", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("its18_ps_work", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "its18_ps_work", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("its18_ps_filt", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("its18_ps_filt", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "its18_ps_filt", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("its18_ps_perfect", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("its18_ps_perfect", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "its18_ps_perfect", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

for (j in 1:length(get(paste("its18_ps_pime", "_me", sep = ""))$beta_diversity)) {
  tmp_metric <- names(get(paste("its18_ps_pime", "_me", sep = ""))$beta_diversity[j])
  microeco_beta_plot(choose_input = "its18_ps_pime", choose_metric = tmp_metric, choose_ord = "PCoA")
  rm(list = ls(pattern = "tmp_"))
}

16S rRNA

(16S rRNA) Figure 1 | PCoA ordination plots of the FULL data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted).

(16S rRNA) Figure 2 | PCoA ordination plots of the Arbitrary filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted).

(16S rRNA) Figure 3 | PCoA ordination plots of the PERfect filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted).

(16S rRNA) Figure 4 | PCoA ordination plots of the PIME filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted).

ITS

(ITS) Figure 1 | PCoA ordination plots of the FULL data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance.

(ITS) Figure 2 | PCoA ordination plots of the Arbitrary filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance.

(ITS) Figure 3 | PCoA ordination plots of the PERfect filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance.

(ITS) Figure 4 | PCoA ordination plots of the PIME filtered data set. Top = phyloseq, bottom = microeco; Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance.

Within & Between Group Distances

Above we calculated the sample distances within groups. Let’s see what those plots look like.

16S rRNA

(16S rRNA) Figure 5 | Within (top) and between (bottom) group variation plots of the FULL data set. Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted). Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(16S rRNA) Figure 6 | Within (top) and between (bottom) group variation plots of the Arbitrary filtered data set. Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted). Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(16S rRNA) Figure 7 | Within (top) and between (bottom) group variation plots of the PERfect filtered data set. Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted). Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(16S rRNA) Figure 8 | Within (top) and between (bottom) group variation plots of the PIME filtered data set. Left = Jensen-Shannon Divergence, middle = UniFrac (unwieghted), right = UniFrac (wieghted). Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

ITS

(ITS) Figure 5 | Within (top) and between (bottom) group variation plots of the FULL data set. Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance. Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(ITS) Figure 6 | Within (top) and between (bottom) group variation plots of the Arbitrary filtered data set. Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance. Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(ITS) Figure 7 | Within (top) and between (bottom) group variation plots of the PERfect filtered data set. Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance. Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

(ITS) Figure 8 | Within (top) and between (bottom) group variation plots of the PIME filtered data set. Left = Jensen-Shannon Divergence, middle = Bray-Curtis dissimilarity, right = Gower distance. Significant differences denoted by asterisks (* p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001). ns = not significant.

Workflow Output

Data products generated in this workflow can be downloaded from figshare.

Next workflow:
7. Differentially Abundant ASVs & Taxa Previous workflow:
5. Alpha Diversity Estimates

Source Code

The source code for this page can be accessed on GitHub by clicking this link.

Data Availability

Data generated in this workflow and the Rdata need to run the workflow can be accessed on figshare at 10.25573/data.16828063.

Last updated on

[1] "2022-07-07 07:25:14 EST"

References

Oksanen, Jari, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, Peter R. Minchin, R. B. O’Hara, Gavin L. Simpson, Peter Sólymos, M. Henry H. Stevens, and Helene Wagner. 2012. “Vegan: Community Ecology Package.” R Package Version 2 (0). https://cran.r-project.org/web/packages/vegan/index.html.

--- title: "6. Beta Diversity & Dispersion Estimates" description: | Reproducible workflow for assessing community dissimilarity across temperature treatments. format: html: toc: true toc-depth: 3 --- <details markdown="1"> <summary>Click here for setup information.</summary> ```{r} #| message: false #| results: hide knitr::opts_chunk$set(echo = TRUE, eval = FALSE) set.seed(119) library(phyloseq); packageVersion("phyloseq") library(Biostrings); packageVersion("Biostrings") pacman::p_load(tidyverse, microbiome, phytools, phangorn, reactable, Matrix, pairwiseAdonis, naniar, downloadthis, labdsv, patchwork, agricolae, captioner, microeco, install = FALSE, update = FALSE) #pacman::p_depends(agricolae, local = TRUE) #pacman::p_depends_reverse(agricolae, local = TRUE) options(scipen=999) knitr::opts_current$get(c( "cache", "cache.path", "cache.rebuild", "dependson", "autodep" )) ``` </details> ```{r} #| include: false #| eval: true ## Load to build page only #2 remove(list = ls()) load("page_build/beta_wf_part_1.rdata") ``` ```{r} #| echo: false #| eval: true #| results: hide # Create the caption(s) with captioner caption_tab_ssu <- captioner(prefix = "(16S rRNA) Table", suffix = " |", style = "b") caption_fig_ssu <- captioner(prefix = "(16S rRNA) Figure", suffix = " |", style = "b") caption_tab_its <- captioner(prefix = "(ITS) Table", suffix = " |", style = "b") caption_fig_its <- captioner(prefix = "(ITS) Figure", suffix = " |", style = "b") # Create a function for referring to the tables in text ref <- function(x) str_extract(x, "[^|]*") %>% trimws(which = "right", whitespace = "[ ]") ``` ```{r} #| echo: false #| eval: true source("assets/captions/captions_beta.R") ``` # Synopsis This workflow contains beta diversity assessments for the high temperature data sets. In order to run the workflow, you either need to first run the [DADA2 Workflow for 2018 High Temp samples](dada2.html), the [Data Preparation workflow](data-prep.html), and the [Alpha diversity workflow](alpha.html) **or** download the files linked below. See the [Data Availability](data-availability.html) page for complete details. ## Workflow Input Files needed to run this workflow can be downloaded from figshare. <iframe src="https://widgets.figshare.com/articles/16826779/embed?show_title=1" width="100%" height="251" allowfullscreen frameborder="0"></iframe> # 16s rRNA ```{r} #| include: false #| eval: false ## Initial Load for ANALYSIS #1 set.seed(119) ssu18_ps_work <- readRDS("files/alpha/rdata/ssu18_ps_work.rds") ssu18_ps_filt <- readRDS("files/alpha/rdata/ssu18_ps_filt.rds") ssu18_ps_perfect <- readRDS("files/alpha/rdata/ssu18_ps_perfect.rds") ssu18_ps_pime <- readRDS("files/alpha/rdata/ssu18_ps_pime.rds") ``` ```{r} #| echo: false swel_col <- c("#2271B2", "#71B222", "#B22271") ``` In order to test for significance between sample groups, we must perform the following steps: 1. Transform sample counts to relative abundance. 2. Create distance matrices based on some metrics, for example `wunifrac`, `unifrac`, and `jsd`. For this we use the function `phyloseq::distance`. 3. Next, calculate beta dispersion using the `betadisper` function from the `vegan` package [@oksanen2013community]. 4. Then, use the function `permutest` to run a Permutation test for homogeneity of multivariate dispersions. 5. If the beta dispersion tests are not significant we will run a PERMANOVA using `adonis` (since PERMANOVA assumes equal dispersion), otherwise we will use Analysis of Similarity (ANOSIM), both available in the `vegan` package [@oksanen2013community]. Steps 2--4 are combined in a `for` loop that tests all three distance metrics. Before we begin, we need to first transform sample counts to relative abundance and generate some new trees. ```{r} #| code-fold: true ssu_samp_ps <- c("ssu18_ps_work", "ssu18_ps_filt", "ssu18_ps_perfect", "ssu18_ps_pime") for (i in ssu_samp_ps) { tmp_name <- purrr::map_chr(i, ~ paste0(., "_prop")) tmp_get <- get(i) tmp_ps <- transform_sample_counts(tmp_get, function(otu) 1e5 * otu/sum(otu)) tmp_ps@phy_tree <- NULL tmp_ps <- prune_samples(sample_sums(tmp_ps) > 0, tmp_ps) tmp_tree <- rtree(ntaxa(tmp_ps), rooted = TRUE, tip.label = taxa_names(tmp_ps)) tmp_ps <- merge_phyloseq(tmp_ps, sample_data, tmp_tree) print(tmp_name) assign(tmp_name, tmp_ps) rm(list = ls(pattern = "tmp_")) } ``` ## Beta Dispersion Here we create distance matrices for each metric, calculate the beta dispersion, and run a permutation test for homogeneity of multivariate dispersions. ```{r} #| code-fold: true ssu_dist <- c("jsd", "unifrac", "wunifrac") for (i in ssu_samp_ps) { for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_prop"))) tmp_samp <- data.frame(sample_data(tmp_get)) tmp_df <- phyloseq::distance(tmp_get, method = d) tmp_df_name <- purrr::map_chr(d, ~ paste0(i, "_beta_dist_", .)) assign(tmp_df_name, tmp_df) tmp_df2 <- betadisper(tmp_df, tmp_samp$TEMP, bias.adjust = TRUE) tmp_df_name2 <- purrr::map_chr(d, ~ paste0(i, "_beta_dispersion_", .)) assign(tmp_df_name2, tmp_df2) tmp_df3 <- permutest(tmp_df2, pairwise = TRUE, permutations = 1000, binary = FALSE) tmp_df_name3 <- purrr::map_chr(d, ~ paste0(i, "_permutest_", .)) assign(tmp_df_name3, tmp_df3) rm(list = ls(pattern = "tmp_")) } } objects() ``` <details markdown="1"> <summary>Detailed results of Beta Dispersion & Permutation tests</summary> ::: {.panel-tabset} ```{r} #| echo: false #| eval: true ssu_samp_ps <- c("ssu18_ps_work", "ssu18_ps_filt", "ssu18_ps_perfect", "ssu18_ps_pime") ssu_dist <- c("jsd", "unifrac", "wunifrac") ``` #### FULL ```{r} #| echo: false #| eval: true for (i in ssu_samp_ps[1]) { #cat("\n", " *****", i, "*****", "\n") for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### Arbitrary ```{r} #| echo: false #| eval: true for (i in ssu_samp_ps[2]) { #cat("\n", " *****", i, "*****", "\n") for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### PERfect ```{r} #| echo: false #| eval: true for (i in ssu_samp_ps[3]) { #cat("\n", " *****", i, "*****", "\n") for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### PIME ```{r} #| echo: false #| eval: true for (i in ssu_samp_ps[4]) { #cat("\n", " *****", i, "*****", "\n") for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` ::: </details> Remember, if the beta dispersion p-value is greater than `0.05` we use PERMANOVA, otherwise we use ANOSIM. ```{r} #| echo: false #| eval: true #| results: markup ssu18_perm_pvalues <- data.frame() for (i in ssu_samp_ps) { for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_permutest_", d))) tmp_pv <- tmp_get$tab$`Pr(>F)`[1] tmp_test <- if(tmp_pv < 0.05){ tmp_test <- "ANOSIM" } else { tmp_test <- "ADONIS" } tmp_df <- data.frame(i, d, tmp_pv, tmp_test) ssu18_perm_pvalues <- dplyr::bind_rows(ssu18_perm_pvalues, tmp_df) rm(list = ls(pattern = "tmp_")) } } ssu18_perm_pvalues <- ssu18_perm_pvalues %>% dplyr::rename(c("data_set" = 1, "dist" = 2, "pval" = 3, "test" = 4)) ssu18_perm_pvalues <- ssu18_perm_pvalues %>% tibble::remove_rownames() ``` ```{r} #| echo: false ssu18_perm_summary <- ssu18_perm_pvalues ssu18_perm_summary <- ssu18_perm_summary %>% separate(col = 1, into = c("dummy", "metric", "dataset"), sep = "_", remove = TRUE) ssu18_perm_summary[,1:2] <- NULL ssu18_perm_summary$dataset <- stringr::str_replace(ssu18_perm_summary$dataset, "work", "FULL") %>% stringr::str_replace(., "filt", "FILT") %>% stringr::str_replace(., "pime", "PIME") %>% stringr::str_replace(., "perfect", "PERfect") ssu18_perm_summary$dist <- stringr::str_replace(ssu18_perm_summary$dist, "jsd", "Jensen–Shannon divergence") %>% stringr::str_replace(., "wunifrac", "weighted UniFrac") %>% stringr::str_replace(., "unifrac", "unweighted UniFrac") ssu18_perm_summary$pval <- round(ssu18_perm_summary[[3]], 4) ``` `r caption_tab_ssu("ssu_beta_disp_tests")` ```{r} #| echo: false #| eval: true seq_table <- ssu18_perm_summary seq_table %>% download_this( output_name = "ssu_beta_disp_tests", output_extension = ".csv", button_label = "Download data as csv file", button_type = "default", csv2 = FALSE, has_icon = TRUE, icon = "fa fa-save") ``` ```{r} #| echo: false #| eval: true reactable(seq_table, defaultColDef = colDef( header = function(value) gsub("_", " ", value, fixed = TRUE), cell = function(value) format(value, nsmall = 4), align = "center", filterable = TRUE, sortable = TRUE, resizable = TRUE, footerStyle = list(fontWeight = "bold"), minWidth = 100, headerVAlign = "center" ), columns = list( dataset = colDef(name = "Dataset", sticky = "left", style = list(borderRight = "1px solid #eee"), headerStyle = list(borderRight = "1px solid #eee"), align = "left", minWidth = 50), dist = colDef(name = "distance", align = "left"), `pval` = colDef(name = "p-value", minWidth = 100, resizable = FALSE, style = function(value) { if (value > 0.05) { color <- "#2271B2" } else if (value < 0.05) { color <- "#B22271" } else { color <- "#777" } list(color = color, fontWeight = "bold") }) ), searchable = FALSE, defaultPageSize = 12, showPagination = FALSE, pageSizeOptions = c(5, 10, nrow(seq_table)), showPageSizeOptions = TRUE, highlight = TRUE, bordered = TRUE, striped = TRUE, compact = TRUE, wrap = TRUE, showSortable = TRUE, fullWidth = TRUE, theme = reactableTheme(style = list(fontSize = "0.8em"))) rm(seq_table) ``` Now we to create sample data frames for each data set. ```{r} #| code-fold: true for (i in ssu_samp_ps) { tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_prop"))) tmp_samp <- data.frame(sample_data(tmp_get)) tmp_name <- purrr::map_chr(i, ~ paste0(., "_sampledf")) assign(tmp_name, tmp_samp) rm(list = ls(pattern = "tmp_")) } objects(pattern="_sampledf") ``` ## Significance Tests Depending on the test, we need different data structures. For `adonis` we use the sample metadata and for `anosim` we will start with a phyloseq object. So for each data set we create two variables: `adonis_sampledf` and `anosim_data` ::: {.column-body} ::: {.panel-tabset} #### FULL ```{r} #| echo: false adonis_sampledf <- ssu18_ps_work_sampledf anosim_data <- ssu18_ps_work ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- ssu18_perm_pvalues[ssu18_perm_pvalues$data_set == 'ssu18_ps_work',] ``` Is the p-value of the `r ssu18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} ssu18_ps_work_adonis_jsd <- adonis(ssu18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ssu18_ps_work_adonis2_jsd <- adonis2(ssu18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} ssu18_ps_work_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_work_anosim_unifrac <- anosim(phyloseq::distance(ssu18_ps_work_prop, "unifrac"), grouping = ssu18_ps_work_groups) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} ssu18_ps_work_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_work_anosim_wunifrac <- anosim(phyloseq::distance(ssu18_ps_work_prop, "wunifrac"), grouping = ssu18_ps_work_groups) ``` #### Arbitrary ```{r} #| echo: false adonis_sampledf <- ssu18_ps_filt_sampledf anosim_data <- ssu18_ps_filt ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- ssu18_perm_pvalues[ssu18_perm_pvalues$data_set == 'ssu18_ps_filt',] ``` Is the p-value of the `r ssu18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} ssu18_ps_filt_adonis_jsd <- adonis(ssu18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ssu18_ps_filt_adonis2_jsd <- adonis2(ssu18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} ssu18_ps_filt_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_filt_anosim_unifrac <- anosim(phyloseq::distance(ssu18_ps_filt_prop, "unifrac"), grouping = ssu18_ps_filt_groups) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} ssu18_ps_filt_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_filt_anosim_wunifrac <- anosim(phyloseq::distance(ssu18_ps_filt_prop, "wunifrac"), grouping = ssu18_ps_filt_groups) ``` #### PERfect ```{r} #| echo: false adonis_sampledf <- ssu18_ps_perfect_sampledf anosim_data <- ssu18_ps_perfect ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- ssu18_perm_pvalues[ssu18_perm_pvalues$data_set == 'ssu18_ps_perfect',] ``` Is the p-value of the `r ssu18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_perfect_anosim_jsd <- anosim(phyloseq::distance(ssu18_ps_perfect_prop, "jsd"), grouping = ssu18_ps_perfect_groups) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_perfect_anosim_unifrac <- anosim(phyloseq::distance(ssu18_ps_perfect_prop, "unifrac"), grouping = ssu18_ps_perfect_groups) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} ssu18_ps_perfect_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_perfect_anosim_wunifrac <- anosim(phyloseq::distance(ssu18_ps_perfect_prop, "wunifrac"), grouping = ssu18_ps_perfect_groups) ``` #### PIME ```{r} #| echo: false adonis_sampledf <- ssu18_ps_pime_sampledf anosim_data <- ssu18_ps_pime ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- ssu18_perm_pvalues[ssu18_perm_pvalues$data_set == 'ssu18_ps_pime',] ``` Is the p-value of the `r ssu18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} ssu18_ps_pime_adonis_jsd <- adonis(ssu18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ssu18_ps_pime_adonis2_jsd <- adonis2(ssu18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} ssu18_ps_pime_adonis_unifrac <- adonis(ssu18_ps_pime_beta_dist_unifrac ~ TEMP, data = adonis_sampledf, permutations = 1000) ssu18_ps_pime_adonis2_unifrac <- adonis2(ssu18_ps_pime_beta_dist_unifrac ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} ssu18_ps_pime_groups <- get_variable(anosim_data, "TEMP") ssu18_ps_pime_anosim_wunifrac <- anosim(phyloseq::distance(ssu18_ps_pime_prop, "wunifrac"), grouping = ssu18_ps_pime_groups) ``` ::: ::: <details markdown="1"> <summary>Detailed results of Significance tests</summary> ::: {.panel-tabset} #### FULL ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") ssu18_ps_work_adonis2_jsd cat("\n") cat("\n", "***ANOSIM for Unweighted UniFrac distance, `unifrac`***", "\n\n") summary(ssu18_ps_work_anosim_unifrac) cat("\n") cat("\n", "***ANOSIM for Weighted-UniFrac distance, `wunifrac`***", "\n\n") summary(ssu18_ps_work_anosim_wunifrac) cat("\n") ``` #### Arbitrary ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") ssu18_ps_filt_adonis2_jsd cat("\n") cat("\n", "***ANOSIM for Unweighted UniFrac distance, `unifrac`***", "\n\n") summary(ssu18_ps_filt_anosim_unifrac) cat("\n") cat("\n", "***ANOSIM for Weighted-UniFrac distance, `wunifrac`***", "\n\n") summary(ssu18_ps_filt_anosim_wunifrac) cat("\n") ``` #### PERfect ```{r} #| echo: false #| eval: true cat("\n", "***ANOSIM for Jensen-Shannon Divergence, `jsd`***", "\n\n") summary(ssu18_ps_perfect_anosim_jsd) cat("\n") cat("\n", "***ANOSIM for Unweighted UniFrac distance, `unifrac`***", "\n\n") summary(ssu18_ps_perfect_anosim_unifrac) cat("\n") cat("\n", "***ANOSIM for Weighted-UniFrac distance, `wunifrac`***", "\n\n") summary(ssu18_ps_perfect_anosim_wunifrac) cat("\n") ``` #### PIME ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") ssu18_ps_pime_adonis2_jsd cat("\n") cat("\n", "***PERMANOVA for Unweighted UniFrac distance, `unifrac`***", "\n\n") ssu18_ps_pime_adonis2_unifrac cat("\n") cat("\n", "***ANOSIM for Weighted-UniFrac distance, `wunifrac`***", "\n\n") summary(ssu18_ps_pime_anosim_wunifrac) cat("\n") ``` ::: </details> ## Summaries Here is a quick summary of significance tests for the data sets against three distance matrices. ```{r} #| echo: false data_set <- data.frame(c("FULL", "FILT", "PERfect", "PIME")) ssu_dist_metric <- data.frame(c("Jensen-Shannon Divergence", "unweighted UniFrac", "weighted UniFrac")) tmp_p_value_full <- data.frame(c(ssu18_ps_work_adonis_jsd$aov.tab$`Pr(>F)`[1], ssu18_ps_work_anosim_unifrac$signif, ssu18_ps_work_anosim_wunifrac$signif)) tmp_p_value_perfect <- data.frame(c(ssu18_ps_perfect_anosim_jsd$signif, ssu18_ps_perfect_anosim_unifrac$signif, ssu18_ps_perfect_anosim_wunifrac$signif)) tmp_p_value_filt <- data.frame(c(ssu18_ps_filt_adonis_jsd$aov.tab$`Pr(>F)`[1], ssu18_ps_filt_anosim_unifrac$signif, ssu18_ps_filt_anosim_wunifrac$signif)) tmp_p_value_pime <- data.frame(c(ssu18_ps_pime_adonis_jsd$aov.tab$`Pr(>F)`[1], ssu18_ps_pime_adonis_unifrac$aov.tab$`Pr(>F)`[1], ssu18_ps_pime_anosim_wunifrac$signif)) ssu18_tab_sig_test <- dplyr::bind_cols(ssu_dist_metric, tmp_p_value_full) %>% dplyr::bind_cols(., tmp_p_value_filt) %>% dplyr::bind_cols(., tmp_p_value_perfect) %>% dplyr::bind_cols(., tmp_p_value_pime) %>% dplyr::rename("distance metric" = 1, "FULL" = 2, "FILT" = 3, "PERfect" = 4, "PIME" = 5) ssu18_tab_sig_test <- ssu18_tab_sig_test %>% dplyr::mutate(across(.cols = c(2:ncol(ssu18_tab_sig_test)), round, digits = 4)) rm(list = ls(pattern = "tmp_")) ``` `r caption_tab_ssu("ssu_sign_tests")` ```{r} #| echo: false #| eval: true seq_table <- ssu18_tab_sig_test seq_table %>% download_this( output_name = "ssu_sign_tests", output_extension = ".csv", button_label = "Download data as csv file", button_type = "default", csv2 = FALSE, has_icon = TRUE, icon = "fa fa-save") ``` ```{r} #| echo: false #| eval: true reactable(seq_table, defaultColDef = colDef( header = function(value) gsub("_", " ", value, fixed = TRUE), cell = function(value) format(value, nsmall = 3), align = "center", filterable = FALSE, sortable = FALSE, resizable = TRUE, footerStyle = list(fontWeight = "bold"), minWidth = 100, headerVAlign = "center", style = function(value) { if (value > 0.05) { color <- "#2271B2" } else if (value < 0.05) { color <- "#B22271" } else { color <- "#777" } list(color = color, fontWeight = "bold") } ), columns = list( `distance metric` = colDef(name = "Distance metric", sticky = "left", style = list(borderRight = "1px solid #eee"), headerStyle = list(borderRight = "1px solid #eee"), align = "left", minWidth = 200), dist = colDef(name = "distance", align = "left") ), columnGroups = list(colGroup(name = "p-values", columns = c("FULL", "FILT", "PERfect", "PIME"), headerStyle = list(fontSize = "1.3em")) ), searchable = FALSE, defaultPageSize = 12, showPagination = FALSE, pageSizeOptions = c(5, 10, nrow(seq_table)), showPageSizeOptions = FALSE, highlight = TRUE, bordered = TRUE, striped = TRUE, compact = TRUE, wrap = TRUE, showSortable = FALSE, fullWidth = TRUE, theme = reactableTheme(style = list(fontSize = "0.9em"))) rm(seq_table) ``` # ITS ```{r} #| include: false #| eval: false ## Initial Load for ANALYSIS #1 #remove(list = ls()) #set.seed(119) its18_ps_work <- readRDS("files/alpha/rdata/its18_ps_work.rds") its18_ps_filt <- readRDS("files/alpha/rdata/its18_ps_filt.rds") its18_ps_perfect <- readRDS("files/alpha/rdata/its18_ps_perfect.rds") its18_ps_pime <- readRDS("files/alpha/rdata/its18_ps_pime.rds") ``` ```{r} #| echo: false swel_col <- c("#2271B2", "#71B222", "#B22271") ``` In order to test for significance between sample groups, we must perform the following steps: 1. Transform sample counts to relative abundance. 2. Create distance matrices based on some metrics, for example `gower`, `bray`, and `jsd`. For this we use the function `phyloseq::distance`. 3. Next, calculate beta dispersion using the `betadisper` function from the `vegan` package [@oksanen2013community]. 4. Then, use the function `permutest` to run a Permutation test for homogeneity of multivariate dispersions. 5. If the beta dispersion tests are not significant we will run a PERMANOVA using `adonis` (since PERMANOVA aitsmes equal dispersion), otherwise we will use Analysis of Similarity (ANOSIM), both available in the `vegan` package [@oksanen2013community]. Steps 2--4 are combined in a `for` loop that tests all three distance metrics. Before we begin, we need to first transform sample counts to relative abundance and generate some new trees. ```{r} #| code-fold: true its_samp_ps <- c("its18_ps_work", "its18_ps_filt", "its18_ps_perfect", "its18_ps_pime") for (i in its_samp_ps) { tmp_name <- purrr::map_chr(i, ~ paste0(., "_prop")) tmp_get <- get(i) tmp_ps <- transform_sample_counts(tmp_get, function(otu) 1e5 * otu/sum(otu)) tmp_ps@phy_tree <- NULL tmp_ps <- prune_samples(sample_sums(tmp_ps) > 0, tmp_ps) tmp_ps <- merge_phyloseq(tmp_ps, sample_data) print(tmp_name) assign(tmp_name, tmp_ps) rm(list = ls(pattern = "tmp_")) } ``` ### Beta Dispersion Here we create distance matrices for each metric, calculate the beta dispersion, and run a permutation test for homogeneity of multivariate dispersions. ```{r} #| code-fold: true its_dist <- c("jsd", "bray", "gower") for (i in its_samp_ps) { for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_prop"))) tmp_samp <- data.frame(sample_data(tmp_get)) tmp_df <- phyloseq::distance(tmp_get, method = d) tmp_df_name <- purrr::map_chr(d, ~ paste0(i, "_beta_dist_", .)) assign(tmp_df_name, tmp_df) tmp_df2 <- betadisper(tmp_df, tmp_samp$TEMP, bias.adjust = TRUE) tmp_df_name2 <- purrr::map_chr(d, ~ paste0(i, "_beta_dispersion_", .)) assign(tmp_df_name2, tmp_df2) tmp_df3 <- permutest(tmp_df2, pairwise = TRUE, permutations = 1000, binary = FALSE) tmp_df_name3 <- purrr::map_chr(d, ~ paste0(i, "_permutest_", .)) assign(tmp_df_name3, tmp_df3) rm(list = ls(pattern = "tmp_")) } } objects() ``` <details markdown="1"> <summary>Detailed results of Beta Dispersion & Permutation tests</summary> ::: {.panel-tabset} ```{r} #| echo: false #| eval: true its_samp_ps <- c("its18_ps_work", "its18_ps_filt", "its18_ps_perfect", "its18_ps_pime") its_dist <- c("jsd", "bray", "gower") ``` #### FULL ```{r} #| echo: false #| eval: true for (i in its_samp_ps[1]) { #cat("\n", " *****", i, "*****", "\n") for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### Arbitrary ```{r} #| echo: false #| eval: true for (i in its_samp_ps[2]) { #cat("\n", " *****", i, "*****", "\n") for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### PERfect ```{r} #| echo: false #| eval: true for (i in its_samp_ps[3]) { #cat("\n", " *****", i, "*****", "\n") for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` #### PIME ```{r} #| echo: false #| eval: true for (i in its_samp_ps[4]) { #cat("\n", " *****", i, "*****", "\n") for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(i, "_permutest_", d))) cat("\n") cat("####################################################", "\n") tmp_print <- c("BETA DISPERSION significance test", d, "distance") cat(tmp_print, "\n") cat("####################################################") cat("\n") print(tmp_get) rm(list = ls(pattern = "tmp_")) } } ``` ::: </details> Remember, if the beta dispersion p-value is greater than `0.05` we use PERMANOVA, otherwise we use ANOSIM. ```{r} #| echo: false #| eval: true #| results: markup its18_perm_pvalues <- data.frame() for (i in its_samp_ps) { for (d in its_dist){ tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_permutest_", d))) tmp_pv <- tmp_get$tab$`Pr(>F)`[1] tmp_test <- if(tmp_pv < 0.05){ tmp_test <- "ANOSIM" } else { tmp_test <- "ADONIS" } tmp_df <- data.frame(i, d, tmp_pv, tmp_test) its18_perm_pvalues <- dplyr::bind_rows(its18_perm_pvalues, tmp_df) rm(list = ls(pattern = "tmp_")) } } its18_perm_pvalues <- its18_perm_pvalues %>% dplyr::rename(c("data_set" = 1, "dist" = 2, "pval" = 3, "test" = 4)) its18_perm_pvalues <- its18_perm_pvalues %>% tibble::remove_rownames() ``` ```{r} #| echo: false its18_perm_summary <- its18_perm_pvalues its18_perm_summary <- its18_perm_summary %>% separate(col = 1, into = c("dummy", "metric", "dataset"), sep = "_", remove = TRUE) its18_perm_summary[,1:2] <- NULL its18_perm_summary$dataset <- stringr::str_replace(its18_perm_summary$dataset, "work", "FULL") %>% stringr::str_replace(., "filt", "FILT") %>% stringr::str_replace(., "pime", "PIME") %>% stringr::str_replace(., "perfect", "PERfect") its18_perm_summary$dist <- stringr::str_replace(its18_perm_summary$dist, "jsd", "Jensen–Shannon divergence") %>% stringr::str_replace(., "gower", "Gower distance") %>% stringr::str_replace(., "bray", "Bray-Curtis dissimilarity") its18_perm_summary$pval <- round(its18_perm_summary[[3]], 4) ``` `r caption_tab_its("its_beta_disp_tests")` ```{r} #| echo: false #| eval: true seq_table <- its18_perm_summary seq_table %>% download_this( output_name = "its_beta_disp_tests", output_extension = ".csv", button_label = "Download data as csv file", button_type = "default", csv2 = FALSE, has_icon = TRUE, icon = "fa fa-save") ``` ```{r} #| echo: false #| eval: true reactable(seq_table, defaultColDef = colDef( header = function(value) gsub("_", " ", value, fixed = TRUE), cell = function(value) format(value, nsmall = 4), align = "center", filterable = TRUE, sortable = TRUE, resizable = TRUE, footerStyle = list(fontWeight = "bold"), minWidth = 100, headerVAlign = "center" ), columns = list( dataset = colDef(name = "Dataset", sticky = "left", style = list(borderRight = "1px solid #eee"), headerStyle = list(borderRight = "1px solid #eee"), align = "left", minWidth = 50), dist = colDef(name = "distance", align = "left"), `pval` = colDef(name = "p-value", minWidth = 100, resizable = FALSE, style = function(value) { if (value > 0.05) { color <- "#2271B2" } else if (value < 0.05) { color <- "#B22271" } else { color <- "#777" } list(color = color, fontWeight = "bold") }) ), searchable = FALSE, defaultPageSize = 12, showPagination = FALSE, pageSizeOptions = c(5, 10, nrow(seq_table)), showPageSizeOptions = TRUE, highlight = TRUE, bordered = TRUE, striped = TRUE, compact = TRUE, wrap = TRUE, showSortable = TRUE, fullWidth = TRUE, theme = reactableTheme(style = list(fontSize = "0.8em"))) rm(seq_table) ``` Now we to create sample data frames for each data set. ```{r} #| code-fold: true for (i in its_samp_ps) { tmp_get <- get(purrr::map_chr(i, ~ paste0(., "_prop"))) tmp_samp <- data.frame(sample_data(tmp_get)) tmp_name <- purrr::map_chr(i, ~ paste0(., "_sampledf")) assign(tmp_name, tmp_samp) rm(list = ls(pattern = "tmp_")) } objects(pattern="_sampledf") ``` ## Significance Tests Depending on the test, we need different data structures. For `adonis` we use the sample metadata and for `anosim` we will start with a phyloseq object. So for each data set we create two variables: `adonis_sampledf` and `anosim_data` ::: {.column-body} ::: {.panel-tabset} #### FULL ```{r} #| echo: false adonis_sampledf <- its18_ps_work_sampledf anosim_data <- its18_ps_work ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- its18_perm_pvalues[its18_perm_pvalues$data_set == 'its18_ps_work',] ``` Is the p-value of the `r its18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} its18_ps_work_adonis_jsd <- adonis(its18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_work_adonis2_jsd <- adonis2(its18_ps_work_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} its18_ps_work_adonis_bray <- adonis(its18_ps_work_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_work_adonis2_bray <- adonis2(its18_ps_work_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} its18_ps_work_adonis_gower <- adonis(its18_ps_work_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_work_adonis2_gower <- adonis2(its18_ps_work_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` #### Arbitrary ```{r} #| echo: false adonis_sampledf <- its18_ps_filt_sampledf anosim_data <- its18_ps_filt ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- its18_perm_pvalues[its18_perm_pvalues$data_set == 'its18_ps_filt',] ``` Is the p-value of the `r its18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} its18_ps_filt_adonis_jsd <- adonis(its18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_filt_adonis2_jsd <- adonis2(its18_ps_filt_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} its18_ps_filt_adonis_bray <- adonis(its18_ps_filt_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_filt_adonis2_bray <- adonis2(its18_ps_filt_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} its18_ps_filt_adonis_gower <- adonis(its18_ps_filt_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_filt_adonis2_gower <- adonis2(its18_ps_filt_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` #### PERfect ```{r} #| echo: false adonis_sampledf <- its18_ps_perfect_sampledf anosim_data <- its18_ps_perfect ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- its18_perm_pvalues[its18_perm_pvalues$data_set == 'its18_ps_perfect',] ``` Is the p-value of the `r its18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} its18_ps_perfect_adonis_jsd <- adonis(its18_ps_perfect_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_perfect_adonis2_jsd <- adonis2(its18_ps_perfect_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} its18_ps_perfect_adonis_bray <- adonis(its18_ps_perfect_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_perfect_adonis2_bray <- adonis2(its18_ps_perfect_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} its18_ps_perfect_adonis_gower <- adonis(its18_ps_perfect_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_perfect_adonis2_gower <- adonis2(its18_ps_perfect_beta_dist_gower ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` #### PIME ```{r} #| echo: false adonis_sampledf <- its18_ps_pime_sampledf anosim_data <- its18_ps_pime ``` ```{r} #| echo: false #| eval: true tmp_perm_tests <- its18_perm_pvalues[its18_perm_pvalues$data_set == 'its18_ps_pime',] ``` Is the p-value of the `r its18_perm_pvalues$dist[1]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[1] < 0.05))`. Then we use **`r tmp_perm_tests$test[1]`**. ```{r} its18_ps_pime_adonis_jsd <- adonis(its18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_pime_adonis2_jsd <- adonis2(its18_ps_pime_beta_dist_jsd ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[2]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[2] < 0.05))`. Then we use **`r tmp_perm_tests$test[2]`**. ```{r} its18_ps_pime_adonis_bray <- adonis(its18_ps_pime_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) its18_ps_pime_adonis2_bray <- adonis2(its18_ps_pime_beta_dist_bray ~ TEMP, data = adonis_sampledf, permutations = 1000) ``` Is the p-value of the `r tmp_perm_tests$dist[3]` distance metric less than `0.05`? `r eval(isTRUE(tmp_perm_tests$pval[3] < 0.05))`. Then we use **`r tmp_perm_tests$test[3]`**. ```{r} its18_ps_pime_groups <- get_variable(anosim_data, "TEMP") its18_ps_pime_anosim_gower <- anosim(phyloseq::distance(its18_ps_pime_prop, "gower"), grouping = its18_ps_pime_groups) ``` ::: ::: <details markdown="1"> <summary>Detailed results of Significance tests.</summary> ::: {.panel-tabset} #### FULL ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") its18_ps_work_adonis2_jsd cat("\n") cat("\n", "***PERMANOVA for Bray-Curtis dissimilarity, `bray`***", "\n\n") summary(its18_ps_work_adonis2_bray) cat("\n") cat("\n", "***PERMANOVA for Gower distance, `gower`***", "\n\n") summary(its18_ps_work_adonis2_gower) cat("\n") ``` #### Arbitrary ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") its18_ps_filt_adonis2_jsd cat("\n") cat("\n", "***PERMANOVA for Bray-Curtis dissimilarity, `bray`***", "\n\n") summary(its18_ps_filt_adonis2_bray) cat("\n") cat("\n", "***PERMANOVA for Gower distance, `gower`***", "\n\n") summary(its18_ps_filt_adonis2_gower) cat("\n") ``` #### PERfect ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") its18_ps_perfect_adonis2_jsd cat("\n") cat("\n", "***PERMANOVA for Bray-Curtis dissimilarity, `bray`***", "\n\n") summary(its18_ps_perfect_adonis2_bray) cat("\n") cat("\n", "***PERMANOVA for Gower distance, `gower`***", "\n\n") summary(its18_ps_perfect_adonis2_gower) cat("\n") ``` #### PIME ```{r} #| echo: false #| eval: true cat("\n", "***PERMANOVA for Jensen-Shannon Divergence, `jsd`***", "\n\n") its18_ps_pime_adonis2_jsd cat("\n") cat("\n", "***PERMANOVA for Bray-Curtis dissimilarity, `bray`***", "\n\n") its18_ps_pime_adonis2_bray cat("\n") cat("\n", "***ANOSIM for Gower distance, `gower`***", "\n\n") summary(its18_ps_pime_anosim_gower) cat("\n") ``` ::: </details> ## Summaries Here is a quick summary of significance tests for the data sets against three distance matrices. ```{r} #| echo: false data_set <- data.frame(c("FULL", "FILT", "PERfect", "PIME")) its_dist_metric <- data.frame(c("Jensen-Shannon Divergence", "Bray-Curtis dissimilarity", "Gower distance")) tmp_p_value_work <- data.frame(c(its18_ps_work_adonis_jsd$aov.tab$`Pr(>F)`[1], its18_ps_work_adonis_bray$aov.tab$`Pr(>F)`[1], its18_ps_work_adonis_gower$aov.tab$`Pr(>F)`[1])) tmp_p_value_filt <- data.frame(c(its18_ps_filt_adonis_jsd$aov.tab$`Pr(>F)`[1], its18_ps_filt_adonis_bray$aov.tab$`Pr(>F)`[1], its18_ps_filt_adonis_gower$aov.tab$`Pr(>F)`[1])) tmp_p_value_perfect <- data.frame(c(its18_ps_perfect_adonis_jsd$aov.tab$`Pr(>F)`[1], its18_ps_perfect_adonis_bray$aov.tab$`Pr(>F)`[1], its18_ps_perfect_adonis_gower$aov.tab$`Pr(>F)`[1])) tmp_p_value_pime <- data.frame(c(its18_ps_pime_adonis_jsd$aov.tab$`Pr(>F)`[1], its18_ps_pime_adonis_bray$aov.tab$`Pr(>F)`[1], its18_ps_pime_anosim_gower$signif)) its18_tab_sig_test <- dplyr::bind_cols(its_dist_metric, tmp_p_value_work) %>% dplyr::bind_cols(., tmp_p_value_filt) %>% dplyr::bind_cols(., tmp_p_value_perfect) %>% dplyr::bind_cols(., tmp_p_value_pime) %>% dplyr::rename("distance metric" = 1, "FULL" = 2, "FILT" = 3, "PERfect" = 4, "PIME" = 5) its18_tab_sig_test <- its18_tab_sig_test %>% dplyr::mutate(across(.cols = c(2:ncol(its18_tab_sig_test)), round, digits = 4)) rm(list = ls(pattern = "tmp_")) ``` `r caption_tab_its("its_sign_tests")` ```{r} #| echo: false #| eval: true seq_table <- its18_tab_sig_test seq_table %>% download_this( output_name = "its_sign_tests", output_extension = ".csv", button_label = "Download data as csv file", button_type = "default", csv2 = FALSE, has_icon = TRUE, icon = "fa fa-save") ``` ```{r} #| echo: false #| eval: true reactable(seq_table, defaultColDef = colDef( header = function(value) gsub("_", " ", value, fixed = TRUE), cell = function(value) format(value, nsmall = 3), align = "center", filterable = FALSE, sortable = FALSE, resizable = TRUE, footerStyle = list(fontWeight = "bold"), minWidth = 100, headerVAlign = "center", style = function(value) { if (value > 0.05) { color <- "#2271B2" } else if (value < 0.05) { color <- "#B22271" } else { color <- "#777" } list(color = color, fontWeight = "bold") } ), columns = list( `distance metric` = colDef(name = "Distance metric", sticky = "left", style = list(borderRight = "1px solid #eee"), headerStyle = list(borderRight = "1px solid #eee"), align = "left", minWidth = 200), dist = colDef(name = "distance", align = "left") ), columnGroups = list(colGroup(name = "p-values", columns = c("FULL", "FILT", "PERfect", "PIME"), headerStyle = list(fontSize = "1.3em")) ), searchable = FALSE, defaultPageSize = 12, showPagination = FALSE, pageSizeOptions = c(5, 10, nrow(seq_table)), showPageSizeOptions = FALSE, highlight = TRUE, bordered = TRUE, striped = TRUE, compact = TRUE, wrap = TRUE, showSortable = FALSE, fullWidth = TRUE, theme = reactableTheme(style = list(fontSize = "0.9em"))) rm(seq_table) ``` ```{r} #| echo: false save.image("page_build/beta_wf_part_1.rdata") ``` # Ordinations ```{r} #| echo: false #| eval: true load("page_build/beta_wf_part_2.rdata") ``` ```{r} #| echo: false remove(list = ls()) load("page_build/beta_wf_part_1.rdata") gdata::keep(ssu18_ps_work, ssu18_ps_filt, ssu18_ps_pime, ssu18_ps_perfect, ssu18_ps_work_prop, ssu18_ps_filt_prop, ssu18_ps_pime_prop, ssu18_ps_perfect_prop, its18_ps_work, its18_ps_filt, its18_ps_pime, its18_ps_perfect, its18_ps_work_prop, its18_ps_filt_prop, its18_ps_pime_prop, its18_ps_perfect_prop, swel_col, sure = TRUE) ssu18_ps_work_me <- readRDS("files/taxa/rdata/ssu18_ps_work_me.rds") ssu18_ps_filt_me <- readRDS("files/taxa/rdata/ssu18_ps_filt_me.rds") ssu18_ps_perfect_me <- readRDS("files/taxa/rdata/ssu18_ps_perfect_me.rds") ssu18_ps_pime_me <- readRDS("files/taxa/rdata/ssu18_ps_pime_me.rds") its18_ps_work_me <- readRDS("files/taxa/rdata/its18_ps_work_me.rds") its18_ps_filt_me <- readRDS("files/taxa/rdata/its18_ps_filt_me.rds") its18_ps_perfect_me <- readRDS("files/taxa/rdata/its18_ps_perfect_me.rds") its18_ps_pime_me <- readRDS("files/taxa/rdata/its18_ps_pime_me.rds") objects() ``` Now we can visualize the **Jensen-Shannon Divergence**, **unweighted-UniFrac**, and **weighted-UniFrac** distance matrices to access dissimilarity among sample. Here we ordinate using Principal Coordinate Analysis (PCoA) but we could also test other methods like NMDS, CCA, etc. We will also test the whether there are any differences in ordination between the `phyloseq` package and the [`microeco` package](https://github.com/ChiLiubio/microeco). #### Plot Code <details markdown="1"> <summary>Code to reproduce plots</summary> **We begin with the code for the 16S rRNA data.** First the code for ordination implementation in `phyloseq`. ```{r} ssu18_data_sets <- c("ssu18_ps_work", "ssu18_ps_filt", "ssu18_ps_perfect", "ssu18_ps_pime") ssu_dist <- c("jsd", "unifrac", "wunifrac") for (samp_ps in ssu18_data_sets) { for (d in ssu_dist){ tmp_get <- get(purrr::map_chr(samp_ps, ~ paste0(., "_prop"))) ord_meths <- c("NMDS", "PCoA", "CCA", "DCA") # MDS = PCoA, "CCA", "DCA", "DPCoA", "RDA" tmp_plist <- plyr::llply(as.list(ord_meths), function(i, physeq, d) { ordi = ordinate(physeq, method = i, distance = d) plot_ordination(physeq, ordi, "samples", color = "TEMP") }, tmp_get, d) names(tmp_plist) <- ord_meths tmp_df <- plyr::ldply(tmp_plist, function(x){ df = x$data[, 1:2] colnames(df) = c("Axis_1", "Axis_2") return(cbind(df, x$data))}) names(tmp_df)[1] = "method" tmp_plot <- ggplot(tmp_df, aes(Axis_1, Axis_2, color = TEMP, shape = TEMP, fill = TEMP)) tmp_plot <- tmp_plot + geom_point(size = 4) tmp_plot <- tmp_plot + facet_wrap(~method, scales = "free") tmp_plot <- tmp_plot + scale_colour_manual(values = swel_col) tmp_df_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", .)) tmp_plist_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_", ., "_plist")) tmp_plot_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", ., "_plot")) tmp_list <- list("tmp_df_name" = tmp_df, tmp_plist_name = tmp_plist, tmp_plot_name = tmp_plot) assign(paste0(samp_ps, "_", d, "_ord_results"), tmp_list) rm(list = ls(pattern = "_tmp")) } } ``` ```{r} plist_name <- objects(pattern="_ord_results") plot_num <- c(1,2,3,4) for (i in plist_name) { for (j in plot_num) { tmp_get_i <- get(i)$tmp_plist_name tmp_ord <- names(tmp_get_i)[j] tmp_name <- stringr::str_replace(i, "ord_results", tmp_ord) tmp_plot <- tmp_get_i[[j]] + scale_colour_manual(values = swel_col) tmp_plot <- tmp_plot + geom_point(size = 4, aes(shape = TEMP)) + theme(legend.position = "bottom") tmp_plot$labels$shape <- "TEMP" assign(tmp_name, tmp_plot) rm(list = ls(pattern = "tmp_")) } } ``` And now the code for ordination implementation in `microeco`. ```{r} microeco_path <- "files/beta/microeco/" for (i in ssu18_data_sets) { tmp_dataset <- get(purrr::map_chr(i, ~paste0(., "_me"))) tmp_dataset$cal_betadiv(unifrac = TRUE) #### CODE TO ADD JSD DISTANCE #### tmp_jsd <- phyloseq::distance(get(i), method = "jsd") tmp_jsd <- forceSymmetric(as.matrix(tmp_jsd), uplo = "L") tmp_jsd <- as.matrix(tmp_jsd) tmp_dataset$beta_diversity$jsd <- tmp_jsd #### #### #### #### #### #### #### #### dir.create(paste(microeco_path, i, "/metrics/", sep = ""), recursive = TRUE) tmp_dataset$save_betadiv(dirpath = paste(microeco_path, i, "/metrics/", sep = "")) rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: true #| eval: true class(ssu18_ps_perfect_me$beta_diversity) summary(ssu18_ps_perfect_me$beta_diversity) ``` Here I made a custom "function" to run the analysis, plot the graphs, save graph objects, and save plots (as `.png` and `.pdf` files). I am sure an actual programmer would be shocked, but it works. ```{r} microeco_beta_plot <- function(choose_input, choose_metric, choose_ord) { tmp_dataset <- get(purrr::map_chr(choose_input, ~paste0(., "_me"))) tmp_t1 <- trans_beta$new(dataset = tmp_dataset, group = "TEMP", measure = choose_metric) tmp_t1$cal_ordination(ordination = choose_ord) tmp_t1_ord_plot <- tmp_t1$plot_ordination(plot_color = "TEMP", plot_shape = "TEMP", plot_group_ellipse = FALSE, color_values = swel_col, shape_values = c(16, 17, 15)) + geom_point(size = 4) + theme(legend.position = "bottom") tmp_t1$cal_group_distance() tmp_t1_within_group_plot <- tmp_t1$plot_group_distance(distance_pair_stat = TRUE, color_values = swel_col) tmp_t1$res_group_distance tmp_t1$cal_group_distance(within_group = FALSE) tmp_t1_btwn_group_plot <- tmp_t1$plot_group_distance(distance_pair_stat = TRUE, color_values = swel_col) ###### SET names tmp_name_ord <- paste(choose_input, "_me_", choose_metric, "_", choose_ord, sep = "") tmp_name_wg <- paste(choose_input, "_me_wg_", choose_metric, "_", choose_ord, sep = "") tmp_name_bg <- paste(choose_input, "_me_bg_", choose_metric, "_", choose_ord, sep = "") assign(tmp_name_ord, tmp_t1_ord_plot, envir = parent.frame() ) assign(tmp_name_wg, tmp_t1_within_group_plot, envir = parent.frame() ) assign(tmp_name_bg, tmp_t1_btwn_group_plot, envir = parent.frame() ) ###### SET paths dir.create(paste(microeco_path, choose_input, "/ordination/", sep = ""), recursive = TRUE) tmp_path <- paste(microeco_path, choose_input, "/ordination/", sep = "") tmp_path_ord <- paste(tmp_path, "ord_plots/", sep = "") tmp_path_wg <- paste(tmp_path, "within_group/", sep = "") tmp_path_bg <- paste(tmp_path, "between_group/", sep = "") ###### SAVE plots ggplot2::ggsave(tmp_t1_ord_plot, path = paste(tmp_path_ord, sep = ""), filename = paste0(tmp_name_ord, ".png", sep = "")) ggplot2::ggsave(tmp_t1_ord_plot, path = paste(tmp_path_ord, sep = ""), filename = paste0(tmp_name_ord, ".pdf", sep = "")) ggplot2::ggsave(tmp_t1_within_group_plot, path = paste(tmp_path_wg, sep = ""), filename = paste0(tmp_name_wg, ".png", sep = "")) ggplot2::ggsave(tmp_t1_within_group_plot, path = paste(tmp_path_wg, sep = ""), filename = paste0(tmp_name_wg, ".pdf", sep = "")) ggplot2::ggsave(tmp_t1_btwn_group_plot, path = paste(tmp_path_bg, sep = ""), filename = paste0(tmp_name_bg, ".png", sep = "")) ggplot2::ggsave(tmp_t1_btwn_group_plot, path = paste(tmp_path_bg, sep = ""), filename = paste0(tmp_name_bg, ".pdf", sep = "")) rm(list = ls(pattern = "tmp_")) } ``` And then run it like so: ```{r} microeco_beta_plot(choose_input = "ssu18_ps_pime", choose_metric = "bray", choose_ord = "PCoA") ``` It would be super cool to have a function that looped through a bunch of data sets, distance metrics, and ordination methods. ```{r} for (j in 1:length(get(paste("ssu18_ps_work", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("ssu18_ps_work", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "ssu18_ps_work", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("ssu18_ps_filt", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("ssu18_ps_filt", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "ssu18_ps_filt", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("ssu18_ps_perfect", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("ssu18_ps_perfect", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "ssu18_ps_perfect", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("ssu18_ps_pime", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("ssu18_ps_pime", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "ssu18_ps_pime", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: false ssu18_ps_work_title <- "FULL Ordination Plots" ssu18_ps_filt_title <- "Arbitrary Ordination Plots" ssu18_ps_perfect_title <- "PERfect Ordination Plots" ssu18_ps_pime_title <- "PIME Ordination Plots" for (samp_ps in ssu18_data_sets) { tmp_combo_plot <- ( get(purrr::map_chr(samp_ps, ~ paste0(., "_jsd_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_unifrac_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_wunifrac_PCoA"))) ) / ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_jsd_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_unwei_unifrac_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_wei_unifrac_PCoA"))) ) + geom_point(size = 1, show.legend = FALSE) tmp_title <- paste(samp_ps, "_title", sep = "") tmp_combo_plot <- tmp_combo_plot + plot_annotation(title = get(tmp_title), subtitle = 'Comparisons of three distance metrics. Implemented in phyloseq (top) and microeco (bottom).') + plot_layout(widths = c(3, 3), guides = "collect") & theme(plot.title = element_text(size = 18), plot.subtitle = element_text(size = 16), plot.tag = element_text(size = 12), axis.title = element_text(size = 14), axis.text = element_text(size = 10), legend.position = "bottom", legend.justification = "center", legend.title = element_text(size = rel(1.75)), legend.text = element_text(size = rel(1.5))) tmp_combo_plot_name <- purrr::map_chr(samp_ps, ~ paste0(., "_plots")) assign(tmp_combo_plot_name, tmp_combo_plot) rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: false for (i in ssu18_data_sets) { dir.create(paste("files/beta/figures/combo_plots", sep = ""), recursive = TRUE) tmp_path <- paste("files/beta/figures/combo_plots", sep = "") tmp_filename <- paste(i, "_beta_div_ord_plots", sep="") tmp_get_plot <- get(purrr::map_chr(i, ~ paste0(., "_plots"))) ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".png", sep = ""), height = 28, width = 36, units = 'cm', dpi = 600, bg = "white") ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".pdf", sep = ""), height = 10, width = 12) } ``` ```{r} #| echo: false system("cp files/beta/figures/combo_plots/ssu18_ps_work_beta_div_ord_plots.png include/beta/ssu18_ps_work_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_filt_beta_div_ord_plots.png include/beta/ssu18_ps_filt_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_perfect_beta_div_ord_plots.png include/beta/ssu18_ps_perfect_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_pime_beta_div_ord_plots.png include/beta/ssu18_ps_pime_beta_div_ord_plots.png") ``` **And now the ITS code.** ```{r} its18_data_sets <- c("its18_ps_work", "its18_ps_filt", "its18_ps_perfect", "its18_ps_pime") its_dist <- c("jsd", "bray", "gower") for (samp_ps in its18_data_sets) { for (d in its_dist){ tmp_get <- get(purrr::map_chr(samp_ps, ~ paste0(., "_prop"))) ord_meths <- c("NMDS", "PCoA", "CCA", "DCA") # MDS = PCoA, "CCA", "DCA", "DPCoA", "RDA" tmp_plist <- plyr::llply(as.list(ord_meths), function(i, physeq, d) { ordi = ordinate(physeq, method = i, distance = d) plot_ordination(physeq, ordi, "samples", color = "TEMP") }, tmp_get, d) names(tmp_plist) <- ord_meths tmp_df <- plyr::ldply(tmp_plist, function(x){ df = x$data[, 1:2] colnames(df) = c("Axis_1", "Axis_2") return(cbind(df, x$data))}) names(tmp_df)[1] = "method" tmp_plot <- ggplot(tmp_df, aes(Axis_1, Axis_2, color = TEMP, shape = TEMP, fill = TEMP)) tmp_plot <- tmp_plot + geom_point(size = 4) tmp_plot <- tmp_plot + facet_wrap(~method, scales = "free") tmp_plot <- tmp_plot + scale_colour_manual(values = swel_col) tmp_df_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", .)) tmp_plist_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_", ., "_plist")) tmp_plot_name <- purrr::map_chr(d, ~ paste0(samp_ps, "_dist_", ., "_plot")) tmp_list <- list("tmp_df_name" = tmp_df, tmp_plist_name = tmp_plist, tmp_plot_name = tmp_plot) assign(paste0(samp_ps, "_", d, "_ord_results"), tmp_list) rm(list = ls(pattern = "_tmp")) } } ``` ```{r} plist_name <- objects(pattern="_ord_results") plot_num <- c(1,2,3,4) for (i in plist_name) { for (j in plot_num) { tmp_get_i <- get(i)$tmp_plist_name tmp_ord <- names(tmp_get_i)[j] tmp_name <- stringr::str_replace(i, "ord_results", tmp_ord) tmp_plot <- tmp_get_i[[j]] + scale_colour_manual(values = swel_col) tmp_plot <- tmp_plot + geom_point(size = 4, aes(shape = TEMP)) + theme(legend.position = "bottom") tmp_plot$labels$shape <- "TEMP" assign(tmp_name, tmp_plot) rm(list = ls(pattern = "tmp_")) } } ``` And now the code for ordination implementation in `microeco`. ```{r} microeco_path <- "files/beta/microeco/" for (i in its18_data_sets) { tmp_dataset <- get(purrr::map_chr(i, ~paste0(., "_me"))) tmp_dataset$cal_betadiv(unifrac = FALSE) #### CODE TO ADD JSD DISTANCE #### tmp_jsd <- phyloseq::distance(get(i), method = "jsd") tmp_jsd <- forceSymmetric(as.matrix(tmp_jsd), uplo = "L") tmp_jsd <- as.matrix(tmp_jsd) tmp_dataset$beta_diversity$jsd <- tmp_jsd #### CODE TO ADD GOWER DISTANCE #### tmp_gower <- phyloseq::distance(get(i), method = "gower") tmp_gower <- forceSymmetric(as.matrix(tmp_gower), uplo = "L") tmp_gower <- as.matrix(tmp_gower) tmp_dataset$beta_diversity$gower <- tmp_gower #### #### #### #### #### #### #### #### dir.create(paste(microeco_path, i, "/metrics/", sep = ""), recursive = TRUE) tmp_dataset$save_betadiv(dirpath = paste(microeco_path, i, "/metrics/", sep = "")) rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: true #| eval: true class(its18_ps_perfect_me$beta_diversity) summary(its18_ps_perfect_me$beta_diversity) ``` Here we call the custom function described above. ```{r} for (j in 1:length(get(paste("its18_ps_work", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("its18_ps_work", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "its18_ps_work", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("its18_ps_filt", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("its18_ps_filt", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "its18_ps_filt", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("its18_ps_perfect", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("its18_ps_perfect", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "its18_ps_perfect", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } for (j in 1:length(get(paste("its18_ps_pime", "_me", sep = ""))$beta_diversity)) { tmp_metric <- names(get(paste("its18_ps_pime", "_me", sep = ""))$beta_diversity[j]) microeco_beta_plot(choose_input = "its18_ps_pime", choose_metric = tmp_metric, choose_ord = "PCoA") rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: false its18_ps_work_title <- "FULL Ordination Plots" its18_ps_filt_title <- "Arbitrary Ordination Plots" its18_ps_perfect_title <- "PERfect Ordination Plots" its18_ps_pime_title <- "PIME Ordination Plots" for (samp_ps in its18_data_sets) { tmp_combo_plot <- ( get(purrr::map_chr(samp_ps, ~ paste0(., "_jsd_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_bray_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_gower_PCoA"))) ) / ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_jsd_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_bray_PCoA"))) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me", "_gower_PCoA"))) ) + geom_point(size = 1, show.legend = FALSE) tmp_title <- paste(samp_ps, "_title", sep = "") tmp_combo_plot <- tmp_combo_plot + plot_annotation(title = get(tmp_title), subtitle = 'Comparisons of three distance metrics. Implemented in phyloseq (top) and microeco (bottom).') + plot_layout(widths = c(3, 3), guides = "collect") & theme(plot.title = element_text(size = 18), plot.subtitle = element_text(size = 16), plot.tag = element_text(size = 12), axis.title = element_text(size = 14), axis.text = element_text(size = 10), legend.position = "bottom", legend.justification = "center", legend.title = element_text(size = rel(1.75)), legend.text = element_text(size = rel(1.5))) tmp_combo_plot_name <- purrr::map_chr(samp_ps, ~ paste0(., "_plots")) assign(tmp_combo_plot_name, tmp_combo_plot) rm(list = ls(pattern = "tmp_")) } its18_ps_pime_plots ``` ```{r} #| echo: false for (i in its18_data_sets) { dir.create(paste("files/beta/figures/combo_plots", sep = ""), recursive = TRUE) tmp_path <- paste("files/beta/figures/combo_plots", sep = "") tmp_filename <- paste(i, "_beta_div_ord_plots", sep="") tmp_get_plot <- get(purrr::map_chr(i, ~ paste0(., "_plots"))) ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".png", sep = ""), height=28, width=36, units = 'cm', dpi = 600, bg = "white") ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".pdf", sep = ""), height=10, width=12) } ``` ```{r} #| echo: false system("cp files/beta/figures/combo_plots/its18_ps_work_beta_div_ord_plots.png include/beta/its18_ps_work_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_filt_beta_div_ord_plots.png include/beta/its18_ps_filt_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_perfect_beta_div_ord_plots.png include/beta/its18_ps_perfect_beta_div_ord_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_pime_beta_div_ord_plots.png include/beta/its18_ps_pime_beta_div_ord_plots.png") ``` </details> ## 16S rRNA ::: {.column-body-outset} ::: {.panel-tabset} #### FULL ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_work_beta_div_ord_plots.png") ``` `r caption_fig_ssu("ssu_beta_div_plots_full")` #### Arbitrary ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_filt_beta_div_ord_plots.png") ``` `r caption_fig_ssu("ssu_beta_div_plots_filt")` #### PERfect ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_perfect_beta_div_ord_plots.png") ``` `r caption_fig_ssu("ssu_beta_div_plots_perfect")` #### PIME ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_pime_beta_div_ord_plots.png") ``` `r caption_fig_ssu("ssu_beta_div_plots_pime")` ::: ::: ## ITS ::: {.column-body-outset} ::: {.panel-tabset} #### Arbitrary ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_work_beta_div_ord_plots.png") ``` `r caption_fig_its("its_beta_div_plots_full")` #### FILT ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_filt_beta_div_ord_plots.png") ``` `r caption_fig_its("its_beta_div_plots_filt")` #### PERfect ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_perfect_beta_div_ord_plots.png") ``` `r caption_fig_its("its_beta_div_plots_perfect")` #### PIME ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_pime_beta_div_ord_plots.png") ``` `r caption_fig_its("its_beta_div_plots_pime")` ::: ::: ```{r} #| echo: false ## NEXT few blocks are to modify with and between group plots for (i in objects(pattern = "_wg_")) { tmp_dataset <- get(i) tmp_wg <- tmp_dataset + geom_boxplot(fill=swel_col) + scale_colour_manual(values=c("#191919", "#191919", "#191919")) tmp_name <- purrr::map_chr(i, ~ paste0(., "_mod")) assign(tmp_name, tmp_wg) rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: false #| eval: false btw_colors <- c("#FF00FF", "#FFFF00", "#00FFFF") for (i in objects(pattern = "_bg_")) { tmp_dataset <- get(i) tmp_wg <- tmp_dataset + geom_boxplot(fill = "grey90") + scale_colour_manual(values=c("#191919", "#191919", "#191919")) tmp_name <- purrr::map_chr(i, ~ paste0(., "_mod")) assign(tmp_name, tmp_wg) rm(list = ls(pattern = "tmp_")) } ``` ```{r} #| echo: false for (i in objects(pattern = "_jsd_PCoA_mod")) { tmp_wg <- get(i) tmp_wg <- tmp_wg + ylab("Jensen-Shannon divergence") tmp_name <- i assign(tmp_name, tmp_wg) rm(list = ls(pattern = "tmp_")) } for (i in objects(pattern = "_bray_PCoA_mod")) { tmp_wg <- get(i) tmp_wg <- tmp_wg + ylab("Bray-Curtis dissimilarity") tmp_name <- i assign(tmp_name, tmp_wg) rm(list = ls(pattern = "tmp_")) } for (i in objects(pattern = "_gower_PCoA_mod")) { tmp_wg <- get(i) tmp_wg <- tmp_wg + ylab("Gower distance") tmp_name <- i assign(tmp_name, tmp_wg) rm(list = ls(pattern = "tmp_")) } ``` # Within & Between Group Distances Above we calculated the sample distances within groups. Let's see what those plots look like. ## 16S rRNA ```{r} #| echo: false ssu18_ps_work_title <- "FULL (unfiltered)" ssu18_ps_filt_title <- "Arbitrary filtered" ssu18_ps_perfect_title <- "PERfect filtered" ssu18_ps_pime_title <- "PIME filtered" for (samp_ps in ssu18_data_sets) { tmp_combo_plot <- ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_jsd_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + ggtitle("Within-group") + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_unwei_unifrac_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_wei_unifrac_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) ) / ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_jsd_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + ggtitle(title = "Between-group") + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_unwei_unifrac_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_wei_unifrac_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) ) tmp_title <- paste(samp_ps, "_title", sep = "") tmp_combo_plot <- tmp_combo_plot + plot_annotation(title = get(tmp_title), subtitle = 'Treatment-level variation of three distance metrics.') + plot_layout(widths = c(6, 6), guides = "collect") & theme(plot.title = element_text(size = 18), plot.subtitle = element_text(size = 16), plot.tag = element_text(size = 12), axis.title = element_text(size = 13), axis.text = element_text(size = 10), axis.text.x = element_text(size = 14), legend.position = "none", legend.justification = "center", legend.title = element_text(size = rel(1.75)), legend.text = element_text(size = rel(1.5))) tmp_combo_plot_name <- purrr::map_chr(samp_ps, ~ paste0(., "_me_variation_plots")) assign(tmp_combo_plot_name, tmp_combo_plot) rm(list = ls(pattern = "tmp_")) } ssu18_ps_pime_me_variation_plots objects(pattern = "_plots") ``` ```{r} #| echo: false for (i in ssu18_data_sets) { dir.create(paste("files/beta/figures/combo_plots", sep = ""), recursive = TRUE) tmp_path <- paste("files/beta/figures/combo_plots", sep = "") tmp_filename <- paste(i, "_variation_plots", sep="") tmp_get_plot <- get(purrr::map_chr(i, ~ paste0(., "_me_variation_plots"))) ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".png", sep = ""), height=26, width=36, units = 'cm', dpi = 600, bg = "white") ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".pdf", sep = ""), height=9, width=12) } ``` ```{r} #| echo: false system("cp files/beta/figures/combo_plots/ssu18_ps_work_variation_plots.png include/beta/ssu18_ps_work_variation_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_filt_variation_plots.png include/beta/ssu18_ps_filt_variation_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_perfect_variation_plots.png include/beta/ssu18_ps_perfect_variation_plots.png") system("cp files/beta/figures/combo_plots/ssu18_ps_pime_variation_plots.png include/beta/ssu18_ps_pime_variation_plots.png") ``` ::: {.column-body-outset} ::: {.panel-tabset} #### FULL ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_work_variation_plots.png") ``` `r caption_fig_ssu("ssu_variation_plots_full")` #### Arbitrary ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_filt_variation_plots.png") ``` `r caption_fig_ssu("ssu_variation_plots_filt")` #### PERfect ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_perfect_variation_plots.png") ``` `r caption_fig_ssu("ssu_variation_plots_perfect")` #### PIME ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/ssu18_ps_pime_variation_plots.png") ``` `r caption_fig_ssu("ssu_variation_plots_pime")` ::: ::: ## ITS ```{r} #| echo: false its18_ps_work_title <- "FULL (unfiltered)" its18_ps_filt_title <- "Arbitrary filtered" its18_ps_perfect_title <- "PERfect filtered" its18_ps_pime_title <- "PIME filtered" for (samp_ps in its18_data_sets) { tmp_combo_plot <- ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_jsd_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + ggtitle("Within-group")+ get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_bray_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_wg_gower_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) ) / ( get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_jsd_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + ggtitle("Between-group")+ get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_bray_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) + get(purrr::map_chr(samp_ps, ~ paste0(., "_me_bg_gower_PCoA_mod"))) + geom_point(size = 2, show.legend = FALSE) ) tmp_title <- paste(samp_ps, "_title", sep = "") tmp_combo_plot <- tmp_combo_plot + plot_annotation(title = get(tmp_title), subtitle = 'Treatment-level variation of three distance metrics.') + plot_layout(widths = c(6, 6), guides = "collect") & theme(plot.title = element_text(size = 18), plot.subtitle = element_text(size = 16), plot.tag = element_text(size = 12), axis.title = element_text(size = 13), axis.text = element_text(size = 10), axis.text.x = element_text(size = 14), legend.position = "none", legend.justification = "center", legend.title = element_text(size = rel(1.75)), legend.text = element_text(size = rel(1.5))) tmp_combo_plot_name <- purrr::map_chr(samp_ps, ~ paste0(., "_me_variation_plots")) assign(tmp_combo_plot_name, tmp_combo_plot) rm(list = ls(pattern = "tmp_")) } its18_ps_pime_me_variation_plots objects(pattern = "_plots") ``` ```{r} #| echo: false for (i in its18_data_sets) { dir.create(paste("files/beta/figures/combo_plots", sep = ""), recursive = TRUE) tmp_path <- paste("files/beta/figures/combo_plots", sep = "") tmp_filename <- paste(i, "_variation_plots", sep="") tmp_get_plot <- get(purrr::map_chr(i, ~ paste0(., "_me_variation_plots"))) ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".png", sep = ""), height=26, width=36, units = 'cm', dpi = 600, bg = "white") ggplot2::ggsave(tmp_get_plot, path = tmp_path, filename = paste(tmp_filename, ".pdf", sep = ""), height=9, width=12) } ``` ```{r} #| echo: false system("cp files/beta/figures/combo_plots/its18_ps_work_variation_plots.png include/beta/its18_ps_work_variation_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_filt_variation_plots.png include/beta/its18_ps_filt_variation_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_perfect_variation_plots.png include/beta/its18_ps_perfect_variation_plots.png") system("cp files/beta/figures/combo_plots/its18_ps_pime_variation_plots.png include/beta/its18_ps_pime_variation_plots.png") ``` ::: {.column-body-outset} ::: {.panel-tabset} #### FULL ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_work_variation_plots.png") ``` `r caption_fig_its("its_variation_plots_full")` #### Arbitrary ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_filt_variation_plots.png") ``` `r caption_fig_its("its_variation_plots_filt")` #### PERfect ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_perfect_variation_plots.png") ``` `r caption_fig_its("its_variation_plots_perfect")` #### PIME ```{r} #| echo: false #| eval: true #| warning: false knitr::include_graphics("include/beta/its18_ps_pime_variation_plots.png") ``` `r caption_fig_its("its_variation_plots_pime")` ::: ::: ```{r} #| echo: false #objects() #rm(ssu18_ps_filt, ssu18_ps_work, ssu18_ps_perfect, ssu18_ps_pime) save.image("page_build/beta_wf_part_2.rdata") ``` ```{r} #| include: false #| eval: true remove(list = ls()) ``` # Workflow Output Data products generated in this workflow can be downloaded from figshare. <iframe src="https://widgets.figshare.com/articles/16828063/embed?show_title=1" width="100%" height="251" allowfullscreen frameborder="0"></iframe> <a href="da.html" class="btnnav button_round" style="float: right;">Next workflow: 7. Differentially Abundant ASVs & Taxa</a> <a href="alpha.html" class="btnnav button_round" style="float: left;">Previous workflow: 5. Alpha Diversity Estimates</a> ```{r} #| message: false #| results: hide #| eval: true #| echo: false remove(list = ls()) ### COmmon formatting scripts ### NOTE: captioner.R must be read BEFORE captions_XYZ.R source(file.path("assets", "functions.R")) ``` #### Source Code {.appendix} The source code for this page can be accessed on GitHub `r fa(name = "github")` by [clicking this link](`r source_code()`). #### Data Availability {.appendix} Data generated in this workflow and the Rdata need to run the workflow can be accessed on figshare at [10.25573/data.16828063](https://doi.org/10.25573/data.16828063). #### Last updated on {.appendix} ```{r} #| echo: false #| eval: true Sys.time() ```