FOMC Service Report

16S rRNA Gene V1V3 Amplicon Sequencing

Version V1.11

The Forsyth Institute, Cambridge, MA, USA
July 15, 2021

Project ID: FOMC4511


I. Project Summary

Project FOMC4511 services include NGS sequencing of the V1V3 region of the 16S rRNA amplicons from the samples. First and foremost, please download this report and the raw sequence data using the download links provided below. These links will expire after 60 days, and we cannot guarantee the availability of your data after that.

Full bioinformatics analysis service was requested. The analyses start with raw sequence quality and noise filtering, paired-read merging, and chimera filtering, all performed with the DADA2 denoising algorithm and pipeline.

We also provide many downstream analyses such as taxonomy assignment, alpha and beta diversity analyses, and differential abundance analysis.

For taxonomy assignment, the most informative results are the taxonomy bar plots. We provide interactive bar plots that show the relative abundance of microbes at the taxonomy level of your choice (from phylum to species).

If you specify which groups of samples you want to compare for differential abundance, we provide both ANCOM and LEfSe differential abundance analysis.

 

II. Workflow Checklist

1. Sample Received
2. Sample Quality Evaluated
3. Sample Prepared for Sequencing
4. Next-Gen Sequencing
5. Sequence Quality Check
6. Absolute Abundance
7. Report and Raw Sequence Data Available for Download
8. Bioinformatics Analysis - Reads Processing (DADA2 Quality Trimming, Denoising, Paired Reads Merging)
9. Bioinformatics Analysis - Reads Taxonomy Assignment
10. Bioinformatics Analysis - Alpha Diversity Analysis
11. Bioinformatics Analysis - Beta Diversity Analysis
12. Bioinformatics Analysis - Differential Abundance Analysis
13. Bioinformatics Analysis - Heatmap Profile
14. Bioinformatics Analysis - Network Association
 

III. NGS Sequencing

The samples were processed and analyzed with the ZymoBIOMICS® Service: Targeted Metagenomic Sequencing (Zymo Research, Irvine, CA).

DNA Extraction: If DNA extraction was performed, one of three DNA extraction kits was used, depending on the sample type and sample volume, according to the manufacturer’s instructions unless otherwise stated. The kit used in this project is marked below:

ZymoBIOMICS® DNA Miniprep Kit (Zymo Research, Irvine, CA)
ZymoBIOMICS® DNA Microprep Kit (Zymo Research, Irvine, CA)
ZymoBIOMICS®-96 MagBead DNA Kit (Zymo Research, Irvine, CA)
N/A (DNA Extraction Not Performed)
Elution Volume: 50µL
Additional Notes: NA

Targeted Library Preparation: The DNA samples were prepared for targeted sequencing with the Quick-16S™ NGS Library Prep Kit (Zymo Research, Irvine, CA). The primers were custom-designed by Zymo Research to provide the best coverage of the 16S gene while maintaining high sensitivity. The primer sets used in this project are marked below:

Quick-16S™ Primer Set V1-V2 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V1-V3 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V3-V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V6-V8 (Zymo Research, Irvine, CA)
Other: NA
Additional Notes: NA

The sequencing library was prepared using an innovative library preparation process in which PCR reactions were performed in real-time PCR machines to control cycles and therefore limit PCR chimera formation. The final PCR products were quantified with qPCR fluorescence readings and pooled together based on equal molarity. The final pooled library was cleaned up with the Select-a-Size DNA Clean & Concentrator™ (Zymo Research, Irvine, CA), then quantified with TapeStation® (Agilent Technologies, Santa Clara, CA) and Qubit® (Thermo Fisher Scientific, Waltham, WA).

Control Samples: The ZymoBIOMICS® Microbial Community Standard (Zymo Research, Irvine, CA) was used as a positive control for each DNA extraction, if performed. The ZymoBIOMICS® Microbial Community DNA Standard (Zymo Research, Irvine, CA) was used as a positive control for each targeted library preparation. Negative controls (i.e. blank extraction control, blank library preparation control) were included to assess the level of bioburden carried by the wet-lab process.

Sequencing: The final library was sequenced on Illumina® MiSeq™ with a V3 reagent kit (600 cycles). The sequencing was performed with 10% PhiX spike-in.

 

IV. Complete Report Download

The complete report of your project, including all links in this report, can be downloaded by clicking the link provided below. The downloaded file is a compressed ZIP file. Once it is unzipped, open the file “REPORT.html” (it may be shown only as "REPORT" on your computer) by double-clicking it. Your default web browser will open it and display the exact content of this report.

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Complete report download link:

To view the report, please follow these steps:
1. Download the .zip file from the report link above.
2. Extract all the contents of the downloaded .zip file to your desktop.
3. Open the extracted folder and find the "REPORT.html" file (it may be shown only as "REPORT").
4. Open the REPORT.html file by double-clicking it. Your default browser will open the top page of the complete report. Within the report, there are links to view all the analyses performed for the project.

 

V. Raw Sequence Data Download

The raw NGS sequence data is available for download with the link provided below. The data is a compressed ZIP file and can be unzipped to individual sequence files. Since this is paired-end sequencing, each of your samples is represented by two sequence files: one for READ 1, with the file extension “*_R1.fastq.gz”, and another for READ 2, with the file extension “*_R2.fastq.gz”. The files are in FASTQ format and are compressed. FASTQ is a text-based format for storing both a biological sequence and its corresponding quality scores. Most sequence analysis software can open these files. The Sample IDs associated with the R1 and R2 fastq files are listed in the table below:

Sample IDRead 1 File NameRead 2 File Name
S10zr4083_10V1V3_R1.fastq.gzzr4083_10V1V3_R2.fastq.gz
S11zr4083_11V1V3_R1.fastq.gzzr4083_11V1V3_R2.fastq.gz
S12zr4083_12V1V3_R1.fastq.gzzr4083_12V1V3_R2.fastq.gz
S13zr4083_13V1V3_R1.fastq.gzzr4083_13V1V3_R2.fastq.gz
S14zr4083_14V1V3_R1.fastq.gzzr4083_14V1V3_R2.fastq.gz
S15zr4083_15V1V3_R1.fastq.gzzr4083_15V1V3_R2.fastq.gz
S16zr4083_16V1V3_R1.fastq.gzzr4083_16V1V3_R2.fastq.gz
S17zr4083_17V1V3_R1.fastq.gzzr4083_17V1V3_R2.fastq.gz
S18zr4083_18V1V3_R1.fastq.gzzr4083_18V1V3_R2.fastq.gz
S19zr4083_19V1V3_R1.fastq.gzzr4083_19V1V3_R2.fastq.gz
S01zr4083_1V1V3_R1.fastq.gzzr4083_1V1V3_R2.fastq.gz
S20zr4083_20V1V3_R1.fastq.gzzr4083_20V1V3_R2.fastq.gz
S02zr4083_2V1V3_R1.fastq.gzzr4083_2V1V3_R2.fastq.gz
S03zr4083_3V1V3_R1.fastq.gzzr4083_3V1V3_R2.fastq.gz
S04zr4083_4V1V3_R1.fastq.gzzr4083_4V1V3_R2.fastq.gz
S05zr4083_5V1V3_R1.fastq.gzzr4083_5V1V3_R2.fastq.gz
S06zr4083_6V1V3_R1.fastq.gzzr4083_6V1V3_R2.fastq.gz
S07zr4083_7V1V3_R1.fastq.gzzr4083_7V1V3_R2.fastq.gz
S08zr4083_8V1V3_R1.fastq.gzzr4083_8V1V3_R2.fastq.gz
S09zr4083_9V1V3_R1.fastq.gzzr4083_9V1V3_R2.fastq.gz
S01zr4106_1V1V3_R1.fastq.gzzr4106_1V1V3_R2.fastq.gz
S02zr4106_2V1V3_R1.fastq.gzzr4106_2V1V3_R2.fastq.gz
S03zr4106_3V1V3_R1.fastq.gzzr4106_3V1V3_R2.fastq.gz
S04zr4106_4V1V3_R1.fastq.gzzr4106_4V1V3_R2.fastq.gz
S05zr4106_5V1V3_R1.fastq.gzzr4106_5V1V3_R2.fastq.gz
S06zr4106_6V1V3_R1.fastq.gzzr4106_6V1V3_R2.fastq.gz
S07zr4106_7V1V3_R1.fastq.gzzr4106_7V1V3_R2.fastq.gz
S08zr4106_8V1V3_R1.fastq.gzzr4106_8V1V3_R2.fastq.gz
S09zr4106_9V1V3_R1.fastq.gzzr4106_9V1V3_R2.fastq.gz
S01zr4511_1V1V3_R1.fastq.gzzr4511_1V1V3_R2.fastq.gz
S02zr4511_2V1V3_R1.fastq.gzzr4511_2V1V3_R2.fastq.gz
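
As an illustration of how the paired files can be handled after unzipping, the following is a minimal Python sketch that groups file names into R1/R2 pairs and reads records from a compressed FASTQ file. The function names (pair_fastq_files, read_fastq) are hypothetical and not part of any FOMC tool; the file-name pattern is assumed from the table above, and the reader assumes the common four-line FASTQ record layout.

```python
import gzip
import re

# Assumed naming pattern, based on the table: <prefix>_R1.fastq.gz / <prefix>_R2.fastq.gz
PAIR_RE = re.compile(r"^(?P<prefix>.+)_R(?P<read>[12])\.fastq\.gz$")


def pair_fastq_files(names):
    """Group file names into {prefix: (R1 name, R2 name)}, keeping complete pairs only."""
    slots = {}
    for name in names:
        match = PAIR_RE.match(name)
        if match is None:
            continue  # not a paired-end FASTQ file name
        pair = slots.setdefault(match["prefix"], [None, None])
        pair[int(match["read"]) - 1] = name
    return {prefix: tuple(pair) for prefix, pair in slots.items() if None not in pair}


def read_fastq(path):
    """Yield (header, sequence, quality) tuples from a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip()
            handle.readline()  # '+' separator line
            quality = handle.readline().rstrip()
            yield header, sequence, quality
```

For example, pair_fastq_files(["zr4083_10V1V3_R1.fastq.gz", "zr4083_10V1V3_R2.fastq.gz"]) returns one entry keyed by "zr4083_10V1V3".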

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Raw sequence data download link:

 

VI. Analysis - DADA2 Read Processing

What is DADA2?

DADA2 is a software package that models and corrects Illumina-sequenced amplicon errors. DADA2 infers sample sequences exactly, without coarse-graining into OTUs, and resolves differences of as little as one nucleotide. DADA2 identified more real variants and output fewer spurious sequences than other methods.

DADA2’s advantage is that it uses more of the data. The DADA2 error model incorporates quality information, which is ignored by all other methods after filtering. The DADA2 error model incorporates quantitative abundances, whereas most other methods use abundance ranks, if they use abundance at all. The DADA2 error model identifies the differences between sequences, e.g., A->C, whereas other methods merely count the mismatches. DADA2 can parameterize its error model from the data itself, rather than relying on previous datasets that may or may not reflect the PCR and sequencing protocols used in your study.

DADA2 Publication: Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23. PMID: 27214047; PMCID: PMC4927377.

The DADA2 software package is available as an R package at: https://benjjneb.github.io/dada2/index.html

Analysis Procedures:

DADA2 pipeline includes several tools for read quality control, including quality filtering, trimming, denoising, pair merging and chimera filtering. Below are the major processing steps of DADA2:

Step 1. Read trimming based on sequence quality. The quality of NGS Illumina sequences often decreases toward the end of the reads. DADA2 allows the poor-quality read ends to be trimmed off in order to improve error model building and pair merging performance.

Step 2. Learn the Error Rates. The DADA2 algorithm makes use of a parametric error model (err), and every amplicon dataset has a different set of error rates. The learnErrors method learns this error model from the data by alternating estimation of the error rates and inference of sample composition until they converge on a jointly consistent solution. As in many machine-learning problems, the algorithm must begin with an initial guess, for which the maximum possible error rates in this data are used (the error rates if only the most abundant sequence is correct and all the rest are errors).

Step 3. Infer amplicon sequence variants (ASVs) based on the error model built in the previous step. This step is also called sequence "denoising". The outcome of this step is a list of ASVs that are the equivalent of oligonucleotides.

Step 4. Merge paired reads. If the sequencing products are read pairs, DADA2 will merge the R1 and R2 ASVs into single sequences. Merging is performed by aligning the denoised forward reads with the reverse-complement of the corresponding denoised reverse reads, and then constructing the merged “contig” sequences. By default, merged sequences are only output if the forward and reverse reads overlap by at least 12 bases, and are identical to each other in the overlap region (but these conditions can be changed via function arguments).
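
The merging rule described in Step 4 can be illustrated with a small Python sketch. This is a simplified stand-in, not the DADA2 implementation: it assumes an ungapped, exact match between the end of the forward read and the start of the reverse-complemented reverse read, whereas DADA2 performs an alignment. The function names and the toy interface are hypothetical; the 12-base minimum overlap is the default quoted above.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=12):
    """Merge a forward read with the reverse complement of its reverse read.

    Returns the merged contig, or None when no exact overlap of at least
    min_overlap bases exists (simplified: no gaps or mismatches allowed).
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    # Try the longest candidate overlap first, then shorter ones.
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == rc[:k]:
            return forward + rc[k:]
    return None
```

With a 36-base toy amplicon, a 24-base forward read and a 26-base reverse read overlap by 14 bases and merge back into the full amplicon; reads overlapping by fewer than 12 bases return None.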

Step 5. Remove chimeras. The core dada method corrects substitution and indel errors, but chimeras remain. Fortunately, the accuracy of sequence variants after denoising makes identifying chimeric ASVs simpler than when dealing with fuzzy OTUs. Chimeric sequences are identified if they can be exactly reconstructed by combining a left-segment and a right-segment from two more abundant “parent” sequences. The frequency of chimeric sequences varies substantially from dataset to dataset, and depends on factors including experimental procedures and sample complexity.
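
The "exact reconstruction from two parents" idea in Step 5 can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions (plain string prefix and suffix matching, no alignment, and the caller supplies only the more abundant sequences as parents); it is not the DADA2 chimera-detection code, and the function name is hypothetical.

```python
def is_bimera(candidate, parents):
    """Return True if candidate is a left segment of one parent joined to a
    right segment of a different parent.

    parents: sequences assumed to be more abundant than the candidate.
    """
    if candidate in parents:
        return False  # identical to a parent, so not a chimera of two parents
    for cut in range(1, len(candidate)):
        left, right = candidate[:cut], candidate[cut:]
        left_parents = [p for p in parents if p.startswith(left)]
        right_parents = [p for p in parents if p.endswith(right)]
        # A chimera needs the two segments to come from different parents.
        if any(a != b for a in left_parents for b in right_parents):
            return True
    return False
```

For instance, with parents "AAAAAAAAAA" and "CCCCCCCCCC", the sequence "AAAAACCCCC" is flagged, while "AAAAAGGGGG" is not because no parent supplies its right segment.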

Results

1. Read Quality Plots. NGS sequence analysis starts with visualizing the quality of the sequencing. Below are the quality plots of the first sample for the R1 and R2 reads separately. In gray-scale is a heat map of the frequency of each quality score at each base position. The mean quality score at each position is shown by the green line, and the quartiles of the quality score distribution by the orange lines. The forward reads are usually of better quality. It is common practice to trim the last few nucleotides to avoid less well-controlled errors that can arise there. The trimming affects the downstream steps, including error model building, merging, and chimera calling. FOMC uses an empirical approach that tests many combinations of trim lengths to obtain the best final amplicon sequence variants (ASVs); see the next section, “Optimal trim length for ASVs”.

Below is the link to a PDF file for viewing the quality plots for all samples:

2. Optimal trim length for ASVs. The final number of merged and chimera-filtered ASVs depends on the quality filtering (hence trimming) at the very beginning of the DADA2 pipeline. To achieve the highest number of ASVs, an empirical approach was used:

  1. Create a random subset of each sample consisting of 5,000 R1 and 5,000 R2 (to reduce computation time)
  2. Trim 10 bases at a time from the ends of both R1 and R2 up to 50 bases
  3. For each combination of trimmed length (e.g., 300x300, 300x290, 290x290 etc), the trimmed reads are subject to the entire DADA2 pipeline for chimera-filtered merged ASVs
  4. The combination with highest percentage of the input reads becoming final ASVs is selected for the complete set of data

Below is the result of such operation, showing ASV percentages of total reads for all trimming combinations (1st Column = R1 lengths in bases; 1st Row = R2 lengths in bases):

R1/R2	281	271	261	251	241	231
321	29.17%	41.01%	44.67%	45.65%	48.17%	43.91%
311	29.62%	42.93%	45.85%	47.27%	45.38%	29.07%
301	30.25%	42.98%	45.55%	42.70%	29.08%	12.06%
291	30.56%	43.10%	42.09%	27.39%	12.98%	11.53%
281	31.73%	39.28%	26.95%	12.10%	12.21%	9.34%
271	27.10%	26.54%	11.64%	10.88%	9.32%	3.39%

Based on the above result, the trim length combination of R1 = 321 bases and R2 = 241 bases (48.17%, the highest value in the table above) was chosen for generating the final ASVs for all sequences. This combination yielded the highest percentage of input reads in merged, non-chimeric ASVs and was used for downstream analyses, if requested.
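
The selection rule can be checked directly against the table. The Python sketch below copies the percentages from the table above into a dictionary and picks the R1/R2 combination with the highest value; the variable and function names are illustrative only, and the sketch covers just the final selection, not the subsampling or the DADA2 runs that produced the percentages.

```python
# Columns of the table: R2 lengths in bases.
R2_LENGTHS = [281, 271, 261, 251, 241, 231]

# Rows of the table: R1 length -> percentage of reads ending up in final ASVs.
ASV_PERCENT = {
    321: [29.17, 41.01, 44.67, 45.65, 48.17, 43.91],
    311: [29.62, 42.93, 45.85, 47.27, 45.38, 29.07],
    301: [30.25, 42.98, 45.55, 42.70, 29.08, 12.06],
    291: [30.56, 43.10, 42.09, 27.39, 12.98, 11.53],
    281: [31.73, 39.28, 26.95, 12.10, 12.21, 9.34],
    271: [27.10, 26.54, 11.64, 10.88, 9.32, 3.39],
}


def best_trim(table, r2_lengths):
    """Return (R1 length, R2 length, percentage) for the highest percentage."""
    pct, r1, r2 = max(
        (pct, r1, r2)
        for r1, row in table.items()
        for r2, pct in zip(r2_lengths, row)
    )
    return r1, r2, pct


print(best_trim(ASV_PERCENT, R2_LENGTHS))  # (321, 241, 48.17)
```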

3. Error plots from learning the error rates. After DADA2 builds the error model for the data set, it is always worthwhile, as a sanity check if nothing else, to visualize the estimated error rates. The error rates for each possible transition (A→C, A→G, …) are shown below. Points are the observed error rates for each consensus quality score. The black line shows the estimated error rates after convergence of the machine-learning algorithm. The red line shows the error rates expected under the nominal definition of the Q-score. Ideally, the estimated error rates (black line) fit the observed rates (points) well, and the error rates drop with increasing quality, as expected.
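
For reference, the nominal definition of the Q-score mentioned above relates a quality score Q to an expected error probability p by p = 10^(-Q/10). The Python sketch below shows that conversion, together with decoding a FASTQ quality string. The Phred+33 ASCII offset is an assumption (it is the usual encoding for current Illumina output but is not stated in this report), and the function names are illustrative.

```python
def phred_to_error_prob(q):
    """Expected error probability under the nominal Q-score definition."""
    return 10 ** (-q / 10)


def decode_quality(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of Q-scores.

    offset=33 assumes Phred+33 encoding.
    """
    return [ord(ch) - offset for ch in quality_string]


# Q20 corresponds to a 1% expected error rate, Q30 to 0.1%.
for q in (20, 30):
    print(q, phred_to_error_prob(q))
```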

Forward Read R1 Error Plot


Reverse Read R2 Error Plot

The PDF versions of these plots are available here:

 

4. DADA2 Result Summary. The table below summarizes the DADA2 analysis, tracking the paired read counts of each sample through all steps of the DADA2 denoising process, including end-trimming (filtered), denoising (denoisedF, denoisedR), pair merging (merged), and chimera removal (nonchim).

Sample ID	F4083.S01	F4083.S02	F4083.S03	F4083.S04	F4083.S05	F4083.S06	F4083.S07	F4083.S08	F4083.S09	F4083.S10	F4083.S11	F4083.S12	F4083.S13	F4083.S14	F4083.S15	F4083.S16	F4083.S17	F4083.S18	F4083.S19	F4083.S20	F4106.S01	F4106.S02	F4106.S03	F4106.S04	F4106.S05	F4106.S06	F4106.S07	F4106.S08	F4106.S09	F4511.S01	F4511.S02	Row Sum	Percentage
input	19,731	28,106	26,831	24,793	29,330	20,896	16,722	26,285	21,821	25,392	27,378	25,425	27,851	23,145	23,565	24,941	23,728	24,749	31,000	21,438	30,575	33,770	36,605	39,287	39,557	37,580	35,732	39,788	29,892	34,942	49,462	900,317	100.00%
filtered	13,694	21,744	20,835	19,182	22,804	16,239	12,981	20,688	16,160	19,935	21,385	19,658	21,810	18,072	18,395	18,918	17,193	19,365	24,419	16,415	23,183	27,941	30,388	31,966	31,845	30,657	29,220	31,006	22,032	7,670	33,304	679,104	75.43%
denoisedF	12,925	20,932	19,817	18,410	21,969	15,265	12,208	19,786	15,179	18,777	20,471	19,079	20,742	17,177	17,427	18,029	16,394	18,459	23,372	15,874	22,423	27,025	29,611	30,921	31,018	29,414	28,061	29,951	21,232	7,063	32,332	651,343	72.35%
denoisedR	13,192	21,402	20,315	18,817	22,395	15,610	12,635	20,349	15,674	19,546	20,940	19,371	21,205	17,540	18,060	18,571	16,630	18,905	23,861	16,115	22,635	27,397	29,983	31,408	31,411	29,778	28,660	30,251	21,590	7,357	32,890	664,493	73.81%
merged	9,981	18,859	16,457	15,637	19,456	12,369	10,065	17,033	12,671	16,834	18,159	16,831	17,763	14,857	14,767	15,619	14,572	15,738	20,529	14,091	20,271	24,670	27,196	26,814	26,937	25,180	24,004	25,507	17,853	5,845	28,414	564,979	62.75%
nonchim	5,995	9,154	8,886	8,270	8,867	6,660	5,874	7,533	6,822	7,678	7,967	8,507	8,991	7,959	7,481	7,300	6,722	7,239	9,462	7,093	9,862	12,009	12,828	14,746	14,059	15,254	14,445	14,534	10,482	3,618	12,663	288,960	32.10%
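
The "Percentage" column is each step's row sum divided by the total input read count. A short Python sketch, using the row sums from the table above (the names ROW_SUMS and percent_of_input are illustrative only), reproduces those values:

```python
# Row sums copied from the DADA2 summary table above.
ROW_SUMS = {
    "input": 900317,
    "filtered": 679104,
    "denoisedF": 651343,
    "denoisedR": 664493,
    "merged": 564979,
    "nonchim": 288960,
}


def percent_of_input(sums):
    """Percentage of input reads remaining after each step, rounded to 2 decimals."""
    total = sums["input"]
    return {step: round(100 * count / total, 2) for step, count in sums.items()}


for step, pct in percent_of_input(ROW_SUMS).items():
    print(f"{step}: {pct:.2f}%")
```

The same calculation can be applied per sample by replacing the row sums with one sample's column of counts.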

This table can be downloaded as an Excel table below:

 

5. DADA2 Amplicon Sequence Variants (ASVs). A total of 5889 unique merged and chimera-free ASV sequences were identified. Their read counts for each sample are available in the "ASV Read Count Table", with rows for the ASV sequences and columns for the samples. This read count table can be used to compare microbial profiles among samples, and the sequences provided in the table can be used for taxonomy assignment.

 

The table can be downloaded from this link:

 
 
 
 

VII. Analysis - Read Taxonomy Assignment

Read Taxonomy Assignment - Methods

 

The species-level, open-reference 16S rRNA NGS reads taxonomy assignment pipeline

Version 20210310
 

1. Raw sequence reads in FASTA format were BLASTN-searched against a combined set of 16S rRNA reference sequences. The set consists of MOMD (version 0.1), the HOMD (version 15.2 http://www.homd.org/index.php?name=seqDownload&file&type=R ), HOMD 16S rRNA RefSeq Extended Version 1.1 (EXT), GreenGene Gold (GG) (http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/gold_strains_gg16S_aligned.fasta.gz), and the NCBI 16S rRNA reference sequence set (https://ftp.ncbi.nlm.nih.gov/blast/db/16S_ribosomal_RNA.tar.gz). These sequences were screened and combined to remove short sequences (<1000 nt), chimeras, duplicates, and sub-sequences, as well as sequences with poor taxonomy annotation (e.g., without species information). This process resulted in 1,015 sequences from HOMD V15.22, 495 from EXT, 3,940 from GG, and 18,044 from NCBI, a total of 25,120 sequences. Altogether these sequences represent a total of 15,601 oral and non-oral microbial species.

The NCBI BLASTN version 2.7.1+ (Zhang et al., 2000) was used with the default parameters. Reads with ≥ 98% sequence identity to the matched reference and ≥ 90% alignment length (i.e., ≥ 90% of the read length was aligned to the reference and used to calculate the percent sequence identity) were classified based on the taxonomy of the reference sequence with the highest sequence identity. If a read matched reference sequences representing more than one species with equal percent identity and alignment length, it was subjected to chimera checking with the USEARCH program version v8.1.1861 (Edgar 2010). Non-chimeric reads with multi-species best hits were considered valid and were assigned a unique species notation (e.g., spp) denoting unresolvable multiple species.
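
The classification thresholds described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Hit record and the best_species function are hypothetical names, hits are ranked by percent identity and then alignment length, and the chimera check applied to multi-species ties is omitted. It is not the pipeline's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    species: str
    pct_identity: float  # percent identity reported by BLASTN
    align_len: int       # alignment length in bases


def best_species(read_len, hits, min_identity=98.0, min_coverage=0.90):
    """Return the species of the best passing hit(s) for one read.

    An empty list means the read is unassigned (it would go to the de novo
    OTU step); more than one species means an unresolved multi-species hit.
    """
    passing = [
        h for h in hits
        if h.pct_identity >= min_identity
        and h.align_len / read_len >= min_coverage
    ]
    if not passing:
        return []
    top = max((h.pct_identity, h.align_len) for h in passing)
    return sorted({h.species for h in passing
                   if (h.pct_identity, h.align_len) == top})
```

A read with a single best hit returns a one-element list, while a read whose top hits tie across species returns all tied species, corresponding to the "multispecies" designation.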

2. Unassigned reads (i.e., reads with < 98% identity or < 90% alignment length) were pooled together, and reads < 200 bases were removed. The remaining reads were subjected to de novo operational taxonomic unit (OTU) calling and chimera checking using the USEARCH program version v8.1.1861 (Edgar 2010). The de novo OTU calling and chimera checking were done using 98% as the sequence identity cutoff, i.e., the species-level OTU. This step produced species-level de novo clustered OTUs with 98% identity. Representative reads from each of the OTUs/species were then BLASTN-searched against the same reference sequence set again to determine the closest species for these potential novel species. These potential novel species were pooled together with the reads that were assigned to the species level in the previous step, for downstream analyses.

Reference:
Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010 Oct 1;26(19):2460-1. doi: 10.1093/bioinformatics/btq461. Epub 2010 Aug 12. PubMed PMID: 20709691.

3. Designations used in the taxonomy:

	1) Taxonomy levels are indicated by these prefixes:
	
	   k__: domain/kingdom
	   p__: phylum
	   c__: class
	   o__: order
	   f__: family
	   g__: genus  
	   s__: species
	
	   Example: 
	
	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Blautia;s__faecis
		
	2) Unique level identified – known species:
	   
	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis
	
	   The above example shows some reads match to a single species (all levels are unique)
	
	3) Non-unique level identified – known species:

	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_spp123_3
	   
	   The above example “s__multispecies_spp123_3” indicates certain reads equally match to 3 species of the
	   genus Roseburia; the “spp123” is a temporarily assigned species ID.
	
	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__multigenus;s__multispecies_spp234_5
	   
	   The above example indicates certain reads match equally to 5 different species, which belong to multiple genera;
	   the “spp234” is a temporarily assigned species ID.
	
	4) Unique level identified – unknown species, potential novel species:
	   
	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis_nov_97%
	   
	   The above example indicates that some reads have no match to any of the reference sequences with
	   sequence identity ≥ 98% and percent coverage (alignment length) ≥ 90%. However, this group
	   of reads (actually the representative read from a de novo OTU) has 97% identity to
	   Roseburia hominis; thus this is a potential novel species, closest to Roseburia hominis
	   (but not the same species).
	
	5) Multiple level identified – unknown species, potential novel species:
	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_sppn123_3_nov_96%
	
	   The above example indicates that some reads have no match to any of the reference sequences
	   with sequence identity ≥ 98% and percent coverage (alignment length) ≥ 90%.
	   However, this group of reads (actually the representative read from a de novo OTU)
	   has 96% identity equally to 3 species in Roseburia. Thus there is no single
	   closest species; instead, this group of reads matches equally to multiple species at 96%.
	   Since the reads have passed the chimera check, they represent a potential novel species. “sppn123” is a
	   temporary ID for this potential novel species.
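
The designations above are regular enough to be parsed programmatically. The following Python sketch splits a lineage string into its levels and interprets the species label; the function names and regular expressions are illustrative assumptions derived from the examples in this section, not a specification of every label the pipeline can emit.

```python
import re

LEVELS = {"k": "kingdom", "p": "phylum", "c": "class", "o": "order",
          "f": "family", "g": "genus", "s": "species"}

# Optional "_nov_<identity>%" suffix marks a potential novel species.
SPECIES_RE = re.compile(r"^(?P<name>.+?)(?:_nov_(?P<identity>\d+(?:\.\d+)?)%)?$")
# "multispecies_spp<ID>_<N>" or "multispecies_sppn<ID>_<N>": N tied species.
MULTI_RE = re.compile(r"^multispecies_sppn?\d+_(\d+)$")


def parse_taxonomy(lineage):
    """Split 'k__...;p__...;...;s__...' into a {level name: value} dict."""
    parsed = {}
    for field in lineage.split(";"):
        prefix, _, name = field.partition("__")
        parsed[LEVELS[prefix.strip()]] = name.strip()
    return parsed


def describe_species(label):
    """Interpret a species label: tied-species count and novel-species identity."""
    match = SPECIES_RE.match(label)
    multi = MULTI_RE.match(match["name"])
    return {
        "name": match["name"],
        "n_species": int(multi.group(1)) if multi else 1,
        "novel_identity": float(match["identity"]) if match["identity"] else None,
    }
```

For example, describe_species("multispecies_sppn123_3_nov_96%") reports 3 tied species and a novel-species identity of 96.0.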

 
4. The taxonomy assignment algorithm is illustrated in the flowchart below:
 
 
 
 

Read Taxonomy Assignment - Result Summary

Code	Category	Read Count (MC=1)*	Read Count (MC=100)*
A	Total reads	288,960	288,960
B	Total assigned reads	288,388	288,388
C	Assigned reads in species with read count < MC	0	5,726
D	Assigned reads in samples with read count < 500	0	0
E	Total samples	31	31
F	Samples with reads >= 500	31	31
G	Samples with reads < 500	0	0
H	Total assigned reads used for analysis (B-C-D)	288,388	282,662
I	Reads assigned to single species	269,050	265,470
J	Reads assigned to multiple species	11,646	11,361
K	Reads assigned to novel species	7,692	5,831
L	Total number of species	376	199
M	Number of single species	252	171
N	Number of multi-species	13	7
O	Number of novel species	111	21
P	Total unassigned reads	572	572
Q	Chimeric reads	94	94
R	Reads without BLASTN hits	2	2
S	Others: short, low quality, singletons, etc.	476	476

A=B+P=C+D+H+Q+R+S
E=F+G
B=C+D+H
H=I+J+K
L=M+N+O
P=Q+R+S
* MC = Minimal Count per species; species with total read count < MC were removed.
* The assignment result from MC=100 was used in the downstream analyses.
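
The identities listed under the summary table can be verified mechanically. The Python sketch below copies the counts for both MC settings from the table and evaluates each identity; the names SUMMARY and check_identities are illustrative only.

```python
# Counts copied from the result summary table, keyed by the table's row codes.
SUMMARY = {
    "MC=1": dict(A=288960, B=288388, C=0, D=0, E=31, F=31, G=0, H=288388,
                 I=269050, J=11646, K=7692, L=376, M=252, N=13, O=111,
                 P=572, Q=94, R=2, S=476),
    "MC=100": dict(A=288960, B=288388, C=5726, D=0, E=31, F=31, G=0, H=282662,
                   I=265470, J=11361, K=5831, L=199, M=171, N=7, O=21,
                   P=572, Q=94, R=2, S=476),
}


def check_identities(c):
    """Evaluate each accounting identity for one column of the table."""
    return {
        "A=B+P": c["A"] == c["B"] + c["P"],
        "A=C+D+H+Q+R+S": c["A"] == c["C"] + c["D"] + c["H"] + c["Q"] + c["R"] + c["S"],
        "E=F+G": c["E"] == c["F"] + c["G"],
        "B=C+D+H": c["B"] == c["C"] + c["D"] + c["H"],
        "H=I+J+K": c["H"] == c["I"] + c["J"] + c["K"],
        "L=M+N+O": c["L"] == c["M"] + c["N"] + c["O"],
        "P=Q+R+S": c["P"] == c["Q"] + c["R"] + c["S"],
    }


for setting, counts in SUMMARY.items():
    print(setting, all(check_identities(counts).values()))
```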
 
 

Read Taxonomy Assignment - Sample Meta Information

#SampleID	Group
F4083.S01	CF
F4083.S02	SECC
F4083.S03	SECC
F4083.S04	SECC
F4083.S05	SECC
F4083.S06	SECC
F4083.S07	SECC
F4083.S08	SECC
F4083.S09	CF
F4083.S10	SECC
F4083.S11	SECC
F4083.S12	SECC
F4083.S13	SECC
F4083.S14	SECC
F4083.S15	SECC
F4083.S16	CF
F4083.S17	SECC
F4083.S18	SECC
F4083.S19	CF
F4083.S20	CF
F4106.S01	SECC
F4106.S02	SECC
F4106.S03	SECC
F4106.S04	CF
F4106.S05	CF
F4106.S06	SECC
F4106.S07	CF
F4106.S08	SECC
F4106.S09	CF
F4511.S01	SECC
F4511.S02	SECC
 
 

Read Taxonomy Assignment - ASV Read Counts by Samples

#Sample ID	Read Count
F4511.S01	3618
F4083.S07	5874
F4083.S01	5995
F4083.S06	6660
F4083.S17	6722
F4083.S09	6822
F4083.S20	7093
F4083.S18	7239
F4083.S16	7300
F4083.S15	7481
F4083.S08	7533
F4083.S10	7678
F4083.S14	7959
F4083.S11	7967
F4083.S04	8270
F4083.S12	8507
F4083.S05	8867
F4083.S03	8886
F4083.S13	8991
F4083.S02	9154
F4083.S19	9462
F4106.S01	9862
F4106.S09	10482
F4106.S02	12009
F4511.S02	12663
F4106.S03	12828
F4106.S05	14059
F4106.S07	14445
F4106.S08	14534
F4106.S04	14746
F4106.S06	15254
 
 

Read Taxonomy Assignment - ASV Read Counts Table

SPIDTaxonomyF4083.S01F4083.S02F4083.S03F4083.S04F4083.S05F4083.S06F4083.S07F4083.S08F4083.S09F4083.S10F4083.S11F4083.S12F4083.S13F4083.S14F4083.S15F4083.S16F4083.S17F4083.S18F4083.S19F4083.S20F4106.S01F4106.S02F4106.S03F4106.S04F4106.S05F4106.S06F4106.S07F4106.S08F4106.S09F4511.S01F4511.S02
SP1k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Veillonellaceae;g__Veillonella;s__parvula21424432536817844314161636050814922169543232102386154151432046172153898978923873104149233001169
SP10k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._HMT_314000002700000021574900360057100000001100
SP100k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__gingivalis3667590000712300230000000660000512717511400267
SP101k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._HMT_942000016002152000089000001600000050000
SP102k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIVa];g__Lachnospiraceae_[G];s__sp._Oral_Taxon_B32000129000033000000000000000441001000000
SP103k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__trevisanii1827000280800000000001800007844206144000
SP104k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__leadbetteri17057556701355539260004400036482335000106311201503261470145
SP105k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Lachnoanaerobaculum;s__umeaense1059200710000000590000944600001344801770000
SP106k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Peptostreptococcaceae;g__Peptoanaerobacter;s__yurii05525000000003200000000900000111130140
SP107k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_2152318123588184600370149000001600008160097024300
SP108k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Bergeyella;s__sp._HMT_9002000160000000032002100022000000013133056
SP109k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__australis000000116003406900000000000560000000
SP11k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_21276166160081952505729071570062771420000932632670309332149975
SP110k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_2190130000000000000000800000001191000104
SP111k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_41700000164000000890730120900000000000000
SP112k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Veillonellaceae;g__Megasphaera;s__micronuciformis00000000000002990000010586000000000
SP113k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Gemella;s__moribillum004600005044000025100006111001040002000000
SP114k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__nanceiensis1700430011802700160190500000000870000000
SP115k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__histicola00029032000011008911102424000113870001370010400
SP116k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__sputigena9815501214702738003263000001080000475029150891814632137
SP117k__Bacteria;p__Bacteroidetes;c__Bacteroides;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._Oral_Taxon_C34420430001385560000013974000004300011500000
SP118k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__oulorum0012103033009019027661600000921410902580000172
SP119k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__infantis_clade_4310000003600670000000000172780000006300
SP12k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._str._2136FAA000052000001100421405719000000028000000
SP120k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Kingella;s__oralis9878144505508752167600440323570387221160011112401960114017
SP121k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__jejuni00001812700000004930000001378000000018
SP124k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Micrococcaceae;g__Rothia;s__dentocariosa49228511382626317054013307646096069020702770299297183661111160248
SP125k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Micrococcaceae;g__Rothia;s__mucilaginosa7104200662238730561235043081121093006000000000130
SP126k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__sp._HMT_949000000000000770510000000012400010726600
SP13k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__sicca2222481741984767710736387029408910729020310285552309565500431168397223048
SP130k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__loescheii161750008600000004600018110001960460000072
SP131k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Selenomonas;s__sputigena001800220110000399100111600809500052240000
SP132k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__Ruminococcaceae_[G-1];s__bacterium_HMT_07553059002501200003652053002400035004616230470
SP133k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_392161859901023111001601873044720561050000894288216100202380
SP134k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Lautropia;s__mirabilis54835059870858610811568136502784922190000017319006821923344175167
SP135k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__salivae00260000000005147700000114191000350041300
SP136k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Lachnoanaerobaculum;s__orale0070000000500560300003103410512312900000000
SP138k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__massiliensis0000000000000000034000000254000000
SP14k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Kingella;s__denitrificans001091085181016124691030090650350071001171214234000548
SP140k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__naeslundii00000000000000000200178000131100000000
SP141k__Bacteria;p__Saccharibacteria_(TM7);c__Saccharibacteria_(TM7)_[C-1];o__Saccharibacteria_(TM7)_[O-1];f__Saccharibacteria_(TM7)_[F-1];g__Saccharibacteria_(TM7)_[G-1];s__bacterium_HMT_3460037000011000069860170290000000155000750
SP142k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__aphrophilus50950000360014000006408400000270000000
SP144k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__oralis_subsp._dentisani_clade_3980302200160000000020021000000000016000
SP145k__Bacteria;p__Saccharibacteria_(TM7);c__Saccharibacteria_(TM7)_[C-1];o__Saccharibacteria_(TM7)_[O-1];f__Saccharibacteria_(TM7)_[F-1];g__Saccharibacteria_(TM7)_[G-1];s__bacterium_HMT_347290205200000128000000016000000010100000
SP146k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__sp._HMT_499000000000000000000000000000000280
SP147k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._HMT_2840570000000000000000000000000187000
SP149k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_913190890144701700029133012636270220000002353329100
SP15k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__sp._HMT_45832090210472601525900070064173725540000519335294140370163078
SP150k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Selenomonas;s__infelix0002500000000000003000000066000000
SP151k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._HMT_278039000000021000053010000000000910000
SP152k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIVa];g__Catonella;s__morbi5103533023000160000000000000033037032340
SP154k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__johnsonii02400001800000000211300000030410340119340
SP155k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Bergeyella;s__sp._HMT_206002606260249053000064560101300155000101000089
SP158k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Lachnospiraceae_[G-3];s__bacterium_HMT_10059442678000000004200002100000560013502800
SP159k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_91423011200312425000490057944607210001420000014900
SP16k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__mucosa29001950000026729027190002521281949461273001147613139404128193122
SP162k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._HMT_3060053072400002001735000498022200000330000
SP163k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Corynebacteriaceae;g__Corynebacterium;s__matruchotii7106800046000003600000230000109271000000
SP166k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._HMT_064000000000000000000000112000000000
SP168k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Micrococcaceae;g__Rothia;s__aeria2811030940083003897000230768127000320008300403
SP17k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__buccalis148157172006626805353018340079125018917600009619838347533902030
SP170k__Bacteria;p__Saccharibacteria_(TM7);c__Saccharibacteria_(TM7)_[C-1];o__Saccharibacteria_(TM7)_[O-1];f__Saccharibacteria_(TM7)_[F-1];g__Saccharibacteria_(TM7)_[G-1];s__bacterium_HMT_957003926000000005900101000000000000000
SP172k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__nucleatum_subsp._vincentii00012200000000000000000000001720000
SP174k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._HMT_2755300460000000100038000210000390000000
SP175k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Bergeyella;s__sp._HMT_93100190000000000034000460034000300170015
SP176k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;g__Ottowia;s__sp._HMT_894000000000000000000000000040620000
SP178k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._Oral_Taxon_2990012300000000000280000250000520000150
SP179k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__sp._str._ChDCB1977000000000000000000000130001030710000
SP18k__Bacteria;p__Saccharibacteria_(TM7);c__Saccharibacteria_(TM7)_[C-1];o__Saccharibacteria_(TM7)_[O-1];f__Saccharibacteria_(TM7)_[F-1];g__Saccharibacteria_(TM7)_[G-1];s__bacterium_HMT_348383107000191337000103140014272328000598214430119651220
SP180k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Kingella;s__sp._Oral_Taxon_C21000000057260100000000000000007029000
SP182k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__sp._HMT_8980030071000000770000000000038640045000
SP184k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Selenomonas;s__noxia000000000000000000000000007900710
SP185k__Bacteria;p__Proteobacteria;c__Epsilonproteobacteria;o__Campylobacterales;f__Campylobacteraceae;g__Campylobacter;s__showae81001500040000170070000000170271300100
SP187k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__odontolyticus2403500039028320000006456000005900003100
SP19k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__sp._HMT_9080321150766708400103008330095644100000000110000
SP190k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._str._C1500000000120000480000000000000004000
SP192k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__anginosus00000800018700030005000000004300000
SP193k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Cardiobacteriales;f__Cardiobacteriaceae;g__Cardiobacterium;s__valvarum300001700000000000000770000970000022
SP2k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._str._C300722721312219111960653319123820911927460514509153366474285567429856812334648138038965873
SP20k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella;s__adiacens00104053006217014679055001240117002771190000052000
SP202k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._Oral_Taxon_710000000006100000530000000000000031
SP204k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__micans002700000000000012000000001490001900
SP205k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__sp._HMT_32681000000000000000000000028192909000
SP206k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Stomatobaculum;s__longum00000000000019480000001180000000000
SP209k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._HMT_472000052000000000000000000000000095
SP21k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__pasteri18302901401992512141452111121914780329181156265115147132434029255402994442975181100
SP213k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Peptostreptococcaceae_[XIII];g__Parvimonas;s__sp._Oral_Taxon_1100000040000000000000120000001290008
SP214k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Oribacterium;s__sinus26000001300003500000000600000000000
SP216k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Propionibacteriales;f__Propionibacteriaceae;g__Arachnia;s__rubra0000000000000000000000000017500036
SP217k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__sp._HMT_4480000000000001230000000000000028000
SP218k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Selenomonas;s__flueggei0000000000000350000000000010600000
SP22k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__flavescens|subflava41870679784192296671017503302903033010003610701342321970067
SP23k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__parasanguinis_I000000000000000000023860000000000
SP230k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__sp._Oral_Taxon_8480000000000000000000000049124000000
SP234k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__denticola000007500000000000000570000000000
SP236k__Bacteria;p__Fusobacteria;c__Fusobacteria;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__goodfellowii00000000000000000000000109118000000
SP24k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Propionibacteriales;f__Propionibacteriaceae;g__Pseudopropionibacterium;s__propionicum4729067802922505300977301058788300004571720044028104
SP244k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__sp._HMT_204000000000000000000000000107000000
SP248k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Lachnospiraceae_[G-2];s__bacterium_HMT_096000000000000000000001590000000000
SP25k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__sputorum00000014585000000000043000000000000
SP26k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__mitis3054964991424072515023544767075171504829622784253733180112213887439387188487336891098342116717
SP27k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__oralis35440941464893925937629548884501581282001993075634060732731237315558906352146255054
SP28k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_473004643534916910334439899142066308856621092036342832410102022836114417433800
SP29k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Cardiobacteriales;f__Cardiobacteriaceae;g__Cardiobacterium;s__hominis24960430015510131535222200000700002326817823185760146
SP3k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__sp._HMT_31700143283000083000008959400000002742172522432761261250
SP30k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__sp._Oral_Taxon_144164000037000012700000000000000635500132
SP32k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__shahii01805090128145084000012563071642360530000059025728200
SP33k__Bacteria;p__Fusobacteria;c__Fusobacteria;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__periodonticum6800000117029008900101131000000016500285002910
SP34k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__nucleatum19805000142000000000117009815500037024552697853964251175
SP35k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__intermedius03209531170016610000000789032092411471823026
SP36k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Peptostreptococcaceae_[XI];g__Eubacterium_[XI][G-7];s__yurii0000000000000000010290000481790003400
SP37k__Bacteria;p__Fusobacteria;c__Fusobacteria;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__nucleatum_ss_polymorphum3293801980000000023500000009902315313140401676130138223
SP38k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__parainfluenzae9287321615020720338837529129325510613690286163186504194016905545932183456943281870360
SP39k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__infantis530054480299099001793400152930000016500012400019
SP4k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_225860000000000000081037610000700877500022
SP40k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__oralis_subsp._tigurinus_clade_07000140014400000051460202450280000000000023
SP41k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Tannerella;s__sp._HMT_286622651752988342657000671146037561500003041241162459520610
SP42k__Bacteria;p__Fusobacteria;c__Fusobacteria;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._AF189244.168381206792219127247107173872200190114809134549793720170677117122455511211947
SP43k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Aerococcaceae;g__Abiotrophia;s__defectiva426173127183070233942401019992151451505314717025094541901673916014627525
SP44k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Corynebacteriales;f__Corynebacteriaceae;g__Corynebacterium;s__durum02621249700000162900000000000731530046000
SP45k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__paraphrohaemolyticus106803636580108709905098001813702090002181150000000
SP46k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__sp._HMT_17505601410000000046000098000000598600000
SP47k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._HMT_056667060180055002500000000265000100000870808
SP48k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__sp._HMT_51398480063645300300041075000000000010900000
SP49k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__veroralis0000000280010301600000000990000003800
SP5k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__tannerae1903500402100000540000000740000000000
SP50k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sanguinis117198967404243010427931553101302188555132133405200263832084775646812705523731530586
SP51k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Moraxellaceae;g__Moraxella;s__sp._Oral_Taxon_B0700700051000000005300000016000098000
SP52k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella;s__paradiacens003191541632455000183091550182005000282057017701517100
SP53k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__melaninogenica2377110168863661127402250660244114106159797732655567703100497150118277021
SP54k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Veillonellaceae;g__Veillonella;s__atypica017000163000034011114619077000621751000451007700
SP56k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__oris001429239000000670002005745270004610631310100
SP58k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__sp._HMT_3324500000110000041003600000001500000210
SP59k__Bacteria;p__Firmicutes;c__Negativicutes;o__Veillonellales;f__Veillonellaceae;g__Veillonella;s__sp._HMT_780001100200119261467801700731802220012519623028350390280127079238048
SP6k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__cristatus8336142155407175851362302088431552404972589621391112901569211379454672943127109430
SP60k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Kingella;s__sp._HMT_012000091000000001430001496000007200001
SP61k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__elongata83004272930000037110023060000012732003415200254
SP62k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__pallens00126002700000100850002100188099500230000
SP63k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_9120000003703700061051116029710000000300000
SP64k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__catoniae65078754919316020002732042000921300026219058109221236070
SP65k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_22106800000001440013867008000031700035200000
SP66k__Bacteria;p__Fusobacteria;c__Fusobacteria;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__hofstadii0321581867007375283800430001024000054630867035834314933
SP67k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._Oral_Taxon_B660079560000002510000001889600000000010100
SP68k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__parahaemolyticus00221000410033175000057000001906972000000
SP70k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__gordonii2503810912301296732140012142012641767450173167014002804400
SP71k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella;s__elegans06838013425213849161267711480211195312298011345917831224553503792791224691250109
SP72k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Capnocytophaga;s__granulosa460085091000000003450075011100017201680164000
SP73k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__cinerea0013774540202914100000350034002397280001012800020
SP74k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__mutans0003002001132625005857382410066143220014912421600000
SP75k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__sp._HMT_01800000000000170000000000005014500036
SP76k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__haemolyticus000003711796204134053416812292052001630220147159091017000
SP77k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__subflava0015700000000045006100000002051940000740
SP78k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_308003740330002140000104028440007000930000000
SP79k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_49800000350051000149915413116500014515601910111001821250
SP8k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Gemellaceae;g__Gemella;s__morbillorum607217761751910862029007120436028167906500017216002970730
SP80k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Eikenella;s__corrodens035001527000000018430000740037006514700150
SP81k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae_[XIV];g__Johnsonella;s__ignava27000000000000000000000043103000000
SP83k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__parasanguinis_II000004700014124012903230195001410110004901420040017
SP84k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__sp._HMT_036300148283301135700003859432200154220083000195077053
SP87k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__sp._Oral_Taxon_18000030050000029046232600001300000011000000
SP88k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Bergeyella;s__sp._HMT_32244913702418277002919117121717462659000043771351112720487047
SP89k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__wadei00140006800001840205228156137000310237332912002280000189
SP9k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._HMT_930001040942700691600163220001503855003621905600013100
SP90k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__constellatus0017000029120002500000001030000000000
SP92k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__salivarius000320013100051003416013700094146074071000029
SP93k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__NA;g__Gemella;s__haemolysans2473164601605380164753616235123158963880023656173713021741841371159319400
SP94k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__oralis_subsp._dentisani_clade_0580026076610000000000290000002200000600
SP95k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Corynebacteriaceae;g__Corynebacterium;s__sp._Oral_Taxon_A1600000003100000000000016002400370000
SP96k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__segnis1180930070003100190011100000001763220050000
SP97k__Bacteria;p__Saccharibacteria_(TM7);c__Saccharibacteria_(TM7)_[C-1];o__Saccharibacteria_(TM7)_[O-1];f__Saccharibacteria_(TM7)_[F-1];g__Saccharibacteria_(TM7)_[G-6];s__bacterium_HMT_8703800730310019190004325929719170854399000242009800
SP98k__Bacteria;p__Bacteroidetes;c__Bacteroides;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__sp._Oral_Taxon_B4300620005300010000814800000000000860460
SPN15k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__parahaemolyticus0000000000000000000005500010200000
SPN19k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Veillonellaceae;g__Selenomonas;s__infelix0000000000830000002000000048000000
SPN2k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__fusca00402300000000000000000000000000131
SPN20k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__catoniae0000000000000000000000010446000000
SPN21k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__salivae00035290000000034000000000000005100
SPN22k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella;s__paradiacens00100000000001100000000000000398200
SPN23k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__parainfluenzae0000109001111550002500000170002130003900012100
SPN24k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_913000000000000000000000000000014100
SPN25k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Mitsuokella;s__sp._HMT_521000000000000000000001290000000000
SPN26k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Haemophilus;s__sp._HMT_908850000000000000000020000000000000
SPN27k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Veillonellaceae;g__Centipeda;s__sp._Oral_Taxon_D180031000000000004800000000000260000
SPN28k__Bacteria;p__Saccharibacteria(TM7);c__TM7_[C];o__TM7_[O];f__TM7_[F];g__TM7_[G];s__sp._Oral_Taxon_A5600000000620000112752984066000000185086000
SPN29k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Kingella;s__denitrificans000000000000290000000000000073000
SPN30k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__sp._str._C3000025000000000000250052000000000000
SPN34k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Alloprevotella;s__sp._HMT_308000000033000023000021170000000000088
SPN40k__Bacteria;p__Firmicutes;c__Negativicutes;o__Selenomonadales;f__Selenomonadaceae;g__Selenomonas;s__sp._HMT_479540710110140000003000551605500000057076000
SPN51k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__mitis00230000040000000003423000020380024290020
SPN61k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__Aggregatibacter;s__sp._HMT_94900130248000000000000078000000000000
SPN73k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Leptotrichiaceae;g__Leptotrichia;s__sp._HMT_392004700055000048000000000034000006400
SPN85k__Bacteria;p__Fusobacteria;c__Fusobacteriia;o__Fusobacteriales;f__Fusobacteriaceae;g__Fusobacterium;s__nucleatum34820000000000000000000000000118000
SPN98k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Prevotellaceae;g__Prevotella;s__veroralis_nov_97.727%000000000000024002700001010004200000
SPP10k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__multispecies_spp10_20000000000000280000000000011300000
SPP11k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__multispecies_spp11_3006907156780064093119000101036860458002601510000197
SPP12k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Streptococcaceae;g__Streptococcus;s__multispecies_spp12_205420122365140582651442222880941008960011737910250810000059011400
SPP2k__Bacteria;p__Bacteroidetes;c__Bacteroides;o__Bacteroidales;f__Porphyromonadaceae;g__Porphyromonas;s__multispecies_spp2_200048049591590300000000017300640000001880105
SPP3k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__multifamily;g__Eubacterium_[XIVa][G-1];s__saburreum026651660560466804606473300444200124320909011500156152107
SPP5k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Neisseriales;f__Neisseriaceae;g__Neisseria;s__multispecies_spp5_2000000000229000000000000000000000
SPP9k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Carnobacteriaceae;g__Granulicatella;s__multispecies_spp9_2805523053000033137830013807402999005484002200710
 
 
Download Read Count Tables at Different Taxonomy Levels
domain
phylum
class
order
family
genus
species
 

Sample Taxonomy Bar Plots

 

VIII. Analysis - Alpha Diversity

 

In ecology, alpha diversity (α-diversity) is the mean species diversity in sites or habitats at a local scale. The term was introduced by R. H. Whittaker[1][2] together with the terms beta diversity (β-diversity) and gamma diversity (γ-diversity). Whittaker's idea was that the total species diversity in a landscape (gamma diversity) is determined by two different things: the mean species diversity in sites or habitats at a more local scale (alpha diversity) and the differentiation among those habitats (beta diversity).

References:
[1] Whittaker, R. H. (1960). Vegetation of the Siskiyou Mountains, Oregon and California. Ecological Monographs, 30, 279-338. doi:10.2307/1943563
[2] Whittaker, R. H. (1972). Evolution and Measurement of Species Diversity. Taxon, 21, 213-251. doi:10.2307/1218190

 

Boxplot of Alpha-diversity indices

The two main factors taken into account when measuring diversity are richness and evenness. Richness is the number of different kinds of organisms present in a particular area. Evenness describes how similar the population sizes of the species present are to one another. There are many different ways to measure richness and evenness; these measurements are called "estimators" or "indices". Below are boxplots of three commonly used indices, showing the values for individual samples (dots) and for sample groups (boxes).
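As an illustration of how richness and evenness enter such indices, here is a minimal Python sketch. The report does not state which three indices are plotted, so the choices below (observed richness, Shannon, Simpson, plus Pielou's evenness) are assumptions, and the two read-count vectors are made up for the example.

```python
import math

def observed_richness(counts):
    """Richness: number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i), over taxa with p_i > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Simpson diversity in the form 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def pielou_evenness(counts):
    """Evenness: Shannon index divided by its maximum, ln(richness)."""
    s = observed_richness(counts)
    return shannon_index(counts) / math.log(s) if s > 1 else 0.0

# Hypothetical samples with the same richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, observed_richness(sample),
          round(shannon_index(sample), 3),
          round(simpson_index(sample), 3),
          round(pielou_evenness(sample), 3))
```

Both samples contain four taxa, but the skewed sample scores much lower on the Shannon and Simpson indices because one taxon dominates.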

 
 
 
 
 

Alpha diversity analysis by rarefaction

Diversity measures are affected by the sampling depth. Rarefaction is a technique to assess species richness from the results of sampling. It allows species richness to be calculated for a given number of sampled individuals (here, sequence reads), based on the construction of so-called rarefaction curves. A rarefaction curve plots the number of species as a function of the number of individuals sampled. Rarefaction curves generally grow rapidly at first, as the most common species are found, and then plateau as only the rarest species remain to be sampled.

References:
Willis AD. Rarefaction, Alpha Diversity, and Statistics. Front Microbiol. 2019 Oct 23;10:2407. doi: 10.3389/fmicb.2019.02407. PMID: 31708888; PMCID: PMC6819366.
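
The rarefaction idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: reads are subsampled at random without replacement, the richness is averaged over repeated draws, and the count vector, depths, and function names are hypothetical rather than taken from the pipeline used for this report.

```python
import random

def rarefy_richness(counts, depth, rng):
    """Draw `depth` reads without replacement and count the taxa observed."""
    # Expand the count vector into one entry per read, labelled by taxon index.
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return len(set(rng.sample(pool, depth)))

def rarefaction_curve(counts, depths, n_iter=50, seed=0):
    """Mean observed richness at each sampling depth."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        draws = [rarefy_richness(counts, depth, rng) for _ in range(n_iter)]
        curve.append((depth, sum(draws) / n_iter))
    return curve

# Hypothetical sample: a few dominant taxa and several rare ones (989 reads).
counts = [500, 300, 100, 50, 20, 10, 5, 2, 1, 1]
curve = rarefaction_curve(counts, depths=[10, 50, 200, 989])
for depth, richness in curve:
    print(depth, richness)
```

At the full depth every read is drawn, so the curve ends at the total richness of the sample; at shallow depths the rare taxa are usually missed.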

 
 

IX. Analysis - Beta Diversity

 

NMDS and PCoA Plots

Beta diversity compares the similarity (or dissimilarity) of microbial profiles between different groups of samples. There are many different similarity/dissimilarity metrics. In general, they can be quantitative (using sequence abundance, e.g., Bray-Curtis or weighted UniFrac) or binary (considering only the presence or absence of sequences, e.g., binary Jaccard or unweighted UniFrac). They can also be phylogeny-based (e.g., the UniFrac metrics) or not (e.g., Bray-Curtis).

For microbiome studies, the species profiles of samples can be compared with the Bray-Curtis dissimilarity, which operates on count data. The pairwise Bray-Curtis dissimilarity matrix of all samples can then be subjected to either multi-dimensional scaling (MDS, also known as PCoA) or non-metric MDS (NMDS).
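
A minimal Python sketch of the pairwise Bray-Curtis calculation follows. The profiles are made-up read counts (rows are samples, columns are species), and the helper names are hypothetical; the sketch is meant to show the formula, not the software used for this report.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def pairwise_matrix(profiles, metric):
    """Square matrix of metric(profile_i, profile_j) for all sample pairs."""
    n = len(profiles)
    return [[metric(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical read counts for four samples and three species.
profiles = [
    [10, 0, 5],
    [10, 0, 5],   # identical to the first sample
    [0, 8, 0],    # shares no species with the first sample
    [5, 5, 5],
]
d = pairwise_matrix(profiles, bray_curtis)
for row in d:
    print([round(x, 3) for x in row])
```

Identical profiles give a dissimilarity of 0, and profiles with no species in common give 1.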

MDS/PCoA is a scaling or ordination method that starts with a matrix of similarities or dissimilarities between a set of samples and aims to produce a low-dimensional graphical plot of the data in such a way that distances between points in the plot are close to original dissimilarities.
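
To make the ordination step concrete, the following Python sketch computes only the first principal coordinate from a distance matrix. It assumes the classical formulation (double-centering of the squared distances followed by an eigendecomposition) and uses simple power iteration, which is adequate when the leading eigenvalue is positive; production tools use a full eigendecomposition. The toy distance matrix and function names are hypothetical.

```python
import math

def double_center(dist):
    """Gower centering: B = -1/2 * J * D^2 * J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def first_pcoa_axis(dist, n_iter=500):
    """Sample coordinates on the leading principal coordinate (power iteration)."""
    b = double_center(dist)
    n = len(b)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(n_iter):
        new = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            return [0.0] * n
        vec = [x / norm for x in new]
        eigenvalue = norm  # converges to the magnitude of the leading eigenvalue
    # Scale the unit eigenvector by sqrt(eigenvalue) to get coordinates.
    return [x * math.sqrt(eigenvalue) for x in vec]

# Toy example: three samples lying on a line at positions 0, 1 and 3.
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
coords = first_pcoa_axis(dist)
print([round(c, 3) for c in coords])
```

Because the toy distances are exactly one-dimensional, the distances between the returned coordinates reproduce the input matrix; with real Bray-Curtis data several axes are needed and the match is only approximate.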

NMDS is similar to MDS; however, it does not use the dissimilarity values directly. Instead, it converts them into ranks and uses these ranks in the calculation.

In our beta diversity analysis, a Bray-Curtis dissimilarity matrix was first calculated and then visualized by PCoA and NMDS separately. The results are shown below:

 
 
 
 
 

The above PCoA and NMDS plots are based on count data. The count data can also be transformed into centered log ratios (CLR) for each species. CLR-transformed data are no longer counts and cannot be used in the Bray-Curtis dissimilarity calculation. Instead, CLR-transformed profiles can be compared with the Euclidean distance. When CLR data are compared by Euclidean distance, the distance is also called the Aitchison distance.
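
A short Python sketch of the CLR transformation and the resulting Aitchison distance is given here for illustration. The report does not say how zero counts are handled before taking logarithms, so the pseudocount of 0.5 is an assumption, as are the function names and the example counts.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log ratio: log of each value minus the mean of the logs,
    i.e. the log of each value divided by the geometric mean of the sample."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def aitchison_distance(u, v, pseudocount=0.5):
    """Euclidean distance between the CLR-transformed profiles."""
    cu, cv = clr(u, pseudocount), clr(v, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cu, cv)))

# Hypothetical profiles: the second is the first sequenced ten times deeper.
shallow = [10, 20, 70]
deep = [100, 200, 700]
other = [70, 20, 10]

print(round(aitchison_distance(shallow, deep, pseudocount=0), 6))
print(round(aitchison_distance(shallow, other, pseudocount=0), 6))
```

With no pseudocount, two profiles that differ only in sequencing depth have an Aitchison distance of zero, which is the scale invariance that makes the CLR suitable for compositional data.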

Below are the NMDS and PCoA plots of the Aitchison distances of the samples:

 
 
 
 
 

Interactive 3D PCoA Plots - Bray-Curtis Dissimilarity

 
 
 

Interactive 3D PCoA Plots - Euclidean Distance

 
 
 

Interactive 3D PCoA Plots - Correlation Coefficients

 
 
 

X. Analysis - Differential Abundance

16S rRNA next generation sequencing (NGS) generates a fixed number of reads that reflect the proportion of different species in a sample, i.e., the relative abundance of species, instead of the absolute abundance. In mathematics, measurements involving probabilities, proportions, percentages, and ppm can all be thought of as compositional data. This makes the microbiome read count data "compositional" (Gloor et al., 2017). In general, compositional data represent parts of a whole which only carry relative information (http://www.compositionaldata.com/).

The problem of microbiome data being compositional arises when comparing two groups of samples to identify "differentially abundant" species. Even if a species has the same absolute abundance in two conditions, its relative abundance (e.g., percent abundance) can differ between the conditions if the abundance of other species changes greatly. This problem can lead to incorrect conclusions about the differential abundance of microbial species in the samples.
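
A small worked example in Python may help to show the effect. The absolute abundances below are invented for illustration: species X and Y are unchanged between the two conditions, and only species Z increases.

```python
import math

def relative_abundance(counts):
    """Convert absolute abundances into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items()}

# Hypothetical absolute abundances, not data from this project.
condition_a = {"species_X": 100, "species_Y": 100, "species_Z": 100}
condition_b = {"species_X": 100, "species_Y": 100, "species_Z": 800}

rel_a = relative_abundance(condition_a)
rel_b = relative_abundance(condition_b)

# Species X is unchanged in absolute terms, yet its proportion drops.
print(round(rel_a["species_X"], 3), round(rel_b["species_X"], 3))

# The log ratio between the two unchanged species is the same in both conditions.
log_ratio_a = math.log(rel_a["species_X"] / rel_a["species_Y"])
log_ratio_b = math.log(rel_b["species_X"] / rel_b["species_Y"])
print(log_ratio_a, log_ratio_b)
```

Species X falls from about 33% to 10% of the community purely because species Z grew, while the X-to-Y log ratio stays at zero in both conditions.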

When studying differential abundance (DA), the currently preferred approach is to transform the read count data into log-ratio data. The ratios are calculated between the read count of each species in a sample and a “reference” count (e.g., the geometric mean of the read counts in the sample, as in the CLR transformation). The log-ratio data allow the detection of DA species without being affected by the percentage bias mentioned above.
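The sketch below illustrates why log ratios avoid the percentage bias. It uses hypothetical proportions and, as a simplifying assumption, a single reference species (`species_B`) that is taken to be stable. It is not ANCOM itself, which considers the log ratios of each species against all other species.

```python
import math


def log_ratios(values, reference):
    """Log ratio of every taxon to the chosen reference taxon."""
    ref = values[reference]
    return {t: math.log(v / ref) for t, v in values.items() if t != reference}


# Hypothetical proportions in two conditions. The absolute abundances of
# species_A and species_B are assumed equal in both; only species_C dropped.
proportions_1 = {"species_A": 0.10, "species_B": 0.10, "species_C": 0.80}
proportions_2 = {"species_A": 0.25, "species_B": 0.25, "species_C": 0.50}

lr_1 = log_ratios(proportions_1, reference="species_B")
lr_2 = log_ratios(proportions_2, reference="species_B")

for taxon in lr_1:
    print(taxon, round(lr_1[taxon], 3), round(lr_2[taxon], 3))
```

The log ratio of species_A is the same in both conditions even though its proportion more than doubled, while the log ratio of species_C decreases, correctly singling out the species that actually changed.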

In this report, the compositional DA analysis tool “ANCOM” (analysis of composition of microbiomes) was used (Mandal et al., 2015). ANCOM transforms the count data into log ratios and is thus more suitable for comparing the composition of microbiomes in two or more populations.

References:

Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome Datasets Are Compositional: And This Is Not Optional. Front Microbiol. 2017 Nov 15;8:2224. doi: 10.3389/fmicb.2017.02224. PMID: 29187837; PMCID: PMC5695134.

Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis. 2015 May 29;26:27663. doi: 10.3402/mehd.v26.27663. PMID: 26028277; PMCID: PMC4450248.

 

ANCOM differential abundance analysis

 
View ANCOM results
Comparison No.    Comparison Name
Comparison 1      CF vs SECC
 
 
 

LEfSe - Linear Discriminant Analysis Effect Size

LEfSe (Linear Discriminant Analysis Effect Size) is an alternative method to find "organisms, genes, or pathways that consistently explain the differences between two or more microbial communities" (Segata et al., 2011). Specifically, LEfSe uses the rank-based Kruskal-Wallis (KW) rank-sum test to detect features with significantly different (relative) abundance with respect to the class of interest. Because the test is based on ranks rather than on the proportions themselves, the differential species identified among the comparison groups are less biased than those identified by directly comparing percent abundances.
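To make the rank-based step concrete, here is a minimal standard-library sketch of the Kruskal-Wallis H statistic with the usual tie correction. It covers only the KW test, not the rest of LEfSe (such as the effect size estimation). The group labels follow the comparison in this report, but the relative abundance values are hypothetical, and the p-value shortcut applies only to two groups (chi-square with 1 degree of freedom).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = {}
    for x in pooled:
        tie_sizes[x] = tie_sizes.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    # If every value is identical there is no information; return 0 by convention here.
    return h / correction if correction else 0.0


# Hypothetical relative abundances of one species in the two groups.
cf = [0.010, 0.020, 0.015, 0.030, 0.025]
secc = [0.100, 0.120, 0.080, 0.150, 0.110]

h_stat = kruskal_wallis_h([cf, secc])
# For exactly two groups, H follows a chi-square distribution with 1 degree of
# freedom under the null hypothesis, whose upper tail equals erfc(sqrt(H / 2)).
p_value = math.erfc(math.sqrt(h_stat / 2))
print(h_stat, p_value)
```

Only the ordering of the values matters: rescaling every abundance by the same monotone function would leave the H statistic unchanged.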

Reference:

Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, Huttenhower C. Metagenomic biomarker discovery and explanation. Genome Biol. 2011 Jun 24;12(6):R60. doi: 10.1186/gb-2011-12-6-r60. PMID: 21702898; PMCID: PMC3218848.

 
SECC vs CF
 
 
 
 
 
 
 

XI. Analysis - Heatmap Profile

 

Species vs sample abundance heatmap

 
 
 

XII. Analysis - Network Association

Network correlation analysis tools are commonly used to analyze the co-occurrence or co-exclusion of microbial species across samples. However, microbiome count data are compositional. If count data are normalized to the total number of counts in the sample, the values are no longer independent, and traditional statistical metrics (e.g., correlation) for the detection of species-species relationships can lead to spurious results. In addition, sequencing-based studies typically measure hundreds of OTUs (species) on few samples; thus, inference of OTU-OTU association networks is severely under-powered.

Here we use SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that addresses both of these issues (Kurtz et al., 2015). SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. SPIEC-EASI provides two algorithms for network inference: 1) Meinshausen-Bühlmann's neighborhood selection (MB method) and 2) inverse covariance selection (GLASSO method, i.e., graphical least absolute shrinkage and selection operator).

In addition to these two methods, we provide the results of a third method, SparCC (Sparse Correlations for Compositional Data) (Friedman & Alm, 2012), which is also a method for inferring correlations from compositional data. SparCC estimates the linear Pearson correlations between the log-transformed components. The SPIEC-EASI algorithms are fundamentally distinct from SparCC, which essentially estimates pairwise correlations.
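The spurious correlation that arises when counts are normalized to the sample total can be reproduced with a toy example. The absolute abundances below are hypothetical and were chosen so that taxon A and taxon B are exactly uncorrelated before normalization. This sketch only illustrates the problem and does not implement SPIEC-EASI or SparCC.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical absolute abundances of three taxa in four samples.
taxon_a = [10, 20, 10, 20]
taxon_b = [10, 10, 20, 20]
taxon_c = [10, 10, 10, 970]  # a bloom of taxon C in the last sample

# Normalize each sample to its total count, as is done for relative abundance.
totals = [a + b + c for a, b, c in zip(taxon_a, taxon_b, taxon_c)]
prop_a = [a / t for a, t in zip(taxon_a, totals)]
prop_b = [b / t for b, t in zip(taxon_b, totals)]

r_absolute = pearson(taxon_a, taxon_b)
r_relative = pearson(prop_a, prop_b)
print(r_absolute, r_relative)
```

The correlation between taxon A and taxon B is zero on the absolute scale but clearly positive after normalization, purely because the bloom of taxon C pushes both proportions down in the same sample.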

References:

Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput Biol. 2015 May 7;11(5):e1004226. doi: 10.1371/journal.pcbi.1004226. PMID: 25950956; PMCID: PMC4423992.

Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. Epub 2012 Sep 20. PMID: 23028285; PMCID: PMC3447976.

 

SPIEC-EASI network inference by neighborhood selection (MB method)

 

 

 

SPIEC-EASI network inference by inverse covariance selection (GLASSO method)

 

 

 

Association network inference by SparCC

 

 

 
 

Copyright FOMC 2021