FOMC Service Report

16S rRNA Gene V1V3 Amplicon Sequencing

Version V1.50

Version History

The Forsyth Institute, Cambridge, MA, USA
May 30, 2025

Project ID: 20250529_HAKU_6


I. Project Summary

Services for project 20250529_HAKU_6 include NGS sequencing of 16S rRNA gene V1V3 region amplicons from the samples. First and foremost, please download this report and the raw sequence data using the download links provided below. These links will expire after 60 days, and we cannot guarantee the availability of your data after that.

Full bioinformatics analysis service was requested. We provide many analyses, starting with raw sequence quality and noise filtering, paired-read merging, and chimera filtering, using the DADA2 denoising algorithm and pipeline.

We also provide many downstream analyses such as taxonomy assignment, alpha and beta diversity analyses, and differential abundance analysis.

For taxonomy assignment, the taxonomy barplots are the most informative. We provide interactive barplots that show the relative abundance of microbes at a taxonomy level of your choice (from phylum to species).

If you specify which groups of samples you want to compare for differential abundance, we provide both ANCOM and LEfSe differential abundance analyses.

 

II. Workflow Checklist

1. Sample Received
2. Sample Quality Evaluated
3. Sample Prepared for Sequencing
4. Next-Gen Sequencing
5. Sequence Quality Check
6. Absolute Abundance
7. Report and Raw Sequence Data Available for Download
8. Bioinformatics Analysis - Reads Processing (DADA2 Quality Trimming, Denoising, Paired Reads Merging)
9. Bioinformatics Analysis - Reads Taxonomy Assignment
10. Bioinformatics Analysis - Alpha Diversity Analysis
11. Bioinformatics Analysis - Beta Diversity Analysis
12. Bioinformatics Analysis - Differential Abundance Analysis
13. Bioinformatics Analysis - Heatmap Profile
14. Bioinformatics Analysis - Network Association
 

III. NGS Sequencing

The samples were processed and analyzed with the ZymoBIOMICS® Service: Targeted Metagenomic Sequencing (Zymo Research, Irvine, CA).

DNA Extraction: If DNA extraction was performed, the following DNA extraction kit was used according to the manufacturer’s instructions:

ZymoBIOMICS®-96 MagBead DNA Kit (Zymo Research, Irvine, CA)
N/A (DNA Extraction Not Performed)
Elution Volume: 50µL
Additional Notes: NA

Targeted Library Preparation: The DNA samples were prepared for targeted sequencing with the Quick-16S™ NGS Library Prep Kit (Zymo Research, Irvine, CA). These primers were custom designed by Zymo Research to provide the best coverage of the 16S gene while maintaining high sensitivity. The primer sets used in this project are marked below:

Quick-16S™ Primer Set V1-V2 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V1-V3 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V3-V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V6-V8 (Zymo Research, Irvine, CA)
Additional Notes: NA

The sequencing library was prepared using an innovative library preparation process in which PCR reactions were performed in real-time PCR machines to control cycles and therefore limit PCR chimera formation. The final PCR products were quantified with qPCR fluorescence readings and pooled together based on equal molarity. The final pooled library was cleaned up with the Select-a-Size DNA Clean & Concentrator™ (Zymo Research, Irvine, CA), then quantified with TapeStation® (Agilent Technologies, Santa Clara, CA) and Qubit® (Thermo Fisher Scientific, Waltham, MA).

Control Samples: The ZymoBIOMICS® Microbial Community Standard (Zymo Research, Irvine, CA) was used as a positive control for each DNA extraction, if performed. The ZymoBIOMICS® Microbial Community DNA Standard (Zymo Research, Irvine, CA) was used as a positive control for each targeted library preparation. Negative controls (i.e. blank extraction control, blank library preparation control) were included to assess the level of bioburden carried by the wet-lab process.

Sequencing: The final library was sequenced on an Illumina® NextSeq 2000™ with a P1 reagent kit (600 cycles) (Illumina, San Diego, CA). The sequencing was performed with a 25% PhiX spike-in.

Absolute Abundance Quantification*: A quantitative real-time PCR was set up with a standard curve. The standard curve was made with plasmid DNA containing one copy of the 16S gene and one copy of the fungal ITS2 region prepared in 10-fold serial dilutions. The primers used were the same as those used in Targeted Library Preparation. The equation generated by the plasmid DNA standard curve was used to calculate the number of gene copies in the reaction for each sample. The PCR input volume (2 µl) was used to calculate the number of gene copies per microliter in each DNA sample.
The number of genome copies per microliter DNA sample was calculated by dividing the gene copy number by an assumed number of gene copies per genome. The value used for 16S copies per genome is 4. The value used for ITS copies per genome is 200. The amount of DNA per microliter DNA sample was calculated using an assumed genome size of 4.64 × 10^6 bp, the genome size of Escherichia coli, for 16S samples, or an assumed genome size of 1.20 × 10^7 bp, the genome size of Saccharomyces cerevisiae, for ITS samples. This calculation is shown below:

Calculated Total DNA = Calculated Total Genome Copies × Assumed Genome Size (4.64 × 10^6 bp) × Average Molecular Weight of a DNA bp (660 g/mole/bp) ÷ Avogadro’s Number (6.022 × 10^23/mole)


* Absolute Abundance Quantification is only available for 16S and ITS analyses.
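The calculation chain described above can be sketched in Python as follows. This is a minimal illustration, not the facility's actual code: the function names and the example input of 8.0 × 10^5 gene copies per reaction are hypothetical, while the constants (2 µl input, 4 copies per genome, 4.64 × 10^6 bp genome, 660 g/mole/bp, 6.022 × 10^23/mole) are the 16S values quoted in this report.

```python
# Sketch of the absolute abundance arithmetic for a 16S sample.
# Constants are the values quoted in the report; names are illustrative only.
AVOGADRO = 6.022e23      # molecules per mole
BP_MOLAR_MASS = 660.0    # g/mole per DNA base pair


def gene_copies_per_ul(copies_in_reaction: float, input_volume_ul: float = 2.0) -> float:
    """Gene copies per microliter of DNA sample, from copies in the qPCR reaction."""
    return copies_in_reaction / input_volume_ul


def genome_copies_per_ul(gene_copies: float, copies_per_genome: float = 4.0) -> float:
    """Genome copies per microliter, assuming a fixed 16S copy number per genome."""
    return gene_copies / copies_per_genome


def dna_ng_per_ul(genome_copies: float, genome_size_bp: float = 4.64e6) -> float:
    """Calculated total DNA in ng per microliter for an assumed genome size."""
    grams = genome_copies * genome_size_bp * BP_MOLAR_MASS / AVOGADRO
    return grams * 1e9  # grams -> nanograms


# Hypothetical example: the standard curve reports 8.0e5 gene copies in the reaction.
copies_in_reaction = 8.0e5
gene = gene_copies_per_ul(copies_in_reaction)   # copies per microliter
genomes = genome_copies_per_ul(gene)            # genomes per microliter
dna = dna_ng_per_ul(genomes)                    # ng per microliter
print(f"{gene:.3g} gene copies/ul, {genomes:.3g} genomes/ul, {dna:.4f} ng/ul")
```

For ITS samples, the same functions would be called with copies_per_genome=200 and genome_size_bp=1.20e7, the values stated above.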

The absolute abundance standard curve data can be viewed in Excel here:

The absolute abundance standard curve is shown below:

Absolute Abundance Standard Curve

 

IV. Complete Report Download

The complete report of your project, including all links in this report, can be downloaded by clicking the link provided below. The downloaded file is a compressed ZIP file. Once it is unzipped, open the file “REPORT.html” (it may be shown only as “REPORT” on your computer) by double-clicking it. Your default web browser will open it and display the exact content of this report.

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Complete report download link:

To view the report, please follow these steps:

1. Download the .zip file from the report link above.
2. Extract all the contents of the downloaded .zip file to your desktop.
3. Open the extracted folder and find the "REPORT.html" file (it may be shown only as "REPORT").
4. Open the REPORT.html file by double-clicking it. Your default browser will open the top page of the complete report. Within the report, there are links to view all the analyses performed for the project.

 

V. Raw Sequence Data Download

The raw NGS sequence data is available for download with the link provided below. The data is a compressed ZIP file and can be unzipped to individual sequence files. Since this is paired-end sequencing, each of your samples is represented by two sequence files: one for Read 1, with the file name pattern “*_R1.fastq.gz”, and another for Read 2, with the file name pattern “*_R2.fastq.gz”. The files are in FASTQ format and are compressed. FASTQ format is a text-based data format for storing both a biological sequence and its corresponding quality scores. Most sequence analysis software will be able to open them. The Sample IDs associated with the R1 and R2 fastq files are listed in the table below:

| Sample ID | Original Sample ID | Read 1 File Name | Read 2 File Name |
|---|---|---|---|
| F19244.S10 | | zr19244_10V1V3_R1.fastq.gz | zr19244_10V1V3_R2.fastq.gz |
| F19244.S11 | | zr19244_11V1V3_R1.fastq.gz | zr19244_11V1V3_R2.fastq.gz |
| F19244.S12 | | zr19244_12V1V3_R1.fastq.gz | zr19244_12V1V3_R2.fastq.gz |
| F19244.S13 | | zr19244_13V1V3_R1.fastq.gz | zr19244_13V1V3_R2.fastq.gz |
| F19244.S14 | | zr19244_14V1V3_R1.fastq.gz | zr19244_14V1V3_R2.fastq.gz |
| F19244.S15 | | zr19244_15V1V3_R1.fastq.gz | zr19244_15V1V3_R2.fastq.gz |
| F19244.S16 | | zr19244_16V1V3_R1.fastq.gz | zr19244_16V1V3_R2.fastq.gz |
| F19244.S17 | | zr19244_17V1V3_R1.fastq.gz | zr19244_17V1V3_R2.fastq.gz |
| F19244.S18 | | zr19244_18V1V3_R1.fastq.gz | zr19244_18V1V3_R2.fastq.gz |
| F19244.S19 | | zr19244_19V1V3_R1.fastq.gz | zr19244_19V1V3_R2.fastq.gz |
| F19244.S01 | | zr19244_1V1V3_R1.fastq.gz | zr19244_1V1V3_R2.fastq.gz |
| F19244.S20 | | zr19244_20V1V3_R1.fastq.gz | zr19244_20V1V3_R2.fastq.gz |
| F19244.S21 | | zr19244_21V1V3_R1.fastq.gz | zr19244_21V1V3_R2.fastq.gz |
| F19244.S22 | | zr19244_22V1V3_R1.fastq.gz | zr19244_22V1V3_R2.fastq.gz |
| F19244.S23 | | zr19244_23V1V3_R1.fastq.gz | zr19244_23V1V3_R2.fastq.gz |
| F19244.S24 | | zr19244_24V1V3_R1.fastq.gz | zr19244_24V1V3_R2.fastq.gz |
| F19244.S25 | | zr19244_25V1V3_R1.fastq.gz | zr19244_25V1V3_R2.fastq.gz |
| F19244.S26 | | zr19244_26V1V3_R1.fastq.gz | zr19244_26V1V3_R2.fastq.gz |
| F19244.S27 | | zr19244_27V1V3_R1.fastq.gz | zr19244_27V1V3_R2.fastq.gz |
| F19244.S28 | | zr19244_28V1V3_R1.fastq.gz | zr19244_28V1V3_R2.fastq.gz |
| F19244.S29 | | zr19244_29V1V3_R1.fastq.gz | zr19244_29V1V3_R2.fastq.gz |
| F19244.S02 | | zr19244_2V1V3_R1.fastq.gz | zr19244_2V1V3_R2.fastq.gz |
| F19244.S30 | | zr19244_30V1V3_R1.fastq.gz | zr19244_30V1V3_R2.fastq.gz |
| F19244.S31 | | zr19244_31V1V3_R1.fastq.gz | zr19244_31V1V3_R2.fastq.gz |
| F19244.S32 | | zr19244_32V1V3_R1.fastq.gz | zr19244_32V1V3_R2.fastq.gz |
| F19244.S33 | | zr19244_33V1V3_R1.fastq.gz | zr19244_33V1V3_R2.fastq.gz |
| F19244.S34 | | zr19244_34V1V3_R1.fastq.gz | zr19244_34V1V3_R2.fastq.gz |
| F19244.S35 | | zr19244_35V1V3_R1.fastq.gz | zr19244_35V1V3_R2.fastq.gz |
| F19244.S36 | | zr19244_36V1V3_R1.fastq.gz | zr19244_36V1V3_R2.fastq.gz |
| F19244.S37 | | zr19244_37V1V3_R1.fastq.gz | zr19244_37V1V3_R2.fastq.gz |
| F19244.S38 | | zr19244_38V1V3_R1.fastq.gz | zr19244_38V1V3_R2.fastq.gz |
| F19244.S39 | | zr19244_39V1V3_R1.fastq.gz | zr19244_39V1V3_R2.fastq.gz |
| F19244.S03 | | zr19244_3V1V3_R1.fastq.gz | zr19244_3V1V3_R2.fastq.gz |
| F19244.S40 | | zr19244_40V1V3_R1.fastq.gz | zr19244_40V1V3_R2.fastq.gz |
| F19244.S41 | | zr19244_41V1V3_R1.fastq.gz | zr19244_41V1V3_R2.fastq.gz |
| F19244.S42 | | zr19244_42V1V3_R1.fastq.gz | zr19244_42V1V3_R2.fastq.gz |
| F19244.S43 | | zr19244_43V1V3_R1.fastq.gz | zr19244_43V1V3_R2.fastq.gz |
| F19244.S44 | | zr19244_44V1V3_R1.fastq.gz | zr19244_44V1V3_R2.fastq.gz |
| F19244.S45 | | zr19244_45V1V3_R1.fastq.gz | zr19244_45V1V3_R2.fastq.gz |
| F19244.S46 | | zr19244_46V1V3_R1.fastq.gz | zr19244_46V1V3_R2.fastq.gz |
| F19244.S47 | | zr19244_47V1V3_R1.fastq.gz | zr19244_47V1V3_R2.fastq.gz |
| F19244.S48 | | zr19244_48V1V3_R1.fastq.gz | zr19244_48V1V3_R2.fastq.gz |
| F19244.S04 | | zr19244_4V1V3_R1.fastq.gz | zr19244_4V1V3_R2.fastq.gz |
| F19244.S05 | | zr19244_5V1V3_R1.fastq.gz | zr19244_5V1V3_R2.fastq.gz |
| F19244.S06 | | zr19244_6V1V3_R1.fastq.gz | zr19244_6V1V3_R2.fastq.gz |
| F19244.S07 | | zr19244_7V1V3_R1.fastq.gz | zr19244_7V1V3_R2.fastq.gz |
| F19244.S08 | | zr19244_8V1V3_R1.fastq.gz | zr19244_8V1V3_R2.fastq.gz |
| F19244.S09 | | zr19244_9V1V3_R1.fastq.gz | zr19244_9V1V3_R2.fastq.gz |
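For readers who want to inspect the files programmatically, the following Python sketch shows how the Read 2 file name can be derived from a Read 1 file name and how four-line FASTQ records can be read. The helper names are hypothetical, the example record is made up, and the quality decoding assumes the Phred+33 encoding that current Illumina FASTQ files normally use.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def mate_file_name(r1_name: str) -> str:
    """Return the Read 2 file name that pairs with a Read 1 file name."""
    suffix = "_R1.fastq.gz"
    if not r1_name.endswith(suffix):
        raise ValueError(f"not a Read 1 file name: {r1_name}")
    return r1_name[: -len(suffix)] + "_R2.fastq.gz"


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header[1:].split()[0], sequence, quality


def phred_scores(quality: str) -> List[int]:
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality]


# Made-up single record, used here instead of a real file.
example = io.StringIO("@read1 1:N:0\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
print(records, phred_scores(records[0][2]))
print(mate_file_name("zr19244_10V1V3_R1.fastq.gz"))
```

To read one of the delivered files, the same read_fastq function can be given a handle opened with the standard library call gzip.open(path, "rt").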

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Raw sequence data download link:

 

VI. Analysis - DADA2 Read Processing

What is DADA2?

DADA2 is a software package that models and corrects Illumina-sequenced amplicon errors [1]. DADA2 infers sample sequences exactly, without coarse-graining into OTUs, and resolves differences of as little as one nucleotide. DADA2 identified more real variants and output fewer spurious sequences than other methods.

DADA2’s advantage is that it uses more of the data. The DADA2 error model incorporates quality information, which is ignored by all other methods after filtering. The DADA2 error model incorporates quantitative abundances, whereas most other methods use abundance ranks if they use abundance at all. The DADA2 error model identifies the differences between sequences, e.g., A→C, whereas other methods merely count the mismatches. DADA2 can parameterize its error model from the data itself, rather than relying on previous datasets that may or may not reflect the PCR and sequencing protocols used in your study.

The DADA2 software package is available as an R package at: https://benjjneb.github.io/dada2/index.html

References

  1. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23. PMID: 27214047; PMCID: PMC4927377.

Analysis Procedures:

The DADA2 pipeline includes several tools for read quality control, including quality filtering, trimming, denoising, pair merging and chimera filtering. Below are the major processing steps of DADA2:

Step 1. Read trimming based on sequence quality. The quality of Illumina NGS sequences often decreases toward the end of the reads. DADA2 allows the poor-quality read ends to be trimmed off in order to improve error model building and pair merging performance.

Step 2. Learn the error rates. The DADA2 algorithm makes use of a parametric error model (err), and every amplicon dataset has a different set of error rates. The learnErrors method learns this error model from the data by alternating estimation of the error rates and inference of sample composition until they converge on a jointly consistent solution. As in many machine-learning problems, the algorithm must begin with an initial guess, for which the maximum possible error rates in this data are used (the error rates if only the most abundant sequence is correct and all the rest are errors).

Step 3. Infer amplicon sequence variants (ASVs) based on the error model built in the previous step. This step is also called sequence "denoising". The outcome of this step is a list of ASVs that are the equivalent of oligonucleotides.

Step 4. Merge paired reads. If the sequencing products are read pairs, DADA2 will merge the R1 and R2 ASVs into single sequences. Merging is performed by aligning the denoised forward reads with the reverse-complement of the corresponding denoised reverse reads, and then constructing the merged “contig” sequences. By default, merged sequences are only output if the forward and reverse reads overlap by at least 12 bases, and are identical to each other in the overlap region (but these conditions can be changed via function arguments).

Step 5. Remove chimeras. The core dada method corrects substitution and indel errors, but chimeras remain. Fortunately, the accuracy of sequence variants after denoising makes identifying chimeric ASVs simpler than when dealing with fuzzy OTUs. Chimeric sequences are identified if they can be exactly reconstructed by combining a left-segment and a right-segment from two more abundant “parent” sequences. The frequency of chimeric sequences varies substantially from dataset to dataset, and depends on factors including experimental procedures and sample complexity.
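The rules described in Steps 4 and 5 can be illustrated with a simplified Python sketch. It is not DADA2's implementation (DADA2 is an R package and uses sequence alignment): the function names and test sequences are hypothetical, merging is reduced to an exact suffix/prefix match of at least 12 bases, and the chimera check tests only exact left-segment plus right-segment reconstruction from two parents.

```python
from typing import List, Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 12) -> Optional[str]:
    """Merge a forward read with the reverse complement of its mate.

    Returns the merged contig, or None when no exact overlap of at least
    min_overlap bases exists (longest overlaps are tried first).
    """
    rc = reverse_complement(reverse)
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-overlap:] == rc[:overlap]:
            return forward + rc[overlap:]
    return None


def is_bimera(candidate: str, parents: List[str]) -> bool:
    """True if candidate is exactly a left segment of one parent joined to a
    right segment of a different parent (parents are assumed more abundant)."""
    parents = [p for p in parents if p != candidate]
    for left in parents:
        for right in parents:
            if left == right:
                continue
            for cut in range(1, len(candidate)):
                if left.startswith(candidate[:cut]) and right.endswith(candidate[cut:]):
                    return True
    return False


# Made-up 40-base amplicon read from both ends with a 16-base overlap.
amplicon = "ATGCGTACCTGAAGTCCATGGCTTAACGGATCTGCAAGTC"
merged = merge_pair(amplicon[:28], reverse_complement(amplicon[12:]))

# Made-up parents and a chimera built from the left half of one and the right half of the other.
parent_a = "AAAAAAAAAACCCCCCCCCC"
parent_b = "GGGGGGGGGGTTTTTTTTTT"
chimera = parent_a[:10] + parent_b[10:]
print(merged == amplicon, is_bimera(chimera, [parent_a, parent_b]))
```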

Results

1. Read Quality Plots. NGS sequence analysis starts with visualizing the quality of the sequencing. Below are the quality plots of the first sample for the R1 and R2 reads separately. A gray-scale heat map shows the frequency of each quality score at each base position. The mean quality score at each position is shown by the green line, and the quartiles of the quality score distribution by the orange lines. The forward reads are usually of better quality. It is common practice to trim the last few nucleotides to avoid the less well-controlled errors that can arise there. The trimming affects the downstream steps, including error model building, merging and chimera calling. FOMC uses an empirical approach that tests many combinations of trim lengths in order to achieve the best final amplicon sequence variants (ASVs); see the next section, “Optimal trim length for ASVs”.

Quality plots for all samples:

2. Optimal trim length for ASVs. The final number of merged and chimera-filtered ASVs depends on the quality filtering (hence trimming) at the very beginning of the DADA2 pipeline. In order to achieve the highest number of ASVs, an empirical approach was used:

  1. Create a random subset of each sample consisting of 5,000 R1 and 5,000 R2 reads (to reduce computation time)
  2. Trim 10 bases at a time from the ends of both R1 and R2, up to 50 bases
  3. For each combination of trimmed lengths (e.g., 300x300, 300x290, 290x290, etc.), the trimmed reads are run through the entire DADA2 pipeline to produce chimera-filtered merged ASVs
  4. The combination with the highest percentage of input reads becoming final ASVs is selected for the complete data set

Below is the result of such operation, showing ASV percentages of total reads for all trimming combinations (1st Column = R1 lengths in bases; 1st Row = R2 lengths in bases):

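| R1 \ R2 | 281 | 271 | 261 | 251 | 241 | 231 |
|---|---|---|---|---|---|---|
| 321 | 81.76% | 81.93% | 82.16% | 82.45% | 82.46% | 75.85% |
| 311 | 81.72% | 81.98% | 82.22% | 82.16% | 76.01% | 57.02% |
| 301 | 81.80% | 82.07% | 81.93% | 75.68% | 57.11% | 34.02% |
| 291 | 81.84% | 81.70% | 75.31% | 56.71% | 34.07% | 23.14% |
| 281 | 81.58% | 75.16% | 56.54% | 33.76% | 23.04% | 9.80% |
| 271 | 75.19% | 56.67% | 33.74% | 22.88% | 9.76% | 3.66% |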

Based on the above result, the trim length combination of R1 = 321 bases and R2 = 241 bases (82.46%, the highest value in the table above) was chosen for generating the final ASVs for all sequences. This combination retained the highest percentage of input reads as merged, non-chimeric ASVs and was used for downstream analyses, if requested.
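The selection rule can be checked with a short Python sketch that looks up the maximum percentage among the values in the table above. The variable and function names are illustrative; only the numbers come from this report.

```python
from typing import Dict, List, Tuple

# R2 trim lengths (table columns) and ASV percentages per R1 trim length (table rows).
R2_LENGTHS = [281, 271, 261, 251, 241, 231]
ASV_PERCENT: Dict[int, List[float]] = {
    321: [81.76, 81.93, 82.16, 82.45, 82.46, 75.85],
    311: [81.72, 81.98, 82.22, 82.16, 76.01, 57.02],
    301: [81.80, 82.07, 81.93, 75.68, 57.11, 34.02],
    291: [81.84, 81.70, 75.31, 56.71, 34.07, 23.14],
    281: [81.58, 75.16, 56.54, 33.76, 23.04, 9.80],
    271: [75.19, 56.67, 33.74, 22.88, 9.76, 3.66],
}


def best_trim(table: Dict[int, List[float]], r2_lengths: List[int]) -> Tuple[int, int, float]:
    """Return (R1 length, R2 length, percentage) for the highest percentage in the grid."""
    candidates = [
        (pct, r1, r2)
        for r1, row in table.items()
        for pct, r2 in zip(row, r2_lengths)
    ]
    pct, r1, r2 = max(candidates, key=lambda item: item[0])
    return r1, r2, pct


best_r1, best_r2, best_pct = best_trim(ASV_PERCENT, R2_LENGTHS)
print(f"R1 = {best_r1}, R2 = {best_r2}, {best_pct:.2f}% of reads in final ASVs")
```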

3. Error plots from learning the error rates. After DADA2 builds the error model for the data set, it is always worthwhile, as a sanity check if nothing else, to visualize the estimated error rates. The error rates for each possible transition (A→C, A→G, …) are shown below. Points are the observed error rates for each consensus quality score. The black line shows the estimated error rates after convergence of the machine-learning algorithm. The red line shows the error rates expected under the nominal definition of the Q-score. Ideally, the estimated error rates (black line) fit the observed rates (points) well, and the error rates drop with increasing quality, as expected.
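The nominal definition of the Q-score referred to above is the standard Phred relationship, in which a quality score Q corresponds to an error probability of 10^(-Q/10). A minimal Python sketch of that relationship (the function names are illustrative) is:

```python
import math


def nominal_error_rate(q_score: float) -> float:
    """Error probability implied by a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q_score / 10)


def q_score_from_error(error_rate: float) -> float:
    """Inverse relationship: Q = -10 * log10(P)."""
    return -10 * math.log10(error_rate)


for q in (10, 20, 30, 40):
    print(f"Q{q}: expected error rate {nominal_error_rate(q):.4%}")
```

Under this definition Q20 corresponds to one expected error in 100 bases and Q30 to one in 1,000, which is the trend the red line follows in the plots.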

Forward Read R1 Error Plot


Reverse Read R2 Error Plot

PDF versions of these plots are available here:

 

4. DADA2 Result Summary. The table below summarizes the DADA2 analysis, tracking the paired read counts of each sample through every step of the DADA2 denoising process: end-trimming (filtered), denoising (denoisedF, denoisedR), pair merging (merged) and chimera removal (nonchim).

| Sample ID | input | filtered | denoisedF | denoisedR | merged | nonchim |
|---|---|---|---|---|---|---|
| F19244.S01 | 82,973 | 82,972 | 82,195 | 81,530 | 75,906 | 71,896 |
| F19244.S02 | 120,857 | 120,856 | 119,717 | 119,426 | 108,057 | 103,162 |
| F19244.S03 | 103,275 | 103,273 | 102,246 | 102,133 | 96,032 | 91,317 |
| F19244.S04 | 81,012 | 81,011 | 79,932 | 79,849 | 74,828 | 71,263 |
| F19244.S05 | 87,639 | 87,637 | 86,936 | 86,826 | 81,488 | 77,128 |
| F19244.S06 | 87,539 | 87,537 | 86,871 | 86,524 | 82,202 | 73,270 |
| F19244.S07 | 97,033 | 97,032 | 96,293 | 96,014 | 91,414 | 82,808 |
| F19244.S08 | 87,960 | 87,959 | 87,461 | 87,339 | 83,138 | 70,491 |
| F19244.S09 | 66,123 | 66,122 | 65,558 | 65,452 | 61,428 | 49,131 |
| F19244.S10 | 65,193 | 65,193 | 64,839 | 64,845 | 62,916 | 46,184 |
| F19244.S11 | 79,144 | 79,142 | 78,413 | 78,507 | 73,732 | 64,908 |
| F19244.S12 | 88,006 | 88,005 | 87,426 | 87,421 | 83,129 | 71,162 |
| F19244.S13 | 100,012 | 100,009 | 98,831 | 98,732 | 92,941 | 89,852 |
| F19244.S14 | 69,994 | 69,994 | 69,171 | 68,595 | 63,353 | 59,028 |
| F19244.S15 | 91,117 | 91,115 | 90,078 | 89,913 | 82,913 | 76,346 |
| F19244.S16 | 84,811 | 84,811 | 84,168 | 84,215 | 80,046 | 66,417 |
| F19244.S17 | 79,047 | 79,045 | 78,152 | 77,990 | 72,559 | 69,661 |
| F19244.S18 | 72,552 | 72,552 | 71,675 | 71,382 | 66,151 | 63,025 |
| F19244.S19 | 68,839 | 68,838 | 68,193 | 67,711 | 62,257 | 55,576 |
| F19244.S20 | 89,444 | 89,443 | 88,852 | 88,705 | 84,298 | 76,102 |
| F19244.S21 | 125,663 | 125,662 | 124,710 | 124,297 | 118,027 | 112,555 |
| F19244.S22 | 107,703 | 107,702 | 107,031 | 106,981 | 102,953 | 95,353 |
| F19244.S23 | 106,631 | 106,630 | 105,620 | 105,695 | 100,605 | 87,755 |
| F19244.S24 | 89,197 | 89,196 | 88,485 | 88,361 | 83,465 | 74,642 |
| F19244.S25 | 64,072 | 64,072 | 63,283 | 63,350 | 59,147 | 55,314 |
| F19244.S26 | 86,349 | 86,346 | 85,488 | 85,337 | 80,520 | 74,164 |
| F19244.S27 | 78,945 | 78,944 | 78,128 | 77,888 | 71,822 | 63,542 |
| F19244.S28 | 103,483 | 103,482 | 102,911 | 102,905 | 99,143 | 84,926 |
| F19244.S29 | 101,730 | 101,730 | 100,633 | 100,207 | 93,247 | 88,796 |
| F19244.S30 | 112,372 | 112,371 | 111,461 | 111,311 | 105,686 | 102,226 |
| F19244.S31 | 78,860 | 78,859 | 77,721 | 77,153 | 70,156 | 65,991 |
| F19244.S32 | 97,783 | 97,782 | 96,565 | 96,209 | 89,527 | 85,641 |
| F19244.S33 | 75,190 | 75,190 | 74,594 | 74,113 | 68,330 | 63,762 |
| F19244.S34 | 74,828 | 74,828 | 74,164 | 74,034 | 69,452 | 60,257 |
| F19244.S35 | 70,343 | 70,341 | 69,780 | 69,574 | 64,972 | 55,266 |
| F19244.S36 | 80,436 | 80,436 | 80,085 | 79,807 | 76,200 | 71,173 |
| F19244.S37 | 77,094 | 77,093 | 75,988 | 75,671 | 68,781 | 62,882 |
| F19244.S38 | 77,419 | 77,418 | 76,300 | 76,000 | 69,278 | 61,327 |
| F19244.S39 | 82,486 | 82,486 | 81,808 | 81,309 | 75,150 | 71,226 |
| F19244.S40 | 78,902 | 78,902 | 78,260 | 77,985 | 73,324 | 66,646 |
| F19244.S41 | 101,625 | 101,625 | 100,683 | 100,585 | 94,056 | 88,433 |
| F19244.S42 | 73,130 | 73,129 | 72,597 | 72,377 | 67,038 | 62,727 |
| F19244.S43 | 88,898 | 88,894 | 87,962 | 87,821 | 82,998 | 80,522 |
| F19244.S44 | 87,682 | 87,679 | 86,531 | 86,147 | 79,957 | 76,318 |
| F19244.S45 | 108,654 | 108,652 | 107,653 | 107,068 | 99,764 | 91,145 |
| F19244.S46 | 107,807 | 107,806 | 107,087 | 106,929 | 101,503 | 92,138 |
| F19244.S47 | 82,578 | 82,576 | 81,580 | 81,142 | 74,309 | 69,576 |
| F19244.S48 | 75,853 | 75,852 | 74,919 | 74,614 | 69,213 | 64,892 |
| Sum (all samples) | 4,198,283 | 4,198,229 | 4,159,034 | 4,147,979 | 3,887,411 | 3,557,922 |
| Percentage of input | 100.00% | 100.00% | 99.07% | 98.80% | 92.60% | 84.75% |
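The percentages in the summary table can be reproduced from the per-step sums over all samples. The Python sketch below is illustrative only (the names are not part of the DADA2 pipeline); it divides each step's sum by the input sum, using the totals reported in the table.

```python
from typing import Dict

# Per-step read totals over all 48 samples, as reported in the summary table.
STEP_TOTALS: Dict[str, int] = {
    "input": 4_198_283,
    "filtered": 4_198_229,
    "denoisedF": 4_159_034,
    "denoisedR": 4_147_979,
    "merged": 3_887_411,
    "nonchim": 3_557_922,
}


def percent_of_input(totals: Dict[str, int]) -> Dict[str, float]:
    """Express each step's read total as a percentage of the input total."""
    return {step: round(100 * count / totals["input"], 2) for step, count in totals.items()}


retention = percent_of_input(STEP_TOTALS)
for step, pct in retention.items():
    print(f"{step:>10}: {pct:6.2f}%")
```

Read this way, the largest losses occur at pair merging and chimera removal, while quality filtering and denoising remove comparatively few reads.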

This table can be downloaded as an Excel table below:

 

5. DADA2 Amplicon Sequence Variants (ASVs). A total of 7002 unique merged and chimera-free ASV sequences were identified. Their read counts for each sample are available in the "ASV Read Count Table", with one row per ASV sequence and one column per sample. This read count table can be used to compare microbial profiles among samples, and the sequences provided in the table can be used for taxonomy assignment.

 

The table can be downloaded from this link:

 
 

Sample Meta Information

Download Sample Meta Information
#SampleIDSampleNameGroupGROUP_AGEDRPART_SEXGROUP_SHEDEVERSHEDHIV_STATUSPriortestSEQHIV_SheddingGender_SheddingAge_HIVAge_SheddingAge_HIV_Shedding_Age
HAKU.02057.01HAKU.02057.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.02057.04HAKU.02057.04CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.04092.05HAKU.04092.05CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.04287.03HAKU.04287.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.05094.01HAKU.05094.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.05791.01HAKU.05791.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.06266.01HAKU.06266.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.06449.01HAKU.06449.01ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.06449.04HAKU.06449.04CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.06601.03HAKU.06601.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.06641.01HAKU.06641.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.06912.03HAKU.06912.03ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.06978.01HAKU.06978.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07031.01HAKU.07031.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.07102.01HAKU.07102.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07104.01HAKU.07104.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07991.04HAKU.07991.04CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.08146.01HAKU.08146.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.10105.01HAKU.10105.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.10302.01HAKU.10302.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.12128.01HAKU.12128.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.12368.01HAKU.12368.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.12368.03HAKU.12368.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.12653.01HAKU.12653.01ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.13032.01HAKU.13032.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.13051.01HAKU.13051.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.13110.01HAKU.13110.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.13254.01HAKU.13254.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.13254.02HAKU.13254.02ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.13254.03HAKU.13254.03ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.14131.01HAKU.14131.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.14131.03HAKU.14131.03CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.14257.05HAKU.14257.05CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.14257.06HAKU.14257.06CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.15133.01HAKU.15133.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.15133.03HAKU.15133.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.15133.05HAKU.15133.05CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.15133.08HAKU.15133.08ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.16063.02HAKU.16063.02ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.16255.01HAKU.16255.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.16255.02HAKU.16255.02ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.16255.04HAKU.16255.04ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17022.01HAKU.17022.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.17022.02HAKU.17022.02CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.17043.01HAKU.17043.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17043.03HAKU.17043.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.17164.01HAKU.17164.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.17164.06HAKU.17164.06ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.17251.01HAKU.17251.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.17282.03HAKU.17282.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.17364.01HAKU.17364.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.17575.01HAKU.17575.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17575.08HAKU.17575.08ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.18078.01HAKU.18078.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.18305.01HAKU.18305.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.19122.01HAKU.19122.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.19356.04HAKU.19356.04CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.19356.05HAKU.19356.05ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.21294.01HAKU.21294.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.21309.01HAKU.21309.01ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.22022.01HAKU.22022.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.22420.02HAKU.22420.02ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.22968.02HAKU.22968.02ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.22968.06HAKU.22968.06CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.23313.04HAKU.23313.04CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.24962.03HAKU.24962.03CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.25095.01HAKU.25095.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.25284.02HAKU.25284.02ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.25705.02HAKU.25705.02ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
 
 

ASV Read Counts by Samples

#Sample ID	Read Count
HAKU.04287.03	1
HAKU.06449.01	352
HAKU.19356.05	25,795
HAKU.06912.03	28,475
HAKU.17282.03	28,730
HAKU.06601.03	28,949
HAKU.15133.08	30,535
HAKU.07104.01	30,920
HAKU.14257.06	36,669
HAKU.17043.03	38,148
HAKU.17575.08	40,616
HAKU.12368.03	40,676
HAKU.06449.04	41,135
HAKU.07991.04	41,660
HAKU.23313.04	41,706
HAKU.12653.01	42,335
HAKU.15133.01	42,545
HAKU.17043.01	42,710
HAKU.15133.03	42,967
HAKU.22968.02	43,087
HAKU.19356.04	43,706
HAKU.22968.06	43,874
HAKU.12368.01	44,138
HAKU.14131.03	44,184
HAKU.17575.01	44,291
HAKU.25284.02	44,885
HAKU.10105.01	44,984
HAKU.16255.04	45,918
HAKU.14257.05	46,123
HAKU.10302.01	46,218
HAKU.16063.02	47,633
HAKU.13254.03	48,235
HAKU.02057.04	48,322
HAKU.14131.01	48,431
HAKU.22420.02	48,649
HAKU.17164.06	48,690
HAKU.13254.01	48,972
HAKU.21309.01	49,067
HAKU.15133.05	49,692
HAKU.17022.02	50,011
HAKU.25095.01	51,219
HAKU.04092.05	51,685
HAKU.18078.01	51,693
HAKU.24962.03	51,720
HAKU.21294.01	51,796
HAKU.17251.01	52,051
HAKU.13254.02	52,495
HAKU.17164.01	53,230
HAKU.13032.01	53,578
HAKU.17364.01	56,377
HAKU.16255.01	57,409
HAKU.22022.01	59,817
HAKU.12128.01	62,172
HAKU.25705.02	63,606
HAKU.06978.01	63,770
HAKU.18305.01	64,705
HAKU.08146.01	65,095
HAKU.16255.02	67,126
HAKU.13051.01	67,808
HAKU.19122.01	70,404
HAKU.05094.01	70,625
HAKU.06266.01	71,430
HAKU.06641.01	71,848
HAKU.17022.01	76,274
HAKU.07102.01	79,326
HAKU.02057.01	80,201
HAKU.13110.01	83,060
HAKU.07031.01	87,315
HAKU.05791.01	96,451
 
 
 

VII. Analysis - Read Taxonomy Assignment

Read Taxonomy Assignment - Methods

 

The closed-reference taxonomy assignment of the ASV sequences using BLASTN is based on the algorithm published by Al-Hebshi et al. (2015)[2].

The species-level, open-reference 16S rRNA NGS reads taxonomy assignment pipeline

Version 20210310a
 
 

1. Raw sequence reads in FASTA format were BLASTN-searched against a combined set of 16S rRNA reference sequences - the FOMC 16S rRNA Reference Sequences version 20221029 (https://microbiome.forsyth.org/ftp/refseq/). This set consists of the HOMD (version 15.22 http://www.homd.org/index.php?name=seqDownload&file&type=R ), Mouse Oral Microbiome Database (MOMD version 5.1 https://momd.org/ftp/16S_rRNA_refseq/MOMD_16S_rRNA_RefSeq/V5.1/), and the NCBI 16S rRNA reference sequence set (https://ftp.ncbi.nlm.nih.gov/blast/db/16S_ribosomal_RNA.tar.gz). These sequences were screened and combined to remove short sequences (<1000nt), chimeras, duplicates, and sub-sequences, as well as sequences with poor taxonomy annotation (e.g., without species information). This process resulted in 1,015 full-length 16S rRNA sequences from HOMD V15.22, 356 from MOMD V5.1, and 22,126 from NCBI, a total of 23,497 sequences. Altogether these sequences represent a total of 17,035 oral and non-oral microbial species.

The NCBI BLASTN version 2.7.1+ (Zhang et al., 2000)[3] was used with the default parameters. Reads with ≥ 98% sequence identity to the matched reference and ≥ 90% alignment length (i.e., ≥ 90% of the read length was aligned to the reference and used to calculate the percent sequence identity) were classified based on the taxonomy of the reference sequence with the highest sequence identity. If a read matched reference sequences representing more than one species with equal percent identity and alignment length, it was subjected to chimera checking with the USEARCH program version v8.1.1861 (Edgar 2010)[4]. Non-chimeric reads with multi-species best hits were considered valid and were assigned a unique species notation (e.g., spp) denoting unresolvable multiple species.
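The classification rule described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `Hit` container, the `classify_read` function, and the status labels are hypothetical names, and breaking ties on identity first and alignment length second is an assumption based on the wording of the paragraph.

```python
from dataclasses import dataclass

MIN_IDENTITY = 98.0  # percent identity cutoff stated in the text
MIN_COVERAGE = 0.90  # fraction of the read length that must be aligned


@dataclass(frozen=True)
class Hit:
    """One BLASTN hit for a read (hypothetical container for illustration)."""
    species: str
    pident: float    # percent identity of the alignment
    aln_length: int  # aligned length on the read, in bases


def classify_read(read_length, hits):
    """Return (status, species_list) for one read given its BLASTN hits."""
    passing = [
        h for h in hits
        if h.pident >= MIN_IDENTITY
        and h.aln_length / read_length >= MIN_COVERAGE
    ]
    if not passing:
        # Would be pooled for de novo OTU calling in the next step.
        return "unassigned", []
    # Assumed tie-break: highest identity first, then longest alignment.
    best_key = max((h.pident, h.aln_length) for h in passing)
    best_species = sorted(
        {h.species for h in passing if (h.pident, h.aln_length) == best_key}
    )
    if len(best_species) == 1:
        return "single_species", best_species
    # Equal best hits to several species: candidate for chimera checking.
    return "multi_species_candidate", best_species
```

A read of 500 bases with a single hit at 99% identity over 480 aligned bases would come back as `single_species`; the same read with two equally good hits to different species would be flagged for chimera checking.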

2. Unassigned reads (i.e., reads with < 98% identity or < 90% alignment length) were pooled together and reads < 200 bases were removed. The remaining reads were subjected to de novo operational taxonomic unit (OTU) calling and chimera checking using the USEARCH program version v8.1.1861 (Edgar 2010)[4]. The de novo OTU calling and chimera checking were done using 98% as the sequence identity cutoff, i.e., the species-level OTU. This step produced species-level de novo clustered OTUs with 98% identity. Representative reads from each of the OTUs/species were then BLASTN-searched against the same reference sequence set again to determine the closest species for these potential novel species. These potential novel species were pooled together with the reads that were assigned to the species level in the previous step, for downstream analyses.

References:

  2. Al-Hebshi NN, Nasher AT, Idris AM, Chen T. Robust species taxonomy assignment algorithm for 16S rRNA NGS reads: application to oral carcinoma samples. J Oral Microbiol. 2015 Sep 29;7:28934. doi: 10.3402/jom.v7.28934. PMID: 26426306; PMCID: PMC4590409.
  3. Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol. 2000 Feb-Apr;7(1-2):203-14. doi: 10.1089/10665270050081478. PMID: 10890397.
  4. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010 Oct 1;26(19):2460-1. doi: 10.1093/bioinformatics/btq461. Epub 2010 Aug 12. PMID: 20709691.

3. Designations used in the taxonomy:

    	1) Taxonomy levels are indicated by these prefixes:
    	
    	   k__: domain/kingdom
    	   p__: phylum
    	   c__: class
    	   o__: order
    	   f__: family
    	   g__: genus  
    	   s__: species
    	
    	   Example: 
    	
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Blautia;s__faecis
    		
    	2) Unique level identified – known species:
    	   
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis
    	
    	   The above example shows some reads match to a single species (all levels are unique)
    	
    	3) Non-unique level identified – known species:
    
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_spp123_3
    	   
    	   The above example “s__multispecies_spp123_3” indicates that certain reads match equally to 3 species of the 
    	   genus Roseburia; “spp123” is a temporarily assigned species ID.
    	
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__multigenus;s__multispecies_spp234_5
    	   
    	   The above example indicates that certain reads match equally to 5 different species, which belong to multiple genera; 
    	   “spp234” is a temporarily assigned species ID.
    	
    	4) Unique level identified – unknown species, potential novel species:
    	   
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis_nov_97%
    	   
    	   The above example indicates that some reads have no match to any of the reference sequences with 
    	   sequence identity ≥ 98% and percent coverage (alignment length) ≥ 90%. However, this group 
    	   of reads (actually the representative read from a de novo OTU) has 97% identity to 
    	   Roseburia hominis; thus it is a potential novel species, closest to Roseburia hominis 
    	   (but not the same species).
    	
    	5) Multiple level identified – unknown species, potential novel species:
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_sppn123_3_nov_96%
    	
    	   The above example indicates that some reads have no match to any of the reference sequences 
    	   with sequence identity ≥ 98% and percent coverage (alignment length) ≥ 90%. 
    	   However, this group of reads (actually the representative read from a de novo OTU) 
    	   has 96% identity equally to 3 species in Roseburia. Thus there is no single 
    	   closest species; instead, this group of reads matches equally to multiple species at 96%. 
    	   Since the reads passed the chimera check, they represent a potential novel species. “sppn123” is a 
    	   temporary ID for this potential novel species. 
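
The designations above can be read programmatically. The following is a minimal sketch, assuming the lineage strings follow exactly the patterns shown in the examples; `parse_taxonomy` and `describe_species` are hypothetical helper names, not part of the FOMC pipeline.

```python
import re

# Prefix letters as listed in the designations above.
PREFIX_TO_LEVEL = {
    "k": "kingdom", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_taxonomy(lineage):
    """Split a 'k__...;p__...;...;s__...' string into a level -> name dict."""
    levels = {}
    for field in lineage.split(";"):
        prefix, _, name = field.strip().partition("__")
        levels[PREFIX_TO_LEVEL[prefix]] = name.strip()
    return levels


def describe_species(species):
    """Interpret a species label such as 'multispecies_sppn123_3_nov_96%'."""
    # Optional '_nov_NN%' suffix marks a potential novel species.
    novel = re.search(r"_nov_(\d+(?:\.\d+)?)%$", species)
    base = species[:novel.start()] if novel else species
    # 'multispecies_<temporary ID>_<number of species>' marks equal best hits.
    multi = re.fullmatch(r"multispecies_(sppn?\d+)_(\d+)", base)
    return {
        "novel": novel is not None,
        "percent_identity": float(novel.group(1)) if novel else None,
        "multispecies": multi is not None,
        "temp_id": multi.group(1) if multi else None,
        "n_species": int(multi.group(2)) if multi else 1,
        "name": None if multi else base,
    }
```

For example, `describe_species(parse_taxonomy(lineage)["species"])` applied to the lineage in item 3) would report a multi-species assignment with temporary ID `spp123` covering 3 species.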
    

 
4. The taxonomy assignment algorithm is illustrated in the flow chart below:
 
 
 
 

Read Taxonomy Assignment - Result Summary *

Code	Category	MPC=0% (>=1 read)	MPC=0.01% (>=330 reads)
A	Total reads	3,488,320	3,488,320
B	Total assigned reads	3,305,328	3,305,328
C	Assigned reads in species with read count < MPC	0	27,925
D	Assigned reads in samples with read count < 500	320	4
E	Total samples	69	69
F	Samples with reads >= 500	67	67
G	Samples with reads < 500	2	2
H	Total assigned reads used for analysis (B-C-D)	3,305,008	3,277,399
I	Reads assigned to single species	1,387,619	1,368,304
J	Reads assigned to multiple species	1,917,389	1,909,095
K	Reads assigned to novel species	0	0
L	Total number of species	538	275
M	Number of single species	367	190
N	Number of multi-species	171	85
O	Number of novel species	0	0
P	Total unassigned reads	182,992	182,992
Q	Chimeric reads	0	0
R	Reads without BLASTN hits	62	62
S	Others: short, low quality, singletons, etc.	182,930	182,930
A=B+P=C+D+H+Q+R+S
E=F+G
B=C+D+H
H=I+J+K
L=M+N+O
P=Q+R+S
* MPC = minimal percent read count per species (as a percentage of all assigned reads); species with read count < MPC were removed.
* Samples with reads < 500 were removed from downstream analyses.
* The assignment result from MPC=0.01% was used in the downstream analyses.
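
The relations listed under the summary table can be checked against the reported numbers, and the read-count thresholds in the column headers can be reproduced from row B. The sketch below copies the values from the table; `check_summary` and `mpc_min_reads` are hypothetical helper names, and rounding the MPC threshold down to a whole read (with a minimum of 1) is an assumption that happens to match the ">=1 read" and ">=330 reads" headers.

```python
# Values copied from the Result Summary table above.
SUMMARY = {
    "MPC=0%": dict(
        A=3_488_320, B=3_305_328, C=0, D=320, E=69, F=67, G=2,
        H=3_305_008, I=1_387_619, J=1_917_389, K=0,
        L=538, M=367, N=171, O=0,
        P=182_992, Q=0, R=62, S=182_930,
    ),
    "MPC=0.01%": dict(
        A=3_488_320, B=3_305_328, C=27_925, D=4, E=69, F=67, G=2,
        H=3_277_399, I=1_368_304, J=1_909_095, K=0,
        L=275, M=190, N=85, O=0,
        P=182_992, Q=0, R=62, S=182_930,
    ),
}


def check_summary(r):
    """Evaluate each relation listed under the table for one column."""
    return {
        "A=B+P": r["A"] == r["B"] + r["P"],
        "A=C+D+H+Q+R+S": r["A"] == r["C"] + r["D"] + r["H"] + r["Q"] + r["R"] + r["S"],
        "E=F+G": r["E"] == r["F"] + r["G"],
        "B=C+D+H": r["B"] == r["C"] + r["D"] + r["H"],
        "H=I+J+K": r["H"] == r["I"] + r["J"] + r["K"],
        "L=M+N+O": r["L"] == r["M"] + r["N"] + r["O"],
        "P=Q+R+S": r["P"] == r["Q"] + r["R"] + r["S"],
    }


def mpc_min_reads(total_assigned, mpc_percent):
    """Minimum read count per species at a given MPC (assumed: rounded down, at least 1)."""
    return max(1, int(total_assigned * mpc_percent / 100))


for column, row in SUMMARY.items():
    failed = [name for name, ok in check_summary(row).items() if not ok]
    print(column, "all relations hold" if not failed else f"failed: {failed}")
```

With the values above, every relation holds for both columns, and `mpc_min_reads(3_305_328, 0.01)` gives 330, the threshold shown in the second column header.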
 
 
 

Read Taxonomy Assignment - ASV Species-Level Read Counts Table

This table shows the read counts for each sample (columns) and each species identified based on the ASV sequences. The downstream analyses were based on this table.
SPID	Taxonomy	HAKU.02057.01	HAKU.02057.04	HAKU.04092.05	HAKU.04287.03	HAKU.05094.01	HAKU.05791.01	HAKU.06266.01	HAKU.06449.01	HAKU.06449.04	HAKU.06601.03	HAKU.06641.01	HAKU.06912.03	HAKU.06978.01	HAKU.07031.01	HAKU.07102.01	HAKU.07104.01	HAKU.07991.04	HAKU.08146.01	HAKU.10105.01	HAKU.10302.01	HAKU.12128.01	HAKU.12368.01	HAKU.12368.03	HAKU.12653.01	HAKU.13032.01	HAKU.13051.01	HAKU.13110.01	HAKU.13254.01	HAKU.13254.02	HAKU.13254.03	HAKU.14131.01	HAKU.14131.03	HAKU.14257.05	HAKU.14257.06	HAKU.15133.01	HAKU.15133.03	HAKU.15133.05	HAKU.15133.08	HAKU.16063.02	HAKU.16255.01	HAKU.16255.02	HAKU.16255.04	HAKU.17022.01	HAKU.17022.02	HAKU.17043.01	HAKU.17043.03	HAKU.17164.01	HAKU.17164.06	HAKU.17251.01	HAKU.17282.03	HAKU.17364.01	HAKU.17575.01	HAKU.17575.08	HAKU.18078.01	HAKU.18305.01	HAKU.19122.01	HAKU.19356.04	HAKU.19356.05	HAKU.21294.01	HAKU.21309.01	HAKU.22022.01	HAKU.22420.02	HAKU.22968.02	HAKU.22968.06	HAKU.23313.04	HAKU.24962.03	HAKU.25095.01	HAKU.25284.02	HAKU.25705.02
SP1Bacteria;Bacteroidetes;Bacteroidetes_[C-1];Bacteroidetes_[O-1];Bacteroidetes_[F-1];Bacteroidetes_[G-5];bacterium HMT511005300000002262552003161660133840000140440000122028410400145162413310118127223170000124928
SP104Bacteria;Bacteroidetes;Bacteroidetes_[C-1];Bacteroidetes_[O-1];Bacteroidetes_[F-1];Bacteroidetes_[G-5];bacterium HMT50700800000000000000000080123000000001001640000006178000000147000000032000006700
SP105Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;oris5725780038195536058921954107149105108111950141933521224446242631513764836185846376342423166843563544310725648932231975142232546073118944062415571034354387142
SP106Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;sp. HMT9170013400130000450081209900500300882000600456111181940000103660167026400000178500120417123120000
SP107Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;nucleatum3013608801322519501443543884638155212111992471134671332271339363761384213569736155163219699298300661691562089761555845950602403950243648422181481693208319929697251
SP108Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;nucleatum_subsp._vincentii21121790186716204500332222961485025223731514721315827127205291930518438207325551521682985211631136096315240426575264115827164257400207422523431552051407417174593971837
SP113Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Kingella;sp. HMT012023107357002970015021611100441000038160131270148410200828700123018301080623601900285241170
SP115Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Eikenella;corrodens81970111125020364615912028234010671220155041625461430330144658133026422917691249340153911115141625101623308
SP116Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Tannerellaceae;Tannerella;serpentiformis3390002400135035426911010240400075100505241419817501011540263005501113322274363121
SP129Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;noxia65901115550102752176881214680240382349994260017275725839107588376617101342351728115118584431012080967106819
SP130Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;trevisanii322101008152707872112370314001598300212120876151060700575220584506000162560235213110014
SP131;;;;;Peptostreptococcaceae_[XI][G-2];bacterium HMT09112020240100012616230402164350611600521000022004120612901406070239016013233100001161228
SP135Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;wadei26066913019999901296444454436313171571055050379571378305232172821412178904315440151122216121510141215531201724731231326914637075151
SP140Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Mitsuokella;sp. HMT521158247000270012004001120000441110400200121354001010102139030001001001300260730009
SP145Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT49875430844101242050197050224866431357451831841205975426999141685156303593374166761416227940193723553611503102519905391222284651139213
SP146Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Butyrivibrio;sp. HMT080000000001200706004131320101901001930000041100001001162301800432018003013300400251863
SP147Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnospiraceae_[G-8];bacterium HMT500005000500116921001858200035121002400000540011020244911201801263015001990107412008002243843
SP148Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Shuttleworthia;satelles6311102025033114358270427742500235973580400282391512019269714912800410113354014412700871801328
SP151Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;leadbetteri365550391316407954363252784217816591182723121314535194393125811610312213320237646914612473764440942118342325104571335357426
SP153Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sp. HMT1020000000000002500021000017111801230100000000420401801400001115400011218800000704
SP154Bacteria;Firmicutes;Erysipelotrichia;Erysipelotrichales;Erysipelotrichaceae;Erysipelotrichaceae_[G-1];bacterium HMT9050010013080044210002140220019200600020001010131010174200908016002240012330490
SP155Bacteria;Absconditabacteria_(SR1);Absconditabacteria_(SR1)_[C-1];Absconditabacteria_(SR1)_[O-1];Absconditabacteria_(SR1)_[F-1];Absconditabacteria_(SR1)_[G-1];bacterium HMT87407003706103211052261167022400190207006716610034612096153405600281102027510074104554100213177150
SP156Bacteria;Absconditabacteria_(SR1);Absconditabacteria_(SR1)_[C-1];Absconditabacteria_(SR1)_[O-1];Absconditabacteria_(SR1)_[F-1];Absconditabacteria_(SR1)_[G-1];bacterium HMT3451900111506100281008000080004132135002219310000080011829314026158010274735331035002271171010133350
SP169Bacteria;Firmicutes;Tissierellia;Tissierellales;Peptoniphilaceae;Parvimonas;micra38137016630360158054979373121642275662610901374213241920621402875101480310185664012658151342564317611077262390107150549122719
SP183Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;dianae062000100002149400002410001111502031403001100110042403142001500651460355274300005041
SP187Bacteria;Firmicutes;Clostridia;Eubacteriales;Ruminococcaceae;Ruminococcaceae_[G-2];bacterium HMT085521035501209230121891316735141041561218915890301173110522136917934274092145610229251324714301192153804020154218131311295323311
SP188Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;parahaemolyticus114000078000200002579043619211001052790022561871436709551000832300050901520130806884815000025046500330029411877200
SP191Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT396150022218900025137909061550291231340541315001200216240109414493832018211210010018002230
SP193Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT901800000120000004930000000000000010101001000003000000000000400300300100000001
SP194Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Comamonadaceae;Ottowia;sp. HMT894000011330270128385060121030026000480000001400030480000210001001942400090007450
SP197Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT9141254501543414024080206331104323531600310561501316127126831712613015932171210360018218038513232828047070018150110
SP2Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;amylovorum002000117000015200012000000140000012753221200340010014812000570091360008700700303730102365
SP208Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT39211342206314630523140141324571312221550941412815153948461601402863183714699911251646212719947152220446415359901719761021498
SP235Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT26201300000900038000023201071150000060000092001030001170190000381217300000052000611870
SP238Bacteria;Firmicutes;Erysipelotrichia;Erysipelotrichales;Erysipelotrichaceae;Solobacterium;moorei152363069020157019088720374475486117849238611013833397199928125647101300379111234231217525832281139818329123102711474782312181322542968137257901017
SP239Bacteria;Firmicutes;Erysipelotrichia;Erysipelotrichales;Erysipelotrichaceae;Bulleidia;extructa00600120260031600019319101020193000114000915403001107061123011029054000533300018054567
SP240Bacteria;Absconditabacteria_(SR1);Absconditabacteria_(SR1)_[C-1];Absconditabacteria_(SR1)_[O-1];Absconditabacteria_(SR1)_[F-1];Absconditabacteria_(SR1)_[G-1];bacterium HMT875400016108308013012005501012000008111000641310000003508031919000840204012000100021620000351100
SP241Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;sp. HMT0180000482200017400132022031203000311136641000261311402000021000030001101502700109162301000
SP248Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Rothia;dentocariosa6494146301455074850621153190701566032818958171115425044642942821422681078408738222271891054132404117310643644251898122426514221630163114819052705143448122881671509494628871255
SP249Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;hofstadii359681203626380253016054318730115434681474604758451842241227552960574330761289217255910152963061104119257854968154516542
SP253Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Peptidiphaga;gingivicola158420378202655252733051711240275376308311400028991516401500115273258800260111010128349843171825139127
SP257Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Eikenella;halliae45703754128023854316211681271721312131322900213684314859378320710197449411210311228112247269141013722113883612183354820497
SP258Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sp. HMT078208611001620551541051140032421284937205112101121119131230121503113110217260201055500128702537921402509
SP261Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;pleuritidis521930027500310034818214000219181040401000158013128223103033175049582056740127100941411100691110182
SP262Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;veroralis180501300318106868462105493532247435850256213115219191241763101365721151356147221501303922122172552111545201612810310506165
SP265Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Bergeyella;sp. HMT93131140352510011001447158297711280141010005014725011003012992898012011174920430810000191301201904
SP268Bacteria;Firmicutes;Bacilli;Lactobacillales;Aerococcaceae;Abiotrophia;defectiva633579010437030212227361205160291131132601711231212425603560226109721312790295124782741451228341046133812222370332489150574422988262197158115459522763367122
SP277Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT30500950327600011097021231005721340004310100230170050007065000018168260000041000101020000020
SP28Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;tannerae0204301715397097133323210510703961865928150147113139652070183268797598013712327066445302103912386227304639068358039254082229501819373912117122
SP282Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;granulosa333100114630023373517249541054282342192391921220306538338373172932730464546341720356373273014482060116637111232117523
SP284Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;sp. HMT780242611525301892614128057600414425724729381481475696352524620800113176934242547596486031587042628012717693822100998961742402114961066039223400887493121333365490250
SP297Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;salivae119204309049721046100572824518142131197375109533831065757851021110268296113800281854967551912129506491102411667812141295723706170431552821177511292121050816619517788338
SP298Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;shahii2531156590127234106811220913919622310015627124813241517452402050923126301318716579129816926912049411843503257436228242152974244832078105
SP299Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Dialister;invisus1091711603022908398242120308154013642102919432454291601518404542997363015520202016331273851518122051172623841583512916218231115862331118477
SP3Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;buccalis3960012139400125013501371222132111011294409134211049338921243874185271573221286540268261840111116812132615418371161610324
SP30Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;hongkongensis6161800411600267012522176221310131171082938301839135153014112281211165321511005462499310249016450313
SP300Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;actinomycetemcomitans0924500000747008900002300000000000000045953388514300409800310900100059000032000000000027
SP301Bacteria;Bacteroidetes;Bacteroidetes_[C-1];Bacteroidetes_[O-1];Bacteroidetes_[F-1];Bacteroidetes_[G-5];bacterium HMT50500300000012007000080900001600000110137015110154180346201700910640130000001200900038100
SP306Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT33231106130812300313030400027250031902329153220102901210304137002700263018791302241498
SP307Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Tannerellaceae;Tannerella;forsythia00900770901093908142431030250675655051133012013510170727315254136191221086538101290442517830000101175249
SP309Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Burkholderiaceae;Lautropia;mirabilis2956500130904312689003312090261508889334149691671511771421168224203217531332131673914149725413267238054419272146130566056195698135214505516439151457639353511401486023156367
SP316Bacteria;Firmicutes;Clostridia;Eubacteriales;Ruminococcaceae;Ruminococcaceae_[G-1];bacterium HMT0756813410805154038561461694935495819321548616818713112118078825515397120171121848283124152924246817733903915913143115127261731335914
SP317Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;dispar33186111106935461537015615519228626913761683463411038746233390668127394281442171458128951211707763204594734409601452111541011652306137711228919143537154457112291872208461444105198556343710
SP323;;;;;Peptostreptococcaceae_[XI][G-1];[Eubacterium]_sulci1131601741144010274664965331504024296270114056134112150422864100112903665814116930665104674799021018642427843714621051880
SP331Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;histicola422492989010233177090263034021209564415146782401197136610981710401344224210091914455368692013242019049413200312131259102199686432218125917318922411977729186012115437881152442275
SP333Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT313627429202471992470134530168113128911621264873610167101383051371214450323001313925010829214416251773494332120651330281721910333926130466750097315148161
SP335Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;pallens17867223034975168401935189112871831231002508212610180558115713095113467856470051088541440816222721108542514911583152429625233352334204722193841040117169257185
SP341Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT526000000100000120001260530001618000017000011300000262104000011235015200004100002269
SP36Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT9128354011170678000203308011000124310021320007041701144913042831021014681600140380035031541
SP362Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;oralis946290117089749837016351418710889911073091101951163548121470012918412616155140161338263217315394012816784710771941033543322551310458618311017224557913884951551776420118
SP366Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT44301000031000207200051381330210214225030696000017190081009666911004373168235025040461320000021842291
SP37Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sputigena1435260343828403512735212510291917261211131414641006291713926546313514080106181743161910163010815561389354517102317213611172282631
SP375Bacteria;Proteobacteria;Gammaproteobacteria;Cardiobacteriales;Cardiobacteriaceae;Cardiobacterium;valvarum081601720582413336301224702605380005101016017810811160170215111301223012101015120394671354
SP376Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Rothia;mucilaginosa89860312550130752831410011655397031660139072140346599129412969187068051548551417162713105017590178428301174379536144513365671509364318022089258311115341273287841052283641343373145198180731263274477165290134215271070205039328185295612891855
SP377Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;sp. HMT248848113400080000052308053827420005251180490071630110050000027896170002260135172027073200128000258411000
SP378Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Clostridioides;difficile1600018308000165121324401640200006102401310000000201515011000580019000141000702200000071
SP38Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnospiraceae_[G-3];bacterium HMT1005354209023061230656691291106417335616152127166200127182728215541135230460342107396504604351688538709102085360
SP39Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;gracilis260942501235960359812742824764842515141527833328172252470012122992163530261612017105181111105104327143600152142471657816535928219878
SP393Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;shahii13140041650801283480121215900103500026096002124213072132101860051021198120349000124018205681
SP403Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnoanaerobaculum;orale15111025601417427401832396234913043951193414383172314412912777935916643494132162943210487384153134359940261466939241361703011028223213515483312587928174
SP407Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT225102254116031520660167102104120365201309258013739813091038120702512213593331152630142032571001242190149181014390151639875
SP408Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnospiraceae_[G-2];bacterium HMT09666443844200210180042371108015262537524958419072020284193132001329149746228112114256022114128268513514377327722124016549111693914200019092215525
SP413Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT472016004912430355359291715090021317130925001149007006312401004500520016030170100005114
SP420Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT2184074857298072597780120192200385807169033604380147054751144758717295759337103050401291715858512509215109506810609601762932825301585
SP427Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;asaccharolyticum9317700321481010048659061232552126201645348691874426205225150148613129221050258198645744410104108622025129818680399010118
SP43Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;fusca2652032100800600084021801014000000305109600117807130946001191860100400000900430
SP431Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;periodonticum398336154501553525160018152170966413813257119513823782182217003522407994625687565691080696105838705719643942035891949947240134010823486652774217155731194336925402563373625515317525845424217931027596291378429
SP432Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;denticola2501000023060340242700158115902024901460632057056012901521813192744353010701343171226028004601327620923048443153
SP44Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;lecithinolyticum0020001201900512510001820002205014400015041031100212222911133012300261160123040364006422001485
SP449Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Kingella;oralis3597504411420323472811432610917018253017023287323012262114962493413896106612383805623610264072111571701345182274010542
SP45Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Centipeda;periodontii32140011300120414839201032415110050037388330536011850214102161011130016351016518920
SP454Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT309268805350010021003384100101299010940441243036040013183390149320053900112122600306212410010
SP455Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;oulorum76169240211514301964241846597217122722724121285072491354195233862420340410100216571792361352138341483814710274489469051101
SP456Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Veillonellaceae_[G-1];bacterium HMT155502810011000111166020000060486210200001800702510021177100080270400149310002610162
SP460Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;rava3011670499821903911264124803522331920483673501054321125302516493103031311306631827532843108268161120819100013147121591213343841763069132
SP462Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;anginosus19309000226000784522790014139151126130846109713835194007213421880123190756531012313006164036322693926366662317192012108257
SP463Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT212104442097323702643718242411082423515013331311920105551611072405233949104268702114941268203157591548772122676191401014076139181117923
SP470Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;concisus70128145017413986083183138117483220453812144361471091359275160194314820765641921510652459716512193223261083401012012091411559216441017822933323183252149057
SP473Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;intermedia00280001702200269400047544312650340163010050117202028021201217066501838381517013550131526136604530730115261003020216064
SP474Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT23101540002001002504001022307025001501020072008447002100371020102200144400107006246
SP485Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;sp. HMT93013957010211292106983003136466475088210327182863369086331526541261810264168810051219217618322889104682468203645142052322430251848702794163403
SP486;;;;;Peptostreptococcaceae_[XI][G-5];[Eubacterium]_saphenum006000500005550002141210040045221144020060000404501000034199034000147823907680004718540004024828372
SP487Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT94292614370016521104810020140014446011001000601500020603010001301112110800864290240012052001922102441741
SP488Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Johnsonella;sp. HMT1660030001300008000009300000120190441010000000020087709000146020100000414600000165414
SP49Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multiformis00186000830000033000000550025901370301032005057021261700259234134429300211328034781124211310081050
SP493Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT90936000120210004210200113067020120100001307010833200300015920105081712020371620
SP494Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Pseudostreptobacillus;hongkongensis0100010025000000054012000201150400301047100001821200100040000000700001000300
SP496Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Anaeroglobus;geminatus116130000000900155212025575115110294480620320211501800012166313559151034704471213104110143711293402258319
SP497Bacteria;Firmicutes;Tissierellia;Tissierellales;Peptoniphilaceae;Peptoniphilaceae_[G-1];bacterium HMT113000000000000000400000001714210000100000003000007000000002900008330000002511810
SP498Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT219000011822030214318011203001297060014480040751222362017800070202903540059495178117127632
SP499Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Filifactor;alocis3023001180170137900003706034401301200222002010104043019571619255718562241770049015175476473704232127232500030513397700
SP500Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;nigrescens2337548807123574011111124252719203132011219115463421040321611696063482590173123540841012586125151221610410863191098349141938248
SP503Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnoanaerobaculum;saburreum163505014663001540191062086140112345490171511230045755621253931058093808981108201315780164131558232602911
SP504Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;micans047130004100104718600015010312010124510011210202025450005002614024619012153720281860180
SP510Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;flueggei9220001130050011600050001741234000006111201113329000113547000000700770002100043
SP512Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT304000000106000005500000500160204010500008405400000010066980240000660200025101101817730000190316366
SP516Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Dialister;pneumosintes17025000760240316351790419382053270695718701451260301776402303004472331207401104425014503878382005361231163140
SP522Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptostreptococcus;stomatis3035305084912201908226683438571375618310829223182533320752267619962390201064178321291537031271326101862592211281914149916872332474697164701510105619407149
SP525Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Stomatobaculum;longum45560410052201340742148330485155624814117314523817165134831286827914805571159211763948751850167464810367132371604216461903345505115128862121037
SP529Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT256001000003000000000300001000000000020002000000784130000064117000000240010042320
SP53Bacteria;Firmicutes;Bacilli;Lactobacillales;Carnobacteriaceae;Granulicatella;elegans356314952605681475905216081721611350323382106904211794130489715924302680663147834290100410330215332297344730272813721546393318218622781906272210374355227910230441310169421136014521446819128209201017995229229917769042301961
SP535Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;mutans15930421130011324307841511220915054519163918100215889554165976527200121060050001810872231906001028003292
SP54Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Bergeyella;sp. HMT907064000001011008210020210171280082171101507230001331510003439008809010848004912154
SP541Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Romboutsia;timonensis11010212730002161117233022901000051024240000200070241002400069101301016901402501000063
SP542Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnospiraceae_[G-2];bacterium HMT0880000000011000010400140230000026000500000013000010000000009142001070018010048600
SP55Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;baroniae0027009800019232000019481127058418502108023034041600853006923180060400242207187039891111400007928
SP550Bacteria;Firmicutes;Bacilli;Lactobacillales;Carnobacteriaceae;Granulicatella;adiacens290352106509405467570221163408434689389113114469728181751192645688854241211711513144153689412822555219779936727161190145458285116913289214712831267911395164273849068420261710762115428361252176836479341282483801168629
SP552Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT4734148349340490295611202910235698327146419001281183462851022133117165012628503398977476078273100164520756224791214480131729611512104497355172922546086592824911310888223134331
SP558Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;nanceiensis3517139014112172370272617733267127136223149524136017358363937431566176456003915364831510279771519725477742420648533711949524133596301736259142
SP56Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;socranskii312711004280109411814271011062714022918428018483152156500241031551813295462140122231731120733314630022524345751
SP574Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Streptobacillus;felis00000017204000050000037000040463000000000000000002571310000017033010750000176000005232580
SP576Bacteria;Bacteroidetes;Bacteroidetes_[C-1];Bacteroidetes_[O-1];Bacteroidetes_[F-1];Bacteroidetes_[G-3];bacterium HMT2800030001070005000010490001101900003000035001050271901160002431260650046180000013520
SP578Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;sp. HMT044205201204001801023224401096691300041450000141057004127115100250455951100341613591440035142460100222336
SP579Bacteria;Proteobacteria;Deltaproteobacteria;Desulfobacterales;Desulfobulbaceae;Desulfobulbus;sp. HMT041000000100001611000017000001011000000030000081004824090158190850122209300004371
SP584Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;koreensis0040001000000100000000004060000220000200000001005000002291910000000000001642
SP585Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidales_[F-2];Bacteroidales_[G-2];bacterium HMT2745832925200369064951984325916631390020012418833322681970471443321032934610912175028278450139622707802431628000187232922
SP590Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;graevenitzii691420561715805091547949696086319911944364672171310224149812921400005542641728235035113373226861378024359816171572238853198116170214314657
SP594Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;aurantiaca039100430000135161115002302441743757002101120028117303061611302474161160130551731631162602300104114250041220
SP597Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT317428860152610145511538132828105012935131673141962163515391043521271311422141515104816017342045105696102112153214131
SP60Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multisaccharivorax2040001000008021151490001440000690202603025101800011805062000300008000034031027420000385
SP603Bacteria;Tenericutes;Mollicutes;Mycoplasmatales;Mycoplasmataceae;Mycoplasma;faucium0044003507006721001010278000065200008201223414001859011368200000112700104020637000606172200
SP61Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT2154281732080505113805139796240371119770942364165186858458259120299530226537601576142317443461185186203671974722291513051442716481171061990272277741147964168401012740145180
SP62Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;gingivalis00000064000003124006841516702590256039703005530002800047500118384104764053000016194099200112461150400301991568705
SP63Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT8475992802803204811411250000370232510553653031871164495905013017627221203810329171410551594601140
SP636Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;saccharolytica44710006001404120251320111015160002703214345001016032131624701057010206210035113113
SP642Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT3001020150091000128120000405922724026140116187020001143201211192200121102901000094005050
SP65Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;melaninogenica2241712047010368764136108134165095223921344184529613305823982254169517120239668153620725556959943153358314059314689155647987021348123045352319815329817716886467936059038261109114862094435812125619616564
SP652Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;denticola49840303366640011212120361063901310358102275076536446191250391444109053722918594345333492052900440432018662391012554432111524185
SP654Bacteria;Actinobacteria;Actinomycetia;Corynebacteriales;Corynebacteriaceae;Corynebacterium;matruchotii148119861033947703911539348350199343134141391284126431315815155504128132952433452172421882358733838434110242872298199241352471528412921244298
SP655;;;;;Peptostreptococcaceae_[XI][G-9];[Eubacterium]_brachy4210250091030562861502164201221241021018214020207110210636813416211401252202879020125120307351682615100054052154147
SP663Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT4171072412418503677127201024625918901517433335585820721167709703758749351404129212552922536258643601214731271910910225287056978121149371401365280665122282254941854837164620714417516
SP695Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;gingivalis2171706129530332947428855111782671413391341711055201783420416191331342424105713498098010313813181391803823167
SP700Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT95100700070300016000222001901120000000030000001216803200000003712047000300004038130
SP705Bacteria;Firmicutes;Clostridia;Eubacteriales;Eubacteriaceae;Pseudoramibacter;alactolyticus660000100000003200100600114900000050000000104320221310390100410424100002400677
SP709Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Megasphaera;micronuciformis1786601037815147604621216925219722543141084996113622326526934767111731378121882485082271381023222081721217231109215867668292406913911444806250316170
SP712Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Atopobium;sp. HMT199210004043050149527801804105343230110128914002010162301007113192924748030616118325103148950007117016275
SP713Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;parainfluenzae525123842450727336442456083914037128199425252542436187026827814414189161221842769849938203671644161470219334871151051329259126911106201814412041177535555852428225791101225064033695903311915939529044761304120812291933152718895409809226451233441658523785
SP72Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;intermedius1411218000650112139281692672961115151918240350545110152133505945240164220861190204011731648230110851980124
SP723Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sinus10850400041859803820701947535435121144542115123183331496511349161038945452679012132352313648287306928351171542110471173887971431426332311335801
SP729Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Bergeyella;sp. HMT3221311110584679010123337469751910934651868134142104994615393129831246242111985374321019179941451915617677651816264596091033021514
SP731Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT24635039000000000000010000108321500000200000275701330010111600000001800000055000111135470
SP740Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Butyrivibrio;sp. HMT4551401401513170003301653224753160410014165112478505001160043804020200104202111045150800163010130
SP743;;;;;Peptostreptococcaceae_[XI][G-4];bacterium HMT36900000010000010180001632000042060000100001311007001533700010111143097015722500000104635
SP75Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT217000000000033013200007000011720007534320000000000640030012908852100000001500004410
SP76Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Catonella;morbi1717320231516040285032802465616762449156271617008033171111154451834796627181715451170308332600670063011139
SP760;;;;;Peptostreptococcaceae_[XI][G-6];[Eubacterium]_nodatum00100010000030000001480000300000001000011001200103511141002001019031000168510000090167108
SP763Bacteria;Proteobacteria;Gammaproteobacteria;Cardiobacteriales;Cardiobacteriaceae;Cardiobacterium;hominis078011255802359123233917702408155029011511211012303931760166391744461652233161700613245615411111581102
SP765Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Peptidiphaga;sp. HMT183182700118402130118225951510100045502112012015152348100000560416001000173119036700107001522
SP766Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;maculosa34178160485406271197641911615011131110308521540385161315101441912234727282091529016115941041627642621
SP775Bacteria;Actinobacteria;Actinomycetia;Corynebacteriales;Corynebacteriaceae;Corynebacterium;durum144174307333112071272251381428332749202833618619601284616713258281892346345369812917271994212291248022101212224153419147110333040
SP78Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;lingnae_[Not_Validly_Published]52390143448000298619125173514771517452806115116162342398110090124511501329014183710539501045660647012170629311440027198
SP781Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Rothia;aeria14718270893051004252571328665319102711198082821735561592409106108160320566050941138331841329131101950263612101217482414829521584827432783930125
SP792Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;sp. HMT17212853400022555270015088524650137423493031841834436460756495944125531170140054947301744071541056210620176166271113212141312250626302718751831816633014171550
SP793Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Bergeyella;sp. HMT20653903110100011680181100342009336900006000003553050263001010000015459042050136000000000372269144406760200
SP807Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT2380057000000000000002400000000011900000016230003700491500000016505900000076000140822050
SP822Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Limosilactobacillus;mucosae1860000200072007005000110010035051100001006010130006000900004370007900005530010318
SP828Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Atopobium;sp. HMT41600000000000001000000000002000200000020000010001000000000000000000059000108
SP836Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;hongkongensis15451106023672320103039310131181643028844156381415423128538127103109114444469558018643030231113825910613820463186894631453171212523537627753216170
SP839Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT308714177011984548081314775787116554078832655423432618139157626939051234013572222024205123218821306013141241962303421713
SP845Bacteria;Actinobacteria;Coriobacteriia;Eggerthellales;Eggerthellaceae;Cryptobacterium;curtum593000010201070050005514081114341000600100201500010142600000001300190000211502013
SP852Bacteria;Synergistetes;Synergistia;Synergistales;Synergistaceae;Fretibacterium;sp. HMT36000200410501540000204000010114312013210000090181301510902006103047002048000306465
SP853Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;timidum00100020700176017031400001713233000030200900022013213010102201001170321434000020388956
SP860Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;sanguinis1808160692054369575031561028342587196677352812814223815115929546144473209717512444718058343413947094522652465203913579373569802662861522468021673081841018649148188109155780108747115214
SP864Bacteria;Synergistetes;Synergistia;Synergistales;Synergistaceae;Fretibacterium;fastidiosum0020011209006134300061330102207100120300001300314132513840067217014403244550055038444
SP866Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT221262473488029853810902865967829074411121992471377319382264062959817356513646783219023252871921331114413819262344717112810519174193110585149107816065342813434099115
SP869Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Stomatobaculum;sp. HMT097821400178416002129220698481523561672210770164490257104761851105002043305017319498969230411356333004724241135755531330313610926680639321330
SP870Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;parvum11408114704023503812104791641139082844171425921230083603913803703251562011221235101533700224119059006930
SP876;;;;;Peptostreptococcaceae_[XI][G-1];[Eubacterium]_infirmum1025201029023024271453613093213100254514000010202121040101521642439900532390751410502501311022794
SP98Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT2570014600400025107400305931401013970001390001210266003794449474033000015105485011900027534200063132110
SP99Bacteria;Actinobacteria;Actinomycetia;Bifidobacteriales;Bifidobacteriaceae;Scardovia;wiggsiae11262024441701141321400603003419413020540219000623300010010920890617400205540500009172208
SPP1multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp1_31330012402510131116001158001502209200481000608190171100700183965326366084952158002631682102
SPP101Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp101_24233002113004185146283021162309209901341280915132813876910221014067551013122156013218189318037262796412143651444487
SPP102Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;multispecies_spp102_30000001000912750000247100000000612000000006100584700012091527006000012500000000
SPP106Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp106_31361391960682139695064137256529252333777112932360395518863144325571820111137051041120792132916993492064305570937162229778118355731931771161617387636516654961580112660114105914287250925640562701324981103338
SPP107Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Atopobium;rimae2011250210180224946000200031791180164696516101401013171751509261116112117600162011226112543310424562335
SPP110Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp110_3824020000001380150020002061007000000163160304700102120192600010200730402004530110018
SPP116Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp116_3444925082819010943983202282868611570623174397840143245522882001611218818135453874682615014413023242913149
SPP117Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp117_2135585220441934202332112828160023114091714620725119186158994734417145152520791652531296616420243113187712173471413103905655010626233201211361758241738380315212
SPP118Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnoanaerobaculum;multispecies_spp118_238952001743784018841925122236545727332674538691022551310761144102182703241638188542286423846189286928338464178692546415310152602563721045316871301415
SPP127Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;multigenus;multispecies_spp127_2134331020381254030274114328139917048230263476171745354409916982591912881175063041384181918899633305444311512127147464671831215055696290281363591888336413169270
SPP129Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;multispecies_spp129_2450000100050612000023530302120631900001060411161012100303039020057200004810
SPP13Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp13_2222108403237079241420490111111221651214140805146752011461921119313332542421328104951221113904244160148161533139334582814204735
SPP131Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp131_415185802360725637960992623281364321433896229321625116923570314381826158319911514741041279173552121339109185818362005581857814386502164196622694572979217749069210134111765141814183518049977214292491435917
SPP132Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp132_31601507692031901217719054066271106500114014280000011930039336500123607402104302271811119
SPP133Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp133_27101420201788380611023525424562041364839125096326354171117850730171108118722822887202032841906209341001938612348017761011786914790572471057230245828157248812516
SPP135Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Pseudoleptotrichia;goodfellowii309430143170234653001751101412111164000386352106122070151503846134192992221502211016391416196202221
SPP136Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Moraxella;multispecies_spp136_40000160000000000000100000000020249518000000020870300000410082533000014000000005100
SPP142Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;multispecies_spp142_228284309274024228217476631349196060581751017512956615132673012192520112011071784481935018528234114228193872501810151381486439237168253109
SPP148Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp148_244199135016362771346033586091535752452314743947985803414202387928018070911035579962751184168151581351945628715086261841195978131109274174244126154818164036418351649204912661858131743957822841
SPP149Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp149_3521612805371425026001107274147137015310039915113305570004396153421860021841690740224009699147103664046339840491131723253946500
SPP15Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;multigenus;multispecies_spp15_386165300100012861007010040882915801900001231481030700030520818503101200006100243
SPP150Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;multispecies_spp150_5803332740313203556101106509194927931391208327218111809413168970556362934831827836171596157813310772253105309059838087103884061029621117755172146812615143133743942269572458360848758048444365488783143280146422599
SPP151Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp151_411310210278014811825322651203298580002011003301143101471620490300116806340082592014008132920
SPP152Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp152_27421620017191610425637298334916279714348480216123385717273219178153999633319382829347893114820557013581550946460037110252377152
SPP153Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;multigenus;multispecies_spp153_6612307420021801084900015000452041101001140188551302316752160036000700139005126577600
SPP154Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Catonella;multispecies_spp154_24710001801112185010040420227646001603160126360033180161541016975321537101113108105014721081521
SPP156Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;multigenus;multispecies_spp156_20010001200010936001210702052431001094022015008426261813013000223700720013060001033124116
SPP157Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp157_528693056252021271011228101406588315818121311926739952734587912757493328649685512192912105621300484490266514456246281681342536551521686007507322493887711751504083348734313630212281129038431278495361895267080732196673131323230059314091223
SPP16Bacteria;Firmicutes;Bacilli;Bacillales;Gemellaceae;Gemella;multispecies_spp16_443056105370236839645960354392916758152074743968781589204653863901176659688117721458385829241350230925621540202136926754555754130506887619326093702196490049941787259238122134658219493810113293952485638575101280612922385378445176313027184937334615816225
SPP160Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp160_31916518040911556149006704162028175956168144406402458913445123442910512091365796470672819482327901031179267334272619951257465657743280432813528313713525130108263917972780120108466114363756773153205841042
SPP166Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Kingella;multispecies_spp166_206101503028032280060702003011223012100030040224041202023540231823314021073174196010
SPP171Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp171_7346282103007985511070332309177912999725527672735262954364833532810477305110345656651064607326435870310461362534455912431136124187615153932779378615322208693032226102161432278989656731196918150537914431
SPP18Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;multispecies_spp18_3149000000000000000000011000000317000000000001200000005000000000207000057800000
SPP2Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp2_613412471025220013432027823461492916024162118140158861912396107701271526364427161724021913554910471118102755102346471286651
SPP20Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;multispecies_spp20_43521000000510130057400001590000130100004700255902052010510121270350014212000311005
SPP27Bacteria;Firmicutes;multiclass;multiorder;multifamily;multigenus;multispecies_spp27_43320000450120187254323100880872785508136795502233892214033244180111331016323504916261214662011655902690227233320302139323320
SPP28Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp28_393210102220165112414010139144499366441006938108236563461431019165341204104133217331035301740578915637521013581267109793190255217972996735522132443414924485241938143216621645570136922223384271388453410591866711142
SPP3Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp3_3359016316300381150197228226587287309130221318180173037948114960343232642204110019086125172983148116212132719340919854297121364302464533316510015295116322112254
SPP30Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp30_118310001200001531074404001213519000010000040251110110000073337041003530011000020
SPP31Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp31_95731906865018221518064732204531981864766322586101125457323103415016243181081731262642126915182183941214713273050
SPP32Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;multispecies_spp32_3292778013409600265104001271502533358722015024516423146918922031470152141600101110134521501129007813928400037693541949911359114110653116019
SPP34Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp34_2010010000540507000250050104160009104002161016030153004658820017822246180634650204513305231410
SPP36Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp36_23725062103120929903842640615261654841050226498172323026311147782711011638186291552102254705712688741011239823268921497154114745111127940319583336152068
SPP37Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp37_71728560102728000290445625208904505053384523452210102405831281421284121115610005201442010010612160500
SPP4Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp4_430356676026479286047156691481111209496212334124541578316560397135244489467755210112197491371174282561127131521051264829328946810030136552128959799143127401643074374378665564
SPP41Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;multispecies_spp41_423716190152481011350975512224213067163173819082192607141581012162310317112118781214072202041013150905161101251571311163761532421815557445527
SPP43Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;multispecies_spp43_444961810580389557431735047024658852623965495244749381403355167853123875263764102834041795751323582175860322956514221178252315942034280668464337173765223844802388314824432005491517660431270225464219206563741326015264236112219841163120425967960267
SPP44Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;multispecies_spp44_20010001000011000016100001401101005042003003422042409010010434603800283470000026288
SPP47Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp47_27127036010023207285121220150321001120586010268132363510272104010029040160033023140220051206
SPP49Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp49_212130550174376005440749731870157137403014665270526935041151309138295085127291934280019342813615610107133210106
SPP5Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp5_31169120463000194232811634212582938193279819410917101852261023490134615383651324172220105121414136025913104821226891874136651230
SPP50Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;multigenus;multispecies_spp50_70513000001330000005001307141200011011161451051611090090111261020708990041309031159672100
SPP53Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;multispecies_spp53_329825284025449038460587712053825431175620111236594422804105411206359869601270143243322140185737182436826644108108126650910677914039711252721312309855729819385321874333256131646785713852332349854156
SPP54Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Burkholderiaceae;Lautropia;multispecies_spp54_299708013360139110176710012029391182740083181601454153720086370121218834641507723760223032199575128517967213378
SPP55Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp55_6321221140154262380101719228230376139100015802522771616186919130451094251318394281963602098239351315061231502527443551878117377111515832052432133137
SPP57Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Tannerellaceae;Tannerella;multispecies_spp57_220415214036354012461315158212243728432132154637801561645112705310219115114841172412634641281117014011431210191612152746192406
SPP58Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;multispecies_spp58_21035937458805512274216860213222229922233619581173230146047916911210762439729610614519694259130619925791107497210669217021265832792163606174389313328778381511153861709152177245423731326581205866911203403413806967668801031
SPP59Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;multispecies_spp59_21412160005004400630110103111021712212094100104311004228248065110180018279060560271100170427133
SPP6Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp6_26803861210222823301851055287132325126432942481217129241198299192106211404067751623694981112360911409826354749894481608410136342432208743712236228852779124463012961
SPP60Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;multispecies_spp60_2352100021030101900011061106013014000030420163009525343481077362022063990233000983001601555912
SPP61Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp61_2145978063436202015447293182031012421195025675109238180832143326314013399398102888136906451246077035210591114110390180653822
SPP62Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp62_70000219016000723402660000038600007017450000000000092640261100060201000102900000110
SPP63Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp63_635114410316280131006962433429281913261239117302744203193792160152217341411594221219819502459101508424127141493534042245723
SPP64Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp64_436245470881502506111027924718698938581427020509260185670831041261428803264077349111616138091112322371765212238333
SPP65Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp65_31331264123055414617400636488342891159390458753351782657051831278872142414915177757464254267174239303010431886281395628193292893894967259823355954899121610153214015817818829157727041112
SPP66Bacteria;Synergistetes;Synergistia;Synergistales;Synergistaceae;Fretibacterium;multispecies_spp66_4006001200000100002110000330170000100002400360155270102001300330238001622910000127852
SPP67Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp67_294026402957210000091016011223737042335770150862561341164748050911014243595540376249118514802825292582324058752569589
SPP68Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp68_262369695021813405071250382874958927061195285214133141902877608122254047046876341302081893392391419668299691037448290111825217331702657221006752241014793847
SPP69Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;multispecies_spp69_47520733029280171282222174321313611113009091079530310628154385410054141124290316100226226822634111136413731
SPP7Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp7_1815290342879560993321809150400100467636848762078256136662311610105751023844700533631968379481224580449479110211321176355596562740531317123468102619476771005962203147767669751331019864116732462662863216890660444136841745515298565156051318383015163693390541230787009392239313160111286808767720410
SPP70Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;multispecies_spp70_672629029005482800862802257843395854851035627479168601485112713511613032687910921165625536736980472344895875112243171827789185711754117196174861654337150
SPP72multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp72_200100008000400000007100060910040303001200013000125091910116394001912018200071050100
SPP76Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptococcaceae;Peptococcus;multispecies_spp76_252431501150110515242521213122030196101110022134002435932171120472128230014049003102182414
SPP78multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp78_2140000000000009030412020530190031000000000005014221033007873502700311181000010322
SPP8Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp8_20270092210001203287000413604530061310010191409301168111120501030063411410122731920081430
SPP80Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp80_313121002048210253716940016111960607512397643603110038234551010529239413211902101382118436267714100528171110
SPP81Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp81_1035320415802444531707282113161345257241201821262216258725111814758741075261211968624748344664931775454475134111612161804002191218369365280292158210589185170551191266108171268
SPP83Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp83_2018880025110119509833020000619016008042014017003270141113022069310026150193301370272174
SPP85Bacteria;Actinobacteria;Actinomycetia;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;multispecies_spp85_291000000001800271000010005950460157300000100950400073090020000011000019231600288637000715
SPP86Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;multispecies_spp86_213720011062017219072356006911020793119051516714241621921931263561951614140432482102853826016812784732311
SPP90Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp90_2001201226701000763174200114846375060031293892453412580212073419332823439309347533142396864831321815841521052201101664743006718641110728
SPP91Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp91_2546810001511101900021622410241211120000420132123112300931312521120328793220180554025018115522608
SPP93Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp93_22910020222390212108141333235336464327195202301216138530378526101633666231093437371116310371320239103838341321900168125053363132098
SPP94Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;multispecies_spp94_3164418990107731700814104256308772728215944223334913171635119351216123458911214664010216531312411141804232326571941436721274100932218196854197104013023227464
SPP95Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Kocuria;multispecies_spp95_20210000072000000100000000000000012660001000000000000000000029000000012000
 
 
Download OTU Tables at Different Taxonomy Levels
Phylum: Count*, Relative**, CLR***
Class: Count*, Relative**, CLR***
Order: Count*, Relative**, CLR***
Family: Count*, Relative**, CLR***
Genus: Count*, Relative**, CLR***
Species: Count*, Relative**, CLR***
* Read count
** Relative abundance (count/total sample count)
*** Centered log ratio transformed abundance
 
The species listed in the table have full taxonomy and a dynamically assigned species ID specific to this report. When some reads match the reference sequences of more than one species equally well (i.e., same percent identity and alignment coverage), they cannot be assigned to a particular species. Instead, they are assigned to multiple species with the species notation "s__multispecies_spp2_2". In this notation, spp2 is the dynamic ID assigned to these reads that hit multiple sequences, and the "_2" at the end of the notation means there are two species in spp2.

You can look up which species are included in the multi-species assignment, in this table below:
 
 
 
 
Another type of notation is "s__multispecies_sppn2_2", in which the "n" in sppn2 means it is a potential novel species, because all the reads in this species have < 98% identity to any of the reference sequences. These reads were grouped together by de novo OTU clustering at a 98% identity cutoff. A representative sequence was then chosen and searched with BLASTN against the reference database to find the closest match (which will still be < 98%). This representative sequence also matched equally well to more than one species, hence the "spp" in the label.
 
 

Taxonomy Bar Plots for All Samples

 
 

Taxonomy Bar Plots for Individual Comparison Groups

 
 
Comparison No. | Comparison Name | Families | Genera | Species
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 

VIII. Analysis - Alpha Diversity

 

In ecology, alpha diversity (α-diversity) is the mean species diversity in sites or habitats at a local scale. The term was introduced by R. H. Whittaker[5][6] together with the terms beta diversity (β-diversity) and gamma diversity (γ-diversity). Whittaker's idea was that the total species diversity in a landscape (gamma diversity) is determined by two different things, the mean species diversity in sites or habitats at a more local scale (alpha diversity) and the differentiation among those habitats (beta diversity).

 

References:

  5. Whittaker, R. H. (1960). Vegetation of the Siskiyou Mountains, Oregon and California. Ecological Monographs, 30, 279–338. doi:10.2307/1943563
  6. Whittaker, R. H. (1972). Evolution and Measurement of Species Diversity. Taxon, 21, 213–251. doi:10.2307/1218190

 

Alpha Diversity Analysis by Rarefaction

Diversity measures are affected by the sampling depth. Rarefaction is a technique to assess species richness from the results of sampling. Rarefaction allows the calculation of species richness for a given number of individual samples, based on the construction of so-called rarefaction curves. This curve is a plot of the number of species as a function of the number of samples. Rarefaction curves generally grow rapidly at first, as the most common species are found, but the curves plateau as only the rarest species remain to be sampled [7].


References:

  7. Willis AD. Rarefaction, Alpha Diversity, and Statistics. Front Microbiol. 2019 Oct 23;10:2407. doi: 10.3389/fmicb.2019.02407. PMID: 31708888; PMCID: PMC6819366.
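
As a rough illustration of how a single point on a rarefaction curve can be obtained, the sketch below draws reads without replacement from one sample's count vector and counts the species seen at each depth. The function name `rarefy_richness`, the toy counts, and the fixed seed are illustrative assumptions and are not taken from the pipeline used for this report.

```python
import random

def rarefy_richness(counts, depth, seed=0):
    """Number of species observed after drawing `depth` reads without replacement."""
    # Expand the count vector into one entry per read, labelled by species index.
    reads = [species for species, n in enumerate(counts) for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    subsample = rng.sample(reads, depth)
    return len(set(subsample))

# Hypothetical sample: three common species and three rare ones.
counts = [500, 300, 150, 3, 2, 1]
total = sum(counts)
# (depth, observed species) pairs; plotting them gives one rarefaction curve.
curve = [(depth, rarefy_richness(counts, depth)) for depth in (10, 100, 500, total)]
```

At full depth every species with a non-zero count is recovered; at shallow depths the rare species are usually missed, which is why such curves rise quickly and then flatten.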

 
 
 

Boxplot of Alpha-diversity Indices

The two main factors taken into account when measuring diversity are richness and evenness. Richness is a measure of the number of different kinds of organisms present in a particular area. Evenness compares the similarity of the population size of each of the species present. There are many different ways to measure richness and evenness. These measurements are called "estimators" or "indices". Below are box plots of three commonly used indices, showing the values for all the samples (dots) and for the groups (boxes).
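
To make richness and evenness concrete, the following minimal sketch computes three indices from one sample's read counts. It assumes the natural logarithm for the Shannon index and the "1 - sum of squared proportions" form of the Simpson index; the exact conventions used by the report's software may differ, and the function name `alpha_indices` is hypothetical.

```python
import math

def alpha_indices(counts):
    """Return (observed features, Shannon index, Simpson index) for one sample."""
    present = [c for c in counts if c > 0]
    total = sum(present)
    props = [c / total for c in present]
    observed = len(present)                          # richness only
    shannon = -sum(p * math.log(p) for p in props)   # assumes natural log
    simpson = 1.0 - sum(p * p for p in props)        # assumes 1 - sum(p^2)
    return observed, shannon, simpson

even = alpha_indices([25, 25, 25, 25])   # four equally abundant species
uneven = alpha_indices([97, 1, 1, 1])    # same richness, one dominant species
```

Both toy samples have the same richness, but the uneven sample scores lower on the Shannon and Simpson indices because these also reflect evenness.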

 
Alpha Diversity Box Plots for All Groups
 
 
 
 
 
 
 
Alpha Diversity Box Plots for Individual Comparisons at Species level
 
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | View in PDF | View in SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | View in PDF | View in SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 9 | ADULT negative vs ADULT positive | View in PDF | View in SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | View in PDF | View in SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | View in PDF | View in SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | View in PDF | View in SVG
 
 
 
 

Group Significance of Alpha-diversity Indices

To test whether alpha diversity differs statistically among the comparison groups, we use the Kruskal-Wallis H test provided by the "alpha-group-significance" function in the QIIME 2 "diversity" package. The Kruskal-Wallis H test is the non-parametric alternative to the one-way ANOVA. Non-parametric means that the test does not assume the data come from a particular distribution. The H test is used when the assumptions for ANOVA are not met (such as the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. The H test determines whether the medians of two or more groups are different.
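
The "ANOVA on ranks" idea can be sketched in a few lines. The code below computes the H statistic from pooled ranks, using average ranks for ties and the standard tie correction. It is a self-contained illustration rather than the QIIME 2 implementation, and the function name and example values are hypothetical. A p-value would normally be read from a chi-square distribution with (number of groups - 1) degrees of freedom, which is omitted here.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with average ranks for ties and tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    rank_of = {}      # value -> average 1-based rank in the pooled data
    tie_term = 0      # sum of (t^3 - t) over runs of tied values
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction

# Hypothetical Shannon index values for two groups of three samples each.
group_a = [2.1, 2.4, 2.6]
group_b = [3.0, 3.2, 3.5]
h_stat = kruskal_wallis_h(group_a, group_b)
```

Because only the ranks enter the calculation, any pair of groups with the same rank ordering gives the same H, regardless of how far apart the actual values are.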

Below are the Kruskal Wallis H test results for each comparison based on three different alpha diversity measures: 1) Observed species (features), 2) Shannon index, and 3) Simpson index.

 
 
Comparison 1. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 2. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 3. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 4. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 5. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 6. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 7. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 8. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 9. | ADULT negative vs ADULT positive | Observed Features | Shannon Index | Simpson Index
Comparison 10. | ADULT NON-SHEDDER vs ADULT SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 11. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 12. | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 13. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 14. | CHILD NON-SHEDDER vs CHILD SHEDDER | Observed Features | Shannon Index | Simpson Index
 
 

IX. Analysis - Beta Diversity

 

NMDS and PCoA Plots

Beta diversity compares the similarity (or dissimilarity) of microbial profiles between different groups of samples. There are many different similarity/dissimilarity metrics [8]. In general, they can be quantitative (using sequence abundance, e.g., Bray-Curtis or weighted UniFrac) or binary (considering only presence-absence of sequences, e.g., binary Jaccard or unweighted UniFrac). They can also be based on phylogeny (e.g., UniFrac metrics) or not (non-UniFrac metrics, such as Bray-Curtis).

For microbiome studies, species profiles of samples can be compared with the Bray-Curtis dissimilarity, which is based on the count data type. The pair-wise Bray-Curtis dissimilarity matrix of all samples can then be subject to either multi-dimensional scaling (MDS, also known as PCoA) or non-metric MDS (NMDS).
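
For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute count differences divided by the sum of all counts in both samples. The sketch below computes it for a pair of samples and for a small pairwise matrix; the sample names and counts are made up for illustration.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("count vectors must cover the same species")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator

# Hypothetical species counts for three samples (same species order in each).
samples = {
    "S1": [10, 0, 5],
    "S2": [5, 5, 5],
    "S3": [0, 20, 0],
}
names = list(samples)
# Pairwise dissimilarity matrix, the usual input to PCoA or NMDS.
matrix = [[bray_curtis(samples[p], samples[q]) for q in names] for p in names]
```

The value is 0 for identical profiles and 1 for samples that share no species, which is what the ordination methods described next take as input.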

MDS/PCoA is a scaling or ordination method that starts with a matrix of similarities or dissimilarities between a set of samples and aims to produce a low-dimensional graphical plot of the data in such a way that distances between points in the plot are close to original dissimilarities.

NMDS is similar to MDS; however, it does not use the dissimilarity values directly. Instead, it converts them into ranks and uses these ranks in the calculation.

In our beta diversity analysis, the Bray-Curtis dissimilarity matrix was first calculated and then plotted by PCoA and NMDS separately. Below are the beta diversity results for all groups together:

References:

  8. Plantinga, AM, Wu, MC (2021). Beta Diversity and Distance-Based Analysis of Microbiome Data. In: Datta, S., Guha, S. (eds) Statistical Analysis of Microbiome Data. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-73351-3_5

 
 
NMDS and PCoA Plots for All Groups
 
 
 
 
 

The above PCoA and NMDS plots are based on count data. The count data can also be transformed into centered log ratio (CLR) for each species. The CLR data is no longer count data and cannot be used in Bray-Curtis dissimilarity calculation. Instead CLR can be compared with Euclidean distances. When CLR data are compared by Euclidean distance, the distance is also called Aitchison distance.
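
A minimal sketch of the CLR transformation and the resulting Aitchison distance is given below. Because the logarithm of zero is undefined, the sketch adds a pseudocount of 0.5 to every count before transforming. This pseudocount, like the function names, is an assumption made for illustration; the report does not state how zeros are handled.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(a, b, pseudocount=0.5):
    """Euclidean distance between the CLR-transformed count vectors."""
    clr_a = clr(a, pseudocount)
    clr_b = clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr_a, clr_b)))

# Hypothetical samples; sample_2 is sample_1 sequenced ten times deeper.
sample_1 = [1, 2, 3]
sample_2 = [10, 20, 30]
same_composition = aitchison_distance(sample_1, sample_2, pseudocount=0.0)
```

With no pseudocount and no zeros, two samples with identical proportions have an Aitchison distance of zero regardless of sequencing depth, which is the property that makes this distance suitable for compositional data.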

Below are the NMDS and PCoA plots of the Aitchison distances of the samples:

 
 
 
 
 
 
 
NMDS and PCoA Plots for Individual Comparisons at Species level
 
 
Comparison No. | Comparison Name | NMDS Bray-Curtis | NMDS CLR Euclidean | PCoA Bray-Curtis | PCoA CLR Euclidean
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
 
 
 
 
 
 

Interactive 3D PCoA Plots - Bray-Curtis Dissimilarity

 
 
 

Interactive 3D PCoA Plots - Euclidean Distance

 
 
 

Interactive 3D PCoA Plots - Correlation Coefficients

 
 
 

Group Significance of Beta-diversity Indices

To test whether the between-group dissimilarities are significantly greater than the within-group dissimilarities, the "beta-group-significance" function provided in the QIIME 2 "diversity" package was used, with PERMANOVA (permutational multivariate analysis of variance) as the group significance testing method.

Three beta diversity metrics were used: 1) Bray–Curtis dissimilarity, 2) correlation coefficient matrix, and 3) Aitchison distance (Euclidean distance between clr-transformed compositions).
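
The logic of PERMANOVA can be sketched directly from a distance matrix: a pseudo-F statistic compares between-group to within-group sums of squared distances, and its significance is judged by recomputing it after shuffling the group labels. The code below is a simplified, self-contained illustration and not the QIIME 2 implementation. The function names, the default of 999 permutations, the fixed seed, and the toy data are all assumptions.

```python
import random
from itertools import combinations

def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(dist, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # reassign samples to groups at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: six samples on a line, two well-separated groups of three.
values = [0, 1, 2, 10, 11, 12]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(a - b) for b in values] for a in values]
f_obs, p_value = permanova(dist, labels)
```

With only six samples there are few distinct label arrangements, so the smallest achievable p-value is limited by the sample size rather than by the number of permutations.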

 
 
Comparison 1. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 2. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 3. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 4. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 5. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 6. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 7. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 8. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 9. | ADULT negative vs ADULT positive | Bray–Curtis | Correlation | Aitchison
Comparison 10. | ADULT NON-SHEDDER vs ADULT SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 11. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 12. | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 13. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 14. | CHILD NON-SHEDDER vs CHILD SHEDDER | Bray–Curtis | Correlation | Aitchison
 
 
 

X. Analysis - Differential Abundance

16S rRNA next generation sequencing (NGS) generates a fixed number of reads that reflect the proportion of different species in a sample, i.e., the relative abundance of species, instead of the absolute abundance. In mathematics, measurements involving probabilities, proportions, percentages, and ppm can all be thought of as compositional data. This makes the microbiome read count data “compositional” (Gloor et al., 2017). In general, compositional data represent parts of a whole which only carry relative information [9].

The compositional nature of microbiome data becomes a problem when comparing two groups of samples to identify “differentially abundant” species. Even if a species has the same absolute abundance in two conditions, its relative abundance (e.g., percent abundance) can differ between the conditions if the abundances of other species change greatly. This can lead to incorrect conclusions about differential abundance for microbial species in the samples.
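
A two-taxon toy example makes this effect visible. In the sketch below, taxon A has the same absolute abundance in both conditions, yet its relative abundance drops sharply because taxon B grows tenfold. The taxon names and numbers are invented for illustration.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into proportions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical absolute abundances: taxon A is unchanged, taxon B grows tenfold.
condition_1 = {"A": 100, "B": 100}
condition_2 = {"A": 100, "B": 1000}

rel_1 = relative_abundance(condition_1)   # A makes up 1/2 of the sample
rel_2 = relative_abundance(condition_2)   # A makes up 1/11 of the sample
```

A naive comparison of the proportions would report taxon A as decreased, although its absolute abundance did not change.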

When studying differential abundance (DA), the currently preferred approach is to transform the read count data into log ratio data. The ratios are calculated between the read count of each species in a sample and a “reference” count (e.g., mean read count of the sample). The log ratio data allow the detection of DA species without being affected by the percentage bias mentioned above.

In this report, the compositional DA analysis tool “ANCOM” (analysis of composition of microbiomes) was used [10]. ANCOM transforms the count data into log-ratios and thus is more suitable for comparing the composition of microbiomes in two or more populations. ANCOM generates a table of features with W-statistics and whether the null hypothesis is rejected. The “W” is the W-statistic, i.e., the number of other features against which a single feature is tested to be significantly different. Hence, the higher the "W", the stronger the statistical evidence that a feature/species is differentially abundant.

References:

  9. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome Datasets Are Compositional: And This Is Not Optional. Front Microbiol. 2017 Nov 15;8:2224. doi: 10.3389/fmicb.2017.02224. PMID: 29187837; PMCID: PMC5695134.
  10. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis. 2015 May 29;26:27663. doi: 10.3402/mehd.v26.27663. PMID: 26028277; PMCID: PMC4450248.
 
 

ANCOM Differential Abundance Analysis

 
ANCOM Results for Individual Comparisons
Comparison No. | Comparison Name
Comparison 1. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9. | ADULT negative vs ADULT positive
Comparison 10. | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12. | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 14. | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 

ANCOM-BC2 Differential Abundance Analysis

 

Starting with version V1.2, we include the results of ANCOM-BC (Analysis of Compositions of Microbiomes with Bias Correction) (Lin and Peddada 2020) [11]. ANCOM-BC is an updated version of "ANCOM" that:
(a) provides a statistically valid test with appropriate p-values,
(b) provides confidence intervals for differential abundance of each taxon,
(c) controls the False Discovery Rate (FDR),
(d) maintains adequate power, and
(e) is computationally simple to implement.

The bias correction (BC) addresses a challenging problem of the bias introduced by differences in the sampling fractions across samples. This bias has been a major hurdle in performing DA analysis of microbiome data. ANCOM-BC estimates the unknown sampling fractions and corrects the bias induced by their differences among samples. The absolute abundance data are modeled using a linear regression framework.

Starting with version V1.43, ANCOM-BC2 is used instead of ANCOM-BC, so that multiple pairwise directional tests can be performed (if there are more than two groups in a comparison). When performing pairwise directional tests, the mixed directional false discovery rate (mdFDR) is taken into account. The mdFDR is the combination of the false discovery rate due to multiple testing, multiple pairwise comparisons, and directional tests within each pairwise comparison. The mdFDR is adopted from (Guo, Sarkar, and Peddada 2010 [12]; Grandhi, Guo, and Peddada 2016 [13]). For a more detailed explanation and additional features of ANCOM-BC2, please see the authors' documentation.

References:

  11. Lin H, Peddada SD. Analysis of compositions of microbiomes with bias correction. Nat Commun. 2020 Jul 14;11(1):3514. doi: 10.1038/s41467-020-17041-7. PMID: 32665548; PMCID: PMC7360769.
  12. Guo W, Sarkar SK, Peddada SD. Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories. Biometrics. 2010 Jun;66(2):485-92. doi: 10.1111/j.1541-0420.2009.01292.x. Epub 2009 Jul 23. PMID: 19645703; PMCID: PMC2895927.
  13. Grandhi A, Guo W, Peddada SD. A multiple testing procedure for multi-dimensional pairwise comparisons with application to gene expression studies. BMC Bioinformatics. 2016 Feb 25;17:104. doi: 10.1186/s12859-016-0937-5. PMID: 26917217; PMCID: PMC4768411.
 
 
ANCOM-BC2 Results for Individual Comparisons
 
Comparison No. | Comparison Name
Comparison 1. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9. | ADULT negative vs ADULT positive
Comparison 10. | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12. | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 14. | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 
 
 
 

LEfSe - Linear Discriminant Analysis Effect Size

LEfSe (Linear Discriminant Analysis Effect Size) is an alternative method to find "organisms, genes, or pathways that consistently explain the differences between two or more microbial communities" (Segata et al., 2011) [14]. Specifically, LEfSe uses the rank-based Kruskal-Wallis (KW) sum-rank test to detect features with significant differential (relative) abundance with respect to the class of interest. Because the test is rank-based rather than proportion-based, the differential species identified among the comparison groups are less biased than those identified from percent abundance.

Reference:

  14. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, Huttenhower C. Metagenomic biomarker discovery and explanation. Genome Biol. 2011 Jun 24;12(6):R60. doi: 10.1186/gb-2011-12-6-r60. PMID: 21702898; PMCID: PMC3218848.
 
CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
 
 
 
 
 
 
 
LEfSe Results for All Comparisons
 
Comparison No. | Comparison Name
Comparison 1. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6. | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8. | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9. | ADULT negative vs ADULT positive
Comparison 10. | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12. | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13. | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 14. | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 

XI. Analysis - Heatmap Profile

 

Species vs Sample Abundance Heatmap for All Samples

 
 
 

Heatmaps for Individual Comparisons

 
A) Two-way clustering - clustered on both columns (Samples) and rows (organism)
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 
B) One-way clustering - clustered on rows (organism) only
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 
C) No clustering
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 

XII. Analysis - Network Association

Network correlation analysis is commonly used to study the co-occurrence or co-exclusion of microbial species across samples. However, microbiome count data are compositional. If counts are normalized to the total number of counts in each sample, the resulting values are no longer independent, and traditional statistical metrics (e.g., correlation) for detecting species-species relationships can produce spurious results. In addition, sequencing-based studies typically measure hundreds of OTUs (species) in only a few samples; thus, inference of OTU-OTU association networks is severely underpowered. Here we use SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for inferring microbial ecological networks from amplicon sequencing datasets that addresses both of these issues (Kurtz et al., 2015) [15]. SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. SPIEC-EASI provides two algorithms for network inference: 1) Meinshausen-Bühlmann's neighborhood selection (MB method) and 2) inverse covariance selection (GLASSO method, i.e., graphical least absolute shrinkage and selection operator). Both are fundamentally distinct from SparCC, which essentially estimates pairwise correlations. In addition to these two methods, we provide the results of a third method, SparCC (Sparse Correlations for Compositional Data) (Friedman & Alm, 2012) [16], which is also a method for inferring correlations from compositional data. SparCC estimates the linear Pearson correlations between the log-transformed components.

References:

  1. Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput Biol. 2015 May 7;11(5):e1004226. doi: 10.1371/journal.pcbi.1004226. PMID: 25950956; PMCID: PMC4423992.
  2. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. Epub 2012 Sep 20. PMID: 23028285; PMCID: PMC3447976.
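
The compositional effect described above can be illustrated with a small sketch. It is not the SPIEC-EASI or SparCC implementation used for this report; the counts are made up, and the helper names (`closure`, `clr`, `covariance`) are hypothetical. The first part shows that once counts are normalized to the sample total, each taxon's covariances with all taxa (itself included) sum to zero, so some negative association is forced by the normalization alone. The second part shows the centered log-ratio (CLR) transform, a standard transformation from compositional data analysis and, to our understanding, the one described by Kurtz et al. for SPIEC-EASI.

```python
import math


def closure(counts):
    """Normalize one sample's counts to relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumption of this sketch) avoids log(0) for
    taxa with zero reads; with pseudocount=0.0 all counts must be > 0.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def covariance(x, y):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical count table: rows are samples, columns are taxa.
samples = [
    [120, 30, 50],
    [80, 60, 10],
    [200, 20, 90],
    [40, 70, 30],
]

# Part 1: after closure, the covariance row sums are zero for every taxon,
# because the proportions in each sample always add up to the constant 1.
proportions = [closure(s) for s in samples]
taxa = list(zip(*proportions))  # one tuple of proportions per taxon
n_taxa = len(taxa)
cov_row_sums = [
    sum(covariance(taxa[i], taxa[j]) for j in range(n_taxa))
    for i in range(n_taxa)
]
print("covariance row sums after closure:", cov_row_sums)

# Part 2: CLR values of a sample sum to zero and, without a pseudocount,
# do not change when the sample is sequenced ten times deeper.
shallow = clr([120, 30, 50], pseudocount=0.0)
deep = clr([1200, 300, 500], pseudocount=0.0)
print("CLR (shallow):", [round(v, 4) for v in shallow])
print("CLR (deep):   ", [round(v, 4) for v in deep])
```

The row sums printed in the first part are zero up to floating-point rounding whatever the underlying biology, which is why correlations computed directly on relative abundances can be misleading. Real analyses would rely on the published SPIEC-EASI and SparCC software rather than on helpers like these.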
 

SPIEC-EASI Network Inference by Neighborhood Selection (MB Method)

 

 

 

Association Network Inference by SparCC

 

 

 
 

XIII. Disclaimer

The results of this analysis are for research purposes only. They are not intended to diagnose, treat, cure, or prevent any disease. Forsyth and FOMC are not responsible for any use of the information provided in this report outside the research area.

 

Copyright FOMC 2025