Illumina 16S V3V4 amplicone analysis pipeline using QIIME2 + KTU reclustering

Software requirement

Example

Sawada, K., Koyano, H., Yamamoto, N. et al. The relationships between microbiota and the amino acids and organic acids in commercial vegetable pickle fermented in rice-bran beds. Sci Rep 11, 1791 (2021). https://doi.org/10.1038/s41598-021-81105-x
Expdesign
Download data

QIIME2 analysis pipeline

0. activate conda environment

conda activate qiime2-2019.10

1. Import primer trimmed FASTQs as QIIME2 artifact

mkdir reads_qza
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest \
  --output-path reads_qza/paired-end.qza \
  --input-format PairedEndFastqManifestPhred33

manifest

2. Trim primers

time qiime cutadapt trim-paired \
--i-demultiplexed-sequences reads_qza/paired-end.qza \
--p-cores 6 \
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC  \
--o-trimmed-sequences reads_qza/paired-end.trimmed.qza \
--p-error-rate 0 \
--p-discard-untrimmed \
--verbose \
&> primer_trimming.log 

3. View sequence quality

qiime demux summarize \
--i-data reads_qza/paired-end.trimmed.qza \
--o-visualization reads_qza/primer-trimmed.qzv

4. Run DADA2 denoising and output denoising statistics

time qiime dada2 denoise-paired --i-demultiplexed-seqs reads_qza/paired-end.trimmed.qza \
                           --p-trunc-len-f 270 \
                           --p-trunc-len-r 210 \
                           --p-n-threads 6 \
                           --output-dir dada2_output

qiime tools export --input-path dada2_output/denoising_stats.qza --output-path dada2_output

5. Convert artifact to readable format and prepare dataset for KTU

qiime tools export --input-path dada2_output/table.qza --output-path dada2_output_exported
qiime tools export --input-path dada2_output/representative_sequences.qza --output-path dada2_output_exported
biom convert -i dada2_output_exported/feature-table.biom -o dada2_output_exported/feature-table.txt --to-tsv

cp dada2_output_exported/feature-table.txt dada2_output_exported/feature-table.1.txt
sed -i '1d' dada2_output_exported/feature-table.1.txt
sed -i -e '1 s/#OTU ID/OTU_ID/' dada2_output_exported/feature-table.1.txt

6. Assign taxonomy using VSEARCH

qiime feature-classifier classify-consensus-vsearch \
    --i-query dada2_output/representative_sequences.qza \
    --i-reference-reads ~/Genomics/QIIME2_taxa_classifiers/Silva99_16SRepSeq.qza \
    --i-reference-taxonomy ~/Genomics/QIIME2_taxa_classifiers/Silva99_16STaxonomy.qza \
    --p-threads 6 \
    --verbose \
    --output-dir taxa  

qiime tools export --input-path taxa/classification.qza --output-path taxa

Download classifiers

7. Annotate feature table

sed -i -e '1 s/Feature/#Feature/' -e '1 s/Taxon/taxonomy/' taxa/taxonomy.tsv
biom add-metadata -i dada2_output_exported/feature-table.biom -o dada2_output_exported/feature-table_w_tax.biom --observation-metadata-fp taxa/taxonomy.tsv --sc-separated taxonomy
biom convert -i dada2_output_exported/feature-table_w_tax.biom -o dada2_output_exported/feature-table_w_tax.txt --to-tsv --header-key taxonomy

8. KTU reclustering

Please cite: Liu, P. Y., Yang, S. H., & Yang, S. Y. (2021). KTU: K‐mer Taxonomic Units improve the biological relevance of amplicon sequence variant microbiota data. Methods in Ecology and Evolution. doi:10.1111/2041-210x.13758

  • create an empty R file, name as KTU.R and paste following code in the file:
rm(list=ls())
library(KTU) 
data <- read.delim("dada2_output_exported/feature-table.1.txt")


kluster <- klustering(repseq = "dada2_output_exported/dna-sequences.fasta",
                          feature.table = data,
                          write.fasta = TRUE,cores = 4)
saveRDS(kluster,file="kluster.RDS")
  • optional codes: KTU evaluation & KTU taxonomy assignment
evaluation <- KTUsim.eval(klusterRDS = "kluster.RDS",ASVfasta = "dada2_output_exported/dna-sequences.fasta")
saveRDS(evaluation,file="ktu_eval.RDS")

rm(list=ls())
kluster <- readRDS("kluster.RDS")
kaxa <- KTU::kaxonomy(dbRDS = "~/Projects/16S_Coccidiosis/SILVA_132_V3V4_DB.RDS",
                      taxaRDS = "~/Projects/16S_Coccidiosis/SILVA_132_V3V4_TX.RDS",
                      kmer.table = kluster$kmer.table,cos.cutff = 0.95,consensus = 0.51,
                      cores = 4)
saveRDS(kaxa,file="kaxonomy.RDS")

9. Run KTU (using Linux command)

time Rscript KTU.R

Diversity analysis & visualization

MARco R package

Please cite: Liu, P.-Y. poyuliu/MARco: MARco: Microbiome Analysis RcodeDB (Version v1.0). Zenodo. Available online: http://doi.org/10.5281/zenodo.4589898 (accessed on Day Month Year).

Barplot
Alpha diversity
Beta diversity