In this section we will focus on the taxonomic classification of shotgun metagenomic reads using two different tools: Kraken 2 and Kaiju. We will use the data obtained in the data retrieval section.
Approach 1: Kraken 2
Before we can use Kraken 2, we need to build or download a database. We will use the build-kraken-db action to fetch the PlusPF database
from here. This database covers RefSeq sequences for archaea, bacteria, viruses, plasmids,
human, UniVec_Core, protozoa and fungi.
mosh annotate build-kraken-db \
    --p-collection pluspf \
    --o-kraken2-db cache:kraken2_db \
    --o-bracken-db cache:bracken_db
We can now use the classify-kraken2 action to run Kraken 2 on the paired-end reads, using the PlusPF database retrieved in the previous step as the reference:
mosh annotate classify-kraken2 \
    --i-seqs cache:reads_filtered \
    --i-db cache:kraken2_db \
    --p-threads 72 \
    --p-confidence 0.5 \
    --p-no-memory-mapping \
    --p-report-minimizer-data \
    --o-reports cache:kraken_reports_reads \
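    --o-outputs cache:kraken_hits_reads \
    --verbose
Next, we use the estimate-bracken action to re-estimate abundances from the Kraken 2 reports with Bracken:
mosh annotate estimate-bracken \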
    --i-kraken2-reports cache:kraken_reports_reads \
    --i-db cache:bracken_db \
    --p-threshold 5 \
    --p-read-len 150 \
    --o-taxonomy cache:bracken_taxonomy \
    --o-table cache:bracken_ft \
    --o-reports cache:bracken_reports
To remove the unclassified read fraction, we can use the filter-table action from the q2-taxa QIIME 2 plugin:
mosh taxa filter-table \
    --i-table cache:bracken_ft \
    --i-taxonomy cache:bracken_taxonomy \
    --p-exclude Unclassified \
    --o-filtered-table cache:bracken_ft_filtered
Approach 2: Kaiju
Like Kraken 2, Kaiju requires a reference database to perform taxonomic classification. We will use the fetch-kaiju-db
action to download the nr_euk database, which includes both
prokaryotes and eukaryotes (more info on the taxa here).
mosh annotate fetch-kaiju-db \
    --p-database-type nr_euk \
    --o-db cache:kaiju_nr_euk
We run Kaiju with a confidence of 0.1, using the paired-end reads as the query and the database artifact generated in the previous step:
mosh annotate classify-kaiju \
    --i-seqs cache:reads_paired \
    --i-db cache:kaiju_nr_euk \
    --p-z 16 \
    --p-c 0.1 \
    --o-taxonomy cache:kaiju_taxonomy \
    --o-abundances cache:kaiju_ft
Finally, we filter the table to remove the unclassified reads:
mosh taxa filter-table \
    --i-table cache:kaiju_ft \
    --i-taxonomy cache:kaiju_taxonomy \
    --p-exclude unclassified,belong,cannot \
    --o-filtered-table cache:kaiju_ft_filtered
Visualization
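Before plotting, it can help to understand what the three exclusion terms in the Kaiju filtering step (unclassified, belong, cannot) are meant to catch. The sketch below is only an illustration and does not call filter-table itself: the taxonomy labels are hypothetical examples modeled on the kind of catch-all rows a Kaiju summary may contain, and it assumes that a term matches whenever it appears anywhere in a label, ignoring case. Check the labels in your own taxonomy before relying on it.

```shell
# Hypothetical taxonomy labels; the exact strings in your data may differ.
labels='d__Bacteria;p__Pseudomonadota;g__Escherichia
d__Eukaryota;p__Ascomycota;g__Saccharomyces
unclassified
cannot be assigned to a (non-viral) species
belong to a (non-viral) species with less than 0.5% of all reads'

# Assumption: a term matches if it occurs anywhere in the label (case-insensitive).
terms='unclassified|belong|cannot'

# Labels that survive the exclusion, and labels that are removed by it.
kept=$(printf '%s\n' "$labels" | grep -i -v -E "$terms")
dropped=$(printf '%s\n' "$labels" | grep -i -E "$terms")

echo "Kept:"
printf '%s\n' "$kept" | sed 's/^/  /'
echo "Dropped:"
printf '%s\n' "$dropped" | sed 's/^/  /'
```

In this toy input, a single term such as unclassified would leave the other two catch-all rows in place, which is why several terms are combined.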
You can now generate a taxa bar plot with either of these results! We will continue with the Kaiju results. To generate a taxa bar plot, run:
mosh taxa barplot \
    --i-table cache:kaiju_ft_filtered \
    --i-taxonomy cache:kaiju_taxonomy \
    --m-metadata-file metadata.tsv \
    --o-visualization results/kaiju_barplot.qzv
Your visualization should look similar to this one.
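The barplot command above reads sample metadata from metadata.tsv. If the command complains about that file, a quick look at its layout can help. The following is a minimal sketch of a tab-separated metadata file with an ID column followed by one descriptive column. The sample IDs and the group column are hypothetical, and the file is written to a scratch directory so that it does not overwrite your real metadata. In practice the IDs should match the sample IDs in your feature table.

```shell
# Write a tiny, hypothetical metadata file to a scratch directory.
workdir=$(mktemp -d)
meta="$workdir/metadata.tsv"

# Columns are tab-separated; the first column holds the sample IDs.
printf 'sample-id\tgroup\n'  > "$meta"
printf 'sample1\tcontrol\n' >> "$meta"
printf 'sample2\ttreated\n' >> "$meta"

# Basic sanity checks: ID header, number of samples, consistent column count.
header=$(head -n 1 "$meta" | cut -f 1)
n_samples=$(tail -n +2 "$meta" | grep -c .)
n_widths=$(awk -F '\t' '{ print NF }' "$meta" | sort -u | grep -c .)

echo "ID column header: $header"
echo "Samples: $n_samples"
echo "Distinct column counts across rows: $n_widths"
```

A value other than 1 on the last line would mean that some rows have a different number of columns than the header, which is a common cause of metadata parsing errors.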