Analysis of 16S Metabarcoding Data

From counts to ecological insights

Author

Quadram Institute Bioscience

Published

May 11, 2026

Introduction

About this workshop

This is the second part of the QIB Core Bioinformatics 16S metabarcoding workshop series. The first part covered raw read processing with DADA2, from FASTQ files to an ASV table and taxonomy assignments. This session focuses on downstream ecological analysis: what to do with those counts once you have them.

The workshop covers:

Assembling and cleaning a phyloseq object from DADA2 outputs
Filtering rare ASVs and applying abundance transformations
Alpha diversity: within-sample richness and evenness
Beta diversity: between-sample distances, ordination, and PERMANOVA
Composition visualisation: bar plots and heat maps at taxon level
A brief introduction to differential abundance analysis with MaAsLin3

The dataset comes from a study of Romanian-style cucumber fermentations comparing spontaneous fermentation with inoculation by Lactiplantibacillus plantarum IBB082, sampled at four time points.

This workshop builds on the first session. You will need the cleaned DADA2 outputs (ASV count table, taxonomy table, and sample metadata) to follow the practical sections. The data files used here were generated in the 16S Metabarcoding with DADA2 workshop.

Trainers

This session is designed and delivered by QIB Core Bioinformatics: Alise Ponsero, Andrea Telatin, and Judit Talas.

Workshop structure

Block	Topic
1	From DADA2 outputs to a clean phyloseq object
2	Filtering rare ASVs and abundance transformations
3	Alpha diversity
4	Beta diversity, compositionality, and PERMANOVA
5	Composition visualisation
6	Wrap-up, differential abundance, and further reading

Software requirements

Run the provided install_dependencies.R script at least a day before the workshop to check everything installs correctly on your machine:

Rscript install_dependencies.R

Or from RStudio: open the file and click Source.

The script installs the following packages:

From CRAN:

install.packages(c(
  "rmarkdown", "knitr",
  "dplyr", "tidyr", "tibble", "readr", "ggplot2",
  "patchwork",   # side-by-side ordination plots (Block 4)
  "vegan"        # PERMANOVA and distance matrices (Block 4)
))

From Bioconductor:

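# BiocManager itself comes from CRAN and must be present first
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("phyloseq", "microbiome"))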

Optional — only needed for the differential abundance section in Block 6:

BiocManager::install("maaslin3")

The script will print an installation check at the end listing each package as OK or FAILED. Resolve any failures before the session starts.
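
If you want to re-run a similar check yourself at any point, the idea can be sketched as follows. This is a minimal illustration, not the contents of install_dependencies.R: the helper name `check_packages` is hypothetical, and the real script may report results differently.

```r
# Hypothetical re-creation of an installation check; the actual
# install_dependencies.R script may be implemented differently.
required <- c("rmarkdown", "knitr", "dplyr", "tidyr", "tibble", "readr",
              "ggplot2", "patchwork", "vegan", "phyloseq", "microbiome")

check_packages <- function(pkgs) {
  # requireNamespace() returns TRUE when a package can be loaded,
  # without attaching it to the search path
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  status <- ifelse(ok, "OK", "FAILED")
  names(status) <- pkgs
  status
}

status <- check_packages(required)
# One line per package: name followed by OK or FAILED
cat(sprintf("%-12s %s\n", names(status), status), sep = "")
```

Any package reported as FAILED can be reinstalled individually with `install.packages()` or `BiocManager::install()`, depending on where it comes from.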

R version 4.3 or newer is recommended.
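
A quick way to confirm which R version you are running, sketched here with base R only (the variable name `r_ok` is just for illustration):

```r
# Compare the running R version against the recommended minimum
r_ok <- getRversion() >= "4.3.0"

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the recommendation.")
} else {
  warning("R ", as.character(getRversion()),
          " detected; 4.3 or newer is recommended.")
}
```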

Data

The input files expected by Block 1 are those produced at the end of the DADA2 workshop:

File	Contents
`metabarcoding_dada2_asvtax.csv`	ASV count table (samples × ASVs) and taxonomy assignments
`metabarcoding_dada2_QC.csv`	Per-sample DADA2 read-tracking and
`metabarcoding_metadata_2026.tsv`	Sample metadata

Place these files in an Inputs/ subdirectory before starting Block 1.
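
Before starting, it can help to confirm that the files are where Block 1 expects them. The sketch below uses the file names from the table above and assumes that Inputs/ sits in your current working directory; the helper name `missing_inputs` is hypothetical.

```r
# File names come from the table above; the location of Inputs/ relative
# to the working directory is an assumption.
input_dir <- "Inputs"
expected <- c("metabarcoding_dada2_asvtax.csv",
              "metabarcoding_dada2_QC.csv",
              "metabarcoding_metadata_2026.tsv")

missing_inputs <- function(dir, files) {
  # Return the names of the files that are not present in dir
  paths <- file.path(dir, files)
  files[!file.exists(paths)]
}

missing <- missing_inputs(input_dir, expected)
if (length(missing) > 0) {
  message("Missing input files: ", paste(missing, collapse = ", "))
} else {
  message("All input files found in ", input_dir, "/")
}
```

If files are reported missing, check your working directory with `getwd()` first: a correct Inputs/ folder in the wrong place gives the same message.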