This hands-on workshop introduces functional profiling approaches for characterizing the metabolic potential of microbial communities. You’ll learn the conceptual differences between gene catalogue and read-based approaches, understand HUMAnN3’s tiered search strategy, and explore pathway-level functional changes in real metagenomic data. We’ll work with coffee bean fermentation samples to understand how microbial metabolism shifts during the fermentation process.
What you’ll learn:
Workshop structure: Theory presentations followed by guided R exploration of pre-computed HUMAnN3 results, with time for independent pathway investigation.
Dataset: Coffee bean fermentation time series (T0, T16, T24, T48 hours) from Ecuador showing metabolic shifts as lactic acid bacteria and yeasts process coffee bean substrates.
The workshop material is available to download from Figshare.
Taxonomic profiling tells us who is present in a microbial community, but not what they can do. Functional profiling characterizes the genes, enzymes, and metabolic pathways present, revealing biochemical potential independent of taxonomic identity.
Key concept: Functional redundancy - Different species can perform the same functions. Multiple organisms may carry genes for the same metabolic pathway, providing functional stability even when taxonomic composition changes.
Why it matters: Functional profiling links community structure to ecosystem function, predicts metabolic outputs, identifies biomarkers, and enables microbiome engineering for desired functions.
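To make functional redundancy concrete, here is a minimal Python sketch. The species names and pathway labels are hypothetical placeholders, not taken from the coffee dataset. It shows a community keeping its full set of functions after one member drops out.

```python
# Toy data (hypothetical): which pathways each species carries.
community = {
    "Species_A": {"glycolysis", "lactate_fermentation"},
    "Species_B": {"glycolysis", "lactate_fermentation", "pectin_degradation"},
    "Species_C": {"glycolysis", "ethanol_fermentation"},
}


def community_functions(members):
    """Return the union of pathways carried by the given species."""
    functions = set()
    for pathways in members.values():
        functions |= pathways
    return functions


def redundancy(members):
    """Count how many species carry each pathway."""
    counts = {}
    for pathways in members.values():
        for pathway in pathways:
            counts[pathway] = counts.get(pathway, 0) + 1
    return counts


before = community_functions(community)
# Remove one species to mimic a shift in taxonomic composition.
remaining = {name: p for name, p in community.items() if name != "Species_A"}
after = community_functions(remaining)

print("Species per pathway:", redundancy(community))
print("Functions lost without Species_A:", before - after)
```

In this toy community every pathway carried by Species_A is also carried by another member, so the taxonomic profile changes while the functional profile does not.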
Two main strategies exist for functional profiling, each with distinct trade-offs.
| Aspect | Gene Catalogue | Read-Based (HUMAnN3) |
|---|---|---|
| Workflow | Assemble → Annotate → Quantify | Map reads → Quantify |
| Assembly needed? | Yes | No |
| Novel functions | Can discover | Database-limited |
| Speed | Slow (days) | Fast (hours) |
| Best for | Discovery, novel environments | Comparative studies, standard analysis |
Which to choose? Read-based profiling (HUMAnN3) is standard for most metagenomic studies: it is fast, standardized, and works well for comparative analyses. Gene catalogues are used for discovery in under-studied environments or when strain-level resolution is needed. Many projects use both approaches because they are complementary.
For this workshop: We’ll focus on HUMAnN3, one of the most widely used tools for read-based functional profiling.
HUMAnN3 (HMP Unified Metabolic Analysis Network) uses a tiered search strategy to balance speed and sensitivity.
Step 1: Taxonomic Profiling - Runs MetaPhlAn4 to identify species present
Step 2: Nucleotide Search - Maps reads to pangenomes of detected species only (ChocoPhlAn database)
Step 3: Translated Search - Unmapped reads are translated and searched against UniRef90 proteins
Step 4: Quantification - Counts reads per gene family and maps to metabolic pathways (MetaCyc)
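Why this works: Restricting the nucleotide search to the pangenomes of detected species (Step 2) keeps it fast. The slower translated search (Step 3) then only has to handle the reads that remain unmapped, catching divergent genes and genes from organisms without a reference pangenome. Results are stratified by species to show which organisms contribute to each function.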
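The tiered logic can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not HUMAnN3's implementation: reads are matched by exact dictionary lookup instead of alignment, translation to protein is skipped, and all names (`tiered_search`, the species, the gene families) are hypothetical.

```python
def tiered_search(reads, pangenomes, detected_species, protein_db):
    """Assign reads to gene families in two tiers (toy version).

    Tier 1 searches only the pangenomes of species detected in the sample.
    Tier 2 searches the leftover reads against a broader protein database.
    """
    # Tier 1: build a search space restricted to detected species.
    search_space = {}
    for species in detected_species:
        for sequence, family in pangenomes.get(species, {}).items():
            search_space[sequence] = (family, species)

    hits, leftover = [], []
    for read in reads:
        if read in search_space:
            family, species = search_space[read]
            hits.append((read, family, species))
        else:
            leftover.append(read)

    # Tier 2: leftover reads go to the broader database; a hit here has
    # no species attribution, so it is labelled "unclassified".
    unmapped = []
    for read in leftover:
        if read in protein_db:
            hits.append((read, protein_db[read], "unclassified"))
        else:
            unmapped.append(read)
    return hits, unmapped


# Hypothetical toy inputs.
pangenomes = {
    "Species_A": {"ACGT": "family_1"},
    "Species_B": {"GGCC": "family_2"},
}
protein_db = {"GGCC": "family_2", "TTAA": "family_3"}
reads = ["ACGT", "GGCC", "TTAA", "NNNN"]

hits, unmapped = tiered_search(reads, pangenomes, ["Species_A"], protein_db)
for read, family, species in hits:
    print(read, family, species)
print("Unmapped:", unmapped)
```

Because only Species_A was detected, the read matching Species_B's pangenome is not found in tier 1 and is recovered in tier 2 without a species label.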
Gene Families (genefamilies.tsv) - Abundance of each gene family (UniRef90 IDs), in RPK units
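Pathway Abundance (pathabundance.tsv) - Abundance of metabolic pathways (MetaCyc), in CPM units after normalization (HUMAnN3's raw output is not sum-normalized; humann_renorm_table rescales it to CPM or relative abundance)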
Pathway Coverage (pathcoverage.tsv) - What proportion of each pathway is present (0-1 scale)
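The units above come down to simple arithmetic. The sketch below assumes the textbook definitions: RPK is a read count divided by gene length in kilobases, and CPM rescales a sample so its values sum to one million. HUMAnN3 additionally weights alignments, so real values will differ. The gene family IDs, read counts, and gene lengths are made up for illustration.

```python
def reads_per_kilobase(read_count, gene_length_bp):
    """RPK: read count divided by gene length in kilobases."""
    return read_count / (gene_length_bp / 1000)


def to_cpm(abundances):
    """Rescale abundances so that they sum to one million (copies per million)."""
    total = sum(abundances.values())
    if total == 0:
        return {feature: 0.0 for feature in abundances}
    return {feature: value / total * 1_000_000 for feature, value in abundances.items()}


# Hypothetical gene families: (mapped reads, gene length in bp).
counts = {
    "UniRef90_example1": (30, 1500),
    "UniRef90_example2": (10, 500),
}

rpk = {gene: reads_per_kilobase(n, length) for gene, (n, length) in counts.items()}
cpm = to_cpm(rpk)

for gene in counts:
    print(gene, "RPK:", rpk[gene], "CPM:", cpm[gene])
```

The two genes have different raw read counts but the same RPK, which is the point of length normalization: a longer gene collects more reads at equal copy number. CPM then makes samples of different sequencing depth comparable.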
A key feature is stratification - showing which species contribute to each function.
Unstratified rows (no |): Community-total abundance
Stratified rows (with |): Species-specific contributions
Example: PWY-5484|Lactobacillus_plantarum shows L. plantarum’s contribution to pathway PWY-5484.
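As a minimal sketch of working with this layout, the Python snippet below separates community totals from per-species rows by checking for the `|` character. The table excerpt, sample names, and abundance values are invented for illustration, and the species label follows the simplified form used in the example above.

```python
import csv
import io

# Hypothetical tab-separated excerpt; values are made up.
TABLE = (
    "# Pathway\tsample_T0\tsample_T48\n"
    "PWY-5484\t100.0\t250.0\n"
    "PWY-5484|Lactobacillus_plantarum\t60.0\t200.0\n"
    "PWY-5484|unclassified\t40.0\t50.0\n"
)


def split_stratified(text):
    """Split a table into community totals and per-species contributions."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    samples = next(reader)[1:]  # header row: feature column, then samples
    totals, contributions = {}, {}
    for row in reader:
        if not row:
            continue
        feature = row[0]
        values = dict(zip(samples, (float(v) for v in row[1:])))
        if "|" in feature:
            # Stratified row: "<pathway>|<species>"
            pathway, species = feature.split("|", 1)
            contributions.setdefault(pathway, {})[species] = values
        else:
            # Unstratified row: community total
            totals[feature] = values
    return totals, contributions


totals, contributions = split_stratified(TABLE)
print(totals["PWY-5484"])
for species, values in contributions["PWY-5484"].items():
    print(species, values)
```

Keeping the two kinds of rows apart avoids double counting when summing or plotting pathway abundances.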
While HUMAnN3 provides broad functional profiling, some questions require specialized tools for specific gene classes:
CAZymes (dbCAN) - Carbohydrate-active enzymes for polysaccharide degradation
ARGs (ABRicate, GROOT, AMRFinderPlus) - Antimicrobial resistance genes
Secondary metabolites (antiSMASH) - Biosynthetic gene clusters
Viral (VIBRANT, CheckV) - Viral sequences and prophages
When to use: Start with HUMAnN3 for overview, then use specialized tools when focusing on specific gene classes or when detailed characterization is needed.
– HUMAnN3 Documentation: https://huttenhower.sph.harvard.edu/humann
– bioBakery Forum: https://forum.biobakery.org/
– MetaCyc Database: https://metacyc.org/
– Functional Profiling Review: Nayfach et al. (2020) Nature Reviews Microbiology