Coffee dataset

📜 From Pothakos 2020: Temporal shotgun metagenomics of an Ecuadorian coffee fermentation process highlights the predominance of lactic acid bacteria

This study presents a temporal shotgun metagenomic analysis of an Arabica wet coffee fermentation process conducted at a coffee plantation research station near Nanegal, Ecuador (elevation 1329 m). The project was designed to complement previous microbiological and metabolomic studies of the same fermentation process by providing an in-depth, genome-level analysis of the microbial community structure and functional capabilities. The research examined reproducible wet coffee processing practices, where ripe coffee cherries (Coffea arabica L. var. Typica) were mechanically depulped and fermented underwater in a concrete tank (1 m × 2 m × 2 m).

The primary objectives were to delineate the structure and function of the microbiome, investigate temporal shifts in community composition, track microbial sources from the plantation environment, identify core and transient microbiota, and determine the metabolic roles of distinct microbial species during fermentation.

The study generated approximately 54 million high-quality metagenomic sequences representing 16 Gbp of data, which were analyzed following removal of ~8 million plant DNA sequences from Coffea canephora.

The Samples

Sample   Time point   Description
F0       0 h          Start of fermentation (baseline)
F8       8 h          Early fermentation phase
F16      16 h         End of standard fermentation; mucilage removal complete
F24      24 h         Extended fermentation begins
F36      36 h         Mid-extended fermentation
F64      64 h         End of extended fermentation

Downloading the reads

The reads were deposited in public repositories under the accession number PRJEB24129. This project contains six sequenced samples, one for each time point listed above.
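One way to list the runs behind PRJEB24129 is to query the ENA Portal API file report, which returns one row per sequencing run along with the FASTQ locations. The sketch below is a minimal example, not the procedure used by the course or the authors. The function names (build_report_url, parse_report, fetch_runs) are hypothetical, and it assumes that the project is mirrored at ENA and that the FASTQ paths returned without a scheme can be fetched over https.

```python
import csv
import io
import urllib.parse
import urllib.request

# ENA Portal API endpoint that reports the files attached to an accession.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_report_url(accession, fields=("run_accession", "sample_alias", "fastq_ftp")):
    """Build the file report URL for a study accession (hypothetical helper)."""
    query = urllib.parse.urlencode({
        "accession": accession,
        "result": "read_run",        # one row per sequencing run
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_report(tsv_text):
    """Turn the TSV report into a list of dicts, one per run."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    runs = []
    for row in reader:
        # fastq_ftp lists one file (single-end) or two (paired-end), separated by ';'.
        paths = [p for p in (row.get("fastq_ftp") or "").split(";") if p]
        runs.append({
            "run": row["run_accession"],
            "sample": row.get("sample_alias") or "",
            # Assumption: the scheme-less paths are reachable over https.
            "fastq_urls": ["https://" + p for p in paths],
        })
    return runs


def fetch_runs(accession):
    """Query ENA (network access required) and return the parsed runs."""
    with urllib.request.urlopen(build_report_url(accession)) as response:
        return parse_report(response.read().decode("utf-8"))
```

Calling fetch_runs("PRJEB24129") and printing each entry gives the run accessions and FASTQ URLs, which can then be downloaded with any HTTP client. Nothing is downloaded by the sketch itself, since the read files are large.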

Reference genome

Like the authors of the original paper, we will use the genome assembly GCA_036785865.1, which we can download using datasets, a CLI tool from NCBI.
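The download can be scripted by calling the datasets CLI from Python, as in the minimal sketch below. It assumes datasets is already installed and on the PATH. The helper names and the output file name reference_genome.zip are hypothetical choices for illustration, and the --include genome option restricts the download to the genomic FASTA.

```python
import shutil
import subprocess


def build_datasets_command(accession, out_zip):
    """Assemble the NCBI datasets command line (hypothetical helper)."""
    return [
        "datasets", "download", "genome", "accession", accession,
        "--include", "genome",    # genomic sequence only
        "--filename", out_zip,    # name of the zip archive to write
    ]


def download_reference(accession="GCA_036785865.1", out_zip="reference_genome.zip"):
    """Run the download; requires network access and the datasets CLI."""
    if shutil.which("datasets") is None:
        raise RuntimeError("The NCBI 'datasets' CLI was not found on PATH.")
    subprocess.run(build_datasets_command(accession, out_zip), check=True)
    return out_zip
```

Calling download_reference() writes a zip archive to the working directory. The archive then needs to be unzipped to get at the FASTA file inside.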

