This workshop shows a simplified workflow for mining phages from a metagenome.
Starting from reads, quality filtering and assembly are needed to obtain contigs.

In this simplified workflow we start from the co-assembly of three samples, and we will use:
- A viral mining tool (geNomad) to predict viral sequences in a metagenome; it produces a FASTA file with viral sequences and prophages
- A program to check the quality of the predictions (CheckV), which reports, for each prediction, the bacterial contamination, the completeness, and a score based on marker genes
- Dereplication of the viral sequences, to produce vOTUs (viral Operational Taxonomic Units)
- SeqFu to rename the sequences and make them Anvi’o friendly
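To illustrate what the renaming step achieves, here is a minimal Python sketch. It is not how SeqFu works internally and it does not replace it; it only shows the idea of swapping long headers (with spaces or special characters) for short, predictable names. The function name `rename_fasta` and the `vOTU_` prefix are hypothetical choices made for this example.

```python
def rename_fasta(lines, prefix="vOTU_"):
    """Yield FASTA lines with each header replaced by prefix + running number.

    `lines` is any iterable of text lines (for example an open file).
    Sequence lines are passed through unchanged.
    """
    count = 0
    for line in lines:
        if line.startswith(">"):
            count += 1
            # Drop the original header entirely: only letters, digits and "_" remain
            yield f">{prefix}{count}\n"
        else:
            yield line


# Tiny in-memory example (hypothetical headers)
example = [">contig_12|provirus_1_500 len=500\n", "ACGT\n", ">contig_7\n", "GGCC\n"]
print("".join(rename_fasta(example)), end="")
```

Note that the original names are lost in this sketch; in a real analysis you would want to keep a table mapping old names to new ones.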
To keep our Anvi’o love high, we can also:
- Back-map the reads to the vOTUs with minimap2 and samtools, to estimate their abundance in the original samples
- Use Anvi’o to generate a visualization focused on the vOTUs, building a CONTIGS.db database based only on the candidate vOTUs.
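As a rough sketch of the back-mapping step, the Python snippet below only builds and prints the command lines (a dry run); it does not execute them. It assumes paired-end short reads (hence the minimap2 `sr` preset), and all file names, the sample name, and the helper `backmap_commands` are hypothetical. The commands actually used in the workshop may differ.

```python
import shlex


def backmap_commands(votus, reads_1, reads_2, sample, threads=4):
    """Return shell command strings to map one sample's reads to the vOTUs.

    Assumption: paired-end short reads, so the minimap2 "sr" preset is used.
    """
    bam = f"{sample}.sorted.bam"
    # Align reads to the vOTU FASTA and write SAM to standard output
    align = ["minimap2", "-a", "-x", "sr", "-t", str(threads), votus, reads_1, reads_2]
    # Sort the alignments read from standard input ("-") into a BAM file
    sort = ["samtools", "sort", "-@", str(threads), "-o", bam, "-"]
    # Index the sorted BAM so downstream tools can access it
    index = ["samtools", "index", bam]
    return [f"{shlex.join(align)} | {shlex.join(sort)}", shlex.join(index)]


# Dry run for one hypothetical sample
for cmd in backmap_commands("votus.fna", "S1_R1.fq.gz", "S1_R2.fq.gz", "S1"):
    print(cmd)
```

One sorted and indexed BAM file per sample is produced this way, so the function would be called once for each of the three samples.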
EBAME settings
We will use VMs provided by Biosphere; a setup script makes the datasets and databases available automatically.
See EBAME Setup