We converted the tutorial into a very simple Nextflow pipeline. The code is available on GitHub.
The pipeline will need the assembly, the reads and the two databases (geNomad and CheckV).
The reads are supplied as a mapping file, which you can create with votuderep tabulate:
votuderep tabulate ~/virome/reads -o ~/reads.csv
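Before passing the mapping file to the pipeline, it can help to confirm that it was written and is not empty. The snippet below is a minimal sketch: preview_mapping is a hypothetical helper (not part of votuderep or the pipeline) that prints the first lines and the line count, and it makes no assumption about the column layout of the file.

```shell
# Hypothetical helper: preview a mapping file and count its lines.
# It only checks that the file exists and is non-empty; it does not
# validate the columns, whose layout is not assumed here.
preview_mapping() {
    csv="$1"
    if [ ! -s "$csv" ]; then
        echo "Missing or empty mapping file: $csv" >&2
        return 1
    fi
    head -n 5 "$csv"
    echo "Total lines: $(awk 'END {print NR}' "$csv")"
}

# Example call with the path used in this tutorial (uncomment to run):
# preview_mapping ~/reads.csv
```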
One of the cool things about Nextflow is that you don’t need to download the pipeline from GitHub or install its dependencies yourself: they are fetched from the source you choose (in our case, Docker). We could also use Nextflow to download the databases…
nextflow run quadram-institute-bioscience/nf-ebame-virome -profile docker \
--assembly ~/virome/human_gut_assembly.fa.gz \
--reads ~/reads.csv \
--genomad_db ~/db/genomad_db/ \
--checkv_db ~/db/checkv-db-v1.5/ \
--outdir ~/nf-ebame-virome-out
After you press Enter, Nextflow will download the pipeline and start orchestrating the tasks.
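A mistyped path is an easy way to lose time, so it may be worth checking the inputs before launching. The following is a rough sketch: check_paths is a hypothetical helper (not provided by Nextflow or the pipeline) that reports every path that does not exist and returns a non-zero status if any is missing.

```shell
# Hypothetical helper: report each missing path on stderr and
# return 1 if at least one path does not exist, 0 otherwise.
check_paths() {
    missing=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "Not found: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call with the paths used in this tutorial (uncomment to run):
# check_paths ~/virome/human_gut_assembly.fa.gz ~/reads.csv \
#     ~/db/genomad_db/ ~/db/checkv-db-v1.5/
```

If the call returns successfully, all the listed files and directories exist; it does not check that the databases are complete.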

The output folder will contain a subdirectory for each task (it’s still a bit messy):
output-directory
├── genomad
│   └── genomad-out
│       ├── input_assembly_aggregated_classification
│       ├── input_assembly_annotate
│       │   └── input_assembly_mmseqs2
│       │       ├── ...
│       ├── input_assembly_find_proviruses
│       │   └── input_assembly_provirus_mmseqs2
│       │       ├── ...
│       ├── input_assembly_marker_classification
│       ├── input_assembly_nn_classification
│       │   ├── ...
│       └── input_assembly_summary
├── dereplicated-votus (dereplicated vOTU FASTA)
├── checkv
│   └── checkv-out
│       └── tmp...
├── alignments (your BAM files)
└── pipeline_info
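
Since the output folder is a bit messy, a quick per-task file count can show at a glance whether each step produced something. This is only a sketch: summarise_outdir is a hypothetical helper (not part of the pipeline) that counts the files under each top-level subdirectory, without assuming any file names or extensions.

```shell
# Hypothetical helper: for each top-level subdirectory of the output
# folder, print its name and the number of files it contains
# (counted recursively).
summarise_outdir() {
    outdir="$1"
    for dir in "$outdir"/*/; do
        # Skip the unexpanded glob when there are no subdirectories
        [ -d "$dir" ] || continue
        count=$(find "$dir" -type f | awk 'END {print NR}')
        echo "$(basename "$dir"): $count files"
    done
}

# Example call with the output directory used in this tutorial
# (uncomment to run):
# summarise_outdir ~/nf-ebame-virome-out
```

A task that reports 0 files is a good place to start looking if something went wrong.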