First, we need to log in to a “submission node”:
ssh ${USER}@hpc.ac.uk
After logging in, we are in our home directory.
To check our Micromamba installation, we will create a new environment:
# Log in to the software node
ssh software
# Create the new environment with votuderep, available from Bioconda
micromamba create -n tutorial -c conda-forge -c bioconda votuderep
Staying on the software node, we will download a training dataset using votuderep.
# Check the options
micromamba run -n tutorial votuderep trainingdata --help
# Run the program
micromamba run -n tutorial votuderep trainingdata -o ~/tutorial/
⚠️ If the download fails but we retrieved at least the first file, we can move on and skip the fastp step
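Before moving on, it can help to confirm what actually arrived. A minimal sketch: count_downloads is a hypothetical helper, and ~/tutorial is the output directory used above.

```shell
# count_downloads: hypothetical helper that counts files in a download directory
count_downloads() {
  local dir="$1"
  # Suppress the error if the directory does not exist yet; report 0 instead
  find "$dir" -maxdepth 1 -type f 2>/dev/null | wc -l
}

# The tutorial downloads into ~/tutorial
echo "files downloaded: $(count_downloads ~/tutorial)"
```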
Return to the login node:
logout
We will use seqfu stats to gather statistics about the FASTA file we downloaded.
The package is available through Lmod after configuring it with the QIB Core Bioinformatics tool.
# Load Seqfu from lmod
module avail seqfu
module load seqfu
# Fallback method
source /nbi/software/testing/bin/seqfu__1.22.0
To compute the stats, we will submit a batch job! Let’s try the configurator and produce something like:
#!/bin/bash
#SBATCH --job-name=assembly-stats
#SBATCH --output=%x-%j.out
#SBATCH --partition=qib-short
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=0-02:00:00
#SBATCH --mail-type=BEGIN,END,FAIL
# Job information
echo "Job started at: $(date)"
echo "Running on: $(hostname)"
echo "Job ID: $SLURM_JOB_ID"
echo ""
# Change to submission directory
cd "$SLURM_SUBMIT_DIR"
# Environment setup
# load seqfu
module load seqfu
# Main commands
# check stats
seqfu stats human_gut_assembly.fa.gz > stats.tsv
echo ""
echo "Job completed at: $(date)"
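The script is submitted with sbatch; the file name assembly-stats.sh below is an assumption, since the tutorial does not fix one. On success, sbatch prints the new job ID.

```shell
# Submit the batch script (the file name assembly-stats.sh is an assumption);
# sbatch prints "Submitted batch job <id>" on success.
if command -v sbatch >/dev/null; then
  sbatch assembly-stats.sh
else
  echo "sbatch not on PATH: run this on the HPC login node"
fi
```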
❓ How can you check the status of the job?
💡 Verify with ls and cat that you produced the files you expect
cd ~/tutorial
# Here we use the $DATABASES shortcut from bashrc
export KRAKEN2_DEFAULT_DB=$DATABASES/kraken2/benlangmead/k2_standard_20230314/
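Before queueing anything, it is worth confirming that the path actually resolves. check_db below is a hypothetical helper, not part of the cluster tooling:

```shell
# check_db: hypothetical guard that confirms a Kraken2 database directory exists
check_db() {
  if [ -d "$1" ]; then
    echo "database found: $1"
  else
    echo "database missing: $1"
  fi
}

check_db "${KRAKEN2_DEFAULT_DB:-/not/set}"
```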
We will again use the Lmod software catalogue:
# Load kraken2
module avail kraken
module load kraken2
But this time we will use the nbi-slurm helpers to run the job:
# Use "nbi-slurm" helper to launch the job
runjob -c 16 -m 128G -w logs -run -n mykraken-1 \
"kraken2 --threads 16 --memory-mapping --paired --report ERR6797443.tsv reads/ERR6797443_R{1,2}* > /dev/null"
# Check if running
lsjobs
…
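Once the job finishes, the report (ERR6797443.tsv above) can be skimmed for the dominant taxa. A kraken2 report is tab-separated with columns: percentage of reads, clade read count, direct read count, rank code, NCBI taxid, and name. The lines below are illustrative sample data, not real output:

```shell
# Build a two-line illustrative report (NOT real output) and sort by abundance
report=$(mktemp)
printf '95.00\t9500\t120\tS\t562\t  Escherichia coli\n' > "$report"
printf '2.50\t250\t250\tU\t0\tunclassified\n' >> "$report"
# Most abundant clades first (column 1 is the percentage of reads)
sort -t $'\t' -k1,1 -nr "$report" | head -n 5
```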
Remember to delete your files!
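The cleanup can be rehearsed on a throwaway directory first; for the real thing you would target ~/tutorial, and rm -ri asks for confirmation before each delete:

```shell
# Rehearse the cleanup on a disposable directory so nothing real is at risk
DEMO=$(mktemp -d)
touch "$DEMO/old_reads.fastq.gz"
rm -r "$DEMO"    # for the tutorial data, prefer the interactive form: rm -ri ~/tutorial
[ -d "$DEMO" ] || echo "cleanup done"
```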