SeqFu is a command-line tool to help you manage your sequence files: it can filter, transform, and manipulate sequence files in various formats. SeqFu is designed to be fast and efficient, and it is written in Nim.
SeqFu has a set of subtools, such as:
seqfu stats, to get statistics of FASTA and FASTQ files, including N50 (supports gzipped files).
seqfu head and seqfu tail, to print the first or last sequences, with options to print one every N and more.
seqfu cat, to reshape FASTA/FASTQ files, adding prefixes or suffixes to the names, removing comments, etc.
seqfu interleave and seqfu deinterleave,
and many other tools!
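As a reminder of what the N50 statistic means: it is the length L such that sequences of length L or longer contain at least half of the total bases. The sketch below is a hypothetical helper named n50 (not part of SeqFu, and not how SeqFu computes it); it assumes an uncompressed FASTA file and a system with awk and sort.

```shell
# Hypothetical helper (not part of SeqFu): print the N50 of an uncompressed FASTA file.
n50() {
  # Step 1: print one length per record (sequences may span several lines).
  awk '
    /^>/ { if (len > 0) print len; len = 0; next }
         { len += length($0) }
    END  { if (len > 0) print len }
  ' "$1" |
  # Step 2: sort lengths from longest to shortest.
  sort -rn |
  # Step 3: walk down the sorted lengths until half of the total is covered.
  awk '
    { lens[NR] = $1; total += $1 }
    END {
      for (i = 1; i <= NR; i++) {
        run += lens[i]
        if (run * 2 >= total) { print lens[i]; exit }
      }
    }
  '
}
```

Calling n50 ASSEMBLY.fa prints a single number. For real data, seqfu stats remains the tool to use, since it also handles FASTQ and gzipped input.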
You can install it from conda:
conda install -y -c conda-forge -c bioconda seqfu
Alternatively, you can download pre-compiled binaries and put them on your PATH.
Note that in the EBAME VM you have SeqFu in ~/bin/.
To get Anvi’o-ready contigs, the following command can replace anvi-script-reformat-fasta:
seqfu cat --anvio ASSEMBLY > RENAMED_ASSEMBLY.fa
It’s faster and, since SeqFu is easy to install, you can embed this step in your pipeline.
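To illustrate what this kind of renaming step produces, here is a minimal sketch of a hypothetical helper named simplify_names. It is not SeqFu's implementation. It assumes the Anvi’o-style convention of simple names made of the prefix c_ and a 12-digit zero-padded counter, with comments dropped; check the actual output of your SeqFu version before relying on the exact format.

```shell
# Hypothetical helper (not SeqFu's implementation): rename FASTA records
# to c_000000000001, c_000000000002, ... and drop any header comments.
simplify_names() {
  awk '
    /^>/ { printf(">c_%012d\n", ++n); next }  # replace each header with a counter-based name
         { print }                            # sequence lines pass through unchanged
  ' "$1"
}
```

Usage would be simplify_names ASSEMBLY > RENAMED_ASSEMBLY.fa, with ASSEMBLY being an uncompressed FASTA file.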
You can have a quick look at FASTQ files (to check for quality and primers/adapters):
seqfu cat FASTQ_FILE | less -SR
To check multiple alignment files from the command line:
fu-msa MSA.fa