ReadFX provides three primary methods for parsing FASTA and FASTQ files, each with different characteristics and use cases:

Method | Memory Usage | Speed | Ease of Use | Flexibility |
---|---|---|---|---|
`readFQ` | Higher | Good | Excellent | Good |
`readFQPtr` | Low | Excellent | Moderate | Good |
`readFastx` | Customizable | Excellent | Requires setup | Excellent |


    iterator readFQ*(path: string): FQRecord

`readFQ` is a high-level iterator that returns `FQRecord` objects with string fields.

    import readfx
    for record in readFQ("sample.fastq.gz"):
      echo record.name, " has length ", record.sequence.len
      # Manipulate record.sequence, record.quality, etc. as strings
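Because `readFQ` yields plain strings, per-record statistics need no special handling. The sketch below computes the GC fraction of each read. `gcFraction` and `reportGc` are hypothetical helper names introduced for illustration and are not part of ReadFX; only the `readFQ` call and the `name` and `sequence` fields come from the example above.

```nim
import readfx

proc gcFraction(sequence: string): float =
  ## Fraction of G/C bases (case-insensitive); 0.0 for an empty sequence.
  if sequence.len == 0:
    return 0.0
  var gc = 0
  for base in sequence:
    if base in {'G', 'C', 'g', 'c'}:
      inc gc
  gc / sequence.len  # `/` on two ints yields a float in Nim

proc reportGc(path: string) =
  ## Hypothetical helper: print one "name<TAB>GC fraction" line per record.
  for record in readFQ(path):
    echo record.name, "\t", gcFraction(record.sequence)
```

Calling `reportGc("sample.fastq.gz")` would print one line per read. The string copy that `readFQ` makes for every record is what keeps this loop free of lifetime concerns.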

    iterator readFQPtr*(path: string): FQRecordPtr

`readFQPtr` is a high-performance iterator that returns `FQRecordPtr` objects with pointer fields for maximum efficiency.

    import readfx
    for record in readFQPtr("sample.fastq.gz"):
      # `$` converts a pointer field to a Nim string; take the length of that string
      echo $record.name, " has length ", ($record.sequence).len
      # Be careful! These pointers are reused on each iteration

The pointers in `FQRecordPtr` are reused with each iteration! If you need to keep a record after moving to the next iteration, you must copy the data:

    import readfx
    var savedNames: seq[string]
    for record in readFQPtr("sample.fastq.gz"):
      savedNames.add($record.name)  # Make a copy
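To see why the copy matters, the following stand-alone sketch simulates a parser that refills a single buffer for every record. It does not use ReadFX and does not reproduce its internals; `buffer` and `load` are hypothetical names chosen only to show how a saved pointer silently changes while a copied string does not.

```nim
# Stand-alone simulation of buffer reuse; this is NOT ReadFX's internal code.
var buffer: array[16, char]  # one fixed buffer shared by every "record"

proc load(text: string) =
  ## Overwrite the shared buffer with `text`, truncated to fit and NUL-terminated.
  for i in 0 ..< buffer.len:
    buffer[i] = '\0'
  for i, c in text:
    if i < buffer.len - 1:
      buffer[i] = c

load("read1")
let view = cast[cstring](addr buffer[0])  # points into the buffer, no copy made
let copy = $view                          # `$` allocates an independent string

load("read2")                             # the next "iteration" reuses the buffer

echo "view: ", view  # the pointer now shows the second record
echo "copy: ", copy  # the copy still holds the first record
```

The same reasoning applies to any field of an `FQRecordPtr` that must outlive the current loop iteration: convert it with `$` before the iterator advances.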

    proc readFastx*[T](f: var Bufio[T], r: var FQRecord): bool

`readFastx` is a lower-level procedure that reads one record at a time from a buffered input stream.

    import readfx

    proc main() =
      var record: FQRecord
      var f = xopen[GzFile]("sample.fastq.gz")
      # defer lives inside a proc: Nim does not support it at module top level
      defer: f.close()
      while f.readFastx(record):
        echo record.name, " has length ", record.sequence.len

    main()
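Driving the loop yourself makes it easy to stop early, which is one form of the flexibility `readFastx` offers. The sketch below collects the names of at most `limit` records and then closes the file. `firstNames` is a hypothetical helper, not part of ReadFX; it relies only on `xopen[GzFile]`, `readFastx`, and `close` as they appear in the example above.

```nim
import readfx

proc firstNames(path: string, limit: int): seq[string] =
  ## Hypothetical helper: names of the first `limit` records in `path`.
  var record: FQRecord            # one record object, refilled on every call
  var f = xopen[GzFile](path)
  defer: f.close()                # runs when the proc returns, even on early exit
  # `and` short-circuits, so no further record is read once the limit is reached
  while result.len < limit and f.readFastx(record):
    result.add(record.name)       # adding to a seq copies the string
```

For example, `firstNames("sample.fastq.gz", 10)` would return at most ten names without reading the rest of the file.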

- `readFQ` is actually built on top of `readFQPtr`, converting pointers to strings
- `readFastx` is the native Nim implementation used internally

In benchmarks on large files:

- `readFQPtr` is typically the fastest but requires careful memory management
- `readFQ` is slightly slower due to string allocations but much safer
- `readFastx` performance depends on how you implement the surrounding code

Choose the method that best balances your needs for performance, safety, and code simplicity.
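Relative speed depends on the file and the machine, so it is worth measuring on your own data. The following is a minimal timing sketch built on the standard `std/monotimes` and `std/times` modules; `timeIt` and `compareIterators` are hypothetical names introduced here, and the loops only count records so that both iterators do the same work.

```nim
import std/[monotimes, times]
import readfx

template timeIt(label: string, body: untyped) =
  ## Run `body` once and print the elapsed wall-clock time in milliseconds.
  let start = getMonoTime()
  body
  echo label, ": ", (getMonoTime() - start).inMilliseconds, " ms"

proc compareIterators(path: string) =
  ## Hypothetical helper: time a full pass over `path` with each iterator.
  var n = 0
  timeIt "readFQ":
    for _ in readFQ(path):
      inc n
  n = 0
  timeIt "readFQPtr":
    for _ in readFQPtr(path):
      inc n
```

Compile with `-d:release` before comparing numbers, and run each measurement more than once, since the first pass over a file may include disk reads that later passes get from the operating system's cache.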