Comparison at a Glance

| Method | Returns | Memory | Speed | Ease of use | Use case |
|---|---|---|---|---|---|
| readFQ | FQRecord | Higher | Good | Excellent | General use |
| readFQPtr | FQRecordPtr | Low | Excellent | Moderate | High-throughput streaming |
| readFastx | fills FQRecord | Custom | Excellent | Requires setup | Custom I/O workflows |
| readFQPair | FQPair | Moderate | Good | Excellent | Paired-end reads |

readFQ

signature
```nim
iterator readFQ*(path: string): FQRecord
```

Yields FQRecord objects backed by Nim strings. Records are completely safe to store, copy, or pass to any function after the loop body returns — the data is owned by the record, not shared with an internal buffer.

```nim
import readfx

# Print each record's name and sequence length
for record in readFQ("sample.fastq.gz"):
  echo record.name, " (", record.sequence.len, " bp)"

# Collect all names; each record owns its strings, so storing them is safe
var names: seq[string]
for record in readFQ("sample.fastq.gz"):
  names.add(record.name)
```

When to use: This is the right default for most programs. Use readFQ unless profiling shows that memory allocation is a bottleneck.

Internally, readFQ is built on top of readFQPtr and converts the C pointers to Nim strings on each yield. The extra copy is usually negligible.


readFQPtr

signature
```nim
iterator readFQPtr*(path: string): FQRecordPtr
```

Yields pointer-based records using Heng Li's kseq.h C library directly via FFI. The underlying buffer is reused on every iteration — the pointer returned on iteration N becomes invalid when iteration N+1 begins.

```nim
import readfx

for record in readFQPtr("sample.fastq.gz"):
  # Safe: convert to string immediately
  echo $record.name, ": ", len($record.sequence)
```

Pointer invalidation: Never store an FQRecordPtr or its fields across iterations. If you need to retain data, convert it explicitly:

```nim
# ✓ Safe: data copied to a Nim string immediately
var names: seq[string]
for record in readFQPtr("sample.fastq.gz"):
  names.add($record.name)

# ✗ WRONG: p dangles as soon as the next iteration reuses the buffer
var p: ptr char
for record in readFQPtr("sample.fastq.gz"):
  p = record.name   # dereferencing p later is undefined behaviour
```

When to use: Reach for readFQPtr when you process tens or hundreds of millions of records and profiling has confirmed that readFQ's string allocation is a bottleneck. Whenever field data must outlive the current iteration, copy it to a Nim string with $.
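
As a rough illustration of work that never needs to retain a field, the sketch below tallies records and bases inside the loop. It assumes that the sequence field points at a NUL-terminated C buffer (as kseq.h buffers normally are), so viewing the ptr char as a cstring and calling len measures it without allocating a Nim string. countBases is a hypothetical helper, not part of readfx.

```nim
import readfx

# Hypothetical helper (not part of readfx): count records and bases
# without copying any field into a Nim string.
proc countBases(path: string): tuple[records, bases: int] =
  for record in readFQPtr(path):
    inc result.records
    # Assumption: record.sequence is NUL-terminated, so len on the
    # cstring view behaves like strlen and performs no allocation.
    if record.sequence != nil:
      result.bases += len(cast[cstring](record.sequence))
```

Calling countBases("sample.fastq.gz") would return both totals; nothing derived from the reused buffer survives past the iteration that produced it.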

readFastx

signature
```nim
proc readFastx*[T](f: var Bufio[T], r: var FQRecord): bool
```

The low-level native-Nim parser. Rather than an iterator, it is a procedure that fills a caller-owned FQRecord and returns true while records remain, false at end-of-file.

You open the file yourself with xopen[GzFile] and manage the stream lifecycle with defer: f.close().

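```nim
import readfx

proc main() =
  var record: FQRecord
  var f = xopen[GzFile]("sample.fastq.gz")
  # defer lives inside a proc: the Nim manual does not support it at top level
  defer: f.close()

  # readFastx refills `record` in place and returns false once no records remain
  while f.readFastx(record):
    echo record.name, " (", record.sequence.len, " bp)"

main()
```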

Because you control the loop, you can interleave reads from multiple streams, add custom break conditions, or mix FASTX I/O with other file handles.
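
One way the interleaving of two streams might look is sketched below. It uses only xopen, readFastx and close as they appear in this section; interleaveNames is a hypothetical helper, and the loop assumes both inputs hold the same number of records.

```nim
import readfx

# Hypothetical helper (not part of readfx): alternate between two FASTQ
# streams and collect the record names in the order they were read.
proc interleaveNames(pathA, pathB: string): seq[string] =
  var recA, recB: FQRecord
  var fA = xopen[GzFile](pathA)
  defer: fA.close()
  var fB = xopen[GzFile](pathB)
  defer: fB.close()

  # `and` short-circuits: the loop stops as soon as either stream runs out.
  # If A is longer than B, the final record read from A is discarded.
  while fA.readFastx(recA) and fB.readFastx(recB):
    result.add(recA.name)
    result.add(recB.name)
```

Because each stream has its own caller-owned record, neither read overwrites the other, which is what makes this pattern awkward to express with a single iterator.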

Status codes

After a call to readFastx, record.status contains a diagnostic value:

| Value | Meaning |
|---|---|
| > 0 | Sequence length (record parsed successfully) |
| -1 | End of file |
| -2 | Stream error |
| -3 | Other parsing error |
| -4 | Sequence and quality length mismatch (FASTQ) |

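
A small helper can turn these values into log messages. The sketch below is plain Nim and does not call readfx; describeStatus is a hypothetical name, and it assumes record.status is an integer. In practice it would be called with record.status once readFastx has returned false, to tell a clean end of file from a parse failure.

```nim
# Hypothetical helper (not part of readfx): translate a status value from
# the table above into a human-readable message.
proc describeStatus(status: int): string =
  if status > 0:
    return "ok, sequence length " & $status
  case status
  of -1: "end of file"
  of -2: "stream error"
  of -3: "other parsing error"
  of -4: "sequence and quality length mismatch"
  else: "unlisted status " & $status  # e.g. 0, which the table does not cover

echo describeStatus(150)
echo describeStatus(-4)
```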
When to use: Custom parsing workflows, for example interleaving reads from two sources, implementing a merge sort over multiple files, or building a streaming pipeline that can pause and resume.

readFQPair

signature
```nim
iterator readFQPair*(path1, path2: string, checkNames: bool = false): FQPair
```

Reads two FASTQ files in lockstep. On each iteration it yields an FQPair with fields read1 and read2, both of type FQRecord.

```nim
import readfx

for pair in readFQPair("sample_R1.fastq.gz", "sample_R2.fastq.gz"):
  echo "R1: ", pair.read1.name, "  (", pair.read1.sequence.len, " bp)"
  echo "R2: ", pair.read2.name, "  (", pair.read2.sequence.len, " bp)"
```

Error handling

```nim
import readfx

# With checkNames = true, the iterator raises ValueError if
# read1.name ≠ read2.name (after suffix stripping)
for pair in readFQPair("R1.fastq.gz", "R2.fastq.gz", checkNames = true):
  discard
```
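
To handle the mismatch rather than let it terminate the program, the loop can be wrapped in try/except. The following is a minimal sketch: countPairs is a hypothetical helper, and the only library behaviour it relies on is the ValueError described above.

```nim
import readfx

# Hypothetical helper (not part of readfx): count read pairs, reporting
# how far the files were in sync before a name mismatch was detected.
proc countPairs(r1Path, r2Path: string): int =
  try:
    for pair in readFQPair(r1Path, r2Path, checkNames = true):
      inc result
  except ValueError as e:
    stderr.writeLine "Name mismatch after ", result, " pairs: ", e.msg
    raise  # re-raise so the caller can decide whether to abort
```

Re-raising keeps the failure visible to the caller; a pipeline that prefers to keep the pairs seen so far could return the count instead.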

Stdin support: path1 may be "-" to read R1 from standard input. The two files cannot both be read from stdin.

When to use: Any paired-end sequencing pipeline, such as Illumina R1/R2 files, mate-pair libraries, or any other scenario where two FASTQ files must be consumed in sync.
