FQRecord

The primary record type. Returned by readFQ, filled by readFastx, and used as the element type inside FQPair. The name, comment, sequence, and quality fields are ordinary Nim strings, so a record is safe to store, copy, and manipulate.

nim
type FQRecord* = object
  name*:     string  # Sequence identifier (up to first whitespace)
  comment*:  string  # Optional free-text after name (may be empty)
  sequence*: string  # Nucleotide or amino-acid sequence
  quality*:  string  # Phred quality string (empty for FASTA records)
  status*, lastChar*: int  # Internal parsing state — see below

Status codes

Relevant only when using readFastx directly. readFQ and readFQPair handle these internally.

Value   Meaning
> 0     Sequence length (record parsed successfully)
-1      End of file
-2      Stream error
-3      Other parsing error
-4      Sequence and quality length mismatch in FASTQ

nim — basic usage
for record in readFQ("sample.fastq.gz"):
  echo record.name       # "read_001"
  echo record.comment    # "length=150" or ""
  echo record.sequence   # "ACGT..."
  echo record.quality    # "IIII..." (empty for FASTA)
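
When driving readFastx by hand, the status codes listed above can be turned into readable messages. The following is a minimal, self-contained sketch: describeStatus is a hypothetical helper, not part of the library, and its messages simply restate the table.

```nim
# Hypothetical helper (not part of readfx): maps the documented
# FQRecord.status values to human-readable descriptions.
proc describeStatus(status: int): string =
  case status
  of -1: "end of file"
  of -2: "stream error"
  of -3: "other parsing error"
  of -4: "sequence and quality length mismatch in FASTQ"
  else:
    # Positive values carry the sequence length of a parsed record.
    if status > 0: "ok, sequence length " & $status
    else: "undocumented status " & $status

when isMainModule:
  echo describeStatus(150)
  echo describeStatus(-4)
```

A helper like this could be called on record.status once a readFastx loop ends, to tell a clean end of file apart from a parsing error.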

FQRecordPtr

A pointer-based record for high-performance streaming via the kseq.h C library. The fields point directly into a reused internal buffer — they are invalidated on the next iteration.

nim
type FQRecordPtr* = object
  name*:     ptr char  # null-terminated C string
  comment*:  ptr char  # null-terminated C string (may be nil)
  sequence*: ptr char  # null-terminated C string
  quality*:  ptr char  # null-terminated C string (nil for FASTA)

Convert a field to a Nim string with the $ operator: $record.name. This copies the data, so the resulting string is safe to keep after the loop body.

nim
for record in readFQPtr("sample.fastq.gz"):
  let name = $record.name       # ✓ copied, safe to keep
  let seq  = $record.sequence   # ✓ copied, safe to keep
  echo name, ": ", seq.len
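
Why the copy matters can be shown without the library at all. The sketch below is an illustration under assumptions: buf and fill are hypothetical stand-ins for the parser's reused C buffer, and a cstring view plays the role of a pointer field. It is not how readFQPtr is implemented.

```nim
# Self-contained illustration; no readfx import is needed.
# `buf` is a stand-in (an assumption) for the parser's reused C buffer.
var buf: array[16, char]

proc fill(s: string) =
  ## Overwrite the shared buffer, as a streaming parser does for each record.
  assert s.len < buf.len            # leave room for the NUL terminator
  for i in 0 ..< buf.len: buf[i] = '\0'
  for i, c in s: buf[i] = c

fill("ACGT")
let view = cast[cstring](addr buf[0])  # borrowed: points into buf
let kept = $view                       # owned copy of the current contents

fill("TTGGCC")                         # the "next iteration" reuses the buffer
echo kept    # still holds the first record
echo view    # now shows the second record
```

The borrowed view silently changes when the buffer is refilled, while the copied string does not. Storing pointer fields in a seq without copying them first runs into the same problem.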

FQRecord vs FQRecordPtr

                    FQRecord                           FQRecordPtr
Memory model        Allocates Nim strings per record   Reuses a single C buffer
Safe to store       Yes                                No (copy with $ first)
Iterator            readFQ                             readFQPtr
Typical throughput  Good                               Excellent
Recommended for     Most programs                      High-throughput streaming pipelines

Rule of thumb: Start with readFQ / FQRecord. Switch to readFQPtr only if profiling shows allocation overhead is measurable.

FQPair

A paired-end record containing two FQRecord objects. Yielded by readFQPair.

nim
type FQPair* = object
  read1*: FQRecord   # Forward read (R1)
  read2*: FQRecord   # Reverse read (R2)
nim — usage
for pair in readFQPair("R1.fastq.gz", "R2.fastq.gz"):
  echo pair.read1.name, " / ", pair.read2.name
  echo pair.read1.sequence.len, " bp + ", pair.read2.sequence.len, " bp"

SeqComp

Nucleotide composition statistics returned by the composition() procedure.

nim
type SeqComp* = object
  A*, C*, G*, T*: int   # Per-base counts
  N*:             int   # Count of ambiguous bases
  Other*:         int   # Non-ACGTN characters
  GC*:            float # GC fraction (0.0–1.0)
nim — usage
import readfx, strutils

for record in readFQ("sample.fastq.gz"):
  let c = composition(record)
  echo record.name,
       "  A=", c.A, " C=", c.C, " G=", c.G, " T=", c.T,
       "  N=", c.N,
       "  GC=", (c.GC * 100).formatFloat(ffDecimal, 1), "%"

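To make the SeqComp fields concrete, here is a hedged, self-contained sketch of how such counts could be computed. Comp and countBases are hypothetical names, and the GC denominator (unambiguous A/C/G/T bases only) is an assumption; the library's composition() may normalise differently.

```nim
# Hypothetical re-implementation for illustration; the library's own
# composition() may differ (for example in how GC is normalised).
type Comp = object
  A, C, G, T, N, Other: int
  GC: float

proc countBases(s: string): Comp =
  for ch in s:
    case ch
    of 'A', 'a': inc result.A
    of 'C', 'c': inc result.C
    of 'G', 'g': inc result.G
    of 'T', 't': inc result.T
    of 'N', 'n': inc result.N
    else: inc result.Other
  # Assumption: GC fraction is taken over the unambiguous A/C/G/T bases.
  let acgt = result.A + result.C + result.G + result.T
  if acgt > 0:
    result.GC = float(result.G + result.C) / float(acgt)

let c = countBases("ACGTNNGGCC")
echo "G=", c.G, " C=", c.C, " N=", c.N, " GC=", c.GC
```

For the input above, six of the eight unambiguous bases are G or C, so the sketch reports a GC fraction of 0.75 under the stated assumption.
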
Bufio[T]

A generic buffered reader used internally by readFastx. Typically instantiated as Bufio[GzFile] via xopen[GzFile](path). Handles both plain and gzip files through the same interface.

nim
type Bufio*[T] = tuple[fp: T, buf: string, st, en, sz: int, EOF: bool]
nim — typical use with readFastx
var record: FQRecord
var f = xopen[GzFile]("sample.fastq.gz")
defer: f.close()
while f.readFastx(record):
  echo record.name
Note: You rarely need to interact with Bufio directly beyond calling xopen, readFastx, and close. For line-oriented access, readLine is also available.

Interval[S,T]

A genomic interval for use with the built-in interval tree. Useful for overlap queries on annotation data alongside sequence parsing.

nim
type Interval*[S,T] = tuple[st, en: S, data: T, max: S]

Build an interval set, call index() to prepare it, then query with the overlap() iterator:

nim
var ivs: seq[Interval[int, string]]
ivs.add((st: 100, en: 200, data: "gene_A", max: 0))
ivs.add((st: 150, en: 300, data: "gene_B", max: 0))

ivs.sort()
ivs.index()

for iv in ivs.overlap(120, 180):
  echo iv.data   # "gene_A", "gene_B"
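
A brute-force scan is a handy way to sanity-check interval tree results on small inputs. The sketch below is self-contained and purely illustrative: Iv and overlapNaive are hypothetical names, and it assumes half-open intervals [st, en), which may not match the library's convention.

```nim
# Brute-force stand-in for the interval tree query, for illustration only.
# Assumption: intervals are half-open [st, en); the library may differ.
type Iv = tuple[st, en: int, data: string]

iterator overlapNaive(ivs: seq[Iv], st, en: int): Iv =
  for iv in ivs:
    # Two half-open intervals overlap when each starts before the other ends.
    if iv.st < en and st < iv.en:
      yield iv

let genes: seq[Iv] = @[(st: 100, en: 200, data: "gene_A"),
                       (st: 150, en: 300, data: "gene_B"),
                       (st: 400, en: 500, data: "gene_C")]

var hits: seq[string]
for iv in genes.overlapNaive(120, 180):
  hits.add iv.data
echo hits
```

The linear scan costs O(n) per query, which is what the indexed tree is meant to avoid on large annotation sets. On small inputs it can serve as a reference to compare against.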

Performance notes

Benchmarks in the benchmark/ directory show that with --opt:speed --gc:arc, the performance difference between object and tuple layouts becomes negligible. Recommended compile flags for production (on Nim 2.x, --gc is deprecated in favor of --mm, so use --mm:arc there):

shell
nim c --opt:speed --gc:arc myprogram.nim