readfx

FASTX Parsing Methods in ReadFX

ReadFX provides three primary methods for parsing FASTA and FASTQ files, each with different characteristics and use cases:

  1. readFQ - String-based high-level iterator
  2. readFQPtr - Pointer-based high-performance iterator
  3. readFastx - Lower-level reader for custom workflow integration

Comparison at a Glance

Method     Memory Usage  Speed      Ease of Use     Flexibility
readFQ     Higher        Good       Excellent       Good
readFQPtr  Low           Excellent  Moderate        Good
readFastx  Customizable  Excellent  Requires setup  Excellent

readFQ

iterator readFQ*(path: string): FQRecord

readFQ is a high-level iterator that returns FQRecord objects with string fields.

How to Use

import readfx

for record in readFQ("sample.fastq.gz"):
  echo record.name, " has length ", record.sequence.len
  # Manipulate record.sequence, record.quality, etc. as strings
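
Because each FQRecord holds ordinary Nim strings, a record can be stored and used after the loop has moved on. The following is a minimal sketch of that pattern, not part of ReadFX itself: the demo file name, its contents, and the minLen threshold are made up for illustration, and it assumes the reader accepts an uncompressed file as well as a gzipped one.

```nim
import std/os
import readfx

# Hypothetical input: a tiny two-record FASTQ file written to the temp
# directory so the sketch does not depend on any existing data.
let path = getTempDir() / "readfx_demo.fastq"
writeFile(path, "@r1\nACGT\n+\nIIII\n@r2\nACGTACGT\n+\nIIIIIIII\n")

const minLen = 5  # illustrative threshold, not a ReadFX setting
var kept: seq[FQRecord]

for record in readFQ(path):
  # Nim strings have value semantics, so adding the record to the seq
  # copies its fields; nothing is shared with the next iteration.
  if record.sequence.len >= minLen:
    kept.add(record)

echo kept.len, " record(s) with at least ", minLen, " bases"
```

The cost of this convenience is one string allocation per field per record, which is consistent with the "Higher" memory usage listed for readFQ in the comparison.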

When to Use

Why Use

readFQPtr

iterator readFQPtr*(path: string): FQRecordPtr

readFQPtr is a high-performance iterator that returns FQRecordPtr objects with pointer fields for maximum efficiency.

How to Use

import readfx

for record in readFQPtr("sample.fastq.gz"):
  echo $record.name, " has length ", $record.sequence.len
  # Be careful! These pointers are reused on each iteration

When to Use

Why Use

Important Note

The pointers in FQRecordPtr are reused with each iteration! If you need to keep a record after moving to the next iteration, you must copy the data:

var savedNames: seq[string]
for record in readFQPtr("sample.fastq.gz"):
  savedNames.add($record.name)  # Make a copy
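
The same rule applies when only some records are worth keeping: convert with `$` at the moment you decide to keep the data, and never store the pointer itself. A small sketch under stated assumptions: the demo file and variable names are hypothetical, and it assumes that `record.sequence` converts with `$` the same way `record.name` does above.

```nim
import std/os
import readfx

# Hypothetical input written to the temp directory for the demo.
let path = getTempDir() / "readfx_ptr_demo.fastq"
writeFile(path, "@r1\nACGT\n+\nIIII\n@r2\nACGTACGT\n+\nIIIIIIII\n")

var longestName = ""
var longestSeq = ""

for record in readFQPtr(path):
  # `$` builds a fresh Nim string, so this copy stays valid after the
  # underlying buffer is overwritten by the next record.
  let s = $record.sequence
  if s.len > longestSeq.len:
    longestSeq = s
    longestName = $record.name

echo longestName, " is the longest read (", longestSeq.len, " bases)"
```

This sketch copies every sequence for simplicity, so it gives up part of the memory advantage of readFQPtr; it is meant only to show where the copy has to happen.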

readFastx

proc readFastx*[T](f: var Bufio[T], r: var FQRecord): bool

readFastx is a lower-level procedure that reads one record at a time from a buffered input stream.

How to Use

import readfx

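proc printLengths(path: string) =
  var record: FQRecord               # reused for every record
  var f = xopen[GzFile](path)
  defer: f.close()                   # defer must sit inside a proc; Nim rejects it at module top level
  while f.readFastx(record):         # false once no more records can be read
    echo record.name, " has length ", record.sequence.len

printLengths("sample.fastq.gz")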
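
Since the caller owns both the stream and the record, the read loop can be embedded in a larger routine with its own stopping rule. The sketch below illustrates this with a hypothetical helper, firstNames, which is not part of ReadFX; it uses only the xopen, readFastx, and close calls shown above, and the demo file is made up for illustration.

```nim
import std/os
import readfx

proc firstNames(path: string, limit: int): seq[string] =
  ## Hypothetical helper: names of at most `limit` records from `path`.
  var record: FQRecord
  var f = xopen[GzFile](path)
  defer: f.close()
  # `and` short-circuits, so no further record is read once the limit is hit.
  while result.len < limit and f.readFastx(record):
    result.add(record.name)

# Hypothetical three-record input written to the temp directory.
let path = getTempDir() / "readfx_fastx_demo.fastq"
writeFile(path, "@r1\nACGT\n+\nIIII\n@r2\nACGTACGT\n+\nIIIIIIII\n@r3\nAC\n+\nII\n")

let names = firstNames(path, 2)
echo names
```

A single FQRecord is reused across calls here, so only the names that are explicitly added to the result are kept.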

When to Use

Why Use

Implementation Details

Performance Considerations

In benchmarks on large files:

Choose the method that best balances your needs for performance, safety, and code simplicity.