dadaist2-checkstats
Analyses the statistics table produced by DADA2 (called dada2_stats.tsv
in the output directory) and reports at which step(s) or sample(s) reads are lost.
usage: dadaist2-checkstats [-h] -i INPUT [-l LOSS] [--sample] [--all] [--keys KEYS] [--tmp TMP] [--log LOG] [--verbose]
Check DADA2 stats
optional arguments:
-h, --help show this help message and exit
Main:
-i INPUT, --input INPUT
DADA2 stats table
-l LOSS, --loss LOSS Warn when loss is above this value [default: 0.33]
--sample Also check sample by sample
--all Report loss for all the steps
--keys KEYS Comma separated headers [default: input,filtered,denoised,merged,non-chimeric]
--tmp TMP Temporary directory
Other parameters:
--log LOG Log file
--verbose Verbose mode
Input file
The input file is a TSV file produced by DADA2 via Dadaist2:
Sample | input | filtered | denoised | merged | non-chimeric |
---|---|---|---|---|---|
M0614DD2plus165 | 254245 | 225049 | 225049 | 35382 | 35114 |
M0614DD2plus45 | 296332 | 281027 | 281027 | 12126 | 12114 |
M0614DD3plus120 | 2879433 | 2706733 | 2706733 | 124381 | 123007 |
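To make the idea of "loss per step" concrete, here is a minimal Python sketch that reads a table like the one above and computes how many reads are lost between consecutive steps. This page does not state the formula dadaist2-checkstats uses, so the definition here (loss = 1 - reads after the step / reads before the step, summed over all samples) is an assumption and may not reproduce the tool's numbers. The names `STATS_TSV`, `KEYS` and `step_losses` are hypothetical and are not part of Dadaist2.

```python
import csv
import io

# Rows copied from the example table above; a real dada2_stats.tsv is tab-separated.
STATS_TSV = (
    "Sample\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n"
    "M0614DD2plus165\t254245\t225049\t225049\t35382\t35114\n"
    "M0614DD2plus45\t296332\t281027\t281027\t12126\t12114\n"
    "M0614DD3plus120\t2879433\t2706733\t2706733\t124381\t123007\n"
)

# Same column order as the default value of --keys.
KEYS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def step_losses(tsv_text, keys=KEYS):
    """Return {step: fraction of reads lost relative to the previous step}.

    Assumption (not taken from the tool): loss = 1 - after / before,
    where both counts are totals over all samples.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    totals = {k: sum(int(r[k]) for r in rows) for k in keys}
    losses = {}
    for prev, cur in zip(keys, keys[1:]):
        # Guard against an empty previous step to avoid dividing by zero.
        losses[cur] = 1 - totals[cur] / totals[prev] if totals[prev] else 0.0
    return losses


losses = step_losses(STATS_TSV)
for step, loss in losses.items():
    # 0.33 mirrors the documented default of --loss.
    flag = "WARN" if loss > 0.33 else "ok"
    print(f"{step}\t{loss:.3f}\t{flag}")
```

Under this assumed definition, only the merged step exceeds the default threshold of 0.33 for the three samples shown.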
Output
With default parameters, the output is a JSON file reporting the steps where the loss is greater than the value given with --loss:
Example:
{
"failed_steps": {
"merged": 4.865372188100686
}
}
Adding --sample also lists, for each step, the samples that failed the check:
{
"failed_steps": {
"merged": 4.865372188100686
},
"failed_by_sample": {
"input": [],
"filtered": [
"M0614GD3plus120"
],
"denoised": [],
"merged": [
"M0614DD2plus165",
"M0614DD2plus45",
"M0614DD3plus120"
],
"non-chimeric": []
}
}
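A report like this can be consumed by a downstream script. The following Python sketch parses a report adapted from the example above and turns it into a short list of human-readable lines. It assumes only the two keys shown in the example (`failed_steps` and `failed_by_sample`); the names `REPORT` and `summarize` are hypothetical and not part of Dadaist2.

```python
import json

# Report adapted from the example above, written as valid JSON.
REPORT = """
{
  "failed_steps": {"merged": 4.865372188100686},
  "failed_by_sample": {
    "input": [],
    "filtered": ["M0614GD3plus120"],
    "denoised": [],
    "merged": ["M0614DD2plus165", "M0614DD2plus45", "M0614DD3plus120"],
    "non-chimeric": []
  }
}
"""


def summarize(report_text):
    """Return one line per failed step and per step with failing samples."""
    report = json.loads(report_text)
    lines = []
    for step, loss in report.get("failed_steps", {}).items():
        lines.append(f"step {step}: loss {loss:.2f}")
    # "failed_by_sample" is only expected when --sample was used.
    for step, samples in report.get("failed_by_sample", {}).items():
        if samples:
            lines.append(f"{step}: {', '.join(samples)}")
    return lines


summary = summarize(REPORT)
for line in summary:
    print(line)
```

Using `dict.get` with an empty default means the same function also works on a report produced without --sample, where `failed_by_sample` is absent.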