
fix(trimgalore): drop process label to process_low #11531

Merged

SPPearce merged 1 commit into nf-core:master from pinin4fjords:pinin4fjords/trimgalore-process-low on May 5, 2026

Conversation

@pinin4fjords
Member

@pinin4fjords pinin4fjords commented May 5, 2026

Summary

Drops the TRIMGALORE process label from process_high (12 cpus / 72 GB / 16 h) to process_low (2 cpus / 12 GB / 4 h, scaling with task.attempt).
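The change itself is a one-line label swap in the module definition (modules/nf-core/trimgalore/main.nf); roughly, with surrounding lines shown for context only:

```diff
 process TRIMGALORE {
     tag "$meta.id"
-    label 'process_high'
+    label 'process_low'
```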

Why

trim-galore 2.x is a Rust binary that streams reads, so memory stays flat with input size rather than scaling with read count. The process_high ceiling was inherited from the Perl-based 0.6.x era and is now massively over-provisioned, starving shared HPC schedulers for no benefit.

Empirical data (30M PE on nf-core/rnaseq)

| Metric        | Observed                  | process_low budget |
| ------------- | ------------------------- | ------------------ |
| peak_rss      | ~100 MB                   | 12 GB (~120×)      |
| realtime      | 1.25–2.0 min (median 1.5) | 4 h (~80×)         |
| cpus consumed | 1 worker thread           | 2 cpus (≥1 worker) |

The script already auto-derives --cores from task.cpus and caps the worker count at 8, so over-allocating cpus doesn't help anyway.
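For reference, the derivation in the module's script block looks roughly like this (a paraphrase from memory, not a verbatim copy of main.nf; the single-end offset and exact caps may differ slightly):

```groovy
// Sketch of the existing --cores derivation in the TRIMGALORE module
def cores = 1
if (task.cpus) {
    cores = (task.cpus as int) - 4                      // reserve cpus for pigz / validation helpers
    if (meta.single_end) cores = (task.cpus as int) - 3 // single-end needs one helper fewer
    if (cores < 1) cores = 1                            // always request at least one worker
    if (cores > 8) cores = 8                            // trim_galore stops scaling past 8 workers
}
```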

Why process_low and not process_single?

process_single (1 cpu / 6 GB / 4 h) is the only smaller standard bucket. For trim_galore's worker-thread math both yield 1 worker (since `cores = max(1, task.cpus - 4)` for paired-end data), so the trimming parallelism is identical. The runtime difference comes from the surrounding I/O pipeline:

  • A paired trim_galore invocation spawns ~6–8 helper processes (cutadapt for R1/R2, pigz reader/writer pairs, validator).
  • On 1 cpu they all contend for a single core; on 2 cpus the OS can overlap I/O with the worker thread, and that I/O accounts for most of the wall-time on real data. The ~1.5 min realtime above came from a 2-cpu run; dropping to 1 cpu would likely sacrifice some of that.
  • process_single also semantically signals "single-threaded by nature" (utilities, parsers, R scripts). trim_galore is genuinely multi-process even when only running one trimming worker, so process_low reads more honestly for "small resource ceiling, still parallel I/O".
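For context, here is roughly what the two candidate buckets look like in a pipeline's conf/base.config (a sketch; the exact wrapper, check_max vs. resourceLimits, depends on the nf-core template version):

```groovy
// Sketch of the standard nf-core resource buckets (conf/base.config)
process {
    withLabel:process_single {
        cpus   = 1
        memory = { 6.GB * task.attempt }
        time   = { 4.h  * task.attempt }
    }
    withLabel:process_low {
        cpus   = { 2     * task.attempt }
        memory = { 12.GB * task.attempt }
        time   = { 4.h   * task.attempt }
    }
}
```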

Users with bespoke needs (huge inputs, custom adapter detection, etc.) can still override resources at the pipeline level.
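For example, a site that routinely trims very large runs could restore a bigger allocation for just this process with a standard Nextflow process-selector override (hypothetical values):

```groovy
// custom.config -- site-level override for TRIMGALORE only (hypothetical values)
process {
    withName: 'TRIMGALORE' {
        cpus   = 12
        memory = 24.GB
        time   = 8.h
    }
}
```

Supplied at runtime with `nextflow run ... -c custom.config`, this takes precedence over the module label.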

What's not changing

  • Container, environment, output channels, args - all unchanged.
  • Module-level snapshots - unaffected; the label isn't captured in snap content.

Test plan

  • Module-level nf-test passes against the new label.
  • Downstream pipelines (rnaseq, methylseq, atacseq, ...) still complete on real data with the new ceiling.

trim-galore 2.x is a Rust binary that streams reads, so memory stays
flat with input size. Empirical 30M PE benchmark (rnaseq pipeline):
- peak_rss ~100 MB
- realtime ~1.5 min median, ~2 min max

The previous `process_high` label (12 cpus / 72 GB / 16 h) is
massively over-provisioned for the new implementation and starves
shared HPC schedulers. `process_low` (2 cpus / 12 GB / 4 h, scaling
with task.attempt) gives ~120x memory headroom and ~80x runtime
headroom over observed peaks at 30M PE, comfortably absorbing the
200M+ PE inputs that pipelines actually see in production.

The script's own `--cores` calculation derives worker count from
`task.cpus` and caps at 8, so allocating more than the label's
2 cpus (which yields 1 worker thread paired) gives diminishing
returns; users with bespoke needs can still override `cpus`
downstream.
@SPPearce SPPearce added this pull request to the merge queue May 5, 2026
Merged via the queue into nf-core:master with commit 7ced6ac May 5, 2026
105 of 111 checks passed
@pinin4fjords
Member Author

Thanks @SPPearce !

@pinin4fjords pinin4fjords deleted the pinin4fjords/trimgalore-process-low branch May 5, 2026 14:10
@FelixKrueger
Contributor

As a comment on this, the invocation was `--cores 8`, which uses:

  • ~12 CPUs
  • ~100 MB of RAM
  • and processes roughly 20M read pairs per minute

Please note that the threading model is N+4, so using `--cores 2` uses 6 cores, while `--cores 8` uses 12 cores.

Using `--cores 2` is still good time-wise, but the wall-clock time can be reduced by a further 75% by using 12 CPUs (`--cores 8`).

@SPPearce
Contributor

SPPearce commented May 5, 2026

> Please note that the threading model is N+4, so using `--cores 2` uses 6 cores, while `--cores 8` uses 12 cores.

Uh?
