@pinin4fjords
Member

@pinin4fjords pinin4fjords commented Sep 15, 2025

Enable BAM input support for RSEM workflows

This PR extends the existing BAM input functionality to RSEM-based quantification workflows. It builds on the BAM input infrastructure already in place and on the module work that lets RSEM run on premade alignments.

Note: RSEM output md5sums changed because of a logic fix. Previously, the tests used a pre-made RSEM index that excluded the additional FASTA sequences (Gfp). Switching to alignment input exposed this mismatch, because RSEM requires the same reference sequences in the index and in the alignments. The RSEM index is therefore now generated on the fly with all sequences included.
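
The consistency requirement described in the note can be checked before quantification. The sketch below is only an illustration and is not code from this PR: it compares the `@SQ` sequence names in a SAM header with the record names in a reference FASTA. The helper names, the transcript ID `tx1`, and the inline sample data are all hypothetical.

```python
def sam_reference_names(header_text):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def fasta_names(fasta_text):
    """Collect record names (first word after '>') from FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def missing_from_index(header_text, fasta_text):
    """Return sequences the alignments refer to that the index lacks."""
    return sorted(sam_reference_names(header_text) - fasta_names(fasta_text))


# Hypothetical data: alignments include Gfp, the old index does not.
header = "@HD\tVN:1.4\tSO:unsorted\n@SQ\tSN:tx1\tLN:1200\n@SQ\tSN:Gfp\tLN:720\n"
old_index = ">tx1 gene=A\nACGT\n"
new_index = ">tx1 gene=A\nACGT\n>Gfp\nACGT\n"

print(missing_from_index(header, old_index))
print(missing_from_index(header, new_index))
```

An empty result means every sequence named in the alignments is present in the index, which is the situation the regenerated index is meant to guarantee.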

Major changes:

RSEM architecture overhaul:

  • Separate STAR alignment from RSEM quantification to enable RSEM --alignments mode
  • Users can now re-use BAM files from previous star_rsem pipeline runs for efficient reprocessing
  • The pipeline now controls the alignment parameters directly, which makes the logic easier to maintain
  • Eliminates dependency on RSEM's internal alignment behavior
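
The separation described above can be pictured as two independent commands: STAR produces a transcriptome BAM, and RSEM then quantifies it in `--alignments` mode. The following is a minimal sketch under stated assumptions, not the pipeline's actual module code. The function names, file paths, sample names, and thread counts are hypothetical; only the STAR and RSEM command-line options themselves come from those tools.

```python
import shlex


def star_command(genome_dir, reads, out_prefix, threads=4):
    """Build a STAR call that also emits transcriptome-space alignments."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--quantMode", "TranscriptomeSAM",
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", out_prefix,
    ]


def rsem_command(transcriptome_bam, reference_name, sample_name,
                 paired_end=True, threads=4):
    """Build an RSEM call that quantifies an existing BAM (no alignment)."""
    cmd = ["rsem-calculate-expression", "--num-threads", str(threads),
           "--alignments"]
    if paired_end:
        cmd.append("--paired-end")
    cmd += [transcriptome_bam, reference_name, sample_name]
    return cmd


# Hypothetical inputs, for illustration only.
align = star_command("star_index", ["s1_R1.fastq", "s1_R2.fastq"], "s1.")
# With --quantMode TranscriptomeSAM, STAR writes
# <prefix>Aligned.toTranscriptome.out.bam
quant = rsem_command("s1.Aligned.toTranscriptome.out.bam", "rsem_ref/genome", "s1")

print(shlex.join(align))
print(shlex.join(quant))
```

Because the second command only needs a BAM and a reference, it can equally be fed a BAM kept from an earlier run, which is the reuse case this PR targets.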

Quantifier-aware STAR configuration:

  • Implement dynamic STAR parameter selection based on target quantifier (star_salmon vs star_rsem)
  • star_rsem mode replicates RSEM's internal STAR parameters for optimal compatibility
  • Maintain existing star_salmon parameters to preserve backward compatibility
  • Fix parameter handling to ensure consistent CI test results
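
Quantifier-aware configuration amounts to a lookup from the chosen aligner mode to a set of STAR arguments. The sketch below shows the idea in Python; in the pipeline this logic lives in Nextflow configuration, which is not reproduced here. The grouping of options per mode is an assumption made for illustration and should not be read as the values this PR sets. The individual flags are real STAR options, but `star_args_for`, `COMMON_ARGS`, and `QUANTIFIER_ARGS` are hypothetical names.

```python
# Arguments both modes are assumed to share (illustrative only).
COMMON_ARGS = ["--quantMode", "TranscriptomeSAM", "--outSAMtype", "BAM", "Unsorted"]

# Mode-specific arguments. The split shown here is an assumption,
# not the parameter set defined by the pipeline.
QUANTIFIER_ARGS = {
    "star_salmon": ["--twopassMode", "Basic",
                    "--outSAMstrandField", "intronMotif"],
    "star_rsem": ["--outFilterType", "BySJout",
                  "--alignIntronMin", "20",
                  "--alignIntronMax", "1000000"],
}


def star_args_for(aligner):
    """Return the STAR arguments for the given aligner mode."""
    if aligner not in QUANTIFIER_ARGS:
        raise ValueError(f"unsupported aligner: {aligner!r}")
    return COMMON_ARGS + QUANTIFIER_ARGS[aligner]


for mode in ("star_salmon", "star_rsem"):
    print(mode, " ".join(star_args_for(mode)))
```

Keeping the `star_salmon` entry untouched while adding a separate `star_rsem` entry mirrors the backward-compatibility goal stated in the list above.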

Workflow simplification:

  • RSEM subworkflow now focuses solely on quantification using pre-aligned BAMs
  • Remove duplicate BAM processing steps that were previously handled by RSEM internally
  • Cleaner, more understandable workflow architecture

Users can now reprocess data from previous RSEM runs with --aligner star_rsem --skip_alignment. The pipeline is also easier to maintain, because alignment is controlled explicitly instead of depending on RSEM's internal alignment behavior.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions

github-actions bot commented Sep 15, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 91d7d26

  • ✅ 290 tests passed
  • ❔ 7 tests were ignored
  • ❗ 9 tests had warnings

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_if_empty_null - ifEmpty(null) found in main.nf: versions = ch_versions.ifEmpty(null) // channel: [ versions.yml ]

Run details

  • nf-core/tools version 3.3.2
  • Run at 2025-09-16 08:34:32

@pinin4fjords pinin4fjords changed the title from "Rsem bam input" to "RSEM bam input" Sep 15, 2025
@pinin4fjords pinin4fjords changed the title from "RSEM bam input" to "RSEM BAM input" Sep 15, 2025
@pinin4fjords pinin4fjords changed the title from "RSEM BAM input" to "Enable BAM input for RSEM" Sep 15, 2025
@FloWuenne FloWuenne left a comment

praise
Nothing major spotted on my end. Looks great!

@pinin4fjords pinin4fjords merged commit 00a0a15 into dev Sep 16, 2025
35 checks passed
@pinin4fjords pinin4fjords deleted the rsem_bam_input branch September 16, 2025 08:49
4 participants