nf-core · pinin4fjords · Nov 26, 2025 · Nov 25, 2025
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -13,6 +13,7 @@ Special thanks to the following for their contributions to the release:
 - [Elad Herzog](https://github.com/EladH1)
 - [Emily Miyoshi](https://github.com/emilymiyoshi)
 - [Pontus Höjer](https://github.com/pontushojer)
+- [Siddhartha Bagaria](https://github.com/siddharthab)
 
 ### Enhancements and fixes
 
@@ -28,6 +29,7 @@ Special thanks to the following for their contributions to the release:
 - [PR #1624](https://github.com/nf-core/rnaseq/pull/1624) - Document RSeQC inner_distance limitation for genomes with large chromosomes (>500 Mb), such as plant genomes
 - [PR #1625](https://github.com/nf-core/rnaseq/pull/1625) - Add documentation warning about Qualimap read counting bug ([#1273](https://github.com/nf-core/rnaseq/issues/1273))
 - [PR #1628](https://github.com/nf-core/rnaseq/pull/1628) - Template update for nf-core/tools v3.5.1
+- [PR #1632](https://github.com/nf-core/rnaseq/pull/1632) - Add validation error for incompatible `--transcript_fasta` and `--additional_fasta` params ([#1450](https://github.com/nf-core/rnaseq/issues/1450))
 - [PR #1630](https://github.com/nf-core/rnaseq/pull/1630) - Fix arm64 profile to use pre-built ARM containers and update documentation
 - [PR #1631](https://github.com/nf-core/rnaseq/pull/1631) - Fix bbsplit index staging by using symlinks instead of full copy
 - [PR #1635](https://github.com/nf-core/rnaseq/pull/1635) - Fix `--gtf_extra_attributes` to support multiple comma-separated values and correct deprecated parameter name in docs ([#1626](https://github.com/nf-core/rnaseq/issues/1626))

diff --git a/docs/usage.md b/docs/usage.md
@@ -312,7 +312,7 @@ Notes:
 
 - If `--gff` is provided as input then this will be converted to a GTF file, or the latter will be used if both are provided.
 - If `--gene_bed` is not provided then it will be generated from the GTF file.
-- If `--additional_fasta` is provided then the features in this file (e.g. ERCC spike-ins) will be automatically concatenated onto both the reference FASTA file as well as the GTF annotation before building the appropriate indices.
+- If `--additional_fasta` is provided then the features in this file (e.g. ERCC spike-ins) will be automatically concatenated onto both the reference FASTA file and the GTF annotation before building the appropriate indices. Note: `--additional_fasta` cannot be combined with `--transcript_fasta` when the pipeline has to build a pseudo-aligner index (Salmon/Kallisto), because it cannot append additional sequences to a user-provided transcriptome. Either omit `--transcript_fasta` and let the pipeline generate the transcriptome, or provide a pre-built index that already contains the spike-ins.
 - When using `--aligner star_rsem`, the pipeline will build separate STAR and RSEM indices. STAR performs alignment with RSEM-compatible parameters, then RSEM quantifies from the resulting BAM files using `--alignments` mode.
 - If the `--skip_alignment` option is used along with `--transcript_fasta`, the pipeline can technically run without providing the genomic FASTA (`--fasta`). However, this approach is **not recommended** with `--pseudo_aligner salmon`, as any dynamically generated Salmon index will lack decoys. To ensure optimal indexing with decoys, it is **highly recommended** to include the genomic FASTA (`--fasta`) with Salmon, unless a pre-existing decoy-aware Salmon index is supplied. For more details on the benefits of decoy-aware indexing, refer to the [Salmon documentation](https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode).
 
@@ -346,6 +346,10 @@ In addition to the reference genome sequence and annotation, you can provide a r
 
 We recommend not providing a transcriptome FASTA file and instead allowing the pipeline to create it from the provided genome and annotation. Similar to aligner indexes, you can save the created transcriptome FASTA and BED files to a central location for future pipeline runs. This helps avoid redundant computation and having multiple copies on your system. Ensure that all genome, annotation, transcriptome, and index versions match to maintain consistency.
 
+:::warning
+If you are using `--additional_fasta` to add spike-in sequences (e.g. ERCC) and need the pipeline to build a pseudo-aligner index (Salmon/Kallisto), do **not** provide `--transcript_fasta`: the pipeline has to generate the transcriptome itself so that it includes the spike-in sequences. Combining the two parameters makes the pipeline exit with an error unless you also provide a pre-built index (`--salmon_index` or `--kallisto_index`) that already contains the spike-in sequences.
+:::
+
 #### Indices
 
 By default, indices are generated dynamically by the workflow for tools such as STAR and Salmon. Since indexing is an expensive process in time and resources you should ensure that it is only done once, by retaining the indices generated from each batch of reference files by specifying `--save_reference`.
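
To make the warning about `--additional_fasta` and `--transcript_fasta` concrete, here is a hedged sketch of parameter sets that should pass the new validation, written as a Nextflow config `params` block. The file name and every path are placeholders chosen for illustration, not values from the pipeline or its test profile.

```groovy
// custom.config (hypothetical file name); all paths below are placeholders.
params {
    fasta            = '/refs/genome.fa'
    gtf              = '/refs/genes.gtf'
    additional_fasta = '/refs/ercc.fa'
    pseudo_aligner   = 'salmon'

    // Option 1: leave transcript_fasta unset so the pipeline generates a
    // transcriptome that includes the spike-in sequences.
    transcript_fasta = null

    // Option 2 (alternative to option 1): keep a user-provided transcriptome,
    // but also supply a pre-built index that already contains the spike-ins.
    // transcript_fasta = '/refs/transcripts_with_ercc.fa'
    // salmon_index     = '/refs/salmon_index_with_ercc'
}
```

A file like this would be passed to the run with Nextflow's `-c` option. Setting both `transcript_fasta` and `additional_fasta` without the matching index parameter is the combination the new check rejects.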

diff --git a/nextflow_schema.json b/nextflow_schema.json
@@ -111,7 +111,8 @@
                     "mimetype": "text/plain",
                     "pattern": "^\\S+\\.fn?a(sta)?(\\.gz)?$",
                     "fa_icon": "far fa-file-code",
-                    "description": "Path to FASTA transcriptome file."
+                    "description": "Path to FASTA transcriptome file.",
+                    "help_text": "If not provided, the transcriptome will be generated from the genome FASTA and GTF files. Cannot be used together with `--additional_fasta` when building a pseudo-aligner index, because the pipeline cannot append spike-in sequences to a user-provided transcriptome. Either omit this parameter or provide a pre-built index."
                 },
                 "additional_fasta": {
                     "type": "string",
@@ -121,7 +122,7 @@
                     "pattern": "^\\S+\\.fn?a(sta)?(\\.gz)?$",
                     "fa_icon": "far fa-file-code",
                     "description": "FASTA file to concatenate to genome FASTA file e.g. containing spike-in sequences.",
-                    "help_text": "If provided, sequences in this file will be concatenated to the genome FASTA file. A GTF file will be automatically created using these sequences, and alignment indices will be created from the combined files. Use `--save_reference` to reuse these indices in future runs."
+                    "help_text": "If provided, sequences in this file will be concatenated to the genome FASTA file. A GTF file will be automatically created using these sequences, and alignment indices will be created from the combined files. Use `--save_reference` to reuse these indices in future runs. Cannot be used together with `--transcript_fasta` when building a pseudo-aligner index - either omit `--transcript_fasta` or provide a pre-built index that already contains the spike-ins."
                 },
                 "splicesites": {
                     "type": "string",

diff --git a/subworkflows/local/utils_nfcore_rnaseq_pipeline/main.nf b/subworkflows/local/utils_nfcore_rnaseq_pipeline/main.nf
@@ -263,6 +263,23 @@ def validateInputParameters() {
     }
 
     if (params.transcript_fasta) {
+        // Only error if additional_fasta is provided AND we need to build a pseudo-aligner index
+        // (i.e., no pre-built salmon/kallisto index provided). If the user provides a pre-built
+        // index that already contains the spike-ins, the combination is valid.
+        if (params.additional_fasta) {
+            def needs_to_build_index = false
+            if (!params.skip_pseudo_alignment && params.pseudo_aligner) {
+                // Check if the relevant index for the selected pseudo-aligner is missing
+                if (params.pseudo_aligner == 'salmon' && !params.salmon_index) {
+                    needs_to_build_index = true
+                } else if (params.pseudo_aligner == 'kallisto' && !params.kallisto_index) {
+                    needs_to_build_index = true
+                }
+            }
+            if (needs_to_build_index) {
+                transcriptFastaAdditionalFastaError()
+            }
+        }
         transcriptsFastaWarn()
     }
 
@@ -496,6 +513,28 @@ def transcriptsFastaWarn() {
         "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
 }
 
+//
+// Exit with an error if both '--transcript_fasta' and '--additional_fasta' are provided without a pre-built pseudo-aligner index
+//
+def transcriptFastaAdditionalFastaError() {
+    def error_string = "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" +
+        "  Both '--transcript_fasta' and '--additional_fasta' have been provided,\n" +
+        "  but no pre-built pseudo-aligner index (--salmon_index/--kallisto_index).\n\n" +
+        "  The pipeline cannot append additional sequences (e.g. ERCC spike-ins) to a\n" +
+        "  user-provided transcriptome FASTA file. This would cause quantification to\n" +
+        "  fail because the built index would not contain the additional sequences.\n\n" +
+        "  Please either:\n" +
+        "    - Remove '--transcript_fasta' and let the pipeline generate the\n" +
+        "      transcriptome from the genome FASTA and GTF (recommended), or\n" +
+        "    - Provide a pre-built index (--salmon_index/--kallisto_index) that\n" +
+        "      already contains the additional sequences, or\n" +
+        "    - Remove '--additional_fasta' if you do not need spike-in sequences.\n\n" +
+        "  Please see:\n" +
+        "  https://github.com/nf-core/rnaseq/issues/1450\n" +
+        "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
+    error(error_string)
+}
+
 //
 // Print a warning if --skip_alignment has been provided
 //
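
The check added to `validateInputParameters()` in this file can be read as a small decision function. The following is a minimal, standalone Groovy sketch of that logic, not the pipeline's code: `needsPseudoIndexBuild` and `hasFastaConflict` are hypothetical helper names, `params` is modeled as a plain `Map` rather than Nextflow's params object, and the file names are placeholders.

```groovy
// Hypothetical helpers for illustration only; the pipeline implements this
// check inline in validateInputParameters().

// True when a pseudo-aligner is selected, not skipped, and its index is missing.
def needsPseudoIndexBuild(Map params) {
    if (params.skip_pseudo_alignment || !params.pseudo_aligner) {
        return false
    }
    // Map each supported pseudo-aligner to the parameter holding its pre-built index.
    def indexParam = [salmon: 'salmon_index', kallisto: 'kallisto_index'][params.pseudo_aligner]
    return indexParam != null && !params[indexParam]
}

// True when the parameter combination would trigger the new validation error.
def hasFastaConflict(Map params) {
    return params.transcript_fasta && params.additional_fasta && needsPseudoIndexBuild(params)
}

// Example parameter sets (file names are placeholders).
def base = [transcript_fasta: 'tx.fa', additional_fasta: 'ercc.fa', pseudo_aligner: 'salmon']

assert hasFastaConflict(base)                                   // index would have to be built
assert !hasFastaConflict(base + [salmon_index: 'salmon_idx'])   // pre-built index supplied
assert !hasFastaConflict(base + [skip_pseudo_alignment: true])  // pseudo-alignment skipped
assert !hasFastaConflict([additional_fasta: 'ercc.fa', pseudo_aligner: 'salmon'])  // no transcript_fasta
```

Looking up the index parameter through a map keeps the Salmon and Kallisto cases symmetric; any other `pseudo_aligner` value falls through to "no conflict", which matches the behavior of the nested `if`/`else if` in the diff.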

diff --git a/tests/kallisto.nf.test b/tests/kallisto.nf.test
@@ -12,6 +12,11 @@ nextflow_pipeline {
                 pseudo_aligner = 'kallisto'
                 skip_qc = true
                 skip_alignment = true
+                // Disable spike-ins since we don't have a kallisto_index with spike-ins.
+                // Must also disable transcript_fasta because the test profile's transcriptome
+                // was generated with spike-ins - we need the pipeline to regenerate it.
+                additional_fasta = null
+                transcript_fasta = null
             }
         }
 
@@ -46,6 +51,11 @@ nextflow_pipeline {
                 pseudo_aligner = 'kallisto'
                 skip_qc = true
                 skip_alignment = true
+                // Disable spike-ins since we don't have a kallisto_index with spike-ins.
+                // Must also disable transcript_fasta because the test profile's transcriptome
+                // was generated with spike-ins - we need the pipeline to regenerate it.
+                additional_fasta = null
+                transcript_fasta = null
             }
         }
 

diff --git a/tests/kallisto.nf.test.snap b/tests/kallisto.nf.test.snap
@@ -1,17 +1,14 @@
 {
     "Params: --pseudo_aligner kallisto --skip_qc --skip_alignment": {
         "content": [
-            48,
+            47,
             {
                 "BBMAP_BBSPLIT": {
                     "bbmap": 39.18
                 },
                 "CAT_FASTQ": {
                     "cat": 9.5
                 },
-                "CUSTOM_CATADDITIONALFASTA": {
-                    "python": "3.12.2"
-                },
                 "CUSTOM_GETCHROMSIZES": {
                     "getchromsizes": 1.21
                 },
@@ -30,9 +27,6 @@
                 "GTF_FILTER": {
                     "python": "3.9.5"
                 },
-                "GUNZIP_ADDITIONAL_FASTA": {
-                    "gunzip": 1.13
-                },
                 "GUNZIP_GTF": {
                     "gunzip": 1.13
                 },
@@ -42,6 +36,10 @@
                 "KALLISTO_QUANT": {
                     "kallisto": "0.51.1"
                 },
+                "MAKE_TRANSCRIPTS_FASTA": {
+                    "rsem": "1.3.1",
+                    "star": "2.7.10a"
+                },
                 "SALMON_QUANT": {
                     "salmon": "1.10.3"
                 },
@@ -70,10 +68,6 @@
                 "bbsplit/RAP1_UNINDUCED_REP2.stats.txt",
                 "bbsplit/WT_REP1.stats.txt",
                 "bbsplit/WT_REP2.stats.txt",
-                "custom",
-                "custom/out",
-                "custom/out/genome_gfp.fasta",
-                "custom/out/genome_gfp.gtf",
                 "fastqc",
                 "fastqc/trim",
                 "fastqc/trim/RAP1_IAA_30M_REP1_trimmed_1_val_1_fastqc.html",
@@ -248,9 +242,7 @@
                 "trimgalore/WT_REP2_trimmed_2.fastq.gz_trimming_report.txt"
             ],
             [
-                "genome_gfp.fasta:md5,e23e302af63736a199985a169fdac055",
-                "genome_gfp.gtf:md5,c98b12c302f15731bfc36bcf297cfe28",
-                "tx2gene.tsv:md5,0e2418a69d2eba45097ebffc2f700bfe",
+                "tx2gene.tsv:md5,1be389a28cc26d94b19ea918959ac72e",
                 "cutadapt_filtered_reads_plot.txt:md5,6fa381627f7c1f664f3d4b2cb79cce90",
                 "cutadapt_trimmed_sequences_plot_3_Counts.txt:md5,13dfa866fd91dbb072689efe9aa83b1f",
                 "cutadapt_trimmed_sequences_plot_3_Obs_Exp.txt:md5,07145dd8dd3db654859b18eb0389046c",
@@ -277,17 +269,14 @@
     },
     "Params: --pseudo_aligner kallisto --skip_qc --skip_alignment - stub": {
         "content": [
-            22,
+            21,
             {
                 "BBMAP_BBSPLIT": {
                     "bbmap": 39.18
                 },
                 "CAT_FASTQ": {
                     "cat": 9.5
                 },
-                "CUSTOM_CATADDITIONALFASTA": {
-                    "python": null
-                },
                 "CUSTOM_GETCHROMSIZES": {
                     "getchromsizes": 1.21
                 },
@@ -300,15 +289,16 @@
                 "GTF_FILTER": {
                     "python": "3.9.5"
                 },
-                "GUNZIP_ADDITIONAL_FASTA": {
-                    "gunzip": 1.13
-                },
                 "GUNZIP_GTF": {
                     "gunzip": 1.13
                 },
                 "KALLISTO_INDEX": {
                     "kallisto": "0.51.1"
                 },
+                "MAKE_TRANSCRIPTS_FASTA": {
+                    "rsem": "1.3.1",
+                    "star": "2.7.10a"
+                },
                 "TRIMGALORE": {
                     "cutadapt": 4.9,
                     "pigz": 2.8,
@@ -319,10 +309,6 @@
                 }
             },
             [
-                "custom",
-                "custom/out",
-                "custom/out/genome_transcriptome.fasta",
-                "custom/out/genome_transcriptome.gtf",
                 "fastqc",
                 "fastqc/trim",
                 "fq_lint",
@@ -349,8 +335,6 @@
                 "trimgalore/WT_REP2_trimmed_2.fastq.gz_trimming_report.txt"
             ],
             [
-                "genome_transcriptome.fasta:md5,d41d8cd98f00b204e9800998ecf8427e",
-                "genome_transcriptome.gtf:md5,d41d8cd98f00b204e9800998ecf8427e"
             ]
         ],
         "meta": {