Merged
135 commits
fc13650
Add umicollapse as an alternative to umi-tools
siddharthab Sep 3, 2024
761c56d
Keep umitools as the default
siddharthab Sep 4, 2024
314bd76
add DE_analys folder
LorenzoS96 Oct 29, 2024
72e6e2e
prettier check
LorenzoS96 Oct 29, 2024
acd8f65
Merge branch 'dev' into umicollapse
siddharthab Oct 29, 2024
e86e9a6
update umicollapse to 1.1.0
siddharthab Oct 29, 2024
28b10ac
update CHANGELOG
siddharthab Oct 29, 2024
6d0ebe5
change name DE folder
LorenzoS96 Oct 30, 2024
73fb0e7
modify md files
LorenzoS96 Oct 31, 2024
e8d8061
prettier check
LorenzoS96 Oct 31, 2024
e5d9d4b
actually update umicollapse
siddharthab Nov 5, 2024
05513ee
Apply suggestions from code review
siddharthab Nov 6, 2024
aa35d37
Update docs/usage/DEanalysis/index.md
LorenzoS96 Nov 7, 2024
b1ec267
Update docs/usage/DEanalysis/interpretation.md
LorenzoS96 Nov 7, 2024
659c86b
Update docs/usage/DEanalysis/rnaseq.md
LorenzoS96 Nov 7, 2024
f2b34d0
Update docs/usage/DEanalysis/rnaseq.md
LorenzoS96 Nov 7, 2024
c8c7ef8
Update docs/usage/DEanalysis/rnaseq.md
LorenzoS96 Nov 7, 2024
ae16c2c
update index.md
LorenzoS96 Nov 7, 2024
000e228
update images
LorenzoS96 Nov 7, 2024
dfa8a04
update interpretation
LorenzoS96 Nov 7, 2024
aa55fe5
update rnaseq
LorenzoS96 Nov 7, 2024
f7e3c97
update theory
LorenzoS96 Nov 7, 2024
dbc5260
update de_rstudio
LorenzoS96 Nov 7, 2024
08efda6
correct formula in theory
LorenzoS96 Nov 7, 2024
6be4d82
prettier check
LorenzoS96 Nov 7, 2024
da5f135
Merge pull request #1447 from LorenzoS96/dev
LorenzoS96 Nov 8, 2024
ee2dde2
Merge branch 'dev' into umicollapse
pinin4fjords Nov 12, 2024
b372821
[automated] Fix code linting
nf-core-bot Nov 12, 2024
88d6411
Update tests for UMICollapse module.
MatthiasZepper Nov 29, 2024
15116a3
Update umi-tools dedup tests at well.
MatthiasZepper Nov 29, 2024
3bbd907
Update tests in subworkflows as well.
MatthiasZepper Nov 29, 2024
c4fb145
skip_sample_count
robsyme Nov 29, 2024
2e4e98c
Add linting-enabled preprocessing subworkflow and wire in
pinin4fjords Dec 2, 2024
bacbecb
Update CHANGELOG
pinin4fjords Dec 2, 2024
dd98da7
Add fq/lint
pinin4fjords Dec 2, 2024
de301d2
Add output previxes for linting
pinin4fjords Dec 2, 2024
d5a14a8
Tweaks
pinin4fjords Dec 3, 2024
380f760
Link in new subworkflow config
pinin4fjords Dec 3, 2024
839e21e
Let's skip the name validation by default, it seems to be problematic
pinin4fjords Dec 3, 2024
611e228
Correct config
pinin4fjords Dec 3, 2024
5e1d465
separate linting reports by directory rather than prefix
pinin4fjords Dec 3, 2024
b178bc0
update schema
pinin4fjords Dec 3, 2024
51a7238
Add stub to fq lint
pinin4fjords Dec 3, 2024
25e3f9d
update docs
pinin4fjords Dec 3, 2024
97f241a
update snaps
pinin4fjords Dec 3, 2024
ae42032
exclude lint reports from snapshots
pinin4fjords Dec 3, 2024
83967a7
update module/ subworkflow
pinin4fjords Dec 3, 2024
e69971a
Add CI Java fix
pinin4fjords Dec 3, 2024
6a6e946
Fix java ci issue
pinin4fjords Dec 3, 2024
376296b
Fix subworkflow test
pinin4fjords Dec 3, 2024
2cf53ab
Fix salmon test
pinin4fjords Dec 3, 2024
d8649fb
Merge branch 'swf_rnaseq_prepro_lint' of https://github.com/nf-core/r…
pinin4fjords Dec 3, 2024
7ec665d
fix changelog and versions
pinin4fjords Dec 3, 2024
c0f027a
Update CHANGELOG.md
pinin4fjords Dec 3, 2024
803c424
fix version snap
pinin4fjords Dec 3, 2024
b8b5aa2
Merge branch 'swf_rnaseq_prepro_lint' of https://github.com/nf-core/r…
pinin4fjords Dec 3, 2024
0247f7b
Merge pull request #1461 from nf-core/swf_rnaseq_prepro_lint
pinin4fjords Dec 3, 2024
79d52f6
Merge branch 'dev' into umicollapse
maxulysse Dec 4, 2024
450b5b9
Restore truncated CHANGELOG.md with 3.0 and prior releases.
MatthiasZepper Dec 4, 2024
a2dac87
[automated] Fix code linting
nf-core-bot Dec 4, 2024
2986385
Merge pull request #1369 from siddharthab/umicollapse
MatthiasZepper Dec 4, 2024
9b62f06
Move channel operations outside of the onComplete() block
robsyme Dec 5, 2024
532497b
Add changelog entry
robsyme Dec 6, 2024
674645e
Merge pull request #1463 from nf-core/oncomplete-fix-minimal
robsyme Dec 9, 2024
f4b76b1
Starting complement for umi factor-out
pinin4fjords Dec 10, 2024
9603f3d
Fixes with local subworkflow
pinin4fjords Dec 10, 2024
d2fd885
update changelog
pinin4fjords Dec 10, 2024
e0244fe
Add UMI tests
pinin4fjords Dec 10, 2024
e6843d8
Update umi tests
pinin4fjords Dec 10, 2024
a4aea81
Don't mix fastq stats files passed to MultiQC
pinin4fjords Dec 11, 2024
caccadc
Exclude date-containing umi handling logs from snapshotting
pinin4fjords Dec 11, 2024
3478877
update snapshot
pinin4fjords Dec 11, 2024
43c55fe
Exclude more variable logs
pinin4fjords Dec 11, 2024
52ba2d4
update CHANGELOG
pinin4fjords Dec 11, 2024
631f0e3
Merge pull request #1467 from nf-core/test_umi
pinin4fjords Dec 11, 2024
01b212e
Merge branch 'dev' into factor_out_umi
pinin4fjords Dec 12, 2024
c017916
Fix subworkflow alias
pinin4fjords Dec 12, 2024
42f3fa7
Fix linting
pinin4fjords Dec 12, 2024
d3cc50d
Fix version mixing
pinin4fjords Dec 12, 2024
4769025
fix selector
pinin4fjords Dec 12, 2024
742be2f
misc
pinin4fjords Dec 12, 2024
9f06d70
Fix process name in snap
pinin4fjords Dec 12, 2024
dbd3343
Remove unneeded subworkflow include
pinin4fjords Dec 12, 2024
1bc1b73
Fix more config selectors
pinin4fjords Dec 12, 2024
9196038
Don't mix transcriptome bam stats with genome ones for multiqc
pinin4fjords Dec 12, 2024
bc3cffb
Remove method from config, tidy up
pinin4fjords Dec 12, 2024
8377720
umi workflow from nf-core
pinin4fjords Dec 12, 2024
4749877
Merge pull request #1466 from nf-core/factor_out_umi
pinin4fjords Dec 12, 2024
0908456
linting docs fix
pinin4fjords Dec 17, 2024
39bafc2
Restore images
pinin4fjords Dec 17, 2024
0761530
Update CHANGELOG.md
pinin4fjords Dec 17, 2024
68d4e81
Update subworkflow to account for fix to bad argument handling
pinin4fjords Dec 17, 2024
ca92fe7
Update changelog
pinin4fjords Dec 17, 2024
08d8a2a
Merge branch 'dev' into remove-unused-params-in-email-template
pinin4fjords Dec 17, 2024
dd26536
Update CHANGELOG.md
pinin4fjords Dec 17, 2024
246b9dd
Merge pull request #1459 from nf-core/remove-unused-params-in-email-t…
pinin4fjords Dec 17, 2024
81f2824
Merge branch 'dev' into fix_prepro_arg
pinin4fjords Dec 17, 2024
4b6c9cc
Merge branch 'dev' into lint_docs_fix
pinin4fjords Dec 17, 2024
1e383e2
Merge pull request #1469 from nf-core/lint_docs_fix
pinin4fjords Dec 18, 2024
11b8670
Merge branch 'dev' into fix_prepro_arg
pinin4fjords Dec 18, 2024
254ad04
Fix prepare_genome subworkflow for sortmerna
pinin4fjords Dec 18, 2024
06fa117
Merge pull request #1470 from nf-core/fix_prepro_arg
pinin4fjords Dec 18, 2024
ae6990a
Update CHANGELOG.md
pinin4fjords Dec 18, 2024
f449840
Merge branch 'dev' into fix_sortmerna_index
pinin4fjords Dec 18, 2024
ebdabae
Merge pull request #1471 from nf-core/fix_sortmerna_index
pinin4fjords Dec 18, 2024
b23cca8
Bump STAR modules
pinin4fjords Dec 18, 2024
b0c583a
Update CHANGELOG.md
pinin4fjords Dec 18, 2024
7955857
update samtools version
pinin4fjords Dec 19, 2024
d11b9af
Fix swf snaps
pinin4fjords Dec 19, 2024
329fd69
Merge pull request #1473 from nf-core/bump_star
pinin4fjords Dec 19, 2024
859cfc3
Bump versions to 3.18.0
pinin4fjords Dec 19, 2024
f41744e
Update CHANGELOG.md
pinin4fjords Dec 19, 2024
492efe9
Revert "Update CHANGELOG.md"
pinin4fjords Dec 19, 2024
0cf9100
Update changelog
pinin4fjords Dec 19, 2024
a27ec8e
Merge pull request #1474 from nf-core/prerelease_3.18.0
pinin4fjords Dec 19, 2024
f67ae86
Fix minor umi dedup log issue
pinin4fjords Dec 19, 2024
37b52b5
better log file fix
pinin4fjords Dec 19, 2024
e8459aa
undo config change
pinin4fjords Dec 19, 2024
4138270
tiny fix
pinin4fjords Dec 19, 2024
16148b9
tiny fix
pinin4fjords Dec 19, 2024
685c1b1
Tidy up umitools/ umicollapse config
pinin4fjords Dec 19, 2024
cd97229
update changelog
pinin4fjords Dec 19, 2024
bdcd760
Update UMI test
pinin4fjords Dec 19, 2024
b799aa3
Update umi.nf.test
pinin4fjords Dec 19, 2024
6ce7096
Exclude umi logs
pinin4fjords Dec 19, 2024
cb59092
Update outputs in docs
pinin4fjords Dec 19, 2024
c98f8f3
Fix snapshot after log exclusion
pinin4fjords Dec 19, 2024
da322f1
Merge branch 'umi_dedup_log_path' of https://github.com/nf-core/rnase…
pinin4fjords Dec 19, 2024
311b633
Add keep intermeds to test file
pinin4fjords Dec 19, 2024
0a21c3f
Fix snapshot
pinin4fjords Dec 20, 2024
8033764
Merge pull request #1475 from nf-core/umi_dedup_log_path
pinin4fjords Dec 20, 2024
aa1ac44
Merge branch 'master' into update_from_master
pinin4fjords Dec 20, 2024
1b593f7
Merge pull request #1477 from nf-core/update_from_master
pinin4fjords Dec 20, 2024
eb7d764
Update CHANGELOG.md
pinin4fjords Dec 20, 2024
324dcdf
Merge pull request #1479 from nf-core/add_missing_changelog_entry
pinin4fjords Dec 20, 2024
5 changes: 5 additions & 0 deletions .github/workflows/ci.yml
@@ -68,6 +68,11 @@ jobs:
- name: Check out pipeline code
  uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4

- uses: actions/setup-java@8df1039502a15bceb9433410b1a100fbe190c53b # v4
  with:
    distribution: "temurin"
    java-version: "17"

- name: Set up Nextflow
  uses: nf-core/setup-nextflow@v2
  with:
108 changes: 78 additions & 30 deletions CHANGELOG.md
@@ -3,6 +3,54 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 3.18.0 - 2024-12-19

### Credits

Special thanks to the following for their contributions to the release:

- [Caitlin Winkler](https://github.com/oligomyeggo)
- [Jonathan Manning](https://github.com/pinin4fjords)
- [Lorenzo Sola](https://github.com/LorenzoS96)
- [Maxime Garcia](https://github.com/maxulysse)
- [Siddhartha Bagaria](https://github.com/siddharthab)

### Enhancements & fixes

- [PR #1369](https://github.com/nf-core/rnaseq/pull/1369) - Add umicollapse as an alternative to umi-tools
- [PR #1461](https://github.com/nf-core/rnaseq/pull/1461) - Add FASTQ linting during preprocessing
- [PR #1463](https://github.com/nf-core/rnaseq/pull/1463) - Move channel operations outside of the onComplete() block
- [PR #1467](https://github.com/nf-core/rnaseq/pull/1467) - Add test suite for UMI handling functionality
- [PR #1466](https://github.com/nf-core/rnaseq/pull/1466) - Factor out UMI handling
- [PR #1470](https://github.com/nf-core/rnaseq/pull/1470) - Update subworkflow to account for fix to bad argument handling
- [PR #1469](https://github.com/nf-core/rnaseq/pull/1469) - Minor docs fix
- [PR #1459](https://github.com/nf-core/rnaseq/pull/1459) - Remove reference to unused "skip_sample_count" value in email templates
- [PR #1471](https://github.com/nf-core/rnaseq/pull/1471) - Fix prepare_genome subworkflow for sortmerna
- [PR #1473](https://github.com/nf-core/rnaseq/pull/1473) - Bump STAR modules
- [PR #1474](https://github.com/nf-core/rnaseq/pull/1474) - Bump versions to 3.18.0
- [PR #1475](https://github.com/nf-core/rnaseq/pull/1475) - Fix log publishing around umitools/ umicollapse
- [PR #1447](https://github.com/nf-core/rnaseq/pull/1447) - Add tutorial series for analysing count data

### Parameters

| Old parameter | New parameter |
| ------------- | --------------------- |
| | `--skip_linting` |
| | `--extra_fqlint_args` |
| | `--umi_dedup_tool` |

### Software dependencies

| Dependency | Old version | New version |
| ------------- | ----------- | ----------- |
| `UMICollapse` | | 1.1.0 |

> **NB:** Dependency has been **updated** if both old and new version information is present.
>
> **NB:** Dependency has been **added** if just the new version information is present.
>
> **NB:** Dependency has been **removed** if new version information isn't present.

## [[3.17.0](https://github.com/nf-core/rnaseq/releases/tag/3.17.0)] - 2024-10-23

### Credits
@@ -1007,14 +1055,14 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi

### Parameters

| Old parameter | New parameter |
| --------------------------- | -------------------------------------- |
| `--fc_extra_attributes` | `--gtf_extra_attributes` |
|  `--fc_group_features` |  `--gtf_group_features` |
|  `--fc_count_type` |  `--gtf_count_type` |
|  `--fc_group_features_type` |  `--gtf_group_features_type` |
|   |  `--singularity_pull_docker_container` |
|  `--skip_featurecounts` |   |
| Old parameter | New parameter |
| -------------------------- | ------------------------------------- |
| `--fc_extra_attributes` | `--gtf_extra_attributes` |
| `--fc_group_features` | `--gtf_group_features` |
| `--fc_count_type` | `--gtf_count_type` |
| `--fc_group_features_type` | `--gtf_group_features_type` |
| | `--singularity_pull_docker_container` |
| `--skip_featurecounts` | |

> **NB:** Parameter has been **updated** if both old and new parameter information is present.
> **NB:** Parameter has been **added** if just the new parameter information is present.
@@ -1092,28 +1140,28 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi

#### Updated

| Old parameter | New parameter |
| ----------------------------- | --------------------------- |
| `--reads` | `--input` |
|  `--igenomesIgnore` |  `--igenomes_ignore` |
|  `--removeRiboRNA` |  `--remove_ribo_rna` |
|  `--rRNA_database_manifest` |  `--ribo_database_manifest` |
|  `--save_nonrRNA_reads` |  `--save_non_ribo_reads` |
|  `--saveAlignedIntermediates` |  `--save_align_intermeds` |
|  `--saveReference` |  `--save_reference` |
|  `--saveTrimmed` |  `--save_trimmed` |
|  `--saveUnaligned` |  `--save_unaligned` |
|  `--skipAlignment` |  `--skip_alignment` |
|  `--skipBiotypeQC` |  `--skip_biotype_qc` |
|  `--skipDupRadar` |  `--skip_dupradar` |
|  `--skipFastQC` |  `--skip_fastqc` |
|  `--skipMultiQC` |  `--skip_multiqc` |
|  `--skipPreseq` |  `--skip_preseq` |
|  `--skipQC` |  `--skip_qc` |
|  `--skipQualimap` |  `--skip_qualimap` |
|  `--skipRseQC` |  `--skip_rseqc` |
|  `--skipTrimming` |  `--skip_trimming` |
|  `--stringTieIgnoreGTF` |  `--stringtie_ignore_gtf` |
| Old parameter | New parameter |
| ---------------------------- | -------------------------- |
| `--reads` | `--input` |
| `--igenomesIgnore` | `--igenomes_ignore` |
| `--removeRiboRNA` | `--remove_ribo_rna` |
| `--rRNA_database_manifest` | `--ribo_database_manifest` |
| `--save_nonrRNA_reads` | `--save_non_ribo_reads` |
| `--saveAlignedIntermediates` | `--save_align_intermeds` |
| `--saveReference` | `--save_reference` |
| `--saveTrimmed` | `--save_trimmed` |
| `--saveUnaligned` | `--save_unaligned` |
| `--skipAlignment` | `--skip_alignment` |
| `--skipBiotypeQC` | `--skip_biotype_qc` |
| `--skipDupRadar` | `--skip_dupradar` |
| `--skipFastQC` | `--skip_fastqc` |
| `--skipMultiQC` | `--skip_multiqc` |
| `--skipPreseq` | `--skip_preseq` |
| `--skipQC` | `--skip_qc` |
| `--skipQualimap` | `--skip_qualimap` |
| `--skipRseQC` | `--skip_rseqc` |
| `--skipTrimming` | `--skip_trimming` |
| `--stringTieIgnoreGTF` | `--stringtie_ignore_gtf` |

#### Added

19 changes: 0 additions & 19 deletions assets/email_template.html
@@ -34,25 +34,6 @@ <h4 style="margin-top: 0; color: inherit">nf-core/rnaseq execution completed uns
<p>The full error message was:</p>
<pre style="white-space: pre-wrap; overflow: visible; margin-bottom: 0">${errorReport}</pre>
</div>
""" } else if(skip_sample_count > 0) { out << """
<div
style="
color: #856404;
background-color: #fff3cd;
border-color: #ffeeba;
padding: 15px;
margin-bottom: 20px;
border: 1px solid transparent;
border-radius: 4px;
"
>
<h4 style="margin-top: 0; color: inherit">nf-core/rnaseq execution completed with warnings!</h4>
<p>
The pipeline finished successfully, but samples were skipped. Please check warnings at the top of the MultiQC report.
</p>
<p></p>
</div>

""" } else { out << """
<div
style="
7 changes: 0 additions & 7 deletions assets/email_template.txt
@@ -17,13 +17,6 @@ The full error message was:

${errorReport}
"""
} else if (skip_sample_count > 0) {
out << """##################################################
## nf-core/rnaseq execution completed with warnings ##
##################################################
The pipeline finished successfully, but samples were skipped.
Please check warnings at the top of the MultiQC report.
"""
} else {
out << "## nf-core/rnaseq execution completed successfully! ##"
}
Binary file added docs/images/mqc_fastqc_adapter.png
Binary file added docs/images/mqc_fastqc_counts.png
Binary file added docs/images/mqc_fastqc_quality.png
23 changes: 19 additions & 4 deletions docs/output.md
@@ -19,6 +19,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [Pipeline overview](#pipeline-overview)
- [Preprocessing](#preprocessing)
- [cat](#cat)
- [fq lint](#fq-lint)
- [FastQC](#fastqc)
- [UMI-tools extract](#umi-tools-extract)
- [TrimGalore](#trimgalore)
@@ -73,6 +74,20 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d

If multiple libraries/runs have been provided for the same sample in the input samplesheet (e.g. to increase sequencing depth) then these will be merged at the very beginning of the pipeline in order to have consistent sample naming throughout the pipeline. Please refer to the [usage documentation](https://nf-co.re/rnaseq/usage#samplesheet-input) to see how to specify these samples in the input samplesheet.

### fq lint

<details markdown="1">
<summary>Output files</summary>

- `fq_lint/*`
  - `*.fq_lint.txt`: Linting report per library from `fq lint`.

> **NB:** You will see subdirectories here based on the stage of preprocessing for the files that have been linted, for example `raw`, `trimmed`.

</details>

[fq lint](https://github.com/stjude-rust-labs/fq) runs several checks on input FASTQ files. It exits with a non-zero error code when issues are found, which terminates the workflow execution. When no issues are found, linting produces the logs you will find here.

### FastQC

<details markdown="1">
@@ -105,7 +120,7 @@ If multiple libraries/runs have been provided for the same sample in the input s

</details>

[UMI-tools](https://github.com/CGATOxford/UMI-tools) deduplicates reads based on unique molecular identifiers (UMIs) to address PCR-bias. Firstly, the UMI-tools `extract` command removes the UMI barcode information from the read sequence and adds it to the read name. Secondly, reads are deduplicated based on UMI identifier after mapping as highlighted in the [UMI-tools dedup](#umi-tools-dedup) section.
[UMI-tools](https://github.com/CGATOxford/UMI-tools) and [UMICollapse](https://github.com/Daniel-Liu-c0deb0t/UMICollapse) deduplicate reads based on unique molecular identifiers (UMIs) to address PCR-bias. Firstly, the UMI-tools `extract` command removes the UMI barcode information from the read sequence and adds it to the read name. Secondly, reads are deduplicated based on UMI identifier after mapping as highlighted in the [UMI dedup](#umi-dedup) section.

To facilitate processing of input data which has the UMI barcode already embedded in the read name from the start, `--skip_umi_extract` can be specified in conjunction with `--with_umi`.
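
The combination described above could be expressed in a YAML file passed to Nextflow's `-params-file` option. This is only a minimal sketch: the file name `params.yaml` is an assumption for illustration, and the two keys are the `--with_umi` and `--skip_umi_extract` options named above, written without the leading dashes.

```yaml
# params.yaml (hypothetical file name)
# Enable UMI-based deduplication.
with_umi: true
# Skip the extract step because the UMI is already embedded in the read name.
skip_umi_extract: true
```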

@@ -290,7 +305,7 @@ The original BAM files generated by the selected alignment algorithm are further

![MultiQC - SAMtools mapped reads per contig plot](images/mqc_samtools_idxstats.png)

### UMI-tools dedup
### UMI dedup

<details markdown="1">
<summary>Output files</summary>
@@ -299,7 +314,7 @@ The original BAM files generated by the selected alignment algorithm are further
- `<SAMPLE>.umi_dedup.sorted.bam`: If `--save_umi_intermeds` is specified the UMI deduplicated, coordinate sorted BAM file containing read alignments will be placed in this directory.
- `<SAMPLE>.umi_dedup.sorted.bam.bai`: If `--save_umi_intermeds` is specified the BAI index file for the UMI deduplicated, coordinate sorted BAM file will be placed in this directory.
- `<SAMPLE>.umi_dedup.sorted.bam.csi`: If `--save_umi_intermeds --bam_csi_index` is specified the CSI index file for the UMI deduplicated, coordinate sorted BAM file will be placed in this directory.
- `<ALIGNER>/umitools/`
- `<ALIGNER>/umitools/` (UMI-tools only)
- `*_edit_distance.tsv`: Reports the (binned) average edit distance between the UMIs at each position.
- `*_per_umi.tsv`: UMI-level summary statistics.
- `*_per_umi_per_position.tsv`: Tabulates the counts for unique combinations of UMI and position.
@@ -308,7 +323,7 @@ The content of the files above is explained in more detail in the [UMI-tools doc

</details>

After extracting the UMI information from the read sequence (see [UMI-tools extract](#umi-tools-extract)), the second step in the removal of UMI barcodes involves deduplicating the reads based on both mapping and UMI barcode information using the UMI-tools `dedup` command. This will generate a filtered BAM file after the removal of PCR duplicates.
After extracting the UMI information from the read sequence (see [UMI-tools extract](#umi-tools-extract)), the second step in the removal of UMI barcodes involves deduplicating the reads based on both mapping and UMI barcode information. UMI deduplication can be carried out either with [UMI-tools](https://github.com/CGATOxford/UMI-tools) or [UMICollapse](https://github.com/Daniel-Liu-c0deb0t/UMICollapse), set via the `umi_dedup_tool` parameter. The output BAM files are the same, though UMI-tools has some additional outputs, as described above. Either method will generate a filtered BAM file after the removal of PCR duplicates.
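
As a rough illustration of selecting the deduplication tool, a params file fragment might look like the following. The value `umicollapse` is an assumption about the accepted spelling (the commit history only states that UMI-tools stays the default), and the file name is hypothetical; `with_umi`, `umi_dedup_tool` and `save_umi_intermeds` are the parameters mentioned in this document.

```yaml
# params.yaml (hypothetical file name)
# Enable UMI handling.
with_umi: true
# Assumed value: select UMICollapse instead of the default UMI-tools.
umi_dedup_tool: "umicollapse"
# Publish the UMI-deduplicated, coordinate-sorted BAM files.
save_umi_intermeds: true
```

With UMICollapse selected, the `<ALIGNER>/umitools/` statistics files listed above would not be expected, since the text marks them as UMI-tools only.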

### picard MarkDuplicates

6 changes: 6 additions & 0 deletions docs/usage.md
@@ -27,6 +27,12 @@ CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,a
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto
```

### Linting

By default, the pipeline will run [fq lint](https://github.com/stjude-rust-labs/fq) on all input FASTQ files, both at the start of preprocessing and after each preprocessing step that manipulates FASTQ files. If errors are found, an error will be reported and the workflow will stop.

The `extra_fqlint_args` parameter can be used to disable [any validator](https://github.com/stjude-rust-labs/fq?tab=readme-ov-file#validators) from `fq`. For example, we have found that checks on the names of paired reads are prone to failure, so that check is disabled by default (`extra_fqlint_args` is set to `--disable-validator P001`).

Contributor: I've found this too. In the next release could we set those to the default args.

Member (Author): I did :-)
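
The linting options described in the usage text above could be set in a YAML file passed to Nextflow's `-params-file` option. This is a sketch under assumptions: the file name `params.yaml` is hypothetical, `skip_linting` is assumed to be a boolean (it appears in the changelog as the new `--skip_linting` parameter), and the `P001` validator code is the one quoted in the usage text.

```yaml
# params.yaml (hypothetical file name)
# Keep FASTQ linting enabled; set to true to skip it (assumed boolean).
skip_linting: false
# Disable the paired-read name validator, mirroring the default described above.
extra_fqlint_args: "--disable-validator P001"
```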


### Strandedness Prediction

If you set the strandedness value to `auto`, the pipeline will sub-sample the input FastQ files to 1 million reads, use Salmon Quant to automatically infer the strandedness, and then propagate this information through the rest of the pipeline. This behavior is controlled by the `--stranded_threshold` and `--unstranded_threshold` parameters, which are set to 0.8 and 0.1 by default, respectively. This means: