Add draft of FASTQ_REMOVE_ADAPTERS_AND_MERGE subworkflow with tests #9521
Merged
98 commits
f2ee256 (kornkv) Add draft of FASTQ_REMOVE_ADAPTERS_AND_MERGE subworkflow with tests
060b52d (mirpedrol) Add ontologies to tcoffee/regressive and upp/align modules (#9484)
a03e263 (sainsachiko) Add module PBMARKDUP (#9457)
e505244 (delfiterradas) Enable complex contrast strings in DESeq2 (#9473)
26d5803 (peterpru) Declare deepvariant optional html output (#9469)
dbadf9c (matthdsm) utils_nfcore_pipeline: fix small lang server error (#9492)
e97aaf4 (pinin4fjords) Fix hisat2/align to support large genome indices (.ht2l) (#9493)
51369d0 (delfiterradas) Update shinyngs modules to latest release (#9488)
7130774 (dialvarezs) Update semibin/singleeasybin environment (#9495)
e188efd (mirpedrol) add new ontology term to tcoffee align (#9497)
74834d8 (nathanweeks) tcoffee_extractfrompdb test: sort file listing so "first" file is det…
f69fa14 (peterpru) Sambamba depth add region bed input (#9498)
2629da3 (sateeshperi) fix fasta_index_methylseq and fastq_align_dedup workflows (#9496)
19b67c4 (LouisLeNezet) Update test files for Glimpse (#9467)
819155d (mashehu) meta.yml schema: add `containers` section, fix order and simplify (#9…
53675cf (maxulysse) update and add topics to snakemake module (#9454)
b8c8f85 (fellen31) fix sambamba depth stub version (#9509)
6c99c79 (suhrig) anota2seq: wrong variable name for batch assignment (#9511)
28bea76 (pinin4fjords) fix(anota2seq): add gene IDs and handle empty results (#9510)
dabd880 (atrigila) fix(decoupler): reorder imports and ensure environment variables are …
2d52823 (fellen31) Add strdrop/build (#9512)
60239ed (renovate[bot]) chore(deps): update infrastructural dependencies
7b3c6f4 (enryH) 🔧 update image and bioconda container for VueGen to latest version (#…
6963d50 (dialvarezs) qsv/cat: bump version (#9518)
76c51e7 (ramprasadn) Update haplogrep3 recipe to use topics (#9523)
36c4abb (dialvarezs) semibin/singleeasybin: bump version + migrate to topics (#9517)
9239334 (vagkaratzas) remove unused folder
ced4270 (vagkaratzas) rename
66b909c (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
a68276a (vagkaratzas) trimmomatic revisit
c7ef10c (vagkaratzas) cutadapt revisit
ca4db62 (vagkaratzas) trimgalore revisit
54863da (vagkaratzas) bbduk revisit
697d943 (vagkaratzas) fastp revisit
a0b1a41 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
77cdc8c (vagkaratzas) adapterremoval revisit
209581d (vagkaratzas) leehom checkpoint (#9534)
5854017 (vagkaratzas) remove ngmerge because cant deal with /1 /2 paired reads
3001aa7 (vagkaratzas) paired end no merge test
beba663 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
8637cb1 (Joon-Klaps) New module: clusty (#9533)
a7bc445 (fellen31) Bump TRGT to 4.1.0 (#9514)
68d6485 (mashehu) fix missing quotes (#9535)
690f067 (delfiterradas) Fix dream to show more than 10 results (#9507)
2b65b43 (vagkaratzas) rename to more appropriate fastq_preprocess_seqkit (#9537)
08f0cec (khersameesh24) New module - TD2 (added modules for td2.longorfs & td2.predict) (#9475)
8293491 (lbeltrame) Bump ichorCNA package build in ichorcna/createpon and ichorcna/run (#…
a78cc00 (rhassaine) RSeQC split_bam.py module implementation (#9536)
56445cb (nvnieuwk) remove topics from multiqc (#9530)
66d0bdd (FriederikeHanssen) bump to MultiQC version 1.33 (#9538)
f3f9582 (fellen31) Add strdrop/call (#9513)
0bac4f6 (fellen31) Bump TRGT to 5.0.0 (#9541)
a939428 (nvnieuwk) Channel -> channel in some subwfs (#9542)
9d6f628 (Aratz) Migrate cat/fastq to topic channel (#9543)
7a6165f (georgiakes) Add module picard/collectvariantcallingmetrics (#9502)
24b5c0c (nvnieuwk) fix tabix/tabix stub (#9544)
f13b9a9 (mirpedrol) Unify msa modules (#9539)
e2c9156 (Aratz) Fix missing version from subworkflow snapshot (#9548)
b686830 (HaidYi) New module: whatshap/phase (#9431)
196f22c (Joon-Klaps) Bump version cat/cat to pigz 2.8 & rewrite nf-test & topic channel (#…
5649f8a (suhrig) plastid metagene_generate, make_wiggle, psite (#9482)
a0e5a8b (fellen31) Add index and threads to trgt/merge (#9545)
6b2c2fb (LouisLeNezet) Fix test path modification (#9465)
ea2d4f0 (an-altosian) Update xenium ranger modules and subworkflows (#9525)
41ac464 (jkh00) Version update: Modkit repair, callmods and bedmethyltobigwig (#9547)
37dbe45 (LouisLeNezet) Update `GLIMPSE` sbwf (#9524)
03cae33 (LouisLeNezet) Add quilt imputation subworkflow (#9443)
1f66a1e (LouisLeNezet) Update glimpse2 imputation subworkflow (#9434)
e23f119 (LouisLeNezet) Add `BEAGLE5` imputation subworkflow (#9550)
2c74f42 (LouisLeNezet) Add minimac4 imputation subworkfllow (#9451)
1e8cb18 (pinin4fjords) Add BBSplit stats to MultiQC in fastq_qc_trim_filter_setstrandedness …
51ffcb6 (vagkaratzas) Update cutadapt (#9551)
02ae283 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
4999717 (vagkaratzas) added cutadapt to stub now that stub gz is properly created, and remo…
a0a1406 (vagkaratzas) single-end test with tool skips
54a78b6 (LouisLeNezet) Standarize and alignment for all imputation and alignment modules (#9…
460be27 (renovate[bot]) Update Infrastructural dependencies
0b1640f (LouisLeNezet) Remove .view() (#9567)
f12328e (fellen31) Bump strdrop to 0.3.1 (#9565)
2ed958f (LouisLeNezet) Remove unecessary tags (#9568)
6546c99 (vagkaratzas) Update trimgalore (#9570)
ad92ee7 (vagkaratzas) trimgalore output versions removed
2f5aefa (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
3ca7400 (vagkaratzas) structure for subworkflow outputs in meta.yml file
efbaae6 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
f5cb4ca (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
e73c143 (vagkaratzas) Update subworkflows/nf-core/fastq_removeadapters_merge/main.nf
7184754 (vagkaratzas) Update subworkflows/nf-core/fastq_removeadapters_merge/main.nf
36a9fff (vagkaratzas) main and meta updated with new one-tool logic
d7519d6 (vagkaratzas) nf-tests updated
117f8f9 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
ce01561 (vagkaratzas) var name change
9bff6f6 (vagkaratzas) paired_interleaved dropped
7e3531a (vagkaratzas) adapterremoval merge logic update similar to eager
615a598 (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
1860f0b (vagkaratzas) Update subworkflows/nf-core/fastq_removeadapters_merge/main.nf
04d8f3f (vagkaratzas) Merge branch 'master' into fastq_remove_adapters_and_merge
6f39919 (vagkaratzas) update snapshot
Submodule setup-nextflow deleted from 6c2e22
140 changes: 140 additions & 0 deletions
subworkflows/nf-core/fastq_removeadapters_merge/main.nf
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,140 @@ | ||
| // both SE and PE | ||
| include { TRIMMOMATIC } from '../../../modules/nf-core/trimmomatic/main' | ||
| include { CUTADAPT } from '../../../modules/nf-core/cutadapt/main' | ||
| include { TRIMGALORE } from '../../../modules/nf-core/trimgalore/main' | ||
| include { BBMAP_BBDUK } from '../../../modules/nf-core/bbmap/bbduk/main' | ||
| include { LEEHOM } from '../../../modules/nf-core/leehom/main' | ||
| // both SE and PE, plus merging | ||
| include { FASTP } from '../../../modules/nf-core/fastp/main' | ||
| include { ADAPTERREMOVAL as ADAPTERREMOVAL_SE } from '../../../modules/nf-core/adapterremoval/main' | ||
| include { ADAPTERREMOVAL as ADAPTERREMOVAL_PE } from '../../../modules/nf-core/adapterremoval/main' | ||
| // helper module for concatenating adapterremoval paired-end processed reads | ||
| include { CAT_FASTQ } from '../../../modules/nf-core/cat/fastq/main' | ||
| | ||
| workflow FASTQ_REMOVEADAPTERS_MERGE { | ||
| | ||
| take: | ||
| ch_input_reads // channel: [mandatory] meta, reads | ||
| val_adapter_tool // string: [mandatory] tool_name // choose from: ["trimmomatic", "cutadapt", "trimgalore", "bbduk", "leehom", "fastp", "adapterremoval"] | ||
| ch_custom_adapters_file // channel: [optional] {fasta,txt} // fasta, for bbduk or fastp, or txt, for adapterremoval | ||
| val_save_merged // boolean: [mandatory] if true, will return the merged reads instead, for fastp and adapterremoval | ||
| val_fastp_discard_trimmed_pass // boolean: [mandatory] // only for fastp | ||
| val_fastp_save_trimmed_fail // boolean: [mandatory] // only for fastp | ||
| | ||
| main: | ||
| | ||
| ch_discarded_reads = channel.empty() // from trimmomatic, trimgalore, leehom, fastp, adapterremoval | ||
| ch_log = channel.empty() // from trimmomatic, trimgalore, fastp | ||
| ch_report = channel.empty() // from trimmomatic, trimgalore, fastp | ||
| ch_versions = channel.empty() | ||
| ch_multiqc_files = channel.empty() // from trimmomatic, cutadapt, bbduk, leehom, fastp, adapterremoval | ||
| | ||
| if (val_adapter_tool == "trimmomatic") { | ||
| TRIMMOMATIC( ch_input_reads ) | ||
| | ||
| ch_processed_reads = TRIMMOMATIC.out.trimmed_reads | ||
| ch_discarded_reads = ch_discarded_reads.mix(TRIMMOMATIC.out.unpaired_reads.transpose()) // .transpose() because paired reads will output 2 unpaired files in an array | ||
| ch_log = TRIMMOMATIC.out.trim_log | ||
| ch_report = TRIMMOMATIC.out.summary | ||
| ch_versions = ch_versions.mix(TRIMMOMATIC.out.versions.first()) | ||
| ch_multiqc_files = ch_multiqc_files.mix(TRIMMOMATIC.out.out_log) | ||
| } else if (val_adapter_tool == "cutadapt") { | ||
| CUTADAPT( ch_input_reads ) | ||
| | ||
| ch_processed_reads = CUTADAPT.out.reads | ||
| ch_multiqc_files = ch_multiqc_files.mix(CUTADAPT.out.log) | ||
| } else if (val_adapter_tool == "trimgalore") { | ||
| TRIMGALORE( ch_input_reads ) | ||
| | ||
| ch_processed_reads = TRIMGALORE.out.reads | ||
| ch_discarded_reads = ch_discarded_reads.mix(TRIMGALORE.out.unpaired) | ||
| ch_log = TRIMGALORE.out.log | ||
| ch_report = TRIMGALORE.out.html.mix(TRIMGALORE.out.zip) | ||
| } else if (val_adapter_tool == "bbduk") { | ||
| BBMAP_BBDUK( ch_input_reads, ch_custom_adapters_file ) | ||
| | ||
| ch_processed_reads = BBMAP_BBDUK.out.reads | ||
| ch_versions = ch_versions.mix(BBMAP_BBDUK.out.versions.first()) | ||
| ch_multiqc_files = ch_multiqc_files.mix(BBMAP_BBDUK.out.log) | ||
| } else if (val_adapter_tool == "leehom") { | ||
| LEEHOM( ch_input_reads ) | ||
| | ||
| ch_processed_reads = LEEHOM.out.fq_pass | ||
| .join(LEEHOM.out.unmerged_r1_fq_pass, by: 0, remainder: true) | ||
| .join(LEEHOM.out.unmerged_r2_fq_pass, by: 0, remainder: true) | ||
| .map { meta, single, r1, r2 -> | ||
| if (meta.single_end) { | ||
| return [meta, single] | ||
| } else { | ||
| return [meta, [r1, r2]] | ||
| } | ||
| } | ||
| ch_discarded_reads = ch_discarded_reads.mix(LEEHOM.out.fq_fail, LEEHOM.out.unmerged_r1_fq_fail, LEEHOM.out.unmerged_r2_fq_fail) | ||
| ch_versions = ch_versions.mix(LEEHOM.out.versions.first()) | ||
| ch_multiqc_files = ch_multiqc_files.mix(LEEHOM.out.log) | ||
| } else if (val_adapter_tool == "fastp") { | ||
| FASTP( | ||
| ch_input_reads.map { meta, files -> [ meta, files, ch_custom_adapters_file ] }, | ||
| val_fastp_discard_trimmed_pass, | ||
| val_fastp_save_trimmed_fail, | ||
| val_save_merged | ||
| ) | ||
| | ||
| if (val_save_merged) { | ||
| ch_processed_reads = FASTP.out.reads_merged | ||
| } else { | ||
| ch_processed_reads = FASTP.out.reads | ||
| } | ||
| ch_discarded_reads = ch_discarded_reads.mix(FASTP.out.reads_fail.transpose()) // .transpose() because paired reads have 3 fail files in an array | ||
| ch_log = FASTP.out.log | ||
| ch_report = FASTP.out.html | ||
| ch_versions = ch_versions.mix(FASTP.out.versions.first()) | ||
| ch_multiqc_files = ch_multiqc_files.mix(FASTP.out.json) | ||
| } else if (val_adapter_tool == "adapterremoval") { | ||
| ch_adapterremoval_in = ch_input_reads | ||
| .branch { meta, _reads -> | ||
| single: meta.single_end | ||
| paired: !meta.single_end | ||
| } | ||
| | ||
| ADAPTERREMOVAL_SE( ch_adapterremoval_in.single, ch_custom_adapters_file ) | ||
| ADAPTERREMOVAL_PE( ch_adapterremoval_in.paired, ch_custom_adapters_file ) | ||
| | ||
| if (val_save_merged) { // merge | ||
| ch_concat_fastq = channel.empty() | ||
| .mix( | ||
| ADAPTERREMOVAL_PE.out.collapsed, | ||
| ADAPTERREMOVAL_PE.out.collapsed_truncated, | ||
| ADAPTERREMOVAL_PE.out.singles_truncated, | ||
| ) | ||
| .map { meta, reads -> | ||
| def meta_new = meta.clone() | ||
| meta_new.single_end = true | ||
| [meta_new, reads] | ||
| } | ||
| .groupTuple() | ||
| // Paired-end reads cause a nested tuple during grouping. | ||
| // We want to present a flat list of files to `CAT_FASTQ`. | ||
| .map { meta, fastq -> [meta, fastq.flatten()] } | ||
| | ||
| CAT_FASTQ( ch_concat_fastq ) | ||
| | ||
| ch_processed_reads = CAT_FASTQ.out.reads.mix(ADAPTERREMOVAL_SE.out.singles_truncated) | ||
| } else { // no merge | ||
| ch_processed_reads = ADAPTERREMOVAL_PE.out.paired_truncated.mix(ADAPTERREMOVAL_SE.out.singles_truncated) | ||
| } | ||
| ch_discarded_reads = ch_discarded_reads.mix(ADAPTERREMOVAL_SE.out.discarded, ADAPTERREMOVAL_PE.out.discarded) | ||
| ch_versions = ch_versions.mix(ADAPTERREMOVAL_SE.out.versions.first(), ADAPTERREMOVAL_PE.out.versions.first()) | ||
| ch_multiqc_files = ch_multiqc_files.mix(ADAPTERREMOVAL_PE.out.settings, ADAPTERREMOVAL_SE.out.settings) | ||
| } else { | ||
| error('Please choose one of the available adapter removal and merging tools: ["trimmomatic", "cutadapt", "trimgalore", "bbduk", "leehom", "fastp", "adapterremoval"]') | ||
| } | ||
| | ||
| emit: | ||
| processed_reads = ch_processed_reads // channel: [ val(meta), [ fastq.gz ] ] | ||
| discarded_reads = ch_discarded_reads // channel: [ val(meta), [ fastq.gz ] ] | ||
| logfile = ch_log // channel: [ val(meta), [ {log,txt} ] ] | ||
| report = ch_report // channel: [ val(meta), [ {summary,html,zip} ] ] | ||
| versions = ch_versions // channel: [ versions.yml ] | ||
| multiqc_files = ch_multiqc_files | ||
| } | ||
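
A minimal sketch of how a pipeline might call this subworkflow with the six inputs declared in `take:`. This is not part of the PR. The include path assumes the calling script sits at the root of a pipeline with the subworkflow installed, the sample ID and FASTQ file names are hypothetical, and passing `[]` for `ch_custom_adapters_file` to mean "no custom adapters" is an assumption.

```nextflow
// Sketch only: include path, sample ID and FASTQ file names are hypothetical.
include { FASTQ_REMOVEADAPTERS_MERGE } from './subworkflows/nf-core/fastq_removeadapters_merge/main'

workflow {
    // One paired-end sample. meta.single_end is the field that the leehom
    // and adapterremoval branches of the subworkflow inspect.
    ch_reads = channel.of(
        [
            [ id: 'sample1', single_end: false ],
            [ file('sample1_R1.fastq.gz'), file('sample1_R2.fastq.gz') ],
        ]
    )

    FASTQ_REMOVEADAPTERS_MERGE(
        ch_reads,
        'fastp', // val_adapter_tool: one of the seven supported tool names
        [],      // ch_custom_adapters_file: assumed placeholder for "no custom adapters"
        true,    // val_save_merged: return merged reads (fastp and adapterremoval only)
        false,   // val_fastp_discard_trimmed_pass
        false    // val_fastp_save_trimmed_fail
    )

    // Inspect the trimmed (here: merged) reads emitted by the subworkflow.
    FASTQ_REMOVEADAPTERS_MERGE.out.processed_reads.view()
}
```

Note that the fastp branch places `ch_custom_adapters_file` inside a `map` closure, so in this sketch it is passed as a plain value rather than a channel.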
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json | ||
| name: "fastq_removeadapters_merge" | ||
| description: Remove adapters and merge reads based on various module choices | ||
| keywords: | ||
| - adapters | ||
| - removal | ||
| - short reads | ||
| - merge | ||
| - trim | ||
| components: | ||
| - trimmomatic | ||
| - cutadapt | ||
| - trimgalore | ||
| - bbmap/bbduk | ||
| - leehom | ||
| - fastp | ||
| - adapterremoval | ||
| - cat/fastq | ||
| input: | ||
| - ch_input_reads: | ||
| type: file | ||
| description: | | ||
| List of FastQ files of size 1 and 2 for single-end and paired-end data, respectively. | ||
| Structure: [ val(meta), [ path(reads) ] ] | ||
| - val_adapter_tool: | ||
| type: string | ||
| description: | | ||
| Choose one of the available adapter removal and/or merging tools | ||
| enum: ["trimmomatic", "cutadapt", "trimgalore", "bbduk", "leehom", "fastp", "adapterremoval"] | ||
| - ch_custom_adapters_file: | ||
| type: file | ||
| description: | | ||
| Optional reference files, containing adapter and/or contaminant sequences for removal. | ||
| In fasta format for bbmap/bbduk and fastp, or in text format for AdapterRemoval (one adapter per line). | ||
| - val_save_merged: | ||
| type: boolean | ||
| description: | | ||
| Specify true to output merged reads instead | ||
| Used by fastp and adapterremoval | ||
| - val_fastp_discard_trimmed_pass: | ||
| type: boolean | ||
| description: | | ||
| Used only by fastp. | ||
| Specify true to not write any reads that pass trimming thresholds from the fastp process. | ||
| This can be used to use fastp for the output report only. | ||
| - val_fastp_save_trimmed_fail: | ||
| type: boolean | ||
| description: | | ||
| Used only by fastp. | ||
| Specify true to save files that failed to pass fastp trimming thresholds | ||
| output: | ||
| - processed_reads: | ||
| type: file | ||
| description: | | ||
| Structure: [ val(meta), path(fastq.gz) ] | ||
| The trimmed/modified single or paired end or merged fastq reads | ||
| pattern: "*.fastq.gz" | ||
| - discarded_reads: | ||
| type: file | ||
| description: | | ||
| Structure: [ val(meta), path(fastq.gz) ] | ||
| The discarded reads | ||
| pattern: "*.fastq.gz" | ||
| - logfile: | ||
| type: file | ||
| description: | | ||
| Execution log file | ||
| (trimmomatic {log}, trimgalore {txt}, fastp {log}) | ||
| pattern: "*.{log,txt}" | ||
| - report: | ||
| type: file | ||
| description: | | ||
| Execution report | ||
| (trimmomatic {summary}, trimgalore {html,zip}, fastp {html}) | ||
| pattern: "*.{summary,html,zip}" | ||
| - versions: | ||
| type: file | ||
| description: | | ||
| File containing software versions | ||
| Structure: [ path(versions.yml) ] | ||
| pattern: "versions.yml" | ||
| - multiqc_files: | ||
| type: file | ||
| description: | | ||
| MultiQC-compatible output files from tools used in preprocessing | ||
| (trimmomatic, cutadapt, bbduk, leehom, fastp, adapterremoval) | ||
| authors: | ||
| - "@kornkv" | ||
| - "@vagkaratzas" | ||
| maintainers: | ||
| - "@kornkv" | ||
| - "@vagkaratzas" |
I would add a validation check here to make sure whatever is given to `val_adapter_tool` is recognised by the subworkflow.

There is this `else` condition at the end. Is this not enough?

Ah yeah, missed that 😅
I personally would put it separately right at the beginning as I find it easier to 'read' and know the rest of subwf won't start until it passes. But that's just personal preference, this still works.
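
The reviewer's alternative could look roughly like the sketch below: the tool name is checked once at the top of `main:` so that nothing else runs on an unsupported value. This is an illustration of the suggestion, not what was merged. The variable name `supported_tools` is hypothetical, and the tool-specific branches are elided.

```nextflow
workflow FASTQ_REMOVEADAPTERS_MERGE {

    take:
    ch_input_reads
    val_adapter_tool
    ch_custom_adapters_file
    val_save_merged
    val_fastp_discard_trimmed_pass
    val_fastp_save_trimmed_fail

    main:

    // Hypothetical local variable; the values mirror the enum in meta.yml.
    def supported_tools = ["trimmomatic", "cutadapt", "trimgalore", "bbduk", "leehom", "fastp", "adapterremoval"]

    // Fail fast, before any tool-specific branch is reached.
    if (!supported_tools.contains(val_adapter_tool)) {
        error("Please choose one of the available adapter removal and merging tools: ${supported_tools}")
    }

    // ... tool-specific branches as in main.nf, without the final `else` ...
}
```

Both placements reject the same inputs. The difference is that an up-front check keeps the list of supported tools in one place and separates validation from the branching logic.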