traitar module is drafted #9605
base: master
Pull request overview
This PR adds a new nf-core module for Traitar3, which performs phenotype prediction from assembled nucleotide FASTA files using protein families to infer microbial traits.
- Implements Traitar3 phenotype mode for trait prediction
- Provides both script and stub implementations for testing
- Includes comprehensive test configurations and expected outputs
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| modules/nf-core/traitar/main.nf | Core process implementation with Traitar3 phenotype analysis, including script and stub sections for prediction outputs |
| modules/nf-core/traitar/meta.yml | Module metadata defining inputs/outputs, tool information, and EDAM ontology annotations |
| modules/nf-core/traitar/environment.yml | Conda environment specification pinning traitar to version 3.0.1 |
| modules/nf-core/traitar/tests/main.nf.test | nf-test implementation with stub test for proteome input |
| modules/nf-core/traitar/tests/main.nf.test.snap | Test snapshot file with expected MD5 checksums for stub outputs |
| modules/nf-core/traitar/tests/nextflow.config | Test-specific Nextflow configuration for module execution |
| modules/nf-core/traitar/tests/nf-test.config | Additional nf-test configuration settings |
| modules/nf-core/traitar/tests/config/nf-test.config | Alternative test configuration file for different test scenarios |
Hi @rpetit3, could you please review the module? :)
- Real data test commented out (non-deterministic output across conda/container)
- Regenerated snapshots for stub test only (deterministic)
- Cleaned up obsolete snapshots
- CI will pass with stub test
- Real test can be run locally with: nf-test test tests/main.nf.test --profile=+singularity
Joon-Klaps left a comment
Hi @brovolia,
Thanks for contributing Traitar3. I've left a couple of comments, but they don't address everything.
I've noticed some files that shouldn't be there: .nftignore, nf-test.config, config/, ... These are typical for pipelines but don't belong in nf-core modules.
I also noticed that main.nf contains a fair amount of complexity and bash scripting; try to minimize this as much as possible. If you are struggling with decompressing files, there are plenty of existing examples that resolve this issue, see this search.
I would suggest reading through the docs on the dos and don'ts of nf-core modules. I would also suggest looking at some existing modules, like samtools/stats or trimgalore, to get an idea of how modules are structured.
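As a rough illustration of the decompression point above, one way to keep the bash small is to decompress only when the input name ends in `.gz`. This is a sketch under assumptions, not the module's code: the helper name `stage_fasta` and the file names are hypothetical.

```shell
# Hypothetical helper: print the path of an uncompressed FASTA.
# A *.gz input is decompressed next to itself; anything else is passed through unchanged.
stage_fasta() {
    local input="$1"
    if [[ "$input" == *.gz ]]; then
        local output="${input%.gz}"
        gzip -cd "$input" > "$output"
        printf '%s\n' "$output"
    else
        printf '%s\n' "$input"
    fi
}

# Usage with a throwaway file (illustrative names only)
workdir="$(mktemp -d)"
printf '>seq1\nACGT\n' | gzip -c > "$workdir/sample.fasta.gz"
fasta="$(stage_fasta "$workdir/sample.fasta.gz")"
cat "$fasta"
```

In a module, the same check is often done on the Groovy side so that the script block stays a single conditional `gzip` call.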
    label 'process_medium'

    conda "${moduleDir}/environment.yml"
    container "community.wave.seqera.io/library/hmmer_prodigal_pandas_pip_pruned:a83f0296374a52e6"
Typically, we want both a Docker container and a Singularity container. Also, it seems traitar is available on Bioconda, so Docker and Singularity packages will be built automatically. In the recipe, the versions indeed aren't pinned. Is this why you decided to make a new container for it?
    def input_type = task.ext.input_type ?: 'from_genes'
    // Validate input_type against allowed values
    if (!['from_genes', 'from_proteins', 'from_nucleotides'].contains(input_type)) {
        error("Invalid input_type: ${input_type}. Must be one of: 'from_genes', 'from_proteins', 'from_nucleotides'")
    }
Should be an input value of the module
    EOF
    """

    stub:
Unless it's very likely that these files will be read into a map later within Nextflow, I believe we can just create empty files with touch file.txt.
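A minimal sketch of what such a stub body could reduce to. The `prefix` value and the output file name are assumptions made for illustration; the module's real output names are not shown in this thread.

```shell
# Demo only: work in a scratch directory (a Nextflow task already has its own work dir)
outdir="$(mktemp -d)"
cd "$outdir"

# Hypothetical prefix; in a module this would usually be derived from task.ext.prefix or meta.id
prefix="sample1"

# Create an empty placeholder output instead of writing mock content
touch "${prefix}.phenotypes.tsv"
```

Empty placeholders are enough for the stub test to snapshot the output channel structure, which is all a stub run is meant to verify.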
    is from_nucleotides)
    pattern: "*.{fa,fasta,faa,fna}"

    ontologies: []
Add the ontology for FASTA files.
    # Generate versions file
    cat > versions.yml <<-EOF
    "${task.process}":
        traitar: \$(traitar --version 2>&1 | grep -oE 'version.*' || echo 'unknown')
    - traitar: \$(traitar --version 2>&1 | grep -oE 'version.*' || echo 'unknown')
    + traitar: \$(traitar --version 2>&1 | grep -oE 'version.*' )
Don't do this, versions should always be known.
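To illustrate why the fallback is discouraged, the sketch below compares the two forms. The function `fake_traitar_version` is a hypothetical stand-in, and the string it prints is an assumption about the format of `traitar --version`, not confirmed output.

```shell
# Hypothetical stand-in for `traitar --version` (output format is assumed)
fake_traitar_version() { echo "traitar version 3.0.1"; }

# Normal case: the version string is extracted
parsed="$(fake_traitar_version 2>&1 | grep -oE 'version.*')"

# With the fallback, an unparseable output is silently recorded as 'unknown'
masked="$(echo 'unexpected output' | grep -oE 'version.*' || echo 'unknown')"

# Without the fallback, the value stays empty and grep's non-zero status is visible
unmasked_status=0
unmasked="$(echo 'unexpected output' | grep -oE 'version.*')" || unmasked_status=$?

echo "parsed=${parsed} masked=${masked} unmasked_status=${unmasked_status}"
```

With the fallback removed, a broken version command shows up as a missing value in versions.yml rather than as a plausible-looking `unknown` entry.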
This module runs Traitar3 in phenotype mode to infer phenotype profiles from assembled nucleotide FASTA files, producing tabular trait prediction results per sample.