GSEA support (WIP) #84
base: develop
We know, and we don't care in this case
The SummarizedExperiment class was removed from GenomicRanges (after having been moved to the SummarizedExperiment package) back in Bioconductor 3.3 [1], released in May 2016 and targeting R 3.3. This commit removes code that supported Bioconductor versions older than 3.3. It slightly bumps the R version dependency to explicitly break with that old Bioconductor version. [1] https://bioconductor.org/news/bioc_3_3_release/
Back in Sept 2016, the representation= argument of setClass was documented as old, and the slots= argument was suggested instead [1]. In 2016 it was already old, so now it is considered even older. This commit just updates the syntax to ease maintenance. [1] wch/r-source@edadf29
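For reviewers less familiar with S4, the syntax change looks roughly like the sketch below. The class `Demo` and its slots are hypothetical and only illustrate the two styles; they are not classes from shinyngs.

```r
library(methods)

# Hypothetical class, for illustration only (not a shinyngs class).
# Older style, kept here as a comment for comparison:
#   setClass("Demo", representation(name = "character", n = "numeric"))
# Style suggested by the R documentation: declare slots via slots=
setClass("Demo", slots = c(name = "character", n = "numeric"))

obj <- new("Demo", name = "example", n = 3)
obj@n
```

Both forms define the same slots; only the way they are declared changes.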
Look at how gene_set_analyses is used here: https://github.com/pinin4fjords/shinyngs/blob/d2ca950934070c699c839aea45e77053bc461722/R/genesetanalysistable.R#L239
gst <- ese@gene_set_analyses[[assay]][[gene_set_types]][[as.numeric(selected_contrasts)]]
It is indexed by assay, gene set type and contrast. Fix the docs so the explanation matches reality.
The gene_set_analyses_tool has a list-based structure, just like gene_set_analyses. gene_set_analyses_tool[[assay]][[gene_set_type]][[contrast]] contains a string, either "auto", "gsea", or "roast". This string can be used by the geneset-related modules to parse and interact with the gene set enrichment tables.
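A minimal sketch of the two parallel structures, using plain lists rather than the real ExploratorySummarizedExperiment object. The assay name, gene set type name and table contents are made up for illustration; only the nesting order (assay, gene set type, contrast) and the allowed tool strings come from the description above.

```r
# Hypothetical assay / gene set type names and toy tables; only the nesting
# order (assay -> gene set type -> contrast) is taken from the description.
gene_set_analyses <- list(
  normalised = list(
    hallmark = list(
      data.frame(gene_set = c("A", "B"), FDR = c(0.01, 0.20)),
      data.frame(gene_set = c("A", "B"), FDR = c(0.30, 0.04))
    )
  )
)

# Same shape, but each leaf is a single string naming the tool
gene_set_analyses_tool <- list(
  normalised = list(
    hallmark = list("roast", "auto")
  )
)

assay <- "normalised"
gene_set_type <- "hallmark"
contrast <- 2

# The same three indices select both the table and the tool that produced it
gst <- gene_set_analyses[[assay]][[gene_set_type]][[contrast]]
tool <- gene_set_analyses_tool[[assay]][[gene_set_type]][[contrast]]
```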
Ensure each given gene_set_analysis has a tool set, and set it to "auto" otherwise.
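One way this defaulting could work is sketched below. `fill_missing_tools` is a hypothetical helper written for this comment, not the function added in the commit, and it operates on plain nested lists.

```r
# Hypothetical helper: walk gene_set_analyses and return a tool list of the
# same shape, keeping any tool already given and using "auto" elsewhere.
fill_missing_tools <- function(analyses, tools = list()) {
  out <- list()
  for (assay in names(analyses)) {
    out[[assay]] <- list()
    for (gs_type in names(analyses[[assay]])) {
      n_contrasts <- length(analyses[[assay]][[gs_type]])
      # NULL when no tool was given for this assay / gene set type
      given <- tools[[assay]][[gs_type]]
      out[[assay]][[gs_type]] <- lapply(seq_len(n_contrasts), function(i) {
        if (i <= length(given) && !is.null(given[[i]])) given[[i]] else "auto"
      })
    }
  }
  out
}

# Toy input: two gene set types, a tool given for only one contrast
analyses <- list(
  normalised = list(
    hallmark = list(data.frame(), data.frame()),
    go = list(data.frame())
  )
)
tools <- list(normalised = list(hallmark = list("gsea")))
filled <- fill_missing_tools(analyses, tools)
```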
Just a helper function
Previous library(shinyngs) statements are in eval=FALSE chunks.
I've pushed significant progress and the code is starting to grow. I believe the commits can be reviewed incrementally for your convenience.

I would like to find a way to pass enrichment result files to make_app_from_files, but CLI arguments are not that convenient. The make_app_from_files tool accepts one assay with a list of differential expression tables, one per contrast. For each contrast, there is a list of gene set types. For each gene set type there are up to two associated files (GSEA reports up and down separately). For 4 contrasts and 3 gene set types I would need to pass 24 TSV files, which is not convenient on the command line.

My suggestion is to add functional enrichment support to that script by passing a comma-separated list of gene set types in one argument and a file name template in another. The tool will assume that file names from the enrichment tool follow a predictable naming scheme based on the template, and will substitute the contrast and the gene set type to find all the TSV files. This solution avoids bloating the command line arguments and is flexible enough to work well with the nf-core differential abundance pipeline.

I'm happy to get feedback about this idea.
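To make the proposal concrete, here is a rough sketch of how a template could be expanded. The `{contrast}`, `{gene_set_type}` and `{direction}` placeholders, the function name and the example path are all hypothetical; the actual template syntax is still open for discussion.

```r
# Hypothetical placeholders {contrast}, {gene_set_type} and {direction};
# the real template syntax is not settled in this discussion.
expand_enrichment_template <- function(template, contrasts, gene_set_types,
                                       directions = c("up", "down")) {
  # One row per contrast x gene set type x direction
  combos <- expand.grid(
    direction = directions,
    gene_set_type = gene_set_types,
    contrast = contrasts,
    stringsAsFactors = FALSE
  )
  combos$file <- template
  for (field in c("contrast", "gene_set_type", "direction")) {
    placeholder <- paste0("{", field, "}")
    # Literal (fixed) substitution of the placeholder, row by row
    combos$file <- mapply(
      function(path, value) gsub(placeholder, value, path, fixed = TRUE),
      combos$file, combos[[field]],
      USE.NAMES = FALSE
    )
  }
  combos
}

# Gene set types arrive as one comma-separated CLI argument
gene_set_types <- strsplit("hallmark,go,kegg", ",", fixed = TRUE)[[1]]
files <- expand_enrichment_template(
  "gsea/{contrast}/{gene_set_type}.{direction}.tsv",
  contrasts = paste0("contrast", 1:4),
  gene_set_types = gene_set_types
)
nrow(files)  # 4 contrasts x 3 gene set types x 2 directions = 24
```

With this approach the command line carries two short arguments instead of 24 file paths.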
When reading CSV files for each assay, gene set type and contrast, allow enrichment results to be missing for some combinations.
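A minimal sketch of tolerating missing files, assuming tab-separated input. `read_enrichment_if_present` is a hypothetical name, not the function in this commit.

```r
# Hypothetical helper: read the file if it exists, otherwise return NULL so
# the combination is simply treated as having no enrichment results.
read_enrichment_if_present <- function(path) {
  if (!file.exists(path)) {
    return(NULL)
  }
  read.delim(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Demo with one real temporary file and one path that does not exist
tmp <- tempfile(fileext = ".tsv")
write.table(data.frame(gene_set = "A", FDR = 0.01), tmp,
            sep = "\t", quote = FALSE, row.names = FALSE)
paths <- c(present = tmp, absent = file.path(tempdir(), "does_not_exist.tsv"))

results <- lapply(paths, read_enrichment_if_present)
# Keep only the combinations that actually have results
available <- Filter(Negate(is.null), results)
```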
Still needs more testing, tests and documentation, but the core work is already here.
This is a draft pull request that will add support for GSEA output files to shinyngs.
Still work in progress.
Status/Roadmap
In ExploratorySummarizedExperiment there is a new slot for the gene_set_analyses_tool. This slot allows the user to specify the tool used for gene enrichment. Depending on the tool, the package processes the enrichment files, applying the user's p-value/FDR filters to the corresponding columns. By default a simple auto-detect heuristic is implemented.
Closes #83
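As an illustration of the idea (not the code in this PR), tool detection and filtering could look like the sketch below. It assumes GSEA report tables carry "NOM p-val" and "FDR q-val" columns and roast tables carry "PValue" and "FDR" columns; the function names, the default thresholds and the heuristic itself are hypothetical.

```r
# Assumed column names: GSEA reports use "NOM p-val" / "FDR q-val"; roast
# tables use "PValue" / "FDR". The PR's actual heuristic may differ.
tool_columns <- list(
  gsea = c(pvalue = "NOM p-val", fdr = "FDR q-val"),
  roast = c(pvalue = "PValue", fdr = "FDR")
)

# Hypothetical "auto" heuristic: pick the first tool whose columns are present
detect_tool <- function(tab) {
  for (tool in names(tool_columns)) {
    if (all(tool_columns[[tool]] %in% colnames(tab))) {
      return(tool)
    }
  }
  stop("Could not detect the enrichment tool from the column names")
}

# Apply the user's thresholds to whichever columns the tool uses
filter_enrichment <- function(tab, tool = "auto",
                              max_pvalue = 0.05, max_fdr = 0.1) {
  if (tool == "auto") {
    tool <- detect_tool(tab)
  }
  cols <- tool_columns[[tool]]
  keep <- tab[[cols[["pvalue"]]]] <= max_pvalue &
    tab[[cols[["fdr"]]]] <= max_fdr
  tab[keep, , drop = FALSE]
}

# Toy tables for both tools
gsea_tab <- data.frame(
  NAME = c("SET_A", "SET_B"),
  `NOM p-val` = c(0.001, 0.2),
  `FDR q-val` = c(0.01, 0.5),
  check.names = FALSE
)
roast_tab <- data.frame(PValue = c(0.01, 0.04), FDR = c(0.05, 0.3))

gsea_hits <- filter_enrichment(gsea_tab)
roast_hits <- filter_enrichment(roast_tab, tool = "roast")
```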