all RdRP+ macro/micro contigs #252

@rchikhi

This issue describes the procedure used to search all of our contigs against RdRP and presents the results.
(Slack thread: https://hackseq-rna.slack.com/archives/C012H9SDQCA/p1615948152031200)

Input: FASTA files of contigs from one of two assembly sets: micro (all SRA .pro DIAMOND hits, assembled with rnaviralspades) or macro (everything in s3://lovelywater/assembly/contigs/, i.e. all CoV + dicistro + quenya + satellite + a 1k random subset, assembled with either coronaSPAdes or rnaviralspades).

Output: FASTA of all the contigs that hit RdRP with the HMM search, with palmscan, or with both, i.e. the RdRP+ contigs:
s3://serratus-rayan/pro-assembly/rdrpplus.micro.fa
s3://serratus-rayan/pro-assembly/rdrpplus.macro.fa
total size: 8.2 GB
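
For illustration only, here is a minimal Python sketch of the selection step described above: keep a contig if its ID appears in the HMM hits, the palmscan hits, or both. This is not the script used for this issue; the function names `read_fasta` and `rdrp_plus` and the two hit-ID sets are hypothetical, and it assumes the contig ID is the first whitespace-delimited token of each FASTA header.

```python
from typing import Iterable, Iterator, Set, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def rdrp_plus(records: Iterable[Record],
              hmm_hits: Set[str],
              palmscan_hits: Set[str]) -> Iterator[Record]:
    """Keep records whose contig ID was hit by either method (set union)."""
    keep = hmm_hits | palmscan_hits
    for header, seq in records:
        fields = header.split()
        # Assumption: the contig ID is the first token of the header.
        contig_id = fields[0] if fields else ""
        if contig_id in keep:
            yield header, seq


if __name__ == "__main__":
    # Toy input; the contig names and hit sets are made up.
    fasta = [">c1 len=9", "ACGTACGTA", ">c2", "GGGG", ">c3", "TTTT"]
    for header, seq in rdrp_plus(read_fasta(fasta), {"c1"}, {"c1", "c3"}):
        print(f">{header}\n{seq}")
```

Taking the union before filtering means a contig hit by both methods is written only once.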

hmmsearch was run using an exhaustive collection of RdRP HMMs:
https://gitlab.pasteur.fr/rchikhi_pasteur/serratus-rdrp-analysis/-/blob/master/hmm_macro_micro/RdRP_all.v2.hmm

The alignments were made using this script:
https://gitlab.pasteur.fr/rchikhi_pasteur/serratus-rdrp-analysis/-/blob/master/hmm_macro_micro/align_hmm_to_contigs.sh
