criteria to name/number proteins #501

@ferninfm

Hi,
I noticed a peculiar behaviour in funannotate predict (I am guessing that is the step responsible), and I am not sure it is intended.

The predicted gene models and proteins are nicely named in consecutive order within each scaffold: species_00001-T1 and so on. This is perfect. I expected the numbering to continue consecutively following the numerical order of the scaffolds, e.g. Scaffold_1: proteins 00001 to 01000, Scaffold_2: proteins 01001 to 03000.

This is not the case. Scaffolds are sorted by name as text (lexicographically) rather than by their numerical index, so the proteins end up named like this:
Scaffold_1: proteins 00001 to 01000
Scaffold_10: proteins 01001 to 03000
Scaffold_11
Scaffold_110...
Scaffold_2: proteins 05001 to 06000

This is a minor point and, if anything, it only gets confusing when troubleshooting mediocre datasets. Still, my OCD does not let me believe this is the behaviour you intended...
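
To illustrate the two orderings, here is a minimal Python sketch. It is not funannotate's code (I have not checked how it sorts scaffolds internally), and `natural_key` is a hypothetical helper made up for this example. It only shows how a plain text sort differs from a sort that compares the numeric parts of the names as integers.

```python
import re

def natural_key(name):
    """Hypothetical helper: split a name into text and digit runs.

    re.split with a capturing group alternates text, digits, text, ...
    so odd positions always hold the digit runs, which are compared as ints.
    """
    parts = re.split(r"(\d+)", name)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

scaffolds = ["Scaffold_1", "Scaffold_10", "Scaffold_11", "Scaffold_110", "Scaffold_2"]

# Plain text sort: "Scaffold_10" comes before "Scaffold_2"
text_order = sorted(scaffolds)

# Natural sort: numeric suffixes compared as integers
natural_order = sorted(scaffolds, key=natural_key)

print(text_order)
print(natural_order)
```

With a key like this, Scaffold_2 would come right after Scaffold_1, and the protein numbers would follow the scaffold numbers.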

Take care,
All the best
Fer
