output file format: minus strand entries, extra spaces, header #19

@darked89

Hello,

I have a modest proposal for improving the output file format:

minus strand entries

In the output file, tRNAscan-SE minus strand predictions have "tRNA Begin" > "tRNA End". The same goes for intron positions (if the tRNA is spliced, obviously). This is not an issue for the tRNAs themselves (the BED and FASTA files have the correct 1:142656825-142656896 format/interval description), but the introns have to be flipped.
Would it be easier to use the same BED-like start-end-strand numbering scheme in the output?
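
To illustrate the flip I have in mind, here is a minimal Python sketch. The function names `normalize_interval` and `normalize_intron` are hypothetical, not part of tRNAscan-SE. It assumes coordinates stay 1-based and inclusive (as in the 1:142656825-142656896 example, so not the 0-based BED convention) and that an intron reported as 0/0 means "no intron".

```python
def normalize_interval(begin, end):
    """Return (start, end, strand) with start <= end.

    A record with begin > end is treated as a minus strand hit,
    which is how the main output file reports it.
    """
    if begin > end:
        return end, begin, "-"
    return begin, end, "+"


def normalize_intron(intr_begin, intr_end):
    """Flip intron bounds so that start <= end.

    Assumption: 0/0 marks an unspliced tRNA and is passed through unchanged.
    """
    if intr_begin == 0 and intr_end == 0:
        return 0, 0
    return min(intr_begin, intr_end), max(intr_begin, intr_end)
```

With this scheme the strand is carried explicitly, so the intron no longer needs to be flipped by the reader.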

extra spaces

To convert the output to a still human-readable but easy-to-parse TSV, I do:

tail -n +4 trnascan_out.txt | tr -d ' ' > trnascan_out.tsv

Since the file has a complicated header, I understand the need for the spaces. Which brings me to the next point:

header / TSV

A TSV format with named columns seems to be the default. With comment lines (#) at the top, it could be even easier to understand than the current format, and certainly easier to parse. For example:

 "chrom", "trna_num",  "trna_start", "trna_end", "trna_type", "anticodon", 
"intr_start", "intr_end", "inf_score", "iso_CM", "iso_score", "note"

To fix the minus strand issue, a "strand" column should be inserted somewhere.
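
As a rough sketch of what the conversion could look like, the Python below rewrites the current output as a TSV with the column names listed above plus "strand". The function `convert_to_tsv` is hypothetical. It assumes three header lines (as in the `tail -n +4` command above), tab-separated fields padded with spaces, the twelve columns in the order I listed, and it places "strand" right after "trna_end", which is just one possible position.

```python
import csv
import io

# Column names taken from the proposal above.
COLUMNS = [
    "chrom", "trna_num", "trna_start", "trna_end", "trna_type", "anticodon",
    "intr_start", "intr_end", "inf_score", "iso_CM", "iso_score", "note",
]
# Assumed placement of the new column: right after the tRNA coordinates.
OUT_COLUMNS = COLUMNS[:4] + ["strand"] + COLUMNS[4:]


def convert_to_tsv(lines, header_lines=3):
    """Convert output lines to a TSV string with named columns."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(OUT_COLUMNS)
    for line in lines[header_lines:]:
        if not line.strip():
            continue
        # Strip the padding spaces around each tab-separated field.
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        # Pad short rows, e.g. when the trailing note field is missing.
        fields += [""] * (len(COLUMNS) - len(fields))
        row = dict(zip(COLUMNS, fields))
        begin, end = int(row["trna_start"]), int(row["trna_end"])
        row["strand"] = "-" if begin > end else "+"
        row["trna_start"], row["trna_end"] = min(begin, end), max(begin, end)
        i_begin, i_end = int(row["intr_start"]), int(row["intr_end"])
        row["intr_start"], row["intr_end"] = min(i_begin, i_end), max(i_begin, i_end)
        writer.writerow([row[c] for c in OUT_COLUMNS])
    return out.getvalue()
```

Usage would be along the lines of `convert_to_tsv(open("trnascan_out.txt").readlines())`, with the result written to `trnascan_out.tsv`.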

These are just my $0.02.

Thank you for developing and maintaining tRNAscan-SE.

Darek Kedra