Hi there
I have now moved on to the compare step, and mafft failed with the error below. Running the listed command manually without --quiet revealed that selenocysteine (U) is the likely cause. The failure can be circumvented with the --anysymbol flag. But where in the code is the mafft command? I cannot find it in the compare script when I search for mafft. Is it in a different script?
[12:38 AM]: CMD ERROR: mafft test/phylogeny/phylogeny.concat.fa
[12:38 AM]: (None, b'')
nthread = 0
nthreadpair = 0
nthreadtb = 0
ppenalty_ex = 0
stacksize: 8192 kb
rescale = 1
Gap Penalty = -1.53, +0.00, +0.00
=========================================================================
=== Alphabet 'U' is unknown.
=== Please check site 603129 in sequence 1.
=== To make an alignment that has unusual characters (U, @, #, etc), try
=== % mafft --anysymbol input > output
=========================================================================
Illegal character U
Likely solution:
--anysymbol
To use unusual characters (e.g., U as selenocysteine in protein sequence; i as inosine in nucleotide sequence), use the --anysymbol option:
% mafft --anysymbol input > output
It accepts any printable characters (U, O, #, $, %, etc.; 0x21-0x7e in the ASCII code), except for > (0x3e) and ( (0x28). Unusual characters are scored as unknown (not considered in the calculation), unlike in the --text mode.
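
In case it helps anyone hitting the same error, here is a rough Python sketch of how one could check a concatenated FASTA for characters like U before aligning, and then call mafft with --anysymbol. This is not the code from the compare script: the function names, the `STANDARD` character set, and the output path are all made up for illustration, and the set of "standard" letters is my assumption rather than mafft's actual internal alphabet.

```python
import subprocess

# Assumed set of "ordinary" protein letters plus gap; hypothetical, not MAFFT's own list.
STANDARD = set("ACDEFGHIKLMNPQRSTVWYBZX-")


def find_unusual_residues(fasta_text, allowed=STANDARD):
    """Return (header, 1-based position, character) for each residue not in `allowed`."""
    hits = []
    header, pos = None, 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # New record: remember its name and restart the position counter.
            header, pos = line[1:], 0
            continue
        for ch in line:
            pos += 1
            if ch.upper() not in allowed:
                hits.append((header, pos, ch))
    return hits


def build_mafft_cmd(input_path, anysymbol=False):
    """Build the mafft argument list, optionally adding --anysymbol."""
    cmd = ["mafft"]
    if anysymbol:
        cmd.append("--anysymbol")
    cmd.append(input_path)
    return cmd


def run_mafft(input_path, output_path, anysymbol=True):
    """Run mafft and write the alignment (stdout) to output_path."""
    with open(output_path, "w") as out:
        subprocess.run(build_mafft_cmd(input_path, anysymbol), stdout=out, check=True)
```

Something like `find_unusual_residues(open("test/phylogeny/phylogeny.concat.fa").read())` should list the offending positions, and `run_mafft(...)` needs mafft on the PATH. The output file name is whatever you choose; I do not know what name the compare step expects.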