pairalign
Pairalign performs pairwise alignment of DNA sequences given in fasta format, read from standard input or from a file.
This manual is based on the pairalign help function (-h) but has since been revised.
pairalign [arguments] < inputfile.fasta
pairalign [arguments] inputfile.fasta
Input file is already aligned.
Output aligned sequences pairwise.
Output the difference between the Jukes-Cantor (JC) distance and the proportion of different sites.
Output the proportion of different sites, the JC distance, and the difference between the two.
Set the input format to fasta or to fasta with sequences pairwise (the format output with the -a -n options). If the sequences are already aligned, also give the -A switch.
This option clusters similar sequences and/or finds the most inclusive taxa in a hierarchy that are alignable according to MAD (Smith et al. 2009, BMC Evol. Biol. 9:37). It needs the taxonomy, given either after the first | in the sequence name or in a separate file. The taxa in the hierarchy should be separated by semicolons, with the highest rank first, followed by increasingly nested levels down to the lowest known level for the sequence. The groups that can be aligned are written to a file with the ending .alignment_groups and printed to the screen preceded by #. Clusters are printed to the screen after a heading, preceded by ###. To get alignable groups give 'alignment_groups' as an extra argument, to cluster give 'cluster', and to do both give 'both'. A cut-off value for pairwise similarity can be given after a colon (:) as cut-off= followed by the value, e.g.:
pairalign -g both:cut-off=0.97
A file with taxonomy can be given with taxonomy=. Each row of the taxonomy file should start with the taxonomy (formatted as above), followed by a | and then the names of the sequences with that taxonomy as a comma (,) and/or space ( ) separated string. The same taxon can be repeated several times.
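To make the two taxonomy layouts described above concrete, here is a minimal Python sketch of how such input could be read. It is not part of pairalign and does not reproduce its parser. The function names and the example sequence names and taxa are made up for illustration, and details such as whitespace handling are assumptions.

```python
def taxonomy_from_name(header):
    """Split a fasta header into (name, [taxa]) at the first '|'.

    Assumption: everything before the first '|' is the sequence name and
    everything after it is the semicolon separated taxonomy.
    """
    name, sep, lineage = header.lstrip(">").strip().partition("|")
    taxa = [t.strip() for t in lineage.split(";") if t.strip()] if sep else []
    return name.strip(), taxa


def parse_taxonomy_file(lines):
    """Map sequence names to lineages from rows of 'taxonomy|name1, name2 ...'."""
    lineages = {}
    for line in lines:
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip empty or malformed rows (an assumption)
        lineage, _, names = line.partition("|")
        taxa = [t.strip() for t in lineage.split(";") if t.strip()]
        # Names may be separated by commas, spaces, or both.
        for name in names.replace(",", " ").split():
            lineages[name] = taxa
    return lineages


# Hypothetical examples, highest rank first:
print(taxonomy_from_name(">seq1|Fungi;Basidiomycota;Agaricales"))
rows = ["Fungi;Ascomycota|seqA, seqB seqC", "Fungi;Ascomycota|seqD"]
print(parse_taxonomy_file(rows))
```

The second example repeats the same taxon on two rows, which the description above allows.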
Print this help.
Output Jukes-Cantor (JC) distance.
Output in the form of a space separated left-upper triangular matrix.
Output sequence names (if outputting alignments then in fasta format).
Output the proportion of sites that are different.
Output similarity between sequences (1-proportion different).
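The distance measures named in these options are related by simple formulas. The sketch below is an illustration in Python, not pairalign's own code. It uses the standard Jukes-Cantor formula d = -3/4 ln(1 - 4p/3), where p is the proportion of different sites. Skipping columns that contain a gap is an assumption made here; pairalign may treat gaps differently.

```python
import math


def proportion_different(seq_a, seq_b):
    """Proportion of compared sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return different / compared


def jukes_cantor(p):
    """Jukes-Cantor distance; the formula is undefined for p >= 0.75."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Two made-up aligned sequences that differ at one of ten sites.
p = proportion_different("ACGTACGTAC", "ACGTACGTTC")
d = jukes_cantor(p)
print(f"proportion different: {p:.4f}")
print(f"similarity:           {1 - p:.4f}")
print(f"JC distance:          {d:.4f}")
print(f"JC - proportion:      {d - p:.4f}")
```

The JC distance is always at least as large as the proportion of different sites, because it corrects for multiple substitutions at the same site, so the difference between the two is never negative.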
This option is only valid if you have compiled pairalign with PTHREADS=YES (see installation). Set the number of threads in addition to the controlling thread, e.g.:
pairalign -T 4
Default 1.
Get additional output.