Using a protein model
We use apps to load unaligned DNA sequences and to translate them into amino acids.
from cogent3 import get_app
loader = get_app("load_unaligned", format="fasta")
to_aa = get_app("translate_seqs")
process = loader + to_aa
seqs = process("data/SCA1-cds.fasta")
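To make the translation step concrete, here is a minimal plain-Python sketch of what translating a coding sequence involves, assuming the standard genetic code. It is not how the translate_seqs app is implemented. The translate helper and the short example sequence are hypothetical.

```python
# Standard genetic code (NCBI translation table 1), with codons ordered
# T, C, A, G at each of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids.

    Hypothetical helper for illustration only. Codons containing characters
    other than T, C, A, G are translated as "X"; stop codons become "*".
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i : i + 3] for i in range(0, len(cds), 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


print(translate("ATGAAATCCAACCAA"))  # MKSNQ
```

In practice the translate_seqs app does this for every sequence in the collection, so the helper above is only useful for seeing the codon-to-residue mapping.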
Protein alignment with default settings
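With “protein”, the aligner uses the JTT92 substitution model by default; the model entry in the alignment's info attribute records the model that was used.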
from cogent3 import get_app
aa_aligner = get_app("progressive_align", "protein")
aligned = aa_aligner(seqs)
aligned
0 | |
Human | MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH |
Chimp | ............................................................ |
Rat | ........................P.....TA......C...V....ST..S........ |
Mouse | ........................P.....TA......C...V....ST..I........ |
Mouse Lemur | ...............................A.......A..AP................ |
Macaque | ........................P......A............................ |
6 x 825 (truncated to 6 x 60) protein alignment
Specify a different distance measure for estimating the guide tree
The available distance measures are “percent” and “paralinear”.
Note
An estimated guide tree has its branch lengths scaled so they are consistent with usage in a codon model.
aa_aligner = get_app("progressive_align", "protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
0 | |
Human | MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH |
Rat | ........................P.....TA......C...V....ST..S........ |
Mouse | ........................P.....TA......C...V....ST..I........ |
Mouse Lemur | ...............................A.......A..AP................ |
Macaque | ........................P......A............................ |
Chimp | ............................................................ |
6 x 825 (truncated to 6 x 60) protein alignment
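As a rough guide to what a “percent” style distance measures, the sketch below computes the proportion of aligned positions at which two sequences differ, skipping positions where either sequence has a gap. This assumes the measure corresponds to the usual proportion-of-differences distance. How cogent3 handles gaps and ambiguous characters may differ, and p_distance is a hypothetical helper, not part of the library.

```python
def p_distance(seq1: str, seq2: str, gap_chars: str = "-?") -> float:
    """Proportion of compared positions at which two aligned sequences differ.

    Hypothetical helper for illustration. Positions where either sequence
    has a character in gap_chars are excluded from the comparison.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped or missing positions
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Made-up aligned fragments: one difference across eight compared positions.
print(p_distance("MKSNQERS", "MKSNQDRS"))  # 0.125
```

Pairwise values like this are what a distance-based guide tree is built from. The paralinear measure is more involved and is not sketched here.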
Alignment settings and file provenance are recorded in the info attribute
aligned.info
{'Refs': {},
'source': 'data/SCA1-cds.fasta',
'align_params': {'indel_length': 0.1,
'indel_rate': 1e-10,
'guide_tree': "(((Rat:0.004763355238688913,Mouse:0.011219581285708921):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.024351474041303396,(Macaque:0.0023127545121458606,(Chimp:0.008168683695808834,Human:0.00019740149152159842):0.0030743103943117536)'AUTOGENERATED_NAME_SD':1e-06);",
'model': 'JTT92',
'lnL': -3208.5222197901767}}
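Because the output above is a nested dictionary, individual settings can be read with ordinary key access. The sketch below uses a plain dict as a stand-in for aligned.info, copying a subset of the values displayed above, so that it runs without cogent3 or the data file. With a real alignment, the same key lookups would be applied to aligned.info, assuming it supports dictionary-style access as the display suggests.

```python
# Stand-in for aligned.info: a subset of the values shown in the output above.
info = {
    "source": "data/SCA1-cds.fasta",
    "align_params": {
        "indel_length": 0.1,
        "indel_rate": 1e-10,
        "model": "JTT92",
        "lnL": -3208.5222197901767,
    },
}

# The alignment settings live under the "align_params" key.
params = info["align_params"]

# Build a one-line provenance summary: input file, model, log-likelihood.
summary = f"{info['source']}: model={params['model']}, lnL={params['lnL']:.2f}"
print(summary)  # data/SCA1-cds.fasta: model=JTT92, lnL=-3208.52
```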