Seqsim Alignment Simulation Example with Non-standard alphabetΒΆ
Section author: Julia Goodrich
This is an example of how to use PyCogent’s seqsim
module to simulate an
alignment where the alphabet is defined by the user for a simple tree starting
with a random sequence and a random substitution rate matrix.
First we will perform the necessary imports.
>>> from cogent.seqsim.usage import Rates
>>> from cogent.core.alignment import DenseAlignment
>>> from cogent.seqsim.tree import RangeNode
>>> from cogent.parse.tree import DndParser
>>> from cogent.core.alphabet import CharAlphabet
>>> from cogent.seqsim.usage import Usage
>>> from cogent.core.sequence import ModelSequence
Now, lets specify a 4 taxon tree:
>>> t = DndParser('(a:0.4,b:0.3,(c:0.15,d:0.2)edge.0:0.1);',
... constructor = RangeNode)
Create the alphabet by passing in the characters to CharAlphabet
then create
tuples of all the possible pairs using ** operator
>>> Bases = CharAlphabet('ABCD')
>>> Pairs = Bases**2
Generate a random sequence with the new alphabet and a random rate matrix,
Usage
is being used to define character frequencies for the random
sequence. Then we create a random sequence of length five.
>>> u = Usage({'A':0.5,'B':0.2,'C':0.15,'D':0.25}, Alphabet = Bases)
>>> s = ModelSequence(u.randomIndices(5))
>>> q = Rates.random(Pairs)
Set q at the base of the tree and propagate it to all nodes in the tree,
>>> t.Q = q
>>> t.propagateAttr('Q')
Set a P matrix from every Q matrix on each node,
>>> t.assignP()
Use evolve
to evolve sequences for each tip, Note: must evolve sequence
data, not sequence object itself (for speed)
>>> t.evolve(s._data)
Build alignment,
>>> seqs = {}
>>> for n in t.tips():
... seqs[n.Name] = ModelSequence(n.Sequence,Bases)
>>> aln = DenseAlignment(seqs,Alphabet=Bases)
The result is a Cogent Alignment
object, which can be used the same way as
any other alignment object.