skbio.alignment.TabularMSA.extend

TabularMSA.extend(sequences, minter=None, index=None)

Extend this MSA with sequences without recomputing alignment.

State: Experimental as of 0.4.1.

Parameters:

sequences : iterable of GrammaredSequence

Sequences to be appended. Must match the dtype of the MSA and the number of positions in the MSA.

minter : callable or metadata key, optional

Used to create index labels for the sequences being appended. If callable, it is called on each sequence to generate that sequence's label. Otherwise it’s treated as a key into the sequence metadata. Note that minter cannot be combined with index.

index : pd.Index consumable, optional

Index labels to use for the appended sequences. Must be the same length as sequences and must be accepted directly by the pd.Index constructor. Note that index cannot be combined with minter.

Raises:

ValueError

If both minter and index are provided.

ValueError

If neither minter nor index is provided and the MSA has a non-default index.

ValueError

If index is not the same length as sequences.

TypeError

If sequences contains an object that isn’t a GrammaredSequence.

TypeError

If sequences contains an object whose type does not match the dtype of the MSA.

ValueError

If the length of a sequence does not match the number of positions in the MSA.

Notes

If neither minter nor index is provided and this MSA has default index labels, the new index labels will be auto-incremented.

The MSA is not automatically re-aligned when appending sequences. Therefore, this operation is not necessarily meaningful on its own.

Examples

>>> from skbio import DNA, TabularMSA
>>> msa = TabularMSA([DNA('ACGT')])
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 1
    position count: 4
---------------------
ACGT
>>> msa.extend([DNA('AG-T'), DNA('-G-T')])
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 3
    position count: 4
---------------------
ACGT
AG-T
-G-T

Auto-incrementing index labels:

>>> msa.index
Int64Index([0, 1, 2], dtype='int64')
>>> msa.extend([DNA('ACGA'), DNA('AC-T'), DNA('----')])
>>> msa.index
Int64Index([0, 1, 2, 3, 4, 5], dtype='int64')