Package schrodinger :: Package application :: Package msv :: Package domain :: Module alignment :: Class BaseAlignment

Class BaseAlignment

       object --+            
                |            
sip.simplewrapper --+        
                    |        
          sip.wrapper --+    
                        |    
     PyQt4.QtCore.QObject --+
                            |
                           BaseAlignment

Abstract base class for classes which handle alignment of various sequences and corresponding annotations.

This is a pure domain object intended to make it easy to work with aligned collections of sequences.

Instance Methods
 
__init__(self, sequences=None)
Initializes the alignment, optionally from an iterable of sequences
 
__len__(self)
Returns the number of sequences in the alignment
 
__iter__(self)
Returns an iterable of the sequences held in the alignment
 
__contains__(self, seq)
Returns whether the sequence is present in the alignment
 
__getitem__(self, index)
Returns the sequence at the index in the alignment
 
_getTextFormattedAlignment(self)
Returns a formatted list of strings with detailed alignment information
 
_getAlignmentDescription(self)
Returns a formatted description of the alignment
 
__str__(self)
Returns a str representation of the alignment
 
__repr__(self)
Returns a str representation of the alignment
 
__add__(self, other)
Adds another alignment to this one
 
__deepcopy__(self, memo)
Returns a deep copy of the alignment that carries all the essential data of the original and shares no references with it
 
getResidueData(self, seqnum, index, annotation=None)
Returns residue-level data for the specified sequence at the specified index in the alignment, or None if no data is available.
 
getGlobalAnnotationData(self, index, annotation)
Returns column-level annotation data at an index in the alignment
int
getSeqIndex(self, seq)
Returns: The index of the requested sequence
int or NoneType
_recalculateLength(self, omit, extra_length=None)
Determine what the length of the alignment would be if we removed one sequence.
 
_monitorSequence(self, seq)
Monitor changes in the specified sequence, which has been stored at the specified index in self._sequences.
 
_stopMonitoringSequence(self, seq, slots)
Stop monitoring for changes in the specified sequence
 
_sequenceLengthAboutToChange(self, seq, old_seq_length, new_seq_length)
Respond to a sequence lengthAboutToChange signal by emitting alignmentLengthAboutToChange if necessary.
 
_sequenceLengthChanged(self, seq, old_seq_length, new_seq_length)
Respond to a sequence lengthChanged signal by emitting alignmentLengthChanged and sequenceResiduesChanged as necessary.
int or NoneType
_checkAlignmentLength(self, seq, old_seq_length, new_seq_length)
Determine what the length of the alignment will be if we changed the length of the specified sequence
 
addSeq(self, seq, index=None)
 
addSeqs(self, sequences, start=None)
Add multiple sequences to the alignment
 
removeSeq(self, seq)
Remove a sequence from the alignment
 
removeSeqs(self, seqs)
Remove multiple sequences from the alignment
 
removeSeqByIndex(self, index)
Remove the sequence at the specified index from the alignment
 
removeAllSeqs(self)
Clears the entire alignment of sequences
list of (sequence index, residue index) tuples
getResidueIndices(self, residues)
Returns the indices (in the alignment) of the specified residues
BaseAlignment
getSubalignment(self, start, end)
Return another alignment containing the elements within the specified start and end indices
 
removeSubalignment(self, start, end)
Remove a block of the subalignment from the start to end points, including column locks in that region
 
insertSubalignment(self, aln, start)
Insert an alignment into the current alignment at the specified index
 
appendSubalignment(self, aln)
Append an alignment to this one
 
replaceSubalignment(self, aln, start, end)
Replace a subsection of the alignment indicated by start and end indices with the specified alignment
list
getGaps(self)
Returns a list of gap indices lists
list
getTerminalGaps(self)
Returns the indices of terminal gaps in all the sequences
 
removeGaps(self, gap_indices)
 
removeAllGaps(self)
Removes all the gaps of the sequences in the alignment.
 
removeTerminalGaps(self)
Removes the gaps from the ends of every sequence in the alignment
 
_validateGapIndices(self, gap_indices)
Check for gaps in locked columns
 
_adjustGapsInSubalignments(self, method_name, gap_indices)
Utility method that iterates through alternating locked and unlocked stretches of an alignment in reverse order and calls the specified method on the subregions, padding all but the last region.
 
addGaps(self, gap_indices)
Adds gaps to the alignment
 
setGaps(self, gap_indices)
Sets gaps on the alignment
list
_getGapOnlyColumns(self)
Returns a list of lists of indices for unlocked columns that contain only gaps
 
minimizeAlignment(self)
Minimizes the alignment, i.e. removes all gaps from the gap-only columns.
list
getColumn(self, index, omit_gaps=False)
Returns single alignment column at index position.
 
columns(self, omit_gaps=False)
Returns a range of alignment columns or all columns if indices are not specified.
set
lockedColumns(self)
Returns a set with indices of locked columns.
 
_setLockedColumns(self, columns, lock=True, reset=False)
Sets the columns to the specified lock state
 
setLockedColumns(self, columns, lock=True, reset=False)
Sets the specified columns to the specified lock state
bool
alignmentLocked(self)
Whether every column in the alignment is locked
 
setAllLocks(self, lock=True)
Convenience method to set all the locks to the specified lock state at once
list
_getRegions(self)
Returns a list of _Region objects containing information of locked and unlocked stretches of the alignment
 
getIdentities(self, omit_gaps=True)
Returns an alignment-length list of bools indicating which columns have identical residues
 
getSimilarityScore(self, seq)
Returns a sequence length array of similarity scores against the reference sequence
 
getAlignedBlocks(self)
Returns the indices of aligned blocks (regions without gaps).
 
getFrequencies(self, exclude=None, consider_gaps=False)
Returns a dict mapping residue types to their frequency in the alignment
 
getEntropy(self, frequencies)
Returns an alignment length array of residue entropy scores
 
createConsensusSequence(self)
Returns a consensus sequence for sequences in the alignment
list
findPattern(self, pattern)
Finds a specified PROSITE pattern in all sequences.
list of int
getRedundantSequences(self, value)
Returns the indices of sequences below a specified identity threshold value.
 
calculateMatrix(self)
Calculates a substitution matrix based on the current alignment.

Inherited from PyQt4.QtCore.QObject: __getattr__, blockSignals, childEvent, children, connect, connectNotify, customEvent, deleteLater, destroyed, disconnect, disconnectNotify, dumpObjectInfo, dumpObjectTree, dynamicPropertyNames, emit, event, eventFilter, findChild, findChildren, inherits, installEventFilter, isWidgetType, killTimer, metaObject, moveToThread, objectName, parent, property, pyqtConfigure, receivers, removeEventFilter, sender, senderSignalIndex, setObjectName, setParent, setProperty, signalsBlocked, startTimer, thread, timerEvent, tr, trUtf8

Inherited from sip.simplewrapper: __new__

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Class Methods
 
padAlignment(cls, aln)
Insert gaps into an alignment so that it forms a rectangular block
 
_insertGaps(cls, sequences, gap_boundaries)
Inserts gaps represented as "gap boundaries" (indices of residues that immediately follow the gaps) to a list of sequences.
 
mergePairwiseAlignments(cls, sequence_pairs)
Merges several pairwise alignments into one flat alignment while preserving relative residue positions.
Class Variables

Inherited from PyQt4.QtCore.QObject: staticMetaObject

Instance Variables
 
sequencesAboutToBeInserted(...)
A signal emitted before sequences are inserted into the alignment.
 
sequencesInserted(...)
A signal emitted after sequences are inserted into the alignment.
 
sequencesAboutToBeRemoved(...)
A signal emitted before sequences are removed from the alignment.
 
sequencesRemoved(...)
A signal emitted after sequences are removed from the alignment.
 
sequenceResiduesChanged(...)
A signal emitted after the contents of a sequence have changed.
 
sequenceNameChanged(...)
A signal emitted after a sequence has changed names.
 
alignmentLengthAboutToChange(...)
A signal emitted before the alignment changes length.
 
alignmentLengthChanged(...)
A signal emitted after the alignment changes length.
Properties
  global_annotations
Returns the alignment-level annotations available for the alignment
  seq_annotations
Returns the sequence-level annotations available for sequences held in the alignment
  all_annotations
Returns a list of all annotation types in this alignment

Inherited from object: __class__

Method Details

__init__(self, sequences=None)
(Constructor)

 

Initializes the alignment, optionally from an iterable of sequences

Parameters:
  • sequences (list) - An optional iterable of sequences
Overrides: object.__init__

__str__(self)
(Informal representation operator)

 

Returns a str representation of the alignment

This is fairly detailed since it is very useful for debugging

Overrides: object.__str__

__repr__(self)
(Representation operator)

 

Returns a str representation of the alignment

Overrides: object.__repr__

__add__(self, other)
(Addition operator)

 

Adds another alignment to this one

Parameters:
  • other (schrodinger.application.msv.domain.alignment.Alignment) - Another alignment with the same number of sequences

NOTE: Undo is currently not supported for this operation

getResidueData(self, seqnum, index, annotation=None)

 

Returns residue-level data for the specified sequence at the specified index in the alignment, or None if no data is available.

If annotation is specified, the residue-level information for the residue is returned. If not, the residue object itself is returned.

Parameters:
  • seqnum (int) - The index of the sequence in the alignment
  • index (int) - The index of the residue in the sequence
  • annotation (enum.Enum) - An enum representing the requested annotation, if any

getGlobalAnnotationData(self, index, annotation)

 

Returns column-level annotation data at an index in the alignment

Parameters:
  • index (int) - The index in the alignment
  • annotation (enum.Enum) - An enum representing the requested annotation

getSeqIndex(self, seq)

 
Parameters:
Returns: int
The index of the requested sequence

_recalculateLength(self, omit, extra_length=None)

 

Determine what the length of the alignment would be if we removed one sequence.

Parameters:
  • omit (sequence.Sequence) - The sequence to omit
  • extra_length (int) - An additional sequence length to include in the calculation
Returns: int or NoneType
The calculated alignment length if the length is going to change. None otherwise.
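
As a rough illustration of the calculation described above, the sketch below works on plain integer lengths rather than sequence objects. It assumes that the alignment is as long as its longest sequence and it identifies the omitted sequence by position; the function name and both of those choices are assumptions made for the example, not the library's implementation.

```python
def recalculate_length(seq_lengths, omit_index, extra_length=None):
    """Return the new alignment length if it would change, else None.

    Hypothetical stand-in: assumes the alignment length equals the
    length of the longest sequence it holds.
    """
    current = max(seq_lengths, default=0)
    # Drop the omitted sequence from the calculation.
    remaining = [n for i, n in enumerate(seq_lengths) if i != omit_index]
    # Optionally account for one additional sequence length.
    if extra_length is not None:
        remaining.append(extra_length)
    new_length = max(remaining, default=0)
    return new_length if new_length != current else None


print(recalculate_length([5, 8, 6], omit_index=1))  # the longest sequence is removed
print(recalculate_length([5, 8, 6], omit_index=0))  # the length does not change
```

Returning None for "no change" mirrors the documented return type of int or NoneType, which lets a caller skip emitting a length-changed signal.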

_monitorSequence(self, seq)

 

Monitor changes in the specified sequence, which has been stored at the specified index in self._sequences.

Parameters:

_stopMonitoringSequence(self, seq, slots)

 

Stop monitoring for changes in the specified sequence

Parameters:
  • seq (sequence.Sequence) - The sequence to stop monitoring
  • slots (tuple) - A tuple containing all slots to disconnect

_sequenceLengthAboutToChange(self, seq, old_seq_length, new_seq_length)

 

Respond to a sequence lengthAboutToChange signal by emitting alignmentLengthAboutToChange if necessary.

Parameters:
  • seq (sequence.Sequence) - The sequence that's about to be modified
  • old_seq_length (int) - The current length of the sequence
  • new_seq_length (int) - The new length of the sequence

_sequenceLengthChanged(self, seq, old_seq_length, new_seq_length)

 

Respond to a sequence lengthChanged signal by emitting alignmentLengthChanged and sequenceResiduesChanged as necessary.

Parameters:
  • seq (sequence.Sequence) - The sequence that's been modified
  • old_seq_length (int) - The old length of the sequence
  • new_seq_length (int) - The current length of the sequence

_checkAlignmentLength(self, seq, old_seq_length, new_seq_length)

 

Determine what the length of the alignment will be if we changed the length of the specified sequence

Parameters:
  • seq (sequence.Sequence) - The sequence that will change length
  • old_seq_length (int) - The old length of the sequence
  • new_seq_length (int) - The new length of the sequence
Returns: int or NoneType
If the alignment will change length, returns the new alignment length. Otherwise, returns None.

addSeq(self, seq, index=None)

 
Parameters:
  • seq (sequence.Sequence) - The sequence to add
  • index (int) - The index at which to insert; if None, seq is appended

addSeqs(self, sequences, start=None)

 

Add multiple sequences to the alignment

Parameters:
  • sequences (list of sequence.Sequence) - Sequences to add
  • start (int) - The index at which to insert; if None, the sequences are appended

removeSeq(self, seq)

 

Remove a sequence from the alignment

Parameters:

removeSeqByIndex(self, index)

 

Remove the sequence at the specified index from the alignment

Parameters:
  • index (int) - The index of the sequence to remove

getResidueIndices(self, residues)

 

Returns the indices (in the alignment) of the specified residues

Parameters:
  • residues ()
Returns: list of (sequence index, residue index) tuples
A list of (int, int)

getSubalignment(self, start, end)

 

Return another alignment containing the elements within the specified start and end indices

Parameters:
  • start (int) - The index at which the subalignment should start
  • end (int) - The index at which the subalignment should end
Returns: BaseAlignment
An alignment corresponding to the start and end point specified

removeSubalignment(self, start, end)

 

Remove a block of the subalignment from the start to end points, including column locks in that region

Parameters:
  • start (int) - The start index of the columns to remove
  • end (int) - The end index of the columns to remove

insertSubalignment(self, aln, start)

 

Insert an alignment into the current alignment at the specified index

Parameters:
  • aln (BaseAlignment) - The alignment to insert
  • start (int) - The index at which to insert the alignment

appendSubalignment(self, aln)

 

Append an alignment to this one

Parameters:
  • aln (BaseAlignment or list of Sequence) - The alignment to append

replaceSubalignment(self, aln, start, end)

 

Replace a subsection of the alignment indicated by start and end indices with the specified alignment

Parameters:
  • aln (BaseAlignment) - The alignment to insert
  • start (int) - The start index of the subsection to replace
  • end (int) - The end index of the subsection to replace

getGaps(self)

 

Returns a list of gap indices lists

Returns: list
A list of lists of ints
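
To make the "list of lists of ints" shape concrete, here is a minimal sketch that treats each sequence as a plain string and uses "~" as the gap character, as the examples in this document do. The function name `get_gaps` and the string representation are assumptions for illustration only.

```python
GAP = "~"  # gap character used in this document's examples


def get_gaps(rows):
    """Return one list of gap indices per sequence (hypothetical helper)."""
    return [[i for i, ch in enumerate(row) if ch == GAP] for row in rows]


print(get_gaps(["A~C~", "ACGT", "~~A"]))
```

Each inner list lines up with the sequence at the same position, which is the shape the gap_indices parameters of the gap-editing methods expect.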

getTerminalGaps(self)

 

Returns the indices of terminal gaps in all the sequences

Returns: list
A list of lists of ints

removeGaps(self, gap_indices)

 
Parameters:
  • gap_indices (list of list of ints) - Indices of gaps to remove

removeAllGaps(self)

 

Removes all the gaps of the sequences in the alignment. This also unlocks all columns

_validateGapIndices(self, gap_indices)

 

Check for gaps in locked columns

Parameters:
  • gap_indices (list of lists) - A list of lists of gap indices

_adjustGapsInSubalignments(self, method_name, gap_indices)

 

Utility method that iterates through alternating locked and unlocked stretches of an alignment in reverse order and calls the specified method on the subregions, padding all but the last region.

Parameters:
  • method_name (str) - the requested operation
  • gap_indices (list of lists of integers) - indices of gaps used by method

    Suppose we have the following alignment:

        A C G T A P
        ~ G T A ~ P
        A ~ G T A G

    If we wish to insert a gap at 0 in the first sequence, at 1 in the second, and at 4 in the third, the result will look like this:

        ~ A C G T A P
        ~ ~ G T A ~ P
        A ~ G T ~ A G

    Suppose we lock column 3 (T A T). The alignment is now divided into three subalignments that need to be preserved:

        [A C G] [T] [A P]
        [~ G T] [A] [~ P]
        [A ~ G] [T] [A G]

    This complicates gap insertion: we now need to pad the region before the lock in the third sequence in order to preserve the regions intact:

        [~ A C G] [T] [A P]
        [~ ~ G T] [A] [~ P]
        [A ~ G ~] [T] [~ A G]

    Adding the padding at position 3 in the third sequence requires us to change the gap index of 4 to 5.

    To handle this, we iterate through the different alternating locked and unlocked regions of the alignment from back to front, padding all but the last region in order to preserve the regions intact.
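
The worked example above can be reproduced with a small sketch on plain strings. This is not the library's implementation: the function names are hypothetical, rows are assumed to be equal-length strings, gap indices refer to positions in the original rows, and the output is built region by region from front to back (which gives the same result here because each region is rebuilt from the original rows).

```python
GAP = "~"  # gap character used in this document's examples


def split_regions(length, locked):
    """Split columns into alternating locked/unlocked (start, end) stretches."""
    regions, start = [], 0
    for col in range(1, length + 1):
        if col == length or (col in locked) != (start in locked):
            regions.append((start, col))
            start = col
    return regions


def add_gaps_with_locks(rows, gap_indices, locked=frozenset()):
    """Insert gaps per sequence while keeping locked columns aligned."""
    regions = split_regions(len(rows[0]), locked)
    result = [""] * len(rows)
    for number, (start, end) in enumerate(regions):
        pieces = []
        for row, gaps in zip(rows, gap_indices):
            piece = row[start:end]
            # Insert right to left so that earlier indices stay valid.
            in_region = sorted((g for g in gaps if start <= g < end), reverse=True)
            for index in in_region:
                if index in locked:
                    raise ValueError("gap requested in locked column %d" % index)
                offset = index - start
                piece = piece[:offset] + GAP + piece[offset:]
            pieces.append(piece)
        # Pad every region except the last to a common width.
        if number < len(regions) - 1:
            width = max(len(p) for p in pieces)
            pieces = [p.ljust(width, GAP) for p in pieces]
        result = [done + p for done, p in zip(result, pieces)]
    return result


rows = ["ACGTAP", "~GTA~P", "A~GTAG"]
print(add_gaps_with_locks(rows, [[0], [1], [4]]))
print(add_gaps_with_locks(rows, [[0], [1], [4]], locked={3}))
```

With column 3 locked, the third row gains a padding gap before the lock, so the locked column stays aligned across all three rows.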

addGaps(self, gap_indices)

 

Adds gaps to the alignment

Parameters:
  • gap_indices - A list of lists of gap indices, one for each sequence in the alignment.

    Note that the length of the gap_indices list must match the number of sequences in the alignment.

setGaps(self, gap_indices)

 

Sets gaps on the alignment

Parameters:
  • gap_indices - A list of lists of gap indices, one for each sequence in the alignment.

padAlignment(cls, aln)
Class Method

 

Insert gaps into an alignment so that it forms a rectangular block

Parameters:
  • aln (schrodinger.application.msv.domain.Alignment) - An alignment to pad
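
A minimal sketch of padding to a rectangular block, using plain strings in place of sequence objects. It assumes that padding gaps are appended at the end of the shorter sequences; the function name is hypothetical.

```python
GAP = "~"  # gap character used in this document's examples


def pad_alignment(rows):
    """Right-pad every row with gaps to the length of the longest row."""
    width = max((len(row) for row in rows), default=0)
    return [row.ljust(width, GAP) for row in rows]


print(pad_alignment(["ACD", "A", "AC~D"]))
```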

_getGapOnlyColumns(self)

 

Returns a list of lists of indices for unlocked columns that contain only gaps

Returns: list
List of list of indices

minimizeAlignment(self)

 

Minimizes the alignment, i.e. removes all gaps from the gap-only columns.
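
The idea can be sketched on plain strings as follows. The helper names are hypothetical, rows are assumed to have equal length, and locked columns are skipped on the assumption that only unlocked gap-only columns are eligible for removal (as described for _getGapOnlyColumns).

```python
GAP = "~"  # gap character used in this document's examples


def gap_only_columns(rows, locked=frozenset()):
    """Return indices of unlocked columns that contain only gaps."""
    return [
        col
        for col, column in enumerate(zip(*rows))
        if col not in locked and all(ch == GAP for ch in column)
    ]


def minimize(rows, locked=frozenset()):
    """Drop every unlocked gap-only column from the rows."""
    drop = set(gap_only_columns(rows, locked))
    return [
        "".join(ch for col, ch in enumerate(row) if col not in drop)
        for row in rows
    ]


print(minimize(["A~C~", "G~T~"]))
print(minimize(["A~C~", "G~T~"], locked={1}))
```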

_insertGaps(cls, sequences, gap_boundaries)
Class Method

 

Inserts gaps represented as "gap boundaries" (indices of residues that immediately follow the gaps) into a list of sequences. The gap indices correspond to the first sequence in the list, but they come from a different alignment.

Parameters:
  • sequences (list of Sequence) - List of sequences to add gaps to.
  • gap_boundaries (list of int) - List of gap boundaries.

mergePairwiseAlignments(cls, sequence_pairs)
Class Method

 

Merges several pairwise alignments into one flat alignment while preserving relative residue positions. The original sequences are modified. After executing this function, all query sequences (first pair members) will be identical.

Example. Let's assume we have three pairwise query/template alignments:

    Q1: ACDEFGHI
    T1: ~~DEF~~~

    Q2: ~~~ACDEFGHI
    T2: TTT~~DE~~H~

    Q3: ACDEF~~GHI~
    T3: ACD~~PPGH~Y

Note the query sequence is identical in all cases, but it has gaps in different positions. After running mergePairwiseAlignments, the result is:

    Q1: ~~~ACDEF~~GHI
    T1: ~~~~~DEF~~~~~

    Q2: ~~~ACDEF~~GHI
    T2: TTT~~DE~~~~H~

    Q3: ~~~ACDEF~~GHI~
    T3: ~~~ACD~~PPGH~Y

Now the queries have gaps in identical positions, and aligned residues are in positions equivalent to those in the original alignments.
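
One way to reproduce the example above on plain strings is sketched below. It is not the library's algorithm: it assumes every query contains the same residues, takes the longest gap run before each query residue across all pairs as the target, leaves trailing gaps untouched, and inserts any missing gap columns in front of the existing run. All names are hypothetical, and unlike the documented method it returns new strings instead of modifying the inputs.

```python
GAP = "~"  # gap character used in this document's examples


def gap_runs(query):
    """Count the gaps directly before each residue (trailing gaps ignored)."""
    runs, count = [], 0
    for ch in query:
        if ch == GAP:
            count += 1
        else:
            runs.append(count)
            count = 0
    return runs


def merge_pairwise(pairs):
    """Give all queries the same gap layout, shifting templates along."""
    all_runs = [gap_runs(query) for query, _ in pairs]
    target = [max(counts) for counts in zip(*all_runs)]
    merged = []
    for (query, template), runs in zip(pairs, all_runs):
        new_q, new_t = [], []
        residue = 0        # index of the next query residue
        block_start = True  # True at the first column of a gap-run/residue block
        for q_ch, t_ch in zip(query, template):
            if block_start and residue < len(target):
                extra = target[residue] - runs[residue]
                new_q.append(GAP * extra)
                new_t.append(GAP * extra)
                block_start = False
            new_q.append(q_ch)
            new_t.append(t_ch)
            if q_ch != GAP:
                residue += 1
                block_start = True
        merged.append(("".join(new_q), "".join(new_t)))
    return merged


pairs = [
    ("ACDEFGHI", "~~DEF~~~"),
    ("~~~ACDEFGHI", "TTT~~DE~~H~"),
    ("ACDEF~~GHI~", "ACD~~PPGH~Y"),
]
for query, template in merge_pairwise(pairs):
    print(query, template)
```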

Parameters:
  • sequence_pairs (list of list of sequences) - List of [query, template] pairs.

getColumn(self, index, omit_gaps=False)

 

Returns single alignment column at index position. Optionally, filters out gaps if omit_gaps is True.

Parameters:
  • index (int) - The index in the alignment
  • omit_gaps (bool) - Whether to omit the gaps
Returns: list
Single alignment column at index position.
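
A short sketch of the column lookup on plain strings, with "~" as the gap character. The function name and the string representation are assumptions for illustration; the real method operates on the alignment's sequence objects.

```python
GAP = "~"  # gap character used in this document's examples


def get_column(rows, index, omit_gaps=False):
    """Return the entries of every row at the given column index."""
    column = [row[index] for row in rows]
    if omit_gaps:
        column = [ch for ch in column if ch != GAP]
    return column


rows = ["ACGTAP", "~GTA~P", "A~GTAG"]
print(get_column(rows, 0))
print(get_column(rows, 0, omit_gaps=True))
```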

columns(self, omit_gaps=False)

 

Returns a range of alignment columns or all columns if indices are not specified.

Parameters:
  • omit_gaps (bool) - Whether to omit gaps

lockedColumns(self)

 

Returns a set with indices of locked columns.

Returns: set
A set of indices

The set is a copy of our internal set, so modifying it has no effect on our private attribute

_setLockedColumns(self, columns, lock=True, reset=False)

 

Sets the columns to the specified lock state

Parameters:
  • columns (iterable) - an iterable of columns to set, specified by index
  • lock (bool) - Whether to lock or unlock columns
  • reset (bool) - Whether to reset the locks or add to existing ones

setLockedColumns(self, columns, lock=True, reset=False)

 

Sets the specified columns to the specified lock state

In undoable subclasses, this is an undoable method. We offer a private version of the method so that other undoable methods can call it without calling a second undoable method.

Parameters:
  • columns (iterable) - an iterable of columns to set, specified by index
  • lock (bool) - Whether to lock or unlock columns
  • reset (bool) - Whether to reset the locks or add to existing ones

alignmentLocked(self)

 

Whether every column in the alignment is locked

Returns: bool
Whether the alignment is locked

setAllLocks(self, lock=True)

 

Convenience method to set all the locks to the specified lock state at once

Parameters:
  • lock (bool) - Whether to lock or unlock all columns

_getRegions(self)

 

Returns a list of _Region objects containing information of locked and unlocked stretches of the alignment

Returns: list
A list of _Region objects

getIdentities(self, omit_gaps=True)

 

Returns an alignment-length list of bools indicating which columns have identical residues

Parameters:
  • omit_gaps (bool) - Whether gaps should be excluded from a column.
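
The effect of omit_gaps can be illustrated with a sketch on plain, equal-length strings. It assumes that a column counts as identical when exactly one distinct entry remains after optional gap filtering; that rule and the function name are assumptions, not the documented behavior.

```python
GAP = "~"  # gap character used in this document's examples


def get_identities(rows, omit_gaps=True):
    """Return one bool per column: True if all considered entries match."""
    identities = []
    for column in zip(*rows):
        entries = [ch for ch in column if not (omit_gaps and ch == GAP)]
        identities.append(len(set(entries)) == 1)
    return identities


rows = ["AC~T", "ACGT", "A~GA"]
print(get_identities(rows))
print(get_identities(rows, omit_gaps=False))
```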

getSimilarityScore(self, seq)

 

Returns a sequence length array of similarity scores against the reference sequence

Gaps in the sequences are coded as None values.

getFrequencies(self, exclude=None, consider_gaps=False)

 

Returns a dict mapping residue types to their frequency in the alignment

Parameters:
  • exclude (list) - A list of sequences to exclude
  • consider_gaps (bool) - Whether to consider gaps in calculating frequencies
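
A hedged sketch of a frequency calculation on plain strings follows. It assumes that "frequency" means the fraction of counted positions occupied by each residue type, and it excludes sequences by row index rather than by sequence object; both are simplifications for the example, and the function name is hypothetical.

```python
from collections import Counter

GAP = "~"  # gap character used in this document's examples


def get_frequencies(rows, exclude=(), consider_gaps=False):
    """Map each residue type to its fraction of the counted positions."""
    counts = Counter()
    for index, row in enumerate(rows):
        if index in exclude:
            continue
        counts.update(ch for ch in row if consider_gaps or ch != GAP)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


print(get_frequencies(["AA~C", "AC"]))
print(get_frequencies(["AA~C", "AC"], consider_gaps=True))
```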

findPattern(self, pattern)

 

Finds a specified PROSITE pattern in all sequences.

Parameters:
  • pattern (basestring) - The pattern to find
Returns: list
A list of pattern matches
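
PROSITE patterns map fairly directly onto regular expressions, so a pattern search can be sketched with the standard re module. The conversion below covers only the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors), works on ungapped strings, and reports non-overlapping matches as (sequence index, start, end) tuples. The function names and the result format are assumptions, not the library's API.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a regular expression."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("x", ".")                      # x: any residue
    regex = regex.replace("{", "[^").replace("}", "]")   # {P}: anything but P
    regex = regex.replace("(", "{").replace(")", "}")    # x(2,4): repeat count
    return regex.replace("<", "^").replace(">", "$")     # terminal anchors


def find_pattern(sequences, pattern):
    """Return (sequence index, start, end) for every non-overlapping match."""
    compiled = re.compile(prosite_to_regex(pattern))
    return [
        (index, match.start(), match.end())
        for index, seq in enumerate(sequences)
        for match in compiled.finditer(seq)
    ]


print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_pattern(["ANGSA", "NPSA", "GNATL"], "N-{P}-[ST]-{P}"))
```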

getRedundantSequences(self, value)

 

Returns the indices of sequences below a specified identity threshold value.

Returns: list of int
The indices of sequences in the alignment below specified identity threshold

Instance Variable Details

sequencesAboutToBeInserted(...)

 
A signal emitted before sequences are inserted into the alignment. Emitted with:
  • The index of the first sequence to be inserted
  • The index of the last sequence to be inserted

sequencesInserted(...)

 
A signal emitted after sequences are inserted into the alignment. Emitted with:
  • The index of the first sequence inserted
  • The index of the last sequence inserted

sequencesAboutToBeRemoved(...)

 
A signal emitted before sequences are removed from the alignment. Emitted with:
  • The index of the first sequence to be removed
  • The index of the last sequence to be removed

sequencesRemoved(...)

 
A signal emitted after sequences are removed from the alignment. Emitted with:
  • The index of the first sequence removed
  • The index of the last sequence removed

sequenceResiduesChanged(...)

 
A signal emitted after the contents of a sequence have changed. Note that this signal may also be emitted in response to a sequence changing length, as positions in the alignment may switch from blank to occupied or vice versa. Emitted with:
  • The modified sequence
  • The position of the first modified residue
  • The position of the last modified residue

sequenceNameChanged(...)

 
A signal emitted after a sequence has changed names. Emitted with:
  • The modified sequence

alignmentLengthAboutToChange(...)

 
A signal emitted before the alignment changes length. Emitted with:
  • The current length of the alignment
  • The new length of the alignment

alignmentLengthChanged(...)

 
A signal emitted after the alignment changes length. Emitted with:
  • The old length of the alignment
  • The current length of the alignment

Property Details

global_annotations

Returns the alignment-level annotations available for the alignment

Get Method:
unreachable.global_annotations(self) - Returns the alignment-level annotations available for the alignment

seq_annotations

Returns the sequence-level annotations available for sequences held in the alignment

Get Method:
unreachable.seq_annotations(self) - Returns the sequence-level annotations available for sequences held in the alignment

all_annotations

Returns a list of all annotation types in this alignment

Get Method:
unreachable.all_annotations(self) - Returns a list of all annotation types in this alignment