schrodinger.protein.align module

exception schrodinger.protein.align.AlignmentException

Bases: Exception

class schrodinger.protein.align.AbstractAligner(gap_open_penalty=1, gap_extend_penalty=0, sub_matrix=None, direct_scores=False, constraints=None, merge_all=False, ss_constraints=False)

Bases: object

Base class of objects that can perform an alignment

gap_open_penalty
gap_extend_penalty
sub_matrix

Get or set the sub_matrix.

When setting, it will be converted to a numpy array.

run(aln)

Aligns the sequences in an alignment using the parameters supplied on init

Subclasses need to override this default implementation.

Parameters: aln (schrodinger.protein.alignment.BaseAlignment) – The alignment to align
class schrodinger.protein.align.BasicAligner(gap_open_penalty=1, gap_extend_penalty=0, sub_matrix=None, direct_scores=False, constraints=None, merge_all=False, ss_constraints=False)

Bases: schrodinger.protein.align.AbstractAligner

Aligns sequences by simply adding gaps to the ends of sequences to make them the same length

run(aln)

Aligns the sequences in an alignment using the parameters supplied on init

Subclasses need to override this default implementation.

Parameters: aln (schrodinger.protein.alignment.BaseAlignment) – The alignment to align
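
The end-padding behaviour described for BasicAligner can be sketched in plain Python. This is an illustration only, not the library's implementation: sequences are modelled as strings, and the function name pad_to_equal_length and the gap character "~" are assumptions made for this example.

```python
GAP = "~"  # hypothetical gap character; the library's own gap symbol may differ


def pad_to_equal_length(seqs, gap=GAP):
    """Append gap characters so every sequence is as long as the longest one."""
    # default=0 keeps the function well defined for an empty input list
    target = max((len(s) for s in seqs), default=0)
    return [s + gap * (target - len(s)) for s in seqs]


padded = pad_to_equal_length(["MKTAYIA", "MKT", "MKTAY"])
print(padded)
```

No residues are moved or compared; only trailing gaps are added, so the result is an alignment solely in the sense that all rows have equal length.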
class schrodinger.protein.align.RescodeAligner(gap_open_penalty=1, gap_extend_penalty=0, sub_matrix=None, direct_scores=False, constraints=None, merge_all=False, ss_constraints=False)

Bases: schrodinger.protein.align.AbstractAligner

Aligns sequences by rescode

run(aln)

Aligns the sequences in an alignment using the parameters supplied on init

Subclasses need to override this default implementation.

Parameters: aln (schrodinger.protein.alignment.BaseAlignment) – The alignment to align
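
A rough sketch of what aligning by rescode could look like, assuming a rescode is a residue number plus an optional insertion code. The data representation (a dict per sequence), the function name align_by_rescode, and the "-" gap character are all hypothetical and are not taken from the library.

```python
def align_by_rescode(seqs, gap="-"):
    """Place residues that share a rescode in the same column.

    Each sequence is a dict mapping (residue_number, insertion_code) to a
    one-letter residue code. This representation is an assumption made for
    the sketch.
    """
    # One column per distinct rescode, ordered by number and then insertion code
    columns = sorted(set().union(*(seq.keys() for seq in seqs)))
    # A sequence that lacks a given rescode gets a gap in that column
    rows = ["".join(seq.get(col, gap) for col in columns) for seq in seqs]
    return columns, rows


seq_a = {(1, ""): "M", (2, ""): "K", (3, ""): "T"}
seq_b = {(2, ""): "K", (3, ""): "S", (3, "A"): "G", (4, ""): "L"}
columns, rows = align_by_rescode([seq_a, seq_b])
print(rows)
```

Residue identity plays no role here: columns are determined by numbering alone, which is why K at position 2 lines up in both rows while T and S share column 3.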
class schrodinger.protein.align.PairwiseAligner(**kwargs)

Bases: schrodinger.protein.align.AbstractAligner

Implementation of the Needleman-Wunsch global alignment algorithm for pairwise sequence alignment with affine gap penalties. Additional features:

  1. ability to merge new sequence with existing alignment,
  2. ability to penalize gaps in secondary structure elements,
  3. ability to use custom substitution matrix generated from a family of proteins or provided by the user.
Variables:
  • CONSTRAINT_SCORE (float) – Reward for keeping constrained residues aligned.
  • RES_MATCH_BONUS (float) – Reward for aligning matching residues. Used by default if a substitution matrix is not specified.
  • RES_MISMATCH_PENALTY (float) – Penalty for aligning differing residues. Used by default if a substitution matrix is not specified.

CONSTRAINT_SCORE = 1000000.0
RES_MATCH_BONUS = 1.0
RES_MISMATCH_PENALTY = 1.0
run(aln, seqs_to_align=None)
Parameters:
  • aln (alignment.Alignment) – The alignment containing sequences to align.
  • seqs_to_align (list(sequence.Sequence)) – The sequences in aln to align against the reference sequence of aln. If None, defaults to the first non-reference sequence in aln (i.e. aln[1]).
Raises:
  • ValueError – If seqs_to_align contains a sequence not found in aln.
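
The documented handling of seqs_to_align (default to aln[1], reject sequences that are not in aln) can be illustrated as follows. This sketch models the alignment as a plain list whose first element is the reference; the helper name resolve_seqs_to_align is hypothetical and does not exist in the library.

```python
def resolve_seqs_to_align(aln, seqs_to_align=None):
    """Return the sequences to align against the reference, aln[0].

    aln is modelled as a plain list; this is an assumption for the sketch.
    """
    if seqs_to_align is None:
        # Documented default: the first non-reference sequence
        return [aln[1]]
    missing = [seq for seq in seqs_to_align if seq not in aln]
    if missing:
        raise ValueError(f"Sequences not found in alignment: {missing}")
    return list(seqs_to_align)


aln = ["REFSEQ", "SEQONE", "SEQTWO"]
print(resolve_seqs_to_align(aln))
print(resolve_seqs_to_align(aln, ["SEQTWO"]))
```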

getAlignmentScore()

Get the score of the alignment. Found by taking the highest value in the scoring matrix.

Returns: Score of the pairwise alignment.
Return type: float
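
For orientation, a minimal pure-Python sketch of pairwise scoring with affine gap penalties (the three-matrix Gotoh formulation of Needleman-Wunsch). It is not the library's implementation. It assumes a gap of length k costs gap_open + (k - 1) * gap_extend, uses a match bonus and mismatch penalty of 1.0 in place of a substitution matrix, and reports the score at the final cell. The library may use different conventions, and constraints, merging and secondary-structure penalties are not modelled. The function name global_affine_score is hypothetical.

```python
NEG_INF = float("-inf")


def global_affine_score(a, b, match=1.0, mismatch=1.0,
                        gap_open=1.0, gap_extend=0.0):
    """Score the best end-to-end alignment of a and b with affine gaps."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap (gap in b)
    # Y: b[j-1] aligned to a gap (gap in a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -gap_open - (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = -gap_open - (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else -mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a gap costs gap_open; continuing one costs gap_extend
            X[i][j] = max(M[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open)
    # Global alignment: both sequences must be consumed completely
    return max(M[n][m], X[n][m], Y[n][m])


print(global_affine_score("ACGT", "AGT"))
```

With the defaults used here (open 1, extend 0), a long gap costs the same as a short one, so the aligner prefers one contiguous gap over several scattered ones.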
class schrodinger.protein.align.ClustalAligner(gap_open_penalty=1, gap_extend_penalty=0, sub_matrix=None, direct_scores=False, constraints=None, merge_all=False, ss_constraints=False)

Bases: schrodinger.protein.align.AbstractAligner

Aligns sequences using the Clustal alignment algorithm.

class schrodinger.protein.align.SuperpositionAligner(gap_open_penalty=None, gap_extend_penalty=None)

Bases: schrodinger.protein.align.PairwiseAligner

Align structured sequences based on their superposition.

run(aln, seqs_to_align=None)

Align sequences based on structure superposition to the reference.

Parameters:
  • aln (alignment.Alignment) – The alignment containing sequences to align.
  • seqs_to_align (list of sequence.Sequence or NoneType) – The sequences in aln to align against the reference sequence of aln. If None, defaults to the first non-reference sequence in aln (i.e. aln[1]).
Raises:
  • ValueError – If seqs_to_align contains a sequence not found in aln.
  • ValueError – If the reference sequence or any of seqs_to_align don’t have an associated structure.