schrodinger.protein.annotation module

Implementation of Multiple Sequence Viewer Annotation class.

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.protein.annotation.LIGAND_CONTACTS

alias of schrodinger.protein.annotation.LIGAND_CONTACT

class schrodinger.protein.annotation.AntibodyCDRLabel

Bases: enum.Enum

An enumeration.

H1 = 5
H2 = 6
H3 = 7
L1 = 2
L2 = 3
L3 = 4
NotCDR = 1
class schrodinger.protein.annotation.AntibodyCDRScheme

Bases: enum.Enum

An enumeration.

AHo = 5
Chothia = 1
EnhancedChothia = 4
IMGT = 3
Kabat = 2
class schrodinger.protein.annotation.AntibodyCDR(label, start, end)

Bases: tuple

end

Alias for field number 2

label

Alias for field number 0

start

Alias for field number 1
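
The field numbering above can be illustrated with a stand-in tuple. This is a minimal sketch that rebuilds the type with collections.namedtuple instead of importing schrodinger; the string label and the positions are made up for illustration.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented tuple.
AntibodyCDR = namedtuple("AntibodyCDR", ["label", "start", "end"])

# "H1" is a placeholder for an AntibodyCDRLabel member; 25 and 32 are invented.
cdr = AntibodyCDR("H1", 25, 32)

# Fields are reachable by name or by the field numbers listed above.
label, start, end = cdr
```

Because the type is a tuple, instances are immutable and can be unpacked positionally.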

class schrodinger.protein.annotation.Consensus

Bases: enum.Enum

An enumeration.

not_conserved = ' '
fully_conserved = '*'
strongly_conserved = ':'
weakly_conserved = '.'
class schrodinger.protein.annotation.SequenceAnnotations(seq)

Bases: PyQt5.QtCore.QObject

Knows how to annotate a sequence

Annotations can be set on the sequence as a whole or on individual sequence elements. When an attribute is accessed on a SequenceAnnotations object, it is looked up on the object first; if it is not found there, it is treated as a per-element annotation. If the elements in the sequence lack the attribute, an AttributeError is raised.
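
The lookup order described here can be sketched with \_\_getattr\_\_, which Python calls only after normal attribute lookup fails. The classes below are hypothetical stand-ins, not the real implementation, and returning a list of per-element values is an assumption.

```python
class Element:
    """Hypothetical sequence element carrying one per-element value."""

    def __init__(self, hydrophobicity):
        self.hydrophobicity = hydrophobicity


class AnnotationsSketch:
    """Illustrative stand-in, not the real SequenceAnnotations."""

    def __init__(self, seq):
        self.sequence = seq
        self.title = "demo"  # a sequence-level attribute

    def __getattr__(self, name):
        # Reached only when normal lookup on the object has failed.
        if name == "sequence":
            # Avoid infinite recursion if 'sequence' is not set yet.
            raise AttributeError(name)
        # getattr raises AttributeError if an element lacks the attribute.
        return [getattr(elem, name) for elem in self.sequence]


ann = AnnotationsSketch([Element(1.5), Element(-0.5)])
```

Here ann.title resolves on the object itself, while ann.hydrophobicity falls through to the elements.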

Variables: titleChanged (QtCore.pyqtSignal) – A signal emitted after an annotation’s title (row header) changes.
titleChanged
sequence
class schrodinger.protein.annotation.ProteinSequenceAnnotations(seq)

Bases: schrodinger.protein.annotation.SequenceAnnotations

Knows how to annotate a ProteinSequence

Variables:
  • CLOSE_LIG_DIST (int) – The distance (in angstroms) from a ligand to consider a residue as a “close contact”
  • FAR_LIG_DIST (int) – The distance (in angstroms) from a ligand to consider a residue as a “far contact”
FAR_LIG_DIST = 6
CLOSE_LIG_DIST = 3
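
As a rough illustration of how two cutoffs like these can bin a residue-to-ligand distance, here is a minimal sketch. The function name, the returned labels, and the inclusive comparisons are all assumptions; the source only gives the two constants.

```python
CLOSE_LIG_DIST = 3  # angstroms, value documented above
FAR_LIG_DIST = 6    # angstroms, value documented above


def classify_contact(distance):
    """Bin a distance in angstroms into 'close', 'far', or None."""
    # Assumption: both cutoffs are inclusive.
    if distance <= CLOSE_LIG_DIST:
        return "close"
    if distance <= FAR_LIG_DIST:
        return "far"
    return None
```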
class ANNOTATION_TYPES

Bases: object

antibody_cdr = 18
b_factor = 13
beta_strand_propensity = 5
disulfide_bonds = 3
exposure_tendency = 8
helix_propensity = 4
helix_termination_tendency = 7
hydrophobicity = 11
isoelectric_point = 12
ligand_contacts = 17
rescode = 2
resnum = 1
sasa = 19
secondary_structure = 16
side_chain_chem = 10
steric_group = 9
turn_propensity = 6
window_hydrophobicity = 14
window_isoelectric_point = 15
RES_PROPENSITY_ANNOTATIONS = {<ANNOTATION_TYPES.exposure_tendency: 8>, <ANNOTATION_TYPES.beta_strand_propensity: 5>, <ANNOTATION_TYPES.turn_propensity: 6>, <ANNOTATION_TYPES.helix_propensity: 4>, <ANNOTATION_TYPES.steric_group: 9>, <ANNOTATION_TYPES.side_chain_chem: 10>, <ANNOTATION_TYPES.helix_termination_tendency: 7>}
window_hydrophobicity

This attribute is a cached property: its value is computed on first access and then cached on the instance.

Use del to delete the currently cached value and force a recalculation on the next access. See the tests for examples.

The caching decorator is based on code that is Copyright (c) Django Software Foundation
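
The caching behaviour can be sketched with the standard library's functools.cached_property (Python 3.8+), which follows the same pattern. The class and attribute names below are hypothetical, and the real module uses its own decorator rather than the standard library one.

```python
from functools import cached_property


class WindowAnnotationsSketch:
    """Hypothetical class; only the caching pattern is the point."""

    def __init__(self, values):
        self.values = values
        self.calls = 0  # counts how often the value is recomputed

    @cached_property
    def window_mean(self):
        self.calls += 1
        return sum(self.values) / len(self.values)


ann = WindowAnnotationsSketch([1.0, 3.0])
first = ann.window_mean      # computed and cached
ann.values = [2.0, 6.0]
stale = ann.window_mean      # still the cached value
del ann.window_mean          # drop the cache
fresh = ann.window_mean      # recomputed from the new values
```

Note that del raises AttributeError if nothing has been cached yet, so an invalidation helper may prefer to guard against that case.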

invalidateWindowHydrophobicity()

Invalidate the cached window hydrophobicity data. Note that this method is also called from the sequence when the window size changes.

window_isoelectric_point

This attribute is a cached property: its value is computed on first access and then cached on the instance.

Use del to delete the currently cached value and force a recalculation on the next access. See the tests for examples.

The caching decorator is based on code that is Copyright (c) Django Software Foundation

invalidateWindowIsoelectricPoint()

Invalidate the cached window isoelectric point data. Note that this method is also called from the sequence when the window size changes.

sasa

This attribute is a cached property: its value is computed on first access and then cached on the instance.

Use del to delete the currently cached value and force a recalculation on the next access. See the tests for examples.

The caching decorator is based on code that is Copyright (c) Django Software Foundation

getAntibodyCDR(col, scheme)

Returns the antibody CDR information for the residue at index col in the sequence under the given antibody CDR numbering scheme.

Parameters:
  • col (int) – index into the sequence
  • scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use
Returns:

Antibody CDR label, start, and end positions

Return type:

AntibodyCDR, which is a named tuple of (AntibodyCDRLabel, int, int) if col is in a CDR, otherwise (AntibodyCDRLabel.NotCDR, None, None)
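
The return contract can be sketched as a lookup over a list of CDR ranges. The types below are stand-ins rather than imports from schrodinger, the helper name cdr_at is hypothetical, and treating start and end as inclusive is an assumption.

```python
from collections import namedtuple
from enum import Enum


class AntibodyCDRLabel(Enum):
    # Only the two members this sketch needs; values match the enum above.
    NotCDR = 1
    H1 = 5


AntibodyCDR = namedtuple("AntibodyCDR", ["label", "start", "end"])

# Placeholder returned when the column is outside every CDR.
NOT_CDR = AntibodyCDR(AntibodyCDRLabel.NotCDR, None, None)


def cdr_at(col, cdrs):
    """Return the CDR containing col, or the NotCDR placeholder."""
    for cdr in cdrs:
        # Assumption: start and end are both inclusive indices.
        if cdr.start <= col <= cdr.end:
            return cdr
    return NOT_CDR


cdrs = [AntibodyCDR(AntibodyCDRLabel.H1, 25, 32)]  # made-up positions
```

Callers can then test the label field against AntibodyCDRLabel.NotCDR before using start and end.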

getAntibodyCDRs(scheme)

Returns a list of antibody CDR information for the entire sequence.

Parameters: scheme (AntibodyCDRScheme) – The antibody CDR numbering scheme to use
Returns: A list of antibody CDR labels with their start and end positions
Return type: list(AntibodyCDR)
isAntibodyChain()
Returns: Whether the sequence described is an antibody chain
Return type: bool
getSparseRescodes(modulo)
ligand_contacts
ligands
onStructureChanged()
resetAnnotation(ann)

Force a reset of an annotation’s cache.

getSSBondPartner(index)

Return the residue’s intra-sequence disulfide bond partner, if any.

If the residue is not involved in a disulfide bond, its partner has been deleted, or its partner is in another sequence, it will return None.

Parameters: index (int) – Index of the residue to check
Returns: the other Residue in the disulfide bond or None
Return type: schrodinger.protein.residue.Residue or None
inscode
resnum
class schrodinger.protein.annotation.NucleicAcidSequenceAnnotations(seq)

Bases: schrodinger.protein.annotation.ProteinSequenceAnnotations

class schrodinger.protein.annotation.ProteinAlignmentAnnotations(aln)

Bases: object

Knows how to annotate an alignment (a collection of aligned sequences)

class ANNOTATION_TYPES

Bases: object

consensus_freq = 6
consensus_seq = 5
consensus_symbols = 4
indices = 1
mean_hydrophobicity = 2
mean_isoelectric_point = 3
alignment
indices

A numbering of all the column indices in an alignment

mean_hydrophobicity

Returns: A list of floats giving the per-column average hydrophobicity of the residues in the alignment

mean_isoelectric_point

Returns: A list of floats giving the per-column average isoelectric point of the residues in the alignment
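
A per-column average of this kind can be sketched as follows. This is an illustration only: the function name, the gap symbol, skipping gaps, returning None for all-gap columns, and the toy scale values are all assumptions, not the library's behaviour or real hydrophobicity data.

```python
def column_means(rows, scale, gap="-"):
    """Average scale[residue] down each column of equal-length rows."""
    means = []
    for column in zip(*rows):
        # Assumption: gaps do not contribute to the average.
        values = [scale[res] for res in column if res != gap]
        means.append(sum(values) / len(values) if values else None)
    return means


# Toy per-residue values, invented for the example.
toy_scale = {"A": 2.0, "G": 4.0}
means = column_means(["AG-", "A--"], toy_scale)
```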

consensus_seq

Consensus sequence of the alignment. If more than one residue is tied for the highest frequency in a column, all of them are kept.

Returns: consensus sequence
Return type: list(list(schrodinger.protein.residue.Residue))
consensus_freq

Frequencies of the consensus residue in each alignment column. Gapped positions are counted when calculating frequencies.
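
The tie handling of consensus_seq and the gap counting of consensus_freq can be sketched for a single column of one-letter codes. The function name and the gap symbol are hypothetical, and reading "gapped positions are counted" as "gaps stay in the denominator" is an assumption.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol for this sketch


def column_consensus(column):
    """Return (most frequent residues, their frequency) for one column."""
    counts = Counter(res for res in column if res != GAP)
    if not counts:
        return [], 0.0
    top = max(counts.values())
    # Keep every residue tied for the highest count.
    winners = sorted(res for res, n in counts.items() if n == top)
    # Assumption: gaps count toward the column total.
    return winners, top / len(column)
```

With this reading, a heavily gapped column yields a low consensus frequency even when all of its residues agree.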

consensus_symbols

Consensus symbols in the alignment based on pre-defined residue sets, same as in ClustalW

Returns: consensus symbols for each alignment position
Return type: A list of ConsensusSymbol enums.

Calculates normalized frequencies of individual amino acids per alignment position, and an overall estimate of column composition diversity (‘bits’).

Schneider TD, Stephens RM (1990). “Sequence Logos: A New Way to Display Consensus Sequences”. Nucleic Acids Res 18 (20): 6097–6100. doi:10.1093/nar/18.20.6097

Returns: the list of bits and frequencies of the residues in a position in decreasing order.
Return type: list of tuples, each tuple consists of a float (bits), and a list of tuples (<short_code>, freq) in decreasing order of frequency.
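
For one column, the information content from the cited paper can be sketched as log2 of the alphabet size minus the Shannon entropy of the observed frequencies. This sketch omits the paper's small-sample correction, excludes gaps from the frequencies, and uses a hypothetical function name; whether the library makes the same choices is not stated in the source.

```python
import math
from collections import Counter


def column_logo(column, alphabet_size=20, gap="-"):
    """Return (bits, [(code, freq), ...]) for one alignment column."""
    # Assumption: gaps are excluded before normalizing.
    counts = Counter(res for res in column if res != gap)
    total = sum(counts.values())
    if total == 0:
        return 0.0, []
    freqs = {res: n / total for res, n in counts.items()}
    # Shannon entropy of the observed residue distribution.
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    bits = math.log2(alphabet_size) - entropy
    # Decreasing frequency, ties broken alphabetically.
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return bits, ranked
```

A fully conserved column reaches the maximum of log2(alphabet_size) bits, and a column with every residue equally likely approaches zero.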