Package schrodinger :: Package application :: Package phase :: Module hypothesis_binding_modes

Module hypothesis_binding_modes


Clusters actives and hypotheses into possible binding modes. Actives are
represented by bit strings encoding the hypotheses they match, and
hypotheses are represented by bit strings encoding the actives they match.
Tanimoto similarities between bit strings are computed, and hierarchical,
agglomerative clustering is performed on both actives and hypotheses. The
presence of consistent groupings of actives and hypotheses may indicate the
existence of multiple binding modes.

For example, if there are 10 hypotheses and 8 actives, an idealized clustered
bit matrix for 2 clusters might look like this:

                        Actives                Order

                  H  1 1 1 1 0 0 0 0             7
                  y  1 1 1 1 0 0 0 0             1
                  p  1 1 1 1 0 0 0 0             4
                  o  1 1 1 1 0 0 0 0             0
                  t  1 1 1 1 0 0 0 0             9
                  h  1 1 1 1 0 0 0 0   Cut 0 --- 2
                  e  0 0 0 0 1 1 1 1             6
                  s  0 0 0 0 1 1 1 1             5
                  e  0 0 0 0 1 1 1 1             8
                  s  0 0 0 0 1 1 1 1   Cut 1 --- 3

              Order  3 5 0 2 7 1 6 4
                           |       |
                           |       |
                         Cut 0   Cut 1
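
The relationship between the two kinds of bit strings can be sketched in a few lines. This is only an illustration using plain Python lists (the module itself works with numpy arrays); the names bit_matrix, hypo_bits and active_bits are made up for the example. Rows of the matrix are the hypothesis bit strings, and columns are the active bit strings.

```python
# Idealized 10 x 8 bit matrix from the diagram above:
# rows are hypotheses, columns are actives.
bit_matrix = [[1, 1, 1, 1, 0, 0, 0, 0]] * 6 + [[0, 0, 0, 0, 1, 1, 1, 1]] * 4

# A hypothesis bit string is a row: which actives that hypothesis matches.
hypo_bits = bit_matrix

# An active bit string is a column: which hypotheses match that active.
active_bits = [list(column) for column in zip(*bit_matrix)]

print(len(hypo_bits), len(hypo_bits[0]))      # 10 8
print(len(active_bits), len(active_bits[0]))  # 8 10
print(active_bits[0])                         # [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
```

Clustering the rows groups hypotheses, and clustering the columns groups actives, which is why both orderings and both sets of cut points appear in the diagram.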

Example Usage:

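    from schrodinger.application.phase import hypothesis
    from schrodinger.application.phase import hypothesis_binding_modes as hbm

    hypos = hypothesis.extract_hypotheses(phypo_path)
    results = hbm.calculate_binding_modes(hypos, 2)
    cluster_matrix, active_IDs, hypo_IDs, actives_cut, hypo_cut = results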

Copyright Schrodinger LLC, All Rights Reserved.

Functions
 
calculate_binding_modes(hypotheses, num_modes)
Clusters actives and hypotheses into possible binding modes.
list of str
_get_active_IDs(hypothesis)
Extracts all PHASE_LIGAND_NAME properties from the reference ligand or any actives in the current PhpHypoAdaptor object.
dict
_actives_bit_dict(hypotheses)
Creates bit matrix dictionary from set of hypotheses, where each key is an active ID, and corresponding values are numpy arrays indicating if that hypothesis (array index) includes the given active (1 if true, 0 otherwise).
bool, str
_validate_bit_matrix(bit_matrix, num_modes)
Validates the size and composition of the bit matrix based on the number of proposed binding modes.
float
_tanimoto_coefficient(bitrow_i, bitrow_j)
Computes Tanimoto coefficient between two bit arrays.
numpy.array
_distance_matrix(bit_matrix)
Computes distance matrix to use for clustering, where values are given as (1 - Tanimoto coefficient_ij) between rows i and j of the matrix.
list, list
_perform_clustering(bit_matrix, num_modes)
Performs clustering using the PhpHiCluster class on a given bit matrix for an expected number of clustering modes.
Variables
  __package__ = 'schrodinger.application.phase'
Function Details

calculate_binding_modes(hypotheses, num_modes)

 

Clusters actives and hypotheses into possible binding modes. Returns:
- clustered bit matrix for actives (columns) and hypotheses (rows)
- active IDs in column order
- hypothesis IDs in row order
- 0-based cluster cutoff indices for actives clusters
- 0-based cluster cutoff indices for hypotheses clusters

Parameters:
  • hypotheses (list of phase.PhpHypoAdaptor) - list of Phase hypotheses
  • num_modes (int) - proposed number of binding modes (i.e. clusters)
Returns: tuple, tuple, tuple, tuple, tuple
cluster bit matrix (number of hypos x number of actives), active IDs, hypothesis IDs, active cut indices, hypo cut indices
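
One way to use the returned cut indices is to split the ordered IDs into one group per binding mode. The sketch below assumes that each cut index is the 0-based position of the last member of a cluster, which is how the diagram in the module description reads; the source does not state this explicitly. The helper split_by_cuts and the sample values are hypothetical and not part of the module.

```python
def split_by_cuts(ordered_ids, cut_indices):
    """Split cluster-ordered IDs into one list per cluster.

    Assumption: each cut index is the 0-based position of the last
    member of a cluster.
    """
    groups = []
    start = 0
    for cut in cut_indices:
        groups.append(list(ordered_ids[start:cut + 1]))
        start = cut + 1
    return groups


# Hypothetical values standing in for the active_IDs and actives_cut
# tuples returned by calculate_binding_modes.
active_IDs = ("mol_3", "mol_5", "mol_0", "mol_2",
              "mol_7", "mol_1", "mol_6", "mol_4")
actives_cut = (3, 7)

modes = split_by_cuts(active_IDs, actives_cut)
for number, members in enumerate(modes):
    print(number, members)
```

The same helper would apply to the hypothesis IDs and hypothesis cut indices.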

_get_active_IDs(hypothesis)

 

Extracts all PHASE_LIGAND_NAME properties from the reference ligand or any actives in the current PhpHypoAdaptor object.

Parameters:
  • hypothesis (PhpHypoAdaptor) - hypothesis from which to extract active IDs
Returns: list of str
list of ligand names for stored actives (expected mol_%d)

_actives_bit_dict(hypotheses)

 

Creates bit matrix dictionary from set of hypotheses, where each key is an active ID, and corresponding values are numpy arrays indicating if that hypothesis (array index) includes the given active (1 if true, 0 otherwise).

Parameters:
  • hypotheses (list of PhpHypoAdaptor) - list of hypotheses with actives
Returns: dict
bit matrix dictionary where each value is a numpy array of 1/0
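
The shape of the dictionary described above can be illustrated without a Schrodinger installation. In this sketch each hypothesis is represented by a plain list of the active IDs it contains, a hypothetical stand-in for a PhpHypoAdaptor, and plain lists of ints are used in place of numpy arrays. The function name mirrors the private helper but is not its implementation.

```python
def actives_bit_dict(hypo_actives):
    """Map each active ID to a list of bits, one bit per hypothesis.

    hypo_actives[i] holds the active IDs found in hypothesis i
    (a stand-in for reading them from a PhpHypoAdaptor).
    """
    all_ids = sorted({active for actives in hypo_actives for active in actives})
    return {
        active_id: [1 if active_id in actives else 0 for actives in hypo_actives]
        for active_id in all_ids
    }


# Three hypothetical hypotheses and the actives each one includes.
hypo_actives = [["mol_1", "mol_2"], ["mol_2"], ["mol_1", "mol_3"]]
bits = actives_bit_dict(hypo_actives)
print(bits["mol_1"])  # [1, 0, 1]
```

Index i of every value refers to hypothesis i, so stacking the values column-wise gives the hypotheses-by-actives bit matrix.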

_validate_bit_matrix(bit_matrix, num_modes)

 

Validates the size and composition of the bit matrix based on the number of proposed binding modes.

Parameters:
  • bit_matrix (numpy.array) - Bit matrix indicating hypothesis/active intersections
  • num_modes (int) - proposed number of binding modes
Returns: bool, str
whether the bit matrix is valid, error message

_tanimoto_coefficient(bitrow_i, bitrow_j)

 

Computes Tanimoto coefficient between two bit arrays.

Parameters:
  • bitrow_i (list of ints) - vector of bits
  • bitrow_j (list of ints) - vector of bits
Returns: float
Tanimoto coefficient
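
For reference, the Tanimoto coefficient of two bit vectors is the number of positions set in both divided by the number of positions set in either. The following is a minimal sketch of that standard definition, not the module's own code; returning 0.0 for two all-zero vectors is an assumption made here to avoid division by zero.

```python
def tanimoto_coefficient(bitrow_i, bitrow_j):
    """Return |i AND j| / |i OR j| for two equal-length bit vectors."""
    if len(bitrow_i) != len(bitrow_j):
        raise ValueError("bit vectors must have equal length")
    both = sum(1 for a, b in zip(bitrow_i, bitrow_j) if a and b)
    either = sum(1 for a, b in zip(bitrow_i, bitrow_j) if a or b)
    # Assumption: two all-zero vectors are given a similarity of 0.0.
    return both / either if either else 0.0


print(tanimoto_coefficient([1, 1, 1, 0], [1, 1, 0, 1]))  # 0.5
```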

_distance_matrix(bit_matrix)

 

Computes distance matrix to use for clustering, where values are given as (1 - Tanimoto coefficient_ij) between rows i and j of the matrix.

Parameters:
  • bit_matrix (numpy.array) - Bit matrix indicating row/column intersections
Returns: numpy.array
2D numpy array of bit distances between all row pairs
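
The distance described above, 1 minus the Tanimoto coefficient for every pair of rows, can be sketched as follows. Nested lists are used instead of a numpy array so the example has no dependencies, and the helper names are illustrative rather than the module's.

```python
def tanimoto(row_a, row_b):
    """Tanimoto similarity of two equal-length bit vectors."""
    both = sum(a & b for a, b in zip(row_a, row_b))
    either = sum(a | b for a, b in zip(row_a, row_b))
    # Assumption: two all-zero rows are given a similarity of 0.0.
    return both / either if either else 0.0


def distance_matrix(bit_matrix):
    """Symmetric matrix of (1 - Tanimoto) distances between all row pairs."""
    size = len(bit_matrix)
    dist = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            d = 1.0 - tanimoto(bit_matrix[i], bit_matrix[j])
            dist[i][j] = dist[j][i] = d
    return dist


rows = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
dist = distance_matrix(rows)
print(dist[1][2])  # 0.75
```

Passing the transposed bit matrix gives the distances between columns instead of rows.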

_perform_clustering(bit_matrix, num_modes)

 

Performs clustering using the PhpHiCluster class on a given bit matrix for an expected number of clustering modes.

Parameters:
  • bit_matrix (numpy.array) - Bit matrix indicating hypothesis/active intersections
  • num_modes (int) - proposed number of binding modes
Returns: list, list
indices sorted by clustering order, indices for cutoff points
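
To show what the two return values could look like, here is a naive agglomerative clustering on a precomputed distance matrix. It is a stand-in for illustration only: it does not use PhpHiCluster, average linkage is an assumption, and the cut indices are taken to be the 0-based position of the last member of each cluster, as the diagram in the module description suggests.

```python
def cluster_order_and_cuts(dist, num_modes):
    """Merge the closest clusters until num_modes clusters remain.

    Returns the row indices grouped by cluster and, for each cluster,
    the 0-based position of its last member in that ordering.
    """
    if num_modes < 1:
        raise ValueError("num_modes must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > num_modes:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                average = sum(pairs) / len(pairs)  # average linkage (assumed)
                if best is None or average < best[0]:
                    best = (average, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    order = [index for cluster in clusters for index in cluster]
    cuts = []
    position = -1
    for cluster in clusters:
        position += len(cluster)
        cuts.append(position)
    return order, cuts


# Hypothetical distances: rows 0 and 1 are close, rows 2 and 3 are close.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
order, cuts = cluster_order_and_cuts(dist, 2)
print(order, cuts)  # [0, 1, 2, 3] [1, 3]
```

The actual ordering and cut points produced by the module depend on the PhpHiCluster implementation and may differ from this sketch.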