schrodinger.structutils.rgroup_enumerate module

Module for R-group enumeration.

schrodinger.structutils.rgroup_enumerate.logger = <Logger rgroup_enumerate (INFO)>

RGroup properties:

  • atom_index: index of the atom bound to the core (aka the “leaving atom”)
  • source_index: index of the R-group source that will replace this group
  • leaving_atoms: list of all the atoms in the leaving group (by index)
  • staying_atom: index of the core atom bound to the leaving group
class schrodinger.structutils.rgroup_enumerate.RGroup(atom_index, source_index, leaving_atoms, staying_atom, bond_order)

Bases: tuple

RGroupSource properties:

  • source_index: index of the R-group source that will replace this group
  • length: number of atom indices divided by concentration
  • rgroups_indices: list of R group indices for each source
__contains__

Return key in self.

__init__

Initialize self. See help(type(self)) for accurate signature.

__len__

Return len(self).

atom_index

Alias for field number 0

bond_order

Alias for field number 4

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

leaving_atoms

Alias for field number 2

source_index

Alias for field number 1

staying_atom

Alias for field number 3
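
The field aliases above follow the usual named-tuple pattern. The sketch below rebuilds an equivalent type locally with `collections.namedtuple` to show how named and positional access line up; the `RGroup` defined here is a stand-in rather than the class imported from the module, and the atom indices are made-up values.

```python
from collections import namedtuple

# Local stand-in mirroring the documented field order; not imported from schrodinger.
RGroup = namedtuple(
    "RGroup",
    ["atom_index", "source_index", "leaving_atoms", "staying_atom", "bond_order"],
)

# Hypothetical values: atom 7 leaves together with atoms 8-10, core atom 3 stays.
rg = RGroup(
    atom_index=7,
    source_index=0,
    leaving_atoms=[7, 8, 9, 10],
    staying_atom=3,
    bond_order=1,
)

# Named access and positional access refer to the same fields.
fields_match = (
    rg.atom_index == rg[0]
    and rg.source_index == rg[1]
    and rg.leaving_atoms == rg[2]
    and rg.staying_atom == rg[3]
    and rg.bond_order == rg[4]
)
print(fields_match, len(rg))
```

Because the type is a tuple subclass, the inherited `count`, `index`, `__len__` and `__contains__` methods behave exactly as they do on a plain tuple.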

class schrodinger.structutils.rgroup_enumerate.RGroupSource(source_index, length, rgroups_indices)

Bases: tuple

__contains__

Return key in self.

__init__

Initialize self. See help(type(self)) for accurate signature.

__len__

Return len(self).

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

length

Alias for field number 1

rgroups_indices

Alias for field number 2

source_index

Alias for field number 0

exception schrodinger.structutils.rgroup_enumerate.RgroupError

Bases: Exception

Exception class for errors specific to this module, which the caller may want to present to the user as a simple error message, as opposed to a traceback. This is meant for “user errors”, as opposed to bugs; for example, when an input structure doesn’t fulfill the requirements.

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.structutils.rgroup_enumerate.BondOrderMismatch

Bases: schrodinger.structutils.rgroup_enumerate.RgroupError

A specific kind of error that we’ll ignore to allow libraries that include R-groups with different bond orders.

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
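
A minimal sketch of how a caller might treat the two exception types differently: skip library entries that raise the bond-order error, and let any other user error propagate. The two classes are redefined locally so the example runs on its own, and `check_bond_order` and `filter_library` are hypothetical helpers, not functions of this module.

```python
class RgroupError(Exception):
    """Local stand-in for the module's user-facing error."""


class BondOrderMismatch(RgroupError):
    """Local stand-in for the error that callers may choose to ignore."""


def check_bond_order(found, wanted):
    # Hypothetical validator: complain when the attachment bond order differs.
    if found != wanted:
        raise BondOrderMismatch(f"bond order {found}, expected {wanted}")


def filter_library(bond_orders, wanted):
    """Keep entries with the wanted bond order and count the skipped ones."""
    kept, skipped = [], 0
    for order in bond_orders:
        try:
            check_bond_order(order, wanted)
        except BondOrderMismatch:
            # Ignored on purpose: mixed libraries are allowed.
            skipped += 1
            continue
        kept.append(order)
    return kept, skipped


kept, skipped = filter_library([1, 2, 1, 3], wanted=1)
print(kept, skipped)
```

Catching the subclass first keeps the handling of other `RgroupError` cases (for example, presenting a short message to the user) separate from the mismatch case.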

class schrodinger.structutils.rgroup_enumerate.RgroupEnumerator(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=None)

Bases: object

Enumerate a structure using R-group sources.

A source is a sequence with an iterable of Structure as its first element, followed by one or more core atom indices where the side chains from the source should be inserted.

RgroupEnumerator objects are iterable. Example:

sources = [
    (StructureReader('r1.maegz'), 4, 12),
    (StructureReader('r2.maegz'), 8),
]
for prod_st in RgroupEnumerator(core_st, sources):
    ...

will use the first reader to replace atoms 4 and 12 in a homo fashion (meaning that, for a given product, the groups attached to atoms 4 and 12 are always the same), in combination with the structures from the second reader for atom 8.
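
To make the homo behavior concrete, here is a small sketch that uses plain strings in place of Structure objects. It assumes the enumeration is a Cartesian product over sources, with each chosen R-group applied to every attachment atom of its source; `enumerate_assignments` is a hypothetical helper, and the order in which the real enumerator yields products may differ.

```python
from itertools import product


def enumerate_assignments(sources):
    """Yield one {core_atom_index: r_group} dict per product."""
    # First element of each source is the iterable of R-groups.
    pools = [list(src[0]) for src in sources]
    for combo in product(*pools):
        assignment = {}
        for src, rgroup in zip(sources, combo):
            # Remaining elements are the core atom indices for this source.
            for atom_index in src[1:]:
                assignment[atom_index] = rgroup
        yield assignment


# Strings stand in for the structures read from r1.maegz and r2.maegz.
sources = [
    (["Me", "Et"], 4, 12),
    (["F", "Cl", "Br"], 8),
]
assignments = list(enumerate_assignments(sources))
for assignment in assignments:
    print(assignment)
```

Two R-groups in the first source and three in the second give six assignments, and in each of them atoms 4 and 12 carry the same group.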

The generated structures have the title of the core structure and the title of each of the R-groups, encoded in CSV format. For ease of parsing, this information is also stored as separate properties: i_rge_num_r_groups has the number of R groups, and the title of each goes in properties r_rge_R1, r_rge_R2, etc.

As an option, all CT properties from the R groups can be copied to each product molecule. These properties have the original name prefixed with <type-char>_rge_R<index>_; for example, r_i_glide_gscore for the first R group becomes r_rge_R1_r_i_glide_gscore.
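
The renaming rule can be sketched with plain dictionaries standing in for CT properties. This assumes the type character is the first character of the original property name, which is consistent with the example above; `prefixed_name` and `copy_rgroup_properties` are hypothetical helpers, not part of this module.

```python
def prefixed_name(prop_name, r_index):
    """Build the product-side name for a property copied from R-group r_index."""
    type_char = prop_name[0]  # 'r', 'i', 's' or 'b' in the usual naming scheme
    return f"{type_char}_rge_R{r_index}_{prop_name}"


def copy_rgroup_properties(product_props, rgroup_props, r_index):
    """Copy every R-group property into product_props under its prefixed name."""
    for name, value in rgroup_props.items():
        product_props[prefixed_name(name, r_index)] = value
    return product_props


# Made-up property values for illustration only.
product_props = copy_rgroup_properties(
    {"s_m_title": "core"},
    {"r_i_glide_gscore": -7.5, "s_m_title": "R-group A"},
    r_index=1,
)
print(product_props)
```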

The structures in each R-group source should each have one dummy atom (symbol ‘’, atomic number zero).

The user of the class can request only a slice of the full set of combinations to be yielded, by providing the optional ‘start’ and ‘stop’ constructor arguments. These follow the standard Python slicing convention.
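
As an illustration of the slicing convention, the sketch below splits a set of combinations into two contiguous slices with `itertools.islice`, the way subjobs might divide the work. It is an assumption that the enumerator's slicing is equivalent to this; `sliced_combinations` is a hypothetical helper operating on plain lists.

```python
from itertools import islice, product


def sliced_combinations(pools, start=0, stop=None):
    """Return combinations [start:stop] of the Cartesian product of pools."""
    return islice(product(*pools), start, stop)


pools = [["a", "b"], ["x", "y", "z"]]
everything = list(sliced_combinations(pools))

# Two slices that together cover the full set exactly once.
first_half = list(sliced_combinations(pools, start=0, stop=3))
second_half = list(sliced_combinations(pools, start=3, stop=None))
print(first_half, second_half)
```

As with Python slices, `start` is inclusive, `stop` is exclusive, and `stop=None` means "to the end".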

If the core structure came from an SDF file with R-group labels (“M RGP” lines), the attachment atoms don’t need to be specified; the labels from the file can be used implicitly.

__init__(core_st, sources, optimize_sidechains=True, deduplicate=True, start=0, stop=None, copy_properties=False, enumerate_cistrans=True, yield_renum_maps=False, concentrations=None)

Initialize an enumerator for a given core structure and specification of rgroup sources.

Parameters:
  • core_st (schrodinger.structure.Structure) – core structure
  • sources (list of list) – side chain sources. See class description for details.
  • optimize_sidechains (bool) – if true, generate 3D coordinates for the side chain atoms using Fast3D. The input coordinates will only be used for determining stereochemistry. If false, position the side chains using rigid rotation and translation (and an arbitrary torsional angle around the new bond).
  • deduplicate (bool) – use unique SMILES to identify and reject duplicate products
  • start (int) – beginning of results slice (used by subjobs)
  • stop (int or None) – end of results slice (used by subjobs)
  • copy_properties (bool) – if true, copy all CT properties from each R-group to the constructed molecule.
  • enumerate_cistrans (bool) – if True (default), emit both cis and trans isomers for double-bonded R-groups.
  • yield_renum_maps (bool) – if True then on each iteration yield not only the product structure but also the relevant old-to-new atom index map
  • concentrations (list of float or None) – List of concentrations for each source. Must have the same length as sources
attachSidechains(sidechains)

Attach the sidechains to the core structure and return the resulting structure and index map.

Parameters:sidechains (list of schrodinger.structure.Structure) – list of sidechains. Should have the same length as the number of attachment atoms in the core.
Yield:product structures and index maps
Ytype:schrodinger.structure.Structure, dict
combinations()

Return the number of combinations that will be generated for each tuple of R-groups. That is, combinations due to occupancy of the various attachment points when all the concentrations are not 1.0.

schrodinger.structutils.rgroup_enumerate.find_rgroup_from_smarts(st, smarts, leaving_atom_pos, staying_atom_pos, bond_order=None)

Find the various ways in which a structure can be split into “R-group” and “functional group” using a SMARTS pattern.

The SMARTS pattern must consist of at least two atoms. Two of the atoms, identified by their position in the SMARTS string, are used to define the bond to be broken between the R group and the “leaving group”. If the two atoms are not directly connected, the bond leading from the leaving atom to the staying atom is broken.

For example, consider the structure c1ccccc1C(=O)O and the SMARTS pattern *C(=O)O. With leaving_atom_pos=2, staying_atom_pos=1, the entire carboxylate is removed, producing the R-group c1ccccc1*. With leaving_atom_pos=4, staying_atom_pos=2, only the terminal O is removed, leading to the R-group c1ccccc1C(=O)*. (The asterisks are shown here only to highlight the bond that was broken.)

The return value is a list of tuples, where the first element is the attachment atom index and the second is a list of the indexes of the atoms comprising the R-group. In the first example above, if we pretend there are no hydrogens, the return value might be [(7, [1,2,3,4,5,6])].

Notes: 1) ring bonds can’t be broken because they don’t split the structure in two; 2) if bond_order is not None, skip matches having the attachment bond of different order.

Parameters:
  • st (schrodinger.structure.Structure) – structure to analyze
  • smarts (str) – SMARTS pattern describing the functional group
  • leaving_atom_pos (index) – position of the leaving atom in the SMARTS pattern (1-based)
  • staying_atom_pos (index) – position of the attachment atom in the SMARTS pattern (1-based)
  • bond_order (int or NoneType) – If None (default), has no effect. Otherwise skip matches having R-group attachment bond of different order.
Returns:

list of tuples (attachment atom, list of R-group atom indexes). If no matches satisfied all the requirements, the list may be empty. May include duplicate R-groups (R-group in this context is the substructure made of the newly found R-group atoms).

Return type:

list
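
The splitting step can be sketched on a bare connectivity graph, leaving SMARTS matching aside. The code assumes the leaving and staying atoms are directly bonded, treats everything reachable from the staying atom (without crossing the broken bond) as the R-group, and rejects ring bonds. `split_at_bond` and the adjacency dictionary are illustrative and not part of this module.

```python
from collections import deque


def split_at_bond(adjacency, leaving_atom, staying_atom):
    """Return (leaving_atom, sorted R-group atoms), or None for a ring bond."""
    seen = {staying_atom}
    queue = deque([staying_atom])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            # Do not cross the bond that is being broken.
            if {atom, nbr} == {leaving_atom, staying_atom}:
                continue
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    if leaving_atom in seen:
        # Still connected by another path: breaking a ring bond splits nothing.
        return None
    return leaving_atom, sorted(seen)


# Heavy atoms of benzoic acid: ring 1-6, carboxyl C 7, oxygens 8 and 9.
adjacency = {
    1: [2, 6], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5, 1, 7],
    7: [6, 8, 9], 8: [7], 9: [7],
}
whole_carboxyl = split_at_bond(adjacency, leaving_atom=7, staying_atom=6)
terminal_oxygen = split_at_bond(adjacency, leaving_atom=9, staying_atom=7)
ring_bond = split_at_bond(adjacency, leaving_atom=1, staying_atom=2)
print(whole_carboxyl, terminal_oxygen, ring_bond)
```

With hydrogens ignored, the first call reproduces the shape of the return value quoted in the description.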

schrodinger.structutils.rgroup_enumerate.find_staying_atom(st, leaving_atom)

Given a picked “leaving” atom, determine which of the atoms it is bonded to is part of the larger molecule - the “staying” atom. All other atoms bound to the leaving atom are considered to be part of the leaving group.

Parameters:leaving_atom (schrodinger.structure._StructureAtom) – atom which defines the start of the leaving group
Returns:“staying atom”: the core atom bound to the leaving atom
Return type:schrodinger.structure._StructureAtom
schrodinger.structutils.rgroup_enumerate.get_dummy_filter()

Return a filter which has as criteria all the descriptors that can be computed by this module, along with their suggested default limiters (ranges).

Return type:schrodinger.ui.qt.filter_dialog_dir.filter_core.Filter
schrodinger.structutils.rgroup_enumerate.add_descriptors(st, filter_obj)

Add the descriptors required by a filter to a given Structure.

schrodinger.structutils.rgroup_enumerate.list_to_csv(fields)

Convert a list into a CSV string representation.

Parameters:fields (list) – list to convert
Return type:str
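
One plausible way to produce and parse such a string with the standard `csv` module is sketched below. The exact quoting and line-terminator behavior of `list_to_csv` is not stated here, so `to_csv_line` and `from_csv_line` are hypothetical helpers that only illustrate the idea.

```python
import csv
import io


def to_csv_line(fields):
    """Encode a list of strings as a single CSV line without a trailing newline."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow(fields)
    return buf.getvalue()


def from_csv_line(line):
    """Decode a single CSV line back into a list of strings."""
    return next(csv.reader([line]))


# Made-up titles; the last one contains a comma and therefore gets quoted.
titles = ["core", "R 1", "a,b"]
line = to_csv_line(titles)
print(line)
print(from_csv_line(line))
```

Going through a CSV writer rather than `",".join(...)` keeps titles that contain commas or quotes recoverable.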
schrodinger.structutils.rgroup_enumerate.add_amide_constraints(st, frozen_set)

For amide bonds which have one atom frozen and the other not, add the necessary ct properties to tell fast3d to constrain the amides to the trans conformation.

Parameters:
  • st (schrodinger.structure.Structure) – structure, to be modified in place
  • frozen_set (set of int) – set of frozen atoms, by atom index
Returns:

names of properties that were added

Return type:

list of str

schrodinger.structutils.rgroup_enumerate.get_metals_and_neighbors(st)

Returns indices of metal atoms, and atoms bonded to them.

Parameters:st – Structure.
Returns:Set of atom indices in st.
Return type:set(int)
schrodinger.structutils.rgroup_enumerate.get_last_EZ_property_index(st)

Return the maximum index of the s_st_EZ_<index> properties of st. If there are no such properties, return 0.

Return type:int
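
The lookup can be sketched over a plain list of property names. `last_ez_index` is a hypothetical helper that takes the names directly instead of a Structure, and it assumes the index is a plain decimal suffix.

```python
import re

# Matches s_st_EZ_<index> and captures the numeric index.
EZ_PATTERN = re.compile(r"^s_st_EZ_(\d+)$")


def last_ez_index(property_names):
    """Return the largest <index> among s_st_EZ_<index> names, or 0 if none."""
    matches = (EZ_PATTERN.match(name) for name in property_names)
    indices = [int(m.group(1)) for m in matches if m]
    return max(indices, default=0)


names = ["s_m_title", "s_st_EZ_1", "s_st_EZ_12", "s_st_EZ_3"]
print(last_ez_index(names), last_ez_index(["s_m_title"]))
```

Comparing the indices as integers matters: a string comparison would rank `3` above `12`.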
schrodinger.structutils.rgroup_enumerate.get_sources_from_r_labels(st, iters, prop='i_rdkit__MolFileRLabel')

Given a Structure and a list of iterables, return the “sources” data structure needed by RgroupEnumerator. The structure must have (some) atoms with the specified property; the values of this property must be in the range [1, len(iters)] and all the values in that range must be represented at least once. If this condition is not met, raise a ValueError.

Parameters:
  • st (schrodinger.structure.Structure) – core Structure
  • iters (list) – list of iterables of structures
  • prop (str) – name of the atom property holding the R-group labels. The default is what comes from reading an SD file with “M RGP” fields using RDKit.
Returns:

sources data structure. See RgroupEnumerator for details.

Return type:

list of list
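
The label-to-source mapping and its validation can be sketched with a dictionary of atom labels in place of a Structure. `sources_from_labels` is a hypothetical helper; it assumes attachment atoms are listed in ascending atom-index order, which the description above does not specify.

```python
def sources_from_labels(labels_by_atom, iters):
    """Build [[iterable, atom_index, ...], ...] from {atom_index: R-label}."""
    sources = [[it] for it in iters]
    for atom_index in sorted(labels_by_atom):
        label = labels_by_atom[atom_index]
        if not 1 <= label <= len(iters):
            raise ValueError(f"label {label} outside [1, {len(iters)}]")
        # Label n refers to the n-th iterable (labels are 1-based).
        sources[label - 1].append(atom_index)
    missing = [i + 1 for i, src in enumerate(sources) if len(src) == 1]
    if missing:
        raise ValueError(f"labels not used by any atom: {missing}")
    return sources


# Strings stand in for structures; atoms 4 and 12 carry R1, atom 8 carries R2.
r1, r2 = ["Me", "Et"], ["F", "Cl"]
sources = sources_from_labels({4: 1, 12: 1, 8: 2}, [r1, r2])
print(sources)
```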

schrodinger.structutils.rgroup_enumerate.convert_attachment_point(struct, bond_order=None)

Converts attachment point from methyl to dummy atom. If r-group fragment is ‘Null’ returns False, otherwise returns True. Null r-group has atom with atomic number -2 and growname ‘rpc1’.

Parameters:
  • struct (structure.Structure) – structure object
  • bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.
Returns:

True if conversion succeeded and False otherwise.

Return type:

bool

schrodinger.structutils.rgroup_enumerate.get_attachment_point(st)

Identifies attachment point (dummy atom with a single neighbor) in the provided structure.

Parameters:st (schrodinger.structure.Structure) – Structure
Returns:Index of the dummy atom and order of the bond that joins the dummy to the rest.
Return type:(int, int)
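
A sketch of the search on a toy representation, with atomic numbers and bond orders held in dictionaries rather than in a Structure. `find_attachment_point` is a hypothetical helper, and returning None when nothing is found is an assumption; the behavior of the real function in that case is not described here.

```python
def find_attachment_point(atomic_numbers, bonds):
    """Return (dummy_index, bond_order) for a dummy atom with one neighbor.

    atomic_numbers maps atom index to atomic number; bonds maps an
    (atom, atom) pair to the bond order.
    """
    for index, number in sorted(atomic_numbers.items()):
        if number != 0:
            continue  # only dummy atoms (atomic number zero) qualify
        orders = [order for pair, order in bonds.items() if index in pair]
        if len(orders) == 1:
            return index, orders[0]
    return None


# A methylene-like fragment: dummy atom 1 double-bonded to carbon 2.
atomic_numbers = {1: 0, 2: 6, 3: 1, 4: 1}
bonds = {(1, 2): 2, (2, 3): 1, (2, 4): 1}
print(find_attachment_point(atomic_numbers, bonds))
```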
schrodinger.structutils.rgroup_enumerate.check_attachment_point(struct, bond_order=None)

Checks that provided structure contains attachment point.

Parameters:bond_order (int or NoneType) – If None, has no effect. Otherwise return False if attachment bond is of different order.
Returns:True if attachment point was found and False otherwise.
Return type:bool
schrodinger.structutils.rgroup_enumerate.create_fragment_structure(st, rgroup_data)

Creates an r-group fragment structure from a given structure and a list of atoms that should be included in the r-group.

Parameters:
  • st (structure.Structure) – structure object
  • rgroup_data (tuple) – r-group data that contains index of attachment atom and indices of R-group fragment atoms.
Returns:

fragment structure

Return type:

structure.Structure