schrodinger.application.pathfinder.multiroute module

Functions to support multi-route enumeration (AKA “simple reaction enumeration” or “automated reaction enumeration”).

schrodinger.application.pathfinder.multiroute.get_mol_supplier(filename)

Read a reagent file. Calls are cached: if the function is called again with the same filename, the previous return value is reused for speed.

Parameters: filename (str) – filename
Returns: mol supplier
Return type: rdkit.Chem.rdmolfiles.SmilesMolSupplier
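
Example: the caching behaviour described above can be illustrated with functools.lru_cache. This is a minimal sketch, not the module's implementation; load_reagents is a hypothetical stand-in that reads SMILES lines with the standard library instead of building an RDKit supplier.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=None)
def load_reagents(filename):
    """Hypothetical stand-in for a cached reagent-file reader."""
    with open(filename) as fh:
        # One SMILES (plus optional title) per non-empty line.
        return tuple(line.split()[0] for line in fh if line.strip())


# Write a tiny reagent file to demonstrate the cache.
with tempfile.NamedTemporaryFile("w", suffix=".smi", delete=False) as tmp:
    tmp.write("CCO ethanol\nCCN ethylamine\n")

first = load_reagents(tmp.name)
second = load_reagents(tmp.name)  # served from the cache; the file is not re-read
os.remove(tmp.name)
```

With this kind of cache, repeated calls return the very same object, so any state held by that object is shared between callers.
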
schrodinger.application.pathfinder.multiroute.has_variable_reactants(route)

Check if the route has at least one variable reactant.

Parameters: route (schrodinger.application.pathfinder.route.RouteNode) – route to analyze
Returns: does the route have at least one variable reactant?
Return type: bool

schrodinger.application.pathfinder.multiroute.get_reagent_sources(route, libpath, core_atoms, core_neighbors)

Find the reagent sources for a route.

Parameters:
  • route (schrodinger.application.pathfinder.route.RouteNode) – route to analyze
  • libpath (list of str) – list of directories to prepend to the standard reagent library search path
  • core_atoms (set of int) – core atom indices
  • core_neighbors (set of int) – indices of atoms directly bound to the core
Returns:

reagent sources

Return type:

list of ReagentSource

schrodinger.application.pathfinder.multiroute.is_core_sm(sm, core_atoms)

Check if a starting material node corresponds to a core.

Parameters:
  • sm – starting material node to check
  • core_atoms (set of int) – core atom indices
Return type:

bool

schrodinger.application.pathfinder.multiroute.mol_from_labeled_smiles(smiles, core_neighbors)

Return a Mol in which sidechain atom mapping numbers are turned into isotopes, to keep track of them during the enumeration.

Parameters:
  • smiles (str) – reactant SMILES
  • core_neighbors (set of int) – indices of atoms directly bound to the core
Returns:

molecule

Return type:

rdkit.Chem.rdchem.Mol

schrodinger.application.pathfinder.multiroute.meta_sample(samples, dedup=True)

A generator that, on each cycle, picks a random element of samples and yields the next item from that sample. It stops only when every sample has raised StopIteration.

Each product gets annotated with properties representing the route that was used to make the molecule.

Parameters:
  • samples (list of iterator of Mol) – molecule samples
  • dedup (bool) – skip duplicate products
Returns:

molecule generator

Return type:

generator of rdkit.Chem.Mol
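
Example: the sampling loop described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: meta_sample_sketch is a hypothetical name, plain strings stand in for Mol objects, duplicates are detected by direct comparison, and the route annotation step is omitted.

```python
import random


def meta_sample_sketch(samples, dedup=True):
    """Yield items by drawing from a randomly chosen sample each cycle."""
    active = [iter(sample) for sample in samples]
    seen = set()
    while active:
        i = random.randrange(len(active))
        try:
            item = next(active[i])
        except StopIteration:
            # This sample is exhausted; stop drawing from it.
            del active[i]
            continue
        if dedup:
            if item in seen:
                continue
            seen.add(item)
        yield item


# Strings stand in for products; "CCN" is reachable by two routes.
routes = [["CCO", "CCN"], ["CCN", "CCC"], []]
unique_products = list(meta_sample_sketch(routes))
all_products = list(meta_sample_sketch(routes, dedup=False))
```

The order of the results varies from run to run, but the generator always terminates once every input iterator is exhausted.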

schrodinger.application.pathfinder.multiroute.clear_and_get_atom_labels(mol)

Clear the isotope atom labels in mol and return a list of tuples (index, ref_index) for atoms which were labeled as attachment atoms and came from frozen components.

Parameters: mol (rdkit.Chem.Mol) – molecule
Returns: list of (index, ref_index), both 1-based, where “index” refers to mol and “ref_index” is the index of the corresponding atom in the target molecule.
Return type: list of (int, int)

schrodinger.application.pathfinder.multiroute.measure_vectors(st, r1, c1, r2, c2)

Measure the distance and angle between two bond vectors. The distance is measured between atoms r1 and r2; the angle is between the c1-r1 and c2-r2 vectors.

Parameters:
  • st (schrodinger.structure.Structure) – structure to measure
  • r1 (int) – R-group attachment atom 1
  • c1 (int) – core atom 1
  • r2 (int) – R-group attachment atom 2
  • c2 (int) – core atom 2
Returns:

distance and angle

Return type:

float, float
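
Example: the geometry behind this measurement can be sketched with plain coordinates. This is an illustration only; measure_vectors_sketch is a hypothetical helper that takes a dict of atom index to (x, y, z) instead of a Structure, and it assumes the angle is reported in degrees.

```python
import math


def measure_vectors_sketch(coords, r1, c1, r2, c2):
    """Return (r1-r2 distance, angle in degrees between c1->r1 and c2->r2)."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    v1 = sub(coords[r1], coords[c1])
    v2 = sub(coords[r2], coords[c2])
    distance = math.dist(coords[r1], coords[r2])
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return distance, angle


coords = {
    1: (1.0, 0.0, 0.0),  # r1
    2: (0.0, 0.0, 0.0),  # c1
    3: (0.0, 2.0, 0.0),  # r2
    4: (0.0, 1.0, 0.0),  # c2
}
distance, angle = measure_vectors_sketch(coords, r1=1, c1=2, r2=3, c2=4)
```

Here the two bond vectors point along x and y, so they are perpendicular, and the attachment atoms are sqrt(5) apart.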

schrodinger.application.pathfinder.multiroute.st_from_mol(mol)

Convert a Mol into a Structure, with 3D coordinates but no added hydrogens. Missing stereochemistry is tolerated.

Parameters: mol (rdkit.Chem.rdchem.Mol) – molecule
Returns: Structure
Return type: schrodinger.structure.Structure

schrodinger.application.pathfinder.multiroute.apply_core_hopping_filters(products, ref_measurements, ch_dist_tol, ch_ang_tol)

Generator that filters products, excluding those in which any distance or angle measurement between side chains differs too much from the reference measurements. (If there are no reference measurements, all products pass.)

Parameters:
  • products (iterator of rdkit.Chem.Mol) – products to filter
  • ref_measurements (dict {(int, int): (float, float)}) – reference measurements dict; keys are pairs of atom indices; values are distance, angle tuples.
  • ch_dist_tol (float) – distance tolerance in Angstroms
  • ch_ang_tol (float) – angle tolerance in degrees
Returns:

molecules meeting the geometric criteria

Return type:

generator of rdkit.Chem.Mol
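
Example: the tolerance check can be sketched as below. This is a simplified illustration, not the module's implementation. It assumes each product already comes with its own measurements dict keyed like ref_measurements, and that a product fails when a deviation is strictly greater than the tolerance; both helper names are hypothetical.

```python
def passes_core_hopping(measurements, ref_measurements, dist_tol, ang_tol):
    """True if every reference pair is within tolerance in `measurements`."""
    for pair, (ref_dist, ref_ang) in ref_measurements.items():
        dist, ang = measurements[pair]
        if abs(dist - ref_dist) > dist_tol or abs(ang - ref_ang) > ang_tol:
            return False
    return True  # also reached when ref_measurements is empty


def filter_products_sketch(products, ref_measurements, dist_tol, ang_tol):
    """Yield the names of products whose geometry matches the reference."""
    for name, measurements in products:
        if passes_core_hopping(measurements, ref_measurements, dist_tol, ang_tol):
            yield name


ref = {(3, 7): (5.0, 120.0)}
candidates = [
    ("close", {(3, 7): (5.4, 110.0)}),
    ("too_far", {(3, 7): (6.5, 120.0)}),
    ("bent", {(3, 7): (5.0, 90.0)}),
]
kept = list(filter_products_sketch(candidates, ref, dist_tol=1.0, ang_tol=15.0))
unfiltered = list(filter_products_sketch(candidates, {}, dist_tol=1.0, ang_tol=15.0))
```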

schrodinger.application.pathfinder.multiroute.is_core(graph, free_component, frozen_components)

Check if the free_component subgraph should be considered a core, meaning that it is connected to more than one of the frozen_components.

Parameters:
  • graph (networkx.classes.graph.Graph) – molecular graph
  • free_component (set of int) – possible core atom indices
  • frozen_components (list of set of int) – possible sidechain atom indexes
Returns:

is it a core?

Return type:

bool
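
Example: the connectivity test can be sketched without networkx by using a plain adjacency dict (atom index to set of bonded atom indices). This is a hypothetical stand-in for illustration, not the module's code.

```python
def is_core_sketch(adjacency, free_component, frozen_components):
    """True if free_component is bonded to more than one frozen component."""
    touched = 0
    for frozen in frozen_components:
        if any(adjacency[atom] & frozen for atom in free_component):
            touched += 1
    return touched > 1


# Chain 1-2-3-4-5: atoms 1 and 5 are frozen, atoms 2-4 are free.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
linker_is_core = is_core_sketch(adjacency, {2, 3, 4}, [{1}, {5}])
# With only one frozen side chain, the free region is not a core.
tail_is_core = is_core_sketch(adjacency, {2, 3, 4, 5}, [{1}])
```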

schrodinger.application.pathfinder.multiroute.apply_similarity_filters(products, args)

Implement the -sim_keep_percent and -sim_discard_percent functionality.

Parameters:
  • products (iterator of rdkit.Chem.Mol) – molecules to filter
  • args (argparse.Namespace) – command-line arguments
Returns:

filtered products and number of products to keep

Return type:

generator of rdkit.Chem.Mol, int

schrodinger.application.pathfinder.multiroute.analyze_frozen_atoms(mol, frozen_atoms)

Examine the molecular graph of the target molecule to partition it, based on the set of frozen atoms, into free regions and frozen regions. Also determine which free region is the core, if any.

A core in this context is a contiguous set of non-frozen atoms which is adjacent to two or more sets of frozen atoms (the side chains).

Jobs with two or more cores will abort immediately.

When there is a core, also measure the distances and angles between all the pairs of vectors leading from the core to the side chains. The resulting dict has pairs of atoms as keys, and (distance, angle) tuples as values.

In addition to the set of core atoms, a set of “core neighbors” is also returned. These are the non-core atoms that are directly connected to the core.

Parameters:
  • mol (rdkit.Chem.rdchem.Mol) – target molecule
  • frozen_atoms (set of int) – frozen atom indices
Returns:

measurements, core atoms, core neighbors

Return type:

dict {(int, int): (float, float)}, set of int, set of int
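
Example: the partitioning step can be sketched on a plain adjacency dict. This is an illustration under assumptions, not the module's implementation: the helper names are hypothetical, the geometric measurements are omitted, and raising ValueError for multiple cores is only a stand-in for the abort described above.

```python
def components(adjacency, atoms):
    """Split `atoms` into connected components, ignoring all other atoms."""
    remaining, found = set(atoms), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            for nbr in adjacency[stack.pop()] & remaining:
                remaining.discard(nbr)
                comp.add(nbr)
                stack.append(nbr)
        found.append(comp)
    return found


def find_core_sketch(adjacency, frozen_atoms):
    """Return (core_atoms, core_neighbors); both empty if there is no core."""
    free = components(adjacency, set(adjacency) - frozen_atoms)
    frozen = components(adjacency, frozen_atoms)
    cores = []
    for comp in free:
        # Atoms bonded to this free region but not part of it.
        nbrs = set().union(*(adjacency[a] for a in comp)) - comp
        if sum(1 for f in frozen if f & nbrs) >= 2:
            cores.append((comp, nbrs))
    if len(cores) > 1:
        raise ValueError("more than one core")  # assumed error type
    return cores[0] if cores else (set(), set())


# Chain 1-2-3-4-5-6 with atoms 1, 2 and 6 frozen.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
core_atoms, core_neighbors = find_core_sketch(adjacency, {1, 2, 6})
no_core = find_core_sketch(adjacency, {1, 2})
```

In the first call the free region 3-4-5 sits between two frozen side chains, so it is the core and atoms 2 and 6 are its neighbors; in the second call only one side chain is frozen, so no core is found.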

schrodinger.application.pathfinder.multiroute.generate_samples(target_mol, reactions_dict, core_atoms, core_neighbors, *, depth=None, frozen_atoms=frozenset(), max_routes=100, max_per_route=1000, libpath=None, bond_reactions=None)

Perform a retrosynthetic analysis of target_mol, generate all routes and pick up to max_routes at random, and finally turn each route into a RandomSampleIterator.

Parameters:
  • target_mol (rdkit.Chem.Mol) – target molecule
  • reactions_dict (dict {str: Reaction}) – reactions to use for the analysis
  • core_atoms (set of int) – core atom indices
  • core_neighbors (set of int) – indices of atoms directly bound to the core
  • depth (int or NoneType) – maximum depth
  • frozen_atoms (set of int) – indexes (1-based) of atoms to keep in the product
  • max_per_route (int) – maximum number of products per route
  • max_routes (int) – maximum number of routes to sample
  • libpath (list of str) – directories to search for reactant files
  • bond_reactions (dict {(int, int): set of str}) – dict specifying which reactions are allowed to break certain bonds. Keys are tuples of two ints (sorted atom indexes); values are sets of reaction names.
Returns:

samples

Return type:

list of RandomSampleIterator

schrodinger.application.pathfinder.multiroute.generate_mols(target_mol, reactions_dict, *, dedup=True, depth=None, descriptors='MolLogP, MolWt, NumChiralCenters, NumHAcceptors, NumHDonors, TPSA', frozen_atoms=frozenset(), libpath=None, max_per_route=1000, max_routes=100, no_core_hopping=False, product_property_filter_file=None, product_smarts_filter_file=None, ref_mols=None, ch_dist_tol=1.0, ch_ang_tol=15.0, bond_reactions=None, **unused_args)

A generator of products following the multiroute enumeration protocol.

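Parameters:
  • target_mol (rdkit.Chem.Mol) – target molecule
  • reactions_dict (dict {str: Reaction}) – reactions to use for the analysis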
  • dedup (bool) – skip duplicate products (using SMILES for comparison)
  • depth (int or NoneType) – analysis depth (if None, increasing depths will be attempted until enough routes are found)
  • descriptors (list of str) – names of RDKit descriptors to compute for each product
  • frozen_atoms (set of int) – indexes (1-based) of atoms to keep in the product
  • libpath (list of str) – directories to search for reactant files
  • max_per_route (int) – maximum number of products per route
  • max_routes (int) – maximum number of routes to sample
  • no_core_hopping (bool) – don’t use the special core hopping mode even when possible
  • product_property_filter_file (str) – name of JSON file with product property filters
  • product_smarts_filter_file (str) – name of .cflt file with SMARTS patterns
  • ref_mols (list of Mol) – reference molecules for similarity calculations
  • ch_dist_tol (float) – core-hopping distance tolerance in Angstroms (maximum allowed change in the distance between side chains, relative to the input structure)
  • ch_ang_tol (float) – core-hopping angle tolerance in degrees (maximum allowed change in the bond vector angle for side chains, relative to the input structure)
  • bond_reactions (dict {(int, int): set of str}) – dict specifying which reactions are allowed to break certain bonds. Keys are tuples of two ints (sorted atom indexes); values are sets of reaction names.
Return type:

generator of Mol

schrodinger.application.pathfinder.multiroute.write_products(products, filename, max_products)

Write out up to max_products from a mol iterator to a file.

Parameters:
  • products (iterator of Mol) – molecules to write
  • filename (str) – filename
  • max_products (int) – maximum number of structures to write
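
Example: capping the output of a lazy (possibly endless) product iterator can be sketched with itertools.islice. This is an illustration only: write_products_sketch is a hypothetical helper, strings stand in for molecules, one SMILES is written per line, and the returned count is an addition for the example rather than documented behaviour.

```python
import itertools
import os
import tempfile


def write_products_sketch(products, filename, max_products):
    """Write at most max_products lines; return how many were written."""
    count = 0
    with open(filename, "w") as fh:
        # islice stops pulling from the iterator once the cap is reached.
        for smiles in itertools.islice(products, max_products):
            fh.write(smiles + "\n")
            count += 1
    return count


# An endless generator of alcohols: CO, CCO, CCCO, ...
endless = ("C" * (i + 1) + "O" for i in itertools.count())
path = os.path.join(tempfile.mkdtemp(), "products.smi")
written = write_products_sketch(endless, path, max_products=3)
```
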
schrodinger.application.pathfinder.multiroute.generate_structures(st, frozen_atoms, *, depth=2, max_routes=20, libpath=None, **kwargs)

Simplified Structure-based API for multiroute enumeration. For advanced use, additional keyword arguments are passed through to generate_mols().

Parameters:
  • st (schrodinger.structure.Structure) – input structure
  • frozen_atoms (set of int) – frozen atom indices (1-based)
  • depth (int or NoneType) – analysis depth (if None, increasing depths will be attempted until enough routes are found)
  • max_routes (int) – maximum number of routes to sample
  • libpath (list of str) – list of directories to prepend to the standard reagent library search path
Returns:

generator of 3D structures

Return type:

generator of schrodinger.structure.Structure