Structures
The Structure class is the fundamental class in our modules, and will probably be used in all of the code you write. Structure objects can be single molecules or groups of molecules. They provide access to atoms, bonds, properties, and a number of substructure elements.
Like any other Python object, Structure objects can be stored in lists or dictionaries, assigned to variables, and passed between functions. (However, they cannot be pickled because they wrap an underlying C library.)
In principle, Structure objects can be created programmatically, by creating a zero-atom structure, adding the desired atoms and connecting them with bonds. However, this usage pattern is atypical. In most cases a structure will be loaded from a file or retrieved from the Maestro Workspace or the Maestro Project Table.
Most Schrödinger calculations will produce a Maestro-format output file (with either a mae or maegz file extension). Creating a Structure object from one of these files will allow you to investigate the properties and structure of the resulting molecule or molecules.
Structure Class Organization
Structure objects expose many attributes as iterators, including atoms, bonds, and substructure elements. Structures, atoms, and bonds each have general dictionary-like property attributes that can store properties associated with the specific object.
See the API documentation for more details on the properties and methods of the Structure class.
Atoms and Bonds
All Structure objects have a list-like atom attribute that can be used to iterate over all atoms or to access them by index. For example:
Note
In this example and those below, we use st as the standard variable name for a Structure object.
# Print the names and atomic numbers of all the atoms in the structure
for atom in st.atom:
    print("{name}: {num}".format(name=atom.name, num=atom.atomic_number))
# Print the name and atomic number of the first atom in the structure.
# Indexing starts at 1.
print("{name}: {num}".format(name=st.atom[1].name, num=st.atom[1].atomic_number))
Each atom is represented by a _StructureAtom class. This class is “private” (i.e. named with a leading underscore) because you won’t be creating it directly. It isn’t possible for a _StructureAtom object to exist independently of a Structure object, and so they can only be accessed from an existing Structure object.
Some attributes (actually Python properties) of the _StructureAtom objects include name, atomic_number, formal_charge, and the Cartesian coordinates in x, y, and z. See the _StructureAtom properties for a full list.
Each atom also has a list-like bond attribute:
for atom in st.atom:
    print("atom {} is bonded to:".format(atom.index))
    for bond in atom.bond:
        print(" atom {}".format(bond.atom2.index))
Bonds are represented by the _StructureBond class. Important attributes of the bond class include order, atom1, and atom2. See the _StructureBond properties for full documentation.
Bonds within the structure are also accessible from a list-like attribute of a Structure object called bond. This access is useful for cases where you want to iterate over all bonds in a structure exactly once.
# It's possible to iterate over all bonds in a structure:
for bond in st.bond:
    print("Bonded atoms: {index1} and {index2}".format(
        index1=bond.atom1.index, index2=bond.atom2.index))
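The value of a structure-level bond list can be illustrated with a plain-Python model. This is only a sketch: the neighbors dictionary and the unique_bonds helper are hypothetical stand-ins rather than part of the Schrödinger API, and it assumes that each bond appears in the per-atom bond list of both of its atoms. Under that assumption, walking every atom's bonds visits each bond twice, so a script would have to deduplicate the pairs itself:

```python
# Hypothetical stand-in for per-atom bond lists:
# 1-based atom index -> indices of the atoms it is bonded to.
# Models a methane-like molecule in which atom 1 is bonded to atoms 2-5.
neighbors = {1: [2, 3, 4, 5], 2: [1], 3: [1], 4: [1], 5: [1]}


def unique_bonds(neighbors):
    """Return each bond once as a sorted (low_index, high_index) tuple."""
    seen = set()
    for atom_index, bonded in neighbors.items():
        for other_index in bonded:
            low = min(atom_index, other_index)
            high = max(atom_index, other_index)
            seen.add((low, high))
    return sorted(seen)


# Counting per-atom entries sees every bond from both of its ends.
per_atom_visits = sum(len(bonded) for bonded in neighbors.values())
bonds = unique_bonds(neighbors)
print(per_atom_visits, len(bonds))  # prints: 8 4
```

Iterating over the structure-level bond attribute avoids this bookkeeping.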
Properties
Structures, atoms, and bonds each have the ability to store properties in a dictionary-like attribute named property.
The property names in this property object must follow a pattern that is required for storage in Maestro-format files. The required naming scheme is type_author_property_name, where type is a data type prefix, author is a source specification, and property_name is the actual name of the data. The type prefix must be b for boolean, i for integer, r for real, or s for string. The source specification is typically a Schrödinger program abbreviation (e.g. m for Maestro and j for Jaguar) and the appropriate user-level source specification is user. (In Maestro-format files, the Structure object property names correspond to the properties listed under the f_m_ct { line.)
This example shows how to access, set, and delete Structure object properties:
# 'r_j_Gas_Phase_Energy' is a real property set by Jaguar.
gas_phase_energy = st.property['r_j_Gas_Phase_Energy']
# Properties stored by the user should use an "author" of 'user'.
st.property['r_user_Energy_Plus_Two'] = gas_phase_energy + 2.0
# Delete the new 'r_user_Energy_Plus_Two' property.
del st.property['r_user_Energy_Plus_Two']
Because the property objects are dictionary subclasses, the standard dictionary methods like keys and items also work.
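The naming scheme described above can be checked before a property is stored. The following is a minimal sketch in plain Python; parse_property_name and expected_type are hypothetical helpers, not part of the Schrödinger API, and the pattern assumes that the author field itself contains no underscore:

```python
import re

# Type prefixes named in the text above, mapped to the Python type they suggest.
TYPE_PREFIXES = {"b": bool, "i": int, "r": float, "s": str}

# type_author_property_name: a one-letter type prefix, an author field
# (assumed here to contain no underscore), then the rest as the name.
_PROPERTY_NAME = re.compile(r"^([birs])_([^_]+)_(.+)$")


def parse_property_name(name):
    """Split a property name into (type_prefix, author, property_name).

    Raises ValueError if the name does not follow the naming scheme.
    """
    match = _PROPERTY_NAME.match(name)
    if match is None:
        raise ValueError("not a valid property name: {!r}".format(name))
    return match.groups()


def expected_type(name):
    """Return the Python type suggested by the name's type prefix."""
    type_prefix, _author, _property_name = parse_property_name(name)
    return TYPE_PREFIXES[type_prefix]


print(parse_property_name("r_j_Gas_Phase_Energy"))
print(expected_type("r_user_Energy_Plus_Two"))
```

A name such as "energy" or "x_user_energy" raises ValueError in this sketch, which can catch a typo before it reaches a Maestro-format file.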
Properties of atoms work the same way. For example, the property b_fragmol_attachment is set by fragment_molecule.py (in $SCHRODINGER/mmshare-vX.Y/python/scripts_startup_scripts). The property is True for atoms that were bonded in the input structure but whose bond is broken in the output structures.
for atom in st.atom:
    if atom.property['b_fragmol_attachment']:
        print("Atom {} unattached.".format(atom.name))
        print("Coordinates: {x}, {y}, {z}".format(x=atom.x, y=atom.y, z=atom.z))
Bonds also have a property attribute for general property storage and retrieval, although they don’t have commonly-used built-in properties.
Substructures
A number of “substructure iterators” are available from each Structure object. Each of these iterators returns an instance of a non-public class that is a view on the substructure contained within the Structure object. Each substructure class has an extractStructure method that can be used to create a new and independent Structure object with the atoms in the substructure. They also have getAtomList methods to return a list of atom indices corresponding to the substructure.
- molecule: Iterates over individual molecules. Returns a _Molecule instance.
- chain: Iterates over protein chains in the Structure object. Returns a _Chain instance.
- residue: Iterates over protein residues in the Structure object. Returns a _Residue instance.
- ring: Iterates over all rings in the Structure object, as found by SSSR. Returns a _Ring instance. (The Structure.find_rings method implements similar functionality but returns a list of lists of ints to identify the rings, with each int being an atom index.)
For example:
print("The structure has {} molecules.".format(len(st.molecule)))
for mol in st.molecule:
    print("Molecule {mol_num} has {num_atoms} atoms.".format(
        mol_num=mol.number, num_atoms=len(mol.atom)))
The _Molecule and _Chain instances also support their own residue iterators. For example:
for chain in st.chain:
    residues = []
    for residue in chain.residue:
        residues.append(residue.getCode())
    print("chain {name}: {residues}".format(name=chain.name, residues="".join(residues)))
Structure I/O
Reading a Structure from a File
The schrodinger.structure.StructureReader class creates Structure objects from molecular data stored in a number of standard file formats. Supported file types are Maestro, MDL SD, PDB, and Sybyl Mol2. Because these files may contain multiple molecules, the StructureReader is an iterator, and molecule files are presented as a sequence of Structure objects.
from schrodinger import structure
# Input can be a .mae, .sdf, .sd, .pdb, or .mol2 file.
input_file = "input.mae"
for st in structure.StructureReader(input_file):
    # Do something with the Structure...
    result = process_structure(st)
# To read only the first structure from a file, pass the handle to next.
reader = structure.StructureReader(input_file)
st = next(reader)
SMILES format files and CSV files with SMILES data are also supported, but because these have no structural data, resulting structures are SmilesStructures, which have less functionality than standard Structures. See the SmilesReader and SmilesCsvReader documentation.
Saving a Structure to a File
The StructureWriter class is the counterpart to the StructureReader. This is an example of a typical read, process, and write script:
from schrodinger import structure
with structure.StructureReader("input.mae") as reader:
    with structure.StructureWriter("output.mae") as writer:
        for st in reader:
            # Do the required processing
            result_structure = do_processing(st)
            # Save the result to the output file
            writer.append(result_structure)
# Because both reader and writer here are context managers, we can use the
# `with` keyword to ensure that both files associated with them will be
# closed automatically when we exit the scope of the with statement.
Alternatively, if only a single structure is being written to a file, you can use the Structure.write method:
st.write("output.mae")
Structure Operations
In addition to the functionality provided in the schrodinger.structure module itself, much is provided in the schrodinger.structutils package.
This section lists some additional Structure features and a few highlights of the structutils package.
Structure Minimization
Structures can be minimized with either the OPLS_2005 or the OPLS3e force field by using the minimize_structure function. This operation requires a valid product license for MacroModel, Glide, Impact, or PLOP. Note that minimization will not hold on to a license; a license is checked out to ensure that one is available, then immediately checked back in.
For example, to compare the energy of a molecule before and after minimization:
from schrodinger.structutils.minimize import minimize_structure
# Set the energy property name
energy_name = 'r_ff_Potential_Energy-OPLS_2005'
# Do a 0-step "minimization" to get the initial energy.
minimize_structure(st, max_steps=0)
original_energy = st.property[energy_name]
minimize_structure(st)
minimized_energy = st.property[energy_name]
print("The minimized energy is {} kcal/mol lower than the original.".format(
    original_energy - minimized_energy))
Substructure Searching or Specification
Generate SMILES, SMARTS, or ASL strings based on a set of atom indices via the generate_smiles, generate_smarts, and generate_asl functions. Documentation on ASL can be found in the Maestro Command Reference Manual.
Evaluate SMARTS or ASL strings and return a list of matching atom indices via the evaluate_smarts and evaluate_asl functions.
This example finds the set of unique SMILES strings in a structure file:
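from schrodinger import structure
from schrodinger.structutils.analyze import generate_smiles
unique_smiles = set()
# "input.mae" is a placeholder file name.
with structure.StructureReader("input.mae") as reader:
    for st in reader:
        pattern = generate_smiles(st)
        unique_smiles.add(pattern)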
Structure Measurement
The schrodinger.structutils.measure module provides functions for measuring distances, angles, dihedral angles, and plane angles. It also offers the get_close_atoms function, which finds all pairs of atoms within a specified distance in better than O(N²) time.
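How get_close_atoms achieves sub-quadratic scaling is not described here. As an illustration only, one common way to avoid comparing every pair of atoms is a uniform grid (cell list): bin the points into cubic cells whose edge equals the cutoff, then compare each point only against points in the 27 surrounding cells. The close_pairs function below is a hypothetical plain-Python sketch of that technique, not the measure module's implementation:

```python
import itertools
import math
from collections import defaultdict


def close_pairs(coords, cutoff):
    """Return sorted (i, j) pairs, i < j, of 1-based indices closer than cutoff.

    coords is a list of (x, y, z) tuples. Points are binned into cubic cells
    of edge length cutoff, so two points closer than cutoff always fall in
    the same cell or in adjacent cells.
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")

    cells = defaultdict(list)
    for index, (x, y, z) in enumerate(coords, start=1):
        key = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        cells[key].append(index)

    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() does not create missing cells, so the dict is not
            # modified while it is being iterated.
            nearby = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in nearby:
                    if i < j and math.dist(coords[i - 1], coords[j - 1]) < cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.5, 5.0, 5.0)]
print(close_pairs(coords, 1.5))  # prints: [(1, 2), (3, 4)]
```

For points spread at roughly uniform density, each cell holds a bounded number of points, so the work grows roughly linearly with the number of atoms rather than quadratically.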
Structure Superimposition or Comparison
The in-place RMSD of two structures can be determined via the calculate_in_place_rmsd function. The ConformerRmsd class offers more complete RMSD comparison tools for conformers.
Two structures can be superimposed based on all atoms or a subset of atoms with the superimpose function.
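For reference, an in-place RMSD is the root-mean-square distance between corresponding atoms with the coordinates taken as they are, without any prior superposition. The plain-Python sketch below shows the arithmetic; in_place_rmsd is a hypothetical helper operating on coordinate tuples, not the calculate_in_place_rmsd function itself, whose arguments are not covered here:

```python
import math


def in_place_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Atoms are paired by position in the lists and no superposition is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(in_place_rmsd(reference, shifted))  # prints: 1.0
```

Because the second set is the first translated by 1 along z, the in-place RMSD is 1, whereas an RMSD computed after superimposing the two sets would be 0.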
Conversion Between 1D/2D and 3D Structures
To convert a 3D structure to a 1D structure (SMILES or SMARTS), use the appropriate function from schrodinger.structutils.analyze:
from schrodinger.structutils import analyze
smiles_list = []
smarts_list = []
# reader is an open StructureReader, as in the I/O examples above.
for st in reader:
    smiles_list.append(analyze.generate_smiles(st))
    smarts_list.append(analyze.generate_smarts(st))
It is possible to convert a file of 1D SMILES strings to 3D structures:
from schrodinger import structure
# Python identifiers cannot start with a digit, hence cts_3d and ct_1d.
cts_3d = []
with structure.StructureReader.fromString('smiles_input') as reader:
    for ct_1d in reader:
        cts_3d.append(ct_1d.ct.generate3dConformation())
To convert a 3D structure to a 2D structure, use the canvasConvert utility from the command line:
$SCHRODINGER/utilities/canvasConvert -imae input.mae -2D -osd output.sd
The resulting SD file can then be read back in with the StructureReader class.
Modifying a Structure
Note
The >>> prefix in the examples that follow is the interactive prompt. Examples without the prompt are snippets of scripts.
Atoms can be added via the Structure.addAtoms method.
Individual atoms can be deleted with standard Python list syntax:
>>> st_copy = st.copy()
>>> len(st.atom)
5
>>> del st.atom[5]
>>> len(st.atom)
4
Note
Deleting atoms changes the indices of the atoms remaining in the Structure object.
Because deleting atoms renumbers the remaining atoms, multiple atoms should be deleted via the Structure.deleteAtoms method.
>>> len(st.atom)
14
>>> st.deleteAtoms([1, 2, 3, 4])
>>> len(st.atom)
10
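The renumbering hazard can be reproduced with an ordinary Python list. In this sketch, delete_one and delete_many are hypothetical helpers that model 1-based atom indices; they are not Schrödinger API calls:

```python
def delete_one(atoms, index):
    """Delete the atom at a 1-based index; later atoms shift down by one."""
    return atoms[: index - 1] + atoms[index:]


def delete_many(atoms, indices):
    """Delete all atoms at the given 1-based indices in a single pass."""
    doomed = set(indices)
    return [atom for i, atom in enumerate(atoms, start=1) if i not in doomed]


atoms = ["C1", "H2", "H3", "H4", "H5"]

# Naive approach: delete index 2, then index 3, using the original numbering.
# After the first deletion "H4" has moved to index 3, so it is removed
# instead of "H3".
naive = delete_one(delete_one(atoms, 2), 3)
print(naive)  # prints: ['C1', 'H3', 'H5']

# Deleting everything in one call resolves all indices against the
# original numbering.
batch = delete_many(atoms, [2, 3])
print(batch)  # prints: ['C1', 'H4', 'H5']
```

Passing the full list of indices in a single call, as in the deleteAtoms example above, sidesteps the stale-index problem.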
Charges and atom identity can be modified by making assignments to the proper _StructureAtom attributes:
>>> at = st.atom[1]
>>> at.element
'C'
>>> at.atomic_number
6
>>> at.formal_charge
0
>>> at.element = 'N'
>>> at.formal_charge = 1
>>> at.formal_charge
1
>>> at.atomic_number
7
>>> at.atomic_number = 6
>>> at.element
'C'
As can be seen from the above examples, changing either the atomic_number or the element attribute automatically updates the other.
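Linked attributes of this kind are typically built with Python properties. The sketch below shows one way to keep two such attributes consistent, by storing only the atomic number and deriving the element symbol from it. ToyAtom and its lookup tables are hypothetical and say nothing about how _StructureAtom is actually implemented:

```python
# Small illustrative lookup table; a real implementation would cover the
# whole periodic table.
_SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}
_NUMBERS = {symbol: number for number, symbol in _SYMBOLS.items()}


class ToyAtom:
    """Hypothetical stand-in that stores only the atomic number."""

    def __init__(self, atomic_number):
        self.atomic_number = atomic_number

    @property
    def element(self):
        # Derived from atomic_number on every read, so the two cannot disagree.
        return _SYMBOLS[self.atomic_number]

    @element.setter
    def element(self, symbol):
        # Setting the symbol rewrites the stored atomic number.
        self.atomic_number = _NUMBERS[symbol]


at = ToyAtom(6)
at.element = "N"
print(at.atomic_number)  # prints: 7
at.atomic_number = 6
print(at.element)  # prints: C
```

Keeping a single stored value and computing the other on access is a simple way to rule out an inconsistent pair.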
Bonds can be broken or created. For example:
# To avoid modifying the original structure, make a copy.
st = st_orig.copy()
# Break and re-join the first bond on the first atom.
bond = st.atom[1].bond[1]
atom1 = bond.atom1.index
atom2 = bond.atom2.index
order = bond.order
st.deleteBond(atom1, atom2) # Delete the bond.
st.addBond(atom1, atom2, order) # Recreate bond with same bond order.
Hydrogens can be added via the add_hydrogens function, or deleted via the delete_hydrogens function.
Note
Changing formal charge, atomic identity (via element, atomic_number, or atom_type), breaking or forming bonds, or changing bond orders all require retyping the atoms involved. This can be accomplished via the Structure.retype method. Retyping can be an expensive operation, so it is not invoked automatically.