Structures

The Structure class is the fundamental class in our modules, and will probably be used in all of the code you write. Structure objects can be single molecules or groups of molecules. They provide access to atoms, bonds, properties, and a number of substructure elements.

Like any other Python object, Structure objects can be stored in lists or dictionaries, assigned to variables, and passed between functions. (However, they cannot be pickled because they wrap an underlying C library.)

In principle, Structure objects can be created programmatically, by creating a zero-atom structure, adding the desired atoms and connecting them with bonds. However, this usage pattern is atypical. In most cases a structure will be loaded from a file or retrieved from the Maestro Workspace or the Maestro Project Table.

Most Schrödinger calculations will produce a Maestro-format output file (with either a mae or maegz file extension). Creating a Structure object from one of these files will allow you to investigate the properties and structure of the resulting molecule or molecules.

Structure Class Organization

Structure objects expose many attributes as iterators, including atoms, bonds, and substructure elements. Structures, atoms, and bonds each have general dictionary-like property attributes that can store properties associated with the specific object.

See the API documentation for more details on the properties and methods of the Structure class.

Atoms and Bonds

All Structure objects have a list-like atom attribute that can be used to iterate over all atoms or to access them by index. For example:

Note

In this example and those below, we use st as the standard variable name for a Structure object.

# Print the names and atomic numbers of all the atoms in the structure
for atom in st.atom:
    print("{name}: {num}".format(name=atom.name, num=atom.atomic_number))

# Print the name and atomic number of the first atom in the structure.
# Indexing starts at 1.
print("{name}: {num}".format(name=st.atom[1].name, num=st.atom[1].atomic_number))
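Because atom indexing starts at 1 rather than 0, code that mixes atom indices with ordinary Python lists is prone to off-by-one errors. The sketch below uses a hypothetical stand-in class, OneBasedList, written only for this illustration (it is not part of the schrodinger package), to show how a 1-based index maps onto a 0-based Python list. Rejecting index 0 is a choice made in this sketch; consult the API documentation for how st.atom itself behaves.

```python
class OneBasedList:
    """Hypothetical stand-in that mimics 1-based, list-like access."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Valid indices run from 1 to len(self), inclusive.
        if not 1 <= index <= len(self._items):
            raise IndexError("index {} out of range".format(index))
        return self._items[index - 1]

    def __iter__(self):
        return iter(self._items)


atoms = OneBasedList(["C1", "H2", "H3"])
first = atoms[1]           # the first element, not the second
last = atoms[len(atoms)]   # the last valid index equals the length

# Pair each element with its 1-based index.
indexed = list(enumerate(atoms, start=1))

# Index 0 is rejected by this stand-in.
try:
    atoms[0]
    zero_rejected = False
except IndexError:
    zero_rejected = True
```

When a loop needs both the atom and its index with the real API, atom.index (used in the examples below) avoids the conversion entirely.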

Each atom is represented by a _StructureAtom class. This class is “private” (i.e. named with a leading underscore) because you won’t be creating it directly. It isn’t possible for a _StructureAtom object to exist independently of a Structure object, and so they can only be accessed from an existing Structure object.

Some attributes (actually Python properties) of the _StructureAtom objects include name, atomic_number, formal_charge, and the Cartesian coordinates in x, y, and z. See the _StructureAtom properties for a full list.

Each atom also has a list-like bond attribute:

for atom in st.atom:
    print("atom {} is bonded to:".format(atom.index))
    for bond in atom.bond:
        print("  atom {}".format(bond.atom2.index))

Bonds are represented by the _StructureBond class. Important attributes of the bond class include order, atom1, and atom2. See the _StructureBond properties for full documentation.

Bonds within the structure are also accessible from a list-like attribute of a Structure object called bond. This access is useful for cases where you want to iterate over all bonds in a structure exactly once.

# It's possible to iterate over all bonds in a structure:
for bond in st.bond:
    print("Bonded atoms: {index1} and {index2}".format(
        index1=bond.atom1.index, index2=bond.atom2.index))
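To see why a once-only bond iterator is convenient, consider what happens when bonds are collected atom by atom: every bond is reached from both of its end atoms. The following sketch uses a plain dictionary as a hypothetical stand-in for the per-atom bond lists (no schrodinger import is needed) and removes the duplicates by hand.

```python
# Hypothetical connectivity for a three-atom chain 1-2-3, keyed by
# 1-based atom index; each value lists that atom's bonded neighbors.
neighbors = {1: [2], 2: [1, 3], 3: [2]}

# Walking atom by atom reaches every bond from both of its ends.
per_atom_visits = [(a, b) for a, partners in neighbors.items() for b in partners]

# Sorting each pair and collecting into a set keeps one entry per bond.
unique_bonds = sorted({tuple(sorted(pair)) for pair in per_atom_visits})
```

Here the two bonds are visited four times in the per-atom walk, which is the bookkeeping that iterating over st.bond makes unnecessary.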

Properties

Structures, atoms, and bonds each have the ability to store properties in a dictionary-like attribute named property.

The property names in this property object must follow a pattern that is required for storage in Maestro-format files. The required naming scheme is type_author_property_name, where type is a data type prefix, author is a source specification, and property_name is the actual name of the data. The type prefix must be b for boolean, i for integer, r for real, and s for string. The source specification is typically a Schrödinger program abbreviation (e.g. m for Maestro and j for Jaguar) and the appropriate user-level source specification is user. (In Maestro-format files, the Structure object property names correspond to the properties listed under the f_m_ct { line.)
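The naming scheme can be made concrete with a small parser. This is a minimal sketch, not part of the schrodinger package: parse_property_name and TYPE_PREFIXES are hypothetical names introduced here, and the sketch assumes that the author field never contains an underscore.

```python
# Type prefixes named in the text, mapped to Python types.
TYPE_PREFIXES = {"b": bool, "i": int, "r": float, "s": str}


def parse_property_name(name):
    """Split 'type_author_property_name' into its three parts.

    Hypothetical helper for illustration. Raises ValueError if the name
    does not follow the scheme.
    """
    # Split on the first two underscores only, because the property
    # name itself may contain underscores.
    parts = name.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected type_author_property_name: {!r}".format(name))
    type_prefix, author, prop = parts
    if type_prefix not in TYPE_PREFIXES:
        raise ValueError("unknown type prefix: {!r}".format(type_prefix))
    return TYPE_PREFIXES[type_prefix], author, prop


parsed = parse_property_name("r_user_Energy_Plus_Two")
```

A check like this can be useful before assigning a user-defined property, since a malformed name is easier to diagnose up front than after a failed write.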

This example shows how to access, set, and delete Structure object properties:

# 'r_j_Gas_Phase_Energy' is a real property set by Jaguar.
gas_phase_energy = st.property['r_j_Gas_Phase_Energy']

# Properties stored by the user should use an "author" of 'user'.
st.property['r_user_Energy_Plus_Two'] = gas_phase_energy + 2.0

# Delete the new 'r_user_Energy_Plus_Two' property.
del st.property['r_user_Energy_Plus_Two']

Because the property objects are dictionary subclasses, the standard dictionary methods like keys and items also work.

Properties of atoms work the same way. For example, the property b_fragmol_attachment is set by fragment_molecule.py (in $SCHRODINGER/mmshare-vX.Y/python/scripts_startup_scripts.). The property is True for atoms that were bonded in the input structure but whose bond is broken in the output structures.

for atom in st.atom:
    if atom.property['b_fragmol_attachment']:
        print("Atom {} unattached.".format(atom.name))
        print("Coordinates: {x}, {y}, {z}".format(x=atom.x, y=atom.y, z=atom.z))

Bonds also have a property attribute for general property storage and retrieval, although they don’t have commonly-used built-in properties.

Substructures

A number of “substructure iterators” are available from each Structure object. Each of these iterators returns an instance of a non-public class that is a view on the substructure contained within the Structure object. Each substructure class has an extractStructure method that creates a new, independent Structure object containing the atoms in the substructure, and a getAtomList method that returns a list of the atom indices in the substructure.

molecule
Iterates over individual molecules. Returns a _Molecule instance.
chain
Iterates over protein chains in the Structure object. Returns a _Chain instance.
residue
Iterates over protein residues in the Structure object. Returns a _Residue instance.
ring
Iterates over all rings in the Structure object, as found by SSSR. Returns a _Ring instance. (The Structure.find_rings method implements similar functionality but returns a list of lists of ints to identify the rings, with each int being an atom index.)

For example:

print("The structure has {} molecules.".format(len(st.molecule)))
for mol in st.molecule:
    print("Molecule {mol_num} has {num_atoms} atoms.".format(
        mol_num=mol.number, num_atoms=len(mol.atom)))

The _Molecule and _Chain instances also support their own residue iterators. For example:

for chain in st.chain:
    residues = []
    for residue in chain.residue:
        residues.append(residue.getCode())
    print("chain {name}: {residues}".format(name=chain.name, residues="".join(residues)))

Structure I/O

Reading a Structure from a File

The schrodinger.structure.StructureReader class creates Structure objects from molecular data stored in a number of standard file formats. Supported file types are Maestro, MDL SD, PDB, and Sybyl Mol2. Because these files may contain multiple molecules, the StructureReader is an iterator, and molecule files are presented as a sequence of Structure objects.

from schrodinger import structure

# Input can be a .mae, .sdf, .sd, .pdb, or .mol2 file.
input_file = "input.mae"

for st in structure.StructureReader(input_file):
    # Do something with the Structure...
    result = process_structure(st)

# To read only the first structure from a file, pass the handle to next.
reader = structure.StructureReader(input_file)
st = next(reader)
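One caveat when calling next: in standard Python, next raises StopIteration if the iterator has nothing to yield, so an input with no structures would raise at that line (assuming StructureReader follows the usual iterator protocol). Passing a default as the second argument avoids the exception. The sketch below uses a hypothetical generator, fake_reader, in place of StructureReader so that it runs without an input file.

```python
def fake_reader(records):
    """Hypothetical stand-in for a reader: yields one record at a time."""
    for record in records:
        yield record


# next() with a default returns the default instead of raising
# StopIteration when there is nothing to read.
first = next(fake_reader(["st1", "st2"]), None)
missing = next(fake_reader([]), None)
```

Testing the result against None then gives a clear place to report an empty input.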

SMILES format files and CSV files with SMILES data are also supported, but because these have no structural data, resulting structures are SmilesStructures, which have less functionality than standard Structures. See the SmilesReader and SmilesCsvReader documentation.

Saving a Structure to a File

The StructureWriter class is the counterpart to the StructureReader.

This is an example of a typical read, process, and write script:

from schrodinger import structure

with structure.StructureReader("input.mae") as reader:
    with structure.StructureWriter("output.mae") as writer:
        for st in reader:
            # Do the required processing
            result_structure = do_processing(st)
            # Save the result to the output file
            writer.append(result_structure)

# Because both reader and writer here are context managers, we can use the
# `with` keyword to ensure that both files associated with them will be
# closed automatically when we exit the scope of the with statement.
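The two nested with statements can also be combined into one, which is a standard Python feature for any pair of context managers. The sketch below uses ordinary text files in place of structure files so that it runs without a Schrödinger installation; the file names are placeholders.

```python
import os
import tempfile

# Work in a temporary directory so that the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, "input.txt")
    dst = os.path.join(tmpdir, "output.txt")
    with open(src, "w") as handle:
        handle.write("a\nb\n")

    # Both context managers are entered in one statement and are both
    # closed when the block exits.
    with open(src) as reader, open(dst, "w") as writer:
        for line in reader:
            writer.write(line.upper())

    both_closed = reader.closed and writer.closed
    with open(dst) as handle:
        result = handle.read()
```

The single-statement form saves one level of indentation, which helps when the processing loop is long.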

Alternatively, if only a single structure is being written to a file, you can use the Structure.write method:

st.write("output.mae")

Structure Operations

In addition to the functionality provided in the schrodinger.structure module itself, much is provided in the schrodinger.structutils package.

This section lists some additional Structure features and a few highlights of the structutils package.

Structure Minimization

Structures can be minimized using one of the OPLS_2005 or OPLS3e force fields by using the minimize_structure function. This operation requires a valid product license from MacroModel, GLIDE, Impact, or PLOP. Note that minimization will not hold on to a license; a license is checked out to ensure that one is available, then immediately checked back in.

For example, to compare the energy of a molecule before and after minimization:

from schrodinger.structutils.minimize import minimize_structure

# Set the energy property name
energy_name = 'r_ff_Potential_Energy-OPLS_2005'

# Do a 0-step "minimization" to get the initial energy.
minimize_structure(st, max_steps=0)
original_energy = st.property[energy_name]

minimize_structure(st)
minimized_energy = st.property[energy_name]

print("The minimized energy is {} kcal/mol lower than the original.".format(
    original_energy - minimized_energy))

Substructure Searching or Specification

Generate SMILES, SMARTS, or ASL strings based on a set of atom indices via the generate_smiles, generate_smarts, and generate_asl functions. Documentation on ASL can be found in the Maestro Command Reference Manual.

Evaluate SMARTS or ASL strings and return a list of matching atom indices via the evaluate_smarts and evaluate_asl functions.

This example finds the set of unique SMILES strings in a structure file:

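from schrodinger import structure
from schrodinger.structutils.analyze import generate_smiles

unique_smiles = set()
# "input.mae" is a placeholder file name.
with structure.StructureReader("input.mae") as reader:
    for st in reader:
        pattern = generate_smiles(st)
        unique_smiles.add(pattern)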

Structure Measurement

The schrodinger.structutils.measure module provides functions for measuring distances, angles, dihedral angles, and plane angles. It also offers the get_close_atoms function, which finds all pairs of atoms within a specified distance in less than O(N^2) time.
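One common way to find close pairs without comparing every atom to every other atom is a cell list (spatial binning). The sketch below is a generic illustration of that technique in plain Python; it is not the library's implementation and makes no claim about how the library performs its search. The function name close_pairs is hypothetical, and the indices it returns are 0-based positions in the input list, not atom indices.

```python
import itertools
import math
from collections import defaultdict


def close_pairs(coords, cutoff):
    """Return sorted 0-based index pairs (i, j), i < j, within cutoff."""
    # Bin each point into a cubic cell whose edge equals the cutoff, so
    # any pair within the cutoff lies in the same or an adjacent cell.
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff),
               math.floor(y / cutoff),
               math.floor(z / cutoff))
        cells[key].append(i)

    pairs = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating new empty cells during iteration.
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(coords[i], coords[j]) <= cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.5, 5.0, 5.0)]
pairs = close_pairs(coords, cutoff=1.5)
```

For roughly uniform atom densities, each atom is compared only with the atoms in its 27 surrounding cells, so the work grows approximately linearly with the number of atoms.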

Structure Superimposition or Comparison

The in-place RMSD of two structures can be determined via the calculate_in_place_rmsd function. The ConformerRmsd class offers more complete RMSD comparison tools for conformers.
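“In place” means that the coordinates are compared as they are, with no superposition applied first. The underlying formula is the square root of the mean squared distance between corresponding atoms. The sketch below computes it in plain Python on coordinate tuples; the function name in_place_rmsd is hypothetical, and the sketch does not reproduce the signature or options of the library function.

```python
import math


def in_place_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    No superposition is applied; atoms are paired by position in the list.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
# The same two points translated by 2 along z.
moved = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
rmsd = in_place_rmsd(ref, moved)
```

A rigid translation by 2 gives an in-place RMSD of 2 even though the two geometries are identical, which is why superimposing first matters when only the shape difference is of interest.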

Two structures can be superimposed based on all atoms or a subset of atoms with the superimpose function.

Conversion Between 1D/2D and 3D Structures

To convert a 3D structure to a 1D structure (SMILES or SMARTS), use the appropriate function from schrodinger.structutils.analyze:

from schrodinger.structutils import analyze
smiles_list = []
smarts_list = []
# reader is a StructureReader opened as in the examples above.
for st in reader:
    smiles_list.append(analyze.generate_smiles(st))
    smarts_list.append(analyze.generate_smarts(st))

It is possible to convert a file of 1D SMILES strings to 3D structures:

from schrodinger import structure

cts_3d = []
with structure.StructureReader.fromString('smiles_input') as reader:
    for ct_1d in reader:
        cts_3d.append(ct_1d.ct.generate3dConformation())

To convert a 3D structure to a 2D structure, use the canvasConvert utility from the command line:

$SCHRODINGER/utilities/canvasConvert -imae input.mae -2D -osd output.sd

The resulting SD file can then be read back in with the StructureReader class.

Modifying a Structure

Note

The >>> prefix in the examples that follow is the interactive prompt. Examples without the prompt are snippets of scripts.

Atoms can be added via the Structure.addAtoms method.

Individual atoms can be deleted with standard Python list syntax:

>>> st_copy = st.copy()
>>> len(st.atom)
5
>>> del st.atom[5]
>>> len(st.atom)
4

Note

Deleting atoms changes the indices of the atoms remaining in the Structure object.

Because deleting atoms renumbers the remaining atoms, multiple atoms should be deleted via the Structure.deleteAtoms method.

>>> len(st.atom)
14
>>> st.deleteAtoms([1, 2, 3, 4])
>>> len(st.atom)
10
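The renumbering pitfall can be reproduced with an ordinary Python list. In the sketch below, a list of labels stands in for the atoms, and 1-based indices are converted to list positions by subtracting 1. Deleting in ascending index order removes the wrong items, while deleting in descending order, or filtering in a single pass, removes the intended ones. This illustrates the general issue only and says nothing about how Structure.deleteAtoms is implemented.

```python
labels = ["a1", "a2", "a3", "a4", "a5", "a6"]
to_delete = [1, 2, 3]  # 1-based indices of the items to remove

# Wrong: after each deletion the remaining items shift down by one,
# so the later indices no longer point at the intended items.
ascending = list(labels)
for index in to_delete:
    del ascending[index - 1]

# Right: deleting from the highest index down leaves lower indices valid.
descending = list(labels)
for index in sorted(to_delete, reverse=True):
    del descending[index - 1]

# Also right: build the list of survivors in a single pass.
doomed = set(to_delete)
filtered = [lab for i, lab in enumerate(labels, start=1) if i not in doomed]
```

The ascending loop leaves a2, a4, and a6 behind instead of a4, a5, and a6, which is exactly the kind of silent error that a single batch deletion avoids.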

Charges and atom identity can be modified by making assignments to the proper _StructureAtom attributes:

>>> at = st.atom[1]
>>> at.element
'C'
>>> at.atomic_number
6
>>> at.formal_charge
0
>>> at.element = 'N'
>>> at.formal_charge = 1
>>> at.formal_charge
1
>>> at.atomic_number
7
>>> at.atomic_number = 6
>>> at.element
'C'

As the examples above show, changing either the atomic_number or the element attribute automatically updates the other.
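Python properties make this kind of coupling straightforward. The sketch below is a hypothetical stand-in, AtomSketch, that stores only the atomic number and derives the element symbol from it; it illustrates the general pattern and is not a description of how _StructureAtom is implemented. The lookup table covers only four elements, for brevity.

```python
# Small lookup table for the sketch; a real table would cover all elements.
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}
NUMBERS = {symbol: number for number, symbol in SYMBOLS.items()}


class AtomSketch:
    """Hypothetical stand-in: one stored value, two synchronized views."""

    def __init__(self, atomic_number):
        self.atomic_number = atomic_number

    @property
    def element(self):
        # Derived on every access, so it can never disagree with
        # atomic_number.
        return SYMBOLS[self.atomic_number]

    @element.setter
    def element(self, symbol):
        self.atomic_number = NUMBERS[symbol]


at = AtomSketch(6)
at.element = "N"
number_after_element_change = at.atomic_number
at.atomic_number = 8
element_after_number_change = at.element
```

Storing a single source of truth and deriving the other view is what keeps the two attributes from drifting apart.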

Bonds can be broken or created. For example:

# To avoid modifying the original structure, make a copy.
st = st_orig.copy()

# Break and re-join the first bond on the first atom.
bond = st.atom[1].bond[1]

atom1 = bond.atom1.index
atom2 = bond.atom2.index
order = bond.order

st.deleteBond(atom1, atom2)     # Delete the bond.
st.addBond(atom1, atom2, order) # Recreate bond with same bond order.

Hydrogens can be added via the add_hydrogens function, or deleted via the delete_hydrogens function.

Note

Changing formal charge or atomic identity (via element, atomic_number, or atom_type), breaking or forming bonds, and changing bond orders all require retyping the atoms involved. Retyping is done with the Structure.retype method. Because it can be an expensive operation, it is not invoked automatically.