Product Specific Modules

Desmond

The desmond package contains modules for various Desmond-related operations. In particular, the trajectory infrastructure and trajectory analyzers are in the following modules:

Atom indexing

For MD related workflows, there are two output files associated with the simulated trajectory:

  • A text file called <jobname>-out.cms, which contains structural information such as atom types, bonds, etc, as well as atom positions of the last trajectory frame
  • A binary file of the trajectory, which contains atom positions at all time points.

One tricky detail of the trajectory infrastructure is that the .cms files only contain physical atoms whereas the trajectory files contain additional pseudo atoms. This is reflected in two different ways of identifying atoms:

  • Atom ID (AID) for structural files such as cms and mae files * physical atoms only * 1-indexed * results of ASL/SMARTS evaluation
  • Global ID (GID) for trajectory files and Desmond’s MSYS files * all atoms, both physical and pseudo * 0-indexed

Since pseudo atoms carry charge, they are relevant to all charge related analyses, such as electric dipole, center of charge, etc. Note that the pseudo atoms are not selected from ASL or SMARTS evaluations, and one needs to use APIs in the schrodinger.desmond.topo such as:

asl2gids(cms_model, asl, include_pseudoatoms=True)
aids2gids(cms_model, asl, include_pseudoatoms=True)

Here the argument cms_model has to be created from:

from schrodinger.application.desmond.packages import topo
msys_model, cms_model = topo.read_cms('example-out.cms')

This is the only way to guarantee the correct AID/GID mapping in the cms_model. Avoid loading the .cms file as a Schrodinger Structure object or as a Desmond Cms object directly if it is to interact with trajectories.

Trajectory frames

The Desmond trajectories are saved in binary files and currently both DTR and XTC formats are supported. To read a trajectory:

from schrodinger.application.desmond.packages import traj
tr = traj.read_traj('foo_trj')  # DTR format trajectory
fr = tr[0]  # first frame

Here tr is simply a python list of trajectory Frame objects. The guaranteed properties of Frame include:

  • fr.natoms: number of atoms in the frame, including the pseudo atoms
  • fr.time: chemical time in pico-seconds
  • fr.box: simulation box (key information for periodic boundary unwrapping), which may change from frame to frame
  • fr.pos() or fr.pos(gid), or fr.pos(gids): atom positions

Here gid is an integer valued GID and gids is a list/tuple of them. The position calls return N x 3 Numpy array where N is the number of requested atom(s). Note that fr.pos() (which selects all atoms) and fr.pos(gid) return views whereas fr.pos(gids) returns a copy of the data in Frame. This is a result of Numpy advanced indexing.

Additional information may exist in the Frame objects. For example, atom velocities. In that case, the API fr.vel() has the same interface as fr.pos().

The path to the trajectory file is also saved in the <jobname>-out.cms file thus it is possible to load the trajectory from the .cms file:

from schrodinger.application.desmond.packages import traj_util
msys_model, cms_model, tr = traj_util.read_cms_and_traj('example-out.cms')

If only the path to the trajectory file is needed, use:

from schrodinger.application.desmond.packages import topo
trj_path = topo.find_traj_path(cms_model)  # or
trj_path = topo.find_traj_path_from_cms_path('example-out.cms')

Note that in order to utilize these helper functions, one should not rename the trajectory file or change the relative locations between the .cms file and the trajectory file.

Often one needs to update the coordinates of a group of atoms (e.g., a protein molecule from ASL selection) on a per-frame basis. This task can be fulfilled in two ways:

  • update all atom coordinates (i.e., the full system CT) for each frame and then extract the protein per frame
  • extract the protein once and then update its coordinates per frame

The second approach is more efficient and its implementation is shown below:

from schrodinger.application.desmond.packages import traj, topo

_, cms_model = topo.read_cms('example-out.cms')
tr = traj.read_traj('example_trj')
protein_aids = cms_model.select_atom('protein')
protein_st = cms_model.extract(protein_aids)
protein_gids = topo.aids2gids(cms_model, protein_aids,
                          include_pseudoatoms=False)
for fr in tr:
    protein_st.setXYZ(fr.pos(protein_gids))
    # whatever needs to be done on the protein structure
    ...

In case the full system CT is needed per frame, use:

fsys_ct = cms_model.fsys_ct.copy()
for fr in tr:
    topo.update_ct(fsys_ct, cms_model, fr)
    # whatever needs to be done on the full system CT

Trajectory analysis

There are about 50 trajectory analyzers provided in the schrodinger.desmond.analysis module, ranging from basic geometric operations such as angle, distance, vector, torsion, to specialized analyses such as order parameter, protein-ligand interactions.

To use the existing trajectory analyzers:

from schrodinger.application.desmond.packages import analysis, traj, topo

# load data
msys_model, cms_model = topo.read_cms('foo-out.cms')
tr = traj.read_traj('foo.xtc')  # XTC format trajectory

# define analyzers
analyzer1 = analysis.Com(msys_model, cms_model, asl='m.n 1')
analyzer2 = analysis.ProtLigInter(msys_model, cms_model, 'protein', 'm.n 2')

# compute result
results = analysis.analyze(tr, analyzer1, analyzer2, )

Here results is a list of two items corresponding to the outputs of the two analyzers. The outputs of the analyzers may differ in their format. Typically, the output is a list of frame-wise results.

There is a caching mechanism built in the analysis framework to avoid redundant calculations among analyzers, e.g., coordinate unwrapping around periodic boundary condition, centering solute atoms, etc. Thus it is better to use all analyzers together, as in the example.

A special feature of the basic geometric analyzers is that they can use center of mass Com, center of charge Coc, and centroid Centroid analyzers on an equal footing with atoms. For example:

from schrodinger.application.desmond.packages import analysis
# data loading for msys_model, cms_model, and tr is omitted
s1 = analysis.Com(msys_model, cms_model, asl='atom.num 1-4')
s2 = analysis.Centroid(msys_model, cms_model, asl='atom.num 40-200')

d1 = analysis.Distance(msys_model, cms_model, 1, 2)
d2 = analysis.Distance(msys_model, cms_model, s1, 27)
d3 = analysis.Distance(msys_model, cms_model, s1, s2)

results = analysis.analyze(tr, d1, d2, d3)

Glide

The glide package helps to set up Glide jobs via keyword-value pairs.

Calculate Enrichment from Virtual Screening

The analysis.enrichment module helps to calculate enrichments based on known actives and the number of decoys in screen.

import schrodinger.analysis.enrichment as enrichment
efcalc = enrichment.Calculator(
    actives = "my_actives.txt",            # Active titles, one per line.
    results = "screen_results.mae",        # Glide pv file.
    total_decoys = 1000
)
efcalc.report()                    # Print default report to standard out.
efcalc.savePlot()                  # Create default graph png.
print(efcalc.calcBEDROC(alpha=20)) # Print the BEDROC metric value

Jaguar

The jaguar package helps to set up Jaguar (QM) tasks. For example, to set up and run a Jaguar 6-31g** geometry optimization:

import schrodinger.application.jaguar.input as jagInput
import schrodinger.job.jobcontrol as jc
input = jagInput.read("inputFile.mae")
input.setValue("basis","6-31G**")
input.setValue("dftname","B3LYP")
input.setValue("igeopt","1") # Do geometry optimization
jagFileName = "jag631.in"
input.saveAs(jagFileName)
run_command = [os.path.join(os.environ['SCHRODINGER'], 'jaguar'),
               "run",
               jagFileName]
job = jc.launch_job(run_command)

MacroModel

The macromodel package helps to set up and run MacroModel tasks. To use this module productively, you should be familiar with the MacroModel Reference Manual. As a starting point, see the documentation for the ComUtil and SbcUtil classes.

A few examples of simplified scripts follow.

Set up and run a Monte Carlo multiple minimum conformation search job with non-default options:

import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.job.jobcontrol as jc
import sys

mcu = mmodutils.ComUtil(ffld='mmffs')

# Use extended torsion sampling.
mcu.AUTO[8] = 3

# Write the com file for a Monte Carlo multiple minimum conformation
# search. (To actually run this script, you need to set
# input_structure_file.)
com_file = mcu.mcmm(input_structure_file)

cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()

print("MCMM job status: {}".format(job.Status))
mcmm_output_file = job.StructureOutputFile

Set up and run a geometry minimization with restraints defined by an Atom Selection Language expression. This is specified within a MacroModel substructure file with the ASL2 code.

import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.job.jobcontrol as jc

mcu = mmodutils.ComUtil(subs=True)

# Write the com file for a geometry optimization.
# (To run this script, you should set input_structure_file. For the
# script to do something interesting, the input structure file should
# have more than one molecule. :-)
com_file = mcu.mini(input_structure_file)

# Set up the MacroModel substructure file.
sbu = mmodutils.SbcUtil()

# Specify the force constant for Cartesian restraints.
sbu.ASL2[1] = 200
# Specify the ASL identifying the restrained atoms.
sbu.ASL2[2] = 'mol.n 1'
sbu_args = [com_file, 'ASL2']
sbc_file = sbu.writeSbcFile(sbu_args)

cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()

print("MINI job status: {}".format(job.Status))
mini_output_file = job.StructureOutputFile

Set up and run a systematic pseudo-Monte Carlo conformation search, dynamically assigning restraints to a ligand core (SUBS and FXAT):

import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.structutils.analyze as analyze
import schrodinger.structure as structure

mcu = mmodutils.ComUtil(subs=True)

# Write the com file for a systematic pseudo-Monte Carlo multiple
# minimum conformation search.
com_file = mcu.spmc(input_structure_file)

# Find the atoms to be restrained via ASL and SMARTS; anything in a
# 6-membered ring.
st = next(structure.StructureReader(input_structure_file))
core_pattern = 'SMARTS. "r6"'
core_atoms = analyze.evaluate_asl(st, "withinbonds 1 {}".format(core_pattern))
attachment_atoms = analyze.evaluate_asl(st, "not withinbonds 1 {}".format(core_pattern))

# Set up and write the substructure file with the restraints.
sbu = mmodutils.SbcUtil()
sbc_args = [com_file]
sbc_args.extend(sbu.setSubs(attachment_atoms))
sbc_args.extend(sbu.setFixed(core_atoms))
sbu.writeSbcFile(sbc_args)

cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()

print("SPMC job status: {}".format(job.Status))
spmc_output_file = job.StructureOutputFile

Prime

The prime package contains modules for reading and writing Prime input files and for fixing structures for use with Prime.

The protein package contains other modules for protein-related tasks, such as:

  • The assignment module for optimizing hydroxyl, thiol and water orientiations, Chi-flips of asparagine, glutamine and histidine, and protonation states of aspartic acid, glutamic acid, and histidine.
  • The captermini module for capping uncapped terminal residues.
  • The findhets module for finding het (non-protein) compounds in structures.
  • The rotamers module provides a rotamer library that can be applied to protein sidechains.
The adjust_residue_numbering_panel.py script in
$SCHRODINGER/mmshare-vX.Y/python/common contains an example of protein structure alignment. The script also renumbers protein residues based on the alignment.

In addition, the schrodinger.structutils.build module contains a useful mutate function.

Here is an example of residue mutation and rotamer-library sampling:

from schrodinger.structutils import build
from schrodinger.protein import rotamers

residue = st.residue[index_of_residue_to_mutate]
# Here, mut_type is the three-letter code for the new residue.
build.mutate(st, residue.atom[1], mut_type)
rotamer_lib = rotamers.Rotamers( st, residue.atom[1] )
for rotamer in rotamer_lib.rotamers:
    rotamer.apply()
    # Operate on the new structure - compute energy, analyze or write
    # the structure.

The prepwizard module also contains useful functions for manipulating protein structures. For example, to fill missing loops in a protein structure, based on a FASTA file sequence:

import schrodinger.application.prepwizard as pw
filled_st = pw.fill_missing_loops(input_st, fasta_file, prime_jobname)

QSite

The qsite package helps to set up QSite jobs.

VSW

The vsw package contains modules for VSW, including pipelining functionality for VSW jobs.

For example, to use VSW pipelining to write a QikProp job for a file of ligands ligands_file:

import schrodinger.application.vsw.input as vswinput
input_file = jobname + '.inp'
txt="""
"""
writer = vswinput.Writer(input_file, comment=txt)
writer.addVar('LIGANDS', 'Structures', [ligands_file])
writer.addStage('RUN_QIKPROP',
                'qikprop.QikPropStage',
                ['LIGANDS'],
                ['OUTPUT_LIGANDS'])
writer.write()