Product Specific Modules¶
Desmond¶
The desmond package contains modules for various Desmond-related operations. In particular, the trajectory infrastructure and trajectory analyzers are in the following modules:
- schrodinger.desmond.topo: Defines functionalities to handle molecular topologies and systems. This is the layer interfacing Desmond’s MSYS infrastructure with Schrodinger’s infrastructure.
- schrodinger.desmond.traj: Defines utility functions for trajectory manipulations.
- schrodinger.desmond.analysis: Defines trajectory analyzers.
- schrodinger.desmond.staf: Defines trajectory analyzer base classes.
Atom indexing¶
For MD related workflows, there are two output files associated with the simulated trajectory:
- A text file called
<jobname>-out.cms
, which contains structural information such as atom types, bonds, etc, as well as atom positions of the last trajectory frame - A binary file of the trajectory, which contains atom positions at all time points.
One tricky detail of the trajectory infrastructure is that the .cms
files
only contain physical atoms whereas the trajectory files contain additional
pseudo atoms. This is reflected in two different ways of identifying atoms:
- Atom ID (AID) for structural files such as
cms
andmae
files * physical atoms only * 1-indexed * results of ASL/SMARTS evaluation - Global ID (GID) for trajectory files and Desmond’s MSYS files * all atoms, both physical and pseudo * 0-indexed
Since pseudo atoms carry charge, they are relevant to all charge related analyses, such as electric dipole, center of charge, etc. Note that the pseudo atoms are not selected from ASL or SMARTS evaluations, and one needs to use APIs in the schrodinger.desmond.topo such as:
asl2gids(cms_model, asl, include_pseudoatoms=True)
aids2gids(cms_model, asl, include_pseudoatoms=True)
Here the argument cms_model
has to be created from:
from schrodinger.application.desmond.packages import topo
msys_model, cms_model = topo.read_cms('example-out.cms')
This is the only way to guarantee the correct AID/GID mapping in the
cms_model
. Avoid loading the .cms
file as a Schrodinger Structure
object or as a Desmond Cms
object directly if it is to interact with
trajectories.
Trajectory frames¶
The Desmond trajectories are saved in binary files and currently both DTR and XTC formats are supported. To read a trajectory:
from schrodinger.application.desmond.packages import traj
tr = traj.read_traj('foo_trj') # DTR format trajectory
fr = tr[0] # first frame
Here tr
is simply a python list of trajectory Frame
objects.
The guaranteed properties of Frame
include:
fr.natoms
: number of atoms in the frame, including the pseudo atomsfr.time
: chemical time in pico-secondsfr.box
: simulation box (key information for periodic boundary unwrapping), which may change from frame to framefr.pos()
orfr.pos(gid)
, orfr.pos(gids)
: atom positions
Here gid
is an integer valued GID and gids
is a list/tuple of them.
The position calls return N x 3
Numpy array where N
is the number of
requested atom(s). Note that fr.pos()
(which selects all atoms) and
fr.pos(gid)
return views whereas fr.pos(gids)
returns a copy of the
data in Frame
. This is a result of Numpy advanced indexing.
Additional information may exist in the Frame
objects. For example, atom
velocities. In that case, the API fr.vel()
has the same interface as
fr.pos()
.
The path to the trajectory file is also saved in the <jobname>-out.cms
file
thus it is possible to load the trajectory from the .cms
file:
from schrodinger.application.desmond.packages import traj_util
msys_model, cms_model, tr = traj_util.read_cms_and_traj('example-out.cms')
If only the path to the trajectory file is needed, use:
from schrodinger.application.desmond.packages import topo
trj_path = topo.find_traj_path(cms_model) # or
trj_path = topo.find_traj_path_from_cms_path('example-out.cms')
Note that in order to utilize these helper functions, one should not rename the
trajectory file or change the relative locations between the .cms
file and
the trajectory file.
Often one needs to update the coordinates of a group of atoms (e.g., a protein molecule from ASL selection) on a per-frame basis. This task can be fulfilled in two ways:
- update all atom coordinates (i.e., the full system CT) for each frame and then extract the protein per frame
- extract the protein once and then update its coordinates per frame
The second approach is more efficient and its implementation is shown below:
from schrodinger.application.desmond.packages import traj, topo
_, cms_model = topo.read_cms('example-out.cms')
tr = traj.read_traj('example_trj')
protein_aids = cms_model.select_atom('protein')
protein_st = cms_model.extract(protein_aids)
protein_gids = topo.aids2gids(cms_model, protein_aids,
include_pseudoatoms=False)
for fr in tr:
protein_st.setXYZ(fr.pos(protein_gids))
# whatever needs to be done on the protein structure
...
In case the full system CT is needed per frame, use:
fsys_ct = cms_model.fsys_ct.copy()
for fr in tr:
topo.update_ct(fsys_ct, cms_model, fr)
# whatever needs to be done on the full system CT
Trajectory analysis¶
There are about 50 trajectory analyzers provided in the schrodinger.desmond.analysis module, ranging from basic geometric operations such as angle, distance, vector, torsion, to specialized analyses such as order parameter, protein-ligand interactions.
To use the existing trajectory analyzers:
from schrodinger.application.desmond.packages import analysis, traj, topo
# load data
msys_model, cms_model = topo.read_cms('foo-out.cms')
tr = traj.read_traj('foo.xtc') # XTC format trajectory
# define analyzers
analyzer1 = analysis.Com(msys_model, cms_model, asl='m.n 1')
analyzer2 = analysis.ProtLigInter(msys_model, cms_model, 'protein', 'm.n 2')
# compute result
results = analysis.analyze(tr, analyzer1, analyzer2, )
Here results
is a list of two items corresponding to the outputs of the two
analyzers. The outputs of the analyzers may differ in their format. Typically,
the output is a list of frame-wise results.
There is a caching mechanism built in the analysis framework to avoid redundant calculations among analyzers, e.g., coordinate unwrapping around periodic boundary condition, centering solute atoms, etc. Thus it is better to use all analyzers together, as in the example.
A special feature of the basic geometric analyzers is that they can use center
of mass Com
, center of charge Coc
, and centroid Centroid
analyzers
on an equal footing with atoms. For example:
from schrodinger.application.desmond.packages import analysis
# data loading for msys_model, cms_model, and tr is omitted
s1 = analysis.Com(msys_model, cms_model, asl='atom.num 1-4')
s2 = analysis.Centroid(msys_model, cms_model, asl='atom.num 40-200')
d1 = analysis.Distance(msys_model, cms_model, 1, 2)
d2 = analysis.Distance(msys_model, cms_model, s1, 27)
d3 = analysis.Distance(msys_model, cms_model, s1, s2)
results = analysis.analyze(tr, d1, d2, d3)
Glide¶
The glide package helps to set up Glide jobs via keyword-value pairs.
Calculate Enrichment from Virtual Screening¶
The analysis.enrichment module helps to calculate enrichments based on known actives and the number of decoys in screen.
import schrodinger.analysis.enrichment as enrichment
efcalc = enrichment.Calculator(
actives = "my_actives.txt", # Active titles, one per line.
results = "screen_results.mae", # Glide pv file.
total_decoys = 1000
)
efcalc.report() # Print default report to standard out.
efcalc.savePlot() # Create default graph png.
print(efcalc.calcBEDROC(alpha=20)) # Print the BEDROC metric value
Jaguar¶
The jaguar package helps to set up Jaguar (QM) tasks. For example, to set up and run a Jaguar 6-31g** geometry optimization:
import schrodinger.application.jaguar.input as jagInput
import schrodinger.job.jobcontrol as jc
input = jagInput.read("inputFile.mae")
input.setValue("basis","6-31G**")
input.setValue("dftname","B3LYP")
input.setValue("igeopt","1") # Do geometry optimization
jagFileName = "jag631.in"
input.saveAs(jagFileName)
run_command = [os.path.join(os.environ['SCHRODINGER'], 'jaguar'),
"run",
jagFileName]
job = jc.launch_job(run_command)
MacroModel¶
The macromodel package helps to set up and run MacroModel tasks. To use this module productively, you should be familiar with the MacroModel Reference Manual. As a starting point, see the documentation for the ComUtil and SbcUtil classes.
A few examples of simplified scripts follow.
Set up and run a Monte Carlo multiple minimum conformation search job with non-default options:
import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.job.jobcontrol as jc
import sys
mcu = mmodutils.ComUtil(ffld='mmffs')
# Use extended torsion sampling.
mcu.AUTO[8] = 3
# Write the com file for a Monte Carlo multiple minimum conformation
# search. (To actually run this script, you need to set
# input_structure_file.)
com_file = mcu.mcmm(input_structure_file)
cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()
print("MCMM job status: {}".format(job.Status))
mcmm_output_file = job.StructureOutputFile
Set up and run a geometry minimization with restraints defined by an Atom
Selection Language expression. This is specified within a MacroModel
substructure file with the ASL2
code.
import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.job.jobcontrol as jc
mcu = mmodutils.ComUtil(subs=True)
# Write the com file for a geometry optimization.
# (To run this script, you should set input_structure_file. For the
# script to do something interesting, the input structure file should
# have more than one molecule. :-)
com_file = mcu.mini(input_structure_file)
# Set up the MacroModel substructure file.
sbu = mmodutils.SbcUtil()
# Specify the force constant for Cartesian restraints.
sbu.ASL2[1] = 200
# Specify the ASL identifying the restrained atoms.
sbu.ASL2[2] = 'mol.n 1'
sbu_args = [com_file, 'ASL2']
sbc_file = sbu.writeSbcFile(sbu_args)
cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()
print("MINI job status: {}".format(job.Status))
mini_output_file = job.StructureOutputFile
Set up and run a systematic pseudo-Monte Carlo conformation search,
dynamically assigning restraints to a ligand core (SUBS
and FXAT
):
import schrodinger.application.macromodel.utils as mmodutils
import schrodinger.structutils.analyze as analyze
import schrodinger.structure as structure
mcu = mmodutils.ComUtil(subs=True)
# Write the com file for a systematic pseudo-Monte Carlo multiple
# minimum conformation search.
com_file = mcu.spmc(input_structure_file)
# Find the atoms to be restrained via ASL and SMARTS; anything in a
# 6-membered ring.
st = next(structure.StructureReader(input_structure_file))
core_pattern = 'SMARTS. "r6"'
core_atoms = analyze.evaluate_asl(st, "withinbonds 1 {}".format(core_pattern))
attachment_atoms = analyze.evaluate_asl(st, "not withinbonds 1 {}".format(core_pattern))
# Set up and write the substructure file with the restraints.
sbu = mmodutils.SbcUtil()
sbc_args = [com_file]
sbc_args.extend(sbu.setSubs(attachment_atoms))
sbc_args.extend(sbu.setFixed(core_atoms))
sbu.writeSbcFile(sbc_args)
cmd_args = mcu.getLaunchCommand(com_file)
job = jc.launch_job(cmd_args)
job.wait()
print("SPMC job status: {}".format(job.Status))
spmc_output_file = job.StructureOutputFile
Prime¶
The prime package contains modules for reading and writing Prime input files and for fixing structures for use with Prime.
The protein package contains other modules for protein-related tasks, such as:
- The assignment module for optimizing hydroxyl, thiol and water orientiations, Chi-flips of asparagine, glutamine and histidine, and protonation states of aspartic acid, glutamic acid, and histidine.
- The captermini module for capping uncapped terminal residues.
- The findhets module for finding het (non-protein) compounds in structures.
- The rotamers module provides a rotamer library that can be applied to protein sidechains.
- The
adjust_residue_numbering_panel.py
script in $SCHRODINGER/mmshare-vX.Y/python/common
contains an example of protein structure alignment. The script also renumbers protein residues based on the alignment.
In addition, the schrodinger.structutils.build module contains a useful
mutate
function.
Here is an example of residue mutation and rotamer-library sampling:
from schrodinger.structutils import build
from schrodinger.protein import rotamers
residue = st.residue[index_of_residue_to_mutate]
# Here, mut_type is the three-letter code for the new residue.
build.mutate(st, residue.atom[1], mut_type)
rotamer_lib = rotamers.Rotamers( st, residue.atom[1] )
for rotamer in rotamer_lib.rotamers:
rotamer.apply()
# Operate on the new structure - compute energy, analyze or write
# the structure.
The prepwizard module also contains useful functions for manipulating protein structures. For example, to fill missing loops in a protein structure, based on a FASTA file sequence:
import schrodinger.application.prepwizard as pw
filled_st = pw.fill_missing_loops(input_st, fasta_file, prime_jobname)
QSite¶
The qsite package helps to set up QSite jobs.
VSW¶
The vsw package contains modules for VSW, including pipelining functionality for VSW jobs.
For example, to use VSW pipelining to write a QikProp job for a file of
ligands ligands_file
:
import schrodinger.application.vsw.input as vswinput
input_file = jobname + '.inp'
txt="""
"""
writer = vswinput.Writer(input_file, comment=txt)
writer.addVar('LIGANDS', 'Structures', [ligands_file])
writer.addStage('RUN_QIKPROP',
'qikprop.QikPropStage',
['LIGANDS'],
['OUTPUT_LIGANDS'])
writer.write()