schrodinger.application.desmond.packages.topo module

Functionalities to handle molecular topologies

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.application.desmond.packages.topo.DuckFrame(model)

Bases: object

A duck-type frame with limited interface.

We can duck-type a msys model (i.e., a msys.System object) into a `DuckFrame’ object, for example:

dfr = DuckFrame(msys_model)

Changes to the `DuckFrame’ object should NOT affect the original `model’ object.

__init__(model)

Copy the coordinates, the velocities, and the box matrix of the original model.

FIXME: We only support msys.System type of model for now.

pos(i=None)

Return the position vector(s).

vel()
natoms
nactive
box

Return a row-major 3x3 matrix. The primitive cell vectors are the rows of the matrix, which should be consistent with traj.Frame.box.
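
The copy semantics and the row-wise box layout can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: SimpleFrame and FakeModel are hypothetical names, plain lists stand in for the real arrays, and nothing here is the actual DuckFrame implementation.

```python
import copy


class FakeModel:
    """Hypothetical stand-in for a model holding positions and a box."""

    def __init__(self, positions, box):
        self.positions = positions  # list of [x, y, z]
        self.box = box              # 3x3, rows are primitive cell vectors


class SimpleFrame:
    """Hypothetical duck-typed frame: copies data so edits stay local."""

    def __init__(self, model):
        self._pos = copy.deepcopy(model.positions)
        self.box = copy.deepcopy(model.box)

    def pos(self, i=None):
        # Return all position vectors, or just the i-th one.
        return self._pos if i is None else self._pos[i]

    @property
    def natoms(self):
        return len(self._pos)


model = FakeModel([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]],
                  [[10.0, 0.0, 0.0], [0.0, 20.0, 0.0], [0.0, 0.0, 30.0]])
frame = SimpleFrame(model)
frame.pos(1)[0] = 99.0          # edit the frame only
cell_vector_b = frame.box[1]    # second row = second primitive cell vector
```

Because the constructor deep-copies its inputs, the edit to the frame leaves model.positions unchanged.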

schrodinger.application.desmond.packages.topo.cms_atom(cms_model)

Returns an iterator through all atoms in a CMS model. At each iteration, we get a tuple of (fsys_atom, comp_atom, comp_ct, ct_index), where

  • fsys_atom: atom in full-system CT
  • comp_atom: atom in component CT
  • comp_ct: component CT to which `comp_atom’ belongs
  • ct_index: index of the component CT

The iterator stops right before the first ghost atom, unlike cms_atom_index, which returns indices of all atoms (including ghost atoms).

schrodinger.application.desmond.packages.topo.cms_atom_index(cms_model)

Returns an iterator through all atom indices in a CMS model. At each iteration, we get a tuple of (fsys_atom_index, comp_atom_index, comp_ct, ct_index), where

  • fsys_atom_index: atom index in full-system CT
  • comp_atom_index: atom index in component CT
  • comp_ct: component CT to which `comp_atom_index’ belongs
  • ct_index: index of the component CT

This function differs from cms_atom in that it returns indices of all atoms, including ghost atoms. Note that the fsys_atom_index for ghost atoms is merely a “projected” index, as the full-system CT doesn’t really contain any ghost atoms.
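
The relationship between full-system and component indices can be sketched with a plain generator. This is an illustration only: iter_atom_indices is a hypothetical helper, the component CTs are reduced to atom counts, the full-system CT is assumed to list the component CTs' atoms in order, the 0-based ct_index is an assumption, and ghost atoms are ignored.

```python
def iter_atom_indices(comp_atom_counts):
    """Yield (fsys_atom_index, comp_atom_index, ct_index) tuples.

    comp_atom_counts is a hypothetical stand-in: one atom count per
    component CT, assumed to be concatenated in order in the full system.
    Atom indices are 1-based; ct_index is 0-based (an assumption).
    """
    fsys_index = 0
    for ct_index, count in enumerate(comp_atom_counts):
        for comp_index in range(1, count + 1):
            fsys_index += 1
            yield fsys_index, comp_index, ct_index


# Two component CTs with 3 and 2 atoms.
indices = list(iter_atom_indices([3, 2]))
```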

schrodinger.application.desmond.packages.topo.aid_match(cms_model)

Return type: 1D ‘1-indexed’ numpy.ndarray of ints
Returns: An array of atom indices. Index = 1-based atom index as in the full-system CT, value = index of the matched atom (0 means not matched).

schrodinger.application.desmond.packages.topo.pseudoatom_match(msys_model, cms_model)

This function will find the match between pseudoatoms and return it as two lists of pseudoatom indices. The first list is for the reference CT, and the second for the mutant CT.

match = pseudoatom_match(msys_model, cms_model)
j = match[0][i]

where `j’ is the matched pseudoatom’s index in the ffio_pseudo block of the mutant CT, and `i’ is the pseudoatom’s index in that of the reference CT. If the `i’-th pseudoatom is not matched, the value of `j’ is zero. `match[0][0]’ is always 0, which is junk and should be ignored.

j = match[1][i]

is similar, except that `i’ is of the mutant CT, and `j’ is of the reference CT.

For non-alchemical-FEP systems, this function returns [].
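
Reading the two lists can be sketched as follows. The data below are made up for illustration (real data come from pseudoatom_match), matched_pairs is a hypothetical helper, and treating entry 0 of the second list as junk as well is an assumption.

```python
# Made-up match data for illustration only.
match = [
    [0, 2, 0, 1],  # reference -> mutant; entry 0 is junk
    [0, 3, 1],     # mutant -> reference; entry 0 assumed junk as well
]


def matched_pairs(one_way):
    """Return (i, j) pairs for matched pseudoatoms, skipping junk index 0."""
    return [(i, j) for i, j in enumerate(one_way) if i > 0 and j != 0]


# An empty result ([]) signals a non-alchemical-FEP system.
ref_to_mut = matched_pairs(match[0]) if match else []
mut_to_ref = matched_pairs(match[1]) if match else []
```

In this made-up example, reference pseudoatom 2 has no partner, so it does not appear in either list of pairs.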

Return type: [list[int], list[int]] or []

schrodinger.application.desmond.packages.topo.set_original_index_property(cms_model)

For each atom in the full-system CT, add an atom-level property: constants.ORIGINAL_INDEX, and set its value to the atom’s ID.

schrodinger.application.desmond.packages.topo.find_traj_path(cms_model, base_dir=None)

Return the path of the trajectory file or directory, whose name was saved into a CT-level property in the output cms file. Return None if a valid path cannot be found.

Note that this function is temporary. In the long run the coupling between trajectory directory and cms file will be removed. The user should be explicit about where the trajectory file/folder is.

If you are developing a new command line tool, please do not rely on this function to get the trajectory path. Instead, explicitly pass in the trajectory path. For an example, see: $SCHRODINGER/run analyze_simulation.py -h
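
A minimal sketch of a command-line tool that takes the trajectory path explicitly, using the standard argparse module. The function build_parser and the argument names cms and trj are hypothetical; they are not the interface of analyze_simulation.py.

```python
import argparse


def build_parser():
    """Command-line parser that takes the trajectory path explicitly."""
    parser = argparse.ArgumentParser(description="Hypothetical analysis tool")
    parser.add_argument("cms", help="path to the .cms file")
    parser.add_argument("trj", help="path to the trajectory file or directory")
    return parser


# Parse an explicit argument list here so the sketch runs anywhere.
args = build_parser().parse_args(["sim-out.cms", "sim_trj"])
```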

Return type: str or None

schrodinger.application.desmond.packages.topo.find_traj_path_from_cms_path(cms_path)

Return path of the trajectory directory/file that is “associated” with the given CMS file. This works only if both are in the same parent directory.

Parameters: cms_path (str) – Path to the CMS file
Returns: Path to the trajectory directory or file.
Return type: str or None

schrodinger.application.desmond.packages.topo.read_cms(fname=None, from_string=None, remove_ghost_atoms_in_fsys=True)

Read a .cms file from the given file name fname, or from a string buffer. How does this differ from cms.Cms(fname)?

  • Here, two models will be created for the given .cms file, so you will get a 2-tuple return value: the first member is the msys model, and the second is the cms model.
  • The cms model returned by this function will give you correct gids, whereas the model returned by cms.Cms(fname) might not do so when there are virtual sites.
  • The cms model returned by this function will have three extra attributes:

    • pseudoatoms: This is a map from the gid of a physical atom to that of its pseudoatom(s).
    • pseudomatch: This is the pseudoatom match between the two alchemical molecules. (See the docstring of pseudoatom_match for detail.)

Will we unite this with cms.Cms(fname)? Maybe. But I don’t see a strong motivation right now, and I don’t want to overengineer. -YW (May 26, 2016)

FIXME: msys.LoadMAE(...) is unbelievably slow and may take up to 10 sec for typically-sized alchemical FEP systems.

schrodinger.application.desmond.packages.topo.extract_subsystem(cms_model, asl)

We need to clarify what we call a ``subsystem’’ here. First, let’s review the hierarchical structure of a chemical system in the cms model:

  • At the topmost level is the whole system.
  • Under that are a number of CTs (connectivity tables). Each CT contains one molecule (usually in the case of macromolecules), or more (the most obvious example is the water CT, which includes thousands of water molecules).
  • Under that are chains, which belong to the same molecule but are not necessarily covalently bonded.
  • Under that are residues, groups, atoms, pseudoatoms, …, which we don’t need to go into in detail.

Although there are already a lot of concepts to grasp, they are sometimes still inadequate to describe a portion of the system. The problem arises mainly because the cms model contains force field data. When we select atoms, we have to keep in mind that we won’t be able to create a valid .cms file if the selected atoms don’t match the force field data.

To address that problem, we introduce an extra concept that sits between the CT and the molecule levels in the hierarchy. We call it an ``instance’’, and it refers to a single complete substructure that matches the ffio. For example, in a protein CT, the ffio describes the whole protein structure, so the instance there is the whole protein, and the CT contains only 1 instance; whereas in a water CT, the ffio describes only a single water molecule, so the instance is a single water molecule, and the CT contains a number of instances.

With that, we now define a subsystem as follows: A subsystem is a portion of the original system, containing 1 or more instances of any types as defined in the original system. By this definition, a subsystem should always be a valid cms model.

We allow users to specify a subsystem using ASL, but how do we deal with cases where the ASL expression doesn’t include a complete set of atoms in their respective instances? For now, we simply expand the selection automatically to the whole instance. This is good enough for most cases, I think. Of course, in the future we could be more sophisticated in deciding whether to expand or shrink the selection.
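
The expansion rule can be sketched in a few lines. This is an illustration under stated assumptions, not the function's actual code: expand_to_instances is a hypothetical helper, instances are given as plain lists of atom IDs, and the result is sorted only for readability.

```python
def expand_to_instances(selected_aids, instances):
    """Grow a selection so that every touched instance is fully included.

    instances is a hypothetical list of atom-ID lists, one per instance.
    """
    selected = set(selected_aids)
    expanded = set()
    for instance_aids in instances:
        # Any overlap with the selection pulls in the whole instance.
        if selected.intersection(instance_aids):
            expanded.update(instance_aids)
    return sorted(expanded)


# One 4-atom "protein" instance and two 3-atom "water" instances.
instances = [[1, 2, 3, 4], [5, 6, 7], [8, 9, 10]]
expanded = expand_to_instances([2, 8], instances)
```

Selecting one protein atom and one atom of the second water pulls in both complete instances, while the untouched water is left out.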

Parameters:
  • cms_model (cms.Cms) – A cms model generated by read_cms (see above). Won’t be mutated.
  • asl (str) – ASL expression to specify the subsystem.
Return type:

(new-cms-model, selection-in-aid)

Returns:

Returns a new cms.Cms object (the input cms_model will not be mutated), and an unsorted list of aids of the selected atoms. The new cms.Cms is guaranteed to contain no ghost atoms.

schrodinger.application.desmond.packages.topo.matched_gids(cms_model, sub_cms, sub_msys)

Return GIDs of cms_model that match the GIDs of sub_cms and preserve the ordering, which is essential for trajectory extraction/reduction.

Parameters:
  • cms_model (cms.Cms) – Original system from which the subsystem was extracted
  • sub_cms (cms.Cms) – A subsystem extracted from cms_model. The atoms are expected to have a property called i_des_orig_aid, and the values should be the AIDs in cms_model.
  • sub_msys (msys.System) – This object must match sub_cms.
Return type:

list of int

schrodinger.application.desmond.packages.topo.comp_atoms(cms_model)

A coroutine to iterate through all atoms in all component CTs of the cms_model.

schrodinger.application.desmond.packages.topo.update_ct_box(ct, box)

Given a ct, set the following CT-level properties using the values from box:

“r_chorus_box_ax”, “r_chorus_box_ay”, “r_chorus_box_az”, “r_chorus_box_bx”, “r_chorus_box_by”, “r_chorus_box_bz”, “r_chorus_box_cx”, “r_chorus_box_cy”, “r_chorus_box_cz”
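
The naming pattern of these nine properties can be sketched with a dictionary standing in for the CT’s property table. This is an illustration only: box_to_properties is a hypothetical helper, and it assumes box is a 3x3 matrix whose rows are the cell vectors a, b, and c.

```python
def box_to_properties(box):
    """Map a 3x3 box (rows a, b, c) to the r_chorus_box_* property names."""
    props = {}
    for vector_name, row in zip("abc", box):
        for axis_name, value in zip("xyz", row):
            props[f"r_chorus_box_{vector_name}{axis_name}"] = float(value)
    return props


props = box_to_properties([[10, 0, 0], [0, 20, 0], [0, 0, 30]])
```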

Return type: structure.Structure
Returns: The same input ct object, with its “r_chorus_box_*” CT-level properties updated.

schrodinger.application.desmond.packages.topo.update_ct(ct, cms_model, fr, allaid_gids=None, is_fullsystem=True, update_vel=False)

Updates the coordinates and simulation box matrix of the input ct with the given trajectory frame. Here cms_model and fr should correspond to the same simulation, and ct could be the full-system CT, a component CT, or another subsystem extracted from the system. Note that GCMC systems with multiple types of solvent are not supported.

Parameters:
  • ct (structure.Structure) – structure to be updated
  • cms_model (cms.Cms) – It should be generated by the read_cms function.
  • allaid_gids (numpy.ndarray, list of int, or None) – GIDs of all AIDs. The caller needs to make sure that its length and order match the atoms in ct. None means cms_model.allaid_gids is used.
  • is_fullsystem (bool) – whether or not this represents a fullsystem CT. Knowing this can keep us from doing extra work
  • update_vel (bool) – Update the atom velocities.
Return type:

structure.Structure

Returns:

The same input ct object, with the atom coordinates and simulation box matrix updated from the frame.

Example:

fsys_ct = cms_model.fsys_ct.copy()
update_ct(fsys_ct, cms_model, tr[i])

schrodinger.application.desmond.packages.topo.update_fsys_ct_from_frame_GF(fsys_ct, cms_model, fr, frames_to_smooth=None, aids_to_smooth=None, update_vel=False)

This function is the future version of update_fsys_ct_from_frame and the difference is that fsys_ct will be ghost-free.

schrodinger.application.desmond.packages.topo.update_fsys_ct_from_frame(fsys_ct, cms_model, fr, frames_to_smooth=None, aids_to_smooth=None)

Update a full system CT using a frame object, including atom positions, simulation box, atom-level properties i_des_atom_domain and i_m_visibility. If ghost atoms are present, they are labelled and set invisible. If frames_to_smooth is given, the smoothed coordinates are used to update the full system CT’s coordinates instead.

Parameters:
  • frames_to_smooth (list of traj.Frame) – Frames whose atom coordinates are to be smoothed to update the atom coordinates in fsys_ct. Note it should contain fr.
  • aids_to_smooth (list of int or None) – AIDs of atoms whose coordinates are to be smoothed. If None, defaults to non-solvent atoms.

schrodinger.application.desmond.packages.topo.get_active_fsys_ct_from_frame(fsys_ct, cms_model, fr)

Return a Structure object that contains only the active atoms from the given frame fr. Note that the atom coordinates in the input fsys_ct are updated with those from the frame. For GCMC simulations, this active full system CT is extracted from the input full system CT to exclude ghost (inactive) atoms. A new Structure object will be returned in this case. For normal MD simulations, the active full system CT is the entire system, and the updated fsys_ct object will be returned. This function has the same interface as update_ct.

Parameters: fsys_ct – full system CT

schrodinger.application.desmond.packages.topo.set_atom_velocity(atom, vel)

Set the atom properties r_ffio_x_vel, r_ffio_y_vel, and r_ffio_z_vel to vel[0], vel[1], and vel[2], respectively.

schrodinger.application.desmond.packages.topo.update_cms_physical(cms_model, pos, vel, box)

Update the physical data of a cms model. The physical data include all atom positions and velocities and also the box matrix.

Parameters: cms_model (cms.Cms) – The cms model to be updated. This model should be created by the read_cms function, and its atom count and order should match the pos and vel inputs.

schrodinger.application.desmond.packages.topo.check_consistency(cms_model, frame)

Return None if cms_model and frame are consistent, or a str object describing the inconsistency if they are not consistent.
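
The None-or-message return convention can be sketched as follows. The helper describe_mismatch is hypothetical and compares atom counts only; the checks performed by the real function are not documented here.

```python
def describe_mismatch(model_natoms, frame_natoms):
    """Return None when the counts agree, else a message describing why not."""
    if model_natoms != frame_natoms:
        return (f"model has {model_natoms} atoms, "
                f"frame has {frame_natoms}")
    return None


ok = describe_mismatch(100, 100)
problem = describe_mismatch(100, 98)
if problem is not None:
    # Callers would typically report the message and stop here.
    report = f"inconsistent inputs: {problem}"
```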

Parameters: cms_model (cms.Cms) – It should be generated by the read_cms function.

schrodinger.application.desmond.packages.topo.update_cms(cms_model, frame, update_pseudoatoms=True)

Update the given cms_model with the atom coordinates and the simulation box matrix from a simulation frame frame. For GCMC systems, it also updates the CT-level property ‘i_des_active_total’.

N.B.: If you call this function for every frame of a long trajectory, you might find this function is quite slow. Also keep in mind that its performance has a strong dependency on the number of atoms in the system. If all you want is a full-system CT with atom coordinates updated by the trajectory frame, consider using the get_active_fsys_ct_from_frame or update_ct functions above, which are much faster.

The input cms_model and frame objects should match. When in doubt, call check_consistency.

Parameters: cms_model (cms.Cms) – It should be generated by the read_cms function.
Return type: structure.Structure
Returns: The same input cms_model object, with the atom coordinates and the simulation box matrix updated from the frame.

schrodinger.application.desmond.packages.topo.update_msys(msys_model, frame)

Updates the given msys model msys_model with the atom coordinates and velocities and the simulation box matrix from a simulation frame frame.

Return type: msys.System
Returns: The same input msys_model object, with the atom coordinates, velocities and the simulation box matrix updated from the frame.

schrodinger.application.desmond.packages.topo.get_aids_with_virtuals(cms_model)

Returns: A set of full-system AIDs that have virtual sites attached to them.
Return type: set of int

schrodinger.application.desmond.packages.topo.aids2gids(cms_model, aids, include_pseudoatoms=True)

Convert a list of atom IDs (aids) into a list of gids. If any selected atoms have pseudoatoms associated with them, the pseudoatoms will be included in the list of gids. All pseudoatoms’ GIDs are placed after all physical atoms’ GIDs, in arbitrary order.
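
The resulting ordering can be sketched with plain dictionaries. Everything here is made up for illustration: aids_to_gids is a hypothetical helper, aid_to_gid is an invented lookup table, and the pseudoatoms dictionary only mimics a map from a physical atom’s gid to the gids of its pseudoatoms.

```python
def aids_to_gids(aids, aid_to_gid, pseudoatoms, include_pseudoatoms=True):
    """Hypothetical sketch: physical-atom GIDs first, pseudoatom GIDs last."""
    gids = [aid_to_gid[aid] for aid in aids]
    if include_pseudoatoms:
        # Iterate over a copy so appended pseudoatoms are not revisited.
        for gid in list(gids):
            gids.extend(pseudoatoms.get(gid, []))
    return gids


aid_to_gid = {1: 0, 2: 1, 3: 2}    # made-up AID -> GID table
pseudoatoms = {1: [7], 2: [8, 9]}  # made-up physical GID -> pseudoatom GIDs
gids = aids_to_gids([2, 3], aid_to_gid, pseudoatoms)
physical_only = aids_to_gids([2, 3], aid_to_gid, pseudoatoms,
                             include_pseudoatoms=False)
```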

Parameters: cms_model (cms.Cms) – A cms model generated by read_cms (see above).
Return type: list of int
Returns: A list of gids of the selected particles

schrodinger.application.desmond.packages.topo.asl2gids(cms_model, asl, include_pseudoatoms=True)

Evaluate an ASL expression, and return a list of gids of particles selected by asl. If any selected atoms have pseudoatoms associated to them, the pseudoatoms will be included into the list of gids.

Parameters:
  • cms_model (cms.Cms) – A cms model generated by read_cms (see above).
  • asl (str) – An ASL expression
Return type:

list of int

Returns:

An unsorted list of gids of the selected particles

schrodinger.application.desmond.packages.topo.make_glued_topology(msys_model, cms_model)

Add glued topology to msys_model, which contains extra bonds based on the proximity of the solute molecules. It works with _pfx_apply to properly center disjoint solute molecules (e.g., proteins with missing segments, dimers). The assumption is that the input cms_model has the correct spatial configuration.

schrodinger.application.desmond.packages.topo.make_whole(msys_model, tr)

In MD simulations, molecules can be broken by the periodic boundary condition, which leaves some atoms on one side of the simulation box and the others on the opposite side. This function edits the atom coordinates to make broken molecules whole again for each frame of the MD trajectory tr.
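
The idea of making a molecule whole can be sketched with the common minimum-image approach: walk the bond graph and shift each atom by whole box lengths until it sits next to a bonded neighbor. This is not necessarily the algorithm used by this function. The helper unwrap_molecules is hypothetical, works on a single set of coordinates rather than a trajectory, and assumes an orthorhombic box.

```python
from collections import deque


def unwrap_molecules(positions, bonds, box_lengths):
    """Unwrap bonded molecules in an orthorhombic periodic box.

    Walks the bond graph breadth-first and shifts each newly reached atom
    by whole box lengths so it sits next to an already-placed neighbor.
    Returns new coordinates; the input is left untouched.
    """
    whole = [list(p) for p in positions]
    neighbors = {i: [] for i in range(len(positions))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    visited = set()
    for start in range(len(positions)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for other in neighbors[current]:
                if other in visited:
                    continue
                for axis, length in enumerate(box_lengths):
                    delta = whole[other][axis] - whole[current][axis]
                    whole[other][axis] -= length * round(delta / length)
                visited.add(other)
                queue.append(other)
    return whole


# A diatomic split across the x boundary of a 10 x 10 x 10 box.
broken = [[4.5, 0.0, 0.0], [-4.5, 0.0, 0.0]]
fixed = unwrap_molecules(broken, [(0, 1)], [10.0, 10.0, 10.0])
```

The second atom is moved by one box length along x, so the two atoms end up 1.0 apart instead of 9.0.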

Parameters:
  • msys_model (msys.System) – The msys model, which must be consistent with the trajectory in terms of the molecular topology. This object will not be mutated.
  • tr (list) – The simulation trajectory to be modified
Return type:

list

Returns:

Modified simulation trajectory

schrodinger.application.desmond.packages.topo.glue(msys_model, gids, tr)

First, make-whole all molecules in the simulation system, and then glue the selected molecules together.

Parameters:
  • msys_model (msys.System) – The msys model, which must be consistent with the trajectory in terms of the molecular topology. This object will not be mutated.
  • gids (list) – A list of gids to specify the molecules to be glued
  • tr (list) – The simulation trajectory to be modified
Return type:

list

Returns:

Modified simulation trajectory

schrodinger.application.desmond.packages.topo.center(msys_model, gids, tr, dims=None)

This function will do what glue does, but it will do one more thing: It will translate the coordinates of all atoms so that the specified molecules will be placed at the center (origin) of the simulation box.
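
The translation step can be sketched as follows. This is a minimal illustration, not the library’s code: center_on_selection is a hypothetical helper operating on one set of coordinates, the unweighted centroid is an assumption (the real function may weight atoms differently), and the gluing step is omitted.

```python
def center_on_selection(positions, selected, dims=None):
    """Translate all atoms so the selection's centroid sits at the origin.

    dims lists the axes to center on (0=x, 1=y, 2=z); None means all three.
    """
    axes = [0, 1, 2] if dims is None else list(dims)
    shifted = [list(p) for p in positions]
    for axis in axes:
        centroid = sum(positions[i][axis] for i in selected) / len(selected)
        for p in shifted:
            p[axis] -= centroid
    return shifted


positions = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0]]
centered_z = center_on_selection(positions, [0, 1], dims=[2])
centered_all = center_on_selection(positions, [0, 1])
```

With dims=[2], only the z coordinates change; the x and y coordinates of every atom are left as they were.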

Parameters:
  • msys_model (msys.System) – The msys model, which must be consistent with the trajectory in terms of the molecular topology. This object will not be mutated.
  • gids (list) – A list of gids to specify the molecules to be glued and centered
  • tr (list) – The simulation trajectory to be modified
  • dims (None or list of int) – Dimensions to be centered on, e.g., [2] for the z axis, [0, 1] for the x-y plane. If set to None, center on all three spatial dimensions.
Return type:

list

Returns:

Modified simulation trajectory

schrodinger.application.desmond.packages.topo.superimpose(msys_model, gids, tr, ref_pos, weights=None)

This function does what center does, but in addition it aligns and superimposes all the coordinates in the frame with respect to the reference coordinates (ref_pos). This operation is applied to all frames in the trajectory.

Parameters:
  • msys_model (msys.System) – The msys model, which must be consistent with the trajectory in terms of the molecular topology. This object will not be mutated.
  • gids (list) – A list of gids to specify the molecules to be glued and centered
  • tr (list) – The simulation trajectory to be modified
  • ref_pos (numpy.array) – Reference coordinates used for alignment: a 3xN matrix, where N is the length of gids.
  • weights (list or None) – Atom weights used for alignments. If it’s None, all atoms will be weighted equally.
Return type:

list

Returns:

Modified simulation trajectory

schrodinger.application.desmond.packages.topo.make_whole_cms(msys_model, cms_model)

Similar to make_whole (see above), but on a cms model instead of a simulation trajectory.

Both msys_model and cms_model must be previously obtained through the read_cms function. They both should have the same atom coordinates and the same simulation box matrix.

Return type: structure.Structure
Returns: Modified cms model

schrodinger.application.desmond.packages.topo.glue_cms(msys_model, gids, cms_model)

Similar to glue (see above), but on a cms model instead of a simulation trajectory.

Both msys_model and cms_model must be previously obtained through the read_cms function. They both should have the same atom coordinates and the same simulation box matrix.

Return type: structure.Structure
Returns: Modified cms model

schrodinger.application.desmond.packages.topo.center_cms(msys_model, gids, cms_model, dims=None)

Similar to center (see above), but on a cms model instead of a simulation trajectory.

Both msys_model and cms_model must be previously obtained through the read_cms function. They both should have the same atom coordinates and the same simulation box matrix.

Parameters: dims (None or list of int) – Dimensions to be centered on, e.g., [2] for the z axis, [0, 1] for the x-y plane. If set to None, center on all three spatial dimensions.
Return type: structure.Structure
Returns: Modified cms model

schrodinger.application.desmond.packages.topo.superimpose_cms(msys_model, gids, cms_model, ref_pos, weights=None)

Similar to superimpose (see above), but on a cms model instead of a simulation trajectory.

Both msys_model and cms_model must be previously obtained through the read_cms function. They both should have the same atom coordinates and the same simulation box matrix.

Return type: structure.Structure
Returns: Modified cms model

schrodinger.application.desmond.packages.topo.is_dynamic_asl(cms_model, asl) → bool

Return True if ASL could evaluate to different results on different frames.

schrodinger.application.desmond.packages.topo.replicate_ct(ct, rep_vec, natoms=None)

Replicate a CT or adjust an existing replication along the three primitive cell directions. The ct is modified in place as a side effect.

Parameters:
  • rep_vec (tuple of 3 int) – A replication vector whose components denote the number of copies along the primitive cell directions. All components should be positive, e.g., (1, 1, 1) means no extra copy.
  • natoms – Number of atoms in the un-replicated CT. If None, the input ct is considered un-replicated.
schrodinger.application.desmond.packages.topo.unroll_pos(ct, rep_vec, xyz0)

Set coordinates for all copies of the replicated CT.
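
How a replication vector translates into coordinates for the copies can be sketched as follows. This is an illustration only: replicate_positions is a hypothetical helper working on plain lists, the box is assumed to be a 3x3 matrix whose rows are the primitive cell vectors, and the ordering of the copies is an assumption.

```python
from itertools import product


def replicate_positions(positions, box, rep_vec):
    """Return coordinates for all copies of a cell replicated rep_vec times.

    box is a 3x3 matrix whose rows are the primitive cell vectors. The copy
    ordering used here (first cell direction varies slowest) is an assumption.
    """
    replicated = []
    for na, nb, nc in product(*(range(n) for n in rep_vec)):
        # Displacement of this copy: na*a + nb*b + nc*c.
        shift = [na * box[0][k] + nb * box[1][k] + nc * box[2][k]
                 for k in range(3)]
        for x, y, z in positions:
            replicated.append([x + shift[0], y + shift[1], z + shift[2]])
    return replicated


box = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
copies = replicate_positions([[1.0, 1.0, 1.0]], box, (2, 1, 1))
no_copy = replicate_positions([[1.0, 1.0, 1.0]], box, (1, 1, 1))
```

A replication vector of (1, 1, 1) returns the original coordinates unchanged, matching the statement that it means no extra copy.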