schrodinger.application.scaffold_enumeration.cxsmiles module

The functions that parse “repeating units” and “position variant bonds” from CX SMILES “features” text are simplistic, but probably adequate for machine-generated CX SMILES.

class schrodinger.application.scaffold_enumeration.cxsmiles.MCG(atoms, center)

Bases: tuple

__contains__: Return key in self.

__init__: Initialize self. See help(type(self)) for accurate signature.

__len__: Return len(self).

atoms: List of atom indices ([int]).

center: Central atom index (int).

count(value) → integer: Return the number of occurrences of value.

index(value[, start[, stop]]) → integer: Return the first index of value. Raises ValueError if the value is not present.

class schrodinger.application.scaffold_enumeration.cxsmiles.SRU(atoms, subscript, superscript)

Bases: tuple

__contains__: Return key in self.

__init__: Initialize self. See help(type(self)) for accurate signature.

__len__: Return len(self).

atoms: List of atom indices ([int]).

count(value) → integer: Return the number of occurrences of value.

index(value[, start[, stop]]) → integer: Return the first index of value. Raises ValueError if the value is not present.

subscript: SRU’s subscript (str).

superscript: SRU’s superscript (str).
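
The two tuple-based classes above can be pictured with a minimal sketch. It assumes that MCG and SRU behave like collections.namedtuple types with the documented field names; the stand-ins below are defined locally for illustration and are not imported from the module.

```python
from collections import namedtuple

# Local stand-ins (assumption: the real classes are namedtuple-like).
MCG = namedtuple("MCG", ["atoms", "center"])
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

mcg = MCG(atoms=[7, 6, 5, 4, 3], center=0)
sru = SRU(atoms=[1, 2, 4], subscript="n", superscript="ht")

# Fields are reachable by name ...
center = mcg.center
# ... and by ordinary tuple unpacking.
atoms, subscript, superscript = sru

# The inherited tuple methods operate on the fields themselves.
n_fields = len(sru)            # number of fields, not number of atoms
has_ht = "ht" in sru           # membership test over the field values
center_slot = mcg.index(0)     # position of the first field equal to 0
```

Note that len() and the in operator act on the fields of the tuple, so the atom list has to be inspected through the atoms attribute.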

schrodinger.application.scaffold_enumeration.cxsmiles.parse_mcg(text, pos, accum)

Parses “multi-center SGroup” data from CX SMILES “features”.

<quote>

The multicenter atom indexes written after “m:” followed by a colon character and the indexes of the atoms which forms the given SGroup separated by “.”. The SGroups are separated by commas.

Example: “m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1”

</quote>

Parameters:	text (str) – CX SMILES “features” string.
	pos (int) – Index of the character in `text` right after “m:”.
	accum (list) – List to which the “SGroups” are to be appended.
Returns:	Index of the first unconsumed character in `text`.
Return type:	int
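
The format quoted above can be illustrated with a small stand-alone parser. This is only a sketch under stated assumptions, not the module's implementation: the name parse_mcg_sketch, the local MCG stand-in, and the choice to leave the comma that precedes the next feature unconsumed are all hypothetical.

```python
import re
from collections import namedtuple

# Local stand-in for the documented MCG(atoms, center) tuple.
MCG = namedtuple("MCG", ["atoms", "center"])

# One group: "<center>:<atom>.<atom>..." (all indices are integers).
_GROUP = re.compile(r"(\d+):(\d+(?:\.\d+)*)")


def parse_mcg_sketch(text, pos, accum):
    """Append MCGs found at `pos` to `accum`; return the stop index."""
    while True:
        match = _GROUP.match(text, pos)
        if match is None:
            return pos
        center = int(match.group(1))
        atoms = [int(tok) for tok in match.group(2).split(".")]
        accum.append(MCG(atoms=atoms, center=center))
        pos = match.end()
        # Step over a comma only when another group follows it;
        # otherwise the comma belongs to the next feature (assumption).
        if text[pos:pos + 1] == "," and _GROUP.match(text, pos + 1):
            pos += 1
        else:
            return pos


features = "m:0:7.6.5.4.3,2:12.11.10.9.8,C:0.0,2.1"
groups = []
end = parse_mcg_sketch(features, 2, groups)
```

With the example string from the quote, the sketch collects two groups and stops in front of the “,C:…” feature.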

schrodinger.application.scaffold_enumeration.cxsmiles.parse_sru(text, pos, accum)

Parses “SRU” data from CX SMILES “features”.

<quote>

Polymer Sgroups

Each Sgroup exported after “Sg:” in fields separated by a colon. Fields are:

Atom indexes separated with commas.
Subscript of the Sgroup. If the supscript equals the keyword of the Sgroup this field can be empty. Escaped field.
Superscript of the Sgroup. In the superscript only connectivity and flip information is allowed. This field can be empty. Escaped field.
Head crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
Tail crossing bond indexes. The indexes of bonds that share a common bracket in case of ladder-type polymers. This field can be empty.
If the c export option is present then bracket orientation, bracket type followed by the coordinates (4 pair, separated with commas). Bracket orientation can be s or d (single or double), bracket type can be b,c,r,s for braces, chevrons, round and square, respectively. The brackets are written between parentheses and separated with semicolons.

A colon is needed after the last non-empty field.

If one needs to retain not only the chemically relevant information, but the whole structure (as drawn), then the c export option should be used.

</quote>

In addition:

<quote>

Escaping

In some places special characters are escaped to ‘&#code’ where code is the ASCII code of the special character.

Not escaped characters in fields of Sgroups and DataSgroups: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.

Not escaped characters in atom property keys and values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#$%()[]./?-+*^_~=’ and the space character.

Not escaped characters in atom labels and atom values: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘><”!@#%()[]./?-+*^_~=,:’ and the space character.

</quote>
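
The escaping rule quoted above could be undone along the following lines. This is a hedged sketch: unescape_field_sketch is a hypothetical helper, and because the quoted text does not say whether a terminating semicolon is written after the code, the sketch accepts both forms.

```python
import re

# "&#<decimal code>" with an optional trailing ";" (assumption).
_ESCAPE = re.compile(r"&#(\d+);?")


def unescape_field_sketch(field):
    """Replace every '&#code' sequence with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1))), field)


colon = unescape_field_sketch("a&#58;b")   # 58 is the ASCII code of ':'
comma = unescape_field_sketch("x&#44y")    # 44 is the ASCII code of ','
plain = unescape_field_sketch("ht")        # nothing to unescape
```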

This subroutine recognizes only: atoms (2), subscript (3), and superscript (4).

Parameters:	text (str) – CX SMILES “features” string.
	pos (int) – Index of the character in `text` right after “Sg:n:”.
	accum (list) – List to which the “SGroups” are to be appended.
Returns:	Index of the first unconsumed character in `text`.
Return type:	int
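
Reading the three recognized fields can be sketched as follows. The code is an illustration under assumptions rather than the module's implementation: parse_sru_sketch and the local SRU stand-in are hypothetical, the subscript and superscript are returned as written (still escaped), and the later fields (crossing bonds, brackets) are not consumed.

```python
import re
from collections import namedtuple

# Local stand-in for the documented SRU(atoms, subscript, superscript).
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

# "<atom>,<atom>,...:<subscript>:<superscript>"; the two text fields
# may be empty and are assumed to contain no raw ':', ',' or '|'.
_SRU = re.compile(r"(\d+(?:,\d+)*):([^:,|]*):([^:,|]*)")


def parse_sru_sketch(text, pos, accum):
    """Append one SRU found at `pos` to `accum`; return the stop index."""
    match = _SRU.match(text, pos)
    if match is None:
        return pos  # nothing recognized, nothing consumed
    atoms = [int(tok) for tok in match.group(1).split(",")]
    accum.append(SRU(atoms, match.group(2), match.group(3)))
    return match.end()


features = "Sg:n:1,2,4::ht"
srus = []
end = parse_sru_sketch(features, len("Sg:n:"), srus)
```

Here the subscript field is empty, which the quoted description allows when the subscript equals the Sgroup keyword.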

schrodinger.application.scaffold_enumeration.cxsmiles.parse_cx_extensions(text)

Parses: (a) multi-center groups and (b) SRUs.

Parameters:	text (str) – CX extensions to be parsed.
Returns:	Tuple of lists that hold the MCGs and SRUs.
Return type:	(list(MCG), list(SRU))
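
To make the shape of the return value concrete, here is a simplified, self-contained sketch of a function with the same contract. Everything in it is an assumption made for illustration: the name parse_cx_extensions_sketch, the local MCG and SRU stand-ins, and the regular expressions, which expect features to start either at the beginning of the text or right after a comma.

```python
import re
from collections import namedtuple

MCG = namedtuple("MCG", ["atoms", "center"])
SRU = namedtuple("SRU", ["atoms", "subscript", "superscript"])

_MCG_START = re.compile(r"(?:^|,)m:")
_MCG_GROUP = re.compile(r"(\d+):(\d+(?:\.\d+)*)")
_SRU = re.compile(r"(?:^|,)Sg:n:(\d+(?:,\d+)*):([^:,|]*):([^:,|]*)")


def parse_cx_extensions_sketch(text):
    """Return (list of MCG, list of SRU) found in `text`."""
    mcgs, srus = [], []

    # Multi-center groups: comma-separated groups after the first "m:".
    start = _MCG_START.search(text)
    pos = start.end() if start else len(text)
    while True:
        group = _MCG_GROUP.match(text, pos)
        if group is None:
            break
        atoms = [int(tok) for tok in group.group(2).split(".")]
        mcgs.append(MCG(atoms=atoms, center=int(group.group(1))))
        pos = group.end() + 1  # step over the separating comma

    # SRUs: every "Sg:n:" feature, first three fields only.
    for sru in _SRU.finditer(text):
        atoms = [int(tok) for tok in sru.group(1).split(",")]
        srus.append(SRU(atoms, sru.group(2), sru.group(3)))

    return mcgs, srus


mcgs, srus = parse_cx_extensions_sketch(
    "m:0:7.6.5.4.3,2:12.11.10.9.8,Sg:n:1,2,4::ht")
```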

schrodinger.application.scaffold_enumeration.cxsmiles.mol_from_cxsmiles(text, parseName=True)

Attempts to instantiate rdkit.Chem.Mol from text, assuming that the text is CX SMILES.

Parameters:	text (str) – CX SMILES string.
	parseName (bool) – Parse molecule title?
Returns:	Molecule or None
Return type:	rdkit.Chem.Mol or NoneType
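
A CX SMILES line generally consists of a SMILES string, an optional extensions block enclosed in “|”, and an optional title. The sketch below shows how such a line might be split before the pieces are handed to a SMILES parser and to the extension parsers. split_cxsmiles_sketch is a hypothetical helper that uses only the standard library; it does not build a molecule and is not how mol_from_cxsmiles is known to work.

```python
def split_cxsmiles_sketch(text, parse_name=True):
    """Split CX SMILES text into (smiles, extensions, name)."""
    smiles, _, rest = text.strip().partition(" ")
    rest = rest.strip()
    extensions = ""
    if rest.startswith("|"):
        close = rest.find("|", 1)
        if close != -1:
            # Text between the two bars is the "features" string.
            extensions = rest[1:close]
            rest = rest[close + 1:].strip()
    # Whatever remains is treated as the title (assumption).
    name = rest if (parse_name and rest) else None
    return smiles, extensions, name


full = split_cxsmiles_sketch("*CC(*)O |Sg:n:1,2,4::ht| polymer")
no_title = split_cxsmiles_sketch("*CC(*)O |Sg:n:1,2,4::ht| polymer",
                                 parse_name=False)
bare = split_cxsmiles_sketch("CCO")
```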
