schrodinger.application.matsci.genetic_optimization.genetic_optimization module¶
Classes and functions for the genetic optimization module.
Copyright Schrodinger, LLC. All rights reserved.
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
CanvasKPLS
(structs, properties, copy_models=True, fp_options_dict=None)¶ Bases:
schrodinger.application.matsci.genetic_optimization.genetic_optimization.ClassEvaluator
Manage Canvas KPLS jobs.
-
ALLOWED_FP_TYPES
= ['linear', 'maccs', 'radial', 'molprint2D', 'torsion', 'pairwise', 'triplet', 'quartet', 'dendritic']¶
-
BIT_EXT
= '-bit'¶
-
CELSIUS
= 'C'¶
-
CLASS_KWARGS
= OrderedDict([('kpls_tg', 'kpls_model>/scr/buildbot/savedbuilds/2017-4/NB/build-152/mmshare-v4.0/data/genetic_optimization/canvas_kpls_models/Tg250.kpls.tar.gz')])¶
-
CUSTOM_KEY
= 'r_matsci_KPLS_%s/%s'¶
-
DIR
= 'genetic_optimization/canvas_kpls_models'¶
-
DOUBLE_PRECISION
= 64¶
-
EXTRA_OPTIONS
= ['kpls_model']¶
-
FP_OPTIONS_DICT_KEY
= 'fp_options_dict'¶
-
FP_TEXT_FILE
= 'fpInfo.txt'¶
-
IN_FP_EXT
= '-in.fp'¶
-
KPLS_EXT
= 'kpls.tar.gz'¶
-
MODEL_OPTION
= 'kpls_model'¶
-
OUT_EXT
= '.out'¶
-
PATH
= '/scr/buildbot/savedbuilds/2017-4/NB/build-152/mmshare-v4.0/data/genetic_optimization/canvas_kpls_models'¶
-
SINGLE_PRECISION
= 32¶
-
TAG
= '_kpls'¶
-
TG_KEY
= 'r_matsci_KPLS_Tg/C'¶
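The template in CUSTOM_KEY appears to take a property name and a unit, and filling it with 'Tg' and 'C' reproduces TG_KEY. A minimal sketch with the constant values copied from this page; make_kpls_key is a hypothetical helper, and defaulting the unit to UNKNOWN_UNITS is an assumption, not documented behavior:

```python
# Values copied from the class constants documented on this page.
CUSTOM_KEY = 'r_matsci_KPLS_%s/%s'
TG_KEY = 'r_matsci_KPLS_Tg/C'
CELSIUS = 'C'
UNKNOWN_UNITS = 'unknown'


def make_kpls_key(name, units=UNKNOWN_UNITS):
    """Hypothetical helper: fill the CUSTOM_KEY template with a name and unit."""
    return CUSTOM_KEY % (name, units)


# Filling the template with 'Tg' and 'C' gives the documented TG_KEY.
print(make_kpls_key('Tg', CELSIUS))
```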
-
TG_PARAMETERS
= OrderedDict([('kpls_model', '/scr/buildbot/savedbuilds/2017-4/NB/build-152/mmshare-v4.0/data/genetic_optimization/canvas_kpls_models/Tg250.kpls.tar.gz')])¶
-
TG_PROP
= 'kpls_tg'¶
-
TG_UNITS
= 'C'¶
-
UNKNOWN_UNITS
= 'unknown'¶
-
VALUE_PATTERN
= <_sre.SRE_Pattern object>¶
-
static
addCanvasKPLSOptions
(property_string, options, name, known_name, index)¶ Add any Canvas KPLS options found in the given property string to the given property options dictionary.
Parameters: - property_string (str) – the string representation of the property specifications, containing options as ‘<option_substring>=<value>’
- options (dict) – contains property options
- name (str) – the property name
- known_name (bool) – if the given property name is a known name
- index (int) – the property index
Raises: - UnknownNameError – if the name is not allowed to be unknown
- IncompleteExtraOptionsError – if the provided Canvas KPLS options are incomplete
Return type: dict
Returns: the given property options with Canvas KPLS options added
-
static
checkCanvasKPLSModelFile
(model_file)¶ Check the given Canvas KPLS model file.
Parameters: model_file (str) – the name of the Canvas KPLS model file Raises: RuntimeError – if there is anything wrong with the Canvas KPLS model file
-
copyCanvasKPLSModelFiles
()¶ Copy the Canvas KPLS model files to the CWD.
-
static
getFpOptions
(model_file)¶ Return fingerprint options obtained from the given Canvas KPLS model file.
Parameters: model_file (str) – the name of the Canvas KPLS model file Return type: int, str, int or None Returns: contains (1) precision, (2) fingerprint type, and (3) atom type if present Raises: RuntimeError – if there is anything wrong with the Canvas KPLS model file
-
getPropertyValue
(property_outfile)¶ Get the property value.
Parameters: property_outfile (str) – the Canvas KPLS output file Raises: RuntimeError – if property output file doesn’t exist or doesn’t contain the property value Return type: float Returns: the property value
-
static
getValidCanvasKPLSModelFiles
(property_lists)¶ Return file names of any valid Canvas KPLS model files.
Parameters: property_lists (list) – contains lists of property specifications Return type: list Returns: file names of valid Canvas KPLS model files
-
makeFingerPrintInfile
(mae_infile, name, key)¶ Make fingerprint infile.
Parameters: Raises: RuntimeError – if canvasFPGen fails
Return type: Returns: the Canvas fingerprint input file name
-
makeFpOptionsDict
()¶ Make the fingerprint options dictionary.
-
makeMaestroInfile
(struct)¶ Make Maestro infile.
Parameters: struct (schrodinger.structure.Structure) – the structure for which to prepare the Maestro input file Return type: str Returns: the Maestro input file name
-
runCanvasKPLS
(struct)¶ Run the Canvas KPLS.
Parameters: struct (schrodinger.structure.Structure) – the structure on which to run Canvas KPLS Raises: RuntimeError – if canvasKPLS fails
-
runIt
()¶ Run it.
-
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
CheckInput
¶ Bases:
object
Manage checking user input.
-
checkConformationalSearch
(conformational_search, logger=None)¶ Check the conformational search.
Parameters: - conformational_search (bool or str) – specifies whether a conformational search is to be performed, if a string is given specifies a file used to set options
- logger (logging.Logger) – output logger
-
checkElitism
(elitism, population, logger=None)¶ Check the elitism.
Parameters: - elitism (int) – the number of elite individuals to use
- population (int) – the size of population to use
- logger (logging.Logger) – output logger
-
checkFragmentLibs
(fragment_libs, logger=None)¶ Check the specified fragment libraries.
Parameters: - fragment_libs (list) – strings specifying fragment libraries to be used
- logger (logging.Logger) – output logger
Return type: list
Returns: valid user provided fragment files
-
checkFreezers
(freezers, pop_size, input_size, logger=None)¶ Check the freezers.
Parameters: - freezers (list) – collection of freezers to use
- pop_size (int) – the size of the population
- input_size (int) – the number of structures given
- logger (logging.Logger) – output logger
Return type: list
Returns: collection of freezers to use
-
checkGenerations
(generations, logger=None)¶ Check the specified number of generations.
Parameters: - generations (int) – the number of generations
- logger (logging.Logger) – output logger
-
checkInitialPopulation
(initial_population, crossover_names, mutator_names, crossover_rate, mutation_rate, no_open_shell, logger=None)¶ Check the initial population.
Parameters: - initial_population (list) – the initial population of schrodinger.structure.Structure
- crossover_names (list) – contains the function names of the crossover operators to be used
- mutator_names (list) – contains the function names of the mutation operators to be used
- crossover_rate (float) – the rate of crossover
- mutation_rate (float) – the rate of mutation
- no_open_shell (bool) – if True then check for open shell structures otherwise do not
- logger (logging.Logger) – output logger
-
checkInoculate
(inoculate, logger=None)¶ Check the inoculate.
Parameters: - inoculate (list) – circumstances in which to inoculate
- logger (logging.Logger) – output logger
-
checkMaeFile
(input_file, logger=None)¶ Check that a file exists and is *mae.
Parameters: - input_file (str) – the name of the input file
- logger (logging.Logger) – output logger
-
checkNodeFile
(logger=None)¶ Check the hosts in the node file, specified by the SCHRODINGER_NODEFILE envvar, that have been allocated by the queue manager.
Parameters: logger (logging.Logger) – output logger
-
checkOperators
(operators, logger=None)¶ Check the operators.
Parameters: - operators (list) – contains tuples of the operator functions and their weights
- logger (logging.Logger) – output logger
-
checkPopulationParam
(population, num_structures_given, logger=None)¶ Check the population parameter.
Parameters: - population (int) – the size of the population to use in the genetic optimization
- num_structures_given (int) – the number of structures provided to the genetic optimization
- logger (logging.Logger) – output logger
-
checkProperties
(properties, logger=None)¶ Check the list of properties.
Parameters: - properties (list) – contains Property instances
- logger (logging.Logger) – output logger
-
checkRates
(crossover_rate, mutation_rate, logger=None)¶ Check the specified rates of crossover and mutation.
Parameters: - crossover_rate (float) – the rate of crossover as a percentage
- mutation_rate (float) – the rate of mutation as a percentage
- logger (logging.Logger) – output logger
-
checkScaling
(scaling, properties, logger=None)¶ Check the scaling.
Parameters: - scaling (str) – the scaling protocol to use in the genetic optimization
- properties (list) – the properties to be optimized
- logger (logging.Logger) – output logger
-
checkSelection
(selection, logger=None)¶ Check the specified selection protocol.
Parameters: - selection (str) – the selection protocol to use.
- logger (logging.Logger) – output logger
-
checkTerminationParams
(terminators, num_unproductive, logger=None)¶ Check the termination parameters.
Parameters: - terminators (list) – the list of terminators to use
- num_unproductive (int) – used when the unproductive termination option is active; it is the generation number on which to exit if the score hasn’t improved
- logger (logging.Logger) – output logger
Return type: list and int
Returns: valid terminators and valid num_unproductive
-
checkTournamentSize
(tournament_size, population, logger=None)¶ Check the specified tournament size.
Parameters: - tournament_size (int) – the size of tournament to use in tournament based selection
- population (int) – the size of population to use
- logger (logging.Logger) – output logger
-
checkTpp
(tpp_ga, population, eval_kwargs, logger=None)¶ Check the threads per processor.
Parameters: - tpp_ga (int) – the threads per processor for the genetic optimization
- population (int) – the size of population to use
- eval_kwargs (dict) – the kwargs for the evaluation function
- logger (logging.Logger) – output logger
-
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
ClassEvaluator
(structs, properties)¶ Bases:
object
Manage a class evaluator.
-
runIt
()¶ Run it.
Raises: RuntimeError – for any issue
-
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
Failure
(genome, msg)¶ Bases:
schrodinger.application.matsci.genetic_optimization.genetic_optimization.Skip
Manage a failure.
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
GeneticOptimization
(initial_population, properties, structure_score_threshold=-50.0, eval_kwargs={}, crossovers=None, mutators=None, fragment_libs=['optoelectronics'], script_evaluator=None, generations=10, population=8, crossover_rate=90.0, mutation_rate=90.0, selection='roulette_wheel', tournament_size=2, terminators=['unproductive', 'all_properties'], num_unproductive=6, scaling='sigma_truncation', elitism=1, random_seed=None, no_minimize=False, file_base_name='genopt', tpp_ga=1, no_open_shell=False, props_to_remove=None, jobbe=None, conformational_search=False, freezers=['remainder', 'previous'], inoculate=['no_child', 'bad_structure'], class_evaluator=None, logger=None)¶ Bases:
object
Manage the genetic optimization.
-
MSGWIDTH
= 80¶
-
checkInputParams
()¶ Check the input parameters.
-
initializeGA
(genome)¶ Initialize the genetic optimization.
Parameters: genome (StructureGenome) – a genome
-
initializeGenome
()¶ Initialize a genome.
Return type: StructureGenome Returns: a genome
-
printParams
()¶ Log the parameters.
-
printProperties
()¶ Log the set of sought properties and their details.
-
runIt
()¶ Run the components of the genetic optimization.
-
setOperatorNames
()¶ Set the operator names.
-
setRootLoggerForPyEvolve
()¶ Set up the root logger for PyEvolve.
-
-
exception
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
IncompleteExtraOptionsError
¶ Bases:
exceptions.Exception
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
Property
(index=1, key=None, name=None, units=None, minimax=None, target=None, comparator=None, error=None, weight=1.0, positive=None, structure_property=False, patterns=None, summarize=None, class_kwargs=None)¶ Bases:
object
Manage a property to be used in a genetic optimization.
-
EQUALS
= 'eq'¶
-
GREATER_THAN
= 'gt'¶
-
LESS_THAN
= 'lt'¶
-
MAX
= 'max'¶
-
MIN
= 'min'¶
-
SUB_OPTIONS
= ['index', 'key', 'name', 'units', 'minimax', 'target', 'comparator', 'error', 'weight', 'positive', 'structure_property', 'patterns', 'summarize', 'class_kwargs']¶
-
static
addKwargs
(property_string, kwargs)¶ Add the given options to the given property string.
Parameters: - property_string (str) – the string representation of the property specifications, containing options as ‘<option_substring>=<value>’
- kwargs (dict) – key-value option pairs to add to the property string
Return type: Returns: the string representation of the property specifications containing the new options
-
checkProperty
()¶ Check this property instance.
-
static
getClassKwargsPropertySubOption
(class_kwargs, kwarg_sep='>', kwargs_sep=', ')¶ Return the class kwargs property sub-option string from the given class kwargs.
Parameters: Return type: Returns: the class kwargs property sub-option string
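The default separators suggest a string of the form 'key>value, key>value', which matches the 'kpls_model>...' value shown in CanvasKPLS.CLASS_KWARGS. A minimal sketch of that joining step under this assumption; class_kwargs_to_suboption is a hypothetical stand-in, and the 'units' entry and shortened file name are illustrative only:

```python
from collections import OrderedDict


def class_kwargs_to_suboption(class_kwargs, kwarg_sep='>', kwargs_sep=', '):
    """Hypothetical stand-in: join key/value pairs into one sub-option string."""
    return kwargs_sep.join(
        '%s%s%s' % (key, kwarg_sep, value)
        for key, value in class_kwargs.items())


# Illustrative input; only the 'kpls_model' key appears on this page.
example = OrderedDict([('kpls_model', 'Tg250.kpls.tar.gz'), ('units', 'C')])
print(class_kwargs_to_suboption(example))
```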
-
static
getKwargs
(property_string, option_substrings, add_relative_paths=None)¶ Return kwargs of the given property options from the given property string.
Parameters: - property_string (str) – the string representation of the property specifications, containing options as ‘<option_substring>=<value>’
- option_substrings (list or str) – contains the option substrings for the needed values, a single occurrence or list of occurrences may be passed
- add_relative_paths (list) – contains options for which relative paths should be added, such relative paths might be needed for correctly parallelizing the evaluation stage of the genetic optimization as they will be needed to copy otherwise shared files into local subdirectories
Return type: Returns: the extracted dictionary of kwargs or single kwarg depending on the input option_substrings or None if nothing is found
-
static
getPropertyStrings
(property_lists)¶ Return property strings from the given property lists.
Parameters: property_lists (list) – contains lists of property specifications Return type: list Returns: contains string representations of the property specifications
-
isClassProperty
()¶ Return True if this property is a class property, False otherwise.
Return type: bool Returns: return True if this property is a class property, False otherwise
-
isScriptProperty
()¶ Return True if this property is a script property, False otherwise.
Return type: bool Returns: return True if this property is a script property, False otherwise
-
isStructureProperty
()¶ Return True if this property is a structure property, False otherwise.
Return type: bool Returns: return True if this property is a structure property, False otherwise
-
parsePropertyString
(property_string)¶ Parse the attributes of this class from a string representation of the property specifications. For example, ‘index=1 key=r_matsci_Reduction_Potential_(eV) name=reduction units=eV target=1.28 comparator=eq error=0.05 weight=0.5’ or ‘index=2 key=r_matsci_Oxidation_Potential_(eV) name=oxidation units=eV minimax=max weight=2.5’
Parameters: property_string (str) – the string representation of the property specifications
Raises: - PropertySyntaxError – if there is something wrong with the property syntax
- UnknownPropertySuboptionError – if an unknown property suboption is found
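The property string format shown above can be read as whitespace-separated 'name=value' tokens. The following is a minimal sketch of such a parser, not the module's implementation: parse_property_string is a hypothetical function, it raises the built-in ValueError and KeyError in place of PropertySyntaxError and UnknownPropertySuboptionError, it leaves all values as strings, and it assumes values contain no spaces.

```python
# Sub-option names copied from Property.SUB_OPTIONS on this page.
SUB_OPTIONS = ['index', 'key', 'name', 'units', 'minimax', 'target',
               'comparator', 'error', 'weight', 'positive',
               'structure_property', 'patterns', 'summarize', 'class_kwargs']


def parse_property_string(property_string):
    """Hypothetical parser: split 'name=value' tokens into a dict of strings."""
    parsed = {}
    for token in property_string.split():
        name, sep, value = token.partition('=')
        if not sep or not value:
            # Stand-in for PropertySyntaxError.
            raise ValueError('malformed token: %r' % token)
        if name not in SUB_OPTIONS:
            # Stand-in for UnknownPropertySuboptionError.
            raise KeyError('unknown sub-option: %r' % name)
        parsed[name] = value
    return parsed


spec = ('index=1 key=r_matsci_Reduction_Potential_(eV) name=reduction '
        'units=eV target=1.28 comparator=eq error=0.05 weight=0.5')
print(parse_property_string(spec))
```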
-
static
rmKwargs
(property_string, option_substrings)¶ Return a copy of the given property string with all of the given property option substrings removed.
Parameters: - property_string (str) – the string representation of the property specifications, containing options as ‘<option_substring>=<value>’
- option_substrings (list) – contains the option substrings to be removed
Return type: Returns: the string representation of the property specifications less the options substrings that were to be removed
-
setAttributes
(index=1, key=None, name=None, units=None, minimax=None, target=None, comparator=None, error=None, weight=1.0, positive=None, structure_property=False, patterns=None, summarize=None, class_kwargs=None)¶ Set some attributes for this class.
Parameters: - index (int) – a numeric index used to refer to this Property instance, a default of 1 is used
- key (str) – the schrodinger.structure.Structure property key to be optimized
- name (str) – specify a name for the property, this name will be used, for example, in any *log files, etc.
- units (str) – enter the units that the property is in, for example eV, nm, etc.
- minimax (str) – to minimize or maximize this property then set this option to the class constants MIN or MAX
- target (float) – if the genetic optimization should aim for a specific value instead of maximizing or minimizing the property, enter that value using this option.
- comparator (str) – specify here how the target value and computed values are to be compared, i.e. either the class constants EQUALS for =, GREATER_THAN for >, or LESS_THAN for <.
- error (float) – if equality to a target value has been specified then this option allows the user to control the error bounds of the target value, if not specified then a default of 10% of the specified target value will be used.
- weight (float) – specify the weight to use for this property, if the genetic optimization is to be run on several properties then the weight allows the user to bias the solution. This option can also be used to control a situation where more than a single property is desired and where those properties are quantified using different physical units such that the numbers might be orders of magnitude apart from one another, for example comparing eV and nm. A default of 1.0 is used.
- positive (bool) – True if this property can only take on positive values, for example as in the area of a surface, False otherwise, for example as in temperature in Celsius. The default is False.
- structure_property (bool) – True if this property is a structure property, False otherwise, i.e. if this property requires an external calculation, typically for physical observables
- patterns (list) – contains SMARTS patterns
- summarize (bool) – if True then print a summary of this property, False otherwise
- class_kwargs (OrderedDict) – contains kwargs for class based evaluation of this property
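The interplay of target, comparator, and error can be illustrated with a small comparison function. This is a sketch under stated assumptions, not the module's scoring code: meets_target is a hypothetical name, the error is treated as a symmetric absolute tolerance, and the 10% default is taken from the absolute value of the target.

```python
# Comparator values copied from the Property class constants on this page.
EQUALS, GREATER_THAN, LESS_THAN = 'eq', 'gt', 'lt'


def meets_target(value, target, comparator, error=None):
    """Hypothetical check of a computed value against a target."""
    if comparator == EQUALS:
        if error is None:
            # Documented default: 10% of the specified target value.
            error = 0.1 * abs(target)
        return abs(value - target) <= error
    if comparator == GREATER_THAN:
        return value > target
    if comparator == LESS_THAN:
        return value < target
    raise ValueError('unknown comparator: %r' % comparator)


print(meets_target(1.30, 1.28, EQUALS, error=0.05))
print(meets_target(1.50, 1.28, EQUALS))
```

The first call falls inside the explicit error bound; the second falls outside the default bound of 0.128.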
-
-
exception
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
PropertySyntaxError
¶ Bases:
exceptions.Exception
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
Skip
(genome, msg)¶ Bases:
object
Manage a skip.
-
class
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
StructureGenome
¶ Bases:
pyevolve.GenomeBase.GenomeBase
Manage a genome. The genome, also known as a chromosome, is a candidate solution to the problem being solved by genetic optimization. It is composed of genes that are manipulated by the crossover and mutation operators. In this genetic optimization module the genome is essentially a schrodinger.structure.Structure object.
-
addPreviousFreezerFile
(freezer_file)¶ Add the given file to the list of previous freezer files.
Parameters: freezer_file (str) – the name of the file to be added
-
clone
()¶ Clone the current genome.
Return type: StructureGenome Returns: genome
-
copy
(genome)¶ Copy the current genome to the provided genome.
Parameters: genome (StructureGenome) – a new genome instance to which to copy the current genome
-
evaluate
(**args)¶ Evaluate the score of this individual.
Parameters: args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
-
optimizeGeometry
()¶ Optimize the geometry of this genome’s structure using OPLS.
-
removeProperties
()¶ Remove some structure properties.
-
resetParentProperties
()¶ Reset the crossover and mutation parent structure properties.
-
updateStructureProperties
(index, generation)¶ Update some structure properties.
Parameters: - index (int) – the index of this individual
- generation (int) – this generation
-
-
exception
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
UnknownNameError
¶ Bases:
exceptions.Exception
-
exception
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
UnknownPropertySuboptionError
¶ Bases:
exceptions.Exception
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
all_properties
(ga_obj)¶ Terminate when all properties have been matched.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization Return type: bool Returns: True to terminate, False otherwise
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
apply_uniform_operator_weights
(operators)¶ Set the operator weights uniformly.
Parameters: operators (list) – a list of two-element tuples, each tuple contains first an operator function and second a weight Return type: list Returns: list of two-element tuples of operators and uniform weights
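One plausible reading of "uniform" is that every operator receives the same share of a total weight of 1. A minimal sketch under that assumption; uniform_weights is a hypothetical stand-in, and the actual normalization used by the module is not stated on this page:

```python
def uniform_weights(operators):
    """Hypothetical stand-in: give every operator the same weight."""
    if not operators:
        return []
    # Assumption: equal shares that sum to 1.
    weight = 1.0 / len(operators)
    return [(func, weight) for func, _ in operators]


# Built-in functions stand in for crossover or mutation operators.
operators = [(len, 3.0), (max, 1.0)]
print(uniform_weights(operators))
```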
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
base_evaluator
(genome)¶ This is the base evaluator used to wrap all other evaluators.
Parameters: genome (StructureGenome) – a genome Return type: float Returns: the score for this individual
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
bond_crossover
(genome, **args)¶ Perform a crossover operation by swapping molecular fragments at two randomly chosen bonds, i.e. a double displacement reaction channel.
Parameters: - genome (StructureGenome) – a genome
- args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
Return type: tuple
Returns: tuple containing the sister and brother StructureGenome
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
combine_two_structures
(astructure, bstructure, offset=10.0)¶ Combine two structure objects into a single structure object using somewhat arbitrary placement.
Parameters: - astructure (schrodinger.structure.Structure) – the first of the structures to be combined
- bstructure (schrodinger.structure.Structure) – the second of the structures to be combined
- offset (float) – the final distance between the structures will be the sum of the molecular VDW radii plus this offset in Angstrom
Return type: Returns: the combined structure object
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
elemental_mutator
(genome, **args)¶ Perform a random elemental mutation to an element in the same column (also known as a group) of the periodic table. Note that hydrogen and the halogens are considered to belong to the same column.
Parameters: - genome (StructureGenome) – a genome
- args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
Return type: int
Returns: the number of mutations applied, appears to never be used in PyEvolve
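The same-column rule can be sketched on plain element symbols. This is an illustration only: GROUPS is a partial, hand-picked table, same_group_choices and mutate_element are hypothetical names, and the real mutator operates on a StructureGenome rather than on strings.

```python
import random

# Partial, illustrative groups; H is placed with the halogens as stated above.
GROUPS = [
    ['H', 'F', 'Cl', 'Br', 'I'],
    ['C', 'Si', 'Ge'],
    ['N', 'P', 'As'],
    ['O', 'S', 'Se'],
]


def same_group_choices(element):
    """Return the other elements in the same illustrative group."""
    for group in GROUPS:
        if element in group:
            return [other for other in group if other != element]
    return []


def mutate_element(element, rng=random):
    """Pick a random same-group replacement, or keep the element if none."""
    choices = same_group_choices(element)
    return rng.choice(choices) if choices else element


print(same_group_choices('H'))
print(mutate_element('O', random.Random(0)))
```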
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
first_property
(ga_obj)¶ Terminate when the first property has been matched.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization Return type: bool Returns: True to terminate, False otherwise
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
fragment_mutator
(genome, **args)¶ Randomly mutate the genome by replacing a molecular fragment on one side of a bond with a similar fragment from a library.
Parameters: - genome (StructureGenome) – a genome
- args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
Return type: int
Returns: the number of mutations applied, appears to never be used in PyEvolve
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
from_initial_population
(genome, **args)¶ Draw a unique genome from the initial population.
Parameters: - genome (StructureGenome) – a genome
- args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_child_like_parent
(parent_st, children_sts, definition)¶ Return the child structure that is most like the provided parent.
Parameters: - parent_st (schrodinger.structure.Structure) – the parent structure
- children_sts (list of schrodinger.structure.Structure) – the children structures
- definition (two-element list) – each sublist contains two atom indices describing the reactive bonds in the parent and fragment structures which created the children
Return type: Returns: the sought child structure
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_element_histogram
(astructure)¶ Return a dictionary where keys are elements and values are the numbers of atoms of a given element.
Parameters: astructure (schrodinger.structure.Structure) – the structure in question Return type: dict Returns: dictionary with element histogram, keys are elements (strs) and values are numbers (ints)
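The returned mapping has the same shape as a counter over element symbols. A minimal sketch using a plain list of symbols in place of a Structure; element_histogram is a hypothetical stand-in, and with a real structure the symbols would presumably be read from its atoms:

```python
from collections import Counter


def element_histogram(elements):
    """Hypothetical stand-in: count atoms per element symbol."""
    return dict(Counter(elements))


# Element symbols for ethanol, C2H6O, listed by hand for illustration.
ethanol = ['C', 'C', 'O', 'H', 'H', 'H', 'H', 'H', 'H']
print(element_histogram(ethanol))
```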
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_element_mutator_dict
(astructure)¶ Return a dictionary where the keys contain the indices of the mutable atoms and the values contain those elements that the keyed atom may be mutated to.
Parameters: astructure (schrodinger.structure.Structure) – the structure to be mutated Return type: dict Returns: keys are atom indices of those atoms that are mutable and values are those elements that the atom can be mutated to
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_freezer_structure
(structure_libs, tries_from_libs=3, structure_score_threshold=None, properties=None, conformational_search=False, inoculate='no_child', crossover_applied=None, mutation_applied=None, basename_ext=None, seed=None)¶ Return a random structure from the freezer and update that structure’s properties.
Parameters: - structure_libs (dict) – keys are strings specifying the types of libraries to be used and can be module constants from FREEZER_CHOICES.keys(), values are lists of libraries by type and can be either module constants from FRAGMENT_LIBS.keys(), ALL, or the names of Maestro files (including the file extensions)
- tries_from_libs (int) – the number of times to try before giving up
- structure_score_threshold (float or None) – specifies that a structure with a structure score greater-than-or-equal-to this threshold is sought, the best of the considered structures will be returned and will contain several structure properties related to the scoring
- properties (list of Property or None) – the properties used in structure scoring
- conformational_search (bool or str) – specifies whether a Macromodel conformational search will be performed prior to evaluation, when a string it specifies a simplified Macromodel input file containing extra options
- inoculate (str) – specify the reason for drawing from the freezer, which is an inoculate option from INOCULATE_CHOICES
- crossover_applied (str or None) – specify the intended crossover operator or None if there isn’t to be one
- mutation_applied (str or None) – specify the intended mutation operator or None if there isn’t to be one
- basename_ext (str or None) – specify an extension to append to the stoichiometry which is used to set the title of the returned structure
- seed (int or None) – if not None specifies that random should be reseeded with the given value
Return type: Returns: the random structure or None if one couldn’t be found
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_generation_log_file_name
(basename, generation)¶ Get the generation log file name.
Parameters: - basename (str) – base name to use
- generation (int) – the generation
Return type: Returns: generation_log_file_name, name of generation log file
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_isoelectronic_mutator_indicies
(astructure)¶ Return a list of atom indices that can be mutated by the isoelectronic mutator.
Parameters: astructure (schrodinger.structure.Structure) – the structure to be mutated Return type: list Returns: mutable indices
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_loggable_float
(afloat, num_decimal='%.2f', field_width=10)¶ Return a float as a string with the specified format.
Parameters: - afloat (float) – a float to convert to a string
- num_decimal (str) – the format of the string representation
- field_width (int) – the field width of the final string
Return type: Returns: the float as a string
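The two formatting parameters can be combined as a printf-style format followed by padding to the field width. A minimal sketch; loggable_float is a hypothetical stand-in, and right justification is an assumption since the page does not state the alignment:

```python
def loggable_float(afloat, num_decimal='%.2f', field_width=10):
    """Hypothetical stand-in: format a float and pad it to a fixed width."""
    # Assumption: the value is right-justified within the field.
    return (num_decimal % afloat).rjust(field_width)


print(repr(loggable_float(3.14159)))
```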
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_low_energy_conformers
(astructure_in, macromodel_options_file=None, remove_files=False, overwrite=False, seed=None)¶ Return the lowest energy conformers from a Macromodel conformational search.
Parameters: - astructure_in (schrodinger.structure.Structure) – the structure to search for conformations
- macromodel_options_file (str or None) – the name of a simplified Macromodel input file that contains any options to use in addition to those used by default in a conformational search or None if there are none and you just want to use the defaults
- remove_files (bool) – if the job is successful, specifies whether to remove all files created for it after it finishes
- overwrite (bool) – if True then the coordinates of the input structure will be overwritten by those of the lowest energy conformer and that structure alone returned by this function
- seed (int or None) – used to seed the random number generator used in the Macromodel conformational search, should be in CONF_SEARCH_SEED_RANGE, if None then if a CONFSEARCH_SEED has been specified in macromodel_options_file it will be used, otherwise a random int in CONF_SEARCH_SEED_RANGE will be used
Return type: list of schrodinger.structure.Structure, int
Returns: the structures of the lowest energy conformers sorted by increasing energy and the seed used in the conformational search (same as input if input was given either as seed or in macromodel_options_file)
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_num_simple_bonds
(astructure)¶ Return the number of simple bonds in the provided structure. The definition of a simple bond follows from that used in the reaction channel module and is an acyclic single order bond that may involve a hydrogen atom.
Parameters: astructure (schrodinger.structure.Structure) – the structure for which to get the number of simple bonds Return type: int Returns: the number of simple bonds
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_output_file_name
(basename)¶ Get the output file name from the basename.
Parameters: basename (str) – base name to use Return type: str Returns: output_file_name, name of output file
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_random_structure
(structure_libs, tries_from_libs=3, structure_score_threshold=None, properties=None, conformational_search=False, seed=None)¶ From the given dictionary of libraries return a random structure.
Parameters: - structure_libs (dict) – keys are strings specifying the types of libraries to be used and can be module constants from FREEZER_CHOICES.keys(), values are lists of libraries by type and can be either module constants from FRAGMENT_LIBS.keys(), ALL, or the names of Maestro files (including the file extensions)
- tries_from_libs (int) – the number of times to try before giving up
- structure_score_threshold (float or None) – specifies that a structure with a structure score greater-than-or-equal-to this threshold is sought, the best of the considered structures will be returned and will contain several structure properties related to the scoring
- properties (list of Property or None) – the properties used in structure scoring
- conformational_search (bool or str) – specifies whether a Macromodel conformational search will be performed prior to evaluation, when a string it specifies a simplified Macromodel input file containing extra options
- seed (int or None) – if not None specifies that random should be reseeded with the given value
Return type: Returns: the random structure or None if one couldn’t be found
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
get_structure_score
(astructure, properties, conformational_search, seed=None)¶ Return the structure score for the provided structure.
Parameters: - astructure (schrodinger.structure.Structure) – the structure to score
- properties (list of Property) – the properties used in scoring
- conformational_search (bool or str) – specifies whether a MacroModel conformational search will be performed prior to evaluation; when a string, it specifies a simplified MacroModel input file containing extra options
- seed (int or None) – random seed used in conformational search or None if conformational search is not being done
Return type: float
Returns: the structure score
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
hack_for_multiprocessing
()¶
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
isoelectronic_mutator
(genome, **args)¶ Perform a random isoelectronic mutation within one of the following series: (CH3X, NH2X, OHX, FX), (CH2XY, NHXY, OXY), and (CHXYZ, NXYZ), where X, Y, and Z are non-H-bonds.
Parameters: - genome (StructureGenome) – a genome
- args (dict) – dictionary of genetic optimization parameters created and used by pyevolve
Return type: int
Returns: the number of mutations applied (this value appears never to be used in PyEvolve)
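The series listed above group centers by their number of non-H bonds, and members of a series differ only in the central element and its hydrogen count. A minimal sketch of one such swap follows; `SERIES`, `VALENCE`, and `mutate_center` are hypothetical names, and the sketch works on element symbols rather than on a StructureGenome.

```python
import random

# Isoelectronic series keyed by the number of non-H bonds on the center:
# 1 -> CH3X, NH2X, OHX, FX; 2 -> CH2XY, NHXY, OXY; 3 -> CHXYZ, NXYZ.
SERIES = {1: ('C', 'N', 'O', 'F'), 2: ('C', 'N', 'O'), 3: ('C', 'N')}

# Standard valences, used to recompute the hydrogen count after a swap.
VALENCE = {'C': 4, 'N': 3, 'O': 2, 'F': 1}


def mutate_center(element, n_heavy_bonds, rng=random):
    """Swap the central element for another member of its series.

    Returns (new_element, new_hydrogen_count), or None when the center
    does not belong to any of the series.
    """
    series = SERIES.get(n_heavy_bonds)
    if series is None or element not in series:
        return None
    new_element = rng.choice([e for e in series if e != element])
    return new_element, VALENCE[new_element] - n_heavy_bonds
```

For example, a CHXYZ center can only become NXYZ, so `mutate_center('C', 3)` returns `('N', 0)` in this sketch.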
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
logging_summary_callback
(ga_obj)¶ Callback to log progress.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
manage_failures_callback
(ga_obj)¶ Callback to manage failures in the evaluation.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
manage_skips_callback
(ga_obj)¶ Callback to manage skips in the evaluation.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
molecule_history_callback
(ga_obj)¶ Callback to append all structures from all generations to individual log files.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
optoelectronics_evaluator
(genome)¶ Run an optoelectronics job.
Parameters: genome (StructureGenome) – a genome Return type: launcher.Launcher Returns: the script launcher object for this individual; it is run in the base evaluator
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
prepare_next_generation_dirs_callback
(ga_obj)¶ Callback to update the generation property of the genomes and to create a subdirectory to hold the next series of evaluations.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
print_bad_jobs
(all_bad_jobs, logger, bad_type='skip')¶ Log bad jobs, i.e. skips and failures.
Parameters: - all_bad_jobs (dict) – a collection of bad subjobs, keys are genetic optimization generation and values are a list of Skip or Failure objects for bad subjobs
- logger (logging.Logger) – output logger
- bad_type (str) – specifies either ‘skip’ or ‘fail’ type
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
remove_basename_ext
(stoich_ext)¶ Remove the basename extension from the given string and return the remainder which is the stoichiometry. Do this instead of having to recompute the stoichiometry which can be expensive.
Parameters: stoich_ext (str) – contains the stoichiometry and basename extension Return type: str Returns: stoichiometry
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
set_title_to_stoichiometry
(astructure, toappend=None, separation='.')¶ Set the structure title to be the stoichiometry of the structure.
Parameters: - astructure (schrodinger.structure.Structure) – the structure
- toappend (str) – a string to append to the stoichiometry
- separation (str) – used to separate the stoichiometry and the toappend str
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
structure_evaluator
(genome)¶ This is the structure evaluator.
Parameters: genome (StructureGenome) – a genome Return type: float Returns: the score for this individual
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
structure_is_open_shell
(astructure, ignore_charge=True)¶ Return True if the provided structure is open shell, i.e. has an odd number of electrons.
Parameters: - astructure (schrodinger.structure.Structure) – the structure in question
- ignore_charge (bool) – if True then ignore any structure.formal_charge settings
Return type: bool
Returns: True if the provided structure is open shell, False otherwise
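The odd-electron test can be illustrated with plain atomic numbers. This is a sketch under stated assumptions rather than the module's implementation: `is_open_shell` and its list-of-atomic-numbers input are hypothetical, standing in for a schrodinger.structure.Structure and its formal charge.

```python
def is_open_shell(atomic_numbers, formal_charge=0, ignore_charge=True):
    """Return True if the total electron count is odd.

    atomic_numbers: one int per atom (hypothetical input format).
    """
    n_electrons = sum(atomic_numbers)
    if not ignore_charge:
        # A positive charge removes electrons, a negative charge adds them.
        n_electrons -= formal_charge
    return n_electrons % 2 == 1


methyl_radical = [6, 1, 1, 1]   # CH3, 9 electrons
methane = [6, 1, 1, 1, 1]       # CH4, 10 electrons
ammonium = [7, 1, 1, 1, 1]      # NH4+, 11 electrons before the charge
```

The ammonium case shows why ignore_charge matters: the neutral atom count is odd, but the cation itself has an even number of electrons.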
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
thread_safe_evaluator
(evaluator)¶ Decorator to make evaluator functions thread-safe, i.e. to make use of the random module both usable and reproducible with the multiprocessing module.
Parameters: evaluator (function) – the function to decorate Return type: function Returns: the decorated function
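One common way to get reproducible random numbers from evaluators that run in worker processes is to reseed before every call. The sketch below shows that general technique only; whether the module does it this way is an assumption, and `reseeding_evaluator`, `base_seed`, and the integer `genome_id` argument are hypothetical.

```python
import functools
import random


def reseeding_evaluator(evaluator, base_seed=0):
    """Hypothetical decorator: reseed `random` before each evaluation."""

    @functools.wraps(evaluator)
    def wrapper(genome_id, *args, **kwargs):
        # Derive the seed from the base seed and the individual's id so
        # the result does not depend on which worker handles the call
        # or on how many random numbers that worker drew earlier.
        random.seed(base_seed * 1000003 + genome_id)
        return evaluator(genome_id, *args, **kwargs)

    return wrapper


@reseeding_evaluator
def noisy_score(genome_id):
    """Toy evaluator whose score depends on the random module."""
    return random.random()
```

Calling `noisy_score` twice with the same id returns the same value, regardless of any random draws made in between.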
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
uniquify_titles_callback
(ga_obj)¶ Callback to uniquify titles of the individuals.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization
-
schrodinger.application.matsci.genetic_optimization.genetic_optimization.
unproductive
(ga_obj)¶ Terminate if the maximum number of unproductive generations has been reached.
Parameters: ga_obj (GSimpleGA.GSimpleGA) – the entire current state of the genetic optimization Return type: bool Returns: True to terminate, False otherwise
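A termination test of this kind can be sketched on a plain list of per-generation best scores. The definition of an unproductive generation used here (one whose best score does not exceed the best seen so far, with higher scores being better) is an assumption, and `is_unproductive` and its arguments are hypothetical names rather than part of the module.

```python
def is_unproductive(best_scores, max_unproductive):
    """Return True once the trailing run of non-improving generations
    reaches `max_unproductive`.

    best_scores: best score of each generation, oldest first.
    """
    if not best_scores:
        return False
    best = best_scores[0]
    stale = 0  # consecutive generations without improvement
    for score in best_scores[1:]:
        if score > best:
            best = score
            stale = 0
        else:
            stale += 1
    return stale >= max_unproductive
```

With scores `[1, 2, 2, 2]` and a limit of 2 the sketch returns True, while `[1, 2, 2, 3]` returns False because the last generation improved.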