schrodinger.analysis.enrichment.calculator module
This module contains the class for generating the default enrichment report.
Example metrics from two different screens:
The enrichment metrics from example_A are generally more favorable than those from example_B; for instance, ROC is 0.92 versus 0.72 and BEDROC(alpha=20.0) is 0.914 versus 0.256.
Enrichment Report
Actives file: example_A_actives.txt
Results: example_A_dock_pv.rept
Total actives: 117
Total ligands(actives+decoys): 1117
Number of ranked actives: 117
BEDROC(alpha=160.9, alpha*Ra=16.8534): 1.000
BEDROC(alpha=20.0, alpha*Ra=2.0949): 0.914
BEDROC(alpha=8.0, alpha*Ra=0.8380): 0.868
ROC: 0.92
RIE: 7.65
Area under accumulation curve: 0.87
Ave. Number of outranking decoys: 82
Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 90| 90| 92| 94| 97
% Actives (1%|2%|5%|10%|20%): 76.9| 76.9| 78.6| 80.3| 82.9
Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 9.5| 9.5| 9.4| 7.7| 4.1
EF*(1%|2%|5%|10%|20%): 77| 38| 16| 8| 4.1
EF'(1%|2%|5%|10%|20%): 2.9e+02|1.7e+02| 54| 23| 9.9
Eff(1%|2%|5%|10%|20%): 0.974| 0.949| 0.88| 0.779| 0.611
Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 9.3| 9.4| 9.4| 9.2| 5.7| 2| 1.3
EF*(40%|50%|60%|70%|80%|90%|100%): 4e+02|5e+02|6e+02|2.3e+02| 13| 2.2| 1.4
EF'(40%|50%|60%|70%|80%|90%|100%): 3.8e+02|4.3e+02|4.7e+02|4.3e+02| 38| 4.7| 2.7
FOD(40%|50%|60%|70%|80%|90%|100%): 9e-05|0.0003|0.0004|0.0006|0.003| 0.03| 0.08
Enrichment Report
Actives file: example_B_actives.txt
Results: example_B_dock_pv.rept
Total actives: 62
Total ligands(actives+decoys): 1062
Number of ranked actives: 62
BEDROC(alpha=160.9, alpha*Ra=9.3934): 0.703
BEDROC(alpha=20.0, alpha*Ra=1.1676): 0.256
BEDROC(alpha=8.0, alpha*Ra=0.4670): 0.323
ROC: 0.72
RIE: 3.02
Area under accumulation curve: 0.71
Ave. Number of outranking decoys: 281
Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 8| 8| 9| 13| 23
% Actives (1%|2%|5%|10%|20%): 12.9| 12.9| 14.5| 21.0| 37.1
Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 12| 6.5| 2.9| 2.1| 1.6
EF*(1%|2%|5%|10%|20%): 13| 6.5| 2.9| 2.1| 1.9
EF'(1%|2%|5%|10%|20%): 23| 12| 5.3| 3.4| 2.3
Eff(1%|2%|5%|10%|20%): 0.856| 0.732| 0.488| 0.354| 0.299
Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 1.8| 2| 1.9| 2| 1.6| 1.6| 1
EF*(40%|50%|60%|70%|80%|90%|100%): 1.9| 2.1| 2| 2.1| 1.6| 1.6| 1.1
EF'(40%|50%|60%|70%|80%|90%|100%): 2.3| 2.2| 2.2| 2.2| 2.1| 2| 1.6
FOD(40%|50%|60%|70%|80%|90%|100%): 0.1| 0.1| 0.1| 0.2| 0.2| 0.2| 0.3
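Several values in the two example reports follow directly from the header counts, which gives a quick sanity check when reading a report. The sketch below assumes that Ra is the ratio of total actives to total ligands and that the "% Actives" row divides each active count by the total number of actives. Both assumptions reproduce the printed numbers, but the helper names are illustrative and are not part of this module.

```python
def alpha_ra(alpha, total_actives, total_ligands):
    """alpha * Ra, assuming Ra = total actives / total ligands."""
    return alpha * total_actives / total_ligands


def percent_actives(counts, total_actives):
    """Percentage of all actives represented by each count, to one decimal."""
    return [round(100.0 * count / total_actives, 1) for count in counts]


# Header counts copied from the example_A report.
example_a = [round(alpha_ra(a, 117, 1117), 4) for a in (160.9, 20.0, 8.0)]
print(example_a)

# "# Actives" row of example_A converted to the "% Actives" row.
print(percent_actives([90, 90, 92, 94, 97], 117))
```

The same two helpers applied to the example_B counts (62 actives, 1062 ligands) give the alpha*Ra and percentage values printed in the second report.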
Copyright Schrodinger, LLC. All rights reserved.
-
class schrodinger.analysis.enrichment.calculator.Calculator(actives, results, total_decoys=0)
Bases: object
A class to report a default set of enrichment terms for a screen. By default, a report containing a suite of metrics is written to standard output.
Note: This is not the preferred way to obtain enrichment metrics. Where possible, use the parser and metric functions in enrichment_input.py and metrics.py directly.
Variables: - ef_precision (int) – Number of decimals when reporting EF values. Default = 2
- efp_precision (int) – Number of decimals when reporting EF’ values. Default = 2
- efs_precision (int) – Number of decimals when reporting EF* values. Default = 2
- eff_precision (int) – Number of decimals when reporting Eff values. Default = 3
- fod_precision (int) – Number of decimals when reporting FOD values. Default = 1
-
ef_precision = 2
-
efs_precision = 2
-
efp_precision = 2
-
eff_precision = 3
-
fod_precision = 1
-
__init__(actives, results, total_decoys=0)
Parameters: - actives (str or list(str)) – File name or a list of strings containing all active titles. If a file name is provided, the input should be a valid csv or structure file; a raw text file containing one title per line is also acceptable. Duplicate titles are discarded; only the first occurrence is recorded.
- results (str or list(str) or list(structure.Structure)) – File name, a list of strings, or a list of structure.Structure containing the virtual screening results ordered by the scoring metric. If a file name is provided, the input should be a valid csv file or structure file. Duplicate titles are discarded; only the first occurrence is recorded.
- total_decoys (int) – The total number of decoys. If specified, the total number of ligands is the number of distinct active titles from the actives file plus total_decoys. This enables the calculation of the correction term in calc_AUAC when the total number of ligands does not equal the total number of ranked titles in results.
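The duplicate handling and the total-ligand count described for these parameters can be illustrated with a short standalone sketch. It is not the class's own parsing code; the function names are hypothetical and only mirror the documented behavior (first occurrence wins, total ligands = distinct actives + total_decoys).

```python
def unique_titles(titles):
    """Drop repeated titles, keeping the first occurrence of each.

    dict.fromkeys preserves insertion order, so the original ranking
    of the surviving titles is unchanged.
    """
    return list(dict.fromkeys(titles))


def total_ligands(active_titles, total_decoys):
    """Distinct active titles plus the stated number of decoys."""
    return len(unique_titles(active_titles)) + total_decoys


ranked = ["lig_7", "lig_2", "lig_7", "lig_9", "lig_2"]  # made-up titles
print(unique_titles(ranked))
print(total_ligands(["act_1", "act_2", "act_1"], total_decoys=1000))
```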
-
__repr__()
Return repr(self).
-
calcEF(n_sampled_set, min_actives=None)
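This page does not document how calcEF is computed. As orientation only, the sketch below implements the conventional enrichment factor: the hit rate in the top n ranked ligands divided by the hit rate in the whole set. The function name and the boolean-list input are assumptions made for illustration, and the library's handling of n_sampled_set and min_actives may differ.

```python
def enrichment_factor(ranked_is_active, n_sampled):
    """Conventional EF: (actives in top n / n) / (total actives / total ligands).

    ranked_is_active is a list of booleans ordered best score first.
    """
    n_total = len(ranked_is_active)
    total_actives = sum(ranked_is_active)
    hits = sum(ranked_is_active[:n_sampled])
    return (hits / n_sampled) / (total_actives / n_total)


# Synthetic screen with the same counts as example_A: 117 actives in 1117
# ligands, with all actives placed at the top of the ranking.
screen = [True] * 117 + [False] * 1000
print(round(enrichment_factor(screen, 11), 1))
```

With these counts the largest attainable value is 1117/117, about 9.5, which is consistent with the EF (1%) entry in the example_A report.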
-
calcEFStar(n_sampled_decoy_set, min_actives=None)
-
calcEFP(n_sampled_decoy_set, min_actives=None)
-
calcFOD(fraction_of_actives)
-
calcEFF(fraction_of_decoys)
-
calcActivesInN(n_sampled_set)
-
calcActivesInNStar(n_sampled_set)
-
calcAveNumberOutrankingDecoys()
-
calcBEDROC(alpha=20.0)
-
calcRIE(alpha=20.0)
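The role of alpha in calcRIE and calcBEDROC is easier to see with the commonly cited definitions written out. The sketch below is a standalone implementation of those textbook formulas (RIE as the ratio of the observed to the expected mean exponential weight, BEDROC as RIE rescaled between its minimum and maximum). It assumes 1-based ranks and no ties, and it is not taken from this module, whose implementation may differ in details.

```python
import math


def rie(active_ranks, n_total, alpha=20.0):
    """Robust initial enhancement for 1-based ranks of the actives."""
    n_actives = len(active_ranks)
    # Mean exponential weight actually observed for the actives.
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks) / n_actives
    # Mean weight expected if actives were uniformly spread over all ranks.
    expected = (1.0 - math.exp(-alpha)) / (n_total * (math.exp(alpha / n_total) - 1.0))
    return observed / expected


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC: RIE rescaled to [0, 1] using its worst and best possible values."""
    ra = len(active_ranks) / n_total
    rie_max = (1.0 - math.exp(-alpha * ra)) / (ra * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ra)) / (ra * (1.0 - math.exp(alpha)))
    return (rie(active_ranks, n_total, alpha) - rie_min) / (rie_max - rie_min)


best = list(range(1, 11))     # 10 actives ranked first among 100 ligands
worst = list(range(91, 101))  # the same 10 actives ranked last
print(bedroc(best, 100), bedroc(worst, 100))
```

Larger alpha values weight the very top of the ranked list more heavily, which is why the reports list BEDROC at several alpha settings.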
-
calcAUAC()
-
calcROC()
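In both example reports the ROC value matches one minus the average number of outranking decoys divided by the number of decoys (1 - 82/1000 = 0.918 and 1 - 281/1000 = 0.719). The sketch below shows that relationship on a small ranked list. It assumes every ligand is ranked and there are no ties; the function names are illustrative, not the module's implementation.

```python
def average_outranking_decoys(ranked_is_active):
    """Mean number of decoys ranked ahead of each active."""
    decoys_seen = 0
    outranking = []
    for is_active in ranked_is_active:
        if is_active:
            outranking.append(decoys_seen)
        else:
            decoys_seen += 1
    return sum(outranking) / len(outranking)


def roc_auc(ranked_is_active):
    """Fraction of (active, decoy) pairs in which the active is ranked first."""
    n_decoys = sum(1 for is_active in ranked_is_active if not is_active)
    return 1.0 - average_outranking_decoys(ranked_is_active) / n_decoys


ranking = [True, False, True, False]  # best score first
print(average_outranking_decoys(ranking), roc_auc(ranking))
```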
-
calcMWUROC(alpha=0.05)
-
calcDEF(n_sampled_set, min_actives=None)
-
calcDEFStar(n_sampled_decoy_set, min_actives=None)
-
calcDEFP(n_sampled_decoy_set, min_actives=None)
-
calculateSensitivity(rank)
-
calculateSpecificity(rank)
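The two rank-based methods above are undocumented here. As a rough guide, the sketch below computes the standard sensitivity and specificity when everything at or above a given rank is treated as a predicted active. Treating rank as a 1-based inclusive cutoff is an assumption, as are the function name and input format.

```python
def sensitivity_specificity(ranked_is_active, rank):
    """Sensitivity and specificity when the top `rank` ligands are selected.

    ranked_is_active is a list of booleans ordered best score first.
    """
    selected = ranked_is_active[:rank]
    true_positives = sum(selected)
    false_positives = len(selected) - true_positives
    total_actives = sum(ranked_is_active)
    total_decoys = len(ranked_is_active) - total_actives
    sensitivity = true_positives / total_actives
    specificity = (total_decoys - false_positives) / total_decoys
    return sensitivity, specificity


print(sensitivity_specificity([True, False, True, False, False], rank=2))
```

Plotting sensitivity against 1 - specificity over all ranks gives the ROC curve, which matches the default axis labels of savePlot.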
-
getPercentScreenCurvePoints()
-
getActiveRankCsvRows()
-
getROCCurvePoints()
-
getROCAreaRomberg(lower_limit=0.0, upper_limit=1.0)
-
savePlot(png_file='plot.png', title='Screen Results', xlabel='1-Specificity', ylabel='Sensitivity')
-
static format(value, precision=2)
Parameters: - value (float or None) – Float value to format as string.
- precision (int) – Number of digits after the decimal.
Returns: a string representation of the passed value. If the value is None then the returned string is ‘n/a’. Uses %g formatting idiom so large values are returned as exponentials.
Return type: str
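The %g behavior described above explains why the example reports mix plain and exponential notation. The sketch below is a guess at an equivalent formatter, not the library's source: it assumes the precision is passed straight to %g, where it acts as a count of significant digits. That assumption reproduces entries such as 2.9e+02, 0.88 and 9e-05 from the example reports.

```python
def format_value(value, precision=2):
    """Format a metric with %g, or return 'n/a' when no value is available."""
    if value is None:
        return "n/a"
    # '%.*g' takes the precision from the argument tuple.
    return "%.*g" % (precision, value)


for value, precision in [(9.547, 2), (290.4, 2), (0.88, 3), (0.00009, 1), (None, 2)]:
    print(format_value(value, precision))
```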
-
__class__
alias of builtins.type
-
__delattr__
Implement delattr(self, name).
-
__dir__() → list
default dir() implementation
-
__eq__
Return self==value.
-
__format__()
default object formatter
-
__ge__
Return self>=value.
-
__getattribute__
Return getattr(self, name).
-
__gt__
Return self>value.
-
__hash__
Return hash(self).
-
__init_subclass__()
This method is called when a class is subclassed.
The default implementation does nothing. It may be overridden to extend subclasses.
-
__le__
Return self<=value.
-
__lt__
Return self<value.
-
__module__ = 'schrodinger.analysis.enrichment.calculator'
-
__ne__
Return self!=value.
-
__new__()
Create and return a new object. See help(type) for accurate signature.
-
__reduce__()
helper for pickle
-
__reduce_ex__()
helper for pickle
-
__setattr__
Implement setattr(self, name, value).
-
__sizeof__() → int
size of object in memory, in bytes
-
__str__
Return str(self).
-
__subclasshook__()
Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
-
__weakref__
list of weak references to the object (if defined)
-
report(file_handle=sys.stdout, header='', footer='')
Prints a text summary of the results to file_handle.
Parameters: - header (str) – Header for the report.
- footer (str) – Footer for the report.
- file_handle (file) – File handle-like object; default is sys.stdout.
-
getCsvRows()
Return a list of two lists: the first inner list contains all metric names, and the second contains the corresponding metric values.
Returns: a list containing the header row and the row of enrichment values.
Return type: list
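Because the return value is a header row followed by a value row, it can be handed directly to the standard csv module. The sketch below uses a hand-written stand-in for the returned rows: the metric names and values shown are illustrative, and the real header names are not documented on this page.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a [header_row, value_row] pair as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical stand-in for the list returned by getCsvRows().
rows = [["ROC", "RIE"], [0.92, 7.65]]
print(rows_to_csv(rows), end="")
```

To write to disk instead, open the target file with newline="" and pass the file object to csv.writer in place of the StringIO buffer.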