schrodinger.analysis.enrichment.calculator module

This module contains the class for generating the default enrichment report.

Example metrics from two different screens:

The enrichment metrics from example_A are generally more favorable than those from example_B.

Enrichment Report

Actives file: example_A_actives.txt
Results: example_A_dock_pv.rept
Total actives: 117
Total ligands(actives+decoys): 1117
Number of ranked actives: 117

BEDROC(alpha=160.9, alpha*Ra=16.8534): 1.000
BEDROC(alpha=20.0, alpha*Ra=2.0949): 0.914
BEDROC(alpha=8.0, alpha*Ra=0.8380): 0.868
ROC: 0.92
RIE: 7.65
Area under accumulation curve: 0.87
Ave. Number of outranking decoys: 82

Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 90| 90| 92| 94| 97
% Actives (1%|2%|5%|10%|20%): 76.9| 76.9| 78.6| 80.3| 82.9

Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 9.5| 9.5| 9.4| 7.7| 4.1
EF*(1%|2%|5%|10%|20%): 77| 38| 16| 8| 4.1
EF’(1%|2%|5%|10%|20%): 2.9e+02|1.7e+02| 54| 23| 9.9
Eff(1%|2%|5%|10%|20%): 0.974| 0.949| 0.88| 0.779| 0.611

Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 9.3| 9.4| 9.4| 9.2| 5.7| 2| 1.3
EF*(40%|50%|60%|70%|80%|90%|100%): 4e+02|5e+02|6e+02|2.3e+02| 13| 2.2| 1.4
EF’(40%|50%|60%|70%|80%|90%|100%): 3.8e+02|4.3e+02|4.7e+02|4.3e+02| 38| 4.7| 2.7
FOD(40%|50%|60%|70%|80%|90%|100%): 9e-05|0.0003|0.0004|0.0006|0.003| 0.03| 0.08

Enrichment Report

Actives file: example_B_actives.txt
Results: example_B_dock_pv.rept
Total actives: 62
Total ligands(actives+decoys): 1062
Number of ranked actives: 62

BEDROC(alpha=160.9, alpha*Ra=9.3934): 0.703
BEDROC(alpha=20.0, alpha*Ra=1.1676): 0.256
BEDROC(alpha=8.0, alpha*Ra=0.4670): 0.323
ROC: 0.72
RIE: 3.02
Area under accumulation curve: 0.71
Ave. Number of outranking decoys: 281

Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 8| 8| 9| 13| 23
% Actives (1%|2%|5%|10%|20%): 12.9| 12.9| 14.5| 21.0| 37.1

Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 12| 6.5| 2.9| 2.1| 1.6
EF*(1%|2%|5%|10%|20%): 13| 6.5| 2.9| 2.1| 1.9
EF’(1%|2%|5%|10%|20%): 23| 12| 5.3| 3.4| 2.3
Eff(1%|2%|5%|10%|20%): 0.856| 0.732| 0.488| 0.354| 0.299

Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 1.8| 2| 1.9| 2| 1.6| 1.6| 1
EF*(40%|50%|60%|70%|80%|90%|100%): 1.9| 2.1| 2| 2.1| 1.6| 1.6| 1.1
EF’(40%|50%|60%|70%|80%|90%|100%): 2.3| 2.2| 2.2| 2.2| 2.1| 2| 1.6
FOD(40%|50%|60%|70%|80%|90%|100%): 0.1| 0.1| 0.1| 0.2| 0.2| 0.2| 0.3
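The "% Actives" rows in both reports can be checked by hand: each value is the reported count divided by the total number of actives. The sketch below redoes that arithmetic with numbers copied from the two reports. The helper names are hypothetical. The enrichment-factor ceiling uses the conventional definition (hit rate in the sample divided by hit rate in the whole library), which is an assumption about what this module computes and not something the reports state.

```python
def percent_actives(n_found, total_actives):
    """Percentage of all actives recovered, rounded to one decimal."""
    return round(100.0 * n_found / total_actives, 1)


def max_enrichment_factor(total_actives, total_ligands):
    """Ceiling of the conventional EF: every sampled ligand is an active.

    Assumed definition, not taken from the library source.
    """
    return total_ligands / total_actives


# example_A: 90 of 117 actives at the 1% level.
pct_a = percent_actives(90, 117)
# example_B: 8 of 62 actives at the 1% level.
pct_b = percent_actives(8, 62)
# example_A: 117 actives among 1117 ligands.
ef_ceiling_a = max_enrichment_factor(117, 1117)

print(pct_a, pct_b, "%.2g" % ef_ceiling_a)
```

Under that assumed definition, the EF of 9.5 reported for example_A at 1% sits at the ceiling, which would be consistent with every ligand in the top 1% being an active.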

Copyright Schrodinger, LLC. All rights reserved.

class schrodinger.analysis.enrichment.calculator.Calculator(actives, results, total_decoys=0)

Bases: object

A class to report the default set of enrichment terms for a screen. By default, a report containing a suite of metrics is written to standard output.

Note:

This is not the preferred way to obtain enrichment metrics. Where possible, use the parser and metric functions in enrichment_input.py and metrics.py directly.

Variables:
  • ef_precision (int) – Number of decimals when reporting EF values. Default = 2
  • efp_precision (int) – Number of decimals when reporting EF’ values. Default = 2
  • efs_precision (int) – Number of decimals when reporting EF* values. Default = 2
  • eff_precision (int) – Number of decimals when reporting Eff values. Default = 3
  • fod_precision (int) – Number of decimals when reporting FOD values. Default = 1
ef_precision = 2
efs_precision = 2
efp_precision = 2
eff_precision = 3
fod_precision = 1
__init__(actives, results, total_decoys=0)
Parameters:
  • actives (str or list(str)) – File name or a list of strings containing all active titles. If a file name is provided, the input should be a valid csv or structure file; a raw text file containing one title per line is also acceptable. Duplicate titles are discarded; only the first occurrence is recorded.
  • results (str or list(str) or list(structure.Structure)) – File name, a list of strings, or a list of structure.Structure containing the virtual screening results, ordered by the scoring metric. If a file name is provided, the input should be a valid csv or structure file. Duplicate titles are discarded; only the first occurrence is recorded.
  • total_decoys (int) – The total number of decoys. If specified, the total number of ligands is the number of distinct active titles in actives plus total_decoys. This enables the correction term in calcAUAC when the total number of ligands does not equal the number of ranked titles in results.
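The duplicate-title rule and the ligand total described above can be illustrated in plain Python. This is a minimal sketch, not the library's parser: the function names and sample titles are hypothetical, and it only mirrors the stated behavior (first occurrence wins; total ligands = distinct actives + total_decoys).

```python
def distinct_titles(titles):
    """Drop repeated titles, keeping the first occurrence of each."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(titles))


def total_ligands(active_titles, total_decoys):
    """Distinct active titles plus the declared number of decoys."""
    return len(distinct_titles(active_titles)) + total_decoys


# Made-up titles; "lig_1" appears twice and is counted once.
actives = ["lig_1", "lig_2", "lig_1", "lig_3"]
print(distinct_titles(actives))
print(total_ligands(actives, total_decoys=1000))
```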
__repr__()

Return repr(self).

calcEF(n_sampled_set, min_actives=None)
calcEFStar(n_sampled_decoy_set, min_actives=None)
calcEFP(n_sampled_decoy_set, min_actives=None)
calcFOD(fraction_of_actives)
calcEFF(fraction_of_decoys)
calcActivesInN(n_sampled_set)
calcActivesInNStar(n_sampled_set)
calcAveNumberOutrankingDecoys()
calcBEDROC(alpha=20.0)
calcRIE(alpha=20.0)
calcAUAC()
calcROC()
calcMWUROC(alpha=0.05)
calcDEF(n_sampled_set, min_actives=None)
calcDEFStar(n_sampled_decoy_set, min_actives=None)
calcDEFP(n_sampled_decoy_set, min_actives=None)
calculateSensitivity(rank)
calculateSpecificity(rank)
getPercentScreenCurvePoints()
getActiveRankCsvRows()
getROCCurvePoints()
getROCAreaRomberg(lower_limit=0.0, upper_limit=1.0)
savePlot(png_file='plot.png', title='Screen Results', xlabel='1-Specificity', ylabel='Sensitivity')
static format(value, precision=2)
Parameters:
  • value (float or None) – Float value to format as string.
  • precision (int) – Number of digits after the decimal.
Returns:

A string representation of the passed value. If the value is None, the returned string is 'n/a'. Uses the %g formatting idiom, so large values are returned in exponential notation.

Return type:

str
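The behavior described for format can be approximated with Python's printf-style %g conversion. The function below is a hypothetical stand-in, not the library code. Note that under %g the precision counts significant digits, and the values in the sample reports earlier in this page (for example 2.9e+02 and 54 at precision 2, 9e-05 at precision 1) appear consistent with that reading.

```python
def format_metric(value, precision=2):
    """Hypothetical stand-in for Calculator.format, not the library code."""
    if value is None:
        return "n/a"
    # With %g the precision counts significant digits; magnitudes that
    # need more digits than that switch to exponential notation.
    return "%.*g" % (precision, value)


samples = [(None, 2), (54.0, 2), (290.0, 2), (0.974, 3), (9e-05, 1)]
for value, precision in samples:
    print(format_metric(value, precision))
```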

__class__

alias of builtins.type

__delattr__

Implement delattr(self, name).

__dir__() → list

default dir() implementation

__eq__

Return self==value.

__format__()

default object formatter

__ge__

Return self>=value.

__getattribute__

Return getattr(self, name).

__gt__

Return self>value.

__hash__

Return hash(self).

__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__

Return self<=value.

__lt__

Return self<value.

__module__ = 'schrodinger.analysis.enrichment.calculator'
__ne__

Return self!=value.

__new__()

Create and return a new object. See help(type) for accurate signature.

__reduce__()

helper for pickle

__reduce_ex__()

helper for pickle

__setattr__

Implement setattr(self, name, value).

__sizeof__() → int

size of object in memory, in bytes

__str__

Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

__weakref__

list of weak references to the object (if defined)

report(file_handle=sys.stdout, header='', footer='')

Prints a text summary of the results to file_handle.

Parameters:
  • header (str) – Header for the report.
  • footer (str) – Footer for the report.
  • file_handle (file) – File handle-like object, default is sys.stdout.
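Because file_handle is described as a file handle-like object, the report can presumably be captured in memory instead of printed. The sketch below assumes an io.StringIO buffer is acceptable to report. The helper report_to_string and the stub class are hypothetical and exist only so the example runs without the Schrodinger suite; in practice calc would be a Calculator instance.

```python
import io


def report_to_string(calc, header="", footer=""):
    """Collect the text report from a Calculator-like object."""
    buffer = io.StringIO()
    calc.report(file_handle=buffer, header=header, footer=footer)
    return buffer.getvalue()


class _StubCalculator:
    """Stand-in with the same report() signature; not the real class."""

    def report(self, file_handle, header="", footer=""):
        file_handle.write(f"{header}\nEnrichment Report\n{footer}\n")


text = report_to_string(_StubCalculator(), header="screen A", footer="done")
print(text)
```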
getCsvRows()

Return a list of two lists: the first inner list contains all metric names, and the second contains the corresponding metric values.

Returns:

a list holding the header row (metric names) and the row of enrichment values.

Return type:

list
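The two-row shape returned by getCsvRows maps directly onto the standard csv module. The following is a minimal sketch under that assumption. The helper name and the sample header names are made up for illustration; the real names and values come from getCsvRows().

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize [names, values] as a two-line CSV document."""
    names, values = rows
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    writer.writerow(values)
    return buffer.getvalue()


# Made-up rows in the described shape (names first, then values).
sample_rows = [["ROC", "RIE"], [0.92, 7.65]]
csv_text = rows_to_csv_text(sample_rows)
print(csv_text)
```

Writing to a real file works the same way: open it with newline="" and pass the handle to csv.writer in place of the buffer.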