schrodinger.analysis.enrichment.calculator module
This module contains the class for generating the default enrichment report.
Example metrics from two different screens:
The enrichment metrics from example_A are generally more favorable than those from example_B.
Enrichment Report
Actives file: example_A_actives.txt
Results: example_A_dock_pv.rept
Total actives: 117
Total ligands(actives+decoys): 1117
Number of ranked actives: 117
BEDROC(alpha=160.9, alpha*Ra=16.8534): 1.000
BEDROC(alpha=20.0, alpha*Ra=2.0949): 0.914
BEDROC(alpha=8.0, alpha*Ra=0.8380): 0.868
ROC: 0.92
RIE: 7.65
Area under accumulation curve: 0.87
Ave. Number of outranking decoys: 82
Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 90| 90| 92| 94| 97
% Actives (1%|2%|5%|10%|20%): 76.9| 76.9| 78.6| 80.3| 82.9
Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 9.5| 9.5| 9.4| 7.7| 4.1
EF*(1%|2%|5%|10%|20%): 77| 38| 16| 8| 4.1
EF’(1%|2%|5%|10%|20%): 2.9e+02|1.7e+02| 54| 23| 9.9
Eff(1%|2%|5%|10%|20%): 0.974| 0.949| 0.88| 0.779| 0.611
Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 9.3| 9.4| 9.4| 9.2| 5.7| 2| 1.3
EF*(40%|50%|60%|70%|80%|90%|100%): 4e+02|5e+02|6e+02|2.3e+02| 13| 2.2| 1.4
EF’(40%|50%|60%|70%|80%|90%|100%): 3.8e+02|4.3e+02|4.7e+02|4.3e+02| 38| 4.7| 2.7
FOD(40%|50%|60%|70%|80%|90%|100%): 9e-05|0.0003|0.0004|0.0006|0.003| 0.03| 0.08
Enrichment Report
Actives file: example_B_actives.txt
Results: example_B_dock_pv.rept
Total actives: 62
Total ligands(actives+decoys): 1062
Number of ranked actives: 62
BEDROC(alpha=160.9, alpha*Ra=9.3934): 0.703
BEDROC(alpha=20.0, alpha*Ra=1.1676): 0.256
BEDROC(alpha=8.0, alpha*Ra=0.4670): 0.323
ROC: 0.72
RIE: 3.02
Area under accumulation curve: 0.71
Ave. Number of outranking decoys: 281
Count and percentage of actives in top N% of decoy results.
# Actives (1%|2%|5%|10%|20%): 8| 8| 9| 13| 23
% Actives (1%|2%|5%|10%|20%): 12.9| 12.9| 14.5| 21.0| 37.1
Enrichment Factors with respect to N% sample size.
EF (1%|2%|5%|10%|20%): 12| 6.5| 2.9| 2.1| 1.6
EF*(1%|2%|5%|10%|20%): 13| 6.5| 2.9| 2.1| 1.9
EF’(1%|2%|5%|10%|20%): 23| 12| 5.3| 3.4| 2.3
Eff(1%|2%|5%|10%|20%): 0.856| 0.732| 0.488| 0.354| 0.299
Enrichment Factors with respect to N% actives recovered.
EF (40%|50%|60%|70%|80%|90%|100%): 1.8| 2| 1.9| 2| 1.6| 1.6| 1
EF*(40%|50%|60%|70%|80%|90%|100%): 1.9| 2.1| 2| 2.1| 1.6| 1.6| 1.1
EF’(40%|50%|60%|70%|80%|90%|100%): 2.3| 2.2| 2.2| 2.2| 2.1| 2| 1.6
FOD(40%|50%|60%|70%|80%|90%|100%): 0.1| 0.1| 0.1| 0.2| 0.2| 0.2| 0.3
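Several header values in the two reports can be reproduced by hand from the totals alone. The sketch below assumes that Ra denotes the ratio of actives to total ligands and that EF follows the conventional definition (hit rate in the sampled set divided by the hit rate in the whole library). Neither definition is stated on this page, and the helper names are hypothetical, so treat this as a consistency check against the numbers shown rather than the module's implementation.

```python
def active_ratio(total_actives, total_ligands):
    """Fraction of the library that is active (assumed meaning of Ra)."""
    return total_actives / total_ligands


def percent_actives(n_found, total_actives):
    """Percentage of all actives recovered, as in the '% Actives' row."""
    return 100.0 * n_found / total_actives


def max_enrichment_factor(total_actives, total_ligands):
    """Upper bound on the conventional EF, reached when every sampled
    ligand is an active (valid while the sample is no larger than the
    number of actives)."""
    return total_ligands / total_actives


# example_A: 117 actives out of 1117 ligands
ra_a = active_ratio(117, 1117)
print(f"{160.9 * ra_a:.4f}")                      # 16.8534
print(f"{20.0 * ra_a:.4f}")                       # 2.0949
print(f"{percent_actives(90, 117):.1f}")          # 76.9
print(f"{max_enrichment_factor(117, 1117):.1f}")  # 9.5

# example_B: 62 actives out of 1062 ligands
ra_b = active_ratio(62, 1062)
print(f"{20.0 * ra_b:.4f}")                       # 1.1676
print(f"{percent_actives(8, 62):.1f}")            # 12.9
```

Under these assumptions, the EF of 9.5 reported for example_A at 1% and 2% sits at the attainable ceiling for that library. The corresponding ceiling for example_B is about 17.1, above the reported 12.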
Copyright Schrodinger, LLC. All rights reserved.
class schrodinger.analysis.enrichment.calculator.Calculator(actives, results, total_decoys=0)
Bases: object
A class to report a default set of enrichment terms for a screen. By default, a report containing a suite of metrics is directed to standard out.
Note: This is not the preferred way to obtain enrichment metrics. Please consider using the parser and metric functions directly in enrichment_input.py and metrics.py if possible.
Variables:
- ef_precision (int) – Number of decimals when reporting EF values. Default = 2
- efp_precision (int) – Number of decimals when reporting EF’ values. Default = 2
- efs_precision (int) – Number of decimals when reporting EF* values. Default = 2
- eff_precision (int) – Number of decimals when reporting Eff values. Default = 3
- fod_precision (int) – Number of decimals when reporting FOD values. Default = 1
ef_precision = 2
efs_precision = 2
efp_precision = 2
eff_precision = 3
fod_precision = 1
__init__(actives, results, total_decoys=0)
Parameters:
- actives (str or list(str)) – File name or a list of strings containing all active titles. If a file name is provided, the input should be a valid csv or structure file; a raw text file containing one title per line is also acceptable. Duplicate titles are discarded; only the first occurrence is recorded.
- results (str or list(str) or list(structure.Structure)) – File name, a list of strings, or a list of structure.Structure containing the virtual screening results ordered by the scoring metric. If a file name is provided, the input should be a valid csv file or structure file. Duplicate titles are discarded; only the first occurrence is recorded.
- total_decoys (int) – The total number of decoys. If specified, the total number of ligands will be the number of distinct active titles from actives plus total_decoys. This enables the calculation of the correction term in calc_AUAC, should the total number of ligands not equal the total number of ranked titles in results.
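As a minimal sketch of preparing the inputs described above, the helper below writes a raw text actives file with one title per line, keeping only the first occurrence of each title. The helper names, the ligand titles, and the temporary file name are all hypothetical. The Calculator call uses only the constructor and report() signatures documented on this page, and is imported lazily because it requires a Schrodinger installation.

```python
import tempfile
from pathlib import Path


def write_titles_file(titles, path):
    """Write one title per line, keeping only the first occurrence of each."""
    unique = list(dict.fromkeys(titles))  # preserves first-seen order
    Path(path).write_text("\n".join(unique) + "\n")
    return unique


def print_default_report(actives, results, total_decoys=0):
    """Build a Calculator and print its default report to standard out."""
    # Lazy import: only needed when this function is actually called.
    from schrodinger.analysis.enrichment.calculator import Calculator

    calc = Calculator(actives, results, total_decoys=total_decoys)
    calc.report()
    return calc


# Hypothetical titles; "lig_1" appears twice and is written once.
with tempfile.TemporaryDirectory() as tmp_dir:
    actives_path = Path(tmp_dir) / "actives.txt"
    kept = write_titles_file(["lig_1", "lig_2", "lig_1"], actives_path)
    written = actives_path.read_text()

print(kept)  # ['lig_1', 'lig_2']
```

With a Schrodinger installation available, a call such as print_default_report("example_A_actives.txt", "example_A_dock_pv.rept") would be expected to produce a report like the samples at the top of this page, assuming those files exist.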
 
calcEF(n_sampled_set, min_actives=None)
calcEFStar(n_sampled_decoy_set, min_actives=None)
calcEFP(n_sampled_decoy_set, min_actives=None)
calcFOD(fraction_of_actives)
calcEFF(fraction_of_decoys)
calcActivesInN(n_sampled_set)
calcActivesInNStar(n_sampled_set)
calcAveNumberOutrankingDecoys()
calcBEDROC(alpha=20.0)
calcRIE(alpha=20.0)
calcAUAC()
calcROC()
calcMWUROC(alpha=0.05)
calcDEF(n_sampled_set, min_actives=None)
calcDEFStar(n_sampled_decoy_set, min_actives=None)
calcDEFP(n_sampled_decoy_set, min_actives=None)
calculateSensitivity(rank)
calculateSpecificity(rank)
getPercentScreenCurvePoints()
getActiveRankCsvRows()
getROCCurvePoints()
getROCAreaRomberg(lower_limit=0.0, upper_limit=1.0)
savePlot(png_file='plot.png', title='Screen Results', xlabel='1-Specificity', ylabel='Sensitivity')
static format(value, precision=2)
Parameters:
- value (float or None) – Float value to format as string.
- precision (int) – Number of digits after the decimal.
Returns: a string representation of the passed value. If the value is None, the returned string is ‘n/a’. Uses the %g formatting idiom, so large values are returned as exponentials.
Return type: str
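The described behavior can be approximated with Python's `%g` conversion. The function below is a hypothetical stand-in written from the description alone, not the library's code. It assumes that precision is passed straight through as the `%g` precision, in which case it behaves as a count of significant digits. That assumption appears consistent with the sample reports, where EF’ values print as 2.9e+02 at precision 2 and FOD values print as 9e-05 at precision 1.

```python
def format_metric(value, precision=2):
    """Hypothetical stand-in for Calculator.format, based on its docstring."""
    if value is None:
        return "n/a"
    # '%.*g' takes the precision from the argument tuple; %g switches to
    # exponential notation for large or very small magnitudes.
    return "%.*g" % (precision, value)


print(format_metric(None))        # n/a
print(format_metric(9.547))       # 9.5
print(format_metric(290.0))       # 2.9e+02
print(format_metric(16.0))        # 16
print(format_metric(0.88, 3))     # 0.88
print(format_metric(0.00009, 1))  # 9e-05
```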
report(file_handle=sys.stdout, header='', footer='')
Prints a text summary of results to file_handle.
Parameters:
- header (str) – Header for the report.
- footer (str) – Footer for the report.
- file_handle (file) – File handle-like object, default is sys.stdout.
 
getCsvRows()
Return a list of two lists: the first inner list contains all metric names, the second contains the corresponding metric values.
Returns: a list of header and enrichment value tuples.
Return type: list
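A two-row structure like the one described maps directly onto a CSV header row and a value row. The sketch below uses a hand-built stand-in for the return value. The metric names "ROC" and "RIE" are assumptions about how the headers might be labeled, the values are taken from the example_A report above, and the helper name is hypothetical.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize a [names, values] pair of lists as two CSV lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in for calc.getCsvRows(); the real header names may differ.
example_rows = [["ROC", "RIE"], [0.92, 7.65]]

csv_text = rows_to_csv_text(example_rows)
print(csv_text)

# Pairing names with values gives a lookup by metric name.
names, values = example_rows
metrics = dict(zip(names, values))
print(metrics["RIE"])  # 7.65
```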