Enrichment Calculator Panel

Calculate enrichment metrics for a screen that was run with a set of actives and a set of decoys. The results are presented as a text report, and plots for some metrics can be displayed.

To open this panel: click the Tasks button and browse to Receptor-Based Virtual Screening → Enrichment Calculator.
To open this panel from the entry group for the results of a Glide docking job, and load the results, use the Workflow Action Menu.

Using the Enrichment Calculator Panel

The purpose of this panel is to assess the enrichment of active compounds in a screening process that includes a set of actives and a set of decoys. The screening can be done with any program: Glide, Shape Screening, Phase, to name a few. All that is required is information on the identity of the actives and the ordering of the screened compounds by the screening process. The input for the panel is taken from the output from the screening program, which can be in a file or in the Project Table.

For most screening results formats, titles are used to identify the ligands, and the input is expected to be correctly ordered. If the file contains duplicate titles, only the first occurrence of each title is ranked. Take care when Glide results contain multiple poses with the same title to ensure that they are properly ordered; ordering by GlideScore alone is typically not sufficient. For example, after saving multiple poses per input ligand from a Glide SP and HTVS docking experiment, the poses of the same chemical species should be ordered by Emodel, and different species by GlideScore (or docking score).
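
The ordering guideline above can be sketched as follows. This is a minimal illustration of one reading of that guideline, not the procedure used by Glide or by this panel: for each title, keep the pose with the best Emodel, then order the titles by the GlideScore of the kept pose. The `Pose` record and its field names are hypothetical, and the sketch assumes that lower (more negative) values are better for both scores.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    """Hypothetical pose record; the field names are illustrative only."""
    title: str          # ligand title, used as the ligand identity
    glide_score: float  # GlideScore or docking score (assumed: lower is better)
    emodel: float       # Emodel (assumed: lower is better)


def rank_titles(poses):
    """Return one entry per title, in ranked order."""
    best = {}
    for pose in poses:
        current = best.get(pose.title)
        # Within a chemical species, prefer the pose with the lowest Emodel.
        if current is None or pose.emodel < current.emodel:
            best[pose.title] = pose
    # Between species, order by the GlideScore of the selected pose.
    ordered = sorted(best.values(), key=lambda p: p.glide_score)
    return [p.title for p in ordered]


poses = [
    Pose("lig_A", -7.1, -40.0),
    Pose("lig_A", -8.3, -35.0),  # better GlideScore, worse Emodel: not kept
    Pose("lig_B", -7.9, -52.0),
    Pose("lig_C", -6.0, -61.0),
]
print(rank_titles(poses))  # ['lig_B', 'lig_A', 'lig_C']
```

Sorting all poses by GlideScore alone would have placed lig_A first, because of a pose that is not the preferred pose for that ligand.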

The allowed input file formats for this panel are:

  • Structure file—A file containing ordered structures, e.g. myscreen_pv.mae. The file can be in Maestro or SD format, compressed or uncompressed.
  • CSV file—A comma-separated values file. The first row is expected to contain column headers, one of which must be 'Title'.
  • Table file—A space-separated, two-column file. The file does not contain title information. In each row, the first column is the rank of a retrieved active and the second column is the cumulative count of actives found. The last row contains the total number of ligands screened and the total number of actives that could be found in the screen.
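
As an illustration of the table file layout described above, the following sketch writes and reads such a file. The function names are hypothetical, and the use of a single space as separator and of integer values is an assumption based on the description, not a specification of what the panel accepts.

```python
def format_table_file(active_ranks, total_ligands, total_actives):
    """Build table-file text: 'rank cumulative_count' rows plus a totals row."""
    rows = [
        f"{rank} {count}"
        for count, rank in enumerate(sorted(active_ranks), start=1)
    ]
    # Last row: total number of ligands screened, total number of actives.
    rows.append(f"{total_ligands} {total_actives}")
    return "\n".join(rows) + "\n"


def parse_table_file(text):
    """Return (active_ranks, total_ligands, total_actives) from table text."""
    rows = [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]
    *found, (total_ligands, total_actives) = rows
    ranks = [rank for rank, _count in found]
    return ranks, total_ligands, total_actives


# Three of four actives were retrieved, at ranks 1, 3 and 7 of 10 ligands.
text = format_table_file([1, 3, 7], total_ligands=10, total_actives=4)
print(text)
print(parse_table_file(text))  # ([1, 3, 7], 10, 4)
```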

The enrichment metrics are described below:

BEDROC Boltzmann-enhanced Discrimination Receiver Operator Characteristic area under the curve. The value is bounded between 0 and 1, with 1 being ideal screen performance. With the default alpha=20, about 80% of the score comes from the first ~8% of the screen results. When alpha*Ra << 1, where Ra is the ratio of total actives to total ligands and alpha is the exponential prefactor, the BEDROC metric takes on a probabilistic meaning. Calculated as described by Truchon, J. F.; Bayly, C. I. J. Chem. Inf. Model. 2007, 47, 488-508, Eq 36.
ROC Receiver Operator Characteristic area under the curve. The value is bounded between 1 and 0, with 1 being ideal screen performance and 0.5 reflecting random behavior. The area under the curve is the probability that a randomly chosen known active will rank higher than a randomly chosen decoy. Calculated as described by Truchon, J. F.; Bayly, C. I. J. Chem. Inf. Model. 2007, 47, 488-508, Eq A.8.
AUAC Area Under the Accumulation Curve. The value is bounded between 1 and 0, with 1 being ideal screen performance. Calculated as described by Truchon, J. F.; Bayly, C. I. J. Chem. Inf. Model. 2007, 47, 488-508, Eq 8.
RIE Robust Initial Enhancement. Active ranks are weighted with a continuously decreasing exponential term. Large positive RIE values indicate better screen performance. Calculated as described by Truchon, J. F.; Bayly, C. I. J. Chem. Inf. Model. 2007, 47, 488-508, Eq 18.
EF Enrichment Factor, calculated with respect to the number of total ligands. EF = (a/n)/(A/N), where a is the number of actives found in sample size n, A is the total number of actives, and N is the total number of ligands (decoys and actives).
EF* Enrichment factor for recovering x% of the known actives, defined as the fraction of the actives recovered divided by the fraction of decoys recovered at that point. This value gives the relative probability that a compound recovered is an active rather than a decoy. Because the enrichment factors are computed using the fractions of actives and decoys recovered, they are independent of the absolute and relative numbers of actives and decoys screened.
EF' Modified enrichment factor defined using the average of the reciprocals of the EF* enrichment factors for recovering the first x% of the known actives. EF'(x) will be larger than EF*(x) if the actives in question come relatively early in the sequence, and smaller if they come relatively late.
DEF Diverse Enrichment Factor, calculated with respect to the number of total ligands. DEF = (a/n)/(A/N) * D, where a is the number of actives found in sample size n, A is the total number of actives, N is the total number of ligands (decoys and actives), and D is a penalty for recovering only congeneric actives from a diverse set. D is defined as (1 - min_Tanimoto_a)/(1 - min_Tanimoto_A), where min_Tanimoto_a is the minimum Tanimoto similarity among the a actives found and min_Tanimoto_A is the minimum among all A actives.
DEF* Diverse EF*. DEF* = (EF*)(D), where D is the penalty for recovering only congeneric actives from a diverse set. D is defined as (1 - min_Tanimoto_a)/(1 - min_Tanimoto_A).
DEF' Diverse EF'. DEF' = (EF')(D), where D is the penalty for recovering only congeneric actives from a diverse set. D is defined as (1 - min_Tanimoto_a)/(1 - min_Tanimoto_A).
Eff Efficiency in distinguishing actives from decoys on an absolute scale of 1 (perfect; all actives come before any decoys) to -1 (all decoys come before any actives); a value of 0 means that actives and decoys were recovered at equal proportionate rates.
FOD Average fraction of outranking decoys.
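
The exponentially weighted metrics above can be illustrated with a short sketch. It follows the published RIE definition and the equivalent min/max-normalized form of BEDROC from the Truchon and Bayly paper; it is not the code used by this panel, and the function names are hypothetical. It assumes that every active has a rank and that ranks start at 1.

```python
import math


def rie(active_ranks, total_ligands, alpha=20.0):
    """Robust Initial Enhancement for the given 1-based active ranks."""
    n = len(active_ranks)
    # Mean exponential weight of the actives at their observed ranks.
    observed = sum(
        math.exp(-alpha * r / total_ligands) for r in active_ranks
    ) / n
    # Expected mean weight if the actives were spread uniformly over the ranks.
    expected = (1.0 - math.exp(-alpha)) / (
        total_ligands * (math.exp(alpha / total_ligands) - 1.0)
    )
    return observed / expected


def bedroc(active_ranks, total_ligands, alpha=20.0):
    """BEDROC as RIE rescaled to the range 0 (worst) to 1 (best)."""
    ra = len(active_ranks) / total_ligands  # ratio of actives to ligands
    rie_max = (1.0 - math.exp(-alpha * ra)) / (ra * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ra)) / (ra * (1.0 - math.exp(alpha)))
    value = rie(active_ranks, total_ligands, alpha)
    return (value - rie_min) / (rie_max - rie_min)


# 5 actives among 100 ligands, ranked first (ideal) or last (worst case).
print(bedroc([1, 2, 3, 4, 5], 100))       # close to 1
print(bedroc([96, 97, 98, 99, 100], 100))  # close to 0

# Share of the exponential weight carried by the first 8% when alpha=20.
print(1.0 - math.exp(-20.0 * 0.08))  # about 0.8
```

The last line shows why alpha=20 is associated with the first ~8% of the results: the weight exp(-alpha*x) integrated over the first 8% of the list accounts for roughly 80% of the total.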

Enrichment Calculator Panel Features

Use structures from option menu

Choose the structure source for assessing enrichment. This is the output from the screening run that included a set of actives and a set of decoys.

  • Project Table (n selected entries)—Use the entries that are currently selected in the Project Table or Entry List. The number of entries selected is shown on the menu item. An icon is displayed to the right which you can click to open the Project Table and select entries.
  • File—Use the specified file. When this option is selected, the File name text box and Browse button are displayed. The allowed file types are: Maestro or SD structure file, CSV file, table file (see above).

Open Project Table button

Open the Project Table panel, so you can select the entries for the structure source.

File name text box and Browse button

Enter the file name in this text box, or click Browse and navigate to the file. The name of the file you selected is displayed in the text box.

Actives file text box and Browse button

File that specifies the known actives. The actives are identified by title, so you should ensure that the titles are unique and meaningful. The file can be a simple text file with one title per line, a Maestro file that contains the structures of the actives, or a CSV file that has the title in the first field (such as one exported from Maestro as a spreadsheet). The titles must match those used for the structures in the screening run.
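
When preparing an actives file, it can help to check the titles before running the calculation. The sketch below reads a one-title-per-line text file or a CSV file with the title in the first field, and rejects duplicate titles. The function name is hypothetical, Maestro structure files are not handled, and treating a first CSV row that reads "Title" as a header is an assumption.

```python
import csv
import tempfile
from pathlib import Path


def read_active_titles(path):
    """Read active titles from a plain-text or CSV actives file."""
    path = Path(path)
    with path.open(newline="") as handle:
        if path.suffix.lower() == ".csv":
            # Assumption: the title is the first field of each row.
            titles = [row[0].strip() for row in csv.reader(handle) if row]
            # Assumption: a first row reading "Title" is a column header.
            if titles and titles[0].lower() == "title":
                titles = titles[1:]
        else:
            titles = [line.strip() for line in handle]
    titles = [title for title in titles if title]  # drop blank entries
    duplicates = sorted({t for t in titles if titles.count(t) > 1})
    if duplicates:
        raise ValueError(f"duplicate titles in actives file: {duplicates}")
    return titles


with tempfile.TemporaryDirectory() as tmp:
    actives = Path(tmp) / "actives.txt"
    actives.write_text("lig_A\nlig_B\n\nlig_C\n")
    print(read_active_titles(actives))  # ['lig_A', 'lig_B', 'lig_C']
```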

Number of decoys text box

The total number of decoys used as input for the screen. The number of structures screened should be the sum of the number of actives and the number of decoys.

Enrichment Report text area

This text area displays the results of the enrichment calculation in text form. An example is given below.

Enrichment Report
-----------------

Actives file: example_A_actives.txt
Results: example_A_pv.maegz
Total actives: 43
Total ligands(actives+decoys): 1043
Number of ranked actives: 43

BEDROC(alpha=160.9, alpha*Ra=6.6335): 0.742
BEDROC(alpha=20.0, alpha*Ra=0.8245): 0.401
BEDROC(alpha=8.0, alpha*Ra=0.3298): 0.461
ROC: 0.76
RIE: 5.47
Area under accumulation curve: 0.75
Ave. Number of outranking decoys: 242
Minimum Tc over all active pairs: 0.011

Count and percentage of actives in top N% of decoy results.
% Decoys  |   1%|   2%|   5%|  10%|  20%|
# Actives |    9|   11|   13|   19|   23|
% Actives | 20.9| 25.6| 30.2| 44.2| 53.5|

Count and percentage of actives in top N% of results.
% Results |   1%|   2%|   5%|  10%|  20%|
# Actives |    8|    9|   13|   18|   23|
% Actives | 18.6| 20.9| 30.2| 41.9| 53.5|

Enrichment Factors with respect to N% sample size.
% Sample |     1%|     2%|     5%|    10%|    20%|
EF       |     19|     10|    6.1|    4.2|    2.7|
EF*      |     21|     13|      6|    4.4|    2.7|
EF'      |     43|     23|     11|    7.1|    4.5|
DEF      |     19|     10|    6.1|    4.2|    2.7|
DEF*     |     21|     13|      6|    4.4|    2.7|
DEF'     |     43|     23|     11|    7.1|    4.5|
Eff      |  0.909|  0.855|  0.716|  0.631|  0.456|

Enrichment Factors with respect to N% actives recovered.
% Actives |    40%|    50%|    60%|    70%|    80%|    90%|   100%|
EF        |    4.3|    2.8|      2|    2.1|      2|    1.4|      1|
EF*       |    5.1|    3.1|    2.1|    2.2|    2.1|    1.4|      1|
EF'       |    8.3|    5.1|    3.5|    3.2|      3|    2.3|    1.8|
FOD       |   0.02|   0.04|   0.07|    0.1|    0.1|    0.2|    0.2|

Rank   Title                                                                    
------ -------------------------------------------------------------------------
1      15748-1_native-sip                                                       
2      1103-3_native-sip                                                        
3      15630-1_native-sip                                                       
4      15751-1_native-sip                                                       
5      15624-1_native-sip                                                       
6      16128-1_native-sip                                                       
7      15723-1_native-sip                                                       
9      16228-2_native-sip                                                       
11     15752-1_native-sip                                                       
22     15729-3_native-sip                                                       
28     15722-1_native-sip                                                       
42     15728-3_native-sip                                                       
44     1154-1_native-sip                                                        
66     15755-2_native-sip                                                       
71     15754-1_native-sip                                                       
73     16186-1_native-sip                                                       
95     1152-1_native-sip                                                        
101    1149-2_native-sip                                                        
109    1150-1_native-sip                                                        
155    1233-1_native-sip                                                        
157    1234-1_native-sip                                                        
189    1143-2_native-sip                                                        
190    15753-1_native-sip                                                       
287    1100-2_native-sip                                                        
303    1119-1_native-sip                                                        
309    1144-3_native-sip                                                        
320    16195-1_native-sip                                                       
334    15730-1_native-sip                                                       
352    15725-2_native-sip                                                       
354    1147-3_native-sip                                                        
373    15727-1_native-sip                                                       
387    1145-2_native-sip                                                        
395    16101-2_native-sip                                                       
418    16107-1_native-sip                                                       
441    15724-2_native-sip                                                       
483    1153-1_native-sip                                                        
556    1196-1_native-sip                                                        
614    16190-1_native-sip                                                       
684    1151-1_native-sip                                                        
687    16193-1_native-sip                                                       
696    16191-1_native-sip                                                       
997    1146-3_native-sip                                                        
1004   1148-2_native-sip                                                        
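
As a cross-check of the metric definitions, the following sketch recomputes a few values from the ranks listed in the example report above (43 actives, 1043 ligands). It is an illustration under stated assumptions rather than the panel's implementation: the function names are hypothetical, the sample size for EF is taken as the N% of ligands rounded to the nearest integer, and AUAC is approximated as one minus the mean relative rank of the actives.

```python
ACTIVE_RANKS = [
    1, 2, 3, 4, 5, 6, 7, 9, 11, 22, 28, 42, 44, 66, 71, 73, 95, 101, 109,
    155, 157, 189, 190, 287, 303, 309, 320, 334, 352, 354, 373, 387, 395,
    418, 441, 483, 556, 614, 684, 687, 696, 997, 1004,
]
TOTAL_LIGANDS = 1043  # actives + decoys
TOTAL_ACTIVES = 43


def mean_outranking_decoys(ranks):
    """Average number of decoys ranked ahead of an active."""
    # The i-th active (1-based) at rank r has r - i decoys ahead of it.
    ordered = sorted(ranks)
    return sum(r - i for i, r in enumerate(ordered, start=1)) / len(ordered)


def roc_auc(ranks, total_ligands):
    """Probability that a random active outranks a random decoy."""
    decoys = total_ligands - len(ranks)
    return 1.0 - mean_outranking_decoys(ranks) / decoys


def auac(ranks, total_ligands):
    """Approximate area under the accumulation curve."""
    return 1.0 - sum(ranks) / (len(ranks) * total_ligands)


def enrichment_factor(ranks, total_ligands, total_actives, fraction):
    """EF = (a/n)/(A/N) for a sample of the top `fraction` of the ligands."""
    n = round(fraction * total_ligands)  # assumed rounding of the sample size
    a = sum(1 for r in ranks if r <= n)
    return (a / n) / (total_actives / total_ligands)


print(mean_outranking_decoys(ACTIVE_RANKS))   # about 242.7
print(roc_auc(ACTIVE_RANKS, TOTAL_LIGANDS))   # about 0.757
print(auac(ACTIVE_RANKS, TOTAL_LIGANDS))      # about 0.746
for fraction in (0.01, 0.02, 0.05, 0.10, 0.20):
    ef = enrichment_factor(
        ACTIVE_RANKS, TOTAL_LIGANDS, TOTAL_ACTIVES, fraction
    )
    print(f"EF at {fraction:.0%}: {ef:.2f}")
```

With these assumptions, the computed values are consistent with the ROC, area under accumulation curve, outranking decoys, and EF entries of the example report to the precision shown there.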


ROC Plot button

Display a plot of the ROC curve, the area under which is the ROC metric.

% Screen Plot button

Display a plot of the percentage of actives recovered against the percentage of structures screened.

Save Metrics CSV button

Save a CSV file containing the metrics. The first column is the title; the remaining columns are the metric values. Opens a file selector in which you can navigate to a location and save the file.

Save Ranks CSV button

Save a CSV file listing the actives and their ranks. Opens a file selector in which you can navigate to a location and save the file.

Job toolbar

Manage job submission and settings. See Job Toolbar for a description of this toolbar.

Status bar

The status bar displays information about the current job settings and status for the panel. The settings include the job name, task name and task settings (if any), number of subjobs (if any), host name, and job incorporation setting. The job status can include messages about job start, job completion, and incorporation.

Use the Reset button to reset the panel to its default settings and clear any data from the panel. You can also reset the panel from the Job toolbar.

The status bar also contains the Help button, which opens the help topic for the panel in your browser. If the panel is used by one or more tutorials, hovering over the Help button displays a button, which you can click to display a list of tutorials (or you can right-click the Help button instead). Choosing a tutorial opens the tutorial topic.