Correlation Plot (FEP+) Panel
Display a correlation plot of predicted against experimental binding free energies, along with a statistics table and a histogram. These are updated when the map data changes, for example when a ligand or mutant is deleted. This plot is available for both ligand mutation and protein mutation FEP.
To open this panel, click the Plot button in the FEP+ panel.
- Features
- Additional Resources
Correlation Plot (FEP+) Panel Features
- Plot area
- Statistics tables
- Histogram of FEP+ edge errors
- Plot options section
- Units option menu
- Plot title text box
- Save As button
- Plot area
-
Displays the plot of the predicted against the experimental binding free energy. The line of perfect agreement is also displayed. For relative binding FEP, this line passes through the origin. For absolute binding FEP, the line can be displaced, as there may be an offset in energy needed to correlate predicted and experimental binding free energies. This offset corresponds to the protein reorganization energy, which is missing from the FEP calculations. The offset can be determined from the plot and used with the absolute binding FEP results for other compounds to obtain an estimate of the expected experimental binding free energies.
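The offset described above can be sketched in a few lines of Python. This is a minimal illustration, not the panel's own procedure: it assumes the offset is taken as the mean difference between experimental and predicted ΔG over a set of reference compounds (equivalent to a least-squares fit of a line with unit slope), and the function names and numbers are hypothetical.

```python
from statistics import mean


def estimate_offset(exp_dg, pred_dg):
    """Mean of (experimental - predicted) dG over reference compounds, in kcal/mol.

    Assumption: a constant offset with unit slope; the panel's exact
    procedure is not stated in this document.
    """
    if len(exp_dg) != len(pred_dg) or not exp_dg:
        raise ValueError("need equally sized, non-empty lists of dG values")
    return mean(e - p for e, p in zip(exp_dg, pred_dg))


def apply_offset(pred_dg, offset):
    """Shift absolute binding FEP predictions by the offset."""
    return [p + offset for p in pred_dg]


# Hypothetical reference data (kcal/mol), for illustration only.
exp_ref = [-9.0, -8.0, -10.0]
pred_ref = [-12.0, -11.5, -13.0]

offset = estimate_offset(exp_ref, pred_ref)
estimated_exp = apply_offset([-12.5], offset)  # a new compound's prediction
```

With more reference compounds the estimate of the offset becomes less sensitive to the error of any single prediction.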
You can select points in the plot by drawing a closed curve around them. The ligands or mutants for these points are displayed in the Workspace and selected in the FEP+ panel.
Error ranges in both the experimental and predicted values are indicated by horizontal and vertical lines. If ligands have been designated as "Top of assay" or "Bottom of assay", i.e. their experimental values are outside the assay range, the lines are drawn all the way to the appropriate side of the plot, to indicate the range of values. These points are excluded from the statistical analysis.
- Statistics tables
-
There are two tables that can be displayed: Correlation and Classification.
The Metric name column has an information button that shows a tooltip defining the metric when you hover over it. The Value column shows the value of each metric.
- Correlation
-
The Correlation table displays various statistics associated with the plot. The numbers in [brackets] show the 95% confidence intervals of R2. The statistics listed depend on the type of FEP+ calculation.
-
Expected Exp.R2: value of the experimental data from repeated experiments.
-
Expected Pred.R2: value of the predicted data against the experimental data from repeated predictions.
-
R2: value of the fit of predicted to experimental values.
-
MUE (ΔΔG): the mean unsigned error between the predicted and experimental ΔΔG values, calculated over all ligand pairs in the map.
-
RMSE (ΔΔG): the RMS error between the predicted and experimental ΔΔG values, calculated over all ligand pairs in the map.
-
MUE (edgewise): the mean unsigned error of the FEP+ values with respect to the experimental values for the free energy change of directly-connected edges in the graph.
-
RMSE (edgewise): the RMS error of the FEP+ values with respect to the experimental values for the free energy change of directly-connected edges in the graph.
-
MUE (pairwise): the mean unsigned error of the FEP+ values with respect to the relative free energy change for all ligand or mutant pairs.
-
RMSE (pairwise): the RMS error of the FEP+ values with respect to the relative free energy change for all ligand or mutant pairs.
-
Best Fit Line: the slope m and intercept b values of the best fit line between predicted and experimental values.
-
Kendall's τ: A statistical measure that evaluates the rank order of predictions against the true order of the experimental values.
-
Total Ligands: the total number of ligands included in the correlation plot.
-
Total Excluded: the total number of ligands excluded from the correlation plot. For ligand mutation FEP, the statistics do not include ligands designated as "Top of assay" or "Bottom of assay", as the experimental values for these ligands are considered to be unreliable.
This information is useful if there are some outliers in the correlation plot, which can cause the MUE or RMSE for the directly connected edges to be very sensitive to the topology of the graph.
If the Edgewise RMSE is larger than the Pairwise RMSE, this indicates possible problems with the graph construction, or with the structures used for bias.
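The difference between the edgewise and pairwise error metrics, and the rank-order measure, can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the FEP+ implementation: the dictionaries of ΔG values, the edge list, and all function names are hypothetical, and Kendall's τ is computed in its simplest form (τ-a, with no correction for ties), which may differ from the variant used by the panel.

```python
from itertools import combinations
from math import sqrt


def mue_rmse(errors):
    """Return (mean unsigned error, RMS error) for a sequence of signed errors."""
    errors = list(errors)
    if not errors:
        raise ValueError("at least one error value is required")
    mue = sum(abs(e) for e in errors) / len(errors)
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    return mue, rmse


def pairwise_errors(exp_dg, pred_dg):
    """Signed ddG errors for every ligand pair; inputs map ligand name -> dG."""
    names = sorted(set(exp_dg) & set(pred_dg))
    return [
        (pred_dg[b] - pred_dg[a]) - (exp_dg[b] - exp_dg[a])
        for a, b in combinations(names, 2)
    ]


def edgewise_errors(edges, exp_dg):
    """Signed ddG errors for directly connected edges only.

    edges: iterable of (ligand_a, ligand_b, predicted ddG for a -> b).
    """
    return [ddg - (exp_dg[b] - exp_dg[a]) for a, b, ddg in edges]


def kendall_tau_a(exp_dg, pred_dg):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    names = sorted(set(exp_dg) & set(pred_dg))
    if len(names) < 2:
        raise ValueError("at least two ligands are required")
    score = 0
    n_pairs = 0
    for a, b in combinations(names, 2):
        product = (exp_dg[b] - exp_dg[a]) * (pred_dg[b] - pred_dg[a])
        score += (product > 0) - (product < 0)  # +1, -1, or 0 for a tie
        n_pairs += 1
    return score / n_pairs


# Hypothetical data (kcal/mol), for illustration only.
exp = {"L1": -9.0, "L2": -8.0, "L3": -10.0}
pred = {"L1": -9.5, "L2": -8.0, "L3": -9.0}
edges = [("L1", "L2", 1.5), ("L2", "L3", -1.0)]

pair_mue, pair_rmse = mue_rmse(pairwise_errors(exp, pred))
edge_mue, edge_rmse = mue_rmse(edgewise_errors(edges, exp))
tau = kendall_tau_a(exp, pred)
```

Because the edgewise metrics only sample the pairs that are directly connected in the graph, they depend on the graph topology, whereas the pairwise metrics cover every pair of ligands.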
-
- Classification
-
The Classification table is only present when Classification is selected in the Plot options section. It displays statistics associated with the classification matrix. The False Negative (FN), True Negative (TN), True Positive (TP), and False Positive (FP) values used to calculate the following metrics are displayed on the plot.
-
-
Accuracy: The rate of correct predictions (true positives and true negatives) [(TP + TN) / (TP + TN + FP + FN)]. This value can be misleading if the number of ligands in each outcome is imbalanced.
-
Specificity: The proportion of actual negatives that are correctly identified, also known as the TN rate [TN / (TN + FP)].
-
Recall: The proportion of actual positives that are correctly identified, also known as the TP rate [TP / (TP + FN)]. This metric is important when you want to minimize false negatives (FN). Values closer to 1 are better.
-
Precision: The positive predictive value. This metric is useful when the cost of false positives is high [TP / (TP + FP)].
-
F1-score: The harmonic mean of precision and recall. This metric is useful when you need to consider both false positives and false negatives [2 * (Precision * Recall) / (Precision + Recall)].
-
Cohen's κ: Measures the agreement between the model's predictions and the actual values, while accounting for the possibility of agreement occurring by chance. A value of 1 indicates perfect agreement, -1 indicates perfect disagreement, and 0 indicates agreement no better than chance. This metric is particularly valuable when evaluating imbalanced datasets (one outcome has many more samples than the other).
-
TP/FP ratio: The ratio of true positives to false positives. This is a useful metric for assessing the performance of a classification model. A higher ratio means the model is more likely to correctly identify positive instances while minimizing false alarms. The optimal cutoff value is calculated by optimizing this metric.
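The formulas listed for the Classification table can be collected into a single Python function. This is a sketch for checking values by hand, not the panel's code: the function name and the example counts are hypothetical, Cohen's κ uses the standard two-class definition, and returning NaN for an undefined ratio is a choice made here for illustration.

```python
from math import nan


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the ratio is undefined."""
    return numerator / denominator if denominator else nan


def classification_metrics(tp, tn, fp, fn):
    """Compute the Classification table metrics from the four outcome counts."""
    total = tp + tn + fp + fn
    accuracy = _safe_div(tp + tn, total)
    specificity = _safe_div(tn, tn + fp)
    recall = _safe_div(tp, tp + fn)
    precision = _safe_div(tp, tp + fp)
    f1 = _safe_div(2 * precision * recall, precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    # Chance agreement combines the predicted and actual class frequencies.
    p_chance = _safe_div(
        (tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total * total
    )
    kappa = _safe_div(accuracy - p_chance, 1 - p_chance)

    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
        "tp_fp_ratio": _safe_div(tp, fp),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, tn=6, fp=2, fn=4)
```

For these counts the accuracy is 0.7 but κ is only 0.4, which shows how κ discounts the agreement that would be expected by chance.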
-
- Histogram of FEP+ errors
-
Histogram of the FEP+ error magnitudes. For relative binding FEP, pairwise error magnitudes are shown for all ligand pairs that have experimental and cycle-closure-corrected ΔΔG values; the error magnitude is the absolute difference between the experimental and predicted ΔΔG. For absolute binding FEP, the error is the absolute difference between the experimental and predicted ΔG values.
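A histogram of error magnitudes like the one described can be approximated as follows. This is a minimal sketch: the bin width of 0.5 kcal/mol, the function names, and the data are assumptions chosen for illustration, and the binning used in the panel is not stated here.

```python
from collections import Counter
from math import floor


def error_magnitudes(exp_values, pred_values):
    """Absolute differences between experimental and predicted values."""
    return [abs(e - p) for e, p in zip(exp_values, pred_values)]


def histogram(magnitudes, bin_width=0.5):
    """Count magnitudes per bin; keys are (lower, upper) bin edges in kcal/mol."""
    counts = Counter(floor(m / bin_width) for m in magnitudes)
    return {
        (i * bin_width, (i + 1) * bin_width): counts[i] for i in sorted(counts)
    }


# Hypothetical values (kcal/mol), for illustration only.
exp = [-9.0, -8.0, -10.0, -7.5, -9.5]
pred = [-9.2, -7.6, -10.7, -8.8, -8.6]

bins = histogram(error_magnitudes(exp, pred))
```

The same functions apply to relative binding FEP if the inputs are the experimental and predicted ΔΔG values for each ligand pair.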
- Plot options section
-
Select one of these options to show information on the plot.
- Error Bands: show error bands at 1 kcal/mol intervals from the correlation line.
- Error Bars: show error bars for the predicted and the experimental values (if available).
- Best Fit: show the line of best fit between the predicted and experimental values. The slope and intercept are shown in the Correlation table.
- Classification: overlay a classification matrix of the ligands onto the plot. The matrix has four outcomes: False Negative (FN), True Negative (TN), True Positive (TP), and False Positive (FP). False positive and false negative boxes are colored blue, while true positive and true negative boxes are colored orange. Each box is labeled with the number of ligands in that outcome, followed by the percentage of all ligands. The Exp. and Pred. Cutoff text boxes can be used to change the boundaries of the boxes on the X and Y axes, respectively. The default cutoff is the optimal cutoff, which gives the highest number of true positives and the lowest number of false positives. You can also drag the dotted lines to change the boundaries, which automatically updates the cutoff text boxes. Use the reset icon to reset the cutoff values to the default. Statistics associated with the classification matrix are listed in the Classification table.
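The way the two cutoffs divide the plot into four outcomes can be sketched in Python. The sketch rests on an assumption that this document does not confirm: a ligand is treated as positive when its free energy is at or below the cutoff (more negative meaning tighter binding). The function name and the data are hypothetical.

```python
from collections import Counter


def classify(exp_values, pred_values, exp_cutoff, pred_cutoff):
    """Count TP, FP, TN, FN for free energies in kcal/mol.

    Assumption: a value at or below its cutoff counts as positive.
    """
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for exp, pred in zip(exp_values, pred_values):
        actual_positive = exp <= exp_cutoff
        predicted_positive = pred <= pred_cutoff
        if predicted_positive and actual_positive:
            counts["TP"] += 1
        elif predicted_positive:
            counts["FP"] += 1
        elif actual_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical values (kcal/mol), for illustration only.
exp = [-10.0, -9.5, -8.0, -7.0, -9.2]
pred = [-10.5, -8.5, -9.4, -7.5, -9.1]

counts = classify(exp, pred, exp_cutoff=-9.0, pred_cutoff=-9.0)
total = sum(counts.values())
percentages = {name: 100.0 * n / total for name, n in counts.items()}
```

Changing either cutoff moves ligands between boxes, which is why the classification statistics change when the dotted lines are dragged.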
- Units option menu
-
Select the units to use for the affinity, from ΔG (kcal/mol), pKi, or logS (for solubility FEP). The units are changed wherever they are used, including the saved PDF file.
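The relationship between ΔG and pKi can be sketched with the standard thermodynamic relation ΔG = RT ln Ki, which gives pKi = -ΔG / (RT ln 10). The temperature of 298.15 K and the function names below are assumptions made for this illustration; the temperature used by the panel is not stated here.

```python
from math import log

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def dg_to_pki(dg, temperature=298.15):
    """Convert a binding free energy in kcal/mol to pKi.

    Assumes dG = RT ln(Ki) with Ki relative to a 1 M standard state.
    """
    return -dg / (R_KCAL * temperature * log(10))


def pki_to_dg(pki, temperature=298.15):
    """Convert pKi to a binding free energy in kcal/mol."""
    return -pki * R_KCAL * temperature * log(10)


pki = dg_to_pki(-9.0)      # roughly 6.6 at 298.15 K
dg_back = pki_to_dg(pki)   # recovers -9.0
```

At this temperature one pKi unit corresponds to about 1.36 kcal/mol, which is a convenient rule of thumb when reading the plot in either unit.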
- Plot title text box
-
Title of the plot, which you can edit. The default is derived from the title shown on the FEP+ panel.
- Save As button
-
Save a report in PDF format. The report includes the plot, the statistics table, the histogram, and explanatory text. Opens a file selector, so you can browse to a location and name the file.