FEP+ Panel — Analysis Tab

Analyze the results of FEP calculations for ligand binding or for ligand solubility. The analysis produces estimates of the relative or absolute free energies, or of the solubility.

For ligand binding, the analysis displays detailed information on the protein-ligand interactions, ligand properties and changes, protein changes, and simulation convergence for complexes or for connected complexes. This information is given for the end points of the perturbation, λ = 0 and λ = 1, which correspond to the two complexes or to the bound and free complexes: no information is given for intermediate stages. You can also display the trajectories for the complex and solvent legs, and the vacuum leg if it is present.

Overview of Analysis

This section gives some background information on parts of the analysis for relative binding free energies.

Estimates of Relative Binding Energies

Converting the relative values (ΔΔG) into a ΔG for each ligand requires at least one experimental binding free energy as a reference. The predicted binding free energy is estimated as

\Delta G_j(\mathrm{pred}) = \frac{1}{M} \sum_{k=1}^{M} \left[ \Delta G_k(\mathrm{exp}) - \Delta\Delta G_{kj} \right]

where the sum runs over the M available experimental binding free energies. The prediction is simply the average of the values obtained by using each experimental value in turn as the reference. Similarly, the predicted error is obtained by combining in quadrature the errors for each experimental value used and dividing by the number of experimental values:

e_j(\mathrm{pred}) = \frac{1}{M} \sqrt{ \sum_{k=1}^{M} e_{kj}(\mathrm{pred})^2 }

where ekj(pred) is the error in the prediction of the binding free energy from the use of a single experimental value k as reference: ekj(pred)² = ek(exp)² + ekj(ΔΔG)².

See Conversion of ΔΔG Values to ΔG Values for a more detailed explanation.
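
The two expressions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name and argument layout are hypothetical, the sign convention for ΔΔGkj is copied directly from the formula above, and the 0.4 kcal/mol experimental error is the default value quoted in this topic.

```python
import math

DEFAULT_EXP_ERROR = 0.4  # kcal/mol, default experimental error quoted in this topic


def predict_dg(exp_dg, ddg_kj, ddg_err_kj, exp_err=DEFAULT_EXP_ERROR):
    """Return (predicted dG, predicted error) for one ligand j.

    exp_dg[k]     : experimental dG of reference ligand k
    ddg_kj[k]     : ddG_kj as it enters the formula above
    ddg_err_kj[k] : error e_kj(ddG) on that ddG
    All names here are hypothetical; this is not the FEP+ implementation.
    """
    m = len(exp_dg)
    # Average of the predictions obtained from each reference ligand k.
    dg = sum(g - d for g, d in zip(exp_dg, ddg_kj)) / m
    # e_kj(pred)^2 = e_k(exp)^2 + e_kj(ddG)^2, summed over k, square-rooted, divided by M.
    err = math.sqrt(sum(exp_err ** 2 + e ** 2 for e in ddg_err_kj)) / m
    return dg, err


dg, err = predict_dg([-9.0, -10.0], [1.0, 0.0], [0.3, 0.3])
```

In this example both references give −10.0 kcal/mol for ligand j, so the average is also −10.0 kcal/mol.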

Error Estimates for Relative Binding FEP

The errors in the cycle-closure-corrected estimates of ΔΔG are calculated as follows. (Combined errors are evaluated as usual by quadratic combination.)

For each pair of ligands that are directly connected (by an edge in the graph), the hysteresis is evaluated for each loop involving that edge, and the error for each loop k is evaluated as

\mathrm{error}_k = \frac{\mathrm{hysteresis}_k}{\sqrt{N_k}}

where Nk is the number of nodes in the loop. The error in ΔΔG for the pair of ligands is taken as the maximum loop error.
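
For a directly connected pair, the rule above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the hysteresis of a loop is the magnitude of the signed sum of the ΔΔG values taken in order around the loop; that definition is not stated in this topic.

```python
import math


def loop_error(loop_ddgs):
    """Error for one closed loop: hysteresis / sqrt(N).

    loop_ddgs holds the ddG of each edge, signed in the direction of travel
    around the loop. In a simple closed loop the number of nodes equals the
    number of edges, so len(loop_ddgs) is used for N.
    """
    # Assumption: hysteresis is the magnitude of the signed sum around the loop.
    hysteresis = abs(sum(loop_ddgs))
    return hysteresis / math.sqrt(len(loop_ddgs))


def edge_error(loops_through_edge):
    """Error for a directly connected pair: the largest loop error."""
    return max(loop_error(loop) for loop in loops_through_edge)


# Two hypothetical loops that share the same edge.
error = edge_error([[1.0, -0.4, -0.3], [1.0, 0.5, -0.5, -0.6]])
```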

For each pair of ligands that are not directly connected, the error is evaluated for each loop that contains the two ligands, and the error in ΔΔG is taken to be that of the loop with the smallest error. The error around a loop is evaluated by using the expression above for the error in each edge of the loop.

See FEP Cycle Closure Method and Error Estimates for a more detailed explanation.
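
As a rough illustration of what a cycle-closure correction does, consider the simplest special case: a single closed loop whose edges are weighted equally and have independent Gaussian errors. Under those assumptions a maximum likelihood fit reduces to a least-squares fit, which spreads the hysteresis evenly over the edges. The function name is hypothetical, and this sketch is not the FEP+ algorithm, which works on the whole graph.

```python
def close_single_cycle(ddgs):
    """Remove the hysteresis from one closed loop of signed ddG values.

    Minimizing sum((x_i - ddg_i)**2) subject to sum(x_i) == 0 gives
    x_i = ddg_i - hysteresis / N, so each edge takes an equal share.
    """
    hysteresis = sum(ddgs)            # signed sum around the loop
    shift = hysteresis / len(ddgs)    # equal share for each edge
    return [d - shift for d in ddgs]


corrected = close_single_cycle([1.0, -0.4, -0.3])
```

After the correction the values sum to zero around the loop, so the free energy difference between any two ligands in the loop no longer depends on the path taken.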

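In the expression for the predicted error in ΔG given under Estimates of Relative Binding Energies, the experimental error ek(exp) is taken to be 0.4 kcal/mol by default, and ekj(ΔΔG) is the error in the predicted relative binding free energy for ligands k and j.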

Using the Analysis Tab

The purpose of this tab is to analyze the results of FEP+ jobs. For relative binding free energies, the tab serves two audiences. On the computational side, it provides tools for assessing the error in the predicted ΔΔG values by examining the hysteresis around loops, and for assessing their accuracy by comparison with experimental ΔG values. On the lead optimization side, the ΔΔG values show how a functional group substitution at a site affects the binding affinity. This information can be used to direct the choice of compounds for synthesis or to generate ideas for further compounds to test.

If any of the convergence classifiers (Energy conv., Lig. RMSD, REST Exch., CCC Conv.) has a FAIR or BAD rating, refer to the FEP+ Best Practices document for discussion of the issues.

Analysis Tab Features

This tab has a single table, which provides information on the raw free energy data generated from the perturbation calculation, and access to the trajectories and edge (connection) or complex analysis. See FEP+ Methodology for information on these quantities and how they are calculated. A summary of the error estimation methods for relative binding FEP is given above.

  • You can drag the column headings to rearrange the columns.

  • Clicking a column heading sorts the column by the values in the column (or last-selected item, for columns with both value and error).

  • Right-clicking a column heading shows a menu, in which you can select groups of columns to show or hide in the table, and choose whether to sort by the value or the error, for columns that contain both. You can also open the menu by pausing the pointer over the heading, and clicking the icon that appears.

  • Right-clicking a row shows a menu for importing trajectories, showing representative structures, or deleting the row.

The table columns are described in the tables below. NOTE: Some of the columns for ligand binding are only present for relative binding FEP, absolute binding FEP, or solubility FEP.

Table 1. FEP+ analysis table columns for ligand binding FEP.

Column

Description

Experimental

Experimental relative binding free energy difference for two ligands, absolute binding free energy of a single ligand, or experimental affinities for protein FEP+ calculations. Includes the error if available.

Energy conv.

Classification of energy convergence for the edge as GOOD, FAIR, or BAD. This classifier tracks the rate at which both FEP legs (solvent and complex) converge, measured in the last nanosecond (ns) of the simulation. Two criteria are used: a global variation, which is the maximum change in ΔΔG or ΔG divided by the time span in which convergence is measured (the last ns, ideally), and a local variation, which is the maximum change in ΔΔG or ΔG for any time step in that time span divided by the time step. Both criteria yield a value in kcal mol⁻¹ ns⁻¹.

  • GOOD: both global variation and local variation are less than 0.3 kcal mol⁻¹ ns⁻¹ for both legs.
  • FAIR: global variation is less than 0.3 kcal mol⁻¹ ns⁻¹ for both legs and local variation is greater than 0.3 kcal mol⁻¹ ns⁻¹ for one or both legs.
  • BAD: global variation for one or both legs is greater than 0.3 kcal mol⁻¹ ns⁻¹.

Lig. RMSD

Classification of ligand RMSD as GOOD, FAIR, or BAD. Ligand RMSD is measured with respect to the input ligand configuration, with the complex aligned on the input receptor.

  • GOOD: ligand RMSD values are less than 2.0 Å throughout the simulation.
  • FAIR: ligand RMSD values reach 2.0 Å or more at some point but do not exceed 4.0 Å.
  • BAD: ligand RMSD values exceed 4.0 Å.

REST Exch.

Classification of replica exchange density profiles as GOOD, FAIR, or BAD. A good profile is one in which all replicas are sampled adequately in all lambda windows. This classifier monitors the mixing of replicas throughout the FEP simulation and can be visualized through a PDF report of each edge. Replica mixing is assigned a score from 0 to 1, where 1 is perfect mixing and 0 is no mixing.

  • GOOD: the mixing score of both legs, solvent and complex, is greater than 0.15.
  • FAIR: the mixing score of one of the legs is greater than 0.15 and that of the other is less than 0.15.
  • BAD: the mixing score of both legs, solvent and complex, is less than 0.15.

Solvent Trajectory

This column displays the total simulation time for the solvent leg. The time is a link (in blue) which you can click to view the trajectory for the solvent leg in the Trajectory Player.

Complex Trajectory

This column displays the total simulation time for the complex leg. The time is a link (in blue) which you can click to view the trajectory for the complex leg in the Trajectory Player.

 

Relative binding FEP only:

Ligand1

Title of the first ligand in a perturbation.

Ligand2

Title of the second ligand in a perturbation.

FEP

Free energy difference (ΔΔG) in kcal mol⁻¹ between two ligands directly connected in the map, including error estimate. ΔΔG is estimated with the Bennett Acceptance Ratio method. The error is the analytical statistical uncertainty estimate for the calculated Bennett ΔΔG. Because this estimate assumes that all the important regions of phase space have been sampled, it usually underestimates the real error in the predicted free energy. If the error is higher than is considered reasonable, a warning symbol is displayed next to the error.

These estimates rely only on data for the perturbation of the two ligands, and not on any other perturbation in the graph.

Cycle Closure

Cycle closure prediction of the free energy difference between two ligands, including error estimate. The cycle closure algorithm takes the Bennett ΔΔG values for all the pairs of ligands connected in the graph as input, and calculates the statistically most likely values of ΔΔG using the maximum likelihood method. The Bennett ΔΔG results may have hysteresis, whereas the cycle-closure-corrected ΔΔG results do not: the free energy difference between any pair of ligands is independent of the path.

The error estimate is calculated from the Bennett ΔΔG for all the pairs of ligands connected in the graph, and it is related to the hysteresis of the Bennett ΔΔG of the closed thermodynamic cycle.

Unsigned Diff.

Absolute value of the difference between the experimental relative binding free energy difference and the predicted free energy difference (FEP column). This column shows N/A if the Experimental column also shows N/A (that is, no experimental value has been added). The column is hidden by default.

CCC Conv.

Classification of cycle closure hysteresis as GOOD, FAIR, or BAD, based on the effect of random omission of edges on the hysteresis. This is the same classification as in the Map tab, when a perturbation is marked as bad.

  • GOOD: hysteresis score is less than 0.5.
  • FAIR: hysteresis score is greater than 0.5 but less than 0.8.
  • BAD: hysteresis score is greater than 0.8.

Similarity

This column displays the similarity score. It is not displayed by default.

Edge Analysis

Click the View button to display an analysis for the edge defined in this row, in the FEP+ — Analysis Panel.

Vacuum Trajectory

This column displays the total simulation time for the vacuum leg (if present). The time is a link (in blue) which you can click to view the trajectory for the vacuum leg in the Trajectory Player.

Protocol

This column displays the protocol used for the edge, which is one of default, core-hopping, fragment-linking, or charge-hopping. It is not displayed by default.

 

Absolute binding FEP only:

Ligand

Title of the ligand. Only present for Absolute Binding FEP and Solubility FEP.

FEP

Absolute binding free energy (ΔG) in kcal mol⁻¹ of the ligand, including error estimate. The error is the analytical statistical uncertainty estimate for the calculated ΔG. Because this estimate assumes that all the important regions of phase space have been sampled, it usually underestimates the real error in the predicted free energy. If the error is higher than is considered reasonable, a warning symbol is displayed next to the error.

Analysis

Click the View button to display an analysis for the complex defined in this row, in the FEP+ — Analysis Panel.
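
The GOOD, FAIR, and BAD rules for the Energy conv., Lig. RMSD, and REST Exch. columns can be summarized in a short Python sketch. It is an illustration only: the function names are hypothetical, the "maximum change" in the global variation is read as the maximum minus the minimum over the time window, and values that fall exactly on a threshold (which the descriptions above do not cover) are assigned to the more favorable class.

```python
RATE_LIMIT = 0.3     # kcal mol^-1 ns^-1, Energy conv. threshold
RMSD_GOOD = 2.0      # Angstrom, upper bound for GOOD
RMSD_BAD = 4.0       # Angstrom, values above this are BAD
MIXING_LIMIT = 0.15  # replica mixing score threshold


def variations(times_ns, values):
    """Return (global, local) variation of a free energy time series.

    Assumption: the "maximum change" over the window is max(values) - min(values).
    """
    span = times_ns[-1] - times_ns[0]
    global_var = (max(values) - min(values)) / span
    points = list(zip(times_ns, values))
    # Largest change between consecutive points, divided by the time step.
    local_var = max(
        abs(v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )
    return global_var, local_var


def classify_energy_convergence(legs):
    """legs: one (global_var, local_var) pair per leg."""
    if any(g > RATE_LIMIT for g, _ in legs):
        return "BAD"
    if any(loc > RATE_LIMIT for _, loc in legs):
        return "FAIR"
    return "GOOD"


def classify_ligand_rmsd(rmsd_values):
    """Classify on the largest RMSD seen during the simulation."""
    peak = max(rmsd_values)
    if peak < RMSD_GOOD:
        return "GOOD"
    return "FAIR" if peak <= RMSD_BAD else "BAD"


def classify_rest_exchange(solvent_score, complex_score):
    """Count how many legs mix well: 2 is GOOD, 1 is FAIR, 0 is BAD."""
    legs_ok = sum(s > MIXING_LIMIT for s in (solvent_score, complex_score))
    return ("BAD", "FAIR", "GOOD")[legs_ok]
```

For example, `classify_energy_convergence([variations(t_solvent, dg_solvent), variations(t_complex, dg_complex)])` would rate an edge from the last-nanosecond free energy series of its two legs, where the four series names are placeholders for your own data.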

Table 2. FEP+ analysis table columns for solubility FEP.

Column

Description

Compound

Title of the compound.

Experimental

Experimental solubility free energy for the compound. Includes the error if available.

FEP

Solubility free energy ΔG(solubility) in kcal mol⁻¹ of the ligand, including error estimate. See Overview of FEP Solubility for a definition.

The error is the analytical statistical uncertainty estimate for the calculated ΔG. Because this estimate assumes that all the important regions of phase space have been sampled, it usually underestimates the real error in the predicted free energy. If the error is higher than is considered reasonable, a warning symbol is displayed next to the error.

Unsigned Diff.

Absolute value of the difference between the experimental solubility free energy and the predicted solubility free energy (FEP column). This column shows N/A if the Experimental column also shows N/A (that is, no experimental value has been added).

Energy conv.

Classification of energy convergence for the process as GOOD, FAIR, or BAD. This classifier tracks the rate at which both FEP legs (hydration and sublimation) converge, measured in the last nanosecond (ns) of the simulation. Two criteria are used: a global variation, which is the maximum change in ΔΔG or ΔG divided by the time span in which convergence is measured (the last ns, ideally), and a local variation, which is the maximum change in ΔΔG or ΔG for any time step in that time span divided by the time step. Both criteria yield a value in kcal mol⁻¹ ns⁻¹.

  • GOOD: both global variation and local variation are less than 0.3 kcal mol⁻¹ ns⁻¹ for both legs.
  • FAIR: global variation is less than 0.3 kcal mol⁻¹ ns⁻¹ for both legs and local variation is greater than 0.3 kcal mol⁻¹ ns⁻¹ for one or both legs.
  • BAD: global variation for one or both legs is greater than 0.3 kcal mol⁻¹ ns⁻¹.

REST Exch.

Classification of replica exchange density profiles as GOOD, FAIR, or BAD. A good profile is one in which all replicas are sampled adequately in all lambda windows. This classifier monitors the mixing of replicas throughout the FEP simulation and can be visualized through a PDF report of each edge. Replica mixing is assigned a score from 0 to 1, where 1 is perfect mixing and 0 is no mixing.

  • GOOD: the mixing score of both legs, hydration and sublimation, is greater than 0.15.
  • FAIR: the mixing score of one of the legs is greater than 0.15 and that of the other is less than 0.15.
  • BAD: the mixing score of both legs, hydration and sublimation, is less than 0.15.

Analysis

Click the View button to display an analysis for the complex defined in this row, in the Solubility FEP+ — Analysis Panel.

Hydration Trajectory

This column displays the total simulation time for the hydration leg. The time is a link (in blue) which you can click to view the trajectory for the hydration leg in the Trajectory Player.

Sublimation Trajectory

This column displays the total simulation time for the sublimation leg. The time is a link (in blue) which you can click to view the trajectory for the sublimation leg in the Trajectory Player.