Grand Canonical Monte Carlo Addition of Water
If run long enough, molecular dynamics (MD) simulations can, in principle, sample all hydration states that are relevant for accurate predictions with FEP+. In practice, however, water can take longer than the length of an FEP simulation to diffuse in and out of buried pockets. Indeed, water at some hydration sites exchanges with bulk solvent only on microsecond timescales, meaning that, depending on how a protein was set up, key hydration sites could remain unoccupied throughout an entire MD simulation. Similarly, if a protein structure has been prepared with an unfavorable distribution of water, a pure MD simulation would take far too long to correct those initial positions.
Because of the water sampling issues that are inherent in MD, FEP+ uses grand canonical Monte Carlo (GCMC) to accelerate the sampling of water molecules. At the core of GCMC are special moves that can instantaneously insert and delete water molecules around a ligand as well as over the whole simulation volume. Because of these special moves, physical barriers between an occluded pocket and bulk water that would otherwise hinder the movement of water in MD can be completely bypassed with GCMC. Thus, GCMC can be orders of magnitude more efficient at sampling buried water than MD alone. Importantly, hydration sites are occupied in GCMC simulations at the same frequencies that would be observed in infinitely long MD simulations.
FEP+ uses GCMC at the start of a simulation to add water around the ligands and to equilibrate the water density in the whole simulation volume. GCMC is also used throughout the FEP+ production stage so that the ligands will experience a more diverse ensemble of water positions and occupancies than would have occurred with MD alone.
With GCMC, FEP+ predictions become largely independent of the initial placement of water in the binding site, even if the structure is prepared without any water molecules, or if water molecules have been misplaced. When growing/shrinking a chemical group that displaces hydration sites or creates new sites, FEP+ with GCMC automatically incorporates the desolvation/solvation free energy for the pocket.
FEP+ with GCMC Methodology
GCMC is on by default in FEP+. In the complex leg of each FEP+ simulation, the total number of waters is a variable quantity that fluctuates as the simulations progress. This is because the complex leg simulations sample from a grand canonical ensemble, within which the simulation volume and a quantity known as the chemical potential are constant. As only the number of water molecules is fluctuating, only the chemical potential for water needs to be specified. In FEP+, the chemical potential of water has been precalculated and is applied automatically for a number of different water models (see below).
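For orientation, the grand canonical ensemble referred to above can be written in its standard textbook form. The expression below is a sketch for a single-component system of structureless particles at fixed volume V and temperature T; it is not taken from the FEP+ implementation, and the symbols (μ for the chemical potential, Λ for the thermal de Broglie wavelength, U for the potential energy) are the usual conventions rather than names used elsewhere in this document. For a rigid water model, additional orientational factors appear that are omitted here.

```latex
\[
  \pi\left(N, \mathbf{r}^{N}\right) \;\propto\;
  \frac{1}{N!\,\Lambda^{3N}}
  \exp\!\left[\beta\left(\mu N - U\!\left(\mathbf{r}^{N}\right)\right)\right],
  \qquad \beta = \frac{1}{k_{\mathrm{B}} T}
\]
```

The number of molecules N appears as a sampled variable alongside the coordinates, which is why the water count fluctuates once μ, V, and T are fixed.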
The FEP+ simulations combine water insertion and deletion GCMC moves with MD at a fixed number of water molecules. At regular intervals, MD is halted, and a fixed number of water insertion and deletion attempts are made. Attempts are accepted or rejected using a Metropolis-Hastings criterion. Because water molecules near the ligand have a greater impact on the predicted relative free energy, insertion and deletion attempts are made more frequently within the immediate vicinity of the ligand than in the rest of the system. GCMC sampling efficiency is also increased by biasing water insertion attempts into free space.
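The accept/reject step described above can be illustrated with the textbook, unbiased GCMC acceptance rules. The sketch below is not the FEP+ implementation: it assumes insertions are attempted uniformly over a volume V, and it expresses the chemical potential through the Adams parameter B = βμ + ln(V/Λ³). The function names and example numbers are hypothetical. Biased moves, such as the free-space biasing mentioned above, require additional correction factors in the acceptance ratio that are not shown here.

```python
import math


def insertion_acceptance(n_waters, delta_u, adams_b, beta):
    """Textbook acceptance probability for an insertion attempt (N -> N + 1).

    delta_u is the change in potential energy caused by the trial insertion,
    in units consistent with beta = 1 / (kB * T).
    """
    log_ratio = adams_b - math.log(n_waters + 1) - beta * delta_u
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def deletion_acceptance(n_waters, delta_u, adams_b, beta):
    """Textbook acceptance probability for a deletion attempt (N -> N - 1).

    delta_u is the change in potential energy caused by the trial deletion.
    """
    if n_waters == 0:
        return 0.0  # nothing to delete
    log_ratio = math.log(n_waters) - adams_b - beta * delta_u
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


# Illustrative numbers only (energies in kcal/mol, T = 300 K).
KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)
beta = 1.0 / (KB * 300.0)
p_insert = insertion_acceptance(n_waters=10, delta_u=-2.0, adams_b=-5.0, beta=beta)
p_delete = deletion_acceptance(n_waters=11, delta_u=2.0, adams_b=-5.0, beta=beta)
```

In this example the deletion is the reverse of the insertion, so the ratio p_insert / p_delete equals exp(B − ln(N + 1) − βΔU), which is the condition that makes the forward and reverse moves satisfy detailed balance.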
GCMC is carried out on every lambda window in an FEP+ simulation. This allows the water around the ligand to be equilibrated for both end points of the calculation. For instance, if a ligand transformation grows/shrinks to/from a hydration site (e.g., displacing a water molecule with an alkyl chain), GCMC should ensure that a water molecule is present at one end point of the calculation but not the other, and that there is a gradual transition in the average water occupancy in the intermediate lambda windows.
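One way to picture the gradual transition described above is to average a per-frame occupancy flag for a single hydration site within each lambda window. The sketch below uses made-up flags purely for illustration; the function name and the input layout (one list of 0/1 flags per window) are hypothetical and do not correspond to any FEP+ output format.

```python
from statistics import mean


def mean_occupancy_per_window(occupancy_by_window):
    """Return the average site occupancy for each lambda window.

    occupancy_by_window holds one inner list per lambda window; each inner
    list has one flag per saved frame (1 = water present, 0 = site empty).
    """
    return [mean(frames) for frames in occupancy_by_window]


# Made-up flags: the site is hydrated at the first end point and
# displaced (for example, by an alkyl chain) at the last one.
example_flags = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
profile = mean_occupancy_per_window(example_flags)
```

A well-equilibrated transformation of this kind would be expected to show a profile that moves smoothly from about 1 to about 0 across the windows rather than switching abruptly.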
Using Different Water Models
The default water model in FEP+ is the 3-site SPC model. A different water model can be specified by using the -water option. For instance, to use the TIP4P-Ew water model, you can use the following command:
$SCHRODINGER/fep_plus -water tip4pew <other options>
The water models that are supported with GCMC and FEP+ are: SPC, TIP3P, TIP4P-Ew, TIP4PD, and TIP5P. Using the -water option ensures that GCMC uses the appropriate chemical potential for the chosen water model. See fep_plus for details of this option.
Turning GCMC Off
While it is recommended to always use GCMC in an FEP+ calculation, the GCMC feature can be switched off by specifying the ensemble. From the FEP+ Panel, you can set the ensemble in the FEP+ Advanced Options Dialog Box; from the command line, you can use the -ensemble option with the fep_plus command. The grand canonical ensemble is used by default in the complex legs (i.e., the bound stages) of the calculation; this ensemble is designated as muVT in the fep_plus command syntax. Previous versions of FEP+ employed the isothermal-isobaric ensemble (NPT). You can use the NPT ensemble with the following command:
$SCHRODINGER/fep_plus -ensemble NPT <other options>
The canonical ensemble (NVT) is also supported.
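When launching jobs from a script, the -water and -ensemble options described above can be assembled programmatically. The helper below is a minimal sketch: the function name, its parameters, and the example installation path are hypothetical, and only option values that appear in this document (tip4pew, NPT) are used. It assumes the two options can be given together, builds the argument list without running anything, and leaves the remaining fep_plus arguments for the caller to supply.

```python
import os


def build_fep_plus_command(other_args, water=None, ensemble=None, schrodinger=None):
    """Assemble an argument list for fep_plus (hypothetical helper).

    other_args : remaining fep_plus arguments, passed through unchanged
    water      : value for -water, e.g. "tip4pew"; omitted if None
    ensemble   : value for -ensemble, e.g. "NPT"; omitted if None
    schrodinger: installation directory; defaults to $SCHRODINGER
    """
    if schrodinger is None:
        schrodinger = os.environ["SCHRODINGER"]
    cmd = [f"{schrodinger}/fep_plus"]
    if water is not None:
        cmd += ["-water", water]
    if ensemble is not None:
        cmd += ["-ensemble", ensemble]
    cmd += list(other_args)
    return cmd


# Example with a placeholder installation path and no further arguments.
command = build_fep_plus_command(
    [], water="tip4pew", ensemble="NPT", schrodinger="/opt/schrodinger"
)
```

The resulting list can be passed to subprocess.run once the remaining arguments for the job have been appended.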