Hierarchical Sampling Technical Details

Hierarchical sampling in the refinement of protein-ligand complexes and in MM-GBSA calculations is an implementation of the PGL sampling approach described in "Exploring hierarchical refinement techniques for induced fit docking with protein and ligand flexibility" [1], and applied with other Schrodinger tools in "Leveraging Data Fusion Strategies in Multireceptor Lead Optimization MM/GBSA End-Point Methods" [2].

The PGL algorithm is a protocol for sampling protein side chains (P), ligand functional groups (G), and ligand orientational degrees of freedom (L). This is achieved through a two-stage sample-and-score algorithm. To sample the ligand functional groups, a series of rotamer libraries is created by sampling each ligand functional group in the presence of the ligand core using MacroModel. These rotamer libraries are then integrated into the existing Prime side-chain packing algorithm to optimally orient the ligand functional groups and the protein side chains using the Prime Energy Model.

To refine the conformation of the ligand core in the active site, a series of possible orientations within a small cutoff (typically 1.0 Å and 30°) of the input orientation is created by systematic sampling. These orientations are evaluated using a flexible steric-only scoring function, and all orientations in which the ligand functional groups and protein side chains can be packed without steric clashes are retained. The resulting orientations are clustered, and a representative from each cluster has its protein side-chain and ligand functional group degrees of freedom optimized using the Prime side-chain packing algorithm. These optimized conformations are ranked by Prime Energy and the lowest-energy conformations are returned.
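The systematic sampling step can be illustrated with a minimal sketch. This is not the Prime implementation: the grid scheme, the step sizes (0.5 Å and 15°), the restriction to rotations about the x, y, and z axes, and all function names are assumptions chosen for illustration. Only the 1.0 Å and 30° cutoffs come from the text above.

```python
import itertools
import math


def translation_offsets(max_shift=1.0, step=0.5):
    """Grid of (dx, dy, dz) offsets in angstroms no longer than max_shift."""
    n = int(round(max_shift / step))
    ticks = [i * step for i in range(-n, n + 1)]
    return [
        t for t in itertools.product(ticks, repeat=3)
        if math.sqrt(sum(c * c for c in t)) <= max_shift + 1e-9
    ]


def rotation_offsets(max_angle=30.0, step=15.0):
    """(axis, angle in degrees) pairs about x, y, z, plus the identity."""
    n = int(round(max_angle / step))
    rotations = [(None, 0.0)]  # identity: keep the input orientation
    for axis in ("x", "y", "z"):
        for i in range(-n, n + 1):
            if i != 0:
                rotations.append((axis, i * step))
    return rotations


def candidate_orientations(max_shift=1.0, max_angle=30.0,
                           shift_step=0.5, angle_step=15.0):
    """Every (translation, rotation) pair within both cutoffs."""
    return list(itertools.product(
        translation_offsets(max_shift, shift_step),
        rotation_offsets(max_angle, angle_step),
    ))


if __name__ == "__main__":
    candidates = candidate_orientations()
    print(len(candidates), "candidate rigid-body perturbations")
```

In a full workflow, each candidate would then be passed to the steric-only filter, clustered, and packed as described above; those stages are omitted here because their details are not given in this document.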

For MM-GBSA calculations using hierarchical sampling, the complex structures are optimized using the active site optimization described above. The free receptor conformation is optimized using a Prime side-chain packing optimization without the ligand present. The free ligand conformation is obtained by performing a Prime side-chain packing optimization of the ligand functional groups without the protein present, using the same rotamer libraries as in the hierarchical sampling active site optimization that determines the complex conformation. The binding energy of the protein-ligand complex is then returned as the Prime Energy of the optimized complex structure minus the sum of the Prime Energies of the optimized “free receptor” and “free ligand” conformations. If no ligand core is provided, a core is automatically determined.
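Written as an equation, the binding energy described above takes the following form. The symbol names are chosen here for illustration and are not taken from the Prime documentation; each term is the Prime Energy of the corresponding separately optimized structure.

```latex
\Delta E_{\text{bind}}
  = E_{\text{complex}}
  - \left( E_{\text{free receptor}} + E_{\text{free ligand}} \right)
```

With this sign convention, a more negative value corresponds to more favorable binding.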