H-Bond Optimization Technical Notes
While X-ray structures are invaluable starting points in modeling studies, they have a critical weakness: due to the relative lack of electron density around hydrogens, it is nearly impossible to accurately resolve their locations in the structure. While many hydrogen positions are easily estimated from simple geometric considerations, the problem is more difficult in the case of hydroxyl and thiol hydrogens, where the electrostatic environment must be taken into account. In addition, the lack of hydrogen coordinates leads to other ambiguities in the structure. For example, without knowledge of hydrogen locations, the protonation state of residues such as His cannot be directly determined from experiment. More subtly, without knowledge of the location of the hydrogens, it is generally not possible to distinguish the oxygen and nitrogen in the amides of Asn and Gln residues. A 180° flip about the relevant chi dihedral angle, transposing the oxygen and nitrogen, will often produce an alternate structure that is equally consistent with the electron density. A similar ambiguity exists with His, with respect to the carbon and nitrogen atoms of the imidazole ring.
In order to make the best use of X-ray structures in modeling studies, it is clearly important to resolve these three types of structural ambiguities. The purpose of the Protein Assignment utility (protassign) is to select the most likely:
- position of hydroxyl and thiol hydrogens (including any present on ligands, waters, and cofactors)
- protonation states and tautomers of His residues
- chi “flip” assignments for Asn, Gln and His residues
- orientations of water molecules
- protonation states of Asp, Glu, and Tyr residues
- protonation states and orientation of Lys residues
To keep the search tractable, localized clusters of hydrogen-bonding species are identified. Two such species are considered to be in the same cluster if their heavy atoms are within 3.5 Å of one another. Within a given cluster, all possible combinations of assignments for the species are then enumerated. In the case of hydroxyl and thiol torsions and water orientations, where a continuum of possible assignments exists, a smaller set of possibilities is selected based on the local environment. Potential hydrogen-bond donors and acceptors are identified, and the rotatable hydrogen is directed towards each potential acceptor, as well as 120° away from each potential donor (to simulate hydrogen-bonding to the lone pair of its corresponding heavy atom). Positively charged metal species are treated as virtual hydrogen-bond donors.
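The clustering and enumeration steps can be illustrated with a minimal sketch. It is not the protassign implementation: the species names, coordinates, and state labels are hypothetical, and it assumes that cluster membership is transitive (A near B and B near C places all three in one cluster), which the text does not state explicitly.

```python
import itertools
import math

CUTOFF = 3.5  # Å, heavy-atom distance criterion quoted in the text


def min_heavy_atom_distance(atoms_a, atoms_b):
    """Smallest distance between any heavy atom of one species and any of the other."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def cluster_species(species, cutoff=CUTOFF):
    """Group species into clusters with union-find (assumed transitive linkage)."""
    names = list(species)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in itertools.combinations(names, 2):
        if min_heavy_atom_distance(species[a], species[b]) <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


def enumerate_assignments(cluster, states):
    """Yield every combination of candidate states for the species in one cluster."""
    names = sorted(cluster)
    for combo in itertools.product(*(states[n] for n in names)):
        yield dict(zip(names, combo))


# Hypothetical heavy-atom coordinates (Å) for four hydrogen-bonding species.
species = {
    "HIS 57": [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)],
    "SER 195": [(4.0, 0.0, 0.0)],
    "HOH 301": [(6.5, 0.0, 0.0)],
    "ASN 12": [(20.0, 0.0, 0.0)],
}
# Hypothetical candidate states per species.
states = {
    "HIS 57": ["HID", "HIE", "HIP"],
    "SER 195": ["toward_HIS", "toward_HOH"],
    "HOH 301": ["orientation_1", "orientation_2"],
}

clusters = cluster_species(species)
combos = list(enumerate_assignments(clusters[1], states))  # 3 * 2 * 2 combinations
```

Because the number of combinations is the product of the per-species state counts, exhaustive enumeration is only practical when clusters are small, which is the motivation for clustering in the first place.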
The optimization uses iterative sampling, a form of genetic algorithm that recombines ensembles of orientations, and a form of simulated annealing that refines the orientations.
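The text does not give the details of the sampling scheme, so the following is only a generic illustration of simulated annealing over discrete assignments, not the algorithm used by protassign. The function name `anneal`, the cooling schedule, the parameter values, and the toy scoring function are all assumptions made for the example.

```python
import math
import random


def anneal(states, score, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Maximize score(assignment) with a Metropolis walk and geometric cooling."""
    rng = random.Random(seed)
    names = sorted(states)
    current = {n: rng.choice(states[n]) for n in names}
    current_score = score(current)
    best, best_score = dict(current), current_score

    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))

        # Propose a new state for one randomly chosen species.
        name = rng.choice(names)
        trial = dict(current)
        trial[name] = rng.choice(states[name])
        delta = score(trial) - current_score

        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = trial, current_score + delta
            if current_score > best_score:
                best, best_score = dict(current), current_score

    return best, best_score


# Toy problem: three species with three states each; state 1 is ideal for all.
toy_states = {"A": [0, 1, 2], "B": [0, 1, 2], "C": [0, 1, 2]}


def toy_score(assignment):
    """Stand-in scoring function (higher is better); not a hydrogen-bond score."""
    return -sum((value - 1) ** 2 for value in assignment.values())


best, best_score = anneal(toy_states, toy_score)
```

Accepting occasional worse moves at high temperature lets the search leave local optima, which matters when changing one species only pays off after a neighbor is changed as well.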
Once all possible assignments for a given cluster are identified, each possibility is scored to determine the quality of the hydrogen-bond network (among the species in the cluster itself as well as with the surrounding environment). The scoring function is loosely based on simple electrostatic considerations. The intent is to score the existence of hydrogen-bonding networks, rather than to estimate the actual electrostatic energy of the system.
- Hydrogen-Bond Scoring: The core of the scoring function is an evaluation of the quality of the hydrogen-bonding network. In addition to the number of hydrogen bonds created, the quality of those bonds is taken into account, based on their geometries relative to an idealized hydrogen bond.
- Hydrogen Clash Penalty: Assignments that place two polar hydrogen atoms too close to one another are given a penalty high enough to effectively eliminate them from consideration. The cutoff distance varies with the type of hydrogen.
- Protonation Reward/Penalty: The penalty for unlikely protonation states is determined from the pKa as predicted by PROPKA. If this program is not used, the following set of rules determines the protonation. A histidine residue is ruled to be protonated if it is in close proximity to a negatively charged species, or if both its delta and epsilon nitrogens are in close proximity to hydrogen-bond acceptors. Similarly, aspartic and glutamic acid residues are ruled to be neutralized if one or both of their carboxylate oxygens are in close proximity to a hydrogen-bond acceptor.
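The rule-based fallback for protonation can be written as a short sketch. The text does not define "close proximity", so the 3.5 Å cutoff below is an assumed placeholder, as are the function names and the choice to measure histidine proximity from its two ring nitrogens.

```python
import math

PROXIMITY = 3.5  # Å; assumed cutoff, the text only says "close proximity"


def near(atom, others, cutoff=PROXIMITY):
    """True if any point in `others` lies within `cutoff` of `atom`."""
    return any(math.dist(atom, other) <= cutoff for other in others)


def histidine_is_protonated(nd1, ne2, anions, acceptors):
    """Apply the two histidine rules: near an anion, or both nitrogens near acceptors."""
    near_anion = near(nd1, anions) or near(ne2, anions)
    both_near_acceptors = near(nd1, acceptors) and near(ne2, acceptors)
    return near_anion or both_near_acceptors


def carboxylate_is_neutral(oxygens, acceptors):
    """Asp/Glu rule: neutral if at least one carboxylate oxygen is near an acceptor."""
    return any(near(oxygen, acceptors) for oxygen in oxygens)


# Hypothetical coordinates (Å) for the two histidine ring nitrogens.
nd1, ne2 = (0.0, 0.0, 0.0), (2.2, 0.0, 0.0)

his_near_anion = histidine_is_protonated(nd1, ne2, anions=[(0.0, 3.0, 0.0)], acceptors=[])
his_one_acceptor = histidine_is_protonated(nd1, ne2, anions=[], acceptors=[(0.0, 3.0, 0.0)])
his_two_acceptors = histidine_is_protonated(
    nd1, ne2, anions=[], acceptors=[(0.0, 3.0, 0.0), (2.2, 3.0, 0.0)]
)
asp_neutral = carboxylate_is_neutral(
    oxygens=[(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)], acceptors=[(0.0, -2.8, 0.0)]
)
```

The logic behind both rules is the same: a site surrounded by acceptors can only form hydrogen bonds with them if it carries a proton to donate.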