Refining crystallographic protein-ligand structures using GlideXtal and Phenix/OPLS
Tutorial Created with Software Release: 2025-1
Topics: Biologics Drug Discovery , Ligand Preparations and Library Design , Small Molecule Drug Discovery , Structure Prediction & Target Enablement
Products Used: Glide , Prime
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms are shown like this: Workspace (the 3D display area in the center of the main window, where molecular structures are displayed)
Abstract:
The correct placement of ligands into crystallographic densities of protein-ligand complexes is essential for enabling accurate and reliable computational modeling. In this tutorial, you will learn how to use GlideXtal and the OPLS integration to the Phenix program to place ligands and refine the bound protein-ligand structure.
Tutorial Content
1. Introduction to crystallographic structure refinement
Drug discovery projects can greatly profit from high-quality protein-ligand structures. With recent advances in crystallographic data collection, obtaining experimental structures in high-throughput studies is becoming more common. This requires automated methods for processing data and obtaining high quality structural models from experimental densities.
GlideXtal uses Glide docking guided by the crystallographic density to generate a ligand pose with good fit and energies without user intervention. It shares many features with its counterpart GlideEM for cryo-electron microscopy. GlideXtal can be used in combination with the OPLS integration for Phenix to optimize refinement parameters and run final refinement for each of the obtained poses.
In this tutorial, you will use GlideXtal to place a ligand in an experimental structure of TACE, TNF-α converting enzyme, bound to a small-molecule inhibitor. The experimental structure, published by Levin et al., has an overall resolution of 2.3 Å but the density for parts of the ligand is weak. Using GlideXtal, you will further refine the ligand pose and examine alternate conformations.
Note that you can use the GlideMap panel in Maestro for performing refinement with GlideXtal and GlideEM. If you prefer using the graphical UI, see the quick reference sheet here.
This tutorial uses command line tools from the Schrödinger suite. For operating system-specific instructions on getting started using the command line interface, see Running Schrödinger Applications from the Command Line. Refinement of GlideXtal poses using Phenix/OPLS also requires a properly configured Phenix installation, including setup of the $SCHRODINGER_PHENIX environment variable. See the Phenix documentation for detailed instructions.
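Before starting, it can help to confirm that both environment variables mentioned above are defined. The following is a minimal sketch, not part of the Schrödinger or Phenix tooling: the helper name missing_env_vars and the example path /opt/schrodinger are illustrative assumptions, and the check only tests that the variables are set, not that the installations behind them work.

```python
import os
from typing import List, Mapping

# Variables this tutorial relies on, as named in the text above.
REQUIRED_VARS = ("SCHRODINGER", "SCHRODINGER_PHENIX")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with a made-up environment in which only SCHRODINGER is defined.
# The path is a placeholder, not a recommended install location.
print(missing_env_vars({"SCHRODINGER": "/opt/schrodinger"}))
# prints ['SCHRODINGER_PHENIX']
```

Calling missing_env_vars() with no argument checks the environment of the current session; an empty list means both variables are present.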
2. Creating Projects and Importing Structures
At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is saved, the project is automatically saved each time a change is made.
Structures can be built in Maestro or imported using File > Import Structures (or by drag and drop), and are added to the Entry List and Project Table. The Entry List is located to the left of the Workspace. The Project Table can be accessed with Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.
- Double-click the Maestro icon
  - (No icon? See Starting Maestro)
- Go to File > Change Working Directory
- Find your directory, and click Choose
- Pre-generated files are included for running jobs or examining output. Download the zip file here: schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/glidextal.zip
- After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
Figure 2-2. Save Project panel.
- Go to File > Save Project As
- Change the File name to glideXtal_tutorial, click Save
  - The project is now named glideXtal_tutorial.prj
3. Examining the experimental density and preparing structures
In this section, we will examine the 2A8H structure of TACE and overlay the structural model with the experimental density. To this end, we need to convert the crystallographic data from the cif format deposited in the PDB into a surface we can import into Maestro. Then, we will use the Protein Preparation Workflow to prepare the structure, setting up proper bond orders, formal charges and protonation states, as well as filling in missing residues. Force-field based methods such as GlideXtal work better on complete structures, even if the experimental density is not good enough to unambiguously place all the atoms. For detailed instructions on this workflow, consult the Introduction to Structure Preparation and Visualization tutorial. Note that similar considerations apply when preparing structures for refinement using GlideXtal as for docking with Glide in a virtual screening context. See the Structure-Based Virtual Screening with Glide tutorial for more details.
3.1 Inspecting the density
First, you can use the sf2map.py script to convert the structure factors to 2Fo-Fc and Fo-Fc density maps for visualization.
- Open a terminal window with a properly configured $SCHRODINGER environment in your working directory.
- Type $SCHRODINGER/run -FROM psp sf2map.py 2a8h.pdb 2a8h-sf.cif and press Enter.
  - This job takes about a minute.
  - The files 2a8h_createmap-out-2fofc.ccp4 and 2a8h_createmap-out-fofc.ccp4 are written to the working directory.
  - You can also find the output files in the tutorial zip archive.
Note: Optionally, you can also generate the surfaces with the PrimeX user interface in Maestro instead of using the sf2map.py script. See the PrimeX Quick Reference Sheet for instructions.
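To confirm that the conversion produced both maps before switching to Maestro, a short check like the one below may be useful. It is a sketch under one stated assumption: that the output names follow the pattern of the 2a8h_createmap-out-2fofc.ccp4 file imported in the next step, with the stem of the PDB file name as prefix. The helper names expected_maps and missing_maps are illustrative and not part of sf2map.py.

```python
from pathlib import Path
from typing import List


def expected_maps(pdb_file: str) -> List[str]:
    """Map file names assumed to be written for a given input PDB file."""
    stem = Path(pdb_file).stem  # "2a8h.pdb" -> "2a8h"
    # Assumed naming pattern: <stem>_createmap-out-<map type>.ccp4
    return [f"{stem}_createmap-out-{kind}.ccp4" for kind in ("2fofc", "fofc")]


def missing_maps(pdb_file: str, directory: str = ".") -> List[str]:
    """Return the expected map files that are not present in directory."""
    return [name for name in expected_maps(pdb_file)
            if not (Path(directory) / name).is_file()]


print(expected_maps("2a8h.pdb"))
# prints ['2a8h_createmap-out-2fofc.ccp4', '2a8h_createmap-out-fofc.ccp4']
```

Running missing_maps("2a8h.pdb") from the working directory returns an empty list once both files exist.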
Now, switch to the Maestro GUI and import both the structure and the surface you just created.
- In the main Maestro window, go to File > Import Structures and find the 2a8h.pdb structure from the tutorial zip archive.
To import the surface, you need to specify which entry it should be attached to and what type of surface it is.
- Go to Workspace > Surface > Import… and choose the 2a8h_createmap-out-2fofc.ccp4 file
- In the Choose Entry window, ensure the row for the 2A8H structure is selected and click Choose.
- In the Map Type dialog, choose the 2Fo-Fc option.
  - The Manage Surfaces panel opens and the surface is imported in the workspace.
By default, only 16 ų of the density is visualized to keep the graphics load minimal. You may want to increase this somewhat to be able to see the density for the ligand and its environment.
- Increase the value for Display at most to 20 ų.
- Type L to zoom to the ligand.
- Middle-click a ligand atom to center the view (and the displayed density) there.
Note that the asymmetric unit of this structure contains two copies of the ligand-receptor complex. In this tutorial, we’ll only use the pocket of chain A for the refinement, but when applying this to your project, it could be valuable to perform the re-docking and refinement with both pockets.
Optionally, you can further adjust the way the surface is visualized to your preference.
- Click Display Options
- For Style, choose Solid.
- Set transparency to 40% for both front and back surfaces.
- Set the surface Color to a light gray.
- Click OK.
Note: You can adjust the depth of field by holding the Ctrl/Command key while scrolling with the mouse wheel or using the Clipping Planes workspace gadget.
You can now investigate the ligand fit more closely. You can see that the ligand is not very highly resolved and partly “sticks out” of the electron density, for example at the methyl substituents to the heterocycle.
Additionally, the ligand is part of the coordination shell of a Zn2+ ion, which shows an unusual five-coordinate geometry (tetrahedral coordination would usually be expected).
Finally, there’s an empty blob of density just below the water molecule next to the Zinc ion. Given its size and shape, it’s possible that this water molecule is incorrectly placed, or that there should be a second water molecule nearby.
Also, take a look at the formamide group near the Zinc ion – it is very close to GLU 406, but is awkwardly angled so that it cannot make any interactions with it even though the density between the two seems to imply some delocalization. Whether this interaction is possible depends on the protonation states of both receptor and ligand, so if this were a real project you should pay close attention to that residue during protein preparation.
Figure 3-7. A view of the density in the pocket of chain A. Orange boxes highlight areas where the model does not fit the density well or shows awkward chemistry.
3.2 Preparing the structure
An in-depth discussion of the preparation of this system, including detailed analysis of the protonation states in the binding pocket, is beyond the scope of this tutorial. The protonation state assignments made by the automated H-Bond optimization as part of the Protein Preparation Workflow are adequate for our purposes.
Figure 3-8. Adjusting options for the Optimize H-bond assignments step of the Protein Preparation Workflow.
You can now prepare the structure, making sure that missing residues are filled in.
- Open the Protein Preparation Workflow from the Favorites toolbar or the Tasks menu.
- In the Preprocess section, ensure the Fill in missing side chains option is enabled.
- In the Optimize H-Bond Assignments section, click Settings.
- Enable the Use crystal symmetry option.
When working with crystal structures, the Protein Preparation Workflow can also make use of the other copies of the protein in the crystal context to assign the protonation states. This can be helpful for refining the structure, especially if the binding pocket is at a crystal mate contact such as in this case. Keep in mind that these contacts are not present in the protein’s native state and can introduce artifacts. To generate the crystal mates, you can go to Edit > Crystal Mates and choose a cutoff distance within which to generate the copies.
Figure 3-9. Adjusting options for the Minimize and Delete Waters step of the Protein Preparation Workflow.
The restrained force field minimization that is part of the workflow should only be applied to the hydrogen atoms, because the heavy-atom positions must stay identical to the original model for the refinement. All waters should also be kept in place.
- In the Minimize and Delete Waters section, click Settings.
- Under Minimize, choose Hydrogens only.
- Under Delete Waters, disable the Distant from ligands option.
- For job name, type proteinprep_2a8h
- Click Run.
  - This job runs for ~3 minutes.
  - You can find the results in the tutorial zip archive as proteinprep_2a8h-out.maegz.
4. Refining the ligand poses with GlideXtal and Phenix/OPLS
In this section, you will use GlideXtal to dock the ligand into the prepared structure guided by the experimental density, generating reasonable starting poses. In a second step, you can further refine these poses against the density by using the OPLS integration for Phenix. This section uses the command line; as noted in the introduction, the GlideMap panel in Maestro offers a graphical alternative. You can find the outputs of this calculation in the tutorial zip archive.
4.1 Redocking the ligand with GlideXtal only
First, make a new directory for the GlideXtal job.
- Return to your previous terminal window or start a new terminal session with a properly configured $SCHRODINGER environment in your working directory.
- Type mkdir 2a8h_redock && cd 2a8h_redock
In addition to the prepared structure and the crystallographic data, the glideXtal.py script requires you to specify the ligand or binding pocket. Here, we use an Atom Specification Language (ASL) expression to point out where the ligand is in the structure.
- Type $SCHRODINGER/run -FROM psp glideXtal.py ../proteinprep_2a8h-out.maegz -map ../2a8h-sf.cif -ligand_asl "chain A and res.n 158" -JOBNAME 2a8h_redock and press Enter.
  - The job should finish in ~10 minutes.
  - You can find the results in the tutorial zip archive as 2a8h_redock-out_final.maegz.
Note: Many scripts in the Schrödinger suite, including those in this tutorial, use a local Job Server in the background. You can monitor the status of these jobs either using the $SCHRODINGER/jsc list command, or using the Job Monitor Panel in Maestro (you may need to enable the “Show All Projects” option).
In addition to providing a structure of the ligand-receptor complex and pointing out the ligand via ASL, there are several other ways to specify the ligand and/or binding site. For example, you can also provide the ligand as a SMILES string and specify the binding site of a structure without a ligand using ASL. See the glidextal.py documentation page or run $SCHRODINGER/run -FROM psp glideXtal.py -h for more examples.
You can use the Atom Selection dialog box to construct the ASL expression or have it generated from the workspace selection.
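Because the ASL expression contains spaces, it has to reach the script as a single argument, which is why it is quoted on the command line. When launching the job from a script instead, one way to avoid quoting mistakes is to build the command as an argument list. The sketch below only assembles the redocking command from this section and does not run it; the function name glidextal_command and the install path /opt/schrodinger are illustrative assumptions.

```python
import os
import shlex
from typing import List


def glidextal_command(schrodinger_root: str, structure: str, sf_file: str,
                      ligand_asl: str, jobname: str) -> List[str]:
    """Build the argument list for the redocking run shown in this section."""
    return [
        os.path.join(schrodinger_root, "run"), "-FROM", "psp", "glideXtal.py",
        structure,
        "-map", sf_file,
        "-ligand_asl", ligand_asl,  # kept as one element, spaces included
        "-JOBNAME", jobname,
    ]


# "/opt/schrodinger" is a placeholder for your own installation directory.
cmd = glidextal_command("/opt/schrodinger", "../proteinprep_2a8h-out.maegz",
                        "../2a8h-sf.cif", "chain A and res.n 158", "2a8h_redock")
print(shlex.join(cmd))  # shell-safe rendering, with the ASL expression quoted
```

Passing cmd to subprocess.run(cmd, check=True) would start the job without involving a shell, so no additional quoting of the ASL expression is needed.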
Figure 4-3: Comparing the original pose provided in the PDB structure (white, top left) with other possible ligand poses identified by GlideXtal.
Once the job is complete, you can import the results into Maestro to examine the ligand poses GlideXtal has found.
- Go to File > Import Structures and choose the 2a8h_redock-out_final.maegz file.
- Right-click the 2a8h_redock-out_final group and click Include.
- Optional: From the Presets menu, choose Binding Mode comparison.
- Optional: Press Ctrl+L/Cmd+L to show the different Entries side by side.
Feel free to adjust the visualization settings to your preference.
We’ll also add the relevant properties to the table in the Entry List.
- Click the three dots in the Entry List header and choose Show Property.
- In the Show Properties dialog box, click Choose
- Search for and click docking method, docking score, ligand rmsd, and glide denscore to add them to the list.
- Click OK.
- Drag the border of the Entry List to the right to see the added columns.
You can see that GlideXtal has found a few additional poses for the ligand, which overall place the ligand in a very similar pose. This results in very low RMSDs compared to the original pose. The biggest difference from the original pose is in the position of the carbonyl groups coordinating the Zinc ion.
Looking at the docking method column in the table in the Entry List, you can see that the top-ranked pose was obtained by using a force-field minimization on the initial pose from the crystal structure, while the other poses resulted from Glide’s conformer generation. The poses are ranked based on the Glide density score (glide denscore property).
Should you be working with macrocycle or peptide ligands, or if GlideXtal fails to find poses that fit the density well, you should increase the sampling by using the -mode option to the script.
4.2 Redocking with refinement using Phenix/OPLS
To obtain a more accurate fit of the model against the map and discriminate between the putative ligand poses, each pose obtained from GlideXtal can be refined with Phenix/OPLS, an interface to Phenix versions 1.16 through 1.21. Here, the OPLS force field and VSGB2.1 solvation model are used as additional restraints in the target function of Phenix. The refinement then adjusts the positions of the ligand and nearby receptor atoms in order to optimize the fit to the experimental data.
While the GlideXtal poses can be obtained in a matter of minutes, a full optimization with Phenix/OPLS can take several hours depending on the size of the system and the number of ligand poses.
Two additional parameters needed for this refinement are the optimal weights for the stereochemistry (wxc) and B-factor (wxu) terms in the target function. These can be automatically obtained by performing a weight scan. See the phenix_weight_scan.py Command Help for more information.
- Return to your previous terminal window.
- Type cd .. && mkdir 2a8h_phenix_weight_scan && cd 2a8h_phenix_weight_scan and press Enter
- Type $SCHRODINGER/run -FROM psp phenix_weight_scan.py ../proteinprep_2a8h-out.maegz ../2a8h-sf.cif -scan_option fix_wxc=0.01,0.1,0.5,1,2,3,4,5,10,50 -scan_option fix_wxu=0.01,0.1,0.5,1,2,3,4,5,10,50 -JOBNAME 2a8h_phenix_weight_scan
  - Warning! This job is computationally intensive and can take several hours.
  - You can find reference results in the tutorial zip archive in the 2a8h_weight_scan directory.
  - To keep the size of the tutorial zip archive small, we have only included the full outputs for the run with the optimal combination of parameters.
Note: You can add the -HOST option to this command to run this on multiple processors locally or send the job to a cluster. If sending to a cluster, you can use the -phenix_dir <path> option to specify the path of the Phenix root directory on the cluster.
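To get a feel for why the scan is expensive, the short calculation below counts the weight combinations implied by the two value lists in the command. It rests on an assumption that the tutorial does not state explicitly: that the scan refines every pairing of a wxc value with a wxu value. If the script samples the grid differently, the count will differ.

```python
from itertools import product

# The ten values passed for each weight in the command above.
wxc_values = [0.01, 0.1, 0.5, 1, 2, 3, 4, 5, 10, 50]
wxu_values = [0.01, 0.1, 0.5, 1, 2, 3, 4, 5, 10, 50]

# Assumption: the scan covers the full grid of (wxc, wxu) pairs.
grid = list(product(wxc_values, wxu_values))
print(len(grid))  # prints 100
```

Under that assumption, trimming either list shortens the scan proportionally, at the cost of a coarser search for the best weights.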
The results from the weight scan are summarized in the 2a8h_results.csv file. You can open it in your text editor or spreadsheet software of choice. Look for the single row with a “1” in the best_case column – this is the best-scoring combination of weights for the wxc and wxu parameters.
In our case, the optimal weights are wxc = 10.0 and wxu = 1.0.
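If you prefer to pick out the best row programmatically, a sketch along the following lines may help. Only the best_case column is taken from the description above; the column names wxc and wxu are assumptions, so check the header of your own 2a8h_results.csv and pass the real names if they differ. The in-memory example mimics the layout with three rows and does not reproduce actual scan output.

```python
import csv
import io
from typing import Dict, TextIO


def _is_flagged(value) -> bool:
    """True if a best_case cell holds 1 (written as "1" or "1.0")."""
    try:
        return float(value) == 1.0
    except (TypeError, ValueError):
        return False


def best_weights(handle: TextIO, wxc_col: str = "wxc",
                 wxu_col: str = "wxu") -> Dict[str, float]:
    """Return the weights from the single row flagged in best_case."""
    rows = [row for row in csv.DictReader(handle)
            if _is_flagged(row.get("best_case"))]
    if len(rows) != 1:
        raise ValueError(f"expected exactly one best_case row, found {len(rows)}")
    return {"fix_wxc": float(rows[0][wxc_col]), "fix_wxu": float(rows[0][wxu_col])}


# Illustrative stand-in for the CSV file; the column names are assumed.
example = io.StringIO("wxc,wxu,best_case\n1.0,1.0,0\n10.0,1.0,1\n10.0,2.0,0\n")
print(best_weights(example))  # prints {'fix_wxc': 10.0, 'fix_wxu': 1.0}
```

For the real file, open it with open("2a8h_results.csv", newline="") and pass the handle to best_weights.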
With the optimal weights in hand, we can now run GlideXtal with Phenix/OPLS to refine the poses.
- Return to your previous terminal window.
- Type cd .. && mkdir 2a8h_redock_refine && cd 2a8h_redock_refine and press Enter
- Type $SCHRODINGER/run -FROM psp glideXtal.py ../proteinprep_2a8h-out.maegz -map ../2a8h-sf.cif -ligand_asl "chain A and res.n 158" -phenix_opls_refine full -phenix_refine_option fix_wxc=10.0 -phenix_refine_option fix_wxu=1.0 -JOBNAME 2a8h_redock_refine and press Enter.
  - This job takes approximately 30 minutes.
  - You can find reference results in the 2a8h_redock_refine folder in the tutorial zip archive.
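The refinement command passes each weight through its own -phenix_refine_option flag. When the weights come from a script rather than being typed by hand, they can be expanded into that form as sketched below. The helper name refine_options is illustrative; the flags themselves are the ones used in the command above.

```python
from typing import Dict, List


def refine_options(weights: Dict[str, float]) -> List[str]:
    """Expand a weights mapping into the refinement flags used above."""
    args = ["-phenix_opls_refine", "full"]
    for name, value in weights.items():
        # One -phenix_refine_option flag per weight, as in the tutorial command.
        args += ["-phenix_refine_option", f"{name}={value}"]
    return args


print(refine_options({"fix_wxc": 10.0, "fix_wxu": 1.0}))
# prints ['-phenix_opls_refine', 'full', '-phenix_refine_option',
#         'fix_wxc=10.0', '-phenix_refine_option', 'fix_wxu=1.0']
```

The returned list can be appended to the argument list of the glideXtal.py call before the -JOBNAME option.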
Once the calculation is done, we can once again examine the results in the GUI.
- Go to File > Import Structures and choose 2a8h_redock_refine-out_final.maegz.
- Adjust the display options according to your preference to compare the structures.
If Phenix refinement was used, there is an additional output property you should add to the Entry List.
- Click the Cog in the Entry List header and choose Show Property
- In the Show Properties dialog box, click Choose
- Search for and click RealSpaceCorrelation Ligand.
- Click OK.
Figure 4-10: Comparing the pose obtained by GlideXtal without refinement of the receptor atoms (green) and the refined protein-ligand complex obtained with Phenix/OPLS (pink).
You can also compare the top pose of the first GlideXtal run to the refined top pose to see the subtle changes in the atomic positions of the receptor and solvent.
- Include the top entries from the 2a8h_redock-out_final and 2a8h_redock_refine-out_final groups.
Looking through the refined poses, you can see that they are even more similar to each other compared to the results with just redocking. Again, the RMSDs to the original pose are smaller than 1 Å in all cases, and the biggest difference lies in the positioning of the hydroxamate group near the Zinc. Seeing such low ligand RMSDs across the board indicates that the original placement of the ligand was quite good.
The top two poses from the refinement are almost identical to each other. The top-ranked pose this time is obtained from conformer generation rather than the minimization of the initial pose.
Because these poses were refined with Phenix, the ranking is based on the ligand cross correlation metric (the RealSpaceCorrelation Ligand column in the Entry List) rather than the glide denscore.
To compare the quality of the pose’s interaction with the receptor, use the docking score. Use the glide denscore to compare the pose’s fit with the ligand density. The lower these two scores, the better.
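The two ranking schemes described above can be summarized in a few lines. This is a sketch with made-up numbers, not output from the tutorial jobs: the dictionary keys glide_denscore and rscc_ligand are hypothetical stand-ins for the Entry List columns, and treating a higher real-space correlation as a better fit is an assumption based on the usual meaning of a correlation coefficient.

```python
from typing import Dict, List

Pose = Dict[str, float]


def rank_poses(poses: List[Pose], refined: bool) -> List[Pose]:
    """Order poses the way the text describes for each kind of run."""
    if refined:
        # Phenix-refined runs: rank by ligand correlation (assumed higher = better).
        return sorted(poses, key=lambda p: p["rscc_ligand"], reverse=True)
    # GlideXtal-only runs: rank by density score (lower = better).
    return sorted(poses, key=lambda p: p["glide_denscore"])


# Made-up values for illustration only.
poses = [
    {"pose": 1, "glide_denscore": -3.0, "rscc_ligand": 0.80},
    {"pose": 2, "glide_denscore": -5.0, "rscc_ligand": 0.75},
]
print([p["pose"] for p in rank_poses(poses, refined=False)])  # prints [2, 1]
print([p["pose"] for p in rank_poses(poses, refined=True)])   # prints [1, 2]
```

The example shows that the two criteria need not agree, which is one reason to inspect the docking score and the density fit side by side.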
5. Conclusion and References
In this tutorial, you used GlideXtal to generate potential ligand poses for a published structure by using the crystallographic data to guide the docking. Then, you used Phenix/OPLS to further refine the structural model in order to identify which ligand poses result in the optimal fit to the experimental data.
Once you have refined the ligand pose, you can use the Protein Reliability Report to check some common structure quality metrics. You should also re-run the Protein Preparation Workflow on the structure to create new zero-order bonds to any metal ions coordinated by the ligand. You can also run an unrestrained molecular dynamics simulation on the protein-ligand complex to see whether the pose is stable.
For further reading:
- Running Schrödinger Applications from the Command Line
- Acetylenic TACE inhibitors. Part 3: Thiomorpholine sulfonamide hydroxamates.
The original publication for the TACE structure with this ligand.
6. Glossary of Terms
Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion
Included - the entry is represented in the Workspace, the circle in the In column is blue
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data
Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)
Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project
Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries
Working Directory - the location where files are saved
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed