Introduction to Structure Preparation and Visualization
Tutorial Created with Software Release: 2026-1
Topics: Hit Discovery, Hit-to-Lead & Lead Optimization, Ligand Preparations and Library Design, Small Molecule Drug Discovery, Structure Prediction & Target Enablement
Products Used: Maestro
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms, such as Workspace (the 3D display area in the center of the main window, where molecular structures are displayed), are defined at the end of this tutorial.
Abstract:
This tutorial provides an introduction to the fundamental principles and techniques of structure visualization and preparation in Maestro. You will learn the essential steps to visualize the target structure and to address structural issues for modeling applications.
This tutorial does not, however, cover the details of protein structure refinement. Preparation of other systems, such as lipids, membrane proteins, DNA, and RNA, is also not covered.
Tutorial Content
1. Introduction to Structure Visualization and Preparation
1.1 Structure Visualization
Biomolecules are highly complex and diverse, and are composed of thousands of atoms held together via covalent and non-covalent interactions. Visualization plays a very important role in the study of biomolecules, helping to identify and understand how the properties of biomolecules are related to their structure. Visualization techniques can address some simple yet important questions:
- How flexible is the target?
- Does the target have a ligand bound to it?
- Which residues are playing a crucial role in the binding site if the target has a bound ligand?
- What are the various interactions present between the ligand and the receptor?
1.2 Structure Preparation
Structure files obtained from the PDB, vendors, and other sources often lack information needed for modeling tasks, mostly because of experimental limitations of structural biology techniques and insufficient or ambiguous data. Typically, these files are missing hydrogens, partial charges, side chains, and/or whole loop regions. Proteins in their raw state may also have incorrect bond order assignments and side chain orientations. To make these structures suitable for modeling tasks, you will use the Protein Preparation Workflow to find and resolve common structural issues.
The Protein Preparation Workflow in Maestro involves a series of structural and functional checks designed to prepare protein and nucleic acid (DNA/RNA) structures for accurate molecular modeling and simulations. First, the structure is assessed for missing atoms, residues, or side chains, which are then added or corrected. Hydrogen atoms are added to ensure proper protonation states, and the protein is optimized for correct bond lengths, angles, and torsions. The overall geometry is checked for any steric clashes or unusual bond geometries. Functional checks include evaluating the protonation states of ionizable residues at the relevant pH, identifying potential disulfide bonds, and ensuring proper orientation of active site residues. The final protein structure is minimized to relieve any unfavorable interactions and prepare it for further computational analysis, such as docking or molecular dynamics simulations.
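To make the idea of missing information concrete, the following minimal Python sketch scans PDB-format text and reports whether any hydrogens are present, along with counts of water atoms and other hetero residues. It is not part of Maestro or the Protein Preparation Workflow: the function names `format_atom` and `summarize_pdb` are hypothetical, the sample records are invented for illustration, and the code assumes the standard fixed-width PDB column layout with the element symbol in columns 77-78.

```python
from collections import Counter


def format_atom(record, serial, name, res_name, chain, res_seq, xyz, element):
    """Build one fixed-width PDB coordinate line (demo helper, hypothetical)."""
    x, y, z = xyz
    return (
        f"{record:<6}{serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}          {element:>2}"
    )


def summarize_pdb(pdb_text):
    """Count atoms, hydrogens, water atoms, and hetero residues in PDB text."""
    counts = Counter()
    hetero_residues = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        counts["atoms"] += 1
        # Element symbol is in columns 77-78 when the file provides it.
        element = line[76:78].strip().upper()
        if not element:
            # Heuristic fallback: first letter of the atom name (columns 13-16).
            element = line[12:16].strip().lstrip("0123456789")[:1].upper()
        if element in ("H", "D"):
            counts["hydrogens"] += 1
        res_name = line[17:20].strip()
        if res_name == "HOH":
            counts["water_atoms"] += 1
        elif record == "HETATM":
            hetero_residues.add((res_name, line[21], line[22:26].strip()))
    counts["hetero_residues"] = len(hetero_residues)
    return counts


# Invented records: two protein atoms, one ligand atom, one water oxygen.
demo = "\n".join([
    format_atom("ATOM", 1, " N", "ILE", "A", 16, (1.0, 2.0, 3.0), "N"),
    format_atom("ATOM", 2, " CA", "ILE", "A", 16, (2.0, 2.0, 3.0), "C"),
    format_atom("HETATM", 3, " C1", "LIG", "A", 301, (5.0, 5.0, 5.0), "C"),
    format_atom("HETATM", 4, " O", "HOH", "A", 401, (8.0, 8.0, 8.0), "O"),
])
report = summarize_pdb(demo)
print(dict(report))
if report["hydrogens"] == 0:
    print("No hydrogens found; they would need to be added before modeling.")
```

A quick check like this only flags what is absent from the file. Deciding protonation states, bond orders, and side chain orientations is the job of the preparation workflow itself.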
In this tutorial, you will learn how to import structures, visualize protein structures, ligand binding sites, ligand-receptor interactions and molecular surfaces. You will also learn how to prepare protein and ligand structures, an essential first step for modeling projects.
Although this tutorial guides you to prepare protein and ligand structures for Glide docking, these steps would be the starting point for many computational experiments, including molecular dynamics simulations (Desmond), and lead optimization (FEP+).
If you prefer to watch video tutorials, see the Protein Preparation and Ligand Preparation videos from the Getting Going with Maestro video series.
2. Creating Projects and Importing Structures
Most of the entries in the PDB contain the diffraction data that can be used together with the deposited model to create and view the electron densities. Downloading diffraction data is usually unnecessary but if you wish to perform advanced structure refinement and detailed electron density map analysis, check the Diffraction data / EM map option. Further, if you want to download the entire biological unit rather than the asymmetric unit, check the Biological unit option.
For more information on biological assembly and asymmetric unit, click here.
3. Visualizing the Target Structure
In this section, you will explore ways to visualize structures in the Workspace. Different representation models can be used to show the three-dimensional configuration and spatial arrangement of atoms in biomolecules. Object representation can be changed in a number of ways using the Style Toolbox. Presets, similar to those in PyMOL, quickly render a structure in a predefined style to make visualization easier. Presets can be used in a variety of ways, from decluttering your structure to creating publication-quality images.
3.1 Change Visualization Style and Display Ligands in Maestro
By default, residues within 8 Å of the ligand and waters and ions within 3 Å of the ligand are displayed. The structure is colored by residue position, with a rainbow color ramp running from red at the N-terminus to purple at the C-terminus.
You can change these default settings via the Edit Default Custom Preset option in the Presets menu. Try changing the color scheme to Secondary Structure to identify the secondary structure elements in the structure.
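The default display rule above (residues within 8 Å of the ligand, waters and ions within 3 Å) amounts to a distance-based selection. The sketch below illustrates that idea in plain Python. It is not Maestro's implementation: the function names and coordinates are hypothetical, and it assumes that "within" means any atom of the group lies inside the cutoff of any ligand atom.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest pairwise distance between two lists of (x, y, z) points."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def select_near_ligand(ligand_atoms, groups, residue_cutoff=8.0, solvent_cutoff=3.0):
    """Return labels of the groups that fall inside the display cutoffs.

    groups maps a label to (kind, atoms), where kind is "residue",
    "water", or "ion". Cutoffs are in angstroms.
    """
    shown = []
    for label, (kind, atoms) in groups.items():
        cutoff = residue_cutoff if kind == "residue" else solvent_cutoff
        if min_distance(ligand_atoms, atoms) <= cutoff:
            shown.append(label)
    return shown


# Hypothetical coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
groups = {
    "residue_near": ("residue", [(5.0, 0.0, 0.0)]),
    "residue_far": ("residue", [(20.0, 0.0, 0.0)]),
    "water_near": ("water", [(0.0, 2.8, 0.0)]),
    "water_far": ("water", [(0.0, 6.0, 0.0)]),
}
print(select_near_ligand(ligand, groups))  # ['residue_near', 'water_near']
```

Using a tighter cutoff for waters and ions than for residues keeps the binding site view uncluttered while still showing solvent that directly contacts the ligand.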
4. Preparing Protein Structures for Glide Docking
In this section, you will learn how to prepare the protein structure for docking in Glide. The Protein Preparation Workflow is run within the Preparation Workflow tab. The workflow has processing, modification, and refinement tools that we will use on the 1FJS.pdb structure. These tools support two main workflows: Interactive, single-protein preparation and Automatic, bulk protein preparation. Interactive preparations are performed manually, step by step, with the opportunity to review the results of each step and control the order of modifications. Automatic preparation is configured in advance using toggles that control which stages of the workflow run, and it can process multiple protein structures in a single job, provided they all use the same settings. The recommended minimal processing tasks are checked by default in both workflows but may be modified using the dropdown options. There are also options for filling in missing side chains and/or loops, depending on the needs of your structure.
The Preparation Workflow tab may be used in conjunction with the Diagnostics tab and Substructures tab for the diagnosis and analysis of the protein structure. The Diagnostics tab lists the issues present in the protein structure. The Substructures tab provides options to view, copy or delete ligands, water, chains, etc. in the Workspace. The Protein Preparation Workflow toggles include Preprocess, Optimize H-Bond Assignments, and Minimize and Delete Waters.
For more information on the Protein Preparation Workflow panel, see the Protein Preparation Workflow Panel Documentation. For best practices to follow while preparing protein structures for modeling tasks, click here.
4.1 Prepare the Protein using the Protein Preparation Workflow
When preparing the structure for docking, you will need to make an informed decision about which water molecules to retain in the active site. Any water left in the structure will be considered an immutable part of the receptor (see this article). For this example, none of the waters are considered structural, so you will create a fully dry structure.
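Outside of Maestro, the effect of creating a dry structure can be pictured as filtering water records out of the coordinate file. The following Python sketch shows that idea only. The function `remove_waters` is hypothetical, the sample records (including the `LIG` residue name) are invented, and the code handles only waters named `HOH` in ATOM/HETATM records; other record types that refer to waters are left untouched.

```python
def remove_waters(pdb_text, keep=()):
    """Drop HOH coordinate records from PDB-format text.

    keep is an optional collection of (chain, residue_number) pairs for
    waters that should be retained, e.g. ones judged to be structural.
    """
    keep = set(keep)
    kept_lines = []
    for line in pdb_text.splitlines():
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        # Residue name: columns 18-20; chain: column 22; number: columns 23-26.
        if is_coordinate and line[17:20] == "HOH":
            water_id = (line[21], int(line[22:26]))
            if water_id not in keep:
                continue
        kept_lines.append(line)
    return "\n".join(kept_lines)


# Invented records: one protein atom, one ligand atom, two waters.
demo = "\n".join([
    "ATOM      1  N   ILE A  16       1.000   2.000   3.000  1.00 20.00           N",
    "HETATM    3  C1  LIG A 301       5.000   5.000   5.000  1.00 20.00           C",
    "HETATM    4  O   HOH A 401       8.000   8.000   8.000  1.00 20.00           O",
    "HETATM    5  O   HOH A 402       9.000   9.000   9.000  1.00 20.00           O",
])

dry = remove_waters(demo)                        # fully dry, as in this example
partly_dry = remove_waters(demo, keep={("A", 401)})  # retain one chosen water
print(dry)
```

The `keep` argument reflects the decision described above: any water that is retained becomes a fixed part of the receptor, so it should be kept only when there is a reason to treat it as structural.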
5. Preparing Ligands for Glide Docking
In this section, you will prepare the co-crystallized ligand from the 1FJS structure for use in virtual screening. This is a typical step for cognate ligand docking, as it provides important validation prior to screening a larger ligand data set.
Ligand files can be sourced from numerous places, such as vendors or databases, often in the form of 1D or 2D structures with unstandardized chemistry. These ligand structures may contain missing hydrogens, incorrect geometries, inappropriate protonation states, or undefined chiral centers, which can lead to erroneous docking results if not addressed. Further, Glide requires ligands in specific formats (e.g., 3D structures with filled valences and optimized geometries) to evaluate binding interactions effectively. So, before being used in a virtual screen, ligands must be converted to 3D structures, with their chemistry properly standardized. For the full list of requirements a ligand structure must meet for Glide docking, read the Ligand Preparation for Glide Documentation. The following steps provide an example of how you would prepare a ligand structure using LigPrep.
5.1 Prepare Small Molecule Ligands with LigPrep
Here, we show how to run LigPrep for the cognate ligand, which is already imported in the Workspace. For larger ligand libraries, you can directly provide a file containing the ligands to LigPrep. For more information and supported formats, see the LigPrep panel documentation.
5.2 View and Compare Ligand Structures in Maestro
6. Analyzing the Structure After Preparation
In this section, you will analyze the protein-ligand complex by looking at the interactions in the 2D Ligand Interaction Diagram and the 3D Workspace. Then you will generate a Custom Set for some binding residues of interest. Finally, you will visualize the surface of the binding pocket, and save an image of the complex.
6.1 Visualize Ligand-Receptor Interactions in 2D and 3D
In the Ligand Interaction Diagram, the ligand is displayed as a 2D structure. Residues are represented as colored teardrop shapes, labeled with the residue name and residue number, and colored according to their properties. The chain is represented as a black line connecting residues. Interactions between the residues and the ligand are drawn as lines, colored by interaction type (see the LID Legend). For detailed explanation, please refer to the Ligand Interaction Diagram panel features.
For the 1FJS structure, you can see that ASP 189, SER 195 and GLY 218 are involved in H-bonding with the ligand. ILE 227 is involved in water mediated H-bonding with the ligand. Further, TYR 99, PHE 174 and TRP 215 are involved in pi interactions with the ligand.
6.2 Create a Custom Set
Custom Sets in Maestro offer several advantages for managing and working with atom/residue selections within your project. Once you save your selections as a Custom Set, these are readily available. This is particularly helpful for complex selections you frequently need. You will now create a Custom Set of the residues that form key interactions with the ligand.
6.3 Generate and Manipulate a Surface
Visualizing molecular surfaces, particularly the binding site surfaces of proteins, aids in identifying critical structural features such as cavities, grooves, and pockets that may serve as binding sites. By overlaying the electrostatic potential (ESP) map onto this surface, you can visualize the distribution of positive and negative charges, offering insights into electrostatic complementarity between the protein and the ligand. For example, positively charged regions on the surface may attract negatively charged ligands or functional groups, facilitating specific interactions critical for binding.
The regions in shades of blue are associated with positively charged residues while those in shades of red are associated with negatively charged residues. The white colored regions are those with neutral electrostatic potential and often have non-polar or hydrophobic residues.
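The color convention described above (red for negative, white for neutral, blue for positive) can be expressed as a simple linear color ramp. The Python sketch below is an illustration of that convention, not the mapping Maestro uses: the function name `esp_to_rgb`, the `limit` parameter, and its default value are all hypothetical, and the potential is treated as a unitless number.

```python
def esp_to_rgb(value, limit=1.0):
    """Map an electrostatic potential value to a red-white-blue RGB triple.

    Values at or below -limit are pure red, 0 is white, and values at or
    above +limit are pure blue. Components are floats in [0, 1].
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Clamp the scaled value so out-of-range potentials saturate.
    t = max(-1.0, min(1.0, value / limit))
    if t >= 0:
        return (1.0 - t, 1.0 - t, 1.0)  # white fading to blue
    return (1.0, 1.0 + t, 1.0 + t)      # white fading to red


for potential in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(potential, esp_to_rgb(potential))
```

Narrowing `limit` saturates the colors sooner, which makes weakly charged regions stand out; widening it reserves full red and blue for the most strongly charged patches.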
To locate and characterize potential binding pockets or cavities on the protein’s surface that are likely to accommodate ligands, you can use SiteMap. For more details, please refer to the Target Analysis with SiteMap and WaterMap tutorial.
6.4 Save an image of the Workspace
- Go to Workspace > Save Image As.
- The Save Image dialog box opens.
Note: You can also right-click anywhere in the Workspace and choose the Save Image option.
You can adjust the size and resolution of the image to meet quality requirements, e.g. for printing.
- Click Options >>.
- Check Transparent background and select 300 DPI.
- For File name, type 1FJS_ESP_map.
- Click Save.
- The image is saved in the Working Directory.
Note: You can also export the structure to PyMOL from the File menu and Ray Trace the structure to get high quality images for publications. See the free Visualizing Science with PyMOL 3 online course for detailed instructions.
7. Conclusion and References
In this tutorial, you imported and visualized a protein structure. You adjusted structure visualization options both manually and with one click using Presets. Then you prepared the protein and ligand structures for Glide docking. A raw PDB file was made suitable for modeling purposes using the Protein Preparation Workflow, and the cognate ligand was prepared using LigPrep in the same fashion that would be used for a multi-ligand file. Finally, you visualized key interactions in the 2D Ligand Interaction Diagram and in the 3D Workspace to analyze the protein-ligand complex. The Workspace Configuration toolbar allowed for toggling various components in the Workspace. Visualizing the surfaces gave another way to analyze the protein-ligand complex.
If you are working with nucleic acid structures, please refer to the Preparing Nucleic Acid Structures and Structure Visualization and Interaction Analysis in Nucleic Acids tutorials.
For further learning:
- Getting Going with Maestro Video Series: Preparing Protein Structures
- Getting Going with Maestro Video Series: Ligand Preparation
- Target Analysis with SiteMap and WaterMap
- Structure-Based Virtual Screening using Glide
- Ligand-Based Virtual Screening Using Phase
- Refining Crystallographic Protein-Ligand structures using GlideXtal and Phenix/OPLS
- Homology Modeling of Protein-Ligand Binding Sites with IFD-MD
- Introduction to Molecular Modeling in Drug Discovery Online Course
- Understanding and Visualizing Target Flexibility
- Visualizing Science with PyMOL 3
For further reading:
- Protein Preparation Workflow Panel Documentation
- Best Practices for protein preparation
- How to prepare a large ligand library using LigPrep
- Docking performance of the glide program as evaluated on the Astex and DUD datasets: a complete set of glide SP results and selected results for a new scoring function integrating WaterMap and glide - 2012 paper from Schrödinger evaluating the performance of Glide on the Astex and DUD datasets. The paper highlighted both the performance of Glide SP, as well as the sizable impact careful protein preparation can have on docking performance. The authors found that the “average AUC was greater than 0.7 for all best-practices protein families demonstrating consistent enrichment performance across a broad range of proteins and ligand chemotypes.” This paper is also the first to introduce the WScore, which is described more in reference 18.
- Effects of histidine protonation and rotameric states on virtual screening of M. tuberculosis RmlC - An open access publication that shows the importance of accurate protonation during protein preparation. It suggests that for binding pockets containing residues that can be protonated, deprotonated, or adopt various rotameric states, it may be worth preparing all combinations for a virtual screening campaign.
- Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments - 2013 paper from Schrödinger introducing the Protein Preparation Wizard, and demonstrating the importance of properly prepared structures for virtual screening applications.
- Protein Reliability Report Help page - A section in the Maestro user manual that explains the 21 metrics that make up the Protein Reliability Report, and notes their cutoffs.
- Can the Protein Preparation workflow be run from the command line?
8. Glossary of Terms
Cognate ligand - a ligand that is bound to its protein target
Entries - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion
Included - the entry is represented in the Workspace; the circle in the In column is blue
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data
Recent actions - This is a list of your recent actions, which you can use to reopen a panel, displayed below the Browse row. (Right-click to delete.)
Scratch Project - a temporary project in which work is not saved; closing a scratch project removes all current work and begins a new scratch project
Selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entries (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries
Working Directory - the location where files are saved
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed