Ligand-Based Virtual Screening Using Phase

Tutorial Created with Software Release: 2025-3
Topics: Hit Discovery, Small Molecule Drug Discovery, Virtual Screening
Products Used: Maestro, Phase

Tutorial files

8.5 MB

This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms, such as Workspace, are defined at the end of this tutorial.

 

Abstract:

This tutorial demonstrates the development of pharmacophore hypotheses from a congeneric ligand series as well as the creation of Phase databases. You will learn how to screen a Phase database against pharmacophore hypotheses and how to analyze, visualize and evaluate the hypotheses and screening results.

 

Tutorial Content
  1. Introduction to Pharmacophore Modeling
  2. Creating Projects and Importing Structures
  3. Developing Pharmacophore Hypotheses from a Congeneric Ligand Series
  4. Creating a Phase Database
  5. Screening of a Phase Database Against Pharmacophore Hypotheses
  6. Conclusions and References
  7. Glossary of Terms

1. Introduction to Pharmacophore Modeling

Virtual screening in drug discovery is the process of searching large libraries of small molecules to identify those structures most likely to bind to a drug target. It can be employed in structure-based, ligand-based and hybrid scenarios.

Phase is a versatile and easy-to-use pharmacophore modeling solution for both ligand- and structure-based drug design processes. The terms pharmacophore, pharmacophore model and pharmacophore hypothesis are synonymous and describe an abstract representation or ensemble of steric and electronic molecular features of a ligand in space. These features are necessary to trigger molecular interaction with a biological target to ensure a desired response. The default features are hydrogen bond donor (D) and acceptor (A), aromatic ring (R), hydrophobic (H), as well as positive (P) and negative (N) ionic. Pharmacophore models can be developed from a single ligand, a set of ligands, a ligand-receptor complex or an apo-protein. This tutorial focuses on a ligand-based approach with a set of ligands. For the use of Phase in a structure-based approach, see the Structure-Based Virtual Screening Using Phase tutorial.

When you have multiple ligands that are known to bind to a receptor, you can develop a set of hypotheses that are optimized to represent the pharmacophore information derived from all ligands. In this approach, the pharmacophore features on each ligand are identified, and a set of common features that satisfies criteria on their positions and directions is sought to form pharmacophore hypotheses. The hypotheses can be scored on their geometric alignment and on their ability to retrieve actives from a set of decoys, and they can be penalized for matches to inactive ligands.

In this tutorial, you will learn how to develop pharmacophore hypotheses from a congeneric ligand series with known experimental binding affinity and how to visualize and evaluate them with the PhaseHypoScore. Additionally, you will create a new Phase database from an additional set of ligands. Finally, you will screen a Phase database against your top three pharmacophore hypotheses and evaluate their matching visually and with the PhaseScreenScore.

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is created, the project is automatically saved each time a change is made.

Structures can be imported from the PDB directly, or from your Working Directory using File > Import Structures, and are automatically added to the Entries and Project Table. The Entries is located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

  1. Double-click the Maestro icon.

Figure 2-1. Change Working Directory option.

  1. Go to File > Change Working Directory.
  2. Pre-generated input and results files are included for running jobs or examining output. Download the zip file here: https://www.schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/lbvs_phase_ligand.zip
  3. After downloading the zip file, unzip the contents in your chosen Working Directory for ease of access throughout the tutorial.

Figure 2-2. Opening the project.

  1. Go to File > Open Project.
  2. Navigate to and choose Phase_tutorial.prjzip in your Working Directory.
  3. Click Open.
  4. In the Save scratch project dialog box, click OK.
    • Several groups of structures are shown in the Entries.

Figure 2-3. Saving the project in the Save Project panel.

  1. Go to File > Save Project As.
  2. Change the File name to Phase_ligand_tutorial.
  3. Click Save.
    • The project is saved as Phase_ligand_tutorial.prj.

3. Developing Pharmacophore Hypotheses from a Congeneric Ligand Series

In this section, you will generate pharmacophore hypotheses from a congeneric series of ligands with known experimental binding affinities using the Develop Pharmacophore Model panel. Afterwards, you will analyze the generated hypotheses with regard to their PhaseHypoScore and visualize the alignment of features for active and inactive ligands.

Ligand files can be sourced from numerous places, such as vendors or databases, often in the form of 1D or 2D structures with unstandardized chemistry. LigPrep can convert ligand files to 3D structures, with the chemistry properly standardized and extrapolated, ready for use in virtual screening. For this tutorial, the ligands have already been prepared in order to save time. Please see the Introduction to Structure Preparation and Visualization tutorial and the LigPrep Documentation for instructions on using LigPrep.

3.1 Develop hypotheses

Figure 3-1. Including all the entries in the at1 group.

  1. Expand and select the at1 group in the Entries.
  2. Right-click the selection and choose Include.
  3. In the Warning box, click Continue.
    • All the entries in the at1 group are displayed in the Workspace.
    • The group consists of 25 prepared but unaligned ligands.

Ligand alignment is very important for the generation of pharmacophores from any ligand series. Phase can (and will as a default) automatically find the best alignment and common features of your ligands, so you do not have to perform alignment prior to using Phase.

 

At times it can be beneficial to manually pre-align your ligands using the Flexible Ligand Alignment tool before importing the ligands into Phase. This can be useful when trying to align to a known bioactive conformation.

Figure 3-2. The Develop Pharmacophore Hypothesis in Tasks.

  1. Go to Tasks > Browse > Ligand-Based Virtual Screening > Develop Pharmacophore Hypothesis.
    • The Develop Pharmacophore Model panel opens.

Figure 3-3. The Develop Pharmacophore Model panel.

  1. For Create pharmacophore model using, choose Multiple ligands (selected entries) from the dropdown menu.
    • The pharmacophore features of the 25 selected ligands are detected and shown in the Workspace.
  2. For Actives / Inactives Split (selected entries), click the Define button.
    • The Define Actives and Inactives dialog box opens.

Figure 3-4. The Define Actives and Inactives dialog box.

  1. For Choose an activity property, select pIC50-Exp (SD).
  2. Set Active if Activity to >= 7.30.
  3. Set Inactive if Activity to <= 5.00.
  4. Click Apply and OK.
    • 5 ligands are recognized as active, 6 as inactive.

Distinguishing between active and inactive ligands is crucial for developing pharmacophore hypotheses. The pIC50 property contains the experimental binding affinities converted to a logarithmic, free-energy-like scale by pIC50 = -log10(IC50), with IC50 in mol/L. The threshold for active ligands is an experimental affinity of IC50 ≤ 50 nM, which corresponds to pIC50 ≥ 7.3. The threshold for inactive ligands is IC50 ≥ 10 μM, or pIC50 ≤ 5.0.
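The arithmetic behind these thresholds can be checked with a few lines of Python. This is only an illustrative sketch and not part of Phase or Maestro: the function names and the "intermediate" label for ligands between the two thresholds are assumptions made for this example.

```python
import math

ACTIVE_MIN_PIC50 = 7.30    # "Active if Activity >= 7.30" in the dialog box
INACTIVE_MAX_PIC50 = 5.00  # "Inactive if Activity <= 5.00" in the dialog box


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def classify(pic50: float) -> str:
    """Label a ligand with the two tutorial thresholds (labels are hypothetical)."""
    if pic50 >= ACTIVE_MIN_PIC50:
        return "active"
    if pic50 <= INACTIVE_MAX_PIC50:
        return "inactive"
    return "intermediate"


if __name__ == "__main__":
    for ic50_nm in (50.0, 1000.0, 20000.0):
        p = pic50_from_ic50_nm(ic50_nm)
        print(f"IC50 = {ic50_nm:g} nM -> pIC50 = {p:.2f} ({classify(p)})")
```

Ligands that fall between the two thresholds are neither active nor inactive, which is why only 11 of the 25 ligands are assigned to a class in this tutorial.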

 

You can also specify ligands that must match a hypothesis for it to be returned by checking the Must Match column for the desired ligand.

Figure 3-5. The Hypothesis Settings in the Develop Pharmacophore Model panel.

  1. In the Develop Pharmacophore Model panel, under Pharmacophore method, click the Hypothesis Settings button.
    • The Hypothesis Settings dialog box opens.

Figure 3-6. The Features tab in the Hypothesis Settings dialog box.

  1. In the Features tab, set Number of features in the hypothesis to 5 to 6.
  2. Check Preferred minimum number of features and set to 5.
  3. In the Minimum column, set both (D) Donor and (N) Negative Ionic to 1.
  4. For Feature presets, select Make acceptor and negative equivalent from the dropdown menu.

 

 

Settings such as the number of total features a hypothesis should have or values for specific features highly depend on the target system, e.g. the size of the ligands and known interactions between active ligands and a receptor.

Figure 3-7. The Excluded Volumes tab in the Hypothesis Settings dialog box.

  1. Go to the Excluded Volumes tab.
  2. Check Create excluded volume shell.
  3. For Create shell from, choose Actives and Inactives.
  4. Click Save.

Excluded volumes can be placed as a shell around a hypothesis to mimic the receptor surface in whole or in part. They are placed in all areas where inactive ligands have atoms but active ligands do not. These areas correspond to regions that are not favored for activity and can be considered to be “clashes” of ligands with receptor side chains or solvent molecules.

Figure 3-8. Modifying the Job settings and running the job to develop a hypothesis.

  1. In the Develop Pharmacophore Model panel, change the Job name to at1_dcph.
  2. Optional: To run the job, click the cog next to the Job name and set the number of processors to 4, then click Run.
    • This job takes ~ 5 minutes.
    • A banner appears confirming the job has been incorporated.
  3. Close the Develop Pharmacophore Model panel.

 

Note: To save time, you will look at the pre-generated results.

3.2 Analyze the developed hypothesis

Figure 3-9. Expanding the group and showing property for analysis.

Note: Hypotheses are named according to their features, e.g. APRR has an acceptor, a positive ionic and two ring features.
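As a small illustration of this naming scheme, the sketch below expands a hypothesis name into feature counts. The letter codes are the default Phase features listed in the introduction. The function name is hypothetical, and treating the part after the underscore (e.g. the _1 in DHNRRR_1) as an index that is not a feature is an assumption made for this example.

```python
from collections import Counter

# One-letter codes of the default pharmacophore features
FEATURE_NAMES = {
    "A": "hydrogen bond acceptor",
    "D": "hydrogen bond donor",
    "H": "hydrophobic",
    "N": "negative ionic",
    "P": "positive ionic",
    "R": "aromatic ring",
}


def describe_hypothesis(name: str) -> dict[str, int]:
    """Count the features encoded in a hypothesis name such as 'DHNRRR_1'."""
    # Assumption: everything after the first underscore is an index, not a feature
    letters = name.split("_", 1)[0]
    unknown = set(letters) - set(FEATURE_NAMES)
    if unknown:
        raise ValueError(f"Unknown feature codes: {sorted(unknown)}")
    counts = Counter(letters)
    return {FEATURE_NAMES[code]: n for code, n in counts.items()}


if __name__ == "__main__":
    for hypo in ("APRR", "DHNRRR_1", "ADHNRR_1"):
        print(hypo, describe_hypothesis(hypo))
```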

 

  1. Expand the at1_dcph group in the Entries.
  2. Expand the DHNRRR_1 group.
  3. Expand both the Active (5) and the Inactive Partial Match (6) groups.
  4. Double-click the In circle of DHNRRR_1 to fix the hypothesis in the Workspace.
  5. At the top right corner of the Entries, click the Change table settings (three vertical dots).
  6. Choose Show Property.
    • The Show Properties in Table dialog box opens.

Figure 3-10. Adding PhaseHypoScore and Matched Ligand Sites properties to the table.

  1. Click Choose.
  2. Search and select PhaseHypoScore and Matched Ligand Sites from the list.
  3. Click OK.
    • The PhaseHypoScore and the Matched Ligand Sites are added to the Entries.

Figure 3-11. The updated Entries showing the PhaseHypoScore and the Matched Ligand Sites.

Note: All returned hypotheses are rank-ordered by the PhaseHypoScore, which is a measure of how well a hypothesis matches the ligand set. Both the quantity and quality of feature matching are taken into account. The features that are matched for a ligand and the hypothesis are highlighted in the Matched Ligand Sites column. More information on the scoring can be found in the Phase documentation.

Figure 3-12. Active group with hypothesis DHNRRR_1 shown in the Workspace.

  1. Select the Active (5) group by clicking on the group heading.
  2. Right-click the selection and choose Include.
    • All the active ligands are displayed in the Workspace.
  3. Right-click the Active (5) group again and choose Exclude.

Figure 3-13. Inactive group with hypothesis DHNRRR_1 shown in the Workspace.

  1. Select the Inactive Partial Match (6) group.
  2. Right-click the selection and choose Include.
    • All the inactive ligands are shown in the Workspace.

 

Take your time and compare some of the other hypotheses.

 

Optional: You can measure distances, angles etc. of pharmacophore features with the Measure tool in Maestro, just as you can for atoms.

All the active ligands have good feature alignment and avoid excluded volumes, while the inactive ligands have poor feature alignment (e.g. D5) and clash with excluded volumes.

 

If you want to manually change a feature of a pharmacophore hypothesis, e.g. position or type, use the Edit Feature X Dialog Box.

4. Creating a Phase Database

In this section, you will create a Phase database for a set of ligands with the Create Phase Database panel. Screening compounds from a prepared Phase database is 2 to 3 times faster than screening compounds from files or the Entries directly. By default, compounds in Phase databases include low-energy ionization/tautomeric states and an ensemble of low-energy conformers. These compounds are ready for use in docking tasks with Glide, shape screening with Shape, and pharmacophore querying with Phase. Compounds from one or more Phase databases can be screened against one or more pharmacophore hypotheses in one calculation.

The ligands you will be using to create your database are taken from the Database of Useful Decoys: Enhanced (DUD-E). For each of the more than 20,000 active compounds in the database, there are 50 decoys with similar 1-D physico-chemical properties, but different 2-D topology. The DUD-E ligand set can thus be used in pharmacophore screening to avoid artificial enrichment.

Figure 4-1. The Create Database option in Ligand-Based Virtual Screening.

  1. Go to Workspace and choose Clear Workspace.
  2. In the Entries, select the DUD-E_act_3chp_trainc group by clicking on the group heading.
  3. Go to Tasks > Browse > Ligand-Based Virtual Screening > Create Database.
    • The Create Phase Database panel opens.

Figure 4-2. The Input tab in the Create Phase Database panel.

  1. In the Input tab, next to Choose task, choose Create new database.
  2. Specify the location on the job host for saving the database by writing an absolute file path.
  3. For Use ligands from, choose Project Table (selected entries).
  4. Under Conformation Generation, check Generate ligand conformers.

Figure 4-3. The Ligand Filtering tab in the Create Phase Database panel.

  1. Go to the Ligand Filtering tab.
  2. Under Filter Prepared Ligands, check the options Generate QikProp properties, Prefilter by Lipinski's Rule (requires QikProp properties), and Skip ligands with reactive functional groups.
  3. Change the Job name to tutorial_database.
  4. Optional: To run the job, click Run.
    • This job takes ~10 minutes on a single CPU.
    • To save time, you will use the pre-generated database file.

 

Note: No licenses are checked out during the creation or modification of a Phase database, enabling the calculation to be distributed over as many CPUs as are available without running out of licenses.

Ligands are prepared by default and different conformations are generated with ConfGen. To remove structures that are not relevant for the later screening step, different filters can be applied before the ligands are added to the database. One filter makes use of Lipinski's Rule of 5. This option requires the calculation of the properties used in Lipinski's Rule of 5 with QikProp. Additionally, ligands with predefined reactive groups such as anhydrides, peroxides, and many more can be skipped automatically. The full list can be accessed in the panel help.
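To make the idea of a Rule of 5 prefilter concrete, the sketch below counts violations of the four classic Lipinski criteria (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). It is a generic illustration, not the Phase or QikProp implementation: the class, the function names, the descriptor values and the default of tolerating one violation are all assumptions, and the exact criteria applied by the panel may differ.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for precomputed ligand properties."""
    mol_weight: float       # g/mol
    logp: float             # octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four classic Rule of 5 criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumption: a ligand passes if it has at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    # Made-up descriptor values, for illustration only
    ligands = {
        "ligand_a": Descriptors(350.0, 3.2, 2, 5),
        "ligand_b": Descriptors(650.0, 6.1, 6, 12),
    }
    for name, desc in ligands.items():
        print(name, lipinski_violations(desc), passes_rule_of_five(desc))
```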

5. Screening of a Phase Database Against Pharmacophore Hypotheses

In this section, you will perform a screening of a Phase database against your top three developed pharmacophore hypotheses with the Phase Ligand Screening panel. Then you will visually examine the results and analyze the PhaseScreenScore to evaluate how well the ligands align to the pharmacophore hypotheses they were screened against.

5.1 Screen Ligands from a Phase Database

Figure 5-1. Including the top three hypotheses in the Workspace.

  1. Ctrl+Click (Cmd+Click) to include the top three hypotheses from the at1_dcph group (DHNRRR_1, DHNRRR_2, ADHNRR_1).
  2. Go to Tasks > Browse > Ligand-Based Virtual Screening > Ligand Screening.
    • The Phase Ligand Screening panel opens.

Figure 5-2. The Phase Ligand Screening panel.

  1. For Ligands to screen, choose Phase Database.
  2. Click Add Database and click Browse.
  3. Navigate to the tutorial_database folder in your Working Directory and choose tutorial_database.phdb.
  4. Click Open.
  5. Click Add Hypothesis.
  6. Choose Workspace.
    • The three hypotheses are added to the table.
  7. Click the Screening Settings button.
    • The Screening Settings dialog box opens.

Figure 5-3. The Constraints tab in the Screening Settings dialog box.

  1. Go to the Constraints tab.
  2. Check the Pick Features box.
  3. Include only the DHNRRR_1 hypothesis in the Workspace.
  4. Select the features R10 and R11 in the Workspace by clicking on them.
  5. Click Add Constraint.
    • A constraint has been added to the table, which specifies the corresponding hypothesis, features, type of constraint, the value and tolerance.

Figure 5-4. Adding constraints.

  1. Repeat the last three steps (include only the hypothesis, select the features, click Add Constraint) for the other two hypotheses (DHNRRR_2, ADHNRR_1).
  2. Click Save.

Geometric constraints on distances, angles and dihedrals can be added to a hypothesis to further customize the screening: they introduce restrictive geometric criteria that a ligand must satisfy in order to match the hypothesis. The value and tolerance of the constraint can be edited manually. Read more on the various screening settings in the Phase documentation.
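The role of the value and tolerance of a distance constraint can be sketched as follows. This is a conceptual illustration only and does not reproduce how Phase evaluates constraints internally: the function name and the coordinates are hypothetical, and the check assumes that a match requires the measured distance to lie within value ± tolerance.

```python
import math

Point = tuple[float, float, float]


def satisfies_distance_constraint(
    site_a: Point, site_b: Point, value: float, tolerance: float
) -> bool:
    """Return True if the distance between two matched sites is within value +/- tolerance."""
    measured = math.dist(site_a, site_b)  # Euclidean distance (Python 3.8+)
    return abs(measured - value) <= tolerance


if __name__ == "__main__":
    # Made-up coordinates (in angstroms) of two ring features matched by a ligand
    ring_1 = (0.0, 0.0, 0.0)
    ring_2 = (3.0, 4.0, 0.0)  # 5.0 angstroms away from ring_1
    print(satisfies_distance_constraint(ring_1, ring_2, value=5.5, tolerance=0.5))
    print(satisfies_distance_constraint(ring_1, ring_2, value=6.0, tolerance=0.5))
```

A tighter tolerance rejects more ligands, so it trades recall of the screen for geometric similarity to the hypothesis.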

Figure 5-5. Running the job.

  1. In the Phase Ligand Screening panel, change the Job name to at1_dcph_screen.
  2. Optional: To run the job, click Run.
    • This job takes ~ 1 minute.
    • To save time, you will look at the pre-generated results.
  3. Close the Phase Ligand Screening panel.

5.2 Analyze the pre-generated pharmacophore screening results

Figure 5-6. Expanding the group and showing property for analysis.

  1. Go to Workspace and choose Clear Workspace.
  2. Expand the group at1_dcph_screen, as well as ADHNRR_1, DHNRRR_2, DHNRRR_1.
  3. At the top right corner of the Entries, click the Change table settings (three vertical dots).
  4. Choose Show Property.
    • The Show Properties in Table dialog box opens.

Figure 5-7. The updated Entries showing the PhaseScreenScore.

  1. Click Choose.
  2. Search and select PhaseScreenScore from the list.
  3. Click OK.
    • The PhaseScreenScore is added to the Entries.

All returned ligands are rank-ordered by the PhaseScreenScore, which is a measure of how well a ligand matches the hypothesis. Both the quantity and quality of feature matching are taken into account. A ligand can thus match fewer sites and still have a higher PhaseScreenScore than another ligand in the screening set. The features that are matched for a ligand and the hypothesis are highlighted in the Matched Ligand Sites column.

Figure 5-8. The hypothesis ADHNRR_1 with one of the screened ligands.

  1. Select the ADHNRR_1 group in the Entries by clicking on the group heading.
  2. Double-click the In circle to fix the ADHNRR_1 entry in the Workspace.
  3. Use the right and left arrow keys to step through the ligands.
  4. Check out the other matched ligands with their hypothesis in the other two groups in the same way.

Most of the matched ligands within each hypothesis share a similar core structure. Comparing matched ligands between the hypotheses shows more diversity.

6. Conclusions and References

In this tutorial, you learned how to develop pharmacophore hypotheses from a congeneric ligand series with known experimental binding affinity and how to visualize and evaluate them with the PhaseHypoScore. Additionally, you created a new Phase database from an additional set of ligands. Finally, you performed screening of a Phase database against your top three pharmacophore hypotheses and evaluated their matching visually and with the PhaseScreenScore.

For larger ligand sets or libraries on the order of millions to billions of compounds, even faster ligand-based tools than Phase, e.g. Shape or Quick Shape, are employed for screening.


7. Glossary of Terms

congeneric - a series of ligands that are structurally related, such as sharing a common core

Entry List - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

included - the entry is represented in the Workspace; the circle in the In column is blue

incorporated - once a job is finished, output files from the working directory are added to the project and shown in the Entry List and Project Table

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Scratch Project - a temporary project in which work is not saved; closing a scratch project removes all current work and begins a new scratch project

selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entry List (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location where files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed