Rapid Screening of Chemical Libraries with GPU Shape

Tutorial Created with Software Release: 2025-3
Topics: Hit Discovery, Small Molecule Drug Discovery, Virtual Screening
Products Used: Phase, Shape Screening

Tutorial files

161.5 MB

This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms, such as Workspace, are defined in Section 7.

 

Abstract:

 

In this tutorial, you will learn how to perform rapid shape-based screening of a chemical library with Shape GPU. We will use information from nearly 70 Cdk2 small-molecule inhibitors to evaluate a library of compounds provided by DUD-E (http://dude.docking.org/) for their propensity to bind Cdk2. We will then run the screen on a GPU and perform enrichment calculations using the true actives in the dataset, as provided by DUD-E.

 

Tutorial Content
  1. Creating Projects and Importing Structures
  2. Selecting Template (Probe) Molecules for Shape Screen
  3. Preparing a Screening Deck for Shape Screen
  4. Running GPU Shape Screen
  5. Analyzing the Shape Screening Results
  6. Conclusion and References
  7. Glossary of Terms

1. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is created, it is automatically saved each time a change is made.

Structures can be built in Maestro or imported using File > Import Structures (or by drag and drop), and are added to the Entries table and the Project Table. The Entries table is located to the left of the Workspace. The Project Table can be opened with Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

  1. Double-click the Maestro icon.

Figure 1-1. Change Working Directory option.

  1. Go to File > Change Working Directory.
  2. Find your directory, and click Choose.
  3. Pre-generated input and results files are included for running jobs or examining output. Download the zip file here: https://www.schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/gpu_shape.zip
  4. After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial.
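
If you prefer to script the unzip step, a minimal sketch using only the Python standard library might look like the following. The function name `unzip_into` and the example paths in the comment are hypothetical; the archive name comes from the download link above.

```python
import zipfile
from pathlib import Path
from typing import List


def unzip_into(archive: Path, working_dir: Path) -> List[str]:
    """Extract every member of `archive` into `working_dir` and return the member names."""
    working_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(working_dir)
        return zf.namelist()


# Example call (hypothetical paths; adjust to your own download location):
# unzip_into(Path("~/Downloads/gpu_shape.zip").expanduser(), Path.cwd())
```

The returned list of member names is a quick way to confirm that the project archive and the screening deck were extracted.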

Figure 1-2. Saving the project in the Save Project panel.

  1. Go to File > Open Project.
  2. Select CDK2_screen.prjzip and click Open.
  3. In the Save scratch project dialog box, click OK.
  4. Go to File > Save Project As.
  5. Change the File name to GPU-Shape.
  6. Click Save.
    • The project is now named GPU-Shape.prj.

2. Selecting Template (Probe) Molecules for Shape Screen

The first step in executing a Shape screen of a chemical library is to construct a shape-based model of known small-molecule binders. Each molecule in the screening deck is analyzed in the same way and compared with the profile of each template (probe) molecule. The ability of Shape to recover true actives is influenced by the number and diversity of the probes. Adding diverse probes generally improves performance, although the benefit begins to saturate at around 10 molecules. As we have 127 actives to choose from, we will start by selecting a diverse subset of these molecules using clustering tools available in Maestro.

Figure 2-1. Fingerprint Similarity in Ligand-Based Virtual Screening.

  1. Select the CDK2-Actives group in the Entries.
  2. Go to Tasks > Browse > Ligand-Based Virtual Screening > Fingerprint Similarity.
    • The Canvas Similarity and Clustering panel opens in the Fingerprints tab.

Figure 2-2. The Fingerprints tab in the Canvas Similarity and Clustering panel.

  1. For Precision, choose 32-bit.
  2. For Fingerprint type, select Dendritic.
  3. For Atom Typing Scheme, choose option 4, Atoms distinguished by functional type: {H}, {C}, {F,Cl}, {Br,I}, {N,O}, {S}, {other}; bonds by hybridization.

 

Figure 2-3. The Similarity tab in the Canvas Similarity and Clustering panel.

  1. Go to the Similarity tab.
  2. Set the Similarity metric to Tanimoto.

Figure 2-4. The Cluster tab in the Canvas Similarity and Clustering panel.

  1. Go to the Cluster tab.
  2. For Linkage method, choose Centroid.
  3. Click Calculate Clustering.
  4. Next to Number of clusters, type 10.
  5. For Apply Clustering, choose A group containing the structures nearest the centroid in each cluster.
  6. Click Apply Clustering.
    • A new group, Representative Entries, is added to the Entries.
    • The representative molecules from this group will be the probe molecules for the Shape screen.
  7. Close the Canvas Similarity and Clustering panel.
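
The idea behind this selection can be illustrated with a small Python sketch. It is not the Canvas implementation: fingerprints are represented here as sets of on-bit indices with made-up values, and the representative is chosen as the member with the highest mean Tanimoto similarity to the rest of its cluster (a medoid), which stands in for "nearest the centroid". The names `tanimoto` and `pick_representative` are hypothetical.

```python
from typing import Dict, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def pick_representative(cluster: Dict[str, Set[int]]) -> str:
    """Return the member with the highest mean similarity to the other members."""
    names = sorted(cluster)
    if len(names) == 1:
        return names[0]

    def mean_similarity(name: str) -> float:
        others = [n for n in names if n != name]
        return sum(tanimoto(cluster[name], cluster[n]) for n in others) / len(others)

    return max(names, key=mean_similarity)


# Toy fingerprints (hypothetical bit indices, not real Dendritic fingerprints).
toy_cluster = {
    "lig_a": {1, 2, 3, 4},
    "lig_b": {2, 3, 4, 5},
    "lig_c": {3, 4, 5, 6, 7},
}
representative = pick_representative(toy_cluster)
```

Applying a selection like this to each of the 10 clusters yields one probe per cluster, which is how the Representative Entries group ends up with 10 structures.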

3. Preparing a Screening Deck for Shape Screen

Figure 3-1. Create Shape Data File in Ligand-Based Virtual Screening.

  1. Go to Tasks > Browse > Ligand-Based Virtual Screening > Create Shape Data File (for GPU).
    • The Create Shape Data File panel opens.

Figure 3-2. Running the Create Shape Data File job.

  1. For Use ligands from, choose File.
  2. For Input structure file, click Browse and locate cdk2-screen-deck-shuffled.maegz.

 

Note: The input ligands have already been prepared with LigPrep.

 

  1. For Conformers, select Generate ligand conformers and leave it set to Rapid.
  2. Change Job name to shape_data_CDK2.
  3. Click Run.
    • This job takes ~5 minutes.
  4. Close the Create Shape Data File panel.

Both ligand preparation and conformer generation are unnecessary when using a Phase Database as the input. Typed pharmacophore is the recommended Shape type. With this type, a sphere in the shape of a probe molecule only matches a sphere in a screening-deck ligand when the two spheres have matching pharmacophore types.
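
To make the effect of typed matching concrete, here is a toy Python sketch. It is not the Phase algorithm: the Gaussian pair weight, the normalization by the larger self-overlap, the type labels, and the coordinates are all assumptions chosen for illustration, and no alignment optimization is performed.

```python
import math
from typing import List, Tuple

# (x, y, z, pharmacophore_type); the type labels here are hypothetical.
Sphere = Tuple[float, float, float, str]


def overlap(a: List[Sphere], b: List[Sphere], typed: bool = True) -> float:
    """Sum a Gaussian weight over sphere pairs, skipping mismatched types if typed."""
    total = 0.0
    for ax, ay, az, a_type in a:
        for bx, by, bz, b_type in b:
            if typed and a_type != b_type:
                continue
            dist_sq = (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            total += math.exp(-dist_sq)
    return total


def similarity(a: List[Sphere], b: List[Sphere], typed: bool = True) -> float:
    """Overlap normalized by the larger self-overlap (an assumed normalization)."""
    return overlap(a, b, typed) / max(overlap(a, a, typed), overlap(b, b, typed))


# Same geometry, but the first sphere differs in type between probe and ligand.
probe = [(0.0, 0.0, 0.0, "donor"), (1.5, 0.0, 0.0, "aromatic")]
ligand = [(0.0, 0.0, 0.0, "acceptor"), (1.5, 0.0, 0.0, "aromatic")]

typed_sim = similarity(probe, ligand, typed=True)
untyped_sim = similarity(probe, ligand, typed=False)
```

In this toy case the two molecules overlap perfectly in shape, so the untyped score is maximal, while the typed score drops because only the aromatic spheres are allowed to match.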

4. Running GPU Shape Screen

Figure 4-1. Shape Screening in Ligand-Based Virtual Screening.

  1. Go to Tasks > Browse > Ligand-Based Virtual Screening > Shape Screening.
    • The Shape Screening panel opens.
  2. Select the header of the Representative Entries group.

Figure 4-2. Setting up a Shape Screening job.

  1. For Use shape query from, choose Project Table (10 selected entries).
  2. For Run screen on, choose GPU.

 

Note: The GPU option for Run screen on is unavailable unless you have a machine with a GPU specified in your host file.

 

  1. For Screen structures in, choose Shape data file (local).
  2. Click Browse.
  3. Navigate to the shape_data_CDK2 folder in your Working Directory and select the shape_data_cdk2.bin file.
  4. Click Open.
  5. Click Screening Settings.
    • The Shape Screening Settings dialog box opens.
  6. For Maximum number of structures to save, type 1000.
  7. Click OK.
  8. Click the Job Settings (cog) button.
    • The Shape Screening - Job Settings dialog box opens.

If you check Include PDF report, a PDF report of the top X matched ligands is generated along with their alignment to a single probe. If you have selected multiple probes, only the first probe is run and its results are shown in the PDF.

Figure 4-4. Running the Shape Screening job.

  1. For Incorporate, select Append new entries as a new group.
  2. Click OK.
  3. Change the Job name to cdk2-shape-screen.
  4. Select your desired GPU host.
  5. Click Run.
    • This job takes a few seconds to complete.
    • Results from the screen can also be found in the cdk2-shape-screen folder in the tutorial files.
  6. Close the Shape Screening panel.

 

Note: Jobs can be split across several GPUs from the command line. One GPU is used by default. Job runtime varies depending on the graphics card used.

5. Analyzing the Shape Screening Results

To assess the performance of the Shape screen, we will examine several enrichment metrics. These metrics help quantify a screening method’s effectiveness in differentiating true actives from decoys.

5.1 Analyze enrichment for a single probe

Figure 5-1. Importing the pre-generated Shape Screening results.

  1. Go to File > Import Structures.
    • The Import panel opens.
  2. Navigate to the cdk2-shape-screen folder in your Working Directory and select cdk2-shape-screen-out.maegz.
  3. Click Open.
    • A banner appears and a group is added to the Entries.
  4. Select the top group titled cdk2_1aq1_ (query 1).

Figure 5-2. The Enrichment Calculator in Receptor-Based Virtual Screening.

  1. Go to Tasks > Browse > Receptor-Based Virtual Screening > Enrichment Calculator.
    • The Enrichment Calculator panel opens.

 

 

 

Figure 5-3. Setting up the Enrichment Calculator job.

  1. For Use structures from, select Project Table (1000 selected).
  2. For Actives file, click Browse.
  3. Navigate to and select cdk2-true-active-titles.txt in your Working Directory.
  4. Click Open.
  5. For Number of decoys, type 19757.

Figure 5-4. Running the Enrichment Calculator job.

  1. Change the Job name to cdk2-enrichment-1aq1.
  2. Click Run.
    • The Enrichment Report populates with data within a few seconds.

 

Figure 5-5. The ROC Plot.

  1. Click ROC Plot.
    • The ROC Plot illustrates the performance of the screen in terms of active recovery compared with random selection.
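
For readers who want to see what such metrics measure, the following Python sketch computes an enrichment factor and the points of a ROC curve from a ranked hit list. It is a generic illustration, not the Enrichment Calculator's implementation, and the function names and toy titles are hypothetical. Because only the top-ranked structures are saved by the screen, the total library size has to be supplied separately, which is why the panel asks for the number of decoys (19757 in this tutorial).

```python
from typing import List, Set, Tuple


def enrichment_factor(ranked_titles: List[str], actives: Set[str],
                      total_compounds: int, fraction: float) -> float:
    """Ratio of the active rate in the top fraction to the active rate in the library."""
    n_top = max(1, round(fraction * total_compounds))
    if n_top > len(ranked_titles):
        raise ValueError("hit list is shorter than the requested top fraction")
    found = sum(1 for title in ranked_titles[:n_top] if title in actives)
    return (found * total_compounds) / (n_top * len(actives))


def roc_points(ranked_titles: List[str], actives: Set[str],
               n_decoys: int) -> List[Tuple[float, float]]:
    """(false positive rate, true positive rate) after each ranked hit."""
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for title in ranked_titles:
        if title in actives:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_decoys, true_pos / len(actives)))
    return points


# Toy library: 4 actives and 96 decoys, with only the top 10 hits saved.
toy_actives = {"a1", "a2", "a3", "a4"}
toy_ranked = ["a1", "d1", "a2", "d2", "d3", "a3", "d4", "d5", "d6", "d7"]

ef_top_5_percent = enrichment_factor(toy_ranked, toy_actives, 100, 0.05)
curve = roc_points(toy_ranked, toy_actives, n_decoys=96)
```

An enrichment factor above 1 means actives are concentrated near the top of the ranking relative to random selection, and a ROC curve above the diagonal shows the same thing across the whole ranked list.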

 

 

5.2 Analyze enrichment across all probes

Figure 5-6. Selecting all the Query groups in the Entries.

To illustrate the impact of using additional probe molecules on the performance, we will merge the results across all of the probe molecules and repeat the enrichment calculation.

  1. In the Entries, right-click the cdk2_1aq1__ (query 1) header and choose Collapse > All.
  2. Expand the cdk2-shape-screen group and Ctrl+Click (Cmd+Click) to select each of the query groups.

Figure 5-7. Duplicating selected structures.

  1. Right-click the selection and choose Duplicate > Into New Group.
    • The Duplicate into New Group dialog box opens.

 

 

Figure 5-8. Duplicating the group.

  1. For New group title, type cdk2-screen-merge.
  2. For Location of new group, choose At top level.
  3. Click Duplicate.
    • A new group, cdk2-screen-merge, is added to the end of the Entries table.

Figure 5-9. Ungrouping the groups.

  1. Select the cdk2-screen-merge group by clicking on the group heading.
  2. Right-click the selection and choose Ungroup groups.

Note: This is a shortcut to ungroup the groups for each of the individual probes so we can group them all together in the same group (without any subgroups) in the next step.

Figure 5-10. Re-grouping the outputs.

  1. Right-click the selection and choose Group.
    • The Group Selected Entries & Groups dialog box opens.

Figure 5-11. Creating group.

 

 

  1. For New group title, type cdk2-screen-merge.
  2. For Location of new group, choose At top level and First selected row.
  3. Click Create Group.

Figure 5-12. The Show Property option.

  1. At the top right corner of the Entries, click the Change table settings button (three vertical dots).
  2. Choose Show Property.
    • The Show Properties in Table dialog box opens.

 

Figure 5-13. Adding Shape Sim property to the Entries table.

  1. In the dialog box, click Choose.
  2. Search for and select Shape Sim from the list.
  3. Click OK.
    • The Shape Sim property is added to the Entries table.
    • You may need to click and drag the edge of the table towards the right to see the added property.

Figure 5-14. Sorting structures by Shape Similarity Score.

  1. Confirm the cdk2-screen-merge group is selected in the Entries.
  2. Right-click the Shape Sim column header and choose Sort Selected (Descending).

 

Note: All of the grouping and ungrouping was necessary to get the output from all of the probes together in the same group so they could be sorted by Shape Sim.
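
The same merge-and-sort idea can be written as a short Python sketch, assuming each probe's results are available as (title, Shape Sim) pairs. The function name and the toy data are hypothetical. Keeping only the best score for a compound that is hit by several probes is an assumption of this sketch; the steps above simply place all hits in one group and sort them.

```python
from typing import Dict, List, Tuple


def merge_probe_hits(
    hits_by_probe: Dict[str, List[Tuple[str, float]]]
) -> List[Tuple[str, float]]:
    """Combine per-probe hit lists, keep each title's best score, sort descending."""
    best: Dict[str, float] = {}
    for hits in hits_by_probe.values():
        for title, shape_sim in hits:
            if title not in best or shape_sim > best[title]:
                best[title] = shape_sim
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy results for two probes (made-up titles and scores).
toy_hits = {
    "query 1": [("lig_x", 0.62), ("lig_y", 0.40)],
    "query 2": [("lig_y", 0.71), ("lig_z", 0.55)],
}
merged = merge_probe_hits(toy_hits)
```

A compound that resembles any one of the probes rises toward the top of the merged ranking, which is why adding diverse probes tends to recover more actives.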

Figure 5-15. Running the Enrichment Calculator job.

  1. Open the Enrichment Calculator panel.
  2. Change Job name to cdk2-enrichment-merge.
  3. Click Run.
    • The Enrichment Report populates with data within a few seconds.

 

Note: There is a clear improvement across all enrichment metrics compared to using a single probe alone.

Figure 5-16. The ROC Plot.

  1. Click ROC Plot.

 

Note: The ROC Plot improves considerably with the addition of the remaining probes.

6. Conclusion and References

In this tutorial, you screened over 20,000 compounds using GPU Shape. Diverse probe molecules were selected by clustering the known actives and retrieving representative structures. You then used those probes to screen the library of compounds and evaluated the enrichment first with just one of the probes and then with all 10 probes.

7. Glossary of Terms

Entries - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion

included - the entry is represented in the Workspace, the circle in the In column is blue

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data

Scratch Project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project

selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entries (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries

Working Directory - the location that files are saved

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed