Designing Out Common ADMET Liabilities using Consensus IFD-MD

Tutorial Created with Software Release: 2025-2
Topics: Hit-to-Lead & Lead Optimization, Small Molecule Drug Discovery, Structure Prediction & Target Enablement
Products Used: FEP+, IFD-MD

Tutorial files: 58 MB

This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms are shown like this: Workspace
Abstract:

 

Employing ensemble receptor structures for an IFD-MD calculation ensures coverage of large receptor conformational changes. This technique, called Consensus IFD-MD, can be applied to a series of common off-targets (CYP3A4, CYP2D6, PXR, and hERG) to use all public structural information to build an accurate structural model that could be used to rationalize binding with the off-target.

 

In a 2012 paper, Cumming et al. describe the development of ligands with improved selectivity between their on-target protein and hERG. The authors mention that serendipity played a role in this process. In this tutorial we will see if we can create a rational structure-based hypothesis instead, using Consensus IFD-MD to generate potential poses and FEP+ to evaluate the quality of those poses for a series of congeneric ligands.

 

Tutorial Content
  1. Introduction to Consensus IFD-MD
  2. Creating Projects and Importing Structures
  3. Setting up and submitting the Consensus IFD-MD Job
  4. Running and Analyzing Consensus Binding Mode Analysis
  5. Conclusion and References
  6. Glossary of Terms

1. Introduction to Consensus IFD-MD

Consensus IFD-MD assumes that congeneric ligands share one binding mode in the protein. Using this assumption, Consensus IFD-MD independently docks two to three congeneric ligands into an ensemble of protein structures and returns only the predicted binding modes that satisfy it.

Employing ensemble receptor structures ensures coverage of large receptor conformational changes, but it can also substantially increase the computational cost of IFD-MD. To mitigate this additional burden, consensus IFD-MD skips the metadynamics step, which reduces GPU resource usage to approximately 1/10 of that of full IFD-MD. The metadynamics step is only used for scoring in IFD-MD, meaning that its exclusion does not affect the quality of the poses returned, only their ranking. The consensus binding mode assumption compensates for this reduced scoring accuracy by discarding poses that are not consistent across the congeneric series.

In this tutorial, we will use Consensus IFD-MD to model hERG inhibition for three ligands, docking them into 7 hERG receptor conformations. The consensus binding mode assumption allows us to reduce the computational cost of this task, with the final computational burden being roughly equivalent to the GPU cost of running two full IFD-MD jobs (rather than 21).
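The cost estimate above follows from simple arithmetic, sketched here in Python. The variable names are illustrative, and the 1/10 factor is the approximate GPU saving quoted earlier in this section, not a measured value.

```python
# Job-count and GPU-cost arithmetic for the hERG example in this tutorial.
n_ligands = 3          # congeneric ligands docked
n_receptors = 7        # hERG receptor conformations
gpu_fraction = 1 / 10  # approximate GPU cost of a consensus subjob relative to full IFD-MD

# Every ligand is docked into every receptor conformation.
n_subjobs = n_ligands * n_receptors

# Express the total GPU cost in units of "one full IFD-MD job".
full_job_equivalents = n_subjobs * gpu_fraction

print(f"IFD-MD subjobs: {n_subjobs}")
print(f"GPU cost in full IFD-MD job equivalents: {full_job_equivalents:.1f}")
```

Three ligands times seven receptors gives the 21 subjobs mentioned above, and a tenth of that is roughly two full IFD-MD jobs.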

Figure 1. Consensus IFD-MD Workflow

Consensus Binding Mode Analysis (CBMA) finds and clusters common binding modes among the docked congeneric ligands from all the independently run IFD-MD jobs. The clustering is performed based on the Maximum Common Substructure (MCS) ligand RMSD, ensuring that the poses with similar binding modes are grouped together.

For more information about IFD-MD in general, please see the IFD-MD panel documentation as well as the IFD-MD Best Practices document.

2. Creating Projects and Importing Structures

At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is created, the project is automatically saved each time a change is made.

Structures can be imported from the PDB directly, or from your Working Directory using File > Import Structures, and are added to the Entries and Project Table. Entries are located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.

  1. Double click the Maestro icon to start Maestro

Figure 2-1. Change Working Directory option.

  1. Go to File > Change Working Directory
  2. Find your directory, and click Choose
  3. Pre-generated input and results files are included for running jobs or examining output. Download the zip file here: https://www.schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/consensus_ifd-md.zip
  4. After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial

Figure 2-2. Save Project panel.

A project (.prj) can be saved with whatever name you prefer. In this instance (and quite often), we will name the project the same as the name of the working directory.

 

  1. Go to File > Save Project As
  2. Change the File name to consensus_IFD-MD
  3. Click Save
    • The project is named consensus_IFD-MD.prj

3. Setting up and submitting the Consensus IFD-MD Job

In the tutorial zip archive, you will find the prepared structures of three ligands from the Cumming et al. publication detailing the design of CCR2 antagonists with improved selectivity over hERG. After importing them into Maestro, you can directly load them into the IFD-MD panel to set up the Consensus IFD-MD job. As Consensus IFD-MD relies on pre-prepared protein structures that are shipped with the software, all you will need to do is select the target protein of interest: CYP3A4, CYP2D6, PXR, or hERG.

For compute-intensive jobs such as IFD-MD, you will need to either write out the calculation input files and transfer them to a remote cluster for running or use Job Server to handle the job submission. Here, we show how to submit the calculation from your local computer to a remote cluster using Job Server.

The only input needed for the consensus IFD-MD job is a set of two to three ligands from a congeneric series with activity data. You should choose the most potent ligand(s) from the series, and if there are multiple ligands with similar potency, choose the largest ones. The ligands should be prepared with LigPrep.

Figure 3-1. Import ligands.

To set up the job, we need to load in the set of ligands.

 

  1. Go to File > Import Structures
  2. Select hERG_congeneric_ligands.meagz and click Open
    • A set of congeneric ligands from Cumming et al. is added to the Entries

Consensus IFD-MD calculations use a suite of public protein structures for each off-target that are shipped with the software. Currently, we use 28 structures for PXR, 7 structures for hERG, 71 structures for CYP3A4, and 13 structures each for CYP2D6 and CYP2C9. For hERG, the consensus IFD-MD protocol also uses a membrane.

You can find the receptor structures and pre-calculated WaterMaps in $SCHRODINGER/psp-vX.x/data/consensus_ifd-md. The receptor structures we use are holo structures generated from ligand-bound structures published in the PDB.

Figure 3-2. Load the input ligands into the panel and specify target protein.

  3. Navigate to Tasks > Receptor-Based Virtual Screening > IFD-MD
    • The IFD-MD panel opens
  4. Under Dock into target protein, choose Multiple congeneric ligands
  5. Next to Target Ligand, choose Project Table (3 selected) and click Load
    • The three ligands that we just imported are loaded into the panel
  6. For Target protein, choose hERG.
    • All available public hERG structures (7 total) will be considered for this simulation.

The warnings next to each ligand name alert you that torsions are present in that ligand that are not parametrized in the default OPLS force field.

You should parametrize missing torsions for your ligands using the Force Field Builder. A custom parameter file (custom_2025_2.opls) is included with the tutorial, which you can now load in.

  7. Next to Forcefield, check Use customized version
  8. Click the three dots
    • The Set Custom Parameters Location dialog opens.
  9. In the dropdown, click Browse and navigate to the directory where you extracted the tutorial files.
  10. Click OK.
    • The torsion check on the ligands runs again, with checkmarks showing that all torsions are parametrized.

Note: The setting in the Set Custom Parameters Location dialog is specific to this Maestro project. You can click Manage Default to change the global preference for custom forcefield parameters. Clicking Customize opens the Force Field Builder.

Figure 3-6. Choose the CPU and GPU hosts.

This is a very computationally intensive job. It uses a Driver host to keep track of the many individual subjobs, which run on both CPUs and GPUs. You must specify the number of hosts of each type that the job should use.

  1. Click the (cog) icon
  2. Choose the Driver Host and the number of processors to use (if possible, use two processors).
  3. Choose the CPU subhost and number of processors (we recommend at least 64 CPUs)
  4. Choose the GPU subhost and number of processors (we recommend at least 4 GPUs)
  5. Click OK.

The total number of CPU processors determines how many IFD-MD jobs run concurrently (e.g., if it is set to 10, 10 IFD-MD jobs will be submitted to run concurrently by default). See the IFD-MD Best Practices for more information regarding the recommended number of CPUs and GPUs and how IFD-MD handles license checkout.

Figure 3-7. Optional: Write out the input files.

Due to the large number of jobs run as part of this simulation and their computational cost, you cannot launch the job directly from the GUI.

  1. Click Write to write out the input and job setup files to the Working Directory.
  2. Follow the instructions in the panel to copy the job to the machine where it will run and launch it.

 

Note: Each of the 21 subjobs in this calculation takes approximately 400 CPU hours and 14 GPU hours (on an NVIDIA T4).
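To put the per-subjob figures in context, the following Python sketch totals them and derives an idealized wall-clock lower bound. The host counts (64 CPUs, 4 GPUs) are the minimum recommendations from the host settings above. Perfect parallel scaling is an assumption made only for illustration, and real runs will take longer.

```python
# Totals for the hERG consensus IFD-MD run, from the per-subjob estimates in the note above.
n_subjobs = 21
cpu_hours_per_subjob = 400
gpu_hours_per_subjob = 14  # quoted for an NVIDIA T4

total_cpu_hours = n_subjobs * cpu_hours_per_subjob
total_gpu_hours = n_subjobs * gpu_hours_per_subjob

# Idealized lower bounds on wall-clock time (assumption: perfect scaling, no queueing).
n_cpus = 64
n_gpus = 4
min_cpu_wall_hours = total_cpu_hours / n_cpus
min_gpu_wall_hours = total_gpu_hours / n_gpus

print(f"Total CPU hours: {total_cpu_hours}")
print(f"Total GPU hours: {total_gpu_hours}")
print(f"CPU-bound wall-clock lower bound: {min_cpu_wall_hours:.1f} h on {n_cpus} CPUs")
print(f"GPU-bound wall-clock lower bound: {min_gpu_wall_hours:.1f} h on {n_gpus} GPUs")
```

Estimates like these can help you decide whether to request more than the minimum recommended resources.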

The ifd-md_consensus.sh script written as part of the input files calls the consensus IFD-MD backend (ifd-md_consensus.py), which provides an overview of unsubmitted, running, completed, and failed jobs. Each time you run it, the <jobname>_launch_batch.sh script is updated to include the jobs that need to be (re)submitted next.

 

In some cases, consensus IFD-MD needs to run ~250 individual IFD-MD jobs, and submitting and running all of them at once may cause trouble in limited compute environments. The command line option -max_num_jobs NUM_JOBS can be used with ifd-md_consensus.py to limit the number of jobs launched simultaneously. For example, consensus IFD-MD requires 252 IFD-MD jobs for PXR analysis, and supplying -max_num_jobs 25 to ifd-md_consensus.py limits it to running 25 jobs at once.

4. Running and Analyzing Consensus Binding Mode Analysis

Consensus Binding Mode Analysis in the context of Consensus IFD-MD operates under the assumption that congeneric ligands share the same binding mode (using an MCS RMSD cutoff of 1.5 Å). We can therefore throw out poses that might score well according to the scoring function, but that are not consistent across the provided ligands. This is a post-processing step after completing the Multiple congeneric ligands IFD-MD job to prepare for the FEP+ validation that will ultimately be used to determine the most prospectively useful model.
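The idea of the consensus filter can be illustrated with a short Python sketch. This is not the implementation used by the software. It is a minimal illustration that assumes the maximum common substructure atoms have already been matched between ligands, and that all poses are expressed in a common receptor frame, so the RMSD is computed in place without superposition. The function names and the toy coordinates are hypothetical.

```python
import math
from itertools import product

Coords = list[tuple[float, float, float]]


def mcs_rmsd(a: Coords, b: Coords) -> float:
    """In-place RMSD over matched MCS atoms (no superposition is performed)."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def consensus_sets(poses_by_ligand: dict[str, dict[str, Coords]], cutoff: float = 1.5):
    """Return pose combinations (one pose per ligand) whose pairwise MCS RMSDs are all <= cutoff."""
    ligands = sorted(poses_by_ligand)
    hits = []
    for combo in product(*(poses_by_ligand[lig].items() for lig in ligands)):
        coords = [c for _, c in combo]
        consistent = all(
            mcs_rmsd(coords[i], coords[j]) <= cutoff
            for i in range(len(coords))
            for j in range(i + 1, len(coords))
        )
        if consistent:
            hits.append(tuple(name for name, _ in combo))
    return hits


# Toy coordinates (in Å) for two MCS atoms per pose; illustrative only.
poses = {
    "ligA": {"p1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
             "p2": [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]},
    "ligB": {"q1": [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]},
}
shared = consensus_sets(poses)
print(shared)
```

In this toy case, pose p2 of ligA is discarded because no pose of ligB lies within 1.5 Å of it, regardless of how well p2 might have scored.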

Consensus Binding Mode Analysis is performed using the same command as for the Consensus IFD-MD job, just with an additional flag (-analyze) added. This step is also much less computationally costly than the consensus IFD-MD simulations, so one driver host and one CPU suffice. You can edit the script on the remote machine to use fewer hosts and run it there, or configure the job settings in the panel to run it on your local machine after downloading the results of the consensus IFD-MD job. In this tutorial, we have provided the relevant consensus IFD-MD output files in the zip archive so you can run the analysis job on your local machine.

 

This analysis does not require that all individual IFD-MD jobs have completed, so you can examine the results even if some jobs are still running or have failed.

Figure 4-1. Reconfigure the job to run the analysis locally.

  1. Optional: If you closed or reset the IFD-MD panel after setting up the consensus IFD-MD job, follow steps 4-10 of section 3 to load in the ligands, configure the force field, and set the target protein.
  2. Click the (cog) icon
  3. Set the Driver Host, CPU subhost, and GPU subhost to localhost, using 1 CPU/GPU.
  4. Click OK.

Note: While the command that you write out will have a defined GPU subhost, the Consensus Binding Mode Analysis job does not use GPUs, so it can run on CPUs alone.

Figure 4-4. Add the -analyze flag to the shell script.

  1. Open the ifd-md_consensus.sh script from your Working Directory with your desired text editor.
  2. Add the -analyze flag to the script.
  3. Save and close the file.

 

 

Figure 4-5. Launch the Consensus Binding Mode Analysis job.

Now that you have edited the shell script, you need to move it to the same parent directory as the results from the consensus IFD-MD simulation and run the job. For this tutorial, we have provided the relevant part of the output files of the job.

 

  1. Move the ifd-md_consensus.sh file to the provided IFD-MD_hERG directory.
  2. Open up your terminal and set your directory to the IFD-MD_hERG directory.
  3. Make sure that your SCHRODINGER environment variable is properly set.
    • See this knowledge base article for guidance.
  4. In the terminal window, enter ./ifd-md_consensus.sh
    • This will launch the Consensus Binding Mode Analysis job, which will take approximately 25 minutes on a single CPU.
    • Once the analysis is complete, the job outputs results to IFD-MD_hERG_consensus-out.maegz
    • While you can wait for the job to complete, we have also provided pre-generated results that we will visualize in Maestro.

 

Figure 4-6. Import pre-generated outputs of the Consensus Binding Mode Analysis.

 

 

Import the results into Maestro to visualize the outputs from the Consensus Binding Mode Analysis.

 

  1. Go to File > Import Structures
  2. Select IFD-MD_hERG_consensus-out.maegz and click Open
    • The outputs from the Consensus Binding Mode Analysis are added to the Entries.

Consensus IFD-MD outputs are structured in the following way:

In the IFD-MD_hERG_dir folder, there is a subfolder for each individual IFD-MD subjob (e.g. “herg_HERG2_chembl2036754_2”). The naming scheme of each folder is <target>_<target><structure index>_<ligand name>_<ligand index>.
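If you want to organize subjob folders programmatically, the naming scheme above can be split into its parts. The following Python sketch is a hypothetical helper, not part of the software. It assumes the second field is the target name (in any letter case) followed directly by the structure index, and that the ligand index is the final underscore-separated field.

```python
from typing import NamedTuple


class Subjob(NamedTuple):
    target: str
    structure_index: int
    ligand_name: str
    ligand_index: int


def parse_subjob_name(name: str) -> Subjob:
    """Split <target>_<target><structure index>_<ligand name>_<ligand index>."""
    target, rest = name.split("_", 1)
    structure, rest = rest.split("_", 1)
    # Split from the right so that ligand names containing underscores stay intact.
    ligand_name, ligand_index = rest.rsplit("_", 1)
    if not structure.lower().startswith(target.lower()):
        raise ValueError(f"structure field {structure!r} does not start with target {target!r}")
    structure_index = int(structure[len(target):])
    return Subjob(target, structure_index, ligand_name, int(ligand_index))


example = parse_subjob_name("herg_HERG2_chembl2036754_2")
print(example)
```

Using the first field to locate the structure index keeps the parsing unambiguous for targets whose names contain digits.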

As the IFD-MD job progresses through its various stages, it writes outputs and temporary files to the <subjob-name>_workdir. The tutorial zip archive does not contain the full outputs of all individual IFD-MD jobs to keep the file size manageable, but the folder for the herg_HERG5_chembl2036755_3 job is provided as an example. Of particular interest is the fep_input_gen_dir folder, where you can find FEP-ready output structures for that ligand-receptor complex (including membrane for hERG).

 

Running consensus IFD-MD in analysis mode performs the consensus binding mode analysis on all IFD-MD poses from the individual subjobs, discarding any that are not shared by the ligands (based on RMSD). The results are aggregated in the <jobname>_consensus_out.maegz file.

Each entry in that file is named based on the rank of the pose in its corresponding IFD-MD job. For example, the entry named “rank 4 in herg_HERG5_chembl2036755_3-out.maegz” is the fourth-ranked pose by score in the IFD-MD output file “herg_HERG5_chembl2036755_3-out.maegz” of the IFD-MD job “herg_HERG5_chembl2036755_3”.
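The entry naming convention can be decoded with a small regular expression, as in the Python sketch below. The helper name is hypothetical, and the pattern assumes that every entry title follows exactly the "rank N in <job>-out.maegz" form shown in the example above.

```python
import re

# Matches titles such as "rank 4 in herg_HERG5_chembl2036755_3-out.maegz".
_ENTRY_TITLE = re.compile(r"^rank (\d+) in (.+)-out\.maegz$")


def parse_entry_title(title: str) -> tuple[int, str, str]:
    """Return (rank, IFD-MD job name, IFD-MD output file name) for a consensus entry title."""
    match = _ENTRY_TITLE.match(title.strip())
    if match is None:
        raise ValueError(f"unrecognized entry title: {title!r}")
    rank = int(match.group(1))
    job_name = match.group(2)
    return rank, job_name, f"{job_name}-out.maegz"


rank, job_name, output_file = parse_entry_title("rank 4 in herg_HERG5_chembl2036755_3-out.maegz")
print(rank, job_name, output_file)
```

The job name recovered this way tells you which subjob folder holds the corresponding outputs.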

Figure 4-7. Binding mode comparison Preset.

  1. Right-click the IFD-MD_hERG_consensus_out group heading and choose Include.
  2. Click Presets and select Binding Mode Comparison
    • The structures are now rendered according to the Binding Mode Comparison preset which is particularly helpful for visualizing the output of IFD-MD.

 

Above you can see the top 5 poses output from the Consensus Binding Mode Analysis job overlaid. You cannot rely on the consensus IFD-MD score to rank the poses, due to the reduced accuracy of the scoring from skipping the metadynamics simulation.

Rather, you should evaluate the accuracy and the utility of the above models with Relative Binding Free Energy Perturbations (RB-FEP). The correlation between the binding affinity as predicted by FEP and measured in the experimental reference depends on the input pose, so the performance of the FEP model can be used to identify the best performing IFD-MD output.
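The model-selection logic described above can be sketched in Python: compute an agreement metric between predicted and experimental binding free energies for each candidate model, then pick the model with the best agreement. All numbers below are toy values invented for illustration and are not results from this tutorial. The model names and the choice of RMSE as the selection criterion are also assumptions.

```python
import math


def rmse(pred: list[float], exp: list[float]) -> float:
    """Root-mean-square error between predicted and experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, exp)) / len(exp))


def pearson_r(pred: list[float], exp: list[float]) -> float:
    """Pearson correlation coefficient between predicted and experimental values."""
    n = len(exp)
    mean_p, mean_e = sum(pred) / n, sum(exp) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, exp))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_e = sum((e - mean_e) ** 2 for e in exp)
    return cov / math.sqrt(var_p * var_e)


# Toy binding free energies in kcal/mol; NOT data from the tutorial.
experimental = [-9.0, -8.0, -7.0, -6.0]
predictions = {
    "model_a": [-9.2, -7.9, -7.3, -5.8],  # tracks experiment closely
    "model_b": [-7.0, -8.5, -6.0, -8.0],  # poorly correlated with experiment
}

for name, pred in predictions.items():
    print(f"{name}: RMSE = {rmse(pred, experimental):.2f}, R = {pearson_r(pred, experimental):.2f}")

best_model = min(predictions, key=lambda m: rmse(predictions[m], experimental))
print(f"Best model by RMSE: {best_model}")
```

In practice you would take the predicted and experimental values from the FEP+ results of each model rather than typing them in by hand.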

 

To perform this validation, use each of the models returned from the consensus binding mode analysis to launch independent FEP+ jobs, each run with 23 single state compounds of a congeneric series from Cumming et al. Each subjob from the consensus IFD-MD run contains FEP-ready structures in the <subjob_name>_workdir/fep_input_gen_dir directory.
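To collect the FEP-ready structure folders across many subjobs, you could search the job directory for the path pattern described above. The Python sketch below is a hypothetical helper. It assumes only that each folder of interest is named fep_input_gen_dir and sits directly inside a folder whose name ends in _workdir, and it makes no assumption about how deeply those folders are nested.

```python
from pathlib import Path


def find_fep_input_dirs(root: str) -> list[Path]:
    """Return all <subjob_name>_workdir/fep_input_gen_dir folders found under root."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        path
        for path in root_path.rglob("fep_input_gen_dir")
        if path.is_dir() and path.parent.name.endswith("_workdir")
    )


# Example: search the directory provided with the tutorial files, if it is present.
for fep_dir in find_fep_input_dirs("IFD-MD_hERG"):
    print(fep_dir)
```

If the directory does not exist, the function returns an empty list instead of raising an error.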

For more information on setting up and running RB-FEP jobs with FEP+ please reference the BACE1 Inhibitor Design Using Free Energy Perturbation tutorial, Preparing Protein and Ligand Structures for FEP+, Troubleshooting Common Issues, and the FEP+ Checklist.

 

As this analysis is beyond the scope of this tutorial, we will only discuss the results in brief.

You can use the fep_stats_printer job ($SCHRODINGER/run -FROM psp fep_stats_printer.py *.fmp) to pull relevant statistics from a series of .fmp files. The results of this analysis for the RB-FEP outputs can be found below:

 

The FEP+ scatter plot associated with the herg_HERG5_chembl2036755_3_rank_04_out.fmp model, the best performing model according to the chart above, can be found below:

The binding pose that corresponds to the above model shows that the positively charged (+1) amine on the cyclohexanamine of the ligand is predicted to form pi-cation interactions (green dotted lines) with the two TYR652 residues on chain A and chain D. These have been described in the literature (see Asai et al.) as a selectivity filter.

5. Conclusion and References

For further learning:

6. Glossary of Terms

Entries - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion.

Favorites Toolbar - buttons for tasks designated as favorites in the Task Tool. You can add panels to your Favorites Toolbar by checking the star icon beside the panel name in Tasks.

included - the entry is represented in the Workspace; the circle in the In column is blue.

Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data.

Scratch project - a temporary project in which work is not saved; closing a scratch project removes all current work and begins a new scratch project.

selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entries (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries.

Working Directory - the location where files are saved.

Workspace - the 3D display area in the center of the main window, where molecular structures are displayed.