Designing Out Common ADMET Liabilities using Consensus IFD-MD
Tutorial Created with Software Release: 2025-2
Topics: Hit-to-Lead & Lead Optimization, Small Molecule Drug Discovery, Structure Prediction & Target Enablement
Products Used: FEP+, IFD-MD
This tutorial is written for use with a 3-button mouse with a scroll wheel.
Words found in the Glossary of Terms are shown like this: Workspace
Abstract:
Employing ensemble receptor structures for an IFD-MD calculation ensures coverage of large receptor conformational changes. This technique, called Consensus IFD-MD, can be applied to a series of common off-targets (CYP3A4, CYP2D6, PXR, and hERG) to use all public structural information to build an accurate structural model that could be used to rationalize binding with the off-target.
In their 2012 paper, Cumming et al. describe the development of ligands with improved selectivity between their on-target protein and hERG. The authors mention that serendipity played a role in this process. In this tutorial, we will see whether we can instead create a rational structure-based hypothesis, using Consensus IFD-MD to generate potential poses and FEP+ to evaluate the quality of those poses for a series of congeneric ligands.
Tutorial Content
1. Introduction to Consensus IFD-MD
Consensus IFD-MD assumes that congeneric ligands share one binding mode in the protein. Utilizing this assumption, consensus IFD-MD independently docks two to three congeneric ligands into an ensemble of protein structures and only returns predicted binding modes that satisfy this assumption.
Employing ensemble receptor structures ensures coverage of large receptor conformational changes, but it can also result in a substantial increase in the computational cost for IFD-MD. To mitigate the additional computational burden, consensus IFD-MD skips the metadynamics step for computational efficiency resulting in approximately 1/10 of the GPU resource usage compared to full IFD-MD. The IFD-MD metadynamics step is only used for scoring in IFD-MD, meaning that its exclusion does not affect the quality of the poses returned, only their ranking. The use of the consensus binding mode assumption compensates for this reduced scoring accuracy by discarding poses that are not consistent across the congeneric series.
In this tutorial, we will use Consensus IFD-MD to model hERG inhibition for three ligands, docking them into 7 hERG receptor conformations. The consensus binding mode assumption allows us to reduce the computational cost of this task, with the final computational burden being roughly equivalent to the GPU cost of running two full IFD-MD jobs (rather than 21).
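The job count and the GPU estimate quoted above follow from simple arithmetic. The sketch below only restates the numbers given in the text (3 ligands, 7 receptor conformations, roughly 1/10 of the GPU usage per subjob); the function name is illustrative and is not part of any Schrödinger tool.

```python
# Values taken from the tutorial text; the 1/10 factor is approximate.
N_LIGANDS = 3
N_RECEPTOR_CONFORMATIONS = 7
GPU_COST_DIVISOR = 10  # consensus subjob uses ~1/10 the GPU of full IFD-MD


def consensus_gpu_cost(n_ligands, n_receptors, divisor=GPU_COST_DIVISOR):
    """Return (number of subjobs, GPU cost in full IFD-MD job equivalents)."""
    n_subjobs = n_ligands * n_receptors
    return n_subjobs, n_subjobs / divisor


n_subjobs, full_job_equivalents = consensus_gpu_cost(
    N_LIGANDS, N_RECEPTOR_CONFORMATIONS
)
print(n_subjobs, full_job_equivalents)
```

With these inputs the result is 21 subjobs at a GPU cost of about two full IFD-MD jobs, matching the figures above.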
Figure 1. Consensus IFD-MD Workflow
Consensus Binding Mode Analysis (CBMA) finds and clusters common binding modes among the docked congeneric ligands from all the independently run IFD-MD jobs. The clustering is performed based on the Maximum Common Substructure (MCS) ligand RMSD, ensuring that the poses with similar binding modes are grouped together.
For more information about IFD-MD in general, please see the IFD-MD panel documentation as well as the IFD-MD Best Practices document.
2. Creating Projects and Importing Structures
At the start of the session, change the file path to your chosen Working Directory in Maestro to make file navigation easier. Each session in Maestro begins with a default Scratch Project, which is not saved. A Maestro project stores all your data and has a .prj extension. A project may contain numerous entries corresponding to imported structures, as well as the output of modeling-related tasks. Once a project is created, the project is automatically saved each time a change is made.
Structures can be imported from the PDB directly, or from your Working Directory using File > Import Structures, and are added to the Entries and Project Table. Entries are located to the left of the Workspace. The Project Table can be accessed by Ctrl+T (Cmd+T) or Window > Project Table if you would like to see an expanded view of your project data.
- Double click the Maestro icon to start Maestro
- (No icon? See Starting Maestro)
- Go to File > Change Working Directory
- Find your directory, and click Choose
- Pre-generated input and results files are included for running jobs or examining output. Download the zip file here: https://www.schrodinger.com/sites/default/files/s3/release/current/Tutorials/zip/consensus_ifd-md.zip
- After downloading the zip file, unzip the contents in your Working Directory for ease of access throughout the tutorial
A project (.prj) can be saved with whatever name you prefer. In this instance (and quite often), we will name the project the same as the name of the working directory.
- Go to File > Save Project As
- Change the File name to consensus_IFD-MD
- Click Save
- The project is named consensus_IFD-MD.prj
3. Setting up and submitting the Consensus IFD-MD Job
In the tutorial zip archive, you will find the prepared structures of three ligands from the Cumming et al. publication detailing the design of CCR2 antagonists with improved selectivity over hERG. After importing them into Maestro, you can directly load them into the IFD-MD panel to set up the Consensus IFD-MD job. As Consensus IFD-MD relies on pre-prepared protein structures that are shipped with the software, all you will need to do is select the target protein of interest: CYP3A4, CYP2D6, PXR, or hERG.
For compute intensive jobs such as IFD-MD, you will need to either write out the calculation input files and transfer them to a remote cluster for running or use Job Server to handle the job submission. Here, we show how to submit the calculation from your local computer to a remote cluster using Job Server.
The only input needed for the consensus IFD-MD job is a set of 2-3 ligands from a congeneric series with activity data. You should choose the most potent ligand(s) from the series, and if there are multiple ligands with similar potency, choose the largest ones. The ligands should be prepared with LigPrep.
To set up the job, we need to load in the set of ligands.
- Go to File > Import Structures
- Select hERG_congeneric_ligands.maegz and click Open
- A set of congeneric ligands from Cumming et al. is added to the Entries
Consensus IFD-MD calculations use a suite of public protein structures for each off-target that are shipped with the software. Currently, we use 28 structures for PXR, 7 structures for hERG, 71 structures for CYP3A4, and 13 structures each for CYP2D6 and CYP2C9. For hERG, the consensus IFD-MD protocol also uses a membrane.
You can find the receptor structures and pre-calculated WaterMaps in $SCHRODINGER/psp-vX.x/data/consensus_ifd-md. The receptor structures we use are holo structures generated from ligand-bound structures published in the PDB.
- Navigate to Tasks > Receptor-Based Virtual Screening > IFD-MD
- The IFD-MD panel opens
- Under Dock into target protein, choose Multiple congeneric ligands
- Next to Target Ligand, choose Project Table (3 selected) and click Load
- The three ligands that we just imported are loaded into the panel
- For Target protein, choose hERG.
- All available public hERG structures (7 total) will be considered for this simulation.
The warnings next to each ligand name alert you that torsions are present in that ligand that are not parametrized in the default OPLS force field.
You should parametrize missing torsions for your ligands using the Force Field Builder. A custom parameter file (custom_2025_2.opls) is included with the tutorial, which you can now load in.
- Next to Forcefield, check Use customized version
- Click the three dots
- The Set Custom Parameters Location dialog opens.
- In the dropdown, click Browse and navigate to the directory where you extracted the tutorial files.
- Click OK.
- The torsion check on the ligands runs again, with checkmarks showing that all torsions are parametrized.
Note: The setting in the Set Custom Parameters Location dialog is specific to this Maestro project. You can click Manage Default to change the global preference for custom forcefield parameters. Clicking Customize opens the Force Field Builder.
This is a very computationally intensive job. It uses a Driver host to keep track of the many individual subjobs, which run on both CPUs and GPUs. You must specify the number of hosts of each type that the job should use.
The number of total CPU processors is used to determine the number of IFD-MD jobs to run concurrently (e.g., if it is set to 10, 10 IFD-MD jobs will be submitted to run concurrently by default). See the IFD-MD Best Practices for more information regarding the recommended number of CPUs and GPUs and how IFD-MD handles license checkout.
Due to the large number of jobs run as part of this simulation and their computational cost, you cannot launch the job directly from the GUI.
- Click Write to write out the input and job setup files to the Working Directory.
- Follow the instructions in the panel to copy the job to the machine where it will run and launch it.
Note: Each of the 21 subjobs in this calculation takes approximately 400 CPU hours and 14 GPU hours (on an NVIDIA T4).
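To gauge the total resource request before submitting, you can multiply the per-subjob estimates in the note by the number of subjobs. This is a back-of-the-envelope sketch using only the approximate figures quoted above; actual usage will vary with hardware and ligand size.

```python
# Approximate per-subjob figures quoted in the note above (NVIDIA T4).
N_SUBJOBS = 21
CPU_HOURS_PER_SUBJOB = 400
GPU_HOURS_PER_SUBJOB = 14


def total_resource_hours(n_subjobs, cpu_hours_each, gpu_hours_each):
    """Return (total CPU hours, total GPU hours) for the whole calculation."""
    return n_subjobs * cpu_hours_each, n_subjobs * gpu_hours_each


total_cpu_hours, total_gpu_hours = total_resource_hours(
    N_SUBJOBS, CPU_HOURS_PER_SUBJOB, GPU_HOURS_PER_SUBJOB
)
print(f"~{total_cpu_hours} CPU hours, ~{total_gpu_hours} GPU hours in total")
```

For the hERG example this comes to roughly 8400 CPU hours and 294 GPU hours, which is why the job is written out and launched on a cluster rather than from the GUI.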
The ifd-md_consensus.sh script written as part of the input files calls the consensus IFD-MD backend (ifd-md_consensus.py), which provides an overview of unsubmitted, running, completed, and failed jobs. Each time you run it, the <jobname>_launch_batch.sh script is updated to include the jobs that need to be (re)submitted next.
In some cases, consensus IFD-MD needs to run ~250 individual IFD-MD jobs, and submitting and running all of them at once may cause problems in limited compute environments. The command line option -max_num_jobs NUM_JOBS can be used with ifd-md_consensus.py to limit the number of jobs launched simultaneously. For example, consensus IFD-MD requires 252 IFD-MD jobs for a PXR analysis, and supplying -max_num_jobs 25 to ifd-md_consensus.py limits consensus IFD-MD to running only 25 jobs at once.
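When choosing a value for -max_num_jobs, it can help to estimate how many rounds of submission a given cap implies. The sketch below assumes, purely for illustration, that jobs run in equal-length waves; the real backend may start new jobs as soon as earlier ones finish, so treat the result as a rough guide. The function name is hypothetical.

```python
import math


def min_submission_waves(total_jobs, max_num_jobs):
    """Smallest number of waves needed if at most max_num_jobs run at once."""
    if total_jobs < 0 or max_num_jobs < 1:
        raise ValueError("total_jobs must be >= 0 and max_num_jobs >= 1")
    return math.ceil(total_jobs / max_num_jobs)


# PXR example from the text: 252 jobs with -max_num_jobs 25.
pxr_waves = min_submission_waves(252, 25)
print(pxr_waves)
```

Under this assumption the PXR example needs 11 waves: ten full waves of 25 jobs and a final wave of 2.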
4. Running and Analyzing Consensus Binding Mode Analysis
Consensus Binding Mode Analysis in the context of Consensus IFD-MD operates under the assumption that congeneric ligands share the same binding mode (with an MCS RMSD cutoff of 1.5 Å). We can therefore discard poses that might score well according to the scoring function but that are not consistent across the provided ligands. This is a post-processing step after the Multiple congeneric ligands IFD-MD job, and it prepares for the FEP+ validation that will ultimately be used to determine the most prospectively useful model.
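The idea behind the consensus filter can be illustrated with a simplified sketch. This is not the actual implementation: it assumes that each pose has already been reduced to the coordinates of its maximum common substructure atoms, listed in the same atom order and expressed in a common reference frame, and it keeps a pose only if every other ligand has at least one pose within the 1.5 Å cutoff. All function and variable names are hypothetical.

```python
import math


def mcs_rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) MCS coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def consensus_poses(poses_by_ligand, cutoff=1.5):
    """Keep poses that have a matching pose (RMSD <= cutoff) in every other ligand.

    poses_by_ligand maps a ligand name to a list of (pose_id, mcs_coords).
    Returns a list of (ligand name, pose_id) tuples.
    """
    kept = []
    for ligand, poses in poses_by_ligand.items():
        others = [name for name in poses_by_ligand if name != ligand]
        for pose_id, coords in poses:
            shared = all(
                any(
                    mcs_rmsd(coords, other_coords) <= cutoff
                    for _, other_coords in poses_by_ligand[other]
                )
                for other in others
            )
            if shared:
                kept.append((ligand, pose_id))
    return kept


# Toy coordinates for illustration only; they are not tutorial data.
toy_poses = {
    "ligA": [
        ("a1", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
        ("a2", [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]),
    ],
    "ligB": [("b1", [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)])],
}
print(consensus_poses(toy_poses))
```

In the toy input, pose a2 has no counterpart among the ligB poses and is dropped, even though a scoring function might have ranked it well.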
Consensus Binding Mode Analysis is performed using the same command as for the Consensus IFD-MD job, just with an additional flag (-analyze) added. This step is also much less computationally costly than the consensus IFD-MD simulations, so one driver host and one CPU suffice. You can edit the script on the remote machine to use fewer hosts and run it there, or configure the job settings in the panel to run it on your local machine after downloading the results of the consensus IFD-MD job. In this tutorial, we have provided the relevant consensus IFD-MD output files in the zip archive so you can run the analysis job on your local machine.
This analysis does not require that all individual IFD-MD jobs have completed, so you can examine the results even if some jobs are still running or have failed.
- Optional: If you closed or reset the IFD-MD panel after setting up the consensus IFD-MD job, follow steps 4-10 of section 3 to load in the ligands, configure the force field, and set the target protein.
- Click the cog icon
- Set the Driver Host, CPU subhost, and GPU subhost to localhost, using 1 CPU/GPU.
- Click OK.
Note: Although the command that you write out defines a GPU subhost, the Consensus Binding Mode Analysis job does not use any GPUs, so it can run on CPUs alone.
- Open the ifd-md_consensus.sh script from your Working Directory with your desired text editor.
- Add the -analyze flag to the script.
- Save and close the file.
Now that you have edited the shell script, you need to move it to the same parent directory as the results from the consensus IFD-MD simulation and run the job. For this tutorial, we have provided the relevant part of the output files of the job.
- Move the ifd-md_consensus.sh file to the provided IFD-MD_hERG directory.
- Open up your terminal and set your directory to the IFD-MD_hERG directory.
- Make sure that your SCHRODINGER environment variable is properly set.
- See this knowledge base article for guidance.
- In the terminal window, write ./ifd-md_consensus.sh
- This will launch the Consensus Binding Mode Analysis job, which will take approximately 25 minutes on a single CPU.
- Once the analysis is complete, the job outputs results to IFD-MD_hERG_consensus-out.maegz
- While you can wait for the job to complete, we have also provided pre-generated results that we will visualize in Maestro.
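Before launching the analysis, a quick pre-flight check can catch the most common setup mistakes in the steps above. The sketch below is a hypothetical helper, not part of the Schrödinger suite. It assumes that the environment variable in question is SCHRODINGER (the variable used in the paths elsewhere in this tutorial), and it only checks that the script exists in the run directory, is executable, and contains the text -analyze.

```python
import os
from pathlib import Path


def preflight(run_dir, script_name="ifd-md_consensus.sh", env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []

    # Assumption: the installation is located via the SCHRODINGER variable.
    schrodinger = env.get("SCHRODINGER")
    if not schrodinger:
        problems.append("SCHRODINGER environment variable is not set")
    elif not Path(schrodinger).is_dir():
        problems.append(f"SCHRODINGER points to a missing directory: {schrodinger}")

    script = Path(run_dir) / script_name
    if not script.is_file():
        problems.append(f"{script} not found")
    else:
        if "-analyze" not in script.read_text():
            problems.append(f"{script_name} does not contain the -analyze flag")
        if not os.access(script, os.X_OK):
            problems.append(f"{script_name} is not executable")
    return problems


if __name__ == "__main__":
    for problem in preflight("IFD-MD_hERG"):
        print("WARNING:", problem)
```

Run it from the parent of the IFD-MD_hERG directory; if nothing is printed, the script is in place and ready to be launched as described above.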
Import the results into Maestro to visualize the outputs from the Consensus Binding Mode Analysis.
- Go to File > Import Structures
- Select IFD-MD_hERG_consensus-out.maegz and click Open
- The outputs from the Consensus Binding Mode Analysis are added to the Entries.
Consensus IFD-MD outputs are structured in the following way:
In the IFD-MD_hERG_dir folder, there is a subfolder for each individual IFD-MD subjob (e.g. “herg_HERG2_chembl2036754_2”). The naming scheme of each folder is <target>_<target><structure index>_<ligand name>_<ligand index>.
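If you want to tabulate subjobs by receptor structure or ligand, the folder names can be split according to this scheme. The sketch below is a hypothetical helper. It assumes, based on the single example above, that the second token is the target name (in any letter case) followed directly by the structure index.

```python
from typing import NamedTuple


class SubjobName(NamedTuple):
    target: str
    structure_index: int
    ligand_name: str
    ligand_index: int


def parse_subjob_name(name):
    """Split <target>_<target><structure index>_<ligand name>_<ligand index>."""
    target, structure, rest = name.split("_", 2)
    # Assumption: the structure label repeats the target name, ignoring case.
    if not structure.lower().startswith(target.lower()):
        raise ValueError(f"unexpected structure label in {name!r}")
    structure_index = int(structure[len(target):])
    # The ligand name may itself contain underscores, so split from the right.
    ligand_name, ligand_index = rest.rsplit("_", 1)
    return SubjobName(target, structure_index, ligand_name, int(ligand_index))


print(parse_subjob_name("herg_HERG2_chembl2036754_2"))
```

Matching the structure label against the target name, rather than splitting on the first digit, keeps the parse unambiguous for targets whose names contain digits.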
As the IFD-MD job progresses through its various stages, it writes outputs and temporary files to the <subjob-name>_workdir. The tutorial zip archive does not contain the full outputs of all individual IFD-MD jobs to keep the file size manageable, but the folder for the herg_HERG5_chembl2036755_3 job is provided as an example. Of particular interest is the fep_input_gen_dir folder, where you can find FEP-ready output structures for that ligand-receptor complex (including membrane for hERG).
Running consensus IFD-MD in analysis mode performs the consensus binding mode analysis on all IFD-MD poses from the individual subjobs, discarding any that are not shared by the ligands (based on RMSD). The results are aggregated in the <jobname>_consensus-out.maegz file.
Each entry in that file is named based on the rank of the pose in its corresponding IFD-MD job. For example, the entry named “rank 4 in herg_HERG5_chembl2036755_3-out.maegz” is the fourth-ranked pose by score in the IFD-MD output file “herg_HERG5_chembl2036755_3-out.maegz” of the IFD-MD job “herg_HERG5_chembl2036755_3”.
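The entry names can be mapped back to their source subjob with a small parser. This sketch is a hypothetical helper that relies only on the entry title pattern shown in the example above.

```python
import re

# Pattern inferred from the example entry title in the text.
ENTRY_TITLE = re.compile(r"^rank (?P<rank>\d+) in (?P<job>.+)-out\.maegz$")


def parse_entry_title(title):
    """Return (rank, subjob name) for a consensus output entry title."""
    match = ENTRY_TITLE.match(title.strip())
    if match is None:
        raise ValueError(f"unrecognized entry title: {title!r}")
    return int(match["rank"]), match["job"]


rank, job = parse_entry_title("rank 4 in herg_HERG5_chembl2036755_3-out.maegz")
print(rank, job)
```

The returned subjob name is also the prefix of the <subjob-name>_workdir folder that holds the corresponding FEP-ready structures.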
- Right-click the IFD-MD_hERG_consensus_out group heading and choose Include.
- Click Presets and select Binding Mode Comparison
- The structures are now rendered according to the Binding Mode Comparison preset which is particularly helpful for visualizing the output of IFD-MD.
Above you can see the top 5 poses output from the Consensus Binding Mode Analysis job overlaid. You cannot rely on the consensus IFD-MD score to rank the poses, due to the reduced accuracy of the scoring from skipping the metadynamics simulation.
Rather, you should evaluate the accuracy and the utility of the above models with Relative Binding Free Energy Perturbations (RB-FEP). The correlation between the binding affinity as predicted by FEP and measured in the experimental reference depends on the input pose, so the performance of the FEP model can be used to identify the best performing IFD-MD output.
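One way to compare candidate models is to compute simple agreement statistics between predicted and experimental binding free energies for each model and then sort the models. The sketch below is a generic illustration and does not reproduce how any Schrödinger tool computes its statistics. The function names are hypothetical, and the numbers in the example are synthetic placeholders, not results from this tutorial.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between paired values (e.g. kcal/mol)."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / len(predicted)
    )


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between paired values."""
    n = len(predicted)
    if n < 2 or n != len(experimental):
        raise ValueError("need at least two paired values")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_e)


def rank_models(predictions_by_model, experimental):
    """Return model names sorted from lowest to highest RMSE."""
    return sorted(
        predictions_by_model,
        key=lambda name: rmse(predictions_by_model[name], experimental),
    )


# Synthetic placeholder values for illustration only.
experimental_dg = [-9.0, -8.0, -7.0]
models = {
    "pose_A": [-9.1, -8.1, -6.9],
    "pose_B": [-7.0, -9.0, -8.0],
}
print(rank_models(models, experimental_dg))
```

A pose whose predictions track experiment closely (low RMSE, high correlation) is the better candidate for prospective use.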
To perform this validation, use each of the models returned from the consensus binding mode analysis to launch independent FEP+ jobs, each run with 23 single state compounds of a congeneric series from Cumming, et al. Each subjob from the consensus IFD-MD run contains FEP-ready structures in the <subjob_name>_workdir/fep_input_gen_dir directory.
For more information on setting up and running RB-FEP jobs with FEP+ please reference the BACE1 Inhibitor Design Using Free Energy Perturbation tutorial, Preparing Protein and Ligand Structures for FEP+, Troubleshooting Common Issues, and the FEP+ Checklist.
As this analysis is beyond the scope of this tutorial, we will only discuss the results in brief.
You can use the fep_stats_printer job ($SCHRODINGER/run -FROM psp fep_stats_printer.py *.fmp) to pull relevant statistics from a series of .fmp files. The results of this analysis for the RB-FEP outputs can be found below:
The FEP+ scatter plot associated with the herg_HERG5_chembl2036755_3_rank_04_out.fmp model, the best performing model according to the chart above, can be found below:
The binding pose that corresponds to the above model shows that the positively charged (+1) amine on the cyclohexanamine of the ligand is predicted to form pi-cation interactions (green dotted lines) with the two TYR652 residues on chain A and chain D. These have been described in the literature (see Asai et al.) as a selectivity filter.
5. Conclusion and References
For further learning:
- Introduction to Structure Preparation and Visualization
- Cross-docking with IFD-MD
- Using IFD-MD on a Membrane-bound protein
- Using IFD-MD on a covalently-bound ligand
- Introduction to Molecular Modeling in Drug Discovery Online Course (Course Page | Preview)
- Free Energy Calculations for Drug Design with FEP+ Online Course (Course Page | Preview)
For further reading:
- A Reliable and Accurate Solution to the Induced Fit Docking Problem for Protein-Ligand Binding
- Induced-Fit Docking Enables Accurate Free Energy Perturbation Calculations in Homology Models
- Benchmarking Refined and Unrefined AlphaFold2 Structures for Hit Discovery
- Using AlphaFold and Experimental Structures for the Prediction of the Structure and Binding Affinities of GPCR Complexes via Induced Fit Docking and Free Energy Perturbation
- FEP+ Best Practices
- IFD-MD User Manual
- IFD-MD Best Practices
6. Glossary of Terms
Entries - a simplified view of the Project Table that allows you to perform basic operations such as selection and inclusion.
Favorites Toolbar - buttons for tasks designated as favorites in the Task Tool. You can add panels to your Favorites Toolbar by checking the star icon beside the panel name in Tasks.
included - the entry is represented in the Workspace, the circle in the In column is blue.
Project Table - displays the contents of a project and is also an interface for performing operations on selected entries, viewing properties, and organizing structures and data.
Scratch project - a temporary project in which work is not saved, closing a scratch project removes all current work and begins a new scratch project.
selected - (1) the atoms are chosen in the Workspace. These atoms are referred to as "the selection" or "the atom selection". Workspace operations are performed on the selected atoms. (2) The entry is chosen in the Entries (and Project Table) and the row for the entry is highlighted. Project operations are performed on all selected entries.
Working Directory - the location that files are saved.
Workspace - the 3D display area in the center of the main window, where molecular structures are displayed.