Active Learning Glide Panel

Dock a large number of ligands by training a QSAR model on a subset of the ligands and using the model to score the remaining ligands.

To open this panel: click the Tasks button and browse to Receptor-Based Virtual Screening → Active Learning Docking.

To write out the input file and a script for running the job from the command line, click the arrow next to the Settings button and choose Write. For information on command usage and options, see glide_active_learning.py Command Help. See Running Active Learning Glide from the Command Line for more command line instructions.

Active Learning Glide Panel Features

Inputs section

Specify the input for the Glide grid and the ligands.

Receptor grid button

Load the receptor grid for docking of the training set. The button, labeled Load File, opens a file chooser in which you can locate and load the grid. For more information on the Glide grid, see Receptor Grid Generation Panel. The status of the grid is reported to the right of the button.

Ligands button

Click this button, labeled Add Files, to add files that contain the ligand structures, in SMILES format. A file selector opens, in which you can select multiple files of type .csv or .smi (but not both types).

You can specify multiple files to screen either by selecting them together in the file selector or by clicking the button multiple times. All .csv files must have the same columns for the SMILES string and the title. To remove files from the list, click the change link. A small pane opens with a list of files, which you can select and remove.

A .smi file must contain only two columns, with the SMILES string in the first and the title in the second. A .csv file with two columns is imported directly if the columns have the headings smiles and title (case-insensitive). Otherwise, the Import SMILES Dialog Box opens so that you can choose which columns contain the SMILES string and the title.
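
The direct-import rule for two-column .csv files can be sketched in Python as follows. This is only an illustration of the rule as described here, not the panel's own code: the function name imports_directly is hypothetical, and accepting the two headings in either order is an assumption.

```python
import csv
import io


def imports_directly(csv_text):
    """Return True if the CSV has exactly two columns headed smiles and title.

    Hypothetical helper: the comparison is case-insensitive, and accepting
    the headings in either order is an assumption made for this sketch.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    names = [name.strip().lower() for name in header]
    return len(names) == 2 and sorted(names) == ["smiles", "title"]
```

For example, a file that starts with the header line SMILES,Title passes this check, whereas a file with the headings smiles,name, or with a third column, would go through the Import SMILES Dialog Box.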

The number of files is displayed to the right of the button. To clear the file selection, click the X displayed to the right of the button.

Preparation Options link

Specify options for preparation of the ligands for docking. Opens the Ligand Preparation Options Dialog Box.

Training options section

Make settings for the training of the QSAR model. The training consists of rounds. In each round, a sample of ligands is docked, a model is constructed from the docking results, and the remaining ligands are scored with the model. Ligands are then chosen from the docked set and added to the training set for the next round.
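
The round structure can be pictured with the toy Python loop below. Everything in it is a stand-in chosen for illustration: toy_dock replaces a real Glide docking run, fit_model is a nearest-neighbor lookup rather than a real QSAR model, ligands are represented by single numbers, and picking the best-predicted ligands for the next round is an assumption, not the panel's selection method.

```python
import random


def toy_dock(x):
    # Stand-in for a docking score (lower is better); purely illustrative.
    return (x - 0.3) ** 2


def fit_model(training):
    # Toy 1-nearest-neighbor "model": predict the score of the closest
    # ligand that has already been docked.
    points = sorted(training.items())

    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]

    return predict


def active_learning(library, sample_size, rounds, seed=0):
    rng = random.Random(seed)
    training = {}  # ligand -> docked score
    batch = rng.sample(library, sample_size)  # initial sample
    model = None
    for _ in range(rounds):
        for ligand in batch:  # dock the current sample
            training[ligand] = toy_dock(ligand)
        model = fit_model(training)  # build a model from all docked ligands
        remaining = [lig for lig in library if lig not in training]
        remaining.sort(key=model)  # score the rest with the model
        batch = remaining[:sample_size]  # assumed: take the best-predicted
    return training, model
```

With a library of 100 toy ligands, a sample size of 10, and 3 rounds, the training set ends up with 30 docked ligands, and the final model can score any of the remaining 70 without docking them.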

NOTE: The maximum recommended value for -training_size is 100,000.

Benchmarks show that the enrichment score and recovery rate become asymptotic around a training size of 100,000. There is no indication that larger training sets are needed for performance (even with libraries on the order of billions of compounds).

Sample size per round text box and menu

Specify the number of ligands to add to the training set per round of training. The menu allows you to choose the scaling factor used, either thousand ligands or hundred ligands. Note that increasing the sample size increases the memory use. The ligands are chosen from the best 10% of all the docked ligands, according to the chosen ligand selection criterion (below).

Ligand selection criterion link

Choose the criterion by which ligands are selected. The current choice is displayed as the link text.

  • Random—choose the ligands at random.
  • Diverse—choose a diverse set of ligands (the default).
  • Most uncertain—choose the ligands with the highest uncertainty in the predicted docking score.
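
To make the three criteria concrete, here is a minimal Python sketch. It is not the panel's implementation: the function select, the dictionary keys descriptor and uncertainty, and the greedy max-min rule used for the diverse case are all assumptions made for illustration, with each ligand reduced to a single numeric descriptor.

```python
import random


def select(candidates, n, criterion, seed=0):
    """Pick n candidates; each is a dict with 'title', 'descriptor', 'uncertainty'.

    Hypothetical helper for illustration only.
    """
    n = min(n, len(candidates))
    if n == 0:
        return []
    if criterion == "random":
        return random.Random(seed).sample(candidates, n)
    if criterion == "most_uncertain":
        ranked = sorted(candidates, key=lambda c: c["uncertainty"], reverse=True)
        return ranked[:n]
    if criterion == "diverse":
        # Assumed greedy max-min rule: repeatedly add the candidate that is
        # farthest from everything chosen so far.
        chosen = [candidates[0]]
        while len(chosen) < n:
            rest = [c for c in candidates if c not in chosen]
            chosen.append(max(
                rest,
                key=lambda c: min(abs(c["descriptor"] - s["descriptor"])
                                  for s in chosen),
            ))
        return chosen
    raise ValueError(f"unknown criterion: {criterion}")
```

The random choice ignores the model entirely, the most-uncertain choice targets ligands where the model is least reliable, and the diverse choice spreads the picks across descriptor space.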

Training time per round option menu

Choose the amount of training time to use per round, from the following. The training of the models stops after the amount of time has been used, and the best model found to that point is used. Using more time than the standard could improve the models if they are not already well fit.

  • Standard (4 hours)
  • Extended (8 hours)
  • Maximum (12 hours)

Number of rounds text box

Specify the number of rounds of training to use for the model.

Run pilot only option

Run a pilot job to assess the effectiveness of active learning on the input library before submitting a full run. Selecting this option sets Number of rounds to 1 and disables the text box; the previous number of rounds is restored when you clear the option.

The pilot job samples a pilot library of 50k ligands from the input ligand file and runs a modified active-learning workflow optimized for a short run time, with a training set size of 5k ligands. Separately, the entire pilot library is docked with Glide. A summary report is generated, capturing the effectiveness of active learning on this library.
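
How the pilot library is drawn from the input file is not described here. As one possible illustration, the Python sketch below uses reservoir sampling, which draws a fixed-size uniform sample in a single pass without holding the whole library in memory. The function name reservoir_sample is hypothetical, and the use of this technique is an assumption, not a statement about the pilot job itself.

```python
import random


def reservoir_sample(items, k, seed=0):
    """Return up to k items drawn uniformly at random from an iterable.

    Single-pass reservoir sampling (Algorithm R); illustrative only.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item  # replace with probability k / (i + 1)
    return sample
```

Under these assumptions, a pilot library would correspond to reservoir_sample(lines, 50_000) over the lines of the input file, and a 5k training set could then be drawn from it with random.sample.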

Outputs section

Make settings for the output of the docked ligands.

Dock and import best ligands text box

Specify the percentage of the total number of ligands to return from the screening run.

Estimated time to completion text

This text displays the estimated time to completion based on the panel settings, license availability, and estimated ligand count.

License change link

Clicking the link opens a pane with options to adjust the number of CPUs (for AutoQSAR) and the number of Glide licenses. Additionally, the Estimated number of ligands can be set using the text box and menu. These values are used in the time estimate. You can change the numbers freely to determine the effect on the time estimate, but ultimately the numbers should reflect the actual number you have available for the job.

Job toolbar

Manage job submission and settings. See Job Toolbar for a description of this toolbar.

Status bar

Use the Reset button to reset the panel to its default settings and clear any data from the panel. If the panel has a Job toolbar, you can also reset the panel from the Settings button menu.

If you can submit a job from the panel, the status bar displays information about the current job settings and status for the panel. The settings include the job name, task name and task settings (if any), number of subjobs (if any) and the host name and job incorporation setting. The job status can include messages about job start, job completion and incorporation.

The status bar also contains the Help button, which opens an option menu with choices to open the help topic for the panel (Documentation), launch Maestro Assistant, or, if available, choose from an option menu of Tutorials. If the panel is used by one or more tutorials, hover over the Tutorials option to display a list of tutorials. Choosing a tutorial opens the tutorial topic.