glide_active_learning.py evaluate Command Help
Command: $SCHRODINGER/run -FROM glide glide_active_learning.py evaluate
usage: $SCHRODINGER/run glide_active_learning.py evaluate [-h]
[-infile_list_file <infile_path_list>]
[-block_size <num_lig_per_block>]
[-smi_index <smiles_column_index>]
[-name_index <title_column_index>]
[-no_header]
[-result_prefix <output_file_prefix>]
[-remote_input_ligands]
[-restart_file <restart_pkl_file>]
[-avoid_splitting_csv_files]
[-infile INFILE]
[-grid <gridfile>]
[-extra_docking_inputs <glide_input_file_of_extra_inputs>]
[-known_docking_score_file <known_docking_score_file>]
[-jobname <jobname>]
[-stop_after <stop_workflow_after_stage>]
[-max_ml_eval_cpu <maximum_ml_evaluation_cpu>]
[-eval_with_mq]
[-max_glide_cpu <maximum_glide_cpu>]
[-glide_mq]
[-rescore_host <rescore_host>]
[-ncpu_rescore <ncpu_rescore>]
[-model <ml_model.qzip>]
[-random_seed <random_seed_number>]
[-overwrite_args]
[-force_restart]
[-ligprep_args <ligprep_arguments>]
[-num_report_poses_rescore <num_best_poses_rescore>]
[-no_rescore_poses]
[-glide_subjob_size <ligs_per_glide_subjob> | -num_glide_subjobs <number_of_glide_subjobs>]
[-new_glide | -classic_glide]
[-glide_rescore_subjob_size <ligs_per_glide_rescore_subjob> | -num_glide_rescore_subjobs <number_of_glide_rescore_subjobs>]
[-keep <num_returned_ligand> | -keep_fraction <fraction_of_returned_ligand>]
[-num_rescore_ligand <num_rescore_ligand> | -rescore_ligand_fraction <fraction_of_rescore_ligand>]
options:
-h, --help show this help message and exit
-random_seed <random_seed_number>
Random seed number for shuffling all the ligands
and seeding ligand_ml training.
-overwrite_args Overwrite previous arguments. Default is False.
-force_restart Force the workflow to restart even when some restart files are missing.
-ligprep_args <ligprep_arguments>
Arguments for using ligprep to prepare the ligands. Default is -pht 1.0 -epik -s16.
-num_report_poses_rescore <num_best_poses_rescore>
Number of best poses to return in the
.mae pose library file and the virtual screening database
(.vsdb) file for rescored ligands.
Equivalent to the 'NREPORT' keyword in Glide.
Default is to return all poses.
-no_rescore_poses Do not generate the .mae pose file or
.vsdb file for the rescored ligands.
By default, both the pose file and the .vsdb file are generated.
-glide_subjob_size <ligs_per_glide_subjob>
Number of ligands in each Glide subjob. By default,
Glide distributes the ligands automatically.
-num_glide_subjobs <number_of_glide_subjobs>
Number of subjobs for the Glide job.
-new_glide Run the new Glide backend in AL-Glide. The default follows the NEW_GLIDE feature flag.
-classic_glide Run the classic Glide backend in AL-Glide. The default follows the NEW_GLIDE feature flag.
-glide_rescore_subjob_size <ligs_per_glide_rescore_subjob>
Number of ligands in each Glide subjob of the rescore stage.
Default is the same as the value of -glide_subjob_size.
-num_glide_rescore_subjobs <number_of_glide_rescore_subjobs>
Number of subjobs for the Glide job of the rescore stage.
Default is the same as the value of -num_glide_subjobs.
-keep <num_returned_ligand>
Number of best ligands to be returned.
Default is 10000000.
-keep_fraction <fraction_of_returned_ligand>
Fraction of the ligands to be returned.
-num_rescore_ligand <num_rescore_ligand>
Number of best ligands to rescore with Glide.
Default is 1000000.
-rescore_ligand_fraction <fraction_of_rescore_ligand>
Fraction of the best ligands to rescore with Glide.
file options:
-infile_list_file <infile_path_list>
A file that contains a list of infile paths.
-block_size <num_lig_per_block>
Number of ligands in each split input ligand file. Default is
15,000 for AL-FEP+/AL-ABFEP and 100,000 for AL-Glide.
-smi_index <smiles_column_index>
1-based column index of ligand's SMILES.
Default is 1.
-name_index <title_column_index>
1-based column index of ligand's title.
Default is 2.
-no_header Specify this flag if the input file(s) have no header in the first line.
-result_prefix <output_file_prefix>
Prefix of the .csv result files. Default is the value of -jobname.
-remote_input_ligands
Specify this flag if the input ligand files are located remotely.
Absolute paths to the input ligand files are required when this
flag is provided.
-restart_file <restart_pkl_file>
.pkl file for restarting or continuing the active learning
workflow.
-avoid_splitting_csv_files
Enable this option to use pre-split ligand CSV files, so that the workflow does not split them again.
Requirements:
- Reorder columns: 'SMILES' first, 'Title' second.
- Each split CSV must have the same header: 'SMILES, Title'.
- Remove any additional columns.
- Optionally gzip compress each split CSV file.
- Archive all pre-split CSV files into one directory and zip compress it.
Pass the zipped directory to '-infile'.
-infile INFILE .csv or .smi file(s) that contain all the ligand SMILES.
Multiple .csv or .smi files can be included by specifying
multiple -infile options.
-grid <gridfile> Glide grid file for docking.
-extra_docking_inputs <glide_input_file_of_extra_inputs>
Glide input-like file that contains extra inputs for
docking stages.
-known_docking_score_file <known_docking_score_file>
CSV score file with known docking scores.
The data from this file will be used to train the first round of ML
models and reduce the number of Glide calculations needed in the first
round of active learning. The CSV file should have the following columns:
SMILES, Title, docking_score.
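
The requirements listed under -avoid_splitting_csv_files can be sketched in Python as follows. This is a minimal illustration, not part of the tool: the function name write_presplit_archive, the ligands_NNNNN.csv.gz file naming, and the header spelling 'SMILES,Title' (written without a space after the comma) are assumptions, and the exact layout the workflow accepts should be checked against the product documentation.

```python
import csv
import gzip
import zipfile
from pathlib import Path


def write_presplit_archive(rows, out_dir, block_size, archive_path):
    """Write (smiles, title) rows into gzipped CSV blocks, then zip the directory.

    rows: iterable of (smiles, title) tuples with any extra columns removed.
    Returns the list of split files that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = list(rows)
    paths = []
    for start in range(0, len(rows), block_size):
        # Hypothetical file naming; the help text does not prescribe one.
        path = out_dir / f"ligands_{start // block_size:05d}.csv.gz"
        with gzip.open(path, "wt", newline="") as handle:
            writer = csv.writer(handle)
            # Every split file carries the same header, SMILES first.
            writer.writerow(["SMILES", "Title"])
            writer.writerows(rows[start:start + block_size])
        paths.append(path)
    # Store the split files under a single directory inside the zip archive.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            archive.write(path, arcname=f"{out_dir.name}/{path.name}")
    return paths
```

The resulting zip file is what would be passed to -infile together with -avoid_splitting_csv_files.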
job options:
-jobname <jobname> Job name of the active learning workflow run.
-stop_after <stop_workflow_after_stage>
Terminate the workflow after the specified stage finishes.
Specify FinishAll (case-insensitive) to run all the remaining stages.
For convenience, you can also specify 'iter_X' to stop after
iteration X (e.g., 'iter_1', 'iter_2', etc.).
-max_ml_eval_cpu <maximum_ml_evaluation_cpu>
Maximum number of CPUs allowed for
machine learning evaluation subjobs.
-eval_with_mq Enable ligand_ml evaluation using ZeroMQ.
Specify the number of evaluation jobs with -max_ml_eval_cpu.
If running a screen or pilot, the -chosen_models flag is
automatically set to TorchGraphConv models.
-max_glide_cpu <maximum_glide_cpu>
Maximum number of CPUs allowed for Glide subjobs.
-glide_mq Turn on Glide execution with ZeroMQ (ZMQ).
Use -num_glide_subjobs to control the number of subjobs.
-rescore_host <rescore_host>
Name of the host used for rescoring.
When using this option, the -ncpu_rescore option must also be set.
Default is the same as HOST.
-ncpu_rescore <ncpu_rescore>
Equivalent to -NJOBS for Glide.
Specifies the number of CPUs provided to the rescore host.
Must be used with -rescore_host.
evaluate options:
-model <ml_model.qzip>
Generated ligand_ml .qzip model.
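
As an illustration of how the usage synopsis might be applied, the following Python sketch assembles an argument list for the evaluate subcommand and rejects combinations that the synopsis marks as mutually exclusive (the options separated by '|'). The helper build_evaluate_command and its options dictionary are hypothetical and not part of the tool. The sketch follows the 'Command:' line by including -FROM glide, and it does not check which options the tool actually requires.

```python
import os

# Mutually exclusive pairs, taken from the '|' groups in the usage synopsis.
EXCLUSIVE_PAIRS = [
    ("-glide_subjob_size", "-num_glide_subjobs"),
    ("-new_glide", "-classic_glide"),
    ("-glide_rescore_subjob_size", "-num_glide_rescore_subjobs"),
    ("-keep", "-keep_fraction"),
    ("-num_rescore_ligand", "-rescore_ligand_fraction"),
]


def build_evaluate_command(options, schrodinger=None):
    """Return an argv list for 'glide_active_learning.py evaluate'.

    options maps flag names to values. Use True for flags that take no
    value, and a list to repeat a flag (for example several -infile files).
    """
    for first, second in EXCLUSIVE_PAIRS:
        if first in options and second in options:
            raise ValueError(f"{first} and {second} cannot be combined")
    # Fall back to the literal placeholder if the variable is not set.
    root = schrodinger or os.environ.get("SCHRODINGER", "$SCHRODINGER")
    argv = [f"{root}/run", "-FROM", "glide", "glide_active_learning.py", "evaluate"]
    for flag, value in options.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            argv.append(flag)
            if item is not True:
                argv.append(str(item))
    return argv
```

The returned list can be passed to subprocess.run once the file names and values are replaced with real ones.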