dise_select.py Command Help
Command: $SCHRODINGER/run dise_select.py
usage: dise_select.py [-h] [-num_cpds <int>] [-seed_cpds <int>]
[-seed_file <filename>] [-use_scaffolds]
[-scaffold_type {bemis_murcko,generic_bemis_murcko}]
[-sim_threshold <float>] [-smi_col <str>]
[-save_analogs] [-HOST <hostname>] [-JOBNAME JOBNAME]
ordered_cpds
Performs selection of ordered compounds using a variation of the DIrected Sphere Exclusion (DISE) algorithm. Typically used to enhance diversity in virtual screen results. An initial list of the top cpds is selected without respect to diversity, then subsequent members are compared, in input order, against the members in the list. If the subsequent cpd is distinct with respect to fingerprint similarity, it is added to the list. The similarity can be calculated using either the full compounds or their Bemis-Murcko scaffolds. The ordered input is traversed until the desired number of cpds is reached or the input is exhausted.
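The selection loop described above can be sketched in Python. This is a minimal illustration, not the script's actual implementation: the function names (tanimoto, dise_select), the use of plain Python sets as stand-in fingerprints, and the strict "less than" comparison against the threshold are all assumptions made for the example.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def dise_select(ordered, num_cpds=10000, seed_cpds=1000, sim_threshold=0.60):
    """Hypothetical sketch of the selection described in this help text.

    ordered: list of (identifier, fingerprint_set) tuples in ranked order.
    Returns (selected, analogs), where analogs are the passed-over cpds.
    """
    # Seeds are taken as-is, without any diversity check.
    selected = list(ordered[:seed_cpds])
    analogs = []
    for cpd in ordered[seed_cpds:]:
        if len(selected) >= num_cpds:
            break  # desired number of cpds reached
        _, fp = cpd
        # Assumption: "distinct" means below the threshold against
        # every cpd already in the list.
        if all(tanimoto(fp, kept_fp) < sim_threshold for _, kept_fp in selected):
            selected.append(cpd)
        else:
            analogs.append(cpd)
    return selected, analogs


# Toy ranked input with made-up fingerprints.
ranked = [
    ("a", {1, 2, 3}),
    ("b", {1, 2, 3, 4}),
    ("c", {7, 8, 9}),
    ("d", {7, 8, 9, 10}),
    ("e", {20, 21}),
]
selected, analogs = dise_select(ranked, num_cpds=3, seed_cpds=1)
print([name for name, _ in selected])
print([name for name, _ in analogs])
```

With these toy fingerprints, "b" and "d" each have a similarity of 0.75 to an already selected cpd ("a" and "c" respectively), so they are passed over, which is the set that -save_analogs would write out.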
positional arguments:
ordered_cpds Sorted csv or structure file from which the selection is made (csv format must include cpd SMILES and identifier fields).
options:
-h, -help Show this help message and exit.
-num_cpds <int> Total number of cpds to retain. Default is 10000.
-seed_cpds <int> Number of cpds to seed the list. These cpds form the initial selection and are not evaluated for diversity. Subsequent candidates are compared against these cpds. Default is 1000.
-seed_file <filename>
Optional source file of seed cpds. Default is to read the first <seed_cpds> cpds from the <ordered_cpds> file.
-use_scaffolds Use Bemis-Murcko scaffolds instead of full compounds for the similarity calculations.
-scaffold_type {bemis_murcko,generic_bemis_murcko}
Scaffold type to use: 1) bemis_murcko: Retains rings and linkers, removes all side chains. 2) generic_bemis_murcko: A generic version of the bemis_murcko scaffold that additionally converts all bonds to single bonds and all atom types to carbon. Default is generic_bemis_murcko.
-sim_threshold <float>
Tanimoto similarity threshold for a cpd to be deemed distinct. Default is 0.60 (using fingerprints of type "MolPrint2D").
-smi_col <str> Name of the cpd SMILES column (csv format). Default is "SMILES".
-save_analogs Save the passed-over, "excluded" cpds to a file.
Job Control Options:
-HOST <hostname> Run job remotely on the indicated host entry.
-JOBNAME JOBNAME Provide an explicit name for the job.
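As a usage illustration, the options above can be assembled into a command line from Python. The helper name build_dise_command, the installation path, and the input file name are hypothetical; only the flags themselves come from this help text, and the options are placed before the positional argument to match the usage line.

```python
def build_dise_command(schrodinger_root, ordered_cpds, num_cpds=None,
                       seed_cpds=None, sim_threshold=None,
                       use_scaffolds=False, save_analogs=False, jobname=None):
    """Return an argument list for $SCHRODINGER/run dise_select.py.

    Hypothetical helper: options left at None or False are omitted so
    that the script's documented defaults apply.
    """
    cmd = [f"{schrodinger_root}/run", "dise_select.py"]
    if num_cpds is not None:
        cmd += ["-num_cpds", str(num_cpds)]
    if seed_cpds is not None:
        cmd += ["-seed_cpds", str(seed_cpds)]
    if sim_threshold is not None:
        cmd += ["-sim_threshold", str(sim_threshold)]
    if use_scaffolds:
        cmd.append("-use_scaffolds")
    if save_analogs:
        cmd.append("-save_analogs")
    if jobname is not None:
        cmd += ["-JOBNAME", jobname]
    # The sorted csv or structure file is the single positional argument.
    cmd.append(ordered_cpds)
    return cmd


# Example with a made-up installation path and input file.
cmd = build_dise_command("/opt/schrodinger", "hits.csv",
                         num_cpds=500, seed_cpds=50,
                         use_scaffolds=True, save_analogs=True)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where the Schrödinger suite is installed.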