oled_ml_formulations_driver.py Command Help
Command: $SCHRODINGER/run oled_ml_formulations_gui_dir/oled_ml_formulations_driver.py
usage: $SCHRODINGER/run oled_ml_formulations_gui_dir/oled_ml_formulations_driver.py
[-h] [-csv CSV] [-groups [GROUPS ...]] -mode {train,predict}
[-target [TARGET ...]] [-descriptors [DESCRIPTORS ...]]
[-hyperparameter HYPERPARAMETER] [-time TIME] [-test_size TEST_SIZE]
[-model [MODEL ...]] [-model_type {Regression,Classification}]
[-custom_model_json CUSTOM_MODEL_JSON]
[-custom_model_tar_file CUSTOM_MODEL_TAR_FILE] [-downsample DOWNSAMPLE]
[-out_split] [-cross_validation CROSS_VALIDATION]
[-split_seed SPLIT_SEED] [-remove_correlated REMOVE_CORRELATED]
[-HOST <hostname>] [-D] [-VIEWNAME <viewname>] [-JOBNAME JOBNAME]
Driver to train and predict OLED device machine learning models.
Copyright Schrodinger, LLC. All rights reserved.
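The usage synopsis above can be turned into a concrete train-mode invocation. The following is a minimal sketch that only assembles and prints the argument list; it does not launch a job. The file name `devices.csv`, the target column `EQE`, and the job name `oled_train` are hypothetical placeholders, and passing `-time` as a whole number of hours is an assumption about the accepted value format.

```python
# Path of the driver relative to $SCHRODINGER/run, as shown in the usage line.
DRIVER = "oled_ml_formulations_gui_dir/oled_ml_formulations_driver.py"


def build_train_command(csv_path, target, time_hours, jobname):
    """Assemble a train-mode argument list using only flags from the usage text."""
    return [
        "$SCHRODINGER/run", DRIVER,
        "-mode", "train",            # -mode is the only required option
        "-csv", csv_path,            # hypothetical input file
        "-target", target,           # hypothetical target column in the CSV
        "-model_type", "Regression",
        "-time", str(time_hours),    # used instead of -hyperparameter
        "-JOBNAME", jobname,
    ]


cmd = build_train_command("devices.csv", "EQE", 1, "oled_train")
# Plain join is enough here because none of the placeholder values contain spaces.
print(" ".join(cmd))
```

The printed line can be pasted into a shell, where `$SCHRODINGER` is expanded by the shell itself.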
options:
-h, -help Show this help message and exit.
-csv CSV CSV file containing the formulation data. The file
must contain the layer type, layer thickness, layer
SMILES, and layer composition columns. The file must
also contain the target property column and any
additional descriptor columns. (default: None)
-groups [GROUPS ...] JSON file(s) containing the group names and the SMILES
that belong to each group. For multiprediction,
provide one file per model. Each file must use the
group names as keys and the lists of SMILES headers
as values. (default: None)
-mode {train,predict}
Use train to train models, predict to predict using a
trained model. (default: None)
-target [TARGET ...] The target property on which the model will be trained
or used for prediction. This property must be present
in the input CSV file. (default: None)
-descriptors [DESCRIPTORS ...]
The additional descriptor properties used for
training. These properties must be present in the
input CSV file (default: None)
-hyperparameter HYPERPARAMETER
The hyperparameter to use for training the models.
Exactly one of -hyperparameter and -time must be
provided. (default: None)
-time TIME The time limit in hours for training the models.
Exactly one of -hyperparameter and -time must be
provided. (default: None)
-test_size TEST_SIZE The proportion of the dataset to include in the test
split. Should be between 0.0 and 1.0 (default: 0.1)
-model [MODEL ...] The trained model (.mlform) to use for prediction.
(default: None)
-model_type {Regression,Classification}
Select the algorithm type to use for training
(default: None)
-custom_model_json CUSTOM_MODEL_JSON
The JSON file containing the custom model used to
generate features (default: None)
-custom_model_tar_file CUSTOM_MODEL_TAR_FILE
The tar file containing the custom model used to
generate features (default: None)
-downsample DOWNSAMPLE
The number of samples to which the dataset is
downsampled during hyperparameter tuning (default: 10000)
-out_split Split the training data to include unique formulations
instead of using a random train-test split (default: False)
-cross_validation CROSS_VALIDATION
Number of splits for cross validation (default: 5)
-split_seed SPLIT_SEED
Seed for splitting the dataset into train and test
sets (default: 1234)
-remove_correlated REMOVE_CORRELATED
The threshold for removing correlated features.
Features with a correlation coefficient greater than
this threshold will be removed (default: 0.9)
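The thresholding described for -remove_correlated can be illustrated with a small pure-Python sketch. This is not the driver's implementation: using the absolute Pearson coefficient, and keeping the first feature of each correlated pair, are both assumptions made here for illustration. The feature names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # treat a constant feature as uncorrelated
    return cov / (norm_x * norm_y)


def drop_correlated(features, threshold=0.9):
    """Return feature names, dropping any feature whose absolute correlation
    with an already kept feature exceeds the threshold (assumed rule)."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptors: the second is a rescaled copy of the first.
features = {
    "thickness_nm": [10.0, 20.0, 30.0, 40.0],
    "thickness_angstrom": [100.0, 200.0, 300.0, 400.0],
    "dopant_fraction": [0.05, 0.20, 0.10, 0.02],
}
kept = drop_correlated(features, threshold=0.9)
print(kept)
```

With the default threshold of 0.9, the rescaled thickness column is dropped because it is perfectly correlated with the first, while the weakly correlated dopant fraction is kept.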
Job Control Options:
-HOST <hostname> Run job remotely on the indicated host entry.
(default: localhost)
-D, -DEBUG Show details of Job Control operation. (default:
False)
-VIEWNAME <viewname> Specifies the viewname used for job filtering in
Maestro. (default: False)
-JOBNAME JOBNAME Provide an explicit name for the job. (default: None)
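Two constraints stated in the option descriptions, that -hyperparameter and -time cannot both be given while one of them must be, and that -test_size should lie between 0.0 and 1.0, can be checked before submitting a job. The sketch below is a hypothetical pre-flight helper, not part of the driver; treating the -test_size bounds as exclusive is an assumption, since the help text does not say.

```python
def check_train_args(hyperparameter=None, time=None, test_size=0.1):
    """Return a list of problems with the given training arguments.

    An empty list means the arguments satisfy the documented constraints.
    """
    errors = []
    # Exactly one of the two must be set: both None or both set is an error.
    if (hyperparameter is None) == (time is None):
        errors.append("provide exactly one of -hyperparameter or -time")
    # Bounds assumed exclusive; the help only says "between 0.0 and 1.0".
    if not 0.0 < test_size < 1.0:
        errors.append("-test_size should be between 0.0 and 1.0")
    return errors


print(check_train_args(time=2.0))                    # valid combination
print(check_train_args())                            # neither option given
print(check_train_args(time=2.0, test_size=1.5))     # test size out of range
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.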