AutoQSAR - Advanced Options Dialog Box

Make additional AutoQSAR settings for defining categorical properties from numerical data, for model building, and for manually selecting independent variables.

AutoQSAR - Advanced Options Dialog Box Features

Define Categories section

In this section you can control how numeric data is divided into categories for a categorical QSAR model. This section is not present for numeric QSAR models.

Number of categories box

Specify the number of categories to use. This is a copy of the control in the AutoQSAR panel, and is linked to it.

Category distribution options

Select an option for how the data are distributed between the categories.

  • Equal widths—Divide the data into categories whose width (the range of values covered) is the same.

  • Equal populations—Divide the data into categories containing contiguous values so that there is the same number of values in each category.
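The difference between the two distribution options can be illustrated with a minimal Python sketch. It is not AutoQSAR code: the function names, the made-up activity values, and the choice to place an equal-population boundary midway between neighboring values are all assumptions made for illustration.

```python
def equal_width_boundaries(values, n_categories):
    """Interior boundaries that split the value range into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_categories
    return [lo + i * width for i in range(1, n_categories)]


def equal_population_boundaries(values, n_categories):
    """Interior boundaries chosen so each category holds (nearly) the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    boundaries = []
    for i in range(1, n_categories):
        cut = round(i * n / n_categories)  # index of the first value in the next category
        # Assumption: put the boundary midway between the two neighboring values.
        boundaries.append((ordered[cut - 1] + ordered[cut]) / 2)
    return boundaries


def assign_categories(values, boundaries):
    """0-based category index of each value: the number of boundaries at or below it."""
    return [sum(v >= b for b in boundaries) for v in values]


# Made-up dependent-variable values with one outlier.
activities = [1.0, 1.2, 1.5, 2.0, 2.1, 9.0]

width_bounds = equal_width_boundaries(activities, 3)
pop_bounds = equal_population_boundaries(activities, 3)

print(assign_categories(activities, width_bounds))  # [0, 0, 0, 0, 0, 2]
print(assign_categories(activities, pop_bounds))    # [0, 0, 1, 1, 2, 2]
```

With skewed data such as this, equal widths leave the middle category empty, while equal populations put two values in each category.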

Interactive category diagram

This diagram displays information on the categories (or classes). The horizontal axis represents the dependent variable, and the data points are plotted along this axis. The boundaries between classes are represented by vertical lines, each with a dot at the top. The value of the dependent variable at the boundary is displayed above the boundary line. The label for each class is displayed between the boundaries, above the data points, and the population of each class is displayed below the data points.

The diagram can be used to set and move class boundaries. If you click between the dots, a boundary is created where you click. You can move the boundaries by dragging the dot on the top of the boundary.

Number of models to build for each model type box

Specify the number of models to build for each type of model. Each model is defined by a random selection of training and test sets, according to the percentage specified in the AutoQSAR panel. Multiple models are built so that the validity of the models can be assessed. This option is only available for the Traditional method.
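As a rough illustration of what building several models from random training and test sets means, the following Python sketch generates the splits only. It is an assumption-laden sketch, not the AutoQSAR implementation: the function name, the fixed seed, and the rounding of the training-set size are hypothetical.

```python
import random


def random_splits(n_structures, training_percent, n_models, seed=0):
    """Return one (training indices, test indices) pair per model."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    n_train = round(n_structures * training_percent / 100)
    splits = []
    for _ in range(n_models):
        indices = list(range(n_structures))
        rng.shuffle(indices)  # a fresh random selection for every model
        splits.append((sorted(indices[:n_train]), sorted(indices[n_train:])))
    return splits


# 20 structures, 75% assigned to the training set, 10 models of one type.
splits = random_splits(n_structures=20, training_percent=75, n_models=10)
for train, test in splits[:2]:
    print(len(train), len(test))  # 15 5
```

Each pair would then be used to train and test one model, so that the spread of results across the splits indicates how sensitive the model is to the choice of training set.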

Maximum allowed correlation between any pair of independent variables box

Independent variables are selected from a pool of several hundred molecular descriptors and fingerprints (depending on the model). Variables that are too strongly correlated with other variables are of little use in the model. The value specified in this box sets the criterion for discarding such variables. This option is only available for the Traditional method.
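The idea of discarding strongly correlated variables can be sketched in Python as follows. This is only an illustration under stated assumptions: the use of the absolute Pearson correlation, the greedy rule that keeps the first variable of a correlated pair, and the descriptor names and values are all hypothetical, and the rule that AutoQSAR actually applies may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant variable has no defined correlation
    return cov / (norm_x * norm_y)


def filter_correlated(descriptors, max_correlation):
    """Keep a variable only if its |r| with every already-kept variable is within the limit."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson(values, descriptors[k])) <= max_correlation for k in kept):
            kept.append(name)
    return kept


# Made-up descriptor values for five structures.
descriptors = {
    "mol_weight": [100.0, 150.0, 200.0, 250.0, 300.0],
    "heavy_atoms": [7.0, 11.0, 14.0, 18.0, 21.0],  # nearly proportional to mol_weight
    "logp": [2.0, 0.5, 3.1, 1.0, 2.2],
}

print(filter_correlated(descriptors, max_correlation=0.8))  # ['mol_weight', 'logp']
```

Lowering the maximum allowed correlation discards more variables; raising it toward 1 keeps nearly all of them.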

Descriptors section

In this section, you can select classes of descriptors and add custom descriptors for developing the QSAR models. Binary fingerprints and numeric descriptors are intended for relatively small molecules, and are not generally appropriate for other applications, such as periodic structures.

Binary fingerprints option

Include binary fingerprints in the descriptors used for the QSAR models. The types of fingerprints used are listed to the right.

Numeric descriptors option

Include classes of numeric descriptors in the descriptors used for the QSAR models. The classes are listed to the right.

Other properties from option and buttons

Click one of these buttons to include descriptors from other properties in the QSAR models. The properties must be numeric properties that are defined for the selected entries in the Project Table, or that exist in the file for the structures used, depending on the structure source for the model.

  • Properties—Select the properties by choosing them in a property selector, in which you can sort and filter the properties and select the ones you want to use.

  • File—Read a list of the properties from a plain text file. The button opens a file selector, in which you can locate and open the file that has the property names. The text file must have the name of one property on each line, and each name must be exactly as it appears in the structure source. For Maestro files, this is the internal name, e.g. r_user_Activity.
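As an illustration of the expected file layout, the following Python sketch writes a property list with one name per line and reads it back. The file name properties.txt, the helper function, and the second property name r_user_Solubility are hypothetical; stripping whitespace and skipping blank lines are conveniences of the sketch, not documented behavior of the File button.

```python
import os
import tempfile


def read_property_names(path):
    """Return the non-empty lines of a property list file, one name per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "properties.txt")  # hypothetical file name
    # One property per line, spelled exactly as in the structure source.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("r_user_Activity\nr_user_Solubility\n")
    names = read_property_names(path)

print(names)  # ['r_user_Activity', 'r_user_Solubility']
```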

Note that there is no guarantee that any or all of the added properties will survive the selection process: these properties are treated the same as the binary fingerprints and numeric descriptors.

Descriptor category table

This table lists the custom descriptors that are added with the Other properties from option and buttons. Each descriptor has a check box that you can use to include or exclude it from the QSAR models. You can select rows in the table and delete them with the Delete button to the right of the table.

Reset button

Reset all options in this dialog box to their default values.