Atom-Based QSAR Panel

In this panel you can set up a Phase 3D QSAR model from a set of aligned ligands, and use the model to predict activities for other molecules. You can also visualize the QSAR model in the Workspace, and add the QSAR model to a hypothesis.

To open this panel: click the Tasks button and browse to Discovery Informatics and QSAR → Atom-Based.

Atom-Based QSAR Panel Features

View QSAR Model button

Click this button to view the QSAR model in the Workspace.

Add ligands buttons

These buttons allow you to add ligands to the ligands table.

  • From Project—Opens the Add From Project Dialog Box, in which you can choose a set of entries and select an activity property, converting it into the appropriate units if need be.

  • From File—Opens a file selector, in which you can navigate to and select the file. When you click OK, the Choose Activity Property Dialog Box opens, in which you can select an activity property, converting it into the appropriate units if need be, and select a QSAR set property, to define the members of the training set and the test set by a value of this property.

The ligands you add must be fully prepared 3D structures that are properly aligned. No facility is provided in this panel for preparing the structures or aligning the ligands.

Delete and Delete All buttons

These buttons allow you to delete ligands from the ligands table. The Delete button deletes the selected ligands from the table, so that you can change a model by, for example, removing or replacing ligands. The Delete All button removes all ligands from the table, which is useful if you want to create a model with a different set of ligands.

Ligands table

This table contains the list of ligands. When the ligands are first read, all ligands are included in the training set, and the # Factors and Predicted Activity columns are empty. These columns are filled in after the QSAR model is built. The table columns are described below.

Most of the columns of this table are noneditable. You can change the activity values, select the training and test sets, and display the ligands in the Workspace. You can sort the table by the values in a column, by clicking the column heading. Use shift-click and control-click to select multiple rows.

In   Inclusion status of the ligand. The diamond has a cross in it if the ligand is included in the Workspace, and is empty if the ligand is excluded. You can include and exclude ligands with click, shift-click and control-click.
Ligand Name   The name of the ligand.
QSAR Set   Indicates whether a ligand is in the training set, the test set, or neither (the ligand is ignored). The column is blank if the ligand is ignored. Click the column repeatedly to cycle the ligand through the three possible states. Control-click to cycle the selected ligands through the three states. The state for the selected ligands is set to the state for the row that is clicked.
Activity   The ligand's activity. You can alter the activity values by directly editing the table cells.
# Factors   Number of factors in the partial least squares regression model.
Predicted Activity   Activity predicted by the QSAR model. The number of rows in each cell is equal to the maximum number of PLS factors specified in the Build Atom-Based Model Dialog Box. Each row contains the prediction from a model containing the number of PLS factors indicated in the # Factors column.
Prediction Error   Error in the activity predicted by the QSAR model. The number of rows in each cell is equal to the maximum number of PLS factors specified in the Build Atom-Based Model Dialog Box. Each row contains the prediction error from a model containing the number of PLS factors indicated in the # Factors column.

Random Training Set Controls

These controls allow you to randomly select the training set.

Random training set text box

Specify the percentage of ligands to include in the training set by random selection from the ligands in the union of the training set and the test set.

Apply button

Click to apply a random selection of the training set from the ligands that are in the union of the training set and the test set. The ligands that are not selected are assigned to the test set.

Random seed text box

Enter the seed for the random selection of the training set in this text box. A zero value means that a different seed will be selected each time, and hence a different training set. A nonzero value means that the same seed is used each time, which produces the same training set.
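The behavior of these controls can be illustrated with a short Python sketch. It is not the panel's implementation: the function name random_training_split, the rounding rule, and the use of Python's random module are assumptions made for illustration only.

```python
import random


def random_training_split(ligands, percent_training, seed=0):
    """Assign each ligand to the training set or the test set at random.

    ligands: unique names of the ligands that are currently in the training
             or test set (ignored ligands are left out by the caller).
    percent_training: percentage (0 to 100) of ligands for the training set.
    seed: 0 uses a different seed on every call; a nonzero value gives the
          same selection every time.
    """
    if not 0 <= percent_training <= 100:
        raise ValueError("percent_training must be between 0 and 100")
    # Seeding with None makes Python draw a fresh seed from system entropy.
    rng = random.Random(seed if seed != 0 else None)
    pool = list(ligands)
    # Assumed rounding rule; the panel may round the count differently.
    n_train = round(len(pool) * percent_training / 100)
    training = set(rng.sample(pool, n_train))
    # Ligands that are not selected are assigned to the test set.
    return {name: ("training" if name in training else "test") for name in pool}
```

Calling the function twice with the same nonzero seed returns the same assignment, which mirrors the reproducibility described for the Random seed text box.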

Model buttons

These buttons allow you to perform different actions on the QSAR model.

Build

Build the model. Opens the Build Atom-Based Model Dialog Box, in which you can specify parameters for building the model, and then build the model.

Import

Import an existing model. The model includes the ligands, the QSAR training and test set membership, and the regression information.

Test

Generate predicted activities for the test set.

QSAR statistics table

The QSAR Results table shows the statistics of the fit for the training set and the test set. Each row presents the results for a hypothesis. Within each row are lines for regression models with a particular number of partial least squares factors included. The QSAR Results table has the following columns:

# Factors   Number of factors in the partial least squares regression model.
SD   Standard deviation of the regression.
R^2   Value of R^2 for the regression.
R^2 CV   Cross-validated R^2 value, computed from predictions obtained by a leave-N-out approach. The value of N is specified in the Build Atom-Based Model Dialog Box.
R^2 Scramble   Average value of R^2 from a series of models built using scrambled activities. Measures the degree to which the molecular fields can fit meaningless data, and should be low.
Stability   Stability of the model predictions to changes in the training set composition. This statistic has a maximum value of 1 (meaning stable).
F   Variance ratio. Large values of F indicate a more statistically significant regression.
P   Significance level of variance ratio. Smaller values indicate a greater degree of confidence.
RMSE   Root-mean-square error of the test set.
Q^2   Value of Q^2 for the predicted activities of the test set.
Pearson-r   Value of the Pearson correlation coefficient (r) for the predicted activities of the test set.
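
The three test-set columns (RMSE, Q^2, and Pearson-r) can be illustrated with a short Python sketch. The formulas are common textbook definitions and are assumptions here: this page does not state how Phase computes Q^2, in particular whether the reference mean comes from the training set or the test set. The function name summarize_test_set is hypothetical.

```python
import math


def summarize_test_set(observed, predicted, training_mean):
    """Return RMSE, Q^2 and Pearson r for a set of test-set predictions.

    observed, predicted: equal-length sequences of activities.
    training_mean: mean observed activity of the training set, used as the
                   reference for Q^2 (an assumed convention).
    """
    n = len(observed)
    if n < 2 or n != len(predicted):
        raise ValueError("need two or more paired activity values")

    # Root-mean-square error of the predictions.
    residual_ss = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    rmse = math.sqrt(residual_ss / n)

    # Q^2 = 1 - (residual sum of squares) / (total sum of squares).
    total_ss = sum((o - training_mean) ** 2 for o in observed)
    q2 = 1.0 - residual_ss / total_ss

    # Pearson correlation between observed and predicted activities.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    pearson_r = cov / math.sqrt(var_o * var_p)

    return {"RMSE": rmse, "Q^2": q2, "Pearson-r": pearson_r}
```

The sketch does not handle constant activities, for which the sums of squares are zero. Note that predictions with a constant offset can give a Pearson r of 1 while Q^2 is still well below 1, which is one reason to examine both columns.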

Atom type fractions table

Display the fraction due to each atom type in the QSAR model for each number of PLS factors used in the model. Only those atom types that are present in the training set are represented in the table. This information can give you a general idea of how much each atom type affects the activity.

Action buttons

The following buttons can be used to perform actions once a QSAR model is available.

Export

Export the QSAR model to files. The model data is written to the named file, for example mymodel.qsar. The ligands are written to a Maestro file, which for the example given would be mymodel_qsar_pred.mae. The activity properties and the QSAR set membership are written to the ligand file.
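
If you process exported models in a script, the name of the ligand file can be derived from the name of the model file. The sketch below generalizes from the single example given above (mymodel.qsar and mymodel_qsar_pred.mae); the naming rule for other file names and the function name ligand_file_for_model are assumptions.

```python
from pathlib import Path


def ligand_file_for_model(model_file):
    """Return the expected ligand file path for an exported QSAR model file.

    Assumes the pattern <name>.qsar -> <name>_qsar_pred.mae, inferred from
    the one example in this topic.
    """
    path = Path(model_file)
    # Replace the file name, keeping the same directory as the model file.
    return path.with_name(f"{path.stem}_qsar_pred.mae")
```

For example, ligand_file_for_model("mymodel.qsar") returns a path whose file name is mymodel_qsar_pred.mae.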

QSAR Visualization

Opens the QSAR Visualization Settings Panel, in which you can make settings for the visualization of the QSAR model in the Workspace.

Predict

Predict the activity for one or more molecules. Opens an entry chooser, in which you can choose the entries. The molecules must therefore exist as entries in the Project Table. The predicted activities are added as properties to the project entries.