Build and Apply QSAR/QSPR Model — Recursive Partitioning — Advanced Options Pane

Set options for the recursive partitioning model. To open this pane, select Recursive Partitioning in the Method option menu in the Build Task, then go to the Recursive Partitioning Settings section and select Advanced Options.

Recursive Partitioning — Advanced Options Pane Features

Partitioning section

In this section you can set various parameters that control how the partitioning (splitting) is performed at each node in the tree.

Algorithm options

There are two choices for the partitioning algorithm:

  • Gini impurity—Use the Gini impurity criterion for partitioning the observations. This criterion measures the divergences between the probability distributions of the target values.

  • Information gain—Use the information gain for partitioning the observations. This criterion uses an entropy measure.
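
The two criteria can be compared on a small example. The following is a minimal sketch using the standard textbook definitions of Gini impurity and entropy; the function names and the toy class labels are hypothetical, and the product's own implementation may differ in detail.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class fractions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_gain(parent, left, right, impurity):
    """Decrease in impurity when parent is split into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Hypothetical node with 4 active and 4 inactive observations,
# split into two children of 4 observations each.
parent = ["active"] * 4 + ["inactive"] * 4
left = ["active"] * 3 + ["inactive"]
right = ["active"] + ["inactive"] * 3

gini_gain = split_gain(parent, left, right, gini_impurity)  # Gini criterion
info_gain = split_gain(parent, left, right, entropy)        # information gain
```

In both cases the candidate split with the largest gain would be preferred; the criteria differ only in how the impurity of a node is measured.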

Minimum leaf size text box

Minimum size of a leaf node in the decision tree. If partitioning a node would create a child node that is smaller than this size, the node is not split and is kept as a leaf node.
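
The rule above amounts to a simple check on the sizes of the nodes a split would produce. A minimal sketch, assuming that size means the number of observations in a node and that a split produces two nodes (the function name is hypothetical):

```python
def can_split(n_left, n_right, min_leaf_size):
    """Return True only if neither resulting node is smaller than the minimum."""
    return n_left >= min_leaf_size and n_right >= min_leaf_size

# With a minimum leaf size of 5, a 6/4 split is rejected and the node
# stays a leaf, whereas a 6/5 split is allowed.
rejected = can_split(6, 4, min_leaf_size=5)
allowed = can_split(6, 5, min_leaf_size=5)
```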

Find split values from nearest values options

There are two options for defining the split value from the two training set property values on either side of the split point:

  • Lower—Use the lower of the two values.

  • Average—Use the average of the two values.

For example, if the values on either side of the split were 2.5 and 2.8, selecting Average would set the split point to 2.65, whereas selecting Lower would set the split point to 2.5. The test for this node would then be x ≤ 2.65 or x ≤ 2.5, where x is the property (or attribute) used for partitioning.
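
The choice matters for new observations whose value falls between the two training set values. The following minimal sketch reproduces the example above; the function name and the mode strings are hypothetical.

```python
def split_value(lower, upper, mode="average"):
    """Pick the split point from the two training values around the split."""
    if mode == "lower":
        return lower
    if mode == "average":
        return (lower + upper) / 2.0
    raise ValueError("mode must be 'lower' or 'average'")

lower_split = split_value(2.5, 2.8, mode="lower")      # 2.5
average_split = split_value(2.5, 2.8, mode="average")  # approximately 2.65

# A new observation with x = 2.6 passes the test x <= split only
# when the average is used.
x = 2.6
passes_with_lower = x <= lower_split
passes_with_average = x <= average_split
```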

Ensemble Model section

In this section you can specify parameters for ensemble model building. This section is not available if you are building a single tree model.

Number of trees text box

Specify the number of trees to generate for the ensemble model.

Training set % text box

Percentage of the training set, selected at random, that is used to build each tree in the ensemble.
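
Together with the number of trees, this setting determines how many random subsets are drawn and how large each one is. A minimal sketch, assuming that observations are drawn without replacement and independently for each tree (this pane does not state the sampling scheme; all names and values are hypothetical):

```python
import random

def sample_training_subset(n_observations, percent, rng):
    """Return the sorted indices of a random subset of the training set."""
    k = max(1, round(n_observations * percent / 100.0))
    return sorted(rng.sample(range(n_observations), k))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_trees = 10            # hypothetical value for "Number of trees"
training_percent = 75   # hypothetical value for "Training set %"

# One subset of indices per tree, drawn from 200 training observations.
subsets = [sample_training_subset(200, training_percent, rng) for _ in range(n_trees)]
```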

Use average leaf node probabilities option

Use the class populations in the leaf node to assign class probabilities to an observation. By default, the class populations in the leaf node are weighted by the class populations in the training set to arrive at a probability for each class.
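
The difference between the two behaviors is easiest to see for an imbalanced training set. The exact weighting formula is not given here, so the following is only a sketch under one plausible assumption: each leaf count is divided by the population of that class in the training set, and the results are normalized to sum to 1. The function name and the counts are hypothetical.

```python
def leaf_probabilities(leaf_counts, train_counts=None):
    """Class probabilities for a leaf node.

    If train_counts is given, each leaf count is first divided by the
    training set population of its class (an assumed weighting scheme).
    """
    if train_counts is None:
        weights = dict(leaf_counts)
    else:
        weights = {c: n / train_counts[c] for c, n in leaf_counts.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

leaf = {"active": 3, "inactive": 9}       # class populations in the leaf
train = {"active": 20, "inactive": 180}   # class populations in the training set

unweighted = leaf_probabilities(leaf)        # leaf populations only
weighted = leaf_probabilities(leaf, train)   # assumed default-style weighting
```

Under this assumption the unweighted probabilities favor the majority class (0.25 active, 0.75 inactive), while the weighted probabilities favor the class that is enriched in the leaf relative to the training set (0.75 active, 0.25 inactive).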

Require unique root attributes option

If this option is selected, a different property is selected in each tree for the first partitioning of the training set. If it is not selected, no restriction is placed on the property used for the first partitioning of the training set.