Peptide QSAR — Advanced Options - KPLS Dialog Box

Set options for the kernel-based partial least-squares (KPLS) procedure for fitting the descriptors. For more information on this method, see Model-Building Methods.

To open this panel, click Advanced Options for KPLS in the Setup tab of the Peptide QSAR panel.

Advanced Options - KPLS Dialog Box Features

Maximum number of KPLS factors box
Kernel nonlinearity slider, box, and Reset button
Stop adding KPLS factors when standard deviation of the regression drops to option and text box
Calculate uncertainty on test set predictions option
Use N bootstrapping cycles box

Maximum number of KPLS factors box: Specify the maximum number of KPLS factors to use in the regression model. Regression models are built for increasing numbers of KPLS factors up to this number. The maximum number that can be used is limited by the number of descriptors, which is 3 times the number of residues for the zvalue set, 5 times the number of residues for the ezvalue set, and 10 times the number of residues for the dpps set. It is rarely useful to build models with more than a few KPLS factors, because models with many factors tend to be overfit. Examine the statistics, particularly the stability and Q², to decide how many KPLS factors to use in the model that you choose for application to new systems.
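As a quick check of the limit described above, the following Python sketch computes the descriptor count for a peptide of a given length. The function name and the dictionary are illustrative and are not part of the software; only the per-residue multipliers come from the text above.

```python
# Descriptors per residue for each descriptor set, as stated above.
DESCRIPTORS_PER_RESIDUE = {"zvalue": 3, "ezvalue": 5, "dpps": 10}


def max_kpls_factors(num_residues: int, descriptor_set: str) -> int:
    """Return the descriptor count, which caps the number of KPLS factors."""
    if num_residues < 1:
        raise ValueError("num_residues must be positive")
    return DESCRIPTORS_PER_RESIDUE[descriptor_set] * num_residues


for name in DESCRIPTORS_PER_RESIDUE:
    print(name, max_kpls_factors(9, name))
```

For a 9-residue peptide this gives limits of 27 (zvalue), 45 (ezvalue), and 90 (dpps), all far above the few factors that are usually worth keeping.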
Kernel nonlinearity slider, box, and Reset button: Change the kernel nonlinearity value. A Gaussian kernel exp(−d²/σ²) is used, where d is the Euclidean distance between two X variables. The nonlinearity value is 1/σ, so small values give an almost linear model and large values give a very nonlinear one. Higher nonlinearity typically leads to tighter fitting of the training set, but it also tends to give poorer predictions on new peptides.
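To make the role of the nonlinearity value concrete, here is a minimal Python sketch of the kernel formula given above. It assumes that the two X variables are descriptor vectors of equal length. The function name and the sample vectors are invented for illustration and do not reflect how the software is implemented.

```python
import math


def gaussian_kernel(x, y, nonlinearity):
    """Gaussian kernel exp(-d^2 / sigma^2) with sigma = 1 / nonlinearity."""
    if nonlinearity <= 0:
        raise ValueError("nonlinearity must be positive")
    # Squared Euclidean distance between the two vectors.
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    # exp(-d^2 / sigma^2) is the same as exp(-(nonlinearity * d)^2).
    return math.exp(-d_squared * nonlinearity ** 2)


x = [0.5, -1.0, 2.0]  # made-up descriptor values
y = [1.5, 0.0, 1.0]
for value in (0.05, 0.5, 2.0):
    print(value, gaussian_kernel(x, y, value))
```

With a small nonlinearity value the kernel stays close to 1 and changes slowly with distance. With a large value it falls toward 0 for all but the nearest neighbors, which is what allows tight fitting of the training set.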
Stop adding KPLS factors when standard deviation of the regression drops to option and text box: Select this option to stop adding KPLS factors when the standard deviation of the regression drops below the value specified in the text box. Using this option could result in fewer KPLS factors than the number specified in the Maximum number of KPLS factors box.
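The interaction between this option and the maximum number of factors can be sketched as follows. This is an illustration only: the function name, the input list, and the choice to keep the factor that first takes the standard deviation below the threshold are assumptions, not documented behavior.

```python
def select_num_factors(sd_by_factor, max_factors, sd_threshold=None):
    """Return how many factors are kept under the stopping rule.

    sd_by_factor[i] is the regression standard deviation of the model
    built with i + 1 factors (hypothetical input for this sketch).
    """
    limit = min(max_factors, len(sd_by_factor))
    if sd_threshold is None:
        # Option not selected: build models up to the maximum.
        return limit
    for n_factors in range(1, limit + 1):
        if sd_by_factor[n_factors - 1] < sd_threshold:
            return n_factors  # threshold reached: stop adding factors
    return limit


sds = [0.92, 0.61, 0.38, 0.22, 0.15]  # made-up values for illustration
print(select_num_factors(sds, max_factors=5, sd_threshold=0.4))  # 3
print(select_num_factors(sds, max_factors=5))  # 5
```

With a threshold of 0.4, only three of the five allowed factors are used in this example, because the third model is the first whose standard deviation falls below the threshold.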
Calculate uncertainty on test set predictions option: Calculate a confidence interval for each predicted value in the test set, by bootstrapping. This is done by sampling the training set randomly with replacement to generate a new training set of the same size that contains duplicates, building a model from it and making predictions for the test set, then repeating the procedure a specified number of times. For each test set value, the standard deviation of the predictions over these cycles is then reported as the uncertainty.
Use N bootstrapping cycles box: Specify the number of times a random sample is made and a prediction obtained in the uncertainty calculations. This number determines how many values are used in calculating the standard deviation, and should be at least 5.
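The bootstrapping procedure can be sketched in Python as shown below. This is a simplified illustration under stated assumptions: a one-variable least-squares line stands in for the KPLS model, the sample standard deviation over the cycles is taken as the uncertainty, and all names and data are invented for the example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the KPLS model in this sketch."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to the mean
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def bootstrap_uncertainty(train_x, train_y, test_x, n_cycles=10, seed=0):
    """Standard deviation of test set predictions over bootstrap cycles."""
    if n_cycles < 2:
        raise ValueError("need at least 2 cycles for a standard deviation")
    rng = random.Random(seed)
    n = len(train_x)
    predictions = [[] for _ in test_x]
    for _ in range(n_cycles):
        # Resample the training set with replacement, keeping its size.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_line([train_x[i] for i in idx], [train_y[i] for i in idx])
        # Predict every test set value with the model from this cycle.
        for preds, x in zip(predictions, test_x):
            preds.append(model(x))
    return [statistics.stdev(p) for p in predictions]


train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up training data
train_y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
print(bootstrap_uncertainty(train_x, train_y, test_x=[2.5, 7.0], n_cycles=20))
```

Each cycle contributes one prediction per test set value, so the number of cycles is the number of values that go into each standard deviation. With very few cycles the estimate is itself noisy, which is consistent with the recommended minimum of 5.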
