HIERARCHICAL RIDGE REGRESSION

Performs a hierarchical PWAS constrained ridge regression on a set of data

Function of MOBY-DIC TOOLBOX.

Description

This function is closely related to ridgeRegression. ridgeRegression finds a single PWAS function f that fits data X and Y by minimizing the squared error $[f(X)-Y]^2$. Here, several PWAS functions are computed such that their sum fits the data X and Y, by minimizing:

$$ \left[ \sum_{i=1}^{nfun} f_i(x) - Y \right]^2 \qquad (1) $$

Each function $f_i$ is a PWAS function of the form:

$$ f(x) = \sum_{j=1}^{Nbs} w_j \alpha_j(x) \qquad (2) $$

This mitigates the curse of dimensionality that affects the classical ridge regression solution, because fewer weights need to be computed.
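To make the saving concrete, the sketch below counts weights under the assumption that a PWAS function on a simplicial partition carries one weight per partition vertex, so that P subdivisions along each of n dimensions give (P+1)^n weights. It is written in Python purely for illustration and is not toolbox code; the helper name n_weights and the split of a 4-dimensional problem into two 2-dimensional functions are hypothetical.

```python
from math import prod

def n_weights(subdivisions):
    """Vertices of a simplicial partition with the given number of
    subdivisions per dimension (assumption: one weight per vertex)."""
    return prod(p + 1 for p in subdivisions)

# A single PWAS function over all 4 dimensions, 10 subdivisions each.
full = n_weights([10, 10, 10, 10])

# Hypothetical hierarchical split: two PWAS functions, each over 2 dimensions.
split = n_weights([10, 10]) + n_weights([10, 10])

print(full, split)
```

Under these assumptions the single 4-dimensional function needs 11^4 weights, while the two 2-dimensional functions together need only 2 * 11^2.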

Syntax

[fpwas info] = hierarchicalRidgeRegression(X,Y,P)

X must be a cell array of matrices and Y a cell array of arrays. The number of cells equals the number of PWAS functions to be summed.

P defines the simplicial partition on which the fpwas functions are defined. If P is a scalar, each dimension of the domain of every PWAS function is subdivided into P intervals. If P is a cell array, it specifies the number of subdivisions per dimension individually for each PWAS function; each element of P can be an array or itself a cell array.

The domain of the PWAS functions is inferred automatically from the input data X and Xt. Xt and Yt (optional) are a test dataset used to estimate the optimal value of the Tikhonov parameter lambda: lambda is chosen to minimize | f(Xt) - Yt |^2, where f is the PWAS function obtained from the training dataset (X,Y). If Xt and Yt are not provided, lambda is chosen with a GCV (generalized cross-validation) approach. Xt must be a [ndatatest x ndim] matrix and Yt a [ndatatest x 1] array.
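The test-set selection of lambda can be sketched on a deliberately tiny problem. The Python example below is an illustration only, not the toolbox's procedure: it assumes a one-weight model f(x) = w*x with the standard ridge penalty lambda*w^2, and a hypothetical grid of candidate values. The names ridge_weight, validation_error and select_lambda are invented for this sketch.

```python
def ridge_weight(x, y, lam):
    """Closed-form ridge solution for the one-weight model f(x) = w * x."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)

def validation_error(w, xt, yt):
    """Squared error | f(Xt) - Yt |^2 on the test dataset."""
    return sum((w * a - b) ** 2 for a, b in zip(xt, yt))

def select_lambda(x, y, xt, yt, candidates):
    """Return the candidate lambda giving the smallest test-set error."""
    return min(
        candidates,
        key=lambda lam: validation_error(ridge_weight(x, y, lam), xt, yt),
    )

# Toy training and test data (made up for the example).
x, y = [1.0, 2.0, 3.0], [1.2, 1.9, 3.3]
xt, yt = [1.5, 2.5], [1.5, 2.5]
candidates = [0.0, 0.1, 0.5, 1.0, 10.0]

best = select_lambda(x, y, xt, yt, candidates)
print(best)
```

The weight is always fitted on (X,Y) only; the test dataset is used solely to rank the candidate values of lambda.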

fpwas is an array of pwas objects, one for each PWAS function obtained from the regression; their sum fits the data Y.

info is a struct with the following fields:

[fpwas info] = hierarchicalRidgeRegression(X,Y,P,D)

As above, but the domain of the PWAS functions is supplied by the caller. D is a cell array of matrices of the form:

$$\left[ \begin{array}{cccc} x_{min}^1 & x_{min}^2 & \ldots & x_{min}^{nx}\\ x_{max}^1 & x_{max}^2 & \ldots & x_{max}^{nx} \end{array} \right] $$

Each element of the cell array corresponds to one PWAS function.
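As an illustration of the layout of one such matrix, the Python sketch below builds the first row from per-dimension minima and the second from per-dimension maxima of a data matrix with one sample per row. Taking the plain bounding box of the data is an assumption made for this example; the rule the toolbox uses when it infers the domain itself is not described here, and the helper name bounding_box is hypothetical.

```python
def bounding_box(X):
    """Return [[xmin_1, ..., xmin_nx], [xmax_1, ..., xmax_nx]]
    for data X given as a list of rows (one sample per row)."""
    columns = list(zip(*X))  # transpose: one tuple per dimension
    return [[min(c) for c in columns], [max(c) for c in columns]]

# Three samples in a 2-dimensional domain (made-up values).
X = [[0.0, 5.0], [1.0, -2.0], [-3.0, 4.0]]
D = bounding_box(X)
print(D)
```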

[fpwas info] = hierarchicalRidgeRegression(X,Y,P,options)

options is a structure with the following fields:

[fpwas info] = hierarchicalRidgeRegression(X,Y,P,D,options)

As above, with all the arguments described above specified.

Acknowledgements

Contributors:

Copyright is with: