Introduction
This MATLAB function implements the method proposed in the paper "Regularized Sliced Inverse Regression for Kernel Models" by Wu et al. (2008) from the Department of Statistical Science at Duke University. It extends the sliced inverse regression (SIR) framework to nonlinear dimension reduction using kernel models and regularization, and it can be applied to high-dimensional data. Download here (.m file).
Inputs and Outputs
Suppose the sample size is n and the number of explanatory variables is p, and let x_i be the p-vector of covariates for the i-th observation, i=1,...,n. The inputs of this function are an n-by-n kernel matrix KER whose (i,j)-th element is K(x_i,x_j), where K(.,.) is a suitable kernel function; a response vector Y of length n; the number of effective dimension reduction (EDR) dimensions d; a regularization parameter s; and (optionally) a structure that specifies whether the problem is regression or classification and the number of slices. The function returns a structure SIR that contains the estimated quantities needed to compute the kSIR variates. Specifically, the command
SIR.C*(KER(:,i)-mean(KER(:,i)))+SIR.b
returns the kSIR variate(s) for the i-th observation. For a new observation (possibly from a test dataset) with covariate vector x*, let KERx=(K(x*,x_1),...,K(x*,x_n))'; then "SIR.C*(KERx-mean(KERx))+SIR.b" returns the kSIR variate(s) for this new observation.
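The projection of a new observation can be sketched as follows. This is only an illustration: the helper names gausskern and ksirproject, the Gaussian kernel form, and the bandwidth value are assumptions, and the SIR structure below is a stand-in with assumed shapes (C is d-by-n, b is d-by-1) rather than an actual KSIR output. Only the final expression comes from the description above.

```matlab
% Sketch: kSIR variate(s) for a new observation xnew (p-by-1).
% Assumed Gaussian kernel: K(x,z) = exp(-||x - z||^2 / (2*bw^2)).
gausskern   = @(Xtr, x, bw) exp(-sum(bsxfun(@minus, Xtr, x(:)').^2, 2) / (2*bw^2));
ksirproject = @(SIR, KERx) SIR.C*(KERx - mean(KERx)) + SIR.b;

% Placeholder training data and a stand-in SIR structure (not a KSIR output).
X     = [0 0; 1 0; 0 1];      % n = 3 training points, p = 2
SIR.C = [1 0 -1];             % assumed shape d-by-n with d = 1
SIR.b = 0.5;                  % assumed shape d-by-1

xnew = [0; 0];
KERx = gausskern(X, xnew, 1); % n-by-1 vector (K(x*,x_1),...,K(x*,x_n))'
v    = ksirproject(SIR, KERx);
```

The kernel and bandwidth used for KERx must match those used to build KER, otherwise the projection is not comparable with the training variates.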
Try "help KSIR" in the command line for more details.
A Toy Example
In this section we illustrate how kSIR achieves nonlinear dimension reduction, using a toy regression problem.
Suppose there are 400 observations, where each x_i (i=1,...,400) is a 5-vector drawn from a 5-dimensional multivariate normal distribution with mean 0 and identity covariance matrix. The i-th response y_i equals the sum of the squares of the first and second components of x_i plus a random error drawn from a univariate normal distribution with mean 0 and standard deviation 0.2.
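A minimal sketch of how such a data set could be generated is shown here. The variable names and the fixed random seed are assumptions made for reproducibility; they are not part of the KSIR function.

```matlab
% Sketch: toy data as described above (names and seed are assumptions).
rng(0);                           % fixed seed, chosen only for reproducibility
n = 400; p = 5;
X = randn(n, p);                  % rows are x_i ~ N(0, I_5)
noise = 0.2*randn(n, 1);          % N(0, 0.2^2) errors
Y = X(:,1).^2 + X(:,2).^2 + noise;   % y_i = x_i1^2 + x_i2^2 + error
```

Using randn directly avoids any toolbox dependency, since independent standard normal columns already have identity covariance.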
Because the response depends on the covariates only through squared terms, no linear combination of the explanatory variables can explain the variation in the response in this example.
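This can be checked numerically with an ordinary least-squares fit. The sketch below regenerates the toy data (same assumed names and seed as a convenience) and computes the coefficient of determination R2, which should come out close to zero.

```matlab
% Sketch: a linear fit explains almost none of the variation in Y.
rng(0);                                   % assumed seed, for reproducibility
n = 400; p = 5;
X = randn(n, p);
Y = X(:,1).^2 + X(:,2).^2 + 0.2*randn(n, 1);

A     = [ones(n,1) X];                    % design matrix with intercept
beta  = A \ Y;                            % least-squares coefficients
resid = Y - A*beta;
R2    = 1 - sum(resid.^2) / sum((Y - mean(Y)).^2);
```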
For kSIR one first specifies a suitable kernel function K(.,.), for example a Gaussian kernel, which in one common parameterization is K(x_i,x_j) = exp(-||x_i - x_j||^2 / (2 s^2)), with s being a bandwidth parameter.
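A kernel matrix for a Gaussian kernel might be computed as in the sketch below. The parameterization exp(-||x_i - x_j||^2 / (2*bw^2)) and the bandwidth value bw = 2 are assumptions for illustration; the variable bw plays the role of the bandwidth parameter, and the data are placeholders.

```matlab
% Sketch: n-by-n Gaussian kernel matrix for the rows of X.
% Assumed form: K(x_i,x_j) = exp(-||x_i - x_j||^2 / (2*bw^2)).
rng(0);                                   % assumed seed, for reproducibility
X  = randn(400, 5);                       % placeholder covariates
bw = 2;                                   % illustrative bandwidth

sq  = sum(X.^2, 2);                       % squared norms, n-by-1
D2  = bsxfun(@plus, sq, sq') - 2*(X*X');  % pairwise squared distances
D2  = max(D2, 0);                         % clip tiny negatives from round-off
KER = exp(-D2 / (2*bw^2));                % (i,j)-th element is K(x_i,x_j)
```

In practice the bandwidth would be tuned, for example by cross-validation; the value here is not a recommendation.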
Now suppose KER is a 400-by-400 matrix whose (i,j)-th element is K(x_i,x_j), and type
[SIR] = KSIR(KER, Y, 1, 0.1, opts);
in the command line, with opts.pType = 'r' and opts.H = 20. The third argument tells the function to estimate 1 EDR direction, and hence 1 kSIR variate, and the fourth argument sets the regularization parameter to 0.1. opts is a structure: opts.pType = 'r' selects regression and opts.H = 20 sets the number of slices to 20. Now the command
var1 = zeros(400,1);
for i = 1:400
    var1(i) = SIR.C*(KER(:,i)-mean(KER(:,i)))+SIR.b;
end
calculates one kSIR variate (since only one EDR direction is chosen) for each observation, and "plot(var1,Y, '.')" produces the figure below, which shows that this variate is highly predictive of the response.
![kSIR variate (var1) plotted against the response Y for the toy example](fig/ksirtoy.jpg)
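The per-observation loop used to compute the variates can also be written in vectorized form. The sketch below compares the two on placeholder inputs; the SIR structure is a stand-in with assumed shapes (C is d-by-n, b is d-by-1), not an actual KSIR output, and the small KER matrix is arbitrary.

```matlab
% Sketch: all kSIR variates at once, equivalent to the loop over i.
n     = 4;
KER   = magic(n) / 10;        % placeholder n-by-n matrix
SIR.C = [1 -1 2 0];           % stand-in, assumed shape d-by-n with d = 1
SIR.b = 0.3;                  % stand-in, assumed shape d-by-1

% Loop form, as in the text
var_loop = zeros(n, 1);
for i = 1:n
    var_loop(i) = SIR.C*(KER(:,i) - mean(KER(:,i))) + SIR.b;
end

% Vectorized form: subtract each column's mean, project, add the offset
KERc    = bsxfun(@minus, KER, mean(KER, 1));
var_vec = bsxfun(@plus, SIR.C*KERc, SIR.b)';   % n-by-d
```

bsxfun is used so that the sketch also runs on MATLAB releases without implicit expansion.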