Introduction

This software implements the method proposed in K. Mao et al. (2008), which performs simultaneous dimension reduction and regression, as well as inference of graphical models, from a Bayesian perspective. It implements the Markov chain Monte Carlo procedure described in that paper and returns posterior draws of quantities such as the effective dimension reduction directions and the gradient outer product (GOP) matrix, from which further inference can be made and uncertainty can be quantified. Both regression and binary classification are supported. The software runs on Matlab. Download here.

A Brief Tutorial

Unzip the downloaded file. Before starting, make sure that the folder "gradlearn/" and its subfolders and files are on the Matlab search path. The input training data should consist of a response vector Y and a covariate matrix X whose rows are observations and whose columns are dimensions (input variables). The program returns posterior draws of the RKHS (reproducing kernel Hilbert space) norm for each dimension and, optionally, posterior draws of the dimension reduction directions and the GOP.

For example, suppose we have training data X and Y formatted as described above. We can then type

[gnorm, dr, gop]=gradlearn(Y, X, 'r')

at the command line. The third argument selects the task: 'r' for regression and 'c' for binary classification. The output "gnorm" stores the posterior draws of the RKHS norms, with rows corresponding to dimensions and columns corresponding to draws. "dr" is a cell array in which dr{i} holds the draws of the i-th dimension reduction direction as a matrix, each column being one draw of that direction vector. "gop" stores the draws of the GOP matrix, with each column holding one draw in vectorized form, so to recover the matrix form of, say, the t-th GOP draw one types:

GOP_t=reshape(gop(:,t), p, p)

where p is the number of dimensions.
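
The round trip between the vectorized and matrix forms can be checked on a small synthetic stand-in for "gop". This is only a sketch: the values below are random placeholders rather than gradlearn output, and the sizes p and T are chosen for illustration. Because each GOP draw is symmetric, the result does not depend on whether the draw was vectorized row by row or column by column.

```matlab
% Synthetic stand-in for the gop output (NOT produced by gradlearn):
% p*p rows, one column per posterior draw.
p = 3;                      % number of dimensions (illustrative)
T = 4;                      % number of draws (illustrative)
gop = zeros(p*p, T);
for t = 1:T
    A = randn(p, p);
    G = A * A';             % a symmetric p-by-p matrix
    gop(:, t) = G(:);       % vectorize the draw into one column
end

% Recover the t-th draw in matrix form, as in the tutorial.
t = 2;
GOP_t = reshape(gop(:, t), p, p);
disp(size(GOP_t));          % dimensions of the recovered matrix
```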

Note that the output gop can take up a large amount of memory, since each draw has p*p entries. If this quantity is not needed, one can omit it by requesting only the first two outputs:

[gnorm, dr]=gradlearn(Y, X, 'r')
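
A rough back-of-the-envelope estimate shows why omitting gop can matter. The sketch below assumes that the draws are stored as double-precision numbers (8 bytes each) and uses illustrative sizes of p = 784 dimensions and 1000 retained draws; the actual storage used by gradlearn is not documented here.

```matlab
% Rough memory estimate (assumes 8-byte doubles; sizes are illustrative).
p = 784;                    % number of dimensions
nDraws = 1000;              % number of posterior draws kept
bytesPerDouble = 8;

gopBytes = p^2 * nDraws * bytesPerDouble;   % gop: p*p rows, nDraws columns
drBytes  = p   * nDraws * bytesPerDouble;   % one dr{i}: p rows, nDraws columns

fprintf('gop would need about %.2f GB\n', gopBytes / 1e9);
fprintf('one dr{i} would need about %.2f MB\n', drBytes / 1e6);
```

Under these assumptions gop is p times larger than a single dr{i}, which is why the two-output form is preferable in high dimensions.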

Additional inputs are available. For a more detailed explanation of the inputs and outputs, type "help gradlearn" at the command line.

Examples

Dimension Reduction

In this section we illustrate dimension reduction with a classification problem for handwritten digits. An illustration of the digit data is shown in Figure 1. Each digit is represented by a vector of length 28*28=784 that contains the pixel values.

Figure 1: Illustration of handwritten digit data 1-9.
\includegraphics[totalheight=2.5in]{digifig.jpg}

Suppose we want to distinguish digit "5" from digit "8". We can collect 100 samples of each digit and label digit "5" as response=1 and digit "8" as response=0, so that the covariate matrix X is a 200*784 matrix of pixel values and the response Y is a vector of length 200 containing the class labels. We can type in the command line:

[gnorm, dr]=gradlearn(Y, X, 'c', [1000,1000])

where the fourth argument specifies [burn-in steps, number of posterior samples kept].
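
The inputs for this call could be assembled along the following lines. This is a sketch only: "digits5" and "digits8" are hypothetical variable names, filled here with random placeholder values, standing in for matrices that hold one vectorized 28*28 image per row.

```matlab
% Hypothetical stand-ins for the image data (random placeholders,
% one vectorized 28*28 image per row).
nPerClass = 100;
digits5 = rand(nPerClass, 784);
digits8 = rand(nPerClass, 784);

% Stack the two classes: rows are observations, columns are pixels.
X = [digits5; digits8];                          % 200-by-784
Y = [ones(nPerClass, 1); zeros(nPerClass, 1)];   % 1 for "5", 0 for "8"

% Basic consistency check before calling gradlearn.
assert(size(X, 1) == numel(Y));
```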

Now dr{1}, a 784*1000 matrix, contains the posterior samples of the first (top) dimension reduction direction. "mean(dr{1},2)" and "std(dr{1},0,2)" return the posterior mean and standard deviation of this direction, both vectors of length 784. We can display them as images with "imagesc(reshape(mean(dr{1},2),28,28)')" and "imagesc(reshape(std(dr{1},0,2),28,28)')"; the results are shown in Figure 2. The red part in the left panel is the region that differentiates digit "5" from digit "8", so projecting the original data onto this direction lets us perform classification directly. The right panel shows that the posterior standard deviation is small, indicating little uncertainty about this direction.

Figure 2: The left panel is the posterior mean and the right panel is the posterior standard deviation for the top dimension reduction direction.
\includegraphics[totalheight=2.5in]{digi58topf.jpg}
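
The summaries and the projection described above can be sketched as follows. The "dr" and "X" used here are random stand-ins with the same layout as in the example, not gradlearn output, and the variable names "scores" and "scoreDraws" are introduced only for illustration. How the projected values are then turned into class labels (for instance by a threshold) is not specified in this tutorial and is left open.

```matlab
% Random stand-ins with the same layout as in the example
% (NOT gradlearn output): dr{1} is p-by-nDraws, X is n-by-p.
p = 784; nDraws = 50; n = 20;
dr = {randn(p, nDraws)};
X = rand(n, p);

% Posterior mean and standard deviation of the top direction.
dirMean = mean(dr{1}, 2);       % p-by-1
dirStd  = std(dr{1}, 0, 2);     % p-by-1

% Project the data onto the posterior mean direction: one score per sample.
scores = X * dirMean;           % n-by-1

% Projecting onto every draw instead gives a posterior sample of each score.
scoreDraws = X * dr{1};         % n-by-nDraws

% Image form used for display, e.g. imagesc(meanImage).
meanImage = reshape(dirMean, 28, 28)';
```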

Graphical Models

The GOP matrix can be used to infer a graphical model that provides information on the partial correlation between any two dimensions with respect to the response, given all the other dimensions. Consider a toy example. Let $\theta_j$, j=1,...,5, be an n-vector whose elements are standard normal random variables, and let $X_1=\theta_1$, $X_2=\theta_1+\theta_2$, with the remaining columns likewise defined in terms of the $\theta_j$, where $X_i$ is the i-th column of the covariate matrix X, an n*5 matrix with n the sample size. The response Y is generated as $Y_i=X_{i1}+\frac{X_{i3}+X_{i5}}{2}+\varepsilon_i$, for i=1,...,n, with $\varepsilon_i \sim N(0,0.25)$.

We can type: [gnorm, dr, gop]=gradlearn(Y, X, 'r', [1000,1000])

Now "gop" holds the draws of the GOP matrix, with each column holding one draw in vectorized form, so the posterior mean GOP matrix can be obtained with a command like "GOPmean=reshape(mean(gop,2),5,5)", where the mean is taken element-wise. To infer a graphical model, one can compute the partial correlation matrix from the GOP matrix. The relationship between these two matrices can be found in K. Mao et al. (2008), page 5.
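
As a sketch of this step, the code below computes a partial correlation matrix from the posterior mean GOP. It assumes the usual precision-matrix construction, in which J is the (pseudo-)inverse of the GOP and the partial correlation is R_ij = -J_ij / sqrt(J_ii J_jj); whether this matches the definition in the paper exactly should be checked against the reference. The "gop" used here is a random stand-in, not gradlearn output.

```matlab
% Random stand-in for the gop output (NOT gradlearn output):
% each column is a vectorized symmetric positive definite 5-by-5 matrix.
p = 5; nDraws = 200;
gop = zeros(p*p, nDraws);
for t = 1:nDraws
    A = randn(p, p + 2);
    G = (A * A') / (p + 2);
    gop(:, t) = G(:);
end

% Posterior mean GOP, as in the tutorial.
GOPmean = reshape(mean(gop, 2), p, p);

% ASSUMPTION: standard precision-matrix formula for partial correlations.
J = pinv(GOPmean);              % (pseudo-)inverse of the GOP
d = sqrt(diag(J));
R = -J ./ (d * d');             % R_ij = -J_ij / sqrt(J_ii * J_jj)
R(1:p+1:end) = 1;               % set the diagonal to 1 by convention
```

Applying the same steps to each column of gop, rather than to the mean, would give posterior draws of the partial correlation matrix, from which an element-wise mean and standard deviation can be computed.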

Figure 3 shows the posterior mean and standard deviation for the GOP matrix and the partial correlation matrix. The partial correlation matrix clearly captures the negative covariation among dimensions 1, 3 and 5 with respect to the response.

Figure 3: The posterior mean and standard deviation for the GOP matrix and the partial correlation matrix.
\includegraphics[totalheight=2.5in]{toyg.jpg}