function [beta,V,varind,prob,haty,e,R2,BIC]=mregbic(y,x,k) % fits multiple linear regression model to response data y % using predictor matrix x -- but first centers all variables % % Uses sequential addition of predictors up to k predictors, computes % BIC to rank, outputs vector BIC for all k models, R2 for all, % and for the full model, the LSE beta, the SDs of elements of beta in v, % approximate probability level for each coeff, fitted values haty and residuals e % varind is list of variables included (in order - intercept is always included too) % % Forward selection correlates "current" residuals with potential predictors for % inclusion at next step - note that we need to fit residuals of linear % regressions of candidate variables on current predictors in model for y % to create "corrected" residuals; this is compute intensive when p is large % % y is nx1 column vector, x is pxn matrix % plots betas and approximate 95% intervals for BIC best model % % EASY TO USE/MODIFY TO LOOK AT MULTIPLE TRACKS THROUGH PREDICTORS % -- e.g., choose second best BIC at step 1, continue etc to generate % larger lists of regressions on smallish numbers of predictors % Also easy to modify to replace LSE with proper Bayesian analysis as from mregbayes % [N,n]=size(x); y=reshape(y,n,1); e=y-mean(y); ssy=e'*e; varind=[]; BIC=[]; R2=[]; e=y; X=x-repmat(mean(x,2),1,n); X0=X; H=ones(n,1); p=0; for im=1:k if (im==1) eX0=X0; else H=[ones(n,1),X(varind,:)']; % create design and projection matrix with intercept P=eye(n)-H*inv(H'*H)*H'; % for current variables in model eX0=X0*P'; % each col has resids for candidate variable end ig=select_genes_cor(eX0,e,1); % if you want to cascade multiple searches, use m>1 here to % generate k candidates to add varind=[varind ig]; % add variable to list to fit model ... [beta,V,prob,haty,e]=mreg(y,X(varind,:)); % least squares fit with intercept - could use Bayes % and plot fprintf('Hit a key to fit next variable ...\n') pause X0(ig,:)=0; % delete current var from set to consider next s=e'*e; % residual sum of squares r2=1-s/ssy; % R square bic=n*log(1-r2)+p*log(n); % bic BIC=[BIC bic]; R2=[R2 r2]; end % stepwise(X(varind,:)',y) fprintf('R2 and BIC:\n') num2str([R2' BIC'],4) %