STA114/MTH136 Lecture Notes, Weeks 10,11: Linear Models & Regression

1. METHOD OF LEAST SQUARES

   Fitting a Straight Line
      Real line:    alpha + beta * x
      Fitted line:  a + b * x
   Fitting a Polynomial
   Fitting a Linear Function of Several Variables

2. REGRESSION

   Regression Functions:  E[Y|x1...xk] = b0 + b1x1 + ... + bkxk

   Simple Linear Regression

      Yj ~ No(alpha + beta xj, sig^2)

      --> LHF = (2pi sig^2)^(-n/2) * exp( -sum (Yj-al-be*xj)^2 / (2 sig^2) )

      Maximizing LHF over (al, be) is the same as minimizing
         Q = sum (Yj-a-b*xj)^2
      Then the maximizing sig^2 is
         S^2 = sum(Ej^2)/n = Q/n,   where Ej = Yj-a-b*xj,
      which has a dist'n such that
         n*S^2/sig^2 = Q/sig^2 ~ chi^2, df=n-2

   --------------------------
   Distribution of L-S Estimators

      b = sum (xj-xbar)*Yj / sum (xj-xbar)^2   ~ No(beta,  mse1^2)
      a = Ybar - b*xbar                         ~ No(alpha, mse0^2)

      mse1^2   = sig^2 / sum (xj-xbar)^2
      mse0^2   = (sum xj^2) sig^2 / (n sum (xj-xbar)^2)
      cov(a,b) = - xbar sig^2 / sum (xj-xbar)^2

   Design of experiment: Evidently you'd like sum (xj-xbar)^2 to be big,
   since that makes mse1^2 small.

   Prediction (new observation Y at the point x):

      E[(Y-Y-hat)^2] = sig^2 * ( sum (xj-x)^2 / (n sum (xj-xbar)^2)  +  1 )

      SO, 95% interval is
         Y = Y-hat +/- 1.96 * S * sqrt( sum (xj-x)^2 / (n sum (xj-xbar)^2) + 1 )
      (or use T_.025 with df=n-2 in place of 1.96, and Q/(n-2) in place of S^2)

   ----------------
   Bayesian:

      Prior density for (alpha, beta, sig^2) proportional to 1/sig^2
      (flat in alpha, beta and log sig^2)

      ==> posterior xi(al,be,sig^2) proportional to
             (1/sig^2) * (2pi sig^2)^(-n/2) * exp( -sum (Yj-al-be*xj)^2 / (2 sig^2) )

      --> al | x, sig^2 ~ No(a, mse0^2)
          be | x, sig^2 ~ No(b, mse1^2)
          sig^-2 | x    ~ Ga((n-2)/2, n*S^2/2)

      ==> Bayesian CI's are same as Classical CI's below.

3. HYPOTHESES AND CI's IN SIMPLE LINEAR REGRESSION

   Most common:
      Ho: beta  = beta*   ->   t = (b-beta*)/mse1 ~ Student T (df=n-2)
      H1: beta != beta*        (with sig^2 in mse1 estimated by Q/(n-2))

   Residuals: ej = (Yj - Yj-hat); plot vs. X, X', etc.

4. REGRESSION FALLACY

5. MULTIPLE REGRESSION

   ([X'X]^i denotes the inverse of X'X)

   Vector:              Y = X beta + epsilon
                      X'Y = X'X beta + X'epsilon
             [X'X]^i[X'Y] = beta + [X'X]^i X'epsilon
      b   =  [X'X]^i[X'Y] = beta + [X'X]^i X'epsilon  ~ No(beta, sig^2 [X'X]^i)
      (b is the MLE)

6. ANOVA (1-way)

7. ANOVA (2-way)
   Balanced
   Unbalanced
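
Numerical sketch (not part of the lecture itself): the simple linear regression formulas of sections 2 and 3 can be checked on a small data set. The Python below is a minimal illustration using only the standard library. The function names fit_line, t_stat and pred_se and the five data points are made up for this example. Dividing Q by n-2 (rather than n) when forming standard errors is an assumption made here so that the Student T reference distribution with df=n-2 applies.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = a + b*x for paired lists x, y (needs n > 2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xj - xbar) ** 2 for xj in x)                   # sum (xj-xbar)^2
    b = sum((xj - xbar) * yj for xj, yj in zip(x, y)) / sxx   # slope estimate
    a = ybar - b * xbar                                       # intercept estimate
    q = sum((yj - a - b * xj) ** 2 for xj, yj in zip(x, y))   # minimized Q
    s2_mle = q / n                 # S^2 as defined in the notes (MLE of sig^2)
    s2_unbiased = q / (n - 2)      # assumption: estimate used for t statistics
    se_b = math.sqrt(s2_unbiased / sxx)                                   # est. mse1
    se_a = math.sqrt(s2_unbiased * sum(xj ** 2 for xj in x) / (n * sxx))  # est. mse0
    return {"a": a, "b": b, "q": q, "sxx": sxx, "s2_mle": s2_mle,
            "s2_unbiased": s2_unbiased, "se_a": se_a, "se_b": se_b}


def t_stat(fit, beta_star=0.0):
    """t statistic for Ho: beta = beta_star; compare with Student T, df = n-2."""
    return (fit["b"] - beta_star) / fit["se_b"]


def pred_se(fit, x, x_new):
    """Estimated root mean squared prediction error for a new Y at x_new."""
    n = len(x)
    ratio = sum((xj - x_new) ** 2 for xj in x) / (n * fit["sxx"])
    return math.sqrt(fit["s2_unbiased"] * (ratio + 1.0))


# Made-up data, roughly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_line(xs, ys)
print("a =", fit["a"], " b =", fit["b"], " Q =", fit["q"])
print("t for Ho: beta = 2:", t_stat(fit, 2.0))
print("prediction s.e. at x = 6:", pred_se(fit, xs, 6.0))
```

The prediction standard error is smallest at x = xbar, where the bracketed factor reduces to 1/n + 1, and grows as x moves away from the observed xj. This is the same reason the design remark in section 2 favors a large sum (xj-xbar)^2.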