STA 210B/ENV 251
Statistics and Data Analysis for the Biological Sciences
Objective: To apply simple linear regression.
Material: Use polycity.jmp in the JMPin databook.
Background: Here we consider a curious proposition: is there an association between ozone and population size? We have both population estimates and ozone measurements for 48 cities throughout the US. We also have population sizes, but no ozone measurements, for another four cities. Is the relation between the two variables linear or curved? Are there outliers? Can we predict ozone levels for the four additional cities?
Main Question: What is the relationship between ozone and city population?
Steps:
1. Fit a line to ozone by pop. Add confidence curves for the fit to show
the uncertainty in the fitted line. Add a polynomial fit of degree 2 or 3 to see if
there is a substantial departure from the linear fit. If the polynomial
fits lie well between the two confidence curves, this indicates
that the relationship is most likely linear. (We will learn more precise
ways to test this in mid-November.)
2. Open another window and fit ozone by pop again. Do any observations appear to be outliers from the linear fit? To check, add confidence curves for individual observations. Do any of the observations fall outside the two confidence curves? If so, they may be outliers. Then again, they may just be values a little more extreme than usual.
3. Now that we have fit the data with a line, checked to see if the
linear fit was adequate, and checked for outliers, we have some confidence
in using the linear model. Make predictions for ozone levels at the four
cities without ozone measurements. Report both the expected value and standard
deviation for each. You may do this by hand. You may also make JMP do it.
It's your choice.
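For those who want to see what the confidence curves in Steps 1 and 2 compute, here is a minimal sketch in Python. It is not what JMP runs internally, only the standard textbook formulas for simple linear regression. All function names are hypothetical, and the t critical value (t_crit, on n - 2 degrees of freedom) is assumed to be looked up in a table and passed in by hand.

```python
import math

def fit_line(x, y):
    """Least-squares line y = b0 + b1*x. Returns a dict of fit summaries."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))  # root mean square error on n - 2 df
    return {"n": n, "b0": b0, "b1": b1, "rmse": rmse, "xbar": xbar, "sxx": sxx}

def band(fit, x0, t_crit, individual=False):
    """Lower and upper limits at x0.

    individual=False: confidence curve for the fit (the mean response).
    individual=True: confidence curve for an individual observation.
    """
    center = fit["b0"] + fit["b1"] * x0
    var_factor = 1.0 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["sxx"]
    if individual:
        var_factor += 1.0  # extra scatter of a single observation
    half = t_crit * fit["rmse"] * math.sqrt(var_factor)
    return center - half, center + half

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_quadratic(x, y):
    """Least-squares quadratic in centered x; returns a prediction function."""
    n = len(x)
    xbar = sum(x) / n
    u = [xi - xbar for xi in x]
    # Power sums that make up the 3x3 normal equations
    s = [sum(ui ** k for ui in u) for k in range(5)]
    t = [sum(yi * ui ** k for ui, yi in zip(u, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    d = _det3(a)
    coef = []
    for j in range(3):  # Cramer's rule: swap column j for the right-hand side
        aj = [row[:] for row in a]
        for i in range(3):
            aj[i][j] = t[i]
        coef.append(_det3(aj) / d)
    return lambda x0: coef[0] + coef[1] * (x0 - xbar) + coef[2] * (x0 - xbar) ** 2

def quadratic_inside_band(x, y, t_crit, grid=50):
    """Step 1 check: does the quadratic stay between the curves for the fit?"""
    fit = fit_line(x, y)
    quad = fit_quadratic(x, y)
    lo_x, hi_x = min(x), max(x)
    for k in range(grid + 1):
        x0 = lo_x + (hi_x - lo_x) * k / grid
        lo, hi = band(fit, x0, t_crit)
        if not lo <= quad(x0) <= hi:
            return False
    return True

def outside_individual_band(x, y, t_crit):
    """Step 2 check: indices of observations outside the individual curves."""
    fit = fit_line(x, y)
    flagged = []
    for i, (xi, yi) in enumerate(zip(x, y)):
        lo, hi = band(fit, xi, t_crit, individual=True)
        if yi < lo or yi > hi:
            flagged.append(i)
    return flagged
```

With the 48 cities, the fit has 46 degrees of freedom, so a 95% critical value of about 2.01 would be a reasonable choice for t_crit. Given two lists pop and ozone (hypothetical names for the two columns), quadratic_inside_band(pop, ozone, 2.01) corresponds to the visual check in Step 1 and outside_individual_band(pop, ozone, 2.01) to the one in Step 2.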
Report:
Report the parameter estimates for the linear fit, including the RMSE. Why
do you think the linear fit is adequate? Are there outliers to be concerned
about? Report predictions for Cheyenne, Dubuque, Galveston, and Spokane.
Are these four cities different from the others in some way that may call
the predictions into question? (It helps to know your western geography.)
Why should one believe that population and ozone are related? Is there
any scientific explanation for this? Do you believe that this relation
is causative or merely predictive or descriptive? Explain why in either
case.
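If you make the predictions by hand, the arithmetic can be sketched as follows. This is a minimal Python sketch of the usual simple linear regression formulas, not JMP's own procedure, and the function name predict_with_sd is hypothetical. Because "standard deviation" of a prediction can be read two ways, the sketch returns both the standard error of the fitted mean and the standard deviation for an individual new observation. State which one you report.

```python
import math

def predict_with_sd(x, y, x_new):
    """Fit y = b0 + b1*x and predict at each value in x_new.

    Returns (b0, b1, rmse, rows), where each row is
    (expected value, SE of the fitted mean, SD of an individual prediction).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))  # residual SD on n - 2 df

    rows = []
    for x0 in x_new:
        expected = b0 + b1 * x0
        # This term grows as x0 moves away from the mean of x
        leverage = 1.0 / n + (x0 - xbar) ** 2 / sxx
        se_mean = rmse * math.sqrt(leverage)
        sd_indiv = rmse * math.sqrt(1.0 + leverage)
        rows.append((expected, se_mean, sd_indiv))
    return b0, b1, rmse, rows
```

Here x and y would be the population and ozone values for the 48 cities with measurements, and x_new the populations of the four cities without them. Both standard deviations increase as a new city's population moves away from the mean population of the fitted cities, which is worth keeping in mind when judging the predictions.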
Due: September 29, 1998. You may turn it in sooner if you would like feedback before the exam.