Statistics 103
  Probability and Statistical Inference
 

Instructions for Lab 11


Lab Objective

To gain experience with simple regressions.

Lab Procedures

Many macroeconomic studies use cross-sectional data (i.e., data from the same time frame) from countries around the world.  Of particular interest are the factors related to Gross National Product (GNP), which is essentially the total value, in money terms, of what the country produces from all sources.

Open the data set countries.JMP.  It contains economic data for 97 countries from around the world.  All monetary values are expressed in U.S. dollars.   The variables include:

GNP (per capita)             the GNP divided by the number of people in the country.
Birth Rate (per 1000)        the number of births per 1000 people in the country.
Death Rate (per 1000)        the number of deaths per 1000 people in the country.
Infant Deaths (per 1000)     the number of infant deaths per 1000 people in the country.
Life Expectancy (Males)      average age at death for men.
Life Expectancy (Females)    average age at death for women.
Region                       Eastern European and former Soviet Union countries = 1
                             South American and Central American countries = 2
                             "Western" countries (e.g., France, Japan, USA) = 3
                             Middle Eastern countries = 4
                             South Asian countries = 5
                             African countries = 6
Country                      name of country.


Questions:

1.  Does a normal curve describe the distribution of per capita GNP well?  

2.  Which numerical variable has the strongest correlation with per capita GNP?  Note that "Region" is not a numerical variable (it is categorical) so correlations involving Region make no sense.
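
As a rough illustration of what comparing correlations involves, the following Python sketch computes the Pearson correlation between per capita GNP and each candidate variable, then picks the one with the largest absolute value.  The numbers are made up for illustration and are not taken from countries.JMP; the function and variable names are assumptions as well.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for six imaginary countries (NOT from countries.JMP).
gnp = [18000.0, 9000.0, 4000.0, 1500.0, 700.0, 300.0]
candidates = {
    "Birth Rate": [12.0, 16.0, 24.0, 33.0, 41.0, 47.0],
    "Death Rate": [10.0, 8.0, 7.0, 9.0, 12.0, 16.0],
}

correlations = {name: pearson_r(values, gnp) for name, values in candidates.items()}

# Strength is judged by absolute value: r = -0.8 is stronger than r = 0.5.
strongest = max(correlations, key=lambda name: abs(correlations[name]))
print(correlations, strongest)
```

The sign of a correlation gives the direction of the relationship, so the comparison uses the absolute value.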

3.  a) What is the regression equation for predicting per capita GNP (Y) from birth rate (X)?
     b) What is a typical deviation of per capita GNP from the regression line?  (This is the root mean square error.)

To fit a regression line, go to Analyze - Fit Y by X.  Select "GNP" as the Y variable and "Birth Rate" as the X variable.  Once you see the scatter plot, go to the red arrow next to Bivariate Fit.   Select Fit Line.
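
For readers who want to see the arithmetic behind Fit Line, here is a minimal Python sketch of a least-squares fit and its root mean square error.  It assumes the usual simple-regression convention of dividing the sum of squared residuals by n - 2; the data values are invented for illustration and are not from countries.JMP, and the function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def root_mean_square_error(x, y, intercept, slope):
    """Typical size of a residual: sqrt(SSE / (n - 2))."""
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))

# Hypothetical data (NOT from countries.JMP).
birth_rate = [12.0, 15.0, 20.0, 28.0, 35.0, 42.0, 48.0]
gnp = [18000.0, 14000.0, 6000.0, 2500.0, 1200.0, 600.0, 300.0]

intercept, slope = fit_line(birth_rate, gnp)
rmse = root_mean_square_error(birth_rate, gnp, intercept, slope)
print(f"GNP = {intercept:.1f} + ({slope:.1f}) * BirthRate, RMSE = {rmse:.1f}")
```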

4.   Does the plot of residuals versus the predictor suggest any violations of the regression assumptions?   Would you be willing to use this regression to predict per capita GNPs?  Justify your answers in at most two sentences.

To obtain the plot of residuals versus the predictor values, click on the red arrow next to Linear Fit, which is just below the scatter plot.  Then, select Plot Residuals.
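
The quantities shown in that plot can be sketched in Python as follows.  This is only an illustration with made-up numbers (not the countries.JMP data) and a hypothetical helper name; it lists each predictor value next to its residual so that patterns can be inspected.

```python
def residuals(x, y):
    """Residuals (observed minus fitted) from the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical data (NOT from countries.JMP).
birth_rate = [12.0, 15.0, 20.0, 28.0, 35.0, 42.0, 48.0]
gnp = [18000.0, 14000.0, 6000.0, 2500.0, 1200.0, 600.0, 300.0]

resid = residuals(birth_rate, gnp)

# Print predictor values with their residuals, ordered by predictor.
# Long runs of same-sign residuals hint at curvature; a spread that
# grows or shrinks with x hints at non-constant variance.
for x_value, r in sorted(zip(birth_rate, resid)):
    print(f"{x_value:6.1f}  {r:10.1f}")
```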

5.  a) Let's do the regression using the (natural) logarithm of per capita GNP as the dependent variable.  What is the regression equation for predicting the logarithm of per capita GNP (Y) from birth rate (X)?
     b) Does the plot of residuals versus the predictor suggest any violations of the regression assumptions?  Would you be willing to use this regression to predict logged per capita GNPs?  Justify your answers in at most two sentences.
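
The log transformation can be sketched in Python as below: take the natural log of each GNP value first, then fit the line to the logged values.  The data are invented for illustration (not from countries.JMP) and the function name is an assumption.

```python
import math

def fit_log_line(x, y):
    """Least-squares fit of ln(y) on x; returns (intercept, slope)."""
    log_y = [math.log(v) for v in y]  # math.log is the natural logarithm
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_ly) for a, b in zip(x, log_y))
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    return intercept, slope

# Hypothetical data (NOT from countries.JMP).
birth_rate = [12.0, 15.0, 20.0, 28.0, 35.0, 42.0, 48.0]
gnp = [18000.0, 14000.0, 6000.0, 2500.0, 1200.0, 600.0, 300.0]

intercept, slope = fit_log_line(birth_rate, gnp)
print(f"ln(GNP) = {intercept:.3f} + ({slope:.4f}) * BirthRate")
```

In this sketch the slope is on the log scale: it describes the change in ln(GNP) per additional birth per 1000, not the change in dollars.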

6.  Give a 90% confidence interval for the true regression slope.  Use the t-multiplier.
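
A minimal Python sketch of the interval "estimated slope plus or minus t-multiplier times standard error" is shown below.  It assumes SciPy is available for the t quantile, uses n - 2 degrees of freedom (so 95 for a 97-country data set, giving a multiplier of about 1.66), and runs on made-up numbers rather than countries.JMP; the function name is hypothetical.

```python
import math
from scipy import stats

def slope_confidence_interval(x, y, level=0.90):
    """Confidence interval for the slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))
    se_slope = rmse / math.sqrt(sxx)  # standard error of the slope
    # Two-sided interval: put (1 - level) / 2 in each tail.
    t_multiplier = stats.t.ppf(1 - (1 - level) / 2, df=n - 2)
    margin = t_multiplier * se_slope
    return slope - margin, slope + margin

# Hypothetical data (NOT from countries.JMP).
birth_rate = [12.0, 15.0, 20.0, 28.0, 35.0, 42.0, 48.0]
log_gnp = [9.8, 9.5, 8.7, 7.8, 7.1, 6.4, 5.7]

lower, upper = slope_confidence_interval(birth_rate, log_gnp)
print(f"90% CI for slope: ({lower:.4f}, {upper:.4f})")
```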

7.  Are the data consistent with there being no linear relationship between log(per capita GNP) and birth rate?  Test the hypothesis that the slope equals zero.  Report your test statistic, p-value, and conclusion about the relationship between log(per capita GNP) and birth rate.  Consider p-values around .05 or smaller as small enough to reject the null hypothesis.
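
The test of a zero slope can be sketched in Python as follows, assuming SciPy is available.  The t statistic is the estimated slope divided by its standard error, and the two-sided p-value comes from a t distribution with n - 2 degrees of freedom.  The data are invented (not from countries.JMP), and the .05 cutoff is applied here as a simple threshold.

```python
from scipy import stats

# Hypothetical data (NOT from countries.JMP).
birth_rate = [12.0, 15.0, 20.0, 28.0, 35.0, 42.0, 48.0]
log_gnp = [9.8, 9.5, 8.7, 7.8, 7.1, 6.4, 5.7]

fit = stats.linregress(birth_rate, log_gnp)
df = len(birth_rate) - 2

# Test statistic for H0: slope = 0.
t_stat = fit.slope / fit.stderr

# Two-sided p-value: probability of a |t| at least this large under H0.
p_value = 2 * stats.t.sf(abs(t_stat), df)

reject_null = p_value <= 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.3g}, reject H0: {reject_null}")
```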

8.  If a country has a birth rate of 30 per 1000, can you use the regression equation to predict the per capita GNP?  If you think so, write down the estimated per capita GNP (take "e" raised to the predicted log(per capita GNP)).  If you think not, explain why not in at most one sentence.

9.  If a country has a birth rate of 80 per 1000, can you use the regression equation to predict the per capita GNP?  If you think so, write down the estimated per capita GNP (take "e" raised to the predicted log(per capita GNP)).  If you think not, explain why not in at most one sentence.
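
The back-transformation in questions 8 and 9 can be sketched in Python as below.  The coefficients and the range of birth rates are placeholders chosen for illustration, not values from countries.JMP, and the function name is hypothetical.  The sketch returns a prediction only when the birth rate lies within the range of the data used to fit the line; the actual range must be read from the data set.

```python
import math

def predict_gnp(birth_rate, intercept, slope, x_min, x_max):
    """Predict per capita GNP from a log-scale regression line.

    Returns None when birth_rate lies outside [x_min, x_max], the range
    of birth rates used to fit the line, because that would be extrapolation.
    """
    if not (x_min <= birth_rate <= x_max):
        return None
    predicted_log_gnp = intercept + slope * birth_rate
    return math.exp(predicted_log_gnp)  # undo the natural log

# Placeholder coefficients and data range (assumptions, NOT fitted values).
intercept, slope = 10.0, -0.1
x_min, x_max = 10.0, 50.0

inside = predict_gnp(30.0, intercept, slope, x_min, x_max)
outside = predict_gnp(80.0, intercept, slope, x_min, x_max)
print(inside, outside)
```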

10.  Can you conclude from the regression results that implementing health programs to decrease the birth rate will increase GNPs?  Explain why or why not in no more than three sentences.