STATISTICS
Final Exam, Fall 1995

Dalene Stangl

Good luck! This is a long exam. Work as quickly as you can without making careless errors.

  1. A class of 400 students is divided into 2 sections of 200 each. Both sections are given a common exam. The following is observed

                Average   SD
    Section 1        85   10
    Section 2        45   10

    Suppose all 400 scores are combined as one list. The SD of the new list will be

    1. smaller than 10
    2. larger than 10
    3. equal to 10
    4. can't tell without knowing the entire list

  2. A study was made of the age at entrance of college freshmen to Duke. The SD turned out to be:
    1. 1 month
    2. 1 year
    3. 5 years

  3. Given the group of numbers: 3, 7, 4, 6, 2, 2, 7, 3, 2. Which of the following are the same?
    1. average
    2. range
    3. SD
    4. median
    5. mode

  4. The average temperature in Durham on the 11th day of May over the last 10 years was 74 degrees F with an SD of 6 degrees F. Temperature Centigrade is found by subtracting 32 degrees from temperature Fahrenheit, then multiplying by 5/9. What were the average and the SD of the temperature in Durham on the 11th day of May expressed in degrees Centigrade?

  5. Draw two histograms, one with its average greater than its median and the other with its median greater than its average. Label which is which.

  6. A list of numbers has a normal distribution, an average of 10 and an SD of 2. What is the interquartile range?

  7. A teacher decides to test a new teaching method. In class A she gives 'pop' quizzes at random intervals. In class B she gives announced quizzes on alternating Mondays. At the end of the class she gives a common exam to both classes with the following results. She creates a contingency table classifying students by whether they had mastered more or less than 90% of the material.

    Class     <90%   >90%   Total   % scoring >90%
    A           85     65     150              43%
    B           68     32     100              32%
    Total      153     97     250

    Then she decides to look at whether the effect was consistent for upper and lower classmen. She found the following.

    Fresh/Soph
    Class     <90%   >90%   Total   % scoring >90%
    A           60     15      75              20%
    B           60     15      75              20%
    Total      120     30     150

    Jr/Sr
    Class     <90%   >90%   Total   % scoring >90%
    A           25     50      75              67%
    B            8     17      25              68%
    Total       33     67     100

    1. What is the confounding variable?

    2. Calculate a weighted average for the percentage in Class A that mastered greater than 90% of the material taking into account the confounding variable. Do the same for Class B.

  8. The ETS Verbal Aptitude Test is designed so that the scores of high school seniors taking the test will have a normal distribution with average 500 and SD 100. A college wants their students to be in the top 12.5% for verbal aptitude. What minimum qualifying score should they set?

  9. At one law school, the correlation between LSAT score and GPA for first-year students was 0.4. Average LSAT of those admitted was 650 with an SD of 50; average GPA was 3.0 with an SD of 0.4. Both scores approximately follow the normal curve. Draw a scatter diagram to help answer the following questions.

    1. Approximately what percentage of all students admitted were above average on both the LSAT score and law school GPA?

    2. With the same averages and SD's, and the scores following the normal curve, what is the highest that this percentage could be at another law school? What would be the corresponding correlation? ____________% __________________Correlation

    3. Suppose .5 was added to the GPA of each student at the law school. How would the correlation between LSAT and GPA change?
      1. increase
      2. decrease
      3. stay the same
      4. can't tell without further information

  10. A teaching assistant gives a quiz. There are 10 questions on the quiz and no partial credit is given. After grading the papers the TA writes down for each student the number of questions the student got right and the number he got wrong. The average number of correct answers is 6.4 with an SD of 2.0.

    1. What is the average and SD of the number of wrong answers? ____________Average ____________SD

    2. The correlation coefficient between the numbers of right and wrong answers is
      1. 0
      2. -.5
      3. +.5
      4. -1
      5. +1
      6. can't tell

  11. After being given a dose of a certain drug, the average temperatures and blood pressures of 5 groups of 20 subjects were recorded. The correlation for the 5 pairs of group averages was +0.92. The relation between a subject's temperature and blood pressure in the group is almost exactly a line. True or False?

  12. A doctor records, for 400 hospital patients selected at random, the pair: (temperature when admitted, temperature 24 hours later). He finds that the average temperatures are 101 degrees when admitted, 99 the next day. The SD is 1.5 degrees for temperatures when admitted, 0.5 for those the next day. The correlation between the two temperatures is 0.8. The scatter diagram is football shaped.

    1. If a patient enters with temperature 103, what do you expect his temperature to be the next day?

    2. The fact that the average temperature the day after entering is 2 degrees less than upon admission is due to the regression effect. True or False?

  13. As part of a large experimental course, a diagnostic quiz is given at the outset of the course. At the end of the course, the instructor collected the following statistics:

    Diagnostic quiz (20 points possible): average=10, SD=4

    Final exam (100 points possible): average=60, SD=15

    r=0.6

    Of those who scored 6 points on the diagnostic quiz, about what percent scored below average on the final?

  14. A GPA-LSAT study shows that the slope for estimating LSAT from GPA is about 100 LSAT points per GPA point. Which of the following is correct?

    1. If one person has twice the GPA of another, his or her estimated LSAT score is 100 points higher than the other's. True or False?

    2. If two people differ by 1.5 points in GPA, you can expect their LSAT scores to differ by about 150 points. True or False?

    3. A person with a GPA of 3.5 will score, on average, 3.5x100=350 on the LSAT. True or False?

    4. The SD of LSAT scores is 100xSD of GPA scores in the group studied. True or False?

  15. A certain pregnancy test determines pregnancy with high probability. In particular, if a woman is pregnant the chance the test will give a positive reaction is 90%. If she isn't pregnant, the chance it will give a negative reaction is 85%. Suppose a woman takes the test and gets a negative result. The chance she is pregnant is: ___________ or can not be determined

  16. An astronaut's oxygen supply comes from two independent sources. Source A has probability 0.9 of working, and Source B has probability 0.8 of working. What is the probability that at least one of the sources is working?

  17. A die is tossed twice. Given that the sum is greater than 7, what is the chance of getting a pair?

  18. There are 20 men and 5 women in a class. Each day one person is chosen at random (with replacement).

    1. What is the chance that a woman will be chosen on any given day?

    2. What is the chance that a man is chosen on the first three days?

    3. What is the chance of choosing exactly two women in the first five days?

    4. What is the chance of choosing at least two women in the first five days?

    5. What is the chance of choosing a woman on the fifth day given that you chose a man on each of the first four days?

    6. What is the chance of choosing a man on the first day and a woman on the second day?

    7. What is the chance of choosing a man on the first day or a woman on the second day?

  19. You want to find out about drug use among members of your class, but you are sure that people will not respond honestly if you ask them directly, so you set up a scheme such that each classmate is given a private ticket. No one besides the receiver knows what is on the ticket. 70% of the tickets have an S written on them and 30% have an NS written on them. Those students receiving an S ticket are to answer the question "Have you used an illegal drug in the last week?" Those with an NS ticket are to answer the question "Does your mother's phone number end in an even digit?" In the class of 60, 25 respond yes to their question and 35 respond no. Calculate the probability that a classmate has used illegal drugs in the past week.

  20. In a city, 20% of the workers have incomes over $40,000 per year. If 1600 workers are chosen at random with replacement, what is the chance that between 320 and 336 of those chosen have incomes over $40,000 per year?

  21. One hospital has 112 live births during the month of June, another has 432. Which is likelier to have 55% or more male births or are the chances equal? (Note: there is about a 52% chance for a liveborn infant to be male.)

    Hospital with 112 births Hospital with 432 births

  22. In a certain precinct, 80% of the voters are Republican. A simple random sample of size 400 is drawn (with replacement). Each person in the sample is polled and the percentage of Republicans in the sample is calculated. What is the chance that this percentage is between 78% and 88%?

  23. In a simple random sample of 900 students at a major state university, only 180 favor a return to the semester system. Find a 95% confidence interval for the percentage of students in the university favoring a return to the semester system.

  24. In 1990, the average of the daily maximum temperature at San Francisco Airport was 64.4 degrees, with an SD of 7 degrees. The standard error for the average is 7/sqrt(365) = 0.37 degrees. A 95% confidence interval for the average daily maximum temperature at San Francisco airport is 64.4 +/- 0.74 degrees. True or False?

  25. A large lecture course has 1000 students. On a midterm exam, the class average was 50 with an SD of 10. For one section of 25 students the section average was 60. It is suggested that the early morning section has more serious students and would therefore have higher scores.

    1. Formulate the null hypothesis

    2. Compute z and P

    3. Can this higher average be reasonably explained by chance variation? Justify your answer.

  26. To find out whether students should be required to take a statistics course, a campus poll is taken; the results are shown below:
                 Fresh/Soph   Jr/Seniors   Totals
    Should              197          195      392
    Should not          233          175      408
    Totals              430          370      800

    1. Calculate the relative risk.

    2. Interpret the relative risk.

    3. Calculate the odds ratio.

    4. Using the data given, test the hypothesis that responses do not differ according to class in the total population of students. Use a Chi-square test with a 0.05 level of significance.

  27. You are hired by the university to estimate the % of students that oppose the alcohol policy at Duke. How many students should you sample to be 95% confident of falling within +/- 2 percentage points of the true population percentage? Assume a pilot study has been done estimating the percentage to be 50%.

  28. What does each of the following symbols stand for in standard statistical practice?

    μ = ______________________________________

    β = ______________________________________

    σ = ______________________________________

  29. A friend tells you he has a coin that lands heads 25%, 50%, or 75% of the time, but he does not tell you which. Before seeing any data you assume that each possibility is equally likely. He then lets you toss the coin 8 times, and it lands heads 7 times.

    1. What is your posterior probability that p=.5?

    2. What is your prediction for the probability that the coin will land heads the next time it is tossed?

    3. Suppose you wanted to test H0:p=.5 against HA:p>.5. What is the observed p-value?

    4. In two sentences or less explain why the answers to 1 and 3 are different.

  30. A random sample of graduate students at Duke were asked how much they expected to make upon taking their first job after graduation. The students were categorized by political party affiliation (1=Democrat 2=Independent 3=Other 4=Republican). A researcher wanted to know if average expected salary differed by political party affiliation. Below is the output for an analysis of variance. Answer the questions below the output:

    General Linear Models Procedure
                            Class Level Information
                             Class    Levels    Values
                             Q4           4    1 2 3 4
    
                    Number of observations in data set = 38
    
    NOTE: Due to missing values, only 35 observations can be used in this
          analysis.
    
                        General Linear Models Procedure
    
    Dependent Variable: Q6   Expected salary upon graduation
                                  Sum of         Mean
    Source               DF       Squares       Square  F Value   Pr > F
    Model                 3     1.287E+09    4.291E+08    2.84    0.0541
    Error                31     4.687E+09    1.512E+08
    Corrected Total      34     5.974E+09
    
                   R-Square          C.V.     Root MSE            Q6 Mean
                   0.215454      30.70738        12296              40043
    
    
    Source               DF     Type I SS  Mean Square  F Value   Pr > F
    Q4                    3     1.287E+09    4.291E+08     2.84   0.0541
    Source               DF   Type III SS  Mean Square  F Value   Pr > F
    Q4                    3     1.287E+09    4.291E+08     2.84   0.0541
    
    
    
    Tukey's Studentized Range (HSD) Test for variable: Q6
    
          NOTE: This test controls the type I experimentwise error rate,
                but generally has a higher type II error rate than REGWQ.
    
                       Alpha= 0.05  df= 31  MSE= 1.5119E8
                   Critical Value of Studentized Range= 3.838
                     Minimum Significant Difference= 21703
                       WARNING: Cell sizes are not equal.
                     Harmonic Mean of cell sizes= 4.729064
    
          Means with the same letter are not significantly different.
    
                  Tukey Grouping              Mean      N  Q4
                               A             47667     12  2
                               A
                               A             42000      5  4
                               A
                               A             35000      2  3
                               A
                               A             34344     16  1
    
    

    1. What is the null hypothesis the researcher is testing?

    2. What are the assumptions necessary to make this a valid test?

    3. What does the Mean Square for the Model tell us about the variability in our data?

    4. What does the Mean Square for the Error tell us about the variability in our data?

    5. How is the F-statistic calculated? (Use the numbers on the output to illustrate.)

    6. What is the conclusion from this test? Justify your answer.

  31. If you worked in pairs on your data project, list the percentage of the project completed by yourself ______________% and the percentage completed by your partner _____________%.

  32. Honor Code: My data project was my own work, not the work of someone else.

    Signature _______________________________


Solutions

  1. A class of 400 students is divided into 2 sections of 200 each. Both sections are given a common exam. The following is observed

                Average   SD
    Section 1        85   10
    Section 2        45   10

    Suppose all 400 scores are combined as one list. The SD of the new list will be

    1. smaller than 10
    2. larger than 10
    3. equal to 10
    4. can't tell without knowing the entire list
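
    The combined SD can be worked out from the section summaries alone, because the mean squared deviation of the pooled list splits into a within-section part and a between-section part. The sketch below assumes the SD is the population SD (divide by n); the function name `combined_sd` is only illustrative.

```python
import math

def combined_sd(groups):
    """Population SD of several lists pooled into one.

    groups: list of (size, average, sd) tuples, one per section.
    """
    total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * avg for n, avg, _ in groups) / total
    # Pooled mean squared deviation = within-group part + between-group part.
    within = sum(n * sd ** 2 for n, _, sd in groups) / total
    between = sum(n * (avg - grand_mean) ** 2 for n, avg, _ in groups) / total
    return math.sqrt(within + between)

sd_all = combined_sd([(200, 85, 10), (200, 45, 10)])
print(round(sd_all, 1))
```

    Because the two section averages are 40 points apart, the between-section part dominates and the pooled SD comes out well above 10.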

  2. A study was made of the age at entrance of college freshmen to Duke. The SD turned out to be:
    1. 1 month
    2. 1 year
    3. 5 years

  3. Given the group of numbers: 3, 7, 4, 6, 2, 2, 7, 3, 2. Which of the following are the same?
    1. average
    2. range
    3. SD
    4. median
    5. mode
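
    One way to check which summaries coincide is to compute all five. This is a small sketch using Python's standard `statistics` module, assuming the population SD (`pstdev`) is the SD intended.

```python
import statistics

data = [3, 7, 4, 6, 2, 2, 7, 3, 2]

summary = {
    "average": statistics.mean(data),
    "range": max(data) - min(data),
    "SD": statistics.pstdev(data),      # population SD (divide by n)
    "median": statistics.median(data),
    "mode": statistics.mode(data),
}
for name, value in summary.items():
    print(name, value)
```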

  4. The average temperature in Durham on the 11th day of May over the last 10 years was 74 degrees F with an SD of 6 degrees F. Temperature Centigrade is found by subtracting 32 degrees from temperature Fahrenheit, then multiplying by 5/9. What were the average and the SD of the temperature in Durham on the 11th day of May expressed in degrees Centigrade?
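
    A change of units of this kind shifts and rescales the average but only rescales the SD. The lines below are a minimal sketch of that rule; `f_to_c` is an illustrative helper name.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Centigrade."""
    return (temp_f - 32) * 5 / 9

avg_c = f_to_c(74)   # the average is shifted by 32 and then rescaled
sd_c = 6 * 5 / 9     # the SD is only rescaled; subtracting 32 does not change spread

print(round(avg_c, 1), round(sd_c, 1))
```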

  5. Draw two histograms, one with its average greater than its median and the other with its median greater than its average. Label which is which.

  6. A list of numbers has a normal distribution, an average of 10 and an SD of 2. What is the interquartile range?

    10 - .675 (2) = 8.65
    10 + .675 (2) = 11.35

    11.35 - 8.65 = 2.70
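
    The same quartiles can be read off a normal curve in code. This sketch uses `statistics.NormalDist` from the Python standard library in place of a normal table.

```python
from statistics import NormalDist

scores = NormalDist(mu=10, sigma=2)
q1 = scores.inv_cdf(0.25)   # 25th percentile
q3 = scores.inv_cdf(0.75)   # 75th percentile
iqr = q3 - q1

print(round(q1, 2), round(q3, 2), round(iqr, 2))
```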

  7. A teacher decides to test a new teaching method. In class A she gives 'pop' quizzes at random intervals. In class B she gives announced quizzes on alternating Mondays. At the end of the class she gives a common exam to both classes with the following results. She creates a contingency table classifying students by whether they had mastered more or less than 90% of the material.

    Class     <90%   >90%   Total   % scoring >90%
    A           85     65     150              43%
    B           68     32     100              32%
    Total      153     97     250

    Then she decides to look at whether the effect was consistent for upper and lower classmen. She found the following.

    Fresh/Soph
    Class     <90%   >90%   Total   % scoring >90%
    A           60     15      75              20%
    B           60     15      75              20%
    Total      120     30     150

    Jr/Sr
    Class     <90%   >90%   Total   % scoring >90%
    A           25     50      75              67%
    B            8     17      25              68%
    Total       33     67     100

    1. What is the confounding variable?
      Fresh/Soph versus Junior/Senior

    2. Calculate a weighted average for the percentage in Class A that mastered greater than 90% of the material taking into account the confounding variable. Do the same for Class B.
    % Class A = .2 (150/250) + .67 (100/250) = .39
    % Class B = .2 (150/250) + .68 (100/250) = .39
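
    The weighted averages above can be reproduced from the raw counts. In this sketch each stratum's rate is weighted by that stratum's share of all 250 students; the dictionary layout and the name `adjusted_rate` are illustrative.

```python
# Counts from the two stratified tables: (number above 90%, class size).
strata = {
    "Fresh/Soph": {"A": (15, 75), "B": (15, 75), "total": 150},
    "Jr/Sr": {"A": (50, 75), "B": (17, 25), "total": 100},
}
grand_total = sum(s["total"] for s in strata.values())

def adjusted_rate(class_name):
    """Stratum rates for one class, weighted by each stratum's share of all students."""
    return sum(
        (s[class_name][0] / s[class_name][1]) * (s["total"] / grand_total)
        for s in strata.values()
    )

rate_a = adjusted_rate("A")
rate_b = adjusted_rate("B")
print(round(rate_a, 2), round(rate_b, 2))
```

    Once both classes are given the same mix of lower and upper classmen, the apparent advantage of Class A disappears.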

  8. The ETS Verbal Aptitude Test is designed so that the scores of high school seniors taking the test will have a normal distribution with average 500 and SD 100. A college wants their students to be in the top 12.5% for verbal aptitude. What minimum qualifying score should they set?

    500 + 1.15(100) = 615
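
    The cutoff is the 87.5th percentile of the score distribution. A quick check with `statistics.NormalDist`, used here instead of a normal table:

```python
from statistics import NormalDist

# Top 12.5% means everything above the 87.5th percentile.
cutoff = NormalDist(mu=500, sigma=100).inv_cdf(1 - 0.125)
print(round(cutoff))
```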

  9. At one law school, the correlation between LSAT score and GPA for first-year students was 0.4. Average LSAT of those admitted was 650 with an SD of 50; average GPA was 3.0 with an SD of 0.4. Both scores approximately follow the normal curve. Draw a scatter diagram to help answer the following questions.

    1. Approximately what percentage of all students admitted were above average on both the LSAT score and law school GPA?

      25% < x% < 40%

    2. With the same averages and SD's, and the scores following the normal curve, what is the highest that this percentage could be at another law school? What would be the corresponding correlation? ___50_________% ________1__________Correlation

    3. Suppose .5 was added to the GPA of each student at the law school. How would the correlation between LSAT and GPA change?
      1. increase
      2. decrease
      3. stay the same
      4. can't tell without further information

  10. A teaching assistant gives a quiz. There are 10 questions on the quiz and no partial credit is given. After grading the papers the TA writes down for each student the number of questions the student got right and the number he got wrong. The average number of correct answers is 6.4 with an SD of 2.0.

    1. What is the average and SD of the number of wrong answers? ______3.6______Average _____2.0_______SD

    2. The correlation coefficient between the numbers of right and wrong answers is
      1. 0
      2. -.5
      3. +.5
      4. -1
      5. +1
      6. can't tell

  11. After being given a dose of a certain drug, the average temperatures and blood pressures of 5 groups of 20 subjects were recorded. The correlation for the 5 pairs of group averages was +0.92. The relation between a subject's temperature and blood pressure in the group is almost exactly a line. True or False?
    False
    This is ecological correlation.

  12. A doctor records, for 400 hospital patients selected at random, the pair: (temperature when admitted, temperature 24 hours later). He finds that the average temperatures are 101 degrees when admitted, 99 the next day. The SD is 1.5 degrees for temperatures when admitted, 0.5 for those the next day. The correlation between the two temperatures is 0.8. The scatter diagram is football shaped.

    1. If a patient enters with temperature 103, what do you expect his temperature to be the next day?
      y = mx + b
      m = r (SDY/SDX) = .8 (.5/1.5) = .267
      99 = .267 (101) + b
      b = 72.03

      y = .267 (103) + 72.03 = 99.53

    2. The fact that the average temperature the day after entering is 2 degrees less than upon admission is due to the regression effect. True or False?
      False
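
    The regression prediction in part 1 can be sketched as follows. The helper `regression_line` is an illustrative name; it builds the line for predicting next-day temperature from admission temperature out of the five summary statistics.

```python
def regression_line(avg_x, sd_x, avg_y, sd_y, r):
    """Slope and intercept of the regression line for predicting y from x."""
    slope = r * sd_y / sd_x
    intercept = avg_y - slope * avg_x   # the line passes through the point of averages
    return slope, intercept

slope, intercept = regression_line(avg_x=101, sd_x=1.5, avg_y=99, sd_y=0.5, r=0.8)
predicted = slope * 103 + intercept
print(round(predicted, 2))
```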

  13. As part of a large experimental course, a diagnostic quiz is given at the outset of the course. At the end of the course, the instructor collected the following statistics:

    Diagnostic quiz (20 points possible): average=10, SD=4

    Final exam (100 points possible): average=60, SD=15

    r=0.6

    Of those who scored 6 points on the diagnostic quiz, about what percent scored below average on the final?

    m = .6 (15/4) = 2.25
    60 = 10 (2.25) + b
    37.5 = b
    y = 2.25 x + 37.5
    51 = 2.25 (6) + 37.5

    r.m.s. error = sqrt(1 - .6 x .6) (15) = 12
    (60 - 51) / 12 = .75

    77.34%
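
    The whole calculation can be sketched in a few lines, assuming the final scores of students with the same quiz score follow a normal curve centered on the regression estimate with spread equal to the r.m.s. error.

```python
import math
from statistics import NormalDist

r, avg_x, sd_x, avg_y, sd_y = 0.6, 10, 4, 60, 15

slope = r * sd_y / sd_x
predicted_final = avg_y + slope * (6 - avg_x)   # regression estimate for a quiz score of 6
rms_error = math.sqrt(1 - r ** 2) * sd_y        # spread around the regression line
z = (avg_y - predicted_final) / rms_error
share_below = NormalDist().cdf(z)               # fraction below the overall final average

print(predicted_final, round(rms_error, 1), round(z, 2), round(share_below, 4))
```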

  14. A GPA-LSAT study shows that the slope for estimating LSAT from GPA is about 100 LSAT points per GPA point. Which of the following is correct?

    1. If one person has twice the GPA of another, his or her estimated LSAT score is 100 points higher than the other's. True or False?
      False

    2. If two people differ by 1.5 points in GPA, you can expect their LSAT scores to differ by about 150 points. True or False?
      True

    3. A person with a GPA of 3.5 will score, on average, 3.5x100=350 on the LSAT. True or False?
      False

    4. The SD of LSAT scores is 100xSD of GPA scores in the group studied. True or False?
      False

  15. A certain pregnancy test determines pregnancy with high probability. In particular, if a woman is pregnant the chance the test will give a positive reaction is 90%. If she isn't pregnant, the chance it will give a negative reaction is 85%. Suppose a woman takes the test and gets a negative result. The chance she is pregnant is: ___________ or can not be determined
    P(Preg | Neg) = P(Neg | Preg) P(Preg) / P(Neg)
    We don't know P(Preg), so the chance can not be determined.

  16. An astronaut's oxygen supply comes from two independent sources. Source A has probability 0.9 of working, and Source B has probability 0.8 of working. What is the probability that at least one of the sources is working?
    P(A or B) = P(A) + P(B) - P(A and B) = .9 + .8 - (.9)(.8) = .98

  17. A die is tossed twice. Given that the sum is greater than 7, what is the chance of getting a pair?

    3/15
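
    The count behind 3/15 can be confirmed by listing all 36 equally likely outcomes. This is a brute-force sketch, not the only way to reason about it.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))            # all 36 ordered outcomes
big_sum = [(a, b) for a, b in rolls if a + b > 7]       # condition: sum greater than 7
pairs = [(a, b) for a, b in big_sum if a == b]          # doubles within that condition

chance = Fraction(len(pairs), len(big_sum))
print(len(pairs), len(big_sum), chance)
```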

  18. There are 20 men and 5 women in a class. Each day one person is chosen at random (with replacement).

    1. What is the chance that a woman will be chosen on any given day?
      5/25 = 1/5

    2. What is the chance that a man is chosen on the first three days?

    3. What is the chance of choosing exactly two women in the first five days?

    4. What is the chance of choosing at least two women in the first five days?

    5. What is the chance of choosing a woman on the fifth day given that you chose a man on each of the first four days?
      1/5

    6. What is the chance of choosing a man on the first day and a woman on the second day?
      (4/5) (1/5) = .16

    7. What is the chance of choosing a man on the first day or a woman on the second day?
      P(A or B) = P(A) + P(B) - P(A and B) = 4/5 + 1/5 - (4/5) (1/5) = .84
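
    Parts 2 through 4 follow from the independence of the daily draws and the binomial formula. The sketch below computes them; `binom_pmf` is an illustrative helper, not something defined in the exam.

```python
from math import comb

p_woman = 5 / 25

def binom_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

men_first_three = (1 - p_woman) ** 3                                    # part 2
exactly_two = binom_pmf(2, 5, p_woman)                                  # part 3
at_least_two = 1 - binom_pmf(0, 5, p_woman) - binom_pmf(1, 5, p_woman)  # part 4

print(round(men_first_three, 3), round(exactly_two, 4), round(at_least_two, 4))
```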

  19. You want to find out about drug use among members of your class, but you are sure that people will not respond honestly if you ask them directly, so you set up a scheme such that each classmate is given a private ticket. No one besides the receiver knows what is on the ticket. 70% of the tickets have an S written on them and 30% have an NS written on them. Those students receiving an S ticket are to answer the question "Have you used an illegal drug in the last week?" Those with an NS ticket are to answer the question "Does your mother's phone number end in an even digit?" In the class of 60, 25 respond yes to their question and 35 respond no. Calculate the probability that a classmate has used illegal drugs in the past week.
    P(yes) = P(yes | NS) P(NS) + P(yes | S) P(S)
    25/60 = .5 (.3) + x (.7)

    x = 38%

  20. In a city, 20% of the workers have incomes over $40,000 per year. If 1600 workers are chosen at random with replacement, what is the chance that between 320 and 336 of those chosen have incomes over $40,000 per year?
    mean: 1600 (.2) = 320
    SD: sqrt(1600 (.2) (.8)) = 16

    34%
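
    The 34% is the area under the normal curve between 0 and 1 SD above the mean. A sketch of the normal approximation, without a continuity correction (an assumption made here to match the answer above):

```python
import math
from statistics import NormalDist

n, p = 1600, 0.2
mean = n * p
sd = math.sqrt(n * p * (1 - p))

approx = NormalDist(mean, sd)
chance = approx.cdf(336) - approx.cdf(320)   # area from the mean to 1 SD above it
print(round(sd, 1), round(chance, 2))
```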

  21. One hospital has 112 live births during the month of June, another has 432. Which is likelier to have 55% or more male births or are the chances equal? (Note: there is about a 52% chance for a liveborn infant to be male.)

    Hospital with 112 births Hospital with 432 births

  22. In a certain precinct, 80% of the voters are Republican. A simple random sample of size 400 is drawn (with replacement). Each person in the sample is polled and the percentage of Republicans in the sample is calculated. What is the chance that this percentage is between 78% and 88%?
    mean: 80%
    standard error: sqrt(.8 x .2 / 400) = 2%

    84%

  23. In a simple random sample of 900 students at a major state university, only 180 favor a return to the semester system. Find a 95% confidence interval for the percentage of students in the university favoring a return to the semester system.
    mean: 180/900 = 20%
    standard error: sqrt(.2 x .8 / 900) = 1.3%

    20% +/- 2 (1.3%)
    ( 17.4%, 22.6%)

  24. In 1990, the average of the daily maximum temperature at San Francisco Airport was 64.4 degrees, with an SD of 7 degrees. The standard error for the average is 7/sqrt(365) = 0.37 degrees. A 95% confidence interval for the average daily maximum temperature at San Francisco airport is 64.4 +/- 0.74 degrees. True or False?
    False. This isn't a sample.

  25. A large lecture course has 1000 students. On a midterm exam, the class average was 50 with an SD of 10. For one section of 25 students the section average was 60. It is suggested that the early morning section has more serious students and would therefore have higher scores.

    1. Formulate the null hypothesis

    2. Compute z and P

      p < .05

    3. Can this higher average be reasonably explained by chance variation? Justify your answer.
      No.

      p < .05
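
    One reading of the null hypothesis, assumed in this sketch, is that the section's 25 scores behave like 25 draws at random from the 1000 class scores, so the section average differs from 50 only by chance. The z statistic and one-sided P under that assumption (ignoring the small correction for drawing without replacement) are:

```python
import math
from statistics import NormalDist

class_avg, class_sd, n, section_avg = 50, 10, 25, 60

se = class_sd / math.sqrt(n)              # standard error of the average of 25 draws
z = (section_avg - class_avg) / se
p_value = 1 - NormalDist().cdf(z)         # one-sided: chance of an average this high or higher

print(se, z, p_value)
```

    A z this large leaves a P far below .05, which is consistent with the answer "No" in part 3.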

  26. To find out whether students should be required to take a statistics course, a campus poll is taken; the results are shown below:
                 Fresh/Soph   Jr/Seniors   Totals
    Should              197          195      392
    Should not          233          175      408
    Totals              430          370      800

    1. Calculate the relative risk.
      (197/392) / (233/408) = .88

    2. Interpret the relative risk.
      The students who felt a statistics course should be taken were less likely to be Fresh/Soph.

    3. Calculate the odds ratio.
      (197 x 175) / (195 x 233) = .76

    4. Using the data given, test the hypothesis that responses do not differ according to class in the total population of students. Use a Chi-square test with a 0.05 level of significance.
                   Fresh/Soph                Jr/Seniors
      Should       (430/800)*392 = 210.7     (370/800)*392 = 181.3
      Should not   (430/800)*408 = 219.3     (370/800)*408 = 188.7

      chi-square = 3.78 on 1 degree of freedom, p = .052

      This is a marginally significant result.
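
    The chi-square statistic can be recomputed from the observed table. The sketch below uses only the standard library; for 1 degree of freedom the P-value equals the two-sided normal tail area at sqrt(chi-square), which is what the `math.erfc` line computes.

```python
import math

observed = [[197, 195],    # Should:     Fresh/Soph, Jr/Seniors
            [233, 175]]    # Should not: Fresh/Soph, Jr/Seniors

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count under independence = row total x column total / grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
p_value = math.erfc(math.sqrt(chi_sq / 2))   # valid for 1 degree of freedom only

print(round(chi_sq, 2), round(p_value, 3))
```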

  27. You are hired by the university to estimate the % of students that oppose the alcohol policy at Duke. How many students should you sample to be 95% confident of falling within +/- 2 percentage points of the true population percentage? Assume a pilot study has been done estimating the percentage to be 50%.
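
    The sample size comes from setting the margin of error equal to 2 percentage points and solving for n. The sketch below assumes the "2 SE" rule for 95% confidence used elsewhere in these solutions, and also shows the result for 1.96 SE. The function name `sample_size` is illustrative; inputs are passed as decimal strings so the arithmetic is exact.

```python
import math
from fractions import Fraction

def sample_size(p, margin, z="2"):
    """Smallest n with z * sqrt(p (1 - p) / n) <= margin."""
    p, margin, z = Fraction(p), Fraction(margin), Fraction(z)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_two_se = sample_size("0.5", "0.02")             # 2 SE convention
n_exact_z = sample_size("0.5", "0.02", "1.96")    # 1.96 SE convention

print(n_two_se, n_exact_z)
```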

  28. What does each of the following symbols stand for in standard statistical practice?

    μ = Population mean

    β = Population slope

    σ = Population SD

  29. A friend tells you he has a coin that lands heads 25%, 50%, or 75% of the time, but he does not tell you which. Before seeing any data you assume that each possibility is equally likely. He then lets you toss the coin 8 times, and it lands heads 7 times.

    1. What is your posterior probability that p=.5?

    2. What is your prediction for the probability that the coin will land heads the next time it is tossed?
      .25 (.0004/.2987) + .50 (.0313/.2987) + .75(.2670/.2987) = .0003 +.0524 + .6704= .7231

    3. Suppose you wanted to test H0:p=.5 against HA:p>.5. What is the observed p-value?

    4. In two sentences or less explain why the answers to 1 and 3 are different.
      In 1) we calculated the probability of p given the data, while in 3) we calculated the probability of the data we observed or more extreme given p = .5.
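
    Parts 1 through 3 can be sketched together. The code applies Bayes' rule over the three candidate values of p with equal prior weight, then computes the one-sided binomial P-value under p = .5. Variable names are illustrative.

```python
from math import comb

candidates = [0.25, 0.50, 0.75]
prior = [1 / 3] * 3
heads, tosses = 7, 8

# Binomial likelihood of 7 heads in 8 tosses for each candidate p.
likelihood = [
    comb(tosses, heads) * p ** heads * (1 - p) ** (tosses - heads)
    for p in candidates
]
evidence = sum(pr * lk for pr, lk in zip(prior, likelihood))
posterior = [pr * lk / evidence for pr, lk in zip(prior, likelihood)]

# Part 2: predictive chance of heads = posterior-weighted average of p.
prob_next_head = sum(p * post for p, post in zip(candidates, posterior))

# Part 3: chance of 7 or more heads in 8 tosses when p = .5.
p_value = sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

print([round(x, 4) for x in posterior], round(prob_next_head, 4), round(p_value, 4))
```

    The posterior weight on p = .5 is roughly three times the P-value, which illustrates the distinction drawn in part 4.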

  30. A random sample of graduate students at Duke were asked how much they expected to make upon taking their first job after graduation. The students were categorized by political party affiliation (1=Democrat 2=Independent 3=Other 4=Republican). A researcher wanted to know if average expected salary differed by political party affiliation. Below is the output for an analysis of variance. Answer the questions below the output:

    General Linear Models Procedure
                            Class Level Information
                             Class    Levels    Values
                             Q4           4    1 2 3 4
    
                    Number of observations in data set = 38
    
    NOTE: Due to missing values, only 35 observations can be used in this
          analysis.
    
                        General Linear Models Procedure
    
    Dependent Variable: Q6   Expected salary upon graduation
                                  Sum of         Mean
    Source               DF       Squares       Square  F Value   Pr > F
    Model                 3     1.287E+09    4.291E+08    2.84    0.0541
    Error                31     4.687E+09    1.512E+08
    Corrected Total      34     5.974E+09
    
                   R-Square          C.V.     Root MSE            Q6 Mean
                   0.215454      30.70738        12296              40043
    
    
    Source               DF     Type I SS  Mean Square  F Value   Pr > F
    Q4                    3     1.287E+09    4.291E+08     2.84   0.0541
    Source               DF   Type III SS  Mean Square  F Value   Pr > F
    Q4                    3     1.287E+09    4.291E+08     2.84   0.0541
    
    
    
    Tukey's Studentized Range (HSD) Test for variable: Q6
    
          NOTE: This test controls the type I experimentwise error rate,
                but generally has a higher type II error rate than REGWQ.
    
                       Alpha= 0.05  df= 31  MSE= 1.5119E8
                   Critical Value of Studentized Range= 3.838
                     Minimum Significant Difference= 21703
                       WARNING: Cell sizes are not equal.
                     Harmonic Mean of cell sizes= 4.729064
    
          Means with the same letter are not significantly different.
    
                  Tukey Grouping              Mean      N  Q4
                               A             47667     12  2
                               A
                               A             42000      5  4
                               A
                               A             35000      2  3
                               A
                               A             34344     16  1
    
    

    1. What is the null hypothesis the researcher is testing?

    2. What are the assumptions necessary to make this a valid test?
      equal variances, independent populations, random sampling

    3. What does the Mean Square for the Model tell us about the variability in our data?
      variability between groups

    4. What does the Mean Square for the Error tell us about the variability in our data?
      variability within groups

    5. How is the F-statistic calculated? (Use the numbers on the output to illustrate.)

    6. What is the conclusion from this test? Justify your answer.
      This is marginally significant. The probability of seeing data this extreme or more extreme given H0 is 5.4%.
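
    In a one-way analysis of variance the null hypothesis is that the population mean is the same in every group, and the F statistic is the ratio of the two mean squares. The sketch below recomputes the printed values from the sums of squares and degrees of freedom on the output; small differences are due to rounding in the printout.

```python
ss_model, df_model = 1.287e9, 3
ss_error, df_error = 4.687e9, 31

ms_model = ss_model / df_model     # between-group variability per degree of freedom
ms_error = ss_error / df_error     # within-group variability per degree of freedom
f_stat = ms_model / ms_error
r_square = ss_model / (ss_model + ss_error)

print(f"{ms_model:.3e} {ms_error:.3e} {f_stat:.2f} {r_square:.3f}")
```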