Statistics 101
Data Analysis and Statistical Inference
Instructions for Lab 8
Lab Objective
The purpose of the lab is to analyze a data set from scratch, using
methods that we have learned in class.
Lab Procedures
The television series Sesame Street is concerned mainly with teaching
preschool skills to children age 3-5, with special emphasis on
reaching economically disadvantaged children. The show is
designed to hold young children's attention through action-oriented,
short-duration presentations teaching specific preschool cognitive
skills and some social skills. Each show is one hour and involves much
repetition of concepts within and across shows.
Does Sesame Street help economically disadvantaged children catch up
with economically advantaged children? In the early 1970s,
researchers at Educational Testing Service (the company that runs the
SAT) ran a study to evaluate Sesame Street. The researchers
sampled children representative of economically advantaged and
disadvantaged populations from five different sites in the United
States. To ensure the study contained a group of children that
watched Sesame Street regularly, they randomly assigned children either
to receive encouragement to watch Sesame Street or not to receive
encouragement. Those assigned to encouragement were given
promotional materials, and received weekly visits and phone calls from
ETS staff. Those assigned not to receive encouragement did not
get this attention.
The children were tested on a variety of cognitive variables, including
knowledge of body parts, knowledge about letters, knowledge about
numbers, etc., both before and after viewing the series.
Open the data set sesame.jmp by clicking on the link. These data are
part of a larger data set
used to evaluate the impact of Sesame Street. The names of
variables are shown in the code book at the end of the lab
instructions. Note that all the variables are currently coded as
continuous (quantitative) variables. You should recode any
nominal (qualitative) variable by clicking on the blue C next to its
name in the box to the left and selecting "Nominal".
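For readers who want to see the same recoding idea outside JMP, here is a minimal Python sketch. The value labels are taken from the code book at the end of these instructions; the dictionary layout, the helper name recode_nominal, and the two example rows are illustrative assumptions, not part of the lab materials or of JMP.

```python
# Value labels copied from the lab's code book. The structure and names here
# are assumptions made for illustration; JMP itself is point-and-click.
VALUE_LABELS = {
    "site": {
        1: "disadvantaged inner city",
        2: "advantaged suburban",
        3: "advantaged rural",
        4: "disadvantaged rural",
        5: "disadvantaged Spanish speaking",
    },
    "sex": {1: "male", 2: "female"},
    "setting": {1: "home", 2: "school"},
    "viewenc": {1: "encouraged", 2: "not encouraged"},
}


def recode_nominal(rows, labels=VALUE_LABELS):
    """Return copies of the rows with nominal codes replaced by text labels."""
    recoded = []
    for row in rows:
        new_row = dict(row)  # copy so the original row is left untouched
        for column, mapping in labels.items():
            if column in new_row:
                # Codes missing from the mapping are kept as-is, not guessed.
                new_row[column] = mapping.get(new_row[column], new_row[column])
        recoded.append(new_row)
    return recoded


# Two made-up rows, only to show the shape of the input and output.
example_rows = [
    {"id": 1, "site": 1, "sex": 1, "setting": 2, "viewenc": 1, "age": 52},
    {"id": 2, "site": 3, "sex": 2, "setting": 1, "viewenc": 2, "age": 48},
]
recoded_rows = recode_nominal(example_rows)
print(recoded_rows[0]["sex"], "/", recoded_rows[1]["viewenc"])
```

Treating these columns as labels rather than numbers guards against meaningless summaries such as an "average site". Quantitative columns such as age are passed through unchanged.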
Questions:
1. Did encouragement cause children to watch Sesame Street more
frequently? Did encouragement result in higher test scores on
average?
2. What do the data suggest about whether watching Sesame Street
helped children? Compare within types of kids.
3. What do the data suggest about whether Sesame Street helped
economically disadvantaged children catch up?
Before you come to lab, think of an analysis plan to
address the following issues:
1. The general question you will answer, and a hypothesized answer
(i.e., what results will support your hypothesized answer?).
2. The comparison groups you will use.
3. The outcome (dependent, response, Y) and predictor (independent, X)
variables you will use to answer the question. You can look at
one or two outcome variables, or more if you'd like.
4. The statistical method(s) that you will use to help answer the
question.
5. What results from these specific statistical methods are needed to
support your hypothesized answer?
Take this analysis plan with you to lab. You then can ask the TAs or
other students about your plans, make adjustments, and use the
remaining time to begin your analyses. You'll have a week to
complete your analyses. You can ask the TAs or the professor for
advice on your data analysis plans at any time during the week.
Turn in a typed summary of your analyses (not to exceed one
typewritten page, single-spaced, in 12-point text). In the write-up,
explain the analyses you did and your conclusions. Provide
numerical evidence from the data to support your conclusions. You
don't have to tell me all the JMP commands you used. Just tell me
what you found. For example, you might say "The test scores for the
kids who were encouraged are typically higher than those for the kids
who were not encouraged. The means are ___ and ___, respectively,
with SDs of ___ and ___."
This write-up counts for 30 lab points.
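The group means and SDs mentioned in the example sentence can be computed in many ways. As a rough sketch outside JMP, the Python below groups records by one variable and reports n, mean, and sample SD for another. The function name summarize_by_group and the toy scores are assumptions for illustration; the numbers are made up and are not values from sesame.jmp.

```python
from statistics import mean, stdev


def summarize_by_group(records, group_key, value_key):
    """Return {group: (n, mean, sample SD)} of value_key within each group."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record[value_key])
    # stdev() is the sample SD and needs at least two values per group.
    return {
        group: (len(values), mean(values), stdev(values))
        for group, values in groups.items()
    }


# Made-up scores for illustration only; NOT taken from the study data.
toy_records = [
    {"viewenc": 1, "postlet": 30},
    {"viewenc": 1, "postlet": 34},
    {"viewenc": 1, "postlet": 26},
    {"viewenc": 2, "postlet": 20},
    {"viewenc": 2, "postlet": 24},
    {"viewenc": 2, "postlet": 22},
]

summary = summarize_by_group(toy_records, "viewenc", "postlet")
for group, (n, avg, sd) in sorted(summary.items()):
    print(f"viewenc={group}: n={n}, mean={avg:.1f}, SD={sd:.1f}")
```

The same helper could be pointed at any grouping variable and outcome in the code book. It only summarizes the data, so choosing the comparison and interpreting it are still up to you.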
Important Note
These data are challenging to analyze, particularly for Questions #2
and #3. There was a lot of controversy over the conclusions of
ETS (who found that Sesame Street does help) because of concerns
related to the study design and potential confounding. Analyze
Questions #2 and #3 as best you can, thinking about potential
confounding variables that could affect your conclusions. Perform
the analyses for Question #2 assuming those confounding variables are
not a problem, but explain in your last paragraph how they might be a
problem.
Code book with variable names
id : subject identification number
site :
    1 = Three- to five-year-old disadvantaged children from inner-city
        areas in various parts of the country
    2 = Four-year-old advantaged suburban children
    3 = Advantaged rural children
    4 = Disadvantaged rural children
    5 = Disadvantaged Spanish-speaking children
sex : 1 = male, 2 = female
age : age in months
viewcat : frequency of viewing
    1 = rarely watched the show
    2 = once or twice a week
    3 = three to five times a week
    4 = watched the show on average more than 5 times a week
setting : setting in which Sesame Street was viewed
    1 = home, 2 = school
viewenc : treatment condition
    1 = child encouraged to watch, 2 = child not encouraged to watch
prebody : pretest on knowledge of body parts (scores range from 0-32)
prelet : pretest on letters (scores range from 0-58)
preform : pretest on forms (scores range from 0-20)
prenumb : pretest on numbers (scores range from 0-54)
prerelat : pretest on relational terms (scores range from 0-17)
preclasf : pretest on classification skills
postbody : posttest on knowledge of body parts (0-32)
postlet : posttest on letters (0-58)
postform : posttest on forms (0-20)
postnumb : posttest on numbers (0-54)
postrelat : posttest on relational terms (0-17)
postclasf : posttest on classification skills
peabody : mental age score obtained from administration of the
    Peabody Picture Vocabulary Test as a pretest measure of vocabulary
    maturity