STA 210B/ENV 251
Statistics and Data Analysis for
the Biological Sciences
Objective: Generate random variables and sampling distributions
Material: Use simprob.jmp and cntrlmt.jmp in the JMPin databook.
Background: We will use JMPin to generate random variables. This gives us an opportunity to see how binomial and Poisson random variables behave and how that behavior relates to the distributions of p-hat and of a Poisson count. There is also a mystery distribution, for which we will explore the sampling distribution of the sample mean.
Main Question: How does sample size relate to the standard deviation of a sampling distribution?
Steps:
1. Open simprob.jmp and add 200 to 1000 rows. This will generate
binomial random variables for several values of n and p. What you see in
the columns are actually p-hats, the sample fractions. Look at the histograms.
Confirm that the average of p-hats is close to p and that the standard
deviation of the p-hats is close to the standard error, sqrt(p(1-p)/n).
2. Open a new JMP table. Add four columns and 200 to 1000 rows. Each column will be defined by a formula. This formula will generate a Poisson random variable and compute its z-score. Here are the formulas:
Look at the distributions and confirm that the mean of the z-scores is near 0 and the standard deviation is near 1. Note that the distributions of the last two columns are nearer the standard normal distribution. What qualitative difference in the distributions shows the improved approximation to normality?
3. If time permits, open cntrlmt.jmp. This table computes sample
means from a mystery distribution. The columns differ only by sample size.
The mystery distribution has theoretical mean 1/5 and standard deviation
4/15. Confirm that column 1, which has sample size 1, is near this theoretical
mean and standard deviation. (Why would this be the case?) Confirm that
the other columns have mean 1/5 as well. Theory also tells us that the
standard deviation of sample means for samples of size n, i.e. the standard
error, is the standard deviation for sample size 1 divided by the square
root of n. That is, for samples of size n, SE = SD/sqrt(n). Again, SD is 4/15.
Confirm this for the columns with n = 5, 10, 50, and 100.
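
For anyone who wants to cross-check the JMP results outside JMPin, the three confirmations in the steps above can be sketched in Python. This is only a sketch under stated assumptions: n = 50, p = 0.3 and the Poisson means 2 and 50 are illustrative choices, not the values built into simprob.jmp or the column formulas, and the shifted exponential is a hypothetical stand-in for the mystery distribution, chosen only because it has mean 1/5 and standard deviation 4/15.

```python
import math
import random
import statistics


def simulate_p_hats(n, p, rows, rng):
    """Return `rows` sample fractions, each from one Binomial(n, p) count."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(rows)]


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_poisson_z(lam, rows, rng):
    """Return z-scores (x - lam) / sqrt(lam) for `rows` Poisson(lam) draws."""
    return [(poisson_draw(lam, rng) - lam) / math.sqrt(lam) for _ in range(rows)]


def simulate_sample_means(n, rows, rng, mean=1 / 5, sd=4 / 15):
    """Return `rows` sample means, each from n draws of a stand-in distribution.

    ASSUMPTION: the stand-in is a shifted exponential with the stated mean
    and SD; it is not the actual mystery distribution in cntrlmt.jmp.
    """
    return [
        statistics.fmean(mean - sd + rng.expovariate(1 / sd) for _ in range(n))
        for _ in range(rows)
    ]


rng = random.Random(210)  # fixed seed so repeated runs give the same numbers
rows = 1000

# Step 1: mean of p-hats vs p, SD of p-hats vs sqrt(p(1-p)/n).
n, p = 50, 0.3  # illustrative values
p_hats = simulate_p_hats(n, p, rows, rng)
print("step 1: mean", statistics.fmean(p_hats), "vs p =", p)
print("step 1: SD  ", statistics.stdev(p_hats), "vs SE =", math.sqrt(p * (1 - p) / n))

# Step 2: z-scores of Poisson draws should have mean near 0 and SD near 1.
for lam in (2, 50):  # illustrative small and large Poisson means
    z = simulate_poisson_z(lam, rows, rng)
    print("step 2: lam", lam, "mean", statistics.fmean(z), "SD", statistics.stdev(z))

# Step 3: SD of the sample means vs SD/sqrt(n), with SD = 4/15.
for size in (1, 5, 10, 50, 100):
    means = simulate_sample_means(size, rows, rng)
    print("step 3: n", size, "mean", statistics.fmean(means),
          "SD", statistics.stdev(means), "vs SE =", (4 / 15) / math.sqrt(size))
```

Changing rows between 200 and 1000 shows how closely the simulated means and standard deviations track the theoretical values at each table size.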
Report:
Write up the steps. Plots are not necessary.
Due: Following Thursday lecture.