Lab Exercise 4

STA 210B/ENV 251
Statistics and Data Analysis for the Biological Sciences

Objective: Generate random variables and sampling distributions

Material: Use simprob.jmp and cntrlmt.jmp in JMPin databook.

Background: We will use JMPin to generate random variables. This will give us an opportunity to see how binomial and Poisson random variables behave and how that behavior relates to the distributions of p-hat and of a Poisson count. There is also a mystery distribution, for which we will explore the sampling distribution of the sample mean.

Main Question: How does sample size relate to the standard deviation of a sampling distribution?

Steps:
1. Open simprob.jmp and add 200 to 1000 rows. This will generate binomial random variables for several values of n and p. What you see in the columns are actually p-hats, the sample fractions. Look at the histograms. Confirm that the average of the p-hats is close to p and that the standard deviation of the p-hats is close to the standard error, sqrt(p(1-p)/n).
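
For anyone who wants to check Step 1 outside JMP, the following is a minimal Python sketch of the same idea. It is not what simprob.jmp does internally. The (n, p) pairs, the row count, the seed, and the helper name simulate_p_hats are all assumptions chosen for illustration. Each row is a p-hat from n Bernoulli(p) trials, and the printout compares the mean and standard deviation of the p-hats with p and sqrt(p(1-p)/n).

```python
import math
import random
import statistics


def simulate_p_hats(n, p, rows, rng):
    """Return `rows` sample fractions, each from n Bernoulli(p) trials."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(rows)]


rng = random.Random(210)  # fixed seed (arbitrary) so the run is repeatable
rows = 1000               # assumed; the lab suggests 200 to 1000 rows
results = {}

# Illustrative (n, p) pairs only; the table in simprob.jmp may use others.
for n, p in [(10, 0.5), (50, 0.2), (100, 0.1)]:
    p_hats = simulate_p_hats(n, p, rows, rng)
    mean_p_hat = statistics.mean(p_hats)
    sd_p_hat = statistics.stdev(p_hats)
    se = math.sqrt(p * (1 - p) / n)  # theoretical standard error of p-hat
    results[(n, p)] = (mean_p_hat, sd_p_hat, se)
    print(f"n={n:3d} p={p:.2f} mean={mean_p_hat:.4f} sd={sd_p_hat:.4f} SE={se:.4f}")
```

The mean column should land close to p and the sd column close to SE, with agreement tightening as the number of rows grows.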

2. Open a new JMP table. Add four columns and 200 to 1000 rows. Each column will be defined by a formula. This formula will generate a Poisson random variable and compute its z-score. Here are the formulas:

(Note that in lecture we used mu as the Greek letter for the rate of a Poisson, while JMP uses lambda for this parameter. Either is appropriate, but lambda is probably more commonly used.)

Look at the distributions and confirm that the mean of the z-scores is near 0 and the standard deviation is near 1. Note that the distributions of the last two columns are nearer the standard normal distribution. What qualitative difference in the distributions shows the improved approximation to normality?
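
The same check can be sketched in Python. This is only an illustration under stated assumptions: the four lambda values, the row count, and the seed are made up here and are not taken from the JMP formulas, and poisson_draw is a hypothetical helper that uses Knuth's multiplication method because the Python standard library has no Poisson generator. The z-score uses the fact that a Poisson with rate lambda has mean lambda and standard deviation sqrt(lambda).

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniforms until the product falls to exp(-lam) or below.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(251)  # fixed seed (arbitrary) so the run is repeatable
rows = 1000               # assumed; the lab suggests 200 to 1000 rows
z_summary = {}

# Hypothetical rates, increasing across the four "columns".
for lam in [1, 5, 20, 50]:
    z_scores = [(poisson_draw(lam, rng) - lam) / math.sqrt(lam) for _ in range(rows)]
    z_summary[lam] = (statistics.mean(z_scores), statistics.stdev(z_scores))
    print(f"lambda={lam:2d} mean z={z_summary[lam][0]:+.3f} sd z={z_summary[lam][1]:.3f}")
```

The summary statistics alone will look similar for every rate, so a histogram of each list of z-scores is still needed to answer the question about the shape of the distributions.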

3. If time permits, open cntrlmt.jmp. This table computes sample means from a mystery distribution. The columns differ only by sample size. The mystery distribution has theoretical mean 1/5 and standard deviation 4/15. Confirm that column 1, which has sample size 1, is near this theoretical mean and standard deviation. (Why would this be the case?) Confirm that the other columns have mean 1/5 as well. Theory also tells us that the standard deviation of sample means from samples of size n, i.e. the standard error, is the standard deviation for sample size 1 divided by the square root of n. That is, for a sample of size n, SE = SD/sqrt(n). Again, SD is 4/15. Confirm this for the columns with n = 5, 10, 50, and 100.
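
The relation SE = SD/sqrt(n) can also be checked with a short Python sketch. The mystery distribution in cntrlmt.jmp is not identified in this handout, so the code below substitutes a stand-in: a gamma distribution whose shape and scale are chosen only so that its mean is 1/5 and its standard deviation is 4/15. That choice, the row count, and the seed are assumptions, and the stand-in should not be read as the answer to what the mystery distribution is.

```python
import math
import random
import statistics

MEAN, SD = 1 / 5, 4 / 15   # theoretical values given in the lab
# Stand-in distribution (an assumption): gamma with matching mean and SD.
SHAPE = (MEAN / SD) ** 2   # shape * scale = MEAN
SCALE = SD ** 2 / MEAN     # shape * scale**2 = SD**2

rng = random.Random(4)     # fixed seed (arbitrary) so the run is repeatable
rows = 1000                # assumed number of sample means per column
clt_summary = {}

for n in [1, 5, 10, 50, 100]:
    # Each row is the mean of one sample of size n from the stand-in.
    sample_means = [
        statistics.fmean([rng.gammavariate(SHAPE, SCALE) for _ in range(n)])
        for _ in range(rows)
    ]
    observed_mean = statistics.mean(sample_means)
    observed_sd = statistics.stdev(sample_means)
    se = SD / math.sqrt(n)  # theoretical standard error for sample size n
    clt_summary[n] = (observed_mean, observed_sd, se)
    print(f"n={n:3d} mean={observed_mean:.4f} sd={observed_sd:.4f} SD/sqrt(n)={se:.4f}")
```

Whatever distribution is substituted, the observed standard deviation of the sample means should shrink roughly in proportion to 1/sqrt(n), which is the pattern to look for in the JMP columns.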
 

Report:
Write up the steps. Plots are not necessary.

Due: Following Thursday lecture.