Pitman          MTH 135 / STA 104 Probability          Week 6

Read: Pitman sections 3.4-3.5

Guest Lecturers: Simon and Margaret!!!

Mean, Variance & Expectation

* Discrete Distributions

Imagine rolling a fair die as many times as needed to get an ace (a
one).  If X counts how many NON-aces come first, you'll find

      P[ X = k ] = p q^k,      k = 0,1,2,...

where p=1/6 is the chance of an ace, and q=1-p.  The chance we get an
ace in the first N rolls is the same as the chance that there are
FEWER THAN N non-aces before the first ace,

                                                p - p q^N
      P[ X < N ] = \sum_{k=0}^{N-1} p q^k  =  -------------  =  1 - q^N
                                                  1 - q

so, in the limit as N->oo, P[ X < oo ] = 1.

On average, how many non-aces come before the first ace?  In a huge
number N of trials, we should expect an average of about

DEFINITION
               0*N*P[X=0] + 1*N*P[X=1] + 2*N*P[X=2] +... + k*N*P[X=k] +...
      E[ X ] = -----------------------------------------------------------
                                           N

             = \sum k P[X=k]
             = \sum p k q^k
             = p * q * \sum k q^{k-1}
             = p * q * (d/dq) \sum q^k
             = p * q * (d/dq) (1-q)^{-1}
             = p * q * (1-q)^{-2}
             = q/p                          (since 1-q = p)

For the fair die this is q/p = (5/6)/(1/6) = 5 non-aces on average.

Note that the book counts TRIALS (T) up to and including the first
success, while we count FAILURES (X) before the first success;
obviously T = X+1, so the mean of T is

      E[ T ] = E[ X+1 ] = E[ X ] + 1 = q/p + p/p = (p+q)/p = 1/p

------------------------

Let's play a game: I put $1 on the table.  We roll the fair die until
it shows an Ace.  Every time it shows a NON-ace, I double the amount
of money on the table; when the first Ace shows, you get to keep the
money.  On average, how much money do you get in this game?  How much
would you have to pay to make this a "fair game" with no long-term
gain or loss for either of us?

Note that your winnings are just 2^X; then

      E[ 2^X ] = \sum (2^k) p q^k = \sum p * (2*q)^k = p/(1-2*q).

If the success probability p is, say, 5/6 then your expected winnings
are

      E[ 2^X ] = p/(1-2*q) = (5/6) / (1 - 2/6) = 5/4 = $1.25,

and we're even in the long run if you pay $1.25 each game.

What if we toss a fair coin?
Then

      E[ 2^X ] = p/(1-2*q) = (1/2) / (1 - 2/2) = (1/2) / 0...   uh oh!

And for p=1/6 things are worse,

      E[ 2^X ] = p/(1-2*q) = (1/6) / (1 - 2*5/6) = (1/6) / (-4/6) = -1/4,

so you LOSE $0.25 on average... yet in this game you ALWAYS win at
least $1.  If we do the infinite sum more carefully, you'll see that
for the fair coin we have

      E[ 2^X ] = 2^0 * 1/2 + 2^1 * 1/4 + 2^2 * 1/8 + 2^3 * 1/16 + 2^4 * 1/32 + ...
               = 1/2 + 1/2 + 1/2 + 1/2 + 1/2 + ...
               = oo

and similarly for any p <= 1/2, q >= 1/2: the geometric series
\sum (2*q)^k converges to 1/(1-2*q) only when 2*q < 1.

  --------
  WARNING:   E[ g(X) ] is NOT DEFINED unless \sum | g(x) | P[X=x] < oo
  ------------------------------------------------------------

* The Poisson Distribution

Sometimes we have a Binomial problem with a HUGE number "N" of trials
and a TINY probability p of success each time... and a moderate value
of lambda = N*p, the "expected number" of successes.  In that case we
have an interesting approximation:

   P[ X = k ] = (N:k) (lambda/N)^k (1 - lambda/N)^(N-k)

                N*(N-1)*...*(N+1-k)     lambda     lambda         lambda
              = ------------------- *  -------- * -------- ... * --------  * (1 - lambda/N)^N
                        k !            N-lambda   N-lambda       N-lambda

                 lambda^k
             ->  --------  e^{-lambda}        for k = 0,1,2,...
                    k !

The limit holds because, for fixed k, each of the k ratios
(N+1-j)/(N-lambda), j = 1,...,k, tends to 1 as N->oo, while
(1 - lambda/N)^N -> e^{-lambda}.

This distribution is called the "Poisson Distribution"; some places
where it is used (all "really" binomial with huge N, tiny p) include:

   * Deaths by horse-kick per regiment for Prussian soldiers in the 1800s;
   * Typographical errors per page for manuscripts;
   * Pot-holes per mile of highway during a winter;
   * Cases of any relatively-rare disease per city per year;
   * Atomic decays per minute from a fixed mass of radium.

Can you think of more?  The "Prussian soldiers" example is the classic
early application of the Poisson distribution to real data (really!).
It is the work of Ladislaus von Bortkiewicz, "Das Gesetz der kleinen
Zahlen" (1898), commissioned by the Russian Czar to see how grave the
problem was.  The distribution itself had appeared in print earlier,
in Poisson's own work of 1837.
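
The Binomial-to-Poisson limit above can be checked numerically.  The
following is only a rough sketch in Python, not part of the original
notes: the function names (binomial_pmf, poisson_pmf, max_gap) and the
choices lambda = 2 and k <= 10 are assumptions made for illustration.
It compares the two probability formulas term by term as N grows with
lambda = N*p held fixed.

```python
import math


def binomial_pmf(k, n, p):
    """P[X = k] for a Binomial(n, p) count of successes."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P[X = k] for a Poisson(lam) count: lam^k e^{-lam} / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def max_gap(n, lam, k_max=10):
    """Largest |Binomial - Poisson| probability over k = 0..k_max,
    with p = lam / n so that the expected count stays at lam."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    lam = 2.0  # illustrative value, not from the notes
    for n in (10, 100, 1000, 10000):
        print(f"N = {n:6d}   max gap = {max_gap(n, lam):.6f}")
```

Running it should show the gap shrinking as N increases, which is the
sense in which the Poisson formula approximates the Binomial one for
huge N and tiny p.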