STA113 Probability and Statistics in Engineering
Section 001, #2683
Spring 2005
Other exercises may be assigned. Quizzes will be based largely
on homework problems. Data sets from the book for homework are on the web for easy loading into Matlab.
Homework Problems:
Ch1: 19, 25, 33, 41, 43, 50, 59 (histp.m for probability histograms)
Ch12 (12.1, 12.2 only): 3 (get the estimates and correlation), 16abc
Getting tabular data into Matlab is a bit harder than with some other statistics
software. "tdfread" is pretty good, but for a space-separated file you need to
specify the delimiter: tdfread('data.txt', ' '). It also expects a line of column labels
at the top of the file.
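A minimal sketch of the call described above. The file name data.txt and the column labels x and y are made up for illustration; tdfread itself comes from the Statistics Toolbox.

```matlab
% Write a small space-separated file with a header line of column labels.
% (data.txt and the labels x, y are hypothetical.)
fid = fopen('data.txt', 'w');
fprintf(fid, 'x y\n');
fprintf(fid, '%g %g\n', [1 2 3; 2.1 3.9 6.2]);   % rows: 1 2.1 / 2 3.9 / 3 6.2
fclose(fid);

% Read it back. With an output argument, tdfread returns a struct
% with one field per column label.
s = tdfread('data.txt', ' ');
disp(mean(s.x))
disp(corrcoef(s.x, s.y))
```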
What is the probability of a pair in the pocket pre-flop?
What is the probability that four cards of one suit in the pocket and the flop complete
to a flush?
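One way to check hand calculations for these two questions is to enumerate the equally likely outcomes. This is only a sketch under my reading of the problem: Texas hold'em, exactly four cards of one suit among the five pocket and flop cards, and two more cards (turn and river) to come from the 47 unseen cards. The card numbering is arbitrary.

```matlab
% Pocket pair: enumerate all C(52,2) = 1326 two-card hands.
hands  = nchoosek(1:52, 2);            % cards numbered 1..52
ranks  = mod(hands - 1, 13);           % rank 0..12 of each card
p_pair = mean(ranks(:,1) == ranks(:,2));

% Flush draw: 47 unseen cards, 9 of them (numbered 1..9 here) in the suit.
draws   = nchoosek(1:47, 2);           % all possible turn/river pairs
p_flush = mean(any(draws <= 9, 2));    % at least one more card of the suit

fprintf('P(pocket pair)     = %.4f\n', p_pair);
fprintf('P(flush completes) = %.4f\n', p_flush);
```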
RAID (Redundant Array of Independent Disks, a way to combine hard disk drives
to increase reliability): Three 1 GB hard drives store 2 GB of data as follows: the first GB is on
disk 1, the second GB is on disk 2, and disk 3 stores the xor of the corresponding bits on the first two
drives (the xor is 1 if the two bits differ, 0 if they are the same, so if you know
two of the three values you can get the third). If disks fail and lose data independently
with probability p=.01, find the probability that the total RAID
configuration loses data. Compared with no replication (everything on
two disks), the RAID system is much more reliable. Compared with total
duplication, where everything is stored twice over four disks, the RAID system has
almost the same reliability but uses fewer disks.
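The three configurations can be compared numerically. This sketch assumes that the RAID set loses data exactly when two or more of its three disks fail, and that "total duplication" means two mirrored pairs, with data lost when both disks of some pair fail. The byte values in the xor demonstration are arbitrary.

```matlab
p = 0.01;                              % per-disk failure probability
p_raid   = 3*p^2*(1-p) + p^3;          % at least 2 of 3 disks fail
p_plain  = 1 - (1-p)^2;                % either of 2 unreplicated disks fails
p_mirror = 1 - (1 - p^2)^2;            % both disks of some mirrored pair fail
fprintf('RAID %.6f, no replication %.6f, duplication %.6f\n', ...
        p_raid, p_plain, p_mirror);

% Recovering a lost disk from the other two with xor.
d1 = uint8([12 200 7]);                % contents of disk 1 (arbitrary bytes)
d2 = uint8([99 3 255]);                % contents of disk 2
parity = bitxor(d1, d2);               % what disk 3 stores
d1_recovered = bitxor(parity, d2);     % rebuild disk 1 after it fails
```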
Unfortunately, the book defines the geometric r.v. as the number of failures
before the first success, rather than the more standard number of the trial on
which the first success occurs. This shifts the pmf one unit to the left,
so the mean in the book is the standard mean 1/p minus 1 (that is, 1/p - 1 = (1-p)/p).
I will not use the book's system!
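The two conventions can be compared numerically. In this sketch the value p = 0.2 and the truncation of the infinite sums at 200 terms are choices made only for illustration.

```matlab
p = 0.2;                               % illustrative success probability
k = 1:200;                             % trial number of the first success
pmf_trials   = p*(1-p).^(k-1);         % standard convention, support 1,2,...
j = k - 1;                             % failures before the first success
pmf_failures = p*(1-p).^j;             % book's convention, support 0,1,...

mean_trials   = sum(k.*pmf_trials);    % close to 1/p
mean_failures = sum(j.*pmf_failures);  % close to (1-p)/p
fprintf('means: %.4f (1/p = %.4f), %.4f ((1-p)/p = %.4f)\n', ...
        mean_trials, 1/p, mean_failures, (1-p)/p);

% P(first success within the first 5 trials), standard convention.
cdf5 = sum(pmf_trials(1:5));           % equals 1 - (1-p)^5
```

As far as I know, Matlab's own geopdf follows the failures-before-success convention, so geopdf(k-1,p) should reproduce pmf_trials above.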
Geometric Distribution problems:
What is the probability that the first 6 in repeated rolls of a die occurs somewhere
in rolls 1 through 5?
A very large box is filled with parts, and an unknown proportion p of them are defective.
Suppose we sample randomly from the box until a defective part appears, and
find, over thousands of these experiments, that on average
the first defective part appears on the
tenth draw. Estimate the unknown proportion p.
Let p be the probability that a packet arrives on an IP network in a discrete time interval
of 10^-6 seconds (which is the clock precision). Assume arrivals from one interval to
the next are independent.
If the mean time between packet arrivals is 550 microseconds (550 intervals of 10^-6 seconds), estimate p and then estimate
the expected number of packets over a period of 1 second. Finally, if each
packet carries 64 bytes, find the probability that the amount of traffic (number of
bits) in one second exceeds the capacity of 1 Mbit/s. (If the size of each packet is
random, then the problem is more complicated.)
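A rough numerical sketch of this problem, under assumptions that are mine rather than part of the problem statement: 1 Mbit is taken as 10^6 bits, the number of packets in one second is treated as binomial over 10^6 clock intervals, and the tail probability uses the normal approximation with a continuity correction.

```matlab
p  = 1/550;                            % mean gap of 550 intervals gives p = 1/550
n  = 1e6;                              % clock intervals in one second
mu = n*p;                              % expected packets per second
sd = sqrt(n*p*(1-p));                  % binomial standard deviation
max_packets = floor(1e6/(64*8));       % packets that fit in 10^6 bits

% Normal approximation with continuity correction for P(X > max_packets).
z = (max_packets + 0.5 - mu)/sd;
p_over = 0.5*erfc(z/sqrt(2));          % upper tail of the standard normal
fprintf('expected packets %.1f, P(over capacity) approx %.2g\n', mu, p_over);
```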
Monte Carlo integration: If U_1, U_2, ..., U_1000 are independent unif(0,1) random
variables, what integral does the average [exp(exp(U_1))+ ... + exp(exp(U_1000))]/1000
approximate? Can you do this integral by hand?
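By the law of large numbers, the average estimates E[exp(exp(U))], which is the integral of exp(exp(u)) over (0,1). A short sketch comparing the simulated average with numerical quadrature follows. It uses the integral function, which needs a reasonably recent Matlab; the variable names are mine.

```matlab
n = 1000;
u = rand(n, 1);                        % U_1, ..., U_n ~ unif(0,1)
mc_estimate = mean(exp(exp(u)));       % the average from the question

% Numerical quadrature of the same integrand over (0,1), for comparison.
quad_value = integral(@(t) exp(exp(t)), 0, 1);
fprintf('Monte Carlo %.3f, quadrature %.3f\n', mc_estimate, quad_value);
```

The two numbers should agree to within a few standard errors, and the agreement tightens as n grows.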
To do calculations with
the Weibull distribution in Matlab, such as for problem 66, note that the Matlab parametrization is different from the book's:
weibpdf(x,a,b) is the function f(x) = a b x^(b-1) e^(-a x^b) for x > 0, with mean a^(-1/b) Gamma(1+1/b).
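The density and mean formula above can be checked with base Matlab alone. In this sketch the parameter values are arbitrary. The conversion at the end ASSUMES the book writes the Weibull density with shape alpha and scale beta as (alpha/beta^alpha) x^(alpha-1) e^(-(x/beta)^alpha), so check that against your edition before relying on it.

```matlab
a = 0.5;  b = 2;                                  % illustrative values only
f = @(x) a*b*x.^(b-1).*exp(-a*x.^b);              % density in the Matlab form
total_mass   = integral(f, 0, Inf);               % should be close to 1
mean_numeric = integral(@(x) x.*f(x), 0, Inf);    % mean by quadrature
mean_formula = a^(-1/b)*gamma(1 + 1/b);           % mean from the formula
fprintf('mass %.4f, mean %.4f vs %.4f\n', total_mass, mean_numeric, mean_formula);

% Conversion from a shape/scale form (ASSUMED to be the book's).
alpha = 2;  beta = 3;                             % hypothetical book parameters
a_from_book = beta^(-alpha);                      % matches coefficients of x^b
b_from_book = alpha;
```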
82: try "qqplot" in Matlab, which essentially plots
the ordered data against the standard normal percentiles at values (1-.5)/10, (2-.5)/10, ..., (10-.5)/10 (for 10 data points).
Normal(mu, sigma) data will give a qqplot that is nearly a straight line with slope sigma and intercept mu.
To see this, try y=normrnd(2,3,100,1) to simulate N(2,3) data (mean 2, standard deviation 3) and apply qqplot to y.
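The construction behind the plot can be reproduced by hand with base Matlab functions. This is a sketch of the idea, not the toolbox's exact algorithm: randn stands in for normrnd, and the normal percentiles come from erfcinv.

```matlab
n  = 100;
y  = 2 + 3*randn(n, 1);                % same model as normrnd(2,3,100,1)
ys = sort(y);                          % ordered data
p  = ((1:n)' - 0.5)/n;                 % plotting positions (i-.5)/n
z  = -sqrt(2)*erfcinv(2*p);            % standard normal percentiles at p
coef = polyfit(z, ys, 1);              % coef(1) = slope, coef(2) = intercept
fprintf('slope %.2f (sigma = 3), intercept %.2f (mu = 2)\n', coef(1), coef(2));
% plot(z, ys, 'o') draws a picture much like qqplot(y).
```

The fitted slope and intercept vary from run to run but should land near 3 and 2.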
Matlab does not have the newer method for the binomial c.i., but binci.m does it. The old
method (7.11) is binciold.m.
Ch8: 1, 11abc, 21, 30, 32a, 35, 36a, 44, 46, 47
Ch9: 5, 28, 33, 41, 44 (sections 1-3 only)
Matlab just added the newer method, sometimes called Welch's approximation, for confidence intervals in a two-sample situation where the variances are not the same. This feature is in Version 7 (not Version 6), which for now you start with the command "matlab7". The .m file twosampleci.m also implements it.
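For reference, here is a minimal sketch of a Welch-type interval computed by hand. It is not the contents of twosampleci.m, which I am not reproducing here. The data are made up, and tinv comes from the Statistics Toolbox. Some textbooks round the degrees of freedom down to an integer; this sketch leaves them fractional.

```matlab
x = [1 2 3 4 5];  y = [2 4 6 8 10];    % made-up samples
nx = numel(x);  ny = numel(y);
vx = var(x)/nx;  vy = var(y)/ny;       % squared standard errors of the means
se = sqrt(vx + vy);                    % standard error of mean(x) - mean(y)
df = (vx + vy)^2 / (vx^2/(nx-1) + vy^2/(ny-1));   % Welch degrees of freedom
tcrit = tinv(0.975, df);               % t critical value for a 95% interval
ci = (mean(x) - mean(y)) + [-1 1]*tcrit*se;
fprintf('df = %.3f, 95%% CI = [%.3f, %.3f]\n', df, ci(1), ci(2));
```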