STAT501 (Sec.3) EXAM 2 SOLUTIONS November 6, 2003

1. a) What is the probability that production will be stopped during a given hour?

Ans.: P(not stopped) = P(8.535<Xbar<9.421) = P(-1.8 < Z < 1.63) = .9125, so

P(stopped) = 1 - P(not stopped) = .0875. The calculation of P(not stopped) is done by standardizing Xbar: (Xbar - μ)/(σ/√n), with μ = 9, σ = 2, and n = 60.
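As a check on the standardization above, here is a minimal Python sketch using only the standard library. The variable names are illustrative and not part of the exam; because it uses unrounded z-values, the result may differ from the table-based answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 9.0, 2.0, 60      # values given in the solution
lower, upper = 8.535, 9.421      # limits for Xbar from the problem

se = sigma / sqrt(n)             # standard error of Xbar
z_lo = (lower - mu) / se         # standardized lower limit
z_hi = (upper - mu) / se         # standardized upper limit

std_normal = NormalDist()        # standard normal: mean 0, sd 1
p_not_stopped = std_normal.cdf(z_hi) - std_normal.cdf(z_lo)
p_stopped = 1 - p_not_stopped

print(round(z_lo, 2), round(z_hi, 2))
print(round(p_not_stopped, 4), round(p_stopped, 4))
```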

b) What can be said about the normality of the population distribution of gross weights of 8-oz. boxes of Naturola Cornflakes? Ans: nothing. The population distribution is whatever it is, which may or may not be normal. When you take large samples, say for n = 60, it is the sampling distribution of Xbar that is (approximately) normal.

c) The quality control engineer at the Naturola Company keeps a record of all the averages from the hourly samples for a whole year. Sketch the histogram of these values. What distribution does this histogram correspond to? (Amazingly, you don't need the data to do this.) Ans.: The histogram is a bell-shaped curve centered at μ = 9 and with "spread" σ/√n (I can’t draw it on the computer). It corresponds to the normal distribution with mean μ = 9 and variance σ²/n (with the above values for μ and σ).

2. In a public opinion poll of 400 adults in Amherst just before the 2000 presidential election, 208 people thought the Electoral College was one of the Five Colleges. (Is it?)

Answer: no.

a) Give a 90% CI for the proportion of adults in Amherst that thought that the Electoral College is one of the Five Colleges. Ans.: Here phat = 208/400 = .52, z.05 = 1.645, and phat * qhat/n = .52*.48/400, of which the square root is .0249. Thus the CI for p is P.E. +/- critical value * S.E., which in this case is:

.52 +/- 1.645*.0249 = .52 +/- .041

b) What is the margin of error for this estimate? .041.
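The interval and margin of error in parts (a) and (b) can be reproduced with a short Python sketch. The names are illustrative, and the critical value is computed from the standard normal rather than read from a table, so it is only approximately 1.645.

```python
from math import sqrt
from statistics import NormalDist

x, n = 208, 400                  # counts given in the problem
conf = 0.90                      # confidence level

p_hat = x / n                                    # point estimate
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)     # upper .05 point, about 1.645
se = sqrt(p_hat * (1 - p_hat) / n)               # standard error of phat
margin = z * se                                  # margin of error
ci = (p_hat - margin, p_hat + margin)

print(round(p_hat, 2), round(margin, 3))
print(tuple(round(v, 3) for v in ci))
```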

c) Would a 95% CI give a larger or a smaller margin of error? (Do not compute the 95% CI.) Ans.: larger, since you would use the (larger) z-value z.025 = 1.96.

d) How large a sample would be needed to have a margin of error of .01 at the 95% confidence level?

Ans.: n = (1.96/.01)² pq. For n_conservative use p = q = ½; for n_liberal use p = .52, q = .48. The results are: n_conservative = 9604, n_liberal = 9589.
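The two sample sizes can be checked with the sketch below. The function name sample_size is hypothetical. Exact rational arithmetic is used (inputs passed as strings) so that rounding up is not thrown off by floating-point error when the result is a whole number.

```python
from fractions import Fraction
from math import ceil


def sample_size(margin: str, p: str, z: str = "1.96") -> int:
    """Smallest n satisfying z * sqrt(p*q/n) <= margin."""
    m, p_frac, z_frac = Fraction(margin), Fraction(p), Fraction(z)
    q_frac = 1 - p_frac
    # n = (z / margin)^2 * p * q, rounded up to a whole number
    return ceil((z_frac / m) ** 2 * p_frac * q_frac)


n_conservative = sample_size("0.01", "0.5")    # p = q = 1/2
n_liberal = sample_size("0.01", "0.52")        # p = phat from part (a)
print(n_conservative, n_liberal)
```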

3. a) Explain whether this is an experiment involving two independent samples or paired samples. Ans.: paired. The experimental "units" are the 25 locations and each one has an "x"-measurement (the ground-based temp.) and a "y"-measurement (the satellite-based temp.) made on it. These x- and y-measurements are not independent. Moreover, the analysis is given entirely in terms of the 25 differences di = xi – yi, which is characteristic of a paired experiment.

b) Show how to compute a 90% confidence interval for μ1 - μ2, where μ1 corresponds to ground-based readings and μ2 to satellite. Set up the computation but do not carry out the arithmetic. (These z- and t-values may be helpful: z.20 = .84, z.10 = 1.28, z.05 = 1.645, z.025 = 1.96, t24;.1 = 1.318, t24;.05 = 1.711, t24;.025 = 2.064, t25;.1 = 1.316, t25;.05 = 1.708, t25;.025 = 2.06.) Ans.: dbar +/- t24;.05 * sd/√n = .25 +/- 1.711*(.82/5).
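Although the exam asks only for the setup, the arithmetic can be carried out with a short Python sketch. It takes dbar, sd, n, and the tabled t-value directly from the answer above; the variable names are illustrative.

```python
from math import sqrt

d_bar = 0.25        # mean of the 25 differences (from the answer above)
s_d = 0.82          # standard deviation of the differences
n = 25              # number of locations (pairs)
t_crit = 1.711      # t24;.05, taken from the table in the problem

margin = t_crit * s_d / sqrt(n)          # sqrt(25) = 5
ci = (d_bar - margin, d_bar + margin)

print(round(margin, 4))
print(tuple(round(v, 4) for v in ci))
```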

c) What assumptions are necessary for the validity of the CI in part (b)? Ans.: The pairs (xi,yi) (or the differences di) have to constitute a random sample from the population of all such pairs (or differences) of measurements, and the population of differences must have a normal distribution.

4. a) A 95% CI for the difference in average breaking strengths for type 1 and type 2 yarns is given by 150 ± 33.32. Explain, either in words or by giving a formula, how this CI was computed. (Do not compute it again.) Ans.: xbar - ybar +/- 1.96 (s1²/n1 + s2²/n2)^(1/2), where

xbar = 1400, ybar = 1250, s1 = 120, and s2 = 80.
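A minimal Python sketch of this large-sample formula follows. The sample sizes are not listed above; the sketch assumes n1 = 61 and n2 = 121 (the sizes quoted in part (c), with that assignment to the two yarn types being an assumption), which reproduces the stated margin of 33.32.

```python
from math import sqrt

xbar, ybar = 1400.0, 1250.0      # sample means
s1, s2 = 120.0, 80.0             # sample standard deviations
n1, n2 = 61, 121                 # assumption: sizes for type 1 and type 2
z = 1.96                         # z.025 for a 95% CI

diff = xbar - ybar                        # point estimate
se = sqrt(s1**2 / n1 + s2**2 / n2)        # standard error of the difference
margin = z * se
ci = (diff - margin, diff + margin)

print(f"{diff:.0f} +/- {margin:.2f}")
```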

b) If the actual value of μ1 - μ2 were in the CI, within what margin of error would it be estimated? Is xbar - ybar actually contained in the CI? Explain. Ans.: M.E. = 33.32.

xbar - ybar is always contained in the CI for μ1 - μ2: it is at the center of the interval.

c) What conditions are needed for the validity of the CI in part (a)? Are they fulfilled?

Ans.: Two independent random samples and both sample sizes large: here the sample sizes are 61 and 121, which are "large" (>= 30) according to our rule of thumb, but we don’t know whether the samples are independent of each other or whether they are random samples.

5. An article in last year’s New England Journal of Medicine … incidentally, this was a real study.

a) Give a 95% CI for pE - pC, where pE and pC denote the probabilities of death due to heart attack in each of the two corresponding populations. Ans.:

pEhat - pChat +/- 1.96 (pEhat*qEhat/nE + pChat*qChat/nC)^(1/2), with pEhat = 144/8414 = .017 and pChat = 176/7557 = .023. The resulting CI is -.006 +/- .004, or (-.010, -.002).
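The two-proportion interval can be checked with the Python sketch below, using the counts quoted in the answer. The variable names are illustrative. The sketch works with unrounded proportions, so its endpoints agree with the answer above only to three decimal places.

```python
from math import sqrt

deaths_e, n_e = 144, 8414        # E group: deaths and group size
deaths_c, n_c = 176, 7557        # C group: deaths and group size
z = 1.96                         # z.025 for a 95% CI

p_e = deaths_e / n_e
p_c = deaths_c / n_c
diff = p_e - p_c                                           # point estimate
se = sqrt(p_e * (1 - p_e) / n_e + p_c * (1 - p_c) / n_c)   # standard error
margin = z * se
ci = (diff - margin, diff + margin)

print(round(p_e, 3), round(p_c, 3))
print(round(diff, 3), round(margin, 3))
```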

b) Explain whether the conditions needed for the validity of the method used in part (a) are met. Ans.: the conditions are: two large, independent, random samples. The samples are large (check nphat, nqhat > 5 for each of the groups), but we don’t know whether they are independent or random samples. Moreover, they are samples of male physicians, so the conclusions of this study apply, strictly speaking, only to male physicians. Whether the results apply to men in general and to women remains an open question.

c) Based on the CI in part (a), is there evidence that regular exercise reduces the probability of death by heart attack during or shortly after exertion? Discuss briefly. Ans.: the CI lies entirely on the negative part of the real line, so we can be 95% confident that

pE - pC < 0, i.e., that pE < pC. Thus exercise does seem to reduce the prob. of death.

d) An editorial in a popular magazine made the following statement: "A study involving nearly 16,000 men showed that regular exercise cuts the risk of sudden death due to heart attack brought on by vigorous exertion only by about .6%. So don't worry about it – exercise, if you like, otherwise, forget it." Discuss briefly whether this is (i) accurate and (ii) good advice, based on the information given above. Ans.: (i) The P.E. for the reduction in risk of death is -.006 or -.6%, and there were just under 16000 men (although all physicians) in the study, so the first statement is accurate. (ii) Depends on your point of view: the risk in the C-group is estimated to be .023 and in the E-group .017, so you are reducing from a fairly small number to a slightly smaller number. From a personal point of view, there’s not all that much difference. But for people who like to hang on to life, any improvement is considered to be good, so for them the advice is not good. From a public health point of view, the advice is definitely not good. If the results apply to all males, or to both males and females, on the order of 200,000,000 people in the U.S. (just a very crude estimate of the number of adults), then .6% would mean 1,200,000 fewer deaths.