4/20/2008

12 Answers
Mix and Match
1. f
2. i
3. a
4. b
5. h
6. c
7. e  A googol is 10^100, and P(Z < -20) is approximately 10^-90.
8. g  This is the symmetry property of the normal distribution.
9. j
10. d

True/False
11. False. The probability of the region between 38 and 44 corresponds to P(0 < Z < 1) ≈ 1/3, whereas the region above 44 corresponds to P(Z > 1) ≈ 1/6.
12. False. The normal distribution does not end at 2 or 3 SDs below the mean. There could be a very young employee.
13. True. Converting to days amounts to multiplying X, say, by 365. The mean and SD change, but the distribution remains normal.
14. True. This is a shift of the normal distribution, from X to X + 1.
15. True. If X ~ N(38,6), then P(X < 30) ≈ 0.0912. Times 400 gives the expected count.
16. False. It would be highly unlikely to be normal. Much more likely, the distribution would be skewed in the fashion observed with mixing diamonds of varying quality.
17. True. As noted in the discussion of the CLT, sums of normal random variables are normal.
18. True. Changing the sign of a normal random variable leaves it normal. Both X and -X are normal. Hence the difference is also a sum of normal random variables and hence normal.
19. False. The statement is true on average, but not specifically for tomorrow and the day after. Both random variables have the same distribution, but that does not mean that they are the same. These are like two tosses of the same coin: the chances are the same, but the outcomes need not match.
20. True. Changing the scale alters the mean and SD, but not the shape of the normal distribution.
21. True. P(X2 > X1) = P(X2 - X1 > 0) = 1/2 since the difference is normal with mean zero.
22. True

Think About It
23. The normal distribution puts no limits on the size of the possible values. Even a standard normal with mean zero and variance 1 can be arbitrarily large, though with tiny probability.
24. A normal model assigns some probability everywhere, regardless of the bounds that apply in practice. If the mean of the data is near one of the boundaries, the normal model will be a poor match.
25. Both are normal with mean 2μ, but the variance of the sum is 2σ² rather than 4σ².
    E(2 X1) = 2 E(X1) = 2μ and E(X1 + X2) = E(X1) + E(X2) = μ + μ = 2μ
    whereas
    Var(2 X1) = 4 Var(X1) = 4σ² and Var(X1 + X2) = Var(X1) + Var(X2) = 2σ²
    Think of 2 X1 = X1 + X1 as the sum of two perfectly dependent normal random variables, whereas X1 + X2 is the sum of two independent normal random variables.
26. Standard normal, with mean zero and variance 1. For the mean,
    E((X1 - X2)/√2) = (E(X1) - E(X2))/√2 = 0
    and for the variance
    Var((X1 - X2)/√2) = Var(X1 - X2)/(√2)² = (Var(X1) + Var(X2))/(√2)² = 1
27. A Skewed  B Outliers  C Normal  D Bimodal
28. a C (nearly normal)  b D (outliers)  c A (skewed)  d B (multimodal)
29. a. A is the original data and B is with rounding. The rounding shows up as small gaps, stairsteps in the quantile plot.
    b. The histograms are so similar because the amount of rounding is small relative to the size of the bins.
30. a. The anomaly is due to rounding. During this period, stocks were priced in 1/8s of a dollar, so on some days the closing price of the stock would be the same as the prior day, and the return fixed at zero. This phenomenon is less common now that stocks are priced in 1/100s of a dollar.

    b. The effect does not cause, for example, a high peak in the center of the histogram because presumably there's just rounding. The rounding is on a fine scale relative to the size of the intervals that define the bars in the histogram.
31. (a) Yes, so long as the weather was fairly consistent during the time period. If, on the other hand, a strong heat wave caused temperatures to soar, then these would create the sort of outliers that the normal model would not accommodate.
    (b) No, this would probably not be appropriate because the weather is a dependent process. The use on a given day is most likely highly dependent on the use on the previous and next day. The amounts might be normally distributed, but not independent.
32. (a) Set the mean to the center of this range and set the SD so that the indicated range is 4σ wide. That gives μ = 8 and σ = 2.
    (b) It should order an equal amount for both; the normal model is symmetric around the mean.
    (c) No. The normal model implies that someone might wear size 6.243. That might be so, but the sizes are discrete. Shoes only come in integer sizes (or perhaps half sizes). Nonetheless, the normal model could be a very close approximation; note how well the normal model approximates the binomial model shown in Figure 12.5.
33. a. $300 is 1 SD below the mean, so we'd expect about 1/6 to earn less than $300, or about 5/6 to earn more. (If you used a table, you'd get 84.1%.)
    b. It shifts the mean by 100, from $700 to $800.
    c. The mean increases by 5% from $700 to $735. The SD also increases by 5%, from $400 to $420.
    d. No. It appears that the distribution is skewed to the right (which is what you would expect for the distribution of salaries). The mean is substantially larger than the median and there's relatively little data to the left of the mean (much less than indicated by the calculation in part a).
34. a. That's 2 SDs below the mean, so only about 2.5% weigh this little.
    b. If the weights of the steaks in the order are independent, then use a normal model with mean 5 × 1.2 = 6 pounds and SD = sqrt(5 × 0.1²) ≈ 0.22. Remember, variances add; standard deviations do not.
    c. Normal with mean 1.2 × 16 = 19.2 ounces and SD = 0.1 × 16 = 1.6 ounces.
    d. Yes, these are almost exactly where they should be under normality. For example, the lower quartile of the normal model for the steaks is about 2/3 of a SD below the mean, in this case at 1.2 - 0.67 × 0.1 = 1.133. Unless you have a very large population (so that the quantile plot would be very accurate), the normal model appears reasonable.
35. (a) Because the homes in the development are generally rather similar but for minor differences, a normal model is reasonable. The price of each is the overall average plus various factors that increase and decrease the value of each.
    (b) Sales data from housing projects recently built by this contractor with similar characteristics. If those are not available, then sales data from homes with similar types of construction in the area.
    (c) Normal with mean $450,000 and SD $50,000. The range μ ± σ = [$400,000 to $500,000] would then be expected to hold 2/3 of the distribution of prices.
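The "variances add" rule used in answers 25 and 34b can be checked with a short sketch. It assumes Python's standard-library statistics.NormalDist, whose + operator treats its two operands as independent normal random variables; the values mu = 10 and sigma = 3 for answer 25 are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Answer 25: mu = 10 and sigma = 3 are hypothetical illustration values.
x1 = NormalDist(mu=10, sigma=3)
x2 = NormalDist(mu=10, sigma=3)

doubled = x1 * 2   # 2*X1: the SD doubles, so the variance is 4*sigma^2
summed = x1 + x2   # X1 + X2 as independent variables: the variance is 2*sigma^2

print("2*X1   :", doubled.mean, doubled.variance)
print("X1 + X2:", summed.mean, summed.variance)

# Answer 34b: total weight of five independent steaks, each N(1.2, 0.1) pounds.
steak = NormalDist(mu=1.2, sigma=0.1)
order = steak + steak + steak + steak + steak  # each + assumes independence

print("order  :", round(order.mean, 2), round(order.stdev, 2))
```

Both 2*X1 and X1 + X2 have the same mean, but only the independent sum has the smaller variance; the order of five steaks has SD sqrt(5) × 0.1 rather than 5 × 0.1.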


36. (a) Because the CPA specializes in similar businesses with comparable sales, we expect the adjustments to be of similar size. If each adjustment is the sum of many small corrections, then a normal model would be well-suited.
    (b) Data for adjustments from prior years for this CPA and others who handle comparable types of businesses.
    (c) About σ = $3,500. This choice would mean that all but 2.5% (roughly) would save something since the mean of the normal model is 2 SDs less than zero.
    (d) Probably not. These firms would likely be very different in size and nature of the adjustment. Some might be considerably larger than others. The distribution of adjustments for one CPA might be normal, but we should expect to see more outliers and perhaps skewness in the larger collection (as in the diamond example of the chapter).

You Do It
37. From the table excerpts shown below
    a) P(Z < 1.5) = 0.93319
    b) P(Z > -1) = 0.8413
    c) P(|Z| < 1.2) = 0.7699
    d) P(|Z| > 0.5) = 0.6171
    e) P(-1 ≤ Z ≤ 1.5) = 0.93319 - 0.1587 = 0.77449
z                0.5      1        1.2      1.5
P(Z ≤ -z)        0.3085   0.1587   0.1151   0.06681
P(Z ≤ z)         0.6915   0.8413   0.8849   0.93319
P(|Z| > z)       0.6171   0.3173   0.2301   0.1336
P(-z ≤ Z ≤ z)    0.3829   0.6827   0.7699   0.8664
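If software is preferred to the printed table, the four rows of such an excerpt can be reproduced with a few lines. This is a minimal sketch assuming Python's standard-library statistics.NormalDist; the function name table_row is a hypothetical helper, not something from the text.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

def table_row(z):
    """Return the four probabilities tabulated for a given z >= 0."""
    lower = Z.cdf(-z)   # P(Z <= -z)
    upper = Z.cdf(z)    # P(Z <= z)
    return {
        "P(Z <= -z)": lower,
        "P(Z <= z)": upper,
        "P(|Z| > z)": 2 * lower,             # both tails, by symmetry
        "P(-z <= Z <= z)": upper - lower,    # central region
    }

for z in (0.5, 1.0, 1.2, 1.5):
    print(z, {name: round(p, 4) for name, p in table_row(z).items()})

# Answer 37e: an interval that is not symmetric about zero.
p_37e = Z.cdf(1.5) - Z.cdf(-1.0)
print("P(-1 <= Z <= 1.5) =", round(p_37e, 4))
```

The last line differs from 0.77449 only in the fourth decimal place because the table entries are rounded before subtracting.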

38. From the table excerpts shown below
    a) P(Z ≥ 0.3) = 0.3821
    b) P(Z < -2.2) = 0.01390
    c) P(-0.7 ≤ Z < 0.7) = 0.5161
    d) P(|Z| > 1.5) = 0.1336
    e) P(0.3 ≤ Z ≤ 2.2) = 0.98610 - 0.6179 = 0.3682

z                0.3      0.7      1.5       2.2
P(Z ≤ -z)        0.3821   0.2420   0.06681   0.01390
P(Z ≤ z)         0.6179   0.7580   0.93319   0.98610
P(|Z| > z)       0.7642   0.4839   0.1336    0.02781
P(-z ≤ Z ≤ z)    0.2358   0.5161   0.8664    0.97219

39. These values come from the table excerpted below
    a) P(Z > 1.2816) = 0.10
    b) P(Z ≤ 0) = 0.50
    c) P(-0.6745 ≤ Z ≤ 0.6745) = 0.50
    d) P(|Z| > 2.5758) = 0.01
    e) P(|Z| < 1.6449) = 0.90
z                0      0.6745   1.2816   1.6449   2.5758
P(Z ≤ -z)        0.5    0.25     0.10     0.05     0.005
P(Z ≤ z)         0.5    0.75     0.90     0.95     0.995
P(|Z| > z)       1      0.5      0.20     0.1      0.01
P(-z ≤ Z ≤ z)    0      0.5      0.80     0.9      0.99

40. Using the table (on the right side of the inside cover) or software, find the value of z that makes the following probabilities true. Again, you might find it helpful to draw a picture to check your answers.
    a) P(Z < -0.6745) = 0.25
    b) P(Z ≥ -0.2533) = 0.60
    c) P(-0.3853 ≤ Z ≤ 0.3853) = 0.30
    d) P(|Z| > 2.8070) = 0.005
    e) P(|Z| < 2.5758) = 0.99

z                0.2533   0.3853   0.6745   2.5758   2.8070
P(Z ≤ -z)        0.4      0.35     0.25     0.005    0.0025
P(Z ≤ z)         0.6      0.65     0.75     0.995    0.9975
P(|Z| > z)       0.8      0.7      0.5      0.01     0.005
P(-z ≤ Z ≤ z)    0.2      0.3      0.5      0.99     0.995
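Going from a probability back to z, as in answers 39 and 40, is an inverse-CDF lookup. The sketch below assumes Python's standard-library statistics.NormalDist and its inv_cdf method; the helper names two_sided_cutoff and central_cutoff are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def two_sided_cutoff(tail_prob):
    """z such that P(|Z| > z) = tail_prob (half of the probability in each tail)."""
    return Z.inv_cdf(1 - tail_prob / 2)

def central_cutoff(central_prob):
    """z such that P(-z <= Z <= z) = central_prob."""
    return Z.inv_cdf(0.5 + central_prob / 2)

z_39d = two_sided_cutoff(0.01)    # cutoff with 1% outside
z_39e = central_cutoff(0.90)      # cutoff with 90% inside
z_40a = Z.inv_cdf(0.25)           # lower quartile, a negative value
z_40c = central_cutoff(0.30)
z_40d = two_sided_cutoff(0.005)

for name, z in [("39d", z_39d), ("39e", z_39e), ("40a", z_40a),
                ("40c", z_40c), ("40d", z_40d)]:
    print(name, round(z, 4))
```

Drawing the picture first still helps: it tells you whether the stated probability sits in one tail, both tails, or the center, which determines the argument passed to inv_cdf.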

41. a) P(Z < -2.0537) = 0.02, so the worst case percentage change is 0.08 - 2.0537(0.2) = -0.33074. The investment could fall 33% in value, meaning that the value at risk is $33,000.
    b) For the value at risk to be reduced to $20,000, the worst case (2%) percentage change must be equivalent to a loss of 20%. Solving μ - 2.0537(0.2) = -0.2 for μ implies that the growth would have to be approximately 21%.
    c) No, the value at risk does not add up over time this way because standard deviations don't add up either. Consider the conditions for b. If μ = 0.21 with σ = 0.2, this investment puts $20,000 at risk annually (at 2%). Ignoring compounding, for two years, the expected growth would be 42%. If the results are uncorrelated between years, the SD of the sum of the returns is sqrt(0.2² + 0.2²) = sqrt(0.08) ≈ 0.28. The two-year value at risk is then
       100,000 × (0.42 - 2.0537 × 0.28) ≈ 100,000 × (-0.155) = -$15,500
       The value at risk is smaller over the longer horizon (primarily because the mean return is so large relative to the SD).
42. a) P(Z < -2.3263) = 0.01, so the worst case percentage change is 0.10 - 2.3263(0.35) = -0.7142, or a loss of about 71% in value. The value at risk is 0.7142($500,000) ≈ $357,000.
    b) For the value at risk to be $200,000, the worst case percentage change must be -0.40. Solving 0.10 - 2.3263σ = -0.40 for σ implies that the SD of the investment must drop to about 21.5%.
    c) For a two-year holding period, μ = 0.20 and σ = sqrt(2 × 0.35²) ≈ 0.495. The value at risk is then $500,000 × (0.20 - 2.3263 × 0.495) ≈ -$475,760. The value at risk is larger than in a because of the relatively small mean in comparison to the SD.
43. a) It loses on one policy with probability 0.025, the chance for a driver to have an accident.
    b) The insurer takes in $2.5 million. The payout is the sum of 1,000 independent random variables, so we can model the total payout as a normal random variable. Let Bi denote a Bernoulli random variable, with value 0 if the ith driver has no accident and 1 otherwise. Then Xi = 65,000 Bi is the payout on the ith policy.
       The expected value of the sum of the Bs (a binomial random variable) is 1000 × 0.025 = 25; we expect 25 accidents among these 1000 drivers. The variance of the sum is 1000(0.025)(0.975) = 24.375 with SD ≈ 4.94. Hence the total payout T has mean 25($65,000) = $1,625,000 with SD 4.94($65,000) = $321,100. Using the Central Limit Theorem, the chance for the insurer losing money in the aggregate is much smaller,

       P(T > 2,500,000) = P(Z > (2,500,000 - 1,625,000)/321,000 ≈ 2.73) ≈ 0.003.
    c) Yes, to the extent that the company can be profitable by writing many policies whereas it would not if it only sold a few.
44. W ~ N(26,5), measuring the wear of a set in thousands of miles
    a) P(W < 20) = P(Z < (20 - 26)/5 = -1.2) = 0.1151
    b) If a claim is made, the manufacturer loses money because the cost of replacing the set is larger than the earned profit. Hence, the chance for a profit is 1 - 0.1151 = 0.8849.
    c) The profit from selling 500 sets is 500(200) = $100,000. Let the random variable T denote the total number of warranty claims. T is binomial with mean 500(0.1151) = 57.55 and SD = sqrt(500(0.1151)(0.8849)) ≈ 7.14. The cost from these claims is 400 T, which has mean 400(57.55) = $23,020 with SD 400(7.14) = $2,856. The probability of a profit is thus P(400 T < 100,000) = P(Z < (100,000 - 23,020)/2,856 ≈ 27) ≈ 1. It's a lock to profit.
45. a) 5%, and in this case it loses a lot!
    b) 5%. Either all of the bonds pay or they do not. These are not independent contracts. They all pay at the same time.
    c) The life insurance firm has independent customers. They don't all die at once. The hurricane bonds all pay or fail together. These bonds are much riskier than insurance.
46. hedge_funds
    (a) The histogram and boxplot (shown in the following figure) indicate a bell-shaped distribution (unimodal, symmetric, falling off toward the tails), but both show several very extreme outliers.
    (b) There are numerous outliers. The lowest return (most negative) is -16.26% (reported by Trendoscil). The highest return (most positive) is 20.7% (Emerging Income Fund).
    (c) Set μ to the mean return (x̄ = 0.01755) and set σ to the SD (s = 0.02994).
    (d) The normal model is not a good match. The tails of the distribution extend farther out from the mean than a normal model would predict. The normal quantile plot shows that the lows are much too low and the highs are much too high for the normal model.
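The value-at-risk arithmetic in answers 41 and 42 and the aggregate calculation in answer 43b follow the same pattern, so they can be checked together. This is a sketch, not a definitive implementation: it assumes Python's standard-library statistics.NormalDist, the function name value_at_risk is hypothetical, and the $100,000 investment in answer 41 is inferred from the $33,000 figure given there.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal

def value_at_risk(amount, mu, sigma, level):
    """Dollar change at the lower-tail `level` quantile of a N(mu, sigma) return.

    A negative result is a loss. `level` is the tail probability, e.g. 0.02.
    """
    worst_return = mu + Z.inv_cdf(level) * sigma
    return amount * worst_return

var_41a = value_at_risk(100_000, 0.08, 0.20, 0.02)
var_42a = value_at_risk(500_000, 0.10, 0.35, 0.01)
# Two-year horizon in 42c: means add, and variances (not SDs) add.
var_42c = value_at_risk(500_000, 2 * 0.10, sqrt(2) * 0.35, 0.01)

# Answer 43b: total payout on 1,000 independent policies, normal approximation.
n, p, claim, premiums = 1000, 0.025, 65_000, 2_500_000
mean_payout = n * p * claim
sd_payout = claim * sqrt(n * p * (1 - p))
p_loss = 1 - NormalDist(mean_payout, sd_payout).cdf(premiums)

print(round(var_41a), round(var_42a), round(var_42c))
print(round(mean_payout), round(sd_payout), round(p_loss, 4))
```

Small differences from the hand calculations come from rounding z, the SD, or both before multiplying.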


47. (a) The histogram and boxplot (shown below) look like a reasonable match for a normal model. The histogram is roughly bell-shaped and the boxplot does not flag many outliers. The histogram does, however, show some skewness and a sharp (rather than gradual) cutoff near 30%.
    (b) Some slight outliers. The lowest share is 24.9% (in Jersey City, NJ); the highest is 43.7% (in Buffalo, NY).
    (c) The shares of this product are never very close to zero or 100%, so the data do not run into the boundary at the upper and lower limits. The boundaries do not affect the distribution.
    (d) Set the mean μ = 33.63% (x̄) and σ = 3.34 (s).
    (e) The normal quantile plot suggests an acceptable match, but for the kink in the quantile plot near 30%. This cluster of values is unusual for data that are normally distributed. Otherwise, the normal model describes the distribution of shares nicely, matching well in the tails of the distribution.


4M Normality of Stock Returns

Motivation
a) Normality allows us to summarize the returns using a mean and standard deviation. These combined with the Empirical Rule (as refined by tables of the normal distribution) provide a complete description of the performance of the investment.
b) One can use either returns or percentage changes. The price itself does not pass the visual test for simplicity; it has a strong pattern and should not be summarized in a histogram.

Method
c) Simplicity of the variation. These data form a time series, so we need to check that independence and stability (no patterns) are reasonable assumptions.
d) Normal quantile plot.

Mechanics
e) Features related to the timing of events are lost in a histogram. The time series of percentage changes does not show patterns, so little is lost in this application by summarizing the data in a histogram.
[Figure: time series plot of Pct Change versus Date]

f) Except for the most extreme values, this normal quantile plot shows that the normal model is a very good description.
[Figure: Normal Quantile Plot]

g) The table shown below summarizes the mean and SD of the returns during these months. Using these values to define a normal model (X is normal with μ = 1.4339 and σ = 6.7381), the probability of a month in which this stock returns 10% or more is P(X > 10) = P(Z > (10 - μ)/σ) = P(Z > (10 - 1.4339)/6.7381 ≈ 1.2713) = 0.1018 by using software. The table at the front of the text will get you quite close. (This z-score is close to the value in the text example.)


Moments
Mean       1.4339
Std Dev    6.7381
N          312
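The software calculation mentioned in part g can be sketched as follows, assuming Python's standard-library statistics.NormalDist and the mean and SD from the moments table (returns measured in percent).

```python
from statistics import NormalDist

# Normal model for monthly percentage changes, parameters from the moments table.
returns = NormalDist(mu=1.4339, sigma=6.7381)

z = (10 - returns.mean) / returns.stdev   # z-score of a 10% monthly return
p_model = 1 - returns.cdf(10)             # P(X > 10) under the normal model

print("z =", round(z, 4))
print("P(X > 10) =", round(p_model, 4))
```

Working on the original scale with cdf(10) or on the z scale with the standard normal gives the same probability; the z-score is shown only because the printed table is indexed by z.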

h) The price increases by 10% or more in 31 out of 312 months, working out to 9.94%, which is almost identical to the value produced by the normal model. (You can anticipate this similarity from the normal quantile plot; the data track the reference line closely.)

Message
i) Monthly returns on stock in McDonald's are approximately normally distributed with mean 1.4% and SD 6.7% during the period 1980 through 2005. A normal model thus gives accurate estimates of the probability of events during this period. The use of a normal model to anticipate future events, however, requires that we assume that the variation seen in the past will continue into the future.


4M Normality and Transformation

Motivation
a) Normality is a familiar model for which we have diagnostic plots. It also makes it easy to summarize the data with a mean and a SD (on the log scale, of course).
b) Find the z-score for the log of 20,000 using the parameters set from the logs of these data. Then convert this z-score to a probability using the methods of this chapter.

Method
c) If clustered, the incomes within the cluster may be more similar than those over the whole area. This would mean that the SD of these data would be much smaller than for San Antonio as a whole. The average income might then be much too high or much too low, with a small SD.
d) Take logs and look at the normal quantile plot.

Mechanics
e) A normal model is not a good description of the household incomes. The data are severely right skewed.
[Figure: distribution of household incomes, axis from 0 to 350,000]

f) A Normal model works better for the log10 of household incomes, but does not match the lower tail well. Incomes get lower than the lognormal model predicts.
[Figure: Normal Quantile Plot]

g) Using the lognormal model with parameters set to match this sample, find the probability of finding a household with income less than $20,000.
Mean       4.6434359
Std Dev    0.3575534
N          257
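The method described in part b (z-score of the log of 20,000, then a normal probability) can be sketched as follows. It assumes Python's standard-library statistics.NormalDist, base-10 logs as in part f, and the mean and SD from the table above.

```python
from math import log10
from statistics import NormalDist

# Normal model for log10(household income), parameters from the summary table.
log_income = NormalDist(mu=4.6434359, sigma=0.3575534)

cutoff = log10(20_000)                                  # income cutoff on the log scale
z = (cutoff - log_income.mean) / log_income.stdev       # z-score of the cutoff
p_below = log_income.cdf(cutoff)                        # P(income < $20,000) under the lognormal model

print("log10(20000) =", round(cutoff, 4))
print("z =", round(z, 4))
print("P(income < 20,000) =", round(p_below, 3))
```

The probability is computed entirely on the log scale; no back-transformation is needed because the event income < 20,000 is the same event as log10(income) < log10(20,000).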


h) The lognormal is a good match to the distribution for incomes of this size and larger. Were we farther into the lower tail, the lognormal would not be a good description of the variation for the very poor. Directly from the data, there are 40/258 = 15.5% of incomes at $20,000 and below. Message i)
