
Answer each question in its entirety.

Car Wash

Q1 (25 Points). You're the manager of a car wash business and you want to know how many cars you wash on a daily basis. You collect data over a 75 day period.

Q-1a (5 Points) Create a graph of the

Q-2a (5 Points) Calculate the average number of cars per day and describe how outliers affect the value of the mean.

Q-2b (5 Points) Calculate the mean number of cars per day. What does this tell you and is this a better or a worse estim... the number of calls per day? Explain.

Q-2c (2.5 Points) Calculate the mode

Q-2d (2.5 Points) Calculate the Standard Deviation. Explain what this tells you.

Q-2e (5 Points) Based on the data collected, is the graph skewed to the left, right, or symmetric? If so, please explain your reasoning and how to interpret this information.

Day #   Calls   Bins
1       45      0
2       30      20
3       53      40
4       69      60
5       67      80
6       45      100
7       17      120
8       46      140
9       32      160
10      33      180
11      59
12      74
13      42
14      35
15      38
16      18
17      41
18      63
19      72
20      43
21      45
22      44
23      48
24      48
25      67
26      72
27      41
28      52
29      75
30      40
31      34
32      40
33      88
34      63
35      38
36      48
37      45
38      50
39      150
40      84
41      28
42      37
43      52
44      44
45      49
46      70
47      75
48      58
49      23
50      62
51      10
52      71
53      80
54      70
55      41
56      47
57      99
58      38
59      29
60      83
61      60
62      54
63      35
64      46
65      51
66      58
67      72
68      86
69      48
70      48
71      51
72      62
73      62
74      85
75      150
total 75        4128

Question 1 (1-a)

Day      Car washed
0-20     879
21-39    1003
40-59    1200
60-79    1118
total    4200

Bin     Frequency
0       0
20      3
40      15
60      31
80      18
100     6
120     0
140     0
160     2
180     0
More    0

[Histogram: Frequency by bin; axis labels 0-19, 20-39, 40-59, 60-80, 80-99, 100-110, 120-139, 140-159, 160-180]

Question 2 a

Total cars washed = 4128
Number of days = 75
Average = 55.04 cars washed per day

Outliers are values that "lie outside" the other values. If a number is too far from the main group, it pulls the mean toward it, so the mean no longer describes a typical day.

Mean after removing 2 outliers with values 150 = 52.4383562

Question 2 c (2-c)

In this case it will be 40-60.

Question 2d (2-d)

Standard deviation

Here f is the frequency of the group, X is the midpoint of the group, and x bar is the total mean.

Days     X (midpoint)   f (car washed)   f - total   (f - total)^2
0-20     9.5            879              -3321       11029041
21-39    30             1003             -3197       10220809
40-59    49.5           1200             -3000       9000000
60-79    69.5           1118             -3082       9498724
total                   4200             0

Variance = 14643.5
SD = 121.010330138

Question 2 e

The graph is skewed to the right (positive skew). The two days with 150 cars stretch the right tail, so the mean (55.04) lies to the right of the peak.
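As a cross-check on the Car Wash answers, the summary statistics can be recomputed directly from the 75 daily counts. The sketch below is only an illustration using Python's standard library; the variable names are assumptions, and the bin edges follow the spreadsheet convention that each bin counts values up to and including its upper edge. Computed from the raw data, the sample standard deviation comes out near 24 cars per day, well below the 121.01 in the grouped table, which squares differences between group totals rather than weighting midpoint deviations by frequency.

```python
from bisect import bisect_left
from statistics import mean, median, multimode, stdev

# Daily counts for days 1-75, copied from the "Calls" column of the data table.
cars = [
    45, 30, 53, 69, 67, 45, 17, 46, 32, 33,
    59, 74, 42, 35, 38, 18, 41, 63, 72, 43,
    45, 44, 48, 48, 67, 72, 41, 52, 75, 40,
    34, 40, 88, 63, 38, 48, 45, 50, 150, 84,
    28, 37, 52, 44, 49, 70, 75, 58, 23, 62,
    10, 71, 80, 70, 41, 47, 99, 38, 29, 83,
    60, 54, 35, 46, 51, 58, 72, 86, 48, 48,
    51, 62, 62, 85, 150,
]

avg = mean(cars)                                    # arithmetic mean
med = median(cars)                                  # middle value, robust to outliers
modes = multimode(cars)                             # most frequent value(s)
sd = stdev(cars)                                    # sample standard deviation (n - 1)
trimmed_avg = mean(x for x in cars if x != 150)     # mean without the two 150s

# Histogram counts: a value x falls in the first bin whose upper edge is >= x.
edges = list(range(0, 181, 20))                     # 0, 20, ..., 180
counts = [0] * (len(edges) + 1)                     # last slot is the "More" bin
for x in cars:
    counts[bisect_left(edges, x)] += 1

print(f"mean={avg:.2f} median={med} mode={modes} sd={sd:.2f}")
print(f"mean without outliers={trimmed_avg:.4f}")
print(dict(zip([*edges, "More"], counts)))
```

Because the mean sits above the median, the long tail is on the right, which agrees with a positive skew.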

Income Sales Age

$ 26,748.51 $ 1,695,712.62 33.16

$ 53,063.79 $ 3,403,862.05 32.67

$ 36,090.14 $ 2,710,352.91 35.66

$ 32,058.07 $ 529,215.46 33.07

$ 47,843.42 $ 663,686.65 35.76

$ 50,180.97 $ 2,546,324.34 33.81

$ 30,710.08 $ 2,787,046.20 30.98

$ 29,141.70 $ 612,696.05 30.78

$ 55,980.15 $ 891,822.03 32.32

$ 28,730.88 $ 1,124,967.97 32.53

$ 31,109.23 $ 909,500.98 31.44

$ 55,614.12 $ 2,631,166.88 33.16

$ 23,038.43 $ 882,972.65 31.87

$ 34,531.72 $ 1,078,573.12 33.41

$ 30,350.36 $ 844,320.19 34.05

$ 38,964.94 $ 1,849,119.03 28.89

$ 49,392.77 $ 3,860,007.32 36.11

$ 25,595.69 $ 826,573.88 32.81

$ 29,622.61 $ 604,682.87 33.05

$ 31,586.10 $ 1,903,611.60 33.50

$ 49,674.56 $ 2,356,808.39 32.68

$ 28,878.98 $ 2,788,571.96 28.52

$ 24,287.08 $ 1,634,878.29 32.89

$ 46,711.24 $ 2,371,627.37 30.50

$ 43,449.81 $ 2,627,837.96 30.29

$ 31,694.45 $ 1,868,116.33 31.29

$ 45,459.22 $ 2,236,796.86 33.05

$ 47,047.34 $ 1,318,876.23 32.93

$ 26,433.24 $ 1,868,097.84 31.84

$ 33,396.66 $ 1,695,218.57 31.08

$ 26,179.36 $ 2,700,194.42 32.18

$ 33,454.64 $ 1,156,049.77 31.69

$ 42,271.50 $ 643,858.44 34.03

[Chart: Sales]

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.3601347911
R Square             0.1296970678
Adjusted R Square    0.1016227797
Standard Error       857089.133581316
Observations         33

ANOVA
              df    SS
Regression    1     3393699486846.59
Residual      31    22772655269998.3
Total         32    26166354756844.9

              Coefficients        Standard Error
Intercept     545545.787265091    578170.228098795
Income        32.4943853039       15.1181142648

Regression data

m = \frac{\sum X_i Y_i}{\sum X_i^2}

C = \bar{Y} - m \bar{X}

Xi (Income)    Yi (Sales)    Xi-X

$ 26,748.51 $ 1,695,712.62 -$ 1,192,543.25

$ 53,063.79 $ 3,403,862.05 -$ 1,166,227.97

$ 36,090.14 $ 2,710,352.91 -$ 1,183,201.62

$ 32,058.07 $ 529,215.46 -$ 1,187,233.69

$ 47,843.42 $ 663,686.65 -$ 1,171,448.34

$ 50,180.97 $ 2,546,324.34 -$ 1,169,110.79

$ 30,710.08 $ 2,787,046.20 -$ 1,188,581.68

$ 29,141.70 $ 612,696.05 -$ 1,190,150.06

$ 55,980.15 $ 891,822.03 -$ 1,163,311.61

$ 28,730.88 $ 1,124,967.97 -$ 1,190,560.88

$ 31,109.23 $ 909,500.98 -$ 1,188,182.53

$ 55,614.12 $ 2,631,166.88 -$ 1,163,677.64

$ 23,038.43 $ 882,972.65 -$ 1,196,253.33

$ 34,531.72 $ 1,078,573.12 -$ 1,184,760.04

$ 30,350.36 $ 844,320.19 -$ 1,188,941.40

$ 38,964.94 $ 1,849,119.03 -$ 1,180,326.82

$ 49,392.77 $ 3,860,007.32 -$ 1,169,898.99

$ 25,595.69 $ 826,573.88 -$ 1,193,696.07

$ 29,622.61 $ 604,682.87 -$ 1,189,669.15

$ 31,586.10 $ 1,903,611.60 -$ 1,187,705.66

$ 49,674.56 $ 2,356,808.39 -$ 1,169,617.20

$ 28,878.98 $ 2,788,571.96 -$ 1,190,412.78

$ 24,287.08 $ 1,634,878.29 -$ 1,195,004.68

$ 46,711.24 $ 2,371,627.37 -$ 1,172,580.52

$ 43,449.81 $ 2,627,837.96 -$ 1,175,841.95

$ 31,694.45 $ 1,868,116.33 -$ 1,187,597.31

$ 45,459.22 $ 2,236,796.86 -$ 1,173,832.54

$ 47,047.34 $ 1,318,876.23 -$ 1,172,244.42

$ 26,433.24 $ 1,868,097.84 -$ 1,192,858.52

$ 33,396.66 $ 1,695,218.57 -$ 1,185,895.10

$ 26,179.36 $ 2,700,194.42 -$ 1,193,112.40

$ 33,454.64 $ 1,156,049.77 -$ 1,185,837.12

$ 42,271.50 $ 643,858.44 -$ 1,177,020.26

Total $ 1,219,291.76 $ 57,623,147.23

X-bar = $ 36,948.24

Y-bar = $ 1,746,155.98

m = $ 46.28

C = $ 36,329.36

Y = 46.28 X + 36,329.36

Income of 50,000: Y = 46.28 x 50,000 + 36,329.36

Y = 2,350,329.36
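For comparison, here is a minimal sketch of the least-squares fit using the centred sums, b1 = sum((Xi - X-bar)(Yi - Y-bar)) / sum((Xi - X-bar)^2) and b0 = Y-bar - b1 * X-bar. The lists are copied from the Income and Sales columns; the variable names are assumptions. On this data the centred formulas should reproduce the coefficients in the SUMMARY OUTPUT (about 32.494 and 545,545.79) rather than m = 46.28, which comes from the uncentred sums.

```python
# Income (X) and Sales (Y) for the 33 stores, copied from the data table.
income = [
    26748.51, 53063.79, 36090.14, 32058.07, 47843.42, 50180.97, 30710.08,
    29141.70, 55980.15, 28730.88, 31109.23, 55614.12, 23038.43, 34531.72,
    30350.36, 38964.94, 49392.77, 25595.69, 29622.61, 31586.10, 49674.56,
    28878.98, 24287.08, 46711.24, 43449.81, 31694.45, 45459.22, 47047.34,
    26433.24, 33396.66, 26179.36, 33454.64, 42271.50,
]
sales = [
    1695712.62, 3403862.05, 2710352.91, 529215.46, 663686.65, 2546324.34,
    2787046.20, 612696.05, 891822.03, 1124967.97, 909500.98, 2631166.88,
    882972.65, 1078573.12, 844320.19, 1849119.03, 3860007.32, 826573.88,
    604682.87, 1903611.60, 2356808.39, 2788571.96, 1634878.29, 2371627.37,
    2627837.96, 1868116.33, 2236796.86, 1318876.23, 1868097.84, 1695218.57,
    2700194.42, 1156049.77, 643858.44,
]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(sales) / n

# Centred sums of cross-products and squares.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, sales))
s_xx = sum((x - x_bar) ** 2 for x in income)

b1 = s_xy / s_xx            # slope: change in sales per dollar of income
b0 = y_bar - b1 * x_bar     # intercept

pred_50k = b0 + b1 * 50_000  # predicted sales at an income of $50,000
print(f"b1={b1:.4f} b0={b0:.2f} prediction at 50,000={pred_50k:.2f}")
```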

Growth     HS         College
0.8299     73.5949    17.8350
0.6619     88.4557    31.9439
0.9688     73.5362    18.6198
0.0821     79.1780    20.6284
0.4646     84.1838    35.2032
2.1796     93.4996    41.7057
1.8048     78.0234    28.0250
-0.0569    70.2949    15.0882
-0.1577    70.6674    10.9829
0.3664     63.7395    13.2458
2.2256     76.9059    19.5500
1.5158     82.9452    20.8135
0.1413     65.2127    16.9796
-1.0400    73.4944    32.9920
1.6836     80.2201    22.3185
2.3596     87.5973    24.5670
0.7840     85.3041    30.8790
0.1164     65.5884    17.4545
1.1498     80.6176    18.6356
0.0606     80.3790    38.3249
1.6338     79.8526    23.7780
1.1256     81.2371    16.9300
1.4884     70.2244    19.1429
4.7937     87.1046    30.8843
1.8922     80.2057    26.5570
1.8667     75.2914    28.3600
1.7896     77.6162    19.2490
0.2707     85.1753    35.4994
3.0129     74.1792    18.6375
3.4630     81.6991    41.1130
0.7041     73.4140    17.8566
-0.1569    73.7161    26.5426
0.7084     78.6493    29.8734

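ANOVA (continued)
              MS                  F               Significance F
Regression    3393699486846.59    4.6197811737    0.0395229519
Residual      734601782903.171

Coefficients (continued)
              t Stat          P-value         Lower 95%            Upper 95%    Lower 95.0%
Intercept     0.9435729492    0.352684205     -633640.167248479    1724732      -633640
Income        2.1493676218    0.0395229519    1.6607879765         63.327983    1.660788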

Yi-Y XiYi Xi^2

-$ 55,927,434.61 45357785973.1962 $ 715,482,787.22

-$ 54,219,285.17 180621821169.361 $ 2,815,765,809.16

-$ 54,912,794.32 97817015790.8567 $ 1,302,498,205.22

-$ 57,093,931.77 16965626229.7041 $ 1,027,719,852.12

-$ 56,959,460.57 31753039335.7167 $ 2,288,992,837.30

-$ 55,076,822.89 127777025064.905 $ 2,518,129,750.14

-$ 54,836,101.03 85590411827.1162 $ 943,109,013.61

-$ 57,010,451.17 17855004596.8518 $ 849,238,678.89

-$ 56,731,325.19 49924331180.645 $ 3,133,777,194.02

-$ 56,498,179.26 32321319606.2592 $ 825,463,465.57

-$ 56,713,646.25 28293875047.6085 $ 967,784,191.19

-$ 54,991,980.35 146330030659.96 $ 3,092,930,343.37

-$ 56,740,174.57 20342303681.0932 $ 530,769,256.86

-$ 56,544,574.10 37244985117.4933 $ 1,192,439,686.16

-$ 56,778,827.03 25625421843.1698 $ 921,144,352.13

-$ 55,774,028.20 72050812017.8433 $ 1,518,266,549.20

-$ 53,763,139.91 190656453557.505 $ 2,439,645,728.27

-$ 56,796,573.35 21156728794.5772 $ 655,139,346.58

-$ 57,018,464.36 17912284772.4455 $ 877,499,023.21

-$ 55,719,535.63 60127666358.76 $ 997,681,713.21

-$ 55,266,338.84 117073419827.233 $ 2,467,561,911.19

-$ 54,834,575.27 80531113774.7639 $ 833,995,485.84

-$ 55,988,268.94 39706419722.3449 $ 589,862,254.93

-$ 55,251,519.86 110781655223.928 $ 2,181,939,942.34

-$ 54,995,309.27 114179060116.237 $ 1,887,885,989.04

-$ 55,755,030.90 59208919615.3685 $ 1,004,538,160.80

-$ 55,386,350.37 101683040644.968 $ 2,066,540,683.01

-$ 56,304,270.99 62049618598.9176 $ 2,213,452,201.08

-$ 55,755,049.39 49379878442.4686 $ 698,716,176.90

-$ 55,927,928.66 56614638074.3896 $ 1,115,336,899.16

-$ 54,922,952.81 70689361660.2744 $ 685,358,890.01

-$ 56,467,097.45 38675229011.2514 $ 1,119,212,937.53

-$ 56,979,288.78 27216862215.546 $ 1,786,879,712.25

2233513159552.76 $ 48,264,759,027.52

25 Marks:

The data at left are monthly sales totals from a random sample of 33 stores in a large chain of nationwide clothing stores. All stores in the franchise, and thus within the sample, are approximately the same size and carry the same merchandise. The county, or in some cases counties, in which the store draws the majority of its customers is referred to here as the customer base. For each of the 33 stores, the variables in the data set are:

Sales ------Latest one month sales total (dollars)

Income ---Median family income of customer base (dollars)

Age --------Median age of customer base (years)

HS ----------Percentage of customer base with a high school diploma

College ---Percentage of customer base with a college diploma

Growth ---Annual population growth rate of customer base over the past 10 years.

Q-2a (5 Points). Construct a scatter plot, using sales as the dependent variable and median family income as the independent variable. Discuss the scatter plot.

Q-2b (5 Points). Assuming a linear relationship, use the least-squares method to compute the regression coefficients b0 and b1 and state the regression equation.

Q-2c (2.5 points): Predict the sales based off income of $50,000.00

Q-2d (2.5 points): Explain why it would not be appropriate to use the model to predict sales when income is $10,000

Q-2e (10 Points). Interpret the meaning of the Y-intercept, b0, and the slope, b1, in this problem.

2-a)

[Scatter plot: Sales against Income]

The points are widely scattered, so the linear relationship between income and sales is weak (R Square = 0.13).

              Upper 95.0%
Intercept     1724732
Income        63.327983

2-d)

At income = 10,000, Sales = 870489.64

Standard Error = 857089.13

The predicted sales are almost the same size as the standard error. In addition, an income of $10,000 lies well below the lowest income in the sample ($23,038.43), so the prediction would be an extrapolation. That is why it would not be appropriate.

2-e)

Sales = b1*Income + b0

Sales = 32.494*Income + 545545.7873

b1 > 0, which means that sales increase as income increases.

When income increases by 1 dollar, predicted sales increase by 32.494 dollars.

b0 = 545545.7873, which is the predicted sales when the median family income = 0.
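The prediction at an income of $10,000 and the reason it is unreliable can be checked with a short sketch. The coefficients are taken from the SUMMARY OUTPUT and the income range from the data table; the function names `predict_sales` and `is_extrapolation` are hypothetical helpers introduced only for this illustration.

```python
# Coefficients from the SUMMARY OUTPUT (Intercept and Income rows).
B0 = 545545.787265091
B1 = 32.4943853039

# Smallest and largest median family income observed in the 33-store sample.
INCOME_MIN = 23038.43
INCOME_MAX = 55980.15


def predict_sales(income: float) -> float:
    """Predicted monthly sales from the fitted line Sales = B0 + B1 * Income."""
    return B0 + B1 * income


def is_extrapolation(income: float) -> bool:
    """True when the income lies outside the range the model was fitted on."""
    return not (INCOME_MIN <= income <= INCOME_MAX)


for income in (10_000, 50_000):
    flag = "extrapolation" if is_extrapolation(income) else "within sample range"
    print(f"income={income}: predicted sales={predict_sales(income):.2f} ({flag})")
```

An income of $50,000 falls inside the observed range, while $10,000 does not, so only the first prediction is supported by the data.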

25 Points:

Q-3: The cellular spinoff company Jog wants to estimate the proportion of households that would purchase an additional cellular line if it were made available with a free handset. A random sample of 500 accounts is selected. The results indicate that 135 of the accounts would purchase an additional line if the handset was free.

Construct a 99% confidence interval estimate of the population proportion of accounts that would purchase the additional line if the handset were free. Fill in the table below to arrive at your answer. Points breakdown below:

(2.5 Points)- Determine N, P-bar, and Confidence Level, Square Root, and determine the Cen

(2.5 Points)- Complete the table in full.

(5 Points)- Describe the confidence interval estimate of the population proportion of accounts that would purchase the additional line if the handset were free.

Q-4 (15 Points): The amount of time it takes to take your order at the local Briarpatch restaurant has a population mean μ of 3.1 minutes and a standard deviation σ of 0.40 minutes. If you select a random sample of 16 customers:

Q-4a (5 Points): What is the probability that the mean time spent per customer is at least 3 m

Q-4b (5 Points): What is the probability that it takes between 3 minutes and 5 minutes to tak

Q-4c: (5 Points) What is the length of an order if only 1% of all orders are shorter? (Round to

Proportions

n >= 30

p-bar                     0.27     pbar = number of accounts purchasing an additional line if handset was free / n
confidence level          99%
z*s/sqrt(n)               0.05
Lower end of int'l        0.219    Lower end = Center of interval - Z*s/sqrt(n)
Upper end of int'l        0.321    Upper end = Center of interval + Z*s/sqrt(n)
Interval width
1-confidence level        1%
(1-confidence level)/2    0.005
z                         2.58     Z score = normsinv(1-alpha/2)
(p)(1-p)                  0.20
s = sqrt[(p)(1-p)]        0.44
sqrt(n)                   22.36
s/sqrt(n)                 0.02     Standard deviation = s/sqrt(n)

Check assumptions:
np > 5        OK
n(1-p) > 5    OK
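The rows of the Proportions table can be reproduced step by step. The sketch below assumes the usual normal-approximation interval for a proportion, p-bar ± z * sqrt(p-bar(1 - p-bar)/n), and uses `statistics.NormalDist().inv_cdf` in place of the spreadsheet's normsinv; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 500              # accounts sampled
x = 135              # accounts that would purchase an additional line
confidence = 0.99

p_bar = x / n                                   # sample proportion (centre of interval)
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)         # critical value, about 2.58
s = sqrt(p_bar * (1 - p_bar))                   # s = sqrt[(p)(1-p)]
std_err = s / sqrt(n)                           # s / sqrt(n)
margin = z * std_err                            # z * s / sqrt(n)

lower = p_bar - margin                          # lower end = centre - margin
upper = p_bar + margin                          # upper end = centre + margin

# Large-sample check used in the table: np > 5 and n(1-p) > 5.
assumptions_ok = n * p_bar > 5 and n * (1 - p_bar) > 5

print(f"p_bar={p_bar:.2f} z={z:.2f} margin={margin:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f}), assumptions ok: {assumptions_ok}")
```

Read as an interval estimate, this says that between roughly 21.9% and 32.1% of all accounts would purchase the additional line, with 99% confidence.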

a) Z score = (X - mean)/(standard deviation/sqrt(n))

Z score = -1

b) P(3<X<5) = P(Z < (5 - mean)/(standard deviation/sqrt(n))) - P(Z < (3 - mean)/(standard deviation/sqrt(n)))

Z score at X = 5: 19

Z score at X = 3: -1

Probability = 0.8413447

c) Probability = 1%

Z score at probability = -2.326348

Find X by solving the Z-score formula above for X.
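The Q-4 calculations can be sketched with `statistics.NormalDist`. Parts a and b use the sampling distribution of the mean (standard deviation 0.40/sqrt(16)). For part c this sketch assumes the question refers to individual orders, so it uses the population standard deviation without dividing by sqrt(n); that reading is an assumption, not something the worked answer states.

```python
from math import sqrt
from statistics import NormalDist

mu = 3.1        # population mean order time (minutes)
sigma = 0.40    # population standard deviation (minutes)
n = 16          # sample size

# Sampling distribution of the sample mean: Normal(mu, sigma / sqrt(n)).
mean_dist = NormalDist(mu, sigma / sqrt(n))

z_at_3 = (3 - mu) / (sigma / sqrt(n))                # about -1
z_at_5 = (5 - mu) / (sigma / sqrt(n))                # about 19

p_at_least_3 = 1 - mean_dist.cdf(3)                  # a) P(mean >= 3)
p_between_3_and_5 = mean_dist.cdf(5) - mean_dist.cdf(3)   # b) P(3 < mean < 5)

# c) Order length with only 1% of individual orders shorter (assumed reading).
z_1pct = NormalDist().inv_cdf(0.01)                  # about -2.326
order_1pct = mu + z_1pct * sigma

print(f"a) {p_at_least_3:.7f}  b) {p_between_3_and_5:.7f}")
print(f"c) z={z_1pct:.6f}, order length={order_1pct:.3f} minutes")
```

Parts a and b give the same probability because a sample mean above 5 minutes is 19 standard errors away and has essentially zero probability.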

25 Points:

Q-5 (15 Points). You are the manager of a popular retail store. You want to determine whether the population mean wait time for customers to check out has changed in the past month from its previous population mean value of 4.25 minutes. From past experience, you can assume that the population is normally distributed with a population standard deviation of 1.6 minutes. You select a sample of 43 customers' wait times to check out during a one hour period. The sample mean is 4.75 minutes. Determine whether there is evidence at the 0.05 level of significance that the population mean wait time to check out has changed in the past month from its previous population mean value of 4.25 minutes. Explain your findings utilizing information from the hypothesis test you conduct.

Q-5c (5 Points): Based on the information, do we reject or accept the null hypothesis? Describe what this means for the results of the test. Fill in the table below to help answer the questions.

Q-5d (5 Points): Describe what would occur if the sample size was doubled.

Q-6 (5 points):

A sport preference poll yielded the following data for men and women. Use the 5% significance level and test to determine whether sport preference and gender are independent.

                  Sport Preference
                  Basketball   Football   Soccer   Total
Gender   Men      20           25         30       75
         Women    18           12         15       45
         Total    38           37         45       120

Q-7 (5 Points): Suppose that we observe a random sample of size n from a normally distributed population. If we are able to reject H0: mu = mu0 in favor of Ha: mu ≠ mu0 at the 5% significance level, is it true that we can definitely reject H0 in favor of the appropriate one-tailed alternative at the 2.5% significance level? Why or why not?

Mean

Sample Mean, x-bar      4.75     V = 43 - 1
Sample Std Dev., s      1.60     V = 42
Sample Size, n          43
Confidence Level        5%
Reject Null Hypothesis
Lower end of range
Center of range
Upper end of range
p-value
Significance Level
(Significance Level)/2
z
sqrt(n)
s/sqrt(n)
z*s/sqrt(n)
Value of z-statistic

5b) Null hypothesis: mean = 4.25. Alternate hypothesis: mean ≠ 4.25

Z = 2.0491995

p value = 0.0404426 (check the Excel tab for the formula)

If p value < 0.05, reject the null hypothesis

If p value > 0.05, fail to reject the null hypothesis

Since 0.0404426 < 0.05, we reject the null hypothesis: there is evidence that the mean wait time has changed from 4.25 minutes.

5d) With the sample size doubled to 86: Z score = (X - mean)/(standard deviation/sqrt(n))

Z = 2.8980058

p value = 0.0037554

If p value < 0.05, reject the null hypothesis

If p value > 0.05, fail to reject the null hypothesis

Q-7: This is not true for certain. Suppose the sample mean we observe is above mu0. If the alternative for the one-tailed test is mu < mu0, then we obviously can't reject the null because the observed sample mean is in the wrong direction. But if the alternative is mu > mu0, we can reject the null at the 2.5% level. The reason is that we know the p-value for the two-tailed test was less than 0.05. The p-value for a one-tailed test is half of this, or less than 0.025, which implies rejection at the 2.5% level.
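The Z statistics and p-values in 5b and 5d follow from the two-tailed z-test with a known population standard deviation. A minimal sketch, where `two_tailed_z_test` is a hypothetical helper written for this illustration:

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p) for H0: mean = mu0 against Ha: mean != mu0, sigma known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # both tails
    return z, p


alpha = 0.05

# 5b) Original sample of 43 customers.
z43, p43 = two_tailed_z_test(4.75, 4.25, 1.6, 43)

# 5d) Same sample mean with the sample size doubled to 86.
z86, p86 = two_tailed_z_test(4.75, 4.25, 1.6, 86)

for label, z, p in (("n=43", z43, p43), ("n=86", z86, p86)):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: z={z:.7f} p={p:.7f} -> {decision}")
```

Doubling n shrinks the standard error by a factor of sqrt(2), so the same 0.5 minute difference produces a larger Z and a smaller p-value.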

Observed

                  Basketball   Football   Soccer   Total
Gender   Men      20           25         30       75
         Women    18           12         15       45
         Total    38           37         45       120

Expected

                  Basketball   Football   Soccer   Total
Gender   Men      23.75        23.125     28.125   75
         Women    14.25        13.875     16.875   45
         Total    38           37         45       120

Please check the tabs for the formulas.

Null hypothesis: Sports preference and gender are independent

Alternate hypothesis: Sports preference and gender are dependent

If p value < 0.05, reject the null hypothesis

If p value > 0.05, fail to reject the null hypothesis

Since p value > 0.05, we fail to reject the null hypothesis and conclude that sports preference and gender are independent.
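The expected counts and the test decision can be reproduced with a short chi-square sketch. Each expected count is row total × column total / grand total. The p-value line relies on the fact that, for 2 degrees of freedom only, the chi-square upper-tail probability equals exp(-x/2); for other table sizes a statistics library would be needed. Variable names are illustrative.

```python
from math import exp

# Observed counts: rows are Men, Women; columns are Basketball, Football, Soccer.
observed = [
    [20, 25, 30],
    [18, 12, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)   # (2-1)*(3-1) = 2
p_value = exp(-chi_sq / 2)                           # valid because dof == 2

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"chi-square={chi_sq:.4f} dof={dof} p={p_value:.4f} -> {decision}")
```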

