# Narsee Monjee Institute of Management Studies (Deemed University)

## Statistical Methods: Assignment 5 (Chi-Square Tests)

1. Each person in a random sample of 50 was asked to state his/her sex and preferred colour. The resulting frequencies
are shown below.

| Sex / Colour | Red | Blue | Green |
|---|---|---|---|
| Male | 5 | 14 | 6 |
| Female | 15 | 6 | 4 |

Test the null hypothesis that sex and preferred colour are independent.

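Frequencies (data), Yates correction: 0

| Row | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 5 | 14 | 6 | 25 |
| 2 | 15 | 6 | 4 | 25 |
| Total | 20 | 20 | 10 | 50 |

#Rows = 2, #Cols = 3, df = 2

Expected frequencies

| Row | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 10 | 10 | 5 |
| 2 | 10 | 10 | 5 |

Contributions to the test statistic

| Oi | Ei | Oi - Ei | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|
| 5 | 10 | -5 | 25 | 2.5 |
| 14 | 10 | 4 | 16 | 1.6 |
| 6 | 5 | 1 | 1 | 0.2 |
| 15 | 10 | 5 | 25 | 2.5 |
| 6 | 10 | -4 | 16 | 1.6 |
| 4 | 5 | -1 | 1 | 0.2 |

Test statistic: χ² = 8.6

Reject H₀ at the 5% level of significance.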
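
The expected frequencies and the test statistic for problem 1 can be reproduced with a short script. This is a minimal sketch in plain Python, not the tool that produced the worksheet output. The function name `chi_square_independence` is hypothetical, and 5.991 is the standard χ² table value for df = 2 at the 5% level.

```python
def chi_square_independence(observed):
    """Return (statistic, df, expected) for an r x c table of counts.

    Hypothetical helper for illustration; no Yates correction is applied.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Problem 1 data: rows are Male, Female; columns are Red, Blue, Green.
observed = [[5, 14, 6], [15, 6, 4]]
statistic, df, expected = chi_square_independence(observed)

CRITICAL_5PCT_DF2 = 5.991  # chi-square table value for df = 2, alpha = 0.05
reject_h0 = statistic > CRITICAL_5PCT_DF2
print(f"chi2 = {statistic:.4f}, df = {df}, reject H0: {reject_h0}")
```

The same function accepts any r × c table, so the 3 × 3 shift data of problem 2 can be checked by passing its three rows of counts.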

2. The following data were obtained from a company that manufactures special plastic containers designed to hold
a specified volume of hazardous material. On each of the three 8-hour shifts, workers are able to make 500 of the
containers. Some containers do not meet the specifications required by the company's customer because they are too
small, others because they are too large.

Conformance to Specification

| Shift | Too Small | Within Spec. | Too Large |
|---|---|---|---|
| 8am | 36 | 452 | 12 |
| 4pm | 24 | 443 | 33 |
| midnight | 12 | 438 | 50 |

Frequencies (data), Yates correction: 0

| Row | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 36 | 452 | 12 | 500 |
| 2 | 24 | 443 | 33 | 500 |
| 3 | 12 | 438 | 50 | 500 |
| Total | 72 | 1333 | 95 | 1500 |

#Rows = 3, #Cols = 3, df = 4

Expected frequencies

| Row | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 24 | 444.3333 | 31.66667 |
| 2 | 24 | 444.3333 | 31.66667 |
| 3 | 24 | 444.3333 | 31.66667 |

Test statistic: χ² = 35.11077

Reject H₀ at the 5% level of significance.

3. A professional baseball player, Lon Dakestraw, was at bat five times in each of 100 games. Lon claims that he has a
probability of 0.4 of getting a hit each time he goes to bat. Test his claim at the 0.05 level by seeing if the following
data are distributed binomially (p = 0.4). (Note: Combine classes if the expected number of observations is less than
5.)

| Number of Hits per Game | Number of Games with That Number of Hits |
|---|---|
| 0 | 12 |
| 1 | 38 |
| 2 | 27 |
| 3 | 17 |
| 4 | 5 |
| 5 | 1 |

| x | P(Exactly x) | Oi | Ei | Oi (combined) | Ei (combined) | Oi - Ei | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.07776 | 12 | 7.776 | 12 | 7.776 | 4.224 | 17.84218 | 2.294519 |
| 1 | 0.2592 | 38 | 25.92 | 38 | 25.92 | 12.08 | 145.9264 | 5.629877 |
| 2 | 0.3456 | 27 | 34.56 | 27 | 34.56 | -7.56 | 57.1536 | 1.65375 |
| 3 | 0.2304 | 17 | 23.04 | 17 | 23.04 | -6.04 | 36.4816 | 1.583403 |
| 4 | 0.0768 | 5 | 7.68 | 6 | 8.704 | -2.704 | 7.311616 | 0.840029 |
| 5 | 0.01024 | 1 | 1.024 | (combined with 4) | | | | |
| Total | 1 | 100 | | | | | | |

Test statistic: χ² = 12.00158

Reject H₀.
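
The binomial probabilities, the combining of sparse classes, and the resulting statistic can be checked with the sketch below. It is an illustration under stated assumptions rather than the worksheet's own method. The helper names `binomial_pmf` and `pool_tail` are hypothetical, the degrees of freedom are taken as (number of classes after combining) - 1 because p is fixed by the claim, and 9.488 is the standard χ² table value for df = 4 at the 0.05 level.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pool_tail(observed, expected, minimum=5.0):
    """Merge the last class into its neighbour while its expected count is below `minimum`.

    Hypothetical helper: it only pools from the upper tail, which is all this data needs.
    """
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_obs, last_exp = obs.pop(), exp.pop()
        obs[-1] += last_obs
        exp[-1] += last_exp
    return obs, exp


games, at_bats, p_hit = 100, 5, 0.4
observed = [12, 38, 27, 17, 5, 1]  # games with 0, 1, ..., 5 hits
expected = [games * binomial_pmf(k, at_bats, p_hit) for k in range(at_bats + 1)]

pooled_obs, pooled_exp = pool_tail(observed, expected)
chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
df = len(pooled_obs) - 1  # p is specified by the claim, so no parameter is estimated

CRITICAL_5PCT_DF4 = 9.488  # chi-square table value for df = 4, alpha = 0.05
print(f"chi2 = {chi2:.5f}, df = {df}, reject H0: {chi2 > CRITICAL_5PCT_DF4}")
```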

4. The following data were collected concerning food purchases at several sporting events:

Food Purchases

| Sport | Hot Dogs | Popcorn | No Purchase |
|---|---|---|---|
| Football | 240 | 80 | 30 |

Are the purchases independent of the sport? Test at the 5% level of significance.

Frequencies (data), Yates correction: 0

| Row | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 240 | 80 | 30 | 350 |
| 2 | 50 | 90 | 10 | 150 |
| Total | 290 | 170 | 40 | 500 |

#Rows = 2, #Cols = 3, df = 2

Expected frequencies

| Row | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 203 | 119 | 28 |
| 2 | 87 | 51 | 12 |

Test statistic: χ² = 65.56071, p-value = 5.8E-15

Reject H₀.
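
For df = 2 the upper-tail probability of the chi-square distribution has the closed form exp(-x/2), so the reported p-value can be checked without tables. The sketch below uses only that identity; the function name `chi2_sf_df2` is hypothetical, and the formula does not apply to other degrees of freedom.

```python
from math import exp


def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return exp(-x / 2.0)


p_value = chi2_sf_df2(65.56071)  # test statistic reported for problem 4
alpha = 0.05
print(f"p-value = {p_value:.2e}, reject H0: {p_value < alpha}")
```

The same function gives the critical value check in reverse: `chi2_sf_df2(5.991)` is approximately 0.05, which is why 5.991 is the 5% cut-off for every 2 × 3 table in this assignment.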

5. A poll conducted to investigate whether three television dramas are equally preferred among men and
women gave the following results:

| | NYPD Blue | Law & Order | The Practice | Total |
|---|---|---|---|---|
| Men | 40 | 35 | 10 | 85 |
| Women | 30 | 45 | 10 | 85 |
| Total | 70 | 80 | 20 | 170 |

(i) Describe the null and alternative hypotheses for this problem.
(ii) Compute the value of the test statistic.
(iii) Write down the degrees of freedom for this test, and describe the rejection region.
(iv) What is the appropriate conclusion?

Frequencies (data)

| Row | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 40 | 35 | 10 | 85 |
| 2 | 30 | 45 | 10 | 85 |
| Total | 70 | 80 | 20 | 170 |

#Rows = 2, #Cols = 3, df = 2

Expected frequencies

| Row | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 35 | 40 | 10 |
| 2 | 35 | 40 | 10 |

Test statistic: χ² = 2.678571, p-value = 0.262033

Do not reject H₀.
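
Parts (i) and (iii) are not written out in the output for problem 5. One way to state them is sketched below. The significance level α = 0.05 is an assumption, since the problem does not give one; R_i, C_j and n denote the row totals, column totals and grand total of the observed table; and 5.991 is the standard χ² table value for df = 2.

```latex
\begin{aligned}
H_0 &: \text{drama preference is independent of gender} \\
H_1 &: \text{drama preference depends on gender} \\
\chi^2 &= \sum_{i=1}^{2}\sum_{j=1}^{3} \frac{(O_{ij}-E_{ij})^2}{E_{ij}},
  \qquad E_{ij} = \frac{R_i\,C_j}{n} \\
\text{df} &= (2-1)(3-1) = 2 \\
\text{Reject } H_0 &\text{ if } \chi^2 > \chi^2_{0.05,\,2} = 5.991 \\
\chi^2 &= 2.678571 < 5.991 \;\Rightarrow\; \text{do not reject } H_0
\end{aligned}
```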

6. A study was done to determine the effectiveness of varying amounts of vitamin C in reducing the number of
common colds. A survey of 450 people provided the following information:

Daily amount of vitamin C taken

| | None | 500 mg | 1000 mg |
|---|---|---|---|
| No colds | 57 | 26 | 17 |
| At least one cold | 223 | 84 | 43 |

Is there evidence of a relationship between catching a cold and taking vitamin C? Test at the 1% level of significance.

Frequencies (data), Yates correction: 0

| Row | 1 | 2 | 3 | Total |
|---|---|---|---|---|
| 1 | 57 | 26 | 17 | 100 |
| 2 | 223 | 84 | 43 | 350 |
| Total | 280 | 110 | 60 | 450 |

#Rows = 2, #Cols = 3, df = 2

Expected frequencies

| Row | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 62.22222 | 24.44444 | 13.33333 |
| 2 | 217.7778 | 85.55556 | 46.66667 |

Test statistic: χ² = 1.987222, p-value = 0.370237

Do not reject H₀.

7. A brand manager is concerned that her brand’s share may be unevenly distributed throughout the country. In a
survey in which the country was divided into four geographic regions, a random sample of 100 consumers in each
region was surveyed, with the following results:

Region

| | NE | NW | SE | SW | Total |
|---|---|---|---|---|---|
| Purchase the brand | 40 | 55 | 45 | 50 | 190 |
| Do not purchase | 60 | 45 | 55 | 50 | 210 |
| Total | 100 | 100 | 100 | 100 | 400 |

Develop a table of observed and expected frequencies for this problem.
(a) Calculate the sample χ² value.
(b) State the null and alternative hypotheses.
(c) If the level of significance is 0.05, should the null hypothesis be rejected?

Frequencies (data), Yates correction: 0

| Row | 1 | 2 | 3 | 4 | Total |
|---|---|---|---|---|---|
| 1 | 40 | 55 | 45 | 50 | 190 |
| 2 | 60 | 45 | 55 | 50 | 210 |
| Total | 100 | 100 | 100 | 100 | 400 |

#Rows = 2, #Cols = 4, df = 3

Expected frequencies

| Row | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 47.5 | 47.5 | 47.5 | 47.5 |
| 2 | 52.5 | 52.5 | 52.5 | 52.5 |

Test statistic: χ² = 5.012531, p-value = 0.170882

Do not reject H₀.

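
With df = 3 the upper-tail probability also has a closed form, built from the complementary error function. The sketch below uses it to check the p-value reported for problem 7. The function name `chi2_sf_df3` is hypothetical, and the formula is specific to three degrees of freedom.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 degrees of freedom."""
    t = x / 2.0
    # Q(3/2, t) = erfc(sqrt(t)) + (2 / sqrt(pi)) * sqrt(t) * exp(-t)
    return erfc(sqrt(t)) + 2.0 * sqrt(t / pi) * exp(-t)


p_value = chi2_sf_df3(5.012531)  # test statistic reported for problem 7
alpha = 0.05
print(f"p-value = {p_value:.6f}, reject H0: {p_value < alpha}")
```
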
8. The post office is interested in modeling the “mangled-letter” problem. It has been suggested that any letter sent to a
certain area has a 0.15 chance of being mangled. Because the post office is so big, it can be assumed that two
letters’ chances of being mangled are independent. A sample of 310 people was selected, and two test letters were
mailed to each of them. The number of people receiving zero, one, or two mangled letters was 260, 40, and 10,
respectively. At the 0.10 level of significance, is it reasonable to conclude that the number of mangled letters
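received by people follows a binomial distribution with p = 0.15?

| x | P(Exactly x) | E(x) | Oi | Oi - E(x) | (Oi - E(x))² | (Oi - E(x))²/E(x) |
|---|---|---|---|---|---|---|
| 0 | 0.7225 | 223.975 | 260 | 36.025 | 1297.801 | 5.794399 |
| 1 | 0.255 | 79.05 | 40 | -39.05 | 1524.903 | 19.29035 |
| 2 | 0.0225 | 6.975 | 10 | 3.025 | 9.150625 | 1.311918 |
| Total | 1 | 310 | 310 | | | |

Test statistic: χ² = 26.39667

Reject H₀.
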

9. A state lottery commission claims that for a new lottery game, there is a 10 percent chance of getting a \$1 prize, a 5
percent chance of \$100, and an 85 percent chance of getting nothing. To test if this claim is correct, a winner from
the last lottery went out and bought 1,000 tickets for the new lottery. He had 87 one-dollar prizes, 48 one-hundred-dollar
prizes, and 865 worthless tickets. At the 0.05 significance level, is the state’s claim reasonable?

| Claimed chance | p | Ei | Oi | Oi - Ei | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|---|---|
| 10% | 0.1 | 100 | 87 | -13 | 169 | 1.69 |
| 5% | 0.05 | 50 | 48 | -2 | 4 | 0.08 |
| 85% | 0.85 | 850 | 865 | 15 | 225 | 0.264706 |
| Total | | 1000 | 1000 | | | |

Test statistic: χ² = 2.034706

Do not reject H₀.
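
When the hypothesised distribution is a list of stated proportions, as in problem 9, the expected counts are simply n times each proportion. The sketch below works through that case; the dictionary keys are labels chosen here for illustration, and 5.991 is the standard χ² table value for df = 2 at the 0.05 level.

```python
# Claimed probabilities and observed ticket counts from problem 9.
claimed = {"$1 prize": 0.10, "$100 prize": 0.05, "no prize": 0.85}
observed = {"$1 prize": 87, "$100 prize": 48, "no prize": 865}

n = sum(observed.values())
expected = {outcome: n * p for outcome, p in claimed.items()}
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in claimed)
df = len(claimed) - 1  # the proportions are given, so nothing is estimated

CRITICAL_5PCT_DF2 = 5.991  # chi-square table value for df = 2, alpha = 0.05
reject_h0 = chi2 > CRITICAL_5PCT_DF2
print(f"chi2 = {chi2:.6f}, df = {df}, reject H0: {reject_h0}")
```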

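10. A large city fire department calculates that for any given precinct, during any given 8-hour shift, there is a 30
percent chance of receiving at least one fire alarm. Here is a random sample of 60 days:

| Number of shifts during which alarms were received | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Number of days | 16 | 27 | 11 | 6 |

At the 0.05 level of significance, do these fire alarms follow a binomial distribution?
(Hint: Combine the last two groups so that all expected frequencies will be greater than 5.)

Assuming P(at least one alarm) = 0.3 over the three shifts of a day, so that 1 - (1 - p)³ = 0.3, the per-shift probability is p = 0.112.

| x | P(Exactly x) | exp | exp (combined) | obs (combined) | obs - exp | (obs - exp)² | (obs - exp)²/exp |
|---|---|---|---|---|---|---|---|
| 0 | 0.700227 | 42.01362 | 42.01362 | 16 | -26.0136 | 676.7087 | 16.10689 |
| 1 | 0.264951 | 15.89705 | 17.98638 | 44 | 26.01362 | 676.7087 | 37.6234 |
| 2 | 0.033417 | 2.005033 | (combined with 1) | | | | |
| 3 | 0.001405 | 0.084296 | (combined with 1) | | | | |
| Total | | 60 | 60 | 60 | | | |

Test statistic: χ² = 53.73029

Reject H₀.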
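
The value p = 0.112 and the table for problem 10 can be reproduced as sketched below. The script follows the worked solution's reading of the 30 percent figure as the chance of at least one alarm across a day's three shifts, which is an assumption; if 0.3 is instead read as the per-shift probability, p = 0.3 would replace 0.112 and the expected frequencies would change. The variable names are illustrative, and 3.841 is the standard χ² table value for df = 1 at the 0.05 level.

```python
from math import comb

shifts_per_day = 3
p_any_alarm_per_day = 0.30  # worked solution's reading of the 30 percent figure (assumption)

# Solve 1 - (1 - p)^3 = 0.30 for the per-shift probability p.
p_exact = 1.0 - (1.0 - p_any_alarm_per_day) ** (1.0 / shifts_per_day)
p = round(p_exact, 3)  # the worked solution rounds to 0.112

probs = [
    comb(shifts_per_day, k) * p**k * (1 - p) ** (shifts_per_day - k)
    for k in range(shifts_per_day + 1)
]

days = 60
observed = [16, 27, 11, 6]  # days with alarms in 0, 1, 2, 3 shifts
expected = [days * q for q in probs]

# Classes 2 and 3 together are still expected fewer than 5 times, so
# classes 1, 2 and 3 are combined, as in the worked solution.
pooled_obs = [observed[0], sum(observed[1:])]
pooled_exp = [expected[0], sum(expected[1:])]
chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))

CRITICAL_5PCT_DF1 = 3.841  # two classes remain, so df = 1 when p is taken as given
print(f"p = {p}, chi2 = {chi2:.5f}, reject H0: {chi2 > CRITICAL_5PCT_DF1}")
```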

11. A diligent statistics student wants to see if it is reasonable to assume that some sales data have been sampled from a
normal population before performing a hypothesis test on the mean sales. She collected some sales data, computed
x̄ = 78 and s = 9, and tabulated the data as follows:

| Sales level | ≤ 65 | 66-70 | 71-75 | 76-80 | 81-85 | ≥ 86 |
|---|---|---|---|---|---|---|
| Number of observations | 10 | 20 | 40 | 50 | 40 | 40 |

(a) Is it important for the statistics student to check if the data are normally distributed? Explain.
(b) State explicit null and alternative hypotheses for checking if the data are normally distributed.
(c) What is the probability (using a normal distribution with µ = 78 and σ = 9) that sales will be less than or equal to
65.5; between 65.5 and 70.5; between 70.5 and 75.5; between 75.5 and 80.5; between 80.5 and 85.5; greater than
or equal to 85.5?
(d) At the 0.05 level of significance, does the observed frequency distribution follow a normal distribution?

| x1 | x2 | P(x1<X<x2) | Oi | Ei | Oi - Ei | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|---|---|---|
| -inf | 65.5 | 0.082433 | 10 | 16.48665 | -6.48665 | 42.07668 | 2.552166 |
| 65.5 | 70.5 | 0.119895 | 20 | 23.97902 | -3.97902 | 15.83262 | 0.66027 |
| 70.5 | 75.5 | 0.188263 | 40 | 37.65262 | 2.347381 | 5.510198 | 0.146343 |
| 75.5 | 80.5 | 0.218817 | 50 | 43.76341 | 6.23659 | 38.89506 | 0.888757 |
| 80.5 | 85.5 | 0.188263 | 40 | 37.65262 | 2.347381 | 5.510198 | 0.146343 |
| 85.5 | +inf | 0.202328 | 40 | 40.46568 | -0.46568 | 0.216854 | 0.005359 |
| Total | | 1 | 200 | 200 | | | |

Test statistic: χ² = 4.399238

Do not reject H₀.
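
The interval probabilities in part (c) come from differences of the normal cumulative distribution function at the class boundaries. The sketch below computes them with `math.erf` and then forms the statistic for part (d). The helper name `normal_cdf` is hypothetical, and µ = 78 and σ = 9 are taken from the problem statement.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma)."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


mu, sigma = 78.0, 9.0
boundaries = [65.5, 70.5, 75.5, 80.5, 85.5]
observed = [10, 20, 40, 50, 40, 40]

# Cumulative probabilities at the class boundaries, padded with 0 and 1
# for the two open-ended classes.
cdf = [0.0] + [normal_cdf(b, mu, sigma) for b in boundaries] + [1.0]
probs = [upper - lower for lower, upper in zip(cdf, cdf[1:])]

n = sum(observed)
expected = [n * p for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi2 = {chi2:.6f}")
```

The worked solution does not state the degrees of freedom. With six classes, df = 5 if µ and σ are treated as given, or df = 3 if they are counted as estimated from the sample; the 0.05 table values (11.070 and 7.815) both exceed the statistic, so the decision is the same either way.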

12. A supermarket manager is keeping track of the arrival of customers at checkout counters
to see how many cashiers are needed to handle the flow. In a sample of 500 five-minute time periods, there were 22, 74,
115, 95, 94, 80, and 20 periods in which zero, one, two, three, four, five, or six or more customers, respectively, arrived at a
checkout counter. Are these data consistent at the 0.05 level of significance with a Poisson distribution with λ = 3?

| x | P(Exactly x) | Oi | Ei | Oi - Ei | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|---|---|
| 0 | 0.049787 | 22 | 24.89353 | -2.89353 | 8.37254 | 0.336334 |
| 1 | 0.149361 | 74 | 74.6806 | -0.6806 | 0.46322 | 0.006203 |
| 2 | 0.224042 | 115 | 112.0209 | 2.979096 | 8.875014 | 0.079226 |
| 3 | 0.224042 | 95 | 112.0209 | -17.0209 | 289.7112 | 2.586224 |
| 4 | 0.168031 | 94 | 84.01568 | 9.984322 | 99.68669 | 1.186525 |
| 5 | 0.100819 | 80 | 50.40941 | 29.59059 | 875.6032 | 17.36984 |
| >=6 | 0.083918 | 20 | 41.95897 | -21.959 | 482.1964 | 11.49209 |
| Total | | 500 | 500 | | | |

P(x ≤ 5) = 0.916082, so P(x ≥ 6) = 1 - 0.916082 = 0.083918.

Test statistic: χ² = 33.05644

Reject H₀.
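
The Poisson fit in problem 12 differs from the binomial cases in that the last class is open-ended, so its probability is obtained as one minus the sum of the others. A minimal sketch follows; the helper name `poisson_pmf` is hypothetical, λ = 3 is taken from the problem statement, and 12.592 is the standard χ² table value for df = 6 at the 0.05 level (seven classes, no parameter estimated).

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 3.0
observed = [22, 74, 115, 95, 94, 80, 20]  # periods with 0, 1, ..., 5, and 6 or more arrivals

probs = [poisson_pmf(k, lam) for k in range(6)]
probs.append(1.0 - sum(probs))  # open-ended class: P(X >= 6)

n = sum(observed)
expected = [n * p for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # lambda is specified, so nothing is estimated

CRITICAL_5PCT_DF6 = 12.592  # chi-square table value for df = 6, alpha = 0.05
print(f"chi2 = {chi2:.5f}, df = {df}, reject H0: {chi2 > CRITICAL_5PCT_DF6}")
```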