Assignment 5
Overview
Total marks: / 70
This assignment covers content from Unit 5. It assesses your ability to use sampling distributions in
hypothesis testing about the difference between two or more population means or the difference
between two population proportions, including tests for experiments with more than two categories and
tests about contingency tables.
Instructions
Show all your work and justify all of your answers and conclusions, except for the TRUE/FALSE
questions.
Keep your work to 4 decimals, unless otherwise stated.
Note: Finishing a test of hypotheses with a statement like “reject H0” or “do not reject H0” will be
insufficient for full marks. You must also provide a written concluding statement in the context of
the problem itself. For example, if you are testing hypotheses about the effectiveness of a medical
treatment, you must conclude with a statement like, “we can conclude that the treatment is effective”
or “we cannot conclude that the treatment is effective.”
(9 marks)
1. A researcher is interested in examining the cholesterol levels of heart-attack patients. Cholesterol
levels are measured for 28 heart-attack patients (2 days after their attacks) and 30 other hospital
patients who did not have a heart attack. The researcher believes that cholesterol levels will be
higher for the heart-attack patients. Random samples from each group provide the following results:
                                             Heart-Attack Patients    Non-Heart-Attack Patients
Sample Size                                  28                       30
Mean Cholesterol (mg/dL)                     213.9                    193.1
Standard Deviation of Cholesterol (mg/dL)    47.7                     22.3
Assume that the cholesterol levels for both populations are normally distributed and that the
population standard deviations are equal.
Using a 5% significance level, can the researcher conclude that the mean cholesterol level of heart-
attack patients is greater than the mean cholesterol level of non-heart-attack patients? Formulate and
test the appropriate hypotheses. State and explain your conclusion within the context of the question.
Use the critical value approach.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
Degrees of freedom = n1 + n2 – 2 = 28 + 30 – 2 = 56
Calculate sp:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
So…
Answer: The test statistic 2.1500 is greater than the critical value 1.6730, so we reject the null
hypothesis at the 0.05 significance level. We can therefore conclude that the mean cholesterol level of
heart-attack patients is greater than the mean cholesterol level of non-heart-attack patients.
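The pooled-variance calculation behind this answer can be checked with a short script. This is a minimal sketch using only the summary statistics given in the question; the function name `pooled_t` is hypothetical, and the critical value 1.6730 is taken from the answer rather than computed.

```python
import math

def pooled_t(n1, mean1, s1, n2, mean2, s2):
    """Two-sample t statistic assuming equal population standard deviations."""
    df = n1 + n2 - 2
    # Pooled standard deviation
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    # Standard error of the difference between the sample means
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    # Test statistic under H0: mu1 - mu2 = 0
    t = (mean1 - mean2) / se
    return sp, se, t, df

# Heart-attack patients (group 1) vs. non-heart-attack patients (group 2)
sp, se, t, df = pooled_t(28, 213.9, 47.7, 30, 193.1, 22.3)

t_critical = 1.6730          # right-tail critical value quoted in the answer (df = 56)
reject_h0 = t > t_critical   # right-tailed test

print(f"df = {df}, sp = {sp:.4f}, se = {se:.4f}, t = {t:.4f}, reject H0: {reject_h0}")
```

The statistic comes out at roughly 2.15, in line with the value used in the answer; small differences in the later decimals depend on where intermediate results are rounded.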
(9 marks)
2. A manufacturer wanted to improve on the processes used to produce electrical components. At the
beginning of the year, the factory randomly examined 9,000 electrical components, and of these a
total of 900 components were rejected after a quality-control inspection. A project was deployed to
fix the problem. Following the project, 7,000 components were randomly picked to be examined. Of
these, a total of 600 were rejected. Did the project intervention improve the process?
Test at the 2% significance level whether the population proportion of rejected components
decreased after the project compared to the population proportion prior to the project. Formulate and
test the appropriate hypotheses. Use the p-value approach. Be sure to clearly state and explain your
conclusion within the context of the question.
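H0: p1 – p2 = 0
H1: p1 – p2 > 0 (p1 is the proportion rejected before the project and p2 the proportion after it, so a decrease after the project means p1 > p2)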
n1 = 9000
n2 = 7000
p̂1 = x1/n1 = 900/9000 = 0.1
p̂2 = x2/n2 = 600/7000 = 0.08571429
α = 0.02
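Calculate the pooled sample proportion:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{900 + 600}{9000 + 7000} = 0.09375$$
This means that $\bar{q} = 1 - \bar{p} = 1 - 0.09375 = 0.90625$.
Now calculate the standard error of the difference between the sample proportions:
$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.09375 \cdot 0.90625 \cdot \left(\frac{1}{9000} + \frac{1}{7000}\right)} \approx 0.004645$$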
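The remaining steps of the p-value approach (test statistic, p-value, decision) can be sketched as follows. This is an illustration rather than the original worked answer: it assumes that sample 1 is the pre-project sample, so that a decrease after the project corresponds to an upper-tail test on p1 – p2, and it uses a hypothetical helper `normal_cdf` built on the standard library's `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

x1, n1 = 900, 9000   # rejected / examined before the project
x2, n2 = 600, 7000   # rejected / examined after the project
alpha = 0.02

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled proportion and standard error under H0: p1 - p2 = 0
p_pool = (x1 + x2) / (n1 + n2)
q_pool = 1 - p_pool
se = math.sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / se

# Assumption: upper-tail test, since "decreased after" means p1 > p2
p_value = 1 - normal_cdf(z)
reject_h0 = p_value < alpha

print(f"pooled p = {p_pool:.5f}, se = {se:.6f}, z = {z:.4f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions z is a little above 3 and the p-value is well below 0.02, which would support the conclusion that the proportion of rejected components decreased after the project.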
(11 marks)
3. Researchers counted the number of breeding sea turtles on various sections of beach property in
Cancun every year. Nine randomly selected sections of beach were used. The following table shows
the number of counted sea turtles for two successive years (2015 and 2016).
At the 5% significance level, can it be concluded that the number of breeding sea turtles in 2015 is
different from the number of sea turtles in 2016? Formulate and test the appropriate hypotheses. Use
the critical value approach. Assume the population of paired differences has a normal distribution.
Clearly state and explain your conclusion within the context of the question.
H0: µ1 – µ2 = 0
H1: µ1 – µ2 ≠ 0 (since H1 has a ≠ sign it is a two tailed test)
Make a table with all the given numbers and calculate d and d²:
2015    2016    d = 2015 − 2016    d²
62      60       2                  4
54      58      −4                 16
36      31       5                 25
42      40       2                  4
61      62      −1                  1
76      70       6                 36
84      81       3                  9
75      72       3                  9
43      43       0                  0
∑d = 16, ∑d² = 104
Calculate d̄:
$$\bar{d} = \frac{\sum d}{n} = \frac{16}{9} = 1.7778$$
Calculate sd:
$$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{104 - \frac{16^2}{9}}{9 - 1}} = 3.07318148667$$
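Calculate the standard error of d̄ and the test statistic:
$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{3.0732}{\sqrt{9}} = 1.0244, \qquad t = \frac{\bar{d} - 0}{s_{\bar{d}}} = \frac{1.7778}{1.0244} = 1.7355$$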
Answer: Since the test statistic (1.7355) lies between the two critical values (±2.306 for df = 8),
it falls in the non-rejection region, so we fail to reject the null hypothesis (H0). We cannot conclude
that the number of breeding sea turtles in 2015 is different from the number of sea turtles in 2016.
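The paired-difference calculation can be reproduced from the nine pairs of counts in the table. This is a minimal sketch; the variable names are illustrative, and the critical value 2.306 is an assumption taken from a standard t table (df = 8, area 0.025 in each tail) rather than a value stated in the answer.

```python
import math

counts_2015 = [62, 54, 36, 42, 61, 76, 84, 75, 43]
counts_2016 = [60, 58, 31, 40, 62, 70, 81, 72, 43]

# Paired differences d = 2015 - 2016
d = [a - b for a, b in zip(counts_2015, counts_2016)]
n = len(d)

sum_d = sum(d)
sum_d2 = sum(x * x for x in d)

d_bar = sum_d / n
# Sample standard deviation of the differences (shortcut formula)
s_d = math.sqrt((sum_d2 - sum_d**2 / n) / (n - 1))
# Standard error of the mean difference
se_d_bar = s_d / math.sqrt(n)

t = (d_bar - 0) / se_d_bar

t_critical = 2.306                 # assumed table value, two-tailed, df = 8
reject_h0 = abs(t) > t_critical

print(f"sum d = {sum_d}, sum d^2 = {sum_d2}, d_bar = {d_bar:.4f}, "
      f"s_d = {s_d:.4f}, t = {t:.4f}, reject H0: {reject_h0}")
```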
(9 marks)
4. After introducing a new teaching curriculum, a teacher is interested in whether the grade distribution
in his course is significantly different than it was in previous years. The distribution of grades before
the introduction of the new curriculum was as follows:
Grade Percentage
A 15%
B 40%
C 25%
D 15%
F 5%
A random sample of 150 students taken after the introduction of the new curriculum provided the
following results:
Does the observed data contradict the hypothesis? Formulate and test the appropriate hypotheses at
the 1% significance level. Use the critical value approach. Clearly state and explain your conclusion
within the context of the problem.
Create chart:
Grade    O     p       E = np    O − E    (O − E)²    (O − E)²/E
B        65    0.40    60        5        25          0.416667
n = 150, ∑ = 6.416667
Find the critical value in the chi-square distribution table, using df = k − 1 = 5 − 1 = 4 (k is the
number of grade categories, not the sample size) and significance level α = 0.01.
Critical value: 13.277
Answer:
6.4167 < 13.277
Because the test statistic 6.4167 is less than the critical value 13.277, we fail to reject the null
hypothesis. We cannot conclude that the grade distribution after the new curriculum is different from
what it was in previous years.
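The structure of the goodness-of-fit calculation can be sketched in code. Only the observed count for grade B (65) appears in the worked chart, so this sketch computes the expected counts for all five grades but evaluates a single contribution; the total of 6.416667 and the critical value 13.277 are copied from the solution, not recomputed. The helper name `contribution` is hypothetical.

```python
n = 150
expected_share = {"A": 0.15, "B": 0.40, "C": 0.25, "D": 0.15, "F": 0.05}

# Expected count for each grade under H0: E = n * p
expected = {grade: n * p for grade, p in expected_share.items()}

def contribution(observed, expected_count):
    """One category's term in the chi-square statistic: (O - E)^2 / E."""
    return (observed - expected_count) ** 2 / expected_count

# Grade B is the only row whose observed count is shown in the chart
b_term = contribution(65, expected["B"])

chi_sq_total = 6.416667             # sum over all five grades, as reported in the chart
df = len(expected_share) - 1        # k - 1 categories
critical_value = 13.277             # table value used in the solution (df = 4, 1% level)
reject_h0 = chi_sq_total > critical_value

print(f"expected counts = {expected}")
print(f"B term = {b_term:.6f}, df = {df}, reject H0: {reject_h0}")
```

The test is right-tailed: H0 is rejected only when the statistic exceeds the critical value, which is not the case here.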
(10 marks)
5. A marketing firm that markets refrigerators is interested in studying consumer behavior in the
context of purchasing a particular brand of refrigerator. It wants to know, in particular, whether the
income-level of the consumers influences their choice of refrigerator brand. Currently there are three
brands available in the marketplace. Brand A is a premium brand, Brand B is a more moderately
priced brand, and Brand C is the most economical brand.
A representative stratified random sampling procedure was adopted covering the entire market using
income as the basis of selection. Income was classified into three categories: lower, middle and
high. A sample of 200 consumers participated in this study and produced the following data:
At the 5% significance level, can it be concluded that there is a relationship between income-level
and brand preference? Formulate and test the appropriate hypotheses. Use the critical value
approach. Clearly state and explain your conclusion within the context of the question.
Calculate df:
(R – 1) ∙ (C – 1) = (3 – 1) ∙ (3 – 1) = 2 ∙ 2 = 4
Income level         Brand counts (one column per brand)    Row Total
Lower (Observed)     20    30    50                         100
Lower (Expected)     25    35    40
Middle (Observed)    20    25    15                         60
Middle (Expected)    15    21    24
High (Observed)      10    15    15                         40
High (Expected)      10    14    16
Column Total         50    70    80
χ² = 10.1518
Answer: Because our χ² value (10.1518) is greater than the critical value (9.488 for df = 4 at the 5%
significance level), we reject the null hypothesis.
We conclude that there is a relationship between income level and brand preference.
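The expected counts and the test statistic can be recomputed from the observed counts in the table. This is a minimal sketch under the assumption that rows are income levels and columns are brands; the critical value 9.488 is an assumed standard chi-square table value (df = 4, 5% level).

```python
# Observed counts: rows = income level (lower, middle, high), columns = brands
observed = [
    [20, 30, 50],
    [20, 25, 15],
    [10, 15, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 9.488              # assumed table value, df = 4, alpha = 0.05
reject_h0 = chi_sq > critical_value

print(f"expected = {expected}")
print(f"chi-square = {chi_sq:.4f}, df = {df}, reject H0: {reject_h0}")
```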
(12 marks)
6. Three colors of warning lights can be used on an automobile instrument panel. A researcher was
interested to know whether users would have different reaction times depending on the color used in
the panel. To find out, she randomly assigned, from 15 participants in total, 5 participants to each
one of the 3 colors, and then measured their reaction times (in hundredths of a second, with decimal
points deleted). The following data were obtained:
Given that the necessary assumptions are satisfied, can it be concluded, at the 5% level of
significance, that not all mean reaction times to the colors are equal? Formulate and test the
appropriate hypotheses. Use the critical value approach. Clearly state and explain your conclusion
within the context of the question.
n = 15
k=3
df (num) = k – 1 = 3 – 1 = 2
df (den) = n – k = 15 – 3 = 12
α = 0.05
Look up critical value in 0.05 F distribution table, under 2 numerator and 12 denominator:
Critical value = 3.89
20 21 21
20 22 24
21 18 23
23 19 22
22 20 25
n1 = 5 n2 = 5 n3 = 5
∑x² (square each value and then add all together) = 6919
Answer: Because the F value of 5.10 is greater than the critical value of 3.89, we reject the null
hypothesis. We conclude that the mean reaction times to the three colors are not all equal.
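The F value can be reproduced from the data with the shortcut sums-of-squares formulas. This is a minimal sketch that assumes each column of the data listing is one color group; the variable names are illustrative, and the critical value 3.89 is the one looked up in the solution.

```python
# Reaction times, one list per color (columns of the data listing)
groups = [
    [20, 20, 21, 23, 22],
    [21, 22, 18, 19, 20],
    [21, 24, 23, 22, 25],
]

n = sum(len(g) for g in groups)   # total observations
k = len(groups)                   # number of treatments

sum_x = sum(sum(g) for g in groups)
sum_x2 = sum(x * x for g in groups for x in g)

# Sum over groups of T_i^2 / n_i, shared by SSB and SSW
between_term = sum(sum(g) ** 2 / len(g) for g in groups)

ssb = between_term - sum_x ** 2 / n   # between-samples sum of squares
ssw = sum_x2 - between_term           # within-samples sum of squares

msb = ssb / (k - 1)
msw = ssw / (n - k)
f_stat = msb / msw

critical_value = 3.89                 # F table value used in the solution (df 2 and 12)
reject_h0 = f_stat > critical_value

print(f"sum x^2 = {sum_x2}, SSB = {ssb:.4f}, SSW = {ssw:.4f}, "
      f"MSB = {msb:.4f}, MSW = {msw:.4f}, F = {f_stat:.4f}, reject H0: {reject_h0}")
```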
(6 marks)
Fill in the missing values in the table as indicated by the blanks (---).
k=5
k-1 = 4
SSW = MSW ∙ (n-k) = 75.400 ∙ 20 = 1508
MSB = SSB / (k-1) = 332.100 / 4 = 83.025
F = MSB/MSW = 83.025 / 75.400 = 1.10112732
(4 marks)
Using a significance level of α = 0.10, indicate what your null and alternative hypotheses would
be in this situation. Test these hypotheses, state your conclusion and explain its meaning in the
context of this problem.
H0: all k = 5 population means are equal
H1: not all of the population means are equal
α = 0.10
df(num) = k-1 = 4
df(den) = n-k = 20
Using df(num) and df(den), look up critical value in 0.10 F distribution table…
Critical value = 2.25
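The missing ANOVA table entries and the comparison with the critical value can be sketched together. This uses only the figures quoted in the working above (SSB = 332.100, MSW = 75.400, k = 5, n − k = 20, critical value 2.25); the variable names are illustrative, and the printed decision is a computed result rather than the original written conclusion.

```python
k = 5
df_between = k - 1        # 4
df_within = 20            # n - k, as given in the working

ssb = 332.100             # between-samples sum of squares (from the table)
msw = 75.400              # within-samples mean square (from the table)

# Fill in the blanks of the ANOVA table
ssw = msw * df_within     # SSW = MSW * (n - k)
msb = ssb / df_between    # MSB = SSB / (k - 1)
f_stat = msb / msw        # F = MSB / MSW

critical_value = 2.25     # F table value at alpha = 0.10 with df 4 and 20
reject_h0 = f_stat > critical_value

print(f"SSW = {ssw:.3f}, MSB = {msb:.3f}, F = {f_stat:.4f}, reject H0: {reject_h0}")
```

Because F is about 1.10, below 2.25, the comparison points to not rejecting H0 at the 10% level, so the data would not support a claim that the population means differ.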