Questions
2 1. When a statistic, like the median, is resistant/robust to outliers, it means that:
A. the statistic is not greatly influenced by the value of the outliers.
B. the statistic itself cannot be an outlier.
C. the statistic is greatly influenced by the value of the outliers.
D. it is impossible for the data to have any outliers.
E. None of these.
2 2. Based on a random sample of 78 second-year students, researchers constructed a 95% confidence interval
for the mean height of the students. The confidence interval was (1.67 m, 1.73 m). Suppose you have a
friend that is 1.7 m tall, which of the following statements are correct?
A. We are 95% confident that your friend has an above-average height.
B. We are 95% confident that your friend has a below-average height.
C. The friend is one of the 95% of students that have a height of between 1.67 and 1.73 m.
D. There is insufficient evidence to say that your friend has either a below-average or an
above-average height.
E. None of these.
2 3. Researchers conducted a study to compare the proportion of engineering students who get sufficient sleep
with that of medical students. Random samples of 35 engineering students and 45 medical students were
considered. Of the engineering students, 72% reported getting enough sleep. Of the medical students, 62%
reported getting enough sleep. The researchers wish to construct a confidence interval for the difference
between the proportions. What is the standard error for the interval?
A. 0.014
B. 0.092 (Error on the question paper; skip this question.)
C. 0.012
D. 0.016
E. None of these.
Solution:
$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.75(1-0.75)}{40} + \frac{0.65(1-0.65)}{60}} = 0.092$
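The standard error above can be checked numerically. The sketch below is only an illustration: the function name `se_diff_proportions` is hypothetical, and the second call uses the figures as printed in the question text (72%, 35, 62%, 45), which differ from the figures in the memo's solution and may explain the note that the question should be skipped.

```python
from math import sqrt

def se_diff_proportions(p1, n1, p2, n2):
    """Unpooled standard error for a confidence interval on p1 - p2."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Values used in the memo's solution
se_memo = se_diff_proportions(0.75, 40, 0.65, 60)

# Values as printed in the question text
se_question = se_diff_proportions(0.72, 35, 0.62, 45)

print(round(se_memo, 3))      # 0.092
print(round(se_question, 3))  # 0.105
```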
Solution: The samples were randomly taken and consist of less than 10% of the entire population, so we
can assume they are independent.
The number of successes is 0.8 × 40 = 32 ≥ 10, so the success condition is met.
The number of failures is 0.2 × 40 = 8 < 10, so the failure condition is NOT met.
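A minimal sketch of the success-failure check, using the values from the solution above (sample proportion 0.8, n = 40). The helper name `success_failure_check` and the default threshold argument are assumptions made for illustration.

```python
def success_failure_check(p_hat, n, threshold=10):
    """Return expected successes, expected failures, and whether both reach the threshold."""
    successes = p_hat * n
    failures = (1 - p_hat) * n
    both_met = successes >= threshold and failures >= threshold
    return successes, failures, both_met

successes, failures, both_met = success_failure_check(0.8, 40)

# About 32 successes and about 8 failures, so the condition fails on the failure count
print(round(successes), round(failures), both_met)
```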
Solution:
$SE = \sqrt{\frac{\text{var}_1}{n_1} + \frac{\text{var}_2}{n_2}} = \sqrt{\frac{31}{38} + \frac{39}{25}} = 1.541$
(Note that $sd = \sqrt{\text{var}}$)
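The same calculation as a short sketch. Because the sample variances are given directly, no squaring of standard deviations is needed. The function name `se_diff_means` is hypothetical.

```python
from math import sqrt

def se_diff_means(var1, n1, var2, n2):
    """Standard error of the difference between two sample means, from sample variances."""
    return sqrt(var1 / n1 + var2 / n2)

se = se_diff_means(31, 38, 39, 25)
print(round(se, 3))  # 1.541
```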
2 10. A study is designed to test whether the probability of a student’s hair being a certain colour (either
‘brunette’, ‘blond’, ‘red’, ‘black’ or ‘other’) is uniformly distributed. If a very large χ² test statistic is
obtained, this suggests that:
A. The probability of a student’s hair being a certain colour is not uniformly distributed.
B. The probability of a student’s hair being a certain colour is uniformly distributed.
C. We can calculate a confidence interval for the probability of a student’s hair being a certain colour.
D. We cannot conclude anything without knowing the sample size.
E. None of these.
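To see why a large statistic points away from the uniform model, the sketch below computes a goodness-of-fit statistic for five categories. The observed counts are hypothetical and chosen only for illustration; they do not come from the question paper.

```python
# Hypothetical counts for brunette, blond, red, black, other (not from the paper)
observed = [30, 25, 5, 30, 10]

n = sum(observed)
expected = n / len(observed)  # uniform model: the same expected count in every category

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

print(chi_sq, df)  # 27.5 4
```

With 4 degrees of freedom the tabulated critical value for an upper tail of 0.001 is 18.47, so a statistic of this size would be strong evidence against a uniform distribution.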
2 11. Estimate the variance of the population which produced the following sample data:
26 31 30 31 29 24 29 32
A. 7.42
B. 9.28
C. 2.73
D. 8.00
E. None of these.
Correct answer is A.
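A quick check of the answer, as a sketch using Python's standard `statistics` module. `variance` divides by n − 1, which matches the estimator on the formula sheet. The exact value is 52/7 ≈ 7.43; option A shows this as 7.42.

```python
from statistics import mean, variance

sample = [26, 31, 30, 31, 29, 24, 29, 32]

x_bar = mean(sample)           # sample mean
sample_var = variance(sample)  # sum of squared deviations divided by n - 1

print(x_bar, round(sample_var, 2))
```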
2 12. Given six books, how many sets of three books can be selected but always in a different order?
A. 120
B. 6
C. 504
D. 60480
E. None of these.
Solution:
$^6P_3 = \frac{6!}{(6-3)!} = 120$
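The count can be confirmed three ways: from the factorial formula, with `math.perm` (available from Python 3.8), and by enumerating the ordered selections. The variable names are illustrative only.

```python
from itertools import permutations
from math import factorial, perm

n_books, k = 6, 3

by_formula = factorial(n_books) // factorial(n_books - k)
by_builtin = perm(n_books, k)
by_enumeration = sum(1 for _ in permutations(range(n_books), k))

print(by_formula, by_builtin, by_enumeration)  # 120 120 120
```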
Questions
1. The sugar content of seven 1 kg boxes of breakfast cereal was determined. The measurements (in grams)
are: 260, 310, 300, 310, 290, 240, 290
Previous studies suggest that the sugar content used to be 260 units. The theory is that the sugar content
may have increased due to a change of supplier. A statistical test is to be performed to test this hypothesis.
2 (a) The null and alternative hypotheses for the test are:
A. H0 : µ = 260
HA : µ > 260
B. H0 : µ = 260
HA : µ ≠ 260
C. H0 : µ ≠ 260
HA : µ > 260
D. H0 : µ < 260
HA : µ > 260
E. H0 : x̄ = 260
HA : x̄ > 260
F. H0 : x̄ = 260
HA : x̄ ≠ 260
2 (b) What are the point estimators for the parameters of the normal distribution?
Solution:
$\bar{x} = \frac{260 + 310 + 300 + 310 + 290 + 240 + 290}{7} = 285.714$
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{695.238} = 26.367$
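The point estimates can be reproduced with the standard library, using the seven values that appear in the solution. The last step is an assumption: the memo excerpt does not show a test statistic, so the t value here is computed from the formula-sheet expression t = (x̄ − µ0) / (s / √n) with µ0 = 260, purely as an illustration.

```python
from math import sqrt
from statistics import mean, stdev

sugar = [260, 310, 300, 310, 290, 240, 290]  # values taken from the solution
mu_0 = 260                                   # hypothesised mean from the question

n = len(sugar)
x_bar = mean(sugar)  # point estimate of the mean
s = stdev(sugar)     # point estimate of the standard deviation (divides by n - 1)

# Assumed follow-on step, not shown in the memo: one-sample t statistic, df = n - 1
t_stat = (x_bar - mu_0) / (s / sqrt(n))

print(round(x_bar, 3), round(s, 3))  # 285.714 26.367
print(round(t_stat, 2), n - 1)
```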
1 (a) What type of test is appropriate for evaluating if there is an association between coffee intake and
depression?
A. Normal test of independence
B. T-test
C. Chi-squared test of independence
D. Paired test
E. None of these
2 (b) Write the hypotheses for the test you identified in part (a).
A. H0 : Caffeinated coffee consumption and depression in women are independent.
B. HA : Caffeinated coffee consumption and clinical depression in women are dependent/associated.
C. H0 : Caffeinated coffee consumption and depression in women are dependent/associated.
D. HA : Caffeinated coffee consumption and clinical depression are independent in women.
E. H0 : Caffeinated coffee consumption and depression in women are not independent.
F. HA : Caffeinated coffee consumption and clinical depression are not dependent/associated in women.
G. H0 : Caffeinated coffee consumption and depression in women are related.
H. HA : Caffeinated coffee consumption and depression in women are unrelated.
2 (c) Calculate the overall proportion of women who do and do not suffer from depression.
Solution:
Depression: $\frac{2\,607}{50\,739} = 0.0514$
No depression: $1 - 0.0514 = 0.9486$
2 (d) Identify the expected count for those that consume one cup per day who suffer from depression.
Solution:
$E = \frac{2\,607 \times 17\,234}{50\,739} = 885.493 \approx 885$
2 (e) Calculate the contribution of this cell (in (d)) to the test statistic.
Solution:
$\frac{(O-E)^2}{E} = \frac{(905-885)^2}{885} = 0.452$
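A sketch of the expected count and the cell contribution, using the totals that appear in the solutions. The variable names, and the reading of 2 607 as the depression row total and 17 234 as the one-cup-per-day column total, are inferred from the working rather than from a table in this excerpt.

```python
row_total = 2607     # women with depression (inferred from the solution)
col_total = 17234    # women drinking one cup per day (inferred from the solution)
grand_total = 50739  # all women in the study
observed = 905       # observed count in this cell

# Expected count under independence: row total x column total / grand total
expected = row_total * col_total / grand_total

# The memo rounds the expected count to 885 before computing the contribution
expected_rounded = round(expected)
contribution_rounded = (observed - expected_rounded) ** 2 / expected_rounded

# Using the unrounded expected count gives a somewhat smaller contribution
contribution_exact = (observed - expected) ** 2 / expected

print(round(expected, 3))              # 885.493
print(round(contribution_rounded, 3))  # 0.452
```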
Solution:
df = (R − 1) × (C − 1) = 1 × 4 = 4
p-value < 0.001
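The p-value bound is read from the chi-square table by finding the smallest tabulated upper-tail probability whose critical value the statistic exceeds. The sketch below copies the df = 4 row of the table. The test statistic of 20.0 is hypothetical, since its value does not appear in this excerpt, and the 2 × 5 table shape is inferred from the 1 × 4 in the solution.

```python
UPPER_TAIL = [0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.001]
CRITICAL_DF4 = [4.88, 5.99, 7.78, 9.49, 11.67, 13.28, 14.86, 18.47]  # df = 4 row

def p_value_bound(chi_sq, critical, tails):
    """Smallest tabulated tail probability whose critical value chi_sq exceeds, else None."""
    bound = None
    for crit, tail in zip(critical, tails):
        if chi_sq > crit:
            bound = tail  # p-value is below this tail probability
    return bound

rows, cols = 2, 5  # inferred table shape
df = (rows - 1) * (cols - 1)

hypothetical_stat = 20.0  # illustration only, not from the memo
print(df, p_value_bound(hypothetical_stat, CRITICAL_DF4, UPPER_TAIL))  # 4 0.001
```

A returned value of 0.001 corresponds to the statement p-value < 0.001. A result of `None` would mean the p-value is above 0.3.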
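$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$    $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$    $\text{var} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

$\Pr(A \text{ or } B) = \Pr(A) + \Pr(B) - \Pr(A \text{ and } B)$    $\Pr(A \mid B) = \frac{\Pr(A \text{ and } B)}{\Pr(B)}$

$z = \frac{x - \mu}{\sigma}$    $x = \mu + z\sigma$

$Q_1 - 1.5 \times IQR$, $Q_3 + 1.5 \times IQR$

$\hat{p} \pm z_{score} \times SE_{\hat{p}}$    $SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$    $z = \frac{\hat{p} - p_0}{SE_0}$    $SE_0 = \sqrt{\frac{p_0(1-p_0)}{n}}$

$\bar{x} \pm t_{score} \times SE_{\bar{x}}$    $t = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$    $SE_{\bar{x}} = \frac{s}{\sqrt{n}}$    $df = n - 1$

$(\hat{p}_1 - \hat{p}_2) \pm z_{score} \times SE$    $SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_0}$    $SE_0 = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$    $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$

$(\bar{x}_1 - \bar{x}_2) \pm t_{score} \times SE$    $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}$    $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$    $df = \min(n_1 - 1, n_2 - 1)$

$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$    $\text{expected} = \frac{\text{row} \times \text{column}}{\text{total}}$    $df = (r - 1) \times (c - 1)$

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$    $\hat{\beta}_1 = r\,\frac{s_y}{s_x}$    $\text{residual} = y - \hat{y}$    $s = \sqrt{\frac{\sum (y - \hat{y})^2}{n-2}}$

$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$    $t = \frac{\hat{\beta}_1 - 0}{SE_{\hat{\beta}_1}}$    $\hat{\beta}_1 \pm t_{score} \times SE_{\hat{\beta}_1}$    $df = n - 2$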
[Figures omitted: panels labelled "negative Z" and "positive Z"; a plot with axis ticks 0, 5, 10, 15]
      Upper tail
 df     0.3     0.2     0.1    0.05    0.02    0.01   0.005   0.001
  1    1.07    1.64    2.71    3.84    5.41    6.63    7.88   10.83
  2    2.41    3.22    4.61    5.99    7.82    9.21   10.60   13.82
  3    3.66    4.64    6.25    7.81    9.84   11.34   12.84   16.27
  4    4.88    5.99    7.78    9.49   11.67   13.28   14.86   18.47
  5    6.06    7.29    9.24   11.07   13.39   15.09   16.75   20.52
  6    7.23    8.56   10.64   12.59   15.03   16.81   18.55   22.46
  7    8.38    9.80   12.02   14.07   16.62   18.48   20.28   24.32
  8    9.52   11.03   13.36   15.51   18.17   20.09   21.95   26.12
  9   10.66   12.24   14.68   16.92   19.68   21.67   23.59   27.88
 10   11.78   13.44   15.99   18.31   21.16   23.21   25.19   29.59
 11   12.90   14.63   17.28   19.68   22.62   24.72   26.76   31.26
 12   14.01   15.81   18.55   21.03   24.05   26.22   28.30   32.91
 13   15.12   16.98   19.81   22.36   25.47   27.69   29.82   34.53
 14   16.22   18.15   21.06   23.68   26.87   29.14   31.32   36.12
 15   17.32   19.31   22.31   25.00   28.26   30.58   32.80   37.70
 16   18.42   20.47   23.54   26.30   29.63   32.00   34.27   39.25
 17   19.51   21.61   24.77   27.59   31.00   33.41   35.72   40.79
 18   20.60   22.76   25.99   28.87   32.35   34.81   37.16   42.31
 19   21.69   23.90   27.20   30.14   33.69   36.19   38.58   43.82
 20   22.77   25.04   28.41   31.41   35.02   37.57   40.00   45.31
 25   28.17   30.68   34.38   37.65   41.57   44.31   46.93   52.62
 30   33.53   36.25   40.26   43.77   47.96   50.89   53.67   59.70
 40   44.16   47.27   51.81   55.76   60.44   63.69   66.77   73.40
 50   54.72   58.16   63.17   67.50   72.61   76.15   79.49   86.66