UNIVERSITY OF PRETORIA
FACULTY OF NATURAL AND AGRICULTURAL SCIENCE
DEPARTMENT OF STATISTICS
STUDENT NUMBER
SIGNATURE
Please note: Students should take note that, if found guilty of academic misconduct or non-compliance with these rules, a student
could, among other penalties, forfeit his/her credits for a module and/or be suspended from the University for a period that
could range from one year to permanent suspension. Such a student’s record will be blocked for the period of suspension
and he/she will not be entitled to a certificate of good conduct from the University during this period. Students should also
take note that, if found guilty of academic misconduct, it may negatively influence their admission to other universities
and/or registration with professional councils. Academic misconduct is indicated on all certificates of conduct provided to
students by the University.
QUESTION 1 [8]
b. Give the distribution of $\bar{X}$ and $\frac{4S^2}{\sigma^2}$ respectively. (2)
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{5}\right)$
$\frac{4S^2}{\sigma^2} \sim \chi^2(4)$
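c. Show that $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(4)$. (3)
$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
$\frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{\frac{4S^2}{\sigma^2} / 4}} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(4)$
$\bar{X}$ and $S^2$ are independent.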
QUESTION 2 [5]
$\sum_{i=1}^{50} Y_i \sim N(n\mu, n\sigma^2)$ where $n\mu = np = 12.5$ and $n\sigma^2 = np(1-p) = 9.375$
$P\left(\sum_{i=1}^{50} Y_i \le 12\right) = P\left(Z \le \frac{12.5 - 12.5}{\sqrt{9.375}}\right) = P(Z \le 0) = 0.5$
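The calculation above can be checked numerically. The sketch below is only an illustration: it reads the 12.5 in the numerator as the continuity-corrected bound 12 + 0.5 (an interpretation, not stated in the memo) and infers p = 0.25 from np = 12.5 with n = 50.

```python
from statistics import NormalDist

n = 50
mean = 12.5        # n*p, as given in the solution
variance = 9.375   # n*p*(1-p), as given in the solution
p = mean / n       # implied success probability (inferred, not stated)

# Continuity correction (assumed reading): P(sum <= 12) ~ P(normal <= 12.5)
corrected_bound = 12 + 0.5
z = (corrected_bound - mean) / variance ** 0.5
prob = NormalDist().cdf(z)
print(p, z, prob)
```

Because the corrected bound coincides with the mean, the standardised value is 0 and the probability is one half.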
QUESTION 3 [6]
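$E(X_i) = 2$
$E(\hat{\theta}_2) = \frac{9}{30} \cdot 3E(X_i) + \frac{1}{30} \cdot 3E(X_i) = \frac{9}{10}(2) + \frac{1}{10}(2) = 2$
$V(X_i) = 2$
$MSE(\hat{\theta}_2) = V(\hat{\theta}_2) = \left(\frac{9}{30}\right)^2 3V(X_i) + \left(\frac{1}{30}\right)^2 3V(X_i)$
$= \frac{81}{900} \times 3 \times 2 + \frac{1}{900} \times 3 \times 2 = \frac{41}{75} = 0.547$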
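The arithmetic can be verified with exact fractions. This is a sketch under an assumption read off the working: the estimator places weight 9/30 on each of three observations and weight 1/30 on each of three others (the estimator's definition itself is not reproduced in this memo).

```python
from fractions import Fraction

mean_x = 2   # E(X_i), from the solution
var_x = 2    # V(X_i), from the solution
w_heavy, w_light = Fraction(9, 30), Fraction(1, 30)  # assumed weights
group_size = 3                                       # observations per weight

expectation = w_heavy * group_size * mean_x + w_light * group_size * mean_x
mse = w_heavy**2 * group_size * var_x + w_light**2 * group_size * var_x
print(expectation, mse, round(float(mse), 3))
```

Since the expectation equals 2, the estimator is unbiased here and its mean squared error reduces to its variance.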
QUESTION 4 [5]
explain why this information can be used to derive a 100(1 − 𝛼)% confidence
interval for 𝛽. (2)
The variable $\frac{2\sum_{i=1}^{n} Y_i}{\beta}$ depends functionally on $Y_1, Y_2, \ldots, Y_n$ and $\beta$
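$\therefore \chi^2_{0.95,16} \le \frac{2 \times 5.39}{\beta} \le \chi^2_{0.05,16}$
$7.962 \le \frac{10.78}{\beta} \le 26.296$
$\frac{1}{7.962} \ge \frac{\beta}{10.78} \ge \frac{1}{26.296}$
$\frac{10.78}{7.962} \ge \beta \ge \frac{10.78}{26.296}$
$\Rightarrow 0.410 \le \beta \le 1.354$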
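The inversion of the pivot can be reproduced in a few lines. The sketch below takes the sum 5.39 and both chi-square critical values directly from the working above; the variable names are illustrative only.

```python
sum_y = 5.39          # sum of the Y_i, as used in the solution
chi2_lower = 7.962    # chi-square(16) lower critical value, from the solution
chi2_upper = 26.296   # chi-square(16) upper critical value, from the solution

pivot_numerator = 2 * sum_y
# Dividing by the larger critical value gives the lower limit for beta.
beta_low = pivot_numerator / chi2_upper
beta_high = pivot_numerator / chi2_lower
print(round(beta_low, 3), round(beta_high, 3))
```

The limits swap places because β sits in the denominator of the pivot.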
QUESTION 5 [5]
a. Let 𝜃 denote a population parameter with estimator 𝜃̂. State the assumptions
required to construct a large-sample confidence interval for 𝜃. (2)
$0.27 \pm z_{0.025} \times \sqrt{\frac{0.27}{400}}$
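As a rough illustration, the expression above can be evaluated exactly as written. The memo does not say what the 0.27 under the square root represents, so the sketch treats it simply as the estimated variance numerator.

```python
from math import sqrt
from statistics import NormalDist

estimate = 0.27
variance_term = 0.27 / 400          # quantity under the square root, as written
z = NormalDist().inv_cdf(0.975)     # z_{0.025}, approximately 1.96

half_width = z * sqrt(variance_term)
lower, upper = estimate - half_width, estimate + half_width
print(round(lower, 4), round(upper, 4))
```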
QUESTION 6 [12]
QUESTION 7 [7]
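$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > 1.96 \;\therefore\; \bar{x} > z\frac{\sigma}{\sqrt{n}} + \mu_0 = 1.96 \times \frac{12}{\sqrt{15}} + 50 = 56.073$
$\beta(\mu') = \Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) = \Phi\left(1.96 + \frac{50 - 60}{12/\sqrt{15}}\right) = \Phi(-1.27) = 0.102$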
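Both quantities can be recomputed from the values in the working. This is a minimal sketch using the same critical value 1.96; the small difference from 0.102 arises because the memo rounds the argument of Φ to −1.27 before looking it up.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_alt = 50, 60     # hypothesised mean and the alternative mu'
sigma, n = 12, 15
z_crit = 1.96            # critical value used in the solution

se = sigma / sqrt(n)
x_crit = mu0 + z_crit * se                               # boundary of the rejection region
beta = NormalDist().cdf(z_crit + (mu0 - mu_alt) / se)    # P(type II error) at mu'
print(round(x_crit, 3), round(beta, 3))
```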
[Figure: curves centred at 50 and 60 on the 𝜇 axis, with tick marks at 55 and 56 and regions labelled I, II and III]
I. Area I (black): 𝛼
II. Area II (grey): 𝛽
III. Area to the right of line III (under the curve centered at 50): 𝑃-value
QUESTION 8 [8]
Literature from 10 years ago indicates that children (ages 2-11) spend an average of
21 hours 30 minutes watching television per week while teens (ages 12-17) spend
an average of 20 hours 40 minutes. To get an indication of the current trend, a
random sample from each of these populations was drawn. The sample statistics are
given below. Answer the questions that follow. Use 𝛼 = 0.10.
Reject $H_0$ if $f < \frac{1}{F_{9,8,0.05}} = \frac{1}{3.39} = 0.295$ or if $f > F_{8,9,0.05} = 3.23$
$f = \frac{s_1^2}{s_2^2} = \frac{16.4}{18.2} = 0.901$
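The two-sided variance-ratio test can be sketched as follows. The sample variances and the two F table values are taken from the working above; the decision itself is computed here rather than quoted from the memo.

```python
s1_sq, s2_sq = 16.4, 18.2   # sample variances from the solution
f_upper = 3.23              # F_{8,9,0.05}, table value from the solution
f_lower = 1 / 3.39          # 1 / F_{9,8,0.05}, table value from the solution

f = s1_sq / s2_sq
reject = f < f_lower or f > f_upper
print(round(f_lower, 3), round(f, 3), reject)
```

With these numbers the observed ratio lies between the two critical values, so the hypothesis of equal variances would not be rejected at the 10% level.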
b. Assume that the variances in the hours watching television for children and
teens are equal. It must be tested whether there is sufficient evidence to
conclude a difference in average television watching times between the two
groups. (4)
$T = \frac{\bar{X} - \bar{Y}}{\sqrt{S_p^2\left(\frac{1}{m} + \frac{1}{n}\right)}}$
(ii) Give the distribution and parameter value(s) of the test statistic under
𝐻0 : 𝜇1 = 𝜇2 .
$T \sim t(m + n - 2)$, with $m + n - 2 = 17$
(iii) Suppose that the 𝑃-value is equal to 0.05. Give a decision and
conclusion with a reason for your answer.
QUESTION 9 [6]
Consider the analysis of variance for a single factor with 3 levels. The values 𝑎, 𝑏, 𝑐
and 𝑑 represent measurements and 𝐽 = 4.
Level Observations
I 𝒂 𝒃 𝒄 𝒅
II 𝒂 𝒃 𝒄 𝒅
III 𝒂 𝒃 𝒄 𝒅
𝑋̅𝑖. = (𝑎 + 𝑏 + 𝑐 + 𝑑)/4
Level Observations
I 𝒂 𝒂 𝒂 𝒂
II 𝒃 𝒃 𝒃 𝒃
III 𝒄 𝒄 𝒄 𝒄
3 4
c. Consider 𝐻0 : 𝜇𝐼 = 𝜇𝐼𝐼 = 𝜇𝐼𝐼𝐼 . Suppose the test statistic value is equal to 20. Give
the rejection region and a decision (𝛼 = 0.05). (2)
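One way to obtain the critical value without tables is sketched below. It assumes the usual one-way ANOVA degrees of freedom (levels − 1 and total observations − levels) and relies on a closed form for the F distribution that holds only when the numerator degrees of freedom equal 2; neither is quoted from the memo.

```python
levels, per_level = 3, 4     # 3 factor levels, J = 4 observations each
alpha = 0.05
f_observed = 20              # test statistic value given in the question

df1 = levels - 1
df2 = levels * per_level - levels

# For df1 == 2 only: P(F > x) = (1 + 2x/df2) ** (-df2/2), solved for x.
critical = (df2 / 2) * (alpha ** (-2 / df2) - 1)
reject = f_observed > critical
print(df1, df2, round(critical, 2), reject)
```

Under these assumptions the rejection region is F above roughly 4.26, and a statistic of 20 falls well inside it.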
QUESTION 10 [5]
Ranks (of 6 pairs of observations) | $\mu_1 - \mu_2 = \Delta_0$ | $s_+$ = sum of ranks of positive differences | Exact distribution
Frequencies ($n = 30$) | $p_1 = p_2 = p_3$ | $\chi^2 = \sum_{i=1}^{3} \frac{(n_i - 10)^2}{10}$ | $\chi^2(2)$
Frequencies (samples of size $m$ and $n$ respectively) | $p_1 = p_2$ | $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)}}$ | $N(0,1)$
Frequencies (with all expected frequencies greater than 5) | Variable $X$ and variable $Y$ are independent | $\chi^2 = \sum_{i=1}^{3}\sum_{j=1}^{4} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}$ | $\chi^2(6)$
QUESTION 11 [13]
A company leases animals, which have been trained to perform certain tasks, for
use in the movie industry. The table below gives the number of tasks that each of
nine monkeys in a random sample can perform, along with the number of years the
monkeys have been working with the company.
The random variable 𝑌𝑖 denotes the number of tasks and 𝑋𝑖 the number of years for
each monkey 𝑖 = 1, … , 9.
$\hat{Y} = 16.452 + 1.405x$
$\hat{\sigma}^2 = \frac{SSE}{n - 2} = \frac{232.4739}{7} = 33.211$
$r = 0.6887$
$r^2 = 0.4743$
47.43% of the variation in number of tasks can be explained by the linear regression
model.
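The two summary values can be reproduced from the quantities quoted in the solution. A minimal sketch, assuming only the SSE, sample size and correlation shown above (the raw data table is not reproduced in this memo):

```python
sse = 232.4739   # error sum of squares, from the solution
n = 9            # nine monkeys in the sample
r = 0.6887       # sample correlation coefficient, from the solution

sigma2_hat = sse / (n - 2)   # estimate of the error variance
r_squared = r ** 2           # coefficient of determination
print(round(sigma2_hat, 3), round(r_squared, 4))
```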
d. Perform a statistical test to determine whether the population correlation
coefficient is significantly different from zero at a 5% level. (3)
𝐻0 : 𝜌 = 0
𝐻𝑎 : 𝜌 ≠ 0
Test statistic: $t = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.6887\sqrt{7}}{\sqrt{1 - 0.4743}} = 2.513$
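The correlation test statistic follows a t distribution with n − 2 degrees of freedom under the null hypothesis and has the form r√(n − 2) / √(1 − r²). The sketch below evaluates it for the values in this question; the critical value 2.365 is the standard two-sided 5% table value for 7 degrees of freedom and is supplied here as an assumption, since the memo does not quote it.

```python
from math import sqrt

r = 0.6887   # sample correlation coefficient, from the solution
n = 9        # sample size

t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)   # about 2.51
t_crit = 2.365                                # t_{0.025}(7), assumed table value
reject = abs(t_stat) > t_crit
print(round(t_stat, 3), reject)
```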
e. Using the fitted regression line, calculate the residual of SuSu. (2)
f. Suppose that more data became available and the following standardised
residual plot was obtained.
[Standardised residual plot; horizontal axis: Tasks (𝑥), from 0 to 25]
Based on the residual plot, comment on the adequacy of the fitted linear
regression model. (2)
QUESTION 12 [4]
Let $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ denote the estimate of the true simple linear regression line, with
$\bar{y} = \frac{1}{n}\sum y_i$ and $\bar{x} = \frac{1}{n}\sum x_i$. Show that the square of the sample correlation coefficient
gives the value of the coefficient of determination that would result from fitting the
simple linear regression model.
Hint: Start by assuming that $\hat{y}_i - \bar{y} = \hat{\beta}_1(x_i - \bar{x})$. (4)
Therefore
$(r)^2 = \frac{SSR}{SST}$
But the coefficient of determination, $r^2$, is defined as $1 - \frac{SSE}{SST} = \frac{SSR}{SST}$, thus completing
the proof.
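The identity can also be checked numerically. The sketch below fits a least-squares line to a small made-up data set (the x and y values are hypothetical and chosen only for illustration) and compares the squared correlation with SSR/SST and 1 − SSE/SST.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.4]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)              # SST
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx                   # least-squares slope
b0 = y_bar - b1 * x_bar          # least-squares intercept
fitted = [b0 + b1 * xi for xi in x]

ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r = sxy / (sxx * syy) ** 0.5     # sample correlation coefficient

print(r ** 2, ssr / syy, 1 - sse / syy)
```

All three printed values agree up to floating-point error, which is what the proof asserts in general.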
QUESTION 13 [6]
a. Retired senior executives who had returned to work were surveyed in Gauteng. It
was found that after returning to work, 38% were self-employed, 32% were freelancing
or consulting, 23% were employed by another organization and 7% had
formed their own companies. To see if these percentages are consistent with
those in the Western Cape, a local researcher surveyed 300 retired executives
who had returned to work in the Western Cape.
(i) If the Gauteng percentages also hold in the Western Cape, how many
executives are expected to have formed their own companies in the Western
Cape? (1)
0.07(300) = 21
(ii) The value of the test statistic for the above test is 3.294. Calculate the
𝑃-value. (2)
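A possible way to compute this tail probability without tables is sketched below. It assumes 3 degrees of freedom (four employment categories, no estimated parameters), which the memo does not state explicitly, and uses a closed form for the chi-square upper tail that is valid only for 3 degrees of freedom.

```python
from math import erfc, exp, pi, sqrt

def chi2_upper_tail_df3(x):
    """P(chi-square with 3 df > x); this closed form holds for 3 df only."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

statistic = 3.294                      # test statistic value from the question
p_value = chi2_upper_tail_df3(statistic)
print(round(p_value, 3))
```

Under that assumption the tail probability comes out at roughly 0.35.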
b. The following data are a summary of the goal scoring record of 436 soccer
players who played in international matches during a specific period of time:
Number of goals scored                     0      1      2     3     4    5    6    7    8
Number of players                          300    67     33    11    6    9    6    2    2
Expected number of players under $H_0$     218.7  150.9  52.1  12.0  2.1  0.3  0.0  0.0  0.0
Consider
𝐻0 : The goal scoring record of soccer players has a Poisson distribution.
(i) Only calculate the contribution of 2 goals scored to the test statistic value. (1)
$\frac{(33 - 52.1)^2}{52.1} = 7.002$
(ii) Give the rejection region (in terms of the critical value and significance level
0.05) for this test. (2)
Reject $H_0$ if $\chi^2 > \chi^2_{4-1-1;\,0.05} = 5.991$
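Both parts of this question can be checked with a short calculation. The sketch takes the 4 classes and 1 estimated parameter from the degrees of freedom written in the rejection region, and uses a closed form for the chi-square critical value that applies only when there are 2 degrees of freedom.

```python
from math import log

observed, expected = 33, 52.1     # cell for 2 goals scored
contribution = (observed - expected) ** 2 / expected

classes, estimated_params = 4, 1  # as implied by the subscript 4 - 1 - 1
df = classes - 1 - estimated_params
alpha = 0.05

# For df == 2 only: P(chi-square > x) = exp(-x / 2), solved for x.
critical = -2 * log(alpha)
print(round(contribution, 3), df, round(critical, 3))
```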
QUESTION 14 [4]
In two consecutive weeks 120 patients with the same disease are subjected to two
medical treatments. The observed contingency table below shows the reactions to
the treatments:
                 Treatment B
Treatment A      Favourable    Unfavourable
Favourable       23            36
Unfavourable     32            29
27.042
d. Suppose the test statistic value is equal to 2.194. Draw a conclusion and give
a reason for your answer. (1)
Since $\chi^2 = 2.194 < 6.637$, $H_0$ is not rejected. The sample evidence suggests that
the degrees of success for the two treatments are independent.
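The expected frequencies and the test statistic can be recomputed from the observed table. A minimal sketch of the usual chi-square independence calculation (row total × column total ÷ grand total for each cell); variable names are illustrative.

```python
observed = [[23, 36],   # Treatment A favourable:   B favourable, B unfavourable
            [32, 29]]   # Treatment A unfavourable: B favourable, B unfavourable

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence for each cell.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)
print(round(expected[0][0], 3), round(chi2, 3))
```

The first expected frequency matches the 27.042 shown above, and the statistic matches the 2.194 used in part d.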
QUESTION 15 [6]
In a large department store, the owner wishes to see whether the number of
shoplifting incidents per day will change if the number of uniformed security officers
is doubled. A sample of 7 days before security is increased and 7 days after the
increase shows the number of shoplifting incidents.
             Number of shoplifting incidents
Day          Before   After   Difference   Rank of absolute differences
Monday       7        5       2            3.5
Tuesday      2        3       -1           1.5
Wednesday    3        4       -1           1.5
Thursday     6        3       3            5
Friday       5        1       4            6
Saturday     8        6       2            3.5
Sunday       12       4       8            7
$H_0: \mu_D = 0$
$H_a: \mu_D > 0$
Reject $H_0$ if $s_+ \ge c_1 = 24$
$s_+ = \frac{7(8)}{2} - 3 = 25$
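The signed-rank statistic can be recomputed from the table. The sketch below ranks the absolute differences with average ranks for ties; it assumes there are no zero differences, which holds for this data set, and the helper name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1   # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

before = [7, 2, 3, 6, 5, 8, 12]   # Monday to Sunday
after = [5, 3, 4, 3, 1, 6, 4]

diffs = [b - a for b, a in zip(before, after)]
ranks = average_ranks([abs(d) for d in diffs])
s_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
n = len(diffs)
s_minus = n * (n + 1) / 2 - s_plus   # the ranks sum to n(n+1)/2
print(ranks, s_plus, s_minus)
```

The positive-rank sum agrees with the shortcut used above: the total 7(8)/2 = 28 minus the two negative ranks of 1.5 each.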