Copyright reserved

UNIVERSITY OF PRETORIA
FACULTY OF NATURAL AND AGRICULTURAL SCIENCE
DEPARTMENT OF STATISTICS

MATHEMATICAL STATISTICS 121

EXAMINATION: 6 NOVEMBER 2019


ALLOWED TOTAL TIME: 180 MINUTES TOTAL MARKS: 100
INTERNAL EXAMINERS : Ms. J. Mazarura
Dr. H.F. Strydom
EXTERNAL EXAMINER: Dr. P.J. van Staden
MEMO
SURNAME AND INITIALS

STUDENT NUMBER
SIGNATURE UWN

QUESTION MARKS MARKS AWARDED


1 8
2 5
3 6
4 5
5 5
6 12
7 7
8 8
9 6
10 5
11 13
12 4
13 6
14 4
15 6
TOTAL 100
• Answer all questions in the spaces provided.
• Write down the formulae you use.
• Marks will be awarded for notation. Show all steps and give reasons.
• Give answers correct to 3 decimal places where necessary.
• Electronic resources such as smart phones, tablets and other mobile devices
may not be used and must be switched off.
TEST AND EXAMINATION INSTRUCTIONS
1. Students are obliged to identify themselves positively by means of a valid student card when writing a test or examination. No access to
the test or examination venue will be allowed without a valid student card.
2. No person may pretend to be a registered student and/or write a test or examination on behalf of a student.
3. No student may enter the test or examination venue later than half an hour after commencement of a test or examination session. No
student may leave the test or examination venue earlier than half an hour after commencement of a test or examination session. In the
case of computer-based assessment, a student may not enter the venue after the punctual commencement of the test or examination
session.
4. Students must obey all the instructions given by an invigilator immediately and strictly.
5. Except as indicated in paragraph 6, students may not bring into the test or examination venue or have in their possession any of the
following:
• bags (satchels)
• handbags
• pencil cases or bags
• unauthorised apparatus
• books
• electronic means of communication or similar devices
• cellular phone watches (smart watches) or cellular phones (cellular phones may not be used as a substitute for calculators)
• any piece of paper, no matter how small
• notes of any nature whatsoever.
Mere possession of any of the aforementioned, irrespective of whether the student acted intentionally or negligently or innocently, is
regarded as a serious transgression of the rules and subsequently as serious academic misconduct. It remains the student’s
responsibility to verify, prior to the commencement of a test or examination, that none of the aforementioned items are in his or her
possession.
6. Satchels (book bags) and handbags may be kept with a student, provided that such bags are closed and placed under the student’s
chair. All books and study material must be stowed away in the closed bag. The student may not open or handle such bag at all during
the test or examination session. If study material and/or notes (belonging to a student), are found under the seat or desk, or are visible to
the student to such an extent that they could possibly assist the student, such student shall be regarded as being in possession of
prohibited, unauthorised material. Electronic devices such as cell phones and tablets must be switched off and placed inside the bag,
which is to be closed and to be kept under the student’s chair. In the absence of a bag a student must switch off his or her cell phone or
tablet or any other device and place it on the floor under his/her chair and out of the student’s line of sight. These devices may not be
kept on the person of the student and may not be switched on or handled by the student during the test or examination session.
7. Students are responsible for providing their own writing materials, apparatus and stationery in accordance with the requirements and
specifications or instructions set by the lecturer concerned. Mutual exchange of such items will not be allowed.
8. Wearing of caps, hats or beanies during examinations and tests is prohibited and students may be requested to remove such headgear.
An exception is made in the case of religious headgear.
9. It is important that the surname, full names and signature of the student are provided in the relevant space on the test or examination
answer script. If so preferred by the student, this information may be treated as confidential by folding and sealing the top portion of the
examination or test answer script. The covered portion may only be opened by the examiner if the student number is incorrect or illegible.
All scripts must be completed in indelible ink. Scripts completed in pencil or erasable ink will not be marked and the writer (student) will
not qualify for an additional evaluation opportunity (test/examination).
10. Once the invigilator has announced the commencement of the test or examination, all conversation or any other form of communication
between students must cease. During the course of the test or examination no communication of any nature whatsoever may take place
between students.
11. No student may assist or attempt to assist another student, or obtain help, or attempt to obtain help from another student during a test or
examination.
12. Students may not act dishonestly in any way whatsoever. Dishonest conduct includes, but is not limited to:
• dishonesty with regard to any assessment, whether it be a test or an examination, or with regard to the completion and/or submission
of any other academic task or assignment;
• plagiarism (using the work of others as though it is your own without acknowledging the source);
• the submission of work by a student with a view to assessment when the work in question is that of someone else either in full or in
part, or where it is the result of collusion between the student and another person or persons. The exception is group work as
determined by the lecturer concerned.
13. Writing on any paper other than that provided for test or examination purposes is strictly prohibited. Students may also not write on the
test or examination paper, except in the case of fill-in and multiple-choice question papers.
14. Rough work should be done in the test or examination answer script and then crossed out. No pages may be removed from the test or
examination answer script.
15. Smoking is not permitted in the test or examination venue, and students will also not be permitted to leave the venue during the test or
examination for this purpose.
16. Only in exceptional circumstances will a student be given permission to leave the test or examination venue temporarily, and then only
under the supervision of an invigilator.
17. Students may not take used or unused answer scripts from the test or examination venue.
18. As soon as the invigilator announces during a test or examination that the time has expired, students should stop writing immediately. In
the case of computer-based assessment students are automatically stopped from working on the computer when the login time expires.
19. Students may bring their own watches to the test/examination venue, but smart watches will not be allowed.

Please note: Students should take note that, if found guilty of academic misconduct or non-compliance with these rules, a student
could, among other penalties, forfeit his/her credits for a module and/or be suspended from the University for a period that
could range from one year to permanent suspension. Such a student’s record will be blocked for the period of suspension
and he/she will not be entitled to a certificate of good conduct from the University during this period. Students should also
take note that, if found guilty of academic misconduct, it may negatively influence their admission to other universities
and/or registration with professional councils. Academic misconduct is indicated on all certificates of conduct provided to
students by the University.

QUESTION 1 [8]

Let $X_1, X_2, \ldots, X_5$ be a random sample from a normal population with mean $\mu$ and variance $\sigma^2$. Let $\bar{X}$ denote the sample mean and $S^2 = \frac{1}{4}\sum_{i=1}^{5}(X_i - \bar{X})^2$ the sample variance.
a. Give the distribution and relevant parameter values of
𝑋1 + 5𝑋4 − 2𝑋5 .
(3)

𝑋1 + 5𝑋4 − 2𝑋5 ∼ 𝑁(4𝜇, 30𝜎 2 )
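
The parameters in (a) follow from the usual rules for a linear combination of independent, identically distributed normal variables: the coefficients add for the mean and their squares add for the variance. A minimal sketch of that bookkeeping (the variable names are illustrative and not part of the memo):

```python
# Coefficients of X1, X4 and X5 in the linear combination X1 + 5*X4 - 2*X5.
coefficients = [1, 5, -2]

# E(sum a_i X_i) = (sum a_i) * mu when every X_i has mean mu.
mean_multiplier = sum(coefficients)

# V(sum a_i X_i) = (sum a_i^2) * sigma^2 when the X_i are independent.
variance_multiplier = sum(a ** 2 for a in coefficients)

print(f"N({mean_multiplier} mu, {variance_multiplier} sigma^2)")
```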

b. Give the distribution of $\bar{X}$ and $\dfrac{4S^2}{\sigma^2}$ respectively. (2)

$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{5}\right)$

$\dfrac{4S^2}{\sigma^2} \sim \chi^2(4)$

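c. Show that $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(4)$. (3)

$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

$\dfrac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{4S^2}{\sigma^2}\Big/4}} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(4)$

$\bar{X}$ and $S^2$ are independent.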

QUESTION 2 [5]

Suppose that $Y_1, \ldots, Y_{50}$ is a random sample from a Bernoulli distribution with $p = 0.25$.

a. Give the approximate sampling distribution and relevant parameter values of $\bar{Y}$. (2)

$\bar{Y} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ where $\mu = p = 0.25$ and $\dfrac{\sigma^2}{n} = \dfrac{p(1-p)}{n} = \dfrac{3}{800} = 0.00375$

b. Find the approximate probability that $\sum_{i=1}^{50} Y_i \le 12$ by using the normal approximation (with the continuity correction). (3)

$\sum_{i=1}^{50} Y_i \sim N(n\mu, n\sigma^2)$ where $n\mu = np = 12.5$ and $n\sigma^2 = np(1-p) = 9.375$

$P\left(\sum_{i=1}^{50} Y_i \le 12\right) = P\left(Z \le \dfrac{12.5 - 12.5}{\sqrt{9.375}}\right) = P(Z \le 0) = 0.5$
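
The continuity-corrected calculation in (b) can be reproduced with the Python standard library. This is only a sketch: the exact binomial sum is included for comparison and is not part of the memo.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.25
mean = n * p                    # n*mu = 12.5
variance = n * p * (1 - p)      # n*sigma^2 = 9.375

# Continuity correction: P(sum <= 12) is approximated by P(normal <= 12.5).
approx = NormalDist(mean, sqrt(variance)).cdf(12 + 0.5)

# Exact binomial probability P(sum <= 12), shown only for comparison.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(13))

print(round(approx, 3), round(exact, 3))
```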

QUESTION 3 [6]

Let $X_1, X_2, \ldots, X_6$ be a random sample from a population following a Gamma(2,1) distribution. Consider the following two estimators of the mean of this distribution:

$\hat{\theta}_1 = \bar{X}$ and $\hat{\theta}_2 = \frac{9}{30}(X_1 + X_2 + X_3) + \frac{1}{30}(X_4 + X_5 + X_6)$

a. Determine whether 𝜃̂2 is an unbiased estimator. (3)

𝐸(𝑋𝑖 ) = 2

$E(\hat{\theta}_2) = \frac{9}{30} \cdot 3E(X_i) + \frac{1}{30} \cdot 3E(X_i) = \frac{9}{10}(2) + \frac{1}{10}(2) = 2$

𝜃̂2 is an unbiased estimator since 𝐸(𝜃̂2 ) = 𝐸(𝑋𝑖 )

b. Derive the mean squared error of 𝜃̂2 . (3)

𝑉(𝑋𝑖 ) = 2

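$MSE(\hat{\theta}_2) = V(\hat{\theta}_2) = \left(\frac{9}{30}\right)^2 3V(X_i) + \left(\frac{1}{30}\right)^2 3V(X_i)$

$= \frac{81}{900} \times 3 \times 2 + \frac{1}{900} \times 3 \times 2 = \frac{41}{75} = 0.547$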
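
As a check on the arithmetic in (a) and (b), the expectation and mean squared error can be computed with exact fractions. This is a sketch only; it assumes the Gamma(2,1) moments $E(X_i) = 2$ and $V(X_i) = 2$ used in the memo.

```python
from fractions import Fraction

mean_x = 2   # E(X_i) for Gamma(2, 1)
var_x = 2    # V(X_i) for Gamma(2, 1)

# Weights of X1..X6 in theta_hat_2.
weights = [Fraction(9, 30)] * 3 + [Fraction(1, 30)] * 3

expected = sum(w * mean_x for w in weights)
bias = expected - mean_x
variance = sum(w ** 2 * var_x for w in weights)   # independence of the X_i
mse = variance + bias ** 2                        # MSE = variance + bias^2

print(expected, mse, round(float(mse), 3))
```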

QUESTION 4 [5]

Assume that $Y_1, Y_2, \ldots, Y_n$ is a sample of size $n$ from a gamma-distributed population with $\alpha = 2$ and unknown $\beta$.
a. Given that

$\dfrac{2\sum_{i=1}^{n} Y_i}{\beta} \sim \chi^2(4n)$

explain why this information can be used to derive a 100(1 − 𝛼)% confidence
interval for 𝛽. (2)

The variable $\dfrac{2\sum_{i=1}^{n} Y_i}{\beta}$ depends functionally on $Y_1, Y_2, \ldots, Y_n$ and $\beta$.

The probability distribution of the variable does not depend on $\beta$ or any other unknown parameters.

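b. If a sample of size $n = 4$ yields $\sum_{i=1}^{n} Y_i = 5.39$, calculate a 90% confidence interval for $\beta$. Show all steps. (3)

$0.90 = P\left(\chi^2_{1-\alpha/2,\,4n} \le \dfrac{2\sum_{i=1}^{n} Y_i}{\beta} \le \chi^2_{\alpha/2,\,4n}\right)$

$\therefore \chi^2_{0.95,16} \le \dfrac{2 \times 5.39}{\beta} \le \chi^2_{0.05,16}$

$7.962 \le \dfrac{10.78}{\beta} \le 26.296$

$\dfrac{1}{7.962} \ge \dfrac{\beta}{10.78} \ge \dfrac{1}{26.296}$

$\dfrac{10.78}{7.962} \ge \beta \ge \dfrac{10.78}{26.296}$

$\Rightarrow 0.410 \le \beta \le 1.354$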
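
The inversion of the pivot can be checked numerically. In this sketch the two chi-square critical values are simply the table values quoted in the memo; they are typed in, not computed.

```python
n = 4
total = 5.39            # observed sum of the Y_i
chi2_lower = 7.962      # chi-square(0.95; 16), table value quoted in the memo
chi2_upper = 26.296     # chi-square(0.05; 16), table value quoted in the memo

pivot_numerator = 2 * total                 # 2 * sum(Y_i) = 10.78

# Dividing by the larger critical value gives the lower limit for beta.
beta_low = pivot_numerator / chi2_upper
beta_high = pivot_numerator / chi2_lower

print(round(beta_low, 3), round(beta_high, 3))
```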

QUESTION 5 [5]

a. Let 𝜃 denote a population parameter with estimator 𝜃̂. State the assumptions
required to construct a large-sample confidence interval for 𝜃. (2)

𝜃̂ must satisfy the following properties:


1. 𝜃̂ ∼ 𝑁 approximately
2. 𝐸(𝜃̂ ) ≈ 𝜃 (approximately unbiased)
3. 𝜎𝜃̂ (or 𝑠𝜃̂ ) available

b. Consider a random sample $X_1, \ldots, X_n$ from a Poisson distribution with expectation $E[X_i] = \lambda$. An estimator, $\hat{\lambda}$, for the parameter $\lambda$ is given by the observed mean of the sample, that is:

$\hat{\lambda} = \dfrac{1}{n}\sum_{i=1}^{n} X_i.$

Suppose a random sample of size $n = 400$ gives the estimate $\hat{\lambda} = 0.27$. Calculate a 95% confidence interval for $\lambda$. (3)

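According to the CLT, $\hat{\lambda} = \bar{X}$ follows an approximate normal distribution with $\mu = \lambda$ and $\dfrac{\sigma^2}{n} = \dfrac{\lambda}{n}$.

The large-sample confidence interval is given by $\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}$:

$\hat{\lambda} \pm z_{\alpha/2}\, s_{\hat{\lambda}}$

$0.27 \pm z_{0.025} \times \sqrt{\dfrac{0.27}{400}}$

$0.27 \pm 1.96 \times 0.02598$

$(0.219, 0.321)$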
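
The interval can be reproduced with the standard library. This sketch takes $z_{0.025}$ from `statistics.NormalDist` rather than from a table, so the critical value is 1.96 only after rounding.

```python
from math import sqrt
from statistics import NormalDist

n = 400
lam_hat = 0.27
z = NormalDist().inv_cdf(0.975)      # upper 2.5% point, approximately 1.96

# For a Poisson sample V(X_i) = lambda, estimated here by lam_hat.
std_error = sqrt(lam_hat / n)

lower = lam_hat - z * std_error
upper = lam_hat + z * std_error
print(round(lower, 3), round(upper, 3))
```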

QUESTION 6 [12]

a. Consider a random sample $X_1, X_2, \ldots, X_n$ from a population with mean $\mu$ and variance $\sigma^2$. This sample was used to test each of the following hypotheses. Calculate the $P$-value in each case. (6)

| Assumption | Hypothesis | Test statistic value | $P$-value |
|---|---|---|---|
| $n = 10$, $\sigma^2$ known, $X$ is normally distributed | $H_0: \mu = \mu_0$, $H_a: \mu < \mu_0$ | $-2$ | $P(Z < -2) = 0.0228$ |
| $n = 100$ | $H_0: \mu = \mu_0$, $H_a: \mu > \mu_0$ | $-2$ | $P(Z > -2) = 1 - \Phi(-2) = 1 - 0.0228 = 0.9772$ |
| $n = 100$ | $H_0: p = p_0$, $H_a: p \ne p_0$ | $-2.5$ | $2P(Z > \lvert -2.5 \rvert) = 2\Phi(-2.5) = 2(0.0062) = 0.0124$ |
| $n = 8$, $X$ is normally distributed | $H_0: \sigma^2 = \sigma_0^2$, $H_a: \sigma^2 < \sigma_0^2$ | $1.69$ | $P(\chi^2_7 < 1.69) = 1 - 0.975 = 0.025$ |
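
The three normal-based $P$-values in the table can be checked with `statistics.NormalDist`. This is a sketch only: the chi-square case is left out because the Python standard library has no chi-square distribution function.

```python
from statistics import NormalDist

phi = NormalDist().cdf   # standard normal distribution function

p_lower_tail = phi(-2.0)             # Ha: mu < mu0, z = -2
p_upper_tail = 1 - phi(-2.0)         # Ha: mu > mu0, z = -2
p_two_sided = 2 * phi(-abs(-2.5))    # Ha: p != p0,  z = -2.5

print(round(p_lower_tail, 4), round(p_upper_tail, 4), round(p_two_sided, 4))
```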

b. Consider the information below for a significance level $\alpha$ and confidence coefficient $1 - \alpha$. Based on the given confidence limits, indicate whether $H_0$ should be rejected or not and give a reason in each case: (6)

| Hypothesis | $\theta$ | Confidence limits for $\theta$ | Decision | Reason |
|---|---|---|---|---|
| $H_0: p_1 = p_2$, $H_a: p_1 \ne p_2$ | $p_1 - p_2$ | $(-0.2, 0.4)$ | Do not reject $H_0$ | since 0 is included in the CI |
| $H_0: \sigma_1^2 = \sigma_2^2$, $H_a: \sigma_1^2 \ne \sigma_2^2$ | $\sigma_1^2 / \sigma_2^2$ | $(0.0, 0.9)$ | Reject $H_0$ | since 1 is not included in the CI |
| $H_0: \mu_1 - \mu_2 = 5$, $H_a: \mu_1 - \mu_2 \ne 5$ | $\mu_1 - \mu_2$ | $(3.2, 6.1)$ | Do not reject $H_0$ | since 5 is included in the CI |
| $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 > \mu_2$ | $\mu_1 - \mu_2$ | Lower bound: 1 | Reject $H_0$ | since 0 is less than the lower limit (LL) |

QUESTION 7 [7]

Consider 𝐻0 : 𝜇 = 50 vs. 𝐻𝑎 : 𝜇 > 50, where 𝜇 is the population mean of a normally distributed variable. The null hypothesis must be tested at the 0.025 level based on a sample of size 15. Suppose that the sample mean is equal to 55, the population standard deviation is equal to 12 and the test statistic value is equal to 1.6. Consider a specific value of 𝜇 under the alternative hypothesis, say 𝜇′ = 60.

a. For which values of the sample mean will 𝐻0 be rejected? (2)

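$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > 1.96 \quad \therefore \quad \bar{x} > z_{\alpha}\dfrac{\sigma}{\sqrt{n}} + \mu_0 = 1.96\,\dfrac{12}{\sqrt{15}} + 50 = 56.073$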

b. Calculate the probability of a type II error if 𝜇′ = 60. (2)

$\beta(\mu') = \Phi\left(z_{\alpha} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) = \Phi\left(1.96 + \dfrac{50 - 60}{12/\sqrt{15}}\right) = \Phi(-1.27) = 0.102$
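
The rejection threshold in (a) and the type II error probability in (b) can be checked as follows. This is a sketch using the critical value 1.96 from the memo; the argument of $\Phi$ is not rounded to two decimals here, so the result matches the table-based answer only to three decimal places.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_alt = 50, 60
sigma, n = 12, 15
z_crit = 1.96                        # critical value used in the memo

std_error = sigma / sqrt(n)

# H0 is rejected when the sample mean exceeds this value.
x_bar_crit = mu0 + z_crit * std_error

# Type II error: the sample mean falls below the threshold although mu = 60.
beta = NormalDist().cdf(z_crit + (mu0 - mu_alt) / std_error)
power = 1 - beta

print(round(x_bar_crit, 3), round(beta, 3), round(power, 3))
```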

c. Consider the graph below and name the following: (3)

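[Graph: sampling distribution curves with the values 50, 55, 56 and 60 marked on the 𝜇 axis and regions labelled I, II and III.]

I. Area I (black): 𝛼
II. Area II (grey): 𝛽
III. Area to the right of line III (under the curve centered at 50): 𝑃-value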

QUESTION 8 [8]

Literature from 10 years ago indicates that children (ages 2-11) spend an average of
21 hours 30 minutes watching television per week while teens (ages 12-17) spend
an average of 20 hours 40 minutes. To get an indication of the current trend, a
random sample from each of these populations was drawn. The sample statistics are
given below. Answer the questions that follow. Use 𝛼 = 0.10.

| | Children (1) | Teens (2) |
|---|---|---|
| Sample mean | 22.45 | 18.50 |
| Sample variance | 16.4 | 18.2 |
| Sample size | 9 | 10 |

a. It must be tested whether the variance in hours watching television is the same for children and teens. Only give: (4)

(i) The rejection region:

Reject $H_0$ if $f < \dfrac{1}{F_{9,8,0.05}} = \dfrac{1}{3.39} = 0.295$ or if $f > F_{8,9,0.05} = 3.23$

(ii) The test statistic value:

$f = \dfrac{s_1^2}{s_2^2} = \dfrac{16.4}{18.2} = 0.901$
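
The two-sided $F$ test bounds and the test statistic can be reproduced as below. This is a sketch in which the $F$ critical values are the table values quoted in the memo, typed in as constants.

```python
s1_sq, s2_sq = 16.4, 18.2        # sample variances: children, teens
f_9_8 = 3.39                     # F(0.05; 9, 8), table value quoted in the memo
f_8_9 = 3.23                     # F(0.05; 8, 9), table value quoted in the memo

f_stat = s1_sq / s2_sq

# Lower critical value via the reciprocal relation between F tails.
lower_crit = 1 / f_9_8
upper_crit = f_8_9

reject = f_stat < lower_crit or f_stat > upper_crit
print(round(lower_crit, 3), round(f_stat, 3), reject)
```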

b. Assume that the variances in the hours watching television for children and
teens are equal. It must be tested whether there is sufficient evidence to
conclude a difference in average television watching times between the two
groups. (4)

(i) Give the formula of the test statistic.

$T = \dfrac{\bar{X} - \bar{Y}}{\sqrt{S_p^2\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$

(ii) Give the distribution and parameter value(s) of the test statistic under
𝐻0 : 𝜇1 = 𝜇2 .

~𝑡(𝑚 + 𝑛 − 2), 𝑚 + 𝑛 − 2 = 17

(iii) Suppose that the 𝑃-value is equal to 0.05. Give a decision and conclusion with a reason for your answer.

Since 𝑃-value = 0.05 < 0.10 = 𝛼, 𝐻0 is rejected. Sample evidence suggests that there is a difference in average TV watching times.

QUESTION 9 [6]

Consider the analysis of variance for a single factor with 3 levels. The values 𝑎, 𝑏, 𝑐
and 𝑑 represent measurements and 𝐽 = 4.

a. Determine the value of 𝑆𝑆𝑇𝑟. (2)

Level Observations
I 𝒂 𝒃 𝒄 𝒅
II 𝒂 𝒃 𝒄 𝒅
III 𝒂 𝒃 𝒄 𝒅

$\bar{X}_{i\cdot} = (a + b + c + d)/4$

$\bar{X}_{\cdot\cdot} = (3a + 3b + 3c + 3d)/12 = (a + b + c + d)/4$

$SSTr = \sum_{i=1}^{3}\sum_{j=1}^{4}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = 0$

b. Give an example of data for which 𝑆𝑆𝐸 = 0. (2)

Level Observations
I 𝒂 𝒂 𝒂 𝒂
II 𝒃 𝒃 𝒃 𝒃
III 𝒄 𝒄 𝒄 𝒄

$\bar{X}_{1\cdot} = a,\ \bar{X}_{2\cdot} = b,\ \bar{X}_{3\cdot} = c$

$X_{1j} = a,\ X_{2j} = b,\ X_{3j} = c$

$SSE = \sum_{i=1}^{3}\sum_{j=1}^{4}(X_{ij} - \bar{X}_{i\cdot})^2 = 0$
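
Both layouts can be checked numerically. In this sketch the function name `anova_sums` and the values chosen for $a$, $b$, $c$ and $d$ are hypothetical; any distinct numbers would do.

```python
def anova_sums(groups):
    """Return (SSTr, SSE) for a one-way layout given a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-level sum of squares: each level mean counted once per observation.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-level sum of squares.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return sstr, sse

# Hypothetical numbers standing in for a, b, c and d.
a, b, c, d = 1.0, 2.0, 4.0, 7.0

identical_levels = [[a, b, c, d]] * 3            # layout from part (a)
constant_within = [[a] * 4, [b] * 4, [c] * 4]    # layout from part (b)

print(anova_sums(identical_levels))   # SSTr is zero, SSE is positive
print(anova_sums(constant_within))    # SSE is zero, SSTr is positive
```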

c. Consider 𝐻0 : 𝜇𝐼 = 𝜇𝐼𝐼 = 𝜇𝐼𝐼𝐼 . Suppose the test statistic value is equal to 20. Give
the rejection region and a decision (𝛼 = 0.05). (2)

$RR: \{f : f > F_{2,9,0.05} = 4.26\}$

Since $20 > F_{2,9,0.05} = 4.26$, $H_0$ is rejected.

QUESTION 10 [5]

Complete the table below. (5)

| Type of data | Null hypothesis | Test statistic | Distribution of the test statistic under $H_0$ |
|---|---|---|---|
| Measurements | $\mu_1 = \mu_2 = \mu_3 = \mu_4$ | $MSB/MSE$ | $F(4 - 1, (3 - 1)(4 - 1))$ |
| Ranks (of 6 pairs of observations) | $\mu_1 - \mu_2 = \Delta_0$ | $s_+$ = sum of ranks of positive differences | Exact distribution |
| Frequencies ($n = 30$) | $p_1 = p_2 = p_3$ | $\chi^2 = \sum_{i=1}^{3} \dfrac{(n_i - 10)^2}{10}$ | $\chi^2(2)$ |
| Frequencies (samples of size $m$ and $n$ respectively) | $p_1 = p_2$ | $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)}}$ | $N(0,1)$ |
| Frequencies (with all expected frequencies greater than 5) | Variable $X$ and variable $Y$ are independent | $\chi^2 = \sum_{i=1}^{3}\sum_{j=1}^{4} \dfrac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}}$ | $\chi^2(6)$ |

QUESTION 11 [13]

A company leases animals, which have been trained to perform certain tasks, for
use in the movie industry. The table below gives the number of tasks that each of
nine monkeys in a random sample can perform, along with the number of years the
monkeys have been working with the company.

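| Name | Halie | Fred | SuSu | Henri | Jo | Peep | Cleo | Jeep | Mag |
|---|---|---|---|---|---|---|---|---|---|
| Years ($x$) | 10 | 8 | 6.5 | 6 | 5 | 1.5 | 0.5 | 0.5 | 0.4 |
| Tasks ($y$) | 28 | 24 | 28 | 28 | 27 | 23 | 15 | 6 | 23 |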

The random variable 𝑌𝑖 denotes the number of tasks and 𝑋𝑖 the number of years for
each monkey 𝑖 = 1, … , 9.

𝑆𝑥𝑥 = 106.32, 𝑆𝑦𝑦 = 442.22, 𝑆𝑥𝑦 = 149.33

a. Calculate the least squares simple linear regression line. (2)

𝑌̂ = 16.452 + 1.405𝑥

b. Estimate the true error variance, 𝜎̂ 2 , of the regression model. (2)

$\hat{\sigma}^2 = \dfrac{SSE}{n - 2} = \dfrac{232.4739}{7} = 33.211$
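
The fitted line and the error variance estimate can be recomputed from the raw data in the table. This is a sketch using only the Python standard library; the variable names are illustrative.

```python
years = [10, 8, 6.5, 6, 5, 1.5, 0.5, 0.5, 0.4]
tasks = [28, 24, 28, 28, 27, 23, 15, 6, 23]
n = len(years)

x_bar = sum(years) / n
y_bar = sum(tasks) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_yy = sum((y - y_bar) ** 2 for y in tasks)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, tasks))

slope = s_xy / s_xx                    # least squares estimate of beta_1
intercept = y_bar - slope * x_bar      # least squares estimate of beta_0
sse = s_yy - slope * s_xy              # SSE = Syy - beta_1_hat * Sxy
sigma2_hat = sse / (n - 2)

print(round(intercept, 3), round(slope, 3), round(sigma2_hat, 3))
```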

c. Calculate and interpret the coefficient of determination. (2)

𝑟 = 0.6887
𝑟 2 = 0.4743

47.43% of the variation in number of tasks can be explained by the linear regression
model.

d. Perform a statistical test to determine whether the population correlation
coefficient is significantly different from zero at a 5% level. (3)

𝐻0 : 𝜌 = 0

𝐻𝑎 : 𝜌 ≠ 0

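Test statistic: $t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \dfrac{0.6887\sqrt{7}}{\sqrt{1 - 0.4743}} = 2.513$

Rejection region: $|t| \ge t_{0.025,7} = 2.365$

Since $|t| = 2.513 \ge 2.365$, $H_0$ is rejected: the population correlation coefficient differs significantly from zero at the 5% level.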
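
The test statistic $t = r\sqrt{n-2}/\sqrt{1-r^2}$ can be evaluated directly from the summary statistics given in the question. This is a sketch; the critical value is the table value quoted in the memo, typed in as a constant.

```python
from math import sqrt

s_xx, s_yy, s_xy = 106.32, 442.22, 149.33   # summary statistics from the question
n = 9
t_crit = 2.365                               # t(0.025; 7), table value quoted in the memo

r = s_xy / sqrt(s_xx * s_yy)                 # sample correlation coefficient
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
reject = abs(t_stat) >= t_crit

print(round(r, 4), round(t_stat, 3), reject)
```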

e. Using the fitted regression line, calculate the residual of SuSu. (2)

𝑒 = 𝑦 − 𝑦̂ = 28 − (16.45163 + 1.40456(6.5)) = 2.419

f. Suppose that more data became available and the following standardised
residual plot was obtained.

[Standardised residual plot; horizontal axis: Tasks (𝑥), ticks from 0 to 25.]

Based on the residual plot, comment on the adequacy of the fitted linear regression model. (2)

Based on the residual plot, the model appears to be adequate as the standardised residuals are randomly scattered about 0 with constant variance.

QUESTION 12 [4]

Let $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ denote the estimate of the true simple linear regression line, with $\bar{y} = \frac{1}{n}\sum y_i$ and $\bar{x} = \frac{1}{n}\sum x_i$. Show that the square of the sample correlation coefficient gives the value of the coefficient of determination that would result from fitting the simple linear regression model.
Hint: Start by assuming that $\hat{y}_i - \bar{y} = \hat{\beta}_1(x_i - \bar{x})$. (4)

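$\sum(\hat{y}_i - \bar{y})^2 = \hat{\beta}_1^2 \sum(x_i - \bar{x})^2$ (square and sum both sides)

$= \left(\dfrac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}\right)^2 \times \sum(x_i - \bar{x})^2$ (substitute $\hat{\beta}_1$)

$= \dfrac{[\sum(x_i - \bar{x})(y_i - \bar{y})]^2}{\sum(x_i - \bar{x})^2} \times \dfrac{\sum(y_i - \bar{y})^2}{\sum(y_i - \bar{y})^2}$ (multiply by 1)

$= \dfrac{[\sum(x_i - \bar{x})(y_i - \bar{y})]^2}{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2} \times \sum(y_i - \bar{y})^2$

That is, $SSR = r^2 \times SST$, where $r^2 = (\text{correlation coefficient})^2$.

Therefore

$r^2 = \dfrac{SSR}{SST}$

But the coefficient of determination, $r^2$, is defined as $1 - \dfrac{SSE}{SST} = \dfrac{SSR}{SST}$, which completes the proof.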

QUESTION 13 [6]

a. Retired senior executives who had returned to work were surveyed in Gauteng. It was found that after returning to work, 38% were self-employed, 32% were freelancing or consulting, 23% were employed by another organization and 7% had formed their own companies. To see if these percentages are consistent with those in the Western Cape, a local researcher surveyed 300 retired executives who had returned to work in the Western Cape.

(i) If the Gauteng percentages also hold in the Western Cape, how many
executives are expected to have formed their own companies in the Western
Cape? (1)

0.07(300) = 21

(ii) The test statistic value to test the above is equal to 3.294. Calculate the $P$-value. (2)

$P$-value $= P(\chi^2_3 > 3.294) > 0.100$

b. The following data are a summary of the goal scoring record of 436 soccer
players who played in international matches during a specific period of time:

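| Number of goals scored | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Number of players | 300 | 67 | 33 | 11 | 6 | 9 | 6 | 2 | 2 |
| Expected number of players under $H_0$ | 218.7 | 150.9 | 52.1 | 12.0 | 2.1 | 0.3 | 0.0 | 0.0 | 0.0 |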

Consider
𝐻0 : The goal scoring record of soccer players has a Poisson distribution.

(i) Only calculate the contribution of 2 goals scored to the test statistic value. (1)

$\dfrac{(33 - 52.1)^2}{52.1} = 7.002$
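
The single-cell contribution can be computed from the observed and expected counts in the table. This is a sketch: the helper name `contribution` is hypothetical, and only the cell for 2 goals is evaluated because several expected counts in the table are zero.

```python
observed = [300, 67, 33, 11, 6, 9, 6, 2, 2]
expected = [218.7, 150.9, 52.1, 12.0, 2.1, 0.3, 0.0, 0.0, 0.0]

def contribution(obs, exp):
    """Single-cell term (O - E)^2 / E of the Pearson chi-square statistic."""
    return (obs - exp) ** 2 / exp

# Index 2 corresponds to players who scored 2 goals.
two_goals = contribution(observed[2], expected[2])
print(round(two_goals, 3))
```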

(ii) Give the rejection region (in terms of the critical value and significance level 0.05) for this test. (2)

Reject $H_0$ if $\chi^2 > \chi^2_{4-1-1;\,0.05} = 5.992$

QUESTION 14 [4]

In two consecutive weeks 120 patients with the same disease are subjected to two medical treatments. The observed contingency table below shows the reactions to the treatments:

| Treatment A | Treatment B: Favourable | Treatment B: Unfavourable |
|---|---|---|
| Favourable | 23 | 36 |
| Unfavourable | 32 | 29 |

It must be tested whether there is dependence (interaction) between the degree of success for the two treatments (𝛼 = 0.01). Only give:

a. 𝐻0 : The degree of success for the two treatments is independent. (1)


𝐻𝑎 : The degree of success for the two treatments is dependent.

b. The rejection region: (1)


Reject $H_0$ if $\chi^2 > \chi^2_{1,0.01} = 6.637$.

c. Calculate the expected number of outcomes that are favourable for both treatment A and treatment B if the success of the two treatments is independent. (1)

$\dfrac{59 \times 55}{120} = 27.042$

d. Suppose the test statistic value is equal to 2.194. Draw a conclusion and give
a reason for your answer. (1)

Since $\chi^2 = 2.194 < 6.637$, $H_0$ is not rejected. Sample evidence suggests that the degree of success for the two treatments is independent.
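
The expected count in (c) and the test statistic value quoted in (d) can both be reproduced from the contingency table. This is a sketch with illustrative variable names.

```python
table = [[23, 36],   # Treatment A favourable:   B favourable, B unfavourable
         [32, 29]]   # Treatment A unfavourable: B favourable, B unfavourable

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Under independence: expected count = row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))

print(round(expected[0][0], 3), round(chi2, 3))
```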

QUESTION 15 [6]

In a large department store, the owner wishes to see whether the number of
shoplifting incidents per day will change if the number of uniformed security officers
is doubled. A sample of 7 days before security is increased and 7 days after the
increase shows the number of shoplifting incidents.

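| Day | Number of shoplifting incidents: Before | Number of shoplifting incidents: After | Difference | Rank of absolute differences |
|---|---|---|---|---|
| Monday | 7 | 5 | 2 | 3.5 |
| Tuesday | 2 | 3 | -1 | 1.5 |
| Wednesday | 3 | 4 | -1 | 1.5 |
| Thursday | 6 | 3 | 3 | 5 |
| Friday | 5 | 1 | 4 | 6 |
| Saturday | 8 | 6 | 2 | 3.5 |
| Sunday | 12 | 4 | 8 | 7 |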

Is there enough evidence to indicate, on an approximate 5% level of significance, that there is a decrease in the number of shoplifting incidents before and after the increase in security? Use the appropriate non-parametric test. (6)

$H_0: \mu_D = 0$
$H_a: \mu_D > 0$

Reject $H_0$ if $s_+ \ge c_1 = 24$

$s_+ = \dfrac{7(8)}{2} - 3 = 25$

Reject 𝐻0. Sample evidence suggests that there is a decrease in shoplifting incidents.
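
The signed-rank statistic can be recomputed from the raw counts. This is a sketch: the helper `average_ranks` is hypothetical, it assigns tied absolute differences their mean rank, and it assumes there are no zero differences (true for these data).

```python
before = [7, 2, 3, 6, 5, 8, 12]
after = [5, 3, 4, 3, 1, 6, 4]
diffs = [b - a for b, a in zip(before, after)]

def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their mean rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1            # rank of the first tied value
        count = ordered.count(v)                # number of values tied with v
        ranks.append(first + (count - 1) / 2)   # mean of the tied ranks
    return ranks

ranks = average_ranks([abs(d) for d in diffs])
s_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)    # ranks of positive differences
s_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)   # ranks of negative differences

print(ranks, s_plus, s_minus)
```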
