Important note
This commentary reflects the examination and assessment arrangements for this course in the
academic year 2018–19. The format and structure of the examination may change in future years,
and any such changes will be publicised on the virtual learning environment (VLE).
Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2014).
You should always attempt to use the most recent edition of any Essential reading textbook, even if
the commentary and/or online reading list and/or subject guide refer to an earlier edition. If
different editions of Essential reading are listed, please check the VLE for reading supplements – if
none are available, please use the contents list and index of the new edition to find the relevant
section.
General remarks
Learning outcomes
At the end of this half course, and having completed the Essential reading and activities, you should
be able to:
• apply and be competent users of standard statistical operators and be able to recall a
variety of well-known distributions and their respective moments
• explain the fundamentals of statistical inference and apply these principles to justify the use
of an appropriate model and perform tests in a number of different settings
• demonstrate understanding that statistical techniques are based on assumptions and the
plausibility of such assumptions must be investigated when analysing real problems.
The examination is two hours long and you must answer all four questions.
Question 1, for 40% of the marks, is a compulsory question with several parts. It is designed to test
general knowledge and understanding of the whole syllabus. Here candidates are expected to give
reasoned answers, with some explanation, avoiding one-word responses which will never be given any
marks. More emphasis is given to understanding than to knowledge. Candidates should answer the
first part of this question (true or false statements) either by proving that the statement is true or
false or, in the case of a false statement, providing a counterexample. It is not sufficient to just
provide the correct answer (no credit is given for this); an explanation is required. Furthermore,
when trying to show that a certain statement is true, it is not sufficient to show that the statement
ST104b Statistics 2
is true in a very specific case; this does not prove the statement is true. You must show that the
statement holds in all circumstances.
The other three questions are also compulsory and account for 20% of the total marks each. They
are meant to test a greater depth of knowledge on parts of the syllabus. They are also longer and
examine the ability to apply general knowledge and concepts to specific problems.
It is hard to overemphasise that memorising answers to past examination questions is not the best
way to study for this paper. It is important for candidates to understand the material they write
down, and to be able to develop it all from scratch as they write it. Often, there are several ways to
obtain good marks for a question. If you cannot solve a certain section of a question you might still
get full marks for subsequent sections as long as your reasoning is correct.
A very good mathematics background is extremely important. The course can be divided into a
probability and distribution theory part and a statistics part. A mathematics background is
important for both but especially for the probability and distribution theory part.
You should ensure you have an understanding of all parts of the half course. Specialising is a bad
strategy as all questions are compulsory and there is no choice.
The paper is light on computations, as questions are answered with the use of a basic calculator only.
You should understand all parts of the half course without exception. Remember that all questions
in the paper are compulsory. Another reason to understand all parts of the course is that you
should not expect questions very similar to those in previous years’ papers.
You should be able to write down or discuss definitions or models used in the syllabus.
It is important that you have the necessary mathematical skills. This means that you should
understand your mathematics courses well too.
Routine computations are less important and you should spend more time understanding the
concepts. Understanding concepts means being able to apply them, sometimes even in combined
situations.
As stated earlier, the ST104b Statistics 2 examinations are not heavy in calculations.
Probability is the most important part of the half course as everything else depends on it. You must
have a thorough understanding of all concepts. You should avoid making elementary mistakes which
demonstrate a lack of understanding and are hence heavily penalised by the examiners. These
include the following.
You should become familiar with the logical thinking needed to answer the first part of Question 1.
Examiners’ commentaries 2019
A lack of evidence to reject a null hypothesis does not mean that there is evidence to
accept it.
Many candidates are disappointed to find that their examination performance is poorer than they
expected. This may be due to a number of reasons, but one particular failing is ‘question
spotting’, that is, confining your examination preparation to a few questions and/or topics which
have come up in past papers for the course. This can have serious consequences.
We recognise that candidates might not cover all topics in the syllabus in the same depth, but you
need to be aware that examiners are free to set questions on any aspect of the syllabus. This
means that you need to study enough of the syllabus to enable you to answer the required number of
examination questions.
The syllabus can be found in the Course information sheet available on the VLE. You should read
the syllabus carefully and ensure that you cover sufficient material in preparation for the
examination. Examiners will vary the topics and questions from year to year and may well set
questions that have not appeared in past papers. Examination papers may legitimately include
questions on any topic in the syllabus. So, although past papers can be helpful during your revision,
you cannot assume that topics or specific questions that have come up in past examinations will
occur again.
If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties
when you sit the examination. We strongly advise you not to adopt this strategy.
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total). Candidates are strongly advised to
divide their time accordingly.
Section A
Question 1
(a) For each one of the statements below say whether the statement is true or false,
justifying your answer.
i. For any three events A, B and C, if P(A ∩ B ∩ C) = P(A) P(B) P(C) then
A, B and C are independent.
ii. For any two random variables X and Y it holds that:
(b) Explain why we must consider both bias and variance when judging the
performance of an estimator.
(5 marks)
(c) The random variable X has the probability density function given by:
\[
f(x) = \begin{cases} kx^2 & \text{for } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}
\]
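ii. We have:
\[
E(X) = \int_0^1 x\, f(x)\, dx = \int_0^1 3x^3\, dx = \left[ \frac{3x^4}{4} \right]_0^1 = \frac{3}{4}
\]
and:
\[
E(X^2) = \int_0^1 x^2 f(x)\, dx = \int_0^1 3x^4\, dx = \left[ \frac{3x^5}{5} \right]_0^1 = \frac{3}{5}.
\]
Hence:
\[
\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{3}{5} - \left( \frac{3}{4} \right)^2 = \frac{3}{80} = 0.0375.
\]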
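The moments above can be checked exactly. The following is a minimal sketch, assuming a density of the form f(x) = k x^p on (0, 1); the function name `power_density_moments` is hypothetical and is not part of the course material. It uses the fact that the integral of x^m over (0, 1) is 1/(m + 1).

```python
from fractions import Fraction


def power_density_moments(p):
    """Return (k, E(X), Var(X)) for the density f(x) = k * x**p on (0, 1)."""
    # The density must integrate to 1, and the integral of x**p is 1/(p + 1).
    k = Fraction(p + 1)
    mean = k / (p + 2)      # integral of k * x**(p + 1) over (0, 1)
    second = k / (p + 3)    # integral of k * x**(p + 2) over (0, 1)
    return k, mean, second - mean ** 2


k, mean, variance = power_density_moments(2)
print(k, mean, variance, float(variance))
```

With p = 2 this gives k = 3, E(X) = 3/4 and Var(X) = 3/80, in agreement with the working above.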
(d) Suppose that 10 people are seated in a random manner in a row of 10 lecture
theatre seats. What is the probability that two particular people, A and B, will
be seated next to each other?
(5 marks)
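One way to check an answer to this type of question is to count seat pairs directly. The sketch below is an illustration only, not the examiners' solution: it assumes that every ordered pair of distinct seats is equally likely for (A, B), which follows from random seating, and the function name `p_adjacent` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def p_adjacent(n):
    """P(two particular people sit side by side) for n people in a row of n seats."""
    # Every ordered pair (seat of A, seat of B) of distinct seats is equally likely.
    seat_pairs = list(permutations(range(n), 2))
    adjacent = sum(1 for a, b in seat_pairs if abs(a - b) == 1)
    return Fraction(adjacent, len(seat_pairs))


print(p_adjacent(10))
```

There are n(n − 1) ordered seat pairs, of which 2(n − 1) are adjacent, so the probability simplifies to 2/n.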
(e) A person tried by a three-judge panel is declared guilty if at least two judges
cast votes of guilty (i.e. a majority verdict). Suppose that when the defendant is
in fact guilty, each judge will independently vote guilty with probability 0.9,
whereas when the defendant is not guilty (i.e. innocent), the probability of
voting guilty drops to 0.25. Suppose 70% of defendants are guilty.
i. Compute the probability that judge 1 votes guilty.
(2 marks)
ii. Given that both judge 1 and judge 2 vote not guilty, compute the probability
that judge 3 votes guilty.
(4 marks)
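Both parts rely on conditioning on whether the defendant is guilty, since the judges vote independently only given the defendant's status. The following sketch applies the total probability formula and Bayes' theorem to the figures in the question; it is an illustrative check rather than the examiners' worked solution, and the variable names are hypothetical.

```python
from fractions import Fraction

p_guilty = Fraction(7, 10)               # P(defendant is guilty)
vote_g_if_guilty = Fraction(9, 10)       # P(a judge votes guilty | guilty)
vote_g_if_innocent = Fraction(1, 4)      # P(a judge votes guilty | innocent)

# i. Total probability formula for a single judge.
p_judge1_guilty = (p_guilty * vote_g_if_guilty
                   + (1 - p_guilty) * vote_g_if_innocent)

# ii. Votes are independent only GIVEN the defendant's status, so condition on it.
p_two_not_guilty = (p_guilty * (1 - vote_g_if_guilty) ** 2
                    + (1 - p_guilty) * (1 - vote_g_if_innocent) ** 2)
p_two_not_then_guilty = (p_guilty * (1 - vote_g_if_guilty) ** 2 * vote_g_if_guilty
                         + (1 - p_guilty) * (1 - vote_g_if_innocent) ** 2 * vote_g_if_innocent)
p_judge3_given_two_not = p_two_not_then_guilty / p_two_not_guilty

print(float(p_judge1_guilty), float(p_judge3_given_two_not))
```

Note that the second probability differs from the first: two not-guilty votes make it more likely that the defendant is innocent, which lowers the chance that the third judge votes guilty.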
Section B
Question 2
(a) The teams Ajax and Bayern Munich are meeting in an important football
match. Based on previous encounters and current form, the probabilities of the
various scores are given in the table below, where A and B denote the numbers
of goals scored by Ajax and Bayern Munich, respectively. It is believed that the
probabilities of either team scoring 4 or more goals are so small that these can
be considered to be zero.
                   A
Goals      0      1      2      3
   0     0.06   0.05   0.04   0.03
B  1     0.12   0.08   0.05   0.03
   2     0.12   0.05   0.05   0.05
   3     0.10   0.10   0.05   0.02
P(A = 0, B = 0) + P(A = 1, B = 1) + P(A = 2, B = 2) + P(A = 3, B = 3) = 0.21.
P(A = 1, B = 0) + P(A = 2, B = 0) + P(A = 3, B = 0)
+ P(A = 2, B = 1) + P(A = 3, B = 1) + P(A = 3, B = 2)
= 0.25
and he expects to lose his £1 with probability 0.75. Therefore, he is expected to make a
profit of:
5 × 0.25 − 1 × 0.75 = £0.50.
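These sums can be reproduced from the joint probability table. The sketch below assumes the layout of the table in the question (rows indexed by B, columns by A) and the payoff implied by the working above: a gain of £5 if Ajax win and a loss of the £1 stake otherwise. The variable names are hypothetical.

```python
from fractions import Fraction

# joint[b][a] = P(A = a, B = b), entered in hundredths to keep the arithmetic exact.
joint = [
    [6, 5, 4, 3],
    [12, 8, 5, 3],
    [12, 5, 5, 5],
    [10, 10, 5, 2],
]
joint = [[Fraction(v, 100) for v in row] for row in joint]

total = sum(sum(row) for row in joint)                      # should equal 1
p_draw = sum(joint[i][i] for i in range(4))                 # A = B
p_ajax_win = sum(joint[b][a] for b in range(4) for a in range(4) if a > b)

# Expected profit: win 5 with probability p_ajax_win, lose 1 otherwise.
expected_profit = 5 * p_ajax_win - 1 * (1 - p_ajax_win)

print(float(p_draw), float(p_ajax_win), float(expected_profit))
```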
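(b) Suppose that you are given independent observations $y_1$, $y_2$ and $y_3$ such that:
\[
\begin{aligned}
y_1 &= \alpha + \beta + \varepsilon_1 \\
y_2 &= \alpha + 2\beta + \varepsilon_2 \\
y_3 &= \alpha + 4\beta + \varepsilon_3.
\end{aligned}
\]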
i. Find the least squares estimators of the parameters α and β, and verify that
they are unbiased estimators.
(7 marks)
ii. Calculate the variance of the estimator of α.
(3 marks)
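We have:
\[
\frac{\partial S}{\partial \alpha} = -2(y_1 - \alpha - \beta) - 2(y_2 - \alpha - 2\beta) - 2(y_3 - \alpha - 4\beta)
= 2(3\alpha + 7\beta - (y_1 + y_2 + y_3))
\]
and:
\[
\frac{\partial S}{\partial \beta} = -2(y_1 - \alpha - \beta) - 4(y_2 - \alpha - 2\beta) - 8(y_3 - \alpha - 4\beta)
= 2(7\alpha + 21\beta - (y_1 + 2y_2 + 4y_3)).
\]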
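\[
3\hat{\alpha} + 7\hat{\beta} = y_1 + y_2 + y_3 \quad \text{and} \quad 7\hat{\alpha} + 21\hat{\beta} = y_1 + 2y_2 + 4y_3.
\]
Solving yields:
\[
\hat{\beta} = \frac{-4y_1 - y_2 + 5y_3}{14} \quad \text{and} \quad \hat{\alpha} = \frac{2y_1 + y_2 - y_3}{2}.
\]
They are unbiased estimators since:
\[
E(\hat{\beta}) = \frac{1}{14}(-4\alpha - 4\beta - \alpha - 2\beta + 5\alpha + 20\beta) = \beta
\]
and:
\[
E(\hat{\alpha}) = \frac{1}{2}(2\alpha + 2\beta + \alpha + 2\beta - \alpha - 4\beta) = \alpha.
\]
ii. We have, by independence:
\[
\mathrm{Var}(\hat{\alpha}) = 1^2 + \left( \frac{1}{2} \right)^2 + \left( \frac{1}{2} \right)^2 = \frac{3}{2}.
\]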
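The closed-form estimators can be checked by solving the normal equations exactly for arbitrary data. The following is a minimal sketch: the function name `least_squares` and the sample values of y are hypothetical, and the variance check assumes each error term has unit variance, which the result 3/2 implies but which is not stated in the text reproduced here.

```python
from fractions import Fraction

X = [1, 2, 4]  # coefficients of beta in the three observations


def least_squares(y, x=X):
    """Solve the two normal equations exactly and return (alpha_hat, beta_hat)."""
    n = len(x)
    sx = sum(x)                                  # 7
    sxx = sum(v * v for v in x)                  # 21
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx                      # 3 * 21 - 7 * 7 = 14
    alpha_hat = Fraction(sxx * sy - sx * sxy, det)
    beta_hat = Fraction(n * sxy - sx * sy, det)
    return alpha_hat, beta_hat


# Compare with the closed forms for some made-up observations.
y1, y2, y3 = 3, 5, 10
alpha_hat, beta_hat = least_squares([y1, y2, y3])
assert alpha_hat == Fraction(2 * y1 + y2 - y3, 2)
assert beta_hat == Fraction(-4 * y1 - y2 + 5 * y3, 14)

# alpha_hat is linear in y; its weights are found by feeding in unit vectors.
weights = [least_squares(e)[0] for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
var_alpha = sum(w * w for w in weights)          # assumes Var(eps_i) = 1
print(weights, var_alpha)
```

Because the observations are independent, the variance of a linear combination is the sum of the squared weights times the error variance, which is how the three terms in part ii arise.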
Question 3
(a) Three call centre workers were being monitored for the average number of calls
they answer per daily shift. Worker A answered a total of 187 calls in 4 days.
Worker B answered a total of 347 calls in 6 days. Worker C answered a total of
461 calls in 10 days. Note that these figures are total calls, not daily averages.
The sum of the squares of all 20 days, $\sum x_i^2$, is 50915.
i. Construct a one-way analysis of variance table. (You may exclude the
p-value.)
(7 marks)
ii. Would you say there is a difference between the average daily calls answered
of the three workers? Justify your answer using a 5% significance level.
(3 marks)
ii. At the 5% significance level, the critical value is $F_{0.05,\, 2,\, 17} = 3.59$. Since 3.59 < 5.60, we
reject $H_0: \mu_A = \mu_B = \mu_C$ and conclude that there is evidence of a difference in the
average daily calls answered of the three workers.
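The test statistic of 5.60 used in part ii can be reproduced from the summary figures given in the question. The sketch below uses the usual shortcut formulae for the sums of squares in a one-way analysis of variance; the function name `one_way_anova` and the dictionary keys are hypothetical.

```python
def one_way_anova(totals, days, sum_sq):
    """One-way ANOVA quantities from group totals, group sizes and the sum of squares."""
    n = sum(days)
    k = len(totals)
    correction = sum(totals) ** 2 / n            # n times the squared overall mean
    total_ss = sum_sq - correction
    between_ss = sum(t ** 2 / d for t, d in zip(totals, days)) - correction
    within_ss = total_ss - between_ss
    df_between, df_within = k - 1, n - k
    f_stat = (between_ss / df_between) / (within_ss / df_within)
    return {
        "between": (df_between, between_ss, between_ss / df_between),
        "within": (df_within, within_ss, within_ss / df_within),
        "total": (n - 1, total_ss),
        "F": f_stat,
    }


table = one_way_anova(totals=[187, 347, 461], days=[4, 6, 10], sum_sq=50915)
for source in ("between", "within", "total"):
    print(source, [round(v, 2) for v in table[source]])
print("F =", round(table["F"], 2))
```

The degrees of freedom come out as 2 and 17, matching the critical value quoted above.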
(b) Suppose that one observation is taken from the geometric distribution:
\[
p(x; \pi) = \begin{cases} (1 - \pi)^{x-1} \pi & \text{for } x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}
\]
i. What is the probability that a Type II error will be committed when the
true parameter value is π = 0.4?
(4 marks)
ii. What is the probability that a Type I error will be committed?
(4 marks)
iii. If x = 4, what is the p-value of the test?
(2 marks)
= 0.784.
ii. We have:
= 0.343.
Question 4
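Differentiating:
\[
\frac{d}{d\lambda} l(\lambda) = \frac{2 \sum_{i=1}^{n} X_i}{\lambda} - 2n\lambda = \frac{2 \sum_{i=1}^{n} X_i - 2n\lambda^2}{\lambda}.
\]
Setting to zero, we re-arrange for the estimator:
\[
2 \sum_{i=1}^{n} X_i - 2n\hat{\lambda}^2 = 0 \quad \Rightarrow \quad \hat{\lambda} = \left( \frac{\sum_{i=1}^{n} X_i}{n} \right)^{1/2} = \bar{X}^{1/2}.
\]
\[
\hat{\theta} = \hat{\lambda}^3 = \bar{X}^{3/2}.
\]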
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total). Candidates are strongly advised to
divide their time accordingly.
Section A
Question 1
(a) For each one of the statements below say whether the statement is true or false,
justifying your answer.
i. For any three events A, B and C, if A is independent of B, B is independent
of C and A is independent of C, then the three events are independent.
ii. If A and B are two events with P(A) > 0, P(B) > 0 and such that
P(B | A) > P(B), then it holds that P(A | B) > P(A).
iii. If A, B and C are three events such that A ∩ B ∩ C = ∅ then:
iv. The average of two unbiased estimators of the same parameter is also an
unbiased estimator of the same parameter.
v. Two fair dice are thrown and let A = {the sum of the dice is 9, 10 or 12}
and B = {the dice show a total score that is odd}. The events A and B are
independent.
(10 marks)
\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)} > P(A).
\]
v. True. We have P(A) = 8/36 and P(B) = 1/2, whereas P(A ∩ B) = 4/36 = P(A) P(B), hence A and B
are independent.
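The three probabilities in part v. can be confirmed by listing all 36 equally likely outcomes of the two dice. This is a small illustrative check; the helper names `prob`, `in_a` and `in_b` are hypothetical.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely pairs


def prob(event):
    """Probability of an event, given as a predicate on a (die1, die2) pair."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def in_a(o):
    return sum(o) in (9, 10, 12)      # A: the sum of the dice is 9, 10 or 12


def in_b(o):
    return sum(o) % 2 == 1            # B: the total score is odd


p_a = prob(in_a)
p_b = prob(in_b)
p_ab = prob(lambda o: in_a(o) and in_b(o))
print(p_a, p_b, p_ab, p_ab == p_a * p_b)
```

The only odd total in A is 9, so A ∩ B is the event that the sum is 9.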
(b) Briefly describe the concepts of significance level and power of a hypothesis
test. How is the former linked to the p-value of a hypothesis test?
(5 marks)
(c) The random variable X has the probability density function given by:
\[
f(x) = \begin{cases} kx^4 & \text{for } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}
\]
\[
\int_0^1 kx^4\, dx = \left[ \frac{kx^5}{5} \right]_0^1 = \frac{k}{5} = 1
\]
and so k = 5.
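ii. We have:
\[
E(X) = \int_0^1 x\, f(x)\, dx = \int_0^1 5x^5\, dx = \left[ \frac{5x^6}{6} \right]_0^1 = \frac{5}{6}
\]
and:
\[
E(X^2) = \int_0^1 x^2 f(x)\, dx = \int_0^1 5x^6\, dx = \left[ \frac{5x^7}{7} \right]_0^1 = \frac{5}{7}.
\]
Hence:
\[
\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{5}{7} - \left( \frac{5}{6} \right)^2 = 0.0198.
\]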
(d) Suppose that 20 people are seated in a random manner in a row of 20 lecture
theatre seats. What is the probability that two particular people, A and B, will
be seated next to each other?
(5 marks)
(e) A person tried by a three-judge panel is declared guilty if at least two judges
cast votes of guilty (i.e. a majority verdict). Suppose that when the defendant is
in fact guilty, each judge will independently vote guilty with probability 0.8,
whereas when the defendant is not guilty (i.e. innocent), the probability of
voting guilty drops to 0.15. Suppose 80% of defendants are guilty.
i. Compute the probability that judge 1 votes guilty.
(2 marks)
ii. Given that both judge 1 and judge 2 vote not guilty, compute the probability
that judge 3 votes guilty.
(4 marks)
Section B
Question 2
(a) The teams Arsenal and Bournemouth are meeting in an important football
match. Based on previous encounters and current form, the probabilities of the
various scores are given in the table below, where A and B denote the numbers
of goals scored by Arsenal and Bournemouth, respectively. It is believed that
the probabilities of either team scoring 4 or more goals are so small that these
can be considered to be zero.
                   A
Goals      0      1      2      3
   0     0.05   0.10   0.15   0.10
B  1     0.02   0.07   0.15   0.13
   2     0.02   0.05   0.05   0.05
   3     0.01   0.02   0.02   0.01
P(A = 0, B = 0) + P(A = 1, B = 1) + P(A = 2, B = 2) + P(A = 3, B = 3) = 0.18.
P(B = 1, A = 0) + P(B = 2, A = 0) + P(B = 3, A = 0)
+ P(B = 2, A = 1) + P(B = 3, A = 1) + P(B = 3, A = 2)
= 0.14
and she expects to lose her £1 with probability 0.86. Therefore, her expected profit is:
5 × 0.14 − 1 × 0.86 = −£0.16
that is, an expected loss of £0.16.
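(b) Suppose that you are given independent observations $y_1$, $y_2$ and $y_3$ such that:
\[
\begin{aligned}
y_1 &= \alpha + 2\beta + \varepsilon_1 \\
y_2 &= \alpha + \beta + \varepsilon_2 \\
y_3 &= \alpha + 4\beta + \varepsilon_3.
\end{aligned}
\]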
i. Find the least squares estimators of the parameters α and β, and verify that
they are unbiased estimators.
(7 marks)
ii. Calculate the variance of the estimator of α.
(3 marks)
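We have:
\[
\frac{\partial S}{\partial \alpha} = -2(y_1 - \alpha - 2\beta) - 2(y_2 - \alpha - \beta) - 2(y_3 - \alpha - 4\beta)
= 2(3\alpha + 7\beta - (y_1 + y_2 + y_3))
\]
and:
\[
\frac{\partial S}{\partial \beta} = -4(y_1 - \alpha - 2\beta) - 2(y_2 - \alpha - \beta) - 8(y_3 - \alpha - 4\beta)
= 2(7\alpha + 21\beta - (2y_1 + y_2 + 4y_3)).
\]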
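\[
3\hat{\alpha} + 7\hat{\beta} = y_1 + y_2 + y_3 \quad \text{and} \quad 7\hat{\alpha} + 21\hat{\beta} = 2y_1 + y_2 + 4y_3.
\]
Solving yields:
\[
\hat{\beta} = \frac{-y_1 - 4y_2 + 5y_3}{14} \quad \text{and} \quad \hat{\alpha} = \frac{y_1 + 2y_2 - y_3}{2}.
\]
They are unbiased estimators since:
\[
E(\hat{\beta}) = \frac{1}{14}(-\alpha - 2\beta - 4\alpha - 4\beta + 5\alpha + 20\beta) = \beta
\]
and:
\[
E(\hat{\alpha}) = \frac{1}{2}(\alpha + 2\beta + 2\alpha + 2\beta - \alpha - 4\beta) = \alpha.
\]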
Question 3
(a) Three call centre workers were being monitored for the average number of calls
they answer per daily shift. Worker A answered a total of 321 calls in 6 days.
Worker B answered a total of 548 calls in 9 days. Worker C answered a total of
354 calls in 8 days. Note that these figures are total calls, not daily averages.
The sum of the squares of all 23 days, $\sum x_i^2$, is 66901.
(b) Suppose that one observation is taken from the geometric distribution:
\[
p(x; \pi) = \begin{cases} (1 - \pi)^{x-1} \pi & \text{for } x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}
\]
to test $H_0: \pi = 0.4$ vs. $H_1: \pi > 0.4$. The null hypothesis is rejected if $x \geq 4$.
i. What is the probability that a Type II error will be committed when the
true parameter value is π = 0.6?
(4 marks)
ii. What is the probability that a Type I error will be committed?
(4 marks)
iii. If x = 4, what is the p-value of the test?
(2 marks)
= 0.936.
ii. We have:
\[
P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0) = 1 - P(X \leq 3 \mid \pi = 0.4)
= 1 - \sum_{x=1}^{3} (1 - 0.4)^{x-1} \times 0.4 = 0.216.
\]
iii. The p-value is $P(X \geq 4 \mid \pi = 0.4) = 0.216$.
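All three answers follow from the geometric distribution function, P(X ≤ x) = 1 − (1 − π)^x. The sketch below evaluates them exactly for the rejection region x ≥ 4 stated in the question; the function name `geometric_cdf` is hypothetical.

```python
from fractions import Fraction


def geometric_cdf(x, pi):
    """P(X <= x) for a geometric distribution on 1, 2, ... with parameter pi."""
    return 1 - (1 - pi) ** x


pi_null = Fraction(2, 5)    # value of pi under H0
pi_true = Fraction(3, 5)    # true value of pi used for the Type II error

# Type II error: fail to reject H0 (observe x <= 3) when pi = 0.6.
type_ii = geometric_cdf(3, pi_true)

# Type I error: reject H0 (observe x >= 4) when pi = 0.4.
type_i = 1 - geometric_cdf(3, pi_null)

# p-value for x = 4: probability under H0 of a value at least as large as 4.
p_value = 1 - geometric_cdf(3, pi_null)

print(float(type_ii), float(type_i), float(p_value))
```

The p-value coincides with the Type I error probability here because the observed value x = 4 lies exactly on the boundary of the rejection region.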
Question 4
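Differentiating:
\[
\frac{d}{d\lambda} l(\lambda) = \frac{2 \sum_{i=1}^{n} X_i}{\lambda} - 2n\lambda = \frac{2 \sum_{i=1}^{n} X_i - 2n\lambda^2}{\lambda}.
\]
Setting to zero, we re-arrange for the estimator:
\[
2 \sum_{i=1}^{n} X_i - 2n\hat{\lambda}^2 = 0 \quad \Rightarrow \quad \hat{\lambda} = \left( \frac{\sum_{i=1}^{n} X_i}{n} \right)^{1/2} = \bar{X}^{1/2}.
\]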