
Examiners’ commentaries 2014

ST3133 Advanced statistics: distribution theory (half course)

Important note

This commentary reflects the examination and assessment arrangements for this course in the
academic year 2013–14. The format and structure of the examination may change in future years,
and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide

Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).
You should always attempt to use the most recent edition of any Essential reading textbook, even if
the commentary and/or online reading list and/or subject guide refers to an earlier edition. If
different editions of Essential reading are listed, please check the VLE for reading supplements – if
none are available, please use the contents list and index of the new edition to find the relevant
section.

General remarks

Learning outcomes

By the end of this half course and having completed the Essential reading and activities you should
be able to demonstrate to the Examiners that you are able to:

• recall a large number of distributions and be a competent user of their mass/density and
distribution functions and moment generating functions
• explain relationships between variables, conditioning, independence and correlation
• relate the theory and method taught in the course to solve practical problems.

Format of the examination


• This year there are two papers for two examination zones. Compared with last year, the
ability of candidates varied more widely. One group of candidates showed great
improvement in their ability to answer questions from different parts of the subject guide,
while others also improved, albeit on a smaller scale. That said, many candidates still had
not studied the subject guide thoroughly enough. As in previous years, candidates should
prepare well by studying the subject guide thoroughly and knowing what is covered by the
syllabus, not simply by practising on past papers.
• The format of this year’s examination will be retained for next year’s examination.

Key steps to improvement


• Basic algebra has improved overall, which is an important step towards getting a better
result. Still, quite a number of candidates can only partly remember formulae or procedures
for solving particular problems. This is not good enough, and certainly more practice is
needed. This is especially true for finding the probability of an event involving more than
one random variable, integration by parts, and finding the probability mass/density function
of transformed bivariate random variables.
• When calculating probabilities or expectations, especially when evaluating double integrals,
many candidates got the results wrong because they carelessly used the wrong limits of
integration. Please practise finding the correct limits of integration for the region on which a
joint density is defined.
• Please utilise any hints given in a question, since they can dramatically reduce evaluation
time. Think about how the hint allows you to evaluate the answer. This makes knowledge of
the various ways of approaching a question all the more important, since a question can be
significantly easier when approached in a particular way, especially the way the hint suggests.
For instance, see Question 1 (c) part ii. in the Zone A paper.
• Candidates should be ready to derive the moment generating functions of standard random
variables, like the normal, Gamma, chi-square, Exponential (all continuous), or the
geometric, binomial, Poisson (all discrete), and ideally know the forms by heart. It is also
important to know basic applications of these distributions, and apply the correct formulae
in probability questions.

Question spotting
Many candidates are disappointed to find that their examination performance is poorer
than they expected. This can be due to a number of different reasons and the Examiners’
commentaries suggest ways of addressing common problems and improving your performance.
We want to draw your attention to one particular failing – ‘question spotting’, that is,
confining your examination preparation to a few question topics which have come up in past
papers for the course. This can have very serious consequences.
We recognise that candidates may not cover all topics in the syllabus in the same depth, but
you need to be aware that Examiners are free to set questions on any aspect of the syllabus.
This means that you need to study enough of the syllabus to enable you to answer the required
number of examination questions.
The syllabus can be found in the ‘Course information sheet’ in the section of the VLE dedicated
to this course. You should read the syllabus very carefully and ensure that you cover sufficient
material in preparation for the examination.
Examiners will vary the topics and questions from year to year and may well set questions that
have not appeared in past papers – every topic on the syllabus is a legitimate examination
target. So although past papers can be helpful in revision, you cannot assume that topics or
specific questions that have come up in past examinations will occur again.
If you rely on a question spotting strategy, it is likely you will find yourself in
difficulties when you sit the examination paper. We strongly advise you not to
adopt this strategy.


Comments on specific questions – Zone A

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total). Candidates are strongly advised to
divide their time accordingly.

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let X ∼Geometric(p), that is, it is a discrete random variable with underlying
probability mass function

\[ p_X(x) = q^{x-1} p, \quad 0 < p < 1, \quad x = 1, 2, \ldots, \]

where q = 1 − p. For parts i., ii. and iii., write your answer in terms of p only.
i. Find the probability that X is even.
(5 marks)
ii. Let Y ∼Geometric(p) be independent of X. Find the probability that X + Y
is even.
(5 marks)
iii. Find P (X − Y = 2).
(5 marks)
Approaching the question
This question tests basic evaluations for discrete distributions. You are required to read
the question carefully, so that you are aware that you need to write the answer in terms of p
only. Part i. was done well in general. To find the probability that X is even, the majority
of candidates knew that the answer is obtained by summing all the pX (x) with x even. The
best answer is to realise that the resulting sum is an infinite geometric series, and apply the
correct formula while simplifying the resulting expression to involve only p. Many
candidates forgot to express this in terms of p only, and this resulted in a deduction of
marks. A minority of candidates attempted to use integration in answering the question,
showing rather a weak understanding of probability. See Chapter 3, Learning activity 3.1 on
page 47 of the subject guide for the basic concept of evaluating probabilities, and Example
3.3.9 on page 55 and related materials for the treatment of a geometric distribution.
There are two ways of doing part ii. One is to sum up all probabilities corresponding to
X + Y being even. This is a very complicated way and results in a double summation,
which is difficult to evaluate. The other way, which is the better way to approach the
question, is to observe that X + Y being even means that either both X and Y are odd, or
both are even. This way, you can utilise the answer in part i to the full extent, since
P (X is odd) = 1 − P (X is even). Even if you cannot obtain an expression for part i, writing
the form of the answer in terms of P (X is even) means you have already gained the majority of
the marks. You are also required to understand independence, stemming
from the fact that X and Y are independent. See Definition 2.4.13 on page 28 of the subject
guide.
For part iii, the only effective way to approach the question is to find the joint probability
mass function of (X, Y ) first, which is just a multiplication of respective marginal functions
because of the independence of X and Y . Then, setting x = y + 2, sum over all possible values of
y. Note that if you decide to set y = x − 2 instead, then the sum over all x should be from
x = 3 onwards, since x = 1, 2 result in y = −1, 0, which are not possible values for Y . The
answer requires knowledge of the law of total probability, which can be found in Proposition
2.4.9 on page 26 of the subject guide.
Full solutions are:
i. We have:

\[
P(X \text{ even}) = \sum_{x \text{ even}} q^{x-1} p = \sum_{m=1}^{\infty} q^{2m-1} p = \frac{qp}{1 - q^2} = \frac{1 - p}{2 - p}.
\]

ii. We have:

\begin{align*}
P(X + Y \text{ even}) &= P(X, Y \text{ both odd}) + P(X, Y \text{ both even}) \\
&= P(X \text{ odd}) P(Y \text{ odd}) + P(X \text{ even}) P(Y \text{ even}) \\
&= \frac{1}{(2 - p)^2} + \frac{(1 - p)^2}{(2 - p)^2} \\
&= \frac{2 - 2p + p^2}{(2 - p)^2}.
\end{align*}

iii. We have:

\begin{align*}
P(X - Y = 2) &= P(X = Y + 2) \\
&= \sum_{y=1}^{\infty} P(X = y + 2 \mid Y = y) P(Y = y) \\
&= \sum_{y=1}^{\infty} q^{y+2-1} p \cdot q^{y-1} p \\
&= \sum_{y=1}^{\infty} q^{2y} p^2 \\
&= \frac{q^2 p^2}{1 - q^2} = \frac{p(1 - p)^2}{2 - p}.
\end{align*}
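
When revising, answers of this kind can also be sanity-checked numerically. The sketch below is one possible check using NumPy, with the arbitrary choice p = 0.3; it simply compares empirical frequencies with the closed-form expressions derived above.

    # Monte Carlo check of the Question 1 (a) answers (illustrative only; p chosen arbitrarily).
    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.3
    n = 10**6

    # numpy's geometric variates are supported on {1, 2, ...}, matching p_X(x) = q^(x-1) p
    x = rng.geometric(p, size=n)
    y = rng.geometric(p, size=n)

    print((x % 2 == 0).mean(), (1 - p) / (2 - p))                    # part i.:   P(X even)
    print(((x + y) % 2 == 0).mean(), (2 - 2*p + p**2) / (2 - p)**2)  # part ii.:  P(X + Y even)
    print((x - y == 2).mean(), p * (1 - p)**2 / (2 - p))             # part iii.: P(X - Y = 2)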


(b) Suppose the waiting time X (in minutes) of a particular bus is a random
variable with underlying probability density function
\[ f_X(x) = \exp(3 - kx), \quad x \geq 3. \]
i. Show that k = 1.
(4 marks)
ii. Find the expected waiting time for the bus.
(5 marks)
iii. Suppose you wait for the bus each day. Assuming that waiting times on
different days are independent, what is the probability that for at least 9
days among 10, the waiting time is less than 4 minutes? You can leave your
answer in terms of e.
(6 marks)
Approaching the question
The majority of candidates got part i right, or at least knew the way to proceed. Since you
are given that k = 1 is the answer, the easiest way to do this is to substitute k = 1, and show
that ∫ fX(x) dx = 1. Some candidates treated k as an unknown and solved the equation
involving k after setting ∫ fX(x) dx = 1. This is the correct way to do the question and is
shown in the solutions. However, you would not be able to solve the equation unless you
spotted the obvious answer k = 1. See Section 3.3.2 in the subject guide for more details.
Part ii was done well in general, since this is basically the expectation of an exponential
random variable. Refer to Example 3.4.3 and related materials there on pages 65 and 66 of
the subject guide. Integration by parts is a basic technique needed here, and you should
practise more on this technique.
Part iii requires you to have a combination of knowledge in probability evaluation and the
binomial distribution and its usage. Questions of the type ‘certain events happen y times
among a total of n independent trials’ require use of the binomial distribution. Here, n is 10
days, and y is 9 or 10 days (at least 9 means 9 or 10). Some candidates calculated
probabilities for 0, 1, . . . , 8, and used 1 − ∑_{y=0}^{8} P (Y = y), where Y denotes the number of days
with waiting time less than 4 minutes, which is unnecessarily complicated. Always work out
which way to calculate is easier before committing to evaluating anything. See Example
3.3.8 and related materials there on pages 54 and 55 of the subject guide for the use of the
binomial distribution.
Full solutions are:
i. We use:

\[
1 = \int_3^{\infty} e^{3 - kx} \, dx = e^3 \left[ -\frac{e^{-kx}}{k} \right]_3^{\infty} = \frac{e^{3 - 3k}}{k},
\]
showing that k = 1. (Note that the function e^{3−3k}/k is strictly decreasing for k > 0, so
that k = 1 is the only solution.)
ii. The expected waiting time is, in minutes,
\[
E(X) = \int_3^{\infty} x e^{3 - x} \, dx = e^3 \left( [-x e^{-x}]_3^{\infty} + \int_3^{\infty} e^{-x} \, dx \right) = e^3 (3 e^{-3} + e^{-3}) = 4.
\]

iii. First we find P (X < 4). We have:


\[
P(X < 4) = \int_3^4 e^{3 - x} \, dx = e^3 [-e^{-x}]_3^4 = 1 - e^{-1}.
\]

Hence, the required probability is:


\begin{align*}
P(\text{At least 9 days with } X < 4) &= P(\text{Exactly 9 days with } X < 4) + P(\text{Exactly 10 days with } X < 4) \\
&= \binom{10}{9} (1 - e^{-1})^9 e^{-1} + \binom{10}{10} (1 - e^{-1})^{10} \\
&= (1 - e^{-1})^9 (1 + 9 e^{-1}).
\end{align*}
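
As an illustration of how such answers can be checked when revising, note that with k = 1 the waiting time is 3 plus a unit-rate exponential variable, so a quick simulation check is possible; the sketch below uses NumPy and is only an optional aid.

    # Illustrative simulation check of parts ii. and iii.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**5

    x = 3 + rng.exponential(1.0, size=n)              # density exp(3 - x) on x >= 3
    print(x.mean(), 4)                                # part ii.: E(X) = 4 minutes

    days = 3 + rng.exponential(1.0, size=(n, 10))     # 10 independent days, n replications
    prob = ((days < 4).sum(axis=1) >= 9).mean()
    print(prob, (1 - np.exp(-1))**9 * (1 + 9 * np.exp(-1)))   # part iii.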


(c) Let Y | X = x ∼ Exponential(x), that is, given X = x, the probability density
function for Y | X = x is

\[ f_{Y \mid x}(y) = x e^{-xy}, \quad y > 0. \]

The probability density function for the random variable X is

\[ f_X(x) = 1, \quad 1 < x < 2. \]

i. Find the probability density function for Y , fY (y).
(6 marks)
ii. Evaluate E(Y ). You are allowed to use E(Y | X = x) = 1/x.
(4 marks)

Approaching the question


Part i requires you to find the joint density of X, Y first, which is a test of your basic
knowledge of conditional density. Refer to Definition 5.2.1 in Section 5.2 on page 152, and
equation (5.2) on page 153 of the subject guide for details. Afterwards, the standard
marginalisation formula fX(x) = ∫ fX,Y(x, y) dy can be applied (with the roles of X and Y
interchanged here, since it is fY(y) that is required). Most candidates knew how to do this part,
despite the fact that still many of them found it difficult to evaluate integrals involving the
exponential function. Integration by parts is again the main technique here which you
should master before the examination.
For part ii, there is a hint on using the given conditional expectation. There were some
candidates who tried to calculate the expectation of Y directly using
E(Y) = ∫∫ y fX,Y(x, y) dx dy, which is perfectly fine, except that for this particular question
you would encounter a very difficult integral to do. The best way to approach this question
is to use the hint, and use the knowledge that E(Y ) = E(E(Y | X)), which is the law of
iterated expectations in Proposition 5.4.2 on page 156 of the subject guide. This results in a
very simple integral to do as you can see in the solutions. Some candidates indeed proceeded
this way, but thought that E(1/X) = 1/E(X), which is wrong!
Full solutions are:

i. We have:
\begin{align*}
f_Y(y) &= \int f_{Y,X}(y, x) \, dx \\
&= \int_1^2 f_{Y \mid x}(y) f_X(x) \, dx \\
&= \int_1^2 x e^{-xy} \cdot 1 \, dx \\
&= \left[ \frac{x e^{-yx}}{-y} \right]_1^2 + \int_1^2 \frac{e^{-xy}}{y} \, dx \\
&= \frac{e^{-y} - 2e^{-2y}}{y} + \left[ \frac{e^{-yx}}{-y^2} \right]_1^2 \\
&= \frac{e^{-y} - 2e^{-2y}}{y} + \frac{e^{-y} - e^{-2y}}{y^2}, \quad y > 0.
\end{align*}

ii. We have:
\[
E(Y) = E(E(Y \mid X)) = E\!\left( \frac{1}{X} \right) = \int_1^2 \frac{1}{x} \, dx = \log(2).
\]
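
A short simulation of the hierarchy Y | X = x ∼ Exponential(x), X ∼ Uniform(1, 2) can be used to confirm E(Y) = log 2 when revising; the sketch below (using NumPy) is purely illustrative.

    # Illustrative check of E(Y) = log(2) via the law of iterated expectations.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**6

    x = rng.uniform(1.0, 2.0, size=n)                 # f_X(x) = 1 on (1, 2)
    y = rng.exponential(scale=1.0 / x)                # given X = x, Y has rate x (mean 1/x)
    print(y.mean(), np.log(2))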


Section B

Answer all three questions in this section (60 marks in total).

Question 2

Let X, Y be two random variables having the joint probability density function

\[ f_{X,Y}(x, y) = (x + y) e^{-(x+y)}, \quad 0 < x < y. \]

(a) Let U = X + Y and V = X − Y . Show that the joint density of U, V is


\[ f_{U,V}(u, v) = \frac{1}{2} u e^{-u}, \quad -u < v < 0. \]
(Hint: solve x and y in terms of u and v, and then use 0 < x < y to find the
region where the joint density is valid in terms of u and v)
(7 marks)
(b) Find the marginal density fU (u) for U . You are given that the marginal density
for V is
\[ f_V(v) = \frac{e^v (1 - v)}{2}, \quad v < 0. \]
Are U and V independent?
(4 marks)
(c) Calculate E(U | V = v).
(9 marks)

Approaching the question

You can find multivariate transformations of random variables in Section 4.6.2 on page 129 of the
subject guide. This question was generally done well for part (a), done satisfactorily for part (b),
while basically no-one could finish part (c).

For part (a), applying the formula (4.9) on page 129 of the subject guide is essential. To do this,
evaluating the Jacobian correctly is the first step. Another step, which is less obvious from the
subject guide, is finding the region where the joint density is valid in terms of u and v. This is
why a hint was given. As in the solutions, you need to solve the inequalities 0 < x < y, which is
the given region for fX,Y (x, y), in terms of u and v. The first thing to do is to express x and y in
terms of u and v. Quite a number of candidates got confused and calculated the Jacobian
incorrectly, or applied its inverse. One thing to note is that the absolute value of the Jacobian
should be taken, so that if you calculated J = −1/2, in the formula you should use
|J| = 1/2. Remember that a density cannot be negative. Many candidates also got it wrong
because they forgot to take the absolute value and left the density negative!

For part (b), finding the marginal density for U is not difficult with the joint density fU,V (u, v)
given in part (a). Some candidates forgot the region is −u < v < 0, and hence got the limits of
integration wrong. To see if U and V are independent, we need to see if fU,V (u, v) = fU (u) fV (v)
or not. Please see Section 4.4.1 and Proposition 4.4.3 on page 120 of the subject guide.

Although not needed in this question, it is important to remember that −u < v < 0 means:
\[
\int_0^{\infty} \int_{-u}^{0} f_{U,V}(u, v) \, dv \, du = 1, \quad \text{or} \quad \int_{-\infty}^{0} \int_{-v}^{\infty} f_{U,V}(u, v) \, du \, dv = 1.
\]

Note the limits of integration implied from −u < v < 0 in both integrals, and make sure you
understand why they are as shown above. Drawing the region to visualise it can also help. The
limits of the outer integral must not contain a variable, yet some candidates made this kind of
mistake and got the results wrong in other questions involving double integrals.

For part (c), again some candidates got the limits of integration wrong. We are calculating
E(U | V = v), so we should use the conditional density fU | V (u | v), and limits for u being from
−v to ∞. This region is found by observing that −u < v < 0 means that u > −v and
−u < 0 ⇒ u > 0. Since we always have v < 0, we must have −v > 0, so that u > 0 intersected
with u > −v becomes only u > −v. Hence, in the end, the region is u > −v, meaning the limits
are from −v to ∞. The rest of it is really about the technique of integration, and you need more
practice in order to arrive at the final answer.

Full solutions are:

(a) We have X = (U + V )/2 and Y = (U − V )/2. The Jacobian is:


\[
J = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}.
\]

Hence the joint density has the form:


\[
f_{U,V}(u, v) = f_{X,Y}(x, y)\,|J| = \frac{1}{2} u e^{-u}.
\]
To find the region where this is defined, note that:
\[
0 < x < y \ \Rightarrow\ 0 < \frac{u + v}{2} < \frac{u - v}{2}.
\]
Solving all 3 inequalities, we get:

u > −v, u > v, v < 0

which can be expressed as −u < v < 0. Hence the joint density is as stated in the question.
(b) We have:
\[
f_U(u) = \int_{-u}^{0} \frac{1}{2} u e^{-u} \, dv = \frac{1}{2} u^2 e^{-u}, \quad u > 0.
\]

It is clear that fU,V (u, v) ≠ fU (u) · fV (v), and hence U and V are not independent.
(c) We have:
\begin{align*}
E(U \mid V = v) &= \int_{-v}^{\infty} u \, f_{U \mid v}(u) \, du = \int_{-v}^{\infty} \frac{u^2 e^{-u} / 2}{e^v (1 - v)/2} \, du \\
&= \frac{1}{e^v (1 - v)} \int_{-v}^{\infty} u^2 e^{-u} \, du \\
&= \frac{1}{e^v (1 - v)} \left( [-u^2 e^{-u}]_{-v}^{\infty} + 2 \int_{-v}^{\infty} u e^{-u} \, du \right) \\
&= \frac{1}{e^v (1 - v)} \left( v^2 e^v + 2 [-u e^{-u}]_{-v}^{\infty} + 2 \int_{-v}^{\infty} e^{-u} \, du \right) \\
&= \frac{1}{e^v (1 - v)} \left( v^2 e^v - 2v e^v + 2 e^v \right) \\
&= \frac{(1 - v)^2 + 1}{1 - v}.
\end{align*}
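
If you want to check an answer like part (c) during revision, numerical integration of the joint density is one option. The sketch below (using SciPy, with v = −1 chosen arbitrarily) evaluates E(U | V = v) and fV(v) directly from fU,V(u, v) and compares them with the formulae above.

    # Illustrative numerical check of parts (b) and (c) at a single value of v.
    import numpy as np
    from scipy.integrate import quad

    v = -1.0
    f_uv = lambda u: 0.5 * u * np.exp(-u)             # f_{U,V}(u, v) = u e^{-u} / 2 for u > -v
    num, _ = quad(lambda u: u * f_uv(u), -v, np.inf)  # integral of u f_{U,V}(u, v) du
    den, _ = quad(f_uv, -v, np.inf)                   # this equals f_V(v)
    print(num / den, ((1 - v)**2 + 1) / (1 - v))      # E(U | V = v)
    print(den, np.exp(v) * (1 - v) / 2)               # f_V(v)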

Question 3

Let N be the random variable denoting the number of customers still in a queue at
5pm, where N ∼Poisson(µ). That is, the probability mass function of N is

\[ p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad n = 0, 1, 2, \ldots. \]
Suppose each customer is served sequentially by one person, and the service time
for each customer is Xi ∼Exponential(λ) (in minutes), i = 1, . . . , N , and is
independent of each other. That is, fXi(x) = λe^{−λx}, x > 0.


(a) What is the total service time T after 5pm in terms of N and Xi for
i = 1, . . . , N ?
(1 mark)
(b) Find E(T ). You can use the mean of a Poisson random variable and an
Exponential random variable without proof, as long as you state them clearly.
(4 marks)
(c) Derive the moment generating function of T .
(10 marks)
(d) By differentiating the moment generating function, or otherwise, find var(T ).
You can use the mean and variance of a Poisson random variable and an
Exponential random variable without proof, as long as you state them clearly.
(5 marks)

Approaching the question

This question is about random sums. You can find the related materials in Section 5.6 on page
164 of the subject guide.

For part (a), a lot of candidates got it wrong, apparently because of a lack of knowledge of
random sums. This is definitely disappointing, as the scenario described in the question is a
standard one for applications of random sums.

For part (b), candidates needed to use the formula E(T ) = E(E(T | N )), or directly use the first
formula in Proposition 5.6.3 on page 165 of the subject guide. Hence it comes down to just
testing the mean of a Poisson random variable and the mean of an exponential random variable.
Since many candidates got part (a) wrong, they could not continue to the other parts.

For part (c), you could either directly derive the moment generating function (MGF) from first
principles, or use formula iii in Proposition 5.6.3 in the subject guide. In either case, what
candidates needed was to derive the MGF for X and N first, then T . You should practise
deriving MGFs for standard distributions, whether continuous or discrete. Make sure you
understand the first principles in deriving MGFs for a random sum, which is the approach used
in the solutions.

For part (d), you can either differentiate the MGF, or use

\[
\operatorname{var}(T) = E(\operatorname{var}(T \mid N)) + \operatorname{var}(E(T \mid N)) \ \Rightarrow\ \operatorname{var}(T) = E(N)\operatorname{var}(X) + (E(X))^2 \operatorname{var}(N)
\]

which is formula ii of Proposition 5.6.3 in the subject guide. Make sure you know both ways of
doing this question, and in the examination, if allowed, use the quickest way to do it.

Full solutions are:


(a) The total service time is T = ∑_{i=1}^{N} Xi, with T = 0 when N = 0.

(b) We have:
\[
E(T) = E(E(T \mid N)) = E(N \, E(X_1)) = E(N/\lambda) = \frac{\mu}{\lambda}.
\]

(c) Consider:
\[
M_{X_i}(t) = E(e^{tX_i}) = \int_0^{\infty} \lambda e^{tx} \cdot e^{-\lambda x} \, dx = \int_0^{\infty} \lambda e^{-(\lambda - t)x} \, dx = \frac{\lambda}{\lambda - t}, \quad t < \lambda.
\]

Also, the moment generating function for N is:



\[
M_N(t) = \sum_{n=0}^{\infty} \frac{e^{tn} \mu^n e^{-\mu}}{n!} = \exp(\mu(e^t - 1)), \quad t \in \mathbb{R}.
\]


Hence:
\begin{align*}
M_T(t) &= E\!\left( \exp\!\left( t \sum_{i=1}^{N} X_i \right) \right) \\
&= E\!\left( E\!\left( \exp\!\left( t \sum_{i=1}^{N} X_i \right) \Bigm| N \right) \right) \\
&= E\!\left( \prod_{i=1}^{N} E(e^{tX_i}) \right) \\
&= E\!\left( \left( \frac{\lambda}{\lambda - t} \right)^{N} \right), \quad t < \lambda \\
&= M_N\!\left( \log\!\left( \frac{\lambda}{\lambda - t} \right) \right) \\
&= \exp\!\left( \mu \left( \frac{\lambda}{\lambda - t} - 1 \right) \right) = \exp\!\left( \frac{\mu t}{\lambda - t} \right), \quad t < \lambda.
\end{align*}

(d) We have:

\begin{align*}
M_T'(t) &= M_T(t) \cdot \frac{\mu(\lambda - t) + \mu t}{(\lambda - t)^2} = \frac{\lambda \mu M_T(t)}{(\lambda - t)^2} \\
M_T''(t) &= \frac{\lambda \mu M_T'(t)(\lambda - t)^2 + 2\lambda\mu(\lambda - t) M_T(t)}{(\lambda - t)^4} = \frac{\lambda \mu (\lambda - t) M_T'(t) + 2\lambda\mu M_T(t)}{(\lambda - t)^3}.
\end{align*}
Hence:
\[
\operatorname{var}(T) = M_T''(0) - (E(T))^2 = \frac{\lambda^2 \mu \cdot \frac{\mu}{\lambda} + 2\lambda\mu}{\lambda^3} - \frac{\mu^2}{\lambda^2} = \frac{2\mu}{\lambda^2}.
\]
Another method is to use:

\begin{align*}
\operatorname{var}(T) &= E(\operatorname{var}(T \mid N)) + \operatorname{var}(E(T \mid N)) \\
&= E(N/\lambda^2) + \operatorname{var}(N/\lambda) \\
&= \frac{\mu}{\lambda^2} + \frac{\mu}{\lambda^2} = \frac{2\mu}{\lambda^2}.
\end{align*}
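
Random sums are also easy to simulate, which gives a useful way of checking E(T) and var(T) when revising. The sketch below uses NumPy with the arbitrary values µ = 2 and λ = 0.5.

    # Illustrative simulation of the random sum T = X_1 + ... + X_N.
    import numpy as np

    rng = np.random.default_rng(0)
    mu, lam = 2.0, 0.5
    reps = 10**5

    counts = rng.poisson(mu, size=reps)               # N for each replication
    t = np.array([rng.exponential(1.0 / lam, size=k).sum() for k in counts])  # T = 0 when N = 0
    print(t.mean(), mu / lam)                         # E(T) = mu / lambda
    print(t.var(), 2 * mu / lam**2)                   # var(T) = 2 mu / lambda^2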

Question 4

There are 4 broad weather conditions: Rainy and windy (Rw), Rain only (R),
Cloudy without rain (C) and Sunny (S). The probabilities of the four weather
conditions are 0.4, 0.3, 0.2 and 0.1 respectively.

The probability of being late to work or not depends on which weather condition
you face for the day. The probability that you are late for work when it is Rainy
and windy is 0.4 (so that you are on time for work with probability 0.6). The
probabilities of being late for work when it is Rainy only, Cloudy and Sunny are
respectively 0.3, 0.1 and 0.05.

(a) What is the probability that you are on time for work on a particular day?
(4 marks)
(b) Given you are late, what is the probability that the weather is Sunny?
(4 marks)
(c) Let the travel time be T ∼ N(µ1, σ²) when it is Sunny or Cloudy, and
T ∼ N(µ2, σ²) otherwise. Given T > c, what is the probability that it is Sunny?
Leave your answer in terms of Φ(·), the distribution function of a standard
normal random variable.
(6 marks)


(d) Find E(T ) and var(T ). (Hint: For var(T ), find E(T 2 ) first.)
(6 marks)

Approaching the question

This question is about the use of Bayes’ theorem and the law of total probability. The materials
can be found in Section 2.4.2 on page 26 of the subject guide. Most candidates answered parts (a)
and (b) well. The correct way to approach the question is to calculate the probability that one is
late for work, and then use P (on time) = 1 − P (late). To calculate P (late), we need the law of
total probability, with the 4 different weather conditions giving 4 different probabilities. For part (b), a simple
application of Bayes’ theorem does the job. Candidates who got it wrong usually did not know
how to approach the question completely, or made some careless mistakes when filling in the
many different probabilities. Drawing a tree diagram can certainly help, but is not mandatory.

Part (c) and part (d) proved to be more difficult and many candidates could not finish them. It
is still an application of Bayes’ theorem, but one needs to express the probability P (T > c | S) as
P (T > c | T ∼ N (µ1 , σ 2 )), so that it is equal to
\[
P(T > c \mid T \sim N(\mu_1, \sigma^2)) = P(Z > (c - \mu_1)/\sigma) = 1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right).
\]

Calculating the variance is more straightforward: you just needed to condition on the correct
weather and calculate E(T) and E(T²) under the corresponding weather condition (meaning
different means, µ1 or µ2, for T).

Full solutions are:

(a) We have:

\begin{align*}
P(\text{On time}) &= 1 - P(\text{late}) \\
&= 1 - P(\text{late} \mid S) P(S) - P(\text{late} \mid C) P(C) - P(\text{late} \mid R) P(R) - P(\text{late} \mid Rw) P(Rw) \\
&= 1 - 0.05 \times 0.1 - 0.1 \times 0.2 - 0.3 \times 0.3 - 0.4 \times 0.4 \\
&= 0.725.
\end{align*}

(b) We have:
\[
P(S \mid \text{late}) = \frac{P(S, \text{late})}{P(\text{late})} = \frac{0.05 \times 0.1}{1 - P(\text{On time})} = 0.01818 \ (= 1/55).
\]

(c) We have:

\begin{align*}
P(S \mid T > c) &= \frac{P(S, T > c)}{P(T > c)} \\
&= \frac{P(T > c \mid T \sim N(\mu_1, \sigma^2)) \, P(S)}{P(T > c \mid S \text{ or } C) P(S \text{ or } C) + P(T > c \mid R \text{ or } Rw) P(R \text{ or } Rw)} \\
&= \frac{0.1 \left( 1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right) \right)}{(0.1 + 0.2) \left( 1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right) \right) + (0.3 + 0.4) \left( 1 - \Phi\!\left( \frac{c - \mu_2}{\sigma} \right) \right)} \\
&= \frac{1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right)}{10 - 3\Phi\!\left( \frac{c - \mu_1}{\sigma} \right) - 7\Phi\!\left( \frac{c - \mu_2}{\sigma} \right)}.
\end{align*}

(d) We have:

\begin{align*}
E(T) &= E(T \mid S \text{ or } C) P(S \text{ or } C) + E(T \mid R \text{ or } Rw) P(R \text{ or } Rw) \\
&= 0.3\mu_1 + 0.7\mu_2. \\
E(T^2) &= E(T^2 \mid S \text{ or } C) P(S \text{ or } C) + E(T^2 \mid R \text{ or } Rw) P(R \text{ or } Rw) \\
&= 0.3(\mu_1^2 + \sigma^2) + 0.7(\mu_2^2 + \sigma^2).
\end{align*}


Hence var(T) = E(T²) − (E(T))², which is:
\begin{align*}
\operatorname{var}(T) &= 0.3(\mu_1^2 + \sigma^2) + 0.7(\mu_2^2 + \sigma^2) - (0.3\mu_1 + 0.7\mu_2)^2 \\
&= \sigma^2 + 0.21(\mu_1^2 + \mu_2^2) - 0.42\mu_1\mu_2 \\
&= \sigma^2 + 0.21(\mu_1 - \mu_2)^2.
\end{align*}
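
A simulation can again be used to check answers of this kind. The sketch below (using NumPy and SciPy, with the arbitrary values µ1 = 10, µ2 = 20, σ = 5 and c = 18) checks parts (a), (b), (c) and (d) empirically against the formulae above.

    # Illustrative check of Question 4; weather coded as 0 = Rw, 1 = R, 2 = C, 3 = S.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    p_weather = np.array([0.4, 0.3, 0.2, 0.1])
    p_late = np.array([0.4, 0.3, 0.1, 0.05])
    reps = 10**6

    w = rng.choice(4, size=reps, p=p_weather)
    late = rng.random(reps) < p_late[w]
    print(1 - late.mean(), 0.725)                          # (a) P(on time)
    print((late & (w == 3)).mean() / late.mean(), 1/55)    # (b) P(Sunny | late)

    mu1, mu2, sigma, c = 10.0, 20.0, 5.0, 18.0
    t = rng.normal(np.where(w >= 2, mu1, mu2), sigma)      # Sunny or Cloudy -> mu1, otherwise mu2
    tail1 = 1 - norm.cdf((c - mu1) / sigma)
    tail2 = 1 - norm.cdf((c - mu2) / sigma)
    print(((w == 3) & (t > c)).mean() / (t > c).mean(),
          0.1 * tail1 / (0.3 * tail1 + 0.7 * tail2))       # (c) P(Sunny | T > c)
    print(t.mean(), 0.3 * mu1 + 0.7 * mu2)                 # (d) E(T)
    print(t.var(), sigma**2 + 0.21 * (mu1 - mu2)**2)       # (d) var(T)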


Comments on specific questions – Zone B

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let X ∼Geometric(p), that is, it is a discrete random variable with underlying
probability mass function

\[ p_X(x) = q^{x-1} p, \quad 0 < p < 1, \quad x = 1, 2, \ldots, \]

where q = 1 − p. For parts i., ii., and iii., write your answer in terms of q only.
i. Find the probability that X is odd.
(5 marks)
ii. Let Y ∼Geometric(p) be independent of X. Find the probability that X − Y
is odd.
(5 marks)
iii. Find P (X − Y = 3).
(5 marks)
Approaching the question
This question is parallel to Question 1(a) in the Zone A paper, only this time you need to
answer in terms of q rather than in terms of p. The performance of candidates overall was
similar to those sitting the Zone A paper. For part i, you need to sum the probabilities that
X is odd and simplify to involve only q. Part ii requires you to observe that X − Y being
odd means that either X is even and Y is odd, or vice versa. Then the same procedure as in
answering the corresponding Zone A question applies using the answer in part i. Part iii is
very similar to that in the Zone A paper, only this time X − Y = 3 instead of 2. The
treatment is exactly the same, and hence you should refer to the comments in the Zone A
paper.
Full solutions are:

i. We have:

\[
P(X \text{ odd}) = \sum_{x \text{ odd}} q^{x-1} p = \sum_{m=1}^{\infty} q^{2m-2} p = \frac{p}{1 - q^2} = \frac{1}{1 + q}.
\]

ii. We have:

\begin{align*}
P(X - Y \text{ odd}) &= P(X \text{ even}, Y \text{ odd}) + P(X \text{ odd}, Y \text{ even}) \\
&= P(X \text{ even}) P(Y \text{ odd}) + P(X \text{ odd}) P(Y \text{ even}) \\
&= \frac{q}{(1 + q)^2} + \frac{q}{(1 + q)^2} \\
&= \frac{2q}{(1 + q)^2}.
\end{align*}

iii. We have:

\begin{align*}
P(X - Y = 3) &= P(X = Y + 3) \\
&= \sum_{y=1}^{\infty} P(X = y + 3 \mid Y = y) P(Y = y) \\
&= \sum_{y=1}^{\infty} q^{y+3-1} p \cdot q^{y-1} p \\
&= \sum_{y=1}^{\infty} q^{2y+1} p^2 \\
&= \frac{q^3 p^2}{1 - q^2} = \frac{q^3 (1 - q)}{1 + q}.
\end{align*}
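
As in Zone A, these answers can be checked by a quick simulation when revising; the sketch below uses NumPy with the arbitrary choice p = 0.3 (so q = 0.7).

    # Illustrative Monte Carlo check of Question 1 (a), Zone B.
    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.3
    q = 1 - p
    n = 10**6

    x = rng.geometric(p, size=n)
    y = rng.geometric(p, size=n)
    print((x % 2 == 1).mean(), 1 / (1 + q))                  # part i.:   P(X odd)
    print(((x - y) % 2 == 1).mean(), 2 * q / (1 + q)**2)     # part ii.:  P(X - Y odd)
    print((x - y == 3).mean(), q**3 * (1 - q) / (1 + q))     # part iii.: P(X - Y = 3)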

(b) Suppose the waiting time X (in minutes) of a particular bus is a random
variable with underlying probability density function

\[ f_X(x) = k \exp\{-(2k - 1)(x - 1)\}, \quad x \geq 1. \]

i. Show that k = 1.
(4 marks)
ii. Find the expected waiting time for the bus.
(5 marks)
iii. You wait for the bus every day. Assuming waiting times on different days are
independent, what is the probability that for at least 8 days among 10, the
waiting time is less than 2 minutes? You can leave your answer in terms of e.
(6 marks)
Approaching the question
This question is parallel to Question 1 (b) in the Zone A paper. The best way to approach
part i is to substitute k = 1, and show that ∫ fX(x) dx = 1. Part ii requires the use of the
expectation formula for continuous random variables, and part iii requires knowledge of the
binomial distribution. Refer to the comments for Question 1 (b) in the Zone A paper for
details.


Full solutions are:

i. We use:

\[
1 = \int_1^{\infty} k e^{-(2k-1)(x-1)} \, dx = \left[ -\frac{k e^{-(2k-1)(x-1)}}{2k - 1} \right]_1^{\infty} = \frac{k}{2k - 1},
\]
showing that k = 1.
ii. The expected waiting time is, in minutes:
\begin{align*}
E(X) &= \int_1^{\infty} x e^{-(x-1)} \, dx \\
&= [-x e^{-(x-1)}]_1^{\infty} + \int_1^{\infty} e^{-(x-1)} \, dx \\
&= 1 + [-e^{-(x-1)}]_1^{\infty} \\
&= 2.
\end{align*}

iii. First we find P (X < 2). We have:


\[
P(X < 2) = \int_1^2 e^{-(x-1)} \, dx = [-e^{-(x-1)}]_1^2 = 1 - e^{-1}.
\]

Hence, the required probability is:

\begin{align*}
P(\text{At least 8 days with } X < 2) &= P(\text{Exactly 8 days with } X < 2) \\
&\quad + P(\text{Exactly 9 days with } X < 2) \\
&\quad + P(\text{Exactly 10 days with } X < 2) \\
&= \binom{10}{8} (1 - e^{-1})^8 e^{-2} + \binom{10}{9} (1 - e^{-1})^9 e^{-1} + \binom{10}{10} (1 - e^{-1})^{10} \\
&= (1 - e^{-1})^8 (1 + 8e^{-1} + 36e^{-2}).
\end{align*}
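
With k = 1 the waiting time is 1 plus a unit-rate exponential variable, so a simulation check analogous to the Zone A sketch is straightforward (NumPy, illustrative only):

    # Illustrative simulation check of parts ii. and iii., Zone B.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**5

    x = 1 + rng.exponential(1.0, size=n)              # density exp(-(x - 1)) on x >= 1
    print(x.mean(), 2)                                # part ii.: E(X) = 2 minutes

    days = 1 + rng.exponential(1.0, size=(n, 10))
    prob = ((days < 2).sum(axis=1) >= 8).mean()
    print(prob, (1 - np.exp(-1))**8 * (1 + 8*np.exp(-1) + 36*np.exp(-2)))   # part iii.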

(c) Assume Y | X = x has the probability density function

\[ f_{Y \mid x}(y) = x e^{-x(y-3)}, \quad y > 3. \]

The probability density function for the random variable X is

\[ f_X(x) = \frac{1}{x}, \quad 1 < x < e. \]

i. Find the probability density function for Y , fY (y).


(6 marks)
ii. Evaluate E(Y ). You are allowed to use E(Y | X = x) = 3 + 1/x.
(4 marks)

Approaching the question


This part corresponds to Question 1 (c) of the Zone A paper, but note that the density for y
starts from 3 rather than from 0. Refer to the comments on Question 1 (c) for the Zone A
paper. Note that the density fX (x) here is different from the one in Zone A, but the idea is
the same: use the law of iterated expectations E(Y ) = E(E(Y | X)), so that in the end you
need to calculate E(1/X) again. Just remember to use the expectation formula correctly,
rather than using the wrong formula E(1/X) = 1/E(X) as some candidates did!


Full solutions are:

i. We have:
\begin{align*}
f_Y(y) &= \int f_{Y,X}(y, x) \, dx \\
&= \int_1^e f_{Y \mid x}(y) f_X(x) \, dx \\
&= \int_1^e e^{-x(y-3)} \, dx \\
&= \left[ \frac{e^{-x(y-3)}}{-(y - 3)} \right]_1^e \\
&= \frac{e^{-(y-3)} - e^{-e(y-3)}}{y - 3}, \quad y > 3.
\end{align*}

ii. We have:
\[
E(Y) = E(E(Y \mid X)) = E\!\left( 3 + \frac{1}{X} \right) = 3 + \int_1^e \frac{1}{x^2} \, dx = 4 - e^{-1}.
\]
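
Since FX(x) = log x on (1, e), X can be simulated by inversion as exp(U) with U uniform on (0, 1), which gives a quick check of E(Y) = 4 − e⁻¹; the sketch below (NumPy) is only an illustration.

    # Illustrative check of E(Y) for the Zone B hierarchy.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**6

    x = np.exp(rng.uniform(0.0, 1.0, size=n))         # density 1/x on (1, e), by inversion
    y = 3 + rng.exponential(scale=1.0 / x)            # given X = x, Y - 3 has rate x
    print(y.mean(), 4 - np.exp(-1))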

Section B

Answer all three questions in this section (60 marks in total).

Question 2

Let X, Y be two random variables having the joint probability density function

\[ f_{X,Y}(x, y) = \frac{4}{3} y e^{-(x+y)}, \quad 0 < x < y. \]
(a) Let U = X + Y and V = X − Y . Show that the joint density of U, V is

\[ f_{U,V}(u, v) = \frac{1}{3}(u - v) e^{-u}, \quad -u < v < 0. \]
(Hint: solve x and y in terms of u and v, and then use 0 < x < y to find the
region where the joint density is valid in terms of u and v)
(7 marks)
(b) Find the marginal density fU (u) for U . You are given that the marginal density
for V is
\[ f_V(v) = \frac{(1 - 2v) e^v}{3}, \quad v < 0. \]
Are U and V independent?
(4 marks)
(c) Calculate E(U | V = v).
(9 marks)

Approaching the question

The only difference between this question and Question 2 of the Zone A paper is the form of the
density. You can see that the resulting region of integration is exactly the same: what matters is
not so much the form of the density as the region of integration. All other calculations are in parallel
to the corresponding parts in the Zone A paper. Please see the comments there for details.


Full solutions are:

(a) We have X = (U + V )/2 and Y = (U − V )/2. The Jacobian is:


\[
J = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}.
\]

Hence the joint density has the form:


\[
f_{U,V}(u, v) = f_{X,Y}(x, y)\,|J| = \frac{4}{3} \cdot \frac{u - v}{2} \, e^{-u} \cdot \frac{1}{2} = \frac{1}{3}(u - v) e^{-u}.
\]
To find the region where this is defined, note that:
\[
0 < x < y \ \Rightarrow\ 0 < \frac{u + v}{2} < \frac{u - v}{2}.
\]
Solving all 3 inequalities, we get:

u > −v, u > v, v < 0

which can be expressed as −u < v < 0. Hence the joint density is as stated in the question.
(b) We have:
\[
f_U(u) = \int_{-u}^{0} \frac{1}{3}(u - v) e^{-u} \, dv = \frac{1}{2} u^2 e^{-u}, \quad u > 0.
\]
It is clear that fU,V (u, v) ≠ fU (u) · fV (v), and hence U and V are not independent.
(c) We have:
\begin{align*}
E(U \mid V = v) &= \int_{-v}^{\infty} u \, f_{U \mid v}(u) \, du = \int_{-v}^{\infty} \frac{u(u - v) e^{-u} / 3}{e^v (1 - 2v)/3} \, du \\
&= \frac{1}{e^v (1 - 2v)} \int_{-v}^{\infty} (u^2 - uv) e^{-u} \, du \\
&= \frac{1}{e^v (1 - 2v)} \left( [-(u^2 - uv) e^{-u}]_{-v}^{\infty} + \int_{-v}^{\infty} (2u - v) e^{-u} \, du \right) \\
&= \frac{1}{e^v (1 - 2v)} \left( 2v^2 e^v + [-(2u - v) e^{-u}]_{-v}^{\infty} + 2 \int_{-v}^{\infty} e^{-u} \, du \right) \\
&= \frac{1}{e^v (1 - 2v)} \left( 2v^2 e^v - 3v e^v + 2 e^v \right) \\
&= \frac{2v^2 - 3v + 2}{1 - 2v}.
\end{align*}
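
As in Zone A, part (c) can be checked by numerical integration of the joint density at a chosen value of v. The sketch below (SciPy, with v = −1 chosen arbitrarily) is only an illustration.

    # Illustrative numerical check of parts (b) and (c) at a single value of v, Zone B.
    import numpy as np
    from scipy.integrate import quad

    v = -1.0
    f_uv = lambda u: (u - v) * np.exp(-u) / 3.0       # f_{U,V}(u, v) = (u - v) e^{-u} / 3 for u > -v
    num, _ = quad(lambda u: u * f_uv(u), -v, np.inf)
    den, _ = quad(f_uv, -v, np.inf)                   # this equals f_V(v)
    print(num / den, (2*v**2 - 3*v + 2) / (1 - 2*v))  # E(U | V = v)
    print(den, (1 - 2*v) * np.exp(v) / 3)             # f_V(v)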

Question 3

Let N be the random variable denoting the number of customers still in a queue at
5pm (no more customers come in afterwards), where N ∼Poisson(µ). That is, the
probability mass function of N is

\[ p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad n = 0, 1, 2, \ldots. \]
Suppose each customer is served sequentially by one person, and the service time
for each customer is, in minutes,

\[
X_i \sim \begin{cases} \text{Exponential}(\lambda_1), & \text{if } Y_i = 1; \\ \text{Exponential}(\lambda_2), & \text{if } Y_i = 0, \end{cases} \qquad i = 1, \ldots, N,
\]
and the service times are independent of each other. The random variables Yi are independent of
the Xi ’s and N , and are independent and identically distributed with Yi ∼ Bernoulli(1/2).
That is, fXi |Yi =1 (x) = λ1 e^{−λ1 x} and fXi |Yi =0 (x) = λ2 e^{−λ2 x}, x > 0.


(a) What is the total service time T after 5pm in terms of N and Xi for
i = 1, . . . , N ?
(1 mark)
(b) Find E(T ). You can use the mean of a Poisson random variable and an
Exponential random variable without proof, as long as you state them clearly.
(4 marks)
(c) Show that the moment generating function of T in terms of λ1 and λ2 is given
by
\[
M_T(t) = \exp\!\left( \mu \left( \frac{\lambda_1}{2(\lambda_1 - t)} + \frac{\lambda_2}{2(\lambda_2 - t)} - 1 \right) \right),
\]
where t < min(λ1 , λ2 ).
(10 marks)
(d) If λ1 = λ2 = λ, by differentiating the moment generating function, or otherwise,
find var(T ). You can use the mean and variance of a Poisson random variable
and an Exponential random variable without proof, as long as you state them
clearly.
(5 marks)

Approaching the question

The difference between this question and Question 3 of the Zone A paper is that Xi follows an
exponential distribution, but with a rate that can be different according to an independent
Bernoulli random variable Yi . The major technique here is that when taking the expectation of
Xi or a function of Xi , say g(Xi ), we need to use the law of iterated expectations (Proposition
5.4.2 on page 156 of the subject guide):

\begin{align*}
E(g(X_i)) &= E(E(g(X_i) \mid Y_i)) \\
&= E(g(X_i) \mid Y_i = 1) P(Y_i = 1) + E(g(X_i) \mid Y_i = 0) P(Y_i = 0)
\end{align*}

because the random variable E(g(Xi) | Yi), like Yi, takes only two values. Parts (b) and (c)
both need to use the above, and part (d) is exactly the same as in the Zone A paper. See the
comments on Question 3 for the Zone A paper for more details.

Full solutions are:


(a) The total service time is T = ∑_{i=1}^{N} Xi, and T = 0 when N = 0.

(b) We have:

\begin{align*}
E(T) &= E(E(T \mid N)) \\
&= E(N \, E(X_1)) \\
&= E\big(N \{E(X_1 \mid Y_1 = 1) P(Y_1 = 1) + E(X_1 \mid Y_1 = 0) P(Y_1 = 0)\}\big) \\
&= \mu \left( \frac{1}{\lambda_1} \cdot \frac{1}{2} + \frac{1}{\lambda_2} \cdot \frac{1}{2} \right) = \frac{\mu}{2} \left( \frac{1}{\lambda_1} + \frac{1}{\lambda_2} \right).
\end{align*}

(c) Consider:

\begin{align*}
M_{X_i}(t) &= E(e^{tX_i}) = E(e^{tX_i} \mid Y_i = 1) P(Y_i = 1) + E(e^{tX_i} \mid Y_i = 0) P(Y_i = 0) \\
&= \frac{1}{2} \int_0^{\infty} \lambda_1 e^{tx} \cdot e^{-\lambda_1 x} \, dx + \frac{1}{2} \int_0^{\infty} \lambda_2 e^{tx} \cdot e^{-\lambda_2 x} \, dx \\
&= \frac{1}{2} \int_0^{\infty} \lambda_1 e^{-(\lambda_1 - t)x} \, dx + \frac{1}{2} \int_0^{\infty} \lambda_2 e^{-(\lambda_2 - t)x} \, dx \\
&= \frac{\lambda_1}{2(\lambda_1 - t)} + \frac{\lambda_2}{2(\lambda_2 - t)}, \quad t < \min(\lambda_1, \lambda_2).
\end{align*}


Also, the moment generating function for N is:



\[
M_N(t) = \sum_{n=0}^{\infty} \frac{e^{tn} \mu^n e^{-\mu}}{n!} = \exp(\mu(e^t - 1)), \quad t \in \mathbb{R}.
\]

Hence:
\begin{align*}
M_T(t) &= E\!\left( \exp\!\left( t \sum_{i=1}^{N} X_i \right) \right) \\
&= E\!\left( E\!\left( \exp\!\left( t \sum_{i=1}^{N} X_i \right) \Bigm| N \right) \right) \\
&= E\!\left( \prod_{i=1}^{N} E(e^{tX_i}) \right) \\
&= E\!\left( \left( \frac{\lambda_1}{2(\lambda_1 - t)} + \frac{\lambda_2}{2(\lambda_2 - t)} \right)^{N} \right), \quad t < \min(\lambda_1, \lambda_2) \\
&= M_N\!\left( \log\!\left( \frac{\lambda_1}{2(\lambda_1 - t)} + \frac{\lambda_2}{2(\lambda_2 - t)} \right) \right) \\
&= \exp\!\left( \mu \left( \frac{\lambda_1}{2(\lambda_1 - t)} + \frac{\lambda_2}{2(\lambda_2 - t)} - 1 \right) \right), \quad t < \min(\lambda_1, \lambda_2).
\end{align*}

(d) With λ1 = λ2 = λ, the moment generating function reduces to MT(t) = exp(µt/(λ − t)), as in the Zone A paper. We then have:
\begin{align*}
M_T'(t) &= M_T(t) \cdot \frac{\mu(\lambda - t) + \mu t}{(\lambda - t)^2} = \frac{\lambda \mu M_T(t)}{(\lambda - t)^2} \\
M_T''(t) &= \frac{\lambda \mu M_T'(t)(\lambda - t)^2 + 2\lambda\mu(\lambda - t) M_T(t)}{(\lambda - t)^4} = \frac{\lambda \mu (\lambda - t) M_T'(t) + 2\lambda\mu M_T(t)}{(\lambda - t)^3}.
\end{align*}
Hence:
\[
\operatorname{var}(T) = M_T''(0) - (E(T))^2 = \frac{\lambda^2 \mu \cdot \frac{\mu}{\lambda} + 2\lambda\mu}{\lambda^3} - \frac{\mu^2}{\lambda^2} = \frac{2\mu}{\lambda^2}.
\]
Another method is to use:
\begin{align*}
\operatorname{var}(T) &= E(\operatorname{var}(T \mid N)) + \operatorname{var}(E(T \mid N)) \\
&= E(N/\lambda^2) + \operatorname{var}(N/\lambda) \\
&= \frac{\mu}{\lambda^2} + \frac{\mu}{\lambda^2} = \frac{2\mu}{\lambda^2}.
\end{align*}
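
The mixture structure is also easy to simulate, which gives a check of parts (b) and (c). The sketch below (NumPy, with the arbitrary values µ = 2, λ1 = 0.5, λ2 = 1.5 and t = 0.2) compares the simulated mean and an empirical estimate of the MGF with the formulae above.

    # Illustrative simulation of the random sum with mixed exponential service times.
    import numpy as np

    rng = np.random.default_rng(0)
    mu, lam1, lam2 = 2.0, 0.5, 1.5
    reps = 10**5

    def total_time(k):
        # each customer's rate is lam1 or lam2 with probability 1/2 (the Bernoulli Y_i)
        rates = np.where(rng.random(k) < 0.5, lam1, lam2)
        return rng.exponential(1.0 / rates).sum()     # returns 0.0 when k = 0

    counts = rng.poisson(mu, size=reps)
    t = np.array([total_time(k) for k in counts])
    print(t.mean(), 0.5 * mu * (1/lam1 + 1/lam2))     # part (b): E(T)

    t0 = 0.2                                          # any t0 < min(lam1, lam2)
    print(np.exp(t0 * t).mean(),
          np.exp(mu * (lam1/(2*(lam1 - t0)) + lam2/(2*(lam2 - t0)) - 1)))   # part (c): M_T(t0)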

Question 4

A certain product is produced by five different machines: A, B, C, D and E. The
proportions of the total items produced by each machine are 0.1, 0.3, 0.2, 0.25 and
0.15, respectively. The probability that an item is faulty depends on the machine
that produced it. The proportions of faulty items are 0.01, 0.2, 0.1, 0.05 and 0.02
for products produced by machines A, B, C, D and E, respectively.

(a) What is the probability that a randomly chosen item is faulty?


(4 marks)
(b) Given an item is faulty, what is the probability that it is produced by machine
B?
(4 marks)
(c) Let the selling price of an item be S ∼ N(µ1, σ²) when it is produced by
machines A, B or C, and S ∼ N(µ2, σ²) otherwise. Given S > c, what is the
probability that it is produced by machine A? Leave your answer in terms of
Φ(·), the distribution function of a standard normal random variable.
(6 marks)


(d) Find E(S) and var(S). (Hint: For var(S), find E(S 2 ) first.)
(6 marks)

Approaching the question

This question is parallel to Question 4 of the Zone A paper. Hence please see the comments there.

Full solutions are:

(a) We have:

\begin{align*}
P(\text{Faulty}) &= P(\text{Faulty} \mid A) P(A) + P(\text{Faulty} \mid B) P(B) \\
&\quad + P(\text{Faulty} \mid C) P(C) + P(\text{Faulty} \mid D) P(D) + P(\text{Faulty} \mid E) P(E) \\
&= 0.01 \times 0.1 + 0.2 \times 0.3 + 0.1 \times 0.2 + 0.05 \times 0.25 + 0.02 \times 0.15 \\
&= 0.0965.
\end{align*}

(b) We have:

\[
P(B \mid \text{Faulty}) = \frac{P(B, \text{Faulty})}{P(\text{Faulty})} = \frac{0.2 \times 0.3}{P(\text{Faulty})} = 0.6217617 \ (= 120/193).
\]

(c) We have:

\begin{align*}
P(A \mid S > c) &= \frac{P(A, S > c)}{P(S > c)} \\
&= \frac{P(S > c \mid S \sim N(\mu_1, \sigma^2)) \, P(A)}{P(S > c \mid A, B \text{ or } C) P(A, B \text{ or } C) + P(S > c \mid D \text{ or } E) P(D \text{ or } E)} \\
&= \frac{0.1 \left( 1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right) \right)}{(0.1 + 0.3 + 0.2) \left( 1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right) \right) + (0.25 + 0.15) \left( 1 - \Phi\!\left( \frac{c - \mu_2}{\sigma} \right) \right)} \\
&= \frac{1 - \Phi\!\left( \frac{c - \mu_1}{\sigma} \right)}{10 - 6\Phi\!\left( \frac{c - \mu_1}{\sigma} \right) - 4\Phi\!\left( \frac{c - \mu_2}{\sigma} \right)}.
\end{align*}

(d) We have:

\begin{align*}
E(S) &= E(S \mid A, B \text{ or } C) P(A, B \text{ or } C) + E(S \mid D \text{ or } E) P(D \text{ or } E) \\
&= 0.6\mu_1 + 0.4\mu_2. \\
E(S^2) &= E(S^2 \mid A, B \text{ or } C) P(A, B \text{ or } C) + E(S^2 \mid D \text{ or } E) P(D \text{ or } E) \\
&= 0.6(\mu_1^2 + \sigma^2) + 0.4(\mu_2^2 + \sigma^2).
\end{align*}
Hence var(S) = E(S²) − (E(S))², which is:
\begin{align*}
\operatorname{var}(S) &= 0.6(\mu_1^2 + \sigma^2) + 0.4(\mu_2^2 + \sigma^2) - (0.6\mu_1 + 0.4\mu_2)^2 \\
&= \sigma^2 + 0.24(\mu_1^2 + \mu_2^2) - 0.48\mu_1\mu_2 \\
&= \sigma^2 + 0.24(\mu_1 - \mu_2)^2.
\end{align*}
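
As with the Zone A question, a short simulation provides a check. The sketch below (NumPy, with the arbitrary values µ1 = 50, µ2 = 80 and σ = 10) checks parts (a), (b) and (d) empirically; part (c) can be checked in the same way as in the Zone A sketch.

    # Illustrative check of Question 4, Zone B; machines coded 0..4 for A..E.
    import numpy as np

    rng = np.random.default_rng(0)
    p_machine = np.array([0.1, 0.3, 0.2, 0.25, 0.15])
    p_faulty = np.array([0.01, 0.2, 0.1, 0.05, 0.02])
    reps = 10**6

    m = rng.choice(5, size=reps, p=p_machine)
    faulty = rng.random(reps) < p_faulty[m]
    print(faulty.mean(), 0.0965)                                 # (a) P(faulty)
    print((faulty & (m == 1)).mean() / faulty.mean(), 120/193)   # (b) P(B | faulty)

    mu1, mu2, sigma = 50.0, 80.0, 10.0
    s = rng.normal(np.where(m <= 2, mu1, mu2), sigma)            # A, B, C -> mu1; D, E -> mu2
    print(s.mean(), 0.6 * mu1 + 0.4 * mu2)                       # (d) E(S)
    print(s.var(), sigma**2 + 0.24 * (mu1 - mu2)**2)             # (d) var(S)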
