
Problem set 1 - Answers

1. This requires you to know how to transform a normally distributed variable into a standard normal variable: you subtract the mean and divide by the square root of the variance (the standard deviation). If X ~ N(μ, σ²) then

Z = (X − μ)/σ ~ N(0, 1).

Hence the answer is (d).
But let’s look in more detail at the alternative options to see why they are wrong.

a. X + μ ~ N(0, σ²). This is incorrect. We can use the properties of the expectations operator at the end of Section 1 in the lecture notes to see what effect adding a constant to the variable has.
For example, if the mean of X is μ then the mean of X + a will be μ + a, i.e. adding a constant a shifts the mean by a.
Applying this rule to our question, the mean of X + μ is μ + μ = 2μ, and so X + μ ~ N(2μ, σ²).
b. (X − μ)/σ² ~ N(0, 1). This is incorrect. The subtraction of μ does give us a 0 mean (using the explanation from part (a) you should be able to see this). But what about the scale factor of 1/σ²? Using the properties of variance detailed at the end of Section 1 we can see that var(aX) = a²var(X).
Applying this to the question, var(X/σ²) = (1/σ²)²var(X) = σ²/(σ²)² = 1/σ².
And so (X − μ)/σ² ~ N(0, 1/σ²).
X − 2
c. ~ N (0,1) . This is incorrect. Applying the above rules you should be able to see

that
X − 2
~ N ( −  2 , 2  2 ) .

d. (X − μ)/σ ~ N(0, 1). This is correct, as you should be able to see by applying the rules.
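The effect of each transformation can be checked with Python's `statistics.NormalDist`, which supports shifting and scaling a normal distribution by constants. A minimal sketch with illustrative values μ = 10 and σ² = 4 (chosen for the example; the question is general):

```python
from statistics import NormalDist

mu, sigma2 = 10.0, 4.0          # illustrative values (assumed)
sigma = sigma2 ** 0.5
X = NormalDist(mu, sigma)       # X ~ N(mu, sigma^2)

# Option (d): subtract the mean, divide by the standard deviation.
Z = (X - mu) / sigma
print(Z.mean, Z.variance)       # 0.0 1.0 -> standard normal

# Option (a): adding mu shifts the mean to 2*mu; the variance is unchanged.
A = X + mu
print(A.mean, A.variance)       # 20.0 4.0

# Option (b): dividing by the variance scales the variance to 1/sigma^2.
B = (X - mu) / sigma2
print(B.mean, B.variance)       # 0.0 0.25
```

Only option (d) produces mean 0 and variance 1 regardless of the values chosen for μ and σ².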
2. So, the same thing here, except we also have to notice that two options are correct. As in Q1
above, we know that (b) is correct. But (c) is also correct because the difference between the
value of X and its mean will still have a variance of 𝜎 2 . The correct option is therefore (e).
3. We are told that X ~ N (132, 25) .
i) P(X > 140) = A

[Figure: the density f(x) of X, centred on 132, with the area A above x = 140 shaded.]
To find area A we need to transform this normally distributed variable into a standard normal variable, i.e. Z ~ N(0, 1). That way we can use the standard normal distribution table to find the associated probabilities. We know that

Z = (X − 132)/√25 = (X − 132)/5 ~ N(0, 1)

and so the equivalent of finding P(X > 140) is to find P(Z > (140 − 132)/5) = P(Z > 1.6).
[Figure: the standard normal density f(z) with the area A above z = 1.6 shaded.]
It means that 140 is 1.6 standard deviations above the mean of 132.
Looking at the Z table we can see that the probability is about 0.055, i.e. P(Z > 1.6) ≈ 0.055.
Therefore P(X > 140) ≈ 0.055, and so there is about a 5.5% chance of getting a value for X greater than 140 when it has this distribution.
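As a quick check, the same probability can be computed with `statistics.NormalDist`, which gives the exact normal CDF rather than a table value rounded to three decimals:

```python
from statistics import NormalDist

X = NormalDist(mu=132, sigma=5)      # X ~ N(132, 25), so sigma = sqrt(25) = 5
z = (140 - 132) / 5                  # the equivalent Z value
p = 1 - X.cdf(140)                   # P(X > 140)
print(round(z, 1), round(p, 4))      # 1.6 0.0548
```

The table value 0.055 is just this 0.0548 rounded.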
 130 − 132 
ii) P ( X  130 ) = P  Z   = P ( Z  −0.4 ) . Given that the standard normal is
 5 
symmetric around 0 this is equivalent to finding P ( Z  0.4 ) .
[Figure: the standard normal density f(z); by symmetry, the area above z = −0.4 equals the area below z = 0.4.]
The SN table gives only the probability that lies above certain Z values. Therefore we cannot read off P(Z < 0.4) directly. But we do know that

P(Z < 0.4) = 1 − P(Z > 0.4) = 1 − 0.345 = 0.655. Therefore P(Z > −0.4) = 0.655.

iii)

P(125 < X < 135) = P((125 − 132)/5 < Z < (135 − 132)/5) = P(−1.4 < Z < 0.6) = P(Z < 0.6) − P(Z < −1.4)
[Figure: the standard normal density f(z) with the area between z = −1.4 and z = 0.6 shaded.]
P(Z < 0.6) = 1 − P(Z > 0.6) = 1 − 0.274 = 0.726
P(Z < −1.4) = P(Z > 1.4) = 0.081
⇒ P(−1.4 < Z < 0.6) = P(Z < 0.6) − P(Z < −1.4) = 0.726 − 0.081 = 0.645.
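Parts (ii) and (iii) can be verified the same way; any small difference from the hand calculation comes from rounding the table values:

```python
from statistics import NormalDist

X = NormalDist(mu=132, sigma=5)          # X ~ N(132, 25)

p_ii = 1 - X.cdf(130)                    # P(X > 130) = P(Z > -0.4)
p_iii = X.cdf(135) - X.cdf(125)          # P(125 < X < 135)
print(round(p_ii, 3), round(p_iii, 3))   # 0.655 0.645
```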

4. Using exactly the same approach as in Q3 but replacing the variance value, we should arrive at the following probabilities:

(i) Now the SN variable is Z = (X − 132)/√16 = (X − 132)/4 ~ N(0, 1), and so the equivalent of finding P(X > 140) is to find P(Z > (140 − 132)/4) = P(Z > 2) ≈ 0.023. Comparing to Q3(i), we are still testing for a value that is 8 units above the mean (i.e. 140 compared to 132), but the probability of X taking a value higher than this is less than half what it was in Q3. That's because the distribution here is much less dispersed, so more of the mass of the distribution is around the mean and less is in the tails. We can see this in the distributions below, where the red distribution is the one with the lower variance and hence has a lower area in the tail above the value of 140:
[Figure: two normal densities centred on 132; the red one (variance 16) is more concentrated around the mean and has less area in the tail above x = 140.]
 130 − 132 
(ii) P ( X  130 ) = P  Z   = P ( Z  −0.5) and therefore
 4 
P ( Z  0.5 ) = 1 − P ( Z  0.5 ) = 1 − 0.309 = 0.691 and hence P ( X  130 ) = 0.691 ,
which is greater than when the variance was 25.
(iii)
P(125 < X < 135) = P((125 − 132)/4 < Z < (135 − 132)/4) = P(−1.75 < Z < 0.75) = P(Z < 0.75) − P(Z < −1.75)
P(Z < 0.75) = 1 − P(Z > 0.75) = 1 − 0.227 = 0.773
P(Z < −1.75) = P(Z > 1.75) = 0.04
⇒ P(−1.75 < Z < 0.75) = 0.773 − 0.04 = 0.733.
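Repeating the computation with σ² = 16 (so σ = 4) confirms both the probabilities and the comparison with the variance-25 case; a sketch using the standard library:

```python
from statistics import NormalDist

X25 = NormalDist(132, 5)                 # variance 25 (Q3)
X16 = NormalDist(132, 4)                 # variance 16 (Q4)

p_i = 1 - X16.cdf(140)                   # P(X > 140) = P(Z > 2)
p_ii = 1 - X16.cdf(130)                  # P(X > 130) = P(Z > -0.5)
p_iii = X16.cdf(135) - X16.cdf(125)      # P(-1.75 < Z < 0.75)
print(round(p_i, 4), round(p_ii, 3), round(p_iii, 3))  # 0.0228 0.691 0.733

# Less mass in the tail above 140 than under the variance-25 distribution:
print(p_i < 1 - X25.cdf(140))            # True
```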

5. (a) Mean: X̄ = Σ_{i=1}^{12} X_i / 12 = (5 + 8 + 11 + 6 + 15 + 9 + 12 + 10 + 7 + 5 + 2 + 6)/12 = 8

Variance: S² = (1/(n − 1))(Σ_{i=1}^{n} X_i² − nX̄²) = (1/11)(Σ_{i=1}^{12} X_i² − 12X̄²)

= (5² + 8² + 11² + 6² + 15² + 9² + 12² + 10² + 7² + 5² + 2² + 6² − 12(8²))/11 = 12.91
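The sample statistics can be reproduced with the standard library; `statistics.variance` uses the same n − 1 divisor as the formula above:

```python
import statistics

data = [5, 8, 11, 6, 15, 9, 12, 10, 7, 5, 2, 6]

xbar = statistics.mean(data)        # sum(data) / 12
s2 = statistics.variance(data)      # sample variance, divides by n - 1 = 11
print(xbar, round(s2, 2))           # 8 12.91
```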

(b) Test H0: μ = 9 against the alternative H1: μ ≠ 9.

So, whilst we estimate the mean to be 8, is it possible that the true mean could actually be 9?
t test approach: We know that X̄ ~ N(μ, σ²/n), so (X̄ − μ)/√(σ²/n) ~ N(0, 1), and when we replace the population variance by its estimate we get t = (X̄ − μ)/√(S²/n) ~ t_{n−1}.

t = (8 − 9)/√(12.91/12) = −0.964

|t| = 0.964 < t_{11}^{0.025} = 2.201

[Figure: the t distribution with the rejection regions below −2.201 and above 2.201 shaded.]
You can see that the t stat of -0.964 is in the non-rejection area of the distribution. Therefore
we cannot reject the null that the true mean is 9.

CI approach: P(−t_c < (X̄ − μ)/√(S²/n) < t_c) = 1 − α, and hence we are 100(1 − α)% confident that the true mean lies in the interval X̄ − t_c√(S²/n) < μ < X̄ + t_c√(S²/n). Applying this to our case we have 95% confidence that the true value for μ lies in the interval (5.717, 10.283). Given we have such high confidence in this range, and that the hypothesised value for μ lies in this range, we do not reject the null.
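Both the t statistic and the confidence interval can be sketched in Python. The standard library has no t distribution, so the critical value t_{11}^{0.025} = 2.201 is taken from the table:

```python
import math
import statistics

data = [5, 8, 11, 6, 15, 9, 12, 10, 7, 5, 2, 6]
n = len(data)
xbar = statistics.mean(data)
s2 = statistics.variance(data)
se = math.sqrt(s2 / n)              # standard error of the mean

t_stat = (xbar - 9) / se            # test H0: mu = 9
t_crit = 2.201                      # t_{11}^{0.025}, from the table

ci = (xbar - t_crit * se, xbar + t_crit * se)
print(round(t_stat, 3))                  # -0.964
print(round(ci[0], 3), round(ci[1], 3))  # 5.717 10.283

# |t| < 2.201 and 9 lies inside the CI: do not reject H0.
print(abs(t_stat) < t_crit, ci[0] < 9 < ci[1])  # True True
```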

6. We need to use again the test statistic t = (X̄ − μ)/√(S²/n) ~ t_{n−1}. With the statistics provided we have

t = (68 − 70)/√(8.5/100) = −6.86.

The critical value is t_{99}^{0.05} = 1.66. This means we are in the rejection region of the distribution.
[Figure: the t distribution with the lower 5% rejection region, below −1.66, shaded.]

The sample evidence suggests that the national average was lower than 70%.
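With the summary statistics given (X̄ = 68, S² = 8.5, n = 100) the calculation is:

```python
import math

xbar, s2, n = 68, 8.5, 100          # summary statistics from the question
mu0 = 70                            # H0: mu = 70

t_stat = (xbar - mu0) / math.sqrt(s2 / n)
print(round(t_stat, 2))             # -6.86

# One-tailed test at 5%: reject H0 if t < -1.66 (t_99^0.05 from the table).
print(t_stat < -1.66)               # True
```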

7. When you plug the values into the CI range X̄ − t_{n−1}^{α/2}√(S²/n) < μ < X̄ + t_{n−1}^{α/2}√(S²/n), where the critical value is t_{34}^{0.025} = 2.032, you find we are 95% confident in the range (18.948, 20.692). Given the null of μ = 20 lies within that range, we do not reject this null. Hence the correct answer is (a).

8. If β̂ is the Best Linear Unbiased Estimator of β then it has the lowest variance amongst all linear unbiased estimators. If there are various ways that we could estimate β, all of which are unbiased, i.e. their distributions are centred on β, then the best option to use is the one with the lowest variance, as this gives the largest potential for estimating a value close to β. The correct answer is (d).

9. (a) An unbiased estimator has a distribution centred on the true parameter value. This means that on average the estimator will estimate the correct value (if we could estimate over and over again with different random samples and take an average). In statistical notation, θ̂ is an unbiased estimator of θ if E(θ̂) = θ. If the sample mean estimator X̄ is an unbiased estimator of the mean μ then E(X̄) = μ. We can prove this as follows:

E(X̄) = E(ΣX_i / n) = (1/n)E(X_1 + X_2 + … + X_n) = (1/n)(E(X_1) + E(X_2) + … + E(X_n))

Because we assume that each observation in the sample is randomly picked from the same distribution, each X_i has a mean of μ, so that we get

(1/n)(μ + μ + … + μ) = nμ/n = μ

Hence E(X̄) = μ.
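Unbiasedness can also be illustrated by simulation: draw many random samples, compute each sample mean, and average them. A sketch with illustrative values μ = 5 and σ = 2 (assumed for the example, not from the question):

```python
import random

random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 20_000

# The average of many sample means should be close to mu.
means = [sum(random.gauss(mu, sigma) for _ in range(n)) / n
         for _ in range(reps)]
avg_of_means = sum(means) / reps
print(round(avg_of_means, 2))       # close to 5.0
```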

(b) Again, under the assumption that the sample used for the estimation is a random sample, each X_i is drawn independently from the same distribution, so that each X_i has the same mean and variance as the underlying distribution, that is μ and σ² respectively.

var(X̄) = var(ΣX_i / n) = (1/n)² var(ΣX_i)

The independence assumption suggests that we can express var(ΣX_i) as Σ var(X_i). Why is this?

Let's look at a simplified case: var(X + Y)

var(X + Y) = var(X) + var(Y) + 2cov(X, Y)

If variables X and Y are independent then the covariance is 0. We can prove this:

cov(X, Y) = E(XY) − μ_X μ_Y (see the Appendix to Section 1 for this proof)

Independence means E(XY) = E(X)E(Y) by definition. Hence

cov(X, Y) = E(X)E(Y) − μ_X μ_Y = μ_X μ_Y − μ_X μ_Y = 0

Therefore var(X + Y) = var(X) + var(Y) when X and Y are independent. Generalising this statement we can say that var(ΣX_i) = Σ var(X_i).

Therefore var(X̄) = var(ΣX_i / n) = (1/n)² var(ΣX_i) = (1/n²) Σ var(X_i)

Further (1/n²) Σ var(X_i) = (1/n²)(σ² + σ² + … + σ²) = nσ²/n² = σ²/n.

Thus var(X̄) = σ²/n.

(c) An estimator is the best if it is the most efficient, i.e. if it has the lowest variance. So this is a relative concept – the variance has to be compared to other variances.

By the central limit theorem, the sampling distribution is X̄ ~ N(μ, σ²/n).
(d) Given that X_med ~ N(μ, πσ²/2n) and X̄ ~ N(μ, σ²/n), we can see that both X̄ and X_med are unbiased, as they both have mean μ. If X̄ is more efficient then it will have a smaller variance than X_med, i.e. var(X̄)/var(X_med) < 1:

(σ²/n)/(πσ²/2n) = 2/π < 1

var(X̄) < var(X_med). X̄ has a lower variance and is therefore a better estimator of μ than X_med.
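The efficiency comparison can be checked by simulation: across repeated samples from N(0, 1), the variance of the sample mean should be roughly 2/π ≈ 0.64 times the variance of the sample median. A sketch (the sample size and repetition count are arbitrary choices):

```python
import random
import statistics

random.seed(1)
n, reps = 101, 20_000               # odd n so the median is a single observation

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

ratio = statistics.variance(means) / statistics.variance(medians)
print(round(ratio, 2))              # roughly 2/pi, i.e. about 0.64
```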
