Statistical Methodology Past Paper 2020-2021

Statistical Methodology
MATH10095
Wednesday 16th December 2020

1300-1500 † *
† All students: you have an additional 1 hour to assemble and submit your PDF. Final submission deadline: 16:00.
* Students with a Schedule of Adjustment: you are entitled to a further fixed additional 1 hour for this remote examination. Final submission deadline: 17:00.
Attempt all questions
Important instructions
1. Start each question on a new sheet of paper.

2. Number your sheets of paper to help you scan them in order.
3. Only write on one side of each piece of paper.
4. If you have rough work to do, simply include it within your overall answer – put
brackets at the start and end of it if you want to highlight that it is rough work.
(1) The data $y_1, \ldots, y_n$ are $n$ independent observations of a random variable with probability density function
$$f(y; \theta) = \frac{\theta e^{-\theta \sin(y)} \cos(y)}{1 - e^{-\theta}}, \quad y \in [0, \pi/2],$$
and $f(y; \theta) = 0$ for $y \notin [0, \pi/2]$, where $\theta > 0$ is an unknown parameter.
(a) Show that the log-likelihood for this data can be written as
$$n \log(\theta) - n \log(1 - e^{-\theta}) - \theta \sum_{i=1}^{n} \sin(y_i) + \sum_{i=1}^{n} \log \cos(y_i).$$
[4 marks]
(b) Show that the observed information function for this distribution is
$$\frac{n}{\theta^2} - n \frac{e^{\theta}}{(e^{\theta} - 1)^2}.$$
[6 marks]
(c) Obtain an iterative formula based on Fisher's method of scoring for calculating the MLE of $\theta$. Discuss whether the iterative formula would change if the Newton-Raphson method were applied instead.
[8 marks]
(d) An experiment with 42 observations yielded $\sum_{i=1}^{42} \sin(y_i) = 11$. Taking the initial value for iteration as $\theta^{(0)} = 3.2$, complete one iteration of Fisher's method of scoring.
[6 marks]
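A minimal numerical sketch of the update in part (d), offered as a check rather than a model answer. It assumes the score $U(\theta) = n/\theta - n/(e^{\theta} - 1) - \sum_i \sin(y_i)$, obtained by differentiating the log-likelihood in (a), and uses the information function stated in (b). The function names are illustrative and not part of the paper.

```python
import math

def score(theta, n, sum_sin):
    """Derivative of the log-likelihood in (a) with respect to theta (assumed form)."""
    return n / theta - n / (math.exp(theta) - 1.0) - sum_sin

def information(theta, n):
    """Information function from (b); it does not involve the data."""
    e = math.exp(theta)
    return n / theta**2 - n * e / (e - 1.0) ** 2

def scoring_step(theta, n, sum_sin):
    """One update of the form theta + U(theta) / I(theta)."""
    return theta + score(theta, n, sum_sin) / information(theta, n)

# Values given in part (d): n = 42, sum of sin(y_i) = 11, starting value 3.2.
theta1 = scoring_step(3.2, n=42, sum_sin=11.0)
print(round(theta1, 4))
```

Because the information function in (b) contains no data terms, its expectation equals itself, which is the observation part (c) asks you to discuss.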
(e) We know that after 3 iterations the iterative scheme in (c) for the data given in (d) converges to the maximum likelihood estimate $\hat{\theta} = 3.35$. Test the hypothesis that $\theta = 3$ against the two-sided alternative $\theta \neq 3$ for this data using a Wald test at $\alpha = 0.05$.
[Note: Choose an appropriate critical point from the quantiles below, in which $(\alpha)$ denotes the area under the curve to the right of the quantile: $\chi^2_1(0.05) = 3.8415$, $\chi^2_2(0.05) = 5.9915$, $z(0.025) = 1.96$, $z(0.05) = 1.64$.]
[10 marks]
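A hedged sketch of the Wald calculation in part (e), assuming the statistic $W = (\hat{\theta} - \theta_0)^2 I(\hat{\theta})$ with the information from (b) evaluated at the MLE, and comparing against the $\chi^2_1(0.05)$ quantile quoted in the note. Variable names are illustrative.

```python
import math

def information(theta, n):
    """Information function from part (b)."""
    e = math.exp(theta)
    return n / theta**2 - n * e / (e - 1.0) ** 2

n, theta_hat, theta_null = 42, 3.35, 3.0

info_hat = information(theta_hat, n)               # information at the MLE
wald = (theta_hat - theta_null) ** 2 * info_hat    # compare with chi-squared(1)
z = (theta_hat - theta_null) * math.sqrt(info_hat) # equivalent z form, compare with 1.96

reject = wald > 3.8415  # critical point chi^2_1(0.05) from the note
print(round(wald, 4), round(z, 4), reject)
```

The two forms are equivalent here because $W = z^2$ and $1.96^2 \approx 3.8415$.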
[Please turn over]

(2) Let $y_1, y_2, \ldots, y_n$ denote a random i.i.d. sample from the following distribution
$$f(y \mid \mu, \beta) = C \beta^{1/4} e^{-\beta (y - \mu)^4}, \quad y \in (-\infty, +\infty),$$
with $\mu \in \mathbb{R}$ and $\beta > 0$, where $C \approx 0.552$.
(a) Consider the case where $\mu = 0$ is known. Derive the expression of the likelihood ratio test statistic for testing hypothesis $H_0: \beta = 1$ against $H_1: \beta \neq 1$, and hence derive the rule for rejecting the null hypothesis.
[Note: log(1) = 0.]
[10 marks]
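One way to check a derivation for part (a) numerically is sketched below. It assumes that, with $\mu = 0$, the log-likelihood is $n \log C + (n/4) \log \beta - \beta \sum_i y_i^4$, so that $\hat{\beta} = n / (4 \sum_i y_i^4)$ and the statistic is $2\{\ell(\hat{\beta}) - \ell(1)\}$, referred to a $\chi^2_1$ distribution by the usual large-sample argument. The data values and the function name are hypothetical.

```python
import math

def lrt_statistic(y):
    """2 * (loglik at the MLE of beta - loglik at beta = 1), with mu = 0 known."""
    n = len(y)
    t = sum(v**4 for v in y)      # sufficient statistic: sum of y_i^4
    beta_hat = n / (4.0 * t)      # assumed MLE from setting the score to zero
    # The n*log(C) term cancels in the difference of log-likelihoods.
    return 2.0 * ((n / 4.0) * math.log(beta_hat) - beta_hat * t + t)

# Hypothetical sample, for illustration only.
y = [0.3, -0.8, 1.1, -0.5, 0.9, -1.2, 0.2, 0.7]
stat = lrt_statistic(y)
reject = stat > 3.8415  # chi^2_1(0.05) quantile
print(stat, reject)
```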
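(b) Let $\mu = 0$ be known and consider a Bayesian model for $\beta$ with prior density
$$p(\beta) = \frac{b^a}{\Gamma(a)} \beta^{a-1} e^{-b\beta}, \quad \beta > 0,$$
for some fixed values $a, b > 0$; i.e. $\beta \sim \Gamma(a, b)$, for which $E(\beta) = \frac{a}{b}$ and $\mathrm{Var}(\beta) = \frac{a}{b^2}$.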
(i) Derive the posterior distribution of β.
[10 marks]
(ii) Write the expressions for the posterior mean and the posterior variance of
β (no need to prove them).
[4 marks]
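A small sketch for checking answers to (b) numerically. It assumes the conjugate update that follows from multiplying the likelihood kernel $\beta^{n/4} e^{-\beta \sum_i y_i^4}$ by the gamma prior, giving a $\Gamma(a + n/4,\ b + \sum_i y_i^4)$ posterior. The data, prior values and function names are hypothetical.

```python
def posterior_parameters(y, a, b):
    """Shape and rate of the assumed gamma posterior for beta when mu = 0."""
    shape = a + len(y) / 4.0
    rate = b + sum(v**4 for v in y)
    return shape, rate

def posterior_mean_var(y, a, b):
    """Mean a'/b' and variance a'/b'^2 of a Gamma(a', b') distribution."""
    shape, rate = posterior_parameters(y, a, b)
    return shape / rate, shape / rate**2

# Hypothetical data and prior settings, chosen only for illustration.
y = [1.0, -1.0, 1.0, -1.0]
mean, var = posterior_mean_var(y, a=2.0, b=1.0)
print(mean, var)
```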
(c) Now consider the case where both $\mu$ and $\beta$ are unknown. Obtain the score vector and the Fisher information matrix for the vector of unknown parameters $\theta = (\mu, \beta)$.
[Note: you can use $E((Y - \mu)^2) = 0.338\,\beta^{-1/2}$ and $E((Y - \mu)^3) = 0$ for $Y \sim f(y \mid \mu, \beta)$.]
[10 marks]
[Please turn over]

(3) (a) Consider the simple linear regression model $E(Y_i \mid x_i) = \beta_0 + \beta_1 x_i$ for $i = 1, \ldots, n$, in which $Y_i$ are independently distributed from $N(\beta_0 + \beta_1 x_i, \sigma^2)$ and $\sigma^2$ is unknown. Assume we have a sample of size $n = 6$ and $RSS = 0.026$, $S_{yy} = \sum_{i=1}^{6} (y_i - \bar{y})^2 = 0.54$ and $S_{xy} = \sum_{i=1}^{6} (x_i - \bar{x})(y_i - \bar{y}) = -6$.
We want to test whether the expectation of the response variable ($Y$) linearly depends on the explanatory variable ($X$).
(i) Write the null and alternative hypotheses and find the t-test statistic value
for this test. Use four decimal places in the calculations.
[Note: $RSS = S_{yy} - 2 \hat{\beta}_1 S_{xy} + \hat{\beta}_1^2 S_{xx}$]
[10 marks]
(ii) For the same test explained in (a), find the value of another test statistic
and specify its distribution under the null hypothesis.
[4 marks]
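The arithmetic for (i) and (ii) can be sketched as follows, as a check rather than a model answer. It assumes $\hat{\beta}_1 = S_{xy}/S_{xx}$, so that the identity in the note reduces to $RSS = S_{yy} - S_{xy}^2/S_{xx}$, and tests $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$ with $\hat{\sigma}^2 = RSS/(n-2)$. Variable names are illustrative.

```python
import math

n, rss, syy, sxy = 6, 0.026, 0.54, -6.0

# Recover Sxx from RSS = Syy - Sxy^2 / Sxx.
sxx = sxy**2 / (syy - rss)
beta1_hat = sxy / sxx

sigma2_hat = rss / (n - 2)                        # unbiased estimate of sigma^2
t_stat = beta1_hat / math.sqrt(sigma2_hat / sxx)  # t with n - 2 = 4 d.f. under H0
f_stat = (syy - rss) / sigma2_hat                 # F with (1, n - 2) d.f. under H0

print(round(sxx, 4), round(beta1_hat, 4), round(t_stat, 4), round(f_stat, 4))
```

In simple linear regression the two statistics carry the same information, since $F = t^2$.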
(b) Consider the simple linear regression model $E(Y_i \mid x_i) = \beta_0 + \beta_1 x_i$ for $i = 1, \ldots, n$, in which $Y_i$ are independently distributed from $N(\beta_0 + \beta_1 x_i, \sigma^2)$. The value of $1 - \frac{RSS}{S_{yy}}$ is calculated for this model, in which $RSS$ is the residual sum of squares and $S_{yy}$ is the total sum of squares about $\bar{y}$. What is this value named? What range of values does it take? Explain what it describes about the model.
[8 marks]
(c) Consider the regression model,
$$\log Y_i = \beta_0 + \beta_1 \frac{1}{x_i} + \epsilon_i$$
in which $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$. Apply the least squares estimation method and derive the estimates for parameters $\beta_0$ and $\beta_1$.
[10 marks]
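A sketch for checking a derivation in (c) numerically. It assumes the model is treated as a simple linear regression of $w_i = \log y_i$ on $z_i = 1/x_i$, so the least squares estimates take the familiar form $\hat{\beta}_1 = S_{zw}/S_{zz}$ and $\hat{\beta}_0 = \bar{w} - \hat{\beta}_1 \bar{z}$. The function name and the data are hypothetical.

```python
import math

def fit_reciprocal_model(x, y):
    """Least squares fit of log(y) on 1/x; returns (beta0_hat, beta1_hat)."""
    z = [1.0 / xi for xi in x]        # transformed explanatory variable
    w = [math.log(yi) for yi in y]    # transformed response
    n = len(x)
    z_bar = sum(z) / n
    w_bar = sum(w) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szw = sum((zi - z_bar) * (wi - w_bar) for zi, wi in zip(z, w))
    beta1_hat = szw / szz
    beta0_hat = w_bar - beta1_hat * z_bar
    return beta0_hat, beta1_hat

# Noise-free hypothetical data generated with beta0 = 1 and beta1 = 2,
# so the fit should recover those values up to rounding error.
x = [1.0, 2.0, 4.0, 5.0]
y = [math.exp(1.0 + 2.0 / xi) for xi in x]
b0, b1 = fit_reciprocal_model(x, y)
print(b0, b1)
```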
[End of Paper]
