# EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY (formerly the Examinations of the Institute of Statisticians)

HIGHER CERTIFICATE IN STATISTICS, 2007

Paper I: Statistical Theory

Time Allowed: Three Hours

Candidates should answer FIVE questions. All questions carry equal marks. The number of marks allotted for each part-question is shown in brackets.

Graph paper and Official tables are provided.

Candidates may use calculators in accordance with the regulations published in the Society's "Guide to Examinations" (document Ex1).

The notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. $\log_{10}$.

Note also that $\binom{n}{r}$ is the same as ${}^{n}C_{r}$.
HC Paper I 2007

This examination paper consists of 9 printed pages each printed on one side only. This front cover is page 1. Question 1 starts on page 2. There are 8 questions altogether in the paper. ©RSS 2007

1. When the digits 1, 2, 3, 4 are placed in ascending order (without repetition), each digit is said to occupy its "home" position. For example, the 3rd place is the home position of the digit 3.

(i) How many possible orderings of the digits 1, 2, 3, 4 are there? (1)

(ii) By enumeration, or otherwise,

(a) find the number of orderings in which all four digits each occupy their respective home positions,

(b) find the number of orderings in which exactly two digits each occupy their respective home positions,

(c) show that there are 8 orderings in which exactly one digit occupies its home position,

(d) show that there are 9 orderings in which no digit occupies its home position.

(10)

(iii) The digits 1, 2, 3, 4 are put in random order, and the random variable X denotes the number of digits occupying their respective home positions. Write down the probability distribution of X, and calculate E(X) and Var(X). (9)
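The counts requested in Question 1 can be checked by brute-force enumeration. The following is a minimal sketch in Python, not part of the examination paper; the names `home_count`, `counts` and `pmf` are illustrative choices. It lists all orderings of the four digits, tallies how many digits sit in their home positions, and computes the mean and variance of X exactly using fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

digits = (1, 2, 3, 4)

def home_count(ordering):
    """Number of digits in their home position (digit d in place d)."""
    return sum(1 for place, d in enumerate(ordering, start=1) if d == place)

# Tally every ordering by the number of digits at home.
counts = Counter(home_count(p) for p in permutations(digits))
total = sum(counts.values())

# Probability distribution of X, kept exact with fractions.
pmf = {k: Fraction(counts[k], total) for k in range(len(digits) + 1)}
mean = sum(k * p for k, p in pmf.items())
variance = sum(k * k * p for k, p in pmf.items()) - mean ** 2

print("orderings:", total)
print("counts by number at home:", dict(sorted(counts.items())))
print("E(X) =", mean, " Var(X) =", variance)
```

Exact fractions are used so that the mean and variance come out without rounding error.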

2. State Bayes' theorem. (4)

The gene for haemophilia (the inability of blood to clot) produces no symptom in females, but a female who carries the gene will pass it on to each of her children independently with probability ½ in each case. Assume throughout this question that males with haemophilia (haemophiliacs) do not marry or have children.

(i) Mrs Smith is expecting her first child (a son), and it is known that Mrs Smith's maternal uncle (the only brother of Mrs Smith's mother) is a haemophiliac. Explain clearly why, in the light of this information, the probability that Mrs Smith's son will be a haemophiliac is 1/8. What would be the probability if instead it is known that Mrs Smith's mother has four brothers, only one of whom is a haemophiliac? (6)

(ii) Suppose throughout this part that Mrs Smith's mother has only one brother, who is a haemophiliac, and that Mrs Smith has four brothers, none of whom is a haemophiliac.

(a) Show that the probability that none of Mrs Smith's brothers is a haemophiliac, given that Mrs Smith's mother carries the gene, is 1/16.

(b) Write down the probability that none of Mrs Smith's brothers is a haemophiliac, given that Mrs Smith's mother does not carry the gene.

(c) What is now the probability that Mrs Smith's son will be a haemophiliac?

(10)

3. The random variable X has the Pareto distribution with probability density function (pdf)

$$f(x) = \frac{\alpha}{(1 + x)^{\alpha + 1}}, \quad x > 0, \quad \alpha > 0,$$

and a random sample $x_1, x_2, \ldots, x_n$ is taken from this distribution.

(i) Show that the maximum likelihood estimate of $\alpha$ is given by

$$\hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \log(1 + x_i)}.$$

(5)

(ii) Stating any results you assume without proof, show that $\hat{\alpha} \sim N\!\left(\alpha, \frac{\alpha^2}{n}\right)$ approximately if the sample size is large. Hence derive an approximate 95% confidence interval for $\alpha$ in terms of $\hat{\alpha}$ and n. (5)

(iii) A random sample $x_1, x_2, \ldots, x_{100}$ which is assumed to be from the pdf f(x) gives

$$\sum_{i=1}^{100} \log(1 + x_i) = 28.57.$$

Calculate $\hat{\alpha}$ and an approximate 95% confidence interval for $\alpha$. (4)

(iv) You are now given that, for $\alpha > 2$,

$$E(X) = \frac{1}{\alpha - 1}, \qquad \mathrm{Var}(X) = \frac{\alpha}{(\alpha - 1)^2 (\alpha - 2)},$$

and that the sum and sum of squares of the sample values are 39.4 and 52.6 respectively. Evaluate the mean and variance of this distribution, using the value of $\hat{\alpha}$ found in part (iii). Also calculate the sample mean and variance, and comment briefly. (6)
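The arithmetic in parts (iii) and (iv) of Question 3 can be sketched as follows. This is an illustrative check, not a model answer. It assumes the plug-in interval $\hat{\alpha} \pm 1.96\,\hat{\alpha}/\sqrt{n}$ (one common way of using the large-sample result; other forms are possible) and takes the sum and sum of squares of the sample values as 39.4 and 52.6. All variable names are illustrative.

```python
import math

n = 100
sum_log = 28.57              # sum of log(1 + x_i), from part (iii)
sum_x, sum_x2 = 39.4, 52.6   # sum and sum of squares, from part (iv)

# Maximum likelihood estimate and plug-in standard error (assumed form).
alpha_hat = n / sum_log
se = alpha_hat / math.sqrt(n)
z = 1.96                     # upper 2.5% point of N(0, 1)
ci = (alpha_hat - z * se, alpha_hat + z * se)

# Fitted Pareto moments, valid because alpha_hat > 2.
model_mean = 1 / (alpha_hat - 1)
model_var = alpha_hat / ((alpha_hat - 1) ** 2 * (alpha_hat - 2))

# Sample moments, using the (n - 1) divisor for the variance.
sample_mean = sum_x / n
sample_var = (sum_x2 - n * sample_mean ** 2) / (n - 1)

print(f"alpha_hat = {alpha_hat:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"model:  mean = {model_mean:.4f}, variance = {model_var:.4f}")
print(f"sample: mean = {sample_mean:.4f}, variance = {sample_var:.4f}")
```

Comparing the two printed lines of moments is the basis for the brief comment requested in part (iv).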

4. The annual number, X say, of accidents on a certain stretch of road is assumed to follow a Poisson distribution with mean $\lambda$, so that

$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots, \quad \lambda > 0.$$

For any accident on this stretch of road, the probability that it involves one or more fatalities (i.e. it is a fatal accident) is p, 0 < p < 1, independently of all other accidents.

(i) Given that a total of x accidents occurs in a given year, explain why the distribution of the number of fatal accidents in that year, Y(x) say, is binomial, and write down the parameters of this distribution. (4)

(ii) Deduce that the unconditional probability that, in any given year, there are x accidents of which y are fatal, is given by

$$p(x, y) = \frac{\lambda^{y} p^{y} e^{-\lambda}}{y!} \cdot \frac{\left[\lambda (1 - p)\right]^{x - y}}{(x - y)!},$$

for x = 0, 1, 2, …; y = 0, 1, …, x. (4)

(iii) Hence show that the marginal distribution of the number of fatal accidents on this stretch of road in any given year is Poisson with mean $\lambda p$. (7)

(iv) Given that $\lambda = 10$ and p = 0.2, find

(a) the probability that in a given year there are no fatal accidents,

(b) the probability that, in a year of 6 accidents, at most one is fatal.

(5)
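The result in part (iii) of Question 4 can be checked numerically. The sketch below assumes $\lambda = 10$ and p = 0.2 as in part (iv), and truncates the sum over x at an arbitrary cut-off `x_max`. It is a numerical illustration rather than the algebraic proof the question asks for, and the function names are illustrative.

```python
import math

lam, p = 10.0, 0.2

def joint(x, y):
    """P(X = x, Y = y): Poisson(lam) accidents, of which Binomial(x, p) are fatal."""
    poisson = math.exp(-lam) * lam ** x / math.factorial(x)
    binomial = math.comb(x, y) * p ** y * (1 - p) ** (x - y)
    return poisson * binomial

def marginal_fatal(y, x_max=150):
    """Sum the joint pmf over x >= y, truncated at x_max (arbitrary cut-off)."""
    return sum(joint(x, y) for x in range(y, x_max + 1))

def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)

# The marginal of Y should agree with Poisson(lam * p).
for y in range(5):
    print(y, round(marginal_fatal(y), 6), round(poisson_pmf(y, lam * p), 6))

# Part (iv)(a): no fatal accidents in a year.
no_fatal = poisson_pmf(0, lam * p)

# Part (iv)(b): at most one fatal accident among 6 accidents (binomial).
at_most_one_of_six = sum(math.comb(6, y) * p ** y * (1 - p) ** (6 - y) for y in (0, 1))

print(round(no_fatal, 4), round(at_most_one_of_six, 4))
```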

5. (i) The random variable X has the geometric probability mass function (pmf) given by

$$p(x, p_1) = (1 - p_1)^{x - 1} p_1, \quad x = 1, 2, 3, \ldots, \quad 0 < p_1 < 1.$$

Sketch the graph of $p(x, p_1)$ for the case $p_1 = 0.4$ and $1 \le x \le 5$. (4)

(ii) For any non-negative integer x, show that

$$P(X > x) = (1 - p_1)^{x}.$$

(4)

(iii) The random variable Y is distributed independently of X with pmf $p(x, p_2)$, where $0 < p_2 < 1$. By writing

$$P(X > Y) = \sum_{y=1}^{\infty} P(Y = y)\, P(X > Y \mid Y = y),$$

or otherwise, show that

$$P(X > Y) = \frac{p_2 (1 - p_1)}{p_1 + p_2 - p_1 p_2},$$

and hence write down P(X < Y). (6)

(iv) Show that

$$P(X = Y) = \frac{p_1 p_2}{p_1 + p_2 - p_1 p_2},$$

and deduce a condition on $p_1$ and $p_2$ such that P(X = Y) = ½. If P(X = Y) = ½ and also P(X < Y) = P(X > Y), solve for $p_1$ and $p_2$. (6)
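The closed forms quoted in parts (iii) and (iv) of Question 5 can be compared with truncated series. In this minimal sketch the values $p_1 = 0.4$ and $p_2 = 0.25$ are arbitrary illustrative choices (only $p_1 = 0.4$ appears in the paper), the truncation point `terms` is an assumption, and the function names are hypothetical.

```python
def geom_pmf(k, p):
    """Geometric pmf on k = 1, 2, 3, ...: (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

def series_probs(p1, p2, terms=2000):
    """Truncated series for P(X > Y) and P(X = Y), X and Y independent."""
    # P(X > y) = (1 - p1)^y, from part (ii).
    gt = sum(geom_pmf(y, p2) * (1 - p1) ** y for y in range(1, terms + 1))
    eq = sum(geom_pmf(y, p1) * geom_pmf(y, p2) for y in range(1, terms + 1))
    return gt, eq

def closed_forms(p1, p2):
    """The expressions quoted in parts (iii) and (iv)."""
    d = p1 + p2 - p1 * p2
    return p2 * (1 - p1) / d, p1 * p2 / d

p1, p2 = 0.4, 0.25
print("pmf for p1 = 0.4:", [round(geom_pmf(x, p1), 5) for x in range(1, 6)])
print("series:     ", series_probs(p1, p2))
print("closed form:", closed_forms(p1, p2))
```

The first printed line gives the five heights needed for the sketch in part (i).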

6. (i) A passenger aircraft carries 100 passengers. Assume that the weight in kg, W say, of a randomly chosen passenger is Normally distributed with mean 65 and standard deviation 6, i.e. W ~ N(65, 36). Assume also that a randomly chosen passenger has hold luggage weighing H kg, where H ~ N(20, 9), and cabin luggage weighing C kg, where C ~ N(6, 4), W, H and C being independent within and between passengers.

(a) Write down the distribution of the total weight in kg, T say, of a randomly chosen passenger and his or her hold and cabin luggage, and find P(T > 110). (4)

(b) A safety requirement is that the total weight of all 100 passengers and their luggage should not exceed 9300 kg. Find the probability that this requirement is not met. (3)

(c) The check-in scales record the weight of hold luggage to the nearest kg. Find the probability that a randomly chosen passenger's hold luggage is recorded as weighing 24 kg or more. (3)

(ii) Now suppose that, for each passenger, the random variables H and C are not independent but instead the correlation between them is $-0.5$, all other assumptions being as before. What is now the probability that the safety requirement is not met? (6)

(iii) Comment critically on the assumptions listed in part (i) of this question. (4)

7. (a) The random variable X has the binomial distribution with probability mass function (pmf)

$$f_X(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n, \quad 0 < p < 1.$$

Show that, for x = 1, 2, …, n,

$$f_X(x) = \frac{p (n + 1 - x)}{(1 - p) x}\, f_X(x - 1),$$

and deduce that the mode m of X is the largest integer satisfying $m \le (n + 1)p$. By considering the case

$$\frac{p (n + 1 - x)}{(1 - p) x} = 1,$$

write down an example of a binomial distribution which has two consecutive equal maximal probabilities, and show that, for any such distribution, the mean np lies between the two modes. (10)

(b) In a game of target practice, a player has two guns, A and B, which must be fired alternately. His probability of hitting the target is 0.8 with A and 0.4 with B. He is allowed three shots at the target, which must be fired either in the order A B A or in the order B A B. Assume that events relating to different shots are independent.

(i) Which sequence should he choose to maximise the probability of hitting the target in two consecutive shots?

(ii) Find the mean and variance of the number of times the player hits the target when firing in the order A B A, and when firing in the order B A B.

(iii) Comment on your results.

(10)
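The quantities in part (b) of Question 7 can be checked by enumerating all eight hit/miss outcomes for each firing order. This is a minimal sketch; `P_HIT` and `analyse` are illustrative names, and it reads "hitting the target in two consecutive shots" as at least one pair of adjacent hits.

```python
from itertools import product

P_HIT = {"A": 0.8, "B": 0.4}   # hit probabilities given in the question

def analyse(order):
    """Return (P(two consecutive hits), mean hits, variance of hits) for a firing order."""
    p_consecutive = 0.0
    mean = 0.0
    second_moment = 0.0
    for outcome in product((0, 1), repeat=len(order)):
        # Shots are independent, so the outcome probability is a product.
        prob = 1.0
        for gun, hit in zip(order, outcome):
            prob *= P_HIT[gun] if hit else 1 - P_HIT[gun]
        hits = sum(outcome)
        mean += hits * prob
        second_moment += hits * hits * prob
        # At least one pair of adjacent shots both hit.
        if any(a and b for a, b in zip(outcome, outcome[1:])):
            p_consecutive += prob
    return p_consecutive, mean, second_moment - mean ** 2

for order in ("ABA", "BAB"):
    p2, m, v = analyse(order)
    print(order, round(p2, 4), round(m, 4), round(v, 4))
```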

8. A uniform thin wire is fixed at its top end. Varying loads (L, in hundreds of grams) are hung from the bottom end of the wire, and the extension (E, in millimetres) of the wire for each load is measured. The results are as shown in the following table.

| L | 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | 5 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| E | 2 | 3 | 4 | 2 | 8 | 10 | 13 | 13 | 11 | 14 |

$\Sigma L = 30$, $\Sigma E = 80$, $\Sigma L^2 = 110$, $\Sigma E^2 = 852$, $\Sigma LE = 300$

(i) It is proposed to analyse these data using simple linear regression analysis.

(a) State which of L and E should be taken as the independent variable and which as the dependent variable, and why. Construct a scatterplot of the data and comment on their suitability for linear regression analysis. (8)

(b) Fit a straight line to the data by the method of least squares. Calculate the residual mean square, $s^2$ say, for this regression, and give a point estimate of the expected extension when the load is 350 grams. Note: you should state clearly any formulae you use, but you do not need to prove them. (8)

(ii) You are given that, when a linear regression of y on x is fitted to data $(x_1, y_1), \ldots, (x_n, y_n)$, the standard error of the intercept parameter is

$$s \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}}, \quad \text{where } \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Theoretical physics suggests that the extension of the wire should be proportional to the load put on it, which implies that the intercept parameter in the linear regression fitted in (i) should be zero. Test this hypothesis against a two-sided alternative, stating any necessary assumptions. (4)
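The least squares calculations in Question 8 can be sketched as below, taking L as the explanatory variable and E as the response (an assumption, since the question asks the candidate to justify the choice). The data pairs are read from the table and agree with the quoted totals. Variable names are illustrative, and the t statistic is left to be compared with tables on n − 2 degrees of freedom.

```python
import math

L = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]          # load, hundreds of grams
E = [2, 3, 4, 2, 8, 10, 13, 13, 11, 14]     # extension, millimetres
n = len(L)

# Corrected sums of squares and products.
sxx = sum(x * x for x in L) - sum(L) ** 2 / n
syy = sum(y * y for y in E) - sum(E) ** 2 / n
sxy = sum(x * y for x, y in zip(L, E)) - sum(L) * sum(E) / n

# Least squares slope and intercept for E = a + b * L.
b = sxy / sxx
a = sum(E) / n - b * sum(L) / n

# Residual mean square on n - 2 degrees of freedom.
s2 = (syy - b * sxy) / (n - 2)

# Point estimate at a load of 350 grams, i.e. L = 3.5.
e_at_350g = a + b * 3.5

# Standard error of the intercept, using the formula quoted in the question.
se_a = math.sqrt(s2) * math.sqrt(sum(x * x for x in L) / (n * sxx))
t_stat = a / se_a

print(f"a = {a:.3f}, b = {b:.3f}, s^2 = {s2:.3f}")
print(f"estimate at 350 g = {e_at_350g:.2f} mm")
print(f"se(a) = {se_a:.4f}, t = {t_stat:.4f} on {n - 2} df")
```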