
M2S1: Probability and Statistics II

Problems
Professor David A van Dyk
Statistics Section, Imperial College London
dvandyk@imperial.ac.uk
http://www2.imperial.ac.uk/dvandyk

October 2013

M2S1 Problems

(⋆) You may find starred problems more challenging. (‡) Problems marked with a (‡) review material from M1S.

1

Introduction and Motivation

1.1

Course Administration and Syllabus

1.2

Randomized Controlled Experiments

1.2.1 Suppose you are interested in the effects of caffeine on the concentration of sleep-deprived undergraduates, in particular on their performance on exam-like tasks. You are planning an experiment and have identified 100 of your classmates who are willing to participate as subjects in the experiment. You plan to have a test that involves reading comprehension, mathematical skills, and analytical abilities. You plan to administer the test at 14.00 on a day when your 100 subjects have had at most 4 hours of sleep in the last 24 hours and have had no sleep since 7.00. Describe how you will design your experiment. Be sure to consider issues such as blinding, control and treatment groups, randomization, dose, and the placebo effect. You might want to read
http://news.bbc.co.uk/1/hi/health/1142492.stm and http://en.wikipedia.org/wiki/Effect_of_psychoactive_drugs_on_animals

for background.

2

Probability Spaces

2.1

Definitions of Probability

2.1.1 Suppose your friend has a 50p coin and is going to flip it into the air. What probability would you assign that it will come up heads in each of the following situations: (a) You have no information about the coin or its history of coming up heads when flipped by your friend. (b) Your friend has already flipped this coin twice, and it came up heads both times. (c) Your friend has already flipped this coin 10 times, and it came up heads each time. (d) Your friend has already flipped this coin 1000 times, and it came up heads each time. What type of probability are you using? [Hint: No calculations are required for this question.]

2.2

Basic Probability

2.2.1 (‡) Describe the sample spaces for each of the following experiments. (a) A coin is tossed five times. (b) The number of people in the queue ahead of you when you arrive in the junior common room to purchase a decaf(!) skinny latte. (c) Measure the lifetime of a particular type of light bulb. (d) Measure the time it takes a Piccadilly line train to go from South Kensington Station to Heathrow Terminal 4.

2.2.2 (‡) Find formulas for the probabilities of each of the following events in terms of p = Pr(A), q = Pr(B), and r = Pr(A ∩ B). (a) Either A or B or both. (b) Either A or B but not both. (c) At least one of A or B. (d) At most one of A or B.

2.2.3 (‡) A couple plans to have three children. There are 8 possible arrangements of girls and boys. For example, GGB means the first two children are girls and the third child is a boy. All 8 arrangements are (approximately) equally likely. (a) Write down all 8 arrangements of the sexes of the three children. What is the probability of any one of these arrangements? (b) Let X be the number of girls the couple has. What is the probability that X = 2? (c) Starting from your work in (a), find the distribution of X. That is, what values can X take, and what are the probabilities for each value? (d) What named distribution does X follow? What are the mean and variance of X?

2.2.4 (‡) If Pr(A) = 1/4 and Pr(B^c) = 1/5, can A and B be disjoint? Explain.

2.2.5 (‡) Consider the probability space (S, B, Pr) with A, B ∈ B. Using only the Kolmogorov axioms prove (a) Pr(A) ≤ 1, (b) if A ⊆ B, then Pr(A) ≤ Pr(B), and (c) Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B).

2.2.6 Let Ω be a sample space. (a) Show that the collection B = {∅, Ω} is a sigma algebra. (b) Let B = {all subsets of Ω, including Ω itself}. Show B is a sigma algebra. (c) Show that the intersection of two sigma algebras is a sigma algebra.

2.2.7 (⋆) A slightly more general definition of a probability function than the one we consider in class allows a probability function to be defined on a field (rather than only on a sigma algebra). In particular, let F be a field composed of subsets of Ω and define a probability function on F as a function Pr such that (i) Pr(A) ≥ 0 for all A ∈ F, (ii) Pr(Ω) = 1, and (iii) Pr is countably additive: for a countably disjoint sequence {A_k, k = 1, 2, ...}, we require Pr(∪_{k=1}^∞ A_k) = Σ_{k=1}^∞ Pr(A_k) only if ∪_{k=1}^∞ A_k ∈ F. Suppose B_k ∈ F for k = 1, 2, ..., and consider the following:

AXIOM: If B_1 ⊇ B_2 ⊇ B_3 ⊇ ... and ∩_{k=1}^∞ B_k = ∅, then Pr(B_k) → 0 as k → ∞.

Now let Ω = {1, 2, ...} and F = {finite and cofinite subsets of Ω}. Show that the function

Pr(A) = 0 if A is finite,  Pr(A) = 1 if A is cofinite,

does not satisfy this AXIOM. Given what we have shown about this function in class, comment on why this result is not surprising.

2.2.8 (⋆) Let B be the set of countable and cocountable subsets of the real numbers, where a set is cocountable if its complement is countable. (a) Show that B is a sigma algebra. [Hint: Is a countable union of countable sets countable?] (b) Let

P(A) = 0 if A is countable,  P(A) = 1 if A is cocountable.

Is this function finitely additive? Is it countably additive?
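Problem 2.2.3 can be checked by direct enumeration of the sample space. A minimal sketch (the helper name `dist_of_X` is ours):

```python
import itertools

# Enumerate all 2^3 equally likely sex arrangements for three children.
arrangements = ["".join(s) for s in itertools.product("GB", repeat=3)]

def dist_of_X():
    """Tabulate the distribution of X = number of girls."""
    counts = {}
    for a in arrangements:
        x = a.count("G")
        counts[x] = counts.get(x, 0) + 1
    return {x: c / len(arrangements) for x, c in sorted(counts.items())}
```

Comparing the resulting table against a Binomial(3, 1/2) pmf confirms the named distribution asked for in part (d).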

2.3

Conditional Probability and Independence

2.3.1 (‡) Suppose Pr(A) > 0 and Pr(B) > 0. Show that (a) if A and B are disjoint they are not independent, and (b) if A and B are independent they are not disjoint.

2.3.2 (‡) About one in three human twin births is identical (one egg) and two in three are fraternal (two eggs). Identical twins are always the same sex, with boys and girls being equally likely. About one quarter of fraternal twin pairs are both boys, another quarter are both girls, and the rest are one boy and one girl. In England and Wales about one in sixty-five births is a twin birth (http://www.multiplebirths.org.uk/media.asp). Let

A = {A birth in England/Wales results in twin boys},
B = {A birth in England/Wales results in fraternal twins},
C = {A birth in England/Wales results in twins}.

(a) Describe the event A ∩ B ∩ C in words. (b) Find Pr(A ∩ B ∩ C).

2.3.3 For events A and B in a sample space Ω, under what conditions does the following hold: Pr(A) = Pr(A|B) + Pr(A|B^c)?

2.3.4 A biased coin is tossed repeatedly, with tosses mutually independent; the probability of the coin showing Heads on any toss is p. Let H_n be the event that an even number of Heads have been obtained after n tosses, let p_n = Pr(H_n), and define p_0 = 1. By conditioning on H_{n−1} and using the LAW OF TOTAL PROBABILITY, show that, for n ≥ 1,

p_n = (1 − 2p) p_{n−1} + p.   (1)

Find a solution to this difference equation, valid for all n ≥ 0, of the form p_n = A + Bλ^n, where A, B and λ are constants to be identified. Prove that if p < 1/2, then p_n > 1/2 for all n ≥ 1, and find the limiting value of p_n as n → ∞. Is this limit intuitively reasonable?

2.3.5 (⋆) A simple model for weather forecasting involves classifying days as either Fine or Wet, and then assuming that the weather on a given day will be the same as the weather on the preceding day with probability p, with 0 < p < 1. Suppose that the probability of fine weather on the day indexed 1 (say Jan 1st) is denoted by α. Let θ_n denote the probability that the day indexed n is Fine. For n = 2, 3, ..., find a difference equation for θ_n similar to that in equation (1) in Problem 2.3.4, and use this difference equation to find θ_n explicitly as a function of n, p and α. Find the limiting value of θ_n as n → ∞.

2.3.6 (‡) Consider two coins, of which one is a normal fair coin and the other is biased so that the probability of obtaining a Head is p > 1/2.

(a) Suppose p = 1 and a coin is selected at random and tossed n times, with tosses mutually independent. Evaluate the conditional probability that the selected coin is the normal one, given that the first n tosses are all Heads. [Hint: You will need to use the Binomial distribution from M1S.] (b) Now suppose 1/2 < p < 1 and that again, one of the coins is selected randomly and tossed n times. Let E be the event that the n tosses result in k Heads and n − k Tails, and let F be the event that the coin is fair. Find Pr(F|E).

2.3.7 (‡) A company is to introduce mandatory drug testing for its employees. The test used is very accurate, in that it gives a correct positive test (detects drugs when they are actually present in a blood sample) with probability 0.99, and a correct negative test (does not detect drugs when they are not present) with probability 0.98. If an individual tests positive on the first test, a second blood sample is tested. It is assumed that only 1 in 5000 employees actually provides a blood sample with drugs present. Calculate the probability that the presence of drugs in a blood sample is detected correctly, given (a) a positive result on the first test (before the second test is carried out), (b) a positive result on both the first and second tests. Assume that the results of the tests are conditionally independent, that is, independent given the presence or absence of drugs in the sample.
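The recursion (1) in Problem 2.3.4 can be checked numerically against a direct Binomial sum over even numbers of Heads. A sketch (the function names are ours):

```python
from math import comb

def p_even_direct(n, p):
    """Pr(even number of Heads in n tosses) by summing the Binomial pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(0, n + 1, 2))

def p_even_recursive(n, p):
    """Same probability via the difference equation p_n = (1-2p)p_{n-1} + p,
    starting from p_0 = 1."""
    pn = 1.0
    for _ in range(n):
        pn = (1 - 2 * p) * pn + p
    return pn
```

Iterating the recursion for a large n also lets you check the limiting value you derive.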

3

Univariate Random Variables and Probability Distributions

3.1

Probability Distribution, Density and Mass Functions

3.1.1 (‡) Determine for which values of the constant c the following functions define valid probability mass functions for a discrete random variable X, taking values on range X = {1, 2, 3, ...}. For parts (a) and (d) the value of c will depend on θ. In these cases, specify the range of θ resulting in a valid probability mass function.

(a) f_X(x) = cθ^x,  (b) f_X(x) = c/(x 2^x),  (c) f_X(x) = c/x^2,  (d) f_X(x) = cθ^x/x!.

In each case, calculate Pr(X > 1).

3.1.2 (‡) Suppose n identical fair coins are tossed. Those that show Heads are tossed again, and the number of Heads obtained on the second set of tosses defines a discrete random variable X. Assuming that all tosses are independent, find the range and probability mass function of X.

3.1.3 (‡) A continuous random variable X has pdf given by f_X(x) = c(1 − x)x^2, for 0 < x < 1, and zero otherwise. Find the value of c, the cdf of X, F_X, and Pr(X > 1/2).

3.1.4 (‡) A function f is defined by f(x) = k/x^{k+1}, for x > 1, and zero otherwise. For what values of k is f a valid pdf? Find the cdf of X.

3.1.5 (‡) A continuous random variable X has pdf given by

f_X(x) = x, for 0 < x < 1,
f_X(x) = 2 − x, for 1 ≤ x < 2,

and zero otherwise. Plot f_X, find the cdf F_X, and plot F_X.

3.1.6 (‡) Show that the function F_X, defined for x ∈ R by F_X(x) = c exp{−λe^{−x}}, is a valid cdf for a continuous random variable X for a specific choice of constant c, where λ > 0. Find the pdf f_X associated with this cdf.

3.1.7 Evaluate ∫_0^∞ e^{−4x} dx. [Hint: Relate the integrand to a well-known pdf.]

3.1.8 A point is to be selected randomly from an integer lattice restricted to the triangle with corners at (1, 1), (n, 1) and (n, n) for positive integer n. If all points are equally likely to be selected, find the probability mass functions for the two discrete random variables X and Y corresponding to the x and y coordinates of the selected point respectively.

3.1.9 Consider two random variables, X with cdf F_X and Y with cdf F_Y. We say that Y is stochastically greater than X if F_Y(u) ≤ F_X(u) for all u and F_Y(u) < F_X(u) for some u. If Y is stochastically greater than X, prove that Pr(Y > u) ≥ Pr(X > u) for every u and Pr(Y > u) > Pr(X > u) for some u. Qualitatively compare the distributions of X and Y.

3.2

Transformations of Univariate Random Variables

3.2.1 (‡) Suppose that X is a continuous random variable with density function given by f_X(x) = 4x^3, for 0 < x < 1, and zero otherwise. Find the density functions of the following random variables: (a) Y = X^4, (b) W = e^X, (c) Z = −log X, (d) U = (X − 0.5)^2.
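Answers to transformation problems such as 3.2.1 can be checked by simulation. Here X is drawn via the inverse-cdf method (its cdf is F_X(x) = x^4 on (0, 1)), and the sample moments of Y = X^4 can then be compared with those implied by your derived density (variable names are ours):

```python
import random

random.seed(0)

# X has cdf F_X(x) = x^4 on (0, 1), so the inverse-cdf method gives X = U^(1/4).
n = 100_000
xs = [random.random() ** 0.25 for _ in range(n)]

# Part (a): compare the sample of Y = X^4 (its mean, histogram, range)
# against the density you derive analytically.
ys = [x ** 4 for x in xs]
mean_y = sum(ys) / n
```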

3.2.2 Again suppose that X is a continuous random variable with density function given by f_X(x) = 4x^3, for 0 < x < 1, and zero otherwise. Find the monotonic decreasing function H such that the random variable V, defined by V = H(X), has a density function that is constant on the interval (0, 1), and zero otherwise.

3.2.3 The measured radius of a circle, R, is a continuous random variable with density function given by f_R(r) = 6r(1 − r), for 0 < r < 1, and zero otherwise. Find the density functions of (a) the circumference and (b) the area of the circle.

3.2.4 Suppose that X is a continuous random variable with density function given by

f_X(x) = (α/β)(1 + x/β)^{−(α+1)}, for x > 0,

and zero elsewhere, with α and β non-negative parameters. (a) Find the density function and cdf of the random variable defined by Y = log X. (b) Find the density function of the random variable defined by Z = α + βY.

3.3

Expected Values

3.3.1 (‡) Consider the function f_X(x) = cg(x) for some constant c > 0, with g defined by

g(x) = |x|/(1 + x^2)^2, for x ∈ R.

Show that f_X(x) is a valid pdf for a continuous random variable X with range X = R, and find the cdf, F_X, and the expected value of X, E(X), associated with this pdf.

3.3.2 Suppose that X is a continuous random variable with support R+. Its pdf is given by

f_X(x) = λ^2 x exp{−λx}, for x ≥ 0,

and is zero otherwise, with parameter λ > 0. (a) Find the cdf of X. (b) Show that, for any positive value m, Pr(X ≥ m) = (1 + λm) exp{−λm}. (c) Find E(X). (d) Now consider the pdf

g_X(x) = kx^2 exp{−λx}, for x ≥ 0,   (2)

and zero elsewhere. Find the value of k for which (2) is a valid pdf.

3.3.3 The solid line in the left panel of Figure 1 is an unnormalized density function, f̃_X(x). It does not integrate to one and its integral is unknown. It is superimposed on a standard normal pdf, φ(x) (dashed line). The ratio f̃_X(x)/φ(x) is plotted in the right panel. Suppose you obtain a large random sample from the standard normal distribution, (X_1, X_2, ..., X_n). How might you use this sample to approximate the expectation under f_X(x), the properly normalized version of f̃_X(x)? Based on the plot in Figure 1, how well do you expect your method to work? Why?


[Figure 1: left panel shows f̃_X(x) (solid) overlaid on the standard normal pdf φ(x) (dashed) for x from −3 to 3, with density values from 0.0 to 0.4; right panel shows the ratio f̃_X(x)/φ(x), which ranges from about 0.6 to 1.2.]

Figure 1: Plots for Problem 3.3.3.
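One natural approach to Problem 3.3.3 is self-normalized importance sampling, with the normal draws as proposals and weights proportional to f̃_X/φ; the unknown normalizing constant cancels in the ratio. The sketch below uses a hypothetical unnormalized density `f_tilde` standing in for the one plotted in Figure 1 (all function names are ours):

```python
import math
import random

random.seed(1)

def phi(x):
    """Standard normal pdf."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def f_tilde(x):
    """Hypothetical unnormalized density standing in for Figure 1's solid line."""
    return phi(x) * (1.0 + 0.3 * math.sin(x))

def self_normalized_is(h, n=100_000):
    """Estimate the expectation of h(X) under the normalized version of f_tilde,
    using N(0,1) draws weighted by f_tilde/phi."""
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ws = [f_tilde(x) / phi(x) for x in xs]
    return sum(w * h(x) for w, x in zip(ws, xs)) / sum(ws)
```

The right panel of Figure 1 matters because a bounded, stable weight ratio keeps the variance of this estimator small.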

3.4

Higher Moments
3.4.1 Suppose that X is a continuous random variable with pdf

f_X(x) = (1/θ) e^{−x/θ}, for x > 0,

and zero elsewhere, with θ a positive parameter. Let Y = √(2X/θ). Find the pdf, mean, and variance of Y.

3.4.2 A continuous random variable X has cdf given by

F_X(x) = c(x^α + x^β), for 0 ≤ x ≤ 1,

with F_X(x) = 0, for x < 0, and F_X(x) = 1, for x > 1, for constants 1 ≤ α ≤ β. Find the value of the constant c, and evaluate the rth moment of X.

3.4.3 Let X be a random variable with kth non-central moment μ'_k = E(X^k) and kth central moment μ_k = E[(X − μ'_1)^k]. In addition to the mean and variance, two quantities sometimes used to summarize a distribution are the

skewness: α_3 = μ_3/(μ_2)^{3/2}   and   kurtosis: α_4 = μ_4/(μ_2)^2.

The skewness measures asymmetry in the pdf, while the kurtosis measures peakedness. A pdf is said to be symmetric around a point m if f_X(m − ε) = f_X(m + ε) for every positive ε.

(a) Show that the skewness of a symmetric distribution is zero.

(b) Show μ_3 = μ'_3 − 3μ'_2 μ'_1 + 2(μ'_1)^3.

(c) Suppose X = 0 with probability 1 − p and X = θ with probability p, where θ > 0. Find and plot the skewness of X as a function of p and θ.

(d) Calculate the skewness of f_X(x) = e^{−x}, for x ≥ 0 and 0 elsewhere. This is a right-skewed distribution.

(e) Calculate the kurtosis for each of the following pdfs and comment on the peakedness of each:

(i) f_X(x) = (1/√(2π)) e^{−x^2/2}, for x ∈ R;
(ii) f_X(x) = 1/2, for −1 < x < 1;
(iii) f_X(x) = (1/2) e^{−|x|}, for x ∈ R.

3.4.4 (⋆) Let X be a continuous random variable with range X = R+, pdf f_X and cdf F_X.

(a) Show that E(X) = ∫_0^∞ [1 − F_X(x)] dx.

(b) Show also that for integer r ≥ 1, E(X^r) = ∫_0^∞ r x^{r−1} [1 − F_X(x)] dx.

(c) Find a similar expression for random variables for which X = R.
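The identities in Problem 3.4.4(a) and (b) can be verified numerically for a specific case. The sketch below uses an Exponential distribution with rate 2, for which 1 − F_X(x) = e^{−2x} (the function names are ours):

```python
import math

def exp_survival(x, rate=2.0):
    """1 - F_X(x) for an Exponential(rate) random variable."""
    return math.exp(-rate * x)

def moment_via_tails(r, rate=2.0, upper=40.0, steps=400_000):
    """Midpoint-rule approximation to  integral_0^inf  r x^(r-1) [1 - F_X(x)] dx,
    which should match E(X^r)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += r * x ** (r - 1) * exp_survival(x, rate)
    return total * h
```

For rate 2, both E(X) = 1/2 and E(X^2) = 1/2 are known in closed form, so the approximation can be compared directly.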

4

Univariate Families of Distributions

4.1

Standard Univariate Parametric Families

4.1.1 (‡) (a) At a certain London university, the average overall grade of students in their first year is 65 with a standard deviation of 15. Under a normal model, what proportion of students have grades over 80? Under 40? Find the sixtieth percentile of the overall grades of first year students. If there are 250 first year students, find the mean and standard deviation of the number who score above 80. (b) About 75% of 20 year old women weigh between 103.5 and 148.5 lb. Using a normal model, and assuming that 103.5 and 148.5 are equidistant from the mean μ, calculate the standard deviation σ.

4.1.2 One reason cited for the mental deterioration so often seen in the very elderly is the reduction in cerebral blood flow that accompanies the aging process. Addressing this notion, a study was done (Ball and Taylor, 1969) to see whether cyclandelate, a vasodilator, might be able to stimulate the cerebral circulation and thereby slow the rate of mental deterioration. Blood circulation time can be measured using a radioactive tracer. Let X and Y be the mean blood circulation time before treatment and after treatment respectively, for a randomly selected elderly patient. (a) Let

D = 0 if Y < X,  D = 1 if Y > X.

What is the distribution of D?
(b) Consider the skeptical hypothesis that cyclandelate has no effect on mean circulation time and any differences that are observed before and after treatment are due to chance. What is the distribution of D under this hypothesis? (c) Now suppose we select a random sample of n patients, and let W = Σ_{i=1}^n D_i. What is the distribution of W under the skeptical hypothesis? (d) The drug was given to eleven subjects and blood flow was measured before and after treatment as described above. The data appear below. What is the observed value, w, of the random variable W? How likely is it that we would see a value as extreme or more extreme under the skeptical hypothesis? What do you conclude about the skeptical hypothesis?

CEREBRAL CIRCULATION EXPERIMENT
Mean Circulation Time (seconds)
Before, x_i: 15, 12, 12, 14, 13, 13, 13, 12, 12.5, 12, 12.5
After, y_i: 13, 8, 12.5, 12, 12, 12.5, 12.5, 14, 12, 11, 10

4.1.3 The drug enforcement team in a large city seized a stash of 496 small packets containing what appeared to be illegal narcotics. They were packaged for resale on the street. Four of the packets were randomly sampled, tested, and found to contain prohibited substances. Undercover police officers took two more of the (untested) packets and sold them to a defendant later accused of purchasing illegal drugs. Unfortunately, these last two packets were lost before they could be tested for narcotics. [This question is based on actual events as described in Shuster (1991) and reported in Casella and Berger (2002).] (a) Let N be the number of the original 496 packets that contained prohibited substances and let M = 496 − N be the number that did not. Compute the probability that the first four randomly selected packets contained narcotics and the next two randomly selected packets did not. (You should report your answer as a function of N and M.) (b) Maximize the probability that you found in part (a) as a function of N and M = 496 − N. This is the defendant's maximum probability of innocence.

4.1.4 Show that the binomial distribution converges to the Poisson distribution as n → ∞ with λ = pn held fixed. That is,

lim_{n→∞} (n choose x) p^x (1 − p)^{n−x} = e^{−λ} λ^x / x!,

where p = λ/n.
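The convergence in Problem 4.1.4 can be observed numerically by comparing the two pmfs at fixed λ as n grows (the function names are ours):

```python
from math import comb, exp, factorial

def binom_pmf(n, x, lam):
    """Binomial(n, p) pmf evaluated at x, with p = lam / n."""
    p = lam / n
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson(lam) pmf evaluated at x."""
    return exp(-lam) * lam ** x / factorial(x)
```

Evaluating both at, say, x = 2 and λ = 3 for n = 10, 100, 10000 shows the absolute difference shrinking toward zero.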

4.2

Classes of Parametric Families

4.2.1 Consider the Poisson distribution with expectation λ.

(a) Show that the Poisson distribution belongs to the exponential family. What is the canonical parameterization? (b) Now suppose λ = exp(Σ_{j=1}^J x_j β_j), where (x_1, ..., x_J) are known predictor variables, and (β_1, ..., β_J) are unknown parameters. (Here β replaces λ as the unknown parameter.) Show that this new distribution is also a member of the exponential family.

4.2.2 Consider the probability density function f_X(x) = e^{−x}, for x > 0. (a) Construct a location-scale family f_X(x|μ, σ) from f_X(x), where μ is the location parameter and σ is the scale parameter. (b) Derive the non-central moments μ'_n = E(Y^n), where Y follows the location-scale family.

5

Multivariate Random Variables and Probability Distributions

5.1

Multivariate Random Variables

5.1.1 Suppose that X and Y are discrete random variables with joint mass function given by

f_{X,Y}(x, y) = c/(2^{x+y} x! y!), for x, y = 0, 1, 2, ...,

and zero otherwise, for some constant c. (a) Find the value of c, and the marginal mass functions of X and Y. (b) Prove that X and Y are independent random variables, that is, f_{X,Y}(x, y) = f_X(x) f_Y(y), for all x, y = 0, 1, ....

5.1.2 Continuous random variables X and Y have joint cdf F_{X,Y} defined by

F_{X,Y}(x, y) = (1 − e^{−x})(1/2 + (1/π) tan^{−1} y), for x > 0 and −∞ < y < ∞,

with F_{X,Y}(x, y) = 0, for x ≤ 0. Find the joint pdf, f_{X,Y}. Are X and Y independent? Justify your answer.

5.1.3 Suppose that X and Y are continuous random variables with joint pdf given by f_{X,Y}(x, y) = cx(1 − y), for 0 < x < 1 and 0 < y < 1, and zero otherwise, for some constant c. (a) Are X and Y independent random variables? (b) Find the value of c. (c) Find Pr(X < Y).

5.1.4 Suppose that the joint pdf of X and Y is given by f_{X,Y}(x, y) = 24xy, for x > 0, y > 0, and x + y < 1, and zero otherwise. Find (a) the marginal pdf of X, f_X, (b) the marginal pdf of Y, f_Y, (c) the conditional pdf of X given Y = y, f_{X|Y}, (d) the conditional pdf of Y given X = x, f_{Y|X}, (e) the expected value of X, and (f) the expected value of Y. [Hint: Sketch the region on which the joint density is non-zero; remember that the integrand is only non-zero for some part of the integral range.]
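A quick numerical sanity check that the joint density in Problem 5.1.4 integrates to one over the triangular region (a midpoint-rule sketch; the function name is ours):

```python
def integral_24xy(steps=1000):
    """Midpoint rule for the double integral of 24xy over
    the region {x > 0, y > 0, x + y < 1}."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        for j in range(steps):
            y = (j + 0.5) * h
            if x + y < 1.0:
                total += 24.0 * x * y
    return total * h * h
```

The same loop, with an extra factor of x or y in the summand, gives a check on your answers to parts (e) and (f).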

5.1.5 Suppose that X and Y are continuous random variables with joint pdf given by

f_{X,Y}(x, y) = 1/(2x^2 y), for 1 ≤ x < ∞ and 1/x ≤ y ≤ x,

and zero otherwise. (a) Find the marginal pdf of X. (b) Find the marginal pdf of Y. (c) Find the conditional pdf of X given Y = y. (d) Find the conditional pdf of Y given X = x. (e) Find the expectation of Y, E(Y).

5.1.6 Suppose X is a random variable with pdf proportional to the normal pdf (mean μ and variance σ^2) for positive x and zero elsewhere,

f_X(x) = (k/√(2πσ^2)) exp{−(x − μ)^2/(2σ^2)}, for x ≥ 0,
f_X(x) = 0, elsewhere.

Derive the mean and variance of X.

5.1.7 A critical component in an experimental rocket can withstand temperatures up to t_0 in degrees Centigrade. If the maximum temperature, T, of this component exceeds t_0, the rocket will fail. Preliminary tests indicate that there is some variability in the maximum temperature that the component is likely to reach when the rocket is launched; the pdf of this temperature is given by f_T(t). Engineers are anxious about an upcoming test launch because, although ∫_{t_0}^∞ f_T(t) dt is near zero, it is greater than zero. There are sensors in the rocket that will record T, but it will take some time to recover the rocket and analyze the data from the sensors. (This, of course, is assuming that the rocket does not fail.) Suppose that the test launch goes smoothly and the rocket does not fail. (a) Carefully derive the pdf of the maximum temperature of the critical component after the engineers observe the successful launch of the rocket, but before they are able to analyze the sensor data. (b) Verify that the pdf you gave in part (a) is a valid pdf.

5.2

Multivariate Transformations
5.2.1 Suppose X_i ∼ Gamma(α_i, β), independently for i = 1, ..., n. (a) Use the convolution theorem to show that Y = X_1 + X_2 ∼ Gamma(α_1 + α_2, β). (b) Prove that Z_n = Σ_{i=1}^n X_i ∼ Gamma(Σ_{i=1}^n α_i, β).
5.2.2 (⋆) Suppose that X and Y have joint pdf that is constant on the range X^(2) = (0, 1) × (0, 1), and zero otherwise. Find the marginal pdfs of the random variables U = X/Y and V = −log(XY), stating clearly the range of the transformed random variable in each case. [Hint: For U, you might consider first the joint pdf of (U, X), then obtain the marginal pdf of U. For V, consider the joint pdf of (V, −log X), then obtain the marginal pdf of V. Compare the ease of these calculations with those required by the joint transformation from (X, Y) to (U, V).]

5.2.3 (⋆) Suppose that continuous random variables X_1, X_2, X_3 are independent, and have marginal pdfs specified by

f_{X_i}(x_i) = c_i x_i^i e^{−x_i}, for x_i > 0,

for i = 1, 2, 3, where c_1, c_2 and c_3 are normalizing constants. Find the joint pdf of the random variables Y_1, Y_2, Y_3 defined by

Y_1 = X_1/(X_1 + X_2 + X_3),  Y_2 = X_2/(X_1 + X_2 + X_3),  Y_3 = X_1 + X_2 + X_3,

and evaluate the (marginal) expectation of Y_1.

5.2.4 Suppose that X and Y are continuous random variables with joint pdf given by

f_{X,Y}(x, y) = (1/(2π)) exp{−(x^2 + y^2)/2}, for x, y ∈ R.

(a) Let the random variable U be defined by U = X/Y. Find the pdf of U. (b) Suppose now that S^2 is independent of X and Y. (The pdf of S^2 is given by f_{S^2}(s) = c(ν) s^{ν/2−1} e^{−s/2}, for s > 0, where ν is a positive integer and c(ν) is a normalizing constant depending on ν.) Find the pdf of the random variable T defined by

T = X/√(S^2/ν).
This is the pdf of a t random variable with ν degrees of freedom.

5.2.5 Suppose (X_1, ..., X_n) is a collection of independent and identically distributed random variables taking values on X with pmf/pdf f_X and cdf F_X. Let Y_n and Z_n correspond to the maximum and minimum order statistics derived from (X_1, ..., X_n), that is,

Y_n = max{X_1, ..., X_n},  Z_n = min{X_1, ..., X_n}.

(a) Show that the cdfs of Y_n and Z_n are given by

F_{Y_n}(y) = {F_X(y)}^n,  F_{Z_n}(z) = 1 − {1 − F_X(z)}^n.

(b) Suppose X_1, ..., X_n ∼ Unif(0, 1), that is, F_X(x) = x, for 0 ≤ x ≤ 1. Find the cdfs of Y_n and Z_n.

(c) Suppose X_1, ..., X_n have cdf F_X(x) = 1 − x^{−1}, for x ≥ 1. Find the cdfs of Z_n and U_n = Z_n^n.

(d) Suppose X_1, ..., X_n have cdf F_X(x) = 1/(1 + e^{−x}), for x ∈ R. Find the cdfs of Y_n and U_n = Y_n − log n.

(e) Suppose X_1, ..., X_n have cdf F_X(x) = 1 − 1/(1 + x), for x > 0. Find the cdfs of Y_n, Z_n, U_n = Y_n/n, and V_n = nZ_n.
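The order-statistic cdfs of Problem 5.2.5 can be checked by simulation in the Unif(0, 1) case of part (b); the evaluation points y0 and z0 below are arbitrary choices of ours:

```python
import random

random.seed(2)

n, reps = 5, 200_000
y0, z0 = 0.7, 0.1  # points at which to estimate the two cdfs
count_max = count_min = 0

for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    if max(xs) <= y0:
        count_max += 1
    if min(xs) <= z0:
        count_min += 1

emp_max_cdf = count_max / reps  # compare with your formula for F_{Y_n}(y0)
emp_min_cdf = count_min / reps  # compare with your formula for F_{Z_n}(z0)
```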




5.3

Covariance and Correlation

5.3.1 Suppose X and Y are two random variables, each with finite mean and variance. Prove that −1 ≤ ρ_{XY} ≤ 1 by using the fact that

Var(X/σ_X + Y/σ_Y) and Var(X/σ_X − Y/σ_Y)

are both non-negative quantities.

5.3.2 Suppose that X and Y have joint pdf given by f_{X,Y}(x, y) = cxy(1 − x − y), for 0 < x < 1, 0 < y < 1, and 0 < x + y < 1, for some constant c > 0. Find the covariance of X and Y.

5.4

Hierarchical and Mixture Models

5.4.1 Suppose X|Y ∼ Exponential(Y) and Y ∼ Gamma(α, β), using the parameterization on the formula sheet. (a) Find the mean and variance of X. (b) Find the marginal distribution of X.

5.4.2 The number of daughters of an organism is a discrete random variable with mean μ and variance σ^2. Each of its daughters reproduces in the same manner. Find the expectation and variance of the number of granddaughters.

5.4.3 Suppose that the joint pdf of random variables X and Y is specified via the conditional density f_{X|Y} and the marginal density f_Y as

f_{X|Y}(x|y) = √(y/(2π)) exp{−yx^2/2}, for x ∈ R;  f_Y(y) = c(ν) y^{ν/2−1} e^{−y/2}, for y > 0,

where ν is a positive integer. Find the marginal pdf of X.
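Problem 5.4.2 can be explored by simulation once a specific offspring distribution is assumed; the uniform-on-{0, 1, 2, 3} law below is our toy choice, not part of the problem:

```python
import random

random.seed(3)

def offspring():
    """Assumed toy reproduction law: 0, 1, 2 or 3 daughters, equally likely
    (mean mu = 1.5, variance sigma^2 = 1.25)."""
    return random.randrange(4)

def granddaughters():
    """Total daughters produced by one organism's daughters."""
    return sum(offspring() for _ in range(offspring()))

reps = 200_000
sample = [granddaughters() for _ in range(reps)]
mean_g = sum(sample) / reps
var_g = sum((g - mean_g) ** 2 for g in sample) / reps
```

Comparing `mean_g` and `var_g` against the formulas you derive (with μ = 1.5 and σ^2 = 1.25) checks your use of the tower property and the law of total variance.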

6

Multivariate Families of Distributions

6.1

Multinomial and Dirichlet Distributions

6.1.1 Suppose N ∼ Poisson(λ) and X|N ∼ Multinomial(N, p), where N is univariate, X = (X_1, ..., X_k) is a (k × 1) random vector, and p = (p_1, ..., p_k) is a (k × 1) probability vector. (a) Write the joint pmf of N and X. (b) (⋆) Rearrange the terms in the joint pmf and its support to show that X_i ∼ Poisson(λp_i), independently for i = 1, ..., k, and N = Σ_{i=1}^k X_i.


6.2

Multivariate Normal Distribution

6.2.1 (⋆) The Bivariate Normal Distribution: Suppose that X_1 and X_2 are independent and identically distributed N(0, 1) random variables. Let random variables Y_1 and Y_2 be defined by

Y_1 = μ_1 + σ_1 √(1 − ρ^2) X_1 + σ_1 ρ X_2  and  Y_2 = μ_2 + σ_2 X_2,

or, equivalently, (Y_1, Y_2)' = (μ_1, μ_2)' + A(X_1, X_2)', where A is the matrix with rows (σ_1 √(1 − ρ^2), σ_1 ρ) and (0, σ_2), for positive constants σ_1 and σ_2, and |ρ| < 1.

(a) Find the joint pdf of (Y_1, Y_2).

(b) Show that, marginally, Y_i ∼ N(μ_i, σ_i^2) for i = 1, 2, and that conditionally

Y_1 | Y_2 = y_2 ∼ N(μ_1 + ρ(σ_1/σ_2)(y_2 − μ_2), σ_1^2 (1 − ρ^2)),
Y_2 | Y_1 = y_1 ∼ N(μ_2 + ρ(σ_2/σ_1)(y_1 − μ_1), σ_2^2 (1 − ρ^2)).

(c) Find the correlation between Y_1 and Y_2.

6.2.2 Suppose (X_1, X_2)' ∼ N_2(μ, Σ), with μ = (2, 5)' and Σ the 2 × 2 matrix with rows (1, 0.5) and (0.5, 4).
Compute Pr(X_1 > 0) and Pr(X_2 < 6). [Hint: You may use the result of Problem 6.2.1.]

6.2.3 The joint pdf of the random variables X_1 and X_2 is

f_{X_1,X_2}(x_1, x_2) = k exp{−(x_1^2/6 − x_1 x_2/3 + 2x_2^2/3)}, for −∞ < x_1, x_2 < ∞.

Find E(X_1), E(X_2), Var(X_1), Var(X_2), Cov(X_1, X_2) and k. [Hint: You may use the result of Problem 6.2.1.]

6.2.4 (⋆) [Warning: If you have an aversion to vector notation, you may find this question challenging!] Suppose Y and X = (X_1, X_2)' jointly follow a trivariate normal distribution. Here Y is a univariate random variable and Z = (Y, X_1, X_2)' is a (3 × 1) trivariate normal random vector with mean μ = (μ_Y, μ_X')' and variance-covariance matrix M^{−1}, where M is the (3 × 3) matrix with blocks m_{YY} and M_{YX} in its first row and M_{YX}' and M_{XX} below, μ_Y is the univariate mean of Y, μ_X is the (2 × 1) mean vector of X, μ is the (3 × 1) mean vector of both X and Y, m_{YY} is the first diagonal element of M, M_{XX} is the lower-right (2 × 2) submatrix of M, and M_{YX} is the remaining off-diagonal (1 × 2) submatrix of M. (Note that we parameterize the multivariate normal in terms of the inverse of its variance-covariance matrix. This will significantly simplify calculations!)

(a) Derive the conditional distribution of Y given both X_1 and X_2. [Hint: Use vector/matrix notation.]
(b) Now suppose $Y$ and $X = (X_1, \ldots, X_n)$ jointly follow a multivariate normal distribution. Here $Y$ remains a univariate random variable and $Z = (Y, X_1, \ldots, X_n)$ is an $[(n+1) \times 1]$ multivariate normal random vector. Use the same notation for the mean and the inverse of the variance-covariance matrix, but with appropriately adjusted dimensions. Derive the conditional distribution of $Y$ given $X_1, \ldots, X_n$. [Hint: If you used vector/matrix notation in part (a), this problem will be very easy. If you did not, it will be very hard!]
(c) Set $n = 1$ and check that your answer is the same as the conditional distribution for the bivariate normal derived in lecture and in Problem 6.2.1.
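The construction in Problem 6.2.1 lends itself to a quick numerical check. Below is a minimal Python sketch (not part of the problem sheet; the parameter values and function names are illustrative) that simulates $(Y_1, Y_2)$ pairs from the two-equation definition and confirms that the sample correlation lands near $\rho$, as part (c) predicts:

```python
import math
import random

def simulate_pairs(n, mu1, mu2, sigma1, sigma2, rho, seed=0):
    """Draw n pairs (Y1, Y2) via the construction in Problem 6.2.1."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = mu1 + sigma1 * math.sqrt(1.0 - rho**2) * x1 + rho * sigma1 * x2
        y2 = mu2 + sigma2 * x2
        pairs.append((y1, y2))
    return pairs

def sample_corr(pairs):
    """Plain sample correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    ma = sum(a for a, _ in pairs) / n
    mb = sum(b for _, b in pairs) / n
    cov = sum((a - ma) * (b - mb) for a, b in pairs) / n
    va = sum((a - ma) ** 2 for a, _ in pairs) / n
    vb = sum((b - mb) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(va * vb)

# Illustrative values: mu = (2, 5), sigma1 = 1, sigma2 = 2, rho = 0.5.
pairs = simulate_pairs(100_000, 2.0, 5.0, 1.0, 2.0, 0.5)
r = sample_corr(pairs)  # should land near rho = 0.5
```

The same simulated pairs can be reused to eyeball the conditional-mean formulas in part (b), for example by averaging $Y_1$ within a narrow band of $Y_2$ values.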

6.3

Connections between the Distributions

6.3.1 Suppose that $U_1$ and $U_2$ are independent and identically distributed Unif(0, 1) random variables. Let random variables $Z_1$ and $Z_2$ be defined by
$$Z_1 = \sqrt{-2\log(U_1)}\,\cos(2\pi U_2), \qquad Z_2 = \sqrt{-2\log(U_1)}\,\sin(2\pi U_2).$$
Find the joint pdf of $(Z_1, Z_2)$.

6.3.2 Suppose that $U$ is a Unif(0, 1) random variable. Find the distribution of $X = -\log U$.

6.3.3 Suppose that an unlimited sequence of Unif(0, 1) random variables is available. Using the results of Problems 6.3.1 and 6.3.2, and results discussed earlier this term, describe how to generate:
(a) a Gamma$(k, \lambda)$ random variable, for integer $k > 0$;
(b) a realization of a Poisson process with rate $\lambda$;
(c) a $\chi^2_\nu \equiv$ Gamma$\left(\frac{1}{2}\nu, 2\right)$ random variable, where $\nu$ is a positive, integer parameter;
(d) a $t_n$ random variable, where $n$ is a positive integer parameter.
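Problems 6.3.1 and 6.3.2 are the basis of classical simulation algorithms. Below is a sketch (function names are my own) of the Box-Muller transform of 6.3.1 and the inverse-transform step of 6.3.2, checked against the moments they should produce:

```python
import math
import random

def box_muller(rng):
    """Problem 6.3.1: two Unif(0,1) draws -> two independent N(0,1) draws."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    return (radius * math.cos(2.0 * math.pi * u2),
            radius * math.sin(2.0 * math.pi * u2))

def exp_draw(rng):
    """Problem 6.3.2: -log(U) is a standard exponential draw."""
    return -math.log(1.0 - rng.random())

rng = random.Random(1)
zs = [z for _ in range(50_000) for z in box_muller(rng)]
es = [exp_draw(rng) for _ in range(50_000)]

z_mean = sum(zs) / len(zs)               # should be near 0
z_var = sum(z * z for z in zs) / len(zs) # should be near 1
e_mean = sum(es) / len(es)               # should be near 1
```

Summing $k$ independent `exp_draw` values (suitably scaled) gives the Gamma variable of 6.3.3(a), which is one way to check your answer to that part as well.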

7

Sampling Distributions and Statistical Inference

7.1

Background

7.2

Statistics and Their Sampling Distributions

7.2.1 Suppose that $(X_1, \ldots, X_n)$ is a random sample from a Poisson($\lambda$) distribution. Define the statistics
$$T_1 = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad\text{and}\quad T_2 = S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$
Show that $E(T_1) = E(T_2) = \lambda$.

7.2.2 Suppose that $(X_1, \ldots, X_n)$ is a random sample from the probability distribution with pdf
$$f_X(x; \theta) = \frac{1}{\theta}e^{-x/\theta}, \quad\text{for } x > 0.$$
(a) Show that the sample mean $\bar{X}$ is an unbiased estimator of $\theta$.
(b) Set $Y_1 = \min\{X_1, \ldots, X_n\}$ and show that $Z = nY_1$ is also unbiased for $\theta$.

7.2.3 Suppose that $(X_1, \ldots, X_n)$ is a random sample from the uniform distribution on $(\theta - 1, \theta + 1)$.
(a) Show that the sample mean $\bar{X}$ is an unbiased estimator of $\theta$.
(b) Let $Y_1$ and $Y_n$ be the smallest and largest order statistics derived from $(X_1, \ldots, X_n)$. Show that the random variable $M = (Y_1 + Y_n)/2$ is also an unbiased estimator of $\theta$.
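The two estimators in Problem 7.2.3 can be compared by simulation. A sketch (parameter values are illustrative) that averages $\bar{X}$ and $M = (Y_1 + Y_n)/2$ over many Unif$(\theta-1, \theta+1)$ samples; both averages should sit near $\theta$, and for the uniform the midrange turns out to be the more precise of the two:

```python
import random
import statistics

def unif_estimators(theta, n, reps, seed=2):
    """For many Unif(theta-1, theta+1) samples of size n, record the
    sample mean X-bar and the midrange M = (Y1 + Yn) / 2."""
    rng = random.Random(seed)
    xbars, mids = [], []
    for _ in range(reps):
        xs = [rng.uniform(theta - 1.0, theta + 1.0) for _ in range(n)]
        xbars.append(sum(xs) / n)
        mids.append((min(xs) + max(xs)) / 2.0)
    return xbars, mids

xbars, mids = unif_estimators(theta=3.0, n=10, reps=20_000)
```

Comparing `statistics.variance(xbars)` with `statistics.variance(mids)` hints at a question the problem does not ask: unbiasedness alone does not decide which estimator to prefer.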

7.3

The Method of Moments

7.3.1 Method of moments.
(a) Suppose $(X_1, \ldots, X_n)$ is a random sample from a gamma distribution, having pdf
$$f_X(x) = \frac{1}{\Gamma(\alpha)\beta^\alpha}\, x^{\alpha-1} \exp\{-x/\beta\}, \quad\text{for } x > 0,$$
where $\alpha, \beta > 0$. Find the method of moments estimators of $\alpha$ and $\beta$. Can the corresponding estimates ever be outside the parameter space?
(b) Suppose $(X_1, \ldots, X_n)$ is a random sample from a beta distribution, having pdf
$$f_X(x) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad\text{for } 0 < x < 1,$$
where $\alpha, \beta > 0$. Find the method of moments estimators of $\alpha$ and $\beta$. Can the corresponding estimates ever be outside the parameter space?
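Once you have derived the estimators in 7.3.1(a), a simulation makes a handy sanity check. The sketch below hard-codes the standard moment-matching answer ($\hat\alpha = \bar{x}^2/\hat\sigma^2$ and $\hat\beta = \hat\sigma^2/\bar{x}$, with $\hat\sigma^2$ the $1/n$ sample variance), so treat it as a check on your own algebra rather than a substitute for it:

```python
import random

def mom_gamma(xs):
    """Moment matching for the gamma pdf of 7.3.1(a): mean = alpha*beta
    and variance = alpha*beta**2, solved for (alpha, beta)."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n
    return xbar * xbar / s2, s2 / xbar

# random.gammavariate(alpha, beta) uses the same shape/scale
# parameterization as the pdf in 7.3.1(a).
rng = random.Random(3)
xs = [rng.gammavariate(2.0, 1.5) for _ in range(100_000)]
alpha_hat, beta_hat = mom_gamma(xs)  # should land near (2.0, 1.5)
```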

7.4

Maximum Likelihood Estimation

7.4.1 [Problem 4.2.2 continued] Consider the probability density function $f_X(x) = e^{-x}$ for $x > 0$. In Problem 4.2.2(a) you derived a location-scale family $f_X(x \mid \mu, \sigma)$ from $f_X(x)$, where $\mu$ is the location parameter and $\sigma$ is the scale parameter. Now suppose $\sigma = 1$. Report the loglikelihood function for $\mu$ and compute its maximum likelihood estimator, $\hat\mu$.

7.4.2 [Problem 6.1.1 continued] Suppose $N \sim \text{Poisson}(\lambda)$ and $X \mid N \sim \text{Multinomial}(N, p)$, where $N$ is univariate and $X = (X_1, \ldots, X_k)$ is a $(k \times 1)$ random variable and $p = (p_1, \ldots, p_k)$ is a $(k \times 1)$ probability vector. In Problem 6.1.1, you showed that
$$X_i \overset{\text{ind}}{\sim} \text{Poisson}(\lambda p_i) \quad\text{and}\quad N = \sum_{i=1}^{k} X_i. \tag{3}$$
(a) Let $\theta = \lambda p$. Note that $\theta = (\theta_1, \ldots, \theta_k)$ is a $(k \times 1)$ vector. Using (3), find the maximum likelihood estimator, $\hat\theta$, of $\theta$.
(b) Using $\hat\theta$ derived in part (a), derive formulas for $\hat\lambda$ and $\hat{p}$ that satisfy $\hat\lambda\hat{p} = \hat\theta$, such that $\hat{p}$ is a probability vector. You do not need to show it, but maximum likelihood estimators are invariant to transformations, so that $\hat\lambda$ and $\hat{p}$ are the maximum likelihood estimators of $\lambda$ and $p$, respectively.

7.4.3 Suppose that $(X_1, \ldots, X_n)$ is a random sample from a Poisson($\lambda$) distribution.
(a) Find the maximum likelihood estimator of $\lambda$ and show that this estimator is unbiased.

(b) Find the maximum likelihood estimator of $\tau(\lambda) = e^{-\lambda} = \Pr(X = 0)$.

7.4.4 Find the maximum likelihood estimators of the unknown parameters in the following probability densities on the basis of a random sample of size $n$.
(a) $f_X(x; \theta) = \theta x^{\theta-1}$, for $0 < x < 1$ and $\theta > 0$.
(b) $f_X(x; \theta) = (\theta + 1)x^{-\theta-2}$, for $1 < x$ and $\theta > 0$.
(c) $f_X(x; \theta) = \theta^2 x \exp\{-\theta x\}$, for $0 < x$ and $\theta > 0$.
(d) $f_X(x; \theta) = 2\theta^2 x^{-3}$, for $\theta \leq x$ and $\theta > 0$.
(e) $f_X(x; \theta_1, \theta_2) = \theta_1 \theta_2^{\theta_1} x^{-\theta_1 - 1}$, for $\theta_2 \leq x$ and $\theta_1, \theta_2 > 0$.

7.5

Random Intervals and Condence Intervals

7.5.1 Suppose you observe a single observation from a normal distribution with unit variance and unknown mean, $\mu$. Specifically, $X \sim N(\mu, 1)$.
(a) Report the loglikelihood function for $\mu$ and compute its maximum likelihood estimator, $\hat\mu$.
(b) What is the distribution of $\hat\mu$?
(c) Derive an interval $I(\mu)$ that has a 95% chance of containing $\hat\mu$. For definiteness, choose the shortest possible interval with this property. Make a plot of your interval, with $\mu$ plotted on the horizontal axis, and the lower and upper bounds of $I(\mu)$ plotted on the vertical axis.
(d) Now consider the interval $J(\hat\mu) = \{\mu : \hat\mu \in I(\mu)\}$. Identify this interval on your plot from your answer to part (c). Give formulas for the lower and upper bounds of $J(\hat\mu)$. Notice that $J$ can be computed from data, whereas $I$ cannot. ($I$ depends on the unknown mean, $\mu$, while $J$ depends only on the maximum likelihood estimator.)
(e) Show that $\Pr\{\mu \in J(\hat\mu)\} = 95\%$. [Hint: What is the random quantity in this expression?] An interval with this property is called a 95% confidence interval.

7.5.2 Suppose you observe a binomial random variable, $X \sim \text{Bin}(n, p)$, where $n$ is known, but $p$ is not.
(a) Report the loglikelihood function for $p$ and compute its maximum likelihood estimator, $\hat{p}$.
(b) What is the distribution of $\hat{p}$?
(c) Suppose $n = 10$ and, for $p$ on the grid of values $(0, 0.1, 0.2, 0.3, \ldots, 1.0)$, derive an interval $I(p)$ that has at least a 95% chance of containing $\hat{p}$. For definiteness, choose the shortest possible interval with this property. Make a plot of your interval, interpolating linearly between grid points, with $p$ plotted on the horizontal axis, and the lower and upper bounds of $I(p)$ plotted on the vertical axis.
(d) Now consider the interval $J(\hat{p}) = \{p : \hat{p} \in I(p)\}$. Identify this interval on your plot from your answer to part (c). Compute $J(\hat{p})$ for each possible value of $\hat{p}$.
(e) Show that $\Pr\{p \in J(\hat{p})\} \geq 95\%$, at least for $p \in (0, 0.1, 0.2, 0.3, \ldots, 1.0)$.
(f) Qualitatively, how will the intervals change as $n$ increases?
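The logic of Problem 7.5.1 is easy to confirm by simulation: whatever the true mean, an interval of half-width 1.96 (the N(0, 1) 97.5% point) centred at the single observation should cover $\mu$ about 95% of the time. A minimal sketch, with $\mu$ chosen arbitrarily:

```python
import random

def coverage(mu, reps, half_width=1.96, seed=5):
    """Fraction of replications in which [X - half_width, X + half_width]
    contains mu, for a single draw X ~ N(mu, 1)."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(reps)
        if abs(rng.gauss(mu, 1.0) - mu) <= half_width
    )
    return hits / reps

cov = coverage(mu=-1.7, reps=100_000)  # should be close to 0.95
```

Rerunning with a different `mu` gives essentially the same coverage, which is the point of part (e): the probability statement holds for every value of the unknown mean.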

7.5.3 Suppose that $(X_1, \ldots, X_n)$ is a random sample from the probability distribution with pdf $f_X(x; \lambda) = \lambda e^{-\lambda x}$, for $x > 0$.
(a) Find the maximum likelihood estimator of $\lambda$ and show that it is biased as an estimator of $\lambda$, but that some multiple of it is not.
(b) Show that $2\lambda \sum_{i=1}^{n} X_i$ is a pivotal quantity. Describe briefly how to use this to construct a $100(1-\alpha)\%$ confidence interval for $\lambda$, $\alpha \in (0, 1)$.

7.5.4 Let $(X_1, \ldots, X_n)$ be a random sample from the uniform distribution on $(0, \theta)$.
(a) Find the maximum likelihood estimator, $\hat\theta$, of $\theta$.
(b) By considering the distribution of $\hat\theta/\theta$, show that for $\alpha \in (0, 1)$, a $100(1-\alpha)\%$ confidence interval for $\theta$ based on $\hat\theta$ is given by $(\hat\theta, \hat\theta/\alpha^{1/n})$.

7.5.5 Suppose $X_1 \sim N(\theta_1, 1)$ and $X_2 \sim N(\theta_2, 1)$, with $X_1$ and $X_2$ independent and $\theta_1$ and $\theta_2$ both unknown. Show that both the square $S$ and circle $C$ given by
$$S = \{(\theta_1, \theta_2) : |X_1 - \theta_1| \leq 2.236,\; |X_2 - \theta_2| \leq 2.236\}$$
and
$$C = \{(\theta_1, \theta_2) : (X_1 - \theta_1)^2 + (X_2 - \theta_2)^2 \leq 5.991\}$$
are 95% confidence sets for $(\theta_1, \theta_2)$. What is a sensible criterion for choosing between $S$ and $C$?

7.5.6 The following data is sampled from a $N(\mu, \sigma^2)$ distribution, where both $\mu$ and $\sigma^2$ are unknown: 6.82, 6.07, 3.74, 6.87, 5.92. For this data $\sum x_i = 29.42$ and $\sum x_i^2 = 179.588$.
(a) Find a 95% confidence interval for $\mu$ and show that its width is about 3.16.
(b) Suppose it becomes known that the true value of $\sigma^2$ is 1. Show that a 95% confidence interval for $\mu$ now has width about 1.75. The width of the confidence interval is narrower when the true value of $\sigma^2$ is known. Will this always happen?
(c) (⋆) Consider the event that the 95% confidence interval for $\mu$ is narrower when $\sigma^2$ is known than when it is unknown, with both intervals computed from the same random sample $(X_1, \ldots, X_n)$ from $N(\mu, \sigma^2)$. Show that for $n = 5$ this event has probability a bit less than 0.75. [Hint: you will need to refer to tables of the quantiles of the $\chi^2_4$ distribution.]
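The interval in Problem 7.5.4(b) can be checked the same way: with $\hat\theta = \max_i X_i$ and $X_i \sim$ Unif$(0, \theta)$, the random interval $(\hat\theta, \hat\theta/\alpha^{1/n})$ should trap $\theta$ in a fraction $1 - \alpha$ of repeated samples. A sketch with illustrative values:

```python
import random

def uniform_ci_coverage(theta, n, alpha, reps, seed=6):
    """Coverage of (theta_hat, theta_hat / alpha**(1/n)) over repeated
    Unif(0, theta) samples of size n, where theta_hat is the sample max."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        theta_hat = max(rng.uniform(0.0, theta) for _ in range(n))
        if theta_hat < theta <= theta_hat / alpha ** (1.0 / n):
            hits += 1
    return hits / reps

cov = uniform_ci_coverage(theta=4.0, n=8, alpha=0.05, reps=100_000)
```

Note that the interval is one-sided in an unusual way: $\hat\theta$ itself is always below $\theta$, so the coverage question reduces to how often $\hat\theta \geq \theta\,\alpha^{1/n}$.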

7.6

Bayesian Statistical Inference

7.6.1 Suppose $Y \mid \lambda \sim \text{Poisson}(\lambda)$ and we wish to estimate $\lambda$ using Bayesian methods, and so specify a gamma prior distribution, $\lambda \sim \text{Gamma}(r, \nu)$. Use the parameterization of the gamma distribution on the formula sheet and assume that $r$ is a positive integer.
(a) Derive the posterior distribution of $\lambda$ given $Y$. What named distribution is this?
(b) What are the posterior mean and variance of $\lambda$?
(c) Find the marginal distribution of $Y$. What named distribution is this?
(d) Compute the maximum likelihood estimator, $\hat\lambda$, of $\lambda$. How does the maximum likelihood estimate compare with the posterior mean? [Hint: Ignore the prior distribution when computing the maximum likelihood estimate/estimator!]

7.6.2 Suppose $Y \mid \theta \sim \text{Binomial}(n, \theta)$ and show:
(a) If $\theta \sim \text{Unif}(0, 1)$, then $\text{Var}(\theta \mid Y) < \text{Var}(\theta)$.

(b) If $\theta \sim \text{Beta}(\alpha, \beta)$, $\text{Var}(\theta \mid Y)$ may be larger than $\text{Var}(\theta)$.

7.6.3 Suppose you are given the choice between two envelopes, one containing $\theta$ pounds and the other $2\theta$ pounds. The envelopes are sealed and shuffled so that you do not know which one contains more money. One of the envelopes is opened and found to contain $x$ pounds. You can either take the $x$ pounds or take the other envelope and the money that it contains. You are not, however, allowed to open the second envelope before you make your decision.
(a) Suppose that your subjective pdf for $\theta$ is uniform on the interval $(0, M)$ for some positive $M$. What is your optimal strategy given $x$?
(b) Now suppose that your subjective pmf for $\theta$ is $f(\theta) = 1/10$ for $\theta = 1, 2, \ldots, 10$. What is your optimal strategy given $x$?
(c) Finally, suppose that your subjective pdf for $\theta$ is $f(\theta) = \lambda^{-1}e^{-\theta/\lambda}$ for $\theta > 0$ and 0 elsewhere, where $\lambda$ is some positive number. Show how the optimal strategy depends on $x$. [Hint: Let $W$ be the expected value of the money in the sealed envelope and let $\psi = W/x$. When is $\psi > 1$?]
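The conjugate update behind Problem 7.6.1 can be written in a few lines. The sketch below assumes the rate parameterisation, with the Gamma$(r, \nu)$ density proportional to $\lambda^{r-1}e^{-\nu\lambda}$; check this against your formula sheet, since under a scale parameterisation the update to the second parameter takes a different form. Under that assumption, observing $Y = y$ turns the prior Gamma$(r, \nu)$ into the posterior Gamma$(r + y, \nu + 1)$:

```python
def poisson_gamma_update(r, nu, y):
    """Posterior (shape, rate) after one Poisson observation y,
    assuming a Gamma(r, nu) prior in the rate parameterisation."""
    return r + y, nu + 1

def gamma_mean(shape, rate):
    return shape / rate

# Illustrative numbers: prior Gamma(3, 2), observed count y = 10.
r, nu, y = 3, 2, 10
post_shape, post_rate = poisson_gamma_update(r, nu, y)
prior_mean = gamma_mean(r, nu)
post_mean = gamma_mean(post_shape, post_rate)
mle = y  # for a single Poisson observation, lambda_hat = y
```

The posterior mean lands between the prior mean and the maximum likelihood estimate, which is the comparison part (d) asks about.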

8

Convergence Concepts

8.1

Convergence in Distribution and the Central Limit Theorem

8.1.1 Suppose that random variable $X$ has mgf, $M_X(t)$, given by
$$M_X(t) = \frac{1}{8}e^{t} + \frac{2}{8}e^{2t} + \frac{5}{8}e^{3t}.$$
Find the probability distribution, expectation, and variance of $X$. [Hint: Consider $M_X$ and its definition.]

8.1.2 Suppose that $X$ is a continuous random variable with pdf
$$f_X(x) = \exp\{-(x+2)\}, \quad\text{for } -2 < x < \infty.$$
Find the mgf of $X$, and hence find the expectation and variance of $X$.

8.1.3 Suppose $Z \sim N(0, 1)$.
(a) Find the mgf of $Z$, and also the pdf and the mgf of the random variable $X$, where $X = \mu + \sigma Z$, for parameters $\mu$ and $\sigma > 0$.
(b) Find the expectation of $X$, and the expectation of the function $g(X)$, where $g(x) = e^x$. Use both the definition of the expectation directly and the mgf, and compare the complexity of your calculations.
(c) Suppose now $Y$ is the random variable defined in terms of $X$ by $Y = e^X$. Find the pdf of $Y$, and show that the expectation of $Y$ is $\exp\left\{\mu + \frac{1}{2}\sigma^2\right\}$.


(d) Finally, let random variable $T$ be defined by $T = Z^2$. Find the pdf and mgf of $T$.

8.1.4 Suppose that $X$ is a random variable with pmf/pdf $f_X$ and mgf $M_X$. The cumulant generating function of $X$, $K_X$, is defined by $K_X(t) = \log[M_X(t)]$. Prove that
$$\frac{d}{dt}\{K_X(t)\}_{t=0} = E(X), \qquad \frac{d^2}{dt^2}\{K_X(t)\}_{t=0} = \text{Var}(X).$$

8.1.5 Using the CENTRAL LIMIT THEOREM, construct Normal approximations to each of the following random variables:
(a) a Binomial distribution, $X \sim \text{Binomial}(n, \theta)$;
(b) a Poisson distribution, $X \sim \text{Poisson}(\lambda)$;
(c) a Negative Binomial distribution, $X \sim \text{Negative Binomial}(n, \theta)$.
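For 8.1.5(a), the quality of the CLT approximation is easy to quantify, since the binomial cdf can be computed exactly. A sketch comparing the exact cdf with the $N(np, np(1-p))$ approximation; the continuity correction used below is an extra refinement, not something the problem requires:

```python
import math

def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
               for j in range(k + 1))

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_cdf_clt(k, n, p):
    """CLT approximation N(np, np(1-p)), with a continuity correction."""
    mu = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    return normal_cdf((k + 0.5 - mu) / sd)

exact = binom_cdf(55, 100, 0.5)
approx = binom_cdf_clt(55, 100, 0.5)  # agrees with `exact` to a few decimals
```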
8.1.6 Let $S_n^2$ denote the sample variance of a random sample of size $n$ from $N(\mu, \sigma^2)$, so that
$$V_n = \frac{(n-1)S_n^2}{\sigma^2} \sim \chi^2_{n-1}.$$
Show, using the CENTRAL LIMIT THEOREM, that
$$\frac{\sqrt{n-1}\,(S_n^2 - \sigma^2)}{\sqrt{2}\,\sigma^2} \xrightarrow{D} Z \sim N(0, 1),$$
so that, for large $n$, $S_n^2$ is approximately distributed as $N\!\left(\sigma^2, \frac{2\sigma^4}{n-1}\right)$.

8.1.7 [Problem 5.2.5 continued] In Problem 5.2.5, you derived the cdfs of a number of random variables involving the minimum or maximum of a random sample. In this problem we will derive the limiting distributions of these same random variables. Suppose $(X_1, \ldots, X_n)$ is a collection of independent and identically distributed random variables taking values on $\mathcal{X}$ with pmf/pdf $f_X$ and cdf $F_X$, and let $Y_n$ and $Z_n$ correspond to the maximum and minimum order statistics derived from $X_1, \ldots, X_n$.
(a) Suppose $X_1, \ldots, X_n \sim \text{Unif}(0, 1)$, that is, $F_X(x) = x$ for $0 \leq x \leq 1$. Find the limiting distributions of $Y_n$ and $Z_n$ as $n \to \infty$.
(b) Suppose $X_1, \ldots, X_n$ have cdf $F_X(x) = 1 - x^{-1}$, for $x \geq 1$. Find the limiting distributions of $Z_n$ and $U_n = Z_n^n$ as $n \to \infty$.
(c) Suppose $X_1, \ldots, X_n$ have cdf
$$F_X(x) = \frac{1}{1 + e^{-x}}, \quad\text{for } x \in \mathbb{R}.$$
Find the limiting distributions of $Y_n$ and $U_n = Y_n - \log n$, as $n \to \infty$.
(d) Suppose $X_1, \ldots, X_n$ have cdf
$$F_X(x) = 1 - \frac{1}{1 + x}, \quad\text{for } x > 0.$$
Let $U_n = Y_n/n$ and $V_n = nZ_n$. Find the limiting distributions of $Y_n$, $Z_n$, $U_n$, and $V_n$ as $n \to \infty$.
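The kind of limit 8.1.7 asks for can be previewed numerically. For instance, if $Z_n$ is the minimum of $n$ Unif(0, 1) draws, then $\Pr(nZ_n > t) = (1 - t/n)^n \to e^{-t}$, so $nZ_n$ is approximately standard exponential for large $n$. The sketch below checks its mean (the constants are illustrative):

```python
import random

def scaled_minima(n, reps, seed=7):
    """Simulate n * Z_n, with Z_n the minimum of n Unif(0,1) draws."""
    rng = random.Random(seed)
    return [n * min(rng.random() for _ in range(n)) for _ in range(reps)]

vals = scaled_minima(n=1000, reps=20_000)
mean_val = sum(vals) / len(vals)  # an Exp(1) limit would give 1
```

Histogramming `vals` against the Exp(1) density makes the convergence visible; the same recipe, with the appropriate rescaling, works for the other parts of the problem.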



8.2

Convergence in Probability, the Law of Large Numbers, and Inequalities

8.2.1 Convergence in Probability. Suppose $X_1, \ldots, X_n \sim \text{Poisson}(\lambda)$. Let
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
(a) Show that $\bar{X} \xrightarrow{p} \lambda$ as $n \to \infty$.
(b) (⋆) Suppose $T_n = e^{-\bar{X}}$; show that $T_n \xrightarrow{p} e^{-\lambda}$.

8.2.2 Suppose $S^2$ is computed from a random sample, $(X_1, \ldots, X_n)$, from a distribution with finite variance, $\sigma^2$. Letting $S = \sqrt{S^2}$, show that $E(S) \leq \sigma$ and that, if $\sigma^2 > 0$, then $E(S) < \sigma$.
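Problem 8.2.1 pairs the law of large numbers with the continuous mapping theorem: the Poisson sample mean settles near $\lambda$, and a continuous function of the mean settles near the same function of $\lambda$. A sketch of both facts; since the standard library has no Poisson sampler, one is built from Knuth's product-of-uniforms method (fine for small $\lambda$):

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's method: count the uniforms needed before their running
    product drops below exp(-lam)."""
    limit, prod, k = math.exp(-lam), 1.0, 0
    while True:
        prod *= rng.random()
        if prod < limit:
            return k
        k += 1

rng = random.Random(8)
lam, n = 2.0, 20_000
xbar = sum(poisson_draw(lam, rng) for _ in range(n)) / n
g_of_xbar = math.exp(-xbar)  # continuous mapping: exp(-xbar) -> exp(-lam)
```

Repeating the calculation for increasing $n$ shows both `xbar` and `g_of_xbar` tightening around their limits, which is exactly the behaviour the problem asks you to prove.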