Chapter 7. Continuous Random Variables

Suppose again we have a random experiment with sample space $S$ and probability measure $P$. Recall our definition of a random variable as a mapping $X : S \to \mathbb{R}$ from the sample space $S$ to the real numbers, inducing a probability measure $P_X(B) = P(X^{-1}(B))$, $B \subseteq \mathbb{R}$.

Definition 7.0.1. A random variable $X$ is (absolutely) continuous if there exists a (measurable) function $f_X : \mathbb{R} \to \mathbb{R}$ such that

$$P_X(B) = \int_B f_X(x)\,\mathrm{d}x, \qquad B \subseteq \mathbb{R},$$

in which case $f_X$ is referred to as the probability density function, or pdf, of $X$.

7.0.6 Continuous Cumulative Distribution Function

Definition 7.0.2. The cumulative distribution function, or CDF, $F_X$ of a continuous random variable $X$ is defined as

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,\mathrm{d}t, \qquad x \in \mathbb{R}.$$

The support of a continuous random variable $X$ is the set $\{x : f_X(x) > 0\}$.

Properties

i) The cdf of a continuous random variable is itself a continuous function.

ii) At values of $x$ where $F_X$ is differentiable,
$$f_X(x) = F_X'(x).$$

iii) If $X$ is continuous, then for any $x$,
$$P(X = x) = \lim_{\delta \downarrow 0} P(x - \delta < X \le x) = \lim_{\delta \downarrow 0} \{F_X(x) - F_X(x - \delta)\} = 0.$$

Example

Consider a component which may fail at a random time. For $x > 0$ time units, let $A_x$ be the event that the component is still functioning at time $x$, and suppose that $P(A_x) = \exp\{-x^2\}$. Define the continuous random variable $X : S \to \mathbb{R}^+$ by $X(s) =$ failure time of the component. Then, for $x > 0$, the event $\{X \le x\}$ occurs exactly when the component fails by time $x$, that is, when $A_x$ does not occur, so

$$F_X(x) = P(X \le x) = 1 - P(A_x) = 1 - \exp\{-x^2\},$$

and hence

$$f_X(x) = F_X'(x) = 2x \exp\{-x^2\}$$

for $x > 0$, and zero otherwise. Figure 7.1 displays the probability density function (left) and cumulative distribution function (right). Note that both the PDF and CDF are defined for all real values of $x$, and that both are continuous functions.

Figure 7.1: PDF $f_X(x) = 2x\exp\{-x^2\}$, $x > 0$, and CDF $F_X(x) = 1 - \exp\{-x^2\}$.

Also note that here

$$P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,\mathrm{d}t = \int_{0}^{x} f_X(t)\,\mathrm{d}t$$

as $f_X(t) = 0$ for $t \le 0$, and also that

$$\int_{-\infty}^{\infty} f_X(x)\,\mathrm{d}x = \int_{0}^{\infty} f_X(x)\,\mathrm{d}x = 1.$$

Example

Suppose we have a continuous random variable $X$ with probability density function

$$f_X(x) = \begin{cases} c x^2, & 0 < x < 2, \\ 0, & \text{otherwise}, \end{cases}$$

for some unknown constant $c$.

Questions

Q1) Determine $c$.
Q2) Find the cdf of $X$.
Q3) Calculate $P(1 < X < 2)$.

Solutions

S1) We must have
$$\int_{-\infty}^{\infty} f_X(x)\,\mathrm{d}x = 1 \;\Longrightarrow\; \int_{0}^{2} c x^2\,\mathrm{d}x = 1 \;\Longrightarrow\; \left[\frac{c x^3}{3}\right]_0^2 = 1 \;\Longrightarrow\; \frac{8c}{3} = 1,$$
and therefore $c = \frac{3}{8}$.

S2) For $0 < x \le 2$,
$$F_X(x) = P(X \le x) = \int_{0}^{x} \frac{3u^2}{8}\,\mathrm{d}u = \frac{x^3}{8},$$
so that
$$F_X(x) = \begin{cases} 0, & x \le 0, \\ x^3/8, & 0 < x \le 2, \\ 1, & x > 2. \end{cases}$$

S3)
$$P(1 < X < 2) = F_X(2) - F_X(1) = 1 - \frac{1}{8} = \frac{7}{8}.$$
(A numerical check of this example is sketched below, at the end of the exponential section.)

Example

Suppose $X$ has pdf $f_X(x) = e^{-x}$ for $x > 0$. Hence $F_X(x) = \int_0^x f_X(u)\,\mathrm{d}u = 1 - e^{-x}$. Let $Y = g(X) = \log(X)$. Then $g^{-1}(y) = e^y$ and $\frac{\mathrm{d}}{\mathrm{d}y} g^{-1}(y) = e^y$. Then, using the transformation formula (7.1), $f_Y(y) = f_X(g^{-1}(y)) \left|\frac{\mathrm{d}}{\mathrm{d}y} g^{-1}(y)\right|$, the pdf of $Y$ is

$$f_Y(y) = \exp\{-e^y\}\, e^y, \qquad y \in \mathbb{R}.$$

7.1 Mean, Variance and Quantiles

7.1.1 Expectation

Definition 7.1.1. For a continuous random variable $X$ we define the mean or expectation of $X$ as
$$\mu_X \text{ or } E_X(X) = \int_{-\infty}^{\infty} x f_X(x)\,\mathrm{d}x.$$

Extension: More generally, for a (measurable) function of interest $g : \mathbb{R} \to \mathbb{R}$ of the random variable, we have
$$E_X\{g(X)\} = \int_{-\infty}^{\infty} g(x) f_X(x)\,\mathrm{d}x.$$

Properties of Expectations

Clearly, for continuous random variables we again have linearity of expectation,
$$E(aX + b) = a E(X) + b, \qquad \forall a, b \in \mathbb{R},$$
and, for two functions $g, h : \mathbb{R} \to \mathbb{R}$, additivity of expectation,
$$E\{g(X) + h(X)\} = E\{g(X)\} + E\{h(X)\}.$$

7.1.2 Variance

Definition 7.1.2. The variance of a continuous random variable $X$ is given by
$$\sigma_X^2 \text{ or } \operatorname{Var}(X) = E\{(X - \mu_X)^2\} = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,\mathrm{d}x,$$
and again it is easy to show that
$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} x^2 f_X(x)\,\mathrm{d}x - \mu_X^2 = E(X^2) - \{E(X)\}^2.$$

For a linear transformation $aX + b$ we again have
$$\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X), \qquad \forall a, b \in \mathbb{R}.$$

7.1.3 Quantiles

Recall we defined the lower and upper quartiles and median of a sample of data as the points $(1/4, 3/4, 1/2)$-way through the ordered sample. This idea can be generalised as follows:

Definition 7.1.3. For a (continuous) random variable $X$ we define the $\alpha$-quantile $Q_X(\alpha)$, $0 < \alpha < 1$, as the value satisfying
$$F_X(Q_X(\alpha)) = \alpha, \qquad \text{i.e.} \qquad Q_X(\alpha) = F_X^{-1}(\alpha).$$
In particular, $Q_X(1/2)$ is the median of $X$, and $Q_X(1/4)$ and $Q_X(3/4)$ are the lower and upper quartiles.

7.2.2 Exponential Distribution

Suppose $X$ is a random variable taking values on $\mathbb{R}^+$ with pdf
$$f_X(x) = \lambda e^{-\lambda x}, \qquad x > 0,$$
for some $\lambda > 0$. Then $X$ is said to follow an exponential distribution with rate parameter $\lambda$, and we write $X \sim \operatorname{Exp}(\lambda)$.

Notes

* The cdf is
$$F_X(x) = 1 - e^{-\lambda x}, \qquad x > 0.$$
* An alternative representation uses $\theta = 1/\lambda$ as the parameter of the distribution. This is sometimes used because the expectation and variance of the exponential distribution are
$$E(X) = \frac{1}{\lambda} = \theta, \qquad \operatorname{Var}(X) = \frac{1}{\lambda^2} = \theta^2.$$
* If $X \sim \operatorname{Exp}(\lambda)$, then, for all $x, t > 0$,
$$P(X > x + t \mid X > t) = \frac{P(\{X > x + t\} \cap \{X > t\})}{P(X > t)} = \frac{P(X > x + t)}{P(X > t)} = \frac{e^{-\lambda(x + t)}}{e^{-\lambda t}} = e^{-\lambda x} = P(X > x).$$

Thus, for all $x, t > 0$, $P(X > x + t \mid X > t) = P(X > x)$ — this is known as the Lack of Memory Property, and is unique to the exponential distribution amongst continuous distributions.

Interpretation: If we think of the exponential variable as the time to an event, then knowledge that we have already waited time $t$ for the event tells us nothing about how much longer we will have to wait — the process has no memory.

* Exponential random variables are often used to model the time until occurrence of a random event where there is an assumed constant risk ($\lambda$) of the event happening over time, and so are frequently used as a simplest model, for example, in reliability analysis. Examples include:
  - the time to failure of a component in a system;
  - the time until we find the next mistake on my slides;
  - the distance we travel along a road until we find the next pothole;
  - the time until the next job arrives at a database server.

Notice the duality between some of the exponential random variable examples and those we saw for the Poisson distribution. In each case, "number of events" has been replaced by "time between events".

Claim: If events in a random process occur according to a Poisson distribution with rate $\lambda$, then the time between events has an exponential distribution with rate parameter $\lambda$.

Proof. Suppose we have some random event process such that, $\forall x > 0$, the number of events occurring in $[0, x]$, $N_x$, follows a Poisson distribution with rate parameter $\lambda$, so $N_x \sim \operatorname{Poisson}(\lambda x)$. Such a process is known as a homogeneous Poisson process. Let $X$ be the time until the first event of this process arrives. Then we notice that
$$P(X > x) = P(N_x = 0) = \frac{(\lambda x)^0 e^{-\lambda x}}{0!} = e^{-\lambda x},$$
and hence $X \sim \operatorname{Exp}(\lambda)$. The same argument applies for all subsequent inter-arrival times. □

Figure 7.3: PDFs and CDFs for the exponential distribution with different rate parameters.
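Before moving on to the normal distribution, two short numerical sketches may help tie the above together; neither is part of the original notes, and both assume Python with NumPy/SciPy available. First, the worked density example $f_X(x) = cx^2$ on $(0, 2)$ can be checked by numerical integration:

```python
from scipy import integrate

# f(x) = c * x^2 on (0, 2): choose c so that the density integrates to 1.
total, _ = integrate.quad(lambda x: x**2, 0.0, 2.0)   # = 8/3
c = 1.0 / total                                       # = 3/8
print(f"c = {c:.4f}  (exact 3/8 = {3/8:.4f})")

# cdf: F(x) is the integral of the density from 0 up to x
# (the density is zero outside (0, 2)).
def F(x):
    upper = min(max(x, 0.0), 2.0)
    val, _ = integrate.quad(lambda u: c * u**2, 0.0, upper)
    return val

# P(1 < X < 2) = F(2) - F(1) = 1 - 1/8 = 7/8.
print(f"F(1) = {F(1.0):.4f}  (exact 1/8 = {1/8:.4f})")
print(f"P(1 < X < 2) = {F(2.0) - F(1.0):.4f}  (exact 7/8 = {7/8:.4f})")
```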
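Second, a simulation sketch for the exponential section just finished; the rate $\lambda = 0.5$ and the times $s, t$ are arbitrary choices of mine. It illustrates the moment formulas, the median quantile $Q_X(1/2) = \log(2)/\lambda$ implied by Definition 7.1.3, and the lack of memory property:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 0.5                                              # arbitrary rate parameter
x = rng.exponential(scale=1.0 / lam, size=1_000_000)   # NumPy uses scale = 1/lambda

# Moments and median: E(X) = 1/lam, Var(X) = 1/lam^2, Q_X(1/2) = log(2)/lam.
print(f"mean   {x.mean():.3f}   vs 1/lam      = {1 / lam:.3f}")
print(f"var    {x.var():.3f}   vs 1/lam^2    = {1 / lam**2:.3f}")
print(f"median {np.median(x):.3f}   vs log(2)/lam = {np.log(2) / lam:.3f}")

# Lack of memory: P(X > s + t | X > t) should match P(X > s).
s, t = 1.0, 2.0
print(f"P(X > s)           = {np.mean(x > s):.4f}")
print(f"P(X > s+t | X > t) = {np.mean(x[x > t] > s + t):.4f}")
```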
7.2.3 Normal Distribution

Suppose $X$ is a random variable taking values on $\mathbb{R}$ with pdf
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\},$$
for some $\mu \in \mathbb{R}$, $\sigma > 0$. Then $X$ is said to follow a Gaussian or normal distribution with mean $\mu$ and variance $\sigma^2$, and we write $X \sim N(\mu, \sigma^2)$.

Notes

* The cdf of $X \sim N(\mu, \sigma^2)$ is not analytically tractable for any $(\mu, \sigma)$, so we can only write
$$F_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{-\frac{(t - \mu)^2}{2\sigma^2}\right\} \mathrm{d}t.$$
* Special Case: If $\mu = 0$ and $\sigma^2 = 1$, then $X$ has a standard or unit normal distribution. The pdf of the standard normal distribution is written as $\phi(x)$ and simplifies to
$$\phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{x^2}{2}\right\}.$$
Also, the cdf of the standard normal distribution is written as $\Phi(x)$. Again, for the cdf, we can only write
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{-\frac{t^2}{2}\right\} \mathrm{d}t.$$
* If $X \sim N(0, 1)$ and $Y = \sigma X + \mu$, then $Y \sim N(\mu, \sigma^2)$. Re-expressing this result: if $X \sim N(\mu, \sigma^2)$ and $Y = (X - \mu)/\sigma$, then $Y \sim N(0, 1)$. This is an important result, as it allows us to write the cdf of any normal distribution in terms of $\Phi$: if $X \sim N(\mu, \sigma^2)$ and we set $Y = \frac{X - \mu}{\sigma}$, then, since $\sigma > 0$, we can first observe that for any $x \in \mathbb{R}$,
$$X \le x \iff \frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma} \iff Y \le \frac{x - \mu}{\sigma}.$$
Therefore we can write the cdf of $X$ in terms of $\Phi$:
$$F_X(x) = P(X \le x) = P\left(Y \le \frac{x - \mu}{\sigma}\right) = \Phi\left(\frac{x - \mu}{\sigma}\right).$$
* Since the cdf, and therefore any probabilities, associated with a normal distribution are not analytically available, numerical integration procedures are used to find approximate probabilities. In particular, statistical tables contain values of the standard normal cdf $\Phi(z)$ for a range of values $z \in \mathbb{R}$, and the quantiles $\Phi^{-1}(\alpha)$ for a range of values $\alpha \in (0, 1)$. Linear interpolation is used for approximation between the tabulated values. As seen in the point above, all normal distribution probabilities can be related back to probabilities from a standard normal distribution.
* Table 7.1 is an example of a statistical table for the standard normal distribution.

  z     Φ(z)   |  z     Φ(z)   |  z     Φ(z)   |  z      Φ(z)
  0.0   0.5    |  0.9   0.816  |  1.8   0.964  |  2.8    0.997
  0.1   0.540  |  1.0   0.841  |  1.9   0.971  |  3.0    0.998
  0.2   0.579  |  1.1   0.864  |  2.0   0.977  |  3.5    0.9998
  0.3   0.618  |  1.2   0.885  |  2.1   0.982  |  1.282  0.9
  0.4   0.655  |  1.3   0.903  |  2.2   0.986  |  1.645  0.95
  0.5   0.691  |  1.4   0.919  |  2.3   0.989  |  1.96   0.975
  0.6   0.726  |  1.5   0.933  |  2.4   0.992  |  2.326  0.99
  0.7   0.758  |  1.6   0.945  |  2.5   0.994  |  2.576  0.995
  0.8   0.788  |  1.7   0.955  |  2.6   0.995  |  3.09   0.999

  Table 7.1

Notice that $\Phi(z)$ has been tabulated for $z \ge 0$ only. This is because the standard normal pdf $\phi$ is symmetric about 0, that is, $\phi(-z) = \phi(z)$. For the cdf $\Phi$, this means
$$\Phi(-z) = 1 - \Phi(z).$$
So, for example, $\Phi(-1.2) = 1 - \Phi(1.2) \approx 1 - 0.885 = 0.115$. Similarly, if $Z \sim N(0, 1)$ and we want, for example, $P(Z > 1.5)$, then $P(Z > 1.5) = 1 - P(Z \le 1.5) = 1 - \Phi(1.5)\;(= \Phi(-1.5))$. So more generally we have
$$P(Z > z) = \Phi(-z).$$

We will often have cause to use the 97.5% and 99.5% quantiles of $N(0, 1)$, given by $\Phi^{-1}(0.975)$ and $\Phi^{-1}(0.995)$.

* $\Phi(1.96) = 0.975$. So with 95% probability an $N(0, 1)$ random variable will lie in $[-1.96, 1.96]$ ($\approx [-2, 2]$).
* $\Phi(2.576) = 0.995$. So with 99% probability an $N(0, 1)$ random variable will lie in $[-2.58, 2.58]$.

Figure 7.4: PDFs and CDFs of normal distributions with different means and variances.
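In practice, statistical software automates the table lookup. The sketch below is illustrative rather than part of the notes; it assumes SciPy, whose norm.cdf and norm.ppf play the roles of $\Phi$ and $\Phi^{-1}$, and the values $\mu = 100$, $\sigma = 15$, $x = 130$ are arbitrary:

```python
from scipy.stats import norm

mu, sigma, x = 100.0, 15.0, 130.0   # arbitrary example values

# Standardise, then evaluate the standard normal cdf Phi ...
z = (x - mu) / sigma
print(f"z = {z:.3f},  Phi(z) = {norm.cdf(z):.5f}")

# ... which agrees with evaluating the N(mu, sigma^2) cdf directly.
print(f"F_X(x) = {norm.cdf(x, loc=mu, scale=sigma):.5f}")

# Symmetry of the table: P(Z > z) = 1 - Phi(z) = Phi(-z).
print(f"Phi(-z) = {norm.cdf(-z):.5f},  1 - Phi(z) = {1 - norm.cdf(z):.5f}")

# Quantiles Phi^{-1}(alpha) match the final column of Table 7.1.
print(f"Phi^-1(0.975) = {norm.ppf(0.975):.3f}")   # ~ 1.96
print(f"Phi^-1(0.995) = {norm.ppf(0.995):.3f}")   # ~ 2.576
```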
Example

An analogue signal received at a detector (measured in microvolts) may be modelled as a Gaussian random variable, $X \sim N(200, 256)$.

Questions

Q1) What is the probability that the signal will exceed 240$\mu$V?
Q2) What is the probability that the signal is larger than 240$\mu$V, given that it is greater than 210$\mu$V?

Solutions

S1)
$$P(X > 240) = 1 - P(X \le 240) = 1 - \Phi\left(\frac{240 - 200}{16}\right) = 1 - \Phi(2.5) = 0.00621.$$

S2)
$$P(X > 240 \mid X > 210) = \frac{P(X > 240)}{P(X > 210)} = \frac{1 - \Phi(2.5)}{1 - \Phi\left(\frac{210 - 200}{16}\right)} = \frac{0.00621}{1 - \Phi(0.625)} \approx \frac{0.00621}{0.266} \approx 0.0233.$$

END OF LECTURE: 10/11/16

Let $X_1, X_2, \ldots, X_n$ be $n$ independent and identically distributed (i.i.d.) random variables from any probability distribution, each with mean $\mu$ and variance $\sigma^2$. From before we know that
$$E\left(\sum_{i=1}^{n} X_i\right) = n\mu, \qquad \operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = n\sigma^2.$$
First notice that
$$E\left(\sum_{i=1}^{n} X_i - n\mu\right) = 0, \qquad \operatorname{Var}\left(\sum_{i=1}^{n} X_i - n\mu\right) = n\sigma^2.$$
Dividing by $\sqrt{n\sigma^2} = \sigma\sqrt{n}$, we obtain
$$E\left(\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}\right) = 0, \qquad \operatorname{Var}\left(\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}\right) = 1.$$

Theorem 7.3 (Central Limit Theorem, or CLT).
$$\lim_{n \to \infty} P\left(\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \le x\right) = \Phi(x), \qquad \forall x \in \mathbb{R}.$$

So, finally, for large $n$ we have, approximately,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \qquad \text{or} \qquad \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2).$$

We note here that although all these approximate distributional results hold irrespective of the distribution of the $\{X_i\}$, in the special case where $X_i \sim N(\mu, \sigma^2)$ these distributional results are, in fact, exact. This is because the sum of independent normally distributed random variables is also normally distributed.

Example

Consider the simplest example: $X_1, X_2, \ldots$ are i.i.d. Bernoulli($p$) discrete random variables taking value 0 or 1. Then the $\{X_i\}$ have mean $\mu = p$ and variance $\sigma^2 = p(1 - p)$, and, by definition, we know that for any $n$ we have
$$\sum_{i=1}^{n} X_i \sim \operatorname{Binomial}(n, p),$$
which has mean $np$ and variance $np(1 - p)$. But now, by the Central Limit Theorem (CLT), we also have for large $n$ that, approximately,
$$\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2) = N(np, np(1 - p)).$$
So for large $n$,
$$\operatorname{Binomial}(n, p) \approx N(np, np(1 - p)).$$
Notice that the LHS is a discrete distribution, while the RHS is a continuous distribution.

[Figure: pmfs of Binomial(n, 0.5) for increasing values of n, overlaid with the pdfs of the approximating normal distributions; the quality of the approximation improves as n grows.]

Example

Suppose $X$ is the number of heads found on 1000 tosses of a fair coin, and we are interested in $P(X \le 490)$. Using the binomial distribution pmf, we would need to calculate
$$P(X \le 490) = f(0) + f(1) + f(2) + \cdots + f(490) \;(= 0.27).$$
However, using the CLT we have, approximately, $X \sim N(500, 250)$, and so
$$P(X \le 490) \approx \Phi\left(\frac{490 - 500}{\sqrt{250}}\right) = \Phi(-0.632) = 1 - \Phi(0.632) = 0.26.$$
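The coin-tossing numbers above are easy to reproduce. This sketch is mine rather than the notes' (it assumes SciPy); it compares the exact Binomial(1000, 0.5) tail sum with its CLT approximation:

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 1000, 0.5

# Exact binomial probability: P(X <= 490) = f(0) + f(1) + ... + f(490).
exact = binom.cdf(490, n, p)

# CLT approximation: X ~ N(np, np(1-p)) = N(500, 250), approximately.
approx = norm.cdf(490, loc=n * p, scale=np.sqrt(n * p * (1 - p)))

print(f"exact  P(X <= 490) = {exact:.4f}")    # ~ 0.27
print(f"normal approx      = {approx:.4f}")   # ~ 0.26
```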
Example

Suppose $X \sim N(\mu, \sigma^2)$, and consider the transformation $Y = g(X) = e^X$. Then $g^{-1}(y) = \log(y)$ and $\frac{\mathrm{d}}{\mathrm{d}y} g^{-1}(y) = \frac{1}{y}$. Then, by (7.1), we have
$$f_Y(y) = \frac{1}{y\sigma\sqrt{2\pi}} \exp\left\{-\frac{(\log y - \mu)^2}{2\sigma^2}\right\}, \qquad y > 0,$$
and we say $Y$ follows a log-normal distribution.
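As a closing sketch (again not part of the notes), the log-normal density derived via (7.1) can be checked against SciPy, whose lognorm distribution uses shape s = $\sigma$ and scale = $e^{\mu}$; the parameter values $\mu = 0.5$, $\sigma = 0.8$ and evaluation point $y = 1.7$ are arbitrary:

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.5, 0.8   # arbitrary parameters of the underlying normal
y = 1.7                # arbitrary evaluation point

# Density derived from the transformation formula (7.1).
f_manual = (1.0 / (y * sigma * np.sqrt(2.0 * np.pi))
            * np.exp(-(np.log(y) - mu) ** 2 / (2.0 * sigma ** 2)))

# SciPy's log-normal with matching parameterisation.
f_scipy = lognorm.pdf(y, s=sigma, scale=np.exp(mu))

print(f"manual f_Y(y) = {f_manual:.6f}")
print(f"scipy  f_Y(y) = {f_scipy:.6f}")

# Simulation check: exponentiate normal draws, estimate the density near y.
rng = np.random.default_rng(2)
draws = np.exp(rng.normal(mu, sigma, size=1_000_000))
h = 0.05
print(f"empirical     = {np.mean(np.abs(draws - y) < h) / (2.0 * h):.6f}")
```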
