PPCh04 PDF

UCLA STAT 110 A
Applied Probability & Statistics for Engineers
Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
Teaching Assistant: Neda Farzinnia, UCLA Statistics
University of California, Los Angeles, Spring 2004
http://www.stat.ucla.edu/~dinov/
Stat 110A, UCLA, Ivo Dinov
Slide 1
Chapter 4
Continuous
Random Variables
and Probability
Distributions
Slide 2
4.1
Continuous Random Variables and Probability Distributions
Slide 3
Continuous Random Variables
A random variable X is continuous if its set of possible values is an entire interval of numbers (if A < B, then any number x between A and B is possible).
Slide 4
Probability Distribution
Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
The graph of f is the density curve.
Slide 5
Probability Density Function
For f(x) to be a pdf:
1. $f(x) \ge 0$ for all values of x.
2. The area of the region between the graph of f and the x axis is equal to 1.
[Figure: density curve y = f(x); the area under the curve equals 1]
Slide 6
Probability Density Function
$P(a \le X \le b)$ is given by the area of the shaded region.
[Figure: density curve y = f(x) with the region between a and b shaded]
Slide 7
Continuous RVs
• A RV is continuous if it can take on any real value in a non-trivial interval (a, b).
• The PDF (probability density function) of a continuous RV, Y, is a non-negative function $p_Y(y)$, defined for any real value y, such that for each interval (a, b), the probability that Y takes on a value in (a, b), P(a<Y<b), equals the area under $p_Y(y)$ over the interval (a, b).
[Figure: graph of $p_Y(y)$ with the area P(a<Y<b) shaded between a and b]
Slide 8
Convergence of density histograms to the PDF
• For a continuous RV the density histograms converge to the PDF as the size of the bins goes to zero.
• AdditionalInstructorAids\BirthdayDistribution_1978_systat.SYD
Slides 9-10
Uniform Distribution
A continuous rv X is said to have a uniform distribution on the interval [A, B] if the pdf of X is
$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$
Slide 11
Probability for a Continuous rv
If X is a continuous rv, then for any number c, $P(X = c) = 0$. For any two numbers a and b with a < b,
$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$$
Slide 12
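The uniform pdf and the fact that endpoints carry no probability can be checked with a few lines of code. This is only an illustrative sketch: the function names `uniform_pdf` and `uniform_prob` and the interval [0, 10] are assumptions, not part of the slides.

```python
def uniform_pdf(x, A, B):
    """pdf of the uniform distribution on [A, B]: 1/(B - A) inside, 0 outside."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


def uniform_prob(a, b, A, B):
    """P(a <= X <= b): area under the pdf, i.e. overlap length / (B - A)."""
    lo, hi = max(a, A), min(b, B)
    return max(hi - lo, 0.0) / (B - A)


A, B = 0.0, 10.0                       # hypothetical interval
print(uniform_pdf(3.0, A, B))          # density at a point inside [A, B]
print(uniform_prob(2.0, 5.0, A, B))    # P(2 <= X <= 5)
print(uniform_prob(4.0, 4.0, A, B))    # P(X = 4): a zero-width interval
```

Because a single point has zero width, including or excluding the endpoints a and b does not change the computed area.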
4.2
Cumulative Distribution Functions and Expected Values
Slide 13
The Cumulative Distribution Function
The cumulative distribution function F(x) for a continuous rv X is defined for every number x by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$
For each x, F(x) is the area under the density curve to the left of x.
Slide 14
Using F(x) to Compute Probabilities
Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a,
$$P(X > a) = 1 - F(a)$$
and for any numbers a and b with a < b,
$$P(a \le X \le b) = F(b) - F(a)$$
Slide 15
Obtaining f(x) from F(x)
If X is a continuous rv with pdf f(x) and cdf F(x), then at every number x for which the derivative $F'(x)$ exists, $F'(x) = f(x)$.
Slide 16
Percentiles
Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted by $\eta(p)$, is defined by
$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy$$
Slide 17
Median
The median of a continuous distribution, denoted by $\tilde{\mu}$, is the 50th percentile. So $\tilde{\mu}$ satisfies $0.5 = F(\tilde{\mu})$. That is, half the area under the density curve is to the left of $\tilde{\mu}$.
Slide 18
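The cdf, interval probabilities, and percentiles can be approximated numerically for any pdf. The sketch below assumes a hypothetical pdf f(x) = 2x on [0, 1] (chosen only because its cdf, F(x) = x², is easy to check by hand); the helper names `cdf` and `percentile` are also assumptions.

```python
def f(x):
    """Hypothetical pdf used only for illustration: f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def cdf(x, n=2_000):
    """F(x): area under f to the left of x, by the midpoint rule on [0, x]."""
    if x <= 0.0:
        return 0.0
    x = min(x, 1.0)                      # f is zero beyond 1
    h = x / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


def percentile(p, tol=1e-6):
    """eta(p): solve F(eta) = p by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(cdf(0.75) - cdf(0.25))   # P(0.25 <= X <= 0.75) = F(0.75) - F(0.25)
print(1.0 - cdf(0.5))          # P(X > 0.5) = 1 - F(0.5)
print(percentile(0.5))         # median: the value with F(median) = 0.5
```

Bisection works here because F is non-decreasing, so F(η) = p has a unique solution wherever f is positive.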
Expected Value
The expected or mean value of a continuous rv X with pdf f(x) is
$$\mu_X = E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx$$
Slide 19
Expected Value of h(X)
If X is a continuous rv with pdf f(x) and h(X) is any function of X, then
$$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x)\, f(x)\,dx$$
Slide 20
Variance and Standard Deviation
The variance of a continuous rv X with pdf f(x) and mean $\mu$ is
$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E[(X - \mu)^2]$$
The standard deviation is $\sigma_X = \sqrt{V(X)}$.
Slide 21
Short-cut Formula for Variance
$$V(X) = E(X^2) - [E(X)]^2$$
Slide 22
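The defining integral for the variance and the short-cut formula should give the same number. A minimal numerical sketch, assuming a hypothetical pdf f(x) = 2x on [0, 1] and a simple midpoint-rule helper named `integrate` (both are illustrative choices, not from the slides):

```python
def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h


def f(x):
    """Hypothetical pdf for illustration: f(x) = 2x on [0, 1]."""
    return 2.0 * x


mean = integrate(lambda x: x * f(x), 0.0, 1.0)                    # E(X)
var_def = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, 1.0)   # E[(X - mu)^2]
ex2 = integrate(lambda x: x ** 2 * f(x), 0.0, 1.0)                # E(X^2), h(x) = x^2
var_short = ex2 - mean ** 2                                       # short-cut formula
sd = var_def ** 0.5                                               # standard deviation

print(mean, var_def, var_short, sd)
```

For this pdf the exact values are E(X) = 2/3 and V(X) = 1/18, so both variance computations should agree with 1/18 up to integration error.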
4.3
The Normal Distribution
Slide 23
Normal Distributions
A continuous rv X is said to have a normal distribution with parameters $\mu$ and $\sigma$, where $-\infty < \mu < \infty$ and $0 < \sigma$, if the pdf of X is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty$$
Slide 24
Standard Normal Distributions
The normal distribution with parameter values $\mu = 0$ and $\sigma = 1$ is called a standard normal distribution. The random variable is denoted by Z. The pdf is
$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty$$
The cdf is
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy$$
Slide 25
Standard Normal Cumulative Areas
[Figure: standard normal curve; shaded area = $\Phi(z)$]
Slide 26
Standard Normal Distribution
Let Z be the standard normal variable. Find (from table)
a. $P(Z \le 0.85)$
Area to the left of 0.85 = 0.8023
b. $P(Z > 1.32)$
$1 - P(Z \le 1.32) = 0.0934$
Slide 27
c. $P(-2.1 \le Z \le 1.78)$
Find the area to the left of 1.78, then subtract the area to the left of -2.1.
$= P(Z \le 1.78) - P(Z \le -2.1)$
$= 0.9625 - 0.0179$
$= 0.9446$
Slide 28
$z_\alpha$ Notation
$z_\alpha$ will denote the value on the measurement axis for which the area $\alpha$ under the z curve lies to the right of $z_\alpha$.
[Figure: z curve with shaded area $= P(Z \ge z_\alpha) = \alpha$ to the right of $z_\alpha$]
Slide 29
Ex. Let Z be the standard normal variable. Find z if
a. $P(Z < z) = 0.9278$.
Look at the table and find an entry = 0.9278, then read back to find z = 1.46.
b. $P(-z < Z < z) = 0.8132$
$P(-z < Z < z) = 2P(0 < Z < z)$
$= 2[P(Z < z) - 1/2]$
$= 2P(Z < z) - 1 = 0.8132$
$P(Z < z) = 0.9066$
z = 1.32
Slide 30
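The table lookups in the standard normal examples can be reproduced in software. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+) as a stand-in for the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # standard normal

p_a = Z.cdf(0.85)                   # a. P(Z <= 0.85)
p_b = 1.0 - Z.cdf(1.32)             # b. P(Z > 1.32)
p_c = Z.cdf(1.78) - Z.cdf(-2.1)     # c. P(-2.1 <= Z <= 1.78)

z_a = Z.inv_cdf(0.9278)             # z with P(Z < z) = 0.9278
# P(-z < Z < z) = 0.8132  =>  P(Z < z) = (1 + 0.8132) / 2 = 0.9066
z_b = Z.inv_cdf((1.0 + 0.8132) / 2.0)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
print(round(z_a, 2), round(z_b, 2))
```

`inv_cdf` plays the role of reading the table backwards: it returns the z whose left-tail area equals the given probability.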
Nonstandard Normal Distributions
If X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, then
$$Z = \frac{X - \mu}{\sigma}$$
has a standard normal distribution.
Slide 31
Normal Curve
Approximate percentage of area within given standard deviations (empirical rule): 68% within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
Slide 32
Ex. Let X be a normal random variable with $\mu = 80$ and $\sigma = 20$. Find $P(X \le 65)$.
$$P(X \le 65) = P\left(Z \le \frac{65 - 80}{20}\right) = P(Z \le -0.75) = 0.2266$$
Slide 33
Ex. A particular rash has shown up at an elementary school. It has been determined that the length of time that the rash will last is normally distributed with $\mu = 6$ days and $\sigma = 1.5$ days. Find the probability that for a student selected at random, the rash will last for between 3.75 and 9 days.
Slide 34
$$P(3.75 \le X \le 9) = P\left(\frac{3.75 - 6}{1.5} \le Z \le \frac{9 - 6}{1.5}\right) = P(-1.5 \le Z \le 2) = 0.9772 - 0.0668 = 0.9104$$
Slide 35
Percentiles of an Arbitrary Normal Distribution
$$(100p)\text{th percentile for normal}(\mu, \sigma) = \mu + \left[(100p)\text{th percentile for standard normal}\right]\cdot\sigma$$
Slide 36
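Standardizing and the percentile rule for an arbitrary normal distribution translate directly into code. A minimal sketch using `statistics.NormalDist`; the helper names `normal_prob` and `normal_percentile` and the choice of the 90th percentile are assumptions made for illustration.

```python
from statistics import NormalDist


def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma), via Z = (X - mu) / sigma."""
    Z = NormalDist()
    return Z.cdf((b - mu) / sigma) - Z.cdf((a - mu) / sigma)


def normal_percentile(p, mu, sigma):
    """(100p)th percentile: mu + (standard normal percentile) * sigma."""
    return mu + NormalDist().inv_cdf(p) * sigma


p_exam = NormalDist().cdf((65 - 80) / 20)       # P(X <= 65) with mu = 80, sigma = 20
p_rash = normal_prob(3.75, 9.0, 6.0, 1.5)       # rash lasts between 3.75 and 9 days
p90_rash = normal_percentile(0.90, 6.0, 1.5)    # hypothetical: 90th percentile of duration

print(round(p_exam, 4), round(p_rash, 4), round(p90_rash, 3))
```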
Normal Approximation to the Binomial Distribution
Let X be a binomial rv based on n trials, each with probability of success p. If the binomial probability histogram is not too skewed, X may be approximated by a normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$.
$$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right)$$
Slide 37
Ex. At a particular small college the pass rate of Intermediate Algebra is 72%. If 500 students enroll in a semester, determine the probability that at most 375 students pass.
$\mu = np = 500(.72) = 360$
$\sigma = \sqrt{npq} = \sqrt{500(.72)(.28)} \approx 10$
$$P(X \le 375) \approx \Phi\left(\frac{375.5 - 360}{10}\right) = \Phi(1.55) = 0.9394$$
Slide 38
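The continuity-corrected approximation can be compared against the exact binomial sum. The sketch below uses the numbers n = 500, p = 0.72, x = 375 from the example; the function names are assumptions, and the exact cdf is summed in log space only to avoid very large intermediate values. Note that the code uses the unrounded σ = √100.8, so its result differs slightly from a hand calculation with σ ≈ 10.

```python
import math
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p), summing the pmf in log space."""
    total = 0.0
    for k in range(0, x + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1.0 - p))
        total += math.exp(log_pmf)
    return total


def normal_approx_cdf(x, n, p):
    """P(X <= x) ~ Phi((x + 0.5 - np) / sqrt(npq)), with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return NormalDist().cdf((x + 0.5 - mu) / sigma)


n, p, x = 500, 0.72, 375
approx = normal_approx_cdf(x, n, p)
exact = binom_cdf(x, n, p)
print(round(approx, 4), round(exact, 4))
```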
Normal approximation to Binomial
• Suppose Y~Binomial(n, p)
• Then Y = Y1 + Y2 + Y3 + … + Yn, where
  Yk~Bernoulli(p), E(Yk) = p & Var(Yk) = p(1-p)
  E(Y) = np & Var(Y) = np(1-p), SD(Y) = (np(1-p))^{1/2}
  Standardize Y: Z = (Y - np) / (np(1-p))^{1/2}
  By CLT Z ~ N(0, 1), approximately. So, Y ~ N[np, (np(1-p))^{1/2}].
• Normal Approx to Binomial is reasonable when np >=10 & n(1-p)>10 (p & (1-p) are NOT too small relative to n).
Slide 39
Normal approximation to Binomial Example
• Roulette wheel investigation: roulette has 38 slots (18 red, 18 black, 2 neutral).
• Compute P(Y>=58), where Y~Binomial(100, 0.47): the proportion of the Binomial(100, 0.47) population having 58 or more reds (successes) out of 100 roulette spins (trials).
• Since np = 47 >= 10 & n(1-p) = 53 > 10, Normal approx is justified.
• Z = (Y - np)/Sqrt(np(1-p)) = (58 - 100*0.47)/Sqrt(100*0.47*0.53) = 2.2
• P(Y>=58) ≈ P(Z>=2.2) = 0.0139
• True P(Y>=58) = 0.0177, using SOCR (demo!)
• Normal approx to Binomial useful when no access to SOCR avail.
Slide 40
Normal approximation to Poisson
• Let X1~Poisson(λ) & X2~Poisson(μ), independent → X1 + X2 ~ Poisson(λ+μ)
• Let X1, X2, X3, …, Xk ~ Poisson(λ), and independent,
• Yk = X1 + X2 + … + Xk ~ Poisson(kλ), E(Yk) = Var(Yk) = kλ.
• The random variables in the sum on the right are independent and each has the Poisson distribution with parameter λ.
• By CLT the distribution of the standardized variable (Yk - kλ) / (kλ)^{1/2} → N(0, 1), as k increases to infinity.
• So, for kλ >= 100, Zk = {(Yk - kλ) / (kλ)^{1/2}} ~ N(0,1).
• Yk ~ N(kλ, (kλ)^{1/2}).
Slide 41
Normal approximation to Poisson example
• Let X1~Poisson(λ) & X2~Poisson(μ), independent → X1 + X2 ~ Poisson(λ+μ)
• Let X1, X2, X3, …, X200 ~ Poisson(2), and independent,
• Yk = X1 + X2 + … + Xk ~ Poisson(400), E(Yk) = Var(Yk) = 400.
• By CLT the distribution of the standardized variable (Yk - 400) / (400)^{1/2} → N(0, 1), as k increases to infinity.
• Zk = (Yk - 400) / 20 ~ N(0,1) → Yk ~ N(400, 20), with the second parameter the SD as above.
• P(2 < Yk < 400) = (stdz 2 & 400) = P((2-400)/20 < Zk < (400-400)/20) = P(-19.9 < Zk < 0) = 0.5
Slide 42
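Both CLT-based approximations can be checked against exact sums when software is available. The following sketch recomputes the roulette probability P(Y >= 58) for Y ~ Binomial(100, 0.47), with and without a continuity correction, and the Poisson(400) probability P(2 < Y < 400). The variable names are illustrative, and the continuity-corrected variant is an addition for comparison, not something the roulette slide itself uses.

```python
import math
from statistics import NormalDist

Z = NormalDist()

# Roulette: Y ~ Binomial(100, 0.47), P(Y >= 58)
n, p = 100, 0.47
mu, sd = n * p, math.sqrt(n * p * (1 - p))
exact_binom = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(58, n + 1))
approx_plain = 1.0 - Z.cdf((58 - mu) / sd)      # no continuity correction
approx_cc = 1.0 - Z.cdf((57.5 - mu) / sd)       # with continuity correction (assumed variant)

# Poisson: Y ~ Poisson(400), P(2 < Y < 400) = P(3 <= Y <= 399)
lam = 400.0
exact_pois = sum(math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
                 for k in range(3, 400))
approx_pois = Z.cdf((400 - lam) / math.sqrt(lam)) - Z.cdf((2 - lam) / math.sqrt(lam))

print(round(exact_binom, 4), round(approx_plain, 4), round(approx_cc, 4))
print(round(exact_pois, 4), round(approx_pois, 4))
```

The gap between the uncorrected normal value and the exact binomial tail shows why a continuity correction is usually recommended when approximating a discrete distribution.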
Poisson or Normal approximation to Binomial?
• Poisson Approximation (Binomial(n, p_n) → Poisson(λ)):
  WHY?
$$\binom{n}{y} p_n^{y} (1 - p_n)^{n-y} \longrightarrow \frac{\lambda^{y} e^{-\lambda}}{y!} \quad \text{as } n \to \infty,\ n\,p_n \to \lambda$$
  n>=100 & p<=0.01 & λ = n p <= 20
• Normal Approximation (Binomial(n, p) → N(np, (np(1-p))^{1/2}))
  np >=10 & n(1-p)>10
Slide 43
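4.4
The Gamma Distribution and Its Relatives
Slide 44
The Gamma Function
For $\alpha > 0$, the gamma function $\Gamma(\alpha)$ is defined by
$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$$
Slide 45
Gamma Distribution
A continuous rv X has a gamma distribution if the pdf is
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^\alpha\, \Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
where the parameters satisfy $\alpha > 0$, $\beta > 0$. The standard gamma distribution has $\beta = 1$.
Slide 46
Mean and Variance
The mean and variance of a random variable X having the gamma distribution $f(x; \alpha, \beta)$ are
$$E(X) = \mu = \alpha\beta \qquad V(X) = \sigma^2 = \alpha\beta^2$$
Slide 47
Probabilities from the Gamma Distribution
Let X have a gamma distribution with parameters $\alpha$ and $\beta$. Then for any x > 0, the cdf of X is given by
$$P(X \le x) = F(x; \alpha, \beta) = F\left(\frac{x}{\beta}; \alpha\right)$$
where
$$F(x; \alpha) = \int_0^x \frac{y^{\alpha-1} e^{-y}}{\Gamma(\alpha)}\,dy$$
Slide 48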
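The reduction of a gamma probability to the standard gamma cdf F(x/β; α) can be sketched numerically. The code below approximates the incomplete gamma integral with a midpoint rule (an illustrative choice, adequate only for α >= 1 where the integrand is bounded); the function names and the parameter values α = 2, β = 3 are assumptions.

```python
import math


def std_gamma_cdf(x, alpha, n=20_000):
    """F(x; alpha): integral of y^(alpha-1) e^(-y) / Gamma(alpha) over [0, x].

    Midpoint rule; assumes alpha >= 1 so the integrand is bounded near 0.
    """
    if x <= 0.0:
        return 0.0
    h = x / n
    total = sum(((i + 0.5) * h) ** (alpha - 1.0) * math.exp(-(i + 0.5) * h)
                for i in range(n))
    return total * h / math.gamma(alpha)


def gamma_cdf(x, alpha, beta):
    """P(X <= x) = F(x / beta; alpha) for X ~ gamma(alpha, beta)."""
    return std_gamma_cdf(x / beta, alpha)


alpha, beta = 2.0, 3.0                        # hypothetical parameter values
mean, var = alpha * beta, alpha * beta ** 2   # E(X) and V(X)
p = gamma_cdf(6.0, alpha, beta)               # P(X <= 6)

print(math.gamma(5.0))                        # Gamma(5) = 4!
print(mean, var, p)
```

For α = 2 the standard gamma cdf has the closed form 1 - e^{-x}(1 + x), which gives a hand check on the numerical result.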
Exponential Distribution
A continuous rv X has an exponential distribution with parameter $\lambda$ if the pdf is
$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
Slide 49
Mean and Variance
The mean and variance of a random variable X having the exponential distribution are
$$\mu = \alpha\beta = \frac{1}{\lambda} \qquad \sigma^2 = \alpha\beta^2 = \frac{1}{\lambda^2}$$
Slide 50
Probabilities from the Gamma Distribution
Let X have an exponential distribution. Then the cdf of X is given by
$$F(x; \lambda) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases}$$
Slide 51
Applications of the Exponential Distribution
Suppose that the number of events occurring in any time interval of length t has a Poisson distribution with parameter $\alpha t$ and that the numbers of occurrences in nonoverlapping intervals are independent of one another. Then the distribution of elapsed time between the occurrences of two successive events is exponential with parameter $\lambda = \alpha$.
Slide 52
The Chi-Squared Distribution
Let $\nu$ be a positive integer. Then a random variable X is said to have a chi-squared distribution with parameter $\nu$ if the pdf of X is the gamma density with $\alpha = \nu/2$ and $\beta = 2$. The pdf is
$$f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\, \Gamma(\nu/2)}\, x^{(\nu/2)-1} e^{-x/2} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Slide 53
The Chi-Squared Distribution
The parameter $\nu$ is called the number of degrees of freedom (df) of X. The symbol $\chi^2$ is often used in place of "chi-squared."
Slide 54
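The exponential cdf and the chi-squared pdf are both short formulas, so a small sketch can confirm how they relate. The rate λ = 0.5 and the function names are assumptions for illustration; the last line checks that the chi-squared density with 2 degrees of freedom coincides with the exponential density with λ = 1/2, which follows from substituting α = 1, β = 2 into the gamma pdf.

```python
import math


def expon_pdf(x, lam):
    """Exponential pdf: lam * exp(-lam * x) for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0


def expon_cdf(x, lam):
    """Exponential cdf: 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0.0 else 0.0


def chi2_pdf(x, v):
    """Chi-squared pdf: gamma density with alpha = v/2, beta = 2.

    Returns 0 at x <= 0 for simplicity (a single point carries no probability).
    """
    if x <= 0.0:
        return 0.0
    return x ** (v / 2.0 - 1.0) * math.exp(-x / 2.0) / (2.0 ** (v / 2.0) * math.gamma(v / 2.0))


lam = 0.5                                            # hypothetical rate (events per unit time)
p_wait = expon_cdf(3.0, lam) - expon_cdf(1.0, lam)   # P(1 <= X <= 3)
mean, var = 1.0 / lam, 1.0 / lam ** 2

print(p_wait, mean, var)
print(chi2_pdf(1.0, 2), expon_pdf(1.0, 0.5))         # equal: chi-squared(2) is exponential(1/2)
```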
Identifying Common Distributions QQ plots
• Quantile-Quantile plots indicate how well the model distribution agrees with the data.
• q-th quantile, for 0<q<1, is the (data-space) value, Vq, at or below which lies a proportion q of the data.
[Figure: graph of the CDF, FY(y), with FY(Vq) = P(Y<=Vq) = q marked]
Slide 55
Constructing QQ plots
• Start off with data {y1, y2, y3, …, yn}
• Order statistics y(1) <= y(2) <= y(3) <= … <= y(n)
• Compute quantile rank, q(k), for each observation, y(k),
  P(Y<= q(k)) = (k-0.375) / (n+0.250),
  where Y is a RV from the (target) model distribution.
• Finally, plot the points (y(k), q(k)) in 2D plane, 1<=k<=n.
• Note: Different statistical packages use slightly different formulas for the computation of q(k). However, the results are quite similar. This is the formula employed in SAS.
• Basic idea: Probability that:
  P((model)Y<=(data)y(1)) ~ 1/n; P(Y<=y(2)) ~ 2/n; P(Y<=y(3)) ~ 3/n; …
Slide 56
Example - Constructing QQ plots
• Start off with data {y1, y2, y3, …, yn}.
• Plot the points (y(k), q(k)) in 2D plane, 1<=k<=n.
[Figure: normal probability plot; vertical axis "Expected Value for Normal Distribution", ticks from -3 to 3]
C:\Ivo.dir\UCLA_Classes\Winter2002\AdditionalInstructorAids
BirthdayDistribution_1978_systat.SYD
SYSTAT, Graph Probability Plot, Var4, Normal Distribution
Slide 57
4.5
Other Continuous Distributions
Slide 58
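The QQ-plot construction steps (sort, compute quantile ranks, pair them up) can be sketched in a few lines. This assumes a standard normal as the target model distribution and uses the (k - 0.375)/(n + 0.25) formula from the slide; the function name `qq_points` and the sample values are hypothetical.

```python
from statistics import NormalDist


def qq_points(data):
    """Pairs (y(k), q(k)) with P(Y <= q(k)) = (k - 0.375) / (n + 0.25), Y ~ N(0, 1)."""
    ys = sorted(data)                       # order statistics y(1) <= ... <= y(n)
    n = len(ys)
    model = NormalDist()                    # target model: standard normal (assumed)
    return [(y, model.inv_cdf((k - 0.375) / (n + 0.25)))
            for k, y in enumerate(ys, start=1)]


sample = [4.1, 5.3, 2.8, 6.0, 4.7, 3.9, 5.1]     # hypothetical data
for y, q in qq_points(sample):
    print(f"{y:5.1f}  {q:7.3f}")
```

If the data are roughly normal, the printed pairs should lie close to a straight line when plotted; a plotting library could be used for that step but is omitted here to keep the sketch dependency-free.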
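The Weibull Distribution
A continuous rv X has a Weibull distribution if the pdf is
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^\alpha}\, x^{\alpha-1} e^{-(x/\beta)^\alpha} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
where the parameters satisfy $\alpha > 0$, $\beta > 0$.
Slide 59
Mean and Variance
The mean and variance of a random variable X having the Weibull distribution are
$$\mu = \beta\,\Gamma\left(1 + \frac{1}{\alpha}\right) \qquad \sigma^2 = \beta^2\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}$$
Slide 60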
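Weibull Distribution
The cdf of a Weibull rv having parameters $\alpha$ and $\beta$ is
$$F(x; \alpha, \beta) = \begin{cases} 1 - e^{-(x/\beta)^\alpha} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Slide 61
Lognormal Distribution
A nonnegative rv X has a lognormal distribution if the rv Y = ln(X) has a normal distribution. The resulting pdf has parameters $\mu$ and $\sigma$ and is
$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-[\ln(x) - \mu]^2/(2\sigma^2)} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Slide 62
Mean and Variance
The mean and variance of a variable X having the lognormal distribution are
$$E(X) = e^{\mu + \sigma^2/2} \qquad V(X) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$$
Slide 63
Lognormal Distribution
The cdf of the lognormal distribution is given by
$$F(x; \mu, \sigma) = P(X \le x) = P[\ln(X) \le \ln(x)] = P\left(Z \le \frac{\ln(x) - \mu}{\sigma}\right) = \Phi\left(\frac{\ln(x) - \mu}{\sigma}\right), \quad x \ge 0$$
Slide 64
Beta Distribution
A rv X is said to have a beta distribution with parameters A, B, $\alpha > 0$, and $\beta > 0$ if the pdf of X is
$$f(x; \alpha, \beta, A, B) = \begin{cases} \dfrac{1}{B - A}\cdot\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \left(\dfrac{x - A}{B - A}\right)^{\alpha-1} \left(\dfrac{B - x}{B - A}\right)^{\beta-1} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$
Slide 65
Mean and Variance
The mean and variance of a variable X having the beta distribution are
$$\mu = A + (B - A)\cdot\frac{\alpha}{\alpha+\beta} \qquad \sigma^2 = \frac{(B - A)^2\,\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$$
Slide 66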
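The Weibull and lognormal cdfs and moment formulas are closed-form, so they can be evaluated directly. A minimal sketch follows; the function names are assumptions, and `math.gamma` and `statistics.NormalDist` stand in for the gamma function Γ and the standard normal cdf Φ.

```python
import math
from statistics import NormalDist


def weibull_cdf(x, alpha, beta):
    """F(x; alpha, beta) = 1 - exp(-(x/beta)^alpha) for x >= 0, else 0."""
    return 1.0 - math.exp(-((x / beta) ** alpha)) if x >= 0.0 else 0.0


def weibull_mean_var(alpha, beta):
    """Mean and variance of a Weibull rv via the gamma function."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    return beta * g1, beta ** 2 * (g2 - g1 ** 2)


def lognormal_cdf(x, mu, sigma):
    """F(x; mu, sigma) = Phi((ln x - mu) / sigma) for x > 0, else 0."""
    return NormalDist().cdf((math.log(x) - mu) / sigma) if x > 0.0 else 0.0


def lognormal_mean_var(mu, sigma):
    """Mean and variance of a lognormal rv with parameters mu, sigma."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    var = math.exp(2.0 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1.0)
    return mean, var


# Hypothetical parameter values, chosen so the results are easy to check by hand
print(weibull_cdf(2.0, 1.0, 2.0), weibull_mean_var(1.0, 2.0))
print(lognormal_cdf(1.0, 0.0, 1.0), lognormal_mean_var(0.0, 1.0))
```

With α = 1 the Weibull reduces to an exponential distribution with mean β, which is why the first line gives mean 2 and variance 4.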
4.6
Probability Plots
Slide 67
Sample Percentile
Order the n-sample observations from smallest to largest. The ith smallest observation in the list is taken to be the [100(i - 0.5)/n]th sample percentile.
Slide 68
Probability Plot
A plot of the pairs
([100(i - 0.5)/n]th percentile of the distribution, ith smallest sample observation)
If the sample percentiles are close to the corresponding population distribution percentiles, the first number will roughly equal the second.
Slide 69
Normal Probability Plot
A plot of the pairs
([100(i - 0.5)/n]th z percentile, ith smallest observation)
on a two-dimensional coordinate system is called a normal probability plot. If the sample is drawn from a normal distribution, the points should fall close to a line with slope $\sigma$ and intercept $\mu$.
Slide 70
Beyond Normality
Consider a family of probability distributions involving two parameters $\theta_1$ and $\theta_2$. Let $F(x; \theta_1, \theta_2)$ denote the corresponding cdfs. The parameters $\theta_1$ and $\theta_2$ are said to be location and scale parameters if $F(x; \theta_1, \theta_2)$ is a function of $\dfrac{x - \theta_1}{\theta_2}$.
Slide 71
Relation among Distributions
[Diagram: relations among the distributions]
• Normal(X) (μ, σ²) → Normal(Z) (0, 1): Z = (X - μ)/σ
• Normal(Z) → Chi-square(df = n): χ² = Σ_{i=1}^{n} Z_i²
• Normal(X) → Lognormal(Y): Y = e^X; Lognormal(Y) → Normal(X): X = ln Y
• Gamma → Chi-square: α = n/2, β = 2
• Uniform(X) (α, β) → Uniform(U) (0, 1): U = (X - α)/(β - α); Uniform(U) → Uniform(X): X = (β - α)U + α
• Beta → Uniform(U): α = β = 1
• Uniform(U) → Exponential(X): X = -ln U
• T(df = n) → Cauchy(0, 1): df = 1
• Gamma (α = 1), Chi-square (n = 2), Weibull (α = 1) → Exponential(X)
Slide 72
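Two of the relations among distributions, X = -ln U for the exponential and the sum of squared standard normals for the chi-square, can be illustrated by simulation. The sketch below is only a rough empirical check: the seed, the sample size, and the choice of df = 3 are arbitrary assumptions, and the results vary with the random draws.

```python
import math
import random

rng = random.Random(0)                      # fixed seed (arbitrary choice)
n = 50_000

# X = -ln U with U ~ Uniform(0, 1); 1 - random() lies in (0, 1], so log is defined
xs = [-math.log(1.0 - rng.random()) for _ in range(n)]
mean_x = sum(xs) / n                        # compare with the exponential mean 1/lambda = 1
frac_le_1 = sum(x <= 1.0 for x in xs) / n   # compare with F(1) = 1 - e^{-1}

# Chi-square with df = 3 as a sum of three squared standard normals
df = 3
chis = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n)]
mean_chi = sum(chis) / n                    # gamma mean: alpha * beta = (df/2) * 2 = df

print(mean_x, frac_le_1, 1.0 - math.exp(-1.0))
print(mean_chi)
```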