
Chapter 2

Random Variables and Probability Distributions
2.1 The Concept of a Random Variable
 Introduction
In the first chapter, we covered basic probability concepts and the computation of the probability of an event.

In many cases, however, we are not interested in knowing which of the outcomes has occurred. Rather, we are interested in the numbers associated with the outcomes of an experiment. Thus, we associate a real number with each outcome.
 In other words, we are considering a function whose domain is the set of all possible outcomes (the sample space, S) and whose range is a subset of the real numbers.

 Such a function is known as a random variable (rv). This implies that a random variable is not a variable but a function that maps the elements of S into the real numbers.
Definition: Let X represent a function that associates a real number with each and every elementary event in a sample space, S. Then X is known as a random variable. Alternatively, a random variable is a function from the sample space S to the real numbers.

Example: Consider an experiment of tossing two coins. Let the random variable X denote the number of heads, and let the random variable Y denote the number of tails.

Then S = {HH, HT, TH, TT}, and X(x) = 2 if x = HH, X(x) = 1 if x = HT or TH, X(x) = 0 if x = TT. Similarly, Y(y) = 2 if y = TT, Y(y) = 1 if y = HT or TH, Y(y) = 0 if y = HH.
Possible outcomes (domain)   X = x (range)   Y   X + Y
HH                           2               0   2
HT                           1               1   2
TH                           1               1   2
TT                           0               2   2

We say that the space of the rv X = {0, 1, 2}.
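The table above can be reproduced with a short script (a sketch in Python; the variable names are ours):

```python
from itertools import product

# Sample space for tossing two coins
S = ["".join(t) for t in product("HT", repeat=2)]  # ['HH', 'HT', 'TH', 'TT']

# X = number of heads, Y = number of tails, as functions on S
X = {s: s.count("H") for s in S}
Y = {s: s.count("T") for s in S}

space_of_X = sorted(set(X.values()))  # the space of the rv X: [0, 1, 2]
```

Note that X + Y equals 2 on every sample point, matching the last column of the table.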
 2. Tossing a fair coin three times. The sample space is {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
 The random variable X can be the number of heads in these three tosses.

 The space of the rv X = {0, 1, 2, 3}.
3. The sample space for rolling a die once is S = {1, 2, 3, 4, 5, 6}.
 Let the rv X denote the number on the face that turns up in a sample point; then we can write
 X(1) = 1, X(2) = 2, X(3) = 3, X(4) = 4, X(5) = 5, X(6) = 6.

4. Rolling a pair of fair dice. A random variable X is the sum of the two dice. The sample space is {(1,1), (1,2), …, (6,6)}, N = 36.
 The space of the rv X = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; the number of elements is 11.
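The dice example can be checked by enumerating all 36 sample points (a Python sketch):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely sample points
sums = sorted({a + b for a, b in outcomes})       # space of the rv X

# Probability of each value of X: count favourable points out of 36
pmf = {x: Fraction(sum(1 for a, b in outcomes if a + b == x), 36) for x in sums}
```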
Types of Random Variable

a. Discrete Random Variable: If a random variable can assume only a particular finite or countably infinite set of values, it is said to be a discrete random variable.
b. Continuous Random Variable: If a random variable can assume an infinite and uncountable set of values, it is known as a continuous random variable.
2.2. The probability distribution/function of a discrete random variable
 Once a random variable X is defined, the sample space is no longer important.

 All relevant aspects of the experiment can be captured by listing the possible values of X and their corresponding probabilities.

 This list is called a probability density function (pdf), probability mass function, or probability distribution.
 Definition: If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range of X is known as the probability distribution of X.

 The probability distribution reveals the association of each value of X with its corresponding probability.
 A function can serve as the probability distribution of a discrete random variable X iff its values, f(x), satisfy the following two conditions:
1. f(x) ≥ 0 for each value within its domain
2. ∑x f(x) = 1,

 where the summation extends over all the values within its domain.
3. If X is a discrete rv, then its pdf can be indicated using a function, a table, or a graph.
Tables of Probability Distributions
Examples
 1. Consider an experiment of tossing two coins. Let the random variable X denote the number of heads.

Then S = {HH, HT, TH, TT}, and X(x) = 2 if x = HH, X(x) = 1 if x = HT or TH, X(x) = 0 if x = TT.

The probability distribution is:
x      0     1     2
f(x)   1/4   1/2   1/4
 2. Tossing a fair coin three times. The random variable X can be the number of heads in these three tosses.
 The probability distribution is given by the table:
x      0     1     2     3
f(x)   1/8   3/8   3/8   1/8
3. Rolling a pair of fair dice. A random variable X is the sum of the two dice. The probability distribution is:
x      2     3     4     5     6     7     8     9     10    11    12
f(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
 4. In the experiment of flipping a coin and counting the number of tosses required to obtain a head, the sample space is S = {H, TH, TTH, TTTH, ...}.
 Let the rv X = the number of tosses required to produce the first head, and let P(H) = p, P(T) = 1 - p. Then
f(x) = (1 - p)^(x-1) p, x = 1, 2, 3, ...

 The sum of this process must be equal to 1: the terms form a geometric series, so ∑x f(x) = p/(1 - (1 - p)) = 1.
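That the probabilities f(x) = (1 - p)^(x-1) p sum to 1 can be verified numerically (a Python sketch, taking p = 1/2):

```python
# Terms of the geometric pmf form a geometric series with ratio (1 - p),
# so the partial sums approach p / (1 - (1 - p)) = 1.
p = 0.5
partial = sum((1 - p) ** (x - 1) * p for x in range(1, 200))
```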
 Functions of Probability Distribution for Discrete Random Variable

 A function can serve as the probability distribution of a discrete random variable X iff its values, f(x), satisfy the following two conditions:

1. f(x) ≥ 0 for each value within its domain
2. ∑x f(x) = 1,

 where the summation extends over all the values within its domain.
 Examples:
 1. Check whether the following function represents a probability distribution:
f(x) = x/21, x = 1, 2, 3, 4, 5, 6
 Solution: Check for the two conditions:
f(x) ≥ 0 for all the given values of x, and
∑(x=1 to 6) f(x) = f(1) + f(2) + f(3) + f(4) + f(5) + f(6)
 = 1/21 + 2/21 + 3/21 + 4/21 + 5/21 + 6/21 = 21/21 = 1
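The two conditions can be packaged into a small checker (a Python sketch; `is_pmf` is our own name):

```python
from fractions import Fraction

def is_pmf(f, support):
    """Check the two conditions: f(x) >= 0 on the support and the values sum to 1."""
    values = [f(x) for x in support]
    return all(v >= 0 for v in values) and sum(values) == 1

ok = is_pmf(lambda x: Fraction(x, 21), range(1, 7))  # f(x) = x/21, x = 1, ..., 6
```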
2. The probability function of the uniform distribution is given by:
f_X(x) = 1/6, x = 1, ..., 6
       = 0, elsewhere

3. Check whether the following function represents a probability distribution:
f(x) = (x - 2)/5, x = 1, 2, 3, 4, 5

 Since f(x) < 0 for x = 1, it does not represent a probability distribution.
4. Check whether the following function represents a probability distribution.

5. For what value of k can the following function serve as a probability distribution?

The series k/4, k/16, k/64, ... is a GP with a1 = k/4 and r = 1/4. Then the sum is (k/4)/(1 - 1/4) = k/3 = 1, so k = 3.
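The value k = 3 can be confirmed numerically (a Python sketch, summing the series k/4 + k/16 + k/64 + ...):

```python
# The terms k/4, k/16, k/64, ... form a geometric series with first term k/4
# and ratio 1/4, so the infinite sum is (k/4) / (1 - 1/4) = k/3.
# Setting k/3 = 1 gives k = 3.
k = 3
partial = sum(k / 4 ** n for n in range(1, 60))  # numerical check of the infinite sum
```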
Graphs of Probability Distributions for Discrete Random Variable
 The graphs include the (probability) histogram and the bar chart.
 Example: Tossing a fair coin three times. The random variable can be the number of heads in these three tosses.
 The probability distribution is the one tabulated earlier for this experiment, and its graph is a probability histogram.
Cumulative Distribution Functions (CDF) of a Discrete Random Variable
 The cdf answers the question "what is the probability that a discrete rv X will assume a value less than or equal to a given number?"
Definition: A cdf shows the relation between the possible values t of a rv X and the probability that the value of X is less than or equal to t. That is, the cdf relates each possible value t that the variable X might take to the probability P(X ≤ t), the probability of a value less than or equal to t.
 This cumulative probability is written as F(x) = P(X ≤ x).
 Formally, F(x) = ∑(t ≤ x) f(t).

 The values of F(x) satisfy the following conditions:
F(∞) = 1
F(-∞) = 0
If a ≤ b, then F(a) ≤ F(b) for any real numbers a and b.
 The following five properties hold for the CDF:
a) 0 ≤ F(x) ≤ 1
b) F(x) is non-decreasing (i.e., if x1 ≤ x2, then F(x1) ≤ F(x2))
c) F(x) = 0 for x < x1, x1 being the minimum/least of the values of the random variable X.
d) F(x) = 1 for x ≥ xn, xn being the maximum/largest value of X.
e) P(x < X ≤ x') = P(X ≤ x') - P(X ≤ x) = FX(x') - FX(x)
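The definition and these properties can be illustrated on the three-toss example (a Python sketch):

```python
from fractions import Fraction

# pmf of the number of heads in three tosses of a fair coin
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(t):
    """Cumulative distribution function: P(X <= t)."""
    return sum(p for x, p in pmf.items() if x <= t)
```

For example, F(1) = 1/8 + 3/8 = 1/2, and F stays flat between the possible values, e.g. F(2.5) = F(2) = 7/8.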
 Example: Suppose the pdf is given by:
f(x) = f(xi), x = xi, i = 1, ..., k
     = 0, elsewhere

 Example:
f(x) = 1/6, x = xi, i = 1, 2, ..., 6
     = 0, elsewhere
 Graphically, F(x) is a step function with the height of the step at xi equal to f(xi):

[Figure: step graph of F(x) against x = 0, 1, ..., 6, rising by 1/6 at each x = 1, ..., 6 from 1/6 up to 1.]
 Note that:
1. F(x) gives us the probability that the rv X will assume a value less than or equal to a given number, while f(x) gives us the probability that the rv X will assume a particular value.
2. Given f(x) we can derive F(x), or given F(x) we can derive f(x):
F(x) = ∑(t ≤ x) f(t), for -∞ < x < ∞
2.3 Probability Density Functions (pdf) of the Continuous rv X
 Continuous Random Variable: if a random variable can assume an infinite and uncountable set of values, it is known as a continuous random variable
 (e.g. height, weight, and the time elapsing between two telephone calls). More technically,
 Definition: A random variable X is called continuous if there exists a function f(.) such that
P(X ≤ x) = F(x) = ∫(-∞ to x) f(u) du for every real number x,
 and F(x) is called the distribution function of the random variable X.
 Definition: A function with values f(x), defined over the set of all real numbers, is known as a pdf of the continuous rv X iff
P(a ≤ X ≤ b) = ∫(a to b) f(x) dx
 for any real constants a and b with a ≤ b. Alternatively, if X is a continuous random variable, the function f(.) in F(x) = ∫(-∞ to x) f(u) du is called the probability density function of X.
 A function can serve as a probability density function of a continuous random variable X if its values, f(x), satisfy the following conditions:
1. f(x) ≥ 0 for all x
2. ∫(-∞ to ∞) f(x) dx = 1
 Properties of Probability Density Function (pdf)

3. The probability of a fixed value of a continuous rv X is zero. Hence
P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)
 for any real constants a and b with a ≤ b.

 A pdf can be indicated using graphs or functions.
 Examples:
1. f(x) = 6x(1 - x), for 0 ≤ x ≤ 1
Solution: f(x) ≥ 0 on [0, 1] (graphing f, the x-intercepts are 0 and 1), and
∫(0 to 1) 6x(1 - x) dx = 6 ∫(0 to 1) (x - x²) dx
 = 6 (x²/2 - x³/3) |(0 to 1)
 = 6 (1/2 - 1/3) = 6 · 1/6 = 1
So f(x) is a valid pdf.
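The integral can also be confirmed numerically (a Python sketch using a simple midpoint rule):

```python
def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

area = integrate(lambda x: 6 * x * (1 - x), 0.0, 1.0)  # should be close to 1
```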
 2. For what value of k can the function
f(x) = kx(1 - x), for 0 ≤ x ≤ 1
 serve as the probability density function?

 Solution: ∫(0 to 1) kx(1 - x) dx = k(1/2 - 1/3) = k/6 = 1, so k = 6.
3. The pdf of the continuous rv X is given by:
f(x) = c/√x, 0 < x < 4
     = 0, otherwise

a. Find the value of c.
b. Find P(X < 1/4) and P(X > 1).

4. Given the pdf:
f(x) = kxe^(-x²), for x > 0
     = 0, for x ≤ 0
Find the value of k.
Cumulative Distribution Functions (cdf) of a Continuous rv X
Definition: if X is a continuous rv and the value of its probability density at t is f(t), then the function given by:
F(x) = P(X ≤ x) = ∫(-∞ to x) f(t) dt, for -∞ < x < ∞
is known as the Cumulative Distribution Function of X.

F(.) must be continuous with domain the set of all real numbers and range between 0 and 1 (inclusive).
Properties of cdf of a continuous rv X
1. F(∞) = lim(x→∞) F(x) = 1
2. F(-∞) = lim(x→-∞) F(x) = 0
3. Properties 1 and 2 imply that 0 ≤ FX(x) ≤ 1.
4. If f(x) and F(x) are the values of the pdf and cdf of a continuous rv X, then
P(a < X ≤ b) = F(b) - F(a) for any real constants a and b with a ≤ b.
5. F(x) is non-decreasing, i.e., if a > b then F(a) - F(b) = P(b < X ≤ a) ≥ 0.
6. f(x) = dF(x)/dx where the derivative exists.
7. F(x) = F(x) - F(-∞) = ∫(-∞ to x) f(u) du.
8. Since F(x) is non-decreasing, it follows that f(x) ≥ 0.

Examples
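As a worked numeric illustration, take the density c/√x on (0, 4) from example 3 above (our reconstruction of the garbled formula). The total-probability condition fixes c (a Python sketch):

```python
def integrate(f, a, b, n=1_000_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Total probability must be 1: c * integral of x**-0.5 over (0, 4) = c * 2*sqrt(4)
# = 4c, so c = 1/4.  The numerical integral approximates the value 4.
raw = integrate(lambda x: x ** -0.5, 0.0, 4.0)
c = 1 / raw
```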
ii.

2. The pdf of the continuous rv X is given by:
f(x) = c/√x, 0 < x < 4
     = 0, otherwise

Find its cdf and P(X > 1).
1. Find the cdf of the pdf of the form:
f(x) = kxe^(-x²), for x > 0
     = 0, for x ≤ 0

2. Suppose X is a continuous rv with cdf F(x) = 1/(1 + e^(-x)).

Find its pdf and compute P(-1 < X < 2) using both the pdf and the cdf.
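For exercise 2 (assuming the cdf is the logistic F(x) = 1/(1 + e^(-x)), as reconstructed above), the probability is just a difference of two cdf values (a Python sketch):

```python
from math import exp

def F(x):
    """Logistic cdf: F(x) = 1 / (1 + e**(-x))."""
    return 1 / (1 + exp(-x))

def f(x):
    """Its pdf, obtained by differentiating F."""
    return exp(-x) / (1 + exp(-x)) ** 2

prob = F(2) - F(-1)  # P(-1 < X < 2), roughly 0.61
```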
2.4. The Expected Value of a Random Variable and Moments
Mathematical Expectation:
1. Let X be a discrete random variable taking values x1, x2, x3, … with f(xi) as its probability density; then the expected value of X, denoted by E(X), is defined as
E(X) = ∑i xi f(xi)

That is, E(X) is the weighted mean of the possible values of X, each value being weighted by its probability.
2. Let X be a continuous random variable with probability density function f(x); then the expected value of X, denoted by E(X), is defined as:
E(X) = ∫(-∞ to ∞) x f(x) dx

Properties of mathematical expectations
a. If c is a constant, E(c) = c.
 Proof: E(c) = (c)·1 = c, since the probabilities sum to 1.

b. E(aX + b) = aE(X) + b, where a and b are constants in ℝ.
 Proof (discrete): E(aX + b) = ∑(ax + b)f(x) = a∑x f(x) + b∑f(x) = aE(X) + b.
b. For continuous: E(aX + b) = ∫(ax + b)f(x) dx = a∫x f(x) dx + b∫f(x) dx = aE(X) + b.

c. Let X and Y be random variables with finite expected values. Then
E(X + Y) = E(X) + E(Y)
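Property b can be checked on a concrete pmf (a Python sketch, using a fair die):

```python
from fractions import Fraction

# Fair-die pmf: f(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

E_X = sum(x * p for x, p in pmf.items())               # E(X) = 7/2
E_lin = sum((2 * x + 3) * p for x, p in pmf.items())   # E(2X + 3) computed directly
```

E_lin equals 2·E(X) + 3 = 10, as the property predicts.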
Expectation of a Function of a Random Variable
Let g(x) be a function of a rv X; then
E[g(X)] = ∑x g(x) f(x) for a discrete rv, and
E[g(X)] = ∫(-∞ to ∞) g(x) f(x) dx for a continuous rv.

For c any positive constant and g(x) a function of a rv X:
1. for the discrete rv: E[c·g(X)] = c ∑x g(x) f(x) = c·E[g(X)]
2. for the continuous rv: E[c·g(X)] = c ∫ g(x) f(x) dx = c·E[g(X)]
Variance and Standard Deviation of a rv X
 The variance of the rv X measures the spread or dispersion of the rv X.
 Let X be a rv with probability distribution f(x) and mean μ = E(X).

Then the variance of X, denoted by Var(X), is defined by:
Var(X) = E[(X - μ)²] = E(X²) - [E(X)]²
Properties of Var(X)
a. If "a" is any real constant, then Var(a) = 0.
b. If Var(X) = σ², then the variance of Y in Y = aX + b is given by: Var(Y) = a²σ².
Examples: for discrete
1. Given a probability distribution (table), find: a. E(X); b. Var(X).

2. Bernoulli Random Variable: A random variable with only two outcomes (0 and 1) is known as a Bernoulli rv.
Let X be a random variable with probability p of success and (1 - p) of failure.
The above tabular expression for the probability of a Bernoulli rv can be written as
f(x) = p^x (1 - p)^(1-x), x = 0, 1.

Let X be the number of trials required to produce the 1st success, say a head in a toss of a fair coin. This is easily described by a geometric random variable and is given as:
f(x) = (1 - p)^(x-1) p, x = 1, 2, 3, ...
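Both pmfs can be explored numerically (a Python sketch, with p = 1/2; the formulas are the standard Bernoulli and geometric forms stated above):

```python
from fractions import Fraction

p = Fraction(1, 2)

# Bernoulli: f(x) = p**x * (1-p)**(1-x), x = 0, 1
bern = {x: p ** x * (1 - p) ** (1 - x) for x in (0, 1)}
E_bern = sum(x * q for x, q in bern.items())               # = p

# Geometric: f(x) = (1-p)**(x-1) * p, x = 1, 2, ... (truncated support)
geo = {x: (1 - p) ** (x - 1) * p for x in range(1, 60)}
E_geo = float(sum(x * q for x, q in geo.items()))          # approaches 1/p = 2
```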
Examples for Continuous rv X
1. Given the pdf:

Moments of a probability distribution
The mean of a distribution is the expected value of the random variable X.
A generalisation of this is to raise X to any power r, for r = 0, 1, 2, ..., and compute E(X^r). This is known as the moment of order r about the origin.

The rth moment about the origin is denoted by μr′:
for r = 0, μ0′ = E(X⁰) = 1
for r = 1, μ1′ = E(X¹) = E(X) = mean = ∑ x f(x)
for r = 2, μ2′ = E(X²)
for r = 3, μ3′ = E(X³)
...
μr′ = E(X^r)
In general,
μr′ = E(X^r) = ∑x x^r fX(x), r = 0, 1, 2, ... (discrete)
μr′ = E(X^r) = ∫ x^r f(x) dx, r = 0, 1, 2, ... (continuous)

Moments can also be generated around the mean, which are known as central moments or moments around the mean. The rth central moment is denoted by μr:
for r = 0, μ0 = E(X - μ)⁰ = 1
for r = 1, μ1 = E(X - μ)¹ = 0
for r = 2, μ2 = E(X - μ)² = Var(X)
for r = 3, μ3 = E(X - μ)³
...
μr = E(X - μ)^r
 In general,
μr = E[(X - μ)^r] = ∑x (x - μ)^r f(x), r = 0, 1, 2, ... or
μr = E[(X - μ)^r] = ∫(-∞ to ∞) (x - μ)^r f(x) dx, r = 0, 1, 2, ...

 μ2 = E[(X - μ)²] is defined as the variance of the random variable, and is also denoted by Var(X) or V(X).
Thus, the variance of a random variable is the expected value of the square of the random variable less the square of the expected value of the random variable:
Var(X) = E(X²) - [E(X)]²
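The shortcut formula can be checked against the defining formula on a fair die (a Python sketch):

```python
from fractions import Fraction

# Fair-die pmf: f(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())                       # E(X) = 7/2
EX2 = sum(x ** 2 * p for x, p in pmf.items())                 # E(X^2) = 91/6
var_shortcut = EX2 - mu ** 2                                  # E(X^2) - [E(X)]^2
var_direct = sum((x - mu) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
```

Both routes give Var(X) = 35/12.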
 μ3 is the third moment about the mean and is equal to:
μ3 = E(X³) - 3μE(X²) + 2μ³

 μ4 is the fourth moment about the mean and is equal to:
μ4 = E(X⁴) - 4μE(X³) + 6μ²E(X²) - 3μ⁴
Interpretations

1. The first moment about the origin is the mean of the distribution. That is, μ1′ = μ is a measure of central tendency.
2. The second moment about the mean is the variance of the distribution. μ2, which is denoted by σ², Var(X), or V(X), is known as the variance, and is a measure of dispersion of the random variable.
3. σ, the positive root of the variance, is called the standard deviation of the random variable.
Skewness and Kurtosis
 A fundamental task in many statistical analyses is to characterize the location and variability of a data set.
 A further characterization of the data includes skewness and kurtosis.
 Skewness is a measure of the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.
 Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.
 That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
Measures of Asymmetry

 One of the main features of a distribution is the extent to which it is symmetric.

 Kurtosis is a measure of peakedness.
4. μ3 is the third moment about the mean and is used to calculate the measure of skewness, which is given as
α3 = μ3/σ³
and known as Pearson's measure of skewness.

 If α3 = 0 then the distribution is symmetric.
 If α3 > 0 then the distribution is positively skewed, and there is a spread to the right: a few observations on the right-hand side of the mean pull the mean to the right.
 If α3 < 0 then the distribution is negatively skewed, and there is a spread to the left: a few observations on the left-hand side of the mean pull the mean to the left.
5. μ4 is the fourth moment about the mean and is used to calculate the measure of peakedness or flatness (which is known as kurtosis), given as
α4 = μ4/σ⁴

α4 = 3 for a normal distribution.

α4 > 3 if the distribution is narrower and thinner at its tails than the normal distribution (it is known as leptokurtic).

α4 < 3 if the distribution is flatter and thicker at its tails than the normal distribution (it is known as platykurtic).
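Both measures are easy to compute for the three-coin distribution tabulated earlier (a Python sketch): since that distribution is symmetric we expect α3 = 0, and its kurtosis comes out below 3 (platykurtic).

```python
from fractions import Fraction

# pmf of the number of heads in three tosses of a fair coin
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
mu = sum(x * p for x, p in pmf.items())                    # mean = 3/2

def central_moment(r):
    """mu_r = E[(X - mu)**r] for this pmf."""
    return sum((x - mu) ** r * p for x, p in pmf.items())

sigma = float(central_moment(2)) ** 0.5                    # standard deviation
alpha3 = float(central_moment(3)) / sigma ** 3             # Pearson's skewness
alpha4 = float(central_moment(4)) / sigma ** 4             # kurtosis
```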
 Example
We have already obtained the expected value of the Bernoulli random variable:

 E(X) = 0(1 - p) + 1(p) = p.

 To obtain the variance of the Bernoulli random variable we first get
E(X²) = 0²(1 - p) + 1²(p) = p.
Then Var(X) = E(X²) - [E(X)]² = p - p² = p(1 - p).
Cont.

 Exercise
Let the random variable Y be the number of failures preceding a success in an experiment of tossing a fair coin, where success is obtaining a head. Find E(Y) and σ² for this random variable.
Moment Generating Function (mgf)
Moments of most distributions can be determined directly through integration or summation. An alternative technique to calculate the moments of a distribution is known as the moment generating function (mgf).
Definition: The moment generating function of a random variable X, written as m(t), is defined as
m(t) = E(e^(tX))
where t is any constant in a neighbourhood of 0 (zero).
The Taylor series expansion about any point x0 of a continuously differentiable function f(x) is given by:
f(x) = ∑(n=0 to ∞) f⁽ⁿ⁾(x0) (x - x0)ⁿ / n!

The Maclaurin's series (the Taylor series expansion about the origin, or zero) is given by:
f(x) = ∑(n=0 to ∞) f⁽ⁿ⁾(0) xⁿ / n!

Example: Let f(x) = e^x; then
e^x = 1 + x + x²/2! + x³/3! + ... = ∑(n=0 to ∞) xⁿ/n!
 In general,
d^r m(t)/dt^r |(t=0) = μr′, the rth moment about the origin.

Examples
1. Given the pdf f(x) = 2x, 0 < x < 1.
2. Given the probability distribution: f(x) = 2(1/3)^x, x = 1, 2, 3, ...
 Find:
a. the moment generating function
b. the mean and the variance of the distribution using (a)
 Solution:
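For example 2 (with the pmf as reconstructed above, f(x) = 2(1/3)^x), the mean from direct summation agrees with the first derivative of the mgf at t = 0; a Python sketch using a numerical derivative of a truncated mgf:

```python
from math import exp

def f(x):
    return 2 * (1 / 3) ** x        # pmf of example 2; sums to 1 over x = 1, 2, ...

xs = range(1, 200)                 # truncation of the infinite support

def m(t):
    """Moment generating function m(t) = E(e**(t*X)), truncated."""
    return sum(exp(t * x) * f(x) for x in xs)

mean_direct = sum(x * f(x) for x in xs)       # E(X) = 3/2
h = 1e-6
mean_mgf = (m(h) - m(-h)) / (2 * h)           # central-difference estimate of m'(0)
```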
