
Random Variable and Probability Distribution

Random Variable

A variable that takes on different values as a result of the outcomes of a random experiment.

Or

A variable which assumes different values with associated probabilities is called a random variable.

Example

By tossing two unbiased coins simultaneously we get the number of heads x = 0, 1, 2 with corresponding probabilities 1/4, 1/2 and 1/4. Here x is a random variable.
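This small distribution can be verified by enumerating the four equally likely outcomes; the Python sketch below is illustrative only and not part of the original notes:

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space of tossing two fair coins and tally
# the number of heads to recover the distribution stated above.
outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT
pmf = {x: Fraction(sum(1 for o in outcomes if o.count("H") == x), len(outcomes))
       for x in range(3)}
```

Each of the four outcomes has probability 1/4, so the tally reproduces the probabilities 1/4, 1/2 and 1/4.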

Types of Random Variable

A random variable may be classified depending upon the specific numerical values it can take.

These are

• Discrete Random Variable

• Continuous Random Variable

Discrete Random Variable

A random variable defined over a discrete sample space, i.e. one that may take on only a finite or countably infinite number of isolated values, is referred to as a discrete random variable.

Some Examples of Discrete Random Variables

• Number of telephone calls received in a telephone booth.

• Number of correct answers in a 100-MCQ type examination.

• Number of defective bulbs produced during a day’s run.

• Number of under-five children in a family.

Continuous Random Variable

A random variable defined over a continuous sample space, i.e. one which may take on any value in a certain interval or collection of intervals, is referred to as a continuous random variable.

Some Examples of Continuous Random Variables

• Time taken to serve a customer at a bank counter.

• Weight of a six-month-old baby.

• Rate of interest offered by a commercial bank.

• Longevity of a machine.

Probability Distribution

The collection of all possible values of a random variable x together with the corresponding probabilities P(x) is called the probability distribution of x.

Or

Any statement of a function associating each of a set of mutually exclusive and exhaustive classes or class
intervals with its probability is a probability distribution.

Types of Probability Distribution

A probability distribution will be either discrete or continuous according as the random variable is
discrete or continuous.

Discrete Probability Distribution

A probability distribution in which the variable is allowed to take on only a limited number of values
which can be listed.

If a random variable X has a discrete distribution, the probability distribution of X is defined as the
function f such that for any real number x,

f(x) = P(X = x)

The function f(x) defined above must satisfy the following conditions in order to be a probability mass function:

• f(x) ≥ 0 for all x.

• Σ_x f(x) = 1, the sum being taken over all possible values of x.

• P(X = x) = f(x).
Example: Verify that the following functions are probability mass functions.

2x 1
a) f ( x)  , x  0, 1, 2, 3.
8

x 1
b) f ( x)  , x  0, 1, 2, 3.
16

3x  6
c) f ( x)  , x  1, 2.
21

Solution: a) Summing the function over the entire range of X,

Σ_x f(x) = f(0) + f(1) + f(2) + f(3) = −1/8 + 1/8 + 3/8 + 5/8 = 1

Here f(x) satisfies condition (2) but P(X = 0) = f(0) = −1/8, which contradicts condition (1). Hence f(x) is not a probability mass function.

b) Summing the function over the entire range of X,

Σ_x f(x) = f(0) + f(1) + f(2) + f(3) = 1/16 + 2/16 + 3/16 + 4/16 = 10/16 ≠ 1

Here f(x) fails to satisfy condition (2) and hence is not a probability mass function, although f(x) ≥ 0 for all values of x.

c) Summing the function over the entire range of X,

Σ_x f(x) = f(1) + f(2) = 9/21 + 12/21 = 1

Here f(x) satisfies all the conditions laid down above and thus the function f(x) represents a probability mass function.
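The two pmf conditions can also be checked mechanically. The following sketch (illustrative only, using exact fractions so the sums come out exact) tests each of the three candidate functions:

```python
from fractions import Fraction

def is_pmf(f, support):
    # A pmf must be non-negative everywhere and sum to exactly 1.
    values = [f(x) for x in support]
    return all(v >= 0 for v in values) and sum(values) == 1

fa = lambda x: Fraction(2 * x - 1, 8)   # a) fails: f(0) = -1/8 < 0
fb = lambda x: Fraction(x + 1, 16)      # b) fails: the sum is 10/16
fc = lambda x: Fraction(3 * x + 6, 21)  # c) satisfies both conditions

results = [is_pmf(fa, range(4)), is_pmf(fb, range(4)), is_pmf(fc, [1, 2])]
```

Only the third function passes both checks, matching the worked solution above.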

Example: A rack contains 10 helmets of which 4 are defective. If 3 helmets are drawn at random without
replacement, obtain the probability distribution for the number of defective helmets drawn.

Solution: If X denotes the number of defective helmets drawn, then clearly X can assume the values 0, 1, 2 and 3. To obtain the probability distribution of X, we need to compute the probabilities associated with 0, 1, 2 and 3. Since 3 helmets are to be chosen, the number of ways in which this choice can be made is 10C3 = 120.

Thus,

f(0) = P(X = 0) = (4C0 × 6C3)/10C3 = 20/120 = 1/6

f(1) = P(X = 1) = (4C1 × 6C2)/10C3 = 60/120 = 1/2

f(2) = P(X = 2) = (4C2 × 6C1)/10C3 = 36/120 = 3/10

f(3) = P(X = 3) = (4C3 × 6C0)/10C3 = 4/120 = 1/30

Hence the tabular form of the probability distribution of X will be as follows:

X x 0 1 2 3
P( X  x)  f ( x) 1 1 3 1
6 2 10 30
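These hypergeometric probabilities can be reproduced directly with binomial coefficients; a short sketch (not part of the original notes):

```python
from fractions import Fraction
from math import comb

# 10 helmets, 4 defective; 3 drawn without replacement.
# f(x) = (4Cx * 6C(3-x)) / 10C3 for x = 0, 1, 2, 3.
def f(x, N=10, K=4, n=3):
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

dist = {x: f(x) for x in range(4)}
```

The four probabilities sum to 1, as a probability distribution must.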

Example: A shelf holds equal numbers of two types of equipment: defective and non-defective. An item of equipment is selected at random three times. Let X be the number of runs obtained as a result of the outcomes of this experiment. Find the probability distribution of X.

Solution: Any unbroken sequence of a particular outcome will be counted as a run. We denote a defective item by D and a non-defective item by N. Now we construct the sample space for the experiment along with the values of the random variable X together with their associated probabilities:

Outcome    Number of Runs (X = x)    P(X = x)
DDD        1                         1/8
DDN        2                         1/8
DND        3                         1/8
DNN        2                         1/8
NDD        2                         1/8
NDN        3                         1/8
NND        2                         1/8
NNN        1                         1/8

The random variable X is seen to take on three distinct values 1, 2 and 3 with probabilities 2/8, 4/8 and 2/8 respectively. The values of the random variable and their associated probabilities are summarized in tabular form below:

Number of runs (X = x)    1     2     3
P(X = x) = f(x)           2/8   4/8   2/8
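The table can be cross-checked by enumerating all 2³ equally likely sequences and counting maximal blocks of identical symbols; this sketch is illustrative and not from the original notes:

```python
from fractions import Fraction
from itertools import groupby, product

def runs(seq):
    # Count maximal blocks of identical symbols, e.g. DND has 3 runs.
    return sum(1 for _ in groupby(seq))

counts = {}
for seq in product("DN", repeat=3):
    counts[runs(seq)] = counts.get(runs(seq), 0) + 1

pmf = {x: Fraction(c, 8) for x, c in sorted(counts.items())}
```

The enumeration reproduces the probabilities 2/8, 4/8 and 2/8 for 1, 2 and 3 runs.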
Continuous Probability Distribution

A probability distribution in which the variable is allowed to take on any value within a given range.

A continuous variate can take on any value in the given interval a ≤ X ≤ b. As a result, a continuous random variable has probability zero of assuming exactly any one of its values. This implies that

P(a ≤ X ≤ b) = P(X = a) + P(a < X < b) + P(X = b) = P(a < X < b).

The distribution of a continuous random variable X is described by a function f(x), usually called a probability density function (pdf) or simply a density function.

A formal definition of a probability density function may be presented as follows:

A probability density function is a non-negative function and is constructed so that the area under its
curve bounded by the x-axis is equal to unity when computed over the range of x, for which f(x) is defined.

The above definition leads to the conclusion that a pdf is one that possesses the following properties:

• f(x) ≥ 0.

• ∫_{−∞}^{∞} f(x) dx = 1.

• P(a < X < b) = ∫_a^b f(x) dx.

Example: A random variable X has the following functional form:

f(x) = kx,  0 < x < 4
     = 0,   elsewhere

i. Determine k for which f(x) is a density function.

ii. Find P(1 < X < 2) and P(X > 2).

Solution: (i) For f(x) to be a density function, we must have

∫_{−∞}^{∞} f(x) dx = 1
⇒ k ∫_0^4 x dx = 1
⇒ k [x²/2]_0^4 = 1
⇒ 8k = 1
⇒ k = 1/8

The complete density function is thus

f(x) = x/8,  0 < x < 4
     = 0,    elsewhere

(ii) Again, P(1 < X < 2) = (1/8) ∫_1^2 x dx = (1/8)[x²/2]_1^2 = 3/16

And P(X > 2) = (1/8) ∫_2^4 x dx = (1/8)[x²/2]_2^4 = 3/4
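Since the antiderivative of x/8 is x²/16, the three integrals above reduce to simple evaluations; the sketch below verifies them numerically:

```python
# Antiderivative of the density f(x) = x/8 on (0, 4).
def F(x):
    return x * x / 16.0

total = F(4) - F(0)    # total probability; should equal 1, confirming k = 1/8
p_1_2 = F(2) - F(1)    # P(1 < X < 2)
p_gt2 = F(4) - F(2)    # P(X > 2)
```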

The following are important probability distributions:

• Binomial Distribution

• Poisson Distribution

• Normal Distribution

• Exponential Distribution

Binomial Probability Distribution

The number X of successes in n Bernoulli trials is called a binomial random variable. The probability
distribution of this discrete random variable is called the binomial distribution, and its values will be
denoted by b(x; n, p) since they depend on the number of trials and the probability of a success on a given
trial.
A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 − p. Then
the probability distribution of the binomial random variable X, the number of successes in n independent
trials, is

b(x; n, p) = nCx p^x q^(n−x),  x = 0, 1, 2, ..., n.
The binomial distribution was derived by James Bernoulli (1654–1705) and appeared in his Ars Conjectandi, published posthumously in 1713.
Strictly speaking, the Bernoulli process must possess the following properties:
1. The experiment consists of a fixed number n of repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.
Remarks
A probability function P(x) must satisfy the following conditions:

• P(x) = nCx p^x q^(n−x) ≥ 0 for all values of x.

• Σ_{x=0}^{n} P(x) = Σ_{x=0}^{n} nCx p^x q^(n−x) = (p + q)^n = 1, [since p + q = 1]

Properties of Binomial Distribution

• Total probability of the events will be unity, i.e. Σ P(x) = 1.

• The binomial probability distribution has two parameters, n and p.

• The mean and variance of the binomial distribution are np and npq respectively.

• The binomial distribution is symmetrical if p = q = 1/2; if p < 1/2 the distribution is positively skewed, and if p > 1/2 the distribution is negatively skewed.

• The skewness of the distribution is β₁ = (q − p)²/(npq) and the kurtosis is β₂ = 3 + (1 − 6pq)/(npq). As the number of trials n increases infinitely, β₁ → 0 and β₂ → 3.

• When n → ∞ and p → 0 (in such a way that np remains finite), the Binomial distribution tends to the Poisson distribution.

• When n → ∞ and neither p nor q is very small, the Binomial distribution tends to the Normal distribution.

Mean

We know the mean of the binomial distribution is

μ = μ₁′ = E(x) = Σ_{x=0}^{n} x p(x) = Σ_{x=0}^{n} x nCx p^x q^(n−x) = np Σ_{x=1}^{n} (n−1)C(x−1) p^(x−1) q^(n−x) = np (q + p)^(n−1) = np

Variance

We know the variance of the binomial distribution is

σ² = μ₂ = μ₂′ − (μ₁′)², where μ₂′ = E(x²) = E[x(x − 1) + x] = E[x(x − 1)] + E(x)

Again, E[x(x − 1)] = Σ_{x=0}^{n} x(x − 1) nCx p^x q^(n−x) = n(n − 1)p² Σ_{x=2}^{n} (n−2)C(x−2) p^(x−2) q^(n−x) = n(n − 1)p² (p + q)^(n−2) = n(n − 1)p²

We have already seen that E(x) = np.

We get μ₂′ = E[x(x − 1)] + E(x) = n(n − 1)p² + np

Now σ² = μ₂ = μ₂′ − (μ₁′)² = n(n − 1)p² + np − (np)² = n²p² − np² + np − n²p² = np(1 − p) = npq
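The results μ = np and σ² = npq can be confirmed numerically by summing over the whole pmf; n = 10 and p = 0.3 below are arbitrary illustrative values:

```python
from math import comb

n, p = 10, 0.3            # arbitrary illustrative parameters
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# E(X) and Var(X) computed directly from the definition of a moment.
mean = sum(x * pmf[x] for x in range(n + 1))
var = sum(x * x * pmf[x] for x in range(n + 1)) - mean**2
```

With these parameters, np = 3 and npq = 2.1, matching the formulas derived above.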

Show that Mean > Variance

Proof: We know that Mean = E(x) = np and Variance = npq.

We can write

Variance = npq
         = np(1 − p)
         = np − np²
         = E(x) − np²

⇒ E(x) − V(x) = np²
⇒ E(x) > V(x) [since np² > 0]

That is, the mean is always greater than the variance.

Example

A large chain retailer purchases a certain kind of electronic device from a manufacturer. The
manufacturer indicates that the defective rate of the device is 3%.
(a) The inspector of the retailer randomly picks 20 items from a shipment. What is the probability that
there will be at least one defective item among these 20?
(b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices
per shipment. What is the probability that there will be 3 shipments containing at least one defective
device?
Solution
(a) Denote by X the number of defective devices among the 20. This X follows a b(x; 20, 0.03) distribution. Hence

P(X ≥ 1) = 1 − P(X = 0) = 1 − b(0; 20, 0.03) = 1 − 20C0 (0.03)⁰(0.97)²⁰ = 1 − 0.5438 = 0.4562

(b) In this case, each shipment either contains at least one defective item or it does not. Hence, testing the result of each shipment can be viewed as a Bernoulli trial with p = 0.4562 from part (a). Assuming independence from shipment to shipment and denoting by Y the number of shipments containing at least one defective item, Y follows another binomial distribution b(y; 10, 0.4562). Therefore, the answer to this question is

P(Y = 3) = b(3; 10, 0.4562) = 10C3 (0.4562)³(1 − 0.4562)⁷ = 0.1602
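Both parts can be computed in a few lines; the sketch below (illustrative only) mirrors the calculation:

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# (a) P(at least one defective among 20) with p = 0.03
p_a = 1 - binom_pmf(0, 20, 0.03)

# (b) P(exactly 3 of 10 shipments contain at least one defective),
# using the success probability found in part (a)
p_b = binom_pmf(3, 10, p_a)
```

Carrying the unrounded value of p_a through part (b) gives essentially the same answer as the text.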

Poisson Probability Distribution


Poisson experiment
Experiments yielding numerical values of a random variable X, the number of outcomes occurring during
a given time interval or in a specified region, are called Poisson experiments. The given time interval
may be of any length, such as a minute, a day, a week, a month, or even a year.
Example
Hence a Poisson experiment can generate observations for the random variable X representing:

• The number of telephone calls per hour received by an office.
• The number of days school is closed due to snow during the winter.
• The number of postponed games due to rain during a baseball season.
• The number of field mice per acre.
• The number of bacteria in a given culture.
• The number of typing errors per page.

Properties of Poisson Process


A Poisson experiment is derived from the Poisson process and possesses the following properties:
1. The number of outcomes occurring in one time interval or specified region is independent of the
number that occurs in any other disjoint time interval or region of space. In this way we say that
the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time interval or in a small
region is proportional to the length of the time interval or the size of the region and does not
depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such
a small region is negligible.
Poisson distribution
The number X of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution is called the Poisson distribution. It gives the probability that X, the number of outcomes occurring in a given time interval or specified region, takes any particular value. The probability function of the Poisson distribution is

P(X = x) = e^(−m) m^x / x!,  x = 0, 1, 2, ...

where m is the average number of outcomes per unit time, distance, area, or volume, and e ≈ 2.71828.

Remarks: It should be noted that Σ_{x=0}^{∞} P(x) = Σ_{x=0}^{∞} e^(−m) m^x / x! = e^(−m) Σ_{x=0}^{∞} m^x / x! = e^(−m) e^m = 1
Properties of Poisson Distribution

• The Poisson distribution is a discrete distribution with a single parameter m.

• Total probability of the distribution will be unity, i.e. Σ_{x=0}^{∞} P(x) = 1.

• The mean and variance of the Poisson distribution are both equal to m.

• The skewness and kurtosis of the Poisson distribution are β₁ = 1/m and β₂ = 3 + 1/m.

• As m → ∞, β₁ → 0 and β₂ → 3.

Mean

We know the mean of the Poisson distribution is

μ = μ₁′ = E(x) = Σ_{x=0}^{∞} x p(x) = Σ_{x=0}^{∞} x e^(−m) m^x / x! = m e^(−m) Σ_{x=1}^{∞} m^(x−1)/(x − 1)! = m e^(−m) e^m = m

Variance

We know the variance of the Poisson distribution is

σ² = μ₂ = μ₂′ − (μ₁′)², where μ₂′ = E(x²) = E[x(x − 1) + x] = E[x(x − 1)] + E(x)

Again, E[x(x − 1)] = Σ_{x=0}^{∞} x(x − 1) e^(−m) m^x / x! = m² e^(−m) Σ_{x=2}^{∞} m^(x−2)/(x − 2)! = m² e^(−m) e^m = m²

We have already seen that E(x) = m.

We get μ₂′ = E[x(x − 1)] + E(x) = m² + m

Now σ² = μ₂ = μ₂′ − (μ₁′)² = m² + m − m² = m

Example
During a laboratory experiment the average number of radioactive particles passing through a counter in 1
millisecond is 4. What is the probability that 6 particles enter the counter in a given millisecond?
Solution: Using the Poisson distribution with x = 6 and m = 4, we have

P(6; 4) = e^(−4) (4)⁶/6! = (0.0183 × 4096)/720 ≈ 0.1042
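The same value follows from a direct evaluation of the Poisson pmf (a small illustrative sketch):

```python
from math import exp, factorial

def poisson_pmf(x, m):
    return exp(-m) * m**x / factorial(x)

p6 = poisson_pmf(6, 4)   # P(X = 6) when the mean rate is m = 4
```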
Example
In a certain industrial facility accidents occur infrequently. It is known that the probability of an accident
on any given day is 0.005 and accidents are independent of each other.
(a) What is the probability that in any given period of 400 days there will be an accident on one day?
(b) What is the probability that there are at most three days with an accident?
Solution: Let X be a binomial random variable with n = 400 and p = 0.005. Thus m = np = 2. Using the Poisson approximation,

(a) P(X = 1) = e^(−2) (2)¹/1! = 0.2707

(b) P(X ≤ 3) = Σ_{x=0}^{3} e^(−2) (2)^x/x! = 0.857
Example
In a manufacturing process where glass products are produced, defects or bubbles occur, occasionally rendering the piece undesirable for marketing. It is known that, on average, 1 in every 1000 of these items produced has one or more bubbles. What is the probability that a random sample of 8000 will yield fewer than 7 items possessing bubbles?
Solution: This is essentially a binomial experiment with n = 8000 and p = 0.001. Since p is very close to zero and n is quite large, we shall approximate with the Poisson distribution, using

m = np = 8000 × 0.001 = 8

Hence, if X represents the number of bubbles, we have

P(X < 7) = Σ_{x=0}^{6} e^(−8) 8^x/x! = 0.3134
Normal Distribution
The most important, continuous probability distribution in the entire field of statistics is the normal
distribution. Its graph, called the normal curve, is the bell shaped curve which describes approximately
many phenomena that occur in nature, industry, and research. Physical measurements in areas such as
meteorological experiments, rainfall studies, and measurements of manufactured parts are often more than
adequately explained with a normal distribution. In 1733, Abraham de Moivre developed the
mathematical equation of the normal curve. It provided a basis on which much of the theory of inductive
statistics is founded. The normal distribution is often referred to as the Gaussian distribution, in honor of
Karl Friedrich Gauss (1777–1855), who also derived its equation from a study of errors in repeated
measurements of the same quantity.

A continuous random variable X having the bell-shaped distribution is called a normal random variable. The mathematical equation for the probability distribution of the normal variable depends upon the two parameters μ and σ, its mean and standard deviation. Hence we denote the distribution of X by N(μ, σ²). The density of the normal random variable X, with mean μ and variance σ², is

f(x; μ, σ²) = [1/(σ√(2π))] e^(−(1/2)((x − μ)/σ)²),  −∞ < x < ∞

where π ≈ 3.1416 and e ≈ 2.71828 are two constants.

Properties of Normal Distribution


We summarize below some of the important properties of the normal distribution.

• The normal probability curve is symmetrical about the ordinate at x = μ, where μ is the mean of the distribution.

• The mode, which is the point on the horizontal axis where the curve is a maximum, occurs at x = μ, and the maximum ordinate is 1/(σ√(2π)).

• For the curve, the mean, median and mode are the same.

• The total area under the curve and above the horizontal axis is equal to 1.

• The curve has its points of inflection at x = μ ± σ. They may be obtained by solving the equation f″(x; μ, σ²) = 0. Thus the curve changes from convex to concave in relation to the horizontal axis at these points.

• The curve extends from minus infinity to plus infinity.

• All odd-order central moments of the distribution are zero.

• The values of β₁ and β₂ of the distribution are 0 and 3 respectively.

• The mean deviation about the arithmetic mean is (4/5)σ (approximately).

• The proportions of area lying within μ ± σ, μ ± 2σ and μ ± 3σ are respectively 68.26%, 95.44% and 99.73%. These proportions provide an excellent basis for tests of significance for large samples.

Importance of Normal Distribution in Statistics


Normal distribution plays a very important role in statistics because of the following reasons:

• Most of the distributions occurring in practice, e.g. the Binomial, Poisson and Hypergeometric distributions, can be approximated by the normal distribution under some assumptions. Moreover, many of the sampling distributions, e.g. Student's t, F and χ², tend to normality for large samples.

• The distribution has attractive mathematical properties which are very useful from a theoretical point of view.

• The proofs of all the tests of significance in sampling are based upon the fundamental assumption that the population from which the samples have been drawn is normal.

• The normal distribution finds wide application in statistical quality control theory.

Standard Normal Distribution


We are able to transform all the observations of any normal random variable X to a new set of observations of a normal random variable Z with mean 0 and variance 1. This can be done by means of the transformation

Z = (X − μ)/σ

Whenever X assumes a value x, the corresponding value of Z is given by z = (x − μ)/σ. Therefore, if X falls between the values X = x₁ and X = x₂, the random variable Z will fall between the corresponding values z₁ = (x₁ − μ)/σ and z₂ = (x₂ − μ)/σ. Consequently, we may write

f(z; 0, 1) = [1/√(2π)] e^(−z²/2),  −∞ < z < ∞

where Z is seen to be a normal random variable with mean 0 and variance 1.
The distribution of a normal random variable with mean 0 and variance 1 is called a standard normal
distribution.
Example
A random variable X is normally distributed with mean 12 and standard deviation 4. Find the probability of the following: i) X > 20, ii) X ≤ 20 and iii) 0 < X < 12.
Solution: Here μ = 12 and σ = 4. So we get

i) P(X > 20) = P((X − μ)/σ > (20 − 12)/4) = P(Z > 2) = 1 − P(Z ≤ 2) = 1 − Φ(2) = 1 − 0.9772 = 0.0228

ii) P(X ≤ 20) = P((X − μ)/σ ≤ (20 − 12)/4) = P(Z ≤ 2) = Φ(2) = 0.9772

iii) P(0 < X < 12) = P((0 − 12)/4 < (X − μ)/σ < (12 − 12)/4) = P(−3 < Z < 0)
   = P(Z < 0) − P(Z < −3) = Φ(0) − Φ(−3)
   = Φ(0) − {1 − Φ(3)} = 0.5 − {1 − 0.9987}
   = 0.5 − 0.0013 = 0.4987

Example
A city installs 2000 electric lamps for street lighting. These lamps have a mean burning life of
1000 hours with a standard deviation of 200 hours. The normal distribution is a close
approximation to this case.
i) What is the probability that a lamp will fail in the first 700 burning hours?
ii) What is the probability that a lamp will fail between 900 and 1300 burning hours?
iii) How many lamps are expected to fail between 900 and 1300 burning hours?
iv) After how many burning hours would we expect 10% of the lamps to be left?

Solution:
Here,   1000 and   200 . So we get

i) The probability that a lamp will fail in the first 700 burning hours is P(X < 700). So

P(X < 700) = P((X − μ)/σ < (700 − 1000)/200) = P(Z < −1.5) = Φ(−1.5) = 1 − Φ(1.5) = 1 − 0.9332 = 0.0668

Thus the probability that the burning life is less than 700 hours is 0.067.
ii) The probability that a lamp will fail between 900 and 1300 burning hours is P(900 < X < 1300). So

P(900 < X < 1300) = P((900 − 1000)/200 < (X − μ)/σ < (1300 − 1000)/200) = P(−0.5 < Z < 1.5)
= Φ(1.5) − Φ(−0.5) = Φ(1.5) − {1 − Φ(0.5)} = 0.9332 − (1 − 0.6915) = 0.9332 − 0.3085 = 0.6247

Thus the probability that the burning life is between 900 and 1300 hours is 0.625.
iii) This is a continuation of part (ii). The expected number of failures is given by the total number of lamps multiplied by the probability of failure in that interval. Then the expected number of failures = 2000 × 0.6247 = 1249.4, or about 1250 lamps. Because the burning life of each lamp is a random variable, the actual number of failures between 900 and 1300 burning hours would be only approximately 1250.
iv) "After how many burning hours would we expect 10% of the lamps to be left" means that P(X > x) = 10% = 0.10.
So
P(X > x) = 0.10
⇒ 1 − P(X ≤ x) = 0.10
⇒ P(X ≤ x) = 1 − 0.10 = 0.90
Then
P(X ≤ x) = 0.90
⇒ P((X − μ)/σ ≤ (x − μ)/σ) = 0.90
⇒ P(Z ≤ z) = 0.90
⇒ Φ(z) = 0.90
From the normal table we know that Φ(1.28) ≈ 0.90, so let us take z = 1.28. Then we have
z = 1.28
⇒ (x − μ)/σ = 1.28
⇒ (x − 1000)/200 = 1.28
⇒ x = 1.28 × 200 + 1000 = 1256
Then after 1256 hours of burning, we would expect 10% of the lamps to be left. And again, because the
burning time is a random variable, performing the experiment would give a result which would be close
to 1256 hours but probably not exactly that, even if the normal distribution with the given values of the
mean and standard deviation applied exactly.
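All four parts of the lamp example can be reproduced without a normal table; part iv inverts Φ by bisection. This is an illustrative sketch, not part of the original notes:

```python
from math import erf, sqrt

def Phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 1000, 200
p_i = Phi((700 - mu) / sigma)                                # part i
p_ii = Phi((1300 - mu) / sigma) - Phi((900 - mu) / sigma)    # part ii
n_iii = 2000 * p_ii                                          # part iii

# Part iv: solve Phi(z) = 0.90 by bisection on [0, 4].
lo, hi = 0.0, 4.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if Phi(mid) < 0.90 else (lo, mid)
x_iv = mu + sigma * (lo + hi) / 2   # burning hours leaving 10% of lamps
```

The bisection gives z ≈ 1.2816 rather than the table's rounded 1.28, so x comes out near 1256.3 hours.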
Example
Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62.
Solution: The z values corresponding to x₁ = 45 and x₂ = 62 are

z₁ = (x₁ − μ)/σ = (45 − 50)/10 = −0.5 and z₂ = (x₂ − μ)/σ = (62 − 50)/10 = 1.2

Therefore,
P(45 < X < 62) = P(−0.5 < Z < 1.2) = P(Z < 1.2) − P(Z < −0.5) = 0.8849 − 0.3085 = 0.5764.
Example

Given that X has a normal distribution with μ = 300 and σ = 50, find the probability that X assumes a value greater than 362.
Solution: To find P(X > 362) we need to evaluate the area under the normal curve to the right of x = 362. This can be done by transforming x = 362 to the corresponding z value, obtaining the area to the left of z, and then subtracting this area from 1. We find that

z = (362 − 300)/50 = 1.24.

Hence
P(X > 362) = P(Z > 1.24) = 1 − P(Z ≤ 1.24) = 1 − 0.8925 = 0.1075.
Example
In an industrial process the diameter of a ball bearing is an important component part. The buyer sets specifications on the diameter to be 3.00 ± 0.01 cm. The implication is that no part falling outside these specifications will be accepted. It is known that in the process the diameter of a ball bearing has a normal distribution with mean μ = 3.0 and standard deviation σ = 0.005. On the average, how many manufactured ball bearings will be scrapped?
Solution: The values corresponding to the specification limits are x₁ = 2.99 and x₂ = 3.01. The corresponding z values are

z₁ = (2.99 − 3.0)/0.005 = −2.0 and z₂ = (3.01 − 3.0)/0.005 = +2.0

Hence
P(2.99 < X < 3.01) = P(−2.0 < Z < 2.0) = P(Z < 2.0) − P(Z < −2.0) = 0.9772 − 0.0228 = 0.9544
As a result, it is anticipated that, on the average, 1 − 0.9544 = 0.0456, i.e. 4.56%, of manufactured ball bearings will be scrapped.
Example
A certain machine makes electrical resistors having a mean resistance of 40 ohms and a standard
deviation of 2 ohms. Assuming that the resistance follows a normal distribution and can be measured to
any degree of accuracy, what percentage of resistors will have a resistance exceeding 43 ohms?
Solution: A percentage is found by multiplying the relative frequency by 100%. Since the relative frequency for an interval is equal to the probability of falling in the interval, we must find the area to the right of x = 43. This can be done by transforming x = 43 to the corresponding z value, obtaining the area to the left of z from the normal table, and then subtracting this area from 1. We find

z = (x − μ)/σ = (43 − 40)/2 = 1.5

Therefore,
P(X > 43) = P(Z > 1.5) = 1 − P(Z ≤ 1.5) = 1 − 0.9332 = 0.0668.
Hence, 6.68% of the resistors will have a resistance exceeding 43 ohms.

Example
The average grade for an exam is 74, and the standard deviation is 7. If 12% of the class is given A's, and the grades are curved to follow a normal distribution, what are the lowest possible A and the highest possible B?
Solution: In this example we begin with a known area of probability, find the z value, and then determine x from the formula x = σz + μ. An area of 0.12 corresponds to the fraction of students receiving A's. We require a z value that leaves 0.12 of the area to the right and hence an area of 0.88 to the left. From the normal table, P(Z < 1.18) = 0.88, so the desired z value is 1.18.
Hence
x = σz + μ = 7 × 1.18 + 74 = 82.26
Therefore, the lowest A is 83 and the highest B is 82.
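The grade cutoff can also be found by inverting Φ numerically instead of reading z = 1.18 from a table; this illustrative sketch uses bisection, which gives z ≈ 1.175, so the cutoff differs slightly from the rounded table value:

```python
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 74, 7

# Solve Phi(z) = 0.88 (area 0.12 to the right) by bisection on [0, 4].
lo, hi = 0.0, 4.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if Phi(mid) < 0.88 else (lo, mid)
z = (lo + hi) / 2

cutoff = mu + sigma * z   # boundary between the highest B and the lowest A
```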
