
Chapter Three

Special Probability Distributions and Densities


Under special probability distributions we will look at the standard discrete distributions:
1. The Discrete Uniform Distribution
2. The Bernoulli Distribution
3. The Binomial Distribution
Under special densities we will look at the standard continuous distributions:
1. The Uniform Distribution
2. The Normal Distribution
2.1.1. Special probability Distributions
2.1.1.1. The Discrete Uniform Distributions

In a discrete uniform probability distribution, every value of the random variable has the same probability of being observed.

Example: Suppose that a fair die is thrown once and the outcome is the number of dots showing. Let X be the random variable that represents the number of dots on the die. Find the probability distribution P(X).

Number of dots, x_i      1    2    3    4    5    6    Total
Probability, P(x_i)      1/6  1/6  1/6  1/6  1/6  1/6  ∑ P(x_i) = 1

Graphically:
Figure: Bar chart of P(x_i) against x_i = 1, 2, ..., 6, with every bar at height 1/6.
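As a rough check of the die example, the following Python sketch builds the distribution with exact fractions and confirms that the six probabilities sum to 1. The names `faces`, `pmf` and `total` are illustrative and do not come from the notes.

```python
from fractions import Fraction

# Faces of a fair die; "fair" is taken to mean every face is equally likely.
faces = [1, 2, 3, 4, 5, 6]

# Discrete uniform PMF: each of the k values gets probability 1/k.
pmf = {x: Fraction(1, len(faces)) for x in faces}

# A valid probability distribution must sum to 1.
total = sum(pmf.values())

for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")
print("Sum of probabilities:", total)
```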
2.1.1.2. The Bernoulli distribution

This is perhaps the simplest possible discrete random variable. The Bernoulli distribution deals with a simple experiment that can result in only one of two possible outcomes. We call an experiment with two possible outcomes a Bernoulli trial, and we label the two outcomes success (S) and failure (F). The probability of failure is the complement of the probability of success: if P(S) = p, then P(F) = 1 - p.

We say that a random variable X has the Bernoulli(p) distribution if and only if its probability mass function is given by

$$f(x) = P(X = x) = p^{x}(1 - p)^{1 - x}, \qquad x = 0, 1$$

In applications we may collect dichotomous data. For example, we may simply record whether an item is defective (x = 0) or non-defective (x = 1), whether an individual is married (x = 0) or unmarried (x = 1), or whether a vaccine works (x = 0) or does not work (x = 1). In such situations, p stands for P(X = 1) and 1 - p stands for P(X = 0).
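A minimal Python sketch of the Bernoulli probability mass function is given here, assuming the usual coding of 1 for the outcome with probability p and 0 for the other outcome. The function name `bernoulli_pmf` and the value p = 0.3 are illustrative only.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for X ~ Bernoulli(p); zero for any x other than 0 or 1."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)

p = 0.3  # illustrative probability of the outcome coded as 1
prob_one = bernoulli_pmf(1, p)   # equals p
prob_zero = bernoulli_pmf(0, p)  # equals 1 - p
print(prob_one, prob_zero)
```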

2.1.1.3. The Binomial Distribution

An experiment that consists of a fixed number n of repeated, independent Bernoulli trials, each with probability of success p, is called a binomial experiment with n trials and parameter p.

A binomial experiment is a probability experiment that satisfies the following four requirements:

1) Each trial can have only two outcomes, or outcomes that can be reduced to two outcomes. These outcomes can be considered as either success or failure.
2) There must be a fixed number of trials.
3) The outcomes of the trials must be independent of each other.
4) The probability of a success must remain the same for each trial.

We say that a discrete random variable X has the Binomial(n, p) distribution if and only if its probability mass function (PMF) is given by

$$f(x) = P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x} = \binom{n}{x} p^{x} q^{n - x}, \qquad x = 0, 1, \dots, n$$

where $q = 1 - p$ and $0 < p < 1$. Writing out the combination term, the same PMF is

$$f(x) = P(X = x) = {}^{n}C_{x}\, p^{x} q^{n - x} = \frac{n!}{(n - x)!\,x!}\, p^{x} q^{n - x}, \qquad \text{where } \binom{n}{x} = \frac{n!}{(n - x)!\,x!} \text{ is the number of combinations.}$$

Here again p is referred to as a parameter. Observe that the Bernoulli(p) distribution is the same as the Binomial(1, p) distribution.

The Binomial(n, p) distribution arises as follows: consider repeating the Bernoulli experiment independently n times, where each time one observes the outcome (0 or 1) and p = P(X = 1) remains the same throughout. The number of successes in the n trials then has the Binomial(n, p) distribution.
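The binomial PMF can be sketched in a few lines of Python using `math.comb` for the combination term. The function name `binomial_pmf` and the parameter values below are illustrative assumptions. The sketch also checks the remark that Binomial(1, p) coincides with Bernoulli(p).

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.8  # illustrative parameters

# The probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

# With a single trial the binomial PMF reduces to the Bernoulli PMF.
single_trial_success = binomial_pmf(1, 1, p)   # equals p
single_trial_failure = binomial_pmf(0, 1, p)   # equals 1 - p
print(total, single_trial_success, single_trial_failure)
```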

The binomial distribution describes a process in which:

a) The process is performed under the same conditions for a fixed and finite number of times, say n.
b) Each trial is independent of the other trials, i.e. the probability of an outcome for any particular trial is not influenced by the outcomes of the other trials.
c) Each trial has two mutually exclusive possible outcomes, such as "success" or "failure", "non-defective" or "defective", "yes" or "no", "hit" or "miss", and so on. The outcomes are usually called success and failure for convenience.
d) The probability of success, p, remains constant from trial to trial (and so does the probability of failure q, where q = 1 - p).

Example: Suppose a company produces toothpaste. Historically, eight-tenths of the toothpaste tubes were correctly filled (successes). What is the probability of getting exactly three of six tubes (half a carton) correctly filled?

Solution:
Given p = 0.8, q = 0.2, r = 3 and n = 6, the binomial formula gives

$$P(X = 3) = \binom{6}{3}(0.8)^{3}(0.2)^{3} = 20 \times 0.512 \times 0.008 = 0.08192$$

Interpretation: the probability of getting exactly 3 correctly filled tubes out of six is 0.08192.

Example: A tyre wholesaler has 500 super band tyres in stock, of which 50 have slightly damaged steel belting and are randomly mixed in the stock. A retailer buys 10 tyres. What is the probability that the retailer receives 8 undamaged tyres?
Solution:
n = 10, p = 450/500 = 0.9
r = 8, q = 50/500 = 0.1

$$P(X = 8) = \binom{10}{8}(0.9)^{8}(0.1)^{2} = 45 \times 0.43047 \times 0.01 \approx 0.1937$$
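The two worked examples above (toothpaste tubes and tyres) can be evaluated with a short Python sketch of the binomial formula. The helper name `binomial_pmf` is illustrative, and the tyre calculation follows the notes in treating each tyre as an independent trial with p = 0.9.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Toothpaste example: exactly 3 of 6 tubes correctly filled, p = 0.8.
p_toothpaste = binomial_pmf(3, 6, 0.8)

# Tyre example: 8 of 10 tyres undamaged, p = 450/500 = 0.9.
p_tyres = binomial_pmf(8, 10, 0.9)

print(f"Toothpaste: {p_toothpaste:.5f}")
print(f"Tyres:      {p_tyres:.4f}")
```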

Example: In a short multiple-choice quiz, suppose that there are ten unrelated questions, each with five suggested choices as possible answers, and each question has exactly one correct answer. An unprepared student guesses all the answers in the quiz. Suppose that each correct answer carries one point and each wrong answer carries zero points.
a. Find the probability that the student gets 0/10.
b. Find the probability that the student gets at least 8.
c. Calculate the expected value.
Solution:
Let X stand for the student's quiz score. We can postulate that X has the Binomial(n = 10, p = 1/5) distribution, so

$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$$

a) The probability that the student gets 0 out of 10 is

$$P(X = 0) = \binom{10}{0}\left(\frac{1}{5}\right)^{0}\left(\frac{4}{5}\right)^{10} = 0.10737$$

b) The probability that the student's result is at least 8 is

$$P(X \ge 8) = \binom{10}{8}\left(\frac{1}{5}\right)^{8}\left(\frac{4}{5}\right)^{2} + \binom{10}{9}\left(\frac{1}{5}\right)^{9}\left(\frac{4}{5}\right)^{1} + \binom{10}{10}\left(\frac{1}{5}\right)^{10}\left(\frac{4}{5}\right)^{0} = 7.7926 \times 10^{-5}$$

c) The expected value of a discrete random variable can always be computed from

$$E(X) = \sum_{i=1}^{n} x_i\, f(x_i)$$

For a binomial random variable X, the mean is

$$\mu = E(X) = np$$

Therefore, if p = 1/5 and n = 10, then E(X) = (1/5)(10) = 2.
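The three parts of the quiz example can be checked with exact fractions, as in the following sketch. The variable names are illustrative, and the expected score is computed from the summation definition rather than from the shortcut np so that the two can be compared.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 5)  # ten questions, one correct choice out of five

def binomial_pmf(x: int) -> Fraction:
    """Return P(X = x) for the quiz score X ~ Binomial(10, 1/5)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_zero = binomial_pmf(0)                                    # part (a)
p_at_least_8 = sum(binomial_pmf(x) for x in range(8, n + 1))  # part (b)
expected_score = sum(x * binomial_pmf(x) for x in range(n + 1))  # part (c)

print(float(p_zero), float(p_at_least_8), expected_score)
```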
The variance ($\sigma^{2}$) and standard deviation ($\sigma$) of the binomial distribution are

$$\sigma^{2} = npq = np(1 - p), \qquad \sigma = \sqrt{npq}$$

Example: We toss a coin five times and are interested in the number of heads in each possible outcome. Let X represent the number of heads. Find the probability distribution of X, graph the distribution, and calculate the expected value of the random variable.
Solution:
The probability distribution of X is

Number of heads, x      0     1     2      3      4     5
f(x) = P(X = x)         1/32  5/32  10/32  10/32  5/32  1/32

This probability distribution can be found using the binomial formula with n = 5 and p = 1/2:

$$\begin{aligned}
f(0) &= P(X = 0) = \binom{5}{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{5} = \tfrac{1}{32} \\
f(1) &= P(X = 1) = \binom{5}{1}\left(\tfrac{1}{2}\right)^{1}\left(\tfrac{1}{2}\right)^{4} = \tfrac{5}{32} \\
f(2) &= P(X = 2) = \binom{5}{2}\left(\tfrac{1}{2}\right)^{2}\left(\tfrac{1}{2}\right)^{3} = \tfrac{10}{32} \\
f(3) &= P(X = 3) = \binom{5}{3}\left(\tfrac{1}{2}\right)^{3}\left(\tfrac{1}{2}\right)^{2} = \tfrac{10}{32} \\
f(4) &= P(X = 4) = \binom{5}{4}\left(\tfrac{1}{2}\right)^{4}\left(\tfrac{1}{2}\right)^{1} = \tfrac{5}{32} \\
f(5) &= P(X = 5) = \binom{5}{5}\left(\tfrac{1}{2}\right)^{5}\left(\tfrac{1}{2}\right)^{0} = \tfrac{1}{32}
\end{aligned}$$

Figure: Bar chart of f(x) against x = 0, 1, ..., 5, with bar heights 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32.

The expected value of a discrete random variable can always be computed from

$$E(X) = \sum_{i=1}^{n} x_i\, f(x_i)$$

For a binomial random variable X, the mean is

$$\mu = E(X) = np$$

Therefore, if p = 1/2 and n = 5, then E(X) = (1/2)(5) = 2.5.
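The coin-tossing distribution and its expected value can be reproduced with the sketch below, which assumes a fair coin (p = 1/2) as in the solution. The names `pmf`, `total` and `expected_heads` are illustrative.

```python
from fractions import Fraction
from math import comb

n, p = 5, Fraction(1, 2)  # five tosses of a fair coin

# Binomial PMF for x = 0, 1, ..., 5 heads, kept as exact fractions.
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

total = sum(pmf.values())                             # should be 1
expected_heads = sum(x * f for x, f in pmf.items())   # summation definition of E(X)

for x, f in pmf.items():
    print(f"f({x}) = {f.numerator * (32 // f.denominator)}/32")
print("E(X) =", expected_heads)
```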
The variance ($\sigma^{2}$) and standard deviation ($\sigma$) of the binomial distribution are

$$\sigma^{2} = npq = np(1 - p), \qquad \sigma = \sqrt{npq}$$
2.1.2. Special Densities

In the discussion of discrete probability distributions, we introduced the concept of a probability function f(x). Recall that this function provides the probability that the random variable X assumes some specific value. In the continuous case, the counterpart of the probability function is the probability density function, also denoted by f(x). For a continuous random variable, the probability density function gives the height of the density curve at any particular value of x; it does not directly give the probability that the random variable assumes that value. Probabilities are obtained as areas under f(x) over an interval.
2.1.2.1. The Uniform Distribution
A continuous probability distribution in which the probability that the random variable assumes a value in an interval is the same for every interval of equal length is called a uniform probability distribution.

A continuous random variable X is said to have a uniform distribution over the interval [a, b] if its probability density function is given by

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\[4pt] 0, & \text{otherwise} \end{cases}$$

It is used to model events that are equally likely to occur at any time within a given time
interval.
Figure: Graph of the uniform distribution. f(x) is a horizontal line at height 1/(b - a) between x = a and x = b, and zero elsewhere.
The graph of the PDF gives the height, or value, of the function at any particular value of x. Unlike the discrete probability function, the PDF of a continuous random variable does not itself represent a probability; probabilities correspond to areas under the curve.
The cumulative distribution function of a uniform distribution is given by

$$F(x) = P(X \le x) = \begin{cases} 0, & x < a \\[4pt] \dfrac{x - a}{b - a}, & a \le x \le b \\[4pt] 1, & x > b \end{cases}$$
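The density and cumulative distribution function of the uniform distribution translate directly into code. The following is a minimal sketch; the function names `uniform_pdf` and `uniform_cdf` and the interval [2, 6] are illustrative assumptions.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x) = P(X <= x) for the uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 6.0  # illustrative interval
height = uniform_pdf(3.0, a, b)           # constant height 1 / (b - a)
prob_up_to_3 = uniform_cdf(3.0, a, b)     # area of the rectangle from a to 3
print(height, prob_up_to_3)
```

Note that the density value 0.25 is a height, not a probability; the probability P(X ≤ 3) is the area 0.25 × (3 - 2).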

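The expected value of X is given by

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{a}^{b} \frac{x}{b - a}\,dx = \left[\frac{x^{2}}{2(b - a)}\right]_{a}^{b} = \frac{b^{2} - a^{2}}{2(b - a)} = \frac{(b - a)(b + a)}{2(b - a)} = \frac{a + b}{2}$$

The second moment of X is given by

$$E(X^{2}) = \int_{-\infty}^{\infty} x^{2} f(x)\,dx = \int_{a}^{b} \frac{x^{2}}{b - a}\,dx = \left[\frac{x^{3}}{3(b - a)}\right]_{a}^{b} = \frac{b^{3} - a^{3}}{3(b - a)} = \frac{(b - a)(b^{2} + ab + a^{2})}{3(b - a)} = \frac{b^{2} + ab + a^{2}}{3}$$

Thus, the variance of X is given by

$$\sigma_{x}^{2} = E(X^{2}) - (E(X))^{2} = \frac{b^{2} + ab + a^{2}}{3} - \frac{b^{2} + 2ab + a^{2}}{4} = \frac{b^{2} - 2ab + a^{2}}{12} = \frac{(b - a)^{2}}{12}$$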

Example: The time that Abebe, the teaching assistant, takes to grade a paper is uniformly distributed between 5 minutes and 10 minutes. Find the mean and variance of the time to grade a paper.

Solution: Let X be a random variable that denotes the time it takes Abebe to grade a paper. With a = 5 and b = 10, the mean (expected value) E(X) and the variance $\sigma_{x}^{2}$ are

$$E(X) = \frac{10 + 5}{2} = 7.5$$

$$\sigma_{x}^{2} = \frac{(10 - 5)^{2}}{12} = \frac{25}{12}$$
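As a sanity check on the closed-form mean (a + b)/2 and variance (b - a)²/12, the sketch below compares them with a numerical midpoint-rule integration for the grading-time example (a = 5, b = 10). The helper `midpoint_integral` and the step count are illustrative choices, not part of the notes.

```python
def midpoint_integral(g, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

a, b = 5.0, 10.0          # grading time in minutes
density = 1.0 / (b - a)   # uniform density on [a, b]

# Closed-form results derived above.
mean_formula = (a + b) / 2
variance_formula = (b - a) ** 2 / 12

# Numerical versions of E(X) and E(X^2) - (E(X))^2.
mean_numeric = midpoint_integral(lambda x: x * density, a, b)
second_moment = midpoint_integral(lambda x: x * x * density, a, b)
variance_numeric = second_moment - mean_numeric**2

print(mean_formula, variance_formula)
print(mean_numeric, variance_numeric)
```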
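Example: The random variable X is supposed to be uniformly distributed between 10 and 20.
a) Find P(X ≤ 15).
b) Find P(12 ≤ X ≤ 18).
c) Compute E(X) and Var(X).
Solution:

a) With a = 10 and b = 20, the density and cumulative distribution function are

$$f(x) = \begin{cases} \dfrac{1}{10}, & 10 \le x \le 20 \\[4pt] 0, & \text{otherwise} \end{cases} \qquad F(x) = \begin{cases} 0, & x < 10 \\[4pt] \dfrac{x - 10}{10}, & 10 \le x \le 20 \\[4pt] 1, & x > 20 \end{cases}$$

$$P(X \le 15) = F(15) = \frac{15 - 10}{20 - 10} = 0.5$$

b) $$P(12 \le X \le 18) = F(18) - F(12) = \frac{18 - 12}{20 - 10} = 0.6$$

c) $$E(X) = \frac{10 + 20}{2} = 15$$

$$\operatorname{Var}(X) = \frac{(20 - 10)^{2}}{12} = \frac{100}{12} \approx 8.33$$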
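The three parts of this example can be evaluated with the uniform CDF and the mean and variance formulas, as in the following sketch. The function and variable names are illustrative.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x) = P(X <= x) for the uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 10.0, 20.0

p_at_most_15 = uniform_cdf(15.0, a, b)                               # part (a)
p_12_to_18 = uniform_cdf(18.0, a, b) - uniform_cdf(12.0, a, b)       # part (b)
mean = (a + b) / 2                                                   # part (c)
variance = (b - a) ** 2 / 12                                         # part (c)

print(p_at_most_15, round(p_12_to_18, 4), mean, round(variance, 2))
```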

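Example: The mean of a uniformly distributed random variable is 10 and the range is 1.8.
a) What are the smallest and the largest values of the distribution?
b) What is the probability that the random variable takes a value between 9 and 10.5?
Solution:
Given that:

$$E(X) = \frac{a + b}{2} = 10, \quad\text{so}\quad a + b = 20 \qquad (1)$$

a. $$\text{Range} = b - a = 1.8 \qquad (2)$$

Solving equations (1) and (2) simultaneously gives a = 9.1 and b = 10.9.

b. $$f(x) = \begin{cases} \dfrac{1}{1.8}, & 9.1 \le x \le 10.9 \\[4pt] 0, & \text{otherwise} \end{cases}$$

Since the density is zero below a = 9.1,

$$P(9 \le X \le 10.5) = \frac{10.5 - 9.1}{1.8} = \frac{1.4}{1.8} \approx 0.7778$$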
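The endpoints follow from the mean and the range because a = mean - range/2 and b = mean + range/2. The sketch below applies this and then evaluates the requested probability with the uniform CDF; all names are illustrative.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x) = P(X <= x) for the uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

mean, value_range = 10.0, 1.8

# Solve (a + b) / 2 = mean and b - a = value_range for a and b.
a = mean - value_range / 2
b = mean + value_range / 2

# P(9 <= X <= 10.5); the CDF is 0 at x = 9 because 9 lies below a.
prob = uniform_cdf(10.5, a, b) - uniform_cdf(9.0, a, b)

print(round(a, 2), round(b, 2), round(prob, 4))
```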
2.1.2.2. The Normal Distribution
The most useful theoretical distribution for continuous random variables is the normal distribution. It has been observed that many business and economic variables generate continuous data whose behaviour is often well described by a bell-shaped continuous curve. Since this is what we normally come across for most populations on these variables, the bell-shaped curve has come to be universally known as the normal curve. Accordingly, the probability distribution described by a normal curve is called the normal (probability) distribution.

The normal distribution has acquired a wide range of applications in many areas of human knowledge. It is used in almost all data-based research in the fields of agriculture, business, and industry. As will be seen later, much of the theory of inductive statistics, concerning the estimation of unknown population parameters and the testing of hypotheses on the basis of sample statistics, has been developed using the concept of the normal curve.

There are two basic reasons why the normal distribution occupies such a prominent place in statistics.
First, it has properties that make it widely applicable in situations in which it is necessary to make inferences by taking samples.
Second, it comes close to fitting the observed frequency distributions of many phenomena, for instance human characteristics (weights, heights and IQs), outputs from physical processes (such as dimensions and yields), and other measures of interest to economists and business professionals in both the public and private sectors, e.g. per capita income in developing countries, air pollution in a community, etc.

Normal Distribution Equation and Its Parameters


A continuous variable that yields a bell-shaped curve, as shown in the figure below, is called a normal variable. The normal curve describes the probability distribution of the normal variable X, which is said to be normally distributed with parameters mean µ and standard deviation σ, denoted n(X; µ, σ), in which n is read as normal.

The probability density function of a normally distributed random variable is as follows:

$$f(x) = n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty$$

in which π = 3.14159 and e = 2.71828.

Figure: A normal distribution curve, with its peak at the point where mean = mode = median.
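The normal density can be written as a small Python function. This is a minimal sketch; the name `normal_pdf` and the parameter values µ = 50, σ = 10 are illustrative. It checks two features of the bell shape: the curve is highest at x = µ, and it is symmetric about µ.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi * sigma**2)

mu, sigma = 50.0, 10.0  # illustrative parameters

peak = normal_pdf(mu, mu, sigma)           # height at the mean
left = normal_pdf(mu - 15.0, mu, sigma)    # 15 units below the mean
right = normal_pdf(mu + 15.0, mu, sigma)   # 15 units above the mean

print(peak, left, right)
```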

The parameters µ and σ are the critical values in the normal distribution equation. Other terms being constant, the exponent

$$-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2} \quad\text{or}\quad -\frac{(x - \mu)^{2}}{2\sigma^{2}}$$

is the only operational part of the equation. It reflects the deviation of the value of the normal variable X from its mean (µ). The larger these deviations, the higher the value of the standard deviation σ (or variance σ²), which is the denominator in the exponent.

It may be seen that µ lies at the centre of the normal curve and indicates the central value of the normal distribution. The standard deviation σ is a measure of the extent of the spread of X values from the central value (µ). Thus, while µ fixes the position or level of the distribution on the X-axis, σ determines the spread of the distribution along the X-axis on both sides of the central value.

In the light of the above, consider the following three situations, as shown in the figures below:
1. A change in µ, with the standard deviation σ remaining the same, shifts the curve along the X-axis without changing its spread.

Figure: Normal curves with different means (µ1 ≠ µ2) and σ1 = σ2

2. A change in σ, with µ remaining the same, changes the shape or spread of the normal curve.

Figure: Normal curves with different standard deviations (σ2 > σ1) and µ1 = µ2

3. An increase in σ increases the spread of the normal curve equally on both sides of the central value; it lowers the height of the normal curve, irrespective of whether or not there is any change in µ. A decrease in σ, on the contrary, reduces the spread of the normal curve and increases its height. The inverse relationship between the extent of spread of the normal curve and its height at the central value can be easily grasped by observing the above curves. This is so because the total area under any normal curve must always be 1.
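The trade-off between spread and height can be illustrated numerically with `statistics.NormalDist` from the Python standard library. In this sketch the two standard deviations (1 and 2) and the midpoint-rule helper `area_under_curve` are illustrative assumptions; the integration range of ±8σ is chosen so that the neglected tails are negligible.

```python
from statistics import NormalDist

def area_under_curve(dist: NormalDist, steps: int = 20_000) -> float:
    """Midpoint-rule area under the density between mean - 8 sd and mean + 8 sd."""
    lo = dist.mean - 8 * dist.stdev
    hi = dist.mean + 8 * dist.stdev
    h = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * h) for i in range(steps)) * h

narrow = NormalDist(mu=0.0, sigma=1.0)
wide = NormalDist(mu=0.0, sigma=2.0)   # same mean, twice the spread

peak_narrow = narrow.pdf(0.0)
peak_wide = wide.pdf(0.0)              # half the height of the narrow curve

area_narrow = area_under_curve(narrow)
area_wide = area_under_curve(wide)     # both areas are approximately 1

print(peak_narrow, peak_wide, area_narrow, area_wide)
```

Doubling σ halves the peak height, which is what keeps the total area equal to 1.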
Properties of the Normal Curve
As the position and spread of the normal curve are determined only by the values of µ and σ, there are different normal curves (different in spread and position) for different values of these two parameters. This enables us to state the following properties of the normal curve.
1. The normal curve is not a single curve representing only one continuous distribution. It represents a family of normal curves, since for each different value of µ and/or σ there is a specific normal curve, different in position on the X-axis and in spread around the central value.
2. A change in the value of µ displaces the entire curve to a different level, whereas a change in σ changes the spread and determines its height.
3. The normal curve is completely determined by the values of µ and σ, which are its two parameters.
4. The mode of the normal distribution occurs at the point on the X-axis where the curve reaches its maximum height. Since most observations tend to cluster around the mean value, the point of the mode coincides with the point of the mean. That is, mean and mode are equal at the point where the curve attains its maximum height.
5. Being bell-shaped, the normal curve is perfectly symmetrical about the vertical axis through the mean. As a result, 50 percent of the area lies to the right of the mean and the other 50 percent to its left.
6. Areas under the curve give probabilities for normally distributed variables. The area under the normal curve is distributed as follows: about 68.26% lies within µ ± 1σ, about 95.44% within µ ± 2σ, and about 99.74% within µ ± 3σ.
7. Perfect symmetry also means that the mean, median and mode are all equal in the case of a normal distribution.
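The areas within one, two and three standard deviations of the mean can be computed with `statistics.NormalDist`. This is a sketch for checking the figures quoted in these notes (68.26%, 95.44%, 99.74%), which appear to come from a four-decimal table; the computed values differ slightly in the last digit. The helper `area_within` is an illustrative name.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4f}")

# By symmetry, half of the total area lies on each side of the mean.
area_left_of_mean = std_normal.cdf(0.0)
```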

The Standard Normal Distribution

Since each normally distributed variable has its own mean (µ) and standard deviation (σ), as stated earlier, the shape and location of these curves will vary. In practical applications, one would then need a separate table of areas under the curve for each variable. In order to simplify this situation, statisticians use what is called the standard normal distribution.
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. One advantage of normally distributed variables is that they can all be transformed into the standard normal distribution by using the formula for the standard score (Z). In order to transform the X-scale, X ~ N(µ, σ), into the Z-scale, Z ~ N(0, 1), we use the following formula:

$$Z = \frac{X - \mu}{\sigma}$$

As stated earlier, the area under the normal distribution curve is used to solve practical application problems; hence the major emphasis of this section is to show the procedures for finding the area under the normal distribution curve for any Z value. Once the X values are transformed by using the above formula, they are called Z values. The Z value is the number of standard deviations that a particular X value lies away from its mean (i.e. below or above the mean). For example, Z = ±2 implies that the value X is 2 standard deviations above or below the mean.
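The standard-score transformation and its inverse are one-line functions. The sketch below uses made-up values (µ = 100, σ = 10) purely for illustration; the names `z_score` and `x_from_z` are not from the notes.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def x_from_z(z: float, mu: float, sigma: float) -> float:
    """Inverse transformation: the X value that corresponds to a given Z value."""
    return mu + z * sigma

mu, sigma = 100.0, 10.0  # illustrative parameters

z = z_score(120.0, mu, sigma)    # 120 is two standard deviations above the mean
x = x_from_z(-2.0, mu, sigma)    # two standard deviations below the mean

print(z, x)
```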
X-values   µ-3σ   µ-2σ   µ-1σ   µ   µ+1σ   µ+2σ   µ+3σ
Z-values   -3     -2     -1     0   1      2      3

Area within ±1 standard deviation of the mean: 68.26%
Area within ±2 standard deviations of the mean: 95.44%
Area within ±3 standard deviations of the mean: 99.74%

The above figure shows the graph of the probability density function of Z, with mean equal to zero and standard deviation equal to one. This curve is symmetric, bell-shaped and centred on the mean of zero, and has almost all of its area contained within the range ±3 (99.74%).
Example: Find the area under the normal curve for Z = ±1.54.
Solution: P(0 ≤ Z ≤ 1.54) = 0.4382 (from the table)
P(-1.54 ≤ Z ≤ 0) = 0.4382

P(-1.54 ≤ Z ≤ 1.54) = 0.4382 + 0.4382 = 0.8764

Example: Find the area to the right of Z = 0.25.

Solution: P(Z ≥ 0.25) = 0.5 - 0.0987 = 0.4013

Example: Find the area to the left of Z = 1.96.

Solution: P(Z ≤ 1.96) = 0.5 + 0.475 = 0.975

Example: Find the area between Z = 0.6 and Z = 1.8.
Solution:
P(0 ≤ Z ≤ 1.8) = 0.4641
P(0 ≤ Z ≤ 0.6) = 0.2257

P(0.6 ≤ Z ≤ 1.8) = 0.4641 - 0.2257 = 0.2384
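The four table look-ups above can be cross-checked with the cumulative distribution function of the standard normal distribution, available as `statistics.NormalDist().cdf`. This is a sketch; the variable names are illustrative, and small differences in the fourth decimal place are expected because the table values are rounded.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area between Z = -1.54 and Z = +1.54.
area_pm_154 = std_normal.cdf(1.54) - std_normal.cdf(-1.54)

# Area to the right of Z = 0.25.
area_right_025 = 1.0 - std_normal.cdf(0.25)

# Area to the left of Z = 1.96.
area_left_196 = std_normal.cdf(1.96)

# Area between Z = 0.6 and Z = 1.8.
area_06_to_18 = std_normal.cdf(1.8) - std_normal.cdf(0.6)

for label, area in [
    ("between -1.54 and 1.54", area_pm_154),
    ("right of 0.25", area_right_025),
    ("left of 1.96", area_left_196),
    ("between 0.6 and 1.8", area_06_to_18),
]:
    print(f"{label}: {area:.4f}")
```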

Example: The incomes of a group of 1000 persons were found to be normally distributed with a mean of 750 Birr per month and a standard deviation of 50 Birr. Show that about 95% of this group had an income exceeding 668 Birr and only 5% had an income exceeding 832 Birr.

Solution:
Given: µ = 750 Birr, σ = 50 Birr, X1 = 668 Birr, X2 = 832 Birr

$$Z_{1} = \frac{668 - 750}{50} = -1.64, \qquad Z_{2} = \frac{832 - 750}{50} = 1.64$$

From the table, the area between Z = 0 and Z = 1.64 is 0.4495.

Therefore, P(income exceeding 668 Birr) = 0.5 + 0.4495 = 0.9495 ≈ 95%

Therefore, P(income exceeding 832 Birr) = 0.5 - 0.4495 = 0.0505 ≈ 5%
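The income example can be checked without a table by working directly on the X-scale with `statistics.NormalDist`. The sketch below is illustrative; `income`, `p_above_668` and `p_above_832` are made-up names.

```python
from statistics import NormalDist

# Monthly income in Birr, modelled as normal with mean 750 and standard deviation 50.
income = NormalDist(mu=750.0, sigma=50.0)

z_low = (668.0 - income.mean) / income.stdev    # -1.64
z_high = (832.0 - income.mean) / income.stdev   # +1.64

p_above_668 = 1.0 - income.cdf(668.0)   # about 95%
p_above_832 = 1.0 - income.cdf(832.0)   # about 5%

print(round(z_low, 2), round(z_high, 2))
print(round(p_above_668, 4), round(p_above_832, 4))
```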

Example: A continuous manufacturing process produces items whose weights are normally distributed with a mean weight of 800 g and a standard deviation of 300 g. A random sample of 16 items is to be drawn from the process.
a. What is the probability that the arithmetic mean of the sample exceeds 900 g? Interpret the results.
b. Find the values of the sample arithmetic mean within which the middle 95 percent of all sample means will fall.
Solution: (a) We are given the following information:
µ = 800 g, σ = 300 g, and n = 16
Since the population is normally distributed, the distribution of the sample mean is normal with mean and standard deviation equal to

$$\mu_{\bar{x}} = \mu = 800 \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{16}} = \frac{300}{4} = 75$$

The required probability $P(\bar{x} > 900)$ is found from

$$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{900 - 800}{75} = 1.33, \qquad P(\bar{x} > 900) = P(Z > 1.33) = 0.5 - 0.4082 = 0.0918$$

Hence, 9.18 percent of all possible samples of size n = 16 will have a sample mean greater than 900 g.

Figure: Normal curve of the sample mean centred at 800 g, with the area to the right of 900 g (Z = 1.33) shaded; shaded area = 0.0918.

(b) Since Z = ±1.96 encloses the middle 95 percent of the area under the normal curve, using the formula for Z to solve for the sample mean in terms of the known values gives:

$$\bar{x}_{1} = \mu_{\bar{x}} - Z\sigma_{\bar{x}} = 800 - 1.96(75) = 653 \text{ g}$$

$$\bar{x}_{2} = \mu_{\bar{x}} + Z\sigma_{\bar{x}} = 800 + 1.96(75) = 947 \text{ g}$$

Therefore, 95% of all sample means fall within [653, 947] g.

Figure: Normal curve of the sample mean with the middle area of 0.9500 lying between the two limits.
Example: In a normal distribution, 31 percent of the items are under 45 and 8 percent are over 64. Find the mean and standard deviation of the distribution.
Solution: Since 31 percent of the items are under 45, the area to the left of the ordinate at X = 45 is 0.31, and the area between this ordinate and the mean is (0.5 - 0.31) = 0.19. The value of Z corresponding to this area is 0.5, and it is negative because 45 lies below the mean. Hence

$$Z = \frac{45 - \mu}{\sigma} = -0.5 \quad\text{or}\quad \mu - 0.5\sigma = 45$$

As 8 percent of the items are above 64, the area to the right of the ordinate at X = 64 is 0.08. The area between the mean and the ordinate at X = 64 is (0.5 - 0.08) = 0.42, and the value of Z corresponding to this area is 1.4. Hence

$$Z = \frac{64 - \mu}{\sigma} = 1.4 \quad\text{or}\quad \mu + 1.4\sigma = 64$$

Subtracting the first equation from the second gives 1.9σ = 19, or σ = 10. Substituting σ = 10 in the first equation gives µ - 0.5 × 10 = 45, or µ = 50.
Thus, the mean of the distribution is 50 and the standard deviation is 10.

Figure: Normal curve with ordinates at 45, µ and 64. The areas are 31% to the left of 45, 19% between 45 and µ, 42% between µ and 64, and 8% to the right of 64.
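The same two-equation approach can be carried out numerically by taking the Z values from the inverse CDF instead of a table. In this sketch the names are illustrative, and because the Z values are not rounded to -0.5 and 1.4, the results are close to, but not exactly, 50 and 10.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Z values that leave 31% of the area below and 8% of the area above.
z_low = std_normal.inv_cdf(0.31)        # close to -0.5
z_high = std_normal.inv_cdf(1 - 0.08)   # close to 1.4

# Solve 45 = mu + z_low * sigma and 64 = mu + z_high * sigma.
sigma = (64.0 - 45.0) / (z_high - z_low)
mu = 45.0 - z_low * sigma

print(round(z_low, 3), round(z_high, 3), round(mu, 2), round(sigma, 2))
```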
