
DSC 1371 Business Statistics

CHAPTER – 7: Random Variables and Probability Distributions


Chapter Objectives:
This chapter aims to:
• Explain the Concepts of Random Variables and Probability Distributions.
• Define Discrete and Continuous Random Variables.
• Define a Discrete Probability Distribution along with its Mean and Variance.
• Introduce the Binomial and Poisson Discrete Distributions and Discuss their Applications.
• Define Continuous Probability Distributions.
• Describe the Normal Distribution and its Applications.

Learning outcomes:
After learning the content of this chapter, students should be able to:
• Define and Identify Discrete and Continuous Random Variables.
• Calculate the Mean and Variance of Discrete Random Variables.
• Calculate Probabilities using Binomial and Poisson Probability Functions.
• Solve Probability Problems involving Binomial and Poisson Distributions.
• Describe the Properties of the Normal Distribution and its Importance.
• Use Standard Normal Tables to find Probabilities of the Normal Distribution.
• Solve Probability Problems involving the Normal Distribution.

A probability distribution is a statistical function that describes all the possible values
a random variable can take within a given range, together with their probabilities. A
variable whose values are the outcomes of a random experiment is known as a random
variable. The observed outcomes of such an experiment arise entirely by chance, are
unpredictable, and may differ from trial to trial. Under simple random selection, each
possible entity has the same chance of being chosen.

Example 7.1:
i. Outcomes of lottery drawing
ii. Outcomes of a toss of a fair coin

A random variable X is a rule that assigns a numerical value to each possible outcome of
a random experiment. The specific values assigned to the outcomes are denoted by the
symbols x1, x2, x3, …, and so forth. In other words, the function X transforms the outcomes
of the experiment into real numbers. A random variable may be discrete or continuous.

A random variable X is discrete if it can assume a finite or countably infinite number of
different values. A discrete random variable is usually the result of counting something.

Example 7.2:
i. Number of defective items in a production batch
ii. Number of telephone calls received in an hour

A random variable X is continuous if it can assume an uncountably infinite number of values
within certain limits or intervals.

Example 7.3:
i. The length of an iron bar in cm: 72, 72.2, 72.273, and so on, depending on the
accuracy of the measuring device.
ii. Tire pressure in pounds per square inch (psi): 28, 28.6, 28.62, 28.624, and so on,
depending on the accuracy of the gauge.

7.1 Types of Probability Distribution


Probability distributions can be divided into two types, namely Discrete Probability
Distributions and Continuous Probability Distributions, depending on whether the
random variable is discrete or continuous.

7.2 Discrete Probability Distributions and Functions


A discrete probability distribution is a listing of all possible values of a discrete random
variable X and the corresponding probabilities.

Example 7.4:
Construct the probability distribution of the number of heads obtained in three tosses of a
fair coin.

Let X be the random variable denoting the number of heads.

Number of Heads (X)    Probability P(X)
0                      1/8
1                      3/8
2                      3/8
3                      1/8
Total                  1
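
The table in Example 7.4 can be checked by listing the eight equally likely outcomes of three tosses and counting heads. The following is a minimal Python sketch of that enumeration; the variable names are illustrative and not part of the course notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 2**3 = 8 equally likely outcomes of three tosses of a fair coin.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Each outcome has probability 1/8, so
# P(X = x) = (number of outcomes with x heads) / 8.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}

for x, p in pmf.items():
    print(x, p)
print("Total", sum(pmf.values()))
```

Exact fractions are used here so that the printed probabilities match the table rather than appearing as rounded decimals.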

Discrete Probability Function


Let X be a discrete random variable defined on the sample space S, with set of possible
values S(X) = {x1, x2, …, xn}. If p(xi) = P(X = xi) denotes the probability that the random
variable X takes the value xi, then the real-valued function p defined on R is called the
discrete probability distribution function or probability mass function of X, provided
that it satisfies the following properties.

i. $$p(x) \ge 0 \quad \text{for all } x \in \mathbb{R}$$
ii. $$\sum_{i} p(x_i) = 1$$

Normally we present the probability mass function as a table of probabilities, as given below.

x           x1       x2       …    xn
P(X = x)    p(x1)    p(x2)    …    p(xn)

Example 7.5: Probability mass function of X of above Example 7.4 is,


X 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8

Mathematical Expectation of a Discrete Random Variable

If X is a discrete random variable which assumes a finite number of values x1, x2, …, xn,
with respective probabilities p(x1), p(x2), …, p(xn), the expected value (or mean) of X,
denoted by E[X] or μ, is defined as:

$$E[X] = \mu = \sum_{i=1}^{n} x_i \, p(x_i)$$

The expected value of a random variable is the weighted average of the possible values it
can assume, where the weights are the probabilities of occurrence of those values. If the
random experiment is undertaken repeatedly a large number of times, the expected value
of the random variable will be a good approximation to the average or the mean.
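
As an illustration of the weighted average described above, the following minimal Python sketch computes E[X] for the number of heads in three tosses of a fair coin (the distribution of Example 7.4). The function name `expected_value` is an illustrative choice.

```python
from fractions import Fraction

# Probability mass function of X = number of heads in three tosses (Example 7.4).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def expected_value(pmf):
    """Weighted average of the values, with the probabilities as weights."""
    return sum(x * p for x, p in pmf.items())


mean = expected_value(pmf)
print(mean)  # 3/2
```

The result, 3/2, is not itself a possible value of X; it is the long-run average number of heads per set of three tosses.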

Example 7.6:
Find the expected value of the random variable X defined in Example 7.4.

Variance of a Discrete Random Variable


The variance of a discrete random variable is the sum, over all possible values, of the
squared deviation of each value from the mean multiplied by the probability of that value.

The variance of a discrete random variable X, which takes the values xi with probability
pi (for i = 1, …, r) and has mean μ, is:

$$\mathrm{Var}[X] = \sigma^2 = \sum_{i=1}^{r} p_i (x_i - \mu)^2$$

Var[X] may also be written as:

$$E[(X - \mu)^2] \quad \text{or} \quad E[(X - E[X])^2]$$

From the above equation, we can derive an alternative form for the variance:

$$\sigma^2 = E[X^2] - (E[X])^2$$

The standard deviation σ is the positive square root of the variance.
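
The two forms of the variance can be compared numerically. The sketch below, again using the distribution of Example 7.4 as input, computes the variance from the definition and from the alternative form E[X²] − (E[X])²; the variable names are illustrative.

```python
from fractions import Fraction

# Probability mass function of X = number of heads in three tosses (Example 7.4).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Mean: sum of x * p(x).
mean = sum(x * p for x, p in pmf.items())

# Definition: sum of p(x) * (x - mean)^2.
var_definition = sum(p * (x - mean) ** 2 for x, p in pmf.items())

# Alternative form: E[X^2] - (E[X])^2.
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mean**2

# Standard deviation: positive square root of the variance.
std_dev = float(var_definition) ** 0.5

print(var_definition, var_shortcut)  # 3/4 3/4
print(round(std_dev, 4))
```

Both forms give the same value; the alternative form is usually quicker by hand because it avoids computing each deviation from the mean.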

Example 7.7:
Find the variance of the random variable X defined in Example 7.4.

Broadly speaking, regardless of the nature of the values assigned to the random variable,
probability distributions are of two types: sampling distributions and theoretical distributions.

Distributions that are based on actual data or experimentation are called sampling
distributions. On the other hand, a distribution based on expectations formed from past
experience is known as a theoretical distribution or probability distribution. In short, a
sampling distribution is based on actual sample studies, whereas a theoretical distribution is
based on expectations drawn from previous experience or theoretical considerations. For
theoretical distributions, a random experiment is theoretically assumed to serve as a model,
and the probabilities are given by a function of the random variable called the probability
function. Some of these theoretical distributions are discussed in the following sections.

7.2.1 Binomial Distribution


The Binomial distribution is one of the simplest and most frequently used discrete probability
distributions, and it is very useful in many practical situations involving two outcomes such
as success and failure. The Binomial distribution has certain distinct properties, which are
enumerated as follows.

1. The experiment is performed under the same conditions for a fixed and finite number
of trials, say n.

2. The result of each trial can be classified into one of two categories called ‘Success’ and
‘Failure’.

3. The probability of success p remains constant from trial to trial. Similarly, the
probability of failure q or (1-p) remains constant over all observations.

4. Each trial is independent of the other trials. This means that the outcome of any trial
does not influence the outcome of any other trial.

Definition
A random variable X is said to have a Binomial Distribution with parameters n and p if
and only if its probability mass function is given by

$$P(X = x) = {}^{n}C_{x} \, p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$$

If X has a Binomial distribution with parameters n and p, we write X ~ Bin(n, p).

E(X) = μ = np
Var(X) = σ² = np(1 − p)
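
The probability mass function above can be evaluated directly instead of being read from tables. The following is a minimal Python sketch using only the standard library; the function name `binomial_pmf` and the choice n = 12, p = 0.3 are illustrative. It also confirms numerically that the mean and variance agree with np and np(1 − p).

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 12, 0.3  # illustrative parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

p_exactly_two = pmf[2]                # P(X = 2)
p_at_most_three = sum(pmf[:4])        # P(X <= 3) = P(0) + P(1) + P(2) + P(3)
p_at_least_three = 1 - sum(pmf[:3])   # P(X >= 3) = 1 - P(X <= 2)

# Mean and variance computed from the pmf, to compare with np and np(1 - p).
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(round(p_exactly_two, 4), round(p_at_most_three, 4), round(p_at_least_three, 4))
print(round(mean, 4), round(variance, 4))
```

"At least" probabilities are obtained through the complement, which needs fewer terms than summing the upper tail directly.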
Example 7.8:
For the binomial distribution with n = 12 and p = 0.3, calculate the probability of
i. two successes ii. at most three successes iii. at least three successes

Example 7.9:
According to previous records, 5% of the nails produced by a certain machine are defective.
If 10 nails produced by this machine are chosen at random, find the probability that:
a. 2 nails are defective
b. at most 2 nails are defective
c. at least 3 nails are defective

Example 7.10:
A multiple-choice quiz has 10 questions. Each question has five possible answers, of which
only one is correct. What is the probability that sheer guesswork will yield at least four
correct answers?

7.2.2 Poisson Distribution

Poisson distribution deals with the problem of counting the number of occurrences of a
particular event during a specified time interval or a region of space. This distribution was
discovered by the French mathematician S.D. Poisson in 1837. It is often referred to as the
law of improbable events, meaning that the probability p of a particular event’s happening
is small. The Poisson distribution is a discrete probability distribution because it is formed
by counting something.

Example 7.11:
i. In quality control, to count the number of defects.
ii. In insurance problems, to count the number of casualties.
iii. In waiting-time problems, to count the number of telephone calls or incoming
customers, etc.

Definition
A discrete random variable X whose probability mass function is given by

$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \text{ and } \lambda > 0$$

is said to have the Poisson distribution. This distribution has a single parameter λ. If the
random variable X has the Poisson distribution with parameter λ, we write X ~ P(λ).

E(X) = λ and Var(X) = λ
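
The Poisson probability mass function can be evaluated in the same way. The sketch below is a minimal Python version using only the standard library; the function name `poisson_pmf` and the value λ = 4 are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ P(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


lam = 4  # illustrative parameter

p_exactly_three = poisson_pmf(3, lam)                              # P(X = 3)
p_at_most_three = sum(poisson_pmf(x, lam) for x in range(4))       # P(X <= 3)
p_at_least_three = 1 - sum(poisson_pmf(x, lam) for x in range(3))  # 1 - P(X <= 2)

print(round(p_exactly_three, 4), round(p_at_most_three, 4), round(p_at_least_three, 4))
```

Because X has no upper limit, upper-tail probabilities such as P(X ≥ 3) are computed through the complement rather than by summing infinitely many terms.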
Example 7.12:
If the random variable X has a Poisson distribution with λ = 4 [i.e., X ~ P(4)], find
i. P(X = 3) ii. P(X ≤ 3) iii. P(X ≥ 3)
Example 7.13:
Customers arrive randomly at a certain book shop at an average rate of 3.4 per minute.
Assuming the customer arrivals form a Poisson distribution, calculate the probability that;
i. Two or more customers arrive in any particular minute.
ii. Fewer than four customers arrive in any two-minute period.
Example 7.14:
The number of accidents, on average, that have occurred in a factory is 30 per year.
(i) What is the probability that in a given month exactly 2 accidents will occur?

(ii) What is the probability that in a given two-month period at least 3 accidents will occur?
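
In problems of this kind the average rate is quoted for one interval (a year) while the probability is asked for another (one or two months). The following Python sketch shows one way to handle this, under the assumption that events occur at a constant average rate, so that λ scales in proportion to the length of the interval. The function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ P(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


accidents_per_year = 30

# Assumption: a constant rate over the year, so lambda is proportional
# to the length of the interval measured in months.
lam_one_month = accidents_per_year * 1 / 12    # 2.5 accidents per month
lam_two_months = accidents_per_year * 2 / 12   # 5.0 accidents per two months

p_two_in_month = poisson_pmf(2, lam_one_month)
p_at_least_three_in_two_months = 1 - sum(
    poisson_pmf(x, lam_two_months) for x in range(3)
)

print(round(p_two_in_month, 4), round(p_at_least_three_in_two_months, 4))
```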

7.3 Continuous Probability Distributions and Functions


A continuous distribution describes the probabilities of a continuous random variable's
possible values. A continuous random variable has an infinite and uncountable set of
possible values (known as the range).

7.3.1 Continuous Probability Function / Probability Density Function


A function with values f(x), defined over the set of all real numbers, is called a probability
density function of the continuous random variable X if and only if

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$

for any real constants a and b with a ≤ b.

This probability density function should satisfy the following conditions:

i. $$f(x) \ge 0 \quad \text{for } -\infty < x < \infty$$
ii. $$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

For a continuous random variable X, the probability of getting any particular value is zero:

$$P(X = x) = 0, \quad -\infty < x < \infty$$

If X is a continuous random variable and a and b are real constants with a ≤ b, then

$$P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = P(a < X < b)$$
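
These properties can be illustrated numerically. The sketch below uses a hypothetical density, f(x) = 2x on [0, 1] and 0 elsewhere, which is not taken from the course notes, and approximates the integrals with the midpoint rule.

```python
def f(x):
    """Hypothetical density for illustration only: f(x) = 2x on [0, 1], else 0."""
    return 2 * x if 0 <= x <= 1 else 0.0


def integrate(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


# The density is zero outside [0, 1], so the total area is the integral over [0, 1].
total_area = integrate(f, 0, 1)

# P(0.25 <= X <= 0.5) is the area under f between 0.25 and 0.5.
p_interval = integrate(f, 0.25, 0.5)

# P(X = 0.5) is the area over an interval of zero width.
p_point = integrate(f, 0.5, 0.5)

print(round(total_area, 6), round(p_interval, 6), p_point)
```

Since a single point contributes no area, including or excluding the endpoints a and b does not change the probability of the interval.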

We now turn to one of the most important continuous distributions, which is widely used
in practice.

7.3.2 Normal Distribution


The Normal Distribution is the most important specific continuous distribution. This
distribution was first discovered in 1733 by Abraham de Moivre, a French-born
mathematician working in England. One reason for the importance of this distribution is
that it usefully models or describes the distribution of numerous random variables that
arise in practice.

Example 7.15:
Weights or heights of a group of people.
A weight might be recorded as 112 kg, 112.1 kg, 112.13 kg, and so on, depending on the
accuracy of the scale.

Definition:
A random variable X has a Normal distribution, and is referred to as a Normal random
variable, if and only if its probability density is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty \text{ and } \sigma > 0$$

The numbers μ and σ² in the function represent the mean and variance, respectively, of the
distribution. They are the parameters that completely determine the location and spread of
the Normal distribution. If the random variable X has a Normal distribution with parameters
μ and σ², we write X ~ N[μ, σ²].
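
To see how μ and σ enter the density, the formula can be coded directly and compared with Python's standard-library `statistics.NormalDist`. This is a minimal sketch; the function name `normal_pdf` and the values μ = 5 and σ = 0.12 are illustrative.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density: exp(-0.5 * ((x - mu) / sigma)^2) / (sigma * sqrt(2 * pi))."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


mu, sigma = 5.0, 0.12  # illustrative parameters

peak = normal_pdf(mu, mu, sigma)          # height of the curve at the mean
left = normal_pdf(mu - 0.1, mu, sigma)    # height 0.1 below the mean
right = normal_pdf(mu + 0.1, mu, sigma)   # height 0.1 above the mean

# Note that NormalDist takes the standard deviation, not the variance.
library_peak = NormalDist(mu, sigma).pdf(mu)

print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal heights at μ − 0.1 and μ + 0.1 reflect the symmetry of the curve about its mean.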

Importance of the Normal Distribution


1. Frequency distributions of many physical characteristics such as height and weight of
people often have the shape of the Normal curve.
2. Most of the distributions occurring in practice (e.g., the Binomial, Poisson and
Hypergeometric distributions) can be approximated by the Normal distribution.
3. Even if a variable is not normally distributed, it can sometimes be brought to Normal
form by a simple transformation of the variable.
4. Whatever the distribution of a variable (provided its variance is finite), the distribution
of the means of sufficiently large samples drawn from it can be approximated by a Normal
distribution. This very important result is known as the Central Limit Theorem.
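
The Central Limit Theorem in point 4 can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed distribution), and the sample size, number of samples and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1371)  # fixed seed so that the run is repeatable

sample_size = 30
num_samples = 5_000

# Draw many samples from a skewed population and record each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

centre = mean(sample_means)    # should be close to the population mean, 1
spread = stdev(sample_means)   # should be close to 1 / sqrt(30)

# For a Normal distribution, about 95% of values lie within 1.96
# standard deviations of the mean.
within = sum(abs(m - centre) <= 1.96 * spread for m in sample_means) / num_samples

print(round(centre, 3), round(spread, 3), round(1 / sqrt(sample_size), 3), round(within, 3))
```

Even though individual observations are far from Normal, the sample means cluster symmetrically around the population mean, with a spread close to σ/√n.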

Characteristics of a Normal Distribution


The Normal probability distribution and its accompanying normal curve have the following
characteristics.
1. It is bell-shaped and has a single peak at the exact center of the distribution.
2. It is symmetrical about its mean.
3. It falls off smoothly in either direction from the central value.
4. It is asymptotic (i.e., the curve gets closer and closer to the X-axis but never actually
touches it). That is, the ‘tails’ of the curve extend indefinitely in both directions
(−∞ < x < ∞).
5. The total area under the curve is unity.

The Standard Normal Probability Distribution


There is no single Normal distribution. Rather, there is an unlimited number of Normal
curves. The members of the family of Normal probability distributions are all bell-shaped
and symmetrical, but they differ in their respective means (μ) and standard deviations (σ).

Hence, it would be physically impossible to provide a table of probabilities for each
combination of mean (μ) and standard deviation (σ). Fortunately, one member of the family
of Normal distributions can be used for all problems where the Normal distribution is
applicable. It has a mean of 0 and a standard deviation of 1 and is called the Standard
Normal distribution.

First, it is necessary to convert, or standardize, the actual distribution to a Standard Normal
distribution using a Z value, also called a Z score or Z statistic. The Z scores of
variables that are normally distributed are themselves normally distributed. In other
words, the transformation to Z scores does not in any way alter the original form of the
distribution. The Z score represents the deviation of a score from the mean, expressed in
units of standard deviation. Symbolically:

$$Z = \frac{X - \mu}{\sigma}$$

Why is the Z score transformation so important?
Of all the possible Normal distributions, the one we shall use is a theoretical continuous
distribution called the Standard Normal distribution, which has a mean (μ) of 0, a standard
deviation (σ) of 1, and a total area equal to 1.00. Thus, when we transform any normally
distributed variable into Z or Standard Scores, we can use one table, which provides the
proportion of area under the Normal curve, regardless of the units of measurement in the
original data.
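
The role of the Z score can be illustrated in code, with `statistics.NormalDist().cdf` from the Python standard library standing in for the printed Standard Normal table. This is a minimal sketch; the values μ = 50 and σ = 10 and the interval from 50 to 65 are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """Deviation of x from the mean in units of standard deviation."""
    return (x - mu) / sigma


mu, sigma = 50.0, 10.0  # hypothetical Normal variable X ~ N[50, 10^2]

# P(50 <= X <= 65) via standardization: convert both limits to Z scores,
# then take the difference of the cumulative Standard Normal probabilities.
z_low = z_score(50.0, mu, sigma)    # 0.0
z_high = z_score(65.0, mu, sigma)   # 1.5
p_via_z = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# The same probability computed on the original scale, for comparison.
original = NormalDist(mu, sigma)
p_direct = original.cdf(65.0) - original.cdf(50.0)

print(z_low, z_high, round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same probability, which is why a single Standard Normal table is sufficient for every combination of μ and σ.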

Example 7.16:
Using the standard normal distribution, find the following probabilities:
i. P(0 ≤ Z ≤ 1.7)             ii. P(Z < 1.36)
iii. P(Z ≤ −0.53)             iv. P(Z ≥ 1.6)
v. P(Z > −1.5)                vi. P(−2.4 ≤ Z < 0.6)
vii. P(0.50 ≤ Z < 1.25)       viii. P(1.4 ≤ Z ≤ 2)
ix. P(−2 ≤ Z ≤ −0.7)          x. P(−1.95 ≤ Z ≤ 0.65)

Example 7.17:
The weights of packages of a brand of cereal are normally distributed with a mean of 5 kg
and a standard deviation of 120 g. What is the probability that a package selected at random
will have a weight between:

i. 5 and 5.3 kilograms?
ii. 4.7 and 5 kilograms?
iii. 4.7 and 5.3 kilograms?
iv. 5.12 and 5.24 kilograms?
v. 4.88 and 4.97 kilograms?

Occasionally, we are required to find the value of z corresponding to a specified probability
that falls between two tabulated probabilities. In that case, for convenience, we always
choose the z value corresponding to the tabular value that comes closest to the specified
probability. However, if the given probability falls midway between two tabular values, we
choose for z the value midway between the corresponding values of z.
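
The reverse look-up can also be done in software. The sketch below uses `statistics.NormalDist().inv_cdf` from the Python standard library in place of searching the table; the probabilities 0.95 and 0.975 and the values μ = 50 and σ = 10 are illustrative. The software value may differ slightly from the table-based value because no rounding to the nearest tabulated entry is involved.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Value z0 with P(Z <= z0) = 0.95. In the printed table 0.95 falls midway
# between the entries for z = 1.64 and z = 1.65, which gives z0 = 1.645.
z_95 = standard_normal.inv_cdf(0.95)

# Value z0 with P(Z >= z0) = 0.025, i.e. P(Z <= z0) = 1 - 0.025 = 0.975.
z_975 = standard_normal.inv_cdf(1 - 0.025)

# Rearranging Z = (X - mu) / sigma gives X = mu + Z * sigma, which converts
# a z value back to the original scale (hypothetical mu and sigma).
mu, sigma = 50.0, 10.0
x_95 = mu + z_95 * sigma

print(round(z_95, 3), round(z_975, 3), round(x_95, 2))
```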

Example 7.18:
Find the values of z0 such that:
i. P(Z ≥ z0) = 0.025 ii. P(Z ≤ z0) = 0.05

Example 7.19:
A large construction firm estimates that the time required to complete an office complex
is normally distributed with a mean of 20 months and standard deviation of 2 months.
i. What is the probability that it will take at least 22 months to complete the office
complex?
ii. What is the probability that it will be completed in less than 23 months?
iii. If the firm wishes to bid on the project, quoting a completion time that it has a
90% chance of meeting, how many months should it quote?

Example 7.20:
A soft-drink machine is regulated so that it discharges an average of 200 milliliters per cup.
If the amount of drink is normally distributed with a standard deviation equal to 7 milliliters,
i. What fraction of cups will contain more than 215 milliliters?
ii. How many cups will probably overflow if 220-milliliter cups are used for the next 1000
drinks?

Exercises:
1. A company wants to test the items it produces for defects, using a sample of 20 items.
If the sample contains 2 or more defective items, the lot will be rejected. If a lot contains
4% defective items, what is the probability of accepting the lot? What is the probability
of rejecting the lot?

2. Only 25% of the workers in a certain company agreed with the opinion that the
management has the right to monitor their telephone usage. A random sample of 10
workers is selected and they are asked if management has the right to monitor telephone
usage. What is the probability that at least 3 of the workers agree with this opinion?

3. Births in a hospital occur randomly at an average rate of 1.5 births per hour.
i. What is the probability of observing at least 2 births in a given hour at the hospital?
ii. What is the probability of observing no more than 2 births in a given 2-hour interval?

4. Textbook authors and publishers work very hard to minimize the number of errors in
a text. However, some errors are unavoidable. A statistics editor reports that the mean
number of errors per chapter is 0.8. What is the probability that there are fewer than 3
errors in a particular chapter?

5. The volume of paint in a can is normally distributed with a mean of 10.25 liters and a
variance of 0.04.
i. Find the probability that a randomly selected can contains less than 10 liters.
ii. Suppose that the cans having a volume below a certain level will be rejected. The
company wishes to specify a level of paint volume so that 90% of the cans will be
accepted. Find the level of volume that should be specified by the company.

6. Past experience with a particular model of refrigerator has shown that the life of the
compressor is approximately normally distributed with a mean of 12 years and a
standard deviation of 3.24 years.
i. If the manufacturer is considering a three-year unconditional warranty on the
compressor, what proportion of failures should it anticipate to be under warranty?
ii. If 50,000 refrigerators are delivered to a specific service area, approximately how
many compressors will require warranty service?
iii. If the manufacturer wishes to reduce the probability of a failure within the warranty
period to 0.0020, how long should the warranty be?
