Introduction To Statistics and Data Analysis

Beginning in the 1980s and continuing into the 21st century, an inordinate amount of attention has been focused on improvement of quality in American industry. Much has been said and written about the Japanese “industrial miracle,” which began in the middle of the 20th century. The Japanese were able to succeed where we and other countries had failed, namely, to create an atmosphere that allows the production of high-quality products. Much of the success of the Japanese has been attributed to the use of statistical methods and statistical thinking among management personnel.

Use of Scientific Data

The use of statistical methods in manufacturing, development of food products, computer software, energy sources, pharmaceuticals, and many other areas involves the gathering of information or scientific data. There is a profound distinction between collection of scientific information and inferential statistics. It is the latter that has received rightful attention in recent decades. The offspring of inferential statistics has been a large “toolbox” of statistical methods employed by statistical practitioners. These statistical methods are designed to contribute to the process of making scientific judgments in the face of uncertainty and variation. Statistical methods are used to analyze data from a process such as this one in order to gain more sense of where in the process changes may be made to improve the quality of the process.

Variability in Scientific Data

In the problems discussed above the statistical methods used involve dealing with variability, and in each case the variability to be studied is that encountered in scientific data. If the observed product density in the process were always the same and were always on target, there would be no need for statistical methods. Statistics researchers have produced an enormous number of analytical methods that allow for analysis of data from systems. This reflects the true nature of the science that we call inferential statistics, namely, using techniques that allow us to go beyond merely reporting data to drawing conclusions (or inferences) about the scientific system.

1-1. THE ENGINEERING METHOD AND STATISTICAL THINKING

An engineer is someone who solves problems of interest to society by the efficient application of scientific principles. The engineering, or scientific, method is the approach to formulating and solving these problems. The steps in the engineering method are as follows:

1. Develop a clear and concise description of the problem.
2. Identify, at least tentatively, the important factors that affect this problem or that may play a role in its solution.
3. Propose a model for the problem, using scientific or engineering knowledge of the phenomenon being studied. State any limitations or assumptions of the model.
4. Conduct appropriate experiments and collect data to test or validate the tentative model or conclusions made in steps 2 and 3.
5. Refine the model on the basis of the observed data.
6. Manipulate the model to assist in developing a solution to the problem.
7. Conduct an appropriate experiment to confirm that the proposed
Compiled and prepared by: ENGR. K.T. CABANLIG

solution to the problem is both effective and efficient.
8. Draw conclusions or make recommendations based on the problem solution.

The engineering method features a strong interplay between the problem, the factors that may influence its solution, a model of the phenomenon, and experimentation to verify the adequacy of the model and the proposed solution to the problem.

The field of statistics deals with the collection, presentation, analysis, and use of data to make decisions, solve problems, and design products and processes. Statistical techniques can be a powerful aid in designing new products and systems, improving existing designs, and designing, developing, and improving production processes.

Statistical methods are used to help us describe and understand variability. By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result. Statistical thinking can give us a useful way to incorporate this variability into our decision-making processes.

Example:
Do you always get exactly the same mileage performance on every tank of fuel?
- No. Sometimes the mileage performance varies considerably depending on many factors.

Factors that might have affected the mileage performance are factors representing potential sources of variability in the system. Statistics gives us a framework for describing this variability and for learning about which potential sources of variability are the most important or which have the greatest impact on the gasoline mileage performance.

We also encounter variability in dealing with engineering problems.

Example:
An engineer designing a connector is considering establishing the design specification on wall thickness at 3/32 inch but is somewhat uncertain about the effect of this decision on the connector pull-off force. Eight prototype units are produced and their pull-off forces measured, resulting in the following data (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1. As we anticipated, not all of the prototypes have the same pull-off force. Because the pull-off force measurements exhibit variability, we consider the pull-off force to be a random variable.

A convenient way to think of a random variable, say X, that represents a measurement, is by using the model
𝑋 = 𝜇 + 𝜖
where 𝜇 is a constant and 𝜖 is a random disturbance. The constant remains the same with every measurement, but small changes in the environment, test equipment, differences in the individual parts themselves, and so forth change the value of 𝜖. If there were no disturbances, 𝜖 would always equal zero and X would always be equal to the constant 𝜇.

Often, physical laws (such as Ohm’s law and the ideal gas law) are applied to help design products and processes. We are familiar with this reasoning from general

laws to specific cases. But it is also
important to reason from a specific set of
measurements to more general cases to
answer the previous questions. This
reasoning is from a sample (such as the eight
connectors) to a population (such as the
connectors that will be sold to customers).
The reasoning is referred to as statistical
inference. Clearly, reasoning based on
measurements from some objects to
measurements on all objects can result in
errors (called sampling errors). However, if
the sample is selected properly, these risks
can be quantified and an appropriate sample
size can be determined.
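As a small illustration of working from a sample, the eight pull-off force measurements quoted in the connector example can be summarized by their sample mean and sample standard deviation. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original notes.

```python
import statistics

# Pull-off forces (pounds) of the eight prototype connectors from the example
pull_off_forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

sample_size = len(pull_off_forces)
sample_mean = statistics.mean(pull_off_forces)     # estimate of the constant mu
sample_stdev = statistics.stdev(pull_off_forces)   # spread due to the disturbance epsilon (n - 1 divisor)

print(f"n = {sample_size}")
print(f"sample mean = {sample_mean:.3f} lb")
print(f"sample standard deviation = {sample_stdev:.3f} lb")
```

The sample mean estimates the constant 𝜇 in the model 𝑋 = 𝜇 + 𝜖, while the standard deviation gives a rough sense of how large the disturbance 𝜖 typically is.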
In some studies the sample is actually selected from a well-defined population, of which it is a subset. In others the population is conceptual, but it might be thought of as future replicates of the objects in the sample.
Information is gathered in the form of samples, or collections of observations. At times a population signifies a scientific system.

Example:
A sampling process may involve collecting information on 50 computer boards sampled randomly from the process. Here, the population is all computer boards manufactured by the firm over a specific period of time. If an improvement is made in the computer board process and a second sample of boards is collected, any conclusions drawn regarding the effectiveness of the change in process should extend to the entire population of computer boards produced under the “improved process.”

1-2. COLLECTING ENGINEERING DATA
1-2.1 Basic Principles
Three basic methods of collecting data are:
A. A retrospective study using historical data
B. An observational study
C. A designed experiment
An effective data collection procedure can greatly simplify the analysis and lead to improved understanding of the population or process that is being studied.

A. Retrospective study
A retrospective study would use either all or a sample of the historical process data archived over some period of time. Retrospective study may involve a lot of data, but that data may contain relatively

little useful information about the problem. Furthermore, some of the relevant data may be missing, there may be transcription or recording errors resulting in outliers (or unusual values), or data on other important factors may not have been collected and archived.
Statistical analysis of historical data sometimes identifies interesting phenomena, but solid and reliable explanations of these phenomena are often difficult to obtain.

Example:
Montgomery, Peck, and Vining (2001) describe an acetone-butyl alcohol distillation column for which concentration of acetone in the distillate or output product stream is an important variable. Factors that may affect the distillate are the reboil temperature, the condensate temperature, and the reflux rate. Production personnel obtain and archive the following records:
- The concentration of acetone in an hourly test sample of output product
- The reboil temperature log, which is a plot of the reboil temperature over time
- The condenser temperature controller log
- The nominal reflux rate each hour
The reflux rate should be held constant for this process. Consequently, production personnel change this very infrequently. The study objective might be to discover the relationships among the two temperatures and the reflux rate on the acetone concentration in the output product stream. However, this type of study presents some problems:
1. We may not be able to see the relationship between the reflux rate and acetone concentration, because the reflux rate did not change much over the historical period.
2. The archived data on the two temperatures (which are recorded almost continuously) do not correspond perfectly to the acetone concentration measurements (which are made hourly). It may not be obvious how to construct an approximate correspondence.
3. Production maintains the two temperatures as closely as possible to desired targets or set points. Because the temperatures change so little, it may be difficult to assess their real impact on acetone concentration.
4. Within the narrow ranges that they do vary, the condensate temperature tends to increase with the reboil temperature. Consequently, the effects of these two process variables on acetone concentration may be difficult to separate.

B. Observational Study
In an observational study, the engineer observes the process or population, disturbing it as little as possible, and records the quantities of interest.
Generally, an observational study tends to solve problems 1 and 2 above and goes a long way toward obtaining accurate and reliable data. However, observational studies may not help resolve problems 3 and 4.

C. Designed Experiments
In a designed experiment the engineer makes deliberate or purposeful changes in the controllable variables of the system or process, observes the resulting system output data, and then makes an inference or decision about which variables are responsible for the observed changes in output performance.
In a simple comparative experiment on the connector wall thickness, the engineer is interested in determining if there is any difference between the 3/32- and 1/8-inch designs. An approach that could be used in analyzing the data from this experiment is to compare the mean pull-off force for the 3/32-inch design to the mean pull-off force for the 1/8-inch design using statistical hypothesis testing.
A hypothesis is a statement about some aspect of the system in which we are interested.
We would then be interested in testing the hypothesis. When the hypothesis concerns one design only (for example, a statement about the mean pull-off force of the 3/32-inch design), this is called a single-sample hypothesis testing problem. It is also an example of an analytic study.
Designed experiments are a very powerful approach to studying complex systems.
D. Observing Processes Over Time

Often data are collected over time. In
this case, it is usually very helpful to plot the
data versus time in a time series plot.
Phenomena that might affect the system or
process often become more visible in a time-
oriented plot and the concept of stability can
be better judged.

The sample along with inferential statistics allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability.
Elements in probability allow us to draw conclusions about characteristics of hypothetical data taken from the population, based on known features of the population.

Sampling Procedures / Data Collection

1. Simple Random Sampling
Simple random sampling implies that any particular sample of a specified sample size has the same chance of being selected as any other sample of the same size. Let us assume that only a single population exists in the problem.
Sample size simply means the number of elements in the sample.
The virtue of simple random sampling is that it aids in the elimination of the problem of having the sample reflect a different (possibly more confined) population than the one about which inferences need to be made.
Example:
A sample is to be chosen to answer certain questions regarding political preferences in a certain state in the United States. The sample involves the choice of 1000 families, and a survey is to be conducted. Now, suppose it turns out that random sampling is not used. Rather, all or nearly all of the 1000 families chosen live in an urban setting. It is believed that political preferences in rural areas differ from those in urban areas.
Implication:
The sample drawn actually confined the population, and thus the inferences need to be confined to the “limited population.” The sample of size 1000 described here is often referred to as a biased sample.
Strata
Sometimes the sampling units are not homogeneous and naturally divide themselves into nonoverlapping groups that are homogeneous. These groups are called strata.
Stratified random sampling

It involves random selection of a
sample within each stratum. The purpose is
to be sure that each of the strata is neither
over- nor underrepresented.
Example:
A sample survey is conducted in
order to gather preliminary opinions
regarding a bond referendum that is being
considered in a certain city. The city is
subdivided into several ethnic groups which
represent natural strata.
Implication:
In order not to disregard or
overrepresent any group, separate random
samples of families could be chosen from
each group.
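The difference between the two sampling procedures can be sketched in a few lines of Python. Everything here is hypothetical: the population of 1000 families, the 70/30 urban/rural split, the 10% sampling fraction, and the choice to sample each stratum in proportion to its size are assumptions made for illustration only.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1000 family IDs, each tagged with a stratum
population = [(family_id, "urban" if family_id < 700 else "rural")
              for family_id in range(1000)]

# Simple random sampling: every subset of 100 families is equally likely
simple_sample = random.sample(population, k=100)

def stratified_sample(units, fraction):
    """Draw a separate simple random sample within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # proportional allocation (an assumption)
        sample.extend(random.sample(members, k))
    return sample

strat_sample = stratified_sample(population, fraction=0.10)
urban_count = sum(1 for _, stratum in strat_sample if stratum == "urban")
rural_count = len(strat_sample) - urban_count
print(urban_count, rural_count)
```

With simple random sampling the urban/rural split of the sample varies from draw to draw, whereas the stratified version fixes it by construction, so neither stratum is over- or underrepresented.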
2. Experimental Design
The concept of randomness or
random assignment plays a huge role in the
area of experimental design.

PROBABILITY
SAMPLE SPACES AND EVENTS
1.1 Sample Spaces
In the study of statistics, we are concerned basically with the presentation and
interpretation of chance outcomes that occur in a planned study or scientific investigation.
Example:
We may record the number of accidents that occur monthly at the intersection of Driftwood Lane
and Royal Oak Drive, hoping to justify the installation of a traffic light; we might classify items
coming off an assembly line as “defective” or “non-defective”; or we may be interested in the
volume of gas released in a chemical reaction when the concentration of an acid is varied.
Hence, the statistician is often dealing with either numerical data, representing counts or
measurements, or categorical data, which can be classified according to some criterion.
We shall refer to any recording of information, whether it be numerical or categorical, as
an observation.
Example:
The numbers 2, 0, 1, and 2, representing the number of accidents that occurred for each month
from January through April during the past year at the intersection of Driftwood Lane and Royal
Oak Drive, constitute a set of observations. Similarly, the categorical data N, D, N, N, and D,
representing the items found to be defective or non-defective when five items are inspected, are
recorded as observations.
Statisticians use the word experiment to describe any process that generates a set of data.
Example:
A simple example of a statistical experiment is the tossing of a coin. In this experiment,
there are only two possible outcomes, heads or tails.
Another experiment might be the launching of a missile and observing of its velocity at
specified times.
The opinions of voters concerning a new sales tax can also be considered as observations
of an experiment.
We are particularly interested in the observations obtained by repeating the experiment

several times. In most cases, the outcomes will depend on chance and, therefore, cannot be
predicted with certainty.
Sample space
The set of all possible outcomes of a statistical experiment is called the sample space and is represented by the symbol S.

Each outcome in a sample space is called an element or a member of the sample space, or
simply a sample point.
If the sample space has a finite number of elements, we may list the members separated
by commas and enclosed in braces, for example
𝑆 = {𝑂𝑢𝑡𝑐𝑜𝑚𝑒 1, 𝑂𝑢𝑡𝑐𝑜𝑚𝑒 2}.
Example 1.1:
Consider the experiment of tossing a die. If we are interested in the number that shows on
the top face, the sample space is
𝑆1 = {1, 2, 3, 4, 5, 6}.
If we are interested only in whether the number is even or odd, the sample space is
simply
𝑆2 = {𝑒𝑣𝑒𝑛, 𝑜𝑑𝑑}.
Example 1.2:
Suppose that three items are selected at random from a manufacturing process. Each item is
inspected and classified defective, D, or non-defective, N. To list the elements of the sample
space providing the most information, we can construct a tree diagram; reading along each path of the tree gives the sample points
𝑆 = {𝐷𝐷𝐷, 𝐷𝐷𝑁, 𝐷𝑁𝐷, 𝐷𝑁𝑁, 𝑁𝐷𝐷, 𝑁𝐷𝑁, 𝑁𝑁𝐷, 𝑁𝑁𝑁}.
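The same listing can be generated mechanically. The sketch below uses `itertools.product` from the Python standard library as a stand-in for walking the tree diagram; the variable name `sample_space` is illustrative.

```python
from itertools import product

# Each of the three inspected items is either defective (D) or non-defective (N);
# product() walks every path of the tree diagram
sample_space = ["".join(outcome) for outcome in product("DN", repeat=3)]

print(sample_space)
print(len(sample_space))   # 2 * 2 * 2 sample points
```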
Sample spaces with a large or infinite number of sample points are best described by a
statement or rule method.
For example, if the possible outcomes of an experiment are the set of cities in the world with a
population over 1 million, our sample space is written
𝑆 = {𝑥 | 𝑥 𝑖𝑠 𝑎 𝑐𝑖𝑡𝑦 𝑤𝑖𝑡ℎ 𝑎 𝑝𝑜𝑝𝑢𝑙𝑎𝑡𝑖𝑜𝑛 𝑜𝑣𝑒𝑟 1 𝑚𝑖𝑙𝑙𝑖𝑜𝑛}

Another example is
$S = \{(x, y) \mid x^2 + y^2 \le 4\}$,
which reads “S is the set of all points (x, y) on the boundary or the interior of a circle of radius 2
with center at the origin.”
1.2 Events
An event is a subset of a sample space. For any given experiment, we may be interested in the
occurrence of certain events rather than in the occurrence of a specific element in the sample
space.
For instance, we may be interested in the event A that the outcome when a die is tossed is
divisible by 3. This will occur if the outcome is an element of the subset 𝐴 = {3, 6} of the sample space 𝑆1 in
Example 1.1.
We may be interested in the event B that the number of defectives is greater than 1 in
Example 1.2. This will occur if the outcome is an element of the subset
𝐵 = {𝐷𝐷𝑁, 𝐷𝑁𝐷, 𝑁𝐷𝐷, 𝐷𝐷𝐷}
of the sample space S.

Example 1.3:
Given the sample space 𝑆 = {𝑡 | 𝑡 ≥ 0}, where t is the life in years of a certain
electronic component, then the event A that the component fails before the end of the fifth year is
the subset 𝐴 = {𝑡 | 0 ≤ 𝑡 < 5}.
It is conceivable that an event may be a subset that includes the entire sample space S or a
subset of S called the null set and denoted by the symbol 𝜙, which contains no elements at all.
For instance, if we let A be the event of detecting a microscopic organism by the naked eye in a
biological experiment, then 𝐴 = 𝜙.
The complement of an event A with respect to S is the subset of all elements of S that are
not in A. We denote the complement of A by the symbol 𝐴′.
Consider an experiment where the smoking habits of the employees of a manufacturing
firm are recorded. A possible sample space might classify an individual as a nonsmoker, a light
smoker, a moderate smoker, or a heavy smoker. Let the subset of smokers be some event. Then
all the nonsmokers correspond to a different event, also a subset of S, which is called the
complement of the set of smokers.
Example 1.4.
Let R be the event that a red card is selected from an ordinary deck of 52 playing cards,
and let S be the entire deck. Then R′ is the event that the card selected from the deck is not a red
card but a black card.
The intersection of two events A and B, denoted by the symbol A ∩ B, is the event
containing all elements that are common to A and B.
Example 1.5.
1. In the tossing of a die we might let A be the event that an even number occurs and B
the event that a number greater than 3 shows. Then the subsets A = {2, 4, 6} and B = {4, 5, 6}
are subsets of the same sample space S = {1, 2, 3, 4, 5, 6}. Both A and B occur on a given toss
if the outcome is an element of the subset {4, 6}, which is the intersection A ∩ B.

2. Let E be the event that a person selected at random in a classroom is majoring in
engineering, and let F be the event that the person is female. Then E ∩ F is the event of all
female engineering students in the classroom.
Two events A and B are mutually exclusive, or disjoint, if A ∩ B = φ, that is, if A and B
have no elements in common.
Example 1.6.
1. Let V = {a, e, i, o, u} and C = {l, r, s, t}; then it follows that V ∩ C = φ. That is, V and
C have no elements in common and, therefore, cannot both simultaneously occur.
2. A cable television company offers programs on eight different channels, three of which
are affiliated with ABC, two with NBC, and one with CBS. The other two are an educational
channel and the ESPN sports channel. Suppose that a person subscribing to this service turns on
a television set without first selecting the channel. Let A be the event that the program belongs to
the NBC network and B the event that it belongs to the CBS network. Since a television program
cannot belong to more than one network, the events A and B have no programs in common.
Therefore, the intersection A ∩ B contains no programs, and consequently the events A and B
are mutually exclusive.
The union of the two events A and B, denoted by the symbol A∪B, is the event containing
all the elements that belong to A or B or both.
Example 1.7.
1. In the die-tossing experiment, if A = {2, 4, 6} and B = {4, 5, 6}, we might be
interested in either A or B occurring or both A and B occurring. Such an event, called the union
of A and B, will occur if the outcome is an element of the subset {2, 4, 5, 6}.
2. Let A = {a, b, c} and B = {b, c, d, e}; then A ∪ B = {a, b, c, d, e}.
3. Let P be the event that an employee selected at random from an oil drilling company
smokes cigarettes. Let Q be the event that the employee selected drinks alcoholic beverages.
Then the event P ∪ Q is the set of all employees who either drink or smoke or do both.
4. If M = {x | 3 < x < 9} and N = {y | 5 < y < 12}, then M ∪ N = {z | 3 < z < 12}.
The relationship between events and the corresponding sample space can be illustrated
graphically by means of Venn diagrams. In a Venn diagram of three events A, B, and C whose
regions are numbered 1 through 7, we have, for example:
𝐴 ∩ 𝐵 = regions 1 and 2,
𝐵 ∩ 𝐶 = regions 1 and 3,
𝐴 ∪ 𝐶 = regions 1, 2, 3, 4, 5, and 7,
𝐵′ ∩ 𝐴 = regions 4 and 7,
𝐴 ∩ 𝐵 ∩ 𝐶 = region 1,
(𝐴 ∪ 𝐵) ∩ 𝐶′ = regions 2, 6, and 7.
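Intersections, unions, and complements map directly onto Python's built-in `set` type, which makes the examples above easy to check. In the sketch below the die events come from the examples; the Venn-diagram labeling (`VA`, `VB`, `VC`) is an assumed assignment of regions chosen to reproduce the region lists, with the fourth and sixth lines read as B′ ∩ A and (A ∪ B) ∩ C′.

```python
# Die-tossing events from the examples above
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}          # an even number occurs
B = {4, 5, 6}          # a number greater than 3 shows

print(A & B)           # intersection of A and B
print(A | B)           # union of A and B
print(S - A)           # complement of A with respect to S

# Assumed labeling of the seven Venn-diagram regions (hypothetical names)
regions = set(range(1, 8))
VA, VB, VC = {1, 2, 4, 7}, {1, 2, 3, 6}, {1, 3, 4, 5}

print(VA & VB, VB & VC, VA | VC, VA & VB & VC)
print((regions - VB) & VA)            # B' ∩ A
print((VA | VB) & (regions - VC))     # (A ∪ B) ∩ C'
```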

Example 1.8
The following events are all subsets of the sample space S for the experiment in which we select
a card at random from an ordinary deck of 52 playing cards and observe whether the following events occur:
𝐴: the card is red,
𝐵: the card is the jack, queen, or king of diamonds,
𝐶: the card is an ace.
Several results that follow from the foregoing definitions are:
1. A ∩ 𝜙 = 𝜙.
2. A ∪ 𝜙 = A.
3. A ∩ A′ = 𝜙.
4. A ∪ A′ = S.
5. S′ = 𝜙.
6. 𝜙′ = S.
7. (A′)′ = A.
8. (A ∩ B)′ = A′ ∪ B′.
9. (A ∪ B)′ = A′ ∩ B′.
1.3 Counting Sample Points
One of the problems that the statistician must consider and attempt to evaluate is the
element of chance associated with the occurrence of certain events when an experiment is
performed.
If an operation can be performed in $n_1$ ways, and if for each of these ways a second
operation can be performed in $n_2$ ways, then the two operations can be performed together in
$n_1 n_2$ ways.
Example 1.9
How many sample points are there in the sample space when a pair of dice is thrown once?
Solution:
$n_1 = 6$ ways for the first die.
$n_2 = 6$ ways for the second die.
$n_1 n_2 = (6)(6) = 36$ possible ways.
Example 1.10
A developer of a new subdivision offers prospective home buyers a choice of Tudor,
rustic, colonial, and traditional exterior styling in ranch, two-story, and split-level floor plans. In
how many different ways can a buyer order one of these homes?

Solution:
$n_1 = 4$ exterior styles
$n_2 = 3$ floor plans
$n_1 n_2 = (4)(3) = 12$ possible homes.
The generalized multiplication rule covering k operations is stated in the following. If an
operation can be performed in $n_1$ ways, and if for each of these a second operation can be
performed in $n_2$ ways, and for each of the first two a third operation can be performed in $n_3$
ways, and so forth, then the sequence of k operations can be performed in $n_1 n_2 \cdots n_k$ ways.
Example 1.11
If a 22-member club needs to elect a chair and a treasurer, in how many different ways can
these two be elected?
Solution:
For the chair position, there are 22 possibilities.
For each of these, there are 21 possibilities for the treasurer.
Multiplication rule:
$n_1 \times n_2 = 22 \times 21 = 462$ different ways.
Example 1.12
Sam is going to assemble a computer by himself. He has the choice of chips from two brands, a
hard drive from four, memory from three, and an accessory bundle from five local stores. How
many different ways can Sam order the parts?
Solution:
$n_1 = 2$, $n_2 = 4$, $n_3 = 3$, and $n_4 = 5$
$n_1 \times n_2 \times n_3 \times n_4 = 2 \times 4 \times 3 \times 5 = 120$ different ways.
A permutation is an arrangement of all or part of a set of objects.
Example 1.12
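Consider the three letters a, b, and c. The possible permutations are abc, acb, bac, bca, cab, and
cba. Thus, we see that there are 6 distinct arrangements. There are 3 choices for the first position,
2 for the second, and 1 for the third:
$n_1 n_2 n_3 = (3)(2)(1) = 6$ permutations.
In general, $n$ distinct objects can be arranged in
$n(n - 1)(n - 2) \cdots (3)(2)(1)$ ways.
We represent this product by the symbol $n!$, read “n factorial,” with the special case $0! = 1$.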
Theorem 1: The number of permutations of n objects is n!.

The number of permutations of the four letters a, b, c, and d will be 4! = 24.
Now consider the number of permutations that are possible by taking two letters at a time
from four. These would be ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, and dc.
$n_1 = 4$, $n_2 = 3$
$n_1 n_2 = (4)(3) = 12$
In general, n distinct objects taken r at a time can be arranged in
$n(n - 1)(n - 2) \cdots (n - r + 1)$ ways. We represent this product by the symbol ${}_nP_r$.
Theorem 2: The number of permutations of n distinct objects taken r at a time is
${}_nP_r = \frac{n!}{(n - r)!}$.
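The count of permutations taken r at a time can be checked by listing the arrangements and comparing with the factorial formula. This is a minimal sketch using the Python standard library; `math.perm` is assumed to be available (Python 3.8 or later).

```python
import math
from itertools import permutations

letters = "abcd"

# All arrangements of 2 letters chosen from 4, where order matters
arrangements = ["".join(p) for p in permutations(letters, 2)]
print(arrangements)

# The number of arrangements agrees with n! / (n - r)!
n, r = len(letters), 2
count_formula = math.factorial(n) // math.factorial(n - r)
print(len(arrangements), count_formula, math.perm(n, r))
```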
Example 1.13
In one year, three awards (research, teaching, and service) will be given to a class of 25
graduate students in a statistics department. If each student can receive at most one award, how
many possible selections are there?
Solution:
${}_{25}P_3 = \frac{25!}{(25 - 3)!} = \frac{25!}{22!} = (25)(24)(23) = 13{,}800$.
Example 1.14
A president and a treasurer are to be chosen from a student club consisting of 50 people.
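How many different choices of officers are possible if
(a) there are no restrictions;
Solution:
${}_{50}P_2 = \frac{50!}{48!} = (50)(49) = 2450$
(b) A will serve only if he is president;
Solution:
- If A is president, there are 49 possible choices for treasurer.
- If A does not serve, the officers are chosen from the remaining 49 people: ${}_{49}P_2 = (49)(48) = 2352$
- Total number of choices is 49 + 2352 = 2401.
(c) B and C will serve together or not at all;
Solution:
- If B and C both serve, there are 2 ways to assign the two offices.
- If neither B nor C serves, ${}_{48}P_2 = \frac{48!}{46!} = (48)(47) = 2256$
- Total number of choices in this situation is 2 + 2256 = 2258.
(d) D and E will not serve together?
Solution:
- D serves as an officer but not E: (2)(48) = 96 choices
- E serves as an officer but not D: also (2)(48) = 96 choices
- Neither D nor E is chosen: ${}_{48}P_2 = 2256$ choices
- Total number of choices is (2)(96) + 2256 = 2448.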
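Because the club has only 50 members, all four answers can be confirmed by brute force: list every ordered (president, treasurer) pair and count those that satisfy each restriction. In this sketch the members are numbered 0 to 49 and A, B, C, D, E are arbitrarily taken to be members 0 to 4; those labels are assumptions for illustration.

```python
from itertools import permutations

# Hypothetical labels: members are 0..49; A, B, C, D, E are members 0..4
A, B, C, D, E = 0, 1, 2, 3, 4
choices = list(permutations(range(50), 2))   # ordered (president, treasurer) pairs

no_restriction = len(choices)
# (b) A may appear only in the president position
a_only_as_president = sum(1 for pres, treas in choices if treas != A)
# (c) B and C are either both officers or both absent
b_and_c_together = sum(1 for pair in choices if (B in pair) == (C in pair))
# (d) D and E are never both officers
d_and_e_apart = sum(1 for pair in choices if not (D in pair and E in pair))

print(no_restriction, a_only_as_president, b_and_c_together, d_and_e_apart)
```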
Circular permutations
Permutations that occur by arranging objects in a circle are called circular permutations.
Theorem 3: The number of permutations of n objects arranged in a circle is (n − 1)!
If the letters b and c are both equal to x, then the 6 permutations of the letters a, b, and c
become axx, axx, xax, xax, xxa, and xxa, of which only 3 are distinct. Therefore, with 3 letters, 2

being the same, we have 3!/2! = 3 distinct permutations. With 4 different letters a, b, c, and d, we
have 24 distinct permutations.
If we let a = b = x and c = d = y, we can list only the following distinct permutations:
xxyy, xyxy, yxxy, yyxx, xyyx, and yxyx. Thus, we have 4!/(2! 2!) = 6 distinct permutations.
Theorem 4: The number of distinct permutations of $n$ things of which $n_1$ are of one kind, $n_2$
of a second kind, ..., $n_k$ of a $k$th kind is
$\frac{n!}{n_1!\, n_2! \cdots n_k!}$.
Example 1.15
In a college football training session, the defensive coordinator needs to have 10 players
standing in a row. Among these 10 players, there are 1 freshman, 2 sophomores, 4 juniors, and 3
seniors. How many different ways can they be arranged in a row if only their class level will be
distinguished?
Solution:
$\frac{10!}{1!\, 2!\, 4!\, 3!} = 12{,}600$
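The formula for permutations of objects that are partly alike is easy to wrap in a small helper. The function name `distinct_permutations` is hypothetical; the sketch relies only on `math.factorial` from the standard library.

```python
import math

def distinct_permutations(*group_sizes):
    """Number of distinct arrangements of sum(group_sizes) objects when
    objects belonging to the same group are indistinguishable."""
    total = math.factorial(sum(group_sizes))
    for size in group_sizes:
        total //= math.factorial(size)
    return total

# 1 freshman, 2 sophomores, 4 juniors, 3 seniors standing in a row
print(distinct_permutations(1, 2, 4, 3))

# Letters a, x, x (two alike) and x, x, y, y (two pairs alike)
print(distinct_permutations(1, 2), distinct_permutations(2, 2))
```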
Often, we are concerned with the number of ways of partitioning a set of n objects into r
subsets called cells.
Theorem 5: The number of ways of partitioning a set of n objects into r cells with $n_1$ elements
in the first cell, $n_2$ elements in the second, and so forth, is
$\binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}$,
where $n_1 + n_2 + \cdots + n_r = n$.
Example 1.16
In how many ways can 7 graduate students be assigned to 1 triple and 2 double hotel
rooms during a conference?
Solution:
$\binom{7}{3, 2, 2} = \frac{7!}{3!\, 2!\, 2!} = 210$
In many problems, we are interested in the number of ways of selecting r objects from n without regard to
order. These selections are called combinations. A combination is actually a partition with two
cells, the one cell containing the r objects selected and the other cell containing the (n − r) objects
that are left. The number of such combinations, denoted by $\binom{n}{r,\, n - r}$, is usually shortened to $\binom{n}{r}$.
Theorem 6: The number of combinations of n distinct objects taken r at a time is
$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$.
Example 1.16
A young boy asks his mother to get 5 Game-Boy™ cartridges from his collection of 10
arcade and 5 sports games. How many ways are there that his mother can get 3 arcade and 2
sports games?
Solution:
The number of ways of selecting 3 cartridges from 10 is
$\binom{10}{3} = \frac{10!}{3!\,(10 - 3)!} = 120$.
The number of ways of selecting 2 cartridges from 5 is
$\binom{5}{2} = \frac{5!}{2!\, 3!} = 10$.
Using the multiplication rule with
$n_1 = 120$
$n_2 = 10$
$n_1 n_2 = (120)(10) = 1200$ ways.
Example 1.17
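How many different letter arrangements can be made from the letters in the word STATISTICS?
Solution:
The word has 10 letters: 3 S's, 3 T's, 2 I's, 1 A, and 1 C. By Theorem 4, the number of distinct arrangements is
$\frac{10!}{3!\, 3!\, 2!\, 1!\, 1!} = 50{,}400$.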
1.4 Probability of an Event

A probability statement expresses an outcome of which we are not certain but, owing to past
information or an understanding of the structure of the experiment, in whose validity we have
some degree of confidence.
The likelihood of the occurrence of an event resulting from such a statistical experiment
is evaluated by means of a set of real numbers, called weights or probabilities, ranging from 0
to 1. To every point in the sample space we assign a probability such that the sum of all
probabilities is 1.
The probability of an event A is the sum of the weights of all sample points in A. Therefore,
0 ≤ 𝑃(𝐴) ≤ 1, 𝑃(𝜙) = 0, 𝑎𝑛𝑑 𝑃(𝑆) = 1.
Furthermore, if $A_1, A_2, A_3, \ldots$ is a sequence of mutually exclusive events, then
$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$.
Example 1.18
A coin is tossed twice. What is the probability that at least 1 head occurs?
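Solution:
The sample space is $S = \{HH, HT, TH, TT\}$. If the coin is balanced, each of these outcomes is
equally likely, so each sample point is assigned probability 1/4. The event of at least 1 head is
$A = \{HH, HT, TH\}$, and
$P(A) = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{3}{4}$.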

Example 1.19
A die is loaded in such a way that an even number is twice as likely to occur as an odd number.
If E is the event that a number less than 4 occurs on a single toss of the die, find P(E).
Solution:
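S = {1, 2, 3, 4, 5, 6}
Assign a probability of w to each odd number and 2w to each even number. Since the
probabilities must sum to 1, we have 9w = 1, so w = 1/9.
E = {1, 2, 3}
$P(E) = \frac{1}{9} + \frac{2}{9} + \frac{1}{9} = \frac{4}{9}$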
Example 1.20
In Example 1.19, let A be the event that an even number turns up and let B be the event that a
number divisible by 3 occurs. Find P(A ∪ B) and P(A ∩ B).
Solution:
For the events A = {2, 4, 6} and B = {3, 6}, we have
A ∪ B = {2, 3, 4, 6} and A ∩ B = {6}
By assigning a probability of 1/9 to each odd number and 2/9 to each even number, we have
$P(A \cup B) = \frac{2}{9} + \frac{1}{9} + \frac{2}{9} + \frac{2}{9} = \frac{7}{9}$
$P(A \cap B) = \frac{2}{9}$
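The loaded-die calculations can be reproduced with exact fractions. This sketch assigns weight 1/9 to each odd face and 2/9 to each even face, as in the solution, and sums the weights of the sample points in each event; the helper name `prob` is illustrative.

```python
from fractions import Fraction

# Weight w for each odd face and 2w for each even face, with 9w = 1
w = Fraction(1, 9)
weights = {face: (2 * w if face % 2 == 0 else w) for face in range(1, 7)}
assert sum(weights.values()) == 1   # the weights form a valid probability assignment

def prob(event):
    """Probability of an event = sum of the weights of its sample points."""
    return sum(weights[face] for face in event)

E = {1, 2, 3}      # a number less than 4
A = {2, 4, 6}      # an even number
B = {3, 6}         # a number divisible by 3

print(prob(E), prob(A | B), prob(A & B))
```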
Rule: If an experiment can result in any one of N different equally likely outcomes, and if exactly
n of these outcomes correspond to event A, then the probability of event A is
$P(A) = \frac{n}{N}$.
Example 1.21
A statistics class for engineers consists of 25 industrial, 10 mechanical, 10 electrical, and 8 civil
engineering students. If a person is randomly selected by the instructor to answer a question, find
the probability that the student chosen is
(a) an industrial engineering major
(b) a civil engineering or an electrical engineering major.
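Solution:
I – industrial engineering
M – mechanical engineering
E – electrical engineering
C – civil engineering
The class has 25 + 10 + 10 + 8 = 53 students, each equally likely to be selected.
(a) an industrial engineering major
$P(I) = \frac{25}{53}$
(b) a civil engineering or an electrical engineering major
$P(C \cup E) = \frac{8 + 10}{53} = \frac{18}{53}$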

Example 1.22
In a poker hand consisting of 5 cards, find the probability of holding 2 aces and 3 jacks.
Solution:
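Let C(n, r) = n!/[r!(n − r)!] denote the number of combinations of n objects taken r at a time.
The number of ways of being dealt 2 aces from 4 cards is
C(4, 2) = 4!/(2! 2!) = 6,
and the number of ways of being dealt 3 jacks from 4 cards is
C(4, 3) = 4!/(3! 1!) = 4.
By the multiplication rule, there are n = (6)(4) = 24 hands with 2 aces and 3 jacks. The total
number of 5-card poker hands, all of which are equally likely, is
N = C(52, 5) = 52!/(5! 47!) = 2,598,960.
Therefore, the probability of getting 2 aces and 3 jacks in a 5-card poker hand is
P(C) = 24/2,598,960 = 0.9 × 10^-5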
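As a quick check of the counting in Example 1.22, the sketch below uses math.comb from the Python standard library. The variable names are illustrative only.

```python
from math import comb

ways_aces = comb(4, 2)       # choose 2 of the 4 aces  -> 6
ways_jacks = comb(4, 3)      # choose 3 of the 4 jacks -> 4
favorable = ways_aces * ways_jacks
total_hands = comb(52, 5)    # all 5-card hands

p = favorable / total_hands
print(favorable, total_hands)   # 24 2598960
print(f"{p:.2e}")               # 9.23e-06
```

The printed value agrees with 0.9 × 10^-5 after rounding to one significant digit.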
If the outcomes of an experiment are not equally likely to occur, the probabilities must be
assigned on the basis of prior knowledge or experimental evidence.
For example, if a coin is not balanced, we could estimate the probabilities of heads and
tails by tossing the coin a large number of times and recording the outcomes.
According to the relative frequency definition of probability, the true probabilities
would be the fractions of heads and tails that occur in the long run. Another intuitive way of
understanding probability is the indifference approach. For instance, if you have a die that you
believe is balanced, then using this indifference approach, you determine that the probability
that each of the six sides will show up after a throw is 1/6.
The use of intuition, personal beliefs, and other indirect information in arriving at
probabilities is referred to as the subjective definition of probability.
The relative frequency interpretation of probability is the operative one. Its foundation is
the statistical experiment rather than subjectivity, and it is best viewed as the limiting relative
frequency. As a result, many applications of probability in science and engineering must be
based on experiments that can be repeated.
1.5 Additive Rules

The additive rules apply to unions of events.
Theorem 7: If A and B are two events, then

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Corollary 1: If A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B).
Corollary 2: If A1, A2, ..., An are mutually exclusive, then

P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An).
Theorem 8: For three events A, B, and C,

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).
Example 1.23
John is going to graduate from an industrial engineering department in a university by the
end of the semester. After being interviewed at two companies he likes, he assesses that his
probability of getting an offer from company A is 0.8, and his probability of getting an offer
from company B is 0.6. If he believes that the probability that he will get offers from both
companies is 0.5, what is the probability that he will get at least one offer from these two
companies?
Solution:
Using the additive rule
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵) = 0.8 + 0.6 − 0.5 = 0.9
Example 1.24
What is the probability of getting a total of 7 or 11 when a pair of fair dice is tossed?
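Solution:
Let A be the event that a total of 7 occurs and B the event that a total of 11 occurs. A total of 7
occurs for 6 of the 36 sample points, and a total of 11 occurs for only 2 of them. Since all sample
points are equally likely, P(A) = 6/36 = 1/6 and P(B) = 2/36 = 1/18. The events A and B are
mutually exclusive, since a total of 7 and a total of 11 cannot both occur on the same toss.
Therefore,
P(A ∪ B) = P(A) + P(B) = 1/6 + 1/18 = 2/9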
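The result of Example 1.24 can also be confirmed by listing all 36 outcomes. The following is a minimal sketch; the helper prob and the use of lambda predicates are illustrative choices rather than anything prescribed by the text.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely pairs

def prob(predicate):
    """n/N rule: favorable outcomes divided by all outcomes."""
    favorable = sum(1 for roll in outcomes if predicate(roll))
    return Fraction(favorable, len(outcomes))

p7 = prob(lambda r: sum(r) == 7)
p11 = prob(lambda r: sum(r) == 11)
p7_or_11 = prob(lambda r: sum(r) in (7, 11))

print(p7, p11, p7_or_11)   # 1/6 1/18 2/9
```

Because the two events are mutually exclusive, p7 + p11 equals p7_or_11, as Corollary 1 requires.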
Example 1.25
If the probabilities are, respectively, 0.09, 0.15, 0.21, and 0.23 that a person purchasing a
new automobile will choose the color green, white, red, or blue, what is the probability that a
given buyer will purchase a new automobile that comes in one of those colors?
Solution:
Let G – Green
W – White
R – Red
B – Blue
𝑃(𝐺 ∪ 𝑊 ∪ 𝑅 ∪ 𝐵) = 𝑃(𝐺) + 𝑃(𝑊) + 𝑃(𝑅) + 𝑃(𝐵)
= 0.09 + 0.15 + 0.21 + 0.23 = 0.68
Theorem 9: If A and A′ are complementary events, then

P(A) + P(A′) = 1.
(Note: P(A′) is the probability that the event A does not occur.)
Proof: Since A ∪ A′ = S and the sets A and A′ are disjoint,
1 = P(S) = P(A ∪ A′) = P(A) + P(A′).
Example 1.26
If the probabilities that an automobile mechanic will service 3, 4, 5, 6, 7, or 8 or more
cars on any given workday are, respectively, 0.12, 0.19, 0.28, 0.24, 0.10, and 0.07, what is the
probability that he will service at least 5 cars on his next day at work?
Solution:
Let E – event that at least 5 cars are serviced
𝑃(𝐸′) = 0.12 + 0.19 = 0.31

𝑃(𝐸) = 1 − 0.31 = 0.69
Example 1.27
Suppose the manufacturer’s specifications for the length of a certain type of computer
cable are 2000 ± 10 millimeters. In this industry, it is known that small cable is just as likely to
be defective (not meeting specifications) as large cable. That is, the probability of randomly
producing a cable with length exceeding 2010 millimeters is equal to the probability of
producing a cable with length smaller than 1990 millimeters. The probability that the production
procedure meets specifications is known to be 0.99.
(a) What is the probability that a cable selected randomly is too large?
(b) What is the probability that a randomly selected cable is larger than 1990 millimeters?
Solution:
Let M – event that a cable meets specifications
S – event that the cable is too small
L – event that the cable is too large
(a) Since P(M) = 0.99 and P(S) = P(L),
P(S) = P(L) = (1 − 0.99)/2 = 0.005
(b) Let X – the length of a randomly selected cable

P(1990 ≤ X ≤ 2010) = P(M) = 0.99

Since P(X > 2010) = P(L) = 0.005,
P(X ≥ 1990) = P(M) + P(L) = 0.995
Alternatively, using Theorem 9,
P(X ≥ 1990) + P(X < 1990) = 1
P(X ≥ 1990) = 1 − P(S) = 1 − 0.005 = 0.995
SW:
1. Registrants at a large convention are offered 6 sightseeing tours on each of 3 days. In
how many ways can a person arrange to go on a sightseeing tour planned by this convention?
2. In a fuel economy study, each of 3 race cars is tested using 5 different brands of
gasoline at 7 test sites located in different regions of the country. If 2 drivers are used in the
study, and test runs are made once under each distinct set of conditions, how many test runs are
needed?
3. In how many different ways can a true-false test consisting of 9 questions be
answered?
4. (a) How many distinct permutations can be made from the letters of the word
COLUMNS?
(b) How many of these permutations start with the letter M?
5. (a) How many three-digit numbers can be formed from the digits 0, 1, 2, 3, 4, 5, and 6
if each digit can be used only once?
(b) How many of these are odd numbers?
(c) How many are greater than 330?
6. The probability that an American industry will locate in Shanghai, China, is 0.7, the
probability that it will locate in Beijing, China, is 0.4, and the probability that it will locate in
either Shanghai or Beijing or both is 0.8. What is the probability that the industry will locate (a)
in both cities? (b) in neither city?
7. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems,
and a dictionary, what is the probability that
(a) the dictionary is selected?
(b) 2 novels and 1 book of poems are selected?
8. A pair of fair dice is tossed. Find the probability of getting
(a) a total of 8;
(b) at most a total of 5.

1.6 Conditional Probability, Independence, and the Product Rule
Conditional Probability
The probability of an event B occurring when it is known that some event A has occurred
is called a conditional probability and is denoted by P(B|A). The conditional probability of B,
given A, is defined by
P(B|A) = P(A ∩ B)/P(A),
provided P(A) > 0.
Example 1.28
Suppose that our sample space S is the population of adults in a small town who have
completed the requirements for a college degree. We shall categorize them according to gender
and employment status.
Categorization of the Adults in a Small Town
One of these individuals is to be selected at random for a tour throughout the country to publicize
the advantages of establishing new industries in the town.
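Solution:
Let M − a man is chosen
E − the one chosen is employed
We want P(M|E), the probability that a man is chosen given that the one chosen is employed.
P(M|E) = P(E ∩ M)/P(E)
P(E ∩ M) = 460/900 = 23/45
P(E) = 600/900 = 2/3
P(M|E) = P(E ∩ M)/P(E) = (23/45)/(2/3) = 23/30
OR, working directly in the reduced sample space of the 600 employed adults,
P(M|E) = 460/600 = 23/30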
Example 1.29
The probability that a regularly scheduled flight departs on time is P(D)=0.83; the
probability that it arrives on time is P(A)=0.82; and the probability that it departs and arrives on
time is P(D ∩ A)=0.78. Find the probability that a plane
(a) arrives on time, given that it departed on time,
P(A|D) = P(D ∩ A)/P(D) = 0.78/0.83 = 0.94

(b) departed on time, given that it has arrived on time,
P(D|A) = P(D ∩ A)/P(A) = 0.78/0.82 = 0.95
(c) arrives on time, given that it did not depart on time.
Since P(A ∩ D′) = P(A) − P(D ∩ A) and P(D′) = 1 − P(D) = 0.17,
P(A|D′) = P(A ∩ D′)/P(D′) = (0.82 − 0.78)/0.17 = 0.24
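The three parts of Example 1.29 all apply the same definition, which the following sketch makes explicit. The function name conditional is a hypothetical helper introduced for illustration.

```python
def conditional(p_joint, p_given):
    """P(X | Y) = P(X ∩ Y) / P(Y), defined only when P(Y) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_d, p_a, p_da = 0.83, 0.82, 0.78   # P(D), P(A), P(D ∩ A)

p_a_given_d = conditional(p_da, p_d)
p_d_given_a = conditional(p_da, p_a)
# P(A ∩ D') = P(A) - P(D ∩ A) and P(D') = 1 - P(D)
p_a_given_not_d = conditional(p_a - p_da, 1 - p_d)

print(round(p_a_given_d, 2), round(p_d_given_a, 2), round(p_a_given_not_d, 2))
# 0.94 0.95 0.24
```

Note that part (c) divides by P(D′) = 0.17, not by P(D).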
Example 1.30
The concept of conditional probability has countless uses in both industrial and
biomedical applications. Consider an industrial process in the textile industry in which strips of a
particular type of cloth are being produced. These strips can be defective in two ways, length and
nature of texture. For the case of the latter, the process of identification is very complicated. It is
known from historical information on the process that 10% of strips fail the length test, 5% fail
the texture test, and only 0.8% fail both tests. If a strip is selected randomly from the process and
a quick measurement identifies it as failing the length test, what is the probability that it is
texture defective?
Solution:
Let L – length defective
T – texture defective
Given that the strip is length defective, the probability that this strip is texture defective is given
by
P(T|L) = P(T ∩ L)/P(L) = 0.008/0.1 = 0.08
Independent Events
Two events A and B are independent if and only if
P(B|A) = P(B) or P(A|B) = P(A),
assuming the existence of the conditional probabilities. Otherwise, A and B are dependent.
The Product Rule, or the Multiplicative Rule

Enables us to calculate the probability that two events will both occur.
Theorem 10: If in an experiment the events A and B can both occur, then
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵|𝐴), provided 𝑃(𝐴) > 0
The probability that both A and B occur is equal to the probability that A occurs
multiplied by the conditional probability that B occurs, given that A occurs.
Example 1.31
Suppose that we have a fuse box containing 20 fuses, of which 5 are defective. If 2 fuses
are selected at random and removed from the box in succession without replacing the first, what
is the probability that both fuses are defective?
Solution:
We shall let A be the event that the first fuse is defective and B the event that the second fuse
is defective; then we interpret A ∩ B as the event that A occurs and then B occurs after A has
occurred. The probability of first removing a defective fuse is 5/20 = 1/4; then the probability of
removing a second defective fuse, one of the 4 defective fuses among the 19 that remain, is 4/19.
Hence,
P(A ∩ B) = (1/4)(4/19) = 1/19
Example 1.32
One bag contains 4 white balls and 3 black balls, and a second bag contains 3 white balls
and 5 black balls. One ball is drawn from the first bag and placed unseen in the second bag.
What is the probability that a ball now drawn from the second bag is black?
Solution:
Let B1 – black ball from bag 1
B2 – black ball from bag 2
W1 – white ball from bag 1
P[(B1 ∩ B2) or (W1 ∩ B2)]
= P(B1 ∩ B2) + P(W1 ∩ B2)
= P(B1)P(B2|B1) + P(W1)P(B2|W1)
= (3/7)(6/9) + (4/7)(5/9) = 38/63
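The two-stage reasoning of Example 1.32 can be written out step by step. This is only a sketch with illustrative variable names; exact fractions keep the arithmetic identical to the hand calculation.

```python
from fractions import Fraction

# Bag 1: 4 white, 3 black. Bag 2: 3 white, 5 black.
p_b1 = Fraction(3, 7)            # a black ball is moved from bag 1
p_w1 = Fraction(4, 7)            # a white ball is moved from bag 1

# After the transfer, bag 2 holds 9 balls.
p_b2_given_b1 = Fraction(6, 9)   # 5 + 1 black among 9
p_b2_given_w1 = Fraction(5, 9)   # still 5 black among 9

# Product rule on each branch, then add the two mutually exclusive branches.
p_b2 = p_b1 * p_b2_given_b1 + p_w1 * p_b2_given_w1
print(p_b2)                      # 38/63
```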
Theorem 11: Two events A and B are independent if and only if

𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵)
Therefore, to obtain the probability that two independent events will both occur, we simply find
the product of their individual probabilities.
Example 1.33
A small town has one fire engine and one ambulance available for emergencies. The
probability that the fire engine is available when needed is 0.98, and the probability that the
ambulance is available when called is 0.92. In the event of an injury resulting from a burning

building, find the probability that both the ambulance and the fire engine will be available,
assuming they operate independently.
Solution:
Let 𝐴 – event that fire engine is available
𝐵 – event that the ambulance is available
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵) = (0.98)(0.92) = 0.9016
Example 1.34
An electrical system consists of four components as illustrated. The system works if
components A and B work and either of the components C or D works. The reliability
(probability of working) of each component is also shown. Find the probability that (a) the entire
system works and (b) the component C does not work, given that the entire system works.
Assume that the four components work independently.
Solution:
Components A and B are connected in series with the subsystem formed by C and D, and C and D
themselves form a parallel subsystem. The calculation below uses a reliability of 0.9 for each of
A and B and 0.8 for each of C and D.
(a) the probability that the entire system works

P[A ∩ B ∩ (C ∪ D)] = P(A)P(B)P(C ∪ D) = P(A)P(B)[1 − P(C′ ∩ D′)]
= P(A)P(B)[1 − P(C′)P(D′)]
= (0.9)(0.9)[1 − (1 − 0.8)(1 − 0.8)] = 0.7776
(b) the component C does not work, given that the entire system works
P = P(the system works but C does not work)/P(the system works)
= P(A ∩ B ∩ C′ ∩ D)/P(the system works)
= (0.9)(0.9)(1 − 0.8)(0.8)/0.7776 = 0.1667
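The series-parallel calculation in Example 1.34 translates directly into code. The sketch below assumes the component reliabilities used in the solution (0.9, 0.9, 0.8, 0.8) and independence of the components; the variable names are illustrative.

```python
p_a, p_b, p_c, p_d = 0.9, 0.9, 0.8, 0.8

# (a) A and B in series with the parallel pair (C, D).
p_c_or_d = 1 - (1 - p_c) * (1 - p_d)     # at least one of C, D works
p_system = p_a * p_b * p_c_or_d

# (b) The system works with C down only if A, B, and D all work.
p_system_and_not_c = p_a * p_b * (1 - p_c) * p_d
p_not_c_given_system = p_system_and_not_c / p_system

print(round(p_system, 4))                # 0.7776
print(round(p_not_c_given_system, 4))    # 0.1667
```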
Theorem 12: If, in an experiment, the events A1, A2, ..., Ak can occur, then
P(A1 ∩ A2 ∩ ··· ∩ Ak) = P(A1)P(A2|A1)P(A3|A1 ∩ A2) ··· P(Ak|A1 ∩ A2 ∩ ··· ∩ A_{k−1}).
If the events A1, A2, ..., Ak are independent, then

P(A1 ∩ A2 ∩ ··· ∩ Ak) = P(A1)P(A2) ··· P(Ak).

A collection of events A = {A1, ..., An} is mutually independent if for any subset
A_{i1}, ..., A_{ik} of A, for k ≤ n, we have
P(A_{i1} ∩ ··· ∩ A_{ik}) = P(A_{i1}) ··· P(A_{ik}).
Example 1.35
Three cards are drawn in succession, without replacement, from an ordinary deck of
playing cards. Find the probability that the event A1 ∩ A2 ∩ A3 occurs, where A1 is the event
that the first card is a red ace, A2 is the event that the second card is a 10 or a jack, and A3 is the
event that the third card is greater than 3 but less than 7.
Solution:
Let A1 – the first card is a red ace
A2 – the second card is a 10 or a jack
A3 – the third card is greater than 3 but less than 7
P(A1) = 2/52, P(A2|A1) = 8/51, P(A3|A1 ∩ A2) = 12/50
P(A1 ∩ A2 ∩ A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2)

= (2/52)(8/51)(12/50) = 8/5525
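To see that the product rule of Theorem 12 gives the same answer as direct counting, the sketch below enumerates every ordered draw of three cards. The deck encoding and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

def is_red_ace(card):
    return card[0] == "A" and card[1] in ("hearts", "diamonds")

def is_ten_or_jack(card):
    return card[0] in ("10", "J")

def is_four_to_six(card):          # greater than 3 but less than 7
    return card[0] in ("4", "5", "6")

# Product rule: P(A1) P(A2 | A1) P(A3 | A1 ∩ A2)
chain = Fraction(2, 52) * Fraction(8, 51) * Fraction(12, 50)

# Brute-force check over all ordered draws of three distinct cards.
favorable = total = 0
for first, second, third in permutations(deck, 3):
    total += 1
    if is_red_ace(first) and is_ten_or_jack(second) and is_four_to_six(third):
        favorable += 1

print(chain, Fraction(favorable, total))   # 8/5525 8/5525
```

The enumeration visits 52 × 51 × 50 = 132,600 ordered triples, which is small enough to run in well under a second on typical hardware.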
1.7 Bayes’ Rule

Bayesian statistics is a collection of tools that is used in a special form of statistical
inference which applies in the analysis of experimental data in many practical situations in
science and engineering.
Total Probability
Suppose that our sample space S is the population of adults in a small town who have
completed the requirements for a college degree. We shall categorize them according to gender
and employment status.
Categorization of the Adults in a Small Town
One of these individuals is to be selected at random for a tour throughout the country to publicize
the advantages of establishing new industries in the town.
Suppose that we are now given the additional information that 36 of those employed and
12 of those unemployed are members of the Rotary Club. We wish to find the probability of the
event A that the individual selected is a member of the Rotary Club. Referring to the Venn
diagram for the events A, E, and E′, we can write A as the union of the two mutually exclusive
events E ∩ A and E′ ∩ A. Hence,

A = (E ∩ A) ∪ (E′ ∩ A), and by Corollary 1 of Theorem 7, and then Theorem 10, we can
write
P(A) = P[(E ∩ A) ∪ (E′ ∩ A)] = P(E ∩ A) + P(E′ ∩ A)
= P(E)P(A|E) + P(E′)P(A|E′)
Venn diagram for the events A, E, and E′.

From the data,
P(E) = 600/900 = 2/3, P(A|E) = 36/600 = 3/50
P(E′) = 1/3, P(A|E′) = 12/300 = 1/25
and it follows that

P(A) = (2/3)(3/50) + (1/3)(1/25) = 4/75
A generalization of the foregoing illustration to the case where the sample space is
partitioned into k subsets is covered by the following theorem, sometimes called the theorem of
total probability or the rule of elimination.
Theorem 13: If the events B1, B2, ..., Bk constitute a partition of the sample space S such that
P(Bi) ≠ 0 for i = 1, 2, ..., k, then for any event A of S,

P(A) = Σ_{i=1}^{k} P(Bi ∩ A) = Σ_{i=1}^{k} P(Bi)P(A|Bi)

Partitioning the sample space S
Proof: Consider the Venn diagram. The event A is seen to be the union of the mutually
exclusive events
B1 ∩ A, B2 ∩ A, ..., Bk ∩ A,
that is,
A = (B1 ∩ A) ∪ (B2 ∩ A) ∪ ··· ∪ (Bk ∩ A).
Using Corollary 2 of Theorem 7 and Theorem 10, we have
P(A) = P[(B1 ∩ A) ∪ (B2 ∩ A) ∪ ··· ∪ (Bk ∩ A)]
= P(B1 ∩ A) + P(B2 ∩ A) + ··· + P(Bk ∩ A)
= Σ_{i=1}^{k} P(Bi ∩ A)
= Σ_{i=1}^{k} P(Bi)P(A|Bi)
Example 1.35
In a certain assembly plant, three machines, B1, B2, and B3, make 30%, 45%, and 25%,
respectively, of the products. It is known from past experience that 2%, 3%, and 2% of the
products made by each machine, respectively, are defective. Now, suppose that a finished
product is randomly selected. What is the probability that it is defective?
Solution:
Let A – the product is defective
B1 – the product is made by machine B1
B2 – the product is made by machine B2
B3 – the product is made by machine B3
Applying the rule of elimination, we can write
P(A) = P(B1)P(A|B1) + P(B2)P(A|B2) + P(B3)P(A|B3)
P(B1)P(A|B1) = (0.3)(0.02) = 0.006,
P(B2)P(A|B2) = (0.45)(0.03) = 0.0135,
P(B3)P(A|B3) = (0.25)(0.02) = 0.005,
P(A) = 0.006 + 0.0135 + 0.005 = 0.0245
Theorem 14: (Bayes’ Rule) If the events B1, B2, ..., Bk constitute a partition of the sample
space S such that P(Bi) ≠ 0 for i = 1, 2, ..., k, then for any event A in S such that P(A) ≠ 0,
P(Br|A) = P(Br ∩ A) / Σ_{i=1}^{k} P(Bi ∩ A) = P(Br)P(A|Br) / Σ_{i=1}^{k} P(Bi)P(A|Bi), for r = 1, 2, ..., k.
Proof: By the definition of conditional probability,
P(Br|A) = P(Br ∩ A)/P(A),
and then using Theorem 13 in the denominator, we have
P(Br|A) = P(Br ∩ A) / Σ_{i=1}^{k} P(Bi ∩ A) = P(Br)P(A|Br) / Σ_{i=1}^{k} P(Bi)P(A|Bi),
which completes the proof.
Instead of asking for P(A) in the assembly plant example (Example 1.35) by the rule of
elimination, suppose that we now consider the problem of finding the conditional probability
P(Bi|A). In other words, suppose that a product was randomly selected and it is defective. What
is the probability that this product was made by machine Bi? Questions of this type can be
answered by using Bayes’ rule, stated above. For example, if a product was chosen randomly and
found to be defective, what is the probability that it was made by machine B3?
Solution:
Using Bayes’ rule, we write
P(B3|A) = P(B3)P(A|B3) / [P(B1)P(A|B1) + P(B2)P(A|B2) + P(B3)P(A|B3)]
= 0.005/(0.006 + 0.0135 + 0.005) = 0.005/0.0245 = 10/49
In view of the fact that a defective product was selected, this result suggests that it
probably was not made by machine B3.
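The rule of elimination and Bayes' rule can be combined into two small functions. The sketch below applies them to the three-machine example; total_probability and bayes are hypothetical helper names, and the inputs are the production shares and defect rates given in the example.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum over i of P(B_i) P(A | B_i) for a partition B_1, ..., B_k."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes(priors, likelihoods):
    """Posterior P(B_r | A) for every r, via Bayes' rule."""
    p_a = total_probability(priors, likelihoods)
    return [p * l / p_a for p, l in zip(priors, likelihoods)]

# Machines B1, B2, B3: shares of production and defect rates.
priors = [Fraction(30, 100), Fraction(45, 100), Fraction(25, 100)]
defect_rates = [Fraction(2, 100), Fraction(3, 100), Fraction(2, 100)]

p_defective = total_probability(priors, defect_rates)
posteriors = bayes(priors, defect_rates)

print(float(p_defective))   # 0.0245
print(posteriors[2])        # 10/49
```

The posteriors always sum to 1, which is a useful sanity check when the same functions are applied to other partitions.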

Example 1.36
A manufacturing firm employs three analytical plans for the design and development of a
particular product. For cost reasons, all three are used at varying times. In fact, plans 1, 2, and 3
are used for 30%, 20%, and 50% of the products, respectively. The defect rate is different for the
three procedures as follows:
P(D|P1) = 0.01, P(D|P2) = 0.03, P(D|P3) = 0.02,
where P(D|Pj) is the probability of a defective product, given plan j. If a random product was
observed and found to be defective, which plan was most likely used and thus responsible?
Solution:
From the statement of the problem,
P(P1) = 0.30, P(P2) = 0.20, and P(P3) = 0.50,
and we must find P(Pj|D) for j = 1, 2, 3. Bayes’ rule shows
P(P1|D) = P(P1)P(D|P1) / [P(P1)P(D|P1) + P(P2)P(D|P2) + P(P3)P(D|P3)]
= (0.30)(0.01) / [(0.30)(0.01) + (0.20)(0.03) + (0.50)(0.02)]
= 0.003/0.019 = 0.158
P(P2|D) = (0.03)(0.20)/0.019 = 0.316
P(P3|D) = (0.02)(0.50)/0.019 = 0.526
The conditional probability of plan 3, given a defect, is the largest of the three; thus a defective
random product is most likely the result of the use of plan 3.

Measures of Location: The Sample Mean and Median
Measures of location are designed to provide the analyst with some quantitative values of
where the center, or some other location, of data is located.
Sample mean
The mean is simply a numerical average.
Suppose that the observations in a sample are x1, x2, ..., xn. The sample mean, denoted by x̄, is
x̄ = Σ_{i=1}^{n} xi/n = (x1 + x2 + ··· + xn)/n
Sample median
The purpose of the sample median is to reflect the central tendency of the sample in such
a way that it is uninfluenced by extreme values or outliers.
Given that the observations in a sample are x1, x2, ..., xn, arranged in increasing order of
magnitude, the sample median is
x̃ = x_{(n+1)/2}, if n is odd
x̃ = (1/2)(x_{n/2} + x_{n/2+1}), if n is even
Trimmed means
A trimmed mean is computed by “trimming away” a certain percent of both the largest
and the smallest set of values.
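The three measures of location can be compared on one data set. The sketch below uses the paint drying-time measurements from the worked examples later in this section; trimmed_mean is an illustrative helper (not a standard-library function) that drops the stated percentage of observations from each end before averaging.

```python
from statistics import mean, median

def trimmed_mean(values, percent):
    """Drop `percent` % of the observations from each end, then average the rest."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100      # observations removed per tail
    kept = ordered[k:len(ordered) - k]
    return mean(kept)

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
                3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

print(round(mean(drying_times), 3))              # 3.787
print(median(drying_times))                      # 3.6
print(round(trimmed_mean(drying_times, 20), 3))  # 3.678
```

With 15 observations, a 20% trim removes the 3 smallest and the 3 largest values and averages the remaining 9.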
Measures of Variability
Process and product variability is a fact of life in engineering and scientific systems: The
control or reduction of process variability is often a source of major difficulty. More and more
process engineers and managers are learning that product quality and, as a result, profits derived
from manufactured products are very much a function of process variability.
Sample Range and Sample Standard Deviation

Just as there are many measures of central tendency or location, there are many measures of
spread or variability. The simplest one is the sample range Xmax − Xmin. The range can be very
useful in statistical quality control.
The sample variance, denoted by s², is given by

s² = Σ_{i=1}^{n} (xi − x̄)²/(n − 1)
Variance is a measure of the average squared deviation from the mean x̄. We use the term
average squared deviation even though the definition makes use of a division by degrees of
freedom n − 1 rather than n.
The sample standard deviation, denoted by s, is the positive square root of s², that is,
s = √(s²)
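The definitions of the sample range, variance, and standard deviation can be applied directly, as in the sketch below. It reuses the paint drying-time data from the worked examples later in this section; sample_variance is an illustrative helper that follows the n − 1 definition, and the standard library's statistics.variance is imported only as a cross-check.

```python
from math import sqrt
from statistics import variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
                3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

s2 = sample_variance(drying_times)
s = sqrt(s2)
sample_range = max(drying_times) - min(drying_times)

print(round(s2, 4), round(s, 3))   # 0.9427 0.971
print(round(sample_range, 1))      # 3.1
```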

Discrete and Continuous Data
Discrete data can only take particular values. There may potentially be an infinite
number of those values, but each is distinct and there's no grey area in between. Discrete
data can be numeric -- like numbers of apples -- but it can also be categorical -- like red or
blue, or male or female, or good or bad.
Continuous data are not restricted to defined separate values, but can occupy any
value over a continuous range. Between any two continuous data values there may be an
infinite number of others. Continuous data are always essentially numeric.
Statistical Modeling, Scientific Inspection, and Graphical Diagnostics

Often the end result of a statistical analysis is the estimation of parameters of a
postulated model. A statistical model is not deterministic but, rather, must entail some
probabilistic aspects. A model form is often the foundation of assumptions that are made by the
analyst.
Scatter Plot
Example:
A textile manufacturer designs an experiment in which cloth specimens that contain various
percentages of cotton are produced.
Tensile Strength
The analysis of the data should revolve around a different type of model, one that postulates a
type of structure relating the population mean tensile strength to the cotton concentration. A
model may be written
μ_{t,c} = β0 + β1 C + β2 C²,
where μ_{t,c} is the population mean tensile strength, which varies with the amount of cotton in the
product C. The implication of this model is that for a fixed cotton level, there is a population of
tensile strength measurements and the population mean is μ_{t,c}. This type of model is called a
regression model.

Two points become evident from the two data illustrations here: (1) The type of model
used to describe the data often depends on the goal of the experiment; and (2) the structure of the
model should take advantage of non-statistical scientific input.
Stem-and-Leaf Plot
Statistical data, generated in large masses, can be very useful for studying the behavior of
the distribution if presented in a combined tabular and graphic display called a stem-and-leaf
plot.
Example:
Consider the data on the table, which specifies the “life” of 40 similar car batteries recorded to
the nearest tenth of a year. The batteries are guaranteed to last 3 years.
Car Battery Life
Stem-and-Leaf Plot of Battery Life
Double-Stem-and-Leaf Plot of Battery Life (a modified stem-and-leaf plot)
The stem-and-leaf plot represents an effective way to summarize data.
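A stem-and-leaf display is simple to build by hand or in code. Since the battery-life table is not reproduced here, the sketch below uses the paint drying-time data from the worked examples later in this section instead; the function name stem_and_leaf and the choice of the integer part as the stem are illustrative assumptions.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group one-decimal values by integer part (stem) and tenths digit (leaf)."""
    rows = defaultdict(list)
    for value in sorted(values):
        tenths = round(value * 10)          # e.g. 3.4 -> 34
        stem, leaf = divmod(tenths, 10)
        rows[stem].append(leaf)
    return dict(rows)

drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
                3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

# Each printed row shows the stem, its leaves, and the class frequency.
for stem, leaves in stem_and_leaf(drying_times).items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves), f"({len(leaves)})")
```

The frequency in parentheses on each row is the count of leaves for that stem, which is the class frequency used when a histogram is built from the same classes.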
Histogram
Data are grouped into different classes or intervals which can be constructed by counting
the leaves belonging to each stem and noting that each stem defines a class interval.

Dividing each class frequency by the total number of observations, we obtain the
proportion of the set of observations in each of the classes. A table listing relative frequencies is
called a relative frequency distribution.
Relative Frequency Distribution of Battery Life
Relative frequency histogram
Graphical tools such as those in the figures above aid in the characterization of
the nature of the population, and one property of a population is its distribution, or probability
distribution.
A distribution is said to be symmetric if it can be folded along a vertical axis so that the
two sides coincide. A distribution that lacks symmetry with respect to a vertical axis is said to be
skewed. The distribution illustrated in Figure 1.8(a) is said to be skewed to the right since it has
a long right tail and a much shorter left tail. In Figure 1.8(b) we see that the distribution is
symmetric, while in Figure 1.8(c) it is skewed to the left.

If we rotate a stem-and-leaf plot counterclockwise through an angle of 90◦, we observe
that the resulting columns of leaves form a picture that is similar to a histogram. Consequently, if
our primary purpose in looking at the data is to determine the general shape or form of the
distribution, it will seldom be necessary to construct a relative frequency histogram.
Estimating frequency distribution
Examples:
1. The following measurements were recorded for the drying time, in hours, of a certain brand
of latex paint.
3.4 2.5 4.8 2.9 3.6 2.8
3.3 5.6 3.7 2.8 4.4
4.0 5.2 3.0 4.8
Assume that the measurements are a simple random sample.
(a) What is the sample size for the above sample?
(b) Calculate the sample mean for these data.
(c) Calculate the sample median.
(d) Plot the data by way of a dot plot.
(e) Compute the 20% trimmed mean for the above data set.
(f) Is the sample mean for these data more or less descriptive as a center of location than
the trimmed mean?
Solution:
(a) What is the sample size for the above sample?
n = 15
(b) Calculate the sample mean for these data.
x̄ = (1/15)(3.4 + 2.5 + 4.8 + 2.9 + 3.6 + 2.8 + 3.3 + 5.6 + 3.7 + 2.8 + 4.4 + 4.0 + 5.2 + 3.0 + 4.8)
x̄ = 3.787
(c) Calculate the sample median.
Arranged in increasing order, the data are
2.5 2.8 2.8 2.9 3.0 3.3 3.4 3.6 3.7 4.0 4.4 4.8 4.8 5.2 5.6
Since n = 15 is odd, the sample median is the 8th ordered value, x̃ = 3.6.
(d) Plot the data by way of a dot plot.
(e) Compute the 20% trimmed mean for the above data set.
Trimming 20% of the 15 observations removes the 3 smallest and the 3 largest values, leaving
2.9 3.0 3.3 3.4 3.6 3.7 4.0 4.4 4.8
x̄tr(20) = (1/9)(2.9 + 3.0 + 3.3 + 3.4 + 3.6 + 3.7 + 4.0 + 4.4 + 4.8)
x̄tr(20) = 3.678
(f) Is the sample mean for these data more or less descriptive as a center of location than
the trimmed mean? They are about the same.
(g) Compute the sample variance and sample standard deviation.

s² = [1/(15 − 1)][(2.5 − 3.787)² + (2.8 − 3.787)² + (2.8 − 3.787)² + (2.9 − 3.787)²
+ (3.0 − 3.787)² + (3.3 − 3.787)² + (3.4 − 3.787)² + (3.6 − 3.787)²
+ (3.7 − 3.787)² + (4.0 − 3.787)² + (4.4 − 3.787)² + (4.8 − 3.787)²
+ (4.8 − 3.787)² + (5.2 − 3.787)² + (5.6 − 3.787)²]
= 0.9427
s = √(s²) = √0.9427 = 0.971

2. Twenty adult males between the ages of 30 and 40 participated in a study to evaluate
the effect of a specific health regimen involving diet and exercise on the blood
cholesterol. Ten were randomly selected to be a control group, and ten others were
assigned to take part in the regimen as the treatment group for a period of 6 months.
The following data show the reduction in cholesterol experienced for the time period
for the 20 subjects:
Control group: 7 3 −4 14 2 5 22 −7 9 5
Treatment group: −6 5 9 4 4 12 37 5 3 3
Solution:
(a) Do a dot plot of the data for both groups on the same graph.
Let x = control group
o = treatment group
(b) Compute the mean, median, and 10% trimmed mean for both groups.
Control: x̄ = 5.60, x̃ = 5, x̄tr(10) = 5.13
Treatment: x̄ = 7.60, x̃ = 4.5, x̄tr(10) = 5.63
(c) Explain why the difference in means suggests one conclusion about the effect of
the regimen, while the difference in medians or trimmed means suggests a different
conclusion.
- The difference of the means is 2.0, while the differences of the medians and of the trimmed
means are only 0.5 in magnitude, which is much smaller. The likely cause is the extreme values
(outliers) in the samples, especially the value of 37.

3. List the elements of each of the following sample spaces:
(a) the set of integers between 1 and 50 divisible by 8;
S = {8, 16, 24, 32, 40, 48}
(b) the set S = {x | x² + 4x − 5 = 0};
x² + 4x − 5 = (x + 5)(x − 1) = 0
x = −5 or x = 1
S = {−5, 1}
(c) the set of outcomes when a coin is tossed until a tail or three heads appear;
S = {T, HT, HHT, HHH}
(d) the set S = {x | x is a continent};
S = {N. America, S. America, Europe, Asia, Africa, Australia, Antarctica}
(e) the set S = {x | 2x − 4 ≥ 0 and x < 1}.
2x − 4 ≥ 0 requires 2x ≥ 4, that is, x ≥ 2,
but x < 1, so no value of x satisfies both conditions.
S = ϕ

4. Construct a Venn diagram to illustrate the possible intersections and unions for the
following events relative to the sample space consisting of all automobiles made in the
United States.
F : Four door,
S : Sun roof,
P : Power steering.

5. If S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and A = {0, 2, 4, 6, 8}, B = {1, 3, 5, 7, 9},
C = {2, 3, 4, 5}, and D = {1, 6, 7}, list the elements of the sets corresponding to the
following events:
(a) A ∪ C;
A ∪ C = {0, 2, 3, 4, 5, 6, 8}
(b) A ∩ B;
A ∩ B = ϕ
(c) C′;
C′ = {0, 1, 6, 7, 8, 9}
(d) (C′ ∩ D) ∪ B;
C′ ∩ D = {1, 6, 7}
(C′ ∩ D) ∪ B = {1, 3, 5, 6, 7, 9}
(e) (S ∩ C)′;
(S ∩ C)′ = C′ = {0, 1, 6, 7, 8, 9}
(f) A ∩ C ∩ D′.
A ∩ C = {2, 4}
A ∩ C ∩ D′ = {2, 4}

6. If S = {x | 0 < x < 12}, M = {x | 1 < x < 9}, and N = {x | 0 < x < 5}, find
(a) M ∪ N;
{x | 0 < x < 9}
(b) M ∩ N;
{x | 1 < x < 5}
(c) M′ ∩ N′.
{x | 9 ≤ x < 12}

7. Registrants at a large convention are offered 6 sightseeing tours on each of 3
days. In how many ways can a person arrange to go on a sightseeing tour planned
by this convention?
Solution:
n1 = 6 sightseeing tours
n2 = 3 days
n1 n2 = 6 × 3 = 18 ways

8. In a fuel economy study, each of 3 race cars is tested using 5 different brands of
gasoline at 7 test sites located in different regions of the country. If 2 drivers are used
in the study, and test runs are made once under each distinct set of conditions, how
many test runs are needed?
Solution:
n1 = 3 race cars
n2 = 5 brands of gasoline
n3 = 7 test sites
n4 = 2 drivers
n1 n2 n3 n4 = 3 × 5 × 7 × 2 = 210 test runs

9. In how many different ways can a true-false test consisting of 9 questions be
answered?
Solution:
n1 = n2 = n3 = n4 = n5 = n6 = n7 = n8 = n9 = 2
n1 n2 n3 n4 n5 n6 n7 n8 n9 = 2^9 = 512 ways

10. (a) How many distinct permutations can be made from the letters of the word
COLUMNS?
Solution:
7! = 5040
(b) How many of these permutations start with the letter M?
(7 − 1)! = 6! = 720
11. (a) How many three-digit numbers can be formed from the digits 0, 1, 2, 3, 4, 5, and
6 if each digit can be used only once?
Solution:
n1 = 6 (hundreds digit, which cannot be 0)
n2 = 6 (tens digit)
n3 = 5 (units digit)
n1 n2 n3 = 6 × 6 × 5 = 180 three-digit numbers
(b) How many of these are odd numbers?
n1 = 3 (units digit: 1, 3, or 5)
n2 = 5 (hundreds digit: nonzero and different from the units digit)
n3 = 5 (tens digit)
n1 n2 n3 = 3 × 5 × 5 = 75 three-digit odd numbers
(c) How many are greater than 330?
Hundreds digit 4, 5, or 6: 3 × 6 × 5 = 90 three-digit numbers
Hundreds digit 3 and tens digit 4, 5, or 6: 1 × 3 × 5 = 15 three-digit numbers
Total number of three-digit numbers that are greater than 330 is 90 + 15 = 105

12. The probability that an American industry will locate in Shanghai, China, is
0.7, the probability that it will locate in Beijing, China, is 0.4, and the probability
that it will locate in either Shanghai or Beijing or both is 0.8. What is the
probability that the industry will locate
(a) in both cities?
P(S ∩ B) = P(S) + P(B) − P(S ∪ B) = 0.7 + 0.4 − 0.8 = 0.3
(b) in neither city?
P(S′ ∩ B′) = 1 − P(S ∪ B) = 1 − 0.8 = 0.2

13. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems,
and a dictionary, what is the probability that
(a) the dictionary is selected?
With C(n, r) = n!/[r!(n − r)!] denoting the number of combinations of n objects taken r at a time,
C(1, 1) C(8, 2)/C(9, 3) = (1)(28)/84 = 1/3
(b) 2 novels and 1 book of poems are selected?
C(5, 2) C(3, 1)/C(9, 3) = (10)(3)/84 = 5/14

8. A pair of fair dice is tossed. Find the probability of getting
(a) a total of 8;
There are 6 × 6 = 36 equally likely sample points. 5 of them give a sum of 8: (2,6), (3,5),
(4,4), (5,3), and (6,2), so the probability is 5/36.
(b) at most a total of 5.
10 of the 36 sample points give a total of at most 5, so the probability is 10/36 = 5/18.

14. Interest centers around the life of an electronic component. Suppose it is known
that the probability that the component survives for more than 6000 hours is 0.42.
Suppose also that the probability that the component survives no longer than 4000
hours is 0.04. Let A be the event that the component fails a particular test and B be
the event that the component displays strain but does not actually fail. Event A occurs
with probability 0.20, and event B occurs with probability 0.35.
(a) What is the probability that the component does not fail the test?
P(A′) = 1 − 0.2 = 0.8
(b) What is the probability that the component works perfectly well (i.e., neither
displays strain nor fails the test)?
Since A and B are mutually exclusive,
P(A′ ∩ B′) = 1 − P(A ∪ B) = 1 − 0.2 − 0.35 = 0.45
(c) What is the probability that the component either fails the test or displays strain?
P(A ∪ B) = 0.2 + 0.35 = 0.55

15. If the probabilities that an automobile mechanic will service 3, 4, 5, 6, 7, or 8 or
more cars on any given workday are, respectively, 0.12, 0.19, 0.28, 0.24, 0.10,
and 0.07, determine the following:
(a) What is the probability that no more than 4 cars will be serviced by the mechanic?
0.12 + 0.19 = 0.31
(b) What is the probability that he will service fewer than 8 cars?
1 − 0.07 = 0.93
(c) What is the probability that he will service either 3 or 4 cars?
0.12 + 0.19 = 0.31

16. In the senior year of a high school graduating class of 100 students, 42 studied mathematics, 68 studied psychology, 54 studied history, 22 studied both mathematics and history, 25 studied both mathematics and psychology, 7 studied history but neither mathematics nor psychology, 10 studied all three subjects, and 8 did not take any of the three. Randomly select a student from the class and find the probabilities of the following events.
(a) A person enrolled in psychology takes all three subjects.
P(M ∩ H | P) = P(M ∩ P ∩ H) / P(P) = 10/68 = 5/34
(b) A person not taking psychology is taking both history and mathematics.
P(H ∩ M | P′) = P(H ∩ M ∩ P′) / P(P′) = (22 − 10)/(100 − 68) = 12/32 = 3/8

17. The probability that a vehicle entering the Luray Caverns has Canadian license plates is 0.12; the probability that it is a camper is 0.28; and the probability that it is a camper with Canadian license plates is 0.09. What is the probability that
A: the vehicle is a camper
B: the vehicle has Canadian license plates
(a) a camper entering the Luray Caverns has Canadian license plates?
P(B|A) = P(A ∩ B) / P(A) = 0.09/0.28 = 9/28
(b) a vehicle with Canadian license plates entering the Luray Caverns is a camper?
P(A|B) = P(A ∩ B) / P(B) = 0.09/0.12 = 3/4
(c) a vehicle entering the Luray Caverns does not have Canadian plates or is not a camper?
P(B′ ∪ A′) = 1 − P(A ∩ B) = 1 − 0.09 = 0.91

18. Find the probability of randomly selecting 4 good quarts of milk in succession from a cooler containing 20 quarts of which 5 have spoiled, by using
(a) the first formula of Theorem 12
Let Qi be the event that the i-th quart selected is good.
P(Q1 ∩ Q2 ∩ Q3 ∩ Q4) = P(Q1) P(Q2|Q1) P(Q3|Q1 ∩ Q2) P(Q4|Q1 ∩ Q2 ∩ Q3)
= (15/20)(14/19)(13/18)(12/17) = 91/323
(b) the formulas of Theorem 6 and Rule 1.3
A: event that 4 good quarts of milk are selected
P(A) = C(15, 4) / C(20, 4) = [15!/(15 − 4)!] / [20!/(20 − 4)!] = 91/323

19. A paint-store chain produces and sells latex and semi-gloss paint. Based on long-range sales, the probability that a customer will purchase latex paint is 0.75. Of those that purchase latex paint, 60% also purchase rollers. But only 30% of semi-gloss paint buyers purchase rollers. A randomly selected buyer purchases a roller and a can of paint. What is the probability that the paint is latex?
Solution:
A: a customer purchases latex paint
A′: a customer purchases semi-gloss paint
B: a customer purchases rollers.
P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(A′)P(B|A′)]
= (0.75 × 0.60) / (0.75 × 0.60 + 0.25 × 0.30) = 0.857
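The two routes asked for in Problem 18 (multiplying conditional probabilities, and counting selections) can be compared numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original solution.

```python
from fractions import Fraction
from math import comb

# Problem 18(a): multiply the conditional probabilities of drawing a good
# quart: 15 good out of 20, then 14 out of 19, and so on for 4 draws.
sequential = Fraction(1)
good, total = 15, 20
for draw in range(4):
    sequential *= Fraction(good - draw, total - draw)

# Problem 18(b): count the favourable and the possible selections of 4 quarts.
counting = Fraction(comb(15, 4), comb(20, 4))

print(sequential, counting)  # both print as 91/323
```

Exact fractions are used so that the agreement of the two methods is not blurred by rounding.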

20. The probability that a patient recovers from a delicate heart operation is 0.8. What is the probability that
(a) exactly 2 of the next 3 patients who have this operation survive?
Let Ri be the event that the i-th patient recovers.
P(R1′ ∩ R2 ∩ R3) + P(R1 ∩ R2′ ∩ R3) + P(R1 ∩ R2 ∩ R3′)
= P(R1′)P(R2)P(R3) + P(R1)P(R2′)P(R3) + P(R1)P(R2)P(R3′)
= 3 × 0.8 × 0.8 × (1 − 0.8) = 0.384
(b) all of the next 3 patients who have this operation survive?
P(R1 ∩ R2 ∩ R3) = P(R1)P(R2)P(R3) = 0.8³ = 0.512

21. A rare disease exists with which only 1 in 500 is affected. A test for the disease exists, but of course it is not infallible. A correct positive result (patient actually has the disease) occurs 95% of the time, while a false positive result (patient does not have the disease) occurs 1% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease?
Solution:
D: a person has the rare disease, P(D) = 1/500
P: the test shows a positive result, P(P|D) = 0.95 and P(P|D′) = 0.01
P(D|P) = P(D)P(P|D) / [P(D)P(P|D) + P(D′)P(P|D′)]
= (1/500)(0.95) / [(1/500)(0.95) + (1 − 1/500)(0.01)]
= 0.1599

22. There is a 50-50 chance that the queen carries the gene of hemophilia. If she is a carrier, then each prince has a 50-50 chance of having hemophilia independently. If the queen is not a carrier, the prince will not have the disease. Suppose the queen has had three princes without the disease. What is the probability the queen is a carrier?
Solution:
C: the queen is a carrier, P(C) = 0.5
D: a prince has the disease, P(D|C) = 0.5
P(C | D1′ D2′ D3′) = P(C)P(D1′ D2′ D3′ | C) / [P(C)P(D1′ D2′ D3′ | C) + P(C′)P(D1′ D2′ D3′ | C′)]
= (0.5 × 0.5³) / (0.5 × 0.5³ + 0.5 × 1) = 1/9
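Problems 21 and 22 share the same two-hypothesis form of Bayes' rule, so one small function can check both answers. This is only a sketch; the function name `posterior` and its parameter names are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """P(H | E) for a hypothesis H and evidence E, by Bayes' rule."""
    numerator = prior * likelihood
    return numerator / (numerator + (1 - prior) * likelihood_if_not)

# Problem 21: H = the person has the disease, E = the test is positive.
disease = posterior(Fraction(1, 500), Fraction(95, 100), Fraction(1, 100))

# Problem 22: H = the queen is a carrier, E = three princes without the disease.
carrier = posterior(Fraction(1, 2), Fraction(1, 2) ** 3, Fraction(1))

print(float(disease))  # about 0.1599
print(carrier)         # 1/9
```

Even with a 95% detection rate, the posterior in Problem 21 stays near 16% because the disease is so rare that false positives outnumber true positives.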

1. If an experiment consists of throwing a die and then drawing a letter at random from the English alphabet, how many points are there in the sample space?
Solution:
n1 = 6
n2 = 26
n1 n2 = 6 × 26 = 156

2. A developer of a new subdivision offers a prospective home buyer a choice of 4 designs, 3 different heating systems, a garage or carport, and a patio or screened porch. How many different plans are available to this buyer?
Solution:
n1 = 4
n2 = 3
n3 = 2
n4 = 2
n1 n2 n3 n4 = 4 × 3 × 2 × 2 = 48

3. (a) In how many ways can 6 people be lined up to get on a bus?
Solution:
6! = 720 ways
(b) If 3 specific persons, among 6, insist on following each other, how many ways are possible?
Solution:
4! × 3! = 144 ways
(c) If 2 specific persons, among 6, refuse to follow each other, how many ways are possible?
Solution:
n1 = 5! = 120 ways
n2 = 2! = 2 ways
n1 n2 = 120 × 2 = 240 ways in which these two persons follow each other
The number of ways in which these two specific persons do not follow each other is
720 − 240 = 480 ways

4. Find the number of ways that 6 teachers can be assigned to 4 sections of an introductory psychology course if no teacher is assigned to more than one section.
Solution:
6P4 = 6!/(6 − 4)! = 360 ways

5. How many ways are there that no two students will have the same birth date in a class of size 60?
Solution:
365P60 = 365!/(365 − 60)!, a number too big to write out here.

6. From past experience, a stockbroker believes that under present economic conditions a customer will invest in tax-free bonds with a probability of 0.6, will invest in mutual funds with a probability of 0.3, and will invest in both tax-free bonds and mutual funds with a probability of 0.15. At this time, find the probability that a customer will invest
Solution:
Let B: customer invests in tax-free bonds
M: customer invests in mutual funds
a) in either tax-free bonds or mutual funds;
P(B ∪ M) = P(B) + P(M) − P(B ∩ M) = 0.6 + 0.3 − 0.15 = 0.75
b) in neither tax-free bonds nor mutual funds.
P(B′ ∩ M′) = 1 − P(B ∪ M) = 1 − 0.75 = 0.25
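The counts in Problem 3 are small enough to confirm by listing every lineup. The sketch below assumes hypothetical labels A to F for the six people, with A, B, C as the three who insist on standing together and A, B as the two who refuse to.

```python
from itertools import permutations

people = "ABCDEF"  # hypothetical labels for the 6 people
lineups = list(permutations(people))

def adjacent_block(lineup, members):
    """True if all `members` occupy consecutive positions in `lineup`."""
    positions = sorted(lineup.index(m) for m in members)
    return positions[-1] - positions[0] == len(members) - 1

total = len(lineups)                                             # part (a)
three_together = sum(adjacent_block(l, "ABC") for l in lineups)  # part (b)
two_apart = sum(not adjacent_block(l, "AB") for l in lineups)    # part (c)

print(total, three_together, two_apart)  # 720 144 480
```

Brute force is only practical for small cases like this one; the factorial formulas in the solutions are what scale.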

7. In an experiment to study the relationship of hypertension and smoking habits, the following data are collected for 180 individuals:
where H and NH in the table stand for Hypertension and Non-hypertension, respectively. If one of these individuals is selected at random, find the probability that the person is:
Let: A: a person is experiencing hypertension
B: a person is a heavy smoker
C: a person is a non-smoker
(a) experiencing hypertension, given that the person is a heavy smoker;
P(A|B) = 30/49
(b) a nonsmoker, given that the person is experiencing no hypertension.
P(C|A′) = 48/93 = 16/31

8. In the senior year of a high school graduating class of 100 students, 42 studied mathematics, 68 studied psychology, 54 studied history, 22 studied both mathematics and history, 25 studied both mathematics and psychology, 7 studied history but neither mathematics nor psychology, 10 studied all three subjects, and 8 did not take any of the three. Randomly select a student from the class and find the probabilities of the following events.
Solution:
(a) A person enrolled in psychology takes all three subjects.
P(M ∩ H | P) = P(M ∩ P ∩ H) / P(P) = 10/68
(b) A person not taking psychology is taking both history and mathematics.
P(H ∩ M | P′) = P(H ∩ M ∩ P′) / P(P′) = (22 − 10)/(100 − 68) = 12/32 = 3/8

9. For married couples living in a certain suburb, the probability that the husband will vote on a bond referendum is 0.21, the probability that the wife will vote on the referendum is 0.28, and the probability that both the husband and the wife will vote is 0.15. What is the probability that
SOLUTION:
LET: H: the husband will vote on the bond referendum,
W: the wife will vote on the bond referendum
P(H) = 0.21, P(W) = 0.28, P(H ∩ W) = 0.15
(a) at least one member of a married couple will vote?
P(H ∪ W) = P(H) + P(W) − P(H ∩ W) = 0.21 + 0.28 − 0.15 = 0.34
(b) a wife will vote, given that her husband will vote?
P(W|H) = P(H ∩ W) / P(H) = 0.15/0.21 = 5/7
(c) a husband will vote, given that his wife will not vote?
P(H|W′) = P(H ∩ W′) / P(W′)
P(H ∩ W′) = P(H) − P(H ∩ W) = 0.21 − 0.15 = 0.06
P(W′) = 1 − P(W) = 1 − 0.28 = 0.72
P(H|W′) = 0.06/0.72 = 0.083

BAYES' RULE
Bayes' rule provides us with a way to update our beliefs based on the arrival of new, relevant pieces of evidence.
For example, if we were trying to provide the probability that a given person has cancer, we would initially just say it is whatever percent of the population has cancer. However, given additional evidence such as the fact that the person is a smoker, we can update our probability, since the probability of having cancer is higher given that the person is a smoker.
P(A|B) = P(A ∩ B) / P(B) = P(B|A)P(A) / P(B)

P(A|B) is called the posterior; this is what we are trying to estimate.
P(B|A) is called the likelihood; this is the probability of observing the new evidence, given our initial hypothesis.
P(A) is called the prior; this is the probability of our hypothesis without any additional information.
P(B) is called the marginal likelihood; this is the total probability of observing the evidence.

EXAMPLE:
10. In a certain region of the country it is known from past experience that the probability of selecting an adult over 40 years of age with cancer is 0.05. If the probability of a doctor correctly diagnosing a person with cancer as having the disease is 0.78 and the probability of incorrectly diagnosing a person without cancer as having the disease is 0.06, what is the probability that a person diagnosed as having cancer actually has the disease?

Solution:
Let C: a person has cancer
C′: a person does not have cancer
D: a person is diagnosed with cancer.
P(C|D) = P(D|C)P(C) / [P(D|C)P(C) + P(D|C′)P(C′)]
P(C) = 0.05,
P(D|C) = 0.78
P(C′) = 1 − 0.05 = 0.95
P(D|C′) = 0.06
P(C|D) = (0.78 × 0.05) / (0.78 × 0.05 + 0.06 × 0.95) = 0.039/0.096 = 0.406

11. Police plan to enforce speed limits by using radar traps at four different locations within the city limits. The radar traps at each of the locations L1, L2, L3, and L4 will be operated 40%, 30%, 20%, and 30% of the time. If a person who is speeding on her way to work has probabilities of 0.2, 0.1, 0.5, and 0.2, respectively, of passing through these locations, what is the probability that she will receive a speeding ticket?

Solution:
Let S1, S2, S3, and S4 represent the events that a person is speeding as she passes through the respective locations
R: the event that the radar trap is operating, resulting in a speeding ticket
P(R) = Σ_{i=1}^{4} P(R|Si) P(Si)
= (0.4 × 0.2) + (0.3 × 0.1) + (0.2 × 0.5) + (0.3 × 0.2)
= 0.27

12. Suppose that the four inspectors at a film factory are supposed to stamp the expiration date on each package of film at the end of the assembly line. John, who stamps 20% of the packages, fails to stamp the expiration date once in every 200 packages; Tom, who stamps 60% of the packages, fails to stamp

the expiration date once in every 100 packages; Jeff, who stamps 15% of the packages, fails to stamp the expiration date once in every 90 packages; and Pat, who stamps 5% of the packages, fails to stamp the expiration date once in every 200 packages. If a customer complains that her package of film does not show the expiration date, what is the probability that it was inspected by John?

Solution:
Let A: no expiration date
B1: John is the inspector
B2: Tom is the inspector
B3: Jeff is the inspector
B4: Pat is the inspector

P(B1) = 0.20, P(A|B1) = 1/200 = 0.005
P(B2) = 0.60, P(A|B2) = 1/100 = 0.010
P(B3) = 0.15, P(A|B3) = 1/90 = 0.011
P(B4) = 0.05, P(A|B4) = 1/200 = 0.005

P(B1|A) = P(A|B1)P(B1) / [P(A|B1)P(B1) + P(A|B2)P(B2) + P(A|B3)P(B3) + P(A|B4)P(B4)]
= (0.005 × 0.20) / [(0.005 × 0.20) + (0.010 × 0.60) + (0.011 × 0.15) + (0.005 × 0.05)]
= 0.1124

13. A truth serum has the property that 90% of the guilty suspects are properly judged while, of course, 10% of the guilty suspects are improperly found innocent. On the other hand, innocent suspects are misjudged 1% of the time. If the suspect was selected from a group of suspects of which only 5% have ever committed a crime, and the serum indicates that he is guilty, what is the probability that he is innocent?

SOLUTION:
LET G: a person is guilty of committing a crime
G′: a person is not guilty
Gs: the event that the truth serum indicates that a person is guilty

P(G′|Gs) = P(Gs|G′)P(G′) / [P(Gs|G′)P(G′) + P(Gs|G)P(G)]
P(Gs|G) = 0.9
P(Gs|G′) = 0.01
P(G) = 0.05
P(G′) = 1 − 0.05 = 0.95
P(G′|Gs) = (0.95 × 0.01) / (0.95 × 0.01 + 0.05 × 0.9) = 0.0095/0.0545 = 0.1743

14. A rare disease exists with which only 1 in 500 is affected. A test for the disease exists, but of course it is not infallible. A correct positive result (patient actually has the disease) occurs 95% of the time, while a false positive result (patient does not have the disease) occurs 1% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease?

Solution:
Let D: a person has the rare disease, P(D) = 1/500
P: the test shows a positive result, P(P|D) = 0.95 and P(P|D′) = 0.01
P(D|P) = P(D)P(P|D) / [P(D)P(P|D) + P(D′)P(P|D′)]
= (1/500)(0.95) / [(1/500)(0.95) + (1 − 1/500)(0.01)]
= 0.1599
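When the evidence can come from more than two sources, as with the four inspectors in Problem 12, the denominator of Bayes' rule is a total-probability sum over all of them. The following sketch works through that problem with exact fractions; the dictionary names `share` and `miss_rate` are illustrative assumptions.

```python
from fractions import Fraction

# Share of packages stamped by each inspector, and how often each one
# fails to stamp the expiration date (Problem 12).
share = {"John": Fraction(20, 100), "Tom": Fraction(60, 100),
         "Jeff": Fraction(15, 100), "Pat": Fraction(5, 100)}
miss_rate = {"John": Fraction(1, 200), "Tom": Fraction(1, 100),
             "Jeff": Fraction(1, 90), "Pat": Fraction(1, 200)}

# Total probability of a missing date, then Bayes' rule for each inspector.
p_missing = sum(share[name] * miss_rate[name] for name in share)
posterior = {name: share[name] * miss_rate[name] / p_missing for name in share}

print(float(posterior["John"]))  # about 0.112
```

Using 1/90 exactly gives roughly 0.1121 for John; the hand computation arrives at 0.1124 because it rounds 1/90 to 0.011 first.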

Random Variables and Probability Distributions

A random variable is a function that associates a real number with each element in the sample space.
A quantity having a numerical value for each member of a group, especially one whose values occur according to a frequency distribution.

EXAMPLE:
1. Two balls are drawn in succession without replacement from an urn containing 4 red balls and 3 black balls. The possible outcomes and the values y of the random variable Y, where Y is the number of red balls, are

2. A stockroom clerk returns three safety helmets at random to three steel mill employees who had previously checked them. If Smith, Jones, and Brown, in that order, receive one of the three hats, list the sample points for the possible orders of returning the helmets, and find the value m of the random variable M that represents the number of correct matches.

If a sample space contains a finite number of possibilities or an unending sequence with as many elements as there are whole numbers, it is called a discrete sample space.
A discrete random variable is one which may take on only a countable number of distinct values such as 0, 1, 2, 3, 4, .... Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete.

Discrete Probability Distributions
The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.
A discrete random variable assumes each of its values with a certain probability.
Frequently, it is convenient to represent all the probabilities of a random variable X by a formula. Such a formula would necessarily be a function of the numerical values x that we shall denote by f(x), g(x), r(x), and so forth. Therefore, we write f(x) = P(X = x); for example, f(3) = P(X = 3). The set of ordered pairs (x, f(x)) is called the probability function, probability mass function, or probability distribution of the discrete random variable X.

Definition 1. The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x,
1. f(x) ≥ 0,
2. Σ_x f(x) = 1,
3. P(X = x) = f(x)

Example 2.
A shipment of 20 similar laptop computers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.
Solution:
Let X be a random variable whose values x are the possible numbers of defective computers purchased by the school. Then x can only take the numbers 0, 1, and 2.

X follows a hypergeometric distribution (a discrete probability distribution that describes the probability of k successes [random draws for which the object drawn has a specific feature] in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature, wherein each draw is either a success or a failure). In contrast, the binomial distribution describes the probability of k successes in n draws with replacement.
P(X = x) = C(G, x) C(B, n − x) / C(N, n)
where:
x: number of drawn objects with the feature
n − x: number of drawn objects without the feature
N = 20
G = 3 (objects with the feature, here the defective computers)
B = 17 (objects without the feature)
n = 2

Thus, the probability distribution of X is
x      0       1        2
f(x)   68/95   51/190   3/190

The cumulative distribution function F(x) of a discrete random variable X with probability distribution f(x) is
F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t), for −∞ < x < ∞.

Example 2.
If a car agency sells 50% of its inventory of a certain foreign car equipped with side airbags, find a formula for the probability distribution of the number of cars with side airbags among the next 4 cars sold by the agency.
Solution:
The probability of selling an automobile with side airbags = 0.5
Points in the sample space = 2 × 2 × 2 × 2 = 16
The event of selling x models with side airbags and 4 − x models without side airbags can occur in C(4, x) ways, so the probability distribution f(x) = P(X = x) is
f(x) = C(4, x)/16, for x = 0, 1, 2, 3, 4.
Verify that f(2) = 3/8.
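Both discrete distributions in this part (the defective laptops and the side-airbag sales) come down to counting combinations, so they can be tabulated directly. This is a small sketch under that reading; `hypergeom_pmf` and its parameter names are illustrative and not taken from the text.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, population, marked, draws):
    """P(X = x): x marked items in `draws` draws without replacement."""
    return Fraction(comb(marked, x) * comb(population - marked, draws - x),
                    comb(population, draws))

# Laptop example: 20 computers, 3 defective, 2 purchased.
laptops = {x: hypergeom_pmf(x, 20, 3, 2) for x in range(3)}

# Car-agency example: each of the 4 sales has side airbags with probability 1/2.
airbags = {x: Fraction(comb(4, x), 2 ** 4) for x in range(5)}

print(laptops)     # values 68/95, 51/190 and 3/190
print(airbags[2])  # 3/8
```

Each dictionary sums to 1, which is condition 2 of Definition 1.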

Cumulative distribution function
F(0) = f(0) = 1/16,
F(1) = f(0) + f(1) = 5/16,
F(2) = f(0) + f(1) + f(2) = 11/16,
F(3) = f(0) + f(1) + f(2) + f(3) = 15/16,
F(4) = f(0) + f(1) + f(2) + f(3) + f(4) = 1.

[Figure: Discrete Cumulative Distribution Function]

f(2) = F(2) − F(1) = 11/16 − 5/16 = 3/8

[Figure: Probability mass function]

[Figure: Probability Histogram]

CONTINUOUS PROBABILITY DISTRIBUTION
A continuous probability distribution is one in which the random variable X can take on any value in an interval (X is continuous). Because there are infinitely many values that X could assume, the probability of X taking on any one specific value is zero. Therefore, we often speak in ranges of values (for example, P(X > 0) = 0.50). The normal distribution is one example of a continuous distribution. The probability that X falls between two values (a and b) equals the integral (area under the curve) from a to b:
P(a < X < b) = ∫_a^b f(x) dx

[Figure: P(a < X < b)]

A probability density function is defined such that the probability that X lies between a and b equals the integral (area under the curve) between a and b. This probability is never negative. Further, we know that the area under the curve from negative infinity to positive infinity is one.

The function f(x) is a probability density function (pdf) for the continuous random variable X, defined over the set of real numbers, if
1. f(x) ≥ 0, for all x ∈ R
2. ∫_{−∞}^{∞} f(x) dx = 1
3. P(a < X < b) = ∫_a^b f(x) dx

Example 3:
Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the probability density function
f(x) = x²/3 for −1 < x < 2, and f(x) = 0 elsewhere.
(a) Verify that f(x) is a density function.
∫_{−∞}^{∞} f(x) dx = ∫_{−1}^{2} (x²/3) dx = x³/9 |_{−1}^{2} = 8/9 + 1/9 = 1
(b) Find P(0 < X ≤ 1).
P(0 < X ≤ 1) = ∫_0^1 (x²/3) dx = x³/9 |_0^1 = 1/9

The cumulative distribution function F(x) of a continuous random variable X with density function f(x) is
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, for −∞ < x < ∞.
As an immediate consequence of the definition above, one can write the two results
P(a < X < b) = F(b) − F(a) and f(x) = dF(x)/dx, if the derivative exists.

Example 4:
For the density function of Example 3, find F(x), and use it to evaluate P(0 < X ≤ 1).
Solution:
For −1 < x < 2,
F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−1}^{x} (t²/3) dt = t³/9 |_{−1}^{x} = (x³ + 1)/9
P(0 < X ≤ 1) = F(1) − F(0) = 2/9 − 1/9 = 1/9

Example:
1. Classify the following random variables as discrete or continuous:
X: the number of automobile accidents per year in Virginia. Discrete
Y: the length of time to play 18 holes of golf. Continuous
M: the amount of milk produced yearly by a particular cow. Continuous
N: the number of eggs laid each month by a hen. Discrete
P: the number of building permits issued each month in a certain city. Discrete
Q: the weight of grain produced per acre. Continuous
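The results of Examples 3 and 4 can be cross-checked by comparing the closed-form cumulative distribution function with a numerical integral of the density. The sketch below assumes the density f(x) = x²/3 on (−1, 2) used in those solutions; the midpoint rule is just one simple choice of numerical method.

```python
from fractions import Fraction

def F(x):
    """Cumulative distribution function for f(x) = x**2 / 3 on (-1, 2)."""
    x = Fraction(x)
    if x <= -1:
        return Fraction(0)
    if x >= 2:
        return Fraction(1)
    return (x ** 3 + 1) / 9

def midpoint_integral(a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of x**2 / 3 from a to b."""
    width = (b - a) / steps
    return sum(((a + (i + 0.5) * width) ** 2) / 3 for i in range(steps)) * width

print(F(1) - F(0))               # 1/9
print(midpoint_integral(-1, 2))  # close to 1
print(midpoint_integral(0, 1))   # close to 1/9
```

Agreement between F(1) − F(0) and the integral from 0 to 1 illustrates the identity P(a < X < b) = F(b) − F(a).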
5. An overseas shipment of 5 foreign automobiles contains 2 that have slight paint blemishes. If an agency receives 3 of these automobiles at random, list the elements of the sample space S, using the letters B and N for blemished and non-blemished, respectively; then to each sample point assign a value x of the random variable X representing the number of automobiles with paint blemishes purchased by the agency.
Solution:
Sample point   x
NNN            0
NNB            1
NBN            1
BNN            1
NBB            2
BNB            2
BBN            2

6. Let W be a random variable giving the number of heads minus the number of tails in three tosses of a coin. List the elements of the sample space S for the three tosses of the coin and to each sample point assign a value w of W.
Solution:
Sample point   w
HHH            3
HHT            1
HTH            1
THH            1
HTT            −1
THT            −1
TTH            −1
TTT            −3

7. A coin is flipped until 3 heads in succession occur. List only those elements of the sample space that require 6 or fewer tosses. Is this a discrete sample space? Explain.
Solution:
S = {HHH, THHH, HTHHH, TTHHH, TTTHHH, HTTHHH, THTHHH, HHTHHH, ...}
Yes, this is a discrete sample space: it contains an unending sequence with as many elements as there are whole numbers.

8. Determine the value c so that each of the following functions can serve as a probability distribution of the discrete random variable X:
(a) f(x) = c(x² + 4), for x = 0, 1, 2, 3;
Solution:
Σ_x f(x) = 1
Σ_{x=0}^{3} c(x² + 4) = 1
c[(0² + 4) + (1² + 4) + (2² + 4) + (3² + 4)] = 1
c[4 + 5 + 8 + 13] = 1
30c = 1
c = 1/30
(b) f(x) = c C(2, x) C(3, 3 − x), for x = 0, 1, 2
Σ_{x=0}^{2} f(x) = 1
c[C(2, 0) C(3, 3) + C(2, 1) C(3, 2) + C(2, 2) C(3, 1)] = 1
c[1 + 6 + 3] = 1
c = 1/10

10. The shelf life, in days, for bottles of a certain prescribed medicine is a random variable having the density function
f(x) = 20,000/(x + 100)³ for x > 0, and f(x) = 0 elsewhere.
Find the probability that a bottle of this medicine will have a shelf life of
(a) at least 200 days
Solution:
P(X ≥ 200) = 1 − P(X < 200)
= 1 − ∫_0^200 20,000/(x + 100)³ dx
= 1 − [−10,000/(x + 100)²]_0^200
= 1 + 10,000/(200 + 100)² − 10,000/(0 + 100)²
= 1 + 10,000/90,000 − 1
P(X ≥ 200) = 1/9

(b) anywhere from 80 to 120 days.
P(80 ≤ X ≤ 120) = P(X ≤ 120) − P(X < 80)
= ∫_0^120 f(x) dx − ∫_0^80 f(x) dx
= ∫_80^120 20,000/(x + 100)³ dx
= [−10,000/(x + 100)²]_80^120
= −10,000/220² − (−10,000/180²)
P(80 ≤ X ≤ 120) = 0.102

11. The total number of hours, measured in units of 100 hours, that a family runs a vacuum cleaner over a period of one year is a continuous random variable X that has the density function
f(x) = x for 0 < x < 1, f(x) = 2 − x for 1 ≤ x < 2, and f(x) = 0 elsewhere.
Find the probability that over a period of one year, a family runs their vacuum cleaner
(a) less than 120 hours;
P(X < 120/100) = P(X < 1.2)
P(X < 1.2) = ∫_0^1.2 f(x) dx
= ∫_0^1 x dx + ∫_1^1.2 (2 − x) dx
= [x²/2]_0^1 + [2x − x²/2]_1^1.2
= (1/2 − 0) + (2 × 1.2 − 1.2²/2 − 2 × 1 + 1²/2)
= 1/2 + 2.4 − 1.44/2 − 2 + 1/2
= 0.68

(b) between 50 and 100 hours.
P(50/100 < X < 100/100) = P(0.5 < X < 1)
P(0.5 < X < 1) = ∫_0.5^1 x dx
= [x²/2]_0.5^1
= 1²/2 − 0.5²/2
= 1/2 − 1/8 = 0.375

12. A shipment of 7 television sets contains 2 defective sets. A hotel makes a random purchase of 3 of the sets. If x is the number of defective sets purchased by the hotel, find the probability distribution of X. Express the results graphically as a probability histogram.
Solution:
Let X take the values 0, 1, and 2.

If X = 0
P(X = 0) = 5C3 / 7C3 = [5!/(3!(5 − 3)!)] / [7!/(3!(7 − 3)!)] = 10/35 = 2/7 = 0.286
If X = 1
P(X = 1) = (5C2 × 2C1) / 7C3 = (10 × 2)/35 = 4/7 = 0.571
If X = 2
P(X = 2) = (5C1 × 2C2) / 7C3 = (5 × 1)/35 = 1/7 = 0.143

[Figure: Probability Histogram]

13. Find the cumulative distribution function of the random variable X representing the number of defectives in Exercise 12. Then using F(x), find
F(x) = P(X ≤ x) =
0, x < 0
f(0), 0 ≤ x < 1
f(0) + f(1), 1 ≤ x < 2
f(0) + f(1) + f(2), 2 ≤ x

For x < 0, F(x) = 0
F(0) = f(0) = 5C3 / 7C3 = 2/7
F(1) = f(0) + f(1) = 5C3 / 7C3 + (5C2 × 2C1) / 7C3 = 2/7 + 4/7 = 6/7
F(2)

= f(0) + f(1) + f(2) = 5C3 / 7C3 + (5C2 × 2C1) / 7C3 + (5C1 × 2C2) / 7C3 = 2/7 + 4/7 + 1/7 = 1

F(x) =
0, x < 0
2/7, 0 ≤ x < 1
6/7, 1 ≤ x < 2
1, 2 ≤ x

(a) P(X = 1)
P(X = 1) = P(X ≤ 1) − P(X ≤ 0) = F(1) − F(0) = 6/7 − 2/7 = 4/7

(b) P(0 < X ≤ 2)
P(0 < X ≤ 2) = P(X ≤ 2) − P(X ≤ 0) = F(2) − F(0) = 1 − 2/7 = 5/7

(c) Construct a graph of the cumulative distribution function

14. Suppose it is known from large amounts of historical data that X, the number of cars that arrive at a specific intersection during a 20-second time period, is characterized by the following discrete probability function:
f(x) = e^(−6) 6^x / x!, for x = 0, 1, 2, ....
(a) Find the probability that in a specific 20-second time period, more than 8 cars arrive at the intersection.
P(X > 8) = 1 − P(X ≤ 8) = 1 − Σ_{x=0}^{8} f(x)
= 1 − [f(0) + f(1) + f(2) + f(3) + f(4) + f(5) + f(6) + f(7) + f(8)]
= 1 − e^(−6) [1 + 6/1! + 6²/2! + 6³/3! + 6⁴/4! + 6⁵/5! + 6⁶/6! + 6⁷/7! + 6⁸/8!]
= 1 − 0.8472
= 0.1528
(b) Find the probability that only 2 cars arrive
P(X = 2) = f(2) = e^(−6) 6²/2! = 0.0446
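For the car-arrival distribution in Problem 14, the tail probability P(X > 8) is one minus a sum that runs over x = 0, 1, ..., 8, including the x = 0 term. A short numerical sketch, with `poisson_pmf` as an illustrative helper name:

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """f(x) = e**(-mean) * mean**x / x! for x = 0, 1, 2, ..."""
    return exp(-mean) * mean ** x / factorial(x)

mean = 6
# P(X > 8) = 1 - P(X <= 8); the sum must start at x = 0.
p_more_than_8 = 1 - sum(poisson_pmf(x, mean) for x in range(9))
p_exactly_2 = poisson_pmf(2, mean)

print(round(p_more_than_8, 4))  # 0.1528
print(round(p_exactly_2, 4))    # 0.0446
```

Leaving out f(0) = e^(−6) ≈ 0.0025 would shift the tail probability upward by that amount, which is an easy slip to make when writing the sum out by hand.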

JOINT PROBABILITY DISTRIBUTION
If X and Y are two discrete random variables, the probability distribution for their simultaneous occurrence can be represented by a function with values f(x, y) for any pair of values (x, y) within the range of the random variables X and Y. It is customary to refer to this function as the joint probability distribution of X and Y.
Hence, in the discrete case,
f(x, y) = P(X = x, Y = y);
that is, the values f(x, y) give the probability that outcomes x and y occur at the same time.
The function f(x, y) is a joint probability distribution or probability mass function of the discrete random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y),
2. Σ_x Σ_y f(x, y) = 1,
3. P(X = x, Y = y) = f(x, y).
For any region A in the xy plane, P[(X, Y) ∈ A] = ΣΣ_A f(x, y).

Example 15
Two ballpoint pens are selected at random from a box that contains 3 blue pens, 2 red pens, and 3 green pens. If X is the number of blue pens selected and Y is the number of red pens selected, find
(a) the joint probability function f(x, y),
(b) P[(X, Y) ∈ A], where A is the region {(x, y) | x + y ≤ 1}.

Solution:
The possible pairs of values (x, y) are (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), and (2, 0).
(a) The joint probability function is
f(x, y) = C(3, x) C(2, y) C(3, 2 − x − y) / C(8, 2),
for x = 0, 1, 2; y = 0, 1, 2; and 0 ≤ x + y ≤ 2.
(b) The probability that (X, Y) falls in the region A is
P[(X, Y) ∈ A] = P(X + Y ≤ 1)
= f(0, 0) + f(0, 1) + f(1, 0)
= 3/28 + 3/14 + 9/28 = 9/14

Table 3.1. Joint Probability Distribution
f(x, y)         x = 0    x = 1    x = 2    Row totals
y = 0           3/28     9/28     3/28     15/28
y = 1           3/14     3/14     0        3/7
y = 2           1/28     0        0        1/28
Column totals   5/14     15/28    3/28     1

Joint density function
When X and Y are continuous random variables, the joint density function f(x, y) is a surface lying above the xy plane, and P[(X, Y) ∈ A], where A is any region in the xy plane, is equal to the volume of the right cylinder bounded by the base A and the surface.

The function f(x, y) is a joint density function of the continuous random variables X and Y if
1. f(x, y) ≥ 0, for all (x, y),
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1,
3. P[(X, Y) ∈ A] = ∬_A f(x, y) dx dy, for any region A in the xy plane.
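The joint probability function of Example 15 can be tabulated from its counting formula, which makes it easy to confirm that the probabilities sum to 1 and to evaluate a region such as x + y ≤ 1. The following is a minimal sketch; the function name `f` mirrors the text, while `table` and `p_region` are illustrative names.

```python
from fractions import Fraction
from math import comb

def f(x, y):
    """Joint pmf: x blue and y red pens among 2 drawn from 3 blue, 2 red, 3 green."""
    green = 2 - x - y
    if x < 0 or y < 0 or green < 0:
        return Fraction(0)
    return Fraction(comb(3, x) * comb(2, y) * comb(3, green), comb(8, 2))

table = {(x, y): f(x, y) for x in range(3) for y in range(3)}

total = sum(table.values())
p_region = sum(p for (x, y), p in table.items() if x + y <= 1)

print(total)     # 1
print(p_region)  # 9/14
```

Summing `table` over y for a fixed x, or over x for a fixed y, gives the column and row totals of the joint table.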

Example 16
A privately owned business operates both a drive-in facility and a walk-in facility. On a randomly selected day, let X and Y, respectively, be the proportions of the time that the drive-in and the walk-in facilities are in use, and suppose that the joint density function of these random variables is
f(x, y) = (2/5)(2x + 3y) for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and f(x, y) = 0 elsewhere.

a. Verify condition 2 of the definition of a joint density function.
Solution:
The integration of f(x, y) over the whole region is
∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = ∫_0^1 ∫_0^1 (2/5)(2x + 3y) dx dy
= ∫_0^1 [2x²/5 + 6xy/5]_{x=0}^{x=1} dy
= ∫_0^1 (2/5 + 6y/5) dy = [2y/5 + 3y²/5]_0^1
= 2/5 + 3/5 = 1

b. Find P[(X, Y) ∈ A], where A = {(x, y) | 0 < x < 1/2, 1/4 < y < 1/2}.
Solution:
P[(X, Y) ∈ A] = P(0 < X < 1/2, 1/4 < Y < 1/2)
= ∫_{1/4}^{1/2} ∫_0^{1/2} (2/5)(2x + 3y) dx dy
= ∫_{1/4}^{1/2} [2x²/5 + 6xy/5]_{x=0}^{x=1/2} dy
= ∫_{1/4}^{1/2} (1/10 + 3y/5) dy
= [y/10 + 3y²/10]_{1/4}^{1/2}
= (1/10)[(1/2 + 3/4) − (1/4 + 3/16)] = 13/160

The marginal distributions of X alone and of Y.
Given the joint probability distribution f(x, y) of the discrete random variables X and Y, the probability distribution g(x) of X alone is obtained by summing f(x, y) over the values of Y. Similarly, the probability distribution h(y) of Y alone is obtained by summing f(x, y) over the values of X. When X and Y are continuous random variables, summations are replaced by integrals.
g(x) = Σ_y f(x, y) and h(y) = Σ_x f(x, y)
for the discrete case, and
g(x) = ∫_{−∞}^{∞} f(x, y) dy and h(y) = ∫_{−∞}^{∞} f(x, y) dx
for the continuous case.

Example 17
Show that the column and row totals of Table 3.1 give the marginal distribution of X alone and of Y alone.
Solution:
For the random variable X
g(0) = f(0, 0) + f(0, 1) + f(0, 2) = 3/28 + 3/14 + 1/28 = 5/14
g(1) = f(1, 0) + f(1, 1) + f(1, 2) = 9/28 + 3/14 + 0 = 15/28

g(2) = f(2, 0) + f(2, 1) + f(2, 2) = 3/28 + 0 + 0 = 3/28

Conditional Probability Distribution
The value x of the random variable X represents an event that is a subset of the sample space. Using the definition of conditional probability,
P(B|A) = P(A ∩ B) / P(A), provided P(A) > 0,
where A and B are now the events defined by X = x and Y = y, respectively, then
P(Y = y | X = x) = P(X = x, Y = y) / P(X = x) = f(x, y)/g(x), provided g(x) > 0,
where X and Y are discrete random variables.
The function f(x, y)/g(x), which is strictly a function of y with x fixed, satisfies all the conditions of a probability distribution. This is also true when f(x, y) and g(x) are the joint density and marginal distribution, respectively, of continuous random variables. As a result, it is extremely important that we make use of the special type of distribution of the form f(x, y)/g(x) in order to be able to effectively compute conditional probabilities.

Let X and Y be two random variables, discrete or continuous. The conditional distribution of the random variable Y given that X = x is
f(y|x) = f(x, y)/g(x), provided g(x) > 0.
Similarly, the conditional distribution of X given that Y = y is
f(x|y) = f(x, y)/h(y), provided h(y) > 0.
To find the probability that the discrete random variable X falls between a and b when it is known that the discrete variable Y = y, we evaluate
P(a < X < b | Y = y) = Σ_{a<x<b} f(x|y)
where the summation extends over all values of X between a and b. When X and Y are continuous, we evaluate
P(a < X < b | Y = y) = ∫_a^b f(x|y) dx

Example 18
The joint density for the random variables (X, Y), where X is the unit temperature change and Y is the proportion of spectrum shift that a certain atomic particle produces, is
f(x, y) = 10xy² for 0 < x < y < 1, and f(x, y) = 0 elsewhere.
(a) Find the marginal densities g(x), h(y), and the conditional density f(y|x)

Solution:
g(x) = ∫_{−∞}^{∞} f(x, y) dy
= ∫_x^1 10xy² dy = (10/3)xy³ |_{y=x}^{y=1}
= (10/3)x(1 − x³), 0 < x < 1,
h(y) = ∫_{−∞}^{∞} f(x, y) dx
= ∫_0^y 10xy² dx = 5x²y² |_{x=0}^{x=y}
= 5y⁴, 0 < y < 1
Now
f(y|x) = f(x, y)/g(x) = 10xy² / [(10/3)x(1 − x³)]
= 3y²/(1 − x³), 0 < x < y < 1
(b) Find the probability that the spectrum shifts more than half of the total observations, given that the temperature is increased by 0.25 unit.
P(Y > 1/2 | X = 0.25) = ∫_{1/2}^{1} f(y | x = 0.25) dy
= ∫_{1/2}^{1} 3y²/(1 − 0.25³) dy = 8/9

Statistical Independence
If f(x|y) does not depend on y, then of course the outcome of the random variable Y has no impact on the outcome of the random variable X. In other words, we say that X and Y are independent random variables.
Let X and Y be two random variables, discrete or continuous, with joint probability distribution f(x, y) and marginal distributions g(x) and h(y), respectively. The random variables X and Y are said to be statistically independent if and only if
f(x, y) = g(x)h(y)
for all (x, y) within their range.

Let X1, X2, ..., Xn be n random variables, discrete or continuous, with joint probability distribution f(x1, x2, ..., xn) and marginal distributions f1(x1), f2(x2), ..., fn(xn), respectively. The random variables X1, X2, ..., Xn are said to be mutually statistically independent if and only if
f(x1, x2, ..., xn) = f1(x1) f2(x2) ··· fn(xn)
for all (x1, x2, ..., xn) within their range.

Example 19
Suppose that the shelf life, in years, of a certain perishable food product packaged in cardboard containers is a random variable whose probability density function is given by
f(x) = e^(−x) for x > 0, and f(x) = 0 elsewhere.
Let X1, X2, and X3 represent the shelf lives for three of these containers selected independently and find
P(X1 < 2, 1 < X2 < 3, X3 > 2).

Solution:
Random variables X1, X2, and X3 are statistically independent, having the joint probability density
f(x1, x2, x3) = f(x1) f(x2) f(x3) = e^(−x1) e^(−x2) e^(−x3) = e^(−x1 − x2 − x3),
for x1 > 0, x2 > 0, x3 > 0, and f(x1, x2, x3) = 0 elsewhere. Hence
P(X1 < 2, 1 < X2 < 3, X3 > 2)
= ∫_2^∞ ∫_1^3 ∫_0^2 e^(−x1 − x2 − x3) dx1 dx2 dx3
= (1 − e^(−2))(e^(−1) − e^(−3)) e^(−2) = 0.0372.

Example 20
A coin is biased such that a head is three times as likely to occur as a tail. Find the expected number of tails when this coin is tossed twice.

Solution:
\[ P(H) = \frac{3}{4} \quad \text{and} \quad P(T) = \frac{1}{4} \]
\[ S = \{HH, HT, TH, TT\} \]
Let X = number of tails that occur in two tosses of the coin.
\[ P(X = 0) = P(HH) = \frac{3}{4} \times \frac{3}{4} = \frac{9}{16} \]
\[ P(X = 1) = P(HT) + P(TH) = \frac{3}{4} \times \frac{1}{4} + \frac{1}{4} \times \frac{3}{4} = \frac{3}{8} \]
\[ P(X = 2) = P(TT) = \frac{1}{4} \times \frac{1}{4} = \frac{1}{16} \]

The probability distribution:
x       0      1      2
f(x)    9/16   3/8    1/16

\[ \mu = E(X) = \left(0 \times \frac{9}{16}\right) + \left(1 \times \frac{3}{8}\right) + \left(2 \times \frac{1}{16}\right) = \frac{1}{2} \]

Example 21
A private pilot wishes to insure his airplane for $200,000. The insurance company estimates that a total loss will occur with probability 0.002, a 50% loss with probability 0.01, and a 25% loss with probability 0.1. Ignoring all other partial losses, what premium should the insurance company charge each year to realize an average profit of $500?

Solution:
\[ E(X) = (200{,}000)(0.002) + (100{,}000)(0.01) + (50{,}000)(0.1) + (0)(0.888) = \$6{,}400 \]
The insurance company should charge a premium of $6,400 + $500 = $6,900.

Example 22
The density function of the coded measure X is, as read off the integrand in the solution,
\[ f(x) = \frac{4}{\pi(1 + x^2)}, \quad 0 < x < 1, \]
and 0 elsewhere. Find the expected value of X.

Solution:
\[ E(X) = \frac{4}{\pi} \int_0^1 \frac{x}{1 + x^2}\,dx = \frac{\ln 4}{\pi} \]

Example 23
If a dealer’s profit, in units of $5000, on a new automobile can be looked upon as a random variable X having the density function
\[ f(x) = 2(1 - x), \quad 0 < x < 1, \]
and 0 elsewhere, find the average profit per automobile.

Solution:
\[ E(X) = \int_0^1 2x(1 - x)\,dx = \frac{1}{3} \]
The average profit is therefore
\[ \frac{1}{3} \times \$5{,}000 = \$1{,}666.67 \]

Example 24
Suppose that you are inspecting a lot of 1000 light bulbs, among which 20 are defectives. You choose two light bulbs randomly from the lot without replacement. Let X_1 and X_2 indicate whether the first and the second bulb chosen are defective (X_i = 1 if defective, 0 otherwise). Find the probability that at least one light bulb chosen is defective.

Solution:
\[ P(X_1 + X_2 \ge 1) = 1 - P(X_1 = 0, X_2 = 0) = 1 - \frac{\binom{980}{2}\binom{20}{0}}{\binom{1000}{2}} = 1 - 0.9604 = 0.0396 \approx 0.040 \]
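The expected values in Examples 20 and 21 and the complement-rule probability in Example 24 can be checked with a few lines of code. The following is a minimal sketch, not part of the original solutions; the helper name `expected_value` and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def expected_value(pmf):
    """Return the sum of x * P(X = x) for a discrete distribution given as {x: p}."""
    return sum(x * p for x, p in pmf.items())


# Example 20: number of tails in two tosses of a coin with P(T) = 1/4.
p_tail = Fraction(1, 4)
tails_pmf = {k: comb(2, k) * p_tail**k * (1 - p_tail) ** (2 - k) for k in range(3)}
mean_tails = expected_value(tails_pmf)

# Example 21: expected payout on the $200,000 policy, then the premium
# that leaves an average profit of $500.
loss_pmf = {200_000: 0.002, 100_000: 0.01, 50_000: 0.1, 0: 0.888}
expected_loss = expected_value(loss_pmf)
premium = expected_loss + 500

# Example 24: at least one defective among 2 bulbs drawn without
# replacement from 1000 bulbs, 20 of which are defective.
p_at_least_one = 1 - Fraction(comb(980, 2), comb(1000, 2))

print(mean_tails, round(premium, 2), float(p_at_least_one))
```

Exact fractions are used where the inputs are rational, so the coin result comes out as 1/2 rather than a rounded decimal.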
(a) find 𝐸(𝑋 ! 𝑌 − 2𝑋𝑌)
Example 25 𝐸(𝑋 ! 𝑌 − 2𝑋𝑌)
Let X be a random variable with the # !
following probability distribution: = € €(𝑥 ! 𝑦 − 2𝑥𝑦)𝑓(𝑥, 𝑦)
;6. M6.
18 18
! = (1 − 2) { | + (4 − 4) { | + ⋯
Find 𝜇P (𝑋), where 𝑔(𝑋) = (2𝑋 + 1) . 70 70
3 3
+ (8 − 8) { | = −
Solution: 70 7
If X is discrete, 𝜇P (𝑋) = ∑; 𝑔(𝑥)𝑓(𝑥)
𝑔(𝑋) = (2𝑋 + 1)! (b) find 𝜇Q − 𝜇R .
𝑔(−3) = (2(−3) + 1)! = 25
𝑔(6) = (2(6) + 1)! = 169 𝜇Q − 𝜇R = € € 𝑥𝑓(𝑥, 𝑦)
𝑔(9) = (2(9) + 1)! = 361 ; S
− € € 𝑦𝑓(𝑥, 𝑦)
Probability density function is ; S
!
#
=€ €(𝑥 − 𝑦)
;6.
M6.
𝜇P (𝑋) = € 𝑔(𝑥)𝑓(𝑥) 2 3
= (0 − 1) × + (0 − 2) ×
; 70 70
1 1 3
= {25 × | + {169 × | + (1 − 0) ×
6 2 70
1 9
+ {361 × | = 209 + (1 − 2) ×
3 70
9
Example 26 + (2 − 0) ×
70
From a sack of fruit containing 3 oranges, 2 18
apples, and 3 bananas, a random sample of 4 + (2 − 1) ×
70
pieces of fruit is selected. If X is the number 3
of oranges and Y is the number of apples in + (3 − 0) ×
70
the sample, 2
+ (3 − 1) ×
the joint probability distribution of X and Y 70
+ (3 − 2) × 0
−2 − 6 + 3 − 9 + 18 + 18 + 9 + 4
=
70
𝑥 = 0,1,2,3 𝑦 = 0,1,2
1
=
2
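Both parts of Example 26 can be verified by enumerating the joint distribution. This is a sketch under one assumption: the joint probability function, which did not survive in the text above, is taken to be the hypergeometric form f(x, y) = C(3, x) C(2, y) C(3, 4 − x − y) / C(8, 4), which reproduces the probabilities 2/70, 3/70, 18/70, ... used in the solution. The names `joint_pmf` and `expect` are illustrative.

```python
from fractions import Fraction
from math import comb


def joint_pmf(x, y):
    """P(X = x, Y = y) for x oranges and y apples in a sample of 4 fruits
    drawn from 3 oranges, 2 apples and 3 bananas (assumed hypergeometric form)."""
    bananas = 4 - x - y
    if not (0 <= x <= 3 and 0 <= y <= 2 and 0 <= bananas <= 3):
        return Fraction(0)
    return Fraction(comb(3, x) * comb(2, y) * comb(3, bananas), comb(8, 4))


support = [(x, y) for x in range(4) for y in range(3)]


def expect(g):
    """E[g(X, Y)] as a double sum over the support."""
    return sum(g(x, y) * joint_pmf(x, y) for x, y in support)


total = expect(lambda x, y: 1)                        # probabilities must sum to 1
part_a = expect(lambda x, y: x**2 * y - 2 * x * y)    # E(X^2 Y - 2XY)
part_b = expect(lambda x, y: x) - expect(lambda x, y: y)  # mu_X - mu_Y

print(total, part_a, part_b)
```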

Example 27
Let X and Y be random variables with joint
density function
Find the expected value of

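\[ Z = \sqrt{X^2 + Y^2} \]
\[ E(Z) = E\left(\sqrt{X^2 + Y^2}\right) = \int_0^1 \int_0^1 4xy\sqrt{x^2 + y^2}\,dx\,dy \]
\[ = \frac{4}{3} \int_0^1 \left[ y(1 + y^2)^{3/2} - y^4 \right] dy \]
\[ = \frac{8\left(2^{3/2} - 1\right)}{15} = 0.9752 \]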
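The double integral in Example 27 can be cross-checked numerically. The sketch below assumes the joint density f(x, y) = 4xy on the unit square, as read off the integrand of the solution, and uses a simple midpoint rule; the function names and the grid size are illustrative choices.

```python
from math import sqrt


def midpoint_double_integral(f, n=200):
    """Approximate the integral of f(x, y) over the unit square
    with an n-by-n midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += f(x, y)
    return total * h * h


def integrand(x, y):
    # z * f(x, y), with z = sqrt(x^2 + y^2) and assumed density f(x, y) = 4xy
    return 4.0 * x * y * sqrt(x * x + y * y)


numeric = midpoint_double_integral(integrand)
closed_form = 8.0 * (2.0**1.5 - 1.0) / 15.0

print(round(numeric, 4), round(closed_form, 4))
```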
Example 28
The proportion of the budget for a certain
type of industrial company that is allotted to
environmental and pollution control is
coming under scrutiny. A data collection
project determines that the distribution of
these proportions is given by
(a) What is the mean proportion of the budget allocated to environmental and pollution control?
\[ \mu = E(Y) = 5\int_0^1 y(1 - y)^4\,dy = -\int_0^1 y\,d\left[(1 - y)^5\right] = \int_0^1 (1 - y)^5\,dy = \frac{1}{6} \]
(b) What is the probability that a company selected at random will have allocated to environmental and pollution control a proportion that exceeds the population mean given in (a)?
\[ P\left(Y > \frac{1}{6}\right) = \int_{1/6}^{1} 5(1 - y)^4\,dy = -(1 - y)^5 \Big|_{1/6}^{1} = \left(1 - \frac{1}{6}\right)^5 = 0.4019 \]

Some Discrete Probability
Distributions Since the items are selected independently
and we assume that the process produces
I. Binomial and Multinomial 25% defectives, we have
Distributions
An experiment often consists of 𝑃(𝑁𝐷𝑁) = 𝑃(𝑁)𝑃(𝐷)𝑃(𝑁)
repeated trials, each with two possible 3 1 3 9
= { |{ |{ | =
outcomes that may be labeled success or 4 4 4 64
failure. The most obvious application deals
with the testing of items as they come off an The probability distribution of X is therefore
assembly line, where each trial may indicate
a defective or a non-defective item. We may
choose to define either outcome as a
success. The process is referred to as a
Bernoulli process. Each trial is called a Binomial Distribution
Bernoulli trial. The number X of successes in n
Bernoulli trials is called a binomial random
The Bernoulli Process variable. The probability distribution of this
Strictly speaking, the Bernoulli discrete random variable is called the
process must possess the following binomial distribution, and its values will be
properties: denoted by 𝑏(𝑥; 𝑛, 𝑝) since they depend on
1. The experiment consists of the number of trials and the probability of a
repeated trials. success on a given trial. Thus, for the
2. Each trial results in an outcome probability distribution of X, the number of
that may be classified as a success or a defectives is
failure. 1 9
3. The probability of success, 𝑃(𝑋 = 2) = 𝑓(2) = 𝑏 {2; 3, | =
4 64
denoted by p, remains constant from trial to
trial. We wish to find a formula that gives the
4. The repeated trials are probability of x successes in n trials for a
independent. binomial experiment. First, consider the
probability of x successes n and n − x
Consider the set of Bernoulli trials failures in a specified order.
where three items are selected at random Each success occurs with probability p and
from a manufacturing process, inspected, each failure with probability 𝑞 = 1 − 𝑝.
and classified as defective or non-defective. Therefore, the probability for the specified
A defective item is designated a success. order is
The number of successes is a random 𝑝 ; 𝑞'+;
variable X assuming integral values from 0 We must now determine the total number of
through 3. The eight possible outcomes and sample points in the experiment that have x
the corresponding values of X are successes and n−x failures.
A Bernoulli trial can result in a

success with probability p and a failure with
probability q = 1−p. Then the probability
distribution of the binomial random variable

X, the number of successes in n independent (c) exactly 5 survive?
trials, is
Solution:
Let X be the number of people who survive.
(a)
𝑃(𝑋 ≥ 10) = 1 − 𝑃(𝑋 < 10)
𝑥 = 0,1,2 … 𝑛 "-
= € 𝑏 (𝑥; 15, 0.4)

When n = 3 and p = 1/4, the probability ;6".
distribution of X, the number of defectives, "-
may be written as = € 𝑛𝐶& × 𝑝& × 𝑞'+&
&6".
= 15𝐶". × (0.4)". × (0.6)-
+ 15𝐶"" × (0.4)"" × (0.6)%
The binomial distribution derives its + 15𝐶"! × (0.4)"! × (0.6)#
name from the fact that the 𝑛 + 1 terms in + 15𝐶"# × (0.4)"# × (0.6)!
the binomial expansion of (𝑞 + 𝑝)' + 15𝐶"% × (0.4)"% × (0.6)"
correspond to the various values of + 15𝐶"- × (0.4)"- × (0.6).
𝑏(𝑥; 𝑛, 𝑝) for x = 0, 1, 2,...,n. That is, = 0.0338
(b)
/
𝑃(3 ≤ 𝑋 ≤ 8) = € 𝑏 (𝑥; 15, 0.4)

;6#
/
= 𝑏(0; 𝑛, 𝑝) + 𝑏(1; 𝑛, 𝑝) + 𝑏(2; 𝑛, 𝑝) = € 𝑛𝐶& × 𝑝& × 𝑞'+&

+ ··· + 𝑏(𝑛; 𝑛, 𝑝). &6#
= 15𝐶# × (0.4)# × (0.6)"!
Since 𝑝 + 𝑞 = 1, we see that + 15𝐶% × (0.4)% × (0.6)""
' + 15𝐶- × (0.4)- × (0.6)".
€ 𝑏 (𝑥; 𝑛, 𝑝) = 1, + 15𝐶0 × (0.4)0 × (0.6)F
;6. + 15𝐶E × (0.4)E × (0.6)/
a condition that must hold for any + 15𝐶/ × (0.4)/ × (0.6)E
probability distribution. = 0.9050 − 0.0271
Frequently, we are interested in problems = 0.8779
where it is necessary to find 𝑃(𝑋 < 𝑟) (c)
or 𝑃(𝑎 ≤ 𝑋 ≤ 𝑏). Binomial sums 𝑃(𝑋 = 5) = 𝑏 (𝑥; 15, 0.4) =
&
= 𝑛𝐶& × 𝑝& × 𝑞'+&
𝐵(𝑟; 𝑛, 𝑝) = € 𝑏 (𝑥; 𝑛, 𝑝) = 015𝐶- × (0.4)- × (0.6)". = 0.1859
;6.
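The binomial probabilities b(x; n, p) and the binomial sums B(r; n, p) are easy to evaluate directly instead of reading them from a table. The sketch below recomputes the three answers of Example 1 of this section (15 patients, p = 0.4); the function names `binom_pmf` and `binom_cdf` are illustrative and not taken from the text.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(r, n, p):
    """B(r; n, p): sum of b(x; n, p) for x = 0, ..., r."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))


n, p = 15, 0.4
at_least_10 = 1 - binom_cdf(9, n, p)                  # P(X >= 10)
from_3_to_8 = binom_cdf(8, n, p) - binom_cdf(2, n, p)  # P(3 <= X <= 8)
exactly_5 = binom_pmf(5, n, p)                         # P(X = 5)

print(round(at_least_10, 4), round(from_3_to_8, 4), round(exactly_5, 4))
```

Writing P(3 ≤ X ≤ 8) as B(8) − B(2) mirrors the table-based step 0.9050 − 0.0271 in the solution.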
Mean and Variance
Example 1: Theorem 1: The mean and variance of the
The probability that a patient recovers from binomial distribution 𝑏(𝑥; 𝑛, 𝑝) are
a rare blood disease is 0.4. If 15 people are 𝜇 = 𝑛𝑝 𝑎𝑛𝑑 𝜎 ! = 𝑛𝑝𝑞
known to have contracted this disease, what Proof : Let the outcome on the jth trial be
is the probability that represented by a Bernoulli random variable
(a) at least 10 survive, Ij , which assumes the values 0 and 1 with
(b) from 3 to 8 survive, and
probabilities q and p, respectively. greater than for a similar random variable
Therefore, in a binomial experiment the with a larger standard deviation.
number of successes can be written as the
sum of the n independent indicator A continuous distribution with a
variables. Hence, large value of σ to indicate a greater
𝑋 = 𝐼1 + 𝐼2 + ··· + 𝐼' variability, we should expect the area to be
more spread out
The mean of any 𝐼𝑗 is 𝐸(𝐼7 ) = (0)(𝑞) +
(1)(𝑝) = 𝑝. Therefore, using
𝑔(𝑋, 𝑌 ) = 𝑋 𝑎𝑛𝑑 ℎ(𝑋, 𝑌 ) = 𝑌 , we see
that 𝐸[𝑋 ± 𝑌 ] = 𝐸[𝑋] ± 𝐸[𝑌 ], the mean
of the binomial distribution is
𝜇 = 𝐸(𝑋) = 𝐸(𝐼" ) + 𝐸(𝐼! ) + ·· (a) Variability of continuous observations about the

· + 𝐸(𝐼' ) mean.
A distribution with a small standard

The variance of any 𝐼7 𝑖𝑠 𝜎T!$ = 𝐸¤𝐼7! ¥ − deviation should have most of its area close
to μ
𝑝! = (0)! (𝑞) + (1)! (𝑝) − 𝑝! = 𝑝(1 −
𝑝) = 𝑝𝑞. Extending “if 𝑋" , 𝑋! , … , 𝑋' are
independent random variables, then
𝜎:!!%!&'"%"&···&')%) = 𝑎"! 𝜎Q!! + 𝑎!! 𝜎Q!" ···
+ 𝑎'! 𝜎Q!) ” to the case of n independent
Bernoulli variables gives the variance of the
binomial distribution as
𝜎Q! = 𝜎T!" + 𝜎T!" + ··· + 𝜎T!)

(b) Variability of continuous observations about the
Example 2: mean.
Find the mean and variance of the binomial
random variable of Example 1, and then use
Chebyshev’s theorem to interpret the
interval μ ± 2σ.
Chebyshev’s Theorem
If a random variable has a small
variance or standard deviation, we would
expect most of the values to be grouped
around the mean. Therefore, the probability
that the random variable assumes a value
within a certain interval about the mean is Variability of discrete observations about the mean.

Chebyshev’s Theorem: The probability that and 𝐸$ occurs 𝑥$ times in n independent
any random variable X will assume a value trials, where
within k standard deviations of the mean is 𝑥" + 𝑥! + ··· + 𝑥$ = 𝑛
at least 1 − 1/𝑘 ! . That is,
1 We shall denote this joint probability
𝑃(𝜇 − 𝑘𝜎 < 𝑋 < 𝜇 + 𝑘𝜎) ≥ 1 − !
𝑘 distribution by
For 𝑘 = 2, the theorem states that the 𝑓(𝑥" , 𝑥! , . . . , 𝑥$ ; 𝑝" , 𝑝! , . . . , 𝑝$ , 𝑛)
random variable X has a probability of at
least 1 − 1/2! = 3/4 of falling within two Clearly, 𝑝" + 𝑝! + ··· + 𝑝$ = 1, since
standard deviations of the mean. That is, the result of each trial must be one of the k
three-fourths or more of the observations of possible outcomes.
any distribution lie in the interval 𝜇 ± 2𝜎.
Similarly, the theorem says that at least Since the trials are independent, any
eight-ninths of the observations of any specified order yielding x1 outcomes for
distribution fall in the interval 𝜇 ± 3𝜎. 𝐸" , 𝑥! for 𝐸! , . . . , 𝑥$ for 𝐸$ will occur with
; ; ;
probability 𝑝" ! 𝑝! " ··· 𝑝$ * . The total
Solution (for example 2) number of orders yielding similar outcomes
- Since problem is binomial, for the n trials is equal to the number of
𝑛 = 15, 𝑝 = 0.4 partitions of n items into 𝑘 groups with 𝑥" in
𝜇 = 15 × 0.4 = 6 the first group, x2 in the second group, ... ,
and 𝑥$ in the kth group. This can be done in
𝜎 ! = 15 × 0.4 × 0.6 = 3.6
𝜎 = 1.897
Hence, the required interval is
6 ± (2)(1.897) = 2.206 𝑡𝑜 9.794 ways.
Interpretation: Example 3:
Chebyshev’s theorem states that the number The complexity of arrivals and
of recoveries among 15 patients who departures of planes at an airport is such that
contracted the disease has a probability of at computer simulation is often used to model
least 3/4 of falling between 2.206 and 9.794 the “ideal” conditions. For a certain airport
or, because the data are discrete, between 2 with three runways, it is known that in the
and 10 inclusive. ideal set ting the following are the
probabilities that the individual runways are
Multinomial Experiments and the accessed by a randomly arriving commercial
Multinomial Distribution jet:
The binomial experiment becomes a Runway 1: 𝑝" = 2/9,
multinomial experiment if we let each trial Runway 2: 𝑝! = 1/6,
have more than two possible outcomes. Runway 3: 𝑝# = 11/18.
In general, if a given trial can result What is the probability that 6 randomly
in any one of k possible outcomes arriving airplanes are distributed in the
𝐸" , 𝐸! , . . . , 𝐸$ with probabilities following fashion?
𝑝" , 𝑝! , . . . , 𝑝$ , then the multinomial Runway 1: 2 airplanes,
distribution will give the probability that Runway 2: 1 airplane,
𝐸" occurs 𝑥" times, 𝐸! occurs 𝑥! times, ... , Runway 3: 3 airplanes

Solution: the sample size n, the range of a
hypergeometric random variable will be
𝑥 = 0, 1, . . . , 𝑛.
Example 4:
Lots of 40 components each are
6! 2! 1 11# deemed unacceptable if they contain 3 or
= × ! × × # = 0.1127
2! 1! 3! 9 6 18 more defectives. The procedure for sampling
a lot is to select 5 components at random
II. Hypergeometric Distribution and to reject the lot if a defective is found.
Hypergeometric distribution does not What is the probability that exactly 1
require independence and is based on defective is found in the sample if there are
sampling done without replacement. 3 defectives in the entire lot?
Solution: Using the hypergeometric
The total number of samples of size distribution with 𝑛 = 5, 𝑁 = 40, 𝑘 =
n chosen from N items is . These 3, 𝑎𝑛𝑑 𝑥 = 1, we find the probability of
samples are assumed to be equally likely. obtaining 1 defective to be
There are
Solution:
ways of selecting x successes from the ℎ(1; 40, 5, 3)
𝑘 that are available, and for each of these
ways we can choose the 𝑛 − 𝑥 failures in
ways. Thus, the total number of 3𝐶" × 37𝐶%

= = 0.3011
40𝐶-
favorable samples among the possible Ans: Once again, this plan is not desirable
since it detects a bad lot (3 defectives) only
samples is given by about 30% of the time.
The probability distribution of the Theorem 2: The mean and variance of the
hypergeometric random variable X, the hypergeometric distribution h(x; N, n, k) are
number of successes in a random sample of 𝑛𝑘
size 𝑛 selected from N items of which k are 𝜇 =
𝑁
labeled success and N − k labeled failure, is 𝑁−𝑛 𝑘 𝑘
𝜎2 = · 𝑛 · { 1 − |
𝑁−1 𝑁 𝑁
Example 5:
Find the mean and variance of the random
𝑚𝑎𝑥{0, 𝑛 − (𝑁 − 𝑘)} ≤ 𝑥 ≤ 𝑚𝑖𝑛{𝑛, 𝑘}
variable of Example 4 and then use
Chebyshev’s theorem to interpret the
The range of 𝑥 can be determined by
interval μ ± 2σ.
the three binomial coefficients in the
definition, where 𝑥 and 𝑛 − 𝑥 are no more
Solution:
than 𝑘 and 𝑁 − 𝑘, respectively, and both of
𝑁 = 40, 𝑛 = 5, 𝑎𝑛𝑑 𝑘 = 3,
them cannot be less than 0. Usually, when (5)(3) 3
both k (the number of successes) and 𝑁 − 𝜇 = = = 0.375
𝑘 (the number of failures) are larger than 40 8

\[ \sigma^2 = \left(\frac{40 - 5}{39}\right) \times 5 \times \left(\frac{3}{40}\right) \times \left(1 - \frac{3}{40}\right) = 0.3113 \]
\[ \sigma = 0.558 \]
Hence, the required interval is
\[ 0.375 \pm (2)(0.558), \quad \text{that is, from } -0.741 \text{ to } 1.491. \]

Interpretation: Chebyshev’s theorem states that the number of defectives obtained when 5 components are selected at random from a lot of 40 components of which 3 are defective has a probability of at least 3/4 of falling between −0.741 and 1.491. That is, at least three-fourths of the time, the 5 components include fewer than 2 defectives.

III. Negative Binomial and Geometric Distributions
Let us consider an experiment where the properties are the same as those listed for a binomial experiment, with the exception that the trials will be repeated until a fixed number of successes occur. Therefore, instead of the probability of x successes in n trials, where n is fixed, we are now interested in the probability that the kth success occurs on the xth trial. Experiments of this kind are called negative binomial experiments.

Negative Binomial Distribution
If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is
\[ b^*(x; k, p) = \binom{x - 1}{k - 1} p^k q^{x - k}, \quad x = k, k + 1, k + 2, \ldots \]

Example 6:
In an NBA (National Basketball Association) championship series, the team that wins four games out of seven is the winner. Suppose that teams A and B face each other in the championship games and that team A has probability 0.55 of winning a game over team B.
(a) What is the probability that team A will win the series in 6 games?
\[ b^*(6; 4, 0.55) = \binom{5}{3} (0.55)^4 (1 - 0.55)^{6 - 4} = 0.1853 \]

(b) What is the probability that team A will win the series?
P(team A wins the championship series) is
\[ b^*(4; 4, 0.55) + b^*(5; 4, 0.55) + b^*(6; 4, 0.55) + b^*(7; 4, 0.55) \]
\[ = 0.0915 + 0.1647 + 0.1853 + 0.1668 = 0.6083. \]

(c) If teams A and B were facing each other in a regional playoff series, which is decided by winning three out of five games, what is the probability that team A would win the series?
\[ b^*(3; 3, 0.55) + b^*(4; 3, 0.55) + b^*(5; 3, 0.55) = 0.1664 + 0.2246 + 0.2021 = 0.5931 \]

The negative binomial distribution derives its name from the fact that each term in the expansion of \( p^k (1 - q)^{-k} \) corresponds to the values of \( b^*(x; k, p) \) for x = k, k + 1, k + 2, ....
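The series probabilities of Example 6 follow directly from the negative binomial formula. The sketch below assumes the standard form b*(x; k, p) = C(x − 1, k − 1) p^k q^(x − k), which is the form used in the later worked examples of these notes; the helper `series_win_probability` is a hypothetical name introduced only for illustration.

```python
from math import comb


def neg_binom_pmf(x, k, p):
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    if x < k:
        return 0.0
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)


def series_win_probability(wins_needed, p):
    """Probability of reaching `wins_needed` wins before the opponent does,
    in a series of at most 2 * wins_needed - 1 games."""
    max_games = 2 * wins_needed - 1
    return sum(neg_binom_pmf(x, wins_needed, p)
               for x in range(wins_needed, max_games + 1))


win_in_6 = neg_binom_pmf(6, 4, 0.55)          # part (a)
best_of_7 = series_win_probability(4, 0.55)   # part (b)
best_of_5 = series_win_probability(3, 0.55)   # part (c)

print(round(win_in_6, 4), round(best_of_7, 4), round(best_of_5, 4))
```

With k = 1 the same function reduces to the geometric probabilities p q^(x − 1).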
Geometric Distribution
If repeated independent trials can
result in a success with probability p and a

failure with probability 𝑞 = 1 − 𝑝, then 3. The probability that more than one
the probability distribution of the random outcome will occur in such a short time
variable X, the number of the trial on which interval or fall in such a small region is
the first success occurs, is negligible.
𝑔(𝑥; 𝑝) = 𝑝𝑞 ;+" , 𝑥 = 1, 2, 3, . . ..
The number X of outcomes
Example 7: occurring during a Poisson experiment is
For a certain manufacturing process, it is called a Poisson random variable, and its
known that, on the average, 1 in every 100 probability distribution is called the Poisson
items is defective. What is the probability
that the fifth item inspected is the first distribution.
defective item found?
Poisson Distribution
Solution: The probability distribution of the
x = 5 p = 0.01 Poisson random variable 𝑋, representing the
𝑔(5; 0.01) = (0.01)(0.99)% = 0.0096 number of outcomes occurring in a given
time interval or specified region denoted by
Theorem 3: The mean and variance of a 𝑡, is
random variable following the geometric 𝑒 +V< (𝜆𝑡) ;
distribution are 𝑝(𝑥; 𝜆𝑡) =
𝑥!
1 𝑝 where λ is the average number of outcomes
𝜇 = 𝑎𝑛𝑑 𝜎 ! = 1 − !
𝑝 𝑝 per unit time, distance, area, or volume and
e = 2.71828 ... .
IV. Poisson Distribution and the Poisson Poisson probability sums,
&
Process
Experiments yielding numerical 𝑃(𝑟; 𝜆𝑡) = € 𝑝 (𝑥; 𝜆𝑡)
values of a random variable X, the number ;6.
of outcomes occurring during a given time for selected values of λt of a range.
interval or in a specified region, are called
Poisson experiments. Example 8:
During a laboratory experiment, the average
Properties of the Poisson Process number of radioactive particles passing
1. The number of outcomes occurring in through a counter in 1 millisecond is 4.
one-time interval or specified region of What is the probability that 6 particles enter
space is independent of the number that the counter in a given millisecond?
occur in any other disjoint time interval or
region. In this sense we say that the Poisson Solution:
process has no memory. 𝑥 = 6 𝑎𝑛𝑑 𝜆𝑡 = 4
2. The probability that a single outcome will 𝑒 +% 40
𝑝(6; 4) =
occur during a very short time interval or in 6!
a small region is proportional to the length = 0.1042.
of the time interval or the size of the region
and does not depend on the number of Theorem 4: Both the mean and the variance
outcomes occurring outside this time of the Poisson distribution p(x; λt) are λt.
interval or region.

The Poisson distribution becomes more and more symmetric, even bell-shaped, as the mean
grows large.
Poisson density functions for different means
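The Poisson formula p(x; λt) = e^(−λt) (λt)^x / x! can be evaluated directly. The following sketch recomputes the radioactive-particle example (λt = 4, x = 6) and checks numerically that the mean and the variance both equal λt; the function names and the truncation point of the sums (60 terms) are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(x, mean):
    """p(x; mean) = e^(-mean) * mean^x / x!"""
    return exp(-mean) * mean**x / factorial(x)


def poisson_cdf(r, mean):
    """P(r; mean): sum of p(x; mean) for x = 0, ..., r."""
    return sum(poisson_pmf(x, mean) for x in range(r + 1))


def moments(mean, upper=60):
    """Mean and variance from a truncated sum; the tail beyond `upper`
    is negligible for the small means used here."""
    m = sum(x * poisson_pmf(x, mean) for x in range(upper))
    v = sum((x - m) ** 2 * poisson_pmf(x, mean) for x in range(upper))
    return m, v


p_six = poisson_pmf(6, 4)      # 6 particles when the average is 4
m, v = moments(4)

print(round(p_six, 4), round(m, 6), round(v, 6))
```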
Approximation of Binomial Distribution by 0
a Poisson Distribution 𝑃(𝑋 < 7) = € 𝑏 (𝑥; 8000, 0.001)

In the case of the binomial, if n is ;6.
quite large and p is small, the conditions ≈ 𝑝(𝑥; 8)
+/ .
begin to simulate the continuous space or 𝑒 ×8 𝑒 +/ × 8" 𝑒 +/ × 8!
time implications of the Poisson process. = + +
0! 1! 2!
𝑒 +/ × 8# 𝑒 +/ × 8%
Theorem 5: Let X be a binomial random + +
3! 4!
variable with probability distribution 𝑒 +/ × 8- 𝑒 +/ × 80
𝑏(𝑥; 𝑛, 𝑝). When n → ∞, p → 0, and + +
J 5! 6!
𝑛𝑝' → → 𝜇 remains constant, = 0.0003354 + 0.002684 + 0.01073
→J
𝑏(𝑥; 𝑛, 𝑝)' → 𝑝(𝑥; 𝜇) + 0.02863 + 0.0573
+ 0.0916 + 0.12214
Example 8: = 0.3134
In a manufacturing process where
glass products are made, defects or bubbles OR Using Binomial Distribution
occur, occasionally rendering the piece 𝑃(𝑋 < 7)
undesirable for marketing. It is known that, = 8000𝐶+ × 0.001+ × (1 − 0.001),+++-+
on average, 1 in every 1000 of these items + 8000𝐶. × 0.001.
× (1 − 0.001),+++-.
produced has one or more bubbles. What is + 8000𝐶/ × 0.001/
the probability that a random sample of × (1 − 0.001),+++-/
8000 will yield fewer than 7 items + 8000𝐶0 × 0.0010
possessing bubbles? × (1 − 0.001),+++-0
+ 8000𝐶1 × 0.0011
Solution: × (1 − 0.001),+++-1
+ 8000𝐶2 × 0.0012
𝑛 = 8000 𝑎𝑛𝑑 𝑝 = 0.001 × (1 − 0.001),+++-2
𝜇 = (8000)(0.001) = 8 + 8000𝐶3 × 0.0013
Let X represents the number of bubbles × (1 − 0.001),+++-3
= 0.3134
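The quality of the Poisson approximation in the bubble example can be checked by computing both sums directly. This is a sketch with illustrative variable names; it evaluates the exact binomial sum for n = 8000, p = 0.001 next to the Poisson sum with μ = 8.

```python
from math import comb, exp, factorial

n, p = 8000, 0.001
mu = n * p  # Poisson mean used in the approximation

# P(X < 7) = P(X <= 6), once with the binomial and once with the Poisson formula.
binomial_exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(7))
poisson_approx = sum(exp(-mu) * mu**x / factorial(x) for x in range(7))

print(round(binomial_exact, 4), round(poisson_approx, 4))
```

The two values differ only slightly, which is the practical content of Theorem 5 for large n and small p.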

Example 9:
In a certain city district, the need for money to buy drugs is stated as the reason for 75% of all thefts. Find the probability that among the next 5 theft cases reported in this district,

(a) exactly 2 resulted from the need for money to buy drugs;
Let n = 5 and p = 75/100 = 0.75.
\[ P(X = x) = \binom{5}{x} (0.75)^x (1 - 0.75)^{5 - x} \]
\[ P(X = 2) = \binom{5}{2} (0.75)^2 (1 - 0.75)^{5 - 2} = 0.088 \]

(b) at most 3 resulted from the need for money to buy drugs.
\[ P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \sum_{x=0}^{3} \binom{5}{x} (0.75)^x (1 - 0.75)^{5 - x} = 0.3672 \]

Example 10:
The percentage of wins for the Chicago Bulls basketball team going into the playoffs for the 1996–97 season was 90.
(a) What is the probability that the Bulls sweep (4-0) the initial best-of-7 playoff series?
Let n = 4 and P(X = x) = b(x; 4, 0.9).
\[ P(X = 4) = 1 - P(X \le 3) = 1 - \sum_{x=0}^{3} b(x; 4, 0.9) \]
\[ = 1 - \left[ \binom{4}{0} (0.9)^0 (0.1)^4 + \binom{4}{1} (0.9)^1 (0.1)^3 + \binom{4}{2} (0.9)^2 (0.1)^2 + \binom{4}{3} (0.9)^3 (0.1)^1 \right] \]
\[ = 1 - 0.3439 = 0.6561 \]

(b) What is the probability that the Bulls win the initial best-of-7 playoff series?
A 4-1 series win requires a 3-1 lead after four games, S = {WWWL, LWWW, WLWW, WWLW}, followed by a win:
\[ P(\text{4-1 win}) = P(\text{3-1 lead})\,P(\text{win}) = \left[ \binom{4}{3} (0.9)^3 (0.1)^{4 - 3} \right] (0.9) = 0.2624 \]
\[ P(\text{4-2 win}) = P(\text{3-2 lead})\,P(\text{win}) = \left[ \binom{5}{3} (0.9)^3 (0.1)^{5 - 3} \right] (0.9) = 0.0656 \]
\[ P(\text{4-3 win}) = P(\text{3-3 tie})\,P(\text{win}) = \left[ \binom{6}{3} (0.9)^3 (0.1)^{6 - 3} \right] (0.9) = 0.0131 \]
\[ P(\text{Bulls win the initial best-of-7 playoff series}) = 0.6561 + 0.2624 + 0.0656 + 0.0131 = 0.9972 \]

Example 11:
A national study that examined attitudes about antidepressants revealed that approximately 70% of respondents believe “antidepressants do not really cure anything, they just cover up the real trouble.” If X represents the number of people who believe that antidepressants do not cure but only cover up the real problem, find the mean and variance of X when 5 people are selected at random.

Solution:
Let n = 5 and p = 0.7.
\[ \mu = np = 5 \times 0.7 = 3.5 \]
\[ \sigma^2 = npq = 5 \times 0.7 \times 0.3 = 1.05 \]
\[ \sigma = 1.025 \]

Example 12:
According to USA Today (March 18, 1997), of 4 million workers in the general workforce, 5.8% tested positive for drugs. Of those testing positive, 22.5% were cocaine users and 54.4% marijuana users.
(a) What is the probability that of 10 workers testing positive, 2 are cocaine users,

5 are marijuana users, and 3 are users of 4𝐶" × (6 − 4)𝐶#+"
other drugs? =1−
6𝐶#
= 1 − 0.20 = 0.8
𝑓(𝑥" , 𝑥! , 𝑥# ; 𝑝" , 𝑝! , 𝑝# , 𝑛) Example 13:
It is estimated that 4000 of the 10,000 voting
residents of a town are against a new sales
= 𝑓(2,5,3; 0.225, 0.544, 0.231, 10) tax. If 15 eligible voters are selected at
10! random and asked their opinion, what is the
={ | × 0.225! probability that at most 7 favor the new tax?
2! 5! 3!
× 0.544- × 0.231#
= 0.0749 Solution:
𝑝
(b) What is the probability that of 10 𝑁𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑣𝑜𝑡𝑒𝑟𝑠 𝑤ℎ𝑜 𝑓𝑎𝑣𝑜𝑟 𝑡ℎ𝑒 𝑛𝑒𝑤 𝑡𝑎𝑥
=
workers testing positive, all are marijuana 𝑇𝑜𝑡𝑎𝑙 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑣𝑜𝑡𝑒𝑟𝑠
users? 6000
𝑝= = 0.6
10000
= 𝑓(0,10,0; 0.225, 0.544, 0.231, 10) 𝑛 = 15
10! 𝑃(𝑋 = 𝑥) = 𝑏(𝑥; 15, 0.6)
={ | × 0.225.
0! 10! 0! = 15𝐶; × 0.6; × 0.4"-+;
× 0.544". × 0.231. E
= 0.0023 𝑃(𝑋 ≤ 7) = € 𝑏(𝑥; 15, 0.6)
;6.
(c) What is the probability that of 10 = 15𝐶. × 0.6. × 0.4"-+.
workers testing positive, none is a cocaine + 15𝐶" × 0.6" × 0.4"-+"
user? + 15𝐶! × 0.6! × 0.4"-+!
𝑃(𝑛𝑜𝑛𝑒 𝑐𝑜𝑐𝑎𝑖𝑛𝑒) = 1 − 𝑝" + 15𝐶# × 0.6# × 0.4"-+#
= 1 − 0.225 = 0.775 + 15𝐶% × 0.6% × 0.4"-+%
+ 15𝐶- × 0.6- × 0.4"-+-
= 10𝐶. × 0.225. × 0.775".+. = 0.0782 + 15𝐶0 × 0.60 × 0.4"-+0
Example 13: + 15𝐶E × 0.6E × 0.4"-+E
A random committee of size 3 is selected = 0.2131
from 4 doctors and 2 nurses. Write a Example 14:
formula for the probability distribution of A foreign student club lists as its members 2
the random variable X representing the Canadians, 3 Japanese, 5 Italians, and 2
number of doctors on the committee. Find Germans. If a committee of 4 is selected at
P(2 ≤ X ≤ 3). random, find the probability that
(a) all nationalities are represented;
Solution: 𝑁 = 12, 𝑎" = 2, 𝑎! = 3,
ℎ(𝑥; 𝑁 = 6, 𝑛 = 3, 𝑘 = 4) 𝑎# = 5, 𝑎% = 2
𝑓(𝑥" , 𝑥! , 𝑥# , 𝑥% ; 𝑎" , 𝑎! , 𝑎# , 𝑎% ; 𝑁, 𝑛)
for 𝑥 = 1,2,3 𝑎" 𝐶;! + 𝑎! 𝐶;" 8 𝑎# 𝐶;# 8 𝑎! 𝐶;"

𝑃(2 ≤ 𝑋 ≤ 3) = 1 − 𝑃[𝑋 = 1] =
𝑁𝐶'
= 1 − ℎ(1; 6,3,4) 2𝐶;! + 3𝐶;" 8 5𝐶;# 8 2𝐶;"
=
12𝐶%

𝑓(1,1,1,1; 2,3,5,2; 12,4) 1 1 1 1 1 1
2𝐶" × 3𝐶" × 5𝐶" × 2𝐶" = × × + × ×
2 2 2 2 2 2
= 0.1212 𝑞 = 0.25
12𝐶%
𝑝 = 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 = 1 − 𝑞 = 0.75
(b) all nationalities except Italian are 𝑃(𝑋 = 𝑥) = 𝑏(𝑥; 𝑘, 𝑝)
represented.
𝑃(𝑋" = 2, 𝑋! = 1, 𝑋# = 0, 𝑋% = 1)
= 𝑓(2,1,0,1; 2,3,5,2; 12,4) where 𝑥 = 𝑘, 𝑘 + 1, 𝑘 + 2
2𝐶! × 3𝐶" × 5𝐶. × 2𝐶"
= 𝑃(𝑋 < 4) = 𝑃(1 ≤ 𝑋 ≤ 3)
12𝐶%
#
2𝐶" × 3𝐶! × 5𝐶. × 2𝐶"
+ = € 𝑏(𝑥; 1,0.75)
12𝐶%
2𝐶" × 3𝐶" × 5𝐶. × 2𝐶! ;6"
+ + = (1 − 1)𝐶. × 0.75" × 0.25.
12𝐶%
8 + 1𝐶. × 0.75" × 0.25"
= + 2𝐶. × 0.75" × 0.25!
165 = 0.75 + 0.1875 + 0.046875
Example 15:
The probability that a person living in a = 0.9844
certain city owns a dog is estimated to be Example 16:
0.3. Find the probability that the tenth According to a study published by a group
person randomly interviewed in that city is of University of Massachusetts sociologists,
the fifth one to own a dog. about two-thirds of the 20 million persons in
this country who take Valium are women.
Solution: Assuming this figure to be a valid estimate,
Let success if a person owns a dog and find the probability that on a given day the
consequently, a person without a dog will be fifth prescription written by a doctor for
a failure. Valium is
𝑝 = 0.3
(a) the first prescribing Valium for a
𝑞 = 1 − 𝑝 = 0.7
woman;
2
Since trials are independent, X has a 𝐿𝑒𝑡 𝑝 = 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 𝑖𝑛 𝑒𝑎𝑐ℎ 𝑡𝑟𝑖𝑎𝑙 =
negative binomial distribution with 𝑘 = 5 3
2 1
𝑃(𝑋 = 𝑥) = 𝑏(𝑥; 𝑘, 𝑝) = 𝑏(10; 5,0.3) 𝑞 =1− =
3 3
= (10 − 1)𝐶-+" × 0.3- × 0.7".+- 𝑋 =number of prescriptions of valium to get
0.05146 first success.
Example 16: 𝑘 = 1
Three people toss a fair coin and the odd one 𝑃(𝑋 = 𝑥) = 𝑏(𝑥; 𝑘, 𝑝)
pays for coffee. If the coins all turn up the
same, they are tossed again. Find the
probability that fewer than 4 tosses are
needed. 2 " 1 ;+"
= (𝑥 − 1)𝐶$+" { | { |
3 3
"
Solution: 2 1 -+"
= (5 − 1)𝐶"+" { | { |
Probability of failure in each trial is 3 3
𝑃 [{𝐻𝐻𝐻} ∪ {𝑇𝑇𝑇 }] = 𝑃(𝐻𝐻𝐻) + 𝑃(𝑇𝑇𝑇) = 0.0082

(b) the third prescribing Valium for a 𝑃(𝑋 < 3) = 𝑃(𝑋 = 0) + 𝑃(𝑋 = 1)
woman. + 𝑃(𝑋 = 2)
2 # 1 -+# = 𝑝(0; 3) + 𝑝(1; 3) + 𝑝(2; 3)
= (5 − 1)𝐶#+" { | { | 𝑒 +# (3). 𝑒 +# (3)" 𝑒 +# (3)!
3 3 = += = 0.4232
= 0.1975 0! 1! 2!
Example 17:
The probability that a student pilot passes (c) at least 2 accidents will occur?
the written test for a private pilot’s license is 𝑃(𝑋 ≥ 2) = 1 − 𝑃(𝑋 ≤ 1)
0.7. Find the probability that a given student = 1 − [𝑃(𝑋 = 0) + 𝑃(𝑋 = 1)]
will pass the test = 1 − 𝑝(0; 3) − 𝑝(1; 3)
(a) on the third try; = 0.8008
Example 19.
𝐿𝑒𝑡 𝑝 = 𝑠𝑢𝑐𝑐𝑒𝑠𝑠 𝑖𝑛 𝑒𝑎𝑐ℎ 𝑡𝑟𝑖𝑎𝑙 = 0.7 For a certain type of copper wire, it is
𝑞 = 𝑝𝑟𝑜𝑏𝑎𝑏𝑖𝑙𝑖𝑡𝑦 𝑜𝑓 𝑓𝑎𝑖𝑙𝑢𝑟𝑒 known that, on the average, 1.5 flaws occur
= 1 − 𝑝 = 0.3 per millimeter. Assuming that the number of
𝑃(𝑋 = 3) = 𝑔(3; 0.7) flaws is a Poisson random variable, what is
= 𝑝𝑞 ;+" = (0.7)(0.3)#+" the probability that no flaws occur in a
= 0.0630 certain portion of wire of length 5
millimeters? What is the mean number of
(b) before the fourth try. flaws in a portion of length 5 millimeters?
#
𝑃(𝑋 < 4) = € 𝑔 (𝑥; 0.7) Solution:

;6" Let 𝑋 = number of flaws in 5 millimeter
# copper wire
= €(0.7)(0.3) ;+" 𝜆 = average flaws per millimeter = 1.5
;6" 𝜆𝑡 = 5 × 1.5 = 7.5
= (0.7)(0.3)"+" + (0.7)(0.3)!+" (a)
+ (0.7)(0.3)#+" = 0.9730 𝑃(𝑋 = 0) = 𝑝(0; 7.5)
Example 18: 𝑒 E.- × 7.5.
On average, 3 traffic accidents per month = = 0.000553
0!
occur at a certain intersection. What is the (b)
probability that in any given month at this 𝜇 = 𝐸(𝑋) = 𝜆 × 𝑡
intersection = 5 × 1.5 = 7.5
(a) exactly 5 accidents will occur?

Let 𝑋 = number of accidents in a month
𝑡 = 1 month
𝜆 = 3 average number of accidents
𝜆𝑡 = 𝑝𝑜𝑖𝑠𝑠𝑜𝑛 𝑑𝑖𝑠𝑡𝑟𝑖𝑏𝑢𝑡𝑖𝑜𝑛 𝑝𝑎𝑟𝑎𝑚𝑒𝑡𝑒𝑟
=3
𝑃(𝑋 = 5) = 𝑝(5; 3)
𝑒 +# (3)-
= = 0.1008
5!
(b) fewer than 3 accidents will occur?

Continuous Probability Distributions The two curves are identical in form
but are centered at different positions along
Normal Distribution the horizontal axis.
The normal distribution is the most important continuous
probability distribution in the entire field of
statistics. Its graph, called the normal
curve, is bell-shaped. Errors in scientific measurements are
extremely well approximated by a normal
distribution. In 1733, Abraham DeMoivre
developed the mathematical equation of the
normal curve. It provided a basis from

which much of the theory of inductive Normal curves with 𝜇" < 𝜇! and 𝜎" = 𝜎! .
statistics is founded. The normal distribution
is often referred to as the Gaussian Two normal curves having different
distribution, in honor of Karl Friedrich means and different standard deviations.
Gauss (1777–1855), who also derived its
equation from a study of errors in repeated
measurements of the same quantity.
Normal curves with 𝜇" < 𝜇! and 𝜎" < 𝜎! .
The normal curve. The following properties of the normal

curve:
A continuous random variable X 1. The mode, which is the point on the
having the bell-shaped distribution is called horizontal axis where the curve is a
a normal random variable. maximum, occurs at 𝑥 = 𝜇.
The mathematical equation for the 2. The curve is symmetric about a vertical
probability distribution of the normal axis through the mean 𝜇.
variable depends on the two parameters 𝜇 3. The curve has its points of inflection at
and 𝜎, its mean and standard deviation, 𝑥 = 𝜇 ± 𝜎; it is concave downward if
respectively. We denote the values of the 𝜇 − 𝜎 < 𝑋 < 𝜇 + 𝜎 and is concave
density of X by 𝑛(𝑥; 𝜇, 𝜎). upward otherwise.
The density of the normal random 4. The normal curve approaches the
variable X, with mean μ and variance σ2, is horizontal axis asymptotically as we proceed
1 +
"
(;+Y)"
in either direction away from the mean.
𝑛(𝑥; 𝜇, 𝜎) = 𝑒 !X" , 5. The total area under the curve and above
√2𝜋𝜎 the horizontal axis is equal to 1.
− ∞ < 𝑥 < ∞,
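The density formula and the listed properties of the normal curve can be checked numerically. This is a small sketch, assuming illustrative values μ = 50 and σ = 10; the function names `normal_pdf` and `area_under_curve` are made up for this example, and the area is only a trapezoidal approximation.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of a normal random variable."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(mu, sigma, lo, hi, steps=20000):
    """Trapezoidal approximation of the area under n(x; mu, sigma) on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h

mu, sigma = 50.0, 10.0  # illustrative values (an assumption of this sketch)

# Property 5: total area is 1 (integrating over mu +/- 8 sigma is close enough)
total_area = area_under_curve(mu, sigma, mu - 8 * sigma, mu + 8 * sigma)

# Property 2: symmetry about the mean
left_height = normal_pdf(mu - 7.0, mu, sigma)
right_height = normal_pdf(mu + 7.0, mu, sigma)

# Property 1: the maximum occurs at x = mu
peak = normal_pdf(mu, mu, sigma)

print(f"total area = {total_area:.6f}")
print(f"heights at mu-7 and mu+7: {left_height:.6f}, {right_height:.6f}")
print(f"peak height at mu = {peak:.6f}")
```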

Theorem 1: The mean and variance of n(x; μ, σ) are μ and σ², respectively. Hence, the standard deviation is σ.
$E(X) = \mu$
$E[(X-\mu)^{2}] = \frac{\sigma^{2}}{\sqrt{2\pi}} \left( -z e^{-z^{2}/2} \Big|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-z^{2}/2}\,dz \right) = \sigma^{2}(0 + 1) = \sigma^{2}.$

Areas under the Normal Curve
$P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\,dx$

[Figure: P(x1 < X < x2) for different normal curves.]

The area under the curve between any two ordinates must then also depend on the values μ and σ, as shown by the shaded regions corresponding to P(x1 < X < x2) for two curves with different means and variances.
P(x1 < X < x2), where X is the random variable describing distribution A, is indicated by the shaded area below the curve of A. If X is the random variable describing distribution B, then P(x1 < X < x2) is given by the entire shaded region.

The observations of any normal random variable X can be transformed into a new set of observations of a normal random variable Z with mean 0 and variance 1 by means of the transformation
$Z = \frac{X - \mu}{\sigma}$
Whenever X assumes a value x, the corresponding value of Z is given by z = (x − μ)/σ. Therefore, if X falls between the values x = x1 and x = x2, the random variable Z will fall between the corresponding values z1 = (x1 − μ)/σ and z2 = (x2 − μ)/σ. Consequently, we may write
$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\,dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{z^{2}}{2}}\,dz = P(z_1 < Z < z_2)$
The distribution of a normal random variable with mean 0 and variance 1 is called a standard normal distribution.

Example 1:
Given a standard normal distribution, find the area under the curve that lies
(a) to the right of z = 1.84 and
(b) between z = −1.97 and z = 0.86.
Solution:
a. z = 1.84
A = 1 − (area to the left of z) = 1 − 0.9671 = 0.0329
b. z = −1.97 and z = 0.86
A = 0.8051 − 0.0244 = 0.7807

Example 2:
Given a standard normal distribution, find the value of k such that
(a) P(Z > k) = 0.3015 and
(b) P(k < Z < −0.18) = 0.4197.
Solution:
a. In Figure (a), the area to the left of k is
A = 1 − 0.3015 = 0.6985
From Table A.3, k = 0.52.
b. In Figure (b), the area to the left of z = −0.18 is 0.4286, so the area to the left of k is
A = 0.4286 − 0.4197 = 0.0089
From Table A.3, k = −2.37.

Example 3:
Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62.
Solution:
$z_1 = \frac{45 - 50}{10} = -0.5$
$z_2 = \frac{62 - 50}{10} = 1.2$
$P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764$
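The table lookups in Examples 1 to 3 can be reproduced with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is only a sketch for checking the arithmetic; the variable names are chosen here for illustration, and small differences from the tabulated values come from rounding z to two decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Example 1: areas under the standard normal curve
area_right_of_1_84 = 1 - Z.cdf(1.84)
area_between = Z.cdf(0.86) - Z.cdf(-1.97)

# Example 2: find k from a given area (inverse lookup)
k_a = Z.inv_cdf(1 - 0.3015)             # P(Z > k) = 0.3015
k_b = Z.inv_cdf(Z.cdf(-0.18) - 0.4197)  # P(k < Z < -0.18) = 0.4197

# Example 3: X normal with mu = 50 and sigma = 10
X = NormalDist(mu=50, sigma=10)
p_45_62 = X.cdf(62) - X.cdf(45)

print(f"Example 1: {area_right_of_1_84:.4f}, {area_between:.4f}")
print(f"Example 2: k = {k_a:.2f}, k = {k_b:.2f}")
print(f"Example 3: {p_45_62:.4f}")
```

Working with `NormalDist(mu, sigma)` directly gives the same probability as standardizing to Z first, which is the point of the transformation Z = (X − μ)/σ.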


Example 4:
Given that X has a normal distribution with μ = 300 and σ = 50, find the probability that X assumes a value greater than 362.
Solution:
$z = \frac{362 - 300}{50} = 1.24$
$P(X > 362) = P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075.$

According to Chebyshev’s theorem, the probability that a random variable assumes a value within 2 standard deviations of the mean is at least 3/4. If the random variable has a normal distribution, the z values corresponding to x1 = μ − 2σ and x2 = μ + 2σ are easily computed to be
$z_1 = \frac{(\mu - 2\sigma) - \mu}{\sigma} = -2$
$z_2 = \frac{(\mu + 2\sigma) - \mu}{\sigma} = 2$
Hence,
$P(\mu - 2\sigma < X < \mu + 2\sigma) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544$
which is a much stronger statement than that given by Chebyshev’s theorem.

Example 5:
A certain type of storage battery lasts, on average, 3.0 years with a standard deviation of 0.5 year. Assuming that battery life is normally distributed, find the probability that a given battery will last less than 2.3 years.
Solution:
$z = \frac{2.3 - 3}{0.5} = -1.4$
$P(X < 2.3) = P(Z < -1.4) = 0.0808.$

Example 7:
An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a bulb burns between 778 and 834 hours.
Solution:
$z_1 = \frac{778 - 800}{40} = -0.55$
$z_2 = \frac{834 - 800}{40} = 0.85$
$P(778 < X < 834) = P(-0.55 < Z < 0.85) = P(Z < 0.85) - P(Z < -0.55) = 0.8023 - 0.2912 = 0.5111$

Example 8:
Gauges are used to reject all components for which a certain dimension is not within the specification 1.50 ± d. It is known that this measurement is normally distributed with mean 1.50 and standard deviation 0.2.

Determine the value d such that the specifications “cover” 95% of the measurements.
Solution:
$P(-1.96 < Z < 1.96) = 0.95.$
$1.96 = \frac{(1.50 + d) - 1.50}{0.2}$
$d = 0.2 \times 1.96 = 0.392$

Example 9:
The average grade for an exam is 74, and the standard deviation is 7. If 12% of the class is given As, and the grades are curved to follow a normal distribution, what is the lowest possible A and the highest possible B?
Solution:
Area to the left = 1 − 0.12 = 0.88
z ≈ 1.18
$Z = \frac{X - \mu}{\sigma}$
$1.18 = \frac{x - 74}{7}$
x = 82.26
Therefore, the lowest A is 83 and the highest B is 82.

NORMAL APPROXIMATION TO THE BINOMIAL
Probabilities associated with binomial experiments are readily obtainable from the formula b(x; n, p) of the binomial distribution when n is small. The Poisson distribution can be used to approximate binomial probabilities when n is quite large and p is very close to 0 or 1. Both the binomial and the Poisson distributions are discrete. The normal distribution is often a good approximation to a discrete distribution when the latter takes on a symmetric bell shape. From a theoretical point of view, some distributions converge to the normal as their parameters approach certain limits. The normal distribution is a convenient approximating distribution because the cumulative distribution function is so easily tabled. The binomial distribution is nicely approximated by the normal in practical problems when one works with the cumulative distribution function.

Theorem 2: If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of
$Z = \frac{X - np}{\sqrt{npq}}$
as n → ∞, is the standard normal distribution n(z; 0, 1).

The normal distribution with μ = np and σ² = np(1 − p) not only provides a very accurate approximation to the binomial distribution when n is large and p is not extremely close to 0 or 1 but also provides a fairly good approximation even when n is small and p is reasonably close to 1/2.

Example 9:
Normal approximation of b(x; 15, 0.4):
$\mu = np = (15)(0.4) = 6$
$\sigma^{2} = npq = (15)(0.4)(0.6) = 3.6$
$\sigma = 1.897$
Assuming x = 4 and using the binomial distribution,
$P(X = 4) = b(4; 15, 0.4) = 0.1268$
Using the normal approximation, this probability is the shaded region under the normal curve between the two ordinates x1 = 3.5 and x2 = 4.5:
$z_1 = \frac{3.5 - 6}{1.897} = -1.32$
$z_2 = \frac{4.5 - 6}{1.897} = -0.79$
$P(X = 4) = b(4; 15, 0.4) \approx P(-1.32 < Z < -0.79) = P(Z < -0.79) - P(Z < -1.32) = 0.2148 - 0.0934 = 0.1214.$
Note: The normal approximation is most useful in calculating binomial sums for large values of n.
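The comparison between the exact binomial probability and its normal approximation can be reproduced in a few lines. This is a sketch using only the standard library; `binomial_pmf` is a helper name introduced here, and the approximation uses the same half-unit (continuity) correction as the example, integrating from 3.5 to 4.5.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 15, 0.4
mu = n * p                      # 6
sigma = sqrt(n * p * (1 - p))   # about 1.897

exact = binomial_pmf(4, n, p)

# Normal approximation: area between the ordinates 3.5 and 4.5
approx_dist = NormalDist(mu=mu, sigma=sigma)
approx = approx_dist.cdf(4.5) - approx_dist.cdf(3.5)

print(f"exact  b(4; 15, 0.4) = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

The approximation computed this way differs slightly in the last digits from the table-based value, because the tables round z to two decimals.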

Normal Approximation and True Cumulative Binomial Probabilities

Example 10:
The probability that a patient recovers from a rare blood disease is 0.4. If 100 people are known to have contracted this disease, what is the probability that fewer than 30 survive?
Solution:
X = number of patients who survive.
n = 100
$\mu = np = (100)(0.4) = 40$
$\sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899.$
x = 29.5.
$z = \frac{29.5 - 40}{4.899} = -2.14$
$P(X < 30) \approx P(Z < -2.14) = 0.0162.$

Example 11:
A multiple-choice quiz has 200 questions, each with 4 possible answers of which only 1 is correct. What is the probability that sheer guesswork yields from 25 to 30 correct answers for the 80 of the 200 problems about which the student has no knowledge?
Solution:
$p = \frac{1}{4}$
$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, 1/4)$
$\mu = np = (80)\left(\frac{1}{4}\right) = 20$
$\sigma = \sqrt{npq} = \sqrt{(80)(1/4)(3/4)} = 3.873$
x1 = 24.5 and x2 = 30.5
$z_1 = \frac{24.5 - 20}{3.873} = 1.16$
$z_2 = \frac{30.5 - 20}{3.873} = 2.71$
$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, 0.25) \approx P(1.16 < Z < 2.71) = P(Z < 2.71) - P(Z < 1.16) = 0.9966 - 0.8770 = 0.1196.$

EXAMPLES:
12. A soft-drink machine is regulated so that it discharges an average of 200 milliliters per cup. If the amount of drink is normally distributed with a standard deviation equal to 15 milliliters,
(a) what fraction of the cups will contain more than 224 milliliters?
Solution:
$z = \frac{x - \mu}{\sigma} = \frac{224 - 200}{15} = 1.6$
$P(X > 224) = P(Z > 1.6) = 1 - P(Z < 1.6) = 1 - 0.9452 = 0.0548$
(b) what is the probability that a cup contains between 191 and 209 milliliters?
Solution:
$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{191 - 200}{15} = -0.6$

$z_2 = \frac{x_2 - \mu}{\sigma} = \frac{209 - 200}{15} = 0.6$
$P(191 \le X \le 209) = P(-0.6 \le Z \le 0.6) = P(Z < 0.6) - P(Z < -0.6) = 0.7257 - 0.2743 = 0.4514$
(c) how many cups will probably overflow if 230-milliliter cups are used for the next 1000 drinks?
Solution:
$z = \frac{x - \mu}{\sigma} = \frac{230 - 200}{15} = 2$
$P(X > 230) = P(Z > 2) = 1 - P(Z < 2) = 1 - 0.9772 = 0.0228$
$E(X) = n \times p = 1000 \times 0.0228 = 22.8 \approx 23$
(d) below what value do we get the smallest 25% of the drinks?
Solution:
$P(X < x) = 0.25$
$P\left(Z < \frac{x - \mu}{\sigma}\right) = 0.25$
$P(Z < -0.67) = 0.25$
$\frac{x - \mu}{\sigma} = -0.67$
$x = (-0.67 \times \sigma) + \mu$
$x = (-0.67 \times 15) + 200$
$x = 189.95$

13. A lawyer commutes daily from his suburban home to his midtown office. The average time for a one-way trip is 24 minutes, with a standard deviation of 3.8 minutes. Assume the distribution of trip times to be normally distributed.
(a) What is the probability that a trip will take at least 1/2 hour?
Solution:
$P(X \ge 30) = 1 - P(X < 30) = 1 - P\left(Z < \frac{30 - 24}{3.8}\right) = 1 - P(Z \le 1.58) = 1 - 0.9429 = 0.0571$
(b) If the office opens at 9:00 A.M. and the lawyer leaves his house at 8:45 A.M. daily, what percentage of the time is he late for work?
Solution:
$P(X > 15) = 1 - P(X \le 15) = 1 - P\left(Z < \frac{15 - 24}{3.8}\right) = 1 - P(Z \le -2.37) = 1 - 0.0089 = 0.9911 = 99.11\%$
(c) If he leaves the house at 8:35 A.M. and coffee is served at the office from 8:50 A.M. until 9:00 A.M., what is the probability that he misses coffee?
Solution:
$P(X > 25) = 1 - P(X \le 25) = 1 - P\left(Z < \frac{25 - 24}{3.8}\right) = 1 - P(Z \le 0.26) = 1 - 0.6026 = 0.3974$
(d) Find the length of time above which we find the slowest 15% of the trips.
Solution:
$P(X > x) = 0.15$
$P(X < x) = 1 - 0.15 = 0.85$
$P\left(Z \le \frac{x - \mu}{\sigma}\right) = 0.85$
$P(Z \le 1.04) = 0.85$
$\frac{x - \mu}{\sigma} = 1.04$
$x = (1.04 \times \sigma) + \mu$
$x = (1.04 \times 3.8) + 24$
$x = 27.952$
(e) Find the probability that 2 of the next 3 trips will take at least 1/2 hour.
Solution:
$P(X > 30) = 1 - P(X \le 30) = 1 - P\left(Z < \frac{30 - 24}{3.8}\right) = 1 - P(Z \le 1.58) = 1 - 0.9429 = 0.0571$
Therefore, n = 3 and p = 0.0571. With X now denoting the number of trips that take at least 1/2 hour,
$P(X = x) = \binom{3}{x} (0.0571)^{x} (1 - 0.0571)^{3-x}$
$P(X = 2) = \binom{3}{2} (0.0571)^{2} (1 - 0.0571)^{3-2} = 0.0092$

14. If a set of observations is normally distributed, what percent of these differ from the mean by
(a) more than 1.3σ?
$P(|X - \mu| > 1.3\sigma) = P(|Z| > 1.3) = 2\,[1 - P(Z < 1.3)] = 2(1 - 0.9032) = 2(0.0968) = 0.1936 = 19.36\%$
(b) less than 0.52σ?
$P(|X - \mu| < 0.52\sigma) = P(-0.52 < Z < 0.52) = 0.6985 - 0.3015 = 0.3970 = 39.70\%$

15. A process yields 10% defective items. If 100 items are randomly selected from the process, what is the probability that the number of defectives
(a) exceeds 13?
$\mu = np = 100 \times 0.1 = 10$
$\sigma = \sqrt{npq} = \sqrt{100 \times 0.1 \times 0.9} = 3$
$z = \frac{x - \mu}{\sigma} = \frac{13.5 - 10}{3} = 1.166 \approx 1.17$
$P(X > 13) \approx P(Z > 1.17) = 1 - P(Z < 1.17) = 1 - 0.8790 = 0.1210$
(b) is less than 8?
$z = \frac{x - \mu}{\sigma} = \frac{7.5 - 10}{3} = -0.83$
$P(X < 8) \approx P(Z < -0.83) = 0.2033$

16. The serum cholesterol level X in 14-year-old boys has approximately a normal distribution with mean 170 and standard deviation 30.
(a) Find the probability that the serum cholesterol level of a randomly chosen 14-year-old boy exceeds 230.
Solution:
$z = \frac{x - \mu}{\sigma} = \frac{230 - 170}{30} = 2$
$P(X > 230) = P(Z > 2) = 1 - P(Z < 2) = 1 - 0.9772 = 0.0228$
(b) In a middle school there are 300 14-year-old boys. Find the probability that at least 8 boys have a serum cholesterol level that exceeds 230.
Solution:
n = 300
p = 0.0228
X = number of boys with cholesterol level higher than 230
$\mu = np = 300 \times 0.0228 = 6.84$
$\sigma = \sqrt{npq} = \sqrt{300 \times 0.0228 \times 0.9772} = 2.58$
$z = \frac{x - \mu}{\sigma} = \frac{7.5 - 6.84}{2.58} = 0.26$
$P(X \ge 8) \approx P(Z > 0.26) = 1 - P(Z < 0.26) = 1 - 0.6026 = 0.3974$
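Percentile questions such as 12(d) and 13(d) run the table lookup in reverse: a probability is given and the x value is wanted. A minimal sketch with `statistics.NormalDist.inv_cdf` follows; the variable names are illustrative only.

```python
from statistics import NormalDist

# Exercise 12(d): value below which the smallest 25% of the drinks fall
cup_fill = NormalDist(mu=200, sigma=15)
q25 = cup_fill.inv_cdf(0.25)

# Exercise 13(d): trip length above which the slowest 15% of trips fall
trip_time = NormalDist(mu=24, sigma=3.8)
q85 = trip_time.inv_cdf(0.85)

print(f"25th percentile of fill volume: {q25:.2f} ml")
print(f"85th percentile of trip time:   {q85:.2f} min")
```

The results differ slightly from the hand computations (189.95 and 27.952) because the table method rounds z to two decimal places before converting back to x.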
GAMMA AND EXPONENTIAL DISTRIBUTION
The exponential distribution is a special case of the gamma distribution.
The exponential and gamma distributions play an important role in both queuing theory and reliability problems. Time between arrivals at service facilities and time to failure of component parts and electrical systems often are nicely modeled by the exponential distribution. The relationship between the gamma and the exponential allows the gamma to be used in similar types of problems.
The gamma distribution is widely used in engineering, science, and business to model continuous variables that are always positive and have skewed distributions.
The gamma function is defined by
$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha-1} e^{-x}\,dx, \quad \text{for } \alpha > 0.$
The following are a few simple properties of the gamma function:
(a) Γ(n) = (n − 1)(n − 2) ··· (1)Γ(1), for a positive integer n.
(b) Γ(n) = (n − 1)! for a positive integer n.
(c) Γ(1) = 1.
(d) Γ(1/2) = √π.

GAMMA DISTRIBUTION
The continuous random variable X has a gamma distribution, with parameters α and β, if its density function is given by
$f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}$ if x > 0,
$f(x; \alpha, \beta) = 0$ elsewhere,
where α > 0 and β > 0.
The special gamma distribution for which α = 1 is called the exponential distribution.

[Figure: Gamma distributions.]

EXPONENTIAL DISTRIBUTION
The continuous random variable X has an exponential distribution, with parameter β, if its density function is given by
$f(x; \beta) = \frac{1}{\beta}\, e^{-x/\beta}$ if x > 0,
$f(x; \beta) = 0$ elsewhere,
where β > 0.

Theorem 4: The mean and variance of the gamma distribution are
$\mu = \alpha\beta$ and $\sigma^{2} = \alpha\beta^{2}.$
The mean and variance of the exponential distribution are
$\mu = \beta$ and $\sigma^{2} = \beta^{2}.$

Relationship to the Poisson Process
The Poisson process allows for the use of the discrete distribution called the Poisson distribution. The Poisson distribution is used to compute the probability of specific numbers of “events” during a particular period of time or span of space. If X is the time until the first event of a Poisson process with rate λ, then
$P(X > x) = e^{-\lambda x}$

The cumulative distribution function for X is given by
$P(0 \le X \le x) = 1 - e^{-\lambda x}$
The density function is
$f(x) = \lambda e^{-\lambda x}$
where λ = 1/β.

Example 1:
Suppose that a system contains a certain type of component whose time, in years, to failure is given by T. The random variable T is modeled nicely by the exponential distribution with mean time to failure β = 5. If 5 of these components are installed in different systems, what is the probability that at least 2 are still functioning at the end of 8 years?
Solution:
$P(T > 8) = \frac{1}{5} \int_{8}^{\infty} e^{-t/5}\,dt = e^{-8/5} \approx 0.2.$
Let X = the number of components functioning after 8 years. Then using the binomial distribution, we have
$P(X \ge 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = 1 - \sum_{x=0}^{1} b(x; 5, 0.2) = 1 - 0.7373 = 0.2627.$

Example 2:
The exponential distribution is frequently applied to the waiting times between successes in a Poisson process. If the number of calls received per hour by a telephone answering service is a Poisson random variable with parameter λ = 6, we know that the time, in hours, between successive calls has an exponential distribution with parameter β = 1/6. What is the probability of waiting more than 15 minutes between any two successive calls?
Solution:
Let X = time in hours between successive calls.
$\lambda = 6, \quad \beta = \frac{1}{6}$
15 minutes = 1/4 hour
$P\left(X > \frac{1}{4}\right) = 1 - P\left(X \le \frac{1}{4}\right) = 1 - \left(1 - e^{-\frac{1}{\beta} \times \frac{1}{4}}\right) = 1 - \left(1 - e^{-6 \times \frac{1}{4}}\right) = 0.2231$

Example 3:
A certain type of device has an advertised failure rate of 0.01 per hour. The failure rate is constant and the exponential distribution applies.
(a) What is the mean time to failure?
$\mu = \frac{1}{\lambda} = \frac{1}{0.01} = 100$ hours
(b) What is the probability that 200 hours will pass before a failure is observed?
$P(X \ge 200) = 1 - P(X \le 200) = 1 - \left(1 - e^{-0.01 \times 200}\right) = 0.1353$

Example 4:
Suppose that the time, in hours, required to repair a heat pump is a random variable X having a gamma distribution with parameters α = 2 and β = 1/2. What is the probability that on the next service call
(a) at most 1 hour will be required to repair the heat pump?
$P(X \le 1) = \int_{0}^{1} \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}\,dx = \int_{0}^{1} \frac{x^{2-1} e^{-2x}}{0.5^{2}\,\Gamma(2)}\,dx = 0.5940$
(b) at least 2 hours will be required to repair the heat pump?

$P(X \ge 2) = 1 - \int_{0}^{2} \frac{x^{2-1} e^{-2x}}{0.5^{2}\,\Gamma(2)}\,dx = 0.0916$

Example 5:
In a certain city, the daily consumption of water (in millions of liters) follows approximately a gamma distribution with α = 2 and β = 3. If the daily capacity of that city is 9 million liters of water, what is the probability that on any given day the water supply is inadequate?
Solution:
$f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}$
$f(x; 2, 3) = \frac{1}{3^{2}\,\Gamma(2)}\, x^{2-1} e^{-x/3}$
$P(X > 9) = 1 - P(X \le 9) = 1 - \int_{0}^{9} \frac{1}{3^{2}\,\Gamma(2)}\, x^{2-1} e^{-x/3}\,dx = 1 - \frac{1}{9} \int_{0}^{9} x e^{-x/3}\,dx = 0.1991$
(a) Find the mean and variance of the daily water consumption.
$\mu = \alpha\beta = 6$ million liters
$\sigma^{2} = \alpha\beta^{2} = 2 \times 9 = 18$
(b) According to Chebyshev’s theorem, there is a probability of at least 3/4 that the water consumption on any given day will fall within what interval?
$\mu \pm 2\sigma = 6 \pm 2\sqrt{18}$
$6 + 2\sqrt{18} = 14.485$
$6 - 2\sqrt{18} = -2.485$
Since consumption cannot be negative, water consumption on any given day is from 0 to 14.485 million liters.

Example 6:
In a certain city, the daily consumption of electric power, in millions of kilowatt-hours, is a random variable X having a gamma distribution with mean μ = 6 and variance σ² = 12.
(a) Find the values of α and β.
$\alpha\beta = 6$, so $\alpha = \frac{6}{\beta}$
$\alpha\beta^{2} = 12$
$\frac{6}{\beta}\,\beta^{2} = 12$
$6\beta = 12$
$\beta = 2$
$\alpha = \frac{6}{\beta} = \frac{6}{2} = 3$
(b) Find the probability that on any given day the daily power consumption will exceed 12 million kilowatt-hours.
$f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}$
$f(x; 3, 2) = \frac{1}{2^{3}\,\Gamma(3)}\, x^{3-1} e^{-x/2}$
$P(X > 12) = 1 - P(X \le 12) = 1 - \int_{0}^{12} \frac{1}{2^{3}\,\Gamma(3)}\, x^{3-1} e^{-x/2}\,dx = 0.0620$
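The exponential and gamma probabilities in the examples of this section can be checked without numerical integration. The sketch below relies on a standard closed form that holds only when α is a positive integer (an assumption that fits all the gamma examples here): P(X ≤ x) = 1 − e^{−x/β} Σ_{k=0}^{α−1} (x/β)^k / k!. The function names are introduced for illustration.

```python
import math

def exponential_sf(x, beta):
    """P(X > x) for an exponential distribution with mean beta."""
    return math.exp(-x / beta)

def gamma_cdf_integer_alpha(x, alpha, beta):
    """P(X <= x) for a gamma distribution; valid only for integer alpha >= 1."""
    y = x / beta
    partial_sum = sum(y ** k / math.factorial(k) for k in range(alpha))
    return 1.0 - math.exp(-y) * partial_sum

# Example 2: waiting more than 1/4 hour, beta = 1/6
p_wait = exponential_sf(0.25, 1 / 6)

# Example 3(b): no failure in the first 200 hours, beta = 1 / 0.01 = 100
p_no_failure = exponential_sf(200, 100)

# Example 4: repair time, alpha = 2, beta = 1/2
p_at_most_1 = gamma_cdf_integer_alpha(1, 2, 0.5)
p_at_least_2 = 1 - gamma_cdf_integer_alpha(2, 2, 0.5)

# Example 5: water demand above 9, alpha = 2, beta = 3
p_water = 1 - gamma_cdf_integer_alpha(9, 2, 3)

# Example 6(b): power demand above 12, alpha = 3, beta = 2
p_power = 1 - gamma_cdf_integer_alpha(12, 3, 2)

for label, value in [("Ex 2", p_wait), ("Ex 3b", p_no_failure),
                     ("Ex 4a", p_at_most_1), ("Ex 4b", p_at_least_2),
                     ("Ex 5", p_water), ("Ex 6b", p_power)]:
    print(f"{label}: {value:.4f}")
```

With α = 1 the sum has a single term and the formula reduces to the exponential case, 1 − e^{−x/β}.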
FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTION

Random Sampling
The outcome of a statistical experiment may be recorded either as a numerical value or as a descriptive representation.
In this chapter, we focus on sampling from distributions or populations and study such important quantities as the sample mean and sample variance.

Population
Consists of the totality of the observations with which we are concerned.
• The totality of observations, whether their number be finite or infinite, constitutes what we call a population.
• The word population previously referred to observations obtained from statistical studies about people.
• Today, statisticians use the term to refer to observations relevant to anything of interest, whether it be groups of people, animals, or all possible outcomes from some complicated biological or engineering system.

Populations: Examples
• If there are 600 students in the school whom we classified according to blood type, we say that we have a population of size 600.
• The number of the cards in a deck, the heights of residents in a city, and the lengths of cars in a parking lot are examples of populations with finite size. The total number of observations is a finite number.
• The observations obtained by measuring the atmospheric pressure every day or all measurements of the depth of a lake are examples of populations whose sizes are infinite.

An Observation
• Each observation in a population is a value of a random variable X having some probability distribution f(x).
• For example, if one is inspecting items coming off an assembly line for defects, then each observation in the population might be a value 0 or 1 of the Bernoulli random variable X with probability distribution
$b(x; 1, p) = p^{x} q^{1-x}, \quad x = 0, 1$
where 0 indicates a non-defective item and 1 indicates a defective one. p is the probability of any item being defective and q = 1 − p.
• When we refer to the population f(x), e.g., a binomial or normal distribution, we mean a population whose observations are values of a random variable having the probability distribution f(x).

Sample
A subset of a population.
• In statistical inference, statisticians are interested in arriving at conclusions concerning a population when it is impossible or impractical to observe the entire set of observations that make up the population.
• We must depend on a subset of observations from the population to help us make inferences concerning that same population.
• If our inferences are to be valid, we must obtain samples that are representative of the population.
• Any sampling procedure that produces inferences that consistently overestimate or consistently underestimate some characteristic of the population is said to be biased.

Random sample
Let X1, X2, ..., Xn be n independent random variables, each having the same probability distribution f(x). Define X1, X2, ..., Xn to be a random sample of size n from the population f(x) and write its joint probability distribution as
$f(x_1, x_2, \ldots, x_n) = f(x_1) f(x_2) \cdots f(x_n).$
• In a random sample, the observations are made independently and at random.
• The random variable Xi, i = 1, ..., n, represents the ith measurement or sample value that we observe.
• And xi, i = 1, ..., n, represents the real value that we measure.

STATISTICS
• Any function of the random variables constituting a random sample is called a statistic.

Statistical Inferences
• We want some methods to make decisions or to draw conclusions about a population.
• We need samples from the population and utilize the information within.
• The methods can be divided into two major areas: parameter estimation and hypothesis testing.

What is a statistic?
• A statistic is a function of observations or random samples.
• A statistic itself is also a random variable.
• The probability distribution of a statistic is called a sampling distribution.

Sample mean:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Sample median:
$\tilde{x} = x_{(n+1)/2}$, if n is odd,
$\tilde{x} = \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right)$, if n is even.

Sample variance:
$S^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^{2}$

Variability Measures of a Sample: Example
The variability in a sample displays how the observations spread out from the average. For example,
• Consider the following measurements, in liters, for two samples of orange juice bottled by company A and B:
• The sample mean and std of samples A and B:

Example 1:
A comparison of coffee prices at 4 randomly selected grocery stores in San Diego showed increases from the previous month of 12, 15, 17, and 20 cents for a 1-pound bag. Find the variance of this random sample of price increases.
Solution:
$\bar{x} = \frac{12 + 15 + 17 + 20}{4} = 16$ cents.
$S^{2} = \frac{1}{4-1} \sum_{i=1}^{4} (X_i - 16)^{2} = \frac{(12-16)^{2} + (15-16)^{2} + (17-16)^{2} + (20-16)^{2}}{3}$
$S^{2} = \frac{34}{3}$

Theorem 1: If S² is the variance of a random sample of size n, we may write
$S^{2} = \frac{1}{n(n-1)} \left[ n \sum_{i=1}^{n} X_i^{2} - \left( \sum_{i=1}^{n} X_i \right)^{2} \right]$
Sample standard deviation
$S = \sqrt{S^{2}}$
Let X_max denote the largest of the Xi values and X_min the smallest.
Sample range
$R = X_{\max} - X_{\min}$

Example 2:
Find the variance of the data 3, 4, 5, 6, 6, and 7, representing the number of trout caught by a random sample of 6 fishermen on June 19, 1996, at Lake Muskoka.
Solution:
$s^{2} = \frac{1}{(6)(5)} \left[ (6)(171) - (31)^{2} \right] = \frac{13}{6}$
$s = \sqrt{\frac{13}{6}} = 1.47$
$R = 7 - 3 = 4$

Sampling Distribution of Mean
The probability distribution of a statistic is called a sampling distribution.
• Since a statistic is a random variable that depends only on the observed samples, it must have a probability distribution.
• The sampling distribution of a statistic depends on the distribution of the population, the size of the samples, and the method of choosing the samples.

Sample Mean
• Suppose a random sample of n observations, X1, X2, ..., Xn, is taken from a normal distribution with mean µ and variance σ², i.e. Xi ∼ N(µ, σ²), i = 1, ..., n.
• The sample mean
$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n) = \frac{1}{n} \sum_{i=1}^{n} X_i$
has a normal distribution with mean
$\mu_{\bar{X}} = \mu$
and variance
$\sigma_{\bar{X}}^{2} = \frac{\sigma^{2}}{n}$

Central Limit Theorem
The Central Limit Theorem states that the sampling distribution of the sample means approaches a normal distribution as the sample size gets larger, no matter what the shape of the population distribution. This fact holds especially true for sample sizes over 30. All this is saying is that as you take more samples, especially large ones, your graph of the sample means will look more like a normal distribution.

Theorem 2. If X̄ is the mean of a random sample of size n taken from a population with mean µ and variance σ², then the limiting form of the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
as n → ∞, is the standard normal distribution N(0, 1).
• If n ≥ 30, the normal approximation will be satisfactory regardless of population shape.
• If n < 30, the approximation is good only if the population is not too different from a normal distribution.
• If the population is known to be normal, the sampling distribution of X̄ is normal for any size of n.
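The behaviour of the sample mean can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than part of the original notes: it draws repeated samples of size n = 40 from a skewed (exponential) population with mean 2 and standard deviation 2, then compares the mean and spread of the sample means with μ and σ/√n. The population, sample size, and seed are arbitrary choices.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

population_mean = 2.0  # exponential population: mean 2, standard deviation 2
population_sd = 2.0
n = 40                 # size of each sample
num_samples = 5000     # number of repeated samples

sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1 / population_mean) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = population_sd / math.sqrt(n)

# Fraction of sample means within 1.96 standard errors of the population mean
within = sum(
    abs(m - population_mean) <= 1.96 * theoretical_sd for m in sample_means
) / num_samples

print(f"mean of sample means = {mean_of_means:.3f} (population mean {population_mean})")
print(f"sd of sample means   = {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_sd:.3f})")
print(f"fraction within 1.96 standard errors = {within:.3f}")
```

Even though the population is far from bell-shaped, the fraction of sample means within 1.96 standard errors should come out close to the 0.95 expected for a normal distribution.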

Example 3:
An electronics company manufactures resistors that have a mean resistance of µ = 100 Ω and standard deviation σ = 10 Ω. The distribution of resistance is normal. Find the probability that a random sample of n = 25 resistors will have an average resistance less than 95 Ω.
Solution:
• The sampling distribution of X̄ is normal, with mean $\mu_{\bar{X}} = 100\ \Omega$ and standard deviation
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = 2$
$P(\bar{X} < 95) = P\left( Z = \frac{\bar{X} - 100}{2} < \frac{95 - 100}{2} \right) = P(Z < -2.5) = 0.0062$

Theorem 3 (Two Samples of Two Populations). If we have two independent populations with means µ1 and µ2 and variances σ1² and σ2², and if X̄1 and X̄2 are the sample means of two independent random samples of sizes n1 and n2 from these populations, then the sampling distribution of
$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}}$
is approximately standard normal when the conditions for the central limit theorem apply.
• The difference of two Gaussian RVs is still normal.
• If both n1 and n2 are greater than 30, the normal approximation of X̄1 − X̄2 is good.

Example 4.
The effective life of a part used in a jet engine is close to a normal random variable with mean 5000 hours and standard deviation 40 hours. An improvement has been introduced to increase the mean life to 5050 hours and to decrease the standard deviation to 30 hours. Suppose there are random samples of n1 = 16 and n2 = 25 components selected from the original and improved processes, respectively. What is the probability that the difference in the two sample means X̄2 − X̄1 is at least 25 hours?
Solution:
For X̄1:
$\mu_1 = 5000$
$\sigma_{\bar{X}_1} = \frac{\sigma_1}{\sqrt{n_1}} = \frac{40}{\sqrt{16}} = 10$
For X̄2:
$\mu_2 = 5050$
$\sigma_{\bar{X}_2} = \frac{\sigma_2}{\sqrt{n_2}} = \frac{30}{\sqrt{25}} = 6$
For the difference X̄ = X̄2 − X̄1:
$\mu_{\bar{X}} = \mu_2 - \mu_1 = 5050 - 5000 = 50$
$\sigma_{\bar{X}}^{2} = \sigma_{\bar{X}_2}^{2} + \sigma_{\bar{X}_1}^{2} = 6^{2} + 10^{2} = 136$
We want $P(\bar{X} = \bar{X}_2 - \bar{X}_1 \ge 25)$:

$P(\bar{X} \ge 25) = P\left( Z = \frac{\bar{X} - 50}{\sqrt{136}} \ge \frac{25 - 50}{\sqrt{136}} \right) = P(Z \ge -2.14) = 1 - P(Z < -2.14) = 1 - 0.0162 = 0.9838$

Example 5:
Two independent experiments are run in which two different types of paint are compared. Eighteen specimens are painted using type A, and the drying time (in hours) is recorded for each. The same is done with type B. The population standard deviations are both known to be 1.0. Assuming that the mean drying time is equal for the two types of paint, find P(X̄A − X̄B > 1.0), where X̄A and X̄B are average drying times for samples of size nA = nB = 18.
Solution:
$\mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 0$
$\sigma_{\bar{X}_A - \bar{X}_B}^{2} = \frac{\sigma_A^{2}}{n_A} + \frac{\sigma_B^{2}}{n_B} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9}$
$P(\bar{X}_A - \bar{X}_B > 1.0) = P\left( \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{\sigma_{\bar{X}_A - \bar{X}_B}} > \frac{1.0 - 0}{\sqrt{1/9}} \right) = P(Z > 3.0) = 1 - P(Z < 3.0) = 1 - 0.9987 = 0.0013$

Sampling Distribution of Sample Variance
• If a random sample of size n is drawn from a normal distribution with mean µ and variance σ², the sample variance (the statistic S²) is given by
$S^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^{2}$
• Since X̄ has a normal distribution with mean µ and variance σ²/n, the random variable
$Z^{2} = \frac{(\bar{X} - \mu)^{2}}{\sigma^{2}/n}$
has a chi-squared distribution with 1 degree of freedom.

Theorem 2: If S² is the variance of a random sample of size n taken from a normal population having the variance σ², then the statistic
$\chi^{2} = \frac{(n-1)S^{2}}{\sigma^{2}} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^{2}}{\sigma^{2}}$
has a chi-squared distribution with v = n − 1 degrees of freedom.

Chi-Squared Distribution
The chi-squared distribution is the distribution of the sum of squared standard normal deviates.
The probability that a random sample produces a χ² value greater than some specified value is equal to the area under the curve to the right of this value.
$P(\chi^{2} > \chi_{\alpha, v}^{2}) = \alpha$

[Figure: The chi-squared distribution.]

The following table gives values of χ²_α for various values of α and v.
• Exactly 95% of a chi-squared distribution lies between χ²_0.975 and χ²_0.025.
• A χ² value falling to the right of χ²_0.025 is not likely to occur, P < 0.025, unless the assumed value of σ² is too small.
• A χ² value falling to the left of χ²_0.975 is not likely to occur, P < 0.025, unless the assumed value of σ² is too large.
• When σ² is correct, it is possible, P < 0.05, to have a χ² value to the left of χ²_0.975 or to the right of χ²_0.025.
• If this should happen, it is more probable that the assumed value of σ² is in error.

Example 3:
A manufacturer of car batteries guarantees that the batteries will last, on average, 3 years with a standard deviation of 1 year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years, should the manufacturer still be convinced that the batteries have a standard deviation of 1 year? Assume that the battery lifetime follows a normal distribution.
Solution:
$s^{2} = \frac{(5)(48.26) - (15)^{2}}{(5)(4)} = 0.815$
With σ² = 1,
$\chi^{2} = \frac{(n-1)S^{2}}{\sigma^{2}} = \frac{(4)(0.815)}{1} = 3.26$
Since 95% of the χ² values with 4 degrees of freedom fall between 0.484 and 11.143, the computed value with σ² = 1 is reasonable, and the manufacturer has no reason to suspect that the standard deviation is other than 1 year.

t-Distribution
Its applications revolve around inferences on a population mean or the difference between two population means. Use of the Central Limit Theorem and the normal distribution is certainly helpful in this context. However, it was assumed that the population standard deviation is known.

Theorem 3: Let Z be a standard normal random variable and V a chi-squared random variable with v degrees of freedom. If Z and V are independent, then the distribution of the random variable T, where
$T = \frac{Z}{\sqrt{V/v}}$
is given by the density function
$h(t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{\pi v}} \left( 1 + \frac{t^{2}}{v} \right)^{-\frac{v+1}{2}}, \quad -\infty < t < \infty.$
This is known as the t-distribution with v degrees of freedom.

Let X1, X2, ..., Xn be independent random variables that are all normal with mean μ and standard deviation σ. Let
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
and
$S^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^{2}$
Then the random variable
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
has a t-distribution with v = n − 1 degrees of freedom.
[Figure: The t-distribution curves for v = 2, 5, and ∞.]

[Figure: Symmetry property (about 0) of the t-distribution.]

The t-distribution is used extensively in problems that deal with inference about the population mean or in problems that involve comparative samples (i.e., in cases where one is trying to determine if means from two samples are significantly different).

Example 4:
The t-value with v = 14 degrees of freedom that leaves an area of 0.025 to the left, and therefore an area of 0.975 to the right, is
$t_{0.975} = -t_{0.025} = -2.145$

Example 5:
Find $P(-t_{0.025} < T < t_{0.05})$.
Solution:
Since t_0.05 leaves an area of 0.05 to the right, and −t_0.025 leaves an area of 0.025 to the left, we find a total area of
1 − 0.05 − 0.025 = 0.925
between −t_0.025 and t_0.05. Hence
$P(-t_{0.025} < T < t_{0.05}) = 0.925$

F-Distribution
The F-distribution finds enormous application in comparing sample variances. Applications of the F-distribution are found in problems involving two or more samples.
Let U and V be two independent random variables having chi-squared distributions with v1 and v2 degrees of freedom, respectively. Then the distribution of the random variable
$F = \frac{U/v_1}{V/v_2}$
is given by the density function
$h(f) = \frac{\Gamma\left(\frac{v_1 + v_2}{2}\right) \left(\frac{v_1}{v_2}\right)^{v_1/2}}{\Gamma\left(\frac{v_1}{2}\right) \Gamma\left(\frac{v_2}{2}\right)} \cdot \frac{f^{\,v_1/2 - 1}}{\left(1 + \frac{v_1 f}{v_2}\right)^{(v_1 + v_2)/2}}$ if f > 0,
$h(f) = 0$ if f ≤ 0.
This is known as the F-distribution with v1 and v2 degrees of freedom (d.f.).
[Figure: Typical F-distributions.]

[Figure: Illustration of the fα for the F-distribution.]

Theorem 4: Writing fα(v1, v2) for fα with v1 and v2 degrees of freedom, we obtain
$f_{1-\alpha}(v_1, v_2) = \frac{1}{f_{\alpha}(v_2, v_1)}$

The F-Distribution with Two Sample Variances
Theorem 5: If S1² and S2² are the variances of independent random samples of size n1 and n2 taken from normal populations with variances σ1² and σ2², respectively, then
$F = \frac{S_1^{2}/\sigma_1^{2}}{S_2^{2}/\sigma_2^{2}} = \frac{\sigma_2^{2} S_1^{2}}{\sigma_1^{2} S_2^{2}}$
has an F-distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.

Example 1:
The lengths of time, in minutes, that 10 patients waited in a doctor’s office before receiving treatment were recorded as follows: 5, 11, 9, 5, 10, 15, 6, 10, 5, and 10. Treating the data as a random sample, find
(a) the mean;
$\bar{x} = \frac{5 + 11 + 9 + 5 + 10 + 15 + 6 + 10 + 5 + 10}{10} = 8.6$ min
(b) the median;
Ordered data: 5, 5, 5, 6, 9, 10, 10, 10, 11, 15
$\tilde{x} = \frac{9 + 10}{2} = 9.5$ min
(c) the mode.
5 (3 times) and 10 (3 times)

Example 2:
The reaction times for a random sample of 9 subjects to a stimulant were recorded as 2.5, 3.6, 3.1, 4.3, 2.9, 2.3, 2.6, 4.1, and 3.4 seconds. Find
(a) the range;
Range = 4.3 − 2.3 = 2.0
(b) the standard deviation.
$S^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^{2}$
$= \frac{(2.5-3.2)^{2} + (3.6-3.2)^{2} + (3.1-3.2)^{2} + (4.3-3.2)^{2} + (2.9-3.2)^{2} + (2.3-3.2)^{2} + (2.6-3.2)^{2} + (4.1-3.2)^{2} + (3.4-3.2)^{2}}{9-1}$
$= 0.4975$
$s = \sqrt{0.4975} = 0.7053$

Example 3:
The numbers of incorrect answers on a true-false competency test for a random sample of 15 students were recorded as follows: 2, 1, 3, 0, 1, 3, 6, 0, 3, 3, 5, 2, 1, 4, and 2. Calculate the variance using the formula
Solution:
$\bar{x} = 2.4$
(a) of the defining form

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{1}{14}\left[(2-2.4)^2 + (1-2.4)^2 + \cdots + (2-2.4)^2\right] = 2.971$$

(b) Theorem 1
$$S^2 = \frac{1}{n(n-1)}\left[n\sum_{i=1}^{n}X_i^2 - \left(\sum_{i=1}^{n}X_i\right)^2\right] = \frac{15 \times 128 - 36^2}{15 \times 14} = 2.971$$

Example 4:
An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.

Solution:
$$\mu_{\bar{X}} = 800, \qquad \sigma_{\bar{X}} = \frac{40}{\sqrt{16}} = 10$$
$$\bar{x} = 775, \qquad z = \frac{775 - 800}{10} = -2.5$$
$$P(\bar{X} < 775) = P(Z < -2.5) = 0.0062$$

Example 5:
For a chi-squared distribution, find
(a) $\chi^2_{0.005}$ when v = 5;
= 16.750
(b) $\chi^2_{0.05}$ when v = 19;
= 30.144
(c) $\chi^2_{0.01}$ when v = 12.
= 26.217

Example 6:
Assume the sample variances to be continuous measurements. Find the probability that a random sample of 25 observations, from a normal population with variance $\sigma^2 = 6$, will have a sample variance $S^2$
(a) greater than 9.1;
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}, \qquad n - 1 = 25 - 1 = 24 \text{ degrees of freedom}$$
$$P(S^2 > 9.1) = P\left(\frac{(n-1)S^2}{\sigma^2} > \frac{24 \times 9.1}{6}\right) = P(\chi^2 > 36.4) = 0.05$$
(b) between 3.462 and 10.745.
$$P(3.462 < S^2 < 10.745) = P\left(\frac{(25-1)(3.462)}{6} < \chi^2 < \frac{(25-1)(10.745)}{6}\right)$$
$$= P(13.848 < \chi^2 < 42.980) = 0.95 - 0.01 = 0.94$$

Example 7:
(a) Find $P(T < 2.365)$ when v = 7.
= 1 − 0.025 = 0.975
(b) Find $P(-1.356 < T < 2.179)$ when v = 12.
= 1 − 0.025 − 0.1 = 0.875
(c) Find $P(-t_{0.005} < T < t_{0.01})$ for v = 20.
= 1 − 0.01 − 0.005 = 0.985
(d) Find $P(T > -t_{0.025})$.
= 1 − 0.025 = 0.975

Example 8:
A manufacturing firm claims that the batteries used in their electronic games will last an average of 30 hours. To maintain this

average, 16 batteries are tested each month. If the computed t-value falls between $-t_{0.025}$ and $t_{0.025}$, the firm is satisfied with its claim. What conclusion should the firm draw from a sample that has a mean of $\bar{x} = 27.5$ hours and a standard deviation of s = 5 hours? Assume the distribution of battery lives to be approximately normal.

Solution:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{27.5 - 30}{5/\sqrt{16}} = -2.00$$
With v = 16 − 1 = 15 degrees of freedom, $-t_{0.025} = -2.131$ and $t_{0.025} = 2.131$.
∴ The claim is valid, since t = −2.00 falls between −2.131 and 2.131.

Example 9:
A maker of a certain brand of low-fat cereal bars claims that the average saturated fat content is 0.5 gram. In a random sample of 8 cereal bars of this brand, the saturated fat content was 0.6, 0.7, 0.7, 0.3, 0.4, 0.5, 0.4, and 0.2. Would you agree with the claim? Assume a normal distribution.
Solution:
$$\bar{x} = \frac{0.6 + 0.7 + 0.7 + 0.3 + 0.4 + 0.5 + 0.4 + 0.2}{8} = 0.475$$
$$s^2 = \frac{1}{8-1}\left[(0.6-0.475)^2 + (0.7-0.475)^2 + (0.7-0.475)^2 + (0.3-0.475)^2 + (0.4-0.475)^2 + (0.5-0.475)^2 + (0.4-0.475)^2 + (0.2-0.475)^2\right] = 0.0336$$
$$s = 0.183$$
$$t = \frac{0.475 - 0.5}{0.183/\sqrt{8}} = -0.386$$
n − 1 = 8 − 1 = 7 degrees of freedom
∴ The claim is acceptable: t = −0.386 lies well inside the interval from $-t_{0.025} = -2.365$ to $t_{0.025} = 2.365$ for v = 7.

Example 10.
For an F-distribution, find
(a) $f_{0.05}$ with v1 = 7 and v2 = 15;
= 2.71
(d) $f_{0.95}$ with v1 = 19 and v2 = 24;
$$= \frac{1}{2.11} = 0.47$$

Example 10:
Pull-strength tests on 10 soldered leads for a semiconductor device yield the following results, in pounds of force required to rupture the bond:
19.8 12.7 13.2 16.9 10.6 18.8 11.1 14.3 17.0 12.5
Another set of 8 leads was tested after encapsulation to determine whether the pull strength had been increased by encapsulation of the device, with the following results:
24.9 22.8 23.6 22.1 20.4 21.6 21.8 22.5
Comment on the evidence available concerning equality of the two population variances.
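For the pull-strength data, a natural first step is to compute the two sample variances and their ratio. The sketch below assumes, as in Theorem 5, that both samples come from normal populations; if the two population variances were equal, the ratio of sample variances would follow an F-distribution with v1 = 10 − 1 = 9 and v2 = 8 − 1 = 7 degrees of freedom. The variable names `before` and `after` are illustrative labels for the two sets of leads.

```python
from statistics import variance

# Pull strengths in pounds of force, as listed in the example.
before = [19.8, 12.7, 13.2, 16.9, 10.6, 18.8, 11.1, 14.3, 17.0, 12.5]
after = [24.9, 22.8, 23.6, 22.1, 20.4, 21.6, 21.8, 22.5]

# statistics.variance uses the n - 1 divisor, matching the sample variance S^2.
s1_sq = variance(before)
s2_sq = variance(after)

# Under equal population variances, Theorem 5 reduces to F = S1^2 / S2^2.
f_ratio = s1_sq / s2_sq
v1, v2 = len(before) - 1, len(after) - 1

print(f"s1^2 = {s1_sq:.3f}, s2^2 = {s2_sq:.3f}")
print(f"f = {f_ratio:.2f} with v1 = {v1} and v2 = {v2} degrees of freedom")
```

The computed ratio can then be compared with the tabulated values of fα(9, 7). A ratio far above those values would suggest that the two population variances are not equal.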

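The t statistics in the battery and cereal-bar examples both use T = (x̄ − μ)/(s/√n). As a rough sketch, the arithmetic can be reproduced as follows. The helper name `t_statistic` is hypothetical, and the critical values 2.131 (v = 15) and 2.365 (v = 7) are the table values for t0.025, entered by hand rather than computed.

```python
import math
from statistics import mean, stdev

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic T = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Battery example: x_bar = 27.5, claimed mean 30, s = 5, n = 16.
t_batteries = t_statistic(27.5, 30, 5, 16)
batteries_ok = -2.131 < t_batteries < 2.131  # t_0.025 for v = 15 (table value)

# Cereal-bar example: statistics are computed from the raw data.
fat = [0.6, 0.7, 0.7, 0.3, 0.4, 0.5, 0.4, 0.2]
t_cereal = t_statistic(mean(fat), 0.5, stdev(fat), len(fat))
cereal_ok = -2.365 < t_cereal < 2.365  # t_0.025 for v = 7 (table value)

print(f"batteries: t = {t_batteries:.2f}, inside interval: {batteries_ok}")
print(f"cereal bars: t = {t_cereal:.3f}, inside interval: {cereal_ok}")
```

Computing s from the unrounded data gives a t-value that matches the hand calculation to three decimals.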

Compiled and prepared by: ENGR. K.T. CABANLIG
