Lecture - Business Statistics - 11142019 PDF

Class Orientation
Second Semester, A.Y. 2019-2020
January 20, 2020

FEU PRAYER
Direct, O God, we beseech You,
all our actions by Your holy inspiration
and help us all by your gracious assistance
so that every prayer and work of ours
may begin with You
and by You be happily ended. Amen.
VISION – MISSION STATEMENT
Guided by the Core Values of Fortitude,
Excellence and Uprightness,
Far Eastern University aims to be a university of
choice in Asia.
Committed to the highest intellectual, moral
and cultural standards, Far Eastern University
strives to produce principled and competent
graduates.
It nurtures a service-oriented and
environment-conscious community which seeks
to contribute to the advancement of the global
society.
CORE VALUES
Fortitude
A Tamaraw is characterized by fortitude.
Moral courage and strength of character allow
Tamaraws to persevere and achieve more than is
expected of them.
CORE VALUES
Excellence
A Tamaraw is characterized by excellence.
The FEU academic community is committed to
perform to its fullest potential thus creating a
culture of excellence.
CORE VALUES
Uprightness
A Tamaraw is characterized by uprightness.
Full development of morality and integrity is
among the primary purposes of FEU as an
educational institution.
BYRON H. CABARLOC
OFFICIAL CONSULTATION HOURS:

MONDAY 4:30P-6:30P
VENUE: Faculty Room
MOBILE NUMBER: 0929 699 9828

EMAIL: bcabarloc@feu.edu.ph
Classroom Management
• ATTENDANCE – absences must NOT be more than
20% of the total class hours. There are no excused
absences
• Seat Plan – ALPHABETICAL ORDER
• No Eating/Drinking/Smoking/Sleeping inside the
classroom
• Place on SILENT MODE all your CELLPHONES
before entering the class / Use of Cellphones inside
the classroom is strictly forbidden.
• Friction Pen/Pencil is NOT allowed for any submitted
requirement (i.e. EXAMINATIONS/EXERCISES/SEATWORKS)
• 100% utilization of CANVAS
COMPUTER LABORATORY RULES:
• You may only enter the laboratory if your faculty is inside
• No browsing of sites that are not part of the course requirement
(i.e. facebook, youtube, etc)
• No tampering/changing of computer configuration w/o approval
• No installation of any application without approval
• Playing of online games is strictly not allowed
• No swapping of equipment / report all defective peripheral
devices
• SHUTDOWN the computer before leaving the laboratory (use
the POWER BUTTON)
• No eating/drinking inside the laboratory
• Alphabetical seating arrangement (no transferring of seats
without the approval of your faculty)
IMPORTANT REMINDERS:
• ASSIGNMENTS must be completed on or before
the deadline (no extensions/no excuses)
• QUIZZES must be completed on or before the deadline (no extensions/no excuses)
• Always read the topic for discussion before entering the class.
• Classes are NOT for lecture; make sure that you are prepared for discussion on the
topic for the day.
• Any seat-work(s)/exercise(s) missed due to absence(s) will automatically get a
ZERO grade.
• Laboratory Exercises should ONLY be done during laboratory hours. Late
submission/absences during laboratory hours automatically get a ZERO grade for
the exercise scheduled for the day.
• FINAL consultation of your grade is at the END of your MIDTERM period
• You are only allowed 3 ABSENCES for this course / 15-MINUTES LATE is
considered an absence. All those who exceed the maximum number of absences
will be considered as DROPPED with a grade of F unless you have an approval
letter from the Academic Office that you are allowed to continue to attend the
course due to VALID reason(s).
SCHOOL CALENDAR
AY 2019-2020
SECOND SEMESTER
ENROLLMENT PERIOD JANUARY 6-17, 2020
START OF CLASS JANUARY 20, 2020
LATE ENROLLMENT JANUARY 20-24, 2020
ADJUSTMENT PERIOD JANUARY 20-24, 2020
LAST DAY OF DROPPING MARCH 14, 2020
MIDTERM EXAMINATIONS MARCH 16-21, 2020
FINAL EXAMINATIONS MAY 18-23, 2020
GRADE ENCODING MAY 25-27, 2020
SUMMER TERM
ENROLLMENT PERIOD JUNE 1-5, 2020
START OF CLASS JUNE 8, 2020
LATE ENROLLMENT JUNE 8-11, 2020
ADJUSTMENT PERIOD JUNE 8-11, 2020
LAST DAY OF DROPPING JULY 1, 2020
MIDTERM EXAMINATIONS JULY 2-3, 2020
FINAL EXAMINATIONS JULY 23-24, 2020
GRADE ENCODING JULY 27-28, 2020
COMPUTATION of GRADES: GRADING SYSTEM
A1. Class Participation 20%
- In-class activities
- Case discussion/Exercises
- Group presentations
A2. Quizzes/Assignments 30%
A3. Major Examination 50%
TOTAL 100%
Note: University wide passing is 50% of the total
required credit points. Your FINAL GRADE is 50%
MIDTERM + 50% FINALS GRADE
LMS: https://onefeu.instructure.com / https://feu.instructure.com
(CANVAS)
LOG IN USING YOUR FEU EMAIL / MAKE SURE
TO CHECK YOUR ACCESS
ONCE YOU LOG IN, PERSONALIZE YOUR
PASSWORD
NOTE: TAKE TIME TO READ THE MANUAL and MAKE SURE TO SET THE AUTO-NOTIFICATION
TO SEND EMAIL or SMS TO YOUR CELLPHONE.
Course Title: Business Statistics
TEXTBOOK: Essentials of
Modern Business Statistics
with Microsoft Office Excel – Anderson,
Sweeney, Williams, Camm, Cochran
YOU ARE REQUIRED TO BUY THE E-BOOK. ALL
LABORATORY EXERCISES WILL BE TAKEN FROM
THE E-BOOK.
COURSE OUTLINE
Chapter 1: Data and Statistics
Chapter 2: Descriptive Statistics: Tabular and Graphical Displays
Chapter 3: Descriptive Statistics: Numerical Measures
Chapter 4: Introduction to Probability
Chapter 5: Discrete Probability Distributions
Chapter 6: Continuous Probability Distributions
MIDTERM EXAMINATION and GRADE CONSULTATIONS
Chapter 7: Sampling and Sampling Distributions
Chapter 8: Interval Estimation
Chapter 9: Hypothesis Tests
Chapter 10: Inferences About Means and Proportions with Two Populations
GROUP PROJECT: Apply Business Statistics in Conducting a Market Research Proposal on the Feasibility
of Any Product (you may use this template as a guide only:
https://www.pandadoc.com/market-research-proposal-template/) / This is 50% of your Final Examination /
Deadline: MAY 8, 2020, 3:00PM (NO EXTENSION)
FINAL EXAMINATION
login.cengagebrain.com/cb/
REGISTER USING YOUR SECTION'S ASSIGNED COURSE
LINK URL
How to access your MindTap course
MGT1103 BUSINESS STATISTICAL ANALYSIS with SOFTWARE APPLICATIONS – SECTION 2
Instructor : Byron Cabarloc
Start Date : 01/20/2020
What is MindTap?
MindTap empowers you to produce your best work – consistently.
MindTap is designed to help you master the material. Interactive videos, animations, and activities create a learning path designed by
your instructor to guide you through the course and focus on what's important. Get started today!
Registration
Connect to https://login.cengagebrain.com/course/MTPNR0TNMB3D
Follow the prompts to register your MindTap course.
Payment
After registering for your course, you will need to pay for access using one of the options below:
Online: You can pay online using a credit or debit card, or PayPal.
Bookstore: You may be able to purchase access to MindTap at your bookstore. Check with the bookstore to find out what they offer
for your course.
Free Trial: If you are unable to pay at the start of the semester you may choose to access MindTap until 11:59 PM on
02/03/2020 during your free trial. After the free trial ends you will be required to pay for access.
Please note: At the end of the free trial period, your course access will be suspended until your payment has been made. All your
scores and course activity will be saved and will be available to you after you pay for access.
Already registered an access code? Bought MindTap at your bookstore or online? Now use the course link from your instructor to
register for the class: https://login.cengagebrain.com/course/MTPNR0TNMB3D
System Check
To check whether your computer meets the requirements for using MindTap, go
to http://ng.cengage.com/static/browsercheck/index.html
Please Note: the System Check is also accessible in the drop down box next to your name located in the upper right corner of your
MindTap page.
WORK ON THE TOPIC BELOW (YELLOW PAD): NOT LESS
THAN 500 WORDS
TOPIC: DISCUSS WHY THERE IS A NEED TO STUDY
STATISTICS AND ITS APPLICATION TO THE
CORPORATE ENVIRONMENT.
DURATION: 30 MINUTES
CHAPTER 1: Data and Statistics
Essentials of
Modern Business
Statistics (7e)
Anderson, Sweeney, Williams, Camm, Cochran
© 2018 Cengage Learning
Chapter 1 - Data and Statistics
▪ Statistics
▪ Applications in Business and Economics
▪ Data Sources
▪ Descriptive Statistics
▪ Statistical Inference
▪ Data Mining
▪ Statistical Analysis Using Microsoft Excel
What is Statistics?
▪ The term statistics can refer to numerical
facts such as averages, medians,
percentages, and maximums that help us
understand a variety of business and
economic situations.
▪ Statistics can also refer to the art and
science of collecting, analyzing, presenting,
and interpreting data.
Applications in Business and Economics
▪ Accounting
Public accounting firms use statistical sampling procedures when
conducting audits for their clients.
▪ Economics
Economists use statistical information in making forecasts about the
future of the economy or some aspect of it.
▪ Finance
Financial advisors use price-earnings ratios and dividend yields to guide
their investment advice.
▪ Marketing
Electronic point-of-sale scanners at retail checkout counters are
used to collect data for a variety of marketing research
applications.
▪ Production
A variety of statistical quality control charts are used to monitor
the output of a production process.
▪ Information Systems
A variety of statistical information helps administrators assess the
performance of computer networks.
Data and Data Sets
▪ Data: facts and figures from which conclusions
can be drawn
▪ Data set: the data that are collected for a
particular study
▪ Elements: the entities on which data are
collected; they may be people, objects, events, or
other entities
▪ Variable: any characteristic, number, or quantity
that can be measured or counted
Observations
The set of measurements obtained for a
particular element is called an observation.
▪ A data set with n elements contains n
observations.
Data, Data Sets, Elements, Variables, and Observations
Nation       WTO Status   Per Capita GDP ($)   Fitch Rating
Armenia      Member        5,400               BB-
Australia    Member       40,800               AAA
Austria      Member       41,700               AAA
Azerbaijan   Observer      5,400               BBB-
Bahrain      Member       27,300               BBB
The nation names identify the elements, the column headings are the variables, each row is an
observation, and the whole table is the data set.
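The same terminology can be made concrete in code. The following is a minimal Python sketch, not taken from the textbook; the names `data_set`, `elements`, and `variables` are illustrative assumptions. It stores the five-nation table as a list of records and counts its elements, variables, and observations.

```python
# Each dictionary is one observation: the set of measurements for one element.
data_set = [
    {"Nation": "Armenia", "WTO Status": "Member", "Per Capita GDP ($)": 5400, "Fitch Rating": "BB-"},
    {"Nation": "Australia", "WTO Status": "Member", "Per Capita GDP ($)": 40800, "Fitch Rating": "AAA"},
    {"Nation": "Austria", "WTO Status": "Member", "Per Capita GDP ($)": 41700, "Fitch Rating": "AAA"},
    {"Nation": "Azerbaijan", "WTO Status": "Observer", "Per Capita GDP ($)": 5400, "Fitch Rating": "BBB-"},
    {"Nation": "Bahrain", "WTO Status": "Member", "Per Capita GDP ($)": 27300, "Fitch Rating": "BBB"},
]

# Elements: the entities on which data are collected (here, the nations).
elements = [row["Nation"] for row in data_set]

# Variables: the characteristics recorded for every element.
variables = [name for name in data_set[0] if name != "Nation"]

print(len(elements), "elements")
print(len(variables), "variables:", variables)
print(len(data_set), "observations")
```

With n = 5 elements the data set contains 5 observations, one per nation.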
Scales of Measurement
▪ Scales of measurement include
• Nominal
• Ordinal
• Interval
• Ratio
▪ The scale determines the amount of information
contained in the data.
▪ The scale indicates the data summarization and
statistical analyses that are most appropriate.
▪ Nominal
– Data are labels or names used to identify an
attribute of the element.
– A non-numeric label or numeric code may be
used.
▪ Ordinal
– The data have the properties of nominal data and
the order or rank of the data is significant.
– A non-numeric label or numeric code may be
used.
▪ Interval
– The data have the properties of ordinal data, and the
interval between observations is expressed in terms
of a fixed unit of measure.
– Interval data are always numeric.
The classic example of an interval scale is
Celsius temperature, because the difference
between each value is the same. For example,
the difference between 10 and 20 degrees is a
measurable 10 degrees, as is the difference
between 40 and 50 degrees.
▪ Ratio
– The data have all the properties of interval data and the
ratio of two values is meaningful.
– Variables such as distance, height, weight, and time use
the ratio scale.
– This scale must contain a zero value that indicates that
nothing exists for the variable at the zero point.
Example: Melissa's college record shows 36 credit hours earned, while
Kevin's record shows 72 credit hours earned. Kevin has twice as
many credit hours earned as Melissa.
[Figure: This device provides two examples of ratio scales (height and weight)]
CATEGORICAL (QUALITATIVE) AND QUANTITATIVE DATA
Any characteristic of an element is called a variable.
Qualitative data are data in which the classification of
objects is based on attributes and properties (e.g., the cake
is orange, blue, and black in color; females have brown,
black, blonde, and red hair).
Quantitative data focus on numbers and can be counted,
measured, and computed (e.g., there are 4 cakes and 3 muffins
in the basket; 1 glass of fizzy drink has 97.5 calories).
Data
  Categorical
    Numeric: Nominal, Ordinal
    Non-numeric: Nominal, Ordinal
  Quantitative
    Numeric: Interval, Ratio

Cross-Sectional Data
CROSS-SECTIONAL DATA is a type of data collected by
observing many subjects (such as individuals, firms,
countries, or regions) at the same point of time, or
without regard to differences in time. Analysis of cross-
sectional data usually consists of comparing the
differences among the subjects.
Example:
To estimate the present level of obesity in a given population, one can draw a random sample
of around 1,000 people from that population; such a sample is also called a cross section of
the population. The height and weight of each person in the sample are then measured in order
to estimate the percentage of people who can be classified as obese. This cross-sectional
sample gives an overview of the population at a particular point in time. Note that a single
cross-sectional sample cannot tell us whether obesity is decreasing or increasing; it can
only describe the present proportion.
Time Series Data
A TIME SERIES is a series of data points indexed (or listed or
graphed) in time order. Most commonly, a time series is a
sequence taken at successive equally spaced points in time.
Example:
U.S. average price per gallon of
conventional regular gasoline
between 2010 and 2015.
Graphs of time series help
analysts understand what
happened in the past, identify
any trends over time, and
project future values for the time
series.
Data Sources
▪ Existing Sources
• Internal company records – almost any
department
• Business database services – Dow Jones & Co.
• Government agencies – U.S. Department of Labor
• Industry associations – Travel Industry Association
of America
• Special-interest organizations – Graduate
Management Admission Council (GMAT)
• Internet – more and more firms
Data Sources
▪ Data Available From Internal Company Records
Record               Some of the Data Available
Employee records     Name, address, social security number
Production records   Part number, quantity produced, direct labor cost, material cost
Inventory records    Part number, quantity in stock, reorder level, economic order quantity
Sales records        Product number, sales volume, sales volume by region
Credit records       Customer name, credit limit, accounts receivable balance
Customer profile     Age, gender, income, household size
Data Sources
▪ Data Available From Selected Government Agencies
Government Agency            Some of the Data Available
Census Bureau                Population data, number of households, household income
Federal Reserve Board        Data on money supply, exchange rates, discount rates
Office of Mgmt. & Budget     Data on revenue, expenditures, debt of federal government
Department of Commerce       Data on business activity, value of shipments, profit by industry
Bureau of Labor Statistics   Consumer spending, unemployment rate, hourly earnings, safety record
TYPES OF STATISTICAL STUDIES
▪ Statistical Studies – Observational
– In observational (nonexperimental) studies no
attempt is made to control or influence the
variables of interest.
• A survey is a good example.
Studies of smokers and nonsmokers are
observational studies because researchers do not
determine or control who will smoke and who will
not smoke.
TYPES OF STATISTICAL STUDIES
▪ Statistical Studies – Experimental
– In experimental studies the variable of interest is
first identified. Then one or more variables are
identified and controlled so that data can be
obtained about how they influence the variable of
interest.
– The largest experimental study ever conducted is
believed to be the 1954 Public Health Service
experiment for the Salk polio vaccine. Nearly two
million U.S. children (grades 1-3) were selected.
Data Acquisition Considerations
▪ Time Requirement
– Searching for information can be time consuming.
– Information may no longer be useful by the time it is
available.
▪ Cost of Acquisition
– Organizations often charge for information even when it
is not their primary business activity.
▪ Data Errors
– Using any data that happen to be available or were
acquired with little care can lead to misleading
information.
Descriptive Statistics
DESCRIPTIVE STATISTICS is the term given to the analysis of data that helps describe, show
or summarize data in a meaningful way such that, for example, patterns might emerge from
the data. Descriptive statistics do not, however, allow us to make conclusions beyond the
data we have analyzed or reach conclusions regarding any hypotheses we might have made.
They are simply a way to describe our data.
Table 1. The table shows the average salaries for various occupations in the United States in 1999.
$112,760   pediatricians
$106,130   dentists
$100,090   podiatrists
$ 76,140   physicists
$ 53,410   architects
$ 49,720   school, clinical, and counseling psychologists
$ 47,910   flight attendants
$ 39,560   elementary school teachers
$ 38,710   police officers
$ 18,980   floral designers
Descriptive statistics like these offer insight into American society. It is interesting to
note, for example, that the pay for those in education and those who protect the citizens is
a great deal less than the pay received by those who take care of their feet or their teeth.
Two General Types of Statistic That Are Used to Describe Data:
MEASURES OF CENTRAL TENDENCY: these are ways of describing the central
position of a frequency distribution for a group of data. For example, for the
marks scored by 100 students, the frequency distribution is simply the distribution
and pattern of marks from the lowest to the highest. We can describe this
central position using a number of statistics, including the mode, median, and
mean.
MEASURES OF SPREAD: these are ways of summarizing a group of data by
describing how spread out the scores are. For example, the mean score of our
100 students may be 65 out of 100. However, not all students will have scored
65 marks. Rather, their scores will be spread out. Some will be lower and
others higher. Measures of spread help us to summarize how spread out
these scores are. To describe this spread, a number of statistics are available
to us, including the range, quartiles, absolute deviation, variance and
standard deviation.
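The statistics named above can be computed with Python's standard `statistics` module. The following is only a sketch: the five marks are hypothetical values chosen so that the mean is 65, and they are not data from the lecture. The quartiles come from `statistics.quantiles` with its default method; other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical marks for five students (illustrative only, not lecture data).
scores = [55, 60, 65, 65, 80]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
modes = statistics.multimode(scores)   # list of the most frequent value(s)

# Measures of spread
value_range = max(scores) - min(scores)
quartiles = statistics.quantiles(scores, n=4)   # three cut points: Q1, Q2, Q3
mean_abs_dev = sum(abs(x - mean) for x in scores) / len(scores)
variance = statistics.variance(scores)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)              # sample standard deviation

print("mean:", mean, "median:", median, "mode(s):", modes)
print("range:", value_range, "quartiles:", quartiles)
print("mean absolute deviation:", mean_abs_dev)
print("variance:", variance, "standard deviation:", round(std_dev, 2))
```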
Numerical Descriptive Statistics
▪ The most common numerical descriptive
statistic is the mean (or average).
▪ The mean demonstrates a measure of the
central tendency, or central location, of the
data for a variable.
TERMINOLOGIES
▪ Population - The set of all elements of interest
in a particular study.
▪ Sample - A subset of the population.
▪ Statistical inference - The process of using
data obtained from a sample to make estimates
and test hypotheses about the characteristics of a
population.
▪ Census - Collecting data for the entire
population.
▪ Sample survey - Collecting data for a sample.
Analytics
Scientific process of transforming data into insight
for making better decisions.
Types
– Descriptive analysis – Analytical techniques that describe
what happened in the past.
– Predictive analysis
• Analytical techniques that use models constructed
from past data to predict the future.
• Helps assess the impact of one variable on
another.
– Prescriptive analysis – Analytical techniques that yield a
best course of action to take.
Data Warehousing
▪ Organizations obtain large amounts of data on a daily
basis by means of magnetic card readers, bar code
scanners, point of sale terminals, and touch screen
monitors.
▪ Wal-Mart captures data on 20-30 million transactions
per day.
▪ Visa processes 6,800 payment transactions per second.
▪ Capturing, storing, and maintaining the data, referred to
as data warehousing, is a significant undertaking.
Data Mining
▪ Analysis of the data in the warehouse might aid in
decisions that will lead to new strategies and higher
profits for the organization.
▪ Using a combination of procedures from statistics,
mathematics, and computer science, analysts “mine
the data” to convert it into useful information.
▪ The most effective data mining systems use automated
procedures to discover relationships in the data and
predict future outcomes, … prompted by only general,
even vague, queries by the user.
Example: Hudson Auto Repair
The manager of Hudson Auto would
like to have a better understanding of
the cost of parts used in the engine
tune-ups performed in her shop. She
examines 50 customer invoices for
tune-ups. The costs of parts, rounded
to the nearest dollar, are listed on the
next slide.
Example: Hudson Auto Repair
Sample of Parts Cost ($) for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary: Frequency and Percent Frequency
Parts Cost ($)   Frequency   Percent Frequency
50-59             2            4%
60-69            13           26%
70-79            16           32%
80-89             7           14%
90-99             7           14%
100-109           5           10%
TOTAL            50          100%
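The tabular summary can be reproduced from the 50 raw parts costs. The following is a minimal Python sketch; grouping by integer division into classes of width 10 is an assumption that happens to match the class limits used on the slide (50-59, 60-69, and so on).

```python
from collections import Counter

# Sample of parts cost ($) for 50 tune-ups, as listed on the Hudson Auto slide.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Map each cost to the lower limit of its class (50, 60, ..., 100).
frequency = Counter((cost // 10) * 10 for cost in parts_cost)

n = len(parts_cost)
for lower in sorted(frequency):
    count = frequency[lower]
    percent = 100 * count / n
    print(f"{lower}-{lower + 9}: frequency {count:2d}, percent frequency {percent:.0f}%")
print(f"TOTAL: {n}")
```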
Graphical Summary: Bar Chart
Example: Hudson Auto
[Figure: bar chart titled "Hudson Auto"; vertical axis: Frequency; horizontal axis: parts cost classes (50-59 through 90-99 shown)]
Process of Statistical Inference
Example: Hudson Auto
Step 1: The population consists of all tune-ups. The average cost of parts is unknown.
Step 2: A sample of 50 engine tune-ups is examined.
Step 3: The sample data provide a sample average parts cost of $79 per tune-up.
Step 4: The sample average is used to estimate the population average.
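The $79 figure in Step 3 can be checked directly from the 50 invoices. This is a small Python sketch of Steps 2 through 4 only; it computes the point estimate and says nothing about how precise that estimate is.

```python
import statistics

# Step 2: the sample of 50 tune-up parts costs ($) from the Hudson Auto slide.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Step 3: the sample average (sum of the costs divided by the sample size).
sample_mean = statistics.mean(parts_cost)

# Step 4: the sample average serves as the estimate of the unknown population average.
print(f"Sample size: {len(parts_cost)}")
print(f"Sample average parts cost: ${sample_mean:.2f} (about ${round(sample_mean)})")
```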
EXERCISE A
Classify each of the following qualitative variables
as ordinal or nominative. Explain your answers.
QUALITATIVE VARIABLE               CATEGORIES
Statistics course letter grade     A, B, C, D, F
Door choice on Let's Make A Deal   Door #1, Door #2, Door #3
Television show classifications    TV-G, TV-PG, TV-14, TV-MA
Personal computer ownership        Yes, No
Restaurant rating
Income tax filing status           Married filing jointly, Married filing separately,
                                   Single, Head of household, Qualifying widow(er)
EXERCISE A (answers)
Letter Grades: Ordinal
Door Choices: Nominative
TV Classifications: Ordinal
PC Ownership: Nominative
Restaurant Ratings: Ordinal
Filing Status: Nominative.
EXERCISE B
Classify each of the following qualitative variables as ordinal or
nominative. Explain your answers.
QUALITATIVE VARIABLE                            CATEGORIES
Personal computer operating system              Windows XP, Windows Vista, Windows 7, Windows 8
Motion picture classifications                  G, PG, PG-13, R, NC-17, X
Level of education                              Elementary, Middle school, High school, College,
                                                Graduate school
Rankings of the top 10 college football teams   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Exchange on which a stock is traded             AMEX, NYSE, NASDAQ, Other
Zip code                                        45056, 90015, etc.
EXERCISE B (answers)
PC OS: Nominative
Movie Classifications: Ordinal
Education Level: Ordinal
Football Rankings: Ordinal
Stock Exchanges: Nominative
Zip Codes: Nominative.
EXERCISE C
Given the data set: 4 , 10 , 7 , 7 , 6 , 9 , 3 , 8 , 9
Find:
a) the mode (a statistical term that refers to the most frequently
occurring number found in a set of numbers),
b) the median (a simple measure of central tendency),
c) the mean (the usual average),
d) the sample standard deviation (measure that is used to quantify the
amount of variation or dispersion of a set of data values)
e) If we replace the data value 6 in the data set above by 24, will the
standard deviation increase, decrease or stay the same?
EXERCISE C (answers)
• The given data set has 2 modes: 7 and 9
• Ordered data: 3, 4, 6, 7, 7, 8, 9, 9, 10; median = 7
• Mean: m = (3+4+6+7+7+8+9+9+10) / 9 = 63 / 9 = 7
 x    x - m   (x - m)^2
 4     -3        9
10      3        9
 7      0        0
 7      0        0
 6     -1        1
 9      2        4
 3     -4       16
 8      1        1
 9      2        4
SUM = 44
• Sample standard deviation = sqrt(44 / (9 - 1)) = sqrt(5.5) = 2.35 (rounded to 2 decimal places)
• The standard deviation will increase, since 24 is farther away from the
other data values than 6.
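These answers can be checked with Python's standard `statistics` module. The following is a minimal sketch; the variable names are illustrative. `statistics.stdev` computes the sample standard deviation, dividing by n - 1.

```python
import statistics

data = [4, 10, 7, 7, 6, 9, 3, 8, 9]

modes = statistics.multimode(data)   # all values tied for most frequent
median = statistics.median(data)
mean = statistics.mean(data)
s = statistics.stdev(data)           # sample standard deviation (n - 1 divisor)

# Part (e): replace the data value 6 with 24 and recompute.
changed = [24 if x == 6 else x for x in data]
s_changed = statistics.stdev(changed)

print("modes:", sorted(modes))
print("median:", median, "mean:", mean)
print("sample standard deviation:", round(s, 2))
print("after replacing 6 with 24:", round(s_changed, 2))
```

The standard deviation rises from about 2.35 to about 6.08 once 6 is replaced by 24.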
EXERCISE D
Which of these variables are quantitative and
which are qualitative?
a. The dollar amount on an accounts receivable invoice.
b. The net profit for a company in 2009.
c. The stock exchange on which a company’s stock is
traded.
d. The national debt of the United States in 2009.
e. The advertising medium (radio, television, or print)
used to promote a product.
EXERCISE D (answers)
a. Quantitative; dollar amounts correspond to values
on the real number line.
b. Quantitative; net profit is a dollar amount.
c. Qualitative; which stock exchange is a category.
d. Quantitative; national debt is a dollar amount.
e. Qualitative; media is categorized into radio,
television, or print.
EXERCISE E
TABLET                   COST ($)  OPERATING SYSTEM  DISPLAY SIZE (INCHES)  BATTERY LIFE (HOURS)  CPU MANUFACTURER
Acer Iconia W510         599       Windows           10.1                   8.5                   Intel
Amazon Kindle Fire HD    299       Android           8.9                    9                     TI OMAP
Apple iPad 4             499       iOS               9.7                    11                    Apple
HP Envy X2               860       Windows           11.6                   8                     Intel
Lenovo Thinkpad Tablet   668       Windows           10.1                   10.5                  Intel
Microsoft Surface Pro    899       Windows           10.6                   4                     Intel
Motorola Droid Xyboard   530       Android           10.1                   9                     TI OMAP
Samsung Ativ Smart PC    590       Windows           11.6                   7                     Intel
Samsung Galaxy Tab       525       Android           10.1                   10                    Nvidia
Sony Tablet S            360       Android           9.4                    8                     Nvidia
Tablet PC Comparison provides a wide variety of information about tablet computers.
Their website enables consumers to easily compare different tablets using factors such
as cost, type of operating system, display size, battery life, and CPU manufacturer. A
sample of 10 tablet computers is shown above.
a) How many elements are in this data set?
b) How many variables are in this data set?
c) Which variables are categorical and which variables are quantitative?
d) What type of measurement scale is used for each of the variables?
EXERCISE E (answers)
a. 10 elements: the ten tablet computers
b. 5 variables: Cost ($), Operating System, Display Size (inches), Battery Life
(hours), CPU Manufacturer
c. Categorical variables: Operating System and CPU Manufacturer
Quantitative variables: Cost ($), Display Size (inches), and Battery Life (hours)
d. Variable -- Measurement Scale
Cost ($) -- RATIO
Operating System -- NOMINAL
Display Size (inches) -- RATIO
Battery Life (hours) -- RATIO
CPU Manufacturer -- NOMINAL
EXERCISE F
Use the tablet computer data set from Exercise E to answer the following.
A. What is the average cost of the tablets?
B. Compare the average cost of tablets with a Windows operating system to the
average cost of tablets with an Android operating system.
C. What percentage of tablets use a CPU manufactured by TI OMAP?
D. What percentage of tablets use an Android operating system?
EXERCISE F (answers)
a. Average cost = 5829/10 = $582.90
b. Average cost with a Windows operating system =
3616/5 = $723.20
Average cost with an Android operating system =
1714/4 = $428.50
The average cost with a Windows operating system
is much higher.
c. 2 of 10 or 20% use a CPU manufactured by TI
OMAP
d. 4 of 10 or 40% use an Android operating system
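The arithmetic in these answers can be verified with a short script. This is a minimal Python sketch that keeps only the columns the exercise needs (cost, operating system, CPU manufacturer); the tuple layout and the helper `average_cost` are illustrative assumptions.

```python
# (tablet, cost in $, operating system, CPU manufacturer) from the exercise table.
tablets = [
    ("Acer Iconia W510", 599, "Windows", "Intel"),
    ("Amazon Kindle Fire HD", 299, "Android", "TI OMAP"),
    ("Apple iPad 4", 499, "iOS", "Apple"),
    ("HP Envy X2", 860, "Windows", "Intel"),
    ("Lenovo Thinkpad Tablet", 668, "Windows", "Intel"),
    ("Microsoft Surface Pro", 899, "Windows", "Intel"),
    ("Motorola Droid Xyboard", 530, "Android", "TI OMAP"),
    ("Samsung Ativ Smart PC", 590, "Windows", "Intel"),
    ("Samsung Galaxy Tab", 525, "Android", "Nvidia"),
    ("Sony Tablet S", 360, "Android", "Nvidia"),
]


def average_cost(rows):
    """Mean of the cost column for the given rows."""
    return sum(cost for _, cost, _, _ in rows) / len(rows)


windows = [t for t in tablets if t[2] == "Windows"]
android = [t for t in tablets if t[2] == "Android"]
ti_omap = [t for t in tablets if t[3] == "TI OMAP"]

print(f"a. Average cost: ${average_cost(tablets):.2f}")
print(f"b. Windows: ${average_cost(windows):.2f}, Android: ${average_cost(android):.2f}")
print(f"c. TI OMAP share: {100 * len(ti_omap) / len(tablets):.0f}%")
print(f"d. Android share: {100 * len(android) / len(tablets):.0f}%")
```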
Three Case Studies That Illustrate Sampling and
Statistical Inference
• The Cell Phone Case. A bank estimates its cellular phone
costs and decides whether to outsource management of its
wireless resources by studying the calling patterns of its
employees.
• The Marketing Research Case. A bottling company
investigates consumer reaction to a new bottle design for
one of its popular soft drinks.
• The Car Mileage Case. To determine if it qualifies for a
federal tax credit based on fuel economy, an automaker
studies the gas mileage of its new midsize model.
Chapter 1: Seatwork
Work on the attached exercises and be ready
to discuss your answers.
Chapter 2
Descriptive Statistics: Tabular and Graphical Displays
▪ Summarizing Data for a Categorical Variable
• Categorical data use labels or names to
identify categories of like items.
▪ Summarizing Data for a Quantitative Variable
• Quantitative data are numerical values
that indicate how much or how many.
Summarizing Categorical Data
▪ Frequency Distribution
▪ Relative Frequency Distribution
▪ Percent Frequency Distribution
▪ Bar Chart
▪ Pie Chart
Frequency Distribution
▪ A frequency distribution is a tabular summary of
data showing the number (frequency) of
observations in each of several non-overlapping
categories or classes.
▪ The objective is to provide insights about the
data that cannot be quickly obtained by looking
only at the original data.
Example
• Soft drink purchasers were asked to select one among the five popular
soft drinks: Coca-Cola, Diet Coke, Dr. Pepper, Pepsi and Sprite.
• The soft drinks selected by a sample of 20 purchasers are:
Coca-Cola    Pepsi        Dr. Pepper
Diet Coke    Dr. Pepper   Dr. Pepper
Dr. Pepper   Pepsi        Pepsi
Pepsi        Coca-Cola    Diet Coke
Pepsi        Diet Coke    Dr. Pepper
Pepsi        Pepsi        Sprite
Pepsi        Pepsi
Example
Soft Drink   Frequency
Coca-Cola     2
Diet Coke     3
Dr. Pepper    5
Pepsi         9
Sprite        1
Total        20
Relative Frequency Distribution
▪ The relative frequency of a class is the fraction or
proportion of the total number of data items
belonging to a class.
$$\text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n}$$
▪ A relative frequency distribution is a tabular
summary of a set of data showing the relative
frequency for each class.
Example of Relative Frequency Distribution
Suppose that a frequency distribution is based on a sample
of 200 supermarkets. It turns out that 50 of these
supermarkets charge a price between ₱120.00 and
₱130.00 for a kilo of beef. In a relative frequency
distribution, the number assigned to this class would be
0.25 (50/200). In other words, that’s 25 percent of the total.
Percent Frequency Distribution
▪ The percent frequency of a class is the relative
frequency multiplied by 100.
▪ A percent frequency distribution is a tabular
summary of a set of data showing the percent
frequency for each class.
Relative Frequency and Percent Frequency Distributions
Example
Soft Drink   Relative Frequency   Percent Frequency
Coca-Cola    .10                   10
Diet Coke    .15                   15
Dr. Pepper   .25                   25
Pepsi        .45                   45
Sprite       .05                    5
Total        1.00                 100
For Coca-Cola, percent frequency = .10(100) = 10. For Sprite, relative frequency = 1/20 = .05.
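The three distributions for the soft drink sample can be built in a few lines. This is a minimal Python sketch using `collections.Counter`; the percent frequency is computed as 100 × frequency / n, which equals the relative frequency multiplied by 100 and avoids small floating-point rounding artifacts.

```python
from collections import Counter

# The 20 soft drink purchases, in the order they appear on the slide.
purchases = [
    "Coca-Cola", "Pepsi", "Dr. Pepper",
    "Diet Coke", "Dr. Pepper", "Dr. Pepper",
    "Dr. Pepper", "Pepsi", "Pepsi",
    "Pepsi", "Coca-Cola", "Diet Coke",
    "Pepsi", "Diet Coke", "Dr. Pepper",
    "Pepsi", "Pepsi", "Sprite",
    "Pepsi", "Pepsi",
]

frequency = Counter(purchases)
n = len(purchases)

relative = {drink: count / n for drink, count in frequency.items()}
percent = {drink: 100 * count / n for drink, count in frequency.items()}

for drink in sorted(frequency):
    print(f"{drink:<12}{frequency[drink]:>3}{relative[drink]:>8.2f}{percent[drink]:>6.0f}")
print(f"{'Total':<12}{n:>3}{sum(frequency.values()) / n:>8.2f}{sum(percent.values()):>6.0f}")
```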
Bar Chart
▪ A bar chart is a graphical display for depicting
qualitative data.
• On one axis (usually the horizontal axis), we specify the
labels that are used for each of the classes.
• A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the vertical
axis).
• Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
• The bars are separated to emphasize the fact that each
class is separate.
Bar Chart
[Figure: Bar Chart for Purchase of Soft Drink; vertical axis: Frequency; horizontal axis: Soft Drink (Coca-Cola, Diet Coke, Dr. Pepper, Pepsi, Sprite)]
Pie Chart
• The pie chart is a commonly used graphical
display for presenting relative frequency and
percent frequency distributions for categorical
data.
• First draw a circle; then use the relative
frequencies to subdivide the circle into sectors
that correspond to the relative frequency for
each class.
• Since there are 360 degrees in a circle, a class
with a relative frequency of .25 would consume
.25(360) = 90 degrees of the circle.
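The sector size for each class follows from the same rule: relative frequency times 360 degrees. The sketch below applies it to the soft drink frequencies from the earlier frequency distribution; the dictionary name `angles` is illustrative.

```python
# Frequencies from the soft drink frequency distribution (n = 20).
frequency = {"Coca-Cola": 2, "Diet Coke": 3, "Dr. Pepper": 5, "Pepsi": 9, "Sprite": 1}
n = sum(frequency.values())

# Sector angle = relative frequency * 360 = 360 * frequency / n.
angles = {drink: 360 * count / n for drink, count in frequency.items()}

for drink, angle in angles.items():
    print(f"{drink:<12}{angle:>6.0f} degrees")
print(f"{'Total':<12}{sum(angles.values()):>6.0f} degrees")
```

Dr. Pepper, with a relative frequency of .25, takes up 90 degrees, matching the example in the last bullet.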
Pie Chart
[Figure: pie chart of soft drink purchases with sectors for Coca-Cola, Diet Coke, Dr. Pepper, Pepsi, and Sprite]
Pie Chart
Example
Inferences from the Pie Chart
 Almost one-half of the customers surveyed
preferred Pepsi (looking at the left side of the
pie).
 The second preference is for Dr. Pepper with
25% of the customers opting for it.
 Only 5% of the customers opted for Sprite.
Summarizing Quantitative Data
 Frequency Distribution
 Relative Frequency and Percent Frequency Distributions
 Dot Plot
 Histogram
 Cumulative Distributions
 Stem-and-Leaf Display
Example
Sanderson and Clifford, a small public accounting
firm wants to determine time in days required to
complete year end audits. It takes a sample of 20
clients.
Example: Sanderson and Clifford
Year-end Audit Time (in Days)

12 14 19 18
15 15 18 17
20 27 22 23
22 21 33 28
14 18 16 13
The three steps necessary to define the classes
for a frequency distribution with quantitative
data are:
 Step 1 - Determine the number of non-overlapping
classes.
 Step 2 - Determine the width of each class.
 Step 3 - Determine the class limits.
HOW TO FIND NUMBER OF CLASSES
Step 1: Find the number of classes. One rule for finding an appropriate
number of classes says that the number of classes should be the smallest
whole number K that makes the quantity 2^K greater than the number of
measurements in the data set.
For example, in a payment time data set we have 65 measurements.
Because 2^6 = 64 is less than 65 and 2^7 = 128 is greater than 65, we
should use K = 7 classes. Table 2.5 gives the appropriate number of
classes (determined by the 2^K rule) to use for data sets of various sizes.
For completeness all values of n ≥ 1 are included in this table. However,
constructing a histogram with fewer than 16 measurements is not
recommended.
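The rule can be sketched in a few lines of Python. The function name is hypothetical, and starting the search at two classes is an assumption made so that the result agrees with Table 2.5 for very small n.

```python
def number_of_classes(n: int) -> int:
    """Smallest whole number K (at least 2) such that 2**K > n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 2  # assumption: Table 2.5 starts at two classes
    while 2 ** k <= n:
        k += 1
    return k

print(number_of_classes(65))  # 7, as in the payment time example
print(number_of_classes(20))  # 5
```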
HOW TO FIND NUMBER OF CLASSES
Step 2: Find the class length. We find the length of each class by computing
$\text{Approximate class length} = \frac{\text{largest measurement} - \text{smallest measurement}}{\text{number of classes}}$
Given that the largest and smallest payment times in the above example
are 29 days and 10 days, the approximate class length is
(29 − 10)/7 = 2.7143. To obtain a simpler final class length, round this value.
Commonly, the approximate class length is rounded up to the precision of
the data measurements (that is, increased to the next number that has the
same number of decimal places as the data measurements). For instance,
because the payment times are measured to the nearest day, we round
2.7143 days up to 3 days.
TABLE 2.5 Recommended Number of Classes for Data Sets of n Measurements
Number of Classes    Size, n, of the Data Set
2                    1 ≤ n < 4
3                    4 ≤ n < 8
4                    8 ≤ n < 16
5                    16 ≤ n < 32
6                    32 ≤ n < 64
7                    64 ≤ n < 128
8                    128 ≤ n < 256
9                    256 ≤ n < 512
10                   512 ≤ n < 1024
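A minimal sketch of the class length calculation, assuming the data precision is given as a number of decimal places. The function and its `decimals` parameter are hypothetical names, not part of the lecture.

```python
import math

def class_length(largest, smallest, classes, decimals=0):
    """Approximate class length, rounded up to the data's precision."""
    approximate = (largest - smallest) / classes
    scale = 10 ** decimals
    # Round up to the next value with `decimals` decimal places.
    return math.ceil(approximate * scale) / scale

print(class_length(29, 10, 7))  # 3.0 days for the payment time data
```

An approximate length that is already a whole number is left unchanged by this sketch; in practice the analyst may still prefer a slightly wider class.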
Guidelines for Determining the Number of Classes
• Use between 5 and 20 classes (class is a grouping of
values by which data is binned for computation of a frequency
distribution).
• Data sets with a larger number of elements
usually require a larger number of classes.
• Smaller data sets usually require fewer classes.
• The goal is to use enough classes to show the
variation in the data, but not so many classes that
some contain only a few data items.
Guidelines for Determining the Width of Each
Class
• Use classes of equal width.
• $\text{Approximate Class Width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}}$
• Making the classes the same width reduces the
chance of inappropriate interpretations.
Note on Number of Classes and Class Width
• In practice, the number of classes and the appropriate
class width are determined by trial and error.
• Once a possible number of classes is chosen, the
appropriate class width is found.
• The process can be repeated for a different number of
classes.
• Ultimately, the analyst uses judgment to determine the
combination of the number of classes and class width
that provides the best frequency distribution for
summarizing the data.
Guidelines for Determining the Class Limits
• Class limits must be chosen so that each data item
belongs to one and only one class.
• The lower class limit identifies the smallest possible
data value assigned to the class.
• The upper class limit identifies the largest possible data
value assigned to the class.
• The appropriate values for the class limits depend on
the level of accuracy of the data.
• An open-end class requires only a lower class limit or
an upper class limit.
Class Midpoint
• In some cases, we want to know the
midpoints of the classes in a frequency
distribution for quantitative data.
• The class midpoint is the value halfway
between the lower and upper class limits.
• If we choose five classes:
• Approximate Class Width = (33 - 12)/5 = 4.2, rounded up to 5
Time in days Frequency
10-14 4
15-19 8
20-24 5
25-29 2
30-34 1
Total 20
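The frequency distribution above can be reproduced by counting how many audit times fall within each pair of class limits. This is a sketch using the Sanderson and Clifford data; the variable names are illustrative.

```python
audit_times = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
               22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

# Class limits of width 5: each value belongs to exactly one class.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

frequency = {
    (low, high): sum(low <= t <= high for t in audit_times)
    for low, high in classes
}

n = len(audit_times)
for (low, high), count in frequency.items():
    print(f"{low}-{high}: {count:>2}  rel={count / n:.2f}  pct={100 * count / n:.0f}")
```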
Relative Frequency and Percent
Frequency Distributions

Audit time (in days)    Relative Frequency    Percent Frequency
10 – 14                 .20 (4/20)            20 (0.2 * 100)
15 – 19                 .40                   40
20 – 24                 .25                   25
25 – 29                 .10                   10
30 – 34                 .05                   5
Total                   1.00                  100
Relative Frequency and Percent
Frequency Distributions
Insights obtained from the Percent Frequency
Distribution:
• 40% of the audits required from 15 to 19 days.
• Another 25% of the audits required 20 to 24 days.
• Only 5% of the audits required 30 or more days.
Dot Plot
• One of the simplest graphical summaries of
data is a dot plot.
• A horizontal axis shows the range of data
values.
• Then each data value is represented by a dot
placed above the axis.
Dot Plot
Histogram
• Another common graphical display of quantitative
data is a histogram.
• The variable of interest is placed on the horizontal
axis.
• A rectangle is drawn above each class interval with
its height corresponding to the interval’s frequency,
relative frequency, or percent frequency.
• Unlike a bar graph, a histogram has no natural
separation between rectangles of adjacent classes.
Histogram
Cumulative Distributions
• Cumulative frequency distribution - shows the
number of items with values less than or equal to
the upper limit of each class.
• Cumulative relative frequency distribution - shows
the proportion of items with values less
than or equal to the upper limit of each class.
• Cumulative percent frequency distribution - shows
the percentage of items with values less
than or equal to the upper limit of each class.
• The last entry in a cumulative frequency
distribution always equals the total number of
observations.
• The last entry in a cumulative relative frequency
distribution always equals 1.00.
• The last entry in a cumulative percent frequency
distribution always equals 100.
Example: Sanderson and Clifford
Audit time (Days)    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percent Frequency
≤ 14                 4                       .20                              20
≤ 19                 12                      .60                              60
≤ 24                 17                      .85                              85
≤ 29                 19                      .95                              95
≤ 34                 20                      1.00                             100
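The cumulative columns are running totals of the class frequencies. A short sketch, using the frequencies of the audit time distribution:

```python
from itertools import accumulate

upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]   # class frequencies for the audit times

n = sum(frequencies)
cumulative = list(accumulate(frequencies))  # running totals

for limit, cum in zip(upper_limits, cumulative):
    print(f"<= {limit}: {cum:>2}  {cum / n:.2f}  {100 * cum / n:.0f}")
```

The last running total equals n, so the last cumulative relative frequency is 1.00 and the last cumulative percent frequency is 100.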
Stem-and-Leaf Display
• A stem-and-leaf display shows both the rank order and
shape of the distribution of the data.
• It is similar to a histogram on its side, but it has the
advantage of showing the actual data values.
• The first digits of each data item are arranged to the
left of a vertical line.
• To the right of the vertical line we record the last digit
for each item in rank order.
• Each line (row) in the display is referred to as a stem.
• Each digit on a stem is a leaf.
Constructing a Stem-and-Leaf Display
1. Decide what units will be used for the stems and the leaves. Each leaf
must be a single digit and the stem values will consist of appropriate
leading digits. As a general rule, there should be between 5 and 20
stem values.
2. Place the stem values in a column to the left of a vertical line with the
smallest value at the top of the column and the largest value at the
bottom.
3. To the right of the vertical line, enter the leaf for each measurement into
the row corresponding to the proper stem value. Each leaf should be a
single digit—these can be rounded values that were originally more
than one digit if we are using an appropriately defined leaf unit.
4. Rearrange the leaves so that they are in increasing order from left to
right.
Example
The number of questions answered correctly on an
aptitude test by 50 students is analysed here with the help
of a stem-and-leaf display. The relevant data
are given in the following table.
Number of questions answered correctly by
50 students
112 73 126 82 92 115 95 84 68 100
72 92 128 104 108 76 141 119 98 85
69 76 118 132 96 91 81 113 115 94
97 86 127 134 100 102 80 98 106 106
107 73 124 83 92 81 106 75 95 119
Stem | Leaf
   6 | 9 8
   7 | 2 3 6 3 6 5
   8 | 6 2 3 1 1 0 4 5
   9 | 7 2 2 6 2 1 5 8 8 5 4
  10 | 7 4 8 0 2 6 6 0 6
  11 | 2 8 5 9 3 5 9
  12 | 6 8 7 4
  13 | 2 4
  14 | 1
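The construction steps can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes whole-number data and a whole-number leaf unit; digits below the leaf unit are dropped (rounded down). It also applies the final step of arranging the leaves in increasing order.

```python
from collections import defaultdict

scores = [112, 73, 126, 82, 92, 115, 95, 84, 68, 100,
          72, 92, 128, 104, 108, 76, 141, 119, 98, 85,
          69, 76, 118, 132, 96, 91, 81, 113, 115, 94,
          97, 86, 127, 134, 100, 102, 80, 98, 106, 106,
          107, 73, 124, 83, 92, 81, 106, 75, 95, 119]

def stem_and_leaf(values, leaf_unit=1):
    """Map each stem to its single-digit leaves in increasing order."""
    display = defaultdict(list)
    for value in sorted(values):
        scaled = value // leaf_unit      # drop digits below the leaf unit
        stem, leaf = divmod(scaled, 10)  # last remaining digit is the leaf
        display[stem].append(leaf)
    return dict(display)

for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem:>3} | {' '.join(map(str, leaves))}")
```

Calling `stem_and_leaf(values, leaf_unit=10)` on four-digit data keeps the hundreds as stems and the tens digit as the leaf.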
Stretched Stem-and-Leaf Display
• If we believe the original stem-and-leaf display
has condensed the data too much, we can
stretch the display vertically by using two stems
for each leading digit(s).
• Whenever a stem value is stated twice, the first
value corresponds to leaf values of 0 - 4, and
the second value corresponds to leaf values of
5 - 9.
Stretched Stem-and-Leaf Display
 6 | 8 9
 7 | 2 3 3
 7 | 5 6 6
 8 | 0 1 1 2 3 4
 8 | 5 6
 9 | 1 2 2 2 4
 9 | 5 5 6 7 8 8
10 | 0 0 2 4
10 | 6 6 6 7 8
11 | 2 3
11 | 5 5 8 9 9
12 | 4
12 | 6 7 8
13 | 2 4
14 | 1
Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.
• The leaf unit indicates how to multiply the stem-and-leaf
numbers in order to approximate the original data.
Example: Leaf Unit = 0.1
If we have data with values such as
8.6 11.7 9.4 9.1 10.2 11.0 8.8
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is
represented as an 8.
• EXERCISES SUMMARIZING DATA FOR A
CATEGORICAL VARIABLE
• EXERCISES SUMMARIZING DATA FOR
QUANTITATIVE VARIABLE
Chapter 2: Descriptive Statistics: Tabular and
Graphical Displays (Part B)
• Summarizing Data for Two Variables Using Tables
• Summarizing Data for Two Variables Using
Graphical Displays
Summarizing Data for Two Variables
using Tables
• Crosstabulation is a method for summarizing
the data for two variables.
Crosstabulation
CROSSTABULATION can be used when:
• one variable is categorical and the other is
quantitative,
• both variables are categorical, or
• both variables are quantitative.
• The left and top margin labels define the
classes for the two variables.
Crosstabulation
Example: Zagat’s Restaurant Review
Crosstabulation of quality rating and meal price
data for 300 Los Angeles restaurants is given
here.
Crosstabulation
Insights Gained from Preceding Crosstabulation
The greatest number of restaurants in the sample (64)
have a very good rating and a meal price in the
$20-29 range.
Only 2 restaurants have an excellent rating and a
meal price in the $10-19 range.
Crosstabulation
Crosstabulation: Row or Column
Percentages
Converting the entries in the table into row
percentages or column percentages can provide
additional insight about the relationship
between the two variables.
Crosstabulation: Row Percentages
                   Meal Price
Quality Rating     $10-19    $20-29    $30-39    $40-49    Total
Good               50        47.6      2.4       0         100
Very Good          22.7      42.7      30.6      4         100
Excellent          3         21.2      42.4      33.4      100
Each entry is a cell count divided by its row total. For example, Good restaurants charging
a meal price of $10-19 / total number of Good restaurants, i.e. 42/84 * 100 = 50%.
Crosstabulation: Simpson’s Paradox
• Data in two or more crosstabulations are
often aggregated to produce a summary
crosstabulation.
• We must be careful in drawing conclusions
about the relationship between the two
variables in the aggregated crosstabulation.
• In some cases the conclusions based upon
an aggregated crosstabulation can be
completely reversed if we look at the
unaggregated data. The reversal of
conclusions based on aggregate and
unaggregated data is called Simpson’s
paradox.
[Figure: Average scores of male and female students in two schools.]
A simple calculation shows that the overall average scores in these two
schools are 83.2 and 81.8, respectively. School Alpha won on
the average score.
Summarizing Data for Two Variables
Using Graphical Display
• Scatter diagrams and trendlines are useful in
exploring the relationship between two
variables.
Scatter Diagram and Trendline
A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
– One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
– The general pattern of the plotted points suggests the
overall relationship between the variables.
– A trendline provides an approximation of the
relationship.
Scatter Diagram
 A Positive Relationship
Scatter Diagram
 A Negative Relationship
Scatter Diagram
• No Apparent Relationship
Scatter Diagram
Example:
A stereo and sound equipment store in San Francisco wants to analyze
the relationship between sales and advertising. Sample data for ten
weeks, with sales in hundreds of dollars, are shown below:
Scatter Diagram and Trendline for the
Stereo and Sound Equipment Store
Scatter Diagram
Example
Insights Gained from the Stereo and Sound
Equipment store Scatter Diagram
The scatter diagram indicates a positive
relationship between the number of commercials
and sales.
Higher sales are associated with a greater number of
commercials.
The relationship is not perfect; not all plotted points in
the scatter diagram lie on a straight line.
Side-by-Side Bar Chart
– A side-by-side bar chart is a graphical display for
depicting multiple bar charts on the same display.
– Each cluster of bars represents one value of the
first variable.
– Each bar within a cluster represents one value of
the second variable.
Side-by-Side Bar Chart
[Figure: Side-by-Side Bar Chart for the Quality and Price Meal Data. Horizontal axis: Meal Price ($) in classes $10-19, $20-29, $30-39, $40-49; vertical axis: Frequency (0 to 60); one bar per quality rating (Good, Very Good, Excellent).]
Stacked Bar Chart
– A stacked bar chart is another way to display and
compare two variables on the same display.
– It is a bar chart in which each bar is broken into
rectangular segments of a different color.
– If percentage frequencies are displayed, all bars
will be of the same height (or length), extending
to the 100% mark.
Stacked Bar Chart
[Figure: Stacked bar chart for the quality rating and meal price data. Horizontal axis: Meal Price ($) in classes $10-19, $20-29, $30-39, $40-49; vertical axis: Percentage frequency (0% to 100%); segments for Good, Very Good, and Excellent.]
Skewed and Symmetric Data
Skewed to the Right
Data that are skewed to the right have a long tail that extends to
the right. An alternate way of talking about a data set skewed to
the right is to say that it is positively skewed. In this situation, the
mean and the median are both greater than the mode. As a
general rule, most of the time for data skewed to the right, the
mean will be greater than the median. In summary, for a data set
skewed to the right:
• Always: mean greater than the mode
• Always: median greater than the mode
• Most of the time: mean greater than median
Skewed to the Left
The situation reverses itself when we deal with data
skewed to the left. Data that are skewed to the left
have a long tail that extends to the left. An
alternate way of talking about a data set skewed to
the left is to say that it is negatively skewed. In this
situation, the mean and the median are both less
than the mode. As a general rule, most of the time
for data skewed to the left, the mean will be less
than the median. In summary, for a data set skewed
to the left:
• Always: mean less than the mode
• Always: median less than the mode
• Most of the time: mean less than median
Symmetric
If the data are symmetric, they have about the
same shape on either side of the middle. In other
words, if you fold the histogram in half, it looks
about the same on both sides.
Histogram C in the figure shows an example of
symmetric data. With symmetric data, the mean
and median are close together.
Data Dashboard Example
Tabular and Graphical Displays
Summary
Data
Categorical Data
Tabular Displays
• Frequency Distribution
• Relative Frequency Distribution
• Percent Frequency Distribution
• Crosstabulation
Graphical Displays
• Bar Chart
• Pie Chart
• Side-by-Side Bar Chart
• Stacked Bar Chart
Quantitative Data
Tabular Displays
• Frequency Dist.
• Rel. Freq. Dist.
• % Freq. Dist.
• Cum. Freq. Dist.
• Cum. Rel. Freq. Dist.
• Cum. % Freq. Dist.
• Crosstabulation
Graphical Displays
• Dot Plot
• Histogram
• Stem-and-Leaf Display
• Scatter Diagram
Chapter 3: Descriptive Statistics:
Numerical Measures
David R. Anderson, Dennis J. Sweeney, Thomas A. Williams, Jeffrey D. Camm, James J. Cochran
Essentials of Modern Business Statistics
7th Edition
MEASURES of LOCATION
Measures of location summarize a list of numbers by a "typical" value.
The three most common measures of location are the mean, the median,
and the mode.
The mean is the sum of the values, divided by the number of values. It has
the smallest possible sum of squared differences from members of the list.
The median is the middle value in the sorted list. It is the smallest number
that is at least as big as at least half the values in the list. It has the smallest
possible sum of absolute differences from members of the list.
The mode is the most frequent value in the list (or one of the most frequent
values, if there are more than one). It differs from the fewest possible
members of the list.
MEAN
• The mean is sometimes referred to as the arithmetic mean.
• Perhaps the most important measure of location is the mean,
or average value, for a variable. The mean provides a
measure of central location for the data. If the data are for a
sample, the mean is denoted by x̄; if the data are for a
population, the mean is denoted by the Greek letter µ.
SAMPLE MEAN: $\bar{x} = \frac{\sum x_i}{n}$
MEAN (Example)
Table 2.9: Data on Home Sales in a Cincinnati, Ohio, Suburb
Home Sale    Selling Price ($)
1            138,000
2            254,000
3            186,000
4            257,500
5            108,000
6            254,000
7            138,000
8            298,000
9            199,500
10           208,000
11           142,000
12           456,250
MEAN (Example)
Computation of Sample Mean:
• Illustration: Computation of the mean home selling
price for the sample of 12 home sales.
$\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \cdots + x_{12}}{12}$
$= \frac{138,000 + 254,000 + \cdots + 456,250}{12}$
$= \frac{2,639,250}{12} = 219,937.50$
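The same computation as a short Python sketch, using the twelve selling prices from Table 2.9:

```python
prices = [138_000, 254_000, 186_000, 257_500, 108_000, 254_000,
          138_000, 298_000, 199_500, 208_000, 142_000, 456_250]

# Sample mean: sum of the values divided by the number of values.
sample_mean = sum(prices) / len(prices)

print(sum(prices))   # 2639250
print(sample_mean)   # 219937.5
```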
MEDIAN
The median is another measure of central
location.
Arrange the data in ascending order (smallest
value to largest value).
(a)For an odd number of observations, the
median is the middle value.
(b)For an even number of observations, the
median is the average of the two middle
values.
MEDIAN: EXAMPLE
Computation of Sample Median:
– Illustration: When the number of observations is odd,
– Consider the class size data for a sample of five
college classes:
46 54 42 46 32
– Arrange the class size data in ascending order:
32 42 46 46 54
– Middlemost value in the data set = 46.
– Median is 46.
MEDIAN EXAMPLE
Computation of Sample Median:
Illustration: When the number of observations is even:
– Consider the data on home sales in Cincinnati, Ohio,
Suburb (Table 2.9).
– Arrange the data in ascending order:
108,000 138,000 138,000 142,000 186,000 199,500
208,000 254,000 254,000 257,500 298,000 456,250
• Median = average of two middle values:
$\text{Median} = \frac{199,500 + 208,000}{2} = 203,750$
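Both median cases can be checked with Python's standard `statistics.median`, which sorts the data and averages the two middle values when the count is even. The variable names are illustrative.

```python
from statistics import median

class_sizes = [46, 54, 42, 46, 32]   # odd number of observations
prices = [138_000, 254_000, 186_000, 257_500, 108_000, 254_000,
          138_000, 298_000, 199_500, 208_000, 142_000, 456_250]  # even

print(median(class_sizes))  # 46
print(median(prices))       # 203750.0
```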
MODE
Another measure of location is the mode.
The mode is the value that occurs with greatest frequency.
Example 2: In a crash test, 11 cars were tested to determine what impact speed
was required to obtain minimal bumper damage. Find the mode of the speeds
given in miles per hour below.
24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24
Solution: Ordering the data from least to greatest, we get:

15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26
Answer: Since both 18 and 24 occur three times, the modes are 18 and 24 miles
per hour. This data set is bimodal.
• Multimodal data: Data contain at least two modes.

• Bimodal data: Data contain exactly two modes.
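A minimal sketch of finding every mode in the crash test data, using `statistics.multimode` (available in Python 3.8 and later), which returns all values tied for the highest frequency:

```python
from statistics import multimode

speeds = [24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24]

# multimode returns every value that occurs with the greatest frequency.
modes = sorted(multimode(speeds))

print(modes)  # [18, 24]
```

Two modes are returned, so this data set is bimodal.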
WEIGHTED MEAN
The weighted mean is a type of mean that is
calculated by multiplying the weight (or probability)
associated with a particular event or outcome with its
associated quantitative outcome and then summing
all the products together.
$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
where $w_i$ is the weight for observation $i$.
Weighted Mean Example
Narciso wants to buy a new camera, and decides
on the following rating system: Image Quality
50%; Battery Life 30%; Zoom Range 20%
The Sonu camera gets 8 (out of 10) for Image Quality, 6 for
Battery Life and 7 for Zoom Range
The Conan camera gets 9 for Image Quality, 4 for Battery

Life and 6 for Zoom Range
Which camera is best?

Weighted Mean Example
SOLUTION:
Sonu: 0.5 × 8 + 0.3 × 6 + 0.2 × 7 = 4 + 1.8 + 1.4 = 7.2
Conan: 0.5 × 9 + 0.3 × 4 + 0.2 × 6 = 4.5 + 1.2 + 1.2 = 6.9
Narciso decides to buy the Sonu.
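The camera comparison can be sketched as a small function. The function name is hypothetical; dividing by the sum of the weights keeps it correct even when the weights do not add up to 1.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

weights = [0.5, 0.3, 0.2]   # Image Quality, Battery Life, Zoom Range

sonu = weighted_mean([8, 6, 7], weights)
conan = weighted_mean([9, 4, 6], weights)

print(round(sonu, 2), round(conan, 2))  # 7.2 6.9
```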

GEOMETRIC MEAN
• The Geometric Mean is a special type of average where we
multiply the numbers together and then take a square root (for
two numbers), cube root (for three numbers) etc.
• The geometric mean is a measure of location that is calculated by
finding the nth root of the product of n values.
• Applications of the geometric mean are most common in business
and finance, where it is used when dealing with
percentages to calculate growth rates and returns
on portfolios of securities.
Formula: $\bar{x}_g = \sqrt[n]{x_1 x_2 x_3 \cdots x_n}$
GEOMETRIC MEAN
RATE OF RETURN: $R_1 = \frac{\text{Year}_2 - \text{Year}_1}{\text{Year}_1}$
GEOMETRIC MEAN: $R_g = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1$
VALUE OF INVESTMENT AFTER N YEARS: $\text{Investment} \times (1 + R_g)^n$
GEOMETRIC MEAN SAMPLE
The average person’s monthly salary in a certain town
jumped from $2,500 to $5,000 over the course of ten years.
Using the geometric mean, what is the average yearly
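increase?
The geometric mean of the two salaries is
$\bar{x}_g = \sqrt{2500 \times 5000} = \$3,535.53$,
which is the geometric midpoint of the two values, not a yearly increase.
The average yearly growth rate over the 10 years follows from the
geometric mean rate of return:
$R_g = \sqrt[10]{5000 / 2500} - 1 = 2^{1/10} - 1 \approx 0.0718$,
that is, about 7.2% per year.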
GEOMETRIC MEAN EXAMPLE
The following table gives the value of the Dow Jones Industrial Average (DJIA),
NASDAQ, and the S&P 500 on the first day of trading for the years 2008 through
2010
Year DJIA NASDAQ S&P 500
2008 13,043.96 2,609.63 1,447.16
2009 9,034.69 1,632.21 931.80
2010 10,583.96 2,308.42 1,132.99
A. For each stock index, compute the rate of return from 2008 to 2009 and
from 2009 to 2010.
B. Calculate the geometric mean rate of return for each stock index for the
period from 2008 to 2010.
C. Suppose that an investment of $100, 000 is made in 2008 and that the
portfolio performs with returns equal to those of the DJIA. What is the
investment worth in 2010?
D. Repeat part c for the NASDAQ and the S&P 500.
GEOMETRIC MEAN EXAMPLE
(A.) FOR EACH STOCK INDEX, COMPUTE THE RATE OF RETURN FROM 2008 TO 2009 AND FROM 2009 TO 2010.
             DJIA      NASDAQ    S&P 500
2008-2009    -30.7%    -37.5%    -35.6%
2009-2010    17.1%     41.4%     21.6%
(B.) CALCULATE THE GEOMETRIC MEAN RATE OF RETURN FOR EACH STOCK INDEX FOR THE PERIOD FROM 2008 TO 2010.
DJIA   = ((1-0.307)*(1+0.171))^(1/2)-1 = -0.099
NASDAQ = ((1-0.375)*(1+0.414))^(1/2)-1 = -0.060
S&P    = ((1-0.356)*(1+0.216))^(1/2)-1 = -0.115
(C.) SUPPOSE THAT AN INVESTMENT OF $100,000 IS MADE IN 2008 AND THAT THE PORTFOLIO PERFORMS WITH RETURNS EQUAL TO THOSE OF THE DJIA. WHAT IS THE INVESTMENT WORTH IN 2010?
DJIA = 100000*((1+Rg)^(2)) = 100000*(1-0.307)*(1+0.171) = $ 81,150.30 (using the unrounded Rg from part B)
(D.) REPEAT PART C FOR THE NASDAQ AND THE S&P 500.
Index     Investment      Value in 2010
NASDAQ    $ 100,000.00    $ 88,375.00
S&P       $ 100,000.00    $ 78,310.40
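The DJIA calculation can be sketched in Python as below. The function names are hypothetical. Because the sketch keeps the unrounded rates of return, its results differ slightly from the slide values, which round each rate to three decimal places first.

```python
import math

def rates_of_return(values):
    """Year-over-year rate of return: (later - earlier) / earlier."""
    return [(later - earlier) / earlier
            for earlier, later in zip(values, values[1:])]

def geometric_mean_return(rates):
    """nth root of the product of (1 + R_i), minus 1."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

djia = [13_043.96, 9_034.69, 10_583.96]   # first trading day, 2008 to 2010

rates = rates_of_return(djia)
r_g = geometric_mean_return(rates)
value_2010 = 100_000 * (1 + r_g) ** len(rates)

print([round(r, 3) for r in rates])  # [-0.307, 0.171]
print(round(r_g, 3))                 # -0.099
print(round(value_2010, 2))
```

Compounding at the geometric mean rate for n years gives the same end value as applying each yearly return in turn, which is why the geometric mean is the appropriate average for growth rates.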
Refer to the first exercise. The values of the DJIA
on the first day of trading in 2005, 2006, and 2007
were 10,729.43, 10,847.41, and 12,474.52.
a. Calculate the geometric mean rate of return for

the DJIA from 2005 to 2010.
b. If an investment of $100, 000 is made in 2005
and the portfolio performs with returns equal to
those of the DJIA, what is the investment worth
in 2010?
a. 2005 - 2006: R1 = 100 (10,847.41 – 10,729.43) / 10,729.43 = 1.1%
2006 - 2007: R2 = 100 (12,474.52 – 10,847.41) / 10,847.41 = 15.0%
2007 - 2008: R3 = 100 (13,043.96 – 12,474.52) / 12,474.52 = 4.6%
2008 - 2009: R4 = -30.7% (from part A of the first exercise)
2009 - 2010: R5 = 17.1%
$R_g = \sqrt[5]{(1 + 0.011)(1 + 0.150)(1 + 0.046)(1 - 0.307)(1 + 0.171)} - 1 = -0.00263$
b. Value of $100,000 investment = ($100,000)(1 - 0.00263)^5 =
($100,000)(0.99737)^5 = ($100,000)(0.986919) = $98,691.90
PERCENTILE
A PERCENTILE provides information about how the data
are spread over the interval from the smallest value to the
largest value. For a data set containing n observations,
the pth percentile divides the data into two parts:
Approximately p% of the observations are less than the
pth percentile, and approximately (100 − p)% of the
observations are greater than the pth percentile.
The location of the pth percentile in the sorted data is
$L_p = \frac{p}{100}(n + 1)$
where p is the percentile and n is the sample size.
PERCENTILE EXAMPLE
Consider the following 25 test scores:
72, 54, 56, 61, 62, 66, 68,
43, 69, 69, 70, 71, 77, 78, 79, 85,
87, 88, 89, 93, 95, 96, 98, 99,
99. Find the 60th percentile.
PERCENTILE EXAMPLE
STEP 1: Sort the data set.
POSITION    TEST SCORE
1           43
2           54
3           56
4           61
5           62
6           66
7           68
8           69
9           69
10          70
11          71
12          72
13          77
14          78
15          79
16          85
17          87
18          88
19          89
20          93
21          95
22          96
23          98
24          99
25          99
STEP 2: Get the location of the required percentile using the formula.
$L_{60} = \frac{p}{100}(n + 1) = \frac{60}{100}(25 + 1) = 15.6$
STEP 3: Get the value. Location 15.6 lies 0.6 of the way from position 15
(score 79) to position 16 (score 85). The difference is 85 − 79 = 6, so
$P_{60} = 79 + (0.60 \times 6) = 82.6$
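The three steps can be sketched as a Python function. The function name is hypothetical, and clamping to the smallest or largest value when the location falls outside positions 1 to n is an assumption, since the lecture does not cover that case.

```python
def percentile(values, p):
    """pth percentile using the location L_p = (p / 100) * (n + 1)."""
    data = sorted(values)                 # STEP 1: sort
    n = len(data)
    location = p / 100 * (n + 1)          # STEP 2: location
    position = int(location)              # whole part, 1-based position
    fraction = location - position
    if position < 1:                      # assumption: clamp at the ends
        return data[0]
    if position >= n:
        return data[-1]
    lower, upper = data[position - 1], data[position]
    return lower + fraction * (upper - lower)   # STEP 3: interpolate

scores = [72, 54, 56, 61, 62, 66, 68, 43, 69, 69, 70, 71, 77,
          78, 79, 85, 87, 88, 89, 93, 95, 96, 98, 99, 99]

print(round(percentile(scores, 60), 1))  # 82.6
```

Other software may use a different location rule, so percentile values can differ slightly between tools.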
QUARTILES
Quartiles are just specific percentiles; thus, the steps for computing percentiles can
be applied directly in the computation of quartiles.
It is often desirable to divide a data set into four parts, with each part containing
approximately one-fourth, or 25%, of the observations. These division points are
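referred to as the quartiles and are defined as follows.
Q1 = first quartile, or 25th percentile
Q2 = second quartile, or 50th percentile (also the median)
Q3 = third quartile, or 75th percentile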
SAMPLES and EXERCISES (4 EXERCISES)
Measures of Variability
Measures of Variability are statistics that
describe the amount of difference and spread
in a data set. These measures include variance,
standard deviation, and standard error of the
mean. If the numbers corresponding to these
statistics are high it means that the scores or
values in our data set are widely spread out and
not tightly centered around the mean.
Range
The difference between the lowest and highest values.
In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9,
so the range is 9 − 3 = 6.
Although the range is the easiest of the measures of variability to compute, it is seldom used
as the only measure. The reason is that the range is based on only two of the observations
and thus is highly influenced by extreme values. For example, in a sample of 12 monthly starting
salaries with a range of 615, suppose the highest paid graduate received a
starting salary of $10,000 per month. In this case, the range would be 10,000 − 3710 = 6290 rather than 615. This
large value for the range would not be especially descriptive of the variability in the data
because 11 of the 12 starting salaries are closely grouped between 3710 and 4130.
Interquartile Range
A measure of variability that overcomes the dependency on extreme values is the
interquartile range (IQR). This measure of variability is the difference between the
third quartile, Q3, and the first quartile, Q1. In other words, the interquartile range is
the range for the middle 50% of the data.
STEP 1: Arrange the data set in ascending order.
STEP 2: Split the data set into two halves.
STEP 3: Get the median of the first half (Q1) and the median of the second half (Q3).
STEP 4: IQR = Q3 – Q1
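The four steps can be sketched as follows, using the home selling prices from Table 2.9. The function name is hypothetical, and leaving the middle value out of both halves when n is odd is an assumption, since the steps do not say how to split an odd-sized data set. Quartiles found this way can differ slightly from those given by the percentile location formula.

```python
from statistics import median

def interquartile_range(values):
    """Return (Q1, Q3, IQR) using the split-in-halves steps."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    data = sorted(values)            # STEP 1
    half = len(data) // 2            # STEP 2 (middle value dropped if n is odd)
    q1 = median(data[:half])         # STEP 3
    q3 = median(data[-half:])
    return q1, q3, q3 - q1           # STEP 4

prices = [138_000, 254_000, 186_000, 257_500, 108_000, 254_000,
          138_000, 298_000, 199_500, 208_000, 142_000, 456_250]

print(interquartile_range(prices))  # (140000.0, 255750.0, 115750.0)
```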
Variance
The variance is a measure of variability that utilizes all the data. The variance is based on the
difference between the value of each observation Xi and the mean. The difference between each
Xi and the mean (x̄ for a sample, µ for a population) is called a deviation about the mean. For a
sample, a deviation about the mean is written (Xi - x̄) ; for a population, it is written (Xi - µ). In
the computation of the variance, the deviations about the mean are squared.
To calculate the variance follow these steps:
• Work out the mean (the simple average of the numbers).
• Then for each number: subtract the mean and square the result
(the squared difference).
• Then work out the average of those squared differences. For a sample,
divide the sum of the squared differences by n − 1 instead of n.
POPULATION VARIANCE: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
SAMPLE VARIANCE: $s^2 = \frac{\sum (X_i - \bar{x})^2}{n - 1}$
NOTE: The sample variance $s^2$ is a point estimator of the population variance $\sigma^2$.
Standard Deviation
The standard deviation is defined to be the positive square
root of the variance. Following the notation we adopted for
a sample variance and a population variance, we use s to
denote the sample standard deviation and σ to denote the
population standard deviation.
Coefficient of Variation
The coefficient of variation
is a relative measure of
variability; it measures the
standard deviation relative
to the mean.
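The variance, standard deviation, and coefficient of variation can be worked through together in a short sketch. It reuses the class size sample from the median example, and it assumes the usual definition of the coefficient of variation as the standard deviation divided by the mean, expressed as a percentage.

```python
from statistics import mean, stdev, variance

class_sizes = [46, 54, 42, 46, 32]   # class size sample from the median example

x_bar = mean(class_sizes)
squared_deviations = [(x - x_bar) ** 2 for x in class_sizes]

# Sample variance divides by n - 1; population variance divides by N.
sample_variance = sum(squared_deviations) / (len(class_sizes) - 1)
population_variance = sum(squared_deviations) / len(class_sizes)

sample_sd = sample_variance ** 0.5          # positive square root of the variance
cv_percent = sample_sd / x_bar * 100        # standard deviation relative to the mean

print(x_bar, sample_variance, sample_sd)    # 44 64.0 8.0
print(round(cv_percent, 1))                 # 18.2
```

The hand computation agrees with the standard library: `variance(class_sizes)` and `stdev(class_sizes)` give the same sample variance and sample standard deviation.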
EXERCISES
• Exercise
Measures of Distribution Shape, Relative
Location, and Detecting Outliers
A histogram provides a graphical display showing the shape of a
distribution. An important numerical measure of the shape of a
distribution is called skewness.
Skewness is a measure of symmetry, or more precisely, the lack
of symmetry. A distribution, or data set, is symmetric if it looks
the same to the left and right of the center point.
Pearson’s Coefficient of Skewness
Pearson’s Coefficient of Skewness #1 (Mode):
$Sk_1 = \frac{\bar{x} - Mo}{s}$
Where x̄ = the mean; Mo = the mode; and s = the standard deviation
for the sample.
Step 1: Subtract the mode from the mean
Step 2: Divide by the standard deviation
Pearson’s Coefficient of Skewness #2 (Median):
$Sk_2 = \frac{3(\bar{x} - Md)}{s}$
Where x̄ = the mean; Md = the median and s = the standard deviation
for the sample. It is generally used when you don’t know the mode.
Step 1: Subtract the median from the mean
Step 2: Multiply Step 1 by 3
Step 3: Divide by the standard deviation
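Both coefficients follow directly from the listed steps. The sketch below applies them to the class size sample from the median example; the function names are hypothetical.

```python
from statistics import mean, median, mode, stdev

def pearson_skewness_mode(values):
    """(mean - mode) / sample standard deviation."""
    return (mean(values) - mode(values)) / stdev(values)

def pearson_skewness_median(values):
    """3 * (mean - median) / sample standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

class_sizes = [46, 54, 42, 46, 32]   # mean 44, median 46, mode 46, s = 8

print(round(pearson_skewness_mode(class_sizes), 2))    # -0.25
print(round(pearson_skewness_median(class_sizes), 2))  # -0.75
```

Both values are negative, which points to a slight skew to the left for this small sample.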
CAUTION
Caution: Pearson’s first coefficient of skewness uses the mode. Therefore, if the mode
is made up of too few pieces of data it won’t be a stable measure of central
tendency. For example, the mode in both these sets of data is 9:
1 2 3 4 5 6 7 8 9 9.
1 2 3 4 5 6 7 8 9 9 9 9 9 9 9 9 9 9 9 9 10 12 12 13.
In the first set of data, the mode only appears twice. This isn’t a good measure of central
tendency so you would be cautioned not to use Pearson’s coefficient of skewness. The
second set of data has a more stable set (the mode appears 12 times). Therefore,
Pearson’s coefficient of skewness will likely give you a reasonable result.
Interpretation In general:
• The direction of skewness is given by the sign.
• The coefficient compares the sample distribution with a normal distribution.
The larger the absolute value, the more the distribution differs from a normal
distribution.
• A value of zero means no skewness at all.
• A large negative value means the distribution is negatively skewed.
• A large positive value means the distribution is positively skewed.
Z-Score
$z_i = \frac{x_i - \bar{x}}{s}$
The z-score, $z_i$, can be interpreted as the number of standard deviations $x_i$ is from the
mean $\bar{x}$.
A z-score greater than zero occurs for observations with a value greater than the
mean, and a z-score less than zero occurs for observations with a value less than the
mean. A z-score of zero indicates that the value of the observation is equal to the
mean.
Z-SCORE
How many standard deviations a value is
from the mean.
In this example, the value 1.7 is 2 standard
deviations away from the mean of 1.4, so 1.7
has a z-score of 2.
Similarly 1.85 has a z-score of 3.
To convert a value to a Standard Score ("z-score"):
· first subtract the mean,
· then divide by the standard deviation
Excel Function:
STANDARDIZE(x,mean,standard_deviation)
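The same conversion can be sketched in Python. The function name `z_score` is hypothetical, and the standard deviation of 0.15 is an assumption inferred from the example (1.7 being 2 standard deviations above a mean of 1.4); the slide does not state it.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

mean, sd = 1.4, 0.15  # sd = 0.15 is inferred from the example, not stated

print(round(z_score(1.7, mean, sd), 2))   # 2.0
print(round(z_score(1.85, mean, sd), 2))  # 3.0
```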
Chebyshev’s Theorem
At least (1-1/z²) of the data values must be within z standard
deviations of the mean, where z is any value greater than 1.
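Some of the implications of this theorem, with z = 2, 3 and 4
standard deviations, follow.
• At least .75, or 75%, of the data values must be within z = 2 standard
deviations of the mean.
• At least .89, or 89%, of the data values must be within z = 3 standard
deviations of the mean.
• At least .94, or 94%, of the data values must be within z = 4 standard
deviations of the mean.
Chebyshev’s theorem requires z > 1, but z need not be an integer.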
Chebyshev’s Theorem
For an example using Chebyshev’s theorem, suppose that the midterm test scores
for 100 students in a college business statistics course had a mean of 70 and a
standard deviation of 5. How many students had test scores between 60 and
80? How many students had test scores between 58 and 82?
For the test scores between 60 and 80, we note that 60 is two standard
deviations below the mean and 80 is two standard deviations above the mean.
Using Chebyshev’s theorem, we see that at least .75, or at least 75%, of the
observations must have values within two standard deviations of the mean. Thus,
at least 75% of the students must have scored between 60 and 80.
For the test scores between 58 and 82, we see that (58-70)/5 = -2.4 indicates 58 is 2.4
standard deviations below the mean and that (82-70)/5 = 2.4 indicates 82 is 2.4
standard deviations above the mean. Applying Chebyshev’s theorem with z = 2.4, we
have
(1 - 1/z²) = (1 - 1/(2.4)²) = .826
At least 82.6% of the students must have test scores between 58 and 82.
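A short sketch of the same calculation, using the test-score example (mean 70, standard deviation 5). The helper name `chebyshev_bound` is an assumption made for illustration.

```python
def chebyshev_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

mean, sd = 70, 5
for low, high in [(60, 80), (58, 82)]:
    z = (high - mean) / sd  # the interval is symmetric about the mean
    print(f"{low} to {high}: z = {z}, at least {chebyshev_bound(z):.1%}")
```

The bound holds for any distribution shape, which is why it is weaker than the percentages the empirical rule gives for bell-shaped data.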
EMPIRICAL RULE
The empirical rule is a statistical rule which states that for a
normal distribution, almost all data will fall within three standard
deviations of the mean.
The empirical rule shows that 68% will fall within the first
standard deviation, 95% within the first two standard
deviations, and 99.7% will fall within the first three standard
deviations of the distribution's average.
The empirical rule is often referred to as the three-sigma rule
or the 68-95-99.7 rule.
EMPIRICAL RULE EXAMPLE
A study found that students spend an average of 300 minutes on the
Internet, with a standard deviation of 102 minutes. Answer the
following questions assuming a bell-shaped distribution and
using the empirical rule.
a) What percentage of students use the Internet for more
than 402 minutes?
b) What percentage of students use the Internet for more
than 504 minutes?
c) What percentage of students use the Internet between 198
minutes and 300 minutes?
EMPIRICAL RULE EXAMPLE
a) 402 is one standard deviation above the mean. The empirical rule states that 68% of
data values will be within one standard deviation of the mean. Because a bell-
shaped distribution is symmetric, 0.5×(1-68%) = 16% of the data values will be greater
than (mean + 1 × standard deviation) 402. 16% of students use internet for more
than 402 minutes.
b) 504 is two standard deviations above the mean. The empirical rule states that 95% of
data values will be within two standard deviations of the mean. Because a bell-
shaped distribution is symmetric, 0.5×(1-95%) = 2.5% of the data values will be greater
than (mean + 2×standard deviation) 504. 2.5% of students use internet for more
than 504 minutes.
c) 198 is one standard deviation below the mean. The empirical rule states that 68% of
data values will be within one standard deviation of the mean, and we expect that
0.5× (1 - 68%) = 16% of data values will be below one standard deviation below the
mean. 300 is the mean, so we expect that 50% of the data values will be below the
mean. Therefore, we expect 50% - 16% = 34% of the data values will be between the
mean 300 and one standard deviation below the mean 198. 34% of students use
internet between 198 minutes and 300 minutes.
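The three answers can be reproduced with a few lines of arithmetic. This is a sketch under the empirical-rule percentages only (68% and 95%); the variable names are hypothetical, and the exact normal-curve value is printed alongside purely for comparison.

```python
from statistics import NormalDist

mean, sd = 300, 102
within_1sd, within_2sd = 0.68, 0.95  # empirical-rule percentages

above_402 = (1 - within_1sd) / 2   # a) upper tail beyond mean + 1 sd
above_504 = (1 - within_2sd) / 2   # b) upper tail beyond mean + 2 sd
between_198_300 = within_1sd / 2   # c) half of the central 68%

print(f"a) {above_402:.1%}  b) {above_504:.1%}  c) {between_198_300:.1%}")

# For comparison only: the exact normal-curve area above 402 minutes.
print(f"exact normal tail above 402: {1 - NormalDist(mean, sd).cdf(402):.4f}")
```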
Five-Number Summary
In a five-number summary, five numbers are
used to summarize the data:
• Smallest value
• First quartile
• Median
• Third quartile
• Largest value
Five-Number Summary
To illustrate the development of a five-number summary, we will use the
monthly starting salary data. Arranging the data in ascending order, we obtain
the following results.
The smallest value is 3710 and the largest value is 4325. We showed how to compute
the quartiles (Q1 = 3857.50; Q2 = 3905; Q3 = 4025). Thus, the five-number summary
for the monthly starting salary data is: 3710, 3857.5, 3905, 4025, 4325.
The five-number summary indicates that the starting salaries in the sample are
between 3710 and 4325 and that the median or middle value is 3905. The first and
third quartiles show that approximately 50% of the starting salaries are between
3857.5 and 4025.
BOX PLOT
A box plot is a graphical display of data based on a five-number summary. A key
to the development of a box plot is the computation of the interquartile range.
Box plots provide another way to identify outliers, but they do not
necessarily identify the same values as those with a z-score
less than −3 or greater than +3. Either or both procedures may be used.
BOX PLOT
The steps used to construct the box plot follow.
1. A box is drawn with the ends of the box located at the first and third quartiles. For the salary
data, Q1 = 3857.5 and Q3 = 4025. This box contains the middle 50% of the data.
2. A horizontal line is drawn in the box at the location of the median (3905 for the salary data). An X
indicates the value of the mean (3940 for the salary data).
3. By using the interquartile range, IQR = Q3 – Q1, limits are located at 1.5(IQR) below Q1 and
1.5(IQR) above Q3. For the salary data, IQR = Q3 – Q1 = 4025 – 3857.50 = 167.5. Thus, the limits
are 3857.5 – 1.5(167.5) = 3606.25 and 4025 + 1.5(167.5) = 4276.25. Data outside these limits are
considered outliers.
4. The vertical lines extending from each end of the box are called whiskers. The whiskers are drawn from
the ends of the box to the smallest and largest values inside the limits computed in step 3. Thus,
the whiskers end at salary values of 3710 and 4130.
5. Finally, the location of each outlier is shown with a small dot. In the figure we see one outlier,
4325.
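Step 3 can be sketched as follows, using only the quartiles quoted for the salary data. The function name `is_outlier` is an assumption for illustration.

```python
q1, q3 = 3857.5, 4025.0  # quartiles quoted for the salary data

iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

def is_outlier(value):
    """True when a value falls outside the 1.5 x IQR limits."""
    return value < lower_limit or value > upper_limit

print(iqr, lower_limit, upper_limit)        # 167.5 3606.25 4276.25
print(is_outlier(4325), is_outlier(4130))   # True False
```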
COVARIANCE
Covariance is a measure of the joint variability of two random
variables.
Population covariance: σxy = Σ(xi − μx)(yi − μy) / N
SAMPLE COVARIANCE
Calculate covariance for the following data set:
x: 2.1, 2.5, 3.6, 4.0 (mean = 3.05, rounded to 3.1)
y: 8, 10, 12, 14 (mean = 11)
Substitute the values into the sample covariance formula and solve:
Cov(X,Y) = Σ(xi − x̄)(yi − ȳ) / (n − 1)
= [(2.1-3.1)(8-11) + (2.5-3.1)(10-11) + (3.6-3.1)(12-11) + (4.0-3.1)(14-11)] / (4-1)
= [(-1)(-3) + (-0.6)(-1) + (0.5)(1) + (0.9)(3)] / 3
= (3 + 0.6 + 0.5 + 2.7) / 3
= 6.8/3
= 2.267
Rounding the x mean does not change the sum here, because the y deviations add up to zero.
The result is positive, meaning that the variables are positively
related.
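The hand calculation can be verified with a small function. This is a minimal sketch using the unrounded means; the function name `sample_covariance` is hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

x = [2.1, 2.5, 3.6, 4.0]
y = [8, 10, 12, 14]
print(round(sample_covariance(x, y), 3))  # 2.267
```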
INTERPRETATION OF SAMPLE COVARIANCE
One problem with using covariance
as a measure of the strength of the
linear relationship is that the value of
the covariance depends on the units
of measurement for x and y.
CORRELATION COEFFICIENT
A measure of the relationship between two variables that is
not affected by the units of measurement for x and y is the
correlation coefficient.
CORRELATION COEFFICIENT
EXERCISES
3.50 Suppose that a company's sales were $5,000,000
three years ago. Since that time sales have grown at
annual rates of 10 percent, −10 percent, and 25 percent.
a. Find the geometric mean growth rate of sales over
this three-year period.
b. Find the ending value of sales after this three-year
period.
3.51 Suppose that a company's sales were $1,000,000
four years ago and are $4,000,000 at the end of the four
years. Find the geometric mean growth rate of sales.
EXERCISES
3.31 Thirteen internists in the Midwest are randomly
selected, and each internist is asked to report last year's
income. The incomes obtained (in thousands of dollars)
are 152, 144, 162, 154, 146, 241, 127, 141, 171, 177, 138,
132, 192. Find:
a. The 90th percentile.
b. The median.
c. The first quartile.
d. The third quartile.
e. The 10th percentile.
f. The interquartile range.
EXERCISE
3.32 Construct a box-and-whiskers
display of the following 12 household
incomes :
7,524 11,070 18,211 26,817 36,551 41,286
49,312 57,283 72,814 90,416 135,540 190,250
EXERCISE
3.44 The following table gives a summary of the grades received by a
student for the first 64 semester hours of university coursework. The
table gives the number of semester hours of A, B, C, D, and F earned
by the student among the 64 hours.
Grade Number of Hours
A (that is, 4.00) 18
B (that is, 3.00) 36
C (that is, 2.00) 7
D (that is, 1.00) 3
F (that is, 0.00) 0
a. By assigning the numerical values, 4.00, 3.00, 2.00, 1.00, and 0.00
to the grades A, B, C, D, and F (as shown), compute the student's
grade point average for the first 64 semester hours of coursework.
b. Why is this a weighted average?
EXERCISE
3.46 The following is a frequency distribution summarizing earnings per share
(EPS) growth data for the 30 fastest-growing firms as given on Fortune
magazine's website on March 16, 2005.
EPS Growth (Percent) Frequency
0–49 1
50–99 17
100–149 5
150–199 4
200–249 1
250–299 2
Source: http://www.fortune.com (accessed March 16, 2005).
Calculate the (approximate) population mean, variance, and standard
deviation for these data.
SOLUTION
3.31 The n = 13 internists’ yearly incomes ($000) are listed below,
sorted:
x1   x2   x3   x4   x5   x6   x7   x8   x9   x10  x11  x12  x13
127  132  138  141  144  146  152  154  162  171  177  192  241
a. i = (90/100)(13) = 11.7. Rounding the index up to 12, the twelfth value is x12 = 192 or $192,000
b. i = (50/100)(13) = 6.5. Rounding the index up to 7, Md = x7 = 152 or $152,000
c. i = (25/100)(13) = 3.25. Rounding the index up to 4, Q1 = x4 = 141 or $141,000
d. i = (75/100)(13) = 9.75. Rounding the index up to 10, Q3 = x10 = 171 or $171,000
e. i = (10/100)(13) = 1.3. Rounding the index up to 2, x2 = 132 or $132,000
f. The Inter Quartile Range, IQR = Q3 – Q1 = 171 – 141 = 30 or $30,000
SOLUTION
3.50 a. Rg = [(1 + .10)(1 − .10)(1 + .25)]^(1/3) − 1 = (1.2375)^(1/3) − 1 = .0736, or 7.36%
b. $5,000,000 × (1 + .0736)^3 = $6,187,500 using the
full precision (unrounded) Rg value
SOLUTION
3.51 1,000,000 (1+Rg)^4 = 4,000,000
(1+Rg)^4 = 4
(1+Rg) = 4^(1/4) = 1.4142
Rg = 1.4142 - 1
Rg = .4142
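Both geometric-mean solutions can be checked with a short sketch. The function name `geometric_mean_rate` is an assumption; the inputs are the growth rates and sales figures from exercises 3.50 and 3.51.

```python
def geometric_mean_rate(growth_rates):
    """Geometric mean growth rate Rg for a list of period growth rates."""
    product = 1.0
    for r in growth_rates:
        product *= 1 + r
    return product ** (1 / len(growth_rates)) - 1

# Exercise 3.50: growth of 10%, -10% and 25% on $5,000,000 of sales.
rg = geometric_mean_rate([0.10, -0.10, 0.25])
ending_sales = 5_000_000 * (1 + rg) ** 3
print(round(rg, 4), round(ending_sales))  # 0.0736 6187500

# Exercise 3.51: sales grow from $1,000,000 to $4,000,000 in four years.
rg_351 = (4_000_000 / 1_000_000) ** (1 / 4) - 1
print(round(rg_351, 4))  # 0.4142
```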
SOLUTION
3.32 The following plot was constructed in Megastat.
[Figure: box plot of Household Incomes; axis from 0 to 250000]
GOOD LUCK ON YOUR
PRELIM EXAM!
Chapter 4: Introduction to Probability
Probability
Probability is a numerical measure of the
likelihood that an event will occur.
Probability values are always assigned on a scale
from 0 to 1. A probability near zero indicates an
event is unlikely to occur; a probability near 1
indicates an event is almost certain to occur.
Managers often base their decisions on an analysis of
uncertainties such as the following:
• What are the chances that sales will decrease if we
increase prices?
• What is the likelihood a new assembly method will
increase productivity?
• How likely is it that the project will be finished on
time?
• What is the chance that a new investment will be
profitable?
ASSIGNING PROBABILITIES
This is usually done by using one of three methods: the
classical method, the relative frequency method, or the
subjective method. Regardless of the method used,
probabilities must be assigned to the sample space
outcomes so that two conditions are met:
1. The probability assigned to each sample space outcome must
be between 0 and 1. That is, if E represents a sample space
outcome and if P(E) represents the probability of this
outcome, then 0 ≤ P(E) ≤ 1.
2. The probabilities of all of the sample space outcomes must
sum to 1.
What is Classical Probability?
• Rolling a fair die. It’s equally likely you would get a 1, 2, 3, 4, 5, or 6.
• Selecting bingo balls. Each numbered ball has an equal chance of being
chosen.
• Guessing on a test. If you guessed on a multiple choice test with four
possible answer A B C and D, each choice has the same odds of being
picked (assuming you pick randomly and don’t follow a pattern).
P(A) = f / N.
P(A) means “probability of event A” (event A is whatever event you are looking
for, like winning the lottery). “f” is the frequency, or number of ways the
event could happen. N is the total number of equally likely possible outcomes.
The probability of rolling a 2 on a fair die is one out of 6, or 1/6. That’s one possible
outcome (there’s only one way to roll a 2!) divided by the number of possible
outcomes (1,2,3,4,5,6).
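The P(A) = f / N rule can be sketched with exact fractions. The function name `classical_probability` is hypothetical, and the sketch assumes every outcome in the sample space is equally likely, as the classical method requires.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(A) = f / N for a sample space of equally likely outcomes."""
    favorable = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(favorable, len(sample_space))

die = [1, 2, 3, 4, 5, 6]
print(classical_probability({2}, die))        # 1/6
print(classical_probability({2, 4, 6}, die))  # 1/2
```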
What is the Relative Frequency
Method?
The Relative Frequency Method of assigning probabilities is appropriate when
data are available to estimate the proportion of the time the experimental
outcome will occur if the experiment is repeated a large number of times.
For example, to estimate the probability that a randomly selected consumer prefers
Coca-Cola to all other soft drinks, an experiment is performed wherein a randomly
selected consumer is asked for his or her preference. There are two possible sample
space outcomes: “prefers Coca-Cola” and “does not prefer Coca-Cola.” However, we
have no reason to believe that these sample space outcomes are equally likely, so we
cannot use the classical method. We might perform the experiment, say, 1,000 times
by surveying 1,000 randomly selected consumers. Then, if 140 of those surveyed said
that they prefer Coca-Cola, we would estimate the probability that a randomly selected
consumer prefers Coca-Cola to all other soft drinks to be 140/1,000 = .14.
What is the Subjective Method of
Assigning Probabilities?
The subjective method of assigning probabilities is most appropriate when one
cannot realistically assume that the experimental outcomes are equally likely and
when little relevant data are available.
Example: John and Marsha Puruntong made an offer to purchase a house. Two
outcomes are possible:
E1 = their offer is accepted
E2 = their offer is rejected
Marsha believes that the probability their offer will be accepted is .8; thus, Marsha
would set P(E1) = 0.8 and P(E2) = 0.2. John, however, believes that the probability that
their offer will be accepted is .6; hence, John would set P(E1) = 0.6 and P(E2) = 0.4.
Note that John’s probability estimate for E1 reflects a greater pessimism that their offer
will be accepted.
EXPERIMENTS
An EXPERIMENT is a process that generates well-defined outcomes.
On any single repetition of an experiment, one and only one of the possible
experimental outcomes will occur.
Experiment                     Experimental Outcomes
Toss a coin                    Head, tail
Select a part for inspection   Defective, non-defective
Conduct a sales call           Purchase, no purchase
Roll a die                     1, 2, 3, 4, 5, 6
Play a football game           Win, lose, tie
Sample Space
The sample space for an experiment is the set of
all experimental outcomes.
An experimental outcome is also called a sample
point to identify it as an element of the sample
space.
Counting Rules, Combinations, and
Permutations
Multiple-Step Experiments
If an experiment can be described as a sequence of k steps
with n1 possible outcomes on the first step, n2 possible
outcomes on the second step, and so on, then the total
number of experimental outcomes is given by (n1) (n2)… (nk)
EXAMPLE: If we toss a coin twice, how many possible outcomes do we
have?
How many possible outcomes are there in first coin toss?
Two outcomes: Head, Tail
How many possible outcomes are there in a second coin toss?
Two outcomes: Head, Tail
Therefore, n = (n1)(n2) = (2)(2) = 4 total possible outcomes.
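The counting rule and the list of outcomes it counts can be sketched together. The list `steps` is a hypothetical way of describing the two coin tosses; `itertools.product` enumerates every sequence of step outcomes.

```python
from itertools import product
from math import prod

# Each inner list holds the possible outcomes of one step.
steps = [["H", "T"], ["H", "T"]]

count = prod(len(step) for step in steps)  # (n1)(n2)...(nk)
outcomes = list(product(*steps))           # every sequence of step outcomes

print(count)     # 4
print(outcomes)  # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
```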
TREE DIAGRAM
A tree diagram is a graphical representation that
helps in visualizing a multiple-step experiment.
Counting Rules for Combinations
A second useful counting rule allows one to count
the number of experimental outcomes when the
experiment involves selecting n objects from a set
of N objects. It is called the counting rule for
combinations.
The number of combinations of N objects taken n at a time is
C(N, n) = N! / [n!(N − n)!]
Where:
N! = N(N − 1)(N − 2) … (2)(1)
n! = n(n − 1)(n − 2) … (2)(1)
And, by definition,
0! = 1
Example
The GRAND lottery system uses the random selection of 6 numbers from a
group of 55 numbers to determine each week's lottery winner. There are
C(55, 6) = 55! / [6!(55 − 6)!] = 28,989,675
combinations of 6 numbers that can be selected from 55 numbers. Therefore, if
you buy a lottery ticket and pick six numbers, the probability that this ticket will
win the lottery is 1 in 28,989,675.
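The lottery count can be confirmed with Python's `math.comb`, which evaluates the number of combinations directly. The variable names below are illustrative only.

```python
from math import comb, factorial

N, n = 55, 6
n_combinations = comb(N, n)

# Same value from the factorial form N! / (n!(N - n)!).
by_factorials = factorial(N) // (factorial(n) * factorial(N - n))

print(n_combinations)      # 28989675
print(1 / n_combinations)  # probability that one ticket wins
```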
Counting Rules for PERMUTATIONS
A third counting rule that is sometimes useful is the counting rule for permutations. It
allows one to compute the number of experimental outcomes when n objects are to be
selected from a set of N objects where the order of selection is important. The same n
objects selected in a different order are considered a different experimental outcome.
Counting Rule for Permutations
The number of permutations of N objects taken n at a time is given by
P(N, n) = N! / (N − n)!
Example
A Manufacturing Company has a quality control process in which an inspector selects
two of five parts to inspect for defects. How many permutations may be selected? The
counting rule in the equation shows that with N = 5 and n = 2, we have
P(5, 2) = 5! / (5 − 2)! = 5! / 3! = (5)(4) = 20
from a group of five when the order of selection must be taken into account. If we
label the parts A, B, C, D, and E, the 20 permutations are AB, BA, AC, CA, AD, DA, AE,
EA, BC, CB, BD, DB, BE, EB, CD, DC, CE, EC, DE, and ED.
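The 20 ordered selections can be generated and counted in a few lines. This sketch uses `math.perm` and `itertools.permutations` from the standard library; the part labels are the ones used in the example.

```python
from itertools import combinations, permutations
from math import perm

parts = "ABCDE"

count = perm(len(parts), 2)  # N! / (N - n)!
ordered = ["".join(p) for p in permutations(parts, 2)]
unordered = ["".join(c) for c in combinations(parts, 2)]

print(count, len(ordered))  # 20 20
print(len(unordered))       # 10 when order does not matter
```

Each unordered pair such as AB appears twice among the permutations (AB and BA), which is why there are twice as many permutations as combinations here.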
What is the Difference Between
Combination and Permutation?
EXAMPLE: "My fruit salad is a combination of apples, grapes and bananas"
We don't care what order the fruits are in; they could also be "bananas,
grapes and apples" or "grapes, apples and bananas". It's the same fruit salad.
When the order doesn't matter, it is a COMBINATION.
EXAMPLE: "The combination to the safe is 472". Now the order is important.
"724" won't work, nor will "247". It has to be exactly 4-7-2.
When the order does matter it is a PERMUTATION.

• EXERCISES 1
• EXERCISES 2
Chapter 5
Discrete Probability
Distributions
Chapter 6
Continuous Probability
Distributions
Random Variable
Random Variable is a numerical description of
the outcome of an experiment.
Example: Tossing a coin: we could get Heads or Tails.
Let's give it the values Heads=0 and Tails=1 and we have a
Random Variable "X":
In Short: X = {0, 1}
TYPES OF RANDOM VARIABLES
Discrete Random Variable is a random variable that may
assume either a finite number of values or an infinite
sequence of values such as 0, 1, 2, …
The value assumed by a discrete random variable
depends on the outcome of an experiment. Because the
outcome of the experiment will be uncertain, the value
assumed by the random variable will also be uncertain.
Properties of a Discrete Probability Distribution p(x)
A discrete probability distribution p(x) must be such that
1. p(x) ≥ 0 for each value of x
2. Σ p(x) = 1, where the sum is over all values of x
Examples of Discrete Random Variables
Experiment                          Random Variable (x)                       Possible Values for the Random Variable
Contact five customers              Number of customers who place an order    0, 1, 2, 3, 4, 5
Inspect a shipment of 50 radios     Number of defective radios                0, 1, 2, …, 49, 50
Operate a restaurant for one day    Number of customers                       0, 1, 2, 3, …
Sell an automobile                  Gender of the customer                    0 if male; 1 if female
Continuous Random Variable is a random variable that
may assume any numerical value in an interval or
collection of intervals.
Experimental outcomes based on measurement scales
such as time, weight, distance, and temperature can be
described by continuous random variables.
Properties of a Continuous Probability Distribution
The continuous probability distribution (or probability curve) f(x) of
a random variable x must satisfy the following two conditions:
1. f(x) ≥ 0 for any value of x.
2. The total area under the curve f(x) is equal to 1.
Examples of Continuous Random Variables
Experiment                                  Random Variable (x)                                                        Possible Values for the Random Variable
Operate a bank                              Time between customer arrivals in minutes                                  x ≥ 0
Fill a soft drink can (max = 12.1 ounces)   Number of ounces                                                           0 ≤ x ≤ 12.1
Construct a new library                     Percentage of project complete after six months                            0 ≤ x ≤ 100
Test a new chemical process                 Temperature when the desired reaction takes place (min 150°F; max 212°F)   150 ≤ x ≤ 212
The Normal Probability Distribution is
defined by the equation
f(x) = [1 / (σ√(2π))] e^(−(1/2)((x − μ)/σ)²)
Here μ and σ are the mean and standard
deviation of the population of all possible
observed values of the random
variable x under consideration. Furthermore,
π = 3.14159…, and e = 2.71828… is the base
of Napierian logarithms.
THE STANDARD NORMAL DISTRIBUTION
If a random variable x (or, equivalently,
the population of all possible observed
values of x) is normally distributed with
mean μ and standard deviation σ, then
the random variable
z = (x - µ) / σ
(or, equivalently, the population of all
possible observed values of z) is
normally distributed with mean 0 and
standard deviation 1. A normal
distribution (or curve) with mean 0 and
standard deviation 1 is called a
standard normal distribution (or curve).
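Three Important Areas under the Normal Curve
P(µ - σ ≤ X ≤ µ + σ) = 0.6826
This means that 68.26 percent of all possible
observed values of x are within (plus or minus)
one standard deviation of μ.
P(µ - 2σ ≤ X ≤ µ + 2σ) = 0.9544
This means that 95.44 percent of all possible
observed values of x are within (plus or minus)
two standard deviations of μ.
P(µ - 3σ ≤ X ≤ µ + 3σ) = 0.9973
This means that 99.73 percent of all possible
observed values of x are within (plus or minus)
three standard deviations of μ.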
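These areas can be recomputed without a table. The sketch below relies on the identity P(−k ≤ z ≤ k) = erf(k/√2) for the standard normal curve; the function name `area_within` is hypothetical. Values read from a rounded z table (as on the slide) can differ from these in the fourth decimal place.

```python
from math import erf, sqrt

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```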
What is the Difference Between Discrete
Variable and Continuous Variable?
Discrete Variable can only take certain values.
Example: The number of students in a class. It can be 0, 1, 2, … There
can never be half a student.
Example: The results of rolling 2 dice. It only has the values 2, 3, 4, 5,
6, 7, 8, 9, 10, 11 and 12.
Continuous Variable can take any value (within a range).
Example: A person's height: could be any value (within the range of human
heights), not just certain fixed heights.
EXERCISE
The Life Insurance Case: Setting a Policy Premium
An insurance company
sells a $20,000 whole life
insurance policy for an
annual premium of $300.
Actuarial tables show that a
person who would be sold
such a policy with this
premium has a .001
probability of death during a
year.
EXERCISE
Let x be a random variable representing the insurance company's profit made on one of
these policies during a year. The probability distribution of x is:
x, Profit                                                             p(x), Probability of x
$300 (if the policyholder lives)                                      .999
$300 − $20,000 = −$19,700 (a $19,700 loss if the policyholder dies)   .001
The expected value of x (expected profit per year) is:
μ = ∑ x•p(x) = $300(.999) + (−$19,700)(.001) = $280
This says that if the insurance company sells a very large number of these
policies, it will average a profit of $280 per policy per year. Because insurance
companies actually do sell large numbers of policies, it is reasonable for these
companies to make profitability decisions based on expected values.
EXERCISE
The following table summarizes investment outcomes and
corresponding probabilities for a particular oil well:
x = the outcome in $ p(x)
−$40,000 (no oil) .25
10,000 (some oil) .70
70,000 (much oil) .05
a. Graph p(x); that is, graph the probability distribution of x.
b. Find the expected monetary outcome. Mark this value on
your graph of part a. Then interpret this value.
c. Calculate the standard deviation of x.
EXERCISE
a. Graph p(x); that is, graph the probability distribution of x
EXERCISE
b. Find the expected monetary outcome. Mark this value on your graph of part
a. Then interpret this value.
μ = ∑ x•p(x) = (-40,000)(0.25) + (10,000)(0.70) + (70,000)(0.05) = $500

If numerous (in theory, an infinite number of) oil wells were drilled, the average
profit per oil well would be $500.
c. Calculate the standard deviation of x.
σ² = ∑(x – μ)² • p(x) = (-40,000 – 500)²(0.25) + (10,000 – 500)²(0.70) + (70,000 – 500)²(0.05) = 714,750,000
σ = √σ² = √714,750,000 = $26,734.81
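The oil-well calculations can be reproduced with two small helpers. The function names and the list-of-pairs layout are assumptions made for this sketch.

```python
from math import sqrt

def expected_value(dist):
    """Mean of a discrete distribution given as (x, p(x)) pairs."""
    return sum(x * p for x, p in dist)

def variance(dist):
    """Variance: probability-weighted average of squared deviations."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist)

oil_well = [(-40_000, 0.25), (10_000, 0.70), (70_000, 0.05)]
assert abs(sum(p for _, p in oil_well) - 1) < 1e-12  # probabilities sum to 1

mu = expected_value(oil_well)
sigma = sqrt(variance(oil_well))
print(round(mu, 2), round(variance(oil_well)), round(sigma, 2))
# 500.0 714750000 26734.81
```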
EXERCISE
Five thousand raffle tickets are to be sold at $10 each to benefit a
local community group. The prizes, the number of each prize to be
given away, and the dollar value of winnings for each prize are as
follows:
Prize Number to Be Given Away Dollar Value
Automobile 1 $20,000
Entertainment center 2 3,000 each
DVD recorder 5 400 each
Gift certificate 50 20 each
If you buy one ticket, calculate your expected winnings. (Form the
probability distribution of x = your dollar winnings, and remember to
subtract the cost of your ticket.)
EXERCISE
Prize Income $   Ticket Cost $   Frequency f   Profit x   f/N         p(x)      x • p(x)
20000            10              1             19,990     1/5000      0.00020   3.9980
3000             10              2             2,990      2/5000      0.00040   1.1960
400              10              5             390        5/5000      0.00100   0.3900
20               10              50            10         50/5000     0.01000   0.1000
0                10              4942          (10)       4942/5000   0.98840   (9.8840)
Total                            5000                     5000/5000   1.00000   (4.2000)
μ = ∑ x•p(x) = -$4.20
EXERCISE
The DVD Case: Managing Inventory
A large discount store sells 50 packs of HX-150 blank DVDs and
receives a shipment every Monday. Historical sales records
indicate that the weekly demand, x, for these 50 packs is normally
distributed with a mean of μ=100 and a standard deviation of σ=10.
How many 50 packs should be stocked at the beginning of a week
so that there is only a 5 percent chance that the store will run short
during the week?
If we let st equal the number of 50 packs that will be stocked, then
st must be chosen to allow only a .05 probability that weekly
demand, x, will exceed st. That is, st must be chosen so that
P (x > st) = 0.05
EXERCISE
Using z-score to calculate the value.
z = (st - µ) / σ
z = (st – 100) / 10
The area under the standard normal curve to the left of z must be 1 – 0.05 = 0.95.
Using the Standard Normal Distribution table, the areas closest to .95 are .9495, which
has a corresponding z value of 1.64, and .9505, which has a
corresponding z value of 1.65. Interpolating, we get z = 1.645.
Substitute value:
(st – 100) / 10 = 1.645
st = 116.45, so 117 50 packs must be in stock at the start of the week.
[Figure: Finding the Number of 50 Packs of DVDs Stocked, st, so That P(x > st) = .05 When μ = 100 and σ = 10]
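The table lookup can be cross-checked with `statistics.NormalDist` from the Python standard library, whose `inv_cdf` method returns the value below which a given area lies. The variable names are illustrative; the software value differs slightly from the table-based 116.45 because 1.645 is an interpolated z value.

```python
from math import ceil
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=10)  # weekly demand for 50 packs

st = demand.inv_cdf(0.95)  # stock level with 95% of demand below it
stock = ceil(st)           # whole packs, rounded up to stay under 5% risk

print(round(st, 2), stock)               # 116.45 117
print(round(1 - demand.cdf(stock), 4))   # chance of running short with 117
```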
Continuous Random Variable
Uniform Distribution
EXERCISE 1
Discrete Probability Distributions
• Random Variables
• Developing Discrete Probability Distributions
• Expected Value and Variance
• Bivariate distributions, Covariance and
Financial Portfolios
• Binomial Probability Distribution
• Poisson Probability Distribution
• Hypergeometric Probability Distribution
Discrete Random Variable
with a Finite Number of Values
Example: An accountant taking CPA examination
The examination has four parts.
Let random variable x = the number of parts of
the CPA examination passed
x may assume the finite number of values 0,1,2,3
or 4.
Discrete Random Variable
with an Infinite Number of Values
Example: Cars arriving at a toll booth
Let x = number of cars arriving in one day,
where x can take on the values 0, 1, 2, . . .
We can count the cars arriving, but there is
no finite upper limit on the number that might
arrive.
• The probability distribution for a random
variable describes how probabilities are
distributed over the values of the random
variable.
• We can describe a discrete probability
distribution with a table, graph, or formula.
Two types of discrete probability distributions:
– First type: uses the rules of assigning
probabilities to experimental outcomes to
determine probabilities for each value of the
random variable.
– Second type: uses a special mathematical
formula to compute the probabilities for each
value of the random variable.
• The probability distribution is defined by a
probability function, denoted by f(x), that
provides the probability for each value of the
random variable.
• The required conditions for a discrete
probability function are:
f(x) ≥ 0 and Σ f(x) = 1
• There are three methods for assigning
probabilities to random variables:
Classical method,
Subjective method, and
Relative frequency method.
• The use of the relative frequency method to
develop discrete probability distributions
leads to what is called an empirical discrete
distribution.
Example: DiCarlo Motors
Using past data on daily car sales for 300 days, a tabular
representation of the probability distribution for sales was
developed.
Number of cars sold (x)   Number of days   f(x)
0                         54               .18
1                         117              .39
2                         72               .24
3                         42               .14
4                         12               .04
5                         3                .01
Total                     300              1.00
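The f(x) column is the relative frequency of each sales count. A minimal sketch of that step, using the day counts from the table (the dictionary names are assumptions):

```python
# Number of days (out of 300) on which x cars were sold.
days_by_cars_sold = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(days_by_cars_sold.values())
f = {x: days / total_days for x, days in days_by_cars_sold.items()}

for x, p in f.items():
    print(x, round(p, 2))

# Required conditions for a discrete probability function.
assert all(p >= 0 for p in f.values())
assert abs(sum(f.values()) - 1) < 1e-12
```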
[Figure: Graphical representation of Probability Distribution.]
• In addition to tables and graphs, a formula that
gives the probability function, f(x), for every value
of x is often used to describe the probability
distributions.
• Some of the discrete probability distributions
specified by formulas are
Discrete – uniform distribution
Binomial distribution
Poisson distribution
Hypergeometric distribution
• The discrete uniform probability distribution is
the simplest example of a discrete probability
distribution given by a formula.
• The discrete uniform probability function is
f(x) = 1/n
where: n = the number of values the random variable may assume
The values of the random variable are equally likely.
Expected Value
• The expected value, or mean, of a random
variable is a measure of its central location.
E(x) = μ = ∑x f(x)
• The expected value is a weighted average of
the values the random variable may assume.
The weights are the probabilities.
• The expected value does not have to be a
value the random variable can assume.
Variance and Standard Deviation
• The variance summarizes the variability in the
values of a random variable.
Var(x) = σ² = ∑(x − μ)² f(x)
• The variance is a weighted average of the
squared deviations of a random variable from
its mean. The weights are the probabilities.
• The standard deviation, σ, is defined as the
positive square root of the variance.
x f(x) xf(x)
0 .18 .00
1 .39 .39
2 .24 .48
3 .14 .42
4 .04 .16
5 .01 .05
1.00 1.50
E(x) = 1.50 = expected number of cars sold in a day
x    x − μ             (x − μ)²   f(x)   (x − μ)² f(x)
0    0 – 1.5 = -1.5    2.25       .18    .4050
1    1 – 1.5 = -.5     .25        .39    .0975
2    2 – 1.5 = .5      .25        .24    .0600
3    3 – 1.5 = 1.5     2.25       .14    .3150
4    4 – 1.5 = 2.5     6.25       .04    .2500
5    5 – 1.5 = 3.5     12.25      .01    .1225
                                  1.00   1.2500
Variance of daily sales = σ² = 1.25
Standard deviation of daily sales = 1.118 cars
Bivariate Distributions
• A probability distribution involving two random
variables is called a bivariate probability
distribution.
• Each outcome of a bivariate experiment consists
of two values, one for each random variable.
Example: Rolling a pair of dice
• When dealing with bivariate probability
distributions, we are often interested in the
relationship between the random variables.
A Bivariate discrete probability
distribution
The crosstabulation of daily car sales for 300
days at DiCarlo’s Saratoga and Geneva
dealerships is given below:
Geneva        Saratoga Dealership
Dealership    0     1     2     3     4     5     Total
0             21    30    24    9     2     0     86
1             21    36    33    18    2     1     111
2             9     42    9     12    3     2     77
3             3     9     6     3     5     0     26
Total         54    117   72    42    12    3     300
Bivariate empirical discrete probability distribution for
daily sales at DiCarlo dealerships in Saratoga and
Geneva, New York, is shown below.
Geneva        Saratoga Dealership
Dealership    0       1       2       3       4       5       Total
0             .0700   .1000   .0800   .0300   .0067   .0000   .2867
1             .0700   .1200   .1100   .0600   .0067   .0033   .3700
2             .0300   .1400   .0300   .0400   .0100   .0067   .2567
3             .0100   .0300   .0200   .0100   .0167   .0000   .0867
Total         .18     .39     .24     .14     .04     .01     1.0000
Example: DiCarlo Motors
Expected value and Variance for daily car sales
at Geneva dealership.
x    f(x)    xf(x)   x – E(x)   (x – E(x))²   (x – E(x))² f(x)
0    .2867   .0000   -1.1435    1.3076        .3749
1    .3700   .3700   -.1435     0.0206        .0076
2    .2567   .5134   .8565      0.7336        .1883
3    .0867   .2601   1.8565     3.4466        .2988
E(x) = 1.1435                   Var(x) = .8696
distribution
Example: DiCarlo Motors Expected value and Variance
for total daily car sales data.
s f(s) sf(s) s – E(s) (s – E(s))2 (s – E(s))2f(s)
0 .0700 .0000 -2.6433 6.9872 .4891
1 .1700 .1700 -1.6433 2.7005 .4591
2 .2300 .4600 -0.6433 0.4139 .0952
3 .2900 .8700 0.3567 0.1272 .0369
4 .1267 .5067 1.3567 1.8405 .2331
5 .0667 .3333 2.3567 5.5539 .3703
6 .0233 .1400 3.3567 11.2672 .2629
7 .0233 .1633 4.3567 18.9805 .4429
8 .0000 .0000 5.3567 28.6939 .0000
E(s) = 2.6433 Var(s) = 2.3895
Covariance for random variables x and y

σxy = [Var(x + y) – Var(x) – Var(y)]/2
    = (2.3895 – .8696 – 1.25)/2
    = .1350
where Var(x + y) = Var(s) = 2.3895 is the variance of total daily sales.
Correlation between random variables x and y
$\rho_{xy} = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
$\sigma_x = \sqrt{.8696} = .9325$
$\sigma_y = \sqrt{1.25} = 1.1180$
$\rho_{xy} = \dfrac{.1350}{(.9325)(1.1180)} = .1295$
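The covariance and correlation for the DiCarlo data can be checked with a short script. This is a minimal sketch, not the method used on the slides: it computes the covariance directly as E(xy) – E(x)E(y) from the crosstabulated counts, which is an equivalent route to the Var(x + y) formula. The function name `bivariate_summary` is an assumption made for illustration.

```python
import math

# Joint counts from the DiCarlo crosstabulation:
# rows = Geneva daily sales (0-3), columns = Saratoga daily sales (0-5).
counts = [
    [21, 30, 24, 9, 2, 0],
    [21, 36, 33, 18, 2, 1],
    [9, 42, 9, 12, 3, 2],
    [3, 9, 6, 3, 5, 0],
]

def bivariate_summary(table):
    """Return variances, covariance and correlation for a table of joint counts."""
    total = sum(sum(row) for row in table)
    ex = ey = exx = eyy = exy = 0.0
    for x, row in enumerate(table):
        for y, n in enumerate(row):
            p = n / total          # empirical joint probability f(x, y)
            ex += x * p
            ey += y * p
            exx += x * x * p
            eyy += y * y * p
            exy += x * y * p
    var_x = exx - ex ** 2
    var_y = eyy - ey ** 2
    cov = exy - ex * ey            # direct definition of covariance
    corr = cov / math.sqrt(var_x * var_y)
    return var_x, var_y, cov, corr

var_x, var_y, cov, corr = bivariate_summary(counts)
print(round(var_x, 4), round(var_y, 4), round(cov, 4), round(corr, 4))
```

Small differences from the slide values in the fourth decimal place are expected, because the slides work from probabilities rounded to four decimals.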
Binomial Probability Distribution
Four Properties of a Binomial Experiment
1. The experiment consists of a sequence of n
identical trials.
2. Two outcomes, success and failure, are
possible on each trial.
3. The probability of a success, denoted by p,
and failure denoted by 1-p does not
change from trial to trial. (This is
referred to as the stationarity assumption.)
4. The trials are independent.
274
• Our interest is in the number of successes
occurring in the n trials.
• We let x denote the number of successes
occurring in the n trials.
275
Binomial Probability Function
$f(x) = \dfrac{n!}{x!\,(n-x)!}\; p^x (1-p)^{(n-x)}$
where:
x = the number of successes
p = the probability of a success on one trial
n = the number of trials
f(x) = the probability of x successes in n trials
n! = n(n – 1)(n – 2) ….. (2)(1)
Binomial Probability Function
$f(x) = \dfrac{n!}{x!\,(n-x)!}\; p^x (1-p)^{(n-x)}$
• $\dfrac{n!}{x!\,(n-x)!}$ = number of experimental outcomes providing exactly x successes in n trials
• $p^x (1-p)^{(n-x)}$ = probability of a particular sequence of trial outcomes with x successes in n trials
Example: Martin Clothing store
The store manager wants to determine the
purchase decisions of the next three customers who
enter the clothing store. On the basis of past
experience, the store manager estimates the
probability that any one customer will make a
purchase is .30.
What is the probability that two of the next three
customers will make a purchase?
278
Using S to denote success (a purchase) and F to denote
failure (no purchase), we are interested in experimental
outcomes involving two successes in the three trials.
• The probability of the first two customers buying and
the third customer not buying denoted by (S, S, F), is
given by
(p)(p)(1 – p)
• With a .30 probability of a customer buying on any one
trial, the probability of the first two customers buying
and the third customer not buying is (0.3)(0.3)(1-0.3) =
.063
Two other experimental outcomes result in two
successes and one failure. The probabilities for all
three experimental outcomes involving two
successes follow:
Experimental
outcome Probability
(S, S, F) .063
(S, F, S) .063
(F, S, S) .063
280
Using the probability function:

Let: p = .30, n = 3, x = 2
$f(x) = \dfrac{n!}{x!\,(n-x)!}\; p^x (1-p)^{(n-x)}$
$f(2) = \dfrac{3!}{2!\,(3-2)!}\,(0.3)^2 (0.7)^1 = .189$
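The Martin Clothing calculation can be reproduced in a few lines. The sketch below is illustrative only; the function name `binomial_pmf` is an assumption, and `math.comb` supplies the n!/(x!(n – x)!) term.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Two purchases among the next three customers, p = .30
print(round(binomial_pmf(2, 3, 0.30), 3))
```

The result agrees with adding the three outcome probabilities (S, S, F), (S, F, S) and (F, S, S), each equal to .063.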
Binomial Probabilities and Cumulative
Probabilities
• Statisticians have developed tables that give
probabilities and cumulative probabilities for a
binomial random variable.
• These tables can be found in some statistics
textbooks.
• With modern calculators and the capability of
statistical software packages, such tables are
almost unnecessary.
283
Expected Value and Variance for
Binomial Distribution
• Expected Value: E(x) = μ = np
• Variance: Var(x) = σ^2 = np(1 – p)
• Standard Deviation: σ = √(np(1 – p))
284
Expected Value and Variance for
Binomial Distribution
Example: Martin Clothing store (n = 3, p = .3)
Expected Value: E(x) = np = 3(.3) = .9
Variance: Var(x) = np(1 – p) = 3(.3)(1 – .3) = .63
Standard Deviation: σ = √(np(1 – p)) = √.63 = .79
285
Poisson Probability Distribution
– A Poisson distributed random variable is often

useful in estimating the number of occurrences
over a specified interval of time or space.
– It is a discrete random variable that may assume
an infinite sequence of values (x = 0, 1, 2, . . . ).
286
Examples
• Number of knotholes in 14 linear feet of pine
board
• Number of vehicles arriving at a toll booth in
one hour
• Number of leaks in 100 miles of pipeline
Bell Labs used the Poisson distribution to model the

arrival of phone calls.
287
Properties of a Poisson Experiment
– The probability of an occurrence is the same for

any two intervals of equal length.
– The occurrence or nonoccurrence in any interval is
independent of the occurrence or nonoccurrence
in any other interval.
288
Poisson Probability Function
$f(x) = \dfrac{\mu^x e^{-\mu}}{x!}$
where:
x = the number of occurrences in an interval
f(x) = the probability of x occurrences in an interval
μ = mean number of occurrences in an interval
e = 2.71828
x! = x(x – 1)(x – 2) . . . (2)(1)
289
Poisson Probability Function
– Since there is no stated upper limit for the

number of occurrences, the probability function
f(x) is applicable for values x = 0, 1, 2, … without
limit.
– In practical applications, x will eventually become
large enough so that f(x) is approximately zero and
the probability of any larger values of x becomes
negligible.
290
Example: Arrivals at the drive-up teller window
of a bank
The average number of cars arriving at the drive-

up teller window of a bank in a 15 –minute
period of time on weekday mornings is 10.
What is the probability of 5 arrivals in a 15-

minute period of time on a weekday morning?
291
Example: Arrivals at the drive-up teller window of a
bank

µ = 10; x = 5
$f(5) = \dfrac{10^5 (2.71828)^{-10}}{5!} = .0378$
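The drive-up teller result can be verified numerically. This is a minimal sketch; `poisson_pmf` is a name chosen here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean per interval is mu."""
    return mu ** x * exp(-mu) / factorial(x)

# Five arrivals in a 15-minute period when the mean is 10
print(round(poisson_pmf(5, 10), 4))
```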
292
A property of the Poisson distribution is that the
mean and variance are equal.
μ = σ^2
Example: Arrivals at the drive-up teller window
of a bank
Variance for the number of cars arriving at the
drive-up teller window of a bank in a 15-minute
period of time on weekday mornings is
μ = σ^2 = 10
294
Hypergeometric Probability Distribution
The hypergeometric distribution is closely

related to the binomial distribution.
However, for the hypergeometric distribution:
the trials are not independent, and

the probability of success changes from trial
to trial
295
Hypergeometric Probability
Distribution
Hypergeometric Probability Function
$f(x) = \dfrac{\dbinom{r}{x}\dbinom{N-r}{n-x}}{\dbinom{N}{n}}$
where: x = number of successes

n = number of trials
f(x) = probability of x successes in n trials
N = number of elements in the population
r = number of elements in the population labeled
success
$f(x) = \dfrac{\dbinom{r}{x}\dbinom{N-r}{n-x}}{\dbinom{N}{n}}$ for 0 ≤ x ≤ r
• $\dbinom{r}{x}$ = number of ways x successes can be selected from a total of r successes in the population
• $\dbinom{N-r}{n-x}$ = number of ways n – x failures can be selected from a total of N – r failures in the population
• $\dbinom{N}{n}$ = number of ways n elements can be selected from a population of size N
– The probability function f(x) on the previous slide

is usually applicable for values of x = 0, 1, 2, … n.
– However, only the following values of x are valid:
• 1) x ≤ r and
• 2) n – x ≤ N – r
– If these two conditions do not hold for a value of
x, the corresponding f(x) equals 0.
298
Distribution
Example: Ontario electric
Electric fuses produced by Ontario electric are

packaged in boxes of 12 each. Suppose an
inspector randomly selects 3 of the 12 fuses in a
box for testing. If the box contains 5 defective
fuses, what is the probability that the inspector
will find exactly one of the three fuses
defective?
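$f(x) = \dfrac{\dbinom{r}{x}\dbinom{N-r}{n-x}}{\dbinom{N}{n}} = \dfrac{\dbinom{5}{1}\dbinom{7}{2}}{\dbinom{12}{3}} = \dfrac{\left(\dfrac{5!}{1!\,4!}\right)\left(\dfrac{7!}{2!\,5!}\right)}{\dfrac{12!}{3!\,9!}} = \dfrac{(5)(21)}{220} = .4773$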
where: x = 1 = number of defective fuse selected

n = 3 = number of fuses selected
N = 12 = number of fuses in total
r = 5 = number of defective fuses in total
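The Ontario Electric answer can be checked with a short sketch. The function name `hypergeom_pmf` is an assumption for illustration; the guard clause returns 0 for values of x that are not possible given r and N – r.

```python
from math import comb

def hypergeom_pmf(x, n, N, r):
    """Probability of x successes in n draws without replacement
    from a population of N elements containing r successes."""
    if x < 0 or x > r or n - x < 0 or n - x > N - r:
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Exactly one defective fuse among 3 selected from a box of 12 with 5 defective
print(round(hypergeom_pmf(1, 3, 12, 5), 4))
```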
Mean
$E(x) = \mu = n\left(\dfrac{r}{N}\right)$
Variance
$Var(x) = \sigma^2 = n\left(\dfrac{r}{N}\right)\left(1 - \dfrac{r}{N}\right)\left(\dfrac{N-n}{N-1}\right)$
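Example: Ontario electric
Mean
$\mu = n\left(\dfrac{r}{N}\right) = 3\left(\dfrac{5}{12}\right) = 1.25$
Variance
$\sigma^2 = 3\left(\dfrac{5}{12}\right)\left(1 - \dfrac{5}{12}\right)\left(\dfrac{12-3}{12-1}\right) = .60$
Standard deviation
$\sigma = \sqrt{.60} = .77$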
• Consider a hypergeometric distribution with n trials and let p =
(r/N) denote the probability of a success on the first trial.
• If the population size is large, the term (N – n)/(N – 1) approaches 1.
• The expected value and variance can be written

– E(x) = np
– Var(x) = np(1 – p).
• Note that these are the expressions for the expected value and
variance of a binomial distribution.
303
Distribution
When the population size is large, a
hypergeometric distribution can be
approximated by a binomial distribution with n
trials and a probability of success p = (r/N).
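A quick numerical comparison shows how the approximation improves as N grows. The sketch below is illustrative: it keeps the proportion of successes fixed at 5/12 (borrowed from the fuse example) and scales up the population; the population sizes and function names are assumptions chosen for the demonstration.

```python
from math import comb

def hypergeom_pmf(x, n, N, r):
    """Hypergeometric probability of x successes in n draws."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, x = 3, 1
binom = binomial_pmf(x, n, 5 / 12)
for N in (12, 120, 1200, 12000):
    r = 5 * N // 12                      # keeps p = r/N = 5/12
    hyper = hypergeom_pmf(x, n, N, r)
    print(N, round(hyper, 4), round(binom, 4))
```

For the box of 12 fuses the two probabilities differ noticeably, while for the larger populations they become nearly identical.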
304
EXERCISES
The probability distribution for the rate of return on an investment is
Rate of Return (%) Probability

9.5 .1
9.8 .2
10.0 .3
10.2 .3
10.6 .1
a. What is the probability that the rate of return will be at least 10%?
b. What is the expected rate of return?
c. What is the variance of the rate of return?
SOLUTION
EXERCISES
The number of electrical outages in a city varies from day
to day. Assume that the number of electrical outages (x)
in the city has the following probability distribution.
x f(x)
0 0.80
1 0.15
2 0.04
3 0.01
The mean and the standard deviation for the number of

electrical outages (respectively) are _____.
SOLUTION
EXERCISES
A random variable x has the following probability
distribution:
x f(x)
0 .08
1 .17
2 .45
3 .25
4 .05
a. Determine the expected value of x.

b. Determine the variance.
SOLUTION
EXERCISES
For the following probability distribution:
x f(x)
0 .01
1 .02
2 .10
3 .35
4 .20
5 .11
6 .08
7 .05
8 .04
9 .03
10 .01
a. Determine E(x).
b. Determine the variance.
c. Determine the standard deviation.
SOLUTION
EXERCISES
A company sells its products to wholesalers in batches of
1,000 units only. The daily demand for its product and the
respective probabilities are given below
Demand (Units) Probability

0 .2
1000 .2
2000 .3
3000 .2
4000 .1
a. Determine the expected daily demand.
b. Assume that the company sells its product for $3.75 per unit. What is the expected daily
revenue?
SOLUTION
EXERCISES
The demand for a product varies from month to month. Based on the
past year's data, the following probability distribution shows MNM
company's monthly demand.
x (Unit Demand)   f(x) (Probability)
0 .10
1,000 .10
2,000 .30
3,000 .40
4,000 .10
a. Determine the expected number of units demanded per month.
b. Each unit produced costs the company $8.00, and is sold for $10.00. How
much will the company gain or lose in a month if it stocks the expected number
of units demanded, but sells 2000 units?
SOLUTION
EXERCISES
The probability function for the number of insurance policies John will
sell to a customer is given by f(x) = .5 − (x/6) for x = 0, 1, or 2
a. Is this a valid probability function? Explain your answer.
b. What is the probability that John will sell exactly 2 policies to a customer?
c. What is the probability that John will sell at least 2 policies to a customer?
d. What is the expected number of policies John will sell?
e. What is the variance of the number of policies John will sell?
SOLUTION
EXERCISES
A production process produces 2% defective parts. A sample
of 5 parts from the production is selected. What is the
probability that the sample contains exactly 2 defective
parts? Use the binomial probability function and show your
computations to answer this question.
SOLUTION
Use Binomial Probability Function:
$f(x) = \dfrac{n!}{x!\,(n-x)!}\; p^x (1-p)^{(n-x)}$
n = 5
x = 2
p = 0.02
f(2) = 10(0.02)^2(0.98)^3
Answer: 0.0037648
EXERCISES
Thirty-two percent of the students in a management class

are graduate students. A random sample of 5 students is
selected. Using the binomial probability function, determine
the probability that the sample contains exactly 2 graduate
students?
SOLUTION
Apply the Binomial Probability Function:
$f(x) = \dfrac{n!}{x!\,(n-x)!}\; p^x (1-p)^{(n-x)}$
Where:
n = 5
x = 2
p = 0.32
f(2) = 10(0.32)^2(0.68)^3
Answer = 0.321978368
EXERCISES
Ten percent of the items produced by a machine are
defective. Out of 15 items chosen at random,
a.what is the probability that exactly 3 items will be defective?

b.what is the probability that less than 3 items will be defective?
c. what is the probability that exactly 11 items will be non-defective?
SOLUTION
(A) n = 15, p = 0.10, x = 3
Answer: f(3) = 0.128505439
(B) n = 15, p = 0.10, x = 0, 1, 2
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
x = 0: 0.205891132
x = 1: 0.343151887
x = 2: 0.266895912
Answer: P(X ≤ 2) = 0.815938931
(C) Exactly 11 non-defective items: n = 15, p = 0.90, x = 11
Answer: f(11) = 0.042835146
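The three answers for the defective-items exercise can be reproduced as follows. This is a minimal sketch; `binomial_pmf` and `binomial_cdf` are names assumed for illustration, and the cumulative probability is simply the sum of the individual binomial probabilities.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Probability of x or fewer successes in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

print(binomial_pmf(3, 15, 0.10))    # (A) exactly 3 defective
print(binomial_cdf(2, 15, 0.10))    # (B) fewer than 3 defective
print(binomial_pmf(11, 15, 0.90))   # (C) exactly 11 non-defective
```

Part (C) could equally be computed as exactly 4 defective items with p = 0.10, since 11 non-defective out of 15 means 4 defective.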
Chapter 6
Continuous Probability Distributions
• Uniform Probability Distribution
• Normal Probability Distribution
• Exponential Probability Distribution
[Figure: density curves f(x) for the uniform, normal, and exponential distributions]
• A continuous random variable can assume any
value in an interval on the real line or in a
collection of intervals.
• It is not possible to talk about the probability
of the random variable assuming a particular
value.
• Instead, we talk about the probability of the
random variable assuming a value within a
given interval.
328
• The probability of the random variable assuming a value
within some given interval from x1 to x2 is defined to be the
area under the graph of the probability density function
between x1 and x2.
[Figure: area under f(x) between x1 and x2 for the uniform, normal, and exponential distributions]
Uniform Probability Distribution
A random variable is uniformly distributed
whenever the probability is proportional to the
interval’s length.
The uniform probability density function is:
f(x) = 1/(b – a) for a ≤ x ≤ b
     = 0 elsewhere
where: a = smallest value the variable can assume
       b = largest value the variable can assume
330
Expected Value of x
E(x) = (a + b)/2
Variance of x
Var(x) = (b – a)^2/12
331
Example: Flight time of an airplane traveling
from Chicago to New York
Suppose the flight time can be any value in the

interval from 120 minutes to 140 minutes.
332
Uniform Probability Density Function
f(x) = 1/20 for 120 ≤ x ≤ 140
     = 0 elsewhere
where:
x = Flight time of an airplane traveling from Chicago
to New York
333
Expected Value of x
E(x) = (a + b)/2
= (120 + 140)/2
= 130
Variance of x
Var(x) = (b – a)^2/12
= (140 – 120)^2/12
= 33.33
Example: Flight time of an airplane traveling from Chicago to
New York
Probability of a flight time between 120 and 130 minutes
P(120 < x < 130) = 1/20(10) = .5
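The flight-time calculations follow directly from the uniform formulas. The sketch below is illustrative; the function names are assumptions, and the interval is clipped to [a, b] so that requests extending beyond the range are handled sensibly.

```python
def uniform_prob(a, b, x1, x2):
    """P(x1 <= x <= x2) for a uniform distribution on [a, b]."""
    lo, hi = max(a, x1), min(b, x2)       # clip the interval to [a, b]
    return max(hi - lo, 0) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_var(a, b):
    return (b - a) ** 2 / 12

print(uniform_prob(120, 140, 120, 130))   # probability of 120 to 130 minutes
print(uniform_mean(120, 140))             # expected flight time
print(round(uniform_var(120, 140), 2))    # variance
```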
336
Area as a Measure of Probability
• The area under the graph of f(x) and
probability are identical.
• This is valid for all continuous random
variables.
• The probability that x takes on a value
between some lower value x1 and some
higher value x2 can be found by computing the
area under the graph of f(x) over the interval
from x1 to x2.
337
Normal Probability Distribution
– The normal probability distribution is the most important
distribution for describing a continuous random variable.
– It is widely used in statistical inference.
– It has been used in a wide variety of applications including:
Heights of people
Test scores
Rainfall amounts
Scientific measurements
– Abraham de Moivre, a French mathematician, derived the normal
distribution in 1733; the result was later included in his book
The Doctrine of Chances.
338
Normal Probability Density Function
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\; e^{-(x-\mu)^2/2\sigma^2}$
where: μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828
339
Characteristics
• The distribution is symmetric; its skewness

measure is zero.
340
Characteristics
• The entire family of normal probability
distributions is defined by its mean μ and its
standard deviation σ.
341
Characteristics
• The highest point on the normal curve is at
the mean, which is also the median and
mode.
342
Characteristics
• The mean can be any numerical value:
negative, zero, or positive.
343
Characteristics
• The standard deviation determines the width of the
curve: larger values result in wider, flatter curves.
344
Characteristics
• Probabilities for the normal random variable
are given by areas under the curve. The total
area under the curve is 1 (.5 to the left of the
mean and .5 to the right).
Characteristics (basis for the empirical rule)
• 68.3% of values of a normal random variable are
within +/- 1 standard deviation of its mean.
• 95.4% of values of a normal random variable are
within +/- 2 standard deviations of its mean.
• 99.7% of values of a normal random variable are
within +/- 3 standard deviations of its mean.
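The empirical rule percentages can be recovered from the standard normal cumulative distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `within_k_sd` is an assumption.

```python
from statistics import NormalDist

def within_k_sd(k):
    """Probability that a normal random variable lies within k standard
    deviations of its mean."""
    z = NormalDist()                      # standard normal: mean 0, sd 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 1))
```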
Standard Normal Probability
Distribution
Characteristics
A random variable having a normal distribution
with a mean of 0 and a standard deviation of 1 is
said to have a standard normal probability
distribution.
348
Distribution
Characteristics
The letter z is used to designate the standard
normal random variable.
349
Distribution
Converting to Standard Normal Distribution
$z = \dfrac{x - \mu}{\sigma}$
We can think of z as a measure of the number of
standard deviations x is from μ.
350
Using Excel to Compute Standard
Normal Probabilities
Excel has two functions for computing probabilities
and z values for a standard normal probability
distribution.
• NORM.S.DIST function computes the cumulative
probability given a z value.
• NORM.S.INV function computes the z value given
a cumulative probability.
“S” in the function names reminds us that these
functions relate to the standard normal probability
distribution.
351
Distribution
Example: Grear Tire Company Problem
Grear Tire company has developed a new steel-

belted radial tire to be sold through a chain of
discount stores. But before finalizing the tire
mileage guarantee policy, Grear’s managers
want probability information about the number
of miles the tires will last.
356
Distribution
It was estimated that the mean tire mileage is

36,500 miles with a standard deviation of 5000.
The manager now wants to know the probability
that the tire mileage x will exceed 40,000.
P(x > 40,000) = ?

357
Distribution
Solving for the Probability
• Step 1: Convert x to standard normal distribution.
z = (x – μ)/σ
= (40,000 – 36,500)/5,000
= .7
• Step 2: Find the area under the standard normal
curve to the left of z = .7.
358
Standard Normal Probability Distribution
Cumulative Probability Table for the Standard Normal
Distribution
z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
.    .      .      .      .      .      .      .      .      .      .
.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
.    .      .      .      .      .      .      .      .      .      .
P(z < .7) = .7580
359
Distribution
Solving for the Probability
• Step 3: Compute the area under the standard
normal curve to the right of z = .7
P(z > .7) = 1 – P(z < .7)
= 1- .7580
= .2420
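The same three steps can be carried out without a table. The sketch below is illustrative and uses the error function `math.erf` to evaluate the normal cumulative probability; the function name `normal_cdf` is an assumption.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cumulative probability P(X <= x) for a normal distribution."""
    z = (x - mu) / sigma                  # Step 1: convert x to z
    return 0.5 * (1 + erf(z / sqrt(2)))   # Step 2: area to the left of z

# Step 3: area to the right of x = 40,000 for mean 36,500 and sd 5,000
p_exceed = 1 - normal_cdf(40000, 36500, 5000)
print(round(p_exceed, 4))
```

The printed value matches the table-based answer to about four decimal places; tiny differences arise because the table rounds the cumulative probability.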
[Figure: standard normal curve with area .7580 to the left of z = .7 and area 1 – .7580 = .2420 to the right]
Distribution
What should be the guaranteed mileage if Grear
wants no more than 10% of tires to be eligible
for the discount guarantee?
(Hint: Given a probability, we can use the
standard normal table in an inverse fashion to
find the corresponding z value.)
Example: Grear Tire Company Problem - Solving for the
guaranteed mileage
Step 1: Find the z value that cuts off an area of .1 in the
left tail of the standard normal distribution.
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
From the table we see that z = -1.28 cuts off an area
of 0.1 in the lower tail.
Step 2: Convert z.1 to the corresponding value of x.
x = μ + z.1σ
x = 36,500 – 1.28(5,000) = 30,100
Thus a guarantee of 30,100 miles will meet the
requirement that approximately 10% of the tires
will be eligible for the guarantee.
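The inverse lookup can also be done in software. This sketch uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later), which plays the same role as reading the table in reverse; the variable names are assumptions.

```python
from statistics import NormalDist

tire_life = NormalDist(mu=36500, sigma=5000)

# z value that cuts off an area of .10 in the lower tail
z_10 = NormalDist().inv_cdf(0.10)

# Mileage below which 10% of tires are expected to fall
guarantee = tire_life.inv_cdf(0.10)

print(round(z_10, 2), round(guarantee))
```

The software value is slightly below 30,100 because it uses a more precise z than the two-decimal table value of -1.28.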
366
Using Excel to Compute Normal
Probabilities
Excel has two functions for computing
cumulative probabilities and x values for any
normal distribution:
– NORM.DIST is used to compute the cumulative
probability given an x value.
– NORM.INV is used to compute the x value given a
cumulative probability.
367
Exponential Probability Distribution
• The exponential probability distribution is useful
in describing the time it takes to complete a task.
• The exponential random variables can be used to
describe:
• Time between vehicle arrivals at a toll booth
• Time required to complete a questionnaire
• Distance between major defects in a highway
• In waiting line applications, the exponential
distribution is often used for service times.
370
• A property of the exponential distribution is
that the mean and standard deviation are
equal.
• The exponential distribution is skewed to the
right. Its skewness measure is 2.
371
Density Function
$f(x) = \dfrac{1}{\mu}\, e^{-x/\mu}$ for x ≥ 0
where: μ = expected value or mean
e = 2.71828
372
Cumulative Probabilities
$P(x \le x_0) = 1 - e^{-x_0/\mu}$

where:
x0 = some specific value of x
373
Example: Loading time for trucks
Suppose x represents the loading time for a

truck at the Schips loading dock and follows
exponential distribution. If the mean or average
loading time is 15 minutes, What is the
probability that loading a truck will take 6
minutes or less?
374
Example: Loading time for trucks
$P(x \le x_0) = 1 - e^{-x_0/\mu}$
$P(x \le 6) = 1 - e^{-6/15}$
= .3297
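The loading-time probability is a one-line computation. A minimal sketch, with `exponential_cdf` as an assumed name:

```python
from math import exp

def exponential_cdf(x0, mu):
    """P(x <= x0) for an exponential distribution with mean mu."""
    if x0 < 0:
        return 0.0
    return 1 - exp(-x0 / mu)

# Loading a truck in 6 minutes or less when the mean is 15 minutes
print(round(exponential_cdf(6, 15), 4))
```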
375
Using Excel to Compute Exponential
Probabilities
The EXPON.DIST function can be used to
compute exponential probabilities.
The EXPON.DIST function has three arguments:
– 1st: The value of the random variable x
– 2nd: 1/μ, the inverse of the mean number of
occurrences in an interval
– 3rd: “TRUE” or “FALSE”. We will always enter
“TRUE” because we’re seeking a cumulative
probability.
Relationship between the Poisson and
Exponential Distributions
The Poisson distribution

provides an appropriate description
of the number of occurrences
per interval
The exponential distribution

provides an appropriate description
of the length of the interval
between occurrences
380
EXERCISES
Suppose x is a normally distributed random variable
with a mean of 22 and a standard deviation of 5. The
probability that x is less than 9.7 is _____.
SOLUTION
EXERCISES
The ages of students at a university are normally
distributed with a mean of 21. What percentage of the
student body is at least 21 years old?
SOLUTION
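Mean = 21. Because the normal distribution is symmetric about its mean,
P(x ≥ 21) = .50, so 50% of the student body is at least 21 years old.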
EXERCISES
The weight of football players is normally distributed with a
mean of 200 pounds and a standard deviation of 25 pounds.
The probability of a player weighing more than 241.25 pounds
is _____.
SOLUTION
mean = 200
std dev = 25
z = (x – µ)/σ = (241.25 – 200)/25 = 1.65
P(x ≥ 241.25) = P(z ≥ 1.65)
Using the table: P(x ≥ 241.25) = 1 – 0.9505 = 0.0495
EXERCISES
The starting salaries of individuals with an MBA

degree are normally distributed with a mean of
$40,000 and a standard deviation of $5,000.
What is the probability that a randomly selected
individual with an MBA degree will get a starting
salary of at least $30,000?
SOLUTION
mean = $40,000
std dev = $5,000
z = (x – µ)/σ = (30,000 – 40,000)/5,000 = –2
Using the table: P(x ≥ $30,000) = P(z ≥ –2) = 1 – 0.0228 = 0.9772

EXERCISES
The life expectancy of a particular brand of tire is
normally distributed with a mean of 40,000 and a
standard deviation of 5,000 miles. What is the
probability that a randomly selected tire will have a life
of at least 47,500 miles?
SOLUTION
mean = 40,000
std dev = 5,000
z = (x – µ)/σ = (47,500 – 40,000)/5,000 = 1.5
Using the table: P(x ≤ 47,500) = P(z ≤ 1.5) = 0.9332
P(x ≥ 47,500) = 1 – 0.9332 = 0.0668
EXERCISES
f(x) = (1/10) e^(–x/10) for x ≥ 0
The mean of x is _____.

SOLUTION
Density Function
$f(x) = \dfrac{1}{\mu}\, e^{-x/\mu}$ for x ≥ 0
Comparing with f(x) = (1/10) e^(–x/10) for x ≥ 0:
μ = 10
EXERCISES
A random variable x is uniformly distributed between
45 and 150.
a. Determine the probability of x = 48.
b. What is the probability of x ≤ 60?
c. What is the probability of x ≥ 50?
d. Determine the expected value of x and its standard
deviation.
SOLUTION
f(x) = 1/(b – a) for a ≤ x ≤ b
     = 0 elsewhere
where: a = smallest value the variable can assume = 45
       b = largest value the variable can assume = 150
f(x) = 1/(150 – 45) = 0.00952381
a. P(x = 48) = 0
b. P(45 ≤ x ≤ 60) = (60 – 45) f(x) = 0.1429
c. P(50 ≤ x ≤ 150) = (150 – 50) f(x) = 0.9524
d. Formulas for the theoretical mean and standard deviation are
μ = (a + b)/2 and σ = √((b – a)^2/12)
mean = (45 + 150)/2 = 97.5
std dev = ((150 – 45)^2/12)^(1/2) = 30.311
EXERCISES
The price of a bond is uniformly distributed between

$80 and $85.
a. What is the probability that the bond price will be
at least $83?
b. What is the probability that the bond price will be
between $81 and $90?
c. Determine the expected price of the bond.
d. Compute the standard deviation for the bond price.
SOLUTION
f(x) = 1/(b – a) = 1/(85 – 80) = 0.2000
a. P(83 ≤ x ≤ 85) = (85 – 83) f(x) = 0.40
b. P(81 ≤ x ≤ 85) = (85 – 81) f(x) = 0.80 (the price cannot exceed $85)
c. E(x) = (80 + 85)/2 = $82.50
d. σ = √((b – a)^2/12) = (((85 – 80)^2)/12)^(1/2) = $1.44

EXERCISES
The time it takes to hand carve a guitar neck is uniformly
distributed between 110 and 190 minutes.
a. What is the probability that a guitar neck can be carved

between 95 and 165 minutes?
b. What is the probability that the guitar neck can be carved
between 120 and 200 minutes?
c. Determine the expected completion time for carving the
guitar neck.
d. Compute the standard deviation.
SOLUTION
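f(x) = 1/(190 – 110) = 0.0125
a. P(110 ≤ x ≤ 165) = (165 – 110) f(x) = 0.6875 (the time cannot be less than 110 minutes)
b. P(120 ≤ x ≤ 190) = (190 – 120) f(x) = 0.8750 (the time cannot exceed 190 minutes)
c. E(x) = (110 + 190)/2 = 150.00 minutes
d. σ = √((b – a)^2/12) = (((190 – 110)^2)/12)^(1/2) = 23.09 minutes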

EXERCISES
The length of time patients must wait to see a doctor in a local
clinic is uniformly distributed between 15 minutes and 2 1/2
hours.
a. Define the random variable in words.
b. What is the probability of a patient waiting exactly 50
minutes?
c. What is the probability that a patient would have to wait
between 45 minutes and 2 hours?
d. Compute the probability that a patient would have to wait
over 2 hours.
e. Determine the expected waiting time and its standard
deviation.
SOLUTION
a = 15 minutes, b = 150 minutes
f(x) = 1/(b – a) = 1/(150 – 15) = 0.007407
a. x = length of time patients must wait to see a doctor in a local clinic
b. P(x = 50) = (50 – 50) f(x) = 0
c. P(45 ≤ x ≤ 120) = (120 – 45) f(x) = 0.555556
d. P(120 ≤ x ≤ 150) = (150 – 120) f(x) = 0.222222
e. E(x) = (a + b)/2 = (150 + 15)/2 = 82.5 minutes
σ = √((b – a)^2/12) = (((150 – 15)^2)/12)^(1/2) = 38.97114 minutes
EXERCISES
The time it takes a mechanic to change the oil in a car is exponentially

distributed with a mean of 5 minutes.
a. What is the probability density function for the time it takes to
change the oil?
b. What is the probability that it will take a mechanic less than 6
minutes to change the oil?
c. What is the probability that it will take a mechanic between 3 and 5
minutes to change the oil?
SOLUTION
f(x) = (1/μ) e^(–x/μ) for x ≥ 0
where: μ = expected value or mean
e = 2.71828
a. f(x) = (1/5) e^(–x/5) for x ≥ 0
b. P(x ≤ 6) = 1 – 2.71828^(–6/5) = 0.698805545
c. P(3 ≤ x ≤ 5) = P(x ≤ 5) – P(x ≤ 3) = 0.180932169

EXERCISES
The time it takes to completely tune an engine of an automobile follows an exponential
distribution with a mean of 40 minutes.
a. Define the random variable in words.
b. What is the probability of tuning an engine in 30 minutes or less?
c. What is the probability of tuning an engine between 30 and 35 minutes?
SOLUTION
f(x) = (1/μ) e^(–x/μ) for x ≥ 0
mean = 40
a. x = time it takes to completely tune an engine of an automobile
b. P(x ≤ 30) = 1 – 2.71828^(–30/40) = 0.52763
c. P(30 ≤ x ≤ 35) = P(x ≤ 35) – P(x ≤ 30) = 0.055504526

EXERCISES
The 2GO Island Ferry leaves on the hour and at 15-minute intervals. The time,
x, it takes Lexeleen to drive from her house to the ferry has a uniform
distribution with x between 10 and 20 minutes. One morning Lexeleen leaves
her house at precisely 8:00 A.M.
a. What is the probability Lexeleen will wait less than 5 minutes for the ferry?
b. What is the probability Lexeleen will wait less than 10 minutes for the ferry?
c. What is the probability Lexeleen will wait less than 15 minutes for the ferry?
d. What is the probability Lexeleen will not have to wait for the ferry?
e. Suppose Lexeleen leaves at 8:05 A.M. What is the probability Lexeleen will
wait (1) less than 5 minutes for the ferry; (2) less than 10 minutes for the ferry?
f. Suppose Lexeleen leaves at 8:10 A.M. What is the probability Lexeleen will
wait (1) less than 5 minutes for the ferry; (2) less than 10 minutes for the ferry?
g. What appears to be the best time for Lexeleen to leave home if she wishes
to maximize the probability of waiting less than 10 minutes for the ferry?
EXERCISE
Suppose that the random variable x is normally distributed
with mean μ = 500 and standard deviation σ = 100. For each
of the following, use the normal table to find the needed
value k.
a. P(x ≥ k) = .025
b. P(x ≥ k) = .05
c. P(x < k) = .025
d. P(x ≤ k) = .015
e. P(x < k) = .985
f. P(x > k) = .95
g. P(x ≤ k) = .975
h. P(x ≥ k) = .0228
i. P(x > k) = .9772
SOLUTION
a. P(x ≥ k) = 0.0250
P(z ≥ 1.96) = 0.0250
k = 1.96(100) + 500 = 696
P(x ≥ 696) = 0.0250
b. P(z ≥ 1.645) = 0.0500
k = 1.645(100) + 500 = 664.5
P(x ≥ 664.5) = 0.0500
c. P(z < -1.96) = 0.0250
k = -1.96(100) + 500 = 304
P(x < 304) = 0.0250
d. P(z ≤ -2.17) = 0.0150
k = -2.17(100) + 500 = 283
P(x ≤ 283) = 0.0150
e. P(z < 2.17) = 0.9850
k = 2.17(100) + 500 = 717
P(x < 717) = 0.9850
f. P(z > -1.645) = 0.9500
k = -1.645(100) + 500 = 335.5
P(x > 335.5) = 0.9500
g. P(z ≤ 1.96) = 0.9750
k = 1.96(100) + 500 = 696
P(x ≤ 696) = 0.9750
h. P(z ≥ 2.00) = 0.0228
k = 2.00(100) + 500 = 700
P(x ≥ 700) = 0.0228
i. P(z > -2.00) = 0.9772
k = -2.00(100) + 500 = 300
P(x > 300) = 0.9772
[Figures: distribution plots for parts a to i (Normal, Mean = 500, StDev = 100), each shading the stated area relative to k]
EXERCISE
Weekly demand at a grocery store for a brand of breakfast
cereal is normally distributed with a mean of 800 boxes and
a standard deviation of 75 boxes.
a. What is the probability that weekly demand is 959 boxes
or less? More than 1,004 boxes? Less than 650 boxes
or greater than 950 boxes?
b. The store orders cereal from a distributor weekly. How
many boxes should the store order for a week to have only
a 2.5 percent chance of running short of this brand of cereal
during the week?
SOLUTION
a. (1) P(x ≤ 959) = P((x – μ)/σ ≤ (959 – 800)/75) = P(z ≤ 2.12) = .9830
(2) P(x > 1004) = P((x – μ)/σ > (1004 – 800)/75) = P(z > 2.72) = 1 – .9967 = .0033
(3) P(x < 650) + P(x > 950) = P(z < –2) + P(z > 2) = .0228 + (1 – .9772) = .0456
b. P(x > order) = 0.025 and P(z > 1.96) = 0.025
z = (x – μ)/σ = (order – 800)/75 = 1.96
order = 800 + 1.96(75) = 947 boxes of cereal
EXERCISE
United Motors claims that one of its cars, the Starbird 300,
gets city driving mileages that are normally distributed with a
mean of 30 mpg and a standard deviation of 1 mpg.
Let x denote the city driving mileage of a randomly selected
Starbird 300.
a.Assuming that United Motors' claim is correct, find P(x ≤
27).
b.If you purchase (randomly select) a Starbird 300 and your
car gets 27 mpg in city driving, what do you think of United
Motors' claim? Explain your answer.
SOLUTION
a. P(x ≤ 27) = P((x – μ)/σ ≤ (27 – 30)/1) =
P(z ≤ -3.00) = 0.00135
b. Claim is probably not true, because the

probability is very low of randomly purchasing a
car getting no more than 27 mpg if the mean is
actually 30 mpg.
EXERCISE
In the book Advanced Managerial Accounting, Robert P. Magee discusses monitoring cost variances. A cost
variance is the difference between a budgeted cost and an actual cost. Magee considers weekly monitoring of the
cost variances of two manufacturing processes, Process A and Process B. One individual monitors both processes
and each week receives a weekly cost variance report for each process. The individual has decided to investigate the
weekly cost variance for a particular process (to determine whether or not the process is out of control) when its
weekly cost variance is too high. To this end, a weekly cost variance will be investigated if it exceeds $2,500.
a. When Process A is in control, its potential weekly cost variances are normally distributed with a mean of $0 and a
standard deviation of $5,000. When Process B is in control, its potential weekly cost variances are normally
distributed with a mean of $0 and a standard deviation of $10,000. For each process, find the probability that a
weekly cost variance will be investigated (that is, will exceed $2,500) even though the process is in control. Which in-
control process will be investigated more often?
b. When Process A is out of control, its potential weekly cost variances are normally distributed with a mean of
$7,500 and a standard deviation of $5,000. When Process B is out of control, its potential weekly cost variances are
normally distributed with a mean of $7,500 and a standard deviation of $10,000. For each process, find the
probability that a weekly cost variance will be investigated (that is, will exceed $2,500) when the process is out of
control. Which out-of-control process will be investigated more often?
c. If both Processes A and B are almost always in control, which process will be investigated more often?
d. Suppose that we wish to reduce the probability that Process B will be investigated (when it is in control) to .3085.
What cost variance investigation policy should be used? That is, how large a cost variance should trigger an
investigation? Using this new policy, what is the probability that an out-of-control cost variance for Process B will be
investigated?
SOLUTION
a. Process A in control: μA = $0, σA = $5,000
Process B in control: μB = $0, σB = $10,000
P(xA > 2,500) = P((x − μ)/σ > (2,500 − 0)/5,000) = P(zA > 0.50) = 1 – 0.6915 = 0.3085
P(xB > 2,500) = P((x − μ)/σ > (2,500 − 0)/10,000) = P(zB > 0.25) = 1 – 0.5987 = 0.4013
Process B is investigated more often.
b. Process A out of control: μA = $7,500, σA = $5,000
Process B out of control: μB = $7,500, σB = $10,000
P(xA > 2,500) = P((x − μ)/σ > (2,500 − 7,500)/5,000) = P(zA > -1.00) = 1 – 0.1587 = 0.8413
P(xB > 2,500) = P((x − μ)/σ > (2,500 − 7,500)/10,000) = P(zB > -0.50) = 1 – 0.3085 = 0.6915
Process A is investigated more often.
c. Process B will be investigated more often.
d. Find k so that P(xB > k) = .3085 when Process B is in control, so use μB = $0 and σB = $10,000.
P(z > 0.50) = .3085 implies that z = (x − μ)/σ = (k − 0)/10,000 = 0.50. Thus k = 5,000, and we will investigate Process B if
the cost variance exceeds $5,000.
If Process B is out of control we use μB = $7,500 and σB = $10,000. Thus the probability of investigating an
out-of-control Process B is:
P(xB > 5,000) = P((x − μ)/σ > (5,000 − 7,500)/10,000) = P(zB > -0.25) = 1 – 0.4013 = 0.5987
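The same calculation repeats for every process and state, so it can be wrapped in a small helper. This is a minimal sketch rather than part of the original solution; the function name prob_investigated and its default threshold are assumptions made for illustration.

```python
from statistics import NormalDist

def prob_investigated(mean, sd, threshold=2500):
    """P(weekly cost variance > threshold) for a normally distributed variance."""
    return 1 - NormalDist(mu=mean, sigma=sd).cdf(threshold)

# Parts a and b: in-control (mean 0) and out-of-control (mean 7,500) cases.
a_in = prob_investigated(0, 5000)
b_in = prob_investigated(0, 10000)
a_out = prob_investigated(7500, 5000)
b_out = prob_investigated(7500, 10000)

# Part d: threshold k with P(xB > k) = .3085 when B is in control,
# then the chance of investigating an out-of-control Process B at $5,000.
k = NormalDist(mu=0, sigma=10000).inv_cdf(1 - 0.3085)
b_out_new_policy = prob_investigated(7500, 10000, threshold=5000)

print(round(a_in, 4), round(b_in, 4), round(a_out, 4), round(b_out, 4))
print(round(k), round(b_out_new_policy, 4))
```

The computed k lands within a few dollars of $5,000 because .3085 is itself a rounded table value.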
EXERCISE
A business executive, transferred from Chicago to Atlanta, needs to sell her house in
Chicago quickly. The executive’s employer has offered to buy the house for $210,000, but
the offer expires at the end of the week. The executive does not currently have a better offer
but can afford to leave the house on the market for another month. From conversations with
her realtor, the executive believes the price she will get by leaving the house on the market
for another month is uniformly distributed between $200,000 and $225,000.
A) If she leaves the house on the market for another month, what is the mathematical
expression for the probability density function of the sales price?
B) If she leaves it on the market for another month, what is the probability that she will get at
least $215,000 for the house?
C) If she leaves it on the market for another month, what is the probability that she will get
less than $210,000?
D) Should the executive leave the house on the market for another month? Why or why not?
SOLUTION
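The slide leaves this solution blank. As a hedged sketch of one way to approach parts A to C, the code below assumes the sale price is uniform on [$200,000, $225,000] as stated in the exercise; the helper uniform_cdf and the variable names are illustrative, and part D remains a judgment call that the expected price only informs.

```python
low, high = 200_000, 225_000

def uniform_cdf(x, a=low, b=high):
    """P(X <= x) for X uniformly distributed on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Part A: the density is constant on [low, high] and 0 elsewhere.
density = 1 / (high - low)

# Part B: probability of at least $215,000.
p_at_least_215k = 1 - uniform_cdf(215_000)

# Part C: probability of less than $210,000 (the employer's offer).
p_below_210k = uniform_cdf(210_000)

# Input for part D: expected price if the house stays on the market.
expected_price = (low + high) / 2

print(density, p_at_least_215k, p_below_210k, expected_price)
```

Comparing expected_price with the $210,000 offer is one input to part D; how much risk the executive is willing to accept is another.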
EXERCISES
The NCAA estimates that the yearly value of a full athletic
scholarship at in-state public universities is $19,000 (The
Wall Street Journal, March 12, 2012). Assume the
scholarship value is normally distributed with a standard
deviation of $2100.
A. For the 10% of athletic scholarships of least value, how
much are they worth?
B. What percentage of athletic scholarships are valued at
$22,000 or more?
C. For the 3% of athletic scholarships that are most valuable,
how much are they worth?
SOLUTION
EXERCISE
Motorola used the normal distribution to determine the probability of defects
and the number of defects expected in a production process. Assume a
production process produces items with a mean weight of 10 ounces.
Calculate the probability of a defect and the expected number of defects for
a 1000-unit production run in the following situations.
A. The process standard deviation is .15, and the process control is set at
plus or minus one standard deviation. Units with weights less than 9.85
or greater than 10.15 ounces will be classified as defects.
B. Through process design improvements, the process standard deviation
can be reduced to .05. Assume the process control remains the same,
with weights less than 9.85 or greater than 10.15 ounces being
classified as defects.
C. What is the advantage of reducing process variation, thereby causing
process control limits to be at a greater number of standard deviations
from the mean?
SOLUTION
EXERCISE
During early 2012, economic hardship was stretching the limits of
France’s welfare system. One indicator of the level of hardship was the
increase in the number of people bringing items to a Paris pawnbroker.
That number had risen to 658 per day (Bloomberg Businessweek, March
5–March 11, 2012). Assume the number of people bringing items to the
pawnshop per day in 2012 is normally distributed with a mean of 658.
A.Suppose you learn that on 3% of the days, 610 or fewer people brought
items to the pawnshop. What is the standard deviation of the number of
people bringing items to the pawnshop per day?
B.On any given day, what is the probability that between 600 and 700
people bring items to the pawnshop?
C.How many people bring items to the pawnshop on the busiest 3% of
days?
SOLUTION
EXERCISE
The port of South Louisiana, located along 54 miles of the Mississippi
River between New Orleans and Baton Rouge, is the largest bulk cargo
port in the world. The U.S. Army Corps of Engineers reports that the port
handles a mean of 4.5 million tons of cargo per week (USA Today,
September 25, 2012). Assume that the number of tons of cargo handled
per week is normally distributed with a standard deviation of .82 million
tons.
A. What is the probability that the port handles less than 5 million tons of
cargo per week?
B. What is the probability that the port handles 3 million or more tons of
cargo per week?
C. What is the probability that the port handles between 3 million and 4
million tons of cargo per week?
SOLUTION
Chapter 7
Sampling and
Sampling
Distributions
SAMPLE
A collection of data from part of the population.
Samples should be chosen randomly.
Example: You ask 20 randomly chosen students in
a Business Statistics class why they are always
late. Your sample is the 20 students, while the population is
all the students in the class.
Selecting a Sample
Simple Random Sample (Finite Population)
A simple random sample of size n from a finite population of size N is a sample
selected such that each possible sample of size n has the same probability of
being selected.
For example, in the car mileage case, the automaker has decided to select a sample of
50 cars by randomly selecting one car from the 100 cars produced on each of 50 consecutive
production shifts.
Random Sample (Infinite Population)

A random sample of size n from an infinite population is a sample selected such
that the following conditions are satisfied.
• Each element selected comes from the same population.
• Each element is selected independently.
As an example of selecting a random sample from an infinite population, consider the
population of customers arriving at a fast-food restaurant. Suppose an employee is asked
to select and interview a sample of customers in order to develop a profile of customers who
visit the restaurant. The customer arrival process is ongoing and there is no way to obtain a
list of all customers in the population.
EXAMPLE
Consider a finite population with five elements labeled A, B, C, D, and E. Ten
possible simple random samples of size 2 can be selected.
List the 10 samples beginning with AB, AC, and so on
Answer: AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
Using simple random sampling, what is the probability that each sample of size 2 is
selected?
Answer: With 10 samples, each has a 1/10 probability
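The list of samples can also be generated mechanically. A minimal sketch using the Python standard library (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D", "E"]

# Every unordered sample of size 2, in the same order as the slide's list.
samples = ["".join(pair) for pair in combinations(population, 2)]

# Under simple random sampling each sample is equally likely.
probability_each = Fraction(1, len(samples))

print(samples)
print(probability_each)
```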

EXAMPLE
Indicate which of the following situations involve sampling from a finite population and
which involve sampling from an infinite population.
a. Select a sample of licensed drivers in the state of New York.

Answer: finite
b. Select a sample of boxes of cereal off the production line for the Breakfast Choice
Company.
Answer: infinite
c. Select a sample of cars crossing the Golden Gate Bridge on a typical weekday.
Answer: infinite
d. Select a sample of students in a statistics course at Indiana University.
Answer: finite
e. Select a sample of the orders being processed by a mail-order firm.
Answer: infinite
Point Estimation
A sample statistic that estimates a population
parameter (e.g., the sample mean is a point estimate
of the population mean)
Properties:
Unbiased: E(estimate) = parameter
Efficient: Var(estimate) is the lowest among
competing point estimators
Consistent: Var(estimate) decreases (usually to
0) as the sample size increases
Point Estimation
Example 1: The following data are from a simple random
sample.
5 8 10 7 10 14
a) What is the point estimate of the population mean?
Answer: x̄ = Σxᵢ/n = 54/6 = 9
b) What is the point estimate of the population standard
deviation?
Answer: s = √(Σ(xᵢ − x̄)²/(n − 1)) = √(48/(6 − 1)) = 3.1
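Both point estimates can be checked with the Python standard library. A minimal sketch; note that statistics.stdev uses the n − 1 divisor, which matches the sample standard deviation formula in this example.

```python
from statistics import mean, stdev

data = [5, 8, 10, 7, 10, 14]

x_bar = mean(data)   # point estimate of the population mean
s = stdev(data)      # point estimate of the population standard deviation (n - 1 divisor)

print(x_bar, round(s, 1))
```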
Point Estimation
Example 2: A survey question for a sample of
150 individuals yielded 75 YES responses, 55 NO
responses, and 20 No Opinions.
a) What is the point estimate of the proportion
in the population who respond Yes?
Answer: p̄ = 75/150 = 0.50
b) What is the point estimate of the proportion
in the population who respond No?
Answer: p̄ = 55/150 = 0.3667
Introduction to Sampling Distribution
A sampling distribution shows every possible value
a statistic can take across every possible sample from a
population, and how often each value occurs.
EXAMPLE: Random samples of size three are drawn without
replacement from a population consisting of the four numbers 4, 5, 5, 7.
Find the sample mean X̄ for each sample and construct the sampling
distribution of X̄. Calculate the mean and standard deviation of this
sampling distribution. Compare your calculations with the population
parameters.
SOLUTION: The population values are 4, 5, 5, 7, with population size N = 4
and sample size n = 3. Thus, the number of possible samples that
can be drawn without replacement is C(N, n) = C(4, 3) = 4.
SAMPLE NO.   SAMPLE VALUES   SAMPLE MEAN (X̄)
1            4, 5, 5         14/3
2            4, 5, 7         16/3
3            4, 5, 7         16/3
4            5, 5, 7         17/3
The sampling distribution of the sample mean X̄ and its mean and
standard deviation are:
X̄       f    f(X̄)   X̄ f(X̄)   X̄² f(X̄)
14/3    1    1/4     14/12     196/36
16/3    2    2/4     32/12     512/36
17/3    1    1/4     17/12     289/36
Total   4    1       63/12     997/36
μX̄ = Σ X̄ f(X̄) = 63/12 = 5.25
σX̄ = √(Σ X̄² f(X̄) − [Σ X̄ f(X̄)]²) = √(997/36 − (63/12)²) = 0.3632
The mean and the standard deviation of the population are:
X     4    5    5    7    ΣX = 21
X²    16   25   25   49   ΣX² = 115
μ = ΣX/N = 21/4 = 5.25
σ = √(ΣX²/N − μ²) = √(115/4 − (5.25)²) = 1.0897
HENCE: the mean of the sampling distribution equals the population
mean (5.25). The standard deviation of the sampling distribution (0.3632) is
smaller than the population standard deviation (1.0897); it equals
(σ/√n) · √((N − n)/(N − 1)) = (1.0897/√3) · √(1/3) = 0.3632.
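The quantities in this example can be recomputed by brute-force enumeration. The following is a minimal sketch with illustrative variable names; it also evaluates the standard-error formula with the finite population correction factor for comparison.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [4, 5, 5, 7]
n = 3
N = len(population)

# All samples of size n drawn without replacement (the two 5s are distinct units).
sample_means = [mean(sample) for sample in combinations(population, n)]

mean_of_sample_means = mean(sample_means)
sd_of_sample_means = pstdev(sample_means)

pop_mean = mean(population)
pop_sd = pstdev(population)

# Standard error with the finite population correction factor.
sd_from_formula = (pop_sd / sqrt(n)) * sqrt((N - n) / (N - 1))

print(round(mean_of_sample_means, 4), round(sd_of_sample_means, 4))
print(round(pop_mean, 4), round(pop_sd, 4), round(sd_from_formula, 4))
```

Here pstdev (the population standard deviation, divisor N) is used in both places because the four samples are the entire set of possible samples, not a sample of them.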
Sampling Distribution of X̄
The sampling distribution of X̄ is the probability distribution
of all possible values of the sample mean X̄.
The central limit theorem states that if a large enough
sample is taken (typically n ≥ 30), then the sampling
distribution of X̄ is approximately a normal distribution
with a mean of μ and a standard deviation of σ/√n (also
known as the standard error of the mean).
NOTE: For a finite population, a correction factor of √((N − n)/(N − 1))
applies to the standard deviation if the sample size is greater
than 5% of the population.
EXERCISES
EX1: A population has a mean of 200 and a standard deviation of 50. Suppose
a simple random sample of size 100 is selected and x̄ is used to estimate
μ.
a. What is the probability that the sample mean will be within ±5 of the
population mean?
Ans. The sampling distribution is normal with:
E(x̄) = μ = 200
σx̄ = σ/√n = 50/√100 = 5
z = (x̄ − μ)/σx̄ = 5/5 = 1; P(195 ≤ x̄ ≤ 205) = 0.8413 − 0.1587 = 0.6826
b. What is the probability that the sample mean will be within ±10 of the
population mean?
Ans. For ±10, (x̄ − μ) = 10, so z = 10/5 = 2
Area = 0.9772 − 0.0228 = 0.9544
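A short software check of EX1, offered as a minimal sketch: it models the sampling distribution of the sample mean as normal with mean 200 and standard error 50/√100, as in the answer, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 200, 50, 100

# Sampling distribution of the sample mean: normal with standard error sigma / sqrt(n).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_within_5 = sampling_dist.cdf(mu + 5) - sampling_dist.cdf(mu - 5)
p_within_10 = sampling_dist.cdf(mu + 10) - sampling_dist.cdf(mu - 10)

print(round(p_within_5, 4), round(p_within_10, 4))
```

The results agree with the table-based answers to about three decimal places; small differences come from rounding in the printed z table.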
Sample Proportion
The Sampling Distribution of the Sample Proportion p̄
The sample proportion is the proportion of individuals in a sample
sharing a certain trait, denoted p̄.
1. Approximately has a normal distribution, if the sample size n is large.
2. Has mean μp̄ = p
3. Has standard deviation σp̄ = √(p(1 − p)/n)
Stated equivalently, the sampling distribution of p̄ has mean μp̄ = p,
has standard deviation σp̄ = √(p(1 − p)/n), and is approximately a normal
distribution (if the sample size n is large).
Other Sampling Methods
Stratified Random Sampling. In stratified
random sampling, the elements in the
population are first divided into groups called
strata, such that each element in the population
belongs to one and only one stratum. The basis
for forming the strata, such as department,
location, age, industry type, and so on, is at the
discretion of the designer of the sample.
Cluster Sampling. In cluster sampling, the
elements in the population are first divided into
separate groups called clusters. Each element of
the population belongs to one and only one
cluster. A simple random sample of the clusters
is then taken. All elements within each sampled
cluster form the sample.
Systematic Sampling. In some sampling
situations, especially those with large
populations, it is time-consuming to select a
simple random sample by first finding a random
number and then counting or searching through
the list of the population until the corresponding
element is found. An alternative to simple
random sampling is systematic sampling.
Convenience Sampling is a nonprobability
sampling technique. As the name implies, the
sample is identified primarily by convenience.
Elements are included in the sample without
pre-specified or known probabilities of being
selected. For example, a professor conducting
research at a university may use student
volunteers to constitute a sample simply
because they are readily available and will
participate as subjects for little or no cost.
Judgment Sampling. In this approach, the
person most knowledgeable on the subject of
the study selects elements of the population
that he or she feels are most representative of
the population. Often this method is a relatively
easy way of selecting a sample.
EXERCISES
Go to your canvas account
Chapter 8
Interval Estimation
INTERVAL ESTIMATION
Interval estimation is the use of sample data to calculate an
interval of possible (or probable) values of an unknown population
parameter, in contrast to point estimation, which yields a single number.
Formula:
μ = x̄ ± zα/2 · σ/√n
Where:
x̄ = sample mean ; zα/2 = z value with an area of α/2 in the upper tail
α = significance level (1 − α is the confidence coefficient) ; σ = population standard deviation
n = sample size
Interval Estimation
• Population Mean: σ Known
• Population Mean: σ Unknown
• Determining the Sample Size
• Population Proportion
• Big Data and Interval Estimation
Margin of Error and the Interval
Estimate
• A point estimator cannot be expected to provide the
exact value of the population parameter.
• An interval estimate can be computed by adding and
subtracting a margin of error to the point estimate.
Point Estimate +/- Margin of Error
• The purpose of an interval estimate is to provide
information about how close the point estimate is to
the value of the parameter.
Margin of Error and the Interval
Estimate
The general form of an interval estimate of a
population mean is
x̄ ± Margin of Error
Interval Estimate of a Population
Mean: σ Known
– In order to develop an interval estimate of a
population mean, the margin of error must be
computed using either:
– the population standard deviation σ, or
– the sample standard deviation s
– σ is rarely known exactly, but often a good estimate
can be obtained based on historical data or other
information.
– We refer to such cases as the σ known case.
Interval Estimate of a Population Mean: σ Known
There is a 1 − α probability that the value of a sample
mean will provide a margin of error of zα/2 σx̄ or less.
[Figure: sampling distribution of x̄ with an area of α/2 in each tail]
Interval Estimate of a Population Mean: σ Known
Interval Estimate of μ
x̄ ± zα/2 · σ/√n
where: x̄ is the sample mean
1 − α is the confidence coefficient
zα/2 is the z value providing an area of α/2 in the upper
tail of the standard normal probability distribution
σ is the population standard deviation
n is the sample size
Interval Estimate of a Population Mean: σ Known
Values of zα/2 for the Most Commonly Used Confidence
Levels
Confidence level   α     α/2    Look-up area   zα/2
90%                .10   .05    .9500          1.645
95%                .05   .025   .9750          1.960
99%                .01   .005   .9950          2.576
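The z values in this table come from the inverse of the standard normal cumulative distribution. A minimal sketch of the look-up using the Python standard library (variable names are illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_values = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    lookup_area = 1 - alpha / 2          # area to the left of the z value
    z_values[confidence] = round(standard_normal.inv_cdf(lookup_area), 3)

print(z_values)
```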
Meaning of Confidence
Because 90% of all the intervals constructed using x̄ ±
1.645σx̄ will contain the population mean, we say
we are 90% confident that the interval
x̄ ± 1.645σx̄ includes the population mean μ.
• We say that this interval has been established at the
90% confidence level.
• The value .90 is referred to as the confidence
coefficient.
Interval Estimate of a Population Mean: σ Known
Example: Lloyd's Department Store
Each week Lloyd's Department Store selects a simple
random sample of 100 customers in order to learn about
the amount spent per shopping trip. The historical data
indicate that the population follows a normal
distribution.
During the most recent week, Lloyd's surveyed 100
customers (n = 100) and obtained a sample mean of x̄ =
$82. Based on historical data, Lloyd's now assumes a
known value of σ = $20. The confidence coefficient to be
used in the interval estimate is .95.
Interval Estimate of a Population Mean: σ Known
95% of the sample means that can be observed are
within ± 1.96 σx̄ of the population mean μ. The
margin of error is:
zα/2 · σ/√n = 1.96 (20/√100) = 3.92
Interval Estimate of a Population Mean: σ
Known
Interval estimate of μ is:
$82 ± $3.92
or
$78.08 to $85.92
We are 95% confident that the interval contains the
population mean.
Interval Estimate of a Population Mean: σ Known
Confidence level   Margin of error   Interval estimate
90%                3.29              78.71 – 85.29
95%                3.92              78.08 – 85.92
99%                5.15              76.85 – 87.15
In order to have a higher degree of confidence, the margin
of error and thus the width of the confidence interval must
be larger.
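The margins of error for the Lloyd's example can be recomputed for each confidence level. This is a minimal sketch using the example's values (x̄ = 82, σ = 20, n = 100); the dictionary layout and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 82, 20, 100
standard_error = sigma / sqrt(n)

# confidence level -> (margin of error, lower limit, upper limit)
intervals = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * standard_error
    intervals[confidence] = (
        round(margin, 2),
        round(x_bar - margin, 2),
        round(x_bar + margin, 2),
    )

for confidence, (margin, lower, upper) in intervals.items():
    print(f"{confidence:.0%}: margin {margin}, interval {lower} to {upper}")
```

The margin grows with the confidence level, which is the trade-off the slide describes: more confidence requires a wider interval.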
Interval Estimate of a Population Mean: σ Known
Adequate Sample Size
– In most applications, a sample size of n ≥ 30 is
adequate.
– If the population distribution is highly skewed or
contains outliers, a sample size of 50 or more is
recommended.
– If the population is not normally distributed but is
roughly symmetric, a sample size as small as 15 will
suffice.
– If the population is believed to be at least
approximately normal, a sample size of less than 15
can be used.
Interval Estimate of a Population Mean: σ Unknown
• If an estimate of the population standard deviation σ
cannot be developed prior to sampling, we use the
sample standard deviation s to estimate σ.
• This is the σ unknown case.
• In this case, the interval estimate for μ is based on
the t distribution.
• (We’ll assume for now that the population is
normally distributed.)
t Distribution
• William Gosset, writing under the name “Student”, is
the originator of the t distribution.
• Gosset was an Oxford graduate in mathematics and
worked for the Guinness Brewery in Dublin.
• He developed the t distribution while working on
small-scale materials and temperature experiments.
t Distribution
• The t distribution is a family of similar probability
distributions.
• A specific t distribution depends on a parameter
known as the degrees of freedom.
• Degrees of freedom refer to the number of
independent pieces of information that go into
the computation of s.
t Distribution
• A t distribution with more degrees of
freedom has less dispersion.
• As the degrees of freedom increase, the
difference between the t distribution and the
standard normal probability distribution
becomes smaller and smaller.
t Distribution
Comparison of the standard normal distribution
with t distributions having 10 and 20 degrees of
freedom.
t Distribution
• For more than 100 degrees of freedom, the
standard normal z value provides a good
approximation to the t value.
• The standard normal z values can be found in
the infinite degrees row (labeled ∞) of the t
distribution table.
t Distribution
Selected values
from the t
distribution table
Interval Estimate of a Population Mean: σ Unknown
x̄ ± tα/2 · s/√n
where: x̄ = the sample mean
1 − α = the confidence coefficient
tα/2 = the t value providing an area of α/2 in the
upper tail of a t distribution with n − 1 degrees of freedom
s = the sample standard deviation
n = the sample size
Interval Estimate of a Population Mean: σ Unknown
Example: Credit card debt for the population of
US households
The credit card balances of a sample of 70
households provided a mean credit card debt of
$9312 with a sample standard deviation of $4007.
Let us provide a 95% confidence interval estimate
of the mean credit card debt for the population of
US households. We will assume this population to
be normally distributed.
Interval Estimate of a Population Mean: σ Unknown
• At 95% confidence, α = .05, and α/2 = .025.
• t.025 is based on n − 1 = 70 − 1 = 69 degrees of
freedom.
Interval Estimate of a Population Mean: σ Unknown
Example: Credit card debt for the population of US
households
x̄ ± t.025 · s/√n
9312 ± 1.995 (4007/√70) = 9312 ± 955
We are 95% confident that the mean credit card
debt for the population of US households is
between $8357 and $10267.
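The arithmetic of this interval can be reproduced as follows. This is a minimal sketch: the Python standard library has no t-distribution quantile function, so the t value 1.995 (69 degrees of freedom, upper-tail area .025) is taken from the slide rather than computed.

```python
from math import sqrt

n = 70
x_bar = 9312      # sample mean credit card debt
s = 4007          # sample standard deviation
t_value = 1.995   # t.025 with 69 degrees of freedom, as given on the slide

margin = t_value * s / sqrt(n)
lower = x_bar - margin
upper = x_bar + margin

print(round(margin), round(lower), round(upper))
```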
Interval Estimate of a Population Mean: σ Unknown
Adequate Sample Size
• Usually, a sample size of n ≥ 30 is adequate when
using the expression x̄ ± tα/2 · s/√n to develop an
interval estimate of a population mean.
• If the population distribution is highly skewed or
contains outliers, a sample size of 50 or more is
recommended.
• If the population is not normally distributed but is roughly
symmetric, a sample size as small as 15 will suffice.
• If the population is believed to be at least approximately
normal, a sample size of less than 15 can be used.
Summary of Interval Estimation Procedures
for a Population Mean
Sample Size for an Interval Estimate of
a Population Mean
• Let E = the desired margin of error.
• E is the amount added to and subtracted from
the point estimate to obtain an interval
estimate.
• If a desired margin of error is selected prior to
sampling, the sample size necessary to satisfy
the margin of error can be determined.
Sample Size for an Interval Estimate of a Population Mean
• Margin of Error
E = zα/2 · σ/√n
• Necessary Sample Size
n = (zα/2)² σ² / E²
Sample Size for an Interval Estimate of a Population Mean
• The Necessary Sample Size equation requires a
value for the population standard deviation σ.
• If σ is unknown, a preliminary or planning value
for σ can be used in the equation.
1. Use the estimate of the population standard
deviation computed in a previous study.
2. Use a pilot study to select a preliminary sample and
use the sample standard deviation from that sample.
3. Use judgment or a “best guess” for the value of σ.
Sample Size for an Interval Estimate of a Population Mean
Example: Cost of Renting Automobiles in the United States
A previous study that investigated the cost of renting
automobiles in the United States found a mean cost of
approximately $55 per day for renting a midsize
automobile with a standard deviation of $9.65.
Suppose the project director wants an estimate of the
population mean daily rental cost such that there is a .95
probability that the sampling error is $2 or less.
How large a sample size is needed to meet the required
precision?
Sample Size for an Interval Estimate of a Population Mean
Example: Cost of Renting Automobiles in the United States
E = zα/2 · σ/√n = 2
At 95% confidence, z.025 = 1.96. Recall that σ = 9.65.
n = (1.96)²(9.65)² / (2)² = 89.43 ≈ 90
The sample size needs to be at least 90 midsize automobile
rentals in order to satisfy the project director’s $2 margin-of-error
requirement.
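The sample size formula always rounds up, because rounding down would leave the margin of error slightly above the target. A minimal sketch, where the function name sample_size_for_mean is an illustrative assumption and z defaults to the slide's 1.96:

```python
from math import ceil

def sample_size_for_mean(sigma, margin, z=1.96):
    """Return (unrounded n, n rounded up) for a desired margin of error."""
    raw = (z * sigma / margin) ** 2
    return raw, ceil(raw)

raw_n, n_required = sample_size_for_mean(sigma=9.65, margin=2)

print(round(raw_n, 2), n_required)
```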
Interval Estimate of a Population Proportion
The general form of an interval estimate of a
population proportion is:
p̄ ± Margin of Error
Interval Estimate of a Population Proportion
• The sampling distribution of p̄ plays a key role
in computing the margin of error for this
interval estimate.
• The sampling distribution of p̄ can be
approximated by a normal distribution
whenever np > 5 and n(1 – p) > 5.
Interval Estimate of a Population Proportion
Normal Approximation of Sampling Distribution
of p̄
Interval Estimate of a Population Proportion
p̄ ± zα/2 · √(p̄(1 − p̄)/n)
where: 1 − α is the confidence coefficient,
zα/2 is the z value providing an area of α/2 in the
upper tail of the standard normal probability distribution, and
p̄ is the sample proportion
Interval Estimate of a Population Proportion
Example: Survey of women golfers
A national survey of 900 women golfers was
conducted to learn how women golfers view their
treatment at golf courses in the United States. The
survey found that 396 of the women golfers were
satisfied with the availability of tee times.
Suppose one wants to develop a 95% confidence
interval estimate for the proportion of the
population of women golfers satisfied with the
availability of tee times.
Interval Estimate of a Population Proportion
p̄ ± zα/2 · √(p̄(1 − p̄)/n)
where: n = 900, p̄ = 396/900 = .44, zα/2 = 1.96
.44 ± 1.96 √(.44(1 − .44)/900) = .44 ± .0324
Survey results enable us to state with 95% confidence that between
40.76% and 47.24% of all women golfers are satisfied with the
availability of tee times.
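The interval for the golfer survey can be reproduced directly from the counts. A minimal sketch with illustrative variable names, using the slide's z value of 1.96:

```python
from math import sqrt

n = 900
satisfied = 396
z = 1.96  # z.025, as used on the slide

p_bar = satisfied / n
margin = z * sqrt(p_bar * (1 - p_bar) / n)
lower, upper = p_bar - margin, p_bar + margin

# Normal approximation check from the earlier slide, with p̄ standing in for p.
approximation_ok = n * p_bar > 5 and n * (1 - p_bar) > 5

print(round(p_bar, 2), round(margin, 4), round(lower, 4), round(upper, 4))
```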
Sample Size for an Interval Estimate of a Population Proportion
Margin of Error
E = zα/2 · √(p̄(1 − p̄)/n)
Solving for the necessary sample size n, we get
n = (zα/2)² p̄(1 − p̄) / E²
However, p̄ will not be known until after we have
selected the sample. We will use the planning value p*
for p̄.
Necessary Sample Size
n = (zα/2)² p*(1 − p*) / E²
The planning value p* can be chosen by:
1. Using the sample proportion from a previous sample of the same or
similar units, or
2. Selecting a preliminary sample and using the sample proportion
from this sample.
3. Using judgment or a “best guess” for a p* value.
4. Otherwise, using .50 as the p* value.
Suppose the survey director wants to estimate the
population proportion with a margin of error of
.025 at 95% confidence.
How large a sample size is needed to meet the
required precision? (A previous sample of similar
units yielded .44 for the sample proportion.)
E = zα/2 · √(p*(1 − p*)/n) = .025
At 95% confidence, z.025 = 1.96. Recall that p* = .44.
n = (zα/2)² p*(1 − p*) / E² = (1.96)²(.44)(.56) / (.025)² = 1514.5
A sample of size 1515 is needed to reach a desired precision
of ± .025 at 95% confidence.
Note: We used .44 as the best estimate of p in
the preceding expression. If no information is
available about p, then .5 is often assumed
because it provides the highest possible sample
size. If we had used p = .5, the recommended n
would have been 1537.
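The effect of the planning value on the required sample size can be seen by evaluating the formula for both choices. A minimal sketch; the function name sample_size_for_proportion is an illustrative assumption, and z defaults to the slide's 1.96.

```python
from math import ceil

def sample_size_for_proportion(p_star, margin, z=1.96):
    """Sample size needed for a given margin of error, rounded up."""
    return ceil(z ** 2 * p_star * (1 - p_star) / margin ** 2)

n_with_prior = sample_size_for_proportion(0.44, 0.025)
n_conservative = sample_size_for_proportion(0.50, 0.025)

print(n_with_prior, n_conservative)
```

The product p*(1 − p*) is largest at p* = .50, which is why that planning value gives the largest, and therefore safest, sample size.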
Implications of Big Data
• As the sample size becomes extremely large, the
margin of error becomes extremely small and
resulting confidence intervals become extremely
narrow.
• No interval estimate will accurately reflect the
parameter being estimated unless the sample is
relatively free of non-sampling error.
• Statistical inference along with information
collected from other sources can help in making
the most informed decision.
EXERCISE
A statistician selected a sample of 16
accounts receivable and determined the
mean of the sample to be $5,000 with a
standard deviation of $400. She reported
that the sample information indicated the
mean of the population ranges from
$4,739.80 to $5,260.20. She did not report
what confidence coefficient she had used.
Based on the above information, determine
the confidence coefficient that was used.
SOLUTION
x̄ ± tα/2 · s/√n
where:
x̄ is the sample mean
1 − α is the confidence coefficient
tα/2 is the t value providing an area of
α/2 in the upper tail of a t distribution
with n − 1 degrees of freedom
s is the sample standard deviation
n is the sample size
GIVEN:
n = 16
x̄ = 5,000
s = 400
Interval = $4,739.80 to $5,260.20, so the width is $520.40
and the margin of error is E = 520.40/2 = $260.20
DF = n − 1 = 15
E = tα/2 (400/√16) = tα/2 (100) = 260.20
tα/2 = 2.602
With 15 degrees of freedom, t = 2.602 corresponds to an upper-tail area of .01,
so α = .02 and the confidence coefficient is 1 − α = .98.
EXERCISE
A sample of 16 students from a large university
is selected. The average age in the sample was
22 years with a standard deviation of 6 years.
Construct a 95% confidence interval for the
average age of the population. Assume the
population of student ages is normally
distributed.
EXERCISE
The proprietor of a boutique in New York wanted to
determine the average age of his customers. A
random sample of 25 customers revealed an
average age of 28 years with a standard deviation
of 10 years. Determine a 95% confidence interval
estimate for the average age of all his customers.
Assume the population of customer ages is
normally distributed.
EXERCISE
A sample of 25 patients in a doctor's office
showed that they had to wait an average of 35
minutes with a standard deviation of 10 minutes
before they could see the doctor. Provide a 98%
confidence interval estimate for the average
waiting time of all the patients who visit this
doctor. Assume the population of waiting times
is normally distributed.
EXERCISE
The makers of a soft drink want to identify the average
age of its consumers. A sample of 16 consumers is
selected. The average age in the sample was 22.5 years
with a standard deviation of 5 years. Assume the
population of consumer ages is normally distributed.
a) Construct a 95% confidence interval for the average
age of all the consumers.
b) Construct an 80% confidence interval for the average
age of all the consumers.
c) Discuss why the 95% and 80% confidence intervals
are different.
EXERCISE
In order to determine how many hours per week freshmen
college students watch television, a random sample of 256
students was selected. It was determined that the students in
the sample spent an average of 14 hours per week watching
television. The standard deviation is 3.2 hours per week for all
freshmen college students.
a. Provide a 95% confidence interval estimate for the average
number of hours that all college freshmen spend watching TV
per week.
b. Suppose the sample mean came from a sample of 25
students. Provide a 95% confidence interval estimate for the
average number of hours that all college freshmen spend
watching TV per week. Assume that the hours are normally
distributed.
EXERCISE
Computer Services, Inc. wants to determine a
confidence interval for the average CPU time of
their teleprocessing transactions. A sample of
196 transactions yielded a mean of 5 seconds.
The population standard deviation is 1.4
seconds. Determine a 97% confidence interval
for the average CPU time.
YELLOW PAD
• In a manufacturing process, a random sample of 9 bolts has a mean length
of 3 inches with a variance of .09. What is the 90 percent confidence
interval for the true mean length of the bolt?
• The internal auditing staff of a local manufacturing company performs a
sample audit each quarter to estimate the proportion of accounts that are
current (between 0 and 60 days after billing). The historical records show
that over the past 8 years 70 percent of the accounts have been current.
Determine the sample size needed in order to be 99 percent confident that
the sample proportion of the current customer accounts is within .03 of the
true proportion of all current accounts for this company
• A company is interested in estimating μ, the mean number of days of sick
leave taken by its employees. Their statistician randomly selects 100
personnel files and notes the number of sick days taken by each employee.
The sample mean is 12.2 days, and the sample standard deviation is 10
days. Calculate a 95 percent confidence interval for μ, the mean number of
days of sick leave.
Chapter 9
Hypothesis Tests
Hypothesis Testing
Hypothesis testing is a statistical method that is
used in making statistical decisions using
experimental data. A hypothesis is
basically an assumption that we make about a
population parameter.
Key terms and concepts:
Null hypothesis: The null hypothesis is a statistical hypothesis that
assumes that the observation is due to a chance factor. The null
hypothesis is denoted by H0; for example, H0: μ1 = μ2 states that there
is no difference between the two population means.
Alternative hypothesis: Contrary to the null hypothesis, the
alternative hypothesis states that observations are the result
of a real effect.
Level of significance: The probability of rejecting the null
hypothesis when it is true. 100% accuracy
is not possible for accepting or rejecting a hypothesis, so we
therefore select a level of significance, which is usually 5%.
Type I error: When we reject the null hypothesis
although that hypothesis was true. The probability of a Type I error is
denoted by alpha. In hypothesis testing, the part of the
normal curve that shows the critical region is called
the alpha region.
Type II error: When we fail to reject the null hypothesis
but it is false. The probability of a Type II error is denoted by beta. In
hypothesis testing, the part of the normal curve that shows
the acceptance region is called the beta region.
One-tailed test: When the given statistical
hypothesis is one value like H0: μ1 = μ2, it is
called the one-tailed test.
Two-tailed test: When the given statistics
hypothesis assumes a less than or greater than
value, it is called the two-tailed test.
Steps in Hypothesis Testing
Hypothesis testing is the use of statistics to judge whether sample data are
consistent with a given hypothesis. The usual process of hypothesis testing consists of four steps.
1. Formulate the null hypothesis H_0 (commonly, that the observations are the result of pure
chance) and the alternative hypothesis H_a (commonly, that the observations show a real
effect combined with a component of chance variation).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the P-value, which is the probability that a test statistic at least as extreme as
the one observed would be obtained assuming that the null hypothesis were true. The smaller
the P-value, the stronger the evidence against the null hypothesis.
4. Compare the P-value to an acceptable significance value alpha (sometimes called an alpha
value). If p ≤ alpha, the observed effect is statistically significant, the null hypothesis is
rejected, and the alternative hypothesis is supported.
Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of

a particular observed event is exceptionally
small, we conclude that the assumption is
probably not correct.
Example: ProCare Industries, Ltd., once provided a product called
“Gender Choice,” which, according to advertising claims, allowed
couples to “increase your chances of having a boy up to 85%, a girl up
to 80%.” Gender Choice was available in blue packages for couples
wanting a baby boy and (you guessed it) pink packages for couples
wanting a baby girl. Suppose we conduct an experiment with 100
couples who want to have baby girls, and they all follow the Gender
Choice “easy-to-use in-home system” described in the pink package.
For the purpose of testing the claim of an increased likelihood for
girls, we will assume that Gender Choice has no effect. Using
common sense and no formal statistical methods, what should we
conclude about the assumption of no effect from Gender Choice if
100 couples using Gender Choice have 100 babies consisting of
a) 52 girls?; b) 97 girls?
Example: ProCare Industries, Ltd.: Part a)
a) We normally expect around 50 girls in 100 births.

The result of 52 girls is close to 50, so we should not
conclude that the Gender Choice product is effective. If
the 100 couples used no special method of gender
selection, the result of 52 girls could easily occur by
chance. The assumption of no effect from Gender
Choice appears to be correct. There isn’t sufficient
evidence to say that Gender Choice is effective.
Example: ProCare Industries, Ltd.: Part b)
b) The result of 97 girls in 100 births is extremely

unlikely to occur by chance. We could explain the
occurrence of 97 girls in one of two ways: Either an
extremely rare event has occurred by chance, or
Gender Choice is effective. The extremely low
probability of getting 97 girls is strong evidence
against the assumption that Gender Choice has no
effect. It does appear to be effective.
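The example above argues from common sense alone. As a rough check, the two probabilities can be computed exactly under the assumption of no effect (p = 0.5). The sketch below is an illustration only; the helper name `binomial_upper_tail` is hypothetical and not part of the lecture.

```python
from math import comb

def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption of no effect: each birth is a girl with probability 0.5.
p_52 = binomial_upper_tail(100, 52)  # part a) 52 or more girls
p_97 = binomial_upper_tail(100, 97)  # part b) 97 or more girls

print(f"P(52 or more girls) = {p_52:.4f}")
print(f"P(97 or more girls) = {p_97:.3e}")
```

The first probability is large, so 52 girls is an ordinary outcome; the second is so small that the rare event rule leads us to doubt the assumption of no effect.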
Objectives
• Given a claim, identify the null hypothesis and
the alternative hypothesis, and express them
both in symbolic form.
• Given a claim and sample data, calculate the
value of the test statistic.
• Given a significance level, identify the critical
value(s).
• Given a value of the test statistic, identify the
P-value.
• State the conclusion of a hypothesis test in
simple, non-technical terms.
Example: Let’s again refer to the Gender Choice product that was
once distributed by ProCare Industries. ProCare Industries claimed
that couples using the pink packages of Gender Choice would have
girls at a rate that is greater than 50% or 0.5. Let’s again consider an
experiment whereby 100 couples use Gender Choice in an attempt to
have a baby girl; let’s assume that the 100 babies include exactly 52
girls, and let’s formalize some of the analysis.
Under normal circumstances the proportion of girls is 0.5, so a claim

that Gender Choice is effective can be expressed as p > 0.5.
Using a normal distribution as an approximation to the binomial

distribution, we find P(52 or more girls in 100 births) = 0.3821.
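The probability 0.3821 can be reproduced with a short calculation. This is a minimal sketch, assuming the usual continuity correction (the discrete event "52 or more" is treated as the area above 51.5); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.5                  # 100 births, p = 0.5 under the working assumption
mean = n * p                     # mean of the binomial distribution: 50
sd = sqrt(n * p * (1 - p))       # standard deviation of the binomial distribution: 5

# Continuity correction: "52 or more girls" becomes the area above 51.5.
prob = 1 - NormalDist(mean, sd).cdf(51.5)
print(round(prob, 4))            # 0.3821
```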
Figure 8-1, following, shows that with a probability of 0.5, the
outcome of 52 girls in 100 births is not unusual.
We do not reject random chance as a reasonable explanation.
We conclude that the proportion of girls born to couples using
Gender Choice is not significantly greater than the proportion
that we would expect by random chance.
[Figure 8-1]
Observations
• Claim: For couples using Gender Choice, the proportion of
girls is p > 0.5.
• Working assumption: The proportion of girls is p = 0.5 (with no effect
from Gender Choice).
• The sample resulted in 52 girls among 100 births, so the sample
proportion is p̂ = 52/100 = 0.52.
• Assuming that p = 0.5, we use a normal distribution as an approximation
to the binomial distribution to find that P(at least 52 girls in 100 births) =
0.3821.
• There are two possible explanations for the result of 52 girls in 100
births: Either a random chance event (with probability 0.3821) has
occurred, or the proportion of girls born to couples using Gender
Choice is greater than 0.5.
• There isn’t sufficient evidence to support Gender Choice’s claim.
Components of a
Formal Hypothesis Test
Null Hypothesis: H0
• The null hypothesis (denoted by H0) is a
statement that the value of a population
parameter (such as proportion, mean, or
standard deviation) is equal to some claimed
value.
• We test the null hypothesis directly.
• Either reject H0 or fail to reject H0.
Alternative Hypothesis: H1
• The alternative hypothesis (denoted
by H1 or Ha or HA) is the statement that
the parameter has a value that
somehow differs from the null
hypothesis.
• The symbolic form of the alternative
hypothesis must use one of these
symbols: ≠, <, >.
Note about Forming Your Own Claims
(Hypotheses)
If you are conducting a study and want to

use a hypothesis test to support your
claim, the claim must be worded so that it
becomes the alternative hypothesis.
Note about Identifying H0 and H1
Figure 8-2
Example: Identify the Null and Alternative Hypothesis. Refer
to Figure 8-2 and use the given claims to express the
corresponding null and alternative hypotheses in symbolic
form.
a) The proportion of drivers who admit to running red
lights is greater than 0.5.
b) The mean height of professional basketball players is at
most 7 ft.
c) The standard deviation of IQ scores of actors is equal
to 15.
Solution:
a) The proportion of drivers who admit to running red lights
is greater than 0.5. In Step 1 of Figure 8-2, we express the
given claim as p > 0.5. In Step 2, we see that if p > 0.5 is
false, then p ≤ 0.5 must be true. In Step 3, we see that the
expression p > 0.5 does not contain equality, so we let
the alternative hypothesis H1 be p > 0.5, and we let H0 be p
= 0.5.
b) The mean height of professional basketball players is at most 7
ft. In Step 1 of Figure 8-2, we express “a mean of at most 7 ft” in
symbols as µ ≤ 7. In Step 2, we see that if µ ≤ 7 is false, then µ >
7 must be true. In Step 3, we see that the expression µ > 7 does
not contain equality, so we let the alternative hypothesis H1 be µ >
7, and we let H0 be µ = 7.
c) The standard deviation of IQ scores of actors is equal to 15. In
Step 1 of Figure 8-2, we express the given claim as σ = 15. In Step 2,
we see that if σ = 15 is false, then σ ≠ 15 must be true. In Step 3, we
let the alternative hypothesis H1 be σ ≠ 15, and we let H0 be σ = 15.
Test Statistic
The test statistic is a value used in making a

decision about the null hypothesis, and is
found by converting the sample statistic to a
score with the assumption that the null
hypothesis is true.
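Test Statistic - Formulas
Test statistic for proportions:
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$
Test statistic for mean:
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\dfrac{\sigma}{\sqrt{n}}}$$
Test statistic for standard deviation:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$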
Example: A survey of n = 880 randomly
selected adult drivers showed that 56% (or p̂ = 0.56) of
those respondents admitted to running red lights. Find
the value of the test statistic for the claim that the
majority of all adult drivers admit to running red lights.
(In Section 8-3 we will see that there are assumptions
that must be verified. For this example, assume that
the required assumptions are satisfied and focus on
finding the indicated test statistic.)
Solution: The preceding example showed that the
given claim results in the following null and alternative
hypotheses: H0: p = 0.5 and H1: p > 0.5. Because we
work under the assumption that the null hypothesis is
true with p = 0.5, we get the following test statistic:
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{0.56 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{880}}} = 3.56$$
Interpretation: We know from previous chapters
that a z score of 3.56 is exceptionally large. It
appears that in addition to being “more than half,”
the sample result of 56% is significantly more than
50%.
See figure following.
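The test statistic in the solution can be checked with a few lines of code. This is a minimal sketch; the function name `proportion_z` is hypothetical and chosen only for this illustration.

```python
from math import sqrt

def proportion_z(p_hat: float, p0: float, n: int) -> float:
    """z test statistic for a proportion, computed under H0: p = p0."""
    q0 = 1 - p0
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Survey of n = 880 drivers with sample proportion 0.56, tested against p = 0.5.
z = proportion_z(0.56, 0.5, 880)
print(round(z, 2))  # 3.56
```

Note that the standard error uses the hypothesized p = 0.5, not the sample proportion, because the test statistic is computed assuming the null hypothesis is true.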

Critical Region, Critical Value,
Test Statistic
Critical Region
The critical region (or rejection region) is

the set of all values of the test statistic that
cause us to reject the null hypothesis. For
example, see the red-shaded region in the
previous figure.
Significance Level
The significance level (denoted by α) is the
probability that the test statistic will fall in the
critical region when the null hypothesis is
actually true. Common choices for α are 0.05,
0.01, and 0.10.
Critical Value
A critical value is any value that separates the
critical region (where we reject the null
hypothesis) from the values of the test statistic
that do not lead to rejection of the null hypothesis.
The critical values depend on the nature of the
null hypothesis, the sampling distribution that
applies, and the significance level α. See the
previous figure where the critical value of z =
1.645 corresponds to a significance level of α =
0.05.
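Critical values for a z test can be obtained from the inverse of the standard normal distribution instead of a table. The sketch below is one way to do it; the function name `critical_values` and the tail labels "right", "left" and "two" are assumptions made for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha: float, tail: str) -> tuple:
    """Critical z value(s) for a right-, left- or two-tailed test."""
    z = NormalDist()  # standard normal distribution
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split equally between the tails
        return (-c, c)
    raise ValueError("tail must be 'right', 'left' or 'two'")

print(round(critical_values(0.05, "right")[0], 3))  # 1.645
print([round(c, 2) for c in critical_values(0.05, "two")])
```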
Two-tailed, Right-tailed,
Left-tailed Tests
The tails in a distribution are the
extreme regions bounded by critical
values.
Two-tailed Test
H0: =
H1: ≠
α is divided equally between the two tails of the critical
region.
≠ means less than or greater than.
Right-tailed Test
H0: =
H1: > (points right)
Left-tailed Test
H0: =
H1: < (points left)
P-Value
The P-value (or p-value or probability value) is

the probability of getting a value of the test
statistic that is at least as extreme as the one
representing the sample data, assuming that the
null hypothesis is true. The null hypothesis is
rejected if the P-value is very small, such as 0.05
or less.
Conclusions
in Hypothesis Testing
We always test the null hypothesis. The initial
conclusion will always be one of the following:
1. Reject the null hypothesis.
2. Fail to reject the null hypothesis.

Decision Criterion
Traditional method:
Reject H0 if the test statistic falls

within the critical region.
Fail to reject H0 if the test statistic

does not fall within the critical region.
Decision Criterion - cont
P-value method:
Reject H0 if the P-value ≤ α (where α is
the significance level, such as 0.05).
Fail to reject H0 if the P-value > α.

Another option:
Instead of using a significance level

such as 0.05, simply identify the P-
value and leave the decision to the
reader.
Confidence Intervals:
Because a confidence interval estimate of

a population parameter contains the likely
values of that parameter, reject a claim
that the population parameter has a value
that is not included in the confidence
interval.
Procedure for Finding P-Values
Figure 8-6
Example: Finding P-values. First determine whether the
given conditions result in a right-tailed test, a left-tailed test,
or a two-tailed test, then find the P-values and state a
conclusion about the null hypothesis.
a) A significance level of α = 0.05 is used in testing the
claim that p > 0.25, and the sample data result in a test
statistic of z = 1.18.
b) A significance level of α = 0.05 is used in testing the
claim that p ≠ 0.25, and the sample data result in a test
statistic of z = 2.34.
Solution:
a) With a claim of p > 0.25, the test is right-tailed. Because the
test is right-tailed, Figure 8-6 shows that the P-value is the area
to the right of the test statistic z = 1.18. We refer to the Z-Table and
find that the area to the right of z = 1.18 is 0.1190. The P-value
of 0.1190 is greater than the significance level α = 0.05, so we
fail to reject the null hypothesis. The P-value of 0.1190 is
relatively large, indicating that the sample results could easily
occur by chance.
b) With a claim of p ≠ 0.25, the test is two-tailed. Because the test is
two-tailed, and because the test statistic of z = 2.34 is to the right of the center,
Figure 8-6 shows that the P-value is twice the area to the right of z = 2.34. We
refer to the Z-Table and find that the area to the right of z = 2.34 is 0.0096, so
P-value = 2 x 0.0096 = 0.0192. The P-value of 0.0192 is less than or equal to the
significance level, so we reject the null hypothesis. The small P-value of 0.0192
shows that the sample results are not likely to occur by chance.
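The P-values in this example can also be computed directly from the standard normal distribution rather than read from a table. This is a minimal sketch; the function name `p_value` and its tail labels are hypothetical. Because the Z-Table rounds areas to four decimal places, the unrounded two-tailed result can differ slightly from 0.0192 in the last digit.

```python
from statistics import NormalDist

def p_value(z: float, tail: str) -> float:
    """P-value of a z test statistic for a right-, left- or two-tailed test."""
    cdf = NormalDist().cdf  # area to the left under the standard normal curve
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # twice the area beyond |z|
    raise ValueError("tail must be 'right', 'left' or 'two'")

alpha = 0.05
p_a = p_value(1.18, "right")  # part a), claim p > 0.25
p_b = p_value(2.34, "two")    # part b), claim p != 0.25

for label, p in (("a", p_a), ("b", p_b)):
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"{label}) P-value = {p:.4f}: {decision}")
```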
Wording of Final Conclusion
Figure 8-7
Accept Versus Fail to Reject
• Some texts use “accept the null
hypothesis.”
• We are not proving the null hypothesis.
• The sample evidence is not strong
enough to warrant rejection
(such as not enough evidence to
convict a suspect).
Type I Error
• A Type I error is the mistake of
rejecting the null hypothesis when it
is true.
• The symbol α (alpha) is used to
represent the probability of a type I
error.
Type II Error
• A Type II error is the mistake of failing
to reject the null hypothesis when it is
false.
• The symbol β (beta) is used to
represent the probability of a type II
error.
Example: Assume that we are conducting a hypothesis
test of the claim p > 0.5. Here are the null and
alternative hypotheses: H0: p = 0.5, and H1: p > 0.5.
a) Identify a type I error.
b) Identify a type II error.
Solution:
a) A type I error is the mistake of rejecting a true null
hypothesis, so this is a type I error: Conclude that
there is sufficient evidence to support p > 0.5, when
in reality p = 0.5.
b) A type II error is the mistake of failing to reject the
null hypothesis when it is false, so this is a type II
error: Fail to reject p = 0.5 (and therefore fail to
support p > 0.5) when in reality p > 0.5.
Type I and Type II Errors
Chapter 10
Inference About Means and
Proportions with Two
Populations
Inference About Means and Proportions with
Two Populations
• Inferences About the Difference Between Two
Population Means: σ1 and σ2 Known
• Inferences About the Difference Between Two
Population Means: σ1 and σ2 Unknown
• Inferences About the Difference Between Two
Population Means: Matched Samples
Inferences About the Difference Between
Two Population Means: σ1 and σ2 Known
• Interval Estimation of μ1 – μ2
• Hypothesis Tests About μ1 – μ2
Estimating the Difference Between
Two Population Means
• Let μ1 equal the mean of population 1 and μ2 equal
the mean of population 2.
• The difference between the two population means is
μ1 – μ2.
• To estimate μ1 – μ2, we will select a simple random
sample of size n1 from population 1 and a simple
random sample of size n2 from population 2.
• Let x̄1 equal the mean of sample 1 and x̄2 equal the
mean of sample 2.
• The point estimator of the difference between the
means of the populations 1 and 2 is x̄1 − x̄2.
Sampling Distribution of x̄1 − x̄2
• Expected Value
$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$
• Standard Deviation (Standard Error)
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where: σ1 = standard deviation of population 1
σ2 = standard deviation of population 2
n1 = sample size from population 1
n2 = sample size from population 2
Interval Estimate of μ1 - μ2: σ1 and σ2
Known
• Interval Estimate
$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where:
1 - α is the confidence coefficient
Example: Homestyle Furniture
Homestyle sells furniture at two stores in Buffalo, New
York: one is in the inner city and the other is in a suburban
shopping center. There is a difference in the types of
furniture sold in each store, and the manager believes this
can be attributed to the difference in customer
demographics.
The manager wants to investigate the difference in the mean
age of customers who shop at the two stores.
                              Inner-City Store   Suburban Store
Sample Size                   36                 49
Sample Mean                   40 years           35 years
Standard Deviation            9 years            10 years
(based on previous studies)
Let us develop a 95% confidence interval
estimate of the difference between the mean
ages of the customers who shop at the two stores.
Estimating the Difference Between Two
Population Means
Population 1: Inner-city store customers
μ1 = mean age of inner-city store customers
Population 2: Suburban store customers
μ2 = mean age of suburban store customers
μ1 – μ2 = difference between the mean ages
Random sample of n1 inner-city customers
x̄1 = sample mean age for the inner-city store customers
Random sample of n2 suburban customers
x̄2 = sample mean age for the suburban store customers
x̄1 − x̄2 = point estimator of μ1 – μ2
Point Estimate of μ1 - μ2
Point estimate of μ1 - μ2 = x̄1 − x̄2 = 40 – 35 = 5 years
where:
μ1 = mean age of inner-city store customers
μ2 = mean age of suburban store customers
Interval Estimation of μ1 - μ2: σ1 and σ2 Known
$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 5 \pm 1.96\sqrt{\frac{(9)^2}{36} + \frac{(10)^2}{49}}$$
5 ± 4.06, or .94 years to 9.06 years
We are 95% confident that the difference between the
mean ages of inner-city and suburban store
customers is .94 years to 9.06 years.
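The Homestyle interval can be reproduced with a short calculation. This is a minimal sketch, assuming the population standard deviations are known as stated; the function name `z_interval_diff` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_diff(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when sigma1 and sigma2 are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    margin = z * standard_error
    diff = x1 - x2                                      # point estimate
    return diff - margin, diff + margin

# Inner-city store: mean 40, sigma 9, n 36; suburban store: mean 35, sigma 10, n 49.
low, high = z_interval_diff(40, 35, 9, 10, 36, 49)
print(f"{low:.2f} years to {high:.2f} years")
```

Because the interval does not contain 0, it is consistent with a real difference in mean age between the two stores.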
Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
• Hypotheses
Left-tailed:    H0: μ1 – μ2 ≥ D0    Ha: μ1 – μ2 < D0
Right-tailed:   H0: μ1 – μ2 ≤ D0    Ha: μ1 – μ2 > D0
Two-tailed:     H0: μ1 – μ2 = D0    Ha: μ1 – μ2 ≠ D0
• Test Statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Example: Training Centers
A standardized examination was given to individuals
trained at two different centers to evaluate the
difference in education quality between them.
Let
µ1 = the mean examination score for the population of
individuals trained at center A
µ2 = the mean examination score for the population of
individuals trained at center B
                              Center A   Center B
Sample Size                   30         40
Sample Mean                   82         78
Standard Deviation            10         10
(based on previous studies)
Can we conclude, using α = .05, that a
difference exists between the training quality
provided at the two centers?
p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0 (two-tailed test)
where
µ1 = the mean examination score for the population of
individuals trained at center A
µ2 = the mean examination score for the population of
individuals trained at center B
2. Specify the level of significance: α = .05
3. Compute the value of the test statistic.
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(82 - 78) - 0}{\sqrt{\dfrac{(10)^2}{30} + \dfrac{(10)^2}{40}}} = \frac{4}{2.4152} = 1.66$$
p-Value Approach
4. Compute the p–value.
For z = 1.66, the area to the left is .9515.
The area in the upper tail of the distribution is
1.0000 - .9515 = .0485
p–value = 2(.0485) = .0970
5. Determine whether to reject H0.
Because p–value > α = .05, we cannot reject H0.
At the .05 level of significance, the sample
evidence is not sufficient to conclude that there is a
difference in quality between training centers.
Critical Value Approach
4. Determine the critical value and rejection rule.
For α = .05, z.025 = 1.96
Reject H0 if z ≤ -1.96 or z ≥ 1.96
5. Determine whether to reject H0.
Because -1.96 < z = 1.66 < 1.96, we cannot reject H0.
At the .05 level of significance, the sample
evidence is not sufficient to conclude that there is a
difference in quality between training centers.
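Both approaches for the training centers example can be sketched in code. The function name `two_sample_z_test` is hypothetical. The unrounded P-value differs slightly from .0970 because the slide rounds z to 1.66 before using the table.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(x1, x2, sigma1, sigma2, n1, n2, d0=0.0):
    """Two-tailed z test of H0: mu1 - mu2 = d0 with known sigmas."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (x1 - x2 - d0) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # double the upper-tail area
    return z, p

alpha = 0.05
z, p = two_sample_z_test(82, 78, 10, 10, 30, 40)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test

print(f"z = {z:.2f}, p-value = {p:.4f}")
print("p-value approach:", "reject H0" if p <= alpha else "cannot reject H0")
print("critical value approach:", "reject H0" if abs(z) >= z_crit else "cannot reject H0")
```

The two approaches always give the same decision, since p ≤ α exactly when the test statistic falls in the critical region.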
Inferences About the Difference Between
Two Population Means: σ1 and σ2 Unknown
– Interval Estimation of μ1 – μ2
– Hypothesis Tests About μ1 – μ2
Interval Estimation of μ1 - μ2: σ1 and σ2
Unknown
When σ1 and σ2 are unknown, we will:
– Use the sample standard deviations s1 and s2 as
estimates of σ1 and σ2, and
– Replace zα/2 with tα/2.
– Interval Estimate
$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where the degrees of freedom for tα/2 are:
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$
Difference Between Two Population
Means: σ1 and σ2 Unknown
Example: Clearwater National Bank
Clearwater National Bank wants to compare the
checking account practices of the customers at
two of its branch banks, the Cherry Grove Branch
and the Beechmont Branch. Random samples of 28
and 22 checking accounts are selected from these
branches, respectively. The sample statistics are
shown on the next slide.
                              Cherry Grove   Beechmont
Sample Size                   28             22
Sample Mean                   $1025          $910
Sample Standard Deviation     $150           $125
Let us develop a 95% confidence interval
estimate of the difference between the
population mean checking account balances at
the two branch banks.
Point Estimate of μ1 - μ2
Point estimate of μ1 - μ2 = x̄1 − x̄2 = 1025 – 910 = 115
where:
μ1 = mean checking account balance maintained by the
population of Cherry Grove customers
μ2 = mean checking account balance maintained by the
population of Beechmont customers
The degrees of freedom for tα/2 are:
$$df = \frac{\left(\dfrac{(150)^2}{28} + \dfrac{(125)^2}{22}\right)^2}{\dfrac{1}{28 - 1}\left(\dfrac{(150)^2}{28}\right)^2 + \dfrac{1}{22 - 1}\left(\dfrac{(125)^2}{22}\right)^2} = 47.8$$
which we round down to df = 47.
With α/2 = .025 and df = 47, tα/2 = 2.012
$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1025 - 910 \pm 2.012\sqrt{\frac{(150)^2}{28} + \frac{(125)^2}{22}}$$
115 ± 78 = $37 to $193
We are 95% confident that the difference between the mean
checking account balances maintained by the customers
at the Cherry Grove branch and the Beechmont branch is $37 to $193.
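The degrees of freedom and the margin of error for the Clearwater example can be checked as follows. This is a sketch under one stated assumption: Python's standard library has no inverse t distribution, so the t value 2.012 is taken from the slide rather than computed. The function name `welch_df` is hypothetical.

```python
from math import floor, sqrt

def welch_df(s1, s2, n1, n2):
    """Degrees of freedom for t when sigma1 and sigma2 are unknown."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

s1, n1, x1 = 150, 28, 1025   # Cherry Grove
s2, n2, x2 = 125, 22, 910    # Beechmont

df = welch_df(s1, s2, n1, n2)
t_crit = 2.012               # t_{.025} for df = 47, table value quoted on the slide
margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
diff = x1 - x2

print(f"df = {df:.1f}, rounded down to {floor(df)}")
print(f"{diff} ± {margin:.0f} = ${diff - margin:.0f} to ${diff + margin:.0f}")
```

Rounding the degrees of freedom down gives a slightly larger t value and therefore a more conservative (wider) interval.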
Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
• Hypotheses
Left-tailed:    H0: μ1 – μ2 ≥ D0    Ha: μ1 – μ2 < D0
Right-tailed:   H0: μ1 – μ2 ≤ D0    Ha: μ1 – μ2 > D0
Two-tailed:     H0: μ1 – μ2 = D0    Ha: μ1 – μ2 ≠ D0
• Test Statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Example: Computer software package
A new computer software package is developed to reduce the
time required to design, develop and implement an
information system. To evaluate the benefits, a random sample
of 24 system analysts is selected, 12 of them using current
technology and the other 12 using the new software package.
Can we conclude, using a .05 level of significance, that the
mean project completion time for system analysts using the
new software package is less than the mean project
completion time for system analysts using current
technology?
Hypothesis Tests About μ1 - μ2: σ1 and σ2
Unknown
Project completion times:
Current Technology   New Software
300                  274
280                  220
344                  308
385                  336
372                  198
360                  300
288                  315
321                  258
376                  318
290                  310
301                  332
283                  263
Summary Statistics   Current Technology   New Software
Sample Size          12                   12
Sample Mean          325                  286
Sample SD            40                   44
p-Value Approach
1. Develop the hypotheses.
H0: μ1 - μ2 ≤ 0
Ha: μ1 - μ2 > 0
(right-tailed test)
where:
μ1 = the mean project completion time for system analysts using the
current technology
μ2 = the mean project completion time for system analysts using the
new software package
2. Specify the level of significance: α = .05
3. Compute the value of the test statistic.
$$t = \frac{(325 - 286) - 0}{\sqrt{\dfrac{(40)^2}{12} + \dfrac{(44)^2}{12}}} = 2.27$$
4. Compute the p–value.
The degrees of freedom for t are:
$$df = \frac{\left(\dfrac{(40)^2}{12} + \dfrac{(44)^2}{12}\right)^2}{\dfrac{1}{12 - 1}\left(\dfrac{(40)^2}{12}\right)^2 + \dfrac{1}{12 - 1}\left(\dfrac{(44)^2}{12}\right)^2} = 21.8$$
which we round down to df = 21.
From the t table we see that the p–value is
between .01 and .025.
5. Determine whether to reject H0.
Because p–value < α = .05, we reject H0.
There is sufficient statistical evidence that
μ1 - μ2 > 0, or μ1 > μ2; that is, the new software package
provides a smaller population mean completion time.
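The test statistic and degrees of freedom can be recomputed from the raw completion times in the table instead of the rounded summary statistics. This is a minimal sketch; the function name `welch_t` is hypothetical, and the P-value itself is still read from a t table because the standard library does not provide the t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Completion times from the slide's table.
current = [300, 280, 344, 385, 372, 360, 288, 321, 376, 290, 301, 283]
new = [274, 220, 308, 336, 198, 300, 315, 258, 318, 310, 332, 263]

def welch_t(sample1, sample2, d0=0.0):
    """t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # s1^2 / n1, using the sample variance
    v2 = variance(sample2) / n2   # s2^2 / n2
    t = (mean(sample1) - mean(sample2) - d0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t, df = welch_t(current, new)
print(f"t = {t:.2f}, df = {df:.1f} (rounded down to {int(df)})")
```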
Inferences About the Difference Between Two
Population Means: Matched Samples
• With a matched-sample design each sampled

item provides a pair of data values.
• This design often leads to a smaller sampling
error than the independent-sample design
because variation between sampled items is
eliminated as a source of sampling error.
Two Population Means: Matched Samples
Example: Comparison of production methods
Two production methods are tested under similar conditions. A
random sample of six workers is used.
Task Completion Times for a Matched-Sample Design
Worker   Completion Time for   Completion Time for   Difference in
         Method 1 (Minutes)    Method 2 (Minutes)    Completion Times (di)
1        6.0                   5.4                   .6
2        5.0                   5.2                   -.2
3        7.0                   6.5                   .5
4        6.2                   5.9                   .3
5        6.0                   6.0                   .0
6        6.4                   5.8                   .6
Inferences About the Difference
Between Two Population Means:
Matched Samples
Example: Comparison of production methods
Each worker provides a pair of data values, one

for each production method. The test is
conducted to determine if the mean completion
times differ between the two methods.
Matched Samples
1. Develop the hypotheses.
H0: μd = 0
Ha: μd ≠ 0
Let μd = the mean of the difference in
values for the population of workers
2. Specify the level of significance: α = .05
3. Compute the value of the test statistic.
$$\bar{d} = \frac{\sum d_i}{n} = \frac{1.8}{6} = .30$$
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{.56}{5}} = .335$$
$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{.30 - 0}{.335 / \sqrt{6}} = 2.20$$
4. Compute the p–value.
For t = 2.20 and df = 5, the p–value is between .10
and .05. (This is a two-tailed test, so we double the
upper-tail areas of .05 and .025.)
5. Determine whether to reject H0.
Because p–value > α = .05, we cannot reject H0.
Excel’s “t-Test: Paired Two Sample
for Means” Tool
• Step 1 Click the Data tab on the Ribbon
• Step 2 In the Analysis group, click Data Analysis
• Step 3 Choose t-Test: Paired Two Sample for Means
from the list of Analysis Tools
• Step 4 When the t-Test: Paired Two Sample for Means
dialog box appears: (see details on next slide)
