Fundamentals of Business Statistics, 6E
Sweeney, Williams, Anderson

Slides by John Loucks, St. Edward’s University

© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 1
Data and Statistics
▪ Statistics
▪ Applications in Business and Economics
▪ Data
▪ Data Sources
▪ Descriptive Statistics
▪ Statistical Inference
▪ Computers and Statistical Analysis
▪ Data Mining
▪ Ethical Guidelines for Statistical Practice

Statistics

▪ The term statistics can refer to numerical facts such as averages, medians, percents, and index numbers that help us understand a variety of business and economic situations.
▪ Statistics can also refer to the art and science of collecting, analyzing, presenting, and interpreting data.

Applications in Business and Economics
▪ Accounting
Public accounting firms use statistical sampling procedures when conducting audits for their clients.
▪ Economics
Economists use statistical information in making forecasts about the future of the economy or some aspect of it.
▪ Finance
Financial advisors use price-earnings ratios and dividend yields to guide their investment advice.

Applications in Business and Economics
▪ Marketing
Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of marketing research applications.
▪ Production
A variety of statistical quality control charts are used to monitor the output of a production process.

Data and Data Sets

▪ Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
▪ All the data collected in a particular study are referred to as the data set for the study.

Elements, Variables, and Observations

▪ Elements are the entities on which data are collected.
▪ A variable is a characteristic of interest for the elements.
▪ The set of measurements obtained for a particular element is called an observation.
▪ A data set with n elements contains n observations.
▪ The total number of data values in a complete data set is the number of elements multiplied by the number of variables.

Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales ($M)   Earn/Share ($)
Dataram        NQ               73.10               0.86
EnergySouth    N                74.00               1.67
Keystone       N                365.70              0.86
LandCare       NQ               111.40              0.33
Psychemedics   N                17.60               0.13

Callouts: the Company column lists the element names; Stock Exchange, Annual Sales, and Earn/Share are the variables; each row is an observation; the whole table is the data set.
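As a rough illustration of these terms, the five-company data set can be written as a list of records and the counts checked directly. The names `DATA_SET` and `VARIABLES` and the dictionary keys are choices made for this sketch, not part of the original slides; the values are taken from the table.

```python
# One dict per element (company). Key names are illustrative assumptions;
# the values come from the five-company table.
DATA_SET = [
    {"company": "Dataram", "exchange": "NQ", "sales_musd": 73.10, "eps": 0.86},
    {"company": "EnergySouth", "exchange": "N", "sales_musd": 74.00, "eps": 1.67},
    {"company": "Keystone", "exchange": "N", "sales_musd": 365.70, "eps": 0.86},
    {"company": "LandCare", "exchange": "NQ", "sales_musd": 111.40, "eps": 0.33},
    {"company": "Psychemedics", "exchange": "N", "sales_musd": 17.60, "eps": 0.13},
]

# The company name identifies the element; the other three keys are variables.
VARIABLES = ["exchange", "sales_musd", "eps"]

n_elements = len(DATA_SET)
n_observations = len(DATA_SET)  # one observation (row) per element
n_variables = len(VARIABLES)

# Count the data values actually stored, one per element per variable.
n_data_values = sum(1 for row in DATA_SET for name in VARIABLES if name in row)

print(n_elements, n_variables, n_data_values)
```

With 5 elements and 3 variables the count of data values is 5 x 3 = 15, which matches the rule stated on the previous slide.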
Scales of Measurement

Scales of measurement include: Nominal, Ordinal, Interval, Ratio

The scale determines the amount of information contained in the data.

The scale indicates the data summarization and statistical analyses that are most appropriate.

Scales of Measurement

▪ Nominal

Data are labels or names used to identify an attribute of the element.

A nonnumeric label or numeric code may be used.

Scales of Measurement

▪ Nominal

Example:
Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on.
Alternatively, a numeric code could be used for the school variable (e.g., 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).
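A small sketch may help show why a numeric code does not make nominal data quantitative. The code-to-school mapping follows the example; the list of six coded students is hypothetical and exists only for illustration.

```python
from collections import Counter

# Numeric codes for the school variable, as in the example.
SCHOOL_CODES = {1: "Business", 2: "Humanities", 3: "Education"}

# Hypothetical coded responses for six students (illustrative only).
coded = [1, 3, 2, 1, 1, 2]

# Counting and finding the most frequent category are meaningful.
counts = Counter(SCHOOL_CODES[c] for c in coded)
most_common_school = counts.most_common(1)[0][0]

# Averaging the codes runs without error, but the result is not a school
# and carries no meaning for nominal data.
meaningless_mean = sum(coded) / len(coded)

print(counts)
print(most_common_school)
```

The codes act only as labels, so swapping 1 and 3 would change `meaningless_mean` but not the counts per school.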

Scales of Measurement

▪ Ordinal

The data have the properties of nominal data and the order or rank of the data is meaningful.

A nonnumeric label or numeric code may be used.

Scales of Measurement

▪ Ordinal

Example:
Students of a university are classified by their class standing using a nonnumeric label such as Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for the class standing variable (e.g., 1 denotes Freshman, 2 denotes Sophomore, and so on).
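For ordinal data the codes also carry rank, so sorting and picking a middle value make sense. The sketch below assumes the coding continues as 3 for Junior and 4 for Senior; the five coded students are hypothetical.

```python
# Codes follow the example; 3 and 4 are the assumed continuation of
# "1 denotes Freshman, 2 denotes Sophomore, and so on".
CLASS_STANDING = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}

# Hypothetical coded class standings for five students (illustrative only).
coded = [3, 1, 4, 2, 2]

# Ranking is meaningful: a larger code means a later class standing.
ranked = sorted(coded)
labels_in_order = [CLASS_STANDING[c] for c in ranked]

# With an odd number of students, the middle ranked value is the median.
median_label = CLASS_STANDING[ranked[len(ranked) // 2]]

print(labels_in_order)
print(median_label)
```

Differences between codes are still not meaningful here: the gap between Freshman and Sophomore is not a measured quantity.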

Scales of Measurement

▪ Interval

The data have the properties of ordinal data, and the interval between observations is expressed in terms of a fixed unit of measure.

Interval data are always numeric.

Scales of Measurement

▪ Interval

Example:
Melissa has an SAT score of 1885, while Kevin has an SAT score of 1780. Melissa scored 105 points more than Kevin.

Scales of Measurement

▪ Ratio

The data have all the properties of interval data and the ratio of two values is meaningful.

Variables such as distance, height, weight, and time use the ratio scale.

This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Scales of Measurement

▪ Ratio

Example:
Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows 72 credit hours earned. Kevin has twice as many credit hours earned as Melissa.
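The arithmetic behind the interval and ratio examples can be checked in a few lines. The dictionary names are chosen for this sketch; the numbers are the ones given for Melissa and Kevin.

```python
# Values from the SAT (interval) and credit-hour (ratio) examples.
sat = {"Melissa": 1885, "Kevin": 1780}
credit_hours = {"Melissa": 36, "Kevin": 72}

# Interval scale: the difference between two values is meaningful.
sat_difference = sat["Melissa"] - sat["Kevin"]

# Ratio scale: zero credit hours means none earned, so a ratio is meaningful.
credit_ratio = credit_hours["Kevin"] / credit_hours["Melissa"]

print(sat_difference)  # 105
print(credit_ratio)    # 2.0
```

Dividing the two SAT scores would also run, but the slides treat SAT scores as interval data, so that quotient is not interpreted as "times as much".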

Categorical and Quantitative Data

Data can be further classified as being categorical or quantitative.

The statistical analysis that is appropriate depends on whether the data for the variable are categorical or quantitative.

In general, there are more alternatives for statistical analysis when the data are quantitative.

Categorical Data

Labels or names used to identify an attribute of each element

Often referred to as qualitative data

Use either the nominal or ordinal scale of measurement

Can be either numeric or nonnumeric

Appropriate statistical analyses are rather limited

Quantitative Data

Quantitative data indicate how many or how much:

discrete, if measuring how many

continuous, if measuring how much

Quantitative data are always numeric.

Ordinary arithmetic operations are meaningful for quantitative data.

Scales of Measurement

Data
  Categorical
    Numeric: Nominal, Ordinal
    Non-numeric: Nominal, Ordinal
  Quantitative
    Numeric: Interval, Ratio
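The classification in this diagram can be sketched as a simple lookup. The function names `data_type` and `may_be_nonnumeric` are hypothetical helpers introduced for illustration.

```python
# Scale of measurement -> type of data, following the diagram.
SCALE_TO_TYPE = {
    "nominal": "categorical",
    "ordinal": "categorical",
    "interval": "quantitative",
    "ratio": "quantitative",
}


def data_type(scale: str) -> str:
    """Return 'categorical' or 'quantitative' for a scale of measurement."""
    return SCALE_TO_TYPE[scale.lower()]


def may_be_nonnumeric(scale: str) -> bool:
    """Only categorical data may use nonnumeric labels."""
    return data_type(scale) == "categorical"


print(data_type("Ordinal"))
print(may_be_nonnumeric("ratio"))
```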

Cross-Sectional Data

Cross-sectional data are collected at the same or approximately the same point in time.

Example: data detailing the number of building permits issued in February 2010 in each of the counties of Ohio

Time Series Data

Time series data are collected over several time periods.

Example: data detailing the number of building permits issued in Lucas County, Ohio in each of the last 36 months
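One way to picture the difference between cross-sectional and time series data is by how the records are keyed. The sketch below is only about shape: every permit count in it is a made-up placeholder, not a real figure, and only three counties and three months are shown.

```python
# Cross-sectional: many elements (counties) observed in a single period.
# Counts are made-up placeholders, not real permit figures.
cross_sectional = {
    "period": "2010-02",
    "permits_by_county": {"Lucas": 40, "Franklin": 95, "Cuyahoga": 70},
}

# Time series: one element (Lucas County) observed over successive periods.
# Only 3 of the 36 months are sketched; counts are again placeholders.
time_series = {
    "county": "Lucas",
    "permits_by_month": {"2010-01": 35, "2010-02": 40, "2010-03": 48},
}

n_counties = len(cross_sectional["permits_by_county"])
n_periods = len(time_series["permits_by_month"])

print(n_counties, n_periods)
```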

Time Series Data

[Graph: U.S. Average Price Per Gallon for Conventional Regular Gasoline]

Source: Energy Information Administration, U.S. Department of Energy, May 2009.

Data Sources

▪ Existing Sources

Internal company records – almost any department
Business database services – Dow Jones & Co.
Government agencies – U.S. Department of Labor
Industry associations – Travel Industry Association of America
Special-interest organizations – Graduate Management Admission Council
Internet – more and more firms

Data Sources

▪ Data Available From Internal Company Records

Record               Some of the Data Available
Employee records     name, address, social security number
Production records   part number, quantity produced, direct labor cost, material cost
Inventory records    part number, quantity in stock, reorder level, economic order quantity
Sales records        product number, sales volume, sales volume by region
Credit records       customer name, credit limit, accounts receivable balance
Customer profile     age, gender, income, household size
Data Sources

▪ Data Available From Selected Government Agencies

Government Agency            Website                  Some of the Data Available
Census Bureau                www.census.gov           Population data, number of households, household income
Federal Reserve Board        www.federalreserve.gov   Data on money supply, exchange rates, discount rates
Office of Mgmt. & Budget     www.whitehouse.gov/omb   Data on revenue, expenditures, debt of federal government
Department of Commerce       www.doc.gov              Data on business activity, value of shipments, profit by industry
Bureau of Labor Statistics   www.bls.gov              Consumer spending, unemployment rate, hourly earnings, safety record

Data Sources

▪ Statistical Studies - Experimental

In experimental studies the variable of interest is first identified. Then one or more other variables are identified and controlled so that data can be obtained about how they influence the variable of interest.

The largest experimental study ever conducted is believed to be the 1954 Public Health Service experiment for the Salk polio vaccine. Nearly two million U.S. children (grades 1-3) were selected.

Data Sources

▪ Statistical Studies - Observational

In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest. A survey is a good example.

Studies of smokers and nonsmokers are observational studies because researchers do not determine or control who will smoke and who will not smoke.

Data Acquisition Considerations

Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or were
acquired with little care can lead to misleading
information.
Descriptive Statistics

▪ Most of the statistical information in newspapers, magazines, company reports, and other publications consists of data that are summarized and presented in a form that is easy to understand.
▪ Such summaries of data, which may be tabular, graphical, or numerical, are referred to as descriptive statistics.

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in her shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Example: Hudson Auto Repair

▪ Sample of Parts Cost ($) for 50 Tune-ups

91   78   93   57   75   52   99   80   97   62
71   69   72   89   66   75   79   75   72   76
104  74   62   68   97   105  77   65   80   109
85   97   88   68   83   68   71   69   67   74
62   82   98   101  79   105  79   69   62   73

Tabular Summary: Frequency and Percent Frequency
▪ Example: Hudson Auto

Parts Cost ($)   Frequency   Percent Frequency
50-59            2           4
60-69            13          26
70-79            16          32
80-89            7           14
90-99            7           14
100-109          5           10
Total            50          100

Percent frequency = (frequency/50)100; for example, (2/50)100 = 4 for the 50-59 class.
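The frequency and percent frequency columns can be reproduced from the 50 parts costs with a short script. This is a minimal sketch: the helper `class_label` and the fixed class width of 10 are assumptions chosen to match the classes in the table.

```python
# The 50 parts costs ($) from the Hudson Auto sample.
PARTS_COST = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def class_label(cost, width=10):
    """Map a cost to its class, e.g. 57 -> '50-59'."""
    lower = (cost // width) * width
    return f"{lower}-{lower + width - 1}"


# Tally how many costs fall in each class.
frequency = {}
for cost in PARTS_COST:
    label = class_label(cost)
    frequency[label] = frequency.get(label, 0) + 1

# Percent frequency = (frequency / n) * 100.
n = len(PARTS_COST)
percent = {label: count * 100 / n for label, count in frequency.items()}

for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:<8} {frequency[label]:>3} {percent[label]:>5.0f}")
```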

Graphical Summary: Histogram

▪ Example: Hudson Auto

[Histogram: Tune-up Parts Cost. Vertical axis: Frequency (0 to 18). Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109.]
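Since the chart itself does not survive in text form, a rough text histogram can stand in for it. The sketch assumes the class frequencies from the frequency table (2, 13, 16, 7, 7, 5); `text_histogram` is a hypothetical helper, not something from the slides.

```python
# Class frequencies for the Hudson Auto parts costs, from the frequency table.
FREQUENCY = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}


def text_histogram(freq):
    """Return one bar per class, drawn with one '#' per observation."""
    lines = []
    for label, count in freq.items():
        lines.append(f"{label:>7} | {'#' * count} {count}")
    return "\n".join(lines)


print(text_histogram(FREQUENCY))
```

The tallest bar is the 70-79 class, which agrees with the table.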
Numerical Descriptive Statistics

▪ The most common numerical descriptive statistic is the average (or mean).
▪ The average is a measure of the central tendency, or central location, of the data for a variable.
▪ Hudson’s average cost of parts, based on the 50 tune-ups studied, is $79 (found by summing the 50 cost values and then dividing by 50).
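The $79 figure can be verified directly from the sample. The sketch below sums the 50 parts costs and divides by 50; the variable names are chosen for illustration.

```python
# The 50 parts costs ($) from the Hudson Auto sample.
PARTS_COST = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total = sum(PARTS_COST)          # sum of the 50 cost values
mean = total / len(PARTS_COST)   # divide by the number of tune-ups

print(total)        # 3949
print(mean)         # 78.98
print(round(mean))  # 79
```

The exact mean is $78.98, which the slide reports rounded to the nearest dollar.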

Statistical Inference

Population - the set of all elements of interest in a particular study

Sample - a subset of the population

Statistical inference - the process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population

Census - collecting data for the entire population

Sample survey - collecting data for a sample

Process of Statistical Inference

1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of $79 per tune-up.
4. The sample average is used to estimate the population average.
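The four steps can be mimicked with a small simulation. Everything about the population here is an assumption: it is a synthetic list of 10,000 costs drawn uniformly between $50 and $109 with a fixed seed, used only so the sample estimate can be compared with a known population average.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Step 1: a synthetic population of tune-up parts costs (made up for
# illustration). In a real study the population average is unknown.
population = [rng.randint(50, 109) for _ in range(10_000)]
population_mean = sum(population) / len(population)

# Step 2: examine a random sample of 50 tune-ups.
sample = rng.sample(population, 50)

# Step 3: compute the sample average.
sample_mean = sum(sample) / len(sample)

# Step 4: use the sample average as the estimate of the population average.
print(f"estimate from sample: {sample_mean:.2f}")
print(f"population average:   {population_mean:.2f}")
```

The two numbers will generally be close but not equal; that gap is sampling error.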

Computers and Statistical Analysis

▪ Statisticians often use computer software to perform the statistical computations required with large amounts of data.
▪ To facilitate computer usage, many of the data sets in this book are available on the website that accompanies the text.
▪ The data files may be downloaded in either Minitab or Excel format.
▪ Also, the Excel add-in StatTools can be downloaded from the website.
▪ Chapter-ending appendices cover the step-by-step procedures for using Minitab, Excel, and StatTools.
Data Warehousing

▪ Organizations obtain large amounts of data on a daily basis by means of magnetic card readers, bar code scanners, point of sale terminals, and touch screen monitors.
▪ Wal-Mart captures data on 20-30 million transactions per day.
▪ Visa processes 6,800 payment transactions per second.
▪ Capturing, storing, and maintaining the data, referred to as data warehousing, is a significant undertaking.

Data Mining

▪ Analysis of the data in the warehouse might aid in decisions that will lead to new strategies and higher profits for the organization.
▪ Using a combination of procedures from statistics, mathematics, and computer science, analysts “mine the data” to convert it into useful information.
▪ The most effective data mining systems use automated procedures to discover relationships in the data and predict future outcomes, … prompted by only general, even vague, queries by the user.

Data Mining Applications

▪ The major applications of data mining have been made by companies with a strong consumer focus such as retail, financial, and communication firms.
▪ Data mining is used to identify related products that customers who have already purchased a specific product are also likely to purchase (and then pop-ups are used to draw attention to those related products).
▪ As another example, data mining is used to identify customers who should receive special discount offers based on their past purchasing volumes.

Data Mining Requirements

▪ Statistical methods such as multiple regression, logistic regression, and correlation are heavily used.
▪ Also needed are computer science technologies involving artificial intelligence and machine learning.
▪ A significant investment in time and money is required as well.

Data Mining Model Reliability

▪ Finding a statistical model that works well for a particular sample of data does not necessarily mean that it can be reliably applied to other data.
▪ With the enormous amount of data available, the data set can be partitioned into a training set (for model development) and a test set (for validating the model).
▪ There is, however, a danger of overfitting the model to the point that misleading associations and conclusions appear to exist.
▪ Careful interpretation of results and extensive testing are important.
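Partitioning a data set into a training set and a test set can be sketched with the standard library alone. The function `partition`, its 30% test fraction, and the stand-in records are assumptions for illustration; the slides do not prescribe a particular split.

```python
import random


def partition(data, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and split it into (training, test) lists."""
    rows = list(data)
    random.Random(seed).shuffle(rows)  # seeded so the split is repeatable
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


# Stand-in records: the integers 0..99 play the role of 100 data rows.
records = list(range(100))
training_set, test_set = partition(records)

print(len(training_set), len(test_set))
```

A model would be developed on `training_set` only and then checked against `test_set`; keeping the two sets disjoint is what makes the check informative.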
Ethical Guidelines for Statistical Practice

▪ In a statistical study, unethical behavior can take a variety of forms including:
• Improper sampling
• Inappropriate analysis of the data
• Development of misleading graphs
• Use of inappropriate summary statistics
• Biased interpretation of the statistical results
▪ You should strive to be fair, thorough, objective, and neutral as you collect, analyze, and present data.
▪ As a consumer of statistics, you should also be aware of the possibility of unethical behavior by others.

Ethical Guidelines for Statistical Practice

▪ The American Statistical Association developed the report “Ethical Guidelines for Statistical Practice”.
▪ The report contains 67 guidelines organized into eight topic areas:
• Professionalism
• Responsibilities to Funders, Clients, Employers
• Responsibilities in Publications and Testimony
• Responsibilities to Research Subjects
• Responsibilities to Research Team Colleagues
• Responsibilities to Other Statisticians/Practitioners
• Responsibilities Regarding Allegations of Misconduct
• Responsibilities of Employers Including Organizations, Individuals, Attorneys, or Other Clients
End of Chapter 1
