
Slides by John Loucks, St. Edward’s University

© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 1
Data and Statistics
• Statistics
• Applications in Business and Economics
• Data
• Data Sources
• Descriptive Statistics
• Statistical Inference
• Computers and Statistical Analysis
• Data Mining
• Ethical Guidelines for Statistical Practice

Statistics

• The term statistics can refer to numerical facts such as averages, medians, percents, and index numbers that help us understand a variety of business and economic situations.
• Statistics can also refer to the art and science of collecting, analyzing, presenting, and interpreting data.

Applications in Business and Economics
• Accounting
  Public accounting firms use statistical sampling procedures when conducting audits for their clients.
• Economics
  Economists use statistical information in making forecasts about the future of the economy or some aspect of it.
• Finance
  Financial advisors use price-earnings ratios and dividend yields to guide their investment advice.

Applications in Business and Economics
• Marketing
  Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of marketing research applications.
• Production
  A variety of statistical quality control charts are used to monitor the output of a production process.

Data and Data Sets

• Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
• All the data collected in a particular study are referred to as the data set for the study.

Elements, Variables, and Observations

• Elements are the entities on which data are collected.
• A variable is a characteristic of interest for the elements.
• The set of measurements obtained for a particular element is called an observation.
• A data set with n elements contains n observations.
• The total number of data values in a complete data set is the number of elements multiplied by the number of variables.

Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales ($M)   Earn/Share ($)
Dataram        NQ               73.10               0.86
EnergySouth    N                74.00               1.67
Keystone       N                365.70              0.86
LandCare       NQ               111.40              0.33
Psychemedics   N                17.60               0.13

Element names: the entries in the Company column
Variables: Stock Exchange, Annual Sales, and Earn/Share
Observation: one row of the table
Data set: the entire table
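
As a rough illustration of the counting rule (elements multiplied by variables), the five-company data set above could be held in a small Python structure. This is only a sketch: the name `data_set` and the field names are assumptions made for the example and do not come from the slides.

```python
# Each key is an element (a company); each value is that element's observation.
# The field names are illustrative labels for the three variables in the table.
data_set = {
    "Dataram":      {"exchange": "NQ", "annual_sales_musd": 73.10,  "earn_per_share": 0.86},
    "EnergySouth":  {"exchange": "N",  "annual_sales_musd": 74.00,  "earn_per_share": 1.67},
    "Keystone":     {"exchange": "N",  "annual_sales_musd": 365.70, "earn_per_share": 0.86},
    "LandCare":     {"exchange": "NQ", "annual_sales_musd": 111.40, "earn_per_share": 0.33},
    "Psychemedics": {"exchange": "N",  "annual_sales_musd": 17.60,  "earn_per_share": 0.13},
}

n_elements = len(data_set)
# Collect the distinct variable names that appear across all observations.
variables = sorted({name for obs in data_set.values() for name in obs})
n_variables = len(variables)
n_observations = len(data_set)  # one observation per element
# Count every individual data value actually stored.
n_data_values = sum(len(obs) for obs in data_set.values())

print(f"elements={n_elements}, variables={n_variables}, data values={n_data_values}")
```

Counting the stored values directly and multiplying elements by variables should agree whenever the data set is complete, that is, when no observation is missing a value.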
Scales of Measurement

Scales of measurement include:
• Nominal
• Ordinal
• Interval
• Ratio

The scale determines the amount of information contained in the data.

The scale indicates the data summarization and statistical analyses that are most appropriate.

Scales of Measurement

• Nominal

Data are labels or names used to identify an attribute of the element.

A nonnumeric label or numeric code may be used.

Scales of Measurement

• Nominal

Example: The gender of a subject in a study could be recorded as male (M) or female (F). Alternatively, a numeric code could be used for the gender variable (e.g., 1 denotes male, 2 denotes female).

Scales of Measurement

• Ordinal

The data have the properties of nominal data and the order or rank of the data is meaningful.

A nonnumeric label or numeric code may be used.

Scales of Measurement

• Ordinal

Example: Students of a university are classified by the grades they secure, using a nonnumeric label such as A, B, C, D, or F. Alternatively, a numeric code could be used for the grade variable (e.g., 1 denotes A, 2 denotes B, and so on).

Note that the grades do not convey any numerical marks obtained by students.
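
One way to picture the ordinal scale is to attach a numeric code to each grade label, following the convention in the example (1 denotes A, 2 denotes B, and so on). The sketch below uses hypothetical names (`GRADE_CODE`, `rank_grades`) and a made-up list of grades; it suggests that ordering by code is meaningful, while arithmetic on the codes is not something the scale supports.

```python
# Numeric codes for the ordinal grade labels (1 denotes A, 2 denotes B, ...).
GRADE_CODE = {"A": 1, "B": 2, "C": 3, "D": 4, "F": 5}


def rank_grades(grades):
    """Return the grades sorted from best (A) to worst (F) using their codes."""
    return sorted(grades, key=lambda g: GRADE_CODE[g])


# Hypothetical grades for five students, used only for illustration.
grades = ["C", "A", "F", "B", "A"]
ranked = rank_grades(grades)
print(ranked)

# Comparing codes is meaningful: A ranks ahead of C.
a_ranks_ahead_of_c = GRADE_CODE["A"] < GRADE_CODE["C"]
print(a_ranks_ahead_of_c)

# Averaging the codes would assume the gap between A and B equals the gap
# between D and F, which an ordinal scale does not guarantee.
```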

Scales of Measurement

• Interval

The data have the properties of ordinal data, and the interval between observations is expressed in terms of a fixed unit of measure.

Interval data are always numeric.

Scales of Measurement

• Interval

Example: Mohan has a GMAT score of 1205, while Seema has a GMAT score of 1090. Mohan scored 115 points more than Seema.

Important: Differences between two observations are meaningful.

Scales of Measurement

• Ratio

The data have all the properties of interval data and the ratio of two values is meaningful.

Variables such as distance, height, weight, and time use the ratio scale.

This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Scales of Measurement

• Ratio

Example: Mohan’s college record shows 36 credit hours earned, while Seema’s record shows 72 credit hours earned. Seema has twice as many credit hours earned as Mohan.
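
The arithmetic behind the interval and ratio examples can be checked in a few lines. This is a minimal sketch using only the numbers given in the examples; the variable names are assumptions chosen for readability.

```python
# Interval-scale example: the difference between two scores is meaningful.
mohan_score, seema_score = 1205, 1090
score_difference = mohan_score - seema_score

# Ratio-scale example: the ratio is meaningful because zero credit hours
# means that no credit hours have been earned.
mohan_credits, seema_credits = 36, 72
credit_ratio = seema_credits / mohan_credits

print(f"Mohan scored {score_difference} points more than Seema.")
print(f"Seema has {credit_ratio:g} times as many credit hours as Mohan.")
```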

Categorical and Quantitative Data

Data can be further classified as being categorical or quantitative.

The statistical analysis that is appropriate depends on whether the data for the variable are categorical or quantitative.

In general, there are more alternatives for statistical analysis when the data are quantitative.

Categorical Data

• Labels or names used to identify an attribute of each element
• Often referred to as qualitative data
• Use either the nominal or ordinal scale of measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather limited

Quantitative Data

Quantitative data indicate how many or how much:
• discrete, if measuring how many
• continuous, if measuring how much

Quantitative data are always numeric.

Ordinary arithmetic operations are meaningful for quantitative data.

Scales of Measurement

Data
  Categorical
    Numeric: Nominal, Ordinal
    Non-numeric: Nominal, Ordinal
  Quantitative
    Numeric: Interval, Ratio

Cross-Sectional Data

Cross-sectional data are collected at the same or approximately the same point in time.

Example: data detailing the number of building permits issued in February 2010 in each of the counties of Ohio

Time Series Data

Time series data are collected over several time periods.

Example: data detailing the number of building permits issued in Lucas County, Ohio in each of the last 36 months
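
The difference between the two layouts can be sketched in code. The permit counts below are placeholders invented for illustration (they are not real Ohio figures), and the variable names are assumptions. Cross-sectional data vary across elements at one point in time, while time series data vary across time for one element.

```python
# Cross-sectional: several counties, one period (February 2010).
# The counts are placeholders, not real figures.
cross_sectional = {
    "Lucas": 40,
    "Franklin": 95,
    "Cuyahoga": 70,
}

# Time series: one county (Lucas), consecutive periods.
# Only three months are shown; the counts are placeholders.
time_series = [
    ("2010-01", 35),
    ("2010-02", 40),
    ("2010-03", 52),
]

counties_observed = len(cross_sectional)
periods_observed = len(time_series)
# In a time series the order of the periods carries information.
periods = [period for period, _ in time_series]
periods_in_order = periods == sorted(periods)

print(counties_observed, periods_observed, periods_in_order)
```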

Time Series Data

[Figure: U.S. Average Price Per Gallon for Conventional Regular Gasoline. Source: Energy Information Administration, U.S. Department of Energy, May 2009.]

Data Sources

• Existing Sources
  Internal company records – almost any department
  Business database services – Dow Jones & Co.
  Government agencies – U.S. Department of Labor
  Industry associations – Travel Industry Association of America
  Special-interest organizations – Graduate Management Admission Council
  Internet – more and more firms

Data Sources

• Data Available From Internal Company Records

Record               Some of the Data Available
Employee records     name, address, social security number
Production records   part number, quantity produced, direct labor cost, material cost
Inventory records    part number, quantity in stock, reorder level, economic order quantity
Sales records        product number, sales volume, sales volume by region
Credit records       customer name, credit limit, accounts receivable balance
Customer profile     age, gender, income, household size
Data Acquisition Considerations

Time Requirement
• Searching for information can be time-consuming.
• Information may no longer be useful by the time it is available.
Cost of Acquisition
• Organizations often charge for information even when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or were acquired with little care can lead to misleading information.
Representation of Data - Descriptive Statistics

• Most of the statistical information in newspapers, magazines, company reports, and other publications consists of data that are summarized and presented in a form that is easy to understand.
• Such summaries of data, which may be tabular, graphical, or numerical, are referred to as descriptive statistics.

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in her shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Example: Hudson Auto Repair

• Sample of Parts Cost ($) for 50 Tune-ups

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

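Representation of Data - Tabular Summary: Frequency and Percent Frequency
• Example: Hudson Auto

Parts Cost ($)   Frequency   Percent Frequency
50-59            2           4
60-69            13          26
70-79            16          32
80-89            7           14
90-99            7           14
100-109          5           10
Total            50          100

Each percent frequency is the class frequency divided by 50, multiplied by 100; for example, (2/50)100 = 4 for the 50-59 class.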
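
The tabular summary can be reproduced from the 50 parts costs listed earlier. The following is a minimal sketch, assuming classes of width 10 that start at multiples of 10; the helper name `class_label` is hypothetical.

```python
from collections import Counter

# Parts cost ($) for the 50 tune-ups, as listed in the Hudson Auto example.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def class_label(cost, width=10):
    """Map a cost to its class label, e.g. 57 -> '50-59'."""
    lower = (cost // width) * width
    return f"{lower}-{lower + width - 1}"


n = len(parts_cost)
frequency = Counter(class_label(cost) for cost in parts_cost)
# Percent frequency = (class frequency / n) * 100.
percent_frequency = {label: count * 100 / n for label, count in frequency.items()}

# Print the classes in increasing order of their lower limit.
for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>7}  {frequency[label]:>3}  {percent_frequency[label]:>5.0f}")
```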

Representation of Data - Graphical Summary: Histogram
• Example: Hudson Auto

[Figure: histogram titled "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, and 100-109. Vertical axis: Frequency, marked from 2 to 18 in steps of 2.]
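
A histogram of this kind can be approximated in plain text. The sketch below takes the class frequencies from the Hudson Auto tabular summary and draws one bar per class; the function name `text_histogram` and the `#` bar symbol are assumptions made for the example.

```python
# Class frequencies from the tabular summary of the 50 tune-ups.
frequencies = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}


def text_histogram(freqs, symbol="#"):
    """Return one line per class; each bar's length equals the class frequency."""
    width = max(len(label) for label in freqs)
    return [
        f"{label:>{width}} | {symbol * count} ({count})"
        for label, count in freqs.items()
    ]


for line in text_histogram(frequencies):
    print(line)
```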
Representation of Data - Numerical Descriptive Statistics
• The most common numerical descriptive statistic is the average (or mean).
• The average provides a measure of the central tendency, or central location, of the data for a variable.
• Hudson’s average cost of parts, based on the 50 tune-ups studied, is $79 (found by summing the 50 cost values and then dividing by 50).
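
The averaging step can be verified directly from the 50 parts costs in the Hudson Auto example. This is a small sketch with assumed variable names; the unrounded mean works out to 78.98, which rounds to the $79 reported.

```python
# Parts cost ($) for the 50 tune-ups, as listed in the Hudson Auto example.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total = sum(parts_cost)        # sum of the 50 cost values
n = len(parts_cost)            # number of tune-ups studied
mean_cost = total / n          # average (mean) parts cost

print(f"sum = {total}, n = {n}, mean = {mean_cost:.2f}")
```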

Statistical Inference

Population - the set of all elements of interest in a particular study

Sample - a subset of the population

Statistical inference - the process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population

Census - collecting data for the entire population

Sample survey - collecting data for a sample

Process of Statistical Inference

1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of $79 per tune-up.
4. The sample average is used to estimate the population average.
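
The four steps can be imitated with a small simulation. Everything about the population below is hypothetical: its size (10,000), its normal shape, and its mean and standard deviation (79 and 14) are assumptions chosen only so that the sample estimate can be compared with a known population value, which is not possible in a real study.

```python
import random
import statistics

rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Step 1: a hypothetical population of parts costs for 10,000 tune-ups.
# In practice the population average is unknown; it is simulated here
# purely for comparison.
population = [rng.gauss(79, 14) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Step 2: examine a sample of 50 tune-ups drawn without replacement.
sample = rng.sample(population, 50)

# Step 3: compute the sample average.
sample_mean = statistics.mean(sample)

# Step 4: use the sample average as the estimate of the population average.
estimation_error = sample_mean - population_mean

print(f"population mean (normally unknown): {population_mean:.2f}")
print(f"sample mean (the estimate):         {sample_mean:.2f}")
print(f"estimation error:                   {estimation_error:+.2f}")
```

A different seed would give a different sample and a slightly different estimate, which is why inference procedures also report how much the estimate is expected to vary.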

Ethical Guidelines for Statistical Practice

• In a statistical study, unethical behavior can take a variety of forms including:
  • Improper sampling
  • Inappropriate analysis of the data
  • Development of misleading graphs
  • Use of inappropriate summary statistics
  • Biased interpretation of the statistical results
• You should strive to be fair, thorough, objective, and neutral as you collect, analyze, and present data.
• As a consumer of statistics, you should also be aware of the possibility of unethical behavior by others.

Ethical Guidelines for Statistical Practice

• The American Statistical Association developed the report “Ethical Guidelines for Statistical Practice”.
• The report contains 67 guidelines organized into eight topic areas:
  • Professionalism
  • Responsibilities to Funders, Clients, Employers
  • Responsibilities in Publications and Testimony
  • Responsibilities to Research Subjects
  • Responsibilities to Research Team Colleagues
  • Responsibilities to Other Statisticians/Practitioners
  • Responsibilities Regarding Allegations of Misconduct
  • Responsibilities of Employers Including Organizations, Individuals, Attorneys, or Other Clients
End of Chapter 1
