
Statistics for

Business and Economics


7th Edition

Chapter 1

Describing Data: Graphical

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 1-1
Introduction

• What are the projected sales of a new product such as Apple's new iPhone?
• Who will win the next presidential election?
• What will be the best jobs available when you graduate from university?

Numbers are used to predict or forecast the sales of a new product, the weather, the outcome of an election, and so on.
1.1 Dealing with Uncertainty

• Numbers and data are used to assist decision making.
• Once the data are collected, what do we do with them?
• Statistics is a tool to help us process, summarize, analyze, and interpret data for the purpose of making better decisions in an uncertain environment.

Dealing with Uncertainty

Everyday decisions are based on incomplete information:
• Accountants may need to select a portion of records for auditing purposes.
• Financial investors need to understand the market's fluctuations and to choose among various portfolio investments.
• Managers may use surveys to find out whether customers are satisfied with the company's products or services.

The steps: define the problem, determine what data are needed, collect the data, use statistics to summarize the data, and make inferences and decisions based on the data obtained.
Managers need an understanding of statistics
for the following four key reasons:

• To know how to properly present and describe information.
• To know how to draw conclusions about large populations based only on information obtained from samples.
• To know how to improve processes.
• To know how to obtain reliable forecasts.
1.2 Key Definitions

• A population is the collection of all items of interest or under investigation.
    • N represents the population size.
    • A population is often too large to analyze in full.
    • Collecting complete information for a population could be impossible, expensive, or time consuming.

• A sample is an observed subset of the population.
    • n represents the sample size.

Reasons for Drawing a Sample

• A sample is less time consuming than a census.
• A sample is less costly to administer than a census.
• A sample is less cumbersome and more practical to administer than a census of the targeted population.
Examples of Populations

• Names of all registered voters in the United States
• Incomes of all families living in Daytona Beach
• Annual returns of all stocks traded on the New York Stock Exchange
• Grade point averages of all the students in your university

Random Sampling

Simple random sampling is a procedure in which
• each member of the population is chosen strictly by chance, and
• each member of the population is equally likely to be chosen.

The resulting sample is called a random sample.
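The procedure above can be sketched in a few lines of Python. This is only an illustration: the population of letters, the sample size of 9, and the helper name `simple_random_sample` are assumptions made for the example, not part of the slides.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # the optional seed only makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical population: the letters a through z (N = 26).
population = [chr(code) for code in range(ord("a"), ord("z") + 1)]

# Hypothetical sample size n = 9.
sample = simple_random_sample(population, n=9, seed=1)
print("N =", len(population), "n =", len(sample))
print(sample)
```

`random.sample` selects without replacement, so no member of the population appears twice in the sample.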

• A parameter is a specific characteristic of a population.
• A statistic is a specific characteristic of a sample.

Population vs. Sample

[Figure: a population consisting of the letters a through z, and a sample drawn from it consisting of the letters b, c, g, i, n, o, r, u, y.]

Values calculated using population data are called parameters. Values computed from sample data are called statistics.
To think statistically:
• Define the problem.
    • What information is required?
    • What is the relevant population?
    • How should sample members be selected?
    • How should information be obtained from the sample members?
• Use sample information to make decisions about the population of interest.
• Draw the conclusion about the population.


Descriptive and Inferential Statistics

Two branches of statistics:


 Descriptive statistics
 Graphical and numerical procedures to summarize
and process data
 Inferential statistics
 Using data to make predictions, forecasts, and
estimates to assist decision making

Descriptive Statistics

 Collect data
 e.g., Survey
 Present data
 e.g., Tables and graphs
 Summarize data
 e.g., Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Inferential Statistics
• Estimation
    • e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
    • e.g., Test the claim that the population mean weight is 140 pounds

Inference is the process of drawing conclusions or making decisions about a population based on sample results.
Classification of Variables

• Variable: a specific characteristic of an individual or object.
• Categorical or numerical: a classification based on the type and amount of information contained in the data.
• Measurement levels: a classification of variables as qualitative or quantitative.
Types of Data

• Categorical (defined as categories or groups)
    Examples: marital status; "Are you registered to vote?"; eye color
• Numerical
    • Discrete (counted items, from a counting process)
        Examples: number of children; defects per hour
    • Continuous (measured characteristics, from a measurement process)
        Examples: weight; voltage

Types of Data: Measurement Levels

Qualitative data (nominal and ordinal): there is no measurable meaning to the "difference" in numbers.

• Nominal data: obtained from categorical questions.
    Examples:
    • 1: male, 2: female
    • 1: yes, 2: no
• Ordinal data: obtained from ordered categories.
    Examples:
    • Product quality ratings: 1: poor, 2: average, 3: good
    • Satisfaction rating with university food service: 1: very dissatisfied, 2: moderately dissatisfied, 3: no opinion, 4: moderately satisfied, 5: very satisfied

Quantitative data (interval and ratio):

• Interval data: indicates rank and distance from an arbitrary zero, measured in unit intervals.
    Examples: temperature; calendar year
• Ratio data: indicates both rank and distance from a natural zero, with ratios of two measures having meaning.
    Examples:
    • A person who weighs 200 pounds is twice the weight of a person who weighs 100 pounds.
    • A person who is 40 years old is twice the age of someone who is 20 years old.
1.3 Graphical Presentation of Data

• Data in raw form are usually not easy to use for decision making.
• Some type of organization is needed:
    • Table
    • Graph
• The type of graph to use depends on the variable being summarized.

Graphical Presentation of Data (continued)

Techniques reviewed in this chapter:

Categorical variables:
• Frequency distribution
• Bar chart
• Pie chart
• Pareto diagram

Numerical variables:
• Line chart
• Frequency distribution
• Histogram and ogive
• Stem-and-leaf display
• Scatter plot

Tables and Graphs for Categorical Variables

Categorical data can be presented in two ways:
• Tabulating data: frequency distribution table
• Graphing data: bar chart, pie chart, Pareto diagram

The Frequency Distribution Table

• Used to organize data.
• Summarizes data by category:
    - Left column: the classes or groups, which are all possible responses on the variable being studied.
    - Right column: a list of the frequencies, or number of observations, for each class.
The Frequency Distribution Table

Example: Hospital Patients by Unit

Hospital Unit      Number of Patients
Cardiac Care             1,052
Emergency                2,245
Intensive Care             340
Maternity                  552
Surgery                  4,630

(Variables are categorical)

Bar and Pie Charts
• Bar charts and pie charts are often used for qualitative (categorical) data.
• If our intent is to draw attention to the frequency of each category, we will most likely draw a bar chart.
• If we want to draw attention to the proportion of frequencies in each category, we will probably use a pie chart.
Bar Chart Example

Hospital Unit      Number of Patients
Cardiac Care             1,052
Emergency                2,245
Intensive Care             340
Maternity                  552
Surgery                  4,630

[Figure: bar chart "Hospital Patients by Unit"; vertical axis: number of patients per year (0 to 5,000); one bar per hospital unit. The height of each bar shows the frequency of its category.]
Example: Health conscious level by gender

                               Males   Females
Very health conscious            16       13
Moderately health conscious      26       29
Slightly health conscious        12        8
Not very health conscious         7        2

Component or Stacked Bar Chart

[Figure: stacked bar chart with one bar per health-consciousness level, each bar split into male and female counts; vertical axis from 0 to 60.]
Cluster Bar Chart

[Figure: cluster bar chart with two groups (male, female); within each group, one bar per health-consciousness level (very, moderately, slightly, not very health conscious); vertical axis from 0 to 35.]
Pie Chart Example

Hospital Unit      Number of Patients   % of Total
Cardiac Care             1,052             11.93
Emergency                2,245             25.46
Intensive Care             340              3.86
Maternity                  552              6.26
Surgery                  4,630             52.50

[Figure: pie chart "Hospital Patients by Unit": Surgery 53%, Emergency 25%, Cardiac Care 12%, Maternity 6%, Intensive Care 4%. The size of each pie slice shows the percentage for each category. Percentages are rounded to the nearest percent.]
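The "% of Total" column can be reproduced with a short calculation. The sketch below uses the hospital counts from the slide; the function name `percent_of_total` is a hypothetical helper chosen for the example.

```python
# Patient counts per hospital unit, taken from the slide.
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}


def percent_of_total(frequencies):
    """Return each category's share of the total, in percent."""
    total = sum(frequencies.values())
    return {name: 100 * count / total for name, count in frequencies.items()}


percentages = percent_of_total(patients)
for unit, pct in percentages.items():
    # Two decimals for the table; the pie chart rounds to whole percents.
    print(f"{unit:<16}{patients[unit]:>6,}{pct:>8.2f}{round(pct):>5}%")
```

The same percentages determine the slice sizes in the pie chart, since each slice's angle is its percentage of 360 degrees.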
Pareto Diagram
• Used to portray categorical data
• Named after the Italian economist Vilfredo Pareto
• In most cases, a small number of factors are responsible for most of the problems.
• Used to separate the "vital few" from the "trivial many"
• 80-20 rule, for example: "A student might think that 80% of the work on a group project was done by only 20% of the team members."
• Managers can use a Pareto diagram to identify the major causes of a problem and correct them at minimum cost.
• It is a bar chart in which categories are shown in descending order of frequency.
• A cumulative polygon is often shown in the same graph.
[Figure: Pareto diagram "Cause of Manufacturing Defect"; bars show the causes in order of decreasing frequency (Poor Alignment, Paint Flaw, Bad Weld, Missing Part, Cracked Case, Electrical Short), with the most frequent causes on the left; left axis from 0% to 60%, right axis from 0% to 100%.]
Pareto Diagram Example
Example: 400 defective items are examined for cause of defect:

Source of Manufacturing Error   Number of defects
Bad Weld                               34
Poor Alignment                        223
Missing Part                           25
Paint Flaw                             78
Electrical Short                       19
Cracked Case                           21
Total                                 400

Pareto Diagram Example (continued)

Step 1: Sort the defect causes in descending order of frequency
Step 2: Determine the % in each category

Source of Manufacturing Error   Number of defects   % of Total Defects
Poor Alignment                        223                 55.75
Paint Flaw                             78                 19.50
Bad Weld                               34                  8.50
Missing Part                           25                  6.25
Cracked Case                           21                  5.25
Electrical Short                       19                  4.75
Total                                 400                100.00
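Steps 1 and 2, together with the cumulative percentages that the Pareto diagram draws as a line, can be sketched as follows. The defect counts come from the slide; the function name `pareto_table` and the tuple layout are assumptions made for this illustration.

```python
from itertools import accumulate

# Defect counts from the example, in their original (unsorted) order.
defects = {
    "Bad Weld": 34,
    "Poor Alignment": 223,
    "Missing Part": 25,
    "Paint Flaw": 78,
    "Electrical Short": 19,
    "Cracked Case": 21,
}


def pareto_table(frequencies):
    """Return rows of (cause, count, percent, cumulative percent)."""
    total = sum(frequencies.values())
    # Step 1: sort causes by frequency, largest first.
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    # Running totals feed the cumulative line of the diagram.
    running = accumulate(count for _, count in ordered)
    # Step 2: convert counts and running totals to percentages.
    return [
        (cause, count, 100 * count / total, 100 * cumulative / total)
        for (cause, count), cumulative in zip(ordered, running)
    ]


table = pareto_table(defects)
for cause, count, pct, cum_pct in table:
    print(f"{cause:<18}{count:>5}{pct:>8.2f}{cum_pct:>8.2f}")
```

The first two rows show the "vital few": poor alignment and paint flaws together account for about three quarters of all defects.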

Pareto Diagram Example (continued)
Step 3: Show results graphically

[Figure: Pareto diagram "Cause of Manufacturing Defect"; left axis: % of defects in each category (bar graph, 0% to 60%); right axis: cumulative % (line graph, 0% to 100%); categories in order: Poor Alignment, Paint Flaw, Bad Weld, Missing Part, Cracked Case, Electrical Short.]

• Example: A company has determined that there are seven possible defects for one of its product lines. Construct a Pareto diagram for the following defect frequencies:

Defect Code   Frequency
A                10
B                70
C                15
D                90
E                 8
F                 4
G                 3
• Example: Construct a Pareto diagram to help the company determine the most significant factors contributing to processing errors.

Code   Defect                             Frequency
1      Procedural and diagnostic codes       40
2      Provider information                   9
3      Patient information                    6
4      Pricing schedule                      17
5      Contractual applications              37
6      Provider adjustment                    7
7      Others                                 4
Graphs for Time-Series Data

• Suppose that we take a random sample of 100 boxes of a new variety of cereal.
• If we collect our sample at one point in time and weigh each box, then the measurements obtained are known as cross-sectional data.
• We could instead collect and measure a random sample of 5 boxes every 15 minutes or 10 boxes every 20 minutes. Data measured at successive points in time are called time-series data.
1.4 Graphs for Time-Series Data

• A line chart (time-series plot) is used to show the values of a variable over time.
• Time is measured on the horizontal axis.
• The variable of interest is measured on the vertical axis.

Examples of time-series data:

• Monthly product sales and interest rates
• Quarterly corporate earnings
• Daily closing prices of common stock
• Daily exchange rates between various world currencies
Line Chart Example

[Figure: line chart "Magazine Subscriptions by Year"; horizontal axis: year (1990 to 2006); vertical axis: thousands of subscribers (0 to 350).]
Example
• Construct a time-series plot for the following numbers of customers shopping at a new mall during a given week (from your book, p. 40).

Day         Number of customers
Monday             525
Tuesday            540
Wednesday          469
Thursday           500
Friday             586
Saturday           640
1.5 Graphs to Describe Numerical Variables

Numerical data can be described with:
• Frequency distributions and cumulative distributions
    • Histogram
    • Ogive
• Stem-and-leaf display

Frequency Distributions

• A frequency distribution is a list or a table
    • containing class groupings (categories or ranges within which the data fall) in the left column, and
    • the corresponding frequencies with which data fall within each class or category in the right column.

• Example: Frequency distribution for employee completion times

Completion times (in seconds)   Frequency
(classes or intervals)          (number of observations)
220 but less than 230                 5
230 but less than 240                 8
240 but less than 250                13
250 but less than 260                22
260 but less than 270                32
270 but less than 280                13
280 but less than 290                10
290 but less than 300                 7


Why Use Frequency Distributions?

• A frequency distribution is a way to summarize data.
• The distribution condenses the raw data into a more useful form
• and allows for a quick visual interpretation of the data.

The classes or intervals of a frequency distribution for numerical data are not as easily identifiable as the categories of categorical data. Two questions arise:

• How many intervals should be used?
• How wide should each interval be?
Class Intervals and Class Boundaries

STEP 1: Determine the number of intervals (classes)
• The number of intervals (classes) used in a frequency distribution is decided in a somewhat arbitrary manner.
• Larger data sets require more intervals; smaller data sets require fewer intervals.
• Use at least 5 but no more than 15 to 20 intervals.


How Many Class Intervals?

• Many (narrow class intervals)
    • may yield a very jagged distribution with gaps from empty classes
    • can give a poor indication of how frequency varies across classes

[Figure: histogram of temperature with narrow intervals (upper class endpoints 4, 8, ..., 60, More); vertical axis: frequency.]

• Few (wide class intervals)
    • may compress variation too much and yield a blocky distribution
    • can obscure important patterns of variation

[Figure: histogram of temperature with wide intervals (upper class endpoints 0, 30, 60, More); vertical axis: frequency.]

(X axis labels are upper class endpoints)

Class Intervals and Class Boundaries

STEP 2: Determine the interval width
• Each class grouping has the same width:

$$w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$$

• Round up the interval width to get desirable interval endpoints.
Class Intervals and Class Boundaries

STEP 3:
• Intervals never overlap.
• Each observation must belong to one and only one interval.

Example:
• Wrong (the intervals overlap at age 30): age 20 to age 30, age 30 to age 40
• Correct: age 20 but less than age 30, age 30 but less than age 40
• Correct: 20-29, 30-39
Questions for Grouping Data into Intervals

• 1. How wide should each interval be? (How many classes should be used?)
• 2. How should the endpoints of the intervals be determined?
    • Often answered by trial and error, subject to user judgment
    • The goal is to create a distribution that is neither too "jagged" nor too "blocky"
    • The goal is to appropriately show the pattern of variation in the data

Frequency Distribution Example
Example:
A manufacturer of insulation randomly
selects 20 winter days and records the
daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Frequency Distribution Example (continued)

• Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute interval width: 10 (46/5 = 9.2, then round up)
• Determine interval boundaries: 10 but less than 20, 20 but less than 30, . . . , 50 but less than 60
• Count observations and assign to classes
Frequency Distribution Example (continued)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Interval               Frequency   Relative Frequency   Percentage (%)
10 but less than 20        3             .15                 15
20 but less than 30        6             .30                 30
30 but less than 40        5             .25                 25
40 but less than 50        4             .20                 20
50 but less than 60        2             .10                 10
Total                     20            1.00                100
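The whole procedure (range, interval width, counting) can be sketched in Python using the 20 temperatures from the example. The helper name `frequency_distribution`, the choice of 10 as the first lower boundary, and rounding the width up to a multiple of 10 are assumptions made for this illustration; the slides leave those choices to judgment.

```python
import math

# Daily high temperatures from the example (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def frequency_distribution(data, lower, width, classes):
    """Count observations in intervals [lower, lower + width), and so on."""
    counts = [0] * classes
    for x in data:
        index = int((x - lower) // width)
        if not 0 <= index < classes:
            raise ValueError(f"{x} falls outside the chosen intervals")
        counts[index] += 1
    return counts


number_of_classes = 5
raw_width = (max(temps) - min(temps)) / number_of_classes  # 46 / 5 = 9.2
width = math.ceil(raw_width / 10) * 10  # assumed rounding rule: next multiple of 10

counts = frequency_distribution(temps, lower=10, width=width, classes=number_of_classes)
relative = [count / len(temps) for count in counts]

for i, (count, rel) in enumerate(zip(counts, relative)):
    low = 10 + i * width
    print(f"{low} but less than {low + width}: {count:>2}  {rel:.2f}  {100 * rel:.0f}%")
```

Because each interval includes its lower boundary and excludes its upper one, every observation lands in exactly one class.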
The Cumulative Frequency Distribution
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20        3           15                3                      15
20 but less than 30        6           30                9                      45
30 but less than 40        5           25               14                      70
40 but less than 50        4           20               18                      90
50 but less than 60        2           10               20                     100
Total                     20          100
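The cumulative columns are running totals of the frequency column. A minimal sketch, using the class frequencies from the table above (the variable names are chosen for the example):

```python
from itertools import accumulate

# Class frequencies for the five temperature intervals.
frequencies = [3, 6, 5, 4, 2]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Each running total expressed as a percentage of all observations.
cumulative_pct = [100 * c / total for c in cumulative]

print(cumulative)
print(cumulative_pct)
```

The last cumulative frequency always equals the number of observations, so the last cumulative percentage is 100.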

• Example (from your book, p. 47): Construct a frequency distribution for the following data:

17 62 15 65
28 51 24 65
39 41 35 15
39 32 36 37
40 21 44 37
59 13 44 56
12 54 64 59
• Example (from your book, p. 47): Consider the following data:

Class       Frequency
0 < 10          8
10 < 20        10
20 < 30        13
30 < 40        12
40 < 50         6

a) Construct a relative frequency distribution.
b) Construct a cumulative frequency distribution.
c) Construct a cumulative relative frequency distribution.
Histogram

• A graph of the data in a frequency distribution is called a histogram.
• The interval endpoints are shown on the horizontal axis.
• The vertical axis is either frequency, relative frequency, or percentage.
• Bars of the appropriate heights are used to represent the number of observations within each class.
Histogram Example

Interval               Frequency
10 but less than 20        3
20 but less than 30        6
30 but less than 40        5
40 but less than 50        4
50 but less than 60        2

[Figure: histogram "Daily High Temperature"; horizontal axis: temperature in degrees (interval endpoints 0 to 70); vertical axis: frequency; bar heights 3, 6, 5, 4, 2 for the five intervals above, with no gaps between bars.]
Ogive

• An ogive is a cumulative line graph.
• It is a line that connects points showing the cumulative percent of observations below the upper limit of each interval in a cumulative frequency distribution.
The Ogive: Graphing Cumulative Frequencies

Interval               Upper interval endpoint   Cumulative Percentage
Less than 10                    10                        0
10 but less than 20             20                       15
20 but less than 30             30                       45
30 but less than 40             40                       70
40 but less than 50             50                       90
50 but less than 60             60                      100

[Figure: ogive "Daily High Temperature"; horizontal axis: interval endpoints (10 to 60); vertical axis: cumulative percentage (0 to 100).]
• Example (from your book, p. 47):

Interval               Frequency   Relative frequency   Cumulative relative frequency (%)
10 but less than 20        5            0.179                    17.9
20 but less than 30        3            0.107                    28.6
30 but less than 40        7            0.250                    53.6
40 but less than 50        4            0.143                    67.9
50 but less than 60        5            0.179                    85.7
60 but less than 70        4            0.143                   100.0

Please draw a histogram and an ogive using the data above.
• Example: Frequency distribution for employee completion times (from your book, pp. 43-45). Please draw a histogram and an ogive using the following data:

Completion times (in seconds)   Frequency
220 but less than 230                 5
230 but less than 240                 8
240 but less than 250                13
250 but less than 260                22
260 but less than 270                32
270 but less than 280                13
280 but less than 290                10
290 but less than 300                 7

Stem-and-Leaf Diagram
(Turkish: dal-yaprak diyagramı)

• A simple way to see distribution details in a data set (an alternative to the histogram)

METHOD: Separate the sorted data series into leading digits (the stem) and trailing digits (the leaves).

Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

• Here, use the 10's digit for the stem unit:

                    Stem | Leaf
• 21 is shown as      2  | 1
• 38 is shown as      3  | 8

Example (continued)
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

• Completed stem-and-leaf diagram:

Stem | Leaves
  2  | 1 4 4 6 7 7
  3  | 0 2 8
  4  | 1

Using other stem units
• Using the 100's digit as the stem:
    • Round off the 10's digit to form the leaves

                        Stem | Leaf
• 613 would become        6  | 1
• 776 would become        7  | 8
• ...
• 1224 becomes           12  | 2

Using other stem units (continued)
• Using the 100's digit as the stem:
    • The completed stem-and-leaf display:

Data:
613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224

Stem | Leaves
   6 | 1 3 6
   7 | 2 2 5 8
   8 | 3 4 6 6 9 9
   9 | 1 3 3 6 8
  10 | 3 5 6
  11 | 4 7
  12 | 2
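Both stem-and-leaf displays above can be reproduced with one small function. This is a sketch under stated assumptions: the name `stem_and_leaf` and its `leaf_unit` parameter are hypothetical, values are assumed to be non-negative, and ties at exactly half a leaf unit are rounded up (as with 955 becoming 9 | 6).

```python
from collections import defaultdict


def stem_and_leaf(data, leaf_unit=1):
    """Group values by stem; each leaf is one digit in units of leaf_unit.

    Assumes non-negative values. Each value is first rounded (half up)
    to the nearest leaf_unit, then split into stem and leaf.
    """
    rows = defaultdict(list)
    for value in sorted(data):
        scaled = int(value / leaf_unit + 0.5)  # round half up
        stem, leaf = divmod(scaled, 10)
        rows[stem].append(leaf)
    return dict(rows)


def show(rows):
    """Print one line per stem, leaves separated by spaces."""
    for stem in sorted(rows):
        print(f"{stem:>4} | {' '.join(str(leaf) for leaf in rows[stem])}")


small = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
large = [613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
         894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

show(stem_and_leaf(small))                # 10's digit as the stem
show(stem_and_leaf(large, leaf_unit=10))  # 100's digit as the stem
```

With `leaf_unit=10` the last digit of each value is absorbed by rounding, which is why 776 appears as leaf 8 under stem 7.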
• Example (from your book, p. 46):

An accounting professor obtained two random samples of data. The first data set is for a small random sample of 10 final exam grades for an introductory accounting class:

88 51 63 85 79
65 79 70 73 77

Please construct a stem-and-leaf display for the grades.
• Example (from your book, p. 47):

3.5 2.8 4.5 6.2 4.8 2.3 2.6 3.9 4.4 5.5
5.2 6.7 3.0 2.4 5.0 3.6 2.9 1.0 2.8 3.6

Please construct a stem-and-leaf display for the hours that 20 students spent studying for a marketing test.
1.6 Relationships Between Variables

• Graphs illustrated so far have involved only a single variable.
• When two variables exist, other techniques are used:
    • Categorical (qualitative) variables: cross tables
    • Numerical (quantitative) variables: scatter plots

Scatter Diagrams
(Turkish: dağılma diyagramı)

• Scatter diagrams are used for paired observations taken from two numerical variables.
• In a scatter diagram:
    • one variable is measured on the vertical axis, and
    • the other variable is measured on the horizontal axis.
• The scatter diagram provides a picture of the data, including the following:
    • The range of each variable
    • The pattern of values over the range
    • A suggestion as to a possible relationship between the two variables
    • An indication of outliers (extreme points)
Scatter Diagram Example

Volume per day   Cost per day
      23             125
      26             140
      29             146
      33             160
      38             167
      42             170
      50             188
      55             195
      60             200

[Figure: scatter diagram "Cost per Day vs. Production Volume"; horizontal axis: volume per day (0 to 70); vertical axis: cost per day (0 to 250).]
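Before plotting, the paired observations can be checked numerically for two of the things a scatter diagram shows: the range of each variable and the direction of a possible relationship. The sketch below uses the nine pairs from the example; the variable names and the simple "always increasing" check are assumptions for illustration, not a formal measure of association.

```python
# Paired observations from the example.
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]

# Each point on the scatter diagram is one (volume, cost) pair.
pairs = sorted(zip(volume, cost))

# The range of each variable.
volume_range = (min(volume), max(volume))
cost_range = (min(cost), max(cost))

# True when every step up in volume comes with a step up in cost,
# which suggests a positive relationship between the two variables.
increasing = all(c1 < c2 for (_, c1), (_, c2) in zip(pairs, pairs[1:]))

print("Volume range:", volume_range)
print("Cost range:", cost_range)
print("Cost rises with volume:", increasing)
```

A plotting library could then draw `pairs` with volume on the horizontal axis and cost on the vertical axis, as in the figure.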

Cross Tables
(Turkish: çapraz tablolar)

• Cross tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables.
• If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table.

Cross Table Example
• 4 x 3 Cross Table for Investment Choices by Investor (values in $1000's)

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                   46.5         55.0         27.5      129.0
Bonds                    32.0         44.0         19.0       95.0
CD                       15.5         20.0         13.5       49.0
Savings                  16.0         28.0          7.0       51.0
Total                   110.0        147.0         67.0      324.0
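The row and column totals of a cross table can be checked with a short calculation. The sketch below stores the 4 x 3 table from the example as a nested dictionary; that layout and the variable names are assumptions made for the illustration.

```python
# Investment amounts in $1000's: rows are categories, columns are investors.
investments = {
    "Stocks":  {"Investor A": 46.5, "Investor B": 55.0, "Investor C": 27.5},
    "Bonds":   {"Investor A": 32.0, "Investor B": 44.0, "Investor C": 19.0},
    "CD":      {"Investor A": 15.5, "Investor B": 20.0, "Investor C": 13.5},
    "Savings": {"Investor A": 16.0, "Investor B": 28.0, "Investor C": 7.0},
}

# Row totals: one per investment category.
row_totals = {category: sum(row.values()) for category, row in investments.items()}

# Column totals: one per investor.
investors = list(next(iter(investments.values())))
column_totals = {
    investor: sum(row[investor] for row in investments.values())
    for investor in investors
}

# The grand total is the same whether rows or columns are summed.
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

Summing the row totals and summing the column totals must give the same grand total, which is a quick consistency check on any cross table.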

1.7 Data Presentation Errors

Goals for effective data presentation:

• Present data to display essential information
• Communicate complex ideas clearly and accurately
• Avoid distortion that might convey the wrong message

Data Presentation Errors (continued)

• Unequal histogram interval widths
• Compressing or distorting the vertical axis
• Providing no zero point on the vertical axis
• Failing to provide a relative basis in comparing data between groups

Example: Grocery Receipts (from your book, p. 56)

Dollar amount    Number of receipts   Proportion
$0 < $10                84             84 / 692
$10 < $20              113            113 / 692
$20 < $30              112            112 / 692
$30 < $40               85             85 / 692
$40 < $50               77             77 / 692
$50 < $60               58             58 / 692
$60 < $80               75             75 / 692
$80 < $100              48             48 / 692
$100 < $200             40             40 / 692
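The last three intervals in this table are wider than the others, so plotting the raw proportions as bar heights would exaggerate them. One common remedy, sketched here as an assumption rather than as the slide's own method, is to scale each height by the interval width so that the area of each bar, not its height, reflects the proportion. The reference width of $10 and the variable names are choices made for the illustration.

```python
# (lower, upper) dollar bounds and number of receipts, from the table.
receipts = [
    ((0, 10), 84), ((10, 20), 113), ((20, 30), 112),
    ((30, 40), 85), ((40, 50), 77), ((50, 60), 58),
    ((60, 80), 75), ((80, 100), 48), ((100, 200), 40),
]

total = sum(count for _, count in receipts)
BASE_WIDTH = 10  # assumed reference width: the $10 intervals

adjusted = {}
for (lower, upper), count in receipts:
    proportion = count / total
    width = upper - lower
    # Height per $10 of width; equals the proportion when width is $10.
    adjusted[(lower, upper)] = proportion * BASE_WIDTH / width

for (lower, upper), height in adjusted.items():
    print(f"${lower} < ${upper}: adjusted height {height:.4f}")
```

After the adjustment, the $100 to $200 interval gets a bar one tenth as tall as its raw proportion, because it is ten times as wide as the reference interval.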
• Example (from your book, p. 47): Consider the following data:

Age      Percent
18-24     11.30
25-34     19.11
35-44     23.64
45-54     23.48
55+       22.48

a) Construct a relative cumulative frequency distribution.
b) What percent of the Internet visitors were under 45 years of age?
c) What percent of the Internet visitors were at least 35 years of age?
• Example (from your book, p. 60):

A sample of 20 financial analysts was asked to provide forecasts of earnings per share of a corporation for next year. The results are summarized in the following table:

Forecast ($ per share)   Number of analysts
9.95 < 10.45                    2
10.45 < 10.95                   8
10.95 < 11.45                   6
11.45 < 11.95                   3
11.95 < 12.45                   1

a) Draw the histogram.
b) Find the relative frequencies.
c) Find the cumulative frequencies.
d) Find the relative cumulative frequencies.
• Example (from your book, p. 62):

A random sample of customers was asked to select their favorite soft drink from a list of five brands. The results showed that 30 preferred Brand A, 50 preferred Brand B, 46 preferred Brand C, 100 preferred Brand D, and 14 preferred Brand E.

a) Construct a pie chart.
b) Construct a bar chart.
