Applied Business Statistics, 7th ed.

by Ken Black

Chapter 2
Charts and Graphs

Copyright 2011 John Wiley & Sons, Inc.

Learning Objectives

Construct a frequency distribution for both grouped and ungrouped data
Construct graphical summaries of qualitative data
Construct graphical summaries of quantitative data
Construct graphical summaries of two variables



Ungrouped Versus Grouped Data

Ungrouped data
have not been summarized in any way
are also called raw data
Grouped data
logical groupings of data exist
e.g., age ranges (20-29, 30-39, etc.)
have been organized into a frequency distribution



Example of Ungrouped Data

Ages of a Sample of Managers from Urban Child Care Centers in the United States



Frequency Distribution

Frequency distribution -- a summary of data presented in the form of class intervals and frequencies
Frequency distributions vary in shape and design
They are constructed according to the individual researcher's preferences



Frequency Distribution

Steps in Frequency Distribution


Step 1 -- Determine the range of the data
The range is the difference between the highest and the lowest numbers
Step 2 -- Determine the number of classes
Don't use too many or too few classes
Step 3 -- Determine the width of the class interval
The approximate class width can be calculated by dividing the range by the number of classes
Each value must fit into only one class



Frequency Distribution of Child Care Managers' Ages

Class Interval   Frequency
20-under 30          6
30-under 40         18
40-under 50         11
50-under 60         11
60-under 70          3
70-under 80          1
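
As a rough illustration, the tally behind this table can be sketched in Python. The ages are the 50 raw values from the deck's Data Range slide. The function name `frequency_distribution` and its parameters are assumptions of this sketch, as is the reading of "20-under 30" as a class that includes 20 and excludes 30.

```python
# The 50 manager ages from the deck's Data Range slide.
AGES = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]


def frequency_distribution(values, start, width, num_classes):
    """Tally values into classes [start, start + width), and so on."""
    counts = [0] * num_classes
    for value in values:
        index = (value - start) // width  # position of the class holding value
        if 0 <= index < num_classes:      # ignore values outside all classes
            counts[index] += 1
    return counts


freqs = frequency_distribution(AGES, start=20, width=10, num_classes=6)
for i, count in enumerate(freqs):
    lower = 20 + 10 * i
    print(f"{lower}-under {lower + 10}: {count}")
```

Because the classes are half-open, every age lands in exactly one class, which is the "values fit into only one class" requirement from the construction steps.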



Relative Frequency

The relative frequency is the proportion of the total frequency that falls in any given class interval of a frequency distribution.

Class Interval   Frequency   Relative Frequency
20-under 30          6        .12   (6/50)
30-under 40         18        .36   (18/50)
40-under 50         11        .22
50-under 60         11        .22
60-under 70          3        .06
70-under 80          1        .02
Total               50       1.00
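
A minimal Python sketch of the same division, using the class frequencies from the table; the variable names are illustrative only.

```python
frequencies = [6, 18, 11, 11, 3, 1]  # class frequencies from the table
total = sum(frequencies)             # total frequency, 50 here

# Relative frequency = class frequency / total frequency
relative = [f / total for f in frequencies]

for f, r in zip(frequencies, relative):
    print(f"{f:>2} / {total} = {r:.2f}")
```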



Cumulative Frequency

The cumulative frequency is a running total of frequencies through the classes of a frequency distribution.

Class Interval   Frequency   Cumulative Frequency
20-under 30          6          6
30-under 40         18         24   (18 + 6)
40-under 50         11         35   (11 + 24)
50-under 60         11         46
60-under 70          3         49
70-under 80          1         50
Total               50
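
The running total can be sketched with Python's standard `itertools.accumulate`; this is one convenient way to compute it, not necessarily how the slide's table was produced.

```python
from itertools import accumulate

frequencies = [6, 18, 11, 11, 3, 1]  # class frequencies from the table

# Each entry is the sum of all frequencies up to and including that class.
cumulative = list(accumulate(frequencies))
print(cumulative)
```

The last cumulative frequency always equals the total number of observations, which makes a quick check on the arithmetic.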



Cumulative Relative Frequencies

The cumulative relative frequency is a running total of the relative frequencies through the classes of a frequency distribution.

                             Relative    Cumulative   Cumulative Relative
Class Interval   Frequency   Frequency   Frequency    Frequency
20-under 30          6         .12           6           .12
30-under 40         18         .36          24           .48
40-under 50         11         .22          35           .70
50-under 60         11         .22          46           .92
60-under 70          3         .06          49           .98
70-under 80          1         .02          50          1.00
Total               50        1.00
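
One way to sketch this column in Python is to divide the running counts by the total. That choice is an assumption of the sketch: it is equivalent to summing the relative frequencies, but avoids adding up already-rounded values.

```python
from itertools import accumulate

frequencies = [6, 18, 11, 11, 3, 1]  # class frequencies from the table
total = sum(frequencies)

# Running count divided by the total gives the cumulative relative frequency.
cumulative_relative = [running / total for running in accumulate(frequencies)]
print([f"{value:.2f}" for value in cumulative_relative])
```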



Common Statistical Graphs – Quantitative Data
Histogram -- vertical bar chart of frequencies
Frequency Polygon -- line graph of frequencies
Ogive -- line graph of cumulative frequencies
Stem and Leaf Plot -- like a histogram, but shows the individual data values; useful for small data sets
Pareto Chart -- a chart that contains both bars and a line graph
The bars display the values in descending order, and the line graph shows the cumulative total across the categories, left to right.
The purpose is to highlight the most important among a (typically large) set of factors.



Histogram

A histogram is a graphical summary of a frequency distribution
The number and location of bins (bars) should be determined based on the sample size and the range of the data



Data Range

42 26 32 34 57
30 58 37 50 30
53 40 30 47 49
50 40 32 31 40
52 28 23 35 25
30 36 32 26 50
55 30 58 64 52
49 33 43 46 32
61 31 30 40 60
74 37 29 43 54

Smallest = 23
Largest = 74
Range = Largest - Smallest = 74 - 23 = 51



Number of Classes and Class Width

The number of classes should be between 5 and 15.


Fewer than 5 classes cause excessive summarization.
More than 15 classes leave too much detail.
Class Width
Divide the range by the number of classes for an
approximate class width
Round up to a convenient number

Approx. Class Width = Range / Number of Classes = 51 / 6 = 8.5
Class Width = 10

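
The arithmetic above can be sketched in Python. The slide only says to "round up to a convenient number"; treating "convenient" as the next multiple of 5 is an assumption of this sketch, chosen because it reproduces the slide's width of 10.

```python
import math

smallest, largest = 23, 74   # extremes of the age data
num_classes = 6              # number of classes chosen on the slide

data_range = largest - smallest             # 74 - 23 = 51
approx_width = data_range / num_classes     # 51 / 6 = 8.5

# Assumption: a "convenient" width is the next multiple of 5 at or above
# the approximate width.
convenient_step = 5
class_width = math.ceil(approx_width / convenient_step) * convenient_step

print(data_range, approx_width, class_width)
```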
Class Midpoint

The midpoint of each class interval is called the class midpoint or the class mark.

Class Midpoint = (beginning class endpoint + ending class endpoint) / 2
               = (30 + 40) / 2
               = 35

Class Midpoint = class beginning point + (1/2)(class width)
               = 30 + (1/2)(10)
               = 35

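
Both midpoint formulas can be checked side by side in a short Python sketch; the function names are hypothetical.

```python
def midpoint_from_endpoints(begin, end):
    """Average of the beginning and ending class endpoints."""
    return (begin + end) / 2


def midpoint_from_width(begin, width):
    """Beginning class endpoint plus half the class width."""
    return begin + width / 2


# The 30-under 40 class, as worked on the slide.
print(midpoint_from_endpoints(30, 40), midpoint_from_width(30, 10))

# Midpoints of all six age classes (width 10, starting at 20).
midpoints = [midpoint_from_width(20 + 10 * i, 10) for i in range(6)]
print(midpoints)
```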
Midpoints for Age Classes

                                        Relative    Cumulative
Class Interval   Frequency   Midpoint   Frequency   Frequency
20-under 30          6          25        .12           6
30-under 40         18          35        .36          24
40-under 50         11          45        .22          35
50-under 60         11          55        .22          46
60-under 70          3          65        .06          49
70-under 80          1          75        .02          50
Total               50                   1.00



Histogram

Class Interval   Frequency
20-under 30          6
30-under 40         18
40-under 50         11
50-under 60         11
60-under 70          3
70-under 80          1

[Figure: histogram of the age classes; vertical axis Frequency (0 to 20), horizontal axis Years (0 to 80)]
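
As a rough stand-in for the chart, the frequencies can be drawn as a text histogram in Python. This sketch draws horizontal bars, one `#` per observation, whereas the slide's histogram uses vertical bars; the function name `text_histogram` is hypothetical.

```python
classes = ["20-under 30", "30-under 40", "40-under 50",
           "50-under 60", "60-under 70", "70-under 80"]
frequencies = [6, 18, 11, 11, 3, 1]


def text_histogram(labels, counts, symbol="#"):
    """Return one text row per class, with a bar as long as its frequency."""
    return [f"{label:<12} | {symbol * count} {count}"
            for label, count in zip(labels, counts)]


bars = text_histogram(classes, frequencies)
for bar in bars:
    print(bar)
```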



Histogram Construction

Class Interval   Frequency
20-under 30          6
30-under 40         18
40-under 50         11
50-under 60         11
60-under 70          3
70-under 80          1

[Figure: construction of the histogram; vertical axis Frequency (0 to 20), horizontal axis Years (0 to 80)]



Frequency Polygon

Class Interval   Frequency
20-under 30          6
30-under 40         18
40-under 50         11
50-under 60         11
60-under 70          3
70-under 80          1

[Figure: frequency polygon of the age classes; vertical axis Frequency (0 to 20), horizontal axis Years (0 to 80)]
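
The points that a frequency polygon connects can be sketched in Python. Plotting each frequency above its class midpoint is the usual textbook convention and is assumed here, since the slide itself only shows the finished graph.

```python
lower_endpoints = [20, 30, 40, 50, 60, 70]
width = 10
frequencies = [6, 18, 11, 11, 3, 1]

# One (class midpoint, frequency) point per class; the polygon joins
# consecutive points with straight line segments.
polygon_points = [(lower + width / 2, f)
                  for lower, f in zip(lower_endpoints, frequencies)]
print(polygon_points)
```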



Ogive

Class Interval   Cumulative Frequency
20-under 30          6
30-under 40         24
40-under 50         35
50-under 60         46
60-under 70         49
70-under 80         50

[Figure: ogive of the age classes; vertical axis Cumulative Frequency (0 to 60), horizontal axis Years (0 to 80)]
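
A sketch of the ogive's plotted points in Python. It assumes the common convention of starting at zero at the beginning of the first class and plotting each cumulative frequency at the upper endpoint of its class; the slide does not spell this out.

```python
from itertools import accumulate

lower_endpoints = [20, 30, 40, 50, 60, 70]
width = 10
frequencies = [6, 18, 11, 11, 3, 1]

# Start at (20, 0), then place each running total at its class's upper endpoint.
ogive_points = [(lower_endpoints[0], 0)] + [
    (lower + width, running)
    for lower, running in zip(lower_endpoints, accumulate(frequencies))
]
print(ogive_points)
```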



Relative Frequency Ogive

Class Interval   Cumulative Relative Frequency
20-under 30          .12
30-under 40          .48
40-under 50          .70
50-under 60          .92
60-under 70          .98
70-under 80         1.00

[Figure: relative frequency ogive; vertical axis Cumulative Relative Frequency (0.00 to 1.00), horizontal axis Years (0 to 80)]



Stem and Leaf plot:
Safety Examination Scores for Plant Trainees

Stem   Leaf
2      3
3      9
4      79
5      569
6      07788
7      0245567789
8      11233689
9      11247



Construction of Stem and Leaf Plot

Each score splits into a stem (its tens digit) and a leaf (its units digit); for example, 23 has stem 2 and leaf 3.

Stem   Leaf
2      3
3      9
4      79
5      569
6      07788
7      0245567789
8      11233689
9      11247
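
The construction can be sketched in Python. The raw data table did not survive in this text, so the scores below are read back off the stem and leaf display itself (their original order is unknown). The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Scores recovered from the stem and leaf display (35 values).
scores = [
    23, 39, 47, 49, 55, 56, 59, 60, 67, 67, 68, 68,
    70, 72, 74, 75, 75, 76, 77, 77, 78, 79,
    81, 81, 82, 83, 83, 86, 88, 89, 91, 91, 92, 94, 97,
]


def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 23 -> stem 2, leaf 3
        leaves_by_stem[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in leaves_by_stem.items()}


plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(stem, leaves)
```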



Histogram vs. Stem and Leaf?

So, which one should I use?
A stem and leaf plot is useful for small data sets. It shows the values of the individual data points.
A histogram forgoes the individual values of the data in favor of the bigger picture of the distribution.
The purpose of both graphs is to summarize a set of data. As long as that need is met, either one is fine to use.



Common Statistical Graphs – Qualitative Data

Pie Chart -- proportional representation of the categories of a whole
Bar Chart -- frequency or relative frequency of one or more categorical variables



Complaints by Amtrak Passengers

COMPLAINT           NUMBER   PROPORTION   DEGREES
Stations, etc.      28,000      .40        144.0
Train Performance   14,700      .21         75.6
Equipment           10,500      .15         54.0
Personnel            9,800      .14         50.4
Schedules, etc.      7,000      .10         36.0
Total               70,000     1.00        360.0
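
The PROPORTION and DEGREES columns follow from the counts: each proportion is the category count divided by the total, and each slice angle is that proportion times 360. A minimal Python sketch, with illustrative variable names:

```python
complaints = {
    "Stations, etc.": 28_000,
    "Train Performance": 14_700,
    "Equipment": 10_500,
    "Personnel": 9_800,
    "Schedules, etc.": 7_000,
}
total = sum(complaints.values())

# proportion = count / total; degrees = proportion * 360
proportions = {name: count / total for name, count in complaints.items()}
degrees = {name: p * 360 for name, p in proportions.items()}

for name in complaints:
    print(f"{name:<18} {proportions[name]:.2f} {degrees[name]:6.1f}")
```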



Complaints by Amtrak Passengers



Second Quarter U.S. Truck Production

Second Quarter Truck Production in the U.S. (Hypothetical values)



Second Quarter U.S. Truck Production

[Figure: pie chart of second quarter truck production for companies A, B, C, D, and E; slice labels 39%, 39%, 17%, 4%, and 1%]



Pie Chart Calculations for Company A

Proportion = 357,411 / 920,190 = .388
Degrees = .388 × 360 = 139.7



Pareto Chart
A Pareto chart is a bar chart, sorted from most frequent to least frequent, overlaid with a cumulative line graph (like an ogive).
These data present the most common types of defects.
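
The defect counts behind this slide's chart are not in the text, so the sketch below borrows the Amtrak complaint counts from the earlier pie chart slide as stand-in data. It shows the two ingredients of a Pareto chart: categories sorted in descending order (the bars) and a cumulative percentage (the line).

```python
from itertools import accumulate

# Stand-in data: Amtrak complaint counts from the earlier slide,
# not the defect data charted on this slide.
counts = {
    "Schedules, etc.": 7_000,
    "Equipment": 10_500,
    "Stations, etc.": 28_000,
    "Personnel": 9_800,
    "Train Performance": 14_700,
}

# Bars: categories from most frequent to least frequent.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Line: running total expressed as a percentage of all complaints.
total = sum(counts.values())
cumulative_pct = [100 * running / total
                  for running in accumulate(count for _, count in ordered)]

for (name, count), pct in zip(ordered, cumulative_pct):
    print(f"{name:<18} {count:>6} {pct:6.1f}%")
```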



Common Statistical Graphs –
Comparing Two Variables
Scatter Plot -- a display that uses Cartesian coordinates to show the values of two variables for a set of data.
The data are displayed as a collection of points; the value of one variable determines each point's position on the horizontal axis, and the value of the other variable determines its position on the vertical axis.
A scatter plot is also called a scatter chart, scatter diagram, or scatter graph.



Scatter Plot

Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
         5                                60
        15                               120
         9                                90
        15                               140
         7                                60

[Figure: scatter plot of Gasoline Sales (vertical axis, 0 to 200) against Registered Vehicles (horizontal axis, 0 to 20)]
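
As a rough stand-in for the chart, the five pairs can be placed on a character grid in Python. The function name `text_scatter`, the grid resolution (one column per 1,000 vehicles, one row per 20,000 gallons), and the axis limits are assumptions of this sketch.

```python
# (registered vehicles in 1000's, gasoline sales in 1000's of gallons)
points = [(5, 60), (15, 120), (9, 90), (15, 140), (7, 60)]


def text_scatter(pairs, x_max=20, y_max=200, y_step=20):
    """Return text rows, top row first; '*' marks a cell holding a point."""
    rows = []
    for y_top in range(y_max, 0, -y_step):
        # A point belongs to this row if y_top - y_step < y <= y_top.
        cells = [
            "*" if any(px == x and y_top - y_step < py <= y_top
                       for px, py in pairs) else "."
            for x in range(x_max + 1)
        ]
        rows.append(f"{y_top:>3} | " + "".join(cells))
    return rows


grid = text_scatter(points)
for row in grid:
    print(row)
```

Each point's column comes from the horizontal-axis variable and its row from the vertical-axis variable, which is all a scatter plot encodes.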
