
A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 5th edition
David P. Doane and Lori E. Seward
Prepared by Lloyd R. Jaisingh

McGraw-Hill/Irwin

Copyright 2015 by The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 3

Describing Data Visually


Chapter Contents
3.1 Stem-and-Leaf Displays and Dot Plots
3.2 Frequency Distributions and Histograms
3.3 Effective Excel Charts
3.4 Line Charts
3.5 Column and Bar Charts
3.6 Pie Charts
3.7 Scatter Plots
3.8 Tables
3.9 Deceptive Graphs


Chapter Learning Objectives
LO3-1: Make a stem-and-leaf or dot plot.
LO3-2: Create a frequency distribution for a data set.
LO3-3: Make a histogram with appropriate bins.
LO3-4: Identify skewness, modal classes, and outliers in a histogram.
LO3-5: Make an effective line chart.

LO3-6: Make an effective column chart or bar chart.
LO3-7: Make an effective pie chart.
LO3-8: Make and interpret a scatter plot.
LO3-9: Make simple tables and pivot tables.
LO3-10: Recognize deceptive graphing techniques.

3.1 Stem-and-Leaf Displays and Dot Plots

Methods of organizing, exploring, and summarizing data include:

- Visual (charts and graphs): provides insight into characteristics of a
  data set without using mathematics.
- Numerical (statistics or tables): provides insight into characteristics
  of a data set using mathematics.

Begin with univariate data (a set of n observations on one variable)
and consider the following:

Preliminary Assessment
Look at the data and visualize how they were collected
and measured.
Sorting (Example: Price/Earnings Ratios)
Sort the data as a first step and then summarize in a
graphical display. Here are the sorted P/E ratios (values
from Table 3.2).

The type of graph you use to display your data depends on the type of
data you have. Some charts are better suited for quantitative data,
while others are better for displaying categorical data.

LO3-1: Make a stem-and-leaf or dot plot.


Stem-and-Leaf Plot

One simple way to visualize small data sets is a stem-and-leaf plot.
The stem-and-leaf plot is a tool of exploratory data analysis (EDA)
that seeks to reveal essential data features in an intuitive way.
A stem-and-leaf plot is basically a frequency tally, except that we
use digits instead of tally marks. For two-digit or three-digit
integer data, the stem is the tens digit of the data, and
the leaf is the ones digit.
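The tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the function name `stem_and_leaf` and the sample values are assumptions made for this example (the sample is hypothetical and is not the Table 3.2 data).

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf display for non-negative integers.

    The leaf is the ones digit; the stem is everything to its left.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # List every stem from smallest to largest, even empty ones,
    # so that the stems stay equally spaced.
    for stem in range(min(leaves), max(leaves) + 1):
        rows.append(f"{stem} | {''.join(str(d) for d in leaves[stem])}")
    return rows

# Hypothetical sample for illustration only (not the Table 3.2 data).
sample = [7, 12, 15, 15, 18, 21, 24, 31, 37, 37, 38, 50, 59]
for row in stem_and_leaf(sample):
    print(row)
```

Reading a row back is just concatenation: the row with stem 3 and leaves 1, 7, 7, 8 stands for 31, 37, 37, 38.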

For the 44 P/E ratios, the stem-and-leaf plot is given below.

For example, the data values in the fourth stem are 31, 37, 37, 38. We always use
equally spaced stems (even if some stems are empty). The stem-and-leaf can reveal
central tendency (24 of the 44 P/E ratios were in the 10-19 stem) as well as
dispersion (the range is from 7 to 59). In this illustration, the leaf digits have been
sorted, although this is not necessary. The stem-and-leaf has the advantage that we
can retrieve the raw data by concatenating a stem digit with each of its leaf digits. For
example, the last stem has data values 50 and 59.

Dot Plots

A dot plot is the simplest graphical display of n individual values of
numerical data.
- Easy to understand.
- It reveals dispersion, central tendency, and the shape of the distribution.

Steps in Making a Dot Plot


1. Make a scale that covers the data range.
2. Mark the axes and label them.
3. Plot each data value as a dot above the scale at its approximate
location.

Note: If more than one data value lies at about the same axis
location, the dots are stacked vertically.
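The steps above can be sketched as a plain-text dot plot. This is only an illustration under simple assumptions: the data are integers that all lie on the chosen scale, axis labels are omitted, and the function name `dot_plot` and the sample values are hypothetical.

```python
from collections import Counter

def dot_plot(values, low, high):
    """Return the lines of a text dot plot for integers on the scale low..high."""
    counts = Counter(values)          # how many values share each location
    tallest = max(counts.values())    # height of the tallest stack
    lines = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(tallest, 0, -1):
        row = "".join("o" if counts[x] >= level else " "
                      for x in range(low, high + 1))
        lines.append(row.rstrip())
    lines.append("-" * (high - low + 1))  # the scale covering the data range
    return lines

# Hypothetical data for illustration only.
for line in dot_plot([10, 12, 12, 13, 13, 13, 15], low=10, high=15):
    print(line)
```

Values at the same location pile up vertically, which is what makes the shape of the distribution visible.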

The range is from 7 to 59.


All but a few data values lie between 10 and 25.
A typical middle data value would be around 17 or 18.
The data are not symmetric due to a few large P/E ratios.

Comparing Groups
A stacked dot plot compares two or more groups using a common
X-axis scale.

3.2 Frequency Distributions and Histograms

LO3-2: Create a frequency distribution for a data set.

Bins and Bin Limits

A frequency distribution is a table formed by classifying n data
values into k classes (bins).
Bin limits define the values to be included in each bin. Widths must
all be the same except when we have open-ended bins.
Frequencies are the number of observations within each bin.
Frequencies can also be expressed as relative frequencies (frequency
divided by the total) or percentages (relative frequency times 100).
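A minimal sketch of building such a table follows. It assumes equal-width bins in which each bin includes its lower limit and excludes its upper limit; that convention, the function name `frequency_distribution`, and the sample values are assumptions for illustration.

```python
def frequency_distribution(values, lower, width, k):
    """Classify values into k equal-width bins starting at `lower`.

    Bin j covers lower + j*width <= x < lower + (j+1)*width.
    Returns rows of (bin_lower, bin_upper, frequency, relative_frequency).
    """
    counts = [0] * k
    for x in values:
        j = int((x - lower) // width)
        if not 0 <= j < k:
            raise ValueError(f"{x} falls outside the bin limits")
        counts[j] += 1
    n = len(values)
    return [(lower + j * width, lower + (j + 1) * width, f, f / n)
            for j, f in enumerate(counts)]

# Hypothetical data for illustration only.
sample = [7, 12, 15, 15, 18, 21, 24, 31, 37, 59]
for lo, hi, freq, rel in frequency_distribution(sample, lower=0, width=20, k=3):
    print(f"{lo:>3} to under {hi:<3} {freq:>3} {rel:6.2f} {rel * 100:5.0f}%")
```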

Constructing a Frequency Distribution

- Herbert Sturges proposed the following rule:
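Sturges' Rule is commonly written as k = 1 + 3.3 log10(n), where n is the sample size and k is the suggested number of bins. The sketch below assumes that form; the helper names `sturges_bins` and `bin_width` are hypothetical, and the result is a starting point to be adjusted by judgment, not a requirement.

```python
import math

def sturges_bins(n):
    """Suggested number of bins, k = 1 + 3.3 * log10(n), before rounding."""
    return 1 + 3.3 * math.log10(n)

def bin_width(x_min, x_max, k):
    """Approximate bin width needed to cover the data range with k bins."""
    return (x_max - x_min) / k

n = 44                        # number of P/E ratios in the example
k = round(sturges_bins(n))    # round the suggestion to a whole number of bins
print(k, bin_width(7, 59, k)) # the P/E range runs from 7 to 59
```

In practice the computed width is usually rounded to a convenient value so that the bin limits are easy to read.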

Histograms
A histogram is a graphical representation of a frequency
distribution.
A histogram is a bar chart.
Y-axis shows the frequency within each bin.
X-axis ticks show the end points of each bin.

LO3-3: Make a histogram with appropriate bins.

Consider 3 histograms for the P/E ratio data with different bin
widths. What do they tell you?
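To see how the bin width changes the picture, the same data can be drawn with different bins. The sketch below uses text bars and hypothetical values (not the P/E data); the function name `text_histogram` is an assumption for this example.

```python
def text_histogram(values, lower, width, k):
    """Return one text bar per bin; the bar length equals the bin frequency."""
    counts = [0] * k
    for x in values:
        j = int((x - lower) // width)
        if not 0 <= j < k:
            raise ValueError(f"{x} falls outside the bin limits")
        counts[j] += 1
    return [f"{lower + j * width:>3}-{lower + (j + 1) * width:<3} {'#' * c}"
            for j, c in enumerate(counts)]

# Hypothetical data for illustration only.
sample = [7, 12, 15, 15, 18, 21, 24, 31, 37, 59]

print("3 bins of width 20")
print("\n".join(text_histogram(sample, lower=0, width=20, k=3)))
print("6 bins of width 10")
print("\n".join(text_histogram(sample, lower=0, width=10, k=6)))
```

With wide bins the data look like a smooth decline; with narrower bins a peak in the second bin and an empty bin before the largest value become visible.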

Choosing the number of bins and bin limits in creating histograms
requires judgment.
One can use software programs to create histograms with different
bins. These include software such as:
- Excel
- MegaStat
- Minitab

Modal Class

A modal class is a histogram bar that is higher than those on either side.
- Unimodal: a single modal class.
- Bimodal: two modal classes.
- Multimodal: more than two modal classes.
Modal classes may be artifacts of the way bin limits are chosen.
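The definition can be checked mechanically on a list of bin frequencies. This is a minimal sketch: the function name `modal_classes` is hypothetical, an end bar is compared only with its single neighbor, and tied neighboring bars are not counted as modal (both are assumptions of this example).

```python
def modal_classes(frequencies):
    """Return the indexes of bars that are higher than the bars on either side."""
    modes = []
    last = len(frequencies) - 1
    for i, f in enumerate(frequencies):
        left_ok = i == 0 or f > frequencies[i - 1]
        right_ok = i == last or f > frequencies[i + 1]
        if left_ok and right_ok:
            modes.append(i)
    return modes

# Hypothetical bin frequencies for illustration only.
print(modal_classes([1, 5, 9, 4]))        # one modal class: unimodal
print(modal_classes([2, 9, 4, 1, 6, 3]))  # two modal classes: bimodal
```

Rerunning the check after changing the bin limits is a quick way to see whether a modal class is an artifact of the binning.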

LO3-4: Identify skewness, modal classes, and outliers in a histogram.

Shape

A histogram may suggest the shape of the population.
It is influenced by the number of bins and bin limits.
Skewness is indicated by the direction of the longer tail of the
histogram.
- Left-skewed (negatively skewed): a longer left tail.
- Right-skewed (positively skewed): a longer right tail.
- Symmetric: both tail areas are the same.

Frequency Polygons and Ogives

A frequency polygon is a line graph that connects the midpoints of
the histogram intervals, plus extra intervals at the beginning and
end so that the line will touch the X-axis.
It serves the same purpose as a histogram, but is attractive when
you need to compare two data sets (since more than one frequency
polygon can be plotted on the same scale).
An ogive (pronounced "oh-jive") is a line graph of the cumulative
frequencies.
It is useful for finding percentiles or in comparing the shape of the
sample with a known benchmark such as the normal distribution (that
you will be seeing in the next chapter).
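Both graphs are just sets of points derived from a frequency distribution. The sketch below computes those points for equal-width bins; the function names and the sample bins are hypothetical, and plotting the ogive as cumulative relative frequency at each upper bin limit is an assumption of this example.

```python
from itertools import accumulate

def polygon_points(bin_limits, frequencies):
    """Bin midpoints paired with frequencies, plus a zero-height interval
    at each end so the line touches the X-axis."""
    width = bin_limits[1] - bin_limits[0]
    mids = [(lo + hi) / 2 for lo, hi in zip(bin_limits, bin_limits[1:])]
    xs = [mids[0] - width] + mids + [mids[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

def ogive_points(bin_limits, frequencies):
    """Upper bin limits paired with cumulative relative frequencies."""
    n = sum(frequencies)
    cumulative = accumulate(frequencies)
    return [(bin_limits[0], 0.0)] + [
        (hi, c / n) for hi, c in zip(bin_limits[1:], cumulative)]

# Hypothetical bins 0-20, 20-40, 40-60 with frequencies 5, 4, 1.
limits, freqs = [0, 20, 40, 60], [5, 4, 1]
print(polygon_points(limits, freqs))
print(ogive_points(limits, freqs))
```

The ogive points can be read as percentiles: here half of the data fall below 20 and 90 percent fall below 40.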

3.3 Effective Excel Charts

This section describes how to use Excel to create charts. Excel offers
a vast array of charts. Refer to Figure 3.10. Please refer to the text
as well.

3.4 Line Charts

LO3-5: Make an effective line chart.

Simple Line Charts

- Used to display a time series, to spot trends, or to compare time
  periods.
- Can display several variables at once.

A two-scale line chart is used to compare variables that differ in
magnitude or are measured in different units.

Log Scales

Arithmetic scale: distances on the Y-axis are proportional to the
magnitude of the variable being displayed.

Logarithmic scale (ratio scale): equal distances represent equal
ratios.

Use a log scale for the vertical axis when data vary over a wide
range, say, by more than an order of magnitude.

This will reveal more detail for smaller data values.
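The difference between the two scales can be seen by comparing the gaps between plotted positions. The values below are hypothetical and the helper name `gaps` is an assumption; base-10 logarithms are used, though any base gives equal gaps for equal ratios.

```python
import math

def gaps(positions):
    """Distances between successive plotted positions."""
    return [b - a for a, b in zip(positions, positions[1:])]

# Hypothetical values, each ten times the previous one.
values = [1, 10, 100, 1000]

arithmetic_gaps = gaps(values)                       # grow with the magnitude
log_gaps = gaps([math.log10(v) for v in values])     # equal for equal ratios

print(arithmetic_gaps)
print(log_gaps)
```

On the arithmetic scale the first two points are crowded together near the axis, while on the log scale every tenfold step takes the same distance.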

A log scale is useful for time series data that might be expected to grow at a
compound annual percentage rate (e.g., GDP, the national debt, or your
future income). It reveals whether the quantity is growing at an
- increasing percent (concave upward),
- constant percent (straight line), or
- declining percent (concave downward).
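The straight-line case can be verified numerically: on a log scale, the slope between successive points depends only on the growth rate. The series below are hypothetical and the function names are assumptions for this sketch.

```python
import math

def growth_rates(series):
    """Period-to-period percent growth of a positive time series."""
    return [(b / a - 1) * 100 for a, b in zip(series, series[1:])]

def log_slopes(series):
    """Slopes between successive points when the series is drawn on a log scale."""
    logs = [math.log10(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical series growing at a constant 5 percent per period.
constant = [100 * 1.05 ** t for t in range(6)]
print(log_slopes(constant))      # equal slopes: a straight line

# Hypothetical series growing by 5, then 10, then 20 percent.
accelerating = [100, 105, 115.5, 138.6]
print(log_slopes(accelerating))  # rising slopes: concave upward
```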

3.5 Column and Bar Charts

LO3-6: Make an effective column chart or bar chart.

Simple Column and Bar Charts

A column chart is a vertical display of the data.
A bar chart is a horizontal display of the data.

Pareto Charts

A Pareto chart is a special type of bar chart used in quality
management to display the frequency of defects or errors of different
types.

Categories are displayed in descending order of frequency.

Focus on the significant few (i.e., the few categories that
account for most defects or errors).
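The ordering behind a Pareto chart is a simple sort. The sketch below also adds a cumulative percent column, which is a common companion to the chart but is an addition made here, not something stated on the slide; the defect tallies and the function name `pareto_table` are hypothetical.

```python
def pareto_table(defect_counts):
    """Sort categories by descending frequency and add cumulative percent."""
    total = sum(defect_counts.values())
    ordered = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)
    table, running = [], 0
    for category, count in ordered:
        running += count
        table.append((category, count, round(100 * running / total, 1)))
    return table

# Hypothetical defect tallies for illustration only.
defects = {"Scratches": 12, "Wrong label": 45, "Dents": 8,
           "Missing part": 30, "Other": 5}
for category, count, cum_pct in pareto_table(defects):
    print(f"{category:<13} {count:>3} {cum_pct:>6}%")
```

In this made-up example the first two categories account for three quarters of all defects, which is the "significant few" the chart is meant to expose.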

Stacked Column Chart

Bar height is the sum of several subtotals.
Areas may be compared by color to show patterns in the subgroups and
total.

3.6 Pie Charts

LO3-7: Make an effective pie chart.

Pie Chart

A pie chart can only convey a general idea of the data.


Pie charts should be used to portray data which sum to a total
(e.g., percent market shares).
A pie chart should only have a few (i.e., 2 to 5) slices.
Each slice can be labeled with data values or percents.

A simple 2-D pie chart is best, as shown in Figure 3.19.

The 3-D pie chart adds visual interest, but the sizes of the
pie slices are harder to assess.

A simple bar chart can be used to display the same data, and
would be preferred by many statisticians.

3.7 Scatter Plots

LO3-8: Make and interpret a scatter plot.

Scatter plots can convey patterns in data pairs that would not be
apparent from a table.
A scatter plot is a starting point for bivariate data analysis in which we
investigate the association and relationship between two variables.
View the next slide for an example.

Figure 3.23 shows some scatter plot patterns similar to those that you
might observe when you have a sample of (X, Y) data pairs.
A scatter plot can convey patterns in data pairs that would not be
apparent from a table.

Other examples of scatter plots


3.8 Tables

LO3-9: Make simple tables and pivot tables.

Tables are the simplest form of data display.


By arranging numbers in rows and columns, their meaning can be
enhanced so it can be understood at a glance.

Example: School Expenditures

Arrangement of data is in rows and columns to enhance meaning.

The data can be viewed by focusing on the time pattern (down the
columns) or by comparing the variables (across the rows).

Refer to the text for the discussion of pivot tables.
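As a rough idea of what a pivot table does, the sketch below cross-tabulates a list of records by two fields and sums a third. It is not Excel's implementation; the function name `pivot_sum`, the field names, and the records are all hypothetical.

```python
from collections import defaultdict

def pivot_sum(records, row_key, col_key, value_key):
    """Cross-tabulate records: one row per row_key value, one column per
    col_key value, with each cell holding the sum of value_key."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    return {row: dict(cols) for row, cols in table.items()}

# Hypothetical records; the field names are assumptions for illustration.
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100.0},
    {"region": "East", "quarter": "Q2", "amount": 150.0},
    {"region": "West", "quarter": "Q1", "amount": 80.0},
    {"region": "East", "quarter": "Q1", "amount": 20.0},
]
print(pivot_sum(sales, "region", "quarter", "amount"))
```

Swapping `row_key` and `col_key` pivots the same data the other way, which is the main appeal of the technique.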

Here are some tips for creating effective tables:
1. Keep the table simple, consistent with its purpose. Put summary
tables in the main body of the written report and detailed tables in
an appendix.
2. Display the data to be compared in columns rather than rows.
3. For presentation purposes, round off to three or four significant
digits.
4. Physical table layout should guide the eye toward the
comparison you wish to emphasize.
5. Row and column headings should be simple yet descriptive.
6. Within a column, use a consistent number of decimal digits.
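Tip 3 can be applied with a small helper that rounds to a chosen number of significant digits. This is a minimal sketch; the function name `round_sig` and the sample values are assumptions for illustration.

```python
import math

def round_sig(x, digits=3):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit: 0 for 1-9, 1 for 10-99, -1 for 0.1-0.9, ...
    magnitude = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - magnitude)

# Hypothetical values for illustration only.
print(round_sig(123456, 3))
print(round_sig(0.0123456, 3))
print(round_sig(45.678, 4))
```

Note that significant digits (tip 3) and decimal digits (tip 6) are different ideas: the first controls precision, the second controls alignment within a column.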

3.9 Deceptive Graphs

LO3-10: Recognize deceptive graphing techniques.

Error 1: Nonzero Origin

A nonzero origin will exaggerate the trend.
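The size of the exaggeration can be worked out directly: a bar is drawn from the axis origin to the data value, so its drawn height is the value minus the origin. The function name `apparent_ratio` and the numbers below are hypothetical.

```python
def apparent_ratio(first, last, axis_origin=0.0):
    """Ratio of the drawn heights of two values when the Y-axis starts
    at axis_origin instead of zero."""
    return (last - axis_origin) / (first - axis_origin)

# Hypothetical values: a 10 percent increase from 100 to 110.
print(apparent_ratio(100, 110))                  # zero origin
print(apparent_ratio(100, 110, axis_origin=90))  # axis starts at 90
```

With a zero origin the second bar is drawn 1.1 times as tall as the first; with the axis starting at 90 it is drawn twice as tall, although the data have not changed.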


Error 2: Elastic Graph Proportions

Keep the aspect ratio (width/height) below 2.00 so as not to
exaggerate the graph. By default, Excel uses an aspect ratio of 1.68.

Error 4: 3-D and Novelty Graphs

Novelty charts such as the pyramid chart should be avoided
because they distort the bar volume and make it hard to measure
bar height.

Error 5: Rotated Graphs

Rotated graphs can make trends appear to dwindle into the distance or
loom toward you.

Error 8: Complex Graphs

Avoid if possible. This example (surgery volume) combines several
errors (silly subtitle, distracting pictures, no data labels, no
definitions, vague source, too much information).

Error 11: Area Trick

As figure height increases, so does width, distorting the graph.
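A short calculation shows why this distorts the comparison. The sketch assumes the figure is scaled uniformly, so its width grows in the same proportion as its height; the function name `drawn_area` and the numbers are hypothetical.

```python
def drawn_area(height, base_height=1.0, base_width=1.0):
    """Area of a figure whose width is scaled in proportion to its height."""
    scale = height / base_height
    return height * (base_width * scale)

# Hypothetical case: the second data value is twice the first.
ratio = drawn_area(2.0) / drawn_area(1.0)
print(ratio)
```

The data value doubles, but the area the eye compares grows fourfold.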

Other deceptive graphing techniques.

Error 3: Dramatic Title and Distracting Pictures
Error 6: Unclear Definitions or Scales
Error 7: Vague Sources
Error 9: Gratuitous Effects
Error 10: Estimated Data
