Chapter 2,3. Describing (Qualitative) Data Using
Tables and Graphs: Outline
1. Types of Variables
2. Tabular Techniques for One Qualitative Variable: Frequency
and Relative Frequency Table
3. Graphical Presentation for One Qualitative Variable: Pie Chart
and Bar Chart
4. Tabular Techniques for Two Qualitative Variables:
Cross-Classification Table
5. Graphical Presentation for Two Qualitative Variables:
Clustered Bar Chart
6. Graphical Presentation for Quantitative Data: Histogram,
Ogive, Dot plot, Box Plot (next chapter), Index Plot (line chart)
7. Graphical Presentation for two Quantitative Variables: Scatter
Plot
Example 3
In a major North American city there are four
competing newspapers: the Post, Globe and Mail,
Sun, and Star.
To help design advertising campaigns, the
advertising managers of the newspapers need to
know which segments of the newspaper market
are reading their papers.
A survey was conducted to analyze the
relationship between newspapers read and
occupation.
A sample of newspaper readers was asked to
report which newspaper they read:
1 = G&M (Globe and Mail),
2 = Post,
3 = Star, or
4 = Sun,
and to indicate whether they were:
1 = Blue-collar worker,
2 = White-collar worker, or
3 = Professional.
MACT 2222: Statistics for Business 2.4
By counting the number of times each of the 12
combinations occurs, we produced the following
Cross-Classification Table or Contingency Table.
                           Occupation
Newspaper    Blue Collar   White Collar   Professional   Total
G&M               27            29             33          89
Post              18            43             51         112
Star              38            21             22          81
Sun               37            15             20          72
Total            120           108            126         354
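The counting step can be sketched in Python. The raw survey responses are not part of these slides, so this sketch rebuilds one coded (newspaper, occupation) pair per respondent from the cell counts in the table and then tallies them again. The function and variable names are illustrative assumptions, not part of the course material.

```python
from collections import Counter

NEWSPAPERS = {1: "G&M", 2: "Post", 3: "Star", 4: "Sun"}
OCCUPATIONS = {1: "Blue Collar", 2: "White Collar", 3: "Professional"}

# Cell counts copied from the table, keyed by (newspaper code, occupation code).
CELL_COUNTS = {
    (1, 1): 27, (1, 2): 29, (1, 3): 33,
    (2, 1): 18, (2, 2): 43, (2, 3): 51,
    (3, 1): 38, (3, 2): 21, (3, 3): 22,
    (4, 1): 37, (4, 2): 15, (4, 3): 20,
}


def expand_to_responses(cell_counts):
    """Rebuild one coded pair per respondent (assumption: order is arbitrary)."""
    return [pair for pair, n in cell_counts.items() for _ in range(n)]


def cross_classify(responses):
    """Count how often each (newspaper, occupation) combination occurs."""
    return Counter(responses)


def print_table(counts):
    """Print the counts with row and column totals."""
    header = ["Newspaper"] + list(OCCUPATIONS.values()) + ["Total"]
    print("".join(f"{h:>14}" for h in header))
    for n_code, n_name in NEWSPAPERS.items():
        row = [counts[(n_code, o_code)] for o_code in OCCUPATIONS]
        print("".join(f"{v:>14}" for v in [n_name] + row + [sum(row)]))
    col_totals = [sum(counts[(n, o)] for n in NEWSPAPERS) for o in OCCUPATIONS]
    print("".join(f"{v:>14}" for v in ["Total"] + col_totals + [sum(col_totals)]))


responses = expand_to_responses(CELL_COUNTS)
table = cross_classify(responses)
print_table(table)
```

With real survey data, `responses` would simply be the list of coded answers, and `cross_classify` would be applied to it directly.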
If occupation and newspaper are related, then
there will be differences in the newspapers read
among the occupations. An easy way to see this is
to convert the frequencies in each column to
relative frequencies. That is, compute the column
totals and divide each frequency by its column
total.
                            Occupation
Newspaper   Blue Collar     White Collar    Professional
G&M         27/120 = .23    29/108 = .27    33/126 = .26
Post        18/120 = .15    43/108 = .40    51/126 = .40
Star        38/120 = .32    21/108 = .19    22/126 = .17
Sun         37/120 = .31    15/108 = .14    20/126 = .16
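The column-wise division can be sketched in a few lines of Python. The frequencies are the ones from the cross-classification table; the helper name `column_relative_frequencies` is an assumption made for this illustration.

```python
OCCUPATIONS = ["Blue Collar", "White Collar", "Professional"]

# Frequencies from the cross-classification table, one row per newspaper.
FREQUENCIES = {
    "G&M":  [27, 29, 33],
    "Post": [18, 43, 51],
    "Star": [38, 21, 22],
    "Sun":  [37, 15, 20],
}


def column_relative_frequencies(freqs):
    """Divide every frequency by the total of its column."""
    n_cols = len(next(iter(freqs.values())))
    col_totals = [sum(row[j] for row in freqs.values()) for j in range(n_cols)]
    return {
        paper: [row[j] / col_totals[j] for j in range(n_cols)]
        for paper, row in freqs.items()
    }


rel = column_relative_frequencies(FREQUENCIES)
print(f"{'':<6}" + "".join(f"{o:>14}" for o in OCCUPATIONS))
for paper, row in rel.items():
    print(f"{paper:<6}" + "".join(f"{p:>14.3f}" for p in row))
```

Each column of the result sums to 1, which is a quick check that the division used column totals rather than row totals.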
Interpretation: The relative frequencies in
columns 2 and 3 (white collar and professional)
are similar, but there are large differences
between columns 1 and 2 and between columns 1
and 3 (blue collar versus each of the other two).
This tells us that blue collar workers tend to read
different newspapers from both white collar
workers and professionals, and that white collar
workers and professionals are quite similar in
their newspaper choice.
2 qualitative variables: Clustered Bar Chart
2 quantitative variables: Scatter Plot
Use the data from the cross-classification table to create the clustered bar chart.
We can use frequencies or relative frequencies. If all clusters share the same
pattern, there is no relation between the two variables (regardless of the heights
of the bars); otherwise, there is a relation. If percentages are used, the heights
of the bars in one cluster should add up to 100%. Be careful how the bars are
arranged!
Professionals tend to read the Post more than twice as often as the Star or Sun.
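A minimal sketch of how such a chart could be produced in Python is given here. It assumes one cluster per occupation and uses the frequencies from the cross-classification table. The plotting function relies on matplotlib, which is an assumption of this sketch (the slides do not name any software), and it is defined but not called, so the percentage calculation runs on its own.

```python
PAPERS = ["G&M", "Post", "Star", "Sun"]

# Frequencies from the cross-classification table; one cluster per occupation.
CLUSTERS = {
    "Blue Collar":  [27, 18, 38, 37],
    "White Collar": [29, 43, 21, 15],
    "Professional": [33, 51, 22, 20],
}


def cluster_percentages(clusters):
    """Convert each cluster's frequencies to percentages of that cluster's total."""
    return {
        name: [100 * f / sum(freqs) for f in freqs]
        for name, freqs in clusters.items()
    }


def plot_clustered_bars(percentages, papers, path="clustered_bar.png"):
    """Draw one cluster of bars per occupation (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    width = 0.8 / len(papers)
    fig, ax = plt.subplots()
    for i, paper in enumerate(papers):
        # Bar i of every cluster is shifted right by i bar widths.
        xs = [c + i * width for c in range(len(percentages))]
        heights = [pcts[i] for pcts in percentages.values()]
        ax.bar(xs, heights, width, label=paper)
    centers = [c + width * (len(papers) - 1) / 2 for c in range(len(percentages))]
    ax.set_xticks(centers)
    ax.set_xticklabels(list(percentages))
    ax.set_ylabel("Percent of occupation")
    ax.legend(title="Newspaper")
    fig.savefig(path)


pcts = cluster_percentages(CLUSTERS)
for name, values in pcts.items():
    print(name, [round(v, 1) for v in values])
```

Calling `plot_clustered_bars(pcts, PAPERS)` writes the chart to a PNG file. Keeping the newspapers in the same order inside every cluster is what makes the patterns comparable across occupations.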
Qualitative
  Tabular Techniques
    1. Frequency Table
    2. Relative Frequency Table
  Graphical Techniques
    1. Bar Chart
    2. Pie Chart
Quantitative
  Tabular Techniques
    1. Frequency Table
    2. Relative Frequency Table
    3. Cumulative Frequency/Relative Frequency Table
  Graphical Techniques
    1. Histogram
    2. Dot Plot
    3. Ogive
    4. Box Plot
    5. Line Chart for one variable changing over time
Histogram
To draw a histogram, we need a frequency or a
relative frequency table first.
So, we partition the interval data into a number
of categories or classes and then count how many
observations fall in each class. As a rule of
thumb, the number of classes, m, should be
between 5 and 15. If m < 5, we would be
summarizing too much; if m > 15, we would be
giving too much detail.
Our focus here is on equal-size classes whenever
possible. The tables will be given in this course;
there is no need to construct them.
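Although the tables are provided in this course, the partition-and-count step can be sketched in Python for readers who want to see how such a table arises. The data values below are hypothetical, invented purely for illustration, and the function name `equal_width_classes` is likewise an assumption.

```python
def equal_width_classes(data, m):
    """Split the range of `data` into m equal-width classes and count observations."""
    if not 5 <= m <= 15:
        raise ValueError("rule of thumb: use between 5 and 15 classes")
    low, high = min(data), max(data)
    if high == low:
        raise ValueError("all observations are equal; no classes to form")
    width = (high - low) / m
    counts = [0] * m
    for x in data:
        # The largest value goes into the last class instead of opening a new one.
        k = min(int((x - low) / width), m - 1)
        counts[k] += 1
    limits = [(low + k * width, low + (k + 1) * width) for k in range(m)]
    return limits, counts


# Hypothetical observations, NOT data from the course.
bills = [0, 12, 18, 25, 25, 40, 47, 55, 62, 70, 88, 95, 103, 110, 120]

limits, counts = equal_width_classes(bills, m=6)
for (lo, hi), c in zip(limits, counts):
    print(f"{lo:6.1f} to {hi:6.1f}: frequency {c:2d}, relative frequency {c / len(bills):.2f}")
```

The frequencies give the bar heights of a histogram; dividing by the number of observations gives the relative frequency version.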
Example 4
As part of a larger study, a long-distance company
wanted to acquire information about the monthly
bills of new subscribers in the first month after
signing with the company.
The company’s marketing manager conducted a
survey of 200 new residential subscribers wherein
the first month’s bills were recorded.
The general manager planned to present his/her
findings to senior executives. What information
can be extracted from these data?