Types of Variables
Quantitative (numeric): Interval
Qualitative (numeric or non-numeric): Nominal, Ordinal
We need to know the type of variables in order to
choose the appropriate way to analyze the data.
1. Quantitative (Interval):
a) Values are real numbers.
b) They have measurement units (e.g., dollars, pounds, miles, etc.).
c) All calculations are valid.
Examples:
Number of children in a family, number of shares outstanding, exchange rates, stock prices
Types of Variables
2. Qualitative Ordinal Data:
Qualitative Quantitative
Graphical Techniques
Example 1
COLOR     FREQUENCY (NUMBER)   RELATIVE FREQUENCY
Brown     14                   0.24 ( = 14/58 )
Red       12                   0.21 ( = 12/58 )
Blue      11                   0.19 ( = 11/58 )
Yellow     9                   0.16 ( = 9/58 )
Green      8                   0.14 ( = 8/58 )
Orange     4                   0.07 ( = 4/58 )
These are nominal data. Nominal data can be described by:
• Frequency and relative frequency distribution tables.
• Graphs: bar charts and/or pie charts.
The bar chart or the pie chart tells us which classes have the highest and lowest frequencies.
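As a minimal sketch, the relative frequency column of Example 1 can be reproduced by dividing each count by the total. The variable names are illustrative only; the counts are the ones in the table.

```python
# Frequencies from the Example 1 table (colour -> count).
counts = {"Brown": 14, "Red": 12, "Blue": 11, "Yellow": 9, "Green": 8, "Orange": 4}

# Total number of observations.
total = sum(counts.values())

# Relative frequency = frequency / total number of observations.
relative = {color: n / total for color, n in counts.items()}

for color, n in counts.items():
    print(f"{color:<8}{n:>4}{relative[color]:>8.2f}")

# Classes with the highest and lowest frequencies (what a bar chart shows at a glance).
highest = max(counts, key=counts.get)
lowest = min(counts, key=counts.get)
print("highest:", highest, "lowest:", lowest)
```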
MACT 2222: Statistics for Business
Example 2
Hospital Unit     Number of Patients
Cardiac Care      1,052
Emergency         2,245
Intensive Care      340
Maternity           552
Surgery           4,630
2 qualitative variables: Clustered Bar Chart
2 quantitative variables: Scatter Plot
Example 3
In a major North American city there are four
competing newspapers: the Post, Globe and Mail,
Sun, and Star.
To help design advertising campaigns, the
advertising managers of the newspapers need to
know which segments of the newspaper market
are reading their papers.
A survey was conducted to analyze the
relationship between newspapers read and
occupation.
Example 3
A sample of newspaper readers was asked to
report which newspaper they read:
1 = G&M (Globe and Mail),
2 = Post,
3 = Star, or
4 = Sun,
and to indicate whether they were:
1 = Blue-collar worker,
2 = White-collar worker, or
3 = Professional.
Example 3
By counting the number of times each of the 12
combinations occurs, we produced the following
Cross-Classification Table or Contingency Table.
                         Occupation
Newspaper   Blue Collar   White Collar   Professional   Total
G&M         27            29             33              89
Post        18            43             51             112
Star        38            21             22              81
Sun         37            15             20              72
Total       120           108            126            354
Example 3
If occupation and newspaper are related, then there will be
differences in the newspapers read among the occupations.
An easy way to see this is to convert the frequencies in each column to
relative frequencies: compute the column totals and divide each frequency
by its column total. Then compare the relative frequencies across each row.
If they differ noticeably in at least one row, the two variables are
related. If they are close to each other within every row, there is no
relation.
                         Occupation
Newspaper   Blue Collar     White Collar    Professional
G&M         27/120 = .23    29/108 = .27    33/126 = .26
Post        18/120 = .15    43/108 = .40    51/126 = .40
Star        38/120 = .32    21/108 = .19    22/126 = .17
Sun         37/120 = .31    15/108 = .14    20/126 = .16
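The column relative frequencies can be sketched as follows. This is only an illustration of the "divide each frequency by its column total" step; the dictionary layout and names are assumptions, while the counts come from the contingency table.

```python
# Contingency table: newspaper -> counts for (Blue Collar, White Collar, Professional).
table = {
    "G&M":  [27, 29, 33],
    "Post": [18, 43, 51],
    "Star": [38, 21, 22],
    "Sun":  [37, 15, 20],
}
occupations = ["Blue Collar", "White Collar", "Professional"]

# Column totals: one total per occupation.
col_totals = [sum(row[j] for row in table.values()) for j in range(len(occupations))]

# Divide each frequency by its column total.
col_rel = {
    paper: [row[j] / col_totals[j] for j in range(len(occupations))]
    for paper, row in table.items()
}

for paper, shares in col_rel.items():
    print(f"{paper:<6}" + "".join(f"{s:>8.2f}" for s in shares))
```

Reading across a row of the printed output shows whether the occupations differ in how often they read that newspaper.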
Example 3
Interpretation: The relative frequencies in the
columns 2 & 3 are similar, but there are large
differences between columns 1 and 2 and
between columns 1 and 3.
Example 3
This tells us that blue collar workers tend to read
different newspapers from both white collar
workers and professionals, and that white collar
workers and professionals are quite similar in their
newspaper choice.
Example 3
Use the data from the cross-classification table to create the clustered bar chart.
We can use frequencies or relative frequencies. If all clusters share the same
pattern, there is no relation (regardless of the heights of the bars); otherwise
there is a relation between the two variables. When percentages are used, the
heights of the bars in one cluster should add up to 100%. Be careful how the
bars are arranged!
Professionals tend to read the Post more than twice as often as the Star or Sun (51 readers versus 22 and 20).
Histogram
To draw a histogram, we first need a frequency or a
relative frequency table.
So, we partition the interval data into a number of
categories or classes and then count how many
observations fall in each class. As a rule of thumb,
the number of classes, m, should be between 5 and 15.
If m < 5, we would be summarizing too much; if m > 15,
we would be giving too much detail.
Our focus here is on equal-size classes whenever
possible. The tables will be given in this course, so
there is no need to construct them.
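For readers curious how such a table is produced, here is a minimal sketch of counting observations in equal-width classes. The data values are made up for illustration, and the convention that each class includes its lower limit but not its upper limit is an assumption; the slides do not state which limit is included.

```python
def class_frequencies(data, lower, width, m):
    """Count observations in m equal-width classes starting at `lower`.

    Assumed convention: class k covers [lower + k*width, lower + (k+1)*width).
    """
    counts = [0] * m
    for x in data:
        k = int((x - lower) // width)  # index of the class containing x
        if not 0 <= k < m:
            raise ValueError(f"{x} falls outside the classes")
        counts[k] += 1
    return counts


# Hypothetical sample, NOT taken from the slides.
sample = [3, 7, 12, 18, 21, 22, 29, 33, 34, 41, 47, 48]

freq = class_frequencies(sample, lower=0, width=10, m=5)
rel_freq = [f / len(sample) for f in freq]

for k, (f, r) in enumerate(zip(freq, rel_freq)):
    print(f"{k * 10:>2} to under {(k + 1) * 10:<3} {f:>3} {r:>6.2f}")
```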
Example 4
As part of a larger study, a long-distance company
wanted to acquire information about the monthly
bills of new subscribers in the first month after
signing with the company.
The company’s marketing manager conducted a
survey of 200 new residential subscribers wherein
the first month’s bills were recorded.
The general manager planned to present his/her
findings to senior executives. What information
can be extracted from these data?
Example 4
About half (71+37=108) of the bills are “small”, i.e. less than $30.
(18+28+14=60)÷200 = 30%, i.e. nearly a third of the phone bills are more than $75.
There are only a few telephone bills in the middle range.
Example 5
The Distribution of the hourly wages of 50
employees is given as follows:
Example 5
Comment on this histogram (Histogram of the wage distribution):
1. The class with the highest frequency is 30-35. This is our modal class: more wages fall between 30 and 35 than in any other class.
2. Symmetry?
3. Spread? Is it steep or flat?
4. Unimodal?
[Figure: histogram of the wage distribution; horizontal axis: wages, 15 to 50; vertical axis: Frequency, 0 to 15]
Shapes of Histogram
Symmetry
A histogram is said to be symmetric if, when we
draw a vertical line down the center of the
histogram, the two sides are identical in shape and
size:
[Figure: three symmetric histograms; horizontal axis: Variable; vertical axis: Frequency]
Shapes of Histogram
Skewness
A skewed histogram is one with a long tail
extending to either the right or the left:
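[Figure: a positively skewed histogram (long tail to the right) and a negatively skewed histogram (long tail to the left); horizontal axis: Variable; vertical axis: Frequency]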
Shapes of Histogram
Modality
A unimodal histogram is one with a single peak,
while a bimodal histogram is one with two peaks:
[Figure: a unimodal histogram and a bimodal histogram; horizontal axis: Variable; vertical axis: Frequency]
A modal class is the class with the largest number of
observations.
Histogram Comparison…
• Compare the following histograms based on data
about the grades of a group of students in two
courses.
The two courses have very different histograms…
Histogram Comparison…
Comparing the two histograms, one can conclude that the marks in the two
courses have different distributions.
Shapes of Histogram
Bell Shape
A special type of symmetric unimodal histogram is
one that is bell shaped:
[Figure: a bell-shaped histogram; horizontal axis: Variable; vertical axis: Frequency]
Many statistical techniques require that the population be bell shaped.
Drawing the histogram helps verify the shape of the population.
Example 6
first class…
next class: .355 + .185 = .540
⋮
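The step ".355 + .185 = .540" reads as a running total of relative frequencies (a cumulative relative frequency). The sketch below assumes, as the numbers suggest, that .355 and .185 are the first two classes of Example 4 (71 and 37 bills out of 200). Only those two classes are used because the remaining class counts are not listed in the text.

```python
from itertools import accumulate

# First two class frequencies from Example 4 (assumed to be the classes behind .355 and .185).
class_counts = [71, 37]
n = 200  # total number of bills in Example 4

# Running total of the frequencies, class by class.
cumulative_counts = list(accumulate(class_counts))

# Cumulative relative frequency = cumulative count / n.
cumulative_rel = [c / n for c in cumulative_counts]

for k, value in enumerate(cumulative_rel, start=1):
    print(f"up to class {k}: {value:.3f}")
```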
Example 6
Dot Plot
It is a number line containing all numbers in the sample
showing a dot or a mark over the position corresponding
to each number.
If more than one dot falls in the same position then they
are stacked up.
This graph shows that the most repeated mark is 80. However, the most
repeated values for girls are 70 and 80.
More boys than girls tend to achieve higher marks.
A line chart of the monthly average retail price of gasoline since 1978
Example 8
A real estate agent wanted to know to what extent
the selling price of a home is related to its size. To
acquire this information he took a sample of 12
homes that had recently sold, recording the price
in thousands of dollars and the size in hundreds of
square feet. These data are listed in the
accompanying table. Use a graphical technique to
describe the relationship between size and price.
Size 23 18 26 20 22 14 33 28 23 20 27 18
Price 315 229 355 261 234 216 308 306 289 204 265 195
Example 8
It appears that there is a positive relationship: the
greater the house size, the greater the selling price.
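As a rough numeric check of what the scatter plot suggests, the sketch below pairs each size with its price, sorts the homes by size, and compares the average price of the six smaller homes with that of the six larger ones. Splitting the sample into halves is an illustrative shortcut, not a method from the slides; the data are those in the Example 8 table.

```python
# Example 8 data: size in hundreds of square feet, price in thousands of dollars.
size = [23, 18, 26, 20, 22, 14, 33, 28, 23, 20, 27, 18]
price = [315, 229, 355, 261, 234, 216, 308, 306, 289, 204, 265, 195]

# The (size, price) pairs are the points of the scatter plot; sort them by size.
points = sorted(zip(size, price))

# Split into the six smaller and the six larger homes.
half = len(points) // 2
smaller, larger = points[:half], points[half:]

mean_small = sum(p for _, p in smaller) / len(smaller)
mean_large = sum(p for _, p in larger) / len(larger)

print(f"average price, smaller homes: {mean_small:.1f} thousand dollars")
print(f"average price, larger homes:  {mean_large:.1f} thousand dollars")
```

A higher average for the larger homes is consistent with an upward-sloping pattern in the scatter plot, although it does not show how closely the points follow a line.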
Summary
Quantitative Data    Qualitative Data
Learning Outcomes