
STATISTICS

IN ECONOMICS AND BUSINESS

Nguyen Huyen Trang


Faculty of Statistics - National Economics University
trangtk@neu.edu.vn
LECTURE 5: DESCRIBING DATA
BY TABLE AND CHART

OUTLINE

• Describing qualitative data
• Describing quantitative data

DESCRIBING DATA

[Diagram: RAW DATA → processing]
DESCRIBING QUALITATIVE DATA

Qualitative data
• Tabulating data: frequency distribution table, relative frequency table, crosstabulation
• Graphing data: bar chart, pie chart, side-by-side bar chart, stacked bar chart
FREQUENCY DISTRIBUTION

A grouping of qualitative data into mutually exclusive classes, showing the number of observations in each class

Color     Number of Flowers
Yellow    5
Red       4
Purple    3
RELATIVE FREQUENCY DISTRIBUTION

Shows the fraction of the total number of observations in each class

Colour    Frequency (Number of Flowers)   Relative Frequency   Percentage
Yellow     5                              0.417                 41.7
Red        4                              0.333                 33.3
Purple     3                              0.250                 25.0
Total     12                              1.000                100.0
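
The relative frequencies in the flower table can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the lecture.

```python
from collections import Counter

# Flower colours from the slide: 5 yellow, 4 red, 3 purple.
flowers = ["Yellow"] * 5 + ["Red"] * 4 + ["Purple"] * 3

frequency = Counter(flowers)      # class -> number of observations
total = sum(frequency.values())   # total number of observations

# Relative frequency = class frequency / total number of observations.
relative = {colour: count / total for colour, count in frequency.items()}

for colour, count in frequency.items():
    share = relative[colour]
    print(f"{colour:<8}{count:>3}{share:>8.3f}{share * 100:>7.1f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was lost or counted twice.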
EXERCISE 1

A total of 1,000 residents in Minnesota were asked which season they preferred.
Of these, 100 liked winter best, 300 liked spring, 400 liked summer, and 200
liked fall.
If the data were summarized in a frequency table, how many classes would be used?
What would be the relative frequencies for each class?
CROSSTABULATION

Number of students divided by gender and evaluation

                Evaluation
Gender    Good   Fair   Bad   Total
Male        13      4     3      20
Female      15      9     6      30
Total       28     13     9      50
CROSSTABULATION
Number of students divided by gender and evaluation

                Evaluation
Gender    Good   Fair   Bad   Total
Male        13      4     3      20
Female      15      9     6      30
Total       28     13     9      50

Percentage of students divided by gender

                 Evaluation
Gender    Good    Fair    Bad     Total
Male      65.00   20.00   15.00   100.0
Female    50.00   30.00   20.00   100.0

Percentage of students divided by evaluation

                 Evaluation
Gender    Good     Fair     Bad
Male       46.43    30.77    33.33
Female     53.57    69.23    66.67
Total     100.00   100.00   100.00
CROSSTABULATION
Number of students divided by gender and evaluation

                Evaluation
Gender    Good   Fair   Bad   Total
Male        13      4     3      20
Female      15      9     6      30
Total       28     13     9      50

Percentage of students divided by gender and evaluation

                 Evaluation
Gender    Good    Fair    Bad     Total
Male      26.00    8.00    6.00    40.00
Female    30.00   18.00   12.00    60.00
Total     56.00   26.00   18.00   100.00
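
The three percentage tables differ only in the denominator: the row total, the column total, or the grand total. A small sketch in plain Python (the helper `pct` and the dictionary layout are illustrative choices, not part of the lecture) reproduces them from the counts:

```python
# Counts from the slide: rows are gender, columns are evaluation.
counts = {
    "Male":   {"Good": 13, "Fair": 4, "Bad": 3},
    "Female": {"Good": 15, "Fair": 9, "Bad": 6},
}
columns = ["Good", "Fair", "Bad"]

row_totals = {g: sum(row.values()) for g, row in counts.items()}
col_totals = {c: sum(counts[g][c] for g in counts) for c in columns}
grand_total = sum(row_totals.values())


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


# Percentage by gender (each row sums to 100).
row_pct = {g: {c: pct(counts[g][c], row_totals[g]) for c in columns} for g in counts}
# Percentage by evaluation (each column sums to 100).
col_pct = {g: {c: pct(counts[g][c], col_totals[c]) for c in columns} for g in counts}
# Percentage of all students (the whole table sums to 100).
total_pct = {g: {c: pct(counts[g][c], grand_total) for c in columns} for g in counts}

print(row_pct["Male"])
print(col_pct["Male"])
print(total_pct["Male"])
```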
BAR CHART

• The classes are reported on the horizontal axis


• The class frequencies on the vertical axis
• The heights of the bars are proportional to the class frequencies
[Figure: bar chart of Number of Flowers by colour: Yellow 5, Red 4, Purple 3]
LINE CHART

• The classes are reported on the horizontal axis


• The class frequencies on the vertical axis
• Displays information as a series of data points connected by straight line segments
[Figure: line chart of Number of Flowers by colour (Yellow, Red, Purple)]
LINE CHART

• Used to visualize a trend in data over intervals of time


PIE CHART

Shows the proportion or percentage of the total number of observations that each class represents

Colour    Frequency (Number of Flowers)   Relative Frequency
Yellow     5                               41.67%
Red        4                               33.33%
Purple     3                               25.00%
Total     12                              100.00%

[Figure: pie chart with slices Yellow 41.67%, Red 33.33%, Purple 25.00%]
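
When a pie chart is drawn by hand, each slice takes the same share of the 360° circle as its class takes of the total. A minimal sketch of that arithmetic for the flower data (names are illustrative):

```python
# Frequencies from the slide.
frequency = {"Yellow": 5, "Red": 4, "Purple": 3}
total = sum(frequency.values())

# Slice angle = relative frequency x 360 degrees.
angles = {colour: 360 * count / total for colour, count in frequency.items()}

for colour, angle in angles.items():
    print(f"{colour:<8}{100 * frequency[colour] / total:>7.2f}%{angle:>8.1f} degrees")
```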
SIDE-BY-SIDE CHART

                Evaluation
Gender    Good   Fair   Bad   Total
Male        13      4     3      20
Female      15      9     6      30
Total       28     13     9      50
STACKED CHART

                Evaluation
Gender    Good   Fair   Bad   Total
Male        13      4     3      20
Female      15      9     6      30
Total       28     13     9      50
DESCRIBING QUANTITATIVE DATA

Quantitative data
• Tabulating data: frequency distribution table, relative frequency table, cumulative frequency distribution, cumulative relative frequency distribution, crosstabulation
• Graphing data: common charts, stem-and-leaf display, histogram, ogive, polygon, scatter diagram
FREQUENCY DISTRIBUTION TABLE

Ungrouped Data
Age 23 26 28 32 35 36 38 40 43 47 50 54 58 63 Sum

Freq. 1 2 2 2 4 3 5 8 7 4 5 4 2 1 50

Grouped Data

Age 20 – 24 25 – 29 30 – 34 35 – 39 40 – 44 45 – 49 50 – 54 55 – 59 60 – 64 Sum

Freq. 1 4 2 12 15 4 9 2 1 50

Age 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 Sum
Freq. 5 14 19 11 1 50
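
Both grouped tables follow from the ungrouped one by adding up the frequencies of the ages that fall in each class. A minimal sketch, assuming integer ages and classes written with inclusive limits such as 20 – 29 (the function `group` is a hypothetical helper, not part of the lecture):

```python
# Ungrouped data from the slide: each age with its frequency.
ages = [23, 26, 28, 32, 35, 36, 38, 40, 43, 47, 50, 54, 58, 63]
freqs = [1, 2, 2, 2, 4, 3, 5, 8, 7, 4, 5, 4, 2, 1]


def group(ages, freqs, start, width, n_classes):
    """Sum the frequencies of all ages falling in each class of equal width."""
    grouped = {}
    for k in range(n_classes):
        lower = start + k * width
        upper = lower + width - 1  # inclusive upper limit for integer ages
        grouped[f"{lower} - {upper}"] = sum(
            f for a, f in zip(ages, freqs) if lower <= a <= upper
        )
    return grouped


print(group(ages, freqs, start=20, width=5, n_classes=9))
print(group(ages, freqs, start=20, width=10, n_classes=5))
```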
FREQUENCY DISTRIBUTION TABLE

Age     20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   Sum
Freq.   5         14        19        11        1         50

◼ Each class grouping should have the same width

◼ Use at least 5 but no more than 15-20 intervals

◼ Intervals never overlap

◼ Round up the interval width to get desirable interval endpoints

[Figure: two histograms of Frequency against Temperature, one with interval endpoints every 4 units (4, 8, ..., 60) and one with endpoints 0, 30, 60]


EXAMPLE

A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Classify the data into classes with equal class width.

w = interval width = (largest number − smallest number) / (number of desired intervals)
FREQUENCY DISTRIBUTION TABLE

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

◼ Find the range = largest value − smallest value = 58 − 12 = 46

◼ Identify the number of classes: 5

◼ Compute the interval width: w = 46 / 5 = 9.2 → round up to 10

◼ Choose the lower boundary of the first class: use nice round boundaries and try to avoid empty classes → the first class starts at 10

◼ Count the observations and assign them to classes
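
The first four steps can be sketched in Python as follows. Rounding the width up to the next multiple of 10 and starting the first class at a multiple of the width are assumptions made here to reproduce the slide's "nice round boundaries"; other rounding rules are equally valid.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

data_range = max(temps) - min(temps)     # 58 - 12
n_classes = 5                            # chosen number of classes
raw_width = data_range / n_classes       # width before rounding

# Assumption: round the width up to the next multiple of 10.
width = math.ceil(raw_width / 10) * 10

# Assumption: start the first class at the multiple of `width` at or below the minimum.
start = (min(temps) // width) * width

boundaries = [start + k * width for k in range(n_classes + 1)]
print(data_range, raw_width, width, boundaries)
```

It is worth checking that the largest observation lies below the last boundary; otherwise one more class is needed.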


FREQUENCY DISTRIBUTION

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Temperature   Frequency
12 – 22        4
22 – 32        7
32 – 42        4
42 – 52        3
52 – 62        2
Total         20

Temperature   Frequency
10 – 20        3
20 – 30        6
30 – 40        5
40 – 50        4
50 – 60        2
Total         20
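
Counting observations into classes requires a rule for values that fall exactly on a boundary, such as 30. The sketch below assumes each class includes its lower endpoint and excludes its upper endpoint, a convention that reproduces the 10 – 20, ..., 50 – 60 table; the slides do not state which convention they use.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
boundaries = [10, 20, 30, 40, 50, 60]

frequency = {}
for lower, upper in zip(boundaries, boundaries[1:]):
    # Assumption: lower endpoint included, upper endpoint excluded.
    frequency[(lower, upper)] = sum(lower <= t < upper for t in temps)

for (lower, upper), count in frequency.items():
    print(f"{lower} - {upper}: {count}")
```

Because the intervals never overlap and cover the whole range, the class frequencies add up to the number of observations.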
RELATIVE FREQUENCY DISTRIBUTION

Age       Frequency   %
20 – 29    5           10.0
30 – 39   14           28.0
40 – 49   19           38.0
50 – 59   11           22.0
60 – 69    1            2.0
Sum       50          100.0

Temperature   Frequency   %
10 – 20        3           15.0
20 – 30        6           30.0
30 – 40        5           25.0
40 – 50        4           20.0
50 – 60        2           10.0
Total         20          100.0
CUMULATIVE FREQUENCY DISTRIBUTION

The sum of the frequencies of all classes up to and including the current class

Temperature   Frequency   %       Cumulative Frequency   Cumulative Relative Frequency (%)
10 – 20        3           15.0    3                      15.0
20 – 30        6           30.0    9                      45.0
30 – 40        5           25.0   14                      70.0
40 – 50        4           20.0   18                      90.0
50 – 60        2           10.0   20                     100.0
Total         20          100.0

• On how many days was the temperature lower than 30?
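
Cumulative frequencies are running totals of the class frequencies, so they can be computed in a single pass. A minimal sketch using the standard library (variable names are illustrative):

```python
from itertools import accumulate

classes = ["10 - 20", "20 - 30", "30 - 40", "40 - 50", "50 - 60"]
frequency = [3, 6, 5, 4, 2]
total = sum(frequency)

# Running totals of the class frequencies.
cumulative = list(accumulate(frequency))
# Cumulative relative frequency, in percent.
cumulative_pct = [100 * c / total for c in cumulative]

for name, f, c, p in zip(classes, frequency, cumulative, cumulative_pct):
    print(f"{name:<10}{f:>3}{c:>5}{p:>8.1f}")

# Days with a temperature lower than 30: cumulative frequency of the 20 - 30 class.
days_below_30 = cumulative[classes.index("20 - 30")]
print(days_below_30)
```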
STEM-AND-LEAF DISPLAYS

• A simple way to see distribution details in a data set

METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
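
As an illustration of the method, the sketch below builds a stem-and-leaf display for the 20 daily high temperatures used earlier in this lecture. Using the tens digit as the stem and the units digit as the leaf is an assumption that suits two-digit data; this is not the example shown on the original slide.

```python
from collections import defaultdict

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

stems = defaultdict(list)
for t in sorted(temps):            # sort first so the leaves come out in order
    stems[t // 10].append(t % 10)  # stem = tens digit, leaf = units digit

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Each row reads like a horizontal bar whose length is the class frequency, while the individual values remain visible.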
EXAMPLE
HISTOGRAMS

• A graph of the data in a frequency distribution
• The horizontal axis: interval endpoints
• The vertical axis: frequency, relative frequency, or percentage

Temperature   Frequency   %
10 – 20        3           15.0
20 – 30        6           30.0
30 – 40        5           25.0
40 – 50        4           20.0
50 – 60        2           10.0
Total         20          100.0

[Figure: Histogram: Daily High Temperature. Vertical axis: Frequency (0 to 7); horizontal axis: interval endpoints 0, 10, 20, 30, 40, 50, 60]
POLYGON
Temperature   Frequency   %
10 – 20        3           15.0
20 – 30        6           30.0
30 – 40        5           25.0
40 – 50        4           20.0
50 – 60        2           10.0
Total         20          100.0

• Shows the same distribution as a histogram
• A line graph
• The horizontal axis: interval midpoints
• The vertical axis: frequency, relative frequency, or percentage
• Used to compare sets of data

[Figure: frequency polygon. Vertical axis: Frequency; horizontal axis: interval midpoints 5, 15, 25, 35, 45, 55]
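
The points of a frequency polygon are the class midpoints paired with the class frequencies. The sketch below computes them for the temperature table; closing the polygon with a zero-frequency point one class width beyond each end is a common convention assumed here, not something the slide states.

```python
boundaries = [10, 20, 30, 40, 50, 60]
frequency = [3, 6, 5, 4, 2]

# Midpoint of each class = (lower endpoint + upper endpoint) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
points = list(zip(midpoints, frequency))

# Assumption: close the polygon on the axis one class width beyond each end.
width = boundaries[1] - boundaries[0]
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

for x, y in closed:
    print(x, y)
```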
OGIVE

Temperature   Frequency   %       Cumulative relative freq. (%)
10 – 20        3           15.0    15.0
20 – 30        6           30.0    45.0
30 – 40        5           25.0    70.0
40 – 50        4           20.0    90.0
50 – 60        2           10.0   100.0
Total         20          100.0

• A type of frequency polygon
• Shows cumulative frequencies
• The horizontal axis: interval endpoints
• The vertical axis: cumulative frequency or cumulative relative frequency
• Used to determine how many, or what proportion, of the data values are below or above a certain value

[Figure: Ogive: Daily High Temperature. Vertical axis: Cumulative Percentage (0 to 100); horizontal axis: interval endpoints 10, 20, 30, 40, 50, 60]
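
The ogive plots each upper interval endpoint against the cumulative percentage, starting from 0% at the lowest endpoint. Reading a value between two endpoints amounts to linear interpolation, which assumes the observations are spread evenly within each class. A minimal sketch under that assumption (the function `percent_below` is a hypothetical helper):

```python
boundaries = [10, 20, 30, 40, 50, 60]
frequency = [3, 6, 5, 4, 2]
total = sum(frequency)

# Ogive points: (endpoint, cumulative percentage), starting at 0% for the lowest endpoint.
ogive = [(boundaries[0], 0.0)]
running = 0
for upper, f in zip(boundaries[1:], frequency):
    running += f
    ogive.append((upper, 100 * running / total))


def percent_below(value):
    """Estimate the percentage of data below `value` by linear interpolation."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
    raise ValueError("value lies outside the range covered by the ogive")


print(ogive)
print(percent_below(30))  # read exactly at an endpoint
print(percent_below(35))  # interpolated estimate inside the 30 - 40 class
```

Values read between endpoints are estimates; only the values at the endpoints are exact.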
HISTOGRAM, POLYGON AND OGIVE

Features          Histogram              Polygon                Ogive
Type of chart     Bar chart              Line chart             Line chart
Horizontal axis   Interval endpoints     Interval midpoints     Interval endpoints
Vertical axis     - Frequency            - Frequency            - Cumulative frequency
                  - Relative frequency   - Relative frequency   - Cumulative relative frequency
Usage             Data distribution      Compare sets of data   Determine how many or what
                                                                proportion of the data values are
                                                                below or above a certain value
EXERCISE 2

The London School of Economics and the Harvard Business School conducted a
study of how chief executive officers (CEOs) spend their day. The study found that
CEOs spend an average of about 18 hours per week in meetings, not including conference
calls, business meals, and public events (The Wall Street Journal, February 14,
2012). Shown below is the time spent per week in meetings (hours) for a sample of
25 CEOs.
EXERCISE 3

Draw the histogram corresponding to the following ogive:


EXERCISE 4

The following data set lists the ages of 24 people:


2; 5; 1; 76; 34; 23; 65; 22; 63; 45; 53; 38; 4; 28; 5; 73; 79;
17; 15; 5; 34; 37; 45; 56
Use the data to answer the following questions:
a. Using an interval width of 8, construct a cumulative frequency plot.
b. How many are below 30?
c. How many are below 60?
d. Giving an explanation, state below what value the bottom 50% of the ages
fall.
e. Below what value do the bottom 40% fall?
f. Construct a frequency polygon.
SCATTER DIAGRAM

Used for paired observations taken from two numerical variables
Radar (Spider Web) Chart

• Compares several variables (dimensions) measured on the same scale

                  Staff
Dimension       A     B
Knowledge      50    80
Skill          50    90
Learning       60    80
Discipline     65    75
Attitude       80    60
Harmony        90    65
Loyalty        85    30
COMMON MISTAKE
