
Lecture 2

BUSINESS STATISTICS
Advanced Educational Program

DESCRIPTIVE STATISTICS:
Tables and Charts

Reading materials:
Chap 2,3 (Keller)


Why do we have to summarise data?

• Recap
– In the previous chapter you learned how to collect data. Data collected through surveys are called ‘raw’ data.
– Raw data may include thousands of observations and often provide too much information => need to summarise before presenting to an audience
• Requirement
– A data summary clears away details but should still show the overall pattern.
– Summarised information is concise but should give an accurate view of the original data
• Methods to summarise and present data
– Tables
– Charts
– Numerical summaries (measures of location and dispersion)

Outline

Univariate distribution
• Frequency table (frequency distribution)
- Simple frequency table
- Grouped frequency table
• Charts
- Bar and pie charts
- Histograms
- Boxplot

Frequency tables
• Frequency is the number of times a certain value has occurred
• A frequency distribution records the number of times each value occurs and is presented in the form of a table
• Types of frequency distribution:
  • Simple frequency distribution
  • Grouped frequency distribution

Simple frequency distribution
• What is a simple frequency table: each observed value is treated as a class
• Applications:
  • Qualitative data
  • Discrete variable with few values
• Example of discrete variable with few values
  • You are given the raw data of the midterm marks of 20 students as follows: 7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9
  • Create a simple frequency table manually
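
The manual tally can be cross-checked with a short Python sketch. It uses only the 20 marks listed in the example; the variable names are illustrative and not part of the lecture.

```python
from collections import Counter

# Midterm marks of the 20 students, copied from the example.
marks = [7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9]

# Each distinct observed value is treated as its own class.
frequency = Counter(marks)

print("Marks  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
print(f"Total  {sum(frequency.values()):>9}")
```

The frequencies must add up to the number of observations (20), which is a quick check on any hand-made tally.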


Simple frequency table: example 1

Marks    Number of students (frequency)
4        3
5        3
6        2
7        4
8        3
9        2
10       3

Simple frequency distribution: nominal variable
• Example 2: We have a data set of 686 international students studying at UNSW, Australia. Create a frequency table
• Large data set => can’t create a frequency table manually
• Creating a simple frequency table using SPSS
  • Go to ‘Analyse’ => ‘Tables’ => ‘Tables of frequency’
  • When the dialog box appears, choose a variable for the box ‘Frequencies for’, then click OK
  • Copy the table to Excel for more manipulations

Simple frequency distribution: example 2

Nationality          Number of students (frequency)
Australia            179
New Zealand          1
Hong Kong            21
Singapore            48
Malaysia             70
Indonesia            76
Philippines          6
Thailand             18
China                99
Vietnam              9
India                11
USA, Canada          14
UK, Ireland          35
Other Europe         42
Rest of the world    57
Total                686

Grouped frequency tables
• What is a grouped frequency table? Each class includes a range of observed values (class interval)
• Applications:
  • Discrete variables with many values: age
  • Continuous variables
• Two types of grouped frequency tables:
  • Frequency table with equal class intervals
  • Frequency table with unequal class intervals
Grouped FT with equal class interval: discrete variable with many values
Example 3: the marks scored by 58 candidates seeking promotion in a personnel selection test were recorded as follows. Construct a frequency table using a class width of ten marks

37 49 58 59 56 79
62 82 53 58 34 45
40 43 44 50 42 61
54 30 49 54 76 47
64 53 64 54 60 39
49 44 47 44 25 38
55 57 54 55 59 40
31 41 53 47 58 55
59 64 56 42 38 37
33 33 47 50

Grouped FT with equal class interval: discrete variable with many values (cont.)

Marks (class interval)    Number of candidates (frequency)
21 – 30                   2
31 – 40                   11
41 – 50                   17
51 – 60                   20
61 – 70                   5
71 – 80                   2
81 – 90                   1
Total                     58

Note: Decision on the number of classes and class intervals is subjective but the number should be chosen carefully
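
The grouped table for Example 3 can be reproduced with a minimal Python sketch. It assumes the classes start at 21 and are ten marks wide, as in the table; the constant and variable names are illustrative.

```python
# Marks of the 58 candidates, copied from Example 3.
marks = [
    37, 49, 58, 59, 56, 79,
    62, 82, 53, 58, 34, 45,
    40, 43, 44, 50, 42, 61,
    54, 30, 49, 54, 76, 47,
    64, 53, 64, 54, 60, 39,
    49, 44, 47, 44, 25, 38,
    55, 57, 54, 55, 59, 40,
    31, 41, 53, 47, 58, 55,
    59, 64, 56, 42, 38, 37,
    33, 33, 47, 50,
]

CLASS_WIDTH = 10
FIRST_LOWER = 21  # lower value of the first class (21 - 30)
NUM_CLASSES = 7   # classes 21 - 30 up to 81 - 90

counts = [0] * NUM_CLASSES
for mark in marks:
    # Integer division maps 21..30 to class 0, 31..40 to class 1, and so on.
    index = (mark - FIRST_LOWER) // CLASS_WIDTH
    counts[index] += 1

for i, count in enumerate(counts):
    lower = FIRST_LOWER + i * CLASS_WIDTH
    upper = lower + CLASS_WIDTH - 1
    print(f"{lower} - {upper}: {count}")
print(f"Total: {sum(counts)}")
```

Changing `FIRST_LOWER` or `CLASS_WIDTH` gives a different table from the same raw data, which is why the choice of classes is described as subjective.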

Grouped FT with equal class interval: continuous variable
Example 4: draw a frequency table of wages (in USD) paid to 30 people as follows:

202 277 554 145 361
457 87 94 240 144
310 391 362 437 429
176 325 221 374 216
480 120 274 398 282
153 470 303 338 209

Grouped FT with equal class interval: continuous variable (cont.)

Wages (class interval)    Number of people (frequency)
< $100                    2
$100 – < $200             5
$200 – < $300             8
$300 – < $400             9
$400 – < $500             5
$500 – < $600             1
Total                     30

Terminology:
• Lower value: the lowest value of one class.
• Upper value: the highest value of one class
• Class interval: range from lower to upper value
• Open-ended class: the first or last class in the range may be open-ended, meaning it has no lower or no upper value (e.g. < $100). An open-ended class is designed for uncommon values that are too low or too high
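
As a rough illustration of how half-open classes such as "$100 – < $200" assign each wage to exactly one class, the sketch below counts the Example 4 wages with Python's standard `bisect` module. The list names are illustrative; the class edges are the ones in the table.

```python
from bisect import bisect_right

# Wages (USD) of the 30 people, copied from Example 4.
wages = [
    202, 277, 554, 145, 361,
    457, 87, 94, 240, 144,
    310, 391, 362, 437, 429,
    176, 325, 221, 374, 216,
    480, 120, 274, 398, 282,
    153, 470, 303, 338, 209,
]

# Lower values of every class after the open-ended first class (< $100).
edges = [100, 200, 300, 400, 500]
labels = [
    "< $100", "$100 - < $200", "$200 - < $300",
    "$300 - < $400", "$400 - < $500", "$500 - < $600",
]

counts = [0] * len(labels)
for wage in wages:
    # bisect_right returns how many edges are <= wage, so a wage of exactly
    # 200 would land in "$200 - < $300", never in "$100 - < $200".
    # No wage in this data set reaches $600, so the last class needs no upper check.
    counts[bisect_right(edges, wage)] += 1

for label, count in zip(labels, counts):
    print(f"{label:<14} {count}")
print(f"{'Total':<14} {sum(counts)}")
```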


Grouped FT with unequal class interval

Wages per employee    Number of employees
≤ $60                 4
> $60 – ≤ $80         6
> $80 – ≤ $90         6
> $90 – ≤ $120        6
> $120                3

Frequency distribution: summary
1. Simple frequency distribution: an easy task that can be done either manually or with statistical software
2. Grouped frequency distribution: more difficult. The hardest task is to decide the number of classes and the class width (class intervals). Observations in one class should be homogeneous in terms of characteristics. The more you work on it, the more reasonable your choice of the number and size of classes becomes
3. The upper value of one class should not coincide with the lower value of the following class, so that each value falls in only one class.

Guidelines for forming class interval

Class midpoint, cumulative, percentage, and cumulative percentage frequency distribution

Wages              Class      Number of people   Cumulative   Percentage   Cumulative percentage
(class interval)   midpoint   (frequency)        frequency    frequency    frequency
< $100             50         2                  2            6.7          6.7
$100 – < $200      150        5                  7            16.7         23.3
$200 – < $300      250        8                  15           26.7         50.0
$300 – < $400      350        9                  24           30.0         80.0
$400 – < $500      450        5                  29           16.7         96.7
$500 – < $600      550        1                  30           3.3          100.0
Total                         30
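
The columns of this table can be recomputed from the class frequencies alone. The following Python sketch is one way to do it; it assumes the open-ended class "< $100" starts at 0, which is what its midpoint of 50 implies, and the variable names are illustrative.

```python
from itertools import accumulate

# (lower value, upper value, frequency) for each wage class in the table.
# Assumption: the open-ended class "< $100" is treated as 0 to 100.
classes = [
    (0, 100, 2),
    (100, 200, 5),
    (200, 300, 8),
    (300, 400, 9),
    (400, 500, 5),
    (500, 600, 1),
]

frequencies = [f for _, _, f in classes]
total = sum(frequencies)

# Midpoint: average of the lower and upper values of the class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))
# Percentage frequency: class frequency as a share of the total.
percentage = [100 * f / total for f in frequencies]
# Cumulative percentage: cumulative frequency as a share of the total.
cumulative_percentage = [100 * c / total for c in cumulative]

rows = zip(midpoints, frequencies, cumulative, percentage, cumulative_percentage)
for m, f, c, p, cp in rows:
    print(f"{m:>6.0f} {f:>3} {c:>3} {p:>6.1f} {cp:>6.1f}")
```

The last cumulative frequency must equal the total (30) and the last cumulative percentage must be 100.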


Class midpoint, cumulative, percentage, and cumulative percentage frequency distribution

• Class midpoint: the average of the lower and upper values of a class
• Cumulative frequency: running total of frequencies through the classes of a FT
• Percentage (relative) frequency: the frequency of a class as a proportion of the total frequency
• Cumulative percentage frequency: similar to cumulative frequency but in percentage

Practice with SPSS


See how the figures tell you
• https://edition.cnn.com/videos/politics/2020/08/04/president-trump-axios-interview-vpx.cnn

Charts
• Tools for qualitative and discrete data:
  • Simple bar charts
  • Pie charts
• Tools for continuous data:
  • Histograms
  • Boxplots (discussed in lecture 4)

Bar and pie charts
• Back to the UNSW survey example: create a bar chart and a pie chart
• Reduce the number of classes to make the charts easier to read

Nationality          Number of students (frequency)   Percentage frequency
Australia & NZ       180                              26.24%
China                120                              17.49%
South East Asia      227                              33.09%
India                11                               1.60%
USA & Canada         14                               2.04%
UK & Ireland         35                               5.10%
Other Europe         42                               6.12%
Rest of the world    57                               8.31%
Total                686                              100.00%

Bar charts: example of UNSW
[Bar chart: Number of inter. students at UNSW. y-axis: Frequency (0 to 250); x-axis: Australia & NZ, China, South East Asia, India, USA & Canada, UK & Ireland, Other Europe, Rest of the world]
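
The reduced table can be derived from the 15-class nationality table with a short Python sketch. The mapping of original classes to broader groups is not spelled out in the slides; the one used here is inferred from the totals (for example, China 99 + Hong Kong 21 = 120) and should be read as an assumption.

```python
# Frequencies by nationality, copied from the simple frequency table (example 2).
students = {
    "Australia": 179, "New Zealand": 1, "Hong Kong": 21, "Singapore": 48,
    "Malaysia": 70, "Indonesia": 76, "Philippines": 6, "Thailand": 18,
    "China": 99, "Vietnam": 9, "India": 11, "USA, Canada": 14,
    "UK, Ireland": 35, "Other Europe": 42, "Rest of the world": 57,
}

# ASSUMED mapping of original classes to broader groups (inferred from totals).
group_of = {
    "Australia": "Australia & NZ", "New Zealand": "Australia & NZ",
    "China": "China", "Hong Kong": "China",
    "Singapore": "South East Asia", "Malaysia": "South East Asia",
    "Indonesia": "South East Asia", "Philippines": "South East Asia",
    "Thailand": "South East Asia", "Vietnam": "South East Asia",
    "USA, Canada": "USA & Canada", "UK, Ireland": "UK & Ireland",
}

grouped = {}
for nationality, count in students.items():
    # Classes not listed in group_of keep their own name.
    group = group_of.get(nationality, nationality)
    grouped[group] = grouped.get(group, 0) + count

total = sum(grouped.values())
for group, count in grouped.items():
    print(f"{group:<18} {count:>4} {100 * count / total:>6.2f}%")
print(f"{'Total':<18} {total:>4} {100.0:>6.2f}%")
```

The grouped frequencies are the bar heights of the bar chart, and the percentages are the slice sizes of the pie chart.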

Pie charts: example of UNSW
[Pie chart: Percentage of inter. students at UNSW. Slices: Australia & NZ 26.24%, China 17.49%, South East Asia 33.09%, India 1.60%, USA & Canada 2.04%, UK & Ireland 5.10%, Other Europe 6.12%, Rest of the world 8.31%]

Notes
• Choose charts that present information most effectively (‘Learning by doing’)
• Practice with SPSS

Histograms
• Raw data => frequency table => histograms
• A histogram looks like a bar chart except that the bars are joined together
• Two types of histograms:
  • Equal-width histogram
  • Unequal-width histogram

Equal-width histograms
• All bars have the same width (the same class intervals)
• The height of each bar represents the frequency or percentage frequency of the class interval
• Using the raw data in example 4, draw a histogram representing wages
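
Before drawing the wage histogram in SPSS, its shape can be previewed as text. The sketch below is only an illustration: it uses a class width of $100 and, as an assumption, treats the first class as $0 to under $100 rather than as an open-ended class.

```python
# Wages (USD) of the 30 people, copied from Example 4.
wages = [
    202, 277, 554, 145, 361,
    457, 87, 94, 240, 144,
    310, 391, 362, 437, 429,
    176, 325, 221, 374, 216,
    480, 120, 274, 398, 282,
    153, 470, 303, 338, 209,
]

CLASS_WIDTH = 100

# Count observations per class; class k covers k*100 up to (but excluding) (k+1)*100.
counts = {}
for wage in wages:
    k = wage // CLASS_WIDTH
    counts[k] = counts.get(k, 0) + 1

# One bar per class, including any empty class, so that the bars stay adjacent.
for k in range(min(counts), max(counts) + 1):
    lower = k * CLASS_WIDTH
    bar = "#" * counts.get(k, 0)
    print(f"{lower:>3} - <{lower + CLASS_WIDTH:<3} | {bar}")
```

Each `#` stands for one observation, so the bar length plays the role of the bar height in an equal-width histogram.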

Shapes of histograms – symmetric
[Figure: Histogram of Symmetric. x-axis: Symmetric (-2.4 to 2.4); y-axis: Frequency]

Shapes of histograms – positive skew (long tail to right)
[Figure: Histogram of Positive skew. x-axis: Positive skew (0.0 to 7.5); y-axis: Frequency]

Shapes of histograms – negative skew (long tail to left)
[Figure: Histogram of Negative skew. x-axis: Negative skew (3.0 to 9.0); y-axis: Frequency]

Shapes of histograms – bimodal
[Figure: Histogram of Bimodal. x-axis: Bimodal (-1.5 to 6.0); y-axis: Frequency]

Histogram terms
• Modal class – class with highest number of observations
• Uni-modal, bi-modal, tri-modal, multi-modal
• Skewness, symmetry
• Relative frequency histogram: replace the frequency of each class by class frequency / total number of obs.

Histograms of COVID19 in the world
• https://covid19.who.int/?gclid=EAIaIQobChMI8a-unPCH6wIVix0rCh3tQAogEAAYASAAEgJb5_D_BwE
• Data accessed: 7/8/2020

Flattening COVID19 curve in Korea

COVID19 curve in Vietnam

Distribution of national HS exam scores 2018

Uni entrance exam

Ogive
Instead of presenting cumulative percentage frequencies in the frequency table, you can draw them as a graph (an ogive): the cumulative percentage frequency plotted against the upper value of each class.

Practice with SPSS

Bivariate distribution

Investigating the relationship between variables
• Methods:
– Table: Cross-table
– Charts:
o Multiple bar chart
o Scatterplot (mentioned in lecture 8)

Cross-table
• Cross-table is used to investigate the relationship b/w two categorical vars or discrete variables with few values.
• Note:
– Need to identify dependent and independent variables.
– Know how to calculate row and column percentages
– Rule of thumb: independent var in row and dependent var in column
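
To make the row-percentage calculation concrete, here is a minimal Python sketch. The ten records are HYPOTHETICAL and are not taken from gss.sav; the category names are likewise invented for illustration. Following the rule of thumb, the independent variable (degree) is placed in the rows and the dependent variable (internet use) in the columns.

```python
from collections import Counter

# HYPOTHETICAL records of (degree, uses_internet); NOT taken from gss.sav.
records = [
    ("High school", "No"), ("High school", "No"), ("High school", "Yes"),
    ("High school", "No"), ("Bachelor", "Yes"), ("Bachelor", "Yes"),
    ("Bachelor", "No"), ("Bachelor", "Yes"), ("Graduate", "Yes"),
    ("Graduate", "Yes"),
]

# Cell counts of the cross-table: one count per (row, column) pair.
cells = Counter(records)
rows = ["High school", "Bachelor", "Graduate"]
columns = ["No", "Yes"]

# Row percentage: each cell as a share of its row total.
row_percentages = {}
for r in rows:
    row_total = sum(cells[(r, c)] for c in columns)
    row_percentages[r] = {c: 100 * cells[(r, c)] / row_total for c in columns}

print(f"{'Degree':<12} {'No':>6} {'Yes':>6}")
for r in rows:
    line = " ".join(f"{row_percentages[r][c]:>5.1f}%" for c in columns)
    print(f"{r:<12} {line}")
```

Each row sums to 100%, so rows can be compared directly even when the groups differ in size. Column percentages are computed the same way, dividing each cell by its column total instead.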

Cross-table
• EX: use gss.sav data file to explore the relationship b/w internet use and degree

Multiple bar chart
• We can use multiple bar chart to explore the relationship b/w variables.
• The skill is to know how to draw the chart
• EX: use gss.sav data file to explore the relationship b/w internet use, age, and degree


Multiple bar chart
Here you are

Summary
Univariate distribution
• Table: Frequency distribution
- Simple frequency table
- Grouped frequency table
• Charts
- Bar and pie charts
- Histograms
Bivariate distribution
