
Lecture 2

BUSINESS STATISTICS
Advanced Educational Program

DESCRIPTIVE STATISTICS: Tables and Charts

Reading materials:
Chapters 2 and 3 (Keller)

Why do we have to summarise data?


•  Recap
–  In the previous chapter you learned how to collect data. Data collected through surveys are called ‘raw’ data.
–  Raw data may include thousands of observations and often provide too much information, so they need to be summarised before being presented to an audience.
•  Requirement
–  A data summary clears away details but should still show the overall pattern.
–  Summarised information is concise but should accurately reflect the original data.
•  Methods to summarise and present data
–  Tables
–  Charts
–  Numerical summaries (measures of location and dispersion)

Outline

•  Frequency table (frequency distribution)
- Simple frequency table
- Grouped frequency table
•  Charts
- Bar and pie charts
- Histograms
- Boxplot

Frequency tables

•  Frequency is the number of times a certain event has happened.
•  A frequency distribution records the number of times each value occurs and is presented in the form of a table.
•  Types of frequency distribution:
   •  Simple frequency distribution
   •  Grouped frequency distribution

Univariate distribution
Simple frequency distribution

•  What is a simple frequency table? Each observed value is treated as a class.
•  Applications:
   •  Qualitative data (nominal/ordinal data)
   •  Discrete variables with few values
•  Example of a discrete variable with few values
   •  You are given the raw midterm marks of 20 students as follows: 7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9
   •  Create a simple frequency table manually.

Simple frequency table: example 1

Marks    Number of students (frequency)
4        3
5        3
6        2
7        4
8        3
9        2
10       3
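The manual tally in Example 1 can also be checked with a few lines of code. This is only a minimal sketch in Python (an assumption; the lecture itself uses SPSS and Excel), where each distinct mark is treated as its own class and `Counter` counts how often it occurs.

```python
from collections import Counter

# Midterm marks of the 20 students from Example 1.
marks = [7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9]

# Each distinct mark is its own class; Counter tallies its occurrences.
frequency = Counter(marks)

print("Marks  Number of students")
for mark in sorted(frequency):
    print(f"{mark:>5}  {frequency[mark]:>18}")
print(f"Total  {sum(frequency.values()):>18}")
```

The frequencies should add up to the number of observations (20 here), which is a quick check that no mark was missed.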

Simple frequency distribution: nominal variable

•  Example 2: We have a data set of 686 international students studying at UNSW, Australia. Create a frequency table.
•  Large data set => can’t create a frequency table manually
•  Creating a simple frequency table using SPSS
   •  Go to ‘Analyse’ => ‘Tables’ => ‘Tables of frequency’
   •  When the dialog box appears, choose a variable for the box ‘Frequencies for’, then click OK
   •  Copy the table to Excel for more manipulations

Simple frequency distribution: example 2

Nationality          Number of students (frequency)
Australia            179
New Zealand          1
Hong Kong            21
Singapore            48
Malaysia             70
Indonesia            76
Philippines          6
Thailand             18
China                99
Vietnam              9
India                11
USA, Canada          14
UK, Ireland          35
Other Europe         42
Rest of the world    57
Total                686

To make a table shorter, combine values into one class:
+ South East Asia
+ Europe

Grouped frequency tables

(Group observations into one class.)

•  What is a grouped frequency table? Each class includes a range of observed values (class interval).
•  Applications:
   •  Discrete variables with many values: age
   •  Continuous variables
•  Two types of grouped frequency tables:
   •  Frequency table with equal class intervals
   •  Frequency table with unequal class intervals
•  Group = class. Also called the class interval, it comprises the values lying between the lower limit and the upper limit of a group.

Number of class intervals ≈ 1 + 3.3 log10(n)

Grouped FT with equal class interval: discrete variable with many values

Example 3: the marks scored by 58 candidates seeking promotion in a personnel selection test were recorded as follows. Construct a frequency table using a class width of ten marks.

37 49 58 59 56 79
62 82 53 58 34 45
40 43 44 50 42 61
54 30 49 54 76 47
64 53 64 54 60 39
49 44 47 44 25 38
55 57 54 55 59 40
31 41 53 47 58 55
59 64 56 42 38 37
33 33 47 50
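Example 3 can be worked through in code as a check on the manual tally. The sketch below is in Python (an assumption; the lecture uses SPSS). It takes the log in the rule of thumb to be base 10 and uses classes 21-30, 31-40, and so on, with a width of ten marks. The helper name `class_lower` is hypothetical.

```python
import math
from collections import Counter

# Marks of the 58 candidates from Example 3.
marks = [
    37, 49, 58, 59, 56, 79,
    62, 82, 53, 58, 34, 45,
    40, 43, 44, 50, 42, 61,
    54, 30, 49, 54, 76, 47,
    64, 53, 64, 54, 60, 39,
    49, 44, 47, 44, 25, 38,
    55, 57, 54, 55, 59, 40,
    31, 41, 53, 47, 58, 55,
    59, 64, 56, 42, 38, 37,
    33, 33, 47, 50,
]

n = len(marks)
# Rule of thumb: 1 + 3.3 log(n), assuming log means log base 10.
suggested_classes = 1 + 3.3 * math.log10(n)

WIDTH = 10

def class_lower(mark):
    """Lower value of the class (21-30, 31-40, ...) containing the mark."""
    return (mark - 1) // WIDTH * WIDTH + 1

counts = Counter(class_lower(m) for m in marks)
table = [(f"{lo} - {lo + WIDTH - 1}", counts[lo]) for lo in sorted(counts)]

print(f"n = {n}, suggested number of classes = {suggested_classes:.1f}")
for label, freq in table:
    print(f"{label:<9}{freq:>3}")
```

With n = 58 the rule suggests about seven classes, which agrees with the seven classes obtained from a width of ten marks.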
Grouped FT with equal class interval: discrete variable with many values (cont.)

Marks (class interval)    Number of candidates (frequency)
21 – 30                   2
31 – 40                   11
41 – 50                   17
51 – 60                   20
61 – 70                   5
71 – 80                   2
81 – 90                   1
Total                     58

Note: The decision on the number of classes and class intervals is subjective, but the number should be chosen carefully.

Grouped FT with equal class interval: continuous variable

Example 4: draw a frequency table of wages (in USD) paid to 30 people as follows:

202 277 554 145 361
457 87 94 240 144
310 391 362 437 429
176 325 221 374 216
480 120 274 398 282
153 470 303 338 209
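For a continuous variable such as the Example 4 wages, each class needs a rule for its boundaries. The Python sketch below (an assumption; the lecture uses SPSS) uses $100-wide classes that include the lower value and exclude the upper value, so that every wage falls in exactly one class. The helper name `label` is hypothetical.

```python
from collections import Counter

# Wages (USD) of the 30 people from Example 4.
wages = [
    202, 277, 554, 145, 361,
    457, 87, 94, 240, 144,
    310, 391, 362, 437, 429,
    176, 325, 221, 374, 216,
    480, 120, 274, 398, 282,
    153, 470, 303, 338, 209,
]

WIDTH = 100

# Key each wage by the lower value of its class: lower <= wage < lower + WIDTH.
counts = Counter((w // WIDTH) * WIDTH for w in wages)

def label(lower):
    """Class label; the first class is written as open-ended."""
    if lower == 0:
        return "< $100"
    return f"${lower} - < ${lower + WIDTH}"

table = [(label(lo), counts[lo]) for lo in sorted(counts)]

for name, freq in table:
    print(f"{name:<15}{freq:>3}")
print(f"{'Total':<15}{sum(counts.values()):>3}")
```

Because the upper value is excluded, a wage of exactly $200 would go into the $200 - < $300 class, never into two classes at once.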

Grouped FT with equal class interval: continuous variable (cont.)

Wages (class interval)    Number of people (frequency)
< $100                    2
$100 – < $200             5
$200 – < $300             8
$300 – < $400             9
$400 – < $500             5
$500 – < $600             1
Total                     30

Terminology:
•  Lower value: the lowest value of a class.
•  Upper value: the highest value of a class.
•  Class interval: the range from the lower to the upper value.
•  Open-ended class: the first or last class in the range may be open-ended, meaning it has no lower or no upper value (e.g. < $100). An open-ended class is designed for uncommon values that are too low or too high.

Grouped FT with unequal class interval

Wages per employee    Number of employees
≤ $60                 4
> $60 – ≤ $80         6
> $80 – ≤ $90         6
> $90 – ≤ $120        6
> $120                3

Note: how is the midpoint of an open-ended class calculated?

Frequency distribution: summary and guidelines for forming class intervals

1.  Simple frequency distribution: an easy task that can be done manually or with statistical software.
2.  Grouped frequency distribution: more difficult. The hardest task is deciding the number of classes and the class width (class intervals). Observations in one class should be homogeneous in their characteristics. With practice, choosing a reasonable number and size of classes becomes easier.
3.  The upper value of one class should not coincide with the lower value of the next class, so that each value falls in only one class.
Class midpoint, cumulative, percentage, and cumulative percentage frequency distribution

Wages (class interval)    Class midpoint    Number of people (frequency)    Cumulative frequency    Percentage frequency    Cumulative percentage frequency
< $100                    50                2                               2                       6.7                     6.7
$100 – < $200             150               5                               7                       16.7                    23.3
$200 – < $300             250               8                               15                      26.7                    50.0
$300 – < $400             350               9                               24                      30.0                    80.0
$400 – < $500             450               5                               29                      16.7                    96.7
$500 – < $600             550               1                               30                      3.3                     100.0
Total                                       30

•  Class midpoint: the average of the lower and upper values of a class, i.e. (lower value + upper value) / 2
•  Cumulative frequency: the running total of frequencies through the classes of a frequency table
•  Percentage (relative) frequency: the frequency of a class as a proportion of the total frequency
•  Cumulative percentage frequency: similar to cumulative frequency, but expressed as a percentage
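The four derived columns follow directly from the class limits and the frequencies. The Python sketch below (an assumption; the lecture uses SPSS) recomputes them for the wages table. The lower value of 0 for the open-ended class "< $100" is an assumption chosen to match the midpoint of 50 shown in the table.

```python
from itertools import accumulate

# (lower value, upper value, frequency) for each wage class.
# Assumption: the open-ended class "< $100" starts at 0.
classes = [
    (0, 100, 2),
    (100, 200, 5),
    (200, 300, 8),
    (300, 400, 9),
    (400, 500, 5),
    (500, 600, 1),
]

frequencies = [f for _, _, f in classes]
total = sum(frequencies)

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
cumulative = list(accumulate(frequencies))            # running total
percentage = [100 * f / total for f in frequencies]   # share of the total
cumulative_percentage = [100 * c / total for c in cumulative]

rows = zip(classes, midpoints, cumulative, percentage, cumulative_percentage)
for (lo, hi, f), mid, cum, pct, cum_pct in rows:
    print(f"{lo:>3} - <{hi:<4}{mid:>6.0f}{f:>4}{cum:>4}{pct:>7.1f}{cum_pct:>7.1f}")
```

The cumulative percentage is computed from the cumulative frequency rather than by adding the rounded percentages, which avoids rounding drift (6.7 + 16.7 would give 23.4 instead of 23.3).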

Practice with SPSS

See how the figures tell you

https://edition.cnn.com/videos/politics/2020/08/04/president-trump-axios-interview-vpx.cnn

Charts

•  Tools for qualitative and discrete data:
   •  Simple bar charts
   •  Pie charts
•  Tools for continuous data:
   •  Histograms
   •  Boxplots (discussed in lecture 4)

Bar and pie charts

•  Back to the UNSW survey example: create a bar chart and a pie chart.
•  Reduce the number of classes so the charts are easier to read.

Nationality          Number of students (frequency)    Percentage frequency
Australia & NZ       180                               26.24%
China                120                               17.49%
South East Asia      227                               33.09%
India                11                                1.60%
USA & Canada         14                                2.04%
UK & Ireland         35                                5.10%
Other Europe         42                                6.12%
Rest of the world    57                                8.31%
Total                686                               100.00%
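The reduced table can be derived from the original 15-class nationality table by summing the classes that are merged. The Python sketch below is an assumption (the lecture uses SPSS and Excel). The grouping is inferred from the totals: China 120 equals China 99 plus Hong Kong 21, and South East Asia 227 equals the sum of Singapore, Malaysia, Indonesia, Philippines, Thailand and Vietnam.

```python
# Frequencies from the original nationality table (Example 2).
counts = {
    "Australia": 179, "New Zealand": 1, "Hong Kong": 21, "Singapore": 48,
    "Malaysia": 70, "Indonesia": 76, "Philippines": 6, "Thailand": 18,
    "China": 99, "Vietnam": 9, "India": 11, "USA, Canada": 14,
    "UK, Ireland": 35, "Other Europe": 42, "Rest of the world": 57,
}

# Combined class -> original classes (grouping inferred from the totals).
groups = {
    "Australia & NZ": ["Australia", "New Zealand"],
    "China": ["China", "Hong Kong"],
    "South East Asia": ["Singapore", "Malaysia", "Indonesia",
                        "Philippines", "Thailand", "Vietnam"],
    "India": ["India"],
    "USA & Canada": ["USA, Canada"],
    "UK & Ireland": ["UK, Ireland"],
    "Other Europe": ["Other Europe"],
    "Rest of the world": ["Rest of the world"],
}

combined = {g: sum(counts[m] for m in members) for g, members in groups.items()}
total = sum(combined.values())
percent = {g: 100 * c / total for g, c in combined.items()}

for g in combined:
    print(f"{g:<18}{combined[g]:>5}{percent[g]:>8.2f}%")
print(f"{'Total':<18}{total:>5}{100:>8.2f}%")
```

Checking that the combined total still equals 686 confirms that no original class was dropped or counted twice.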
Bar charts: example of UNSW

[Figure: bar chart "Number of inter. students at UNSW". Vertical axis: Frequency (0 to 250). Horizontal axis: Australia & NZ, China, South East Asia, India, USA & Canada, UK & Ireland, Other Europe, Rest of the world.]

Pie charts: example of UNSW

[Figure: pie chart "Percentage of inter. students at UNSW". Slices: Australia & NZ 26.24%, China 17.49%, South East Asia 33.09%, India 1.60%, USA & Canada 2.04%, UK & Ireland 5.10%, Other Europe 6.12%, Rest of the world 8.31%.]

Notes

§  Choose charts that present information most effectively (‘Learning by doing’)
§  Practice with SPSS

Histograms

§  Raw data => frequency table => histograms
§  A histogram looks like a bar chart except that the bars are joined together
§  Two types of histograms:
   §  Equal-width histogram
   §  Unequal-width histogram

Equal-width histograms

§  All bars have the same width (the same class intervals)
§  The height of each bar represents the frequency or percentage frequency of the class interval
§  Using the raw data in example 4, draw a histogram representing wages

Shapes of Histograms - symmetric

[Figure: "Histogram of Symmetric". Vertical axis: Frequency (0 to 50). Horizontal axis: Symmetric (-2.4 to 2.4).]
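For the equal-width histogram of the Example 4 wages, the grouped frequencies are all that is needed. The Python sketch below is an assumption (the lecture draws the chart in SPSS) and prints a rough text version with one `#` per observation. The bars run horizontally here only because that is easier in plain text. The function name `text_histogram` is hypothetical.

```python
# Equal-width wage classes and their frequencies (Example 4).
labels = [
    "< $100",
    "$100 - < $200",
    "$200 - < $300",
    "$300 - < $400",
    "$400 - < $500",
    "$500 - < $600",
]
frequencies = [2, 5, 8, 9, 5, 1]

def text_histogram(labels, frequencies, symbol="#"):
    """Return one line per class; bar length equals the class frequency."""
    width = max(len(label) for label in labels)
    return [
        f"{label:<{width}} | {symbol * freq} {freq}"
        for label, freq in zip(labels, frequencies)
    ]

lines = text_histogram(labels, frequencies)
print("\n".join(lines))

# The class with the highest frequency.
modal_class = labels[frequencies.index(max(frequencies))]
print("Class with the most observations:", modal_class)
```

Because all classes have the same width, bar length alone carries the comparison between classes.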
Shapes of histograms – positive skew (long tail to right)

[Figure: "Histogram of Positive skew". Vertical axis: Frequency (0 to 35). Horizontal axis: Positive skew (0.0 to 7.5).]

Shapes of histograms – negative skew (long tail to left)

[Figure: "Histogram of Negative skew". Vertical axis: Frequency (0 to 35). Horizontal axis: Negative skew (3.0 to 9.0).]

Shapes of histograms - bimodal

[Figure: "Histogram of Bimodal". Vertical axis: Frequency (0 to 25). Horizontal axis: Bimodal (-1.5 to 6.0).]

Histogram terms

•  Modal class – class with the highest number of observations
•  Uni-modal, bi-modal, tri-modal, multi-modal
•  Skewness, symmetry
•  Relative frequency histogram: replace the frequency for each class by class frequency / total number of observations

Histograms of COVID19 in the world

•  https://covid19.who.int/?gclid=EAIaIQobChMI8a-unPCH6wIVix0rCh3tQAogEAAYASAAEgJb5_D_BwE
•  Data accessed: 7/8/2020

Flattening COVID19 curve in Korea
COVID19 curve in Vietnam

Ogive

Instead of presenting the cumulative percentage frequency in the frequency table, you can draw a graph.
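An ogive plots the cumulative percentage frequency against the upper value of each class. The Python sketch below (an assumption; the lecture uses SPSS) lists the points that would be joined for the wages table. Starting the curve at (0, 0%) assumes that the open-ended class "< $100" has a lower value of 0.

```python
from itertools import accumulate

# Upper values and frequencies of the wage classes.
upper_values = [100, 200, 300, 400, 500, 600]
frequencies = [2, 5, 8, 9, 5, 1]

total = sum(frequencies)
cumulative_percentage = [100 * c / total for c in accumulate(frequencies)]

# Assumption: the curve starts at 0% at a lower value of 0.
ogive_points = [(0, 0.0)] + list(zip(upper_values, cumulative_percentage))

for wage, pct in ogive_points:
    print(f"below ${wage:<4} {pct:>6.1f}%")
```

Reading the graph at a given wage gives the percentage of people earning less than that amount. For example, 50% of the 30 people earn below $300.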

Practice with SPSS

Summary

•  Table: Frequency distribution
- Simple frequency table
- Grouped frequency table
•  Charts
- Bar and pie charts
- Histograms
