Chapter 3 - Visualising Data
VISUALISING DATA
CONTENT
3.1 Data Organisation and Frequency Distribution
3.2 Histograms, Frequency Polygons, Ogive
3.3 Stem and Leaf Plot
3.4 Box-plot
3.5 Other Types of Graphs (For Qualitative Data)
3.6 Graphical Summary using Microsoft Excel
OVERVIEW
ORGANISING AND GRAPHING DATA
(QUANTITATIVE DATA)
HISTOGRAM
(Section 3.2)
STEM AND LEAF PLOT
(Section 3.3)
3.1
DATA ORGANISATION AND FREQUENCY DISTRIBUTION
FREQUENCY DISTRIBUTION
(QUANTITATIVE DATA)
UNGROUPED DATA
Data are given as INDIVIDUAL POINTS
EXAMPLE
Value Frequency
0 2
1 13
2 18
3 0
4 10
5 2
GROUPED DATA
Data are given in INTERVALS
EXAMPLE
Exam Score Frequency
90-99 7
80-89 5
70-79 15
60-69 4
50-59 5
40-49 1
FREQUENCY DISTRIBUTION
(QUANTITATIVE DATA-UNGROUPED DATA)
Heights of statistics students were obtained by a lecturer as part of a study conducted for class. The
last digits of those heights are listed below.
0 1 6 5 5 5 0 3 3
5 5 8 8 0 7 5 5 0
8 0 2 5 3 1 9 5 5
5 5 4 6 5 0 0 5 0
Solution
Last Digit 0 1 2 3 4 5 6 7 8 9
Frequency 8 2 1 3 1 14 2 1 3 1
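The tally above can be checked with a short script. This is a minimal sketch, assuming the 36 last digits are typed in exactly as listed; the variable names are illustrative only.

```python
from collections import Counter

# Last digits of the heights, copied from the example above.
last_digits = [
    0, 1, 6, 5, 5, 5, 0, 3, 3,
    5, 5, 8, 8, 0, 7, 5, 5, 0,
    8, 0, 2, 5, 3, 1, 9, 5, 5,
    5, 5, 4, 6, 5, 0, 0, 5, 0,
]

counts = Counter(last_digits)
# One frequency per digit 0..9 (a digit that never occurs would get 0).
frequency = [counts[d] for d in range(10)]

print("Last Digit", *range(10))
print("Frequency ", *frequency)
```

The frequencies should add up to the number of observations, which is a quick way to catch a tallying mistake.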
FREQUENCY DISTRIBUTION
(QUANTITATIVE DATA-GROUPED DATA)
33 38 41 41 43 47 48 48 49 49
50 52 52 52 54 55 56 56 57 58
59 60 61 62 64 65 65 66 66 68
68 69 71 71 75 76 79 80 81 85
Construct a frequency distribution table for the data above with 31 as the starting point.
Solution
Class width $= \dfrac{85 - 33}{7} \approx 7.43$, round up to 8
EXAMPLE 3.2
(QUANTITATIVE DATA)-CONTINUE
Class Frequency
$31 \le x < 39$ 2
$39 \le x < 47$ 3
$47 \le x < 55$ 10
$55 \le x < 63$ 9
$63 \le x < 71$ 8
$71 \le x < 79$ 4
$79 \le x < 87$ 4
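The grouping can be reproduced with a short script. This is a sketch under two assumptions: the number of classes is 7 (the divisor used in the class width calculation), and each class includes its lower end but not its upper end. The function name `grouped_frequency` is hypothetical.

```python
import math

data = [
    33, 38, 41, 41, 43, 47, 48, 48, 49, 49,
    50, 52, 52, 52, 54, 55, 56, 56, 57, 58,
    59, 60, 61, 62, 64, 65, 65, 66, 66, 68,
    68, 69, 71, 71, 75, 76, 79, 80, 81, 85,
]

n_classes = 7   # assumption: taken from the divisor in the class width step
start = 31      # starting point given in the question
# Range divided by the number of classes, rounded up to a whole number.
width = math.ceil((max(data) - min(data)) / n_classes)

def grouped_frequency(values, start, width, n_classes):
    """Count the values falling in each class lower <= x < lower + width."""
    table = []
    for k in range(n_classes):
        lower = start + k * width
        upper = lower + width
        freq = sum(1 for x in values if lower <= x < upper)
        table.append((lower, upper, freq))
    return table

table = grouped_frequency(data, start, width, n_classes)
for lower, upper, freq in table:
    print(f"{lower} <= x < {upper}: {freq}")
```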
EXAMPLE 3.3
(QUALITATIVE DATA)
A sample of 25 young executives was taken at random. Each executive was asked to choose only one
of the five listed models of national cars: Myvi, Bezza, Viva, Axia and Alza.
Solution
Frequency 6 3 6 2 8
3.2
HISTOGRAMS
FREQUENCY POLYGON
OGIVE
OVERVIEWS
Symmetrical Distribution
J-SHAPE REVERSE J
EXAMPLE 3.4
(HISTOGRAM)
The following frequency distribution summarises the percentage of sugar in a popular brand of soft
drink.
32.0-32.9 5
33.0-33.9 12
34.0-34.9 20
35.0-35.9 16
36.0-36.9 7
Construct a histogram for the data above. Then identify the shape of the distribution based on the
histogram.
EXAMPLE 3.4
(HISTOGRAM)-CONTINUE
Solution
Class Boundaries Frequency
31.95-32.95 5
32.95-33.95 12
33.95-34.95 20
34.95-35.95 16
35.95-36.95 7
[Histogram: Frequency (0 to 25) against Class Boundaries (31.95 to 36.95)]
Shape: SYMMETRICAL DISTRIBUTION
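The class boundaries used for the histogram can be derived from the class limits. This is a minimal sketch, assuming each boundary lies halfway between the upper limit of one class and the lower limit of the next; the function name `class_boundaries` is hypothetical, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Class limits and frequencies from the sugar percentage example.
limits = [("32.0", "32.9"), ("33.0", "33.9"), ("34.0", "34.9"),
          ("35.0", "35.9"), ("36.0", "36.9")]
freqs = [5, 12, 20, 16, 7]

def class_boundaries(limits):
    """Widen each class by half the gap between consecutive classes."""
    lows = [Decimal(lo) for lo, _ in limits]
    highs = [Decimal(hi) for _, hi in limits]
    half_gap = (lows[1] - highs[0]) / 2   # (33.0 - 32.9) / 2 = 0.05
    return [(lo - half_gap, hi + half_gap) for lo, hi in zip(lows, highs)]

boundaries = class_boundaries(limits)
for (lo, hi), f in zip(boundaries, freqs):
    # A row of '*' gives a rough text version of the histogram bars.
    print(f"{lo}-{hi} {'*' * f} ({f})")
```

The tallest bar is the middle class and the frequencies fall away on both sides, which is consistent with the roughly symmetrical shape noted in the solution.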
EXAMPLE 3.5
(FREQUENCY POLYGON)
The following table shows the frequency distribution for the weight of 52 female workers in a factory.
Measurements have been recorded to the nearest kilogram (kg).
40-44 3
45-49 2
50-54 7
55-59 18
60-64 18
65-69 3
70-74 1
Solution
Midpoint Frequency
37 0
42 3
47 2
52 7
57 18
62 18
67 3
72 1
77 0
[Frequency polygon: Frequency (0 to 20) against Midpoint (37 to 77)]
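The points of a frequency polygon can be computed from the class limits. This is a sketch, assuming the polygon is closed by adding one empty class of the same width at each end; the variable names are illustrative.

```python
# Class limits (kg) and frequencies from the example above.
classes = [(40, 44), (45, 49), (50, 54), (55, 59), (60, 64), (65, 69), (70, 74)]
freqs = [3, 2, 7, 18, 18, 3, 1]

# Class width is the distance between consecutive lower limits.
width = classes[1][0] - classes[0][0]

# Midpoint of each class: (lower limit + upper limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Close the polygon with a zero-frequency point one class width
# below the first midpoint and one class width above the last.
points = (
    [(midpoints[0] - width, 0)]
    + list(zip(midpoints, freqs))
    + [(midpoints[-1] + width, 0)]
)

for x, f in points:
    print(f"{x:g} {f}")
```

With a class width of 5, the closing points fall at 37 and 77.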
EXAMPLE 3.6
(OGIVE/
CUMULATIVE FREQUENCY GRAPH)
The following table shows the frequency distribution of the years of service for 75
employees of a large manufacturing department of an international company.
Frequency 21 25 15 0 8 6
Construct a cumulative frequency graph (ogive) for the given frequency distribution. Then, find the
number of employees who have served in the company for less than 18 years.
Solution
Upper boundary 0.5 5.5 10.5 15.5 20.5 25.5 30.5
Cumulative Frequency 0 21 46 61 61 69 75
EXAMPLE 3.6
(OGIVE/CUMULATIVE FREQUENCY GRAPH)-CONTINUE
[Ogive: Cumulative frequency (0 to 80) against Upper boundary (0.5 to 30.5)]
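Reading a value off the ogive amounts to linear interpolation between the plotted points. This is a minimal sketch, assuming the upper boundaries 0.5, 5.5, ..., 30.5 shown on the horizontal axis; the function name `read_ogive` is hypothetical.

```python
from itertools import accumulate

freqs = [21, 25, 15, 0, 8, 6]
upper_bounds = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]

# The ogive starts at 0 on the first boundary, then adds each class frequency.
cumulative = [0] + list(accumulate(freqs))

def read_ogive(x, bounds, cum):
    """Cumulative frequency below x, by straight-line interpolation."""
    for i in range(len(bounds) - 1):
        b0, b1 = bounds[i], bounds[i + 1]
        if b0 <= x <= b1:
            return cum[i] + (cum[i + 1] - cum[i]) * (x - b0) / (b1 - b0)
    raise ValueError("x lies outside the range of the ogive")

below_18 = read_ogive(18, upper_bounds, cumulative)
print(cumulative)
print(below_18)
```

Because the class between 15.5 and 20.5 has frequency 0, the ogive is flat there, so the reading at 18 years is the cumulative frequency at 15.5, namely 61 employees.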
3.3
STEM AND LEAF PLOT
A method of showing data in graphic form.
Stem: the leading digit.
Leaf: the trailing digit.
A key indicator is needed to define the stem and leaf values.
The stem must be arranged in order.
The plot shows the shape of the distribution (when rotated into a horizontal position).
Mixture model: back-to-back stem and leaf plot.
EXAMPLE 3.7
(STEM AND LEAF PLOT)
Alias Consultancy conducted a survey on the number of motorcycle thefts in Malaysia for a period of 25
days. The data acquired are shown below.
10 11 12 13 13 14 25 26 27 27
28 28 29 29 31 31 32 33 34 34
45 47 49 50 52
Construct a stem and leaf plot. Then give a comment on the distribution of the number of motorcycle
thefts in Klang.
Solution
Stem | Leaf
1 | 0 1 2 3 3 4
2 | 5 6 7 7 8 8 9 9
3 | 1 1 2 3 4 4
4 | 5 7 9
5 | 0 2
Key: 1 | 0 means 10
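The plot can be generated from the raw data. This is a minimal sketch for two-digit values, assuming the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

thefts = [
    10, 11, 12, 13, 13, 14, 25, 26, 27, 27,
    28, 28, 29, 29, 31, 31, 32, 33, 34, 34,
    45, 47, 49, 50, 52,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    # List every stem in order, including any stem that has no leaves.
    return {s: groups.get(s, []) for s in range(min(groups), max(groups) + 1)}

plot = stem_and_leaf(thefts)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
```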
EXAMPLE 3.8
(MIXTURES STEM AND LEAF PLOT)
The numbers of blocked intrusion attempts on each day during the first two weeks of the month were
56 47 49 37 38 60 50 43 43 59 50 56 54 58
After the change of firewall settings, the numbers of blocked intrusions during the next 20 days were
53 21 32 49 45 38 44 33 32 43
53 46 36 48 39 35 37 36 39 45
i. Construct a back-to-back stem and leaf plot. Then, comment on the distributions of the
number of blocked intrusion attempts before and after the change.
ii. Based on the answer in (i), can we say that the change of firewall settings has reduced the number
of blocked intrusion attempts? Justify your answer.
EXAMPLE 3.8
(MIXTURES STEM AND LEAF PLOT)-
CONTINUE
Solution
ii
Since the peak of the distribution of the number of blocked intrusion attempts
shifted to the left after the change of firewall settings, we can say that
the change of firewall settings has reduced the number of blocked
intrusion attempts.
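The back-to-back plot for part (i) can be sketched in code. The layout is an assumption (leaves before the change on the left, read outward from the stem, and leaves after the change on the right), and the function names are hypothetical.

```python
before = [56, 47, 49, 37, 38, 60, 50, 43, 43, 59, 50, 56, 54, 58]
after = [53, 21, 32, 49, 45, 38, 44, 33, 32, 43,
         53, 46, 36, 48, 39, 35, 37, 36, 39, 45]

def leaves_by_stem(values):
    """Group the sorted units digits (leaves) under their tens digit (stem)."""
    groups = {}
    for v in sorted(values):
        groups.setdefault(v // 10, []).append(v % 10)
    return groups

def back_to_back(left, right):
    """Rows of (left leaves reversed, stem, right leaves) on shared stems."""
    l, r = leaves_by_stem(left), leaves_by_stem(right)
    stems = range(min(min(l), min(r)), max(max(l), max(r)) + 1)
    return [(l.get(s, [])[::-1], s, r.get(s, [])) for s in stems]

rows = back_to_back(before, after)
for left, stem, right in rows:
    left_text = " ".join(map(str, left))
    right_text = " ".join(map(str, right))
    print(f"{left_text:>15} | {stem} | {right_text}")

# Stem with the most leaves on each side (the peak of each distribution).
peak_before = max(rows, key=lambda row: len(row[0]))[1]
peak_after = max(rows, key=lambda row: len(row[2]))[1]
print(peak_before, peak_after)
```

The peak moves from stem 5 before the change to stem 3 after it, which is the shift to the left described in the answer to (ii).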
EXERCISE 3.1
(STEM AND LEAF PLOT)
1. The following data shows 22 exam marks for a Mathematics course:
44 52 70 75 53 44 52 66
57 79 83 68 94 66 59 45
69 48 53 80 95 44
2. The data shown represents the percentage of unemployed males and females in 1995 for a
sample of countries of the world. Using the whole numbers as stems and the decimals as
leaves construct a back-to-back (mixture) stem and leaf plot and compare the distribution
of the two groups.
Females Males
4.9 5.0 5.3 2.1 2.3 2.3
5.5 5.6 5.6 2.7 3.0 3.3
5.8 6.1 6.3 3.3 3.6 3.7
6.6 6.7 7.1 3.9 4.2 4.2
7.4 7.6 7.9 4.4 4.5 5.6
3.4
BOX-PLOT
WHAT IS
BOX-PLOT??
A box and whisker plot
A Vertical boxplot
A Horizontal boxplot
HOW TO DRAW
BOX-PLOT??
PROCEDURES FOR
CONSTRUCTING A BOX-PLOT
5 Locate the minimum value, Q1, Q2, Q3, maximum value and outliers (if any) on the scale.
6 Draw a box around Q1 and Q3, draw a vertical line through Q2, and connect the upper and
lower values to the box.
OUTLIERS
An outlier is an extremely high or an extremely low data value when
compared with the rest of the data values.
HOW TO DETECT OUTLIERS??
2 Find Q1 and Q3
5 Check the data set for any data value which is smaller than the lower limit or
larger than the upper limit:
$x < Q_1 - 1.5(Q_3 - Q_1)$ or $x > Q_3 + 1.5(Q_3 - Q_1)$
EXAMPLE 3.9
OUTLIERS
The number of credits in business courses eight job applicants had is shown here:
9, 12, 15, 27, 33, 45, 63, 72.
Find the first and third quartiles for the above data. Is there any outlier in the
data?
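Solution:
Position of $Q_1$: $\frac{1(8)}{4} = 2$, so $Q_1 = \frac{x_2 + x_3}{2} = \frac{12 + 15}{2} = 13.5$
Position of $Q_3$: $\frac{3(8)}{4} = 6$, so $Q_3 = \frac{x_6 + x_7}{2} = \frac{45 + 63}{2} = 54$
Lower limit: $Q_1 - 1.5(Q_3 - Q_1) = 13.5 - 1.5(54 - 13.5) = -47.25$
Upper limit: $Q_3 + 1.5(Q_3 - Q_1) = 54 + 1.5(54 - 13.5) = 114.75$
No data value is smaller than $-47.25$ or larger than $114.75$, so there is no outlier.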
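The quartile and outlier check can be sketched in code. The position rule below is an assumption inferred from the worked solutions in this chapter: take position k·n/4, average that value with the next one when the position is a whole number, and otherwise round the position up. Statistical software often uses other quartile conventions, so results may differ. The function name `quartile` is hypothetical.

```python
import math

def quartile(sorted_values, k):
    """k-th quartile (k = 1, 2, 3) using the position rule k*n/4."""
    n = len(sorted_values)
    pos = k * n / 4
    if pos.is_integer():
        i = int(pos)
        # Whole-number position: average the i-th and (i+1)-th values.
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    # Otherwise round the position up to the next whole number.
    return sorted_values[math.ceil(pos) - 1]

credits = sorted([9, 12, 15, 27, 33, 45, 63, 72])
q1, q3 = quartile(credits, 1), quartile(credits, 3)
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in credits if x < lower_limit or x > upper_limit]

print(q1, q3, lower_limit, upper_limit, outliers)
```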
EXAMPLE 3.10
(BOX-PLOT)
A company that sells mail-order handphones is interested in studying the typical weekly sales for
inventory planning. The company randomly selected 10 weekly sales figures from last year’s records and
obtained the data (RM’000) shown below.
147 108 123 122 131 115 125 127 128 123
Construct a box-plot for the data above. Then, comment on the shape of the distribution.
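Solution
Sorted data: 108 115 122 123 123 125 127 128 131 147
$Q_1 = x_3 = 122$, $Q_2 = \frac{x_5 + x_6}{2} = 124$, $Q_3 = x_8 = 128$
Lower limit $= 122 - 1.5(6) = 113$ and upper limit $= 128 + 1.5(6) = 137$, so 108 and 147 are outliers
Minimum (excluding outliers) 115
Median 124
Maximum (excluding outliers) 131
[Box-plot: Weekly Sales (RM'000) on a scale from 105 to 145, with Q1, Q2 and Q3 marked]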
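The values needed to draw this box-plot can be computed as follows. This is a sketch, assuming the quartile position rule used in this chapter's worked solutions (position k·n/4, averaged with the next value when whole, rounded up otherwise) and assuming the whiskers end at the smallest and largest values that are not outliers. The function name `quartile` is hypothetical.

```python
import math

def quartile(sorted_values, k):
    """k-th quartile (k = 1, 2, 3) using the position rule k*n/4."""
    n = len(sorted_values)
    pos = k * n / 4
    if pos.is_integer():
        i = int(pos)
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    return sorted_values[math.ceil(pos) - 1]

sales = sorted([147, 108, 123, 122, 131, 115, 125, 127, 128, 123])
q1, q2, q3 = (quartile(sales, k) for k in (1, 2, 3))
iqr = q3 - q1
lower_limit, upper_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the limits are outliers and are plotted individually.
outliers = [x for x in sales if x < lower_limit or x > upper_limit]
# The whiskers stop at the most extreme values that are not outliers.
inside = [x for x in sales if lower_limit <= x <= upper_limit]
whisker_min, whisker_max = min(inside), max(inside)

print(whisker_min, q1, q2, q3, whisker_max, outliers)
```

Under these assumptions the whiskers end at 115 and 131, which matches the minimum and maximum marked in the solution, while 108 and 147 are plotted as outliers.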
EXAMPLE 3.11
(PARALLEL BOX-PLOTS)
Mr. Tan is interested in comparing the number of hours that housewives spend on television programs between
the town and rural areas in Kuantan. Therefore, he randomly surveyed 20 and 11 housewives from the
town and rural areas, respectively. The following table shows the collected data in ascending order.
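Town area (n = 20)
Minimum: 25
First Quartile: position $\frac{20}{4} = 5$, so $Q_1 = \frac{x_5 + x_6}{2} = \frac{28 + 29}{2} = 28.5$
Median: position $\frac{20}{2} = 10$, so $Q_2 = \frac{x_{10} + x_{11}}{2} = \frac{34 + 35}{2} = 34.5$
Third Quartile: position $\frac{3(20)}{4} = 15$, so $Q_3 = \frac{x_{15} + x_{16}}{2} = \frac{38 + 39}{2} = 38.5$
Maximum: 45
Lower Limit: $Q_1 - 1.5\,IQR = 28.5 - 1.5(10) = 13.5$
Upper Limit: $Q_3 + 1.5\,IQR = 38.5 + 1.5(10) = 53.5$
Outlier: No
Rural area (n = 11)
Minimum: 2
First Quartile: position $\frac{11}{4} = 2.75$, round up, so $Q_1 = x_3 = 8$
Median: position $\frac{11}{2} = 5.5$, round up, so $Q_2 = x_6 = 15$
Third Quartile: position $\frac{3(11)}{4} = 8.25$, round up, so $Q_3 = x_9 = 30$
Maximum: 34
Lower Limit: $Q_1 - 1.5\,IQR = 8 - 1.5(22) = -25$
Upper Limit: $Q_3 + 1.5\,IQR = 30 + 1.5(22) = 63$
Outlier: No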
EXAMPLE 3.11
(PARALLEL BOX-PLOTS)-CONTINUE
[Parallel box-plots: Rural Area (R) and Town Area (T), each marked with Minimum, Q1, Q2, Q3 and Maximum, on a common scale from 0 to 50]
BOX-PLOT
(SHAPE DISTRIBUTION)
Right-skewed: median falls to the left of the box
Symmetrical: median is near the centre of the box
Left-skewed: median falls to the right of the box
If the boxplots for two or more data sets are graphed on the same axis, the
distributions can be compared using their central tendency (average) and variability
values.
To compare the average, use the location of the medians.
To compare the variability, use the length of the IQR.
COMPARING BOX-PLOT
[Parallel box-plots: Evening and Day]
Shape
Evening: left-skewed
Day: left-skewed
Variability
Evening: IQR = 18
Day: IQR = 27.25
Since IQR (evening) < IQR (day), the performance test scores during the evening are
less variable (more consistent) than during the day.
EXERCISE 3.3
BOX-PLOT
1. Jason saves a portion of his salary from his part-time job in the hope
of buying a pair of shoes. He recorded the number of ringgit he was able
to save over the past 15 weeks.
19, 12, 9, 7, 17, 10, 6, 18, 9, 14, 19, 8, 5, 17, 9
Plot a box-plot to illustrate the money saved by Jason.
2. Test scores for a college statistics class held during the day are:
99; 56; 32; 90; 81; 56; 45; 77; 84; 72; 68; 32; 79; 90
Test scores for a college statistics class held during the evening are:
78; 68; 89; 76; 65; 45; 90; 80; 85; 78; 98; 90; 81; 25
Construct a box-plot for each set of data on the same axis.
3.5
The following table shows the operating cost of a minimart in Kelantan for the year 2017.
Solution
i Total Expenses = RM10000
[Pie chart: sectors of 5.00%, 25.00%, 30.00%, 18.00% and 22.00%]
ii Total Expenses = RM8000
EXAMPLE 3.12
(PARETO CHART)
Construct a Pareto chart to represent the data given in the following table to display the frequency and
percentage of the different areas of employment for the accounting graduates.
Area Number of graduates
Accounting 80
Marketing 30
Finance 60
General Management 15
Others 15
Total 200
Solution
Area Number of graduates Cumulative Percentage (%)
Accounting 80 40
Finance 60 70
Marketing 30 85
General Management 15 92.5
Others 15 100
Percentage $= \dfrac{x}{200} \times 100\%$, where $x$ represents the number of graduates in a particular area.
The cumulative percentage is the running total of these percentages.
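The ordering and cumulative percentages for the Pareto chart can be computed as follows. This is a minimal sketch; the dictionary and variable names are illustrative, and categories with equal counts keep the order in which they were entered.

```python
from itertools import accumulate

graduates = {
    "Accounting": 80,
    "Marketing": 30,
    "Finance": 60,
    "General Management": 15,
    "Others": 15,
}

total = sum(graduates.values())

# A Pareto chart lists the categories from highest to lowest frequency.
ordered = sorted(graduates.items(), key=lambda item: item[1], reverse=True)

# Running total of the frequencies, expressed as a percentage of the total.
running = accumulate(count for _, count in ordered)
cumulative_pct = [100 * r / total for r in running]

for (area, count), pct in zip(ordered, cumulative_pct):
    print(f"{area}: {count} ({pct:g}%)")
```

The bars of the chart use the ordered frequencies, and the line uses the cumulative percentages.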
EXAMPLE 3.12
(PARETO CHART)-CONTINUE
[Pareto chart: Frequency (0 to 100) and cumulative percentage (0% to 100%) against Area (Accounting, Finance, Marketing, General Management, Others)]
EXAMPLE 3.13
(BAR CHART)
The following data show the statistics for two major types of violence against women in a city from 2015 to
2017.
Violence 2015 2016 2017
Domestic violence 410 433 489
Rape 180 210 306
[Bar chart: Frequency (0 to 600) against Year (2015, 2016, 2017), with bars for Domestic violence (410, 433, 489) and Rape (180, 210, 306)]
3.6