SB 2024 Lecture2
Graphical Presentation of Data
►Data in raw form are usually not easy to use for decision-making.
►Categorical variables
   ►Frequency distribution
   ►Bar chart
   ►Pie chart
   ►Pareto diagram
►Numerical variables
   ►Bar chart
   ►Line chart
   ►Frequency distribution
   ►Histogram and ogive
   ►Stem-and-leaf display
   ►Scatter plot
Categorical data: frequency distribution, bar chart, pie chart, Pareto diagram
► Bar charts and Pie charts are often used for qualitative
(category) data.
► The height of the bar or the size of the pie slice shows the
frequency or percentage for each category.
► A simple bar chart can display the same data as a pie chart and
is preferred by many statisticians.
Summarize data by category
Major                     Number of students
Finance                   120
International Business    200
Marketing                 150
Management                50
Accounting                75
Bar Charts
Pie Charts
► Pie charts are another excellent tool for comparing
proportions for categorical data.
Major                     Number of students    Percentage (%)
Finance                   120                   20
International Business    200                   34
Marketing                 150                   25
Management                50                    8
Accounting                75                    13
[Figure: pie chart of number of students by major. Finance 20%, International Business 34%, Marketing 25%, Management 8%, Accounting 13%.]
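The percentages in the table above can be reproduced from the raw counts. The following is a minimal sketch, assuming rounding to the nearest whole percent; the variable names are illustrative only.

```python
# Student counts per major, copied from the table above.
students = {
    "Finance": 120,
    "International Business": 200,
    "Marketing": 150,
    "Management": 50,
    "Accounting": 75,
}

total = sum(students.values())  # all students across the five majors

# Share of each major, rounded to the nearest whole percent (assumption).
percentages = {
    major: round(100 * count / total) for major, count in students.items()
}

for major, pct in percentages.items():
    print(f"{major:<24}{students[major]:>5}{pct:>5}%")
```

Each pie slice is proportional to these percentages, and each bar in a bar chart to the raw counts.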
Pareto Diagram
Pareto Diagram Example
► A Pareto Chart is a combination of a bar graph and a line
graph.
► This chart helps organizations and individuals identify
and prioritize the most significant factors contributing to
a particular issue or problem.
► The most problematic categories are shown first.
► For example, suppose you collect customer complaints and count them by category:
Product 9
Service 7
Store 5
Price 3
Location 2
Customer Complaints    Frequency    Percentage    Cumulative Percentage
Product                9            35            35
Service                7            27            62
Store                  5            19            81
Price                  3            11            92
Location               2            8             100
Total                  26           100
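The table above can be rebuilt from the complaint counts by sorting the categories and accumulating their shares. This is a sketch only, with illustrative names. Note that the exact share for Price is about 11.5%; the table shows 11, presumably so that the column sums to 100.

```python
# Complaint counts by category (deliberately unsorted here).
complaints = {"Service": 7, "Product": 9, "Location": 2, "Store": 5, "Price": 3}

# Put the most frequent category first.
ordered = sorted(complaints.items(), key=lambda item: item[1], reverse=True)
total = sum(complaints.values())

# For each category: frequency, percentage, and cumulative percentage.
rows = []
running = 0
for category, freq in ordered:
    running += freq
    rows.append((category, freq, 100 * freq / total, 100 * running / total))

for category, freq, pct, cum_pct in rows:
    print(f"{category:<10}{freq:>3}{pct:>7.1f}{cum_pct:>7.1f}")
```

In a Pareto chart, the bars use the frequencies and the line uses the cumulative percentages.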
► Step 3: Show results graphically
[Figure: Pareto chart of customer complaints. Bars show Frequency (left axis, 0 to 10) and the line shows cumulative Percentage (right axis, 0 to 120) for Product, Service, Store, Price, Location.]
Numerical data: histogram, ogive
Frequency Distribution
Relative Frequency Distribution
Number of Classes
►Use at least 5 but no more than 15-20 classes.
►Methods to determine the number of classes in a
frequency distribution:
►The rule: $2^k \ge n$
   where $k$ = number of classes and
   $n$ = number of data points (observations).
►Find the lowest value of $k$ that satisfies the rule.
►For example, $n = 50$:
   $2^5 = 32 < 50$
   $2^6 = 64 > 50$, so $k = 6$ is a good choice.
►Another rule:
   Sturges' rule: $k = 1 + 3.3 \log n$,
   where $n$ is the sample size.
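Both rules are easy to check numerically. The sketch below assumes the logarithm in Sturges' rule is base 10, which is what the coefficient 3.3 implies; the function names are illustrative.

```python
import math


def classes_power_of_two(n: int) -> int:
    """Smallest k such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def classes_sturges(n: int) -> float:
    """Sturges' rule with a base-10 logarithm (assumption)."""
    return 1 + 3.3 * math.log10(n)


n = 50
k_rule = classes_power_of_two(n)   # first rule
k_sturges = classes_sturges(n)     # second rule, not yet rounded

print(k_rule, round(k_sturges, 2))
```

For n = 50 the two rules roughly agree: the first gives 6 classes and Sturges' rule gives a value between 6 and 7.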
Frequency Distribution
►Once desired classes (k) are known, the width of each class
can be found.
►The width is the range of numbers to put into each class.
►Determine the width of each class by
   $w = \text{interval width} = \dfrac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$
Class Boundaries
Frequency Distribution
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
►Find range: 58 - 12 = 46
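The range and a candidate class width for this data set can be computed as follows. This is a sketch: choosing k with the 2^k ≥ n rule and then rounding the width up to a convenient value such as 10 are assumptions, not steps stated on the slide.

```python
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

n = len(data)
data_range = max(data) - min(data)   # largest minus smallest value

# Number of classes: smallest k with 2**k >= n (assumed choice of rule).
k = 1
while 2 ** k < n:
    k += 1

raw_width = data_range / k           # width before rounding to a convenient value

print(n, data_range, k, raw_width)
```

A raw width of about 9.2 would typically be rounded up to 10, which gives classes such as "10 but less than 20".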
Interval    Frequency    Relative Frequency    Percentage
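The columns named above can be filled in from the 20 data values. The sketch below assumes five classes of width 10 starting at 10, with each class including its lower limit and excluding its upper limit; these choices are assumptions.

```python
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

start, width, k = 10, 10, 5          # assumed class layout
n = len(data)

table = []
for lower in range(start, start + k * width, width):
    upper = lower + width
    freq = sum(lower <= x < upper for x in data)   # count values in [lower, upper)
    rel = freq / n                                 # relative frequency
    table.append((f"{lower} but less than {upper}", freq, rel, 100 * rel))

for interval, freq, rel, pct in table:
    print(f"{interval:<22}{freq:>3}{rel:>7.2f}{pct:>6.0f}")
```

The frequencies must add up to n, and the relative frequencies to 1.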
Histograms
►A histogram is a graphical representation of a frequency
distribution.
[Figure: histogram of the frequency distribution (total of 20 observations). X-axis: Temperature in Degrees, 0 to 60; y-axis: Frequency. No gaps between bars.]
The Shapes of Histograms
The Consequences of Too Few or Too Many Classes
► Wide classes result in few class intervals
… graph. … much … distribution shape.
[Figure: "Weight Distribution" histogram with three wide classes: [8, 51], (51, 94], (94, 137].]
► Too many narrow classes have consequences:
   ► Result in a “jagged” histogram
   ► Some classes may be empty
   ► Does not summarize the data enough
[Figure: histogram with many narrow classes. X-axis: weight, 0 to 100; y-axis: Frequency, 0 to 4.]
The Ogive
The Cumulative Frequency Distribution
Interval               Frequency    Relative Frequency    Percentage    Cumulative Frequency    Cumulative Percentage
10 but less than 20    3            0.15                  15            3                       15
Total                  20           1                     100
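Cumulative frequencies are running totals of the class frequencies. The sketch below uses the 20 data values from the frequency distribution example and assumes classes of width 10 starting at 10.

```python
from itertools import accumulate

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

lower_limits = [10, 20, 30, 40, 50]   # assumed classes of width 10
n = len(data)

# Frequency of each class [lower, lower + 10).
freqs = [sum(lo <= x < lo + 10 for x in data) for lo in lower_limits]

# Running totals, then the same totals as percentages of n.
cum_freqs = list(accumulate(freqs))
cum_pcts = [100 * c / n for c in cum_freqs]

for lo, f, cf, cp in zip(lower_limits, freqs, cum_freqs, cum_pcts):
    print(f"{lo} but less than {lo + 10}: {f:>2} {cf:>3} {cp:>5.0f}%")
```

The ogive plots these cumulative percentages against the upper class limits, so the last point always reaches 100%.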
The Ogive Graphing Cumulative Frequencies
[Figure: ogive. Y-axis: Cumulative Percentage, 0 to 100; x-axis: class limits from 10 to 60.]
Stem-and-Leaf Diagram
Stem Leaves
2 1 4 4 6 7 7
3 0 2 8
4 1
Stem-and-Leaf Diagram
► Using the 100’s digit as the stem and the 10’s digit as the leaf
► In this illustration, the leaf digits have been sorted, although this
is not necessary.
► Data on scores:
   613, 632, 658, 717, 722, 750, 776, 827,
   841, 859, 863, 891, 894, 906, 928, 933,
   955, 982, 1034, 1047, 1056, 1140, 1169, 1224
Stem  Leaf
 6    135
 7    1257
 8    245699
 9    02358
10    345
11    46
12    2
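The display above can be generated with integer division. This is a minimal sketch: the stem is everything from the hundreds digit upward, the leaf is the tens digit, and the units digit is dropped.

```python
from collections import defaultdict

scores = [613, 632, 658, 717, 722, 750, 776, 827,
          841, 859, 863, 891, 894, 906, 928, 933,
          955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

stems = defaultdict(list)
for score in sorted(scores):          # sorting keeps the leaves in order
    stem = score // 100               # hundreds and above, e.g. 1034 -> 10
    leaf = (score // 10) % 10         # tens digit, e.g. 1034 -> 3
    stems[stem].append(leaf)

lines = [
    f"{stem:>2} | {''.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```

The length of each row of leaves shows how many scores fall in that stem, so the display doubles as a sideways histogram.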
Stem-and-Leaf Diagram
► The Stem-and-Leaf Diagram shows the distribution of data.
► For example, in the dataset above, you can see that most scores
fall in the 800s, with a few scores in the 1100s and 1200s.
Dot Plots
►A dot plot is the simplest graphical display of n individual
values of numerical data.
►Easy to understand.
►It reveals dispersion, central tendency, and the
shape of the distribution.
►If more than one data value lies at about the same axis
location, the dots are placed vertically.
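A text-only dot plot illustrates the stacking described above. The data values in this sketch are made up for illustration and do not come from the lecture.

```python
from collections import Counter

# Hypothetical data, chosen only to show repeated values stacking up.
values = [3, 4, 4, 5, 5, 5, 6, 6, 8]

counts = Counter(values)
lo, hi = min(values), max(values)
height = max(counts.values())         # tallest stack of dots

# Build the plot from the top row down; one column per integer on the axis.
rows = []
for level in range(height, 0, -1):
    row = "".join(" * " if counts[v] >= level else "   " for v in range(lo, hi + 1))
    rows.append(row.rstrip())

axis = "".join(f"{v:^3}" for v in range(lo, hi + 1))

print("\n".join(rows))
print(axis)
```

The tallest column marks the most frequent value, and the spread of columns along the axis shows the dispersion.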
Graphs for Time-Series Data
► A line chart (time-series plot) shows the values of a variable over time.
Relationships Between Variables
Cross Tables
► Tools: PivotTables
► 4 x 3 Cross Table for Investment Portfolios by Investor (values in millions of VND)
[Figure: "Investment Portfolio" bar chart. Categories: Savings, Stock market, Bond market, Insurance; vertical axis from 0 to 45.]
Scatter Plots
► Scatter plots can convey patterns in data pairs that would not be
apparent from a table.
Happiness Index    GDP Per Capita ($US)
9                  40,000
3                  10,230
4                  12,939
3                  9,383
6                  28,300
2                  4,000
[Figure: scatter plot "Happiness and GDP Per Capita". X-axis: Happiness Index, 0 to 12; y-axis: GDP per capita, 0 to 70,000.]
► The figure shows a scatter plot with Happiness Index on the X-axis and
GDP per Capita on the Y-axis.
► In this illustration, there seems to be an association between X and Y.
► That is, nations with higher …
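One simple way to see the association without a plot is to order the pairs by X and read down the Y values. The sketch below uses only the six rows visible in the table, so it is illustrative and says nothing about the strength of the relationship.

```python
# (happiness index, GDP per capita in $US): the six rows visible in the table.
pairs = [(9, 40000), (3, 10230), (4, 12939), (3, 9383), (6, 28300), (2, 4000)]

ordered = sorted(pairs)                       # sort by happiness index, then GDP
gdp_in_order = [gdp for _, gdp in ordered]

# True if GDP never decreases as the happiness index increases.
rises_with_happiness = all(a <= b for a, b in zip(gdp_in_order, gdp_in_order[1:]))

for happiness, gdp in ordered:
    print(f"{happiness:>2}  {gdp:>7,}")
print("GDP rises with happiness index:", rises_with_happiness)
```

For these six pairs the GDP values increase together with the happiness index, which matches the upward pattern in the scatter plot.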
► To qualitatively assess the strength of the relationship between variables, we
visually inspect how closely the data points cluster around a trendline.
► However, visual inspection is not easy and is not quantitative.
Exercise