
Page 1 of 17

Topic 3
Data Presentation Methods

Once the data are gathered from the sample or population, the next step is to present them in
an understandable way. As mentioned before, data remain just data unless something is done
to make them useful. There are three general methods of data presentation: textual presentation,
tabular presentation, and graphical presentation.

❖ Textual presentation uses a narrative to describe the data. For example:


“Out of 50 COVID19 survivors surveyed, 54% are males, 80% are married, and 90% are at least
55 years old.”

❖ Tabular presentation uses tables to present your data. A table consists of a title describing
the data being presented, and headings for particular entries such as the frequencies and
relative frequencies or percentages.

• A one-way table presents the categories of one variable with its corresponding
frequencies (Freq) and percentages (or relative frequencies).

Table 1. Distribution of COVID19 survivors by blood type


Blood type    Freq     %
A                7    14
B               12    24
AB               8    16
O               23    46
Total           50   100

This table can be constructed in another way:

Table 2. Distribution of COVID19 survivors by blood type


Blood type    Freq (𝑛 = 50)     %
A                   7          14
B                  12          24
AB                  8          16
O                  23          46

Table 1 and Table 2 show that 46% of the survivors have Type O blood, followed
by Type B blood (24%), Type AB blood (16%) and Type A blood (14%).
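The percentages and the ranking quoted above can be checked with a few lines of Python. This is only a sketch: the frequencies are copied from Table 1, and the variable names are illustrative, not part of the handout.

```python
# Frequencies copied from Table 1.
freq = {"A": 7, "B": 12, "AB": 8, "O": 23}

n = sum(freq.values())  # total number of survivors
percent = {blood: 100 * count / n for blood, count in freq.items()}

# List the categories from most to least common, as in the text.
ranked = sorted(percent, key=percent.get, reverse=True)

for blood in ranked:
    print(f"{blood:<4}{freq[blood]:>4}{percent[blood]:>6.0f}")
```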

• A two-way table (or 𝒓 𝒙 𝒄 table) presents the categories of each of the two variables (row
variable and column variable) with corresponding frequencies. It is also called a cross
tabulation (cross tabs) or 𝒓 𝒙 𝒄 contingency table.

hvvvalle
Table 3. Distribution of COVID19 survivors by blood type and sex

                     Sex
Blood type     Male    Female    Total
A                 4         3        7
B                 5         7       12
AB                3         5        8
O                15         8       23
Total            27        23       50

Table 3 shows that almost half of the survivors have Type O blood. Of the 27 male
survivors, the majority (15) have Type O blood.

This table is a 𝟒 𝒙 𝟐 table since there are 4 categories for the row variable (Blood type,
𝑟 = 4) and 2 categories for the column variable (Sex, 𝑐 = 2). You can add row
percentages or column percentages if you like.
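As a rough illustration of the row and column percentages mentioned above, the sketch below computes both from the counts in Table 3. The dictionary layout and the names `row_pct` and `col_pct` are assumptions made for this example; rounding to one decimal place is also a choice made here.

```python
# Counts copied from Table 3: blood type -> (male, female).
table = {"A": (4, 3), "B": (5, 7), "AB": (3, 5), "O": (15, 8)}

# Column totals (all males, all females).
col_totals = [sum(row[j] for row in table.values()) for j in range(2)]

row_pct, col_pct = {}, {}
for blood, row in table.items():
    row_total = sum(row)
    # Row %: share of each sex within one blood type.
    row_pct[blood] = [round(100 * x / row_total, 1) for x in row]
    # Column %: share of each blood type within one sex.
    col_pct[blood] = [round(100 * x / t, 1) for x, t in zip(row, col_totals)]
    print(blood, "row %:", row_pct[blood], "column %:", col_pct[blood])
```

For instance, the column percentage for Type O among males is 15 out of 27, which is what the statement about the majority of male survivors relies on.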

The variables in the preceding tables are qualitative, so tables of these types are easy to
construct: all you have to do is identify the categories and count how many entities belong
to each category. Now what if the data are quantitative? Example:
Number of siblings that a student has: 5, 6, 0, 2, 3, 3, 4, 13, 4, 4, 5, 6, 6, 7, 3, 3, 7,
10, 2, 3. A simple one-way table may be constructed, that is,

Table 4. Distribution of sibling number among respondents


No. of siblings Freq (𝑛 = 20) %
0 1 5
2 2 10
3 5 25
4 3 15
5 2 10
6 3 15
7 2 10
10 1 5
13 1 5

Table 4 shows that 5 or 25% of the respondents have 3 siblings. Only 1 respondent is
an only child.

This is fairly easy since there are only 20 data values and they are discrete. What if the
number of data values is large? Solution: Construct a frequency distribution table or FDT.
The table above is an FDT for ungrouped data (raw data).
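For ungrouped data, the counting can be sketched with Python's `collections.Counter`. The list below is the sibling data from the example; the variable names are illustrative.

```python
from collections import Counter

siblings = [5, 6, 0, 2, 3, 3, 4, 13, 4, 4, 5, 6, 6, 7, 3, 3, 7, 10, 2, 3]

counts = Counter(siblings)   # value -> frequency
n = len(siblings)

# One row per distinct value: value, frequency, percentage.
for value in sorted(counts):
    print(f"{value:>3}{counts[value]:>4}{100 * counts[value] / n:>6.0f}")
```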

• A frequency distribution table for grouped data is a table where frequencies are
determined from each class/interval. Classes may look like these:

17 − 20 2.5 − 4.6
21 − 24 or 4.7 − 6.8
25 − 28 6.9 − 9.0


It is necessary to construct equally spaced intervals, then find the corresponding frequency
and relative frequency per interval. How do you find these intervals? There are many ways to
find them. You may use statistical software, where all you have to do is enter the data and,
after a few clicks, an FDT is produced. You may even use an online frequency distribution
table calculator. But for this instance, let us do it using simple formulas, just to test how
well you can follow simple instructions.

Steps:

1. Solve for the 𝑅𝑎𝑛𝑔𝑒, where 𝑅𝑎𝑛𝑔𝑒 = 𝑀𝑎𝑥𝑖𝑚𝑢𝑚 𝑉𝑎𝑙𝑢𝑒 − 𝑀𝑖𝑛𝑖𝑚𝑢𝑚 𝑉𝑎𝑙𝑢𝑒. (Do
not round off your answer.)

2. Solve for 𝑘, the approximate number of classes/ intervals that can be constructed,
where 𝑘 = √𝑛 and 𝑛 is the total number of observations (or data values). This 𝑘
should be rounded off to the next higher integer (not the nearest integer) to
accommodate all observations. 𝑘 is an approximate value so when you construct your
FDT, the actual number of classes/intervals may or may not be equal to the computed
𝑘.

Example: 𝑛 = 24 so 𝑘 = √24 = 4.898979 … ≈ 𝟓 classes/intervals

Example: 𝑛 = 25 so 𝑘 = √25 = 𝟓 classes/intervals

Example: 𝑛 = 26 so 𝑘 = √26 = 5.099019 … ≈ 𝟔 classes/intervals

Why 6 instead of 5? → Because 6 is the next higher integer

3. Solve for the class width 𝑐, where 𝑐 = 𝑅𝑎𝑛𝑔𝑒/𝑘. This 𝑐 is rounded off to the same
number of decimal places as the observations in the data set (rounding off to the
nearest value).

Example:

Data Set   Data values                     # of decimal places   Computed c
A          10, 9, 1, 2, 5, 9, 3, …         0                     c = 3.6 ≈ 4
B          25, 22, 3, 10, …                0                     c = 6.1 ≈ 6
C          2.5, 2.3, 1.5, 8.5, …           1                     c = 2.6822 ≈ 2.7
D          12.4, 31.2, 18.8, …             1                     c = 5.4346 ≈ 5.4
E          1.76, 2.54, 1.98, …             2                     c = 0.2342 ≈ 0.23
F          11.25, 13.26, 8.31, 2.20, …     2                     c = 4.127112 ≈ 4.13
G          5.456, 3.145, 10.333, …         3                     c = 2.123056 ≈ 2.123
H          3.1, 2, 8.24, 6.235, 2.25, …    ?                     c = 3.82118 ≈ 3.821
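Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `number_of_classes` and `class_width` are hypothetical, and ties in rounding 𝑐 are assumed to round up, which the steps above do not specify. The range is passed as text so that decimal values are handled exactly.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def number_of_classes(n: int) -> int:
    """Step 2: square root of n, rounded up to the next integer."""
    root = math.isqrt(n)                      # whole-number part of sqrt(n)
    return root if root * root == n else root + 1

def class_width(data_range: str, k: int, decimals: int) -> Decimal:
    """Step 3: Range / k, rounded to the data's number of decimal places."""
    unit = Decimal(10) ** -decimals           # 1, 0.1, 0.01, ...
    # Assumption: ties are rounded up (ROUND_HALF_UP).
    return (Decimal(data_range) / k).quantize(unit, rounding=ROUND_HALF_UP)

print(number_of_classes(24), number_of_classes(25), number_of_classes(26))
print(class_width("13", 5, 0))     # a range of 13 split into 5 classes
print(class_width("30.7", 6, 1))   # a range of 30.7 split into 6 classes
```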


You are now ready to construct your FDT. The FDT includes three basic columns:
Classes/Intervals, Frequencies, Relative Frequencies (%). Columns for the True Class
Boundaries (𝑇𝐶𝐵), Class Marks (𝐶𝑀), Less than Cumulative Frequencies (< 𝐶𝐹), and
Greater than Cumulative Frequencies (> 𝐶𝐹) can also be added.

𝐶𝑙𝑎𝑠𝑠𝑒𝑠/𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙𝑠 𝐹𝑟𝑒𝑞 𝑅𝐹 (%) 𝑇𝐶𝐵 < 𝐶𝐹 > 𝐶𝐹 𝐶𝑀

4. Find the following:


• Lower and Upper Limits (𝑳𝑳 𝒂𝒏𝒅 𝑼𝑳). A class/interval consists of these two
limits.

𝑳𝑳𝟏 17 − 20 𝑼𝑳𝟏
𝑳𝑳𝟐 21 − 24 𝑼𝑳𝟐
𝑳𝑳𝟑 25 − 28 𝑼𝑳𝟑

The 𝑳𝑳𝟏 is usually the minimum in the data set; 𝑼𝑳𝟏 = 𝑳𝑳𝟏 + 𝒄 − 𝒑. What is 𝒑?
It is the precision of the data, which has to do with the number of decimal places
of the observations in the data set.

Example:

Observations                   # of decimal places of the observations   𝑝
3, 4, 0, 5, 11, …              0                                         1
5.1, 2.9, 8.0, …               1                                         0.1
2.34, 7.00, 4.06, …            2                                         0.01
6.231, 8.992, 0.008            3                                         0.001
11.4534, 5.6672, 4.2105, …     4                                         ?

To get 𝑳𝑳𝟐 , 𝒄 is added to 𝑳𝑳𝟏 . To get the 𝑼𝑳𝟐 , add 𝒄 to the 𝑼𝑳𝟏 .

To get 𝑳𝑳𝟑 , 𝒄 is added to the 𝑳𝑳𝟐 . To get 𝑼𝑳𝟑 , add 𝒄 to the 𝑼𝑳𝟐 .

This process continues until you have counted in the maximum observation in the
last class/interval.

• Freq is the number of observations within each class/interval. Tally the values if
you do not want to strain your eyes looking for them in the data set.
• 𝐑𝐞𝐥𝐚𝐭𝐢𝐯𝐞 𝐅𝐫𝐞𝐪𝐮𝐞𝐧𝐜𝐲 (RF, %) = (Freq / n) ∗ 100%
• Less than Cumulative Frequency (< 𝑪𝑭) is the number of observations that are
less than or equal to a specified upper limit 𝑼𝑳.
• Greater than Cumulative Frequency (> 𝑪𝑭) is the number of observations that
are greater than or equal to a specified lower limit 𝑳𝑳.
• If data are continuous, then the True Class Boundaries (𝑻𝑪𝑩) may be computed.
A 𝑻𝑪𝑩 is composed of the Lower True Class Boundary (𝑳𝑻𝑪𝑩) and the Upper
True Class Boundary (𝑼𝑻𝑪𝑩).


𝑳𝑻𝑪𝑩 = 𝑳𝑳 − 𝟎. 𝟓 ∗ 𝒑 𝑼𝑻𝑪𝑩 = 𝑼𝑳 + 𝟎. 𝟓 ∗ 𝒑

• Class Mark (𝑪𝑴) is the mean of the 𝑳𝑳 and 𝑼𝑳; 𝑪𝑴 = (𝑳𝑳 + 𝑼𝑳)/𝟐
Do not round off!
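The limit, boundary, and class mark formulas in Step 4 can be sketched together in Python. The function name `build_classes` and the dictionary keys are hypothetical choices for this example; values are passed as text and handled with `Decimal` so that sums such as 45.3 + 5.1 stay exact. True class boundaries are computed for every class here, although the handout applies them only to continuous data.

```python
from decimal import Decimal

def build_classes(minimum: str, maximum: str, c: str, p: str):
    """Return one dict per class with LL, UL, LTCB, UTCB and CM."""
    ll, top = Decimal(minimum), Decimal(maximum)
    c, p = Decimal(c), Decimal(p)
    classes = []
    while True:
        ul = ll + c - p                       # UL = LL + c - p
        classes.append({
            "LL": ll, "UL": ul,
            "LTCB": ll - p / 2,               # LL - 0.5 * p
            "UTCB": ul + p / 2,               # UL + 0.5 * p
            "CM": (ll + ul) / 2,              # mean of LL and UL, not rounded
        })
        if ul >= top:                         # the maximum is now covered
            return classes
        ll += c                               # next LL is c higher

# Illustrative call: minimum 45.3, maximum 76.0, c = 5.1, p = 0.1.
for row in build_classes("45.3", "76.0", "5.1", "0.1"):
    print(row["LL"], "-", row["UL"], "| TCB", row["LTCB"], "-", row["UTCB"], "| CM", row["CM"])
```

Because each new 𝑳𝑳 is 𝒄 higher than the previous one, each new 𝑼𝑳 is also 𝒄 higher, which matches the rule of adding 𝒄 to both limits.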

Numerical Example 1

Construct a frequency distribution for the number of siblings of BS Nursing students:

𝟓, 𝟔, 𝟎, 𝟐, 𝟑, 𝟑, 𝟒, 𝟏𝟑, 𝟒, 𝟒, 𝟓, 𝟔, 𝟔, 𝟕, 𝟑, 𝟑, 𝟕, 𝟏𝟎, 𝟐, 𝟑 → 𝒏 = 𝟐𝟎 𝒑=𝟏


Steps:

1. 𝑹𝒂𝒏𝒈𝒆 = 𝟏𝟑 − 𝟎 = 𝟏𝟑

2. 𝒌 = √𝟐𝟎 = 𝟒. 𝟒𝟕𝟐𝟏 … ≈ 𝟓 Why 5?

3. 𝒄 = 𝟏𝟑/𝟓 = 𝟐. 𝟔 ≈ 𝟑

4. Find the following:

Lower Limits and Upper Limits (𝑳𝑳) and (𝑼𝑳)

𝑳𝑳𝟏 = 𝟎 𝑼𝑳𝟏 = 𝑳𝑳𝟏 + 𝒄 − 𝒑 = 𝟎 + 𝟑 − 𝟏 = 𝟐


𝑳𝑳𝟐 = 𝑳𝑳𝟏 + 𝒄 = 𝟎 + 𝟑 = 𝟑 𝑼𝑳𝟐 = 𝑼𝑳𝟏 + 𝒄 = 𝟐 + 𝟑 = 𝟓
And so on…

Freq. For each interval, count how many observations from the data set are included.
For the interval 𝟎 − 𝟐, there are 3 observations (𝟎, 𝟐, 𝟐). For the interval 𝟑 − 𝟓, there
are 10 observations (𝟓, 𝟑, 𝟑, 𝟒, 𝟒, 𝟒, 𝟓, 𝟑, 𝟑, 𝟑), and so on. You know you are on the right
track if the largest value (𝑴𝒂𝒙𝒊𝒎𝒖𝒎 = 𝟏𝟑) is contained in the last interval and
the total frequency is equal to 𝒏 = 𝟐𝟎.

Relative Frequency (𝑹𝑭) for each interval. For the first interval, the 𝑹𝑭 is 15%.

𝑅𝐹 (%) = (𝐹𝑟𝑒𝑞/𝑛) ∗ 100% = (3/20) ∗ 100% = 15% …and so on….

Less than Cumulative Frequency (< 𝑪𝑭)

< 𝑪𝑭𝟏 = the number of observations that are less than or equal to 𝑼𝑳𝟏
< 𝑪𝑭𝟐 = the number of observations that are less than or equal to 𝑼𝑳𝟐

and so on…

Greater than Cumulative Frequency (> 𝑪𝑭)

> 𝑪𝑭𝟏 = the number of observations that are greater than or equal to 𝑳𝑳𝟏
> 𝑪𝑭𝟐 = the number of observations that are greater than or equal to 𝑳𝑳𝟐

and so on…


Class Mark (𝑪𝑴) for each interval.

𝑪𝑴𝟏 = (𝟎 + 𝟐)/𝟐 = 𝟏
𝑪𝑴𝟐 = 𝑪𝑴𝟏 + 𝒄 = 𝟏 + 𝟑 = 𝟒
and so on…

Table 5. Distribution of the Number of Siblings among Respondents


𝐶𝑙𝑎𝑠𝑠𝑒𝑠/𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙𝑠 𝑇𝑎𝑙𝑙𝑦 𝐹𝑟𝑒𝑞 𝑅𝐹(%) < 𝐶𝐹 > 𝐶𝐹 𝐶𝑀
0−2 ||| 3 15 3 20 1
3−5 |||| |||| 10 50 13 17 4
6−8 |||| 5 25 18 7 7
9 − 11 | 1 5 19 2 10
12 − 14 | 1 5 20 1 13
Total 20 100

Shortcut in getting the (< 𝑪𝑭): Let 𝐹𝑟𝑒𝑞1 be the < 𝑪𝑭𝟏 (the first entry of the < 𝐶𝐹 column in Table 5). Then,

< 𝑪𝑭𝟐 =< 𝑪𝑭𝟏 + 𝑭𝒓𝒆𝒒𝟐 = 𝟑 + 𝟏𝟎 = 𝟏𝟑


< 𝑪𝑭𝟑 =< 𝑪𝑭𝟐 + 𝑭𝒓𝒆𝒒𝟑 = 𝟏𝟑 + 𝟓 = 𝟏𝟖
< 𝑪𝑭𝟒 =< 𝑪𝑭𝟑 + 𝑭𝒓𝒆𝒒𝟒 = 𝟏𝟖 + 𝟏 = 𝟏𝟗
< 𝑪𝑭𝟓 =< 𝑪𝑭𝟒 + 𝑭𝒓𝒆𝒒𝟓 = 𝟏𝟗 + 𝟏 = 𝟐𝟎
Shortcut in getting the (> 𝑪𝑭): Let 𝑛 be the > 𝐶𝐹𝟏 (the first entry of the > 𝐶𝐹 column in Table 5). Then,

> 𝑪𝑭𝟐 => 𝑪𝑭𝟏 − 𝑭𝒓𝒆𝒒𝟏 = 𝟐𝟎 − 𝟑 = 𝟏𝟕


> 𝑪𝑭𝟑 => 𝑪𝑭𝟐 − 𝑭𝒓𝒆𝒒𝟐 = 𝟏𝟕 − 𝟏𝟎 = 𝟕
> 𝑪𝑭𝟒 => 𝑪𝑭𝟑 − 𝑭𝒓𝒆𝒒𝟑 =𝟕−𝟓=𝟐
> 𝑪𝑭𝟓 => 𝑪𝑭𝟒 − 𝑭𝒓𝒆𝒒𝟒 =𝟐−𝟏=𝟏

Questions:

1. How many respondents have 6 to 8 siblings? Ans: 5 → 𝑭𝒓𝒆𝒒𝟑


2. How many respondents have at least 9 siblings? Ans: 2 → > 𝑪𝑭𝟒
3. How many respondents have at most 5 siblings? Ans: 13 → < 𝑪𝑭𝟐
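The two cumulative frequency shortcuts, and the answers to Questions 2 and 3, can be reproduced with running totals. This is a small sketch using the Freq column of Table 5; the variable names are illustrative.

```python
from itertools import accumulate

freq = [3, 10, 5, 1, 1]          # Freq column of Table 5
n = sum(freq)

# <CF: running total of the frequencies, starting from the first class.
less_cf = list(accumulate(freq))

# >CF: start at n, then subtract the frequencies of all preceding classes.
greater_cf = [n - before for before in [0] + less_cf[:-1]]

print(less_cf)
print(greater_cf)
```

With classes numbered from 1, `greater_cf[3]` is > 𝑪𝑭𝟒 (at least 9 siblings) and `less_cf[1]` is < 𝑪𝑭𝟐 (at most 5 siblings).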

https://www.youtube.com/watch?v=DW1lsZnaP8Q&t=1313s


Numerical Example 2

Construct the FDT of the weights (in kg) of a sample of female COVID19 survivors given
below.

56.3 66.0 56.4 63.0 61.3


49.0 53.2 68.2 56.2 57.6
66.3 60.0 59.0 66.0 48.8
69.0 54.0 73.0 56.1 56.0
61.3 53.8 68.2 54.9 63.2
58.3 56.2 63.2 66.7 57.4
50.6 76.0 45.3 58.0 64.9

𝒏 = 𝟑𝟓, 𝒑 = 𝟎. 𝟏
Steps:

1. 𝑹𝒂𝒏𝒈𝒆 = 𝟕𝟔. 𝟎 − 𝟒𝟓. 𝟑 = 𝟑𝟎. 𝟕

2. 𝒌 = √𝟑𝟓 = 𝟓. 𝟗𝟏𝟔𝟎 … ≈ 𝟔
3. 𝒄 = 𝟑𝟎.𝟕/𝟔 = 𝟓.𝟏𝟏𝟔𝟔… ≈ 𝟓.𝟏

4. Find the 𝑳𝑳 and 𝑼𝑳, and other values.

𝑳𝑳𝟏 = 𝟒𝟓. 𝟑 𝑼𝑳𝟏 = 𝑳𝑳𝟏 + 𝒄 − 𝒑 = 𝟒𝟓. 𝟑 + 𝟓. 𝟏 − 𝟎. 𝟏 = 𝟓𝟎. 𝟑


𝑳𝑳𝟐 = 𝑳𝑳𝟏 + 𝒄 = 𝟒𝟓. 𝟑 + 𝟓. 𝟏 = 𝟓𝟎. 𝟒 𝑼𝑳𝟐 = 𝑼𝑳𝟏 + 𝒄 = 𝟓𝟎. 𝟑 + 𝟓. 𝟏 = 𝟓𝟓. 𝟒 And so on…

𝑪𝑴𝟏 = (𝑳𝑳𝟏 + 𝑼𝑳𝟏)/𝟐 = (𝟒𝟓.𝟑 + 𝟓𝟎.𝟑)/𝟐 = 𝟒𝟕.𝟖     𝑪𝑴𝟐 = 𝑪𝑴𝟏 + 𝒄 = 𝟒𝟕.𝟖 + 𝟓.𝟏 = 𝟓𝟐.𝟗 And so on…

Include the 𝑻𝑪𝑩𝒔 in the computation.

𝑳𝑻𝑪𝑩 = 𝑳𝑳 − 𝟎. 𝟓 ∗ 𝒑 𝑼𝑻𝑪𝑩 = 𝑼𝑳 + 𝟎. 𝟓 ∗ 𝒑

𝐿𝑇𝐶𝐵1 = 𝐿𝐿1 − 0.5 ∗ 0.1 = 45.3 − 0.5 ∗ 0.1 = 45.25


𝐿𝑇𝐶𝐵2 = 𝐿𝑇𝐶𝐵1 + 𝑐 = 45.25 + 5.1 = 50.35
𝐿𝑇𝐶𝐵3 = 𝐿𝑇𝐶𝐵2 + 𝑐 = 50.35 + 5.1 = 55.45 And so on….

𝑈𝑇𝐶𝐵1 = 𝑈𝐿1 + 0.5 ∗ 0.1 = 50.3 + 0.5 ∗ 0.1 = 50.35


𝑈𝑇𝐶𝐵2 = 𝑈𝑇𝐶𝐵1 + 𝑐 = 50.35 + 5.1 = 55.45
𝑈𝑇𝐶𝐵3 = 𝑈𝑇𝐶𝐵2 + 𝑐 = 55.45 + 5.1 = 60.55 And so on….


Table 6. Distribution of the weights (kg) of a sample of COVID survivors


𝑪𝒍𝒂𝒔𝒔𝒆𝒔 𝑭𝒓𝒆𝒒 𝑹𝑭(%) < 𝑪𝑭 > 𝑪𝑭 𝑻𝑪𝑩 𝑪𝑴
45.3 − 50.3 3 9 3 35 45.25 − 50.35 47.8
50.4 − 55.4 5 14 8 32 50.35 − 55.45 52.9
55.5 − 60.5 12 34 20 27 55.45 − 60.55 58.0
60.6 − 65.6 6 17 26 15 60.55 − 65.65 63.1
65.7 − 70.7 7 20 33 9 65.65 − 70.75 68.2
70.8 − 75.8 1 3 34 2 70.75 − 75.85 73.3
75.9 − 80.9 1 3 35 1 75.85 − 80.95 78.4
Total 35 100
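The whole of Table 6 can be rebuilt from the raw weights as a check. The sketch below assumes the values of 𝒄 and 𝒑 found in the steps above and rounds the relative frequencies to whole percentages, as the table does; all variable names are illustrative.

```python
from decimal import Decimal
from itertools import accumulate

weights = """56.3 66.0 56.4 63.0 61.3 49.0 53.2 68.2 56.2 57.6
66.3 60.0 59.0 66.0 48.8 69.0 54.0 73.0 56.1 56.0
61.3 53.8 68.2 54.9 63.2 58.3 56.2 63.2 66.7 57.4
50.6 76.0 45.3 58.0 64.9""".split()
data = [Decimal(w) for w in weights]

c, p = Decimal("5.1"), Decimal("0.1")   # class width and precision
n = len(data)

# Class limits: start at the minimum and stop once the maximum is covered.
ll, classes = min(data), []
while not classes or classes[-1][1] < max(data):
    classes.append((ll, ll + c - p))
    ll += c

freq = [sum(low <= x <= high for x in data) for low, high in classes]
rf = [round(100 * f / n) for f in freq]
less_cf = list(accumulate(freq))
greater_cf = [n - b for b in [0] + less_cf[:-1]]

for (low, high), f, r, lc, gc in zip(classes, freq, rf, less_cf, greater_cf):
    tcb = f"{low - p / 2} - {high + p / 2}"
    print(f"{low} - {high}  {f:>2} {r:>3} {lc:>3} {gc:>3}  {tcb}  {(low + high) / 2}")
```

Note that the loop produces 7 classes even though 𝒌 = 6, which illustrates the earlier remark that the actual number of classes may differ from the computed 𝒌.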

Practice:

1. If 𝒏 = 𝟔𝟕, what is the approximate value of 𝒌? ________


2. In the interval 𝟒𝟐. 𝟓𝟔 − 𝟓𝟒. 𝟔𝟐, what is the precision of the data? ________
3. Refer to no.2. What is its class mark? __________
4. Refer to no.2. What is the value of 𝒄? __________
5. Refer to no.2. What is the value of the 3rd lower limit? _______ 4th upper limit? _________
6. Given the True Class Boundary 𝟏𝟐𝟑. 𝟒𝟓𝟓𝟓 − 𝟒𝟓𝟔. 𝟕𝟖𝟗𝟓, what is the precision of the data? ____
7. Refer to no.6. What is the value of 𝒄? __________
8. Refer to no.6. What is its Lower Limit? __________ Upper limit?___________
9. Refer to no.6. What is the corresponding Class mark? _______________


❖ Graphical presentation makes use of graphs, charts, or any other means to visually display
the data. Graphs and charts catch the viewer's attention more easily than textual and
tabular presentation.

Types of Graphs

❖ Bar chart – Each bar represents a category of a variable. If the bars lie on the x-axis, the
frequencies (counts) or relative frequencies lie on the y-axis, and vice versa; bars can also
be clustered or stacked. Referring to Table 1 on page 1, the variable blood type is
qualitative (categorical) with four categories (𝐀, 𝐁, 𝐀𝐁, 𝐎). In that table, 𝑛 = 50
and the highest frequency is 23, for blood type O. To construct its bar chart,
do not calibrate your y-axis from 0 to 50; instead calibrate it from 0 up to roughly the
highest frequency, in multiples of 5 where applicable. In the graph below,
23 is between 20 and 25. Point to ponder: What will your graph look like if you calibrate
it from 1 to 50? Take note also that the variable under consideration has 4 distinct
categories, so the bars must be separated by spaces. You may use Excel, statistical
software, or an online tool to construct this. If none of these is available, you can do it
manually. Provide an appropriate title below the graph, not above it.

Fig. 1. Distribution of COVID 19 survivors by blood type

❖ Pie chart – As the name suggests, it looks like a pie (round, not the pan-pizza type). A slice
represents a category of the variable under consideration. One advantage of the pie chart
is that it shows each portion of the data (slice) relative to the whole.
Each slice has a corresponding angle measurement (a circle has 360 degrees). Referring
to Table 1 on page 1, the corresponding angle measurements are computed as follows:

Blood type Degrees


A 14% 𝑜𝑓 360 = 0.14 ∗ 360 = 50.4
B 24% 𝑜𝑓 360 = 0.24 ∗ 360 = 86.4
AB 16% 𝑜𝑓 360 = 0.16 ∗ 360 = 57.6
O 46% 𝑜𝑓 360 = 0.46 ∗ 360 = 165.6
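The same angle computation can be written as a short Python sketch. The percentages come from Table 1; the variable names are illustrative.

```python
percent = {"A": 14, "B": 24, "AB": 16, "O": 46}   # % column of Table 1

# Each slice gets the same share of 360 degrees as its share of the data.
degrees = {blood: pct * 360 / 100 for blood, pct in percent.items()}

for blood, angle in degrees.items():
    print(f"{blood:<3}{angle:>7.1f}")
```

A quick sanity check is that the angles add up to 360 degrees, just as the percentages add up to 100.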


Using a compass and a protractor, you make a circle and measure the degrees for each
angle corresponding to a particular category. You may use Excel or a statistical software
or an online tool to construct this. Provide an appropriate title below the graph.

Fig. 2. Distribution of COVID 19 survivors by blood type

❖ Stem-and-Leaf Plot – It presents numerical data in terms of “stems” and “leaves”. For
example, you have the number 29. Its stem is 2 and its leaf is 9. For the number 290, its
stem is 29 and its leaf is 0.

Steps in constructing a stem-and-leaf plot:

1. Arrange the observations in ascending order.


2. Draw a vertical line.
3. Write the stems on the left of the line and the leaves on the right.

Example 1

These are the scores of students in their first Biostatistics quiz:

34, 45, 23, 22, 20, 10, 9, 8, 40, 31, 60, 63

In ascending order: 8, 9, 10, 20, 22, 23, 31, 34, 40, 45, 60, 63

Write the stems first. The stems that appear in our data are 0, 1, 2, 3, 4, and 6; a
single-digit score such as 8 has stem 0, since 8 is the same as 08. Stem 5 has no
observations, so it is listed in the plot with no leaves.

So, for the stem 0, there are two observations (8 and 9) thus the leaves are 8 and 9
respectively.
For the stem 1, there is only one observation (10) so the leaf is 0.
For the stem 2, there are three observations (20, 22, 23) so the leaves are 0, 2, and 3
respectively.


For the stem 3, there are two observations (31 and 34) so the leaves are 1 and 4
respectively. And so on…

0 | 8 9
1 | 0
2 | 0 2 3
3 | 1 4
4 | 0 5
5 |
6 | 0 3
Key: 2|1=21

Fig. 3. Scores of students in first Biostatistics quiz

Example 2

Construct the stem-and-leaf plot of the systolic blood pressures of a sample of baseball
athletes before a big game. The systolic blood pressure is the pressure exerted when
blood is ejected into the arteries and is the upper number in blood pressure
measurements (e.g., 110/70).

110, 120, 118, 110, 109, 90, 115, 110, 113, 108, 100

In ascending order: 90, 100, 108, 109, 110, 110, 110, 113, 115, 118, 120

Now the stems are 9, 10, 11, and 12. Write the corresponding leaves. Do
not forget to write the key.

 9 | 0
10 | 0 8 9
11 | 0 0 0 3 5 8
12 | 0

Key: 12|5=125

Fig. 4. Systolic blood pressures of a sample of baseball athletes prior to a big game
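The three construction steps can be sketched as a small Python function for non-negative whole numbers, where the last digit is the leaf and the remaining digits form the stem. The function name `stem_and_leaf` is hypothetical, and keeping empty stems between the smallest and largest stem is a choice made here to match Example 1.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return plot lines with the last digit as leaf and the rest as stem."""
    leaves = defaultdict(list)
    for v in sorted(values):                  # Step 1: ascending order
        leaves[v // 10].append(v % 10)
    # Keep empty stems between the smallest and largest stem.
    stems = range(min(leaves), max(leaves) + 1)
    # Steps 2 and 3: stems left of the line, leaves right of it.
    return [f"{s:>2} | {' '.join(map(str, leaves[s]))}".rstrip() for s in stems]

quiz = [34, 45, 23, 22, 20, 10, 9, 8, 40, 31, 60, 63]
pressure = [110, 120, 118, 110, 109, 90, 115, 110, 113, 108, 100]

print("\n".join(stem_and_leaf(quiz)))
print("\n".join(stem_and_leaf(pressure)))
```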

Example 3

Given below are average ratings (1.0 to 5.0) given to an instructor by some of his students.

1.5, 2.4, 4.5, 5.0, 3.5, 4.3, 3.4, 2.1, 4.0, 4.3, 3.7, 5.0, 3.2, 1.9, 3.0

In ascending order:

1.5, 1.9, 2.1, 2.4, 3.0, 3.2, 3.4, 3.5, 3.7, 4.0, 4.3, 4.3, 4.5, 5.0, 5.0


1 | 5 9
2 | 1 4
3 | 0 2 4 5 7
4 | 0 3 3 5
5 | 0 0

Key: 2|3=2.3

Fig. 5. Average ratings given to an instructor by his students

Example 4

Suppose the final grades of 24 students in Biostatistics are given below.

1.00 1.25 1.25 2.25 2.50 2.75 3.00 4.00 3.00 2.25 2.50 1.50
1.75 2.00 2.25 2.75 2.50 1.25 1.50 1.75 3.00 4.00 5.00 5.00

In ascending order:

1.00 1.25 1.25 1.25 1.50 1.50 1.75 1.75 2.00 2.25 2.25 2.25
2.50 2.50 2.50 2.75 2.75 3.00 3.00 3.00 4.00 4.00 5.00 5.00

Key: 1.2|5=1.25

Fig. 6. Final grades of students in Biostatistics

Example 5

Suppose your data consist of the scores of 2BSND students in the 1st statistics quiz and
their sexes. Construct the stem-and-leaf plot of the data.

Scores 15 45 33 23 32 54 50 48 19 36 41 40 26 18 38 44 10 28 25 10 20
Sex M M F M F F F M M F M F M M F F F F M M F


In ascending order:

Scores 10 10 15 18 19 20 23 25 26 28 32 33 36 38 40 41 44 45 48 50 54
Sex F M M M M F M M M F F F F F F M F M M F F

For stem 1, the leaf for females is 0 (stem then going to the left). For males, the leaves
are 0, 5, 8, and 9.

For stem 2, the leaves for females are 0 and 8. For males, the leaves are 3, 5, and 6.
And so on….

The stem-and-leaf plot is now in back-to-back form.

     FEMALE | Stem | MALE
          0 |  1   | 0 5 8 9
        8 0 |  2   | 3 5 6
    8 6 3 2 |  3   |
        4 0 |  4   | 1 5 8
        4 0 |  5   |
Key (female): 1|3=31          Key (male): 1|3=13
Fig. 7. Scores of 2BSND students in the 1st statistics quiz
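A back-to-back plot like Fig. 7 can be sketched in Python by splitting the leaves by sex. The helper name `leaves` and the column widths are assumptions for this illustration; the scores and sexes are the ones listed in Example 5.

```python
scores = [15, 45, 33, 23, 32, 54, 50, 48, 19, 36, 41,
          40, 26, 18, 38, 44, 10, 28, 25, 10, 20]
sexes = "M M F M F F F M M F M F M M F F F F M M F".split()

def leaves(sex, stem):
    """Sorted leaves (last digits) of the scores for one sex and one stem."""
    return sorted(s % 10 for s, x in zip(scores, sexes)
                  if x == sex and s // 10 == stem)

for stem in range(1, 6):
    # Female leaves are reversed so they read outward from the stem.
    left = " ".join(str(d) for d in reversed(leaves("F", stem)))
    right = " ".join(str(d) for d in leaves("M", stem))
    print(f"{left:>8} | {stem} | {right}")
```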

❖ Dot Plot – It is also known as a dot chart. It uses dots to present the frequency
of an observation in a category. One dot is equivalent to one observation, so just count how
many observations belong to each category.

Example 6

Suppose these are the favorite dog breeds of some BS Biology students:

Student Dog Breed Student Dog Breed Student Dog Breed


1 Pomeranian 11 Siberian Husky 21 Dachshund
2 Chihuahua 12 Golden Retriever 22 Poodle
3 Golden Retriever 13 German Shepherd 23 Chihuahua
4 Siberian Husky 14 Bulldog 24 Bulldog
5 Dachshund 15 Chihuahua 25 Golden Retriever
6 Pomeranian 16 Dalmatian 26 Golden Retriever
7 Dachshund 17 German Shepherd 27 Dalmatian
8 Poodle 18 Golden Retriever 28 Bulldog
9 Pomeranian 19 Siberian Husky 29 Siberian Husky
10 Dachshund 20 Pomeranian 30 Pomeranian

The dot plot can be done in two ways.


Fig. 8. Favorite dog breeds of BS Biology students
Fig. 9. Favorite dog breeds of BS Biology students
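A text version of the dot plot for Example 6 can be sketched as follows. The list reproduces the favorite breeds of students 1 to 30 in order; printing one period per observation is only a stand-in for the dots in the figures.

```python
from collections import Counter

# Favorite breeds of students 1 to 30, in the order listed in Example 6.
breeds = [
    "Pomeranian", "Chihuahua", "Golden Retriever", "Siberian Husky", "Dachshund",
    "Pomeranian", "Dachshund", "Poodle", "Pomeranian", "Dachshund",
    "Siberian Husky", "Golden Retriever", "German Shepherd", "Bulldog", "Chihuahua",
    "Dalmatian", "German Shepherd", "Golden Retriever", "Siberian Husky", "Pomeranian",
    "Dachshund", "Poodle", "Chihuahua", "Bulldog", "Golden Retriever",
    "Golden Retriever", "Dalmatian", "Bulldog", "Siberian Husky", "Pomeranian",
]
counts = Counter(breeds)

# One dot per student, most popular breeds first.
for breed, count in counts.most_common():
    print(f"{breed:<17}{'.' * count}")
```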

Example 7

The scores of students in a 10-item quiz are shown below. Construct its dot plot.

1, 1, 1, 0, 5, 6, 8, 9, 3, 4, 5, 10, 7, 6, 5, 7, 4, 5, 8, 9, 4, 6, 7, 3

Fig. 10. Scores of students in 10-item quiz
Fig. 11. Scores of students in 10-item quiz
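For numerical data such as the quiz scores in Example 7, a dot plot can be sketched the same way, with one row per possible score. Showing scores that nobody obtained as empty rows is a choice made for this illustration.

```python
from collections import Counter

scores = [1, 1, 1, 0, 5, 6, 8, 9, 3, 4, 5, 10, 7, 6, 5, 7, 4, 5, 8, 9, 4, 6, 7, 3]
counts = Counter(scores)

# One dot per observation; scores with no observations get an empty row.
for score in range(0, 11):
    print(f"{score:>2} | {'.' * counts[score]}")
```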

❖ Pictograph – This graph presents the frequency of your data using pictures or symbols. For
example, if you want to present the population of a certain country, then a human image
is the best image to use. Each human image stands for a corresponding number of
people. For penguin sightings, a pictograph is shown below:

Fig. 12. Penguin sightings in area XYZ

https://www.subjectcoach.com/imagecdn/prep-k/xpictograph1.png.pagespeed.ic.G3efOd_9Cv.png


❖ Some graphs associated with the Frequency Distribution Table

There are graphs that can be made from the FDT. Three of these are the bar graph, the
histogram, and the frequency polygon.

• Bar graph – The classes/intervals are plotted on the x-axis and the frequencies (or
relative frequencies) are plotted on the y-axis. It is best used when data are discrete.

Example: Use Table 5 from page 6.


Table 5. Distribution of the Number of Siblings among Respondents
𝐶𝑙𝑎𝑠𝑠𝑒𝑠/𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙𝑠 𝐹𝑟𝑒𝑞 𝑅𝐹(%)
0−2 3 15
3−5 10 50
6−8 5 25
9 − 11 1 5
12 − 14 1 5
Total 20 100
[Bar graph: Freq on the y-axis; classes 0−2, 3−5, 6−8, 9−11, 12−14 on the x-axis]

Fig. 13. Distribution of the Number of Siblings among Respondents

• Histogram – The true class boundaries (𝑻𝑪𝑩) are plotted on the x-axis and the
frequencies (or relative frequencies) are plotted on the y-axis. It is best used when
data are continuous.


Example: Use Table 6 from page 8.

Table 6. Distribution of the weights (kg) of a sample of COVID survivors


𝐶𝑙𝑎𝑠𝑠𝑒𝑠 𝐹𝑟𝑒𝑞 𝑅𝐹(%) True Class Boundaries
45.3 − 50.3 3 9 45.25 − 50.35
50.4 − 55.4 5 14 50.35 − 55.45
55.5 − 60.5 12 34 55.45 − 60.55
60.6 − 65.6 6 17 60.55 − 65.65
65.7 − 70.7 7 20 65.65 − 70.75
70.8 − 75.8 1 3 70.75 − 75.85
75.9 − 80.9 1 3 75.85 − 80.95
Total 35 100

The histogram using this FDT is shown below:

[Histogram: Freq on the y-axis; true class boundaries 45.25, 50.35, 55.45, 60.55, 65.65, 70.75, 75.85, 80.95 on the x-axis]

Fig. 14. Distribution of the weights (kg) of a sample of COVID survivors


• Frequency polygon – The class marks (𝑪𝑴) are plotted on the x-axis and the frequencies
(or relative frequencies) are plotted on the y-axis. For a figure to be called a polygon,
it should be a closed figure. To close it, add two extra points with frequency equal to 0:
one at the first class mark minus 𝒄, and one at the last class mark plus 𝒄.

Example: Use Table 5 from page 6.

Table 5. Distribution of the Number of Siblings among Respondents


𝐶𝑙𝑎𝑠𝑠𝑒𝑠/𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙𝑠 𝑇𝑎𝑙𝑙𝑦 𝐹𝑟𝑒𝑞 𝑅𝐹(%) < 𝐶𝐹 > 𝐶𝐹 𝐶𝑀
0−2 ||| 3 15 3 20 1
3−5 |||| |||| 10 50 13 17 4
6−8 |||| 5 25 18 7 7
9 − 11 | 1 5 19 2 10
12 − 14 | 1 5 20 1 13
Total 20 100

Upon subtracting 𝒄 from the first class mark and adding 𝒄 to the last class mark, and giving
both new points a frequency of 0, we get these values:

𝑪𝑴 𝑭𝒓𝒆𝒒
−2 0
1 3
4 10
7 5
10 1
13 1
16 0

[Frequency polygon: Freq on the y-axis; class marks (CM) on the x-axis]

Fig. 15. Number of siblings among respondents
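The points of the frequency polygon, including the two closing points, can be generated with a short Python sketch. The class marks, frequencies, and class width are taken from Table 5; the variable names are illustrative.

```python
class_marks = [1, 4, 7, 10, 13]   # CM column of Table 5
freq = [3, 10, 5, 1, 1]           # Freq column of Table 5
c = 3                             # class width

# Add one zero-frequency point on each side so the polygon closes on the x-axis.
xs = [class_marks[0] - c] + class_marks + [class_marks[-1] + c]
ys = [0] + freq + [0]
points = list(zip(xs, ys))

print(points)
```

Joining these points in order with straight lines gives the polygon.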

Practice: Construct the frequency polygon of Table 6 from page 8.
