
Module 3 Measures of Central Tendency and Variation

At the end of the module the students should be able to:


1. define and enumerate some of the characteristics of mean, median, and mode;
2. calculate and interpret measures of central tendency, such as the mean,
median, and mode.
3. differentiate measures of central tendency from measures of variation.
4. calculate and interpret measures of dispersion, such as the range, variance, and
standard deviation.

MEASURES OF CENTRAL TENDENCY UNGROUPED DATA

*Central Tendency – value/s that represent the whole set of data.

MEAN ($\bar{x}$)
- computational average
- the sum of all n values divided by the total frequency

• Arithmetic Mean
$\bar{x} = \dfrac{\sum x}{n}$
Where: x represents the value of an observation
n represents the total number of observations
• Weighted Mean
$w\bar{x} = \dfrac{\sum wx}{\sum w}$
Where: x represents each of the item values
w represents the weight of each item value
$w\bar{x} = \dfrac{\sum fx}{n}$
Where: f represents the frequency
n represents the sample size

• Properties of the Mean:


1. Always a unique value in any set of data.
2. Associated with the interval or ratio data.
3. Strongly influenced by the extreme values in a set of data.
4. Most reliable measure of central tendency.
MEDIAN ($\tilde{x}$)
- Positional average
- the centermost or middlemost observation or value (when n is odd) or the
average of the two middle values (when n is even) when the data are arranged
(either ascending or descending)
- divides the set of data into two equal parts (half of the observations belong to the
higher 50%, while the other half belong to the lower 50% of the group)

• Properties of the Median:


1. Always a unique value in any set of data.
2. Associated with ordinal data.
3. Is not affected by extreme values.
4. A positional measure.

MODE ($\hat{x}$)
- Nominal average
- the most frequently occurring score in a distribution
- the observation or value which appears the most number of times in the set of
values

• Properties of the Mode:


1. Not affected by extreme values.
2. It may not exist.
3. If the mode exists, it may not always be unique.
4. In finding the mode, we do not consider all the values in the distribution.
5. Associated with nominal data.

Examples:

Find the mean, median and mode of the following set of data.

1. 17 25 34 25 27 19 24

$\bar{x} = \dfrac{17+25+34+25+27+19+24}{7} = \dfrac{171}{7} \approx 24.43$
• In getting the median, arrange first the data (either ascending or descending),
then get the middlemost (if n is odd) or the average of the two middle values
(if n is even).

𝑥̃ ⇒ 17, 19, 24, 25, 25, 27, 34


𝑥̃ = 25

𝑥̂ = 25
2. 40 52 50 48 56 60 37 65 40 50 65

$\bar{x} = \dfrac{40(2)+52+50(2)+48+56+60+37+65(2)}{11} = \dfrac{563}{11} \approx 51.18$

𝑥̃ ⇒ 37, 40, 40, 48, 50, 50, 52, 56, 60, 65, 65
𝑥̃ = 50

𝑥̂ = 40, 50 and 65

3. 87 94 36 56 54 76 87 54 87 36

$\bar{x} = \dfrac{667}{10} = 66.7$

𝑥̃ ⇒ 36, 36, 54, 54, 56, 76, 87, 87, 87, 94


$\tilde{x} = \dfrac{56+76}{2} = \dfrac{132}{2} = 66$

𝑥̂ = 87

4. 21 23 16 15 26 27 19 24

$\bar{x} = \dfrac{171}{8} = 21.375 \approx 21.38$

𝑥̃ ⇒ 15, 16, 19, 21, 23, 24, 26, 27


$\tilde{x} = \dfrac{21+23}{2} = \dfrac{44}{2} = 22$

𝑥̂ = no mode
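
The four worked examples above can be checked with a short script. What follows is only a minimal sketch in Python using the standard-library `statistics` and `collections` modules; the helper name `summarize` is hypothetical and not part of the module. It assumes the convention used in Example 4, namely that a data set in which no value repeats is reported as having no mode, and it does not handle an empty list.

```python
from collections import Counter
from statistics import mean, median


def summarize(data):
    """Return (mean rounded to 2 places, median, list of modes)."""
    counts = Counter(data)
    highest = max(counts.values())
    # Assumed convention: if no value repeats, there is no mode (empty list).
    if highest > 1:
        modes = sorted(v for v, c in counts.items() if c == highest)
    else:
        modes = []
    return round(mean(data), 2), median(data), modes


examples = [
    [17, 25, 34, 25, 27, 19, 24],
    [40, 52, 50, 48, 56, 60, 37, 65, 40, 50, 65],
    [87, 94, 36, 56, 54, 76, 87, 54, 87, 36],
    [21, 23, 16, 15, 26, 27, 19, 24],
]

for data in examples:
    x_bar, x_med, x_modes = summarize(data)
    print(x_bar, x_med, x_modes or "no mode")
```

Returning the modes as a list lets the same function cover the unimodal, multimodal, and no-mode cases without special return types.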
➢ Weighted Mean

1. Suppose we are interested in computing the weighted mean of a BS Math student
in a certain university where he is enrolled in 6 subjects having different unit loads, as
follows:

Subject    No. of units (w)    Grade (x)    wx
1          5                   2.25         11.25
2          3                   2.75         8.25
3          4                   3.00         12.00
4          3                   1.25         3.75
5          1                   2.00         2.00
6          2                   2.00         4.00
           ∑w = 18                          ∑wx = 41.25

$w\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{41.25}{18} \approx 2.29$

2. If 8 000 books of Algebra were sold at ₱320 each, 1 500 Business Mathematics at
₱380 each, 1 000 Mathematics of Investment at ₱300 each and 3 500 Statistics at
₱340 each, find the weighted mean sales for the four books.

Book Title                   No. of books (w)    Price (x)    wx
Algebra                      8 000               ₱320         2 560 000
Business Mathematics         1 500               ₱380         570 000
Mathematics of Investment    1 000               ₱300         300 000
Statistics                   3 500               ₱340         1 190 000
                             ∑w = 14 000                      ∑wx = 4 620 000

$w\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{4\,620\,000}{14\,000} = ₱330.00$

3. Miss Z has 21 students in a specific subject. These students were asked how often
Miss Z gives assignments. Of these students, 18 answered (4) very often, 2 answered
(3) often, 1 answered (2) seldom and nobody answered (1) never.

$w\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{18(4)+2(3)+1(2)+0(1)}{21} = \dfrac{80}{21} \approx 3.81$ (very often)
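
The three weighted-mean examples follow the same pattern: multiply each value by its weight, add the products, and divide by the sum of the weights. As a hedged illustration only, the sketch below writes that pattern as a small Python function; the name `weighted_mean` and its argument order are assumptions made for this example.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Example 1: grades weighted by number of units
grades = [2.25, 2.75, 3.00, 1.25, 2.00, 2.00]
units = [5, 3, 4, 3, 1, 2]
print(round(weighted_mean(grades, units), 2))

# Example 2: book prices weighted by number of books sold
prices = [320, 380, 300, 340]
books_sold = [8000, 1500, 1000, 3500]
print(weighted_mean(prices, books_sold))

# Example 3: survey ratings weighted by number of respondents
ratings = [4, 3, 2, 1]
respondents = [18, 2, 1, 0]
print(round(weighted_mean(ratings, respondents), 2))
```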
Module 3
Name:______________________________________Score:_________________
Section:_____________________________________Date:__________________

Activity 1
Measures of Central Tendency Ungrouped

Find the mean, median and mode of the following data.

a. 21 10 36 42 39 52 30 25 26

𝑥̅ = _________ 𝑥̃ = _________ 𝑥̂ = _________

b. 21 55 25 30 26 36 42 39 36 25

𝑥̅ = _________ 𝑥̃ = _________ 𝑥̂ = _________

c. 108 120 154 118 125 164 135

𝑥̅ = _________ 𝑥̃ = _________ 𝑥̂ = _________

d. 31 21 16 15 21 27 19 18

𝑥̅ = _________ 𝑥̃ = _________ 𝑥̂ = _________

e. 87 94 36 56 54 76 87 85 68 56 78 88

𝑥̅ = _________ 𝑥̃ = _________ 𝑥̂ = _________

f. A student gets the following grades in his five subjects: 87 for Calculus, 82 for
Physics, 79 for Chemistry, 81 for English and 83 for History. Compute for his mean
grade if the weights for the five subjects are 5.0, 4.0, 4.0, 3.0 and 3.0, respectively.
𝑥̅ = _________

g. It was recorded that 5 brands of ballpen with tag prices of ₱7.50, ₱8.00, ₱9.00,
₱10.00 and ₱12.50 were bought by 16, 5, 4, 12 and 6 students. Find the mean
sale. 𝑥̅ = _________

h. Jessie Salvador, an Engineering student got 88%, 85%, 91% and 93% in four of
his subjects. What grade must he get in his fifth subject in order to obtain an
average of 90%? 𝑥 = _________
i. The table below shows the number of respondents who answered 5, 4, 3, 2 and 1
on three questions. Compute for the weighted mean and give the mean
interpretation using the scale below:

Mean Interpretation
1.00 – 1.79 To a Very Slight Extent (VSE)
1.80 – 2.59 To a Slight Extent (SE)
2.60 – 3.39 To a Moderate Extent (ME)
3.40 – 4.19 To a Great Extent (GE)
4.20 – 5.00 To a Very Great Extent (VGE)

Question                                                                        5     4     3     2     1     wx̄     Interpretation
To what extent do you think Statistics will help you in your chosen career?     15    20    5     0     0
To what extent do you think Statistics will help you in doing research?         10    25    3     2     0
To what extent do you think Statistics will help you in real life situation?    11    16    8     5     0
MEASURES OF VARIABILITY OR DISPERSION

The measures of variability indicate the degree or extent to which numerical values
are dispersed or spread out about the average value (mean) in a distribution. The most
commonly used measures of variation are the range, variance and standard deviation.

RANGE (R)
The range, which is the simplest to compute, is the difference between the
largest and the lowest values in the set of numerical data. This is a poor and unstable
measure of variation, particularly, if we consider a large number of values. It is least
reliable and should be used only when someone wants to obtain a quick measure of
variation.

THE VARIANCE (s²) AND THE STANDARD DEVIATION (s)


The variance is the average of the squared deviations of the values from the
distribution’s mean (for a sample, the sum of the squared deviations is divided by
n – 1). The standard deviation, which is the positive square root of the variance,
measures the spread or dispersion of the values about the mean of the
distribution. It is the most used measure of spread because taking the square root
undoes the squaring in the variance and expresses the deviations in their original
unit, and because it is closely related to the normal distribution. It is the most
important measure of dispersion since it enables us to determine with a great deal of
accuracy where the values of the distribution are located in relation to the mean.

The variance and the standard deviation are generally accepted measures of
dispersion, especially in discussions and presentation of reports containing basic
statistics. The standard deviation is more popularly used than the variance since its
value is expressed in the unit of observations and the mean.

Take note: The higher the standard deviation, the more spread or more dispersed the
data are. The smaller the standard deviation, the less spread and less
dispersed, the more homogeneous, more consistent or more uniform the
data are.

$s^2 = \dfrac{\sum (x-\bar{x})^2}{n-1}$ or $s^2 = \dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)}$

$s = \sqrt{\dfrac{\sum (x-\bar{x})^2}{n-1}}$ or $s = \sqrt{\dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)}}$

Examples:

1. Find the value of the range, variance and standard deviation of the set of data: 17,
25, 24, 18, 20

R = HV – LV = 25 – 17 = 8
$\bar{x} = \dfrac{104}{5} = 20.8$

x          (x – x̄)                (x – x̄)²               x²
17         17 – 20.8 = –3.8       (–3.8)² = 14.44        289
18         18 – 20.8 = –2.8       (–2.8)² = 7.84         324
20         20 – 20.8 = –0.8       (–0.8)² = 0.64         400
24         24 – 20.8 = 3.2        (3.2)² = 10.24         576
25         25 – 20.8 = 4.2        (4.2)² = 17.64         625
∑x = 104                          ∑(x – x̄)² = 50.8       ∑x² = 2 214

$s^2 = \dfrac{\sum (x-\bar{x})^2}{n-1} = \dfrac{50.8}{5-1} = \dfrac{50.8}{4} = 12.7$ or
$s^2 = \dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)} = \dfrac{5(2\,214)-(104)^2}{5(5-1)} = \dfrac{254}{20} = 12.7$

s = √12.7 ≈ 3.56
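
Both variance formulas give 12.7 for this data set, and that agreement can be verified numerically. The sketch below is an illustrative Python version under the assumption that the data are a sample (divisor n – 1, as in the formulas above); the function names are hypothetical.

```python
from math import sqrt


def variance_definitional(data):
    """s^2 = sum((x - mean)^2) / (n - 1)."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def variance_computational(data):
    """s^2 = (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(data)
    return (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))


data = [17, 25, 24, 18, 20]
data_range = max(data) - min(data)   # R = HV - LV
s2_a = variance_definitional(data)
s2_b = variance_computational(data)
s = sqrt(s2_b)                       # standard deviation
print(data_range, round(s2_a, 2), round(s2_b, 2), round(s, 2))
```

The second form needs only ∑x and ∑x², which is why the module uses it when the mean is not a round number.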

2. Suppose two applicants, A and B, for a secretarial position were given an examination
to test and compare their typing speed (assume all other factors are equal). Each
was given nine trials (in minutes), and the results were as follows:
A: 14 16 18 20 22 24 26 28 30
B: 18 18 20 22 24 24 24 24 24

RA = 30 – 14 = 16 RB = 24 – 18 = 6

Secretary A                 Secretary B
x          x²               x          x²
14         196              18         324
16         256              18         324
18         324              20         400
20         400              22         484
22         484              24         576
24         576              24         576
26         676              24         576
28         784              24         576
30         900              24         576
∑x = 198   ∑x² = 4 596      ∑x = 198   ∑x² = 4 412

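Secretary A: $s^2 = \dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)} = \dfrac{9(4\,596)-(198)^2}{9(9-1)} = \dfrac{2\,160}{72} = 30$, $s = \sqrt{30} \approx 5.48$

Secretary B: $s^2 = \dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)} = \dfrac{9(4\,412)-(198)^2}{9(9-1)} = \dfrac{504}{72} = 7$, $s = \sqrt{7} \approx 2.65$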

• Secretary B is more consistent than Secretary A in terms of performance in the
typing test, because B has the smaller standard deviation (2.65 against 5.48).
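
The comparison can also be cross-checked with Python's standard-library `statistics` module, whose `variance` and `stdev` functions use the sample divisor n – 1, the same as the formulas in this module. This is only a sketch; the variable names are chosen for illustration.

```python
from statistics import stdev, variance

trials_a = [14, 16, 18, 20, 22, 24, 26, 28, 30]
trials_b = [18, 18, 20, 22, 24, 24, 24, 24, 24]

results = {}
for name, trials in (("A", trials_a), ("B", trials_b)):
    results[name] = {
        "range": max(trials) - min(trials),
        "variance": variance(trials),        # sample variance (n - 1)
        "stdev": round(stdev(trials), 2),    # sample standard deviation
    }
    print(name, results[name])

# The applicant with the smaller standard deviation is the more consistent one.
more_consistent = min(results, key=lambda k: results[k]["stdev"])
print("More consistent:", more_consistent)
```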
Module 3
Name:______________________________________Score:_________________
Section:_____________________________________Date:__________________

Activity 2
Measures of Variability or Dispersion

a. The monthly number of cars sold by a car dealer from January to October for a
particular year are: 20 24 12 10 18 4 15 6 11 19.

Find the range, variance and standard deviation.

b. Sample annual salaries, in thousands of pesos, for Manila and Makati are listed.
Manila: 34 25 17 17 27 25 29 33 26
Makati: 26 23 27 28 25 26 18 26 31

*Compute for the range, variance and standard deviation; and interpret the result.
*In which area are salaries more consistent?
Measures of Central Tendency Grouped Data
Mean
Recall that the grouped data are data which have been arranged in a frequency distribution table.
To compute the mean for grouped data, we can use two formulas, namely:

A. The Classmark Formula:


$\bar{X} = \dfrac{\sum fX_m}{n}$

where: f= frequency
Xm = classmark
n = total frequency

The steps in computing the mean using the classmark formula are as follows:
1. Construct the column for the classmark (Xm).
2. Multiply each classmark by its corresponding frequency; the products are written in the fXm column.
3. Get the sum of the values in the fXm column ($\sum fX_m$).
4. Substitute the values in the formula to find the mean.
Example:
Below is the frequency distribution of the scores of 40 students in Mathematics
Classes f Xm fXm
16 – 23 1 19.5 19.5
24 – 31 3 27.5 82.5
32 – 39 6 35.5 213
40 – 47 12 43.5 522
48 – 55 10 51.5 515
56 – 63 8 59.5 476
n = 40 ∑ 𝑓𝑋𝑚 = 1828

Solution:
$\bar{X} = \dfrac{\sum fX_m}{n} = \dfrac{1828}{40} = 45.7$
This indicates that the mean score in Mathematics of the 40 students is 45.7
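
The classmark steps above can be sketched in a few lines of Python. This is an illustrative version only: the function name `grouped_mean` and the representation of each class as a `(lower limit, upper limit)` pair are assumptions made for the example.

```python
def grouped_mean(classes, freqs):
    """Mean of grouped data by the classmark formula: sum(f * Xm) / n."""
    # Step 1: the classmark is the midpoint of each class.
    classmarks = [(lower + upper) / 2 for lower, upper in classes]
    # Steps 2 and 3: multiply by the frequencies and add the products.
    sum_fxm = sum(f * xm for f, xm in zip(freqs, classmarks))
    # Step 4: divide by the total frequency.
    return sum_fxm / sum(freqs)


classes = [(16, 23), (24, 31), (32, 39), (40, 47), (48, 55), (56, 63)]
freqs = [1, 3, 6, 12, 10, 8]
print(grouped_mean(classes, freqs))
```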
B. The Coded Formula:
$\bar{X} = X_0 + \left(\dfrac{\sum fd}{n}\right) i$

where: X0 = classmark with a deviation of 0
f = frequency
d = deviation
n = total frequency
i = class interval (class size)

Steps in computing for the mean using the coded formula are as follows:
1. Choose any class interval to find the assumed mean. The classmark of this interval is X0 where
the deviation is 0.
2. Construct the column for the deviation. For the classes above the assumed mean, the deviations
are 1, 2, 3, ..., whereas for the classes below the assumed mean, the deviations are -1, -2, -3, ..., and
so on.
3. Multiply each frequency by the corresponding deviation to get the entries in the fd column. Get
the sum (∑ 𝑓𝑑).
4. Use the formula to compute for the mean.

To illustrate, consider the data below.


Solution:

Classes f Xm d fd
16 – 23 1 19.5 -3 -3
24 – 31 3 27.5 -2 -6
32 – 39 6 35.5 -1 -6
40 – 47 12 43.5 0 0
48 – 55 10 51.5 1 10
56 – 63 8 59.5 2 16
n = 40 ∑ 𝑓𝑑 = 11

$\bar{X} = X_0 + \left(\dfrac{\sum fd}{n}\right) i$
$\bar{X} = 43.5 + \left(\dfrac{11}{40}\right) 8$
$\bar{X} = 45.7$
Notice that we got the same mean which is 45.7, thus either of the two formulas will give the
same value of the mean. Note also, that we will get the same mean if we take the assumed mean from the
other class intervals.
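
The claim that any class may serve as the assumed mean can be checked by trying every class in turn. The following Python sketch does this under two stated assumptions: all classes have the same size, and each class is given as a `(lower limit, upper limit)` pair. The name `coded_mean` and the parameter `assumed_index` are hypothetical.

```python
def coded_mean(classes, freqs, assumed_index):
    """Mean of grouped data by the coded formula X0 + (sum(f * d) / n) * i."""
    classmarks = [(lower + upper) / 2 for lower, upper in classes]
    # Class size i, assuming equal-width classes.
    size = classes[1][0] - classes[0][0]
    x0 = classmarks[assumed_index]
    # Deviations: 0 at the assumed class, -1, -2, ... below it, 1, 2, ... above it.
    deviations = [k - assumed_index for k in range(len(classes))]
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))
    return x0 + (sum_fd / sum(freqs)) * size


classes = [(16, 23), (24, 31), (32, 39), (40, 47), (48, 55), (56, 63)]
freqs = [1, 3, 6, 12, 10, 8]

# Index 3 is the class 40 - 47 used in the solution above.
for k in range(len(classes)):
    print(classes[k], round(coded_mean(classes, freqs, k), 2))
```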

Characteristics of the Mean:


1. The mean is the most appropriate measure when the data are in interval or ratio scale.
2. The mean lies between the largest and smallest values.
3. The value of the mean is unique for a given set of data.
4. The mean is easily influenced by extreme values.
5. The mean is better suited for further statistical measures.

Median
$\tilde{X} = LB + \left(\dfrac{\frac{n}{2} - {<}cf_b}{f_m}\right) i$

where: LB = lower boundary of the median class
fm = frequency of the median class
<cfb = less than cumulative frequency before the median class
n = total frequency
i = class interval
Steps in computing the median for grouped data:
1. Construct the less than cumulative frequency (<cf) column.
2. Determine the median class: the first class whose <cf is at least $\dfrac{n}{2}$.
3. Substitute the values in the formula.

Example: To illustrate how to compute for the median of grouped data, let us use the distribution of the
test scores of 40 students in Mathematics.

Classes f <cf
16 – 23 1 1
24 – 31 3 4
32 – 39 6 10
40 – 47 12 22
48 – 55 10 32
56 – 63 8 40
n = 40

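The median class is 40 – 47 because it is the first class whose less than cumulative frequency (22)
reaches one-half of the total frequency, $\dfrac{n}{2} = \dfrac{40}{2} = 20$. Substituting the values in the formula:

$\tilde{X} = LB + \left(\dfrac{\frac{n}{2} - {<}cf_b}{f_m}\right) i$

$\tilde{X} = 39.5 + \left(\dfrac{20-10}{12}\right) 8$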

𝑋̃ = 46.17
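
The search for the median class and the substitution can be sketched in Python as follows. The sketch assumes equal-size classes with whole-number limits, so that each lower boundary is the lower limit minus 0.5 (as in 39.5 for the class 40 – 47); the function name `grouped_median` is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median of grouped data: LB + ((n/2 - <cfb) / fm) * i."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    size = classes[1][0] - classes[0][0]   # class size i (equal widths assumed)
    cf_before = 0                          # <cf before the current class
    for (lower, _upper), f in zip(classes, freqs):
        # The median class is the first class whose <cf reaches n/2.
        if cf_before + f >= n / 2:
            lower_boundary = lower - 0.5   # assumes whole-number class limits
            return lower_boundary + ((n / 2 - cf_before) / f) * size
        cf_before += f
    raise ValueError("median class not found")


classes = [(16, 23), (24, 31), (32, 39), (40, 47), (48, 55), (56, 63)]
freqs = [1, 3, 6, 12, 10, 8]
print(round(grouped_median(classes, freqs), 2))
```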

Characteristics of the Median:


1. The median is the most appropriate measure for ordinal data.
2. The median lies between the highest and lowest measurements.
3. There is only one value for the median in a given set of measurements.
4. The median is not influenced by extreme values.
5. The median is used when the middle value is desired. It is the value where 50% or half of the
distribution lies above it and 50% lies below it.
Mode
$\hat{X} = LB_{mo} + \left(\dfrac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right) i$

where: LBmo = lower boundary of the modal class


fmo = frequency of the modal class
f1 = frequency before the modal class
f2 = frequency after the modal class
i = class interval

Steps in computing the mode for grouped data:


1. Find the modal class; the class with the highest frequency.
2. Use the formula to find the mode.
Take Note: The given formula is to be used only for a unimodal distribution:

$\hat{X} = LB_{mo} + \left(\dfrac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right) i$

For a multimodal distribution, an alternative formula is used to obtain a rough
estimate of the mode:

$\hat{X} = 3(\text{Median}) - 2(\text{Mean})$

Example: Find the mode of the grouped data, using the distribution of the test scores of 40 students in
Mathematics.

Classes f
16 – 23 1
24 – 31 3
32 – 39 6
40 – 47 12
48 – 55 10
56 - 63 8
n = 40

The modal class is the class interval 40 – 47 with the highest frequency of 12 (unimodal). Substituting the
values in the formula;
$\hat{X} = LB_{mo} + \left(\dfrac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right) i$
$\hat{X} = 39.5 + \left(\dfrac{12-6}{2(12) - 6 - 10}\right) 8$
$\hat{X} = 45.5$
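
A small Python sketch of the unimodal formula is shown here for checking the substitution. It makes several assumptions that go beyond the text: classes are equal in size with whole-number limits (lower boundary = lower limit – 0.5), a missing neighboring class at either end of the table is treated as having frequency 0, and a tie for the highest frequency is rejected because the formula applies only to unimodal data. The name `grouped_mode` is hypothetical.

```python
def grouped_mode(classes, freqs):
    """Mode of unimodal grouped data: LBmo + ((fmo - f1) / (2*fmo - f1 - f2)) * i."""
    f_mo = max(freqs)
    if freqs.count(f_mo) > 1:
        raise ValueError("formula applies only to a unimodal distribution")
    k = freqs.index(f_mo)                            # position of the modal class
    f1 = freqs[k - 1] if k > 0 else 0                # frequency before (assumed 0 if none)
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0   # frequency after (assumed 0 if none)
    size = classes[1][0] - classes[0][0]             # class size i (equal widths assumed)
    lower_boundary = classes[k][0] - 0.5             # assumes whole-number class limits
    return lower_boundary + ((f_mo - f1) / (2 * f_mo - f1 - f2)) * size


classes = [(16, 23), (24, 31), (32, 39), (40, 47), (48, 55), (56, 63)]
freqs = [1, 3, 6, 12, 10, 8]
print(grouped_mode(classes, freqs))
```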

Characteristics of the Mode:


1. The mode is the most appropriate measure for nominal data.
2. The mode is the least reliable among the measures of central tendency.
3. The mode is used when we want to find the value which occurs most often.
4. The mode is a quick approximation of the average.
Module 3
Name:______________________________________Score:_________________

Section:_____________________________________Date:__________________

Activity 3
Measures of Central Tendency Grouped Data

Table 1. Projected Regional and Provincial Population by Five-Year Age Group by Five-Calendar Year, Philippines: 2010-2030

Age Group    2015          2020          2025          2030
0-4          11,327,300    11,475,800    11,360,700    11,043,800
5-9          10,671,000    11,233,600    11,385,600    11,273,500
10-14        10,283,900    10,601,800    11,162,300    11,312,500
15-19        10,136,900    10,208,500    10,524,400    11,081,200
20-24        9,643,400     10,045,400    10,117,800    10,431,700
25-29        8,332,500     9,540,100     9,944,300     10,017,200
30-34        7,342,000     8,229,200     9,435,800     9,841,200
35-39        6,685,300     7,238,600     8,127,400     9,333,700
40-44        5,916,400     6,573,800     7,133,600     8,024,400
45-49        5,351,200     5,787,300     6,449,500     7,015,500
50-54        4,530,000     5,185,800     5,630,000     6,295,500
55-59        3,703,100     4,319,200     4,970,900     5,421,200
60-64        2,765,500     3,444,600     4,045,700     4,685,300
65-69        1,978,400     2,472,300     3,109,600     3,684,300
70-74        1,249,200     1,667,600     2,110,400     2,686,400
75-79        870,200       966,600       1,313,000     1,688,800
80+          776,000       957,700       1,138,400     1,501,300

Source: https://psa.gov.ph/statistics/census/projected-population

Answer the following:
1. Determine the values of the Mean, Median, and Mode using the distribution of the
age group in 2020.
2. Compare the means of 2020 and 2025. What do the results indicate?
Measures of Variability
Shown here are the various age groups of 100 samples chosen at random that
constitute the labor force of an economic zone.
Age Frequency
(in years)
15 – 19 18
20 – 24 25
25 – 29 20
30 – 34 14
35 – 39 9
40 – 44 7
45 – 49 5
50 - 54 2
n = 100

The Range (R)


The range is the most unreliable and most unstable measure of variability. But if
you want a quick approximation of the dispersion of a data set, the range is the
measure most likely to be used.
The range of the grouped data is defined as the difference between the highest among
the upper-class boundaries and the lowest among the lower-class boundaries. That is,
R = HUCB – LLCB
where: HUCB = highest upper-class boundary
LLCB = lowest lower class boundary
Using the given data, we have:
R = HUCB – LLCB
R = 54.5 – 14.5
R = 40
Variance and the Standard Deviation
Remember that the variance and standard deviation are related to each other.
For grouped data, their respective formulas are as follows:
Sample Variance:
$s^2 = \dfrac{\sum f(X_m - \bar{X})^2}{n-1}$
Sample Standard Deviation:
$s = \sqrt{\dfrac{\sum f(X_m - \bar{X})^2}{n-1}}$
where: f – frequency of the class interval
$X_m$ – the classmark
$\bar{X}$ – the sample mean
n – the total number of cases
s² – sample variance
s – sample standard deviation

Example:
Let us solve the variance and standard deviation of the data set.
Age (in years)    Frequency (f)    Xm    fXm            (Xm – X̄)    f(Xm – X̄)²
15 – 19           18               17    306            –11.1        2217.78
20 – 24           25               22    550            –6.1         930.25
25 – 29           20               27    540            –1.1         24.20
30 – 34           14               32    448            3.9          212.94
35 – 39           9                37    333            8.9          712.89
40 – 44           7                42    294            13.9         1352.47
45 – 49           5                47    235            18.9         1786.05
50 – 54           2                52    104            23.9         1142.42
                  n = 100                ∑fXm = 2810                 ∑f(Xm – X̄)² = 8379

The sample mean used in the deviation column is $\bar{X} = \dfrac{\sum fX_m}{n} = \dfrac{2810}{100} = 28.1$.

Applying the formula, we have:

$s^2 = \dfrac{\sum f(X_m - \bar{X})^2}{n-1} = \dfrac{8379}{100-1} \approx 84.64$ (sample variance)

$s = \sqrt{\dfrac{\sum f(X_m - \bar{X})^2}{n-1}} = \sqrt{\dfrac{8379}{100-1}} \approx 9.1998$ or 9.20 (sample standard deviation)
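
The grouped range, variance, and standard deviation for the labor-force data can be reproduced with the sketch below. It is an illustration only, assuming whole-number class limits (so each class boundary lies 0.5 beyond its limit) and the sample divisor n – 1; the function names are hypothetical.

```python
from math import sqrt


def grouped_range(classes):
    """R = highest upper-class boundary - lowest lower-class boundary."""
    return (classes[-1][1] + 0.5) - (classes[0][0] - 0.5)


def grouped_variance(classes, freqs):
    """Sample variance of grouped data: sum(f * (Xm - mean)^2) / (n - 1)."""
    classmarks = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    x_bar = sum(f * xm for f, xm in zip(freqs, classmarks)) / n
    sum_sq = sum(f * (xm - x_bar) ** 2 for f, xm in zip(freqs, classmarks))
    return sum_sq / (n - 1)


classes = [(15, 19), (20, 24), (25, 29), (30, 34),
           (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [18, 25, 20, 14, 9, 7, 5, 2]

r = grouped_range(classes)
s2 = grouped_variance(classes, freqs)
s = sqrt(s2)
print(r, round(s2, 2), round(s, 2))
```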


Module 3

Name:______________________________________Score:_________________

Section:_____________________________________Date:__________________

Activity 4
Measures of Dispersion

The distribution of the monthly earnings of selected families in Barangay Maligaya in
2019 is summarized in the table. Compute for the different measures of variability.

Monthly Income (in pesos)    Number of Families    Xm    fXm    (Xm – X̄)    f(Xm – X̄)²
9,500 – 11,999               23
12,000 – 14,499              44
14,500 – 16,999              32
17,000 – 19,499              25
19,500 – 21,999              13
22,000 – 24,499              9
24,500 – 26,999              4

1. R: _________________

2. s2: _________________

3. s: _________________
