
Lesson 2.

Measures of Central Tendency


A measure of central tendency is a single value that attempts to describe
a set of data by identifying the central position within that set of data. As such,
measures of central tendency are sometimes called measures of central
location. You can think of it as the tendency of data to cluster around a middle
value.

The measures of central tendency are the mean, median and mode. Each
of these measures calculates the central point using a different method.

1. ARITHMETIC MEAN or MEAN


The mean is the arithmetic average and is the most popular and well-
known measure of central tendency. The mean is equal to the sum of all
values in the data set divided by the number of values in the data set.
Let $n$ be the number of values in a data set, and let the values be
$x_1, x_2, x_3, \ldots, x_n$. The mean, usually denoted by $\bar{x}$ (x bar), is:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
This formula can be written in summation notation,
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Example: The scores obtained by the students during a 30-point math quiz
are as follows:
18, 23, 28, 27, 29, 27, 23, 20, 18, 10, 14, 25, 29, 30, 24, 15, 10, 24, 19, 20
What is their average score?

Solution: The average score can be determined by solving for the mean.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{18 + 23 + 28 + 27 + 29 + 27 + 23 + 20 + 18 + 10 + 14 + 25 + 29 + 30 + 24 + 15 + 10 + 24 + 19 + 20}{20} = \frac{433}{20} = 21.65$$

The advantages of using the mean are that it is simple to understand and
easy to calculate. It also takes into account all the values in the dataset. The
only disadvantage is that it is easily affected by outliers. Outliers are
values that are unusual compared to the rest of the data by being especially
small or large in value.
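The quiz-score computation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`scores`, `total`, `mean_score`) are chosen here for illustration and are not part of the lesson.

```python
# Scores from the 30-point math quiz example.
scores = [18, 23, 28, 27, 29, 27, 23, 20, 18, 10,
          14, 25, 29, 30, 24, 15, 10, 24, 19, 20]

n = len(scores)          # number of values in the data set
total = sum(scores)      # sum of all values
mean_score = total / n   # mean = sum of values / number of values

print(total, n, mean_score)  # 433 20 21.65
```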
Example: Below are the salaries given to the staff of a certain company.

Staff    1    2    3    4    5    6    7    8    9    10
Salary   21K  12K  10K  13K  14K  15K  12K  18K  90K  96K

$$\bar{x} = \frac{21000 + 12000 + \cdots + 96000}{10} = \frac{301000}{10} = 30100 \text{ or } 30.1\text{K}$$
Notice that most of the salaries only range from 10K to 21K. There are
two entries which are extremely high, 90K and 96K. The resulting mean salary
of these ten staff members is 30.1K, which is higher than eight of the ten
salaries. This is because the mean is pulled upward by the two large values.
In this situation, we might need to use another measure of central tendency.
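To see how strongly the two large salaries pull the mean, the sketch below compares the mean of all ten salaries with the mean of the eight salaries in the 10K to 21K range. The 21,000 cutoff used to separate the outliers is an assumption made only for this illustration, not a general rule for detecting outliers.

```python
# Salaries of the ten staff members, in the same order as the table.
salaries = [21_000, 12_000, 10_000, 13_000, 14_000,
            15_000, 12_000, 18_000, 90_000, 96_000]

mean_all = sum(salaries) / len(salaries)

# Assumption for illustration: treat anything above 21,000 as an outlier.
typical = [s for s in salaries if s <= 21_000]
mean_typical = sum(typical) / len(typical)

print(mean_all)      # 30100.0
print(mean_typical)  # 14375.0
```

Without the two extreme entries the mean falls to less than half of its value with them included, which shows how sensitive the mean is to outliers.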

The mean is best used when data is continuous and symmetrically
distributed. When the data values are assigned different weights, the
weighted mean can be computed.

Let $w_i$ be the weight corresponding to each data value $x_i$ among
$x_1, x_2, x_3, \ldots, x_n$. The weighted mean is computed using the
following formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

That is, take the sum of the products of each data value and its weight,
then divide by the sum of all the weights.

Example:

x        w    wx
23       8    184
25       5    125
28       2    56
total:   15   365

$$\bar{x} = \frac{(23 \times 8) + (25 \times 5) + (28 \times 2)}{15} = \frac{365}{15} \approx 24.33$$
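The weighted mean for the values 23, 25, 28 with weights 8, 5, 2 can be reproduced with the short sketch below. The list names `values` and `weights` are illustrative choices, not notation from the lesson.

```python
values = [23, 25, 28]   # x: the data values
weights = [8, 5, 2]     # w: the weight attached to each value

# Numerator: sum of each value multiplied by its weight.
weighted_sum = sum(w * x for w, x in zip(weights, values))
# Denominator: sum of all the weights.
total_weight = sum(weights)

weighted_mean = weighted_sum / total_weight
print(weighted_sum, total_weight, round(weighted_mean, 2))  # 365 15 24.33
```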

2. MEDIAN
The median is the middle value. It is the value that splits the data in
half. To find the median, first arrange the dataset in ascending or
descending order.
If the number of items in the dataset is odd, find the middle value,
which has an equal number of data values below and above it.
For example, find the median of the following scores of students during a
math quiz:

12 25 30 13 17 28 27 23 24 11 21

Arrange the scores in ascending order and find the middle value,

11 12 13 17 21 23 24 25 27 28 30

If n is odd, the median is the value in the (n+1)/2 position. Here n = 11,
so (n+1)/2 = (11+1)/2 = 12/2 = 6, and the median is in the 6th position.
The median is 23 because there are five scores before it and five scores
after it.
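The position rule for an odd number of items can be sketched in Python as follows. Note that Python lists are indexed from 0, so the 6th position corresponds to index 5; the variable names are illustrative.

```python
scores = [12, 25, 30, 13, 17, 28, 27, 23, 24, 11, 21]

ordered = sorted(scores)      # arrange in ascending order
n = len(ordered)              # n = 11, an odd number
position = (n + 1) // 2       # 1-based position of the median: 6
median = ordered[position - 1]  # subtract 1 for 0-based indexing

print(position, median)  # 6 23
```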

If the number of items in the dataset is even, find the two middle values
and take their average. For example, find the median of the following data:

12 25 30 13 17 28 27 23 24 11 21 21

First, arrange the scores in ascending order and find the two middle values,

11 12 13 17 21 21 23 24 25 27 28 30

If n is even, the median is the average of the two middle scores. Here
n = 12, so n/2 = 12/2 = 6, and the two middle scores are in the 6th and 7th
positions.

Finally, get the mean of the two numbers,

$$\frac{21 + 23}{2} = 22$$

Therefore, the median of the dataset is 22, which means that there are
six scores lower than 22 and six which are higher.
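Both the odd and the even case can be handled by one small function. The sketch below uses a hypothetical helper named `median_of` (a name invented here) and compares its result with `statistics.median` from Python's standard library as a sanity check.

```python
import statistics

def median_of(values):
    """Return the median: the middle value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle value exists.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [12, 25, 30, 13, 17, 28, 27, 23, 24, 11, 21, 21]
result = median_of(scores)

print(result)                     # 22.0
print(statistics.median(scores))  # 22.0
```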
The median can be used with data that have a skewed distribution,
continuous data, and ordinal data.

3. MODE
The mode is the score that appears most frequently in the dataset. If the
data have multiple values that are tied for occurring the most frequently, the
data have a multimodal distribution. If no value repeats, the data do not have
a mode.
For example, consumers are asked to rate a certain restaurant
according to its overall service. They are asked to rate it from 1 to 5, with
5 as the highest. Here are their responses:

5, 5, 3, 4, 3, 4, 3, 4, 5, 2, 3, 4, 4, 4, 2, 3, 4, 5, 4, 4, 5,
3, 3, 4, 3, 4, 4, 4, 3, 3, 4, 4, 5, 2, 5, 4, 4, 3, 4, 3, 4, 5

Rating   Frequency
5        8
4        19
3        12
2        3
1        0

Since 4 has the highest frequency (19 responses), the mode is 4.
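Tallying the responses by hand is error-prone, so the frequencies can also be counted with `collections.Counter` from Python's standard library. The helper `modes_of` below is a hypothetical name introduced for this sketch; it returns every value tied for the highest frequency, and an empty list when no value repeats, following the definition given above.

```python
from collections import Counter

def modes_of(values):
    """Return all values tied for the highest frequency ([] if nothing repeats)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)

# The restaurant ratings from the example, in the order given.
ratings = [5, 5, 3, 4, 3, 4, 3, 4, 5, 2, 3, 4, 4, 4, 2, 3, 4, 5, 4, 4, 5,
           3, 3, 4, 3, 4, 4, 4, 3, 3, 4, 4, 5, 2, 5, 4, 4, 3, 4, 3, 4, 5]

counts = Counter(ratings)
modes = modes_of(ratings)

print(counts[4])  # 19
print(modes)      # [4]
```

A list with two tied values, such as `[1, 1, 2, 2]`, would return `[1, 2]`, which corresponds to the multimodal case.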

The mode is typically used with categorical, ordinal, and discrete data. In
fact, the mode is the only measure of central tendency which can be used on
categorical data. An example is determining that chocolate is the ice cream
flavor that most children love.
