
MODULE 7

MEASURES OF CENTRAL TENDENCY AND DISPERSION/VARIABILITY
LEARNING OUTCOMES
• Explain the meaning and function of the measures of central tendency
and measures of dispersion/variability
• Distinguish among the measures of central tendency and measures of
variability/dispersion
• Explain the meaning of normal and skewed score distribution

INTRODUCTION
A measure of central tendency is a single value that attempts to describe a set of data (like
scores) by identifying the central position within that set of data or scores. As such, measures of
central tendency are sometimes called measures of central location. Central tendency refers to the
center of a distribution of observations. Where do scores tend to congregate? In a test of 100
items, where are most of the scores? Do they tend to group around the mean score of 50 or 80?
There are three measures of central tendency: the mean, the median and the mode.
Perhaps you are most familiar with the mean (often called the average), but the median and
the mode are also measures of central tendency. Is there such a thing as a best
measure of central tendency?
If the measures of central tendency indicate where scores congregate, the measures of
variability indicate how spread out a group of scores is, how varied the scores are, or how far
they are from the mean. Common measures of dispersion or variability are range, interquartile
range, variance and standard deviation.

7.1. The Measures of Central Tendency


The mean, mode and median are all valid measures of central tendency, but under different
conditions one measure becomes more appropriate than the others. For example, if a score
distribution contains extremely high or extremely low scores, the median is a better measure of
central tendency, since the mean is affected by extremely high and extremely low scores.

The Mean (Arithmetic)


The mean (or average or arithmetic mean) is the most popular and best-known
measure of central tendency. The mean is equal to the sum of all the values in the data set divided
by the number of values in the data set. For example, 10 students in a Graduate School class got
the following scores in a 100-item test: 70, 72, 75, 77, 78, 80, 84, 87, 90, 92. The mean score of
the group of 10 students is the sum of all their scores divided by 10. The mean, therefore, is
805/10 = 80.5, so 80.5 is the average score of the group. There are 6 scores below the average
score (mean) of the group (70, 72, 75, 77, 78, and 80) and there are 4 scores above the average
score (mean) of the group (84, 87, 90 and 92).
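The computation above can be checked with a few lines of Python. This is only a minimal sketch using the ten scores from the example; the variable names are chosen for illustration.

```python
# Scores of the 10 students from the example above.
scores = [70, 72, 75, 77, 78, 80, 84, 87, 90, 92]

# Mean = sum of all values divided by the number of values.
mean_score = sum(scores) / len(scores)

# Count how many scores fall below and above the mean.
below = [s for s in scores if s < mean_score]
above = [s for s in scores if s > mean_score]

print(mean_score)              # 80.5
print(len(below), len(above))  # 6 4
```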
When Not to Use the Mean
The mean has one main disadvantage. It is particularly susceptible to the influence of
outliers. These are values that are unusual compared to the rest of the data set by being
especially small or large in numerical value. For example, consider the scores of 10 Grade 12
students in a 100-item Statistics test below:

Student   1   2   3   4   5   6   7   8   9   10
Score     5   38  56  60  67  70  73  78  79  95

The mean score for these ten Grade 12 students is 62.1. However, inspecting the raw data
suggests that this mean score may not be the best way to accurately reflect the score of the typical
Grade 12 student, as most students have scores in the 56 to 79 range. The mean is being skewed
by the extreme scores, especially the very low score of 5. Therefore, in this situation, we would
like to have a better measure of central tendency. As we will find out later, the median would be a
better measure of central tendency in this situation.
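To see how strongly one extreme score pulls the mean, the sketch below compares the mean and the median of the ten scores above, with and without the lowest score. It uses Python's standard `statistics` module; dropping the score of 5 is done only for illustration and is not part of the original example.

```python
from statistics import mean, median

# Scores of the ten Grade 12 students from the table above.
scores = [5, 38, 56, 60, 67, 70, 73, 78, 79, 95]

mean_all = mean(scores)      # 62.1
median_all = median(scores)  # 68.5 (average of the two middle scores, 67 and 70)

# For illustration only: drop the extremely low score of 5.
trimmed = scores[1:]
mean_trimmed = mean(trimmed)
median_trimmed = median(trimmed)  # 70

print(mean_all, median_all)
print(round(mean_trimmed, 2), median_trimmed)
```

The mean moves by more than six points when the outlier is removed, while the median moves by only 1.5 points.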

Median
The median is the middle score for a set of scores arranged from lowest to highest. The
median is less affected by extremely low and extremely high scores. How do we find the median?
Suppose we have the following data:
65 55 89 56 35 14 56 55 87 45 92
To determine the median, first we have to rearrange the scores into order of magnitude
(from smallest to largest).
14 35 45 55 55 56 56 65 87 89 92
Our median is the score at the middle of the distribution, in this case 56. There are 5 scores
before it and 5 scores after it. This works fine when you have an odd
number of scores, but what happens when you have an even number of scores? What if you had
10 scores like the scores below?
65 55 89 56 35 14 56 55 87 45
Arrange the data according to order of magnitude (smallest to largest):
14 35 45 55 55 56 56 65 87 89
With an even number of scores there is no single middle score, so take the two middle scores
(here the 5th and 6th scores, 55 and 56) and average them. The median is 55.5. This gives us a more
reliable picture of the tendency of the scores. There are indeed scores of 55 and 56 in the score
distribution.
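The procedure for odd and even numbers of scores can be sketched as a small Python function. The function name `median_score` is hypothetical; the two data sets are the ones used above.

```python
def median_score(scores):
    """Return the median: the middle score, or the average of the two middle scores."""
    ordered = sorted(scores)  # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: a single middle score exists.
        return ordered[mid]
    # Even number of scores: average the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]   # 11 scores
even_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]      # 10 scores

print(median_score(odd_set))   # 56
print(median_score(even_set))  # 55.5
```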

Mode
The mode is the most frequent score in our data set. On a histogram or bar chart it
represents the highest bar. If the data are counts of the number of times each option is chosen in a
multiple-choice test, the mode is the option chosen most often. You can, therefore, sometimes
consider the mode as being the most popular option.
Study the score distribution given below:
14 35 45 55 55 56 56 65 87 89
There are two most frequent scores, 55 and 56, each occurring twice. So we have a score
distribution with two modes, hence a bimodal distribution.
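A quick way to find every mode, including the bimodal case above, is `statistics.multimode` from Python's standard library (available in Python 3.8 and later). The sketch below uses the score distribution from the example.

```python
from statistics import multimode

scores = [14, 35, 45, 55, 55, 56, 56, 65, 87, 89]

# multimode returns all values that share the highest frequency.
modes = multimode(scores)

print(modes)       # [55, 56]
print(len(modes))  # 2, so the distribution is bimodal
```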

7.2. Normal and Skewed Distributions


A score distribution has a “normal distribution” when most of the values are
aggregated around the mean, and the number of values decreases as you move below or above
the mean. The bar graph of frequencies of a “normally distributed” sample will look like a bell curve.
[Figure: standard normal distribution curve; horizontal axis from -5 to 5]

http://kalnari.com/blog/an-interesting-multipurposed-brown-sofa-by-marcin-wielgosz/
Figure 13. Normal Distribution

• If the mean is equal to the median and the median is equal to the mode, the score
distribution is symmetric, as in a perfectly normal distribution. This is illustrated by the
perfect bell shape or normal curve shown in Figure 13.
• If the mean is less than the median and the mode, the score distribution is a negatively
skewed distribution. See Figure 14. In a negatively skewed distribution the scores tend to
congregate at the upper end of the score distribution.

https://sciencestruck.com/types-of-skewed-distribution-with-real-life-examples
Figure 14. Negatively Skewed Distribution
• If the mean is greater than the median and the mode, the score distribution is a positively
skewed distribution. See Figure 15. In a positively skewed distribution the scores tend to
congregate at the lower end of the score distribution.

https://sciencestruck.com/types-of-skewed-distribution-with-real-life-examples
Figure 15. Positively Skewed Distribution
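The comparison rules above can be sketched in Python. This is only a rule-of-thumb check that compares the mean with the median, as the text describes; the function name `skew_direction` and the three score lists are made up for illustration and do not come from the module.

```python
from statistics import mean, median

def skew_direction(scores):
    """Classify a score distribution by comparing its mean and median."""
    m, md = mean(scores), median(scores)
    if m < md:
        return "negatively skewed"   # scores congregate at the upper end
    if m > md:
        return "positively skewed"   # scores congregate at the lower end
    return "symmetric"

# Hypothetical score sets, invented for this illustration.
mostly_high = [40, 70, 85, 88, 90, 92, 95, 95, 96, 98]
mostly_low = [5, 8, 10, 10, 12, 15, 18, 25, 60, 90]
balanced = [1, 2, 3, 4, 5]

print(skew_direction(mostly_high))  # negatively skewed
print(skew_direction(mostly_low))   # positively skewed
print(skew_direction(balanced))     # symmetric
```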

If scores tend to be high because the teacher taught very well and the students are highly
motivated to learn, the score distribution tends to be negatively skewed, i.e. most scores will
be high. On the other hand, when the teacher does not teach well and the students are poorly
motivated, the score distribution tends to be positively skewed, which means that scores tend to
be low. So which score distribution should we work for?

7.3. Outcome-based Teaching-Learning and Score Distribution


If teachers teach in accordance with the principles of outcome-based teaching-learning,
and so align content and assessment with the intended learning outcomes and re-teach until
mastery whatever has not been understood as revealed by the formative assessment process,
then student scores in the assessment phase of the lesson will tend to congregate on the higher
end of the score distribution.
On the other hand, if what teachers teach and assess are not aligned with the intended
learning outcomes, the opposite will be true.

7.5. Measures of Dispersion or Variability


If the measures of central tendency indicate where scores congregate, the measures of
variability indicate how spread out a group of scores is or how varied the scores are. Common
measures of dispersion or variability are range, variance and standard deviation.

What is variability?
Variability refers to how “spread out” a group of scores is. The terms variability, spread,
and dispersion are synonymous, and refer to how spread out a distribution is. Here are two sets of
score distribution:
A - 5, 5, 5, 5, 6, 6, 6, 6, 6, 6 - Mean is 5.6
B - 1, 3, 4, 5, 5, 6, 7, 8, 8, 9 - Mean is 5.6
The two score distributions have equal mean scores, and yet they differ in how varied the scores
are. Score distribution A shows scores that are less varied than those in score distribution B. That
is what we mean by variability or dispersion. Assuming that the highest possible score in the quiz
is 10, we can say that Groups A and B are equal in terms of mean, but the scores in Group A are
more similar to one another and closer to the mean, while the scores in Group B are more varied.
In fact, the lowest score in Group B (1) is much lower than the lowest score in Group A (5), and the
highest score in Group B (9) is much higher than the highest score in Group A (6).
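The comparison of Groups A and B can be reproduced with a short Python sketch. It reports the mean, the range and the population standard deviation (two measures of spread discussed later in this module) for each group; the variable names are illustrative.

```python
from statistics import mean, pstdev

group_a = [5, 5, 5, 5, 6, 6, 6, 6, 6, 6]
group_b = [1, 3, 4, 5, 5, 6, 7, 8, 8, 9]

summary = {}
for name, scores in (("A", group_a), ("B", group_b)):
    summary[name] = {
        "mean": mean(scores),
        "range": max(scores) - min(scores),  # highest minus lowest score
        "sd": pstdev(scores),                # population standard deviation
    }
    print(name, summary[name]["mean"], summary[name]["range"],
          round(summary[name]["sd"], 2))
```

Both groups have a mean of 5.6, but Group B has the larger range (8 versus 1) and the larger standard deviation.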
To see more clearly what we mean by spread out, consider the graphs in Figures 16 and 17.
These graphs represent the scores on two quizzes. The mean score for each quiz is 7.0. Despite the
equality of means, you can see that the distributions are quite different. Specifically, the scores on
Quiz 1 are more densely packed and those on Quiz 2 are more spread out. The differences among
students were much greater on Quiz 2 than on Quiz 1.

Quiz 1

Figure 16. Bar charts of two quizzes

Quiz 2

Figure 17. Bar charts of two quizzes


http://onlinestatbook.com/2/summarizing_distributions/variability.htm
Range
The range is the simplest measure of variability. The range is simply the highest score
minus the lowest score. Let’s take a few examples. What is the range of the
following group of scores: 10, 2, 5, 6, 7, 3, 4? The highest number is 10, and the lowest number is
2, so 10 - 2 = 8. The range is 8.

Here are other examples:


Here is a set of scores in a test: 99, 45, 23, 67, 45, 91, 82, 78, 62, 51. What is the range?
The highest number is 99 and the lowest number is 23, so 99 - 23 = 76; the range is 76. Here
is another set of scores: 40, 40, 42, 50, 53, 56, 67, 68, 70, 89. What is the range? 89 minus 40
equals 49. The range is 49. The set of scores with a range of 76 is more varied or more spread
out than the set of scores with a range of 49.
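The three range examples above can be verified with a one-line function. This is a minimal sketch; the function name `score_range` is chosen for illustration.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

set_1 = [10, 2, 5, 6, 7, 3, 4]
set_2 = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]
set_3 = [40, 40, 42, 50, 53, 56, 67, 68, 70, 89]

print(score_range(set_1))  # 8
print(score_range(set_2))  # 76
print(score_range(set_3))  # 49
```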

Variance
Variability can also be defined in terms of how close the scores in the distribution are to the
middle of the distribution. Using the mean as the measure of the middle of the distribution, the
variance is defined as the average squared difference of the scores from the mean. The data from
Quiz 1 are shown in Table 6. The mean score is 7.0. Therefore, the column “Deviation from Mean”
contains the score minus 7. The column “Squared Deviation” is simply the previous column
squared.
Table 6. Calculation of Variance for Quiz 1 scores.
Scores    Deviation from Mean    Squared Deviation
9         2                      4
9         2                      4
9         2                      4
8         1                      1
8         1                      1
8         1                      1
8         1                      1
7         0                      0
7         0                      0
7         0                      0
7         0                      0
7         0                      0
6         -1                     1
6         -1                     1
6         -1                     1
6         -1                     1
6         -1                     1
6         -1                     1
5         -2                     4
5         -2                     4
Means
7         0                      1.5

One thing that is important to notice is that the mean deviation from the mean is 0. This will
always be the case. The mean of the squared deviations is 1.5. Therefore, the variance is 1.5. The
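formula for the variance is:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

where σ² is the variance, μ is the mean, x_i is each individual score, and N is the number of scores.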
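The variance calculation for Quiz 1 can be reproduced column by column in Python. The sketch below rebuilds the 20 scores from the table (three 9s, four 8s, five 7s, six 6s, two 5s) and cross-checks the result against `statistics.pvariance`, which divides by N.

```python
from statistics import pvariance

# Quiz 1 scores, as listed in the table above.
quiz_1 = [9] * 3 + [8] * 4 + [7] * 5 + [6] * 6 + [5] * 2

n = len(quiz_1)
mean_score = sum(quiz_1) / n                     # 7.0

deviations = [x - mean_score for x in quiz_1]    # "Deviation from Mean" column
squared = [d ** 2 for d in deviations]           # "Squared Deviation" column

mean_deviation = sum(deviations) / n             # always 0
variance = sum(squared) / n                      # mean of squared deviations

print(mean_score, mean_deviation, variance)      # 7.0 0.0 1.5
print(pvariance(quiz_1))                         # 1.5
```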

Standard Deviation
To calculate the standard deviation of a set of numbers:
1. Work out the Mean (the simple average of the numbers).
2. Then for each number: subtract the Mean and square the result.
3. Then work out the mean of those squared differences.
4. Take the square root of that and we are done!

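The Formula Explained

The formula for the standard deviation of a whole population is:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$$

First, let us have some example values to work on: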
Example: Sam has 20 rose bushes.
The number of flowers on each bush is 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10,
9, 6, 9, 4

Let’s solve for the Standard Deviation.


Step 1. Work out the mean
In the formula above μ (the Greek letter “mu”) is the Mean of all our values...
Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20
= 140 / 20 = 7
So: μ = 7

Step 2. Then for each number: subtract the Mean and square the result.
This is the part of the formula that says:

$$(x_i - \mu)^2$$

So what is x_i? They are the individual x values 9, 2, 5, 4, 12, 7, etc. In other words,
x_1 = 9, x_2 = 2, x_3 = 5, etc.
So it says “for each value, subtract the mean and square the result,” like this:
Example (continued):
(9 - 7)² = (2)² = 4
(2 - 7)² = (-5)² = 25
(5 - 7)² = (-2)² = 4
(4 - 7)² = (-3)² = 9
(12 - 7)² = (5)² = 25
(7 - 7)² = (0)² = 0
(8 - 7)² = (1)² = 1
… etc …
And we get these results:
4, 25, 4, 9, 25, 0, 1, 16, 4, 16, 0, 9, 25, 4, 9, 9, 4, 1, 4, 9
Step 3. Then work out the mean of those squared differences. To work out the mean,
add up all the values then divide by how many.
First add up all the values from the previous step.
But how do we say “add them all up” in mathematics? We use “Sigma”: Σ
The handy Sigma Notation says to sum up as many terms as we want.
We want to add up all the values from 1 to N, where N = 20 in our case because
there are 20 values:
Example (continued):

$$\sum_{i=1}^{N} (x_i - \mu)^2$$

Which means: sum all values from (x_1 - 7)² to (x_N - 7)².
We already calculated (x_1 - 7)² = 4 etc. in the previous step, so just sum them up:
4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn’t the mean yet; we need to divide by how many, which is done by
multiplying by 1/N (the same as dividing by N):
Example (continued):

$$\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

Mean of squared differences = (1/20) x 178 = 8.9


(Note: this value is called the “Variance”)

Step 4. Take the square root of that:


Example (concluded):

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$$

σ = √(8.9) = 2.983…
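The four steps can be followed one at a time in Python. The sketch below uses Sam's 20 flower counts and cross-checks the result against `statistics.pstdev` (the population standard deviation, which divides by N); the variable names are illustrative.

```python
import math
from statistics import pstdev

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
n = len(flowers)

# Step 1. Work out the mean.
mu = sum(flowers) / n                           # 7.0

# Step 2. For each number, subtract the mean and square the result.
squared_diffs = [(x - mu) ** 2 for x in flowers]

# Step 3. Work out the mean of those squared differences (the variance).
total = sum(squared_diffs)                      # 178.0
variance = total / n                            # 8.9

# Step 4. Take the square root.
sigma = math.sqrt(variance)

print(mu, total, variance)   # 7.0 178.0 8.9
print(round(sigma, 3))       # 2.983
```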

Sample Standard Deviation


But sometimes our data are only a sample of the whole population.
Example: Sam has 20 rose bushes, but only counted the flowers on 6 of them!
The “population” is all 20 rose bushes, and the “sample” is the 6 bushes that Sam counted
among the 20.
Let us say Sam’s flower counts are: 9, 2, 5, 4, 12, 7
We can still estimate the Standard Deviation.
But when we use the sample as an estimate of the whole population, the Standard
Deviation formula changes to this:
The formula for Sample Standard Deviation:

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$

The important change is “N-1” instead of “N” (which is called “Bessel’s correction”).
The symbols also change to reflect that we are working on a sample instead of the whole
population:
• The mean is now x̄ (for sample mean) instead of μ (the population mean),
• And the answer is s (for Sample Standard Deviation) instead of σ.
But the change of symbols does not affect the calculations. Only N-1 instead of N changes the calculations.
Here are the steps in calculating the Sample Standard Deviation:
Step 1. Work out the mean
Example 2: Using sampled values 9, 2, 5, 4, 12, 7
The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5
So: x̄ = 6.5

Step 2. Then for each number: subtract the Mean and square the result
Example 2 (continued):
(9 - 6.5)² = (2.5)² = 6.25
(2 - 6.5)² = (-4.5)² = 20.25
(5 - 6.5)² = (-1.5)² = 2.25
(4 - 6.5)² = (-2.5)² = 6.25
(12 - 6.5)² = (5.5)² = 30.25
(7 - 6.5)² = (0.5)² = 0.25

Step 3. Then work out the mean of those squared differences.


To work out the mean, add up all the values then divide by how many.
But hang on... we are calculating the Sample Standard Deviation, so instead of
dividing by how many (N), we will divide by N-1.
Example 2 (continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
Divide by N-1: (1/5) x 65.5 = 13.1
(This value is called the “Sample Variance”)

Step 4. Take the square root of that:


Example 2 (concluded):

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$

s = √(13.1) = 3.619…
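The sample calculation differs from the population calculation only in the divisor. The sketch below follows the same four steps for Sam's six sampled bushes and cross-checks against `statistics.stdev`, which applies the N-1 divisor; the variable names are illustrative.

```python
import math
from statistics import stdev

sample = [9, 2, 5, 4, 12, 7]
n = len(sample)

# Step 1. Work out the sample mean.
x_bar = sum(sample) / n                              # 6.5

# Step 2. Subtract the mean from each value and square the result.
squared_diffs = [(x - x_bar) ** 2 for x in sample]

# Step 3. Sum the squared differences and divide by N-1 (Bessel's correction).
total = sum(squared_diffs)                           # 65.5
sample_variance = total / (n - 1)                    # 13.1

# Step 4. Take the square root.
s = math.sqrt(sample_variance)

print(x_bar, total, round(sample_variance, 1))       # 6.5 65.5 13.1
print(round(s, 3))                                   # 3.619
```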
7.6. Comparing
When we used the whole population we got: Mean = 7, Standard Deviation = 2.983…
When we used the sample we got: Sample Mean = 6.5, Sample Standard Deviation =
3.619…
Our Sample Mean was wrong by 7%, and our Sample Standard Deviation was wrong by
21%.
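The 7% and 21% figures are relative errors of the sample estimates against the population values. A minimal sketch of that arithmetic follows; the helper name `percent_error` is hypothetical.

```python
from statistics import mean, pstdev, stdev

population = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
sample = [9, 2, 5, 4, 12, 7]  # the 6 bushes Sam counted

def percent_error(estimate, true_value):
    """Size of the error as a percentage of the true value."""
    return abs(estimate - true_value) / abs(true_value) * 100

mean_error = percent_error(mean(sample), mean(population))
sd_error = percent_error(stdev(sample), pstdev(population))

print(round(mean_error))  # 7
print(round(sd_error))    # 21
```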

Why Take a Sample?


Mostly because it is easier and cheaper.
Imagine you want to know what the whole university thinks... you can’t ask thousands of
people, so instead you ask maybe only 300 people. Samuel Johnson once said “You don’t have to
eat the whole ox to know that the meat is tough.”
(Source: https://www.mathsisfun.com/data/standard-deviation-formulas.html,Retrieved 1-24-19)

7.7. More Notes on Standard Deviation


The standard deviation is simply the square root of the variance. The standard deviation is
an especially useful measure of variability when the distribution is normal or approximately normal
because the proportion of the distribution within a given number of standard deviations from the
mean can be calculated. For example, about 68% of the distribution is within one standard deviation
of the mean and approximately 95% of the distribution is within two standard deviations of the mean.
Therefore, if you had a normal distribution with a mean of 50 and a standard deviation of 10, then
about 68% of the distribution would be between 50 - 10 = 40 and 50 + 10 = 60. Similarly, about 95% of
the distribution would be between 50 - 2 x 10 = 30 and 50 + 2 x 10 = 70. The symbol for the
population standard deviation is σ; the symbol for the sample standard deviation is s.
Figure 18 shows two normal distributions. The first distribution (bold line) has a mean of 40 and
a standard deviation of 5; the other distribution has a mean of 60 and a standard deviation of 10.
For the first distribution, 68% of the distribution is between 35 and 45; for the other
distribution, 68% is between 50 and 70.

http://onlinestatbook.com/2/summarizing_distributions/variability.html
Figure 18. Normal distributions with standard deviations of 5 and 10.
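The intervals quoted above are simply the mean plus or minus one or two standard deviations. The sketch below computes them and, as a check, uses `statistics.NormalDist` (Python 3.8 and later) to confirm the approximate 68% and 95% proportions; the helper name `interval` is hypothetical.

```python
from statistics import NormalDist

def interval(mean, sd, k):
    """Scores lying within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

print(interval(50, 10, 1))  # (40, 60)
print(interval(50, 10, 2))  # (30, 70)
print(interval(40, 5, 1))   # (35, 45)
print(interval(60, 10, 1))  # (50, 70)

# Proportion of a normal distribution (mean 50, SD 10) inside each interval.
dist = NormalDist(mu=50, sigma=10)
within_1 = dist.cdf(60) - dist.cdf(40)  # about 0.68
within_2 = dist.cdf(70) - dist.cdf(30)  # about 0.95
print(round(within_1, 2), round(within_2, 2))
```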

Standard deviation is a measure of dispersion: the more dispersed the data, the less
consistent the data are. A lower standard deviation means that the data are more clustered around
the mean and hence the data set is more consistent.
You need to read your calculator instructions to see what notation your calculator uses for
the standard deviation.
An example: standard deviation for a data set in which each value has a frequency of 1.
Using the following data: 10 15 13 25 22 53 47
We found the mean to be x̄ = 26.4285714. You should also see from the same calculation
that the standard deviation (SD) = 16.98879182. (This is the sample standard deviation, computed
with N-1.)
(2009 ASU School of Mathematical & Statistical Sciences and Terri Miller, retrieved, 1-15-19)
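If no calculator is at hand, the same figures can be obtained with Python's `statistics` module. The sketch below is an assumption about tooling, not part of the cited source: `stdev` divides by N-1 and reproduces the SD quoted above, while `pstdev` divides by N and gives a smaller value.

```python
from statistics import mean, pstdev, stdev

data = [10, 15, 13, 25, 22, 53, 47]

x_bar = mean(data)
sample_sd = stdev(data)        # divides by N-1
population_sd = pstdev(data)   # divides by N

print(round(x_bar, 7))      # 26.4285714
print(round(sample_sd, 5))  # 16.98879
print(population_sd < sample_sd)
```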

7.8. Interpretation of Standard Deviation


Let us use the standard deviation to compare two data sets and to interpret how consistent
the data are. The lower the standard deviation, the more consistent the data are.
Example - Two bowlers, Katie and Mike have the scores given below:
Katie’s Scores 189 146 200 241 231
Mike’s Scores 235 201 217 168 186
Both sets of data have a mean (x̄) = 201.4. Does this mean they are equivalent bowlers?
No, consider the standard deviations. Katie has a standard deviation of SD = 37.6470 and
Mike has a standard deviation of SD = 26.1017. Since Mike has a smaller standard deviation, he is
a more consistent bowler than Katie, i.e. Mike is more likely to get a score close to 201.4.
Let’s presume that Katie’s and Mike’s scores are scores in a long test:
Katie’s Scores - 189 146 200 241 231
Mike’s Scores - 235 201 217 168 186
If you compute the mean for both sets of scores, you get 201.4. The SD for Katie’s scores is
37.6470 while that of Mike is 26.1017. Mike’s scores indicate greater consistency than those of
Katie. Since the two means are equal, this does not mean that Mike scores higher on average; it
means that his performance is steadier and more predictable than Katie’s.
(Source: 2009 ASU School of Mathematical & Statistical Sciences and Teri L. Miller, Retrieved 1-25-19)
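The comparison of Katie and Mike can be checked with the sketch below. The SD values quoted above match the sample standard deviation (N-1 divisor), so `statistics.stdev` is used; the variable names are illustrative.

```python
from statistics import mean, stdev

katie = [189, 146, 200, 241, 231]
mike = [235, 201, 217, 168, 186]

katie_mean, mike_mean = mean(katie), mean(mike)
katie_sd, mike_sd = stdev(katie), stdev(mike)

print(katie_mean, mike_mean)                  # 201.4 201.4
print(round(katie_sd, 4), round(mike_sd, 4))  # 37.647 26.1017

# Equal means, so the smaller SD identifies the more consistent scorer.
more_consistent = "Mike" if mike_sd < katie_sd else "Katie"
print(more_consistent)                        # Mike
```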
Learning Task 1-A

Directions: Encircle the letter of the correct answer.

1. Which is referred to as average of scores?


A. Mean C. Mode
B. Median D. Standard Deviation
2. If scores are plotted in a histogram, what do you call the score with the highest frequency?
A. Mean C. Mode
B. Median D. Standard Deviation
3. Which is the midpoint of a score distribution?
A. Mean C. Mode
B. Median D. Standard Deviation
4. Which does NOT belong?
A. Mean C. Mode
B. Median D. Standard Deviation
5. Which is a measure of variability?
A. Range C. Mean
B. Median D. Mode
6. Which is a measure of dispersion?
A. Mean C. Mode
B. Median D. Variance
7. Which is a measure of the spread of scores?
A. Mean C. Standard Deviation
B. Mode D. Median
8. You would like to get a more reliable picture of the scores of your students in your Math
class. Which will you compute?
A. The mean
B. The mean and the SD
C. The difficulty index
D. The discrimination index
9. Here is a score distribution of a quiz with 10 as the highest possible score: 2, 4, 5, 5, 6, 7, 7,
7, 8, 8. Which is the range?
A. 2 C. 7
B. 6 D. 8
10. Which score distribution do all teachers, parents and students wish for?
A. Negatively skewed C. Positively skewed
B. Bell curve D. That depends on the Mean
11. If no real teaching and learning takes place, which score distribution is most
likely to result?
A. Negatively skewed C. Positively skewed
B. Bell curve D. That depends on the Mean
12. Among the measures of central tendency, which is most affected by outliers?
A. Mean C. Median
B. Mode D. Range
13. If a score distribution has no outliers, which is most likely to be TRUE?
A. The scores may not be so varied.
B. The scores may be highly varied.
C. In this case, the median is the most reliable measure of central tendency.
D. In this case, the mode is the best measure of central tendency.
14. Which is the mean of the squared deviation from the mean?
A. Variance C. Standard Deviation
B. Range D. Mean
15. Which is TRUE of scores that follow the normal distribution curve?
A. The mean, the median and mode are equal.
B. The median is higher than the mean.
C. The mean is higher than the median.
D. The mode is higher than the mean and the median.
16. If a score distribution has a Standard Deviation of zero, what does it mean?
A. Most scores are zero.
B. The scores are the same.
C. Most scores are high.
D. Most scores are negative.
B. Problem Solving

1. Here is a set of scores: 1, 2, 3, 4, 5, 6, 7.


What is the mean?
What is the median?
What is the mode?
What is the range?

2. A student has gotten the following grades on his tests: 87, 95, 76, and 88. He wants an
85 or better overall. What is the minimum grade he must get on the last test in order to
achieve that average?

References:

Balagtas, M. et al. (2020). Assessment in Learning 1. 1st Edition. Rex Book Store.
Buenaflor, R. C. (2012). Assessment of learning book one: the conventional approach.
Quezon City: Great Books Publishing.
Garcia, C. D. (2013). Measuring and evaluating learning outcomes: a textbook in educational
assessment 1 & 2. Second Ed. Mandaluyong City: Books Atbp. Publishing Corp.
Navarro, R. L. et al. (2017). Assessment of Learning 1 (OBE- and K to 12-Based). 3rd Edition.
Quezon City: Lorimar Publishing Inc.
Navarro, R. L. et al. (2019). Assessment of Learning 1 (OBE- & PPST-Based). 4th Edition.
Quezon City: Lorimar Publishing Inc.
Links:
https://www.youtube.com/watch?v=rNz0zPCgYyU
https://www.mathsisfun.com/data/standard-deviation-formulas.html (Retrieved 5-1-19)

Prepared by:

DIOSALYN T. GALANG, MAEd


Asst. Prof. 3

LEAH C. NAVARRO, EdD


Chair, TED

MAT M. NUESTRO, MEM


Director, Curriculum and Instruction
