Course Code and Title: MAT100 – MATHEMATICS IN THE MODERN WORLD

Lesson 8 Week 8
Topic: DATA MANAGEMENT – MEASURES OF RELATIVE POSITION (Z-SCORES, PERCENTILES,
QUARTILES, BOX-AND-WHISKER PLOT)

Measures of Relative Position: Z-scores, Percentiles, Quartiles, and Box-and-Whisker Plot

INTRODUCTION:

In addition to measures of central tendency and measures of dispersion, there are measures of position,
which are used to locate the relative position of a value in the data set. These measures include standard
scores (z-scores), percentiles, and quartiles; the box-and-whisker plot presents them graphically.

A measure of position determines the position of a single value in relation to other values in a sample
or a population data set. These are conversions of values, usually standardized test scores, to show where a
given value stands in relation to other values of the same grouping. The most common example in education is
the conversion of scores on standardized tests to show where a given student stands in relation to other students
of the same age, grade level, etc.

Converted scores are based on the standard deviation, that is, on the distance of a raw score from the mean
of a normal curve or distribution. In a normal distribution, the interval from one standard deviation (S.D.)
below the mean to one S.D. above the mean contains approximately 68 percent of all the scores. The interval
from two S.D. below the mean to two S.D. above it (-2 to +2) contains approximately 95 percent of all scores,
and the interval from -3 to +3 S.D. contains over 99 percent of all scores.
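The three percentages above can be checked numerically. The following is a minimal sketch, assuming an exactly normal distribution; the function name `share_within` is made up for illustration.

```python
import math

def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    # For a standard normal variable Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} S.D.: {share_within(k) * 100:.1f}%")
# within 1 S.D.: 68.3%
# within 2 S.D.: 95.4%
# within 3 S.D.: 99.7%
```

Real test scores are only roughly normal, so these figures are best read as approximations.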

LEARNING OBJECTIVES:
At the end of this lesson, students should be able to:

1. describe and explain the characteristics of the different measures of relative position;
2. compute various measures of position for ungrouped data;
3. interpret the use of the z-scores and the different quartiles; and
4. make a presentation of the box-and-whiskers plot and be able to interpret it.

CONTENT:

Z-SCORE

The number of standard deviations between a data value and the mean is known as the data value’s
z-score or standard score. The z-score for a given data value x is the number of standard deviations that x is
above or below the mean of the data.

On the horizontal axis of a normal curve, the units are denoted by z and are called the z-values or
z-scores. A specific value of z gives the distance between the mean and the point represented by z, measured
in standard deviations. The z-values on the right side of the mean are positive and those on the left side
are negative.

For example, a point with a value of z = 2 is two standard deviations to the right of the mean. Similarly,
a point with a value of z = -2 is two standard deviations to the left of the mean.

The following formulas show how to calculate the z-score for a data value x in a population and in a sample:

Population: $z = \dfrac{x - \mu}{\sigma}$
Where: $x$ = observed value, $\mu$ = population mean, $\sigma$ = population standard deviation

Sample: $z = \dfrac{x - \bar{x}}{s}$
Where: $x$ = observed value, $\bar{x}$ = sample mean, $s$ = sample standard deviation
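The difference between the two formulas lies only in which mean and standard deviation are used. The sketch below illustrates this with Python's standard `statistics` module; the data list is hypothetical (made up for illustration, not taken from the lesson), and the function names are likewise assumptions.

```python
import statistics

def z_population(x, data):
    """z-score of x, treating data as the entire population."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)   # population standard deviation (divides by n)
    return (x - mu) / sigma

def z_sample(x, data):
    """z-score of x, treating data as a sample."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)        # sample standard deviation (divides by n - 1)
    return (x - x_bar) / s

# Hypothetical scores, made up for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

z_pop = z_population(9, scores)
z_samp = z_sample(9, scores)
print(round(z_pop, 2), round(z_samp, 2))
```

For this list the mean is 5, so the value 9 lies 2 population standard deviations above the mean and about 1.87 sample standard deviations above it. The sample z-score is slightly smaller because s is slightly larger than σ.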

EXAMPLE: Comparing z-scores

Ruben has taken two tests in his chemistry class. He scored 72 on the first test, for which the mean of
all scores was 65 and the standard deviation was 8. He received a 60 on a second test, for which the mean of
all scores was 45 and the standard deviation was 12. In comparison to the other students, did Ruben do better
on the first test or the second test?

SOLUTION: Find the z-score for each test using the formula $z_x = \dfrac{x - \mu}{\sigma}$.

First test: $z_{72} = \dfrac{72 - 65}{8} \approx 0.88$
Second test: $z_{60} = \dfrac{60 - 45}{12} = 1.25$

Ruben scored 0.88 standard deviation above the mean on the first test and 1.25 standard deviations
above the mean on the second test. These z-scores indicate that, in comparison to his classmates, Ruben
scored better on the second test than he did on the first test.
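The comparison above can be reproduced in a few lines. This is a minimal sketch; the function name `z_score` is chosen here for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_first = z_score(72, 65, 8)     # first test
z_second = z_score(60, 45, 12)   # second test

print(f"first test:  z = {z_first:.3f}")
print(f"second test: z = {z_second:.2f}")
print("better relative result:", "second test" if z_second > z_first else "first test")
```

The unrounded z-score for the first test is 0.875, which the solution rounds to 0.88.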

What does a z-score of 3 for a data value x represent? What does a z-score of -1 for a data value x
represent? A z-score of 3 for a data value x means that x is 3 standard deviations above the mean. A z-score
of -1 for a data value x means that x is 1 standard deviation below the mean.

The z-score equation $z_x = \dfrac{x - \bar{x}}{s}$ involves four variables. If the values of any three of
the four variables are known, you can solve for the unknown variable. This procedure is illustrated in the
example below:

EXAMPLE:
A consumer group tested a sample of 100 light bulbs. It found that the mean life expectancy of the
bulbs was 842 h, with a standard deviation of 90 h. One particular light bulb from the Dura Bright Company had
a z-score of 1.2. What was the life span of this light bulb?

SOLUTION:
Given: $z_x = 1.2$, $\bar{x} = 842$, $s = 90$
Substitute the given values into the z-score equation and solve for x.

$$z_x = \frac{x - \bar{x}}{s} \quad\Rightarrow\quad 1.2 = \frac{x - 842}{90} \quad\Rightarrow\quad 108 = x - 842 \quad\Rightarrow\quad x = 950$$

The life span of the light bulb is 950 h.
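Solving the z-score equation for x gives x = mean + z · s. A minimal sketch of this rearrangement follows; the function name `value_from_z` is an assumption made for illustration.

```python
def value_from_z(z, mean, sd):
    """Invert z = (x - mean) / sd to recover the data value x."""
    return mean + z * sd

life_span = value_from_z(1.2, 842, 90)   # hours
print(round(life_span, 1))
```

The same function answers similar questions for any z-score, for example a bulb with z = -1 would have a life span one standard deviation below the mean.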


EXAMPLE:
Danny scored 72 in a quiz in Algebra for which the average score of the class was 65 with a standard
deviation of 8. He also took a quiz in Statistics and scored 60 for which the average score of the class was 45,
and the standard deviation was 12. Relative to other students in the class, did Danny do better in Algebra or
Statistics?

SOLUTION: Compute the z-scores of Danny’s scores for each quiz.

For Algebra: $z_{72} = \dfrac{72 - 65}{8} \approx 0.88$
For Statistics: $z_{60} = \dfrac{60 - 45}{12} = 1.25$

In Algebra, Danny scored 0.88 standard deviation above the mean. In Statistics, he scored 1.25
standard deviations above the mean. These indicate that relative to his classmates, Danny scored better in
Statistics than in Algebra.

PERCENTILES:
Percentiles (denoted by $P_k$) are measures of relative position that divide the distribution into 100 equal
parts. The kth percentile is the value such that approximately k percent of the data are below that value and
approximately (100 − k) percent are above that value.

Percentiles are also used to compare an individual’s test score with an established norm. For example, tests
such as the National Secondary Achievement Test (NSAT) are taken by high school students. A student’s
scores are compared with those of the other students locally and nationally using percentile ranks.

Percentiles are not the same as percentages. If a student gets 75 correct answers out of 100 items in
an examination in his class, then he obtains a percentage score of 75. But this does not tell his position with
respect to the rest of his class. If his score is at the 70th percentile, then he did better than 70% of the
students in his class.

Most standardized examinations report scores in terms of percentiles, defined as follows:
pth percentile: a value x is called the pth percentile of a data set provided p% of the data values are less
than x.

EXAMPLE: Using Percentiles


In a recent year, the median annual salary for a physical therapist was $74,480. If the
90th percentile for the annual salary of a physical therapist was $105,900, find the percent of
physical therapists whose annual salary was:
a. more than $74,480.
b. less than $105,900.
c. between $74,480 and $105,900.
SOLUTION: a. By definition, the median is the 50th percentile. Therefore, 50% of the physical therapists
earned more than $74,480 per year.
b. Because $105,900 is the 90th percentile, 90% of all physical therapists made less than
$105,900.
c. From parts a and b, 90% - 50% = 40% of the physical therapists earned between $74,480
and $105,900.

To approximate the percentile rank of a value x in the distribution, use:

$$\text{Percentile} = \frac{(\text{number of values below } x) + 0.5}{\text{total number of values}} \cdot 100$$

EXAMPLE: What is the percentile rank of 24?


23 25 19 21 28 15 20 24 22 27

SOLUTION: Arrange the data in ascending order.


15 19 20 21 22 23 24 25 27 28
There are 6 values below 24. Determine the percentile using the formula.

$$\text{Percentile} = \frac{6 + 0.5}{10} \cdot 100 = 65$$

The score 24 is at the 65th percentile.
This means that a student with a score of 24 did better than about 65% of the class.
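The percentile-rank calculation above can be sketched as follows. The function name `percentile_rank` is chosen for illustration; note that counting the values below x does not actually require sorting the data first.

```python
def percentile_rank(x, data):
    """Approximate percentile rank: (values below x + 0.5) / n * 100."""
    below = sum(1 for value in data if value < x)
    return 100 * (below + 0.5) / len(data)

scores = [23, 25, 19, 21, 28, 15, 20, 24, 22, 27]
rank_24 = percentile_rank(24, scores)
print(rank_24)  # 65.0
```

Sorting the data, as the solution does, is still helpful when counting by hand.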

A simpler formula, without the 0.5 adjustment, can also be used to find the percentile that corresponds to
a particular data value x in a set of data:

$$\text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$$
EXAMPLE:
On a reading examination given to 900 students, Elaine’s score of 602 was higher than the scores of
576 of the students who took the examination. What is the percentile for Elaine’s score?

SOLUTION:

$$\text{Percentile} = \frac{\text{number of data values less than } 602}{\text{total number of data values}} \cdot 100 = \frac{576}{900} \cdot 100 = 64$$

Elaine’s score of 602 places her at the 64th percentile.

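As a quick check of Elaine's percentile, the sketch below applies this formula and, for comparison, the earlier formula with the 0.5 adjustment. Both function names are assumptions made for illustration.

```python
def percentile_simple(below, total):
    """Percentile = (number of values less than x) / (total number of values) * 100."""
    return 100 * below / total

def percentile_adjusted(below, total):
    """Same idea with the 0.5 adjustment used in the percentile-rank formula."""
    return 100 * (below + 0.5) / total

elaine = percentile_simple(576, 900)
elaine_adjusted = percentile_adjusted(576, 900)
print(elaine)                      # 64.0
print(round(elaine_adjusted, 2))   # 64.06
```

With 900 students the two formulas differ by less than a tenth of a percentile; the adjustment matters more for small data sets.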
QUARTILES:

Quartiles are summary measures that divide a ranked data set into four equal parts. The three measures
that do this are the first quartile (denoted by Q1), the second quartile (denoted by Q2), and the third
quartile (denoted by Q3). The data should be ranked in increasing order before the quartiles are determined.
The second quartile is the same as the median of a data set. The first quartile is the value of the
middle term among the observations that are less than the median, and the third quartile is the value of the
middle term among the observations that are greater than the median.

The figure below describes the position of the three quartiles. Each of these portions contains 25% of
the observations of a data set arranged in increasing order:

25% 25% 25% 25%

Q1 Q2 Q3

Approximately 25% of the values in a ranked data set are less than Q1 and about 75% are greater than
Q1. The second quartile, Q2, divides a ranked data set into two equal parts; hence, the second quartile and the
median are the same. Approximately 75% of the data values are less than Q3 and about 25% are greater than
Q3.

EXAMPLE: Find the values of Q1, Q2, and Q3 of the following scores of students in a class:
20 15 10 29 30 19 12 26 24 18

SOLUTION: Arrange the data in ascending order:


10 12 15 18 19 20 24 26 29 30

Next, determine the median of the distribution; that median is the value of Q2. Then determine the
median of the values of the 1st half of the distribution to get Q1. Finally, determine the median of the values
of the 2nd half of the distribution to get Q3.

Determine Q2, which is the median of the distribution. With 10 values, the median is the mean of the 5th
and 6th values:

10 12 15 18 19 | 20 24 26 29 30

Q2 = (19 + 20)/2 = 19.5

This means that 50% of the students in the class got a score of 19.5 or less.

Determine Q1, which is the median of the lower half of the distribution (10, 12, 15, 18, 19):
Q1 = 15

This means that about 25% of the students obtained a score of 15 or below.

Determine Q3, which is the median of the upper half of the distribution (20, 24, 26, 29, 30):
Q3 = 26

This indicates that about 75% of the students got a score of 26 or below. Equivalently, this means that
about 25% of the class got a score higher than 26.
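The median-of-halves procedure used in this example can be sketched in Python as follows. The function name `quartiles` is chosen for illustration. How to treat the middle value when the number of observations is odd is an assumption here: the sketch leaves it out of both halves.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ranked = sorted(data)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]        # values below the median position
    upper = ranked[n - half:]    # values above the median position
    # Assumption: when n is odd, the middle value belongs to neither half.
    return statistics.median(lower), statistics.median(ranked), statistics.median(upper)

scores = [20, 15, 10, 29, 30, 19, 12, 26, 24, 18]
q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 15 19.5 26
```

Other conventions for quartiles exist (for example, the interpolation methods of `statistics.quantiles`), so software may report slightly different values for the same data.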

THE BOX-AND-WHISKER PLOT

The box-and-whisker plot is a plot that shows the center, spread, and skewness of a data set. It is
constructed by drawing a box and two whiskers that use the median, the first quartile, the third quartile, and the
smallest and the largest values in the data set that lie between the lower and the upper inner fences.

A box-and-whisker plot gives a graphic presentation of data using five measures: the three
quartiles and the two extreme values (the minimum and the maximum). Together,
these five numbers are called the five-number summary of a data set.

The box-and-whisker plot helps us visualize the center, the spread, and the skewness
of a data set. It also helps detect outliers. We can compare different distributions by making a
box-and-whisker plot for each of them.

Five-number summary: { Xmin, Q1, Q2, Q3, Xmax }

The five-number summary is used to construct a box plot, as in the figure below. Each of the five
numbers is represented by a vertical line segment. A box is formed using the line segments at Q1 and Q3 as
its two vertical sides, and two horizontal line segments are extended from the vertical segments
marking Q1 and Q3 to the adjacent extreme values, Xmin (minimum) and Xmax (maximum).

The two horizontal line segments are referred to as "whiskers," and the diagram is called a "box-and-whiskers
plot."

[Figure: The Box-and-Whiskers Plot, labeling Xmin, Q1, Q2, Q3, Xmax, the box, and the whiskers]

The following explains all the steps needed to make a box-and-whisker plot.

EXAMPLE: The following are the incomes (in thousands of dollars) for a sample of 13 households:

23 17 32 60 22 52 29 38 42 92 27 46 41

Construct a box-and-whiskers plot for these data.

SOLUTION: Step 1. First, rank the data in increasing order:


17 22 23 27 29 32 38 41 42 46 52 60 92

Step 2. Calculate the median, which is the second quartile, Q2. The median of
these 13 data values has a rank of 7. Thus, the median, Q2, is 38.

17 22 23 27 29 32 [38] 41 42 46 52 60 92

Median (Q2) = 38

Step 3. There are 6 data values less than the median and 6 data values greater than the
median. The first quartile, Q1, is the median of the data values less than the median
(17, 22, 23, 27, 29, 32). Thus, Q1 is the mean of the data values with ranks of 3 and 4.

Q1 = (23 + 27)/2 = 25

Step 4. The third quartile, Q3, is the median of the data values greater than the median
(41, 42, 46, 52, 60, 92). Thus, Q3 is the mean of the data values with ranks of 10 and 11.

Q3 = (46 + 52)/2 = 49

Step 5. Find the interquartile range: Q3 - Q1 = 49 - 25 = 24. Then find the points that are
1.5 x 24 below Q1 and 1.5 x 24 above Q3. These two points are called the lower and upper
inner fences, respectively.

1.5 x 24 = 36
Lower inner fence = Q1 - 36 = 25 - 36 = -11
Upper inner fence = Q3 + 36 = 49 + 36 = 85
Step 6. Determine the smallest and the largest values in the given data set within the two
inner fences. These two values for our example are as follows:

Smallest value within the two inner fences = 17


Largest value within the two inner fences = 60

Step 7. Construct the Box-and-whisker plot.

a. Draw a horizontal scale that extends from the minimum data value to the maximum
data value.

b. Above the scale, draw a rectangle (box) with its left side at Q1 and its right side at
Q3.

c. Draw a vertical line segment across the rectangle at the median, Q2.

d. Draw a horizontal line segment, called a whisker, that extends from Q1 to the
smallest value within the inner fences, and another that extends from Q3 to the largest
value within the inner fences. Mark any value outside the inner fences with an asterisk.
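For these data:

[Figure: Box-and-whisker plot of the incomes. The box spans Q1 to Q3 with the median marked; the whiskers end
at the smallest value (17) and the largest value (60) within the inner fences; the outlier (92) is marked with
an asterisk. Horizontal axis: Income (in thousands of dollars), scaled from 15 to 95.]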

By drawing two lines, join the points of the smallest and the largest values within the
two inner fences to the box. These values are 17 and 60. The two lines that join the box to these
two values are called whiskers. A value that falls outside the two inner fences is called an outlier
and is shown by marking an asterisk.
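Steps 5 and 6 of the example, together with the outlier check, can be verified with a short sketch. The quartiles are taken as given from Steps 3 and 4; the function name `inner_fences` is an assumption made for illustration.

```python
def inner_fences(q1, q3):
    """Lower and upper inner fences: 1.5 * (Q3 - Q1) below Q1 and above Q3."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

incomes = [23, 17, 32, 60, 22, 52, 29, 38, 42, 92, 27, 46, 41]
q1, q3 = 25, 49                       # quartiles found in Steps 3 and 4

lower, upper = inner_fences(q1, q3)
inside = [v for v in incomes if lower <= v <= upper]
outliers = [v for v in incomes if v < lower or v > upper]

print(lower, upper)              # -11.0 85.0
print(min(inside), max(inside))  # 17 60
print(outliers)                  # [92]
```

The whiskers end at 17 and 60, and 92 is the only value plotted as an outlier, which agrees with the plot described above.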

SUMMARY:

Measures of relative position or dispersion are unitless and are used when one wishes to compare the
scatter of one distribution with another distribution. The standard score is not a measure of relative dispersion
per se but is somewhat related. The standard score is useful for comparing two values from different series,
especially when the two series differ with respect to the mean, the standard deviation, or both, or are
expressed in different units.

Standard scores or z-scores can be used in calculations and provide comparative information with other
persons or with other scores for the same person. For research purposes, they are equal to, and sometimes
preferred over, raw scores, and are better than percentile ranks. Z-scores are also used for simple
communication. The standard deviation is the unit of measurement of the z-score. The z-score allows comparison
of observations from different normal distributions, which is done frequently in research.

