
DATA MANAGEMENT

I. LEARNING OUTCOMES:
At the end of this chapter, students should be able to:
1. Recall the different terms and basic concepts in Statistics.
2. Compute and interpret the mean, median and mode for ungrouped and
grouped data.
3. Compute and interpret the different measures of position.
4. Calculate and interpret the range, variance and standard deviation of
ungrouped and grouped data.

II. DISCUSSION:
REVIEW OF THE BASIC CONCEPTS IN STATISTICS
What is Statistics? Statistics is a field of Mathematics that deals with the
collection, organization, analysis, and interpretation of quantitative data.

Two Fields of Statistics


Descriptive Statistics – consists of the collection, organization,
summarization, and presentation of data. It describes a given situation.
Inferential Statistics – concerned with drawing conclusions about a large
group of data, called the population. It makes inferences from samples to
populations. This area also makes use of the concept of probability.

MEASURES OF CENTRAL TENDENCY


A measure of central tendency is any single value used to identify the “center” of the
data or the typical value; it is commonly referred to as an average. Measures of central
tendency are numerical descriptive measures that indicate or locate the center of a set of data.

THE ARITHMETIC MEAN


Mean is most commonly known as “average”. Population mean is denoted
by the Greek letter µ (mu) and sample mean is represented by x̅ (x bar).
To determine the population/sample mean for ungrouped data, the formulas
shown below can be used:
POPULATION MEAN
$$\mu = \frac{\sum x}{N}$$
Where:
µ = population mean
∑x = sum of the population observations
N = population size

SAMPLE MEAN
$$\bar{x} = \frac{\sum x}{n}$$
Where:
x̅ = sample mean
∑x = sum of the sample observations
n = sample size

Example 4.1
The numbers of faculty members in 12 different universities are 58, 78, 42, 68, 66, 44,
52, 72, 67, 82, 45, and 56. Based on the data, find the population mean number of faculty
members for the 12 universities.

Solution:
$$\mu = \frac{\sum x}{N} = \frac{58 + 78 + 42 + 68 + 66 + 44 + 52 + 72 + 67 + 82 + 45 + 56}{12} = \frac{730}{12} \approx 60.83$$
∴ The population mean number of faculty members of the 12 universities is
60.83, or about 61.
Example 4.2
The numbers of students in selected courses are 47, 38, 39,
51, 48, 37, 52, 53, and 49. Find the mean number of students per course.
Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{47 + 38 + 39 + 51 + 48 + 37 + 52 + 53 + 49}{9} = \frac{414}{9} = 46$$
∴ The sample mean number of students is 46.
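
The two mean computations above can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` and the variable names are illustrative choices, not part of the module.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)

faculty = [58, 78, 42, 68, 66, 44, 52, 72, 67, 82, 45, 56]   # Example 4.1
students = [47, 38, 39, 51, 48, 37, 52, 53, 49]              # Example 4.2

print(round(mean(faculty), 2))   # 60.83
print(mean(students))            # 46.0
```

The same function serves both the population and the sample formula, since they differ only in notation (N versus n).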

PROPERTIES OF MEAN
• The sum of the deviations of all measurements in a set from the mean is 0.
• It can be calculated for any set of numerical data, so it always exists.
• A set of numerical data has one and only one mean.
• It lends itself to higher statistical treatment.
• It is the most reliable since it takes into account every item in the set of data.
• It is greatly affected by extreme or deviant values.
• It is used only if the data are interval or ratio and when normally distributed.

MEDIAN
The median is the middle value of a set of values arranged from lowest to highest.
Half of the values in the array precede the median and half follow it. It is denoted by Md.
If the number of observed values (n) is odd, the median is found at the
$$\left(\frac{n+1}{2}\right)^{\text{th}}$$
position, and the value of the observation at that position in the array is taken as the median.

If n is even, the mean of the two middle values in the array is the
median.
$$Md = \frac{x_{n/2} + x_{(n/2)+1}}{2}$$

Example 4.3
Find the median of the given data set: 36, 49, 42, 39, 45, 34 and 47
Step 1: Arrange the data from lowest to highest or from highest to
lowest.
x1   x2   x3   x4   x5   x6   x7
34   36   39   42   45   47   49
Since the number of observed values is odd, use the position formula (n+1)/2.
Step 2: With n = 7, the median position is
$$\frac{n+1}{2} = \frac{7+1}{2} = \frac{8}{2} = 4$$

∴ The median is the value in the 4th position of the array, which is 42.


Example 4.4
Find the median of the given data set: 23, 41, 27, 48, 30, 21, 43, and 36
Step 1: Arrange the data from lowest to highest or from highest to
lowest.
x1   x2   x3   x4   x5   x6   x7   x8
21   23   27   30   36   41   43   48
Since the number of observed values is even, use the formula
$$Md = \frac{x_{n/2} + x_{(n/2)+1}}{2}$$
Step 2: With n = 8, the two middle values are the 4th and 5th observations:
$$Md = \frac{x_4 + x_5}{2} = \frac{30 + 36}{2} = \frac{66}{2} = 33$$

∴ The median is 33.
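
The odd and even cases for the median can be sketched in Python as follows. The function name `median` is an illustrative choice; the indices are 0-based, so the ((n+1)/2)th value sits at index n // 2 when n is odd.

```python
def median(values):
    """Median of ungrouped data, following the odd/even rule in the text."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)th value, which is index mid in 0-based terms
        return ordered[mid]
    # even n: mean of the (n/2)th and (n/2 + 1)th values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([36, 49, 42, 39, 45, 34, 47]))      # Example 4.3 -> 42
print(median([23, 41, 27, 48, 30, 21, 43, 36]))  # Example 4.4 -> 33.0
```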


PROPERTIES OF MEDIAN
• It is the score or class in a distribution below which 50% of the scores fall
and above which the other 50% lie.
• It is not affected by extreme or deviant values.
• It is appropriate to use when there are extreme or deviant values.
• It is used when the data are ordinal.
• It exists in both quantitative and qualitative data.

MODE
The mode is the value that occurs most frequently in a data set. Some data sets do
not have a mode because each value occurs only once. On the other hand, some
data sets have more than one mode. This happens when two or more values share
the same frequency and that frequency is greater than that of any other value.
Example 4.5
Identify the mode(s) of the following data sets:
Data set A: 12 8 10 14 9 8 21 7
∴ The data set is unimodal. The mode is 8 because it has the greatest number of
occurrences (two).
Data set B: 4 8 7 4 5 6 10 7
∴ The data set is bimodal. The modes are 4 and 7, each of which occurs twice.
Data set C: 12 9 8 9 7 12 6 9
∴ The data set is unimodal. The mode is 9 because it occurs three times, which is the
greatest number of occurrences.
Data set D: Apple Banana Mango Apple Pineapple
Banana Mango Apple Pineapple Mango
Orange Mango
∴ The data set is unimodal. The mode is Mango, which has the highest number of
occurrences (four).
Data set E: Coffee Milk Juice Tea Milk Coffee Tea
Juice
Coffee Milk Water Tea
∴ The data set is trimodal. Coffee, Milk, and Tea each occur three times, the highest
number of occurrences.
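
Counting occurrences by hand becomes error-prone for longer lists, so the mode search can be sketched with `collections.Counter` from the Python standard library. The function name `modes` is an illustrative choice, and returning an empty list when no value repeats is an assumption that mirrors the "no mode" case described above.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency; [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []          # every value occurs once, so there is no mode
    return [v for v, c in counts.items() if c == highest]

print(modes([12, 8, 10, 14, 9, 8, 21, 7]))   # Data set A -> [8]
print(modes([4, 8, 7, 4, 5, 6, 10, 7]))      # Data set B -> [4, 7]
```

The function works for qualitative data as well, such as the fruit and beverage lists in Data sets D and E.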

WEIGHTED MEAN
A weighted mean is a mean calculated by giving some values in a data set more
influence than others according to some attribute of the data. Each
quantity to be averaged is assigned a weight, and these weights determine the
relative importance of each quantity in the average. To find the weighted mean,
find the sum of the products formed by multiplying each value by its assigned
weight, then divide by the sum of the weights.
$$\text{Weighted Mean} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

Where:
w = weight of each value
x = individual value
Example 4.6
Almond wants to know his final grade in Calculus. Given the following data,
what is Almond’s final grade in Calculus?
Grading System       Weighted Percentage   Almond's Score
Quizzes              15.00%                82
Board work           15.00%                86
Activities           10.00%                78
Assignment           10.00%                93
Project              20.00%                88
Major Examination    30.00%                92
TOTAL                100.00%

$$\text{Almond's Final Grade} = \frac{(82 \times 0.15) + (86 \times 0.15) + (78 \times 0.10) + (93 \times 0.10) + (88 \times 0.20) + (92 \times 0.30)}{100\%}$$

$$\text{Almond's Final Grade} = \frac{12.3 + 12.9 + 7.8 + 9.3 + 17.6 + 27.6}{100\%} = 87.5$$
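
Almond's grade can be reproduced with a short Python sketch of the weighted mean formula. The function name `weighted_mean` is illustrative, and the weights are entered as whole-number percentages, which is an assumption made here to keep the arithmetic exact; only the ratios between weights matter.

```python
def weighted_mean(values, weights):
    """Sum of weight*value products divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

scores = [82, 86, 78, 93, 88, 92]
weights = [15, 15, 10, 10, 20, 30]   # percentages from the grading system
print(weighted_mean(scores, weights))   # 87.5
```

The same function applies to a general weighted average in which the number of units of each subject serves as the weight.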
EXERCISE 4.1
Find the mean, median and mode of the following data.
1. 28, 17, 34, 32, 22, 28, 19, 20, 17, 34, 28, 36

2. 44, 52, 37, 40, 50, 52, 46, 46, 38, 41

3. 12, 19, 24, 17, 13, 19, 12, 20, 10, 11

4. 75, 45, 57, 72, 46, 45, 66, 70, 75, 46

5. 55, 57, 63, 58, 57, 55, 60, 54, 62, 64, 59

6. Mae wants to know her general weighted average. Given the following data,
what is Mae’s general weighted average?
Subject               Units   Grades
General Mathematics   3       88
English               3       85
Science               3       85
Physical Education    2       88
Trigonometry          3       90
Calculus              4       89
TOTAL                 18
MEAN FOR GROUPED DATA
Data are usually grouped when the number of cases is large (n > 30).
A. MIDPOINT METHOD
Procedure:
1. Group the data in the form of a frequency distribution.
2. Compute the midpoint (x) of each class:
$$\text{Midpoint } (x) = \frac{\text{Upper Limit} + \text{Lower Limit}}{2}$$
3. Multiply each midpoint by its corresponding frequency (f·x).
4. Get the sum of the products of the midpoints and frequencies
(∑fx).
5. Divide the sum by the number of cases (n).
Using the Midpoint Method, the formula is

MEAN (MIDPOINT METHOD)
$$\bar{x} = \frac{\sum fx}{n}$$
Where:
x̅ = mean
x = midpoint
∑fx = sum of the products of the frequencies and the midpoints
n = sample size/number of cases

Example 4.7
Find the mean of the given data set:

Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2
Solution:

Class Limits f x fx
10-14 4 12 48
15-19 4 17 68
20-24 5 22 110
25-29 9 27 243
30-34 13 32 416
35-39 11 37 407
40-44 2 42 84
45-49 2 47 94
               n = 50        ∑fx = 1,470

Computation:
$$\bar{x} = \frac{\sum fx}{n} = \frac{1{,}470}{50} = 29.4$$
∴ The mean is equal to 29.4.

B. CLASS DEVIATION METHOD

Procedure:
1. Choose any class as an arbitrary starting point or origin.
2. Get the midpoint of the class you have chosen as your
starting point. Call this your Assumed Mean (AM).
3. Get the deviation (d) of each class from the class containing the
assumed mean. The deviation of the class containing the assumed
mean is 0. Moving to higher classes, add 1 for each successive class
(1, 2, 3, ...); moving to lower classes, subtract 1 for each successive
class (-1, -2, -3, ...).
4. Multiply each deviation by its corresponding frequency (f·d).
5. Get the sum of the products of the deviations and frequencies
(∑fd).
6. Find the class interval (i), then use the formula below:
MEAN (CLASS DEVIATION METHOD)
$$\bar{x} = AM + \left(\frac{\sum fd}{n}\right) i$$
Where:
x̅ = mean
AM = assumed mean
∑fd = sum of the products of the frequencies
and the deviations
i = class interval
n = sample size/number of cases

Example 4.8
Find the mean of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution:
Class Limits f d fd
10-14 4 -4 -16
15-19 4 -3 -12
20-24 5 -2 -10
25-29 9 -1 -9
30-34 13 0 0
35-39 11 1 11
40-44 2 2 4
45-49 2 3 6
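               n = 50        ∑fd = −26
Computation:
The class 30-34 is the origin (d = 0), so AM = 32, and the class interval is i = 5.
$$\bar{x} = AM + \left(\frac{\sum fd}{n}\right) i = 32 + \left(\frac{-26}{50}\right) 5 = 32 - 2.6 = 29.4$$
∴ The mean is equal to 29.4.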
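
Both methods give the same mean, which can be confirmed with the Python sketch below. The representation of each class as a `(lower limit, upper limit, frequency)` tuple and the function names are assumptions made for illustration; the second function also assumes equally wide classes listed from lowest to highest.

```python
# Frequency distribution from Examples 4.7 and 4.8: (lower limit, upper limit, frequency)
classes = [(10, 14, 4), (15, 19, 4), (20, 24, 5), (25, 29, 9),
           (30, 34, 13), (35, 39, 11), (40, 44, 2), (45, 49, 2)]

def grouped_mean_midpoint(classes):
    """Midpoint method: sum(f * x) / n, where x is each class midpoint."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total / n

def grouped_mean_class_deviation(classes, origin_index):
    """Class deviation method: AM + (sum(f * d) / n) * i.

    origin_index picks the class used as the origin (d = 0).
    """
    n = sum(f for _, _, f in classes)
    lo, hi, _ = classes[origin_index]
    assumed_mean = (lo + hi) / 2
    width = classes[1][0] - classes[0][0]          # class interval i
    sum_fd = sum(f * (k - origin_index) for k, (_, _, f) in enumerate(classes))
    return assumed_mean + (sum_fd / n) * width

print(grouped_mean_midpoint(classes))                      # 29.4
print(round(grouped_mean_class_deviation(classes, 4), 2))  # 29.4
```

Choosing a different origin changes AM and ∑fd but not the resulting mean, which is why the starting class may be arbitrary.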

MEDIAN FOR GROUPED DATA


Method 1
Procedure:
1. Accumulate the frequencies starting from the lowest class up to
the highest class. Call this the less-than cumulative frequency (<cf).
2. Find one-half of the number of cases in the distribution (n/2).
3. Find the cumulative frequency that is equal to or closest to (but not lower
than) half of the number of cases. The class containing this
frequency is the median class.
4. Find the lower boundary (Lb) of the median class by subtracting 0.5
from the lower limit of the median class.
5. Get the cumulative frequency of the class just below the median class
(<cfb).
6. Subtract this from half of the number of cases in the distribution
(n/2 − <cfb).
7. Get the frequency (f) of the median class.
8. Find the class interval (i), then use the formula below:

MEDIAN (Method 1)
$$Md = Lb + \left(\frac{\frac{n}{2} - {<}cfb}{f}\right) i$$
Where:
Md = median
Lb = lower boundary of the median class
<cfb = less-than cumulative frequency of the class just below the median class
i = class interval
f = frequency of the median class
n = number of cases
Example 4.9
Find the median of the given data set:

Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution:
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22   ← <cfb
30-34          13    35   ← median class (f = 13)
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50

Computation:
i = 5, n/2 = 50/2 = 25
Lb = LL – 0.5 = 30 – 0.5 = 29.5
$$Md = Lb + \left(\frac{\frac{n}{2} - {<}cfb}{f}\right) i = 29.5 + \left(\frac{25 - 22}{13}\right) 5 = 29.5 + \left(\frac{3}{13}\right) 5 = 29.5 + 1.15 = 30.65$$
∴ The median is equal to 30.65.
Method 2
Procedure:
1. Accumulate the frequencies starting from the highest class down to
the lowest class. Call this the greater-than cumulative frequency (>cf).
2. Find one-half of the number of cases in the distribution (n/2).
3. Find the greater-than cumulative frequency that is equal to or closest to (but
not lower than) half of the number of cases. The class
containing this frequency is the median class.
4. Find the upper boundary (Ub) of the median class by adding 0.5
to the upper limit of the median class.
5. Get the greater-than cumulative frequency of the next higher class, which
is the row below the median class in the table (>cfb).
6. Subtract this from half of the number of cases in the
distribution (n/2 − >cfb).
7. Get the frequency (f) of the median class.
8. Find the class interval (i), then use the formula below:

MEDIAN (Method 2)
$$Md = Ub - \left(\frac{\frac{n}{2} - {>}cfb}{f}\right) i$$
Where:
Md = median
Ub = upper boundary of the median class
>cfb = greater-than cumulative frequency of the next higher class
i = class interval
f = frequency of the median class
n = number of cases

Example 4.10
Find the median of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2
Solution:
Class Limits   f     >cf
10-14          4     50
15-19          4     46
20-24          5     42
25-29          9     37
30-34          13    28   ← median class (f = 13)
35-39          11    15   ← >cfb
40-44          2     4
45-49          2     2
               n = 50

Computation:
i = 5
n/2 = 50/2 = 25
Ub = UL + 0.5 = 34 + 0.5 = 34.5
$$Md = Ub - \left(\frac{\frac{n}{2} - {>}cfb}{f}\right) i = 34.5 - \left(\frac{25 - 15}{13}\right) 5 = 34.5 - \left(\frac{10}{13}\right) 5 = 34.5 - 3.85 = 30.65$$
∴ The median is equal to 30.65.
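
The steps of Method 1 translate into the following Python sketch. The tuple layout `(lower limit, upper limit, frequency)` and the function name are illustrative assumptions, and the class interval is computed as upper limit − lower limit + 1, which assumes whole-number class limits as in the examples.

```python
from itertools import accumulate

classes = [(10, 14, 4), (15, 19, 4), (20, 24, 5), (25, 29, 9),
           (30, 34, 13), (35, 39, 11), (40, 44, 2), (45, 49, 2)]

def grouped_median(classes):
    """Method 1: Md = Lb + ((n/2 - <cfb) / f) * i."""
    freqs = [f for _, _, f in classes]
    cum = list(accumulate(freqs))            # less-than cumulative frequencies
    n = cum[-1]
    half = n / 2
    # median class: first class whose cumulative frequency reaches n/2
    k = next(idx for idx, cf in enumerate(cum) if cf >= half)
    lower_boundary = classes[k][0] - 0.5
    cf_before = cum[k - 1] if k > 0 else 0
    width = classes[k][1] - classes[k][0] + 1   # class interval i
    return lower_boundary + (half - cf_before) / freqs[k] * width

print(round(grouped_median(classes), 2))   # 30.65
```

Method 2 would return the same value, since it measures the same distance from the opposite boundary of the median class.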

MODE FOR GROUPED DATA


A. CRUDE MODE
Crude Mode – refers to the midpoint of the class with the highest
frequency.

Procedure:
1. Find the class with the highest frequency.
2. Get the midpoint of that class.
3. The midpoint is the crude mode.
Example 4.11
Find the crude mode of the given data set:

Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution:
The class with the highest frequency is 30-34. The midpoint of
this class is (30 + 34)/2 = 32.
∴ The crude mode of the given data set is 32.

B. REFINED MODE
Refined Mode – refers to the mode obtained from an ordered
arrangement or a class frequency distribution. Here it is estimated
from the mean and the median of the grouped data.

Procedure:
1. Get the mean and the median of the grouped data.
2. Multiply the median by three (3x̃).
3. Multiply the mean by two (2x̅).
4. Find the difference.

Use the formula below to find the refined mode:

REFINED MODE
$$\hat{x} = 3\tilde{x} - 2\bar{x}$$
Where:
x̂ = refined mode
x̃ = median
x̅ = mean
Example 4.12
Find the refined mode of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution:
From the previous examples, mean = 29.4 and median = 30.65.
Computation:
$$\hat{x} = 3\tilde{x} - 2\bar{x} = 3(30.65) - 2(29.4) = 91.95 - 58.8 = 33.15$$
∴ The refined mode is equal to 33.15.
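
The crude and refined modes can be sketched in Python as shown below. The function names and the tuple layout `(lower limit, upper limit, frequency)` are illustrative assumptions; when two classes tie for the highest frequency, this sketch simply takes the first one, a case the procedure above does not address.

```python
classes = [(10, 14, 4), (15, 19, 4), (20, 24, 5), (25, 29, 9),
           (30, 34, 13), (35, 39, 11), (40, 44, 2), (45, 49, 2)]

def crude_mode(classes):
    """Midpoint of the class with the highest frequency (first one on a tie)."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2

def refined_mode(mean, median):
    """Estimate used in the text: 3 * median - 2 * mean."""
    return 3 * median - 2 * mean

print(crude_mode(classes))                     # 32.0
print(round(refined_mode(29.4, 30.65), 2))     # 33.15
```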
EXERCISE 4.2
1. Using the data below, find the following:
a. Mean (Midpoint method and class deviation method)
b. Median
c. Crude Mode
d. Refined Mode

Class Limits f
24-30 3
31-37 6
38-44 8
45-51 12
52-58 14
59-65 13
66-72 9
73-79 5

2. Using the data below, find the following:


a. Mean (Midpoint method and class deviation method)
b. Median
c. Crude Mode
d. Refined Mode

Class Limits f
11-15 7
16-20 9
21-25 10
26-30 13
31-35 14
36-40 11
41-45 8
46-50 3
MEASURES OF RELATIVE POSITION OR FRACTILES
A fractile divides an ordered array into equal-sized subgroups and identifies the
position of a value in the array. Percentiles divide an array into one hundred equal
parts, quartiles divide it into four equal parts, and deciles divide it into
ten equal parts.

The general formula for the position of a fractile is:
$$\left[\frac{i(n+1)}{F}\right]^{\text{th}}$$

where i = term of interest; n = number of observed values; and F = number of
fractile parts (Percentile = 100; Decile = 10; Quartile = 4).
Percentiles (Pk). Values in an array are subdivided into 100 equal parts.
For instance, P1 is read as the first percentile, which means that the value is greater
than 1% of the observed values in the array. P2, read as the second percentile, is
greater than 2% of the observed values in the array, and so on.
QUARTILES FOR UNGROUPED DATA
Formula:

QUARTILES FOR UNGROUPED DATA
$$Q_i = \left[\frac{i(n+1)}{4}\right]^{\text{th}} \text{ observation}$$

Where:
Qi = Quartile (i = 1, 2, 3)
n = sample size

Example 4.13
The following are the students’ scores in a post-test examination given by
their teacher. 28, 32, 21, 29, 36, 40, 19, 22, 33, 50, 48, 35, 17, 22, 20, 37, 39, 41,
45, 32, 42, 31, 23, 47. Find the 1st, 2nd and 3rd quartile.
Solution:
Since the data is ungrouped, arrange the data in ascending order.
17, 19, 20, 21, 22, 22, 23, 28, 29, 31, 32, 32, 33, 35, 36, 37, 39, 40, 41, 42, 45,
47, 48, 50.
First Quartile:
$$\text{Position of } Q_1 = \frac{1(24+1)}{4} = \frac{25}{4} = 6.25$$
Q1 is the 6.25th observation.
Q1 = 6th observation + 0.25(7th – 6th)
Q1 = 22 + 0.25(23 - 22)
Q1 = 22 + 0.25(1)
Q1 = 22.25
Second Quartile:
$$\text{Position of } Q_2 = \frac{2(24+1)}{4} = \frac{25}{2} = 12.5$$
Q2 is the 12.5th observation.
Q2 = 12th observation + 0.5(13th – 12th)
Q2 = 32 + 0.5(33 - 32)
Q2 = 32 + 0.5(1)
Q2 = 32.5
Third Quartile:
$$\text{Position of } Q_3 = \frac{3(24+1)}{4} = \frac{75}{4} = 18.75$$
Q3 is the 18.75th observation.
Q3 = 18th observation + 0.75(19th – 18th)
Q3 = 40 + 0.75(41 - 40)
Q3 = 40 + 0.75(1)
Q3 = 40.75

DECILE FOR UNGROUPED DATA


Formula:

DECILE FOR UNGROUPED DATA
$$D_i = \left[\frac{i(n+1)}{10}\right]^{\text{th}} \text{ observation}$$

Where:
Di = Decile (i = 1, 2, … 9)
n = sample size

Example 4.14
The following are the students’ scores in a pre-test examination given by
their teacher. 16, 22, 35, 28, 29, 41, 38, 39, 25, 36, 17. Find the 3rd, 6th and 8th
decile.
Solution:
Since the data is ungrouped, arrange the data in ascending order.
16, 17, 22, 25, 28, 29, 35, 36, 38, 39, 41
Third Decile:
$$\text{Position of } D_3 = \frac{3(11+1)}{10} = \frac{36}{10} = 3.6$$
D3 is the 3.6th observation.
D3 = 3rd observation + 0.6(4th – 3rd)
D3 = 22 + 0.6(25 - 22)
D3 = 22 + 0.6(3)
D3 = 22 + 1.8
D3 = 23.8

Sixth Decile:
$$\text{Position of } D_6 = \frac{6(11+1)}{10} = \frac{72}{10} = 7.2$$
D6 is the 7.2th observation.
D6 = 7th observation + 0.2(8th – 7th)
D6 = 35 + 0.2(36 - 35)
D6 = 35 + 0.2(1)
D6 = 35 + 0.2
D6 = 35.2

Eighth Decile:
$$\text{Position of } D_8 = \frac{8(11+1)}{10} = \frac{96}{10} = 9.6$$
D8 is the 9.6th observation.
D8 = 9th observation + 0.6(10th – 9th)
D8 = 38 + 0.6(39 - 38)
D8 = 38 + 0.6(1)
D8 = 38 + 0.6
D8 = 38.6

PERCENTILE FOR UNGROUPED DATA


Formula:

PERCENTILE FOR UNGROUPED DATA
$$P_i = \left[\frac{i(n+1)}{100}\right]^{\text{th}} \text{ observation}$$

Where:
Pi = Percentile (i = 1, 2, … 99)
n = sample size

Example 4.14
The following are the students’ scores in a pre-test examination given by
their teacher. 16, 22, 35, 28, 29, 41, 38, 39, 25, 36, 17. Find the 20th, 65th and 85th
percentile.
Solution:
Since the data is ungrouped, arrange the data in ascending order.
16, 17, 22, 25, 28, 29, 35, 36, 38, 39, 41
20th Percentile:
$$\text{Position of } P_{20} = \frac{20(11+1)}{100} = \frac{240}{100} = 2.4$$
P20 is the 2.4th observation.
P20 = 2nd observation + 0.4(3rd – 2nd)
P20 = 17 + 0.4(22 - 17)
P20 = 17 + 0.4(5)
P20 = 17 + 2
P20 = 19
65th Percentile:
$$\text{Position of } P_{65} = \frac{65(11+1)}{100} = \frac{780}{100} = 7.8$$
P65 is the 7.8th observation.
P65 = 7th observation + 0.8(8th – 7th)
P65 = 35 + 0.8(36 - 35)
P65 = 35 + 0.8(1)
P65 = 35 + 0.8
P65 = 35.8

85th Percentile:
$$\text{Position of } P_{85} = \frac{85(11+1)}{100} = \frac{1020}{100} = 10.2$$
P85 is the 10.2th observation.
P85 = 10th observation + 0.2(11th – 10th)
P85 = 39 + 0.2(41 - 39)
P85 = 39 + 0.2(2)
P85 = 39 + 0.4
P85 = 39.4
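
Quartiles, deciles, and percentiles of ungrouped data all follow the same position-and-interpolate pattern, so one Python function can cover the three. This is a sketch: the name `fractile` and its parameters are illustrative, and returning the first or last value when the computed position falls outside the array is an assumption, since the text does not cover that case.

```python
def fractile(values, i, parts):
    """Value at the [i(n+1)/parts]th position, interpolating between neighbours.

    parts = 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / parts          # 1-based, may be fractional
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return ordered[0]                   # assumption: clamp to the first value
    if whole >= n:
        return ordered[-1]                  # assumption: clamp to the last value
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)

scores = [16, 22, 35, 28, 29, 41, 38, 39, 25, 36, 17]
print(round(fractile(scores, 3, 10), 2))    # D3  -> 23.8
print(round(fractile(scores, 20, 100), 2))  # P20 -> 19.0
```

Other software may use a different position rule, so results from spreadsheet or library quantile functions can differ slightly from the i(n+1)/F rule used in this chapter.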
EXERCISE 4.3
Find Q1, Q2, Q3, D4, D6, D8, P10, P55 and P90 of the given data below. Show
your complete solution to the problem.
1. 24, 28, 22, 30, 18, 27, 24, 34, 15, 31

2. 36, 40, 45, 41, 38, 29, 47, 42, 34, 46

3. 13, 9, 16, 20, 24, 12, 17, 27, 22, 13, 11, 23

4. 30, 23, 36, 32, 19, 15, 29, 32, 26, 19, 26
QUARTILES FOR GROUPED DATA
Procedure:
1. Accumulate the frequencies starting from the lowest class up to
the highest class. Call this the less-than cumulative frequency (<cf).
2. Find kn/4, where n is the number of cases in the distribution.
3. Find the cumulative frequency that is equal to or closest to (but not lower
than) kn/4. The class containing this
frequency is the Qk class.
4. Find the lower boundary (Lb) of the Qk class by subtracting 0.5 from
the lower limit of the Qk class.
5. Get the cumulative frequency of the class just below the Qk class (<cfb).
6. Subtract this from kn/4 (kn/4 − <cfb).
7. Get the frequency (f) of the Qk class.
8. Find the class interval (i), then use the formula below:

QUARTILE FOR GROUPED DATA
$$Q_k = Lb + \left(\frac{\frac{kn}{4} - {<}cfb}{f}\right) i$$
Where:
Qk = quartile (k = 1, 2, or 3)
Lb = lower boundary of the Qk class
<cfb = less-than cumulative frequency of the class just below the Qk class
i = class interval
f = frequency of the quartile class
n = number of cases
Example 4.15
Find the first, second and third quartiles of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution (Q1):
Class Limits   f     <cf
10-14          4     4
15-19          4     8    ← <cfb
20-24          5     13   ← Q1 class (f = 5)
25-29          9     22
30-34          13    35
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
First Quartile (Q1)
i = 5
kn/4 = 1(50)/4 = 12.5
Lb = LL – 0.5 = 20 – 0.5 = 19.5
$$Q_1 = Lb + \left(\frac{\frac{kn}{4} - {<}cfb}{f}\right) i = 19.5 + \left(\frac{12.5 - 8}{5}\right) 5 = 19.5 + 4.5 = 24$$
∴ The first quartile is equal to 24.
Solution (Q2):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22   ← <cfb
30-34          13    35   ← Q2 class (f = 13)
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
Second Quartile (Q2)
i = 5
kn/4 = 2(50)/4 = 25
Lb = 30 – 0.5 = 29.5
$$Q_2 = Lb + \left(\frac{\frac{kn}{4} - {<}cfb}{f}\right) i = 29.5 + \left(\frac{25 - 22}{13}\right) 5 = 29.5 + \left(\frac{3}{13}\right) 5 = 29.5 + 1.15 = 30.65$$
∴ The second quartile is equal to 30.65.
Solution (Q3):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22
30-34          13    35   ← <cfb
35-39          11    46   ← Q3 class (f = 11)
40-44          2     48
45-49          2     50
               n = 50
Third Quartile (Q3)
i = 5
kn/4 = 3(50)/4 = 37.5
Lb = 35 – 0.5 = 34.5
$$Q_3 = Lb + \left(\frac{\frac{kn}{4} - {<}cfb}{f}\right) i = 34.5 + \left(\frac{37.5 - 35}{11}\right) 5 = 34.5 + \frac{12.5}{11} = 34.5 + 1.14 = 35.64$$
∴ The third quartile is equal to 35.64.

DECILES FOR GROUPED DATA


Procedure:
1. Accumulate the frequencies starting from the lowest class up to
the highest class. Call this the less-than cumulative frequency (<cf).
2. Find kn/10, where n is the number of cases in the distribution.
3. Find the cumulative frequency that is equal to or closest to (but not lower
than) kn/10. The class containing this
frequency is the Dk class.
4. Find the lower boundary (Lb) of the Dk class by subtracting 0.5 from
the lower limit of the Dk class.
5. Get the cumulative frequency of the class just below the Dk class (<cfb).
6. Subtract this from kn/10 (kn/10 − <cfb).
7. Get the frequency (f) of the Dk class.
8. Find the class interval (i), then use the formula below:
DECILES FOR GROUPED DATA
$$D_k = Lb + \left(\frac{\frac{kn}{10} - {<}cfb}{f}\right) i$$
Where:
Dk = decile (k = 1, 2, … 9)
Lb = lower boundary of the Dk class
<cfb = less-than cumulative frequency of the class just below the Dk class
i = class interval
f = frequency of the decile class
n = number of cases

Example 4.16
Find the second, fourth and eighth deciles of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2

Solution (D2):
Class Limits   f     <cf
10-14          4     4
15-19          4     8    ← <cfb
20-24          5     13   ← D2 class (f = 5)
25-29          9     22
30-34          13    35
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
Second Decile (D2)
i = 5
kn/10 = 2(50)/10 = 10
Lb = 20 – 0.5 = 19.5
$$D_2 = Lb + \left(\frac{\frac{kn}{10} - {<}cfb}{f}\right) i = 19.5 + \left(\frac{10 - 8}{5}\right) 5 = 19.5 + 2 = 21.5$$
∴ The second decile is equal to 21.5.

Solution (D4):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13   ← <cfb
25-29          9     22   ← D4 class (f = 9)
30-34          13    35
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
Fourth Decile (D4)
i = 5
kn/10 = 4(50)/10 = 20
Lb = 25 – 0.5 = 24.5
$$D_4 = Lb + \left(\frac{\frac{kn}{10} - {<}cfb}{f}\right) i = 24.5 + \left(\frac{20 - 13}{9}\right) 5 = 24.5 + \frac{35}{9} = 24.5 + 3.89 = 28.39$$
∴ The fourth decile is equal to 28.39.

Solution (D8):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22
30-34          13    35   ← <cfb
35-39          11    46   ← D8 class (f = 11)
40-44          2     48
45-49          2     50
               n = 50

Eighth Decile (D8)
i = 5
kn/10 = 8(50)/10 = 40
Lb = 35 – 0.5 = 34.5
$$D_8 = Lb + \left(\frac{\frac{kn}{10} - {<}cfb}{f}\right) i = 34.5 + \left(\frac{40 - 35}{11}\right) 5 = 34.5 + \frac{25}{11} = 34.5 + 2.27 = 36.77$$
∴ The eighth decile is equal to 36.77.
PERCENTILES FOR GROUPED DATA
Procedure:
1. Accumulate the frequencies starting from the lowest class up to
the highest class. Call this the less-than cumulative frequency (<cf).
2. Find kn/100, where n is the number of cases in the distribution.
3. Find the cumulative frequency that is equal to or closest to (but not lower
than) kn/100. The class containing this
frequency is the Pk class.
4. Find the lower boundary (Lb) of the Pk class by subtracting 0.5 from
the lower limit of the Pk class.
5. Get the cumulative frequency of the class just below the Pk class (<cfb).
6. Subtract this from kn/100 (kn/100 − <cfb).
7. Get the frequency (f) of the Pk class.
8. Find the class interval (i), then use the formula below:

PERCENTILES FOR GROUPED DATA
$$P_k = Lb + \left(\frac{\frac{kn}{100} - {<}cfb}{f}\right) i$$
Where:
Pk = percentile (k = 1, 2, … 99)
Lb = lower boundary of the Pk class
<cfb = less-than cumulative frequency of the class just below the Pk class
i = class interval
f = frequency of the percentile class
n = number of cases

Example 4.17
Find the 35th, 60th and 88th percentiles of the given data set:
Class Limits f
10-14 4
15-19 4
20-24 5
25-29 9
30-34 13
35-39 11
40-44 2
45-49 2
Solution (P35):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13   ← <cfb
25-29          9     22   ← P35 class (f = 9)
30-34          13    35
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
35th Percentile (P35)
i = 5
kn/100 = 35(50)/100 = 17.5
Lb = 25 – 0.5 = 24.5
$$P_{35} = Lb + \left(\frac{\frac{kn}{100} - {<}cfb}{f}\right) i = 24.5 + \left(\frac{17.5 - 13}{9}\right) 5 = 24.5 + \frac{22.5}{9} = 24.5 + 2.5 = 27$$
∴ The 35th percentile is equal to 27.

Solution (P60):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22   ← <cfb
30-34          13    35   ← P60 class (f = 13)
35-39          11    46
40-44          2     48
45-49          2     50
               n = 50
Computation:
60th Percentile (P60)
i = 5
kn/100 = 60(50)/100 = 30
Lb = 30 – 0.5 = 29.5
$$P_{60} = Lb + \left(\frac{\frac{kn}{100} - {<}cfb}{f}\right) i = 29.5 + \left(\frac{30 - 22}{13}\right) 5 = 29.5 + \frac{40}{13} = 29.5 + 3.08 = 32.58$$
∴ The 60th percentile is equal to 32.58.
Solution (P88):
Class Limits   f     <cf
10-14          4     4
15-19          4     8
20-24          5     13
25-29          9     22
30-34          13    35   ← <cfb
35-39          11    46   ← P88 class (f = 11)
40-44          2     48
45-49          2     50
               n = 50
Computation:
88th Percentile (P88)
i = 5
kn/100 = 88(50)/100 = 44
Lb = 35 – 0.5 = 34.5
$$P_{88} = Lb + \left(\frac{\frac{kn}{100} - {<}cfb}{f}\right) i = 34.5 + \left(\frac{44 - 35}{11}\right) 5 = 34.5 + \frac{45}{11} = 34.5 + 4.09 = 38.59$$
∴ The 88th percentile is equal to 38.59.
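
The grouped-data formulas for quartiles, deciles, and percentiles differ only in the divisor (4, 10, or 100), so they can be sketched as one Python function. The name `grouped_fractile`, the tuple layout `(lower limit, upper limit, frequency)`, and the computation of the class interval as upper limit − lower limit + 1 (whole-number limits) are assumptions made for illustration.

```python
from itertools import accumulate

classes = [(10, 14, 4), (15, 19, 4), (20, 24, 5), (25, 29, 9),
           (30, 34, 13), (35, 39, 11), (40, 44, 2), (45, 49, 2)]

def grouped_fractile(classes, k, parts):
    """Lb + ((k*n/parts - <cfb) / f) * i; parts is 4, 10, or 100."""
    freqs = [f for _, _, f in classes]
    cum = list(accumulate(freqs))            # less-than cumulative frequencies
    n = cum[-1]
    target = k * n / parts
    # fractile class: first class whose cumulative frequency reaches the target
    idx = next(j for j, cf in enumerate(cum) if cf >= target)
    lower_boundary = classes[idx][0] - 0.5
    cf_before = cum[idx - 1] if idx > 0 else 0
    width = classes[idx][1] - classes[idx][0] + 1
    return lower_boundary + (target - cf_before) / freqs[idx] * width

print(round(grouped_fractile(classes, 1, 4), 2))     # Q1  -> 24.0
print(round(grouped_fractile(classes, 4, 10), 2))    # D4  -> 28.39
print(round(grouped_fractile(classes, 88, 100), 2))  # P88 -> 38.59
```

Calling the function with k = 2 and parts = 4 reproduces the grouped median, since Q2, D5, and P50 all coincide with it.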
EXERCISE 4.4
1. Find Q1, Q2, D4, D7, D8, P10, P55 and P90 of the given data below. Show your
complete solution to the problem.

Class Limits f
24-30 3
31-37 6
38-44 8
45-51 12
52-58 14
59-65 13
66-72 9
73-79 5

2. Find Q1, Q3, D4, D5, D8, P10, P65 and P85 of the given data below. Show your
complete solution to the problem.

Class Limits f
11-15 7
16-20 9
21-25 10
26-30 13
31-35 14
36-40 11
41-45 8
46-50 3
MEASURES OF VARIABILITY
The measures of variability we will consider are the range, the variance and
the standard deviation.
RANGE (R)
a. Range for Ungrouped Data – the difference between the highest and
lowest values. It is the simplest measure of variability to calculate.
Procedure:
1. Arrange the data from lowest to highest.
2. Subtract the lowest value from the highest value in the data set.

Example 4.18
1. Jude took 6 short math quizzes in the first quarter. What is the range of his
test scores?
94, 98, 86, 79, 89, 81
Solution: Arrange the test scores from lowest to highest, we get:
79, 81, 86, 89, 94, 98
Range = Highest – lowest score
= 98 – 79
= 19
∴ the range of Jude’s test scores is 19
2. A basketball team won seven consecutive games. Find the range if the
team’s winning margins in points are 12, 19, 9, 11, 8, 15, and 12.
Solution: Arranging the winning margins from lowest to highest, we get:
8, 9, 11, 12, 12, 15, 19
Range = Highest – lowest value
= 19 – 8 = 11
∴ the range of the winning margins is 11

b. Range for Grouped Data – the difference between the upper limit of the
highest class and the lower limit of the lowest class.
Example 4.19
Determine the range of the given data below:
Class Limits   f
45-52          6
53-60          8
61-68          12
69-76          9
77-84          5

Solution:
Range = Upper limit of the highest class – lower limit of the lowest class
Range = 84 – 45
Range = 39
∴ the range of the scores is 39

VARIANCE (S²)
Variance indicates the relationship between the mean of a distribution and its data points; it is
determined by averaging the squared deviations from the mean. Squaring the
differences instead of taking the absolute values allows for greater flexibility in
further algebraic manipulation of the data.
Ungrouped Data
Procedure:
1. Find the mean (x̅).
2. Subtract the mean from each score to get the deviation (d = x − x̅).
3. Square each deviation, (x − x̅)².
4. Get the sum of the squared deviations, ∑(x − x̅)².
5. Divide the sum by the number of cases minus one (n − 1).

VARIANCE FOR UNGROUPED DATA
$$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
Where:
S² = Variance
x = individual score
x̅ = mean
n = number of cases
Example 4.20
A. Eight students were asked how many hours they study in one day.
Find the variance of the number of hours that the students spent
studying. The data are shown below (x in hours).
x= 5, 4, 4, 6, 3, 5, 5, 2

B. Find the variance of the given data below.


x= 250, 360, 520, 210, 740, 525, 615, 195, 815

Solution for A.
$$\bar{x} = \frac{\sum x}{n} = \frac{5+4+4+6+3+5+5+2}{8} = \frac{34}{8} = 4.25$$

x     x − x̅     (x − x̅)²
5     0.75      0.5625
4     -0.25     0.0625
4     -0.25     0.0625
6     1.75      3.0625
3     -1.25     1.5625
5     0.75      0.5625
5     0.75      0.5625
2     -2.25     5.0625
                ∑(x − x̅)² = 11.5

$$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{11.5}{8 - 1} = \frac{11.5}{7} \approx 1.64$$
∴ The variance of the data is about 1.64. Because the deviations are squared,
the variance is expressed in squared hours, not in hours.
Solution for B.
$$\bar{x} = \frac{\sum x}{n} = \frac{250+360+520+210+740+525+615+195+815}{9} = \frac{4230}{9} = 470$$

x      x − x̅    (x − x̅)²
250    -220     48400
360    -110     12100
520    50       2500
210    -260     67600
740    270      72900
525    55       3025
615    145      21025
195    -275     75625
815    345      119025
                ∑(x − x̅)² = 422200

$$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{422200}{9 - 1} = \frac{422200}{8} = 52{,}775$$
∴ The variance of the data is 52,775.

STANDARD DEVIATION (SD)


Standard deviation is the square root of the variance.
Ungrouped Data
Procedure:
1. Find the mean (x̅).
2. Subtract the mean from each score to get the deviation (d = x − x̅).
3. Square each deviation, (x − x̅)².
4. Get the sum of the squared deviations, ∑(x − x̅)².
5. Divide the sum by the number of cases minus one (n − 1).
6. Get the square root of the quotient.
STANDARD DEVIATION FOR UNGROUPED DATA
$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Where:
SD = Standard Deviation
x = individual score
x̅ = mean
n = number of cases

Example 4.21
Find the standard deviation using the data given in Example 4.20.
Solution for A.
Since the variance is equal to 1.64,

SD = √variance

SD = √1.64 ≈ 1.28 hours
Solution for B.
Since the variance is equal to 52,775,

SD = √variance

SD = √52,775 ≈ 229.73
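
The variance and standard deviation procedures for ungrouped data can be sketched in Python as follows. The function names are illustrative choices; the divisor n − 1 follows the formula given in this chapter.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation: square root of the sample variance."""
    return sqrt(sample_variance(values))

hours = [5, 4, 4, 6, 3, 5, 5, 2]                           # data set A
data_b = [250, 360, 520, 210, 740, 525, 615, 195, 815]     # data set B

print(round(sample_variance(hours), 2), round(sample_sd(hours), 2))   # 1.64 1.28
print(sample_variance(data_b), round(sample_sd(data_b), 2))           # 52775.0 229.73
```

Python's standard `statistics` module offers `variance` and `stdev`, which also use the n − 1 divisor and can serve as a cross-check.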
EXERCISE 4.5
Find the variance and standard deviation of the given data below. Show
your complete solution to the problem.
1. 24, 28, 22, 30, 18, 27, 24, 34, 15, 31

2. 36, 40, 45, 41, 38, 29, 47, 42, 34, 46

3. 13, 9, 16, 20, 24, 12, 17, 27, 22, 13, 11, 23

4. 30, 23, 36, 32, 19, 15, 29, 32, 26, 19, 26
VARIANCE AND STANDARD DEVIATION FOR GROUPED DATA
A. POPULATION VARIANCE

POPULATION VARIANCE
$$\sigma^2 = \frac{\sum f(x - \mu)^2}{N}$$
Where:
σ² = Population Variance
f = frequency
x = class mark
µ = population mean
N = number of cases/observations

Example 4.22
Given the frequency distribution table below, compute for the
population variance.
Scores of 60 Students in a Final Examination

Class Limits f
5-10 3
11-16 4
17-22 6
23-28 8
29-34 12
35-40 11
41-46 10
47-52 5
53-58 1
Solution:
Class Limits   f     x      fx      x − µ    (x − µ)²   f(x − µ)²
5-10           3     7.5    22.5    -24.6    605.16     1815.48
11-16          4     13.5   54      -18.6    345.96     1383.84
17-22          6     19.5   117     -12.6    158.76     952.56
23-28          8     25.5   204     -6.6     43.56      348.48
29-34          12    31.5   378     -0.6     0.36       4.32
35-40          11    37.5   412.5   5.4      29.16      320.76
41-46          10    43.5   435     11.4     129.96     1299.6
47-52          5     49.5   247.5   17.4     302.76     1513.8
53-58          1     55.5   55.5    23.4     547.56     547.56
               N = 60       ∑fx = 1926                  ∑f(x − µ)² = 8186.4

Finding the mean of the data:
$$\mu = \frac{\sum fx}{N} = \frac{1926}{60} = 32.10$$

Finding the Population Variance:
$$\sigma^2 = \frac{\sum f(x - \mu)^2}{N} = \frac{8186.4}{60} = 136.44$$

B. SAMPLE VARIANCE

SAMPLE VARIANCE
$$s^2 = \frac{n \sum fx^2 - \left(\sum fx\right)^2}{n(n-1)}$$

Where:
s² = Sample Variance
f = frequency
x = class mark
n = number of cases/observations
Example 4.22
Given the frequency distribution table below, compute the
sample variance.
Scores of 60 Students in a Final Examination

Class Limits f
5-10 3
11-16 4
17-22 6
23-28 8
29-34 12
35-40 11
41-46 10
47-52 5
53-58 1

Solution:
Class Limits   f     x      fx      x²        fx²
5-10           3     7.5    22.5    56.25     168.75
11-16          4     13.5   54      182.25    729
17-22          6     19.5   117     380.25    2281.5
23-28          8     25.5   204     650.25    5202
29-34          12    31.5   378     992.25    11907
35-40          11    37.5   412.5   1406.25   15468.75
41-46          10    43.5   435     1892.25   18922.5
47-52          5     49.5   247.5   2450.25   12251.25
53-58          1     55.5   55.5    3080.25   3080.25
               n = 60       ∑fx = 1926        ∑fx² = 70011

$$s^2 = \frac{n \sum fx^2 - \left(\sum fx\right)^2}{n(n-1)} = \frac{60(70011) - (1926)^2}{60(60-1)} = \frac{4200660 - 3709476}{60(59)} = \frac{491184}{3540} \approx 138.75$$
STANDARD DEVIATION
It is a measure of the dispersion of a set of data from its mean. It is
determined by calculating the positive root (square root) of variance.

POPULATION STANDARD DEVIATION

𝜎 = √𝜎²
SAMPLE STANDARD DEVIATION

s = √𝑠²

Example 4.23
Find the population standard deviation and sample standard deviation of
the given data.
Scores of 60 Students in a Final Examination
Class Limits f
5-10 3
11-16 4
17-22 6
23-28 8
29-34 12
35-40 11
41-46 10
47-52 5
53-58 1

Solution: Population Standard Deviation


Class Limits   f     x      fx      x − µ    (x − µ)²   f(x − µ)²
5-10           3     7.5    22.5    -24.6    605.16     1815.48
11-16          4     13.5   54      -18.6    345.96     1383.84
17-22          6     19.5   117     -12.6    158.76     952.56
23-28          8     25.5   204     -6.6     43.56      348.48
29-34          12    31.5   378     -0.6     0.36       4.32
35-40          11    37.5   412.5   5.4      29.16      320.76
41-46          10    43.5   435     11.4     129.96     1299.6
47-52          5     49.5   247.5   17.4     302.76     1513.8
53-58          1     55.5   55.5    23.4     547.56     547.56
               N = 60       ∑fx = 1926                  ∑f(x − µ)² = 8186.4
Finding the mean of the data:
$$\mu = \frac{\sum fx}{N} = \frac{1926}{60} = 32.10$$

Finding the Population Variance:
$$\sigma^2 = \frac{\sum f(x - \mu)^2}{N} = \frac{8186.4}{60} = 136.44$$

Finding the Population Standard Deviation:
$$\sigma = \sqrt{\sigma^2} = \sqrt{136.44} \approx 11.68$$


Solution: Sample Standard Deviation
Class Limits   f     x      fx      x²        fx²
5-10           3     7.5    22.5    56.25     168.75
11-16          4     13.5   54      182.25    729
17-22          6     19.5   117     380.25    2281.5
23-28          8     25.5   204     650.25    5202
29-34          12    31.5   378     992.25    11907
35-40          11    37.5   412.5   1406.25   15468.75
41-46          10    43.5   435     1892.25   18922.5
47-52          5     49.5   247.5   2450.25   12251.25
53-58          1     55.5   55.5    3080.25   3080.25
               n = 60       ∑fx = 1926        ∑fx² = 70011

Finding the Sample Variance:
$$s^2 = \frac{n \sum fx^2 - \left(\sum fx\right)^2}{n(n-1)} = \frac{60(70011) - (1926)^2}{60(60-1)} = \frac{4200660 - 3709476}{60(59)} = \frac{491184}{3540} \approx 138.75$$
Finding the Sample Standard Deviation:
$$s = \sqrt{s^2} = \sqrt{138.75} \approx 11.78$$
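
The grouped variance and standard deviation computations can be verified with the Python sketch below. The function names and the tuple layout `(lower limit, upper limit, frequency)` are illustrative assumptions. The class 23-28 is entered with a frequency of 8, the value used in the solution tables, which is what makes the frequencies total 60.

```python
from math import sqrt

# Scores of 60 students: (lower limit, upper limit, frequency)
classes = [(5, 10, 3), (11, 16, 4), (17, 22, 6), (23, 28, 8), (29, 34, 12),
           (35, 40, 11), (41, 46, 10), (47, 52, 5), (53, 58, 1)]

def grouped_population_variance(classes):
    """sum(f * (x - mu)^2) / N, with x the class mark (midpoint)."""
    n = sum(f for _, _, f in classes)
    mu = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n
    return sum(f * ((lo + hi) / 2 - mu) ** 2 for lo, hi, f in classes) / n

def grouped_sample_variance(classes):
    """(n * sum(f x^2) - (sum(f x))^2) / (n * (n - 1))."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

pop_var = grouped_population_variance(classes)
samp_var = grouped_sample_variance(classes)
print(round(pop_var, 2), round(sqrt(pop_var), 2))    # 136.44 11.68
print(round(samp_var, 2), round(sqrt(samp_var), 2))  # 138.75 11.78
```

The sample variance is slightly larger than the population variance because it divides by n − 1 instead of N.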


EXERCISE 4.6
1. Find population variance, sample variance, population standard deviation
and sample standard deviation of the given data below. Show your
complete solution to the problem.

Class Limits f
24-30 3
31-37 6
38-44 8
45-51 12
52-58 14
59-65 13
66-72 9
73-79 5

2. Find population variance, sample variance, population standard deviation


and sample standard deviation of the given data below. Show your
complete solution to the problem.
Class Limits f
11-15 7
16-20 9
21-25 10
26-30 13
31-35 14
36-40 11
41-45 8
46-50 3
SUMMATIVE TEST
1. Using the scores of 60 students in a 100-item test in Statistics given in the
table below, find:
a. the mean score of the students,
b. median
c. crude mode
d. refined mode
e. Q1
f. Q2
g. Q3
h. D3
i. D6
j. D9
k. P40
l. P55
m. P80
n. Population Variance
o. Sample Variance
p. Population Standard Deviation
q. Sample Standard Deviation

Class Limits f
54-59 4
60-65 7
66-71 8
72-77 10
78-83 11
84-89 14
90-95 6
n = 60
