
MODULE 5

DATA MANAGEMENT
Prepared by: Ms. Katherine D. Yap

MEASURES OF CENTRAL TENDENCY


After the data have been presented in tabular or graphical form, the researcher must be able
to describe them in terms of a single number. This single figure which is representative or summary
of the characteristics of a given set of data is called a measure of central tendency.

The most commonly used measures of central tendency are the mean, the median and the mode.

Measures of Central Tendency of Ungrouped Data

Ungrouped data or raw data are data which have not yet been organized or arranged into a
frequency distribution. If the number of observations is less than or equal to (≤) 30, the data
are treated as ungrouped data.

Mean

The arithmetic mean or arithmetic average is defined as the sum of all items or terms
divided by the total number of items or terms. The definition is the same for both the sample and
population, although we use different symbols to refer to each.

The symbol for the sample mean is x bar ($\bar{x}$), and the symbol for the population mean is
the Greek letter mu ($\mu$).

Suppose you have six scores: 12, 10, 18, 16, 20 and 14. If $x_1 = 12$, $x_2 = 10$, $x_3 = 18$,
$x_4 = 16$, $x_5 = 20$ and $x_6 = 14$, the mean, represented as x bar, is:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{n}$$

$$\bar{x} = \frac{12 + 10 + 18 + 16 + 20 + 14}{6} = 15$$

Instead of writing the equation for the mean as shown above, you can shorten it to:

Population Mean

$$\mu = \frac{\sum x}{N}$$

where:
$\mu$ = the mean
$\sum x$ = sum of all the scores
N = total number of cases in the population

Sample Mean

$$\bar{x} = \frac{\sum x}{n}$$

where:
$\bar{x}$ = the mean
$\sum x$ = sum of all the scores
n = total number of cases in the sample
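
As a quick check of the arithmetic above, the sample mean of the six scores can be computed in a few lines of Python. This is only an illustrative sketch; the variable names `scores` and `mean` are chosen here and are not part of the module.

```python
# Six scores from the example above
scores = [12, 10, 18, 16, 20, 14]

# Sample mean: sum of all scores divided by the number of scores (n)
mean = sum(scores) / len(scores)

print(mean)  # 15.0
```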

Median
The median of ungrouped data is the value of the middle item after arranging the data in
an ascending or descending order.

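Example 1: Compute the median of the following set of scores: 6, 14, 10, 8, 2, 12 and 4.

Arranged in ascending order: 2, 4, 6, 8, 10, 12, 14

Answer: The median is 8, which is the middle item.

Example 2: Find the median of the following set of items: 6, 14, 10, 8, 12 and 4.

Arranged in ascending order: 4, 6, 8, 10, 12, 14

Because the number of items is even, the median is the average of the two middle items:

$$\text{median} = \frac{8 + 10}{2} = 9$$

Answer: The median is 9.
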
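
The two examples above (odd and even number of items) can be sketched as one small Python function. The function name `median` is an illustrative choice, not something defined in the module.

```python
def median(values):
    """Return the middle item of the sorted values.

    For an even number of items, return the average of the two middle items.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([6, 14, 10, 8, 2, 12, 4]))  # 8
print(median([6, 14, 10, 8, 12, 4]))     # 9.0
```
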
Mode

The mode for ungrouped data is defined as the value that appears with the highest
frequency. That is, the item that appears most often.

Example:

Find the mode of the following set of items: 4, 7, 11, 6, 4, 3, 5, 8, 9, 2

Answer: The mode is 4.
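
A minimal Python sketch of the same idea follows. It counts how often each value appears and keeps the most frequent one. Returning a list, so that ties between several values can be reported, is an assumption made here; the module only shows a case with a single mode.

```python
from collections import Counter


def modes(values):
    """Return every value that appears with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, count in counts.items() if count == highest)


print(modes([4, 7, 11, 6, 4, 3, 5, 8, 9, 2]))  # [4]
```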

Measures of Central Tendency of Grouped Data

Grouped data are data organized and summarized in the form of a frequency
distribution. If the number of observations is greater than (>) 30, the data are treated as
grouped data. These are data classified into categories for better presentation and analysis.

FREQUENCY DISTRIBUTIONS
Raw Data

Raw data are collected data which have not been organized numerically. An example is the
set of masses of 200 male students obtained from an alphabetical listing of college records.

Array

An array is an arrangement of raw numerical data according to magnitude, in either
ascending or descending order. The difference between the largest and smallest number is called
the RANGE of the data. For example, if the largest mass of 200 male students is 84 kg and the
smallest mass is 63 kg, the range is 84 - 63 = 21 kg.

Frequency Distribution

It is a tabular arrangement of data showing its classification or grouping according to
magnitude or size.

Class interval. This refers to the grouping defined by a lower limit and an upper limit.
Class frequency. This refers to the number of observations belonging to a class interval.

Class mark. This is the midpoint or middle value of the class interval.

Class boundary. This is the more precise expression of the class limits, also called the true limits.

Class size. This is the width of each class interval.

Steps in Constructing a Frequency Distribution


1. Array the given raw data in ascending order.
2. Compute the range.
Range = Highest score - Lowest score
3. Determine the number of classes using Sturges' formula.
K = 1 + 3.322 log n
where:
K is the approximate number of classes
n is the number of observations
log is the base-10 logarithm
4. Compute the class size: C = R ÷ K. The computed value of C should be rounded off
for convenience.
5. Determine the lowest class limit.
6. Tally each score under the class interval it belongs to. Sum the frequencies and
check that the total is equal to the total number of observations.

Relative Frequency Distribution

The relative frequency, denoted by % (rf), is the ratio of the frequency of each class to
the total frequency. The relative frequency distribution may be expressed in percent, in which
case its total must be equal to 100%.

Cumulative Frequency Distribution

The cumulative frequency is the accumulated frequency of the classes; it can be accumulated
from either the beginning or the end of the distribution.
The “less than” cumulative frequency is the number of observations that are less than the
upper class boundary in a given interval.

The “greater than” cumulative frequency is the number of observations that are greater than
the lower class boundary in a given interval.

Example: Grouped Data

Construct a frequency distribution from a sample of 100 residents of Barangay Banicain,
Olongapo City. The following are the observed ages gathered from 100 persons.

Age (in years) of 100 Residents of Brgy. New Banicain, Olongapo City
14 27 27 23 29 21 20 12 22 17
23 24 18 20 27 16 12 22 19 19
15 20 29 25 24 20 20 17 18 18
12 22 23 17 23 26 16 21 21 20
17 18 26 18 28 27 18 22 19 16
14 16 19 20 20 18 25 19 26 15
28 13 18 17 14 27 24 20 18 25
17 20 23 18 18 24 19 19 14 18
21 21 25 24 14 25 20 17 17 17
15 12 26 23 17 20 24 25 18 15
Solution

Step 1. Arrange the given raw data in ascending order.

Age (in years) of 100 Residents of Banicain, Olongapo City in Ascending Order
12 12 12 12 13 14 14 14 14 14
15 15 15 15 16 16 16 16 17 17
17 17 17 17 17 17 17 17 18 18
18 18 18 18 18 18 18 18 18 18
18 19 19 19 19 19 19 19 20 20
20 20 20 20 20 20 20 20 20 20
21 21 21 21 21 22 22 22 22 23
23 23 23 23 23 24 24 24 24 24
24 25 25 25 25 25 25 26 26 26
26 27 27 27 27 27 28 28 29 29

Step 2. Compute the range.

Range = Highest score - Lowest score
Range = 29 - 12
Range = 17

Step 3. Compute the number of classes.


K = 1 + 3.322 log n
= 1 + 3.322 log 100
= 7.644

Step 4. Compute the class size.

C = R ÷ K
= 17 ÷ 7.644
= 2.22, which is rounded off to 2.
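
Steps 2 to 4 can be verified with a short Python sketch. It assumes, as the worked values imply, that "log" means the base-10 logarithm; the variable names are illustrative only.

```python
import math

n = 100                 # number of observations
data_range = 29 - 12    # highest score minus lowest score

# Sturges' formula with the base-10 logarithm
k = 1 + 3.322 * math.log10(n)

# Class size before rounding off
c = data_range / k

print(round(k, 3))  # 7.644
print(round(c, 2))  # 2.22
print(round(c))     # 2
```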

Step 5. Organize the class interval. Start the first class with a lower limit equal to or a little bit less
than the lowest observed value.

Step 6. Tally each score to the category of class interval it belongs to.

Class Mark → To obtain the midpoint, add the lower limit and the upper limit, then
divide by two. For example, for class interval 12-13, adding the two limits gives 25, and 25
divided by 2 equals 12.5.

Class Boundary → The exact limits are obtained by subtracting 0.5 from the lower limit and
adding 0.5 to the upper limit. For example, for class interval 12-13, the exact limits are 11.5-13.5.

Cumulative Frequency → The less than cumulative frequency (<cf) is obtained by adding the
frequencies from the top. Start at 5; (5 + 9) = 14; (14 + 14) = 28; (28 + 20) = 48; (48 + 17) = 65;
(65 + 10) = 75; (75 + 12) = 87; (87 + 9) = 96; (96 + 4) = 100. The last cumulative frequency is
equal to the total number of observations.

On the other hand, the greater than cumulative frequency (>cf) is obtained by subtracting the
frequencies starting from the top. Start at the total number of observations, which is 100;
(100 - 5) = 95; (95 - 9) = 86; (86 - 14) = 72; (72 - 20) = 52; (52 - 17) = 35; (35 - 10) = 25;
(25 - 12) = 13; (13 - 9) = 4.

Relative Frequency → The relative frequency in percent is obtained by dividing the frequency
of a class interval by the total number of observations and multiplying by 100%. For example,
for class interval 12-13, divide the frequency 5 by the total number of observations, which is
100, and multiply by 100%: (5 ÷ 100) × 100% = 5%.

Frequency Distribution of Age (in years) of 100 Residents of Banicain, Olongapo City

Class      Frequency  Class  Class       <Cumulative  >Cumulative  Relative
Interval              Mark   Boundaries  Frequency    Frequency    Frequency
12-13      5          12.5   11.5-13.5   5            100          5%
14-15      9          14.5   13.5-15.5   14           95           9%
16-17      14         16.5   15.5-17.5   28           86           14%
18-19      20         18.5   17.5-19.5   48           72           20%
20-21      17         20.5   19.5-21.5   65           52           17%
22-23      10         22.5   21.5-23.5   75           35           10%
24-25      12         24.5   23.5-25.5   87           25           12%
26-27      9          26.5   25.5-27.5   96           13           9%
28-29      4          28.5   27.5-29.5   100          4            4%
           N = 100
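
The whole table can be rebuilt with a short Python sketch. The dictionary `age_counts` is a tally of the 100 listed ages prepared for this illustration, and the names `rows`, `less_cf` and `greater_cf` are likewise illustrative. The sketch assumes integer class limits, so the class boundaries are the limits moved outward by 0.5.

```python
# Tally of the 100 ages listed above (age: number of residents)
age_counts = {12: 4, 13: 1, 14: 5, 15: 4, 16: 4, 17: 10, 18: 13, 19: 7, 20: 12,
              21: 5, 22: 4, 23: 6, 24: 6, 25: 6, 26: 4, 27: 5, 28: 2, 29: 2}
n = sum(age_counts.values())  # total number of observations

class_size = 2
rows = []
less_cf = 0
for lower in range(12, 30, class_size):
    upper = lower + class_size - 1
    freq = sum(count for age, count in age_counts.items() if lower <= age <= upper)
    greater_cf = n - less_cf  # this class and all later classes
    less_cf += freq           # this class and all earlier classes
    rows.append({
        "interval": f"{lower}-{upper}",
        "frequency": freq,
        "mark": (lower + upper) / 2,
        "boundaries": (lower - 0.5, upper + 0.5),
        "less_cf": less_cf,
        "greater_cf": greater_cf,
        "relative": 100 * freq / n,  # in percent
    })

for row in rows:
    print(row)
```
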
Arithmetic Mean

$$\bar{x} = \frac{\sum X_i F_i}{n}$$

where:
$X_i$ = class mark
$F_i$ = frequency
n = total frequency

Example

The frequency distribution of the scores of 60 students in an entrance examination is shown
below. Compute the mean score.

Class      Frequency  Class Mark  $X_i F_i$
Interval   ($F_i$)    ($X_i$)
18-26      8          22          176
27-35      13         31          403
36-44      21         40          840
45-53      6          49          294
54-62      12         58          696
           n = 60                 $\sum X_i F_i$ = 2409

Solution

1. Using the Long Method

$$\bar{x} = \frac{\sum X_i F_i}{n} = \frac{2409}{60} = 40.15$$
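
The long method amounts to multiplying each class mark by its frequency and dividing the total by n. A minimal Python sketch, with each class written as a hypothetical `(lower limit, upper limit, frequency)` tuple:

```python
# (lower limit, upper limit, frequency) for each class interval
classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]

n = sum(freq for _, _, freq in classes)

# Sum of class mark times frequency over all classes
total = sum(((lower + upper) / 2) * freq for lower, upper, freq in classes)

mean = total / n

print(n, total, round(mean, 2))  # 60 2409.0 40.15
```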

Median
The formula for finding the median of grouped data is given as follows:

$$\text{Mdn} = \text{LCB}_{\text{Mdn}} + c\left(\frac{\frac{n}{2} - {<}cf}{f_i}\right)$$

where:

Mdn = median
$\text{LCB}_{\text{Mdn}}$ = lower class boundary of the median class
<cf = less than cumulative frequency of the class preceding the median class
$f_i$ = frequency of the median class
c = class size
n = total frequency

To solve for the median, follow these steps.

1. Compute the less than cumulative frequency.
2. Find the median class: the first class interval whose less than cumulative frequency
is equal to or greater than n/2, one half of the total number of respondents.
3. Apply the formula by substituting the given values.

Example: Compute the median of the given data:

Class      Frequency  <Cumulative
Interval   ($f_i$)    Frequency
18-26      8          8
27-35      13         21
36-44      21         42   ← median class
45-53      6          48
54-62      12         60
           n = 60

n/2 = 60/2 = 30

$$\text{Mdn} = \text{LCB}_{\text{Mdn}} + c\left(\frac{\frac{n}{2} - {<}cf}{f_i}\right)$$

$$\text{Mdn} = 35.5 + 9\left(\frac{\frac{60}{2} - 21}{21}\right) = 39.36$$

Answer: 39.36
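
The three steps above can be sketched in Python as follows. The function name `grouped_median` and the tuple layout are illustrative, and the sketch assumes integer class limits, so the lower class boundary is the lower limit minus 0.5.

```python
def grouped_median(classes):
    """classes: list of (lower limit, upper limit, frequency) tuples in order."""
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cf_before = 0  # less than cumulative frequency preceding the current class
    for lower, upper, freq in classes:
        if cf_before + freq >= half:      # first class whose <cf reaches n/2
            lcb = lower - 0.5             # lower class boundary
            size = upper - lower + 1      # class size
            return lcb + size * (half - cf_before) / freq
        cf_before += freq


classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]
print(round(grouped_median(classes), 2))  # 39.36
```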

Mode
The formula for finding the mode of grouped data is given as follows:

$$\text{Mo} = \text{LCB}_{\text{Mo}} + c\left(\frac{f_{\text{Mo}} - f_1}{2f_{\text{Mo}} - f_1 - f_2}\right)$$

where:
Mo = mode
$\text{LCB}_{\text{Mo}}$ = lower class boundary of the modal class
$f_{\text{Mo}}$ = frequency of the modal class
$f_1$ = frequency of the class before the modal class
$f_2$ = frequency of the class after the modal class
c = class size
n = total frequency

Modal class is the class interval with the largest frequency.


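Example: Compute the mode of the given data:

Class      Frequency
Interval   ($f_i$)
18-26      8
27-35      13
36-44      21
45-53      6
54-62      12
           n = 60

The modal class is 36-44, which has the largest frequency (21).

$$\text{Mo} = \text{LCB}_{\text{Mo}} + c\left(\frac{f_{\text{Mo}} - f_1}{2f_{\text{Mo}} - f_1 - f_2}\right)$$

$$\text{Mo} = 35.5 + 9\left(\frac{21 - 13}{2(21) - 13 - 6}\right) = 38.63$$

Ans. 38.63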
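
A Python sketch of the same computation is shown below. The function name `grouped_mode` is illustrative. Treating a missing neighbor as frequency 0 when the modal class is the first or last class is an assumption made here; the module does not cover that case.

```python
def grouped_mode(classes):
    """classes: list of (lower limit, upper limit, frequency) tuples in order."""
    freqs = [freq for _, _, freq in classes]
    i = freqs.index(max(freqs))                     # position of the modal class
    lower, upper, f_mo = classes[i]
    f1 = freqs[i - 1] if i > 0 else 0               # class before (assumed 0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after (assumed 0 if none)
    lcb = lower - 0.5                               # lower class boundary
    size = upper - lower + 1                        # class size
    return lcb + size * (f_mo - f1) / (2 * f_mo - f1 - f2)


classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]
print(round(grouped_mode(classes), 2))  # 38.63
```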

MEASURES OF POSITION

Quantiles

The quantiles are a natural extension of the median concept in that they are the values
which divide the distribution into a given number of equal parts. While the median divides the
distribution into two equal parts, the other quantiles divide it into four equal parts (quartiles),
ten equal parts (deciles) or one hundred equal parts (percentiles).

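Ungrouped Data

Each expression below gives the position of the i-th quantile in the data arranged in ascending
order, where n is the number of observations.

Quartile: $\frac{i(n+1)}{4}$

Decile: $\frac{i(n+1)}{10}$

Percentile: $\frac{i(n+1)}{100}$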

Example: Find the 3rd quartile for the following data.

5, 7, 11, 1, 17, 23, 19, 3, 9, 21, 15 and 13

Solution:

The first thing to do is arrange the data in ascending order.

1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 and 23

For the 3rd quartile:

$$\text{position of } Q_3 = \frac{i(n+1)}{4} = \frac{3(12+1)}{4} = 9.75$$

$$Q_3 = \text{9th value} + 0.75\,(\text{10th value} - \text{9th value}) = 17 + 0.75\,(19 - 17) = 18.5$$

After arranging the data in ascending order, locate the 9.75th position. Because it falls
between the 9th and 10th positions, the value is interpolated: take the 9th value and add 0.75
of the difference between the 10th and 9th values. The value of the third quartile is equal to
18.5.
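
The position formula and the interpolation step can be sketched as one Python function that works for quartiles, deciles and percentiles alike. The name `quantile_value` and its `parts` parameter (4, 10 or 100) are illustrative. Returning the smallest or largest value when the position falls outside the data is an assumption; the module does not address that case.

```python
def quantile_value(values, i, parts):
    """Value at position i(n + 1)/parts in the sorted data, by interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / parts
    whole = int(position)          # e.g. 9 for the 9.75th position
    fraction = position - whole    # e.g. 0.75
    if whole < 1:
        return ordered[0]          # assumed behavior below the first position
    if whole >= n:
        return ordered[-1]         # assumed behavior beyond the last position
    low, high = ordered[whole - 1], ordered[whole]
    return low + fraction * (high - low)


data = [5, 7, 11, 1, 17, 23, 19, 3, 9, 21, 15, 13]
print(quantile_value(data, 3, 4))  # 18.5
```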

Grouped Data

$$Q_i = \text{LCB}_{Q_i} + c\left(\frac{\frac{in}{4} - {<}cf_{Q_i - 1}}{f_{Q_i}}\right)$$

where:
$\text{LCB}_{Q_i}$ = the lower class boundary of the $Q_i$th class
c = class size
n = total number of observations in the distribution
${<}cf_{Q_i - 1}$ = less than cumulative frequency preceding the $Q_i$th class
$f_{Q_i}$ = frequency of the $Q_i$th class

$$D_i = \text{LCB}_{D_i} + c\left(\frac{\frac{in}{10} - {<}cf_{D_i - 1}}{f_{D_i}}\right)$$

where:
$\text{LCB}_{D_i}$ = the lower class boundary of the $D_i$th class
c = class size
n = total number of observations in the distribution
${<}cf_{D_i - 1}$ = less than cumulative frequency preceding the $D_i$th class
$f_{D_i}$ = frequency of the $D_i$th class

$$P_i = \text{LCB}_{P_i} + c\left(\frac{\frac{in}{100} - {<}cf_{P_i - 1}}{f_{P_i}}\right)$$

where:
$\text{LCB}_{P_i}$ = the lower class boundary of the $P_i$th class
c = class size
n = total number of observations in the distribution
${<}cf_{P_i - 1}$ = less than cumulative frequency preceding the $P_i$th class
$f_{P_i}$ = frequency of the $P_i$th class

Example

The following is a frequency distribution of an achievement test. Compute the third quartile
($Q_3$).

Class      Frequency  <Cumulative
Interval   ($f_i$)    Frequency
18-26      8          8
27-35      13         21
36-44      21         42
45-53      6          48   ← class interval containing the desired quartile
54-62      12         60
           n = 60

Solution

$$\frac{in}{4} = \frac{(3)(60)}{4} = 45$$

$$Q_i = \text{LCB}_{Q_i} + c\left(\frac{\frac{in}{4} - {<}cf_{Q_i - 1}}{f_{Q_i}}\right)$$

$$Q_3 = 44.5 + 9\left(\frac{45 - 42}{6}\right) = 49$$

Ans.: 49 (third quartile)
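
Since the quartile, decile and percentile formulas for grouped data differ only in the divisor (4, 10 or 100), they can be sketched as a single Python function. The name `grouped_quantile` and the `parts` parameter are illustrative, and integer class limits are assumed, so the lower class boundary is the lower limit minus 0.5.

```python
def grouped_quantile(classes, i, parts):
    """classes: list of (lower limit, upper limit, frequency) tuples in order.

    parts is 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    n = sum(freq for _, _, freq in classes)
    target = i * n / parts
    cf_before = 0  # less than cumulative frequency preceding the current class
    for lower, upper, freq in classes:
        if cf_before + freq >= target:    # first class whose <cf reaches in/parts
            lcb = lower - 0.5             # lower class boundary
            size = upper - lower + 1      # class size
            return lcb + size * (target - cf_before) / freq
        cf_before += freq


classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]
print(grouped_quantile(classes, 3, 4))  # 49.0
```

With `i = 2` and `parts = 4` the same function returns the grouped median, about 39.36 for these data.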
