MMW Module V DATA MANAGEMENT
DATA MANAGEMENT
Prepared by: Ms. Katherine D. Yap
The most commonly used measures of central tendency are the mean, median, and mode.
Ungrouped data, or raw data, are data that have not yet been organized or arranged into a
frequency distribution. If the number of observations is less than or equal to (≤) 30, the data are
treated as ungrouped data.
Mean
The arithmetic mean or arithmetic average is defined as the sum of all items or terms
divided by the total number of items or terms. The definition is the same for both the sample and
the population, although we use a different symbol to refer to each.
The symbol for the sample mean is x bar ($\bar{x}$), and for the population mean it is the Greek
letter mu ($\mu$).
Suppose you have six scores: 12, 10, 18, 16, 20 and 14. If $x_1 = 12$, $x_2 = 10$, $x_3 = 18$, $x_4 = 16$,
$x_5 = 20$, $x_6 = 14$, the mean, represented as x bar, is:
\[ \bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{N} \]
\[ \bar{x} = \frac{12 + 10 + 18 + 16 + 20 + 14}{6} = 15 \]
Instead of writing the equation for the mean as shown above, you can shorten it to:
\[ \mu = \frac{\sum x}{N} \qquad\qquad \bar{x} = \frac{\sum x}{n} \]
where:
$\mu$ = the mean (population)
$\bar{x}$ = the mean (sample)
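As a quick check of the arithmetic, the six-score example can be reproduced in a few lines of Python. This is only a sketch; the variable names `scores` and `mean` are chosen here for illustration.

```python
# The six scores from the example above.
scores = [12, 10, 18, 16, 20, 14]

# Mean = sum of all items divided by the number of items.
mean = sum(scores) / len(scores)

print(mean)  # 15.0
```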
Median
The median of ungrouped data is the value of the middle item after arranging the data in
an ascending or descending order.
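Example 1: Compute the median of the following set of scores: 6, 14, 10, 8, 2, 12 and 4.
Arranged in ascending order: 2, 4, 6, 8, 10, 12, 14
The number of items is odd, so the median is the middle (4th) item, which is 8.
Example 2: Find the median of the following set of items: 6, 14, 10, 8, 12 and 4.
Arranged in ascending order: 4, 6, 8, 10, 12, 14
The number of items is even, so the median is the average of the two middle items:
\[ \text{median} = \frac{8 + 10}{2} = 9 \]
Answer: 9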
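The two examples can be checked with a short Python sketch. The function name `median` is an illustrative choice; it sorts the data and then applies the middle-item rule, averaging the two middle items when the count is even.

```python
def median(values):
    """Return the median of ungrouped data."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle item
        return ordered[mid]
    # even count: average of the two middle items
    return (ordered[mid - 1] + ordered[mid]) / 2

median_odd = median([6, 14, 10, 8, 2, 12, 4])   # Example 1
median_even = median([6, 14, 10, 8, 12, 4])     # Example 2

print(median_odd)   # 8
print(median_even)  # 9.0
```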
Mode
The mode for ungrouped data is defined as the value that appears with the highest
frequency. That is, the item that appears most often.
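A minimal Python sketch of this definition follows. The scores are hypothetical and chosen only for illustration, and the function name `modes` is an assumption. It returns every value tied for the highest frequency, since a data set may have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return all values that appear with the highest frequency."""
    counts = Counter(values)              # frequency of each value
    highest = max(counts.values())        # the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)

# Hypothetical scores, not taken from the module.
hypothetical_scores = [3, 5, 5, 7, 8, 5, 9, 7]

print(modes(hypothetical_scores))  # [5]
```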
Example:
Grouped data are data organized and summarized in the form of a frequency
distribution. If the number of observations is greater than (>) 30, the data are treated as grouped
data. These are data classified into categories for better presentation and analysis.
FREQUENCY DISTRIBUTIONS
Raw Data
Raw data are collected data which have not been organized numerically. An example is the
set of masses of 200 male students obtained from an alphabetical listing of college records.
Array
Frequency Distribution
Class interval. This refers to the grouping defined by a lower limit and an upper limit.
Class frequency. This refers to the number of observations belonging to a class interval.
Class mark. This is the midpoint or middle value of the class interval.
Class boundary. This is the more precise expression of the class limits, also called the true limits.
Relative frequency, denoted by % (rf), is derived by taking the ratio of the number of items in each
class to the total frequency. The relative frequency distribution may be expressed in percent, in
which case its total must equal 100%.
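The ratio can be sketched in Python as follows. The class frequencies 8, 13, 21, 6 and 12 are borrowed from the 60-student entrance examination example used later in this module; the variable names are illustrative.

```python
# Class frequencies from the 60-student example in this module.
frequencies = [8, 13, 21, 6, 12]
total = sum(frequencies)  # 60

# Relative frequency in percent: class frequency / total frequency * 100.
relative_percent = [f * 100 / total for f in frequencies]

print([round(rf, 2) for rf in relative_percent])
# [13.33, 21.67, 35.0, 10.0, 20.0]
print(round(sum(relative_percent), 2))  # 100.0
```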
The cumulative frequency is the accumulated frequency of the classes; the accumulation can start
from either the beginning or the end of the distribution.
The “less than” cumulative frequency is the number of observations that are less than the
upper class boundary in a given interval.
The “greater than” cumulative frequency is the number of observations that are greater than
the lower class boundary in a given interval.
Step 5. Organize the class interval. Start the first class with a lower limit equal to or a little bit less
than the lowest observed value.
Step 6. Tally each score to the category of class interval it belongs to.
Class Mark → To obtain the midpoint, add the lower limit and the upper limit and
divide by two. For example, for the class interval 12-13, adding the two limits gives 25, and 25
divided by 2 equals 12.5.
Class Boundary → The exact limits are obtained by adding 0.5 to the upper limit and
subtracting 0.5 from the lower limit. For example, for the class interval 12-13 the exact limits are 11.5-13.5.
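Both rules can be written as small Python helpers. This is a sketch that assumes class limits are whole numbers, so the adjustment is 0.5; the function names and the `half_unit` parameter are illustrative, not part of the module.

```python
def class_mark(lower, upper):
    """Midpoint of a class interval: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2

def class_boundaries(lower, upper, half_unit=0.5):
    """Exact limits: subtract half a unit from the lower limit, add it to the upper."""
    return (lower - half_unit, upper + half_unit)

print(class_mark(12, 13))        # 12.5
print(class_boundaries(12, 13))  # (11.5, 13.5)
```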
On the other hand, the "greater than" cumulative frequency (>cf) is obtained by successively
subtracting the class frequencies, starting from the top. Start with the total number of observations,
which is 100; then (100-5)=95; (95-9)=86; (86-14)=72; (72-20)=52; (52-17)=35; (35-10)=25; (25-12)=13; (13-9)=4.
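The same running totals can be produced in Python. The frequency list below is inferred from the subtractions shown above (5, 9, 14, 20, 17, 10, 12, 9); the last class frequency of 4 is an inference from the final remainder, so treat the list as an assumption.

```python
from itertools import accumulate

# Class frequencies inferred from the subtraction steps above (assumption).
frequencies = [5, 9, 14, 20, 17, 10, 12, 9, 4]
total = sum(frequencies)  # 100

# "Less than" cumulative frequency: add frequencies from the top down.
less_than_cf = list(accumulate(frequencies))

# "Greater than" cumulative frequency: start at the total and subtract
# the frequencies of all classes above the current one.
greater_than_cf = [total - before for before in [0] + less_than_cf[:-1]]

print(less_than_cf)     # [5, 14, 28, 48, 65, 75, 87, 96, 100]
print(greater_than_cf)  # [100, 95, 86, 72, 52, 35, 25, 13, 4]
```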
\[ \bar{x} = \frac{\sum X_i F_i}{n} \]
where:
$X_i$ = class mark
$F_i$ = frequency
n = total number of frequencies
Example
The frequency distribution of the scores of 60 students in an entrance examination is shown
below. Find the mean score.
Class Interval    Frequency ($F_i$)    Class Mark ($X_i$)    $X_i F_i$
18-26             8                    22                    176
27-35             13                   31                    403
36-44             21                   40                    840
45-53             6                    49                    294
54-62             12                   58                    696
                  n = 60                                     $\sum X_i F_i$ = 2409
Solution
\[ \bar{x} = \frac{\sum X_i F_i}{n} \]
\[ \bar{x} = \frac{2409}{60} = 40.15 \]
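The computation can be checked with a short Python sketch. Each class is stored as a hypothetical `(lower limit, upper limit, frequency)` tuple, and the class mark is computed as the midpoint of the limits.

```python
# (lower limit, upper limit, frequency) for each class interval.
classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]

n = sum(f for _, _, f in classes)  # total frequency: 60

# Sum of class mark times frequency over all classes.
weighted_sum = sum(((lower + upper) / 2) * f for lower, upper, f in classes)

grouped_mean = weighted_sum / n

print(weighted_sum)            # 2409.0
print(round(grouped_mean, 2))  # 40.15
```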
Median
The formula for finding the median of grouped data is given as follows:
\[ Mdn = LCB_{Mdn} + c\left(\frac{\frac{n}{2} - {<}cf}{f_i}\right) \]
where:
Mdn = median
$LCB_{Mdn}$ = lower class boundary of the median class
<cf = less than cumulative frequency of the class preceding the median class
$f_i$ = frequency of the median class
c = class size
n = total number of frequencies
Class Interval    Frequency ($f_i$)    Cumulative Frequency (<cf)
18-26             8                    8
27-35             13                   21
36-44             21                   42    (median class)
45-53             6                    48
54-62             12                   60
                  n = 60
n/2 = 60/2 = 30
\[ Mdn = LCB_{Mdn} + c\left(\frac{\frac{n}{2} - {<}cf}{f_i}\right) \]
\[ Mdn = 35.5 + 9\left(\frac{\frac{60}{2} - 21}{21}\right) = 39.36 \]
Answer: 39.36
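The steps above can be sketched in Python. The function name `grouped_median` and the tuple layout `(lower limit, upper limit, frequency)` are illustrative assumptions, and the 0.5 adjustment assumes whole-number class limits.

```python
def grouped_median(classes):
    """Median of grouped data given (lower, upper, frequency) tuples."""
    n = sum(f for _, _, f in classes)
    cf_before = 0                          # <cf of the preceding classes
    for lower, upper, f in classes:
        # The median class is the first one whose <cf reaches n/2.
        if f > 0 and cf_before + f >= n / 2:
            lcb = lower - 0.5              # lower class boundary
            c = (upper + 0.5) - lcb        # class size
            return lcb + c * (n / 2 - cf_before) / f
        cf_before += f
    raise ValueError("no classes with a positive frequency")

classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]

print(round(grouped_median(classes), 2))  # 39.36
```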
Mode
The formula for finding the mode of grouped data is given as follows:
\[ Mo = LCB_{Mo} + c\left(\frac{f_{Mo} - f_1}{2f_{Mo} - f_1 - f_2}\right) \]
where:
Mo = mode
$LCB_{Mo}$ = lower class boundary of the modal class
$f_{Mo}$ = frequency of the modal class
$f_1$ = frequency of the class before the modal class
$f_2$ = frequency of the class after the modal class
c = class size
n = total number of frequencies
Class Interval    Frequency ($f_i$)
18-26             8
27-35             13
36-44             21    (modal class)
45-53             6
54-62             12
                  n = 60
\[ Mo = LCB_{Mo} + c\left(\frac{f_{Mo} - f_1}{2f_{Mo} - f_1 - f_2}\right) \]
\[ Mo = 35.5 + 9\left(\frac{21 - 13}{2(21) - 13 - 6}\right) = 35.5 + 9\left(\frac{8}{23}\right) = 38.63 \]
Ans. 38.63
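A Python sketch of the same calculation follows. The function name `grouped_mode` is illustrative, whole-number class limits are assumed, and treating a missing neighbouring class as having frequency 0 is an assumption made here, not something the module states.

```python
def grouped_mode(classes):
    """Mode of grouped data given (lower, upper, frequency) tuples."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))            # position of the modal class
    f_mo = freqs[k]
    # Assumption: a missing neighbour counts as frequency 0.
    f1 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0
    lower, upper, _ = classes[k]
    lcb = lower - 0.5                      # lower class boundary
    c = (upper + 0.5) - lcb                # class size
    return lcb + c * (f_mo - f1) / (2 * f_mo - f1 - f2)

classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]

print(round(grouped_mode(classes), 2))  # 38.63
```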
MEASURES OF POSITION
Quantiles
Quantiles are a natural extension of the median concept in that they are the values
which divide the distribution into a given number of equal parts. While the median divides the
distribution into two equal parts, the other quantiles divide it into four equal parts (quartiles),
ten equal parts (deciles), or one hundred equal parts (percentiles).
Ungrouped Data
Quartile: $Q_i = \frac{i(n+1)}{4}$th position
Decile: $D_i = \frac{i(n+1)}{10}$th position
Percentile: $P_i = \frac{i(n+1)}{100}$th position
Solution:
\[ Q_3 = \frac{i(n+1)}{4} = \frac{3(12+1)}{4} = 9.75\text{th position} \rightarrow 9\text{th position} + 0.75\,(10\text{th} - 9\text{th})\ \text{position} \]
After arranging the data in ascending order, locate the value at the 9.75th position. Because this
position falls between two items, it has to be interpolated from the given data: take the value in
the 9th position and add 0.75 times the difference between the values in the 10th and 9th positions.
The value of the third quartile is equal to 18.5.
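The position-and-interpolate procedure can be sketched in Python. The twelve scores below are hypothetical and are not the module's data set, so the result differs from 18.5. The function name `ungrouped_quantile` and the clamping of positions that fall outside the data are assumptions made for this illustration.

```python
def ungrouped_quantile(values, i, parts):
    """i-th quantile (parts = 4, 10 or 100) using the i(n+1)/parts position."""
    ordered = sorted(values)               # arrange in ascending order
    n = len(ordered)
    position = i * (n + 1) / parts         # 1-based position, e.g. 9.75
    whole = int(position)                  # e.g. 9
    fraction = position - whole            # e.g. 0.75
    # Assumption: clamp positions that fall outside the data.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    below = ordered[whole - 1]             # value in the 9th position
    above = ordered[whole]                 # value in the 10th position
    return below + fraction * (above - below)

# Hypothetical data with n = 12 (not the module's data set).
hypothetical_scores = [2, 4, 5, 7, 8, 10, 11, 13, 14, 18, 20, 21]

print(3 * (12 + 1) / 4)                               # 9.75
print(ungrouped_quantile(hypothetical_scores, 3, 4))  # 17.0
```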
Grouped Data
\[ Q_i = LCB_{Q_i} + c\left(\frac{\frac{in}{4} - {<}cf_{Q_i - 1}}{f_{Q_i}}\right) \]
where:
$LCB_{Q_i}$ = the lower class boundary of the $Q_i$th class
c = class size
n = total number of observations in the distribution
$<cf_{Q_i - 1}$ = less than cumulative frequency of the class preceding the $Q_i$th class
$f_{Q_i}$ = frequency of the $Q_i$th class
\[ D_i = LCB_{D_i} + c\left(\frac{\frac{in}{10} - {<}cf_{D_i - 1}}{f_{D_i}}\right) \]
where:
$LCB_{D_i}$ = the lower class boundary of the $D_i$th class
c = class size
n = total number of observations in the distribution
$<cf_{D_i - 1}$ = less than cumulative frequency of the class preceding the $D_i$th class
$f_{D_i}$ = frequency of the $D_i$th class
\[ P_i = LCB_{P_i} + c\left(\frac{\frac{in}{100} - {<}cf_{P_i - 1}}{f_{P_i}}\right) \]
where:
$LCB_{P_i}$ = the lower class boundary of the $P_i$th class
c = class size
n = total number of observations in the distribution
$<cf_{P_i - 1}$ = less than cumulative frequency of the class preceding the $P_i$th class
$f_{P_i}$ = frequency of the $P_i$th class
Example
The following is a frequency distribution of an achievement test. Compute the third quartile
(Q3 ).
Solution
\[ \frac{in}{4} = \frac{(3)(60)}{4} = 45 \]
\[ Q_i = LCB_{Q_i} + c\left(\frac{\frac{in}{4} - {<}cf_{Q_i - 1}}{f_{Q_i}}\right) \]
\[ Q_3 = 44.5 + 9\left(\frac{45 - 42}{6}\right) = 49 \]
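Because the quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), one Python function can cover all three. This is a sketch: the function name `grouped_quantile` and the `(lower limit, upper limit, frequency)` layout are illustrative, whole-number class limits are assumed, and the class frequencies are those of the 60-student distribution used in the earlier examples, which is consistent with the values 42, 44.5 and 6 in the solution above.

```python
def grouped_quantile(classes, i, parts):
    """i-th quantile of grouped data; parts is 4, 10 or 100."""
    n = sum(f for _, _, f in classes)
    target = i * n / parts                 # in/4, in/10 or in/100
    cf_before = 0                          # <cf of the preceding classes
    for lower, upper, f in classes:
        # The quantile class is the first one whose <cf reaches the target.
        if f > 0 and cf_before + f >= target:
            lcb = lower - 0.5              # lower class boundary
            c = (upper + 0.5) - lcb        # class size
            return lcb + c * (target - cf_before) / f
        cf_before += f
    raise ValueError("quantile position lies beyond the data")

classes = [(18, 26, 8), (27, 35, 13), (36, 44, 21), (45, 53, 6), (54, 62, 12)]

print(grouped_quantile(classes, 3, 4))             # 49.0  (Q3)
print(round(grouped_quantile(classes, 5, 10), 2))  # 39.36 (D5, same as the median)
```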