MIDTERM
Introduction to Data Management
• Raw data
• Range
• Frequency distribution
• Class limits (Apparent limits)
• Class boundaries (Real limits)
• Interval (width)
• Frequency (f)
• Percentage
• Cumulative frequency
• Midpoint (x)
Frequency Distribution
$\text{Suggested class interval} = \dfrac{\text{Range}}{\text{Number of classes}} = \dfrac{HV - LV}{k}$
Where: HV = Highest value in a data set
LV = Lowest value in a data set
k = number of classes
I = suggested class interval
$2^k \ge n$
Determining class Interval
$\text{Suggested class interval} = \dfrac{\text{Range}}{1 + 3\log n}$
Construct a frequency distribution for the following data.
11 19 11 15 16 10
16 16 15 17 10 27
21 11 13 21 10 16
11 19 24 12 22 13
19 13 18 20 21 11
19 15 11 25 29 23
16 23 10 17 11 27
16 24 12 21 13 12
26 15 11 14 10 12
11 15 18 12 20 13
Determine the following:
a. Range
b. Interval
c. Class limits
d. Relative frequencies
e. Percentage
f. Cumulative frequency
g. Midpoints
Determine the value of k = 1 + 3 log n, where n = 60.
log 60 = 1.7781512
k = 1 + 3(1.7781512)
k = 1 + 5.3344536
k = 6.3344536 ≈ 6; therefore, 6 is the estimated number of classes.
Range: r = 29 − 10 = 19
Class size = 19/6 = 3.17, or 3.
The interval is 3.
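The arithmetic above can be checked with a short Python sketch. It assumes the logarithm is base 10 (which matches log 60 = 1.778) and reads the earlier class-count rule as "the smallest k with 2 to the power k at least n"; the variable names are illustrative only.

```python
import math

n = 60                      # number of observations in the example
highest, lowest = 29, 10    # HV and LV of the data set

# Rule used in the worked solution: k = 1 + 3 log10(n)
k_exact = 1 + 3 * math.log10(n)
k = round(k_exact)

# Alternative rule (assumed reading): smallest k such that 2**k >= n
k_power = 1
while n > 2 ** k_power:
    k_power += 1

data_range = highest - lowest
interval = data_range / k

print(round(k_exact, 2), k, k_power)    # 6.33 6 6
print(data_range, round(interval, 2))   # 19 3.17
```

Both rules suggest 6 classes for n = 60, and the suggested interval of about 3.17 is rounded to 3.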
Class boundaries   Class limits   f    x    fx    Cf   Relative f       Percentage
27.5 – 30.5        28 – 30        1    29   29    60   1/60 = 0.0167    0.0167 × 100 = 1.67
Total, n                          60        981
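As a rough sketch of how the full table can be tallied, the Python below builds every class for the 60 values with an interval of 3. It assumes the first class starts at the lowest value (10) and that class boundaries lie 0.5 beyond the class limits, as in the 28 – 30 row; the dictionary keys are illustrative names.

```python
data = [
    11, 19, 11, 15, 16, 10,
    16, 16, 15, 17, 10, 27,
    21, 11, 13, 21, 10, 16,
    11, 19, 24, 12, 22, 13,
    19, 13, 18, 20, 21, 11,
    19, 15, 11, 25, 29, 23,
    16, 23, 10, 17, 11, 27,
    16, 24, 12, 21, 13, 12,
    26, 15, 11, 14, 10, 12,
    11, 15, 18, 12, 20, 13,
]

width = 3            # class interval from the worked solution
lower = min(data)    # assumption: the first class starts at the lowest value
n = len(data)

rows = []
cumulative = 0
while max(data) >= lower:
    upper = lower + width - 1
    f = sum(1 for v in data if v in range(lower, upper + 1))
    cumulative += f
    midpoint = (lower + upper) / 2
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "f": f,
        "x": midpoint,
        "fx": f * midpoint,
        "cf": cumulative,
        "relative": f / n,
        "percentage": 100 * f / n,
    })
    lower += width

for row in rows:
    print(row["limits"], row["f"], row["x"], row["fx"], row["cf"],
          round(row["relative"], 4), round(row["percentage"], 2))

total_fx = sum(row["fx"] for row in rows)
print(n, total_fx)   # 60 981.0
```

The totals agree with the table: n = 60 and the sum of fx is 981.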
Construct a frequency distribution for the following data: the scores of the students in a Geometry test.
55 63 44 37 50 57 44 57 42 46
58 40 54 65 39 27 28 56 38 45
30 35 56 78 55 27 50 28 44 28
39 37 65 43 33 70 60 61 60 44
Class limits   f (frequency)   x (midpoint)   fx    Cf (cumulative frequency)   Relative frequency   Percentage
72 – 80        1               76             76    40                          0.025                2.5
63 – 71        4               67             268   39                          0.1                  10
54 – 62        11              58             638   35                          0.275                27.5
45 – 53        4               49             196   24                          0.1                  10
36 – 44        12              40             480   20                          0.3                  30
27 – 35        8               31             248   8                           0.2                  20
In computing the median of the grouped data, determine the median class
which contains the (N/2)th score under Cf of the cumulative frequency
distribution. To solve for the median, we use the formula:
$Md = X_{LB} + \left(\dfrac{\frac{N}{2} - cf_b}{f_m}\right) i$
Where: 𝑀𝑑 = median
𝑋𝐿𝐵 = the lower boundary or true lower limit of the median class.
N = total frequency
𝑐𝑓𝑏 = cumulative frequency before the median class
𝑓𝑚 = frequency of the median class
i = size of the class interval
Compute the median given the following data:
Scores in Statistics f
75 – 79 6
70 – 74 7
65 – 69 2
60 – 64 8
55 – 59 12
50 – 54 7
45 – 49 10
40 – 44 8
N 60
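One way to apply the median formula to this table is sketched below in Python. It assumes integer class limits, so the lower boundary is the lower limit minus 0.5 and the class size is upper − lower + 1; the function name `grouped_median` is hypothetical.

```python
# (lower limit, upper limit, frequency), listed from the lowest class up
classes = [
    (40, 44, 8), (45, 49, 10), (50, 54, 7), (55, 59, 12),
    (60, 64, 8), (65, 69, 2), (70, 74, 7), (75, 79, 6),
]

def grouped_median(classes):
    """Median of grouped data: X_LB + ((N/2 - cf_b) / f_m) * i."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf_before = 0                      # cumulative frequency before the class
    for lower, upper, f in classes:
        if cf_before + f >= half:      # this is the median class
            lower_boundary = lower - 0.5
            size = upper - lower + 1
            return lower_boundary + (half - cf_before) / f * size
        cf_before += f
    raise ValueError("the distribution has no observations")

median = grouped_median(classes)
print(round(median, 2))   # 56.58
```

Here N/2 = 30 falls in the class 55 – 59 (cumulative frequency 37), so the sketch evaluates 54.5 + ((30 − 25)/12) × 5.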
Measures of Central Tendency
$\bar{x} = \dfrac{550+420+560+500+700+670+860+480}{8} = \dfrac{4{,}740}{8} = 592.50$
$\mu = \dfrac{53+45+59+48+54+46+51+58+55}{9} = \dfrac{469}{9} = 52.11$
The population mean age of the middle-management employees is 52.11.
MEDIAN
1. The median is unique; there is only one median for a data set.
2. The median is found by arranging the set of data from lowest to
highest (or highest to lowest) and getting the value of the
middle observation.
3. The median is not affected by extremely small or large values.
4. The median can be applied to ordinal, interval, and ratio data.
5. The median is most appropriate for skewed data.
To determine the value of the median for
ungrouped data, we need to consider two rules.
$\bar{x}_w = \dfrac{x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n}{w_1 + w_2 + w_3 + \dots + w_n}$
Solution:
Let w1 = 18, w2 = 12, w3 = 7, w4 = 3
x1 = 30,500, x2 = 33,700, x3 = 38,600, x4 = 45,000
$\bar{x}_w = \dfrac{30{,}500(18) + 33{,}700(12) + 38{,}600(7) + 45{,}000(3)}{18 + 12 + 7 + 3}$
$\bar{x}_w = \dfrac{1{,}358{,}600}{40} = 33{,}965$
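The weighted-mean computation can be reproduced with a few lines of Python. This is only a sketch of the formula using the weights and values from the solution; the variable names are illustrative.

```python
weights = [18, 12, 7, 3]                   # w1..w4
values = [30_500, 33_700, 38_600, 45_000]  # x1..x4

# Numerator: sum of each value times its weight
weighted_sum = sum(x * w for x, w in zip(values, weights))
total_weight = sum(weights)

weighted_mean = weighted_sum / total_weight
print(weighted_sum, total_weight, weighted_mean)   # 1358600 40 33965.0
```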
(ungrouped data)
$\text{Mean} = \dfrac{\text{sum of values}}{\text{the number of values}}$
Example 1:
24 + 25 + 33 + 50+ 53 + 66 + 78 = 329
Step 2. Divide the total by the number of months.
Notebook's Price (X)   f           fX
Php 10                 1,000       Php 10,000.00
20                     500         10,000
25                     500         12,500
30                     100         3,000
                       N = 2,100   ΣfX = Php 35,500.00
$WM = \dfrac{\sum fX}{N} = \dfrac{35{,}500}{2{,}100} = 16.90$
Grouped Data: MEAN
$M = \dfrac{\sum fX}{N}$
where M = mean
f = frequency
X = class mark
$\sum fX$ = sum of the products of frequencies and class marks
N = total frequency
Example 4:
The table summarizes the weights of the cubs. Find the
average weight of the cubs.
N = 45, $\sum fX = 8137.5$
$M = \dfrac{\sum fX}{N} = \dfrac{8137.5}{45} = 180.83$
Exercises:
1. The sizes of pants sold during one business day in a department
store are 32, 38, 34, 42, 36, 34, 40, 44, 32, 34. Find the average
size of the pants.
2. Given the frequency distribution for the weights of 50 pieces of
luggage, compute the mean.
Weight (kilograms)   Number of Pieces, f
7 – 9                2
10 – 12              8
13 – 15              14
16 – 18              19
19 – 21              7
N                    50
Assignment:
$\bar{x} = \dfrac{\sum fX}{N} = \dfrac{763}{50}$
$\bar{x} = 15.26$
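Median, $Md = X_{LB} + \left(\dfrac{\frac{N}{2} - cf_b}{f_m}\right) i$
$= 15.5 + \left(\dfrac{\frac{50}{2} - 24}{19}\right) \times 3$
$= 15.5 + \left(\dfrac{25 - 24}{19}\right) \times 3$
$= 15.5 + \left(\dfrac{1}{19}\right) \times 3$
$= 15.5 + (0.052631578) \times 3$
$= 15.5 + 0.157894736$
$Md = 15.65789474$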
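Mode, $Mo = X_{LB} + \left(\dfrac{df_1}{df_1 + df_2}\right) i$
$= 15.5 + \left(\dfrac{19 - 14}{(19 - 14) + (19 - 7)}\right) \times 3$
$= 15.5 + \left(\dfrac{5}{5 + 12}\right) \times 3$
$= 15.5 + \left(\dfrac{5}{17}\right) \times 3$
$= 15.5 + (0.294117647) \times 3$
$= 15.5 + 0.88235294$
$Mo = 16.38235294$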
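The three results for the luggage weights can be cross-checked with the Python sketch below. It assumes integer class limits (boundary = lower limit − 0.5, class size = upper − lower + 1) and takes df1 and df2 as the differences between the modal frequency and the frequencies of the classes just below and just above it, which is how the worked solution uses them.

```python
# (lower limit, upper limit, frequency), lowest class first
classes = [(7, 9, 2), (10, 12, 8), (13, 15, 14), (16, 18, 19), (19, 21, 7)]
freqs = [f for _, _, f in classes]
n = sum(freqs)

# Mean: sum of f * class mark, divided by N
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

# Median: first class whose cumulative frequency reaches N/2
cf_before = 0
for lo, hi, f in classes:
    if cf_before + f >= n / 2:
        median = (lo - 0.5) + (n / 2 - cf_before) / f * (hi - lo + 1)
        break
    cf_before += f

# Mode: class with the highest frequency, compared with its neighbours
m = freqs.index(max(freqs))
lo, hi, f_mode = classes[m]
df1 = f_mode - (freqs[m - 1] if m > 0 else 0)
df2 = f_mode - (freqs[m + 1] if len(freqs) - 1 > m else 0)
mode = (lo - 0.5) + df1 / (df1 + df2) * (hi - lo + 1)

print(round(mean, 2), round(median, 4), round(mode, 4))   # 15.26 15.6579 16.3824
```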
Ascending order of Classes
Class limits   f   x   fx   Cf   x̄   x − x̄   (x − x̄)²   f(x − x̄)²
$\bar{x} = \dfrac{550+420+560+500+700+670+860+480}{8} = \dfrac{4{,}740}{8} = 592.50$
x       x̄        x − x̄     (x − x̄)²
550     592.50   −42.5     1,806.25
420     592.50   −172.5    29,756.25
560     592.50   −32.5     1,056.25
500     592.50   −92.5     8,556.25
700     592.50   107.5     11,556.25
670     592.50   77.5      6,006.25
860     592.50   267.5     71,556.25
480     592.50   −112.5    12,656.25
Σx = 4,740       Σ(x − x̄) = 0       Σ(x − x̄)² = 142,950
Variance and standard deviation
$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{142{,}950}{8 - 1} = 20{,}421.43$
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{142{,}950}{8 - 1}} = \sqrt{20{,}421.43} = 142.90$
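The deviation table and the two results can be verified with a short Python sketch. It follows the sample formulas (divisor n − 1) and uses the standard-library `statistics` module only as a cross-check; the variable names are illustrative.

```python
import statistics

values = [550, 420, 560, 500, 700, 670, 860, 480]
n = len(values)

mean = sum(values) / n                     # 592.5
deviations = [x - mean for x in values]    # x - x̄ for each value
sum_sq = sum(d ** 2 for d in deviations)   # Σ(x - x̄)²

variance = sum_sq / (n - 1)                # sample variance s²
std_dev = variance ** 0.5                  # sample standard deviation s

print(mean, sum(deviations), sum_sq)              # 592.5 0.0 142950.0
print(round(variance, 2), round(std_dev, 2))      # 20421.43 142.9
print(round(statistics.variance(values), 2))      # 20421.43
```

The deviations sum to zero, which is a quick check that the mean and the signs in the x − x̄ column are right.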
Ungrouped data: $Q_k = \dfrac{k(N+1)}{4}$
Where:
$Q_k$ = quartile
$N$ = population
$k$ = quartile location
Grouped data: $Q_n = X_{LB} + \left(\dfrac{\frac{nN}{4} - cf_b}{f_m}\right) i$
Where:
$Q_n$ = the score corresponding to the nth quartile rank
$X_{LB}$ = lower boundary (true lower limit) of the quartile class
$f_m$ = frequency of the quartile class
$cf_b$ = cumulative frequency of the interval before the quartile class
$i$ = class size
4 = the number of equal parts into which the distribution is divided
$N$ = total frequency
Find the 1st, 2nd, and 3rd quartiles of the ages of 9 middle-management
employees of a certain company.
The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.
1. Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59
2. Locate the 1st, 2nd, and 3rd quartiles using the formula $Q_k = \dfrac{k(N+1)}{4}$.
$Q_1 = \dfrac{1(9+1)}{4} = \dfrac{10}{4} = 2.5$
$Q_2 = \dfrac{2(9+1)}{4} = \dfrac{20}{4} = 5$
$Q_3 = \dfrac{3(9+1)}{4} = \dfrac{30}{4} = 7.5$
3. Identify the 1st, 2nd, and 3rd quartile values in the data set.
45, 46, 48, 51, 53, 54, 55, 58, 59
$Q_1 = 47$ (position 2.5), $Q_2 = 53$ (position 5), $Q_3 = 56.5$ (position 7.5)
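The three steps can be sketched in Python as follows. The sketch assumes that a fractional position such as 2.5 is resolved by interpolating between the neighbouring ordered values, which reproduces the answers above; the function name `quartile` is hypothetical, and the sketch does not guard against positions that fall outside very small data sets.

```python
ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]

def quartile(values, k):
    """k-th quartile of ungrouped data, located at position k(N + 1)/4."""
    ordered = sorted(values)                    # step 1: arrange in order
    position = k * (len(ordered) + 1) / 4       # step 2: 1-based position
    whole = int(position)
    fraction = position - whole
    lower_value = ordered[whole - 1]
    if fraction == 0:
        return lower_value                      # position is a whole number
    upper_value = ordered[whole]
    # step 3: interpolate between the two neighbouring values
    return lower_value + fraction * (upper_value - lower_value)

print(quartile(ages, 1), quartile(ages, 2), quartile(ages, 3))   # 47.0 53 56.5
```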
MEASURES OF RELATIVE POSITION: DECILE
Ungrouped data: $D_k = \dfrac{k(N+1)}{10}$
Where:
$D_k$ = decile
$N$ = population
$k$ = decile location
Grouped data: $D_n = X_{LB} + \left(\dfrac{\frac{nN}{10} - cf_b}{f_m}\right) i$
Where:
$D_n$ = the score corresponding to the nth decile rank
$X_{LB}$ = lower boundary (true lower limit) of the decile class
$f_m$ = frequency of the decile class
$cf_b$ = cumulative frequency of the interval before the decile class
$i$ = class size
10 = the number of equal parts into which the distribution is divided
$N$ = total frequency
MEASURES OF RELATIVE POSITION: PERCENTILE
Ungrouped data: $P_k = \dfrac{k(N+1)}{100}$
Where:
$P_k$ = percentile
$N$ = population
$k$ = percentile location
Grouped data: $P_n = X_{LB} + \left(\dfrac{\frac{nN}{100} - cf_b}{f_m}\right) i$
Where:
$P_n$ = the score corresponding to the nth percentile rank
$X_{LB}$ = lower boundary (true lower limit) of the percentile class
$f_m$ = frequency of the percentile class
$cf_b$ = cumulative frequency of the interval before the percentile class
$i$ = class size
100 = the number of equal parts into which the distribution is divided
$N$ = total frequency
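Because the grouped quartile, decile, and percentile formulas differ only in the divisor (4, 10, or 100), they can be sketched as one Python function. The sketch assumes the target rank is kN divided by the number of parts and that class limits are integers (boundary = lower limit − 0.5); the function name `grouped_quantile` is hypothetical, and the data reused here is the luggage-weight distribution from the earlier exercise.

```python
def grouped_quantile(classes, k, parts):
    """k-th of `parts` equal divisions: X_LB + ((kN/parts - cf_b) / f_m) * i."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    cf_before = 0                       # cumulative frequency before the class
    for lower, upper, f in classes:
        if f > 0 and cf_before + f >= target:
            size = upper - lower + 1
            return (lower - 0.5) + (target - cf_before) / f * size
        cf_before += f
    raise ValueError("k must not exceed the number of parts")

# (lower limit, upper limit, frequency), lowest class first
luggage = [(7, 9, 2), (10, 12, 8), (13, 15, 14), (16, 18, 19), (19, 21, 7)]

q2 = grouped_quantile(luggage, 2, 4)       # 2nd quartile
d5 = grouped_quantile(luggage, 5, 10)      # 5th decile
p50 = grouped_quantile(luggage, 50, 100)   # 50th percentile

print(round(q2, 4), round(d5, 4), round(p50, 4))   # 15.6579 15.6579 15.6579
```

The 2nd quartile, 5th decile, and 50th percentile all coincide with the median of 15.6579 computed earlier for the same data, which is a useful sanity check on the formulas.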
Given the frequency distribution below, calculate the
following:
Find: $Q_1$, $Q_3$, $P_{10}$, $P_{50}$, $D_2$, $D_5$
(Tip: Complete the table with the required information for Q, D, P.)
Statistics Test Results
Class Limits   f    cf
60 – 62        2    40
57 – 59        2    38
54 – 56        4    36
51 – 53        5    32
48 – 50        11   27
45 – 47        8    16
42 – 44        4    8
39 – 41        2    4
36 – 38        1    2
33 – 35        1    1
N = 40
Take Home Exercises
Solve for $D_3$, $D_7$, and $D_9$
Score in Algebra f
75 – 79 6
70 – 74 7
65 – 69 2
60 – 64 8
55 – 59 12
50 – 54 7
45 – 49 10
40 – 44 8
N 60
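SUMMARY OF FORMULAS: UNGROUPED AND GROUPED DATA
MEAN
Ungrouped: $\bar{x} = \dfrac{\sum x}{n}$
Grouped: $\bar{x} = \dfrac{\sum fx}{n}$
MEDIAN
Ungrouped: middlemost value
Grouped: $\tilde{x} = X_{LB} + \left(\dfrac{\frac{N}{2} - cf_b}{f_m}\right) i$
MODE
Ungrouped: most frequent value(s)
Grouped: $\hat{x} = X_{LB} + \left(\dfrac{df_1}{df_1 + df_2}\right) i$
VARIANCE
Ungrouped: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
Grouped: $s^2 = \dfrac{\sum f(x - \bar{x})^2}{n - 1}$
STANDARD DEVIATION
Ungrouped: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Grouped: $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n - 1}}$
QUARTILE (1 – 4)
Ungrouped: $Q_k = \dfrac{k(N+1)}{4}$
Grouped: $Q_n = X_{LB} + \left(\dfrac{\frac{nN}{4} - cf_b}{f_m}\right) i$
Example: $Q_2 = X_{LB} + \left(\dfrac{\frac{(2)N}{4} - cf_b}{f_m}\right) i$
DECILE (1 – 10)
Ungrouped: $D_k = \dfrac{k(N+1)}{10}$
Grouped: $D_n = X_{LB} + \left(\dfrac{\frac{nN}{10} - cf_b}{f_m}\right) i$
Example: $D_5 = X_{LB} + \left(\dfrac{\frac{(5)N}{10} - cf_b}{f_m}\right) i$
PERCENTILE (1 – 100)
Ungrouped: $P_k = \dfrac{k(N+1)}{100}$
Grouped: $P_n = X_{LB} + \left(\dfrac{\frac{nN}{100} - cf_b}{f_m}\right) i$
Example: $P_{75} = X_{LB} + \left(\dfrac{\frac{75N}{100} - cf_b}{f_m}\right) i$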
GROUPED DATA
Class limits   f    x    fx    Cf   x̄       x − x̄    (x − x̄)²   f(x − x̄)²
16 – 18        19   17   323   43   15.26   1.74     3.0276     57.5244
13 – 15        14   14   196   24   15.26   −1.26    1.5876     22.2264
10 – 12        8    11   88    10   15.26   −4.26    18.1476    145.1808
7 – 9          2    8    16    2    15.26   −7.26    52.7076    105.4152
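A minimal Python sketch of the grouped variance and standard deviation is given below. It assumes the full luggage-weight distribution from the earlier exercise, including the 19 – 21 class with f = 7, and applies the grouped formulas with the divisor n − 1; the variable names are illustrative, and the final figures are computed by the sketch rather than taken from the slides.

```python
# (lower limit, upper limit, frequency); includes the 19-21 class (f = 7)
# from the luggage exercise, which is an assumption about the intended data
classes = [(7, 9, 2), (10, 12, 8), (13, 15, 14), (16, 18, 19), (19, 21, 7)]

n = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# f(x - x̄)² for each class, then the total
weighted_sq = [f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)]
sum_f_sq = sum(weighted_sq)

variance = sum_f_sq / (n - 1)      # grouped sample variance
std_dev = variance ** 0.5          # grouped sample standard deviation

print([round(v, 4) for v in weighted_sq])
print(round(sum_f_sq, 2), round(variance, 4), round(std_dev, 4))
```

The first four entries of `weighted_sq` correspond to the f(x − x̄)² column in the table (105.4152, 145.1808, 22.2264, 57.5244), listed here from the lowest class up.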