SYLLABUS
TYPES OF DATA
Examples:
Variables are generally classified into two types: qualitative and quantitative.
A qualitative variable yields categorical responses, while a quantitative
variable yields numerical responses representing an amount or quantity.
Examples:
Quantitative variables, on the other hand, can be either discrete or continuous. A discrete
quantitative variable assumes a finite or countably infinite set of values, such as 0, 1, 2, 3, …,
and is usually obtained through the process of counting. A continuous quantitative variable
assumes values which are associated with points on an interval of the number line. These
values are usually obtained through the process of measurement, with corresponding units.
Examples:
Variables can also be classified according to their levels of measurement. These
are scales for measuring data.
Nominal data are the crudest form of data. They use numbers or symbols for the
purpose of categorizing subjects into groups or categories that are mutually exclusive.
Thus, being in one category automatically excludes one from being a member of another
category. Moreover, the categories are exhaustive; that is, all possible categories of a
variable should be included.
Examples:
Ordinal data possess all the properties of nominal data. Hence, it can
be said that ordinal data are an improvement on nominal data because here the
data are also ranked or ordered in a “bottom to top” or “high to low” scheme.
Examples:
a.) Student’s class standing is an ordinal data. These are categorized into:
5 – Excellent
4 – Very Good
3 – Good
2 – Fair
1 - Poor
Interval data possess all the properties of nominal and ordinal data.
Here, the data are numeric in nature and the distances between any two
numbers are known. However, interval data, although numeric, do not
have a true starting point or absolute zero.
Examples:
Here we can say that the difference between 140 and 70 is the same as
the difference between 145 and 75. But we cannot claim that the second student
is twice as intelligent as the first. Is there such a thing as zero IQ?
Ratio data possess all the properties of nominal, ordinal and
interval data. They are also numeric in nature and have an absolute zero point. Thus, with
ratio data, we can classify, order or rank them, and likewise compare
their magnitudes.
Examples:
There are also other classifications of data. Raw data are those which
are in their original form and structure. Responses from surveys, taped
interviews, and recorded observations are examples of raw data. Grouped data,
on the other hand, are those placed and summarized in tabular form.
METHODS OF DATA COLLECTION
The observation method is the simplest data collection technique. Here, the
data are obtained by merely observing the behaviour of persons or objects, but only at a
particular time of occurrence. The data obtained are called observation data.
The experimental method is especially useful when one wants to collect data
for cause and effect studies under controlled conditions. In this method, there is actual
interference with the conditions and situations that can affect the variable under study. The
data obtained in this method are called experimental data.
RAW SCORES
RAW SCORES OF 52 STUDENTS IN AN ACHIEVEMENT TEST
7
172 163 173 183 179 173
23 36 40 31
29 15 34 36
31 24 40 45
34 57 20 45
16 33 37 37
12 27 14 43
36 41 41 52
8
39 25 46 49
34 22 21 40
27 18 35 40
39 30 41 42
51 38 16 27
24 26 32 34
26 38 46 41
40 30 45 44
38 29 34 35
32 19 18 26
37 38 32 50
43 29 25 29
37 41 51 35
48 40 34 31
33 43 37 46
34 34 38 20
40 14 31 32
33 42 38 59
The master sheet or classifier is a device used in arranging scores
or statistical data. With it, scores are easily and conveniently arranged
from the highest to the lowest, or vice versa, and the frequency of each score
is easily determined. It is a preparatory step to the ranking and grouping of
scores.
2. Subtract the tens of the lowest score from the tens of the highest
score. Add 4 (constant) to the difference to determine the number
of horizontal lines.
5. Tally the raw scores in the cells where they fall. The first digit or
digits of a score are represented by the vertical tens and the last
digit by the horizontal units.
6. Count the tallies in every cell and write the total frequencies
corresponding to the tens and the units. Add the total frequencies
of the tens and of the units. The sums must be equal. (These sums
correspond to the number of raw scores or cases under consideration.)
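The tallying steps above can be sketched in a few lines of Python. This is only an illustration: the function name `master_sheet` and the ten sample scores are hypothetical and are not taken from the exercise data.

```python
from collections import Counter

def master_sheet(scores):
    """Tally each score into the cell for its tens (row) and units (column)."""
    cells = Counter((score // 10, score % 10) for score in scores)
    tens_totals = Counter()
    units_totals = Counter()
    for (tens, units), tally in cells.items():
        tens_totals[tens] += tally
        units_totals[units] += tally
    return cells, tens_totals, units_totals

# Hypothetical scores, chosen only for illustration.
sample_scores = [59, 57, 52, 51, 51, 43, 43, 20, 14, 12]
cells, tens_totals, units_totals = master_sheet(sample_scores)

# Step 2: tens of the highest score minus tens of the lowest, plus the constant 4.
horizontal_lines = max(sample_scores) // 10 - min(sample_scores) // 10 + 4
```

As in step 6, the row totals and the column totals must both add up to the number of cases.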
10
Name Course Date
U N I T S
0 1 2 3 4 5 6 7 8 9 TOTAL
T 19 I I I 3
18 I I I I I I 6
E 17 II II III I I I I 11
16 II III II I II I IIII II I 18
N 15 I I I II II I I 9
14 I I I I 4
S 13 I 1
TOTAL 1 5 8 9 1 2 6 9 7 4 N=52
11
Difference = 6t
+ Constant + 4
Horizontal Lines = 10
U N I T S
0 1 2 3 4 5 6 7 8 9 TOTAL
5 I II I I I 6
12
2 II I I I II II III III IIII 19
1 I II I II II I 9
TOTAL 12 12 9 7 13 9 11 9 9 9 N=100
H.S. = 59 – 5t
L.S. = 12 – 1t
Difference = 4t
+ Constant + 4
Horizontal Lines = 8
Constant 13 Vertical line
RANKING OF SCORES
From the ranks, it is possible to distinguish the bright from the dull and the
mediocre pupils. It is also possible to determine the percentage of pupils who surpass a
given pupil and the percentage surpassed by him. Ranks are also used in the computation of the
coefficient of correlation.
1. By using the master sheet, arrange the scores from the highest to the
lowest, writing each score as many times as it appears.
2. Number the scores consecutively, giving the highest score tentative rank 1,
the next highest 2, and so on to the lowest score. (The tentative rank of the
lowest score is equal to the total number of cases, N.)
3. Assign the real ranks. Scores appearing only once keep their tentative
ranks as their real ranks. Scores appearing more than once have the average
of their ordinal numbers (tentative ranks) as their real ranks. (Identical
scores have equal ranks.)
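The ranking procedure above can be sketched as follows. The function name `rank_scores` and the six sample scores are hypothetical; the logic simply averages the tentative ranks of tied scores.

```python
def rank_scores(scores):
    """Return (score, tentative rank, real rank) rows, highest score first."""
    ordered = sorted(scores, reverse=True)

    # Collect the tentative ranks held by each distinct score.
    tentative = {}
    for rank, score in enumerate(ordered, start=1):
        tentative.setdefault(score, []).append(rank)

    # The real rank of a score is the average of its tentative ranks.
    rows = []
    for rank, score in enumerate(ordered, start=1):
        tied = tentative[score]
        rows.append((score, rank, sum(tied) / len(tied)))
    return rows

# Hypothetical scores, chosen only for illustration.
sample_scores = [172, 176, 173, 172, 173, 173]
ranked = rank_scores(sample_scores)
```

Here the three scores of 173 hold tentative ranks 2, 3 and 4 and therefore share the real rank 3, while the two scores of 172 share 5.5.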
SCORES TR RR SCORES TR RR
14
198 1 1 165 29
193 3 3 164 31 31
189 4 4 163 32
32.5
188 5 5 163 33
186 6 6 162 34
183 7 7 162 35 35
182 8 8 162 36
180 9 9 161 37
37.5
179 10 10 161 38
178 11 11 159 39 39
177 12 12 158 40 40
176 13 13 157 41
41.5
173 14 157 42
173 15 15 156 43
172 17 153 45 45
17.5
172 18 152 46 46
171 19 151 47 47
19.5
171 20 148 48 48
169 21 21 147 49 49
168 22 143 50 50
167 24 136 52 52
167 25 25.5 N = 52
15
167 26 Legend:
59 1 1 37 41 26 80
57 2 2 37 42 26 81 81
52 3 3 37 43 43 26 82
51 4 37 44 25 83
4.5 83.5
51 5 37 45 25 84
50 6 6 36 46 24 85
85.5
49 7 7 36 47 47 24 86
47 8 8 36 48 23 87 87
46 9 10 35 49 50 22 88 88
46 10 35 50 21 89 89
46 11 35 51 20 90 90.5
16
45 12 34 52 20 91
45 13 13 34 53 19 92 92
45 14 34 54 18 93
93.5
44 15 15 34 55 18 94
55.5
43 16 34 56 16 95
95.5
43 17 17 34 57 16 96
43 18 34 58 15 97 97
42 19 34 59 14 98
19.5 98.5
42 20 33 60 14 99
41 21 33 61 61 12 100 100
41 22 33 62 N = 100
41 23 23 32 63
41 24 32 64
64.5
41 25 32 65
40 26 32 66
40 27 31 67
40 28 31 68 68.5
40 29 29 31 69
40 30 31 70
40 31 30 71 71.5
40 32 30 72 Legend:
39 33 29 73 TR = Tentative Rank
33.5
39 34 29 74 RR= Real Rank
74.5
38 35 29 75
38 36 37.5 29 76
38 37 27 77 78
17
38 38 27 78
38 39 27 79
38 40
1. By using the master sheet, arrange the scores from the highest to the
lowest, writing each score only once.
2. Take the raw scores and place a tally after each score as many times as
the score appears.
3. Count the tallies opposite each score and write the number opposite the
tallies themselves. This number of tallies is the frequency (f) of the score.
4. Add the frequencies and write the sum at the bottom of the tabulation to
get N, the total of scores or cases.
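A minimal sketch of the same tabulation in Python, assuming a hypothetical helper `score_distribution` and made-up scores:

```python
from collections import Counter

def score_distribution(scores):
    """List each distinct score once, highest first, with its frequency f."""
    counts = Counter(scores)
    table = [(score, counts[score]) for score in sorted(counts, reverse=True)]
    n = sum(f for _, f in table)  # N, the total number of scores or cases
    return table, n

# Hypothetical scores, chosen only for illustration.
sample_scores = [41, 40, 41, 38, 40, 41]
table, n = score_distribution(sample_scores)
```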
EDUCATION 602 – Statistics
Score Distribution
Exercise No. 3
19
182 I I 165 II 2 147 I 1
13
177 I I 161 II 2
176 I I 159 I 1
13 26
SUMMARY:
I = 13
II = 26
III = 13
IV = 52
Score Distribution
Exercise No.3
20
Scores: Tallies Freq. Scores: Tallies Freq. Scores Tallies Freq.
:
59 I 1 40 IIII-II 7 26 III 3
57 I 1 39 II 2 25 II 2
52 I 1 38 IIII-I 6 24 II 2
51 II 2 37 IIII 5 23 I 1
50 I 1 36 III 3 22 I 1
49 I 1 35 III 3 21 I 1
48 I 1 34 IIII-III 8 20 II 2
46 III 3 33 III 3 19 I 1
45 III 3 32 IIII 4 18 II 2
44 I 1 31 IIII 4 16 II 2
43 III 3 30 II 2 15 I 1
21
42 II 2 29 IIII 4 14 II 2
41 IIII 5 27 III 3 12 I 1
25 54 21
SUMMARY:
I = 25
II = 54
III = 21
N = 100
Data collected from the test and experiments may have little meaning to
the investigator until they have been arranged or classified in some systematic
way. The first task therefore is to organize our materials and this leads naturally
to a grouping of scores into classes or steps.
1. Determine the range. The range is the gap between the highest and the
lowest scores: the difference that results when the lowest score is
subtracted from the highest score.
2. Determine the class interval. (a) To minimize error and to avoid too
much labor, it is suggested that the number of steps should not be less
than 10 nor more than 20. The ideal number is between 12 and
15. In exceptional cases the number of steps that a given range yields
may be below 10 or above 20. A good rule is to select an odd
number for the class interval (i) which will give a quotient of between ten
and fifteen when the range is divided by it. Be sure the interval chosen will
neither spread the data out too much, thus losing the benefit of grouping, nor
crowd the scores into coarse categories. (b) Another method of
determining i (by Ross): add 1 (constant) to the range and divide
the sum by 12 (constant).
3. Determine the limits of the classes or steps. For the lower limit of the
highest step, choose a number which is nearest to or equal to the highest
score, but not exceeding it, and which is exactly divisible by the size of the
class interval. The upper limit is determined by adding to the lower limit
one number less than i. The succeeding limits are determined by
subtracting the size of i from the preceding lower and upper limits.
4. Make the tabulation. Tally the raw scores opposite their proper interval or
class. The total number of tallies of each class interval (frequency) is
written in a column labelled f. The sum of the f column is called N (number of
cases).
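The steps above can be sketched in Python as follows. The function names are hypothetical; only the highest score (198), the lowest score (136) and the interval i = 5 are taken from the worked exercise.

```python
def ross_quotient(high, low):
    """Rule 2(b): add 1 to the range and divide the sum by 12."""
    return (high - low + 1) / 12

def class_intervals(high, low, i):
    """Step 3: build the (lower, upper) limits from the highest step down."""
    lower = (high // i) * i  # nearest number not exceeding `high` divisible by i
    intervals = []
    while lower + i - 1 >= low:
        intervals.append((lower, lower + i - 1))
        lower -= i
    return intervals

quotient = ross_quotient(198, 136)        # 63 / 12
intervals = class_intervals(198, 136, 5)
```

With these inputs the highest step is 195-199 and the lowest is 135-139, which matches the tabulated distribution of the 52 scores.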
195-199 II 2
190-194 I 1
23
185-189 IIII 3
180-184 III 3
175-179 IIII 4
170-174 IIII – II 7
155-159 IIII – I 6
150-154 III 3
145-149 II 2
140-144 II 2
135-139 I 1
N = 52
H = 198
L = 136
Range = H – L = 62
Range + 1 (constant) = 63
63 ÷ 12 (constant) = 5, remainder 3
i = 5 (the quotient is an odd number, so it automatically becomes the interval i)
Frequency Distribution
Exercise No.4
57-59 II 2
H = 59 3 = i
54-56 0
L = 12
51-53 III 3 4
47 12 48
48-50 III 3
48
25
45-47 IIII - I 6
4 0
42-44 IIII - I 6 3 12
Rule No.2 :
27-29 IIII - II 7
21-23 III 3
18-20 IIII 5
15-17 III 3
12-14 III 3
N = 100
26
THE FREQUENCY POLYGON
The frequency polygon and the histogram have the same uses. The score
opposite the summit of the frequency polygon is the crude mode; the midpoint of the top
of the highest rectangle of the histogram is likewise a crude mode. From the frequency polygon
and the histogram, it is possible to glean the “representativeness” of the group
concerned. If the graph plotted is similar to the shape of a bell, the group is more or less
typical. The more irregular the shape of the graph, the less representative is the
group. It is also possible to note the tendency of the measures: whether they are piled
up at the low (or high) end of the scale or evenly and regularly distributed over the
scale. If the test is too easy, the scores accumulate at the high end of the scale, whereas if
the test is too hard, the scores crowd at the low end of the scale. When the test is well
suited to the group, the scores will be distributed symmetrically around the mean, with a few
individuals scoring quite high, a few quite low, and the majority falling somewhere near
the middle of the scale.
The frequency polygon is less precise than the histogram in that it does not
represent accurately, i.e. in terms of areas, the frequency upon each interval. In
comparing two or more graphs plotted on the same axes, however, the frequency
polygon is likely to be more useful as the vertical and horizontal lines in the histogram
will often coincide.
1. Labelling the points on the base line. There are several ways of labelling the
intervals along the base line (X axis) of the frequency polygon. For example, step
195-199 of our frequency distribution, which has an i of 5, may be interpreted as
having the following limits:
a. Expressed limits: 195-199. This means that this interval begins with score
195 and ends with score 199. These limits are ideal for grouping the
scores in a frequency distribution because of the ease in tallying.
b. Score limits: 195-200. This means that all scores
from 195 up to, but not including, 200 fall within this grouping.
These limits are conveniently used in labelling the points on the base
line of the frequency polygon, but are inconvenient in tallying
because it is fairly easy to let score 200 slip into interval
195-200 owing simply to the presence of 200 at the upper limit of
the interval.
3. Drawing the frequency polygon. When all the points have been located in
the diagram, they are joined by a series of short lines to form the
frequency polygon. In order to complete the figure (i.e., to bring it down to
the base line), one additional interval at the low end and one additional
interval at the high end of the distribution are included on the X scale. The
frequency on each of these intervals is, of course, zero.
Steps in constructing a frequency polygon:
1. Draw two straight lines perpendicular to each other, the vertical line near the left side
of the paper, the horizontal line near the bottom. Label the vertical line (the Y
axis) OY, and the horizontal line (the X axis) OX. Put 0 where the two lines
intersect. This point is the origin.
2. Lay off the score intervals of the frequency distribution at regular distances along
the X axis. Begin with the interval next below the lowest in the distribution, and end
with the interval next above the highest in the distribution. Label the successive X
distances with the score interval limits. Select an X unit which will allow all the
intervals to be represented easily on the graph paper.
3. Mark off on the Y axis successive units to represent the scores (the frequencies) on
the different intervals. Choose a Y scale which will make the largest frequency
(the height of the polygon) approximately 75% (60-80%) of the width of the
figure.
4. At the midpoint of each interval on the X axis go up in the Y direction a distance
equal to the number of scores on the interval. Place points at these locations.
5. Join the points plotted in step 4 with straight lines to give the frequency surface.
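The points to be plotted in steps 2 to 4 can be computed as in the sketch below. The function name `polygon_points` and the three-step distribution are hypothetical; the sketch only produces the coordinates and leaves the drawing itself to graph paper or to any plotting tool.

```python
def polygon_points(steps):
    """steps: (lower, upper, f) tuples, lowest interval first.

    Returns (midpoint, frequency) pairs, including the extra zero-frequency
    interval below the lowest step and above the highest step.
    """
    i = steps[0][1] - steps[0][0] + 1           # size of the class interval
    first_lower, last_upper = steps[0][0], steps[-1][1]
    padded = (
        [(first_lower - i, first_lower - 1, 0)]
        + list(steps)
        + [(last_upper + 1, last_upper + i, 0)]
    )
    return [((lower + upper) / 2, f) for lower, upper, f in padded]

# Hypothetical three-step distribution, chosen only for illustration.
sample_steps = [(135, 139, 1), (140, 144, 2), (145, 149, 2)]
points = polygon_points(sample_steps)
```

Joining the returned points in order with straight lines gives a polygon that starts and ends on the base line.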
Exercise No.5
Polygon - a figure, especially a closed plane figure, having 3 or more straight sides.
-visual aids
-attention getting
1) frequency polygon
2) histogram
29
Three types of limits:
135-139
2. Score limits- score added- limits ideal for labelling for graphs
140-145 (140-144)
135-140 (135-139)
(2) computation
Exercise No.5
[Figure: frequency polygon of the 52 scores. X axis: SCORES, labelled 130 to 205 in steps of 5 (intervals numbered 1 to 16); Y axis: FREQUENCY, 0 to 10.]
Added = 2
165-169 10 165-169 10
150-154 3 150-154 3 - 96
31
145-149 2 145-149 2 40
140-144 2 140-144 2 - 36
135-139 2 135-139 2 4
130-134 1 130-134 1
N=52 N=52
Orig. Steps = 16
Constant = 1
54-56 0 54-56 0
45-47 6 45-47 6 19
39-41 14 39-41 14 95
32
30-32 10 30-32 10
27-29 7 27-29 7
12-14 3 12-14 3
9-11 0 9-11 0
N=100 N=100
[Figure: frequency polygon of the 100 scores. X axis: scores, labelled 9 to 63 in steps of 3 (intervals numbered 1 to 19); Y axis: FREQUENCY, 0 to 15.]
ORIGINAL AND SMOOTHED FREQUENCY POLYGONS
1. Find the adjusted or “smoothed” f of every class interval (including the one
additional step next below the lowest in the distribution and the one additional
step next above the highest in the distribution) by adding the f on the
given interval and the f’s on the two adjacent intervals (the interval just
below and the interval just above) and dividing the sum by three (3). (The
total of all the adjusted frequencies should equal the number of cases.)
2. Using the same graph paper on which the original frequency
polygon has been constructed, place a point at the midpoint of each
interval on the X axis corresponding to the adjusted f in the Y direction.
3. Join the points plotted in step 2 with straight or dotted lines (using
a different ink color) to complete the smoothed polygon.
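Step 1 can be checked with a short sketch. The function name `smooth` is hypothetical; the frequencies are those of the 100-score distribution used in these exercises (steps 60-62 down to 9-11, with the two added zero-frequency steps at the ends), and exact fractions are used so that thirds are not rounded.

```python
from fractions import Fraction

def smooth(frequencies):
    """Average each f with the f's of the two adjacent intervals."""
    padded = [0] + list(frequencies) + [0]  # nothing lies beyond the end steps
    return [
        Fraction(padded[k - 1] + padded[k] + padded[k + 1], 3)
        for k in range(1, len(padded) - 1)
    ]

# Steps 60-62, 57-59, ..., 12-14, 9-11 (the first and last are the added steps).
frequencies = [0, 2, 0, 3, 3, 6, 6, 14, 14, 14, 10, 7, 7, 3, 5, 3, 3, 0]
smoothed = smooth(frequencies)
total = sum(smoothed)
```

The total of the smoothed frequencies equals N, as the parenthetical check in step 1 requires.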
[Figure: original and smoothed frequency polygons of the 52 scores. X axis: SCORES, labelled 130 to 205 in steps of 5 (intervals numbered 1 to 16); Y axis: FREQUENCY, 0 to 10.]
35
Scores F Scores F
195-199 2 195-199 2 1
190-194 1 190-194 1 2
170-174 7 170-174 7 7
160-164 8 160-164 8 8
135-139 2 135-139 2 1
or 52
36
Expressed Limits Score Limits
60-62 0 60-62 0 2
/3
57-59 2 57-59 2 2
/3
54-56 0 54-56 0 1 2
/3
51-53 3 51-53 3 2
48-50 3 48-50 3 4
45-47 6 45-47 6 5
42-44 6 42-44 6 8 2
/3
36-38 14 36-38 14 14
27-29 7 27-29 7 8
24-26 7 24-26 7 5
21-23 3 21-23 3 5 2
/3
18-20 5 18-20 5 3 2
/3
15-17 3 15-17 3 3 2
/3
12-14 3 12-14 3 2
9-11 0 9-11 0 1
[Figure: original and smoothed frequency polygons of the 100 scores. X axis: scores, labelled 9 to 63 in steps of 3 (intervals numbered 1 to 19); Y axis: FREQUENCY, 0 to 15.]
In the frequency polygon, all the scores within a given interval are
represented by the midpoint of that interval, whereas in a histogram the scores
are assumed to be spread uniformly over the entire interval. Within each interval
of a histogram the frequency is shown by a rectangle, the base of which is the
length of the interval, and the height of which is the number of scores within the
interval.
1. Draw OX and OY as in the frequency polygon and lay off equal distances
on both axes: OX for the steps and OY for the frequencies.
2. Lay off the score intervals of the frequency distribution along the X-axis.
Begin with the highest interval in the distribution.
4. Draw a line limited by the lower and upper limits of each step, instead of a
point at the midpoint as in the frequency polygon.
5. Connect by straight vertical lines every two adjacent ends of the lines. The
figure is a histogram.
6. Shade the histogram to bring out clearly the total area of the figure.
[Figure: histogram of the 52 scores, drawn as unit cells numbered 1 to 52. X axis: SCORES, labelled 135 to 200 in steps of 5; Y axis: FREQUENCY, 1 to 10.]
Score Freq.
Vertical Lines:
185-189 3
175-179 4 Constant + 1
170-174 7 14
165-169 10
40
150-154 3 frequency
145-149 2 polygon
140-144 2
135-139 2
130-134 1
N=52
[Figure: histogram of the 100 scores, drawn as unit cells numbered 1 to 100. X axis: SCORES, labelled 12 to 60 in steps of 3; Y axis: FREQUENCY, 1 to 13.]
Score Freq.
Vertical Lines:
51-53 3
45-47 6 Constant + 1
42-44 6 17
39-41 14
30-32 10 frequency
27-29 7 polygon
24-26 7
21-23 3
18-20 5
15-17 3
12-14 3
42
N=100
THE MEAN
The mean is the average of the scores or measures. It is the sum of the
separate scores divided by their number. It is dependent on the magnitude of
the scores. Changing a score even by one, more or less, changes the mean.
The Absolute Mean
There are three methods of computing the mean. One method, the long or
absolute method, which is used when the data are ungrouped, is the subject of
this exercise.
3. Divide the sum by the number of cases. The quotient is the arithmetic
mean, or simply the mean (M).
$$M = \frac{\sum X}{N}$$
Where:
∑ = sum of
X = the separate scores
N = number of cases
188 173 167 173
I = 2,267
II = 2,077
III = 2,166
IV = 2,183
∑X = 8,693
$$M = \frac{\sum X}{N} = \frac{8{,}693}{52} = 167.17$$
The Mean by the Absolute Mean
Exercise No.8 a
23 48 29 41 43
29 33 19 16 52
31 34 38 32 49
34 40 29 46 40
16 33 41 45 40
12 36 40 34 42
36 15 43 18 27
39 24 34 32 34
34 57 14 25 41
27 33 42 51 44
39 27 40 34 35
51 41 34 37 26
24 25 40 38 50
26 22 20 31 29
40 18 37 38 35
38 30 14 31 31
32 38 41 36 46
37 26 46 45 20
43 38 21 45 32
37 30 35 37 59
I = 648
II = 648
III = 657
IV = 712
V = 775
∑X = 3,440
$$M = \frac{\sum X}{N} = \frac{3{,}440}{100} = 34.40$$
48
a. When a quick and approximate measure of central tendency is all that is
wanted.
b. When the measure of central tendency should be the most typical value.
When the scores are many and are grouped into a frequency distribution,
the mean may be computed by using the midpoint method, the formula of which
is:
$$M = \frac{\sum fx}{N}$$
Procedures in calculating the mean by the midpoint method:
1. Lay off the frequency distribution, showing the class or step intervals
(column 1) and their corresponding frequencies (column 2). Add the
frequencies to get the number of cases (N).
2. Determine the midpoint (x) of every interval and enter it in column 3. The
midpoint is equal to the lower limit plus one-half of the interval.
3. Multiply the frequencies (f) by their midpoint values (x) and enter the
products in column 4.
4. Add the products (fx) of all the steps to get their sum (∑fx).
5. Divide the sum (∑fx) by the total number of cases (N) to obtain the mean.
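The five steps can be sketched in Python as below. The function name `midpoint_mean` is hypothetical; the steps and frequencies are those of the 100-score frequency distribution tabulated in these exercises.

```python
def midpoint_mean(steps):
    """steps: (lower, upper, f) tuples. Returns (sum of fx, N, mean)."""
    n = sum(f for _, _, f in steps)
    # Midpoint x of each step times its frequency f, summed over all steps.
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in steps)
    return total_fx, n, total_fx / n

# Steps 57-59 down to 12-14 (i = 3) with their frequencies.
frequencies = [2, 0, 3, 3, 6, 6, 14, 14, 14, 10, 7, 7, 3, 5, 3, 3]
steps = [(lower, lower + 2, f) for lower, f in zip(range(57, 11, -3), frequencies)]

total_fx, n, mean = midpoint_mean(steps)
```

For this distribution the products add up to 3436, so the mean is 3436 / 100 = 34.36.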
Scores F X fx
195-199 2 197 394
N = 52 ∑fx = 8684
$$M = \frac{\sum fx}{N} = \frac{8684}{52} = 167$$
Scores F X fx
57-59 2 58 116
54-56 0 55 0
51-53 3 52 156
48-50 3 49 147
45-47 6 46 276
42-44 6 43 258
39-41 14 40 560
36-38 14 37 518
33-35 14 34 476
30-32 10 31 310
27-29 7 28 196
24-26 7 25 175
52
21-23 3 22 66
18-20 5 19 95
15-17 3 16 48
12-14 3 13 39
∑fx = 3436
$$M = \frac{\sum fx}{N} = \frac{3436}{100} = 34.36$$
The short method is another method of computing the mean when the
scores are many (more than 30) and are grouped into a frequency
distribution. It is less cumbersome than the midpoint method because in the
short method smaller figures are handled, facilitating computation.
The formula used in calculating the mean by the short method is:
$$M = AM + \left(\frac{\sum fd}{N}\right) i$$
Where:
AM = assumed mean, the midpoint of the step chosen as origin
d = deviation upward or downward from AM
1. Lay off the frequency distribution showing the class or step intervals
(column 1) and their corresponding frequencies (column 2). Add the
frequencies to get the total number of cases (N).
2. Take the midpoint of any step as an assumed mean (AM). From this lay
off positive deviations upward and negative deviations downward
(column 3).
5. Divide the sum (∑fd) by N and multiply the quotient by the size of the class
interval (i) to get the correction.
6. Add the correction to the assumed mean to obtain the true mean.
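A minimal sketch of the short method, assuming a hypothetical helper `short_method_mean`. The data are the 100-score distribution with AM = 34 (the midpoint of step 33-35), as in the exercise.

```python
def short_method_mean(steps, assumed_mean):
    """steps: (lower, upper, f) tuples. Returns (sum of fd, mean)."""
    i = steps[0][1] - steps[0][0] + 1  # size of the class interval
    n = sum(f for _, _, f in steps)
    sum_fd = 0
    for lower, upper, f in steps:
        midpoint = (lower + upper) / 2
        d = round((midpoint - assumed_mean) / i)  # deviation in step units
        sum_fd += f * d
    correction = sum_fd / n * i
    return sum_fd, assumed_mean + correction

# Steps 57-59 down to 12-14 (i = 3) with their frequencies.
frequencies = [2, 0, 3, 3, 6, 6, 14, 14, 14, 10, 7, 7, 3, 5, 3, 3]
steps = [(lower, lower + 2, f) for lower, f in zip(range(57, 11, -3), frequencies)]

sum_fd, mean = short_method_mean(steps, assumed_mean=34)
```

The result agrees with the midpoint method: ∑fd = 12, the correction is 0.36, and the mean is 34.36.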
Scores F D fd
195-199 2 6 12
190-194 1 5 5
54
185-189 3 4 12
180-184 3 3 9
175-179 4 2 8
170-174 7 1 7
165-169 10 0 AM = 167
160-164 8 -1 -8
155-159 6 -2 -12
150-154 3 -3 -9
145-149 2 -4 -8
140-144 2 -5 -10
135-139 1 -6 -6
N = 52 ∑fd = 0
$$M = AM + \left(\frac{\sum fd}{N}\right) i = 167 + \left(\frac{0}{52}\right) 5 = 167 + 0 = 167$$
Scores F D fd
57-59 2 8 16
54-56 0 7 0
51-53 3 6 18
48-50 3 5 15
45-47 6 4 24
42-44 6 3 18
39-41 14 2 28
36-38 14 1 14 133
33-35 14 0 AM = 34
30-32 10 -1 -10
27-29 7 -2 -14
56
24-26 7 -3 -21
21-23 3 -4 -12
18-20 5 -5 -25
15-17 3 -6 -18
N = 100 ∑fd = 12
$$M = AM + \left(\frac{\sum fd}{N}\right) i = 34 + \left(\frac{12}{100}\right) 3 = 34 + 0.36 = 34.36$$
Where:
AM = assumed mean
∑ = sum of
i = size of class interval
THE MEDIAN
The median is that point on the scale above and below which lie 50% of
the cases. It is a point-measure, dividing a group into two equal sub-groups.
Hence, if the group is to be sectioned in two based on achievement or ability, the
point of division is the median.
The median is the most stable measure of central tendency. It is not much
affected by extremely low or high scores. Hence, if there are extremely low or high scores
and it is desired that these scores do not affect the average disproportionately,
the median is used. Again, if there are relatively few cases, the median is
computed.
The value of the median depends on the number of scores, and not on
the magnitude of the scores. If most of the scores are high, the median is high; if most of the
scores are low, the median is low.
a. When N is odd: Arrange the scores from the highest to the lowest,
or vice versa. The middlemost score is the counting median or
midscore.
b. When N is even: Arrange the scores from the highest to the lowest,
or vice versa. The average of the two middlemost scores is the
median.
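The two cases can be sketched as follows; the function name `counting_median` and the sample scores are hypothetical.

```python
def counting_median(scores):
    """Middlemost score when N is odd; average of the two middlemost when even."""
    ordered = sorted(scores, reverse=True)  # highest to lowest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

# Hypothetical scores, chosen only for illustration.
odd_case = counting_median([20, 12, 15])
even_case = counting_median([34, 36, 35, 35])
```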
58
Ctgn. Mdn.
198 173 167 158
59
176 167 Ctgn. Mdn.
159 136
PROCEDURE :
Arrange the scores from the highest to the lowest and get the
middle most scores for odd numbers; take the average of the
59 40 35 29
57 40 34 27
52 40 34 27
51 40 34 27
51 40 34 26
50 40 34 26
49 40 34 26
60
48 39 34 25
46 39 34 25
46 38 33 24
46 38 33 24
45 38 33 23
45 38 32 22
45 38 32 21
44 38 32 20
43 38 32 20
43 37 31 19
43 37 31 18
42 37 31 18
42 37 31 16
41 36 30 16
41 36 30 15
41 36 29 14
41 35 29 14
41 35 29 12
$$\text{Ctgn. Mdn.} = \frac{35 + 35}{2} = \frac{70}{2} = 35$$
Scores    Freq.   Cumulative Freq.
195-199   2
190-194   1
185-189   3
180-184   3
175-179   4
170-174   7
165-169   10 (fm)
160-164   8       22 (F)
155-159   6       14
150-154   3       8
145-149   2       5
140-144   2       3
135-139   1       1
N = 52

N/2 = 52/2 = 26
F = 22
fm = 10
l = 164.5
i = 5

Formula:
$$Mdn = l + \left(\frac{N/2 - F}{f_m}\right) i$$
$$Mdn = 164.5 + \left(\frac{26 - 22}{10}\right) 5 = 164.5 + \frac{20}{10} = 164.5 + 2 = 166.5$$
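Where:
l = the exact lower limit of the interval in which the median falls
F = the cumulative sum of all frequencies below that interval
fm = the frequency of the interval in which the median falls
i = size of the class interval
N = number of cases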
Scores    Freq.   Cumulative Freq.
57-59     2
54-56     0
51-53     3
48-50     3
45-47     6
42-44     6
39-41     14
36-38     14
33-35     14 (fm)
30-32     10      38 (F)
27-29     7       28
24-26     7       21
21-23     3       14
18-20     5       11
15-17     3       6
12-14     3       3
N = 100

N/2 = 100/2 = 50
F = 38
fm = 14
l = 32.5
i = 3

Formula:
$$Mdn = l + \left(\frac{N/2 - F}{f_m}\right) i$$
$$Mdn = 32.5 + \left(\frac{50 - 38}{14}\right) 3 = 32.5 + \frac{36}{14} = 32.5 + 2.57 = 35.07$$
Where:
THE MODE
The mode is the most frequent score in a series, or the score that occurs the greatest
number of times in a grouped distribution. It is otherwise known as the “commercial average”
or typical value, as in the mode or fashion in dresses worn by the “average” woman.
2. When the data are grouped, the “crude” mode is the midpoint of the
step with the greatest frequency.
3. The formula for approximating the true mode, when the frequency
distribution is symmetrical, or at least not badly skewed, is:
Mo = ( 3 x Mdn ) – ( 2 x M )
The Mode
Exercise No.10
A. Crude Mode:
1. Ungrouped Scores:
34 appears the greatest number of times
2. Grouped Scores:
The step(s) has the greatest number of tallies.
165-169 = 10
160-164 = 8
155-159 = 6
3. Frequency Distribution:
What are the midpoints of steps having the greatest frequency? What kind of
distribution?
39-41 = 40
36-38 = 37
33-35 = 34
1. Pearson
Mo = M – 3 ( M – Mdn )
= 167 – 3 ( 167 – 166.5 )
= 167 – 1.5
= 165.5
Mo = ( 3 x Mdn ) – ( 2 x M )
= ( 3 x 166.5 ) – ( 2 x 167 )
= 499.5 – 334
= 165.5
Note (Relationships) :
The Mode
Exercise No.10
A. Crude Mode:
1. Ungrouped Scores:
34 appears the greatest number of times
2. Grouped Scores:
The step(s) has the greatest number of tallies.
39-41 = 14
36-38 = 14
33-35 = 14
3. Frequency Distribution:
What are the midpoints of steps having the greatest frequency? What kind of
distribution?
39-41 = 40
36-38 = 37
33-35 = 34
1. Pearson
Mo = M – 3 ( M – Mdn )
= 34.36 – 3 ( 34.36 – 35.07 )
= 34.36 + 2.13
= 36.49
Mo = ( 3 x Mdn ) – ( 2 x M )
= ( 3 x 35.07 ) – ( 2 x 34.36 )
= 105.21 – 68.72
= 36.49
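Both kinds of mode in this exercise can be sketched in Python. The function names are hypothetical, and the list passed to `crude_mode` is made up for illustration; `approximate_mode` applies the relation Mo = (3 x Mdn) – (2 x M) used in the computations above.

```python
from collections import Counter

def crude_mode(scores):
    """Score(s) appearing the greatest number of times in ungrouped data."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(score for score, f in counts.items() if f == highest)

def approximate_mode(mean, median):
    """Approximation of the true mode from the mean and the median."""
    return 3 * median - 2 * mean

modes = crude_mode([34, 40, 34, 38, 40, 34])  # hypothetical scores
mode_52 = approximate_mode(mean=167, median=166.5)
mode_100 = approximate_mode(mean=34.36, median=35.07)
```

A list is returned by `crude_mode` because a distribution may have more than one most frequent score.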
MEASURES OF VARIABILITY
Calculating the range:
The range is the interval between the highest and the lowest scores. It is
the most general measure of spread or scatter, and is computed when we wish
to make a rough comparison of the variability of two or more groups. The range
takes account of only the extremes of the series of scores and is unreliable when N
is small, or when there are large gaps (i.e., zero f’s) in the frequency distribution.
In a frequency distribution, the range is taken to be equal to the difference between
the midpoint of the highest step and the midpoint of the lowest step.
Computing the Quartile Deviation (Q):
The quartile deviation or Q is one-half the scale distance between the 75th
and the 25th percentiles in a frequency distribution. The 25th percentile or Q1 is the
first quartile on the score scale, the point below which lie 25% of the scores. The 75th
percentile or Q3 is the third quartile on the score scale, the point below which lie 75%
of the scores. When we have these two points, Q is found from the formula
$$Q = \frac{Q_3 - Q_1}{2}$$
$$Q_1 = l + \left(\frac{N/4 - F}{f_q}\right) i \qquad \text{and} \qquad Q_3 = l + \left(\frac{3N/4 - F}{f_q}\right) i$$
Where:
l = the exact lower limit of the interval in which the quartile falls
F = the cumulative sum of all frequencies from the lowest step upward which
approaches or equals (but does not exceed) N/4 for Q1 or 3N/4 for Q3
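The two quartile formulas share the same pattern, so a single helper can locate either point. The sketch below is an illustration under that reading; the names `point_below` and `quartile_deviation` are hypothetical, and the data are the 100-score distribution from the exercises, listed from the lowest step upward.

```python
def point_below(steps, target):
    """Score-scale point below which `target` cases lie.

    steps: (lower, upper, f) tuples, lowest interval first. F is the largest
    cumulative sum that does not exceed `target`.
    """
    cumulative = 0
    for lower, upper, f in steps:
        if f > 0 and cumulative + f > target:
            i = upper - lower + 1
            exact_lower = lower - 0.5
            return exact_lower + (target - cumulative) / f * i
        cumulative += f
    return steps[-1][1] + 0.5  # target equals N: exact upper limit of last step

def quartile_deviation(steps):
    n = sum(f for _, _, f in steps)
    q1 = point_below(steps, n / 4)
    q3 = point_below(steps, 3 * n / 4)
    return q1, q3, (q3 - q1) / 2

# Steps 12-14 up to 57-59 (i = 3) with their frequencies.
frequencies = [3, 3, 5, 3, 7, 7, 10, 14, 14, 14, 6, 6, 3, 3, 0, 2]
steps = [(lower, lower + 2, f) for lower, f in zip(range(12, 60, 3), frequencies)]

q1, q3, q = quartile_deviation(steps)
```

For this distribution the sketch gives Q1 of about 28.21 and Q3 of about 40.43, so Q is about 6.11.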
195-199 2
190-194 1 Q3 – Q1
Q =
185-189 3 2
180-184 3
175-179 174.5 4 fq
170-174 7 39 F N
- F
165-169 10 32 4 i
Q1 = l+
160-164 8 22 fq
155-159 154.5 6 fq 14
150-154 3 8 F 8
145-149 2 5 5 3N
- F
140-144 2 3 3 Q3 = l+ 4 i
135-139 1 1 1 fq
N= 52
$$Q_1 = l + \left(\frac{N/4 - F}{f_q}\right) i = 154.5 + \left(\frac{13 - 8}{6}\right) 5 = 158.67$$
$$Q_3 = l + \left(\frac{3N/4 - F}{f_q}\right) i = 174.5 + \left(\frac{39 - 39}{4}\right) 5 = 174.5$$
$$Q = \frac{Q_3 - Q_1}{2} = 7.91$$ (Homogeneous)
Note: If Q is 10 or more, the group is heterogeneous.
57-59 2
54-56 0 Q3 – Q1
Q =
51-53 3 2
72
48-50 3
45-47 6
42-44 6 N
39-41 14 f 4 - F
i
q Q1 = l+
36-38 14 66 F fq
33-35 14 52
30-32 10 38
27-29 7 f 28 3N
q - F
i
24-26 7 21 F 21 Q1 = l+ 4
21-23 3 14 14 fq
18-20 5 11 11
15-17 3 6 6
12-14 3 3 3
N = 100
$$Q_1 = l + \left(\frac{N/4 - F}{f_q}\right) i = 26.5 + \left(\frac{25 - 21}{7}\right) 3 = 28.214$$
$$Q_3 = l + \left(\frac{3N/4 - F}{f_q}\right) i = 38.5 + \left(\frac{75 - 66}{14}\right) 3 = 40.429$$
$$Q = \frac{Q_3 - Q_1}{2} = 6.108 = 6.11$$ (Homogeneous)
Note: If Q is 10 or more, the group is heterogeneous.
$$AD = \frac{\sum |x|}{N}$$
in which the bars | | enclosing the x indicate that signs are disregarded in
arriving at the sum. As always, x is a deviation of a score from the mean,
i.e., x = X – M.
1. Find the arithmetic mean by the long method.
2. Subtract the mean from every score to get the deviation (x).
4. Divide the sum of the deviations (∑|x|) by the number of cases (N).
2. Subtract the mean from the midpoint of every step to find the deviation
(x = Midpoint – Mean).
5. Divide the arithmetic sum of the products (∑|fx|) by the number of cases (N). The
quotient is the AD.
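For grouped data, the computation can be sketched as follows. The function name `average_deviation` is hypothetical; the data are the 52-score distribution from the exercises (steps 195-199 down to 135-139, i = 5).

```python
def average_deviation(steps):
    """steps: (lower, upper, f) tuples. Returns (mean, sum of |fx|, AD)."""
    n = sum(f for _, _, f in steps)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in steps]
    mean = sum(f * x for (_, _, f), x in zip(steps, midpoints)) / n
    # Deviations are taken from the mean and their signs are disregarded.
    total = sum(f * abs(x - mean) for (_, _, f), x in zip(steps, midpoints))
    return mean, total, total / n

frequencies = [2, 1, 3, 3, 4, 7, 10, 8, 6, 3, 2, 2, 1]
steps = [(lower, lower + 4, f) for lower, f in zip(range(195, 130, -5), frequencies)]

mean, total_abs_fx, ad = average_deviation(steps)
```

This reproduces the worked figures for the 52 scores: M = 167, ∑|fx| = 530 and AD of about 10.19.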
195-199 2 197 30 60
190-194 1 192 25 25
185-189 3 187 20 60
180-184 3 182 15 45
76
175-179 4 177 10 40
170-174 7 172 5 35
165-169 10 167 0 0
160-164 8 162 -5 40
N = 52 /fx/ = 530
$$\text{Midpt.} = ll + \frac{ul - ll}{2} = 195 + \frac{199 - 195}{2} = 195 + \frac{4}{2} = 195 + 2 = 197$$
$$AD = \frac{\sum |fx|}{N} = \frac{530}{52} = 10.193 = 10.19$$
If AD is 12 or more, the group is heterogeneous.
54-56 0 55 20.64 0
78
21-23 3 22 -12.36 37.08
N = 100 ∑|fx| = 763.44
$$\text{Midpt.} = ll + \frac{ul - ll}{2} = 57 + \frac{59 - 57}{2} = 57 + \frac{2}{2} = 57 + 1 = 58$$
$$AD = \frac{\sum |fx|}{N} = \frac{763.44}{100} = 7.6344 = 7.63$$ (Homogeneous)
of signs by squaring the separate deviations. The squared deviation used in
computing the SD is always taken from the mean, never from the median or
mode. The conventional symbol for SD is the Greek letter sigma (σ).
M + 0.5 SD to M + 1.5 SD B or 2
M – 0.5 SD to M + 0.5 SD C or 3
M – 1.5 SD to M – 0.5 SD D or 4
The SD is also used in the comparison of groups. The higher the SD, the
more heterogeneous is the group. The smaller is the SD, the more homogeneous
is the group.
Formula :
$$SD = \sqrt{\frac{\sum fd^2}{N}}$$
Where:
d = Midpt. – M
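A minimal sketch of this formula, with d taken as midpoint minus mean. The function name `standard_deviation` is hypothetical; the data are the 52-score distribution from the exercises.

```python
import math

def standard_deviation(steps):
    """steps: (lower, upper, f) tuples. Returns (sum of f*d^2, SD)."""
    n = sum(f for _, _, f in steps)
    midpoints = [(lower + upper) / 2 for lower, upper, _ in steps]
    mean = sum(f * x for (_, _, f), x in zip(steps, midpoints)) / n
    # d = Midpt. - M for every step; each squared deviation is weighted by f.
    sum_fd2 = sum(f * (x - mean) ** 2 for (_, _, f), x in zip(steps, midpoints))
    return sum_fd2, math.sqrt(sum_fd2 / n)

frequencies = [2, 1, 3, 3, 4, 7, 10, 8, 6, 3, 2, 2, 1]
steps = [(lower, lower + 4, f) for lower, f in zip(range(195, 130, -5), frequencies)]

sum_fd2, sd = standard_deviation(steps)
```

For the 52 scores this gives ∑fd² = 9300 and an SD of about 13.37.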
[Long-hand square root: √178.846153 = 13.373. Re-check: 13.373 x 13.373 = 178.837129, remainder 0.009024.]
82
185-189 3 187 20 60 1200
165-169 10 167 0 0 0
∑fd² = 9300
$$SD = \sqrt{\frac{\sum fd^2}{N}} = \sqrt{\frac{9300}{52}} = \sqrt{178.846153} = 13.37$$
83
Interpretation :
54-56 0 55 20.64 0 0
84
27-29 7 28 -6.36 44.52 283.1472
∑fd² = 9527.04
$$SD = \sqrt{\frac{\sum fd^2}{N}} = \sqrt{\frac{9527.04}{100}} = \sqrt{95.2704} = 9.76$$
Interpretation :
85
Calculating the SD from the Ungrouped Scores:
$$SD = \sqrt{\frac{\sum fd^2}{N}}, \qquad fd^2 = d \times fd$$
$$SD = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$
2. From this lay off (d), positive deviation (1, 2, 3, etc.) upward and
negative deviations (-1, -2, -3, etc.) downward.
165-169 10 167 0 0 0
88
$$SD = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = \sqrt{\frac{9300}{52} - \left(\frac{530}{52}\right)^2} = \sqrt{74.96302} = 8.66$$ (Homogeneous)
Interpretation :
Scores Freq. Midpt. D fd fd2
54-56 0 55 20.64 0 0
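$$SD = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = \sqrt{\frac{9527.04}{100} - \left(\frac{763.44}{100}\right)^2} = \sqrt{95.2704 - 58.28406} = 6.08 \quad \text{Homogeneous}$$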
Interpretation :
1. Find the cumulative frequency of every step beginning with the lowest step
by adding the f’s cumulatively upward. (The last cumulative f is equal to
N).
4. Read off the steps, together with the corresponding cumulative
frequencies and place points through the exact upper limits of the steps.
5. Connect the successive points by straight lines, and at the lower end drop
a line to the exact lower limit of the lowest step (also the exact upper limit
of the step next below the lowest, the f which is O).
CALCULATION OF PERCENTILES IN A FREQUENCY
DISTRIBUTION
$$P_p = l + \left(\frac{pN - F}{f_p}\right) i$$
Where:
Procedure:
3. Subtract the partial sum (F) from the corresponding percentile sum (pN)
and divide the difference by the frequency (fp) of the step containing
the percentile desired. Multiply the quotient by the size of the class
interval (i) to get the correction.
4. Add the correction to the exact lower limit (l) of the step containing the
percentile desired. The result is the percentile.
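The procedure above can be sketched in Python as follows. The function name `percentile` and the small frequency distribution are hypothetical and serve only to illustrate the formula; they are not data from this workbook.

```python
def percentile(p, lower_limits, freqs, i):
    """Return P_p = l + ((pN - F) / f_p) * i for a grouped distribution.

    lower_limits: exact lower limits of the steps, lowest step first
    freqs: frequency of each step, in the same order
    i: size of the class interval
    p: proportion between 0 and 1 (0.50 for P50)
    """
    n = sum(freqs)
    target = p * n   # the percentile sum pN
    cum = 0          # partial sum F below the current step
    for l, f in zip(lower_limits, freqs):
        if f > 0 and cum + f >= target:
            return l + (target - cum) / f * i
        cum += f
    raise ValueError("p must be between 0 and 1")

# Hypothetical distribution: steps 10-14, 15-19, 20-24, 25-29
limits = [9.5, 14.5, 19.5, 24.5]
freqs = [2, 5, 8, 5]
p50 = percentile(0.50, limits, freqs, 5)
p0 = percentile(0.0, limits, freqs, 5)
p100 = percentile(1.0, limits, freqs, 5)
```

With this sketch, P0 and P100 fall on the exact lower limit of the first step and the exact upper limit of the last step.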
P0 and P100 mark the exact lower limit of the first interval and the exact upper
limit of the last interval, respectively. These two percentiles represent limiting
points. Their principal value is to indicate the boundaries of the percentile scale.
Percentiles can be used as grades, and are more reliable and comparable
than grades, letters, or numbers. Under a strict or a lenient teacher, percentiles
which are numerically equal have the same meaning. A grade P35 given by a
strict teacher is equal to a grade of P35 given by a lenient teacher.
CALCULATION OF PERCENTILE RANKS IN A FREQUENCY
DISTRIBUTION
N
Where:
3. Multiply the difference of the exact lower limit (l) and the given score by
the quotient obtained in step 2.
THE CUMULATIVE FREQUENCY CURVE OR OGIVE
2. Plot the ogive from the data in column 4, that is the cumulative
percent f (column).
score from the ogive, reverse the process in determining percentiles.
Percentiles and percentile ranks will often be slightly in error when read
from the ogive, but this can be made very small when the curve is
carefully drawn, the scale division precisely marked, and the diagram
fairly large.
MEASURING DIVERGENCE FROM NORMALITY
To find the divergence of the actual distribution (represented by the
histogram) from the best fitting normal curve that has been superimposed,
both the skewness and the kurtosis should be computed.
PLOTTING THE BEST FITTING NORMAL CURVE
$$y = \frac{N}{\sigma\sqrt{2\pi}}$$
Where:
$\sigma = \dfrac{SD}{i}$
$\sqrt{2\pi} = \sqrt{2(3.1416)} = 2.51$ (constant)
For example, if N = 52, SD = 13.37, M = 167, therefore
$$y = \frac{52}{2.67 \times 2.51} = \frac{52}{6.7} = 7.8$$
Note:
Round off your answer to one decimal for convenience in plotting the
normal curve.
± 1σ = 0.60653 x 7.8 = 4.7
± 2σ = 0.13534 x 7.8 = 1.1
± 3σ = 0.01111 x 7.8 = 0.09 or 0.1
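As a rough check of these ordinates, the computation can be sketched in Python. The class interval i = 5 is an assumption inferred from 13.37 / 2.67; the multipliers 0.60653, 0.13534 and 0.01111 correspond to exp(-k²/2) for k = 1, 2, 3. The function names are illustrative.

```python
import math

def central_ordinate(n, sd, i):
    """Height of the normal curve at the mean: N / ((SD / i) * sqrt(2 * pi))."""
    return n / ((sd / i) * math.sqrt(2 * math.pi))

def ordinate_at(k, y0):
    """Height at k standard deviations from the mean: y0 * exp(-k^2 / 2)."""
    return y0 * math.exp(-k * k / 2)

# N = 52 and SD = 13.37 come from the example; i = 5 is an assumption
y0 = round(central_ordinate(52, 13.37, 5), 1)
heights = [round(ordinate_at(k, y0), 2) for k in (1, 2, 3)]
print(y0, heights)
```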
THE NORMAL PROBABILITY CURVE – ITS NATURE AND IMPORTANCE
An unsymmetrical distribution is called a skewed distribution. In
a skewed distribution, the mean, median, and the mode fall at
different points, in the distribution. Skewness is computed by the
formula:
When the mean is smaller than the median, the distribution is skewed
negatively or to the left, that is, the scores are massed at the high (right)
end of the scale and spread out gradually at the lower end. When the
mean is greater than the median, the distribution is positively skewed, or to
the right, that is, the scores are massed at the low (left) end of the scale and
are spread out gradually at the high (right) end.
The results of a test that is easy are negatively skewed and those of a
test that is difficult are positively skewed. The results of a test that is of moderate
ease or difficulty approach a normal curve.
Note:
The normal distribution is not an actual distribution of test scores but is,
instead, a mathematical model. Frequency distributions of scores approach the
theoretical distribution as a limit, but the fit is rarely perfect.
Principle:
Much evidence has accumulated to show that the normal distribution
serves to describe the frequency of occurrence of many variable facts with a
relatively high degree of accuracy. Phenomena which follow the normal
probability curve (at least approximately) may be classified as:
APPLICATIONS OF THE NORMAL PROBABILITY CURVE
Solution:
a. Score 25 – Mean (20) = 5 and score 15 – Mean (20) = -5. Divide each
difference by the SD (5). The quotients are 1SD and -1SD, respectively.
Score 25 is 1SD above the mean and score 15 is 1SD below the mean.
From Table A, 1SD includes 34.13% of the cases above the mean and
-1SD includes 34.13% of the cases below the mean. Add 34.13% and
34.13%. The sum, 68.26%, represents the cases that fall between 15 and
25.
b. Score 30 is 10 points or 2SD above the mean. From the table, 47.72%
of the cases fall between the mean and 2SD. Accordingly, 2.28%
(50% - 47.72%) of the cases lie above 30.
c. Score 12 is 8 points or -1.6SD from the mean. Between the mean and
-1.6SD are 44.52% of the cases. Hence, 50% - 44.52% or 5.48% of the cases
lie below 12.
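The table look-ups in this solution can be reproduced with a short Python sketch using the error function from the standard library. The helper names are illustrative; the mean (20) and SD (5) are those of the problem.

```python
import math

def area_mean_to_z(z):
    """Proportion of cases between the mean and z SD (the Table A entry)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

MEAN, SD = 20, 5

def z_score(score):
    """Distance of a score from the mean in SD units."""
    return (score - MEAN) / SD

between_15_25 = area_mean_to_z(z_score(15)) + area_mean_to_z(z_score(25))
above_30 = 0.5 - area_mean_to_z(z_score(30))
below_12 = 0.5 - area_mean_to_z(z_score(12))
print(between_15_25, above_30, below_12)
```

The unrounded area between 15 and 25 is about 68.27%; the 68.26% in the solution comes from adding two table entries that were each rounded to 34.13%.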
Problem:
Solution:
The middle 65% of the cases include 32.5% above and 32.5%
below the mean. From Table A, 32.5% of the distribution is very close to
32.38%, or .93SD. The middle 65% of the cases, therefore, lie between the
mean and ± .93 SD or, since SD equals 5, between the mean and ± 4.65
points. Adding 4.65 to the mean (20) gives 24.65 and subtracting 4.65 from
the mean gives 15.35. Therefore the middle 65% of the cases lie between
15.35 and 24.65.
Problem:
Given a test question or problem solved by 10% of the group; a
second problem solved by 20% of the group and a third, solved by 40% of
the group. What is the relative difficulty of questions 1, 2, and 3?
Solution: Question 1 is passed by 10% and is failed by 90%. The highest
10% of the group has 40% of the cases between its lower limit and the
mean (50% - 10% = 40%). From the table 39.97% or 40% fall between
1.28SD and the mean. Accordingly, 1.28SD is the difficulty value of
question 1.
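Following the same procedure, question 2, passed by 20% of the group,
falls at a point in the distribution 30% above the mean (50% - 20% = 30%).
From the table, 29.95% or 30% of the group falls between the mean
and .84SD. Therefore question 2 has a difficulty value of .84SD.
Question 3, passed by 40% of the group, falls at a point 10% above the
mean (50% - 40% = 10%). From the table, 9.87% or 10% of the group falls
between the mean and .25SD. Therefore question 3 has a difficulty value of .25SD.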
The SD gives the real index of difficulty of test questions, and not
the percent of passing or failing.
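The difficulty values can also be obtained without the table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `difficulty_value` is illustrative.

```python
from statistics import NormalDist

def difficulty_value(proportion_passing):
    """SD distance above the mean of the point that cuts off the passing group."""
    return NormalDist().inv_cdf(1 - proportion_passing)

# Questions solved by 10%, 20% and 40% of the group
values = [round(difficulty_value(p), 2) for p in (0.10, 0.20, 0.40)]
print(values)
```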
X .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 0000 0040 0080 0120 0160 0199 0239 0279 0319 0359
0.1 0398 0438 0478 0517 0557 0596 0636 0675 0714 0753
0.2 0793 0832 0871 0910 0948 0987 1026 1064 1103 1141
0.3 1179 1217 1255 1293 1331 1368 1406 1443 1480 1517
0.4 1554 1591 1628 1664 1700 1736 1772 1808 1844 1879
0.5 1915 1950 1985 2019 2054 2088 2123 2157 2190 2224
0.6 2257 2291 2324 2357 2389 2422 2454 2486 2517 2549
0.7 2580 2611 2642 2673 2704 2734 2764 2794 2823 2852
0.8 2881 2910 2939 2967 2995 3023 3051 3078 3106 3133
0.9 3159 3186 3212 3238 3264 3289 3315 3340 3365 3389
1.0 3413 3438 3461 3485 3508 3531 3554 3577 3599 3621
1.1 3643 3665 3686 3708 3729 3749 3770 3790 3810 3830
1.2 3849 3869 3888 3907 3925 3944 3962 3980 3997 4015
1.3 4032 4049 4066 4082 4099 4115 4131 4147 4162 4177
1.4 4192 4207 4222 4236 4251 4265 4279 4292 4306 4319
1.5 4332 4345 4357 4370 4382 4394 4406 4418 4429 4441
1.6 4452 4463 4474 4484 4495 4505 4515 4525 4535 4545
1.7 4554 4564 4573 4582 4591 4599 4608 4616 4625 4633
1.8 4641 4649 4656 4664 4671 4678 4686 4693 4699 4706
1.9 4713 4719 4726 4732 4738 4744 4750 4756 4761 4767
2.0 4772 4778 4783 4788 4793 4798 4803 4808 4812 4817
2.1 4821 4826 4830 4834 4838 4842 4846 4850 4854 4857
2.2 4861 4864 4868 4871 4875 4878 4881 4884 4887 4890
2.3 4893 4896 4898 4901 4904 4906 4909 4911 4913 4916
2.4 4918 4920 4922 4925 4927 4929 4931 4932 4934 4936
2.5 4938 4940 4941 4943 4945 4946 4948 4949 4951 4952
2.6 4953 4955 4956 4957 4959 4960 4961 4962 4963 4964
2.7 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974
2.8 4974 4975 4976 4977 4977 4978 4979 4979 4980 4981
2.9 4981 4982 4982 4983 4984 4984 4985 4985 4986 4986
3.0 4986.5 4986.9 4987.4 4987.8 4988.2 4988.6 4988.9 4989.3 4989.7 4990.0
3.1 4990.3 4990.6 4991.0 4991.3 4991.6 4991.8 4992.1 4992.4 4992.6 4992.9
3.2 4993.129
3.3 4995.166
3.4 4996.631
3.5 4997.674
3.6 4998.409
3.7 4998.922
3.8 4999.277
3.9 4999.519
4.0 4999.683
4.5 4999.966
5.0 4999.997133
COEFFICIENT OF CORRELATION
The coefficient of correlation is also used for prediction. A pupil takes the
test in English literature but is absent from the test in American literature. With the
reliability of each of these tests and the relationship between them known, it is
possible to predict the probable score of the pupil in the test in American literature by
substituting the values in the regression equation, which is not within the scope
of this book.
Procedure:
1. Rank the scores in each test and enter the corresponding ranks in
columns “Rank X” and “Rank Y”.
2. Find the difference in ranks by subtracting algebraically the ranks in one
test from the ranks in the other set and write difference in column D.
3. Square each of the differences in rank and enter it in column D².
4. Take the sum of the squares of the differences in rank (∑D²) and the
number of pairs (N); substitute their values in the formula.
$$\rho\ (\text{rho}) = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$
Table of Interpretation
Any coefficient of correlation that is not zero and that is also statistically
significant denotes some degree of relationship between two variables.
$$\rho = 1 - \frac{6(112)}{15(15^2 - 1)} = 1 - \frac{672}{3360} = 1 - 0.20 = 0.80$$
ρ = 0.80 High Positive Correlation Indicating Marked Relationship.
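A minimal Python sketch of the rank-difference procedure follows. The function names are illustrative, and giving tied scores the mean of the ranks they occupy is an assumption, since the procedure above does not state how ties are handled.

```python
def rank_desc(scores):
    """Rank scores from highest (rank 1) downward; ties share the mean rank."""
    ordered = sorted(scores, reverse=True)
    return [(2 * ordered.index(s) + ordered.count(s) + 1) / 2 for s in scores]

def rho_from_sum(sum_d2, n):
    """Spearman rho from the sum of squared rank differences and N pairs."""
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

def spearman_rho(x, y):
    """Spearman rho computed directly from two lists of paired scores."""
    rx, ry = rank_desc(x), rank_desc(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return rho_from_sum(sum_d2, len(x))

# Values from the worked example: sum of D^2 = 112, N = 15
rho = rho_from_sum(112, 15)
print(round(rho, 2))
```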
Procedure:
1. Rank the scores in each test and enter the corresponding ranks in
columns “Rank X” and “Rank Y”.
2. Subtract the ranks in one test algebraically from the ranks in the
other test, but enter only the positive difference as are the gains in
ranks of one test over the other (Col. G).
3. Find the sum of the gains in ranks (∑G) and the number of pairs (N).
$$R = 1 - \frac{102}{225 - 1} = 1 - \frac{102}{224} = 1 - 0.46 = 0.54$$
R = 0.54 Moderate Correlation Indicating Substantial Relationship
THE PRODUCT MOMENT COEFFICIENT OF CORRELATION
Procedure:
1. Compute the arithmetic mean of each test by the long method.
2. Subtract algebraically the mean of one test from every score in that test to
get the corresponding deviations x and y.
3. Square the deviations of each test and sum them up. (Cols. x² and y²)
5. Divide the sum of the squared deviations (x² and y²) of each test by N.
Extract the square roots of the quotients to get the sigmas.
6. Divide the sum of the product by the products of the sigmas of the two
tests, or divide the algebraic sum of the products of the deviations by the
square root of the products of the sums of the squared deviations in the
two tests. The result is the Pearson coefficient of correlation.
$$R = \frac{\dfrac{\sum xy}{N}}{\sqrt{\dfrac{\sum x^2}{N} \times \dfrac{\sum y^2}{N}}}$$
N = 15   ∑X = 157   ∑Y = 140   ∑xy = 58.65   ∑x² = 77.75   ∑y² = 73.35
$$M_X = \frac{\sum X}{N} = \frac{157}{15} = 10.5 \qquad M_Y = \frac{\sum Y}{N} = \frac{140}{15} = 9.3$$
$$R = \frac{\dfrac{\sum xy}{N}}{\sqrt{\dfrac{\sum x^2}{N} \times \dfrac{\sum y^2}{N}}} = \frac{\dfrac{58.65}{15}}{\sqrt{\dfrac{77.75}{15} \times \dfrac{73.35}{15}}} = \frac{3.91}{\sqrt{5.18 \times 4.89}} = \frac{3.91}{\sqrt{25.3302}} = \frac{3.91}{5.032} = 0.777$$
R = 0.78
R = 0.8 High Positive Correlation Indicating Direct Marked Relationship.
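The substitution above can be checked with a small Python sketch. The function name `pearson_from_sums` is illustrative; the three sums and N are those of the worked example.

```python
import math

def pearson_from_sums(sum_xy, sum_x2, sum_y2, n):
    """Pearson r from the sums of deviation products and squared deviations."""
    return (sum_xy / n) / math.sqrt((sum_x2 / n) * (sum_y2 / n))

r = pearson_from_sums(58.65, 77.75, 73.35, 15)
print(round(r, 2))
```

Carrying full precision gives about 0.7766 rather than 0.777, because the hand computation rounds each quotient to two decimals; both round to 0.78.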
Procedure:
1. Determine the highest score and the lowest score.
2. Beginning from the highest score, write the number
consecutively down to the lowest score (top to bottom).
Transmutation of Raw Scores into Ratings by Long or Spread Method
Exercise 22a
HS= 59
LS= 12
Group A 5% =6 91-95
Group B 25% = 26 86-90
Group C 40% = 40 80-85
Group D 20% = 19 75-79
Group E 10% =9 70-74
TRANSMUTATION OF RAW SCORES INTO RATINGS BY SHORT CUT
METHOD
Procedure:
2. Determine the highest and lowest ratings for the period. Find their
difference.
6. Assign the rating. The lowest rating is assigned to the lowest score
or lowest step, as the case may be, the next consecutive rating to
the step next to the lowest score or step and so on or score as the
case may be.
Note:
Given:
N = 100
HS = 59
LS = 12
HR = 95
LR = 70
1. Determine the range.
HS =   59
LS = - 12
       47
2. Determine the range of ratings for the period.
HR =   95
LR = - 70
       25
47/25 = 1.88 or 2 = 1st i
25 - 2 = 23
23 = No. of steps with an i of 2
2 = 1st i; 2 + 1 = 3
3 = 2nd i
2 = No. of steps with an i of 3

Scores     Equivalent Rating
– 60       95
57 – 59    94    i = 3 (2 steps)
54 – 56    93
52 – 53    92    i = 2 (23 steps)
50 – 51    91
48 – 49    90
46 – 47    89
44 – 45    88
42 – 43    87
40 – 41    86
38 – 39    85
36 – 37    84
34 – 35    83
32 – 33    82
30 – 31    81
28 – 29    80
26 – 27    79
24 – 25    78
22 – 23    77
20 – 21    76
18 – 19    75
16 – 17    74
14 – 15    73
12 – 13    72
10 – 11    71
08 – 09    70
PART II
MODULES FOR
STATISTICAL
METHOD
Modules for Statistical Methods
EXERCISES
1. Recall two research studies you have read recently.
a. What would be the hypothesis you can generate from such studies:
1. Research hypothesis
2. Null hypothesis
3. Alternate hypothesis
b. Submit one good research study you would like to go into, what
would be your:
1. Research hypothesis
2. Null hypothesis
3. Alternate hypothesis
MODULE I
I. How to compute Chi-square (one sample): Testing the significant
difference between responses within a group.
How to compute Chi-Square (one sample):
Formula: $$\chi^2 = \sum \frac{(O - E)^2}{E}$$
χ² = Chi-Square
∑ = sum
O = Observed Frequencies
E = Expected Frequencies
II. Problems and Hypothesis (Null)
A. Null hypothesis (Ho): There will be no significant difference for each
of three kinds of responses.
B. Problems:
Thirty prospective teachers were asked their opinion about
the desirability of introducing technological innovations in the
classroom.
1. What is the teachers' opinion on technological
innovations in the classroom?
2. Test at the .05 level if a significant difference exists among the
teachers' responses.
III. Statistical Procedures:
Step 6: Consult the Chi-Square table with the corresponding degrees of freedom.
df = (R – 1) (C – 1)
df = (2 – 1) (3 – 1)
df = 2
Tabled value (Critical Value) at 2 df = 5.99
For five (5) levels of descriptive ratings, the following may be used:
4.21 – 5.00 strongly agree
3.41 – 4.20 agree
2.61 – 3.40 undecided
1.81 – 2.60 disagree
1.00 – 1.80 strongly disagree
EXERCISE:
A. Problem: A survey questionnaire was given to college teachers on the
issue, “It is best for students to be told what subjects to take rather
than have them choose for themselves.” Test at the .05 level the
significance of the responses of the respondents.
B. Data:
Strongly Agree – 325 x 5 = 1625
Agree – 584 x 4 = 2336
Uncertain – 189 x 3 = 567
Disagree – 286 x 2 = 572
Strongly Disagree – 82 x 1 = 82
Total – 1466 5182
Questions:
1. Is there a significant difference in responses among the college teachers?
Support your answer.
2. What is the level of attitude of respondents?
3. If you were the college president, would you allow a fixed curriculum for
students to follow strictly, based on the findings?
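For question 2, the weighted mean of the responses can be sketched in Python as below. The Disagree count (286) is an assumption: it is the value implied by the totals 1466 and 5182. The descriptive scale is the five-level scale given earlier in this module, and the names used are illustrative.

```python
weights = {"Strongly Agree": 5, "Agree": 4, "Uncertain": 3,
           "Disagree": 2, "Strongly Disagree": 1}
counts = {"Strongly Agree": 325, "Agree": 584, "Uncertain": 189,
          "Disagree": 286,  # assumed: implied by the totals 1466 and 5182
          "Strongly Disagree": 82}

total = sum(counts.values())
weighted_sum = sum(weights[k] * counts[k] for k in counts)
weighted_mean = weighted_sum / total

def describe(mean):
    """Map a weighted mean to the five-level descriptive rating."""
    scale = [(4.21, "strongly agree"), (3.41, "agree"), (2.61, "undecided"),
             (1.81, "disagree"), (1.00, "strongly disagree")]
    for lower, label in scale:
        if mean >= lower:
            return label
    raise ValueError("mean below 1.00")

print(round(weighted_mean, 2), describe(weighted_mean))
```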
Module 2
How to compute the chi-square (two or more samples): testing the significant
difference between two or more groups.
III. Statistical Procedure:
Step 1. Record observed frequencies as follows:
Teachers/ SA A E D SD TOTAL
Scales
TOTAL 73 36 82 46 30 267
Step 2. Compute the “e” of each cell. The “e” of each cell is
obtained by multiplying the marginal sums and dividing the product by the total number of
respondents (N).
cell (a) = (73 x 75)/267 = 20.51   cell (f) = (73 x 94)/267 = 25.70   cell (k) = (73 x 98)/267 = 26.79
Step 3. Substitute the “e” in the chi-square formula and perform
the indicated operations.
cell (a) = (15 - 20.51)²/20.51 = 1.48
cell (b) = (6 - 10.11)²/10.11 = 1.67
cell (c) = (25 - 23.03)²/23.03 = 0.17
cell (d) = (18 - 12.92)²/12.92 = 2.00
cell (e) = (11 - 8.43)²/8.43 = 0.78
cell (f) = (32 - 25.70)²/25.70 = 1.54
cell (g) = (12 - 12.67)²/12.67 = 0.04
cell (h) = (28 - 28.87)²/28.87 = 0.03
cell (i) = (13 - 16.19)²/16.19 = 0.63
cell (k) = (26 - 26.79)²/26.79 = 0.02
cell (l) = (18 - 13.21)²/13.21 = 1.74
cell (m) = (29 - 30.10)²/30.10 = 0.04
χ² = 1.48 + 1.67 + 0.17 + 2.00 + 0.78 + 1.54 + 0.04 + 0.03 + 0.63 + 0.23 + 0.02 + 1.74 + 0.04 + 0.21 + 0.09
χ² = 10.67
Findings:
χ² (10.67) is less than the critical value at the .05 level (15.51).
Interpretations/Implications:
The opinions of the groups of teachers are comparable. The
teaching of sex education in school is
agreeable to teachers.
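Steps 2 and 3 can be sketched in Python as follows. The observed table used here is an assumption: the cells for (a) to (i) and (k) to (m) are those shown in Step 3, while the three remaining cells (9, 15 and 10) are the values implied by the marginal totals. The function name `chi_square` is illustrative.

```python
def chi_square(table):
    """Chi-square for an R x C table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n  # the "e" of the cell
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: three groups of teachers; columns: SA, A, E, D, SD.
# The cells 9, 15 and 10 are inferred from the marginal totals (assumption).
observed = [
    [15, 6, 25, 18, 11],
    [32, 12, 28, 13, 9],
    [26, 18, 29, 15, 10],
]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)
```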
Module 3
How to Compute Coefficient of Contingency
of husbands.
Grad Work 4 9 38 54
College 20 31 55 99
High School 23 37 41 51
Elementary 11 10 11 19
Questions:
Formula: $$C = \sqrt{\frac{\chi^2}{N + \chi^2}}$$
Where:
C = coefficient of contingency
χ² = Chi-Square
N = Number of cases
II. Problem and Hypothesis: (NULL)
A. Null Hypothesis (Ho): There is no significant correlation between
marriage adjustment level and education of husbands.
B. Problem: Test at the .05 level of significance that there is no significant
correlation between marriage adjustment level and education of
husbands.
III. Statistical Procedure
Step 1.Record observed frequencies as follows:
Education of Husband: Marriage Adjustment Levels
513
Step 2. Compute the “e” of each cell by multiplying the marginal sums and
dividing this by the total number (N).
Cell (a) = (58 x 105)/513 = 11.87    Cell (e) = (58 x 205)/513 = 23.18    Cell (i) = (58 x 152)/513 = 17.19
Cell (b) = (87 x 105)/513 = 17.81    Cell (f) = (87 x 205)/513 = 34.77    Cell (j) = (87 x 152)/513 = 25.78
Cell (c) = (145 x 105)/513 = 29.68   Cell (g) = (145 x 205)/513 = 57.94   Cell (k) = (145 x 152)/513 = 42.96
Cell (d) = (223 x 105)/513 = 45.64   Cell (h) = (223 x 205)/513 = 89.11   Cell (l) = (223 x 152)/513 = 66.07
Step 3. Compute
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
Cell (a) = (4 - 11.87)²/11.87 = 5.22     Cell (i) = (23 - 17.19)²/17.19 = 1.96
Cell (b) = (9 - 17.81)²/17.81 = 4.36     Cell (j) = (37 - 25.78)²/25.78 = 4.88
Cell (c) = (38 - 29.68)²/29.68 = 2.33    Cell (k) = (41 - 42.96)²/42.96 = 0.09
Cell (d) = (54 - 45.64)²/45.64 = 1.53    Cell (l) = (51 - 66.07)²/66.07 = 3.44
Cell (e) = (20 - 23.18)²/23.18 = 0.44    Cell (m) = (11 - 5.77)²/5.77 = 4.74
Cell (f) = (31 - 34.77)²/34.77 = 0.41    Cell (n) = (10 - 8.65)²/8.65 = 0.21
Cell (g) = (55 - 57.94)²/57.94 = 0.15    Cell (o) = (11 - 14.42)²/14.42 = 0.81
Cell (h) = (99 - 89.11)²/89.11 = 1.10    Cell (p) = (19 - 22.17)²/22.17 = 0.45
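Step 4: $$C = \sqrt{\frac{\chi^2}{N + \chi^2}} = \sqrt{\frac{32.12}{513 + 32.12}} = 0.24$$
Step 5: $$\frac{C}{0.84} = \frac{0.24}{0.84} = 0.29$$
Step 6: $$t = C\sqrt{\frac{N - 2}{1.00 - C^2}} = 0.29\sqrt{\frac{513 - 2}{1.00 - (0.29)^2}} = 6.85$$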
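Steps 4 and 5 can be checked with a few lines of Python. The function name is illustrative; χ² = 32.12 and N = 513 are the values from the steps above, and 0.84 is the divisor used in Step 5.

```python
import math

def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (N + chi2))."""
    return math.sqrt(chi2 / (n + chi2))

c = round(contingency_coefficient(32.12, 513), 2)
c_corrected = round(c / 0.84, 2)  # Step 5 divisor taken from the text
print(c, c_corrected)
```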
df = 4
Tabled value at .05 (4 df) = 9.488
χ² = 1.99 < 9.488
Findings: χ² (1.99) is less than the critical value at the .05 level of significance
(9.488).
Decision: Accept the hypothesis of no divergence.
It is a normal distribution.
Interpretation/Implication:
Analysis of the data has shown no significant divergence in the 5 levels of
performance. In the NI level, the expected number of pupils (0.42) is close to
the observed number (1). The same is true of the MS, VS and O levels. However, there are
only four pupils who are satisfactory (S) out of the expected 5 pupils. The
distribution is normal.
Answer to Questions:
1. The graph is not normal at 3 levels of performance
2. The graph is normal at 5 levels of performance
Module 4
How to determine the profile of the Academic Performance of a group (Testing
the significant Divergence from the normal curve of distribution)
FORMULA:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
χ² = Chi-square
∑ = sum
O = observed frequency
E = expected frequency
The expected frequencies in the test for normality are those of the
distribution of the normal curve.
II. PROBLEM AND HYPOTHESIS (null)
The distribution of levels of performance in a mathematics test is not
divergent from the normal curve of distribution.
(Distribution is normal)
A. Problem: Grade IV pupils were given an achievement test in
Mathematics. Test at .05 level that distribution of performance is
not divergent from the normal curve of distribution.
Total
48
III. STATISTICAL PROCEDURE
Step 1. Record observed frequencies.
Perform indicated operations.
Frequencies            BA                  A                    AA                  TOTAL
Observed (O)           4                   26                   18                  48
Expected (E)           16% of 48 = 7.68    68% of 48 = 32.64    16% of 48 = 7.68
Step 2. O - E          -3.68               -6.64                10.32
Step 3. (O - E)²       13.54               44.09                106.50
Step 4. (O - E)²/E     1.76                1.35                 13.87
Expected “E” in 5 levels of academic performance uses the following proportions:
Outstanding – 3.5%
Very satisfactory – 24%
Satisfactory – 45%
Moderately satisfactory – 24%
Needs improvement – 3.5%
Step 5. Substitute the numbers in the chi-square formula and perform the
indicated operation.
χ² = 1.76 + 1.35 + 13.87 = 16.98
χ² = 16.98
Step 6. Match computed value with critical value.
Computed value: χ² = 16.98
df = (3 - 1) (2 - 1) = 2
Critical value at .05 (df 2) = 5.99
IV. FINDINGS: χ² (16.98) is greater than the critical value at the .05 level of
significance (5.99)
V. DECISION: Reject the hypothesis of no divergence
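The test for divergence from normality above can be sketched in Python. The function name `chi_square_fit` is illustrative; the observed counts (4, 26, 18) and the expected proportions (16%, 68%, 16%) are those of the worked example.

```python
def chi_square_fit(observed, proportions):
    """Chi-square of observed counts against expected proportions."""
    n = sum(observed)
    expected = [p * n for p in proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

# BA, A, AA levels with the normal-curve proportions 16%, 68%, 16%
chi2, expected = chi_square_fit([4, 26, 18], [0.16, 0.68, 0.16])
reject = chi2 > 5.99  # critical value at .05 with 2 df
print(round(chi2, 2), reject)
```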
VI. INTERPRETATIONS/IMPLICATIONS:
Analysis of the data has shown a significant divergence in the 3 levels of
performance. In the BA (Below Average) level there are only 4 out of the
expected 7 pupils (7.68). In the A (Average) level, there are only 26 out of the
expected 32 pupils (32.64). However, there are more pupils (18) than expected
(7.68) in the AA (Above Average) level. The distribution is skewed to
the left of the normal curve of distribution.
Therefore, the academic performance of the Grade IV pupils shows that we
have more bright pupils than poor ones.
EXERCISE :
I. PROBLEM : Give the profile of the academic performance of a group
of pupils in a Reading test getting the following scores.
MODULE 4a. How to convert scores into Grades: (Levels of Academic
Performance) A, B, C, D, F
I. Formula:
1.8 SD ± x (A and F)
.6 SD ± x (B, C, D)
III. Procedures:
For B, C, D
15.93 X .6 = 9.56 33.60 + 9. 56 = 43.16 (B)
TABLE A.
Example: When the df are 20 and the t is 2.09, the .05 level means that 5
times in 100 trials a divergence as large as or larger than that obtained (plus or
minus) may be expected under the null hypothesis.
df .05 .01
1 12.71 63.66
2 4.30 9.92
3 3.18 5.84
4 2.78 4.60
5 2.57 4.03
6 2.45 3.71
7 2.36 3.50
8 2.31 3.36
9 2.26 3.25
10 2.23 3.17
11 2.20 3.11
12 2.18 3.06
13 2.16 3.01
14 2.14 2.98
15 2.13 2.95
16 2.12 2.92
17 2.11 2.90
18 2.10 2.88
19 2.09 2.86
20 2.09 2.84
21 2.08 2.83
22 2.07 2.82
23 2.07 2.81
24 2.06 2.80
25 2.06 2.79
26 2.06 2.78
27 2.05 2.77
28 2.05 2.76
29 2.04 2.76
30 2.04 2.75
50 2.01 2.68
100 1.98 2.63
TABLE B.
df .05 .01
1 3.84 6.64
2 5.99 9.21
3 7.82 11.34
4 9.49 13.28
5 11.07 15.09
6 12.59 16.81
7 14.07 18.48
8 15.51 20.09
9 16.92 21.67
10 18.31 23.21
11 19.68 24.72
12 21.03 26.22
13 22.36 27.69
14 23.68 29.14
15 25.00 30.58
16 26.30 32.00
17 27.59 33.41
18 28.87 34.80
19 30.14 36.19
20 31.41 37.57
21 32.67 38.93
22 33.92 40.29
23 35.17 41.64
24 36.42 42.98
25 37.65 44.31
26 38.88 45.64
27 40.11 46.96
28 41.34 48.28
29 42.56 49.59
30 43.77 50.89
Module 5
How to compute the Product-Moment Correlation (r)
(Testing the significant correlation between two variables)
Formula: $$r = \frac{\sum (x - \bar{X})(y - \bar{Y})}{(N)(SD_x)(SD_y)}$$
A 50 60
B 60 80
C 70 90
D 80 70
E 90 100
STATISTICAL PROCEDURE:
Total
400
-100
Step 2: To obtain the sum of (x-X)(y-Y) multiply columns x-X and y-Y as indicated
above, then add these products:
SDx = 14.14 SDy = 14.14
$$r = \frac{700}{(5)(14.14)(14.14)} = 0.70$$
$$t = r\sqrt{\frac{n - 2}{1.00 - r^2}} = 0.70\sqrt{\frac{5 - 2}{1.00 - (0.70)^2}} = 1.70$$
df = N – 2 = 5 – 2 = 3
Critical value at .05 (df 3) = 3.18
IV. Findings:
t (1.70) is less than the critical value (3.18). There is no significant
correlation between test x and test y scores.
V. Decision:
Accept Ho.
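The whole procedure of this module, from raw scores to the t value, can be sketched in Python. The function names are illustrative. The y score of student E is taken as 100, an assumption based on the stated SDy = 14.14 and the product sum of 700.

```python
import math

def product_moment(x, y):
    """Product-moment r: sum of deviation products over N * SDx * SDy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sum_products = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sdx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sdy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return sum_products / (n * sdx * sdy)

def t_for_r(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with df = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r * r))

x = [50, 60, 70, 80, 90]
y = [60, 80, 90, 70, 100]  # last score assumed to be 100
r = product_moment(x, y)
t = t_for_r(r, len(x))
print(round(r, 2), round(t, 2))
```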
Student Test A Test B
A 22 35
B 16 25
C 25 30
D 9 15
E 10 20
3. Question:
Answer:
STATISTICAL PROCEDURE:
Total
56.0
43.0
D 9 16.4 -7.4 54.76 15 25 -10 100 74.0
Step 2: To obtain the sum of (x-X)(y-Y) multiply columns x-X and y-Y as
indicated above, then add these products:
$$r = \frac{205}{(5)(6.34)(7.07)} = 0.91$$
t = 3.80
df = N – 2 = 5 – 2 = 3
Critical value at .05 (df 3) = 3.18
V. Findings: t (3.80) is greater than the critical value (3.18). There is a
significant correlation between test x and test y scores.
EXERCISE
Problem:
A unit test was given to a group of Grade VI pupils in Science. Here are the
scores: 28, 30, 33, 42, 17, 18, 33, 32, 36
Questions:
Mean Score
Standard Deviation
Answer:
$$\text{Mean Score} = \frac{\sum x}{N} = 29.89$$
For A and F
29.89 + 13.66 = 43.55
Score range       Scores                 f
43.55 – up                               0
34.44 – 43.54     42, 36                 2
25.34 – 34.43     33, 33, 32, 30, 28     5
16.23 – 25.33     18, 17                 2
16.22 – below                            0
N = 9
5 Level of Performance
f
Outstanding (O) 42, 36 2
Very Satisfactory (VS) 33, 33, 32 3
Satisfactory(S) 30, 28 2
Moderately Satisfactory (MS) 18, 17 2
Needs Improvement(NI) 0
N= 9
3 Level of Performance
f
Above Average 42,36 2
Average 33, 33, 32, 30,28 5
Below Average 18, 17 2
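The conversion of these scores into letter grades can be sketched in Python as below. The score 36 is an assumption: it appears in the performance tables and is needed to reproduce the mean of 29.89, although it is not in the printed score list. Letters A to F follow Module 4a, and the SD is computed with N in the denominator.

```python
import math

scores = [28, 30, 33, 42, 17, 18, 33, 32, 36]  # 36 is assumed (see lead-in)
n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)

# Lower boundaries of grades A, B, C, D: mean + 1.8 SD, + .6 SD, - .6 SD, - 1.8 SD
cuts = [mean + k * sd for k in (1.8, 0.6, -0.6, -1.8)]

def grade(score):
    """Return the letter grade whose lower boundary the score reaches."""
    for letter, cut in zip("ABCD", cuts):
        if score >= cut:
            return letter
    return "F"

counts = {g: sum(grade(s) == g for s in scores) for g in "ABCDF"}
print(round(mean, 2), round(sd, 2), counts)
```

Unrounded boundaries differ from the hand-computed ones by about 0.01 because the text rounds the mean and SD before adding.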
Module 6
(Biserial r)Testing Statistically the Significant Correlation between Variable A
(Continuous) and Variable B (Discontinuous)
I. Hypothesis Problem:
Is there relationship between music appreciation and training in music?
A. Variables:
Variable A (Continuous) – Music appreciation test score
Variable B (Discontinuous) –
1. Those with training in Music (Group 1)
2. Those without training in Music (Group 2)
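B. Formula:
$$r_{bis} = \frac{\bar{X}_1 - \bar{X}_2}{SD} \times \frac{pq}{u}$$
where $\bar{X}_1$ and $\bar{X}_2$ are the mean scores of the groups with proportions p and q.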
85 – 89 5 6 11
80 – 84 2 16 18
75 – 79 6 19 25
70 – 74 6 27 33
65 – 69 1 19 20
60 – 64 0 21 21
55 – 59 1 16 17
N1 = 21 N2 = 124 N = 145
Total
III. Hypothesis:
p = 85.5%
14.5%
C. Step 3. Compute u.
U is the value of the height of the ordinate of the specific area in the
normal curve. (See illustration of u in the normal curve).
$$AM = \frac{q - 50\%}{100} = \frac{85.5\% - 50\%}{100} = \frac{35.5\%}{100} = 0.355$$
(assuming that q is the larger group)
u = 0.288
$$r_{bis} = \frac{\bar{X}_1 - \bar{X}_2}{SD} \times \frac{pq}{u} = \frac{5.65}{8.80} \times \frac{0.124}{0.288} = 0.642 \times 0.431 = 0.277$$
r (bis) = 0.277 or 0.28
$$t = r\sqrt{\frac{N - 2}{1 - r^2}} = 0.28\sqrt{\frac{145 - 2}{1 - (0.28)^2}} = 0.28\sqrt{\frac{143}{0.9216}} = 0.28\sqrt{155.16} = 0.28 \times 12.46 = 3.49$$
t = 3.49
V. Findings:
1. What is the value of r?
● Low
● significant
Module 7
Testing the Significant Change in Opinions/Attitudes of a Group After
Treatment
II.Hypothetical Data:
                        AFTER
                   Negative    Positive
BEFORE  Positive   13 (A)      2 (B)
        Negative   9 (C)       6 (D)
$$\chi^2 = \frac{(|A - D| - 1)^2}{A + D}$$
VIII.Computation
$$\chi^2 = \frac{(|13 - 6| - 1)^2}{19} = \frac{(7 - 1)^2}{19} = \frac{36}{19} = 1.89$$
df = (R – 1)(C – 1) = (2 – 1)(2 – 1) = (1)(1) = 1
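The computation can be verified with a short Python sketch. The function name `change_chi2` is illustrative; A = 13 and D = 6 are the cell counts from the hypothetical data.

```python
def change_chi2(a, d):
    """Chi-square for change: (|A - D| - 1)^2 / (A + D), with 1 df."""
    return (abs(a - d) - 1) ** 2 / (a + d)

chi2 = change_chi2(13, 6)
significant = chi2 > 3.84  # critical value at .05 with 1 df
print(round(chi2, 2), significant)
```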
IX. Findings.
χ² (1.89) < critical value at .05 (3.84)
X. Decision
XI. Interpretation
The attitude of the fifteen nursing students did not change after the
development program. Attitude is difficult to change. It takes a longer time
to change one’s attitude.
Module 8
How to compute the t-ratio (Testing the Significant Difference between two
Mean Scores of Large N’s)
I. Formula
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(SD_1)^2}{N_1} + \dfrac{(SD_2)^2}{N_2}}}$$
III.
A. Problem:
School A and School B were compared in the performance of Grade VI
pupils of the same city in an achievement test in Mathematics.
Test at the .05 level that there is no significant difference in mean
scores between the 2 schools.
B. Null Hypothesis (Ho): There is no significant difference in mean
scores between School A and School B in a Mathematic Test.
C. Data:
         School A    School B    Difference
Mean     69          67          2
SD       10          12
N        100         144
Step 2 – Compute the standard error of the mean for samples A and B.
Step 3 – Substitute the numbers in the formula and perform the indicated operation.
$$t = \frac{69 - 67}{\sqrt{\dfrac{(10)^2}{100} + \dfrac{(12)^2}{144}}} = \frac{2}{\sqrt{\dfrac{100}{100} + \dfrac{144}{144}}} = \frac{2}{\sqrt{1 + 1}} = \frac{2}{\sqrt{2}} = \frac{2}{1.41} = 1.42$$
df = 99 + 143
df = 242
Critical value at .05 (df 242) = 1.96
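The t-ratio for two large samples can be sketched in Python as follows. The function name `t_ratio` is illustrative; the means, SDs and Ns are those of School A and School B.

```python
import math

def t_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two means over the standard error of the difference."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / standard_error

t = t_ratio(69, 10, 100, 67, 12, 144)
significant = abs(t) > 1.96  # critical value at .05 for large df
print(round(t, 3), significant)
```

Carried at full precision the ratio is about 1.414; the 1.42 in the hand computation comes from rounding the square root of 2 to 1.41 before dividing. Either way, the value is below 1.96.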
V. Findings:
EXERCISE:
II. Data:
Class A Class B
X 74.55 62.42
SD 14.70 18.25
N 6.6 8.2
III. Questions:
1. Const. the following:
a. RH
b. Ho
c. Ha
2. Compute
a. T
b. Critical value at .05
a. Findings
b. Decision
c. Interpretation
Answers:
1.
a. RH
c. Ha
2. Compute
a. t
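$$t = \frac{74.55 - 62.42}{\sqrt{\dfrac{(14.70)^2}{6.6} + \dfrac{(18.25)^2}{8.2}}} = \frac{12.13}{\sqrt{\dfrac{216.09}{6.6} + \dfrac{333.06}{8.2}}} = \frac{12.13}{\sqrt{32.74 + 40.62}} = \frac{12.13}{\sqrt{73.36}} = \frac{12.13}{8.57} = 1.42$$
t = 1.42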
c. Interpretation/ Implication
The performance of the Achievement Test in Class A and Class B is
comparable.
Module 9
How to compute the Significant difference between two mean scores when N is
small (below 30)
A. Hypothetical Problem
Scores Scores
18 15
17 18
16 19
12 18
19 19
15 20
19 19
12 N=8
N = 10
C. Formula
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{N_1} + \dfrac{S_2^2}{N_2}}} \qquad \text{where } S^2 = \frac{\sum (x - \bar{X})^2}{N - 1}$$
D. Statistical Procedures
a. Mean 1 = 16
b. Mean 2 = 18
$$S_1^2 = \frac{64}{10 - 1} = \frac{64}{9} = 7.11$$
Module 10
How to compute the significant difference between the mean scores of
pre-test and post-test scores.
I. Hypothetical Problem
Test at the .05 level if there is a significant mean gain between pre-test and
post-test scores in an activity-centered science class.
III. Formula
$$t = \frac{\bar{x}_d}{S / \sqrt{N}} \qquad \text{where } S = \sqrt{\frac{\sum (d - \bar{x}_d)^2}{N - 1}}$$
IV. Data
Student   Pre-Test   Post-Test   d    d - x̄d   (d - x̄d)²
A         8          11          3    0         0
B         2          4           2    -1        1
C         6          10          4    +1        1
N = 3     x̄d = 3    ∑(d - x̄d)² = 2
V. Statistical Procedure
Step 1 - Compute the mean difference (x̄d).
$$S = \sqrt{\frac{\sum (d - \bar{x}_d)^2}{N - 1}} = \sqrt{\frac{2}{2}} = \sqrt{1} = 1$$
$$t = \frac{3}{1 / \sqrt{3}} = 3 \times 1.732 = 5.20$$
df = N – 1 = 3 – 1 = 2
Critical value at .05 (df 2) = 2.92 (one-tailed)
VI. Findings
t (5.20) > .05 (2.92)
The computed t (5.20) is greater than the tabled value of 2.92.
VII. Decision
Reject the Null Hypothesis
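The pre-test/post-test comparison can be sketched in Python as below. The function name `paired_t` is illustrative; the three pairs of scores are those of students A, B and C in the data table.

```python
import math

def paired_t(pre, post):
    """t for the mean gain between paired pre-test and post-test scores."""
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    mean_d = sum(d) / n
    s = math.sqrt(sum((x - mean_d) ** 2 for x in d) / (n - 1))
    return mean_d / (s / math.sqrt(n)), n - 1

t, df = paired_t([8, 2, 6], [11, 4, 10])
reject_null = t > 2.92  # one-tailed critical value at .05 with 2 df
print(round(t, 2), df, reject_null)
```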
VIII. Interpretation: