
COURSE SYLLABUS
TYPES OF DATA

A variable is a characteristic, description, or attribute of persons or objects which assumes different values or labels.

Examples:

a) Height, age, and weight are variables which assume numerical responses or values.
Height: 5 feet and 4 inches
Weight: 120 kilograms
Age: 20 years old
b) Gender, religious affiliation, and civil status are also variables, which assume not values but different labels or categories.
Gender: Male (M) Female (F)
Religious Affiliation: Roman Catholic Adventist
Mormons Protestant
Civil Status: Single Married
Separated Widow/er

Variables are generally classified into two, namely qualitative and quantitative
variables. A qualitative variable yields categorical responses while a quantitative
variable yields numerical responses representing an amount or quantity.

Examples:

Civil status and religious affiliation are qualitative variables.


Number of children in the family, blood pressure, and temperature are
quantitative variables.

Quantitative variables, on the other hand, can either be discrete or continuous. A discrete quantitative variable assumes a finite or countably infinite set of values such as 0, 1, 2, 3, … and is usually obtained through the process of counting. A continuous quantitative variable assumes values which are associated with points on an interval of the number line. These are usually obtained through the process of measurement with corresponding units.

Examples:

a.) Number of students and number of patients are discrete quantitative variables.
b.) Height, weight, and temperature are continuous quantitative variables.

Variables can be also classified according to their levels of measurement. These
are scales of measuring data.

A nominal data is the crudest form of data. It uses numbers or symbols for the
purpose of categorizing subjects into groups or categories, which are mutually exclusive.
Thus, being in one category automatically excludes one from being a member of another
category. Moreover, the categories are exhaustive, that is, all possible categories of a variable should be included.

Examples:

a.) Gender can be categorized as either
M - Male
F - Female
Thus, if an individual is a member of the male group then he cannot be a member of the female group at the same time.

b.) College year level can be categorized as:


I – First Year
II – Second Year
III – Third Year
IV – Fourth Year

An ordinal data possesses all the properties of the nominal data. Hence, it can be said that an ordinal data is an improvement of the nominal data because here the data are ranked or ordered in a somewhat "bottom to top" or "low to high" scheme.

Examples:

a.) Student’s class standing is an ordinal data. These are categorized into:
5 – Excellent
4 – Very Good
3 – Good
2 – Fair
1 - Poor

b.) Pain assessment is also an ordinal data, which is categorized as:


0 – No Pain
1 - Moderately Painful
2 – Severely Painful
3 – Very Painful

An interval data possesses all the properties of the nominal and ordinal data. Here, the data are numeric in nature and the distances between any two numbers are known. However, the interval data, although numeric, does not have a true or absolute zero point.

Examples:

Consider the IQ of four students: 70, 140, 75, and 145.

Here we can say that the difference between 140 and 70 is the same as the difference between 145 and 75. But we cannot claim that the second student is twice as intelligent as the first. Is there such a thing as a zero IQ?

A ratio data possesses all the properties of the nominal, ordinal and interval data. It is also numeric in nature and has an absolute zero point. Thus, with ratio data, we can classify and order/rank the values, and likewise we can also compare their magnitudes.

Examples:

Age, income and scores are examples of ratio data.

There are also other classifications of data. Raw data are those which are in their original form and structure. Responses from surveys, taped interviews, and recorded observations are examples of raw data. Grouped data, on the other hand, are those placed and summarized in tabular form.

METHODS OF DATA COLLECTION

In statistical investigations, there are many ways of collecting data. None of these methods is the best in every case because the choice of the appropriate method largely depends on several factors, which include the definition of the problem, the research design, the time element of data collection, and the cooperation of the respondents.

The observation method is the simplest data collection technique. Here, the data are obtained by merely observing the behaviour of persons or objects, but only at a particular time of occurrence. The data obtained are called observational data.

The experimental method is especially useful when one wants to collect data for cause-and-effect studies under controlled conditions. In this method, there is actual interference with the conditions and situations that can affect the variable under study. The data obtained in this method are called experimental data.

In the registration method, the respondents provide the necessary information in compliance with existing laws. For instance, data can be derived from card registration, birth registration, voter's registration and marriage registration.

The use of existing studies also provides an archival method of data collection. In this method, the primary source is the source in which the data are measured or gathered by the researcher or agency that published it. The secondary source, on the other hand, is the source from which any republication of data is made by another agency.

In the survey method, the desired information is obtained through asking questions. The survey method may either be the direct or personal interview method or the indirect or questionnaire method.

In the direct or personal interview method, there is a person-to-person contact between the interviewer and the interviewee. This is considered one of the most effective methods of data collection because accurate and precise information can be obtained and verified from the respondents. Moreover, this has a higher response rate but can only be administered to the respondents one at a time.

The indirect or questionnaire method is considered the easiest method of data collection through the use of a questionnaire as a data-gathering tool. Unlike the direct method, this method has a lower response rate but can be administered to a large number of respondents simultaneously.

RAW SCORES
RAW SCORES OF 52 STUDENTS IN AN ACHIEVEMENT TEST

163 180 148 156 168 172

177 193 142 152 167 178

189 162 157 161 167 173

176 188 198 158 151 162

167 186 143 164 169 171

182 157 171 197 159 168

153 147 136 161 166 162

172 163 173 183 179 173

165 156 165 167

RAW SCORES OF 100 PUPILS IN A VOCABULARY TEST

23 36 40 31

29 15 34 36

31 24 40 45

34 57 20 45

16 33 37 37

12 27 14 43

36 41 41 52

39 25 46 49

34 22 21 40

27 18 35 40

39 30 41 42

51 38 16 27

24 26 32 34

26 38 46 41

40 30 45 44

38 29 34 35

32 19 18 26

37 38 32 50

43 29 25 29

37 41 51 35

48 40 34 31

33 43 37 46

34 34 38 20

40 14 31 32

33 42 38 59

THE MASTER SHEET

The master sheet or classifier is a device used in arranging scores or statistical data. With it, scores are easily and conveniently arranged from the highest to the lowest, or vice versa. The frequency of each score is easily determined. It is a preparatory step to the ranking and grouping of scores.

Procedures in classifying scores on the master sheet:

1. Determine the highest and the lowest scores.

2. Subtract the tens of the lowest score from the tens of the highest score. Add 4 (constant) to the difference to determine the number of horizontal lines.

3. Draw 13 (constant) vertical lines and as many horizontal lines as computed in step 2.

4. Write in the horizontal cells the units 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and "Total", starting from the second cell; and in the left vertical cells, "Total" and the tens (the ten of the lowest score, followed by the second ten, and so on up to the ten of the highest score), starting from the bottom.

5. Tally the raw scores in the cell where they fall. The first digit or
digits of a score are represented by the vertical tens and the last
digit by the horizontal units.

6. Count the tallies in every cell and write the total frequencies corresponding to the tens and the units. Add the total frequencies of the tens and of the units. The sums must be equal. (These sums correspond to the number of raw scores or cases under consideration.)

EDUCATION 602 - Statistics

Name Course Date

THE MASTER SHEET OR CLASSIFIER


Exercise No.1

U N I T S

0 1 2 3 4 5 6 7 8 9 TOTAL

T 19 I I I 3

18 I I I I I I 6

E 17 II II III I I I I 11

16 II III II I II I IIII II I 18

N 15 I I I II II I I 9

14 I I I I 4

S 13 I 1

TOTAL 1 5 8 9 1 2 6 9 7 4 N=52

H.S. = 198 – 19t
L.S. = 136 – 13t
Difference = 6t
+ Constant 4
Horizontal Lines = 10
Constant: 13 vertical lines

EDUCATION 602 - Statistics

Name Course Date

THE MASTER SHEET OR CLASSIFIER


Exercise No.1

U N I T S

0 1 2 3 4 5 6 7 8 9 TOTAL

5 I II I I I 6

N 4 IIII-II IIII II III I III III I I 26

3 II IIII IIII III IIII-III III III IIII IIII-I II 40

2 II I I I II II III III IIII 19

1 I II I II II I 9

TOTAL 12 12 9 7 13 9 11 9 9 9 N=100

H.S. = 59 – 5t
L.S. = 12 – 1t
Difference = 4t
+ Constant 4
Horizontal Lines = 8
Constant: 13 vertical lines

RANKING OF SCORES

Ranking is the relative placement or arrangement of measures in a series according to magnitude, value, or quality, from the lowest to the highest or vice versa. It does not take into account the size of the difference between any two successive measures. The ranks are successive and continuous but the differences vary. Moreover, the scores are only indications of the achievements of pupils. They are not exact measures; they signify more or less the accomplishments of pupils. Hence, the positions of pupils indicated by their ranks are relative, not absolute.

From the ranks, it is possible to distinguish the bright from the dull and the mediocre pupils. It is possible to determine the percentage of pupils that surpass a given pupil and the percentage surpassed by him. Ranks are also used in the computation of the coefficient of correlation.

Procedure in the ranking of scores or measures:

1. By using the master sheet, arrange the scores from the highest to the
lowest, writing each score as many times as it appears.

2. Number the scores consecutively, giving the highest score tentative rank 1, the next highest 2, and so on to the lowest score. (The tentative rank of the lowest score is equal to the total number of cases, N.)

3. Assign the tentative ranks as the real ranks of scores that appear only once. Scores appearing more than once take the average of their ordinal numbers (tentative ranks) as their real ranks. (Identical or similar scores have equal ranks.)

EDUCATION 602- Statistics

Name Course Date

THE RANKING OF SCORES


Exercise No.2

SCORES TR RR SCORES TR RR

198   1   1        165  29  29.5
197   2   2        165  30  29.5
193   3   3        164  31  31
189   4   4        163  32  32.5
188   5   5        163  33  32.5
186   6   6        162  34  35
183   7   7        162  35  35
182   8   8        162  36  35
180   9   9        161  37  37.5
179  10  10        161  38  37.5
178  11  11        159  39  39
177  12  12        158  40  40
176  13  13        157  41  41.5
173  14  15        157  42  41.5
173  15  15        156  43  43.5
173  16  15        156  44  43.5
172  17  17.5      153  45  45
172  18  17.5      152  46  46
171  19  19.5      151  47  47
171  20  19.5      148  48  48
169  21  21        147  49  49
168  22  22.5      143  50  50
168  23  22.5      142  51  51
167  24  25.5      136  52  52
167  25  25.5      N = 52
167  26  25.5
167  27  25.5      Legend:
166  28  28        TR = Tentative Rank
                   RR = Real Rank

EDUCATION 602- Statistics

Name Course Date

The Ranking Scores


Exercise No.2

Scores TR RR Scores TR RR Scores TR RR

59 1 1 37 41 26 80

57 2 2 37 42 26 81 81

52 3 3 37 43 43 26 82

51 4 37 44 25 83
4.5 83.5
51 5 37 45 25 84

50 6 6 36 46 24 85
85.5
49 7 7 36 47 47 24 86

47 8 8 36 48 23 87 87

46 9 10 35 49 50 22 88 88

46 10 35 50 21 89 89

46 11 35 51 20 90 90.5

45 12 34 52 20 91

45 13 13 34 53 19 92 92

45 14 34 54 18 93
93.5
44 15 15 34 55 18 94
55.5
43 16 34 56 16 95
95.5
43 17 17 34 57 16 96

43 18 34 58 15 97 97

42 19 34 59 14 98
19.5 98.5
42 20 33 60 14 99

41 21 33 61 61 12 100 100

41 22 33 62 N = 100

41 23 23 32 63

41 24 32 64
64.5
41 25 32 65

40 26 32 66

40 27 31 67

40 28 31 68 68.5

40 29 29 31 69

40 30 31 70

40 31 30 71 71.5

40 32 30 72 Legend:

39 33 29 73 TR = Tentative Rank
33.5
39 34 29 74 RR= Real Rank
74.5
38 35 29 75

38 36 37.5 29 76

38 37 27 77 78

38 38 27 78

38 39 27 79

38 40

THE SCORE DISTRIBUTION

The grouping of scores in a score distribution is resorted to when there are few cases, not more than 30.

Grouping scores in a score distribution, as well as in a step or frequency distribution, makes the data partly meaningful. At a glance the recurrence or frequency of a score is seen, that is, how many times a certain score appears. Where most of the scores cluster is also evident.

The distribution also indicates whether the examination is easy, difficult, or of moderate difficulty. If most of the scores are high, the examination is relatively easy; if most of the scores are low, the examination is difficult; if most of the scores are found in the center of the distribution, the examination is of moderate difficulty.

Scores are grouped in a score or step distribution to economize space and to facilitate the computation of statistical measures like the median and the arithmetic mean taken up in later exercises.

Procedures in grouping scores or measures in a score distribution:

1. By using the master sheet, arrange the scores from the highest to the
lowest, writing each score only once.

2. Take the raw scores and place a tally after each score as many times as
the score appears.

3. Count the tallies opposite each score and write the number opposite the
tallies themselves. This number of tallies is the frequency (f) of the score.

4. Add the frequencies and write the sum at the bottom of the tabulation to get N, the total number of scores or cases.

EDUCATION 602 – Statistics

Name Course Date

Score Distribution
Exercise No. 3

Scores Tallies Freq. Scores Tallies Freq. Scores Tallies Freq.

198  I    1      173  III   3      158  I    1
197  I    1      172  II    2      157  II   2
193  I    1      171  II    2      156  II   2
189  I    1      169  I     1      153  I    1
188  I    1      168  II    2      152  I    1
186  I    1      167  IIII  4      151  I    1
183  I    1      166  I     1      148  I    1
182  I    1      165  II    2      147  I    1
180  I    1      164  I     1      143  I    1
179  I    1      163  II    2      142  I    1
178  I    1      162  III   3      136  I    1
177  I    1      161  II    2
176  I    1      159  I     1
     13                26                13

SUMMARY:

I = 13
II = 26
III = 13
N = 52

EDUCATION 602 – Statistics

Name Course Date

Score Distribution
Exercise No.3

Scores Tallies Freq. Scores Tallies Freq. Scores Tallies Freq.

59 I 1 40 IIII-II 7 26 III 3

57 I 1 39 II 2 25 II 2

52 I 1 38 IIII-I 6 24 II 2

51 II 2 37 IIII 5 23 I 1

50 I 1 36 III 3 22 I 1

49 I 1 35 III 3 21 I 1

48 I 1 34 IIII-III 8 20 II 2

46 III 3 33 III 3 19 I 1

45 III 3 32 IIII 4 18 II 2

44 I 1 31 IIII 4 16 II 2

43 III 3 30 II 2 15 I 1

42 II 2 29 IIII 4 14 II 2

41 IIII 5 27 III 3 12 I 1

25 54 21

SUMMARY:

I = 25

II = 54

III = 21

N = 100

THE FREQUENCY DISTRIBUTION

Data collected from tests and experiments may have little meaning to the investigator until they have been arranged or classified in some systematic way. The first task therefore is to organize our materials, and this leads naturally to a grouping of scores into classes or steps.

Procedures in grouping scores or measures into a frequency distribution:

1. Determine the range. The range is the gap between the highest and the
lowest scores- the difference that results when the lowest score is
subtracted from the highest score.

2. Determine the class interval. (a) To minimize the error and to avoid too much labor, it is suggested that the number of steps should not be less than 10 nor more than 20. The ideal number should be between 12 and 15. In exceptional cases the number which a given range yields may be below 10 or more than 20. A good rule is to select an odd number for the class interval (i) which will give a quotient of between ten and fifteen when the range is divided by it. Be sure the interval chosen will not spread the data out too much, thus losing the benefit of grouping, nor crowd the scores into coarse categories. (b) Another method of determining i (by Ross): add 1 (constant) to the range and divide the sum by 12 (constant).

3. Determine the limits of the classes or steps. For the lower limit of the highest step, choose a number which is nearest to or equal to the highest score, but not exceeding it, and which is exactly divisible by the size of the class interval. The upper limit is determined by adding to the lower limit one number less than i. The succeeding limits are determined by subtracting the size of i from the preceding lower and upper limits.

4. Make the tabulation. Tally the raw scores opposite their proper interval or class. The total number of tallies on each class interval (its frequency) is written in a column labelled f. The sum of the f column is called N (the number of cases).

EDUCATION 602 – Statistics

Name Course Date

The Frequency Distribution


Exercise No.4

Scores Tallies Frequencies

195-199 II 2

190-194 I 1

185-189 III 3

180-184 III 3

175-179 IIII 4

170-174 IIII – II 7

165-169 IIII – IIII 10

160-164 IIII – III 8

155-159 IIII – I 6

150-154 III 3

145-149 II 2

140-144 II 2

135-139 I 1

N = 52

H = 198          Range = 198 - 136 = 62
L = 136          62 + 1 = 63
                 63 / 12 = 5, remainder 3          i = 5

Rule No. 1: If the quotient is an odd number, it automatically becomes the interval (i).

EDUCATION 602 – Statistics

Name Course Date

Frequency Distribution
Exercise No.4

Scores   Tallies              Frequencies

57-59    II                   2
54-56                         0
51-53    III                  3
48-50    III                  3
45-47    IIII-I               6
42-44    IIII-I               6
39-41    IIII-IIII-IIII       14
36-38    IIII-IIII-IIII       14
33-35    IIII-IIII-IIII       14
30-32    IIII-IIII            10
27-29    IIII-II              7
24-26    IIII-II              7
21-23    III                  3
18-20    IIII                 5
15-17    III                  3
12-14    III                  3
                              N = 100

H = 59          Range = 59 - 12 = 47
L = 12          47 + 1 = 48
                48 / 12 = 4, no remainder          i = 3

Rule No. 2: If the quotient is even, without any remainder, take the preceding odd number.

THE FREQUENCY POLYGON

Aid in analysing numerical data is obtained from a graphic or pictorial treatment of the frequency distribution. The advertiser has long used graphic methods because these devices catch the eye and hold the attention when the most careful array of statistical evidence fails to attract notice. For this and other reasons the research worker also utilizes the attention-getting power of visual presentation and, at the same time, seeks to translate numerical facts – often abstract and difficult to interpret – into a more concrete and understandable form.

Four methods of representing a frequency distribution graphically are in use: the frequency polygon, the histogram, the cumulative frequency graph, and the cumulative percentage curve or ogive.

The frequency polygon and the histogram have the same uses. The score opposite the summit of the frequency polygon is the crude mode; the midpoint of the top of the highest rectangle of the histogram is a crude mode. From the frequency polygon and the histogram, it is possible to glean the "representativeness" of the group concerned. If the graph plotted is similar to the shape of a bell, the group is more or less typical. The more irregular the shape of the graph, the less representative is the group. It is also possible to note the tendency of the measures - whether they are piled up at the low (or high) end of the scale or are evenly and regularly distributed over the scale. If the test is too easy, the scores accumulate at the high end of the scale, whereas if the test is too hard, scores will crowd at the low end of the scale. When the test is of moderate difficulty, the scores will be distributed symmetrically around the mean, a few individuals scoring quite high, a few quite low, and the majority falling somewhere near the middle of the scale.

The frequency polygon is less precise than the histogram in that it does not
represent accurately, i.e. in terms of areas, the frequency upon each interval. In
comparing two or more graphs plotted on the same axes, however, the frequency
polygon is likely to be more useful as the vertical and horizontal lines in the histogram
will often coincide.

Procedure in plotting a frequency polygon:

1. Labelling the points on the base line. There are several ways of labelling the intervals along the base line (X axis) of the frequency polygon. For example, step 195-199 of our frequency distribution, which has an i of 5, may be interpreted as having the following limits:

a. Expressed limits: 195-199. This means that this interval begins with the score 195 and ends with the score 199. These limits are ideal for tallying the scores in a frequency distribution because of the ease in tallying.

b. Score limits: 195-200. This interval means that scores from 195 up to but not including 200 fall within this grouping. These limits are conveniently used in labelling the points on the base line of the frequency polygon, but inconveniently used in tallying because it is fairly easy for one to let the score 200 slip into the interval 195-200 owing simply to the presence of 200 at the upper limit of the interval.

c. Exact limits: 194.5-199.5. This interval begins exactly at 194.5 (not at 195) and ends at 199.5 (not at 199). Apparently, this is time-consuming and clumsy. However, exact limits may be used in computation (rather than in labelling the points on the base line of the graph).

2. Plotting midpoints. Frequencies on each interval are plotted above the


midpoints of the intervals on the X axis. They are represented in each
instance by a dot the specified distance up on Y and midway between the
lower and upper limits of the interval upon which it falls.

3. Drawing the frequency polygon. When all the points have been located in the diagram, they are joined by a series of short lines to form the frequency polygon. In order to complete the figure (i.e., to bring it down to the base line), one additional interval at the low end and one additional interval at the high end of the distribution are included on the X scale. The frequency on each of these intervals is, of course, zero.

4. Dimensions of the frequency polygon. In order to give symmetry and balance to the polygon, care must be exercised in the selection of unit distances to represent the intervals on the X axis and the frequencies on the Y axis. A good general rule is to select X and Y units which will make the height of the figure about 60-80% of its width.

5. Area of the polygon. The total frequency (N) of a distribution is represented by the area of its polygon; that is, the area bounded by the frequency surface and the X axis.

Steps in constructing a frequency polygon:

1. Draw two straight lines perpendicular to each other, the vertical line near left side
of the paper, the horizontal line near the bottom. Label the vertical line (the Y
axis) OY, and the horizontal line (the X axis) OX. Put 0 where the two lines
intersect. This point is the origin.
2. Lay off the score intervals of the frequency distribution at regular distances along the X axis. Begin with the interval next below the lowest in the distribution, and end with the interval next above the highest in the distribution. Label the successive X distances with the score-interval limits. Select an X unit which will allow all the intervals to be represented easily on the graph paper.
3. Mark off on the Y axis successive units to represent the scores (the frequencies) on the different intervals. Choose a Y scale which will make the largest frequency (the height of the polygon) approximately 75%, or 60-80%, of the width of the figure.
4. At the midpoint of each interval on the X axis go up in the Y direction a distance equal to the number of scores on the interval. Place points at these locations.
5. Join the points plotted in step 4 with straight lines to give the frequency surface.

The Frequency Polygon

Exercise No.5

Polygon - from the Greek term polygonon, which means "many-angled".

Polygon - a figure, especially a closed plane figure, having 3 or more straight sides.

-visual aids

-attention getting

-catch the eye

-seeks to translate numerical facts

Four Methods in general use:

1) frequency polygon

2) histogram

3) cumulative frequency graph

4) cumulative percentage curve or ogive

Three types of limits:

1. Expressed limits- limits ideal for tallying

140- 144 *expressly tells us what is included in a class.

135-139

2. Score limits- score added- limits ideal for labelling for graphs

140-145 (140-144)

135-140 (135-139)

3. Exact limits- this is expressed in 0.5 or decimals

139.5-144.5 (lower limit is .5 less) 140-145

134.5-139.5 (upper limit is .5 more) 135-140

Purpose is for (1)labeling and for

(2) computation

EDUCATION 602 – Statistics

Name Course Date

The Frequency Polygon

Exercise No.5

(Figure: frequency polygon of the 52 achievement-test scores. X axis = scores, labelled 130 to 205 in steps of 5; Y axis = frequency, 0 to 10.)

Expressed Limits        Score Limits
Scores        F         Scores        F
200-204       0         200-204       0
195-199       2         195-199       2
190-194       1         190-194       1
185-189       3         185-189       3
180-184       3         180-184       3
175-179       4         175-179       4
170-174       7         170-174       7
165-169      10         165-169      10
160-164       8         160-164       8
155-159       6         155-159       6
150-154       3         150-154       3
145-149       2         145-149       2
140-144       2         140-144       2
135-139       1         135-139       1
130-134       0         130-134       0
N = 52                  N = 52

Orig. Steps = 13        Added steps = 2        Constant = 1
Vertical Lines = 13 + 2 + 1 = 16
Horizontal Lines = 16 x .75 = 12
Frequency units per line = 10 / 12 = 0.83

EDUCATION 602 – Statistics

Name Course Date

The Frequency Polygon


Exercise No.5

Expressed Limits        Score Limits
Scores     Freq.        Scores     Freq.
60-62        0          60-62        0
57-59        2          57-59        2
54-56        0          54-56        0
51-53        3          51-53        3
48-50        3          48-50        3
45-47        6          45-47        6
42-44        6          42-44        6
39-41       14          39-41       14
36-38       14          36-38       14
33-35       14          33-35       14
30-32       10          30-32       10
27-29        7          27-29        7
24-26        7          24-26        7
21-23        3          21-23        3
18-20        5          18-20        5
15-17        3          15-17        3
12-14        3          12-14        3
9-11         0          9-11         0
N = 100                 N = 100

Orig. Steps = 16        Added steps = 2        Constant = 1
Vertical Lines = 16 + 2 + 1 = 19
Horizontal Lines = .75 of vertical lines = 19 x .75 = 14.25
Frequency units per line = 14 / 14.25 = 0.98, or 1 (converted into a whole number)

(Figure: frequency polygon of the 100 vocabulary-test scores. X axis = scores, labelled 9 to 63 in steps of 3; Y axis = frequency, 0 to 15.)

ORIGINAL AND SMOOTHED FREQUENCY POLYGONS

If the sample is small and the frequency distribution is somewhat irregular, the polygon tends to be jagged in outline. To iron out chance irregularities, and also to get a better notion of how the figure might look if the data were more numerous, the frequency polygon may be "smoothed". In smoothing, a series of "moving" or "running" averages is taken, from which new or adjusted frequencies are determined.

Procedure in smoothing a frequency polygon:

1. Find the adjusted or "smoothed" f of every class interval (including the one additional step next below the lowest in the distribution and the one additional step next above the highest in the distribution) by adding the f on the given interval and the f's on the two adjacent intervals (the interval just below and the interval just above) and dividing the sum by three (3). (The total of all the adjusted frequencies should equal the number of cases.)

2. On the same graph paper on which the original frequency polygon has been constructed, place a point at the midpoint of each interval on the X axis corresponding to the adjusted f in the Y direction.

3. Join the points plotted in step 2 with straight or dotted lines (using a different ink color) to complete the smoothed polygon.

EDUCATION 602 – Statistics

Name Course Date

The Original and Smoothed Frequency Polygons


Exercise No.6

(Figure: original and smoothed frequency polygons of the 52 achievement-test scores. X axis = scores, 130 to 205 in steps of 5; Y axis = frequency, 0 to 10.)

Expressed Limits     Score Limits      Adjusted
Scores       F       Scores       F    Frequencies
200-204      0       200-204      0      2/3
195-199      2       195-199      2    1
190-194      1       190-194      1    2
185-189      3       185-189      3    2 1/3
180-184      3       180-184      3    3 1/3
175-179      4       175-179      4    4 2/3
170-174      7       170-174      7    7
165-169     10       165-169     10    8 1/3
160-164      8       160-164      8    8
155-159      6       155-159      6    5 2/3
150-154      3       150-154      3    3 2/3
145-149      2       145-149      2    2 1/3
140-144      2       140-144      2    1 2/3
135-139      1       135-139      1    1
130-134      0       130-134      0      1/3
N = 52               N = 52            Total = 52

EDUCATION 602 – Statistics

Name Course Date

The Original and Smoothed Frequency Polygons


Exercise No.6

Expressed Limits     Score Limits      Adjusted
Scores    Freq.      Scores    Freq.   Frequencies
60-62       0        60-62       0       2/3
57-59       2        57-59       2       2/3
54-56       0        54-56       0     1 2/3
51-53       3        51-53       3     2
48-50       3        48-50       3     4
45-47       6        45-47       6     5
42-44       6        42-44       6     8 2/3
39-41      14        39-41      14    11 1/3
36-38      14        36-38      14    14
33-35      14        33-35      14    12 2/3
30-32      10        30-32      10    10 1/3
27-29       7        27-29       7     8
24-26       7        24-26       7     5 2/3
21-23       3        21-23       3     5
18-20       5        18-20       5     3 2/3
15-17       3        15-17       3     3 2/3
12-14       3        12-14       3     2
9-11        0        9-11        0     1
N = 100              N = 100           Total = 100

(Figure: original and smoothed frequency polygons of the 100 vocabulary-test scores. X axis = scores, 9 to 63 in steps of 3; Y axis = frequency, 0 to 15.)

THE HISTOGRAM OR COLUMN DIAGRAM

In the frequency polygon, all the scores within a given interval are represented by the midpoint of that interval, whereas in a histogram the scores are assumed to be spread uniformly over the entire interval. Within each interval of a histogram the frequency is shown by a rectangle, the base of which is the length of the interval, and the height of which is the number of scores within the interval.

Procedures of constructing a histogram or column diagram:

1. Draw OX and OY as in the frequency polygon and lay off equal distance
on both axes – OX for the steps and OY for the frequencies.

2. Lay off the score intervals of the frequency distribution along the X-axis.
Begin with the highest interval in the distribution.

3. Mark off on the Y axis successive units to represent the frequencies on


the different intervals.

4. Draw a horizontal line limited by the lower and upper limits of each step, instead of a point at the midpoint as in the frequency polygon.

5. Connect by straight vertical lines every two adjacent ends of the lines. The figure is a histogram.

6. Shade the histogram to bring out clearly the total area of the figure.
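A hedged matplotlib sketch of the construction (matplotlib is an assumption, not part of the manual); each rectangle is drawn from the exact lower limit of its step with a width equal to the class interval, using the 52-score distribution.

import matplotlib.pyplot as plt

lower_limits = [135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
freqs        = [1,   2,   2,   3,   6,   8,   10,  7,   4,   3,   3,   1,   2]
i = 5

edges = [ll - 0.5 for ll in lower_limits]   # exact lower limit of each step
plt.bar(edges, freqs, width=i, align="edge", edgecolor="black")
plt.xlabel("Scores")
plt.ylabel("Frequency")
plt.title("Histogram (column diagram)")
plt.show()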

EDUCATION 602 – Statistics

Name Course Date

The Histogram or Column Diagram


Exercise No.7

(Figure: histogram or column diagram of the 52 achievement-test scores. X axis = scores, 135 to 200 in steps of 5; Y axis = frequency, 0 to 10; the 52 cases are numbered consecutively inside the columns.)

Score Freq.

Vertical Lines:

195-199 2 2 lines less than in

190-194 1 The polygon

185-189 3

180-184 3 Original Steps = 13

175-179 4 Constant + 1

170-174 7 14

165-169 10

160-164 8 Horizontal Lines:

155-159 6 Same as in the

150-154 3 frequency

145-149 2 polygon

140-144 2

135-139 1

130-134 0

N=52

EDUCATION 602 – Statistics

Name Course Date

The Histogram or Column Diagram


Exercise No.7
(Figure: histogram or column diagram of the 100 vocabulary-test scores. X axis = scores, 12 to 60 in steps of 3; Y axis = frequency, 0 to 14; the 100 cases are numbered consecutively inside the columns.)

Score Freq.

Vertical Lines:

57-59 2 2 lines less than in

54-56 0 the polygon

51-53 3

48-50 3 Original Steps = 16

45-47 6 Constant + 1

42-44 6 17

39-41 14

36-38 14 Horizontal Lines:

33-35 14 Same as in the

30-32 10 frequency

27-29 7 polygon

24-26 7

21-23 3

18-20 5

15-17 3

12-14 3

N=100

MEASURES OF CENTRAL TENDENCY

When scores or other measures have been tabulated into a frequency distribution, usually the next task is to calculate a measure of central tendency, or central position. The value of a measure of central tendency is twofold: (1) it is an "average" which represents all of the scores made by the group, and as such gives a concise description of the performance of the group as a whole; and (2) it enables us to compare two or more groups in terms of typical performance.

There are three "averages" or measures of central tendency in common use: the arithmetic mean, the median, and the mode. The "average" is the popular term for the arithmetic mean. In statistical work, "average" is the general term for any measure of central tendency.

THE MEAN

The arithmetic mean is the most reliable measure of central tendency; hence its wide use in scientific and educational literature. It is generally preferred to other averages as it is rigidly defined mathematically and is based upon all of the measures. It is advantageous when the scores are distributed symmetrically around a central point, when the measure of central tendency having the greatest stability is wanted, and when other statistics such as the standard deviation and the coefficient of correlation are to be computed later.

The mean is the average of the scores or measures. It is the sum of the separate scores divided by their number. It is dependent on the magnitude of the scores. Changing a score even by one, more or less, changes the mean.

The Absolute Mean

There are three methods of computing the mean. One method, the long or
absolute method, which is used when the data are ungrouped, is the subject of
this exercise.

Procedure in calculating the mean by the long or absolute method:

1. Find the sum of the series of ungrouped raw scores (∑X).

2. Count the scores to get the number of cases (N).

3. Divide the sum by the number of cases. The quotient is the arithmetic mean or simply the mean (M).

Formula: M = ∑X / N, where ∑ = sum of, X = scores / measures, N = number of cases.
EDUCATION 602 – Statistics

Name Course Date

The Mean by the Absolute Mean


Exercise No.8 a

163 186 165 151

177 157 156 169

189 147 152 159

176 163 161 166

167 156 158 179

182 148 164 172

153 142 197 178

172 157 161 173

165 198 183 162

180 143 167 171

193 171 168 168

162 136 167 162

188 173 167 173

2267 2077 2166 2183

I   = 2,267
II  = 2,077
III = 2,166
IV  = 2,183
∑X  = 8,693

M = ∑X / N = 8,693 / 52 = 167.17

EDUCATION 602 – Statistics

Name Course Date

The Mean by the Absolute Mean
Exercise No.8 a

23 48 29 41 43

29 33 19 16 52

31 34 38 32 49

34 40 29 46 40

16 33 41 45 40

12 36 40 34 42

36 15 43 18 27

39 24 34 32 34

34 57 14 25 41

27 33 42 51 44

39 27 40 34 35

51 41 34 37 26

24 25 40 38 50

26 22 20 31 29

40 18 37 38 35

38 30 14 31 31

32 38 41 36 46

37 26 46 45 20

43 38 21 45 32

37 30 35 37 59

648 648 657 712 775

I   = 648
II  = 648
III = 657
IV  = 712
V   = 775
∑X  = 3,440

M = ∑X / N = 3,440 / 100 = 34.40

WHEN TO USE THE VARIOUS MEASURES OF CENTRAL TENDENCY

Statistics in Psychology and Education by Henry E. Garrett:

1. USE THE MEAN


a. When the scores are distributed symmetrically around a central point, i.e. when the distribution is not badly skewed. The M is the center of gravity in the distribution, and each score contributes to its determination.
b. When the measure of central tendency having the greatest reliability is
wanted.
c. When other statistics (e.g., the SD or the coefficient of correlation) are to be computed later. Many statistics are based upon the mean.
2. USE THE MEDIAN
a. When the exact midpoint of the distribution is wanted.
b. When there are extreme scores which would markedly affect the mean.
c. When it is desired that certain scores should influence the central tendency, but all that is known about them is that they are above or below the median.
3. USE THE MODE

a. When a quick and approximate measure of central tendency is all that is
wanted.
b. When the measure of central tendency should be the most typical value.

Fundamental Statistics in Psychology and Education by J.P. Guilford:

1. COMPUTE THE ARITHMETIC MEAN WHEN:


a. The greatest reliability is wanted. It usually varies less from sample to
sample drawn from the same population.
b. Other computation, as finding measures of variability, is to follow.
c. The distribution is symmetrical about the center, particularly when it is
approximately normal.
d. We wish to know the “center of gravity” of a sample.
2. COMPUTE THE MEDIAN WHEN:
a. There is not sufficient time to compute a mean.
b. Distribution is badly skewed.
c. We are interested in whether cases fall within the upper or lower halves of
the distribution and not particularly in how far from the central point.
d. Incomplete distribution is given.
3. COMPUTE THE MODE WHEN:
a. The quickest estimate of central value is wanted.
b. A rough estimate of central value will do.
c. We wish to know what is the most typical.

THE MEAN BY THE MIDPOINT METHOD

When the scores are many and are grouped into a frequency distribution, the mean may be computed by using the midpoint method, the formula for which is:

M = ∑fx / N

where f = frequency (number of scores) of each interval, x = midpoint of each interval, and ∑ = sum of.

Procedures in calculating the mean by the midpoint method:

1. Lay off the frequency distribution, showing the class or step intervals
(column 1) and their corresponding frequencies (column 2). Add the
frequencies to get the number of cases (N).

2. Determine the midpoint (x) of every interval and enter it in column 3. The midpoint is equal to the lower limit plus one-half of the difference between the upper and lower limits:

Midpoint = Lower Limit + (Upper Limit - Lower Limit) / 2

3. Multiply the frequencies (f) by their midpoint values (x) and enter the
products in column 4.

4. Add the products (fx) of all the steps to get their sum (∑fx).

5. Divide the sum (∑fx) by the total number of cases (N) to obtain the mean.
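A Python sketch of the midpoint method using the 52-score distribution of the following exercise (the list structure and names are choices made here for illustration):

steps = [((195, 199), 2), ((190, 194), 1), ((185, 189), 3), ((180, 184), 3),
         ((175, 179), 4), ((170, 174), 7), ((165, 169), 10), ((160, 164), 8),
         ((155, 159), 6), ((150, 154), 3), ((145, 149), 2), ((140, 144), 2),
         ((135, 139), 1)]

n = sum(f for _, f in steps)                                      # N
sum_fx = sum(f * (lo + (hi - lo) / 2) for (lo, hi), f in steps)   # Σfx, x = midpoint
print("N =", n, "Σfx =", sum_fx, "M =", round(sum_fx / n, 2))     # 8684 / 52 = 167.0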

EDUCATION 602 – Statistics

Name Course Date

The Mean by the Midpoint Method


Exercise No.8 b

Scores F X fx

195-199 2 197 394

190-194 1 192 192

185-189 3 187 561

180-184 3 182 546

175-179 4 177 708

170-174 7 172 1204

165-169 10 167 1670

160-164 8 162 1296

155-159 6 157 942

150-154 3 152 456

145-149 2 147 294

140-144 2 142 284

135-139 1 137 137

N = 52 ∑fx = 8684

M = ∑fx / N = 8684 / 52 = 167

EDUCATION 602 – Statistics

Name Course Date

The Mean by the Midpoint Method


Exercise No.8 b

Scores F X fx

57-59 2 58 116

54-56 0 55 0

51-53 3 52 156

48-50 3 49 147

45-47 6 46 276

42-44 6 43 258

39-41 14 40 560

36-38 14 37 518

33-35 14 34 476

30-32 10 31 310

27-29 7 28 196

24-26 7 25 175

21-23 3 22 66

18-20 5 19 95

15-17 3 16 48

12-14 3 13 39

N = 100 ∑fx = 3436

M = ∑fx / N = 3436 / 100 = 34.36

MEAN BY THE SHORT METHOD

The short method is another method of computing the mean when the scores are many (more than 30) and are grouped into a frequency distribution. It is less cumbersome than the midpoint method because in the short method smaller figures are handled, facilitating computation.

The formula used in calculating the mean by the short method is:

M = AM + (∑fd / N) × i

Where AM = assumed mean, i.e., the midpoint of the step chosen as origin
f = frequency of every interval
d = deviation, in steps, upward or downward from the AM
i = size of the class interval

Procedures in calculating the mean by short method:

1. Lay off the frequency distribution showing the class or step intervals
(column 1) and their corresponding frequencies (column 2). Add the
frequencies to get the total number of cases (N).

2. Take the midpoint of any step as an assumed mean (AM). From this lay
off positive deviations upward and negative deviations downward
(column 3).

3. Multiply the frequencies by their respective deviation, keeping the


algebraic signs, fd (column 4).

4. Find the sum of the positive and negative products (∑fd) algebraically.

5. Divide the sum (∑fd) by N and multiply the quotient by the size of the class interval (i) to get the correction.

6. Add the correction to the assumed mean to obtain the true mean.
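A Python sketch of the short method, assuming the 165-169 step (midpoint 167) as the assumed mean, as in the exercise that follows; the variable names are illustrative.

freqs = [2, 1, 3, 3, 4, 7, 10, 8, 6, 3, 2, 2, 1]   # 195-199 down to 135-139
i = 5                                              # class interval size
am_index = 6                                       # position of the assumed-mean step
am = 167                                           # midpoint of 165-169

# d = number of steps above (+) or below (-) the assumed-mean step
sum_fd = sum(f * (am_index - k) for k, f in enumerate(freqs))
n = sum(freqs)
mean = am + (sum_fd / n) * i
print("Σfd =", sum_fd, "M =", mean)                # Σfd = 0, M = 167.0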

EDUCATION 602 – Statistics

Name Course Date

The Mean by the Short Method


Exercise No.8 c

Scores F D fd

195-199 2 6 12

190-194 1 5 5

185-189 3 4 12

180-184 3 3 9

175-179 4 2 8

170-174 7 1 7

165-169 10 0 AM = 167

160-164 8 -1 -8

155-159 6 -2 -12

150-154 3 -3 -9

145-149 2 -4 -8

140-144 2 -5 -10

135-139 1 -6 -6

N = 52 ∑fd = 0

M = AM + (∑fd / N) × i
  = 167 + (0 / 52) × 5
  = 167 + 0
  = 167

EDUCATION 602 – Statistics

Name Course Date

The Mean by the Short Method


Exercise No.8 c

Scores F D fd

57-59 2 8 16

54-56 0 7 0

51-53 3 6 18

48-50 3 5 15

45-47 6 4 24

42-44 6 3 18

39-41 14 2 28

36-38 14 1 14 133

33-35 14 0 AM = 34

30-32 10 -1 -10

27-29 7 -2 -14

24-26 7 -3 -21

21-23 3 -4 -12

18-20 5 -5 -25

15-17 3 -6 -18

12-14 3 -7 -21 -121

N = 100 ∑fd = 12

M = AM + (∑fd / N) × i
  = 34 + (12 / 100) × 3
  = 34 + 0.36
  = 34.36

Where:
AM = assumed mean
∑ = sum of
N = no. of cases
i = size of the class interval

THE MEDIAN

The median is that point on the scale above and below which lie 50% of the cases. It is a point-measure, dividing a group into two equal sub-groups. Hence, if the group is to be sectioned in two based on achievement or ability, the point of division is the median.

The median is an inspection measure. It is easily determined. If there are


few cases, the median is the middlemost score of the series of scores arranged
in order of size. If there are many cases, the median is computed by
interpolation. The ease with which the median is computed accounts for its
popularity and wide use by elementary school teachers.

The median is the most stable measure of central tendency. It is not much affected by extremely low or high scores. Hence, if there are extremely low or high scores and it is desired that these scores do not affect the average disproportionately, the median is used. Again, if there are relatively few cases, the median is computed.

The value of the median depends on the number of scores, and not on the magnitude of the scores. If most of the scores are high, the median is high; if most of the scores are low, the median is low.

Calculating the median when the data are ungrouped:

When ungrouped scores or other measures are arranged in order of size, the median is the midpoint of the series. Two situations arise in the computation of the median from ungrouped data:

a. When N is odd: Arrange the scores from the highest to the lowest, or vice versa. The middlemost score is the counting median or midscore.

b. When N is even: Arrange the scores from the highest to the lowest, or vice versa. The average of the two middlemost scores is the median.
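A short Python sketch of cases (a) and (b); the function name counting_median and the sample lists are assumptions.

def counting_median(scores):
    ordered = sorted(scores)                   # order of size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                             # N odd: the middlemost score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # N even: average of the two

print(counting_median([12, 15, 20, 22, 30]))        # 20
print(counting_median([12, 15, 20, 22, 30, 31]))    # 21.0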

EDUCATION 602 – Statistics

Name Course Date

The Median of Ungrouped Scores


Exercise No.9 a

Counting Median = Ctgn. Mdn.

GIVEN RAW SCORES:

Ctgn. Mdn.
198 173 167 158

197 173 166 157

193 173 165 157

189 172 165 156

188 172 164 156

186 171 163 153

183 171 163 152

182 169 162 151

180 168 162 148

179 168 162 147

178 167 161 143

177 167 161 142

176 167 Ctgn. Mdn.
159 136

PROCEDURE :

Arrange the scores from the highest to the lowest and get the

middle most scores for odd numbers; take the average of the

two middle most scores for even numbers.

EDUCATION 602 – Statistics

Name Course Date

The Median of Ungrouped Scores


Exercise No.9 a

Counting Median = Ctgn. Mdn.

GIVEN RAW SCORES:

59 40 35 29

57 40 34 27

52 40 34 27

51 40 34 27

51 40 34 26

50 40 34 26

49 40 34 26

48 39 34 25

46 39 34 25

46 38 33 24

46 38 33 24

45 38 33 23

45 38 32 22

45 38 32 21

44 38 32 20

43 38 32 20

43 37 31 19

43 37 31 18

42 37 31 18

42 37 31 16

41 36 30 16

41 36 30 15

41 36 29 14

41 35 29 14

41 35 29 12

Ctgn. Mdn. = (35 + 35) / 2 = 70 / 2 = 35
EDUCATION 602 – Statistics

Name Course Date

The Median by Interpolation


Exercise No.9 b

Scores     Freq.   Cumulative Freq.
195-199     2
190-194     1
185-189     3
180-184     3
175-179     4
170-174     7
165-169    10      (fm)               N/2 = 52/2 = 26
160-164     8      22   (F)           F  = 22
155-159     6      14                 fm = 10
150-154     3       8                 l  = 164.5
145-149     2       5                 i  = 5
140-144     2       3
135-139     1       1
N = 52

Formula: Mdn = l + ((N/2 - F) / fm) × i

Mdn = 164.5 + ((26 - 22) / 10) × 5
    = 164.5 + 20/10
    = 164.5 + 2
    = 166.5

Where:

l = exact lower limit of the step in which the Mdn lies

N/2 = ½ of the cases

F = partial (cumulative) sum which approaches or is equal to, but does not exceed, N/2

fm = frequency of the step in which the median lies

i = size of the class interval
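A Python sketch of the interpolation formula applied to the 52-score distribution (illustrative only; variable names are assumptions):

freqs = [1, 2, 2, 3, 6, 8, 10, 7, 4, 3, 3, 1, 2]    # 135-139 up to 195-199
lower_limits = [135 + 5 * k for k in range(len(freqs))]
i = 5
n = sum(freqs)

half = n / 2
cum = 0
for k, f in enumerate(freqs):
    if cum + f >= half:                    # this step contains the median
        l = lower_limits[k] - 0.5          # exact lower limit
        median = l + ((half - cum) / f) * i
        break
    cum += f                               # F: cumulative sum below the step

print(median)                              # 164.5 + ((26 - 22) / 10) * 5 = 166.5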

EDUCATION 602 – Statistics

Name Course Date

The Median by Interpolation


Exercise No.9 b

Scores     Freq.   Cumulative Freq.
57-59       2
54-56       0
51-53       3
48-50       3
45-47       6
42-44       6
39-41      14
36-38      14
33-35      14      (fm)               N/2 = 100/2 = 50
30-32      10      38   (F)           F  = 38
27-29       7      28                 fm = 14
24-26       7      21                 l  = 32.5
21-23       3      14                 i  = 3
18-20       5      11
15-17       3       6
12-14       3       3
N = 100

Formula: Mdn = l + ((N/2 - F) / fm) × i

Mdn = 32.5 + ((50 - 38) / 14) × 3
    = 32.5 + 36/14
    = 32.5 + 2.57
    = 35.07

Where:

l = exact lower limit of the step in which the Mdn lies

N/2 = ½ of the cases

F = partial (cumulative) sum which approaches or is equal to, but does not exceed, N/2

fm = frequency of the step in which the median lies

i = size of the class interval

THE MODE

The mode is the score which appears most frequently in a series, or which occurs the greatest number of times in a grouped distribution. It is otherwise known as the "commercial average" or typical value, as in the mode or fashion in dresses worn by the "average" woman.

Procedure in calculating the mode:

1. In a simple ungrouped series of measures the "rude" or "rough empirical" mode is that single measure or score which occurs most frequently. In case there are two most frequent scores, both are regarded as rough modes, and the group is considered "bimodal" or "polymodal".

2. When the data are grouped, the "crude" mode is the midpoint of the step with the greatest frequency.

3. The formula for approximating the true mode, when the frequency distribution is symmetrical, or at least not badly skewed, is:

Mode = 3 Mdn - 2 Mean

4. This mode is also called the refined or theoretical mode or "approximated" Pearson mode (Karl Pearson contributed the formula Mode = M - 3(M - Mdn), which reads exactly like the former). If the mean and median are equal, the mode is equal to either; if the mean is greater than the median, the mode is lowest; if the mean is lower than the median, the mode is highest.
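A brief Python sketch of the crude mode and of the Mode = 3 Mdn - 2 Mean approximation; the short sample list and the mean/median values (taken from the earlier exercises) are used only for illustration.

from collections import Counter

scores = [34, 40, 34, 37, 34, 31, 40, 38, 34, 29]
crude_mode = Counter(scores).most_common(1)[0][0]   # most frequent raw score
print("crude mode:", crude_mode)

mean, median = 167.0, 166.5                         # values from the exercises
mode = 3 * median - 2 * mean                        # Mode = 3 Mdn - 2 Mean
print("theoretical mode:", mode)                    # 165.5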

EDUCATION 602 – Statistics

Name Course Date

The Mode
Exercise No.10

A. Crude Mode:
1. Ungrouped Scores:
167 appears the greatest number of times

2. Grouped Scores:
The step(s) has the greatest number of tallies.
165-169 = 10

160-164 = 8

155-159 = 6

3. Frequency Distribution:

What are the midpoints of steps having the greatest frequency? What kind of
distribution?
165-169 = 167 (a unimodal distribution)

B. Refined Theoretical/ approximated True Mode:

1. Pearson:
Mo = M - 3( M - Mdn )
   = 167 - 3( 167 - 166.5 )
   = 167 - 1.5
   = 165.5

2. Simplified Garrett Mode:

Mo = ( 3 x Mdn ) - ( 2 x M )
   = ( 3 x 166.5 ) - ( 2 x 167 )
   = 499.5 - 334
   = 165.5

Note (Relationships) :

1. If M is greater than the median, the mode is lowest.
2. If M is smaller than the median, the mode is highest.
3. If M is equal to the median, the mode is equal to either.

EDUCATION 602 – Statistics

Name Course Date

The Mode
Exercise No.10

A. Crude Mode:

1. Ungrouped Scores:
34 appears the greatest number of times

2. Grouped Scores:
The step(s) has the greatest number of tallies.
39-41 = 14

36-38 = 14

33-35 = 14

3. Frequency Distribution:
What are the midpoints of steps having the greatest frequency? What kind of
distribution?
39-41 = 40

36-38 = 37

33-35 = 34

B. Refined Theoretical / Approximated True Mode / Pearson and Garrett Mode:

1. Pearson:

Mo = M - 3( M - Mdn )
   = 34.36 - 3( 34.36 - 35.07 )
   = 34.36 + 2.13
   = 36.49

2. Simplified Garrett Mode:

Mo = ( 3 x Mdn ) - ( 2 x M )
   = ( 3 x 35.07 ) - ( 2 x 34.36 )
   = 105.21 - 68.72
   = 36.49

MEASURES OF VARIABILITY

Ordinarily, after calculating a measure of central tendency, the next step is to find some measure of the variability of our scores, that is, of the "scatter" or "dispersion" of the separate scores around the central tendency. Four measures have been devised to indicate the variability within a set of measures: the range, the quartile deviation (Q), the average deviation (AD), and the standard deviation (SD or σ).

Calculating the range:
The range is the interval between the highest and the lowest scores. It is the most general measure of spread or scatter, and is computed when we wish to make a rough comparison of the variability of two or more groups. The range takes account of the extremes of the series of scores only and is unreliable when N is small, or when there are large gaps (i.e. zero f's) in the frequency distribution. In a frequency distribution, the range is taken to be equal to the difference between the midpoint of the highest step and the midpoint of the lowest step.
Computing the Quartile Deviation (Q):
The quartile deviation or Q is one-half the scale distance between the 75th and the 25th percentiles. The 25th percentile or Q1 is the first quartile on the score scale, the point below which lie 25% of the scores. The 75th percentile or Q3 is the third quartile on the score scale, the point below which lie 75% of the scores. When we have these two points, Q is found from the formula:

Q = ( Q3 - Q1 ) / 2

To find Q it is clear that we must first compute the 75th and 25th percentiles. These statistics are found in exactly the same way as was the median, which is, of course, the 50th percentile or Q2. The only difference is that ¼ of N is counted off from the low end of the distribution to find Q1 and that ¾ of N is counted off to find Q3. The formulas are:

Q1 = l + ((N/4 - F) / fq) × i     and     Q3 = l + ((3N/4 - F) / fq) × i

Where:
l = the exact lower limit of the interval in which the quartile falls
F = the cumulative sum of all frequencies from the lowest step, which sum approaches or is equal to (but does not exceed) N/4 for Q1 or 3N/4 for Q3
fq = the frequency on the interval containing the quartile


The quartiles Q1 and Q3 mark off the limits of the middle 50% of the scores in the distribution, and the distance between these two points is called the interquartile range. Q is ½ the range of the middle 50%, or the semi-interquartile range. Since Q measures the average distance of the quartile points from the median, it is a good index of score density. If the scores in the distribution are packed closely together, the quartiles will be near one another and Q will be small. If the scores are widely scattered, the quartiles will be relatively far apart and Q will be large.
The quartile deviation is used with the median. If the distribution is assumed to be normal, the median plus the quartile deviation gives the upper quartile (Mdn + Q = Q3); the median minus the quartile deviation gives the lower quartile (Mdn - Q = Q1). In a normal distribution, Q is called the probable error or PE (of the normal probability curve).
The quartile deviation may be used in sectioning or classifying pupils in a group or in the distribution of grades. For instance, from Mdn + Q to the highest score, section A or grade of A or 1; from Mdn to Mdn + Q, section B or grade of B or 2; from Mdn - Q to Mdn, section C or grade of C or 3; from the lowest score to Mdn - Q, section D or grade of D or 4. With this there will be four sections with practically equal numbers of pupils in each.
The quartile deviation indicates the homogeneity or heterogeneity
of the group. The smaller the Q, the more homogeneous is the group; the greater
the Q, the more heterogeneous is the group.
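A Python sketch of Q computed by the same interpolation used for the median, applied to the 52-score distribution (an illustration, not the manual's worksheet; the helper name percentile_point is an assumption):

freqs = [1, 2, 2, 3, 6, 8, 10, 7, 4, 3, 3, 1, 2]     # 135-139 up to 195-199
i, low = 5, 135
n = sum(freqs)

def percentile_point(count):
    """Interpolated score point below which `count` cases fall."""
    cum = 0
    for k, f in enumerate(freqs):
        if cum + f >= count:
            l = low + k * i - 0.5                     # exact lower limit
            return l + ((count - cum) / f) * i
        cum += f

q1 = percentile_point(n / 4)       # 25th percentile
q3 = percentile_point(3 * n / 4)   # 75th percentile
print(q1, q3, (q3 - q1) / 2)       # ~158.67, 174.5, Q ~ 7.92 (the worksheet, rounding Q1 first, gets 7.91)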

EDUCATION 602 - Statistics

Name Course Date

The Quartile Deviation


Exercise No.11

Scores Freq. Q1 Q2 Formulas:

195-199 2

190-194 1 Q3 – Q1
Q =
185-189 3 2

180-184 3

175-179 174.5 4 fq

170-174 7 39 F N
- F
165-169 10 32 4 i
Q1 = l+
160-164 8 22 fq

155-159 154.5 6 fq 14

150-154 3 8 F 8

145-149 2 5 5 3N
- F
140-144 2 3 3 Q3 = l+ 4 i

135-139 1 1 1 fq

N= 52

Q1 = l + ((N/4 - F) / fq) × i
   = 154.5 + ((13 - 8) / 6) × 5
   = 154.5 + 25/6
   = 154.5 + 4.17
   = 158.67

Q3 = l + ((3N/4 - F) / fq) × i
   = 174.5 + ((39 - 39) / 4) × 5
   = 174.5 + 0
   = 174.5

Q = (Q3 - Q1) / 2 = (174.5 - 158.67) / 2 = 7.91

Note: If Q is 10 or more, the group is heterogeneous; if Q is less than 10, the group is homogeneous.  ->  Homogeneous

EDUCATION 602 - Statistics

Name Course Date

The Quartile Deviation


Exercise No.11

Scores Freq. Q1 Q2 Formulas:

57-59 2

54-56 0 Q3 – Q1
Q =
51-53 3 2

48-50 3

45-47 6

42-44 6 N

39-41 14 f 4 - F
i
q Q1 = l+
36-38 14 66 F fq

33-35 14 52

30-32 10 38

27-29 7 f 28 3N
q - F
i
24-26 7 21 F 21 Q1 = l+ 4

21-23 3 14 14 fq

18-20 5 11 11

15-17 3 6 6

12-14 3 3 3

N = 100

Q1 = l + ((N/4 - F) / fq) × i
   = 26.5 + ((25 - 21) / 7) × 3
   = 26.5 + 1.714
   = 28.214

Q3 = l + ((3N/4 - F) / fq) × i
   = 38.5 + ((75 - 66) / 14) × 3
   = 38.5 + 1.929
   = 40.429

Q = (Q3 - Q1) / 2 = (40.429 - 28.214) / 2 = 6.108 or 6.11

Note: If Q is 10 or more, the group is heterogeneous; if Q is less than 10, the group is homogeneous.  ->  Homogeneous

THE AVERAGE DEVIATION

The average deviation or AD (also written as mean deviation or MD) is the mean of the deviations of all the separate scores in a series taken from their mean (occasionally from the median or mode). It includes the middle of the cases. It is larger than the quartile deviation.
The average deviation is affected by every score; hence, if it is desired to have every score carry weight in the measure of variability, the average deviation should be used.
The AD, like other measures of dispersion, is used in determining the extent of difference or variability among the members of a group. The higher the AD, the more variable or heterogeneous is the group; the smaller the AD, the more compact or homogeneous is the group. From this it may be seen that the AD can be used in classifying pupils.
The long method of computing the average deviation is used when there are few cases, not more than 30. The short method is used when there are many cases grouped with intervals. The former method is simple but laborious; the latter is more complicated but requires a shorter time if there are many cases.
Computation of the AD from ungrouped scores:
To find the AD, no account is taken of signs, and all deviations, whether plus or minus, are treated as positive. The formula for the AD of ungrouped scores is:

AD = ∑|x| / N

in which the bars | | enclosing the x indicate that signs are disregarded in arriving at the sum. As always, x is the deviation of a score from the mean, i.e. X - M = x.
1. Find the arithmetic mean by the long method.

2. Subtract the mean from every score to get the deviation (X).

3. Add the deviations arithmetically, i.e., regardless of the positive and


negative signs.

4. Divide the sum of the deviations (∑/X/) by the number of cases (N).

Calculating the AD from Grouped Data:

The AD is rarely used in modern statistics, but it is often found in the older experimental literature. Should the student find it necessary to compute the AD from grouped data, the formula is:

AD = ∑|fx| / N     where fx = the product of the deviations by their frequencies

1. Compute the arithmetic mean by the short method.

2. Subtract the mean from the midpoint of every step to find the deviation
(x= Midpoint – Mean).

3. Multiply each deviation by the corresponding frequency to obtain fx.

4. Add the product (fx) arithmetically to get their sum (∑ /fx/).

5. Divide the arithmetic sum of the product (∑ /fx/) by the number of cases (N). The
quotient is the AD.
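A Python sketch of the grouped-data formula using the 52-score distribution and the mean of 167 found earlier (illustrative only; the list structure is an assumption):

steps = [((195, 199), 2), ((190, 194), 1), ((185, 189), 3), ((180, 184), 3),
         ((175, 179), 4), ((170, 174), 7), ((165, 169), 10), ((160, 164), 8),
         ((155, 159), 6), ((150, 154), 3), ((145, 149), 2), ((140, 144), 2),
         ((135, 139), 1)]
mean = 167.0                                   # from the earlier exercise

n = sum(f for _, f in steps)
sum_abs_fx = sum(f * abs((lo + hi) / 2 - mean) for (lo, hi), f in steps)
print("Σ|fx| =", sum_abs_fx, "AD =", round(sum_abs_fx / n, 2))   # 530, 10.19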

EDUCATION 602 - Statistics

Name Course Date

The Average Deviation


Exercise No.12

Given in the previous exercise = 167

Scores Frequencies Midpoints X Fx

195-199 2 197 30 60

190-194 1 192 25 25

185-189 3 187 20 60

180-184 3 182 15 45

175-179 4 177 10 40

170-174 7 172 5 35

165-169 10 167 0 0

160-164 8 162 -5 40

155-159 6 157 -10 60

150-154 3 152 -15 45

145-149 2 147 -20 40

140-144 2 142 -25 50

135-139 1 137 -30 30

N = 52 /fx/ = 530

Midpt. = l + (ul - ll) / 2 = 195 + (199 - 195) / 2 = 195 + 2 = 197

AD = ∑|fx| / N = 530 / 52 = 10.19

If AD is 12 or more, the group is heterogeneous.

If AD is less than 12, the group is homogeneous.

EDUCATION 602 - Statistics

Name Course Date

The Average Deviation


Exercise No.12

Given in the previous exercise = 34.36

Scores Frequencies Midpoints X Fx

57-59 2 58 23.64 47.28

54-56 0 55 20.64 0

51-53 3 52 17.64 52.92

48-50 3 49 14.64 43.92

45-47 6 46 11.64 69.84

42-44 6 43 8.64 51.84

39-41 14 40 5.64 78.96

36-38 14 37 2.64 36.96

33-35 14 34 -0.36 5.04

30-32 10 31 -3.36 33.6

27-29 7 28 -6.36 44.52

24-26 7 25 -9.36 65.52

21-23 3 22 -12.36 37.08

18-20 5 19 -15.36 76.8

15-17 3 16 -18.36 55.08

12-14 3 13 -21.36 64.08

N = 100 /fx/ = 763.44

Midpt. = l + (ul - ll) / 2 = 57 + (59 - 57) / 2 = 57 + 1 = 58

AD = ∑|fx| / N = 763.44 / 100 = 7.63     Homogeneous

If AD is 12 or more, the group is heterogeneous.

If AD is less than 12, the group is homogeneous.

THE STANDARD DEVIATION

The standard deviation or SD is the most stable index of variability and


is customarily employed in experimental work and in research studies. The SD differs from the AD in several respects. In computing the AD, we disregard signs and treat all deviations as positive, whereas in finding the SD we avoid the difficulty of signs by squaring the separate deviations. The squared deviations used in computing the SD are always taken from the mean, never from the median or mode. The conventional symbol for the SD is the Greek letter sigma (σ).

The SD is less affected by sampling errors than the Q or the AD. In a normal, or nearly symmetrical, distribution the mean ± 1 SD marks the limits of the middle 68.26% (roughly the middle two-thirds) of the distribution. The SD is, therefore, larger than the AD, which is in turn larger than Q. These relationships supply a rough check upon the accuracy of the measures of variability.

With the sigma (SD) known, it is possible to estimate the other measures: AD = 0.7979·SD; Q = 0.6745·SD; PE (probable error of the distribution) = 0.6745·SD; V (coefficient of variability) = 100·SD/M.

In the distribution of grades in the transmutation of raw scores, the SD is


used with the arithmetic mean. To get the limits, the SD is multiplied by 0.5 and
by 1.5 and the result is added to, and subtracted from the mean.

LIMITS GRADE or RATINGS

M + 1.5 SD to highest scores A or 1

M + 0.5 SD to M + 1.5 SD B or 2

M – 0.5 SD to M + 0.5 SD C or 3

M – 1.5 SD to M – 0.5 SD D or 4

Lowest score to M – 1.5 SD E or 5

The SD is also used in the comparison of groups. The higher the SD, the
more heterogeneous is the group. The smaller is the SD, the more homogeneous
is the group.
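The limits table above translates directly into a small grading routine. The Python sketch below is illustrative only; the function name is an assumption, and the sample M and SD are the values that appear later in Exercise No. 22b (M = 34.36, SD = 9.76).

```python
# Letter grade from the mean and SD, following the limits table above:
# A: M + 1.5 SD and up; B: M + .5 SD to M + 1.5 SD; C: M - .5 SD to M + .5 SD; etc.
def grade(score, mean, sd):
    if score >= mean + 1.5 * sd: return "A"
    if score >= mean + 0.5 * sd: return "B"
    if score >= mean - 0.5 * sd: return "C"
    if score >= mean - 1.5 * sd: return "D"
    return "E"

mean, sd = 34.36, 9.76           # values used in Exercise No. 22b
for s in (59, 45, 34, 25, 12):   # a few sample raw scores
    print(s, grade(s, mean, sd))
```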

EDUCATION 602 - Statistics

Name Course Date

The Standard Deviation by the Long Method


Exercise No.13a

Formula :

SD = √( Σfd² / N )

Where:
d = deviation of every midpoint from the mean (d = Midpt. – M)
fd² = fd × d, i.e., each squared deviation weighted by its frequency

(Scratch work for the root extraction: √(9300/52) = √178.846 ≈ 13.373; re-check: 13.373 × 13.373 ≈ 178.85.)

EDUCATION 602 - Statistics

Name Course Date

The Standard Deviation by the Long Method


Exercise No.13a

Scores Freq. Midpt. D fd fd2

195-199 2 197 30 60 1800

190-194 1 192 25 25 625

185-189 3 187 20 60 1200

180-184 3 182 15 45 675

175-179 4 177 10 40 400

170-174 7 172 5 35 175

165-169 10 167 0 0 0

160-164 8 162 -5 40 200

155-159 6 157 -10 60 600

150-154 3 152 -15 45 675

145-149 2 147 -20 40 800

140-144 2 142 -25 50 1250

135-139 1 137 -30 30 900

∑ fd2 = 9300

SD = √( Σfd² / N ) = √( 9300 / 52 ) = √178.85 = 13.37

Interpretation :

If SD is less than 15, the group is homogeneous.

If SD is 15 or more, the group is heterogeneous.

EDUCATION 602 - Statistics

Name Course Date

The Standard Deviation by the Long Method


Exercise No.13a

Scores Freq. Midpt. D fd fd2

57-59 2 58 23.64 47.28 1117.699

54-56 0 55 20.64 0 0

51-53 3 52 17.64 52.92 933.5088

48-50 3 49 14.64 43.92 642.9888

45-47 6 46 11.64 69.84 812.9376

42-44 6 43 8.64 51.84 447.8976

39-41 14 40 5.64 78.96 445.3344

36-38 14 37 2.64 36.96 97.5744

33-35 14 34 -0.36 5.04 1.8144

30-32 10 31 -3.36 33.6 112.896

27-29 7 28 -6.36 44.52 283.1472

24-26 7 25 -9.36 65.52 613.2672

21-23 3 22 -12.36 37.08 458.3088

18-20 5 19 -15.36 76.8 1179.648

15-17 3 16 -18.36 55.08 1011.269

12-14 3 13 -21.36 64.08 1368.749

∑ fd2 = 9527.04

SD = √( Σfd² / N ) = √( 9527.04 / 100 ) = √95.27 = 9.76

Interpretation :

If SD is less than 15, the group is homogeneous.

If SD is 15 or more, the group is heterogeneous.

Calculating the SD from the Ungrouped Scores:

The formula is:

SD = √( Σx² / N ),  where x² = the squared deviations from the mean

1. Find the mean.


2. Subtract the arithmetic mean algebraically from every score to get the
deviation (x). When the score is numerically greater than the mean, the x
will be plus; when numerically less than the mean, the x will be minus.
3. Square each deviation (x2).
4. Add the squared deviations to get their sum (Σx²).
5. Divide the sum (Σx²) by the number of cases (N) to get the mean of the squared deviations.
6. Extract the square root of the mean of the squared deviations. The result is the SD or sigma.
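The six steps above can be checked in Python; this is only an illustrative sketch with hypothetical scores.

```python
import math

# SD of ungrouped scores: SD = sqrt( sum((X - M)^2) / N )
scores = [12, 15, 11, 18, 14]                    # hypothetical raw scores
mean = sum(scores) / len(scores)                 # step 1
sq_devs = [(x - mean) ** 2 for x in scores]      # steps 2-3: squared deviations
sd = math.sqrt(sum(sq_devs) / len(scores))       # steps 4-6
print(f"M = {mean:.2f}, SD = {sd:.2f}")
```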

Calculating of the Grouped Data:

(a) By the long Method:

The process is identical with that used for ungrouped items except that, in addition to squaring the deviation (represented by x or d) of each midpoint from the mean, we weight each of these squared deviations by the frequency which it represents – that is, by the frequency opposite it.

SD = √( Σfd² / N ),  where fd² = d × fd

(b) By the Short Method:

The short method used in calculating the mean consisted essentially in “guessing” or assuming the mean, and later applying a correction to give the actual mean. It is a decided time- and labor-saver in dealing with grouped data, and is well-nigh indispensable in the calculation of σ's in a correlation table.

SD = √( Σfd²/N – (Σfd/N)² )

1. Determine the point of origin (0) as in the computation of the


mean by the short method.

2. From this lay off (d), positive deviation (1, 2, 3, etc.) upward and
negative deviations (-1, -2, -3, etc.) downward.

3. Multiply each deviation (d) by its frequency (f) to fill the fd column.

4. Multiply each deviation (d) by the frequency times deviation (fd) to obtain fd² (column 5). This is the same as squaring each deviation and multiplying it by its corresponding frequency.

5. Find the algebraic sum of fd and fd 2 to get ∑fd and ∑fd2


respectively.

6. Apply the formula.
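The short-method steps are sketched in Python below for illustration, using the frequencies and midpoints of Exercise No. 12 with an assumed origin of 167. Note that Σfd is the algebraic sum (signs retained).

```python
import math

# Short-method SD for grouped data: SD = sqrt( sum(fd^2)/N - (sum(fd)/N)^2 ),
# with d measured in score units from an assumed origin, so no unit change is needed.
freqs     = [2, 1, 3, 3, 4, 7, 10, 8, 6, 3, 2, 2, 1]
midpoints = [197, 192, 187, 182, 177, 172, 167, 162, 157, 152, 147, 142, 137]
origin = 167                                                     # step 1: assumed origin

n = sum(freqs)
fd  = [f * (m - origin) for f, m in zip(freqs, midpoints)]       # step 3
fd2 = [f * (m - origin) ** 2 for f, m in zip(freqs, midpoints)]  # step 4
sd = math.sqrt(sum(fd2) / n - (sum(fd) / n) ** 2)                # step 6
print(f"sum fd = {sum(fd)}, sum fd^2 = {sum(fd2)}, SD = {sd:.2f}")
```

Because the assumed origin chosen here happens to be the actual mean (167), the correction term (Σfd/N)² vanishes and the result agrees with the long-method value of 13.37.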

(c) By the percentile Method:

The SD may be estimated fairly accurately by means of the


formula:

SD ≈ 0.4 × D, or SD ≈ 0.4 (P90 – P10), where D = P90 – P10

1. Obtain the 90th percentile.


2. Obtain the 10th percentile.
3. Substitute in the formula.

Though Q is used much more frequently than D, the latter is a far


better percentile measure of variability and is considerably easier to
compute.

EDUCATION 602 - Statistics

Name Course Date

The Standard Deviation by the Short Method


Exercise No.13b

Scores Freq. Midpt. D fd fd2

195-199 2 197 30 60 1800

190-194 1 192 25 25 625

185-189 3 187 20 60 1200

180-184 3 182 15 45 675

175-179 4 177 10 40 400

170-174 7 172 5 35 175

165-169 10 167 0 0 0

160-164 8 162 -5 40 200

155-159 6 157 -10 60 600

150-154 3 152 -15 45 675

145-149 2 147 -20 40 800

140-144 2 142 -25 50 1250

135-139 1 137 -30 30 900

∑fd= 530 ∑ fd2 = 9300

SD = √( Σfd²/N – (Σfd/N)² )

   = √( 9300/52 – (530/52)² )

   = √74.96

   = 8.66     Homogeneous

Interpretation :

If SD is less than 15, the group is homogeneous.

If SD is 15 or more, the group is heterogeneous.

EDUCATION 602 - Statistics

Name Course Date

The Standard Deviation by the Short Method


Exercise No.13b

Scores Freq. Midpt. D fd fd2

57-59 2 58 23.64 47.28 1117.699

54-56 0 55 20.64 0 0

51-53 3 52 17.64 52.92 933.5088

48-50 3 49 14.64 43.92 642.9888

45-47 6 46 11.64 69.84 812.9376

42-44 6 43 8.64 51.84 447.8976

39-41 14 40 5.64 78.96 445.3344

36-38 14 37 2.64 36.96 97.5744

33-35 14 34 -0.36 5.04 1.8144

30-32 10 31 -3.36 33.6 112.896

27-29 7 28 -6.36 44.52 283.1472

24-26 7 25 -9.36 65.52 613.2672

21-23 3 22 -12.36 37.08 458.3088

18-20 5 19 -15.36 76.8 1179.648

15-17 3 16 -18.36 55.08 1011.269

12-14 3 13 -21.36 64.08 1368.749

∑fd =763.44 ∑ fd2 = 9527.04

SD = √( Σfd²/N – (Σfd/N)² )

   = √( 9527.04/100 – (763.44/100)² )

   = √( 95.2704 – 58.2841 )

   = √36.9863

   = 6.08     Homogeneous

Interpretation :

If SD is less than 15, the group is homogeneous.

If SD is 15 or more, the group is heterogeneous.

THE CUMULATIVE FREQUENCY GRAPH

The cumulative frequency graph is another way of representing a


frequency distribution by means of a diagram. Before we can plot a cumulative
frequency graph, the scores of the distribution must be added serially or
cumulated. The height of the graph indicates the total number of frequencies.

CONSTRUCTION OF THE CUMULATIVE FREQUENCY GRAPH:

1. Find the cumulative frequency of every step beginning with the lowest step
by adding the f’s cumulatively upward. (The last cumulative f is equal to
N).

2. Draw OX and OY, as in the frequency polygon or histogram.

3. Lay off the intervals or steps on OX and mark on OY successive unit


distances to represent the cumulative frequencies on different steps. (Use
the exact limits of the intervals).

4. Read off the steps, together with the corresponding cumulative
frequencies and place points through the exact upper limits of the steps.

5. Connect the successive points by straight lines, and at the lower end drop a line to the exact lower limit of the lowest step (also the exact upper limit of the step next below the lowest, the f of which is 0).
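The cumulation in step 1 can be done mechanically, as in the Python sketch below. The intervals and frequencies are those of the 135–199 distribution used in the earlier exercises; the layout of the code itself is an assumption.

```python
# Cumulative frequencies, added serially from the lowest step upward.
intervals = [(135, 139), (140, 144), (145, 149), (150, 154), (155, 159),
             (160, 164), (165, 169), (170, 174), (175, 179), (180, 184),
             (185, 189), (190, 194), (195, 199)]          # lowest step first
freqs = [1, 2, 2, 3, 6, 8, 10, 7, 4, 3, 3, 1, 2]

cum = 0
for (lo, hi), f in zip(intervals, freqs):
    cum += f
    # the point is plotted at the exact upper limit (hi + 0.5) against cum
    print(f"{lo}-{hi}: f = {f:2d}, cum f = {cum}")
print("Last cumulative f equals N:", cum == sum(freqs))
```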

Name Course Date

CALCULATION OF PERCENTILES IN A FREQUENCY
DISTRIBUTION

We have learned that the median is that point in a frequency distribution


below which lie 50% of the measures or scores; and that Q1 and Q3 mark points in the distribution below which lie, respectively, 25% and 75% of the measures or scores. Using the same method by which the median and the quartiles were found, we may compute points below which lie 10%, 43%, 85%, or any percent of the scores. These points are called percentiles and are designated, in general, by the symbol Pp, the subscript p referring to the percentage of cases below the given value. P10, for example, is the point below which lie 10% of the scores. It is evident that the median, expressed as a percentile, is P50; also, Q1 is P25 and Q3 is P75.

The method of calculating percentiles is essentially the same as that


employed in finding the median. The formula is

Pp = l + ( (pN – F) / fp ) × i

Where:

Pp = the percentile wanted, e.g., P10, P33, etc.

l = exact lower limit of the class interval upon which Pp lies.

pN = part of N to be counted off in order to reach Pp.

F = sum of all the frequencies upon all intervals below l.

fp = number of scores (f) within the interval upon which Pp falls.

i = length or size of the class interval.

Procedure:

1. Multiply N by the percentile desired.

2. Add the frequencies cumulatively upward to get the partial sum (F), which should approach or equal, but not exceed, the percentile sum (pN).

3. Subtract the partial sum (F) from the corresponding percentile sum (pN) and divide the difference by the frequency (fp) of the step containing the percentile desired; multiply the quotient by the size of the class interval (i) to get the correction.

4. Add the correction to the exact lower limit (l) of the step containing the percentile desired. The result is the percentile.
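For illustration, the formula and the four steps are sketched below in Python against the 135–199 distribution of the earlier exercises. The helper name and interval layout are assumptions, not part of the original procedure.

```python
# Percentile point in a grouped frequency distribution:
#   Pp = l + ((p*N - F) / fp) * i
def percentile(intervals, freqs, p):
    n = sum(freqs)
    target = p * n                       # step 1: part of N to count off
    cum = 0
    for (lo, hi), f in zip(intervals, freqs):   # lowest step first
        if cum + f >= target and f > 0:  # interval upon which Pp falls
            l = lo - 0.5                 # exact lower limit
            i = hi - lo + 1              # size of the class interval
            return l + (target - cum) / f * i
        cum += f
    return None

intervals = [(135, 139), (140, 144), (145, 149), (150, 154), (155, 159),
             (160, 164), (165, 169), (170, 174), (175, 179), (180, 184),
             (185, 189), (190, 194), (195, 199)]
freqs = [1, 2, 2, 3, 6, 8, 10, 7, 4, 3, 3, 1, 2]
print(round(percentile(intervals, freqs, 0.50), 2))   # the median (P50)
print(round(percentile(intervals, freqs, 0.25), 2))   # Q1 (P25)
```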

P0 and P100 mark the exact lower limit of the first interval the exact upper
limit of the last interval, respectively. These two percentiles represent limiting
points. Their principal value is to indicate the boundaries of the percentile scale.

Percentiles are used in the transmutation of raw scores. A pupil whose score is equal to P20 surpasses 20% of the group and is surpassed by 80%. A percentile, therefore, indicates the percentage of pupils surpassed by a pupil and the percentage of pupils that surpass him.

Percentiles can be used as grades, and are more reliable and comparable
than grades, letters, or numbers. Under a strict or a lenient teacher, percentiles
which are numerically equal have the same meaning. A grade P35 given by a
strict teacher is equal to a grade of P35 given by a lenient teacher.

Name Course Date

CALCULATING OF PERCENTILE RANKS IN A FREQUENCY
DISTRIBUTION

Percentiles are points in a continuous distribution below which lie given percentages of N. The percentile rank is the position on a scale of 100 to which a subject's score entitles him. In calculating percentiles we start with a certain percent of N, say 15%; we then count into the distribution the given percent of the cases, and the point reached is the required percentile. Calculating percentile ranks is the reverse of this process: here we begin with an individual score and determine the percentage of scores which lies below it. If this percentage is 62, for example, the score has a percentile rank or PR of 62 on a scale of 100.

Formula:  PR = [ (f/i) × (score – l) + cum.f ] × 100 / N
Where:

f = frequency of the interval where the score falls

i = size of the class interval

l = exact lower limit of the interval where the score falls

cum f. = number of scores (cumulative frequency below l)

N = total number of cases

Procedure in Calculating PR:

1. Determine the class interval or step where a given score falls.

2. Determine the frequency (f) on this interval and divide it by i.

3. Multiply the difference of the exact lower limit (l) and the given score by
the quotient obtained in step 2.

4. Add the product obtained in step 3 to the cumulative sum of the frequencies (cum. f) below l, divide the resulting sum by the total number of cases (N), and multiply by 100. This gives the PR of the given score.
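A minimal Python sketch of the PR formula follows, again using the 135–199 distribution for illustration; the function name is hypothetical.

```python
# Percentile rank of a given score:
#   PR = 100/N * ( (f/i) * (score - l) + cum_f )
def percentile_rank(intervals, freqs, score):
    n = sum(freqs)
    cum = 0
    for (lo, hi), f in zip(intervals, freqs):       # lowest step first
        l, u = lo - 0.5, hi + 0.5                   # exact limits
        if l <= score <= u:
            i = u - l
            return 100.0 / n * ((f / i) * (score - l) + cum)
        cum += f
    return None

intervals = [(135, 139), (140, 144), (145, 149), (150, 154), (155, 159),
             (160, 164), (165, 169), (170, 174), (175, 179), (180, 184),
             (185, 189), (190, 194), (195, 199)]
freqs = [1, 2, 2, 3, 6, 8, 10, 7, 4, 3, 3, 1, 2]
print(round(percentile_rank(intervals, freqs, 166.5), 1))   # about 50 (the median point)
```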

Name Course Date

THE CUMULATIVE FREQUENCY CURVE OR OGIVE

The cumulative frequency curve or ogive differs from the cumulative


frequency graph in that frequencies are expressed as cumulative percents of N
on the Y axis instead of as cumulative frequencies.

Construction of the ogive:

1. Lay off a cumulative percentage distribution:

a. List the class intervals and their frequencies (columns 1 and


2)
b. Cumulate the f’s from the low end of the distribution upward
(column 3)
c. Compute the cumulative percents by dividing each cum. f by N. A better method is to determine first the reciprocal 1/N, called the rate, and multiply each cumulative f in order by this fraction (column 4).

2. Plot the ogive from the data in column 4, that is, the cumulative percent frequencies.

a. Draw OX and OY, as in the frequency polygon or histogram


or cumulative frequency graph
b. Lay off the exact interval limits of the distribution on OX and
mark off on OY successive units distances to represent the
cumulative percentages on the different steps.
c. Read off the steps, together with the corresponding
cumulative percentages and place points through the exact
upper limits of the steps.
d. Connect the successive points by straight lines, and at the
lower end drop a line to the exact limit of the lowest step.

Uses of the ogive:

1. Percentiles and percentile ranks may be determined quickly and fairly


accurately from the ogive. To obtain P50, the median for example draw a
line from 50 on the Y scale parallel to the X axis. This will locate the
median approximately. In order to read the percentile ranks of a given

score from the ogive, reverse the process in determining percentiles.
Percentiles and percentile ranks will often be slightly in error when read
from the ogive, but this can be made very small when the curve is
carefully drawn, the scale division precisely marked, and the diagram
fairly large.

2. A useful comparison of two or more groups is provided when ogives


representing their scores on a given test are plotted upon the same
coordinate axis. Differences in achievement as between the groups are
shown by the distances separating the two curves at various levels.

3. Percentile norms may be determined directly from the smoothed ogives.


(Norms are measures of achievement which represent the typical
performance of some designated group or groups). However, percentile
norms read from an ogive are not strictly accurate, but the error is slight
except at the top and bottom of the distribution. Estimates of these
extreme percentiles from smoothed ogives are probably more nearly true
values than are calculated points, since the smoothed curve represents
what we might expect to get from large groups or additional samplings.

Name Course Date

MEASURING DIVERGENCE FROM NORMALITY

To find the divergence of the actual distribution (represented by the
histogram) from the best fitting normal curve that has been superimposed,
both the skewness and the kurtosis should be computed.

A useful index of skewness is given by the formula

Sk = 3(M – Mdn) / SD     or     Sk = (P90 + P10)/2 – P50

A distribution is said to be skewed when the M and the MDN fall at


different points in the distribution, and the balance (or center of gravity) is
shifted to one side or the other – to the left or right. In a normal distribution,
the M equals the MDN exactly and the skewness is of course zero. The
more nearly the distribution approaches the normal form, the closer
together are the M and the MDN, and the less the skewness. Distributions are said to be skewed negatively (to the left) when scores are massed at the high end of the scale (right end) and are spread out more gradually toward the low end (left). Distributions are skewed positively (to the right) when scores are massed at the low (left) end of the scale and are spread out gradually toward the high or right end. Moreover, when skewness is negative, the M lies to the left of the MDN, and when skewness is positive, the M lies to the right of the MDN.

The term kurtosis refers to the peakedness or flatness of a


frequency distribution as compared with the normal. A frequency
distribution more peaked than normal is said to be leptokurtic; one flatter
than the normal, platykurtic. A normal curve is called mesokurtic. A formula
for measuring Kurtosis is :
Ku = Q / (P90 – P10)

For normal curve the formula gives Ku = 0.263. If Ku is greater than


0.263 the distribution is platykurtic, if less than 0.263 the distribution is
leptokurtic.
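Both indices are easy to compute once the percentiles are known. The Python sketch below is illustrative only; the sample scores are hypothetical, and the built-in statistics.quantiles function is used simply to get rough percentile estimates.

```python
# Sk = 3(M - Mdn)/SD  or  Sk = (P90 + P10)/2 - P50 ;  Ku = Q / (P90 - P10)
import statistics

scores = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20, 22, 25, 31]   # hypothetical

mean = statistics.mean(scores)
median = statistics.median(scores)
sd = statistics.pstdev(scores)
pct = statistics.quantiles(scores, n=100, method="inclusive")   # P1 ... P99
p10, p25, p50, p75, p90 = pct[9], pct[24], pct[49], pct[74], pct[89]
q = (p75 - p25) / 2                                             # quartile deviation

sk1 = 3 * (mean - median) / sd
sk2 = (p90 + p10) / 2 - p50
ku = q / (p90 - p10)                # 0.263 for a normal curve

print(f"Sk = {sk1:.2f} (or {sk2:.2f}), Ku = {ku:.3f}")
```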

Name Course Date

PLOTTING THE BEST FITTING NORMAL CURVE

A normal curve of the same N, M, SD, as the actual distribution may


be superimposed on the histogram or frequency polygon of the distribution.
Such a model curve is the best-fitting normal distribution. The research worker often wishes to compare his distribution by eye with the normal curve which “best fits” the data, and such a comparison may profitably be made even if no measures of divergence from normality are computed. In fact, the direction and extent of asymmetry often strike us more convincingly when seen in a graph than when expressed by measures of skewness and kurtosis. It may be noted that a normal curve can always be readily constructed by following the procedures given here, provided the area N and the variability SD are known.

The scores of the actual frequency distribution should be


represented by a histogram instead of by a frequency polygon in order to
prevent coincidence of the surface outlines and to bring out more clearly
agreement and disagreement at different points. To plot a normal curve
over this histogram, we first compute the height of the maximum ordinate. The maximum ordinate y₀ can be determined from the equation of the normal curve:

y₀ = N / ( σ √(2π) )

Where:

σ = the SD of the distribution expressed in units of class intervals (σ = SD / i), because the units on the X axis are in terms of class intervals;

√(2π) = √(2 × 3.1416) = 2.51 (constant).

For example, if N = 52, SD = 13.37, i = 5, and M = 167, then σ = 13.37/5 = 2.67 and

y₀ = 52 / (2.67 × 2.51) = 52 / 6.7 = 7.76, or 7.8,

the height of the maximum ordinate of the best-fitting normal curve. This point is at the middlemost point of the histogram.

Note:

Round off your answer to one decimal for convenience in plotting the
normal curve.

Knowing y₀, we are able to compute, from Table B, the heights of the ordinates at given distances from the mean:

±1σ: 0.60653 × 7.8 = 4.7

±2σ: 0.13534 × 7.8 = 1.0

±3σ: 0.01111 × 7.8 = 0.09 or 0.1

The normal probability curve may be sketched in without much


difficulty through the ordinates at these seven points. Somewhat greater accuracy will be obtained if various intermediate ordinates, for example at ±0.5σ, ±1.5σ, etc., are also plotted.
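The maximum ordinate and the intermediate ordinates can be computed as below. This is only a sketch using the example's N = 52, SD = 13.37, and i = 5; the tabled ratios (0.60653, 0.13534, 0.01111) are simply e^(-x²/2) for x = 1, 2, 3.

```python
import math

# Height of the best-fitting normal curve: y0 = N / (sigma * sqrt(2*pi)),
# with sigma expressed in class-interval units (SD / i).
N, SD, i = 52, 13.37, 5
sigma = SD / i                                  # about 2.67 interval units
y0 = N / (sigma * math.sqrt(2 * math.pi))       # maximum ordinate, about 7.8
print(f"y0 = {y0:.1f}")

# Ordinates at whole-sigma distances from the mean.
for x in (1, 2, 3):
    print(f"+/-{x} sigma: {math.exp(-x * x / 2) * y0:.1f}")
```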

Name Course Date

THE NORMAL PROBABILITY CURVE – ITS NATURE AND IMPORTANCE

The normal probability curve, or simply the normal curve, is a bell


shaped, symmetrical curve. Frequency distributions of data drawn from anthropometry, psychology, meteorology, and education resemble the normal probability curve. In it, the mean, median, and mode fall at the same point, and there is a perfect balance or symmetry between the right and left halves of the figure.

The normal probability curve is of great importance in educational


measurement because of its usefulness in the construction of tests and
scales and in many calculations involving quantitative data. It is usually
necessary in certain problems to assume some form of distribution, and
the normal curve is taken because it gives the best single approximation to
the ordinary distribution of the scores.

An unsymmetrical distribution is called a skewed distribution. In
a skewed distribution, the mean, median, and the mode fall at
different points, in the distribution. Skewness is computed by the
formula:

Sk = 3(M – Mdn) / SD     or     Sk = (P90 + P10)/2 – P50

When the mean is smaller than the median, the distribution is skewed negatively, or to the left; that is, the scores are massed at the high (right) end of the scale and spread out gradually at the lower end. When the mean is greater than the median, the distribution is skewed positively, or to the right; that is, the scores are massed at the low (left) end of the scale and are spread out gradually at the high (right) end.

The results of a test that is easy are negatively skewed and those of a test that is difficult are positively skewed. The results of a test of moderate difficulty approach a normal curve.

There are several reasons why distributions are skewed:

a. Small size of the group measured or tested

b. Selection of a special group

c. Technical faults in the construction of the test and errors in


scoring.

Note:

The normal distribution is not an actual distribution of test scores but is, instead, a mathematical model. Frequency distributions of scores approach the theoretical distribution as a limit, but the fit is rarely perfect.

Principle:

Measurements of many natural phenomena and of many mental and


social traits under certain conditions tend to be distributed symmetrically about
their means in proportions which approximate those of the normal probability
distribution.

Much evidence has accumulated to show that the normal distribution
serves to describe the frequency of occurrence of many variable facts with a
relatively high degree of accuracy. Phenomena which follow the normal
probability curve (at least approximately) may be classified as:

1. Biological statistics: Mendelian ratios – Proportion of male to female


birth for the same community over a period of years.

2. Anthropological data: height, weight, etc.

3. Social and economic data: rates of birth, marriage, or death under


certain constant conditions: wages

4. Psychological measurements: Intelligence as measured by


standard tests; speed of association, perceptions span; educational
test scores in spelling, etc.

5. Errors of observation: measures of height, speed of movement,


linear magnitude, physical and mental traits.

APPLICATIONS OF THE NORMAL PROBABILITY CURVE

A number of problems may readily be solved if we can assure that our


obtained distributions can be treated as normal, or as approximately normal.
Each general problem will be illustrated by several examples. Constant reference
will be made to Table A; and a knowledge of how to use this table is essential.

1. To determine the percentage of cases in a normal distribution which


falls within given limits?

Problem: Given a normal distribution with M = 20 and SD = 5

a. What percent of the cases fall between 15 and 25?

b. What percent of the cases lie above 30?

c. What percent of the cases lie below 12?

Solution:

a. Score 25 – Mean (20) = 5, and score 15 – Mean (20) = -5. Divide each difference by the SD (5). The quotients are 1SD and -1SD, respectively: score 25 is 1SD above the mean and score 15 is 1SD below the mean. From Table A, 1SD includes 34.13% of the cases above the mean and -1SD includes 34.13% of the cases below the mean. Adding 34.13% and 34.13%, the sum, 68.26%, represents the cases that fall between 15 and 25.

b. Score 30 is 10 points or 2SD above the mean. From the table 47.72%
of the cases fall between the mean and 2SD. Accordingly, 2.28% (50%-
47.72%) of the cases lie above 30.

c. Score 12 is 8 points or -1.6SD from the mean. Between the mean and -
1.6SD are 44.52% of the case. Hence, 50% - 44.52 or 5.48% of the cases
lie below 12.
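The three answers can be verified by reading the areas from the normal curve directly instead of Table A. The Python sketch below uses the standard-library NormalDist class for illustration; the variable names are assumptions.

```python
from statistics import NormalDist

# Problem: M = 20, SD = 5
dist = NormalDist(mu=20, sigma=5)

between_15_25 = dist.cdf(25) - dist.cdf(15)    # (a) about 0.6827, i.e. 68.26%
above_30 = 1 - dist.cdf(30)                    # (b) about 0.0228, i.e. 2.28%
below_12 = dist.cdf(12)                        # (c) about 0.0548, i.e. 5.48%

print(f"{between_15_25:.4f} {above_30:.4f} {below_12:.4f}")
```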

2. To find the limits in any normal distribution which includes a given


percentage of the cases?

Problem:

Given a distribution with M = 20 and SD = 5. Assuming normality,


what limits will include the middles 65% of the cases?

Solution:

The middle 65% of the cases includes 32.5% above and 32.5% below the mean. From Table A, 32.5% of the distribution is very close to 32.38%, which corresponds to .93SD. The middle 65% of the cases, therefore, lies between the mean and ±.93SD or, since SD equals 5, between the mean and ±4.65 points. Adding 4.65 to the mean (20) gives 24.65, and subtracting 4.65 from the mean gives 15.35. Therefore the middle 65% of the cases lies between 15.35 and 24.65.
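The same limits can be found with the inverse normal function rather than the table; the short Python sketch below is illustrative only.

```python
from statistics import NormalDist

# Limits including the middle 65% of a normal distribution with M = 20, SD = 5.
dist = NormalDist(mu=20, sigma=5)
lower = dist.inv_cdf(0.175)       # 17.5% of cases lie below the lower limit
upper = dist.inv_cdf(0.825)       # 17.5% lie above the upper limit
print(f"{lower:.2f} to {upper:.2f}")   # roughly 15.3 to 24.7
```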

3. To determine the relative difficulty of test questions, problems, and other


test items.

Problem:
Give a test question or problem solved by 10% of the group; a
second problem solved by 20% of the group and a third, solved by 40% of
the group. What is the relative difficulty of questions 1, 2, and 3?
Solution: Question 1 is passed by 10% and failed by 90% of the group. The highest 10% of the group has 40% of the cases between its lower limit and the mean (50% - 10% = 40%). From the table, 39.97%, or about 40%, of the cases fall between 1.28SD and the mean. Accordingly, 1.28SD is the difficulty value of question 1.
Following the same procedure, question 2, passed by 20% of the group, falls at a point 30% above the mean (50% - 20% = 30%); from the table, about 30% (29.95%) of the cases fall between the mean and .84SD, so .84SD is the difficulty value of question 2. Question 3, passed by 40% of the group, falls at a point 10% above the mean (50% - 40% = 10%); from the table, 9.87%, or about 10%, of the cases fall between the mean and .25SD. Therefore question 3 has a difficulty value of .25SD.

The SD gives the real index of difficulty of test questions, and not
the percent of passing or failing.

4. To separate a given group into subgroups according to capacity, when


the trait is normally distributed.

Table A. Fractional parts of the total area under the normal


probability curve, corresponding to distance on the baseline
between the mean and successive points laid off from the mean in
units of standard deviation.

Example: Between the mean and a point 1.380SD = 1.38 are


found 41.62% of the entire area under the curve.

X .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

0.0 0000 0040 0080 0120 0160 0199 0239 0279 0319 0359
0.1 0398 0438 0478 0517 0557 0596 0636 0675 0714 0753
0.2 0793 0832 0871 0910 0948 0987 1026 1064 1103 1141
0.3 1179 1217 1255 1293 1331 1368 1406 1443 1480 1517
0.4 1554 1591 1628 1664 1700 1736 1772 1808 1844 1879
0.5 1915 1950 1985 2019 2054 2088 2123 2157 2190 2224
0.6 2257 2291 2324 2357 2389 2422 2454 2486 2517 2549
0.7 2580 2611 2642 2673 2704 2734 2764 2794 2823 2852
0.8 2881 2910 2939 2967 2995 3023 3051 3078 3106 3133
0.9 3159 3186 3212 3238 3264 3289 3315 3340 3365 3389
1.0 3413 3438 3461 3485 3508 3531 3554 3577 3599 3621
1.1 3643 3665 3686 3708 3729 3749 3770 3790 3810 3830
1.2 3849 3869 3888 3907 3925 3944 3962 3980 3997 4015
1.3 4032 4049 4066 4082 4099 4115 4131 4147 4162 4177
1.4 4192 4207 4222 4236 4251 4265 4279 4292 4306 4319
1.5 4332 4345 4357 4370 4383 4394 4406 4418 4429 4441
1.6 4452 4463 4474 4484 4495 4505 4515 4525 4535 4545
1.7 4554 4564 4573 4582 4591 4599 4608 4616 4625 4633
1.8 4641 4649 4656 4664 4671 4678 4686 4693 4699 4706
1.9 4713 4719 4726 4732 4738 4744 4750 4756 4761 4767
2.0 4772 4778 4783 4788 4793 4798 4803 4808 4812 4817
2.1 4821 4826 4830 4834 4838 4842 4846 4850 4854 4857
2.2 4861 4864 4868 4871 4875 4878 4881 4884 4887 4890
2.3 4893 4896 4898 4901 4904 4906 4909 4911 4913 4916
2.4 4918 4920 4922 4925 4927 4929 4931 4932 4934 4936
2.5 4938 4940 4941 4943 4945 4946 4948 4949 4951 4952
2.6 4953 4955 4956 4957 4959 4960 4961 4962 4963 4964
2.7 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974
2.8 4974 4975 4976 4977 4977 4978 4979 4979 4980 4981
2.9 4981 4982 4982 4983 4984 4984 4985 4985 4986 4986
3.0 4986.5 4986.9 4987.4 4987.8 4988.2 4988.6 4988.9 4989.3 4989.7 4990.0
3.1 4990.3 4990.6 4991.0 4991.3 4991.6 4991.8 4992.1 4992.4 4992.6 4992
3.2 4993.129
3.3 4996.166
3.4 4996.631
3.5 4997.674
3.6 4998.409
3.7 4998.922
3.8 4999.277
3.9 4999.519
4.0 4999.683
4.5 4999.966
5.0 4999.997133

Name Course Date

Name Course Date

COEFFICIENT OF CORRELATION

Correlation is the relationship between two or more series of measures of


the same individuals. In correlation, paired facts are studied. Thus, pupil’s marks
in one subject are compared with their marks in another subject. The degree or
amount of relationship is expressed by the coefficient of correlation, an index of
relationship.

The coefficient of correlation ranges in value from +1.00 (perfect positive


correlation), through zero (no correlation) to -1.00 (perfect negative correlation).
When a pupil who gets high marks in one subject also gets high marks in another subject, it is an example of positive correlation, implying a direct relationship.

Negative correlation implies an inverse relationship: when a pupil who gets low marks in one subject gets high marks in another subject, that is a case of inverse relationship.

The zero coefficient of correlation denotes no correlation. There is no


definite direction of increase or decrease in one subject over the other.

Case 1 Case 2 Case 3


Pupil A B A B A B
A 15 53 15 68 15 104
B 14 52 14 67 14 103
C 13 51 13 66 13 102
D 12 50 12 65 12 101
E 11 49 11 64 11 100

Positive Correlation     Negative Correlation     Zero Correlation

R = 1.00                 R = -1.00                R = 0.00

The coefficient of correlation is used in determining the validity, reliability, and objectivity of a test prepared. By correlating the results of the test prepared with the results of a valid criterion in the same subject, the validity of the test prepared is determined; if the coefficient of correlation between them is not less than 0.85, the test is more or less valid. The reliability of a test is found by comparing the results of the test prepared with those of a reliable test in the same subject. If the agreement between the scores assigned by two or more correctors is high, the test prepared is objective; if the agreement is low, the test is subjective.

The coefficient of correlation is also used for prediction. A pupil takes the test in English literature but is absent for the test in American literature. With the reliability of each of these tests and the relationship between them known, it is possible to predict the probable score of the pupil in the test in American literature by substituting the values in the regression equation, which is not within the scope of this book.

Coefficient of Correlation by the Rank Difference Method or RHO

Procedure:
1. Rank the scores in each test and enter the corresponding ranks in
columns “Rank X” and “Rank Y”.
2. Find the difference in ranks by subtracting algebraically the ranks in one
test from the ranks in the other set and write difference in column D.
3. Square each of the difference in rank and enter it in column D 2.

4. Take the sum of the squares of the differences in rank (ΣD²) and the number of pairs (N); substitute their values in the formula:

ρ (RHO) = 1 – 6ΣD² / [ N(N² – 1) ]
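The procedure is sketched below in Python for illustration, using the Test X and Test Y scores of Exercise 21a; the tie-averaging helper is an assumption introduced only to reproduce the fractional ranks shown in that exercise.

```python
# Rank-difference method:  rho = 1 - 6*sum(D^2) / (N*(N^2 - 1))
# Tied scores receive the average of the ranks they occupy.
def average_ranks(values):
    order = sorted(values, reverse=True)                      # rank 1 = highest
    return [(2 * order.index(v) + 1 + order.count(v)) / 2 for v in values]

test_x = [15, 14, 13, 12, 11, 11, 11, 10, 10, 10, 9, 9, 8, 7, 7]
test_y = [12, 14, 10, 8, 12, 9, 12, 8, 10, 9, 8, 7, 7, 8, 6]

rx, ry = average_ranks(test_x), average_ranks(test_y)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(test_x)
rho = 1 - 6 * d2 / (n * (n * n - 1))
print(f"sum D^2 = {d2}, rho = {rho:.2f}")     # 112.0 and 0.80, as in Exercise 21a
```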

Table of Interpretation
Any coefficient of correlation that is not zero and that is also statistically
significant denotes some degree of relationship between two variables.

Less than 0.20     Slight correlation, negligible relationship
0.20 – 0.39        Low correlation, definite but small relationship
0.40 – 0.69        Moderate correlation, substantial relationship
0.70 – 0.89        High correlation, marked relationship
0.90 – 1.00        Very high correlation, very dependable relationship

Name Course Date

EDUCATION 602- Statistics

Coefficient of Correlation by the Rank Difference Method or Rho


Exercise 21a

Pupils Test X Test Y Rank X Rank Y Difference D2


A 15 12 1 3 -2 4
B 14 14 2 1 1 1
C 13 10 3 5.5 -2.5 6.25
D 12 8 4 10.5 -6.5 42.25
E 11 12 6 3 3 9
F 11 9 6 7.5 -1.5 2.25
G 11 12 6 3 3 9
H 10 8 9 10.5 -1.5 2.25
I 10 10 9 5.5 3.5 12.25
J 10 9 9 7.5 1.5 2.25
K 9 8 11.5 10.5 1 1
L 9 7 11.5 13.5 -2 4
M 8 7 13 13.5 -0.5 0.25
N 7 8 14.5 10.5 4 16
O 7 6 14.5 15 -0.5 0.25

N = 15     ΣD² = 112

ρ (RHO) = 1 – 6ΣD² / [ N(N² – 1) ]
        = 1 – 6(112) / [ 15(15² – 1) ]
        = 1 – 672 / 3360
        = 1 – 0.20
        = 0.80     High positive correlation indicating marked relationship.

COEFFICIENT OF CORRELATION BY THE SPEARMAN “FOOTRULE”


FORMULA

Procedure:

1. Rank the scores in each test and enter the corresponding ranks in
columns “Rank X” and “Rank Y”.

2. Subtract the ranks in one test algebraically from the ranks in the other test, but enter only the positive differences, as these are the gains in rank of one test over the other (Col. G).

3. Find the sum of the gains in rank (ΣG) and the number of pairs (N) and substitute their values in the formula:

R = 1 – 6ΣG / (N² – 1)

4. The result is the Spearman’s coefficient of correlation.
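For illustration, the footrule formula is applied below to the ranks of Exercise 21b (ties already averaged); the hard-coded rank lists are taken from that exercise.

```python
# Spearman "footrule": R = 1 - 6*sum(G) / (N^2 - 1), where G keeps only the
# positive rank differences (gains).  Ranks are those of Exercise 21b.
rank_x = [1, 2, 3, 4, 6, 6, 6, 9, 9, 9, 11.5, 11.5, 13, 14.5, 14.5]
rank_y = [3, 1, 5.5, 10.5, 3, 7.5, 3, 10.5, 5.5, 7.5, 10.5, 13.5, 13.5, 10.5, 15]

gains = sum(max(x - y, 0) for x, y in zip(rank_x, rank_y))
n = len(rank_x)
r = 1 - 6 * gains / (n * n - 1)
print(f"sum G = {gains}, R = {r:.2f}")        # 17.0 and 0.54, as in Exercise 21b
```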

Name Course Date

EDUCATION 602- STATISTICS

Coefficient of Correlation by the Spearman “Footrule” Formula


Exercise 21b

Pupils Test X Test Y Rank X Rank Y Difference Gain


A 15 12 1 3 -2 0
B 14 14 2 1 1 1
C 13 10 3 5.5 -2.5 0
D 12 8 4 10.5 -6.5 0
E 11 12 6 3 3 3
F 11 9 6 7.5 -1.5 0
G 11 12 6 3 3 3
H 10 8 9 10.5 -1.5 0
I 10 10 9 5.5 3.5 3.5
J 10 9 9 7.5 1.5 1.5
K 9 8 11.5 10.5 1 1
L 9 7 11.5 13.5 -2 0
M 8 7 13 13.5 -0.5 0
N 7 8 14.5 10.5 4 4
O 7 6 14.5 15 -0.5 0

N = 15     ΣG = 17

R = 1 – 6ΣG / (N² – 1)
  = 1 – 6(17) / (15² – 1)
  = 1 – 102 / 224
  = 1 – 0.46
R = 0.54     Moderate correlation indicating substantial relationship
THE PRODUCT MOMENT COEFFICIENT OF CORRELATION

The product – moment coefficient of correlation may be thought of


essentially as that ratio which expresses the extent to which changes in one
variable are accompanied by or are dependent upon, changes in a second
variable.
The sum of the deviations from the mean (raised to some power), divided by N, is called a moment. When the corresponding deviations in x and y are multiplied together, summed, and divided by N to give Σxy/N, the term product-moment is used.
The coefficient of correlation, r, is often called the “Pearson r” after Professor Karl Pearson, who developed the product-moment method following the earlier work of Galton and Bravais.

Procedure:
1. Compute the arithmetic mean of each test by the long method.

2. Subtract algebraically the mean of each test from every score in that test to get the corresponding deviations x and y.

3. Square the deviations in each test and sum them up (Cols. x² and y²).

4. Multiply each deviation in one test by the corresponding deviation in the other test and add the products algebraically (Σxy).

5. Divide the sum of the squared deviations (Σx² and Σy²) of each test by N and extract the square root of each quotient to get the sigmas (σx and σy).

6. Divide the mean of the products (Σxy/N) by the product of the sigmas of the two tests; or, equivalently, divide Σxy by the square root of the product Σx² × Σy². The result is the Pearson coefficient of correlation.

r = ( Σxy / N ) / ( √(Σx²/N) × √(Σy²/N) )
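The six steps are sketched below in Python using the Test X and Test Y scores of Exercise 21c; it is an illustrative check, and the square-root form of the formula is used because it avoids rounding the means.

```python
import math

# Product-moment r:  r = (sum(xy)/N) / (sigma_x * sigma_y),
# which reduces to r = sum(xy) / sqrt(sum(x^2) * sum(y^2)).
test_x = [15, 14, 13, 12, 11, 11, 11, 10, 10, 10, 9, 9, 8, 7, 7]
test_y = [12, 14, 10, 8, 12, 9, 12, 8, 10, 9, 8, 7, 7, 8, 6]

n = len(test_x)
mx, my = sum(test_x) / n, sum(test_y) / n
dx = [x - mx for x in test_x]                 # step 2: deviations
dy = [y - my for y in test_y]

sxy = sum(a * b for a, b in zip(dx, dy))      # step 4
sx2 = sum(a * a for a in dx)                  # step 3
sy2 = sum(b * b for b in dy)
r = sxy / math.sqrt(sx2 * sy2)                # step 6
print(f"r = {r:.2f}")                         # about 0.78, as in Exercise 21c
```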

Name Course Date

EDUCATION 602

The Product – Moment Coefficient of Correlation


Exercise 21c

Pupils Test X Test Y X Y XY X2 Y2


A 15 12 4.5 2.7 12.15 20.25 7.29
B 14 14 3.5 4.7 16.45 12.25 22.09
C 13 10 2.5 0.7 1.75 6.25 0.49
D 12 8 1.5 -1.3 -1.95 2.25 1.69
E 11 12 0.5 2.7 1.35 0.25 7.29
F 11 9 0.5 -0.3 -0.15 0.25 0.09
G 11 12 0.5 2.7 1.35 0.25 7.29
H 10 8 -0.5 -1.3 0.65 0.25 1.69
I 10 10 -0.5 0.7 -0.35 0.25 0.49
J 10 9 -0.5 -0.3 0.15 0.25 0.09
K 9 8 -1.5 -1.3 1.95 2.25 1.69
L 9 7 -1.5 -2.3 3.45 2.25 5.29
M 8 7 -2.5 -2.3 5.75 6.25 5.29
N 7 8 -3.5 -1.3 4.55 12.25 1.69
O 7 6 -3.5 -3.3 11.55 12.25 10.89

N = 15     ΣX = 157     ΣY = 140     Σxy = 58.65     Σx² = 77.75     Σy² = 73.35

MX = ΣX / N = 157 / 15 = 10.5          MY = ΣY / N = 140 / 15 = 9.3



r = ( Σxy / N ) / ( σx × σy )

  = ( 58.65 / 15 ) / ( √(77.75/15) × √(73.35/15) )

  = 3.91 / ( √5.18 × √4.89 )

  = 3.91 / √25.33

  = 3.91 / 5.03

  = 0.777

r = 0.78     High positive correlation indicating a direct, marked relationship.

TRANSMUTATION OF RAW SCORES INTO RATING BY


THE LONG OR SPREAD METHOD

Procedure:
1. Determine the highest score and the lowest score.

2. Beginning from the highest score, write the number
consecutively down to the lowest score (top to bottom).

3. Tally the raw score into this consecutive number


distribution. Then summarize the tallies under column f.

4. Determine the rating scale for the period. (This is such


usually supplied by the principal or group chairman) such
as the 7 point scale or 5 point scale. Compute the required
percentages and indicate the number of scores necessary
in every percentage with brackets in the distribution of
scores.

5. Assign the upper and lower rating limits of every step


opposite the upper and lower score limits respectively.

6. Compute the difference between the upper and lower rating


limits and the range between the upper and lower limits of
every steps. Make the difference the numerator and the
range the denominator of the fractional part which is to be
consecutively added to the lower rating limit of the step
until the upper rating limit is reached.

7. Round the rating opposite every tallied number in the


distribution, consider a fraction if the numerator is at least
one – half of the denominator.

Name Course Date

EDUCATION 602

135
Transmutation of Raw Scores into Ratings by Long or Spread Method
Exercise 22a

HS= 59
LS= 12

Group A 5% =6 91-95
Group B 25% = 26 86-90
Group C 40% = 40 80-85
Group D 20% = 19 75-79
Group E 10% =9 70-74

Group A Equivalent Group C Equivalent


59 1 95 95 39 2 85 85
58 0 94 5/9 95 95 38 6 84 4/9 84 85
57 1 94 1/9 94 - 91 37 5 83 8/9 84 - 80
56 0 93 2/9 93 04 36 3 83 3/9 83 05
55 0 93 2/9 93 35 3 82 7/9 83
54 0 92 7/9 93 59 34 8 82 2/9 82 39
53 0 92 3/9 92 - 50 33 3 81 6/9 82 - 30
52 1 91 8/9 92 09 32 4 81 1/9 81 09
51 2 91 4/9 91 31 4 80 5/9 81
50 1 91 91 4/9 30 2 80 80 5/9
Group B Equivalent Group D Equivalent
49 1 90 90 29 4 79 79
48 1 89 5/9 90 90 28 0 78 5/9 79 79
47 0 89 1/9 89 - 86 27 3 78 1/9 78 - 75
46 3 88 3/9 88 04 26 3 77 6/9 78 04
45 3 87 3/9 87 25 2 77 2/9 77
44 1 86 8/9 87 49 24 2 76 7/9 77 29
43 3 87 3/9 87 - 40 23 1 76 3/9 76 - 20
42 2 86 8/9 87 09 22 1 75 8/9 76 09
41 5 86 4/9 86 21 1 75 4/9 75
40 7 86 86 4/9 20 2 75 75 4/9
Group E Equivalent
19 1 74 74 15 1 72 5/7 73 19
18 2 74 3/7 74 74 14 2 72 1/7 72 - 12
17 0 73 6/7 74 - 70 13 0 71 4/7 71 07
16 2 73 2/7 73 04 12 1 70 70
4/7

136
Name Course Date

EDUCATION 602

Transmutation by the Mean and SD of the Distribution


Exercise 22b

100 Raw Scores M = 34.36


HS = 59 SD = 9.76
LS = 12
Limits: Groups Grades
M+1.5SD to HS A 88-90
M+.5SD to M+1.5SD B 85-87
M-.5SD to M+.5SD C 80-84
M-1.5SD to M-.5SD D 77-79
LS to M-1.5SD E 75-76

Limits: Scores Ratings Scores


Ratings
49-59 59 90 90 33 81 7/9 82
39-48 58 89 4/5 90 32 81 3/9 81
29-38 57 89 3/5 90 31 80 8/9 81
20-28 56 89 2/5 89 30 80 4/9 80
12-19 55 89 1/5 89 29 80 80
Computations 54 89 89 28 79 79
9.76 x .5 = 4.880 53 88 4/5 89 27 78 ¾ 79
9.76x 1.5 = 14.640 52 88 3/5 89 26 78 2/4 78
34.36+14.64 = 49.00 51 88 2/5 89 25 78 1/4
34.36+4.88 = 39.48 50 88 1/5 88 24 78 78
34.36-4.88 = 29.00 49 88 88 23 77 ¾ 78
34.36-14.64 = 19.7248 87 87 22 77 2/4 78
47 86 7/9 87 21 77 ¼ 77
46 86 5/9 87 20 77 77
45 86 3/9 86 19 76 76
44 85 1/9 86 18 75 6/7 76
43 85 8/9 86 17 75 5/7 76
42 85 6/9 85 16 75 4/7 76
41 85 4/9 85 15 75 3/7 75
40 85 2/9 85 14 75 2/7 75
39 85 85 13 75 1/7 75
38 84 84 12 75 75
37 83 5/9 84
36 83 1/9 83
35 82 6/9 83
34 82 2/9 82

137
TRANSMUTATION OF RAW SCORES INTO RATINGS BY SHORT CUT
METHOD

Procedure:

1. Determine the highest and lowest scores. Find the range

2. Determine the highest and lowest ratings for the period. Find their
difference.

3. Divide the range by the difference to determine the size of class


interval (i) for a number of steps equal to the difference when the
remainder of the division operation is subtracted from the divisor.

4. If the lowest score is zero or extremely low in comparison to the


next consecutive number, distributed as many steps as is required
with the I computed in step 3. If the lowest score is not extremely low
begin the distribution with it.

5. Add 1 (constant) to the I in step 3 to determine the new I of the


remaining steps in the distribution. The number of steps using the
new I is equal to the remainder of the division operation. Continue
the distribution with this new I and this number of step. If the lowest
score has been written alone, the upper limit of the last or highest
step in the distribution is equal to the highest score. But if the lowest
score has been included in the first step of the distribution, the
highest score is written alone after the last step.

6. Assign the rating. The lowest rating is assigned to the lowest score
or lowest step, as the case may be, the next consecutive rating to
the step next to the lowest score or step and so on or score as the
case may be.

138
Note:

If the quotient, when the range (step 1) is divided by the difference (step 2), is exact (that is, there is no remainder), then the quotient is the i of the whole distribution and the number of steps is equal to the divisor. In that case it is not necessary to follow step 5.

139
Name Course Date

EDUCATION 602

Transmutation by Short Method


Exercise 22c

Given:
N = 100 Scores Equivalent Rating
HS = 59 - 60 95
LS = 12 57 – 59 94
HR = 95 54 – 56 93 i = 3 (2 steps)
LR = 70 52 – 53 92 i = 2 (23 steps)
50 – 51 91
1. Determine the Range 48 – 49 90
HS= 59 46 – 47 89
LS= - 12 44 – 45 88
47 42 – 43 87
2. Determine the range of 40 – 41 86
ratings for the period. 38 – 39 85
HR= 95 36 – 37 84
LR= - 70 34 – 35 83
25 32 – 33 82
30 – 31 81
47/25 = 1.88 or 2 = 1st i
25-2 = 23 26 – 27 79
24 – 25 78
23 = No. of steps with an I of 2 22 – 23 77
2 = 1st i + 1 = 3 20 – 21 76
3 = 2nd i 18 – 19 75
2 = No. of steps with an I of 3 16 – 17 74
14 – 15 73
12 – 13 72
10 – 11 71
08 – 09 70

140
PART II

MODULES FOR
STATISTICAL
METHOD

Modules for Statistical Methods

STEPS IN HYPOTHESIS TESTING:

1. Construct the following hypothesis:


a. Research hypothesis (RH)
b. Null Hypothesis (Ho)
c. Alternate Hypothesis (Ha)
2. Set level of Significance:
(Set level of .05 for behavioral science researches: .01 in experimental
studies)
3. Determine the test statistic to be used.
4. Determine the critical value based on the degrees of freedom (df).
5. Compute the values needed for testing.
6. Test for significance and give findings:
a. Computed value > tabled value (significant)
b. Computed value < tabled value (not significant)
7. Give decisions:
A. Reject Ho if:
Computed value > t.v. (.05 or .01)
B. Accept Ho if:
Computed value < t.v. (.05 or .01)
8. Interpret findings.
9. Give implications of findings.

EXERCISES
1. Recall two research studies you have read recently.
a. What would be the hypothesis you can generate from such studies:
1. Research hypothesis
2. Null hypothesis
3. Alternate hypothesis
b. Submit one good research study you would like to go into, what
would be your:
1. Research hypothesis

2. Null hypothesis
3. Alternate hypothesis

MODULE I
I. How to compute chi-square (one sample): testing the significant difference between responses within a group.

Formula:  x² = Σ (O – E)² / E

x² = chi-square
Σ = sum of
O = observed frequencies
E = expected frequencies
II. Problems and Hypothesis (Null)
A. Null hypothesis (Ho): There will be no significant difference for each
of three kinds of responses.
B. Problem:
Thirty prospective teachers were asked their opinion about the desirability of introducing technological innovations in the classroom.
1. What is the extent of the teachers' opinion on technological innovations in the classroom?
2. Test at the .05 level if a significant difference exists among the teachers' responses.
III. Statistical Procedures:

Step 1: Record expected and observed frequencies as follows:


f Agree Undecided Disagree Total
0 20 5 5 30
E 10 10 10 30
Step 2: O – E              10       -5       -5
Step 3: (O – E)²          100       25       25
Step 4: (O – E)² / E       10      2.5      2.5
Step 5: Substitute the numbers in the chi-square formula and perform the indicated operations.
x² = 10 + 2.5 + 2.5 = 15
x² = 15

Step 6: Consult Chi-Square table with equal degree of freedom.
df= (R – 1) (C – 1) Tabled value (Critical Value) at
df= (2 – 1) (3 – 1) 2df = 5.99
df= 2

Step 7: Match computed value with critical value at .05:


Computed value Critical Value at .05
2
x = 15 df= 2
2
IV. Findings: x² (15) > critical value at .05 (5.99)
V. Decision: Reject Ho.
There is a significant difference in opinions among the teachers.
VI. Interpretation/Implication: Teachers differ significantly in their opinion on the use of technological innovations in the classroom.
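The whole computation in Steps 1 through 5 can be reproduced with a few lines of Python; the sketch below uses the 20/5/5 responses of the example, with the expected frequencies spread equally over the three categories.

```python
# One-sample chi-square:  x^2 = sum((O - E)^2 / E)
observed = [20, 5, 5]                              # Agree, Undecided, Disagree
n = sum(observed)
expected = [n / len(observed)] * len(observed)     # 10, 10, 10

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi2:.2f}")   # 15.00, which exceeds 5.99 (critical value, df = 2)
```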
VII. Computing the weighted mean of the response to determine the extent
of opinion of teachers towards the use of technological innovations in
the classrooms.
Formula:  weighted X̄ = Σfw / N
Ʃ= sum
f= frequency
w= weight
N= number of respondents
Give higher values in points to positive responses, lower values to
negative responses, such as:
agree = 3 undecided = 2 disagree = 1

Table of Computation of weighted X


Responses f w fw weighted X
Agree 20 3 60
Undecided 5 2 10 75/30 = 2.5 (Agree)
Disagree 5 1 5
Total 30

Numerical range of cut-off scores for 3 levels of descriptive rating


2.34 – 3.00 Agree
1. 67 – 2.33 Undecided
1.00 – 1.66 Disagree
Therefore teachers significantly agree on the use of technological
innovations in the classroom.

For five (5) levels of descriptive ratings, the following may be use
4.21 – 5.00 strongly agree
3.41 – 4.20 agree
2.61 – 3.40 undecided
1.81 – 2.60 disagree
1.00 – 1.80 strongly disagree

EXERCISE:
A. Problem: A survey questionnaire was given to college teachers on the issue: “It is best for students to be told what subjects to take rather than have them choose for themselves.” Test at the .05 level whether the responses of the respondents differ significantly.
B. Data:
Strongly Agree – 325 x 5 = 1625
Agree – 584 x 4 = 2336
Uncertain – 189 x 3 = 567
Disagree – 286 x 2 = 572
Strongly Disagree – 82 x 1 = 82
Total: 1466             5182
Questions:
1. Is there a significant difference in responses among the college teachers?
Support your answer.
2. What is the level of attitude of respondents?
3. If you were the college president, would you adopt a fixed curriculum for students to follow strictly, based on the findings?

Module 2
How to compute the chi-square (two or more samples) testing the significant
between two or more groups.

I. Formula:  x² = Σ (O – E)² / E     where: x² = chi-square; Σ = sum of; O = observed frequency; E = expected frequency

II. Formula and Hypothesis (Null)


A. Null Hypothesis (Ho): There is no significant difference in opinions among the three groups of respondents towards the teaching of sex education in school.
B. Problem: Teachers were asked if they favor the teaching of sex education in school. Test at the .05 level whether a significant difference in attitude exists among the teachers.
C. Data:    Scales          SA     A      U      D     SD    Total
            College         15     6     25     18     11      75
            High School     32    12     28     13      9      94
            Elementary      26    18     29     15     10      98
            Total           73    36     82     46     30     267
D. Questions:
1. Are the attitudes of the 3 levels of teachers significantly different?
2. Is there a significant difference in attitude in each following
group:
a. Elementary Teachers
b. High School Teachers
c. College Teachers

3. What is the extent of attitude of each group of teachers?


a. Elementary b. High School c, College

4. Based on the findings, are you in favor of teaching sex education in our schools?

III. Statistical Procedure:
Step 1. Record observed frequencies as follows:

Teachers/Scales     SA       A        U        D        SD      TOTAL

College             15(a)    6(b)     25(c)    18(d)    11(e)     75

High School         32(f)    12(g)    28(h)    13(i)     9(j)     94

Elementary          26(k)    18(l)    29(m)    15(n)    10(o)     98

TOTAL               73       36       82       46       30       267

Step 2. Compute the “e” of each cell. The “e” of each cell is
obtained by multiplying the marginal sums and divide this by the total number of
respondent (N).

cell (a) = (73 × 75)/267 = 20.51     cell (f) = (73 × 94)/267 = 25.70     cell (k) = (73 × 98)/267 = 26.79

cell (b) = (36 × 75)/267 = 10.11     cell (g) = (36 × 94)/267 = 12.67     cell (l) = (36 × 98)/267 = 13.21

cell (c) = (82 × 75)/267 = 23.03     cell (h) = (82 × 94)/267 = 28.87     cell (m) = (82 × 98)/267 = 30.10

cell (d) = (46 × 75)/267 = 12.92     cell (i) = (46 × 94)/267 = 16.19     cell (n) = (46 × 98)/267 = 16.88

cell (e) = (30 × 75)/267 = 8.43      cell (j) = (30 × 94)/267 = 10.56     cell (o) = (30 × 98)/267 = 11.01

Step 3. Substitute the “e” in the chi – square formula and perform
the indicated operations.

cell (a) = (15 – 20.51)²/20.51 = 1.48     cell (b) = (6 – 10.11)²/10.11 = 1.67     cell (c) = (25 – 23.03)²/23.03 = 0.17

cell (d) = (18 – 12.92)²/12.92 = 2.00     cell (e) = (11 – 8.43)²/8.43 = 0.78      cell (f) = (32 – 25.70)²/25.70 = 1.54

cell (g) = (12 – 12.67)²/12.67 = 0.04     cell (h) = (28 – 28.87)²/28.87 = 0.03    cell (i) = (13 – 16.19)²/16.19 = 0.63

cell (j) = (9 – 10.56)²/10.56 = 0.23      cell (k) = (26 – 26.79)²/26.79 = 0.02    cell (l) = (18 – 13.21)²/13.21 = 1.74

cell (m) = (29 – 30.10)²/30.10 = 0.04     cell (n) = (15 – 16.88)²/16.88 = 0.21    cell (o) = (10 – 11.01)²/11.01 = 0.09

X² = 1.48 + 1.67 + 0.17 + 2.00 + 0.78 + 1.54 + 0.04 + 0.03 + 0.63 + 0.23 + 0.02 + 1.74 + 0.04 + 0.21 + 0.09
X² = 10.67

Step 4. Match the computed value with the critical value.

Computed value: X² = 10.67          Critical value at .05:
                                    df = (R – 1)(C – 1) = (3 – 1)(5 – 1) = (2)(4) = 8
                                    critical value at .05 (df = 8) = 15.51

Findings:
X² (10.67) is less than the critical value at .05 (15.51).

Decision: Accept Ho.
There is no significant difference in opinion among the three groups.

Interpretation/Implication:
The opinions of the three groups of teachers are comparable; the teaching of sex education in school is agreeable to the teachers.
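The expected frequencies and the chi-square sum of Steps 2 and 3 can be checked in Python. The sketch below uses the observed table of Step 1; the variable names are assumptions, and small differences from the hand values come only from rounding the expected frequencies.

```python
# Chi-square for an r x c table: E = (row total x column total) / N,
# x^2 = sum((O - E)^2 / E).  Rows follow the Step 1 table.
table = [
    [15,  6, 25, 18, 11],   # College
    [32, 12, 28, 13,  9],   # High School
    [26, 18, 29, 15, 10],   # Elementary
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi2 = 0.0
for r, row in enumerate(table):
    for c, observed in enumerate(row):
        expected = row_totals[r] * col_totals[c] / n
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
print(f"chi-square = {chi2:.2f}, df = {df}")   # about 10.67, df = 8 (critical value 15.51)
```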

Module 3
How to Compute Coefficient of Contingency

Through the Chi-Square

Problem: 1.) Is there a significant correlation between marriage adjustment and education of husbands?
2.) Test at the .05 level of significance.

Data: Education of Husbands: Marriage Adjustment Levels

Very Low Low High Very High

Grad Work 4 9 38 54

College 20 31 55 99

High School 23 37 41 51

Elementary 11 10 11 19

Questions:

1. Construct the Following


1.1. Research Hypothesis
1.2. Null Hypothesis
1.3. Alternate Hypothesis
2. Compute the Following
2.1. X2 2.3 “c” 2.5 df (x2) 2.7 CU (x2)
2.2. e 2.4 t 2.6 df (t) 2.8 CU (t)

Formula:  C = √( x² / (N + x²) )

Where:
C = coefficient of contingency
x² = chi-square
N = number of cases
II. Problem and Hypothesis: (NULL)
A. Null – Hypothesis (HO): There is no significant correlation between
marriage adjustment level and education of husbands
B. Problem: Test at .05 level of significance that There is no significant
correlation between marriage adjustment level and education of
husbands
III. Statistical Procedure
Step 1.Record observed frequencies as follows:
Education of Husband: Marriage Adjustment Levels

Very Low Low High Very High TOTAL

Grad Work 4 (a) 9 (b) 38 (c) 54 (d) 105

College 20 (e) 31(f) 55 (g) 99 (h) 205

High School 23 (i) 37 (j) 41 (k) 51 (l) 152

Elementary 11 (m) 10 (n) 11 (o) 19 (o) 51

513

Step 2. Compute the “e” of each cell by multiplying the marginal sums and dividing the product by the total number of cases (N).

Cell (a) = (58 × 105)/513 = 11.87     Cell (e) = (58 × 205)/513 = 23.18     Cell (i) = (58 × 152)/513 = 17.19     Cell (m) = (58 × 51)/513 = 5.77

Cell (b) = (87 × 105)/513 = 17.81     Cell (f) = (87 × 205)/513 = 34.77     Cell (j) = (87 × 152)/513 = 25.78     Cell (n) = (87 × 51)/513 = 8.65

Cell (c) = (145 × 105)/513 = 29.68    Cell (g) = (145 × 205)/513 = 57.94    Cell (k) = (145 × 152)/513 = 42.96    Cell (o) = (145 × 51)/513 = 14.42

Cell (d) = (223 × 105)/513 = 45.64    Cell (h) = (223 × 205)/513 = 89.11    Cell (l) = (223 × 152)/513 = 66.07    Cell (p) = (223 × 51)/513 = 22.17

Step 3. Compute  X² = Σ (o – e)² / e :
2 2
( 4−11.87) (23−17.19)
Cell (a) = =5.22 Cell (i) = =1.96
11.87 17.19

(9−17.81)2 (37−25.78)2
Cell (b) = =4.36 Cell (j) = =4.88
17.81 25.78
2 2
(38−29.68) ( 41−42.96)
Cell (c) = =2.33 Cell (k) = =0.09
29.68 42.96
2 2
(54−45.64) (51−66.07)
Cell (d) = =1.53 Cell (l) = =3.44
45.64 66.07
2 2
(20−23.18) (11−5.77)
Cell (e) = =0.44 Cell (m) = =4.74
23.18 5.77

(31−34.77)2 (10−8.65)2
Cell (f) = =0.41 Cell (n) = =0.21
34.77 8.65

Cell (g) = (55 – 57.94)²/57.94 = 0.15     Cell (o) = (11 – 14.42)²/14.42 = 0.81

Cell (h) = (99 – 89.11)²/89.11 = 1.10     Cell (p) = (19 – 22.17)²/22.17 = 0.45

X² = 5.22 + 4.36 + 2.33 + 1.53 + 0.44 + 0.41 + 0.15 + 1.10 + 1.96 + 4.88 + 0.09 + 3.44 + 4.74 + 0.21 + 0.81 + 0.45 = 32.12

IV. Matching computed Value with Table Value


Computed Value x2 = 32.12 Critical Value at 0.05
df = (R – 1) × (C – 1)
   = (4 – 1) × (4 – 1)
   = 3 × 3 = 9
Critical value at .05 (df = 9) = 16.92

V. Findings: x2 (32.12) >δ 0.05 (16.92)


The x2 value of 32.12 is greater than the critical value of δ 0.05 of 16.92.
Relationship therefore is established between marriage adjustment level and
education of husbands.

Step 4:  C = √( x² / (N + x²) ) = √( 32.12 / (513 + 32.12) ) = 0.24

Step 5:  C / 0.84 = 0.24 / 0.84 = 0.29

Step 6:  t = C √( (N – 2) / (1.00 – C) ) = 0.29 √( (513 – 2) / (1.00 – 0.29) ) = 7.78
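Step 4 reduces to one line of arithmetic; the short Python sketch below is only an illustrative check using the chi-square and N computed above.

```python
import math

# Coefficient of contingency: C = sqrt(x^2 / (N + x^2))
chi2, n = 32.12, 513
c = math.sqrt(chi2 / (n + chi2))
print(f"C = {c:.2f}")        # about 0.24
```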

Match the computed value with the table value (for the 5-level normality exercise):

Computed value: x² = 1.99          Critical value at .05:
                                   df = (R – 1)(C – 1) = (2 – 1)(5 – 1) = 4
                                   α.05 (4 df) = 9.488 (tabled value)

x² (1.99) < α.05 (9.488)

Findings: x² (1.99) is less than the critical value at the .05 level of significance (9.488).
Decision: Accept the hypothesis of no divergence; the distribution is normal.

Interpretation/Implication:
Analysis of the data has shown no significant divergence across the 5 levels of performance. In the NI (needs improvement) level the expected number of pupils (0.42) is about the same as the observed number (1); the same is true of the MS, VS, and O levels. There are only four pupils who are satisfactory (S) out of the expected 5 pupils. The distribution is normal.

Answer to Questions:
1. The graph is not normal at 3 levels of performance
2. The graph is normal at 5 levels of performance

Module 4

How to determine the profile of the Academic Performance of a group (Testing
the significant Divergence from the normal curve of distribution)

FORMULA:

x² = Σ (O – E)² / E

x² = chi-square
Σ = sum of
O = observed frequency
E = expected frequency

The expected frequencies in the test for normality are those of the normal curve of distribution.
II. PROBLEM AND HYPOTHESIS (null)
The distribution of levels of performance in a mathematics test is not
divergent from the normal curve of distribution.
(Distribution is normal)
A. Problem: Grade IV pupils were given an achievement test in
Mathematics. Test at .05 level that distribution of performance is
not divergent from the normal curve of distribution.

B. Data: Level of performance Frequencies


Above Average 18
Average 26
Below Average 4

Total
48

III. STATISTICAL PROCEDURE
Step 1. Record observed frequencies.
Perform indicated operations.
Frequencies              BA            A             AA         TOTAL
Observed (O)              4           26             18           48
Expected (E)       16% of 48 =  68% of 48 =   16% of 48 =
                       7.68         32.64           7.68          48
Step 2: O – E         -3.68         -6.64          10.32
Step 3: (O – E)²      13.54         44.09         106.50
Step 4: (O – E)²/E     1.76          1.35          13.87

The expected frequencies (E) for 5 levels of academic performance use the following proportions:
Outstanding – 3.5%;  Very Satisfactory – 24%;  Satisfactory – 45%;  Moderately Satisfactory – 24%;  Needs Improvement – 3.5%
Step 5. Substitute the numbers in the chi-square formula and perform the
indicated operation.
2
x =1.76 + 1.35 + 13.87 = 16.98

x 2=16.98
Step 6. Match computed value with critical value.
Computed value: x² = 16.98          Critical value at .05:
                                    df = (3 – 1)(2 – 1) = 2
                                    .05 (df = 2) = 5.99
2
IV. FINDINGS: x ( 16.98 ) is greater than the critical value at .05 level of

significance (5.99)
V. DECISION: Reject the hypothesis of no divergence
VI. INTERPRETATIONS/IMPLICATIONS:

Analysis of the data has shown a significant divergence in the 3 levels of performance. In the BA (below average) level there are only 4 pupils out of the expected 7.68. In the A (average) level there are only 26 out of the expected 32.64 pupils. However, there are more pupils (18) than expected (7.68) in the AA (above average) level. The distribution is skewed to the left of the normal curve of distribution.
Therefore, the academic performance of the Grade IV pupils shows that there are more bright pupils than poor ones.
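The goodness-of-fit computation against the 16% / 68% / 16% normal proportions is easy to verify; the Python sketch below uses the observed 4 / 26 / 18 frequencies of the example.

```python
# Goodness-of-fit to the normal proportions 16% / 68% / 16% (BA, A, AA).
observed = [4, 26, 18]
n = sum(observed)                                    # 48
expected = [0.16 * n, 0.68 * n, 0.16 * n]            # 7.68, 32.64, 7.68

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi2:.2f}")   # about 16.98 > 5.99, so the distribution diverges
```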

EXERCISE :
I. PROBLEM : Give the profile of the academic performance of a group
of pupils in a Reading test getting the following scores.

II. Data : Scores in a Reading Test


30, 29, 28, 25, 23, 21, 20, 19, 18, 17, 16, 12
III. Questions:
Is the group normal or not in a
a. 3 levels of performance
b. 5 levels of performance

MODULE 4a. How to convert scores into Grades: (Levels of Academic
Performance) A, B, C,
D, F
I. Formula:  X̄ ± 1.8 SD (cut-off points for A and F)
             X̄ ± 0.6 SD (cut-off points for B, C, and D)

II. Scores: 56, 53, 48, 40, 38, 37, 23, 18, 15, 8

III. Procedures:

Step 1. Compute the mean score:  X̄ = ΣX / N = 33.60

Step 2. Compute the SD:  SD = √( Σ(X – X̄)² / N ) = 15.93
Step 3. Compute the cut-off scores for A, B, C, D, F.

For A and F:   15.93 × 1.8 = 28.67        33.60 + 28.67 = 62.27 (A)
                                          33.60 − 28.67 = 4.93 (F)

For B, C, D:   15.93 × 0.6 = 9.56         33.60 + 9.56 = 43.16 (B)
                                          33.60 − 9.56 = 24.04 (D)

Step 4. Set the levels of performance with the cut-off scores.

Cut-off range              Scores             Levels with frequency
62.27 and up      (A)      —                  A – 0
43.16 – 62.26     (B)      56, 53, 48         B – 3
24.04 – 43.15     (C)      40, 38, 37         C – 3
4.93 – 24.03      (D)      23, 18, 15, 8      D – 4
Below 4.93        (F)      —                  F – 0

Summary of cut-off scores for the A, B, C, D, F levels of performance:

62.27 and up      –  A
43.16 – 62.26     –  B
24.04 – 43.15     –  C
4.93 – 24.03      –  D
Below 4.93        –  F
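For readers who want to automate the cut-off procedure, here is a minimal Python sketch (an illustration, not part of the module). It assumes the module's rule of mean ± 1.8 SD for the A/F cut-offs and mean ± 0.6 SD for the B/D cut-offs, and uses the population SD (dividing by N); small rounding differences can shift a cut-off by a fraction of a point without changing the grade counts here.

    from collections import Counter
    from statistics import mean, pstdev      # pstdev divides by N

    scores = [56, 53, 48, 40, 38, 37, 23, 18, 15, 8]
    x_bar = mean(scores)                     # 33.60
    sd = pstdev(scores)                      # about 15.9 (the module uses 15.93)

    cut_a = x_bar + 1.8 * sd                 # A: this score and above
    cut_b = x_bar + 0.6 * sd                 # B: from here up to the A cut-off
    cut_d = x_bar - 0.6 * sd                 # C: from here up to the B cut-off
    cut_f = x_bar - 1.8 * sd                 # D: from here up to the C cut-off; F below

    def grade(score):
        if score >= cut_a: return "A"
        if score >= cut_b: return "B"
        if score >= cut_d: return "C"
        if score >= cut_f: return "D"
        return "F"

    print(Counter(grade(s) for s in scores))   # B: 3, C: 3, D: 4 (no A's or F's)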

TABLE A.

Values of t (or critical ratio) at the .05 and .01 levels of
significance.

Example: When df = 20 and t = 2.09, the .05 level means that in 5 out of
100 trials a divergence as large as or larger than that obtained (plus or
minus) may be expected under the null hypothesis.

Degrees of Freedom .05 .01

Df
1 12.71 63.66
2 4.30 9.92
3 3.18 5.84
4 2.78 4.60
5 2.57 4.03
6 2.45 3.71
7 2.36 3.50
8 2.31 3.36
9 2.26 3.25
10 2.23 3.17
11 2.20 3.11
12 2.18 3.06
13 2.16 3.01
14 2.14 2.98
15 2.13 2.95
16 2.12 2.92
17 2.11 2.90
18 2.10 2.88
19 2.09 2.86
20 2.09 2.84
21 2.08 2.83
22 2.07 2.82
23 2.07 2.81
24 2.06 2.80
25 2.06 2.79
26 2.06 2.78
27 2.05 2.77
28 2.05 2.76
29 2.04 2.76
30 2.04 2.75
50 2.01 2.68
100 1.98 2.63

Over 100 1.96 2.58

TABLE B.

Values of chi-square (x²) at the .05 and the .01 levels of
significance.

Example: For 12 degrees of freedom, a computed x² must be at least as large as
21.03 to be significant at the 5% level and as large as 26.22 to be significant at
the 1% level.

Degrees of Freedom .05 .01

1 3.84 6.64
2 5.99 9.21
3 7.82 11.34
4 9.49 13.28
5 11.07 15.09
6 12.59 16.81
7 14.07 18.48
8 15.51 20.09
9 16.92 21.67
10 18.31 23.21
11 19.68 24.72
12 21.03 26.22
13 22.36 27.69
14 23.68 29.14
15 25.00 30.58
16 26.30 32.00
17 27.59 33.41
18 28.87 34.80

19 30.14 36.19
20 31.41 37.57
21 32.67 38.93

22 33.92 40.29
23 35.17 41.64
24 36.42 42.98
25 37.65 44.31
26 38.88 45.64

27 40.11 46.96
28 41.34 48.28
29 42.56 49.59
30 43.77 50.89

Module 5
How to compute the Product-Moment Correlation (r)
(Testing the significant correlation between two variables)

Formula:   r = Σ(x − X)(y − Y) / [ (N)(SDx)(SDy) ]

r = the coefficient of correlation
Σ = sum of
x − X = difference between each score on Test X and the mean of Test X
y − Y = difference between each score on Test Y and the mean of Test Y
N = the number of pairs of scores
SDx = the standard deviation of Test X
SDy = the standard deviation of Test Y
Problem and Hypothesis:
Null Hypothesis
There is no significant correlation between Test X and Test Y Scores
Problem:
Test at 0.05 level of significance that there is no significant correlation between
test x and test y scores.

Students        Test X        Test Y

A               50            60
B               60            80
C               70            90
D               80            70
E               90            100

STATISTICAL PROCEDURE:

Step 1: Prepare the following columns of figures for each test:

Students   x     X    x − X   (x − X)²    y      Y    y − Y   (y − Y)²   (x − X)(y − Y)
A          50    70   −20      400        60    80    −20      400         400
B          60    70   −10      100        80    80      0        0           0
C          70    70     0        0        90    80     10      100           0
D          80    70    10      100        70    80    −10      100        −100
E          90    70    20      400       100    80     20      400         400
Total     350                 1000       400                  1000         700

X = 350/5 = 70          Y = 400/5 = 80

Step 2: To obtain the sum of (x-X)(y-Y) multiply columns x-X and y-Y as indicated
above, then add these products:

The sum of (x-X) (y-Y) = 700

Step 3: Determine the standard deviation of each test as follows:

SDx = 14.14 SDy = 14.14

Step 4: Substitute the numbers in the formula as follows:

r = 700 / [ (5)(14.14)(14.14) ]

  = 0.70

Step 5: Test the significance of r with the t-test formula:

t = r √( (N − 2) / (1.00 − r²) )

t = 0.70 √( (5 − 2) / (1.00 − 0.70²) )

t = 1.69

Step 6: Match the computed t-value with the critical t-value.

Computed Value              Critical Value
t = 1.69                    df = N − 2 = 5 − 2 = 3
                            .05 (df 3) = 3.18

IV. Findings:
t (1.69) is less than the critical value (3.18) for the correlation between Test X
and Test Y scores.
V. Decision:
Accept Ho.

VI. Interpretation: There is no significant correlation between Test X and Test Y scores.
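As an illustrative aid only, the Module 5 computation can be reproduced with the short Python sketch below. The score lists are those of the worked example (with 100 as student E's Test Y score, consistent with the mean of 80 used above), and the t formula is the one given in Step 5.

    from math import sqrt

    x = [50, 60, 70, 80, 90]      # Test X scores
    y = [60, 80, 90, 70, 100]     # Test Y scores
    n = len(x)

    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # 700
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)                       # 14.14
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)                       # 14.14

    r = sum_products / (n * sd_x * sd_y)          # 0.70
    t = r * sqrt((n - 2) / (1 - r ** 2))          # about 1.70 (1.69 with rounded figures)

    print(f"r = {r:.2f}, t = {t:.2f}")            # compare t with 3.18 (df = 3, .05 level)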

1. Problem: Test the hypothesis of no significant correlation between Grade
IV pupils' Process and Attitude scores at the 0.05 level of significance.
2. Data:

Student Test A Test B

(Process test) (Attitude test)

A 22 35

B 16 25

C 25 30

D 9 15

E 10 20

3. Question:

A. Describe the relationship between the pupils' performance in the Process and
Attitude tests.

B. Is the correlation significant?

Answer:

STATISTICAL PROCEDURE:

Step 1: Prepare the following columns of figures for each test:

Students   x     X      x − X   (x − X)²    y     Y    y − Y   (y − Y)²   (x − X)(y − Y)
A          22    16.4    5.6     31.36      35    25    10      100         56.0
B          16    16.4   −0.4      0.16      25    25     0        0          0
C          25    16.4    8.6     73.96      30    25     5       25         43.0
D           9    16.4   −7.4     54.76      15    25   −10      100         74.0
E          10    16.4   −6.4     40.96      20    25    −5       25         32.0
Total      82                   201.20     125                  250        205.0

X = 82/5 = 16.4          Y = 125/5 = 25

Step 2: To obtain the sum of (x − X)(y − Y), multiply columns x − X and y − Y as
indicated above, then add these products:

The sum of (x − X)(y − Y) = 205

Step 3: Determine the standard deviation of each test as follows:

SDx = 6.34 SDy = 7.07

Step 4: Substitute the numbers in the formula as follows:

r = 205 / [ (5)(6.34)(7.07) ]

  = 0.91

Step 5: Test the significance of r with the t-test formula:

t = r √( (N − 2) / (1.00 − r²) )

t = 0.91 √( (5 − 2) / (1.00 − 0.91²) )

t = 3.80

Step 6: Match the computed t-value with the critical t-value.

Computed Value              Critical Value
t = 3.80                    df = N − 2 = 5 − 2 = 3
                            .05 (df 3) = 3.18

V. Findings: t (3.80) is greater than the critical value (3.18).

VI. Decision: Reject Ho.

VII. Interpretation: There is a significant correlation between the Process and
Attitude test scores.

EXERCISE

Problem:

A unit test was given to a group of Grade VI pupils in Science. Here are the
scores: 28, 30, 33, 42, 17, 18, 33, 32, 36

Questions:

Compute the following:

Mean score

Standard deviation

Cut-off scores for the levels of academic performance

How many of the students are outstanding, very satisfactory, satisfactory,
moderately satisfactory, or needing improvement; also give the distribution
in 3 levels of performance.

Answer:

Mean score = Σx / N = 29.89

Standard deviation = √( Σ(x − X)² / N ) = 7.59

Cut-off scores:

For A and F:    7.59 × 1.8 = 13.66
                29.89 + 13.66 = 43.55 (A)
                29.89 − 13.66 = 16.23 (F)

For B and D:    7.59 × 0.6 = 4.55
                29.89 + 4.55 = 34.44 (B)
                29.89 − 4.55 = 25.34 (D)

C covers the scores between the D and B cut-offs.

Setting the levels of performance with the cut-off scores:

43.55 and up        A    —
34.44 – 43.54       B    42, 36
25.34 – 34.43       C    33, 33, 32, 30, 28
16.23 – 25.33       D    18, 17
16.22 and below     F    —

Summary of cut-off scores for the A, B, C, D, F levels of performance:

43.55 – up        (A) = 0
34.44 – 43.54     (B) = 2
25.34 – 34.43     (C) = 5
16.23 – 25.33     (D) = 2
16.22 – below     (F) = 0

N = 9

5 Levels of Performance

Level                           Scores                f
Outstanding (O)                 42, 36                2
Very Satisfactory (VS)          33, 33, 32            3
Satisfactory (S)                30, 28                2
Moderately Satisfactory (MS)    18, 17                2
Needs Improvement (NI)          —                     0

N= 9

3 Levels of Performance

Level              Scores                    f
Above Average      42, 36                    2
Average            33, 33, 32, 30, 28        5
Below Average      18, 17                    2

Module 6
(Biserial r) Testing Statistically the Significant Correlation between Variable A
(Continuous) and Variable B (Discontinuous)

I. Hypothesis Problem:
Is there a relationship between music appreciation and training in music?
A. Variables:
Variable A (Continuous) – Music appreciation test score
Variable B (Discontinuous) –
1. Those with training in Music (Group 1)
2. Those without training in Music (Group 2)

B. Formula:

Biserial coefficient of correlation, or biserial r:

r(bis) = [ ( X1(p) − X2(q) ) / SD ] × ( pq / u )

II. Hypothetical Data:

Music Apprec.      With training      Without training
Scores             in Music           in Music             Total

85 – 89                 5                   6                11
80 – 84                 2                  16                18
75 – 79                 6                  19                25
70 – 74                 6                  27                33
65 – 69                 1                  19                20
60 – 64                 0                  21                21
55 – 59                 1                  16                17

Total              N1 = 21            N2 = 124             N = 145

III. Hypothesis:

There is no significant correlation between performance in music


appreciation scores and training in music.

IV. Statistical Procedures:

A. Step 1. Compute the following values:


1. X1 (p) = mean of group 1 (77.00)
2. X2 (q) = mean of group 2 (71.35)
3. SD = standard deviation of all scores (8.80)

B. Step 2. Compute p and q.

1. p = proportion of group 1 (0.145, or 14.5%)
2. q = proportion of group 2 (0.855, or 85.5%)
3. p and q as proportions of the area under the normal curve: the untrained
group (q) covers 85.5% of the area and the trained group (p) the remaining 14.5%.

C. Step 3. Compute u.

u is the height of the ordinate of the normal curve at the point dividing the
p and q areas (read from a table of ordinates of the normal curve, as in Step 3b).

Step 3a – Solve for the area above the mean (AM), using the larger group (q):

AM = (q − 50%) / 100
   = (85.5% − 50%) / 100
   = 35.5% / 100
   = 0.355

Step 3b – See the tabled value for AM = 0.355 to get the value of u.

AM = 0.355
u = 0.288

D. Step 4 – Substitute the computed values in the formula for r(bis).

r(bis) = [ ( X1(p) − X2(q) ) / SD ] × ( pq / u )

       = [ (77.00 − 71.35) / 8.80 ] × [ (0.145)(0.855) / 0.288 ]

       = (5.65 / 8.80) × (0.124 / 0.288)

       = 0.642 × 0.431

r(bis) = 0.276 or 0.28

E. Step 5 – Test the significance of r with the t-test:

t = r √( (N − 2) / (1 − r²) )

t = 0.28 √( (145 − 2) / (1 − 0.28²) )

t = 0.28 √( 143 / 0.92 )

t = 0.28 √155.43

t = 0.28 × 12.47

t = 3.49

V. Findings:
1. What is the value of r?

● r(bis) = 0.276 or 0.28

How do you describe it: low, marked, strong, high?

● Low

2. Is the correlation significant?

● Significant

Support this with your t-test findings.

● The computed t (3.49) is higher than the critical value at .05 (1.96 for
df over 100), so the correlation is significant.
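The arithmetic of this module can be verified with the Python sketch below (illustrative only, not part of the module). It reuses the summary values computed in Steps 1–3; in practice the ordinate u is read from a normal-curve table, as in Step 3b.

    from math import sqrt

    mean_trained = 77.00      # X1(p): mean of the group with training in music
    mean_untrained = 71.35    # X2(q): mean of the group without training
    sd_all = 8.80             # SD of all 145 scores
    p, q = 0.145, 0.855       # proportions of the two groups
    u = 0.288                 # ordinate of the normal curve at the p/q split

    r_bis = (mean_trained - mean_untrained) / sd_all * (p * q) / u   # about 0.28

    n = 145
    t = r_bis * sqrt((n - 2) / (1 - r_bis ** 2))                     # about 3.4

    print(f"r_bis = {r_bis:.2f}, t = {t:.2f}")    # compare t with 1.96 (.05, df over 100)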

Module 7
Testing the Significant Change in Opinions/Attitudes of a Group After
Treatment

I. Hypothetical Problem: Thirty nursing students were given a personality
inventory test before and after a Personality Development Program. Test at .05
that there is no significant change in the attitude of the subjects after the
development program.

II. Hypothetical Data:

                              AFTER
                       Negative      Positive
BEFORE   Positive      13 (A)         2 (B)
         Negative       9 (C)         6 (D)

III. Hypothesis: There is no significant change in attitude after the personality
development program.

IV. Level of Significance – set at .05

V. df = (R − 1)(C − 1)

VI. Test of Significant Change

VII. Formula:

x² = ( |A − D| − 1 )² / ( A + D )

VIII. Computation

x² = ( |13 − 6| − 1 )² / 19          df = (R − 1)(C − 1)

   = ( 7 − 1 )² / 19                    = (2 − 1)(2 − 1)

   = 36 / 19                            = (1)(1)

   = 1.89                               = 1

IX. Findings

x² (1.89) is less than the critical value at .05 (3.84).

X. Decision

Accept Ho of No Significant change

XI. Interpretation

The attitude of the thirty nursing students did not change after the
development program. Attitudes are difficult to change; it takes a longer time to
change one's attitude.
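A short Python sketch of this test of significant change follows (illustrative only, not part of the module). It implements the chi-square formula given in VII, where A and D are the cells in which attitudes changed, and uses 3.84 as the tabled chi-square value at .05 for df = 1.

    def change_chi_square(a, d):
        # Chi-square for significant change: (|A - D| - 1)^2 / (A + D)
        return (abs(a - d) - 1) ** 2 / (a + d)

    chi_sq = change_chi_square(13, 6)      # A = 13, D = 6 from the hypothetical data
    critical_05 = 3.84                     # Table B value at .05, df = 1

    print(f"chi-square = {chi_sq:.2f}")    # prints 1.89
    if chi_sq > critical_05:
        print("Reject Ho: the change in attitude is significant")
    else:
        print("Accept Ho: no significant change in attitude")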

Module 8
How to compute the t-ratio (Testing the Significant Difference between two
Mean Scores of Large N’s)

I. Formula

t = ( X1 − X2 ) / √( SD1²/N1 + SD2²/N2 )

t = t-ratio, also called the t-test or critical ratio
X1 = mean of sample 1
X2 = mean of sample 2
SD1²/N1 = squared standard error of the mean for sample 1
SD2²/N2 = squared standard error of the mean for sample 2
N1 = number of cases in sample 1
N2 = number of cases in sample 2

II. Problem and Null Hypothesis:

A. Problem:
School A and School B were compared in their performance on an
achievement test in Mathematics given to Grade VI pupils of the same
city. Test at the .05 level that there is no significant difference in mean
scores between the two schools.

B. Null Hypothesis (Ho): There is no significant difference in mean
scores between School A and School B in a Mathematics test.

C. Data:        School A        School B        Difference
    X              69              67               2
    SD             10              12
    N             100             144

IV. Statistical Procedure:

Step 1 – Determine the means of samples 1 and 2.

Step 2 – Compute the standard error of the mean for samples 1 and 2.

SE(X1) = √( (10)²/100 ) = √(100/100) = √1 = 1

SE(X2) = √( (12)²/144 ) = √(144/144) = √1 = 1

Step 3 – Substitute the numbers in the formula and perform the indicated operations.

t = (69 − 67) / √( (10)²/100 + (12)²/144 )

  = 2 / √( 1 + 1 )

  = 2 / 1.41

t = 1.42

Step 4 – Match the computed value with the critical value:

Computed Value              Critical Value at .05
t = 1.42                    df = (N1 − 1) + (N2 − 1)
                            df = (100 − 1) + (144 − 1)
                            df = 99 + 143
                            df = 242
                            .05 (df 242) = 1.96

V. Findings:

t = (1.42) is less than critical value (1.96)

VI. Decision Accept Ho.


There is no significant difference in the mean scores between School A
and School B in a Mathematics Achievement Test.

VII. Interpretation/ Implications:


The performance of pupils in School A and School B is comparable.
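The t-ratio computation for two large samples can also be carried out with the Python sketch below (an illustration, not part of the module), using the formula given in I and the School A / School B data.

    from math import sqrt

    def t_ratio(mean1, sd1, n1, mean2, sd2, n2):
        # t = (X1 - X2) / sqrt(SD1^2/N1 + SD2^2/N2)
        standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
        return (mean1 - mean2) / standard_error

    t = t_ratio(69, 10, 100, 67, 12, 144)   # School A vs. School B
    df = (100 - 1) + (144 - 1)              # 242 degrees of freedom
    critical_05 = 1.96                      # tabled t for df over 100

    print(f"t = {t:.2f}, df = {df}")        # about 1.41 (1.42 above, from rounding)
    print("Reject Ho" if abs(t) > critical_05 else "Accept Ho: no significant difference")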

EXERCISE:

I. Problem: Class A and Class B were compared in their performance on
an achievement test in Biology given to first-year high school students.
Test at the .05 level that there is no significant difference in mean scores
between the two classes.

II. Data:

Class A Class B

X 74.55 62.42

SD 14.70 18.25

N 6.6 8.2

III. Questions:
1. Construct the following:

a. RH
b. Ho
c. Ha

2. Compute

a. T
b. Critical value at .05

3. Give the following:

a. Findings
b. Decision
c. Interpretation

Answers:

1.
a. RH

b. Ho – There is no significant difference in mean scores between the two
classes.

c. Ha

2. Compute

a. t

t = (74.55 − 62.42) / √( (14.70)²/6.6 + (18.25)²/8.2 )

  = 12.13 / √( 216.09/6.6 + 333.06/8.2 )

  = 12.13 / √( 32.74 + 40.62 )

  = 12.13 / √73.36

  = 12.13 / 8.57

t = 1.42

b. critical value at .05

Computed Value              Critical Value at .05
t = 1.42                    df = (N1 − 1) + (N2 − 1)
                            df = (6.6 − 1) + (8.2 − 1)
                            df = 5.6 + 7.2
                            df = 12.8
                            .05 (df ≈ 13) = 2.16
3.
a. Findings

– t (1.42) is less than the critical value (2.16)

b. Decision: Accept Ho.


There is no significant difference in mean scores between the two
classes.

c. Interpretation/ Implication
The performance of Class A and Class B on the achievement test is
comparable.

Module 9
How to compute the Significant difference between two mean scores when N is
small (below 30)

A. Hypothetical Problem

An experiment is conducted to find out if the use of modular instruction is


effective in the teaching of Mathematics in Grade IV. Two classes, experimental
and control groups were organized. Test at .05 that there is no significant
difference in performance between the two groups. Is modular instruction
effective in teaching Grade IV Mathematics?

B. Hypothetical Data

Control Group Experimental Group

Scores Scores

18 15

17 18

16 19

12 18

19 19

15 20

19 19

12 N=8

N = 10

C. Formula

(Critical t-ratio for small N's)

t = ( X1 − X2 ) / √( S1²/N1 + S2²/N2 )      where S² = Σ( x − X )² / ( N − 1 )

D. Statistical Procedures

Step 1: Compute the mean score of each group:

a. Mean 1 = 16
b. Mean 2 = 18

Step 2: Compute the variance (S²) of each group:

a. S1² = 64 / (10 − 1) = 64/9 = 7.11

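The remaining steps of this module are not shown above. As an illustrative sketch only (not the module's own worked answer), the formula in C can be implemented in Python as follows; the score lists are the ones printed above, and since the module states N's of 10 and 8 while the listing shows fewer scores, the printed result should be treated purely as an illustration of the procedure.

    from math import sqrt

    def sample_variance(scores):
        # S^2 = sum of (x - X)^2 divided by (N - 1)
        m = sum(scores) / len(scores)
        return sum((x - m) ** 2 for x in scores) / (len(scores) - 1)

    def small_n_t(group1, group2):
        # t = (X1 - X2) / sqrt(S1^2/N1 + S2^2/N2)
        m1 = sum(group1) / len(group1)
        m2 = sum(group2) / len(group2)
        se = sqrt(sample_variance(group1) / len(group1) +
                  sample_variance(group2) / len(group2))
        return (m1 - m2) / se

    control = [18, 17, 16, 12, 19, 15, 19, 12]            # scores as listed above
    experimental = [15, 18, 19, 18, 19, 20, 19]
    print(f"t = {small_n_t(control, experimental):.2f}")  # compare with Table A at the chosen df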

Module 10
How to compute the significant difference in two mean scores between
pre-test and post-test scores.

I. Hypothetical Problem
Test at .05 level if there is a significant mean gain between pre and
post test scores in an activity- centered science class.

II. Null Hypothesis


There is no significant differences in mean scores between pre and
post test data.

III. Formula

t = Xd / ( S / √N )        where S² = Σ( xd − Xd )² / ( N − 1 )

IV. Data

Students   Pre-test score   Post-test score   xd    Xd   xd − Xd   (xd − Xd)²
A               8                11            3     3      0           0
B               2                 4            2     3     −1           1
C               6                10            4     3     +1           1

N = 3                                                    Σ(xd − Xd)² = 2

V. Statistical Procedure

Step 1 – Compute the mean difference (Xd): Xd = (3 + 2 + 4) / 3 = 3

Step 2 – Compute the variance (S²):

S² = Σ( xd − Xd )² / ( N − 1 ) = 2 / (3 − 1) = 2/2 = 1

S = √1 = 1
Step 3 – Compute the square root of N:

√3 = 1.73

Step 4 – Substitute the values in the formula:

t = Xd / ( S / √N )

t = 3 / ( 1 / 1.73 )

t = 3 / 0.58

t = 5.17

Step 5 – Match the computed t-value with the tabled value.

Computed Value              Tabled Value at .05
t = 5.17                    df = N − 1 = 3 − 1 = 2
                            .05 = 4.30 (two-tailed)
                                = 2.92 (one-tailed)

VI. Findings
t( 5.17) > .05 (2.92)
The computed t (5.17) is greater than the tabled value of 2.92.

VII. Decision
Reject the Null Hypothesis.

VIII. Interpretation:

There is a significant difference between the pre-test and post-test mean
scores, pointing to gains from the activity. The pupils have improved in their
science skills after the activity.
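As an illustrative check (not part of the module), the pre/post computation can be reproduced with the Python sketch below, using the formula in III; carrying more decimal places gives a t of about 5.20 rather than the rounded 5.17 above.

    from math import sqrt

    pre = [8, 2, 6]
    post = [11, 4, 10]

    d = [b - a for a, b in zip(pre, post)]                      # gains: 3, 2, 4
    n = len(d)
    mean_d = sum(d) / n                                         # 3
    s = sqrt(sum((x - mean_d) ** 2 for x in d) / (n - 1))       # S = 1

    t = mean_d / (s / sqrt(n))                                  # about 5.20
    df = n - 1                                                  # 2

    print(f"t = {t:.2f}, df = {df}")   # compare with 2.92 (one-tailed) or 4.30 (two-tailed)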
