ED 203: TEST CONSTRUCTION AND EVALUATION

Professor
Chapter 4
DESCRIBING EDUCATIONAL DATA
Objectives:
At the end of this chapter, the students should be able to:
➢ 1. present and interpret data correctly in tabular or graphic form
➢ 2. arrange data in tables and graphs in a correct fashion
➢ 3. compute the mean, median and mode of a set of test scores.
➢ 4. describe the relationship between the shape of a distribution and the relative positions
of the measures of central tendency.
➢ 5. explain how the measures of central tendency differ and the significance of those
differences
➢ 6. determine the standard deviation and semi-interquartile range of a set of test
scores.
➢ 7. express the relationship between standard deviation units and area under a normal
curve.
➢ 8. compute the coefficient of correlation using different methods
➢ 9. interpret the Pearson and Spearman measures of relationship
➢ 10. show appreciation for the value of the information presented in the chapter to an
educator who wishes to describe and interpret data.

Preparing a Frequency Distribution


One way of organizing the scores for presentation is to arrange them in what is termed a frequency
distribution. This is a table showing how often each score occurred. Each score value is listed
and the number of times it occurred is shown.
Steps in drawing a frequency distribution follow:
a. Find the Range of the scores.
b. Decide on the number or size of the grouping.
GROUPING refers to the number of steps.
• Maximum No. of Grouping - 20
• Minimum No. of Grouping - 7
• Ideal No. of Grouping - 10-15

c. Determine the interval


Range ÷ No. of Steps = Interval
Example: 45 ÷ 10 = 4.5, rounded up to 5

d. Get the Lowest Limit (L.L.) of the step interval.


Divide the Lowest Score by the Interval, drop the remainder, and then multiply by the Interval.
Example: 42 ÷ 5 = 8 (remainder dropped); 8 x 5 = 40
So the lowest step interval is 40-44, and its Lowest Limit is 40.

A Math Test is given to a class. Here are the scores of the 50 pupils. Let's make a
frequency distribution and tally the frequency.
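Steps a to d can be sketched in Python. This is a minimal illustration, not the handout's worksheet: the ten scores are hypothetical (the class's 50 scores are not reproduced in this text), and the function name `frequency_distribution` is made up for the example.

```python
import math
from collections import Counter

def frequency_distribution(scores, n_groups=10):
    """Tally integer scores into step intervals, following steps a-d."""
    score_range = max(scores) - min(scores)                  # a. range
    interval = max(1, math.ceil(score_range / n_groups))     # c. 45 / 10 = 4.5 -> 5
    lowest_limit = (min(scores) // interval) * interval      # d. 42 // 5 * 5 = 40
    # Count how many scores fall in each step above the lowest limit.
    tally = Counter((s - lowest_limit) // interval for s in scores)
    return [
        (lowest_limit + step * interval,             # lower limit of the step
         lowest_limit + (step + 1) * interval - 1,   # upper limit of the step
         tally.get(step, 0))                         # frequency
        for step in range(max(tally) + 1)
    ]

# Hypothetical scores: lowest 42, highest 87, so the range is 45.
scores = [42, 45, 51, 58, 58, 63, 64, 70, 77, 87]
for low, high, f in frequency_distribution(scores):
    print(f"{low}-{high}: {'/' * f} ({f})")
```

Rounding the interval up (rather than to the nearest whole number) is an assumption that matches the 4.5 to 5 example; other texts may round differently.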
Graphic Representation

It is often helpful to translate data into a pictorial representation. A common type of graphic
representation, which is called a HISTOGRAM, is shown below.
[Figure: Histogram of Reading Scores; horizontal axis: score interval]

-This can be thought of, somewhat grimly, as "piling up
the bodies". The score intervals are shown along the horizontal base-line (abscissa). The
vertical height of the pile (ordinate) represents the number of cases.
-The diagram indicates that there are two "bodies" piled up in the interval 16-18, three
in the interval 19-21, and so forth. This figure gives a clear picture of how the cases pile up,
with most of them in the 30's and a long low pile running up to the high score.

Another way of picturing the same data is by preparing a FREQUENCY POLYGON.


This is shown below.

Frequency Polygon of Reading Scores

Here we have plotted a point at the mid-point of each of our score intervals. The
height at which we have plotted the point corresponds to the number of cases, or frequency
(f), in the interval. These points have been connected and the jagged line provides a
somewhat different picture of the same set of data illustrated in Fig. 1.
MEASURES OF CENTRAL TENDENCY
A measure of central tendency is a single value that attempts to describe a set of data by identifying the central
position within that set of data. As such, measures of central tendency are sometimes called
measures of central location.

The three measures of Central tendency


1. MEAN
2. MEDIAN
3. MODE

MEAN
➢ The Mean (x̄ ) is the arithmetic average of a set of scores. It involves the values of the
scores in the distribution.
➢ The most dependable measure of central tendency.
➢ The most reliable since all scores are important.

THE MEAN OF THE UNGROUPED SCORES


The formula is:
$$\text{Mean} = \frac{\sum X}{N}$$
Where: X = each score
N = number of scores
∑ = sum or summation of all scores
To compute the Mean of grouped scores, the following steps are followed:
1. Choose the interval (for the assumed mean) to be the arbitrary starting point or
“origin”. In this example the interval 65-69 has been chosen. Call this interval zero.
2. Call the next higher interval +1, the one above that +2, etc.; call the next lower -1, the
one below that -2, etc. These are shown in the column labeled d. This column indicates
the number of interval steps each interval is above or below our chosen assumed
mean.
3. For each row, multiply the number of cases or frequency (f) by the number of steps or
deviations (d) above or below chosen origin. These products give the values in the
column fd.
4. Sum the values in the fd column taking account of the plus and minus signs.
5. Sum the frequencies in the column f to get the total number of cases in the group. This
is usually labeled N.
6. Divide the sum of the fd by N.
7. Multiply the answer in step 6 by the interval.
8. Add the result to the assumed mean.
The assumed mean in our example is 67 (67 is the midpoint of the interval 65-69). So,
67 + 3.5 = 70.5.
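The eight steps can be sketched as a short Python function. The frequency table below is hypothetical: it was chosen only so that the correction comes out to 3.5, as in the example, and it is not the table the handout refers to. The name `grouped_mean` is also an assumption.

```python
def grouped_mean(intervals, frequencies, origin_index):
    """Mean of grouped scores by the assumed-mean method.

    intervals: (low, high) pairs of equal width, in ascending order
    frequencies: number of cases (f) in each interval
    origin_index: position of the interval chosen as the origin (d = 0)
    """
    low, high = intervals[origin_index]
    width = high - low + 1                     # 65-69 has an interval of 5
    assumed_mean = (low + high) / 2            # midpoint of the origin interval
    n = sum(frequencies)                       # step 5: N
    # Steps 2-4: d is the number of steps from the origin; sum the f * d products.
    sum_fd = sum(f * (i - origin_index) for i, f in enumerate(frequencies))
    return assumed_mean + (sum_fd / n) * width  # steps 6-8

# Hypothetical table with the origin at 65-69 (index 2).
intervals = [(55, 59), (60, 64), (65, 69), (70, 74), (75, 79)]
frequencies = [1, 2, 5, 6, 6]
print(grouped_mean(intervals, frequencies, origin_index=2))
```

Here the sum of fd is 14 and N is 20, so the correction is (14 ÷ 20) × 5 = 3.5, which is added to the assumed mean of 67.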
MEDIAN
➢ the middle number in a sorted, ascending or descending, list of numbers and can be
more descriptive of that data set than the average.
➢ the value on the score scale that separates the top half of the group from the bottom
half.

Steps in finding the median of ungrouped scores:


a) Arrange the scores from highest to lowest.
b) Add one to the total number of cases (N + 1).
c) Divide the total by 2.
d) Find the ((N + 1) ÷ 2)th score in the ordered list; this score is the median.
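A minimal Python sketch of these steps follows. How to handle an even number of cases is not stated in the steps above; averaging the two middle scores is the usual convention and is an assumption here. The function name is illustrative.

```python
def median_ungrouped(scores):
    """Median of ungrouped scores, following steps a-d."""
    ordered = sorted(scores, reverse=True)    # a) highest to lowest
    position = (len(ordered) + 1) / 2         # b), c) (N + 1) / 2
    if position.is_integer():                 # odd N: one middle score
        return ordered[int(position) - 1]
    # Even N (assumed convention): average the two scores around the position.
    upper = ordered[int(position) - 1]
    lower = ordered[int(position)]
    return (upper + lower) / 2

print(median_ungrouped([12, 15, 9, 20, 17]))
print(median_ungrouped([96, 97, 98, 97, 93, 90, 89, 97, 81, 80]))
```

For the second list N = 10, so the position is 5.5 and the median falls halfway between the 5th and 6th scores (96 and 93).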
To compute the median of grouped scores, the following steps are followed:
1. Accumulate the scores up through each score interval. The cumulative frequencies as
shown in our examples are 1, 3, 7, 10, 16, 25, etc.
2. Calculate the number of cases that represents 50 percent of the total.
3. Find the interval for which the cumulative frequency is just less than the required
number of cases.
4. Find the score distance to be added to the top of this interval in order to include the
required number of cases, by this operation:
$$\left(\frac{\text{Number of additional cases required}}{\text{Number of cases in next interval}}\right)(\text{interval})$$
5. Add this amount to the Lowest Limit of the next interval (that is, the top of the interval
found in step 3).
The Lowest Limit is 104.5
So 104.5 + 1.32 = 105.82
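The grouped-median steps can be sketched as follows. The frequency table is hypothetical: it was picked so that the result reproduces 104.5 + 1.32, but it is not the handout's own table (whose cumulative frequencies run 1, 3, 7, 10, 16, 25, ...). The function name and the use of x - 0.5 as the exact lower limit of an integer interval are assumptions.

```python
def grouped_median(intervals, frequencies):
    """Median of grouped scores by interpolation within an interval.

    intervals: (low, high) integer pairs of equal width, ascending order
    frequencies: number of cases in each interval
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no cases to summarize")
    width = intervals[0][1] - intervals[0][0] + 1
    half = n / 2                               # step 2: 50 percent of the cases
    cumulative = 0                             # step 1: running total of cases
    for (low, _high), f in zip(intervals, frequencies):
        if f > 0 and cumulative + f >= half:   # median lies in this interval
            exact_lower = low - 0.5            # top of the previous interval
            additional = half - cumulative     # cases still required (step 4)
            return exact_lower + (additional / f) * width   # step 5
        cumulative += f
    raise ValueError("median interval not found")

# Hypothetical table: N = 50, so 25 cases are required.
intervals = [(90, 94), (95, 99), (100, 104), (105, 109), (110, 114)]
frequencies = [3, 7, 10, 19, 11]
print(round(grouped_median(intervals, frequencies), 2))
```

In this table 20 cases lie below 104.5, so 5 more are needed out of the 19 in the next interval: (5 ÷ 19) × 5 ≈ 1.32.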
MODE
➢ is the value that occurs most often.
➢ score that occurs most frequently.
➢ It is the score that occurs more times than any other score.
➢ It is the score having the highest point in a frequency polygon.
Given these ungrouped scores, find the mode.
96, 97, 98, 97, 93, 90, 89, 97, 81, 80
Formula in finding the mode of grouped scores
Mode = 3 Median – 2 Mean
Let’s assume that the Median of a set of grouped scores is 62.55 and the Mean is
60.25. What is the mode?
To translate the formula:
Mode = 3 x 62.55 – 2 x 60.25
= 187.65 – 120.5
= 67.15
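Both ways of finding the mode can be checked with a few lines of Python. This is only a sketch; the function names are illustrative, and returning a list of modes (to allow for ties) is a design choice not discussed in the text.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs the greatest number of times."""
    counts = Counter(scores)
    highest = max(counts.values())
    return [score for score, count in counts.items() if count == highest]

def estimated_mode(median, mean):
    """Mode of grouped scores from the relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

scores = [96, 97, 98, 97, 93, 90, 89, 97, 81, 80]
print(modes(scores))                               # [97]
print(round(estimated_mode(62.55, 60.25), 2))      # 67.15
```

The score 97 occurs three times and every other score occurs once, so 97 is the mode of the ungrouped scores.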

When to use the different measures of central tendency


Mean is used:
A. When the scores are distributed symmetrically around a central point; that is, the
distribution is not badly skewed.
B. When what is needed is the measure of central tendency having the greatest stability.
C. When other statistics are to be computed later.

Median is used:
A. When one wants the exact midpoint or 50% of the distribution.
B. When there are extreme scores which would markedly affect the mean.
C. When it is desired that certain scores should influence the central tendency but all that
is known about them is that they are above or below the median.

Mode is used:
A. When a quick/approximate measure of central tendency is all that is wanted.
B. When the measure of central tendency should be the most typical value.

MEASURES OF VARIABILITY
Measures of Variation or Dispersion
-indicate the degree or extent to which numerical values are dispersed or spread out
about the average value in a distribution.
Range
-the difference between the highest and the lowest score.
Formula:
R = HS – LS
Example:
In a reading test, the highest score is 59 and the lowest score is 17. What is the
range? R = 59 – 17 = 42
*FOR GROUPED DATA

Range
-is determined by subtracting the lower class boundary of the lowest class interval from
the upper class boundary of the highest class interval of the distribution

Interquartile Range (IQR)


-is found by taking the difference between the values of the third quartile (Q3) or
upper quartile and the first quartile (Q1) or lower quartile.

Semi-Interquartile Range (SIQR) or Quartile Deviation (QD)


-indicates the variation or dispersion of the values covering the middle 50% of the
distribution of data.
-it is found by getting half of the value or distance between the third quartile or upper
quartile and the first quartile or the lower quartile
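As a rough illustration, the IQR and SIQR can be computed with Python's standard `statistics.quantiles`. The scores are hypothetical, and the quartile values depend on the interpolation method: the default "exclusive" method used here is an assumption and may differ slightly from the hand method taught in class.

```python
from statistics import quantiles

def quartile_spread(scores):
    """Return (IQR, SIQR) for a set of scores."""
    q1, _q2, q3 = quantiles(scores, n=4)   # Q1, median, Q3
    iqr = q3 - q1                          # interquartile range
    return iqr, iqr / 2                    # SIQR is half of the IQR

# Hypothetical set of 11 scores.
scores = [7, 9, 10, 11, 11, 12, 14, 15, 17, 20, 24]
iqr, siqr = quartile_spread(scores)
print(iqr, siqr)
```

With these 11 scores, Q1 is the 3rd score (10) and Q3 is the 9th score (17), so the IQR is 7 and the SIQR is 3.5.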
Percentile
-refers to those values that divide a distribution into one hundred equal parts.

Percentile rank
-tells what percent of the cases got below the rank position.

Percentile point (Pn)


-is the score or value that corresponds to the given percentile rank
a. The student who scored 24 surpasses all the others. Recall that in the discussion of class
intervals, the boundaries of a score x are x – 0.5 and x + 0.5, respectively.
The 100th percentile point is the upper boundary of the highest score. Thus, the score
corresponding to the 100th percentile rank is 24.5, or P100 = 24.5.
b. The score 11 surpasses half or 50% of the students. Therefore, P50 = 11.
c. The lowest score is 7. It means that there are no scores below 7. The zeroth percentile point
is the lower boundary of the lowest score. Thus, the score corresponding to the zero
percentile rank is 6.5, or P0 = 6.5.
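The definitions above can be sketched in Python. The ten scores are hypothetical; only the lowest (7) and highest (24) are taken from the example, since the full score list is not reproduced here. The function follows the definition given in this section (percent of cases below the score); some texts also count half of the tied cases, which this sketch does not do.

```python
def percentile_rank(scores, score):
    """Percent of the cases that fall below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def extreme_percentile_points(scores):
    """P0 and P100 taken as the boundaries x - 0.5 and x + 0.5."""
    return min(scores) - 0.5, max(scores) + 0.5

# Hypothetical scores with lowest score 7 and highest score 24.
scores = [7, 9, 10, 11, 11, 12, 14, 15, 17, 24]
print(percentile_rank(scores, 12))
print(extreme_percentile_points(scores))
```

Five of the ten hypothetical scores are below 12, so a score of 12 has a percentile rank of 50.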
STANDARD DEVIATION OR SD
-is a measure of dispersion among all scores in the distribution rather than through the
extreme scores only. It is the square root of the average of the squared deviations from the mean.

Here are the steps followed in computing the S.D. of the Ungrouped Scores.
a. Find the Mean
b. Subtract the Mean from each score to get the deviation (d).
c. Square each deviation.
d. Find the sum of the squared deviations (∑d²).
e. Divide the sum of the squared deviation by the number of cases.
f. Find the square root of the answer in e.
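Steps a to f translate directly into Python. The scores below are hypothetical, and the function name is illustrative. Note that the steps divide by the number of cases N, so this sketch gives the same result as `statistics.pstdev`, not the N - 1 version used for samples.

```python
import math

def standard_deviation(scores):
    """S.D. of ungrouped scores, following steps a-f."""
    n = len(scores)
    mean = sum(scores) / n                          # a. mean
    deviations = [x - mean for x in scores]         # b. deviation of each score
    sum_squared = sum(d * d for d in deviations)    # c, d. sum of squared deviations
    return math.sqrt(sum_squared / n)               # e, f. divide by N, take the root

# Hypothetical scores with mean 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(scores))
```

For these scores the squared deviations sum to 32, and 32 ÷ 8 = 4, so the S.D. is 2.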

Here are the steps followed in computing the S.D. of the Grouped Scores.
1. Do steps 1-5 in finding the Mean
2. Add the column fd². To get the fd², multiply the d by the fd.
3. Get the sum of fd².
4. Divide the sum of fd² by the number of cases (N).
5. Divide the sum of f𝐝 by the number of cases.
6. Square the result in step no. 5.
7. Subtract the result in step No. 6 from the result in step no. 4
8. Extract the square root of the difference found in step no 7.
9. Multiply the class interval by the result in step No. 8.
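The nine steps for grouped scores can be sketched as follows. The frequency table is hypothetical (five intervals of width 5 with the origin at the middle interval), and the function name is an assumption.

```python
import math

def grouped_sd(frequencies, origin_index, interval):
    """S.D. of grouped scores using step deviations (d) from an origin."""
    n = sum(frequencies)
    d = [i - origin_index for i in range(len(frequencies))]
    sum_fd = sum(f * di for f, di in zip(frequencies, d))
    sum_fd2 = sum(f * di * di for f, di in zip(frequencies, d))   # steps 2-3
    # Steps 4-7: mean of fd squared minus the square of the mean of fd.
    variance_in_steps = sum_fd2 / n - (sum_fd / n) ** 2
    return interval * math.sqrt(variance_in_steps)                # steps 8-9

# Hypothetical table: intervals 55-59 ... 75-79, origin at 65-69 (index 2).
frequencies = [1, 2, 5, 6, 6]
print(round(grouped_sd(frequencies, origin_index=2, interval=5), 2))
```

Here the sum of fd² is 36 and the sum of fd is 14 with N = 20, so the value under the root is 1.8 - 0.49 = 1.31, and the S.D. is 5 × √1.31 ≈ 5.72.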
INTERPRETING THE STANDARD DEVIATION
Thorndike and Hagen believe that it is almost impossible to say in any simple terms
what the standard deviation is or what it corresponds to in pictorial or geometric terms.
Primarily, it is a statistic that characterizes a distribution of scores. It increases in direct
proportion as the scores spread out more widely. The larger the standard deviation, the wider
the spread of scores.

The standard deviation gets its most clear-cut meaning for one particular
type of distribution of scores. This distribution is called the “normal” distribution. It is defined
by a particular mathematical equation, but to the everyday user it is defined approximately
by its pictorial qualities. The “normal” curve is a symmetrical curve having a bell-like shape.
That is, most scores pile up in the middle scores values; as one goes away from the middle in
either direction the pile drops off, first slowly and then more rapidly, and the cases tail out to
relatively long tails on either end.
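The link between standard deviation units and area under the normal curve can be illustrated with Python's standard `statistics.NormalDist`. This is a sketch using the standard normal curve (mean 0, S.D. 1); the helper name `area_within` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} S.D. of the mean: {area_within(k):.4f}")
```

The three proportions are about 0.6827, 0.9545, and 0.9973, which is why nearly all cases in a normal distribution fall within three standard deviations of the mean.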
Pearson Product-Moment Correlation Coefficient
CORRELATION
➢ is a measure of relationship between two variables.
Most measures of correlation indicate two things:
➢ the magnitude or size of relationship
➢ the direction of relationship between two sets of measurements
Note:
➢ For instance, correlations of +.85 and -.85 are of the same size. The sign does not have
anything to do with the size of the relationship; rather, it indicates the direction of the
relationship.
➢ When two variables are positively related, one increases as the other increases.
➢ On the other hand, when two variables are negatively related, one increases as
the other decreases.

Pearson Product-Moment Correlation Coefficient


➢ the most commonly used correlation coefficient.
➢ symbolized by “r”

The size of the correlation varies from +1 through 0 to -1.

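Computing the r from Raw Scores

$$r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$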
Directions for computing a product-moment correlation coefficient (r) from ungrouped
data.
1. Write pairs of scores to be studied in two columns. Be sure that the pair of scores for
each pupil is in the same row. Label one set of scores X and the other Y.
2. Square each score in the X column and write the result in the X² column.
3. Square each score in the Y column and enter each result in the Y² column.
4. Multiply each score in the X column by its pair in the Y column. Enter the product in
the XY column.
5. Add all the entries in each column to get the sum (∑) for each column.
6. Note the number (N) of pairs of scores.
7. Substitute the values obtained in the formula.

The computation of the coefficient of correlation actually involves the mean and standard
deviations of each set of scores (X and Y), although this is not readily apparent in the above
formula (Gronlund, 1981). Thus, the formula can also be written as:
$$r = \frac{\dfrac{\sum XY}{N} - (M_X)(M_Y)}{(SD_X)(SD_Y)}$$
Where:
$M_X$ = mean of scores in column X
$M_Y$ = mean of scores in column Y
$SD_X$ = standard deviation of scores in column X.
$SD_Y$ = standard deviation of scores in column Y.
Thus for these data:
$$r = \frac{\dfrac{1824}{10} - (15)(11)}{(4.65)(4.31)}$$
$$r = 0.87$$
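Both forms of the formula can be sketched in Python. The five pairs of scores are hypothetical (the ten pairs behind the worked figures are not reproduced in this text), and the function names are illustrative. The second function simply substitutes summary values into the mean and standard deviation form.

```python
import math

def pearson_r_raw(x, y):
    """Pearson r from raw scores, following directions 1-7."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)                 # X squared column
    sum_y2 = sum(v * v for v in y)                 # Y squared column
    sum_xy = sum(a * b for a, b in zip(x, y))      # XY column
    numerator = sum_xy / n - (sum_x / n) * (sum_y / n)
    sd_x = math.sqrt(sum_x2 / n - (sum_x / n) ** 2)
    sd_y = math.sqrt(sum_y2 / n - (sum_y / n) ** 2)
    return numerator / (sd_x * sd_y)

def pearson_r_from_summary(sum_xy, n, mean_x, mean_y, sd_x, sd_y):
    """Pearson r from the sum of XY, the means, and the standard deviations."""
    return (sum_xy / n - mean_x * mean_y) / (sd_x * sd_y)

# Hypothetical pairs of scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r_raw(x, y), 2))

# The worked figures quoted in the text.
print(round(pearson_r_from_summary(1824, 10, 15, 11, 4.65, 4.31), 2))   # 0.87
```

The two functions are algebraically the same calculation; the raw-score version just computes the means and standard deviations along the way.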

Directions for computing the Pearson r from the deviations from the means:

1. Begin by writing the pairs of scores to be studied in two columns. Be sure that the pair of
scores for each pupil is in the same row. Label one set of scores X, the other Y.
2. Get the sums (∑) of the scores in each column. Divide the sum by the number of scores
(N) in each column to get the mean (M).
3. Subtract the mean of X from each score in column X to get the deviation from the mean.
Enter the result under column x. Be sure to write the sign.
4. Subtract the mean of Y from each score in column Y to get its deviation from the mean.
Enter the result under column y.
5. Square each entry in x. Write the result in column x².
6. Square each entry in y. Write the result in column y².
7. Multiply each entry in column x by its pair in column y.
Enter the product in column xy.
8. Get the sum (∑) of all entries in xy, x², and y².
9. Apply the formula.
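The formula referred to in step 9 is not reproduced in this text. The sketch below assumes the standard deviation-score form, r = ∑xy ÷ √(∑x² · ∑y²), where x and y are deviations from the means. The scores are hypothetical and the function name is illustrative.

```python
import math

def pearson_r_deviations(x_scores, y_scores):
    """Pearson r from deviations from the means (assumed standard form)."""
    n = len(x_scores)
    mean_x = sum(x_scores) / n                      # step 2
    mean_y = sum(y_scores) / n
    x = [v - mean_x for v in x_scores]              # step 3: column x
    y = [v - mean_y for v in y_scores]              # step 4: column y
    sum_x2 = sum(a * a for a in x)                  # steps 5, 8
    sum_y2 = sum(b * b for b in y)                  # steps 6, 8
    sum_xy = sum(a * b for a, b in zip(x, y))       # steps 7, 8
    return sum_xy / math.sqrt(sum_x2 * sum_y2)      # step 9 (assumed formula)

# Hypothetical pairs of scores.
print(round(pearson_r_deviations([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 2))
```

For these pairs ∑xy = 6, ∑x² = 10, and ∑y² = 6, so r = 6 ÷ √60.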
Directions for computing the Pearson r from Standard Scores:

1. Begin by writing the pairs of scores to be studied in two columns. Be sure that the pair
of scores for each pupil is in the same row. Label one set of scores X, the other Y.
2. Get the sum (∑) of the scores for each column. Divide the sum by the number of scores
(N) in each column to get the mean (M).
3. Subtract the mean of X from each score in column X. Write the difference in column x. Be
sure to put the algebraic signs.
4. Subtract the mean of Y from each score in column Y. Write the difference in column y.
Don’t forget the signs.
5. Steps 5 and 6 may be omitted if the standard deviation for each set of scores has been
previously computed. Square each score in column X. Enter each result under X². Then
apply the formula for finding the standard deviation to find SDx.
6. Square each score in column Y. Enter each result under Y². Then apply the formula for
finding the standard deviation to find SDy.
7. Divide each entry in column x by the standard deviation SDx and enter the result
under Zx (standard score).
8. Divide each entry in column y by the standard deviation SDy to get the standard
scores. Enter the result under Zy.
9. Multiply each Z score in Zx by its pair in Zy and enter the results under ZxZy.
10. Get the sum (∑) of ZxZy.

11. Apply the formula.
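The formula for step 11 is not shown in this text. The sketch below assumes the standard form r = ∑ZxZy ÷ N, with standard deviations computed by dividing by N as in the S.D. steps earlier in the chapter. The scores are hypothetical and the function name is illustrative.

```python
import math

def pearson_r_zscores(x_scores, y_scores):
    """Pearson r as the mean of the products of paired standard scores."""
    n = len(x_scores)
    mean_x = sum(x_scores) / n
    mean_y = sum(y_scores) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x_scores) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y_scores) / n)
    zx = [(v - mean_x) / sd_x for v in x_scores]    # step 7: column Zx
    zy = [(v - mean_y) / sd_y for v in y_scores]    # step 8: column Zy
    sum_zxzy = sum(a * b for a, b in zip(zx, zy))   # steps 9-10
    return sum_zxzy / n                             # step 11 (assumed formula)

# Hypothetical pairs of scores.
print(round(pearson_r_zscores([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 2))
```

All three procedures (raw scores, deviations, standard scores) are algebraically equivalent, so they should give the same r for the same data apart from rounding.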

Interpretation of Coefficient of Correlation:


Generally, an r of 0.8 and above is considered a high coefficient, an r of around 0.5 is
considered moderate, and an r of 0.3 or below is considered a low coefficient.
It should be noted, however, that the coefficient of correlation is not a measure of causality.
Three possible explanations why two variables X and Y can be related:
1.) One may be the cause of the other – X may cause Y;
2.) Y may cause X; and
3.) The two, X and Y, may be the result of a common cause or both are related to a
third variable Z.
There are correlations that may mean nothing, or in other words, the coefficient is
purely the result of chance (Downie & Heath, 1974). For example, suppose a Math teacher
obtained the weight of each member of his class, correlated this with their scores in their final
examination in Math and obtained an r of 0.85. For obvious reasons, the coefficient of
correlation makes no sense. The only way we can account for this coefficient is to say that it is
the result of chance.

r value           Interpretation
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
0                 No relationship [zero correlation]
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak negative relationship
-.30 to -.39      Moderate negative relationship
-.40 to -.69      Strong negative relationship
-.70 or lower     Very strong negative relationship

Uses of the Coefficient of Correlation


➢ The major uses of correlation coefficients are in the computation of reliability and
validity of tests. It should be emphasized that the correlation coefficient used to
determine reliability is calculated and interpreted in the same manner as that used in
determining the statistical estimates of validity.

➢ The difference is that the Reliability Coefficient is based on agreement or consistency
between two sets of results from the same procedure or test, while the Validity Coefficient
is based on agreement with an outside criterion.
The Spearman –Brown Formula
Kuder Richardson Formula 20

Like the split-half method, the Kuder-Richardson formula also yields a coefficient of internal
consistency. This formula is easy to apply when an item analysis of a test has been made.
Since the item analysis provides a difficulty measure of each test item (the percentage of
examinees who answer each test item correctly), a worksheet for a Kuder-Richardson
solution can easily be prepared. Directions on how to prepare it follow.
Directions for computing reliability coefficient using the Kuder-Richardson Formula 20:
1. Begin by identifying test items through numbers.
2. Determine the percentage of examinees who answered each item correctly. Enter each
percentage, expressed as a proportion, under column P.
3. Subtract each proportion under column P from 1. Enter the result under column q.
4. Multiply each entry in column P by its pair in column q. Enter the result under column
Pq.
5. Add all the entries in column Pq to obtain the sum of Pq (∑Pq).
6. Substitute obtained values in the formula
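The formula itself is not reproduced in this text. The sketch below assumes the standard form of Kuder-Richardson Formula 20, KR20 = (K ÷ (K - 1)) × (1 - ∑Pq ÷ S²), where K is the number of items and S² is the variance of the total test scores. The item proportions and the variance are hypothetical, and the function name is illustrative.

```python
def kr20(p_values, score_variance):
    """Kuder-Richardson Formula 20 (standard form, assumed here).

    p_values: proportion of examinees answering each item correctly
    score_variance: variance (S squared) of the total test scores
    """
    k = len(p_values)                                  # number of items
    sum_pq = sum(p * (1 - p) for p in p_values)        # steps 3-5: sum of Pq
    return (k / (k - 1)) * (1 - sum_pq / score_variance)

# Hypothetical 5-item worksheet and score variance.
p_values = [0.9, 0.8, 0.7, 0.6, 0.5]
print(round(kr20(p_values, score_variance=2.5), 3))
```

For this hypothetical worksheet ∑Pq = 0.95, so the coefficient is (5 ÷ 4) × (1 - 0.95 ÷ 2.5) = 0.775.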

Kuder Richardson Formula 21

This formula requires only the test mean (X̄), the variance (S²) or
standard deviation squared, and the number of items (K) on the test. It assumes that all items
are of equal difficulty, so that P̄ = X̄/K.

For example, on the 60-item test where X̄ = 40 and S = 5, P̄ = 40/60 or .67 and
q̄ = 1 – P̄ = .33.
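The computation that follows from these values is not shown in this text. The sketch below assumes the standard form of Kuder-Richardson Formula 21, KR21 = (K ÷ (K - 1)) × (1 - K·P̄·q̄ ÷ S²), and applies it to the example values K = 60, X̄ = 40, and S = 5. The function name is illustrative.

```python
def kr21(mean, sd, k):
    """Kuder-Richardson Formula 21 (standard form, assumed here).

    mean: test mean, sd: standard deviation, k: number of items
    """
    p_bar = mean / k            # average item difficulty, e.g. 40 / 60
    q_bar = 1 - p_bar
    return (k / (k - 1)) * (1 - (k * p_bar * q_bar) / sd ** 2)

print(round(kr21(mean=40, sd=5, k=60), 2))
```

With unrounded P̄ and q̄ the coefficient is about 0.47; using the rounded values .67 and .33 gives a slightly different figure.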
SUMMARY

Test Scores are useless unless given meaning. The following can be done to give
meaning to a set of scores.
a. Scores can be arranged into a frequency distribution or plotted in a histogram.
b. To represent the middle of the group, the median (the 50th percentile), the
arithmetic mean (common average), or the mode can be computed.
c. To represent the spread of scores, statisticians have developed the semi-interquartile
range (half the distance between the 25th and 75th percentiles) and the standard deviation, a
type of average of the deviations of the scores away from the average.
d. The individual score takes on meaning as it is translated into a percentile rank, i.e.,
the percentage of the group he surpassed, or into a standard score, i.e., his position in the
group in terms of the number of standard deviations above or below the mean.
e. A measure of relationship is given by the correlation coefficient, a numerical index of
“going togetherness”. This index is important in describing the precision or reliability of a
test and in describing the accuracy with which a test score predicts some other factor such as
school grades or job success.

Questions for Chapter Review


A. Answer these questions.
1. Should test scores be interpreted? Why?
2. What are the different ways of picturing a set of scores?
3. What are the different measures of central tendency and variability? Explain when
each measure is used.
4. Enumerate the steps in preparing a frequency distribution.
5. Interpret the following results of a 50-item Math Test given to all Grade III pupils in
Nasugbu Elementary School.

Grade/Section     Mean     S.D.
III – Narra       44.75    3.5
III – Molave      40.12    3.5
III – Acacia      30.15    6.25
III – Yakal       35.25    10.10
III – Tangile     38.25    5.15
III – Apitong     22.40    2.9
III – Talisay     25.10    11.5