
SUCCEED REVIEW CENTER

EDUCATIONAL MEASUREMENT AND EVALUATION


(ASSESSMENT OF LEARNING)
The Development of Intelligence Tests
1. Jean Etienne Esquirol
- a French Psychiatrist who made the first efforts to draw the differences between mental deficiency
and insanity
- used language capability as a criterion, rather than sensations, in trying to classify individuals with
mental retardation
- known as the Leader of Abnormal Psychology
2. Wilhelm Wundt
- a German Philosopher and Psychologist who established the first laboratory in the world, which he
dedicated to experimental psychology at Leipzig, Germany in the year 1879
- His primary preoccupation was on the measurement of powers of sensory discrimination, which
resulted in the science of psychophysics.
- Known as the Father of Experimental Psychology and the Founder of Modern Psychology
3. Hermann Ebbinghaus
- a German Experimental Psychologist who devised a word-completion test
- investigated color vision and mental capacity
- first to demonstrate that learning and memory could be studied experimentally
- known as the Founder of the Quantitative Study of Memory
4. Francis Galton
- a British Psychologist who is noted as an early proponent of statistical analysis as applied to
mental and behavioral phenomena
- one of the first to use questionnaire and survey methods in investigating mental imagery in different
groups
5. Karl Pearson
- a British mathematician who developed techniques of modern statistics
- extended Galton's ideas of regression and developed the method of correlation known as the
Pearson Product-Moment Coefficient of Correlation
6. Charles Spearman
- an English Psychologist who was influenced during his studies by the works of Francis Galton
- developed a two-factor theory of intelligence
- developed the method of correlation known as the Spearman Rank-Difference Coefficient of
Correlation
7. Edward L. Thorndike
- an American Psychologist who developed psychological connectionism
- used objective measurements of intelligence on human subjects together with his students
- developed a test of intelligence that consisted of completion, arithmetic, vocabulary, and directions
test known as the CAVD which became the foundation of modern intelligence tests
8. James McKeen Cattell
- an American Psychologist who used statistical methods and quantification of data for the
development of American Psychology as an experimental science
- stressed the importance of quantification, ranking and ratings
- was recognized as the Father of Mental Testing
9. Clark Wissler
- an American Anthropologist who applied the correlation coefficient to empirically disprove J.M. Cattell's
method of intelligence testing
- evaluated the results of Cattell's attempts to measure the mental ability of students by measuring
their reaction time, movement time, and other simple mental and sensory processes
- found very small correlation between academic standing and the tests
10. Alfred Binet
- a French Psychologist who with his daughter further refined his developing conception on
intelligence, especially the importance of attention span and suggestibility in intellectual
development
- developed Binet-Simon Scale with Theodore Simon
- Both Simon and Binet were the first researchers to use mental age as a measure of intelligence, but
their idea was refined by other researchers.
11. Walter V. Bingham
- an American Industrial and Applied Psychologist who believed that intelligence is a complex set of
factors that can be measured by looking at individual aptitudes for mathematical, verbal,
mechanical, and social skills
- believed that heredity is the most important factor in intellectual development, and that
environmental influences serve only to modify what is already present within the individual
12. Henry Herbert Goddard
- an American Psychologist who established the first laboratory for the psychological study of
mentally retarded persons in 1910
- translated the Binet-Simon Scale into English
- His views on intelligence were derived from Mendelian genetics
- believed that feeblemindedness was caused by the transmission of a single recessive gene
- known as the Father of Intelligence Testing in the US
13. William Stern
- a German Psychologist who tried to classify people according to types, norms, and aberrations
- influenced by the work of Binet
- developed the idea of expressing intelligence test results in the form of a single number, the
intelligence quotient
- IQ = Mental Age/Chronological Age
14. Lewis Madison Terman
- an American Cognitive Psychologist who decided to see what mental tests could do in
distinguishing usually backward students from very bright ones
- Published a perfect revision of Binet-Simon Scale known as the Stanford-Binet, which is the best
available individual intelligence test
- Suggested modifying the equation of intelligence quotient by multiplying it by 100 to eliminate
decimals
- IQ = (Mental Age/Chronological Age) x 100
15. Robert Mearns Yerkes
- an American Comparative Psychologist who urged the American Psychological Association to
contribute psychological expertise to the war effort
- developed a group intelligence test that would identify recruits with low intelligence and allow the
Army to recognize men who were particularly well-suited for special assignments and officers
training schools
- constructed the verbal test known as the Army Alpha and the nonverbal test known as the Army Beta,
the latter for illiterate and non-English-speaking recruits
16. David Wechsler
- an American Psychologist who understood intelligence to be more of an effect, rather than a cause
- introduced the Deviation Quotient, an IQ computed by considering the individual's mental ability in
comparison with the average individual of his or her own age
- developed an individual intelligence test for adults to supplement the Stanford-Binet Test in 1939,
known as Wechsler Adult Intelligence Scale (WAIS)
- published another intelligence Test known as Wechsler Intelligence Scale for Children (WISC)
17. Joy Paul Guilford
- an American Psychologist who made a number of contributions to the study of human intellectual
abilities
- His model of human intelligence, the Structure of the Intellect, is a complex, three-dimensional model of
intelligence that can be used to guide educational instruction
- Modified and developed many existing tests by factor analysis
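The ratio IQ attributed above to Stern and Terman can be illustrated with a short sketch. The function name and the sample ages are hypothetical and are used only to show the arithmetic of IQ = (Mental Age/Chronological Age) x 100.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ with Terman's modification: (MA / CA) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    # Multiply first so that whole-number ages give an exact result.
    return mental_age * 100 / chronological_age


# Made-up example: a 10-year-old who performs like a typical 12-year-old.
print(ratio_iq(12, 10))  # 120.0
# A 10-year-old who performs like a typical 8-year-old.
print(ratio_iq(8, 10))   # 80.0
```

Without the multiplication by 100 (Stern's original quotient), the same two cases would read 1.2 and 0.8, which is the decimal form Terman's modification removes.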
The Development of Achievement Test
1. Horace Mann
- introduced the written examination to the schools in Boston due to the weaknesses of oral
examination
- His efforts contributed to the establishment of the first Normal Schools for Teachers in
Massachusetts
- known as the Father of American Public School Education
2. Rev. George Fisher
- an English Schoolmaster who devised and used the first objective measure of achievement of
pupils
- devised an instrument which he called the Scale Book, for measuring the learners' achievement in
different school subjects, with scales in handwriting, spelling, mathematics, grammar, composition
and others
- Scale Book, predecessor of the modern day proficiency tests
3. J.M. Rice
- first inventor of comparative objective test in America
- administered a list of spelling words to measure differences between groups of students who were
taught differently
- found out that those pupils who studied spelling for thirty minutes every day for eight years did not
show better spelling abilities than those pupils who studied spelling for only fifteen minutes every
day for eight years
- prepared similar tests in language and in arithmetic, which served as the predecessor of the
modern objective tests in different school subjects
4. Dr. Edward L. Thorndike
- developed methods for measuring a wide variety of abilities and achievements by the time the US
entered WWI
- his book Mental and Social Measurements contains statistical procedures and principles upon which
the statistical techniques and tests of today are based
- constructed the first handwriting scale to measure children's handwriting, which assigned qualitative
values to different qualities of handwriting, known as the Thorndike Handwriting Scale
- regarded as the Father of Educational Measurement
5. Cliff W. Stone
- student of Thorndike who constructed two tests, one on four fundamental operations in arithmetic
and the second test on arithmetic reasoning
- regarded as the first to publish standardized achievement test on arithmetic, known as the Stone
Arithmetic Test
- contributed reasoning test in arithmetic to educational testing and measurement
6. S.A. Curtis
- student of Thorndike who was interested in measuring the growth of pupils in arithmetic and in
establishing a norm of attainment for each grade
- developed a series of standardized tests in Arithmetic available for use in 1909
- originated the concepts of norms and standards
- Curtis Series of Tests in Arithmetic, his test
7. M. Hillegas
- student of Thorndike who constructed a series of standardized tests in Composition Scale by
following the principles in the construction of the Thorndike Handwriting Scale
- Hillegas Composition Scale, served as the basis of the composition scale of today
8. Ayres
- student of Thorndike who developed a series of standardized spelling scales, known as Ayres
Spelling Scales
9. William A. McCall
- published his pioneer book dealing with test adaptation
- made an informal objective type of test which is widely used today
10. Ralph W. Tyler
- responsible for the extension of achievement testing to the more intangible outcomes of instruction
which cannot be measured accurately like attitudes, appreciation, interests, ideals and others
The Development of Character and Personality Test
1. Fernand
- first to measure character by test
2. Voelker
- invented some actual situations for testing character
3. Percival Symonds
- developed a scientific study on personality
4. Hermann Rorschach
- Introduced a multi-dimensional test of personality known as the Rorschach Test, which consists of
10 inkblots used as a projective technique to appraise the global aspects of personality. The
student responds by reporting what he sees in each inkblot, and his reactions indicate
personality variables such as impulsiveness, sensitivity, and emotional stability
5. Raymond B. Cattell
- a British and American Psychologist who contributed the application of advanced statistical
techniques
- searched for a comprehensive theory of human behavior through the use of multi-factor analysis
from the beginning of his career and was attracted to C. Spearman's factor analysis
- developed a theory of personality which fathered many methodological innovations
- His pursuit of a comprehensive theory of behavior through factor analysis methods has produced a
variety of theoretical models and psychometric instruments.
- His theoretical developments in the measurement of personality by question are embodied in the
16PF (personality factor).
MEASUREMENT
- process of measuring the individual's intelligence, achievement, personality, attitudes, and values
and anything else that can be expressed quantitatively
- limited to the quantitative description of an attribute
- represents the objective aspects of evaluation
Classification
1. Classroom Measurement
- set of measurements made by classroom teachers
- purpose: to assess the progress of a particular student or of an entire class
- include teacher-made tests and other instruments for assessing students' learning in a class
- example: deciding whether a particular concept has to be re-taught instead of introducing and
discussing a new concept, depending on the results of the test conducted by the subject teacher
2. System Measurement
- set of measurements made by policy-makers either in the local or national levels
- considered by the school administrators and managers
- examples:
a. National Elementary Achievement Test (NEAT)
b. National Secondary Aptitude Test (NSAT)
Types
1. Criterion-Referenced Measures
- measure the student's performance with respect to some particular criterion or standard
- used to:
a. evaluate performance against a performance objective
b. ascertain a student's performance for a particular course
c. establish the need for the student to take the course or not
d. establish the effectiveness of the teaching and the extent of learning
e. assess readiness for the next stage of course
2. Norm-Referenced Measures
- measure the ability of one student compared to the abilities of other students in the same class
EVALUATION
- a continuous process
- the process of determining the extent to which instructional objectives are attained
- involves measurements and assigning qualitative meaning through value judgments
- represents both objective and subjective processes in assessing students' learning
- a means of determining the effectiveness of teaching methodologies, instructional materials and
many others
Types
1. Diagnostic
- conducted before instruction
- purpose: to identify the strengths and weaknesses of students
2. Formative
- conducted during or after instruction
- objective: to obtain immediate feedback for both the students and the teachers
3. Summative
- conducted at the end of a unit or a specified period of time
- purpose: the grading of the students at the end of a broad unit of work usually by grading period,
semester, or course
Functions of Measurement and Evaluation
1. Measures students' achievement
2. Motivates student learning
3. Predicts students' success
4. Diagnoses students' difficulties
5. Evaluates instruction
Stages of the Teaching-Learning Process and Evaluation
1. Classifying instructional objectives
2. Preassessment of students
3. Providing relevant instructional activities
4. Determining the learning outcomes
Principles of Evaluation
1. Evaluation should be based on clear instructional objectives.
2. Evaluation procedures and techniques should be selected in terms of the objectives they serve.
3. Evaluation should be comprehensive.
4. Evaluation should be continuous.
5. Evaluation should be diagnostic and functional.
6. Evaluation should be a cooperative endeavor.
7. Evaluation should be used judiciously.
INSTRUMENTS USED IN EVALUATION
A. Objective Instruments
a. Achievement Test
- Measures how well a student has mastered specified instructional objectives
b. Intelligence Test
- Measures the student's broad range of abilities
- May be expressed as very superior, superior, above average, average, below average,
borderline or mentally defective
c. Diagnostic Test
- Measures the student's strengths and weaknesses in a specific area of study
d. Formative Test
- Measures a student's progress that occurs over a short period of time
e. Summative Test
- Measures the extent to which the students have attained the desired outcomes for a given
chapter or unit
f. Aptitude Test
- Measures the ability or abilities in a given area
g. Survey Test
- Measures general achievement in a given subject or area
- More concerned with scope of coverage
h. Performance Test
- Measures a student's proficiency level in a skill
- Requires manual or other motor responses
i. Personality Test
- Measures the ways in which an individual's interest is focused on other individuals or on
the roles that other individuals have ascribed to him, and how he adapts to society
j. Prognostic Test
- Predicts the student's future achievement in a specific subject area
k. Preference Test
- Measures either interests or aesthetic judgments by requiring the students to make forced choices
between members of paired or grouped items
l. Accomplishment Test
- Measures an individual student's achievement in the school curriculum
m. Scale Test
- A test made up of a series of items arranged in order of difficulty
n. Power Test
- Measures level of performance rather than speed of response
- Made up of a series of test items in graded difficulty, from the easiest to the most difficult ones
o. Speed Test
- Measures the speed and accuracy of the students in answering the questions within the time
limits imposed
p. Placement Test
- Determines the grade or year level in which a student should be enrolled after a period out of
school
q. Standardized Test
- A test made after certain norms have been established
r. Teacher-made Test
- Constructed by classroom teachers but not as carefully prepared as the standardized test
- No full guarantee of validity and reliability
s. Mastery Test
- Determines the extent to which individuals in a group have learned or mastered a given unit of
instruction
t. Omnibus Test
- Measures a variety of mental operations combined into a single sequence from which only a
single score is taken
u. Readiness Test
- Measures the extent to which an individual has achieved certain skills needed for beginning
some new learning activities

B. Subjective Instruments
a. Observation
- Refers to recording reactions to various situations
- May be focused on the changes taking place in the behavior of the students
b. Checklist
- Used to reveal the frequency of occurrence of specific types of student behavior
c. Rating Scale
- Used in evaluating students' attitudes or other characteristics of students
d. Questionnaire
- Used to survey students' interests and needs, progress, and reactions for diagnostic purposes
e. Opinionnaire
- An information form that attempts to measure the attitude or belief of an individual student
f. Projective Technique
- A psychological technique which involves the use of black-and-white pictures or abstract images
to elicit expression of unconscious material
- The Rorschach Ink Blot Test is an example of this technique.
g. Sociogram
- A sociometric technique to show in diagram form the interpersonal relationships within a certain
group
h. Anecdotal Record
- A short written account of the behaviors, motives and attitudes of a given student
i. Work Sample
- Used as a source of information about individual students
j. Conference
- A one-to-one conversation between a teacher and a student
- A better source of diagnostic information about the students
k. Interest Inventory
- A cumulative record of students' interests
- Allows teachers to check the interests of their students

QUALITIES OF A GOOD MEASURING INSTRUMENT


A. Validity
- The most important quality of a good measuring instrument
- Refers to the degree to which a test measures what it intends to measure
- The usefulness of the test for a given measure
- A valid test is always reliable.
Methods of validating a test
a. By judgment of competent teachers, usually three or more experts in the field
b. By correlating the result of the test scores against an outside valid criterion
c. By computation of the percentage of students who got the answer right both in the upper and
lower groups
d. By factor analysis
B. Reliability
- The second important quality of a good measuring instrument
- Refers to consistency and accuracy of test results
- If the test measures exactly the same degree each time it is administered, the test is said to have
high reliability.
- A test to be reliable should yield essentially the same scores when administered twice to the same
group of students.
- Increasing its length or items may raise the reliability of the test.
- Clear and concise directions would also increase the reliability of the test.

Methods of Determining Reliability
a. By correlating the results of the test which was administered twice to the same students at
different times
b. By comparing the results of the test with those of a reliable test
c. By correlating the results of the test which was administered once to the students. In this
case, the results are divided or broken into two sets, which may be done on the basis of a
certain criterion
C. Objectivity
- Refers to the degree to which personal judgment is eliminated in the scoring of the test
- Requires that the personal opinion of the teacher does not affect the score of an individual student
- The more objective the test, the greater is its reliability.
- The essay test is less satisfactory in this respect because the points given for this kind of test vary
from teacher to teacher.
D. Administrability
- Refers to the ability of the test to be administered easily
- Instructions should be clear and simple, and directions should be given to the students, to the
proctors, and to the scorer(s)
E. Scorability
- The quality wherein the test can be scored in the simplest way and in the shortest possible time
- Directions should be clear and a separate answer sheet must be provided
F. Comprehensiveness
- Refers to the degree to which a test contains a fairly wide sampling of items to determine the
objectives or abilities so that the resulting scores are representative of the relative total
performance in the areas measured
G. Interpretability
- The quality of the test in which the test results can be readily, easily, and properly interpreted
H. Economy
- Refers to the cheapest way of giving test
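Two of the ways of determining reliability listed under B. Reliability (administering the test twice, or administering it once and dividing the results into two sets) come down to correlating two lists of scores. The sketch below is only an illustration: the scores are made up, the helper names are hypothetical, and the odd-even split is just one possible criterion for dividing a single administration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_scores(item_matrix):
    """item_matrix[s][i] is 1 if student s got item i right, else 0.

    Returns each student's score on the odd-numbered items (1st, 3rd, ...)
    and on the even-numbered items (2nd, 4th, ...).
    """
    odd = [sum(row[0::2]) for row in item_matrix]
    even = [sum(row[1::2]) for row in item_matrix]
    return odd, even


# Test-retest: the same five students, two administrations (made-up data).
first = [10, 12, 15, 18, 20]
second = [11, 13, 14, 19, 21]
print("test-retest r:", round(pearson_r(first, second), 2))

# Split-half: one administration of a six-item test (made-up data).
answers = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
odd, even = split_half_scores(answers)
print("split-half r:", round(pearson_r(odd, even), 2))
```

A coefficient close to +1 in either case suggests that the test yields essentially the same ordering of students each time.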
ORGANIZATION OF TEST RESULTS
A. Raw Scores
- Collected scores that have not been organized numerically
B. Tally Sheet
- Device used in arranging the scores from highest to lowest or from lowest to highest
C. Ranking
- Described as a relative arrangement in a series according to magnitude from highest to lowest or
from lowest to highest
- By considering the tally sheet
D. Frequency Distribution
- Listing of possible score values and the number of students who obtained the test scores
- Used when there are many test scores
- Utilized when the number of test scores is greater than or equal to 30
- Makes the test scores meaningful
- Indicates whether the test is easy, moderately difficult, or difficult
E. Cumulative Frequency Distribution
- A "less than" cumulative frequency indicates the number of scores in the distribution that fall
below a specified upper class boundary.
- A "greater than" cumulative frequency indicates the number of scores in the distribution that lie
above a certain lower class boundary.
F. Graph
- A diagram which makes a systematic presentation of a class frequency distribution together with the
comparisons and relationship of the classes
a. Histogram: a pictorial presentation of a frequency distribution
b. Frequency Polygon: constructed by plotting the midpoints of the steps against their
corresponding frequencies and joining the points
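The frequency distribution and the two cumulative frequency distributions described above can be built with a few lines of code. This is a minimal sketch with made-up whole-number scores; the function names, the starting class limit, and the class width are assumptions chosen for the example.

```python
def frequency_table(scores, lowest, width):
    """Group whole-number scores into classes of equal width.

    Returns a list of (lower limit, upper limit, frequency) tuples.
    """
    top = max(scores)
    classes = []
    lower = lowest
    while lower <= top:
        upper = lower + width - 1  # e.g. the class 10-14 for width 5
        frequency = sum(lower <= s <= upper for s in scores)
        classes.append((lower, upper, frequency))
        lower += width
    return classes


def cumulative(classes):
    """Return the "less than" and "greater than" cumulative frequencies."""
    freqs = [f for _, _, f in classes]
    # Scores falling below the upper boundary of each class.
    less_than = [sum(freqs[: i + 1]) for i in range(len(freqs))]
    # Scores lying above the lower boundary of each class.
    greater_than = [sum(freqs[i:]) for i in range(len(freqs))]
    return less_than, greater_than


scores = [10, 11, 14, 15, 19, 20, 22, 24, 24, 13]
table = frequency_table(scores, lowest=10, width=5)
less_than, greater_than = cumulative(table)
for (lower, upper, f), lt, gt in zip(table, less_than, greater_than):
    print(f"{lower}-{upper}: f={f}  <cf={lt}  >cf={gt}")
```

The frequencies in each class are the bar heights of a histogram, and plotting them against the class midpoints gives the frequency polygon.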
MEASURES OF CENTRAL TENDENCY
A. Mean
- Commonly known as the average
- Most frequently used
- Most reliable measure
- A prerequisite which needs to be determined before other important measures can be
computed
B. Median
- A point measure that divides the distribution of test scores, arranged from highest to lowest or
vice versa, in half
- Most stable measure of central tendency
- Depends on the number of scores, not on the magnitude of the scores
C. Mode
- Easiest to find
- The score that occurs most frequently
- Need not be unique
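The three measures can be computed with Python's standard `statistics` module. The scores below are made up for illustration; the second list changes only the highest score to show that the median depends on the number of scores rather than on their magnitude.

```python
from statistics import mean, median, multimode

scores = [70, 75, 75, 80, 85, 90, 90, 95]

print(mean(scores))       # 82.5
print(median(scores))     # 82.5 (midpoint of the two middle scores, 80 and 85)
print(multimode(scores))  # [75, 90], so the mode is not unique here

# Replace the highest score with an extreme value.
with_outlier = [70, 75, 75, 80, 85, 90, 90, 195]
print(mean(with_outlier))    # 95.0, pulled upward by the extreme score
print(median(with_outlier))  # 82.5, unchanged
```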
MEASURES OF POSITION
A. Quartiles
- The points which divide the test scores into four equal parts
B. Deciles
- The points that divide the test scores into ten equal parts
C. Percentiles
- The points that divide the total number of test scores into 100 equal parts
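As a rough illustration, `statistics.quantiles` from the Python standard library returns the cut points for any number of equal parts. The scores are made up, and the function's default interpolation method is only one of several conventions, so values worked out by hand with a textbook formula may differ slightly.

```python
from statistics import quantiles

scores = [70, 75, 75, 80, 85, 90, 90, 95]

quartiles = quantiles(scores, n=4)      # 3 cut points: Q1, Q2, Q3
deciles = quantiles(scores, n=10)       # 9 cut points: D1 ... D9
percentiles = quantiles(scores, n=100)  # 99 cut points: P1 ... P99

print(quartiles)         # [75.0, 82.5, 90.0]
print(len(deciles))      # 9
print(len(percentiles))  # 99
```

Q2, D5, and P50 all mark the same point, the median.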
MEASURES OF DISPERSION

A. Range
- The simplest and easiest
- Measures how far the highest score is from the lowest score
- Considered the least satisfactory measure of dispersion
B. Interquartile Range
- Range of scores of a specified part of the total group, usually the middle 50 percent of the cases
lying between Q1 and Q3
C. Quartile Deviation
- Divides the difference of the third and first quartiles by two
- Average distance from the median to the two quartiles
- When QD is small, the set of test scores is more or less homogeneous.
- When QD is large, the set of scores is more or less heterogeneous.
- Used when there are extremely high and low scores, especially when there are big gaps
between scores
D. Average Deviation
- Measure of absolute dispersion that is affected by every individual score
- The mean of the absolute deviations of the individual scores from the mean of all scores
- A large average deviation would mean that a set of scores is widely dispersed about the mean
- A small average deviation would imply that the set of scores is quite close to the mean
E. Standard Deviation
- Involves all scores in the distribution rather than only the extreme scores
- Referred to as the root-mean-square of the deviations from the mean
- Considered the most important measure of dispersion
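The five measures of dispersion can be compared on one set of scores. The sketch below uses made-up scores and the standard `statistics` module; `pstdev` is used because it divides by the number of scores, matching the root-mean-square description above, and the quartiles follow the module's default convention.

```python
from statistics import mean, pstdev, quantiles

scores = [70, 75, 75, 80, 85, 90, 90, 95]

# A. Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# B. Interquartile range and C. quartile deviation.
q1, _, q3 = quantiles(scores, n=4)
interquartile_range = q3 - q1
quartile_deviation = (q3 - q1) / 2

# D. Average deviation: mean of the absolute deviations from the mean.
m = mean(scores)
average_deviation = mean([abs(s - m) for s in scores])

# E. Standard deviation: root-mean-square of the deviations from the mean.
standard_deviation = pstdev(scores)

print(score_range)                   # 25
print(interquartile_range)           # 15.0
print(quartile_deviation)            # 7.5
print(average_deviation)             # 7.5
print(round(standard_deviation, 2))  # 8.29
```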
MEASURES OF CORRELATION
A. Coefficient of Correlation
- Correlation is a measure of the strength, weakness and the direction of relationship between
two sets of data
- Can range from -1 to +1: the sign tells the direction and the numerical value tells the strength
B. Scatter Diagram
- Also known as scatterplot
- Shows the degree of relationship between two sets of test scores
a. Positive perfect correlation
b. Negative perfect correlation
c. Some positive correlation
d. Some negative correlation
e. Zero/No correlation
C. Interpreting Coefficient of Correlation
r(P)             Descriptive Level
1                Perfect Correlation
0.75 to 0.99     High Correlation
0.51 to 0.74     Moderately High Correlation
0.31 to 0.50     Moderately Low Correlation
0.01 to 0.30     Low Correlation
0.00             No Correlation
D. Pearson Product-Moment Coefficient of Correlation
- Most commonly used measure of correlation
E. Spearman Rank-Difference Coefficient of Correlation
- Used when the Pearson Product-Moment Coefficient of Correlation cannot be applied
- Applicable when the set of scores is in terms of rank-order rather than in terms of a continuous
interval scale, and when the set of scores is small
- May also be applied even if numerical values in the form of scores are available: ranks may be
preferred as they are found to be necessary under certain conditions
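The two coefficients and the interpretation table can be tried out on a small set of scores. The sketch below is an illustration only: the function names and data are made up, the Spearman computation uses the common rank-difference formula 1 - 6Σd²/(n(n² - 1)), which assumes there are no tied scores, and values that fall between two rows of the table (for example 0.745) are assigned to the lower row as a simplification.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment coefficient of correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank 1 goes to the highest score; assumes no tied scores."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank-difference coefficient (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def describe(r):
    """Map a coefficient to the descriptive levels in the table above."""
    size = abs(r)  # the sign only gives the direction
    if size == 1:
        return "Perfect Correlation"
    if size >= 0.75:
        return "High Correlation"
    if size >= 0.51:
        return "Moderately High Correlation"
    if size >= 0.31:
        return "Moderately Low Correlation"
    if size > 0:
        return "Low Correlation"
    return "No Correlation"


math_scores = [35, 40, 28, 45, 30]
science_scores = [38, 37, 30, 44, 29]
r = pearson_r(math_scores, science_scores)
rho = spearman_rho(math_scores, science_scores)
print(round(r, 2), describe(r))
print(round(rho, 2), describe(rho))
```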
INSTRUCTIONAL OBJECTIVES
Meaning of Instructional Objectives
- Refer to the objectives, which are stated behaviorally
- Guide the teachers in their day-to-day activities
- Describe the kind of behavior that indicates whether or not learning has taken place
- Very specific
Characteristics of the Instructional Objectives
SMART
1. Specific
2. Measurable
3. Attainable
4. Realistic
5. Time-Bound
The Taxonomy of Educational Objectives
A. Cognitive Domain
o Emphasizes recall or recognition of knowledge and the development of intellectual abilities and
skills
o Hierarchy of the Cognitive Domain
1. Knowledge
Remembering of previously learned materials
Student simply reproduces with little change what was presented before.
Involves recall which can be learned by rote memorization with little
understanding of what is read or studied
2. Comprehension
The ability of the individual students to grasp the meaning of materials
May be shown by:
a. Translating material from one language into another or from one form into
another
b. Interpreting materials by giving meaning or new examples in words of
concepts presented in a picture or graph
c. Extrapolation by estimating from the trends
3. Application
The ability to use a learned rule, method, procedure, principle, theory, law,
or formula to solve a problem in a new situation
4. Analysis
The ability to break down materials into component parts to identify the
relationships among them
This may include:
a. Identification of parts
b. Analysis of the relationship between parts
c. Recognition of the organizational principles involved
5. Synthesis
The ability to put parts together to form a new whole
Stresses creative behaviors with emphasis on the formulation of new
structure
6. Evaluation
Concerned with the ability to judge the value of material for a given purpose
B. Affective Domain
o Describe changes in interest, attitudes and values, and the development of appreciation and
adequate adjustment
o Hierarchy of Affective Domain
1. Receiving
The student's willingness to give attention to the materials being presented
The teacher is concerned with getting, holding, and directing the attention of
the students.
May range from simply becoming aware that a thing exists to giving selective
attention on the part of the student
Receiving is the lowest level.
2. Responding
The active participation on the part of the students
Range from merely complying with the expectations of someone else to
willingly responding from inner drive until they feel satisfied
3. Valuing
Concerned with the worth, value or importance a student attaches to a
particular object, situation or action
Range in degree from the more simple acceptance of a value to the more
complex level of commitment

4. Organization
Concerned with bringing together different values, resolving conflicts between
them and organizing them into a value system
Emphasizes comparing, relating, and bringing together different values
5. Characterization
The student has a value system that has controlled his behavior for a
sufficiently long time. His behavior is consistent and predictable.
Covers a broad range of activities but emphasizes the fact that the
behavior or characteristic of the student is typical
Concerned with the individual's general patterns of personal, social, and
emotional adjustments
C. Psychomotor Domain
o Emphasizes some muscular or motor skill, some manipulation of materials and objects, or some act
which requires neuromuscular coordination
o Hierarchy of the Psychomotor Domain
1. Perception
The use of the sense organs to obtain cues that guide motor activity
Ranges from awareness of a stimulus, selection of cues to translating cues to
action in performance
2. Set
Readiness to act
Includes mental, physical, and emotional readiness to act
3. Guided Response
The early stage in learning a complex skill
Concerned with imitating the act of the teacher as a model and trying out
different approaches and choosing the most appropriate ones
4. Mechanism
Concerned with performance acts that have become automatic and can be
performed with some proficiency and confidence
5. Complex Overt Response
The skillful performance of motor acts that involve complex movement pattern
6. Adaptation
Concerned with well-developed skills
The individual can modify movement patterns to fit special requirements or a
problem situation
7. Originality
The creation of new movement patterns to fit a particular situation or specific
problem
Emphasizes creativity based upon highly developed skills
TABLE OF SPECIFICATIONS
Meaning
- A plan prepared by a classroom teacher as a basis for test construction especially a periodic test
- Contributes to the development of a quality test, which per se is a good instrument for diagnostic
and remedial teaching
- Provides an assurance that the test will measure representative samples of the instructional
objectives and the contents included in the instruction
Two Types of Tables
1. One-way
- Only the contents are listed down the left side
2. Two-way
- The instructional objectives are listed across the top
Content Mastery
Can be determined by simply counting the number of test items horizontally from the TOS for the
particular content area being tested and the number of correct items the student got
Skill Mastery
Can be determined by counting the number of test items vertically from the TOS and the number of
correct items the student got
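The horizontal and vertical counting described above can be illustrated with a small two-way table of specifications. Everything in this sketch is hypothetical: the content areas, the objectives, the number of items in each cell, and the student's results are made up only to show how the counts are taken.

```python
# Hypothetical two-way TOS: rows are content areas, columns are objectives.
# Each cell holds the number of test items planned for that combination.
tos = {
    "Fractions": {"Knowledge": 3, "Comprehension": 2, "Application": 5},
    "Decimals": {"Knowledge": 2, "Comprehension": 3, "Application": 5},
}

# Number of items one (made-up) student answered correctly in each cell.
correct = {
    "Fractions": {"Knowledge": 3, "Comprehension": 1, "Application": 3},
    "Decimals": {"Knowledge": 2, "Comprehension": 2, "Application": 4},
}


def content_mastery(content):
    """Count horizontally: (correct items, total items) for one content row."""
    return sum(correct[content].values()), sum(tos[content].values())


def skill_mastery(skill):
    """Count vertically: (correct items, total items) for one objective column."""
    right = sum(row[skill] for row in correct.values())
    total = sum(row[skill] for row in tos.values())
    return right, total


print(content_mastery("Fractions"))  # (7, 10)
print(skill_mastery("Application"))  # (7, 10)
```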
CONSTRUCTION OF THE TEST
Preliminary Steps in Constructing Teacher-Made Tests

1. Prepare a table of specifications.
2. The test should be made up of various types of items.
3. Clear, concise, and complete directions should precede all types of test.
4. There should only be one possible correct response for each item in the objective test.
5. The test items should be carefully worded to avoid ambiguity.
6. The majority of the test items should be of moderate difficulty.
7. The items should be arranged in a rising order of difficulty.
8. A regular sequence in the pattern of responses should be avoided.
9. Each item should be independent.
10. The test should be neither too short nor too long, so that it can be completed within the time allotted.
11. Make an answer key that contains all acceptable answers.
12. Decide upon the values of scoring.
Objective Tests
- Item types that can be scored objectively
- Equally competent scorers can score them independently and obtain the same results
Two types
1. Recall Type: the answer is not part of the test
a. Completion Test Items
b. Enumeration Test Items
2. Recognition Type: the answer is part of the test
a. True-False
b. Matching Test Items
c. Multiple-Choice Test Items
Scoring an Objective Test
For True-False and Multiple-Choice Tests:
S = R - W/(N - 1)
where: S = score (corrected for guessing)
R = number of correct responses
W = number of wrong responses
N = number of choices per item
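As a worked illustration of the scoring formula, here is a minimal Python sketch. The function name `corrected_score` and the sample numbers are assumptions made for this example. For a true-false test N = 2, so the formula reduces to rights minus wrongs.

```python
def corrected_score(right, wrong, choices):
    """Score corrected for guessing: S = R - W / (N - 1).

    Omitted items count as neither right nor wrong.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)

# True-false test (N = 2): 30 right, 10 wrong
print(corrected_score(30, 10, 2))
# Four-option multiple-choice test (N = 4): 40 right, 9 wrong
print(corrected_score(40, 9, 4))
```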
Essay Tests
- Used to assess the students' ability to express their ideas accurately and to think critically within a
certain period of time
- Should be able to reveal those skills and abilities that are to be measured
- Appropriate for testing at higher levels of the cognitive domain
ITEM ANALYSIS
Meaning
- The process of examining the students' responses to each item in the test
Characteristics of an Item
a. Desirable - can be retained for subsequent use
b. Undesirable - to be either revised or rejected
Three Important Criteria
a. Difficulty of the item
b. Discriminating power of the item
c. Measures of attractiveness
Difficulty Index
The proportion of students in the upper and lower groups combined who answered an item
correctly
The ideal test should contain items whose difficulty indices are from 0.41 to 0.60, but for most teacher-made tests, 0.30 to 0.70 would be acceptable.
Index Range      Difficulty Level
0.00 - 0.20      Very difficult
0.21 - 0.40      Difficult
0.41 - 0.60      Optimum difficulty
0.61 - 0.80      Easy
0.81 - 1.00      Very easy
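The difficulty index and its interpretation can be sketched as follows. The function names and the sample counts are hypothetical. How to treat values that fall between the printed ranges (for example 0.205) is an assumption of this sketch, which uses the upper bound of each range as the cutoff.

```python
def difficulty_index(upper_correct, lower_correct, upper_n, lower_n):
    """Proportion of students in both groups combined who answered the item correctly."""
    return (upper_correct + lower_correct) / (upper_n + lower_n)

def difficulty_level(index):
    """Map a difficulty index to the level in the table above.

    Assumption: each range's upper bound is used as the cutoff.
    """
    if index <= 0.20:
        return "Very difficult"
    if index <= 0.40:
        return "Difficult"
    if index <= 0.60:
        return "Optimum difficulty"
    if index <= 0.80:
        return "Easy"
    return "Very easy"

# Hypothetical item: 15 of 20 upper-group and 5 of 20 lower-group students answered correctly.
p = difficulty_index(15, 5, 20, 20)
print(p, difficulty_level(p))
```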

Discrimination Index
The proportion of the students in the upper group who got an item right minus the proportion of
students in the lower group who got an item right
A maximum positive discriminating power is indicated by an index of 1.00 and is obtained
when all students in the upper group answered the item correctly and no one in the lower group did.
A zero discriminating power is obtained when an equal number of students in both groups got the item
right.
A negative discriminating power is obtained when more students in the lower group got the item right
than in the upper group.
Index Range      Discrimination Level
Below 0.10       Questionable item
0.11 - 0.20      Not discriminating
0.21 - 0.30      Moderately discriminating
0.31 - 0.40      Discriminating
0.41 - 1.00      Very discriminating
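A minimal Python sketch of the discrimination index follows. The function names and sample counts are hypothetical. The table leaves a small gap around 0.10, so this sketch assumes that 0.10 and anything below it, including negative values, is treated as a questionable item.

```python
def discrimination_index(upper_correct, lower_correct, upper_n, lower_n):
    """Proportion correct in the upper group minus proportion correct in the lower group."""
    return upper_correct / upper_n - lower_correct / lower_n

def discrimination_level(index):
    """Map a discrimination index to the level in the table above.

    Assumption: values at or below 0.10 (including negatives) are questionable,
    and each range's upper bound is used as the cutoff.
    """
    if index <= 0.10:
        return "Questionable item"
    if index <= 0.20:
        return "Not discriminating"
    if index <= 0.30:
        return "Moderately discriminating"
    if index <= 0.40:
        return "Discriminating"
    return "Very discriminating"

# Hypothetical item: 15 of 20 upper-group and 5 of 20 lower-group students answered correctly.
d = discrimination_index(15, 5, 20, 20)
print(d, discrimination_level(d))
```

The three special cases described in the text fall out directly: everyone right in the upper group and no one in the lower group gives 1.00, equal proportions give 0, and a stronger lower group gives a negative index.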

Measures of Attractiveness
To measure the attractiveness of the incorrect options (or distractors) in a multiple-choice test, we
count the number of students who selected each incorrect option in both the upper and lower groups. The
incorrect options should attract fewer students from the upper group than from the lower group.
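The counting described above can be sketched in Python. The function name `distractor_report`, the `effective` flag, and the sample responses are assumptions made for illustration. Here a distractor is flagged as effective when it attracted more lower-group than upper-group students.

```python
from collections import Counter

def distractor_report(upper_choices, lower_choices, key):
    """Count how often each incorrect option was chosen in each group.

    `key` is the correct option and is left out of the report.
    """
    upper = Counter(upper_choices)
    lower = Counter(lower_choices)
    distractors = sorted((set(upper) | set(lower)) - {key})
    return {
        option: {
            "upper": upper[option],
            "lower": lower[option],
            # Hypothetical flag: the option drew more lower-group students.
            "effective": lower[option] > upper[option],
        }
        for option in distractors
    }

# Hypothetical responses of ten students per group to one item whose key is "A".
upper_group = list("AABAAACAAA")
lower_group = list("ABBCADBCAD")
report = distractor_report(upper_group, lower_group, key="A")
for option, counts in report.items():
    print(option, counts)
```

An option that nobody selected never appears in the responses, so it is absent from this report. Checking for such options requires passing the full list of choices, which this sketch does not do.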
Categorization of an Item
Difficulty Level       Discrimination Level         Item Category      Remarks
Very Difficult         Questionable                 Very poor          Rejected
                       Not Discriminating           Poor               Rejected
                       Moderately Discriminating    Poor               Rejected
                       Discriminating               Poor               Rejected
                       Very Discriminating          Very poor          Rejected
Difficult              Questionable                 Very poor          Rejected
                       Not Discriminating           Poor               Rejected
                       Moderately Discriminating    Fair               Retained
                       Discriminating               Good               Retained
                       Very Discriminating          Poor               Rejected
Moderately Difficult   Questionable                 Poor               Rejected
                       Not Discriminating           Reasonably good    Revised
                       Moderately Discriminating    Good               Retained
                       Discriminating               Very good          Retained
                       Very Discriminating          Poor               Rejected
Easy                   Questionable                 Very poor          Rejected
                       Not Discriminating           Poor               Rejected
                       Moderately Discriminating    Reasonably good    Revised
                       Discriminating               Reasonably good    Revised
                       Very Discriminating          Poor               Rejected
Very Easy              Questionable                 Very poor          Rejected
                       Not Discriminating           Poor               Rejected
                       Moderately Discriminating    Poor               Rejected
                       Discriminating               Poor               Rejected
                       Very Discriminating          Very poor          Rejected

VALIDITY OF A TEST
Meaning of Validity
- Refers to the degree to which a test actually measures what it tries to measure
- Concerns what the test measures and how well it does so
- Two major forms:
a. External - the ability to be generalized across persons, settings, and times
b. Internal - the ability of a test to measure what it purports to measure
Factors that Affect the Validity of a Test
1. Inappropriateness of the test items
2. Directions of the test items
3. Reading vocabulary and sentence structure
4. Level of difficulty of the test item
5. Poorly constructed test items
6. Length of test items
7. Arrangement of the test items
8. Pattern of the answers
9. Ambiguity
Methods of Establishing Validity
1. Face Validity
2. Content Validity
3. Construct Validity
4. Concurrent Validity
5. Predictive Validity
RELIABILITY OF A TEST
Meaning of Reliability
- The consistency and accuracy of the test
- A reliable test should yield essentially the same scores when administered twice to the same students.
- A reliability index of 0.50 and above is acceptable for a teacher-made test.
- Can be determined statistically by means of correlation techniques and the Kuder-Richardson Formula 21.
Factors that Affect Reliability
1. Length of the test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student group
5. Limited time
Methods of Establishing Reliability
1. Test-Retest Method
2. Alternate-Form Method
3. Split-Half Method
4. Kuder-Richardson Formula 21
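The handout names the Kuder-Richardson Formula 21 without stating it. As commonly given, it is r = (k / (k - 1)) * (1 - M(k - M) / (k * s^2)), where k is the number of items, M is the mean of the total scores, and s^2 is their variance. The sketch below assumes the population variance (textbooks differ on this point), and the function name `kr21` and the sample scores are hypothetical.

```python
from statistics import mean, pvariance

def kr21(scores, n_items):
    """Kuder-Richardson Formula 21 reliability estimate from total scores.

    Assumption: the variance is the population variance of the total scores.
    """
    if n_items < 2:
        raise ValueError("the test needs at least two items")
    m = mean(scores)
    variance = pvariance(scores)
    if variance == 0:
        raise ValueError("scores have zero variance; KR-21 is undefined")
    return (n_items / (n_items - 1)) * (1 - (m * (n_items - m)) / (n_items * variance))

# Hypothetical total scores of five students on a 10-item test.
scores = [2, 4, 6, 8, 10]
reliability = kr21(scores, n_items=10)
print(round(reliability, 2))
```

With these sample scores the mean is 6 and the variance is 8, so the estimate is (10/9) * (1 - 24/80), or about 0.78, which is above the 0.50 level the handout treats as acceptable for a teacher-made test.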
SCORES, GRADES, AND FINAL RATING
Scores
- a number that indicates the quantity of achievement an individual student obtained in a test, commonly
determined in terms of items correctly answered
Grades
- symbols used to report a student's achievement in a subject area
- the lowest grade is 70, per DECS Order No. 70, s. 1998, also known as the Revised System of Rating and
Reporting of Student Performance for Elementary and Secondary Public Schools
Symbols Used for Reporting
1. number symbols
2. percent symbols
3. letter symbols
4. descriptive symbols
Methods of Transmutation of Scores
A. Criterion-Referenced Approach
- Refers to evaluation of student performance based on the minimum standard that the class must
reach
- Students' grades are based on a target performance level.
- Lacks flexibility
B. Normative-Referenced Approach
- Refers to evaluation of student performance relative to other student performance
- The teacher bases the grade of the student on his standing relative to the whole class.
- Probably the most valid method of converting raw scores into grades because it is based upon the
normal distribution curve
- The procedure is difficult to apply
C. Combination of Criterion-Referenced and Normative-Referenced Approaches
- The range is from perfect score to zero.
