MEASUREMENT
An educational process that checks specific attributes of an individual and expresses them quantitatively.
The quantification of what students learned through the use of tests, questionnaires, rating scales, checklists, and other
devices. MEASUREMENT answers the question, how much does a student learn or know?
EVALUATION
An educational process that checks the personality of an individual and expresses it qualitatively.
A process of making judgements, assigning value, or deciding on the worth of a student’s performance. EVALUATION answers the
question, how good, adequate, or desirable is it?
ASSESSMENT
The full range of information gathered and synthesized by teachers about their students and their classrooms. Gathered
through observation, verbal exchange, written reports, or outputs. ASSESSMENT looks into how much change has occurred in the
student’s acquisition of a skill, knowledge or value before and after a given learning experience.
PURPOSES OF M-E-A
- Appraisal of the school, curriculum, instructional materials, physical plant, equipment
- Appraisal of the teacher
- Appraisal of the school child
FUNCTIONS OF M-E-A
- Improvement of student learning
- Identification of students’ strengths and weaknesses
- Assessment of the effectiveness of a particular teaching strategy
- Appraisal of the effectiveness of the curriculum
- Assessment and improvement of teaching effectiveness
- Communication with and involvement of parents in their children’s learning
TYPES OF EVALUATION
1. DIAGNOSTIC EVALUATION
Undertaken before instruction, in order to assess students’ prior knowledge of a particular topic or lesson. Done to determine strengths
and weaknesses of students as bases for remedial instruction.
2. FORMATIVE EVALUATION
Administered during the instructional process to provide feedback to students and teachers on how well the former are learning the
lesson being taught. Frequently done to determine which students have reached mastery of the lesson.
3. SUMMATIVE EVALUATION
Undertaken to determine student achievement for grading purposes. Usually done at the end of a unit to summarize the student’s
accomplishments.
APPROACHES TO EVALUATION
CRITERION-REFERENCED MEASURE (CRM)
- A student’s performance is compared against a predetermined or agreed upon standard.
- Designed to measure students’ performance with respect to some particular criterion or standard. It is used to evaluate
performance against a performance objective.
1. Mode of Response
• Oral
• Written
• Performance
3. Mode of Administration
• Individual – one student at a time
• Group – simultaneous
4. Test Constructor
• Standardized – prepared by an expert or specialist; follows a uniform procedure
• Teacher-Made – prepared by a classroom teacher with no established norm for scoring and interpretation
6. Nature of Answer
• Personality – emotion, social adjustment, dominance & submission, value orientation, disposition, emotional stability,
frustration level, degree of introversion or extroversion
• Intelligence – mental ability (I.Q)
• Aptitude – predicting the likelihood of success in a learning area
• Achievement – to determine what a student has learned from formal instruction
• Accomplishment – to determine what students have learned from a broader area
• Socio-metric (Preference) – discovering learner’s likes and dislikes; social acceptance; social relationships
• Trade – to measure an individual’s skill or competence in an occupation or vocation
• Speed – to determine ability and accuracy within a time limit
• Diagnostic – to identify specific strengths and weaknesses in past and present learning
• Formative – to improve teaching and learning while it is going on
• Summative – given at the end of instruction to determine student’s learning and assign grades
MULTIPLE CHOICE
Stem – question or problem in each item; can be presented in 2 ways:
– Incomplete statement – all the options end with a period or only the last option ends with a period.
– Direct question – options do not end with a period but stem ends with a question mark.
Options - alternatives from which the student selects the correct answer
- there is only one correct/best answer among the options; the less appropriate ones are foils or distracters (maximum no. of options is
5 and the minimum is 4)
ADVANTAGES
• great versatility in measuring objectives - from the level of rote memorization to the most complex level
• the teacher can cover a substantial amount of course material in relatively short time
• scoring is objective
• teachers can construct options that require students to discriminate among them - vary in the degree of correctness
• effects of guessing are largely reduced since there are more options
• items are more amenable to item analysis
DISADVANTAGES
• more time-consuming in terms of looking for options that are plausible
• there may be more than one defensible correct answer
RULES
• essence of the problem should be in the stem; all options should measure the same objective
• when the incomplete statement format is used, the options should come at the end of the statement
• there should be coherence in stems and options
• there should be consistency in the length/presentation of choices
• avoid repetition of words in the options
• the choices should be arranged in ascending or descending order
• the choices should be arranged in vertical/columnar order
• stems and options should be stated positively whenever possible
• avoid negative statements or double-negative statements in the stem
• options should be plausible and homogeneous
• items should have defensible correct or best option
• vary the placement of correct options (to avoid pattern)
• avoid overlapping options
• options for complex type must be clear
• make sure there is only one correct/best answer to an item
• stem and options should be in a single page
• avoid using "none of the above"
• use the "none of the above" option only if there is an absolute right answer
• avoid using "all of the above"
• It is a poor distracter since it has very little power to discriminate between knowledgeable and
non-knowledgeable students.
• do not combine "all of the above" and "none of the above" in the options
• use four or five options
• there should be uniformity in the number of choices for all the items
• there should be no articles a/an at the end of the stem
• stem should be clear and grammatically correct and should contain elements common to each option (MC items obey Standard
English rules of punctuation and grammar; a question requires a question mark)
TYPES OF ESSAY
1. EXTENDED RESPONSE QUESTIONS
- Leave students free to determine the content and to organize the format of their answer
- Opinionated or open-ended answers are solicited from students
2. RESTRICTED RESPONSE QUESTIONS
- Limit both the content and the format of the students’ answers
- Certain parameters are used in the questions/problems
ADVANTAGES OF ESSAY
• No guessing, assesses factual information
• Allows divergent thinkers to demonstrate higher order thinking skills (HOTS)
• Reduces lead time required to produce
• Less work to administer for smaller number of students
• Can be rich in diagnostic information
DISADVANTAGES OF ESSAY
• Subjectivity in scoring
• Even different times of day make a difference
• First paper to be read/checked often sets standard
• Time consuming in checking
• Can result in student rambling, confusion or inability to find a focus
TEST INTERPRETATION
CENTRAL TENDENCY
MEAN
Most common measure of central tendency
Best for making predictions
The best measure of central tendency if the distribution is normal
The arithmetic average, computed simply by adding together all scores and dividing by the number of scores.
It uses information from every single score.
MEAN (Ungrouped)
X̄ = ΣX / n
Example
If X = {3, 5, 10, 4, 3}
X̄ = (3 + 5 + 10 + 4 + 3) / 5
= 25 / 5
= 5
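The worked example above can be checked with a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable `scores` are illustrative names, not part of the source notes.

```python
def mean(values):
    """Arithmetic average: add together all scores, then divide by the number of scores."""
    return sum(values) / len(values)

scores = [3, 5, 10, 4, 3]
print(mean(scores))  # 5.0
```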
MEDIAN
Divides a distribution of scores exactly in half.
The middle-most value.
Better than mode because only one score can be median and the median will usually be around where most scores fall.
If data are perfectly normal, the mode is the median.
The median is computed when data are on an ordinal scale or when they are highly skewed.
Finding the Median
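First, rank-order the values of X from low to high or vice versa.
Count the number of observations and add 1.
Divide by 2 to get the position of the middle score; with an even number of observations, the median is the average of the two middle scores.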
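The steps for finding the median can be sketched in Python as follows. The function name is illustrative, and averaging the two middle scores for an even count is the usual convention, assumed here because the notes do not spell it out.

```python
def median(values):
    """Rank-order the scores, then pick the one at position (n + 1) / 2."""
    ordered = sorted(values)              # step 1: rank order, low to high
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2           # steps 2-3: 1-based middle position
        return ordered[position - 1]
    # Even count: (n + 1) / 2 falls between two scores, so average them.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median([3, 5, 10, 4, 3]))  # 4
print(median([3, 5, 10, 4]))     # 4.5
```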
MODE
The most common observation in a group of scores.
Distributions can be unimodal, bimodal, or multimodal.
If the data is categorical (measured on the nominal scale) then only the mode can be calculated.
The most frequently occurring score.
With positively skewed data, the mode is lowest, followed by the median and mean.
With negatively skewed data, the mean is lowest, followed by the median and mode.
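As a small illustration of the points above, the sketch below finds the mode with Python's standard `statistics` module and checks the ordering of mode, median, and mean on a positively skewed set. The data set is the same one used in the mean example; the variable names are illustrative.

```python
from statistics import mean, median, multimode

scores = [3, 5, 10, 4, 3]            # the high score of 10 pulls the tail to the right
modes = multimode(scores)            # every value tied for most frequent
print(modes)                         # [3]
print(multimode([1, 1, 2, 2, 3]))    # [1, 2], a bimodal set

# Positively skewed data: mode is lowest, followed by the median and mean.
print(modes[0] < median(scores) < mean(scores))  # True
```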
KURTOSIS
1. Platykurtic
2. Mesokurtic
3. Leptokurtic
The higher the value of d (the discrimination index of an item), the more effective the item is.
When d is 1.00, all test takers in the upper group and no test takers in the lower group answered the item correctly.
When d is -1.00, none of the upper group but all of the lower group answered an item correctly.
GENERALLY, an item is considered acceptable if its d index is 0.30 or higher.
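The notes do not give the formula for d. A minimal sketch, assuming the common upper-lower formula d = (U - L) / n, where U and L are the numbers of correct answers in the upper and lower groups and n is the size of each group (assumed equal), reproduces the values described above. The function and parameter names are hypothetical.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Assumed formula: d = (U - L) / n, with equal-sized upper and lower groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size

print(discrimination_index(10, 0, 10))   # 1.0: all of the upper group, none of the lower
print(discrimination_index(0, 10, 10))   # -1.0: none of the upper group, all of the lower

d = discrimination_index(8, 4, 10)
print(d >= 0.30)                         # True: acceptable by the 0.30 rule of thumb
```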
MEASURES of ATTRACTIVENESS
To measure the attractiveness of the incorrect options (distracters) in multiple-choice tests, we count the number of students who
selected each incorrect option in both the upper and lower groups. An incorrect option is said to be an effective distracter if
more students in the lower group chose that incorrect option than students in the upper group.
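The counting rule above can be sketched as follows. The function name, the option letters, and the sample answers are made up for illustration only.

```python
from collections import Counter

def effective_distracters(upper_choices, lower_choices, correct_option):
    """Flag each incorrect option as effective if the lower group chose it more often."""
    upper = Counter(upper_choices)
    lower = Counter(lower_choices)
    distracters = (set(upper) | set(lower)) - {correct_option}
    # Counter returns 0 for options that a group never selected.
    return {opt: lower[opt] > upper[opt] for opt in sorted(distracters)}

upper_group = ["A", "A", "D", "B", "A"]   # hypothetical answers, correct option is "A"
lower_group = ["A", "B", "C", "B", "C"]
print(effective_distracters(upper_group, lower_group, "A"))
# {'B': True, 'C': True, 'D': False}
```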
MODE - The score that occurs most frequently in the distribution. Used when the quickest estimate of typical performance is wanted.
- A measure of variability is a numerical index that shows the extent to which the scores of a group scatter/disperse/spread below and above a central
point in a distribution.
1) RANGE- the difference between the highest and the lowest values in a dataset.
2) INTER-QUARTILE RANGE- indicates the extent to which the central 50% of values within the data set are dispersed. It is
based upon and related to median.
3) STANDARD DEVIATION- summarizes the amount by which every value within a data set varies from the mean. It indicates
how tightly the values in the data set are bunched around the mean value.
MEASURE OF VARIABILITY/ DISPERSION
The range is simple to compute and is useful when you wish to evaluate the whole of a data set. It is useful for showing the
spread within a dataset.
The inter-quartile range provides a clearer picture of the overall data set as compared to range.
The SD is the most widely used measure of dispersion; it takes into account EVERY VALUE in the dataset. It is usually
presented in conjunction with the mean.
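The three measures can be computed with Python's standard `statistics` module, as in the sketch below. Two assumptions are made that the notes do not specify: the population standard deviation (`pstdev`) is used, and quartiles follow the module's default method. Other quartile conventions give slightly different inter-quartile ranges.

```python
from statistics import pstdev, quantiles

scores = [3, 5, 10, 4, 3]

score_range = max(scores) - min(scores)   # highest value minus lowest value
q1, q2, q3 = quantiles(scores, n=4)       # quartile cut points (default method)
iqr = q3 - q1                             # spread of the central 50% of values
sd = pstdev(scores)                       # population SD: spread around the mean

print(score_range)   # 7
print(iqr)
print(round(sd, 2))  # 2.61
```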
PERCENTILE RANK- used to determine where a particular score or value fits within a broader distribution.
- refers to the percentage of scores that are equal to or less than a given score.
EX. A percentile rank of 35 indicates that 35% of the scores in a distribution of scores fall at or below the score at the 35th percentile.
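Following the definition above (the percentage of scores equal to or less than a given score), a percentile rank can be sketched like this. The function name and the sample scores are illustrative.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that are equal to or less than `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

class_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # hypothetical test scores
print(percentile_rank(class_scores, 70))   # 40.0: 4 of the 10 scores are at or below 70
```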
SKEWNESS
is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The
skewness value can be positive or negative.
SKEWED DATA
Data can be "skewed", meaning it tends to have a long tail on one side or the other.
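The notes do not say how skewness is computed. One common choice, assumed here, is the moment-based coefficient: the third central moment divided by the second central moment raised to the power 1.5. A positive result means a long tail on the high side and a negative result a long tail on the low side.

```python
def skewness(values):
    """Moment-based skewness (an assumed formula): m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("all values are equal, so skewness is undefined")
    return m3 / m2 ** 1.5

print(skewness([3, 5, 10, 4, 3]) > 0)   # True: long tail toward the high scores
print(skewness([1, 2, 3, 4, 5]))        # 0.0: symmetric data
```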
KURTOSIS
PLATYKURTIC – negative kurtosis, has a lower, wider peak around the mean and thinner tails.
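Kurtosis can be quantified in a similar way. The sketch below assumes the usual "excess kurtosis" formula, the fourth central moment divided by the squared second central moment, minus 3, which the notes do not state. Under that convention a negative value is platykurtic, a value near zero is mesokurtic, and a positive value is leptokurtic.

```python
def excess_kurtosis(values):
    """Excess kurtosis (an assumed formula): m4 / m2 ** 2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    if m2 == 0:
        raise ValueError("all values are equal, so kurtosis is undefined")
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                        # evenly spread scores
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]    # most scores at the center, two extremes
print(excess_kurtosis(flat) < 0)     # True: platykurtic
print(excess_kurtosis(peaked) > 0)   # True: leptokurtic
```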
SOCIOMETRY
Sociometry is a quantitative method for measuring social relationships. Sociometric technique shows the interpersonal relationships
among the members of a group.
PERSONALITY ASSESSMENTS- used to determine one’s emotional adjustment, interpersonal relations, motivation, interest,
attitudes/feelings toward self, others and events or activities.
ATTITUDE TEST- used to determine student beliefs, perceptions, or feelings; can be measured through use of a rating scale.
Draw-a-Person Test- a psychological projective personality or cognitive test used to evaluate children and adolescents for a
variety of purposes.
The Leiter International Performance Scale, or simply the Leiter scale, is an intelligence test in the form of a strict performance
scale. It was designed for children and adolescents ages 2 to 18, although it can yield an intelligence quotient (IQ) and a
measure of logical ability for all ages.
Raven's Progressive Matrices (often referred to simply as Raven's Matrices) or RPM is a nonverbal group test typically
used in educational settings. It is one of the most common tests administered to groups ranging from 5-year-olds to the
elderly.
GAT is a standardized assessment of an applicant's general reasoning ability, and it measures learning capacity,
observational skills and problem solving ability. It provides an objective measure that is free from language and cultural bias.
The Otis–Lennon School Ability Test (OLSAT) is a test of abstract thinking and reasoning ability of children pre-K to 18.
Thematic Apperception Test (TAT) is a projective psychological test. Proponents of this technique assert that a person's
responses reveal underlying motives, concerns, and the way they see the social world through the stories they make up about
ambiguous pictures of people.
Rorschach test (also known as the Rorschach inkblot test, the Rorschach technique, or simply the inkblot test) is
a psychological test in which subjects' perceptions of inkblots are recorded and then analyzed
using psychological interpretation, complex algorithms, or both. Some psychologists use this test to examine a person's
personality characteristics and emotional functioning.
Psychometrics
The branch of psychology that deals with the design, administration, and interpretation of quantitative tests for the measurement of
psychological variables such as intelligence, aptitude, and personality traits.
Psychometric Tests
Verbal
Numerical
Logical
Likert scale
- A psychometric scale commonly used in research that employs questionnaires. It is the most widely used approach to
scaling responses in survey research, such that the term (or more accurately the Likert-type scale) is often used interchangeably with
rating scale.
- Respondents specify their level of agreement or disagreement on a symmetric agree-disagree scale for a series of statements.
Thus, the range captures the intensity of their feelings for a given item.
- The format of a typical five-level Likert item, for example, could be:
- Strongly disagree
- Disagree
- Neither agree nor disagree
- Agree
- Strongly agree
Semantic differential
A type of rating scale designed to measure the connotative meaning of objects, events, and concepts. The connotations
are used to derive the attitude towards the given object, event or concept.
The respondent is asked to choose where his or her position lies, on a scale between two bipolar adjectives (for example: "Adequate-
Inadequate", "Good-Evil" or "Valuable-Worthless"). Semantic differentials can be used to measure opinions, attitudes and values on a
psychometrically controlled scale.
Thurstone scale
A means of measuring attitudes, originally developed to measure attitudes towards religion. It is made up of statements about a particular issue, and each
statement has a numerical value indicating how favorable or unfavorable it is judged to be. People check each of the
statements with which they agree, and a mean score is computed, indicating their attitude.
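The scoring step described above (a mean of the values of the statements a person agrees with) can be sketched as follows. The statement labels and their scale values are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical scale values: how favorable each statement was judged to be.
scale_values = {"S1": 1.5, "S2": 4.0, "S3": 6.5, "S4": 9.0}

def thurstone_score(values, endorsed):
    """Mean scale value of the statements the respondent checked."""
    if not endorsed:
        raise ValueError("the respondent must agree with at least one statement")
    return sum(values[statement] for statement in endorsed) / len(endorsed)

print(thurstone_score(scale_values, ["S3", "S4"]))   # 7.75
```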
What is the grading system?
The K to 12 Basic Education Program uses a standards-and-competency-based grading system. These are found in the
curriculum guides.
All grades shall be based on the weighted raw score of the learners’ summative assessments.
The minimum grade needed to pass a specific learning area is 60, which is transmuted to 75 in the report card.
The lowest mark that can appear on the report card is 60 for Quarterly Grades and Final Grades.
How are learners promoted or retained at the end of the school year?
For Grades 4 to 10 Learners
Requirements | Decision
1. Final Grade of at least 75 in all learning areas | Promoted to the next grade level
2. Did not meet expectations in not more than two learning areas | Must pass remedial classes for learning areas with failing mark to be promoted to the next grade level
3. Did not meet expectations in three or more learning areas | Retained in the same grade level
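The promotion rules for Grades 4 to 10 can be expressed as a short decision function. This is a sketch only: it assumes that "did not meet expectations" means a Final Grade below 75 in a learning area, and the function name and sample grades are hypothetical.

```python
PASSING_GRADE = 75   # minimum Final Grade on the report card

def promotion_decision(final_grades):
    """Apply the Grades 4 to 10 rules to a mapping of learning area -> Final Grade."""
    failed = [area for area, grade in final_grades.items() if grade < PASSING_GRADE]
    if not failed:
        return "Promoted to the next grade level"
    if len(failed) <= 2:
        return "Must pass remedial classes in: " + ", ".join(failed)
    return "Retained in the same grade level"

grades = {"English": 82, "Math": 73, "Science": 78, "Filipino": 85}   # sample data
print(promotion_decision(grades))   # Must pass remedial classes in: Math
```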
For Grades 1 to 10, a learner who Did Not Meet Expectations in at most two learning areas must take remedial classes.
Remedial classes are conducted after the Final Grades have been computed.
The learner must pass the remedial classes to be promoted to the next grade level.
However, teachers should ensure that learners receive remediation when they earn raw scores which are consistently below
expectations in Written Work and Performance Tasks by the fifth week of any quarter.
This will prevent a student from failing in any learning area at the end of the year.