Module 7 8 Assessment in Learning I

ILOCOS SUR POLYTECHNIC STATE COLLEGE
Tagudin Campus
MODULE 7
TITLE: Performance-Based Tests
WHAT IS THE MODULE ALL ABOUT
This module covers performance-based tests, rubrics and exemplars, and the
process of creating sample rubrics for learning assessments.
INTENDED LEARNING OUTCOMES (ILO)

After going through this module, you are expected to:
1. Demonstrate understanding of what performance-based tests are.
2. Develop performance-based tests to assess selected learning through a rubric.
3. Construct appropriate scoring rubrics for given students' products/ performances.
Let’s Study This

Over the past few years, there has been general dissatisfaction with the results of traditional
standardized objective tests. Concerted efforts have therefore been made to find alternative
assessment mechanisms that measure educational outcomes and processes, including the more
complex processes in education. For example, multiple-choice tests have been criticized because
they are purportedly unable to measure complex problem-solving skills, to capture the processes
that occur in daily classroom activities, to gauge the processes involved in accomplishing a task,
or to examine learners' application skills rather than their superficial learning of the material.
Educators have therefore turned their attention to alternative assessment methods that would
hopefully address these difficulties with the traditional methods of objective assessment.
Performance-based assessment is one alternative assessment technique that has been proposed.
Performance-based assessment rests on the belief that the best way to gauge a student's
or pupil's competency in a certain task is through observation in situ, or on site.
Such a belief appears consistent with the constructivist philosophy in education often
taught in courses on Philosophy of Education. A performance-based test is designed to
assess students on what they know, what they are able to do, and the learning strategies
they employ in the process of demonstrating it.
Many people have noted serious limitations of performance-based tests: their vulnerability
to subjectivity in scoring, and the difficulty of creating or providing the real, or close-to-real,
task environment for assessment purposes. The concern about subjectivity may be addressed by
automating the test. The second issue is a bigger problem, and there is no guarantee
that ideas from one domain will apply to another.
Performance-Based Tests
There are many testing procedures that are classified as performance tests with a generally
agreed upon definition that these tests are assessment procedures that require students to
perform a certain task or activity or perhaps, solve complex problems.
Course Code: Educ 105

Descriptive Title: Assessment of Learning I Instructor: Mr. Jhunrey Calibuso
For example, Bryant suggested assessing portfolios of a student's work over time,
students' demonstrations, hands-on execution of experiments by students, and a student's
work in simulated environments. Such an approach falls under the category of portfolio
assessment (i.e., keeping records of all tasks successfully and skillfully performed by a student).
According to Mehrens, performance testing is not new. In fact, various types of performance-based
tests were used even before the introduction of multiple-choice testing. For instance, the following
are considered performance testing procedures: performance tasks, rubric scoring guides, and
exemplars of performance.
Performance Tasks
In performance tasks, students are required to draw on the knowledge and skills they possess
and to reflect upon them for use in the particular task at hand. Not only are the students expected
to obtain knowledge from a specific subject or subject matter but they are in fact required to draw
knowledge and skills from other disciplines in order to fully realize the key ideas needed in doing
the task. Normally, the tasks require students to work on projects that yield a definite output or
product, or to follow a process that tests their approach to solving a problem. In many
instances, the tasks require a combination of the two approaches. The essential idea in
performance tasks is that students or pupils learn optimally by actually doing the task
(learning by doing), which is consistent with the constructivist philosophy.
As in any other test, the tasks need to be consistent with the intended outcomes of the
curriculum and the objectives of instruction; and must require students to manifest (a) what they
know and (b) the process by which they came to know it. In addition, performance-based tests
require that tasks involve examining the processes as well as the products of student learning.
Rubrics and Exemplars

Modern assessment methods tend to use rubrics to describe student performance. A rubric is a
scoring method that lists the criteria for a piece of work, or "what counts" (for example, purpose,
organization, details, voice, and mechanics are often what count in a piece of writing); it also
articulates gradations of quality for each criterion, from excellent to poor. Perkins et al. (1994)
provide an example of rubric scoring for student inventions that lists the criteria and gradations of
quality for verbal, written, or graphic reports on student inventions. This is shown in the
succeeding figure as a prototype of rubric scoring. This rubric lists the criteria in the column on
the left: The report must explain (1) the purposes of the invention, (2) the features or parts of the
invention and how they help it serve its purposes, (3) the pros and cons of the design, and (4) how
the design connects to other things past, present, and future. The rubric could easily include
criteria related to presentation style and effectiveness, the mechanics of written pieces, and the
quality of the invention itself. The four columns to the right of the criteria describe varying degrees
of quality, from excellent to poor.
There are many reasons for the seeming popularity of rubrics scoring in the Philippine school
system.
First, they are very useful tools for both teaching and evaluation of learning outcomes.
Rubrics have the potential to improve student performance, as well as monitor it, by clarifying
teachers' expectations and by actually guiding the students how to satisfy these expectations.
Secondly, rubrics seem to allow students to acquire wisdom in judging and evaluating
the quality of their own work in relation to the quality of the work of other students. In
several experiments involving the use of rubrics, students progressively became more aware of
the problems associated with their solution to a problem and with the problems inherent in the
solutions of other students. In other words, rubrics increase the students' sense of responsibility
and accountability.
Third, rubrics are quite efficient and tend to require less time for the teachers in
evaluating student performance. Teachers tend to find that by the time a piece has been self-
and peer-assessed according to a rubric, they have little left to say about it. When they do have
something to say, they can often simply circle an item in the rubric, rather than struggling to
explain the flaw or strength they have noticed and figuring out what to suggest in terms of
improvements. Rubrics provide students with more informative feedback about their strengths and
areas in need of improvement.
Finally, it is easy to understand and construct a rubrics scoring guide. Most of the items
found in the rubrics scoring guide are self-explanatory and require no further help from outside
experts.
Rubric for an Invention Report

Criteria and quality levels (3 = most acceptable, 2 = acceptable, 1 = less acceptable, 0 = not acceptable)

Purposes
  3 - The report explains the key purposes of the invention and points out less obvious ones as well.
  2 - The report explains all of the key purposes of the invention.
  1 - The report explains some of the purposes of the invention but misses key purposes.
  0 - The report does not refer to the purposes of the invention.

Features
  3 - The report details both key and hidden features of the invention and explains how they serve several purposes.
  2 - The report details the key features of the invention and explains the purposes they serve.
  1 - The report neglects some features of the invention or the purposes they serve.
  0 - The report does not detail the features of the invention or the purposes they serve.

Critique
  3 - The report discusses the strengths and weaknesses of the invention, and suggests ways in which it can be improved.
  2 - The report discusses the strengths and weaknesses of the invention.
  1 - The report discusses either the strengths or weaknesses of the invention but not both.
  0 - The report does not mention the strengths or the weaknesses of the invention.

Connections
  3 - The report makes appropriate connections between the purposes and features of the invention and many different kinds of phenomena.
  2 - The report makes appropriate connections between the purposes and features of the invention and one or two phenomena.
  1 - The report makes unclear or inappropriate connections between the invention and other phenomena.
  0 - The report makes no connections between the invention and other things.

Figure 14. Prototype of Rubric Scoring
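
To see how a rubric like the one in Figure 14 turns ratings into a score, the following is a minimal sketch in Python. The four criterion names and the 0-3 scale come from the figure; the function name score_report, the equal weighting of criteria, and the conversion to a percentage are assumptions made only for illustration.

```python
# Hypothetical representation of the invention-report rubric in Figure 14.
# Criterion names and the 0-3 scale come from the figure; the function name,
# equal weighting, and percentage conversion are illustrative assumptions.
CRITERIA = ("Purposes", "Features", "Critique", "Connections")
MAX_LEVEL = 3  # 3 = most acceptable, 0 = not acceptable


def score_report(ratings):
    """Return (total points, percent of maximum) for one rated report."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    for criterion in CRITERIA:
        level = ratings[criterion]
        if not 0 <= level <= MAX_LEVEL:
            raise ValueError(f"{criterion}: level must be 0-{MAX_LEVEL}")
    total = sum(ratings[c] for c in CRITERIA)
    percent = 100 * total / (MAX_LEVEL * len(CRITERIA))
    return total, percent


# One report, rated by circling a level for each criterion.
ratings = {"Purposes": 3, "Features": 2, "Critique": 2, "Connections": 1}
total, percent = score_report(ratings)
print(total, round(percent, 1))  # 8 66.7
```

Keeping the ratings per criterion, rather than only the total, preserves the feedback value of the rubric: the student can still see which criterion pulled the score down.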
Creating Rubrics
In designing a rubric scoring guide, the students need to be actively involved in the process.
The following steps are suggested in actually creating a rubric:
1. Survey models - Show students examples of good and not-so-good work. Identify the
characteristics that make the good ones good and the bad ones bad.
2. Define criteria - From the discussions on the models, identify the qualities that define good
work.
3. Agree on the levels of quality - Describe the best and worst levels of quality, then fill in the
middle levels based on your knowledge of common problems and the discussion of
not-so-good work.
4. Practice on models - Using the agreed criteria and levels of quality, evaluate the models
presented in Step 1 together with the students.
5. Use self- and peer-assessment - Give students their task. As they work, stop them
occasionally for self- and peer-assessment.
6. Revise - Always give students time to revise their work based on the feedback they get in
Step 5.
7. Use teacher assessment - Assess the students' work yourself using the same rubric they used.
Writing and Selecting Effective Rubrics

Two main defining aspects of rubrics are the criteria that describe the qualities that you and
students should look for as evidence of students' learning and the descriptions of the levels
of performance.
Desired Characteristics of Criteria for Classroom Rubrics
The criteria are:
- Appropriate: Each criterion represents an aspect of a standard, curricular goal, or
instructional goal or objective that students are intended to learn.
- Definable: Each criterion has a clear, agreed-upon meaning that both students and
teachers understand.
- Observable: Each criterion describes a quality in the performance that can be perceived
(seen or heard, usually) by someone other than the person performing.
- Distinct from one another: Each criterion identifies a separate aspect of the learning
outcomes the performance is intended to assess.
- Complete: All the criteria together describe the whole of the learning outcomes the
performance is intended to assess.
- Able to support descriptions: Each criterion can be described over a range of
performance along a continuum of quality levels.

Tips in Designing Rubrics
Perhaps the most difficult challenge is to use clear, precise and concise language. Terms
like "creative", "innovative" and other vague terms need to be avoided. If a rubric is to teach
as well as evaluate, terms like these must be defined for students. Instead of these words,
try words that can convey ideas and which can be readily observed. Patricia Crosby and
Pamela Heinz, both seventh grade teachers (from Andrade, 2007), solved the same
problem in a rubric for oral presentations by actually listing ways in which students could
meet the criterion (fig. 19). This approach provides valuable information to students on how
to begin a talk and avoids the need to define elusive terms like "creative".
Rubrics are scales that differentiate levels of student performance. They contain the criteria
that must be met by the student and the judgment process that will be used to rate how well
the student has performed. An exemplar is an example that delineates the desired
characteristics of quality in ways students can understand. These are important parts of the
assessment process.
In summary, to design problem-based tests, we have to ensure that both
processes and end results are tested. The tests should be designed carefully enough
that proper scoring rubrics can be written for them, so that the concerns about
subjectivity in performance-based tests are addressed. Indeed, this needs to be done
anyway in order to automate the test, so that performance-based testing can be used widely.
Automating Performance-Based Tests

Going by the complexity of the issues that need to be addressed in designing performance-
based tests, it is clear that automating the procedure is no easy task. The sets of tasks that
comprise a performance based test have to be chosen carefully in order to tackle the
design issues mentioned. Moreover, automating the procedure imposes another stringent
requirement for the design of the test. In this section, we summarize what we need to keep
in mind while designing an automated performance based test.
We have seen that in order to automate a performance based test, we need to identify a set
of tasks which all lead to the solution of a fairly complex problem. For the testing software to
be able to determine whether a student has completed any particular task, the end of the
task should be accompanied by a definite change in the system. The testing software can
track this change in the system, to determine whether the student has completed the task.
Indeed, a similar condition applies to every aspect of the problem solving activity that we
wish to test. In this case, a set of changes in the system can indicate that the student has
the desired competency.
Such tracking is used widely by computer game manufacturers, where the evidence of a
game player's competency is tracked by the system, and the game player is taken to the
next level of the game.
In summary, the following should be kept in mind as we design a performance-based test.

Each performance task/problem that is used in the test should be clearly defined in terms of
performance standards, not only for the end result but also for the strategies used in the various
stages of the process.
A user need not always end up accomplishing the task; hence it is important to identify
important milestones the test taker reaches while solving the problem. Having defined the
possible strategies, the process and milestones, the selection of tasks that comprise a test
should allow the design of good rubrics for scoring. Every aspect of the problem-solving
activity that we wish to test has to lead to a set of changes in the system, so that the testing
software can collect evidence of the student's competency.
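
The idea that each milestone must leave a detectable change in the system can be illustrated with a minimal Python sketch. Everything specific here is hypothetical: the task (creating, filling and saving a file), the state keys, the MILESTONES list, and the point values are assumptions chosen for illustration and are not taken from any particular testing software.

```python
# Hypothetical sketch: each milestone is a named check on the system state.
# The task, the state keys, and the point values are illustrative assumptions.
MILESTONES = [
    ("file created", lambda s: s.get("file_exists", False), 1),
    ("data entered", lambda s: s.get("rows", 0) >= 3, 2),
    ("file saved", lambda s: s.get("saved", False), 1),
]


def collect_evidence(state):
    """Return the milestones whose state change is observed, and the points earned."""
    reached = [name for name, check, _ in MILESTONES if check(state)]
    points = sum(pts for _, check, pts in MILESTONES if check(state))
    return reached, points


# A test taker who entered data but never saved still earns partial credit.
state = {"file_exists": True, "rows": 5, "saved": False}
reached, points = collect_evidence(state)
print(reached, points)  # ['file created', 'data entered'] 3
```

Because every milestone is tied to an observable condition, a test taker who does not finish the task can still be credited for the stages actually reached, which is what a scoring rubric for the process requires.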
MODULE 8
TITLE: Grading System
WHAT IS THE MODULE ALL ABOUT
This module covers item analysis as a concept and as a process, including item
analysis tools such as validity and reliability. The lesson explains how to identify the range or
index of difficulty and discrimination, and discusses the benefits of item analysis.
INTENDED LEARNING OUTCOMES (ILO)

After going through this module, you are expected to:
1. Demonstrate understanding of what item analysis is and how it is done.
2. Discuss how the indices of difficulty and discrimination are computed.
3. Explain the relationship between validity and reliability.
Let’s Study This
Distinguish between norm-referenced and criterion-referenced grading, and between cumulative and
averaging grading systems.
Compute grades of students in various grade levels observing DepEd guidelines.
INTRODUCTION
Assessment of student performance is essentially knowing
how the student is progressing in a course (and, incidentally, how the teacher is
performing with respect to the teaching process). The first step in assessment is, of
course, testing (either by some pencil-and-paper objective test or by some performance-based
testing procedure), followed by a decision to grade the performance of the student.
Grading, therefore, is the next step after testing. Over the years, grading
systems have evolved in different school systems all over the world. In the American
system, for instance, grades are expressed in terms of letters (A, B, B+, B-, C, C-, D), or what
is referred to as a seven-point system. In Philippine colleges and universities, the letters
are replaced with numerical values (1, 1.25, 1.50, 1.75, 2.0, 2.5, 3.0 and 4.0), or an eight-point
system. In basic education, grades are expressed as percentages (of accomplishment)
such as 80% or 75%. With the implementation of the K to 12 Basic Education Curriculum,
however, a student's performance is expressed in terms of level of proficiency. Whatever
system of grading is adopted, there is a need to convert raw
score values into the corresponding standard grading system. This chapter is concerned
with the underlying philosophy and mechanics of converting raw score values into
standard grading formats.
8.1. Norm-Referenced Grading
The most commonly used grading system falls under the
category of norm-referenced grading. Norm-referenced grading
refers to a grading system wherein a student's grade is placed
in relation to the performance of a group. Thus, in this system,
a grade of 80 means that the student performed better than or
the same as 80% of the class (or group). At first glance, there
appears to be no problem with this type of grading system as
it simply describes the performance of a student with reference
to a particular group of learners. The following example shows
some of the difficulties associated with norm-referenced grading:
Example: Consider the following two sets of scores in an English 1 class for two
sections of ten students each:
A = {30, 40, 50, 55, 60, 65, 70, 75, 80, 85}
B = {60, 65, 70, 75, 80, 85, 90, 90, 95, 100}
In the first class, the student who got a raw score of 75 would get a grade of 80%, while
in the second class, the same grade of 80% would correspond to a raw score of 90. Indeed,
if the test used for the two classes is the same, it would be a rather "unfair" system of
grading. A wise student would opt to enroll in class A since it is easier to get higher grades
in that class than in the other class (class B).
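
The figures in this example can be checked with a short Python sketch. It assumes, following the definition given earlier, that a norm-referenced grade is the percentage of the class scoring at or below the student's raw score; the function name relative_grade is a hypothetical helper.

```python
def relative_grade(score, class_scores):
    """Percent of the class scoring at or below `score` (assumed definition)."""
    at_or_below = sum(1 for s in class_scores if s <= score)
    return 100 * at_or_below / len(class_scores)


class_a = [30, 40, 50, 55, 60, 65, 70, 75, 80, 85]
class_b = [60, 65, 70, 75, 80, 85, 90, 90, 95, 100]

print(relative_grade(75, class_a))  # 80.0
print(relative_grade(90, class_b))  # 80.0
print(relative_grade(75, class_b))  # 40.0
```

The last line shows the equivalency problem directly: the same raw score of 75 earns a grade of 80 in class A but only 40 in class B.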
The previous example illustrates one difficulty with using a norm-referenced grading
system. This problem is called the problem of equivalency. Does a grade of 80 in one class
represent the same achievement level as a grade of 80 in another class of the same
subject? This problem is similar to the problem of trying to compare a Valedictorian from
some remote rural high school with a Valedictorian from some very popular University in
the urban area. Does one expect the same level of competence for these two
valedictorians?
As we have seen, norm-referenced grading systems are based on a pre-established
formula regarding the percentage or ratio of students within a whole class who will be
assigned each grade or mark. It is therefore known in advance what percent of
the students would pass or fail a given course. For this reason, many opponents of
norm-referenced grading aver that such a grading system does not advance the cause of
education and contradicts the principle of individual differences.
In norm-referenced grading, the students, while they may work individually, are actually
in competition to achieve a standard of performance that will classify them into the desired
grade range. It essentially promotes competition among students or pupils in the same
class. A student or pupil who happens to enroll in a class of gifted students in Mathematics
will find that the norm-referenced grading system is rather worrisome. For example, a
teacher may establish a grading policy whereby the top 15 percent of students will receive
a mark of excellent or outstanding, which in a class of 100 enrolled students will be 15
persons. Such a grading policy is illustrated below:
1.0 (Excellent)        Top 15% of Class
1.50 (Good)            Next 15% of Class
2.0 (Average, Fair)    Next 45% of Class
3.0 (Poor, Pass)       Next 15% of Class
5.0 (Failure)          Bottom 10% of Class
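
A minimal Python sketch of how such a policy assigns marks is given below. The shares are those of the policy above; the function name curve_grades, the rounding of group sizes, and the handling of ties (by order of appearance) are assumptions for illustration only.

```python
# (mark, percent of class) pairs taken from the policy above.
POLICY = [(1.0, 15), (1.5, 15), (2.0, 45), (3.0, 15), (5.0, 10)]


def curve_grades(scores):
    """Map each student to a mark based only on rank within the class."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first
    n = len(ranked)
    grades = {}
    cumulative = 0
    start = 0
    for mark, share in POLICY:
        cumulative += share
        end = round(n * cumulative / 100)  # assumed rounding of group sizes
        for name in ranked[start:end]:
            grades[name] = mark
        start = end
    return grades


# Hypothetical class of 20 in which every raw score is 81 or higher.
scores = {f"S{i:02d}": 100 - i for i in range(20)}
grades = curve_grades(scores)
print([name for name, mark in grades.items() if mark == 5.0])  # ['S18', 'S19']
```

Even though every student in this made-up class scored at least 81, the bottom 10% still receive a failing mark, because the marks depend on rank and not on the raw scores themselves.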
The underlying assumption in norm-referenced grading is that the students have
abilities (as reflected in their raw scores) that follow the normal distribution. The objective is
to find out the best performers in this group. Norm-referenced systems are most often used
for screening selected student populations in conditions where it is known that not all
students can advance due to limitations such as available places, jobs, or other controlling
factors. For example, in the Philippine setting, since not all high school students can
actually advance to college or university level because of financial constraints, the
norm-referenced grading system can be applied.
Example: In a class of 100 students, the mean score in a test is 70 with a standard
deviation of 5. Construct a norm-referenced grading table that would have a seven-grade
scale and such that students scoring within plus or minus one standard deviation
from the mean receive an average grade.
Solution: The following intervals of raw scores to grade equivalents are computed:
Raw Score    Grade Equivalent    Percentage
Below 55     Fail                1%
55-60        Marginal Pass       4%
61-65        Pass                11%
66-75        Average             68%
76-80        Above Average       11%
81-85        Very Good           4%
Above 85     Excellent           1%
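
As a cross-check on the percentages in this example, the share of students expected in each interval can be computed from the normal distribution with mean 70 and standard deviation 5. The sketch below uses Python's standard statistics.NormalDist; treating scores as continuous and placing the interval edges exactly at 1, 2 and 3 standard deviations from the mean are assumptions. Under an exactly normal model the central band is about 68%, as in the table, while the outer bands come out somewhat different from the table's rounded figures.

```python
from statistics import NormalDist

# Normal model stated in the example: mean 70, standard deviation 5.
dist = NormalDist(mu=70, sigma=5)

# Interval edges at 1, 2 and 3 standard deviations from the mean (assumed).
edges = [55, 60, 65, 75, 80, 85]
labels = ["Fail", "Marginal Pass", "Pass", "Average",
          "Above Average", "Very Good", "Excellent"]

# Cumulative probabilities at each edge, padded with 0 and 1 at the ends.
cdf = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
shares = {label: 100 * (cdf[i + 1] - cdf[i]) for i, label in enumerate(labels)}

for label, pct in shares.items():
    print(f"{label:14s} {pct:5.1f}%")
```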
Only a few of the teachers who use norm-referenced grading apply it with complete
consistency. When a teacher is faced with a particularly bright class, most of the time, he
does not penalize good students for having the bad luck to enroll in a class with a cohort of
other very capable students even if the grading system says he should fail a certain
percentage of the class. On the other hand, it is also unlikely that a teacher would reduce
the mean grade for a class when he observes a large proportion of poor performing
students just to save them from failure. A serious problem with norm-referenced grading is
that, no matter what the class level of knowledge and ability, and no matter how much they
learn, a predictable proportion of students will receive each grade. Since its essential
purpose is to sort students into categories based on relative performance, norm-referenced
grading and evaluation is often used to weed out students for limited places in selective
educational programs.
Norm-referenced grading indeed promotes competition to the extent that students
would rather not help fellow students because by doing so, the mean of the class would be
raised and consequently it would be more difficult to get higher grades. Similarly, students
would do everything (legal) to pull down the scores of everyone else in order to lower the
mean and thus assure themselves of higher grades on the curve.
A more subtle problem with norm-referenced grading is that a strict correspondence
between the evaluation methods used and the course instructional goals is not necessary
to yield the required grade distribution. The specific learning objectives of norm-referenced
classes are often kept hidden, in part out of concern that instruction not "give away" the
test or the teacher's priorities, since this might tend to skew the curve. Since
norm-referenced grading is replete with problems, what alternatives have been devised for
grading the students?
8.2. Criterion-Referenced Grading
Criterion-referenced grading systems are based on a fixed criterion measure. There is a
fixed target and the students must achieve that target in order to obtain a passing grade in
a course regardless of how the other students in the class perform. The scale does not
change regardless of the quality, or lack thereof, of the students. For example, in a class of
100 students using the table below, no one might get a grade of excellent if no one scores
98 or above (or 85 or above, depending on the criterion used). There is no fixed percentage of
students who are expected to get the various grades in the criterion-referenced grading
system.
1.0 (Excellent)  = 98-100    or 85-100
1.5 (Good)       = 88-97     or 80-84
2.0 (Fair)       = 75-87     or 70-79
3.0 (Poor/Pass)  = 65-74     or 60-69
5.0 (Failure)    = below 65  or below 60
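
A minimal Python sketch of criterion-referenced grading with the first scale in the table above follows. The function name criterion_grade and the representation of the scale as a list of (minimum score, mark) pairs are assumptions for illustration; the second scale could be passed in the same way.

```python
# Cutoffs from the first scale in the table above: (minimum score, mark).
SCALE = [
    (98, "1.0 (Excellent)"),
    (88, "1.5 (Good)"),
    (75, "2.0 (Fair)"),
    (65, "3.0 (Poor/Pass)"),
]


def criterion_grade(score, scale=SCALE):
    """Return the mark for a raw score; the rest of the class plays no role."""
    for minimum, mark in scale:
        if score >= minimum:
            return mark
    return "5.0 (Failure)"


print(criterion_grade(90))  # 1.5 (Good)
print(criterion_grade(64))  # 5.0 (Failure)
```

Unlike the norm-referenced case, the function takes only the student's own score: a raw score of 90 earns the same mark in any class, and nothing prevents an entire class from earning the top mark or from failing.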
Criterion-referenced systems are often used in situations where the teachers agree
on the meaning of a "standard of performance" in a subject but the quality of the students is
unknown or uneven; where the work involves student collaboration or teamwork; and
where there is no external driving factor such as needing to systematically reduce a pool
of eligible students. Note that in a criterion-referenced grading system, students
can help a fellow student in group work without worrying about lowering
their own grades in that course. This is because the criterion-referenced grading system does not
use the class mean as the basis for distributing grades among the students. It is
therefore an ideal system to use in collaborative group work. When students are evaluated
based on predefined criteria, they are freed to collaborate with one another and with the
instructor. With criterion-referenced grading, a rich learning environment is to everyone's
advantage, so students are rewarded for finding ways to help each other, and for
contributing to class and small group discussions.
Since the criterion measure used in criterion-referenced grading is a measure that
ultimately rests with the teacher, it is logical to ask: What prevents teachers who use
criterion-referenced grading from setting the performance criteria so low that everyone can
pass with ease? There are a variety of measures used to prevent this situation from
happening. First, the criterion should not be based on only one
teacher's opinion or standard. It should be collaboratively arrived at. A group of teachers
teaching the same subject must set the criterion together. Second, once the criterion is
established, it must be made public and open to public scrutiny so that it does not become
arbitrary and subject to the whims and caprices of the teacher.
8.3. Four Questions in Grading
Marilla D. Svinicki (2007) of the Center for Teaching Effectiveness of the University of
Texas at Austin poses four intriguing questions relative to grading. We present these
questions here, together with the corresponding opinions of Ms. Svinicki, for your own
reflection:
1. Should grades reflect absolute achievement level or achievement relative to others in
the same class?
2. Should grades reflect achievement only or nonacademic components such as attitude,
speed and diligence?
3. Should grades report status achieved or amount of growth?
4. How can several grades on diverse skills combine to give a single mark?
8.4. What Should Go Into a Student's Grade
The grading system an instructor selects reflects his or her educational philosophy.
There are no right or wrong systems, only systems which accomplish different objectives.
The following are questions which an instructor may want to answer when choosing what
will go into a student's grade.
1. Should grades reflect absolute achievement level or achievement relative to others in
the same class?
This is often referred to as the controversy between norm-referenced and
criterion-referenced grading. In norm-referenced grading systems the letter grade a student receives
is based on his or her standing in the class. A certain percentage of those at the top receive A's, a
specified percent of the next highest grades receive B's and so on. Thus an outside person,
looking at the grades, can decide which student in that group performed best under those
circumstances. Such a system also takes into account circumstances beyond the students'
control which might adversely affect grades, such as poor teaching, bad tests or
unexpected problems arising for the entire class. Presumably, these would affect all the
students equally, so all performance would drop but the relative standing would stay the
same.
On the other hand, under such a system, an outside evaluator has little additional
information about what a student actually knows since that will vary with the class. A
student who has learned an average amount in a class of geniuses will probably know
more than a student who is average in a class of low ability. Unless the instructor provides
more information than just the grade, the external user of the grade is poorly informed.
The system also assumes sufficient variability among student performances that the
difference in learning between them justifies giving different grades. This may be true in
large beginning classes, but is a shaky assumption where the student
population is homogeneous such as in upper division classes.
The other most common grading system is the criterion referenced system. In this case
the instructor sets a standard of performance against which the students' actual
performance is measured. All students achieving a given level receive the grade assigned
to that level regardless of how many in the class receive the same grade. An outside
evaluator, looking at the grade, knows only that the student has reached a certain level or
set of objectives. The usefulness of that information to the outsider will depend on how
much information he or she is given on what behavior is represented by that grade. The
grade, however, will always mean the same thing and will not vary from class to class. A
possible problem with this is that outside factors such as those discussed under norm-
referenced grading might influence the entire class and performance may drop. In such a
case all the students would receive lower grades unless the instructor made special
allowances for the circumstances.
A second problem is that criterion-referenced grading does not provide "selection"
information. There is no way to tell from the grading who the "best" students are, only that
certain students have achieved certain levels. Whether one views this as positive or
negative will depend on one's individual philosophy.
An advantage of this system is that the criteria for various grades are known from the
beginning. This allows the student to take some responsibility for the level at which he or
she is going to perform. Although this might result in some students working below their
potential, it usually inspires students to work for a high grade. The instructor is then faced
with the dilemma of a lot of students receiving high grades. Some people view this as a
problem.
A positive aspect of this foreknowledge is that much of the uncertainty which often
accompanies grading for students is eliminated. Since they can plot their own progress
toward the desired grade, the students have little uncertainty about where they stand.
2. Should grades reflect achievement only or nonacademic components such as
attitude, speed and diligence?
It is a very common practice to incorporate such things as turning in assignments on
time into the overall grade in a course, primarily because the need to motivate students
to get their work done is a real problem for instructors. Also, it may be appropriate to
the selection function of grading that such values as timeliness and diligence be reflected
in the grades. External users of the grades may be interpreting the mark to include such
factors as attitude and compliance in addition to competence in the material.
The primary problem with such inclusion is that it makes grades even more ambiguous
than they already are. It is very difficult to assess these nebulous traits accurately or
consistently. Instructors must use real caution when incorporating such value judgments
into final grade assignment. Two steps instructors should take are (1) to make students
aware of this possibility well in advance of grade assignment and (2) to make clear what
behavior is included in such qualities as prompt completion of work and neatness or
completeness.
3. Should grades report status achieved or amount of growth?
This is a particularly difficult question to answer. In many beginning classes, the
background of the students is so varied that some students can achieve the end objectives
with little or no trouble while others with weak backgrounds will work twice as hard and
still achieve only half as much. This dilemma results from the same problem as the
previous question, that is, the feeling that we should be rewarding or punishing effort or
attitude as well as knowledge gained.

There are many problems with "growth" measures as a basis for change, most of them
being related to statistical artifacts. In some cases the ability to accurately measure
entering and exiting levels is shaky enough to argue against change as a basis for grading.
Also many courses are prerequisite to later courses and, therefore, are intended to provide
the foundation for those courses. "Growth" scores in this case would be disastrous.
Nevertheless, there is much to be said in favor of "growth" as a component in grading.
We would like to encourage hard work and effort and to acknowledge the existence of
different abilities. Unfortunately,
there is no easy answer to this question. Each instructor must review his or her own
philosophy and content to determine if such factors are valid components of the grade.
4. How can several grades on diverse skills combine to give a single mark?
The basic answer is that they can't really. The results of instruction are so varied that
the single mark is really a "Rube Goldberg" as far as indicating what a student has
achieved. It would be most desirable to be able to give multiple marks, one for each of the
variety of skills which are learned. There are, of course, many problems with such a
proposal. It would complicate an already complicated task. There might not be enough
evidence to reliably grade any one skill. The "halo" effect of good performance in one area
could spill over into others. And finally, most outsiders are looking for only one overall
classification of each person so that they can choose the "best." Our system requires that
we produce one mark. Therefore, it is worth our while to see how that can be done even
though currently the system does not lend itself to any satisfactory answers.
8.5. Standardized Test Scoring
Test standardization is a process by which teacher-made or researcher-made tests are
validated and item analyzed. After a thorough process of validation, the test characteristics
are established. These characteristics include: test validity, test reliability, test difficulty
level and other characteristics as previously discussed. Each standardized test uses its
own mathematical scoring system derived by the publisher and administrators, and these
do not bear any relationship to academic grading systems. Standardized tests are
psychometric instruments whose scoring systems are developed by norming the test using
national samples of test-takers, centering the scoring formula to assure that the likely
score distribution describes a normal curve when graphed, and then using the resulting
scoring system uniformly in a manner resembling a criterion-referenced approach. If you
are interested in understanding and interpreting the scoring system of a specific
standardized test, refer to the policies of the test's producers.
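To make the idea of norming more concrete, the following is a minimal sketch in Python of one generic way raw scores can be rescaled against a norming sample. It is an illustration only and not the scoring method of any actual standardized test. The function name `to_scaled_scores`, the sample data, and the choice of a center of 50 with a spread of 10 are all assumptions made for this example. Note that a linear rescaling like this only centers the scores on the norming sample; it does not by itself make the distribution normal.

```python
from statistics import mean, pstdev


def to_scaled_scores(norm_sample, raw_scores, center=50.0, spread=10.0):
    """Rescale raw scores relative to a norming sample (illustrative only).

    Each raw score is expressed as a distance from the norming-sample mean,
    measured in standard deviations, then shifted to the chosen center.
    """
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)  # population standard deviation of the sample
    if sigma == 0:
        raise ValueError("norming sample has no spread")
    return [center + spread * (raw - mu) / sigma for raw in raw_scores]


# Hypothetical norming sample and two hypothetical test-takers.
norm_sample = [30, 40, 50, 60, 70]
scaled = to_scaled_scores(norm_sample, [50, 70])
print([round(s, 2) for s in scaled])
```

A raw score equal to the norming-sample mean lands exactly on the center, and scores above the mean land above it. This is why such scaled scores bear no direct relationship to academic grades.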
8.6. Cumulative and Averaging Systems of Grading
In the Philippines, there are two types of grading systems used: the averaging and the
cumulative grading systems. In the averaging system, the grade of a student in a particular
grading period equals the average of the grades obtained in the prior grading periods and
the current grading period. In the cumulative grading system, the grade of a student in a
grading period equals his or her current grading period grade, which is assumed to carry the
cumulative effects of the previous grading periods. In which grading system would there be
more fluctuations observed in the students' grades? How do these systems relate to
either norm-referenced or criterion-referenced grading?
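The two definitions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `averaging_grades` and `cumulative_grades` and the four quarterly grades are hypothetical, and each period's own grade is assumed to be already known.

```python
def averaging_grades(period_grades):
    """Averaging system: the reported grade for each period is the mean
    of the grades from all prior periods and the current period."""
    reported = []
    running_total = 0.0
    for count, grade in enumerate(period_grades, start=1):
        running_total += grade
        reported.append(running_total / count)
    return reported


def cumulative_grades(period_grades):
    """Cumulative system: the reported grade is simply the current
    period's grade, assumed to already reflect earlier periods."""
    return list(period_grades)


# Hypothetical grades earned in four grading periods.
quarters = [80, 90, 70, 88]
avg = averaging_grades(quarters)
cum = cumulative_grades(quarters)
print(avg)  # [80.0, 85.0, 80.0, 82.0]
print(cum)  # [80, 90, 70, 88]
```

Placing the two lists side by side is one way to explore the question of which system shows more fluctuation from one grading period to the next.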
8.7. Policy Guidelines on Classroom Assessment for the K to 12 Basic Education Program, DepEd
Order No. 8, s. 2015
Below are some of the highlights of the K to 12 grading system, which was
implemented starting SY 2015-2016. These are all lifted from DepEd Order No. 8, s. 2015.
Weights of the Components for the Different Grade Levels and Subjects
The student's grade is a function of three components: 1) written work, 2) performance
tasks and 3) quarterly assessment. The percentages vary across clusters of subjects.
Languages, Araling Panlipunan (AP) and Edukasyon sa Pagpapakatao (ESP) belong to
one cluster and have the same grade percentages for written work, performance tasks and
quarterly assessment. Science and Math are another cluster with the same component
percentages. Music, Arts, Physical Education and Health (MAPEH) make up the third
cluster with the same component percentages. Across the three clusters, performance
tasks carry the largest weight, or share the largest weight with written work in Science
and Math. This means that the emphasis in assessment is on application of concepts learned.
Table 4. Weight of the Components for Grades 1-10

Components              Languages, AP, ESP    Science, Math    MAPEH, EPP/TLE
Written Work            30%                   40%              20%
Performance Tasks       50%                   40%              60%
Quarterly Assessment    20%                   20%              20%
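As an illustration of how the weights in Table 4 combine, here is a minimal Python sketch. The cluster keys, the function name `weighted_grade`, and the sample percentage scores are hypothetical. The sketch assumes each component has already been converted to a percentage score out of 100, and it stops at the weighted sum without covering any later conversion step the Order may prescribe.

```python
# Component weights for Grades 1-10, taken from Table 4.
WEIGHTS = {
    "languages_ap_esp": {
        "written_work": 0.30,
        "performance_tasks": 0.50,
        "quarterly_assessment": 0.20,
    },
    "science_math": {
        "written_work": 0.40,
        "performance_tasks": 0.40,
        "quarterly_assessment": 0.20,
    },
    "mapeh_epp_tle": {
        "written_work": 0.20,
        "performance_tasks": 0.60,
        "quarterly_assessment": 0.20,
    },
}


def weighted_grade(cluster, percentage_scores):
    """Return the weighted sum of the three component percentage scores."""
    weights = WEIGHTS[cluster]
    return sum(weights[name] * percentage_scores[name] for name in weights)


# Hypothetical percentage scores for one student in one quarter.
scores = {
    "written_work": 80.0,
    "performance_tasks": 90.0,
    "quarterly_assessment": 70.0,
}

for cluster in WEIGHTS:
    print(cluster, round(weighted_grade(cluster, scores), 2))
```

With the same three component scores, the result differs by cluster because performance tasks count for more in MAPEH and EPP/TLE than in Science and Math.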
Table 5 presents the weights of the components for the Senior High School subjects,
which are grouped into 1) core subjects, 2) all other subjects (applied and specialization)
and work immersion/ research/ business enterprise simulation/ exhibit/ performance of the
Academic track, and 3) all other subjects (applied and specialization) and work immersion/
research/ exhibit/ performance of the Technical-Vocational and Livelihood (TVL), Sports,
and Arts and Design tracks. An analysis of the figures reveals that among the components,
performance tasks have the highest percentage contribution to the grade. This means that
DepEd's grading system consistently puts most emphasis on application of learned concepts
and skills.
Table 5. Weight of the Components for SHS

Subject group                                        Written Work    Performance Tasks    Quarterly Assessment
Core Subjects                                        25%             50%                  25%
Academic Track
  All other subjects                                 25%             45%                  30%
  Work Immersion/ Research/ Business Enterprise
  Simulation/ Exhibit/ Performance                   35%             40%                  25%
Technical-Vocational and Livelihood (TVL)/ Sports/ Arts and Design Track
  All other subjects                                 20%             60%                  20%
  Work Immersion/ Research/ Exhibit/ Performance     20%             60%                  20%
Alternative Grading Systems
Pass-Fail Systems. Some colleges and universities, faculties, schools, and institutions
in the Philippines use pass-fail grading systems, especially when the student's work to be
evaluated is highly subjective (as in the fine arts and music), when there are no generally
accepted standard gradations (as with independent studies), or when the critical requirement
is meeting a single satisfactory standard (as in some professional examinations and
practicum).
Non-Graded Evaluations. While not yet practised in Philippine schools and institutions,
non-graded evaluations do not assign numeric or letter grades as a matter of policy. This
practice is usually based on a belief that grades introduce an inappropriate and distracting
element of competition into the learning process, or that they are not as meaningful as
measures of intellectual growth and development as are carefully crafted faculty
evaluations. Many faculty, schools, and institutions that follow a no-grade policy will, if
requested, produce grades or convert their student evaluations into formulae acceptable to
authorities who require traditional measures of performance.
The process of deciding on a grading system is a very complex one. The problems
faced by an instructor who tries to design a system which will be accurate and fair are
common to any manager attempting to evaluate those for whom he or she is responsible.
The problems of teachers and students with regard to grading are almost identical to those
of administrators and faculty with regard to evaluation for promotion and tenure. The need
for completeness and objectivity felt by teachers and administrators must be balanced
against the need for fairness and clarity felt by students and faculty in their respective
situations. The fact that the faculty member finds himself or herself in both the position of
evaluator and evaluated should help to make him or her more thoughtful about the needs
of each position.


