
Introduction to Student Assessment and Evaluation

INTRODUCTION

Evaluation of educational programmes is a global issue. It involves the effectiveness of proposals, curricula, instructional materials, organization, administration, resources and their utilization, facilities, duration, etc. Though a progressive evaluator will be concerned with many issues in evaluating educational programmes, the practicing teacher will mainly be concerned with student assessment and evaluation.

From this point of view, evaluation is defined as a systematic process of determining the extent to which the learners achieve the instructional/training objectives. It may include a quantitative or qualitative description of learner behaviour plus a value judgement concerning its worth. It is imperative that we make judgements based on proper information (qualitative or quantitative) gathered through tools and techniques suitably designed for the purpose.

A sound evaluation programme will include both measurement and non-measurement techniques, each to be used as appropriate.

Evaluation may be based on information obtained by:

1. presenting an individual with a task/set of tasks to perform
2. asking him/her questions about himself/herself
3. asking others to appraise his/her behaviour

Role of Evaluation

The relationship between evaluation and the other components of the teaching-learning process is shown in the following diagram:

Figure 1.1 : Instructional System

Evaluation involves value judgement with respect to the actual achievement of objectives in comparison with the proposed ones. The tools for gathering the needed information therefore have to be designed using the proposed objectives as reference points.

Purposes of Evaluation

Feedback provided by evaluation influences the student to

• know his/her strengths and weaknesses and direct his/her study efforts to
make up for gaps in knowledge and understanding

• compare his/her progress with that of his/her peers and get motivated to
do better

• develop regular and good study habits (if assessment is continuous)

It helps the teacher to

• assess how effective the instructional methods and strategies used are
• detect students' learning difficulties and provide for remedy
• identify individual student differences and suitably adapt teaching
strategies
• grade students

It helps administrators to

• make structural changes in the system, such as providing more resources or
revising the curriculum, to improve it

General Principles

Some general principles that provide direction to the evaluation process are:
• Evaluation is a systematic process to determine the extent to which objectives
are achieved. This means that formulating objectives in clear terms is an
important prerequisite, as that will spell out 'what to evaluate'.
• Evaluation procedures are selected in terms of the purposes to be served. The question is not 'should this procedure be used?' but rather 'when should this procedure be used?' A particular procedure is suitable for certain purposes and not appropriate for others.
• A variety of procedures are needed for evaluation. Tests (of different types), self-report techniques and observation are some of the procedures available. Appropriate procedures are to be used depending on the nature of the objectives (cognitive, psychomotor and affective) to ensure comprehensive evaluation.
• Knowledge of the limitations as well as the strengths of different evaluation procedures is needed for their proper use. A teacher/trainer should develop skills in minimizing errors in evaluation by being able to design and use different procedures appropriately.
• Evaluation is a means to an end and not an end in itself. Evaluation has to be
looked upon as a process of obtaining reliable information upon which to base
educational decisions (instructional, guidance or administrative). It is not the
end of the teaching learning process.
Concepts of Educational Testing
Testing is neither assessment nor appraisal, but it may become a means of getting the information, data or evidence needed for assessment and appraisal. Testing is one of the most significant and widely used techniques in any system of examination or evaluation. It envisages the use of instruments or tools for gathering information or data. In written examinations, the question paper is one of the most potent tools employed for collecting information about pupils' achievement.

Context of Educational Testing


A test of educational achievement is one designed to measure knowledge, understanding, or skills in a specified subject or group of subjects. The test might be restricted to a single subject, such as arithmetic, or it might cover a group of subjects, yielding a separate score for each subject and a total score for the several subjects combined.
Tests of educational achievement differ from those of intelligence in that the former are concerned with the quantity and quality of learning attained in a subject of study, or group of subjects, after a period of instruction, whereas the latter are general in scope and are intended for the measurement and analysis of psychological processes, although they must of necessity employ some acquired content that resembles the content found in achievement tests.
In student assessment and evaluation, it is necessary to understand that teaching is a process in which effective teachers organize the environment to provide students with active, hands-on learning and authentic tasks that aim to meet the intended outcomes at the end of instruction. Opportunities for “active” learning experiences, in which students are asked to use ideas by writing and talking about them, creating models and demonstrations, applying these ideas to more complex problems, and constructing projects that require the integration of many ideas, have been found to promote deeper learning. So, learning theories play an important role in the educational system.

Learning Theory
Learning theories are conceptual frameworks describing how information is absorbed,
processed and retained during learning. Cognitive, emotional, and environmental influences,
as well as prior experience, all play a part in how understanding, or a world view, is acquired
or changed and knowledge and skills retained.
Behaviorists look at learning as an aspect of conditioning and will advocate a system of
rewards and targets in education. Educators who embrace cognitive theory believe that the
definition of learning as a change in behavior is too narrow and prefer to study the learner
rather than their environment and in particular the complexities of human memory. Those
who advocate constructivism believe that a learner's ability to learn relies to a large extent on
what he already knows and understands, and the acquisition of knowledge should be an
individually tailored process of construction. Transformative learning theory focuses upon the often necessary change that is required in a learner's preconceptions and world view.
Behaviorism
Behaviorism is a philosophy of learning that only focuses on objectively observable
behaviors and discounts mental activities. Behavior theorists define learning as nothing more
than the acquisition of new behavior. Experiments by behaviorists identify conditioning as a
universal learning process. There are two different types of conditioning, each yielding a
different behavioral pattern: Classical conditioning occurs when a natural reflex responds to a
stimulus.

The most popular example is Pavlov's observation that dogs salivate when they eat or even
see food. Essentially, animals and people are biologically "wired" so that a certain stimulus
will produce a specific response. Behavioral or operant conditioning occurs when a response
to a stimulus is reinforced. Basically, operant conditioning is a simple feedback system: If a
reward or reinforcement follows the response to a stimulus, then the response becomes more
probable in the future. For example, leading behaviorist B.F. Skinner used reinforcement
techniques to teach pigeons to dance and bowl a ball in a mini-alley.
How Behaviorism impacts learning:
• Positive and negative reinforcement techniques of Behaviorism can be very effective.
• Teachers use Behaviorism when they reward or punish student behaviours.
Cognitivism
Jean Piaget authored a theory based on the idea that a developing child builds
cognitive structures, mental "maps", for understanding and responding to physical
experiences within their environment. Piaget proposed that a child's cognitive structure
increases in sophistication with development, moving from a few innate reflexes such as
crying and sucking to highly complex mental activities.
The four developmental stages of Piaget's model, and the processes by which children progress through them, are:
• Sensorimotor stage: the child, through physical interaction with his or her environment, builds a set of concepts about reality and how it works.
• Preoperational stage: the child is not yet able to conceptualize abstractly and needs concrete physical situations.
• Concrete operational stage: as physical experience accumulates, the child starts to conceptualize, creating logical structures that explain his or her physical experiences. Abstract problem solving is also possible at this stage; for example, arithmetic equations can be solved with numbers, not just with objects.
• Formal operational stage: by this point, the child's cognitive structures are like those of an adult and include conceptual reasoning.
Piaget proposed that during all development stages, the child experiences their environment using whatever mental maps they have constructed. If the experience is a repeated one, it fits easily - or is assimilated - into the child's cognitive structure so that they maintain mental "equilibrium". If the experience is different or new, the child loses equilibrium, and alters their cognitive structure to accommodate the new conditions. In this way, the child constructs increasingly complex cognitive structures.

How Piaget's theory impacts learning:


• Curriculum - Educators must plan a developmentally appropriate curriculum that
enhances their students' logical and conceptual growth.
• Instruction - Teachers must emphasize the critical role that experiences, or
interactions with the surrounding environment, play in student learning. For example,
instructors have to consider the role that fundamental concepts, such as the
permanence of objects, play in establishing cognitive structures.

Constructivism
Constructivism is a philosophy of learning founded on the premise that, by reflecting on our experiences, we construct our own understanding of the world we live in. Each of us
generates our own "rules" and "mental models," which we use to make sense of our
experiences. Learning, therefore, is simply the process of adjusting our mental models to
accommodate new experiences.

The guiding principles of Constructivism:


• Learning is a search for meaning. Therefore, learning must start with the issues around which students are actively trying to construct meaning.
• Meaning requires understanding wholes as well as parts, and parts must be understood in the context of wholes. Therefore, the learning process focuses on primary concepts, not isolated facts.
• In order to teach well, we must understand the mental models that students use to perceive the world and the assumptions they make to support those models.
• The purpose of learning is for an individual to construct his or her own meaning, not just memorize the "right" answers and repeat someone else's meaning. Since education is inherently interdisciplinary, the only valuable way to measure learning is to make assessment part of the learning process, ensuring it provides students with information on the quality of their learning.

How Constructivism impacts learning:


• Curriculum - Constructivism calls for the elimination of a standardized curriculum.
Instead, it promotes using curricula customized to the students' prior knowledge. Also,
it emphasizes hands-on problem solving.
• Instruction - Under the theory of constructivism, educators focus on making
connections between facts and fostering new understanding in students. Instructors
tailor their teaching strategies to student responses and encourage students to analyze,
interpret and predict information. Teachers also rely heavily on open-ended questions
and promote extensive dialogue among students.
• Assessment - Constructivism calls for the elimination of grades and standardized
testing. Instead, assessment becomes part of the learning process so that students play
a larger role in judging their own progress.
Different forms of Assessment
Assessment frames learning, creates learning activity and orients all aspects of learning behavior. Tests and other assessment procedures can also be classified in terms of their functional role in classroom instruction. The functional role reflects the sequence in which assessment procedures are likely to be used in the classroom. This kind of sequencing and categorization continues today. According to David Miller, the classification is:
• Placement assessment: to determine student performance at the beginning of instruction
• Formative assessment: to monitor learning progress during instruction
• Diagnostic assessment: to diagnose learning difficulties during instruction
• Summative assessment: to assess achievement at the end of instruction
Placement Assessment
Placement assessment is concerned with the student's entry performance and typically focuses
on questions such as (a) Does the student possess the knowledge and skills needed to begin the
planned instruction? For example, to begin algebra, a student should have sufficient command of essential mathematics concepts. (b) To what extent has the student already
developed the understanding and skills that are the goals of the planned instruction? Sufficient
levels of comprehension and proficiencies might indicate the desirability of skipping certain
units or of being placed in a more advanced course. (c) To what extent do the student's interests,
work habits, and personality characteristics indicate that one mode of instruction might be
better than another (e.g., group instruction versus independent study)? The goal of placement
assessment is to determine for each student the position in the instructional sequence and the
mode of instruction that is most beneficial.

Formative Assessment
Assessment for learning is a formative assessment. Formative assessment is used to monitor
learning progress during instruction. Its purpose is to provide continuous feedback to both
students and teachers concerning learning successes and failures. The wide variety of
information that teachers collect about students’ learning processes provides the basis for
determining what they need to do next to move student learning forward. It provides the basis
for providing descriptive feedback for students and deciding on groupings, instructional
strategies, and resources. The feedback to students provides reinforcement of successful learning and identifies the specific learning errors and misconceptions that need correction. Formative assessment depends heavily on specially prepared tests and assessments for each segment of instruction, that is, unit-wise or chapter-wise. Tests and other types of assessment tasks used for formative assessment are most frequently teacher-made, but customized tests made available by publishers of textbooks and other instructional materials can also serve this function. Observational techniques are, of course, also useful in monitoring student progress and identifying learning errors. Because formative assessment is directed toward improving learning and instruction, the results are typically not used for assigning course grades.

Role of faculty member in formative assessment


Assessment for learning occurs throughout the learning process. It is interactive, with teachers:
• aligning instruction
• identifying particular learning needs of students or groups
• selecting and adapting materials and resources
• creating differentiated teaching strategies and learning opportunities for helping individual students move forward in their learning
• providing immediate feedback and direction to students

Diagnostic Assessment
Diagnostic assessment is a highly specialized procedure. It is concerned with the
persistent or recurring learning difficulties that are left unresolved by the standard corrective
prescriptions of formative assessment. If a student continues to experience failure in reading,
mathematics, or other subjects despite the use of prescribed alternative methods of instruction,
then a more detailed diagnosis is indicated. To use a medical analogy, formative assessment
provides first-aid treatment for simple learning problems, and diagnostic assessment searches
for the underlying causes of those problems that do not respond to first-aid treatment. Thus,
diagnostic assessment is much more comprehensive and detailed. It involves the use of
specially prepared diagnostic tests as well as various observational techniques. Serious learning
disabilities are also likely to require the services of educational counsellors and medical
specialists. The aim of diagnostic assessment is to determine the causes of persistent learning
problems and to formulate a plan for remedial action.

Summative Assessment
The last kind of assessment, summative assessment, is also called assessment of learning. Summative assessment typically comes at the end of a course of instruction. It is designed to determine the extent to which the instructional goals have been achieved and is used primarily for assigning course grades or for certifying student mastery of the intended learning outcomes. The techniques used in summative assessment are determined by the instructional goals, but they typically include teacher-made achievement tests, ratings
on various types of performance, and assessments of products. These various sources of
information about student achievement may be systematically collected into a portfolio that
may be used to summarize or showcase the student’s accomplishments and progress. Although
the main purpose of summative assessment is grading or the certification of student
achievement, it also provides information for judging the appropriateness of the course
objectives and the effectiveness of the instruction.

Faculty member role in Summative Assessment


Faculty members have the responsibility of reporting student learning accurately and
fairly, based on evidence obtained from a variety of contexts and applications. Effective
assessment of learning requires that faculty members provide:
• a rationale for undertaking a particular assessment of learning at a particular point in
time
• clear descriptions of the intended learning
• processes that make it possible for students to demonstrate their competence and skill
• a range of alternative mechanisms for assessing the same outcomes
• public and defensible reference points for making judgements
• transparent approaches to interpretation
• descriptions of the assessment process
• strategies for recourse in the event of disagreement about the decisions

So, sequencing of assessment makes the teaching-learning process more sensible. Among all the forms of assessment, formative assessment is the most powerful for improving the teaching-learning process, and within formative assessment, questioning the students is an art.
Graduate Attributes lead to Programme Outcomes

Every student becomes a graduate at the end of their programme, and each graduate should possess appropriate graduate attributes. The term Graduate Attribute has been defined differently by educationalists.

The most popular definition describes graduate attributes as the qualities, skills and understandings an institution's community agrees its students would desirably develop during their time at the institution. Another expert explains that the term Graduate Attribute (GA) is abstract and denotes broad concepts such as employability, lifelong learning, and preparation for an uncertain future.

The National Board of Accreditation, Govt. of India, has listed the graduate attributes for different programmes. The details are available at http://www.nbaind.org. The following list gives the graduate attributes that lead to the programme outcomes for the UG Engineering programme.

Engineering Graduates will be able to (Source: http://www.nbaind.org)

1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first
principles of mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering
problems and design system components or processes that meet the specified needs
with appropriate consideration for the public health and safety, and the cultural,
societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data,
and synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex
engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent
responsibilities relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate the
knowledge of, and need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member
or leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with
the engineering community and with society at large, such as, being able to
comprehend and write effective reports and design documentation, make effective
presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a
member and leader in a team, to manage projects and in multidisciplinary
environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.

Graduate attributes can be assessed indirectly in a programme. The assessment starts at the micro level with course outcomes (COs). Achievement of all the course outcomes leads to the programme outcomes (POs) and programme specific outcomes (PSOs), and the achievement of the POs and PSOs is indirectly linked with the graduate attributes.
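
As an illustration only (and not the NBA's prescribed procedure), the following Python sketch shows one common way course outcome attainment can be rolled up into programme outcome attainment through a CO-PO mapping matrix; the outcome labels, mapping weights and attainment values are assumptions made for the example.

```python
# Illustrative sketch only (not the NBA's prescribed method): rolling up
# course-outcome (CO) attainment into programme-outcome (PO) attainment using
# a CO-PO mapping matrix. All labels, weights and attainment values below are
# hypothetical assumptions.

# Attainment of each course outcome on a 0-3 scale (assumed values).
co_attainment = {"CO1": 2.4, "CO2": 1.8, "CO3": 2.9}

# CO-PO mapping: how strongly each CO addresses each PO (3 = high, 1 = low).
co_po_map = {
    "CO1": {"PO1": 3, "PO2": 1},
    "CO2": {"PO1": 2, "PO3": 3},
    "CO3": {"PO2": 2, "PO3": 1},
}

def po_attainment(co_attainment, co_po_map):
    """Weighted average of CO attainment for every PO touched by the mapping."""
    totals, weights = {}, {}
    for co, po_weights in co_po_map.items():
        for po, w in po_weights.items():
            totals[po] = totals.get(po, 0.0) + w * co_attainment[co]
            weights[po] = weights.get(po, 0) + w
    return {po: round(totals[po] / weights[po], 2) for po in totals}

print(po_attainment(co_attainment, co_po_map))
# e.g. {'PO1': 2.16, 'PO2': 2.73, 'PO3': 2.08}
```

In practice, the CO attainment values themselves would be derived from test items, assignments and rubrics mapped to each course outcome.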
The Languages of Assessment

Test: A test is a procedure in which a sample of an individual’s behavior is obtained, evaluated and scored using standardized procedures. It is a method to determine a student's ability to complete certain tasks or demonstrate mastery of a skill or knowledge of content. Some types would be multiple choice tests, or a weekly spelling test. While it is commonly used interchangeably with assessment, or even evaluation, it can be distinguished by the fact that a test is one form of an assessment.

Assessment: It is a systematic procedure for collecting information that can be used to make inferences about the characteristics of people or objects. It is a process of gathering information to monitor progress and make educational decisions if necessary. As noted in the definition of a test above, an assessment may include a test, but it also includes methods such as observations, interviews, behavior monitoring, etc.

Measurement: Measurement is a set of rules or procedures for assigning numbers to represent objects, traits, attributes or behaviors. Beyond this general definition, measurement refers to the set of procedures, and the principles for how to use those procedures, in educational tests and assessments. Some of the basic measurement concepts in educational evaluation are raw scores, percentile ranks, derived scores, standard scores, etc.
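
As a rough sketch of how such derived scores can be obtained from raw scores, the snippet below computes percentile ranks and standard (z) scores for a small set of hypothetical test scores; the scores and the particular percentile-rank convention used are assumptions for illustration.

```python
# A minimal sketch (assumed data) of two measurement procedures mentioned above:
# percentile ranks and standard (z) scores derived from raw scores.
from statistics import mean, pstdev

raw_scores = [42, 55, 61, 61, 70, 78, 85]  # hypothetical class test scores

def percentile_rank(score, scores):
    """Percentage of scores below the given score, counting ties as half."""
    below = sum(s < score for s in scores)
    ties = sum(s == score for s in scores)
    return 100 * (below + 0.5 * ties) / len(scores)

def z_score(score, scores):
    """Standard score: distance from the mean in standard-deviation units."""
    return (score - mean(scores)) / pstdev(scores)

for s in sorted(set(raw_scores)):
    print(f"raw {s:3d}  percentile rank {percentile_rank(s, raw_scores):5.1f}  z {z_score(s, raw_scores):+.2f}")
```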

Evaluation: Basically, evaluation is the process of making judgements based on criteria and evidence. It also refers to the procedures used to determine whether the subject (i.e., the student) meets preset criteria, such as qualifying for special education services. This uses assessment (remember that an assessment may be a test) to make a determination of qualification in accordance with a predetermined criterion.
Questioning Skill
Questioning skill is an important skill that needs to be developed among faculty members. Questioning plays a major role in all forms of assessment, particularly placement, formative and diagnostic assessment. Much classroom practice can be described as assessment activities. Teachers set tasks and activities and pose questions to learners. Learners respond to the tasks, activities and questions, and the teachers make judgements on the learners' knowledge, understanding and skills acquisition as evidenced in the learners' responses. These judgements
on learners’ performance happen quite naturally in the course of any teaching and learning
session and require two-way dialogue, decision-making and communication of the assessment
decision in the form of quality feedback to the learner on their performance. Depending on how
successfully these classroom practices have been undertaken, learning will have taken place in
varying degrees from learner to learner.

Testing the learning is an important part of classroom practice, and questioning is one of the
most common methods of checking learner understanding. Questioning is something teachers
do naturally as part of their daily routine, but developing the skills associated with questioning
techniques presents many challenges for teachers and it is something that is developed over
time. Teachers need to review what is to be learnt in any one teaching and learning session and
plan for the inclusion of questioning accordingly. Teachers must know when to pose open and closed questions, how to develop a question distribution strategy, and when to use questions to check learners' knowledge.

So, we have to remember the following points:


• the learning aims and objectives are clearly defined and shared with the learner and
• methods of testing for learning are appropriately identified.
As a faculty member you need to be aware that :
• questioning is a skill which needs to be developed
• communication is a two-way process
• questioning is a good way to develop an interaction style of communication
• you need confidence to develop questioning skills
• when you pose a question, you have no idea as to what the learner is going to say,
despite your hopes
• you need to have the courage and confidence to deal with any answer, no matter how strange it is
• when you ask questions, you have to be prepared for the learner asking you a wide range of searching questions in response
• you have to be confident in your subject matter and be well prepared
• one of the first stages of questioning is getting the learner to talk, which may seem
strange as teachers spend a lot of time trying to get the learner to be quiet
• some teachers tend to talk far too much without checking that learners understand what
they are saying.
When you question students, remember the following:
• First, ask the question, leave some time for thinking, and then name a student to answer. This kind of questioning makes every student think about the answer.
• Alternatively, name a student and ask the question directly when that student is not being attentive in class. We should not mix up questioning a student directly with posing a question and giving time for thinking.
• Now, can you tell the difference between these two ways of asking questions? Yes, you are correct: the first is used to get feedback on learning, and the second is used to bring back the attention of a specific learner.

So, use Pose, Pause, Pounce (PPP):
• Pose the question to the whole group.
• Pause, allowing all learners to think of the answer.
• Pounce: name a learner to answer.
• Listen to the answer.
• Reward correct answers.
• Incorrect answers should not be ridiculed either by the teacher or the remainder of the
group of learners.
• Spread the questions around the class so that all can participate.
The distribution of questions is again very important.
• If teachers work around the class in an obvious systematic order, those who have
answered tend to relax a little, and sometimes ‘switch off’.
• Use a technique which is not obvious.
• Be conscious of the tendency to choose the same learners when asking questions.
• Most teachers tend to concentrate their attention on those learners, so deliberately pay
attention to those normally omitted.
Different Questioning types in the classroom

Closed-ended questions

We have discussed different questioning strategies in “Development of Questioning Skill of the Teacher”. In that lecture, we discussed different scenarios for asking questions and bringing back the attention of the students. Closed-ended questions are popular as icebreaker questions in group situations because they are easy to answer. Of course, most questions, including closed questions, can be opened up for further discussion.

There are many advantages to closed questions. They’re quick and easy to
respond to and generally reduce confusion. They’re also particularly useful for
challenging pupils’ memory and recalling facts.

There are, however, also a number of disadvantages to using closed questions.


Students may start to try and guess what you're thinking and give an answer based on that. They may also become anxious that they're going to get the answer wrong, which reduces their willingness to answer. Rather than ending with closed-ended questions, the teacher should use them as a starting point for other types of questioning.

Useful for: warming up group discussions, getting a quick answer

Open-ended questions

Open-ended questions require a little more thought and generally encourage wider discussion and elaboration. They can't be answered with a simple yes or no response.

Useful for: critical or creative discussion, finding out more information about a
concept or lesson
Probing questions

These questions are useful for gaining clarification and encouraging others to
tell you more information about a subject. Probing questions are usually a
series of questions that dig deeper and provide a fuller picture.

When a teacher wishes to start a new lesson, the teacher can begin with questions; at the point where the students are no longer able to answer, the teacher introduces the new lesson. Bringing the students to that point with questions and then introducing the concept or lesson motivates the students to listen to the class.

Leading questions

Leading questions help the teacher to identify the exact difficulty in understanding the concepts. Sometimes, the students do not know the exact way to solve the given problem in their subject. This also happens during the students' project phase or mini projects. In this scenario, the faculty member has to start with basic questions, keep moving to the next question based on the students' answers, and guide them to a solution.

Useful for: building positive discussions in the classroom, steering a discussion towards an outcome that serves your interest

It’s important to use leading questions carefully; they can be seen as an unfair
way of getting the answer you want.

Loaded questions
Loaded questions are seemingly straightforward, closed questions with a twist: they contain an assumption about the respondent. They are popularly used by examiners during the viva-voce of a laboratory or project to trick the interviewee or student into revealing the fundamental concept of the lab or project that they would otherwise be unwilling to disclose.

For example, the question: ‘have you stopped copying the answers from the
nearby students?’ assumes the respondent copied more than once. Whether
the student answers yes or no, the student will admit to having copied the
answer at some point.

Of course, the preferred response would be: ‘I have never copied answers in my examination.’ But it's not always easy to spot the trap. These questions are quite
rightly seen as manipulative.

Useful for: discovering facts about someone who would otherwise be reluctant
to offer up the information

Funnel questions

When a faculty member wishes to start with a generalized discussion and then get into the concepts in detail, funnel questions are more suitable. Funnel questions are very useful when refreshing concepts before exams or practicals.

Beginning with a broad question before narrowing down to a specific concept, a technique also used when questioning witnesses, helps gain the maximum amount of information about a lesson or concept.

Funnel questions can also be used to bring the students into a relaxed, attentive mode: asking students to go into detail about their difficulties in learning or listening distracts them from their anxiety and gives the faculty member the information needed to provide them with a solution, which in turn calms them down and makes them think something positive is being done to help them.

Recall and process questions

As a preparatory assessment at the start of a session, recall questions are very useful for faculty members. Recall questions require the student to remember the lesson taught in the earlier class. For example, when a faculty member wishes to begin a class by connecting it to the previous class's concepts and making the students remember them, recall questions are a suitable way to start the session. This kind of question can also be directed at a student who is not listening to the class, to bring back attention with simple recall questions.

Rhetorical questions

Rhetorical questions help the students always remember a concept, formula or statement. They are asked to keep the students engaged in the class by remembering the lesson. They also help students to think, be creative and come up with ideas.

Sometimes, when a faculty member gives a webinar or lecture to a large audience, rhetorical questions keep the audience engaged in the lecture. In online classes, polling questions likewise create more engagement during the session; those polling questions help to keep the attention of the audience.
General Principles for Evaluation Process
Some general principles that provide direction to the evaluation process are:
• Evaluation is a systematic process to determine the extent to which objectives are
achieved. This means that formulating objectives in clear terms is an important
prerequisite, as that will spell out 'what to evaluate'.
• Evaluation procedures are selected in terms of the purposes to be served. The question is not 'should this procedure be used?' but rather 'when should this procedure be used?' A particular procedure is suitable for certain purposes and not appropriate for others.
• A variety of procedures are needed for evaluation. Tests (different types), self-report
techniques and observation are some of the procedures available.
• Appropriate procedures are to be used depending on the nature of the objectives (cognitive, psychomotor and affective) to ensure comprehensive evaluation.
• Knowledge of the limitations as well as the strengths of different evaluation procedures is needed for their proper use. A teacher/trainer should develop skills in minimizing errors in evaluation by being able to design and use different procedures appropriately.
• Evaluation is a means to an end and not an end in itself. Evaluation has to be looked
upon as a process of obtaining reliable information upon which to base educational
decisions (instructional, guidance or administrative). It is not the end of the teaching
learning process.

Consequences of Testing on Students


Critics of testing argue that testing is likely to have certain undesirable effects on
students. Some of the most commonly mentioned charges directed toward the use of aptitude
and achievement tests are listed here with brief comments.

Criticism 1: Tests Create Anxiety. There is no doubt that anxiety increases during testing.
For most students, it motivates them to perform better. For a few, test anxiety may be so great
that it interferes with test performance. These typically are students who are generally
anxious, and the test simply adds to their already high level of anxiety. A number of steps can
be taken to reduce test anxiety, such as thoroughly preparing for the test, taking practice
exercises, and using liberal time limits. Fortunately, many test publishers in recent years have
provided practice tests and shifted from speed tests to power tests. This should help, but it is
still necessary to observe students carefully during testing and to discount the scores of
overly anxious students.
Criticism 2: Tests Categorize and Label Students. Categorizing and labeling individuals can
be a serious problem, particularly when those labels are used as an excuse for poor student achievement rather than as a means of providing the extra services and help needed to ensure better achievement. It is all too easy to place individuals in pigeonholes and apply labels that determine, at least in part, how they are viewed and treated. Classifying students in terms of levels of mental ability has probably caused the greatest concern in education. When students are classified as mentally retarded, for example, it influences how teachers and peers view them, how they view themselves, and the kind of instructional programs they receive. When students are mislabeled as mentally retarded, as has been the case with some racial and ethnic minorities, the problem is compounded. At least some of the support for mainstreaming handicapped students has come from the desire to avoid the categorizing and labeling that accompanies special education classes. Classifying students into various types of learning groups can make more efficient use of the teacher's time and the school's resources. However, when grouping, teachers must consider that tests measure only a limited sample of a student's abilities and that students are continuously changing and developing. By keeping the
groupings tentative and flexible and regrouping for different subjects (e.g., science and math),
teachers can avoid most of the undesirable features of grouping. It is when the categories are
viewed as rigid and permanent that labeling becomes a serious problem. In such cases, it is
not the test that should be blamed but the user of the test.

Criticism 3: Tests Damage Students' Self-Concepts. This is a concern that requires the
attention of teachers, counselors, and other users of tests. The improper use of tests may
indeed contribute to distorted self-concepts. The stereotyping of students is one misuse of
tests that is likely to have an undesirable influence on a student's self-concept. Another is the
inadequate interpretation of test scores, which may cause students to make unwarranted generalizations from the results. It is certainly discouraging to receive low scores on tests, and it is easy to see how students might develop a general sense of failure unless the results are properly interpreted. Low-scoring students need to be made aware that aptitude and achievement tests are limited measures and that the results can change. In addition, the possibility of overgeneralizing from low test scores will be lessened if the student's positive accomplishments and characteristics are mentioned during the interpretation. When properly interpreted and used, tests can help students develop a realistic understanding of their strengths and weaknesses and thereby contribute to improved learning and a positive self-image.

Criticism 4: Tests Create Self-Fulfilling Prophecies. This criticism has been directed primarily toward intelligence or scholastic aptitude tests. The argument is that test scores create teacher expectations concerning the achievement of individual students; the teacher then teaches in accordance with those expectations, and the students respond by achieving to their expected level: a self-fulfilling prophecy. Thus, those who are expected to achieve more do achieve
more, and those who are expected to achieve less do achieve less. The belief that teacher
expectations enhance or hinder a student's achievement is widely held, and the role of testing
in creating these expectations is certainly worthy of further research.

In summary, there is some merit in the various criticisms concerning the possible undesirable effects of tests on students; but more often than not, these criticisms should be directed at the users of the tests rather than the tests themselves. The same persons who misuse test results are likely to misuse alternative types of information that are even less accurate and objective. Thus, the solution is not to stop using tests but to start using tests and other sources of information more effectively.
Two-Dimensional Approach
Teachers traditionally have struggled with issues and concerns pertaining to education,
teaching, and learning. Here are four of the most important organizing questions:
• What is important for students to learn in the limited school and classroom time available? (the learning question)
• How does one plan and deliver instruction that will result in high levels of learning for
large numbers of students? (the instruction question)
• How does one select or design assessment instruments and procedures that provide
accurate information about how well students are learning? (the assessment question)
• How does one ensure that objectives, instruction, and assessment are consistent with
one another? (the alignment question)
Once an objective has been placed into a particular cell of the Taxonomy Table shown in Fig. 2, we can begin systematically to attack the problem of helping students achieve that objective; in this way the learning question can be answered. Different types of objectives (that is, objectives in different cells of the table) require different approaches to instruction and assessment. Similar types of objectives (that is, objectives in the same cells of the table) likely involve similar approaches to assessment. For example, to assess students' learning with respect to number systems, we could provide each student with a list of, say, six numbers,
respect to the number systems, we could provide each student with a list of, say, six numbers,
all of which are either rational or irrational numbers, and ask the student to answer questions
about the list of numbers. The numbers selected should be as different as possible from the
numbers in the textbook or discussed during class. Three example questions follow:
• To what number system, rational or irrational, do all of these numbers belong?
• How do you know that it is the type of number system you say it is?
• How could you change each number so it is an example of the other number system?
That is, if it is an irrational number, change it to a rational number, and if it is a rational
number, change it to an irrational number.
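
As a small side illustration of the first question, the sketch below uses SymPy's exact arithmetic to check whether sample numbers are rational; the numbers are assumptions (deliberately mixed here simply to show both outcomes) and not the homogeneous list such an item would actually use.

```python
# Illustrative sketch (assumed numbers): checking rationality with SymPy's
# exact types, the kind of judgement the first example question asks for.
from sympy import Rational, sqrt, pi

numbers = [Rational(3, 4), sqrt(2), pi, Rational(22, 7), sqrt(9)]

for n in numbers:
    kind = "rational" if n.is_rational else "irrational"
    print(f"{n} -> {kind}")
# Note: sqrt(9) simplifies to 3, so it is rational despite the radical notation.
```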
So, it is necessary to understand the two-dimensional framework for preparing instructional objectives and assessment procedures. The framework can be represented in a two-dimensional table that we call the Taxonomy Table.
The rows and columns of the table contain carefully delineated and defined categories of
knowledge and cognitive processes, respectively. The cells of the table are where the
knowledge and cognitive process dimensions intersect. Objectives, either explicitly or
implicitly, include both knowledge and cognitive processes that can be classified in the
Taxonomy framework. Therefore, objectives can be placed in the cells of the table. It should
be possible to place any educational objective that has a cognitive emphasis in one or more
cells of the table.
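
As a minimal sketch of this idea (an illustration with assumed content, not text from the framework itself), the snippet below represents the Taxonomy Table as a simple two-dimensional structure and places one sample objective in a cell; the six cognitive process categories are those of the revised Bloom's taxonomy, and the objective and its placement are chosen only for the example.

```python
# Minimal sketch: the Taxonomy Table as a two-dimensional structure
# (knowledge dimension x cognitive process dimension). The sample objective
# and its cell placement are illustrative assumptions.
knowledge_dimension = ["Factual", "Conceptual", "Procedural", "Metacognitive"]
cognitive_processes = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]

# Each cell of the table holds the objectives classified there.
taxonomy_table = {(k, p): [] for k in knowledge_dimension for p in cognitive_processes}

# Example: classifying numbers as rational or irrational involves Conceptual
# knowledge (classifications and categories) and the process category Understand.
taxonomy_table[("Conceptual", "Understand")].append(
    "Classify given numbers as rational or irrational"
)

for (knowledge, process), objectives in taxonomy_table.items():
    if objectives:
        print(f"{knowledge} x {process}: {objectives}")
```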
Categories of the knowledge dimension
After considering the various designations of knowledge types, especially developments in cognitive psychology that have taken place since the creation of the original framework (Bloom's Taxonomy), the revised framework settled on four general types of knowledge: Factual, Conceptual, Procedural, and Metacognitive. Table 1 summarizes these four major types of knowledge and their associated subtypes.
Factual knowledge is knowledge of discrete, isolated content elements, or "bits of information".
It includes knowledge of terminology and knowledge of specific details and elements. In
contrast, Conceptual knowledge is knowledge of "more complex, organized knowledge forms".
It includes knowledge of classifications and categories, principles and generalizations, and
theories, models, and structures. Procedural knowledge is "knowledge of how to do something" (p. 52). It includes knowledge of skills and algorithms, techniques and methods, as well as
knowledge of the criteria used to determine and/or justify "when to do what" within specific
domains and disciplines. Finally, Metacognitive knowledge is "knowledge about cognition in
general as well as awareness of and knowledge about one's own cognition". It encompasses
strategic knowledge; knowledge about cognitive tasks, including contextual and conditional
knowledge; and self-knowledge. Of course, certain aspects of metacognitive knowledge are
not the same as knowledge that is defined consensually by experts.
Factual Knowledge
Factual knowledge encompasses the basic elements that experts use in communicating
about their academic discipline, understanding it, and organizing it systematically. These
elements are usually serviceable to people who work in the discipline in the very form in which
they are presented; they need little or no alteration from one use or application to another.
Factual knowledge contains the basic elements students must know if they are to be acquainted
with the discipline or to solve any of the problems in it. The elements are usually symbols
associated with some concrete referents, or "strings of symbols" that convey important
information. For the most part, Factual knowledge exists at a relatively low level of abstraction.
Because there is a tremendous wealth of these basic elements, it is almost inconceivable that a
student could learn all of them relevant to a particular subject matter. As our knowledge
increases in engineering and technology, the sciences, and mathematics, even experts in these
fields have difficulty keeping up with all the new elements. Consequently, some selection for
educational purposes is almost always required. For classification purposes, Factual knowledge
may be distinguished from Conceptual knowledge by virtue of its very specificity; that is,
Factual knowledge can be isolated as elements or bits of information that are believed to have
some value in and of themselves. The two subtypes of Factual knowledge are knowledge of
terminology (Aa) and knowledge of specific details and elements (Ab).

Knowledge of terminology
Knowledge of terminology includes knowledge of specific verbal and nonverbal labels and symbols (e.g., words, numerals, signs, pictures). Each subject matter contains a large number of labels and symbols, both verbal and nonverbal, that have particular referents. They are the basic language of the discipline, the shorthand used by experts to express what they
know. In any attempt by experts to communicate with others about phenomena within their
discipline, they find it necessary to use the special labels and symbols they have devised. In
many cases it is impossible for experts to discuss problems in their discipline without making
use of essential terms. Quite literally, they are unable to even think about many of the
phenomena in the discipline unless they use these labels and symbols. The novice learner must
be cognizant of these labels and symbols and learn the generally accepted referents that are
attached to them. As the expert must communicate with these terms, so must those learning the
discipline have a knowledge of the terms and their referents as they attempt to comprehend or
think about the phenomena of the discipline. Here, to a greater extent than in any other category
of knowledge, experts find their own labels and symbols so useful and precise that they are
likely to want the learner to know more than the learner really needs to know or can learn. This
may be especially true in the sciences, where attempts are made to use labels and symbols with
great precision. Scientists find it difficult to express ideas or discuss particular phenomena with
the use of other symbols or with "popular" or "folk knowledge" terms more familiar to a lay
population.

Examples of knowledge of terminology


• Knowledge of the information security
• Knowledge of scientific terms in Data mining
• Knowledge of the electrical power meter
• Knowledge of android operating system
• Knowledge of the standard representational symbols on maps and charts
• Knowledge of the symbols used to indicate the correct pronunciation of words

Knowledge of specific details and elements


Knowledge of specific details and elements refers to knowledge of events, locations,
people, dates, sources of information, and the like. It may include very precise and specific
information, such as the exact date of an event or the exact magnitude of a phenomenon. It
may also include approximate information, such as a time period in which an event occurred
or the general order of magnitude of a phenomenon. Specific facts are those that can be isolated
as separate, discrete elements in contrast to those that can be known only in a larger context.
Every subject matter contains some events, locations, people, dates, and other details that
experts know and believe to represent important knowledge about the field. Such specific facts
are basic information that experts use in describing their field and in thinking about specific
problems or topics in the field. These facts can be distinguished from terminology, in that
terminology generally represents the conventions or agreements within a field (i.e., a common
language), whereas facts represent findings arrived at by means other than consensual
agreements made for purposes of communication. Subtype Ab also includes knowledge about
particular books, writings, and other sources of information on specific topics and problems.
Thus, knowledge of a specific fact and knowledge of the sources of the fact are classified in
this subtype.
Again, the tremendous number of specific facts forces educators (e.g., curriculum specialists,
textbook authors, teachers) to make choices about what is basic and what is of secondary
importance or of importance primarily to the expert. Educators must also consider the level of
precision with which different facts must be known. Frequently educators may be content to have a student learn only the approximate magnitude of the phenomenon rather than its precise quantity, or to learn an approximate time period rather than the precise date or time of a specific event. Educators have considerable difficulty determining whether many of the specific facts are such that students should learn them as part of an educational unit or course,
or they can be left to be acquired whenever they really need them.
Examples of knowledge of specific details and elements
• Knowledge of major facts about a particular version of an engine
• Knowledge of practical facts important to health, citizenship, and other human needs
and concerns
• Knowledge of the more significant developments of the automobile sector
• Knowledge of the reputation of a given author for presenting and interpreting facts on
Online application problems
• Knowledge of major products and exports of countries
• Knowledge of reliable sources of information for wise purchasing
CONCEPTUAL KNOWLEDGE
Conceptual knowledge includes knowledge of categories and classifications and the
relationships between and among them-more complex, organized knowledge forms.
Conceptual knowledge includes schemas, mental models, or implicit or explicit theories in
different cognitive psychological models. These schemas, models, and theories represent the
knowledge an individual has about how a particular subject matter is organized and structured,
how the different parts or bits of information are interconnected and interrelated in a more
systematic manner, and how these parts function together. For example, a mental model for
why the seasons occur may include ideas about the earth, the sun, the rotation of the earth
around the sun, and the tilt of the earth toward the sun at different times during the year. These
are not just simple, isolated facts about the earth and sun but rather ideas about the relationships
between them and how they are linked to the seasonal changes. This type of conceptual
knowledge might be one aspect of what is termed "disciplinary knowledge," or the way experts in the discipline think about a phenomenon, in this case the scientific explanation for the
occurrence of the seasons.
Conceptual knowledge includes three subtypes: knowledge of classifications and
categories, knowledge of principles and generalizations, and knowledge of theories, models,
and structures. Classifications and categories form the basis for principles and generalizations.
These, in turn, form the basis for theories, models, and structures. The three subtypes should
capture a great deal of the knowledge that is generated within all the different disciplines.

Knowledge of classifications and categories


The subtype knowledge of classifications and categories includes the specific categories,
classes, divisions, and arrangements that are used in different subject matters. As a subject
matter develops, individuals who work on it find it advantageous to develop classifications and
categories that they can use to structure and systematize the phenomena. This type of
knowledge is somewhat more general and often more abstract than the knowledge of
terminology and specific facts. Each subject matter has a set of categories that are used to
discover new elements as well as to deal with them once they are discovered. Classifications
and categories differ from terminology and facts in that they form the connecting links between
and among specific elements.
When one is writing or analyzing a story, for example, the major categories include
plot, character, and setting. Note that plot as a category is substantially different from the plot
of this story. When the concern is plot as a category, the key question is What makes a plot a
plot? The category "plot" is defined by what all specific plots have in common. In contrast,
when the concern is the plot of a particular story, the key question is What is the plot of this
story? The latter is Factual knowledge: knowledge of specific details and elements.
Sometimes it is difficult to distinguish knowledge of classifications and categories from Factual
knowledge. To complicate matters further, basic classifications and categories can be placed
into larger, more comprehensive classifications and categories. In mathematics, for example,
whole numbers, integers, and fractions can be placed into the category rational numbers. Each
larger category moves us away from the concrete specifics and into the realm of the abstract.
For the purposes of our Taxonomy, several characteristics are useful in distinguishing
the subtypes of knowledge. Classifications and categories are largely the result of agreement
and convenience, whereas knowledge of specific details stems more directly from observation,
experimentation, and discovery. Knowledge of classifications and categories is commonly a
reflection of how experts in the field think and attack problems, whereas knowledge of which
specific details become important is derived from the results of such thought and problem
solving.
Knowledge of classifications and categories is an important aspect of developing
expertise in an academic discipline. Proper classification of information and experience into
appropriate categories is a classic sign of learning and development. Moreover, recent
cognitive research on conceptual change and understanding suggests that student learning can
be constrained by misclassification of information into inappropriate categories. For example,
students may have difficulty understanding basic science concepts such as heat, light, force,
and electricity when they classify these concepts as material substances rather than as
processes. Once concepts are classified as substances or objects, students invoke a whole range
of characteristics and properties of "objects." As a result, students try to apply these object-like
characteristics to what are better described in scientific terms as processes. The naive
categorization of these concepts as substances does not match the more scientifically accurate
categorization of them as processes. The categorization of heat, light, force, and electricity as
substances becomes the basis for an implicit theory of how these processes are supposed to
operate and leads to systematic misconceptions about the nature of the processes. This implicit
theory, in turn, makes it difficult for students to develop the appropriate scientific
understanding. Accordingly, learning the appropriate classification and category system can
reflect a "conceptual change" and result in a more appropriate understanding of the concepts
than just learning their definitions (as would be the case in the Factual knowledge category).
For several reasons, it seems likely that students will have greater difficulty learning
knowledge of classifications and categories than Factual knowledge. First, many of the
classifications and categories students encounter represent relatively arbitrary and even
artificial forms of knowledge that are meaningful only to experts who recognize their value as
tools and techniques in their work. Second, students may be able to operate in their daily life
without knowing the appropriate subject matter classifications and categories to the level of
precision expected by experts in the field. Third, knowledge of classifications and categories
requires that students make connections among specific content elements (i.e., terminology and
facts). Finally, as classifications and categories are combined to form larger classifications and
categories, learning becomes more abstract. Nevertheless, the student is expected to know these
classifications and categories and to know when they are appropriate or useful in dealing with
subject matter content. As the student begins to work with a subject matter within an academic
discipline and learns how to use the tools, the value of these classifications and categories
becomes apparent.

Examples of knowledge of classifications and categories


• Knowledge of the variety of building structures
• Knowledge of the various forms of inheritance in Object Oriented Programming (a brief illustrative sketch follows this list)
• Knowledge of the parts of sentences (e.g., nouns, verbs, adjectives)
• Knowledge of different kinds of challenges in risk management
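A minimal sketch of the inheritance item above, assuming Python and with all class names (Vehicle, Car, Electric, ElectricCar) invented purely for illustration; it shows two of the forms such a classification distinguishes, single inheritance and multiple inheritance. Knowing that these forms exist and how they are categorized is Conceptual knowledge; writing the code itself is a separate matter.

class Vehicle:                        # base class
    def describe(self):
        return "a vehicle"

class Car(Vehicle):                   # single inheritance: Car is-a Vehicle
    def describe(self):
        return "a car, " + super().describe()

class Electric:                       # an independent second base class
    def power(self):
        return "battery powered"

class ElectricCar(Car, Electric):     # multiple inheritance: two parent classes
    pass

print(ElectricCar().describe())       # -> a car, a vehicle
print(ElectricCar().power())          # -> battery powered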

Knowledge of principles and generalizations


As mentioned earlier, principles and generalizations are composed of classifications
and categories. Principles and generalizations tend to dominate an academic discipline and are
used to study phenomena or solve problems in the discipline. One of the hallmarks of a subject
matter expert is the ability to recognize meaningful patterns (e.g., generalizations) and activate
the relevant knowledge of these patterns with little cognitive effort (Bransford, Brown, and
Cocking, 1999).
The subtype Knowledge of principles and generalizations includes knowledge of
particular abstractions that summarize observations of phenomena. These abstractions have the
greatest value in describing, predicting, explaining, or determining the most appropriate and
relevant action or direction to be taken. Principles and generalizations bring together large
numbers of specific facts and events, describe the processes and interrelationships among these
specific details (thus forming classifications and categories), and, furthermore, describe the
processes and interrelationships among the classifications and categories. In this way, they
enable the expert to begin to organize the whole in a parsimonious and coherent manner.
Principles and generalizations tend to be broad ideas that may be difficult for students to
understand because students may not be thoroughly acquainted with the phenomena they are
intended to summarize and organize. If students do get to know the principles and
generalizations, however, they have a means for relating and organizing a great deal of subject
matter. As a result, they should have more insight into the subject matter as well as better
memory of it.

Examples of knowledge of principles and generalizations


• Knowledge of major generalizations about particular cultures
• Knowledge of the fundamental laws of physics
• Knowledge of the principles of chemistry that are relevant to life processes and health
• Knowledge of the major principles involved in learning
• Knowledge of the principles of compilers
• Knowledge of the principles that govern rudimentary arithmetic operations (e.g., the
commutative principle, the associative principle)

Knowledge of theories, models, and structures


The subtype Knowledge of theories, models, and structures includes knowledge of
principles and generalizations together with their interrelationships that present a clear,
rounded, and systemic view of a complex phenomenon, problem, or subject matter. These are
the most abstract formulations. They can show the interrelationships and organization of a great
range of specific details, classifications and categories, and principles and generalizations. This
subtype differs from Knowledge of principles and generalizations in its emphasis on a set of
principles and generalizations related in some way to form a theory, model, or structure; the
principles and generalizations in that subtype do not need to be related in any meaningful way.
This subtype also includes knowledge of the different paradigms, epistemologies, theories, and
models that different disciplines use to describe, understand, explain, and predict phenomena.
Disciplines have different paradigms and epistemologies for structuring inquiry, and students
should come to know these different ways of conceptualizing and organizing subject matter
and areas of research within the subject matter. In biology, for example, knowledge of the
theory of evolution and how to think in evolutionary terms to explain different biological
phenomena is an important aspect of this subtype of Conceptual knowledge. Similarly,
behavioural, cognitive, and social constructivist theories in psychology make different
epistemological assumptions and reflect different perspectives on human behaviour. An expert
in a discipline knows not only the different disciplinary theories, models, and structures but
also their relative strengths and weaknesses and can think "within" one of them as well as
"outside" any of them.

Examples of knowledge of theories, models, and structures


• Knowledge of the interrelationships among chemical principles as the basis for
chemical theories
• Knowledge of the overall structure of the top-down approach
• Knowledge of a relatively complete formulation of the theory of evolution
• Knowledge of the theory of plate tectonics
• Knowledge of genetic models (e.g., DNA)
PROCEDURAL KNOWLEDGE
Procedural knowledge is the "knowledge of how" to do something. The "something"
might range from completing fairly routine exercises to solving novel problems. Procedural
knowledge often takes the form of a series or sequence of steps to be followed. It includes
knowledge of skills, algorithms, techniques, and methods, collectively known as procedures.
Procedural knowledge also includes knowledge of the criteria used to determine when to use
various procedures. In fact, as Bransford, Brown, and Cocking (1999) noted, not only do
experts have a great deal of knowledge about their subject matter, but their knowledge is
"conditionalized" so that they know when and where to use it. Whereas Factual knowledge and
Conceptual knowledge represent the "what" of knowledge, procedural knowledge concerns the
"how." In other words, Procedural knowledge reflects knowledge of different "processes,"
whereas Factual knowledge and Conceptual knowledge deal with what might be termed
"products." It is important to note that Procedural knowledge represents only the knowledge of
these procedures.
In contrast to Meta-cognitive knowledge (which includes knowledge of more general
strategies that cut across subject matters or academic disciplines), Procedural knowledge is
specific or germane to particular subject matters or academic disciplines. Accordingly, we
reserve the term Procedural knowledge for the knowledge of skills, algorithms, techniques, and
methods that are subject specific or discipline specific. In mathematics, for example, there are
algorithms for performing long division, solving quadratic equations, and establishing the
congruence of triangles. In science, there are general methods for designing and performing
experiments. In social studies, there are procedures for reading maps, estimating the age of
physical artifacts, and collecting historical data. In language arts, there are procedures for
spelling words in English and for generating grammatically correct sentences. Because of the
subject-specific nature of these procedures, knowledge of them also reflects specific
disciplinary knowledge or specific disciplinary ways of thinking in contrast to general
strategies for problem solving that can be applied across many disciplines.

Knowledge of subject-specific skills and algorithms


As we mentioned, Procedural knowledge can be expressed as a series or sequence of
steps, collectively known as a procedure. Sometimes the steps are followed in a fixed order; at
other times decisions must be made about which step to perform next. Similarly, sometimes
the end result is fixed (e.g., there is a single pre-specified answer); in other cases it is not.
Although the process may be either fixed or more open, the end result is generally considered
fixed in this subtype of knowledge. A common example is knowledge of algorithms used
with mathematics exercises. The procedure for multiplying fractions in arithmetic, when
applied, generally results in a fixed answer (barring computational mistakes, of course).
Although the concern here is with Procedural knowledge, the result of using Procedural
knowledge is often Factual knowledge or Conceptual knowledge. For example, the algorithm
for the addition of whole numbers that we use to add 2 and 2 is Procedural knowledge; the
answer 4 is simply Factual knowledge. Once again, the emphasis here is on the student's
knowledge of the procedure rather than on his or her ability to use it.
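As a concrete illustration of this distinction, the following minimal Python sketch (the function name is invented for illustration) encodes the fixed-result procedure for multiplying two fractions. The steps of the procedure are Procedural knowledge; any particular answer it produces, such as 1/2 x 2/3 = 1/3, is simply Factual knowledge.

from math import gcd

def multiply_fractions(a_num, a_den, b_num, b_den):
    # Step 1: multiply the numerators and multiply the denominators.
    num = a_num * b_num
    den = a_den * b_den
    # Step 2: reduce the result to lowest terms.
    common = gcd(num, den)
    return num // common, den // common

print(multiply_fractions(1, 2, 2, 3))   # -> (1, 3), that is, 1/3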

Examples of knowledge of subject-specific skills and algorithms


• Knowledge of the skills used in painting with watercolors
• Knowledge of the skills used to determine word meaning based on structural analysis
in NLP
• Knowledge of the various algorithms for solving quadratic equations (see the sketch after this list)
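As a brief illustration of the last item, the sketch below (function names invented, and assuming real roots) outlines two of the algorithms a student might know for solving a quadratic equation ax² + bx + c = 0: the quadratic formula and completing the square. Knowing that these algorithms exist and what their steps are is the subject-specific Procedural knowledge at issue here.

import math

def solve_by_formula(a, b, c):
    # Quadratic formula: x = (-b ± sqrt(b² - 4ac)) / 2a
    d = b * b - 4 * a * c            # discriminant; assumed non-negative here
    return (-b + math.sqrt(d)) / (2 * a), (-b - math.sqrt(d)) / (2 * a)

def solve_by_completing_the_square(a, b, c):
    # Rewrite ax² + bx + c = 0 as (x + b/2a)² = (b² - 4ac) / 4a²
    h = b / (2 * a)
    k = (b * b - 4 * a * c) / (4 * a * a)
    return -h + math.sqrt(k), -h - math.sqrt(k)

print(solve_by_formula(1, 2, -3))                 # -> (1.0, -3.0)
print(solve_by_completing_the_square(1, 2, -3))   # -> (1.0, -3.0)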

Knowledge of subject-specific techniques and methods


In contrast with specific skills and algorithms that usually end in a fixed result, some
procedures do not lead to a single predetermined answer or solution. We can follow the general
scientific method in a somewhat sequential manner to design a study, for example, but the
resulting experimental design can vary greatly depending on a host of factors. In this subtype
of Procedural knowledge, Knowledge of subject-specific techniques and methods, the result is
more open and not fixed, in contrast to the subtype Knowledge of subject-specific skills and
algorithms.
Knowledge of subject-specific techniques and methods includes knowledge that is
largely the result of consensus, agreement, or disciplinary norms rather than knowledge that is
more directly an outcome of observation, experimentation, or discovery. This subtype of
knowledge generally reflects how experts in the field or discipline think and attack problems
rather than the results of such thought or problem solving. For example, knowledge of the
general scientific method and how to apply it to different situations, including social situations
and policy problems, reflects a "scientific" way of thinking. Another example is the
"mathematization" of problems not originally presented as mathematics problems. For
example, the simple problem of choosing a checkout line in a grocery store can be made into a
mathematical problem that draws on mathematical knowledge and procedures (e.g., number of
people in each line, number of items per person).
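A minimal sketch of such a mathematization, assuming Python and with the timing constants invented only to make the example concrete:

SECONDS_PER_ITEM = 3                  # assumed scanning time per item
SECONDS_PER_PERSON = 20               # assumed fixed overhead per customer

def estimated_wait(items_per_person):
    # items_per_person: one entry per customer already in the line
    return sum(SECONDS_PER_PERSON + SECONDS_PER_ITEM * n for n in items_per_person)

line_a = [30, 5]                      # two customers with 30 and 5 items
line_b = [8, 7, 6]                    # three customers with small baskets
print("Line A:", estimated_wait(line_a), "seconds")   # -> 145
print("Line B:", estimated_wait(line_b), "seconds")   # -> 123

The point is not the particular numbers but the technique of recasting an everyday decision as a small mathematical model.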
Examples of knowledge of subject-specific techniques and methods
• Knowledge of research methods relevant to the social sciences
• Knowledge of the techniques used by scientists in seeking solutions to problems

Knowledge of criteria for determining when to use appropriate procedures


In addition to knowing subject-specific procedures, students are expected to know when
to use them, which often involves knowing the ways they have been used in the past. Such
knowledge is nearly always of a historical or encyclopaedic type. Though simpler and perhaps
less functional than the ability to actually use the procedures, knowledge of when to use
appropriate procedures is an important prelude to their proper use. Thus, before engaging in an
inquiry, students may be expected to know the methods and techniques that have been used in
similar inquiries. At a later stage in the inquiry, they may be expected to show relationships
between the methods and techniques they actually employed and the methods employed by
others.
Here again is a systematization that is used by subject matter experts as they attack
problems in their field. Experts know when and where to apply their knowledge. They have
criteria that help them make decisions about when and where to use different types of subject-
specific procedural knowledge; that is, their knowledge is "conditionalized," in that they know
the conditions under which the procedures are to be applied (Chi, Feltovich, and Glaser, 1981).
For example, in solving a physics problem, an expert can recognize the type of physics problem
and apply the appropriate procedure (e.g., a problem that involves Newton's second law, F =
ma). Students therefore may be expected to make use of the criteria as well as have knowledge
of them.
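A toy Python sketch of what such "conditionalized" knowledge might look like if written out explicitly; the selection rules and quantities are invented for illustration, with the criteria (which quantities are given) determining which procedure is applied.

def choose_and_apply(problem):
    # problem: a dict of known quantities; its keys act as the selection criteria
    if {"mass", "acceleration"} <= problem.keys():
        return "Newton's second law", problem["mass"] * problem["acceleration"]
    if {"voltage", "resistance"} <= problem.keys():
        return "Ohm's law", problem["voltage"] / problem["resistance"]
    return "no known procedure applies", None

print(choose_and_apply({"mass": 2.0, "acceleration": 9.8}))    # force = 19.6
print(choose_and_apply({"voltage": 12.0, "resistance": 4.0}))  # current = 3.0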

Examples of knowledge of criteria for determining when to use appropriate procedures


• Knowledge of the criteria for determining which of several types of essays to write
(e.g., expository, persuasive)
• Knowledge of the criteria for determining which method to use in solving algebraic
equations
• Knowledge of the criteria for determining which statistical procedure to use with data
collected in a particular experiment
• Knowledge of the criteria for determining which technique to apply to create a desired effect
in a particular watercolour painting
META-COGNITIVE KNOWLEDGE
Meta-cognitive knowledge is knowledge about cognition in general as well as
awareness of and knowledge about one's own cognition. One of the hallmarks of theory and
research on learning since the publication of the original Handbook is the emphasis on making
students more aware of and responsible for their own knowledge and thought. Regardless of
their theoretical perspective, researchers generally agree that with development students will
become more aware of their own thinking as well as more knowledgeable about cognition in
general, and as they act on this awareness they will tend to learn better (Bransford, Brown, and
Cocking, 1999). The labels for this general developmental trend vary from theory to theory but
include meta-cognitive knowledge, meta-cognitive awareness, self-awareness, self-reflection,
and self-regulation.
Recognizing this distinction, in this chapter we describe only students' knowledge of
various aspects of cognition, not the actual monitoring, control, and regulation of their
cognition. In Flavell's (1979) classic article on meta-cognition, he suggested that metacognition
included knowledge of strategy, task, and person variables. We have represented this general
framework in our categories by including students' knowledge of general strategies for learning
and thinking (strategic knowledge) and their knowledge of cognitive tasks as well as when and
why to use these different strategies (knowledge about cognitive tasks). Finally, we include
knowledge about the self (the person variable) in relation to both cognitive and motivational
components of performance (self-knowledge).

Strategic knowledge
Strategic knowledge is knowledge of the general strategies for learning, thinking, and
problem solving. The strategies in this subtype can be used across many different tasks and
subject matters, rather than being most useful for one particular type of task in one specific
subject area (e.g., solving a quadratic equation or applying Ohm's law).
This subtype, Strategic knowledge, includes knowledge of the variety of strategies that
students might use to memorize material, extract meaning from text, or comprehend what they
hear in classrooms or read in books and other course materials. The large number of different
learning strategies can be grouped into three general categories: rehearsal, elaboration, and
organizational (Weinstein and Mayer, 1986). Rehearsal strategies involve repeating words or
terms to be recalled over and over to oneself; they are generally not the most effective strategies
for deeper levels of learning and comprehension. In contrast, elaboration strategies include the
use of various mnemonics for memory tasks as well as techniques such as summarizing,
paraphrasing, and selecting the main idea from texts. Elaboration strategies foster deeper
processing of the material to be learned and result in better comprehension and learning than
do rehearsal strategies. Organizational strategies include various forms of outlining, drawing
"cognitive maps" or concept mapping, and note taking; students transform the material from
one form to another. Organizational strategies usually result in better comprehension and
learning than do rehearsal strategies.
In addition to these general learning strategies, students can have knowledge of various
meta-cognitive strategies that are useful in planning, monitoring, and regulating their cognition.
Students can eventually use these strategies to plan their cognition (e.g., set sub-goals), monitor
their cognition (e.g., ask themselves questions as they read a piece of text, check their answer
to a math problem), and regulate their cognition (e.g., re-read something they don't understand,
go back and "repair" their calculation mistake in a math problem). Again, in this category we
refer to students' knowledge of these various strategies, not their actual use. Finally, this
subtype, Strategic knowledge, includes general strategies for problem solving and thinking
(Baron, 1994; Nickerson, Perkins, and Smith, 1985; Sternberg, 1985). These strategies
represent the various general heuristics students can use to solve problems, particularly ill-
defined problems that have no definitive solution method. Examples of heuristics are means-
ends analysis and working backward from the desired goal state. In addition to problem-solving
strategies, there are general strategies for deductive and inductive thinking, including
evaluating the validity of different logical statements, avoiding circularity in arguments,
making appropriate inferences from different sources of data, and drawing on appropriate
samples to make inferences (i.e., avoiding the availability heuristic, that is, making decisions from
convenient instead of representative samples).

Examples of strategic knowledge


• Knowledge that rehearsal of information is one way to retain the information
• Knowledge of various mnemonic strategies for memory
• Knowledge of various elaboration strategies such as paraphrasing and summarizing
• Knowledge of various organizational strategies such as outlining or diagramming.
• Knowledge of planning strategies such as setting goals for reading
• Knowledge of comprehension-monitoring strategies such as self-testing or self-
questioning
• Knowledge of means-ends analysis as a heuristic for solving an ill-defined problem
• Knowledge of the availability heuristic and the problems of failing to sample in an
unbiased manner

Knowledge about cognitive tasks, including contextual and conditional knowledge


In addition to knowledge about various strategies, individuals accumulate knowledge
about cognitive tasks. In his traditional division of Meta-cognitive knowledge, Flavell (1979)
included knowledge that different cognitive tasks can be more or less difficult, may make
differential demands on the cognitive system, and may require different cognitive strategies.
For example, a recall task is more difficult than a recognition task. The recall task requires the
person to search memory actively and retrieve the relevant information, whereas the
recognition task requires only that the person discriminate among alternatives and select the
correct or most appropriate answer.
As students develop knowledge of different learning and thinking strategies, this knowledge
reflects both what general strategies to use and how to use them. As with Procedural
knowledge, however, this knowledge may not be sufficient for expertise in learning. Students
also need to develop the conditional knowledge for these general cognitive strategies; in other
words, they need to develop some knowledge about the when and why of using these strategies
appropriately (Paris, Lipson, and Wixson, 1983). All these different strategies may not be
appropriate for all situations, and the learner must develop some knowledge of the different
conditions and tasks for which the different strategies are most appropriate. Conditional
knowledge refers to knowledge of the situations in which students may use Meta-cognitive
knowledge. In contrast, Procedural knowledge refers to knowledge of the situations in which
students may use subject-specific skills, algorithms, techniques, and methods.
If one thinks of strategies as cognitive "tools" that help students construct
understanding, then different cognitive tasks require different tools, just as a carpenter uses
different tools for performing all the tasks that go into building a house. Of course, one tool,
such as a hammer, can be used in many different ways for different tasks, but this is not
necessarily the most adaptive use of a hammer, particularly if other tools are better suited to
some of the tasks. In the same way, certain general learning and thinking strategies are better
suited to different tasks. For example, if one confronts a novel problem that is ill defined, then
general problem-solving heuristics may be useful. In contrast, if one confronts a physics
problem about the second law of thermodynamics, then more specific Procedural knowledge
is more useful and adaptive. An important aspect of learning about strategies is the conditional
knowledge of when and why to use them appropriately.
Another important aspect of conditional knowledge is the local situational and general social,
conventional, and cultural norms for using different strategies. For example, a teacher may
encourage the use of a certain strategy for monitoring reading comprehension. A student who
knows that strategy is better able to meet the demands of this teacher's classroom. In the same
manner, different cultures and subcultures may have norms for the use of different strategies
and ways of thinking about problems. Again, knowing these norms can help students adapt to
the demands of the culture in terms of solving the problem. For example, the strategies used in
a classroom learning situation may not be the most appropriate ones to use in a work setting.
Knowledge of the different situations and the cultural norms regarding the use of different
strategies within those situations is an important aspect of Meta-cognitive knowledge.

Examples of knowledge about cognitive tasks, including contextual and conditional knowledge
• Knowledge that recall tasks (i.e., short-answer items) generally make more demands on
the individual's memory system than recognition tasks (i.e., multiple-choice items)
• Knowledge that a primary source book may be more difficult to understand than a general textbook or popular book
• Knowledge that a simple memorization task (e.g., remembering a phone number) may
require only rehearsal
• Knowledge that elaboration strategies like summarizing and paraphrasing can result in
deeper levels of comprehension
• Knowledge that general problem-solving heuristics may be most useful when the
individual lacks relevant subject- or task-specific knowledge or in the absence of
specific Procedural knowledge
• Knowledge of the local and general social, conventional, and cultural norms for how,
when, and why to use different strategies

Self knowledge
Along with knowledge of different strategies and cognitive tasks, Flavell (1979)
proposed that self-knowledge was an important component of meta-cognition. In his model
self-knowledge includes knowledge of one's strengths and weaknesses in relation to cognition
and learning. For example, students who know they generally do better on multiple-choice tests
than on essay tests have some self-knowledge about their test-taking skills. This knowledge
may be useful to students as they study for the two different types of tests. In addition, one
hallmark of experts is that they know when they do not know something and they then have some
general strategies for finding the needed and appropriate information. Self-awareness of the
breadth and depth of one's own knowledge base is an important aspect of self-knowledge.
Finally, students need to be aware of the different types of general strategies they are likely to
rely on in different situations. An awareness that one tends to over-rely on a particular strategy,
when there may be other more adaptive strategies for the task, could lead to a change in strategy
use.
In addition to knowledge of one's general cognition, individuals have beliefs about their
motivation. Motivation is a complicated and confusing area, with many models and theories
available. Although motivational beliefs are usually not considered in cognitive models, a fairly
substantial body of literature is emerging that shows important links between students'
motivational beliefs and their cognition and learning.
A consensus has emerged, however, around general social cognitive models of motivation that
propose three sets of motivational beliefs (Pintrich and Schunk, 1996). Because these beliefs
are social cognitive in nature, they fit into a taxonomy of knowledge. The first set consists of
self-efficacy beliefs, that is, students' judgments of their capability to accomplish a specific
task. The second set includes beliefs about the goals or reasons students have for pursuing a
specific task (e.g., learning vs. getting a good grade). The third set contains value and interest
beliefs, which represent students' perceptions of their personal interest (liking) for a task as
well as their judgments of how important and useful the task is to them. Just as students need
to develop self-knowledge and awareness about their own knowledge and cognition, they also
need to develop self-knowledge and awareness about their own motivation. Again, awareness
of these different motivational beliefs may enable learners to monitor and regulate their
behaviour in learning situations in a more adaptive manner.
Self-knowledge is an important aspect of Meta-cognitive knowledge, but the accuracy
of self-knowledge seems to be most crucial for learning. We are not advocating that teachers
try to boost students' "self-esteem" (a completely different construct from self-knowledge) by
providing students with positive but false, inaccurate, and misleading feedback about their
academic strengths and weaknesses. It is much more important for students to have accurate
perceptions and judgments of their knowledge base and expertise than to have inflated and
inaccurate self-knowledge. If students are not aware they do not know some aspect of Factual
knowledge or Conceptual knowledge or that they don't know how to do something (Procedural
knowledge), it is unlikely they will make any effort to learn the new material. A hallmark of
experts is that they know what they know and what they do not know, and they do not have
inflated or false impressions of their actual knowledge and abilities. Accordingly, we
emphasize the need for teachers to help students make accurate assessments of their self-
knowledge and not attempt to inflate students' academic self-esteem.

Examples of self-knowledge
• Knowledge that one is knowledgeable in some areas but not in others
• Knowledge that one tends to rely on one type of "cognitive tool" (strategy) in certain situations
• Accurate (rather than inflated or overconfident) knowledge of one's capabilities to perform a particular task
• Knowledge of one's goals for performing a task
• Knowledge of one's judgments about the relative utility value of a task
CATEGORIES OF THE COGNITIVE PROCESS DIMENSION
Let us define the cognitive processes within each of the six categories in detail, making
comparisons with other cognitive processes, where appropriate. In addition, we provide sample
educational objectives and assessments in various subject areas, as well as alternative versions
of assessment tasks. Each illustrative objective in the following material should be read as
though preceded by the phrase "The student is able to ... " or "The student learns to...."
Remember
When the objective of instruction is to promote retention of the presented material in
much the same form as it was taught, the relevant process category is Remember.
Remembering involves retrieving relevant knowledge from long-term memory. The two
associated cognitive processes are recognizing and recalling. The relevant knowledge may be
Factual, Conceptual, Procedural, or Meta-cognitive, or some combination of these. To assess
student learning in the simplest process category, the student is given a recognition or recall
task under conditions very similar to those in which he or she learned the material. Little, if
any, extension beyond those conditions is expected. If, for example, a student learned the
English equivalents of 20 Spanish words, then a test of remembering could involve requesting
the student to match the Spanish words in one list with their English equivalents in a second
list (i.e., recognize) or to write the corresponding English word next to each of the Spanish
words presented in the list (i.e., recall).
Remembering knowledge is essential for meaningful learning and problem solving as
that knowledge is used in more complex tasks. For example, knowledge of the correct spelling
of common English words appropriate to a given grade level is necessary if the student is to
master writing an essay. Where teachers concentrate solely on rote learning, teaching and
assessing focus solely on remembering elements or fragments of knowledge, often in isolation
from their context. When teachers focus on meaningful learning, however, remembering
knowledge is integrated within the larger task of constructing new
knowledge or solving new problems.

Recognizing
Recognizing involves retrieving relevant knowledge from long-term memory in order
to compare it with presented information. In recognizing, the student searches long-term
memory for a piece of information that is identical or extremely similar to the presented
information (as represented in working memory). When presented with new information, the
student determines whether that information corresponds to previously learned knowledge,
searching for a match. An alternative term for recognizing is identifying.

Sample objectives and corresponding assessments: In general studies, an objective could be for
students to recognize the correct dates of important events in Indian history. A corresponding
test item is: "True or false: India attained independence on August 15, 1947." In
literature, an objective could be to recognize authors of Indian literary works. A corresponding
assessment is a matching test that contains a list of ten authors and a list of slightly more than
ten novels. In mathematics, an objective could be to recognize the numbers of sides in basic
geometric shapes. A corresponding assessment is a multiple choice test with items such as the
following: "How many sides does a pentagon have? (a) four, (b) five, (c) six, (d) seven."

Assessment formats: As illustrated in the preceding paragraph, three main methods of
presenting a recognition task for the purpose of assessment are verification, matching, and
forced choice. In verification tasks, the student is given some information and must choose
whether or not it is correct. The true-false format is the most common example. In matching,
two lists are presented, and the student must choose how each item in one list corresponds to
an item in the other list. In forced choice tasks, the student is given a prompt along with several
possible answers and must choose which answer is the correct or "best answer." Multiple-
choice is the most common format.

Recalling
Recalling involves retrieving relevant knowledge from long-term memory when given
a prompt to do so. The prompt is often a question. In recalling, a student searches long-term
memory for a piece of information and brings that piece of information to working memory
where it can be processed. An alternative term for recalling is retrieving.

Sample objectives and corresponding assessments: In recalling, a student remembers
previously learned information when given a prompt. In Computer Science and Engineering,
an objective could be to recall the functions of a computer's motherboard. A corresponding test
item is "What is the function of the motherboard?" Another objective could be to recall the
different components interfaced with the motherboard. A corresponding test question is "Is
memory directly interfaced with the motherboard?" In mathematics, an objective could be to recall
the whole-number multiplication facts. A corresponding test item asks students to multiply 7
X 8.

Assessment formats: Assessment tasks for recalling can vary in the number and quality of cues
that students are provided. With low cueing, the student is not given any hints or related
information (such as "What is a meter?"). With high cueing, the student is given several hints
(such as "In the metric system, a meter is a measure of .").
Assessment tasks for recalling can also vary in the amount of embedding, or the extent to which
the items are placed within a larger meaningful context. With low embedding, the recall task
is presented as a single, isolated event, as in the preceding examples. With high embedding,
the recall task is included within the context of a larger problem, such as asking a student to
recall the formula for the area of a circle when solving a word problem that requires that
formula.
Understand
As we indicated, when the primary goal of instruction is to promote retention, the focus
is on objectives that emphasize Remember. When the goal of instruction is to promote transfer,
however, the focus shifts to the other five cognitive processes, Understand through Create. Of
these, arguably the largest category of transfer-based educational objectives emphasized in
institutions is Understand. Students are said to Understand when they are able to construct
meaning from instructional messages, including oral, written, and graphic communications,
however they are presented to students: during lectures, in books, or on computer monitors.
Examples of potential instructional messages include an in-class physics demonstration, a
geological formation seen on a field trip, a computer simulation of a trip through an art
museum, and a musical work played by an orchestra, as well as numerous verbal, pictorial, and
symbolic representations on paper.
Students understand when they build connections between the "new" knowledge to be
gained and their prior knowledge. More specifically, the incoming knowledge is integrated
with existing schemas and cognitive frameworks. Since concepts are the building blocks for
these schemas and frameworks, Conceptual knowledge provides a basis for understanding.
Cognitive processes in the category of Understand include interpreting, exemplifying,
classifying, summarizing, inferring, comparing, and explaining.

Interpreting
Interpreting occurs when a student is able to convert information from one
representational form to another. Interpreting may involve converting words to words (e.g.,
paraphrasing), pictures to words, words to pictures, numbers to words, words to numbers,
musical notes to tones, and the like. Alternative terms are translating, paraphrasing,
representing, and clarifying.
Sample objectives and corresponding assessments: In interpreting, when given information in
one form of representation, a student is able to change it into another form. For example, an
objective could be to learn to paraphrase the important functions of compiler phases. A
corresponding assessment asks a student to describe, in his or her own words, what each
compiler phase does. In science, an objective
could be to learn to draw pictorial representations of various natural phenomena. A
corresponding assessment item asks a student to draw a series of diagrams illustrating
photosynthesis. In mathematics, a sample objective could be to learn to translate number
sentences expressed in words into algebraic equations expressed in symbols. A corresponding
assessment item asks a student to write an equation that corresponds to the statement "There
are twice as many boys as girls in this class."

Assessment formats: Appropriate test item formats include both constructed response (i.e.,
supply an answer) and selected response (i.e., choose an answer). Information is
presented in one form, and students are asked either to construct or to select the same
information in a different form. For example, a constructed response task is: "Write an
equation that corresponds to the following statement, using T for total cost and B for
number of bundles. The total cost of mailing a package is Rs. 2.00 for the first bundle
plus Rs.1.50 for each additional bundle." A selection version of this task is: "Which
equation corresponds to the following statement, where T stands for total cost and B for
number of bundles?
The total cost of mailing a package is Rs. 2.00 for the first bundle plus Rs.1.50 for each
additional bundle (a) T = Rs.3.50 + B, (b) T = Rs. 2.00 + Rs. 1.50(B), (c) T = Rs. 2.00 +
Rs.1.50(B-1)."
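For reference, a quick arithmetic check (not part of the assessment itself, and using an invented helper function) of why option (c) matches the statement:

def total_cost(bundles):
    # Rs. 2.00 for the first bundle plus Rs. 1.50 for each additional bundle
    return 2.00 + 1.50 * (bundles - 1)

for b in (1, 2, 3):
    print(b, "bundle(s): Rs.", total_cost(b))   # -> 2.0, 3.5, 5.0, matching option (c)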
To increase the probability that interpreting rather than remembering is being assessed,
the information included in the assessment task must be new. "New" here means that
students did not encounter it during instruction. Unless this rule is observed, we cannot
ensure that interpreting rather than remembering is being assessed. If the assessment task
is identical to a task or example used during instruction, we are probably assessing
remembering, despite our efforts to the contrary. Although we will not repeat this point
from here on, it applies to each of the process categories and cognitive processes beyond
Remember. If assessment tasks are to tap higher-order cognitive processes, they must
require that students cannot answer them correctly by relying on memory alone.

Exemplifying
Exemplifying occurs when a student gives a specific example or instance of a general
concept or principle. Exemplifying involves identifying the defining features of the general
concept or principle (e.g., an isosceles triangle must have two equal sides) and using these
features to select or construct a specific instance (e.g., being able to select which of three
presented triangles is an isosceles triangle). Alternative terms are illustrating and instantiating.

Sample objectives and corresponding assessments: In exemplifying, a student is given a
concept or principle and must select or produce a specific example or instance of it that was
not encountered during instruction. In art history, an objective could be to learn to give
examples of various artistic painting styles. A corresponding assessment asks a student to select
which of four paintings represents the impressionist style. In science, a sample objective could
be to be able to give examples of various kinds of chemical compounds. A corresponding
assessment task asks the student to locate an inorganic compound on a field trip and tell why
it is inorganic (i.e., specify the defining features). In literature, an objective could be to learn
to exemplify various play genres. The assessment may give the students brief sketches of four
plays (only one of which is a romantic comedy) and ask the student to name the play that is a
romantic comedy.

Assessment formats: Exemplifying tasks can involve the constructed response format in which
the student must create an example or the selected response format in which the student must
select an example from a given set. The science example, "Locate an inorganic compound and
tell why it is inorganic," requires a constructed response. In contrast, the item "Which of these
is an inorganic compound? (a) iron, (b) protein, (c) blood, (d) leaf mold" requires a selected
response.

Classifying
Classifying occurs when a student recognizes that something (e.g., a particular instance
or example) belongs to a certain category (e.g., concept or principle). Classifying involves
detecting relevant features or patterns that "fit" both the specific instance and the concept or
principle. Classifying is a complementary process to exemplifying. Whereas exemplifying
begins with a general concept or principle and requires the student to find a specific instance
or example, classifying begins with a specific instance or example and requires the student to
find a general concept or principle. Alternative terms for classifying are categorizing and
subsuming.

Sample objectives and corresponding assessments: In social studies, an objective could be to
learn to classify observed or described cases of mental disorders. A corresponding assessment
item asks a student to observe a video of the behaviour of a person with mental illness and then
indicate the mental disorder that is displayed. In the natural sciences, an objective could be to
learn to categorize the species of various prehistoric animals. An assessment gives a student
some pictures of prehistoric animals with instructions to group them with others of the same
species. In mathematics, an objective could be to be able to determine the categories to which
numbers belong. An assessment task gives an example and asks a student to circle all numbers
in a list from the same category.

Assessment formats: In constructed response tasks, a student is given an instance and must
produce its related concept or principle. In selected response tasks, a student is given an
instance and must select its concept or principle from a list. In a sorting task, a student is given
a set of instances and must determine which ones belong in a specified category and which
ones do not, or must place each instance into one of multiple categories.
Summarizing
Summarizing occurs when a student suggests a single statement that represents
presented information or abstracts a general theme. Summarizing involves constructing a
representation of the information, such as the meaning of a scene in a play, and abstracting a
summary from it, such as determining a theme or main points. Alternative terms are
generalizing and abstracting.

Sample objectives and corresponding assessments: In summarizing, when given information,
a student provides a summary or abstracts a general theme. A sample objective in history could
be to learn to write short summaries of events portrayed pictorially. A corresponding
assessment item asks a student to watch a videotape on the phases of the Indian freedom struggle and
then write a short summary. Similarly, a sample objective in literature could be to
learn to summarize the major contributions of famous writers after reading several of their
writings. A corresponding assessment item asks a student to read selected writings of the poet
Bharathiyar and summarize the major points. In computer science, an objective could be to
learn to summarize the purposes of various subroutines in a program. An assessment item
presents a program and asks a student to write a sentence describing the sub-goal that each
section of the program accomplishes within the overall program.

Assessment formats: Assessment tasks can be presented in constructed response or selection
formats, involving either themes or summaries. Generally speaking, themes are more abstract
than summaries. For example, in a constructed response task, the student may be asked to read
an untitled passage on software versus hardware growth and then write an appropriate title for
the comparison. In a selection task, a student may be asked to read a passage on software versus
hardware growth and then select the most appropriate title from a list of four possible titles or
rank the titles in order of their "fit" to the point of the passage.
Inferring
Inferring involves finding a pattern within a series of examples or instances. Inferring
occurs when a student is able to abstract a concept or principle that accounts for a set of
examples or instances by encoding the relevant features of each instance and, most important,
by noting relationships among them. For example, when given a series of numbers such as 1,
2, 3, 5, 8, 13, 21, a student is able to focus on the numerical value of each number rather than on
irrelevant features such as the shape of the numerals or whether each number is odd or even. He or
she then is able to distinguish the pattern in the series of numbers (i.e., after the first two
numbers, each is the sum of the preceding two numbers).
The process of inferring involves making comparisons among instances within the
context of the entire set. For example, to determine what number will come next in the series
above, a student must identify the pattern. A related process is using the pattern to create a new
instance (e.g., the next number in the series is 34, the sum of 13 and 21). This is an example
of executing, which is a cognitive process associated with Apply. Inferring and executing are
often used together on cognitive tasks.
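A small Python sketch (function name invented) makes the link between the two processes explicit: the comment records the inferred rule, and the loop executes it to produce the next instance, 34.

def extend_series(series, count=1):
    # Inferred rule: after the first two numbers, each number is the sum of
    # the preceding two numbers.
    result = list(series)
    for _ in range(count):
        result.append(result[-1] + result[-2])
    return result

print(extend_series([1, 2, 3, 5, 8, 13, 21]))   # -> [1, 2, 3, 5, 8, 13, 21, 34]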
Finally, inferring is different from attributing (a cognitive process associated with
Analyze). As we discuss later in this chapter, attributing focuses solely on the pragmatic issue
of determining the author's point of view or intention, whereas inferring focuses on the issue
of inducing a pattern based on presented information. Another way of differentiating between
these two is that attributing is broadly applicable to situations in which one must "read between
the lines," especially when one is seeking to determine an author's point of view. Inferring, on
the other hand, occurs in a context that supplies an expectation of what is to be inferred.
Alternative terms for inferring are extrapolating, interpolating, predicting, and concluding.

Sample objectives and corresponding assessments: In inferring, when given a set or series of
examples or instances, a student finds a concept or principle that accounts for them. For
example, In mathematics, an objective could be to learn to infer the relationship expressed as
an equation that represents several observations of values for two variables. An assessment
item asks a student to describe the relationship as an equation involving x and y for situations
in which if x is 1, then y is 0; if x is 2, then y is 3; and if x is 3, then y is 8.
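One relationship that accounts for these observations is y = x² - 1 (since 0 = 1 - 1, 3 = 4 - 1, and 8 = 9 - 1); a two-line Python check, included only as a worked illustration, confirms the fit.

observations = [(1, 0), (2, 3), (3, 8)]
print(all(y == x * x - 1 for x, y in observations))   # -> True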

Assessment formats: Three common tasks that require inferring (often along with
implementing) are completion tasks, analogy tasks, and oddity tasks. In completion tasks, a
student is given a series of items and must determine what will come next, as in the number
series example above. In analogy tasks, a student is given an analogy of the form A is to B as
C is to D, such as "nation" is to "president" as "state" is to ____________. The student's task
is to produce or select a term that fits in the blank and completes the analogy (such as
"governor"). In an oddity task, a student is given three or more items and must
determine which does not belong. For example, a student may be given three physics problems,
two involving one principle and the third involving a different principle. To focus solely on the
inferring process, the question in each assessment task could be to state the underlying concept
or principle the student is using to arrive at the correct answer.

Comparing
Comparing involves detecting similarities and differences between two or more objects,
events, ideas, problems, or situations, such as determining how a well-known event (e.g., a
recent political scandal) is like a less familiar event (e.g., a historical political scandal).
Comparing includes finding one-to-one correspondences between elements and patterns in one
object, event, or idea and those in another object, event, or idea. When used in conjunction with
inferring (e.g., first, abstracting a rule from the more familiar situation) and implementing (e.g.,
second, applying the rule to the less familiar situation), comparing can contribute to reasoning
by analogy. Alternative terms are contrasting, matching, and mapping.
Sample objectives and corresponding assessments: In comparing, when given new information,
a student detects correspondences with more familiar knowledge. For example, in social
studies, an objective could be to understand historical events by comparing them to familiar
situations. A corresponding assessment question is "How is the Indian freedom struggle like
a family fight or an argument between friends?" In the natural sciences, a sample objective
could be to learn to compare an electrical circuit to a more familiar system. In assessment, we
ask "How is an electrical circuit like water flowing through a pipe?" Comparing may also
involve determining correspondences between two or more presented objects, events, or ideas.
In mathematics, a sample objective could be to learn to compare structurally similar word
problems. A corresponding assessment question asks a student to tell how a certain mixture
problem is like a certain work problem.

Assessment formats: A major technique for assessing the cognitive process of comparing is
mapping. In mapping, a student must show how each part of one object, idea, problem, or
situation corresponds to (or maps onto) each part of another. For example, a student could be
asked to detail how the battery, wire, and resistor in an electrical circuit are like the pump,
pipes, and pipe constructions in a water flow system, respectively.
Explaining
Explaining occurs when a student is able to construct and use a cause-and-effect model
of a system. The model may be derived from a formal theory (as is often the case in the natural
sciences) or may be grounded in research or experience (as is often the case in the social
sciences and humanities). A complete explanation involves constructing a cause-and-effect
model, including each major part in a system or each major event in the chain, and using the
model to determine how a change in one part of the system or one "link" in the chain affects a
change in another part. An alternative term for explaining is constructing a model.

Sample objectives and corresponding assessments: In explaining, when given a description of
a system, a student develops and uses a cause-and-effect model of the system. For example, in
social studies, an objective could be to explain the causes of important
historical events. As an assessment, after reading and discussing a unit on the Indian
Revolution, students are asked to construct a cause-and-effect chain of events that best explains
why the war occurred. In the natural sciences, an objective could be to explain how basic
physics laws work. Corresponding assessments ask students who have studied Ohm's law to
explain what happens to the rate of the current when a second battery is added to a circuit, or
ask students who have viewed a video on lightning storms to explain how differences in
temperature affect the formation of lightning.
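As a worked illustration of the Ohm's law item, and assuming the second battery is added in series (so the voltage doubles while the resistance stays the same), I = V/R predicts that the current doubles; the numerical values below are invented for illustration.

def current(voltage, resistance):
    # Ohm's law: I = V / R
    return voltage / resistance

R = 6.0                          # resistance in ohms (illustrative value)
print(current(1.5, R))           # one 1.5 V battery             -> 0.25 A
print(current(3.0, R))           # two 1.5 V batteries in series -> 0.5 A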

Assessment formats: Several tasks can be aimed at assessing a student's ability to explain,
including reasoning, troubleshooting, redesigning, and predicting. In reasoning tasks, a student
is asked to offer a reason for a given event. For example, "Why does air enter a bicycle tire
pump when you pull up on the handle?" In this case, an answer such as "It is forced in because
the air pressure is less inside the pump than outside" involves finding a principle that accounts
for a given event. In troubleshooting, a student is asked to diagnose what could have gone
wrong in a malfunctioning system. For example, "Suppose you pull up and press down on the
handle of a bicycle tire pump several times but no air comes out. What's wrong?" In this case,
the student must find an explanation for a symptom, such as "There is a hole in the cylinder"
or "A valve is stuck in the open position." In redesigning, a student is asked to change the
system to accomplish some goal. For example, "How could you improve a bicycle tire pump
so that it would be more efficient?" To answer this question, a student must imagine altering
one or more of the components in the system, such as "Put lubricant between the piston and
the cylinder."
In predicting, a student is asked how a change in one part of a system will effect a change in
another part of the system. For example, "What would happen if you increased the diameter
of the cylinder in a bicycle tire pump?" This question requires that the student "operate" the
mental model of the pump to see that the amount of air moving through the pump could be
increased by increasing the diameter of the cylinder.
Apply
Apply involves using procedures to perform exercises or solve problems. Thus, Apply
is closely linked with Procedural knowledge. An exercise is a task for which the student already
knows the proper procedure to use, so the student has developed a fairly routinized approach
to it. A problem is a task for which the student initially does not know what procedure to use,
so the student must locate a procedure to solve the problem. The Apply category consists of
two cognitive processes: executing, when the task is a familiar exercise, and implementing,
when the task is an unfamiliar problem. When the task is a familiar exercise, students generally
know what Procedural knowledge to use. When given an exercise (or set of exercises), students
typically perform the procedure with little thought. For example, an algebra student confronted
with the 50th exercise involving quadratic equations might simply "plug in the numbers and
turn the crank."
When the task is an unfamiliar problem, however, students must determine what
knowledge they will use. If the task appears to call for Procedural knowledge and no available
procedure fits the problem situation exactly, then modifications in selected Procedural
knowledge may be necessary. In contrast to executing, then, implementing requires some
degree of understanding of the problem as well as of the solution procedure. In the case of
implementing, then, understanding Conceptual knowledge is a prerequisite to being able to apply
Procedural knowledge.

Executing
In executing, a student routinely carries out a procedure when confronted with a
familiar task (i.e., exercise). The familiarity of the situation often provides sufficient clues to
guide the choice of the appropriate procedure to use. Executing is more frequently associated
with the use of skills and algorithms than with techniques and methods (see our discussion of
Procedural knowledge earlier in this section). Skills and algorithms have two qualities that make
them particularly amenable to executing. First, they consist of a sequence of steps that are
generally followed in a fixed order. Second, when the steps are performed correctly, the end
result is a predetermined answer. An alternative term for executing is carrying out.

Sample objectives and corresponding assessments: In executing, a student is faced with a
familiar task and knows what to do in order to complete it. The student simply carries out a
known procedure to perform the task. For example, a sample objective in elementary level
mathematics could be for students to learn to divide one whole number by another, both with
multiple digits. The instructions to "divide" signify the division algorithm, which is the
necessary Procedural knowledge. To assess the objective, a student is given a worksheet that
has 15 whole-number division exercises (e.g., 784/15) and is asked to find the quotients. In the
natural sciences, a sample objective could be to learn to compute the value of variables using
scientific formulas. To assess the objective, a student is given the formula Density =
Mass/Volume and must answer the question "What is the density of a material with a mass of
18 pounds and a volume of 9 cubic inches?"
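For illustration, the expected answers to these two executing tasks can be worked out directly (a brief worked check added here, not part of the original assessments):
784 ÷ 15 = 52 remainder 4, i.e., 784/15 ≈ 52.27
Density = Mass/Volume = 18 pounds ÷ 9 cubic inches = 2 pounds per cubic inch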

Assessment formats: In executing, a student is given a familiar task that can be performed using
a well-known procedure. For example, an execution task is "Solve for x: x² + 2x - 3 = 0 using
the technique of completing the square." Students may be asked to supply the answer or, where
appropriate, select from among a set of possible answers. Furthermore, because the emphasis
is on the procedure as well as the answer, students may be required not only to find the answer
but also to show their work.
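As an illustration of the "show their work" requirement (a worked solution added here, not part of the original text), the completing-the-square procedure for this exercise runs:
x² + 2x - 3 = 0
x² + 2x = 3
x² + 2x + 1 = 3 + 1
(x + 1)² = 4
x + 1 = ±2, so x = 1 or x = -3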

Implementing
Implementing occurs when a student selects and uses a procedure to perform an
unfamiliar task. Because selection is required, students must possess an understanding of the
type of problem encountered as well as the range of procedures that are available. Thus,
implementing is used in conjunction with other cognitive process categories, such as
Understand and Create. Because the student is faced with an unfamiliar problem, he or she
does not immediately know which of the available procedures to use. Furthermore, no single
procedure may be a "perfect fit" for the problem; some modification in the procedure may be
needed. Implementing is more frequently associated with the use of techniques and methods
than with skills and algorithms. Techniques and methods have two qualities that make them
particularly amenable to implementing. First, the procedure may be more like a "flow chart"
than a fixed sequence; that is, the procedure may have "decision points" built into it (e.g., after
completing Step 3, should I do Step 4A or Step 4B?). Second, there often is no single, fixed
answer that is expected when the procedure is applied correctly.
The notion of no single, fixed answer is especially applicable to objectives that call for
applying conceptual knowledge such as theories, models, and structures, where no procedure
has been developed for the application. Consider an objective such as "The student shall be
able to apply a social psychological theory of crowd behaviour to crowd control." Social
psychological theory is Conceptual not Procedural knowledge. This is clearly an Apply
objective, however, and there is no procedure for making the application. Given that the theory
would very clearly structure and guide the student in the application, this objective is just barely
on the Apply side of Create, but Apply it is. So it would be classified as implementing. To see
why it fits, think of the Apply category as structured along a continuum. It starts with the
narrow, highly structured execute, in which the known Procedural knowledge is applied almost
routinely. It continues through the broad, increasingly unstructured implement, in which, at the
beginning, the procedure must be selected to fit a new situation. In the middle of the category,
the procedure may have to be modified to implement it. At the far end of implementing, where
there is no set Procedural knowledge to modify, a procedure must be manufactured out of
Conceptual knowledge using theories, models, or structures as a guide. So, although Apply is
closely linked to Procedural knowledge, and this linkage carries through most of the category
of Apply, there are some instances of implementing in which one applies Conceptual
knowledge as well. An alternative term for implementing is using.

Sample objectives and corresponding assessments: In mathematics, a sample objective could
be to learn to solve a variety of personal finance problems. A corresponding assessment is to
present students with a problem in which they must choose the most economical financing
package for a new car. In the natural sciences, a sample objective could be to learn to use the
most effective, efficient, and affordable method of conducting a research study to address a
specific research question. A corresponding assessment is to give students a research question
and have them propose a research study that meets specified criteria of effectiveness,
efficiency, and affordability. Notice that in both of these assessment tasks, the student must not
only apply a procedure (i.e., engage in implementing) but also rely on conceptual
understanding of the problem, the procedure, or both.

Assessment formats: In implementing, a student is given an unfamiliar problem that must be
solved. Thus, most assessment formats begin with specification of the problem. Students are
asked to determine the procedure needed to solve the problem, solve the problem using the
selected procedure (making modifications as necessary), or usually both.
Analyse
Analyse involves breaking material into its constituent parts and determining how the
parts are related to one another and to an overall structure. This process category includes the
cognitive processes of differentiating, organizing, and attributing. Objectives classified as
Analyse include learning to determine the relevant or important pieces of a message
(differentiating), the ways in which the pieces of a message are organized (organizing), and the
underlying purpose of the message (attributing). Although learning to Analyze may be viewed
as an end in itself, it is probably more defensible educationally to consider analysis as an
extension of Understanding or as a prelude to Evaluating or Creating. Improving students' skills
in analyzing educational communications is a goal in many fields of study. Teachers of science,
social studies, the humanities, and the arts frequently give "learning to analyze" as one of their
important objectives. They may, for example, wish to develop in their students the ability to:
• distinguish fact from opinion (or reality from fantasy);
• connect conclusions with supporting statements;
• distinguish relevant from extraneous material;
• determine how ideas are related to one another;
• ascertain the unstated assumptions involved in what is said;
• distinguish dominant from subordinate ideas or themes in poetry or music; and
• find evidence in support of the author's purposes.
The process categories of Understand, Analyze, and Evaluate are interrelated and often used
iteratively in performing cognitive tasks. At the same time, however, it is important to maintain
them as separate process categories. A person who understands a communication may not be
able to analyze it well. Similarly, someone who is skillful in analyzing a communication may
evaluate it poorly.
Differentiating
Differentiating involves distinguishing the parts of a whole structure in terms of their
relevance or importance. Differentiating occurs when a student discriminates relevant from
irrelevant information, or important from unimportant information, and then attends to the
relevant or important information. Differentiating is different from the cognitive processes
associated with Understand because it involves structural organization and, in particular,
determining how the parts fit into the overall structure or whole. More specifically,
differentiating differs from comparing in using the larger context to determine what is relevant
or important and what is not. For instance, in differentiating apples and oranges in the context
of fruit, internal seeds are relevant, but color and shape are irrelevant. In comparing, all of these
aspects (i.e., seeds, color, and shape) are relevant. Alternative terms for differentiating are
discriminating, selecting, distinguishing, and focusing.

Sample objectives and corresponding assessments: In the social sciences, an objective could
be to learn to determine the major points in research reports. A corresponding assessment item
requires a student to circle the main points in an archaeological report about an ancient Indian
city (such as when the city began and when it ended, the population of the city over the course
of its existence, the geographic location of the city, the physical buildings in the city, its
economic and cultural function, the social organization of the city, why the city was built and
why it was deserted). Similarly, in the natural sciences, an objective could be to select the main
steps in a written description of how something works. A corresponding assessment item asks
a student to read a chapter in a book that describes lightning formation and then to divide the
process into major steps (including moist air rising to form a cloud, creation of updrafts and
downdrafts inside the cloud, separation of charges within the cloud, movement of a stepped
leader downward from cloud to ground, and creation of a return stroke from ground to cloud).
Finally, in mathematics, an objective could be to distinguish between relevant and irrelevant
numbers in a word problem. An assessment item requires a student to circle the relevant
numbers and cross out the irrelevant numbers in a word problem.
Assessment formats: Differentiating can be assessed with constructed response or selection
tasks. In a constructed response task, a student is given some material and is asked to indicate
which parts are most important or relevant, as in this example: "Write the numbers that are
needed to solve this problem: Pencils come in packages that contain 12 each and cost Rs.2.00
each. John has Rs.5.00 and wishes to buy 24 pencils. How many packages does he need to
buy?" In a selection task, a student is given some material and is asked to choose which parts
are most important or relevant, as in this example: "Which numbers are needed to solve this
problem? Pencils come in packages that contain 12 each and cost Rs.2.00 each. John has
Rs.5.00 and wishes to buy 24 pencils. How many packages does he need to buy? (a) 12,
Rs.2.00, Rs.5.00, 24; (b) 12, Rs.2.00, Rs.5.00; (c) 12, Rs.2.00, 24; (d) 12, 24."
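For reference, the arithmetic behind this item is short (a worked note added here; the answer key is inferred, not stated in the text). The question asks only how many packages are needed, so
Packages needed = 24 pencils ÷ 12 pencils per package = 2 packages
Only 12 and 24 are needed; Rs.2.00 and Rs.5.00 are extraneous, which corresponds to option (d).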

Organizing
Organizing involves identifying the elements of a communication or situation and
recognizing how they fit together into a coherent structure. In organizing, a student builds
systematic and coherent connections among pieces of presented information. Organizing
usually occurs in conjunction with differentiating. The student first identifies the relevant or
important elements and then determines the overall structure within which the elements fit.
Organizing can also occur in conjunction with attributing, in which the focus is on determining
the author's intention or point of view. Alternative terms for organizing are structuring,
integrating, finding coherence, outlining, and parsing.

Sample objectives and corresponding assessments: In organizing, when given a description of


a situation or problem, a student is able to identify the systematic, coherent relationships among
relevant elements. A sample objective in the natural sciences could be to learn to analyze
research reports in terms of four sections: hypothesis, method, data, and conclusion. As an
assessment, students are asked to produce an outline of a presented research report. In
mathematics, a sample objective could be to learn to outline textbook lessons. A corresponding
assessment task asks a student to read a textbook lesson on basic statistics and then generate a
matrix that includes each statistic's name, formula, and the conditions under which it is used.
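A small fragment of such a matrix might look like the following (the particular statistics chosen are assumptions for illustration, not prescribed by the text):
Statistic: Mean; Formula: sum of values ÷ number of values (Σx/n); Conditions for use: interval/ratio data with a roughly symmetric distribution
Statistic: Median; Formula: middle value of the ordered data; Conditions for use: ordinal data or markedly skewed distributions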
Assessment formats: Organizing involves imposing a structure on material (such as an outline,
table, matrix, or hierarchical diagram). Thus, assessment can be based on constructed response
or selection tasks. In a constructed response task, a student may be asked to produce a written
outline of a passage. In a selection task, a student may be asked to select which of four
alternative graphic hierarchies best corresponds to the organization of a presented passage.

Attributing
Attributing occurs when a student is able to ascertain the point of view, biases, values,
or intention underlying communications. Attributing involves a process of deconstruction, in
which a student determines the intentions of the author of the presented material. In contrast to
interpreting, in which the student seeks to Understand the meaning of the presented material,
attributing involves an extension beyond basic understanding to infer the intention or point of
view underlying the presented material. An alternative term is deconstructing.

Sample objectives and corresponding assessments: In attributing, when given information, a
student is able to determine the underlying point of view or intention of the author. For
example, in social studies, a sample objective could be to learn to determine the point of view
of the author of an essay on a controversial topic in terms of his or her theoretical perspective.
A corresponding assessment task asks a student whether a report on Amazon rain forests was
written from a pro-environment or pro-business point of view. This objective is also applicable
to the natural sciences. A corresponding assessment task asks a student to determine whether a
behaviourist or a cognitive psychologist wrote an essay about human learning.

Assessment formats: Attributing can be assessed by presenting some written or oral material
and then asking a student to construct or select a description of the author's or speaker's point
of view, intentions, and the like. For example, a constructed response task is "What is the
author's purpose in writing the essay you read on the Amazon rain forests?" A selection version
of this task is "The author's purpose in writing the essay you read is to: (a) provide factual
information about Amazon rain forests, (b) alert the reader to the need to protect rain forests,
(c) demonstrate the economic advantages of developing rain forests, or (d) describe the
consequences to humans if rain forests are developed." Alternatively, students might be asked
to indicate whether the author of the essay would (a) strongly agree, (b) agree, (c) neither agree
nor disagree, (d) disagree, or (e) strongly disagree with several statements. Statements like
"The rainforest is a unique type of ecological system" would follow.
Evaluate
Evaluate is defined as making judgments based on criteria and standards. The criteria
most often used are quality, effectiveness, efficiency, and consistency. They may be
determined by the student or by others. The standards may be either quantitative (i.e., Is this a
sufficient amount?) or qualitative (i.e., Is this good enough?). The standards are applied to the
criteria (e.g., Is this process sufficiently effective? Is this product of sufficient quality?). The
category Evaluate includes the cognitive processes of checking (judgments about internal
consistency) and critiquing (judgments based on external criteria). It must be emphasized that
not all judgments are evaluative. For example, students make judgments about whether a
specific example fits within a category. They make judgments about the appropriateness of a
particular procedure for a specified problem. They make judgments about whether two objects
are similar or different. Most of the cognitive processes, in fact, require some form of judgment.
What most clearly differentiates Evaluate as defined here from other judgments made by
students is the use of standards of performance with clearly defined criteria. Is this machine
working as efficiently as it should be? Is this method the best way to achieve the goal? Is this
approach more cost effective than other approaches? Such questions are addressed by people
engaged in Evaluating.

Checking
Checking involves testing for internal inconsistencies or fallacies in an operation or a
product. For example, checking occurs when a student tests whether or not a conclusion follows
from its premises, whether data support or disconfirm a hypothesis, or whether presented
material contains parts that contradict one another. When combined with planning (a cognitive
process in the category Create) and implementing (a cognitive process in the category Apply),
checking involves determining how well the plan is working. Alternative terms for checking
are testing, detecting, monitoring, and coordinating.

Sample objectives and corresponding assessments: In checking, students look for internal
inconsistencies. A sample objective in the social sciences could be to learn to detect
inconsistencies in persuasive messages. A corresponding assessment task asks students to
watch a television advertisement for a political candidate and point out any logical flaws in the
persuasive message. A sample objective in the sciences could be to learn to determine whether
a scientist's conclusion follows from the observed data. An assessment task asks a student to
read a report of a chemistry experiment and determine whether or not the conclusion follows
from the results of the experiment.

Assessment formats: Checking tasks can involve operations or products given to the students
or ones created by the students themselves. Checking can also take place within the context of
carrying out a solution to a problem or performing a task, where one is concerned with the
consistency of the actual implementation (e.g., Is this where I should be in light of what I've
done so far?).

Critiquing
Critiquing involves judging a product or operation based on externally imposed criteria
and standards. In critiquing, a student notes the positive and negative features of a product and
makes a judgment based at least partly on those features. Critiquing lies at the core of what has
been called critical thinking. An example of critiquing is judging the merits of a particular
solution to the problem of acid rain in terms of its likely effectiveness and its associated costs
(e.g., requiring all power plants throughout the country to restrict their smokestack emissions
to certain limits). An alternative term is judging.

Sample objectives and corresponding assessments: In critiquing, students judge the merits of a
product or operation based on specified or student-determined criteria and standards. In the
social sciences, an objective could be to learn to evaluate a proposed solution (such as
"eliminate all grading") to a social problem (such as "how to improve K-12 education") in
terms of its likely effectiveness. In the natural sciences, an objective could be to learn to
evaluate the reasonableness of a hypothesis (such as the hypothesis that strawberries are
growing to extraordinary size because of the unusual alignment of the stars). Finally, in
mathematics, an objective could be to learn to judge which of two alternative methods is a more
effective and efficient way of solving given problems (such as judging whether it is better to
find all prime factors of 60 or to produce an algebraic equation to solve the problem "What are
the possible ways you could multiply two whole numbers to get 60?").
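For reference (a worked note added here, not part of the original item), the two routes lead to the same answer: the prime factorization is 60 = 2² × 3 × 5, from which the factor pairs 1 × 60, 2 × 30, 3 × 20, 4 × 15, 5 × 12 and 6 × 10 can be generated; listing the whole-number solutions of a × b = 60 directly (one reading of the "algebraic equation" route) yields the same six pairs. The critiquing task is to judge which route is the more effective and efficient.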

Assessment formats: A student may be asked to critique his or her own hypotheses or creations
or those generated by someone else. The critique could be based on positive, negative, or both
kinds of criteria and yield both positive and negative consequences.
Create
Create involves putting elements together to form a coherent or functional whole.
Objectives classified as Create have students make a new product by mentally reorganizing
some elements or parts into a pattern or structure not clearly present before. The processes
involved in Create are generally coordinated with the student's previous learning experiences.
Although Create requires creative thinking on the part of the student, this is not completely free
creative expression unconstrained by the demands of the learning task or situation.
To some persons, creativity is the production of unusual products, often as a result of some
special skill. Create, as used here, however, although it includes objectives that call for unique
production, also refers to objectives calling for production that all students can and will do. If
nothing else, in meeting these objectives, many students will create in the sense of producing
their own synthesis of information or materials to form a new whole, as in writing, painting,
sculpting, building, and so on.
Although many objectives in the Create category emphasize originality (or uniqueness),
educators must define what is original or unique. Can the term unique be used to describe the
work of an individual student or is it reserved for use with a group of students (e.g., "This is
unique for a fifth-grader")? It is important to note, however, that many objectives in the Create
category do not rely on originality or uniqueness. The teachers' intent with these objectives is
that students should be able to synthesize material into a whole. This synthesis is often required
in papers in which the student is expected to assemble previously taught material into an
organized presentation.
Although the process categories of Understand, Apply, and Analyze may involve
detecting relationships among presented elements, Create is different because it also involves
the construction of an original product. Unlike Create, the other categories involve working
with a given set of elements that are part of a given whole; that is, they are part of a larger
structure the student is trying to understand. In Create, on the other hand, the student must draw
upon elements from many sources and put them together into a novel structure or pattern
relative to his or her own prior knowledge. Create results in a new product, that is, something
that can be observed and that is more than the student's beginning materials. A task that requires
Create is likely to require aspects of each of the earlier cognitive process categories to some
extent, but not necessarily in the order in which they are listed in the Taxonomy Table.
We recognize that composition (including writing) often, but not always, requires the
cognitive processes associated with Create. For example, Create is not involved in writing that
represents the remembering of ideas or the interpretation of materials. We also recognize that
deep understanding that goes beyond basic understanding can require the cognitive processes
associated with Create. To the extent that deep understanding is an act of construction or
insight, the cognitive processes of Create are involved.
The creative process can be broken into three phases: problem representation, in which a
student attempts to understand the task and generate possible solutions; solution planning, in
which a student examines the possibilities and devises a workable plan; and solution execution,
in which a student successfully carries out the plan. Thus, the creative process can be thought
of as starting with a divergent phase in which a variety of possible solutions are considered as
the student attempts to understand the task (generating). This is followed by a convergent
phase, in which the student devises a solution method and turns it into a plan of action
(planning). Finally, the plan is executed as the student constructs the solution (producing). It is
not surprising, then, that Create is associated with three cognitive processes: generating,
planning, and producing.

Generating
Generating involves representing the problem and arriving at alternatives or hypotheses
that meet certain criteria. Often the way a problem is initially represented suggests possible
solutions; however, redefining or coming up with a new representation of the problem may
suggest different solutions. When generating transcends the boundaries or constraints of prior
knowledge and existing theories, it involves divergent thinking and forms the core of what can
be called creative thinking.
Generating is used in a restricted sense here. Understand also requires generative
processes, which we have included in translating, exemplifying, summarizing, inferring,
classifying, comparing, and explaining. However, the goal of Understand is most often
convergent (that is, to arrive at a single meaning). In contrast, the goal of generating within
Create is divergent (that is, to arrive at various possibilities). An alternative term for generating
is hypothesizing.

Sample objective and corresponding assessment: In generating, a student is given a description
of a problem and must produce alternative solutions. For example, in the social sciences, an
objective could be to learn to generate multiple useful solutions for social problems. A
corresponding assessment item is: "Suggest as many ways as you can to assure that everyone
has adequate medical insurance." To assess student responses, the teacher should construct a
set of criteria that are shared with the students. These might include the number of alternatives,
the reasonableness of the various alternatives, the practicality of the various alternatives, and
so on. In the natural sciences, an objective could be to learn to generate hypotheses to explain
observed phenomena. A corresponding assessment task asks students to write as many
hypotheses as they can to explain strawberries growing to extraordinary size. Again, the teacher
should establish clearly defined criteria for judging the quality of the responses and give them
to the students. Finally, an objective from the field of mathematics could be to be able to
generate alternative methods for achieving a particular result. A corresponding assessment item
is: "What alternative methods could you use to find what whole numbers yield 60 when
multiplied together?" For each of these assessments, explicit, publicly shared scoring criteria
are needed.

Assessment formats: Assessing generating typically involves constructed response formats in
which a student is asked to produce alternatives or hypotheses. Two traditional subtypes are
consequences tasks and uses tasks. In a consequences task, a student must list all the possible
consequences of a certain event, such as "What would happen if there was a flat income tax
rather than a graduated income tax?" In a uses task, a student must list all possible uses for an
object, such as "What are the possible uses for the World Wide Web?" It is almost impossible
to use the multiple-choice format to assess generating processes.

Planning
Planning involves devising a solution method that meets a problem's criteria, that is,
developing a plan for solving the problem. Planning stops short of carrying out the steps to
create the actual solution for a given problem. In planning, a student may establish sub-goals,
or break a task into subtasks to be performed when solving the problem. Teachers often skip
stating planning objectives, instead stating their objectives in terms of producing, the final stage
of the creative process. When this happens, planning is either assumed or implicit in the
producing objective. In this case, planning is likely to be carried out by the student covertly
during the course of constructing a product (i.e., producing). An alternative term is designing.
Sample objectives and corresponding assessments: In planning, when given a problem
statement, a student develops a solution method. In history, a sample objective could be to be
able to plan research papers on given historical topics. An assessment task asks the student,
prior to writing a research paper on the causes of the Indian Revolution, to submit an outline
of the paper, including the steps he or she intends to follow to conduct the research. In the
natural sciences, a sample objective could be to learn to design studies to test various
hypotheses. An assessment task asks students to plan a way of determining which of three
factors determines the rate of oscillation of a pendulum. In mathematics, an objective could be
to be able to lay out the steps needed to solve geometry problems. An assessment task asks
students to devise a plan for determining the volume of the frustum of a pyramid (a task not
previously considered in class). The plan may involve computing the volume of the large
pyramid, then computing the volume of the small pyramid, and finally subtracting the smaller
volume from the larger.
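As a sketch of how that plan works out algebraically (an illustration added here; the symbols a, b and h are assumptions, taking a square frustum with lower side a, upper side b and height h):
Let H be the height of the completed large pyramid. By similar triangles, b/a = (H - h)/H, so H = ah/(a - b).
V = (1/3)a²H - (1/3)b²(H - h) = (h/3)(a² + ab + b²)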

Assessment formats: Planning may be assessed by asking students to develop worked-out
solutions, describe solution plans, or select solution plans for a given problem.

Producing
Producing involves carrying out a plan for solving a given problem that meets certain
specifications. As we noted earlier, objectives within the category Create may or may not
include originality or uniqueness as one of the specifications. So it is with producing objectives.
Producing can require the coordination of the four types of knowledge. An alternative term is
constructing.
Sample objectives and corresponding assessments: In producing, a student is given a functional
description of a goal and must create a product that satisfies the description. It involves carrying
out a solution plan for a given problem. Sample objectives involve producing novel and useful
products that meet certain requirements. In history, an objective could be to learn to write
papers pertaining to particular historical periods that meet specified standards of scholarship.
An assessment task asks students to write a short story that takes place during the Indian
Revolution. In science, an objective could be to learn to design habitats for certain species and
certain purposes. A corresponding assessment task asks students to design the living quarters
of a space station. In all these examples, the specifications become the criteria for evaluating
student performance relative to the objective. These specifications, then, should be included in
a scoring rubric that is given to the students in advance of the assessment.

Assessment formats: A common task for assessing producing is a design task, in which students
are asked to create a product that corresponds to certain specifications. For example, students
may be asked to produce schematic plans for a new institution that include new ways for
students to conveniently store their personal belongings.
Assessment Procedures
Assessment methods are the strategies, techniques, tools and instruments for collecting
information to determine the extent to which students demonstrate desired learning outcomes.
Several methods should be used to assess student learning outcomes. Relying on only one
method reflects only part of students' achievement, and some student learning outcomes may
be difficult to assess using only one method. For each student learning outcome, a combination
of direct and indirect assessment methods should be used. For example, responses from student
surveys may be informative on their own; when combined with students' test results, however,
they become more meaningful, valid, and reliable.
Principles of Assessment
Assessment will be valid
Assessment will be explicitly designed to measure student achievement of the intended
learning outcomes, and all intended learning outcomes will be summatively assessed. The
processes for the approval of new modules and programmes, and for amending existing
modules and programmes, will ensure that assessment is an integral part of module and
programme design, and the ongoing validity of assessment will be considered through annual
and periodic review.
Assessment will be reliable
To ensure the level of consistency that is necessary for assessment to be reliable, all awards at
the same academic level will be aligned with the institution's generic qualification descriptor,
level descriptor and assessment criteria for that level of award.
Assessment will be equitable
Different assessment methods may be appropriate for different learning styles; programmes are
therefore encouraged to employ (in a way that is consistent with the intended learning outcomes
being assessed) a diversity of assessment methods so that all students can demonstrate their
knowledge, understanding and skills.
Assessment will be explicit and transparent
Prior to undertaking any assessment task, students will be clearly informed of the purpose and
requirements of the task and will be provided with the specific assessment criteria that will be
used for marking it. Feedback to students will be related to the stated learning outcomes and
specific assessment criteria. Clear information on the policies and processes relating to
assessment will be easily available to all involved in the assessment process.
Assessment will support the student learning process
All assessment tasks influence the way in which students approach their learning, and this will
be taken into account in the design of all assessment tasks.
Assessment will be efficient
Assessment will be efficient for both students and staff such that learning outcomes are not
overly assessed and that knowledge and skills can be sampled.
Direct Method of Assessment
The direct method of assessment provides concrete evidence of classroom outcomes: it is
quantifiable, measurable and visible. It shows clearly what students have learned in a course,
and it tells faculty members what students understand about the subject and what they can do
with that understanding. This is the method most commonly used by faculty members. There
are several direct assessment methodologies:

1. Standardized Examination
2. Quiz
3. Simulations
4. Demonstrations
5. Capstone Projects
6. Portfolios
7. Oral Exams

The strength of direct measurement is that faculty members obtain concrete evidence, from a
sample of student work, of what students can do with their learning. Its weakness is that some
components of the teaching-learning process cannot be evaluated directly.
Indirect method of assessment
Indirect measurement mostly reports on the learning experience of the students. It captures
opinions of, or surveys about, student learning; in other words, it reflects students' perceptions
of their own learning. For example, a teacher may wish to know how interested students are in
a particular subject. This kind of evidence can be collected only through indirect measurement,
which also captures affective domain components. The main disadvantage of indirect
measurement is that it is not useful for identifying the specific knowledge and skills of the
student.
Source : Assessment 101: Assessment Tips with Gloria Rogers, Ph.D. Direct and Indirect
Assessment, August 2006
However, as evidence of learning, indirect measures are not as strong as
direct measures because assumptions must be made about what exactly the self-report means.
If students report that they have attained a particular learning outcome, how can that report be
validated? An indirect assessment is useful in that it can be used to measure certain implicit
qualities of student learning, such as values, perceptions, and attitudes, from a variety of
perspectives. However, in the absence of direct evidence, assumptions must be made about
how well perceptions match the reality of actual achievement of student learning.

It is important to remember that all assessment methods have their limitations and contain some
bias. A meaningful assessment program would use both direct and indirect assessments from a
variety of sources (students, alumni, faculty, employers, etc.). This use of multiple assessment
methods provides converging evidence of student learning. Indirect methods provide a valuable
supplement to direct methods and are generally a part of a robust assessment program.

The different types of indirect measurements are


1. Survey
2. Exit interviews
3. Placement statistics
Source : http://www.skidmore.edu/assessment/
Examples of Direct and Indirect Measures of Student Learning at the Course, Program,
and Institutional Levels
Course Level
Direct measures:
• Course and homework assignments
• Exams and quizzes
• Standardized tests
• Term papers and reports
• Observations of field work, internship performance, service learning, clinical experiences
• Research projects
• Class discussion participation
• Case study analysis
• Rubric scores for writing, oral presentations, and performances
• Artistic performances and products
• Grades based on explicit criteria related to clear learning goals
Indirect measures:
• Course evaluations
• Test blue prints (outlines of the concepts and skills covered on tests)
• Percent of class time spent in active learning
• Number of student hours spent on service learning
• Number of student hours spent on homework
• Number of student hours spent at intellectual or cultural activities related to the course
• Grades that are not based on explicit criteria related to clear learning goals

Program Level
Direct measures:
• Capstone projects, senior theses, exhibits, or performances
• Pass rates or scores on licensure, certification, or subject area tests
• Student publications or conference presentations
• Employer and internship supervisor ratings of students' performance
Indirect measures:
• Focus group interviews with students, faculty members, or employees
• Registration or course enrollment information
• Department or program review data
• Job placement
• Employer or alumni surveys
• Student perception surveys
• Proportion of upper-level courses compared to the same program at other institutions
• Graduate school placement rates

Institutional Level
Direct measures:
• Performance on tests of writing, critical thinking, or general knowledge
• Rubric scores for class assignments in General Education, interdisciplinary core courses, or other courses required of all students
• Performance on achievement tests
• Explicit self-reflections on what students have learned related to institutional programs such as service learning (e.g., asking students to name the three most important things they have learned in a program)
Indirect measures:
• Locally developed, commercial, or national surveys of student perceptions or self-report of activities (e.g., National Survey of Student Engagement)
• Transcript studies that examine patterns and trends of course selection and grading
• Annual reports including institutional benchmarks (e.g., graduation and retention rates, grade point averages of graduates, etc.)
CONSTRUCTION OF TEST ITEMS AND QUESTIONS
INTRODUCTION
Teachers are concerned with their students achieving the specified learning outcomes in the
subjects they teach. They have to test the achievement of all those learning outcomes. Any
testing device must therefore attempt to cover the entire content prescribed and taught by
teachers. Essay questions, which are generally used in assignments, can by their very nature
cover only limited content. Objective items are most suitable for a wider coverage of content. In any
assessment of students, teachers must ensure objectivity and reliability of assessment. This unit
describes the rules for constructing different types of test items and questions and their
advantages and limitations.

Types of test items and questions


Teachers are required to find out how their students are progressing in their studies. They would
be keen to get information on whether students have achieved the outcomes spelt out for the
lesson. Ascertaining the achievement (or otherwise) of all outcomes is a necessity. Learning
outcomes are spelt out in three areas (domains). The types of tests, test items and questions used
would depend upon the learning outcome. The test items and questions dealt with in this Unit
are intended to assess such learning outcomes.
For assessing the cognitive domain, two categories of test items and questions are used by teachers.
These are
• Supply Type
• Selection Type
Supply type items are so called because students are required to supply (write) their own
answers to the items. In the selection type, a set of possible answers is provided for each item,
and students are required to select the best/most appropriate answer from the list given.
A supply type item requires students to provide their own answer to the question; no answers
are given to choose from, so there is no scope for guessing. The question may require a one-word
or one-sentence answer, or a longer answer of a paragraph or more than a page.
The following are the types of Supply type items
• Completion
• Single answer question
• Short answer/structured essay question
• Numerical Problem Solving

A selection question provides students with alternative answers from which to choose the
correct answer. The following are the types of Selection items:
• Multiple Choice
• True/False (also called Alternate Response)
• Matching

Supply type items are used to test the following


• writing ability
• ability to organise ideas and thoughts
• student’s creativity
• synthesis and evaluation of ideas
• problem solving skills
• recall of a sequence of activities
Selection type items are useful in the following situations
• to test recognition of correct answers
• to test products of any mathematical calculation
• to judge the correctness or otherwise of a statement
• to differentiate similar objects, words, symbols, etc.
Completion and Single Answer Questions
A statement given in an incomplete form or a question is posed to the student. The student is
required to complete the sentence by filling in the blank with the correct answer or supply
answer to the question in a single word or phrase.
Example:
What is the atomicity of Hydrogen?
The valency of Carbon is ________.
These types of questions are useful to test a student's
• knowledge of facts, principles, theories
• comprehension of information including interpretation of data, parts

Construction
• Items must be clearly stated so that a single brief answer is possible.
• The question must be direct
Example: What is the SI unit of force?
• The item with blank spaces must make enough sense so that the student knows
what to do
Example: A metal is--------------
The above item does not indicate to the student what is expected of him/her.
• Answer must be related to the main point in the statement
• Place the blanks at the end of the statement. The blank may be provided either at the end
or at the beginning of the statement; blanks in the middle of the statement should be
avoided as much as possible.
Example: There are two rational numbers which have themselves as reciprocals. One of them
is 1. The other number is ________.
• Don't make the statement too general.
For example: A circular saw is ------------
Let us look at some examples.
What is a refrigerator?
This is a vague question. Students can answer it in different ways. One answer can be it is a
gadget to store vegetables and prevent them from spoilage. Another answer may be it is a
gadget that works on the principle of refrigeration. Yet another answer could be it is a gadget
which maintains the required temperature for preventing spoiling of food items.
A capacitor ________ DC.
A capacitor ________ and ________ electric energy.
These two items violate rules of construction. More than one answer is possible for both
questions.

Advantages of Completion type items


• can provide a wide sampling of content
• can efficiently measure lower levels of cognitive ability
• can minimize guessing as compared to multiple choice or true-false items
• can usually provide an objective measure of student achievement or ability
Limitations of Completion type items
• They are difficult to construct so that the desired response is clearly indicated
• They have difficulty measuring learning objectives that require more than simple recall of
information
• They can often include more irrelevant clues than do other item types
• They are more time consuming to score when compared to multiple choice or true-
false items
• They are more difficult to score, since more than one answer may have to be considered
correct if the item was not properly prepared.
Short Answer Questions
Short-answer questions are open-ended questions that require students to supply their answer.
They are commonly used in examinations to assess the basic knowledge and understanding
(low cognitive levels) of a topic before more in-depth assessment questions are asked on the
topic.
This is a supply type item where the student is given a clear direction to restrict the answer to
2 or 3 sentences. Questions must be such that answers are possible within the limits of specified
lengths.
Example:
Define Poisson's ratio
List three important uses of poor conductors
What is normalization in database?

Construction
• The question must be simple, clear and unambiguous
• Scope of the answer must be limited by the use of words such as 'list', 'give reasons',
'define', etc.
• Questions must be interpretable in the same way by all students.

Advantages of Short Answer Questions


• Short Answer Questions are relatively fast to mark and can be marked by different
assessors, as long as the questions are set in such a way that all alternative answers can
be considered by the assessors.
• Short Answer Questions are also relatively easy to set compared to many assessment
methods.
• Short Answer Questions can be used as part of both formative and summative assessment.
Because their structure is very similar to that of examination questions, students are familiar
with the practice and feel less anxious.
• Unlike MCQs, there is no guessing: students must supply an answer.
Limitations of Short Answer Questions
• Short Answer Questions are only suitable for questions that can be answered with short
responses. It is very important that the assessor is clear about the type of answer expected
when setting the question. Because Short Answer Questions are open-ended, students are
free to answer in any way they choose, which can lead to difficulties in grading if the
question is not worded carefully.
• Short Answer Questions are typically used for assessing knowledge only, and students may
memorize answers to them through rote learning. If assessors wish to use Short Answer
Questions to assess deeper learning, careful attention to framing appropriate questions is
required.
• Accuracy of assessment may be influenced by handwriting/spelling skills
• There can be time management issues when answering Short Answer Questions

How to design a good Short Answer Question?


1. Design short answer items which are appropriate assessment of the learning objective
2. Make sure the content of the short answer question measures knowledge appropriate to
the desired learning goal
3. Express the questions with clear wordings and language which are appropriate to the
student population
4. Ensure there is only one clearly correct answer in each question
5. Ensure that the item clearly specifies how the question should be answered (e.g. should the
student answer briefly and concisely using a single word or short phrase? does the question
provide a specific number of blanks for students to fill in?)
6. Consider whether the positioning of the item blank promotes efficient scoring
7. Write the instructions clearly so as to specify the desired knowledge and specificity of
response
8. Set the questions explicitly and precisely.
9. Direct questions are better than those which require completing the sentences.
10. For numerical answers, let the students know whether they will receive marks for showing
partial work (process based) or only for the results (product based); also indicate the
importance of units.
11. Let the students know what your marking style is: is bullet-point format acceptable, or does
it have to be essay format?
12. Prepare a structured marking sheet; allocate marks or part-marks for acceptable
answer(s).
13. Be prepared to accept other equally acceptable answers, some of which you may not
have predicted
Essay Questions (Long answer questions)
A Long answer question is one that requires students to supply an answer that is longer
than a mere listing, simple computation or a single paragraph. Students are required to write
reasonably lengthy answers.
Merits and Demerits
Essay type questions require the student to express himself/herself in his/her own words.
They can measure certain complex outcomes such as the ability to create, to organise, to integrate,
to express, and similar behaviours that call for the production and synthesis of ideas. If carefully
prepared, they can be used to assess the ability to understand, apply, analyse, evaluate and create.
Essay questions take so long to answer that relatively few can be answered in a given
period of time, and they are usually very time consuming to score. Essay questions are noted for
their vagueness and ambiguity, they lack reliability in scoring, and they have limited content coverage.
Many essay questions are so stated that students do not know what kind of answer is expected.
Several kinds of answers may be justified, and some of them may be quite different from what
the teacher had in mind while setting the paper. Invariably students have to guess as to what
the teacher implied by the question. Finally, the diversity of answers makes it difficult to
mark the answers fairly and objectively.
Improving Essay Questions
Essay questions can be improved by
• Structuring the questions properly and demanding a restricted response from the learner
rather than an extended response
• Stating the mark distribution for each sub division in the question
Scoring of Essay Questions
The main difficulty with an essay question is achieving reliable scoring. The
following guidelines may be followed to minimize the subjectivity of scoring essay questions:
• Evaluate the answers by having a marking scheme and model answer
• Evaluate and score question by question for all students rather than scoring all questions
student by student
• Evaluate the answers in terms of the learning outcomes measured
Structured Essay Questions
It is possible to reduce the open-ended aspect of an essay question by breaking it into a series
of small tasks/related questions and making them clear and unambiguous.
Example
1. Draw a sketch showing the main parts of an electroslag welding equipment - 5 marks
2. Explain how the flux and filler metal are supplied to the weld pool- 4 marks
3. List the advantages and limitations of the Electroslag welding process - 4 marks
4. State the need for water cooled copper shoes - 4 Marks
5. State how the heat is produced in this process - 3 Marks

Guidelines for Constructing structured Essay Questions


• Decide on a structured essay type question only when an objective item is not suitable
• Consider carefully the content and behaviour before deciding on the wording of the
question
• Structure a question by breaking the open-ended aspect of an essay question into a series
of tasks to be performed, making them clear, unambiguous and specific. All students and
teachers must interpret the questions in the same way.
• Relate the questions directly as much as possible to the learning outcomes being
measured and question must be a valid testing situation for the ability considered
• In a set of structured questions each part should be independent and complete in itself
and the correct answer to one part should not significantly influence the answer to the
other parts. At the same time, each of the parts should refer to the same situation and
contribute to the overall purpose of the item. This means that the sub-parts in a
structured essay question are not independent questions merely grouped under the same
main item; they are all parts of a bigger question, broken down for ease of answering
and to ensure that the responses are mutually related.
• Indicate the relative importance of each sub division in a set of structured essay
questions in terms of marks allotted. Arrange the sub divisions in order of increasing
difficulty.
• The tendency to include too many essay questions in a single test, as an attempt to overcome
the problem of limited sampling, should be avoided. For measuring achievement of
complex learning outcomes, it is better to use fewer questions and to improve the
sample by more frequent testing
• When the statement of the question doesn't impose fairly well defined limits, some
indication of the desired length of the answer should be given (for example, the answer
should not exceed 200 words, or the answer must be brief and to the point and should not
exceed one page in normal handwriting).
• Directions must be clearly given. Example: using a neatly drawn diagram describe the
working of a four-stroke engine
• To a large extent answers must be such that they are capable of objective assessment
• Avoid phrases such as "discuss briefly", "explain in detail", "what do you mean by"
These terms don't clearly indicate to students what you expect from them. They can be
interpreted differently by different students.
• Avoid items such as "write what you know about ..."
• Clearly define and describe the behaviour that is being tested
• Use case studies, wherever possible, to test higher level abilities
• Ask for justification of reasoning to support an opinion

Suggested terms and situations for essay tests


Following are the types of situation that require higher order ability testing in the form of
structured essay questions
• Comparisons of two things on a single designated basis, distinguish, discriminate, and
differentiate
• Comparison of two things in general
• Decision for or against with reason
• Relationship involving cause and effect
• Explanation of the use
• Summary or inference from known principles and facts
• Analysis
• Illustrations with own examples
• Classification
• Application of rules or principles in new situations
• Discussion by multiple interpretation
• Statement of aim or purpose in the selection or organization of materials
• Criticism as to adequacy
• Criticism on correctness or relevance
• Outline methods and procedures
• Explain or define a problem
• Detailing observations and offering remarks
• Terms that could be used in framing the questions are- define, state, identify, quote,
recite, describe, show how, why, give reasons, prove, disprove, justify
Example:
Poor: Describe the operation and working principle of a silicon controlled rectifier 20 Marks
Better
a) Sketch SCR circuit 4 Marks
b) Draw the transistor equivalent circuit for the SCR 6 Marks
c) Derive an equation for the total current through the transistor equivalent circuit 5 Marks
d) Explain the valence action with the equation derived in (c) 5 Marks

Examples for Structured Essay Questions


Poor: Write an essay on colloids 20 Marks
Better
a) Define the two phases present in colloidal system. 4 Marks
b) Illustrate at least four types of colloids by naming the two phases present in each colloid 4 Marks
c) Describe any one method of preparing colloids with examples 5 Marks
d) Describe any one optical property and any one electrical property of colloids 6 Marks

Poor: Describe the manufacture of sulphuric acid by contact process 20 Marks


Better
a) Draw a neat flow diagram. 5 Marks
b) Write balanced equations for the chemical reactions that take place 3 Marks
c) Predict the optimum conditions for the yield of H2SO4 on the basis of the physico-chemical principles involved 8 Marks
d) What is Oleum and how is it prepared? 2 Marks
e) Why is the contact process preferable to the chamber process? 2 Marks
Poor: Write an essay on energy and its transformation 20 Marks
Better
Classify the different forms of energy 2 Marks
Distinguish energy from matter. 5 Marks
Distinguish the meaning of loss of energy and gain of energy on the basis of work 5 Marks
State the law of conservation of energy 2 Marks
Give four examples where one form of energy is converted into another 6 Marks
Numerical problems Solving
In examining students undergoing technical programmes the higher order and complex
learning outcomes could be tested by setting numerical problems. The problems should deal
with the application of formulae and laws to new situations, demanding the abilities of analysis,
synthesis and evaluation (decision making). In setting such questions one must avoid problems
which seek answers by mere substitution of numbers directly into a formula.

Common guidelines for teaching problem solving method for assessment


• Model a useful problem-solving method. Problem solving can be difficult and is
sometimes a lengthy procedure.
• Guide within a specific context. Teach problem-solving methods in the context in
which they will be used.
• Encourage the students to understand the problem. In order to solve problems,
students need to define the end goal. This step is crucial to successful learning of
problem-solving skills. If you succeed at helping students answer the questions “what?”
and “why?”, finding the answer to “how?” will be easier.
• Ask questions and make suggestions. Ask students to predict “what would happen if
…” or explain why something happened. This will help them to develop analytical and
deductive thinking skills.
• Link errors to misconceptions. Use errors as evidence of misconceptions, not
carelessness or random guessing.

Example:
"Write the quadratic equation" – expects only factual knowledge from the student.

"A ball is thrown straight up from 10 m above the ground with a velocity of 20 m/s. When will the ball hit the ground? (Ignore air resistance.)" – This question draws on conceptual and procedural knowledge: the student must model the situation with a quadratic equation and then solve it. The problem-solving method therefore brings out both the conceptual and the procedural knowledge of the student.
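
The arithmetic behind this question can be checked with a short computation. The sketch below is only an illustration and is not part of the original question paper: taking upward as positive, it models the height as h(t) = 10 + 20t - (1/2)gt^2 and solves h(t) = 0 with the quadratic formula, assuming g of about 9.8 m/s^2.

    import math

    g = 9.8
    a, b, c = -0.5 * g, 20.0, 10.0          # coefficients of a*t**2 + b*t + c = 0

    discriminant = b * b - 4 * a * c
    t1 = (-b + math.sqrt(discriminant)) / (2 * a)
    t2 = (-b - math.sqrt(discriminant)) / (2 * a)

    # only the positive root is physically meaningful; it is about 4.5 s
    print(max(t1, t2))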

Example:
In a packet switching network, packets are routed from source to destination along a
single path having two intermediate nodes. If the message size is 48 bytes and each packet
contains a header of 3 bytes, then what is the optimum packet size?
The above question draws out the conceptual knowledge behind packet switching networks. Solving the problem clearly demonstrates the student's understanding.
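
One common way to work out such a question is to model store-and-forward delay over the three links (source, two intermediate nodes, destination) and sweep the payload size. The short sketch below is an illustration under those assumptions only (transmission counted in byte-times, propagation and processing delays ignored); it is not the only acceptable model.

    MESSAGE = 48   # bytes of user data
    HEADER = 3     # bytes of header per packet
    HOPS = 3       # two intermediate nodes give three links

    def total_time(data_per_packet):
        # byte-times to deliver the whole message for a given payload size
        packets = -(-MESSAGE // data_per_packet)       # ceiling division
        packet_size = data_per_packet + HEADER
        # the first packet crosses all three hops; the rest pipeline behind it
        return HOPS * packet_size + (packets - 1) * packet_size

    best = min(range(1, MESSAGE + 1), key=total_time)
    print(best + HEADER, total_time(best))             # optimum packet size and its delay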

In software engineering, problem solving involves splitting large, complex goals into smaller, simpler ones, thinking about parallel solutions to each of them, abstracting the problem so that the same abstract solution can be applied to other problems of the same form, reusing existing solutions instead of re-inventing the wheel, and thinking in terms of data flow. This procedure gives the student a clear way to approach a problem in software engineering.
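
The decomposition idea can be illustrated with a tiny sketch. The example below is hypothetical (the functions and data are invented for illustration): a larger goal, reporting the average of valid sensor readings, is split into small single-purpose functions, and an existing library routine (statistics.mean) is reused instead of being re-implemented.

    from statistics import mean

    def is_valid(reading):
        # one small sub-problem: decide whether a single reading is usable
        return 0.0 <= reading <= 100.0

    def clean(readings):
        # another sub-problem: filter the raw data, reusing is_valid
        return [r for r in readings if is_valid(r)]

    def average_valid(readings):
        # compose the small solutions instead of writing one large routine
        return mean(clean(readings))

    print(average_valid([12.5, -3.0, 47.0, 250.0, 30.5]))    # prints 30.0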
True / False Items
True/False items give students two choices from which to select the correct answer. A statement is presented and students are asked to indicate whether the statement is true or false as stated. There are only two choices for the student. These choices may be altered to suit the needs of the test situation: Yes/No, Right/Wrong, Correct/Incorrect.
True/False items are most often used to test a student's ability to
• recognize the correctness of a statement
• identify relationships
• identify attitudes, values and beliefs
• identify cause and effect relationships
• identify facts
• identify a new situation where principles are applicable
This type of item is most suitable when
• there are only two alternatives which are plausible
• there is only one correct response to the question
• a large amount of content needs to be tested
• reduction of reading on the test is important
• easy scoring is desired
True/False items are also called the constant alternative response type.
Example
A ceiling fan in a room will push warm air downward
Narcotics are painkillers
Insects can be characterized by their three distinct body parts
Construction
The item should include only one central and significant idea.
Example:
The second method that can be used to determine the difficulty of a question is to run a test
analysis programme on test questions.
• This question has two ideas contained in the statement. A student may be able to choose
'yes' or 'no' to both ideas. It may be 'yes' for one idea and 'no' for the other idea or vice
versa. It may be true for both ideas.
• The statement must be precise so that it can be judged as absolutely true or absolutely
false.
Flowers bloom in the springtime.
This statement is partly true in that many flowers do bloom in the spring but flowers also bloom
in other seasons. The statement is neither true nor false. It is partly true and partly false.
• The statements are to be short and written in simple language.
An individual with blood type AB negative may receive blood from any other individual
because there are no antigens in the recipient's blood to cause a reaction with antibodies that
may be in the donor's blood.
This statement is a very long one. It requires considerable time for students to decipher the
meaning of this question. It would be better to ask first if a person with AB negative blood can
receive blood from anyone else. Then ask about the presence of antigens and antibodies.

Advantages of True- False items


• the widest sampling of content or objectives per unit of testing time
• scoring efficiency and accuracy
• versatility in measuring all levels of cognitive ability
• an objective measurement of student achievement or ability
Disadvantages of True -False items
• incorporate an extremely high guessing factor. For simple true-false items, each student has a 50/50 chance of correctly answering the item without any knowledge of the item's content (a worked illustration follows this list)
• can often lead a teacher to write unclear items due to the difficulty of constructing items
which are unequivocally true or false
• don't discriminate between students of varying ability as well as other item types
• can often include more irrelevant clues than do other item types
• can often lead a teacher to favour testing of trivial knowledge
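
The size of this guessing factor can be made concrete with a small binomial calculation. The sketch below is only an illustration (the 20-item test and the 60% pass mark are assumed figures, not taken from the text): with a 50/50 chance per item, it computes the probability that a student answering entirely at random still reaches 12 or more correct out of 20.

    from math import comb

    ITEMS, P = 20, 0.5
    PASS_MARK = 12                        # 60% of 20 items

    prob_pass = sum(comb(ITEMS, k) * P**k * (1 - P)**(ITEMS - k)
                    for k in range(PASS_MARK, ITEMS + 1))
    print(round(prob_pass, 3))            # about 0.252, roughly one chance in four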
Matching Type Items
The matching item is simply a modification of the multiple-choice form. Instead of the possible responses being listed underneath each individual stem, they are listed in two columns. Each statement or word in one column is matched to a suitable phrase or word in the second column. A matching item gives the student two lists: one list is a guide and the other is a list from which to choose the answer. Students have to choose an answer for each item in Column A from those given in Column B. Column A items are called premises; Column B items are called responses.
Matching type items are useful when testing a student's ability to
• identify relationships between two elements/ideas such as parts and their uses, parts and their location, words and definitions, principles and applications, symbols and meanings, etc.
• classify items into categories such as examples and classifications, process and product
• recall knowledge of facts
• apply principles
• analyse a design
In the matching type, the list of premises is given on the left and the list of responses on the right. Each item in Column B must be a response to one of the items in Column A. However, one or two extra items are added to the list in Column B to minimize guessing. If four items are given in both columns, all that the student is required to do is match any three items; the fourth is matched automatically. To avoid this situation one or two additional items are added in Column B.
Construction
• The items should include only homogeneous ones. Non-homogeneous material will
distract the student.
Example:
Column A Column B
(Type of transducers) (Quantities measured)
1. Strain gauge a. Elongation
2. Thermocouple b. Flow
3. Tacho generator c. Temperature
4. Differential Transformer d. Angular velocity
e. Level
The items in Column A are all homogeneous as they are all transducers. Those in Column B are the quantities these transducers measure.
The following example violates the rule
Column A Column B
1. Indian independence a. Bangalore
2. The first Woman Prime Minister b. Nainital
3. Garden City c. 1947
4. Hill Station d. Indira Gandhi
e. Vijayalakshmi Pandit
• The number of responses should be sufficiently large so that even the last premise still has several options to choose from.
• Specify in the directions the basis for matching and indicate that each response may be used
once or more than once or not at all, as the case may be.
Example
Column A consists of list of characteristics of objective type test items. Column B has the
names of the test. A bracket ( ) is given before every item of Column A. Choose the response
from Column B for every item of Column A and write the serial number of the response in the
brackets. Each response in Column B may be used once, more than once or not at all.

Column A                                              Column B
1. Best for measuring computational skill             a. Matching item
2. Least useful for educational diagnosis             b. Multiple Choice
3. Most difficult to score objectively                c. True-False item
4. Provides high scores by guessing alone             d. Short answer item
5. Measures greater variety of learning outcomes
6. Measures learning at recall level

• Give very clear instructions about how students must write the answers to each item, where
they are to mark their answers.
• The acceptable format for numbering matching questions is to place numbers in front of the premises on the left and letters in front of the responses on the right.
• Keep the lists as short as possible
• Arrange the lists in a logical order. If dates are used it is preferable to put them in a
chronological order.
• Use consistent numbering for both columns: items in Column A may be numbered while those in Column B are given letters.

Advantages of Matching type items


• require short periods of reading and response time, allowing you to cover more
content
• provide objective measurement of student achievement or ability
• provide highly reliable test scores
• provide scoring efficiency and accuracy

Limitations of Matching type items


• have difficulty measuring learning objectives requiring more than simple recall
of information
• are difficult to construct due to the problem of selecting a common set of stimuli
and responses
Multiple Choice Test Items
Multiple-choice questions (MCQs) are among the most widely used item types for assessment. They are also known to be quite difficult to construct. In a multiple-choice item, the student is required to select the correct answer for a question from a group of several alternatives.

An example:
The transfer of heat in a steel bar from one end to the other end is by
a) Conduction
b) Convection
c) Radiation
d) Fusion

In the above example, "the transfer of heat from one end of the steel bar to the other end" is the main question. It appears at the top of the item and is the question to which the student must select the correct answer. This statement or question is called the Stem. The stem can be either a direct question or an incomplete sentence, and it acts as a stimulus to evoke the correct response from students. The alternatives provided as possible answers are called Options. In the example four options are given, and the student has to choose the correct answer from among them; there may be four or even five options. In the example, the entries at a, b, c and d are the options.
The correct answer is called the Key; in the example, option (a) is the key. The options other than the correct answer are called Distracters; options b, c and d are the distracters.

Construction of a Multiple-Choice Item


The stem must be a direct question or an incomplete statement. If it is a statement it must imply
a question
There must be one and only one correct answer.
Example:
Ex1 :What is the mode of transfer of heat from one end of a steel bar to the other end?
Ex2 :The heat from one end of the steel bar to the other end is transmitted through the process
of
• Distracters must be plausible. They must act as distracters for higher ability
students and attractors for lower ability students.
• The item must test important information and not too trivial ones.
• Use four or five options.
• Stem must be concise and unambiguous, avoiding negatives. If negatives are
unavoidable these must be emphasized.

Example:
Voltage drop in a resistor is NOT proportional to
a) current
b) resistance
c) power dissipation
d) physical dimensions or size
(Notice NOT, the negative. This is given in capital to emphasize. It may also be underlined).
Stem must be a complete question by itself not requiring the student to read the options in order
to discover what is being asked
Example:
When two resistors of value 10 ohms and 30 ohms are connected in series, the net resistance
value will be
a) 3 ohms
b) 20 ohms
c) 40 ohms
d) 300 ohms
In this item, the student can work out the answer without referring to the options, since the stem is a complete question by itself.
Content of the question must be made clear to avoid confusion. State the stem of the item in a
simple clear sentence. Use simple language so that students understand the statement without
much difficulty
Example:
Poor construction
The paucity of plausible, but incorrect statement that can be related to a central idea poses a
problem when constructing which one of the following types of test items?
a) Short answer
b) True- False
c) Multiple choice
d) Essay
Better constructed item
The lack of plausible but incorrect alternatives will cause the greatest difficulty when
constructing
a) Short answer question
b) True-False
c) Multiple Choice item
d) Essay
Put as much of the wording as possible in the stem of the item; anything that would otherwise need repeating in each option should be included in the stem.
Example:
In objective testing the term objective
a) refers to the method of identifying the learning outcomes
b) refers to the method of selecting the test content
c) refers to the method of presenting the problem
d) refers to the method of scoring the answers.

The phrase 'refers to the method' repeats itself in all four options. It must be moved to the stem, which should then read 'In objective testing, the term objective refers to the method of ...'.
The options must be closely related to the stem.

Example:
The property of a circuit that tends to oppose a change in current is called

a) Conductance
b) Voltage
c) temperature
d) Inductance
In the above example b and c are not properties of a circuit. These are not good options. Better
options would be to replace b and c by
b) capacitance
c) resistance
The options should be parallel in structure, i.e. they should fit grammatically with the stem. Grammatical consistency of all options is very important.

Example:
The station where an aircraft is taken for repairs is called an
a) apron
b) hangar
c) tower
d) workshop
In this example only one option fits the grammatical structure of the stem. To improve it, the stem may be reworded (for example, ending with '... is called') so that all the options fit grammatically.
The item must not contain clues to the student, such as a combination of singulars and plurals in the options.
Example:
The direction of propagation of an electromagnetic wave in free space is
a) along the electric field
b) along the magnetic field
c) in the plane of the electric and magnetic fields
d) perpendicular to the surface containing the two fields
In the above example the precision and length of the key option d make it stand out from the rest. To avoid this, the phrasing in d should be matched in length and detail in each of the other options.
Example:
An ion is
a) a charged particle
b) an atom which has gained or lost electrons
c) a neutral particle
d) formed in electrolytes
Here the stem is vague and three of the options given are acceptable.
Distracters must be incorrect yet likely to be plausible to weaker students. This means that the
distracters must be believable.
Example:
Waste and overflow fittings for a bathtub are installed
a) before the bathtub is set in place
b) after the bathtub is set in place
c) at the same time as the trap
d) none of the above.
It is unlikely that any student would choose d as the answer, particularly since all the other options are likely alternatives.
Another example:
A person invested Rs 500 in a business. He sold goods worth Rs 550 in this business. The %
profit he got was
a) Rs.50
b) Rs.10
c) 50%
d) Rs 550

The options a and c are not suitable options


The option "all ofthese " should never be used. "None ofthese " should be avoided Ifthis is not
at all possible then it should be sometimes being the key. In such a case it should be the exact
correct answer.
What instruments can be used to find the mid-point of a 9 inch steel bar?
a) calipers
b) Straightedge
c) protractor
d) all of the above
Another example:
Which of the following acts as a positive reinforcer for a polytechnic student?
a) money
b) smile by teacher
c) Scholarship
d) None of the above
Avoid similarity of wording in both the stem and correct answer.
Example:
The amplifier tube was widely used for
a) amplification of voltage
b) regulation of voltage
c) rectification of voltage
d) regulation of current
In this case even without any understanding of the subject anybody can choose a as the correct
response.
Don't include two responses that have the same meaning.
Example:
The normal room temperature in South India during the summer months is
a) 30°C to 40°C
b) 30°C to
c) 1 to 20°C
d) None of the above
Here option a is superfluous, because option b includes the range in option a as well.
Each item should be written around a single principle
Which of the following statements is true?
a) wood sinks in mercury
b) All metals except mica are good conductors
c) Thermal expansion of gas is more than that of a liquid
d) Oscillating sander uses only fine grade abrasives.
This type of question should never be given. This has four different ideas.
Items should not be set which require the recall of trivial and unimportant facts.
Example:
In which year did India become independent?
a) 1945
b) 1946
c) 1947
d) 1948

The items should not be very lengthy or involve lengthy calculations.
Example:
What is the equivalent resistance of a 330 K ohm and a 100 K ohm resistor connected in parallel?
a) 76.74 K ohms
b) 82.05 K ohms
c) 120 K ohms
d) 430 K ohms
The student has to work through lengthy calculations to arrive at the correct answer. The item should test the understanding of the principle of resistors in parallel; it is not expected to test the ability to calculate. The item may be reworded suitably.

The level of information required to reject wrong responses should not be higher than that required to select the correct response.
Coulomb is the unit of measurement of
a) inductive reactance
b) electric charge
c) band width
d) trans conductance

Here, to reject options a, c and d a higher level of information is required than to select the key. Hence a, c and d are poor distracters. The options for the item may be rewritten to suit the level of learning under test. The options may be modified as under:

a) resistance
b) charge
c) power
d) potential difference
Advantages of multiple-choice items
• versatility in measuring all levels of cognitive ability
• highly reliable test scores
• scoring efficiency and accuracy
• objective measurement of student achievement or ability
• a wide sampling of content or objectives
• a reduced guessing factor when compared to True-False items
• different response alternatives which can provide diagnostic feedback

Limitations of Multiple-choice items


• are difficult and time consuming to construct
• lead a teacher to favor simple recall of facts
• places a high degree of dependence on students' reading ability and teacher's
writing ability
Pre-validation of Items
Well prepared and properly used items and questions can be expected to serve the purposes for which they are used. Teachers have to take care in writing the items so that they are of good quality. One quick and easy way of checking the quality is to use a checklist; checklists for each type of test item are given below. A more sophisticated method of checking the quality of items is to subject them to item analysis.

GENERAL FOR ALL TYPES


• Is the item measuring an important learning outcome?
• Is the item measuring an important content area?
• Is the level of difficulty likely to be reasonable?
• Is the item likely to be answered correctly by higher ability students?
• Is the item independent or does it overlap with other items?

SPECIFIC
Constant Alternative type
1. Does the item include only one significant idea in each statement?
2. Is the statement so precise that it can be judged unequivocally true or false?
3. Is the statement short and in simple language?
4. Does the item use negative statements sparingly and avoid double negatives?

Multiple Choice
1. Is the stem concise and unambiguous? Is the negative(if unavoidable) emphasized?
2. Is the stem a complete question by itself? Does the item require the student to read the
options to discover what is being asked?
3. Is the content of the question clear?
4. Does the stem include anything that needs to be repeated in every option, within itself?
5. Are the options parallel in content?
6. Are the options parallel in structure?
7. Is the item devoid of any clues such as mix up of singular, plural, precision and length
of key option etc.?
8. Is the key option unarguably correct?
9. Are the distracters plausible?
10. Does the item exclude "all of these"?
11. Is the language used in the item appropriate to the vocabulary of students at this level?
12. Does the item avoid similarity of wording in both stem and the correct answer?
13. Does the item exclude responses that are "all inclusive"?
14. Does the item use an efficient format?

Matching type
1. Does the item include only homogeneous material in the premises?
2. Is the number of responses sufficiently large so that the last of the premises still has many options to choose from?
3. Does the item specify the basis of matching, type of matching, kind of entry etc?

Long answer type questions


• In answering this question, in your opinion, does the student need to organise his ideas and choose the form of his answer in his own words?
• Does the situation presented in the question seem to be new to most of the students?
• Is it possible that students can produce memorized answer to this question?
• Does answering this question involve any judgment on the part of the students?
• Is the time limit reasonable?
• Is the length and scope of the answer specified?
• Does it avoid usage of very open verbs?
Design of Question Paper

It is generally agreed that teachers need to evaluate the work of their students and
assess all aspects of their teaching to enhance students’ learning and improve their own
performance. Assessment includes collecting, judging and interpreting information about
students’ performance. It is not a separate add-on activity but an integral part of the learning
and teaching process. Its purpose is to provide reliable information and feedback to improve
and enhance the quality of learning and teaching. Suitable assessment enables
• students to understand their abilities and hence improve their ways of learning;
• teachers to understand the performance of their students so that suitable and timely
measures can be provided; and
• parents to understand the performance of their wards so that they can, in collaboration
with teachers, provide suitable support to help the learning of their wards.

Different modes of assessment serve for different purposes. Assessment for learning,
which is usually formative, focuses on the learning process and learning progress.
Assessment of learning, which is usually summative, focuses on the product of learning. As both the learning process and the product are important, both modes are needed. In summative assessment, faculty members frequently receive the following complaints from students:

• "Too lengthy a paper ....... to write" (Theory Examinations)
• "Time was not enough"
• "All questions from specific titles only! No question from ......" (Theory Examinations)
• "Questions were too vague. What to write? What to cut?" (Theory Examinations)
• "Long questions were bouncers! I have not been taught these" (Theory Examinations)

This happens because the examiner/teacher imparts instruction according to what he/she thinks is appropriate or important. The intended learning outcomes are not stated clearly and are therefore overlooked. Students get confused, as they are unaware of what is actually expected of them, and they suffer. Blueprinting in assessment can overcome these issues, if not completely then to a large extent, and hence make the assessment more valid. A blueprint is a map and a specification for an assessment programme which ensures that all aspects of the curriculum and educational domains are covered by assessment programmes over a specified period of time. It is a two-dimensional chart which shows the placement of each question in respect of the objective and the content area that it tests. In simple terms, the blueprint links assessment to learning objectives. It also indicates the marks carried by each question. It is useful to prepare a blueprint so that the test maker knows which question will test which objective and which content unit, and how many marks it will carry. The blueprint concretizes the design in operational terms, and all the dimensions of a question (i.e. its objective, its form, the content area it covers and the marks allotted to it) become clear to the test maker. The blueprint is also called a Table of Specifications (ToS).

The purpose of a Table of Specifications is to identify the achievement domains being


measured and to ensure that a fair and representative sample of questions appear on the test.
Teachers cannot measure every topic or objective and cannot ask every question they might
wish to ask. A Table of Specifications allows the teacher to construct a test which focuses on
the key areas and weights those different areas based on their importance. A Table of
Specifications provides the teacher with evidence that a test has content validity, that it
covers what should be covered.
Designing a Table of Specifications
Tables of Specification typically are designed based on the list of course objectives,
the topics covered in class, the amount of time spent on those topics, textbook chapter topics,
and the emphasis and space provided in the text. In some cases a great weight will be
assigned to a concept that is extremely important, even if relatively little class time was spent
on the topic. Three steps are involved in creating a Table of Specifications: 1) choosing the
measurement goals and domain to be covered, 2) breaking the domain into key or fairly
independent parts- concepts, terms, procedures, applications, and 3) constructing the table.
Teachers have already made decisions about the broad areas that should be taught, so the
choice of what broad domains a test should cover has usually already been made. A bit
trickier is to outline the subject matter into smaller components, but most teachers have
already had to design teaching plans, strategies, and schedules based on an outline of content.
Lists of classroom objectives, curriculum guidelines, and textbook sections, and keywords are
other commonly used sources for identifying categories for Tables of Specification. When
actually constructing the table, teachers may only wish to use a simple structure, as with the
first example above, or they may be interested in greater detail about the types of items, the
cognitive levels for items, the best mix of objectively scored items, open-ended and
constructed-response items, and so on.
A Table of Specifications benefits students in two ways. First, it improves the validity of teacher-made tests. Second, it can improve student learning as well.
A Table of Specifications helps to ensure that there is a match between what is taught and what is tested. Classroom assessment should be driven by classroom teaching, which itself is driven by course goals and objectives; Tables of Specifications provide the link between teaching and testing. Tables of Specifications can help students at all ability
levels learn better. By providing the table to students during instruction, students can
recognize the main ideas, key skills, and the relationships among concepts more easily. The
Table of Specifications can act in the same way as a concept map to analyze content areas.
Teachers can even collaborate with students on the construction of the Table of
Specifications- what are the main ideas and topics, what emphasis should be placed on each
topic, what should be on the test? Open discussion and negotiation of these issues can
encourage higher levels of understanding while also modeling good learning and study skills.

For example, the following table is the ToS for Computer Communication and Networks.

Step 1:
Define the following:
1. The type of things the student should be able to do (i.e. ABILITIES)
2. The subject matter in which he should be able to do them (i.e. CONTENT)
A Table of Specifications is a two-way chart which relates CONTENT and ABILITIES by assigning suitable weightages for testing purposes.

Programme : B. Tech – Information Technology
Semester : V Semester
Subject : Computer Communication and Networks (Code Number : XXXXX)

Content (No. and Name of the Unit) is listed down the rows; the abilities run across the columns:
Remember (Recognize, Recall) | Understand (2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7) | Apply (3.1, 3.2) | Higher order abilities | Total

Grand total of marks: 100

Step 2:

Assigning weightage to the various units of the content

Factors to be considered:

1. Number of periods allotted for teaching the unit


2. Usefulness of the content matter of the unit in the student's job or everyday life
3. Usefulness of the content matter of the unit in understanding other units of the same
subject
4. Usefulness of the content matter of the unit in understanding other subjects prescribed
for the programme

According to step 2,

Weightage for abilities and content need to be defined.

CATEGORY OF ABILITY          WEIGHTAGE
Remember (R)                 20%
Understand (U)               60%
Apply (Ap)                   20%
Higher order abilities        0%
Total                        100%

Unit / Module No     Unit Name                                               Weightage in Percentage
Unit / Module 1      Network Design                                          18%
Unit / Module 2      LAN Access Methods and Standards                        15%
Unit / Module 3      Packet Switching Networks                               18%
Unit / Module 4      TCP / IP Architecture                                   22%
Unit / Module 5      Advanced Network Architecture and Security Protocols    27%

Content (No. and Name of the Unit), with the abilities across the columns as above
(Remember: Recognize, Recall | Understand: 2.1 to 2.7 | Apply: 3.1, 3.2 | Higher order abilities):

Network Design                                           5 2 0 0 0 0 2 4 0 0       Total 18
LAN Access methods and Standards                         0 3 0 0 0 4 4 0 4 0 0     Total 15
Packet Switching Networks                                0 3 0 0 0 0 2 4 4 5 0     Total 18
TCP / IP Architecture                                    0 2 0                     Total 22
Advanced Network Architecture and Security Protocols                               Total 27
Column totals: Remember 20, Understand 60, Apply 20, Higher order 0; Grand total 100

The ability and content details are entered in the above table. After entering the details, you have to check each cell in which a question can be designed. Finally, the two-dimensional table is filled with values (a simple consistency check is sketched after the completed table below).

Content (No. and Name of the Unit)                      Recognize Recall | 2.1 2.2 2.3 2.4 2.5 2.6 2.7 | 3.1 3.2 | Higher order | Total
Network Design                                              5      5        2   0   0   0   0   2   4     0   0        0          18
LAN Access methods and Standards                            0      3        0   0   0   4   4   0   4     0   0        0          15
Packet Switching Networks                                   0      3        0   0   0   0   2   4   4     5   0        0          18
TCP / IP Architecture                                       0      2        0   0   0   0   0   3  12     5   0        0          22
Advanced Network Architecture and Security Protocols       0      2        0   0   0   0   0   5  10     5   5        0          27
Total                                                      20 (Remember)   60 (Understand)               20 (Apply)    0         100
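
A quick consistency check of the blueprint can be carried out before the paper is set: every row must add up to the unit's content weightage, every ability column group to its ability weightage, and the grand total to 100 marks. The sketch below is only an illustration of such a check using the figures from the table above; it is not part of any prescribed procedure.

    content = {                     # unit name -> marks allotted to the unit
        "Network Design": 18,
        "LAN Access methods and Standards": 15,
        "Packet Switching Networks": 18,
        "TCP/IP Architecture": 22,
        "Advanced Network Architecture and Security Protocols": 27,
    }
    ability = {"Remember": 20, "Understand": 60, "Apply": 20, "Higher order": 0}

    assert sum(content.values()) == 100, "content weightages must total 100 marks"
    assert sum(ability.values()) == 100, "ability weightages must total 100 marks"

    # marks entered cell by cell for one row of the table above
    network_design = {"Remember": 5 + 5, "Understand": 2 + 2 + 4, "Apply": 0, "Higher order": 0}
    assert sum(network_design.values()) == content["Network Design"]
    print("blueprint totals are consistent")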

Once the values are finalized, the design of the question paper can start. The cornerstone of classroom assessment practice is the validity of the judgments about students' learning and knowledge (Wolming & Wilkstrom, 2010). A ToS is one tool that teachers can use to support their professional judgment when creating or selecting tests for use with their students. The ToS can be used in conjunction with lesson and unit planning to help teachers make clear the connections between planning, instruction, and assessment.
Performance Assessment

Student performance in the laboratory, the classroom, industrial training, assignment work, the workshop and so on is very important for learning the subject actively. Measuring learning in terms of performance is just as important as the regular examinations we conduct for assessment.

Figure – 1. Performance Assessment


Educational experts have prepared some common guidelines for measuring student performance: selecting appropriate performance tasks, developing clear instructions for students, developing procedures for evaluating students' performance, and implementing procedures to minimize rating errors.

Selecting Appropriate Performance Tasks


A performance task is an assessment activity that requires a student to produce a written response or to create a product (Nitko, 2001). Here are some factors that should be considered.

Select performance tasks that


• Provide the most direct assessment of the educational objectives you want to measure.
• Maximize your ability to generalize the results of the assessment.
• Reflect essential skills.
• Encompass more than one learning objective.
• Focus your evaluation on the processes and/or products you are most interested in.
• Provide the desired degree of realism.
• Measure skills that are "teachable".
• Are fair to all students.
• Can be assessed given the time and resources available.
• Can be scored in a reliable manner.
• Reflect educational objectives that cannot be measured using more traditional measures.

Developing Instructions
Because performance tasks often require fairly complex student responses, it is important that your instructions precisely specify the types of responses you are expecting. Because originality and creativity are seen as desirable educational outcomes, performance tasks often give students considerable freedom in how they approach the task. It is the teacher's responsibility to write instructions clearly and precisely so that students do not need to "read the teacher's mind" to know what is expected of them.
Here is a list of questions that assessment professionals recommend you consider when evaluating the quality of your instructions (e.g., Nitko, 2001):

• Do your instructions match the educational level of your students?


• Do your instructions contain unnecessary jargon and overly technical language?
• Do your instructions clearly specify the purpose or goal of the task?
• Do your instructions clearly specify the type of response you expect?
• Do your instructions specify all the important parameters of the performance task (e.g.,
time limits, the use of equipment or materials)?
• Do your instructions clearly specify the criteria you will use when evaluating the
student responses?
• Will students from diverse cultural and ethnic backgrounds interpret the instructions in
an accurate manner?

Developing Procedures for Evaluating Responses


Whether the teacher is evaluating the process, the product, or a combination of the two, it is imperative to develop systematic, objective, and reliable procedures for evaluating student responses.
A rubric is a scoring guide used to evaluate a performance, a product, or a process, or a combination of these. It consists of three components:
1. Performance Criteria
2. Rating Scale
3. Indicators


Figure -2 – Parts of the Rubric

A rubric is simply a written guide that helps you score constructed-response assessments. In
discussing the development of scoring rubrics for performance assessments, Popham (1999)
identified three essential tasks that need to be completed, discussed in the following
paragraphs.

Select important criteria that will be considered when evaluating student responses.
Start by selecting the criteria or response characteristics that you will employ when
judging the quality of a student's response. The criteria you are considering when judging the
quality of a student's response should be described in a precise manner so that there is no
confusion about what the rating refers to. It is also highly desirable to select criteria that can be
directly observed and judged. Characteristics such as interest, attitude, and effort are not
directly observable and do not make good bases for evaluation.

Specify explicit standards that describe different levels of performance.


For each criterion to be evaluated, it is necessary to develop clearly stated standards
that distinguish among levels of performance. In other words, the standards should spell out
what a student's response must encompass or look like to be regarded as excellent, average, or
inferior. It is often helpful to provide behavioral descriptions and/or specimens or examples to
illustrate the different levels of performance.

Determine what type of scoring procedure you will use.


Scoring rubrics can be classified as either holistic or analytic. With analytic scoring
rubrics the teacher awards credit on a criterion-by-criterion basis whereas with holistic rubrics
the teacher assigns a single score reflecting the overall quality of the student's response.
Analytic scoring rubrics have the advantage of providing specific feedback to students
regarding the strengths and weaknesses of their response. This informs students which aspects
of their responses were adequate and which need improvement. The major limitation of
analytic rubrics is that they can take considerable time to complete. Holistic rubrics are often
less detailed than analytic rubrics and as a result are easier to develop and complete. Their
major disadvantage is that they do not provide specific feedback to students about the strengths
and weaknesses of their responses.
Most experts suggest that including more than seven positions is not useful because raters
usually cannot make finer discriminations than this.
What Are the Parts of a Rubric?
Rubrics are composed of four basic parts in which the professor sets out the parameters of
the assignment. The parties and processes involved in making a rubric can and should vary
tremendously, but the basic format remains the same. In its simplest form, the rubric includes
a task description (the assignment), a scale of some sort (levels of achievement, possibly in
the form of grades),
Title
Task Description
Scale level 1 Scale level 2 Scale level 3

Criterion 1
Criterion 2
Criterion 3
Criterion 4

Figure 1.1 Basic rubric grid format.

the criteria of the assignment (a breakdown of the skills/knowledge involved in the


assignment), and descriptions of what constitutes each level of performance (specific
feedback) all set out on a grid, as shown in Figure 1.1.

The rubric grid can be created with a simple Microsoft Word table, for example using the "elegant" format found in the "AutoFormat" section. This sample grid shows three scale levels and four criteria. This is the most common kind of rubric, but it can be extended, with suitably labelled levels, to a maximum of five scale levels and six to seven criteria. This document looks at the four component parts of the rubric and, using an oral presentation assignment as an example, develops the above grid part by part until it is a useful grading tool (a usable rubric) for the professor and a clear indication of expectations and actual performance for the student.
Part-by-Part Development of a Rubric
Part 1: Task Description
The task description is almost always originally framed by the faculty member and involves a
“performance” of some sort by the student. The task can take the form of a specific assignment, such as laboratory work, a paper, a poster, an assignment, or a presentation. The
task can also apply to overall behavior, such as participation, use of proper lab protocols, and
behavioural expectations in the classroom. It is necessary to place the task description, usually
cut and pasted from the syllabus, at the top of the grading rubric, partly to remind ourselves
how the assignment was written as we grade, and to have a handy reference later on when
we may decide to reuse the same rubric.

Task Description: Each student will make a 5-minute presentation on an installation and
configuration of the Web Server for web Technologies Lab. The presentation should include
appropriate photographs, presentations, maps, graphs, simulations, and other visual aids
for the audience.

Scale level 1 Scale level 2 Scale level 3

Criterion 1
Criterion 2
Criterion 3
Criterion 4

Figure 1.2 Part 1: Task description.

More important, however, we find that the task assignment grabs the students’ attention in
a way nothing else can, when placed at the top of what they know will be a grading tool. With
the added reference to their grades, the task assignment and the rubric criteria become more
immediate to students and are more carefully read. Students focus on grades. Sad, but true.
We might as well take advantage of it to communicate our expectations as clearly as possible.
If the assignment is too long to be included in its entirety on the rubric, or if there is some
other reason for not including it there, we put the title of the full assignment at the top of the
rubric: for example, “Rubric for Oral Presentation.” This will at least remind the students that
there is a full description elsewhere, and it will facilitate later reference and analysis for the
professor. Sometimes we go further and add the words “see syllabus” or “see handout.”
Another possibility is to put the larger task description along the side of the rubric. For reading
and grading ease, rubrics should seldom, if ever, be more than one page long. Most rubrics
will contain both a descriptive title and a task description. Figure 1.2 illustrates Part 1 of the
sample rubric with the title and task description highlighted.

Part 2: Scale
The scale describes how well or poorly any given task has been performed and occupies yet
another side of the grid to complete the rubric’s evaluative goal. Terms used to describe the
level of performance should be tactful but clear. In the generic rubric, words such as
“mastery,” “partial mastery,” “progressing,” and “emerging” provide a more positive, active,
verb description of what is expected next from the student and also mitigate the potential
shock of low marks in the lowest levels of the scale. Some professors may prefer to use non-
judgmental, non-competitive language, such as “high level,” “middle level,” and “beginning
level,” whereas others prefer numbers or even grades.
Here are some commonly used labels compiled by Huba and Freed (2000):
• Sophisticated, competent, partly competent, not yet competent (NSF Synthesis
Engineering Education Coalition, 1997)
• Exemplary, proficient, marginal, unacceptable
• Advanced, intermediate high, intermediate, novice (American Council of Teachers of
Foreign Languages, 1986, p. 278)
• distinguished, proficient, intermediate, novice (Gotcher, 1997):
• accomplished, average, developing, beginning (College of Education, 1997)
We almost always confine ourselves to three levels of performance when we first construct a
rubric. After the rubric has been used on a real assignment, we often expand that to five. It is
much easier to refine the descriptions of the assignment and create more levels after seeing
what our students actually do. Figure 1.3 presents the Part 2 version of our rubric where the
scale has been highlighted. There is no set formula for the number of levels a rubric scale
should have. Most professors prefer to clearly describe the performances at three or even
five levels using a scale. But five levels is enough. The more levels there are, the more difficult
it becomes to differentiate between them and to articulate precisely why one student’s work
falls into the scale level it does. On the other hand, more specific levels make the task clearer
for the student and they reduce the professor’s time needed to furnish detailed grading notes.
Most professors consider three to be the optimum number of levels on a rubric scale.

Task Description: Each student will make a 5-minute presentation on an installation and
configuration of the Web Server for web Technologies Lab. The presentation should include
appropriate photographs, presentations, maps, graphs, simulations, and other visual aids
for the audience.

Scale level 1 Scale level 2 Scale level 3

Criteria 1
Criteria 2
Criteria 3
Criteria 4
Figure 1.3 Part 2: Scales.

If a faculty member chooses to describe only one level, the rubric is called a holistic rubric or a scoring
guide rubric. It usually contains a description of the highest level of performance expected for
each criterion, followed by room for scoring and describing in a “Comments” column just how
far the student has come toward achieving or not achieving that level. Scoring guide rubrics,
however, usually require considerable additional explanation in the form of written notes and
so are more time-consuming than grading with a three-to five-level rubric.

Part 3: Criteria
The criteria of a rubric lay out the parts of the task simply and completely. A rubric can also
clarify for students how their task can be broken down into components and which of those
components are most important. Is it the grammar? The analysis? The factual content? The
research techniques? And how much weight is given to each of these aspects of the
assignment? Although it is not necessary to weight the different criteria differently, adding
points or percentages to each criterion further emphasizes the relative importance of each
aspect of the task. Criteria should actually represent the type of component skills students
must combine in a successful scholarly work, such as the need for a firm grasp of content,
technique, citation, examples, analysis, and a use of language appropriate to the occasion.
When well done, the criteria of a rubric (usually listed along one side of the rubric) will not
only outline these component skills, but after the work is graded, should provide a quick
overview of the student’s strengths and weaknesses in each criterion. Criterion need not and
should not include any description of the quality of the performance. “Organization,” for
example, is a common criterion, but not “Good Organization.” We leave the question of the
quality of student work within that criterion to the scale and the description of the criterion.
Breaking up the assignment into its distinct criteria leads to a kind of task analysis with the
components of the task clearly identified. Both students and faculty members find this useful.
It tells the student much more than a mere task assignment or a grade reflecting only the
finished product. Together with good descriptions, the criteria of a rubric provide detailed
feedback on specific parts of the assignment and how well or poorly those were carried out.
This is especially useful in assignments such as our oral presentation example in which many
different criteria come into play, as shown in Figure 1.4.

Part 4: Description of the criteria

Criteria alone are all-encompassing categories, so for each of the criteria, a rubric
should also contain at the very least a description of the highest level of performance in that
criterion. A rubric that contains only the description of the highest level of performance is
called a scoring guide rubric. Scoring guide rubrics allow for greater flexibility and the personal
touch, but the need to explain in writing where the student has failed to meet the highest
levels of performance does increase the time it takes to grade using scoring guide rubrics. For
most tasks, we prefer to use a rubric that contains at least three scales and a description of
the most common ways in which students fail to meet the highest level of expectations.
Task Description: Each student will make a 5-minute presentation on an installation and
configuration of the Web Server for web Technologies Lab. The presentation should include
appropriate photographs, presentations, maps, graphs, simulations, and other visual aids
for the audience.

Scale level 1 Scale level 2 Scale level 3

Knowledge/understanding
20%/20 points
Thinking/inquiry
30%/30 points
Communication
20%/20 points
Use of visual aids
20%/20 points
Presentation skills
10%/10 points
Analytic Rubric

Introduction
An analytic rubric articulates levels of performance for each criterion, allowing the instructor to assess student performance on each criterion separately. Using an analytic rubric, the instructor is thus able to provide specific feedback on several dimensions of an assignment (e.g., thesis, organization, mechanics, etc.) along specific levels of performance.
Advantages
• Provide useful feedback on areas of strength and weakness.
• Each criterion is evaluated specifically.
• Criteria can be weighted to reflect the relative importance of each dimension.
Disadvantages

• Takes more time to develop and apply than a holistic rubric.


• Raters may not arrive at the same score if each point for every criterion is not well
defined.

Analytic or holistic rubrics can be used depending on the teacher's purpose and the performance expected from students, for example in the assessment of students' writing. However, which of the two is more reliable is a matter of debate. Some researchers assert that the analytic rubric is more reliable than the holistic rubric (e.g. Elbow, 2000; Gunning, 2006).

Template - 1
5 Point Rating Scale

Need to
Criteria Excellent Very Good Good Satisfied
Improve

Criteria - 1 Indicators Indicators Indicators Indicators Indicators

Criteria - 2 Indicators Indicators Indicators Indicators Indicators

Criteria - 3 Indicators Indicators Indicators Indicators Indicators

Criteria – n Indicators Indicators Indicators Indicators Indicators


Template - 2

4 Point Rating Scale

Need to
Criteria Very Good Good Satisfied
Improve

Criteria - 1 Indicators Indicators Indicators Indicators

Criteria - 2 Indicators Indicators Indicators Indicators

Criteria - 3 Indicators Indicators Indicators Indicators

Criteria – n Indicators Indicators Indicators Indicators

Example for Workshop

Collaborative Work Skills: Mechanical Workshop

Teacher Name: Teacher Teacher

StudentName: ______________________________________

CATEGORY: Contributions
4: Routinely provides useful ideas when participating in the group and in classroom discussion. A definite leader who contributes a lot of effort.
3: Usually provides useful ideas when participating in the group and in classroom discussion. A strong group member who tries hard!
2: Sometimes provides useful ideas when participating in the group and in classroom discussion. A satisfactory group member who does what is required.
1: Rarely provides useful ideas when participating in the group and in classroom discussion. May refuse to participate.

CATEGORY: Quality of Work
4: Provides work of the highest quality.
3: Provides high quality work.
2: Provides work that occasionally needs to be checked/redone by other group members to ensure quality.
1: Provides work that usually needs to be checked/redone by others to ensure quality.

CATEGORY: Time-management
4: Routinely uses time well throughout the project to ensure things get done on time. Group does not have to adjust deadlines or work responsibilities because of this person's procrastination.
3: Usually uses time well throughout the project, but may have procrastinated on one thing. Group does not have to adjust deadlines or work responsibilities because of this person's procrastination.
2: Tends to procrastinate, but always gets things done by the deadlines. Group does not have to adjust deadlines or work responsibilities because of this person's procrastination.
1: Rarely gets things done by the deadlines AND group has to adjust deadlines or work responsibilities because of this person's inadequate time management.

CATEGORY: Problem-solving
4: Actively looks for and suggests solutions to problems.
3: Refines solutions suggested by others.
2: Does not suggest or refine solutions, but is willing to try out solutions suggested by others.
1: Does not try to solve problems or help others solve problems. Lets others do the work.

CATEGORY: Attitude
4: Never is publicly critical of the project or the work of others. Always has a positive attitude about the task(s).
3: Rarely is publicly critical of the project or the work of others. Often has a positive attitude about the task(s).
2: Occasionally is publicly critical of the project or the work of other members of the group. Usually has a positive attitude about the task(s).
1: Often is publicly critical of the project or the work of other members of the group. Often has a negative attitude about the task(s).

CATEGORY: Focus on the task
4: Consistently stays focused on the task and what needs to be done. Very self-directed.
3: Focuses on the task and what needs to be done most of the time. Other group members can count on this person.
2: Focuses on the task and what needs to be done some of the time. Other group members must sometimes nag, prod, and remind to keep this person on-task.
1: Rarely focuses on the task and what needs to be done. Lets others do the work.

CATEGORY: Working with Others
4: Almost always listens to, shares with, and supports the efforts of others. Tries to keep people working well together.
3: Usually listens to, shares with, and supports the efforts of others. Does not cause "waves" in the group.
2: Often listens to, shares with, and supports the efforts of others, but sometimes is not a good team member.
1: Rarely listens to, shares with, and supports the efforts of others. Often is not a good team player.

CATEGORY: Monitors Group Effectiveness
4: Routinely monitors the effectiveness of the group, and makes suggestions to make it more effective.
3: Routinely monitors the effectiveness of the group and works to make the group more effective.
2: Occasionally monitors the effectiveness of the group and works to make the group more effective.
1: Rarely monitors the effectiveness of the group and does not work to make it more effective.

When we assess the performance of a student, each criterion can be weighted equally or unequally. In the above example all the criteria are treated as equal, so every criterion carries the same importance. A student SSS obtains a score as follows.


In the rubric evaluation, the indicator selected in each category gives the score for student SSS. Summarised in a simple table:

Collaborative Work Skills : in Mechanical Workshop

Teacher Name: Class Teacher

Student Name: SSS

CATEGORY                        RATING
Contributions                      4
Quality of Work                    3
Time-management                    3
Problem-solving                    4
Attitude                           3
Focus on the task                  3
Working with Others                4
Monitors Group Effectiveness       3

Score = 4 + 3 + 3 + 4 + 3 + 3 + 4 + 3
      = 27 / 32
      = 84.38%
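As a minimal illustration (a Python sketch using the SSS ratings listed above; nothing here is prescribed by the rubric itself), the equal-weight total can be computed as follows:

# Equal-weight rubric scoring for student SSS (illustrative sketch).
ratings = {
    "Contributions": 4,
    "Quality of Work": 3,
    "Time-management": 3,
    "Problem-solving": 4,
    "Attitude": 3,
    "Focus on the task": 3,
    "Working with Others": 4,
    "Monitors Group Effectiveness": 3,
}

max_rating = 4                        # highest level on the rubric scale
total = sum(ratings.values())         # 27
maximum = max_rating * len(ratings)   # 32
percentage = 100 * total / maximum    # 84.375

print(f"Score = {total}/{maximum} = {percentage:.2f}%")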

When we consider all the criteria as equal, the score obtained is 84.38%. In some cases, however, we cannot consider all the criteria as equal, so we have to give a relative importance (weight) to each criterion.

For example, consider a rubric for a teacher's teaching competency, with the following criteria:

CATEGORY (each rated 4, 3, 2 or 1)
• Knowledge in the subject
• Communication Skills
• Dress Code
• Punctuality
• Humor
• Anecdotes in the class
• Friendliness
• Assignment Preparation
In the above table, it is not necessary to give equal importance to all the criteria. Assume that the faculty member would like to assign the relative importance as follows:

CATEGORY                      Importance
Knowledge in the subject      2 marks × indicator rating
Communication Skills          2 marks × indicator rating
Dress Code                    1 mark × indicator rating
Punctuality                   1 mark × indicator rating
Humor                         1 mark × indicator rating
Anecdotes in the class        2 marks × indicator rating
Friendliness                  0.5 mark × indicator rating
Assignment Preparation        2 marks × indicator rating

If a teacher is evaluated with the above relative importance, the score differs from normal scoring, in which every criterion is weighted equally.

Score for the teacher TTT:

CATEGORY                      RATING    Weighted Score
Knowledge in the subject        4        4 × 2 = 8
Communication Skills            3        3 × 2 = 6
Dress Code                      3        3 × 1 = 3
Punctuality                     4        4 × 1 = 4
Humor                           4        4 × 1 = 4
Anecdotes in the class          4        4 × 2 = 8
Friendliness                    3        3 × 0.5 = 1.5
Assignment Preparation          3        3 × 2 = 6

In the above sample, the more important competencies are given more weightage. A teacher who is merely punctual, friendly and humorous is not treated as equal to a teacher who is strong in knowledge of the subject, communication and the use of anecdotes in class.

Score for the teacher TTT
= 8 + 6 + 3 + 4 + 4 + 8 + 1.5 + 6
= 40.5 out of a maximum of 4 × (2 + 2 + 1 + 1 + 1 + 2 + 0.5 + 2) = 46
= 40.5 / 46
= 88%

With weighting, the performance is assessed more exactly; in addition, strengths and weaknesses are easily identified.
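A minimal Python sketch of the weighted calculation, assuming the weights and ratings given above for teacher TTT:

# Weighted rubric scoring for teacher TTT (illustrative sketch).
# Each entry: category -> (weight in marks, indicator rating on a 1-4 scale).
evaluation = {
    "Knowledge in the subject": (2, 4),
    "Communication Skills": (2, 3),
    "Dress Code": (1, 3),
    "Punctuality": (1, 4),
    "Humor": (1, 4),
    "Anecdotes in the class": (2, 4),
    "Friendliness": (0.5, 3),
    "Assignment Preparation": (2, 3),
}

max_rating = 4
total = sum(weight * rating for weight, rating in evaluation.values())   # 40.5
maximum = sum(weight * max_rating for weight, _ in evaluation.values())  # 46.0
print(f"Score = {total}/{maximum} = {100 * total / maximum:.0f}%")       # 88%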
Establish characteristics of Assessment
When constructing or selecting assessments, the most important questions are: to what extent will the interpretation of the scores be appropriate, meaningful, and useful for the intended application of the results, and what are the consequences of the particular uses and interpretations that are made of the results?
Assessments take a wide variety of forms, ranging from the familiar multiple-choice or other
types of fixed-response tests to extended observations of performance. They also serve a
variety of uses in the institution. For example, assessment results might be used to identify
student strengths and weaknesses, to plan instructional activities, or to communicate
progress to students and parents; achievement tests might be used for selection, placement,
diagnosis, or certification; aptitude tests might be used for predicting success in future learning
activities or occupations; and appraisals of personal-social development might be used to
better understand learning problems or to evaluate the effects of a particular school
program. Regardless of the type of assessment used or how the results are to be used, all
assessments should possess certain characteristics. The most essential of these are validity,
reliability, and usability.

Validity
Validity is the adequacy and appropriateness of the interpretations and uses of
assessment results. An evaluation of the validity of the use and interpretation of an assessment
can take many forms. For example, if an assessment is to be used to describe student
achievement, then we should like to be able to interpret the scores as a relevant and
representative sample of the achievement domain to be measured. If the results are to be used
as a measure of students' understanding of mathematical concepts, then we should like our
interpretations to be based on evidence that the scores actually reflect mathematical
understanding and are not distorted by irrelevant factors, such as the reading demands of the
tasks. If the results are to be used to predict students' success in some future activity, then we
should like our interpretations to be based on as good an estimate of future success as possible.
Basically, then, validity is always concerned with the specific use of assessment results and the
soundness and fairness of our proposed interpretations of those results. As we will see later in
this chapter, however, this does not mean that validation procedures can be matched to specific
assessment uses on a one-to-one basis.
Reliability

Reliability refers to the consistency of assessment results. If we obtain quite similar scores when the same assessment procedure is used with the same students on two different occasions, then we can conclude that our results have a high degree of reliability from one occasion to another. Similarly, if different teachers independently rate student performances on the same assessment task and obtain similar ratings, we also can conclude that the results have a high degree of reliability from one rater to another. Like validity, reliability is intimately related to the type of interpretation to be made. For some uses, we may be interested in asking how reliable our assessment results are over a given period of time and, for others, how reliable they are over different samples of the same behavior. In all instances in which reliability is being determined, however, we are concerned with the consistency of the results rather than with the appropriateness of the interpretation made from the results (validity). The relation between reliability and validity is sometimes confusing to persons who encounter these terms for the first time. Reliability (consistency) of measurement is needed to obtain valid results, but we can have reliability without validity. That is, we can have consistent measures that provide the wrong information or are interpreted inappropriately. The target-shooting illustration in Figure 2 depicts the concept that reliability is a necessary but not sufficient condition for validity.

Valid & Reliable Invalid and unreliable Reliable and invalid


Figure 2
In addition to providing results that possess a satisfactory degree of validity and reliability, an
assessment procedure must meet certain practical requirements. It should be economical from
the viewpoint of both time and money, it should be easily administered and scored, and it
should produce results that can be accurately interpreted and applied by available Institute
personnel. These practical aspects of an assessment procedure all can be included under the
heading of usability. The term usability, then, refers only to the practicality of the procedure
and says nothing about the other qualities present. When using the term validity in relation to
testing and assessment, keep the following cautions in mind. Validity refers to the
appropriateness of the interpretation and use made of the results of an assessment procedure
for a given group of individuals, not to the procedure itself. We sometimes speak of the
"validity of a test" for the sake of convenience, but it is more correct to speak of the validity of
the interpretation and use to be made of the results. Validity is a matter of degree; it does not
exist on an all-or-none basis. Consequently, we should avoid thinking of assessment results as
valid or invalid. Validity is best considered in terms of categories that specify degree, such as
high validity, moderate validity, and low validity. Validity is always specific to some particular
use or interpretation for a specific population of test takers. No assessment is valid for all
purposes. For example, the results of a mathematics test may have a high degree of validity for
indicating computational skill, a low degree of validity for indicating mathematical reasoning,
a moderate degree of validity for predicting success in future mathematics courses, and
essentially no validity for predicting success in art or music. When indicating computational
skill, the mathematics test may also have a high degree of validity for third- and fourth-grade
students but a low degree of validity for second- or fifth-grade students. Thus, when appraising or describing validity, it is necessary to consider the specific interpretation or use to be made of the results. Assessment results are never just valid; they have differing degrees of validity.
Validity involves an overall evaluative judgment. It requires an evaluation of the degree to
which interpretations and uses of assessment results are justified by supporting evidence and
in terms of the consequences of those interpretations and uses. Four types of validity are to be
considered. These are
• Content Validity
• Construct Validity
• Concurrent Validity
• Predictive Validity

Content Validity
Each item in a test must be a sampling of knowledge or performance that the test is
supposed to measure. Content validity refers to the degree to which the test measures the
content in relation to the objectives spelt out. Content validity is usually associated with
achievement tests. It may be defined as the extent to which a test measures a representative
sample of subject matter content and the behavioural changes under consideration. The focus
of content validity, then, is on the adequacy of the sample and not simply on the appearance of
the test. A test that appears to be a relevant measure, based on superficial examination, is said
to have 'face validity'. Although a test should look like an appropriate tool to obtain the co-
operation of those taking the test, face validity should not be considered as a substitute for
content validity. The test must adequately sample both subject matter content and the major
types of behavioural changes. These must also be properly weighted in terms of their relative
importance. The factors that will affect the validity of a test are
• unclear directions and ambiguous statements in test items
• reading vocabulary and sentence structure too difficult for the student to understand
• inappropriate level of difficulty of test items for the person being examined; this results in poor discrimination of marks and therefore low reliability
• poorly constructed test items i.e., items with poor lay out or unclear words or figures
• test items inappropriate for the outcomes being measured
• test too short so that adequate sample not made of content and behaviours

• improper arrangement of items and identifiable pattern in answers to items in the test.

In short, any defect in the construction of items and in assembling a test will contribute to the invalidity of the measurement, and care must therefore be taken to prevent this. The content sampled in a test must be a representative sample and should truly measure the achievement of the learners. This is done by using a table of specifications for the test.
Construct validity:
Construct validity concerns the extent to which a test tells us something about a meaningful characteristic of the individual. Information about such characteristics (or "constructs", as they are sometimes called) may help us understand the student's performance in various aspects. Common examples of constructs are intelligence, scientific attitude, critical thinking, reading comprehension, study skills and mathematical aptitude. Construct validity may be defined as the extent to which test performance can be interpreted in terms of certain psychological constructs.
For example, in order to understand why a student consistently does well in English but poorly in Mathematics, it may be useful to know something about his general level of intelligence, his verbal ability, his numerical ability and perhaps his attitudes towards the different subjects. Knowledge of such characteristics can help ensure that each student benefits maximally from the learning experiences provided.
Construct validity is important in the context of achievement testing also. Achievement is an
important construct in an educational setting. Construct validity here refers to the test's ability
to measure the individual's actual achievement of instructional objectives. If an achievement
test has high construct validity, it should distinguish between students' who have achieved at
different levels.
Concurrent Validity
Concurrent validity is a criterion-related validity. It is the extent to which test
performance is related to some other current performance. The concurrent validity of a test
must be considered when one is using the test to distinguish between two or more groups of
individuals, whose status on a criterion is different at the time of testing. Tests or inventories
used to separate individuals in different academic curricula, in different vocational groups etc.
if successful, would be showing concurrent validity.
Concurrent validity may also be of relevant concern in judging achievement tests. In everyday classroom experience, there frequently are appropriate contemporary criteria with which achievement test performance must be compared. For
example, test performance in Mathematics, should be related to computational skill exhibited
in Engineering subject (Contemporary Criterion). High concurrent validity of the Mathematics
test means that those who do well in the test also do well in a test of the criterion of
computational skill in the Engineering subject.
Predictive Validity
This is also a criterion-related validity. It is the extent to which test performance is accurate in predicting some future performance (e.g., aptitude testing). Predictive validity is pertinent whenever test results are used to make specific predictions.
Consider for example the problem of selecting students for a course based on an admission
test. The assumption made is that the candidates who do well in the test are likely to succeed
in the course. The admission test should have high predictive validity for this purpose. In order to determine the predictive validity of the test, it is necessary to establish a correlation between the admission test scores and a criterion, viz. course performance. Many examination results
are used as predictors of future performance in later stages of education. Eg: Marks in
Mathematics, Physics and Chemistry are used to predict success in the Engineering courses.
Reliability
Reliability refers to the consistency of measurement, that is, how consistent test scores are from one measurement to another. If a test gives a score now and, when administered again after the lapse of a short time without remedial instruction, gives comparable scores, then the test is said to be reliable. This is called test-retest reliability. The type of test items in a test can also affect reliability: items which can be scored differently by different examiners, or by the same examiner at different times, will contribute to unreliability (i.e. essay type items). Factors affecting reliability are
• items that are ambiguous, or too easy or too difficult, will contribute to unreliability, since all those examined will get the same or nearly the same scores
• a longer test is more reliable than a short one, as there is greater scope for a larger spread of scores
• a test that has a greater spread of scores is more reliable, as it discriminates between high and low achievers
• Objective tests are more reliable than essay type tests because the subjective judgement
of the scorer does not affect the scores.

Reliability is affected by error in measurement. This could be extrinsic error or intrinsic error.
Extrinsic error may be due to:
• test and examination conditions and situations
• subjectivity in scoring by the scorers (this can be eliminated by using objective items or minimised by having a marking scheme and examiners' meetings for scoring)
Intrinsic error may be due to:
• the quality of items and questions
• sampling of areas that is not balanced, i.e. is biased
• time limits set arbitrarily, not in keeping with the requirements of the test situation.
The reliability of a test is measured in terms of Reliability coefficient. The following methods
are used to find out the reliability of a test.

Test-Retest Method
The test is administered to a group of students for whom the test is constructed. The same test is administered to the same group of students after a lapse of time, under similar conditions. This means that the students are retested with the same test. The scores from the two administrations are compared and the coefficient of correlation is found. If there is a high positive correlation, the test is reliable. However, in this method the effect of learning or unlearning due to the lapse of time cannot be ruled out.
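As an illustration only, the test-retest coefficient can be obtained by correlating the two sets of scores; the score lists in this Python sketch are hypothetical:

from statistics import correlation  # Python 3.10+

# Hypothetical scores of the same six students on two administrations of a test.
first_attempt  = [42, 55, 61, 48, 70, 66]
second_attempt = [45, 53, 63, 50, 72, 64]

# The Pearson correlation between the two administrations is the
# test-retest reliability estimate.
r = correlation(first_attempt, second_attempt)
print(f"Test-retest reliability estimate: {r:.2f}")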

Equivalent form method


This method is also called the parallel form method. Two equivalent forms of the test (called parallel forms) are constructed. The two tests must be similar in all respects: objectives tested, number of items, categories of items, length of the test, time for answering, difficulty level of items, content covered and the type of item used to test a specific content. The tests are administered to the students for whom the test is designed. The coefficient of correlation of the scores obtained is calculated. If this is high, the test is reliable. One of the problems of this method is the difficulty of designing two equivalent forms of a test.

Split half method


An easy and much used method for finding the reliability coefficient is the split-half method. The test whose reliability is to be found is administered to the students for whom the test is designed. The test is then split into two exact halves; this is done by separating out the odd-numbered and even-numbered items. The scores of the students on these two halves are found, and the coefficient of correlation between the two sets of scores is calculated. If this is high, the test is reliable.
Reliability of a test is closely related to validity. A valid test is usually also reliable but a highly
reliable test need not necessarily be valid. The validity of a test is more important than
reliability and we should not abandon essay type items merely because they contribute to
unreliability of measurements.
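A minimal Python sketch of the split-half calculation, using hypothetical dichotomously scored responses. The final Spearman-Brown step, which adjusts the half-test correlation up to full test length, is a common refinement that the text above does not describe:

from statistics import correlation  # Python 3.10+

# Hypothetical item scores (1 = correct, 0 = wrong) for five students on an 8-item test.
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 1, 0, 1],
]

# Split each student's test into odd-numbered and even-numbered items.
odd_scores  = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]

r_half = correlation(odd_scores, even_scores)   # correlation between the two half-tests
r_full = (2 * r_half) / (1 + r_half)            # Spearman-Brown correction to full length
print(f"Half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")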
ESTABLISH THE CHARACTERISTICS OF TEST

Characteristics of a good test

Teachers often use a variety of evaluation instruments to assess the scholastic achievement of students. Test instruments are widely used to measure student achievement at different stages of the teaching-learning process. The effectiveness of this judgement depends upon the quality of the test instruments. The essential characteristics of a good test include, for example:

• Comprehensiveness, which refers to the quality of a test of being long enough to measure all areas of content as well as the range of abilities
• Usability, which refers to the quality of a test suggesting ease of administration, scoring and interpretation.

Most of these characteristics are interrelated and are discussed below.

Objectivity:

The objectivity of a test refers to the degree to which equally competent scorers give the
same score for the same answer paper. If the test contains objective test items, then the
objectivity of the test will be high. Essay type questions where the scorer has to use his
subjective judgement cannot be highly objective. The scores assigned by different
examiners should not be affected by the personal bias of the scorers. Though objectivity
is a desirable quality, it should not be insisted upon where the other more important
characteristic (namely validity) requires subjective items in the test. Essay tests are less
objective. It is known that no two examiners assign the same score for an essay. In
scoring an essay many extraneous factors come into the picture.

Discrimination:

A good test should be able to distinguish a good student from a poor one. The test should also be able to detect small differences in students' achievement. This is the ability of a test to discriminate, and we can increase the discriminating power of the test by using items that discriminate well and by including items at all levels of difficulty. A test having a larger range of scores will be able to discriminate better.

Comprehensiveness:

A test measuring student achievement both in the subject content and in behavioural
outcomes must be comprehensive as well as representative in sampling so as to make
the test good.

Usability:

A test should be easy to administer, score and interpret in order to be usable. A test which takes minimum student time to administer, and which can be administered without many problems of seating etc., is preferable to one which requires elaborate precautions to administer. Economy in producing the test (printing) and economy of time in its use are desirable characteristics. A test which is easy to score after administration is also desirable. In short, if two tests are compared, then, other things being equal, the test that is easier to design, duplicate, administer and score is preferable.

Relevance:

This relates to the matter of matching the performance measured by the test item or
question to the type of performance specified by the instructional objectives or the
learning outcomes. This is therefore possible only when the curriculum specifies the
intended learning outcomes (objectives) clearly. It is therefore necessary for the test
constructor to use careful judgement in selecting items and questions. If the outcome
calls for supplying the answer then the item should require the student to supply the
answer rather than select an answer. It is more important when higher order abilities are

involved. The items or questions should be exactly matched in performance and in the
level of performance with those indicated in the objectives.

Difficulty of the Test:

In norm-referenced tests, if the items are all too easy or too difficult, then the spread of scores of those taking the test tends to be restricted, i.e. either all will get high scores or all will get low scores. A classroom achievement test should be so constructed that the average score is around 50%.

Other desirable characteristics:

The test that is administered and scored should be fair to the student. It is desirable that an average student who has learnt the topics taught should be able to do well and pass. The items and questions should measure his learning and should not be twisted or made unnecessarily complicated. The question paper as a whole should be balanced, in that the different areas must be given appropriate weightage based on importance and the time spent in teaching. The items themselves should check important areas of achievement rather than trivial or obscure areas of the content.

ANALYSIS OF A QUESTION PAPER

The first step in constructing a good question paper is to be able to look critically at an existing question paper and identify its strengths and deficiencies. The question paper is a very important component of the assessment system. Since the students are required to demonstrate the performance of which they have become capable after undergoing the teaching-learning process, it is very necessary that the question paper clearly calls for the same performance. Thus we see that there needs to be a close relationship between the instructional objectives and the question paper. The performances that the question paper asks the student to demonstrate should be the same as those that the curriculum specifies.

Every classroom teacher who prepares students for examinations conducted by bodies outside the institution should be capable of analysing the question paper and specifying its strengths and weaknesses.

Resources needed

As stated earlier, we need some resources for the analysis, namely:


1. The question paper itself

2. The scheme or the pattern of the question paper prescribed by the board
or examination system
3. The table of specifications for the question paper, if available
4. The marking or scoring system together with the marks assigned to
individual questions and sub-divisions
5. The curriculum document with objectives, content details and time
allocation
6. The teacher analysing the question paper should have expert knowledge in
the subject area together with the knowledge and skills in construction and
use of achievement tests and examinations.

Qualities of a good paper

Before analysing the question paper in detail, we can briefly discuss some desirable qualities of a paper. These may be stated as follows:

1. In any examination, the paper should be fair to all students. That is those
who have studied more should get more marks than those who have
studied less.

2. The paper should be comprehensive and test or sample the content of the
entire curriculum as also the abilities.

3. Those students who have studied all areas of the curriculum should be able to get more marks than those who study only selected portions (taking advantage of the open choice given in some question papers).

4. The question and the answer expected should be clear and unambiguous.
The language should be easily understandable. If students do not
understand what is expected of them how will they be able to answer?

5. The relative marks for each question and its sub-division should be marked
so that when a student answers he knows how to allocate his time for the
answer.

6. In general, the greater the number of specific questions a paper has, the better its reliability will be. Similarly, if the questions are objective, the paper and the examination will be more reliable. Scoring of a paper is also easier if the questions are objective in nature.

7. The instructions given are complete and easy to understand.

Analysis

A question paper should be analysed at two levels: the micro level, which pertains to the individual questions and items, and the macro level, at which the question paper is considered as a whole. Both these analyses are important.

Analysis of question paper at Micro level

A question is considered good if:

1. It measures a specific area of content and the achievement of an ability specified in the curriculum. The performance required from the student should be the same as that specified in the curriculum.

2. It is well written, so that there are no technical defects in item construction.

3. It is worded appropriately and is easy and clear to understand. (The item or question should check the subject matter and not the students' ability in English, unless it is a question paper in English.)

4. The time allocated is appropriate and the marks assigned to the question are also proportionate to the time and performance expected.

5. The difficulty level of the question should be appropriate to the class that
uses the paper. The facility value of around 50% or a bit more may be
recommended (if you have data on item analysis).

6. The instructions given are clear and unambiguous and easy to understand.

7. When checked against other questions and items, it does not test the same area of content and abilities, and it does not give clues for answering other items.

8. The quality of the items is good and appropriate.

It is not enough that each question and item is well constructed and measures a specific and important area of content; it must also fit into the question paper appropriately. The paper as a whole measures the achievement of students, so in any analysis the paper should be considered as a whole, more than the individual items and questions. Since the paper cannot measure all areas of the curriculum that is taught, it has to sample content and abilities, and how far this is a representative sample needs to be analysed. So, in analysing the question paper as a whole, we should check whether:

1. It samples content and abilities comprehensively and in all areas. The


sample must also be representative.

2. The instructions are clear and there is no ambiguity.

3. It conforms to the syllabus and table of specifications. If the table of specifications is available, we can check the paper against the table and see whether it conforms to it. If the table of specifications is not available, then using the questions and the marks allocated for each question and sub-division we need to prepare a table of specifications and check whether it is properly balanced or not. Some question papers may not even have marks allocated; we need to do this before the analysis is made.

4. The time allocated is appropriate. Unless it is a speeded test, where the


candidate's ability to quickly answer is measured, most average students
should be able to complete the paper in the allocated time.

5. The questions expect an answer from the student, and it is necessary to check that the extent and nature of the response, and the time the student will take in answering the question, are matched properly with the marks allocated to the question.

6. A student who has selectively studied only a few areas of content can get full marks because of choice. A student who has studied more areas of content should get more marks, but this advantage is negated if choice is allowed in the questions. Choice also results in different students answering different questions, so that the uniformity of assessment of all students is lost.

7. Are there any questions, which are outside the curriculum and syllabus?

Criterion referenced test versus Norm referenced Test

Different kinds of tests can be conducted in a teaching-learning process and their scores interpreted. Based on the way scores are interpreted and the purposes they serve, a test can be classified as a norm-referenced test or a criterion-referenced test. Norm-referenced testing is the process of evaluating or grading the learning of students by ranking them against the performance of their peer group, whereas criterion-referenced testing is the process of evaluating or grading the learning of students against a set of defined criteria. A norm-referenced test measures how the performance of a particular student, or of a group of students, compares with the performance of another student or group of students whose scores are taken as the norm. A test taker's score is, therefore, interpreted with reference to the scores of other test takers or groups of test takers. A norm-referenced test tells where a student stands compared to other students; this position may help the student to take some decisions. The quality of norm-referenced tests is usually good because they are developed by experts, piloted, and revised before they are used with students. They are also good for ranking and sorting students for administrative purposes, and for judging class performance and an institution's accountability for providing learning standards and maintaining the quality of education.

A criterion-referenced test is used to assess whether students pass or fail against a certain criterion. So, criterion-referenced testing is an approach to evaluation through which a learner's performance is measured with respect to the same criterion in the classroom. A criterion-referenced test is good for measuring specific skills or specific outcomes of a student. It provides a roadmap to the faculty member of how well the students are progressing, and it is good for determining learning progress where students have learning gaps or academic deficits that need to be addressed. The researcher Bond noted that criterion-referenced tests give direction to teaching and re-teaching: instructors can use the test results to determine how well they are teaching the curriculum and where they are lagging behind.

In a military selection, a criterion was set: climb a wall using a rope and jump to the other side. The scenario: consider a wall 10 metres high, with a rope hanging in front of it, and 20 candidates standing in front of the wall. The criterion is that each candidate has to climb the wall using the rope and jump to the other side. Assume that after the test some of the candidates did not even climb 50% of the wall, a few climbed 75% of the wall, and only three touched the top of the wall but could not jump to the other side. So, who will be selected for the military? Nobody. This is exactly criterion-referenced: the criterion has to be met.
Item Difficulty Index (or Item Difficulty Level)

When evaluating items on ability tests, an important consideration is the difficulty level of the item. Item difficulty is defined as the percentage or proportion of test takers who correctly answer the item. The item difficulty level or index is abbreviated as p and is calculated with the following formula:

p = number of examinees correctly answering the item / total number of examinees

For example, in a class of 30 students, if 20 students answer the item correctly and ten answer incorrectly, the item difficulty index is

p = 20 / 30 = 0.67

In the same class, if ten students answer correctly and 20 answer incorrectly, the item difficulty index is 0.33.

While carrying out the item analysis, if X students did not attempt the question, then X has to be subtracted from the number of examinees. For example, in a class of 30 students, if 18 students submit the correct answer, 8 students submit a wrong answer and 4 students do not respond to the question, then p = 18 / (30 - 4) = 18 / 26 = 0.69.
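A minimal Python sketch of the difficulty calculation, assuming hypothetical responses where 1 means correct, 0 means wrong, and None means the student did not attempt the item:

# Item difficulty index p for one item (illustrative sketch).
# 1 = correct, 0 = wrong, None = did not attempt the question.
responses = [1] * 18 + [0] * 8 + [None] * 4   # the 30-student example above

attempted = [r for r in responses if r is not None]
p = sum(attempted) / len(attempted)           # 18 / 26
print(f"p = {p:.2f}")                         # 0.69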

The item difficulty index can range from 0.0 to 1.0 with easier items having larger decimal
values and difficult items at lower values. An item answered correctly by all students receives
an item difficulty of 1.0 whereas an item answered incorrectly by all students receives an item
difficulty of 0.0. Items with p values of either 1.0 or 0.0 provide no information about
individual differences and are of no value from a measurement perspective. Some test
developers will include one or two items with p values of 1.0 at the beginning of a test to instill
a sense of confidence in test takers. This is a defensible practice from a motivational
perspective, but from a technical perspective these items do not contribute to the measurement
characteristics of the test. Another factor that should be considered about the inclusion of very
easy or very difficult items is the issue of time efficiency. The time students spend answering
ineffective items is largely wasted and could be better spent on items that enhance the
measurement characteristics of the test.

For maximizing variability and reliability, the optimal item difficulty level is 0.50, indicating that 50% of test takers answered the item correctly and 50% answered incorrectly. Based on this statement, you might conclude that it is desirable for all items to have p values of 0.50; however, this is not necessarily so. One reason is that items on a test are often correlated with each other, which means the measurement process may be confounded if all the items have p values of 0.50. As a result, it is often desirable to select some items with p values below 0.50 and some with values greater than 0.50, but with a mean of 0.50. Aiken (2000) recommends that there should be approximately a 0.20 range of these p values around the optimal value. For example, a test developer might select items with difficulty levels ranging from 0.40 to 0.60, with a mean of 0.50.
Another reason why 0.50 is not the optimal difficulty level for every testing situation involves
the influence of guessing. On constructed-response items (e.g., essay and short-answer items)
for which guessing is not a major concern, 0.50 is typically considered the optimal difficulty
level.
In general, item difficulty values can be interpreted as follows:

Difficulty Value      Item Evaluation
0.20 to 0.30          Most difficult
0.30 to 0.40          Difficult
0.40 to 0.60          Moderately difficult
0.60 to 0.70          Easy
0.70 to 0.80          Most easy

However, with selected-response items (e.g., multiple-choice and true-false items), for which test takers might answer the item correctly simply by guessing, the optimal difficulty level varies and is set so as to take the effects of guessing into consideration.
TABLE 1: Optimal p Values for Items with Varying Numbers of Choices

Number of Choices Optimal Mean P Value

2 (e.g., True –False) 0.85

3 0.77

4 0.74

5 0.69

Constructed response (e.g., essay) 0.50

The optimal item difficulty level is set higher than for constructed-response items. For
example, for multiple-choice items with four options the average p should be approximately
0.74 (Lord, 1952). That is, the test developer might select items with difficulty levels ranging
from 0.64 to 0.84, with a mean of approximately 0.74. Table 1 provides information on the optimal mean p value for selected-response items with varying numbers of alternatives or choices.

Reference : Measurement and Assessment in Education by Reynolds, Livingston and Willson, Second Edition
Item Analysis for Constructed-Response Items

Our discussion and example of the calculation of the item difficulty index and discrimination index used examples that were dichotomously scored (i.e., scored right or wrong: 0 or 1). Although this procedure works fine with selected-response items (e.g., true-false, multiple-choice), you need a slightly different approach with constructed-response items that are scored in a more continuous manner (e.g., an essay item that can receive scores between 1 and 5 depending on quality). To calculate the item difficulty index for a continuously scored constructed-response item, use the following formula (Nitko, 2001):

p = Average Score on the Item / Range of Possible Scores

The range of possible scores is calculated as the maximum possible score on the item minus the minimum possible score on the item. For example, if an item has an average score of 2.7 and is scored on a 1 to 5 scale, the calculation would be:

p = 2.7 / (5 - 1) = 2.7 / 4 = 0.675

Therefore, this item has an item difficulty index of 0.675. This value can be interpreted in the same way as for the dichotomously scored items we discussed.

To calculate the item discrimination index for a continuously scored constructed-response item, you use the following formula (Nitko, 2001):

D = (Average Score for the Top Group - Average Score for the Bottom Group) / Range of Possible Scores

For example, if the average score for the top group is 4.3, the average score for the bottom group is 1.7, and the item is scored on a 1 to 5 scale, the calculation would be:

D = (4.3 - 1.7) / (5 - 1) = 2.6 / 4 = 0.65

Therefore, this item has an item discrimination index of 0.65. Again, this value can be interpreted in the same way as for the dichotomously scored items we discussed.
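A minimal Python sketch of these two formulas, using hypothetical essay scores on a 1-to-5 scale (for brevity, p is computed here over the two groups only):

# Item analysis for a continuously scored (constructed-response) item.
# Hypothetical essay scores on a 1-to-5 scale, grouped by overall test performance.
top_group_scores    = [5, 4, 4, 5, 4, 4]   # highest-scoring examinees overall
bottom_group_scores = [2, 1, 2, 2, 1, 2]   # lowest-scoring examinees overall
min_score, max_score = 1, 5

score_range = max_score - min_score
all_scores = top_group_scores + bottom_group_scores

p = (sum(all_scores) / len(all_scores)) / score_range
d = (sum(top_group_scores) / len(top_group_scores)
     - sum(bottom_group_scores) / len(bottom_group_scores)) / score_range

print(f"Difficulty p = {p:.2f}, discrimination D = {d:.2f}")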
Discrimination Index
Probably the most popular method of calculating an index of item discrimination is
based on the difference in performance between two groups. Although there are different ways
of selecting the two groups, they are typically defined in terms of total test performance. One
common approach is to select the top and bottom 27% of test takers in terms of their overall
performance on the test and exclude the middle 46% (Kelley, 1939). Some assessment experts
have suggested using the top and bottom 25%, some the top and bottom 33%, and some the
top and bottom halves. In practice, all of these are probably acceptable (later in this chapter we
will show you a more practical approach that saves both time and effort). The difficulty of the
item is computed for each group separately, and these are labeled PT and PB (T for top, B for
bottom). The difference between PT and PB is the discrimination index, designated as D, and is calculated with the following formula (e.g., Johnson, 1951):
D = PT - PB
Where
D = Discrimination index
PT = proportion of examinees in the top group getting the item correct
PB = proportion of examinees in the bottom group getting the item correct

To illustrate the logic behind this index, consider a classroom test designed to measure academic achievement in some specified area. If the item is discriminating between students who know the material and those who do not, then students who are more knowledgeable (i.e., students in the top group) should get the item correct more often than students who are less knowledgeable (i.e., students in the bottom group). For example, if PT = 0.80 (indicating that 80% of the students in the top group answered the item correctly) and PB = 0.30 (indicating that 30% of the students in the bottom group answered the item correctly), then

D= 0.80 – 0.30 = 0.50
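A minimal Python sketch of this top-minus-bottom calculation, assuming hypothetical dichotomous (1/0) responses already split into the two groups:

# Discrimination index D = PT - PB (illustrative sketch).
# 1 = correct, 0 = wrong; groups formed from overall test performance.
top_group    = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]   # PT = 0.80
bottom_group = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0]   # PB = 0.30

pt = sum(top_group) / len(top_group)
pb = sum(bottom_group) / len(bottom_group)
d = pt - pb

print(f"PT = {pt:.2f}, PB = {pb:.2f}, D = {d:.2f}")   # D = 0.50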

Hopkins (1998) provided guidelines for evaluating items in terms of their D values (see Table
1). According to these guidelines, D values of 0.40 and above are considered excellent, between
0.30 and 0.39 are good, between 0.11 and 0.29 are fair, and between 0.00 and 0.10 are poor.
Items with negative D values are likely mis-keyed, or there are other serious problems. Other testing and assessment experts have provided different guidelines, some more rigorous and some more lenient.

Types of Discrimination Index


1. No discrimination or zero discrimination: the item is answered correctly by all examinees, or is not answered correctly by any of the examinees.
2. Positive discrimination: the item is answered correctly by superior (high-scoring) examinees and not answered correctly by inferior (low-scoring) examinees.
3. Negative discrimination: the item is answered correctly by inferior examinees and not answered correctly by superior examinees.

As a general rule, we suggest that items with D values over 0.30 are acceptable (the larger the
better), and items with D values below 0.30 should be carefully reviewed and possibly revised
or deleted. However, this is only a general rule and there are exceptions. For example, most
indexes of item discrimination, including the item discrimination index (D), are biased in favor
of items with intermediate difficulty levels. That is, the maximum D value of an item is

Table 1 : Guidelines for Evaluating D Values

D Value               Item Evaluation
0.40 and larger       Excellent
0.30 – 0.39           Good
0.11 – 0.29           Fair
0.00 – 0.10           Poor
Negative values       Mis-keyed or other major flaw


Table 2 : Maximum D Values at Different Difficulty Levels

Item Difficulty Index (p)    Maximum D Value
1.00                         0.00
0.90                         0.20
0.80                         0.40
0.70                         0.60
0.60                         0.80
0.50                         1.00
0.40                         0.80
0.30                         0.60
0.20                         0.40
0.10                         0.20
0.00                         0.00

related to its p value (see Table 2). Items that all test takers either pass or fail (i.e., p values of either 0.0 or 1.0) cannot provide any information about individual differences, and their D values will always be zero. If half of the test takers answered an item correctly and half failed it (i.e., a p value of 0.50), then it is possible for the item's D value to be 1.0. This does not mean that all items with p values of 0.50 will have D values of 1.0, but just that the item can conceivably have a D value of 1.0. As a result of this relationship between p and D, items that have excellent discrimination power (i.e., D values of 0.40 and above) will necessarily have p values between 0.20 and 0.80. In testing situations in which it is desirable to have very easy or very difficult items, D values can be expected to be lower than those normally desired. Additionally, items that measure abilities or objectives that are not emphasized throughout the test may have poor discrimination due to their unique focus. In this situation, if the item measures an important ability or learning objective and is free of technical defects, it should be retained (e.g., Linn & Gronlund, 2000).
Quick Way to Estimate Reliability for Classroom Exams

Saupe (1961) provided a quick method for teachers to calculate the reliability of a classroom exam in an era prior to easy access to calculators or computers. It is appropriate for a test in which each item is given equal weight and each item is scored either right or wrong. First, the standard deviation of the exam is estimated from a simple approximation:

SD = [sum of top 1/6 of scores - sum of bottom 1/6 of scores] / [(total number of scores - 1) / 2]

Reliability = 1 - [0.19 × number of items] / SD²

Thus, for example, in a class with 24 student test scores, the top one-sixth of the scores are 98, 92, 87, and 86, while the bottom one-sixth of the scores are 48, 72, 74, and 75. With 25 test items, the calculations are:

SD = [98 + 92 + 87 + 86 - 48 - 72 - 74 - 75] / [(24 - 1) / 2]
   = [363 - 269] / 11.5
   = 94 / 11.5 = 8.17

So,
Reliability = 1 - [0.19 × 25] / 8.17²
            = 1 - 0.07
            = 0.93

A reliability coefficient of 0.93 for a classroom test is excellent.
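A minimal Python sketch of Saupe's shortcut; only the top and bottom sixths of the scores and the item count enter the formula, so the middle scores in the list below are arbitrary placeholders:

# Saupe's quick reliability estimate for a classroom exam (illustrative sketch).
def saupe_reliability(scores, n_items):
    """Estimate reliability from dichotomously scored, equally weighted items."""
    scores = sorted(scores)
    sixth = len(scores) // 6                       # size of the top/bottom sixth
    spread = sum(scores[-sixth:]) - sum(scores[:sixth])
    sd = spread / ((len(scores) - 1) / 2)          # approximate standard deviation
    return 1 - (0.19 * n_items) / sd ** 2

# The 24-score class from the example above: the 16 middle scores are placeholders.
scores = [98, 92, 87, 86] \
    + [85, 84, 83, 82, 81, 80, 80, 79, 79, 78, 78, 77, 77, 76, 76, 76] \
    + [75, 74, 72, 48]
print(f"Estimated reliability: {saupe_reliability(scores, n_items=25):.2f}")   # ~0.93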

Reference : Measurement and Assessment in Education by Reynolds, Livingston and Willson Second
Edition
Reliability: Practical Strategies for Teachers

Now you are aware of the importance of the reliability of measurement. A common question is "How can I estimate the reliability of scores on my classroom tests?" Most teachers have a number of options. First, if you use multiple-choice or other tests that can be scored by a computer scoring program, the score printout will typically report some reliability estimate (e.g., coefficient alpha or KR-20). If you do not have access to computer scoring, but the items on a test are of approximately equal difficulty and scored dichotomously (i.e., correct/incorrect), you can use an internal consistency reliability estimate known as the Kuder-Richardson formula 21 (KR-21). To calculate KR-21 you need to know only the mean, variance, and number of items on the test:

KR-21 = 1 - [X (n - X)] / (n σ²)

where
X = mean of the test scores
σ² = variance of the test scores
n = number of items

Consider the following set of scores:

50 48 47 46 42 42 40 40 38 37 38 41 49 43 40 32 31 30 28 41

X = 40.15
σ² = 39.71
n = 50

So,
KR-21 = 1 - [40.15 × (50 - 40.15)] / (50 × 39.71)
      = 1 - 395.48 / 1985.5
      = 1 - 0.199
      = 0.80
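A minimal Python sketch of the KR-21 computation on the score set above (the sample variance is used, matching the σ² value quoted in the text):

from statistics import fmean, variance

# KR-21 internal consistency estimate (illustrative sketch).
scores = [50, 48, 47, 46, 42, 42, 40, 40, 38, 37,
          38, 41, 49, 43, 40, 32, 31, 30, 28, 41]
n_items = 50

mean = fmean(scores)                 # 40.15
var = variance(scores)               # ~39.7 (sample variance)
kr21 = 1 - (mean * (n_items - mean)) / (n_items * var)
print(f"KR-21 = {kr21:.2f}")         # ~0.80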

Reference : Measurement and Assessment in Education by Reynolds, Livingston and Willson, Second Edition
Scoring System

It is essential that each candidate's progress be watched carefully and reported as accurately as possible. Scores are also an important means of stimulating, directing and rewarding the efforts of candidates. Scores represent the degree of achievement as precisely as possible under the circumstances. Scores are necessary, but they must be based on sufficient evidence. Two major types of scoring systems are used for the evaluation of an item set.

Absolute Scoring System: A scoring system in which a candidate's percent score is independent of any other candidate's score is called absolute scoring. Evaluating an item set requires the teacher, whose natural instincts incline towards being a helpful guide and counsellor, to stand in judgment over his or her students. Experts have observed that it is never difficult to give a good score to a candidate if it is higher than he really expected, but there are likely to be more occasions for disappointment than pleasure in scores. Scoring standards often vary from instructor to instructor and from institution to institution. For such reasons, no scoring system is available which makes the process of scoring easy and satisfactory; no new scoring system, however cleverly devised and conscientiously followed, solves the basic problem of scoring.

Relative Scoring System: A basic principle of relative scoring is to measure the achievement of a candidate by comparing it with the achievement of his/her peers. Another characteristic is that the candidates, rather than the teacher, in effect set the scoring standard. In most areas of human activity, awards go to individuals who are outstanding in relative, not in absolute, terms. There are no absolute standards for speed in running the mile or for distance in throwing the javelin; the winner in any race is determined on a relative basis.

Description of Test Scores

A distribution is a set of scores, which may be obtained from any kind of test. Statistically,
the distribution of a data set is a listing or function showing all the possible values of the
data and how often they occur. When a distribution of categorical data is organized, it gives the
number or percentage of individuals in each group. When a distribution of numerical data is
organized, the values are often ordered from smallest to largest, broken into reasonably sized
groups, and then put into graphs and charts to examine the shape, centre, and amount of
variability in the data.
Measures of central tendency are used to describe the centre of the distribution. Three measures
are commonly used: the mean, the median and the mode.

Mean: The "average" number; found by adding all data points and dividing by the number of
data points.

Example: 13, 18, 13, 14, 13, 16, 14, 21, 13

The mean is the usual average, so add the values and divide by how many there are:

(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15

Median: The middle number; found by ordering all data points and picking out the one in the
middle

• Arrange the data points from smallest to largest.


• If the number of data points is odd, the median is the middle data point in the list.
• If the number of data points is even, the median is the average of the two middle data
points in the list.

Example: 13, 13, 13, 13, 14, 14, 16, 18, 21

There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th
number:

13, 13, 13, 13, 14, 14, 16, 18, 21; the fifth number is 14, so the median is 14.

Mode: The most frequent number, that is, the value that occurs the greatest number of times. In
the list above, 13 occurs four times, more than any other value, so the mode is 13.

In addition to these three measures, the range is another useful summary: it is the difference
between the largest value and the smallest value. In the list the largest value is 21 and the
smallest is 13, so the range is 21 − 13 = 8.
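
These four summary statistics can also be computed directly. The short Python sketch below, shown purely as an illustration and using only the standard library, reproduces the small example above.

from statistics import mean, median, mode

data = [13, 18, 13, 14, 13, 16, 14, 21, 13]

print(mean(data))              # 15
print(median(data))            # 14
print(mode(data))              # 13
print(max(data) - min(data))   # range: 21 - 13 = 8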

Standard Deviation (SD): This is one of the measures of the variability of scores. It indicates
the spread of scores around the mean score; for the standard normal distribution its value is 1.
A larger value of SD indicates that the scores are spread more widely around the mean. The SD is
defined as the positive square root of the arithmetic mean of the squared deviations of the
observations from their arithmetic mean; in short, σ is the root-mean-square deviation from the
mean.

Consider the scores of 20 students on a test marked out of 10:

Stu No   Marks (10)      Stu No   Marks (10)
A101     7               A111     9
A102     8               A112     9
A103     9               A113     8
A104     6               A114     4
A105     7               A115     5
A106     6               A116     6
A107     10              A117     7
A108     8               A118     8
A109     5               A119     8
A110     9               A120     7

Score       10   9   8   7   6   5   4
Frequency    1   4   5   4   3   2   1

For this distribution: Mean = 7.3, Median = 7.5, Mode = 8.

Figure 1: Student Score Distribution (bar chart of the score frequencies above)
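
For the 20 scores in the table, the standard deviation can be checked with the short Python sketch below. It uses the population form (dividing by N), which matches the root-mean-square definition given above; this choice is an assumption, since the text does not specify which form is intended.

marks = [7, 8, 9, 6, 7, 6, 10, 8, 5, 9,
         9, 9, 8, 4, 5, 6, 7, 8, 8, 7]

n = len(marks)
mean = sum(marks) / n                                   # 7.3
# Population SD: root mean square deviation from the mean.
sd = (sum((x - mean) ** 2 for x in marks) / n) ** 0.5
print(round(mean, 1), round(sd, 2))                     # 7.3 1.55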

Normal Distribution

Many distributions fall on a normal curve, especially when large samples of data are
considered. This is important to understand because if a distribution is normal, there are certain
qualities that are consistent and help in quickly understanding the scores within the distribution.
The mean, median, and mode of a normal distribution are identical and fall exactly in the center
of the curve.

The empirical rule tells you what percentage of your data falls within a certain number
of standard deviations from the mean:
• About 68% of the data falls within one standard deviation of the mean.
• About 95% of the data falls within two standard deviations of the mean.
• About 99.7% of the data falls within three standard deviations of the mean.
Figure 2
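
As a rough illustration of the empirical rule (not part of the source text), the Python sketch below draws a large sample from an assumed normal score scale and checks how closely the simulated proportions match 68%, 95% and 99.7%; the exact figures will vary slightly from run to run.

import random

random.seed(1)
mu, sd = 500, 100                    # an assumed test-score scale
scores = [random.gauss(mu, sd) for _ in range(100_000)]

for k in (1, 2, 3):
    within = sum(mu - k * sd <= x <= mu + k * sd for x in scores)
    print(f"within {k} SD: {within / len(scores):.1%}")
# Expect roughly 68%, 95% and 99.7%.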

The standard deviation controls the spread of the distribution. A smaller standard deviation
indicates that the data is tightly clustered around the mean; the normal distribution will be
taller. A larger standard deviation indicates that the data is spread out around the mean; the
normal distribution will be flatter and wider.

Properties of a normal distribution

• The mean, mode and median are all equal.


• The curve is symmetric about the centre (i.e. around the mean, μ).
• Exactly half of the values are to the left of center and exactly half the values are to the
right.
• The total area under the curve is 1.

Skewness: The literal meaning of skewness is 'lack of symmetry'. It is used to study the shape,
i.e. the symmetry or asymmetry, of a frequency distribution. In a symmetrical distribution the
frequencies are spread equally on either side of the central value, and the two tails (left and
right) of the curve are equal in shape and length; the skewness of the normal frequency
distribution is zero. In a skewed distribution the frequency curve is not a symmetric bell-shaped
curve but is stretched more to one side than the other, i.e. it has a longer tail on one side
(left or right). A frequency distribution whose curve has the longer tail towards the right is
said to be positively skewed, and one whose curve has the longer tail towards the left is said to
be negatively skewed. Figure 2 shows a symmetrical distribution, while the following Figures 3
and 4 show asymmetrical distributions.
Figure – 3
For a right (positive) skewed distribution, the mean is typically greater than the median.
Also notice that the tail of the distribution on the right hand (positive) side is longer than
on the left hand side.

Figure - 4
A distribution that is skewed left has exactly the opposite characteristics of one that is skewed
right:

• the mean is typically less than the median;


• the tail of the distribution is longer on the left hand side than on the right hand side; and
• the median is closer to the third quartile than to the first quartile.
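
As a quick numerical illustration of these mean-median relationships (the data are invented for the example), a right-skewed set of scores has its mean above its median, while its mirror image, which is left-skewed, has its mean below its median:

from statistics import mean, median

right_skewed = [2, 3, 3, 4, 4, 5, 6, 9, 14, 20]    # longer tail to the right
left_skewed = [22 - x for x in right_skewed]       # mirror image: longer tail to the left

print(mean(right_skewed), median(right_skewed))    # 7.0 > 4.5
print(mean(left_skewed), median(left_skewed))      # 15.0 < 17.5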
The following figure shows the comparison of skewness.

Percentile Rank: When different groups take different tests, the scores have widely different
means, standard deviations and distributions, so it is useful to have a standard scale to which
they can be referred. One such scale is the percentile rank. The percentile rank of a test score
indicates what percentage of scores falls below the midpoint of that score interval. In
calculating the percentile rank of any score, half of the persons receiving that score are
considered to have scored below, and half above, the midpoint of the score interval. The
percentile rank is used to relate a particular candidate's score to the scores of the other
candidates tested in the group. It ranges from 0 to 100 regardless of whether the group as a
whole performs well or poorly on the test. Percentile ranks differ from the original (raw) test
scores: percentile ranks form a rectangular distribution, while raw scores are typically close to
a normal distribution. In a normal distribution the scores are concentrated near the middle, with
decreasing score frequencies as one moves out to the high and low extremes; in a rectangular
distribution the score frequencies are uniform all along the scale.
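
The midpoint convention described above translates directly into a small helper function. The Python sketch below is our own illustration; it reuses the 20 scores from the KR-21 example and counts the scores below a given score plus half of those equal to it.

def percentile_rank(scores, score):
    """Percent of scores falling below the midpoint of the given score's interval."""
    below = sum(s < score for s in scores)
    equal = sum(s == score for s in scores)
    return 100 * (below + 0.5 * equal) / len(scores)

scores = [50, 48, 47, 46, 42, 42, 40, 40, 38, 37,
          38, 41, 49, 43, 40, 32, 31, 30, 28, 41]
print(percentile_rank(scores, 40))   # 42.5 (seven scores below 40, three equal to it)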
Self-Assessment
Assessment can serve different purposes and take different forms:
• To measure achievement (summative assessment/ assessment of learning);
• To stimulate learning (formative assessment/ assessment for learning);
• To enable learners to become conscious of how they learn (assessment as
learning).
However, higher education has generally focused on ‘acquisition of’ rather than
‘participation in’ learning (Boud and Falchikov, 2006). Falchikov (2005) further outlines
the changing definitions of assessment: assessment as measurement, assessment as
procedure, assessment as enquiry, assessment as accountability and assessment as
quality control.
Ideally, assessment is for learning as well as for measuring achievement (of
learning). When students are assessed in activities that seem intrinsically meaningful
or useful, they are more likely to engage and invest in deep learning (Sambell et al.,
2013). However, traditional assessment practices, which focus on grades and
individual certification, can undermine students’ capacity to judge their own work
(Boud and Falchikov, 2006). Students can become passive recipients of externally
imposed assessment practices. Assessment should be perceived as a fair and
transparent process (Flint and Johnson, 2011). Both peer- and self-assessment can
contribute towards student perceptions of the fairness of assessment (Rust et al.,
2003).

Introduction to Self-Assessment


Self-assessment is a powerful mechanism for enhancing learning. It
encourages students to reflect on how their own work meets the goals set for learning
concepts and skills. It promotes metacognition about what is being learned, and
effective practices for learning. It encourages students to think about how a particular
assignment or course fits into the context of their education. It imparts reflective skills
that will be useful on the job or in academic research.
The underpinning axiom of self-assessment is that the individual student is able
to gain understanding of their own needs, which can then be communicated to fellow
students (leading into peer learning and assessment) and/or the tutor/lecturer. Self-
assessment is a valuable approach to supporting student learning, particularly when
used formatively (Taras, 2010). It is also useful in preparing students for life-long
learning, through discussions about their skills and competencies (including the ability
to assess), not just knowledge (Brew, 1999).
For self-assessment to be effective, students should first become familiar with
the concept. The term ‘self-assessment’ is used to cover all judgements by learners
of their work: it is related to and incorporates terms such as ‘self-evaluation’ and ‘self-
appraisal’. There are several different purposes of self-assessment: to evaluate
understanding of the content, to demonstrate the achievement of outcomes and goals
and the self-development of the learner. These three aspects of self-assessment are
all inter-linked and will receive different emphases at different times during the process
of learning.

Purpose of Self-Assessment


In general, self-assessment supports student learning and is one of the most
important skills that students require for future professional development and life-long
learning, as it develops the capacity to be assessors of learning (Boud and Falchikov,
2006; Taras, 2010). Taras also points out that self-assessment starts from the
perspective of the integration of learning and teaching.
• Promotes the skills of reflective practice and self-monitoring.
• Promotes academic integrity through student self-reporting of learning
progress.
• Develops self-directed learning.
• Increases student motivation.
• Helps students develop a range of personal, transferable skills.

Limitations of Self-Assessment


• Some students are reluctant to self-assess; they feel they lack the necessary skills,
confidence or ability to judge their own work.
• Some students do not like it and do not see the benefit in it.
• For some students, cultural issues affect self-assessment because giving themselves a good
grade is considered inappropriate or boastful.
