Evaluation External Notes
Evaluation is the process of determining to what extent the educational objectives are
being realised. – Ralph Tyler
It is a continuous process.
Purpose of Evaluation
Purpose of Evaluation in Nursing Education
Principles of evaluation
1. Determining and clarifying what is to be evaluated always has priority in the evaluation
process.
4. Proper use of evaluation techniques requires an awareness of both their limitations and strengths.
Characteristics of Evaluation
Functions of Evaluation
Types of Evaluation
Formative evaluation is continuous, diagnostic and focused on both what students are doing well
and areas where they need to improve (Carnegie Mellon, n.d.). As the goal of formative
evaluation is to improve future performance, a mark or grade is not usually included (Gaberson,
Oermann & Shellenbarger, 2015; Marsh et al., 2005). Formative evaluations, sometimes referred
to as mid-term evaluations, should precede final or summative evaluation.
Summative evaluation summarizes how students have or have not achieved the outcomes and
competencies stipulated in course objectives (Carnegie Mellon, n.d.), and includes a mark or
grade. Summative evaluation can be completed at mid-term or at end of term. Both formative and
summative evaluation consider context. They can include measurement and assessment methods
noted previously as well as staff observations, written work, presentations and a variety of other
measures.
Diagnostic Evaluation:
• This type of evaluation is concerned with finding out the reasons for students' persistent or recurring learning difficulties that cannot be resolved by standard corrective measures or formative evaluation.
• The aim of diagnostic evaluation is to find out the causes of learning problems and to plan remedial actions.
• Observational techniques or specially prepared diagnostic techniques can be used to diagnose the problem.
TEST:
Test is defined as a series of questions on the basis of which information is sought.
Objectivity:
A test must have the trait of objectivity; that is, it must be free from subjective
elements so that there is complete interpersonal agreement among experts regarding
the meaning of items and the scoring of the test. Objectivity here refers to two
aspects of the test, i.e.:
Objectivity of Items:
By objectivity of items is meant that the items should be phrased in such a manner
that they are interpreted in exactly the same way by all those who are taking the test.
For ensuring objectivity of items, items must have uniformity of order of
presentation (either ascending or descending).
Objectivity of Scoring:
By objectivity of scoring is meant that the scoring method of the test should be a
standard one so that complete uniformity can be maintained when the test is scored
by different experts at different times.
Reliability:
A test must also be reliable. Reliability is the "self-correlation of the test." It shows the
extent to which the results obtained are consistent when the test is administered once
or more than once on the same sample with a reasonable gap. Consistency in results
obtained in a single administration is the index of internal consistency of the test, and
consistency in results obtained upon testing and retesting is the index of temporal
consistency. Reliability thus includes both internal consistency and temporal
consistency. A test, to be called sound, must be reliable, because reliability indicates
the extent to which the scores obtained in the test are free from such internal defects
of standardization as are likely to produce errors of measurement.
Validity:
Validity is another prerequisite for a test to be sound. Validity indicates the extent to
which the test measures what it intends to measure, when compared with some
outside independent criterion. In other words, it is the correlation of the test with some
outside criterion. The criterion should be an independent one and should be regarded as
the best index of the trait or ability being measured by the test. Generally, the validity of a
test depends upon its reliability, because a test which yields inconsistent results
(poor reliability) is ordinarily not expected to correlate with an outside
independent criterion.
Norms:
A test must also be guided by certain norms. Norms refer to the “average
performance of the representative sample on a given test.” There are four common
types of norms;
Age norm
Grade norm
Percentile norms
Standard score norms.
Depending upon the purpose and use, a test constructor prepares any of the above
norms for his test. Norms help in the interpretation of scores. In the absence of norms,
no meaning can be attached to the score obtained on the test.
Practicability:
A test must also be practicable from the point of view of the time taken in its
completion, its length, scoring, etc. In other words, the test should not be too lengthy,
and the scoring method must be neither difficult nor one which can only be carried out
by a highly specialized person.
TEST CONSTRUCTION
Principles of Test Construction
9. Make the test valid and reliable:
– A test is reliable when it produces dependable, consistent, and accurate scores.
– A test is valid when it measures what it purports to measure.
– Tests which are written clearly and unambiguously are more reliable.
– Tests with more items are more reliable than tests with fewer items.
– Tests which are well planned, cover wide objectives, and are well executed are more valid.
“GENERAL STEPS OF TEST CONSTRUCTION”
1. Planning
2. Writing items for the test.
3. Preliminary administration of the test.
4. Reliability of the final test.
5. Validity of the final test.
6. Preparation of norms for the final test.
7. Preparation of manual.
Each stage is briefly described below.
1. PLANNING:
The first step in test construction is careful planning. At this stage, the author
has to spell out the broad and specific objectives of the test in clear terms, that is,
the purpose or purposes for which the test will be used. The author also has to keep
in mind the following points:
• What will be the appropriate age range, educational level and cultural background of the examinees who would find it desirable to take the test?
• What will be the content of the test? Is this content coverage different from that of the existing tests developed for the same or similar purposes? Is it culture-specific?
• What would be the nature of the items? That is, decide whether the test will be multiple choice, true-false, inventive response, or in some other form.
• What would be the type of instructions, i.e., written or to be delivered orally?
• Will the test be administered individually or in groups? Will the test be designed or modified for computer administration?
• The test constructor must decide on the probable length and the time allowed for completion of the test.
• Is there any potential harm to the examinees resulting from the administration of this test? Are there any safeguards built into the recommended testing procedure to prevent any sort of harm to anyone involved in the use of this test?
• How will the scores be interpreted? Will the scores of an examinee be compared to others in a criterion group, or will they be used to assess mastery of a specific content area?
2. WRITING ITEMS FOR THE TEST:
Item:
A single question or task that is not often broken down into any smaller units. (Bean,
1953:15)
The second step in test construction is the preparation of the items of the test. Item
writing starts with the planning done earlier. If the test constructor decides to prepare an
essay test, then essay items are written down. However, if he decides to construct
an objective test, he writes down objective items such as the alternative-response
item, matching item, multiple-choice item, completion item, short-answer item,
pictorial item, etc. Depending upon the purpose, he decides to write any of
these objective types of items.
The item writer must have a thorough knowledge and complete mastery of the
subject matter. In other words, he must be fully acquainted with all the facts,
principles, misconceptions, and fallacies in a particular field so that he may be
able to write good and appropriate items.
The item writer must be fully aware of those persons for whom the test is
meant. He must also be aware of the intelligence level of those persons so that
he may manipulate the difficulty level of the items for proper adjustment with
their ability level. He must also be able to avoid irrelevant clues to correct
responses.
The item writer must be familiar with different types of items along with their
advantages and disadvantages. He must also be aware of the characteristics of
good items and the common probable errors in writing items.
The item writer must have a large vocabulary. He must know the different
meanings of a word so that confusion in writing the items may be avoided. He
must be able to convey the meaning of the items in the simplest possible
language.
Always give due importance to the difficulty level of test items (easy, average,
difficult).
Before starting to write the items, divide the whole unit/ variable into sub-
units/ components and decide the weightage given to each sub-unit/
component.
Follow all these rules and write down the test items.
Arrangement of Items:
After the items have been written down, they are reviewed by some experts or by the
item writer himself and then arranged in the order in which they are to appear in the
final test. Generally, items are arranged in increasing order of difficulty, and those
having the same form (say alternative-response, matching, multiple-choice, etc.) and
dealing with the same content are placed together.
3. PRELIMINARY ADMINISTRATION:
Before proceeding toward administration, the test should be reviewed by at least three
experts. When the test items have been written down and modified in the light of the
suggestions and criticisms given by the experts, the test is said to be ready for
experimental try-out.
The Pre-Try-Out:
The first administration of the test is called the pre-try-out. The sample size for the
pre-try-out should be about 400. The main purposes of the pre-try-out of any
psychological or educational test are as follows:
ITEM ANALYSIS
Item analysis is a statistical technique which is used for selecting and rejecting the items of
a test on the basis of their difficulty value and discriminating power.
2. Take 27% of answer sheets having highest scores and mark them as Higher group (H).
3. Take 27% of answer sheets having lowest scores and mark them as Lower group (L)
4. Count the number of Right answers in the Higher group for each question and mark it as
(RH).
5. Count the number of Right answers in the Lower group for each question and mark it as
(RL).
Then calculate the item difficulty or Difficulty Index (DI) and the Discriminating Power (DP):

DI = (RH + RL) / (NH + NL)

DP = (RH - RL) / NH
or
DP = (RH - RL) / NL

where RH is the number of right answers in the higher group for a particular question,
RL is the number of right answers in the lower group for that question, and NH and NL
are the numbers of answer sheets in the higher and lower groups respectively.

After item analysis, items with a difficulty index between 0.25 and 0.75 and a
discriminating power greater than 0.4 are selected for the final test.
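The index computations and the selection rule above can be sketched in Python (the counts are hypothetical; NH and NL are the sizes of the two 27% groups):

```python
def item_analysis(rh, rl, nh, nl):
    """Compute the Difficulty Index (DI) and Discriminating Power (DP)
    for one item from the 27% higher (H) and lower (L) scoring groups.

    rh, rl -- number of right answers in the higher / lower group
    nh, nl -- number of examinees in the higher / lower group
    """
    di = (rh + rl) / (nh + nl)   # proportion answering correctly
    dp = (rh - rl) / nh          # NH = NL when equal 27% groups are used
    return di, dp

def keep_item(di, dp):
    """Selection rule from the text: 0.25 <= DI <= 0.75 and DP > 0.4."""
    return 0.25 <= di <= 0.75 and dp > 0.4

# Example: 100 examinees per group; 60 correct in H, 20 correct in L.
di, dp = item_analysis(60, 20, 100, 100)
print(di, dp)   # DI = 0.4, DP = 0.4 -> rejected (DP is not above 0.4)
```

The rule keeps items of moderate difficulty that clearly separate high scorers from low scorers.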
Based on the pre-try-out and item analysis, the author prepares the final test.
4. ESTABLISHING RELIABILITY OF THE FINAL TEST
Test-retest
Alternate form
Split-half method
Test-Retest Method:
It is the oldest and most commonly used method of testing reliability. The test-retest
method assesses the external consistency of a test: it measures the stability of a test
over time. Examples of appropriate tests include questionnaires and psychometric tests.
A typical assessment involves giving participants the same test on two separate
occasions, keeping everything from start to end the same in both administrations. The
results of the first test are then correlated with the results of the second test. If the
same or similar results are obtained, external reliability is established.
The timing of the retest is important. If the interval is too brief, participants may
recall information from the first test, which could bias the results. Alternatively, if the
interval is too long, it is feasible that the participants could have changed in some
important way, which could also bias the results.
The utility and worth of a psychological test decrease with time, so the test should be
revised and updated. When tests are not revised, systematic error may arise.
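As a sketch, the test-retest coefficient is simply the Pearson correlation between the two administrations; the scores below are hypothetical:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores of five examinees on the same test,
# administered twice with a reasonable gap between sittings.
first  = [12, 15, 11, 18, 14]
second = [13, 14, 12, 17, 15]

r = pearson_r(first, second)
print(round(r, 2))   # 0.95 -> high temporal consistency (stability)
```

A coefficient close to +1 indicates that the ranking of examinees is stable across the two administrations.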
Alternate Form:
In the alternate-form method, two equivalent forms of the test are administered to the same
group of examinees. An individual is given one form of the test, and after a period of
time the person is given a different version of the same test. The two forms of the test
are then correlated to yield a coefficient of equivalence.
Split-Half Method:
The split-half method assesses the internal consistency of a test. It measures the
extent to which all parts of the test contribute equally to what is being measured. The
test is typically split into odd-numbered and even-numbered items. The reason is that
when we make a test we usually arrange the items in order of increasing difficulty; if
we put items 1-10 in one half and items 11-20 in the other half, then all the easy
items would go into one group and all the difficult items into the second group.
When we split the test we should also split it within the same format/theme, e.g.,
multiple-choice items with multiple-choice items, or fill-in-the-blanks with
fill-in-the-blanks.
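A minimal sketch of the odd-even split follows. The 0/1 score matrix is made up, and the final step uses the standard Spearman-Brown prophecy formula (not named in the text) to correct the half-test correlation up to full test length:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 column per item.
    Split into odd- and even-numbered items (not first half vs second
    half, since items are ordered by difficulty), correlate the two
    half-scores, then step up to full length with Spearman-Brown."""
    odd  = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)                 # Spearman-Brown

# Hypothetical responses: 5 examinees x 6 items, 1 = correct.
data = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(data), 2))
```

The step-up correction is needed because correlating two half-length tests underestimates the reliability of the full-length test.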
5. ESTABLISHING VALIDITY OF THE FINAL TEST
Validity refers to the extent to which a test measures what it claims to measure.
If a test is reliable, it is not necessarily valid; but if a test is valid, it must be reliable.
Types of Validity:
Face validity
Construct validity
Criterion related validity
Face Validity
Face validity is determined by a review of the items and not through the use of
statistical analysis. Face validity is not investigated through formal procedures.
Instead anyone who looks over the test, including examinees, may develop an
informal opinion as to whether or not the test is measuring what it is supposed to
measure. While it is clearly of some value to have the test appear to be valid, face
validity alone is insufficient for establishing that the test is measuring what it claims
to measure.
Construct Validity:
It implies using the construct correctly (concepts, ideas, notions). Construct validity
seeks agreement between a theoretical concept and a specific measuring device or
procedure.
For example, a test of intelligence nowadays must include measures of multiple
intelligences, rather than just logical-mathematical and linguistic ability measures.
Criterion-Related Validity:
Criterion-related validity is used to demonstrate the accuracy of a measure or procedure
by comparing it with another measure or procedure which has already been demonstrated
to be valid.
6. ESTABLISHING NORMS:
Types of norms:
Age norms
Grade norms
Percentile norms
Standard scores norms
Not all of these types of norms are suited to every type of test. Keeping in view the
purpose and type of the test, the test constructor develops a suitable norm for the test.
Age Norm
Age norms indicate the average performance of different samples of test takers who
were at various ages at the time the test was administered.
The child of any chronological age whose performance on a valid test of intellectual
ability indicated that he or she had intellectual ability similar to that of the average
child of some other age was said to have the mental age of the norm group in which
his or her test score fell.
The reasoning here was that, irrespective of chronological age, children with the same
mental age could be expected to read the same level of material, solve the same kinds
of math problems, and reason with a similar level of judgment. But some have
complained that the concept of mental age is too broad: although a 6-year-old might,
for example, perform intellectually like a 12-year-old, the 6-year-old might not be very
similar at all to the average 12-year-old socially, psychologically and otherwise.
Grade Norms:
Grade norms are designed to indicate the average test performance of test takers in a
given school grade; they are developed by administering the test to representative
samples of children over a range of consecutive grade levels.
Like age norms, grade norms have widespread application with children of
elementary school age. The thought here is that children learn and develop at varying
rates but in ways that are in some respects predictable.
One drawback of grade norms is that they are useful only with respect to years and
months of schooling completed. They have little or no applicability to children who
are not yet in school or who are out of school.
Percentile Norms:
The percentile system is a ranking of test scores that indicates the proportion of scores
falling below a given score. A percentile is an expression of the percentage of people
whose score on a test or measure falls below a particular raw score. A more familiar
description of test performance, the concept of percentage correct, must be
distinguished from the concept of a percentile.
Because percentiles are easily calculated, they are a popular way of organizing test
data and are adaptable to a wide range of tests.
For example, marks obtained in a paper out of 100 are applicable only in a specific
area, but when they are converted into a GPA they become a standard score.
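The percentile rank described above can be sketched as follows; the norm-group scores are made up, and the definition used here counts only scores strictly below the given raw score (some texts also count half of the ties):

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in the norm group that fall below
    the given raw score."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)

# Hypothetical norm group of 20 raw scores.
norms = [35, 41, 44, 47, 50, 52, 55, 57, 58, 60,
         61, 63, 65, 68, 70, 72, 75, 78, 82, 90]

print(percentile_rank(61, norms))   # 50.0 -> half the group scored lower
```

Note the contrast with percentage correct: a raw score of 61 out of 100 is 61% correct, but its percentile rank depends entirely on how the norm group performed.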
7. PREPARATION OF MANUAL AND REPRODUCTION OF THE TEST:
The last step in test construction is the preparation of a manual for the test. In the
manual, the test constructor reports the psychometric properties of the test, the norms
and the references. This gives a clear indication of the procedures of test
administration, the scoring methods and the time limits, if any, of the test. It also
includes instructions as well as details of the arrangement of the material, that is,
whether items have been arranged in random order or in some other order. The test
constructor finally orders the printing of the test and the manual.
MARKING vs GRADING
The system of examination in which marks are awarded to students for their achievement
in various subjects, with students generally evaluated on a 101-point scale (0 to 100), is
known as the marking system of examination.
Grading
Grading, whether with a numerical value, letter grade or pass/fail designation, indicates the
degree of accomplishment achieved by a learner. Differentiating between norm-referenced
grading and criterion-referenced grading is important. Norm-referenced grading evaluates
student performance in comparison to other students in a group or program, determining
whether the performance is better than, worse than or equivalent to that of other students
(Gaberson, Oermann, & Shellenbarger, 2015). Criterion-referenced grading evaluates student
performance in relation to predetermined criteria and does not consider the performance of
other students (Gaberson, Oermann, & Shellenbarger, 2015).
A learner's grade in norm-referenced grading reflects accomplishment in relation to others in
the group. Only a select few can earn top grades, most will receive mid-level grades, and at
least some will receive failing grades. Norm-referenced grading is based on the symmetrical
statistical model of a bell or normal distribution curve.
Types of Grading
1. Direct Grading
2. Indirect Grading
The two types of indirect grading are:
Absolute Grading
In this type, marks are initially awarded on a 101-point scale (0 to 100) and then these
marks are converted into grades.
In this system, it is possible for all of your students to pass and even for all of them to get As.
If all of your students score a 90 or above on the test you have just given, then all of your
students will get an A on this test.
A = 90-100
B = 80-89
C = 70-79
D = 60-69
F = 0-59
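The fixed cut-offs above map directly to a small function; this is a sketch using exactly the scale given in the text:

```python
def absolute_grade(marks):
    """Convert a mark on the 101-point scale (0-100) to a letter grade
    using fixed cut-offs: every student who clears a cut-off gets that
    grade, regardless of how the rest of the class performed."""
    if marks >= 90:
        return "A"
    if marks >= 80:
        return "B"
    if marks >= 70:
        return "C"
    if marks >= 60:
        return "D"
    return "F"

print([absolute_grade(m) for m in (95, 84, 72, 65, 40)])
# ['A', 'B', 'C', 'D', 'F']
```

Because the cut-offs are absolute, a class where everyone scores 90 or above yields all A's.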
Relative Grading
The other kind of grading system is called relative grading. In this system, grades are given
based on the student's score compared to the others in the class. Relative grading allows for
the teacher to interpret the results of an assessment and determine grades based on student
performance. One example of this is grading “on the curve.” In this approach, the grades of
an assessment are forced to fit a “bell curve” no matter what the actual distribution is.
As such, even if the entire class scored between 90 and 100% on an exam, relative
grading would still create a balanced distribution of grades. Whether this is fair or not is
another discussion.
Some teachers will divide the class grades by quartiles with a spread from A-D. Others will
use the highest grade achieved by an individual student as the A grade and mark other
students based on the performance of the best student.
There are times when institutions would set the policy for relative grading. For example, in a
graduate school, you may see the following grading scale.
A = top 60%
B = next 30%
C = next 10%
D, F = Should never happen
The philosophy behind this is that in graduate school all the students are excellent, so the
grades should be better. Earning a “C” is the same as earning an “F.” Earning a “D” or “F”
often leads to removal from the program.
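The graduate-school policy above (top 60% A, next 30% B, remaining 10% C) can be sketched as rank-based assignment; the scores and the band fractions are illustrative, and ties at a band boundary would be split arbitrarily in this simple version:

```python
def relative_grades(scores, bands=(("A", 0.60), ("B", 0.30), ("C", 0.10))):
    """Assign grades by rank within the class: the top 60% of scores
    get an A, the next 30% a B, and the remaining 10% a C.
    Returns {student_index: grade}."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = {}
    start = 0
    for letter, fraction in bands:
        count = round(fraction * len(scores))
        for i in order[start:start + count]:
            grades[i] = letter
        start += count
    for i in order[start:]:          # any leftover from rounding
        grades[i] = bands[-1][0]     # falls into the last band
    return grades

# Ten students; even if all scored 90-100, the bands would still
# be filled in the same 60/30/10 proportions.
scores = [91, 88, 95, 70, 84, 60, 99, 77, 85, 93]
g = relative_grades(scores)
print(g[6], g[5])   # highest scorer gets A, lowest gets C
```

This makes the fairness question concrete: the student scoring 60 receives the same grade whether the rest of the class scored in the 90s or the 60s.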
Advantages of Grading System
QUESTION BANK
A question bank is a planned library of test items designed to fulfill certain
predetermined purposes. The question bank makes available statistically sound
questions of known technical worth, along with model question papers, and thus
facilitates the selection of proper questions for a well-designed question paper. A
question bank should be prepared with utmost care so as to cover the entire prescribed
text; it should be exhaustive and cover the entire content with different types of questions.
DEFINITION
• “An item bank is defined as an organized collection of test items that can be
accessed for test development.” - Rudner.
• “An item bank or question bank is a collection of test items organized, classified
and catalogued in order to facilitate the construction of a variety of achievement and
other types of mental tests.” - B. H. Choppin.
PURPOSES
• A pool of test items can be used for formative and summative evaluation of the
student’s performance
NEED
• Before a test, teachers generally do not get adequate time to prepare the questions.
Naturally, in the absence of adequate time, they prepare them haphazardly.
• If there is an item or question bank, the material from it can be used by any teacher
and by any school.
PRINCIPLES
• Use a variety of testing methods
• While framing questions, it has to be ensured that they are unambiguous, simple in
language, and as brief as possible.
• Each question should evaluate some specific content area or learning outcome.
• The difficulty level of the items should be appropriate to the group of learners being tested.
• All the objective items should be grouped in one section, while the short-answer
and essay-type items should be in another section.
• Within the section of objective items, those having the same format, e.g., Yes-No
type, True-False type, multiple-choice type, etc., should be grouped together.
STEPS
1. Planning:
What types of questions make up the bank depends entirely on the total frame of
reference envisaged at the planning stage; the scope of the question bank is determined
by taking such decisions. It should be decided whether only written examinations, oral
examinations, practical examinations, or all of these are to be stored in the bank.
2. Preparation of the blueprint:
The preparation of a blueprint is essential in developing a question bank. If the items
are not relevant to the objectives of the program for which the question pool is being
developed, the result is a hodge-podge of questions. Therefore, blueprints help to
generate a quality question pool.
3. Writing of questions:
b) Ready-made questions may be lifted from old question papers, from standardized
tests and from the review exercises of good textbooks. Specifications of the questions
must be given, and each question should be accompanied by an answer key, besides
indicating the objectives and content area.
c) In the case of new questions, these may be invited from experienced teachers,
examiners and paper setters on the basis of some honorarium. Specifications of the
questions must be given, and questions should be accompanied by an answer key,
besides indicating the objectives and content area.
d) Get the questions prepared by practicing teachers invited to a workshop for the
purpose. This get-together helps to discuss the questions face-to-face and obtain
quality questions.
e) In such a workshop, subject areas and objectives may be allotted to participants
according to their competencies.
4. Validation:
Screening of questions:
a) After the questions are written, the question sheets are passed on to other members
of the group for their comments. These comments are passed on to the author of the
question, who, in consultation with two or three participants, finalizes the questions.
Individual questions are written on the blackboard by the author, followed by
discussion by the participants. Though this is a time-consuming process, it is
educationally more potent: not only is the quality of the questions improved, but it
also provides good training to the participants in framing good questions.
b) Second-level screening is done with the help of three subject experts who are
conversant with the techniques of test construction. Such a group may consist of a
subject specialist who would pass judgment on the authenticity of the subject matter.
Another person is a teacher associated with that class who helps to judge the
suitability of the question for a particular grade level. A third person may be an
evaluation expert who helps to improve the format of the question in the light of the
objectives to be tested.
6. Developing a system for maintaining confidentiality
PURPOSES
• The task is clear in each item and the person attempting an item will know what is
expected. The task in an item is understood in the same way by all candidates.
• The items are set within and based on the objectives and course contents outlined
in the syllabus.
• The questions are well distributed in the different parts of the syllabus course
contents/cover syllabus adequately.
• The items are a fair assessment of candidates at a particular level, and if they are
not, they should be tempered to the level of the candidates; this, among other things,
is what moderation ensures.
• The items are technically correct and accurate, offering the best way of testing the
concepts, principles or knowledge they are intended to test. They should not contain
clues to the correct answers.
• The items are original and not just copied from the text books or past examination
papers.
• Different types of questions selected from a question bank may be used for
pre-testing, development, review and revision of a lesson.
• In the preparation of textual material, a question pool can be utilized for preparing
review exercises in textbooks.
• The preparation of teaching units or resource units also involves the use of
evaluation materials, which may be picked up from the question bank.
• For evaluating pupils' progress, the question bank can be used most efficiently.
Individual questions can be stored and grouped for use in topic or unit testing in
periodical tests.
• Individual questions, unit tests and question papers can be profitably used by the
examining agencies by making the question bank available for paper setters.
• When question banks are established in institutions, students can use them for self
evaluation in their spare time.
• When questions on all topics of the prescribed syllabus are available pupils can
review the lessons. Even teachers can make use of such data for quick revision.