
There are two ways of assessing pupils: formal summative assessment and informal
formative assessment. Find out the benefits of both to pupils’ learning outcomes.

Formative assessment and summative assessment are two overlapping, complementary
ways of assessing pupil progress in schools. While the common goal is to establish
the development, strengths and weaknesses of each student, each assessment type
provides different insights and actions for educators. The key to holistic assessment
practice is to understand what each method contributes to the end goals — improving
school attainment levels and individual pupils’ learning — and to maximise the
effectiveness of each.

Both terms are ubiquitous, yet teachers sometimes lack clarity around the most effective
types of summative assessment and more creative methods of formative assessment.
In our latest State of Technology in Education report, we learnt that more educators
are using online tools to track summative assessment than formative, for example. Yet
this needn’t be the case. In this post we will explain the difference between these two
types of assessment, outline some methods of evaluation, and assess why both are
essential to student development.

Summative assessment explained

Summative assessment aims to evaluate student learning and academic
achievement at the end of a term, year or semester by comparing it against a universal
standard or school benchmark. Summative assessments often have a high point value,
take place under controlled conditions, and therefore have more visibility.

Summative assessment examples:

• End-of-term or midterm exams
• Cumulative work over an extended period, such as a final project or creative portfolio
• End-of-unit or chapter tests
• Standardised tests that demonstrate school accountability and are used for pupil admissions,
such as SATs, GCSEs and A-Levels

Why is summative assessment important for learning?
In the current education system, standard-driven instruction plays a significant role.
Summative assessment, therefore, provides an essential benchmark to check the
progress of students, institutions and the educational program of the country as a whole.

Summative assessment contributes largely towards improving the British curriculum and
overall curriculum planning. When summative assessment data indicates gaps across the
board between student knowledge and learning targets, schools may turn to improved
curriculum planning and new learning criteria to assess and improve their school
attainment levels.

Formative assessment explained

Formative assessment is more diagnostic than evaluative. It is used to monitor pupil
learning style and ability, and to provide ongoing feedback that allows educators to improve
and adjust their teaching methods and students to improve their learning.

Most formative assessment strategies are quick to use and fit seamlessly into the
instruction process. The information gathered is rarely marked or graded. Descriptive
feedback may accompany formative assessment to let students know whether they have
mastered an outcome or whether they require more practice.
Formative assessment examples:

• Impromptu quizzes or anonymous voting
• Short comparative assessments to see how pupils are performing against their peers
• One-minute papers on a specific subject matter
• Lesson exit tickets to summarise what pupils have learnt
• Silent classroom polls
• Visualisations or doodle maps in which students show what they learnt

Why is formative assessment important for learning?


Formative assessment is a flexible and informal way of assessing a pupil’s progress and
their understanding of a certain subject matter. It may be recorded in a variety of ways,
or may not be recorded at all, except perhaps in lesson planning to address the next
steps.

Formative assessment helps students identify their strengths and weaknesses and target
areas that need work. It also helps educators and governors recognise where students
are struggling and address problems immediately. At a school level, SMT and school
leaders use this information to identify areas of strength and weakness across the
institution, and to develop strategies for improvement.

As the learning journey progresses, further formative assessments indicate whether
teaching plans need to be revised to reinforce or extend learning.

Measurement refers to the process by which the attributes or dimensions of some physical object are
determined. One exception seems to be in the use of the word measure in determining the IQ of a
person. The phrase, "this test measures IQ" is commonly used. Measuring such things as attitudes or
preferences also applies. However, when we measure, we generally use some standard instrument to
determine how large, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard
instruments refer to physical devices such as rulers, scales, thermometers, pressure gauges, etc. We
measure to obtain information about what is. Such information may or may not be useful, depending on
the accuracy of the instruments we use, and our skill at using them. There are few such instruments in
the social sciences that approach the validity and reliability of say a 12" ruler. We measure how big a
classroom is in terms of square feet or cubic feet, we measure the temperature of the room by using a
thermometer, and we use a multimeter to determine the voltage, amperage, and resistance in a circuit.
In all of these examples, we are not assessing anything; we are simply collecting information relative to
some established rule or standard. Assessment is therefore quite different from measurement, and has
uses that suggest very different purposes. When used in a learning objective, the definition provided on
the ADPRIMA site for the behavioral verb measure is: To apply a standard scale or measuring device to
an object, series of objects, events, or conditions, according to practices accepted by those who are
skilled in the use of the device or scale. An important point in the definition is that the person be skilled in
the use of the device or scale. For example, a person who has in his or her possession a working Ohm
meter, but does not know how to use it properly, could apply it to an electrical circuit but the obtained
results would mean little or nothing in terms of useful information.

Assessment is a process by which information is obtained relative to some known objective or goal.
Assessment is a broad term that includes testing. A test is a special form of assessment. Tests are
assessments made under contrived circumstances especially so that they may be administered. In other
words, all tests are assessments, but not all assessments are tests. We test at the end of a lesson or unit.
We assess progress at the end of a school year through testing, and we assess verbal and quantitative
skills through such instruments as the SAT and GRE. Whether implicit or explicit, assessment is most
usefully connected to some goal or objective for which the assessment is designed. A test or assessment
yields information relative to an objective or goal. In that sense, we test or assess to determine whether or
not an objective or goal has been obtained. Assessment of skill attainment is rather straightforward.
Either the skill exists at some acceptable level or it doesn’t. Skills are readily demonstrable. Assessment
of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We
can assess a person’s knowledge in a variety of ways, but there is always a leap, an inference that we
make about what a person does in relation to what it signifies about what he knows. In the section on this
site on behavioral verbs, to assess means To stipulate the conditions by which the behavior specified in
an objective may be ascertained. Such stipulations are usually in the form of written descriptions.

Evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of
evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed
to provide information that will help us make a judgment about a given situation. Generally, any
evaluation process requires information about the situation in question. A situation is an umbrella term
that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we
evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness,
goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been
made. For example, I often tell my students that if they wanted to determine the temperature of the classroom
they would need to get a thermometer and take several readings at different spots, and perhaps average
the readings. That is simple measuring. The average temperature tells us nothing about whether or not it
is appropriate for learning. In order to do that, students would have to be polled in some reliable and valid
way. That polling process is what evaluation is all about. A classroom average temperature of 75 degrees
is simply information. It is the context of the temperature for a particular purpose that provides the criteria
for evaluation. A temperature of 75 degrees may not be very good for some students, while for others, it
is ideal for learning. We evaluate every day. Teachers, in particular, are constantly evaluating students,
and such evaluations are usually done in the context of comparisons between what was intended
(learning, progress, behavior) and what was obtained. When used in a learning objective, the definition
provided on the ADPRIMA site for the behavioral verb evaluate is: To classify objects, situations, people,
conditions, etc., according to defined criteria of quality. Indication of quality must be given in the defined
criteria of each class category. Evaluation differs from general classification only in this respect.
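
The classroom temperature example can be sketched in a few lines of Python. This is only an illustration: the readings, the polled comfort ranges and the 0.75 threshold are all hypothetical values chosen for the sketch, not data from any real classroom. The first step is pure measurement; the second applies a criterion of quality, which is what turns it into evaluation.

```python
from statistics import mean

# Hypothetical thermometer readings (degrees Fahrenheit) taken at different spots.
readings_f = [74.0, 75.5, 76.0, 74.5]

# Measurement: the average tells us what the temperature is, nothing more.
average_f = mean(readings_f)

# Hypothetical poll: each student reports the range in which they learn comfortably.
comfort_ranges_f = [(68, 74), (70, 76), (72, 78), (73, 77)]


def share_comfortable(temperature, ranges):
    """Return the fraction of polled students whose range includes the temperature."""
    inside = [low <= temperature <= high for low, high in ranges]
    return sum(inside) / len(inside)


# Evaluation: judge the measurement against a criterion of quality.
# The 0.75 threshold is an arbitrary criterion used only for illustration.
share = share_comfortable(average_f, comfort_ranges_f)
verdict = "appropriate" if share >= 0.75 else "not appropriate"
print(f"average {average_f:.1f} F, {share:.0%} comfortable, {verdict} for learning")
```

Changing the criterion or the poll changes the verdict while the measurement stays exactly the same, which is the distinction the paragraph draws.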

To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of
criteria. These three terms certainly share some common attributes, but it is useful to think of them as
separate but connected ideas and processes.

There are four basic measurement scales. From least complex to most complex, they are: nominal,
ordinal, interval, and ratio. They are fundamental to the process of measurement, and without an
understanding of their differences, at best poor information will be derived, and at worst, erroneous
conclusions will be reached.
The measurement scale descriptions:

Nominal measurement scales refer to those measurements when the only meaningful results are the
delineations that one thing is different from another. For example, if you have a bag of apples and a
bucket of coal, the only measurement possible involves the nominal scale. All you can say is that one set
is apples and the other set is coal. It is a measurement where the only conclusion you can reach is that
one thing is different from another. Another way to consider the nominal measurement scale is to think of
it as a basic classification system. It might also be worthwhile to take a look at the behavioral verb
"classify." In the nominal scale you are essentially classifying by name. It is always a good idea to be as
clear as possible when doing this.

Ordinal measurement scales refer to those measurements where the results indicate only that one thing
is either greater or lesser than another. This always means a measurement that implies that the
objects, events or processes can be placed into some order. The assigning of grades based on scores is
an example of this scale, with, for example, the observation that a grade of "A" represents not only a
different value than a grade of "C" but that it also represents a higher or greater value.

Interval measurement scales refer to those measurements where there are equal intervals between
given values. Interval scales are used in almost every aspect of common measurement. A ruler employs
an interval scale. That means that the distance between three inches and six inches is the same as the
distance between nine inches and twelve inches. In a room thermometer, the difference in degrees
between 72 Fahrenheit and 78 Fahrenheit is the same as that between 90 degrees Fahrenheit and 96
degrees. The intervals are the same.

Ratio measurement scales are the same as interval scales with one important difference. The difference
is that ratio measurement scales contain a true zero, a point that indicates the complete absence of the
quantity being measured. The true zero makes ratios meaningful, so that 20 kilograms can be said to be
twice as heavy as 10 kilograms. Length, weight and temperature in kelvins are ratio measurements. An
outdoor thermometer marked in Fahrenheit or Celsius is not: its intervals are equal, but its zero is
arbitrary, which is why it can show negative values such as -10 degrees Celsius and why 20 degrees is
not twice as hot as 10 degrees.

So there in a nutshell you have it. Measurement always involves some sort of scale, and the observations
linked to the measurements can be noted as a simple difference of name and thus a simple classification.
One step up in complexity is the ordinal scale, which implies that there is an order to the object or process,
and one thing can be said to be not just different, but greater or lesser than another. The next up in
complexity, the interval scale, is the most frequently used for measurement and rests on the certainty of
equal intervals between sequential points on the scale. Finally there are ratio scales, which are exactly
like interval scales with the addition of a zero point.
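
One way to keep the four scales apart is to ask which comparison is meaningful on each. The sketch below is a rough illustration in Python that reuses the apples and coal, letter grades, and thermometer and ruler examples from the text. The variable and function names are invented for the sketch, and the ratio line assumes the usual reading that a ratio scale has a zero meaning "none of the quantity".

```python
# Nominal: names only, so the one meaningful comparison is same or different.
items = ["apple", "coal", "apple"]
same_kind = items[0] == items[2]

# Ordinal: categories have an order, but the distance between them is undefined.
grade_order = ["F", "D", "C", "B", "A"]  # lowest to highest


def outranks(first, second):
    """Return True if the first letter grade is higher than the second."""
    return grade_order.index(first) > grade_order.index(second)


a_beats_c = outranks("A", "C")

# Interval: equal steps, so differences between readings can be compared.
low_difference = 78 - 72   # degrees Fahrenheit
high_difference = 96 - 90  # degrees Fahrenheit
equal_intervals = low_difference == high_difference

# Ratio: a zero point that means "none", so quotients are meaningful.
length_ratio = 12 / 6  # twelve inches is twice as long as six inches

print(same_kind, a_beats_c, equal_intervals, length_ratio)
```

Each step up keeps the operations of the scale below it and adds one more, which is why the scales are usually listed from least to most complex.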

This is provided to give a little perspective on the description of educational measurement.


Educational measurement and evaluation uses several terms that a learner needs to
understand before studying the course, both for effective understanding and for mastery
of the relevant concepts. The terms are as follows:
Measurement: the process of quantifying an individual’s achievement, personality, attitudes,
habits and skills; or the process by which information about the attributes or characteristics
of things is determined and differentiated.
Evaluation: the qualitative aspect of determining the outcomes of learning; the process of
ranking with respect to attributes or traits.
Assessment: the process by which information is gained relative to some known
objective or goal.

SUBJECTIVE AND OBJECTIVE TESTS

Objective test: this is a test consisting of factual questions requiring extremely short
answers that can be quickly and unambiguously scored by anyone with an answer key.
They are tests that call for short answer which may consist of one word, a phrase or a
sentence.
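
Because each objective item has one keyed answer, scoring can be reduced to a mechanical comparison. The following is a minimal sketch, assuming answers are stored as short strings; the function name, the question labels and the decision to ignore letter case and surrounding spaces are illustrative choices, not part of any standard marking scheme.

```python
def score_objective_test(answer_key, responses):
    """Count the responses that match the key, ignoring case and outer spaces."""

    def normalise(text):
        return text.strip().lower()

    return sum(
        1
        for question, correct in answer_key.items()
        if normalise(responses.get(question, "")) == normalise(correct)
    )


# Hypothetical key and one learner's responses.
answer_key = {"q1": "Nairobi", "q2": "true", "q3": "D"}
responses = {"q1": " nairobi ", "q2": "false", "q3": "d"}

score = score_objective_test(answer_key, responses)
print(f"{score} out of {len(answer_key)}")
```

Any marker running the same comparison against the same key arrives at the same score, which is what makes the test objective.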

Subjective test: this is a type of test that is scored according to the marker’s judgment or
opinion. They are more challenging and expensive to prepare, administer and evaluate
correctly, though they can be more valid.

TYPES OF OBJECTIVE TEST ITEMS


They include the following:
I. True-false items
II. Matching items
III. Multiple choice items
IV. Completion items

1) True-false test items

Here, a factual statement is made and the learner is required to respond with either true
or false depending on the correctness of the statement. They are easy to prepare, can
be marked objectively and cover a wide range of topics.

ADVANTAGES

• Can test a large body of material.
• They are easy to score.

DISADVANTAGES

• It is difficult to construct questions that are definitely or unequivocally true or false.
• They are prone to guessing.
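
How prone true-false items are to guessing can be put in numbers. The sketch below assumes a learner who guesses blindly and independently on every item, so the number of correct answers follows a binomial distribution. The 20-item test and the pass mark of 10 are hypothetical, and the four-option case is included only for comparison with multiple choice items.

```python
from math import comb


def expected_guess_score(num_items, num_options):
    """Expected number of items answered correctly by blind guessing."""
    return num_items / num_options


def prob_score_at_least(num_items, num_options, target):
    """Binomial probability of guessing at least `target` items correctly."""
    p = 1 / num_options
    return sum(
        comb(num_items, k) * p**k * (1 - p) ** (num_items - k)
        for k in range(target, num_items + 1)
    )


# Hypothetical 20-item test with a pass mark of 10.
true_false_expected = expected_guess_score(20, 2)   # two options per item
four_option_expected = expected_guess_score(20, 4)  # four options per item
true_false_pass = prob_score_at_least(20, 2, 10)
four_option_pass = prob_score_at_least(20, 4, 10)

print(true_false_expected, four_option_expected)
print(f"{true_false_pass:.3f} {four_option_pass:.3f}")
```

Under these assumptions a pure guesser is expected to get half of the true-false items right, and is more likely than not to reach the pass mark, which is far less likely with four options per item.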
2) MATCHING ITEMS
These involve connecting the contents of one list to the contents of another. The learners are
presented with two columns of items, for instance column A and column B, and must match
the content in both columns correctly.

Advantages:
a. Measures primarily associations and relationships as well as sequence of events.
b. Can be used to measure questions beginning with who, when, where and what
c. Relatively easy to construct
d. They are easy to score

Disadvantages:
• Difficult to construct effective questions that measure higher order thinking and
contain a number of plausible distractors.
3) MULTIPLE CHOICE TEST ITEMS
In a multiple choice item, a statement of fact is made. It is followed by four or five
alternative responses from which only the best or correct one must be selected. The
statement or question is termed the ‘stem’. The alternatives or choices are termed
‘options’, and the ‘key’ is the correct alternative. The other options are called ‘distractors’.
Advantages:
• Measures a variety of levels of learning.
• They are easy to score.
• Can be analyzed to yield a variety of statistics.
• When well constructed, has proven to be an effective assessment tool.
Disadvantages:
• Difficult to construct effective questions that measure higher order thinking and
contain a number of plausible distractors.

4) COMPLETION ITEMS OR SHORT ANSWER TEST ITEMS


In this, learners are required to supply the words or figures which have been left out.
They may be presented in the form of questions or phrases in which a learner is
required to respond with a word or several statements.

Advantages:
• Relatively easy to construct.
• Can cover a wide range of content.
• Reduces guessing.
Disadvantages:
• Primarily used for lower levels of thinking.
• Prone to ambiguity.
• Must be constructed carefully so as not to provide too many clues to the correct
answer.
• Scoring is dependent on the judgment of the evaluator.
TYPES OF SUBJECTIVE TEST ITEMS
In subjective tests we have two types of item:
• Restricted response items
• Extended response items

Restricted response items. On restricted response items, examinees provide brief
answers, usually no more than a few words or sentences, to fairly structured questions.
Extended response items. Here, items require lengthy responses that count heavily in
scoring. These items focus on major concepts of the content unit and demand higher
level thinking. Examinees must organize multiple ideas and provide supporting
information for major points in crafting responses.
Advantages of restricted response items
a. Measures specific learning outcome.
b. Restricted response items provide for more ease of assessment
c. Restricted response item is more structured
d. Any outcomes measured by an objective interpretive exercise can be measured by a
restricted subjective item.

Limitations of restricted response items


a. Restricting the scope of the topic to be discussed and indicating the nature of the
desired response limits students’ opportunity to show how they would organise and
develop a response of their own.

Advantages of Extended response items


I. Measures knowledge at the higher cognitive levels of educational objectives, such as
analysis, synthesis and evaluation.
II. They expose individual differences in terms of attitudes, values and creative
thinking.

Limitations
i. They are inefficient for measuring knowledge of factual material because they call
for extensive detail in one selected content area at a time.
ii. Scoring is difficult and unreliable.
EXAMPLES OF SUBJECTIVE TEST ITEMS
Extended response item
Imagine that you and a friend found a magic wand. Write a story about an adventure
that you and your friend had with the magic wand.

Restricted response item


Why is the barometer one of the most useful instruments for forecasting weather?
Explain in a brief summary.
EXAMPLES OF OBJECTIVE TEST ITEMS
• Completion item example:
The capital city of Tanzania is ___________

• Matching item example:
Match the people in column A with the country they ruled in column B.

A               B
KENYATTA        UGANDA
OBOTE           TANZANIA
NYERERE         ZAIRE
                KENYA

• True-false item example:
John ate his supper yesterday. (true/false)

• Multiple choice item example:
Which of the following towns is the capital of Kenya?
A. NAKURU       C. MOMBASA
B. KISUMU       D. NAIROBI

TEST CONSTRUCTION
Tests should be constructed and administered in such a way that the scores (marks)
they yield reflect the ability they are supposed to measure.
The type of test to be constructed depends on the nature of the ability it is meant to
measure and the purpose of the test.
Certain types of educational tests can only be constructed by teams of suitably qualified
and equipped researchers.
The process of test construction is long and painstaking, for it involves creating large
batteries of test questions in the particular area to be examined, followed by extensive
trials in order to assess their effectiveness.
In this way, questions are eliminated which:
• Do not discriminate or distinguish between children whose abilities are different.
• Are frequently misunderstood by children.
• Have more than one correct answer.
• Give an advantage to certain children on the basis of factors other than those being
tested.
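
The first criterion, whether a question discriminates between children of different ability, is commonly checked after a trial with an upper-lower discrimination index. The text does not name a method, so the sketch below is an assumption: it compares how many of the highest-scoring and lowest-scoring pupils answered the item correctly. The 27% group size is a widely used convention, and the scores are hypothetical trial data.

```python
def discrimination_index(total_scores, item_correct, group_fraction=0.27):
    """Upper-lower discrimination index for a single item.

    total_scores: each pupil's total score on the whole test.
    item_correct: True/False per pupil for this item, in the same order.
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("one item result is needed per pupil")
    group_size = max(1, round(len(total_scores) * group_fraction))
    # Rank pupils from highest to lowest total score.
    ranked = sorted(
        zip(total_scores, item_correct), key=lambda pair: pair[0], reverse=True
    )
    upper_correct = sum(correct for _, correct in ranked[:group_size])
    lower_correct = sum(correct for _, correct in ranked[-group_size:])
    return (upper_correct - lower_correct) / group_size


# Hypothetical trial: ten pupils, one item.
totals = [95, 88, 80, 72, 65, 58, 50, 41, 33, 20]
item = [True, True, True, False, True, False, False, True, False, False]

index = discrimination_index(totals, item)
print(f"discrimination index: {index:.2f}")
```

An index near zero means strong and weak pupils did equally well on the item, and a negative index means weaker pupils did better, so either result would mark the question for review or removal.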
An ordinary teacher can help his or her pupils by using the different types of tests for
the particular purpose for which they are designed. The teacher therefore needs to
construct tests that tell him or her:
• What the pupils have learned from his or her teaching.
• How well they can perform the practical skills he or she has taught them.
• Whether they understand the underlying principles of what they are learning.
• How quickly and accurately they can work.
• How well they can apply what they know to problems they meet.
• Whether they have yet developed the intellectual skills that older children can perform, such as
the ability to analyze, deduce, compare, and evaluate.
CONSTRUCTION OF OBJECTIVE TESTS
They are tests that call for short answers which may consist of one word, a phrase or a
sentence. In these tests all possibility of human error or prejudice by the marker is
removed by constructing items that demand answers that are either right or wrong and
for each of which there is only one possible answer.
Guidelines for constructing true or false items
• Do not provide clues by using specific determiners such as ‘all’, ‘never’, ‘absolutely’ or ‘none’,
because they signal that the statement is false. Words such as ‘may’, ‘perhaps’,
‘sometimes’ and ‘could’ signal that the statement is true. If such words are to be used,
they must be balanced and used in both true and false statements.
• Statements must be irrevocably true or false, so they must be unambiguous (clear).
• Use of negative statements should be avoided.
• Limit each true or false statement to a single concept. True or false test items may require
the learner to underline a word or clause in a statement, correct a false statement or
trace a path in a maze.

2. CONSTRUCTION OF MATCHING TEST ITEMS


These items involve connecting the contents of one list to the contents of another list. The
learners are presented with two columns of items, for instance, column A and column B.
They are asked to match each item that appears in column A with an appropriate item
from column B. In such questions, an equal number of premises (the items in the left-hand
column) and responses may be provided, which is called balanced or perfect matching.
When an unequal number of premises and responses is provided, this is called unbalanced
or imperfect matching.
To control guess work, it is better to have more responses and fewer premises.
When writing the items in the columns, it is important to:
• Keep the expressions homogeneous.
• Make the items relatively short.
• Use a heading for each column that accurately describes its content.
• Specify the basis for matching.

3. COMPLETION OR SHORT ANSWER TEST ITEMS


In this, learners are required to supply the words or figures which have been left out.
They may be presented in the form of questions or phrases in which a learner is
required to respond with a word or several statements.
Questions must be specific and unambiguous. For instance: JOMO KENYATTA WAS
BORN IN_____________
This is ambiguous since it’s not clear whether it is his date of birth or the country or
place where he was born that is required.
Besides this, statements that leave out too many key words may not carry the intended
meaning. If the answer is numerical or a quantity, the unit must be indicated. The answer
required should be related to the main point or statement.
In constructing completion items, the blank should come last to ensure that the learners
read the whole question before supplying the answer. Unintentional help should not be
given in the question, for example, JUDAS ISCARIOT, WHO BETRAYED JESUS WAS
BORN AN_____________
In the above question ‘an’ provides unintentional help to the learners as it means that
the answer must begin with a vowel.

4. MULTIPLE CHOICE TEST ITEMS CONSTRUCTION

In a multiple choice question, a statement of fact is made. It is followed by four or five
alternative responses from which only the best or correct one must be selected.
The following are the guidelines that a teacher should use when constructing multiple
choice items:
• Draw a table of specification showing topics or subtopics and the skills to be tested. The
table of specification comes from the subject syllabus. The test items should be based on
the three domains of learning (cognitive, affective, psychomotor).
• The area emphasized during teaching should have more items.
• Questions should be based on Bloom’s taxonomy. Of the six levels of cognitive
objectives, multiple choice questions should mainly reflect comprehension, application and
analysis. There should be minor doses of knowledge, synthesis and evaluation:
knowledge is too basic, while synthesis is too complex. Allocation of marks for these
skills can be as follows:
  Knowledge: 12%
  Comprehension: 16%
  Application: 32%
  Analysis: 20%
  Synthesis: 12%
  Evaluation: 8%
  Total: 100%
• The stem of the question should state the problem clearly. It should not contain
unnecessary information.
• Options should be carefully selected and must include the best answer or key.
• Each question should be relevant and not far-fetched.
• All options should be almost equal in length.
• The distractors should be relevant and not far-fetched.
• Placement of the key should be unpredictable and should not follow a pattern.
• No item or option should provide clues to, or be the answer to, another question in the
same test.
• The reading difficulty and vocabulary level of items should correspond to the level of the
learners.
• All items should be independent.
• Avoid tricky questions.
• Ensure instructions to learners are clear.
• Edit the paper carefully.
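
The percentage allocation across Bloom's levels given in the guidelines above can be turned into whole marks for a paper of any size. The sketch below is one possible way to do the arithmetic, not a prescribed procedure: the function name is invented, and the largest-remainder rounding is an assumption added so that the parts always add up to the paper total.

```python
# Percentage weights taken from the allocation listed in the guidelines.
BLOOM_WEIGHTS = {
    "Knowledge": 12,
    "Comprehension": 16,
    "Application": 32,
    "Analysis": 20,
    "Synthesis": 12,
    "Evaluation": 8,
}


def allocate_marks(total_marks, weights=BLOOM_WEIGHTS):
    """Split total_marks across levels in proportion to the weights.

    Whole marks are assigned first; any marks left over go to the levels
    with the largest remainders, so the result always sums to total_marks.
    """
    weight_sum = sum(weights.values())
    parts = {
        level: divmod(total_marks * weight, weight_sum)
        for level, weight in weights.items()
    }
    allocation = {level: whole for level, (whole, _) in parts.items()}
    leftover = total_marks - sum(allocation.values())
    by_remainder = sorted(parts, key=lambda level: parts[level][1], reverse=True)
    for level in by_remainder[:leftover]:
        allocation[level] += 1
    return allocation


# Hypothetical 50-mark paper.
for level, marks in allocate_marks(50).items():
    print(f"{level}: {marks}")
```

For a 50-mark paper the percentages divide evenly; for other totals the rounding step decides which levels receive the extra marks, and a teacher may prefer to adjust those by hand.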

CONSTRUCTION OF SUBJECTIVE TESTS


In this kind of test the objective is to measure qualities such as a pupil’s ability to perform
certain practical or intellectual skills, which might include describing something
accurately in either oral or written form, using materials imaginatively, working
creatively, handling information logically, building convincing arguments or exposing
flaws in the arguments of others.
The common types of subjective test items that we have are:
Restricted response items
Extended response items
Construction of the above types of test items follows a detailed process which includes the
following stages:

• Developing the prompt
• Creating the scoring rubric
• Scoring the response
Developing the prompt

The prompt for a subjective item poses a question, presents a problem, or prescribes
a task. It sets forth a set of circumstances to provide a common context for framing the
response.
Action verbs direct the examinee to focus on the desired behavior, for instance, solve,
interpret, compare and contrast, discuss or explain. Appropriate directions indicate the
expected length and format of the response, allowable resources or equipment, time limits
and features of the response that count in scoring.

Creating the scoring rubric


These are analytic or holistic in nature.
For an analytic rubric, the item writer/constructor lists desired features of the response with
a number of points awarded for each specific feature.
A holistic rubric provides a scale for assigning points to the response based on overall
impression.
A range of possible points is specified and verbal descriptors are developed to
characterize a response located at each possible point on the scale.
Illustrative responses that correspond to each scale point are often developed or
selected from actual examinee responses.
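
The two rubric styles can be contrasted in a short sketch: one sums points for each listed feature found in a response, the other records a single overall-impression level chosen from a described scale. The feature names, point values and descriptors below are hypothetical and exist only to show the difference in how a score is produced.

```python
# Hypothetical feature list with the points awarded for each feature.
FEATURE_POINTS = {
    "states a clear thesis": 2,
    "gives supporting evidence": 3,
    "organises ideas logically": 2,
}

# Hypothetical overall-impression scale with a verbal descriptor per point.
IMPRESSION_SCALE = {
    4: "thorough, well organised and convincing",
    3: "mostly complete with minor gaps",
    2: "partial, with significant gaps",
    1: "minimal or off topic",
}


def feature_based_score(features_present):
    """Sum the points for every listed feature the rater found in the response."""
    return sum(
        points
        for feature, points in FEATURE_POINTS.items()
        if feature in features_present
    )


def overall_impression_score(level):
    """Check the rater's single judgment and return it with its descriptor."""
    if level not in IMPRESSION_SCALE:
        raise ValueError(f"level must be one of {sorted(IMPRESSION_SCALE)}")
    return level, IMPRESSION_SCALE[level]


found = {"states a clear thesis", "organises ideas logically"}
print(feature_based_score(found))
print(overall_impression_score(3))
```

The feature-based score shows where the marks came from, while the overall-impression score is quicker to assign but leans more heavily on the verbal descriptors and illustrative responses.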

Scoring response

During subjective scoring, at least four types of rater error may occur. The rater:
• becomes more lenient or severe over time, or scores erratically due to fatigue or
distractions;
• has knowledge or beliefs about an examinee that influence perception of the
response;
• is influenced by the examinee’s good or poor performance on items scored previously;
• is influenced by the strength or weakness of a preceding examinee’s response.
Under extended response items, we can take essay test items as an example and look at
how they are constructed:
• Essay items require learners to write or type the answer in a number of paragraphs.
The learners use their own words and organize the information or material as they see
fit.
• In writing an essay test, clear and unambiguous language should be used. Words such
as ‘how’, ‘why’, ‘contrast’, ‘describe’ and ‘discuss’ are useful. The questions should
clearly define the scope of the answer required.
• The time provided for the learner to respond to the questions should be sufficient for
the amount of writing required for a satisfactory response. The validity of questions can
be enhanced by ensuring that the questions correspond closely to the goals or objectives
being tested.
• An indication of the length of the answer required should be given.

Uses of tests
1. To Identify What Students Have Learned
The obvious point of classroom tests is to see what the students have learned after the
completion of a lesson or unit. When the classroom tests are tied to effectively written
lesson objectives, the teacher can analyze the results to see where the majority of the
students in the class are having problems. These tests are also important when
discussing student progress at parent-teacher conferences.
2. To Identify Student Strengths and Weaknesses
Another use of tests is to determine student strengths and weaknesses. One effective
example of this is when teachers use pretests at the beginning of units in order to find
out what students already know and where the teacher’s focus needs to be. Further,
learning style and multiple intelligences tests help teachers learn how to best meet the
needs of their students through instructional techniques.
3. To Provide a Method for Awards and Recognition
Tests can be used as a way to determine who will receive awards and recognition.
4. To Provide a Way to Measure a Teacher and/or School’s Effectiveness
More and more states are tying school funding to the way that students perform on
standardized tests. Further, some states are attempting to use these results when they
evaluate teachers and give them merit raises. This use of high stakes
testing is often contentious with educators since many factors can influence a student’s
grade on an exam. Additionally, controversy can sometimes erupt over the number of
hours schools use to specifically ‘teach to the test’ as they prepare students to take
these exams.
5. To Provide a Basis for Entry into an Internship, Program, or College
Tests have traditionally been used as a way to judge a student based on merit.
6. To Gain College Credit
Advanced Placement exams provide students with the opportunity to earn college credit
after successfully completing a course and passing the exam with high marks. While
every university has its own rules on what scores to accept, most do give credit for
these exams. In many cases, students are able to begin college with a semester or
even a year’s worth of credits under their belts.
