
APPROPRIATENESS AND ALIGNMENT OF ASSESSMENT METHODS TO LEARNING OUTCOMES
OVERVIEW
What principles govern the assessment of learning? Chappuis, Chappuis & Stiggins (2009) delineated five standards of quality assessment to inform sound instructional decisions: (1) clear purpose; (2) clear learning targets; (3) sound assessment design; (4) effective communication of results; and (5) student involvement in the assessment process.
Classroom assessment begins with the question "Why are you assessing?" The answer to this question gives the purpose of assessment, which was discussed in Section 1.
The next question is "What do you want to assess?" This pertains to the student learning outcomes: what the teacher would like students to know and be able to do at the end of a section or unit. Once targets or outcomes are defined, the question becomes "How are you going to assess?" This refers to the assessment methods and tools that can measure the learning outcomes.
IDENTIFYING LEARNING OUTCOMES
A learning outcome pertains to a particular level of knowledge, skills and values that a student has acquired at the end of a unit or period of study as a result of his/her engagement in a set of appropriate and meaningful learning experiences. An organized set of learning outcomes helps teachers plan and deliver appropriate instruction and design valid assessment tasks and strategies. Anderson et al. (2005) listed four steps in student outcome assessment: (1) create learning outcome statements; (2) design teaching/assessment activities to achieve these outcome statements; (3) implement the teaching/assessment activities; and (4) analyze the assessment data.
TAXONOMY OF LEARNING DOMAINS
Learning outcomes are statements of performance expectations in three broad domains of learning characterized by change in a learner's behavior: cognitive, affective and psychomotor. Within each domain are levels of expertise that drive assessment, listed in order of increasing complexity. Higher levels require more sophisticated methods of assessment, but they facilitate retention and transfer of learning (Anderson et al., 2005). Importantly, all learning outcomes must be capable of being assessed and measured.
A. COGNITIVE (KNOWLEDGE-BASED)
Table 3.1 shows the levels of cognitive learning originally devised by Bloom, Engelhart, Furst, Hill & Krathwohl in 1956 and revised by Anderson, Krathwohl et al. in 2001 to produce a two-dimensional framework of knowledge and cognitive processes and to account for twenty-first century needs by including metacognition. It is designed to help teachers understand and implement standards-based curricula. The cognitive domain involves the development of knowledge and intellectual skills. It answers the question, "What do I want learners to know?" The first three levels are lower-order, while the next three levels promote higher-order thinking.
Krathwohl (2002) stressed that the revised Bloom's taxonomy table is used not only to classify the instructional and learning activities employed to achieve the objectives, but also to classify the assessments employed to determine how well learners have attained and mastered the objectives.
Marzano and Kendall (2007) came up with their own taxonomy composed of three systems (the self system, the metacognitive system, and the cognitive system) and the knowledge domain. Their cognitive system has four levels: Retrieval (knowledge), Comprehension, Analysis, and Knowledge Utilization.
The knowledge (retrieval) component is the same as the Remembering level in the revised Bloom's taxonomy. Comprehension entails synthesis and representation: relevant information is selected and then organized into categories. Analysis involves the processes of matching, classifying, error analysis, generalizing and specifying. The last level, Knowledge Utilization, comprises decision making, problem solving, experimental inquiry and investigation, processes essential in problem-based and project-based learning.
TABLE 3.1. COGNITIVE LEVELS AND PROCESSES (ANDERSON ET AL., 2001)

Remembering – retrieving relevant knowledge from long-term memory.
Processes: recognizing, recalling. Verbs: define, describe, identify, label, list, match, name, outline, reproduce, select, state.
Sample competency: Define the four levels of mental processes in Marzano and Kendall's cognitive system.

Understanding – constructing meaning from instructional messages, including oral, written, and graphic communication.
Processes: interpreting, exemplifying, classifying, summarizing, inferring, comparing, explaining. Verbs: explain, paraphrase, rewrite, summarize.
Sample competency: Explain the purpose of Marzano and Kendall's new taxonomy of educational objectives.

Analyzing – breaking material into its constituent parts and determining how the parts relate to one another and to the overall structure or purpose.
Processes: differentiating, organizing, attributing. Verbs: analyze, arrange, associate, compare, contrast, infer, organize, solve, support (a thesis).
Sample competency: Compare and contrast the thinking levels of the revised Bloom's taxonomy and Marzano & Kendall's cognitive system.

Evaluating – making judgments based on criteria and standards.
Processes: checking, critiquing. Verbs: appraise, compare, conclude, contrast, criticize, evaluate, judge, justify, support (a judgment), verify.
Sample competency: Judge the effectiveness of writing learning outcomes using Marzano and Kendall's taxonomy.

Creating – putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure.
Processes: planning, producing. Verbs: classify (infer the classification system), construct, create, extend, formulate, generate, synthesize.
Sample competency: Design a classification scheme for writing learning outcomes using the levels of the cognitive system developed by Marzano & Kendall.
B. PSYCHOMOTOR (SKILLS-BASED)
The psychomotor domain focuses on physical and mechanical skills involving coordination of the brain and muscular activity. It answers the question, "What actions do I want learners to be able to perform?"
Dave (1970) identified five levels of behavior in the psychomotor domain: Imitation, Manipulation, Precision, Articulation, and Naturalization. In his taxonomy, Simpson (1972) laid down seven progressive levels: Perception, Set, Guided Response, Mechanism, Complex Overt Response, Adaptation, and Origination.
Meanwhile, Harrow (1972) developed her own taxonomy with six categories organized according to degree of coordination: Reflex Movements, Basic Fundamental Movements, Perceptual Abilities, Physical Abilities, Skilled Movements, and Non-Discursive Communication.
TABLE 3.2. TAXONOMY OF THE PSYCHOMOTOR DOMAIN

Observing – active mental attention to a physical event.
Verbs: describe, detect, distinguish, differentiate, relate, select.
Sample competency: Relate music to a particular dance.

Imitating – attempted copying of a physical behavior.
Verbs: begin, display, explain, move, proceed, react, show, state, volunteer.
Sample competency: Demonstrate a simple dance step.

Practicing – trying a specific physical activity over and over.
Verbs: bend, calibrate, construct, differentiate, dismantle, fasten, grasp, grind, handle, measure, mix, organize, operate, manipulate, mend.
Sample competency: Display several dance steps in sequence.

Adapting – fine-tuning; making minor adjustments in the physical activity in order to perfect it.
Verbs: arrange, combine, compose, construct, create, design, originate, rearrange, reorganize.
Sample competency: Perform a dance showing new combinations of steps.
C. AFFECTIVE (VALUES, ATTITUDES AND INTERESTS)

The affective domain emphasizes emotional knowledge. It tackles the question, "What actions do I want learners to think or care about?"
Table 3.3 presents the classification scheme for the affective domain developed by Krathwohl, Bloom and Masia in 1964. The affective domain includes factors such as student motivation, attitudes, appreciation and values.
TABLE 3.3. TAXONOMY OF THE AFFECTIVE DOMAIN (KRATHWOHL ET AL., 1964)

Receiving – being aware of or sensitive to the existence of certain ideas, material or phenomena and being willing to tolerate them.
Examples: to differentiate, to accept, to listen (for), to respond to.

Responding – being committed in some small measure to the ideas, materials or phenomena involved by actively responding to them.
Examples: to comply with, to follow, to commend, to volunteer, to spend leisure time in, to acclaim.

Valuing – showing some definite involvement or commitment.
Example: attend optional matches.

Organizing – integrating a new value into one's general set of values, giving it some ranking among one's general priorities.
Example: arrange his/her own volleyball practice.

Internalizing a value (characterization by a value or value complex) – acting consistently with the new value.
Example: play volleyball twice a week.
TYPES OF ASSESSMENT

Assessment methods can be categorized according to the nature and characteristics of each method. McMillan (2007) identified four major categories: selected-response, constructed-response, teacher observation, and student self-assessment. These methods are like a carpenter's tools: you need to choose the one apt for a given task, and it is not wise to stick to a single method of assessment. As the saying goes, "If the only tool you have is a hammer, you tend to see every problem as a nail."
1. SELECTED-RESPONSE FORMAT

In a selected-response format, students select from a given set of options to answer a question or a problem. Because there is only one correct or best answer, selected-response items are objective and efficient to score.
Teachers commonly assess students using questions and items that are multiple-choice, alternate-response (true or false), matching-type and interpretive. A multiple-choice question consists of a stem (in question or statement form) with four or five options from which students choose the correct or best answer. In a matching-type item, students review each stem and match it with a word, phrase or image from a list of responses. Alternate-response (true/false) questions are a binary-choice type; the reliability of true/false items is generally not high because of the possibility of guessing.
2. CONSTRUCTED-RESPONSE FORMAT

In the selected-response type, students need only recognize and select the correct answer. A constructed-response (subjective) format, by contrast, demands that students create or produce their own answer in response to a question, problem or task. Items of this type may fall under any of the following categories: brief constructed-response items, performance tasks, essay items, or oral questioning.
BRIEF CONSTRUCTED-RESPONSE ITEMS

Brief constructed-response items require only a short response from students. Examples include sentence completion, where students fill in a blank at the end of a statement; short answers to open-ended questions; labelling a diagram; or answering a mathematics problem while showing the solution.
PERFORMANCE ASSESSMENTS

Performance assessments require students to perform a task rather than select from a given set of options. Unlike brief constructed-response items, students have to come up with an extensive and elaborate answer or response. Performance tasks are also called authentic or alternative assessments.
Essay assessment involves answering a question or proposition in written form.
Oral questioning is a common assessment method during instruction to check on student understanding.
3. TEACHER OBSERVATION

Teacher observation is a form of ongoing assessment, usually done in combination with oral questioning. Teachers regularly observe students to check on their understanding. By watching how students respond to oral questions and behave during individual and collaborative activities, the teacher can gather information on whether learning is taking place in the classroom. Nonverbal cues also communicate how learners are doing.
4. STUDENT SELF-ASSESSMENT

Self-assessment is one of the standards of quality assessment identified by Chappuis, Chappuis & Stiggins (2009). It is a process where students are given a chance to reflect on and rate their own work, judging how well they have performed in relation to a set of assessment criteria. Students track and evaluate their own progress or performance. Self-assessment monitoring techniques include activity checklists, diaries, and self-report inventories. The latter are questionnaires or surveys that students fill out to reveal their attitudes and beliefs about themselves and others.
MATCHING LEARNING TARGETS WITH ASSESSMENT METHODS

In an outcome-based approach, the teaching methods and resources used to support learning, as well as the assessment tasks and rubrics, are explicitly linked to the program and course learning outcomes. Biggs and Tang (2007) call this constructive alignment. Constructive alignment provides the "how-to" by verifying that the teaching-learning activities (TLAs) and the assessment tasks (ATs) activate the same verbs as the intended learning outcomes (ILOs). Using the action verbs of the taxonomy devised by Anderson, Krathwohl et al. (2001) can increase the alignment of learning outcomes and instruction (Airasian & Miranda, 2002).
A learning target is defined as a description of performance that includes what learners should know and be able to do. It contains the criteria used to judge student performance and is derived from national and local standards. This definition is similar to that of a learning outcome.
LEARNING TARGETS AND ASSESSMENT METHODS (MCMILLAN, 2007)

Ratings indicate how well each assessment method matches each learning target (higher numbers indicate a better match).

Knowledge and simple understanding – selected-response and brief constructed-response: 5; essay: 4; performance task: 3; oral questioning: 4; observation: 3; student self-assessment: 3.
Deep understanding and reasoning – selected-response and brief constructed-response: 2; essay: 5; performance task: 4; oral questioning: 4; observation: 2; student self-assessment: 3.
Skills – selected-response and brief constructed-response: 1; essay: 3; performance task: 5; oral questioning: 2; observation: 5; student self-assessment: 3.
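To show how the matrix can be used, here is a minimal Python sketch that encodes the ratings above as a dictionary and picks the strongest method for each target. The structure and short method names are illustrative, not from McMillan; where two methods tie (performance tasks and observation for skills), max returns the first one listed.

```python
# McMillan's (2007) match ratings as a lookup table: for each learning
# target, select the highest-rated assessment method (5 = best match).
ratings = {
    "knowledge and simple understanding":
        {"selected/brief response": 5, "essay": 4, "performance task": 3,
         "oral questioning": 4, "observation": 3, "self-assessment": 3},
    "deep understanding and reasoning":
        {"selected/brief response": 2, "essay": 5, "performance task": 4,
         "oral questioning": 4, "observation": 2, "self-assessment": 3},
    "skills":
        {"selected/brief response": 1, "essay": 3, "performance task": 5,
         "oral questioning": 2, "observation": 5, "self-assessment": 3},
}

for target, methods in ratings.items():
    best = max(methods, key=methods.get)  # first method with the top rating
    print(f"{target}: best assessed via {best} (rating {methods[best]})")
```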
Knowledge and simple understanding pertains to mastery of substantive subject matter and procedures. In the revised Bloom's taxonomy, this covers the lower-order thinking skills of remembering, understanding, and applying. Selected-response and brief constructed-response items are best for assessing low-level learning targets in terms of coverage and efficiency.
Reasoning is the mental manipulation and use of knowledge in critical and creative ways. Deep understanding and reasoning involve the higher-order thinking skills of analyzing, evaluating, and creating.
To assess skills, performance assessment is obviously the superior assessment method. As mentioned, products are most adequately assessed through performance tasks.
Student affect cannot be assessed simply by selected-response or brief constructed-response tests. Affect pertains to the attitudes, interests, and values students manifest. The best method for this learning target is self-assessment, most commonly in the form of student responses to self-report affective inventories using rating scales. In the study conducted by Stiggins & Popham (2009), two affective variables were influenced by teachers who employ assessment formatively in their classes, one being academic efficacy (perceived ability to succeed and sense of control over ...).
GUIDE FOR ASSESSING LEARNING OUTCOMES FOR GRADE 1

What to assess: Content of the curriculum; facts and information that learners acquire.
How to assess (suggested tools/strategies): (1) quizzes (multiple choice, true or false, matching type, constructed response); (2) oral participation; (3) periodical test.
How to score: raw scores for quizzes and the periodical test; rubrics for constructed-response items and oral participation.
How to utilize results: to identify individual learners with specific needs for academic interventions and individual instruction.

What to assess: Cognitive operations that learners perform on facts and information for constructing meanings.
How to assess: (1) quizzes (outlining, organizing, analyzing, interpreting, translating, converting or expressing the information in another format; constructing graphs, flowcharts, maps or graphic organizers; transforming a textual presentation into a diagram; drawing or painting pictures; other outputs); (2) oral participation.
How to score: raw scores for quizzes; rubrics for outputs and oral participation.
How to utilize results: to identify learners with similar needs for academic interventions and small-group instruction; to assess the effectiveness of teaching and learning strategies.

What to assess: Explanation, interpretation, application.
How to assess: (1) quizzes (explain/justify something based on facts, data, phenomena or evidence; tell/retell stories; make connections between what was learned and real-life situations); (2) oral discourse/recitation; (3) open-ended tests.
How to score: raw scores for quizzes; rubrics for oral discourse and open-ended tests.
How to utilize results: to evaluate the instructional materials used; to design instructional materials.

What to assess: Learners' authentic tasks as evidence of understanding; multiple intelligences.
How to assess: participation, projects, homework, experiments, portfolio.
How to score: rubrics.
How to utilize results: to assess classroom instruction; to design in-service training programs for teachers in the core subjects of the curriculum.
VALIDITY AND RELIABILITY
Overview:
It is not unusual for teachers to receive complaints or comments from students regarding tests and other assessments. For one, there may be an issue concerning the coverage of the test: students may have been tested on areas that were not part of the content domain, or they may not have been given the opportunity to study or learn the material. The emphasis of the test may also be too complex, inconsistent with the performance verbs in the learning outcome.
Validity alone does not make for high-quality assessment; the reliability of test results should also be checked. Questions about reliability surface when there are inconsistencies in results as tests are administered across different time periods, samples of questions, or groups.
Both validity and reliability are considered when gathering information or evidence about student achievement. This chapter discusses the distinctions between the two.
Validity
Validity is a term derived from the Latin word validus, meaning strong. In the context of assessment, a measure is deemed valid if it measures what it is supposed to measure. Contrary to what some teachers believe, validity is not a property of the test itself. It pertains to the accuracy of the inferences teachers make about students based on the information gathered from an assessment (McMillan, 2007; Fives & DiDonato-Barnes, 2013). This implies that the conclusions teachers reach in their evaluation of student performance are valid only if there is strong and sound evidence of the extent of student learning. Such decisions also include those about instruction and classroom climate (Russell & Airasian, 2012).
An assessment is valid if it measures a student's actual knowledge and performance with respect to the intended outcomes, and not something else. It should be representative of the area of learning or content of the curricular aim being assessed (McMillan, 2007; Popham, 2011). For instance, an assessment purportedly measuring the arithmetic skills of Grade 4 pupils is invalid if used for Grade 1 pupils because of issues with content (test content evidence) and level of performance (response process evidence). Likewise, a test that measures mere recall when a higher level of performance is intended raises validity problems, particularly on content-related evidence.
A. CONTENT-RELATED EVIDENCE

Content-related evidence of validity pertains to the extent to which the test covers the entire domain of content. If a summative test covers a unit with four topics, then the assessment should contain items from each topic. This is done through adequate sampling of the content. A student's performance on the test may then be used as an indicator of his/her content knowledge. For instance, if a Grade 4 pupil was able to correctly answer 80% of the items in a science test about matter, the teacher may infer that the pupil knows 80% of the content area.
In the previous chapter, we talked about the appropriateness of assessment methods to learning outcomes. A test that appears to adequately measure the learning outcomes and content is said to possess face validity. As the name suggests, face validity looks at the superficial, face value of the instrument and is based on the subjective opinion of the one reviewing it; hence, it is considered non-systematic or non-scientific. A test prepared to assess the ability of pupils to construct simple sentences with correct subject-verb agreement has face validity if it looks like an adequate measure of that cognitive skill.
Another consideration related to content validity is instructional validity: the extent to which an assessment is systematically sensitive to the nature of the instruction offered. This is closely related to what is being assessed. Yoon & Resnick (1998) asserted that an instructionally valid test is one that registers differences in the amount and kind of instruction to which students have been exposed.
They also described the degree of overlap between the content tested and the content taught as opportunity to learn, which has an impact on test scores. Consider the Grade 10 curriculum in Araling Panlipunan (Social Studies). In the first grading period, the class will cover three economic issues: unemployment, globalization and sustainable development. Suppose only two were discussed in class, but the assessment covered all three issues. Although these were all identified in the curriculum guide and may even be found in the textbooks, the question remains as to whether the topics were all taught. Inclusion of items that were not taken up in class reduces instructional validity, because students had no opportunity to learn the knowledge or skill being assessed.

To improve the validity of assessments, it is


recommended that the teacher constructs a two
dimensional grid called Table of Specification
(Tos). The Tos is prepared before developing
the test. It is a blueprint that identifies the
content area and describes the learning
outcomes at each level of cognitive domain
(notar, et al. 2004) it is a tool to used in
conjuction with the lesson and unitplanning to
help teachers make genuine connections
between planning, instruction and assessment (
Five &Didonato- barnes, 2013) it assures the
teacher that they are testing students learning
As cognitive processes requiring higher order
thinking. Table 4.1 is an example of adapted
Tos using the learning competencies found in
the Math curriculum guide. In a two way table
with learning objectives or content matter on
the vertical axis and the intellectual process
on the other.
TABLE 4.1. SAMPLE TABLE OF SPECIFICATIONS (NOTAR ET AL., 2004)

Course title: Math
Grade level: V
Periods test is being used: 2
Date of test: August 8, 2014
Subject matter digest: Number and number sense
Type of test (circle one): Power, speed, partially speeded
Test time: 45 minutes
Test value: 100 points
Base number of test questions: 75
Constraints: test time

Each row lists the learning objective, its instructional time in minutes (and percentage of total time), questions/points (Q/P), the item type, and the number of questions at each level of the revised Bloom's taxonomy, with points per question in parentheses.

Objective 1 (apply) – 95 min (16%), 11/16, matching type: Apply 6(1), Analyze 5(2); total 11/16.
Objective 2 (understand) – 55 min (9%), 7/10, multiple choice: Understand 5(2); total 5/10.
... (objectives 3 to 9) ...
Objective 10 (evaluate) – 40 min (7%), 5/7, essay: Evaluate 1(7); total 1/7.
Total – 600 min (100%), 75/100: Remember 11/12, Understand 23/31, Apply 16/34, Analyze 4/10, Evaluate 3/6, Create 1/7; total 58/100.
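The arithmetic behind a TOS row is simple: each objective receives a share of the test items proportional to its share of instructional time. Below is a minimal Python sketch of that allocation; the objective names and minutes are illustrative values loosely based on Table 4.1, not an official procedure from Notar et al.

```python
# Allocate test items to objectives in proportion to instructional time,
# as in a Table of Specifications. Values are illustrative.
instructional_minutes = {
    "Objective 1 (apply)": 95,
    "Objective 2 (understand)": 55,
    "Objective 10 (evaluate)": 40,
}
TOTAL_MINUTES = 600  # whole grading period, including objectives not listed
BASE_ITEMS = 75      # base number of test questions

for objective, minutes in instructional_minutes.items():
    weight = minutes / TOTAL_MINUTES    # share of instructional time
    items = round(weight * BASE_ITEMS)  # proportional item count
    print(f"{objective}: {weight:.0%} of time -> about {items} items")
```

For Objective 1, for example, 95 of 600 minutes is about 16% of instructional time, which at 75 base questions works out to roughly 12 items, close to the 11 questions shown in the table.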
SIX ELEMENTS SPECIFIED IN TOS DEVELOPMENT

1. Balance among the goals selected for the examination
2. Balance among the levels of learning
3. The test format
4. The total number of items
5. The number of items for each goal and level of learning
6. The enabling skills to be selected from each goal framework
The first three elements were discussed in the previous chapter. The number of items depends on the duration of the test, which is contingent on the academic level and attention span of the students. A six-year-old Grade 1 pupil is not expected to accomplish a one-hour test; pupils that age do not have the tolerance to sit through an examination that long. The number of items is also determined by the purpose of the test or its proposed uses: is it a power test or a speed test? A power test is intended to measure the range of a student's capacity in a particular area, as opposed to a speed test, which is characterized by time pressure.
TIME REQUIREMENTS FOR CERTAIN ASSESSMENT TASKS (NOTAR ET AL., 2004)

Type of question – Time required to answer
Alternative response (true/false) – 20-30 seconds
Modified true or false – 30-45 seconds
Sentence completion (one-word fill-in) – 40-60 seconds
Multiple choice with four responses (lower level) – 40-60 seconds
Multiple choice (higher level) – 70-90 seconds
Matching type (5 items, 6 choices) – 2-4 minutes
Short answer – 2-4 minutes
Multiple choice (with calculations) – 2-5 minutes
Word problem (simple arithmetic) – 5-10 minutes
Short essays – 15-20 minutes
Data analysis/graphing – 15-25 minutes
Drawing models/labelling – 20-30 minutes
Extended essays – 35-50 minutes
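These time estimates can be turned into a quick feasibility check when deciding on the total number of items. The sketch below assumes a hypothetical item mix and uses the upper-bound times from the table above to see whether the mix fits a 45-minute period.

```python
# Check whether a planned item mix fits the available test time, using
# the upper-bound per-item times (in minutes) from the table above.
TIME_PER_ITEM = {
    "true_false": 0.5,              # 20-30 seconds
    "multiple_choice_lower": 1.0,   # 40-60 seconds
    "multiple_choice_higher": 1.5,  # 70-90 seconds
    "short_answer": 4.0,            # 2-4 minutes
}

# Hypothetical mix for a 45-minute test.
planned_items = {
    "true_false": 10,
    "multiple_choice_lower": 15,
    "multiple_choice_higher": 8,
    "short_answer": 3,
}

TEST_TIME = 45  # minutes available
total = sum(TIME_PER_ITEM[kind] * n for kind, n in planned_items.items())
print(f"Estimated time: {total:.0f} min "
      f"({'fits within' if total <= TEST_TIME else 'exceeds'} {TEST_TIME} min)")
```

Here the 36 planned items need an estimated 44 minutes, so the mix just fits; adding even one short essay (15-20 minutes) would not.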
B. CRITERION-RELATED EVIDENCE

Criterion-related evidence of validity refers to the degree to which test scores agree with an external criterion; as such, it is related to external validity. It examines the relationship between an assessment and another measure of the same trait (McMillan, 2007).
There are three types of criteria (Nitko & Brookhart, 2011):
1. achievement test scores;
2. ratings, grades and other numerical judgments made by the teacher; and
3. career data.
Criterion-related evidence is of two types: concurrent validity and predictive validity. Concurrent validity provides an estimate of a student's current performance in relation to a previously validated or established measure. Predictive validity pertains to the power or usefulness of test scores to predict future performance.
The relationship is typically expressed through the Pearson correlation coefficient (r) or Spearman's rank-order correlation; the square of the correlation coefficient (r²) is called the coefficient of determination.
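To make the computation concrete, here is a minimal sketch of estimating predictive validity as the correlation between an earlier test and a later criterion measure. The score lists are hypothetical, and the example relies on Python's standard statistics.correlation (available from Python 3.10), which returns Pearson's r.

```python
# Predictive validity sketch: correlate an entrance test (predictor) with
# later final grades (criterion). Scores below are hypothetical.
from statistics import correlation  # Python 3.10+

entrance_test = [78, 85, 62, 90, 70, 88, 75, 95]
final_grades  = [80, 88, 65, 92, 72, 85, 78, 96]

r = correlation(entrance_test, final_grades)  # Pearson's r
print(f"Pearson r = {r:.2f}")                 # strength of the relationship
print(f"r^2 = {r * r:.2f}")                   # coefficient of determination
```

A high r here would suggest the entrance test usefully predicts later grades; r² gives the proportion of variance in the criterion accounted for by the predictor.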
C. CONSTRUCT-RELATED EVIDENCE

A construct is an individual characteristic that explains some aspect of behavior (Miller, Linn & Gronlund, 2009). Construct-related evidence of validity is an assessment of the quality of the instrument used. It measures the extent to which the assessment is a meaningful measure of an unobservable trait or characteristic (McMillan, 2007). There are three types of construct-related evidence: theoretical, logical and statistical (McMillan, 2007).
A good construct has a theoretical basis. This means that the construct must be operationally defined or explained explicitly.
Two methods of establishing construct validity:
• Convergent validity occurs when measures of constructs that are expected to be related are in fact observed to be related.
• Divergent (or discriminant) validity, on the other hand, occurs when constructs that are expected to be unrelated are in reality observed not to be related.
UNIFIED CONCEPT OF VALIDITY

In 1989, Messick proposed a unified concept of validity based on an expanded theory of construct validity that addresses score meaning and social values in test interpretation and test use. His concept of unified validity "integrates considerations of content, criteria, and consequences into a construct framework for the empirical testing of rational hypotheses about score meaning and theoretically relevant relationships."
VALIDITY OF ASSESSMENT METHODS

In the previous sections, the validity of traditional assessments was discussed. What about the other assessment methods? The same validity principles apply. For performance assessments:
1. The selected performance should reflect a valued activity.
2. The completion of performance assessments should provide a valuable learning experience.
3. The statement of goals and objectives should be clearly aligned with the measurable outcomes of the performance activity.
4. The task should not examine extraneous or unintended variables.
5. Performance assessments should be fair and free from bias.
THREATS TO VALIDITY

Miller, Linn & Gronlund (2009) identified ten factors, or defects in the construction of assessment tasks, that would render assessment inferences inaccurate. The first four factors apply to both traditional tests and performance assessments; the remaining factors concern brief constructed-response and selected-response items.
1. Unclear test directions
2. Complicated vocabulary and sentence structure
3. Ambiguous statements
4. Inadequate time limits
5. Inappropriate level of difficulty of test items
6. Poorly constructed test items
7. Inappropriate test items for the outcomes being measured
8. Short tests
9. Improper arrangement of items
10. Identifiable patterns of answers
MCMILLAN (2009) LAID DOWN SUGGESTIONS FOR ENHANCING VALIDITY. THESE ARE AS FOLLOWS:

Ask others to judge the clarity of what you are assessing.
Check to see if different ways of assessing the same thing give the same result.
Sample a sufficient number of examples of what is being assessed.
Prepare a detailed table of specifications.
Ask others to judge the match between the assessment items and the objectives of the assessment.
Compare groups known to differ on what is being assessed.
Compare scores taken before instruction to those taken after instruction.
Compare predicted consequences to actual consequences.
Compare scores on similar, but different, traits.
Provide adequate time to complete the assessment.
Ensure appropriate vocabulary, sentence structure and item difficulty.
Ask easy questions first.
Use different methods to assess the same thing.
Use assessments only for their intended purpose.
RELIABILITY

Reliability refers to reproducibility and consistency in methods and criteria. An assessment is said to be reliable if it produces the same results when given to the same examinees on two occasions. It is important to stress that reliability pertains to the obtained assessment results, not to the test or any other instrument itself. Another point is that reliability is unlikely to reach 100%: a group of students will show some differences after a day or two, and environmental factors such as lighting and noise also affect reliability. Student error and the physical well-being of examinees likewise affect the consistency of assessment results.
For a test to be valid, it has to be reliable.
Reliability is expressed as a correlation coefficient. A high reliability coefficient denotes that if a similar test were readministered to the same group of students, the results from the first and second testing would be comparable.
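As a concrete illustration, a test-retest reliability coefficient can be computed as the correlation between the two sets of scores. The sketch below uses hypothetical scores and Python's standard statistics.correlation (Python 3.10+).

```python
# Test-retest reliability sketch: correlate scores from two administrations
# of the same test to the same group of students (hypothetical data).
from statistics import correlation  # Python 3.10+

first_testing  = [85, 70, 92, 60, 78, 88, 74, 81]
second_testing = [83, 72, 90, 63, 80, 86, 70, 84]

reliability = correlation(first_testing, second_testing)
print(f"Test-retest reliability coefficient: {reliability:.2f}")
```

A coefficient close to 1.0 indicates that students' relative standing was stable across the two administrations.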
TYPES OF RELIABILITY

• Internal consistency evidence assesses the consistency of results across items within a test.
• External consistency evidence is based on scorer or rater consistency and on decision consistency.

Sources of Reliability Evidence
A. Stability – the test-retest reliability ...