
Continuation:

B. Criterion-Related Evidence


Criterion-related evidence for validity refers to the degree to which test scores agree with an external criterion, that is, the relationship between an assessment and another measure of the same trait (McMillan, 2007).

There are three types of criteria:

• Achievement test scores;
• Ratings, grades, and other numerical judgements made by the teacher; and
• Career data.
Two Types of Criterion-Related Evidence

Concurrent validity
- provides an estimate of a student's current performance in relation to a previously validated or established measure.

Predictive validity
- pertains to the power or usefulness of test scores to predict future performance (see the sketch below).
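Both types of criterion-related evidence are conventionally summarized as a validity coefficient, the correlation between the test scores and the criterion measure. A minimal sketch in Python; the scores below are invented for illustration only:

```python
from scipy.stats import pearsonr

# Hypothetical scores of eight students on a new classroom test
new_test = [78, 85, 62, 90, 70, 88, 75, 95]

# Concurrent criterion: an established, previously validated measure
# administered at about the same time as the new test
established = [80, 82, 65, 93, 68, 85, 77, 96]

# Predictive criterion: performance collected later,
# e.g., final grades in the following term
future_grades = [81, 88, 60, 89, 72, 90, 74, 97]

r_concurrent, _ = pearsonr(new_test, established)
r_predictive, _ = pearsonr(new_test, future_grades)

print(f"Concurrent validity coefficient: r = {r_concurrent:.2f}")
print(f"Predictive validity coefficient: r = {r_predictive:.2f}")
```

The closer the coefficient is to 1.0, the stronger the criterion-related evidence; a coefficient near zero means the test tells us little about the criterion.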
C. Construct-Related Evidence

Miller, Linn & Gronlund (2009)
- A construct is an individual characteristic that explains some aspect of behavior.
- Construct-related evidence of validity is an assessment of the quality of the instrument used.

Conley, Karabenick & Arbor (2006)
- measured students' motivation to learn through a survey that included measures of self-efficacy for learning, task value, and students' personal achievement goals, focusing on the domain of mathematics.
- Lee Cronbach and Paul Meehl (1955)
 insisted that to provide evidence of construct validity, one
has to develop a nomological network.

Nomological network - basically a network of laws that includes the theoretical framework of the construct.
TWO METHODS OF ESTABLISHING
CONSTRUCT VALIDITY

1. CONVERGENT VALIDATION
- Convergent validity occurs when measures of constructs that are related are in fact observed to be related.
2. DIVERGENT (or DISCRIMINANT) VALIDATION
- Divergent validity occurs when measures of constructs that are unrelated are in reality observed not to be related.

Campbell & Fiske (1959) developed a statistical approach called the Multitrait-Multimethod Matrix (MTMM), a table of correlations arranged to facilitate the assessment of construct validity, integrating both convergent and divergent validity (Trochim, 2006).
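A minimal sketch of how an MTMM-style table can be assembled; the two traits (self-efficacy, task value), the two methods (survey, interview), and all scores are hypothetical, chosen only to illustrate the pattern Campbell & Fiske look for:

```python
import pandas as pd

# Hypothetical scores: two traits, each measured by two methods
data = pd.DataFrame({
    "self_efficacy_survey":    [3.2, 4.1, 2.8, 4.5, 3.9, 2.5],
    "self_efficacy_interview": [3.0, 4.3, 2.9, 4.4, 3.7, 2.6],
    "task_value_survey":       [4.0, 2.9, 3.5, 3.1, 4.4, 3.0],
    "task_value_interview":    [4.2, 2.7, 3.6, 3.3, 4.1, 2.8],
})

# The MTMM is the full correlation matrix of these measures
mtmm = data.corr()
print(mtmm.round(2))

# Convergent evidence: same trait, different methods -> should be high
print(f"convergent:   {mtmm.loc['self_efficacy_survey', 'self_efficacy_interview']:.2f}")

# Discriminant evidence: different traits -> should be lower
print(f"discriminant: {mtmm.loc['self_efficacy_survey', 'task_value_survey']:.2f}")
```

Construct validity is supported when the same-trait/different-method correlations are clearly higher than the different-trait correlations.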
UNIFIED CONCEPT OF VALIDITY

- Messick (1989) proposed a unified concept of validity based on an expanded theory of constructs, which addresses score meaning and social values in test interpretation and test use.

Six Distinct Aspects of Construct Validity

1. CONTENT ASPECTS - parallel to content-related evidence, which calls for content relevance and representativeness.
2. SUBSTANTIVE ASPECTS - pertain to the theoretical constructs and empirical evidence.
3. STRUCTURAL ASPECTS - how well the scoring structure matches the construct domain.
4. GENERALIZABILITY ASPECTS - examine how score properties and interpretations generalize to and across population groups, contexts, and tasks.
*EXTERNAL VALIDITY - Criterion-related evidence for validity is related to external validity, as the criterion may be an externally defined gold standard.
5. EXTERNAL ASPECTS - include convergent and discriminant evidence taken from multitrait-multimethod studies.
6. CONSEQUENTIAL ASPECTS - pertain to the intended and unintended effects of assessment on teaching and learning.
Messick (1995), on consequential validity, explained that the social consequences of testing may be positive if testing leads to improved educational policies, or negative if it is beset with bias in scoring and interpretation or with unethical use.
Conley, Karabenick & Arbor (2006)
- Their study on students' motivation to learn mathematics contained a discussion of the ways in which data had been reported and used as consequential evidence.
Change - Algebra 1A students had a more adaptive pattern of change than other students at the school; the drops were generally smaller than for students in other courses. They saw math as less useful and were less focused on learning but slightly more confident in their math ability.

Goals for next year - help students see how math is useful and, more importantly, help focus students on learning and developing (rather than just demonstrating) ability.
Sambell, McDowell & Brown (1997)
- writing on alternative assessments, noted that such assessments appear to have strong consequential validity because they incorporate meaningful, engaging, and authentic tasks.

McMillan (2007)
Positive consequences - these pertain to how assessment directly impacts students and teachers.
Validity of Assessment Methods

Developing performance assessments involves three steps:


1. Define the purpose - determining the essential skills
students need to develop and the content worthy of
understanding.
2. Choose the activity - to acquire validity evidence in terms of content, performance assessments should be reviewed by qualified content experts.
3. Develop criteria for scoring

Moskal (2003) laid down five recommendations:


1. The selected performance should reflect a valued activity.
2. The completion of performance assessments should provide a
valuable learning experience.
3. The statement of goals and objectives should be clearly aligned with
the measurable outcomes of the performance activity.
4. The task should not examine extraneous or unintended
variables.
5. Performance assessments should be fair and free from bias.

In scoring, a rubric or rating scale has to be created. Teachers must exercise caution because distracting factors like students' handwriting and the neatness of the product can affect ratings. Additionally, personal idiosyncrasies infringe on the objectivity of the teacher/rater, which lowers the validity of the performance assessment.

For observations, operational and response definitions should accurately describe the behavior of interest. Observational evidence is highly valid if it is properly recorded and interpreted.
Triangulation
- a technique to validate results through cross-verification from two or more sources.

Ross (2006), on validity in self-assessment, described the agreement of self-assessment ratings with teacher judgements or peer rankings (see the sketch below).
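As a sketch of how such agreement can be quantified, one might correlate self-assessment ratings with teacher judgements; the ratings below are invented, and a rank correlation is used because ratings are ordinal:

```python
from scipy.stats import spearmanr

# Hypothetical ratings (1-5 scale) of the same six students
self_ratings    = [4, 3, 5, 2, 4, 3]
teacher_ratings = [4, 2, 5, 3, 4, 3]

rho, _ = spearmanr(self_ratings, teacher_ratings)
print(f"Self-teacher agreement (Spearman rho): {rho:.2f}")
```

Strong agreement across two or more such sources (self, teacher, peers) is exactly the cross-verification that triangulation calls for.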

Threats to Validity
Miller, Linn & Gronlund (2009) identified ten factors that affect the validity of assessment results:
 1. Unclear test directions.
 2. Complicated vocabulary and sentence structure.
 3. Ambiguous statements.
 4. Inadequate time limits.
 5. Inappropriate level of difficulty of test items.
 6. Poorly constructed test items.
 7. Inappropriate test items for outcomes being measured.
 8. Test that is too short.
 9. Improper arrangement of items.
 10. Identifiable pattern of answers.

McMillan (2007) laid down his suggestions for enhancing validity. These are as follows:

➢ Ask others to judge the clarity of what you are assessing.
➢ Check to see if different ways of assessing the same thing give the same result.
➢ Sample a sufficient number of examples of what is being assessed.
➢ Prepare a detailed table of specifications.
➢ Ask others to judge the match between the assessment items and the objectives of the assessment.
➢ Compare groups known to differ on what is being assessed.
➢ Compare scores taken before instruction to those taken after instruction.
➢ Compare predicted consequences to actual consequences.
➢ Compare scores on similar, but different, traits.
➢ Provide adequate time to complete the assessment.
➢ Ensure appropriate vocabulary, sentence structure, and item difficulty.
➢ Ask easy questions first.
➢ Use different methods to assess the same thing.
➢ Use assessments only for their intended purposes.
