
Assessment Literacy Series: PA

Module #6 – Conducting Reviews

Keystone Activities
© Pennsylvania Department of Education
OBJECTIVES
Participants will be able to:
• Develop a set of procedural steps needed to conduct alignment reviews of operational forms.
• Create a refinement protocol for school-based assessment teams.
• Develop customized procedures needed to implement an assessment quality rubric.
ASSESSMENT CYCLE
[Cycle diagram: Establish Purpose and Design → Select Content Standards → Build Test Specifications → Develop Items/Tasks → Develop Scoring Keys/Rubrics → Create Test Forms & Guidelines → Conduct Alignment Reviews → Implement Data Analysis → Apply QC & Refinements. The steps are grouped into a Design phase and a Review phase.]
ASSESSMENT CYCLE
[The same cycle diagram, with the steps also grouped into Pre-Administration and Post-Administration phases.]
ASSESSMENT REVIEW
Three Review Tasks
1. Alignment Review
2. Data Analysis
3. Quality Control and Refinement
Two Tools (Template 6.3)
• Before the assessment is administered:
  • Task: Item and Test Alignment Review
  • Tool: Quality Control Checklist
• After the assessment is administered:
  • Task: Data Analysis and Item Refinement
  • Tool: Assessment Quality Rubric
Participant Materials
PowerPoint
Handouts
• Handout #6.1 – Alignment Reviews: Key Tasks, Procedural Steps, Workflow Diagram, Sample Results
• Handout #6.2 – Refinement Procedures: List of Tasks and Resources, Timelines
• Handout #6.3 – Assessment Quality Rubric Example
Templates
• Template #6.3 – Quality Control Checklist
• Template #6.3 – Assessment Quality Rubric
Participant Materials
• Alignment Characteristic Definitions
• Assessment Alignment Scoring Matrix/Targeted Standards
• Understanding Pennsylvania: Assessment
• Specification Tables
• Blueprints


MODULE 6 COMPONENTS
Objective 1: Develop a set of procedural steps needed to conduct alignment reviews of operational forms.
Objective 2: Create a refinement protocol for school-based assessment teams.
Objective 3: Develop customized procedures needed to implement an assessment quality rubric.
PM 1
MODULE 6.1
ALIGNMENT REVIEWS


ALIGNMENT REVIEWS
• Number of items aligned to each standard
• Distribution of items across standards
• Cognitive demand implied by the standards


ALIGNMENT REVIEWS (CONT.)
• Internal Reviews
• Content Reviews
• Fairness and Sensitivity Reviews
• Accessibility of Test Items
• Alignment Reviews
• External Reviews


ALIGNMENT
PM 1



ALIGNMENT REVIEW PROCESSES
1. Alignment Focus
2. Alignment Models (SEC, Achieve, and Webb)
3. PA Assessment Alignment Model
4. Workflow Diagram
5. Alignment Procedural Steps
6. Alignment Summary Example
1. ALIGNMENT FOCUS
Items/Tasks
• The degree to which the items/tasks address the targeted content standards in terms of:
  • (a) content match, and
  • (b) cognitive demand/higher-order thinking skills.


1. ALIGNMENT FOCUS (CONT.)
Operational Form
• The degree to which the completed assessment reflects (as described in the specification table and blueprint) the:
  • (a) content pattern of emphasis, and
  • (b) item/task sufficiency.
• Also focuses on developmental appropriateness and linguistic demand.


2. ALIGNMENT MODELS
• SEC (Surveys of Enacted Curriculum) Model
  • Survey and assessment data process
  • Focus on content topics and cognitive demand
  • Compares instructional data with assessment data
• Achieve Model
  • Panel review process
  • Focus on content, performance, challenge, balance, and range
  • Compares the content and performance challenge of an item/task to the standards and reviews levels of “coverage”


2. ALIGNMENT MODEL: WEBB
• Panel review process
• Focus on a reliable set of procedures and criteria for conducting alignment analysis studies
• Analyzes alignment of an assessment to content standards based on:
  a. Categorical Concurrence
  b. Balance of Representation
  c. Range of Knowledge
  d. Depth of Knowledge
  e. Source of Challenge
2. ALIGNMENT MODEL: WEBB (CONT.)
• Categorical Concurrence
  • The same categories of the content standards are included in the assessment.
  • Items could be aligned to more than one content standard.
• Balance of Representation
  • Ensures the distribution of the content standards across the operational form matches the test blueprint.


2. ALIGNMENT MODEL: WEBB (CONT.)
• Range of Knowledge
  • The extent of knowledge required to answer parallels the knowledge the standard requires.
• Depth of Knowledge
  • The cognitive demand of the content standard must align to the cognitive demand of the test item.
• Source of Challenge
  • Students give a correct or incorrect response for the wrong reason (e.g., linguistic demand).


3. PA ASSESSMENT ALIGNMENT MODEL: CHARACTERISTICS
1. Content Match (CM)
• Items/tasks match a specific content standard based upon the narrative description of the standard and a professional understanding of the knowledge, skill, and/or concept being described.
2. Cognitive Demand/Depth of Knowledge (DoK)
• Items/tasks reflect the cognitive demand/higher-order thinking skill(s) articulated in the standards, with extended performance tasks typically focused on several integrated content standards.


3. PA ASSESSMENT ALIGNMENT MODEL: CHARACTERISTICS (CONT.)
3. Content Pattern (CP)
• Item/task distributions represent the emphasis placed on the targeted content standards in terms of “density” and “instructional focus,” while encompassing the range of standards articulated on the test blueprint.
4. Item/Task Sufficiency (ITS)
• Item/task distributions consist of sufficient opportunities for test-takers to demonstrate skills, knowledge, and concept mastery at the appropriate developmental range.
4. ALIGNMENT WORKFLOW DIAGRAM
[Workflow diagram omitted; see Handout #6.1 for the alignment review workflow.]

PM 2-8
5. ALIGNMENT PROCEDURAL STEPS
1. Identify a team of teachers to conduct the alignment review (best accomplished by department or grade-level committees) with technical support from the district.
2. Organize items/tasks, operational forms, test specification tables, and targeted content standards.
3. Conduct panelist training on the alignment criteria and rating scheme. Use calibration techniques with a “training set” of materials prior to conducting the actual review.


PM 2-8
5. ALIGNMENT PROCEDURAL STEPS (CONT.)
Evaluate the following areas:
• Content Match (CM) and Depth of Knowledge (DoK)
  • Read each item/task in terms of matching the standards, both in content reflection and cognitive demand (DoK).
  • For SA, ECR, and extended performance tasks, ensure that scoring rubrics are focused on specific content-based expectations.
  • After reviewing all items/tasks, including scoring rubrics, count the number of item/task points assigned to each targeted content standard (a tally sketch follows this list).
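The point count in that last step is simple bookkeeping. A minimal Python sketch, with item IDs, standard codes, and point values invented for illustration:

```python
# Minimal sketch (hypothetical data): tally item/task points per targeted
# content standard, as recorded by an alignment review panel.
from collections import defaultdict

# (item ID, aligned standard, point value) judged by the panel.
panel_ratings = [
    ("item_01", "STD.6", 1),
    ("item_02", "STD.6", 1),
    ("item_03", "STD.7", 2),
    ("task_01", "STD.7", 4),  # e.g., a rubric-scored performance task
]

points_per_standard = defaultdict(int)
for item_id, standard, points in panel_ratings:
    points_per_standard[standard] += points

for standard, points in sorted(points_per_standard.items()):
    print(f"{standard}: {points} points")
```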


PM 2-8
5. ALIGNMENT PROCEDURAL STEPS (CONT.)
Evaluate the following area:
• Content Pattern (CP)
  • Determine if the items/tasks are sampling the complexity and extensiveness of the targeted content standards.
  • If the assessment’s range is too narrowly defined, refine blueprints and replace items/tasks to match the range of skills and knowledge implied within the targeted standards.


PM 2-8
5. ALIGNMENT PROCEDURAL STEPS (CONT.)
Evaluate the following area:
• Item/Task Sufficiency (ITS)
  • Determine the number of item/task points per targeted content standard based upon the total available. Using the item/task distributions, determine whether the assessment has at least five (5) points for each targeted content standard. Identify any shortfalls in which too few points are assigned to a standard listed in the specification table. Refine if patterns do not reflect those in the standards (a sufficiency check is sketched after this list).
  • Ensure the items/tasks are developmentally appropriate for the test-takers. Further, ensure the passages and narrative text do not contain linguistic challenges associated with vocabulary usage and text complexity.
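Building on the earlier tally, the five-point rule can be checked mechanically. A minimal sketch, again with invented standards and totals:

```python
# Minimal sketch (hypothetical data): flag targeted standards that fall
# short of the five-point sufficiency threshold described above.
MIN_POINTS = 5

# Points per targeted standard, e.g., produced by the earlier tally sketch.
points_per_standard = {"STD.5": 8, "STD.6": 2, "STD.7": 6}

shortfalls = {std: pts for std, pts in points_per_standard.items()
              if pts < MIN_POINTS}

for std, pts in sorted(shortfalls.items()):
    print(f"Shortfall: {std} has {pts} points (needs {MIN_POINTS})")
```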


PM 2-8
5. ALIGNMENT PROCEDURAL STEPS (CONT.)
• Record findings and present to the larger group with recommendations for improvements (e.g., new items/tasks, design changes, item/task refinements, etc.).
• Document the group’s findings and prepare for refinement tasks.
Alignment Summary (Example)
PM 2-8

Area/Grade               | CM | DoK | CP | ITS | Comments
Science Gr 5             | ✓  | ✓   | ✓  | ✓   | No Findings
Biology Gr 9             | ✓  | ✓   | ✓  | ✓   | No Findings
Chemistry HS             | ✓  |     | ✓  |     | Standard #6 ≠ DoK 2; Test Length ≠ 56 pts.
Life Science HS          | ✓  |     | ✓  |     | Standard #6 ≠ DoK 2; Test Length ≠ 58 pts.
Environmental Science HS | ✓  |     | ✓  |     | Standard #6 ≠ DoK 2; Test Length ≠ 60 pts.

Note: Standard #6 requires tasks to be DoK 2 or higher. The task points do not reflect the specification tables in quantity or distribution sufficiency.


MODULE 6.2
REFINEMENT PROTOCOLS


Assessment Life Cycle Refresher
[Cycle diagram repeated, highlighting the Review step within the Pre- and Post-Administration phases.]


“REVIEW PHASE”
The assessment cycle’s “Review Phase” consists of three key steps: alignment audit, data analysis, and refinement activities.
[Diagram: Alignment Audit → Data Analysis → Refinement Activities, arranged as a cycle.]


ALIGNMENT AUDIT [TASKS REVISITED]
Alignment audits are focused on the following four key tasks:
1. Items/tasks match a specific content standard based upon the narrative description of the standard and a professional understanding of the knowledge, skill, and/or concept being described.
2. Items/tasks reflect the cognitive demand/higher-order thinking skill(s) articulated in the standards, with extended performance tasks typically focused on several integrated content standards.


ALIGNMENT AUDIT [TASKS REVISITED, CONT.]
3. Item/task distributions represent the emphasis placed on the targeted content standards in terms of “density” and “instructional focus,” while encompassing the range of standards articulated on the test blueprint.
4. Item/task distributions consist of sufficient opportunities for test-takers to demonstrate skills, knowledge, and concept mastery at the appropriate developmental range.


DATA ANALYSIS
Analytical tasks are focused on:
• Collecting and evaluating item/task performance.
• Validating performance levels.
• Examining score distributions.
• Determining score consistency.
• Inspecting potential confounding variables.


DATA ANALYSIS (CONT.)
• Item difficulty (p-values; illustrated in the sketch after this list)
• Point-biserial correlation (Rbis)
• Distractor Analysis
• Attempted or omitted rates
• IRT (Item Response Theory)
• DIF (Differential Item Functioning) Analysis
• Mean Item Difficulty
• Mean Discrimination Index
• Test Characteristic Curve
• Reliability
• Inter-correlation of Sections
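As an illustration of the first two statistics in the list, here is a minimal Python sketch; the response matrix is invented, and numpy is assumed to be available:

```python
# Minimal sketch (hypothetical data): item difficulty (p-values) and
# corrected point-biserial correlations from a matrix of scored responses
# (rows = students, columns = items, 1 = correct).
import numpy as np

scores = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])

# Item difficulty: proportion of test-takers answering each item correctly.
p_values = scores.mean(axis=0)

# Point-biserial: correlation between each item score and the total score
# on the remaining items (the corrected item-total correlation).
totals = scores.sum(axis=1)
point_biserial = np.array([
    np.corrcoef(scores[:, j], totals - scores[:, j])[0, 1]
    for j in range(scores.shape[1])
])

print("p-values:", np.round(p_values, 2))
print("point-biserial (corrected):", np.round(point_biserial, 2))
```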
DATA ANALYSIS (CONT.)
Post-administration data are used to evaluate the psychometric properties of the assessment by examining aspects such as:
• Rater reliability
• Internal consistency (estimated in the sketch after this list)
• Inter-item correlations
• Decision consistency
• Measurement error
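One common internal-consistency estimate is coefficient (Cronbach's) alpha: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch with an invented score matrix, again assuming numpy:

```python
# Minimal sketch (hypothetical data): coefficient alpha as an estimate of
# internal consistency for a k-item assessment.
import numpy as np

scores = np.array([  # rows = students, columns = items (points earned)
    [1, 2, 1, 1],
    [0, 1, 1, 0],
    [1, 2, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
])

k = scores.shape[1]
item_variances = scores.var(axis=0, ddof=1)
total_variance = scores.sum(axis=1).var(ddof=1)

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"coefficient alpha = {alpha:.2f}")
```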


DATA ANALYSIS (CONT.)
• Sample Size
  • Preliminary item statistics: n = 100
  • DIF (Differential Item Functioning) Analysis: n = 100 per group
• Performance of Operational Forms and Items
  • IRT (Item Response Theory): n = 500–1,000
  • Rasch Model: smaller samples


REFINEMENT ACTIVITIES
PM 2-8
Refinement activities are those tasks used to:
• Create new items/tasks.
• Improve existing items/tasks.
• Create alternate forms.


PM 6
REFINEMENT ACTIVITIES (CONT.)
Refinement activities are those tasks used to:
• Improve human-scoring guidelines.
• Streamline administration protocols.
• Conduct professional development seminars/workshops.


T 1-7
MODULE 6.3
ASSESSMENT QUALITY REVIEW TOOLS


QUALITY REVIEWS
Ensuring the assessment:
• Reflects the developed test blueprint or specification table.
• Matches targeted content standards.
• Includes multiple ways for test-takers to demonstrate knowledge, skills, and abilities.
Eliminating potential validity threats by reviewing for:
• Bias
• Fairness
• Sensitive Topics
• Accessibility/Universal Design Features
Quality Control Checklist



• The checklist is designed to provide a “quick reference” to the criteria associated with high-quality item/task development.
• The checklist is organized into three parts:
  • Part I: Material Screening
  • Part II: Form/Item Rigor
  • Part III: Standardized Protocols


Review assessment items and tasks for:
1. Content accuracy
   Factually correct prompts and answers that are connected to the curriculum.
2. Item stems (multiple-choice items)
   Stems should present a definite, explicit, and singular question; avoid extraneous information in the stem.
3. Distractors (multiple-choice items)
   There should be one and only one correct answer, and the distractors must be incorrect; however, distractors should seem plausible to students who have not mastered the material. All choices should be internally consistent (parallel), contain the same level of detail, and be grammatically consistent with the stem.


4. Constructed-response (CR) questions
   The task and expectations for students should be clearly defined.
5. Cognitive demand/Depth of Knowledge (DoK)
   The degree to which critical analysis or advanced problem-solving skills are required to answer a question. Expectations for the cognitive demand or DoK of items should be addressed in the item and test specifications. In Webb’s alignment methodology (Webb, 1997), DoK refers to the level of cognitive processing required to answer a question; Webb defines four levels.
6. Alignment match
   Items are directly aligned to a content standard and associated Assessment Anchor(s) and Eligible Content.


PM 2-8
7. Grade-level appropriateness
   The degree to which items are written using language and content suitable for the grade level being assessed.
8. Item format
   The appearance of items, individually and together on a page.
9. Language load
   The verbal skills required to understand what the question is asking. Most experts indicate that language load should be minimized when testing any content other than language skills.
10. Editorial style
   Rules on how items should be written and appear.
11. Item difficulty
   The ease or complexity of an item, typically measured by the p-value statistic (percent correct).
QUALITY CONTROL CHECKLIST:
Part I: Material Screening

Task ID | Task | Status
1.1 | Purpose statement | □
1.2 | Content standards (selected) | □
1.3 | Specifications table | □
1.4 | Assessment blueprint | □
1.5 | Operational form | □
1.6 | Score key and/or scoring rubric(s) | □
1.7 | Administrative & scoring guidelines | □
QUALITY CONTROL CHECKLIST:
Part II: Form/Item Rigor


T 1-2
1. CONTENT REVIEWS
• Determine if each item/task clearly aligns to the targeted content standard. (Task ID 2.3, 2.6)
• Evaluate all items for content “accuracy.” (Task ID 2.3, 2.6)
• Judge if each item/task is developmentally (grade) appropriate in terms of: (Task ID 2.8)
  • Reading level
  • Vocabulary
  • Required reasoning skills
• Review each item/task response in terms of the targeted standards. (Task ID 2.3, 2.6)
PM 2-8
T 2
2. SENSITIVITY REVIEWS (Task ID 2.9)
Items/tasks should be:
• Sensitive to different cultures, religions, ethnic and socio-economic groups, and disabilities.
• Balanced by gender roles.
• Positive in their language, situations, and imagery.
• Void of text that may elicit strong emotional responses by specific groups of students.


PM 2-8
T 2
3. BIAS REVIEWS (Task ID 2.10)
• Bias is the presence of some characteristic of an item/task that results in the differential performance of two individuals with the same ability but from different subgroups.
• Bias-free items/tasks provide an equal opportunity for all students to demonstrate their knowledge and skills.
• Bias is not the same as stereotyping.


PM 2-8
T 2
4. FAIRNESS REVIEWS (Task ID 2.11)
• Fairness generally refers to the opportunity for test-takers to learn the content being measured.
• Item/task concepts and skills should have been taught to the test-taker prior to evaluating content mastery.
• Item/task fairness considerations are more complex for large-scale assessments.
• Fairness reviews examine the assumption that earlier grades taught the foundational content.
PM 2-8
T2
5. EDITORIAL REVIEWS
Task ID  Ensure the assessments have
2.1, 2.8,
2.12
developmentally appropriate:
• Readability levels.
• Sentence structures.
• Word choice.

Task ID
 Eliminate validity threats created by:
2.12 • Confusing or ambiguous directions or prompts.
• Imprecise verb use to communicate
expectations.
• Vague response criteria or structure.

© Pennsylvania Department of Education 52


Part II: FORM/ITEM RIGOR
Rigor Reminder! (Task ID 2.2, 2.7)
COGNITIVE DEMAND/DEPTH OF KNOWLEDGE
[A comparison of the Revised Bloom’s Taxonomy, Webb’s Classification, and associated verbs follows.]
COGNITIVE DEMAND/DEPTH OF KNOWLEDGE

Level | Revised Bloom’s Taxonomy | Webb’s Classification | Verbs
1 | Remembering & Understanding [Recalling specifics; processing knowledge at a low level] | Recall, Reproduction [Recall or recognition of a fact, information, term, or a simple procedure] | define, duplicate, list, memorize, recall, repeat, reproduce, state, classify, describe, discuss, explain, identify, locate, recognize, report, select, translate, paraphrase
2 | Applying [Using information in another familiar situation] | Skills, Concepts, Basic Reasoning [Use of information or conceptual knowledge] | choose, demonstrate, dramatize, employ, illustrate, interpret, operate, schedule, sketch, solve, use, write, appraise, compare, contrast, criticize, differentiate, discriminate, distinguish, examine, experiment, question
COGNITIVE DEMAND/DEPTH OF KNOWLEDGE (CONT.)

Level | Bloom’s Taxonomy | Webb’s Classification | Verbs
3 | Analyzing [Deconstructing information into subordinate parts to explore understandings and relationships] | Strategic Thinking, Complex Reasoning [Requires reasoning, developing a plan or sequence of steps; some complexity; more than one possible answer] | appraise, argue, defend, judge, select, support, value, evaluate, assemble, construct, create, develop, formulate, write
4 | Evaluating and Creating [Constructing and/or reorganizing information from elements and parts and then making value judgments about the method] | Extended Thinking, Extended Reasoning [Requires an investigation; time to think and process multiple conditions of the task; combine and synthesize ideas into new concepts] | design, critique, create, prove, apply concepts, connect
PM 2-8
T 1-2
Part II: FORM/ITEM RIGOR

Task ID | Task | Status
2.1 | Operational form is developmentally appropriate (100% on grade level) | □
2.2 | Operational form is rigorous (60% DoK 2 or higher) | □
2.3 | Operational form matches the targeted standards (100% accuracy) | □
PM 2-8
T 1-2
Part II: FORM/ITEM RIGOR

Task ID | Task | Status
2.4 | Operational form has sufficient item/task density (5 items/points) | □
2.5 | Operational form reflects the content pattern (95% coverage) | □
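Several of these Part II criteria are numeric and can be checked mechanically. A minimal sketch using invented item metadata, with thresholds taken from tasks 2.2, 2.4, and 2.5:

```python
# Minimal sketch (hypothetical data): numeric checks for three Part II
# criteria -- rigor (2.2: >= 60% of points at DoK 2 or higher), density
# (2.4: >= 5 points per standard), and coverage (2.5: >= 95% of the
# blueprint's standards represented on the form).
items = [  # (aligned standard, DoK level, points) -- invented values
    ("STD.1", 2, 3), ("STD.1", 1, 2), ("STD.2", 3, 4),
    ("STD.2", 2, 2), ("STD.3", 1, 5),
]
blueprint_standards = {"STD.1", "STD.2", "STD.3", "STD.4"}

total_points = sum(pts for _, _, pts in items)
rigor_points = sum(pts for _, dok, pts in items if dok >= 2)
print("2.2 rigor:", "pass" if rigor_points / total_points >= 0.60 else "flag")

density = {}
for std, _, pts in items:
    density[std] = density.get(std, 0) + pts
low = [std for std, pts in density.items() if pts < 5]
print("2.4 density shortfalls:", low or "none")

coverage = len(set(density) & blueprint_standards) / len(blueprint_standards)
print("2.5 coverage:", "pass" if coverage >= 0.95 else f"flag ({coverage:.0%})")
```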


PM 2-8
T 1-2
Part II: FORM/ITEM RIGOR

Task ID | Task | Status
2.6 | Items/tasks are assigned correctly to the targeted content standards | □
2.7 | Items/tasks are assigned the correct cognitive level | □
2.8 | Items/tasks are developmentally appropriate (readability, content focus) | □
2.9 | Items/tasks have been screened for sensitive subject matter | □
PM 2-8
T 1-2
Part II: FORM/ITEM RIGOR

Task ID | Task | Status
2.10 | Items/tasks have been screened for potential bias (e.g., contextual references, cultural assumptions, etc.) | □
2.11 | Items/tasks have been screened for fairness, including linguistic demand and readability | □
2.12 | Items/tasks have been screened for structure and editorial soundness | □
PM 2-8
T 3
Part III: STANDARDIZED PROTOCOLS

Task ID | Task | Status
3.1 | Specifications and/or blueprints reflect the operational form | □
3.2 | Administrative guidelines for teachers are clear and standardized | □
3.3 | Item/task directions for test-takers articulate expectations, response method, and point values | □
3.4 | Accommodation guidelines for SWD, 504, ELL, and others are referenced | □
3.5 | SCR/ECR scoring guidelines and rubrics are standardized | □


QUALITY CONTROL CHECKLIST: PROCEDURAL STEPS
1. Identify two subject matter experts (teachers) with experience in teaching the content upon which the assessment is based.
2. Complete Part I by organizing and reviewing the operational form, answer key and/or scoring rubrics, blueprint, administrative guide, etc.
3. Complete Part II by screening each item/task, highlighting any “potential” issues in terms of content accuracy, potential bias, sensitive materials, fairness, and developmental appropriateness, and then flagging any item/task needing a more in-depth review (see the sketch after this list).
4. Complete Part III by examining the assessment protocols and identifying any shortcomings.
5. Review the Quality Control Checklist; prepare for any needed item/task revisions.
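The flagging in step 3 can be captured in a simple structure. A minimal sketch, with hypothetical item IDs and a subset of the Part II screening categories:

```python
# Minimal sketch (hypothetical data): record Part II screening results and
# flag items/tasks needing a more in-depth review. True = passed screening.
screening = {
    "item_01": {"content_accuracy": True,  "bias": True,  "sensitivity": True},
    "item_02": {"content_accuracy": True,  "bias": False, "sensitivity": True},
    "task_01": {"content_accuracy": False, "bias": True,  "sensitivity": True},
}

flagged = [item for item, checks in screening.items()
           if not all(checks.values())]
print("Items/tasks needing in-depth review:", flagged)
```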


ASSESSMENT QUALITY RUBRIC


Step 1. Identify two subject matter experts (teachers) with experience in teaching the content upon which the assessment is based.
Step 2. Complete Dimension I by reviewing the operational form, answer key and/or scoring rubrics, blueprint, administrative guide, etc. that were created during the Design Phase of the assessment.
Step 3. Complete Dimension II by conducting alignment and other types of reviews. Also, review documentation on the procedures used in item/task development.
Step 4. Complete Dimension III after administration of the assessment. Evaluate the psychometric evidence of the assessment and how those data were used in refining the assessment for future administrations.
PM 2-8
T 4-6
DIMENSION I: DESIGN
I.A The assessment’s design is appropriate for the intended audience and reflects challenging material needed to develop higher-order thinking skills. The purpose of the performance measure is explicitly stated.
I.B The assessment’s design has targeted content standards representing a range of knowledge and skills students are expected to know and demonstrate.


PM 2-8
T 4-6
DIMENSION I: DESIGN (CONT.)
I.C Specification tables and blueprints articulate the number of items/tasks, item/task types, passage readability, and other information about the assessment.
I.D Items/tasks are rigorous (designed to measure a range of cognitive demands/higher-order thinking skills at developmentally appropriate levels) and of sufficient quantities to measure the depth and breadth of the targeted content standards.


PM 2-8
T 4-6
DIMENSION II: BUILD
II.A Items/tasks and score keys were developed using standardized procedures, including scoring rubrics for human-scored, open-ended questions. The total time to administer the assessment is developmentally appropriate for the test-takers.
II.B Items/tasks were created in terms of: (a) match to the targeted content standards, (b) content accuracy, (c) developmental appropriateness, (d) cognitive demand, (e) bias, (f) sensitivity, and (g) fairness.
PM 2-8
T 4-6
DIMENSION II: BUILD (CONT.)
II.C Administrative guidelines contain step-by-step procedures used to administer the assessment in a consistent manner, including scripts to orally communicate directions to students, day and time constraints, and allowable accommodations or adaptations.
II.D Scoring guidelines were developed for human-scored items/tasks to promote score consistency across items/tasks and among different scorers. These guidelines articulate point values for each item/task used to combine results into an overall raw score.
PM 2-8
T 4-6
DIMENSION II: BUILD (CONT.)
II.E Summary scores were reported in terms of raw and standard scores. Performance levels reflect the range of scores possible on the assessment and use statements or symbols to denote each level.


PM 2-8
T 4-6
DIMENSION III: REVIEW
III.A The assessment was reviewed in terms of: (a) item/task distribution based upon the design properties found within the specification and blueprint documents, and (b) item/task and form performance (e.g., levels of difficulty, complexity, distractor quality, bias, and other characteristics) using pre-established criteria.
III.B The assessment was reviewed in terms of: (a) editorial soundness, (b) document consistency, and (c) linguistic demand.
PM 2-8
T 4-6
DIMENSION III: REVIEW (CONT.)
III.C The assessment was reviewed in terms of the following alignment characteristics:
• Content Match (CM)
• Cognitive Demand/Depth of Knowledge (DoK)
• Content Pattern (CP)
• Item/Task Sufficiency (ITS)
III.D Post-administration analyses were conducted on the assessment (as part of the refinement process) to examine item/task performance, scale functioning, overall score distribution, rater drift, etc.
DIMENSION III: REVIEW (CONT.)
PM 2-8
T 4-6
III.E The assessment has score validity evidence demonstrating that item responses were consistent with content specifications. Data suggest that the scores represent the intended construct by using an adequate sample of items/tasks within the targeted content standards. Other sources of validity evidence, such as the interrelationship of items/tasks and the alignment characteristics of the assessment, are collected.
III.F Reliability coefficients are reported for the assessment, including estimates of internal consistency. Standard errors are reported for summary scores. When applicable, other reliability statistics, such as classification accuracy and rater reliability, are calculated and reviewed.
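The standard errors referenced in III.F are commonly derived from a reliability coefficient and the score spread using the classical formula SEM = SD * sqrt(1 - reliability); this is a standard psychometric identity, not a procedure prescribed by the module. A minimal sketch with invented values:

```python
# Minimal sketch (hypothetical values): the classical standard error of
# measurement for summary scores, SEM = SD * sqrt(1 - reliability).
import math

reliability = 0.88  # e.g., coefficient alpha (invented value)
sd = 6.5            # standard deviation of summary raw scores (invented value)

sem = sd * math.sqrt(1 - reliability)
print(f"SEM = {sem:.2f} raw-score points")  # about 2.25 here
```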
REFLECTION



SUMMARY
• Outlined steps needed to conduct alignment reviews of operational forms.
• Explored the refinement tasks.
• Examined an assessment quality control checklist tool and an assessment quality rubric.
