
may desire more unstructured portfolios.

As Wolf (1989) bluntly puts it: “Portfolios are messy” (p. 37). But in the
context of designing test items to reflect student learning, I take the position
that a structured approach will lead to a more satisfactory result that will
stand up to rigorous standards for validity and reliability. The evaluation
portfolio must have the same common elements as any test; it should not be
“messy.”

The Design of an Ideal Evaluation Portfolio


The steps for designing an ideal evaluation portfolio are familiar to readers of previous
chapters:

1. Identify the learner outcomes desired. Generally speaking, this will be a
student ability, such as reading, writing, or mathematical problem solving. You
might want to go further. Roid (1994) identifies five types of writing
activities: descriptive, persuasive, expository, narrative, and imaginative. You
might identify six analytic traits of concern in the development of writing
ability: ideas, organization, voice, word choice, sentence fluency, and
conventions (e.g., spelling, grammar, capitalization, punctuation). You also
might identify the developmental level of the student, as Purves (1993) did
using three levels: basic, proficient, and advanced.
2. Determine the content areas to be covered. From this three-
dimensional array above (five types of writing, six analytical traits, and three
developmental levels), you can specify what the portfolio is supposed to
accomplish. This might resemble a test blueprint and look like Table 7.4.
3. Provide a place for student self-reflection, analysis, and growth. This is a
critical part of the portfolio contents. Students are acting as autobiographers,
describing how they have improved, their insights into learning, and where they
are going. A good self-reflection should be something like a good Barbara
Walters interview: it should contain specific questions to help students
reflect. This is a rare opportunity for students to express their personal views
about learning. Remember that you need to gain students’ confidence and trust
before they will express their true feelings about learning. Thus, grading the
self-reflection is a questionable practice. The next section deals with an
acceptable way to score this part, if you want to do so.
4. Provide a schedule for completion. Students need to have a specific
timeline and deadline for the portfolio. A good idea is to have a timeline for the
completion of specific aspects. Procrastinators will let everything slide until
the end. Checkpoints in the timeline allow teachers and parents glimpses into
the process without appearing too heavy-handed or interfering. Deadlines
should be established; but if a mastery approach is used, then flexibility is
needed with respect to any deadline, as students should have extra time for
revision and polishing.
5. Provide for choices in assignments. This is a controversial issue in testing:
with choice come differences in performance that may happen because a
student chose an easy or a hard assignment (Roid, 1994). Testing experts will
recommend against choice among items or activities (Wainer, Wang, & Thissen,
1994), but curriculum and instruction experts will insist on it (Wiggins, 1993).
You will have to decide whether to use a structured, unmessy approach and lose
some of the advantage in student selection, or to use a more unstructured
approach and follow Wolf’s predilection for a more collegial and collaborative
arrangement between student and teacher.
6. State how the portfolio should be created. Should a table of contents be
provided? Yes, because it provides an advance organizer for the
teacher/evaluator. Yes, because it also provides a good overview for the student
of what the portfolio is trying to accomplish.

TABLE 7.4
A Blueprint for a Student Evaluation Portfolio

              IDEAS  ORGANIZATION  VOICE  WORD CHOICE  CONVENTIONS
Descriptive       4             3      2            4            3
Persuasive       10             3      2            4            3
Expository        4             3      2            4            3
Narrative         6             3      6            4            3
Imaginative       8             3      6            4            3
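
As a rough illustration, the blueprint in Table 7.4 can be held in a small data structure and checked for balance. The sketch below copies the cell values from the table; the names `BLUEPRINT`, `row_totals`, `column_totals`, and `grand_total` are hypothetical. The text does not say what the cell values measure, so reading them as percentage weights is an assumption, although the cells as printed do sum to 100.

```python
# Cell values copied from Table 7.4. The text does not state what they
# measure, so treating them as percentage weights is an assumption.
TRAITS = ["ideas", "organization", "voice", "word choice", "conventions"]

BLUEPRINT = {
    "descriptive": [4, 3, 2, 4, 3],
    "persuasive": [10, 3, 2, 4, 3],
    "expository": [4, 3, 2, 4, 3],
    "narrative": [6, 3, 6, 4, 3],
    "imaginative": [8, 3, 6, 4, 3],
}


def row_totals(blueprint):
    """Total weight given to each type of writing."""
    return {kind: sum(cells) for kind, cells in blueprint.items()}


def column_totals(blueprint, traits):
    """Total weight given to each analytic trait."""
    columns = zip(*blueprint.values())
    return {trait: sum(column) for trait, column in zip(traits, columns)}


def grand_total(blueprint):
    """Sum of every cell in the blueprint."""
    return sum(sum(cells) for cells in blueprint.values())


if __name__ == "__main__":
    print(row_totals(BLUEPRINT))
    print(column_totals(BLUEPRINT, TRAITS))
    print(grand_total(BLUEPRINT))  # 100
```

Totals like these make it easy to see where a blueprint places its emphasis, for example that persuasive and narrative writing carry equal weight in this table.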

TABLE 7.5
A Template for Designing a Student Evaluation Portfolio
Table of contents: This is a one-page listing of the products included.
Student’s reflective letter: This section contains the student’s
  autobiographical reflection on learning, frustrations, successes,
  motivations, and other insights. Indicate a maximum length for this letter.
Specific tasks to be included: This section contains the products to be
  evaluated in the order listed in the table of contents. Determine if students
  choose examples or if you have a specific list of what you expect. Refer to
  Tables 5A.2 and 6.3 for specific advice on items requiring high and low
  inferences.
Limit for number of pages: How many pages long is this portfolio?
Consultation: Were students allowed or encouraged to consult with others?
Collaboration: Was editorial assistance obtained? Is this O.K.?
Appendix: Is an appendix required that may contain ancillary material, such
  as preliminary drafts?
Grading criteria: Rating scales and/or checklists are included for each
  object, with the point value applied to the student grade. This section
  should inform the student of exactly how the portfolio connects to the
  student grade, specifying, if points are assigned, how every student
  performance is evaluated in terms of points. Refer to Table 5B.6 for
  information on systematically constructing rating scales.

Table 7.5 provides a template for designing an evaluation portfolio that might
be used by the teacher, with a more detailed version given to the student.
Chronic problems with poor interrater consistency, bias, and reliability occur
in scoring performance tests and portfolios. Templates for high-inference and
low-inference items can help in designing specific items for the evaluation
portfolio.

Scoring the Evaluation Portfolio


This section is problematic because the technology for scoring portfolios is so
new and experimental. Recent experiences in Kentucky and Vermont with
statewide testing programs featuring portfolios have provided some negative
results with scoring. With the Vermont portfolio, Koretz et al. (1994) reported
low reliability. Kentucky’s experiments with statewide portfolios found local
teacher bias to be considerable in scoring. External auditors scored the same
portfolios and discovered systematic discrepancies indicating leniency on the
teachers’ part. Because of this bias, the state reevaluated its portfolio scoring
process and eliminated intradistrict scoring.
In part, these problems were due to the inexperience of the teachers
involved in this project. Public accountability was also a factor, as was discussed
earlier in this chapter. If teachers know they are going to be evaluated, then
their judgments are likely to be influenced by this fact. These statewide
assessments have told us that schools and teachers mostly favor using
portfolios, but the scoring process is still fraught with hazards that limit the
usefulness of results thus far.
You need to develop a strategy to overcome these limitations. You know
that scoring will be time-consuming and expensive and that teachers and other
judges will need to be well trained and experienced. To ignore these technical
problems invites trouble. If test scores are used to make decisions affecting
students, such as pass/fail, or affecting teachers’ future employment, litigation
may be initiated and school personnel may be in an indefensible position with
respect to how the portfolio results were used.
We will review several basic approaches to scoring portfolios, and then
focus on one method that is best suited for measuring student behavior while
still providing some dependability.

Holistic Scoring
A holistic rating scale has not been substantially supported in this book as a
sound basis for scoring performance tests, with the portfolio being considered a
special case of a performance test. Thus, the holistic rating scale is not
recommended here. The primary reason is that if a single rating scale is used to
score a portfolio, the limited range of scores is not likely to be very dependable
(Reckase, 1995). Another problem is that a holistic rating scale is used with
student products that may vary according to personal choice; the rating scale
does not automatically standardize results. Some students may choose more
challenging material or tasks than other students do. Although this is true with
other approaches, too, it is less of a problem there. The sole argument favoring
a holistic scale is that you may be concerned with the wholeness of writing or
some similar ability. Given the preponderance of arguments against the holistic
scale, it is easy to see why it can’t be recommended here.

Analytical Scoring
Recall from Chapter 5B that analytical scales provide a profile of important
characteristics of performance. For instance, if we were scoring writing samples,
we might be interested in analytic traits, as shown in Table 5B.3. Analytical
scoring has the advantage of producing more variation in scores and, more
likely, higher reliability. Thus, you can have more confidence in the student
result and have a more defensible result. Another advantage is the analytical
diagnosis of strengths and weaknesses in the ability being tested.

Reliability
Reckase (1995) provides a thoughtful and useful analysis of the reliability of
portfolio scores. To get a sufficient reliability level (.80 on a scale from .00
to 1.00), he estimates that roughly seven scorable categories are needed. If the
quality of scoring is very high, the number of scorable categories might be
reduced to five. As mentioned earlier in this chapter, the cost of such a lengthy
scoring process is likely to be very high, between $10 and $20 per test. But as
also mentioned earlier, if the portfolio is viewed as a significant achievement
of each student and if the use of the portfolio result is important for placement,
future instruction, and grading, then this cost and effort may be justified.
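
One way to get a feel for why the number of scorable categories matters is the Spearman-Brown prophecy formula, a standard tool for projecting the reliability of a lengthened test. The sketch below is only an illustration: it is not necessarily the method Reckase (1995) used, it assumes the categories behave like parallel measures, and the function names are hypothetical.

```python
def spearman_brown(single_reliability, n_categories):
    """Projected reliability of a composite of n parallel scoring categories."""
    r = single_reliability
    return n_categories * r / (1 + (n_categories - 1) * r)


def required_single_reliability(target, n_categories):
    """Per-category reliability needed for n categories to reach the target.

    Obtained by solving the Spearman-Brown formula for the single-category
    reliability.
    """
    return target / (n_categories - (n_categories - 1) * target)


if __name__ == "__main__":
    for n in (5, 7):
        needed = required_single_reliability(0.80, n)
        print(n, round(needed, 2))  # 5 -> 0.44, 7 -> 0.36
```

Under these assumptions, seven categories reach .80 when each category alone has a reliability of about .36, while five categories require about .44 each. That pattern is consistent with the point that fewer categories suffice only when the quality of scoring is very high.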

A Scoring Guide
Table 7.6 presents a scoring guide that organizes the aspects of the
evaluation portfolio in summary form. The total portfolio is worth 100 points,
and the 100 points can be used as part of the criteria in a grading policy. The
first two items in the scoring guide contain essential elements that are
nongraded, although points are assigned for their completion. The same is true
for the appendix. You simply want the information, but it is nongraded. Six
generic items are provided, following Reckase’s recommendation for scorable
units. A minimum of six items, together with the three observations (table of
contents, self-reflection, and appendix), should constitute enough scoring
categories to provide a reliable index of achievement. The analytical rating
scales might contain five points and reflect the developmental levels desired in
teaching abilities, such as writing. The most confusing part is converting
ratings into the points assigned, because you are likely to vary the weight
assigned to each item, and the weights are not multiples of five.
The self-reflection section is not evaluated but is merely acknowledged.
Students are not required to write a “good” reflection but to take the
opportunity to reflect on their learning style and strategy and what they did to
succeed. If they complete the reflection, this hypothetical scoring system
gives the student point credit toward a grade. The premium is on doing the
self-reflection, not on the quality of the effort.
The scoring guide in Table 7.6 is consistent with good measurement practice
and gives you the opportunity to structure the portfolio so that a student can
produce a high-quality effort of work consistent with your standards. The
scoring is objective with respect to the table of contents, self-reflection, and
appendix, but will require extensive work with respect to the six items.

TABLE 7.6
Hypothetical Scoring Guide for Writing Portfolio
ACTIVITY/ASSIGNMENT              POSSIBLE POINTS  RATING OR CHECKLIST  POINTS EARNED
Table of contents                             10  Yes  No
Self-reflection                               10  Yes  No
Item 1                                        12  1…5
Item 2                                        22  1…5
Item 3                                         8  1…5
Item 4                                        20  1…5
Item 5                                         8  1…5
Item 6                                         5  1…5
Appendix: Showing rough drafts                 5  Yes  No
Total                                        100
Note: Ratings are based on a series of five-point analytical rating scales
designed for each of the six specific tasks. Refer to Chapter 5B for information
on the design of these scales. The value of each rating-scale point should be
converted to the points possible in a consistent manner. For instance, a
five-point rating scale to judge item 2 would require that each rating-scale
point be worth 4.4 points (22 possible points divided by 5). This
straightforward arithmetic conversion reflects that each item has a different
weight, as determined by you, the teacher.

Rating:  0    1    2     3     4    5
Points:  0  4.4  8.8  13.2  17.6   22
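
The conversion above, together with the completion credits in Table 7.6, can be sketched in a few lines. This is a minimal illustration rather than a prescribed procedure: the point values come from Table 7.6, while the function and variable names are hypothetical, and allowing a rating of 0 follows the conversion row above rather than the 1…5 range shown in the table.

```python
# Point values copied from Table 7.6; all names here are illustrative only.
ITEM_POINTS = {
    "Item 1": 12, "Item 2": 22, "Item 3": 8,
    "Item 4": 20, "Item 5": 8, "Item 6": 5,
}
COMPLETION_POINTS = {
    "Table of contents": 10, "Self-reflection": 10, "Appendix": 5,
}


def rating_to_points(rating, possible_points, scale_max=5):
    """Convert a rating on a 0..scale_max scale to weighted portfolio points."""
    if not 0 <= rating <= scale_max:
        raise ValueError("rating must be between 0 and scale_max")
    return rating * possible_points / scale_max


def portfolio_score(ratings, completed):
    """Total score: weighted item ratings plus credit for completed parts."""
    item_total = sum(
        rating_to_points(ratings[name], points)
        for name, points in ITEM_POINTS.items()
    )
    completion_total = sum(
        points for name, points in COMPLETION_POINTS.items()
        if completed.get(name, False)
    )
    return item_total + completion_total


if __name__ == "__main__":
    # Reproduces the conversion row for item 2 (22 possible points).
    print([round(rating_to_points(r, 22), 1) for r in range(6)])
    # A student rated 5 on every item who completed all three parts.
    top_ratings = {name: 5 for name in ITEM_POINTS}
    all_done = {name: True for name in COMPLETION_POINTS}
    print(portfolio_score(top_ratings, all_done))  # 100.0
```

Keeping the weights in one place means a change to any item's possible points carries through to every student's total without recalculating by hand.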

Students should always know the value of each item. In fact, they should have
the scoring guide when they receive their portfolio assignment.

Summary
In this chapter you have read about the five types of portfolios and how
portfolio results should and should not be used. The strengths and limitations of
the portfolio were discussed. Although using the portfolio poses many
problems, it may be the most significant and controversial aspect of teaching
and testing today. For the many reasons given in Chapters 5 and 6 and in this
chapter, the portfolio is likely to be a mainstay in classrooms at all levels,
including professional, business, and military training. Portfolio design can reach
a high level of expertise, and with fine tuning we can expect to use a portfolio
design repeatedly over many years. The template in Table 7.5 helps you design a
portfolio, but considerable creativity is also needed. There are major problems
in scoring the results of a portfolio. Not only do we face problems with interrater
consistency, but there is bias as well. The cost of scoring portfolios may be
higher than sponsors want to support, and shortcuts lead to low reliability and
undependable results that cannot be used for high-stakes purposes. The most
realistic position to take here is to evaluate the importance of the portfolio
versus its cost. If it is done correctly, it can be used to drive instruction
appropriately, provide evidence of opportunity to learn, and provide scores that
are dependable enough to assign grades, place students for instruction, evaluate
curriculum, evaluate the instructional program, evaluate teaching, and give state
or national policymakers information about how the public’s resources are being
spent to educate students. Table 7.6 provides a hypothetical scoring guide. The
portfolio should continue to increase in use, but at the same time teachers will
improve their knowledge, skills, and ability to design and use portfolios to
address more effectively the development of school-based abilities.
