Objectives:
At the end of this module, the learner should be able to:
Explain the goal of assessment
Categorize different types of assessment methods
Distinguish the methods and procedures used in creating effective assessments
Create their own assessment tools
ASSESSMENT OF LEARNING
Assessments are a way to find out what students have learned and whether they are meeting curriculum or
grade-level standards.
Assessments of learning are usually grade-based, and can include:
Exams
Portfolios
Final projects
Standardized tests
They have a concrete grade attached to them that communicates student achievement to teachers, parents,
students, school-level administrators and district leaders.
Common types of assessment of learning include:
Summative assessments
Norm-referenced assessments
Criterion-referenced assessments
ASSESSMENT AS LEARNING
Assessment as learning actively involves students in the learning process. It teaches critical thinking and
problem-solving skills, and encourages students to set achievable goals for themselves and to measure their
progress objectively.
They can help engage students in the learning process, too! One study found:
“Students develop an interest in mathematical tasks that they understand, see as relevant to their own
concerns, and can manage. Recent studies of students’ emotional responses to mathematics suggest that
both their positive and their negative responses diminish as tasks become familiar and increase when tasks are
novel” (21)
Douglas B. McLeod
Some examples of assessment as learning include ipsative assessments, self-assessments and peer
assessments.
1. DIAGNOSTIC ASSESSMENT
2. FORMATIVE ASSESSMENT
Just because students made it to the end-of-unit test doesn’t mean they’ve mastered the skill. Formative
assessments help teachers understand student learning while they teach, and adjust their teaching strategies
accordingly.
Meaningful learning involves processing new facts, adjusting assumptions and drawing nuanced conclusions.
Or, as researchers Thomas Romberg and Thomas Carpenter describe it:
“Current research indicates that acquired knowledge is not simply a collection of concepts and procedural skills
filed in long-term memory. Rather, the knowledge is structured by individuals in meaningful ways, which grow
and change over time.”
Formative assessments help you track how student knowledge is growing and changing in your classroom in
real-time. While it requires a bit of a time investment — especially at first — the gains are more than worth it.
Some examples of formative assessments include:
Portfolios
Group projects
Progress reports
Class discussions
Entry and exit tickets
Short, regular quizzes
Virtual classroom tools like Socrative or Kahoot!
When running formative assessments in your classroom, it’s best to keep them short, easy to grade and
consistent. Introducing students to formative assessments in a low-stakes way can help you benchmark their
progress and reduce math anxiety when a big test day rolls around.
3. SUMMATIVE ASSESSMENT
Summative assessments measure student progress as an
assessment of learning and provide data for you, school
leaders and district leaders.
They're cost-efficient and valuable when it comes to
communicating student progress, but they don’t always
give clear feedback on the learning process and can foster
a “teach to the test” mindset if you’re not careful.
Plus, they’re stressful for teachers. One Harvard
survey found 60% of teachers said that preparing students to
pass mandated standardized tests “dictates most of” or
“substantially affects” their teaching.
Sound familiar?
But just because it’s a summative assessment doesn’t mean it can’t be engaging for students and useful for
your teaching. Try creating assessments that deviate from the standard multiple-choice test, like:
Recording a podcast
Writing a script for a short play
Producing an independent study project
No matter what type of summative assessment you give your students, keep some best practices in mind:
Keep it real-world relevant where you can
Make questions clear and instructions easy to follow
Give a rubric so students know what’s expected of them
Create your final test after, not before, teaching the lesson
Try blind grading: don’t look at the name on the assignment before you mark it
Use these summative assessment examples to make them effective and fun for your students!
Use Assignments to differentiate math practice for each student or send an end-of-unit test to the whole class.
Or use our Test Prep tool to understand student progress and help them prepare for standardized tests in an
easy, fun way!
4. IPSATIVE ASSESSMENTS
How many of your students get a bad grade on a test and get so discouraged they stop trying?
Ipsative assessments are a type of assessment as learning that compares a student’s previous results with a
second try, motivating students to set goals and improve their skills.
When a student hands in a piece of creative writing, it’s just the first draft. Students practice athletic skills and
musical talents to improve, but don’t always get the same chance in other subjects like math.
A two-stage assessment framework helps students learn from their mistakes and motivates them to do better.
Plus, it removes the instant gratification of goals and teaches students learning is a process.
You can incorporate ipsative assessments into your classroom with:
Portfolios
A two-stage testing process
Project-based learning activities
One study on ipsative learning techniques found that when it was used with higher education distance learners,
it helped motivate students and encouraged them to act on feedback to improve their grades. What could it
look like in your classroom?
5. NORM-REFERENCED ASSESSMENTS
Norm-referenced assessments are tests designed to
compare an individual to a group of their peers,
usually based on national standards and occasionally
adjusted for age, ethnicity or other demographics.
Proponents of norm-referenced assessments point out that they accentuate differences among test-takers and
make it easy to analyze large-scale trends. Critics argue they don’t encourage complex thinking and can
inadvertently discriminate against low-income students and minorities.
Norm-referenced assessments are most useful when measuring student achievement to determine:
Language ability
Grade readiness
Physical development
College admission decisions
Need for additional learning support
While they’re not usually the type of assessment you deliver in your classroom, chances are you have access
to data from past tests that can give you valuable insights into student performance.
6. CRITERION-REFERENCED ASSESSMENTS
Criterion-referenced assessments compare the score of an individual student to a learning standard and
performance level, independent of other students around them.
In the classroom, this means measuring student performance against grade-level standards and can include
end-of-unit or final tests to assess student understanding.
Outside of the classroom, criterion-referenced assessments appear in professional licensing exams, high
school exit exams and citizenship tests, where the student must answer a certain percentage of questions
correctly to pass.
Criterion-referenced assessments are most often compared with norm-referenced assessments. While they’re
both valuable types of assessments of learning, criterion-referenced assessments don’t measure students
against their peers. Instead, each student is graded on their own strengths and weaknesses.
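The contrast can be sketched in code: a criterion-referenced judgment compares a score to a fixed standard, while a norm-referenced one compares it to a peer group. All scores and the cutoff below are hypothetical:

```python
from bisect import bisect_left

def criterion_referenced(score, cutoff=75):
    """Pass/fail against a fixed standard, independent of other students."""
    return "pass" if score >= cutoff else "fail"

def norm_referenced(score, peer_scores):
    """Percentile rank: the percentage of peers scoring below this score."""
    ranked = sorted(peer_scores)
    return 100 * bisect_left(ranked, score) / len(ranked)

peers = [55, 60, 62, 70, 71, 74, 78, 82, 90, 95]
print(criterion_referenced(74))        # fails the fixed 75% standard...
print(norm_referenced(74, peers))      # ...yet sits mid-pack among peers
```

The same raw score of 74 fails the criterion-referenced cutoff while landing at the 50th percentile of its peer group, which is exactly the distinction the two assessment types capture.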
LESSON #5 What’s the Purpose of Different Types of Assessment?
Objectives:
At the end of this module, the learner should be able to:
Create effective assessment methods and tools
MAKE A RUBRIC
Students do their best work when they know what’s expected of them and how they’ll be marked. Whether
you’re assigning a cooperative learning project or an independent study unit, a rubric details the exact
requirements students must meet to get a specific grade.
Ideally, your rubric should have a detailed breakdown of all the project’s individual parts, what’s required of
each group member and an explanation of what would be poor, passable, good or excellent work.
A well-crafted rubric lets multiple teachers grade the same assignment and arrive at the same mark. It’s an
important part of assessments for learning and assessments of learning, and teaches students to take
responsibility for the quality of their work.
While student grades provide a useful picture of achievement and help you communicate progress to school
leaders and parents, the ultimate goal of assessments is to improve student learning.
Ask yourself questions like:
What’s my plan for the results?
Who’s going to use the results, besides me?
What do I want to learn from this assessment?
What’s the best way to present the assessment to my students, given what I know about their progress and
learning styles?
This helps you effectively prepare students and create an assessment that moves learning forward.
The analytic approach consists in considering each part of the object and the contribution that it
makes to the assemblage by its nature and function, and thus to arrive at a mental representation
of the whole by applying rules of composition to its parts.
The holistic approach is to directly grasp the whole without consideration of the parts. This can be
done if the object itself is already familiar or if, by its contours or its contextual setting and function,
it bears an essential analogy to some familiar object.
3. A given complex object may only be analysed in certain parts or aspects, while the internal structure of
other parts remains out of consideration.
To illustrate:

E1. a. X chooses the correct approach to Y.
    b. X takes the correct approach to Y.

In E1.a the combination of the relational noun approach with its prepositional dependent, and the
combination of the transitive verb choose with its direct object, are interpreted by general rules of
semantosyntax.
In E1.b the combination X [takes (Z) approach] to Y constitutes a proper part of the sentence. Its contour
and function are analogous to the simpler construction X approaches Y (in a Z way).
2. The collocation choose ... approach could instead be accessed holistically, whereby the specific
contribution of choose would essentially be foregone, and the whole would be largely synonymous
with take ... approach; and again, the collocation take ... approach could instead be accessed
analytically, whereby take would regain a more literal sense (contrasting, e.g., with abandon), and the
resulting constructional meaning would be slightly different.
3. The holistic approach treats take ... approach as a proper part of the construction, which it is not in the
analytic approach. However, this does not mean that the construction of E1.b is an unanalyzed whole,
since we can still integrate the contributions of each of the elements in the slots X, Y and Z with the
help of general compositional rules.
The faculty of taking either of the two approaches is wired into the human brain: in right-handers, the
left hemisphere is specialized in the analytic approach, while the right hemisphere is specialized in
the holistic approach. This neurological foundation guarantees that the two approaches are ubiquitous
in human life. They are also constitutive of language.
When a child learns his mother tongue, he at first does not know the rules. He therefore lacks the
prerequisites for taking an analytic approach even to expressions that a fully competent speaker may
analyze. In the initial phase, he therefore takes everything that he learns as unanalyzable wholes. In
the course of further learning, the child abstracts the common schemata from all the expressions
which he first stored as unanalyzed wholes (cf. Tomasello 2003). Further incoming expressions of the
same kind may then be accessed analytically. The units already learnt may, from then on, be
accessed either holistically or analytically.
Reference:
https://www.prodigygame.com/main-en/blog/types-of-assessment/
https://www.christianlehmann.eu/ling/ling_theo/holistic_vs_analytic_approach.html
HOW TO CREATE A RUBRIC:
Perhaps you have never even thought about the care it takes to create a rubric. Perhaps you have never
even heard of a rubric and its usage in education, in which case, you should take a peek at this
article: "What is a rubric?" Basically, this tool that teachers and professors use to help them communicate
expectations, provide focused feedback, and grade products, can be invaluable when the correct answer is
not as cut and dried as Choice A on a multiple choice test. But creating a great rubric is more than just
slapping some expectations on a paper, assigning some percentage points, and calling it a day. A good
rubric needs to be designed with care and precision in order to truly help teachers distribute and receive the
expected work.
STEPS TO CREATE A RUBRIC
The following six steps will help you when you decide to use a rubric for assessing an essay, a project,
group work, or any other task that does not have a clear right or wrong answer.
Before you can create a rubric, you need to decide the type of rubric you'd like to use, and that will largely
be determined by your goals for the assessment.
Once you've figured out how detailed you'd like the rubric to be and the goals you are trying to reach, you
can choose a type of rubric.
1. Analytic Rubric: This is the standard grid rubric that many teachers routinely use to assess
students' work. This is the optimal rubric for providing clear, detailed feedback. With an analytic
rubric, the criteria for the students' work are listed in the left column and performance levels are
listed across the top. The squares inside the grid typically contain the specifications for each level.
A rubric for an essay, for example, might contain criteria like "Organization, Support, and Focus,"
and may contain performance levels like "(4) Exceptional, (3) Satisfactory, (2) Developing, and (1)
Unsatisfactory." The performance levels are typically given percentage points or letter grades and
a final grade is typically calculated at the end. The scoring rubrics for the ACT and SAT are
designed this way, although when students take them, they will receive a holistic score.
2. Holistic Rubric: This is the type of rubric that is much easier to create, but much more difficult to
use accurately. Typically, a teacher provides a series of letter grades or a range of numbers (1-4 or
1-6, for example) and then assigns expectations for each of those scores. When grading, the
teacher matches the student work in its entirety to a single description on the scale. This is useful
for grading multiple essays, but it does not leave room for detailed feedback on student work.
This is where the learning objectives for your unit or course come into play. Here, you'll need to brainstorm
a list of knowledge and skills you would like to assess for the project. Group them according to similarities
and get rid of anything that is not absolutely critical. A rubric with too many criteria is difficult to use! Try to
stick with 4-7 specific subjects for which you'll be able to create unambiguous, measurable expectations in
the performance levels. You'll want to be able to spot the criteria quickly while grading and be able to
explain them quickly when instructing your students. In an analytic rubric, the criteria are typically listed
along the left column.
Once you have determined the broad levels you would like students to demonstrate mastery of, you will
need to figure out what type of scores you will assign based on each level of mastery. Most ratings scales
include between three and five levels. Some teachers use a combination of numbers and descriptive labels
like "(4) Exceptional, (3) Satisfactory, etc." while other teachers simply assign numbers, percentages, letter
grades or any combination of the three for each level. You can arrange them from highest to lowest or
lowest to highest as long as your levels are organized and easy to understand.
This is probably your most difficult step in creating a rubric. Here, you will need to write short statements of
your expectations underneath each performance level for every single criterion. The descriptions should be
specific and measurable. The language should be parallel to help with student comprehension and the
degree to which the standards are met should be explained.
Again, to use an analytic essay rubric as an example, if your criterion was "Organization" and you used
the (4) Exceptional, (3) Satisfactory, (2) Developing, and (1) Unsatisfactory scale, you would need to write
the specific content a student would need to produce to meet each level. It could look something like this:
Organization
(4) Exceptional: Organization is coherent, unified and effective in support of the paper’s purpose, and
consistently demonstrates effective and appropriate transitions between ideas and paragraphs.
(3) Satisfactory: Organization is coherent and unified in support of the paper’s purpose, and usually
demonstrates effective and appropriate transitions between ideas and paragraphs.
(2) Developing: Organization is coherent in support of the essay’s purpose, but is ineffective at times and
may demonstrate abrupt or weak transitions between ideas or paragraphs.
(1) Unsatisfactory: Organization is confused and fragmented. It does not support the essay’s purpose and
demonstrates a lack of structure or coherence that negatively affects readability.
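Per-criterion levels on an analytic rubric are often combined into a final grade. A minimal sketch of one way to do this, using the 4-level scale above with hypothetical criteria and weights:

```python
# Hypothetical analytic rubric: three criteria with weights summing to 1,
# each scored on the (4) Exceptional ... (1) Unsatisfactory scale.
RUBRIC_WEIGHTS = {"Organization": 0.4, "Support": 0.35, "Focus": 0.25}
TOP_LEVEL = 4

def final_percentage(level_scores):
    """Weighted average of the per-criterion levels, scaled to a percentage."""
    weighted = sum(RUBRIC_WEIGHTS[c] * lvl for c, lvl in level_scores.items())
    return round(100 * weighted / TOP_LEVEL, 1)

essay = {"Organization": 4, "Support": 3, "Focus": 3}
print(final_percentage(essay))   # 85.0
```

Whether levels map to percentages, letter grades, or points is the teacher's choice; the weighting step is what makes the rubric analytic rather than holistic.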
A holistic rubric would not break down the essay's grading criteria with such precision. The top two tiers of a
holistic essay rubric would look more like this:
After creating the descriptive language for all of the levels (making sure it is parallel, specific and
measurable), you need to go back through and limit your rubric to a single page. Too many parameters will
be difficult to assess at once, and may be an ineffective way to assess students' mastery of a specific
standard. Consider the effectiveness of the rubric, asking for student understanding and co-teacher
feedback before moving forward. Do not be afraid to revise as necessary. It may even be helpful to grade a
sample project in order to gauge the effectiveness of your rubric. You can always adjust the rubric if need
be before handing it out, but once it's distributed, it will be difficult to retract.
Grading Rubrics: Sample Scales
Different assignment types may require different grading scales and different numbers of levels. As you
develop your rubric, decide how many different levels it should have. For example, you may choose a
rubric with three or four levels for an essay assignment, while a one-level rubric (or credit/no credit) may be
useful for smaller assignments and save you time when grading. You may also consider whether to list the
highest possible level of achievement first or last. Be mindful with your word choice when labelling your
rating scales, especially if the grading rubric will be shared with the student. As Stevens & Levi note,
“labelling the levels on the scale can be a delicate matter. We need to be clear about expectations and
about failures as well as successes, yet we also try to avoid overly negative or competitive labels.”
Three Levels
Weak, Satisfactory, Strong
Beginning, Intermediate, High
Weak, Average, Excellent
Developing, Competent, Exemplary
Low Mastery, Average Mastery, High Mastery
Four Levels
Unacceptable, Marginal, Proficient, Distinguished
Beginning, Developing, Accomplished, Exemplary
Needs Improvement, Satisfactory, Good, Accomplished
Emerging, Progressing, Partial Mastery, Mastery
Not Yet Competent, Partly Competent, Competent, Sophisticated
Inadequate, Needs Improvement, Meets Expectations, Exceeds Expectations
Poor, Fair, Good, Excellent
Five Levels
Poor, Minimal, Sufficient, Above Average, Excellent
Novice, Intermediate, Proficient, Distinguished, Master
Unacceptable, Poor, Satisfactory, Good, Excellent
Six Levels
Unacceptable, Emerging, Minimally Acceptable, Acceptable, Accomplished, Exemplary
Sources:
Stevens, Dannelle D. and Antonia Levi (2005). Introduction to Rubrics : An Assessment Tool to Save
Grading Time, Convey Effective Feedback, and Promote Student Learning. Sterling, VA: Stylus Publishing.
Stanny, Claudia J. and Linda B. Nilson. (2014). Specifications Grading: Restoring Rigor, Motivating
Students, and Saving Faculty Time. Sterling, VA: Stylus Publishing.
https://www.brown.edu/sheridan/teaching-learning-resources/teaching-resources/course-design/classroom-
assessment/grading-criteria/rubrics-scales
https://www.thoughtco.com/how-to-create-a-rubric-4061367
#5 My Own Holistic and Analytic Rubric
Answer the question below, then take a picture of your output and send it via
messenger or email: (zirkhonz@gmail.com)
I. Think of a specific task to be assessed (ex. baking a cake) or a specific output to be rated (ex. cake).
II. Then create a holistic rubric with the following criteria…
Checklists, rating scales and rubrics are tools that state specific criteria and allow teachers and students to
gather information and to make judgments about what students know and can do in relation to the
outcomes. They offer systematic ways of collecting data about specific behaviours, knowledge and skills.
The quality of information acquired through the use of checklists, rating scales and rubrics is highly
dependent on the quality of the descriptors chosen for assessment. Their benefit is also dependent on
students’ direct involvement in the assessment and understanding of the feedback provided.
The purpose of checklists, rating scales and rubrics is to:
provide tools for systematic recording of observations
provide tools for self-assessment
provide samples of criteria for students prior to collecting and evaluating data on their work
record the development of specific skills, strategies, attitudes and behaviours necessary for
demonstrating learning
clarify students' instructional needs by presenting a record of current accomplishments.
TYPES OF SCALES
Most frequently used Scales
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale
Self Rating Scales
1. Graphic Rating Scale
2. Itemized Rating Scales
a. Likert Scale
b. Semantic Differential Scale
c. Stapel’s Scale
d. Multi Dimensional Scaling
e. Thurstone Scales
f. Guttman Scales/Scalogram Analysis
g. The Q Sort technique
Four types of scales are generally used for Marketing Research.
1. NOMINAL SCALE
This is a very simple scale. It consists of assigning facts/choices to various alternative
categories which are usually exhaustive as well as mutually exclusive. The numbers are purely
labels, and nominal scales are the least restrictive of all the scales. Instances of a nominal scale are credit card
numbers, bank account numbers, employee ID numbers, etc. It is simple and widely used when the
relationship between two variables is to be studied. In a nominal scale, numbers are no more than
labels and are used specifically to identify different categories of responses. The following example
illustrates:
What is your gender?
[ ] Male
[ ] Female
Another example is - a survey of retail stores done on two dimensions - way of maintaining stocks
and daily turnover.
How do you stock items at present?
[ ] By product category
[ ] At a centralized store
[ ] Department wise
[ ] Single warehouse
What is your daily consumer turnover?
[ ] Between 100 – 200
[ ] Between 200 – 300
[ ] Above 300
A two-way classification can be made as follows (stocking method across the top, daily turnover down the side):

Daily Turnover   | Product Category | Department Wise | Centralized Store | Single Warehouse
100 – 200        |                  |                 |                   |
200 – 300        |                  |                 |                   |
Above 300        |                  |                 |                   |
The mode is the measure of central tendency typically used for nominal response categories.
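A quick sketch of finding the mode of nominal responses; the response data below are hypothetical answers to the stocking-method question:

```python
from collections import Counter

# Hypothetical nominal responses to the stocking-method question.
responses = [
    "By product category", "Department wise", "By product category",
    "Single warehouse", "By product category", "At a centralized store",
]

# The mode (most frequent category) is the only meaningful "average"
# for nominal data, since the labels carry no order or distance.
mode, count = Counter(responses).most_common(1)[0]
print(mode, count)
```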
2. ORDINAL SCALE
Ordinal scales are the simplest attitude-measuring scales used in Marketing Research. An ordinal scale is more
powerful than a nominal scale in that the numbers possess the property of rank order. The ranking
of certain product attributes/benefits as deemed important by the respondents is obtained through
the scale.
Example 1: Rank the following attributes (1 - 5), on their importance in a microwave oven.
1. Company Name
2. Functions
3. Price
4. Comfort
5. Design
The most important attribute is ranked 1 by the respondents and the least important is ranked 5.
Instead of numbers, letters or symbols can also be used to rate on an ordinal scale. Such a scale
makes no attempt to measure the degree of favorability of different rankings.
Example 2 - If there are 4 different types of fertilizers and they are ordered on the basis of quality
as Grade A, Grade B, Grade C, Grade D, this is again an Ordinal Scale.
Example 3 - If there are 5 different brands of Talcum Powder and a respondent ranks them
based on, say, “Freshness”, with Rank 1 having maximum freshness, Rank 2 the second maximum
freshness, and so on, an Ordinal Scale results.
The median and mode are meaningful for an ordinal scale.
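A small sketch showing which summary statistics are meaningful for ordinal ranks; the ranks below are hypothetical:

```python
import statistics

# Hypothetical importance ranks given to "Price" by ten respondents
# (1 = most important attribute, 5 = least important).
price_ranks = [1, 2, 2, 3, 1, 2, 4, 2, 3, 2]

print(statistics.median(price_ranks))   # 2.0 -- meaningful for ordinal data
print(statistics.mode(price_ranks))     # 2   -- also meaningful
# The arithmetic mean is NOT meaningful here: the "distance" between
# rank 1 and rank 2 need not equal the distance between rank 4 and rank 5.
```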
3. Interval Scale
In an interval scale, unlike nominal and ordinal scales, the distances between the various categories
are equal. Interval scales are also termed rating scales. An interval scale has an arbitrary zero
point, with further numbers placed at equal intervals. A very good example of an interval scale is a
thermometer.
Illustration 1 - How do you rate your present refrigerator for the following qualities?

Company Name           Less Known          1 2 3 4 5   Well Known
Functions              Few                 1 2 3 4 5   Many
Price                  Low                 1 2 3 4 5   High
Design                 Poor                1 2 3 4 5   Good
Overall Satisfaction   Very Dissatisfied   1 2 3 4 5   Very Satisfied
Such a scale permits the researcher to say that position 5 on the scale is above position 4, and also
that the distance from 5 to 4 is the same as the distance from 4 to 3. Such a scale, however, does not
permit the conclusion that position 4 is twice as strong as position 2, because no zero position has
been established. The data obtained from an interval scale can be used to calculate the mean score of
each attribute over all respondents. The standard deviation (a measure of dispersion) can also
be calculated.
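A sketch of the mean and standard deviation computations that equal intervals permit, using hypothetical 1-5 ratings:

```python
import statistics

# Hypothetical 1-5 interval ratings of "Design" from eight respondents.
design_ratings = [4, 5, 3, 4, 4, 2, 5, 4]

mean = statistics.mean(design_ratings)   # valid because intervals are equal
sd = statistics.stdev(design_ratings)    # sample standard deviation
print(mean, round(sd, 2))
# Note: a rating of 4 is still NOT "twice as good" as a rating of 2,
# because the zero point of the scale is arbitrary.
```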
4. Ratio Scale
Ratio scales are not widely used in Marketing Research unless a base item is made available for
comparison. In the above example of an interval scale, a score of 4 on one quality does not
necessarily mean that the respondent is twice as satisfied as the respondent who marks 2 on
the scale. A ratio scale has a natural zero point, and further numbers are placed at equal
intervals. Examples are scales for measuring physical quantities like length, weight, etc.
Ratio scales are very common in physical scenarios. Quantified responses forming a ratio scale
are analytically the most versatile. Ratio scales possess all the characteristics of an interval scale, and
the ratios of the numbers on these scales have meaningful interpretations. Data on certain
demographic or descriptive attributes, if they are obtained through open-ended questions, will have
ratio-scale properties. Consider the following questions:
Q 1) What is your annual income before taxes? ______ $
Q 2) How far is the theater from your home? ______ miles
Answers to these questions have a natural, unambiguous starting point, namely zero. Since the
starting point is not chosen arbitrarily, computing and interpreting ratios makes sense. For example,
we can say that a respondent with an annual income of $40,000 earns twice as much as one with
an annual income of $20,000.
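A small sketch of why ratios are meaningful on a ratio scale but not on an interval scale, reusing the income and thermometer examples above:

```python
# Ratios are meaningful only on a scale with a natural zero point.
income_a, income_b = 40_000, 20_000   # dollars: a ratio scale
print(income_a / income_b)            # 2.0 -- "twice as much" is a valid claim

# Celsius is an interval scale: its zero point is arbitrary, so the
# apparent 2:1 ratio of 40 degrees to 20 degrees is an artifact.
temp_a, temp_b = 40, 20
kelvin_ratio = (temp_a + 273.15) / (temp_b + 273.15)
print(round(kelvin_ratio, 2))         # ~1.07 once a natural zero is used
```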
1. GRAPHIC RATING SCALE

BRAND 1
0                1                5                        7
(poor quality)   (bad quality)    (neither good nor bad)   (good quality)

This is also known as a continuous rating scale. The customer can occupy any position. Here one
attribute is taken, e.g. the quality of any brand of ice cream.

BRAND 2
poor ---------------------------------------- good

This line can be vertical or horizontal, and scale points may be provided. No other indication appears
on the continuous scale; only a range is provided. To quantify the responses to the question “Indicate
your overall opinion about ice cream Brand 2 by placing a tick mark at the appropriate position on the
line”, we measure the physical distance between the left extreme position and the response
position on the line; the greater the distance, the more favourable is the response or attitude
towards the brand.
Its limitation is that coding and analysis will require a substantial amount of time, since we first have
to measure the physical distances on the scale for each respondent.
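The distance-to-score conversion described above can be sketched as follows; the 100 mm line length and the 0-100 scoring range are assumptions, not fixed conventions:

```python
def graphic_score(distance_mm, line_length_mm=100):
    """Convert a tick's measured distance from the 'poor' (left) end of a
    continuous rating line into a 0-100 score. The 100 mm line length is
    an assumption; any length works as long as it is measured consistently."""
    if not 0 <= distance_mm <= line_length_mm:
        raise ValueError("tick mark must lie on the line")
    return 100 * distance_mm / line_length_mm

# A respondent's tick measured 73 mm from the left end of a 100 mm line:
print(graphic_score(73))   # 73.0 -- a fairly favourable attitude
```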
a. LIKERT SCALE
It was developed by Rensis Likert. Here the respondents are asked to indicate their degree of
agreement or disagreement with each of a series of statements. Each scale item has 5
response categories ranging from strongly agree to strongly disagree.
5 4 3 2 1
Strongly agree Agree Indifferent Disagree Strongly disagree
Each statement is assigned a numerical score ranging from 1 to 5. It can also be scaled as
-2 to +2.
-2 -1 0 1 2
For example, if the statement “The quality of Mother Dairy ice cream is not good” is a negative
statement, then strongly agreeing with it means the quality is not good.
Each degree of agreement is given a numerical score, and the respondent’s total score is
computed by summing these scores. This total score reveals the respondent’s overall
opinion.
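The summing procedure can be sketched as follows; the statements and answers are hypothetical, and the reverse-scoring of the negatively worded item follows the negative-statement example above:

```python
# One respondent's answers on a hypothetical 5-point Likert scale
# (5 = strongly agree ... 1 = strongly disagree). Negatively worded
# statements must be reverse-scored before summing.
statements = [
    ("The ice cream tastes fresh.",   False),  # False = positively worded
    ("The quality is not good.",      True),   # True  = negatively worded
    ("I would buy this brand again.", False),
]
raw_answers = [5, 4, 4]

def likert_total(answers, statements, points=5):
    total = 0
    for raw, (_, negative) in zip(answers, statements):
        # Reverse-score negative items: 5 becomes 1, 4 becomes 2, etc.
        total += (points + 1 - raw) if negative else raw
    return total

print(likert_total(raw_answers, statements))   # 5 + (6 - 4) + 4 = 11
```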
The Likert scale is of ordinal type: it enables one to rank attitudes, but not to measure the
differences between attitudes. Likert scales take about the same amount of effort to create as a
Thurstone scale and are considered more discriminating and reliable because of the larger
range of responses typically given on a Likert scale.
A typical Likert scale has 20 - 30 statements. While designing a good Likert scale, first a
large pool of statements relevant to the measurement of the attitude has to be generated, and
then the vague and non-discriminating statements have to be eliminated from the pool.
Thus, a Likert scale is a five-point scale ranging from “strongly agree” to “strongly
disagree”. No judging gap is involved in this method.
b. SEMANTIC DIFFERENTIAL SCALE

Here the respondent rates an object on a set of bipolar adjective scales, typically with 7
points, for example:

Unpleasant   1   2   3   4   5   6   7   Pleasant
Submissive   1   2   3   4   5   6   7   Dominant

The ratings on the individual scales can be connected to plot an object’s profile. The mean
and median are used for comparison. This scale helps to determine overall
similarities and differences among objects.
When the Semantic Differential Scale is used to develop an image profile, it provides a good
basis for comparing the images of two or more items. The big advantage of this scale is its
simplicity, while producing results comparable with those of more complex scaling
methods. The method is easy and fast to administer, and it is also sensitive to small
differences in attitude, highly versatile, reliable and generally valid.
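To sketch how an image profile is built from semantic differential data, with all brand names and ratings hypothetical:

```python
import statistics

# Hypothetical 7-point semantic differential ratings for two brands
# (1 = Unpleasant/Submissive ... 7 = Pleasant/Dominant), four respondents each.
ratings = {
    "Brand 1": {"pleasant": [6, 5, 7, 6], "dominant": [4, 5, 4, 4]},
    "Brand 2": {"pleasant": [3, 4, 2, 3], "dominant": [6, 6, 7, 5]},
}

# The mean on each bipolar scale gives one point of the brand's image
# profile; connecting the points across scales draws the profile line.
profiles = {brand: {scale: statistics.mean(vals) for scale, vals in scales.items()}
            for brand, scales in ratings.items()}
print(profiles)
```

Plotting the two profiles side by side makes the brands' image differences (here, pleasantness versus dominance) visible at a glance.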
c. STAPEL’S SCALE
It was developed by Jan Stapel. This scale has some distinctive features:
i. Each item has only one word/phrase indicating the dimension it represents.
ii. Each item has ten response categories.
iii. Each item has an even number of categories.
iv. The response categories have numerical labels but no verbal labels.
For example, suppose that for the quality of ice cream we ask respondents to rate from +5
to -5: select a plus number for words which describe the ice cream accurately, and select a
minus number for words you think do not describe the ice cream quality accurately. Thus,
we can select any number from +5, for words we think are very accurate, to -5, for words
we think are very inaccurate. This scale is usually presented vertically.
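A short sketch of how Stapel-scale responses might be summarized; the item and responses are hypothetical:

```python
import statistics

# Hypothetical Stapel-scale responses for the item "HIGH QUALITY":
# each respondent picks an integer from +5 (describes the ice cream
# accurately) to -5 (does not describe it accurately); there is no zero
# category and no verbal labels.
responses = [4, 2, -1, 3, 5, -2, 1, 3]

# Stapel data are commonly analysed much like semantic differential data;
# a positive mean leans toward "describes accurately".
print(statistics.mean(responses))
```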
e. THURSTONE SCALES
These are also known as equal-appearing interval scales. They are used to measure the
attitude towards a given concept or construct. For this purpose a large number of
statements are collected that relate to the concept or construct being measured. The judges rate these statements along an 11-
category scale in which each category expresses a different degree of favourableness
towards the concept. The items are then ranked according to the mean or median ratings
assigned by the judges and are used to construct a questionnaire of twenty to thirty items
that are chosen more or less evenly across the range of ratings.
The statements are worded in such a way that a person can agree or disagree with
them. The scale is then administered to a sample of respondents, whose scores are
determined by computing the mean or median value of the items they agreed with. A person
who disagrees with all the items has a score of zero. The advantage of this scale is that
it is an interval measurement scale, but the method is time-consuming and labour-
intensive. Thurstone scales are commonly used in psychology and education research.
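The scoring rule just described (a respondent's score is the median scale value of the items agreed with, zero if none) can be sketched as follows. The statement labels and judges' scale values are hypothetical.

```python
# Sketch of Thurstone (equal-appearing interval) scoring.
# Statement labels and judges' median ratings (on the 1-11 scale) are hypothetical.
from statistics import median

scale_values = {   # statement -> median rating assigned by the judges
    "stmt_1": 2.5,
    "stmt_2": 5.0,
    "stmt_3": 8.5,
    "stmt_4": 10.0,
}

def thurstone_score(agreed_with):
    """Median scale value of the items the respondent agreed with;
    disagreeing with every item scores zero."""
    if not agreed_with:
        return 0
    return median(scale_values[s] for s in agreed_with)

print(thurstone_score(["stmt_2", "stmt_3"]))  # 6.75
print(thurstone_score([]))                    # 0
```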
Researchers have developed a variety of attitude rating scales to measure the intensity of an attitude's
affective, cognitive, and behavioral components. These scales may require a respondent to rank, rate, sort,
and choose when we assess an attitude.
Scaling refers to the process of assigning numbers or symbols to measure the intensity of abstract
attitudes. Scales can be uni-dimensional or multi-dimensional. Uni-dimensional scales measure a single
aspect or dimension of an attitude. Multi-dimensional scales measure more than one
dimension of an attitude.
Ranking: Ranking is a measurement that asks respondents to rank a small number of items on some
characteristic. Respondents might be asked to rank their favourite hot breakfast beverage: Hot Chocolate,
Tea, Coffee, or Herbal Tea. Ranking delivers an ordinal score.
Rating: Rating asks respondents the extent to which an item of interest possesses a characteristic. Scales
that require respondents to rate an item deliver a quantitative score.
Sorting: Sorting is a measurement task that asks respondents to sort several items into categories.
Choice: Choice is a measurement task that requires respondents to select among two or more alternatives.
Category Scales: Category scales are the simplest type of rating scale. They contain only two choices:
yes/no or agree/disagree.
Example:
I approve of the Affordable Care Act or Obama Care.
We can expand the number of response categories to give respondents greater flexibility in rating the item
of interest.
Example:
How often do you think positively about the Affordable Care Act or Obama Care?
Category scales can deal with a wide variety of issues: Quality, Importance, Interest, Satisfaction,
Frequency, Truth, and Uniqueness.
Graphic Rating Scales: Graphic ratings scales include a graphic continuum anchored between two
extremes. When used for online surveys, graphic rating scales may have a "slider," which respondents can
move up or down the scale. Sliders allow respondents to make finely tuned responses using a continuous
scale.
Source: http://www.iperceptions.com/en/blog/2013/august/3-easy-steps-to-build-a-great-survey
Graphic rating scales are easy to create. Researchers must be careful about using overly extreme anchors,
which tend to push responses toward the center of the scale. Graphic rating scales are frequently used
when conducting research among children. Graphic rating scales are considered non-comparative scales
because respondents make their judgments without making comparisons to other objects, concepts,
people, or brands.
Itemized Rating Scales: Itemized rating scales require respondents to select from a limited number of
ordered alternatives. These scales are easy to construct, but they do not allow the respondent to make the
fine distinctions of a graphic rating scale using a slider.
Example:
How likely are you to use an open-source textbook in the courses you teach?
Graphic rating scales and itemized rating scales ask respondents about a single concept in isolation. Such
scales are often called monadic rating scales.
Rank-Order Scales: Unlike graphic rating scales and itemized rating scales, rank-order scales are
comparative scales. Respondents rank the objects, concepts, people, or brands by comparing them to
similar alternatives.
Example:
Rank the following smart phones with one being the brand that best meets the characteristic and six
being the brand that is the worst on the characteristic.
Rank-order scales have the following disadvantages: First, if a relevant alternative is missing, the
respondent's answer could be misleading. In the question above, the Blackberry 10 is not listed. If that is
the respondent's choice, the answer to this question might not reflect his or her real attitude. Second, the
answers provided are on an ordinal scale; we do not know the "distance" between the ranks. Third, the
question does not offer information as to why the respondent chose the order he or she selected.
Paired Comparisons: A paired comparison is a measurement scale that asks respondents to select one of
two alternatives.
Example:
Listed below are some of the characteristics of a McDonald's Big Mac and a Burger King Whopper.
Select the answer that best matches your opinion.
Which of the two brands tastes better?
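Paired-comparison choices like the one above are often aggregated into a rank order by counting how often each alternative "wins." The brands and tallies below are hypothetical, used only to illustrate the tallying step.

```python
# Sketch: turning paired-comparison choices into a rank order by win counts.
# Brands and individual choices are hypothetical.
from collections import Counter

# Each tuple records one respondent's choice in one pairing: (winner, loser).
choices = [
    ("Big Mac", "Whopper"), ("Whopper", "Big Mac"),
    ("Big Mac", "Whopper"), ("Big Mac", "Whopper"),
]

wins = Counter(winner for winner, _ in choices)
ranking = [brand for brand, _ in wins.most_common()]
print(ranking)  # ['Big Mac', 'Whopper']
```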
Constant Sum Scales: Constant sum scales require respondents to divide a set number of points, usually
100, to rate two or more attributes. The problem with constant sum scales is that respondents find it difficult
to allocate points especially if there are a lot of attributes to be measured.
Example:
Below are five attributes of the iPhone 6 Plus. Please allocate 100 points to these attributes so that they
reflect the importance of each attribute. Please make certain that the total number of points adds up to
100.
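A small sketch of how constant sum responses might be validated and summarized: responses whose points do not total 100 are discarded, then mean importance is computed per attribute. The attribute names and point allocations are hypothetical.

```python
# Sketch: validating and summarizing constant sum responses (points must total 100).
# Attribute names and allocations are hypothetical.
attributes = ["camera", "battery", "screen", "price", "storage"]

responses = [
    {"camera": 30, "battery": 25, "screen": 20, "price": 15, "storage": 10},
    {"camera": 40, "battery": 20, "screen": 15, "price": 15, "storage": 10},
]

# Discard any response whose points do not add up to exactly 100.
valid = [r for r in responses if sum(r.values()) == 100]

# Mean importance per attribute across the valid responses.
mean_importance = {a: sum(r[a] for r in valid) / len(valid) for a in attributes}
print(mean_importance)
```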
Semantic Differential Scales: Semantic differential scales measure respondents' attitudes about the
strengths and weaknesses of a concept or construct. With this scale, researchers select a pair of
dichotomous adjectives to describe the concept under investigation. Typically researchers use a scale from
1 through 7. The mean of each pair is calculated and then plotted to produce an image profile.
Example:
Below is a list of characteristics of Kmart stores. For each pair of adjectives, place an "X" at the point
that you believe best reflects your experience at Kmart.
Stapel Scale: The Stapel Scale is a uni-polar scale that requires respondents to rate a concept on a scale
of negative 5 to positive 5 according to how closely an adjective at the center of the scale represents the
concept. The chief advantage of the Stapel Scale is that the researcher does not have to spend time and
energy creating bipolar adjective pairs.
Example:
Select the appropriate plus number for the phrase that best represents attributes of the iPhone 6. If the
phrase does not represent the iPhone 6, select the appropriate negative number that reflects your
attitude.
Likert Scale: The Likert scale allows respondents to state how strongly they agree or disagree with an
attitude. The scale is named after Rensis Likert, who developed this scale in 1932 for his doctoral
dissertation. Likert is pronounced "Lick-ert," not "Like-urt."
The Likert scale is typically a five-point scale that ranges from "strongly disagree" through neutral to
"strongly agree," but six-point and seven-point variants are not uncommon. A six-point Likert scale has
three levels of disagreement and three levels of agreement with no neutral point; the seven-point Likert
scale adds a neutral point.
Example:
McDonald's Happy Meals are good value for the money.
Researchers disagree on whether the Likert Scale is an ordinal or interval scale. Those who argue that it is
an ordinal scale say the intervals between the five-points of the scale are unknowable. Those who argue
that it is an interval scale score "Strongly Disagree" as a 1, "Disagree" as a 2, "Neutral" as a 3, "Agree" as a
4, and "Strongly Agree" as a 5.
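Under the interval-scale view just described, scoring a Likert item reduces to mapping labels to the numbers 1 through 5 and averaging. The responses below are hypothetical.

```python
# Sketch: scoring a five-point Likert item as interval data (SD=1 ... SA=5).
# The mapping follows the scoring described above; the answers are hypothetical.
LIKERT = {"Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
          "Agree": 4, "Strongly Agree": 5}

answers = ["Agree", "Strongly Agree", "Neutral", "Agree", "Disagree"]

scores = [LIKERT[a] for a in answers]
mean_score = sum(scores) / len(scores)
print(mean_score)  # 3.6
```

Treating the scores this way is only defensible if one accepts the interval-scale interpretation; under the ordinal view, the median or a frequency table is the safer summary.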
Closely related to the Likert Scale is a Purchase Intent scale. The disagreement and agreement statements
are replaced with answers that reflect a respondent's intent to purchase a product.
Example:
After owning a Chevrolet Impala for three years, how likely are you to purchase a new Chevrolet
Impala?
A five-point purchase intent scale is widely used in new product development and advertising testing.
Another consideration is how long the researcher has to develop the scales: rank-order scales can be
developed quickly, while developing a semantic differential scale can take a long time.
Forced versus Non-Forced Choice: Sometimes researchers will add a "do not know" category to the range
of possible answers, typically when they are concerned that respondents with limited knowledge will tend
to pick a "neutral" option if one is available. Other researchers avoid offering a "do not know" answer out of
fear that lazy respondents will check it without much reflection.
The argument for "forcing" respondents to answer a question is that it makes them think about their feelings
and attitudes. The argument against "forcing" an answer is that respondents will give a "false" answer, or
they may refuse to answer the question.
Examples
The scale is designed to measure one factor or subject. For example, the following shows a questionnaire
for a person’s attitudes towards depression:
Sometimes, sensitive topics are concealed within other questions to disguise the intent of the
questionnaire. For example, this one quizzes for possible gaming addiction:
One disadvantage of the Guttman scale is that respondents may feel overly committed to their answers;
they may continue to answer YES beyond the point where they should have stopped. Using the concealed
questionnaire helps to avoid this issue.
Use in Education
In the social sciences, the Guttman scale is often used to measure an increasing amount of “attitude”
towards a single topic. In education, it’s sometimes used to show a student’s logical progression through
coursework. For example, consider the expected progression through math topics for three children: it is
expected that a child does well in fractions before they are able to grasp algebra. A "0" means that the
student hasn't mastered a topic, while a "1" means that a student has mastered it:
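The 0/1 mastery patterns described above can be checked programmatically: a pattern fits a Guttman scale when every 1 precedes every 0 (mastering a harder topic implies mastering the easier ones). The topics and student data below are hypothetical.

```python
# Sketch: checking whether 0/1 mastery patterns fit a Guttman scale.
# Topics are ordered easiest to hardest; the student data are hypothetical.
topics = ["counting", "fractions", "algebra"]

students = {
    "child_1": [1, 1, 1],   # mastered everything
    "child_2": [1, 1, 0],   # mastered all but algebra
    "child_3": [1, 0, 1],   # algebra without fractions -- violates the scale
}

def fits_guttman(pattern):
    """A perfect Guttman pattern is a run of 1s followed by a run of 0s."""
    return all(not (a == 0 and b == 1) for a, b in zip(pattern, pattern[1:]))

for name, pattern in students.items():
    print(name, fits_guttman(pattern))
```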
In practice, it’s rare to find data that fits perfectly to a Guttman scale. More often than not, you’re actually
testing more than one factor. For example, algebra may need good reading skills as well as the ability to
solve an equation. If you know (or suspect) that your instrument is measuring two or more factors,
use multidimensional scaling or Correspondence Analysis to analyze your results.
#6 My Own Assessment using other Assessment tools
Answer the question below, then take a picture of your output and send it
via messenger or email: (zirkhonz@gmail.com)
Cite a task, then create assessment tools using a checklist, anecdotal records, a Likert scale, or a semantic
differential scale to evaluate the output.
Checklists
Anecdotal records
Likert scale
Semantic differential scales: (with positive and negative side)
Example Task: Creating a Poster about Education
My Sample Checklist
Criterion | Yes | No
Poster includes all required information, such as names and institutional affiliation | |
All components fit into the space provided | |
Title lettering | |
Anecdotal Records
Your comments/suggestions regarding the poster