Measurement: Scaling, Reliability, Validity
CHAPTER 7
Chapter Objectives
Know the characteristics and power of the
four types of scales: nominal, ordinal,
interval, and ratio.
Know how and when to use the different
forms of rating scales and ranking scales.
Explain stability and consistency and how
they are established.
Discuss what “goodness” of measures
means, and why it is necessary to establish it
in research.
Scale
Is a tool or mechanism by which
individuals are distinguished as to how
they differ from one another on the
variables of interest to our study.
Scales
There are four basic types of scales:
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale
The degree of sophistication to
which the scales are fine-tuned
increases progressively as we
move from the nominal to the ratio
scale.
The information on the variables can be
obtained in greater detail when we
employ an interval or a ratio scale than
the other two scales.
With more powerful scales,
increasingly sophisticated data
analyses can be performed, which in
turn, means that more meaningful
answers can be found to our research
questions.
Nominal Scale
A nominal scale is one that allows the researcher to assign
subjects to certain categories or groups.
For example, for the variable of gender,
respondents can be grouped into two
categories: male and female.
Notice that there is no third
category into which respondents
would normally fall.
The information that can be
generated from nominal scaling is
the percentage (or frequency)
of males and females in our sample of
respondents.
Example 1
Nominally scale the nationality of
individuals in a group of tourists to a country
during a certain year.
We could nominally scale this variable in the
following mutually exclusive and
collectively exhaustive categories:
American Japanese
Russian Malaysian
Chinese German
Arabian Other
Note that every respondent has to fit
into one of the above categories and
that the scale will allow computation of
the numbers and percentages of
respondents that fit into them.
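The counting described above can be sketched in a few lines of Python. This is only an illustration: the sample responses are hypothetical, and the function name `nominal_summary` is an assumption, not part of the lecture. Any nationality that is not listed is coded as "Other", which keeps the categories collectively exhaustive.

```python
from collections import Counter

# Category labels follow Example 1.
CATEGORIES = ["American", "Japanese", "Russian", "Malaysian",
              "Chinese", "German", "Arabian", "Other"]

def nominal_summary(responses):
    """Return {category: (count, percentage)} for nominally scaled data."""
    if not responses:
        raise ValueError("at least one response is required")
    # Collectively exhaustive: anything not listed falls into "Other".
    coded = [r if r in CATEGORIES else "Other" for r in responses]
    counts = Counter(coded)
    total = len(coded)
    return {c: (counts[c], 100.0 * counts[c] / total) for c in CATEGORIES}

# Hypothetical sample of five tourists.
sample = ["American", "Japanese", "American", "German", "Brazilian"]
summary = nominal_summary(sample)
print(summary["American"])  # (2, 40.0)
print(summary["Other"])     # (1, 20.0)
```

Counts and percentages are the only summaries computed here, because the category labels carry no order or distance.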
Ordinal Scale
Ordinal scale: not only categorizes variables in
such a way as to denote differences among various
categories, it also rank-orders categories in some
meaningful way.
Example 2
Rank the following five
characteristics in a job in terms of
how important they are for you.
You should rank the most important
item as 1, the next in importance as 2,
and so on, until you have ranked each
of them 1, 2, 3, 4, or 5.
Job Characteristic                          Ranking
The opportunity provided by the job to:
1. Interact with others                     _____
2. Use different skills                     _____
3. Complete a task to the end               _____
4. Serve others                             _____
5. Work independently                       _____
This scale helps the researcher to
determine the percentage of
respondents who consider interaction
with others as most important, those
who consider using a number of skills
as most important, and so on. Such
knowledge might help in designing jobs
that would be seen as most enriched by
the majority of the employees.
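As a rough sketch of that tabulation, the Python below computes the percentage of respondents who rank each job characteristic first. The four respondents and the function name `share_ranked_first` are hypothetical, introduced only for illustration.

```python
from collections import Counter

# Item labels follow Example 2.
ITEMS = ["Interact with others", "Use different skills",
         "Complete a task to the end", "Serve others", "Work independently"]

def share_ranked_first(rankings):
    """rankings: list of dicts mapping item -> rank (1 = most important)."""
    firsts = Counter()
    for ranking in rankings:
        # A valid ranking uses each of 1..5 exactly once.
        if sorted(ranking.values()) != list(range(1, len(ITEMS) + 1)):
            raise ValueError("each respondent must rank every item exactly once")
        top_item = min(ranking, key=ranking.get)
        firsts[top_item] += 1
    n = len(rankings)
    return {item: 100.0 * firsts[item] / n for item in ITEMS}

# Hypothetical rankings from four respondents, given in ITEMS order.
raw = [[1, 2, 3, 4, 5], [1, 3, 2, 5, 4], [2, 1, 3, 4, 5], [3, 2, 1, 5, 4]]
respondents = [dict(zip(ITEMS, ranks)) for ranks in raw]
shares = share_ranked_first(respondents)
print(shares["Interact with others"])  # 50.0
print(shares["Use different skills"])  # 25.0
```

The result says how many respondents put an item first, but not how much more important that item is to them than the item ranked second.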
We can see that the ordinal scale
provides more information than the
nominal scale. Even though differences
in the ranking of objects, persons, or
events are clearly known, we do not
know their magnitude.
This deficiency is overcome by interval
scaling.
Interval Scale
Example 3a (Cont.)
Suppose that the employees circle the
numbers 3, 1, 2, 4, and 5 for the five items.
The magnitude of difference represented
by the space between points 1 and 2 on the
scale is the same as the magnitude of
difference represented by the space between
points 4 and 5, or between any other two
points. Any number can be added to or
subtracted from the numbers on the scale,
still retaining the magnitude of the difference.
If we add 6 to the five points on the
scale, the interval scale will have the
numbers 7, 8, ..., 11 (instead of 1 to
5).
The magnitude of the difference
between 7 and 8 is still the same as
the magnitude of the difference
between 9 and 10. The interval scale
therefore has an arbitrary origin.
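A small Python check may make the arbitrary origin concrete. It uses the circled numbers from Example 3a and the shift of 6 from the text; the helper name `pairwise_differences` is an assumption for illustration. Differences between scale points survive the shift, while ratios between them do not.

```python
responses = [3, 1, 2, 4, 5]            # the circled numbers from Example 3a
shifted = [x + 6 for x in responses]   # move the arbitrary origin by 6

def pairwise_differences(values):
    """Differences between consecutive responses."""
    return [b - a for a, b in zip(values, values[1:])]

# Differences are unchanged by the shift.
same_differences = pairwise_differences(responses) == pairwise_differences(shifted)
print(same_differences)  # True

# Ratios are not: 4 is twice 2, but 10 is not twice 8.
ratio_before = 4 / 2
ratio_after = (4 + 6) / (2 + 6)
print(ratio_before, ratio_after)  # 2.0 1.25
```

This is why statements such as "twice as much" are not meaningful on an interval scale.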
Example 3b
3. For the efforts I put into the organization, I get much in return
Ratio Scale
The ratio scale is the most powerful
of the four scales because it has a
unique zero origin (not an
arbitrary origin).
The differences between scales are
summarized in the next Figure.
Figure: Properties of the Four Scales
Developing Scales
The four types of scales that can be used
to measure the operationally defined
dimensions and elements of a variable are:
Nominal, Ordinal, Interval, and Ratio
scales.
It is necessary to examine the methods of
scaling (assigning numbers or symbols) to
elicit the attitudinal responses of subjects
toward objects, events, or persons.
Categories of attitudinal scales:
(not to be confused with the four
different types of scales)
The Rating Scales
The Ranking Scales
Rating scales have several response
categories and are used to elicit
responses with regard to the object,
event, or person studied.
Ranking scales make comparisons
between or among objects, events, or
persons and elicit the preferred choices
and ranking among them.
Rating Scales
The following rating scales are often
used in organizational research:
1. Dichotomous scale
2. Category scale
3. Likert scale
4. Numerical scale
5. Semantic differential scale
6. Itemized rating scale
7. Fixed or constant sum rating scale
8. Stapel scale
9. Graphic rating scale
10. Consensus scale
Dichotomous Scale
Is used to elicit a Yes or No answer.
(Note that a nominal scale is used to
elicit the response)
Example 4
Do you own a car? Yes No
Category Scale
It uses multiple items to elicit a single
response.
Example 5
Where in Jordan do you reside?
Amman
Mafraq
Irbid
Zarqa
Other
Likert Scale
Is designed to examine how strongly
subjects agree or disagree with
statements on a 5-point scale, as
follows:
_________________________________________________________
Strongly                 Neither Agree               Strongly
Disagree    Disagree     Nor Disagree     Agree      Agree
   1           2              3             4          5
_________________________________________________________
This is an Interval scale and the
differences in responses between any
two points on the scale remain the
same.
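Because the scale is treated as interval, responses to several statements can be averaged. The sketch below assumes a hypothetical respondent who answers four statements; the function name `likert_score` and the choice to average (rather than sum) are illustrative assumptions.

```python
# Anchors of the 5-point scale shown above.
LIKERT_LABELS = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neither Agree Nor Disagree",
    4: "Agree",
    5: "Strongly Agree",
}

def likert_score(item_responses):
    """Average one respondent's answers across the statements."""
    if not item_responses:
        raise ValueError("at least one response is required")
    for response in item_responses:
        if response not in LIKERT_LABELS:
            raise ValueError(f"response {response!r} is not on the 1-5 scale")
    return sum(item_responses) / len(item_responses)

# Hypothetical respondent answering four statements.
score = likert_score([4, 5, 3, 4])
print(score)  # 4.0
```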
Semantic Differential Scale
We use this scale when several
attributes are identified at the
extremes of the scale. For instance,
the scale would employ such terms as:
Good – Bad
Strong – Weak
Hot – Cold
This scale is treated as an Interval
scale.
Example 6
What is your opinion of your supervisor?
Responsive--------------Unresponsive
Beautiful-----------------Ugly
Courageous-------------Timid
Numerical Scale
Is similar to the semantic differential scale,
with the difference that numbers on a 5-point
or 7-point scale are provided, as
illustrated in the following example:
How pleased are you with your new job?
Extremely pleased   5   4   3   2   1   Extremely displeased
Itemized Rating Scale
A 5-point or 7-point scale is provided for each item
and the respondent states the appropriate number on
the side of each item. This uses an Interval Scale.
Example 7(i)
Respond to each item using the scale below, and indicate your
response number on the line by each item.
1               2           3                   4         5
Very unlikely   Unlikely    Neither unlikely    Likely    Very likely
                            nor likely
--------------------------------------------------------------------------------
I will be changing my job in the near future. --------
Note that the above is a balanced
rating scale with a neutral point.
An unbalanced rating scale, which
does not have a neutral point, is
presented in the following example.
Example 7(ii)
Circle the number that is closest to how you
feel for the item below:
Not at all     Somewhat      Moderately     Very much
interested     interested    interested     interested
    1              2             3              4
--------------------------------------------------------------------------------
How would you rate your interest in changing
current organizational policies?                    1   2   3   4
Fixed or Constant Sum Scale
The respondents are asked to distribute a
given number of points across various items.
Example: In choosing a toilet soap, indicate the importance you
attach to each of the following five aspects by allotting points for
each to total 100 in all.
Fragrance -----
Color -----
Shape -----
Size -----
_________
Total points 100
This is more in the nature of an ordinal scale.
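A minimal Python sketch of how a constant sum response might be checked and read follows. The allocation is hypothetical and uses only the four aspects listed above; the function names are assumptions for illustration.

```python
def is_valid_allocation(allocation, total=100):
    """True if the points are non-negative and add up to the required total."""
    points = list(allocation.values())
    return all(p >= 0 for p in points) and sum(points) == total

def importance_order(allocation):
    """Aspects sorted from most to least important (an ordinal reading)."""
    return sorted(allocation, key=allocation.get, reverse=True)

# Hypothetical respondent distributing 100 points.
allocation = {"Fragrance": 40, "Color": 10, "Shape": 20, "Size": 30}
valid = is_valid_allocation(allocation)
order = importance_order(allocation)
print(valid)  # True
print(order)  # ['Fragrance', 'Size', 'Shape', 'Color']
```

Reading the points as a rank order, as `importance_order` does, matches the note that the scale is more ordinal in nature.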
Stapel Scale
This scale simultaneously measures
both the direction and intensity of
the attitude toward the items under
study. The characteristic of interest
to the study is placed at the center,
with a numerical scale ranging, say, from
+3 to -3 on either side of the item, as
illustrated in the following example:
Example 8: Stapel Scale
State how you would rate your supervisor’s abilities with respect
to each of the characteristics mentioned below, by circling the
appropriate number.
       +3                 +3                 +3
       +2                 +2                 +2
       +1                 +1                 +1
Adopting modern        Product         Interpersonal
  Technology          Innovation           Skills
       -1                 -1                 -1
       -2                 -2                 -2
       -3                 -3                 -3
Graphic Rating Scale
A graphical representation helps the
respondents to indicate on this scale
their answers to a particular question by
placing a mark at the appropriate point
on the line, as in the following example:
Example 9
On a scale of 1 to 10, how would you
rate your supervisor?
Ranking Scales
Are used to tap preferences between
two or among more objects or items
(ordinal in nature). However, such
ranking may not give definitive
clues to some of the answers sought.
Example 10
There are 4 product lines, and the manager seeks
information that would help decide which product line
should get the most attention.
Assume:
35% of respondents choose the 1st product.
Forced Choice
The forced choice enables respondents
to rank objects relative to one another,
among the alternatives provided. This is
easier for the respondents, particularly
if the number of choices to be ranked is
limited.
Example 11
Rank the following newspapers that you
would like to subscribe to in the order of
preference, assigning 1 for the most preferred
choice and 5 for the least preferred.
-------• الدستور
---------• الرأي
----• أخبار اليوم
-----------• الغد
--------• شيحان
Goodness of Measures
We need to assess the goodness of
the measures developed. That is,
we need to be reasonably sure that the
instruments we use in our research do
indeed measure the variables they
are supposed to, and that they
measure them accurately.
How can we ensure that the measures
developed are reasonably good?
First an item analysis of the
responses to the questions tapping the
variable is done.
Then the reliability and validity of
the measures are established.
Item Analysis
Item analysis is done to see if the items in
the instrument belong there or not. Each item
is examined for its ability to discriminate
between those subjects whose total scores
are high, and those with low scores.
In item analysis, the means between the
high-score group and the low-score group
are tested to detect significant differences
through the t-values.
The items with a high t-value are then
included in the instrument. Thereafter,
tests for the reliability of the
instrument are done and the validity of
the measure is established.
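The item analysis described above can be sketched as follows. Several choices here are assumptions, not taken from the lecture: the data are hypothetical, the high and low groups are the top and bottom halves by total score, and the t-value is computed with Welch's formula (unequal variances).

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """t-value for the difference between two group means (Welch's formula)."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

def item_t_values(data, fraction=0.5):
    """data: one list of item scores per respondent.

    Splits respondents into low and high groups by total score and
    returns one t-value per item (high group minus low group).
    """
    ranked = sorted(data, key=sum)
    k = max(2, int(len(ranked) * fraction))
    low, high = ranked[:k], ranked[-k:]
    n_items = len(data[0])
    return [welch_t([r[i] for r in high], [r[i] for r in low])
            for i in range(n_items)]

# Hypothetical scores of eight respondents on three items.
scores = [
    [5, 4, 3], [4, 5, 2], [5, 5, 4], [4, 4, 3],
    [2, 1, 3], [1, 2, 4], [2, 2, 2], [1, 1, 3],
]
t_values = item_t_values(scores)
print([round(t, 2) for t in t_values])  # [7.35, 7.35, 0.0]
```

In this made-up data the first two items separate the high and low scorers clearly, while the third item has a t-value of zero and would be a candidate for removal.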
Reliability
The reliability of a measure indicates the extent
to which it is without bias and hence
ensures consistent measurement
across time (stability) and across the
various items in the instrument (internal
consistency).
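Two common ways to put numbers on these ideas are sketched below. Both are assumptions introduced for illustration, since the text above names neither: Cronbach's alpha as an index of internal consistency, and the correlation between total scores at two points in time as an index of stability. All data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cronbach_alpha(data):
    """data: one list of item scores per respondent."""
    k = len(data[0])
    item_variances = [variance([row[i] for row in data]) for i in range(k)]
    total_variance = variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Internal consistency: five respondents, three items.
items = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [1, 2, 1], [3, 3, 4]]
alpha = cronbach_alpha(items)
print(round(alpha, 2))  # 0.96

# Stability: total scores of the same respondents at two points in time.
time_1 = [sum(row) for row in items]
time_2 = [12, 8, 14, 5, 10]
retest_r = pearson_r(time_1, time_2)
print(retest_r > 0.9)  # True
```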
Stability
Example 12 (Cont.)
Those with high work ethic values would
not want to be on welfare and would ask for
employment. Those who are low on work
ethic values might exploit the opportunity to
survive on welfare for as long as possible.
If both types of individuals have the
same score on the work ethic scale, then
the test would not be a measure of work
ethic, but of something else.
Construct Validity
Construct Validity testifies to how well the results
obtained from the use of the measure fit the theories
around which the test is designed. This is assessed
through convergent and discriminant validity.
Convergent validity is established when the
scores obtained with two different instruments
measuring the same concept are highly correlated.
Discriminant validity is established when, based
on theory, two variables are predicted to be
uncorrelated, and the scores obtained by measuring
them are indeed empirically found to be so.
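Both checks come down to inspecting correlations, as the following Python sketch suggests. The scores are hypothetical, and the variable names are assumptions: `instrument_a` and `instrument_b` stand for two instruments measuring the same concept, and `unrelated` stands for a variable that theory predicts to be uncorrelated with it.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for six respondents.
instrument_a = [10, 12, 15, 18, 20, 22]
instrument_b = [11, 13, 14, 19, 21, 23]   # same concept, different instrument
unrelated = [5, 9, 4, 8, 6, 7]            # theoretically unrelated variable

convergent_r = pearson_r(instrument_a, instrument_b)
discriminant_r = pearson_r(instrument_a, unrelated)
print(round(convergent_r, 2), round(discriminant_r, 2))
```

A high first value would support convergent validity and a value near zero for the second would support discriminant validity; what counts as "high" or "near zero" is a judgment the researcher has to justify.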
Goodness of Measures
Goodness of Measures is established
through the different kinds of validity and
reliability.
The results of any research can only be as
good as the measures that tap the concepts
in the theoretical framework.
Table 7.2 summarizes the kinds of validity
discussed in the lecture.
Validity