Professional Documents
Culture Documents
Reliability
Reliability
Abstract: The purpose of this paper is to show why single-item questions pertaining to a construct are
not reliable and should not be used in drawing conclusions. By comparing the reliability of a summated,
multi-item scale versus a single-item question, the authors show how unreliable a single item is; and
therefore it is not appropriate to make inferences based upon the analysis of single-item questions which
are used in measuring a construct.
Introduction
Oftentimes information gathered in the social sciences, marketing, medicine, and
business, relative to attitudes, emotions, opinions, personalities, and descriptions of peoples
environment involves the use of Likert-type scales. As individuals attempt to quantify constructs
which are not directly measurable they oftentimes use multiple-item scales and summated ratings
to quantify the construct(s) of interest. The Likert scales invention is attributed to Rensis Likert
(1931), who described this technique for the assessment of attitudes.
McIver and Carmines (1981) describe the Likert scale as follows:
A set of items, composed of approximately an equal number of favorable and unfavorable
statements concerning the attitude object, is given to a group of subjects. They are asked
to respond to each statement in terms of their own degree of agreement or disagreement.
Typically, they are instructed to select one of five responses: strongly agree, agree,
undecided, disagree, or strongly disagree. The specific responses to the items are
combined so that individuals with the most favorable attitudes will have the highest
scores while individuals with the least favorable (or unfavorable) attitudes will have the
lowest scores. While not all summated scales are created according to Likerts specific
procedures, all such scales share the basic logic associated with Likert scaling. (pp. 2223)
Spector (1992) identified four characteristics that make a scale a summated rating scale
as follows:
First, a scale must contain multiple items. The use of summated in the name implies that
multiple items will be combined or summed. Second, each individual item must measure
something that has an underlying, quantitative measurement continuum. In other words,
it measures a property of something that can vary quantitatively rather than qualitatively.
82
83
Number
Analysis Strategy
3
14
10.3
48.3
3.4
3.4
13.8
13.8
7.0
None
Chronbachs alpha: Total scale
and/or subscales
Chronbachs alpha: Total scale
and/or subscales
Test-Retest: selected items
Chronbachs alpha: Total scale
and/or subscales
Chronbachs alpha: Total scale
and/or subscales
Chronbachs alpha: Total scale
and/or subscales
Chronbachs alpha: Total scale
and/or subscales
84
6
5
4
3
2
1
0
0
First Administration
Table 2: Multi-item statements to measure students pleasure with their graduate program
at The Ohio State University
Item
1.
2.
3.
4.
5.
85
Strongly
Disagree
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
Strongly
Agree
5
5
5
5
1
1
2
2
3
3
4
4
5
5
86
Item Means
Item Variances
Inter-Item
Correlations
Mean
Variance
SD
29.1042
30.8187
5.5515
Mean
Minimum
Maximum
3.6380
1.0750
3.3125
.7017
3.9792
1.4109
.6667
.7092
1.2013
2.0107
.0729
.0714
.3824
.0415
.5861
.5446
14.1266
.0191
Range
25.0479
23.2748
24.6525
25.2128
22.9202
24.3387
23.9840
24.0811
Alpha
.8240
.6046
.5351
.4260
.5134
.6578
.4473
.6134
.6432
Max/Min
Squared
Multiple
Correlation
.4909
.3693
.4474
.4587
.5104
.3116
.5202
.4751
Variance
Alpha
If Item
Deleted
.7988
.8063
.8219
.8081
.7874
.8192
.7949
.7920
87
88