Developing inventories for satisfaction and Likert scales in a service environment
by
Roger Gates
DSS Research
6750 Locke Avenue
Ft. Worth, TX 76116
Telephone: (817) 665-7000
Fax: (817) 665-7001
E-mail: rgates@dssresearch.com
Abstract
Purpose: To produce up-to-date inventories for satisfaction and Likert scales that contain
commonly used scale point descriptors and their respective mean scale values and standard
deviations.
Methodology/Approach: All data were collected online using the SSI Survey Spot Panel. The
panel is national (U.S.) in scope and was screened to include individuals 21-65 years of age. A
random sample was drawn. Thirty-nine satisfaction items and 19 agreement items were tested,
and the mean value and the standard deviation were calculated for each of these descriptors.
Findings: Even though only six of the items that had been tested by Jones and Thurstone (1955)
were included in the list of satisfaction scale descriptors, the semantic meanings of those six have
changed very little over the years.
Research limitations/implications: One limitation of the current study might be the chosen
service context, since scale point descriptor inventories developed within the context of health
insurance might not be valid in other product contexts.
Practical Implications: Since the present study focuses on two types of scales that are frequently
used in service environments, namely Likert and satisfaction scales, the major contribution of
this study is to provide researchers and managers in services marketing with quantitative
measurement of the meanings of commonly used scale point descriptors, which as pointed out by
Myers and Warner (1968) will make possible the development of equal interval scales and thus
aid analyses of data sets. It will thus help service marketers to develop questionnaires that more
Developing inventories for satisfaction and Likert scales in a service
environment
1. Introduction
Researchers and managers in services marketing are often concerned with assessing
customer satisfaction and opinions (Bearden, Malhotra and Uscátegui, 1998). When developing
questions to assess satisfaction it has been strongly suggested that the end points of preference
response scales should be words or phrases that denote bi-polar extremes, and that all anchoring
points should be suitably spaced along the semantic continuum connecting the end points (Jones
and Thurstone, 1955). Jones and Thurstone (1955) further express the need to investigate the
semantic properties of commonly used scale point descriptors to make sure that they possess the
above properties and also carry meaning that is as clear as possible to subjects that represent the
researcher’s population of interest. Further, knowing the exact scale value of each scale point
Jones and Thurstone (1955) examine the semantic meanings, to respondents, of 51 scale point
descriptors using 9-point scales and subsequently present the research community with a listing
of words and phrases that range from those expressing “greatest like” to those conveying the
“greatest dislike.” That is, the authors succeed in constructing a “continuum of meaning” that
ranges from the end points “best of all” to its bi-polar extreme “despise” (p.33), and further
provide future researchers with both the scale value and standard deviation of each of the tested
Similarly, Myers and Warner (1968) argue that the construction of accurate and
meaningful scales requires that researchers comprehend the psychological meaning, to the
respondent, of scale point descriptors. These authors further assert that quantitative measurement
of the meanings of commonly used scale point descriptors would allow researchers to develop
equal interval scales that are desirable for subsequent statistical analyses of data sets.
Accordingly, Myers and Warner (1968) modify the technique introduced by Jones and Thurstone
(1955), investigate the psychological meaning of 50 commonly used scale point descriptors to
four different groups of respondents, and present the respective mean scale values and standard
deviations for all four groups of respondents. Even though the four subject groups are very
different from each other (i.e., housewives, business executives, undergraduate and graduate
business students), their mean scale values and standard deviations are very similar.
Similar studies have been conducted by Bartram and Yelding (1973), Vidali (1975),
Wildt and Mazis (1978), and the findings indicate that inventory scale values such as provided
by Jones and Thurstone (1955) and Myers and Warner (1968) “are surprisingly consistent among
very diverse groups of people,” “can be used with a high degree of confidence,” and are “likely
to provide psychological scales that are virtually equi-distant” (Vidali, 1975, p.25).
Considering, however, that languages change over time (Graddol, 2004; Yang, 2000),
and no recent inventories are available, the purpose of the present study is to produce a current
inventory containing commonly used scale point descriptors and their respective mean scale
values and standard deviations. Since the present study focuses on two types of scales that are
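frequently used in service environments, namely Likert and satisfaction scales, the major
contribution of this study is to provide researchers and managers in services marketing with
quantitative measurement of the meanings of commonly used scale point descriptors, which, as
pointed out by Myers and Warner (1968), will make possible the development of equal interval
scales and thus aid statistical analyses of data sets.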
2. Methods
The goal of the present research was to develop inventories for two types of frequently
used response scales, namely satisfaction and Likert scales. A review of the literature focused on
locating commonly used scale point descriptors for both types of scales (see Tables 1 and 2).
Given that there is considerable overlap of scale point descriptors, a final number of 39
The data collection followed the method first outlined by Jones and Thurstone (1955).
Accordingly, all satisfaction scale point descriptors were treated as items on nine-point scales
(from -4 to +4). Each scale was anchored to the left by “greatest dislike,” its midpoint by
“neither like nor dislike,” and to the right by “greatest like” (see Table 3 for the instructions
given to respondents). The procedure for the Likert scale point descriptors was similar, except
that the left-hand anchor read “greatest disagreement,” the scale midpoint “neither agree nor
disagree,” and the right-hand anchor “greatest agreement.” For each of the scale point
descriptors, respondents were asked to place a check mark in the space on the nine-point scale
that best described the meaning of the respective scale point descriptor.
All data were collected online, in the United States. For that purpose, the SSI Survey Spot
Panel was used. The panel is national in scope and was screened to include individuals 21-65
years of age. A random sample was drawn, and of those invited to participate by the panel, 65%
qualified to participate in the survey. That is, because the present study focuses on creating an
inventory of satisfaction and agreement measures in the health insurance industry, we recruited
only subjects who actually had experience with such insurance, i.e., had group health insurance
through an employer [self or spouse]. Considering that 65% of the U.S. population has health
insurance, our samples are therefore representative of the population of interest. Further, only the
household decision-maker or co-decision maker was qualified to participate. The response rate of
those who qualified was 62%. All subjects were asked to rate each of the 39 satisfaction and 19
agreement items. The satisfaction scale point descriptors were rated first, followed by the
agreement scale point descriptors. The order of the items within each of the categories (i.e.,
satisfaction and agreement descriptors) was random. Following the procedure outlined by Jones
and Thurstone (1955) and defended by Myers and Warner (1968), all subjects (N = 272) were
The mean value and the standard deviation were calculated for each of the scale point
descriptors (Tables 4 and 5). Interestingly, even though only six of the items that had been tested
by Jones and Thurstone (1955) were included in the list of satisfaction scale descriptors, the
semantic meanings of those six have changed very little over the years (see Table 4).
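The tabulation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings below are hypothetical and are not data from the study, the function name `summarize` is invented for the example, and the sample standard deviation (n - 1 denominator) is assumed because the paper does not state which formula was used.

```python
from statistics import mean, stdev

# Hypothetical ratings on the -4..+4 scale; these are NOT data from the study.
ratings = {
    "Very satisfied": [3, 4, 3, 2, 4],
    "Neutral": [0, 0, 1, 0, -1],
    "Very dissatisfied": [-3, -4, -3, -2, -4],
}


def summarize(ratings_by_descriptor):
    """Return {descriptor: (valid_n, mean, sample_sd)} for each descriptor."""
    summary = {}
    for descriptor, values in ratings_by_descriptor.items():
        # Reject anything outside the nine-point scale used in the survey.
        for value in values:
            if not -4 <= value <= 4:
                raise ValueError(f"rating {value} is outside the -4..+4 scale")
        # stdev() is the sample standard deviation (assumption, see lead-in).
        summary[descriptor] = (len(values), mean(values), stdev(values))
    return summary


inventory = summarize(ratings)

# Print the inventory from the most positive to the most negative descriptor.
for descriptor, (n, m, sd) in sorted(inventory.items(), key=lambda kv: -kv[1][1]):
    print(f"{descriptor:<20}{n:>4}{m:>7.2f}{sd:>7.2f}")
```

Sorting by mean reproduces the ordering convention of an inventory table, with the most favorable descriptor first.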
The current study examines the semantic properties of commonly used scale point
descriptors for both satisfaction and agreement scales, and subsequently provides inventories of
mean values and standard deviations for these scale point descriptors to be used by researchers.
Knowing a scale point descriptor’s mean value makes it possible to construct successive interval
and/or equal interval scales that support meaningful statistical analyses and interpretation.
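One way such an inventory might be used is sketched below in Python: for each position on a hypothetical five-point scale, pick the agreement descriptor whose mean value lies closest to that position. The mean values are copied from Table 5. The target positions and the function name `pick_equal_interval_scale` are assumptions made for illustration, not a procedure prescribed by the study.

```python
# Mean scale values copied from Table 5 (Likert items).
LIKERT_MEANS = {
    "Completely agree": 3.63,
    "Definitely agree": 3.32,
    "Strongly agree": 3.05,
    "Agree": 1.92,
    "Generally agree": 1.67,
    "Moderately agree": 1.62,
    "Somewhat agree": 1.23,
    "Slightly agree": 0.96,
    "Neutral": 0.01,
    "Neither agree nor disagree": 0.00,
    "Uncertain": -0.08,
    "Slightly disagree": -0.95,
    "Somewhat disagree": -1.37,
    "Moderately disagree": -1.65,
    "Disagree": -1.82,
    "Generally disagree": -1.74,
    "Strongly disagree": -3.21,
    "Definitely disagree": -3.45,
    "Completely disagree": -3.69,
}


def pick_equal_interval_scale(means, targets):
    """For each target value, return the descriptor whose mean is closest."""
    return [min(means, key=lambda label: abs(means[label] - t)) for t in targets]


# Hypothetical, equally spaced target positions for a five-point scale.
targets = [-3.0, -1.5, 0.0, 1.5, 3.0]
scale = pick_equal_interval_scale(LIKERT_MEANS, targets)

for target, label in zip(targets, scale):
    print(f"target {target:+.1f}: {label} (mean {LIKERT_MEANS[label]:+.2f})")
```

A nearest-mean rule ignores the standard deviations. A researcher might additionally prefer descriptors with small standard deviations, since those are interpreted more consistently across respondents.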
Although the current study overcomes some of the limitations pointed out by
Myers and Warner (1968), namely the use of relatively small samples that are neither national in
scope nor random, one limitation that future research should
investigate is the chosen product context. It is conceivable
that scale point descriptor inventories developed within the context of health insurance might not
be valid in other product contexts. However, even as we point to this limitation, Mittelstaedt
(1971, p. 236), who compares three different studies that focused on building scale point
descriptor inventories, helps us argue that the product context used to develop an inventory is not
very likely to impact the usefulness of that inventory in other product contexts: “In spite of
differences in time, place, subjects, instruments, instructions, referents and the contextual
differences which may arise from using widely different arrays of stimuli, the correspondence
TABLE 1
Satisfaction Scales
Source                                          Scale point descriptors

Crosby and Stephens, 1987, Journal of           Displeased
Marketing Research (cited by Wirtz and          Pleased
Lee, 2003, Journal of Service Research)

Kolodinsky, 1999, Journal of Consumer           Very dissatisfied
Affairs                                         Dissatisfied
                                                Neutral
                                                Satisfied
                                                Very satisfied

Peterson and Wilson, 1992, Journal of the       Very satisfied
Academy of Marketing Science                    Somewhat satisfied
                                                Somewhat dissatisfied
                                                Very dissatisfied
                                                Uncertain

Peterson and Wilson, 1992, Journal of           Very satisfied
Marketing Research                              Somewhat satisfied
                                                Unsatisfied
                                                Very unsatisfied

Peterson and Wilson, 1992, Journal of the       Completely satisfied
Academy of Marketing Science                    Very satisfied
                                                Fairly satisfied
                                                Somewhat dissatisfied
                                                Very dissatisfied

Preisser, 2002, Health Services and             Excellent
Outcomes Research Methodology                   Very good
                                                Good
                                                Fair
                                                Poor

SIP Servizio Opinioni, 1989, as cited in        Very satisfied
Peterson and Wilson, 1992, Journal of the       Quite satisfied
Academy of Marketing Science                    Not very satisfied
                                                Not at all satisfied

Weinstein, 1989, American Banker                Very satisfied
Consumer Survey                                 Somewhat satisfied
                                                Completely unsatisfied

Westbrook, 1980, Journal of Marketing           Delighted
(T-D Scale)                                     Pleased
                                                Mostly satisfied
                                                Mixed (about equally satisfied and dissatisfied)
                                                Mostly dissatisfied
                                                Unhappy
                                                Terrible

For reasons of completeness and exploratory     Extremely satisfied, acceptable, slightly
purposes, the following scale point             satisfied, OK, neither satisfied nor dissatisfied,
descriptors were added                          slightly dissatisfied, fairly dissatisfied,
                                                completely dissatisfied, extremely dissatisfied
TABLE 2
Likert Scales
Source                                          Scale point descriptors

Albaum, 1997, Market Research Society           Strongly agree
                                                Agree
                                                Neither agree nor disagree
                                                Disagree
                                                Strongly disagree

Hair, Bush and Ortinau, 2003, Marketing         Definitely agree
Research                                        Generally agree
                                                Slightly agree
                                                Slightly disagree
                                                Generally disagree
                                                Definitely disagree

Jacoby and Matell, 1971, Journal of             Agree
Marketing Research                              Uncertain
                                                Disagree

McDaniel and Gates, 2002, Marketing             Strongly agree
Research                                        Somewhat agree
                                                Neutral
                                                Somewhat disagree
                                                Strongly disagree

Menezes and Elbert, 1979, Journal of            Strongly agree
Marketing Research                              Generally agree
                                                Moderately agree
                                                Moderately disagree
                                                Generally disagree
                                                Strongly disagree

For reasons of completeness and exploratory     Completely agree
purposes, the following two scale point         Completely disagree
descriptors were added
TABLE 3
Instructions to Respondents
In this test are words and phrases that people might use to show like or dislike for
health insurance plans. For each word or phrase make a check mark to show what the word or
phrase means to you. Look at the examples.
Example I
Suppose you heard a person say that he/she “barely liked” his/her health insurance
plan. You would probably decide that he/she likes it only a little. To show the meaning of the
phrase “barely like,” you would probably check under +1 on the scale below.
                               -4   -3   -2   -1    0   +1   +2   +3   +4
Barely like                                              √
Example II
If you heard someone say he had the “greatest possible dislike” for a certain health
insurance plan, you would probably check under -4, as shown on the scale below.
                               -4   -3   -2   -1    0   +1   +2   +3   +4
Greatest possible dislike       √
For each phrase on the following pages, check along the scale to show how much like
or dislike the phrase means.
TABLE 4
Satisfaction Items
Item                                              Valid N    Mean  Std Dev
Excellent                                             272    3.74     0.94
Completely satisfied                                  272    3.58     1.24
Extremely satisfied                                   272    3.33     1.91
Very satisfied                                        272    3.29     1.21
Delighted                                             272    3.11     1.22
Very good                                             272    2.61     0.91
Quite satisfied                                       272    2.67     1.17
Mostly satisfied                                      272    2.39     1.14
Pleased                                               272    2.04     1.17
Satisfied                                             272    1.88     1.14
Good                                                  272    1.81     0.96
Fairly satisfied                                      272    1.45     1.01
Somewhat satisfied                                    272    1.32     0.86
Acceptable                                            272    1.22     0.82
Slightly satisfied                                    272    0.94     0.91
OK                                                    272    0.69     0.85
Fair                                                  272    0.47     0.92
Neutral                                               272    0.03     0.36
Mixed (about equally satisfied and dissatisfied)      272    0.00     0.43
Neither satisfied nor dissatisfied                    272   -0.02     0.36
Uncertain                                             272   -0.07     0.50
Slightly dissatisfied                                 272   -1.13     0.76
Somewhat dissatisfied                                 272   -1.42     0.81
Fairly dissatisfied                                   272   -1.66     0.98
Not very satisfied                                    272   -1.51     1.32
Displeased                                            272   -1.85     1.10
Unhappy                                               272   -1.87     1.10
Dissatisfied                                          272   -1.85     1.25
Poor                                                  272   -1.92     1.29
Unsatisfied                                           272   -2.14     1.21
Mostly dissatisfied                                   272   -2.78     0.92
Quite dissatisfied                                    272   -2.65     1.73
Very unsatisfied                                      272   -3.15     1.45
Not at all satisfied                                  272   -3.25     1.43
Very dissatisfied                                     272   -3.08     1.57
Terrible                                              272   -3.36     1.14
Completely dissatisfied                               272   -3.22     2.14
Completely unsatisfied                                272   -3.60     1.45
Extremely dissatisfied                                272   -3.71     1.02
* Statistically significant at the .05 level
Jones and Thurstone (1955) inventoried the scale point descriptors excellent (mean = 3.71, std
dev = 1.01); very good (mean = 2.56, std dev = .87); good (mean = 1.91, std dev = .76); fair
(mean = .78, std dev = .47); neutral (mean = .02, std dev = .18); poor (mean = -1.55, std dev =
.87)
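As a rough illustration of how the six shared descriptors compare across the two studies, the Python sketch below subtracts the 1955 mean values quoted in the note above from the corresponding Table 4 means. It is only descriptive arithmetic on the published means, not a significance test and not an analysis reported by the authors. The variable names are invented for the example.

```python
# Mean scale values from Table 4 of the present study.
CURRENT_MEANS = {
    "Excellent": 3.74,
    "Very good": 2.61,
    "Good": 1.81,
    "Fair": 0.47,
    "Neutral": 0.03,
    "Poor": -1.92,
}

# Mean scale values attributed to Jones and Thurstone (1955) in the table note.
JONES_THURSTONE_MEANS = {
    "Excellent": 3.71,
    "Very good": 2.56,
    "Good": 1.91,
    "Fair": 0.78,
    "Neutral": 0.02,
    "Poor": -1.55,
}

# Shift = current mean minus 1955 mean; a negative shift means the descriptor
# is now rated as less favorable than it was in 1955.
shifts = {
    label: CURRENT_MEANS[label] - JONES_THURSTONE_MEANS[label]
    for label in CURRENT_MEANS
}

mean_abs_shift = sum(abs(s) for s in shifts.values()) / len(shifts)
largest = max(shifts, key=lambda label: abs(shifts[label]))

for label, shift in shifts.items():
    print(f"{label:<10}{shift:+.2f}")
print(f"Mean absolute shift: {mean_abs_shift:.3f}")
print(f"Largest shift: {largest}")
```

On the nine-point scale, differences of a few tenths of a scale point are small relative to the standard deviations reported for these items.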
TABLE 5
Likert Items
Item                        Valid N    Mean  Std Dev
Completely agree                272    3.63     0.91
Definitely agree                272    3.32     1.13
Strongly agree                  272    3.05     1.60
Agree                           272    1.92     0.88
Generally agree                 272    1.67     0.91
Moderately agree                272    1.62     1.03
Somewhat agree                  272    1.23     0.65
Slightly agree                  272    0.96     0.56
Neutral                         272    0.01     0.25
Neither agree nor disagree      272    0.00     0.32
Uncertain                       272   -0.08     0.44
Slightly disagree               272   -0.95     0.84
Somewhat disagree               272   -1.37     0.80
Moderately disagree             272   -1.65     1.12
Disagree                        272   -1.82     1.01
Generally disagree              272   -1.74     1.09
Strongly disagree               272   -3.21     1.43
Definitely disagree             272   -3.45     1.18
Completely disagree             272   -3.69     1.18
REFERENCES
Albaum, G. (1997) “The Likert scale revisited: an alternate version,” Journal of the Market Research Society,
Vol 39 No 2, pp. 331-48.
Bearden, W. O., Malhotra, M. K., Uscátegui, K. H. (1998) “Customer contact and the evaluation
of service experiences: propositions and implications for the design of services,”
Psychology and Marketing, Vol 15 No 8, pp. 793-809.
Bartram, P., Yelding, D. (1973) “The development of an empirical method of selecting phrases
used in verbal rating scales,” Journal of the Market Research Society, Vol 15, pp. 151-56.
Graddol, D. (2004) “The future of language,” Science, Vol. 303 (February 27), pp. 1329-31.
Hair, J. F., Bush, R. P., Ortinau, D. J. (2003) Marketing Research: Within a Changing
Information Environment, 2nd edition. McGraw-Hill, New York.
Jacoby, J., Matell, M. S. (1971) “Three-point Likert scales are good enough,” Journal of
Marketing Research, Vol 8 (November), pp. 495-500.
Kolodinsky, J. (1999) “Consumer satisfaction with a managed health care plan,” The Journal of
Consumer Affairs, Vol 33 No 2, pp. 223-36.
McDaniel, C., Gates, R. (2005) Marketing Research, 6th edition. Wiley, Hoboken, NJ.
Menezes, D., Elbert, N. F. (1979) “Alternative semantic scaling formats for measuring store
image: an evaluation,” Journal of Marketing Research, Vol 16 (February), pp. 80-7.
Myers, J. H., Warner, W. G. (1968) “Semantic properties of selected evaluation
adjectives,” Journal of Marketing Research, Vol 5 (November), pp. 409-12.
Peterson, R. A., Wilson, W. R. (1992) “Measuring customer satisfaction: fact and artifact,”
Journal of the Academy of Marketing Science, Vol 20 No 1, pp. 61-71.
Preisser, J. S. (2002) “Quasi-likelihood analysis of patient satisfaction with medical care,” Health
Services & Outcomes Research Methodology, Vol 3 No 4, pp. 233-45.
Vidali, J. J. (1975) “Context effects on scaled evaluatory adjective meaning,” Journal of the
Market Research Society, Vol 17 No 1, pp. 21-5.
Weinstein, M. (1989) “Consumers still like service, but their enthusiasm erodes,” American
Banker Consumer Survey, Vol 6, pp. 18, 20.
Wildt, A. R., Mazis, M. B. (1978) “Determinants of scale response: label versus position,”
Journal of Marketing Research, Vol 15 (May), pp. 261-7.
Wirtz, J., Lee, M. C. (2003) “An examination of the quality and context-specific applicability of
commonly used customer satisfaction measures,” Journal of Service Research, Vol 5 No
4, pp. 345-55.
Yang, C. D. (2000) “Internal and external forces in language change,” Language Variation and
Change, Vol 12 No 3, pp. 231-50.