
Measurement and Scaling: Fundamentals and Comparative Scaling
8-2

Chapter Outline
1) Overview
2) Measurement and Scaling
3) Primary Scales of Measurement
i. Nominal Scale
ii. Ordinal Scale
iii. Interval Scale
iv. Ratio Scale
4) A Comparison of Scaling Techniques
8-3

Chapter Outline
5) Comparative Scaling Techniques
i. Paired Comparison
ii. Rank Order Scaling
iii. Constant Sum Scaling
iv. Q-Sort and Other Procedures
6) Noncomparative Scaling Techniques
8-4

Each question in a questionnaire expects a response from a respondent.
The most important issue here is how to measure the responses.
8-5

Measurement and Scaling

Measurement means assigning numbers or other symbols to characteristics of objects
according to certain prespecified rules.

Scaling involves creating a continuum (range) upon which measured objects are located.
8-6

Levels of Measurement Scales

• NOMINAL SCALES: Label objects (e.g., race, religion, buyer/nonbuyer, yes/no, …)
• ORDINAL SCALES: Indicate only relative size differences between objects (e.g., brand rankings, purchase frequency, …)
• INTERVAL SCALES: Use descriptors that are equal distances apart (e.g., measuring temperature, or recording the timing of events from some arbitrary zero)
• RATIO SCALES: Have a true zero point (e.g., rupees spent, number of purchases, …)
8-7

Primary Scales of Measurement


Table 8.1

Nominal
  Basic characteristics: Numbers identify and classify objects
  Common examples: Social Security nos., numbering of football players
  Marketing examples: Brand nos., store types

Ordinal
  Basic characteristics: Nos. indicate the relative positions of objects but not the magnitude of differences between them
  Common examples: Quality rankings, rankings of teams in a tournament
  Marketing examples: Preference rankings, market position, social class

Interval
  Basic characteristics: Differences between objects can be compared, zero point is arbitrary
  Common examples: Temperature (Fahrenheit, Celsius)
  Marketing examples: Attitudes, opinions, index nos.

Ratio
  Basic characteristics: Zero point is fixed, ratios of scale values can be compared
  Common examples: Length, weight
  Marketing examples: Age, sales, income, costs
8-8

Primary Scales of Measurement


Figure 8.1

Scale      What is recorded                        Values for three runners
Nominal    Numbers assigned to runners             7, 8, 3
Ordinal    Rank order of winners                   Third place, second place, first place
Interval   Performance rating on a 1 to 10 scale   8.2, 9.1, 9.6
Ratio      Time to finish, in …                    15, 10, 5
Primary Scales of Measurement 8-9

Nominal Scale

• The numbers serve only as labels or tags for identifying and classifying objects.
• The numbers do not reflect the amount of the characteristic possessed by the objects.
• The only permissible operation on the numbers in a nominal scale is counting.
• Only a limited number of statistics, all of which are based on frequency counts, are permissible, e.g., percentages and mode.
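
The counting-based statistics named above can be sketched in a few lines of Python. The store-type codes and the ten responses below are hypothetical and are not taken from the slides; the point is only that the codes are treated as labels, so the code counts them and never averages them.

```python
from collections import Counter

# Hypothetical nominal data: store-type codes chosen by ten respondents.
# The codes are labels only; code 1 is not "less than" code 3.
STORE_TYPE_LABELS = {1: "department store", 2: "discount store", 3: "specialty store"}
responses = [1, 3, 2, 2, 1, 2, 3, 2, 1, 2]


def frequency_table(codes):
    """Return {code: (count, percentage)} for a list of nominal codes."""
    counts = Counter(codes)
    total = len(codes)
    return {code: (n, 100.0 * n / total) for code, n in counts.items()}


def mode(codes):
    """Return the most frequently occurring code."""
    return Counter(codes).most_common(1)[0][0]


table = frequency_table(responses)
for code in sorted(table):
    count, pct = table[code]
    print(f"{STORE_TYPE_LABELS[code]}: n={count}, {pct:.0f}%")
print("Mode:", STORE_TYPE_LABELS[mode(responses)])
```

A mean of these codes could be computed mechanically, but it would have no interpretation, which is why only frequency-based statistics are listed for nominal data.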
8-10

Illustration of Primary Scales of Measurement


Table 8.2

Nominal scale             Ordinal scale         Interval scale        Ratio scale
No.  Store                Preference rankings   Preference ratings    $ spent last
                                                1-7      11-17        3 months
 1.  Lord & Taylor          7        79          5        15             0
 2.  Macy’s                 2        25          7        17           200
 3.  Kmart                  8        82          4        14             0
 4.  Rich’s                 3        30          6        16           100
 5.  J.C. Penney            1        10          7        17           250
 6.  Neiman Marcus          5        53          5        15            35
 7.  Target                 9        95          4        14             0
 8.  Saks Fifth Avenue      6        61          5        15           100
 9.  Sears                  4        45          6        16             0
10.  Wal-Mart              10       115          2        12            10
Primary Scales of Measurement 8-11

Ordinal Scale
 A ranking scale in which numbers are assigned to
objects to indicate the relative extent to which
the objects possess some characteristic.
 Can determine whether an object has more
or less of a characteristic than some other
object, but not how much more or less.
 Any series of numbers can be assigned that
preserves the ordered relationships
between the objects.
 In addition to the counting operation allowable
for nominal scale data, ordinal scales permit the
use of statistics based on centiles, e.g.,
percentile, quartile, median.
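
The statement that any order-preserving series of numbers can be assigned can be checked with the two preference-ranking columns of Table 8.2. The sketch below is illustrative only; the helper name `ordering` is an assumption, and the store-to-row alignment assumes that each store name belongs with the row of figures printed just before it in the extracted table.

```python
# Two sets of ordinal numbers for the same ten stores (Table 8.2).
stores = ["Lord & Taylor", "Macy's", "Kmart", "Rich's", "J.C. Penney",
          "Neiman Marcus", "Target", "Saks Fifth Avenue", "Sears", "Wal-Mart"]
ranks_a = [7, 2, 8, 3, 1, 5, 9, 6, 4, 10]
ranks_b = [79, 25, 82, 30, 10, 53, 95, 61, 45, 115]


def ordering(values):
    """Return store positions sorted from most preferred (lowest number) to least."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Both series place the stores in exactly the same order,
# so they carry the same ordinal information.
same_order = ordering(ranks_a) == ordering(ranks_b)
most_preferred = stores[ordering(ranks_a)[0]]
least_preferred = stores[ordering(ranks_a)[-1]]

print("Same ordering:", same_order)
print("Most preferred:", most_preferred)
print("Least preferred:", least_preferred)
```

The gaps between the numbers differ between the two series (for example 7 versus 8 and 79 versus 82), which is why differences between ordinal values are not interpreted.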
8-12

Football scores (an ordinal ranking)
Team          Points   Rank
Brazil        829      1
Argentina     785      2
Netherlands   783      3
Mexico        759      4
England       757      5
Spain         747      6
USA           744      7
8-13

Ranked Preferences
Suppose you were asked to taste five different foods and rank them in order of
preference. The foods are sweet, salty, bitter, sour, and fatty. We usually rank our
strongest preference as "1". With five foods, our lowest preference would be "5".
These ranks have the property of identity, because they tell us which food is which,
and the property of magnitude, because they place the preferences in order.
They do not tell us "how much" more, just more or less.
Primary Scales of Measurement 8-14

Interval Scale
 Numerically equal distances on the scale
represent equal values in the characteristic being
measured.
 It permits comparison of the differences between
objects.
 The location of the zero point is not fixed. Both
the zero point and the units of measurement are
arbitrary.
 Statistical techniques that may be used include all
of those that can be applied to nominal and
ordinal data, and in addition the arithmetic mean,
standard deviation, and other statistics commonly
used in marketing research.
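
The arbitrary zero point can be illustrated with the two preference-rating columns of Table 8.2, where the same ratings are expressed on a 1-7 scale and on an 11-17 scale. This is a minimal sketch: the variable names are assumptions, and it only shows that differences and the standard deviation survive the shift while ratios do not.

```python
from math import isclose
from statistics import mean, pstdev

# Preference ratings for the ten stores in Table 8.2 on two interval scales.
ratings_1_7 = [5, 7, 4, 6, 7, 5, 4, 5, 6, 2]
ratings_11_17 = [15, 17, 14, 16, 17, 15, 14, 15, 16, 12]

MACYS, KMART = 1, 2  # list positions of two stores, used for the comparison

# Differences between objects are the same on both scales.
diff_a = ratings_1_7[MACYS] - ratings_1_7[KMART]
diff_b = ratings_11_17[MACYS] - ratings_11_17[KMART]

# Ratios change when the zero point moves, so they are not meaningful.
ratio_a = ratings_1_7[MACYS] / ratings_1_7[KMART]
ratio_b = ratings_11_17[MACYS] / ratings_11_17[KMART]

print("Differences:", diff_a, diff_b)
print("Ratios:", round(ratio_a, 2), round(ratio_b, 2))
print("Means:", mean(ratings_1_7), mean(ratings_11_17))
print("Standard deviations:", pstdev(ratings_1_7), pstdev(ratings_11_17))
```

The mean moves by exactly the size of the shift and the standard deviation is unchanged, which is consistent with both statistics being permissible for interval data.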
Primary Scales of Measurement 8-15

Ratio Scale
• Possesses all the properties of the nominal, ordinal, and interval scales.
• It has an absolute zero point.
• It is meaningful to compute ratios of scale values.
• All statistical techniques can be applied to ratio data.
• It measures quantities such as weight, time duration, and number of purchases.
• Nominal scales give numbers as labels, ordinal scales give ranks, and interval scales let us calculate differences. Only a ratio scale answers the question "A is how many times B?"



8-17

THE LADDER IS (top rung to bottom rung):

ABSOLUTE ZERO
EQUAL DISTANCE
MAGNITUDE
IDENTITY
8-18

A Classification of Scaling Techniques


Figure 8.2

Scaling Techniques
  Comparative Scales
    Paired Comparison
    Rank Order
    Constant Sum
    Q-Sort and Other Procedures
  Noncomparative Scales
    Continuous Rating Scales
    Itemized Rating Scales
      Likert
      Semantic Differential
      Stapel
8-19

A Comparison of Scaling Techniques

• Comparative scales involve the direct comparison of stimulus objects. Comparative scale data must be interpreted in relative terms and have only ordinal or rank order properties.
  Example: Rank the following in order of your preference:
  Coke, Pepsi, Mirinda, Fanta, Frooti, Sprite
• In noncomparative scales, each object is scaled independently of the others in the stimulus set. The resulting data are generally assumed to be interval or ratio scaled.
  Example: Coke in taste
  Excellent   Very Good   Good   Average   Fair
  Mirinda in taste
  Excellent   Very Good   Good   Average   Fair
Comparative Scaling Techniques 8-20

Paired Comparison Scaling

• A respondent is presented with two objects and asked to select one according to some criterion.
• The data obtained are ordinal in nature.
• Paired comparison scaling is the most widely used comparative scaling technique.
• With n brands, n(n - 1)/2 paired comparisons are required.
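
The n(n - 1)/2 count can be confirmed by listing the pairs. The sketch below uses Python's standard `itertools.combinations`; the function names and the brand labels are illustrative assumptions.

```python
from itertools import combinations


def paired_comparisons(brands):
    """Return every unordered pair of brands a respondent would be shown."""
    return list(combinations(brands, 2))


def pair_count(n):
    """Number of paired comparisons needed for n brands: n(n - 1)/2."""
    return n * (n - 1) // 2


brands = ["A", "B", "C", "D"]
pairs = paired_comparisons(brands)
for first, second in pairs:
    print(f"{first} and {second}")
print("Pairs listed:", len(pairs), "Formula:", pair_count(len(brands)))
```

The count grows quickly with the number of brands (ten brands already need 45 comparisons), which is a practical limit on this technique.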
8-21

Paired Comparison Scales

Assume that we have brands A, B, C and D, taken two at a time. With four brands
there are 4(4 - 1)/2 = 6 pairs. Rank the following pairs according to your
preference: 1 means the most preferred pair, 2 the next most preferred, and so on.
 A and B
 A and C
 A and D
 B and C
 B and D
 C and D
8-22

Paired Comparison Scaling

The most common method of taste testing is paired comparison. The consumer is asked to
sample two different products and select the one with the most appealing taste. The test
is done in private, and a minimum of 1,000 responses is considered an adequate sample.
A blind taste test for a soft drink, where imagery, self-perception and brand reputation
are very important factors in the consumer’s purchasing decision, may not be a good
indicator of performance in the marketplace. The introduction of New Coke illustrates
this point. New Coke was heavily favored in blind paired comparison taste tests, but its
introduction was less than successful, because image plays a major role in the purchase
of Coke.

Image: A paired comparison taste test
Comparative Scaling Techniques 8-23

Rank Order Scaling

• Respondents are presented with several objects simultaneously and asked to order or rank them according to some criterion.
• It is possible that the respondent may dislike the brand ranked 1 in an absolute sense.
• Like paired comparison, rank order scaling results in ordinal data.
8-24

RANK ORDER SCALE


Rank the following characteristics of cellular
phone service:
__ Total Cost __ Reliability of service
__ Reception quality __ 24-hr customer service
__ Low fixed cost __ Size of local coverage area
Indicate your preferred type of music with a 1,
your second favorite with a 2, and so on for
each type of music:
____ Pop
____ Rock
____ Indian Classical
____ Light
8-25

Preference for Toothpaste Brands Using Rank Order Scaling

Figure 8.4

Instructions: Rank the various brands of toothpaste in order of preference. Begin by
picking out the one brand that you like most and assign it a number 1. Then find the
second most preferred brand and assign it a number 2. Continue this procedure until you
have ranked all the brands of toothpaste in order of preference. The least preferred
brand should be assigned a rank of 10.
No two brands should receive the same rank number.
The criterion of preference is entirely up to you. There is no right or wrong answer.
Just try to be consistent.
8-26

Preference for Toothpaste Brands Using Rank Order Scaling

Figure 8.4 cont.

Form
Brand              Rank Order
1. Babool          _________
2. Colgate         _________
3. Dabur           _________
4. Gleem           _________
5. Macleans        _________
6. Ultra Brite     _________
7. Close Up        _________
8. Pepsodent       _________
9. Plus White      _________
10. Kidodent       _________
Comparative Scaling Techniques 8-27

Constant Sum Scaling

• Respondents allocate a constant sum of units, such as 100 points, to attributes of a product to reflect their importance.
• If an attribute is unimportant, the respondent assigns it zero points.
• If an attribute is twice as important as some other attribute, it receives twice as many points.
• The sum of all the points is 100. Hence, the name of the scale.
8-28

Constant Sum Scale (continued…)

You have 100 points to distribute among the following aspects of restaurants.
Use these points to indicate the relative importance of each factor:
____ Atmosphere
____ Price
____ Service
____ Food Quality
100 = total
8-29

CONSTANT SUM SCALE


 Allocate a fixed number of rating points (usually
100) among several objects to reflect the relative
importance of each object
How important are the following items when selecting a health care
plan:
Ability to choose doctor _____
Extent of coverage provided _____
Quality of medical care _____
Monthly cost of plan _____
Travel distance _____
--------
Total 100
8-30

Importance of Bathing Soap Attributes Using a Constant Sum Scale

Figure 8.5

Instructions
On the next slide, there are eight attributes of bathing soaps. Please allocate 100
points among the attributes so that your allocation reflects the relative importance
you attach to each attribute. The more points an attribute receives, the more important
the attribute is. If an attribute is not at all important, assign it zero points. If an
attribute is twice as important as some other attribute, it should receive twice as
many points.
8-31

Importance of Bathing Soap Attributes Using a Constant Sum Scale

Figure 8.5 cont.

Form
Average Responses of Three Segments
Attribute            Segment I   Segment II   Segment III
1. Mildness               8           2            4
2. Lather                 2           4           17
3. Shrinkage              3           9            7
4. Price                 53          17            9
5. Fragrance              9           0           19
6. Packaging              7           5            9
7. Moisturizing           5           3           20
8. Cleaning Power        13          60           15
Sum                     100         100          100
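
The constant sum rules can be checked against the segment averages in Figure 8.5. In the sketch below the values are read from that figure, assuming each attribute belongs with the row of figures printed just above it in the extracted form; the function names are illustrative assumptions rather than part of the original material.

```python
CONSTANT_SUM = 100

attributes = ["Mildness", "Lather", "Shrinkage", "Price",
              "Fragrance", "Packaging", "Moisturizing", "Cleaning Power"]

# Average points per attribute for the three segments (Figure 8.5).
points = {
    "Segment I": [8, 2, 3, 53, 9, 7, 5, 13],
    "Segment II": [2, 4, 9, 17, 0, 5, 3, 60],
    "Segment III": [4, 17, 7, 9, 19, 9, 20, 15],
}


def is_valid_allocation(allocation, total=CONSTANT_SUM):
    """An allocation is valid if no entry is negative and the points sum to the total."""
    return all(p >= 0 for p in allocation) and sum(allocation) == total


def most_important(allocation):
    """Return the attribute that received the most points."""
    return attributes[allocation.index(max(allocation))]


def importance_ratio(allocation, attr_a, attr_b):
    """How many times as important attr_a is as attr_b (attr_b must have points)."""
    return allocation[attributes.index(attr_a)] / allocation[attributes.index(attr_b)]


for segment, allocation in points.items():
    print(segment, "valid:", is_valid_allocation(allocation),
          "| most important:", most_important(allocation))
```

For example, `importance_ratio(points["Segment II"], "Lather", "Mildness")` divides 4 by 2, so Segment II treats lather as twice as important as mildness. The ratio is undefined when the second attribute received zero points.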
8-32

Q-sorting Scale
• A scale in which a set of objects is distributed into piles according to specified rating categories.
• It is a sophisticated form of rank ordering in which a set of objects is given to an individual to sort into piles according to specified rating categories.
• E.g., sort your inspiration



Measurement and Scaling: Noncomparative Scaling Techniques
8-34

GRAPHIC RATING SCALE
8-35

• GRAPHIC RATING SCALES: a scale showing a graphic continuum that is typically anchored by two extremes. The respondent indicates her or his rating by placing a mark at the appropriate point on the continuum. One can also provide a brief description of scale points along the line.
8-36

GRAPHIC RATING SCALES

 Sweet----------------------------------------Sour

Sweet-------------------------------------Not Sweet

Excellent___________________________ Poor
1 2 3 4 5 6 7 8 9 10
8-37

GRAPHIC RATING SCALES


Scale A

Uncomfortable Comfortable

Scale B

0 10 20 30 40 50 60 70 80 90 100

Uncomfortable Neutral Comfortable


8-38

Continuous Rating Scale


Respondents rate the objects by placing a mark at the appropriate position on a line
that runs from one extreme of the criterion variable to the other.
The form of the continuous scale may vary considerably.

How would you rate Sears as a department store?


Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
 
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- - Probably the best
0 10 20 30 40 50 60 70 80 90 100
 
Version 3
Very bad Neither good Very good
nor bad
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - -Probably the best
0 10 20 30 40 50 60 70 80 90 100
8-39

An Itemized Rating Scale - Likert Scale


The Likert scale requires the respondents to indicate a degree of agreement or
disagreement with each of a series of statements about the stimulus objects.

1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree
(X marks the respondent's answer.)

1. Sears sells high quality merchandise.   1   2X   3    4   5

2. Sears has poor in-store service.        1   2X   3    4   5

3. I like to shop at Sears.                1   2    3X   4   5

• The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated.

• When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
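
The reversal rule for negative statements can be sketched with the three Sears items and the responses marked with an X above. The flag marking item 2 as a negative statement and the function names are assumptions made for this illustration.

```python
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(value):
    """Mirror a response around the scale midpoint (1<->5, 2<->4, 3 stays 3)."""
    return SCALE_MIN + SCALE_MAX - value


# (statement, is_negative_statement, response marked by the respondent)
items = [
    ("Sears sells high quality merchandise.", False, 2),
    ("Sears has poor in-store service.", True, 2),
    ("I like to shop at Sears.", False, 3),
]


def item_scores(items):
    """Score each item, reversing the scale for negative statements."""
    return [reverse_score(resp) if negative else resp
            for _, negative, resp in items]


def summated_score(items):
    """Total (summated) Likert score across all items."""
    return sum(item_scores(items))


print("Item scores:", item_scores(items))
print("Summated score:", summated_score(items))
```

Without the reversal, disagreeing with "poor in-store service" (a favourable opinion) would lower the total, so a higher summated score would no longer mean a more favourable attitude.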
8-40

Likert Scale – Another Example

• Mr. Shahi Kant is conducting a market-research project for GT-Stores, Patel Nagar. He has designed a questionnaire for it, and some of the questions are given below:

1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree

1. GT-Store sells high quality merchandise.   1   2   3   4   5

2. GT-Store has poor in-store service.        1   2   3   4   5

3. I like to shop at GT-Store.                1   2   3   4   5
8-41

ITEMIZED RATING SCALES


 Indicate overall level of satisfaction with present health
insurance plan:
__ very satisfied
__ quite satisfied
__ somewhat satisfied
__ not at all satisfied

As a company, Infosys is:

very innovative                     not at all innovative
1     2     3     4     5     6     7
8-42

Itemized Rating Scales…


Odd vs. Even Scale Points

Odd Even
1. Strongly Agree _____ 1. Strongly Agree _____
2. Agree _____ 2. Agree _____
3. Neutral _____ 3. Disagree _____
4. Disagree _____ 4. Strongly disagree ____
5. Strongly disagree _____
8-43

Balanced vs. Unbalanced Scales

Balanced Unbalanced
Very good ______ Excellent ______
Good ______ Very Good ______
Fair ______ Good ______
Poor ______ Fair ______
Very Poor ______ Poor ______
8-44

Forced vs. Unforced Scales

Forced Unforced

Extremely Reliable ___ Extremely Reliable ___

Very Reliable ___ Very Reliable ___

Somewhat Reliable ___ Somewhat Reliable ___

Somewhat Unreliable ___ Somewhat Unreliable ___

Very Unreliable ___ Very Unreliable ___

Extremely Unreliable ___ Extremely Unreliable ___


Don’t know ___
8-45

Labeled vs. End Anchored Scales

Labeled               End Anchored
Excellent _____       Excellent _____
Very Good _____                 _____
Fair      _____                 _____
Poor      _____                 _____
Very Poor _____       Poor      _____
8-46

KINDS OF SCALES…(continued)
• SEMANTIC DIFFERENTIAL SCALE: a scale on which a concept is rated between opposite pairs of words or phrases placed at the ends of a continuum; the ratings are then plotted as a profile or image.
• The researcher begins by determining the concept to be rated.
• Then she or he selects opposite pairs of words or phrases that describe the object.
8-47

Semantic Differential Scales

Heavy Metal music is:

Danceable Not Danceable

Predictable Unpredictable

Soft Hard

Modern ....................................Old Fashioned

Friendly ...................................Unfriendly

Well Established ................................Not Well Established

Reliable ...................................Unreliable

Interesting Ads ................................Uninteresting Ads


8-48
A Semantic Differential Scale for Measuring Self-
Concepts, Person Concepts, and Product Concepts

1) Rugged :---:---:---:---:---:---:---: Delicate

2) Excitable :---:---:---:---:---:---:---: Calm


3) Uncomfortable :---:---:---:---:---:---:---: Comfortable

4) Dominating :---:---:---:---:---:---:---: Submissive

5) Thrifty :---:---:---:---:---:---:---: Indulgent

6) Pleasant :---:---:---:---:---:---:---: Unpleasant

7) Contemporary :---:---:---:---:---:---:---: Obsolete

8) Organized :---:---:---:---:---:---:---: Unorganized

9) Rational :---:---:---:---:---:---:---: Emotional

10) Youthful :---:---:---:---:---:---:---: Mature


8-49

KINDS OF SCALES…(continued)
• STAPEL SCALE: a scale that places a single description in the centre of a range of values, usually running from +5 to -5. It is designed to measure both the direction and intensity of attitudes simultaneously.
8-50

Stapel Scales

+5 +5
+4 +4
+3 +3
+2 +2
+1 +1
Cheap Prices Satisfying
-1 -1
-2 -2
-3 -3
-4 -4
-5 -5
8-51

SURVEY RESULTS CAN NEVER BE ERROR FREE…!
• Survey results usually have errors. The errors can be broadly classified into:
  • RANDOM ERROR
  • MEASUREMENT / NON-RANDOM ERROR
• To improve the quality of a survey, one should try to minimize measurement error. This means increasing the VALIDITY and RELIABILITY of the survey data.
8-52
8-53

RELIABILITY ...
 Reliability is a statistical measure of how reproducible
the survey instruments’ data are.
 It is the ability of a measure to produce the same or
highly similar results on repeated administrations.
 Reliability of a questionnaire relates to the
consistency of responses across retesting with the
same or equivalent instrument.
 To be short, reliability means stability and
consistency in results.
 A survey is said to be a reliable one if it provides a
consistent measure of important characteristics
despite background fluctuations.
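
One way to put a number on consistency across repeated administrations is to correlate the scores from two rounds of the same instrument. The sketch below is only an illustration: the eight pairs of scores are hypothetical, and the choice of Pearson's correlation coefficient as the index is an assumption, not something the slides specify.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores from the same eight respondents on two occasions.
first_administration = [4, 5, 3, 2, 5, 1, 4, 3]
second_administration = [4, 4, 3, 2, 5, 2, 4, 3]

r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A coefficient close to 1 would indicate that respondents kept nearly the same relative positions on both occasions; how high is "high enough" is a judgment the researcher has to make.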
8-54

RELIABILITY …(continued)
• “Reliability is the degree to which the same event or behaviour produces the same score each time it is measured.”
• There is no concept of absolute reliability; it is a relative concept. Reliability is a matter of degree: some measures are more reliable than others.
• Reliability is inversely related to random error.
8-55

Is it possible to MEASURE RELIABILITY...?
• YES! The reliability of questionnaire data can be measured with tools that fall broadly into two groups:
• those that determine reliability by repeated testing; and
• those that need only one-time testing.



8-56

Measuring RELIABILITY …??

 ASSESSING RELIABILITY BY -

 TEST-RETEST METHOD

 ALTERNATE - FORM METHOD

 INTERNAL CONSISTENCY METHOD

 SPLIT - HALF METHOD

 INTEROBSERVER METHOD

 INTRAOBSERVER METHOD
8-57

VALIDITY …
• The validity of a measure is the extent to which it measures what it is intended to measure.
• … a ruler is considered to be a valid instrument if it provides an accurate measure of a person’s height
• … whatever we try to measure, we have actually measured
• Validity ensures accuracy.
• A valid survey is always reliable, but a reliable survey may not always be valid.
• Validity depends on the extent of non-random error present in the measurement process.
8-58

Validity looks at the end results of measurement. The basic question that it asks is:
“Are we really measuring what we think we are measuring?”
8-59

RELIABILITY
vs
VALIDITY???
8-60

Neither Valid nor Reliable     Reliable but not Valid     Valid and Reliable
8-61

Measuring VALIDITY …
 We may assess the validity of a
measurement instrument by
using the following methods:
 FACE VALIDITY
 CONTENT VALIDITY
 CRITERION VALIDITY
 CONCURRENT VALIDITY
 PREDICTIVE VALIDITY
 CONSTRUCT VALIDITY
 INTERNAL AND EXTERNAL VALIDITY
8-62

Increasing Validity

 Ask respondents to answer in great detail.


 Ask questions fully, so there is no ambiguity.
 Use a reliable instrument.
 Ask questions on related areas to get “the big picture”.
 Ask lots of questions to be sure you cover the subject
completely.
8-63

If the goal was to hit the “Bull’s eye” with each dart…

Then the results were consistent but off-target.

Reliable but not Valid
8-64

If the goal was to hit the “Bull’s eye” with each dart…

Then the results were both consistent and accurate.

Valid and Reliable
8-65

If the goal was to hit the “Bull’s eye” with each dart…

Then the results were neither consistent nor accurate.

Neither Valid nor Reliable
8-66

Reliability and Validity

• Reliability has to do with consistency, while validity has to do with accuracy.
• To have validity we must first have reliability, i.e. reliability is a prerequisite for validity.
• Reliability is a necessary but not sufficient condition for validity.
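
The dartboard picture can be turned into a small numerical sketch. Everything below is an illustrative assumption: the throw coordinates are made up, the bull's eye is placed at (0, 0), "spread" (average distance of throws from their own centre) stands in for reliability, "bias" (distance of that centre from the bull's eye) stands in for validity, and the tolerance of 1.0 is arbitrary.

```python
from math import hypot
from statistics import mean


def centroid(throws):
    """Average landing point of a set of (x, y) throws."""
    return mean(x for x, _ in throws), mean(y for _, y in throws)


def spread(throws):
    """Average distance of throws from their own centroid (consistency)."""
    cx, cy = centroid(throws)
    return mean(hypot(x - cx, y - cy) for x, y in throws)


def bias(throws):
    """Distance of the centroid from the bull's eye at (0, 0) (accuracy)."""
    cx, cy = centroid(throws)
    return hypot(cx, cy)


def describe(throws, tolerance=1.0):
    """Classify throws; validity is only credited when the throws are also reliable."""
    reliable = spread(throws) <= tolerance
    valid = reliable and bias(throws) <= tolerance
    if valid:
        return "valid and reliable"
    if reliable:
        return "reliable but not valid"
    return "neither valid nor reliable"


# Hypothetical throws for three players.
tight_off_target = [(5.0, 5.2), (5.1, 4.9), (4.8, 5.0), (5.2, 5.1)]
tight_on_target = [(0.1, -0.2), (-0.1, 0.1), (0.2, 0.0), (0.0, 0.1)]
scattered = [(6.0, -4.0), (-5.0, 3.0), (2.0, 7.0), (-4.0, -6.0)]

for name, throws in [("tight, off target", tight_off_target),
                     ("tight, on target", tight_on_target),
                     ("scattered", scattered)]:
    print(f"{name}: {describe(throws)}")
```

The scattered throws happen to average out close to the bull's eye, yet they are still classified as neither valid nor reliable, which mirrors the point that reliability is necessary for validity.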
8-68

ASSESSING VALIDITY...?
8-69

FACE VALIDITY
• It is the weakest form of validity. It is concerned with the degree to which a measurement “LOOKS LIKE” it measures what it is supposed to.
• The basic objective of Face Validity is to ensure that the items that appear in a questionnaire are not OBVIOUSLY inconsistent or absurd.
• Face Validity refers to judgements about validity made on the basis of overall appearance. It is based on a cursory review of items by untrained judges or laymen.
8-70

CONTENT VALIDITY
• Content Validity of a measure is guided by the question: “Is the substance or content of this measure representative of the content or the universe of content of the property being measured?” That is to say, content validity is the degree to which the tool represents the universe of the concept under study.
• If it can be shown that the items or questions of a survey accurately represent the characteristics or attitudes that they are intended to measure, then we can conclude that the survey has content validity.
8-71

CONTENT VALIDITY (continued…)

• Whether a measurement has content validity depends ultimately on how the researcher defines the concept it is designed to measure.
• The assessment of content validity involves an organized review of the survey contents by experts, to ensure that the survey includes everything it should and does not include anything it should not.
8-72

CONSTRUCT VALIDITY
• A measure has construct validity if it behaves according to the underlying theory.
• It represents the degree to which a measurement is connected logically to a phenomenon via an underlying theory.
• It may be measured as the degree of correlation between the results of a particular measurement and the results expected from the underlying theory.
• A scale has construct validity if it measures an observable phenomenon that an underlying theory correlates with the construct of interest.
8-73

CONSTRUCT VALIDITY
• If a theory is not already developed, then the process of establishing construct validity may lead to the development of scientific theories. In such a case, it is a measure of how meaningful the scale or the survey instrument is when it is put to practical use.
• Construct Validity is divided further into two types:
  • CONVERGENT VALIDITY
  • DIVERGENT/DISCRIMINANT VALIDITY
8-74

INTERNAL VALIDITY
• In INTERNAL VALIDITY, the researcher is concerned with whether the differences in impact observed in the study can be attributed to the variables under study; if yes, the study has internal validity.
• It is the degree to which the relationship between the scores reflects only the relationship between the intended variables.
• Internal Validity tries to capture “the approximate validity with which we can infer that a relationship is causal”.
8-75

EXTERNAL VALIDITY
• In EXTERNAL VALIDITY, the researcher is concerned with whether the effects under study can be used to make generalisations about some population or setting.
• It is the degree to which we can draw correct inferences when generalising beyond a study.
• “External Validity asks the question of generalizability: to what populations, settings, treatment variables, and measurement variables can this effect be generalized?”
