
GARY M. MULLET and MARVIN J. KARSON*

The authors show how to incorporate actual purchase behavior data (usually from diaries) into the calculation of standard errors for making inferences about the proportion of purchasers from self-reported purchase intent. The case considered is for a single sample and two products, where interest is in the difference between the two groups in probability of purchase.

Analysis of Purchase Intent Scales Weighted by Probability of Actual Purchase

Purchase intent scales are well known and widely used in contemporary marketing research. Even a brief survey of marketing research texts shows a wide variety of scales used to get at the key issue: how likely is a respondent to purchase a given product (concept)? For instance, Aaker and Day (1980) illustrate both 3- and 5-point scales used to meter future intentions. They state that such scales have good predictive ability. Green and Tull (1978), Peterson (1982), and Smith and Swinyard (1983) also show examples of purchase intent scales.

Churchill (1979) gives an example of this type of scale, but warns against its indiscriminate use because of the potential difference between stated intentions and actual (later) behavior. Wentz (1979) shows a table of stated and actual purchase behavior for three brands in a particular product category. Finally, though writing from a different perspective than ours, Morrison (1979) emphasizes, among other things, the importance of gathering actual purchase data as a follow-up to stated intentions. Obviously, the stated intention may be different from actual behavior for a variety of reasons.

We consider the following important research situation. A randomly selected respondent group, of sample size N, is asked to evaluate each of two products on a k-point purchase intent scale. For example, the respondents may evaluate each of two products and then answer, among other questions, a question such as

Now how likely is it that you, yourself, would purchase this product? That is, would you say that you
-definitely will buy
-probably will buy
-may or may not buy
-probably will not buy
-definitely will not buy?

Indeed, the k = 5-point scale is not atypical. Note that because each respondent is supplying two pieces of information, the results we present are analogous to the McNemar procedure (and extensions) for dichotomous and multichotomous data, unweighted. See, for example, Dixon and Massey (1969). In the general situation, one assumes known the conditional probability that a respondent who responds with a given intention will actually purchase. These probabilities are assumed known from previous studies or, more typically, from diary information. The important point here is that these probabilities are treated as known quantities and not as random variables which are subject to random and sampling variation. These actual purchase probabilities are incorporated into the observed sample data to obtain the appropriate inference procedures for the difference between the probability a respondent actually would buy one product and the probability a respondent actually would buy the other product.
*Gary M. Mullet is Director of Statistical Services, Sophisticated Data Research, Inc. Marvin J. Karson is Professor of Business Statistics and the Carter Chair of Management, Whittemore School of Business and Economics, University of New Hampshire.
The authors acknowledge the helpful comments of the anonymous JMR reviewers.

Journal of Marketing Research, Vol. XXII (February 1985), 93-6

DEFINITIONS AND NOTATION

For notational purposes, and with k purchase intent categories so that the indices i and j are from 1, 2, ..., k, we define

$r_{ij}$ = the population proportion of respondents who actually answer in category $i$ for product 1 and in category $j$ for product 2,

$P_{i\cdot}$ = the known proportion of respondents in category $i$ for product 1 who actually purchase,
$P_{\cdot j}$ = the known proportion of respondents in category $j$ for product 2 who actually purchase,
$r_{i\cdot}$ = the marginal proportion of respondents who actually answer in category $i$ for product 1, and
$r_{\cdot j}$ = the marginal proportion of respondents who actually answer in category $j$ for product 2.

Also we define

$X_{ij}$ = the random variable representing the number in the sample who respond simultaneously in category $i$ for product 1 and category $j$ for product 2,
$P(1)$ = the population proportion of respondents who actually would buy product 1, given the opportunity, and
$P(2)$ = the population proportion of respondents who actually would buy product 2, given the opportunity.

Finally, let $N$ denote the sample size.

The sample-based estimators of these actual purchase proportions are denoted by $r(1)$ and $r(2)$, respectively, and are given by Morrison (1979) for a single product.

$$r(1) = \sum_{i=1}^{k} P_{i\cdot} X_{i\cdot} / N \tag{1}$$

$$r(2) = \sum_{j=1}^{k} P_{\cdot j} X_{\cdot j} / N \tag{2}$$

where:

$$X_{i\cdot} = \sum_{j=1}^{k} X_{ij} \tag{3}$$

$$X_{\cdot j} = \sum_{i=1}^{k} X_{ij}. \tag{4}$$

The actual sample estimates are obtained by replacing $X_{i\cdot}$, $X_{\cdot j}$, and $X_{ij}$ with their sample realizations which would be denoted by $x$ with appropriate subscripts.

For purposes of inference, it is the variance of the difference, $r(1) - r(2)$, that is important and of interest; from multinomial distribution theory (e.g., Hogg and Craig 1978) it is straightforward to obtain the exact variance and hence exact standard error of the difference. As is well known, the variance of $r(1) - r(2)$ is

$$\mathrm{Var}[r(1) - r(2)] = \mathrm{Var}[r(1)] + \mathrm{Var}[r(2)] - 2\,\mathrm{Cov}[r(1), r(2)] \tag{5}$$

where Var denotes a variance and Cov denotes a covariance. Because the marginal purchase intent category responses $X_{i\cdot}$ and $X_{\cdot j}$ for products 1 and 2, respectively, are marginal multinomial random variables, the variances of $r(1)$ and $r(2)$ are

$$\mathrm{Var}[r(1)] = \sum_{i=1}^{k} P_{i\cdot}^{2}\,\mathrm{Var}(X_{i\cdot})/N^{2} + 2\sum_{i=1}^{k}\sum_{\substack{l=1 \\ i<l}}^{k} P_{i\cdot}P_{l\cdot}\,\mathrm{Cov}(X_{i\cdot}, X_{l\cdot})/N^{2} \tag{6}$$

and

$$\mathrm{Var}[r(2)] = \sum_{j=1}^{k} P_{\cdot j}^{2}\,\mathrm{Var}(X_{\cdot j})/N^{2} + 2\sum_{j=1}^{k}\sum_{\substack{m=1 \\ j<m}}^{k} P_{\cdot j}P_{\cdot m}\,\mathrm{Cov}(X_{\cdot j}, X_{\cdot m})/N^{2}. \tag{7}$$

These variances are estimated by replacing the variances and covariances in equations 6 and 7 by their sample estimates, giving the estimated variances as

$$\mathrm{var}[r(1)] = \sum_{i=1}^{k} P_{i\cdot}^{2}\,x_{i\cdot}(N - x_{i\cdot})/N^{3} - 2\sum_{i=1}^{k}\sum_{\substack{l=1 \\ i<l}}^{k} P_{i\cdot}P_{l\cdot}\,x_{i\cdot}x_{l\cdot}/N^{3} \tag{8}$$

and

$$\mathrm{var}[r(2)] = \sum_{j=1}^{k} P_{\cdot j}^{2}\,x_{\cdot j}(N - x_{\cdot j})/N^{3} - 2\sum_{j=1}^{k}\sum_{\substack{m=1 \\ j<m}}^{k} P_{\cdot j}P_{\cdot m}\,x_{\cdot j}x_{\cdot m}/N^{3}. \tag{9}$$

Here and in the following discussion, $x$ denotes a sample realized value.

An alternative and equivalent formula for calculating these estimated variances is to use the following equations from Haberman (1978, p. 62).

$$\mathrm{var}[r(1)] = \Big[\sum_{i=1}^{k} P_{i\cdot}^{2}q_{i\cdot} - \Big(\sum_{i=1}^{k} P_{i\cdot}q_{i\cdot}\Big)^{2}\Big]\Big/N \tag{10}$$

$$\mathrm{var}[r(2)] = \Big[\sum_{j=1}^{k} P_{\cdot j}^{2}q_{\cdot j} - \Big(\sum_{j=1}^{k} P_{\cdot j}q_{\cdot j}\Big)^{2}\Big]\Big/N \tag{11}$$

where:

$$q_{i\cdot} = \frac{x_{i\cdot}}{N}$$

and

$$q_{\cdot j} = \frac{x_{\cdot j}}{N}.$$

(The reader should note that calculating variances from sample values is a statistical estimating process. The formulas shown are exact; however, the value of a variance estimated from a sample is unlikely to be exactly equal to the true, exact, and unknown population variance. Of course, this difference is sampling error.)

To find $\mathrm{Cov}[r(1), r(2)]$, we define the following vectors and matrices.

$a' = (P_{1\cdot}, P_{2\cdot}, \ldots, P_{k\cdot})/N$
= row vector of weights for $r(1)$,

$b' = (P_{\cdot 1}, P_{\cdot 2}, \ldots, P_{\cdot k})/N$

= row vector of weights for $r(2)$, and

$\Sigma_{12}$ = the ($k \times k$) matrix of covariances between the product 1 marginal totals $X_{i\cdot}$ and the product 2 marginal totals $X_{\cdot j}$.

Then, we may write

$$r(1) = \sum_{i=1}^{k} P_{i\cdot}X_{i\cdot}/N = a'X_{1} \tag{12}$$

and

$$r(2) = \sum_{j=1}^{k} P_{\cdot j}X_{\cdot j}/N = b'X_{2} \tag{13}$$

with

$X_{1}' = (X_{1\cdot}, X_{2\cdot}, \ldots, X_{k\cdot})$ = the row vector of marginal totals for product 1

and

$X_{2}' = (X_{\cdot 1}, X_{\cdot 2}, \ldots, X_{\cdot k})$ = the row vector of marginal totals for product 2.

The matrix notation enables us to write next

$$\mathrm{Cov}[r(1), r(2)] = \mathrm{Cov}(a'X_{1}, b'X_{2}) = a'\Sigma_{12}b \tag{14}$$

which means we need an estimator of $\Sigma_{12}$. We denote that estimator as $S_{12}$. To find $S_{12}$, note that the element in row $i$, column $j$ of $\Sigma_{12}$ is the covariance of $X_{i\cdot}$ and $X_{\cdot j}$ and this is

$$\mathrm{Cov}(X_{i\cdot}, X_{\cdot j}) = \mathrm{Cov}\Big(\sum_{l=1}^{k} X_{il}, \sum_{m=1}^{k} X_{mj}\Big) = \sum_{m=1}^{k}\sum_{l=1}^{k} \mathrm{Cov}(X_{il}, X_{mj}) = \sum_{m=1}^{k}\sum_{l=1}^{k} -N r_{il} r_{mj}. \tag{15}$$

This follows because (1) the covariance of two totals is the sum of the covariances of all pairs of random variables, where one member of a pair is from one total and the other is from the other total, and (2) the covariance of the multinomial random variables $X_{il}$ and $X_{mj}$ is $(-N r_{il} r_{mj})$ (e.g., Hogg and Craig 1978).

Finally, the $(i, j)$ element in $S_{12}$ is, by the preceding notation,

$$s(x_{i\cdot}, x_{\cdot j}) = \sum_{m=1}^{k}\sum_{l=1}^{k} -N (x_{il}/N)(x_{mj}/N) = -(1/N)\sum_{m=1}^{k}\sum_{l=1}^{k} x_{il}x_{mj} = -(1/N)\,x_{i\cdot}x_{\cdot j}.$$

Next, the sample covariance of $r(1)$ and $r(2)$ is found by replacing $\Sigma_{12}$ in equation 14 by $S_{12}$, giving

$$s[r(1), r(2)] = a'S_{12}b. \tag{16}$$

Therefore, the standard error of $r(1) - r(2)$ is found from equations 8 and 9, or 10 and 11, and 16.

ILLUSTRATION

Assume that a group of 600 respondents gave the following answers to the 5-point purchase intent question mentioned before for two nationally distributed beer brands. One is rapidly growing and increasing distribution (product 1) and the other is an old, established national brand (product 2).

                                 Established brand (product 2)
Growing brand (product 1)     DWB    PWB   M/MNB   PWNB   DWNB   Total
DWB                            60     10      10      0      0      80
PWB                            10     80      20     10      0     120
M/MNB                          10     10      50      0     10      80
PWNB                           20      0      10     90     30     150
DWNB                            0      0       0     10    160     170
Total                         100    100      90    110    200     600

(DWB = definitely will buy, PWB = probably will buy, M/MNB = may or may not buy, PWNB = probably will not buy, DWNB = definitely will not buy.)

Hence, for example, of those $100 = x_{\cdot 1}$ who said "definitely will buy" the established brand, $60 = x_{11}$ said the same for the growing brand, $10 = x_{21}$ said "probably will buy" the growing brand, $10 = x_{31}$ said "may or may not buy" the growing brand, $20 = x_{41}$ said "probably will not buy" the growing brand, and $0 = x_{51}$ said "definitely will not buy" the growing brand.

Further assume the following respective distribution of the known diary purchase probabilities, $P_{i\cdot}$ and $P_{\cdot j}$.

$P_{1\cdot}$ = .95    $P_{\cdot 1}$ = .90
$P_{2\cdot}$ = .70    $P_{\cdot 2}$ = .80
$P_{3\cdot}$ = .50    $P_{\cdot 3}$ = .60
$P_{4\cdot}$ = .20    $P_{\cdot 4}$ = .30
$P_{5\cdot}$ = .05    $P_{\cdot 5}$ = .10

From the marginals, we readily find, from equations 1 and 2, that

$$r(1) = 238.5/600 = .3975$$

and

$$r(2) = 277/600 = .4617,$$

and from equations 8 and 9, or 10 and 11,

$$\mathrm{var}[r(1)] = .000174$$
$$\mathrm{var}[r(2)] = .000171.$$

Also,

$a' = (1/600)(.95, .70, .50, .20, .05)$

$b' = (1/600)(.90, .80, .60, .30, .10)$
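The point estimates and variances quoted above can be retraced with a few lines of code. The following is a minimal sketch, not the authors' own program: the function names (`margins`, `weighted_proportion`, `var_pairwise`, `var_haberman`) are hypothetical, and the data are the illustration's table and diary probabilities as given in the text. It evaluates equations 1 and 2, and both variance forms (equations 8 and 9 versus 10 and 11), so their equivalence can be checked numerically.

```python
# Illustration data as given in the text: rows = product 1 (growing brand),
# columns = product 2 (established brand), categories DWB ... DWNB.
counts = [
    [60, 10, 10, 0, 0],
    [10, 80, 20, 10, 0],
    [10, 10, 50, 0, 10],
    [20, 0, 10, 90, 30],
    [0, 0, 0, 10, 160],
]
p1 = [0.95, 0.70, 0.50, 0.20, 0.05]  # known P_i. for product 1
p2 = [0.90, 0.80, 0.60, 0.30, 0.10]  # known P_.j for product 2


def margins(table):
    """Return (row totals x_i., column totals x_.j, N), as in equations 3 and 4."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols, sum(rows)


def weighted_proportion(p, x, n):
    """Equations 1 and 2: sum of P * x, divided by N."""
    return sum(pi * xi for pi, xi in zip(p, x)) / n


def var_pairwise(p, x, n):
    """Equations 8 and 9: squared-weight terms minus twice the i < l cross terms."""
    k = len(p)
    diag = sum(p[i] ** 2 * x[i] * (n - x[i]) for i in range(k))
    cross = sum(p[i] * p[l] * x[i] * x[l]
                for i in range(k) for l in range(i + 1, k))
    return (diag - 2 * cross) / n ** 3


def var_haberman(p, x, n):
    """Equations 10 and 11 with q = x / N."""
    q = [xi / n for xi in x]
    first = sum(pi ** 2 * qi for pi, qi in zip(p, q))
    second = sum(pi * qi for pi, qi in zip(p, q)) ** 2
    return (first - second) / n


x_row, x_col, n = margins(counts)
r1 = weighted_proportion(p1, x_row, n)
r2 = weighted_proportion(p2, x_col, n)
print(round(r1, 4), round(r2, 4))
print(round(var_pairwise(p1, x_row, n), 6), round(var_haberman(p1, x_row, n), 6))
print(round(var_pairwise(p2, x_col, n), 6), round(var_haberman(p2, x_col, n), 6))
```

Under these assumptions the two variance functions should agree to floating-point precision, and the rounded results should match the reported .3975, .4617, .000174, and .000171.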

$$S_{12} = -(1/600)\begin{bmatrix} 8000 & 8000 & 7200 & 8800 & 16000 \\ 12000 & 12000 & 10800 & 13200 & 24000 \\ 8000 & 8000 & 7200 & 8800 & 16000 \\ 15000 & 15000 & 13500 & 16500 & 30000 \\ 17000 & 17000 & 15300 & 18700 & 34000 \end{bmatrix},$$

and

$$a'S_{12}b = (1/600)(.95, .70, .50, .20, .05)\,(S_{12})\begin{bmatrix} .90 \\ .80 \\ .60 \\ .30 \\ .10 \end{bmatrix}(1/600) = -.000306 = \text{sample } \mathrm{Cov}[r(1), r(2)].$$

Finally, we can find

$$\mathrm{Var}[r(1) - r(2)] = .000174 + .000171 - 2(-.000306) = .000956$$

which yields the standard error of $r(1) - r(2)$ as

$$\text{s.e.}[r(1) - r(2)] = .0309.$$

Furthermore, if we wish to test the null hypothesis

$$H_0: P(1) - P(2) = 0,$$

the large-sample theory test statistic, based on the central limit theorem, is the Z-statistic given by

$$Z = [r(1) - r(2)]/\text{s.e.}[r(1) - r(2)].$$

This is found to be $Z = (.3975 - .4617)/.0309 = -2.08$, which, of course, would be used in making final marketing decisions.

Finally, it might be argued that a simple nonparametric sign test should be preferred because of the calculations required for the proposed Z-test. Our judgment is that the sign test would be unreasonably conservative by ignoring the multinomial distribution aspects of the data. However, for comparison, we analyzed the sample data using the sign test (Gibbons 1976). Define a plus if a respondent scores product 1 as higher in purchase intent than product 2, a minus if lower, and zero otherwise. We then see 90 pluses, 70 minuses, and 440 zero differences, which are ignored. These, in turn, yield

$$Z = \frac{70 - 80}{\sqrt{(.5)(.5)160}} = -1.58.$$

Depending on the researcher's significance (or confidence) level, the conclusion could obviously differ here from that obtained from the $Z = -2.08$.

As a further comparison, these same numbers were subjected to a Wilcoxon signed rank test (Gibbons 1976). Though dubious at best (see, e.g., Mullet 1983 where the following non-uniformity is demonstrated empirically), the purchase intents were uniformly assigned higher values from +5 for DWB to +1 for DWNB. After eliminating zero differences and adjusting for the numerous tied differences, we find the large-sample approximate $Z = -.36$. As Gibbons points out, because of the large number of tied absolute differences this value is not especially accurate, but tables do not exist for the reduced $n = 160$.

Thus, it seems that a nonparametric approach to the problem lacks statistical power in relation to the preceding weighted parametric method.

SUMMARY AND CONCLUSIONS

Using the classic k-point purchase intent scale weighted by the assumed known probability of actual purchase, we developed the estimators of the actual probabilities of purchase and the attendant standard errors for the case of two products evaluated by a single sample. An illustration for two national brands is presented.

The correct standard error for the difference of the two sample probabilities of purchase, though perhaps imposing to calculate, is of great value, given the extent of the typical research effort.

REFERENCES

Aaker, D. A. and G. S. Day (1980), Marketing Research: Private and Public Sector Decisions. New York: John Wiley & Sons, Inc.
Churchill, G. A., Jr. (1979), Marketing Research: Methodological Foundations, 2nd ed. Hinsdale, IL: Dryden Press.
Dixon, W. J. and F. J. Massey, Jr. (1969), Introduction to Statistical Analysis, 3rd ed. New York: McGraw-Hill Book Company.
Gibbons, Jean D. (1976), Nonparametric Methods for Quantitative Analysis. New York: Holt, Rinehart and Winston.
Green, P. A. and D. S. Tull (1978), Research for Marketing Decisions, 4th ed. Englewood Cliffs, NJ: Prentice-Hall, Inc.
Haberman, S. J. (1978), Analysis of Qualitative Data, Volume I, Introductory Topics. New York: Academic Press, Inc.
Hogg, R. V. and A. T. Craig (1978), Introduction to Mathematical Statistics, 4th ed. New York: Macmillan.
Morrison, D. G. (1979), "Purchase Intentions and Purchase Behavior," Journal of Marketing, 43 (Spring), 65-74.
Mullet, Gary M. (1983), "Itemized Rating Scales: Ordinal or Interval?" European Research, 11 (April), 49-52.
Peterson, R. A. (1982), Marketing Research. Plano, TX: Business Publications, Inc.
Smith, R. E. and W. R. Swinyard (1983), "Attitude-Behavior Consistency: The Impact of Product Trial Versus Advertising," Journal of Marketing Research, 20 (August), 257-67.
Wentz, W. B. (1979), Marketing Research: Management, Method and Cases, 2nd ed. New York: Harper and Row.
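The covariance, standard error, and Z-statistic of the illustration can be retraced numerically. The sketch below is only an illustration under stated assumptions: variable and function names are hypothetical, the data are the illustration's table and diary probabilities, and each element of S12 is computed exactly as the text states it, -(1/N) times the product of the two marginal totals. Because every element factorizes in that way, the quadratic form in equation 16 collapses to -r(1)r(2)/N, which the code also checks.

```python
import math

# Illustration data from the text: rows = product 1, columns = product 2.
counts = [
    [60, 10, 10, 0, 0],
    [10, 80, 20, 10, 0],
    [10, 10, 50, 0, 10],
    [20, 0, 10, 90, 30],
    [0, 0, 0, 10, 160],
]
p1 = [0.95, 0.70, 0.50, 0.20, 0.05]  # known P_i. (product 1)
p2 = [0.90, 0.80, 0.60, 0.30, 0.10]  # known P_.j (product 2)

n = sum(sum(row) for row in counts)
x_row = [sum(row) for row in counts]        # marginal totals x_i.
x_col = [sum(col) for col in zip(*counts)]  # marginal totals x_.j
k = len(p1)


def estimate_and_variance(p, x):
    """Return (r, var[r]) using equations 1-2 and 10-11."""
    q = [xi / n for xi in x]
    r = sum(pi * qi for pi, qi in zip(p, q))
    var = (sum(pi ** 2 * qi for pi, qi in zip(p, q)) - r ** 2) / n
    return r, var


r1, var1 = estimate_and_variance(p1, x_row)
r2, var2 = estimate_and_variance(p2, x_col)

# Weight vectors a' and b', and S12 with (i, j) element -(1/N) x_i. x_.j,
# following the element formula as it is stated in the text.
a = [p / n for p in p1]
b = [p / n for p in p2]
s12 = [[-(xi * xj) / n for xj in x_col] for xi in x_row]

# Equation 16: s[r(1), r(2)] = a' S12 b
cov = sum(a[i] * s12[i][j] * b[j] for i in range(k) for j in range(k))

# The factorized elements imply a' S12 b = -r(1) r(2) / N.
shortcut = -r1 * r2 / n

# Equation 5 with sample estimates, then the standard error and Z.
var_diff = var1 + var2 - 2 * cov
se = math.sqrt(var_diff)
z = (r1 - r2) / se

print(f"cov = {cov:.6f}, shortcut = {shortcut:.6f}")
print(f"var_diff = {var_diff:.6f}, se = {se:.4f}, z = {z:.2f}")
```

Carrying unrounded intermediate values, as this sketch does, should give results that agree with the reported -.000306, .000956, and .0309 to the printed precision, with Z close to the reported -2.08.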

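The sign-test comparison in the illustration can be checked the same way. This is a minimal sketch with hypothetical variable names, assuming the table's categories run from strongest intent (DWB, index 0) to weakest (DWNB, index 4), so that a respondent counts as a plus when the product 1 row index is smaller than the product 2 column index.

```python
import math

# Illustration data from the text: rows = product 1, columns = product 2,
# both ordered DWB, PWB, M/MNB, PWNB, DWNB.
counts = [
    [60, 10, 10, 0, 0],
    [10, 80, 20, 10, 0],
    [10, 10, 50, 0, 10],
    [20, 0, 10, 90, 30],
    [0, 0, 0, 10, 160],
]
k = len(counts)

# Plus: product 1 rated higher in intent than product 2 (row index < column index).
plus = sum(counts[i][j] for i in range(k) for j in range(k) if i < j)
# Minus: product 1 rated lower (row index > column index).
minus = sum(counts[i][j] for i in range(k) for j in range(k) if i > j)
# Zero differences sit on the diagonal and are dropped from the test.
zeros = sum(counts[i][i] for i in range(k))

m = plus + minus  # reduced sample size after dropping zero differences
# Large-sample normal approximation under a 50/50 null, using the minus count
# as in the text's calculation.
z_sign = (minus - 0.5 * m) / math.sqrt(0.5 * 0.5 * m)

print(plus, minus, zeros, round(z_sign, 2))
```

With the illustration data this should give 90 pluses, 70 minuses, 440 zero differences, and a Z of about -1.58, as reported.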