Article info

Article history:
Received 7 December 2012
Received in revised form 25 February 2013
Accepted 25 February 2013
Available online 7 March 2013

Keywords:
Food quality
Food labelling
Standard ranking
Best-worst scaling
Hierarchical Bayesian estimation
R-index

Abstract

The use of food labelling to convey information about product and process quality and for product differentiation purposes has multiplied. In order to judge the relevance attributed by consumers to such information, valid measurement methods are needed. Such methods are also needed to reveal the probabilistic nature of preference data, so that heterogeneity can be explicitly accounted for. A survey among Swedish residents (n = 506) compared attribute importance rankings for labelling of beef from two formats of best-worst scaling (BWS) with those from standard direct ranking (DR). A choice probability R-index measure was modelled to make the methodological comparison consistent. While earlier studies on labelling of beef were confirmed, BWS and DR did not concur when directly compared, even when using the R-index. BWS improved individual choice predictions compared with DR, and generated a more consistent dominance ordering of attribute importance. These findings suggest that methods used to elicit importance weights or preference ranking may violate transitivity and dominance requirements.
© 2013 Elsevier Ltd. All rights reserved.
0950-3293/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.foodqual.2013.02.005
78 C.J. Lagerkvist / Food Quality and Preference 29 (2013) 77-88
To compare two methods for measuring attribute importance for beef labelling and packaging information attributes in such a way that the underlying importance dimension could be isolated.

To examine the ability of the two methods to generate dominance order of attribute importances and thereby compare the approach by which individual importances are aggregated with a probabilistic approach by adapting the non-parametric R-index method. This approach avoids the cancelling-out effect associated with the traditional way of analysing choice data pooled across respondents by instead making data comparable in terms of choice probabilities. The R-index approach was originally developed to represent measures of discrimination for use in food and sensory research (O'Mahony, 1992).

2. Overview of labelling information processing research

Food packaging provides an attractive way to convey cues for product differentiation to potential consumers and such visual and informational elements have been identified as potentially affecting consumer purchasing decisions (Silayoi & Speece, 2007). Extrinsic credence and non-credence informational elements then relate to information about the product as such, as well as about the processing or packaging technology in relation to the product (e.g. vacuum packaging for presenting meat).

Consumer research into the information search process includes three dimensions: (a) the amount of information sought (the depth) (Bettman, 1979), (b) the type of information examined (the content) (e.g. Jacoby, Speller, & Kohn-Berning, 1974), and (c) the order or pattern by which information is acquired (the sequence) (e.g. Bettman & Jacoby, 1976). The decision-making process used by consumers has been described as a phased process, in which selective attention to the stimuli to which individuals are exposed is used to sort out the depth of information based on certain criteria (e.g. relevance), after which selected information is examined for content (e.g. Srinivasan, 1988).

Regarding the amount of information sought, the literature suggests that relatively few informational items are used by consumers as a basis for their purchasing decisions and that relevance may be predicted by relatively few pieces of information (e.g. Olson and Jacoby, 1972).

Regarding the type of information examined, the type to which consumers are likely to attribute importance is of interest. Early findings suggested that brand and price attributes are exceedingly important and also verified the existence of dominance relations (Jacoby, Olson, & Haddock, 1971; Stokes, 1973). However, findings related to meat labelling are not congruent: Bernués, Olaizola, and Corcoran (2003) noted that the preferences for labelling are likely to be diverse due to the existence of quite heterogeneous views on quality. While little research seems to be available concerning the likely importance of labelling information, one exception is Verbeke and Ward (2006), who used an ordered probit model to assess the impact of individual and labelling characteristics.

As the number of informational cues increases, the interplay and meaning in the interpretation of information elements becomes a complicated process and consumers can be expected to use a set of choice heuristics when choosing and comparing products based on information provided (Bettman, 1979). While some heuristics are based on comparison by brand, others such as elimination by aspects imply processing by attributes, and others imply processing of attribute dominance through paired comparison (Russo & Rosen, 1975), which means that there is a transition involved. In this respect, the existence and use of higher order informational elements, so-called chunks (Simon, 1974), as aggregated quality cues, may provide more accurate meaning of importance, or may provide more saliency and as such establish relative attribute dominance relations within information sets (Jacoby et al., 1971).

Furthermore, lack of attribute discrimination and use of choice heuristics may exist due to lack of attribute attractiveness or to choice task complexity (Hensher, 2006). It is therefore important that methods devised to measure the same dimension of attribute importance do not differ in understanding by respondents. In essence, the ability to assimilate and understand labelling information and the experimental conditions could have implications for the statistical validity when comparing methods devised to measure attribute importance.

Moreover, if attribute importance measures are to be used to inform research, practice and policy, it is important that issues surrounding adherence to basic axioms in decision theory are explored. Transitivity (which depicts consistency on preference relations) represents such a key normative axiom (von Neumann & Morgenstern, 1947). In addition, dominance (which defines an attribute as best when its preference intensity equals or exceeds the preference intensity of every other attribute) and invariability (a requirement for the preference order between two prospects not to depend on the way they are described or captured) represent the two principles in rational choice theory (Kahneman & Tversky, 1984).

3. Overview of stated direct methods to assess attribute importance

Louviere and Islam (2008) found high agreement within direct or indirect methods for measuring the concept of importance used in judgment and decision making, but reported divergence between direct and indirect methods. In addition, indirect methods, i.e. those inferring importance through an outcome measure such as choices, were found to be more susceptible to context effects, while direct methods were not. Existing direct measurement methods such as rating, ranking or best-worst scaling and indirect methods such as conjoint or trade-off models that typically define importance as the difference between the least and most liked, or between a base level and some increment, can be used to impose importances from a set of attributes under study on an ordered common scale. However, even when data are estimated at the individual level, the results are often interpreted at an aggregated level and may be taken as inputs to marketing or policy decisions as such.

This study compared two prominent methods for measuring attribute importance: direct ranking (DR) and best-worst scaling (BWS). While traditional rating scales provide a way to obtain quantitative data for assessment of attribute importance through multivariate methods, several potential response biases (e.g. social desirability bias, acquiescence bias, extreme response bias), as identified by Paulhus (1991), provide clear disadvantages. In addition, due to cultural equivalence problems (e.g. related to terms of gradation), rating measures typically have meaning only within a specific experimental context.

Within the extant literature attribute importance has been distinguished to include three dimensions: salience, relevance and determinance (Myers & Alpert, 1968, 1977), but a lack of convergent validity of available methods for measuring attribute importance has been noted and attributed to the fact that different methods measure different dimensions of attribute importance (Van Ittersum, Pennings, Wansink, & Trijp, 2007, p. 1178).

The DR and BWS methods are similar in not explicitly identifying sub-levels of included attributes. Instead, in both of these methods people are asked to assess the importance of attributes through an information search and judgment process, while attribute information is held at a general level. According to Myers and Alpert (1968), people rely on their personal values and desires
when responding to attribute information held at a more general level, hence providing their assessment of attribute relevance. Therefore, as the methods chosen for comparison in this study measure the same sub-dimension of attribute importance, construct and convergent validity could be anticipated. In addition, since neither of these methods uses verbal anchors, they are representative of methods which should be applicable across cultures (Hein, Jaeger, Carr, & Delahunty, 2008).

3.1. Direct ranking

Direct ranking is a non-forced choice method by which preference intensity is obtained by directly comparing alternative attributes. It is based on direct consumer behaviour and therefore eliminates the variance from combining individual rating data. For each respondent, the boundary between 1st and 2nd, 2nd and 3rd, etc. is unambiguous, so the spacing of numbers is not an issue as it would be when making numerical estimates using rating scales (Lee & Van Hout, 2009). At the individual level, the revealed ranking is then based directly upon respondents' behaviour, unlike results obtained by application of multivariate methods to rating data. When aggregating ranking data across individuals, rank sums, or alternatively mean ranks, are capable of showing not only the rank order, but also the distance between attribute ranks. Importantly, however, use of rank sums as an indicator of product attribute importances comes with the caveat of being unable to address heterogeneity across respondents, as such effects are averaged out when aggregating ranks over individuals and expressing attribute importances by mean ranks or by order of rank sums.

3.2. Best-worst scaling

methods for the measurement of attribute importance, BWS has been found to generate greater respondent discrimination, providing more accurate information in sensory testing, while viewed by panellists as the most demanding method, requiring several successive choice tasks to be completed (Hein et al., 2008). In addition, compared with ranking techniques, use of BWS may be susceptible to attribute non-attendance because of task (non-)attractiveness or complexity (Hensher, 2006), e.g. in responding to a set in which the cognitive burden of deciding upon importance becomes too difficult when all attributes are considered to be vital, resulting in lack of discrimination.

While a growing body of research has emerged around BWS, little attention has been paid to fulfilment of the qualitative principles that should govern the preferences of rational choice. For BWS the lack of complete transitivity is inherent in each choice task, at the individual level. To illustrate this, consider a choice set of five attributes (A-E) providing data on two end-points, say attribute A as best and attribute E as worst. Such a choice reveals that A ≻ B, C, D and E, and that B ≻ E, C ≻ E and D ≻ E, but no information about the transitivity of preferences concerning attributes B, C and D can be inferred. Successive choice tasks would reveal further transitive pairs, but there is clearly no guarantee of the existence of a transitive relationship over the set of attributes included.

However, dominance and invariance are the two fundamental principles in all analysis of rational choice. Dominance with respect to attribute importance would require attribute A to be at least as important as attribute B in every respect and more important than B in at least one respect. Invariance would require two choice formats that are recognised as equivalent as regards the importance domain they tap to elicit the same preference order even when shown separately.
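The five-attribute illustration of a single best-worst choice can be sketched in a few lines of Python. The helper names `implied_pairs` and `undetermined_pairs` are hypothetical and not part of the study; the sketch assumes strict preference and only enumerates which pairwise relations one choice of a best and a worst attribute reveals.

```python
from itertools import combinations


def implied_pairs(choice_set, best, worst):
    """Return the (preferred, less_preferred) pairs revealed by one best-worst choice."""
    # The best item is preferred to every other item in the set.
    pairs = {(best, item) for item in choice_set if item != best}
    # Every item other than best and worst is preferred to the worst item.
    pairs |= {(item, worst) for item in choice_set if item not in (best, worst)}
    return pairs


def undetermined_pairs(choice_set, best, worst):
    """Return the pairs about which the single choice reveals nothing."""
    known = implied_pairs(choice_set, best, worst)
    return [
        (x, y)
        for x, y in combinations(choice_set, 2)
        if (x, y) not in known and (y, x) not in known
    ]


attributes = ["A", "B", "C", "D", "E"]
revealed = implied_pairs(attributes, best="A", worst="E")
unknown = undetermined_pairs(attributes, best="A", worst="E")
```

With A as best and E as worst, seven of the ten possible pairs are revealed, and the three pairs among B, C and D remain undetermined, which is why transitivity among those attributes cannot be checked from this task alone.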
5. Materials and methods

5.1. Stimulus material: labelling of pre-packaged beef

EU Regulation 1760/2000 requires mandatory labelling of all non-minced beef and beef products sold in the EU with an individual reference or code number referring to the specific animal and a licence number for the slaughterhouse, in order to allow for traceability to the country where slaughter took place and where cutting was performed. Furthermore, labelling must include information about the country where the animal was born, fattened and slaughtered. If these aspects coincide, this information can be pooled into one heading, "origin". Imported beef to be sold within the EU must be labelled "Origin: non-EC", followed by the name of the third country. Any further voluntary information to be provided on the product package must be specified, for accountability, and sent for approval to the relevant authority of the Member State in which the beef is to be sold.

Beside including the attributes stipulated in EU Regulation 1760/2000, the present study included further attributes corresponding to the 10 main requirements within EU Directive 2000/13/EC with annexes (Official Journal L 109, 6.5, 2002, pp. 29-42) on approximations of the laws of the Member States relating to labelling, presentation and advertising of foodstuffs. In addition, the main European Directive for nutrition and health claims (Council Directive 1924/2006/EC; OJ L 404, 30.12.2006, p. 9) in all cases requires substantiation based on scientific evidence and in some Member States prior authorisation. A detailed presentation of the EU labelling requirements can be found in Cheftel (2005).

The labelling attributes in the present study also introduced an alternative labelling scheme on origin by which a set of attributes regarding the geographical zone where animals were born, fattened and slaughtered and where meat was packaged was used, as inside or outside the EU, but not specific country (Table 1). The reason for this was to examine current policy proposals.

Additional attributes were developed based on focus group discussions and an extensive literature review. Four separate focus group sessions (total n = 31; male = 14, female = 17) were conducted by a marketing research company to identify the food labelling attributes that received attention during beef purchasing by various age groups (younger 20-44; older 45-75) and by type of resident (larger city (Stockholm); non-urban (City of Enköping)). Participants expressed concerns about trust in labelling information related to intrinsic product quality characteristics and the final
Table 1
Description of the 30 food quality attributes used in the survey by rank, and mean of importance weight as percentages.

| No. | Attribute | Rank: standard ranking | Rank: relative best-worst scaling | Rank: anchored best-worst scaling | Attribute importance: relative best-worst scaling (%) | Attribute importance: anchored best-worst scaling (%) |
|---|---|---|---|---|---|---|
| 1 | Environmental impact of the livestock production^a | 19 | 20 | 20 | 2.82 | 0.96 |
| 2 | Extent of good animal welfare for the livestock production^a | 8 | 6 | 6 | 6.99 | 6.88 |
| 3 | Health impact from consumption of beef^a | 24 | 23 | 23 | 0.86 | 0.85 |
| 4 | Extent of social responsibility for the livestock production^a | 17 | 16 | 17 | 1.41 | 1.45 |
| 5 | Information about organic production^a | 15 | 13 | 14 | 1.87 | 1.95 |
| 6 | Information about if the animal received preventative medication or not^a | 9 | 5 | 5 | 7.14 | 7.14 |
| 7 | Traceability to group or specific animal | 18 | 18 | 18 | 1.32 | 1.30 |
| 8 | Country where the animal was born | 6 | 9 | 9 | 3.87 | 3.79 |
| 9 | Country where the animal was fattened/bred | 1 | 2 | 2 | 10.96 | 11.56 |
| 10 | Country where the animal was slaughtered | 5 | 10 | 10 | 3.19 | 3.10 |
| 11 | Traceability to specific slaughterhouse | 13 | 14 | 13 | 1.84 | 1.95 |
| 12 | Country where the meat was cut | 10 | 15 | 15 | 1.69 | 1.61 |
| 13 | Country where the meat was packaged | 11 | 19 | 19 | 1.19 | 1.16 |
| 14 | Ingredients | 12 | 11 | 11 | 2.89 | 2.84 |
| 15 | Nutrient value per 100 g | 22 | 25 | 26 | 0.74 | 0.68 |
| 16 | Recommended storage temperature | 23 | 29 | 29 | 0.65 | 0.61 |
| 17 | Storage instructions and expiry date for opened package | 16 | 12 | 12 | 2.88 | 2.80 |
| 18 | Date of minimum durability (expiry) | 2 | 1 | 1 | 15.28 | 15.82 |
| 19 | Date of packaging | 3 | 4 | 4 | 8.48 | 8.18 |
| 20 | Method used to tender the meat | 27 | 28 | 27 | 0.67 | 0.68 |
| 21 | Brand | 25 | 27 | 28 | 0.68 | 0.64 |
| 22 | Price | 4 | 3 | 3 | 9.47 | 9.24 |
| 23 | Weight (kg) | 7 | 7 | 7 | 5.32 | 5.41 |
| 24 | Geographical zone where the animal was born (inside or outside the EU, but not specific country) | 28 | 24 | 25 | 0.74 | 0.68 |
| 25 | Geographical zone where the animal was fattened/bred (inside or outside the EU, but not specific country) | 21 | 17 | 16 | 1.40 | 1.46 |
| 26 | Geographical zone where the animal was slaughtered (inside or outside the EU, but not specific country) | 26 | 21 | 21 | 0.88 | 0.91 |
| 27 | Geographical zone where the meat was cut (inside or outside the EU, but not specific country) | 29 | 30 | 30 | 0.59 | 0.56 |
| 28 | Geographical zone where the meat was packaged (inside or outside the EU, but not specific country) | 30 | 26 | 24 | 0.72 | 0.71 |
| 29 | Traceability to specific breeder | 14 | 8 | 8 | 4.40 | 4.17 |
| 30 | Type of animal feed given during raising of the animal | 20 | 22 | 22 | 0.86 | 0.88 |
| | Average percentage certainty (Chance ratio) | | | | 50.3% (2.51) | 51.7% (3.10) |

^a Verified by government authority or EU body.
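The chance ratios in the last row of Table 1 can be checked with simple arithmetic. The sketch below assumes, as stated in the data analysis section, that the chance ratio is the average percentage certainty divided by the hit rate of a pure chance model, with five choice options in the relative BWS format and six in the anchored format. The function name `chance_ratio` is hypothetical.

```python
def chance_ratio(avg_pct_certainty, n_options):
    """Average percentage certainty divided by the hit rate of a pure chance model."""
    chance_pct = 100.0 / n_options  # e.g. 20% for one out of five options
    return avg_pct_certainty / chance_pct


# Average percentage certainty values as reported in Table 1.
rbws_ratio = chance_ratio(50.3, n_options=5)  # relative BWS, five options
abws_ratio = chance_ratio(51.7, n_options=6)  # anchored BWS, six options
```

Both values come out close to the reported 2.51 and 3.10; small differences in the last digit depend on whether the anchored chance level is taken as exactly 1/6 or as the rounded 16.7%.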
Fig. 1. Example of anchored best-worst (BW) questions used in the consumer survey.
Fig. 2. Illustration of a given response matrix for the computation of R-index indicating choice probability.

analytical hierarchy process, which allows the overall importance of each attribute to be derived (Saaty, 1977).

5.3.2. Best-worst scaling

To implement the BWS, as shown in Fig. 1, respondents were, for each set of attributes, asked to choose the BW labelling combination and then to choose between the labelling features and the dual-response question. Choice situations in the BWS experiment were specified using the MaxDiff designer v.2.0.2. (Sawtooth Software). A highly randomised design including 300 versions was employed to reduce issues related to scale effects, as these can be expected to be more pronounced with few versions of a questionnaire. The balanced design meant that for each version, 18 choice sets were generated, with each set including five information attributes. Each attribute was presented three times per version. A two-way balance was favoured in the design, which meant that the design was directed towards how often paired combinations of the attributes appeared together, and each pair of attributes appeared together on average 0.4 times. The hierarchical Bayesian method used to estimate attribute importance does not require perfect balance. Research by Orme (2005) directed the number of attributes to be combined into the choice set, as well as the number of displays of choice sets.

5.4. R-index for comparison of partial order of ranking and scaling data

While DR, RBWS and ABWS are theoretically consistent in terms of tapping the relevance dimension of attribute importance, a theoretically appropriate empirical measure of similarity in data retrieved from such measurement has been lacking to date. To measure the similarity between attribute importances and the extent of transitivity of each ranking method, we therefore adapted the R-index measure proposed by O'Mahony (1992).

Our adapted R-index measure expresses the pairwise probability ($R_{ij}$) that a given attribute i is preferred over another attribute j, with i = 1, ..., n and j = 1, ..., n. For R-index analysis, the frequencies for each rank, summed over all respondents, are counted and a response matrix is derived, as illustrated in Fig. 2. This computational approach is recommended when there may be heterogeneity in the ranking of choice alternatives (Lee & Van Hout, 2009). The pooled probability to which attribute 1 is preferred over attribute 2 is then obtained as $R_{12} = (A + B)/(A + 2B + C) \times 100$, where A is the total of attribute 1 preferred over attribute 2, B is the total of ties in

$$A = a_1(b_2 + \cdots + z_2) + b_1(c_2 + \cdots + z_2) + \cdots + w_1 z_2 \tag{1}$$

$$B = a_1 a_2 + \cdots + z_1 z_2 \tag{2}$$

$$C = a_2(b_1 + \cdots + z_1) + b_2(c_1 + \cdots + z_1) + \cdots + w_2 z_1 \tag{3}$$

When calculated pair-wise for all n attributes, an $n \times n$ matrix of R-index values is obtained (Fig. 3) with $R_{ii} = 100$. By rows, the R-index matrix includes the pair-wise probability that a given attribute is chosen as more important than any of the other attributes. These pair-wise probabilities then reflect the dominance relations between attributes. A strictly dominant attribute could be expected to be preferred to all other attributes (except itself), whereas an attribute revealing weak dominance would not be dominated by other attributes but would dominate at least one other attribute in terms of importance. Furthermore, by summing across rows in the R-index matrix, the sum of choice probabilities is obtained for each attribute, which can be used to assess the stochastic rank order of attributes. Such a measure would be directly comparable across ranking methods. By expressing choice probabilities, the R-index therefore provides further information for the analysis of choice data beyond the masking regression to the mean effect that is entailed in analysing importance weights pooled across respondents.

5.4. Data analysis

For the DR data, attribute rank sums obtained by aggregating ranks across respondents were used to rank attributes by importance.

Hierarchical Bayesian (HB) models were used to estimate locations of attributes on the underlying scale of importance from the BWS data. HB models have recently been shown to outperform aggregate methods (MNL) and latent class methods in estimations of B-W choice data related to food quality attributes (Lagerkvist et al., 2012). HB models can handle the presence of within- and between-respondent choice heterogeneity and offer the advantage of investigating the probability distribution of the parameters given the data, instead of the opposite as in random parameters logit models (RPL), which means that data quality is not lost in estimating a HB model. A further advantage of HB is the ability to generate individual specific data from sparse data sets.

Estimation of the HB models was performed here using CBCHB v.5.0.4 (http://www.sawtoothsoftware.com). The model was estimated through an iterative process in a Monte Carlo Markov Chain approach to obtain convergence. The estimation was carried out with 10,000 iterations before results were used and then with an additional 20,000 iterations to calculate importance weights for each interviewee. The ABWS data were coded in accordance with the procedure detailed in Appendix 4 of Sawtooth Software (2009). This coding means that no attribute choice information is lost when accounting for the dual-response question. Instead, the data file is augmented by additional choice tasks to recognise
either: (a) that all five attributes are preferred to the threshold (zero utility) with a new worst task, or (b) that all five attributes are inferior to the threshold by a new best task. The importance weighting of attribute k over the full set of attributes was obtained on a common ratio scale following the procedure outlined in Lagerkvist et al. (2012), from which ranks were obtained.

The average percentage certainty measure (Hauser, 1978), obtained as the difference between the log likelihood of each model and the log likelihood of a chance model, was used to assess model fit. In addition, a chance ratio measure (average percentage certainty divided by the chance) was used to compare the predictive accuracy between model specifications. For the relative BWS a chance model had a predictive power of 20% (one out of five choice options) while in the anchored BWS study, a chance model had a predictive power of 16.7% (one out of six choice options).

For R-index analysis, the direct ranking method provided all necessary input, while for the BWS methods the individual ranking of attributes was obtained through the estimated location of each attribute on the underlying interval scale of importance and then transformed onto the common ratio scale. Critical values for testing the significance of the R-index were obtained from Bi and O'Mahony (2007). A two-tailed test was used, since the comparison was whether attribute i or attribute j, in the comparison, was to be chosen as more important over the other for whatever attribute under consideration. The chance value for a $R_{ij}$ index was 50%.

Friedman's test for related samples was applied to test equality of distributions of attribute importances per attribute between direct ranking and BWS. Furthermore, the bivariate Spearman rank-order correlation coefficient, which indicates the degree of association between the rankings (Siegel & Castellan, 1988) was used for comparison of rankings across methods.

6. Results

6.1. Understanding of methods used to assess attribute importance

The distribution of responses about use of labelling information when buying beef is shown in Table 3. The majority of respondents indicated that they look at most or some, but not all, labelling information. These results are similar to findings on how often consumers read nutrition fact panel labels (Gracia, Loureiro, & Nayga, 2009). In addition, the majority of respondents were neutral or were found on the range between fairly easy and easy to express the importance of the type of beef labelling that they prefer. Hence, Table 4 shows the respondents' evaluation of response formats between DR and BWS. For each of the evaluative statements, an analysis of cross-classification using the crosstab (Bonferroni adjusted) Chi-square test was unable to reject the possibility that the evaluation differed by response format: ease of understanding ($\chi^2_{16} = 439.4$, P < 0.0001); understanding meaning of labelling alternatives ($\chi^2_{16} = 583.9$, P < 0.0001); ability to express what was important concerning labelling of beef ($\chi^2_{16} = 568.1$, P < 0.0001). Hypothesis H1, which stated that the understanding of methods used to measure importance of labelling attributes is equal, could thus not be rejected.

Table 3
Respondents' use of labelling information.

| Statement | Alternative | Share (%) |
|---|---|---|
| To what extent would you say that you look at the labelling information (on the package) when you buy beef today? | I look at all | 17 |
| | I look at most | 38 |
| | I look at some but not all | 35 |
| | I look at just a few | 9 |
| | I do not look at it | 1 |
| How do you find expressing the type of beef labelling information that is important to you? | Very easy | 7 |
| | Fairly easy | 35 |
| | Neither easy nor difficult | 25 |
| | Fairly difficult | 29 |
| | Very difficult | 4 |

Table 4
Respondents' evaluation of the response formats.

| Statement | Alternative | Direct ranking (%) | Best-worst scaling (%) |
|---|---|---|---|
| It was easy to understand how I should provide my choices | Disagree | 2 | 3 |
| | Partly disagree | 7 | 8 |
| | Neutral (neither disagree nor agree) | 19 | 19 |
| | Partly agree | 30 | 28 |
| | Agree | 42 | 42 |
| I understood the meaning of the labelling alternatives | Disagree | 2 | 1 |
| | Partly disagree | 7 | 8 |
| | Neutral (neither disagree nor agree) | 17 | 17 |
| | Partly agree | 34 | 39 |
| | Agree | 41 | 35 |
| I was able to express what was important for me concerning beef labelling | Disagree | 3 | 3 |
| | Partly disagree | 8 | 10 |
| | Neutral (neither disagree nor agree) | 20 | 20 |
| | Partly agree | 35 | 38 |
| | Agree | 33 | 30 |

6.2. Comparison of attribute importance between direct ranking and best-worst scaling

The ranks from DR data pooled across the respondents and the estimated attribute importances for the RBWS and the ABWS, which are based on 9108 observations (36 observations per respondent), are presented in Table 1. The choice frequencies for the alternative response categories 1, 2, 3, 4 and 5 were 21.45%, 19.71%, 18.68%, 19.77% and 20.38%, respectively, which indicates that the choices were balanced in relation to the experimental design. The anchored dual-response option "all five of these are important" was selected in total 2713 times (29.9%), while the threshold "some are important, some are not" was selected 5959 times (65.5%), leaving only 418 observations (4.6%) to the "none of these are important" alternative. Taken together, these results indicate that the majority of choices were made in a discriminatory way. The average percentage explained was 50.3% and 51.7% for the RBWS and ABWS models, respectively. This corresponds to a predictive accuracy of 2.51 and 3.10 times higher than a pure chance model, respectively. The estimated average variance was slightly higher for the ABWS model (10.07) than for the RBWS model (9.25). Likewise, the parameter root mean square was slightly higher for the ABWS model (3.52) than for the RBWS model (3.37). This indicates that the use of the dual-response task introduced some further heterogeneity in preferences and that the magnitude of the locations on the underlying interval scale of importance became slightly higher.

Visual inspection of the results in Table 1 suggests that the two ranking methods did not concur in rankings of attribute importances. Table 5 shows the results for the Friedman tests of equality
Table 5
Test of equality in medians of distributions of importance ranking and correlation between importance ranking by attributes.
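As a rough illustration of the rank-order correlation used for comparing methods, the sketch below applies the textbook Spearman formula for untied rankings to the aggregated attribute ranks listed in Table 1. This is only an illustration under that assumption: the per-attribute statistics in Table 5 rest on respondent-level data, which are not reproduced here, and the function name `spearman_rho` is hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-order correlation for two untied rankings of the same items."""
    n = len(ranks_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Aggregated ranks for attributes 1-30, copied from Table 1.
dr_ranks = [19, 8, 24, 17, 15, 9, 18, 6, 1, 5, 13, 10, 11, 12, 22,
            23, 16, 2, 3, 27, 25, 4, 7, 28, 21, 26, 29, 30, 14, 20]
rbws_ranks = [20, 6, 23, 16, 13, 5, 18, 9, 2, 10, 14, 15, 19, 11, 25,
              29, 12, 1, 4, 28, 27, 3, 7, 24, 17, 21, 30, 26, 8, 22]
abws_ranks = [20, 6, 23, 17, 14, 5, 18, 9, 2, 10, 13, 15, 19, 11, 26,
              29, 12, 1, 4, 27, 28, 3, 7, 25, 16, 21, 30, 24, 8, 22]

rho_dr_rbws = spearman_rho(dr_ranks, rbws_ranks)
rho_dr_abws = spearman_rho(dr_ranks, abws_ranks)
rho_rbws_abws = spearman_rho(rbws_ranks, abws_ranks)
```

On these aggregated ranks the two BWS formats agree with each other more closely than either agrees with DR, while all three correlations remain strongly positive.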
in mean ranks of pair-wise differences for the ranking of each respective attribute by DR, RBWS and ABWS, respectively. Although not conclusive, there are rather strong indications that the distributions of ranking differed more between DR and BWS, while the two BWS methods had more similarities between distributions of rankings. However, a pair-wise comparison of the Spearman (rho) rank-order correlation coefficient indicated that the associations between the rankings were significantly positive (P < 0.0001) and strong. Hence, hypothesis H2, that ranking of attribute importance is invariant to elicitation format, cannot be strictly rejected. This observed degree of association is an important aspect of the structural reliability of the measurement methods.

Furthermore, the Friedman test rejected equality between the distributions of attribute importance weights, except for three attribute pairs: price (test statistic = 0.198, P = 0.657); date of minimum durability (test statistic = 0.387, P = 0.534); and country where the animal was fattened/bred (test statistic = 3.83,
Fig. 4a. Frequency by ranks for the class of attributes with a strong indication of being selected as most important. Data from anchored best-worst scaling.
Fig. 4b. Frequency by ranks for the class of attributes with a strong indication of being selected as least important. Data from anchored best-worst scaling. Note: "method used to tender the meat" refers to type of package.
P = 0.05). On average, ABWS gave greater heterogeneity than RBWS and the heterogeneity was less pronounced for attributes of higher importance than for those of lower importance (minABWS = 1.10; minRBWS = 1.07 (attribute 19), maxABWS = 4.05; maxRBWS = 3.83 (attribute 20)). Hypothesis H3, that attribute discrimination does not differ depending on the use of forced choice (RBWS) or non-forced choice (ABWS), could therefore be rejected. This means that a non-forced method suggests larger heterogeneity in attribute importances than a forced choice method.

6.3. Comparison of choice probabilities and attribute order by R-index

Two classes of responses from ABWS data for distributions of frequency by rank are shown in Figs. 4a and 4b, respectively. Similar patterns were found for the other two methods. Fig. 4a shows the distributions for attributes which the majority of respondents strongly endorsed in terms of importance, while Fig. 4b shows the distributions for which there was strong lack of endorsement in terms of importance. For a third class (not shown), more even distributions of frequency by ranks were obtained. The results
Table 6
Rank, choice probabilities and attribute dominance based on R-index calculation.
Note: Prob. = total choice probability per attribute in percentage, i.e. sum of R-index by rows in the R-index matrix. AD = attribute dominance = number of attributes which the attribute dominates according to the attribute R-index at the 5% significance level. Diff. = rank of attribute minus AD.
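The quantities defined in the note to Table 6 can be sketched in code. The example below uses the generic R-index (the estimated probability that one attribute is ranked ahead of another, with ties counted as one half) computed from frequency-by-rank counts. It is not necessarily identical to the adapted procedure used in the study. The frequency counts, the attribute labels and the `critical` threshold are hypothetical; in practice the critical value would be taken from published significance tables such as Bi and O'Mahony (2007).

```python
def r_index(freq_a, freq_b):
    """Estimate P(A ranked ahead of B) from frequency-by-rank counts.

    freq_a[k] is the number of respondents who placed attribute A at rank k + 1
    (rank 1 = most important). Ties contribute one half.
    """
    n_a, n_b = sum(freq_a), sum(freq_b)
    wins = 0.0
    for i, fa in enumerate(freq_a):
        for j, fb in enumerate(freq_b):
            if i < j:            # A holds a better (lower) rank than B
                wins += fa * fb
            elif i == j:         # same rank: split the credit
                wins += 0.5 * fa * fb
    return wins / (n_a * n_b)

def dominance_counts(freqs, critical):
    """Count, per attribute, how many other attributes it dominates.

    `critical` is a placeholder threshold on the R-index, assumed here
    to stand in for a tabulated 5% critical value.
    """
    return {
        a: sum(1 for b in freqs if b != a and r_index(freqs[a], freqs[b]) > critical)
        for a in freqs
    }

# Hypothetical counts over three ranks for three attributes (10 respondents each).
freqs = {
    "date": [8, 2, 0],
    "price": [2, 6, 2],
    "brand": [0, 2, 8],
}
print(r_index(freqs["date"], freqs["price"]))
print(dominance_counts(freqs, critical=0.55))
```

A complete transitive ordering of n attributes would give dominance counts n - 1, n - 2, ..., 0, so the difference between an attribute's rank and its dominance count indicates how far a method departs from that theoretical order.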
displayed in Figs. 4a and 4b suggest that there may be more information within the choice data than is captured when additively aggregating attribute importance ranking or weights across respondents.
The attribute rankings and dominance relations based on choice probabilities as expressed by the R-index calculations are shown in Table 6. The results indicate that ranking of product attribute importance did not differ between DR and R-index ranking based on the DR data (r_s = 1, P < 0.0001). However, although attribute ranking between pooled importance weights and choice probabilities was very similar for RBWS (r_s = 0.949, P < 0.0001) and for ABWS (r_s = 0.964, P < 0.0001), the rankings did not concur. It was found that the same 10 attributes were ranked with the highest R-index, irrespective of measurement model. These attributes were also associated with the highest ranking when the data were analysed over rank sums and importance weights. However, for RBWS there were 27 preference reversals compared with the attribute importances obtained by pooling individual importance weights, whereas ABWS revealed 26 reversals. Hence, there is support for rejection of H5 being conditional upon measurement method.
Furthermore, the results in Table 6 show that the ability for discrimination of attribute importances differed across methods. To further highlight this, the total choice probability per attribute was plotted against the rank (Fig. 5). RBWS and ABWS revealed strong similarities, suggesting that these methods generate stronger tail discrimination (more emphasised kinks separating the most important and least important attributes). For DR the results revealed a more linear structure, indicating less polar discrimination. In addition, the choice probabilities revealed a different scope of differences in attribute importance than the importance weights from RBWS or ABWS. For example, attribute 18 (date of minimum durability) was 2 (=2404.9/1202.5) times more likely to be chosen as most important in comparison with attribute 20 (extent of social responsibility) from the R-index based on ABWS, while the difference in importance weights was 10.9 times (=15.82%/1.45%).
Finally, Table 6 shows the extent to which each of the methods established a dominant ordering of labelling attributes (an Appendix with R-index tables corresponding to Fig. 2 is available upon request). For RBWS and ABWS, the date of minimum durability was the only strictly dominant attribute, while as the least important in terms of choice probabilities, nutritional labels together with method used to package the meat (RBWS) and brand (ABWS) were the only strictly dominant attributes. For DR there was no strictly dominant attribute. The RBWS method showed the lowest deviation between rank based on choice probability and theoretical rank dominance, while DR showed the largest deviation. Hence, as regards hypothesis H6, none of the methods generated a complete transitive ordering of attributes.
7. Discussion
Irrespective of measurement method, the results related to importance of beef labelling attributes corroborate findings from earlier studies that the highest level of attribute importance is given to expiry date (Bernués et al., 2003; Verbeke & Ward, 2006) and that the related quality cue of packaging date is a top priority. Furthermore, information about country where the animal was fattened/bred or born, together with traceability to a specific breeder, apparently functioned as an important quality cue to consumers, while other country-of-origin and traceability attributes did not. The results also show that country-specific information is more important than information about geographical zone. Besides meeting stipulated labelling criteria, some food labels with little or no standards make further claims to certify a wide array of product quality and process characteristics concerning animal welfare, sustainability or environmental values, all of which may be strong marketing features for the food industry. In this respect, the results corroborate earlier findings that information cues about production processes related to animal welfare conditions are of high relevance (Tonsor & Wolf, 2011). Interestingly, the importance concerning preventative medication suggests that this designation of production methods could be perceived as an important extrinsic cue related to food safety.
The finding that price is considered of high importance, thus functioning as a quality indicator, while brand information is not, is consistent with the previous finding that consumers loyal to a brand pay less attention to price (Bronnenberg & Vanhonacker, 1996).
Fig. 5. Total choice probabilities per attribute by rank within the R-index method.
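The contrast drawn in Section 6.3 between choice probabilities and importance weights for attributes 18 and 20 (ABWS) amounts to two simple ratios. The short sketch below reproduces that arithmetic from the four figures quoted in the text; the dictionary keys and the `ratio` helper are names chosen here for illustration.

```python
# Figures quoted in Section 6.3 for ABWS: attribute 18 (date of minimum
# durability) versus attribute 20 (extent of social responsibility).
rindex_total = {"attr18": 2404.9, "attr20": 1202.5}    # R-index based totals
importance_weight = {"attr18": 15.82, "attr20": 1.45}  # importance weights, percent

def ratio(values, a, b):
    """How many times larger the value for attribute a is than for attribute b."""
    return values[a] / values[b]

prob_ratio = ratio(rindex_total, "attr18", "attr20")
weight_ratio = ratio(importance_weight, "attr18", "attr20")

print(f"choice-probability ratio: {prob_ratio:.1f}")
print(f"importance-weight ratio: {weight_ratio:.1f}")
```

The two measures agree on the direction of the difference but not on its size, which is the point made in the text: pooled importance weights spread the attributes much further apart than the R-index based choice probabilities do.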
At the lower end of attribute importances, those cues related to environmental impact, human health effects from meat consumption, social responsibility during rearing, and organic production were consistently ranked low. The literature shows mixed findings about information cues on the production process.
While the results presented here for labelling of beef (Table 1) hold specifically for Swedish consumers, they show that the discrimination among attribute importances was rather pronounced and that several of the label cues have, or would have, little importance to consumers, which is in line with results from Verbeke and Ward (2006). The sum of attribute importance for the 10 most important attributes was 75.1% and 75.3% for RBWS and ABWS, respectively.
As the methods used to elicit importance weights or preference ranking reveal violations of transitivity and dominance, this may indicate that people violate such basic axioms. If so, then existing models used by researchers and practitioners need to be revised to better describe human behaviour. A related issue, in the event of such deviations, is whether the observed pattern represents true deviations, is due to random errors, or is induced by shortcomings in the data generation process. The relatively large sample used in this study together with the highly randomised design should have reduced problems with scale factors, so the pattern was due to true deviations. In any case, the potential influence of choice heuristics for interfacing with, and contributing to, traditional preference analysis is demonstrated here. Considerable attention was focused on measuring but not explaining attribute attendance. An important further step would be to incorporate consumers' use of process-orientated procedures of attribute selection into the measurement models of importance weights to improve predictive accuracy.
8. Conclusions
This study compared the direct ranking (DR) and best-worst scaling (BWS) methods, which theoretically should relate to an underlying relevance dimension, in terms of their results with respect to attribute importance discrimination and dominance for beef labelling and packaging information attributes. The ability of the two methods to generate a dominance order of attribute importances was also compared using an adapted non-parametric R-index method.
Overall, the results confirmed previous findings suggesting that consumer information searches in relation to beef labelling focus on extrinsic quality cues. Regarding the depth of information search, the results suggested that as few as 10 attributes explained 75% of consumers' importance rankings in terms of relevance. This indicates that there is a hierarchy of labels which needs to be carefully considered, as consumers are increasingly exposed to a wider set of information cues as the product range and number of labelling attributes increase. There is thus a need for prioritisation, in which the results of this study can be useful.
Use of the R-index method, instead of importance measures using aggregation over individuals, illustrated how information contained in heterogeneous responses can contribute additional insights into labelling preferences. For DR, the ordering of attribute importance was identical between the use of rank sums and stochastic rank sums, but not so for BWS. This suggests that there is more choice information in BWS data than is typically reported from such studies. The measurement methods suffered from incomplete transitivity of attribute importances, although the two BWS methods were slightly better than DR in terms of deviations from the theoretical order. This is an important methodological issue on which future research is required. Based on the findings, it is recommended that R-index results be reported so that data quality can be ascertained for preference intensity studies.
The public policy considerations emanating from the results of this study are interesting. First, the findings suggest that there is an ordering of attributes within the concept of origin. Specification of country where the beef was fattened, slaughtered and born was found to be among the most important attributes, while other mandatory country-of-origin attributes, including traceability, ranked only of medium importance. Furthermore, the alternative labelling of origin by geographical zone (inside or outside the EU but no specification of country) was among the lowest ranked attributes, suggesting that the current legislation on labelling is acceptable.
Second, animal welfare and medication for preventative purposes were among the top ranked attributes. This suggests that the ongoing work within the EU Commission to introduce a labelling scheme for animal welfare is well justified in relation to how relevant consumers find this information. Information about preventive medication can be expected to be related to concerns about animal welfare, but also about public health. Such concerns extend beyond practices within organic production and have come to be used in marketing, for example in restaurants. In addition, antibiotic residues are a clear concern to the public and in environmental policy.
Third, and related to the issue of pharmaceutical residues, the findings suggest that more extended quality cues (health claims, environmental impact, social responsibility, information on organic production) are not of general relevance to consumers. This does not mean that they are irrelevant, but from an informational search perspective they should be used to target specific consumer segments instead of all consumers.
Acknowledgements
Kristian Sundström and Helena Johansson from AgriFood Economics Centre in Lund, Sweden, contributed to the development of labelling attributes to be included in the study. Constructive comments and suggestions as well as assistance in data collection provided by NORM Nordic Market Research AB, Sweden, are greatly appreciated.
References
Bettman, J. R., & Jacoby, J. (1976). Patterns of processing in consumer information acquisition. In B. B. Anderson (Ed.), Advances in consumer research (Vol. 3, pp. 315–320). Provo, UT: Association for Consumer Research.
Bettman, J. R. (1979). An information processing theory of consumer choice. Reading, MA: Addison-Wesley.
Bernués, A., Olaizola, A., & Corcoran, K. (2003). Labelling information demanded by European consumers and relationships with purchasing motives, quality and safety of meat. Meat Science, 65, 1095–1106.
Bi, J., & O'Mahony, M. (2007). Updated and extended table for testing the significance of the R-index. Journal of Sensory Studies, 22(6), 713–720.
Bronnenberg, B. J., & Vanhonacker, W. R. (1996). Limited choice sets, local price response, and implied measures of price competition. Journal of Marketing Research, 33, 163–173.
Cheftel, J. C. (2005). Food and nutrition labelling in the European Union. Food Chemistry, 93, 531–550.
Combris, P., Bazoche, P., Giraud-Héraud, E., & Issanchou, S. (2009). Food choices: What do we learn from combining sensory and economic experiments? Food Quality and Preference, 20, 550–557.
Finn, A., & Louviere, J. J. (1992). Determining the appropriate response to evidence of public concern: The case of food safety. Journal of Public Policy and Marketing, 11, 12–25.
Gracia, A., Loureiro, M. L., & Nayga, R. M., Jr. (2009). Consumers' valuation of nutritional information: A choice experiment study. Food Quality and Preference, 20, 463–471.
Hauser, J. R. (1978). Testing the accuracy, usefulness and significance of probabilistic choice models: An information-theoretic approach. Operations Research, 26, 406–421.
Hein, K., Jaeger, S. R., Carr, B. T., & Delahunty, C. M. (2008). Comparison of five common acceptance and preference methods. Food Quality and Preference, 19, 651–661.
Hensher, D. (2006). How do respondents process stated choice experiments? Attribute consideration under varying information load. Journal of Applied Econometrics, 21, 861–878.
Jacoby, J., Olson, J. C., & Haddock, R. A. (1971). Price, brand name, and product composition characteristics as determinants of perceived quality. Journal of Applied Psychology, 55, 570–579.
Jacoby, J., Speller, D. E., & Kohn-Berning, C. A. (1974). Brand choice behavior as a function of information load: Replication and extension. Journal of Consumer Research, 1, 33–42.
Jaeger, S., Jørgensen, A. S., Aaslyng, M. D., & Bredie, W. L. P. (2008). Best-worst scaling: An introduction and initial comparison with monadic rating for preference elicitation with food products. Food Quality and Preference, 19, 579–588.
Kahneman, D., & Tversky, A. (1984). Choices, values and frames. American Psychologist, 39, 341–350.
Lagerkvist, C. J., Okello, J. J., & Karanja, N. (2012). Anchored vs. relative best-worst scaling and latent class vs. hierarchical Bayesian analysis of best-worst choice data: Investigating the importance of food quality attributes in a developing country. Food Quality and Preference, 25, 29–40.
Lee, H.-S., & Van Hout, D. (2009). Quantification of sensory and food quality: The R-index analysis. Journal of Food Science, 74, R57–R64.
Louviere, J. J., & Islam, T. (2008). A comparison of importance weights and willingness-to-pay measures derived from choice-based conjoint, constant sum scales and best-worst scaling. Journal of Business Research, 61, 903–911.
Mantel, S. P., & Kardes, F. R. (1999). The role of direction of comparison, attribute-based processing, and attitude-based processing in consumer preference. Journal of Consumer Research, 25, 335–352.
Marley, A. A. J., & Louviere, J. J. (2005). Some probabilistic models of best, worst, and best-worst choices. Journal of Mathematical Psychology, 49, 464–480.
Myers, J. H., & Alpert, M. I. (1968). Determinant buying attitudes: Meaning and measurement. Journal of Marketing, 32, 13–20.
Myers, J. H., & Alpert, M. I. (1977). Semantic confusion in attitude research: salience vs. importance vs. determinance. Advances in Consumer Research, 4, 106–110.
Olson, J. C., & Jacoby, J. (1972). Cue utilisation in the quality perception process. In M. Venkatesan (Ed.), Proceedings of the Third Annual Conference of the Association for Consumer Research (pp. 167–179). Chicago: Association for Consumer Research.
O'Mahony, M. (1992). Understanding discrimination tests: A user-friendly treatment of response bias, rating and ranking R-index tests and their relationship to signal detection. Journal of Sensory Studies, 7, 1–47.
Orme, B. (2005). Accuracy of HB estimation in MaxDiff experiments. Technical paper available at: <http://www.sawtoothsoftware.com>.
Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 17–59). New York: Academic Press.
Russo, J. E., & Rosen, L. D. (1975). An eye fixation analysis of multi-alternative choice. Memory and Cognition, 3, 267–276.
Saaty, T. L. (1977). A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology, 15, 234–281.
Sawtooth Software. (2009). Anchored scaling in MaxDiff using dual response. Available at: <http://www.sawtoothsoftware.com/download/techpap/dualresponsemaxdiff.pdf>. Last accessed 15.09.12.
Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill International Editions.
Silayoi, P., & Speece, M. (2007). The importance of packaging attributes: A conjoint analysis. European Journal of Marketing, 41, 1495–1517.
Simon, H. A. (1974). How big is a chunk? Science, 183, 482–488.
Statistics Sweden (2012). Annual population statistics. Available at: http://www.scb.se/BE0101-EN. Accessed 23.11.12.
Stokes, R. C. (1973). Unit pricing, differential brand density, and consumer deception. Washington, DC: Consumer Research Institute.
Srinivasan, V. (1988). A conjunctive-compensatory approach to the self-explication of multiattributed preferences. Decision Sciences, 19, 295–305.
Tonsor, G. T., & Wolf, C. A. (2011). On mandatory labelling of animal welfare attributes. Food Policy, 36, 430–437.
Van Ittersum, K., Pennings, J. M. P., Wansink, B., & Trijp, H. C. M. (2007). The validity of attribute-importance measurement: A review. Journal of Business Research, 60, 1177–1190.
Verbeke, W., & Ward, R. W. (2006). Consumer interest in information cues denoting quality, traceability and origin. An application of ordered probit models to beef labels. Food Quality and Preference, 17, 453–467.
von Neumann, J., & Morgenstern, O. (1947). Theory of games and economic behaviour (2nd ed.). Princeton: Princeton University Press.