
Judging wine quality: Do we need experts, consumers or trained panelists?
Helene Hopfer, Hildegarde Heymann
Department of Viticulture & Enology, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
Article info
Article history:
Received 12 July 2013
Received in revised form 27 September 2013
Accepted 8 October 2013
Available online 17 October 2013
Keywords:
Wine quality
Californian Cabernet Sauvignon
Descriptive Analysis
Hedonic liking
Consumers
Wine experts
Abstract
A Descriptive Analysis panel, wine experts and consumers evaluated 27 Californian Cabernet Sauvignon
wines with varying quality scores. Descriptive Analysis revealed several aroma and flavor descriptors
driving quality scores. For all consumer segments as well as the wine experts, hedonic liking was shown
to correlate highly with perceived quality, but for some consumers liking and perceived quality were not at
all correlated with the quality scores of the wines. Wine experts were able to find significant differences in
liking and quality, but did not agree completely with the assigned quality scores from the wine judgment.
Wine experts also used a combination of both descriptive and hedonic terms when describing a high
quality wine, indicating that they are better at communicating and describing what they like.
© 2013 Elsevier Ltd. All rights reserved.
1. Introduction
The quality of wine is hard to define, mainly due to the lack of
agreement on the quality term in general, and this discussion is not
limited to wine alone. People who study wine quality therefore talk
about perceived quality, and how various populations differ in their
wine quality perception (Charters & Pettigrew, 2007). The advantage
of using a holistic approach, e.g. quality perception, lies in the global
assessment of quality, which is the result of individuals' conceptions
and previous experiences, and incorporates all different levels of quality
into one judgment (Charters & Pettigrew, 2007). Nevertheless, the
overall quality perception can be broken into several dimensions of
extrinsic and intrinsic layers (Charters & Pettigrew, 2007; Verdú Jover,
Lloréns Montes, & Fuentes Fuentes, 2004). Extrinsic factors include
grape growing and winemaking, and, at a lower level, the technical
correctness, including the most basic definition of wine quality as
the absence of faults and/or drinkability. The intrinsic dimension is
defined more by the drinking experience, including factors such as
pleasure, aroma, flavor and mouthfeel, appearance, as well as factors
that are typically more important for people with a high involvement
such as origin, variety, typicality and potential.
When talking about the different dimensions of quality, one
needs to keep in mind that the two levels influence each other,
as shown by Siegrist and Cousin (2009), who found that extrinsic
information, such as wine critic scores, directly influences the
expectation and therefore also the tasting experience. Similarly,
consumers found significant differences in liking of Champagne
wines when they were able to see the labels but, in contrast, could
not differentiate among the same Champagne wines when they tasted
them blind (Lange, Martin, Chabanet, Combris, & Issanchou, 2002).
Consumers are influenced by extrinsic information; however,
they report that the intrinsic tasting experience is the most
important reason for drinking wine (Charters & Pettigrew,
2007), indicating the importance of flavor, defined by
the ASTM as the "... perception resulting from stimulating a combination
of the taste buds, the olfactory organs, and chemesthetic
receptors within the oral cavity ..." (ASTM International, 2009).
In the end, consumers of wine want to drink and enjoy quality
wine, a fact that is true for everyone independent of the degree
of wine involvement (Charters & Pettigrew, 2007). This also indicates
that perceived quality is linked to hedonic liking (Lawless,
Liu, & Goldwyn, 1997). However, average consumers, especially
those with a lower degree of wine involvement, do not
necessarily have the tasting experience and expertise to select
appropriate wines, and so turn towards wine experts and trusted
sources for guidance, followed by brand, awarded medals and
wine articles. They also tend not to use back and front labels
or store display information in their decision-making process
(Thach, 2008).
Ideally, wine experts screen wines and award some kind of
quality score, which would then give consumers an indication
whether they would enjoy and like a wine or not.
0950-3293/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.foodqual.2013.10.004
Corresponding author. Tel.: +1 530 752 9356; fax: +1 530 752 0382.
E-mail addresses: hhopfer@ucdavis.edu (H. Hopfer), hheymann@ucdavis.edu (H. Heymann).
Food Quality and Preference 32 (2014) 221–233
Journal homepage: www.elsevier.com/locate/foodqual
Experts are known to act more analytically when assessing
quality compared to inexperienced consumers (D'Alessandro &
Pecotich, 2013). However, as with every product, levels of liking
and also perceived quality show large variability, not only among
consumers, but also among wine experts (Hodgson, 2008, 2009).
Hodgson (2009) calculated that being awarded a Gold Medal in
one of the many wine judgments is simply a matter of how many
competitions you enter, as he could not find concordance in gold
medals awarded among the 13 U.S. wine competitions studied.
These factors and previous studies on perceived wine quality,
using either experts or consumers, set the stage for our study,
in which we evaluated a set of wines, varying in quality, with three
different populations (wine experts, trained panelists and consumers)
in an attempt to gain a broader understanding of perceived
wine quality in a set of commercial Cabernet Sauvignon wines
from California.
2. Materials and methods
The study was approved by the UC Davis Institutional Review
Board (IRB, protocol number 305379-2).
2.1. Samples
Twenty-seven Cabernet Sauvignon wines from 9 Californian
wine regions were selected for the study based on their perfor-
mance in the 2012 California State Fair Commercial Wine Compe-
tition. Any bonded winery can enter their grape or fruit product
grown in California in the competition. The entered wine must
be from a lot of at least 300 gal (i.e. 1135.62 L), and at least
240 gal (i.e. 908.50 L) of this lot must be available for sale
(http://www.bigfun.org/wp-content/uploads/2012/02/2012-Com-
Wine-Pros-4pages.pdf).
A total of 333 Cabernet Sauvignon wines were entered in the
competition in 2012, coming from 9 wine regions in California,
which are geographically designated and established by the official
legal body, the Alcohol and Tobacco Tax and Trade Bureau (TTB).
From each region three wines were selected: one considered high
in quality (i.e. the highest scoring wine, in most cases either a Gold
or a Double Gold wine, except for region H where the highest scoring
wine was a Silver medal (W27)), one low in quality (i.e. a "No
award" wine, scoring lowest in the region), and one wine of medium
quality (around the average point score between the high and the
low quality wine). For 7 out of the 9 regions wines from all three
quality categories could be acquired, with the exception of region
H (no Gold or Double Gold available) and region G (no "No award"
wine available). From region H we had two "No award" wines (W5
and W21), one Bronze wine (W7) and one Silver wine (W27). Wine
vintages varied between 2001 and 2011 (median = 2009), and retail
prices varied between $9.99 and $70.00 per bottle with a median
price of $26.95 (Table 1).
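The selection rule described above (highest scoring, lowest scoring, and the wine closest to the midpoint of the two) can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `select_three`, the tuple format and the example scores are hypothetical and are not taken from the competition data.

```python
def select_three(entries):
    """Pick (high, medium, low) wines for one region.

    entries: list of (wine_code, points) tuples (hypothetical input format).
    """
    if len(entries) < 3:
        raise ValueError("need at least three wines per region")
    high = max(entries, key=lambda e: e[1])   # highest competition score
    low = min(entries, key=lambda e: e[1])    # lowest competition score
    target = (high[1] + low[1]) / 2           # midpoint between high and low
    rest = [e for e in entries if e is not high and e is not low]
    medium = min(rest, key=lambda e: abs(e[1] - target))
    return high, medium, low


# Made-up scores for one region, for illustration only.
region = [("X1", 98), ("X2", 83), ("X3", 90), ("X4", 95)]
high, medium, low = select_three(region)
```

In practice the choice was also constrained by which wines could be acquired, which this sketch does not model.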
2.2. Descriptive Analysis (DA) panel
All wines were characterized by a generic Descriptive Analysis
(DA) (Lawless, 2010), using a panel of 15 trained judges (10 males;
Table 1
Wines used in the study together with their information (code, region, awarded points and medals in the wine competition, bottle retail price) and average hedonic liking (HL)
and quality (Q) scores for the consumers (cons) and experts (exp). Letters denote significant differences in HL and Q using 1-way ANOVA and post hoc analysis according to Tukey
(P ≤ 0.05). Columns that share the same letter are not significantly different from each other (P ≤ 0.05).

| Code | Vintage | Region (a) | Pts. | Awards (b) | Retail price ($) | HLcons | Qcons | HLexp |
|---|---|---|---|---|---|---|---|---|
| W1 | 2008 | G | 82 | NA | 26.95 | 4.10 e | 4.56 abc | 2.89 e |
| W2 | 2009 | B | 89 | S | 39.00 | 4.60 abcde | 4.99 a | 3.54 cde |
| W3 | 2009 | I | 95 | G | 21.00 | 4.86 abc | 4.91 a | 5.04 abcd |
| W4 | 2008 | G | 90 | S | 34.00 | 4.87 abc | 4.53 abc | 4.25 abcde |
| W5 (c) | 2006 | H | 83 | NA | 15.00 | 4.90 abc | 4.34 abc | 2.68 e; 3.68 cde |
| W6 | 2009 | C | 90 | S | 55.00 | 4.97 abc | 4.89 a | 5.30 abc |
| W7 | 2010 | H | 86 | B | 25.00 | 4.98 ab | 4.88 a | 5.00 abcd |
| W8 | 2008 | C | 98 | DG | 47.00 | 5.04 ab | 4.63 abc | 5.32 abc |
| W9 | 2009 | D | 94 | G | 25.00 | 5.24 a | 4.65 abc | 5.32 abc |
| W10 | 2009 | A | 94 | G | 9.99 | 4.10 e | 4.68 abc | 5.11 abcd |
| W11 | 2007 | A | 82 | G | 38.00 | 4.11 de | 4.39 abc | 3.86 bcde |
| W12 (c) | 2009 | F | 89 | S | 15.00 | 4.24 cde | 4.70 abc | 5.00 abcd; 5.25 abc |
| W13 | 2007 | D | 88 | S | 34.00 | 4.41 bcde | 4.40 abc | 4.07 abcde |
| W14 | 2008 | B | 84 | NA | 45.00 | 4.42 bcde | 4.47 abc | 3.89 abcde |
| W15 | 2009 | I | 89 | S | 24.99 | 4.45 bcde | 5.00 a | 4.39 abcde |
| W16 | 2011 | E | 82 | NA | 10.00 | 4.47 bcde | 4.51 abc | 5.11 abcd |
| W17 (c) | 2009 | F | 95 | G | 19.99 | 4.47 bcde | 4.88 a | 5.70 a; 4.96 abcd |
| W18 | 2007 | G | 98 | DG | 70.00 | 4.54 abcde | 4.39 abc | 3.68 cde |
| W19 | 2010 | F | 87 | B | 22.00 | 4.56 abcde | 4.91 a | 5.07 abcd |
| W20 | 2010 | B | 94 | G | 19.99 | 4.64 abcde | 4.17 bc | 4.93 abcd |
| W21 | 2007 | H | 83 | NA | 29.00 | 4.66 abcde | 4.46 abc | 5.68 ab |
| W22 | 2010 | F | 83 | NA | 13.00 | 4.71 abcde | 4.62 abc | 4.50 abcde |
| W23 | 2010 | E | 89 | S | 14.00 | 4.72 abcde | 4.33 abc | 4.43 abcde |
| W24 | 2009 | A | 88 | S | 28.00 | 4.76 abcde | 4.72 ab | 4.96 abcd |
| W25 | 2008 | D | 82 | NA | 32.00 | 4.78 abcde | 4.60 abc | 4.00 abcde |
| W26 | 2009 | C | 83 | NA | 59.00 | 4.83 abcde | 4.07 bc | 3.39 de |
| W27 | 2001 | H | 92 | S | 45.00 | 4.85 abcd | 4.01 c | 3.04 e |
| Min | 2001 | | 82 | | 9.99 | 0.74 (d) | 0.71 (d) | 1.83 (d) |
| Max | 2011 | | 98 | | 70.00 | | | |
| Median | 2009 | | 89 | | 26.95 | | | |

(a) A = North Coast (includes everything except Napa and Sonoma); B = Sonoma County; C = Napa County; D = Greater Bay Area; E = North Central Coast; F = South Central Coast; G = South Coast; H = Sierra Foothills; I = Lodi/Woodbridge Grape Commission.
(b) DG = Double Gold; G = Gold; S = Silver; B = Bronze; NA = No Award.
(c) Three wines were presented twice to the experts.
(d) Honestly significant difference (HSD) according to Tukey.
age 37 ± 17 yrs (mean ± s.d.)). Panelists were recruited via email
from the UC Davis affiliates, including students, staff, and retirees,
and gave oral consent to participate in the study. Panelists received
snacks after each session and a gift card at the end of the study as a
token of appreciation.
Six one-hour training sessions were held over a period of two weeks,
in which panelists were exposed to subsets of the 27 wines to
create, refine and gain consensus on the aroma, taste and mouthfeel
attributes which described the perceived differences among
the wines (Table 2). Each wine was seen blind at least once during
the training. Training of the DA panelists was evaluated by blind
recognition exercises of the reference standards at the end of the
training, and all panelists passed this test before
wine evaluation took place. After training was completed, all 27
wines were evaluated in triplicate in individual sensory booths under
red light and positive air pressure. 25 mL of wine was served in
black standard wine tasting glasses, labeled with a random three-digit
code, and panelists were instructed to expectorate the sample
and rinse with deionized water (Arrowhead, Nestlé, Stamford,
CT) between samples. Panelists rated each attribute on a computer
screen using an anchored, unstructured line scale provided
by FIZZ (version 2.47B, Biosystèmes, Couternon, France). A Williams
Latin square block design was used to control for carry-over
effects, with 6–7 wines per block and a total of 12 blocks, evaluated
over a period of 4 weeks.
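To illustrate how a Williams Latin square controls first-order carry-over, the sketch below builds such a design for an even number of samples and lists every "sample a served directly before sample b" pair. It is a minimal sketch and not the FIZZ implementation; the function names are hypothetical, and a block size of six is assumed only because 27 wines in triplicate (81 servings) spread over 12 blocks give six or seven wines per block.

```python
def williams_first_row(n):
    """First serving order of a Williams design for an even number n of samples."""
    if n < 2 or n % 2:
        raise ValueError("this sketch only covers even n")
    row, lo, hi, take_lo = [0], 1, n - 1, True
    while lo <= hi:
        # Alternate between the smallest and largest unused sample index.
        if take_lo:
            row.append(lo)
            lo += 1
        else:
            row.append(hi)
            hi -= 1
        take_lo = not take_lo
    return row


def williams_square(n):
    """One serving order per row; each row is the first row shifted cyclically."""
    first = williams_first_row(n)
    return [[(s + shift) % n for s in first] for shift in range(n)]


def carryover_pairs(square):
    """All (previous sample, next sample) pairs over all serving orders."""
    pairs = []
    for order in square:
        pairs.extend(zip(order, order[1:]))
    return pairs


square = williams_square(6)
pairs = carryover_pairs(square)
```

In a balanced design of this kind every sample appears once in each serving position, and every ordered pair of distinct samples occurs exactly once as a direct succession. Odd block sizes need a second, mirrored square, which is omitted here.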
During the first training session, panelists were screened for color
vision deficiencies (red–green color blindness) using pseudo-isochromatic
plates (American Optical Corporation, Ontario, Canada)
and were considered to have normal color vision if they correctly
identified six out of seven plates. During the remaining training sessions
they were encouraged to note color differences. Only a few
wines differed in color while most of the wines presented during
the training sessions were very similar, so the panel decided to
complete a free sorting task for color in triplicate. During the last
3 evaluation sessions panelists were asked to sort 30 samples
(27 wines with three blind duplicates) according to color into as
many groups as they wished, but at least two and a maximum of
29 groups. Two individual tasting booths with defined illumination
conditions were set up with 30 clear standard wine tasting glasses,
labeled with random three-digit codes, containing 25 mL of wine,
and covered with a transparent plastic lid, as well as an evaluation
sheet. The evaluation table had an off-white background color and
was illuminated by two vertically mounted halogen lamps, 1.4 m
distant from the table surface and 30 cm apart from each other.
The halogen lamps were used at maximum luminous intensity
(1580 cd, MR16 Superline Reflekto, Ushiro America, Inc., Cypress,
Table 2
Reference standards for the sensory attributes used in the DA panel. All attributes were anchored with the words "low" and "high" at the ends of the unstructured line scale.
Franzia Vintners Select Cabernet Sauvignon (Ripon, CA) was used as base wine.

Aroma standards

| Attribute | Sub-attribute | Reference standard |
|---|---|---|
| Overall aroma intensity | | Verbal description: the intensity of the wine smell |
| Earthy | | Tbsp. potting soil (Black Gold, Bellevue, WA) + 1 g Orchid bark (Black Gold) + fresh champignon mushroom + 5 drops water |
| Fresh veggie | | 1 cut green bean + 2 frozen green pepper strips (C+W Birds Eye, Peoria, IL) + 1 fresh Broccoli rosette in 25 mL base wine |
| Fresh green | herbal | 0.05 g dried Dill (The Spice Hunter, San Luis Obispo, CA) + 0.1 g dried herb mix (Davis Co-Op, Davis, CA) in 30 mL base wine |
| | grassy | 4 fresh grass clippings |
| | minty | 2 crushed fresh mint leaves + 0.5 mL Eucalyptus solution (3 drops eucalyptus essential oil in 100 mL water) |
| Canned veggie | | 2 mL canned asparagus brine (Green Giant, Minneapolis, MN) + 2 mL canned green bean brine (Green Giant) + 1 mL canned sweet corn brine (Best Yet, Keene, NH) in 15 mL base wine |
| Floral | | 0.05 g dried lavender (Davis Co-Op) + 0.2 g dried red rose buds (Davis Co-Op) + 2 mL violet solution (2 drops violet essential oil in 100 mL water) in 10 mL base wine |
| Dried fruit | dried fruit | 1 cut dried fig (SunMaid, Stockton, CA) + 10 raisins (SunMaid) + 1 cut dried apricot (SunMaid) in 15 mL base wine |
| | oxidized | 5 mL Marsala Superiore riserva 10anni DOC (Marco de Bartoli) in 10 mL base wine |
| Soysauce | | 12 mL soy sauce (Hisakawa, Golding Farms Foods, Winston-Salem, NC) in 15 mL base wine |
| Yeasty | | 0.1 g SuperFood Yeast Nutrient (Gusmer Enterprises, Fresno, CA) in 15 mL base wine |
| Sweet aroma | honey/caramel | 0.25 mL vanilla extract (Kirkland, Costco) + 1 Tbsp. Mrs. Richardson's Butterscotch caramel (Frankfort, IL) + 1 Tbsp. honey (Lienert's Mountain wildflower honey, Sacramento, CA) |
| | chocolate | 0.5 g grated 70% chocolate (BRIX, Rutherford, CA) + 0.5 g 100% chocolate (Baker's, Krafts Food, Northfield, IL) in 15 mL base wine |
| Spices | | 0.05 g ground cloves (McCormick, Hunt Valley, MD) + 0.06 g ground nutmeg (McCormick) + 0.07 g ground ginger (McCormick) + 0.2 g ground cinnamon (McCormick) in 15 mL base wine |
| Red fruit | | 1 frozen strawberry (Dole, West Village, CA) + 5 frozen raspberries (Dole) in 15 mL base wine |
| Dark fruit | | 1 Tbsp. blueberry spread (Cascadian Farms, Rockport, WA) + 5 mL black cherry juice concentrate (RW Knudsen, Chico, CA) + 1 Tbsp. blackberry jam (Mary Ellen, Orrville, OH) + 0.5 mL black currant flavoring (IFF, New York, NY) + 3 mL water |
| Chemical | | Verbal description: the smell of ammonia and chlorinated swimming pool |
| Alcoholic | | 1 mL 95% ethanol (Goldshield, Hayward, CA) |
| Brett (a) | | 20 mL Cabernet Sauvignon (Walton 2006, Napa Valley, CA) + 0.05 g white pepper (McCormick) |
| Smoky | | 1 drop guaiacol (Acros Chemicals, Pittsburgh, PA, 99+%) in 25 mL base wine |
| Black pepper | | 0.1 g ground black pepper (McCormick) in 15 mL base wine |
| Musty/dusty | musty | 2 mL CE organic acid solution (Agilent, Santa Clara, CA) + 1 mL red wine vinegar (Star, Fresno, CA) in 15 mL base wine |
| | dusty | 1/8 tsp. 100% Montmorillonite European clay (Now Personal Care, Bloomingdale, IL) + 2 mL water |
| Oak | | 0.15 g Evoak Premium dark roasted chips (Oak Solutions, Napa, CA) in 15 mL base wine |
| Sulfur | burnt rubber | 1 mL 95% ethanol (Goldshield, Hayward, CA) in 15 mL wine |
| | rotten egg | hard boiled egg (boiled for 30 min) + 0.5 mL SO2 solution |

Taste and mouthfeel (b)

| Attribute | Reference standard |
|---|---|
| Sweet | 10 g/L Sucrose (C+H, Crockett, CA) |
| Sour | 1 g/L L+ tartaric acid (Fisher Scientific, Pittsburgh, PA) |
| Bitter | 0.8 g/L Caffeine (Sigma-Aldrich, St. Louis, MO) |
| Astringent | 0.8 g/L Aluminum sulfate (McCormick) |
| Hot | 250 mL/L 40% Vodka |
| Viscous | 1.5 g/L carboxymethyl cellulose (Sigma-Aldrich) |

(a) Brett refers to the smell associated with a wine spoilage yeast, Brettanomyces bruxellensis. It is reminiscent of leather, sweaty horse, and barnyard.
(b) All standards were prepared in deionized water (Arrowhead, Nestlé, Stamford, CT).
CA, USA) and had a color temperature of 3000 K, resembling the
spectral distribution of a CIE standard illuminant A, but with more
yellow and red wavelengths. No specific evaluation procedure was
used, but panelists were told to be consistent in the way they evaluated
the color of the wines over the three days.
2.3. Consumer panel
One hundred and seventy-four consumers who reported that
they consumed wine regularly were recruited by email from the
greater Davis area, and came to the sensory laboratory at UC Davis
for one session to participate in the study. During a short introduction
about the tasks of the study the consumers gave oral consent
before they proceeded into the sensory booths for the tasting. All
consumers evaluated the wines on the same day. Each consumer
tasted 6 wines and rated the overall liking and the overall quality
of each wine on a computer screen on an anchored, unstructured
line scale, with the anchors "Dislike extremely" for the liking and
"Very low quality" for the quality rating at the left end and "Like
extremely" and "Very high quality" at the right end of the scale,
as provided by FIZZ, which also collected the data and converted
them into two-digit values between 0 and 9. Consumers were
encouraged to expectorate the sample. The presentation order
followed a balanced incomplete block design with carry-over control,
using the algorithm of Wakeling and MacFie (1995). After each
wine was scored, consumers answered some demographic questions
regarding sex, age, income, and wine consumption. They also
answered 15 questions to determine their wine expertise. Consumers
were assigned a wine expertise status depending on the number
of correctly answered questions: "low" (6 or fewer correct answers
out of 15), "medium" (7–11 correct answers) and "high" (12 or
more correct answers).
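The expertise cut-offs given above translate directly into a small classification rule. The following is a minimal sketch; the function name `expertise_level` is hypothetical and not part of the original analysis.

```python
def expertise_level(correct, total=15):
    """Map the number of correctly answered quiz questions to an expertise status."""
    if not 0 <= correct <= total:
        raise ValueError("number of correct answers out of range")
    if correct <= 6:
        return "low"      # 6 or fewer correct answers
    if correct <= 11:
        return "medium"   # 7 to 11 correct answers
    return "high"         # 12 or more correct answers


levels = [expertise_level(c) for c in (6, 7, 11, 12)]
```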
2.4. Expert panel
Twenty-eight wine professionals, the "experts", satisfying the
criteria of Parr, White, and Heatherbell (2004), were recruited to participate
in a blind tasting of the wines. Experts included winemakers,
enologists, cellar workers, wine consultants, enology teachers
and other wine professionals working in the wine supporting
industry (e.g. cork supply, cooperage, external wine laboratories,
etc.). One wine expert stated in the questionnaire that he had been a
wine judge in the wine judgment, where he also tasted Cabernet
Sauvignon wines. However, given the large number of Cabernet
Sauvignon wines entered in the competition (333) compared to
the subset used in this study (27), the chances are very small that
this wine expert would have tasted the same sample set during
the wine judgment.
The evaluation consisted of two separate sets of 30 wines
(27 samples and three blind duplicates). With the first set the
experts were asked for their overall liking of each wine, using a
9-point category scale, labeled at the left end with "Dislike extremely",
in the middle with "Neither like nor dislike" and at the right
end of the scale with "Like extremely". With the second set the experts
were asked to sort the wines into five quality categories,
ranging from lowest to highest quality. Each glass was coded with
a random three-digit code, and each panelist was seated at a separate
table, equipped with the two sets of wines with differently
coded glasses, evaluation sheets, water and a spit bucket. Finally,
they were asked some questions with regard to demographics
(age, sex) as well as wine industry related ones, such as job title,
wine tasting frequency, wine industry experience and wine judgment
experience. We also asked them two open-ended questions
about attributes they associate with a high and a low quality wine.
2.5. Data analysis
All data analyses were done in RStudio (version 0.97.551,
RStudio, 2012), with the additional packages candisc (Friendly &
Fox, 2010), SensoMineR (Lê & Husson, 2008), FactoMineR (Lê, Josse,
& Husson, 2008), missMDA (Josse & Husson, 2012), pls (Mevik &
Wehrens, 2007), cluster (Maechler, Rousseeuw, Struyf, Hubert, &
Hornik, 2013) and DistatisR (Beaton, Fatt, & Abdi, 2013).
Significance testing using an alpha level of 5% was done on the
DA data by multivariate analysis of variance (MANOVA) for the
wine effect, followed by univariate analysis of variance (ANOVA)
for each attribute using a three-way fixed effect model with all
two-way interactions (wine W, panelist P, replicate R, W×P, W×R,
P×R). For attributes with a significant panelist effect and significant
panelist interactions (W×P) a pseudo-mixed model with the interaction
as the error term was used (Gay, 1998).
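The following sketch illustrates the idea of testing the wine effect against the wine-by-panelist interaction instead of the residual error. It assumes that the pseudo-mixed model amounts to the ratio F = MS(W) / MS(W×P) on a balanced data set; the function name and the nested-list data layout are hypothetical, and the replicate terms of the full three-way model are left out for brevity.

```python
def pseudo_mixed_f(x):
    """F ratio for the wine effect with the W x P interaction as error term.

    x[w][p][r] is the rating of wine w by panelist p in replicate r
    (balanced design assumed; hypothetical data layout).
    """
    W, P, R = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[w][p]) / R for p in range(P)] for w in range(W)]
    wine = [sum(cell[w]) / P for w in range(W)]
    judge = [sum(cell[w][p] for w in range(W)) / W for p in range(P)]
    grand = sum(wine) / W

    ss_w = P * R * sum((m - grand) ** 2 for m in wine)
    ss_wp = R * sum(
        (cell[w][p] - wine[w] - judge[p] + grand) ** 2
        for w in range(W)
        for p in range(P)
    )
    df_w, df_wp = W - 1, (W - 1) * (P - 1)
    # Raises ZeroDivisionError if there is no interaction variance at all.
    return (ss_w / df_w) / (ss_wp / df_wp), df_w, df_wp


# Tiny made-up example: 2 wines, 2 panelists, 2 replicates.
ratings = [[[1, 1], [3, 3]], [[2, 2], [6, 6]]]
f_ratio, df1, df2 = pseudo_mixed_f(ratings)
```

For the DA data described here (27 wines, 15 panelists) the corresponding degrees of freedom would be 26 and 26 × 14 = 364.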
Missing liking and quality score values in the consumer data
due to the incomplete block design of the study were imputed
using a regularized iterative PCA algorithm as proposed by Josse
and Husson (2012). Multiple imputation with principal component
analysis (MI-PCA) was used to obtain a measure of the uncertainty
of the imputation of the missing values, as described in Josse,
Pagès, and Husson (2011). MI-PCA estimates both a missing value
and the variability of its imputation. MI methods can be used
for values missing at random, such as the missing consumer values
in this study.
Once a complete data set was obtained, all subsequent data analyses
used the imputed data set. Consumer segmentation based on
the liking and quality scores was obtained by hierarchical clustering
of the imputed data set, using Euclidean distances and Ward's
linkage. The number of clusters was chosen visually, where a large
drop in the hierarchical tree height was observed. Hedonic liking
(HL) and quality scores (Q) from the consumer clusters and averaged
over all consumers, and the HL data from the expert panel,
were analyzed by ANOVA for the wine and panelist effects, followed
by post hoc analysis according to Tukey (honestly significant
difference, HSD). Cluster segmentation was tested for equal proportions
using a Chi-Square test for all demographical questions.
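As an illustration of agglomerative clustering with Ward's criterion, the sketch below repeatedly merges the two clusters whose fusion increases the within-cluster sum of squares the least. It is a naive pure-Python version for small examples, not the routine from the R cluster package; the function name is hypothetical, and the number of clusters k is passed in explicitly, whereas in the study the cut was chosen visually from the tree.

```python
def ward_clusters(points, k):
    """Group points (lists of floats) into k clusters using Ward's criterion."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                dist2 = sum((a - b) ** 2 for a, b in zip(ci, cj))
                # Increase in within-cluster sum of squares caused by the merge.
                cost = ni * nj / (ni + nj) * dist2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((mi + mj, centroid))
    return sorted(sorted(members) for members, _ in clusters)


# Made-up two-dimensional scores for five consumers.
scores = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [8.1, 7.9], [0.9, 1.1]]
segments = ward_clusters(scores, k=2)
```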
Various product space representations were obtained using
principal component analysis (PCA) for the DA data and internal
preference maps (IPM) with the HL and Q data from both the consumer
and expert panels. The experts' quality sorting data were
analyzed with DISTATIS (Abdi, Valentin, Chollet, & Chrea, 2007).
Correlation of the various data sets, foremost correlating the HL
and Q scores from both consumer clusters and experts to the DA
data, was done using Partial Least Squares Regression type 2
(PLS2), with the DA attributes as the predicting and HL and Q
scores as the predicted variables. The obtained model was cross-validated
by a leave-one-out procedure.
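The leave-one-out procedure itself is independent of the regression method. The sketch below shows it with an ordinary one-predictor least-squares line standing in for the PLS2 model, which is a deliberate simplification; the function names are hypothetical, and the summary statistic Q² = 1 − PRESS / SS_total is one common choice that the paper does not necessarily use.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor (stand-in for PLS2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q2: refit without sample i, then predict it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Made-up data: one perfectly linear set and one with some scatter.
q2_exact = loo_q2([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
q2_noisy = loo_q2([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 8.0, 9.0])
```

With 27 wines, the procedure would refit the model 27 times, each time predicting the one wine that was left out.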
3. Results and discussion
3.1. Obtaining product characteristics with Descriptive Analysis (DA)
After the MANOVA revealed significant differences among the
wines, the ANOVAs showed significant differences in 17 aroma,
two taste and two mouthfeel attributes among the 27 wines
(P ≤ 0.05) (Suppl. Table 1).
All significant attributes were used to create the graphical product
representation with PCA of the covariance matrix, shown in
Fig. 1a and b. Along the first principal component (PC 1), explaining
34.3% of the total variance, samples were to some extent separated
by their performance in the wine competition, reflected in the
opposite direction of vegetal-green and chemical-earthy aromas on
the left hand side of the variables plot and fruity, oak and sweet aromas
on the right hand side. Along PC 2, explaining an additional 19.3%
of the total variance, wines were separated by taste and
mouthfeel attributes, with astringency, bitterness and hot mouthfeel
explaining the positive PC 2 axis, opposite to sweetness on the negative
second principal component.
Wines W19 and W16 in particular and, to a lesser extent, wines W22
and W24 were driven by their canned and fresh vegetal and green
notes, while wine samples W5, W6, W11, W14, W18, W26 and
W27 were rated high in the attributes Brett, rotten egg/sulfur, chemical
and earthy, all attributes that point towards wine spoilage.
On the right hand side of the PCA product plot the DA panel
rated wines W1–W3, W7, W10, W12, W17, W23 and W25 high
in various fruit (i.e. dried fruit, dark fruit and red fruit) and sweet
(i.e. honey, caramel, vanilla) aromas as well as sweet taste. Lastly,
wines W8, W15, W21 and W24 were mostly driven by an astringent
and hot mouthfeel, bitter taste, and alcoholic, smoky, spicy and
oak aromas. Wine W19 was positively correlated with soysauce, and
two wines (W13 and W20) were located at the origin of the product
plot together with overall aroma, indicating a rather balanced
sensory profile, without any particular attributes driving the sensory
characteristics of these wines.
The aroma attributes oak, sweet aroma, red fruit, dark fruit, dried
fruit and spices are all significantly positively loaded on PC 1, while
chemical, earthy, fresh veg, canned veg, sulfur and Brett are all significantly
negatively loaded on PC 1 (P ≤ 0.05). Similarly, on PC 2, the
mouthfeel attributes astringent and hot as well as bitter taste and
chemical aroma are significantly loaded on the positive PC 2 axis.
Similar findings were reported for AOC Bordeaux and Bordeaux
Supérieur wines, in many cases Cabernet Sauvignon blends
(Szolnoki & Hoffmann, 2011). The authors performed multiple
linear regressions between quality scores and DA attributes, and
found significantly positive correlations between the quality scores
and the terms body, coffee, oak, as well as rose, jam and strawberry,
while Brett, mushroom, lactic/butter and bacon as well as acidity and
oxidation were negatively correlated to quality scores.
3.2. Can color be used to predict quality?
The trained DA panel also evaluated color differences in a sorting
task as described in Section 2. Panelists sorted 30 wines (27 wines
with 3 blind duplicates) on three consecutive days, and data from
each day were analyzed with DISTATIS (Fig. 2a–f). In the DISTATIS
procedure the individuals' sorting data are used; thus, an RV coefficient
map of the individuals' agreement was calculated. The DA panel
showed only partial agreement in the sorting task, indicated by the
low explained variance of the first dimension of around 30%
(Fig. 2b, d and f). Compared to e.g. 60% explained variance in a beer
sorting task (Abdi et al., 2007), the DA panel was in less agreement
when sorting the wines by color. However, the beers used in that
study differed more in terms of beer style, so assessors
were more likely to agree in their sorting. Additionally, in
DISTATIS, each assessor's importance for the compromise product
map is weighted based on the RV compromise map, and therefore
assessors that show less agreement with the others contribute to a
smaller extent to the compromise map. These so-called α weights varied
between the three replicates and assessors from 0.04 (J3, J11
and J11 in the three replicates) to 0.08–0.09 (J4, J5 and J5 in the
three replicates). In all three replicates (around 25% of the variance
explained in the first two dimensions of the compromise plot) three
product clusters were found due to similarly perceived colors.
Wines W1, W4, W5, W13, W18, W25 and W27 were found to be
similar in color, and form a group in the bottom right quadrant of
the product plot (Fig. 2a, c and e). Those wines were from the oldest
vintages in the set (2001–2008). A second group made up of wines
W6–W8, W14, W15, W20, W21, W23, W24 and W26 represents the
middle-aged wines in the set, mostly from the 2009 and 2010 vintages,
except for W8 from the 2008 and W21 from the 2007 harvest.
A last, less tight group consists of all wines harvested in 2009 or later,
thus representing the youngest wines (W2, W3, W9, W10,
W12, W16, W17, W19 and W22). Wine W11 is positioned in the
center of the plots, indicating an averaged color, most likely due
to its medium age (wine vintage, 2007).
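The agreement between two assessors in a sorting task, as summarized by the RV coefficient mentioned above, can be sketched as follows. Each sorting is turned into a co-occurrence matrix (1 if two samples were placed in the same group), the matrix is double-centered, and the RV coefficient is the normalized sum of element-wise products. This is a minimal sketch with hypothetical function names, not the DistatisR implementation; DISTATIS additionally rescales each cross-product matrix, but the RV coefficient is unaffected by such a scale factor.

```python
from math import sqrt


def cooccurrence(labels):
    """1.0 where two samples were sorted into the same group, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def double_center(m):
    """Subtract row and column means and add back the grand mean."""
    n = len(m)
    row = [sum(r) / n for r in m]
    col = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[m[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def rv(a, b):
    """RV coefficient between two square matrices of equal size."""
    num = sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    den = sqrt(sum(x * x for r in a for x in r) *
               sum(y * y for r in b for y in r))
    return num / den  # undefined if an assessor put all samples in one group


def sorting_rv(labels_a, labels_b):
    return rv(double_center(cooccurrence(labels_a)),
              double_center(cooccurrence(labels_b)))


# Made-up sortings of four samples; group labels are arbitrary.
same_partition = sorting_rv([0, 0, 1, 1], [5, 5, 2, 2])
crossed_partition = sorting_rv([0, 0, 1, 1], [0, 1, 0, 1])
```

Assessors whose sortings have low RV coefficients with the rest of the panel receive smaller weights in the compromise map.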
Based on these results, color is an indicator of age rather than
quality, as the DA panel grouped wines from the same vintages
together, independent of the assigned quality scores in the wine
competition. This is in good agreement with the findings of
Machado (2009), who found no correlation between the quality
ratings of red wines and their color. However, we do not know if
judges in the wine competition were able to judge the color of
the wines, so the possibility exists that color was incorporated into
[Fig. 1 plots: (a) product plot of wines W1–W27 and (b) variables plot of the DA attributes; axes in both panels are PC 1 (34.3%) and PC 2 (19.3%).]
Fig. 1. (a) PCA product and (b) variables plot using the DA attributes that differed significantly among the wines (P ≤ 0.05). Wines are color-coded according to their
performance in the wine competition (green = No Award; blue = Silver or Bronze medal; gold = Gold or Double Gold medal). Aroma attributes are shown in bold, taste
attributes are italicized and mouthfeel attributes are underlined. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of
this article.)
the quality assessment during the competition. To some extent color
can be a quality predictor, as winemaking and storage conditions,
such as oxygen amount, influence the color of red wine
(Caillé et al., 2010; Wirth et al., 2010); however, in this set, the
impact of vintage was driving the observed sensory differences in red
wine color.
3.3. Measuring the overall liking and quality with consumers
A total of 174 consumers were recruited to taste subsets of the
wines and rate the overall liking (HL) and overall perceived quality
(Q). Due to the use of a balanced incomplete block design, missing
HL and Q values were imputed by a regularized iterative MI-PCA as
indicated in the methods section. An ANOVA on overall liking and
perceived quality values revealed significant differences among the
wines (Table 3, P ≤ 0.05). Hierarchical clustering on the imputed
data set, separately for the HL and the Q values, led to four consumer
segments for both data sets (HL1–HL4 and Q1–Q4), and significant
differences between the clusters were found for HL and Q
scores. Chi-Square tests for equal proportions for all demographical
questions were not significant (P > 0.05), indicating no segmentation
due to demographics but only due to different hedonic liking
and quality perceptions (Table 3).
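A Chi-Square test for equal proportions compares the observed counts in a cluster-by-category table with the counts expected if cluster membership were unrelated to the demographic category. The sketch below computes only the test statistic and its degrees of freedom; the function name and the example counts are hypothetical, and the p-value, which requires the chi-square distribution, is not computed here.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    table[i][j]: number of consumers in cluster i and demographic category j.
    """
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Made-up counts: two clusters by two categories.
stat, df = chi_square([[10, 20], [20, 10]])
```

The statistic is then compared with the critical value of the chi-square distribution for the given degrees of freedom (3.84 for one degree of freedom at the 5% level).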
A graphical representation of the different consumers and the
relation of their HL and Q scores to the wines was obtained
with an internal preference and internal quality map using multidimensional
preference mapping (Delgado & Guinard, 2012). The
resulting product and variable plots are shown in Fig. 3a–d, and
these results will be combined with the demographical separation
of the individual consumer segments shown in Table 3.
Consumers were equally distributed among the wines, indicating a broad range of liking
and quality perception. Additionally, due to the imputation of the missing data, the
circles representing the uncertainty of the imputation are large and overlap for most
wines. However, when we averaged the raw data for each wine, as shown in Table 1, most
wines were not considered significantly different using Tukey's post hoc test, and this
fact is also shown in the internal preference and quality maps (Fig. 3); therefore, we
believe that the imputation of the missing values is a good approximation. Moreover,
cluster analysis and demographic exploration of the consumer data were only possible
after the imputation.
For the internal preference map (Fig. 3a and b), which explains over 80% of the total
variance within the first two dimensions, the clusters' liking scores differed
significantly (P ≤ 0.05) and separated the wines into the four quadrants of the product
map, with 9 wines in the top left quadrant (W3, W5, W8, W9, W13, W14, W16, W23 and W27)
being liked the most by cluster HL4, a consumer segment of 42 consumers. Consumers in
this segment liked the wines overall the most (average of 6.24), and reported the lowest
percentage of less than $35,000 yearly income. Most people in this cluster earn between
$35,000 and $75,000 per year. Consumers in this cluster reported consuming wine 1–5
times per week, and showed the highest percentage of medium wine expertise of all HL
clusters. Among the wines preferred by HL4 were 1 Double
Fig. 2. (a) DISTATIS product plot and (b) RV consensus plot with each DA panelist for the first sorting replicate, (c) DISTATIS product plot and (d) RV consensus plot with each
DA panelist for the second sorting replicate, (e) DISTATIS product plot and (f) RV consensus plot with each DA panelist for the third sorting replicate. Wines are color-coded
according to their performance in the wine competition (green = No Award; blue = Silver or Bronze medal; gold = Gold or Double Gold medal). (For interpretation of the
references to color in this figure legend, the reader is referred to the web version of this article.)
Gold, 2 Gold, 3 Silver and 3 No Award medals (see Table 1). Five wines (W1, W4, W7, W10
and W24), including 1 Gold, 3 Silver and 1 No Award, were the wines that consumers in HL
cluster 2 liked the most. HL2, a consumer segment of 42 people, had the highest
percentage of under 30 year olds, and thus, the highest percentage of under $35,000
yearly income. Consumers in this segment drink wine less frequently, and reported 1–4
drinking occasions per month. Most HL2 consumers had low or medium wine expertise (81%).
Wines positioned in the bottom right quadrant (W2, W6, W18, W19, W21, W26) were rated
highest in liking by consumers in the HL3 segment (n = 42). Wines in this quadrant
include 1 Double Gold, 2 Silver, 1 Bronze and 2 No Award wines. HL3 consumers were very
similar to consumers in HL2, but had a lower percentage of under 30 year olds, and fewer
people in this segment earned less than $35,000 per year. The majority in this cluster
is under 40 years old. HL3 consumers showed the highest percentage of more than 5
drinking occasions per week, and also included the second highest percentage of high
expertise.
All remaining wines (3 Gold (W11, W17 and W20), 2 Silver (W12, W15), and 2 No Award
(W22, W25)) are located in the bottom left quadrant of the internal preference map.
Most consumers in the HL1 segment rated these wines highest; however, this consumer
segment (n = 48) rated the wines overall significantly lower than the other three
segments (Table 3). HL1 had the highest percentage of over 50 year olds, and also the
highest percentage of over $75,000 yearly income. Nearly 50% of this segment drinks wine
at least once a week, and 63% showed medium wine expertise. Across all clusters, a
similar split between females and males, with 42–45% males, was observed.
A different picture to the internal preference map is shown in the internal quality map
using the perceived quality ratings of the consumers (Fig. 3c and d), despite explaining
a similar percentage of the total variance within the first two dimensions (>80%).
Again, four consumer segments (Q1–Q4) were found with hierarchical clustering, using
Euclidean distances and Ward's linkage (Table 3).
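The segmentation step can be sketched as follows. This is a minimal, naive implementation of agglomerative clustering with Ward's criterion on Euclidean distances, not the software used in the study; the rating profiles, noise level and function name `ward_clusters` are hypothetical and serve only to show how consumers with similar rating patterns end up in the same segment.

```python
import random


def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion on Euclidean distances.

    Repeatedly merges the two clusters whose fusion increases the total
    within-cluster sum of squares the least (naive O(n^3) version for clarity).
    """
    clusters = [[i] for i in range(len(points))]
    dims = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dims)]

    def merge_cost(a, b):
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > n_clusters:
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical data: four consumer types rating five wines on a 9-point scale,
# five consumers per type, each with a little random noise.
random.seed(1)
profiles = [
    [2, 3, 2, 3, 2],
    [8, 7, 8, 7, 8],
    [2, 8, 2, 8, 2],
    [8, 2, 8, 2, 8],
]
consumers = [[v + random.uniform(-0.5, 0.5) for v in p] for p in profiles for _ in range(5)]
segments = ward_clusters(consumers, n_clusters=4)
print(sorted(sorted(s) for s in segments))
```

Cutting the merge sequence at four clusters mirrors the four segments reported here; in practice the number of clusters is a judgment call, usually based on the dendrogram.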
Significant differences in the overall quality ratings of the wines were found between
the four segments, with Q1 rating the wines significantly lower than the other three
segments (Q1 < Q2 < Q3 < Q4; Table 3).
Wines positioned in the top left quadrant of the internal quality map (Fig. 3c) included
1 Double Gold (W8), 2 Silver (W12, W23), 1 Bronze (W14) and 2 No Award (W1, W22) wines.
Consumers of segment Q3 (n = 57) rated these wines highest in quality. This segment had
the highest percentage of under 30 year olds, and the lowest percentage of 30–39 year
olds of all consumer clusters. More than three quarters of this cluster reported a
yearly income of less than $50,000, and a similar percentage reported drinking wine
between once a month and up to five times a week. More than half of these consumers
(65%) were classified as medium experts in their wine knowledge.
Wines in the top right quadrant, including 3 Gold (W3, W10,
W17), 3 Silver (W2, W6 and W15) and 2 Bronze (W7, W19) wines,
were rated highest in quality by most consumers in segment Q1
(n = 37) and a few consumers of segment Q3. In Q1, the highest
percentage of over 60 year olds (11%) of all clusters was found,
and thus, the highest percentage of yearly incomes over $75,000
was reported as well. However, nearly half (46%) in this segment
were under 30 years old, and 62% in Q1 earn less than $35,000
per year. Wine consumption is nearly equally distributed among
the categories, but the highest percentage of drinking wine more
than 5 times per week of all clusters was found in this segment.
Consumers were either low or high in wine expertise.
The bottom right quadrant was made up of 6 wines (1 Double Gold, 1 Gold, 2 Silver and 2
No Award), and particularly consumers in segment Q2 rated these wines high in quality.
Q2 consumers were the second largest segment (n = 50) and, interestingly, clustered
closer together than any of the other clusters, thus showing a more similar quality
perception than the other consumer segments. Forty percent of these consumers were
between 30 and 50 years old, and more than half reported a yearly income of less than
$35,000, together with 1–4 wine consumptions per month. This segment had the highest
percentage of medium wine expertise, and was interestingly the only segment that was
less equally divided between females and males (38% males vs. more than 40% in the
other clusters).
The last consumer segment, Q4, included 30 consumers, and
had the highest percentage of over 50 year olds, with over 50%
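Table 3
Demographical distribution of the four consumer segments, separated for the HL and Q scores. Letters denote significant differences in HL and Q between the consumer segments
by post hoc analysis according to Tukey (P ≤ 0.05).

| | Overall (HL) | HL1 | HL2 | HL3 | HL4 | Overall (Q) | Q1 | Q2 | Q3 | Q4 |
|---|---|---|---|---|---|---|---|---|---|---|
| n | 174 | 48 | 42 | 42 | 42 | 174 | 37 | 50 | 57 | 30 |
| HL / Q | | 3.06 c | 4.12 b | 5.53 a | 6.24 a | 4.58 | 2.88 d | 3.77 c | 5.25 b | 6.73 a |
| Gender | | | | | | | | | | |
| Male (%) | 43 | 42 | 45 | 43 | 43 | 43 | 43 | 38 | 46 | 47 |
| Female (%) | 57 | 58 | 55 | 57 | 57 | 57 | 57 | 62 | 54 | 53 |
| Age | | | | | | | | | | |
| <30 years (%) | 47 | 35 | 60 | 55 | 40 | 47 | 46 | 46 | 54 | 37 |
| 30–39 years (%) | 22 | 29 | 21 | 17 | 19 | 22 | 22 | 26 | 19 | 20 |
| 40–49 years (%) | 10 | 6 | 10 | 7 | 19 | 10 | 8 | 14 | 9 | 10 |
| 50–59 years (%) | 16 | 21 | 7 | 14 | 19 | 16 | 14 | 10 | 14 | 30 |
| >60 years (%) | 5 | 8 | 2 | 7 | 2 | 5 | 11 | 4 | 4 | 3 |
| Income | | | | | | | | | | |
| <$35 k (%) | 52 | 52 | 62 | 52 | 40 | 52 | 62 | 54 | 47 | 43 |
| $35–50 k (%) | 28 | 25 | 26 | 24 | 38 | 28 | 22 | 28 | 30 | 33 |
| $50–75 k (%) | 11 | 8 | 7 | 19 | 12 | 11 | 5 | 8 | 16 | 17 |
| >$75 k (%) | 9 | 15 | 5 | 5 | 10 | 9 | 11 | 10 | 7 | 7 |
| Consumption | | | | | | | | | | |
| <1/month (%) | 17 | 19 | 19 | 17 | 14 | 17 | 24 | 12 | 18 | 17 |
| 1–4/month (%) | 41 | 33 | 52 | 38 | 40 | 41 | 35 | 54 | 35 | 37 |
| 1–5/week (%) | 27 | 29 | 21 | 24 | 33 | 27 | 19 | 20 | 37 | 30 |
| >5/week (%) | 15 | 19 | 7 | 21 | 12 | 15 | 22 | 14 | 11 | 17 |
| Expertise | | | | | | | | | | |
| Low (%) | 23 | 25 | 33 | 21 | 12 | 23 | 32 | 18 | 23 | 20 |
| Med (%) | 63 | 63 | 48 | 62 | 79 | 63 | 49 | 68 | 65 | 67 |
| High (%) | 14 | 13 | 19 | 17 | 10 | 14 | 19 | 14 | 12 | 13 |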
earning between $35,000 and $75,000. This segment also perceived the wines' quality
overall significantly higher than any of the other segments (P ≤ 0.05). Two-thirds in
this segment drink wine at least once a month or once a week, and 87% had low or medium
wine expertise (Table 3). Wines that were positively correlated to this segment were
located in the bottom left quadrant of the internal quality map (Fig. 3c), and included
2 Gold, 2 Silver and 3 No Award wines (W5, W11, W13, W20, W21, W26 and W27).
These results are not very surprising, as after all, all wines in the
study are commercially available, thus, each wine is liked by some
Fig. 3. (a) Internal Preference map and (b) loadings plot using the imputed HL scores of the consumers (n = 174) together with a circular measure of imputation uncertainty
obtained from multiple imputation PCA (Josse et al., 2011). (c) Internal Quality map and (d) variables plot using the imputed Q scores of the consumers (n = 174) together
with a circular measure of imputation uncertainty obtained from multiple imputation PCA. Wines are color-coded according to their performance in the wine competition
(green = No Award; blue = Silver or Bronze medal; gold = Gold or Double Gold medal). Consumers are color-coded according to their cluster segment. (For interpretation of
the references to color in this figure legend, the reader is referred to the web version of this article.)
Table 4
Demographic details of the experts (n = 28) with regards to their wine tasting expertise.
Age (years): Under 30, 4%; 30–40, 36%; 41–50, 29%; 51–60, 18%; Over 60, 14%
Gender: Female, 32%; Male, 68%
Years of industry experience: Less than 1, 0%; 1–5, 0%; 5–10, 29%; Over 10, 39%; Over 30, 32%
Current job title: Enologist, 11%; (Assistant) winemaker, 57%; Production, 4%; Marketing & sales, 7%; Other, 21%
Frequency of wine tasting: Daily, 18%; 1–3/wk, 71%; 1–3/mo, 7%; 1–3/yr, 4%
Is wine tasting part of your job title? Yes, 86%; No, 14%
Years of professional tasting experience: Less than 1, 18%; 1–5, 11%; 5–10, 21%; Over 10, 32%; Over 30, 18%
Do you taste outside of your job (e.g. wine competition)? No, 64%; Yes, 36%
Examples: NVV, St. Helena Star, IWSC, CA State Fair 2012, Orange County Wine Society Competition, New York wine judging, Romancing the
Rhone, Napa County Fair, Sunset International Wine Competition, Home Winemaker Classics
consumer. However, we also found that consumers were not able to find the same quality
pattern as the wine judges, as Gold medal wines and No Award wines were liked similarly
and perceived similarly in quality as well.
3.4. Experts' view on liking and quality
The 28 wine professionals, the experts, showed significant differences in the liking of
the wines (P ≤ 0.05, Table 1). Two No Award wines and one Silver medal wine (W1, W5 and
W27) were liked least, while a Gold medal wine (W17) was liked the most. We served the
experts 30 wines, including 3 blind duplicated wines (W5, W12 and W17), and found that
although the wine pairs did not receive identical ratings, the ratings were within the
HSD range and thus not significantly different from each other (P ≤ 0.05).
Similarly to the consumers, we created an internal preference map based on the HL scores
and an internal quality map using the DISTATIS algorithm on the quality sorting data
(Fig. 4a–d). Experts' HL scores were in higher agreement with each other than the
consumer scores, as shown in the variables plot of the internal preference map (compare
Fig. 3b vs. Fig. 4b), where the experts' loadings group together along the positive
first dimension. Wines that were liked most are located along the positive first
dimension of the internal preference product map, which explained around 35% of the
total variance (Fig. 4a). Wines liked most included W6–W10, W16, W19, W20, W24, W29 and
W30 as well as the two blindly served pairs W12ab and W17ab. Wines W3, W15, W22 and W23
were rated medium in liking, and all other wines, located on the negative first
dimension, were liked least. All pairs of the three duplicated wines (W5, W12 and W17)
were close together, indicating that experts perceived them similarly.
From the sorting task, where experts were asked to sort the wines into 5 quality
categories ranging from low to high, a similar product map was found (Fig. 4c and d).
In the DISTATIS procedure the individuals' sorting data are used, thus, an RV
coefficient map of the individuals' agreement can be calculated. The DISTATIS product
map, a kind of internal quality map, is shown in Fig. 4c. Generally, wines that experts
liked similarly were also grouped together in
Fig. 4. (a) Internal preference map and (b) loadings plot using the HL scores of the experts (each line represents one expert). (c) Internal quality map using the experts' quality
sorting data in combination with the DISTATIS algorithm. (d) Experts' RV consensus plot (each dot represents one expert). Wines are color-coded according to their
performance in the wine competition (green = No Award; blue = Silver or Bronze medal; gold = Gold or Double Gold medal). (For interpretation of the references to color in
this figure legend, the reader is referred to the web version of this article.)
terms of quality, e.g. wines W6–W10, W12ab, W16, W17ab, W19–W21 and W24 grouped together
in both product maps, while the less liked wines W1–W5, W11, W13–W15, W18, W25 and W26
were similarly rated in quality. The experts showed low agreement among each other, as
shown by the RV map (Fig. 4d), with an explained variance of 15% in the first dimension.
However, all of them cluster together in a tight group similar to the DA panel in the
color sorting task, and the α weights ranged between 0.024 and 0.043.
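To make the RV map and the α weights more concrete, the following sketch computes them for three hypothetical assessors sorting six wines, following the general DISTATIS recipe described by Abdi et al. (2007): each sort is turned into a 0/1 distance matrix, double-centered into a cross-product matrix, compared pairwise with the RV coefficient, and the first eigenvector of the RV matrix is rescaled to sum to one. The sorting data and function names are illustrative assumptions, and the eigenvector is obtained by simple power iteration rather than a full eigendecomposition.

```python
def cross_product(sort):
    """Double-centered cross-product matrix S = -1/2 * C D C for one assessor.

    D[i][j] is 0 when wines i and j were sorted into the same group, else 1.
    """
    n = len(sort)
    d = [[0.0 if sort[i] == sort[j] else 1.0 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d]  # D is symmetric: row means = column means
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (d[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def inner(a, b):
    """trace(A B) for symmetric matrices, i.e. the sum of elementwise products."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def rv_matrix(sorts):
    """Pairwise RV coefficients between assessors (undefined if an assessor
    puts all wines into a single group, because that cross-product is zero)."""
    s = [cross_product(sort) for sort in sorts]
    return [[inner(a, b) / (inner(a, a) * inner(b, b)) ** 0.5 for b in s] for a in s]


def alpha_weights(rv, iterations=200):
    """First eigenvector of the RV matrix (power iteration), rescaled to sum to 1."""
    v = [1.0] * len(rv)
    for _ in range(iterations):
        v = [sum(r * x for r, x in zip(row, v)) for row in rv]
        total = sum(v)
        v = [x / total for x in v]
    return v


# Hypothetical sorts: assessors 1 and 2 agree completely, assessor 3 deviates.
sorts = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 2, 2, 0],
]
rv = rv_matrix(sorts)
alphas = alpha_weights(rv)
print([round(a, 3) for a in alphas])
```

Assessors who agree with the majority receive larger weights in the compromise. Assuming the weights are scaled to sum to one, 28 experts contributing equally would each receive 1/28 ≈ 0.036, so the reported range of 0.024 to 0.043 lies around the equal-weight value.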
In a last step all experts were asked to answer some questions with regards to
demographics and their wine tasting experience (Table 4). They were also asked to
describe what makes a high or a low quality wine, using two open questions: "Which
attributes do you associate with a high quality wine?" and "Which attributes do you
associate with a low quality wine?". The majority of the experts were male and over 30
years old, and had at least 5 years of professional experience in the wine industry;
over 70% reported 10 or more years of experience. Winemakers and assistant winemakers
made up over half of the participating experts, and nearly 90% of the experts reported
tasting wine at least once per week, mainly because 86% reported that wine tasting is
part of their job description. Several wine tastings per week have been part of 50% of
the experts' daily life for the last 10 years or more. However, only about a third of
the experts use their tasting experience outside their job, participating for example
in wine shows or wine competitions. One expert reported having participated as a judge
in the 2012 CA State Fair Commercial Wine Competition.
In the open questions about the attributes of high and low quality wines, most experts
used general terms such as balance (14 out of 25; 3 experts did not answer these
questions), lack of defects (9) and complexity (3) to describe a high quality wine. One
expert explained "balance = visually nice in combination with intense aroma and lasting
finish". Another winemaker described complexity as "layers of interesting qualities,
concentrated flavors, appropriate mouthfeel for the wine style".
Additionally, more tangible terms were given, including true/correct varietal character
(5), soft/smooth/round tannins (6), correct mouthfeel (2), big/voluminous aroma (2),
structure/texture (3), length/finish (4) and volume/weight (3). Four experts described
the importance of the balance between fruit and oak, while four additional experts
considered the presence of fruit essential for wine quality. Wine professionals also
used hedonic terms for wine quality, such as pleasure (2), yumminess and drinkability,
and one expert described a high quality wine as "pleasure that outweighs flaws". One
expert each named integration, elegance, uniqueness, sense of place and typicity as
attributes of a high quality wine, while another expert summarized it as "quality that
is appropriate for the price".
When describing a low quality wine, most experts (18 out of 25) noted defects, faults
and flaws, and some gave examples, such as aromas associated with Brettanomyces
spoilage, volatile acidity, oxidation, reduction or other microbial spoilages. Coarse or
harsh tannins were mentioned by 10 experts, followed by lack of balance (9), and in some
cases, more specifically, lacking balance between fruit and oak characters or between
sweetness and acidity. Other descriptions related to winemaking were overly or
unbalanced sweetness (7), too much acid (7), atypical and/or lacking aromas and flavors
(9), too much and/or unintegrated oak (8), lack of body/thin/short/flat (7), and
bitterness (3). Two experts noted the lack of pleasure, and one expert associated
simplicity with a low quality wine.
Overall, the experts provided insight into their quality assessment of red wines, and
seem to combine a rather objective framework of descriptive terms with personal
preference when evaluating wine quality. It could be that, due to their training, wine
experts are better able to describe what they like, and which attributes they associate
with high quality red wine.
3.5. Predicting hedonic liking and quality ratings by Descriptive
Analysis
Correlating the various data sets to each other, to understand which variables are
driving liking and which are not correlated, was the last step in our study. Due to the
nature of sensory data, where multi-collinearity of DA attributes is typically the rule
rather than the exception, partial least squares (PLS) regression is an ideal method, as
in PLS the covariance of both the predicting and the predicted variables is modeled. In
our case we used all significant DA attributes to predict the HL and Q ratings from the
averaged consumers, the consumer segments and the experts.
The reasoning behind that approach is several-fold: First, we wanted to understand if DA
profiles could predict liking and perceived quality for both untrained and experienced
wine tasters. Secondly, the question was whether hedonic liking would correlate to
perceived quality or whether these two concepts would be independent from each other,
and lastly, if different populations differed in their liking and quality scores, and
if these differences could be correlated to particular sensory attributes. Since we
used the combination of all liking and quality parameters at once, a PLS2 algorithm was
used.
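As an illustration of what a PLS2 model does, the sketch below implements the classical NIPALS iteration for several response variables at once and reports the cumulative fraction of Y variance explained after each latent variable, which is the kind of figure summarized in a model table. It is a simplified sketch under stated assumptions, not the implementation used in the study: the data are random and hypothetical (27 "wines", 6 "attributes", 3 "responses"), and all function names are invented for this example.

```python
import random


def center(m):
    """Subtract the column means from every row."""
    means = [sum(col) / len(m) for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(m, v):      # M v
    return [dot(row, v) for row in m]


def mat_t_vec(m, v):    # M^T v
    return [dot(col, v) for col in zip(*m)]


def sum_sq(m):
    return sum(v * v for row in m for v in row)


def pls2_explained_y(x, y, n_components, n_iter=500, tol=1e-12):
    """Cumulative fraction of Y variance explained after each latent variable."""
    x, y = center(x), center(y)
    total = sum_sq(y)
    explained = []
    for _ in range(n_components):
        u = [row[0] for row in y]                           # start from first response
        for _ in range(n_iter):
            w = mat_t_vec(x, u)
            norm = dot(w, w) ** 0.5
            w = [v / norm for v in w]                       # X weights, unit length
            t = mat_vec(x, w)                               # X scores
            q = [v / dot(t, t) for v in mat_t_vec(y, t)]    # Y loadings
            u_new = [v / dot(q, q) for v in mat_vec(y, q)]  # Y scores
            converged = sum((a - b) ** 2 for a, b in zip(u, u_new)) < tol
            u = u_new
            if converged:
                break
        p = [v / dot(t, t) for v in mat_t_vec(x, t)]        # X loadings
        # Deflate both blocks by the part explained by the current score vector.
        x = [[xv - ti * pj for xv, pj in zip(row, p)] for row, ti in zip(x, t)]
        y = [[yv - ti * qj for yv, qj in zip(row, q)] for row, ti in zip(y, t)]
        explained.append(1 - sum_sq(y) / total)
    return explained


# Hypothetical data: responses are noisy linear combinations of the attributes.
random.seed(0)
n_wines, n_attr, n_resp = 27, 6, 3
X = [[random.gauss(0, 1) for _ in range(n_attr)] for _ in range(n_wines)]
B = [[random.gauss(0, 1) for _ in range(n_resp)] for _ in range(n_attr)]
Y = [[dot(row, [B[a][r] for a in range(n_attr)]) + random.gauss(0, 0.5)
      for r in range(n_resp)] for row in X]

explained = pls2_explained_y(X, Y, n_components=3)
print([round(100 * e, 1) for e in explained])
```

Because every latent variable is extracted from the joint X–Y covariance, a response that shares little structure with the descriptive attributes gains explained variance only slowly, however many components are added.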
Table 5 shows the summary of the PLS model. For an explained variance of the predicting
matrix (X) of over 75%, the first 8 dimensions of the model are needed, which predict
over 50% of the predicted variables (Y). However, within each of the predicted
variables, large differences in the prediction quality were found. For example, while
the hedonic liking and the quality ratings of the experts (HLexp, Qexp) as well as the
averaged consumer quality ratings (Qcons) could be predicted to around 50% with the first two latent
Table 5
PLS regression model summary, showing the quality of the model for the first 10 model components (LV), i.e. percentages of the total explained variance for the sum of the
predicting variables (X), the average over all predicted variables (Y), and each of the predicted variables (HL1–4, HLcons, HLexp, Q1–4, Qcons, Qexp).

| (%) | LV 1 | LV 2 | LV 3 | LV 4 | LV 5 | LV 6 | LV 7 | LV 8 | LV 9 | LV 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| X | 25.5 | 40.0 | 52.4 | 58.7 | 65.7 | 72.1 | 74.4 | 79.1 | 84.1 | 86.9 |
| Y | 13.2 | 22.2 | 27.0 | 35.5 | 42.9 | 47.4 | 53.8 | 57.4 | 59.5 | 63.1 |
| HLcons | 22.6 | 25.2 | 29.7 | 38.0 | 39.0 | 39.2 | 56.0 | 62.6 | 62.6 | 63.0 |
| HL1 | 1.9 | 3.9 | 11.9 | 16.8 | 21.5 | 33.8 | 42.8 | 45.7 | 53.5 | 59.2 |
| HL2 | 0.1 | 0.6 | 5.2 | 18.5 | 19.8 | 28.0 | 45.6 | 50.2 | 58.4 | 61.1 |
| HL3 | 1.4 | 3.7 | 21.7 | 39.6 | 45.3 | 53.1 | 57.1 | 57.8 | 58.4 | 64.5 |
| HL4 | 0.5 | 2.2 | 22.7 | 38.6 | 45.4 | 54.4 | 57.2 | 58.5 | 59.1 | 67.4 |
| HLexp | 39.2 | 55.8 | 55.9 | 59.5 | 64.6 | 67.8 | 68.2 | 68.6 | 70.1 | 72.6 |
| Qcons | 38.6 | 46.7 | 46.7 | 53.7 | 53.7 | 56.1 | 60.7 | 73.0 | 73.0 | 73.2 |
| Q1 | 3.4 | 14.7 | 14.9 | 27.7 | 32.3 | 36.8 | 39.0 | 45.0 | 45.8 | 50.4 |
| Q2 | 17.4 | 32.9 | 33.2 | 33.3 | 50.3 | 50.5 | 60.4 | 60.4 | 61.9 | 65.0 |
| Q3 | 0.6 | 16.0 | 16.3 | 20.2 | 34.4 | 35.9 | 41.9 | 44.5 | 44.5 | 50.6 |
| Q4 | 1.0 | 14.2 | 14.4 | 24.1 | 32.7 | 37.5 | 39.7 | 44.8 | 45.1 | 49.0 |
| Qexp | 31.4 | 50.3 | 51.1 | 56.6 | 75.7 | 75.8 | 77.0 | 77.7 | 81.1 | 81.4 |
variables (LV) of the model, the consumer liking was very poorly modeled, with only 25%
or far less of the variance explained. This fact is also visually apparent in Fig. 5b,
where the experts and the quality scores of the consumers are positioned on the outside
of the correlation plot. All consumer segment variables (HL1–4 and Q1–4) were only
marginally explained by the PLS model, with slightly better prediction of the quality
scores than the HL scores. This is also reflected in the validation plot (Suppl. Fig.
1), where no minimum was found within the first 20 model components in the root mean
squared error of prediction (RMSEP) for the consumer variables, while both expert
variables as well as the quality ratings of the averaged consumers show a minimum after
2–4 components.
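The statements about the model summary can be checked directly against the cumulative percentages in Table 5. The short sketch below copies the relevant rows from that table and looks up how many latent variables are needed to pass a given threshold; only the helper name `components_needed` is invented here.

```python
# Cumulative explained variance (%) for LV 1-10, copied from Table 5.
table5 = {
    "X":      [25.5, 40.0, 52.4, 58.7, 65.7, 72.1, 74.4, 79.1, 84.1, 86.9],
    "Y":      [13.2, 22.2, 27.0, 35.5, 42.9, 47.4, 53.8, 57.4, 59.5, 63.1],
    "HLcons": [22.6, 25.2, 29.7, 38.0, 39.0, 39.2, 56.0, 62.6, 62.6, 63.0],
    "HLexp":  [39.2, 55.8, 55.9, 59.5, 64.6, 67.8, 68.2, 68.6, 70.1, 72.6],
    "Qcons":  [38.6, 46.7, 46.7, 53.7, 53.7, 56.1, 60.7, 73.0, 73.0, 73.2],
    "Qexp":   [31.4, 50.3, 51.1, 56.6, 75.7, 75.8, 77.0, 77.7, 81.1, 81.4],
}


def components_needed(cumulative, threshold):
    """Smallest number of LVs whose cumulative explained variance exceeds threshold."""
    for n, value in enumerate(cumulative, start=1):
        if value > threshold:
            return n
    return None


n_x = components_needed(table5["X"], 75.0)
print("LVs needed for >75% of X:", n_x)
print("Y explained with that many LVs (%):", table5["Y"][n_x - 1])

# Explained variance after the first two latent variables, per response.
after_two = {name: values[1] for name, values in table5.items() if name not in ("X", "Y")}
print(after_two)
```

Read this way, the table shows 8 components for more than 75% of X (79.1%), with 57.4% of Y explained, and after two components roughly half of the variance for HLexp, Qexp and Qcons but only about a quarter for HLcons.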
Plotting the product scores and variable correlations next to
each other (Fig. 5a and b) one can see that most gold and double
gold wines (except W18) are located on the left hand side or close
to the center of the score plot, showing a high positive correlation
to the HL and Q scores of the experts and the averaged consumers.
It seems that the liking and the quality ratings of the experts are
driven by the presence of various fruit attributes (red fruit, dark
fruit) as well as oak and sweetAroma and absent or only marginally
detectable vegetal-green, chemical, earthy, sulfur and Brett aromas.
This is also supported by the open-ended questions about high and low quality wine
descriptors, where most experts associated low quality with the presence of Brett,
chemical, and reduction aromas, and in contrast, described the presence of fruit and oak
as characters of a high quality wine. It seems that experts are able to correlate
perceived quality and liking to defined sensory attributes, thus making the PLS
regression model a more accurate one when experts are included, rather than using
untrained consumers only. However, for the averaged consumers as well as the experts,
liking was highly correlated to perceived quality, indicated by the close proximity of
the two variables in the correlation plot. This finding is in good agreement with
Lawless et al. (1997), who found similarly that wine consumers' hedonic scores were
positively correlated to hedonic scores of wine experts, and that wine experts showed a
high correlation between hedonic liking scores and 20-point quality scores.
4. Conclusions
Based on the DA results we can assume that the judges at the wine competition awarded
Gold and Double Gold medals to wines that showed a balanced flavor profile with
detectable aromas of fruit and oak and absent or only marginally present notes of
vegetal-green, chemical, earthy or sulfur characters. Similarly, most Gold and Double
Gold wines showed lower scores in astringency, hot mouthfeel and bitterness, and higher
sweetness. However, wines that were characterized similarly in the DA did not perform
equally well in the Wine Judgment, indicating a high inconsistency among the wine judges
compared to the trained panel. This is in accordance with the observations of Hodgson
(2008, 2009), who reported a lack of concordance among different wine competitions.
This fact is not too surprising, as a trained sensory panel spends several training
sessions on both the sample set as well as aroma, taste and mouthfeel references, in
addition to the replicate blind evaluation of each sample. However, a similar
improvement in consistency due to some form of training could be expected with wine
experts too, based on the results of Tempere et al. (2011) and Tempere, Cuzange,
Bougeant, Revel, and Sicard (2012). They showed that wine experts could improve their
olfactory sensitivities when repeatedly exposed to (or trained with) specific odorants.
Using short-term, repeated exposure to the odorant, all experts showed significantly
improved detection of the odorant. The authors concluded that screening, training and
monitoring of wine experts are important factors for improved quality control of wine.
Another approach to improve consistency among wine judges was proposed by Gawel and
Godden (2008), who recommended replicate assessments of the wines in competitions.
For the 174 consumers that rated the hedonic liking and perceived quality of the studied
wines, significant differences were found in liking and perceived quality ratings, and
consumers could be segmented into four clusters due to similar liking and quality
ratings, after missing values were imputed with MI-PCA. Clusters differed significantly
only due to the liking and quality ratings, not due to any of the collected demographic
information, such as age, gender, income, wine consumption, or wine expertise. Consumer
segments were separated in the internal preference map, indicating that each segment
preferred different wines. This is not too surprising, as all wines in the study are
commercial products, thus reflecting the broad range of consumer preferences. However,
these results also indicate that some consumers' liking patterns are opposite to
experts' quality perceptions. A similar observation was made by Machado (2009), who
found consumer segments that did not like highly rated dry red wines.
Twenty-eight wine experts (wine professionals working in Northern California), different
from the wine experts that evaluated the wines in the wine competition, tasted all 27
wines and rated their liking and perceived quality in a similar way to the consumer
Fig. 5. (a) PLS score plot showing the wines color-coded according to their performance in the wine competition (green = No Award; blue = Silver or Bronze medal; gold =
Gold or Double Gold medal). (b) PLS correlation plot showing both the predicting variables (DA attributes, taste attributes are italicized, mouthfeel attributes are underlined)
and the predicted variables (hedonic liking (HL) and quality ratings (Q) for the four consumer clusters (1–4), averaged over the consumers (cons) and the experts (exp)). (For
interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
group. The experts found significant differences among the wines in liking, and
confirmed the wine judgment ratings only to some degree, liking a Gold medal wine the
most, and two No Award wines and one Silver medal wine the least. Wine experts were more
in agreement about the wines than consumers, as the internal preference map and the
internal quality map using the experts' evaluations showed a similar picture, with wines
that were perceived as high quality wines being also the most liked ones by all wine
experts. However, most Gold and Silver wines were not significantly different from each
other, raising the question whether the assigned differences in the medals really
reflect perceivable sensory differences rather than just differences due to serving
order and/or random effects. Based on our results, wine experts were not able to
repeatedly find differences among the wines in a similar way to the wine judgment.
Comparing all three populations (trained panel, consumers and wine experts), it is
interesting to note that only the experts' liking and quality ratings could be
sufficiently modeled by the DA descriptors, indicating that the experts based their
evaluations on objective, descriptive attributes, while the consumers were less able to
do so. The average over all consumers showed a similar pattern to the experts, with very
similar liking and quality ratings, while none of the consumer segments was positioned
close to the experts in the PLS correlation plot. Additionally, none of these clusters
were well described by the PLS model.
Overall, in contrast to other food products, such as peaches and nectarines (Delgado,
Crisosto, Heymann, & Crisosto, 2013), consumer liking and perceived quality ratings
could not be modeled well by DA, while wine experts seemed to use a similar construct to
evaluate liking and quality of wines. This could be the result of their constant
exposure to the product, together with the necessity to describe their perception in a
more objective way when communicating with others. This result is also in agreement with
the finding of a more analytical assessment of wine quality by wine experts compared to
novices (D'Alessandro and Pecotich, 2013).
In conclusion, wine quality not only has several dimensions, including extrinsic and
intrinsic factors; it also seems to be a highly variable subject. Although we could
identify some sensory attributes that wine experts are looking for in high quality red
wines, it seems that wine judges were not able to consistently apply these standards
when judging wines. This can most likely be attributed to the large number of wines
entered and evaluated in the wine competition, and some kind of quality control in wine
competitions is highly recommended. We also found that consumers span a broad range of
liking, and even a wine of low quality from an expert and judge perspective can be
appreciated by some consumers. The general public might not be able to communicate their
preferences as well as wine experts, but they are clearly capable of tasting and making
a choice, so it is in the wine industry's best interest to deliver consistent wine
quality.
Acknowledgements
We would like to acknowledge the financial support from the
American Vineyard Foundation (grant # 523 (2012) Judging Wine
Quality) and Jerry Lohr for making this work possible. A big Thank
You to all panelists and experts for their time and efforts, as well as
Christine Wilson, Meredith Bell and Anna Hjelmeland for their
help. We acknowledge the wineries who donated or discounted
the wines used in this study, including Alexander Valley Vineyards,
Baily Vineyard & Winery, Biltmore Estate Wine Company, Cameron
Hughes, Cecchetti Wine Company, Inc., Convergence Vineyards,
Dogwood Cellars, Fields Family Wines, Le Vigne Winery, Mettler
Family Vineyards, Monte de Oro Winery, Muscardini Cellars,
Perrucci Family Vineyards, Sean Minor Wines, The Wine Group,
Tricycle Wine Co., V. Sattui Winery and numerous others.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in
the online version, at http://dx.doi.org/10.1016/j.foodqual.2013.10.004.
References
Abdi, H., Valentin, D., Chollet, S., & Chrea, C. (2007). Analyzing assessors and
products in sorting tasks: Distatis, theory and applications. Food Quality and
Preference, 18, 116.
ASTM International. (2009). E253-13: Standard terminology relating to sensory
evaluations of materials and products.
Beaton, D., Fatt, C. C., & Abdi, H. (2013). DistatisR: DiSTATIS three way metric
multidimensional scaling. R package version. Retrieved from
http://cran.r-project.org/package=DistatisR.
Caillé, S., Samson, A., Wirth, J., Diéval, J.-B., Vidal, S., & Cheynier, V. (2010). Sensory
characteristics changes of red Grenache wines submitted to different oxygen
exposures pre and post bottling. Analytica Chimica Acta, 660, 35-42.
Charters, S., & Pettigrew, S. (2007). The dimensions of wine quality. Food Quality and
Preference, 18, 997-1007.
D'Alessandro, S., & Pecotich, A. (2013). Evaluation of wine by expert and novice
consumers in the presence of variations in quality, brand and country of origin
cues. Food Quality and Preference, 28, 287-303.
Delgado, C., Crisosto, G. M., Heymann, H., & Crisosto, C. H. (2013). Determining the
primary drivers of liking to predict consumers' acceptance of fresh nectarines
and peaches. Journal of Food Science, 78, S605-S614.
Delgado, C., & Guinard, J.-X. (2012). Internal and external quality mapping as a new
approach to the evaluation of sensory quality - A case study with olive oil.
Journal of Sensory Studies, 27, 332-343.
Friendly, M., & Fox, J. (2010). Candisc: Generalized canonical discriminant analysis. R
package. Retrieved from http://cran.r-project.org/package=candisc.
Gawel, R., & Godden, P. W. (2008). Evaluation of the consistency of wine quality
assessments from expert wine tasters. Australian Journal of Grape and Wine
Research, 14, 1-8.
Gay, C. (1998). Invitation to comment. Food Quality and Preference, 9, 166.
Hodgson, R. T. (2008). An examination of judge reliability at a major U.S. wine
competition. Journal of Wine Economics, 3, 105-113.
Hodgson, R. T. (2009). An analysis of the concordance among 13 U.S. wine
competitions. Journal of Wine Economics, 4, 1-9.
Josse, J., & Husson, F. (2012). Handling missing values in exploratory multivariate
data analysis methods. Journal de la Société Française de Statistique, 153, 79-99.
Josse, J., Pagès, J., & Husson, F. (2011). Multiple imputation in principal component
analysis. Advances in Data Analysis and Classification, 5, 231-246.
Lange, C., Martin, C., Chabanet, C., Combris, P., & Issanchou, S. (2002). Impact of the
information provided to consumers on their willingness to pay for Champagne:
Comparison with hedonic scores. Food Quality and Preference, 13, 597-608.
Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food: Principles and
practices. New York: Springer.
Lawless, H. T., Liu, Y.-F., & Goldwyn, C. (1997). Evaluation of wine quality using a
small-panel hedonic scaling method. Journal of Sensory Studies, 12, 317-332.
Lê, S., & Husson, F. (2008). Sensominer: A package for sensory data analysis. Journal
of Sensory Studies, 23, 14-25.
Lê, S., Josse, J., & Husson, F. F. (2008). FactoMineR: An R package for multivariate
analysis. Journal of Statistical Software, 25, 1-18.
Machado, B. (2009). Revealing the secret preferences for top-rated dry red wines
through sensometrics (Unpublished Master's thesis). Davis: University of
California.
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., & Hornik, K. (2013). Cluster:
Cluster analysis basics and extensions. Retrieved from http://cran.r-project.org/
package=cluster.
Mevik, B.-H., & Wehrens, R. (2007). The pls package: Principal component and
partial least squares regression in R. Journal of Statistical Software, 18, 1-24.
Parr, W. V., White, K. G., & Heatherbell, D. A. (2004). Exploring the nature of wine
expertise: What underlies wine experts' olfactory recognition memory
advantage? Food Quality and Preference, 15, 411-420.
RStudio. (2012). RStudio: Integrated development environment for R (Version 0.97.551)
[Computer software]. Boston, MA. Retrieved May 13, 2013. Available from
http://www.rstudio.org/.
Siegrist, M., & Cousin, M.-E. (2009). Expectations influence sensory experience in a
wine tasting. Appetite, 52, 762-765.
Szolnoki, G., & Hoffmann, D. (2011). What makes a good Bordeaux wine? A sensory
characterization of Bordeaux and Bordeaux Supérieur red wines based on
regression analysis. In 6th international conference of the academy of wine
business research. Retrieved from
http://academyofwinebusiness.com/wp-content/uploads/2011/09/66-AWBR2011_Szolnoki_Hoffmann.pdf.
Tempere, S., Cuzange, E., Bougeant, J. C., Revel, G., & Sicard, G. (2012). Explicit
sensory training improves the olfactory sensitivity of wine experts.
Chemosensory Perception, 5, 205-213.
232 H. Hopfer, H. Heymann / Food Quality and Preference 32 (2014) 221-233
Tempere, S., Cuzange, E., Malak, J., Bougeant, J. C., De Revel, G., & Sicard, G. (2011).
The training level of experts influences their detection thresholds for key wine
compounds. Chemosensory Perception, 4(3), 99-115.
Thach, L. (2008). How American Consumers Select Wine. Wine Business Monthly.
Verdú Jover, A. J., Lloréns Montes, F. J., & Fuentes Fuentes, M. D. M. (2004).
Measuring perceptions of quality in food products: The case of red wine. Food
Quality and Preference, 15, 453-469.
Wakeling, I. N., & MacFie, H. J. H. (1995). Designing consumer trials balanced for
first and higher orders of carry-over effect when only a subset of k samples from
t may be tested. Food Quality and Preference, 6, 299-308.
Wirth, J., Morel-Salmi, C., Souquet, J. M., Dieval, J. B., Aagaard, O., Vidal, S., et al.
(2010). The impact of oxygen exposure before and after bottling on the
polyphenolic composition of red wines. Food Chemistry, 123, 107-116.