
Food Quality and Preference 13 (2002) 117–128

www.elsevier.com/locate/foodqual

An artificial neural network model for predicting flavour intensity in blackcurrant concentrates
Raymond K. Boccorh, Alistair Paterson*
Centre for Food Quality, University of Strathclyde, Department of Bioscience and Biotechnology, 204 George Street,
Glasgow G1 1XW, Scotland, UK

Received 28 December 1999; received in revised form 27 August 2001; accepted 27 October 2001

Abstract
Artificial neural networks (ANNs), machine learning acquiring knowledge in training and using deduced relationships to predict
responses, were studied to rationalise concentrate use in fruit drinks production. Sets of ANNs were developed for predicting
flavour intensity in blackcurrant concentrates from gas chromatographic data on flavour components (37) in 133 sorbent extracts
from blackcurrant concentrates varying in season, geographical origin and processing technology. Sensory data were collected using
ratio scaling on flavour intensities in drinks from concentrates. Relationships between chromatographic and sensory data for
concentrates of three seasons (1989, 1990 and 1992) were modelled by ANNs with back propagation using principal component
regression scores as input. Predictions were compared with a global model from random concentrates from all three seasons. In
predicting overall flavour intensity, ANN models were better fitted than partial least squares regression. The ability of artificial neural
networks to simulate non-linear relationships observed in human perceptions could explain such improvements. Crown Copyright
© 2002 Published by Elsevier Science Ltd. All rights reserved.
Keywords: Blackcurrant; Artificial neural networks; Multivariate statistical analyses; Flavour modelling; Fruit flavour

* Corresponding author. Tel.: +44-141-548-2307; fax: +44-141-553-4124.
E-mail address: cdas01@strath.ac.uk (A. Paterson).
0950-3293/02/$ - see front matter Crown Copyright © 2002 Published by Elsevier Science Ltd. All rights reserved.
PII: S0950-3293(01)00072-6

1. Introduction

Consistent quality is expected in fruit drinks products despite differences in the key ingredient, fruit concentrates (Boccorh, 1996; Boccorh, Paterson, & Piggott, 1999b). Variations in quality and intensity of flavour character can be minimised by the blending of fruits and concentrates. However, currently, drinks manufacturers cannot readily determine levels of concentrate necessary for any predetermined flavour intensity. Ripley (1996) has pointed out that flavour recognition in humans, such as nature of a fruit character, is generally an unsupervised pattern recognition. Supervised pattern recognition, e.g. training of wine assessors with pre-selected examples, can be structured, as in sensory profiling, and simulated in machine learning. Williams and his co-workers (Williams, Rogers, & Collins, 1988) have argued that the key to understanding relationships between sensory and chemical/physical data is the generation of perceptual spaces. In such product spaces, relationships between samples are determined by their perceived characteristics (Williams, 1985). Maindonald (1998) has suggested that artificial neural networks (ANNs), a machine learning approach, should be thought of as mathematical models of processes of learning, or ordering of samples as in these perceptual spaces. This requires a problem decomposition (Buntine, 1996), fundamental in data mining (Fayyad, Piatetsky-Shapiro, & Smyth, 1996), with three components: model representation, model evaluation and searching.

Artificial neural networks emulate human systems (Obermeier & Barron, 1989; Ripley, 1996), exhibiting brain characteristics of learning, adaptation and self-organisation. Important features are resilience to small errors in inputs and ability to adapt to new information, i.e. self-learning (Lippmann, 1987). Appropriate ANNs generalize and abstract essential characteristics from inputs containing chaotic data (Wasserman, 1989) and handle complex nonlinear relationships, even when these are unknown (Rataj & Schindler, 1991). This is because ANNs proceed by identifying features that are then fed into a pattern recognition classifier (Ripley, 1996). A key issue is that information for training the ANN should be complete and unedited (Cheeseman & Stutz, 1996).

Maindonald (1998) has advised caution in that neural nets may not be sufficiently mature for everyday tools. Yet such processing strategies are the basis of artificial nose operation (Moy, Vasic, Berdague, & Rossi, 1995). A fundamental problem with most alternative conventional multivariate statistical procedures is that without pre-treatment of the data, only linear relationships between data sets on flavour components and sensory character can be modelled. Production of graphical models, such as perceptual spaces, for establishing product knowledge (Buntine, 1996) requires use of dimension reducing techniques which can either be statistically based, such as principal component (PCR) and partial least squares regression (PLS), or aim to mimic human reasoning. Neural networks seek to provide insights into human processes (Michie, Spiegelhalter, & Taylor, 1994) and are thus attractive for modelling such relationships as that between intensity of flavour attributes and flavour component composition. A further advantage of the ANN approach is that it is possible to develop models with large but not unlimited flexibility (Ripley, 1996).

In factors determining sensory attributes and preference of foods, relationships are generally nonlinear, with usually sigmoidal characteristics (Frijters, 1979; Meilgaard, Elizondo, & Moya, 1970). Sensory attributes can originate in interactions of complex non-linear physical and/or chemical processes that can only individually be quantified by instruments. Even in sensory analysis, use of category scales in scoring of attributes may also produce non-linear relationships, especially if many scores are located close to scale anchors (Wilkinson & Yuksel, 1997). Although non-linear relationships can be linearised prior to treatment with conventional regression statistics, such manipulations reduce the value of models for human responses to food stimuli, the primary aim.

Statistical regression methodologies for modelling non-linear relationships often assume that data represent an underlying ‘‘reality’’ that can be expressed by an algebraic equation (Ni & Gunasekaran, 1998). Martens and Næs (1989) suggested that for certain non-linear data, a linear approach using PLS or PCR, with extra variables, should contain all relevant information. Linearisation and modelling processes are initially separated by transforming either sensory or instrumental variables, followed by fitting of the linear model. However, information on non-linearity must be available to facilitate choice of an appropriate transformation. Suggestions about explicit inclusion of non-linearities in models have been made. Sekulic, Seasholltz, Wang, Kwalski, Lee, and Holt (1993) suggested a number of alternatives: non-linear extensions of PLS and PCR, with polynomials or splines, in addition to locally weighted regression (LWR), projection pursuit regression (PPR), alternating conditional expectations (ACE), multivariate adaptive regression splines (MARS) and artificial neural networks (ANNs).

Modern neural networks are multi-layer and hierarchical, descendants of perceptrons (Rosenblatt, 1962; Widrow, 1963). Cascaded groups of single layers, comprising groups of neurones (nodes) preceded by appropriate weightings, have single input and output layers (Fig. 1), with one or more hidden layers. Input is fed into nodes from other nodes, or from outside the network, and the weighted sum of these inputs is calculated and processed according to a transfer function, the most common being sigmoidal or logistic functions for prediction modelling. This has the desirable characteristic of being continuous and non-linear (Wilkinson & Yuksel, 1997).

Fig. 1. Schematic diagram of a three-layer artificial neural network (ANN).

Interest in ANNs was revived in the mid-1980s, as improved back propagation algorithms became available with subsequent development of suitable computational hardware (Cheng & Titterington, 1994; Rumelhart, Hinton, & Williams, 1986). A gradient descent approach to modifying weights and thresholds in an iterative manner (network training) minimises the sum of errors between desired and calculated output signals of the network (Bardot, Bochereau, Martin, & Palagos, 1994). Multilayer ANNs are particularly valuable for handling fuzzy, chaotic, or incomplete information sets (Obermeier & Barron, 1989). Processing elements build relationships between input and output data in training (Rumelhart et al., 1986; Smith & Walter, 1991). Such ANNs have been exploited in chemometrics and sensometrics (Bardot et al., 1994; Lipp, 1996b; Sekulic et al., 1993) and forecasting problems in food quality control (Arteaga & Nakai, 1993; Ni, Gunasekaran, Bogenrief, & Olson, 1994). Sensory and food quality can be predicted without knowing the intricacies of human responses to individual or sets of sensory stimuli (Lipp, 1996b; Ni & Gunasekaran, 1998; Smith & Walter, 1991) with enhanced understanding of complex human responses (Bardot et al., 1994), varying between individuals, groups and cultures (Shepard, 1989; Williams, 1994).

In a previous report (Boccorh, Paterson, & Piggott, 1999b) modelling of relationships between chromatographic data and sensory flavour character intensity in blackcurrant concentrates was effected by PLS regression. The aim of this present study was to ascertain whether an artificial neural network strategy could yield models showing improved prediction of flavour character intensity in drinks.

2. Materials and methods

2.1. Quantification of flavour components

A processor supplied a total of 133 blackcurrant concentrates, processed from fruits of three different crops and varying in geographical origin, post-harvest storage and processing technology (Table 1). Sorbent extraction employed C18 bonded phase columns (2.8 ml; 500 mg of Bond-Elut; Jones Chromatography, Mid-Glamorgan, Wales, UK). Two tandem C18 matrix phases, conditioned with 5 ml methanol and 10 ml of de-ionised water, adsorbed flavour components of two 100 ml aliquots of diluted concentrates (with 10 mg 3,4-dimethylphenol as internal standard) at 10 ml min⁻¹ at 35 kPa pressure. Components were desorbed with a 10 ml binary mixture of HPLC grade dichloromethane and methanol (10:1 volume ratio) at 20 kPa vacuum. Pooled sorbent extracts were evaporated to 0.5 ml with O2-free nitrogen, then transferred to 2 ml glass vials with teflon lined caps, and stored at 18 °C.

High resolution GC analysis was performed on Carbowax 20M (20 m × 0.32 mm i.d.; 0.25 µm film thickness; SGE Ltd, Buckinghamshire, UK) using a Carlo Erba HRGC 5300 series gas chromatograph, with cold-on-column injection. The temperature ramp was: isothermal at 60 °C for 3 min; increasing to 230 °C at 8 °C min⁻¹; then to 240 °C at 1.0 °C min⁻¹, and finally isothermal at 240 °C for 5 min; 40 min analysis time. Carrier gas was helium at 2 ml min⁻¹. Blanks of dichloromethane and a dichloromethane/methanol (1:10 v/v) mixture were chromatographed. Duplicate extractions were performed for each concentrate, and duplicate injections for each extract. To measure precision in extraction, replicate data for components of each extract and injection were subjected to ANOVA using Minitab v. 9.0.

Flavour components were discriminated into character-enhancing or non-enhancing (diminishing) (Table 2) on the basis of characterisation of extracts from different sorbent eluents and gas chromatography/olfactometry (Boccorh, 1996; Boccorh, Paterson, & Piggott, 2000).

Table 1
Blackcurrant concentrates studied, grouped into seasons, geographical origin, post harvest storage and concentration technology

Season   UK Fresh   UK Frozen   UK Freeze   Imported   Polish   New Zealand   Total

1989        13         32           0           0         0          2          47
1990        34         10           3           7         7          2          63
1992        12         11           0           0         0          0          23

Total       59         53           3           7         7          4         133

Table 2
Flavour components relevant to processed blackcurrant character

Character-enhancing notes       Character-non-enhancing (diminishing) notes

Cooked beans (Cb) (3)           Faecal (Fc) (1)
Spicy (Sp) (3)                  Wilted Flower (WJ) (1)
Manure-like (Ma) (3)            Burnt beans (Bb) (5)
Burning incense (Bi) (5)        Burning rubber (Br) (5)
Leathery (Lea) (1)              Tree bark-like (Tb) (1)
Old leather-like (Ol) (4)       Dried hay (Dh) (2)
Smoky (Sm) (3)

Numbers of compounds with a particular note are indicated in parentheses.



2.2. Determination of flavour intensity in concentrates

Magnitude estimation of flavour intensity in model blackcurrant drinks was effected using assessors with previous experience in ratio scaling (Boccorh, 1996; Boccorh & Paterson, 2002). Drinks, formulated as suggested by the processor, were presented to assessors in 300 ml translucent disposable plastic cups covered with watch glasses. As described by Moskowitz (1983), assessors were requested to taste an initial sample and assign a positive, non-zero number (modulus). Subsequent drinks were scored, quantifying perceived flavour intensity as a ratio to this modulus. Experimental design employed balanced incomplete blocks for presentations (MacFie, Bratchell, Greenhoff, & Vallis, 1989) with replication, and assessments were performed under purple lighting to minimise colour effects. Prior to statistical modelling, individual assessor scores for samples were pooled and modulus normalisation (Moskowitz, 1983) employed to bring scores within a single consensus range, thus minimising variation between assessors.

2.3. Modelling of relationships between flavour component data and flavour intensity

Modelling of relationships between data sets was effected using principal components regression and the artificial neural network module of Unscrambler, v. 5.55 (CAMO A/S. N-7041 Trondheim, Norway). This module is based on the Optimal Minimal Neural Interpretation of Spectra (OMNIS) approach (Borgaard & Thodberg, 1992) in which input (instrumental) data are pre-processed by principal components analysis (PCA), with scores being obtained from PCR. It is the interaction between these scores that generates the non-linearity in the ANN approach.

The ANN network will thus contain a PCR solution if there is a direct connection between input and output layers. This approach requires models to be deduced (model representation) and validated (model evaluation) with separate calibration and test sets, respectively. In this approach a selected set of weights minimises error on the calibration set. The combination of principal components and direct connection has the additional advantage of speeding up training processes (Wilkinson & Yuksel, 1997).

In PCR, factors are selected from only the calibration set, used to model relationships between the chromatographic data and the sensory response. The first stage involves PCA, which decomposes the original chromatographic data and extracts new sets of uncorrelated Factors, on the basis of ability to summarise maximum information (Martens, Martens, & Wold, 1983a; Martens, Wold, & Martens, 1983; Williams, 1994). These Factors thus represent linear models of the original data. The interaction of factor scores on to which sensory variables are regressed introduces the non-linearity into the ANN. After PCR, as with most variable reduction techniques, it is necessary to determine the appropriate number of Factors to best fit the model. Selecting too few Factors will result in the exclusion of relevant information. Too many Factors will, however, result in inclusion of ‘‘noise’’ (Lipp, 1996a). The optimal number of Factors was determined by plotting variances as functions of the number of principal components (scree plot). Values of RMSEP (root mean square error of prediction), or minimum error of prediction, were used to determine prediction efficiencies. These, the mean deviations in flavour intensities of predicted test set samples, relate to the measured intensity levels (Esbensen, Schönkopf, & Midtgaard, 1996). Corresponding values of RMSEC (root mean square error of calibration) were calculated but not reported in this study.

Lack of fit in models was determined by visual inspection of residual plots: Y-residual vs. Y-Predicted was used. Randomly distributed residuals indicated a good fit.

The Unscrambler ANN module uses an error back propagation algorithm for weight adjustment, determining the contribution of each weight to prediction error. Weights are then adjusted by a fixed proportion of that contribution. This algorithm, in combination with a logistic function, has been used in a range of food applications (Cheng & Titterington, 1994; Wilkinson & Yuksel, 1997).

Due to the heterogeneity of the 1989 and 1990 concentrates (variations in geographical origin and concentration technology), specific models were established with these sample subsets. No sub-model was developed for concentrates of 1992 since there was little heterogeneity among samples. A set of 96 concentrates, randomly selected and representative of the entire data, was then used for a global model. For any model, at least 50% of the data was used in training and the remainder as a test set.

For training, the number of nodes in the input layer of each network was set to the number of optimal extracted PCR Factors. This minimised errors; output layers were set to a single neurone. For other network parameters, learning rate (the magnitude of weight changes during training) was set initially at 10, and later reduced to 0.1 as training progressed. The update parameter (number of training object pairs presented to the ANN at each iteration) was set to the number of objects in each training set.

The RMSEP was used to estimate prediction performance of models and sets of errors for each correlation method compared. Samples that appeared to be outliers in models, from examining residual variances and leverage effects, were examined together with influence on models.
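The modelling route described in Section 2.3 (PCA scores from the calibration set as inputs, a hidden layer of logistic nodes, a direct input-to-output connection, full-batch back propagation with a decreasing learning rate, and RMSEP on a test set) can be illustrated with a minimal NumPy sketch. This is not the Unscrambler implementation: the function names, the linear output node, the number of hidden nodes, the weight initialisation and the learning-rate values (scaled down for unit-variance toy inputs) are all assumptions made for illustration.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoidal) transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def pca_scores(X_cal, X_new, n_factors):
    """Project X_new onto the first n_factors principal components of the calibration set only."""
    mean = X_cal.mean(axis=0)
    _, _, vt = np.linalg.svd(X_cal - mean, full_matrices=False)
    loadings = vt[:n_factors].T              # shape (n_variables, n_factors)
    return (X_new - mean) @ loadings

def init_network(n_inputs, n_hidden, seed=0):
    """Hypothetical weight layout: one hidden layer plus a direct input-to-output link."""
    rng = np.random.default_rng(seed)
    return {
        "W_h": rng.normal(scale=0.1, size=(n_inputs, n_hidden)),  # input -> hidden
        "b_h": np.zeros(n_hidden),
        "w_o": rng.normal(scale=0.1, size=n_hidden),              # hidden -> output
        "w_d": np.zeros(n_inputs),                                # direct connection (linear, PCR-like part)
        "b_o": 0.0,
    }

def forward(net, T):
    """Return hidden activations and predicted intensity (linear output node, an assumption)."""
    H = logistic(T @ net["W_h"] + net["b_h"])
    y_hat = H @ net["w_o"] + T @ net["w_d"] + net["b_o"]
    return H, y_hat

def train_step(net, T, y, lr):
    """One full-batch gradient-descent update on half the mean squared error; returns the loss before the update."""
    n = T.shape[0]
    H, y_hat = forward(net, T)
    err = y_hat - y
    # Error propagated back through the logistic hidden nodes (uses w_o before it is updated).
    delta_h = np.outer(err, net["w_o"]) * H * (1.0 - H)
    net["w_o"] -= lr * (H.T @ err) / n
    net["w_d"] -= lr * (T.T @ err) / n
    net["b_o"] -= lr * err.mean()
    net["W_h"] -= lr * (T.T @ delta_h) / n
    net["b_h"] -= lr * delta_h.mean(axis=0)
    return 0.5 * float(np.mean(err ** 2))

def fit(net, T, y, n_iter=2000, lr_start=0.1, lr_end=0.01):
    """Train with a learning rate that decreases as training progresses (values are illustrative)."""
    return [train_step(net, T, y, lr) for lr in np.geomspace(lr_start, lr_end, n_iter)]

def rmsep(y_measured, y_predicted):
    """Root mean square error of prediction on a test set."""
    diff = np.asarray(y_measured) - np.asarray(y_predicted)
    return float(np.sqrt(np.mean(diff ** 2)))
```

With the hidden-to-output weights at zero, `forward` reduces to a linear regression on the PCA scores, which is one way to read the statement that the network contains a PCR solution when a direct connection is present. In use, scores for both the training and the test samples would be computed with `pca_scores` from the calibration data, the network fitted on the training scores, and `rmsep` evaluated on the test predictions.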

3. Results

3.1. Accuracy and reproducibility of modelling data

Analysis of variance (ANOVA) on the data revealed that replicates of both analytical and sensory data sets did not differ significantly (P > 0.05): it was concluded both were suitable for the modelling process. After PCR, variable levels of linearity were indicated in residual plots for models.

There was also a general reduction in error of prediction from PCR to ANN. Table 3 summarises the prediction errors for each model for PCR, PLS (Boccorh et al., 1999a) and ANN.

3.2. Model 1. 1989 Season: 45 UK and two New Zealand concentrates (N=47)

Figs. 2 and 3 show PCR scores for the first 4 Factors, the input data for the ANN. Training and test sets were 23 and 24, respectively, spanning the entire range of concentrates. Minimum error of prediction of 0.127 was obtained after just 3 PCR Factors (82% variance). Back-propagation in the ANN reduced the minimum error to 0.108 after more than 10⁵ iterations. Fig. 4 shows ANN observed against fitted values of flavour intensity, with a correlation coefficient of 0.64. However, New Zealand concentrates, and three from fresh, and one from frozen fruit, characterised by high levels of components conferring leathery, smoky and manure-like notes, had high residual variances and leverage effects and consequently were not well modelled.

3.3. Model 2: 1989 Season: UK concentrates (N=41)

Fig. 5 shows PCR scores obtained with training and test sets of 20 and 21 samples, respectively, after deletion of outliers in model 1. A minimum prediction error of 0.106 was attained after 2 Factors (70% variance). Fresh fruit concentrates had high levels of flavour-enhancing components, e.g. leathery, cooked beans, burning incense and old leather, while frozen fruit concentrates were characterised by non-enhancing components, e.g. dried hay and burnt beans. Back propagation reduced prediction error to 0.094 after more than 5 × 10⁴ iterations (again indicating a non-linearity). The correlation coefficient of 0.74 with back propagation (Fig. 6), compared with 0.67 obtained with PCR (Table 4), indicated an improved fit for this ANN model.

Table 3
Summary of RMSEP (prediction performance) for PCR, ANN and PLS^a

                      Correlation method
Model                 PLS^a    PCR      ANN

Model 1 (1989a)       0.137    0.127    0.108
Model 2 (1989b)       0.111    0.106    0.094
Model 3 (1990a)       0.109    0.091    0.082
Model 4 (1990b)       0.101    0.095    0.077
Model 5 (1992)        0.138    0.128    0.081
Model 6 (Global)      0.131    0.117    0.071

^a Results obtained from a previous report (Boccorh et al., 1999b).

Fig. 2. Principal component regression: concentrate scores and flavour component loadings in first and second Factors in the product space of chromatographic data from blackcurrant concentrates prepared from fresh (*), frozen () UK fruit and New Zealand concentrates (!) of the 1989 season (the underlined are with character-enhancing notes).

Fig. 3. Concentrate scores and flavour component loadings in third and fourth Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen UK fruit (), and New Zealand in (!) 1989.
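The general reduction in prediction error from PCR to ANN reported in Section 3.1 can be made explicit as a relative reduction in RMSEP per model. The sketch below only re-expresses the values printed in Table 3; the dictionary layout and the helper name `relative_reduction` are illustrative choices, not part of the original analysis.

```python
# RMSEP values transcribed from Table 3, ordered as (PLS, PCR, ANN).
rmsep_table = {
    "Model 1": (0.137, 0.127, 0.108),
    "Model 2": (0.111, 0.106, 0.094),
    "Model 3": (0.109, 0.091, 0.082),
    "Model 4": (0.101, 0.095, 0.077),
    "Model 5": (0.138, 0.128, 0.081),
    "Model 6": (0.131, 0.117, 0.071),
}

def relative_reduction(reference, improved):
    """Fractional reduction in RMSEP when moving from `reference` to `improved`."""
    return (reference - improved) / reference

for model, (pls, pcr, ann) in rmsep_table.items():
    print(
        f"{model}: ANN vs PCR {relative_reduction(pcr, ann):.1%}, "
        f"ANN vs PLS {relative_reduction(pls, ann):.1%}"
    )
```

Expressed this way, the comparison is independent of the absolute scale of the normalised intensity scores, which may help when comparing models built on different sample subsets.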

3.4. Model 3: 1990 Season: all concentrates (N=63)

PCR (training and test sets of 42 and 21 samples, respectively) revealed minimum error of prediction was attained with 3 Factors (Figs. 7 and 8) accounting for 81% variance. Imported and freeze concentrates were, however, outliers with atypical residual variances (Fig. 7). Imported concentrates were characterised by high levels of flavour enhancing components, e.g. burning incense, cooked beans, old leather and spicy on Factor 1, freeze concentrates by manure-like, cooked beans and dried hay components on Factor 2. Non-linearity was indicated by an RMSEP of 0.091. ANN, however, reduced this error to 0.082 (Table 3) after more than 10⁴ iterations. Prediction with test set samples indicated a correlation coefficient of 0.68, compared with 0.52 with PCR (Fig. 9).

Fig. 4. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for blackcurrant drinks prepared from fresh (*) and frozen UK fruit () and New Zealand concentrates (!) of 1989.

Fig. 5. Concentrate scores and flavour component loadings in first and second Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen () UK fruit in 1989.

Fig. 6. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for blackcurrant drinks prepared from fresh (*) and frozen () UK fruit concentrates in 1989.

Fig. 7. Concentrate scores and flavour component loadings of first and second Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen () UK fruit, freeze (^) concentrates, and Imported (&), Polish (~), and New Zealand (!) concentrates of the 1990 season.
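The correlation coefficients quoted for measured against predicted intensity scores (for example 0.68 for the ANN and 0.52 for PCR in Model 3) are assumed here to be ordinary Pearson correlation coefficients computed on the test set; the text does not state the formula. Under that assumption, a minimal sketch with a hypothetical helper `pearson_r` is:

```python
import math

def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences of scores."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squares of deviations from the means.
    s_mp = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_mp / math.sqrt(s_mm * s_pp)
```

A coefficient close to 1 indicates that predicted scores rank and scale with the measured ones; unlike RMSEP it is insensitive to a constant offset or a uniform scaling of the predictions, so the two measures are best read together.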

3.5. Model 4: 1990 Season; UK, Polish and New Zealand concentrates (N=53)

PCR with training and test sets each of 24 samples, after deletion of Imported and freeze concentrate samples, indicated that 3 Factors (82% variance) were optimal for this model (Figs. 10 and 11). On Factors 2 and 3, fresh fruit concentrates were characterised by flavour enhancing components. The reverse was true for frozen fruit concentrates. Non-linearity in residuals plots was confirmed by an RMSEP of 0.077, attained after more than 5 × 10⁴ iterations using direct connection, compared to 0.095 with PCR. Degree of fit was indicated by a correlation coefficient of 0.85 (Fig. 12) for ANN, compared with 0.69 obtained with PCR. This model indicated improved fits for Polish and New Zealand concentrates, as well as a general improvement in predictive ability over that in Model 3.

3.6. Model 5: 1992 Season (N=23)

The PCR model using training and test sets of 12 and 11 samples, respectively, had a single Factor, explaining 67% of variance, optimal for the prediction of character intensity score (Fig. 13). This Factor separated fresh from frozen fruit concentrates, the former with predominantly flavour enhancing components, and the latter with non-enhancing notes. Back propagation attained a minimum error value of 0.081 after more than 5 × 10⁴ iterations, an improvement over 0.128 obtained with PCR. Prediction was also improved with a correlation coefficient of 0.89 (Fig. 14) over 0.81 obtained with PCR. Although this correlation was high, flavour intensities of fresh fruit concentrates were better modelled than those from frozen fruit.

Fig. 8. Concentrate scores and flavour component loadings in third and fourth Factors, in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen () UK fruit, freeze concentrates (^), and Imported (&) Polish (~), and New Zealand (!) concentrates in 1990.

Fig. 9. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for blackcurrant drinks prepared from all 1990 concentrates (symbols as in Fig. 7).

Fig. 10. Concentrate scores and flavour component loadings in first and second Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen () UK fruit, and Polish (~), and New Zealand (!) concentrates of the 1990 season.

Fig. 11. Concentrate scores and flavour component loadings in first and second Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen () UK fruit, and Polish (~), and New Zealand (!) concentrates of 1990.

Fig. 12. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for blackcurrant drinks prepared from fresh (*) and frozen () UK fruit, and Polish (~), and New Zealand (!) concentrates of 1990.

Fig. 13. Concentrate scores and flavour component loadings in first and second Factors in product space of chromatographic data from blackcurrant concentrates prepared from fresh (*) and frozen fruit () in 1992.

Fig. 14. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for blackcurrant drinks prepared from 1992 concentrates (symbols as in Fig. 13).

Fig. 15. Concentrate scores and flavour component loadings in first and second Factors in the product space of chromatographic data from 96 randomly selected concentrates: 1989, UK fresh (&) and frozen (&) fruit; 1990, UK fresh (*) and frozen () fruit; Imported (^), Polish (!), New Zealand (!) and freeze concentrates ( ); and 1992, UK fresh (~) and frozen (~) fruit.

3.7. Model 6: 1989, 1990 and 1992 Seasons (N=96)

A subset of 96 randomly selected concentrates was used to develop a model for all three blackcurrant seasons. PCR, using 48 samples each for training and test sets, revealed that 4 Factors (81% variance) were optimal (Figs. 15 and 16). In 1990 UK and Imported concentrates were mostly characterised by flavour enhancing components on Factor 1, and others mostly with components with non-enhancing notes. Imported concentrates were also separated on Factor 2 with high levels of components with leathery and cooked beans notes. Concentrates of 1989 were clearly separated on Factor 3, mostly characterised with non-enhancing components, burning rubber, faecal and smoky (Fig. 16). The residuals plot indicated an appreciable amount of non-linearity, or lack of fit. For ANN training, update parameter and final learning rate were set at 18 and 0.05, respectively. Fig. 17 shows the ANN modelling efficiency for the test set samples. A correlation coefficient for observed against fitted values of 0.78, compared with 0.68 obtained by PCR (Table 4), is an indication of the improved modelling.

3.8. A formula for predicting intensity of flavour character in drinks

Analysis of loadings and residual variances of variables in the initial PCR model suggested 15 flavour components were important for the prediction of flavour character. Although more than this number of variables could be important in the final model, the selection of 15 was due to a limitation imposed by the Unscrambler version used in this study.

The formula derived for the PCR model was:

\[
\begin{aligned}
Y ={}& 0.8754 + 0.744(\mathrm{Man1}) + 0.041(\mathrm{Lea}) - 0.448(\mathrm{Sp2}) + 0.784(\mathrm{Sm2}) + 0.876(\mathrm{Ma2}) \\
 & + 0.715(\mathrm{Ma3}) + 0.0468(\mathrm{Ol2}) + 0.098(\mathrm{Cb1}) + 0.004(\mathrm{Sm1}) + 0.056(\mathrm{Bi1}) \\
 & + 0.156(\mathrm{Bi2}) + 0.072(\mathrm{Bi5}) + 0.026(\mathrm{Dh1}) - 3.92(\mathrm{Dh2}) - 0.065(\mathrm{Br1})
\end{aligned}
\]

where: Y = Flavour Intensity. 0.8754 is the offset (B0) for this model. The neural net model cannot be represented mathematically. The results of this study indicate the good modelling ability of neural networks utilising back-propagation. Analysis of correlation coefficients, however, suggested sample heterogeneity. For example, for the 1989 season, UK only concentrates produced an improved model (0.94) compared to when all concentrates were modelled (0.84). The high degree of heterogeneity in the global model was, however, probably the cause of the high correlation coefficient obtained in this model.

Table 4
Summary of correlation coefficients for models obtained by PCR, ANN and PLS^a

                      Regression method
Model                 PLS^a    PCR      ANN

Model 1 (1989a)       0.39     0.67     0.84
Model 2 (1989b)       0.64     0.67     0.74
Model 3 (1990a)       0.51     0.52     0.68
Model 4 (1990b)       0.59     0.69     0.85
Model 5 (1992)        0.70     0.81     0.89
Model 6 (Global)      0.64     0.68     0.78

^a Results obtained from a previous report (Boccorh et al., 1999a).

Fig. 16. Concentrate scores and flavour component loadings in first and second Factors in the product space of chromatographic data from randomly selected concentrates: UK, Imported, New Zealand and freeze concentrates from blackcurrants in 1989, 1990, and 1992 (symbols as Fig. 15).

Fig. 17. Artificial neural net correlation: relationships between measured and predicted flavour intensity scores for test-set blackcurrant sample drinks prepared from randomly selected concentrates of the 1989, 1990 and 1992 seasons (symbols as in Fig. 15).
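The PCR formula of Section 3.8 is a linear combination of 15 component levels plus an offset, so it can be evaluated directly. The sketch below is illustrative only: the component names follow the Table 2 abbreviations with ambiguous characters normalised (for example Ol2, Dh1, Br1), the negative signs on the Sp2, Dh2 and Br1 terms are an assumption because the printed operators are not legible in this copy, and the units of the input levels are not stated in the text. The function name `predict_intensity` is hypothetical.

```python
# Offset (B0) and coefficients transcribed from the Section 3.8 formula.
# ASSUMPTION: the Sp2, Dh2 and Br1 terms are taken as negative.
OFFSET = 0.8754
COEFFICIENTS = {
    "Man1": 0.744, "Lea": 0.041, "Sp2": -0.448, "Sm2": 0.784, "Ma2": 0.876,
    "Ma3": 0.715, "Ol2": 0.0468, "Cb1": 0.098, "Sm1": 0.004, "Bi1": 0.056,
    "Bi2": 0.156, "Bi5": 0.072, "Dh1": 0.026, "Dh2": -3.92, "Br1": -0.065,
}

def predict_intensity(levels):
    """Evaluate the linear PCR formula for a mapping of component name -> level.

    Components missing from `levels` are treated as zero; unknown names raise KeyError.
    """
    unknown = set(levels) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown components: {sorted(unknown)}")
    return OFFSET + sum(coef * levels.get(name, 0.0) for name, coef in COEFFICIENTS.items())
```

Because the model is linear, each coefficient can be read as the change in predicted flavour intensity per unit change in that component with the others held fixed, which is the kind of explicit interpretation that the text notes is not available for the neural network model.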

4. Discussion Models developed with statistical methods can be ad


hoc, predictive or causative. The models generated in
In this study ANN models had improved prediction this study cannot be classified as causative, as such
abilities over PCR and PLS analyses (Boccorh, 1996; models would require additional data on odour or fla-
Boccorh et al., 1999a). The ANN prediction, however, vour stimulant profiles, and interactions on physi-
gave smaller improvements over efficient PLS models ological and perceptual levels (von Sydow & Åkesson,
(high correlation coefficients), than over those with poor 1977). The present study did not set out to include any
PLS predictive ability. This was observed in model 1 such information, thus excluding any extrapolation
(1989), models 3 and 4 (1990), where improvements outside data ranges. Moreover such models are only
were from 0.67 to 0.84, 0.52 to 0.68 and 0.69 to 0.85, rarely reported in food research (Guadagni & Meirs,
respectively. This was in contrast with observed 1969). Within season models are likely to be ad hoc in
improvements in model 2 (1989 UK samples) and model nature since they were intended to display and sum-
5 (1992 samples) and model 6 (global). marise a given set of data by means of numerical fit-
Data sets for concentrates for which PLS produced tings. Peerson and von Sydow (1972), suggested
models with comparatively low predictive abilities, took replication of related experiments in addition to inclu-
more iterations for back propagation to reach overall sion of much heterogeneity in samples as possible to
global minima. Such improvements indicated greater minimise chances of having ad hoc characteristics. The
non-linearity in correlations. An alternative explanation global model (model 6) is thus most likely to be predictive
would be reductions in prediction error by PCR. since it includes samples from all three blackcurrant crops
Although it is possible that this observation could result in addition to having replicated data. Perhaps more
from less apparent underlying relationships and fits of importantly, the possible predictive nature of this model
the particular data sets, this is unlikely since only the was indicated by the improved predictions obtained
first few principal components were used as inputs for despite the heterogeneity in samples that caused pro-
neural networks. Such components generally contain blems in the individual within season models.
meaningful relationships. This supports an earlier
report that improved predictive abilities by back pro-
pagation were not indicated by increases in the number 5. Conclusions
of PCR Factors explained (Martens & Martens, 1986).
The ANN also tended to minimise prediction errors Artificial neural network models gave slightly better
arising from concentrates that were outliers in PLS: these predictions of intensity of overall flavour character from
probably had atypical contents of aroma components. concentrates in blackcurrant drinks than those from
It was interesting to note that most of the variables partial least square regression. A final predictive model
(flavour components) in the final formula for the global would be satisfactory for industrial applications. The
model had flavour enhancing notes. This is a possible use of sorbent extracts for compositional analysis of
indication of the importance of such compounds in the flavour components by gas chromatography would
processed blackcurrant flavour of concentrates. The inten- facilitate industrial implementation of prediction of fla-
sities of the few flavour non-enhancing (negative) compo- vour intensity. Such an approach would provide an
nents; burning rubber and dried hay are subtractions from alternative to an electronic nose.
the equation that relates to the overall flavour intensity.
A central question in ANN modelling is whether to
use original or latent variables as input data. In an ear- Acknowledgements
lier study, Piggott et al. (1993) concluded that variations
in chromatographic data (obtainable by PCA) for John R. Piggott is thanked for his advice.
blackcurrant drinks were likely due to simultaneous
differences in significant numbers of aroma components
in concentrates, rather than changes in a small number References
of impact compounds. Use of factors scores for the
optimum number of principal component would be Arteaga, G. B., & Nakai, S. (1993). Predicting protein function with
artificial neural networks. Journal of Food Science, 58, 1152–1156.
predicted to provide summarised versions of the original Bardot, I., Bochereau, L., Martin, N., & Palagos, B. (1994). Sensory-
data, and facilitate modelling of non-linear relation- instrumental correlations by combining data analysis and neural
ships. In addition, noise and random error in the origi- networks techniques. Food Quality and Preference, 5, 159–166.
nal data will be excluded by use of these scores. Thus, Boccorh, R. K. (1996) An analysis of the relationships between compo-
ANN modelling would be expected to minimise error. sition of blackcurrant concentrates and intensity of formulated bev-
erage flavour, PhD thesis, University of Strathclyde, Glasgow, UK.
This is because training of networks is, principally, the Boccorh, R. K., & Paterson, A. (2002) Quantifying flavour in black-
location of minimal error values on surfaces (Rumelhart currant drinks from fruit concentrates. Journal of Sensory Studies
et al., 1986; Smith & Walter, 1991; Wasserman, 1989). (in press).

Boccorh, R. K., Paterson, A., & Piggott, J. R. (1999a). Development of a model for intensity of flavour character in blackcurrant concentrates. Journal of the Science of Food and Agriculture, 79, 1495–1502.
Boccorh, R. K., Paterson, A., & Piggott, J. R. (1999b). Sources of variations in aroma-active volatiles, or flavour components in blackcurrant concentrates. European Food Research and Technology, 208, 362–368.
Boccorh, R. K., Paterson, A., & Piggott, J. R. (2002). Extraction of flavour components to quantify overall sensory character in a processed blackcurrant (Ribes nigrum L.) concentrate. Flavour and Fragrance Journal (in press).
Borgaard, C., & Thodberg, H. H. (1992). Optimal minimal neural interpretation of spectra. Analytical Chemistry, 64, 545–551.
Buntine, W. (1996). 3. Graphical models for discovering knowledge. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 59–82). Menlo Park, California: MIT Press.
Cheeseman, P., & Stutz, J. (1996). 6. Bayesian classification (autoclass): theory and results. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 153–180). California: MIT Press.
Cheng, B., & Titterington, D. M. (1994). Neural networks: a review from a statistical perspective. Statistical Science, 9(1), 2–54.
Esbensen, K., Schönkopf, S., & Midtgaard, T. (1996). Multivariate analysis in practice. Trondheim, Norway: Computer Aided Modelling (CAMO) A/S.
Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996). 1. From data mining to knowledge discovery: an overview. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in knowledge discovery and data mining (pp. 1–34). California: MIT Press.
Frijters, J. E. R. (1979). Some psychophysical notes on the use of the odour unit number. In D. G. Land, & H. E. Nursten (Eds.), Progress in flavour research (pp. 47–51). London: Applied Science Publishers.
Guadagni, D. G., & Miers, J. C. (1969). Statistical relationship between methyl sulphide content and aroma intensity in canned tomato juice. Food Technology, 23, 375–377.
Lipp, M. (1996a). Comparison of PLS, PCR and MLR for the quantitative determination of foreign oils and fats in butter of several European countries by their triglyceride composition. European Food Research and Technology, 202, 193–198.
Lipp, M. (1996b). Determination of adulteration of butter fat by its triglyceride composition obtained by GC. A comparison of the suitability of PLS and neural networks. Food Chemistry, 55, 389–395.
Lippmann, R. P. (1987). An introduction to computing with neural networks. IEEE ASSP Magazine, April, 4–22.
MacFie, H. J., Bratchell, N., Greenhoff, K., & Vallis, L. (1989). Designs to balance the effect of order of presentation and first order carry-over effects in Hall tests. Journal of Sensory Studies, 4, 129–148.
Maindonald, J. H. (1998). New approaches to using scientific- data statistics, data mining & related technologies in research & research training. Available: http://www.anu.edu.au/graduate/papers/gs982html.
Martens, M., & Martens, H. (1986). Partial least squares regression. In J. R. Piggott (Ed.), Statistical procedures in food research (pp. 293–359). London: Elsevier Applied Science.
Martens, H., & Næs, T. (1989). Multivariate calibration. New York: John Wiley.
Martens, M., Martens, H., & Wold, S. (1983a). Preference of cauliflower related to sensory descriptive variables by partial least squares (PLS) regression. Journal of the Science of Food and Agriculture, 34, 715–724.
Martens, H., Wold, S., & Martens, M. (1983b). A layman’s guide to multivariate data analysis. In H. Martens, & H. Russwurm Jr. (Eds.), Food research and data analysis (pp. 473–492). London: Applied Science Publishers.
Meilgaard, M. C., Elizondo, A., & Moya, B. (1970). A study of carbonyl compounds in beer. Part 2. Flavour and flavour thresholds of aldehydes and ketones added to beer. Master Brewers Association of the Americas (Technical Quarterly), 7, 143–149.
Michie, D., Spiegelhalter, D. J., & Taylor, C. C. (1994). Machine learning, neural and statistical classification. New York: Ellis Horwood.
Moskowitz, H. R. (1983). Product testing and sensory evaluation of foods: marketing and research and development approaches. Westport Conn: Food and Nutrition Press Inc.
Moy, L., Vasic, G., Berdague, J. L., & Rossi, V. (1995). Transient signal modelling for fast odour classification. In P. Etiévant, & P. Schreier (Eds.), Bioflavour 95 (pp. 55–58). Paris: INRA.
Ni, H., & Gunasekaran, S. (1998). Food quality prediction with neural networks. Food Technology, 52, 60–65.
Ni, H., Gunasekaran, S., Bogenrief, D., & Olson, N. F. (1994). Predicting cheese quality using neural networks. Paper 943560, presented at ASAE International Winter Meeting, Atlanta, USA.
Obermeier, K. K., & Barron, J. J. (1989). Time to get fired up: in depth, neural networks. Byte, August, 217–224.
Peerson, T., & von Sydow, E. (1972). A quality comparison of frozen and refrigerated cooked sliced beef. 2. Relationships between gas chromatographic data and flavour profiles. Journal of Food Science, 37, 234–239.
Piggott, J. R., Paterson, A., & Clyne, J. (1993). Prediction of flavour intensity of blackcurrant (Ribes nigrum L.) drinks from compositional data on fruit concentrates by partial least squares regression. International Journal of Food Science and Technology, 28, 629–637.
Rataj, J., & Schindler, B. (1991). Multivariate data correlations by combining data analysis and neural networks techniques. In J. A. Anderson, & E. Rosenfield (Eds.), Neurocomputing: foundations of research (pp. 156–177). Boston, MA: MIT Press.
Ripley, B. D. (1996). Pattern recognition and neural networks. Cambridge, UK: CU Press.
Rosenblatt, F. (1962). Principles of neurodynamics. New York: Spartan Books.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In Parallel distributed processing, 1 (pp. 318–326). Boston, MA: MIT Press.
Sekulic, S., Seasholtz, M. B., Wang, Z., Kowalski, B. R., Lee, S. E., & Holt, B. R. (1993). Nonlinear multivariate calibration methods in analytical chemistry. Analytical Chemistry, 65, 835–845.
Shepard, R. (1989). Factor influencing food preferences and choice. In R. Shepard (Ed.), Handbook of the psychophysiology of human eating (pp. 3–22). Chichester: Wiley and Son.
Smith, P., & Walter, L. G. (1991). Neural networks in sensory perception. In H. T. Lawless, & B. P. Klein (Eds.), Sensory science theory and applications in foods (pp. 207–222). New York: Marcel Dekker, Inc.
von Sydow, B., & Åkesson, C. (1977). Correlating instrumental and sensory flavour data. In G. G. Birch, J. G. Brennan, & K. J. Parker (Eds.), Sensory properties of foods (pp. 113–127). London: Applied Science Publishers.
Wasserman, P. D. (1989). Neural computing: theory and practice. New York: Van Nostrand Reinhold.
Widrow, B. (1963). A statistical theory of adaptation. In B. Hoff (Ed.), Adaptive control systems (pp. 178–184). New York: Pergamon Press.
Wilkinson, C., & Yuksel, D. (1997). Using artificial neural networks to develop prediction models for sensory-instrumental relationships; an overview. Food Quality and Preference, 8, 439–445.
Williams, A. A. (1985). The use of perceptual space approaches for determining the influence of intrinsic and extrinsic factors on food choice. In J. E. R. Frijters (Ed.), Consumer behaviour research and marketing of agricultural products. Proceedings of the Agro Food Workshop Organized by the Commission of the European Communities (pp. 29–37). The Hague: The National Council for Agricultural Research.
Williams, A. A. (1994). Flavour quality-understanding the relationship between sensory responses and chemical stimuli. What are we trying to do? The data, approaches and problems. Food Quality and Preference, 5, 3–16.
Williams, A. A., Rogers, C. A., & Collins, A. J. (1988). Relating chemical/physical and sensory data in food acceptance studies. Food Quality and Preference, 1, 25–31.
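
Illustrative sketch (not part of the original study). The Discussion describes networks trained by back propagation on principal component scores, and characterises training as the location of minimal error values on a surface. The Python sketch below shows that idea in miniature, under explicit assumptions: the two-column `scores` stand in for principal component scores and are synthetic, the saturating `targets` are an invented non-linear response, and the architecture (three tanh hidden units, one linear output, full-batch gradient descent) and the name `train_network` are hypothetical choices, not the settings used by the authors.

```python
import math
import random


def train_network(scores, targets, n_hidden=3, lr=0.1, epochs=2000, seed=0):
    """Fit a one-hidden-layer network by back propagation (gradient descent)."""
    rng = random.Random(seed)
    n_in, n = len(scores[0]), len(scores)
    # Each hidden row holds n_in input weights followed by a bias term.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    # Output weights: one per hidden unit, followed by a bias term.
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1])
             for row in w_hidden]
        y = sum(w * hi for w, hi in zip(w_out, h)) + w_out[-1]
        return h, y

    def mse():
        return sum((forward(x)[1] - t) ** 2
                   for x, t in zip(scores, targets)) / n

    initial = mse()
    for _ in range(epochs):
        grad_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        for x, t in zip(scores, targets):
            h, y = forward(x)
            err = y - t  # output error, propagated backwards below
            for j in range(n_hidden):
                grad_out[j] += err * h[j]
                delta = err * w_out[j] * (1.0 - h[j] ** 2)  # tanh derivative
                for i in range(n_in):
                    grad_hidden[j][i] += delta * x[i]
                grad_hidden[j][-1] += delta
            grad_out[-1] += err
        # Step downhill on the error surface.
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hidden[j][i] -= lr * grad_hidden[j][i] / n
        for j in range(n_hidden + 1):
            w_out[j] -= lr * grad_out[j] / n
    return initial, mse(), (lambda x: forward(x)[1])


# Synthetic stand-ins for scores on two principal components (ASSUMPTION).
data_rng = random.Random(1)
scores = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(30)]
# Invented saturating response, loosely mimicking a non-linear perception curve.
targets = [math.tanh(2.0 * a) + 0.3 * b * b for a, b in scores]

initial_mse, final_mse, predict = train_network(scores, targets)
print(f"MSE before training: {initial_mse:.4f}, after: {final_mse:.4f}")
```

Feeding a small number of scores rather than all 37 chromatographic variables keeps the number of weights low, which is one practical reading of the argument for latent variables as inputs. A real application would also hold out a test set, as in Fig. 17.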
