J Food Sci Technol (July 2021) 58(7):2815–2824

https://doi.org/10.1007/s13197-020-04890-9

ORIGINAL ARTICLE

Study of the influence of line scale length (9 and 15 cm) on the sensory evaluations of two descriptive methods

Aline Iamin Gomide¹ • Rita de Cássia dos Santos Navarro Silva¹ • Moysés Nascimento² • Luis Antônio Minim¹ • Valéria Paula Rodrigues Minim¹

Revised: 28 October 2020 / Accepted: 6 November 2020 / Published online: 3 January 2021
© Association of Food Scientists & Technologists (India) 2021

Abstract The line scale is widely used in different lengths to quantify the intensity of descriptors in sensory evaluation. Since studies related to its size are still limited, the objective was to determine which variables of descriptive sensory evaluation are influenced when different scale lengths are used in two methods: the Optimized Descriptive Profile (ODP) (low degree of training) and the Conventional Profile (CP) (high degree of training). Five chocolate samples were evaluated by two panels, one using the 9 cm and the other using the 15 cm line scale. The panels performed the sensory analysis using the ODP and afterwards the CP method. The following criteria were investigated: interaction between sample and evaluator, discriminative capacity, repeatability of results, and frequency of score use on the unstructured scale. The influence of scale length on sensory responses was similar in the two methods (ODP and CP). When comparing the two scales in both methods, it was observed that the 15 cm scale resulted in an improvement in discriminative capacity and a reduction of interaction, and the evaluators tended to distribute their ratings more evenly across this scale length. The repeatability of results showed a slight tendency to be better on the 9 cm scale.

Keywords Line scale · Scale length · Conventional profile · Optimized descriptive profile

✉ Aline Iamin Gomide
aline.gomide@hotmail.com
¹ Department of Food Technology, Federal University of Viçosa (UFV), Viçosa, Minas Gerais 36570-000, Brazil
² Department of Statistics, Federal University of Viçosa (UFV), Viçosa, Minas Gerais 36570-000, Brazil

Introduction

Classical descriptive analysis consists of a complete qualitative and quantitative description of the sensory characteristics of food products by a trained panel (Varela and Ares 2012). The Quantitative Descriptive Analysis (QDA) is one of the most well-known classical methods. In recent years, generic methodologies including the Conventional Profile (CP) have been extensively used owing to their greater freedom of application (Murray et al. 2001).
Because these analyses are time-consuming, many alternative methodologies have been developed to eliminate the long training stage. However, these faster methods provide only qualitative data. The Optimized Descriptive Profile (ODP), proposed by Silva et al. (2012), stands out among alternative methods due to its ability to also provide quantitative data.
In the quantitative description, the evaluator expresses the intensity of each qualitative term for a specific food, allowing applications in quality control and formulation optimization, as well as the correlation of sensory and instrumental measurements (Meilgaard et al. 2006).
Descriptive analyses commonly use three types of intensity scales: magnitude estimation, category and line. The selection of the scale depends on the method to be used. The advantage of the line scale is the absence of any numerical values associated with the response and the limited use of words, minimizing potential tendencies among evaluators to avoid or to prefer specific numbers or expressions (Minim and Silva 2016). Furthermore, it provides a large number of places (within the constraints of the actual length of the line) to indicate the intensity of the sensory attribute. Because the line scale is a type of interval scale, most statistical procedures can be used for their
analysis, including means, standard deviation, t-tests, analysis of variance and others (Stone and Sidel 2004).
Due to its advantages, the line scale was recommended for QDA and later in other classical methods such as the Free-Choice Profile (1984) and Spectrum (1991). Since the advent of the QDA, the line scale has been used in many generic methods (Dairou and Sieffermann 2002; Ginés et al. 2004; Blancher et al. 2007; Brannan 2009; Silva et al. 2012) and was selected for the ODP method.
On the line scale the intensity of an attribute generally increases from left to right, with extreme values anchored by terms that represent “weak” and “strong” intensity of the stimulus. The evaluator’s task is to make a mark on the scale that reflects the intensity of the attribute evaluated (Stone and Sidel 2004).
The line scale has been used in different lengths. The ODP and QDA recommended the use of a 9 and a 15 cm scale, respectively. Generic methods are more flexible, including studies that used 9 cm (Wszelaki et al. 2005; Silva et al. 2012; Castilhos et al. 2020), 10 cm (Dairou and Sieffermann 2002; Blancher et al. 2007; Picouet et al. 2019; Jeyaprakash et al. 2020), 12 cm (Ginés et al. 2004) and 15 cm line scales (Brannan 2009; Mielby et al. 2014; Sharma et al. 2017).
The scale length and number of scale categories are major variables that affect scale sensitivity (Stone and Sidel 2004). According to Stone and Sidel (2004), a three-point scale is less sensitive than a five-point scale (about 30%) and both are less sensitive than a seven- or nine-point scale. Regarding the line scale, Stone and Sidel (2004) mentioned that in a limited study it was observed that extending the scale from 15 to 20 cm did not increase the sensitivity. Shortening to less than 15 cm reduced the sensitivity.
According to Park et al. (2007), the number of categories to be chosen for the category scale would depend on the number of different stimuli to be assessed. Enough categories should be available to represent accurately the perceived spacing between the ranks. For a 9-point scale, a given judge would only be able to represent the spacing between perhaps four or five products for a single experimental session, while data for the other products would not be useful because they would be ‘bunched’ together (Park et al. 2007).
Regarding the line scale, it would be expected that the scale length would depend on the number of samples. A large number of samples would call for a longer scale, but what about a small number of samples? Is there any difference when different lengths are used? Even when a small number of samples is considered, different studies adopted different scale lengths, showing that in the literature there is no consensus on that selection. For example, a 9 cm scale was used for evaluation of five (Silva et al. 2012) and six samples (Wszelaki et al. 2005), while Hong et al. (2010) and Lee and Vickers (2010) used a longer scale (15 cm) to evaluate almost the same numbers of samples, four and six samples, respectively, showing the lack of consensus.
Several studies have compared different types of scaling such as category scales, line scales and magnitude estimation (Shand et al. 1985; Lawless and Malone 1986; Purdy et al. 2002; Jeon et al. 2004; Silva et al. 2013a; Gamba et al. 2020), but few studies have focused on comparing different lengths of line scale (Carlin et al. 1956; Jeon et al. 2004; Park et al. 2007). Additionally, these few studies that compared different scale lengths have focused on counting scaling errors based on the inversion of stimulus, considering how a judge generates numbers in response to a set of intensities, and on some other aspects (Carlin et al. 1956; Jeon et al. 2004; Park et al. 2007), without measuring the effect of scale length on sensory responses obtained by statistical tests.
Due to the scarcity of studies, it is necessary to develop more works on this subject. Therefore, this study aims to compare the influence of two different scale lengths on some variables of descriptive sensory evaluation, such as sample × evaluator interaction, discrimination of samples and repeatability of results, by means of statistical tests, and also to compare the frequency of score use on the unstructured scales.
According to Meilgaard et al. (2006), besides the adequate scale selection, the validity and reliability of intensity measurements depend on the training of the evaluators. Therefore, the training is an important aspect that should be considered when evaluating the influence of line scale length. Thus, this work aims to study the influence of line scale length on sensory responses in one descriptive method with a low degree of training (ODP, which provides quantitative data) and in another with a high degree of training (CP, widely used with freedom of application), considering five samples in the evaluations.

Material and methods

The two descriptive methods (ODP and CP) were performed using two scales: a 9 cm line scale (because it is already recommended by ODP) and a 15 cm line scale (commonly used in descriptive methods and recommended by the QDA). The influence of scale length was evaluated separately for each sensory method. Subsequently, it was verified which of the methods is more sensitive to the variation of scale length.


Samples

Five chocolate samples were utilized as food matrices and were defined by preliminary tests. The five samples were composed of different proportions of milk and bittersweet chocolate, respectively, 90:10; 70:30; 50:50; 30:70; 10:90. The milk and bittersweet chocolate contained 35% and 70% of cocoa, respectively. Each chocolate unit measured approximately 30 mm in diameter and 20 mm in height and was prepared by a local company (Viçosa, MG, Brazil).

Sensory evaluation

The sensory characterization of chocolates was conducted in a laboratory of the Federal University of Viçosa and it was approved by the ethics committees of the Institution (17104913.4.00005153). It was performed by two sensory descriptive methods: CP and ODP. The common stages of both methods were: recruitment, pre-selection, determination of descriptive terminology and familiarization of evaluators with the reference material. The experiment design overview is illustrated in Fig. 1.
A total of 64 candidates were recruited using questionnaires, as proposed by Meilgaard et al. (2006). The pre-selection of recruited candidates consisted of a sequence of four triangular tests. The criterion for selection was correct answers in 75% of the tests, as recommended by Meilgaard et al. (2006). The sensory attributes were defined by the previous list technique, obtained from Minim et al. (2000) and Silva et al. (2013b), as proposed by Damasio and Costell (1991). Eight attributes were defined by consensus: brown color, cocoa mass aroma, cocoa mass flavor, sweetness, residual bitterness, hardness, spreadability and adhesivity. Next, with the assistance of the evaluators, the reference materials (“weak” and “strong”) of each attribute were defined.
For the familiarization step the descriptive terms and their reference materials were presented to the evaluators in individual booths during one session. The evaluators were instructed to read the attribute definitions and to taste the references.
Subsequently, the evaluators were randomly distributed into two teams, each with 20 evaluators, satisfying the minimum of 16 as proposed by Silva et al. (2014b) for the ODP method. One panel evaluated the samples according to the ODP method followed by the CP, using the 9 cm line scale (panel 1). The other performed the same procedure using the 15 cm line scale (panel 2). The scales were presented on printed ballots.
The initial and common steps of both methods (ODP and CP) were executed by the two panels together in order to obtain consensus in the evaluation of the sensory attributes, allowing for subsequent comparison between the techniques.

Fig. 1 Flow diagram of the experiment
Common steps:
  Recruitment of the evaluator candidates (64 questionnaires) → 60 recruited candidates
  Pre-selection of the evaluators (triangular tests) → 40 pre-selected evaluators
  Definition of the descriptive terminology and definition of the reference material
  Familiarization with the reference material
Panel 1 (20 candidates):
  Evaluation, 9 cm line scale (ODP-9)
  Training tests (9 cm line scale)
  Selection of the evaluators → 16 candidates (the same 16 evaluators were considered for analysis of results)
  Evaluation, 9 cm line scale (CP-9)
Panel 2 (20 candidates):
  Evaluation, 15 cm line scale (ODP-15)
  Training tests (15 cm line scale)
  Selection of the evaluators → 16 candidates (the same 16 evaluators were considered for analysis of results)
  Evaluation, 15 cm line scale (CP-15)

Optimized descriptive profile (ODP)

The evaluators of panels 1 and 2 began assessing the chocolate samples according to the attribute-by-attribute protocol, as recommended by Silva et al. (2012). Therefore, only one attribute was evaluated per session and all the samples were presented together with the reference materials of the attribute to be evaluated.
Panel 1 evaluated the samples using the 9 cm line scale and panel 2 using the 15 cm scale, thus generating data for the two techniques (ODP-9 and ODP-15). Evaluations were performed using the Balanced Block Design (BBD). Thus, each evaluator represented a block and assessed samples with three repetitions for each attribute. The number of sessions required corresponded to the number of attributes (eight) multiplied by the number of repetitions, totaling 24 sessions.

Conventional profile (CP)

After performing all ODP steps, the evaluators of both panels were properly trained for subsequent evaluation of products by the CP. Thus, the evaluation step of the ODP served as a pre-training of the panels, as performed by Silva et al. (2012).
Training consisted of several exercises, including ordering tests, recognition of reference materials and allocation of sensory attribute intensity on the line scale, as performed by Simiqueli et al. (2015). The assessors underwent training exercises for about two months. After that, the evaluators performed a preliminary test to verify if they were adequately trained. Thus, the final evaluation step of CP was simulated with four repetitions, according to BBD, where two samples of chocolate were evaluated (one composed of 70% of milk chocolate + 30% of bittersweet chocolate and the other of 30% of milk chocolate + 70% of bittersweet chocolate). Panel 1 performed the preliminary tests using the 9 cm line scale and panel 2 used the 15 cm line scale. Analyses of variance (ANOVA) were performed per attribute for each evaluator. The evaluators that presented discriminatory capacity and reproducibility for all attributes (p.Fsample < 0.3 and p.Frepetition > 0.05) were selected, considering the same selection parameter used by Silva et al. (2012). Thus, of the 20 evaluators of each panel, 16 presented satisfactory selection parameters and were selected to make up the sensory team. To permit comparison between the ODP and CP methods, the evaluators of panel 1 and panel 2 considered for analysis of results in the ODP were the same 16 used in the CP.
Lastly, the trained and selected evaluators analyzed the test-chocolates according to the BBD. Each evaluator randomly analyzed the five samples in one session in relation to all attributes, without the presence of reference materials and in a monadic way. Three repetitions were conducted, resulting in 3 evaluation sessions. Panel 1 evaluated the samples using the 9 cm line scale (CP-9) and panel 2 using the 15 cm line scale (CP-15).

Statistical analysis

The data obtained by the four techniques (ODP-9, ODP-15, CP-9 and CP-15) were individually analyzed in relation to the effect of sample × evaluator interaction, discrimination of samples, repeatability of results and frequency of score use on the line scale. The specific techniques compared were: (i) ODP-9 and ODP-15, and (ii) CP-9 and CP-15, in order to evaluate the effect of scale length in both methods and to determine which method was more influenced by the scale length with regard to each criterion studied.

Effect of sample × evaluator interaction

The effect of interaction was determined by ANOVA with two sources of variation (sample and evaluator) and sample × evaluator interaction. Eq. (1) shows the mathematical model.

$$Y_{ijk} = m + T_i + B_j + (TB)_{ij} + e_{ijk} \qquad (1)$$

where:
• $Y_{ijk}$ = score of sample i attributed by evaluator j in repetition k;
• $m$ = constant inherent to the model or general average;
• $T_i$ = fixed effect of sample i;
• $B_j$ = random effect of evaluator j;
• $(TB)_{ij}$ = effect of sample × evaluator interaction;
• $e_{ijk}$ = normal random error, independent and equally distributed $(0, \sigma^2)$.

Significance (p < 0.05) of the interaction effect was determined for each sensory attribute by the F-test. The technique which presented the most attributes with significant effect was considered the technique with the greatest interaction.

Discriminative capacity of the evaluators

The effect of the samples (F-test) was determined for each attribute by ANOVA (Eq. 1). In the case of a significant interaction effect, the Fsample was calculated by using the Mean Square of interaction as the denominator, as recommended by Stone and Sidel (2004).
In the case of a significant difference between samples, the ANOVA was followed by the Tukey test. The statistical procedures (F-test and Tukey test) were performed considering a level of 5% of significance. The technique which formed the most groups for one specific attribute was
considered the technique with the greatest discriminative capacity.

Repeatability of the results

To assess the repeatability of the panels, the ANOVA (level of 10% of significance) was conducted considering an error among the evaluations ($e_{ij}$) and within the evaluations ($e_{ijk}$). This analysis was performed for each attribute and technique separately, considering the three repetitions of the same sample. It was determined if there was a significant effect between successive evaluations by the same panel (effect of repetition, $e_{ij}$). According to Barbin (1993), the mathematical model that represents the analysis is shown by Eq. (2). The same mathematical model was used by Silva et al. (2014a) to assess the capacity of the panel to repeat the results in the ODP.
The null hypothesis of zero variability was tested among evaluation repetitions ($\sigma_e^2 = 0$). Because it is desirable to accept the null hypothesis for this criterion, the level of significance considered was greater than the others (10%). Thus, the probability of type II error (the probability of accepting the null hypothesis when it is false) is diminished, making the criterion more rigorous. The technique that presented the most attributes with no significant effect was considered the technique with the greatest repeatability.

$$Y_{ijk} = m + T_i + B_j + e_{ij} + e_{ijk} \qquad (2)$$

where:
• $Y_{ijk}$ = score of sample i attributed by evaluator j in repetition k;
• $m$ = constant inherent to the model or general average;
• $T_i$ = fixed effect of sample i;
• $B_j$ = random effect of evaluator j;
• $e_{ij}$ = random effect of repetitions for evaluation of the same sample;
• $e_{ijk}$ = normal random error, independent and equally distributed $(0, \sigma^2)$.

Frequency of score use on the line scale

This analysis sought to verify which range of the scale each panel used with the greatest frequency to indicate the intensity of each attribute in each technique. To enable comparison, the data were first corrected by dividing the individual scores by the scale length used in the evaluation. Thus, the individual scores of the evaluators obtained from the ODP-9 and CP-9 techniques were divided by 9 and in the other techniques by 15. This correction was necessary since 1 cm represents a higher proportion of the 9 cm scale (about 11%) than of the 15 cm scale (about 7%). Thus, all scales were standardized to the range of 0–1 and the frequency of use of each 0.1 interval of the standardized scale was determined.

Software

The analysis of variance and the means tests were performed using the SAS (Statistical Analysis System), version 9.1, licensed to the Universidade Federal de Viçosa.

Results

Study of sample × evaluator interaction

In every technique a significant effect of interaction (p < 0.05) was observed in the F-test for at least one attribute (Table 1). The existence of interaction indicates that at least one evaluator is assessing the samples differently from the panel. This is a common occurrence in sensory analysis and is difficult to control (Silva and Damásio 1994).
A reduction in interaction was noted when the larger scale was used. In the ODP-9 the interaction was significant for four attributes (cocoa flavor, residual bitterness, hardness and adhesivity) and in the ODP-15 for only one attribute (residual bitterness).
A similar result was observed in the CP method, although in a more pronounced way. In the CP-9 the interaction was significant for all attributes, while in CP-15 a drastic reduction was observed, where interaction was significant for only one descriptive term (spreadability).

Discriminative capacity of evaluators

For all techniques the Fsample test was significant (p.Fsample < 0.01) for all attributes. Thus, the Tukey test was performed and the results are listed in Table 2.

Table 1 p-value of analysis of variance for sample × evaluator interaction
Attributes           ODP-9      ODP-15     CP-9       CP-15
                     p-value    p-value    p-value    p-value
Brown color          0.077ns    0.701ns    0.008*     0.859ns
Cocoa aroma          0.078ns    0.051ns    0.001*     0.501ns
Cocoa flavor         0.001*     0.074ns    0.003*     0.058ns
Sweetness            0.112ns    0.100ns    0.028*     0.496ns
Residual bitterness  0.002*     <0.001*    0.003*     0.136ns
Hardness             0.040*     0.509ns    <0.001*    0.306ns
Spreadability        0.400ns    0.518ns    0.021*     0.045*
Adhesivity           0.001*     0.057ns    0.012*     0.075ns
*p-value significant at 5% probability; ns: not significant at 5% probability

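The F-tests of Eq. (1), which underlie the interaction p-values in Table 1 and the sample effects compared in Table 2, can be illustrated with a minimal sketch. It assumes a balanced design (every evaluator scores every sample the same number of times). The function name `two_way_anova` and the toy scores are hypothetical, and the study itself used SAS 9.1 rather than this code. Only mean squares and F-ratios are computed; p-values would additionally require the F distribution.

```python
def two_way_anova(scores):
    """Mean squares and F-ratios for a balanced sample x evaluator design.

    scores[i][j][k] is the score given to sample i by evaluator j in
    repetition k (the Y_ijk of Eq. 1).
    """
    a, b, r = len(scores), len(scores[0]), len(scores[0][0])

    def mean(values):
        return sum(values) / len(values)

    # Means per cell (sample i, evaluator j), per sample, per evaluator.
    cell = [[mean(scores[i][j]) for j in range(b)] for i in range(a)]
    sample_mean = [mean(cell[i]) for i in range(a)]
    evaluator_mean = [mean([cell[i][j] for i in range(a)]) for j in range(b)]
    grand = mean(sample_mean)  # valid because the design is balanced

    # Sums of squares for sample, interaction and residual error.
    ss_sample = b * r * sum((m - grand) ** 2 for m in sample_mean)
    ss_inter = r * sum(
        (cell[i][j] - sample_mean[i] - evaluator_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (y - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for y in scores[i][j]
    )

    ms_sample = ss_sample / (a - 1)
    ms_inter = ss_inter / ((a - 1) * (b - 1))
    ms_error = ss_error / (a * b * (r - 1))

    return {
        "F_interaction": ms_inter / ms_error,
        "F_sample_vs_error": ms_sample / ms_error,
        # Denominator used for the sample effect when interaction is significant.
        "F_sample_vs_interaction": ms_sample / ms_inter,
    }


# Hypothetical toy data: 2 samples x 2 evaluators x 2 repetitions.
toy = [
    [[1.5, 2.5], [2.5, 3.5]],
    [[5.5, 6.5], [8.5, 9.5]],
]
result = two_way_anova(toy)
print(result)
```

When the interaction is significant, `F_sample_vs_interaction` corresponds to the ratio the text describes for the sample effect (Mean Square of interaction as the denominator); otherwise the residual error is the denominator.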

Table 2 Mean scores (± standard deviation) of the sensory attributes of chocolate in the four techniques, tested by means of the Tukey test (α = 0.05)
Technique  Sample  Brown color   Cocoa aroma   Cocoa flavor  Sweetness     Residual      Hardness      Spreadability Adhesivity
                                                                           bitterness
ODP 9 cm   F1      0.7 ± 0.7e    0.9 ± 0.9d    0.7 ± 0.7d    7.9 ± 1.5a    0.6 ± 0.6e    1.4 ± 1.4d    7.7 ± 1.6a    7.5 ± 2.0a
           F2      2.7 ± 1.2d    2.7 ± 1.7c    2.3 ± 1.4c    6.6 ± 1.8b    2.2 ± 1.2d    3.4 ± 2.1c    6.8 ± 1.5a    6.1 ± 2.6a
           F3      5.5 ± 1.4c    5.1 ± 2.4b    4.4 ± 2.0b    3.6 ± 1.8c    4.5 ± 1.9c    6.1 ± 2.1b    3.6 ± 1.8b    3.6 ± 2.4b
           F4      7.1 ± 1.4b    7.0 ± 1.6a    6.3 ± 2.0a    1.9 ± 1.5d    6.4 ± 1.8b    7.3 ± 1.6ab   2.6 ± 1.9bc   1.7 ± 1.2b
           F5      8.3 ± 0.8a    8.1 ± 1.6a    7.5 ± 1.7a    0.9 ± 0.6d    8.0 ± 1.0a    7.8 ± 1.3a    1.5 ± 0.9c    1.8 ± 1.5b
ODP 15 cm  F1      1.3 ± 1.1e    1.9 ± 1.5d    1.5 ± 1.4e    13.1 ± 1.8a   1.2 ± 1.0e    3.3 ± 2.8d    12.8 ± 2.0a   12.6 ± 2.3a
           F2      4.4 ± 1.8d    4.2 ± 2.7c    4.0 ± 2.7d    10.3 ± 3.1b   3.2 ± 2.3d    5.7 ± 3.5c    11.3 ± 2.1a   10.5 ± 3.1b
           F3      7.7 ± 2.1c    8.1 ± 3.1b    7.5 ± 3.1c    7.0 ± 3.1c    6.8 ± 3.2c    9.0 ± 3.4b    7.2 ± 3.2b    7.4 ± 3.4c
           F4      11.8 ± 1.7b   10.9 ± 2.5a   10.4 ± 2.5b   4.4 ± 2.8d    10.2 ± 3.5b   10.7 ± 3.1ab  4.6 ± 3.2c    4.5 ± 3.1d
           F5      13.6 ± 1.1a   12.7 ± 2.4a   12.6 ± 2.5a   2.3 ± 2.3e    12.6 ± 2.0a   12.2 ± 2.8a   2.8 ± 2.3d    2.7 ± 2.5d
CP 9 cm    F1      0.8 ± 0.9d    0.6 ± 0.6d    0.5 ± 0.5c    8.2 ± 1.3a    0.4 ± 0.5d    1.3 ± 1.4c    8.1 ± 1.0a    8.0 ± 1.2a
           F2      2.3 ± 1.9c    2.1 ± 1.9c    1.7 ± 1.3c    7.2 ± 1.7a    1.4 ± 1.3d    2.2 ± 1.6c    7.0 ± 1.9a    7.0 ± 1.9a
           F3      4.5 ± 2.2b    4.3 ± 2.7b    4.1 ± 2.5b    4.7 ± 2.5b    4.0 ± 2.6c    4.5 ± 2.7b    4.9 ± 2.5b    4.8 ± 2.5b
           F4      6.5 ± 1.8a    6.5 ± 2.0a    6.4 ± 2.4a    2.2 ± 1.6c    6.2 ± 2.5b    6.9 ± 1.9a    2.9 ± 2.2c    2.2 ± 1.9c
           F5      7.7 ± 1.1a    7.5 ± 1.5a    7.7 ± 1.6a    1.2 ± 1.2c    7.6 ± 1.7a    6.9 ± 2.3a    1.8 ± 2.0c    1.6 ± 1.6c
CP 15 cm   F1      3.0 ± 2.2e    2.4 ± 2.2e    1.9 ± 1.7e    12.6 ± 2.7a   1.4 ± 1.3c    3.6 ± 3.2c    12.4 ± 2.3a   12.2 ± 2.5a
           F2      5.3 ± 3.2d    4.9 ± 3.6d    3.9 ± 3.4d    11.0 ± 3.4a   3.1 ± 3.5c    5.2 ± 3.3c    10.7 ± 3.6a   10.2 ± 3.9b
           F3      8.8 ± 3.0c    8.7 ± 3.5c    8.3 ± 3.7c    6.3 ± 3.5b    7.5 ± 4.2b    8.5 ± 3.3b    6.9 ± 3.7b    5.5 ± 3.6c
           F4      11.3 ± 2.9b   11.0 ± 3.1b   11.0 ± 3.5b   2.9 ± 2.7c    11.2 ± 3.6a   10.9 ± 3.0a   4.8 ± 3.3c    3.9 ± 3.4c
           F5      13.1 ± 1.5a   12.9 ± 1.4a   13.1 ± 1.7a   1.4 ± 1.1c    12.5 ± 2.8a   12.1 ± 3.0a   3.1 ± 3.1c    1.9 ± 2.0d
a,b,c,d,e: letters obtained by the Tukey test. Means followed by the same letter in the column do not differ at 5% probability
F1: chocolate sample composed of 90% of milk chocolate + 10% of bittersweet chocolate
F2: chocolate sample composed of 70% of milk chocolate + 30% of bittersweet chocolate
F3: chocolate sample composed of 50% of milk chocolate + 50% of bittersweet chocolate
F4: chocolate sample composed of 30% of milk chocolate + 70% of bittersweet chocolate
F5: chocolate sample composed of 10% of milk chocolate + 90% of bittersweet chocolate

The evaluators were able to detect significant differences between all five samples for two attributes (brown color and residual bitterness) in ODP-9. In ODP-15 this occurred for four attributes (brown color, cocoa flavor, sweetness and residual bitterness), while for the other attributes, the samples were separated into four distinct groups (minimum number of groups observed for this technique). In ODP-9 the presence of attributes with less discrimination was observed. For spreadability the samples were separated into three distinct groups and for adhesivity into only two.
In CP-9, the samples were discriminated into four groups for three attributes (brown color, cocoa aroma and residual bitterness) and the formation of five groups was not observed. For the five other attributes (more than half), the samples were separated into only three groups. Unlike CP-9, in the CP-15 five groups were formed for three attributes (brown color, cocoa aroma, cocoa flavor). With regard to adhesivity, the samples were discriminated into four groups and for the others only three groups were formed.
Thus, discrimination tended to increase when the 15 cm scale was used in both methods. Compared to the ODP-9 technique, the ODP-15 resulted in an increase of one discrimination group for three attributes (cocoa flavor, sweetness and spreadability) and of two groups for one attribute (adhesivity). In CP a similar behavior was verified, with an increase of also one group for three attributes (brown color, cocoa aroma and adhesivity) and of two groups for one attribute (cocoa flavor). Thus, the effect of scale length on sample discrimination was the same for ODP and CP.

Repeatability of the results

In ODP-9, the evaluators presented repeatability (p > 0.1) for all sensory attributes, while in ODP-15 there was a
significant effect of repetitions (p < 0.1) for two attributes, brown color and spreadability (Table 3).
Similar behavior was observed for CP, but slightly less pronounced. In CP-9, two sensory stimuli presented a significant effect of repetition (brown color and cocoa aroma) versus three in CP-15 (brown color, cocoa aroma and flavor) (Table 3).
This result shows a small tendency of evaluators to present greater repeatability on a small scale for both methods. In other words, there was a tendency of evaluators to assign similar scores in different repetitions when the small scale was used.

Table 3 Repeatability (eij) of descriptive techniques
Attributes           ODP-9      ODP-15     CP-9       CP-15
                     p-value    p-value    p-value    p-value
                     (eij)      (eij)      (eij)      (eij)
Brown color          0.199ns    0.034*     0.040*     0.012*
Cocoa aroma          0.866ns    0.257ns    0.037*     0.012*
Cocoa flavor         0.211ns    0.591ns    0.184ns    0.050*
Sweetness            0.114ns    0.762ns    0.248ns    0.156ns
Residual bitterness  0.260ns    0.621ns    0.152ns    0.144ns
Hardness             0.466ns    0.637ns    0.607ns    0.384ns
Spreadability        0.369ns    0.001*     0.184ns    0.923ns
Adhesivity           0.446ns    0.458ns    0.244ns    0.878ns
*p-value significant at 10% probability; ns: not significant at 10% probability

Frequency of score use on the line scale

For all techniques, independent of the scale length, the evaluators used the entire scale to express the intensity of attributes. However, some ranges were used more frequently than others. The evaluators that performed the ODP-9 technique (Fig. 2a) assigned scores in the ranges 0–0.1 and 0.9–1 with greater frequency for all attributes. In the ODP-15 technique (Fig. 2b) the evaluators most frequently used scores between 0–0.1 for only three attributes: bitterness, cocoa aroma and flavor. For the latter two the difference in frequency was very low when compared with other ranges. For the other attributes no range stood out in frequency of use; instead, the scores were homogeneously distributed over the scale. In general, it was therefore verified that evaluators tended to use the extremes of the scale less (upper as well as lower) for all attributes when the evaluation was performed on the larger scale.
A similar result was observed for the CP (Fig. 3). In the CP-9 technique (Fig. 3a), the ranges 0–0.1 and 0.9–1 were used with greatest frequency for all descriptive terms. This was not always observed in the CP-15 technique (Fig. 3b), as can be verified for brown color, cocoa aroma, spreadability and hardness. For the other sensory characteristics in CP-15, although the extremes of the scale were used more in comparison with the other ranges, the use of those extremes was more pronounced on the 9 cm scale (Fig. 3a) for the respective attributes.

Discussion

The results showed that some criteria of the descriptive evaluation were influenced when the same number of samples (five) was evaluated by different scale lengths (9 and 15 cm). These findings are important in the context where different studies adopted different scale lengths in the evaluation of the same number of samples (Wszelaki et al. 2005; Lee and Vickers 2010). For ODP and CP, the interaction was smaller on the larger scale. In sensory analysis, one of the reasons for interaction is the inversion of sensory stimuli perception (Silva and Damásio 1994). The small space for locating the samples on the 9 cm scale could lead to a more confusable scaling, contributing to more inversion of sensory stimuli by evaluators. This hypothesis is reinforced by Jeon et al. (2004), who stated that on smaller scales the evaluators have a greater tendency to assign equal scores, or even to invert them when evaluating samples with different intensities of a given stimulus. The inversion of stimulus generates a “disturbance”, a source of variation in the system, caused by the fact that evaluators generate particular numerical responses, spacing their scores differently on the scale (Park et al. 2004), contributing to increase the interaction when the 9 cm scale was used.
In the CP, the decrease in the number of attributes that had a significant effect of interaction on the 15 cm scale was more pronounced than in the ODP. Silva et al. (2012), when comparing the ODP and CP methodologies using a 9 cm line scale, verified a higher tendency of the CP to generate more attributes with significant sample × evaluator interactions, which agrees with the results obtained in the present study for the 9 cm scale. This observation may be a consequence of the differences between the evaluation protocols of the two methodologies. In the simultaneous protocol (ODP), the evaluators can re-taste and review the scores given to the different samples, which reduces the effect of “forgetting” (a fact observed in the monadic protocol, CP), resulting in a smaller number of errors and consequently less interaction (Jeon et al. 2004; Park et al. 2004).
However, this trend was not observed for the 15 cm scale. No representative difference of interaction effect was indicated between ODP-15 and CP-15. The fact that


Fig. 2 Frequency distribution for sensory scores assigned to the descriptive attributes of chocolates for ODP method (a) ODP-9; (b) ODP-15
[Figure: each panel plots Frequency (%), 0 to 30, against score Ranges, 0.1 to 1.0, for one attribute: brown color, cocoa aroma, sweetness, residual bitterness, cocoa flavor, spreadability, adhesivity, hardness.]

Fig. 3 Frequency distribution for sensory scores assigned to the descriptive attributes of chocolates for CP method (a) CP-9; (b) CP-15
[Figure: each panel plots Frequency (%), 0 to 30, against score Ranges, 0.1 to 1.0, for one attribute: brown color, cocoa aroma, sweetness, residual bitterness, cocoa flavor, spreadability, adhesivity, hardness.]
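
As an illustration of how frequency distributions such as those in Figs. 2 and 3 might be tabulated, the following minimal Python sketch divides each score by the scale length and counts the percentage of scores in ten equal ranges. Normalizing by scale length is an assumption inferred from the 0 to 1 axis of the figures; the function names and the example scores are hypothetical and are not data from this study.

```python
def range_frequencies(scores_cm, scale_length_cm, n_ranges=10):
    """Percentage of scores in each of n_ranges equal ranges of the normalized scale."""
    if not scores_cm:
        raise ValueError("at least one score is required")
    counts = [0] * n_ranges
    for score in scores_cm:
        if not 0 <= score <= scale_length_cm:
            raise ValueError(f"score {score} lies outside the scale")
        # Normalize to [0, 1] and map to a range index; a mark at the very
        # end of the line (normalized 1.0) is counted in the last range.
        index = min(int(score / scale_length_cm * n_ranges), n_ranges - 1)
        counts[index] += 1
    return [100.0 * c / len(scores_cm) for c in counts]


def extreme_usage(frequencies):
    """Combined percentage of scores in the lowest and highest ranges."""
    return frequencies[0] + frequencies[-1]


# Hypothetical scores in cm, for illustration only (not from the study).
panel_9 = [0.4, 8.8, 4.4, 8.5, 0.2]
panel_15 = [3.1, 7.4, 11.2, 14.6, 5.9]

freq_9 = range_frequencies(panel_9, 9.0)
freq_15 = range_frequencies(panel_15, 15.0)
print(extreme_usage(freq_9), extreme_usage(freq_15))
```

Summing the first and last ranges gives one number per panel for the use of the extremes (ranges 0–0.1 and 0.9–1), which allows panels that used scales of different lengths to be compared on the same footing.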


evaluators have a lower tendency to invert scores on a larger scale may have contributed to a smaller interaction on the 15 cm scale, so that it was not influenced by the method used. Therefore, the fact that the CP, when compared to the ODP, presented greater interaction on the 9 cm scale and did not present an appreciable difference on the 15 cm scale (with lower interaction effect) contributed to a more pronounced influence of scale on interaction in this method.

The discriminative capacity was improved when samples were evaluated on the 15 cm scale. According to Pecore et al. (2015), on scales with a limited extent the evaluators are unable to express small differences in intensity between samples. In the present study, when assessing the samples on the 15 cm scale, the evaluators had more available space to indicate the intensity of the attributes, which may have given them a greater chance of adequately representing the “spacing” between the different samples, so that they could express small differences. Furthermore, the reduction of the discriminative capacity observed on the 9 cm scale may also be related to the greater effect of interaction between samples and evaluators, since the variation introduced into the system, caused by the fact that evaluators attribute scores to the samples in a particular way, may imply a reduction in the discriminating power of the samples (Park et al. 2007).

Repeatability of the results tended to be slightly better on the 9 cm scale. The larger the scale, the greater the number of possible locations that the evaluators can use to represent the attribute intensity, making it difficult to memorize the markings in the different repetitions. According to Meilgaard et al. (2006), on larger scales the evaluators have more difficulty remembering the position indicated in the different assessments, which may have contributed to diminishing the ability of the evaluators to assign similar scores in different repetitions of the same sample.

The influence of scale length on repeatability was slightly more pronounced in the ODP. Because there was no training in the ODP, the panel that used the 15 cm scale may have felt the effect of scale size when compared with the panel that used the 9 cm scale. In the CP, the fact that the panels were trained to use the scales permitted them to adapt their use, which may have contributed to balancing responses between the panels that performed evaluations on the 9 and 15 cm scales, resulting in small differences in repeatability.

When studying the usage frequency of scores on the unstructured scale, for all techniques it was observed that the evaluators tended to use the full length of the scale to express the intensity of the attributes. This observation may be a consequence of the “response range equalizing bias” theory, which states that for a given set of stimuli, evaluators tend to distribute their responses across the entire extension of the scale, independent of its length (Poulton 1973). However, some ranges were used more frequently than others. The evaluators who used the 15 cm scale for both methods tended to use the extremes of the scale less when compared to the panels that used the 9 cm scale, showing that they tended to distribute their ratings more evenly along the length of the scale. The reduced utilization of the extreme ranges on the larger scale may reflect a lesser need of the evaluators to use its entire length to represent differences in intensity between the samples, i.e., to express the “correct spacing” between them. On the other hand, greater utilization of the extremes on the 9 cm scale may have shown the need for more space so that the evaluators could separate all samples, thus more frequently positioning extreme chocolate samples at the ends of the scale.

Conclusion

The studied variables of descriptive sensory evaluation were influenced by the scale length. This influence had a similar behavior in the two methods, differing in sensitivity regarding the sample × evaluator interaction, whose effect was more pronounced in the CP method. For the conditions studied, the 15 cm line scale is more advantageous because it provides greater discrimination and decreases the interaction effect. Although repeatability tended to be better on the 9 cm scale, this variable was the least influenced by scale length.

Acknowledgments The authors would like to acknowledge the National Research Council (CNPq) for their financial support.

References

Barbin D (1993) Componentes de variância: teoria e aplicações, 2nd edn. FEALQ, Piracicaba
Blancher G, Chollet S, Kesteloot R et al (2007) French and Vietnamese: how do they describe texture characteristics of the same food? A case study with jellies. Food Qual Prefer 18:560–575. https://doi.org/10.1016/j.foodqual.2006.07.006
Brannan RG (2009) Effect of grape seed extract on descriptive sensory analysis of ground chicken during refrigerated storage. Meat Sci 81:589–595. https://doi.org/10.1016/j.meatsci.2008.10.014
Carlin A, Kempthorne O, Gordon J (1956) Some aspects of numerical scoring in subjective evaluation of foods. J Food Sci 21:273–281. https://doi.org/10.1111/j.1365-2621.1956.tb16921.x
Castilhos MB, Del Bianchi V, Gómez-Alonso S et al (2020) Sensory descriptive and comprehensive GC-MS as suitable tools to characterize the effects of alternative winemaking procedures on wine aroma. Part II: BRS Rúbea and BRS Cora. Food Chem 311:126025. https://doi.org/10.1016/j.foodchem.2019.126025


Dairou V, Sieffermann J-M (2002) A comparison of 14 jams characterized by conventional profile and a quick original method, the flash profile. J Food Sci 67:826–834. https://doi.org/10.1111/j.1365-2621.2002.tb10685.x
Damasio M, Costell E (1991) Análisis sensorial descriptivo: generación de descriptores y selección de catadores. Rev Agroquímica Tecnol Aliment 31:165–178
Gamba MM, Lima Filho T, Della Lucia SM et al (2020) Performance of different scales in the hedonic threshold methodology. J Sens Stud:1–15. https://doi.org/10.1111/joss.12592
Ginés R, Valdimarsdottir T, Sveinsdottir K, Thorarensen H (2004) Effects of rearing temperature and strain on sensory characteristics, texture, colour and fat of Arctic charr (Salvelinus alpinus). Food Qual Prefer 15:177–185. https://doi.org/10.1016/S0950-3293(03)00056-9
Hong JH, Duncan SE, Dietrich AM (2010) Effect of copper speciation at different pH on temporal sensory attributes of copper. Food Qual Prefer 21:132–139. https://doi.org/10.1016/j.foodqual.2009.08.010
Jeon S-Y, O’Mahony M, Kim K (2004) A comparison of category and line scales under various experimental protocols. J Sens Stud 19:49–66. https://doi.org/10.1111/j.1745-459X.2004.tb00135.x
Jeyaprakash S, Heffernan J, Driscoll R, Frank D (2020) Impact of drying technologies on tomato flavor composition and sensory quality. LWT - Food Sci Technol 120:108888. https://doi.org/10.1016/j.lwt.2019.108888
Lawless H, Malone G (1986) The discriminative efficiency of common scaling methods. J Sens Stud 1:85–98. https://doi.org/10.1111/j.1745-459X.1986.tb00160.x
Lee CA, Vickers ZM (2010) Discrimination among astringent samples is affected by choice of palate cleanser. Food Qual Prefer 21:93–99. https://doi.org/10.1016/j.foodqual.2009.08.003
Meilgaard M, Civille G, Carr B (2006) Sensory evaluation techniques, 4th edn. CRC Press
Mielby LH, Hopfer H, Jensen S et al (2014) Comparison of descriptive analysis, projective mapping and sorting performed on pictures of fruit and vegetable mixes. Food Qual Prefer 35:86–94. https://doi.org/10.1016/j.foodqual.2014.02.006
Minim VPR, Silva RCSN (2016) Análise Sensorial Descritiva, 1st edn. Editora UFV, Viçosa-MG
Minim VPR, Silva MA, Cecchi HM (2000) Perfil sensorial de ovos de Páscoa. Ciência e Tecnol Aliment 20:47–50. https://doi.org/10.1590/S0101-20612000000100010
Murray J, Delahunty C, Baxter I (2001) Descriptive sensory analysis: past, present and future. Food Res Int 34:461–471. https://doi.org/10.1016/S0963-9969(01)00070-9
Park J-Y, Jeon S-Y, O’Mahony M, Kim K-O (2004) Induction of scaling errors. J Sens Stud 19:261–271. https://doi.org/10.1111/j.1745-459X.2004.tb00147.x
Park JY, O’Mahony M, Kim KO (2007) “Different-stimulus” scaling errors; effects of scale length. Food Qual Prefer 18:362–368. https://doi.org/10.1016/j.foodqual.2006.03.021
Pecore S, Kamerud J, Holschuh N (2015) Ranked-scaling: a new descriptive panel approach for rating small differences when using anchored intensity scales. Food Qual Prefer 40:376–380. https://doi.org/10.1016/j.foodqual.2014.02.002
Picouet PA, Gou P, Pruneri V et al (2019) Implementation of a quality by design approach in the potato chips frying process. J Food Eng 260:22–29. https://doi.org/10.1016/j.jfoodeng.2019.04.013
Poulton E (1973) Unwanted range effects from using within-subject experimental designs. Psychol Bull 80:113–121. https://doi.org/10.1037/h0034731
Purdy JM, Armstrong G, McIlveen H (2002) Three scaling methods for consumer rating of salt intensity. J Sens Stud 17:263–274. https://doi.org/10.1111/j.1745-459X.2002.tb00347.x
Shand P, Hawrysh Z, Hardin R, Jeremiah L (1985) Descriptive sensory assessment of beef steaks by category scaling, line scaling and magnitude estimation. J Food Sci 50:495–500. https://doi.org/10.1111/j.1365-2621.1985.tb13435.x
Sharma M, Kristo E, Corredig M, Duizer L (2017) Effect of hydrocolloid type on texture of pureed carrots: rheological and sensory measures. Food Hydrocoll 63:478–487. https://doi.org/10.1016/j.foodhyd.2016.09.040
Silva MP, Damásio M (1994) Análise sensorial descritiva. Fundação Tropical de Pesquisas e Tecnologia “André Tosello”, Campinas
Silva RCSN, Minim VPR, Simiqueli AA et al (2012) Optimized descriptive profile: a rapid methodology for sensory description. Food Qual Prefer 24:190–200. https://doi.org/10.1016/j.foodqual.2011.10.014
Silva AN, Silva RCSN, Ferreira MAM et al (2013a) Performance of hedonic scales in sensory acceptability of strawberry yogurt. Food Qual Prefer 30:9–21. https://doi.org/10.1016/j.foodqual.2013.04.001
Silva RCSN, Minim VPR, Carneiro JD et al (2013b) Quantitative sensory description using the optimized descriptive profile: comparison with conventional and alternative methods for evaluation of chocolate. Food Qual Prefer 30:169–179. https://doi.org/10.1016/j.foodqual.2013.05.011
Silva RCSN, Minim VPR, Silva AN et al (2014a) Optimized descriptive profile: how many judges are necessary? Food Qual Prefer 36:3–11. https://doi.org/10.1016/j.foodqual.2014.02.011
Silva RCSN, Minim VPR, Silva AN et al (2014b) Validation of optimized descriptive profile (ODP) technique: accuracy, precision and robustness. Food Res Int 66:445–453. https://doi.org/10.1016/j.foodres.2014.10.015
Simiqueli AA, Minim VPR, Silva RCSN et al (2015) How many assessors are necessary for the optimized descriptive profile when associated with training? Food Qual Prefer 44:62–69. https://doi.org/10.1016/j.foodqual.2015.03.019
Stone H, Sidel J (2004) Sensory evaluation practices, 3rd edn. Academic Press, New York
Varela P, Ares G (2012) Sensory profiling, the blurred line between sensory and consumer science. A review of novel methods for product characterization. Food Res Int 48:893–908. https://doi.org/10.1016/j.foodres.2012.06.037
Wszelaki AL, Delwiche JF, Walker SD et al (2005) Consumer liking and descriptive analysis of six varieties of organically grown edamame-type soybean. Food Qual Prefer 16:651–658. https://doi.org/10.1016/j.foodqual.2005.02.001

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
