Boris Oicherman PhD Thesis

Effects of colorimetric additivity failure
and of observer metamerism
on cross-media colour matching
Boris Oicherman
Project supervisor: Ronnier M. Luo

Co-supervisors: Alan R. Robertson, Arthur W. S. Tarrant, Brian Rigg
Submitted in accordance with the requirements for the degree of
Doctor of Philosophy in Colour Science
The University of Leeds

Department of Colour Science
May 2007
The candidate confirms that the work submitted is his own
and that appropriate credit has been given
where reference has been made to the work of others.
This copy has been supplied on the understanding
that it is copyright material
and that no quotation from the thesis may be published
without proper acknowledgement.
Acknowledgements
I would like to thank my observers for their time, effort and dedication: Seo-Young Choi,
Chen-Yang Fu, Don-Gyou Lee, Raymond Ho, Wen-Yuan Lee, Wei Ji, Saori Kitaguchi, Jang-Jin Yoo,
Youn Jin Kim, Hossain Izadan and Cheng Li.
I would like to thank Dr. Arthur Tarrant for building a fantastic visual colorimeter, and for his
invaluable help and advice in designing my experiments.
I would like to thank Dr. Peter Rhodes for his continuous support.
I would also like to thank Dr. Alan Robertson for fruitful discussions, criticism and advice.
Finally, I am grateful to Prof. Ronnier Luo: for making my postgraduate studies possible, for
endless arguments, tireless support, infinite patience, and for being the best boss on this planet.
Abstract
Observer metamerism and failures of colorimetric additivity were investigated in two colour
matching settings. One was the classical large-field bipartite field matching with the use of
narrow-band lights. The other was the cross-media setup typical of soft-proofing, with the use
of reflective stimuli and two types of computer monitors.
In the first experiment, five observers made repeated maximum saturation colour matches with a
6° bipartite field, and one observer made similar matches with a 2° field. The results indicate
failure of the additivity law in the results of individual observers. A high correlation is shown
between the failures of additivity in large field matching and rod participation. Variability of
colour matching functions within our group of observers was very similar to the variability
within the Stiles and Burch colour matching dataset. We conclude that additional colour
matching data may not be required for successful modelling of observer metamerism.
In the second experiment, eleven observers made repeated cross-media colour matches between
two computer displays and surface colour stimuli. Observer metamerism is shown to be an
insignificant factor in the variability of colour matches in all colours except neutrals. The
properties of the distribution of individual judgements suggest that the precision of cross-media
colour matches is governed by thresholds of colour discrimination, and thus can be modelled well
by advanced colour difference formulae with suitably adjusted parametric coefficients.
In the same cross-media colour matching conditions, we find significant systematic
discrepancies between the predictions of the CIE Standard Colorimetric Observer and the mean
matches made by the group of observers. We attribute these discrepancies to additivity failures
caused by postreceptoral adaptation, leading to a nonlinear change in the sensitivity of the S/(L+M)
chromatic channel. An adaptation transform accounting for postreceptoral adaptation can
compensate for the colorimetric discrepancies; a framework for such a transform is proposed.
We suggest that a reliable cross-media colour reproduction system involving devices with
narrow-band colorants cannot be established based on basic colorimetry alone. Observer
metamerism cannot be assumed to govern the individual differences in matches in conditions
different from quasi-symmetric colour matching, and an adaptation transform must be
introduced in order to compensate for the adaptation differences caused by different spectral
properties of colorants of different media.
Table of contents
1. Introduction ........................................................................................................................................ 13
1.1. Background ....................................................................................................................................................14
1.2. Aims and scope ...............................................................................................................................................16
1.3. Thesis structure ..............................................................................................................................................17
1.4. Summary of contribution ..............................................................................................................................18
2. Literature review ................................................................................................................................ 20
2.1. The Eye ...........................................................................................................................................................21
2.1.1. Introduction.............................................................................................................................................21
2.1.2. The cornea and aqueous and vitreous humors.........................................................................................23
2.1.3. Pupil and retinal illuminance ..................................................................................................................23
2.1.4. Lens ........................................................................................................................................................24
2.1.5. Macular pigment .....................................................................................................................................27
2.1.6. Photoreceptors ........................................................................................................................................29
2.1.6.1. Rods...............................................................................................................................................29
2.1.6.2. Cones.............................................................................................................................................32
Cone sensitivity ................................................................................................................................32
Shift in peak sensitivity ....................................................................................................................34
L/M cones ratio.................................................................................................................................36
2.1.6.3. MacLeod-Boynton chromaticity diagram......................................................................................36
2.1.7. Retinal topography..................................................................................................................................37
2.1.7.1. Summary of retinal topography .....................................................................................................37
2.1.7.2. Cone density ..................................................................................................................................38
2.1.7.3. Macular pigment............................................................................................................................39
2.2. Principles of Colorimetry ..............................................................................................................................40
2.2.1. Introduction.............................................................................................................................................40
2.2.2. Colour matching and metamerism ..........................................................................................................40
2.2.2.1. Metamerism...................................................................................................................................40
2.2.2.2. Neural and quantal colour match and metamerism........................................................................41
2.2.3. Trichromacy, Additivity and Trichromatic Generalisation .....................................................................42
2.2.4. Tristimulus space and chromaticity diagram...........................................................................................44
2.2.5. Transformation of tristimulus space........................................................................................................46
2.2.6. Colour matching experiment...................................................................................................................48
2.2.6.1. Maxwell’s method of colour matching..........................................................................................48
2.2.6.2. Maximum saturation method of colour matching..........................................................................50
2.2.6.3. The units of tristimulus values.......................................................................................................50
2.2.6.4. Symmetry of colour matching .......................................................................................................51
2.2.7. Colour matching functions and calculation of tristimulus values............................................................51
2.2.7.1. Locus of monochromatic stimuli ...................................................................................................53
2.2.8. Colour matching functions and retinal topography .................................................................................55
2.2.9. Evaluation of rod intrusion .....................................................................................................................55
2.3. CIE Colorimetry ............................................................................................................................................58
2.3.1. Introduction: the Standard Colorimetric Observer ..................................................................................58
2.3.2. CIE 1931 and 1964 Standard Colorimetric Observers ............................................................................58
2.3.2.1. CIE 1931 Standard Colorimetric Observer....................................................................................58
2.3.2.2. CIE 1964 Standard Colorimetric Observer....................................................................................59
2.3.3. CIE XYZ tristimulus values.....................................................................................................................60
2.3.3.1. Tristimulus values of self-luminous stimuli ..................................................................................60
2.3.3.2. Tristimulus values of object-colour stimuli ...................................................................................61
2.3.4. CIE 1931 and CIE 1964 chromaticity diagrams .....................................................................................61
2.3.5. CIELAB colour space and colour differences.........................................................................................62
2.3.5.1. CIELAB 1976................................................................................................................................62
2.3.5.2. Advanced colour difference formulae ..........................................................................................64
2.3.5.3. Mean colour difference from mean (MCDM) ...............................................................................66
2.4. Statistics ..........................................................................................................................................................67
2.4.1. Univariate statistics.................................................................................................................................67
2.4.1.1. Mean..............................................................................................................................................67
2.4.1.2. Variance, standard deviation and covariance.................................................................................68
2.4.1.3. The probability density function (pdf) and cumulative distribution function (cdf)........................69
2.4.1.4. Normal distribution .......................................................................................................................70
2.4.1.5. The Central Limit Theorem...........................................................................................................71
2.4.2. The t-distribution ....................................................................................................................................72
2.4.3. Inferences about the difference between two means (small sample).......................................................72
2.4.4. Evaluation of uncertainty in measurement..............................................................................................73
2.4.5. Multivariate statistics..............................................................................................................................74
2.4.5.1. Organisation, variance and covariance ..........................................................................................74
2.4.5.2. Linear combinations of random variables......................................................................................75
2.4.6. Inferences about the equalities of two mean vectors (small sample) ......................................................76
2.4.6.1. Hotelling T² test.............................................................................................................................76
2.4.6.2. Problem of unequal covariance matrices (Behrens-Fisher problem) .............................................77
2.4.7. Error propagation in colorimetric transformations..................................................................................78
2.4.7.1. Propagation of random errors – general case.................................................................................78
2.4.7.2. Propagation of random errors in colorimetric transformations ......................................................79
2.4.7.3. Propagation of random errors through the matrix inversion ..........................................................81
2.4.7.4. Programming implementation of error propagation model............................................................82
2.4.8. Confidence ellipses and ellipsoids ..........................................................................................................82
2.4.8.1. Properties of ellipse .......................................................................................................................82
2.4.8.2. Plotting confidence ellipses in CIELAB space ..............................................................................84
2.5. Additivity failures...........................................................................................................................................86
2.5.1. Blottiau (1947) and Trezona (1953)........................................................................................................86
2.5.1.1. Blottiau (1947) ..............................................................................................................................86
2.5.1.2. Trezona (1953, 1954) ....................................................................................................................87
2.5.2. After Stiles and Burch colour matching experiment ...............................................................................88
2.5.2.1. Stiles’ “Addendum on additivity” .................................................................................................89
2.5.2.2. Comparing Maxwell and Maximum Saturation methods ..............................................................90
Crawford (1965) ...............................................................................................................................90
Wyszecki (1982)...............................................................................................................................91
2.5.2.3. Lozano and Palmer (1967, 1968)...................................................................................................91
2.5.3. Zaidi (1986) ............................................................................................................................................93
The failures are not due to rods or computational imprecision .........................................................93
Failures are not due to failures of principle of invariance.................................................................93
Failures are not due to multiple photopigments................................................................................94
2.5.4. Thornton (1992-1998).............................................................................................................................95
2.5.4.1. “Toward a More Accurate and Extensible Colorimetry” parts I and II..........................................95
2.5.5. Summary.................................................................................................................................................96
2.6. Variability of colour matching functions......................................................................................................98
2.6.1. Introduction: variability of the CIE Standard Colorimetric Observers....................................................98
2.6.2. Stiles and Burch colour matching study................................................................................................101
2.6.2.1. Intra-observer repeatability..........................................................................................................101
2.6.2.2. Inter-observer variability .............................................................................................................101
2.6.3. Later analyses of Stiles and Burch dataset ............................................................................................102
2.6.3.1. Smith et al (1975)........................................................................................................................102
2.6.3.2. (Webster and MacLeod 1988) and (Webster 1992).....................................................................103
2.6.3.3. Viénot (1977, 1980, 1987)...........................................................................................................104
2.6.4. North and Fairchild (1993) ...................................................................................................................106
2.7. Observer metamerism and real-world metamers......................................................................................107
2.7.1. Introduction...........................................................................................................................................107
2.7.2. Experiments with Davidson & Hemmendinger (D&H) rule.................................................................107
2.7.3. Observer metamerism in cross-media colour matching ........................................................................108
2.7.3.1. Pobboravsky (1988).....................................................................................................................108
2.7.3.2. Rich and Jalijali (1995)................................................................................................................109
2.7.3.3. Alfvin & Fairchild (1997)............................................................................................................110
2.7.4. Summary...............................................................................................................................................110
2.8. The CIE Standard Deviate Observer .........................................................................................................112
2.8.1. Nimeroff et al (1961): colour matching and uncertainty.......................................................................112
2.8.2. Allen (1970): the definition of Standard Deviate Observer...................................................................113
2.8.3. Nayatani’s proposal for SDO................................................................................................................115
2.8.4. Proposal for SDO by Ohta (1995).........................................................................................................115
2.8.5. CIE Publication 80: the Standard Deviate Observer .............................................................................116
2.8.5.1. Calculating the Metamerism Index for change in observer .........................................................116
2.8.5.2. Constructing the 95% confidence ellipse.....................................................................................118
2.8.5.3. Effect of age ................................................................................................................................118
2.8.6. Tests of the CIE SDO ...........................................................................................................................119
2.8.6.1. North and Fairchild (1993) and Nayatani’s response (1994) .......................................................119
2.8.6.2. Alfvin and Fairchild (1997) .........................................................................................................120
2.8.7. Development of the SDO: summary .....................................................................................................120
2.9. Literature review: summary .......................................................................................................................121
3. Experiment 1. Colour matching in small and large fields........................................................... 122
3.1. Introduction..................................................................................................................................................123
3.1.1. Experiment 1: the objectives.................................................................................................................123
3.1.1.1. Test of the reproducibility of the variability of CMF in S&B dataset.........................................123
3.1.1.2. Test of reproducibility of additivity failures................................................................................124
3.1.2. Summary of the experimental conditions and results............................................................................124
3.2. Experimental ................................................................................................................................................126
3.2.1. Test of additivity...................................................................................................................................126
3.2.1.1. Proportionality test ......................................................................................................................126
3.2.1.2. Additivity test ..............................................................................................................................128
3.2.2. Experimental setup ...............................................................................................................................130
3.2.2.1. Visual colorimeter .......................................................................................................................130
3.2.2.2. Primary and test stimuli...............................................................................................................131
3.2.2.3. Telespectroradiometric measurements.........................................................................................133
3.2.2.4. Calculation of the tristimulus values ...........................................................................................134
3.2.2.5. Observers and observational sessions..........................................................................................135
3.3. Results: large field experiment....................................................................................................................136
3.3.1. Variability of colour matching data ......................................................................................................136
3.3.1.1. Physical variability ......................................................................................................................136
Random fluctuations of TSR and visual colorimeter ......................................................................137
Fields spatial uniformity .................................................................................................................138
Cross-talk between the channels.....................................................................................................139
3.3.1.2. Psychophysical variability...........................................................................................................140
Intra-observer variability ................................................................................................................140
Inter-observer variability ................................................................................................................141
3.3.2. Proportionality and Additivity test results ............................................................................................143
3.3.2.2. Additivity test results...................................................................................................................144
3.4. Data analysis and discussion: large field experiment ................................................................................146
3.4.1. Variability of colour-matching data ......................................................................................................146
3.4.1.1. Intra-observer, inter-observer and instrumental variability..........................................................146
3.4.1.2. Comparison with S&B data .........................................................................................................150
3.4.1.3. Comparison with CIE SDO .........................................................................................................153
3.4.2. Proportionality and additivity tests .......................................................................................................155
3.4.2.2. Additivity test ..............................................................................................................................158
3.4.2.3. Forward- and Inverse-Matrix methods of transformation of tristimulus space............................160
3.5. Small field colour matching experiment.....................................................................................................162
3.5.1. Results: small field colour matching experiment ..................................................................................162
3.5.1.1. Variability of colour matching data .............................................................................................162
3.5.1.2. Additivity test results...................................................................................................................165
3.6. Discussion: small field colour matching experiment .................................................................................167
3.7. Conclusion: the colour matching experiment ............................................................................................168
3.7.1. Uncertainty of colour matching: what is next? .....................................................................................170
4. Experiment 2: Cross-media colour matching experiment ........................................................... 171
4.1. Introduction..................................................................................................................................................172
4.1.1. Rationale and research question............................................................................................................172
4.1.2. Summary of results ...............................................................................................................................174
4.2. Experimental ................................................................................................................................................175
4.2.1. Experimental setup ...............................................................................................................................175
4.2.1.1. Background .................................................................................................................................175
4.2.1.2. Viewing cabinet...........................................................................................................................176
4.2.1.3. Computer monitors......................................................................................................................177
4.2.1.4. Reference spectroradiometer .......................................................................................................179
4.2.1.5. Test stimuli ..................................................................................................................................179
4.2.1.6. Setup............................................................................................................................................182
4.2.1.7. Colour matching procedure .........................................................................................................183
4.2.1.8. Observers.....................................................................................................................................187
4.3. Setup performance evaluation ....................................................................................................................189
4.3.1. Repeatability .........................................................................................................................................189
4.3.1.1. Short-term repeatability...............................................................................................................189
4.3.1.2. Medium term repeatability ..........................................................................................................190
4.3.2. Spatial uniformity .................................................................................................................................191
4.3.3. Spatial channel independency...............................................................................................................192
4.3.4. Channel additivity.................................................................................................................................194
4.3.5. Stray light .............................................................................................................................................195
4.3.6. Consistency of hard-copy stimulus presentation...................................................................................195
4.4. Results ...........................................................................................................................................................197
4.4.1. Intra-observer variability ......................................................................................................................197
4.4.2. Inter-observer variability ......................................................................................................................199
4.4.3. Agreement with the Standard Colorimetric Observer ...........................................................................203
4.5. Data analysis and discussion: Variability of colour matching.................................................................206
4.5.1. Fluctuation of stimulus presentation .....................................................................................................206

4.5.2. ∆Eab vs. ∆E00 ........................................................................................................................................206
4.5.3. Intra-observer variability ......................................................................................................................207
4.5.3.1. Dependence on observer..............................................................................................................207
4.5.3.2. Estimation of threshold sensitivity ..............................................................................................208
4.5.3.3. Variations in different colour dimensions....................................................................................209
4.5.3.4. Modelling the intra-observer variability ......................................................................................210
Combined intra-observer variability ...............................................................................................210
Constructing CIEDE2000 ellipses ..................................................................................................212
Adjusting parametric factors...........................................................................................................214
Variability in Lightness dimension.................................................................................................216
4.5.3.5. Practical implications of intra-observer variability......................................................................217
4.5.4. Inter-observer variability ......................................................................................................................218
4.5.4.1. Anomalous observers ..................................................................................................................218
4.5.4.2. Comparison with intra-observer variability .................................................................................219
4.5.4.3. Modelling the observer metamerism from S&B dataset..............................................................221
4.5.4.4. Modelling the observer metamerism: the eye optical model .......................................................227
4.5.5. Agreement between observers ..............................................................................................................233
4.5.5.1. Agreement between individual observers....................................................................................233
4.5.5.2. Agreement between individual observers and the mean..............................................................238
4.6. Data analysis and discussion: Agreement with the Standard Colorimetric Observer – adaptation,
colour matching and additivity failures.............................................................................................................240
4.6.1. Description of the discrepancies ...........................................................................................................240
4.6.2. Discrepancies and adaptation................................................................................................................244
4.6.2.1. General adaptation model............................................................................................................245
4.6.2.2. CIE Chromatic Adaptation Transform 2002 (CAT02) ................................................................246
4.6.2.3. Inverting CAT02 .........................................................................................................................247
4.6.2.4. Finding the adapting stimulus: the results ...................................................................................249
4.6.2.5. Modelling the relationship between the test colour and the adapting stimuli ..............................250
4.6.3. Statistical significance of the discrepancies ..........................................................................................254
4.6.4. Adaptation, colour matching and additivity failures .............................................................................255
4.6.4.1. Similarity with previous studies ..................................................................................................255
4.6.4.2. Relationship between adaptation and additivity failures..............................................................257
4.6.5. Additivity failures and display colorimetry: modifying the CAT02 .....................................................259
4.6.5.1. Modified CAT02 .........................................................................................................................260
4.7. Summary and conclusions ...........................................................................................................................264
4.7.1. Variability of colour matches and observer metamerism......................................................................264
4.7.2. Additivity failures .................................................................................................................................265
4.7.3. The model of uncertainty of colour matching in soft-proofing .............................................................266
5. Conclusions ....................................................................................................................................... 267
5.1. Variability of colour matching, variability of colour matching functions and observer metamerism ..268
5.2. Adaptation and additivity............................................................................................................................270
5.3. Uncertainty of colour matching and uncertainty of colour vision: basic vs. advanced colorimetry......272
6. References ......................................................................................................................................... 274
List of figures
Figure 2.1.1-1. Simplified diagram of the human eye. ..................................................................................................21
Figure 2.1.1-2. Classes of retinal neurons. ....................................................................................................................22
Figure 2.1.4-1. Relative density of the human crystalline lens. .....................................................................................25
Figure 2.1.4-2. Lens relative optical density for subjects of different age. ....................................................................26
Figure 2.1.5-1. Density of macular pigment as measured by three different studies .....................................................27
Figure 2.1.5-2. Density of macular pigment (Stockman et al. 1999) for different viewing field size............................28
Figure 2.1.5-3. Density of macular pigment by (Stockman et al. 1999) for 2° viewing field, with 45%
error bars. ..............................................................................................................................................29
Figure 2.1.6-1. Rods and cones of the human retina......................................................................................................30
Figure 2.1.6-2. Rod sensitivity – the scotopic luminous efficiency function V′(λ) .......................................................31
Figure 2.1.6-3. Aguilar and Stiles TVI function. ...........................................................................................................32
Figure 2.1.6-4. Cone fundamental sensitivity functions – cone fundamentals (Stockman et al. 1999;
Stockman and Sharpe 2000)..................................................................................................................34
Figure 2.1.6-5. Reproduction of Figure 3 from (Dartnall et al. 1983), showing indication of bimodal
distribution of green and red photopigment peak density. ....................................................................35
Figure 2.1.6-6. Pseudocolour image of the retina of two male subjects. .......................................................................36
Figure 2.1.6-7. MacLeod-Boynton chromaticity diagram with locus of monochromatic stimuli. .................................37
Figure 2.1.7-1. Spatial density of rods and cones in the retina. .....................................................................................38
Figure 2.1.7-2. Individual variability of cone density in fovea of seven subjects..........................................................39
Figure 2.1.7-3. Spatial distribution of macular pigment density of 4 subjects measured over 4-14
months...................................................................................................................................................39
Figure 2.2.2-1. Illustration of the phenomenon of metamerism.....................................................................................41
Figure 2.2.3-1. Illustration of the colour matching experiment. ....................................................................................43
Figure 2.2.4-1. Diagram of the tristimulus space...........................................................................................................45
Figure 2.2.4-2. (r, g) chromaticity diagram calculated by Eq. (2.2.7)-(2.2.9)................................................................45
Figure 2.2.6-1. Maximum saturation and Maxwell methods of colour matching. .........................................................50
Figure 2.2.7-1. Illustration of the process of measurement of colour matching functions.............................................52
Figure 2.2.7-2. Colour matching functions and corresponding chromaticity diagram with locus of
monochromatic stimuli. ........................................................................................................................54
Figure 2.3.2-1. CIE 1931 Standard Colorimetric Observer colour matching functions.................................................59
Figure 2.3.4-1. Chromaticity diagrams of CIE Standard Colorimetric Observers with loci of
monochromatic stimuli .........................................................................................................................62
Figure 2.4.1-1. Illustration of the probability density function......................................................................................69
Figure 2.4.1-2. Illustration of the probability density function (PDF) and the corresponding cumulative
distribution function (CDF). ....................................................................................................................70
Figure 2.4.1-3. Probability density function of normal distribution..................................................................................71
Figure 2.4.2-1. Probability density function of t-distribution compared with one of normal distribution .....................72
Figure 2.4.8-1. Example of scatter plot of set of data from bivariate normal distribution .............................................82
Figure 2.4.8-2. Ellipse centred at the origin, with radii proportional to standard deviation and parallel to
the axes .................................................................................................................................................83
Figure 2.5.1-1. Results of test of additivity by (Ishak 1951) .........................................................................................88
Figure 2.5.2-1. Results of test of additivity by (Stiles 1963). ........................................................................................89
Figure 2.5.2-2. Additivity test by (Crawford 1965).......................................................................................................91
Figure 2.5.2-3. Additivity test by (Lozano and Palmer 1967). ......................................................................................92
Figure 2.6.1-1. Plot of WDW-normalised chromaticities of white mixture matches to the standard white
of CCT c. 4800K made by 36 observers in Wright’s (Wright 1928) colour matching
investigation..........................................................................................................................................99
Figure 2.6.1-2. Colour matching data sets which were the basis of the CIE 1931 and 1964 Standard
Colorimetric Observers. ........................................................................................................................100
Figure 2.6.3-1. Colour matches of each of Viénot’s (Viénot 1980) observers in CIE 1964 chromaticity
diagram transformed to instrumental primaries.....................................................................................105
Figure 2.8.2-1. Allen (Allen 1970) Standard Deviate Observer (dashed line) and the Standard
Colorimetric Observer (solid line). .......................................................................................................114
Figure 2.8.3-1. Comparison of performance of Nayatani’s proposed SDO (Nayatani et al. 1983) with
Allen’s (Allen 1970), with set of 68 grey metamers .............................................................................115
Figure 2.8.6-1. Comparison of 95% confidence ellipses constructed from Alfvin and Fairchild’s
observers’ matches (Alfvin and Fairchild 1997) with the prediction of CIE SDO................................120
Figure 3.2.2-1. Tarrant visual colorimeter – schematic plan view and optical system ..................................................130
Figure 3.2.2-2. Tarrant visual colorimeter. ....................................................................................................................131
Figure 3.2.2-3. SPD of the experimental stimuli. ..........................................................................................................133
Figure 3.2.2-4. Calibration curve applied to correct the Minolta CS-1000 TSR measurements (see text
for details) .............................................................................................................................................133
Figure 3.2.2-5. Comparison between Minolta CS-1000 and Bentham TSR instruments...............................................134
Figure 3.2.2-6. Schematic illustration of calculation of tristimulus value. ....................................................................135
Figure 3.3.1-1. Long-term variability of the combination TSR – visual colorimeter. ...................................................138
Figure 3.3.1-2. Illustration of position of uniformity measurement sample points........................................................138
Figure 3.3.1-3. Evaluation of the uniformity of matching field. ....................................................................................139
Figure 3.3.1-4. Illustration of the cross-talk between the channels................................................................................140
Figure 3.3.1-5. Variabilities in PC and T primary sets. .................................................................................................143
Figure 3.3.2-1. Magnitude of proportionality failure in results of observer B. ..............................................................144
Figure 3.3.2-2. Magnitude of proportionality failure in mean results of all observers.......................................................144
Figure 3.3.2-3. Magnitude of additivity failure in results of observer B........................................................................145
Figure 3.3.2-4. Magnitude of additivity failure in mean results of all observers. ..........................................................145
Figure 3.4.1-1. All types of variability compared..........................................................................................................148
Figure 3.4.1-2. Inter-observer CV values transformed to CIE 1964 XYZ primaries using error
propagation model (Section 2.4.7). .......................................................................................................149
Figure 3.4.1-3. Coefficients of variation of colour matching data from present experiment compared
with ones of (Stiles and Burch 1959). ...................................................................................................151
Figure 3.4.1-4. Coefficients of variation of colour matching data from present experiment compared
with ones of (Stiles and Burch 1959); with abscissa and ordinate scaled to enlarge the
areas where CMF are significantly different from zero.........................................................................152
Figure 3.4.1-5. Test of the CIE Standard Deviate Observer. .........................................................................................154
Figure 3.4.2-1. 95% confidence ellipses of 661 nm stimulus measured by observer B in full and half
luminance..............................................................................................................................................155
Figure 3.4.2-2. Results of the analysis of rod participation. ..........................................................................................156
Figure 3.4.2-3. Illustration of effect of rods on 661 nm match. .....................................................................................157
Figure 3.4.2-4. Correlation of additivity failure and rod mismatch. ..............................................................................158
Figure 3.4.2-5. 95% confidence ellipses of 461 nm light measured by visual colour matching with PC
primaries, and predicted by transformation of tristimulus space from T into PC
primaries. ..............................................................................................................................................159
Figure 3.4.2-6. Illustration of the statistical test of tristimulus space transformation. ...................................................161
Figure 3.5.1-1. Comparison of intra-observer variability in large- and small-field experiments. ..................................164
Figure 3.5.1-2. Magnitude of additivity failure in results of observer B1......................................................................166
Figure 4.2.1-1. Spectral reflectance function of the white plaque used to calculate the cabinet illuminant
SPD. ......................................................................................................................................................176
Figure 4.2.1-2. Relative spectral power distribution of the light reflected by the white plaque in the
viewing cabinet, superimposed with the calculated cabinet illuminant.................................................177
Figure 4.2.1-3. The plot of the CIE 1964 xy chromaticity of the viewing cabinet illuminant. ......................................177
Figure 4.2.1-4. Spectral power distribution functions of monitors’ primaries ...............................................................178
Figure 4.2.1-5. Chromaticity coordinates of the primaries and of the secondaries of both displays..............................178
Figure 4.2.1-6. a*b* projection of the CIELAB coordinates of the paint samples. .......................................................180
Figure 4.2.1-7. SPD of the light reflected by the paint samples in the viewing cabinet.................................................182
Figure 4.2.1-8. Scheme of the experimental setup............................................................................................................183
Figure 4.2.1-9. Images of the experimental setup..........................................................................................................183
Figure 4.2.1-10. The graphic user interface of the DVC program. ..................................................................................184
Figure 4.2.1-11. The stimulus generated by DVC software: ...........................................................................................186
Figure 4.2.1-12. Plots of the results of tests by metameric rules .....................................................................................188
Figure 4.3.1-1. Short-term repeatability of the displays, expressed in colour difference (∆E*ab) from
mean......................................................................................................................................................189
Figure 4.3.1-2. Medium-term repeatability of displays, expressed in colour difference (∆Eab) from
mean......................................................................................................................................................190
Figure 4.3.1-3. Medium-term repeatability of displays: contribution of lightness.........................................................191
Figure 4.3.2-1. Locations of measurements in display uniformity evaluation ...............................................................191
Figure 4.3.2-2. Results of displays spatial uniformity evaluation..................................................................................192
Figure 4.3.3-1. Target for evaluation of channel independency. ...................................................................................193
Figure 4.3.3-2. Result of channel independency test. ....................................................................................................193
Figure 4.3.3-3. Difference from reference versus colour of the background – breakdown into CIE 1964
X, Y and Z tristimulus values. ..............................................................................................................194
Figure 4.3.4-1. Results of channel additivity test. .........................................................................................................195
Figure 4.3.6-1. Values of MCDM of stimulus presentation variations calculated with CIEDE2000
formula..................................................................................................................................................196
Figure 4.4.1-1. Intra-observer variability – per observer. ..............................................................................................198
Figure 4.4.1-2. Intra-observer variability – per test colour ............................................................................................199
Figure 4.4.2-1. Inter-observer variability.......................................................................................................................200
Figure 4.4.2-2. 95% inter-observer confidence ellipses in CIELAB a*b* plane. ..........................................................201
Figure 4.4.2-3. Enlarged 95% inter-observer confidence ellipses in CIELAB a*b* plane. ...........................................202
Figure 4.4.3-1. Mean difference between each observer’s match and the CIE Standard Colorimetric
Observer values of the test colour. ........................................................................................................204
Figure 4.4.3-2. Difference between mean match of eleven observers and the CIE Standard Colorimetric
Observer values of the test colour. ........................................................................................................204
Figure 4.4.3-3. Differences between the CIELAB a*b* coordinates of test stimuli and mean matches of
eleven observers....................................................................................................................................205
Figure 4.5.3-1. Variability of the two most-varying observers in CIEDE2000 units.....................................................208
Figure 4.5.3-2. Intra-observer variability expressed in CV units in MacLeod-Boynton chromaticity
values. ...................................................................................................................................................209
Figure 4.5.3-3. Mean intra-observer variability for each test colour separated into perceptual dimensions. ....................210
Figure 4.5.3-4. 95% intra-observer ellipses in a*b* plane, constructed by averaging variances and
covariances of all observers. .................................................................................................................211
Figure 4.5.3-5. 95% intra-observer ellipses in a*b* plane constructed from common matrices of both
monitors’ data. ......................................................................................................................................212
Figure 4.5.3-6. Calculation of loci of constant colour difference. .................................................................................214
Figure 4.5.3-7. Combined mean 95% intra-observer ellipses, superimposed with ellipses of constant
CIEDE2000 colour difference equal to 1 with parametric coefficients [1 1 1]. ....................................215
Figure 4.5.3-8. Combined mean 95% intra-observer ellipses, superimposed with ellipses of constant
CIEDE2000 colour difference equal to 1 with parametric coefficients [1 2 1]. ....................................216
Figure 4.5.3-9. Combined 95% ellipses for all observers and both monitors superimposed with ellipses
of constant CIEDE2000 colour difference equal to 1 with parametric coefficients [4 2
1]...........................................................................................................................................................217
Figure 4.5.4-1. 95% inter-observer confidence ellipses in CIELAB a*b* plane, with anomalous
observations marked and labelled with the observer code. ...................................................................218
Figure 4.5.4-2. Comparison of intra- and inter-observer variability values (MCDM00)...............................................219
Figure 4.5.4-3. Comparison of intra- and inter-observer variability in dimensions of lightness and
chromaticness........................................................................................................................................220
Figure 4.5.4-4. 95% confidence ellipses in a*b* plane constructed using common covariance matrices
for all observers and both monitors.......................................................................................................221
Figure 4.5.4-5. MCDM (CIEDE2000) within the modelled colour matching set using S&B 47
observers’ CMF. ...................................................................................................................................223
Figure 4.5.4-6. Relationship between the chromaticness and lightness variability in data modelled with
S&B CMFs. ..........................................................................................................................................223
Figure 4.5.4-7. Comparison of variability in MCDM00 terms (limited to chromaticness dimension)
between S&B dataset simulation and experimental data.......................................................................224
Figure 4.5.4-8. 95% confidence ellipses in a*b* plane constructed from mean eleven observers’
matches, superimposed with ellipses constructed for 47 observers’ CMF from S&B
dataset ...................................................................................................................................................225
Figure 4.5.4-9. Combined 95% inter-observer ellipses for both monitors, superimposed with ellipses of
constant CIEDE2000 colour difference equal to 1 with parametric coefficients [1 3 1]. ......................226
Figure 4.5.4-10. Set of 50 cone fundamental functions generated using the optical model of variability of
colour matching described in the text....................................................................................................231
Figure 4.5.4-11. Variability within the set of simulated CFF superimposed with the corresponding
variability values in S&B dataset. .........................................................................................................232
Figure 4.5.4-12. 95% ellipses in a*b* plane, constructed from the simulated CFF, superimposed with the
corresponding S&B dataset ellipses. .....................................................................................................232
Figure 4.5.5-1. Mean values of PD and D00 for each test colour and for both displays..............................................235
Figure 4.5.5-2. Mean values of PD for each observer and for both monitors ...............................................................235
Figure 4.5.5-3. Mean values of PD and D00 for each test colour and for both displays, calculated for
chromaticness only................................................................................................................................236
Figure 4.5.5-4. Mean values of PD for each observer, for both monitors, for chromaticness only...............................236
Figure 4.5.5-5. Mean values of PD for disagreement between each individual observer and the mean of
the group. ..............................................................................................................................................238
Figure 4.6.1-1. Differences between the paint samples and the mean stimuli judged by observers to
match these samples in colour, illustrated as chromaticities and as vectors in CIELAB
planes with the origin at the coordinate of the paint sample and head at the coordinate of
the mean match made by observers.......................................................................................................242
Figure 4.6.1-2. Mean CIEDE2000 colour difference between the mean match made by all observers on
both monitors and the paint sample, separated into lightness and chromaticness
dimensions. ...........................................................................................................................................243
Figure 4.6.2-1. Schematic illustration of chromatic adaptation transform.....................................................................245
Figure 4.6.2-2. CIE 1964 xy chromaticities of the adapting stimuli for each of the ten test colours..............................249
Figure 4.6.2-3. MacLeod-Boynton chromaticities of the adapting stimuli plotted against chromaticities
of the paint samples. .............................................................................................................................252
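Figure 4.6.2-4. MacLeod-Boynton chromaticities of the adapting stimuli plotted against chromaticities
of the paint samples; all dimensions are combined in the same diagram. .............................................254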
Figure 4.6.4-1. Results of tests of additivity by (Ishak 1951; Stiles 1963; Crawford 1965; Lozano and
Palmer 1967).........................................................................................................................................256
List of tables
Table 2.1.5-1. Summary of measurements of macular pigment density.
Table 2.1.6-1. Summary of estimations of shift in peak sensitivity of green and red cones from various publications.
Table 3.3.1-1. Summary of the variability introduced by the instruments.
Table 3.3.1-2. Summary of mean tristimulus values for individual observer, and corresponding intra-observer variability. Based on ten repetitions of every colour match made by observer B.
Table 3.3.1-3. Summary of mean tristimulus values for all observers, and corresponding inter-observer variability. Means of repeated measurements of five observers.
Table 3.3.2-1. Results of the proportionality test.
Table 3.3.2-2. Results of the additivity test.
Table 3.5.1-1. Summary of the 2° experiment intra-observer variability.
Table 3.5.1-2. Results of the 2° additivity test.
Table 4.2.1-1. Basic characteristics of the displays used in the experiment.
Table 4.2.1-2. CIELAB coordinates of the paint samples used in the experiment as test stimuli.
Table 4.4.1-1. Intra-observer variability, per observer.
Table 4.4.1-2. Mean intra-observer variability for each test colour.
Table 4.4.2-1. Inter-observer variability.
Table 4.4.3-1. Mean CIELAB coordinates of matches made by all observers on both displays.
Table 4.4.3-2. Mean colour difference between each observer’s mean match and the CIE Standard Colorimetric Observer values of the test colour.
Table 4.4.3-3. Colour difference between mean match of eleven observers and the test colour.
Table 4.5.3-1. Mean intra-observer variability (CIEDE2000) for each test colour separated to perceptual dimensions of lightness and chromaticness.
Table 4.5.4-1. Values of variations used in modelling of observer metamerism.
Table 4.5.5-1. Mean values of PD and D00 for each test colour and for both displays.
Table 4.5.5-2. Mean values of PD and D00 for each test colour and for both displays, calculated for chromaticness only.
Table 4.6.1-1. CIEDE2000 colour difference between the mean match made by all observers on both monitors and the paint sample, separated into lightness and chromaticness dimensions.
Table 4.6.2-1. The CIE 1964 XYZ tristimulus values of the adapting stimulus computed by inverted CAT02 model, normalised to value of 100 in Y.
Table 4.6.3-1. Results of the statistical test for significance of discrepancies between the CIE observer prediction and experimental mean matches.
I did not care what it was all about. All I wanted to know was how to live in it.
Maybe if you found out how to live in it you learned from that what it was all
about.
− ERNEST HEMINGWAY. The Sun Also Rises
and if a bird can speak, who once was a dinosaur,

and a dog can dream; should it be implausible
that a man might supervise
the construction of light
− KING CRIMSON. The ConstruKction of Light
Visual and sybaritic creatures that we are, the qualitative characteristics that
most seem to attract our attention are those connected with colors and pains. Let
us make the more agreeable choice and elect colors as our subject of inquiry in
the following pages, for there is pain enough in the effort to understand them.
− C. L. HARDIN. Color for Philosophers: Unweaving the Rainbow

1. Introduction
1.1. Background
We want our printers, computer displays and TV sets to accurately reproduce the colours of
objects. To make this possible, we need to know what colours we see when we look at the
original objects and at their reproduction, and we need to analyze whether these two colour
perceptions are similar. Technically speaking, we need to be able to measure the colour of the
real and of the reproduced objects, to express these measurements in numbers that correspond to
our perception of these colours and to develop some criteria for their equality; all of these based
on how we – and not the instruments – see colours. This defines the major task of applied colour
science: developing methods of colour measurement, that is, colorimetry. Colorimetry defines scales
relating the colour stimulus to colour perception – scales derived by the methods of
psychophysics.
Colorimetry has existed in its modern form approximately since the beginning of the 20th
century, and it rests on two fundamental principles:
1. The colour vision of all colour-normal humans can be reliably approximated by an
“average” set of properties: the Standard Observer. Conversely, the Standard Observer
can be used for making predictions of the perceptual effect of a colour stimulus,
predictions which will be valid for an average observer with normal colour vision.
2. The quantities of colour stimuli have algebraic properties of Symmetry, Transitivity,
Additivity and Proportionality, and can be handled in accordance with the standard rules
of algebra.
The validity of both principles is not a given fact. The problem associated with the principle of
Standard Observer is apparent from its definition: a prediction made for the Standard might
represent the average well, but would inevitably fail for any given individual. The magnitude of
this failure depends on the extent to which the individual differs from the average in a given set
of conditions. The second principle, usually broadly termed as The Principle of Additivity, was
found to fail in certain conditions by almost everyone who tested it experimentally.
Nevertheless, the CIE system of colorimetry established in 1931 on the basis of principles of
Additivity and of Standard Colorimetric Observer has proved to be one of the most successful
industrial standards ever, and is implemented in all contemporary imaging and colour
management products.
This discrepancy between the theoretical implications and laboratory results on the one hand, and
industrial practice on the other, implies that the principles of colorimetry, even if not valid in certain
conditions, do hold in most of the situations that occur in real-life applications. It does not, however,
imply that colorimetry cannot fail in principle. Without understanding the character
and the magnitudes of such potential failures, it is impossible to identify their practical effect,
nor is it possible to take these effects into account in the design of imaging devices.
Understanding the practical effect of the failures of the laws of basic colorimetry is the principal
aim of this work.
1.2. Aims and scope
The range of applications of colorimetry is enormous; it spans many industries such as
automotive, paint, imaging, printing and textiles. It is practically impossible to encompass all
possible applications in a single study, hence severe limitations of scope are necessary. The
scope of the present research is limited to one industrial application: the simulation of an
object-colour by a self-luminous computer display. Furthermore, the stimuli are limited to
simple ones, and the observers’ judgement is limited to that of colour equality of stimuli, i.e. colour
matching.
Within these conditions, we aim to answer the following two questions:

1. Do individual variations in colour vision lead to perceptible discrepancies between the
matches made by individual observers with normal colour vision and the mean match of
a group of colour-normal observers?
2. Do failures of additivity lead to perceptible discrepancies between the prediction made
with the use of standard colorimetric mathematical procedures and the average
judgement made by a group of real observers with normal colour vision?
If the answer to one or both questions is positive, then we aim to characterise the discrepancies,
and to suggest a method that can be incorporated in colorimetric calculations and used for
estimation and compensation for the resulting colorimetric errors.
1.3. Thesis structure
Following this introduction, the thesis consists of four chapters.
In Chapter 2, the available literature directly related to the present study is reviewed. It is
divided into the subjects of foundations of vision, colorimetry, relevant mathematical and
statistical basics, and concludes with a critical review of publications directly related to the core
questions of the research: individual variations in colour vision and additivity failure.
Chapter 3 describes the first experiment: colour matching with the classical bipartite field
setting, with small and large fields. This stage deals mostly with replication of experimental
results previously reported by other researchers. Detailed description of the experimental setup
is given, followed by experimental results and their analysis.
Chapter 4 describes the central part of this study: the cross-media colour matching experiment.
The description of the setup and its evaluation is followed by the results and discussion of their
practical implications. Individual variability of colour matching is modelled. The chapter
concludes with the development of a conceptual framework for compensation for additivity
failures in cross-media colour reproduction.
Chapter 5 concludes the thesis with a summary of the findings and their discussion in the
context of the development of colorimetry.
1.4. Summary of contribution
The statements of the present thesis are:

1. Individual variations in colour vision do not have appreciable practical implications on
metameric colour matching of spatially separated chromatic stimuli of computer display
and object colour.
2. Failures of colorimetric additivity cause consistent discrepancies in prediction of
metameric matches between computer display and object colour.
Following are the thesis deliverables:

1. Variability of colour matching data within Stiles and Burch (Stiles and Burch 1959)
dataset is shown to be reproducible to a high degree even within a small group of
observers. This variability is representative of variations in colour matching functions of
colour-normal population, and can be used for development of models of observer
metamerism.
2. The discrepancies between the colour equality judgements made by individual
observers and the prediction made with the use of Standard Colorimetric Observer are
characterised and quantified for the case of matching the computer display colour to
object-colour. A method for estimation of resulting uncertainties is proposed, which is
based on advanced colour-difference formulae.
3. The discrepancies introduced to the display colorimetry by the failure of Principle of
Additivity are characterised and quantified. A framework of adaptation transform is
proposed to compensate for the colorimetric errors.
In the course of this research the following publications were produced:

1. Oicherman, B., R. M. Luo, A. Robertson and A. Tarrant (2005). Experimental
verification of colorimetric additivity assumption. 10th Congress of the International
Colour Association, Granada, Spain
2. Oicherman, B., R. M. Luo, A. Robertson and A. Tarrant (2005). Uncertainty of colour-
matching data. IS&T/SID's Thirteenth Color Imaging Conference, Scottsdale, Arizona,
US
3. Oicherman, B. (2006). "The study of the uncertainty of colour matching: psychophysics
by means of an artwork?" In "Olafur Eliasson: Your Colour Memory". R. Torchia (ed.).
Glenside, PA, USA, Arcadia University Art Gallery: 54-63.
4. Oicherman, B., M. R. Luo and A. Robertson (2006). Test of the transformation of
primary space: forward- and inverse-matrix methods. ISCC-CIE Expert Symposium: 75
Years of the CIE Standard Colorimetric Observer. CIE publication x030:2006, Ottawa
5. Oicherman, B., R. M. Luo and A. R. Robertson (2006). Observer Metamerism and
Colorimetric Additivity Failures in Soft-Proofing. IS&T/SID's Fourteenth Color
Imaging Conference, Scottsdale, Arizona, USA: 24-30.
6. Oicherman, B., M. R. Luo and A. R. Robertson (2007). Effect of small deviations from
normal colour vision on colour matching of display and surface colours. Color science
for industry. Midterm meeting of the International Color Association, Hangzhou, China
2. Literature review
2.1. The Eye
2.1.1. Introduction
The eye is a ball (…) with two holes: light enters at one of them, the pupil, and
the nervous message leaves at the other, the optic disc, where the fibres of the
optic nerve lead to the brain.
(Weale 1968)
Figure 2.1.1-1 shows the illustration of the eye inspired by Weale’s description. While
simplified, this diagram contains all the elements relevant to our study.
Figure 2.1.1-1. Simplified diagram of the human eye. Labelled elements: pupil, light, lens, macular pigment, photoreceptors.

The light enters the eye through the pupil and is focused on the layer of photoreceptors in the retina by the
lens. The amount of light entering the eye is controlled by changing the diameter of the pupil.
The cornea is the transparent outer layer of the eye, devoid of blood vessels. It has a greater
curvature than the rest of the eyeball, protruding slightly towards the front. Together with the
lens, the cornea ensures focusing of the image. Ciliary muscles change the curvature of the lens
surface so as to move the projected image and to bring it in focus on the surface of the retina.
The iris regulates the amount of light that enters the eye by changing the diameter of the pupil.
The light passes the lens and the eye substance and strikes the retina: a neural tissue, about 0.4
mm thick, that lines the interior of the back surface of the eye (Rodieck 1998). The neurons in
the retina can be broadly classified into five classes (Figure 2.1.1-2): ganglion cells, amacrine
cells, bipolar cells, horizontal cells and photoreceptors. The photoreceptors in their turn are
sub-divided into two classes: rods and cones. The rods are responsible for conveying the
variations in amount of light reaching the retina in conditions of dark adaptation; they are
generally believed to convey an achromatic signal. The cones are subdivided into three types
according to the portion of the visible spectrum they are sensitive to: long-, medium- and short-wavelength
(abbreviated as l, m and s); they are less sensitive than rods and are responsible for
generating the chromatic signals which result in the perception of colour.
Figure 2.1.1-2. Classes of retinal neurons. Labelled elements: photoreceptor, horizontal cell, bipolar cell, amacrine cell, ganglion cell, axon.

The bottom side of the diagram represents the side of the retina which faces the outer side of the eye; that
is, the light penetrates the entire thickness of the retina before it is absorbed by the photoreceptor.
Reproduced from (Rodieck 1998).
The process of vision is initiated when photons are absorbed by the photoreceptor. The receptor
converts the energy of the absorbed photons into neural signals, which are passed to the bipolar
cells and horizontal cells, processed and passed further on through the ganglion cells to the optic
nerve and to the brain. The receptor has no means of discriminating between photons: equal
numbers of absorbed photons will result in identical photoreceptor signals independently of
the photons’ frequency; this principle is known as the principle of univariance.
Although the eye is a physiological construction, the initial stages of visual perception – from
the moment of penetration of the cornea by the light until its absorption by the receptors – are
essentially physical processes, and are governed by the laws of physical optics (Wyszecki and
Stiles 1982). The inert pigmentation in the eye leads to selective absorption of parts of light,
while the absorption properties of the photoreceptors result in variations in number of photons
caught. Therefore, given the same light entering the eye, the generated cone signals vary
depending on the eye properties. Most of the remaining parts of this chapter will deal with
discussion of these variations.
2.1.2. The cornea and aqueous and vitreous humors
The cornea absorbs mostly in the ultra-violet region (<300 nm), where it removes almost 99% of the
radiation; the aqueous and vitreous humors – transparent liquid substances that fill the eye – absorb less than
10% in the visible spectrum. Both the cornea and the humors have little effect on colour
vision and can be considered as transparent for most practical purposes (Packer and Williams
2003).
2.1.3. Pupil and retinal illuminance
The pupil is the aperture through which the light enters the eye. The pupil has the ability to
contract and expand in response to changing levels of illumination, thus regulating the amount
of light reaching the retina. In identical conditions, pupil size varies significantly (up to 50%)
between observers, and tends to get smaller with age (Wyszecki and Stiles 1982).
Several formulae have been proposed for estimation of the pupil size given the luminance
(Wyszecki and Stiles 1982):
$$ d = 4.9 - 3\tanh\left[0.4\left(\log L + 1\right)\right] \tag{2.1.1} $$
(Moon and Spenser 1944)
$$ \log d = 0.8558 - 4.01 \cdot 10^{-4}\left(\log L + 8.6\right)^{3} \tag{2.1.2} $$
(De Groot and Gebhard 1952)
Here d is the calculated pupil diameter, and L is the photopic field luminance in cd/m2. Trezona
proposed an alternative formula which takes into account the size of the viewing field (Trezona
1983), thus it can be considered as more suitable for estimation of pupil diameter in colour
matching experiment:
$$ d = 5 - 3\tanh\left\{\left(0.4 - \frac{0.389}{\theta} + \frac{0.547}{\theta^{2}}\right)\left(\log_{10} L + 2.989\log_{10}\theta - 5.076\right)\right\} \tag{2.1.3} $$
Here d and L are as above, and θ is the size of the viewing field in degrees.
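As a rough illustration of the first two formulae, the sketch below evaluates Eqs. (2.1.1) and (2.1.2) for one luminance value. It assumes that "log" denotes the base-10 logarithm and that d is expressed in mm; the function names and the example luminance of 100 cd/m2 are illustrative and not part of the original text. Eq. (2.1.3) is deliberately left out.

```python
import math


def pupil_diameter_moon_spencer(luminance):
    """Eq. (2.1.1): pupil diameter (assumed mm) for a photopic luminance in cd/m2."""
    # Assumption: "log" in the equation is the base-10 logarithm.
    return 4.9 - 3.0 * math.tanh(0.4 * (math.log10(luminance) + 1.0))


def pupil_diameter_de_groot_gebhard(luminance):
    """Eq. (2.1.2): pupil diameter (assumed mm) for a photopic luminance in cd/m2."""
    log_d = 0.8558 - 4.01e-4 * (math.log10(luminance) + 8.6) ** 3
    return 10.0 ** log_d


# Illustrative luminance only; not a value taken from the experiments.
example_luminance = 100.0
d_moon_spencer = pupil_diameter_moon_spencer(example_luminance)
d_de_groot_gebhard = pupil_diameter_de_groot_gebhard(example_luminance)
print(round(d_moon_spencer, 2), round(d_de_groot_gebhard, 2))
```

Under these assumptions the two formulae give closely similar diameters at this luminance, which is a quick sanity check on the reconstruction of the equations.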
Knowledge of the pupil size allows calculation of retinal illuminance as luminance weighted by
the pupil area:
$$ T = L \cdot \frac{\pi d^{2}}{4} \tag{2.1.4} $$
Here T is the photopic retinal illuminance in trolands, L is the photopic luminance in cd/m2,
and d is the pupil diameter in mm, so that πd²/4 is the pupil area in mm². The scotopic retinal
illuminance is calculated by the same equation, by substituting the scotopic luminance value for
the photopic one.
2.1.4. Lens
The eye lens is a biconvex structure which acts on the light as an inert filter. The lens absorbs
primarily at short wavelengths; its absorption properties vary between observers, and within the
same observer with age (Wyszecki and Stiles 1982).
Wyszecki and Stiles list three methods for estimation of absorption properties of the lens:
− Measurement on excised lens (Ludvigh and McCarthy 1938; Weale 1954)
− Comparison of psychophysical measurements of visual spectral sensitivity of persons
with normal vision and of persons with lens removed by operation (aphakics) (Wald
1945; Wright 1951)
− Comparison of light reflected from the lens-vitreous surface with the light reflected
from the aqueous-lens surface; that is – comparison of light which has not passed
through the lens with the light which passed through the lens twice (Said and Weale
1959)
Additional psychophysical methods can be added to this list:

− Analysis of scotopic sensitivity functions (Norren and Vos 1974) measured by bipartite
field photometric measurements (Crawford 1949), using the assumption that the inter-
observer differences in sensitivity in the far blue region are solely due to variations in
absorption by the lens
− Measurement of relative spectral sensitivity by heterochromatic flicker photometry
(Diaz et al. 1998; Stockman et al. 1999)
In lens density measurements the relative transmittance is of major interest. Therefore the data
is reported in a form of difference in densities between the test wavelength and the reference
wavelength, i.e.
$$ \Delta D(\lambda, \lambda_r) = \log\left[\frac{1}{t(\lambda)}\right] - \log\left[\frac{1}{t(\lambda_r)}\right] \tag{2.1.5} $$
Here λ is the test wavelength, λr is the reference wavelength and t is the transmittance. The
reference wavelength is usually taken from the far red portion of the spectrum, such as 700 nm, where
the density of the lens is expected to be negligibly low. Norren and Vos suggest that a value of
0.15 should be added to the relative density to arrive at the absolute density value (Norren and
Vos 1974).
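The relative-density calculation of Eq. (2.1.5) can be sketched as follows. The transmittance values are hypothetical and chosen only to show the arithmetic; the base-10 logarithm is assumed, as is usual for optical density, and the function name is illustrative.

```python
import math


def relative_lens_density(t_test, t_ref):
    """Eq. (2.1.5): density at the test wavelength relative to the reference wavelength."""
    # Assumption: "log" in Eq. (2.1.5) is the base-10 logarithm.
    return math.log10(1.0 / t_test) - math.log10(1.0 / t_ref)


# Offset suggested by Norren and Vos (1974) to obtain the absolute density.
ABSOLUTE_OFFSET = 0.15

# Hypothetical transmittances, not measured data.
t_400 = 0.05  # assumed transmittance at a 400 nm test wavelength
t_700 = 0.70  # assumed transmittance at the 700 nm reference wavelength

relative_density = relative_lens_density(t_400, t_700)
absolute_density = relative_density + ABSOLUTE_OFFSET
print(round(relative_density, 3), round(absolute_density, 3))
```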
There is general agreement between the results derived by different methods. The most recent
version (Stockman et al. 1999) is illustrated in Figure 2.1.4-1; differences between this version
and the previous ones (Norren and Vos 1974; Wyszecki and Stiles 1982) are rather small.
Figure 2.1.4-1. Relative density of the human crystalline lens (relative density plotted against wavelength λ, 350 to 700 nm).

As proposed by (Stockman et al. 1999); 25% error bars illustrate the estimated variability between
observers under the age of 30.
The density of the lens is known to vary considerably among observers. Norren and Vos
estimate this variability at 25% (illustrated in Figure 2.1.4-1 by the error bars) (Norren and Vos
1974). This estimation is based on 50 observers (Crawford 1949), all of whom were younger
than 30 years of age. However, lens absorption is known to increase considerably with age (Said
and Weale 1959) (Figure 2.1.4-2). Pokorny et al. estimate that the density at 400 nm increases
linearly by 0.12 units per decade between the ages of 20 and 60 years, and by 0.4 units per decade
above the age of 60 years (Pokorny et al. 1987). This implies an increase in density of approximately 38
per cent from the age of 20 to 60 years. Hence the real variability between observers in a group
with mixed ages is expected to be highly dependent on the variability of age, and can be
considerably higher than 25%.
Figure 2.1.4-2. Lens relative optical density for subjects of different age.
The figures at the top of the curves represent subject’s age. Reproduced from (Weale 1968)
(Pokorny et al. 1987) propose functions for estimation of lens density as a function of age:
$$ T_L = T_{L1}\left[1 + 0.02\,(A - 32)\right] + T_{L2} \tag{2.1.6} $$
and
$$ T_L = T_{L1}\left[1.56 + 0.0667\,(A - 60)\right] + T_{L2} \tag{2.1.7} $$
Eq. (2.1.6) should be used for ages 20-60, and Eq. (2.1.7) for ages above 60. A in these
equations is the observer’s age, TL1 represents the portion of density affected by aging after age
20, and TL2 represents the portion stable after age 20. For lens density values proposed by
(Wyszecki and Stiles 1982), the values of TL1 and TL2 are tabulated in Table 1 of (Pokorny et al.
1987); for the Norren and Vos functions alternative values are proposed. The more recent lens
density values proposed by Stockman and Sharpe (Stockman et al. 1999) are very close to
the Norren and Vos ones, so the same TL1 and TL2 values are expected to be suitable. It should be
remembered, however, that the estimates of equations (2.1.6) and (2.1.7) are subject to an
uncertainty of at least 25%, the estimated variability of lens density within the same age group.
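A minimal sketch of how Eqs. (2.1.6) and (2.1.7) could be applied is given below. The TL1 and TL2 inputs are placeholders rather than the tabulated values of Pokorny et al., and the function name and the handling of ages below 20 are assumptions made for the illustration.

```python
def lens_density_for_age(age, tl1, tl2):
    """Lens density at one wavelength as a function of age, Eqs. (2.1.6) and (2.1.7).

    tl1: portion of the density affected by aging after age 20
    tl2: portion of the density that is stable after age 20
    """
    if age < 20:
        # Assumption: the equations are only stated for ages of 20 and above.
        raise ValueError("the age functions are defined for ages of 20 and above")
    if age <= 60:
        factor = 1.0 + 0.02 * (age - 32)      # Eq. (2.1.6), ages 20-60
    else:
        factor = 1.56 + 0.0667 * (age - 60)   # Eq. (2.1.7), ages above 60
    return tl1 * factor + tl2


# Placeholder components, not values from Table 1 of Pokorny et al. (1987).
tl1_example, tl2_example = 1.0, 0.5
for age in (20, 32, 60, 75):
    print(age, round(lens_density_for_age(age, tl1_example, tl2_example), 3))
```

The two branches meet at age 60, where both give a factor of 1.56, so the function is continuous across the age boundary.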
The values illustrated in Figure 2.1.4-1 are appropriate for small pupil size. Due to lenticular
shape of the eye lens, its thickness varies from the centre to the periphery; therefore the
effective lens density varies with the pupil size. Norren and Vos estimate the ratio of the lens
density for small pupil to that with open pupil to be 1.16 (Norren and Vos 1974), i.e.
$$ D_o = \frac{D_s}{1.16} \tag{2.1.8} $$
where Do is the density with the open pupil, and Ds is the density with the small pupil.
2.1.5. Macular pigment
The macula lutea (yellow spot) is a layer of yellow pigment covering the fovea. It is most
intense in the fovea (1.4° – 5.2°), gradually fading out beyond the fovea (Wyszecki and Stiles
1982). It acts as an inert filter absorbing mostly in the short-wavelength part of the visible
spectrum. The functional purpose of this spot is not known. It has been suggested that it
serves to enhance visual acuity by partially filtering out the blue part of the spectrum, thus
reducing the effect of chromatic aberration in the eye; it is also said to provide protection to the
retina from potentially harmful high-frequency radiation (Rodieck 1998). The pigment in the
yellow spot has been identified as carotenoid (Wyszecki and Stiles 1982), having its source in
the diet – in products such as egg yolk, orange pepper or spinach (Beatty et al. 2004). It has
been suggested that the density of macular pigment can be reduced as the result of heavy
smoking (Hammond and Caruso-Avery 2000), and increased for observers living in the areas
with high exposure to sun (Ishak 1952).
Figure 2.1.5-1. Density of macular pigment as measured by three different studies (density plotted against wavelength λ, 390 to 570 nm).

Thin line (Wyszecki and Stiles 1982); thick line (Vos 1972); dashed line (Bone et al. 1992).
Table 2.1.5-1 contains the extract from Table 1 of (Pease et al. 1987), combined with the
summary of Table 1 from (Bone et al. 1992) and Table 2 from (Diaz et al. 1998). In the recent
study of the cone spectral sensitivity (Stockman et al. 1999) it was found that the curve derived
by Bone (Bone et al. 1992) results in the most plausible short-wavelength cone sensitivity
function.
Publication                        Number of subjects   Mean density at 460 nm   Range       CV
(Wald 1945)                        10                   0.5                      0.0-1.0     NA
(Bone and Sparrock 1971)           49                   0.53                     0.0-1.0     NA
(Pease et al. 1987)                27                   0.77                     0.21-1.22   40.49%
(Vries et al. 1953)                20                   -                        0.07-0.52   NA
(Grutzner and Kohlrausch 1961)     4                    0.54                     0.29-0.75   40.74%
(Norren and Tiemeijer 1986)        2                    0.24                     0.14-0.36   53.03%
(Bone et al. 1992)                 7                    0.52                     0.21-0.77   40.23%
(Diaz et al. 1998)                 8                    0.33075                  0.14-0.38   50.80%

Table 2.1.5-1. Summary of measurements of macular pigment density.

The top 6 rows have been extracted from Table 1 of (Pease et al. 1987). CV in the last column stands for
Coefficient of Variation: the ratio of standard deviation to mean, expressed as a percentage.
As noted above, the macular pigment is unevenly distributed in the retina: it has the highest
density in the fovea and is absent in the foveola and in the periphery. Therefore the effective density of
the pigment is dependent on the angular size of the viewing field. Stockman et al. estimated the
peak density to be 0.095 and 0.32 for 10° and 2° field sizes, respectively (Stockman et al. 1999)
(Figure 2.1.5-2).
Figure 2.1.5-2. Density of macular pigment (Stockman et al. 1999) for different viewing field sizes (density plotted against wavelength, 390 to 570 nm).
10° (thick line) and 2° (solid line) viewing fields.
The inter-observer variability of macular pigment density is rather difficult to estimate. As we
have mentioned, the density varies not only between observers, but also within the same
observer. The last column of Table 2.1.5-1 lists the relative standard deviation values estimated
from data in the corresponding publications. The mean value is ≈45%, which is illustrated as error
bars in Figure 2.1.5-3.
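The ≈45% figure can be checked as the plain arithmetic mean of the five CV values available in Table 2.1.5-1. The sketch below assumes that an unweighted mean was intended, which the text does not state explicitly.

```python
# CV values (per cent) from the last column of Table 2.1.5-1; rows marked NA are omitted.
cv_percent = {
    "Pease et al. 1987": 40.49,
    "Grutzner and Kohlrausch 1961": 40.74,
    "Norren and Tiemeijer 1986": 53.03,
    "Bone et al. 1992": 40.23,
    "Diaz et al. 1998": 50.80,
}

# Assumption: unweighted mean, i.e. not weighted by the number of subjects per study.
mean_cv = sum(cv_percent.values()) / len(cv_percent)
print(round(mean_cv, 2))
```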
Figure 2.1.5-3. Density of macular pigment by (Stockman et al. 1999) for 2° viewing field, with 45%
error bars (density plotted against wavelength λ, 390 to 550 nm).
2.1.6. Photoreceptors
The process of vision is initiated when light penetrates the retina and reaches the outer segments
of the photoreceptors: rods and cones. These segments contain visual pigment, whose molecules
are composed of protein and chromophore. The portion of the protein in the pigment molecule
determines its sensitivity for different parts of visible spectrum (Rodieck 1998). Each
photoreceptor contains only one type of visual pigment.
2.1.6.1. Rods
Rods (Figure 2.1.6-1) are extremely sensitive to light, and are able to generate a signal in
response to single captured photons (Rodieck 1998). This sensitivity facilitates the prime
function of rods: to provide light discrimination under dim conditions. In discussion of colour
matching, it is usually assumed that rod signals are processed separately from cone signals; they
encode exclusively lightness information and do not have an effect on the perceived colour.
There is evidence that this assumption is incorrect, and that rod and cone signals do interact
(Wyszecki and Stiles 1982); rods do not have their own unique path to the brain (Zaidi 1986), and
their net effect is probably to shift the perceived hue towards blue (Buck 2001).
Figure 2.1.6-1. Rods and cones of the human retina.
A) Rods; B) cones. Reproduced from (Rodieck 1998)
The sensitivity of rods to the visible spectrum is described by the CIE 1951 (CIE 1951) scotopic
luminous efficiency function V′(λ) (Figure 2.1.6-2), which is based on experiments by Wald
(Wald 1945) and Crawford (Crawford 1949). In Wald’s experiment, 22 observers made absolute
threshold settings for monochromatic stimuli, whereas Crawford’s study involved 50 young
(<30 years) observers making brightness matches between a standard “white” light and a
monochromatic light. Due to the limited age distribution of the observers, the scotopic luminous
efficiency values tabulated by the CIE are valid only for observers under the age of 30. Above
that age the function changes as the result of lens yellowing. Crawford proposed a formula
for correcting the age effect on the V′(λ) function for stimuli below 500 nm (Wyszecki and
Stiles 1982):
$$ \Delta\left[\log V'(\lambda)\right] = 10^{-4}\,(500 - \lambda)(A - 30) \tag{2.1.9} $$
Here A is the age, and Δ[log V′(λ)] is the difference between the logarithms of the standard and
the "aged" V′(λ) values.
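One possible reading of Eq. (2.1.9) is sketched below. It assumes that the difference is taken as standard minus aged (so the aged value is lower at short wavelengths), that the logarithm is base 10, and that no correction applies at or below the age of 30; the function name and the input values are illustrative only.

```python
def aged_scotopic_efficiency(v_standard, wavelength_nm, age):
    """Apply the age correction of Eq. (2.1.9) to a standard V'(lambda) value."""
    # Assumption: the correction applies only below 500 nm and above the age of 30.
    if wavelength_nm >= 500.0 or age <= 30.0:
        return v_standard
    delta_log = 1e-4 * (500.0 - wavelength_nm) * (age - 30.0)
    # Assumption: delta_log = log10(standard) - log10(aged), so the aged value is smaller.
    return v_standard * 10.0 ** (-delta_log)


# Illustrative input: a normalised standard value of 1.0 at 400 nm for a 50-year-old observer.
v_aged = aged_scotopic_efficiency(1.0, 400.0, 50.0)
print(round(v_aged, 3))
```

With these assumptions, a 50-year-old observer at 400 nm has a log difference of 0.2, that is, roughly 63% of the standard value.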

Figure 2.1.6-2. Rod sensitivity – the scotopic luminous efficiency function V′(λ) (luminous efficiency plotted against wavelength λ, 380 to 680 nm).
Rods have limited dynamic range; that is – they are able to detect variations in light intensity
within a certain range, above which they do not respond to changes any more. The response of
rod mechanism to increments in light intensity is described by the Threshold-Versus-
Illuminance (TVI) function. Traditionally the Aguilar-Stiles function (Aguilar and Stiles 1954)
is used. This function was measured extrafoveally and is shown in Figure 2.1.6-3. In the plot,
data points tabulated in (Wyszecki and Stiles 1982) are connected by a 5th order polynomial,
which can be used to determine values which are not included in the table:
$$ y = 0.002666x^{5} + 0.016088x^{4} - 0.012454x^{3} - 0.057213x^{2} + 0.954447x - 0.603655 \tag{2.1.10} $$
According to the Aguilar and Stiles TVI function, rods saturate at retinal illuminance of
approximately 1000 scotopic trolands.
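The polynomial of Eq. (2.1.10) can be evaluated with Horner's scheme, as in the sketch below. It assumes, following the axes of Figure 2.1.6-3, that x is the logarithm of the scotopic retinal illuminance in trolands and y is the logarithm of the increment threshold; the function name is illustrative, and the fit should not be trusted outside the range of the tabulated data.

```python
# Coefficients of Eq. (2.1.10), from the highest power (x^5) down to the constant term.
TVI_COEFFICIENTS = (0.002666, 0.016088, -0.012454, -0.057213, 0.954447, -0.603655)


def tvi_log_threshold(log_scotopic_trolands):
    """Evaluate the 5th order polynomial fit of Eq. (2.1.10) by Horner's scheme."""
    y = 0.0
    for coefficient in TVI_COEFFICIENTS:
        y = y * log_scotopic_trolands + coefficient
    return y


# Evaluate at a few points inside the plotted range (assumed to be about -4 to 4).
for x in (-2.0, 0.0, 1.0, 2.0):
    print(x, round(tvi_log_threshold(x), 4))
```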
Figure 2.1.6-3. Aguilar and Stiles TVI function (log increment threshold plotted against log scotopic luminance in trolands).

The dots show the tabulated data points; the continuous line is the 5th order polynomial fit (see text).
Shapiro et al. measured an alternative TVI function better suited to the conditions of a colour matching experiment with a bipartite field (Shapiro et al. 1996). They report that the TVI depends on the background luminance, and conclude that in some conditions the Aguilar and Stiles TVI can slightly underestimate the threshold. However, the differences between the new TVI and the Aguilar and Stiles one are minor and perhaps insignificant for practical purposes.
2.1.6.2. Cones
Cone sensitivity
Cone receptors are significantly less sensitive than rods, but are capable of adapting to a significantly larger luminance range of up to six log units (Rodieck 1998). This allows cones to facilitate colour vision in the widest range of lighting conditions, from very dim light to direct sunlight.
The selective sensitivity of a cone to parts of the visible spectrum is defined by its ability to absorb photons of different wavelengths. Optimal cone sensitivity is facilitated by its waveguide properties: the ability to “capture” and funnel light through its long outer segment (Packer and Williams 2003), allowing the light to interact with the photopigment over a longer path length. In other words, longer outer segments give photons a better chance to interact with the molecules of photopigment, thus providing for better cone sensitivity.
The measure of the ability of a single cone to catch photons is absorptance: the ratio of the absorbed radiant flux to the incident flux (CIE 2005). Given the absorbance and the peak axial optical density, the absorptance is calculated as
$$a(\lambda) = \left(1 - 10^{-D_{max} A_0(\lambda)}\right)\frac{1}{\max\left(a(\lambda)\right)} \tag{2.1.11}$$
where $D_{max}$ is the peak axial optical density of the cone, and $A_0$ is the low density spectral absorbance of the photopigment normalised to unity at the peak, that is, the density at an infinitely low concentration of photopigment, calculated as the negative logarithm to base ten of the ratio of the transmitted light to the incident light at wavelength λ:
$$A_0(\lambda) = -\log\left(\frac{I(\lambda)}{I_0(\lambda)}\right) \tag{2.1.12}$$
Eq. (2.1.11) can be inverted to calculate the cone photopigment low density absorbance from peak density and absorptance normalised to unity at peak:
$$A_0(\lambda) = \left(-\frac{\log_{10}\left[1 - a(\lambda)\left(1 - 10^{-D_{max}}\right)\right]}{D_{max}}\right)\frac{1}{\max\left(A_0(\lambda)\right)} \tag{2.1.13}$$
The value of a(λ) characterises the ability of the cone to catch quanta of light, thus ultimately defining its sensitivity to light. In order to express this sensitivity in terms of radiant energy, the absorptance values need to be multiplied by wavelength:
$$c(\lambda) = a(\lambda) \cdot \lambda \tag{2.1.14}$$
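The chain of Eqs. (2.1.11), (2.1.13) and (2.1.14) can be sketched as below. The function names, the sample absorbance values and the peak density of 0.4 are hypothetical; the normalising term in Eq. (2.1.11) is assumed to be the maximum of the un-normalised absorptance, so that the result peaks at unity.

```python
import numpy as np


def absorptance_from_absorbance(A0, d_max):
    """Eq. (2.1.11): absorptance, normalised to unity at its peak."""
    A0 = np.asarray(A0, dtype=float)
    a = 1.0 - 10.0 ** (-d_max * A0)
    return a / a.max()


def absorbance_from_absorptance(a, d_max):
    """Eq. (2.1.13): low density absorbance, normalised to unity at its peak."""
    a = np.asarray(a, dtype=float)
    A0 = -np.log10(1.0 - a * (1.0 - 10.0 ** (-d_max))) / d_max
    return A0 / A0.max()


def energy_sensitivity(a, wavelengths):
    """Eq. (2.1.14): multiply absorptance by wavelength."""
    return np.asarray(a, dtype=float) * np.asarray(wavelengths, dtype=float)


# Hypothetical data: four wavelengths and a normalised absorbance spectrum.
wavelengths = np.array([500.0, 540.0, 560.0, 600.0])
A0 = np.array([0.1, 0.5, 1.0, 0.3])
a = absorptance_from_absorbance(A0, d_max=0.4)
print(a)
print(absorbance_from_absorptance(a, d_max=0.4))  # recovers A0
print(energy_sensitivity(a, wavelengths))
```

The round trip recovers the input absorbance only when the input is itself normalised to unity at its peak, as the text requires.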
The values c(λ) characterise the relative sensitivity of cones if the absorption of light by the
prereceptoral filters – the macular pigment and the lens – is ignored. If the prereceptoral filters
are accounted for, the resulting curves are termed cone fundamental sensitivity functions, or
cone fundamentals. These functions are the special case of colour matching functions, measured
with imaginary primary lights chosen so each primary excites only one type of cone receptor,
and are denoted as l(λ), m(λ) and s(λ) for long-, medium- and short-wavelength sensitive cones
respectively.
The most recent set of cone fundamentals (Stockman et al. 1999; Stockman and Sharpe 2000) is
shown in Figure 2.1.6-4. The sensitivity of the short-wavelength sensitive cones was estimated
by two methods: direct psychophysical measurement of threshold sensitivity on trichromats and
monochromats (lacking functioning L and M cones), and analysis of Stiles and Burch (Stiles
and Burch 1959) colour matching functions. Both methods yielded very similar results. The
sensitivities of medium- and long-wavelength sensitive cones were estimated from psychophysical threshold measurements on a group of dichromats lacking functioning M or L cones.
Figure 2.1.6-4. Cone fundamental sensitivity functions – cone fundamentals (Stockman et al. 1999; Stockman and Sharpe 2000). [Plot: relative sensitivity (0.0 to 1.0) against wavelength λ (390 to 690).]
The cone sensitivities in Figure 2.1.6-4 represent the mean sensitivities of a population with normal colour vision. These sensitivities vary between observers as the result of variations in prereceptoral filtering, as well as variations in the properties of the cones themselves. The exact nature and magnitude of the cone receptor variations are still a subject of much controversy; however, they can be classified into two kinds:
− Shift in peak sensitivity
− Relative quantities of cones of different types in the retina
Shift in peak sensitivity
The first suggestion of a possible multimodal distribution of colour matches, which could be the result of two variants of cones of the same type, can be found in Wyszecki’s internal NRC report (Wyszecki 1959), referenced in (Nimeroff et al. 1961). Additional evidence was reported by (Dartnall and Lythgoe 1965) from an analysis of chemical factors in photopigments, and by (Alpern and Moeller 1977) from a psychophysical study of anomalous trichromats and dichromats. The first indication from direct measurements came in 1983 (Dartnall et al. 1983) from a microspectrophotometric study; Figure 3 from the original paper is reproduced here in Figure 2.1.6-5. However, due to the limited sample size (173 outer segments of rods and cones from seven persons) and ambiguous results of the statistical analysis, no definite conclusions could be drawn.
Figure 2.1.6-5. Reproduction of Figure 3 from (Dartnall et al. 1983), showing indication of bimodal
distribution of green and red photopigment peak density.
(Neitz and Jacobs 1986), based on psychophysical results, suggested a link between the variations in the cone spectral sensitivities and individual variations in DNA sequence: polymorphism. In the following years this link was established and supported by numerous publications (Neitz and Jacobs 1989; Neitz et al. 1991; Merbs and Nathans 1992; Winderickx et al. 1992; Neitz et al. 1993; Sanocki et al. 1993; Neitz et al. 1994; Neitz et al. 1995; Sharpe et al. 1998; Wolf et al. 1998), although most of the data concern the L (red) cone pigment. Table 2.1.6-1 summarises the reports on this subject.
Publication  L cone shift  M cone shift
(Dartnall et al. 1983)  9 nm  6 nm
(Neitz and Jacobs 1990)  6 nm  6 nm
(Neitz et al. 1991)  5-6 nm
(Winderickx et al. 1992)  5 nm
(Merbs and Nathans 1992)  5 nm
(Neitz et al. 1993)  6 nm  7 nm
(Sanocki et al. 1993)  2.6-2.7 nm
(Neitz et al. 1995)  5 nm  7 nm
(Sharpe et al. 1998)  2.7 nm
(Kraft et al. 1998)  4 nm
Table 2.1.6-1. Summary of estimations of shift in peak sensitivity of green and red cones from various publications
The exact consequences and nature of photopigment polymorphism are still not understood. Specifically, these questions remain open:
− What is the magnitude of the spectral shift in the peak sensitivity?
− Do green and blue cones vary as well as the red ones?
− Can cones of the same type but of different variants be mixed within the same retina?
− What is the practical implication of polymorphism for colour perception?
L/M cones ratio
Another variable related to photoreceptors is the relative number of L and M cones. Individual differences can be very significant, in extreme cases as large as 30-fold (Carroll et al. 2002), although in most subjects the variations fall within an approximately 4-fold range. A striking visualisation of such individual differences is shown in Figure 2.1.6-6, reproduced from (Roorda and Williams 1999).
Figure 2.1.6-6. Pseudocolour image of the retina of two male subjects.
The estimated relative amounts of cones of each type are 75.8%, 20% and 4.2% for L, M and S cones, respectively, for subject a; and 50.6%, 44.2% and 5.2% for subject c. Reproduced from (Roorda and Williams 1999).
Large inter-observer variations in L/M ratio are not known to have significant practical
implications on colour vision. Their effect is mostly in spatial resolution and on luminous
sensitivity curve (Packer and Williams 2003).
2.1.6.3. MacLeod-Boynton chromaticity diagram
In 1979, MacLeod and Boynton proposed a two-dimensional Cartesian diagram (MacLeod and
Boynton 1979), in which a colour stimulus is identified according to the ratio of cone signal to
luminance signal. Under the assumption that the perception of luminance is mediated by L and
M cones only, the chromaticity coordinates are calculated as:
$$L/(L+M) = \frac{L}{L+M} \tag{2.1.15}$$
$$M/(L+M) = \frac{M}{L+M} \tag{2.1.16}$$
$$S/(L+M) = \frac{S}{L+M} \tag{2.1.17}$$
The result (Figure 2.1.6-7) is a plane of constant luminance, in which the abscissa corresponds
to the relative excitation of L and M cones, and ordinate represents the excitation of blue cones
relative to “luminance signal”. As evident from the equations, only L/(L+M) and S/(L+M) are
required for full specification of a stimulus, as the third coordinate M/(L+M) is readily
calculated as
$$M/(L+M) = 1 - \frac{L}{L+M} \tag{2.1.18}$$
Figure 2.1.6-7. MacLeod-Boynton chromaticity diagram with locus of monochromatic stimuli. [Plot: S/(L+M) (0.0 to 1.0) against L/(L+M) (0.3 to 0.6).]
Lights at the blue end of the spectrum excite mainly S cones and stimulate the L and M cones very little or not at all; therefore their chromaticities plot at the top of the diagram. Greens and reds are distributed along the bottom. Because its values relate directly to the physiologically-meaningful values of receptoral and postreceptoral signals, the MacLeod-Boynton diagram has become one of the main colour stimulus specification tools in vision research, preferred over the CIE xy or CIELAB diagrams.
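The calculation in Eqs. (2.1.15) to (2.1.18) is simple enough to sketch directly. The function name is hypothetical, and the cone excitations L, M and S are assumed to be already available and expressed in the same units; no additional scaling of the S axis is applied, following the equations as written here.

```python
def macleod_boynton(L, M, S):
    """Return the (L/(L+M), S/(L+M)) coordinates of a stimulus."""
    luminance = L + M  # luminance is assumed to be mediated by L and M cones only
    if luminance <= 0:
        raise ValueError("L + M must be positive")
    return L / luminance, S / luminance


# Hypothetical cone excitations.
l_coord, s_coord = macleod_boynton(L=0.6, M=0.4, S=0.2)
m_coord = 1.0 - l_coord  # Eq. (2.1.18): the third coordinate is redundant
print(l_coord, m_coord, s_coord)
```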
2.1.7. Retinal topography
2.1.7.1. Summary of retinal topography
The retina is not uniform with respect to the distribution of photoreceptors, the blood supply and the pigmentation. These non-uniformities affect colour matching, acuity and sensitivity; hence they also affect the design of colour vision psychophysical experiments.
Figure 2.1.7-1. Spatial density of rods and cones in the retina.

Number of receptors plotted versus eccentricity in millimetres. The diagram is reproduced from (Rodieck
1998).
The summary given here is based on (Polyak 1945), reproduced in Table 1(2.2.5) of (Wyszecki and Stiles 1982), and on (Rodieck 1998); it is illustrated graphically in Figure 2.1.7-1. In order of increasing distance from the centre:
− 0.17-0.24°: central island. There are no rods and no blood vessels; cone outer segments have maximum length.
− 1.4°: foveola. Nearly flat area with no blood vessels and no rods.
− 5.2°: fovea. Including the central island and foveola, this is the area with the highest density of cones. Rod density starts to increase. The cones in the fovea have a somewhat different structure from those in the periphery, and are estimated to have more pathways to the brain. The density of cones falls sharply from the foveola; at about 5° it stabilises at c. 4000-5000/mm², and rods have their maximum density.
− 5.2° – 8.6°: parafovea. The density of rods begins to decrease.
− 8.6° – 19°: perifovea. The density of rods continues to decrease, reaching a minimum of ≈3000/mm² at 19°.
2.1.7.2. Cone density
The existence of large variations in the M:L cone ratio between subjects has already been mentioned. Curcio et al. show that the maximum cone packing density in the fovea is also highly variable: between 100,000 and 324,000 cones/mm² in different individuals at the point of highest density (≈0.032°) (Curcio et al. 1990). These individual variations are illustrated in Figure 2.1.7-2: the variability between subjects is highest in the fovea, and decreases towards the periphery. The total number of cones within the 5 mm of the central retina remains almost invariant between subjects; the differences are in the distribution of cones within this area (Curcio et al. 1990). The receptors are also not distributed symmetrically in the eye: the density in the nasal area is about 40%-45% higher than in the temporal area (ibid).
Figure 2.1.7-2. Individual variability of cone density in fovea of seven subjects.

Reproduced from (Curcio et al. 1990)
The exact functional effect of the variations in cone density on vision is not well understood. It
is considered to be mostly on spatial resolution (Curcio et al. 1990); no implications on colour
matching or colour discrimination are reported.
2.1.7.3. Macular pigment
An additional factor contributing to the non-uniformity is the macular pigment (MP) layer. The pigmentation is very slight in the fovea, about the thickest in the parafovea, and decreases gradually up to 17°. Although significant variations of MP between subjects exist, the spatial profile remains rather similar (Hammond et al. 1997), as illustrated in Figure 2.1.7-3. The plot also illustrates that the MP density is invariant with time for the same subject; it is noted, however, that this requires the dietary patterns to be stable.
Figure 2.1.7-3. Spatial distribution of macular pigment density of 4 subjects measured over 4-14
months.
The optical density is plotted against eccentricity in degrees. Reproduced from (Hammond et al. 1997)
2.2. Principles of Colorimetry
2.2.1. Introduction
Colour is a cognitive perception resulting from the interaction of light with the vision
mechanisms; therefore, properties of these two systems have to be accounted for in any colour
specification system. Colorimetry is concerned with numerical colour specification, which is
achieved by modelling the interaction between the physical and the perceptual worlds.
Wyszecki & Stiles define the following requirements for the system of colorimetry (Wyszecki
and Stiles 1982):
− When viewed by an observer with normal colour vision, under the same
observing conditions, stimuli with the same specifications look alike (i.e.
in complete color match)
− Stimuli that look alike have same specifications, and
− The numbers comprising the specifications are continuous functions of
the physical parameters defining the spectral radiant power distribution
of the stimulus.
In more recent terminology, these specifications define basic colorimetry, as opposed to advanced colorimetry (Wyszecki 1973), which deals with the specification of colour in varying viewing conditions (also termed Colour Appearance Modelling (Fairchild 2005)). Basic colorimetry is the subject of this section. The review is based on chapter 3 of (Wyszecki and Stiles 1982) unless otherwise stated.
2.2.2. Colour matching and metamerism
2.2.2.1. Metamerism
It has already been mentioned that the principle of univariance is fundamental to modelling visual perception: the signal generated by a cone in response to photon catch is independent of the photon’s frequency. An equal number of absorbed photons results in identical cone signals, independently of the physical properties of the light. Identical signals from the cones result in identical perception. Hence, two lights which result in identical cone signals will match in colour, independently of their spectral composition. This is the principle of colour matching, which is the fundamental principle of all colour specification systems, and is the basis of the present work.
Figure 2.2.2-1. Illustration of the phenomenon of metamerism.
Vertical lines s, m and l signify hypothetical sensitivities of some sensory system to light. Red and green curves signify two lights of different spectral properties. These two curves intersect with the sensitivity lines at the same points, hence the lights they represent, despite the different physical properties, cause identical signals of magnitude S, M and L, and make a metameric match. [Plot: energy against wavelength, 350 nm to 800 nm.]
It follows from the principle of colour matching, that any colour perception can be triggered by
an infinite number of spectrally different lights. The phenomenon where lights having different
spectral power distributions (SPD) trigger identical perception of colour is called metamerism.
Two physically different colour stimuli which trigger identical cone signals are termed
metamers, and such a colour match is a metameric match (Figure 2.2.2-1).
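The idea can be illustrated numerically with a minimal sketch. The Gaussian sensitivity curves, their peak wavelengths and the flat first spectrum are all hypothetical and are not real cone fundamentals. A second spectrum is built by adding a component from the null space of the sensitivity matrix, which changes the spectral power distribution without changing any of the three signals.

```python
import numpy as np

wavelengths = np.arange(400.0, 701.0, 10.0)  # nm, 31 samples


def gaussian(peak, width=40.0):
    """Hypothetical bell-shaped spectral sensitivity."""
    return np.exp(-0.5 * ((wavelengths - peak) / width) ** 2)


# Rows: three hypothetical receptor sensitivities (not real cone fundamentals).
sensitivities = np.vstack([gaussian(440.0), gaussian(540.0), gaussian(570.0)])

# First light: a flat spectrum.
spd_1 = np.ones_like(wavelengths)

# The last right-singular vector lies in the null space of the 3 x 31 matrix,
# so adding it to a spectrum leaves all three receptor signals unchanged.
_, _, vt = np.linalg.svd(sensitivities)
black = vt[-1]
spd_2 = spd_1 + 0.5 * black / np.abs(black).max()  # scaled to stay positive

signals_1 = sensitivities @ spd_1
signals_2 = sensitivities @ spd_2
print(np.allclose(signals_1, signals_2))  # identical receptor signals
print(np.allclose(spd_1, spd_2))          # yet different spectra
```

Any linear combination of the null-space vectors could be added in the same way, which is the sense in which a given set of signals corresponds to an infinite number of spectrally different lights.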
2.2.2.2. Neural and quantal colour match and metamerism
If neural signals generated in response to exposure to two stimuli are the same, the resulting
colour sensations will match. The equality of signals can be achieved in three ways:
1. The viewing conditions are such that only the cone receptors are operating. In these
conditions, visual equivalence is the result of identical number of photons absorbed by
each cone type for each stimulus. This is the cone-quantal colour match and can be
thought of as strict trichromacy: three cone signals correspond to three neural channels.
2. The conditions are such that rods are operating, but their contributions to the perception of both stimuli are identical. The rod signal can be thought of as cancelling out, and this situation does not differ in principle from case 1.
3. Rods are operating and contribute differently to the perception of the two stimuli. In this situation the visual equivalence is the result of neural summation at the post-receptoral stage; this is the neural colour match. Rods do not have a unique path to the brain, and their signal is integrated in the visual path with the cone signals during retinal processing (Zaidi 1986). Thus the colour match is still trichromatic, but is facilitated by four types of receptors.
These principles can be summarised as follows:
If two stimuli match in colour, the cone signals that initiate them are not
necessarily identical.
In other words: all quantal matches are essentially neural, but not all neural matches are quantal. This distinction has important consequences for the discussion of failures of colorimetric additivity (Section 2.5).
We are led to the conclusion that there can be two types of metamers: those produced by a cone-quantal match, and those produced by a neural match. Consequently, we shall call these two types cone-quantal metamerism and neural metamerism.
At this stage, it is useful to clarify the meaning of the terms as they will be used in this work. The terms metamerism and metamer will be used to mean cone-quantal metamerism, i.e. a strictly cone-level phenomenon in the conditions of strict trichromacy. The term “colour matching” will be used in the broader meaning of neural metamerism as defined by Thornton (Thornton 1992a):
…lights that are visually indistinguishable in both chromaticness and perceived brightness.
2.2.3. Trichromacy, Additivity and Trichromatic Generalisation
Normal human colour vision is mediated by three types of cones, which differ in the portion of the visible spectrum they are sensitive to: Long-, Medium- or Short-wavelength (abbreviated as L, M and S cones). Therefore any test stimulus can be matched in colour by appropriate stimulation of the three cones. Such stimulation can be achieved by a suitably adjusted additive mixture of three spectrally different lights termed primary stimuli, if the following two rules are followed:
− None of the primary lights can be matched in colour by the mixture of the other two
− It is possible to mix one of the primary lights with the test stimulus.
The process of matching the test stimulus by the mixture of primaries is a colour matching experiment. There are several types of colour matching experiment which will be discussed later, but its classical form is illustrated in Figure 2.2.3-1. The observer is presented with a bipartite field. On one side of the field there is a test stimulus. The observer adjusts the quantities of the primary stimuli on the other side of the field until the mixture matches the test stimulus in colour.
Figure 2.2.3-1. Illustration of the colour matching experiment.
Test stimulus T on the left hand side of the field is matched in colour by the mixture of the three primary stimuli at the right hand side. [Diagram: bipartite field on a background, with the test colour (T) on one side and the red (R), green (G) and blue (B) primaries on the other.]
The result of a colour matching experiment can be expressed as follows. Let T be some test colour stimulus, and R, G and B be the primary stimuli. The colour match between T and the mixture of primary lights can be expressed as
T ≡ R ⊕ G ⊕ B (2.2.1)
This statement does not have mathematical meaning, and is interpreted as “the mixture of stimuli at the right hand side of the equation matches the stimulus at the left hand side”. In order to perform mathematical operations with the results of a colour matching experiment, we must assume that the quantities of stimuli T, R, G and B follow the rules of algebra.
The first such assumption was formulated by Grassmann (Grassmann 1853), when he stated the principle of the additive mixture:
The total intensity of any mixture is the sum of the intensities of the lights mixed.
He immediately states, however:
This (…) assumption is not to be regarded as (…) well founded (…), although it appears to be probable.
Since Grassmann, this assumption has been called Grassmann’s assumption of additivity, or Grassmann’s Law. Later it was re-formulated by Wyszecki & Stiles (Wyszecki and Stiles 1982) and became the weaker, or qualitative, form of the principle of trichromatic additive colour mixture:
Additive mixture means a color stimulus for which the radiant power in any wavelength interval, small or large, in any part of the spectrum is equal to the sum of powers in the same interval of the constituents of the mixture, constituents which are assumed to be optically incoherent.
However, this statement is still not sufficient. In order to consider the colour stimuli to be
algebraic quantities they need to obey the rules of linearity – which are stated in the stronger
form of Trichromatic Generalisation (TG) (Wyszecki and Stiles 1982):
Symmetry Law: If colour stimulus A matches colour stimulus B, then colour stimulus B matches
colour stimulus A
if A ≡ B then B ≡ A (2.2.2)
Transitivity Law: If A matches B and B matches C, then A matches C
if A ≡ B and B ≡ C then A ≡ C (2.2.3)
Proportionality Law: If A matches B, then kA matches kB, where k is any positive factor by
which the radiant power of the stimulus is increased or reduced, while its relative spectral power
distribution remains the same
if A ≡ B then kA ≡ kB (2.2.4)
Additivity Law: If A, B, C and D are any four colour stimuli, then if any two of the following
three conceivable colour matches
A ≡ B, C ≡ D and A ⊕ C ≡ B ⊕ D
hold true, then so does the remaining match
A⊕ D ≡ B ⊕C (2.2.5)
The sign ⊕ above means “additively mixed”, and sign ≡ means “visually matches”.
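The four laws can be checked numerically for an idealised observer. The sketch below assumes a purely linear receptor model: a random 3 x 8 sensitivity matrix stands in for the visual system, "additively mixed" is modelled as vector addition of spectra, and "visually matches" as equality of the three signals. All names and values are hypothetical. It shows only that such a linear model obeys the laws, not that human vision does.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.random((3, 8))  # hypothetical linear sensor: 3 channels x 8 spectral bands


def matches(x, y):
    """Two spectra 'match' when they produce the same three signals."""
    return np.allclose(S @ x, S @ y)


# Rows 3..7 of V^T span the null space of S: adding such a vector changes
# the spectrum but not the signals, giving a metameric pair.
null_basis = np.linalg.svd(S)[2][3:]

A = rng.random(8) + 1.0
C = rng.random(8) + 1.0
B = A + 0.1 * null_basis[0]  # B is a metamer of A
D = C + 0.1 * null_basis[1]  # D is a metamer of C

print("symmetry:       ", matches(A, B) and matches(B, A))
print("proportionality:", matches(2.5 * A, 2.5 * B))
print("additivity:     ", matches(A + C, B + D) and matches(A + D, B + C))
```

Transitivity follows in the same way, since equality of signals is itself transitive in this model.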
2.2.4. Tristimulus space and chromaticity diagram
Assuming the validity of the TG as it was expressed by Eq. (2.2.2)-(2.2.5), the result of the colour matching experiment in Eq. (2.2.1) can be written as a vector colorimetric equation having the following form:
$$\mathbf{Q} = R\mathbf{R} + G\mathbf{G} + B\mathbf{B} \tag{2.2.6}$$
where the vectors $\mathbf{R}$, $\mathbf{G}$ and $\mathbf{B}$ are the unit amounts of the primary stimuli, the scalars R, G and B are their multipliers, and $\mathbf{Q}$ is the unit amount of the test stimulus. Equation (2.2.6) defines stimulus $\mathbf{Q}$ as a vector having coordinates (R, G, B) in the three-dimensional tristimulus space with axes $\mathbf{R}$, $\mathbf{G}$ and $\mathbf{B}$ (Figure 2.2.4-1). The vector coordinates R, G and B are said to be the tristimulus values of stimulus $\mathbf{Q}$ in tristimulus space (R,G,B), and equation (2.2.6) is called the colour matching equation.
Figure 2.2.4-1. Diagram of the tristimulus space.
Three axes are given by the unit amounts of the three primaries R, G and B. Any stimulus Q can be represented by a tristimulus vector, whose coordinates (R,G,B) are lengths along the axes R, G and B, and are said to be the tristimulus values of Q.
In practical work, it is often convenient to visualise colour stimuli in a two-dimensional plane. One such useful visualisation is the chromaticity diagram, the unit plane r + g + b = 1, whose coordinates are calculated as
$$r = \frac{R}{R+G+B} \tag{2.2.7}$$
$$g = \frac{G}{R+G+B} \tag{2.2.8}$$
$$b = \frac{B}{R+G+B} \tag{2.2.9}$$
Figure 2.2.4-2. (r, g) chromaticity diagram calculated by Eq. (2.2.7)-(2.2.9).
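As a small worked sketch of Eq. (2.2.7)-(2.2.9), the function below projects tristimulus values onto the unit plane. The function name and the sample values are hypothetical.

```python
def chromaticity(R, G, B):
    """Return the chromaticity coordinates (r, g, b) of tristimulus values."""
    total = R + G + B
    if total == 0:
        raise ValueError("R + G + B must be non-zero")
    return R / total, G / total, B / total


r, g, b = chromaticity(2.0, 1.0, 1.0)
print(r, g, b)     # the three coordinates sum to one
print(1 - r - g)   # so b can always be recovered from r and g
```

Because the coordinates sum to one, only two of them (conventionally r and g) need to be plotted, and scaling all three tristimulus values by the same factor leaves the chromaticity unchanged.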
2.2.5. Transformation of tristimulus space
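Assuming the validity of TG, tristimulus values measured with one set of primaries can be transformed to another set by two methods, termed inverse-matrix and forward-matrix transformations (Brill and Robertson 2006). Let $\mathbf{R}_1$, $\mathbf{G}_1$ and $\mathbf{B}_1$ be the primary stimuli of tristimulus space 1, and $\mathbf{R}_2$, $\mathbf{G}_2$ and $\mathbf{B}_2$ be the primaries of space 2. Let $\mathbf{Q}_1$, $\mathbf{Q}_2$ and $\mathbf{Q}_3$ be three stimuli measured in both tristimulus spaces, resulting in the following two sets of equations:
$$\begin{aligned}
\mathbf{Q}_1 &= R_{1,1}\mathbf{R}_1 + G_{1,1}\mathbf{G}_1 + B_{1,1}\mathbf{B}_1 \\
\mathbf{Q}_2 &= R_{2,1}\mathbf{R}_1 + G_{2,1}\mathbf{G}_1 + B_{2,1}\mathbf{B}_1 \\
\mathbf{Q}_3 &= R_{3,1}\mathbf{R}_1 + G_{3,1}\mathbf{G}_1 + B_{3,1}\mathbf{B}_1
\end{aligned} \tag{2.2.10}$$
and
$$\begin{aligned}
\mathbf{Q}_1 &= R_{1,2}\mathbf{R}_2 + G_{1,2}\mathbf{G}_2 + B_{1,2}\mathbf{B}_2 \\
\mathbf{Q}_2 &= R_{2,2}\mathbf{R}_2 + G_{2,2}\mathbf{G}_2 + B_{2,2}\mathbf{B}_2 \\
\mathbf{Q}_3 &= R_{3,2}\mathbf{R}_2 + G_{3,2}\mathbf{G}_2 + B_{3,2}\mathbf{B}_2
\end{aligned} \tag{2.2.11}$$
where $R_{i,j}$, $G_{i,j}$ and $B_{i,j}$ (i = 1,2,3; j = 1,2) are the tristimulus values of stimulus i in tristimulus space j. By the Transitivity Law of the Trichromatic Generalisation, and put in matrix form: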
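$$\begin{bmatrix} R_{1,1} & G_{1,1} & B_{1,1} \\ R_{2,1} & G_{2,1} & B_{2,1} \\ R_{3,1} & G_{3,1} & B_{3,1} \end{bmatrix}
\begin{bmatrix} \mathbf{R}_1 \\ \mathbf{G}_1 \\ \mathbf{B}_1 \end{bmatrix}
=
\begin{bmatrix} R_{1,2} & G_{1,2} & B_{1,2} \\ R_{2,2} & G_{2,2} & B_{2,2} \\ R_{3,2} & G_{3,2} & B_{3,2} \end{bmatrix}
\begin{bmatrix} \mathbf{R}_2 \\ \mathbf{G}_2 \\ \mathbf{B}_2 \end{bmatrix} \tag{2.2.12}$$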
Eq. (2.2.12) can be re-stated for convenience as
$$D_1 P_1 = D_2 P_2 \tag{2.2.13}$$
where $D_1$ and $D_2$ are 3x3 matrices of tristimulus values, and $P_1$ and $P_2$ are the 3x1 vectors of primaries. The task is to build an expression which transforms tristimulus values measured in terms of primaries 1 to tristimulus space 2.
Solving Eq. (2.2.13) for $P_1$ gives:
$$P_1 = M P_2 \tag{2.2.14}$$
where M is
$$M = D_1^{-1} D_2 \tag{2.2.15}$$
and is the 3x3 matrix that relates the primaries of tristimulus space 2 to the primaries of space 1.
The transformation to tristimulus space 2 of the tristimulus values measured in space 1 is carried out as follows. Let $\mathbf{Q}_S$ be some test stimulus, for which the following equation was measured in tristimulus space 1:
$$\mathbf{Q}_S = R_{S,1}\mathbf{R}_1 + G_{S,1}\mathbf{G}_1 + B_{S,1}\mathbf{B}_1 \tag{2.2.16}$$
or, in vector form
$$\mathbf{Q}_S = \begin{bmatrix} R_{S,1} & G_{S,1} & B_{S,1} \end{bmatrix} P_1 \tag{2.2.17}$$
Substituting the vector of primaries $P_1$ by the right part of Eq. (2.2.14) we have
$$\mathbf{Q}_S = \begin{bmatrix} R_{S,1} & G_{S,1} & B_{S,1} \end{bmatrix} M P_2 \tag{2.2.18}$$
Eq. (2.2.18) expresses stimulus $\mathbf{Q}_S$ in terms of the primaries of space 2.
Two special cases arise when the stimuli $\mathbf{Q}_1$, $\mathbf{Q}_2$ and $\mathbf{Q}_3$ are the primary stimuli of one of the tristimulus spaces. If they are the primaries of space 1:
$$\mathbf{Q}_1 = \mathbf{R}_1, \quad \mathbf{Q}_2 = \mathbf{G}_1, \quad \mathbf{Q}_3 = \mathbf{B}_1 \tag{2.2.19}$$
then the matrix $D_1$ is an identity matrix, and the transformation matrix M becomes equal to the matrix containing the tristimulus values of the primaries of space 1 in tristimulus space 2:
$$M = D_2 \tag{2.2.20}$$
The transformation becomes:
$$\begin{bmatrix} R_{S,2} & G_{S,2} & B_{S,2} \end{bmatrix} = \begin{bmatrix} R_{S,1} & G_{S,1} & B_{S,1} \end{bmatrix}
\begin{bmatrix} R_{1,2} & G_{1,2} & B_{1,2} \\ R_{2,2} & G_{2,2} & B_{2,2} \\ R_{3,2} & G_{3,2} & B_{3,2} \end{bmatrix} \tag{2.2.21}$$
where $R_{S,2}$, $G_{S,2}$ and $B_{S,2}$ are the calculated tristimulus values of $\mathbf{Q}_S$ in tristimulus space 2, $R_{S,1}$, $G_{S,1}$ and $B_{S,1}$ are the tristimulus values of $\mathbf{Q}_S$ in space 1, and $R_{i,2}$, $G_{i,2}$ and $B_{i,2}$ (i = 1,2,3) are the tristimulus values of the primary lights of set 1 measured by means of set 2. Equation (2.2.21) describes the forward-matrix method of tristimulus space transformation.
An alternative method is obtained if the stimuli $\mathbf{Q}_1$, $\mathbf{Q}_2$ and $\mathbf{Q}_3$ are the primary stimuli of space 2:
$$\mathbf{Q}_1 = \mathbf{R}_2, \quad \mathbf{Q}_2 = \mathbf{G}_2, \quad \mathbf{Q}_3 = \mathbf{B}_2 \tag{2.2.22}$$
In this case, the matrix $D_2$ is an identity matrix and, from Eq. (2.2.15), the transformation matrix becomes equal to the inverse of the matrix containing the tristimulus values of the primaries of set 2 measured in terms of set 1:
$$M = D_1^{-1} \tag{2.2.23}$$
The transformation in full becomes:
$$\begin{bmatrix} R_{S,2} & G_{S,2} & B_{S,2} \end{bmatrix} = \begin{bmatrix} R_{S,1} & G_{S,1} & B_{S,1} \end{bmatrix}
\begin{bmatrix} R_{1,1} & G_{1,1} & B_{1,1} \\ R_{2,1} & G_{2,1} & B_{2,1} \\ R_{3,1} & G_{3,1} & B_{3,1} \end{bmatrix}^{-1} \tag{2.2.24}$$
where $R_{S,2}$, $G_{S,2}$, $B_{S,2}$, $R_{S,1}$, $G_{S,1}$ and $B_{S,1}$ are as in Eq. (2.2.21), and $R_{i,1}$, $G_{i,1}$ and $B_{i,1}$ (i = 1,2,3) are the tristimulus values of the primary lights of set 2 measured by means of set 1. This is the inverse-matrix method of tristimulus space transformation.
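The relation between the general case and the two special cases can be sketched numerically. All matrices below are hypothetical: `M_true` plays the role of the unknown relation between the two sets of primaries and is used only to simulate ideal, noise-free measurements. Tristimulus values are treated as row vectors, as in Eq. (2.2.21) and (2.2.24).

```python
import numpy as np

# Hypothetical ground truth: row i holds the tristimulus values, in space 2,
# of primary i of set 1 (so that P1 = M_true P2).
M_true = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.7, 0.1],
                   [0.0, 0.1, 0.9]])

# General case, Eq. (2.2.15): three arbitrary stimuli measured in both spaces.
D1 = np.array([[1.0, 0.5, 0.2],
               [0.3, 1.0, 0.4],
               [0.1, 0.2, 1.0]])
D2 = D1 @ M_true                    # simulated measurements in space 2
M_general = np.linalg.inv(D1) @ D2

# Forward-matrix case: the stimuli are the primaries of set 1, so D1 = I
# and M is simply their measured tristimulus values in space 2.
M_forward = np.eye(3) @ M_true      # this is D2 when D1 = I

# Inverse-matrix case: the stimuli are the primaries of set 2, so D2 = I
# and D1 holds their tristimulus values measured in space 1.
D1_set2 = np.linalg.inv(M_true)     # simulated measurements in space 1
M_inverse = np.linalg.inv(D1_set2)

# Transform a test stimulus from space 1 to space 2.
t1 = np.array([0.4, 0.3, 0.2])
print(t1 @ M_general, t1 @ M_forward, t1 @ M_inverse)
```

With ideal data the three routes give the same matrix. With real measurements they need not agree, since each relies on a different set of matches and on the validity of the Trichromatic Generalisation.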
2.2.6. Colour matching experiment
Figure 2.2.3-1 illustrates the most common of the possible configurations of the colour matching experiment. All of the configurations share the same basic principle: the observer adjusts the mixture of the primary stimuli to match some test stimulus in colour. The differences between the methods concern the properties of the test stimulus (maximum saturation and Maxwell methods) and the temporal and spatial configuration of the test and the matching fields (symmetric and asymmetric match).
2.2.6.1. Maxwell’s method of colour matching
The first to carry out a colour matching experiment in its modern form was Maxwell ((Maxwell 1860), cited in (Mollon 2003)). In his experiment, Maxwell’s observer adjusted the mixture of three monochromatic lights to match the colour of white daylight (Figure 2.2.6-1 B). First, a match was established between the fixed test stimulus, the white light, and the mixture of the three primaries:
$$W\mathbf{W} = R_w\mathbf{R} + G_w\mathbf{G} + B_w\mathbf{B} \tag{2.2.25}$$
where $R_w$, $G_w$ and $B_w$ are the quantities of the primaries $\mathbf{R}$, $\mathbf{G}$ and $\mathbf{B}$ in the mixture which matches quantity W of the white light $\mathbf{W}$. Next, one of the primary lights was replaced by the variable test stimulus, and the quantities of the remaining two primaries were re-adjusted so that the mixture matched the white light again:
$$W\mathbf{W} = R\mathbf{R} + Q\mathbf{Q} + B\mathbf{B} \tag{2.2.26}$$
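where Q is the quantity of the variable test stimulus $\mathbf{Q}$. By the transitivity law of the TG, from Eqs. (2.2.25) and (2.2.26) it follows that
$$R_w\mathbf{R} + G_w\mathbf{G} + B_w\mathbf{B} = R\mathbf{R} + Q\mathbf{Q} + B\mathbf{B} \tag{2.2.27}$$
or
$$\frac{R_w - R}{Q}\mathbf{R} + \frac{G_w}{Q}\mathbf{G} + \frac{B_w - B}{Q}\mathbf{B} = \mathbf{Q} \tag{2.2.28}$$
It follows that the tristimulus values of stimulus $\mathbf{Q}$ are:
$$R_q = \frac{R_w - R}{Q}, \qquad G_q = \frac{G_w}{Q}, \qquad B_q = \frac{B_w - B}{Q} \tag{2.2.29}$$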
The decision as to which primary is replaced by the variable test stimulus depends on the test stimulus itself.
While mathematically the above procedure is perfectly valid, its practical implementation is subject to a weakness (Crawford 1965; Wyszecki and Stiles 1982; Trezona 1993). If the values $R_w$ and R and/or $B_w$ and B in Eq. (2.2.29) are very similar, the difference between them approaches zero, which amplifies the experimental uncertainties of $R_q$ and/or $B_q$.
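The calculation of Eq. (2.2.29), and the weakness just described, can be illustrated with a minimal sketch. The function name and all numerical values are hypothetical; the example assumes, as in Eq. (2.2.26), that the green primary is the one replaced by the test stimulus.

```python
def maxwell_tristimulus(Rw, Gw, Bw, R, B, Q):
    """Eq. (2.2.29): tristimulus values of unit test stimulus, Maxwell method."""
    return (Rw - R) / Q, Gw / Q, (Bw - B) / Q


# A well-conditioned hypothetical match.
print(maxwell_tristimulus(Rw=2.0, Gw=3.0, Bw=1.0, R=1.5, B=0.5, Q=2.0))

# Ill-conditioned case: Rw and R are nearly equal, so a 1% error in the
# measured R produces a much larger relative error in Rq.
Rq_nominal = maxwell_tristimulus(1.00, 1.0, 1.0, 0.98, 0.5, 1.0)[0]
Rq_perturbed = maxwell_tristimulus(1.00, 1.0, 1.0, 0.98 * 1.01, 0.5, 1.0)[0]
relative_change = abs(Rq_perturbed - Rq_nominal) / Rq_nominal
print(relative_change)
```

In the second case the tristimulus value is the small difference of two nearly equal measured quantities, which is the source of the amplified uncertainty.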
Figure 2.2.6-1. Maximum saturation and Maxwell methods of colour matching.
A) Maximum Saturation method; B) Maxwell method. See text for details.
2.2.6.2. Maximum saturation method of colour matching
In the maximum saturation method, a variable test stimulus $\mathbf{Q}$ is presented on one side of the field, and the observer’s task is to match it in colour by adjusting the mixture of the primaries $\mathbf{R}$, $\mathbf{G}$ and $\mathbf{B}$. However, due to the overlapping sensitivities of the cones, most monochromatic lights are out of the gamut of any set of real primaries. Therefore one of the primaries has to be mixed with the test light to “desaturate” it and, for all spectral lights except those in the yellow-red region, observers make matches between mixtures of two spectral lights on both sides of the bipartite field. In the example of Figure 2.2.6-1 (A), the blue primary is added to the test stimulus, so when the match is established the following colour matching equation is written:
$$Q\mathbf{Q} + B\mathbf{B} = R\mathbf{R} + G\mathbf{G} \tag{2.2.30}$$
It follows that the tristimulus values of unit quantity of test stimulus $\mathbf{Q}$ are
$$R_q = \frac{R}{Q}, \qquad G_q = \frac{G}{Q}, \qquad B_q = -\frac{B}{Q} \tag{2.2.31}$$
Thus the maximum saturation method has the advantage over the Maxwell method that the tristimulus values are determined directly, and are subject to less uncertainty.
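A corresponding sketch for Eq. (2.2.31) is given below, assuming as in Eq. (2.2.30) that the blue primary is the one added to the test stimulus. The function name and the measured amounts are hypothetical.

```python
def max_saturation_tristimulus(R, G, B, Q):
    """Eq. (2.2.31): tristimulus values of unit test stimulus.

    B is the amount of the blue primary added to the test side, so it
    enters with a negative sign.
    """
    return R / Q, G / Q, -B / Q


# Hypothetical measured amounts at the match point.
print(max_saturation_tristimulus(R=1.0, G=2.0, B=0.5, Q=4.0))
```

Each tristimulus value is a single ratio of measured quantities rather than a difference of two, which is why the result is less sensitive to measurement error than in the Maxwell method.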
2.2.6.3. The units of tristimulus values
The values of R, G, B and Q in colour matching equations (2.2.26) and (2.2.30) are usually obtained by taking radiometric measurements of the bipartite field. These equations express the amounts of radiant energy in every primary required to match in colour a certain amount of energy in the test stimulus, where the amounts are expressed in watts per steradian per metre squared (W/sr/m²). Since the subsequent calculation of tristimulus values involves normalisation to unit amount of the test stimulus, the units of the original measurements are disregarded. The tristimulus values are unitless entities, defining the proportions of the primary lights in the mixture that matches the test light in colour.
2.2.6.4. Symmetry of colour matching
Wyszecki and Stiles (Wyszecki and Stiles 1982) discuss in detail the properties of different types of colour matching procedures, which they classify in three groups:
− Match by strict substitution: a visual match is established between stimuli, while both
are imaged at the same area of the retina for identical periods of time; the rest of the
conditions for the stimuli presentation being identical. Some recent colour matching
studies were carried out by using this procedure (Nakano et al. 2003).
− Asymmetric match: a match is established between stimuli displayed in different
viewing conditions: they are imaged on different parts of the retina, have different
backgrounds, etc.
− Quasi-symmetric match: an asymmetric match is established between stimuli, but it is
reasonable to assume that the match would hold if performed by a symmetric procedure.
For example, when the two stimuli are imaged on different but adjacent parts of the
retina, the rest of the conditions being identical. Most of the colour matching
experiments are of quasi-symmetric type.
2.2.7. Colour matching functions and calculation of tristimulus values
Colour matching experiments can be carried out with any test and primary stimuli – as long as
they satisfy the condition that none of the primaries can be matched in colour by the mixture of
the other two. If the test stimuli in the experiment are monochromatic, the result is a set of
spectral tristimulus values. When the colour matching is carried out with monochromatic test
stimuli spanning the entire visible spectrum, and the tristimulus values thus measured are
plotted against the wavelength of the stimulus, the result is a set of three curves termed Colour
Matching Functions (CMF) (Figure 2.2.7-1). In (Wyszecki and Stiles 1982) CMF are defined
as
“…the tristimulus values, with respect to three given primary stimuli, of
monochromatic stimuli of equal radiance, regarded as functions of wavelength.”
The primary use of CMF in colorimetry is the prediction of metameric matches, i.e. predicting
whether a given pair of spectrally different lights will match in colour for the observer for
whom the CMF were measured. The procedure of such a prediction is entirely based on the
assumption of validity of the Trichromatic Generalisation, and is described below.
Figure 2.2.7-1. Illustration of the process of measurement of colour matching functions
Let $\bar{r}(\lambda)$, $\bar{g}(\lambda)$ and $\bar{b}(\lambda)$ be a set of CMF measured with respect to primaries R, G and B.
Let Q be some stimulus whose spectral power distribution was measured at 1 nm intervals in the
wavelength range of 380-780 nm. Stimulus Q is an additive mixture of monochromatic stimuli
Q(λ) spanning the range of 380-780 nm in 1 nm intervals, chosen so that the radiance of each
monochromatic stimulus is equal to the value of the spectral power distribution function of Q at
the corresponding wavelength:
$$\mathbf{Q} = \sum_{\lambda=380}^{780} \mathbf{Q}(\lambda) \tag{2.2.32}$$
CMF contain the tristimulus values of every monochromatic stimulus in the visible spectrum.
Hence the tristimulus values of each stimulus Q(λ) with respect to primaries R, G and B can be
readily derived from the values of CMF:
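$$\begin{aligned} R(\lambda) &= \bar{r}(\lambda)Q(\lambda) \\ G(\lambda) &= \bar{g}(\lambda)Q(\lambda) \\ B(\lambda) &= \bar{b}(\lambda)Q(\lambda) \end{aligned} \tag{2.2.33}$$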
For every Q(λ) a colour matching equation can be written:
$$\mathbf{Q}(\lambda) = R(\lambda)\mathbf{R} + G(\lambda)\mathbf{G} + B(\lambda)\mathbf{B} \tag{2.2.34}$$
From Eqs. (2.2.32) – (2.2.34) follows the colour matching equation
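$$\mathbf{Q} = \left(\sum_{\lambda=380}^{780} \bar{r}(\lambda)Q(\lambda)\right)\mathbf{R} + \left(\sum_{\lambda=380}^{780} \bar{g}(\lambda)Q(\lambda)\right)\mathbf{G} + \left(\sum_{\lambda=380}^{780} \bar{b}(\lambda)Q(\lambda)\right)\mathbf{B} \tag{2.2.35}$$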
Equation (2.2.35) means that, given the spectral power distribution function of any stimulus Q and
a set of CMF measured with respect to primaries R, G and B, the tristimulus values of Q with
respect to R, G and B can be calculated as the sum-product of each of the CMF and the SPD of
Q:
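$$\begin{aligned} R_Q &= \sum_{\lambda=380}^{780} \bar{r}(\lambda)Q(\lambda) \\ G_Q &= \sum_{\lambda=380}^{780} \bar{g}(\lambda)Q(\lambda) \\ B_Q &= \sum_{\lambda=380}^{780} \bar{b}(\lambda)Q(\lambda) \end{aligned} \tag{2.2.36}$$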
The prediction of whether two stimuli with different SPDs will match in colour when viewed by an
observer having CMF $\bar{r}(\lambda)$, $\bar{g}(\lambda)$ and $\bar{b}(\lambda)$ is carried out by comparing the corresponding
tristimulus values of both stimuli. Thus $Q_1$ and $Q_2$ will make a metameric match if
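$$\begin{aligned} \sum_{\lambda=380}^{780} \bar{r}(\lambda)Q_1(\lambda) &= \sum_{\lambda=380}^{780} \bar{r}(\lambda)Q_2(\lambda) \\ \sum_{\lambda=380}^{780} \bar{g}(\lambda)Q_1(\lambda) &= \sum_{\lambda=380}^{780} \bar{g}(\lambda)Q_2(\lambda) \\ \sum_{\lambda=380}^{780} \bar{b}(\lambda)Q_1(\lambda) &= \sum_{\lambda=380}^{780} \bar{b}(\lambda)Q_2(\lambda) \end{aligned} \tag{2.2.37}$$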
where Q1(λ) and Q2(λ) are spectral power distributions of lights corresponding to both stimuli at
the entrance to the observer’s eye.
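The calculation of Eqs. (2.2.36) and (2.2.37) can be illustrated with a minimal Python sketch. The data layout (a dictionary of equally sampled lists), the function names and the four-sample toy "spectrum" are assumptions made for illustration only; the numbers are not measured CMF.

```python
def tristimulus(cmf, spd):
    """Sum-product of each CMF with the SPD, as in Eq. (2.2.36).

    cmf: dict with keys "r", "g", "b"; each value is a list sampled at
         the same wavelengths as spd (hypothetical data layout).
    """
    return tuple(sum(c * q for c, q in zip(cmf[k], spd)) for k in ("r", "g", "b"))


def is_metameric_match(cmf, spd1, spd2, tol=1e-9):
    """Eq. (2.2.37): the stimuli match if all three tristimulus values agree."""
    t1 = tristimulus(cmf, spd1)
    t2 = tristimulus(cmf, spd2)
    return all(abs(a - b) <= tol for a, b in zip(t1, t2))


# Toy CMF over four wavelength samples (illustrative values only).
cmf = {
    "r": [1.0, 0.0, 0.0, 1.0],
    "g": [0.0, 1.0, 0.0, 1.0],
    "b": [0.0, 0.0, 1.0, 1.0],
}
spd_mix = [1.0, 1.0, 1.0, 0.0]    # power in the first three samples
spd_white = [0.0, 0.0, 0.0, 1.0]  # power in the fourth sample only

print(tristimulus(cmf, spd_mix), tristimulus(cmf, spd_white))
print(is_metameric_match(cmf, spd_mix, spd_white))
```

The two toy SPDs are spectrally different but yield identical tristimulus values, so for this hypothetical observer they form a metameric pair.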
2.2.7.1. Locus of monochromatic stimuli
By applying Eqs. (2.2.7)-(2.2.9) to the colour matching functions, the chromaticity coordinates
of the monochromatic stimuli spanning the spectrum are obtained:
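$$\begin{aligned} r(\lambda) &= \frac{\bar{r}(\lambda)}{\bar{r}(\lambda) + \bar{g}(\lambda) + \bar{b}(\lambda)} \\ g(\lambda) &= \frac{\bar{g}(\lambda)}{\bar{r}(\lambda) + \bar{g}(\lambda) + \bar{b}(\lambda)} \\ b(\lambda) &= \frac{\bar{b}(\lambda)}{\bar{r}(\lambda) + \bar{g}(\lambda) + \bar{b}(\lambda)} \end{aligned} \tag{2.2.38}$$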
When all the spectral chromaticity coordinates are plotted, they form the distinct “horseshoe”
shape of the spectral locus. Figure 2.2.7-2 shows an example of a set of CMF and the
corresponding chromaticity diagram with the spectral locus. Since the spectral colours are the most
“pure” or saturated colours attainable, the spectral locus marks the boundary within which lie
the chromaticities of all physically realisable colours.
[Figure 2.2.7-2: A) tristimulus value plotted against wavelength λ; B) chromaticity coordinate g plotted against r]
Figure 2.2.7-2. Colour matching functions and corresponding chromaticity diagram with locus of
monochromatic stimuli.
A) Colour matching functions; B) Chromaticity diagram; the points corresponding to the primary stimuli
are connected by red straight lines.
The chromaticity diagram follows the properties of additive colour mixture. This means that the
mixture of any two stimuli lies on the straight line connecting the corresponding chromaticities.
In Figure 2.2.7-2 (B), the primary stimuli used in the colour matching experiment are marked
with dots and connected by lines. All the chromaticities that can be obtained by additive mixture
of these three stimuli lie within the triangle. The chromaticities that lie within the spectral locus
but outside the triangle are obtained by making one of the primaries “negative”, i.e. by adding it
to the test colour.
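The additive-mixture property of the chromaticity diagram can be checked numerically. The sketch below is illustrative only: the function names and the two sets of tristimulus values are assumptions, not data from this work. It computes chromaticity coordinates in the manner of Eq. (2.2.38) and verifies that the chromaticity of a mixture lies on the straight line joining the chromaticities of its components.

```python
def chromaticity(R, G, B):
    """Return (r, g); b follows from r + g + b = 1."""
    total = R + G + B
    return R / total, G / total


def mix(t1, t2):
    """Additive mixture: tristimulus values add component-wise."""
    return tuple(a + b for a, b in zip(t1, t2))


def is_collinear(p, q, m, tol=1e-12):
    """True if point m lies on the straight line through p and q."""
    cross = (q[0] - p[0]) * (m[1] - p[1]) - (q[1] - p[1]) * (m[0] - p[0])
    return abs(cross) <= tol


# Two hypothetical stimuli given by their tristimulus values.
t1 = (2.0, 1.0, 1.0)
t2 = (1.0, 3.0, 4.0)
c1 = chromaticity(*t1)
c2 = chromaticity(*t2)
cm = chromaticity(*mix(t1, t2))
print(c1, c2, cm, is_collinear(c1, c2, cm))
```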
2.2.8. Colour matching functions and retinal topography
As discussed in Section 2.1, the human retina is not a homogeneous surface. Properties of the
retina of the same observer vary as a function of location in several respects, of which the
relevant ones from the colour matching point of view are (Stockman and Sharpe 2000):
− distribution of photoreceptors and the presence or absence of rods
− density of the macular pigmentation
− density of the lens
− peak density of the outer segments of the photopigments
In terms of the angular size of an external target, two areas of principal importance can be identified:
− 1.4°: area of no blood vessels and almost no rods
− 8.6°: area that includes the fovea and parafovea, with the macular pigment layer.
The measurements of CMF generally fall into one of the two categories according to the field
size: small field (<2°) and large field (>2°). Due to the significant differences in retinal structure,
CMF measured by the same subject with different angular field sizes can be considered as if
measured by different subjects.
2.2.9. Evaluation of rod intrusion
Under conditions of large-field (>4°) colour matching at scotopic and mesopic luminance levels,
colour vision is not trichromatic due to rod participation. However, the colour matching is
still trichromatic, i.e. a complete colour match can still be achieved by the mixture of three
primaries. This is possible because rods do not have a unique path to the brain, and signals from
rods are combined with cone signals during the postreceptoral retinal processing (Zaidi 1986).
As discussed in 2.2.2.2, this requires classifying colour matches into two classes: cone-quantal
and neural. Cone-quantal match means that stimuli are perceived as identical when they result
in identical quantal catch in all three types of cone receptors. The neural match is produced
when both stimuli result in identical neural signals sent to the brain, while these signals do not
necessarily result from identical receptor signals. The latter case is possible when more than
three receptors participate in the match, i.e. when there is rod participation. If rods participate
in the match but generate identical signals at both sides of the bipartite field then the match is
still quantal, and the rod participation can be disregarded. If rods generate significantly
different signals in response to two matching stimuli then the match is classified as neural.
One important implication of rod participation concerns the validity of the additivity law in colour
matching. Additivity is only possible if the vision is strictly trichromatic, i.e. if only cones
participate in the match, or if rods participate but they generate identical signals for both
matching stimuli. In order to estimate the significance of the rod participation, rod excitation is
evaluated. If this excitation is significantly different for the two stimuli then rod participation
affects the match, i.e. there is rod intrusion into the match (Wyszecki and Stiles 1982).
The scotopic luminous efficiency function is taken to represent the sensitivity of rods. First, scotopic
luminance values are calculated for both sides of the bipartite field:
$$\begin{aligned} L'_1 &= K' \sum Q_1(\lambda) V'(\lambda) \Delta\lambda \\ L'_2 &= K' \sum Q_2(\lambda) V'(\lambda) \Delta\lambda \end{aligned} \tag{2.2.39}$$
where $L'_1$ and $L'_2$ are the calculated scotopic luminances, $Q_1(\lambda)$ and $Q_2(\lambda)$ are the spectral power
distribution functions corresponding to the two sides of the bipartite field, $V'(\lambda)$ is the CIE
scotopic luminous efficiency function, and $K'$ is the normalising constant equal to 1700 lm/W.
Similarly, photopic luminance is calculated for both sides:
$$\begin{aligned} L_1 &= K \sum Q_1(\lambda) V(\lambda) \Delta\lambda \\ L_2 &= K \sum Q_2(\lambda) V(\lambda) \Delta\lambda \end{aligned} \tag{2.2.40}$$
where $V(\lambda)$ is the CIE photopic luminous efficiency function and K is the normalising
constant equal to 683 lm/W.
The geometric mean photopic luminance is computed:
$$L_0 = \sqrt{L_1 L_2} \tag{2.2.41}$$
Next, the scotopic luminance for both fields is expressed in scotopic trolands, and the geometric
mean and the difference between them are calculated:
$$\begin{aligned} T'_1 &= L'_1 p \\ T'_2 &= L'_2 p \end{aligned} \tag{2.2.42}$$
$$T'_0 = \sqrt{T'_1 T'_2} \tag{2.2.43}$$
$$\Delta T' = T'_1 - T'_2 \tag{2.2.44}$$
Here p is the pupil size (the Trezona model, Eq. (2.1.3), was used) calculated for the mean photopic
luminance $L_0$.
The value of $\Delta T'$ in Eq. (2.2.44) represents the difference in rod excitation between the two
halves of the bipartite field. Rod participation in the match is considered to be insignificant if
this difference is smaller than the rod sensitivity threshold at luminance level $T'_0$, i.e. if the
following condition is true:
$$-1 < \frac{\Delta T'}{\tau_s} < 1 \tag{2.2.45}$$
where $\tau_s$ is given by the Threshold-Versus-Illuminance function (Aguilar and Stiles 1954) (Figure
2.1.6-3, Eq. (2.1.10)).
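The procedure of Eqs. (2.2.39)-(2.2.45) can be summarised in a short Python sketch. The pupil model (Eq. (2.1.3)) and the threshold function (Eq. (2.1.10)) are not reproduced in this section, so they are passed in as the hypothetical callables `pupil_size` and `tau_s`; the constant stand-ins and the two-sample spectra in the usage lines are illustrative assumptions only.

```python
import math

K_SCOTOPIC = 1700.0  # K' in Eq. (2.2.39), lm/W
K_PHOTOPIC = 683.0   # K in Eq. (2.2.40), lm/W


def luminance(spd, v, k, dlambda=1.0):
    """k * sum(Q(lambda) * V(lambda)) * dlambda, Eqs. (2.2.39)-(2.2.40)."""
    return k * sum(q * vl for q, vl in zip(spd, v)) * dlambda


def rod_excitation_ratio(spd1, spd2, v_phot, v_scot, pupil_size, tau_s, dlambda=1.0):
    """Return delta T' / tau_s, the quantity tested in Eq. (2.2.45).

    pupil_size: callable L0 -> p (stand-in for Eq. (2.1.3))
    tau_s:      callable T0' -> threshold (stand-in for Eq. (2.1.10))
    """
    l0 = math.sqrt(luminance(spd1, v_phot, K_PHOTOPIC, dlambda)
                   * luminance(spd2, v_phot, K_PHOTOPIC, dlambda))  # Eq. (2.2.41)
    p = pupil_size(l0)
    t1 = luminance(spd1, v_scot, K_SCOTOPIC, dlambda) * p           # Eq. (2.2.42)
    t2 = luminance(spd2, v_scot, K_SCOTOPIC, dlambda) * p
    t0 = math.sqrt(t1 * t2)                                         # Eq. (2.2.43)
    return (t1 - t2) / tau_s(t0)                                    # Eq. (2.2.44)


def rod_intrusion_negligible(ratio):
    """Condition of Eq. (2.2.45)."""
    return -1.0 < ratio < 1.0


# Illustrative two-sample spectra and constant stand-in models.
v_phot, v_scot = [0.0, 1.0], [1.0, 0.0]
ratio = rod_excitation_ratio([0.002, 0.001], [0.001, 0.001], v_phot, v_scot,
                             pupil_size=lambda l0: 10.0, tau_s=lambda t0: 5.0)
print(ratio, rod_intrusion_negligible(ratio))
```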
2.3. CIE Colorimetry
2.3.1. Introduction: the Standard Colorimetric Observer
The prime task of colorimetry is the specification of colour. As discussed in section 2.1, there is
significant individual variability in the properties of the visual system of observers with
normal colour vision. However, it is impractical to specify colour as it would be perceived by
every individual observer. There is a need for a set of data which represents the mean
properties of the human observer, and which can be used in colour specification. Such a set of data
is termed the Standard Colorimetric Observer.
The Standard Colorimetric Observer is a set of colour matching functions which represent the
colour matching properties of the human population with normal colour vision. The following
section briefly discusses the CIE Standard Colorimetric Observers, and the CIE
implementation of colorimetry based on them and on the basic principles of colorimetry
described in section 2.2. A full account of the development and properties of the CIE
Standard Colorimetric Observers can be found in section 3.3.3 of the Colour Science textbook
(Wyszecki and Stiles 1982).
2.3.2. CIE 1931 and 1964 Standard Colorimetric Observers
The International Commission on Illumination (Commission Internationale de l’Eclairage, CIE)
has established two Standard Colorimetric Observers: for small and for large viewing fields.
The latest specifications for these CMF were published by the CIE in 2004 (CIE 2004).
2.3.2.1. CIE 1931 Standard Colorimetric Observer
The experimental data for the small field Standard Colorimetric Observer was provided by four
studies. Guild measured CMF of seven observers using broadband lights as his primaries (Guild
1931). The units of the tristimulus values were normalised so that equal amounts of the primaries would
match in colour the broadband NPL white standard light (≈4800 K). Wright measured CMF of ten
observers using monochromatic primary lights at 650 nm, 530 nm and 460 nm for red, green
and blue, respectively (Wright 1928). Both experiments were carried out with a small 2° bipartite
field and maximum saturation colour matching.
The CIE decided to combine the photometric and colorimetric properties of the Standard Colorimetric
Observer in one set of functions. Therefore, two additional sets of measurements of the photopic
luminous efficiency function were employed (Coblentz and Emerson 1918; Gibson and
Tyndall 1923). The luminous efficiency function was embedded in the Standard Colorimetric
Observer CMF as the central, or “green”, CMF.
An additional modification of the original colour matching data was made in order to convert
the CMF to an all-positive system. This was made possible by employing the principle of
imaginary primary stimuli: stimuli which exist only as mathematical constructs and are not
physically realisable. The resulting set of curves, representing the CMF of an average colour-normal
observer measured with respect to imaginary primaries X, Y and Z, are denoted by $\bar{x}(\lambda)$, $\bar{y}(\lambda)$
and $\bar{z}(\lambda)$. The CIE 1931 Standard Colorimetric Observer is recommended for use for small
fields of up to 4° angular subtense.
[Figure 2.3.2-1: tristimulus value plotted against wavelength, 380 to 780 nm]
Figure 2.3.2-1. CIE 1931 Standard Colorimetric Observer colour matching functions
2.3.2.2. CIE 1964 Standard Colorimetric Observer
Originally termed the CIE 1964 Supplementary Standard Colorimetric Observer, this set of CMF
was the result of two colour matching studies (Speranskaya 1959; Stiles and Burch 1959).
In the Stiles and Burch experiment, 49 observers made colour matches using monochromatic
primary and test lights, with a 10° bipartite field. Due to the presence of rods in the 10° region, special
care was taken to minimise the effect of rod participation in the results. This was done by
employing two sets of primary lights for different regions of the spectrum, and by keeping the
luminance of the stimuli as high as possible within the limitations of the instruments. An
additional complication of large-field colour matching, the presence of the Maxwell spot caused by
the macular pigmentation, was solved by asking the observers to ignore the central area of the
field and make the match according to the periphery only.
Speranskaya used 27 observers, who made 10° colour matches using broadband primaries. To
avoid the Maxwell spot, the central 2° of the field was masked off. The luminance levels were
30 to 40 times lower than in the Stiles and Burch study, so the results were severely contaminated
by rods. However, when Speranskaya transformed her data to the primaries used by Stiles and
Burch, the agreement between the two sets was high.
Judd performed the final amalgamation of the two sets of data (Judd 1993), and calculated the
all-positive set of colour matching functions following the same principles as for the 1931 Standard
Colorimetric Observer. The resulting CMF are denoted by $\bar{x}_{10}(\lambda)$, $\bar{y}_{10}(\lambda)$ and $\bar{z}_{10}(\lambda)$, and are
recommended for use in fields exceeding 4° of angular subtense.
2.3.3. CIE XYZ tristimulus values
The CIE Standard Colorimetric Observer CMF are used to calculate the CIE tristimulus values
of a stimulus with known spectral power distribution, according to the principles defined in Eqs.
(2.2.32)-(2.2.37). Two variations of the calculations exist differing in their normalisation
procedures: for self-luminous stimuli and for object-colour stimuli.
2.3.3.1. Tristimulus values of self-luminous stimuli
Let Q(λ) be the spectral power distribution function of the stimulus Q. The CIE XYZ tristimulus
values are calculated as
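$$\begin{aligned} X &= K \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} Q(\lambda)\bar{x}(\lambda)\Delta\lambda \\ Y &= K \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} Q(\lambda)\bar{y}(\lambda)\Delta\lambda \\ Z &= K \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} Q(\lambda)\bar{z}(\lambda)\Delta\lambda \end{aligned} \tag{2.3.1}$$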
Here $\bar{x}(\lambda)$, $\bar{y}(\lambda)$ and $\bar{z}(\lambda)$ are the CIE 1931 Standard Colorimetric Observer CMF, $\lambda_{min}$ and
$\lambda_{max}$ are the minimum and maximum wavelengths of the range in which the spectral power
distribution is sampled, and $\Delta\lambda$ is the sampling interval in nanometres. K is an arbitrary
normalising factor, which is usually set to 683 lm/W so that the value of Y can be directly used for
the specification of luminance in candelas per square metre (cd/m²).
In the calculation of the CIE 1964 XYZ tristimulus values $X_{10}$, $Y_{10}$ and $Z_{10}$ the same equations are
used, with the $\bar{x}(\lambda)$, $\bar{y}(\lambda)$ and $\bar{z}(\lambda)$ CMF replaced by the $\bar{x}_{10}(\lambda)$, $\bar{y}_{10}(\lambda)$ and $\bar{z}_{10}(\lambda)$ ones.
The $\bar{y}_{10}(\lambda)$ function, unlike $\bar{y}(\lambda)$, was not adjusted to coincide with the luminous efficiency
function. However, it was shown to give a good estimate of luminance for large fields, and was
recommended for use by the CIE (CIE 2005) as the large-field luminous efficiency function.
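As an illustration of Eq. (2.3.1), a minimal Python sketch is given below. The function name, the argument layout (equally sampled lists) and the two-sample toy data are assumptions for illustration; in practice the tabulated CIE CMF would be supplied.

```python
def xyz_self_luminous(spd, xbar, ybar, zbar, dlambda=1.0, K=683.0):
    """Tristimulus values of a self-luminous stimulus, Eq. (2.3.1).

    spd, xbar, ybar, zbar: lists sampled at the same wavelengths.
    dlambda: sampling interval in nm; K: normalising factor in lm/W.
    """
    def weighted_sum(cmf):
        return K * sum(q * c for q, c in zip(spd, cmf)) * dlambda

    return weighted_sum(xbar), weighted_sum(ybar), weighted_sum(zbar)


# Toy data (not the CIE tables): two wavelength samples.
X, Y, Z = xyz_self_luminous([1.0, 2.0], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0], K=1.0)
print(X, Y, Z)
```

Passing the $\bar{x}_{10}(\lambda)$, $\bar{y}_{10}(\lambda)$ and $\bar{z}_{10}(\lambda)$ tables to the same function would give the CIE 1964 values.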
2.3.3.2. Tristimulus values of object-colour stimuli
The spectral power distribution of the light Q(λ) reflected by an object is a product of the SPD
of the light S(λ) illuminating the object, and the object’s spectral reflectance function R(λ):
$$Q(\lambda) = R(\lambda)S(\lambda) \tag{2.3.2}$$
Thus the CIE XYZ tristimulus values of the object colour can be readily defined as
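$$\begin{aligned} X &= k \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} R(\lambda)S(\lambda)\bar{x}(\lambda)\Delta\lambda \\ Y &= k \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} R(\lambda)S(\lambda)\bar{y}(\lambda)\Delta\lambda \\ Z &= k \sum_{\lambda=\lambda_{min}}^{\lambda_{max}} R(\lambda)S(\lambda)\bar{z}(\lambda)\Delta\lambda \end{aligned} \tag{2.3.3}$$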
The factor k in the above equations normalises the tristimulus values so that the Y value of a
perfectly reflecting object is 100:
$$k = \frac{100}{\sum_{\lambda=\lambda_{min}}^{\lambda_{max}} S(\lambda)\bar{y}(\lambda)\Delta\lambda} \tag{2.3.4}$$
As with the self-luminous stimulus, in the calculation of the CIE 1964 XYZ tristimulus values
$X_{10}$, $Y_{10}$ and $Z_{10}$ the same equations are used, with $\bar{x}(\lambda)$, $\bar{y}(\lambda)$ and $\bar{z}(\lambda)$ replaced by
$\bar{x}_{10}(\lambda)$, $\bar{y}_{10}(\lambda)$ and $\bar{z}_{10}(\lambda)$, respectively.
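The object-colour calculation of Eqs. (2.3.2)-(2.3.4) can be sketched as follows. The function name and the two-sample toy illuminant and CMF are illustrative assumptions, not tabulated CIE data.

```python
def xyz_object(reflectance, illuminant, xbar, ybar, zbar, dlambda=1.0):
    """Tristimulus values of an object colour, Eqs. (2.3.3) and (2.3.4)."""
    # Normalising factor k: Y of the perfect reflector (R = 1) becomes 100.
    k = 100.0 / (sum(s * y for s, y in zip(illuminant, ybar)) * dlambda)

    def weighted_sum(cmf):
        return k * sum(r * s * c for r, s, c in zip(reflectance, illuminant, cmf)) * dlambda

    return weighted_sum(xbar), weighted_sum(ybar), weighted_sum(zbar)


# Toy illuminant and CMF over two wavelength samples (illustrative values).
S = [2.0, 4.0]
xbar, ybar, zbar = [1.0, 0.0], [0.5, 0.25], [0.0, 1.0]
print(xyz_object([1.0, 1.0], S, xbar, ybar, zbar))  # perfect reflector
print(xyz_object([0.5, 0.5], S, xbar, ybar, zbar))  # 50 % grey
```

Because of the normalisation by k, the result does not depend on the absolute scale of the illuminant SPD.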
2.3.4. CIE 1931 and CIE 1964 chromaticity diagrams
Given the CIE XYZ or CIE $XYZ_{10}$ tristimulus values, the chromaticity coordinates $xy$ and $xy_{10}$ can be
calculated using Eq. (2.2.38):
$$\begin{aligned} x &= \frac{X}{X+Y+Z} \\ y &= \frac{Y}{X+Y+Z} \\ x + y + z &= 1 \end{aligned} \tag{2.3.5}$$
and
$$\begin{aligned} x_{10} &= \frac{X_{10}}{X_{10}+Y_{10}+Z_{10}} \\ y_{10} &= \frac{Y_{10}}{X_{10}+Y_{10}+Z_{10}} \\ x_{10} + y_{10} + z_{10} &= 1 \end{aligned} \tag{2.3.6}$$
Both diagrams with corresponding loci of monochromatic stimuli are shown in Figure 2.3.4-1.
[Figure 2.3.4-1: A) y plotted against x; B) $y_{10}$ plotted against $x_{10}$]
Figure 2.3.4-1. Chromaticity diagrams of CIE Standard Colorimetric Observers with loci of
monochromatic stimuli
A) CIE 1931 B) CIE 1964
2.3.5. CIELAB colour space and colour differences
2.3.5.1. CIELAB 1976
Practical application of colorimetry requires the ability to evaluate the colour difference
between two stimuli: a metric which would correspond to the magnitude of the visual discrepancy
between the two colours. As the CIE XYZ tristimulus values uniquely define each stimulus as a
vector in the three-dimensional tristimulus space, the most straightforward approach would be
to calculate the Euclidean distance between the two points. However, the XYZ tristimulus space
is known to be perceptually non-uniform: identical distances have different perceptual
magnitudes in different locations in the space. Therefore the CIE developed a colour space which is
more perceptually uniform, CIELAB, and recommended it for use in the evaluation of colour
differences between object-colour stimuli.
The relation between the CIE XYZ and CIELAB L*a*b* coordinates is given by the following
equations:
$$L^* = 116 f\left(\frac{Y}{Y_n}\right) - 16 \tag{2.3.7}$$
$$a^* = 500\left[f\left(\frac{X}{X_n}\right) - f\left(\frac{Y}{Y_n}\right)\right] \tag{2.3.8}$$
$$b^* = 200\left[f\left(\frac{Y}{Y_n}\right) - f\left(\frac{Z}{Z_n}\right)\right] \tag{2.3.9}$$
where X, Y and Z are the CIE XYZ tristimulus values of the object-colour stimulus, $X_n$, $Y_n$ and $Z_n$
are the tristimulus values of the reference white, and $f(\omega)$ is given by
$$f(\omega) = \begin{cases} \omega^{1/3} & \omega > (6/29)^3 \\ \frac{841}{108}\,\omega + \frac{16}{116} & \omega \le (6/29)^3 \end{cases} \tag{2.3.10}$$
The perceptual correlate of chroma ($C^*_{ab}$) is given by the Euclidean distance of the point
representing the colour stimulus from the neutral axis:
$$C^*_{ab} = \sqrt{a^{*2} + b^{*2}} \tag{2.3.11}$$
The perceptual correlate of hue ($h_{ab}$) is given by the angle in the a*b* plane:
$$h_{ab} = \arctan\left(\frac{b^*}{a^*}\right) \tag{2.3.12}$$
Given two sets of CIELAB values $(L^*_1, a^*_1, b^*_1)$ and $(L^*_2, a^*_2, b^*_2)$, the colour difference between
the stimuli they represent is calculated as the Euclidean distance between two points in three-dimensional space:
$$\Delta E^*_{ab} = \sqrt{(L^*_1 - L^*_2)^2 + (a^*_1 - a^*_2)^2 + (b^*_1 - b^*_2)^2} \tag{2.3.13}$$
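Equations (2.3.7)-(2.3.13) translate directly into code. The following Python sketch is illustrative; the function names are assumptions, and the hue angle is computed with the two-argument arctangent so that the quadrant is resolved, which is a common implementation choice rather than something stated in Eq. (2.3.12).

```python
import math


def f(w):
    """Piecewise function of Eq. (2.3.10)."""
    if w > (6 / 29) ** 3:
        return w ** (1 / 3)
    return (841 / 108) * w + 16 / 116


def xyz_to_lab(X, Y, Z, Xn, Yn, Zn):
    """CIELAB coordinates, Eqs. (2.3.7)-(2.3.9)."""
    fx, fy, fz = f(X / Xn), f(Y / Yn), f(Z / Zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def chroma(a, b):
    """Eq. (2.3.11)."""
    return math.sqrt(a * a + b * b)


def hue_angle(a, b):
    """Eq. (2.3.12) in degrees, mapped to the range 0-360."""
    return math.degrees(math.atan2(b, a)) % 360.0


def delta_e_ab(lab1, lab2):
    """Eq. (2.3.13): Euclidean distance in CIELAB."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


print(xyz_to_lab(50.0, 100.0, 50.0, 50.0, 100.0, 50.0))  # the reference white itself
print(delta_e_ab((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)))
```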
2.3.5.2. Advanced colour difference formulae
It was soon realised that the CIELAB space is not perfectly uniform, and that the $\Delta E^*_{ab}$ metric does
not always correlate with the perceived colour difference. Two alternative formulae were
developed and proposed by the CIE to solve the CIELAB non-uniformity problem: CIE94
$\Delta E^*_{94}$ (Berns 1993b) and CIEDE2000 $\Delta E_{00}$ (Luo et al. 2001). The calculation details of the
latter are given below. The experiments that led to the development of both
formulae are discussed in detail in (Luo and Rigg 1986; Berns 1993a; Berns 2000; Luo et al.
2001).
The description below is given according to (Sharma et al. 2005). The calculation starts off with
two sets of CIELAB values: $(L^*_1, a^*_1, b^*_1)$ and $(L^*_2, a^*_2, b^*_2)$.
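1. Calculate $C'_i$ and $h'_i$:
$$C^*_{ab,i} = \sqrt{(a^*_i)^2 + (b^*_i)^2} \tag{2.3.14}$$
$$\bar{C}^*_{ab} = \frac{C^*_{ab,1} + C^*_{ab,2}}{2} \tag{2.3.15}$$
$$G = 0.5\left(1 - \sqrt{\frac{\bar{C}^{*7}_{ab}}{\bar{C}^{*7}_{ab} + 25^7}}\right) \tag{2.3.16}$$
$$a'_i = (1 + G)\,a^*_i \qquad i = 1, 2 \tag{2.3.17}$$
$$C'_i = \sqrt{(a'_i)^2 + (b^*_i)^2} \qquad i = 1, 2 \tag{2.3.18}$$
$$h'_i = \begin{cases} 0 & b^*_i = a'_i = 0 \\ \tan^{-1}(b^*_i, a'_i) & \text{otherwise} \end{cases} \qquad i = 1, 2 \tag{2.3.19}$$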
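2. Calculate $\Delta L'$, $\Delta C'$ and $\Delta H'$:
$$\Delta L' = L^*_2 - L^*_1 \tag{2.3.20}$$
$$\Delta C' = C'_2 - C'_1 \tag{2.3.21}$$
$$\Delta h' = \begin{cases} 0 & C'_1 C'_2 = 0 \\ h'_2 - h'_1 & C'_1 C'_2 \neq 0;\ |h'_2 - h'_1| \le 180^\circ \\ (h'_2 - h'_1) - 360^\circ & C'_1 C'_2 \neq 0;\ (h'_2 - h'_1) > 180^\circ \\ (h'_2 - h'_1) + 360^\circ & C'_1 C'_2 \neq 0;\ (h'_2 - h'_1) < -180^\circ \end{cases} \tag{2.3.22}$$
$$\Delta H' = 2\sqrt{C'_1 C'_2}\,\sin\left(\frac{\Delta h'}{2}\right) \tag{2.3.23}$$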
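3. Calculate the CIEDE2000 colour difference $\Delta E_{00}$:
$$\bar{L}' = \frac{L^*_1 + L^*_2}{2} \tag{2.3.24}$$
$$\bar{C}' = \frac{C'_1 + C'_2}{2} \tag{2.3.25}$$
$$\bar{h}' = \begin{cases} \dfrac{h'_1 + h'_2}{2} & |h'_1 - h'_2| \le 180^\circ;\ C'_1 C'_2 \neq 0 \\ \dfrac{h'_1 + h'_2 + 360^\circ}{2} & |h'_1 - h'_2| > 180^\circ;\ (h'_1 + h'_2) < 360^\circ;\ C'_1 C'_2 \neq 0 \\ \dfrac{h'_1 + h'_2 - 360^\circ}{2} & |h'_1 - h'_2| > 180^\circ;\ (h'_1 + h'_2) \ge 360^\circ;\ C'_1 C'_2 \neq 0 \\ h'_1 + h'_2 & C'_1 C'_2 = 0 \end{cases} \tag{2.3.26}$$
$$T = 1 - 0.17\cos(\bar{h}' - 30^\circ) + 0.24\cos(2\bar{h}') + 0.32\cos(3\bar{h}' + 6^\circ) - 0.20\cos(4\bar{h}' - 63^\circ) \tag{2.3.27}$$
$$\Delta\theta = 30\exp\left[-\left(\frac{\bar{h}' - 275^\circ}{25}\right)^2\right] \tag{2.3.28}$$
$$R_C = 2\sqrt{\frac{\bar{C}'^7}{\bar{C}'^7 + 25^7}} \tag{2.3.29}$$
$$S_L = 1 + \frac{0.015\,(\bar{L}' - 50)^2}{\sqrt{20 + (\bar{L}' - 50)^2}} \tag{2.3.30}$$
$$S_C = 1 + 0.045\,\bar{C}' \tag{2.3.31}$$
$$S_H = 1 + 0.015\,\bar{C}'\,T \tag{2.3.32}$$
$$R_T = -\sin(2\Delta\theta)\,R_C \tag{2.3.33}$$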
And finally
$$\Delta E_{00} = \sqrt{\left(\frac{\Delta L'}{k_L S_L}\right)^2 + \left(\frac{\Delta C'}{k_C S_C}\right)^2 + \left(\frac{\Delta H'}{k_H S_H}\right)^2 + R_T\left(\frac{\Delta C'}{k_C S_C}\right)\left(\frac{\Delta H'}{k_H S_H}\right)} \tag{2.3.34}$$
The parametric weighting factors $k_L$, $k_C$ and $k_H$ are adjusted to suit the requirements of a particular
application, and are set to 1 by default.
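The three calculation steps can be collected into a single function. The Python sketch below follows the formulation of Sharma et al. (2005) as understood here, including the term $-0.20\cos(4\bar{h}' - 63^\circ)$ in T; the function and variable names are illustrative assumptions, and the sketch should be validated against published test data before being relied upon.

```python
import math


def _hue(b, a_prime):
    """Hue angle in degrees, 0-360; zero for the achromatic case (Eq. 2.3.19)."""
    if b == 0 and a_prime == 0:
        return 0.0
    return math.degrees(math.atan2(b, a_prime)) % 360.0


def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    # Step 1: C'_i and h'_i
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    G = 0.5 * (1 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25 ** 7)))
    a1p, a2p = (1 + G) * a1, (1 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p, h2p = _hue(b1, a1p), _hue(b2, a2p)
    # Step 2: differences in lightness, chroma and hue
    dLp, dCp = L2 - L1, C2p - C1p
    if C1p * C2p == 0:
        dhp = 0.0
    else:
        dhp = h2p - h1p
        if dhp > 180:
            dhp -= 360
        elif dhp < -180:
            dhp += 360
    dHp = 2 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp / 2))
    # Step 3: weighting functions and the final difference
    L_bar, C_barp = (L1 + L2) / 2, (C1p + C2p) / 2
    if C1p * C2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2
    rad = math.radians
    T = (1 - 0.17 * math.cos(rad(h_bar - 30)) + 0.24 * math.cos(rad(2 * h_bar))
         + 0.32 * math.cos(rad(3 * h_bar + 6)) - 0.20 * math.cos(rad(4 * h_bar - 63)))
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    RC = 2 * math.sqrt(C_barp ** 7 / (C_barp ** 7 + 25 ** 7))
    SL = 1 + 0.015 * (L_bar - 50) ** 2 / math.sqrt(20 + (L_bar - 50) ** 2)
    SC = 1 + 0.045 * C_barp
    SH = 1 + 0.015 * C_barp * T
    RT = -math.sin(rad(2 * d_theta)) * RC
    tL, tC, tH = dLp / (kL * SL), dCp / (kC * SC), dHp / (kH * SH)
    return math.sqrt(tL ** 2 + tC ** 2 + tH ** 2 + RT * tC * tH)


print(ciede2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)))
```

The pair of colours in the usage line is taken, to the best of our knowledge, from the test data accompanying Sharma et al. (2005), for which a difference of about 2.04 is expected.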
2.3.5.3. Mean colour difference from mean (MCDM)
Standard deviation expresses the degree of dispersion of the values of a one-dimensional variable
about the mean. With a two- or three-dimensional variable, the equivalent expression takes the form of a
covariance matrix rather than a single value (Section 2.4.5). Colour values in CIELAB space
have a three-dimensional distribution; however, expressing their dispersion as a covariance matrix
is impractical and inconvenient for everyday use. Instead, it is possible to express it in the familiar
units of CIELAB colour difference as the Mean Colour Difference From Mean (MCDM).
Let $C_i$ be n CIELAB vectors representing a set of measurements of the same colour
stimulus. The dispersion of these values about the mean can be computed as follows (Berns
2000):
$$MCDM = \frac{\sum_{i=1}^{n} \Delta E(C_i, \bar{C})}{n} \tag{2.3.35}$$
where $\Delta E$ is the colour difference calculated with any chosen CIELAB-based formula (e.g.
Eq. (2.3.13) or (2.3.34)), and $\bar{C}$ is the mean vector of CIELAB values.
The calculation of MCDM as an alternative to standard deviation assumes perceptual uniformity
of the colour space, i.e. that the distribution of colour deviations about the mean is spherical.
Naturally, the use of advanced colour difference formulae improves the perceptual uniformity of
the colour difference calculation. Thus the use of the CIEDE2000 formula in Eq. (2.3.35) is expected to
lead to a more reliable MCDM estimate than the use of the standard CIELAB Euclidean distance
formula.
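A minimal Python sketch of Eq. (2.3.35) is given below. The colour difference formula is passed in as a function, so that either the CIELAB Euclidean distance or a CIEDE2000 implementation can be used; the names and the sample values are illustrative assumptions.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIELAB Euclidean colour difference, Eq. (2.3.13)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


def mcdm(colours, delta_e=delta_e_ab):
    """Mean Colour Difference From Mean, Eq. (2.3.35).

    colours: list of (L*, a*, b*) tuples for repeated measurements.
    delta_e: any CIELAB-based colour difference function.
    """
    n = len(colours)
    mean = tuple(sum(c[k] for c in colours) / n for k in range(3))
    return sum(delta_e(c, mean) for c in colours) / n


# Two hypothetical repeated measurements of the same stimulus.
print(mcdm([(50.0, 0.0, 0.0), (52.0, 0.0, 0.0)]))
```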
2.4. Statistics
A brief overview of statistics is given in this section. Basic principles and formulae of univariate
and multivariate statistics are followed by discussion of principles of evaluation of the
measurement uncertainty, and propagation of uncertainty in colorimetric transformations. This
overview is by no means comprehensive and covers only the areas directly related to the present
work.
The overview of univariate statistics is given according to (Upton and Cook 1996), and of the
multivariate statistics according to (Johnson and Wichern 2002; Anderson 2003) – unless
otherwise stated.
2.4.1. Univariate statistics
2.4.1.1. Mean
Let X be a randomly varying quantity, of which n independent measurements were obtained
under the same conditions. In most practical cases, the best estimate of the value of X is
the arithmetic mean $\bar{x}$ of the results of all measurements:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{2.4.1}$$
If several sets of measurements with different numbers of observations in each set are combined,
the best estimate of X is the weighted mean: first, the mean value for each set is calculated according to
Eq. (2.4.1), then the final value is computed by
$$\bar{x}_w = \frac{\sum_{i=1}^{m} w_i \bar{x}_i}{\sum_{i=1}^{m} w_i} \tag{2.4.2}$$
where m is the number of sets, $\bar{x}_i$ is the mean value of each set, and $w_i$ is the number of observations (the weight) in
each set.
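The relation between the two estimates can be illustrated with a short Python sketch (function names and data are illustrative assumptions). With the number of observations used as the weight, the weighted mean of the set means equals the arithmetic mean of the pooled observations.

```python
def mean(xs):
    """Arithmetic mean, Eq. (2.4.1)."""
    return sum(xs) / len(xs)


def weighted_mean(sets):
    """Weighted mean of the set means, Eq. (2.4.2).

    sets: list of lists of observations; the weight of each set is
    its number of observations.
    """
    means = [mean(s) for s in sets]
    weights = [len(s) for s in sets]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


sets = [[1.0, 2.0, 3.0], [10.0]]
print(weighted_mean(sets))                # weighted mean of the set means
print(mean([x for s in sets for x in s]))  # mean of the pooled observations
```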
2.4.1.2. Variance, standard deviation and covariance
The observations in the sample vary as the result of random fluctuations in the measurement
instrument, in the measured material, and in other factors. A measure of the spread of the values in the
population, or population variance $\sigma_n^2$, is evaluated as the mean of the squares of the
differences between each value and the sample mean:
$$\sigma_n^2(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \tag{2.4.3}$$
In practical situations, it is necessary to estimate the variation in the entire population based on
a sample from it. Thus the experimental variance of the sample $\sigma_{n-1}^2$ is estimated by replacing
the divisor n in the above equation by (n-1):
$$\sigma_{n-1}^2(x) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \tag{2.4.4}$$
In the following text, the term “variance” will be used in the meaning of “experimental
variance” only, and it will be denoted by the symbol $\sigma^2$.
The standard deviation $\sigma$ is calculated by taking the square root of the variance:
$$\sigma(x) = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \tag{2.4.5}$$
The value of the standard deviation, unlike variance, has the same units as the measured
quantity, and therefore often is more convenient for use as the measure of dispersion of values
about the mean.
The experimental variance of the mean, in its turn, is calculated by dividing the experimental
variance of the sample by the number of observations:
$$\sigma^2(\bar{x}) = \frac{\sigma^2(x)}{n} \tag{2.4.6}$$
The experimental standard deviation of the mean is equal to the square root of its variance:
$$\sigma(\bar{x}) = \sqrt{\frac{\sigma^2(x)}{n}} = \frac{\sigma(x)}{\sqrt{n}} \tag{2.4.7}$$
If f is a function of two variables x and y, the measure of correlation between the variations of x
and y, their covariance, can be found as the mean of the products of the deviations of each x and
y value from the corresponding mean:
$$\sigma_{x,y} = \operatorname{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \tag{2.4.8}$$
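The quantities defined in Eqs. (2.4.4)-(2.4.8) can be computed with the following Python sketch; the function names and the sample data are illustrative assumptions.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Experimental variance with divisor (n - 1), Eq. (2.4.4)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def std(xs):
    """Standard deviation, Eq. (2.4.5)."""
    return math.sqrt(variance(xs))


def std_of_mean(xs):
    """Experimental standard deviation of the mean, Eq. (2.4.7)."""
    return std(xs) / math.sqrt(len(xs))


def covariance(xs, ys):
    """Covariance of paired observations, Eq. (2.4.8)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(variance(xs), std(xs), std_of_mean(xs), covariance(xs, ys))
```

Note that the covariance of a variable with itself reduces to its variance, which provides a simple consistency check.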
2.4.1.3. The probability density function (pdf) and cumulative distribution function (cdf)
In a histogram plot, the area of the rectangle representing each class is proportional to the
relative frequency of the values represented by the same class. As the sample size increases, the
histogram converges to a smooth curve, and the values of relative frequencies of random
variables approach the corresponding population probabilities. Consequently, the area below the
resulting curve comes to represent the probability. Such a curve is described by the probability
density function (pdf), denoted f(x). This concept is illustrated in Figure 2.4.1-1, where the
probability that the random variable x takes a value between a and b is equal to the shaded area
and is calculated by:
$$P(a < x < b) = \int_a^b f(x)\,dx \tag{2.4.9}$$
Since the probability is never negative, the pdf cannot take negative values. Naturally, the total
area below the pdf is equal to 1.
[Figure 2.4.1-1: probability density f(x) plotted against x, with the area between a and b shaded]
Figure 2.4.1-1. Illustration of the probability density function
The cumulative distribution function (cdf), denoted by F(x), is derived from the pdf and defines
the probability that the variable X takes a value not exceeding b:
$$F(b) = P(X \le b) \tag{2.4.10}$$
The cdf is related to the pdf by
$$F(b) = \int_{-\infty}^{b} f(x)\,dx \tag{2.4.11}$$
This is illustrated in Figure 2.4.1-2.

[Figure 2.4.1-2: A) probability density f(x) plotted against x; B) cumulative probability F(x), ranging from 0 to 1, plotted against x]
Figure 2.4.1-2. Illustration of the probability density function (pdf) and the corresponding cumulative
distribution function (cdf).
A) pdf; B) cdf
2.4.1.4. Normal distribution
The normal distribution is a unimodal symmetric continuous distribution which can be described by
two parameters: mean $\mu$ and variance $\sigma^2$; this is denoted by $N(\mu, \sigma^2)$. The pdf of the normal
distribution is illustrated in Figure 2.4.1-3 and is described by the following expression:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \tag{2.4.12}$$
The integral of f(x) cannot be expressed in closed form, so the values of the cumulative distribution function are
obtained from tables. Any normal distribution can be related to the standard normal distribution
with mean 0 and standard deviation 1 by the expression
$$Z = \frac{x - \mu}{\sigma} \tag{2.4.13}$$
[Figure 2.4.1-3: probability density f(x) of the normal distribution]
Figure 2.4.1-3. Probability density function of normal distribution
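Equations (2.4.12) and (2.4.13) can be evaluated numerically as in the Python sketch below. In place of printed tables, the cdf is obtained here from the error function of the standard library, which is an implementation choice of this sketch; the function names are illustrative assumptions.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of N(mu, sigma^2), Eq. (2.4.12)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def standardise(x, mu, sigma):
    """Transformation to the standard normal variable, Eq. (2.4.13)."""
    return (x - mu) / sigma


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability P(X <= x), via the error function."""
    z = standardise(x, mu, sigma)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Probability of falling within one standard deviation of the mean.
print(normal_cdf(12.0, mu=10.0, sigma=2.0) - normal_cdf(8.0, mu=10.0, sigma=2.0))
```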
2.4.1.5. The Central Limit Theorem
Let $X_1, X_2, \ldots, X_n$ be n independent random variables having the same distribution. According to
the Central Limit Theorem, as n increases, the distribution of the sum of the variables and of
their mean approaches the normal distribution, independently of the type of the original
distribution of each X. The variance of the mean $\bar{X}$ is given by Eq. (2.4.6):
$$\sigma^2(\bar{X}) = \frac{\sigma^2(X)}{n} \tag{2.4.14}$$
Consequently,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \tag{2.4.15}$$
Substituting $\sigma^2(\bar{X})$ into the equation for the standard normal distribution (2.4.13), we get
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1) \tag{2.4.16}$$
The Central Limit Theorem is widely used as it allows assumption of the normal distribution for
the experimental results, and consequently allows their effective analysis: the mean values and
their confidence intervals can be estimated with specified probability, and mean values from
different samples can be adequately compared using the estimated confidence intervals.
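The statement of Eqs. (2.4.14) and (2.4.15) can be checked with a small simulation. The Python sketch below draws samples from a uniform distribution, which is clearly non-normal, and compares the variance of the sample means with $\sigma^2/n$; the sample sizes, the seed and the function name are arbitrary choices made for illustration.

```python
import random
import statistics


def sample_means(n, trials, seed=0):
    """Means of `trials` samples, each of n uniform(0, 1) observations."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]


n = 30
means = sample_means(n, trials=5000)

# For uniform(0, 1): mu = 1/2 and sigma^2 = 1/12.
print(statistics.mean(means), 0.5)
print(statistics.variance(means), (1 / 12) / n)
```

The simulated values are expected to approach the theoretical ones as the number of trials grows; exact agreement is not expected for any finite simulation.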
2.4.2. The t-distribution
In large samples, equation (2.4.6) provides a reasonably accurate estimate of the population
variance. However, in many practical situations the sample size is limited, and the possible
difference between the population variance and its estimate needs to be taken into account.
This is done through the use of the t-distribution.
The t-distribution is symmetric about zero (Figure 2.4.2-1), and has a single positive integer
parameter $\nu$ called degrees of freedom and given by
$$\nu = n - 1 \tag{2.4.17}$$
The formulae used to estimate the mean and the variance are the same as with the normal
distribution. However, the critical values and percentage points are taken from tables of the
t-distribution rather than tables of the normal distribution. Tables of the t-distribution give percentage
points which depend on the number of degrees of freedom. As $\nu$ increases, the t-distribution
approaches the shape of the normal distribution, so the percentage points for $\nu = \infty$ are equal to
those given in tables of the normal distribution.
Figure 2.4.2-1. Probability density function of the t-distribution compared with that of the normal distribution
2.4.3. Inferences about the difference between two means (small sample)
Let X1 and X2 be two samples of sizes $n_1$ and $n_2$, with mean values $\bar{x}_1$ and $\bar{x}_2$ and estimated
variances $s_1^2$ and $s_2^2$. We wish to make inferences concerning the difference between the two
means, that is, to test the null hypothesis
$$H_0 : \mu_1 - \mu_2 = D_0 \tag{2.4.18}$$
with the alternative hypothesis being
$$H_1 : \mu_1 - \mu_2 \neq D_0 \tag{2.4.19}$$
To do this, we first estimate the common (pooled) variance of the two samples:
$$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \tag{2.4.20}$$
The test statistic T is calculated by
$$T = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \tag{2.4.21}$$
The null hypothesis (2.4.18) is accepted at a confidence level p% if
$$|T| < t \tag{2.4.22}$$
where t is the critical value from the t-distribution table for the corresponding confidence level and
$n_1 + n_2 - 2$ degrees of freedom.
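The test of Eqs. (2.4.20)-(2.4.22) can be sketched as follows. The data values and the function name are hypothetical, and the critical value 2.306 (two-sided 95% point of the t-distribution for 8 degrees of freedom) is taken from standard tables rather than computed.

```python
import math
import statistics

def pooled_t_statistic(x1, x2, d0=0.0):
    """Return the statistic T of Eq. (2.4.21) and its degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    s1_sq = statistics.variance(x1)      # sample variance, divisor n - 1
    s2_sq = statistics.variance(x2)
    # Pooled variance, Eq. (2.4.20)
    s_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    diff = statistics.mean(x1) - statistics.mean(x2) - d0
    return diff / math.sqrt(s_sq * (1.0 / n1 + 1.0 / n2)), n1 + n2 - 2

# Hypothetical repeated measurements from two sessions
x1 = [10.1, 9.8, 10.3, 10.0, 9.9]
x2 = [10.6, 10.4, 10.9, 10.5, 10.7]

t_stat, dof = pooled_t_statistic(x1, x2)
T_CRIT_95_DOF_8 = 2.306                  # tabulated value, an input to the sketch
reject_h0 = abs(t_stat) >= T_CRIT_95_DOF_8   # Eq. (2.4.22) not satisfied

print(t_stat, dof, reject_h0)
```

For these illustrative numbers the pooled variance is 0.037 and T is about -4.93, so the null hypothesis of equal means would be rejected at the 95% level.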
2.4.4. Evaluation of uncertainty in measurement
The term uncertainty represents the concept that the true value of a measurement is
unknowable because, even after all the known sources of error have been corrected for, there will
always be uncontrollable variations in measurement. The value of uncertainty is “…associated
with the result of the measurement, that characterises the dispersion of the values that can be
reasonably attributed to the measurand” (ISO 1993). The standard process and principles of
uncertainty evaluation are given in (ISO 1993).
The standard uncertainty is given by the experimental standard deviation of the mean (Eq.
(2.4.7)):
$$u(\bar{x}) = s(\bar{x}) = \frac{s(x)}{\sqrt{n}} \tag{2.4.23}$$
If there are several sources of uncertainty, they should be combined to produce the combined
standard uncertainty. This is done by taking the square root of the sum of squares of all the uncertainty
values associated with the measurand, i.e.
$$u_c = \sqrt{\sum_{i=1}^{n} u_i^2} \tag{2.4.24}$$
In addition to the standard uncertainty, an expanded uncertainty U can be provided to define the
interval about the reported measurement that corresponds to a specified level of confidence p;
this interval is written as $X = x \pm U$. The value of the expanded uncertainty is obtained by
multiplying the value of the combined standard uncertainty by a coverage factor k:
$$U = k u_c \tag{2.4.25}$$
The value of k is selected from the t-distribution table according to the required level of
confidence and the number of degrees of freedom. If the uncertainty is calculated from the results of
n repeated measurements, then the coverage factor is selected according to the v parameter of the
t-distribution, i.e. $n - 1$. If the uncertainty is combined as described by Eq. (2.4.24), then the
value of the effective degrees of freedom $v_{eff}$ is calculated by
$$v_{eff} = \frac{u_c^4}{\sum_{i=1}^{n} \frac{u_i^4}{v_i}} \tag{2.4.26}$$
where $v_i$ is the number of degrees of freedom of each component $u_i$ used in the calculation of
the combined uncertainty.
In many practical applications, when the number of degrees of freedom is large, and taking
into account the approximate nature of the process of uncertainty estimation, it is often
sufficient to adopt a coverage factor of 2 and assume that it defines an approximately 95%
confidence level.
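As an illustration of Eqs. (2.4.24)-(2.4.26), the following sketch combines two uncertainty components. The component values, their degrees of freedom and the function names are hypothetical; the coverage factor of 2 follows the approximation described above.

```python
import math

def combined_uncertainty(components):
    """Root sum of squares of the standard uncertainties, Eq. (2.4.24)."""
    return math.sqrt(sum(u * u for u in components))

def effective_dof(components, dofs):
    """Effective degrees of freedom, Eq. (2.4.26)."""
    u_c = combined_uncertainty(components)
    return u_c ** 4 / sum(u ** 4 / v for u, v in zip(components, dofs))

# Two hypothetical components with 9 and 4 degrees of freedom
u = [0.3, 0.4]
v = [9, 4]

u_c = combined_uncertainty(u)     # 0.5
v_eff = effective_dof(u, v)       # about 8.56
k = 2.0                           # approximate 95% coverage factor
U = k * u_c                       # expanded uncertainty, Eq. (2.4.25)

print(u_c, v_eff, U)
```

With an effective number of degrees of freedom this small, a coverage factor read from the t-distribution table would be somewhat larger than 2; the sketch uses 2 only to mirror the simplification mentioned in the text.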
2.4.5. Multivariate statistics
Multivariate statistics deal with cases in which values vary in more than one dimension. Most of
the problems of colorimetry deal with multivariate statistics as every colour stimulus is
specified by at least three variables – its tristimulus values, or CIELAB values.
2.4.5.1. Organisation, variance and covariance
In multivariate statistics every observation is specified by p values, where p is the number of
dimensions of the data. It is convenient to present sets of multivariate data in the form of
matrices. For n samples of a p-dimensional variable X we have
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \tag{2.4.27}$$
where $x_{np}$ means “dimension p of sample #n of X”.
The mean vector is calculated by computing the mean of each column of X according to the
standard formula (2.4.1) and transposing the result:
$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} \tag{2.4.28}$$
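The covariance matrix Σ is calculated as
$$\Sigma = \mathrm{Cov}(X) = (X - \mu)(X - \mu)' = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} \tag{2.4.29}$$
In the symmetric square covariance matrix Σ the variances of the columns of X are situated on the
diagonal, while the rest of the values are the covariances between the corresponding columns of
X.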
2.4.5.2. Linear combinations of random variables
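q linear combinations of p random variables X1 … Xp can be written as a series of linear
equations:
$$\begin{aligned} Z_1 &= c_{11} X_1 + c_{12} X_2 + \ldots + c_{1p} X_p \\ Z_2 &= c_{21} X_1 + c_{22} X_2 + \ldots + c_{2p} X_p \\ &\;\;\vdots \\ Z_q &= c_{q1} X_1 + c_{q2} X_2 + \ldots + c_{qp} X_p \end{aligned} \tag{2.4.30}$$
or, in matrix form,
$$Z = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1p} \\ c_{21} & c_{22} & \cdots & c_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ c_{q1} & c_{q2} & \cdots & c_{qp} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} \tag{2.4.31}$$
Z has the mean vector $\mu_Z$ and the covariance matrix $\Sigma_Z$ given by
$$\mu_Z = C \mu_X \tag{2.4.32}$$
$$\Sigma_Z = C \Sigma_X C' \tag{2.4.33}$$
where C is the q×p matrix of transformation coefficients in Eq. (2.4.31), $\mu_X$ is the mean vector
of X, and $\Sigma_X$ is the covariance matrix of X.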
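A small numerical illustration of Eqs. (2.4.32) and (2.4.33) is given below. The coefficient matrix and the statistics of X are invented for the example, and the helper functions are plain-Python stand-ins for a matrix library.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Z1 = X1 + X2 and Z2 = X1 - X2 (hypothetical coefficients)
c = [[1.0, 1.0],
     [1.0, -1.0]]
mu_x = [[1.0], [2.0]]            # mean vector of X as a column
sigma_x = [[4.0, 1.0],           # var(X1) = 4, cov(X1, X2) = 1
           [1.0, 9.0]]           # var(X2) = 9

mu_z = matmul(c, mu_x)                                # Eq. (2.4.32)
sigma_z = matmul(matmul(c, sigma_x), transpose(c))    # Eq. (2.4.33)

print(mu_z)      # [[3.0], [-1.0]]
print(sigma_z)   # [[15.0, -5.0], [-5.0, 11.0]]
```

The diagonal of the result agrees with the familiar scalar rules: var(X1 + X2) = 4 + 9 + 2·1 = 15 and var(X1 − X2) = 4 + 9 − 2·1 = 11.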
2.4.6. Inferences about the equalities of two mean vectors (small sample)
2.4.6.1. Hotelling T2 test
Let X1 and X2 be two p-dimensional samples of sizes $n_1$ and $n_2$, having mean vectors $\mu_1$ and
$\mu_2$, for which the following assumptions hold true:
1. Samples X1 and X2 are independent
2. The populations of X1 and X2 are multivariate normal
3. The covariance matrices of X1 and X2 are equal: $\Sigma_1 = \Sigma_2$
The null hypothesis
$$H_0 : \mu_1 - \mu_2 = \delta_0 \tag{2.4.34}$$
can be tested by application of the Hotelling T2 test.
First, the pooled covariance matrix $S_{pooled}$ of both sets of data is calculated:
$$S_{pooled} = \frac{n_1 - 1}{n_1 + n_2 - 2} S_1 + \frac{n_2 - 1}{n_1 + n_2 - 2} S_2 \tag{2.4.35}$$
where $S_1$ and $S_2$ are the covariance matrices of the corresponding sets. The test statistic $T^2$ is then
calculated by
$$T^2 = (\bar{x}_1 - \bar{x}_2 - \delta_0)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1} (\bar{x}_1 - \bar{x}_2 - \delta_0) \tag{2.4.36}$$
The null hypothesis (2.4.34) is rejected if
$$T^2 > c^2 \tag{2.4.37}$$
where the critical value $c^2$ is determined as
$$c^2 = \frac{(n_1 + n_2 - 2)\,p}{n_1 + n_2 - p - 1} F_{p,\,n_1 + n_2 - p - 1}(\alpha) \tag{2.4.38}$$
The term $F_{p,\,n_1 + n_2 - p - 1}(\alpha)$ is the value of the F-distribution function for significance level $\alpha$, with p
and $n_1 + n_2 - p - 1$ degrees of freedom.*
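The procedure of Eqs. (2.4.35)-(2.4.38) is sketched below for the two-dimensional case (p = 2). The data are invented, the function names are hypothetical, and the F value 5.79 (5% point for 2 and 5 degrees of freedom) is taken from standard tables rather than computed; the 2×2 inverse is written out by hand only to keep the sketch free of external libraries.

```python
def mean_vector(x):
    n = len(x)
    return [sum(row[j] for row in x) / n for j in range(len(x[0]))]

def covariance(x):
    """Sample covariance matrix (divisor n - 1) of the rows of x."""
    n, p = len(x), len(x[0])
    m = mean_vector(x)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in x) / (n - 1)
             for j in range(p)] for i in range(p)]

def inverse2(s):
    (s11, s12), (s21, s22) = s
    det = s11 * s22 - s12 * s21
    return [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]

def hotelling_t2(x1, x2, delta0=(0.0, 0.0)):
    """T^2 of Eq. (2.4.36) for two samples of two-dimensional data."""
    n1, n2 = len(x1), len(x2)
    s1, s2 = covariance(x1), covariance(x2)
    w1 = (n1 - 1) / (n1 + n2 - 2)
    w2 = (n2 - 1) / (n1 + n2 - 2)
    pooled = [[w1 * s1[i][j] + w2 * s2[i][j] for j in range(2)]
              for i in range(2)]                      # Eq. (2.4.35)
    scale = 1.0 / n1 + 1.0 / n2
    inv = inverse2([[scale * v for v in row] for row in pooled])
    m1, m2 = mean_vector(x1), mean_vector(x2)
    d = [m1[k] - m2[k] - delta0[k] for k in range(2)]
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))

def critical_value(n1, n2, p, f_value):
    """c^2 of Eq. (2.4.38); f_value is read from an F-distribution table."""
    return (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * f_value

x1 = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
x2 = [(3.0, 1.0), (5.0, 1.0), (3.0, 3.0), (5.0, 3.0)]

t2 = hotelling_t2(x1, x2)                      # 15.0 for these data
c2 = critical_value(4, 4, 2, f_value=5.79)     # 2.4 * 5.79
reject_h0 = t2 > c2                            # Eq. (2.4.37)
print(t2, c2, reject_h0)
```

For these data both samples have the covariance matrix (4/3)·I, so the pooled matrix is the same, and T² = 15 exceeds c² ≈ 13.9.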
2.4.6.2. Problem of unequal covariance matrices (Behrens-Fisher problem)
The standard small-sample Hotelling T2 test as described in Eqs. (2.4.35)-(2.4.38) cannot be
used if one or more of the three conditions listed in Section 2.4.6.1 are violated. However,
condition 3 often does not hold, as two sets of experimental data frequently do not have equal
covariance matrices; this is known in statistics as the Behrens-Fisher problem. There is as
yet no generally agreed-upon solution, although several have been proposed (Christensen and Rencher
1997; Lix and Keselman 2004). Christensen and Rencher reviewed seven different solutions of
various levels of complexity (Christensen and Rencher 1997), and found that Nel and Van
der Merwe’s procedure (Nel and Merwe 1986) performed among the best while being less
computationally complex than other solutions. This procedure is described below.
The pooled covariance matrix is calculated by
$$S_{pooled} = \frac{S_1}{n_1} + \frac{S_2}{n_2} \tag{2.4.39}$$
The test statistic $T^{*2}$ is calculated as
$$T^{*2} = (\bar{x}_1 - \bar{x}_2 - \delta_0)' \, S_{pooled}^{-1} \, (\bar{x}_1 - \bar{x}_2 - \delta_0) \tag{2.4.40}$$
The null hypothesis (2.4.34) is tested similarly to the standard Hotelling T2 procedure, but the term
$F_{p,\,n_1 + n_2 - p - 1}(\alpha)$ is replaced by $F_{p,v}(\alpha)$, where v is calculated by
$$v = \frac{\mathrm{tr}(S_e^2) + \left[\mathrm{tr}(S_e)\right]^2}{\dfrac{1}{n_1 - 1}\left\{\mathrm{tr}\left[\left(\dfrac{S_1}{n_1}\right)^2\right] + \left[\mathrm{tr}\left(\dfrac{S_1}{n_1}\right)\right]^2\right\} + \dfrac{1}{n_2 - 1}\left\{\mathrm{tr}\left[\left(\dfrac{S_2}{n_2}\right)^2\right] + \left[\mathrm{tr}\left(\dfrac{S_2}{n_2}\right)\right]^2\right\}} \tag{2.4.41}$$
tr indicates the trace operation, that is, the sum of the elements of the main diagonal of the matrix.
* The F-distribution function is calculated with the Matlab function “finv”.
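The degrees of freedom of Eq. (2.4.41) can be computed as in the sketch below. It assumes that $S_e$ denotes the matrix $S_1/n_1 + S_2/n_2$ of Eq. (2.4.39), which the text does not state explicitly; the covariance matrices, sample sizes and function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def scaled(m, factor):
    return [[factor * v for v in row] for row in m]

def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace_terms(m):
    """tr(M^2) + [tr(M)]^2, the building block of Eq. (2.4.41)."""
    return trace(matmul(m, m)) + trace(m) ** 2

def approximate_dof(s1, n1, s2, n2):
    v1 = scaled(s1, 1.0 / n1)
    v2 = scaled(s2, 1.0 / n2)
    s_e = added(v1, v2)          # assumed to be the matrix of Eq. (2.4.39)
    return trace_terms(s_e) / (trace_terms(v1) / (n1 - 1)
                               + trace_terms(v2) / (n2 - 1))

# Hypothetical covariance matrices of two samples with unequal spread
s1 = [[2.0, 0.0], [0.0, 2.0]]
s2 = [[4.0, 0.0], [0.0, 4.0]]
v = approximate_dof(s1, 5, s2, 10)
print(v)     # 144/13, about 11.08
```

The result is generally not an integer, so the corresponding F value has to be interpolated from tables or computed by a function that accepts fractional degrees of freedom.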
2.4.7. Error propagation in colorimetric transformations
The colorimetric transformation (also termed transformation of tristimulus space) was
described by Eqs. (2.2.21) and (2.2.24) and is given by the expression
$$[R_2 \;\; G_2 \;\; B_2] = [R_1 \;\; G_1 \;\; B_1] \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \tag{2.4.42}$$
where the vector [R2 G2 B2] is the calculated set of tristimulus values, [R1 G1 B1] is the “source”
set of tristimulus values, and the 3×3 matrix aij (i,j = 1,2,3) is the transformation matrix which
relates the coordinates in tristimulus space 1 to the coordinates in space 2.
In the present study we deal with transforming the results of visual colour matching experiments
conducted with one set of primaries to another set of primaries. In such an operation, the source
values [R1 G1 B1] and the values of the transformation matrix aij result from repeated visual
colour measurements, and are subject to uncertainty. This uncertainty propagates into the
resulting tristimulus values [R2 G2 B2], and can be estimated by applying the method of error
propagation.
The general model of error propagation in linear transformations has already been given in Eq.
(2.4.33). The more general case, which covers both linear and non-linear transformations, is described
below as developed from (Burns and Berns 1997), followed by the application of
this method to the case of transformation of tristimulus space.
2.4.7.1. Propagation of random errors – general case
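Let us have m functions, each of n variables:
$$\begin{cases} y_1 = f_1(x_1, \ldots, x_n) \\ y_2 = f_2(x_1, \ldots, x_n) \\ \quad\vdots \\ y_m = f_m(x_1, \ldots, x_n) \end{cases} \tag{2.4.43}$$
A Jacobian matrix of size m×n is built from the partial derivatives of each function taken with respect
to each variable:
$$J = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \tag{2.4.44}$$
Knowing the variances and covariances of the variables (x1, …, xn), we construct the n×n covariance
matrix C:
$$C = \mathrm{Cov}(X) = \begin{bmatrix} \sigma_{x_1 x_1} & \sigma_{x_1 x_2} & \cdots & \sigma_{x_1 x_n} \\ \sigma_{x_2 x_1} & \sigma_{x_2 x_2} & \cdots & \sigma_{x_2 x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{x_n x_1} & \sigma_{x_n x_2} & \cdots & \sigma_{x_n x_n} \end{bmatrix} \tag{2.4.45}$$
The variances and covariances propagated to the results of the function calculations are given
by
$$C' = J C J^T \tag{2.4.46}$$
Here, superscript T denotes the matrix transpose operation, and C′ is the m×m covariance matrix
containing the estimated variances and covariances of the outputs of functions (2.4.43):
$$C' = \begin{bmatrix} \sigma_{y_1 y_1} & \sigma_{y_1 y_2} & \cdots & \sigma_{y_1 y_m} \\ \sigma_{y_2 y_1} & \sigma_{y_2 y_2} & \cdots & \sigma_{y_2 y_m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{y_m y_1} & \sigma_{y_m y_2} & \cdots & \sigma_{y_m y_m} \end{bmatrix} \tag{2.4.47}$$
For non-linear functions the Jacobian is evaluated at the mean values of the variables, and the
result is a first-order approximation.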
2.4.7.2. Propagation of random errors in colorimetric transformations
Wyszecki (Wyszecki 1958) and Wyszecki and Stiles (Wyszecki and Stiles 1982) published an
extensive account on this subject. The approach was to expand the partial derivative and matrix
operations described by Eqs. (2.4.44)-(2.4.47), and to express each output member of matrix C′
as a series of summations and multiplications. The limitation of this approach is that it results in
extremely long and complex expressions – which invariably leads to making simplifying
assumptions. These can include the assumption of no correlation between the members of
transformation matrix and the tristimulus values, or the assumption that the elements of the
transformation matrix are known precisely and are free of random errors.
Modern programming techniques allow operating on matrices directly, without the need to
expand the matrix operations. Therefore the error propagation described by equations
(2.4.43)-(2.4.47) can be implemented relatively conveniently as is, without making any
simplifying assumptions.
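In the case of transformation of tristimulus values from tristimulus space 1 to space 2 as
described by Eq. (2.4.42), each calculated tristimulus value can be represented as a function of
twelve variables:
$$\begin{cases} R_2 = f_R(R_1, G_1, B_1, a_{1,1}, a_{1,2}, \ldots, a_{3,3}) \\ G_2 = f_G(R_1, G_1, B_1, a_{1,1}, a_{1,2}, \ldots, a_{3,3}) \\ B_2 = f_B(R_1, G_1, B_1, a_{1,1}, a_{1,2}, \ldots, a_{3,3}) \end{cases} \tag{2.4.48}$$
e.g. the explicit expression for R2 would be
$$R_2 = R_1 a_{1,1} + G_1 a_{1,2} + B_1 a_{1,3} + 0 a_{2,1} + 0 a_{2,2} + 0 a_{2,3} + 0 a_{3,1} + 0 a_{3,2} + 0 a_{3,3} \tag{2.4.49}$$
and similarly for G2 and B2. A 3×12 Jacobian matrix is formed by taking partial derivatives with
respect to each of the variables:
$$J = \begin{bmatrix} \dfrac{\partial R_2}{\partial R_1} & \dfrac{\partial R_2}{\partial G_1} & \dfrac{\partial R_2}{\partial B_1} & \dfrac{\partial R_2}{\partial a_{1,1}} & \dfrac{\partial R_2}{\partial a_{1,2}} & \cdots & \dfrac{\partial R_2}{\partial a_{3,3}} \\ \dfrac{\partial G_2}{\partial R_1} & \dfrac{\partial G_2}{\partial G_1} & \dfrac{\partial G_2}{\partial B_1} & \dfrac{\partial G_2}{\partial a_{1,1}} & \dfrac{\partial G_2}{\partial a_{1,2}} & \cdots & \dfrac{\partial G_2}{\partial a_{3,3}} \\ \dfrac{\partial B_2}{\partial R_1} & \dfrac{\partial B_2}{\partial G_1} & \dfrac{\partial B_2}{\partial B_1} & \dfrac{\partial B_2}{\partial a_{1,1}} & \dfrac{\partial B_2}{\partial a_{1,2}} & \cdots & \dfrac{\partial B_2}{\partial a_{3,3}} \end{bmatrix} \tag{2.4.50}$$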
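A 12×12 covariance matrix is formed, which contains the variances and covariances of all the
elements of the transformation:
$$C = \begin{bmatrix} \sigma_{R_1 R_1} & \sigma_{R_1 G_1} & \sigma_{R_1 B_1} & \sigma_{R_1 a_{11}} & \sigma_{R_1 a_{12}} & \cdots & \sigma_{R_1 a_{33}} \\ \sigma_{G_1 R_1} & \sigma_{G_1 G_1} & \sigma_{G_1 B_1} & \sigma_{G_1 a_{11}} & \sigma_{G_1 a_{12}} & \cdots & \sigma_{G_1 a_{33}} \\ \sigma_{B_1 R_1} & \sigma_{B_1 G_1} & \sigma_{B_1 B_1} & \sigma_{B_1 a_{11}} & \sigma_{B_1 a_{12}} & \cdots & \sigma_{B_1 a_{33}} \\ \sigma_{a_{11} R_1} & \sigma_{a_{11} G_1} & \sigma_{a_{11} B_1} & \sigma_{a_{11} a_{11}} & \sigma_{a_{11} a_{12}} & \cdots & \sigma_{a_{11} a_{33}} \\ \sigma_{a_{12} R_1} & \sigma_{a_{12} G_1} & \sigma_{a_{12} B_1} & \sigma_{a_{12} a_{11}} & \sigma_{a_{12} a_{12}} & \cdots & \sigma_{a_{12} a_{33}} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{a_{33} R_1} & \sigma_{a_{33} G_1} & \sigma_{a_{33} B_1} & \sigma_{a_{33} a_{11}} & \sigma_{a_{33} a_{12}} & \cdots & \sigma_{a_{33} a_{33}} \end{bmatrix} \tag{2.4.51}$$
In covariance matrix C, indices (1:3, 1:3) are occupied by the 3×3 covariance matrix of the
tristimulus values R1, G1 and B1, indices (4:12, 4:12) are occupied by the 9×9 covariance matrix
of the elements of the transformation matrix a1,1 to a3,3, and the rest of the indices are occupied
by the covariances between the tristimulus values and the members of the transformation matrix.
Finally, the covariance matrix C is pre-multiplied by the Jacobian matrix J and post-multiplied by
its transpose, as in Eq. (2.4.46), to yield the 3×3 covariance matrix C′ of [R2 G2 B2].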
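The steps of this section can be sketched in a few lines. The sketch below is an illustration in plain Python rather than the implementation used in this study: the numerical values are invented, the covariance matrix is taken as diagonal purely for brevity, and the index convention follows the explicit expression (2.4.49), in which R2 depends on the coefficients a1,1, a1,2 and a1,3.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def jacobian(rgb1, a):
    """3x12 Jacobian of Eq. (2.4.50).

    Column order: R1, G1, B1, a11, a12, a13, a21, ..., a33.
    """
    rows = []
    for i in range(3):
        d_rgb = list(a[i])                 # d(out_i)/d(R1, G1, B1)
        d_coeff = [0.0] * 9
        d_coeff[3 * i:3 * i + 3] = rgb1    # d(out_i)/d(a_i1, a_i2, a_i3)
        rows.append(d_rgb + d_coeff)
    return rows

def propagate(jac, cov):
    """C' = J C J^T, Eq. (2.4.46)."""
    return matmul(matmul(jac, cov), transpose(jac))

# Hypothetical inputs: source tristimulus values and an identity transformation
rgb1 = [1.0, 2.0, 3.0]
a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Diagonal 12x12 covariance matrix: variance 0.01 for each tristimulus value,
# 0.0001 for each matrix coefficient, no covariances (an assumption of the example)
variances = [0.01] * 3 + [0.0001] * 9
cov = [[variances[i] if i == j else 0.0 for j in range(12)] for i in range(12)]

cov_out = propagate(jacobian(rgb1, a), cov)
print(cov_out[0][0])   # 0.01 + (1 + 4 + 9) * 0.0001, i.e. about 0.0114
```

In a full application the off-diagonal blocks of the 12×12 matrix would carry the measured covariances, and the lower-right 9×9 block would come from the matrix inversion step.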
2.4.7.3. Propagation of random errors through the matrix inversion
If the inverse-matrix method of transformation of tristimulus space is used, an additional step is
required for correct estimation of the propagated uncertainties: propagation through the inversion
of the matrix of tristimulus values of the primaries (Eq. (2.2.24)). Each element of the inverted
matrix can be presented as a function of the elements of the original matrix, namely as a cofactor
divided by the determinant. For the matrix of tristimulus values of the primaries in the right part
of Eq. (2.2.24), if
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = M^{-1} = \begin{bmatrix} R_{11} & G_{11} & B_{11} \\ R_{21} & G_{21} & B_{21} \\ R_{31} & G_{31} & B_{31} \end{bmatrix}^{-1} \tag{2.4.52}$$
then
$$a_{1,1} = \frac{G_{21} B_{31} - B_{21} G_{31}}{|M|} \tag{2.4.53}$$
and similarly for the rest of the matrix elements. The 9×9 Jacobian matrix is formed as
$$J_I = \begin{bmatrix} \dfrac{\partial a_{11}}{\partial R_{11}} & \cdots & \dfrac{\partial a_{11}}{\partial B_{31}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial a_{33}}{\partial R_{11}} & \cdots & \dfrac{\partial a_{33}}{\partial B_{31}} \end{bmatrix} \tag{2.4.54}$$
The 9×9 covariance matrix is given by
$$C_I = \begin{bmatrix} \sigma_{R_{11} R_{11}} & \sigma_{R_{11} G_{11}} & \cdots & \sigma_{R_{11} B_{31}} \\ \sigma_{G_{11} R_{11}} & \sigma_{G_{11} G_{11}} & \cdots & \sigma_{G_{11} B_{31}} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{B_{31} R_{11}} & \sigma_{B_{31} G_{11}} & \cdots & \sigma_{B_{31} B_{31}} \end{bmatrix} \tag{2.4.55}$$
And finally, the error propagation model is applied as usual:
$$C'_I = J_I C_I J_I^T \tag{2.4.56}$$
It yields a 9×9 covariance matrix of the elements of the transformation matrix, which feeds into
the general covariance matrix C (Eq. (2.4.51)) at locations (4:12, 4:12).
2.4.7.4. Programming implementation of error propagation model
Explicit expression of the described model, which would express the resulting covariances in
terms of source tristimulus values, transformation coefficients and their covariances, although
possible, would be excessively elaborate. However, the described model can be conveniently
implemented in Matlab using the Symbolic Math facility.
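For readers without access to a symbolic toolbox, the matrix inversion step of Section 2.4.7.3 can also be approximated numerically. The sketch below is not the Matlab implementation referred to above: it replaces symbolic differentiation with central finite differences, is written in plain Python, and uses an invented matrix of primaries, a row-by-row ordering of the nine variables, and an input covariance matrix with a single non-zero variance.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix, each element a cofactor over the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def flatten(m):
    return [v for row in m for v in row]

def unflatten(v):
    return [v[0:3], v[3:6], v[6:9]]

def inverse_jacobian(m, step=1e-6):
    """9x9 Jacobian of the inverse (Eq. (2.4.54)) by central differences."""
    x = flatten(m)
    jac = [[0.0] * 9 for _ in range(9)]
    for col in range(9):
        hi, lo = list(x), list(x)
        hi[col] += step
        lo[col] -= step
        f_hi = flatten(inverse3(unflatten(hi)))
        f_lo = flatten(inverse3(unflatten(lo)))
        for row in range(9):
            jac[row][col] = (f_hi[row] - f_lo[row]) / (2.0 * step)
    return jac

def propagate(jac, cov):
    """C'_I = J_I C_I J_I^T, Eq. (2.4.56)."""
    n = len(cov)
    jc = [[sum(jac[i][k] * cov[k][j] for k in range(n)) for j in range(n)]
          for i in range(len(jac))]
    return [[sum(jc[i][k] * jac[j][k] for k in range(n)) for j in range(len(jac))]
            for i in range(len(jac))]

# Hypothetical matrix of primaries; only its first element carries uncertainty
m = [[2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
cov_in = [[0.0] * 9 for _ in range(9)]
cov_in[0][0] = 0.04

jac_i = inverse_jacobian(m)
cov_inv = propagate(jac_i, cov_in)
print(jac_i[0][0], cov_inv[0][0])   # close to -0.25 and 0.0025
```

For this diagonal example the first element of the inverse is 1/R11, so its derivative is −1/R11² = −0.25 and the propagated variance is 0.25² × 0.04 = 0.0025, which the numerical result should reproduce closely.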
2.4.8. Confidence ellipses and ellipsoids
The spread of values about the mean in bivariate distributions is most commonly visualised by
confidence ellipses. In this section, the statistical and algebraic properties of ellipses are
described (Johnson and Wichern 2002), followed by a discussion of the derivation of ellipses
from the statistics of colorimetric data. The discussion is limited to the two-dimensional case.
2.4.8.1. Properties of ellipse
The use of the confidence ellipse is based on the concept of statistical distance: two points situated
at equal statistical distance from the origin have equal probability of occurrence. Consider the
set of data illustrated in Figure 2.4.8-1. This set has covariance matrix
$$S = \begin{bmatrix} s_{11} & 0 \\ 0 & s_{22} \end{bmatrix}, \qquad s_{11} > s_{22} \tag{2.4.57}$$
Figure 2.4.8-1 Example of scatter plot of set of data from bivariate normal distribution
The variability of the data along the x axis is greater than along the y axis; consequently, a sample from
this distribution is more likely to have larger x values than y values. In order to equate the scales
in probability terms, this discrepancy can be eliminated by “standardising” the data, that is, by
dividing each point’s coordinates by their corresponding standard deviation:
$$x^* = \frac{x}{\sqrt{s_{11}}}, \qquad y^* = \frac{y}{\sqrt{s_{22}}} \tag{2.4.58}$$
The distance of every “standardised” point from the origin is found by the Euclidean distance
formula:
$$d = \sqrt{\left(\frac{x}{\sqrt{s_{11}}}\right)^2 + \left(\frac{y}{\sqrt{s_{22}}}\right)^2} = \sqrt{\frac{x^2}{s_{11}} + \frac{y^2}{s_{22}}} \tag{2.4.59}$$
or
$$\frac{x^2}{s_{11}} + \frac{y^2}{s_{22}} - d^2 = 0 \tag{2.4.60}$$
But Equation (2.4.60) is the equation of an ellipse, centred at the origin with radii parallel to the
axes. That is, the locus of all points having equal statistical distance from the origin forms an
ellipse having radii proportional to the standard deviations of the data. This is illustrated in Figure
2.4.8-2.
Figure 2.4.8-2. Ellipse centred at the origin, with radii proportional to standard deviation and parallel to
the axes
Equation (2.4.60) can be modified to account for a more general case. The statistical distance
between any two points is given by
$$\frac{1}{s_{11}} (x_1 - x_2)^2 + \frac{1}{s_{22}} (y_1 - y_2)^2 - d^2 = 0 \tag{2.4.61}$$
Furthermore, if the covariance between the values is not zero, that is
$$S = \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix} \tag{2.4.62}$$
it can be shown (Johnson and Wichern 2002) that equation (2.4.61) can be further
generalised to have the form
$$a_{11} (x_1 - x_2)^2 + 2 a_{12} (x_1 - x_2)(y_1 - y_2) + a_{22} (y_1 - y_2)^2 - d^2 = 0 \tag{2.4.63}$$
where $a_{11}$, $a_{12}$ and $a_{22}$ are the elements of the inverted covariance matrix S:
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix} = S^{-1} = \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix}^{-1} \tag{2.4.64}$$
Equation (2.4.63) is used to construct the confidence ellipses in the two-dimensional system of
coordinates. The relative size of the ellipse is determined by the required level of confidence,
and is controlled by the parameter $d^2$ of the equation, which is set to the value from the $\chi^2$
distribution table for 2 degrees of freedom.
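A confidence ellipse defined by Eq. (2.4.63) can be traced point by point, as in the following sketch. The covariance matrix, the centre and the function names are hypothetical; the value 5.99 is the 95% point of the χ² distribution for 2 degrees of freedom. For each direction from the centre, the radius is chosen so that the point satisfies Eq. (2.4.63) exactly.

```python
import math

def inverse2(s):
    """Elements a11, a12, a22 of the inverted covariance matrix, Eq. (2.4.64)."""
    (s11, s12), (_, s22) = s
    det = s11 * s22 - s12 * s12
    return s22 / det, -s12 / det, s11 / det

def distance_sq(point, centre, s):
    """Squared statistical distance between a point and the centre."""
    a11, a12, a22 = inverse2(s)
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return a11 * dx * dx + 2.0 * a12 * dx * dy + a22 * dy * dy

def ellipse_points(centre, s, d2=5.99, count=72):
    """Points on the ellipse of Eq. (2.4.63) around the given centre."""
    a11, a12, a22 = inverse2(s)
    points = []
    for k in range(count):
        t = 2.0 * math.pi * k / count
        c, sn = math.cos(t), math.sin(t)
        radius = math.sqrt(d2 / (a11 * c * c + 2.0 * a12 * c * sn + a22 * sn * sn))
        points.append((centre[0] + radius * c, centre[1] + radius * sn))
    return points

# Hypothetical covariance matrix with non-zero covariance
s = [[4.0, 1.5], [1.5, 1.0]]
centre = (50.0, 10.0)
outline = ellipse_points(centre, s)

inside = distance_sq((51.0, 10.5), centre, s) < 5.99    # a point near the centre
print(len(outline), inside)
```

The list of points can be passed to any plotting routine; the function distance_sq can be used to decide whether a given measurement falls inside the ellipse.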
2.4.8.2. Plotting confidence ellipses in CIELAB space
As an example, the following is the procedure for plotting projections of CIELAB confidence
ellipsoids on the a*b*, a*L* and b*L* planes.
Let M be a mean vector of CIELAB values:
$$M = \begin{bmatrix} \mu_{L^*} \\ \mu_{a^*} \\ \mu_{b^*} \end{bmatrix} \tag{2.4.65}$$
for which the covariance matrix S is known:
$$S = \begin{bmatrix} s_{L^*L^*} & s_{L^*a^*} & s_{L^*b^*} \\ s_{a^*L^*} & s_{a^*a^*} & s_{a^*b^*} \\ s_{b^*L^*} & s_{b^*a^*} & s_{b^*b^*} \end{bmatrix} \tag{2.4.66}$$
Matrix S can be calculated directly from multiple measurements, or by the error propagation model,
that is, by propagating the variances and covariances of the CIEXYZ values of the stimulus. From S, three
sub-matrices are extracted, one for each CIELAB plane, and an inverse is computed for each:
$$\begin{aligned} S_{L^*a^*}^{-1} &= \begin{bmatrix} s_{L^*L^*} & s_{L^*a^*} \\ s_{L^*a^*} & s_{a^*a^*} \end{bmatrix}^{-1} = \begin{bmatrix} u_{L^*} & u_{L^*a^*} \\ u_{L^*a^*} & u_{a^*} \end{bmatrix} \\ S_{L^*b^*}^{-1} &= \begin{bmatrix} s_{L^*L^*} & s_{L^*b^*} \\ s_{L^*b^*} & s_{b^*b^*} \end{bmatrix}^{-1} = \begin{bmatrix} u_{L^*} & u_{L^*b^*} \\ u_{L^*b^*} & u_{b^*} \end{bmatrix} \\ S_{a^*b^*}^{-1} &= \begin{bmatrix} s_{a^*a^*} & s_{a^*b^*} \\ s_{a^*b^*} & s_{b^*b^*} \end{bmatrix}^{-1} = \begin{bmatrix} u_{a^*} & u_{a^*b^*} \\ u_{a^*b^*} & u_{b^*} \end{bmatrix} \end{aligned} \tag{2.4.67}$$
Finally, an ellipse equation is constructed for each plane:
$$f(L^*a^*): \; u_{L^*} (L^* - \mu_{L^*})^2 + 2 u_{L^*a^*} (L^* - \mu_{L^*})(a^* - \mu_{a^*}) + u_{a^*} (a^* - \mu_{a^*})^2 - k d^2 = 0 \tag{2.4.68}$$
$$f(L^*b^*): \; u_{L^*} (L^* - \mu_{L^*})^2 + 2 u_{L^*b^*} (L^* - \mu_{L^*})(b^* - \mu_{b^*}) + u_{b^*} (b^* - \mu_{b^*})^2 - k d^2 = 0 \tag{2.4.69}$$
$$f(a^*b^*): \; u_{a^*} (a^* - \mu_{a^*})^2 + 2 u_{a^*b^*} (a^* - \mu_{a^*})(b^* - \mu_{b^*}) + u_{b^*} (b^* - \mu_{b^*})^2 - k d^2 = 0 \tag{2.4.70}$$
$d^2$ is the value from the $\chi^2$ distribution table for two degrees of freedom for the desired confidence
level ($d^2 = 5.99$ for 95%), and k is a scaling factor (i.e. k = 1 to plot at 1:1 scale).
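The extraction of the plane sub-matrices and the evaluation of the ellipse functions can be sketched as below. The covariance matrix, the mean vector and the function names are hypothetical; axes are indexed 0, 1, 2 for L*, a*, b*.

```python
def plane_coefficients(s, i, j):
    """Invert the 2x2 sub-matrix of S for axes i and j, as in Eq. (2.4.67).

    Returns the three distinct elements (u_i, u_ij, u_j) of the inverse.
    """
    sii, sij, sjj = s[i][i], s[i][j], s[j][j]
    det = sii * sjj - sij * sij
    return sjj / det, -sij / det, sii / det

def ellipse_function(coeffs, mean_i, mean_j, value_i, value_j, d2=5.99, k=1.0):
    """Left-hand side of Eqs. (2.4.68)-(2.4.70); zero on the ellipse outline."""
    u_i, u_ij, u_j = coeffs
    di, dj = value_i - mean_i, value_j - mean_j
    return u_i * di * di + 2.0 * u_ij * di * dj + u_j * dj * dj - k * d2

# Hypothetical CIELAB statistics (order of axes: L*, a*, b*)
s = [[1.0, 0.2, 0.1],
     [0.2, 4.0, 1.0],
     [0.1, 1.0, 9.0]]
mean = [50.0, 10.0, -5.0]

ab = plane_coefficients(s, 1, 2)     # a*b* plane: (9/35, -1/35, 4/35)
at_mean = ellipse_function(ab, mean[1], mean[2], mean[1], mean[2])
print(ab, at_mean)                   # the function equals -5.99 at the mean
```

Negative values of the function indicate points inside the ellipse and positive values points outside it, so the outline can be drawn as the zero contour of the function over a grid of values.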
2.5. Additivity failures
…its [additivity assumption, BO] inclusion in the principles of colour mixture
postulated by Grassmann … has tended to make its acceptance axiomatic.
(Wright 1964)
Grassmann’s assumption of additivity (Section 2.2.3) is basic to the system of CIE colorimetry
and, as W. D. Wright points out in the above quote, is generally axiomatically accepted to hold.
Indeed, the intuitive reaction is that the consequence of establishing the fact of additivity failure
is a requirement to re-design the entire system of colour measurement – too high a price to be
easily accepted. However, failures in additivity have been repeatedly reported in the last several
decades. In the following section the literature on the subject will be reviewed.
Interest in the subject appears to come in “waves”, each triggered by a publication reporting a
case of the failure. The present review is, somewhat arbitrarily, structured according to
these waves. The emphasis is on experiments conducted in “normal” colour-matching viewing
conditions with a bipartite field, not involving bleaching of photopigment (i.e. (Wyszecki and
Stiles 1982)). The terms “additivity law”, “Grassmann’s law/assumptions” and “trichromatic
generalisation” are used here interchangeably and have the same meaning.
2.5.1. Blottiau (1947) and Trezona (1953)
2.5.1.1. Blottiau (1947)
The first measurement of additivity failure appears to have been obtained by Blottiau (Blottiau 1947)
(reviewed in (Trezona 1953; Trezona 1993)) using a Donaldson colorimeter (Wyszecki and Stiles
1982). After a match had been established, an equal amount of the same red desaturating
stimulus was added to both half-fields. If additivity holds, the match is supposed to hold with
and without the desaturating stimulus. However, the addition of the desaturating stimulus to both sides
upset the match. Interestingly, the largest deviations from additivity were observed not in the red,
the colour of the desaturating stimulus, but in the blue tristimulus values. This is the first
indication of a possible link between the additivity failures and the interaction of the blue cone
mechanism with other receptor signals.
2.5.1.2. Trezona (1953, 1954)
Trezona (Trezona 1953) reproduced Blottiau’s experiment using a Wright colorimeter (Wright
1964), with six observers and with narrow-band primaries at 650, 530 and 460 nm. Blottiau’s
results were reproduced, including the finding that the deviation from additivity is largest in the blue
tristimulus values; the failures in red were of smaller extent, and there were no failures in green
at all. However, her conclusions were somewhat different: despite the result of the statistical
t-test showing significant failure for the blue tristimulus value in all test colours, Trezona concluded
that the failures are the result of poor discrimination in the blue region. No evidence for this
proposition is reported.
In the follow-up paper (Trezona 1954) Trezona continued to explore the subject of additivity
failures and reported further experimental results and analysis. The proportionality law, which
can be considered to be the special case of the additivity law in which the added stimuli are
identical, was found to hold for a 1.2° field. Also, the matches were found to persist under
various conditions of chromatic and luminance adaptation.
The calculation of tristimulus values by multiplying the spectral power distribution by the
colour matching functions and integrating assumes the validity of additivity. Hence, additivity
can be tested by comparing tristimulus values measured directly by visual colour matching with
ones calculated using the same observer’s colour matching functions. Trezona performed this
test, and found deviations from additivity. In addition, diagrams showing the results of a similar test
by (Ishak 1951) were reproduced (shown here, Figure 2.5.1-1). In Ishak’s experiment, four
observers made matches of three different whites. For all the colours matched by all the observers,
with the exception of one, the results show a consistent trend: the match made directly is significantly
bluer than the one predicted by calculation. This feature can be traced in the results of almost all the
additivity experiments since.
Figure 2.5.1-1 Results of test of additivity by (Ishak 1951)

Results of four observers plotted in WDW chromaticity diagram. M stands for “Measured” (visually) and
C stands for “Calculated”. Reproduced from (Trezona 1954).
In conclusion, several possible causes for additivity failures are suggested. One is the
breakdown of Weber’s Law* for a blue stimulus of low intensity: Trezona found that the Weber
fraction tends to increase as the intensity of blue light decreases. In the case of very desaturated
stimuli, when the conditions are close to a Maxwell match, there is a computational complication
related to differences between two large nearly-equal quantities (as discussed in Section
2.2.6.1); this could lead to results resembling additivity failure. The possibility of a physiological
cause is mentioned as well. Finally, it is acknowledged that the deviations from additivity, if
real, have a significant effect on colorimetry.
2.5.2. After Stiles and Burch colour matching experiment
The largest colour matching investigation carried out so far is the one which led to the
establishment of the CIE 1964 10° Standard Colorimetric Observer. The experiment was held at
the NPL and led by Stiles (Stiles and Burch 1959). In the following decade, a considerable
amount of activity concentrated on colour matching as the field trials of the new CMF were
carried out. Some of the studies were concerned with failures of additivity; the first one was by
Stiles himself.
* The ratio of the JND to the stimulus intensity is constant.
2.5.2.1. Stiles’ “Addendum on additivity”
As part of the colour matching study, Stiles’ 49 subjects made colour matches of broadband
white light using monochromatic primaries at 645, 526 and 444 nm. In addition, a complete set
of CMF was measured for each observer, and the tristimulus values calculated with the CMF
were compared with the ones obtained by direct visual matching (Stiles 1963). The measured and
the calculated chromaticities were transformed to CIE 1964 XYZ tristimulus values.
The mean result of all observers did not deviate from additivity; however, the results of
individual observers did. Figure 2.5.2-1 shows the diagram from the original paper: the
individual deviations seem to lie on the same line drawn from the blue end, but in opposite
directions, depending on the spectral position of the blue primary. About one third of the
individual deviations are statistically significantly different from the other two thirds. Stiles
summarises:
It must be concluded that, in these observations on a match involving severe

differences of energy distribution, some one in three subjects shows a substantial
failure in additivity. This takes form in main of the variation in the ratio of the
blue (Z) component to the green (Y) and red (X)…It is only for the mean of this
quite large group of subjects that the observed chromaticity and the chromaticity
predicted on the assumption of additivity agree well.
Figure 2.5.2-1. Results of test of additivity by (Stiles 1963).

Results for two alternative blue primaries. 445 nm: differences: circles; mean: square; 470 nm:
differences: crosses; mean: triangle. Reproduced from (Stiles 1963)
2.5.2.2. Comparing Maxwell and Maximum Saturation methods
Crawford (1965)
One method of additivity test used by several researchers is that of comparing CMF measured
by Maxwell and by maximum saturation methods. If additivity holds, both methods would lead
to identical results for the same observer.
Crawford conducted his experiment on the same equipment used by Stiles and Burch at the NPL
(Crawford 1965). Six observers measured their CMF by the two methods using narrow-band
primaries at 650, 530 and 460 nm, at a luminance level of 100 td. The lower precision of the Maxwell
matching method was partially compensated for by making two matches for every test stimulus.
The results of four observers are shown in Figure 2.5.2-2. The characteristic difference between
the chromaticities obtained by the two methods is apparent: in the blue-green region, tristimulus values
measured by the Maxwell method are significantly “bluer” than the ones measured by maximum
saturation. Crawford rules out the possibility of interference of the Maxwell spot in the matches, as one
of the observers did not see the Maxwell spot at all but still showed the same deviations. The
possibility that the discrepancies are due to rod intrusion is ruled out as well: measurements on
small (2° and 1°) fields show a smaller, but equally significant effect.
Crawford concludes that
…it is probable that adaptation effects of the sort here found will spread
appreciably over colour space and render the general application of simple
system of colorimetry ambiguous.
Figure 2.5.2-2. Additivity test by (Crawford 1965).

rg chromaticity diagrams. Loci of monochromatic stimuli measured by four observers by maximum
saturation (dashed lines) and Maxwell matching (solid lines). Reproduced from (Crawford 1965)
Wyszecki (1982)
In the Colour Science textbook (Wyszecki and Stiles 1982) Wyszecki published the results of
similar measurements. The same observer measured his CMF by the two methods, and with two
field sizes, 2° and 9°. The field luminance level was about 1000 td. The results show the same
trend as Crawford’s (Crawford 1965), with the sole exception that the deviations do not become
smaller with a smaller matching field. Wyszecki discusses the uncertainties of this kind of
measurement, and notes that significantly larger uncertainties are associated with the results of
Maxwell colour matches than with the maximum saturation ones. He notes that the
discrepancies are rather small from an experimental point of view but are significant.
2.5.2.3. Lozano and Palmer (1967, 1968)
Lozano and Palmer published a series of two papers (Lozano and Palmer 1967; Lozano and
Palmer 1968). They begin the first paper by summarising the studies on the subject available at
the time, and conclude the summary with this statement:
…Taken together, these various investigations suggest that the large-field colour
matching is very probably non-additive for many observers, but that the extent of
the discrepancies and the consequences for practical colorimetry is uncertain…
Using the Stiles colorimeter at the NPL, they measured the CMF of four observers using procedures very
similar to Stiles’. In addition, the observers matched 20 broadband stimuli spanning a wide range
of chromaticities. Each match was repeated three or four times. Again, the test of additivity was
carried out as a comparison of the visually measured tristimulus values with the ones calculated
with each observer’s CMF. The result was that
…the observed “blue” tristimulus values were often much larger than those
calculated…
The discrepancies were reported to be about 20% of the values, compared with an intra-observer
repeatability of about 3%, and were particularly pronounced in nearly-white matches. One
observer out of four showed almost perfect additivity for all the colours, and the other three
…were non-additive, especially for “blue” tristimulus value…
The paper concludes with a rather desperate note:
This complicated situation defies analysis…

It is impossible to be certain that they [additivity failures, BO] will never intrude
into some practical situation, especially as their cause is unknown.
In the second paper (Lozano and Palmer 1968), one observer measured three sets of CMF: one
by the Maxwell method at a high luminance level (160 td) and two by the maximum saturation method,
at 160 and 10 td. The results show a similar trend for all three sets, although with different
magnitudes. Discrepancies are large in all the areas except for reds, oranges and yellows.
Figure 2.5.2-3. Additivity test by (Lozano and Palmer 1967).

Comparison of “calculated” (crosses) and visually measured (black dots) results of 5 observers.
Reproduced from (Lozano and Palmer 1967)
2.5.3. Zaidi (1986)
The most comprehensive empirical study of additivity* so far was carried out and reported by
Zaidi (Zaidi 1986). In a series of carefully designed colour matching experiments, he attempted
to identify possible causes for the discrepancies between the results of Maxwell and maximum
saturation colour matching of the kind discussed above (i.e. (Crawford 1965; Lozano and
Palmer 1967; Lozano and Palmer 1968; Wyszecki and Stiles 1982)). Since enough evidence
exists that the non-additivities occur mostly in blue region, he chose to concentrate his tests on
matches of blue-green colours.
The failures are not due to rods or computational imprecision
The first experiment was designed to reproduce Crawford’s (Crawford 1965) results. Two
observers made trichromatic matches of narrow-band lights spanning the range 410-510 nm
in 10 nm intervals, with primaries at 670, 546 and 450 nm. The same matches were
made with and without the desaturating 580 nm light added to both fields. If the additivity law
holds, then the tristimulus values in both conditions should be identical within the experimental
error. The results were compared with Wyszecki’s (Wyszecki and Stiles 1982) CMF measured
by the Maxwell method, transformed to the experimental primaries.
Zaidi reports consistent discrepancies between the CMF measured with and without the
desaturating light, and the nature and direction of the discrepancies are similar to those in
Wyszecki’s CMF. The conclusion is that the observed additivity failures are not the result of
computational imprecision of the Maxwell method, because the computations were identical in
both conditions; nor are they due to rod intrusion, because significant rod participation is
unlikely in a 2° field. The prereceptoral filters were also ruled out as a possible cause, because the
desaturating light was identical in both fields. Zaidi makes these observations, which are central
to the discussion of the adaptation phenomenon leading to additivity failures:
(a) the 580 nm light [the desaturating light, BO] only affects LWS and MWS
cones; (b) nonlinearity is observed only for test wavelengths shorter than 450 nm.
Failures are not due to failures of principle of invariance
The second experiment was conducted using a setup in which observers made minimally-distinct
border (MDB) matches. In such a match, the visibility of the border between two
adjacent colour stimuli depends solely on differences in the activity of L and M cones, while the
activity of S cones has no significance (Tansley and Boynton 1978). The match is independent of
any prereceptoral filtering, and depends only on the quantal catch: the border between two fields
“melts” when the M and L receptors on both sides of the border absorb equal quanta. When an MDB
match is set, adding an identical adapting light to both fields will not change the relative amount of
quanta absorbed. Therefore, if the match is upset, it can be concluded that the principle of
invariance fails for the L and M cones.
* Zaidi uses the term “linearity”; for the sake of consistency we will use the term “additivity” in this review.
Two observers made repeated MDB matches with and without adapting light. No difference
between the two conditions was observed, hence the principle of invariance was shown to hold.
Failures are not due to multiple photopigments
It could be suggested that the additivity failures are due to the presence of more than
three types of photopigment taking part in the 2° foveal matches, i.e. rods. As with rod
intrusion in large-field matches, if four mechanisms contribute to a trichromatic match, such a
match is not necessarily a cone-quantal one but can also be neural; thus failures of additivity
should be expected.
In order to test this possibility, two observers matched a mixture (450 nm + 670 nm) to a
mixture (430 nm + 546 nm) in four adaptation conditions: the same light was added to both
fields at four different luminance levels. Twelve added narrow-band lights were used,
spanning the spectrum from 500 to 660 nm. If a failure of additivity is observed, and the
magnitude of the failure is proportional to the scotopic luminance of the field, then it can be
suggested that rods are responsible for the failure. However, the extent of the failure was found
to be proportional to photopic rather than scotopic luminance, hence the possibility of rod
intrusion was dismissed.
Further in the same paper, Zaidi also rules out the variations in peak cone sensitivity as the
cause of additivity failures. In the discussion, he analyses the differences in cone excitation
between the maximum saturation and Maxwell methods. It appears that the only difference for
the short-wavelength matches is in excitation of the L and M cones relative to the S cones: in
desaturated conditions the ratio between the two excitations is much higher. Therefore it can be
postulated that the additivity failures result from
… change in shape of the SWS cone mechanism’s spectral sensitivity for
wavelengths shorter than 450 nm in response to change in LWS and MWS cone
excitation relative to SWS cone excitation.
Zaidi concludes by noting that the observed non-additivity seems to be confined to the spectral
region of wavelengths shorter than 440 nm. As this is also the region of highest inter-observer
variability due to variations in prereceptoral filtering, Zaidi suggests that this can explain why
the failures in additivity have not hindered the successful implementation of the CIE 1931 Standard
Colorimetric Observer: the uncertainties introduced by observer metamerism in this range are
larger than those introduced by the additivity failures.
2.5.4. Thornton (1992-1998)
Between 1992 and 1998, William Thornton published a series of six papers with the common
title “Toward a More Accurate and Extensible Colorimetry” (Thornton 1992a; Thornton 1992b;
Thornton 1992c; Thornton 1997; Thornton 1998b; Thornton 1998c) which initiated a new, and
the most recent, series of debates on the validity of Grassmann’s law, and eventually led to the
initiation of the current project. Most of the papers published in the last decade on the subject of
additivity in colorimetry and the validity of the Standard Colorimetric Observer were to some
degree a response to Thornton’s publications.
In his papers, Thornton touches upon several issues in colorimetry. In this review the emphasis will
be on the main subject of this section: the failures of Grassmann’s law.
2.5.4.1. “Toward a More Accurate and Extensible Colorimetry” parts I and II
Part I of the paper describes the instrument, the procedures and the basic
results. Thornton’s visual colorimeter provided a 10° horizontally-split bipartite
field. Special features of this instrument were a built-in telespectroradiometer, which allowed
instantaneous measurements of the colorimeter’s output at any given time, and the
availability of multiple channels: up to ten beams could be mixed on either side of the bipartite
field. Six observers measured colour matching functions by the Maxwell method using three sets of
primaries. In Thornton’s notation these were
− Prime Colours (PC): 607, 533 and 452 nm
− Non-Prime Colours (NP): 638, 558 and 477 nm
− Anti-Prime Colours (AP): 653, 579 and 497 nm
The additivity of matches was tested in two ways. A straightforward test of additivity (Thornton
1992a) involved making two sets of matches: (A ≡ B) and (C ≡ D). The corresponding lights
were superimposed, and the observer evaluated whether the resulting match (A ⊕ C ≡ B ⊕
D) held. 110 matches of this kind were evaluated, and no significant deviations were found;
that is, observers saw no mismatches, or only very minor ones, between the “added” pairs of lights.
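The logic of this direct test can be sketched numerically. The following is a minimal illustration in Python, assuming a purely linear observer; the three Gaussian curves, the wavelength grid and the random spectra are hypothetical stand-ins and are not Thornton's data or instrument. It shows only what the additivity law predicts: if A ≡ B and C ≡ D, then the superimposed pair also matches.

```python
import numpy as np

def tristimulus(cmf, spd):
    """Tristimulus values under a linear observer model: one weighted sum per CMF."""
    return cmf @ spd

def additivity_residual(cmf, a, b, c, d):
    """Tristimulus mismatch of the superimposed pair (A + C) versus (B + D)."""
    return tristimulus(cmf, a + c) - tristimulus(cmf, b + d)

# Hypothetical observer: three Gaussian curves standing in for real CMFs.
wl = np.arange(400, 701, 10)
cmf = np.stack([np.exp(-0.5 * ((wl - peak) / 40.0) ** 2) for peak in (600, 550, 450)])

# Rows 3 onwards of Vh from the SVD span the null space of cmf ("metameric blacks"):
# adding one of them changes a spectrum but not its tristimulus values.
blacks = np.linalg.svd(cmf)[2][3:]

rng = np.random.default_rng(0)
a = rng.uniform(1.0, 2.0, wl.size)
b = a + 0.1 * blacks[0]          # A ≡ B (metameric pair)
c = rng.uniform(1.0, 2.0, wl.size)
d = c + 0.1 * blacks[1]          # C ≡ D (metameric pair)

residual = additivity_residual(cmf, a, b, c, d)
print(residual)  # numerically zero for a linear observer
```

A real observer for whom additivity fails would, in these terms, report a visible mismatch for the superimposed pair even though both component pairs match.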
The discrepancies in additivity appeared when Thornton attempted to transform tristimulus
values measured with one set of primaries to another. Each stimulus was
visually matched by an observer using the three sets of primaries in turn. In addition, the primary
lights of every set were measured with every other set. Thus, the 3×3 transformation matrix for
the forward-matrix transform (2.2.21) was experimentally determined for each observer and for
each primary set combination. When the tristimulus values were transformed from one
tristimulus space to another, the calculated values turned out to differ from the
measured ones. Since the validity of the procedure of tristimulus space transformation rests
entirely on the assumption of validity of Grassmann’s law, the observed discrepancies implied its
failure, i.e. (Thornton 1992b):
...failure of Grassmann's assumptions III and/or IV in the case of computation of
tristimulus values of a single matching pair of doublets, using CMFs from
different primary sets, for the same human observer.
By Grassmann’s assumptions III and IV Thornton means the statements III and IV of the
Trichromatic Generalisation (section (2.2.5), (Wyszecki and Stiles 1982)). The results of the
transformation corresponded to the results of visual colour matching better if the “source”
tristimulus space was the PC one.
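The comparison of calculated and measured tristimulus values can be illustrated with a short Python sketch. It assumes a generic 3×3 linear transform whose columns are the tristimulus values of the source primaries as matched with the destination primaries; it does not reproduce the exact form of equation (2.2.21), and all numbers are invented for illustration rather than taken from Thornton's papers.

```python
import numpy as np

def transform_tristimulus(src_primaries_in_dst, t_src):
    """Predict destination-space tristimulus values from source-space ones.

    src_primaries_in_dst: 3x3 array; column j holds the tristimulus values of
    source primary j as matched with the destination primaries.
    """
    m = np.asarray(src_primaries_in_dst, dtype=float)
    return m @ np.asarray(t_src, dtype=float)

def relative_discrepancy(t_calculated, t_measured):
    """Per-channel relative difference between calculated and measured values."""
    t_measured = np.asarray(t_measured, dtype=float)
    return (np.asarray(t_calculated, dtype=float) - t_measured) / np.abs(t_measured)

# Hypothetical numbers, for illustration only.
m = np.array([[0.9, 0.2, 0.0],
              [0.1, 0.7, 0.1],
              [0.0, 0.1, 0.9]])
t_src = np.array([1.0, 2.0, 3.0])      # match made with the source primaries
t_meas = np.array([1.3, 1.8, 3.48])    # invented direct match with the destination primaries

t_calc = transform_tristimulus(m, t_src)
print(relative_discrepancy(t_calc, t_meas))
```

If Grassmann's law held exactly, the discrepancy would be zero within the experimental error in every channel; a systematic non-zero value is the kind of failure described above.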
The remaining four papers (Thornton 1992c; Thornton 1997; Thornton 1998b; Thornton 1998c)
mostly discuss the results reported in the first two, respond to criticism,
and report the results of new experiments conducted in response to that criticism. No new data
were reported on the failures of tristimulus space transformation.
Having reviewed the literature on additivity failures published during the four decades
preceding 1993, it is safe to say that it would have been more surprising had Thornton not
found any discrepancies. In fact, Thornton’s tests were not of the validity of the additivity law per se, but
of the validity of its application. As we discuss below in the conclusions, this is the real novelty
of Thornton’s work: he was the first to raise, and to try to investigate, the question of the
practical consequences of the failures of Grassmann’s assumptions. The publications of the
following years dealt mostly with discussion of Thornton’s results on various levels, and did not
add much to our knowledge of additivity failures; hence we conclude this review here.
2.5.5. Summary
Grassmann’s assumption of additivity and its quantitative formulation, the Trichromatic
Generalisation, allow quantities of colour stimuli to be handled as ordinary algebraic values, and are
the basis of colorimetry. Because of its importance, the additivity law has been the subject of tests
almost ever since the system of colorimetry was established. The assumptions
were found to fail by almost all the researchers. Invariably, the failures were of a consistent
nature, and seemed to point to some elusive factor which is not taken into account in our
understanding of colour vision and colour matching phenomena. This situation was summarised
by Wyszecki and Stiles (Wyszecki and Stiles 1982):
The fact that all investigators independently obtain similar deviations,
particularly in regard to their direction in the (r,g)-chromaticity diagram, is a
strong argument in favour of considering the deviations to be significant.
However, to our knowledge, no definitive physiological model has emerged as yet
which would predict the deviations quantitatively.
The question “do additivity failures exist?” had already been answered positively by the 1970s.
Zaidi (Zaidi 1986) seems to be the first to try to answer the question “what is the cause
of additivity failures?” in a systematic empirical study, thus transferring the problem from
the area of “applied” colour science (i.e. colorimetry) to vision research. Colorimetry
is left to answer the remaining question: “What do additivity failures mean in practical
terms?” If the failures do not lead to any visually appreciable errors in colour stimulus
specification, then the problem belongs solely in the academic domain of vision research.
However, if they do have practical consequences, these need to be characterised and quantified,
methods of estimating and compensating for these consequences need to be developed, and
colorimetry needs to be updated.
By the time of Thornton’s first paper in the series (Thornton 1992a), another empirical study to
show that additivity failures exist was not justified, and his results were highly predictable.
However, the importance of his papers in provoking a debate on the practical implications of the
failures can hardly be overestimated; an entire CIE symposium was dedicated to the discussion of
his results (CIE 1993). All the more disappointing are the many responses to Thornton’s
findings which, instead of trying to move further in estimating the consequences of additivity
failures, or at least to replicate the results, chose to concentrate on attempts to undermine their
significance. Thornton, for his part, attempted to explain his results by postulating hypotheses in
vision theory that almost all the experts found untenable. However, instead of questioning
Thornton’s theories, many critics questioned his experimental results. This situation was
characterized by Billmeyer with a quote from Lev Tolstoy (Billmeyer 1993):
Most men can seldom accept even the simplest and most obvious truth if it obliges
them to admit the falsity of conclusions which they have delighted in explaining to
colleagues, which they have proudly taught others, and which they have woven
thread by thread into the fabric of their lives.
So far, not much progress has been made, and both questions posed above remain open:
− The question of colour vision research: What are the physiological and functional
causes of additivity failures?
− The question of colorimetry: What are the practical consequences of additivity failures?
2.6. Variability of colour matching functions
2.6.1. Introduction: variability of the CIE Standard Colorimetric Observers

There is no such thing as an average eye any more than there is an average
observer or a man in the street.
(Weale 1968)
From the discussion of the physiological construction of the eye in section 2.1, it is evident that
there is almost no single element in this structure that does not vary in its optical properties from
one observer to another. The variations are of varying extent, and have varying effects on
colour vision. In this section we discuss one practical consequence of these variations,
observer metamerism: the phenomenon in which one observer perceives a metameric match where
another perceives a mismatch.
The variability of colour matching properties is best evaluated from individual variations in
colour matching functions. However, a colour matching experiment is a lengthy and complex
project, which involves considerable investment in equipment and the effort and dedication of many
observers. This is the main reason that, despite the advances in technology and experimental
methods, the only large-scale colour matching studies carried out in the past century were those
which led to the establishment of each of the CIE Standard Colorimetric Observers.
The CIE 1931 Standard Colorimetric Observer is based on results of two sets of measurements
of colour matching functions, carried out independently by Wright (Wright 1928) and Guild
(Guild 1931). Wright measured the CMF of ten observers, and Guild those of seven. The
resulting sets of individual CMF are shown in Figure 2.6.1-2 (A-B). The variability between the
colour matching functions is apparent. Wright’s data also provide the first evidence that
this variability results not only from prereceptoral absorption. He developed a special method of
normalisation, now known by his initials as WDW normalisation, which eliminates
all the variability caused by inert filters in the eye. The remaining variability can only be due
to variations in receptors or post-receptoral processing and, of course, experimental
uncertainty. Unfortunately, the original data of the Wright and Guild experiments have been lost,
thus no analysis can be made beyond visual examination of the plots.
Figure 2.6.1-1 shows the results of the preparatory study by Wright, in which 36 observers made
narrow-band matches of a test white light. The plot shows a wide spread of chromaticities, and is
the first example of the method used extensively in the present study, whereby
matches made by different observers are plotted in the diagram of some reference observer. For
every data point, the distance from the mean match is representative of the colour difference that
the “mean” observer would see when presented with the test and the individual observer’s match.
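The distance-from-mean evaluation can be sketched in a few lines of Python. This is a minimal illustration with hypothetical chromaticity coordinates; it uses plain Euclidean distance in the chromaticity plane, which is only a rough proxy for perceived colour difference because chromaticity diagrams are not perceptually uniform.

```python
import numpy as np

def distances_from_mean(chromaticities):
    """Return the mean match and each observer's distance from it.

    chromaticities: sequence of (x, y) pairs, one per observer, all expressed
    in the chromaticity diagram of the same reference observer.
    """
    xy = np.asarray(chromaticities, dtype=float)
    mean = xy.mean(axis=0)
    return mean, np.linalg.norm(xy - mean, axis=1)

# Hypothetical matches made by three observers.
matches = [(0.30, 0.30), (0.32, 0.30), (0.31, 0.33)]
mean, dist = distances_from_mean(matches)
print(mean, dist)
```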
Figure 2.6.1-1. Plot of WDW-normalised chromaticities of white mixture matches to the standard white
of CCT c. 4800K made by 36 observers in Wright (Wright 1928) colour matching investigation.
Reproduced from (Wyszecki and Stiles 1982)
Fortunately, the Stiles and Burch data, the main basis of the CIE 1964 Standard Colorimetric Observer
(Figure 2.6.1-2 C), are available. This is the largest and highest-quality colour matching data set
in existence, and it has been used by numerous researchers ever since it was produced. Some of
these studies will be reviewed later in this section, along with others which discuss various
aspects and implications of variability of colour matching among humans with normal colour
vision.
[Figure 2.6.1-2, panels A) to C). Recoverable axis labels: tristimulus value against wavelength λ, 350 to 750 nm.]
Figure 2.6.1-2. Colour matching data sets which were the basis of the CIE 1931 and 1964 Standard
Colorimetric Observers.
A) 10 observers by (Wright 1928); WDW-normalised. Reproduced from (Wright 1964);
B) 7 observers by (Guild 1931). Reproduced from (Wright 1964);
C) 49 observers by (Stiles and Burch 1959)
2.6.2. Stiles and Burch colour matching study
In 1958, Stiles and Burch presented their final report (Stiles and Burch 1959) of the colour
matching investigation. 49 observers made 10° maximum saturation colour matches using
narrow-band primary and test stimuli. The field was circular, horizontally divided, with a
surrounding 14° field of the same spectral composition as the test stimulus. The retinal
illuminance levels ranged from ≈20 td for stimuli at the blue end of the spectrum to a maximum of
≈1000 td in the greens.
2.6.2.1. Intra-observer repeatability
Two observers made repeated measurements (four and five times), and provided data for
evaluation of intra-observer variability. The highest variations were in blue tristimulus values –
about 60% higher than in red and green ones: 2.9%, 3.0% and 4.6% for red, green and blue,
respectively. The lowest relative variability was always in the region where the corresponding
tristimulus values are largest. It is noted, however, that the two observers were experienced in
colour matching and were expected to have better repeatability than the rest of the group.
2.6.2.2. Inter-observer variability
The variations between the colour matching functions measured by different observers provide
a measure of variations in their colour-matching properties. However, Stiles and Burch note that
certain features in the behaviour of the relative standard deviation as a function of
wavelength can be expected from the mathematical treatment of colour matching
data. Thus, each CMF is normalised to unity at the primary wavelength, a normalisation which
is required by the definition of the colour matching experiment: a unit of primary light must be equal to
itself. This means that the standard deviation of a CMF will approach zero as the primary
wavelength is approached. Another expected feature is that the relative standard deviation for
a given primary light will have large values in the regions where the corresponding CMF approaches
zero, that is, at the wavelengths of the other two primaries.
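Both expected features can be reproduced with simulated data. The sketch below, in Python, uses an invented CMF shape, an invented primary wavelength and invented observer-to-observer differences (a small spectral shift plus noise); in this toy curve the zero crossings fall at arbitrary wavelengths rather than at real primaries. It is meant only to show the effect of the normalisation, not to model the Stiles and Burch data.

```python
import numpy as np

rng = np.random.default_rng(1)
wl = np.arange(400, 701, 5)
primary_nm = 530                        # hypothetical primary wavelength
i_primary = int(np.argmin(np.abs(wl - primary_nm)))

# Hypothetical CMFs for 20 simulated observers: one shared shape that crosses
# zero, with a small individual spectral shift and measurement noise.
shifts = rng.normal(0.0, 3.0, size=20)
raw = np.array([
    np.exp(-0.5 * ((wl - primary_nm - s) / 35.0) ** 2) - 0.05
    + rng.normal(0.0, 0.005, wl.size)
    for s in shifts
])

# Normalise every observer's CMF to unity at the primary wavelength.
cmfs = raw / raw[:, [i_primary]]

mean = cmfs.mean(axis=0)
sd = cmfs.std(axis=0, ddof=1)
rel_sd = sd / np.abs(mean)

i_zero = int(np.argmin(np.abs(mean)))   # wavelength where the mean CMF is nearest zero
print(sd[i_primary], wl[i_zero], rel_sd[i_zero])
```

The standard deviation vanishes at the primary wavelength by construction, and the relative standard deviation peaks where the mean CMF passes close to zero, regardless of any physiological variation.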
Correlations between the three CMF variations are also considered, and it is suggested that the
high correlation in the blue region of the spectrum is related to the presence of yellow lens
pigmentation, which absorbs light reaching all three photoreceptors to a similar degree. Also,
from the analysis of WDW-normalised data, it is noted that the variations are higher than those
expected from mere variations in prereceptoral filtering; i.e. the differences between subjects
result not only from the differences in lens and macular pigmentation.
Still on the subject of prereceptoral filtering, an attempt was made to correlate the colour-
matching results with variations in lens pigment density, which is known to increase with age.
Although some correlation could be found, it was not possible to separate the effect of
variations in macular pigment from that of the lens; the variations within each age group were
very large, rendering the statistical significance of the age effect very low.
There is correlation between individual variations at different wavelengths. Therefore the
standard deviation of the CMF alone does not give a full specification of inter-observer variability,
and cannot be used to estimate the level of observer metamerism for a given metameric pair.
This remark has important consequences for the future development of the Standard Deviate
Observer.
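The reason can be shown with a two-wavelength toy example in Python. Under a linear model, the mismatch that an individual observer sees in a pair that matches for the mean observer is a weighted sum of that observer's CMF deviations, with the spectral difference of the pair as weights; the variance of this sum depends on the full covariance matrix and not only on its diagonal. The numbers below are hypothetical.

```python
import numpy as np

def mismatch_variance_full(spectral_difference, cov):
    """Variance of the mismatch using the full inter-observer covariance matrix."""
    w = np.asarray(spectral_difference, dtype=float)
    return float(w @ np.asarray(cov, dtype=float) @ w)

def mismatch_variance_sd_only(spectral_difference, cov):
    """Variance obtained when correlations between wavelengths are ignored."""
    w = np.asarray(spectral_difference, dtype=float)
    return float((w ** 2) @ np.diag(np.asarray(cov, dtype=float)))

# Hypothetical case: unit standard deviation at both wavelengths, correlation 0.9.
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
delta = np.array([1.0, -1.0])   # spectral difference of the metameric pair

print(mismatch_variance_full(delta, cov))      # about 0.2
print(mismatch_variance_sd_only(delta, cov))   # 2.0
```

In this example the standard deviations alone overestimate the variance tenfold; with a different sign pattern in the spectral difference they would underestimate it instead.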
2.6.3. Later analyses of Stiles and Burch dataset
Later analyses of the Stiles & Burch data concentrated mostly on theoretical identification
of the components of the visual system that are responsible for the variation in colour
matches.
2.6.3.1. Smith et al (1975)
Lens and macular pigment are well-known and well-understood causes of individual variations in
CMF. As the results of both large colour-matching investigations show, prereceptoral filtering is not
the sole cause of such variations. In WDW coordinates (Wright 1928), the variability can only
be the result of variations in properties of the receptors themselves or of the post-receptoral
mechanisms, as the effect of inert filtering is eliminated.
Smith and co-workers performed a theoretical analysis of possible causes of these variations
(Smith et al. 1975). The colour matching data used as the reference were the Stiles 2° pilot data
produced during the preparations for the large-field investigation (Stiles 1955), with WDW
normalisation applied. In their analytical model, a set of basic spectra was defined, which had
the properties of the visual pigments of the photoreceptors. Various factors affecting the cone
sensitivity were introduced:
− Variation in effective optical density of the cones determined by the length of the outer
cone segment and its orientation
− Shifts in peak cone sensitivity
− Density of a spectral filter
The modelling has shown that no single factor can account for the variability in the measured CMF;
it must be a combination of two: variation of photopigment absorption and some spectral filter
with varying maximum density, which is an integral part of the cone outer segments. The authors
regard the assumption of such a filter as a “speculation”, as there is no evidence of its existence.*
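How the three factors listed above shape a cone's spectral sensitivity can be sketched as follows. This Python example is not the model of Smith and co-workers: the Gaussian pigment template, its width and all parameter values are assumptions made for illustration. The only standard ingredients are the self-screening relation, in which the fraction of light absorbed is 1 − 10^(−D·a(λ)) for peak optical density D and normalised absorbance a(λ), and the transmittance 10^(−density) of an inert filter.

```python
import numpy as np

def cone_sensitivity(wl, peak_nm, optical_density, filter_density=0.0, filter_shape=None):
    """Peak-normalised cone sensitivity from a hypothetical pigment template."""
    wl = np.asarray(wl, dtype=float)
    # Hypothetical Gaussian template for the pigment's normalised absorbance.
    absorbance = np.exp(-0.5 * ((wl - peak_nm) / 45.0) ** 2)
    # Self-screening: fraction of light absorbed at the given peak optical density.
    absorptance = 1.0 - 10.0 ** (-optical_density * absorbance)
    if filter_shape is None:
        filter_shape = np.zeros_like(wl)
    # Inert spectral filter in front of the pigment.
    transmittance = 10.0 ** (-filter_density * np.asarray(filter_shape, dtype=float))
    sensitivity = absorptance * transmittance
    return sensitivity / sensitivity.max()

wl = np.arange(400, 701, 5)
thin = cone_sensitivity(wl, 560, optical_density=0.2)
thick = cone_sensitivity(wl, 560, optical_density=0.6)      # higher effective optical density
shifted = cone_sensitivity(wl, 565, optical_density=0.2)    # shifted peak sensitivity
filtered = cone_sensitivity(wl, 560, 0.2, filter_density=0.5,
                            filter_shape=(wl < 500).astype(float))
```

Raising the optical density broadens the normalised curve, shifting the peak translates it along the wavelength axis, and the filter depresses only the band it covers, so the three factors leave different signatures in the CMF.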
2.6.3.2. (Webster and MacLeod 1988) and (Webster 1992)
(Webster and MacLeod 1988) performed a two-step analysis of the Stiles and Burch 10° data. First,
they used the statistical technique of factor analysis to determine the patterns of variation in the
colour matching data. Second, they attempted to identify the physiological variables in the visual
pathway that can cause such patterns.
They identified three main causes for the variations: differences in the inert prereceptoral
filtering (lens + macular pigment), density and spectral shift of photopigment, and the degree
and form of rod intrusion. Variation in other physiological factors, if it exists, is not expected to
be significant, as the named three account for almost the entire variability in the data.
Specifically, the following variation values are suggested:
− Standard deviation of 0.12 in macular pigment density at 460 nm
− Standard deviation of 0.12-0.18 for the lens density at 400 nm. Consistent with the
report by Stiles and Burch, only weak correlation was found with the observers’ age.
− The variation in density of the photopigments is estimated at a standard deviation of 0.045.
Moreover, the densities of all photopigments seem to vary in a correlated manner.
It is suggested that such correlation can result from the lengths of the outer segments of all
receptors varying to the same degree.
− The variation in the peak wavelength of the photopigments is estimated at 1.5, 0.9 and 0.8 nm
for L, M and S respectively.
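These estimates can be turned into a population of simulated observers. The Python sketch below draws deviations from the mean for each factor. It assumes normal distributions, independence between factors, a single photopigment density deviation shared by all cone classes (following the reported correlation), and the midpoint 0.15 of the quoted 0.12-0.18 range for the lens; none of these choices comes from Webster and MacLeod.

```python
import numpy as np

# Standard deviations quoted above. The lens value is the midpoint of the
# quoted range 0.12-0.18, which is an assumption made for this sketch.
SD = {
    "macular_density_460nm": 0.12,
    "lens_density_400nm": 0.15,
    "photopigment_density": 0.045,   # one shared value for L, M and S (assumption)
    "peak_shift_L_nm": 1.5,
    "peak_shift_M_nm": 0.9,
    "peak_shift_S_nm": 0.8,
}

def sample_observer_deviations(n, seed=0):
    """Draw n simulated observers as deviations from the mean of each factor."""
    rng = np.random.default_rng(seed)
    return {name: rng.normal(0.0, sd, size=n) for name, sd in SD.items()}

deviations = sample_observer_deviations(10000)
for name, values in deviations.items():
    print(name, round(float(values.std(ddof=1)), 4))
```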
All the above estimates agree quite well with the reports based on psychophysical and
physiological studies known at the time. Similar estimates are obtained by applying the same
analysis technique to the Stiles 2° data, albeit with somewhat larger uncertainties due to the
smaller number of observers.
In the follow-up paper (Webster 1992) Webster discusses two possible distributions of the peak
wavelength variations of cone photopigments derived from the analysis of the same 10° Stiles &
Burch data: one that suggests a continuous distribution (Webster and MacLeod 1988), and one
that suggests discrete groupings (Neitz and Jacobs 1989). He concludes that there is no evidence
for a multimodal distribution in the data, and that the results published by Neitz and Jacobs are
affected by the choice of statistical analysis method.
* As another speculation, we can draw attention to the selective change in blue photopigment density shape
postulated by (Zaidi 1986), which can have similar origins.
We note that there is no contradiction between the knowledge of discrete variation in cone peak
wavelength established from physiological and molecular biology studies and the report of
MacLeod, if it is postulated that the same observer can have more than one type of cone of each
class in his retina.
2.6.3.3. Viénot (1977, 1980, 1987)
In (Viénot 1977b), four observers made repeated 10° maximum saturation colour matches on a
visual colorimeter constructed by Viénot (Viénot 1977a), with the primary lights at 627, 526
and 466 nm. The colorimeter displayed a vertically-divided bipartite field on a 30° white surround,
the luminance of which could be adjusted. The luminance levels of the matches were between 250
and 5000 td, while the radiance of the test lights was maintained constant throughout the spectrum
above 466 nm.
Each of the eight test stimuli was matched by each observer as many as 30 times, which
provided enough data to evaluate the intra-observer statistics of colour matching. The ratio
of inter- to intra-observer variation is about 4, comparable with the previous report
(Stiles and Burch 1959).
In (Viénot 1980), the relation between inter- and intra-observer variability is
developed further, reporting data collected from 10 observers. On average, the individual
variations were of the order of 10-20 per cent of the mean. The intra-observer variations were, as
expected, found to depend on the observer’s experience, with the more experienced being
more stable. The differences in intra-individual variations are very large, up to a ratio
of 7. A somewhat surprising result was, however, that the intra-observer variability can
sometimes be higher than the inter-observer one. For most of the spectrum, the variability was
found to be similar to that in the Stiles & Burch data (after it was transformed to instrumental
primaries); however, it is much higher at the extremes of the spectrum. The discrepancies can
possibly be explained by differences in experimental technique.
This is, to our knowledge, the first attempt to correlate the variability of colour
matching functions with the variability in matching metameric samples representative of real-world
object colours. Figure 2.6.3-1 shows individual results on the chromaticity diagram, with the
Davidson & Hemmendinger colour rule (Kaiser and Hemmendinger 1980) results next to each
data point. Little or no correlation between the two sets can be seen, as in some cases observers
having similar D&H scores make matches on two opposite sides of the distribution*.
Figure 2.6.3-1. Colour matches of each of Viénot’s (Viénot 1980) observers in CIE 1964 chromaticity
diagram transformed to instrumental primaries.
The data labels indicate each observer’s result on D&H colour rule. Reproduced from (Viénot 1980)
Further on, Viénot discusses possible causes of intra-observer variability. The Maxwell spot distracts
the observer’s attention, and forces the development of some criteria of equality of the two hemi-fields.
Effectively it makes the match less symmetric; the developed criteria are unknown, and can
vary between observers.
In the conclusions, it is suggested that the chromaticity diagram is not a good tool for evaluation of
individual observer variability, and that intra-observer variability is a significant part of it.
The subject of the strategy that observers use in their decision-making when colour
matching is developed in (Viénot 1987). The tristimulus values of repeated matches made by
observers were transformed to cone excitation values, and analysed for correlation. Strong
correlation was found in most of the cases. This was interpreted as an indication that observers do
not seek to achieve a match by equating the cone signals from the two hemi-fields, as
classical colour matching theory says, but by equating the cone excitation ratios, mostly between
the red and the green cones, e.g. based on post-receptoral processing. The practical
consequence of this is that the distribution of the matches is not random about the mean, but is
restricted to a certain range in which the cone excitation ratios are nearly equal. The effect seems
to be stronger when the matching field gets larger to include the fovea. Another way to phrase
this conclusion is: the larger the matching field, the more the match is guided by post-receptoral
processes, hence the less quantal it becomes.
* This discrepancy between the two can be viewed as an early indication of the future failure of the CIE Standard
Deviate Observer (SDO, CIE 1989) (to be discussed in the next section).
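The kind of correlation analysis described for repeated matches can be sketched in Python. The Hunt-Pointer-Estevez matrix is used here only as a stand-in for whatever transformation from instrumental tristimulus values to cone excitations was actually applied, and the repeated matches are synthetic: they are built so that L and M excitations keep a constant ratio, which is the behaviour the interpretation above attributes to the observers.

```python
import numpy as np

# Hunt-Pointer-Estevez XYZ -> LMS matrix, used purely as a stand-in (assumption).
XYZ_TO_LMS = np.array([[0.38971, 0.68898, -0.07868],
                       [-0.22981, 1.18340, 0.04641],
                       [0.0, 0.0, 1.0]])

def cone_correlations(tristimulus, matrix=XYZ_TO_LMS):
    """Correlation matrix of L, M, S excitations over repeated matches.

    tristimulus: array of shape (n_matches, 3).
    """
    lms = np.asarray(tristimulus, dtype=float) @ matrix.T
    return np.corrcoef(lms, rowvar=False)

# Synthetic repeated matches: L and M share one random factor, S varies freely.
rng = np.random.default_rng(0)
common = rng.uniform(0.9, 1.1, size=30)
lms_true = np.column_stack([10.0 * common, 8.0 * common,
                            rng.uniform(1.0, 2.0, size=30)])
xyz = lms_true @ np.linalg.inv(XYZ_TO_LMS).T

corr = cone_correlations(xyz)
print(corr[0, 1])  # L-M correlation, close to 1 for these synthetic matches
```

Matches scattered randomly and independently about the mean in each cone class would instead give correlations near zero.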
2.6.4. North and Fairchild (1993)
North and Fairchild (North and Fairchild 1993a; North and Fairchild 1993b) constructed a
visual colorimeter which used a CRT display for its primaries, and narrow-band
interference-filtered lights as test lights. 18 observers made Maxwell 2° matches. One observer made 20
measurements of his CMF. The authors used a new model for the determination of colour matching
functions (Fairchild 1989), which requires only a limited number of stimuli to be measured.
The single-observer results show very good repeatability and good correspondence with the CMF
measured by the same observer using a different visual colorimeter. The inter-observer variations
are about ten times larger than the intra-observer ones, and are in good agreement with those in
Stiles’ (Stiles and Burch 1959) study. An extensive evaluation of the performance of the CIE
Standard Deviate Observer is reported in this paper; it will be reviewed in due course in section
2.8.
2.7. Observer metamerism and real-world metamers
2.7.1. Introduction
The practical consequence of individual variability in CMF is observer metamerism: a
phenomenon in which a pair of stimuli make a metameric match for one observer (say, the
Standard Colorimetric Observer), but a mismatch for another. Virtually all colour matches in
industry are metameric, unless the match is between identical or very similar materials;
therefore observer metamerism presumably poses a significant industrial challenge. In this
section, publications will be reviewed which deal with evaluation of observer metamerism and
its consequences in conditions relevant to industry.
2.7.2. Experiments with Davidson & Hemmendinger (D&H) rule
The D&H rule is a simple device, in which two series of paint patches with different spectral
reflectances slide one against another. One pair of patches, one from each series, can be seen at any
given time through a rectangular opening. The spectral reflectances of the patches are designed
in such a way that there is always a pair of patches which is metameric with respect to a given
combination of illuminant and observer. The observer’s task is to slide the two series until she
finds a pair which matches in colour. If the colour matching properties of the observer change while
the illuminant remains the same, the matching pair will change as well. The size of the opening
corresponds to an approximately 2° viewing field at a distance of 57 cm. Every patch is identified
with either a number or a letter, so the result of the experiment is a combination of the two, as was
shown in Figure 2.6.3-1.
The device can be used to identify variations in colour matching functions of individual
observers in a very simple and straightforward manner. Kaiser and Hemmendinger (Kaiser and
Hemmendinger 1980) report results of observations made by 59 observers, 17 to 64 years old,
under two illuminants. There is a clear correlation between the result and age, and the
relation between them is linear. The increase in settings between 17 and 60 years of age is about
eight units, which is about twice as large as the variation within each age group. An interesting
parallel is drawn between the effect of the observer’s age and a change of illuminant: the
yellowing of the aging lens is analogous to placing a colour temperature reduction filter in the
eye’s optical path.
Billmeyer and Saltzman (Billmeyer and Saltzman 1980) published a short note reporting results
of a similar study with 72 students as observers, mostly of relatively young age. The note was
meant to show the significance of observer metamerism in surface colour matching.
Nardi (Nardi 1980) was able to reduce the variations in D&H rule matches significantly
relative to those reported by Billmeyer and Saltzman (Billmeyer and Saltzman 1980) by
restricting the age of his observers. These were college students aged 17-29, tested for normal
colour vision.
2.7.3. Observer metamerism in cross-media colour matching
With the introduction of colour computer displays into the graphic arts industry, the question of
reliable on-screen colour reproduction became more relevant. A practice of soft-proofing began
to develop, whereby the displays were required to reproduce a colour-accurate simulation of
object colour. The colour match between a display and an object is essentially metameric, as the
physical properties of inks, paints and dyes differ from the properties of the computer display
phosphors. Thus it is subject to observer metamerism: one observer, say a workstation operator,
may see a colour match between the display and the hardcopy, while another, say his client, sees
a mismatch. The reproduction of hard-copy colours on a computer display came to be known as
cross-media colour reproduction. The present review will discuss reports of studies of
cross-media reproduction with respect to observer metamerism.
2.7.3.1. Pobboravsky (1988)
(Pobboravsky 1988) defined the problem of observer metamerism in cross-media colour
reproduction by two questions:
Given a pair of soft and hard colors which match to a reference observer, do they
match to other observers, and if not, how large a color difference is seen between
them?
Pobboravsky used two approaches to answer these questions. First, he computationally evaluated
the spread of matches that would be expected from the published sub-set of CMF of 20 observers
(Wyszecki and Stiles 1982). This was done in three steps:
1. Given a surface colour sample, knowledge of the computer display primaries, and the CMF of
the CIE Standard Colorimetric Observer, generate a spectral power distribution of the
monitor such that its CIE XYZ tristimulus values are identical to those of the surface sample.
2. Compute 20 sets of tristimulus values using each of the 20 “alternative” observers’
CMF.
3. Compute the colour difference between the CIE XYZ and each of the “alternative” XYZ.
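The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pobboravsky’s actual procedure or data: the Gaussian "colour matching functions", the display primaries and the sample spectrum are all synthetic; the names `display_metamer` and `observer_mismatches` are hypothetical; step 3 is read here as the difference between the two members of the pair as seen by each alternative observer; and a Euclidean distance in tristimulus space stands in for the CIELAB colour difference.

```python
import numpy as np

def gaussian(wl, peak, width):
    """Synthetic bell-shaped spectrum (placeholder for measured data)."""
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

def display_metamer(sample_spd, primaries, cmf_ref):
    """Step 1: display spectrum whose reference-observer tristimulus
    values equal those of the surface sample.

    sample_spd: (n,) spectrum reflected by the sample
    primaries:  (n, 3) spectra of the three display primaries
    cmf_ref:    (n, 3) colour matching functions of the reference observer
    """
    xyz_sample = cmf_ref.T @ sample_spd       # tristimulus values of the sample
    m = cmf_ref.T @ primaries                 # 3x3: tristimulus values of each primary
    weights = np.linalg.solve(m, xyz_sample)  # amounts of the primaries
    return primaries @ weights

def observer_mismatches(sample_spd, display_spd, observer_cmfs):
    """Steps 2 and 3: for each alternative observer, the distance between
    the tristimulus values of the sample and of the display."""
    return np.array([
        np.linalg.norm(cmf.T @ sample_spd - cmf.T @ display_spd)
        for cmf in observer_cmfs
    ])

# Synthetic demonstration data (assumptions, not published CMF).
wl = np.arange(400.0, 701.0, 10.0)
cmf_ref = np.stack([gaussian(wl, 600, 40), gaussian(wl, 555, 45),
                    gaussian(wl, 445, 25)], axis=1)
cmf_alt = np.stack([gaussian(wl, 603, 40), gaussian(wl, 557, 45),
                    gaussian(wl, 447, 25)], axis=1)
primaries = np.stack([gaussian(wl, 610, 20), gaussian(wl, 540, 30),
                      gaussian(wl, 450, 20)], axis=1)
sample_spd = 0.2 + 0.5 * gaussian(wl, 580, 60)

display_spd = display_metamer(sample_spd, primaries, cmf_ref)
mismatch = observer_mismatches(sample_spd, display_spd, [cmf_ref, cmf_alt])
```

By construction the first entry of `mismatch` (the reference observer) is zero up to rounding, while the shifted "alternative" observer sees a residual difference. The sketch does not check that the solved primary amounts are non-negative, which a real display would require.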
Pobboravsky made calculations for 625 metameric pairs. The mean calculated colour difference
was 1.36 CIELAB units, with 95% of the values below 3 units and the majority of differences
about 1 unit. Pobboravsky references unpublished work by Roy Berns, who did similar
calculations for a different type of metamers and arrived at similar results. These colour
differences are considered to be surprisingly small.
In the second stage, a psychophysical experiment was carried out. Six observers were screened
with the D&H rule; two of them were anomalous trichromats and the rest were normal. A
cross-media colour matching experiment was carried out with five hard-copy samples and a CRT
display, in two conditions: one with the stimuli positioned adjacent to each other with a 3 mm
gap between them, and another with a distance of about 57 cm between the samples.
The finding was quite definite:
Color vision differences between normal observers appear to pose no problem
for the comparison of soft and hard proofs.
All the matches made by a normal observer were acceptable to all the other normal observers.
However, matches made by anomalous trichromats were “clearly unacceptable” to the normals,
and vice versa. There was no difference between the results of the two viewing configurations.
The outcome of Pobboravsky’s study seems to indicate why no significant problem was detected
in the industries using computer monitors as proofing devices, despite well-known variations in
colour matching properties among observers. Unfortunately, Pobboravsky did not publish the
colour-matching data from the experiment, nor did he publish the plots of spectral power
distributions of the hard-copy samples. Hence his report, although very important, is no more
than qualitative.
2.7.3.2. Rich and Jalijali (1995)
In response to the North and Fairchild publication (North and Fairchild 1993b) reporting large
individual variability in colour matches, Rich and Jalijali (Rich and Jalijali 1995) published a
report of an experiment very similar in principle to Pobboravsky’s (Pobboravsky 1988), but
with rather different results and conclusions. Twenty-six observers matched the colour of the
computer display to a white porcelain tile and seven metameric grey painted paper samples.
Observers viewed the display and the sample through a box with two vertically arranged apertures
≈4° in size. Chromaticity plots show large variations in matches, with the distribution having a
shape similar to that in North and Fairchild’s plots. No numerical data are reported. Most of the
paper is dedicated to reviewing some of the available literature on observer metamerism, and to
discussing its possible consequences for cross-media reproduction, colour difference
evaluation and other applications. Rich and Jalijali conclude with
…a plea for commercially viable special index of metamerism for change in
observer…
2.7.3.3. Alfvin & Fairchild (1997)
An apparatus was constructed which allowed a hybrid of classical and cross-media
colour matching experiments to be carried out (Alfvin and Fairchild 1997). The observer viewed a
2.9° bipartite field: one side showed a reflection of a CRT display through a diffuser, which
eliminated the visibility of scanning lines, and the other a reflection of either a print or a
transparency sample. The observers did not know the origins of the stimuli; both were perceived
as self-luminous colours. The illumination in the cabinet was a simulated CIE D50, providing
luminance of about 50 cd/m² for all the colours. Twenty observers adjusted the CRT primaries
to produce a match with the print or transparency sample.
The results were reported in CIELAB units* and for three sets of CMF: CIE 1931, CIE 1964 and
Stiles & Burch 2°. For the CIE 1931, the mean colour difference from the mean was 2.67 units for
inter-observer variation and 1.35 units for intra-observer variation, with maxima as large as
19.7 and 11.4, respectively.
2.7.4. Summary
In vision research, the Stiles and Burch (Stiles and Burch 1959) set of CMF data has provided an
invaluable source of information about the fundamental causes underlying the variations in
colour matching properties between individuals. The literature on the subject is rich and
continuously developing, as is evident from the review presented in section 2.1 and here. As the
knowledge of the underlying physiology increases, there is less and less need to
resort to the psychophysical data of Stiles and Burch, which is evident from the limited number
of publications in the last decade.
* It should be noted that the use of CIELAB is somewhat problematic in this context, as it is
defined as an object-colour model. However, it is the only model available today which allows
expression of variations in colour matches in numbers which correspond to the perceptual effect
of these variations.
The task of applied colour science and colorimetry, however, lies less in studying the
fundamentals of colour vision than in applying the knowledge about them to solving practical
industrial problems of colour reproduction. Considering this task, the almost complete absence of
studies evaluating observer metamerism in industrially relevant conditions is very
surprising. It seems that there is a marked discrepancy between the declared significance of
observer metamerism in industry and the interest of researchers in carrying out studies that
quantify and characterise the phenomenon. The studies that are reported are either
qualitative in nature, or do not provide much data which would allow understanding of the
underlying mechanisms or modelling of any kind. The first and the most comprehensive study so far,
by (Alfvin and Fairchild 1997), is a step in the right direction, but it still provides mere
quantification, and the applicability of results derived in conditions which are very far from
real industrial ones is questionable.
The reason for this strange situation is perhaps twofold. On the side of vision science, there is a
lack of interest in applying the developed knowledge to the solution of industrial problems. On
the colorimetry side, there is an unexplained reluctance to use the advances of vision science.
In all the discussions on the consequences of observer metamerism, there is one common
feature: it is assumed that observer metamerism is the sole cause of individual variations in
observers’ judgements in industrial conditions. It is not taken into account that observer
metamerism describes a phenomenon which takes place at the level of the cones, and is applicable
to symmetric and quasi-symmetric matches only. In Viénot’s report (Viénot 1987) reviewed
above, there is an indication that even in a bipartite field, as the field size extends to the
fovea the match becomes less quantal, and depends more on postreceptoral processing. It can be
speculated that a match in conditions close to industrial ones, where there is a significant
separation between, say, the monitor and the print, is not quantal, and that most probably
high-level cognitive mechanisms are operating. Is classical colour matching data applicable to
such a case? In fact, our knowledge of the mechanisms of comparison of spatially separated
stimuli is very limited, but there are indications that the latter speculation has a basis in
experimental psychology (Danilova and Mollon 2006).
2.8. The CIE Standard Deviate Observer
Before we try to fix an index for metamerism, we must clearly define just what it
is we are trying to index.
(Allen 1970)
The need for a method to evaluate the effect of observer metamerism on industrial colour
matching was recognised long ago. The first such method seems to have been proposed by Wyszecki
(Wyszecki 1969), then the chairman of the CIE Subcommittee on Degree of Metamerism. He
suggested the use of a number of colour matching functions of different observers (20 in his
example). For each observer’s CMF, the tristimulus values for each member of the metameric
pair are computed, and the colour difference between them is calculated. The degree of observer
metamerism for the given pair is then evaluated as the mean of the 20 colour differences. Wyszecki
regarded this procedure as lengthy and impractical, and offered it more as an example than as a
feasible proposition.
In his conclusions Wyszecki writes:
I am very hopeful that in not too distant future the CIE colorimetry committee will
be able to come forward with a proposal suitable for industrial application.
It was not until 1989 that the CIE published its Publication #80 entitled “Special metamerism
index: change in observer” (CIE 1989). In the present section, the development of the standard
will be discussed, followed by examples of its experimental evaluation.
2.8.1. Nimeroff et al (1961): colour matching and uncertainty
The colour matching functions are experimentally measured values, and as such have
uncertainties associated with them. When CMF are applied to the calculation of tristimulus values,
these uncertainties propagate into the results of the calculations, and need to be estimated.
Nimeroff et al. seem to have been the first to attempt to apply the widely accepted concept of
uncertainty to colorimetry (Nimeroff et al. 1961). They defined a concept of a Complete Standard
Colorimetric Observer System, which
… should contain not only the mean spectral tristimulus functions $\bar{x}_\lambda$,
$\bar{y}_\lambda$ and $\bar{z}_\lambda$ derived from the color mixture data, but should
contain, also, the variances and covariances of these functions as derived from the
within- and between-observer variability of the color mixture data.
This was formulated into a statistical model. Each set of tristimulus values empirically measured
by an observer in the colour matching experiment can be expressed as a function of three
variables:
\[
\begin{aligned}
R_i &= R + b_{Ri} + \varepsilon_{Ri} \\
G_i &= G + b_{Gi} + \varepsilon_{Gi} \\
B_i &= B + b_{Bi} + \varepsilon_{Bi}
\end{aligned}
\tag{2.8.1}
\]
$R_i$, $G_i$ and $B_i$ are the tristimulus values measured by the $i$-th observer; $R$, $G$ and
$B$ are the mean tristimulus values measured by a population of colour-normal observers
representing the “Standard Colorimetric Observer”; $b_{Ri}$, $b_{Gi}$ and $b_{Bi}$ are the
“individual bias” values, or the differences between the Standard Colorimetric Observer and the
individual; and $\varepsilon_{Ri}$, $\varepsilon_{Gi}$ and $\varepsilon_{Bi}$ are the
“observer’s errors” resulting from intra-observer variability: the difference between the
measured and the “true” observer’s tristimulus values. If the CMF result from repeated
measurements made by the same observer, then the expected values of $\varepsilon_{Ri}$,
$\varepsilon_{Gi}$ and $\varepsilon_{Bi}$ can be assumed to be equal to zero, and the mean
tristimulus values can be expressed as
\[
\begin{aligned}
R_i &= R + b_{Ri} \\
G_i &= G + b_{Gi} \\
B_i &= B + b_{Bi}
\end{aligned}
\tag{2.8.2}
\]
Eq. (2.8.2) expresses the model of Standard Deviate Observer as it is used today.
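The decomposition in Eqs. (2.8.1) and (2.8.2) can be illustrated with simulated data. The sketch below is not Nimeroff et al.’s computation: the population mean, the sizes of the bias and error terms, and the numbers of observers and repeats are arbitrary assumptions chosen only to show how the between-observer and within-observer covariances separate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_observers, n_repeats = 10, 20

# Hypothetical population-mean tristimulus values R, G, B for one test light.
true_mean = np.array([0.30, 0.55, 0.15])
# Individual biases b_Ri, b_Gi, b_Bi (between-observer term), assumed normal.
bias = rng.normal(0.0, 0.02, size=(n_observers, 3))
# Observer errors eps_Ri, eps_Gi, eps_Bi (within-observer term), zero mean.
error = rng.normal(0.0, 0.005, size=(n_observers, n_repeats, 3))

# Eq. (2.8.1): every single match is mean + bias + error.
matches = true_mean + bias[:, None, :] + error

# Eq. (2.8.2): averaging the repeats of one observer removes the error term.
observer_means = matches.mean(axis=1)
grand_mean = observer_means.mean(axis=0)      # estimate of R, G, B
bias_est = observer_means - grand_mean        # estimates of the biases

# Covariances of the kind a "complete" observer system would tabulate.
between_cov = np.cov(bias_est, rowvar=False)
within_cov = np.mean([np.cov(m, rowvar=False) for m in matches], axis=0)
```

With these assumed magnitudes the between-observer covariance dominates the within-observer one, which mirrors the qualitative picture reported for real colour matching data.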
Nimeroff et al. evaluated and published tables of variances and covariances developed from
Stiles and Burch data. The application of such system is
..to determine,…, the region of within and between uncertainties; that is, the
extent to which a normal observer tends to make different matches on successive
attempts, and the extent to which different normal observers vary one from
another.
2.8.2. Allen (1970): the definition of Standard Deviate Observer
As a proposal to the CIE subcommittee chaired by Wyszecki, Allen developed a concept of
General Index of Metamerism (Allen 1970). His answer to the question in the epigraph of this
section is:
How much of a color difference will there be, …, among normal observers
looking at two samples which match exactly for the Standard Colorimetric
Observer?
…
The method to be followed is to establish what may be termed, with all due
apologies, a standard deviate observer.
The concept includes a set of colour matching functions $\Delta\bar{x}$, $\Delta\bar{y}$ and
$\Delta\bar{z}$, which differ from the Standard Colorimetric Observer by one standard deviation in
the CMFs of a group of observers at the corresponding wavelength. These standard deviations,
however, are signed, to account for the correlation between variations at different wavelengths.
The deviate colour matching functions are used in the usual way to calculate ∆X, ∆Y and ∆Z values,
which can also have negative signs. The index of metamerism is calculated by computing the colour
difference between X, Y, Z and ∆X, ∆Y, ∆Z.
Allen uses the 20-observer subset of the Stiles and Burch data set from the “Color Science” book
(Wyszecki and Stiles 1982). The proposed Standard Deviate Observer (SDO) is the result of the
analysis of the variances and covariances in the data, and is illustrated in Figure 2.8.2-1. Allen
notes that the application of the proposed SDO can sometimes lead to negative tristimulus
values, but chose to leave it as is for the sake of the integrity of the method. It is clear,
however, that no modern colour difference formula can account for negative tristimulus values,
hence the implementation of the method is not as straightforward as Allen meant it to be.
Figure 2.8.2-1. Allen (Allen 1970) Standard Deviate Observer (dashed line) and the Standard
Colorimetric Observer (solid line).
Reproduced from (Allen 1970).
2.8.3. Nayatani’s proposal for SDO
(Nayatani et al. 1983) applied the statistical technique of singular value decomposition to the
20-observer subset of the Stiles and Burch data set. The technique is described in the original
paper. Not one but four deviation functions were developed, of which only the first is used to
evaluate the degree of observer metamerism.
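The general idea of extracting a small number of deviation functions by singular value decomposition can be sketched as follows. This is a generic illustration, not the procedure of Nayatani et al.: the function name `deviation_functions`, the scaling of each component by its singular value, and the random synthetic "observers" are all assumptions, and the authors’ own normalisation and data treatment may differ.

```python
import numpy as np

def deviation_functions(cmfs, n_components=4):
    """Leading modes of inter-observer variation in a set of CMF.

    cmfs: array of shape (n_observers, n_wavelengths, 3)
    Returns the mean CMF, the first n_components deviation functions
    (each of shape (n_wavelengths, 3)) and all singular values.
    """
    n_obs, n_wl, _ = cmfs.shape
    flat = cmfs.reshape(n_obs, n_wl * 3)          # one row per observer
    mean = flat.mean(axis=0)
    # SVD of the deviations of each observer from the mean observer.
    _, s, vt = np.linalg.svd(flat - mean, full_matrices=False)
    # Scale each mode so that its size is one standard deviation along it.
    scale = s[:n_components] / np.sqrt(n_obs - 1)
    deviations = (vt[:n_components] * scale[:, None]).reshape(
        n_components, n_wl, 3)
    return mean.reshape(n_wl, 3), deviations, s

# Synthetic demonstration: 20 noisy copies of bell-shaped functions.
rng = np.random.default_rng(1)
wl = np.arange(400.0, 701.0, 10.0)
base = np.stack([np.exp(-0.5 * ((wl - p) / w) ** 2)
                 for p, w in [(600, 40), (555, 45), (445, 25)]], axis=1)
cmfs = base + 0.02 * rng.normal(size=(20, wl.size, 3))

mean_cmf, deviations, s = deviation_functions(cmfs)
```

Because the singular values are returned in descending order, the first deviation function captures the largest share of the variance, which is consistent with using it alone for a single-number index.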
The new SDO was tested on two sets of metameric spectral reflectances of 12 and 68 metamers.
The index derived by means of the new SDO was compared with the index derived with
Wyszecki’s (Wyszecki 1969) technique involving all 20 colour matching functions. An
improvement upon Allen’s SDO, and a very good agreement with Wyszecki’s method were
reported (Figure 2.8.3-1).
Figure 2.8.3-1. Comparison of performance of Nayatani’s proposed SDO (Nayatani et al. 1983) with
Allen’s (Allen 1970), with a set of 68 grey metamers.
A) Correlation between the prediction of the new SDO and the index of observer metamerism
calculated with 20 individual CMF (Wyszecki 1969). B) The same with Allen’s SDO.
Reproduced from (Nayatani et al. 1983).
In the follow-up paper (Takahama et al. 1984), the application method of the new SDO is
developed further: now all four deviation functions of (Nayatani et al. 1983) are utilised. The
first deviation is used to evaluate the index of observer metamerism, while all four construct the
confidence ellipsoids defining the range of mismatches expected for a given pair of metamers
viewed by an observer different from the standard one.
2.8.4. Proposal for SDO by Ohta (1985)
(Ohta 1985) proposed an SDO developed by an alternative technique: instead of basing the
analysis on statistics of the colour matching data, he chose to employ a nonlinear optimisation
technique. This technique allows optimising the SDO by its performance on a set of metamers
in the uniform colour-difference space. The results, though, were very similar to those of
Nayatani, despite the very different derivation method.
2.8.5. CIE Publication 80: the Standard Deviate Observer
In 1989, the CIE published (CIE 1989) its Special Metamerism Index: Change in Observer,
referred to as CIE SDO in the following text. The aim of this index is (CIE 1989):
… to provide a method for evaluating the degree of color mismatch for a
metameric color pair (object color or illuminant color) when an actual observer
with normal colour vision is substituted for the standard colorimetric observer.
The CIE SDO was based mostly on (Nayatani et al. 1983) and (Takahama et al. 1984). Four
deviation functions are published, $\Delta\bar{x}_i$, $\Delta\bar{y}_i$ and $\Delta\bar{z}_i$
(i = 1,2,3,4), for the wavelength range of 380-780 nm in 5 nm steps. The first function is used
to define the index of observer metamerism. All four are used to define the confidence ellipse.
2.8.5.1. Calculating the Metamerism Index for change in observer
The calculation procedure for a metameric pair is as follows. Let $Q_1(\lambda)$ and
$Q_2(\lambda)$ be the spectral power distribution functions of a pair of stimuli which are
metameric with respect to the CIE Standard Colorimetric Observer (CIE 1931 or CIE 1964):
\[
Q_1(\lambda) \equiv Q_2(\lambda)
\tag{2.8.3}
\]
The CIE XYZ tristimulus values $X_{ref,j}$, $Y_{ref,j}$ and $Z_{ref,j}$ for the corresponding
“reference” observer are calculated:
\[
\begin{bmatrix} X_{ref,j} \\ Y_{ref,j} \\ Z_{ref,j} \end{bmatrix}
= \sum_{\lambda} Q_j(\lambda)
\begin{bmatrix} \bar{x}_{ref}(\lambda) \\ \bar{y}_{ref}(\lambda) \\ \bar{z}_{ref}(\lambda) \end{bmatrix}
\Delta\lambda
\tag{2.8.4}
\]
where $Q_j(\lambda)$ is the spectral power distribution of the $j$-th stimulus (j = 1,2), and
$\bar{x}_{ref}(\lambda)$, $\bar{y}_{ref}(\lambda)$ and $\bar{z}_{ref}(\lambda)$ are the colour
matching functions of the CIE 1931 or the CIE 1964 Standard Colorimetric Observer. By definition
of the metameric match:
\[
\begin{aligned}
X_{ref,1} &= X_{ref,2} = X_{ref} \\
Y_{ref,1} &= Y_{ref,2} = Y_{ref} \\
Z_{ref,1} &= Z_{ref,2} = Z_{ref}
\end{aligned}
\tag{2.8.5}
\]
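The Standard Deviate Observer CMF are defined as:
\[
\begin{aligned}
\bar{x}_{dev}(\lambda) &= \bar{x}_{ref}(\lambda) + \Delta\bar{x}_1(\lambda) \\
\bar{y}_{dev}(\lambda) &= \bar{y}_{ref}(\lambda) + \Delta\bar{y}_1(\lambda) \\
\bar{z}_{dev}(\lambda) &= \bar{z}_{ref}(\lambda) + \Delta\bar{z}_1(\lambda)
\end{aligned}
\tag{2.8.6}
\]
Two sets of tristimulus values of the members of the metameric pair with respect to the
Standard Deviate Observer are calculated:
\[
\begin{bmatrix} X_{dev,j} \\ Y_{dev,j} \\ Z_{dev,j} \end{bmatrix}
= \sum_{\lambda} Q_j(\lambda)
\begin{bmatrix} \bar{x}_{dev}(\lambda) \\ \bar{y}_{dev}(\lambda) \\ \bar{z}_{dev}(\lambda) \end{bmatrix}
\Delta\lambda
\tag{2.8.7}
\]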
The metamerism index for change in observer $M_{obs}$ for the pair
$(Q_1(\lambda), Q_2(\lambda))$ is defined as
\[
M_{obs} = \Delta E_{obs}\left[ (X_{dev,1}, Y_{dev,1}, Z_{dev,1}),\ (X_{dev,2}, Y_{dev,2}, Z_{dev,2}) \right]
\tag{2.8.8}
\]
where $\Delta E_{obs}$ is the colour difference between the two stimuli calculated in an
approximately uniform colour space. CIE Publication 80 recommends the use of CIELAB or CIELUV.
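The chain of Eqs. (2.8.6) to (2.8.8) can be sketched in Python as below. Several things here are assumptions made for illustration only: the Gaussian functions stand in for the tabulated CIE colour matching functions, `delta1` stands in for the published first deviation function, the metameric pair is built artificially by adding a metameric black to a smooth spectrum, and the white point used for CIELAB is computed with the deviate observer from an equal-energy spectrum, a normalisation that the text above does not specify.

```python
import numpy as np

def gaussian(wl, peak, width):
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

def xyz_to_lab(xyz, white):
    """CIELAB coordinates of XYZ relative to the given white point."""
    t = np.asarray(xyz, dtype=float) / np.asarray(white, dtype=float)
    d = 6.0 / 29.0
    f = np.where(t > d ** 3, np.cbrt(t), t / (3 * d ** 2) + 4.0 / 29.0)
    return np.array([116 * f[1] - 16, 500 * (f[0] - f[1]), 200 * (f[1] - f[2])])

def metamerism_index(q1, q2, cmf_ref, delta1, white_spd, dlam):
    """CIELAB difference of the pair as seen by the deviate observer."""
    cmf_dev = cmf_ref + delta1                 # Eq. (2.8.6)
    white = cmf_dev.T @ white_spd * dlam       # assumed white normalisation
    xyz1 = cmf_dev.T @ q1 * dlam               # Eq. (2.8.7), j = 1
    xyz2 = cmf_dev.T @ q2 * dlam               # Eq. (2.8.7), j = 2
    lab1, lab2 = xyz_to_lab(xyz1, white), xyz_to_lab(xyz2, white)
    return float(np.linalg.norm(lab1 - lab2))  # Eq. (2.8.8)

# Synthetic demonstration data (not CIE tables).
wl = np.arange(400.0, 701.0, 10.0)
dlam = 10.0
cmf_ref = np.stack([gaussian(wl, 600, 40), gaussian(wl, 555, 45),
                    gaussian(wl, 445, 25)], axis=1)
delta1 = np.stack([gaussian(wl, 603, 40), gaussian(wl, 557, 45),
                   gaussian(wl, 447, 25)], axis=1) - cmf_ref
white_spd = np.ones_like(wl)

# Build a pair that is metameric for the reference observer, Eq. (2.8.3):
# add to q1 a spectrum whose reference tristimulus values are all zero.
q1 = 0.4 + 0.3 * gaussian(wl, 560, 70)
v = gaussian(wl, 500, 25) - gaussian(wl, 620, 25)
black = v - cmf_ref @ np.linalg.solve(cmf_ref.T @ cmf_ref, cmf_ref.T @ v)
q2 = q1 + 0.1 * black

m_ref = metamerism_index(q1, q2, cmf_ref, np.zeros_like(cmf_ref), white_spd, dlam)
m_dev = metamerism_index(q1, q2, cmf_ref, delta1, white_spd, dlam)
```

With a zero deviation the index vanishes up to rounding, as Eq. (2.8.5) requires; with a non-zero deviation the same pair no longer matches.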
When the pair is not exactly metameric (that is, it is parameric), as often occurs in practical
situations, i.e.
\[
\begin{aligned}
X_{ref,1} &\neq X_{ref,2} \\
Y_{ref,1} &\neq Y_{ref,2} \\
Z_{ref,1} &\neq Z_{ref,2}
\end{aligned}
\tag{2.8.9}
\]
the tristimulus values of one of the stimuli are defined as reference:
\[
\begin{aligned}
X_{ref} &= X_{ref,1} \\
Y_{ref} &= Y_{ref,1} \\
Z_{ref} &= Z_{ref,1}
\end{aligned}
\tag{2.8.10}
\]
and the “corrected” tristimulus values are computed:
\[
\begin{aligned}
X'_{ref,2} &= X_{ref,2}\,\frac{X_{ref,1}}{X_{ref,2}} \\
Y'_{ref,2} &= Y_{ref,2}\,\frac{Y_{ref,1}}{Y_{ref,2}} \\
Z'_{ref,2} &= Z_{ref,2}\,\frac{Z_{ref,1}}{Z_{ref,2}}
\end{aligned}
\tag{2.8.11}
\]
The values $X'_{ref,2}$, $Y'_{ref,2}$ and $Z'_{ref,2}$ are then used instead of $X_{ref,2}$,
$Y_{ref,2}$ and $Z_{ref,2}$.
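A small numerical sketch of the multiplicative correction of Eq. (2.8.11) follows. The tristimulus values are invented for illustration and the function name is hypothetical. The first returned value applies the correction exactly as written above; the second applies the same three ratios to the deviate-observer values of the second stimulus, which is an assumption about how the correction would carry over to Eq. (2.8.7) and is not stated in the text.

```python
import numpy as np

def multiplicative_correction(xyz_ref_1, xyz_ref_2, xyz_dev_2):
    """Scale the second stimulus so that it matches the first one
    for the reference observer."""
    xyz_ref_1 = np.asarray(xyz_ref_1, dtype=float)
    xyz_ref_2 = np.asarray(xyz_ref_2, dtype=float)
    xyz_dev_2 = np.asarray(xyz_dev_2, dtype=float)
    ratio = xyz_ref_1 / xyz_ref_2          # one factor per tristimulus value
    corrected_ref_2 = xyz_ref_2 * ratio    # Eq. (2.8.11)
    corrected_dev_2 = xyz_dev_2 * ratio    # assumed extension, see lead-in
    return corrected_ref_2, corrected_dev_2

# Invented values for a slightly parameric pair.
xyz_ref_1 = [40.0, 42.0, 30.0]
xyz_ref_2 = [41.0, 42.0, 29.0]
xyz_dev_2 = [41.5, 42.3, 29.4]

corrected_ref_2, corrected_dev_2 = multiplicative_correction(
    xyz_ref_1, xyz_ref_2, xyz_dev_2)
```

After the correction the reference values of the second stimulus coincide with those of the first, so any remaining difference under the deviate observer can be attributed to the change in observer alone.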
2.8.5.2. Constructing the 95% confidence ellipse
All four deviate observer functions are used to estimate the 95% confidence ellipses in
chromaticity or CIELAB a*b* diagrams. First, four sets of deviate tristimulus values are
calculated:
\[
\begin{bmatrix} \Delta^2 X_{dev,i} \\ \Delta^2 Y_{dev,i} \\ \Delta^2 Z_{dev,i} \end{bmatrix}
= \sum_{\lambda} \Delta Q(\lambda)
\begin{bmatrix} \bar{x}_{dev,i}(\lambda) \\ \bar{y}_{dev,i}(\lambda) \\ \bar{z}_{dev,i}(\lambda) \end{bmatrix}
\Delta\lambda
\tag{2.8.12}
\]
Here $i$ (i = 1,2,3,4) is the index of the corresponding deviation CMF function, and
$\Delta Q(\lambda)$ is the difference between the spectral power distributions of the members of
the metameric pair*, i.e.
\[
\Delta Q(\lambda) = Q_1(\lambda) - Q_2(\lambda)
\tag{2.8.13}
\]
The four sets of deviation tristimulus values are used to calculate the covariance matrix and
Jacobian matrix for the transformation to CIELUV or CIELAB space, and to construct the
ellipses as was described in 2.4.7-2.4.8; the explicit formulae will be omitted here.
2.8.5.3. Effect of age
CIE Publication 80 (CIE 1989) also includes a method for evaluating the effect of age on the
index of observer metamerism. A pair which is metameric with respect to an observer of age N1
will mismatch for an observer of age N2 due to the increase in lens pigment density. This mismatch
can be evaluated as follows. The deviation functions $\Delta\bar{x}_1(\lambda, N)$,
$\Delta\bar{y}_1(\lambda, N)$ and $\Delta\bar{z}_1(\lambda, N)$ for age $N$
($20 \le N \le 60$) are estimated as
\[
\begin{bmatrix} \Delta\bar{x}_1(\lambda, N) \\ \Delta\bar{y}_1(\lambda, N) \\ \Delta\bar{z}_1(\lambda, N) \end{bmatrix}
= L(N) \cdot
\begin{bmatrix} \Delta\bar{x}_1(\lambda) \\ \Delta\bar{y}_1(\lambda) \\ \Delta\bar{z}_1(\lambda) \end{bmatrix}
\tag{2.8.14}
\]
where
\[
L(N) = 0.064N - 2.31
\tag{2.8.15}
\]
* This difference is termed metameric black (Wyszecki and Stiles 1982, Cohen 1988).
The CMF of the observer with N years of age are calculated by equation (2.8.6), where
$\Delta\bar{x}_1(\lambda, N)$, $\Delta\bar{y}_1(\lambda, N)$ and $\Delta\bar{z}_1(\lambda, N)$
are used instead of $\Delta\bar{x}_1(\lambda)$, $\Delta\bar{y}_1(\lambda)$ and
$\Delta\bar{z}_1(\lambda)$. The two sets of colour matching functions for ages N1 and N2 are used
to calculate two sets of tristimulus values; the colour difference between the two sets is the
index of observer metamerism due to the difference in age between N1 and N2.
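The age scaling of Eqs. (2.8.14) and (2.8.15) amounts to very little arithmetic, sketched below. The function names are hypothetical, and the arrays used in the demonstration are placeholders rather than the tabulated CIE functions.

```python
import numpy as np

def age_scale(age):
    """L(N) of Eq. (2.8.15); defined in the text for ages 20 to 60."""
    if not 20 <= age <= 60:
        raise ValueError("age must lie between 20 and 60 years")
    return 0.064 * age - 2.31

def aged_cmf(cmf_ref, delta1, age):
    """CMF of an observer of the given age: Eq. (2.8.6) with the first
    deviation function scaled according to Eq. (2.8.14)."""
    return cmf_ref + age_scale(age) * delta1

# Placeholder arrays with the layout (n_wavelengths, 3).
cmf_ref = np.ones((5, 3))
delta1 = 0.01 * np.ones((5, 3))

cmf_20 = aged_cmf(cmf_ref, delta1, 20)
cmf_40 = aged_cmf(cmf_ref, delta1, 40)
cmf_60 = aged_cmf(cmf_ref, delta1, 60)
```

The scale factor runs from about -1.03 at age 20 to about 1.53 at age 60 and crosses zero at N = 2.31/0.064, roughly 36 years, so under this formula the reference CMF correspond to an observer in the mid-thirties.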
2.8.6. Tests of the CIE SDO
2.8.6.1. North and Fairchild (1993) and Nayatani’s response (1994)
As part of their colour matching investigation in the Munsell Color Science Laboratory at RIT,
North and Fairchild (North and Fairchild 1993b) performed a test of the CIE Standard Deviate
Observer. Maxwell-type colour matching data were collected from 18 observers, and also included
20 measurements made by a single observer, used for the evaluation of intra-observer
variability. The correspondence between the experimental results and the prediction of the CIE
SDO was rather poor: the SDO prediction was of the order of the variability within the 20 matches
of the single observer, significantly lower than the inter-observer variability. North and
Fairchild suggest that the discrepancies can be due to the exclusion of some of the Stiles and
Burch (Stiles and Burch 1959) observers from the analysis which led to the establishment of the
CIE SDO (Nayatani et al. 1983; Takahama et al. 1984).
Nayatani (Nayatani 1994) responded to North and Fairchild’s findings by suggesting several
possible sources of the observed discrepancies between the experimental results and the CIE SDO.
The Maxwell method of colour matching is known to have inherent inaccuracies which could have
propagated into the conclusions. The field size in North and Fairchild’s experiment could also
have had an effect: small field matches are known to be less accurate. The model (Fairchild 1989)
developed and used for estimation of the CMF could also be imprecise.
Nayatani closes the lengthy discussion of North and Fairchild’s report with two
important conclusions:
− Due to the way the SDO was developed, it cannot be tested by comparing its
prediction with the variability in colour matching data in absolute terms, but only by
correlation of the results.
− Consequently, the CIE SDO can be used only for comparing the degrees of observer
metamerism between two pairs, and its absolute values are meaningless.
Both statements are in apparent contradiction with the definition of the CIE SDO aims, as cited
above.
2.8.6.2. Alfvin and Fairchild (1997)
The experiment by Alfvin and Fairchild (Alfvin and Fairchild 1997) was described in 2.7.3.3.
As a part of the analysis they compared the variability in their data with the SDO prediction, and
found the SDO to significantly under-predict the real observers’ results, by factors ranging
from ×3 to ×10. Also, no correlation was found between the results and the CIE SDO values.
The plot comparing the 95% ellipses in the CIE a*b* diagram is shown in Figure 2.8.6-1. One
possible source of the discrepancies that is suggested is the normalisation of CMF at the primary
wavelengths, which supposedly has reduced the CMF variability.
Figure 2.8.6-1. Comparison of 95% confidence ellipses constructed from Alfvin and Fairchild’s
observers’ matches (Alfvin and Fairchild 1997) with the prediction of CIE SDO.
Reproduced from (Alfvin and Fairchild 1997)
2.8.7. Development of the SDO: summary
A review of the studies that have led to the establishment of the CIE SDO, and of those which
have followed, leads to the unlikely conclusion that the question posed by Allen in the epigraph
to this section remains unanswered. That is to say, the purposes of the index of observer
metamerism are still not defined, and its application and usefulness are questionable. No doubt
this is the major reason that the index, having been established for almost two decades, is not
used in industry. Moreover, we do not even know whether industry needs one. To our knowledge,
no evaluation of the CIE SDO in conditions resembling any practical industrial application has
been reported. Nevertheless, further attempts to define a new, improved SDO are published
(Martinez et al. 2003; Martinez et al. 2005), while just what needs to be improved in the SDO,
and how this improvement is to be tested, is not known.
2.9. Literature review: summary
The present research deals with issues at the basis of colorimetry; consequently, the volume of
available literature is enormous. Limitations of scope were imposed, hence the review is not
intended to be complete. The aim is to summarise the available knowledge on several topics
directly related to the subject of the study. Hence we started with the discussion of the basic
anatomy of the eye, defined the basic principles of colorimetry, and described the CIE
implementation of these principles as it is used in virtually all industrial applications. Then we
proceeded to describe the available research on subjects that undermine the basis of
colorimetry, namely observer metamerism and the failure of the additivity law.
Having reviewed the state of the art in colorimetry, we are bound to conclude that, at a certain
point in time, colorimetry in a way became disconnected from vision research. This seems to be
true at least for the two subjects we deal with. The consistent reports of additivity failure were
ignored, and the advances in evaluation of individual variability of colour matching properties
and their causes were not implemented in any practically useful way. Moreover, there is not
even a basic understanding of whether the issues in question have any practical implications for
day-to-day colorimetric practice. This defines the starting point of our research.
3. Experiment 1. Colour matching in small and large fields
3.1. Introduction
3.1.1. Experiment 1: the objectives
The objectives of the first experiment are defined as follows:
1. To test the reproducibility of the variability of CMF in the S&B (Stiles and Burch 1959)
colour-matching dataset.
2. To test the reproducibility of Thornton’s (Thornton 1992a; Thornton 1992b) experimental
results concerning the failure of the transformation of tristimulus space (section (3.1.1)).
3.1.1.1. Test of the reproducibility of the variability of CMF in S&B dataset
S&B colour matching investigation with 49 observers provides the highest quality colour
matching dataset available so far. The trials of the CIE 1964 Standard Colorimetric Observer
(Wyszecki 1959; Stiles and Wyszecki 1962; Wyszecki and Stiles 1982), and the experience of
several decades of industrial implementation have shown that the mean colour matching
functions of S&B set do represent the average human observer with normal colour vision for
large field colour matching; thus the validity of the mean CMF themselves is not questioned.
However, when the uncertainty of colour matching is discussed, another question has to be
asked: whether the variability within the S&B dataset is representative of the individual
variability of CMFs in colour-normal population.
Most of the studies of individual observer variability were based on analyses of this dataset, and
eventually it was used for the derivation of the CIE Standard Deviate Observer (SDO) (CIE
1989). The SDO, however, was shown to significantly underestimate the inter-observer
variability (North and Fairchild 1993b; Alfvin 1995) (Section 2.6.4). This can result from one or
more of the following reasons:
1. The variability of the colour-matching data in the S&B dataset is not representative of the
variability of CMF in colour-normals, due to the particular pool of observers, the mathematical
data treatment, etc.
2. The SDO based on the variability of CMF is not applicable to the conditions it was tested
in.
3. The technique used to derive the SDO leads to reduced prediction.
The first experiment of this study was designed to test the first possibility.
3.1.1.2. Test of reproducibility of additivity failures
The magnitudes of additivity failure $A_R$, $A_G$ and $A_B$ (Eq. (3.1.1)) are not readily
available. Moreover, there are no agreed methods for evaluating these magnitudes, nor is there
agreement on whether, and if so how, they need to be accounted for.
The existence of failures of colorimetric additivity due to an unknown adaptation mechanism is
extensively discussed and established (Section 2.5). Additivity failures are also known to occur
as the result of rod participation in colour matching (Wyszecki and Stiles 1982). The practical
implications of both kinds of failures remain unknown. Thornton (Thornton 1992a; Thornton
1992b) (Section 2.5.4) tested the implications of additivity failures for one of the most
commonly used operations in colorimetry, the transformation of tristimulus space, and found
that the discrepancies between the prediction of the model which assumes additivity and the
experimental results are significantly large for an individual observer. This experimental result
has not been reproduced since Thornton, and doing so is another task of this experiment.
3.1.2. Summary of the experimental conditions and results
A colour matching psychophysical experiment of maximum saturation type was conducted on a
group of five observers, using two sets of narrow band primary lights and nine narrow band test
lights, in conditions of low luminance and large bipartite matching field. The uncertainty of the
colour matching data was broken down into three categories according to their source: physical,
psychophysical and additivity failure. The magnitude of uncertainty introduced by each of these
components was evaluated.
The physical uncertainty plays a significant role, accounting on average for about 45-70% of the
variability of colour matching data within a single observer. The intra-observer variability
accounts for about 37-75% of the variability in the mean data of all observers. The variability of
colour matching depends on the choice of the primary lights. The CIE Standard Deviate Observer
significantly under-predicts the variability in our experiment.
There are consistent failures of additivity in individual observer’s data. The characteristics of
the failures depend on the primary set. The magnitude of the failures and their dependence on
the spectral position of the primaries are mostly explained by the model of rod intrusion. When
mean results of all observers are considered, the inter-observer variability in colour matching
data renders the failures of additivity statistically insignificant in most of the cases.
3.2. Experimental
3.2.1. Test of additivity
We tested the additivity assumptions in two ways. In one, we verified the proportionality law in
its strict sense as defined by the third statement of the Trichromatic Generalisation (TG)
(Section 2.2.3). In the other, we tested the application of the TG: the procedure of converting sets of
tristimulus values from one tristimulus space to another (Section 2.2.5). For the sake of
simplicity of notation these two tests are referred to as “Proportionality Test” and “Additivity
Test” in the following text.
3.2.1.1. Proportionality test
The proportionality law states that when each member of a colour matching equation is
multiplied by the same constant, the resulting matching equation holds true (Eq. (2.2.4)). In the
colour matching experiment, the tristimulus values are normalised to some fixed amount of
energy in the test stimulus, or in other words they are proportional to the test stimulus. Hence, if
the proportionality law is valid, tristimulus values of a stimulus are independent of the
luminance level at which this stimulus is measured, as long as the relative spectral power
distribution remains the same. In our study, the same narrow-band colour stimulus was
measured at two luminance levels with the ratio of approximately 1/2. The tristimulus values
thus measured were used to test the validity of the proportionality assumption as follows.
Let U and V be unit amounts of some matching stimulus. Let $U_k$ and $V_k$ be some fractions of
unit amounts of U and V, matched in the tristimulus space (R, G, B) by
\[
\begin{aligned}
U_k &= R_U \mathbf{R} + G_U \mathbf{G} + B_U \mathbf{B} \\
V_k &= R_V \mathbf{R} + G_V \mathbf{G} + B_V \mathbf{B}
\end{aligned}
\tag{3.2.1}
\]
where $R_j$, $G_j$ and $B_j$ (j = U, V) are the tristimulus values of $U_k$ and $V_k$, and $\mathbf{R}$, $\mathbf{G}$ and $\mathbf{B}$ are the unit
amounts of primaries. Assuming the validity of the proportionality law, we can multiply both
sides of Eqs. (3.2.1) by positive values $k_U$ and $k_V$ respectively, defined as
\[
k_U = \frac{U}{U_k}, \qquad k_V = \frac{V}{V_k}
\tag{3.2.2}
\]
This is the normalisation that is applied in the process of derivation of colour matching
functions, where the tristimulus values measured by the observer are normalised to unit amount
of test stimulus. The validity of this procedure depends on the assumption of validity of the
proportionality law.
Two colour matching equations result:
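\[
\begin{aligned}
U &= k_U R_U \mathbf{R} + k_U G_U \mathbf{G} + k_U B_U \mathbf{B} \\
V &= k_V R_V \mathbf{R} + k_V G_V \mathbf{G} + k_V B_V \mathbf{B}
\end{aligned}
\tag{3.2.3}
\]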
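Since U = V by definition, and according to the transitivity law of the TG, the right-hand sides of
Eqs. (3.2.3) are equal:
\[
k_U R_U \mathbf{R} + k_U G_U \mathbf{G} + k_U B_U \mathbf{B} = k_V R_V \mathbf{R} + k_V G_V \mathbf{G} + k_V B_V \mathbf{B}
\tag{3.2.4}
\]
Hence the tristimulus values of U and V are equal:
\[
\begin{aligned}
k_U R_U &= k_V R_V \\
k_U G_U &= k_V G_V \\
k_U B_U &= k_V B_V
\end{aligned}
\tag{3.2.5}
\]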
Let us denote the two vectors of tristimulus values on the left and right sides of Eqs. (3.2.5) as
$T_U$ and $T_V$. The validity of the Proportionality Law in given experimental conditions is tested by
testing the validity of expression (3.2.5):
\[
\begin{aligned}
H_0 &: T_U = T_V \\
H_1 &: T_U \neq T_V
\end{aligned}
\tag{3.2.6}
\]
This test is performed by applying the multivariate statistical test for equality of mean vectors
with unequal covariance matrices, as described in Section 2.4.6.
The magnitude of the proportionality failure $p_i$ for each tristimulus value is evaluated in terms of
“percentage of deviation from proportionality”:
\[
\begin{aligned}
p_R &= \left(1 - \frac{k_U R_U}{k_V R_V}\right) \times 100 \\
p_G &= \left(1 - \frac{k_U G_U}{k_V G_V}\right) \times 100 \\
p_B &= \left(1 - \frac{k_U B_U}{k_V B_V}\right) \times 100
\end{aligned}
\tag{3.2.7}
\]
If proportionality holds, the two tristimulus values should be very similar (ideally the same), so
the values of the above expressions will approach zero.
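The magnitude defined by Eq. (3.2.7) can be sketched in a few lines of Python. This is only an illustration: the function name and the numerical values are hypothetical, and the inputs are assumed to be already normalised to unit test radiance (that is, they are the products $k_U R_U$, $k_V R_V$, and so on).

```python
def proportionality_deviation(t_u, t_v):
    """Percentage deviation from proportionality, following Eq. (3.2.7).

    t_u, t_v: (R, G, B) tristimulus values of the same stimulus matched at
    two luminance levels, each already normalised to unit test radiance.
    """
    return tuple((1.0 - u / v) * 100.0 for u, v in zip(t_u, t_v))


# Hypothetical normalised tristimulus values, for illustration only.
t_level_u = (0.50, 1.00, -0.020)
t_level_v = (0.40, 1.00, -0.025)

p_r, p_g, p_b = proportionality_deviation(t_level_u, t_level_v)
print(p_r, p_g, p_b)
```

A value of zero means that the two normalised tristimulus values coincide; the sign shows which of the two levels gave the larger normalised value.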
3.2.1.2. Additivity test
Under the assumption of additivity, it is possible to transform sets of tristimulus values from
one tristimulus space to another using two methods: forward- or inverse-matrix (Section 2.2.5).
To test these procedures, the same stimulus is measured with two sets of primary lights. In addition,
the primary lights of each set are matched by a mixture of the primaries of the other set, and the
results are used to build the transformation matrix as described by Eqs. (2.2.21) and (2.2.24). Thus, the tristimulus
values of the stimulus can be transformed between the two spaces. The difference between the
two sets of tristimulus values (one calculated and one experimentally
determined) provides an indication of the failure of the principles underlying the procedure,
e.g. additivity failure. A difference between the results obtained by the forward- and inverse-matrix
procedures would provide additional information concerning the implications of the choice
of mathematical method.
Let U be the test stimulus measured in two tristimulus spaces 1 and 2:
\[
\begin{aligned}
U &= R_1 \mathbf{R}_1 + G_1 \mathbf{G}_1 + B_1 \mathbf{B}_1 \\
U &= R_2 \mathbf{R}_2 + G_2 \mathbf{G}_2 + B_2 \mathbf{B}_2
\end{aligned}
\tag{3.2.8}
\]
where $\mathbf{R}_j$, $\mathbf{G}_j$ and $\mathbf{B}_j$ (j = 1, 2) are the unit amounts of primaries in the corresponding system j and
$R_j$, $G_j$ and $B_j$ are the tristimulus values of stimulus U in tristimulus space j. The task is to predict
the tristimulus values of the stimulus U in tristimulus space 2 from its tristimulus values in
system 1; i.e. a perfectly accurate prediction will result in values equal to $R_2$, $G_2$ and $B_2$.
Tristimulus values R1, G1 and B1 are related to R2, G2 and B2 by either of the following
expressions:
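\[
\begin{bmatrix} R_{2,C} & G_{2,C} & B_{2,C} \end{bmatrix}
=
\begin{bmatrix} R_1 & G_1 & B_1 \end{bmatrix}
\begin{bmatrix}
R_{R_1,2} & G_{R_1,2} & B_{R_1,2} \\
R_{G_1,2} & G_{G_1,2} & B_{G_1,2} \\
R_{B_1,2} & G_{B_1,2} & B_{B_1,2}
\end{bmatrix}
\tag{3.2.9}
\]
\[
\begin{bmatrix} R_{2,C} & G_{2,C} & B_{2,C} \end{bmatrix}
=
\begin{bmatrix} R_1 & G_1 & B_1 \end{bmatrix}
\begin{bmatrix}
R_{R_2,1} & G_{R_2,1} & B_{R_2,1} \\
R_{G_2,1} & G_{G_2,1} & B_{G_2,1} \\
R_{B_2,1} & G_{B_2,1} & B_{B_2,1}
\end{bmatrix}^{-1}
\tag{3.2.10}
\]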
The subscript C in the result vector stands for “calculated”. Eq. (3.2.9) is the forward-matrix
transformation, where the elements of the 3×3 transformation matrix are the tristimulus values
of the primary lights of set 1 measured with the primaries of the set 2. Eq. (3.2.10) is the
inverse-matrix transformation, where the elements of the transformation matrix are the
tristimulus values of the primaries of set 2 measured in set 1.
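For either of the transformations, the null hypothesis that the tristimulus values of stimulus U
can be transformed from one tristimulus space to another, i.e.
\[
\begin{aligned}
H_0 &: \begin{bmatrix} R_{2,C} & G_{2,C} & B_{2,C} \end{bmatrix} = \begin{bmatrix} R_2 & G_2 & B_2 \end{bmatrix} \\
H_1 &: \begin{bmatrix} R_{2,C} & G_{2,C} & B_{2,C} \end{bmatrix} \neq \begin{bmatrix} R_2 & G_2 & B_2 \end{bmatrix}
\end{aligned}
\tag{3.2.11}
\]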
is tested by the multivariate test for equality of mean vectors adapted for the case of unequal
covariance matrices, similarly to the proportionality test. However, in the test of the
proportionality law both sets of tristimulus values were determined experimentally, hence their
covariance matrices were estimated directly from the experimental data. Here, the tristimulus
values $R_{2,C}$, $G_{2,C}$ and $B_{2,C}$ are calculated rather than measured, and their variances and
covariances need to be estimated by applying the model of error propagation described by
Eqs. (2.4.48)-(2.4.56).
The magnitude of the deviation of the predicted values from the measured ones, similarly to the
test of the proportionality, is expressed as percentage of deviation:
\[
\begin{aligned}
p_R &= \left(1 - \frac{R_2}{R_{2,C}}\right) \times 100 \\
p_G &= \left(1 - \frac{G_2}{G_{2,C}}\right) \times 100 \\
p_B &= \left(1 - \frac{B_2}{B_{2,C}}\right) \times 100
\end{aligned}
\tag{3.2.12}
\]
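The forward- and inverse-matrix transformations of Eqs. (3.2.9) and (3.2.10), together with the deviation of Eq. (3.2.12), can be sketched as follows. This is a minimal illustration, not the procedure used in the study: all function names and numbers are hypothetical, the matrices are chosen to be exact inverses of each other (the ideal additive case), and the inverse-matrix route is computed with Cramer's rule rather than by forming the inverse explicitly.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def forward_transform(t1, m12):
    """Eq. (3.2.9): row vector t1 times m12.

    Row i of m12 holds the set-2 tristimulus values of the i-th primary of set 1.
    """
    return [sum(t1[i] * m12[i][j] for i in range(3)) for j in range(3)]


def inverse_transform(t1, m21):
    """Eq. (3.2.10): row vector t1 times the inverse of m21.

    Row i of m21 holds the set-1 tristimulus values of the i-th primary of set 2.
    Solves x * m21 = t1 by Cramer's rule.
    """
    d = det3(m21)
    if d == 0:
        raise ValueError("transformation matrix is singular")
    x = []
    for r in range(3):
        # Replace row r of m21 by t1; the determinant ratio gives x[r].
        replaced = [t1 if k == r else m21[k] for k in range(3)]
        x.append(det3(replaced) / d)
    return x


def percent_deviation(measured, calculated):
    """Eq. (3.2.12): (1 - measured / calculated) * 100 for each channel."""
    return [(1.0 - m / c) * 100.0 for m, c in zip(measured, calculated)]


# Hypothetical data: m21 is the exact inverse of m12 (ideal additive system).
m12 = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
m21 = [[1.0, -0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
t1 = [0.2, 0.7, 0.1]            # tristimulus values in space 1
t2_measured = [0.22, 0.80, 0.19]  # hypothetical direct match in space 2

t2_forward = forward_transform(t1, m12)
t2_inverse = inverse_transform(t1, m21)
deviation = percent_deviation(t2_measured, t2_forward)
print(t2_forward, t2_inverse, deviation)
```

In this idealised example the two routes agree; with experimentally determined matrices they need not, which is the point of comparing them.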
3.2.2. Experimental setup
3.2.2.1. Visual colorimeter
A visual colorimeter (Tarrant 2002) (Figure 3.2.2-1 and Figure 3.2.2-2), initially built for teaching
and demonstration purposes, was adapted for research and used in this study. This instrument
provides a vertically divided bipartite matching field of 6° in size and allows for the maximum
saturation type of colour matching. The test and the primary stimuli are generated by filtered
tungsten light projected onto the white diffusing surface on the instrument’s back wall. The
projection units are mounted rigidly, while a system of apertures allows each of the
colorimeter’s four channels (three primaries and the test) to be switched to either side of the field,
thus allowing one set of primary filters to be used for projection on both sides. The viewing
is binocular through the aperture in the instrument’s front wall, and the viewing distance is
about 1500 mm. The brightness of each channel is controlled by changing the electrical current
fed to the lamp; the change in light chromaticity that would otherwise result from this adjustment is
avoided by the use of narrow-band interference filters for the primary and the test stimuli.
Figure 3.2.2-1. Tarrant visual colorimeter – schematic plan view and optical system
A) Plan view of the colorimeter. B) Optical system of the test beam projection unit. Reproduced from
(Tarrant 2002).
B: Lamp; C: Condenser lens system; E: Projection lens system; F: Colour filter; G: Alignment mirror;
P: Position of the photometer head; S: Screen; V: viewing aperture; W: auxiliary viewing aperture.
Figure 3.2.2-2. Tarrant visual colorimeter.

A) Schematic top view: the observer (2) is binocularly viewing the bipartite field on the screen (1).
Immediately after the match has been performed, the observer moves aside and radiometric measurements
of both fields are taken by the telespectroradiometer (3) located just behind the observer’s head.
B) Image of the colorimeter in operation. The TSR is not visible.
3.2.2.2. Primary and test stimuli
The choice of the primaries was governed by the following considerations (Robertson 2002):
1. Replication of the results reported by Thornton (Thornton 1992a; Thornton 1992b)
regarding the failure of transformation of tristimulus space
2. Relation of the findings to the traditional colorimetry
3. Availability of interference filters
Accordingly, three sets of primaries were selected:

1. The primaries identified by Thornton (Thornton 1992a; Thornton 1992b) as “the most
visually efficient”, that is, requiring the minimum amount of radiant energy in the primary
lights to match in colour a unit amount of energy in the test light at any wavelength. After
Thornton, we call this set “Prime Colours” (PC).
2. The primary set which is the “least visually efficient” (in Thornton’s terminology), that
is – the opposite of the PC set. This set was named by Thornton (ibid) “Anti Prime
Colours” (AP).
3. The final S&B (Stiles and Burch 1959) primaries in the experiment which led to the
derivation of the CIE 1964 Standard Colorimetric Observer. This set was named
“Traditional” (T).
At the initial stages of the experiment it was decided to abandon the AP primary set. It appeared
extremely difficult to use in the maximum saturation type of colour matching. Observers
complained that the adjustments were not intuitive, and the matches took extremely long to
complete – up to 40 minutes (as opposed to maximum of 5 minutes in other primary sets). It
was impractical to continue the experiment with this setup; hence it was decided to discontinue
the use of the AP set and to use its filters as additional test stimuli. It is worth noting that the
reported experiment (Thornton 1992a) which used the AP lights as primaries was of the Maxwell
type and not of the maximum saturation type.
Finally, the experiment was performed with the following stimuli:
1. Two sets of primary stimuli:
a. “Prime Colours” (PC) set at 603 nm @ 2.9 cd/m², 530 nm @ 2.4 cd/m², 451 nm @ 0.09 cd/m²
b. “Traditional” (T) set at 641 nm @ 1.0 cd/m², 521 nm @ 1.9 cd/m², 441 nm @ 0.07 cd/m²
2. Nine test stimuli:
a. “Anti Prime” colours at 500 nm @ 0.72 cd/m², 584 nm @ 3.4 cd/m² and 650 nm @ 0.73 cd/m²
b. Test stimuli at 461 nm @ 0.15 cd/m², 541 nm @ 3.5 cd/m² and 661 nm @ 0.4 cd/m²
c. The same filters at 461 nm, 541 nm and 661 nm at a luminance level of approximately half
of that stated above.
The filters’ bandwidth at half-height was approximately 10 nm. The experiment was conducted
at low light levels: the photopic luminances of the test stimuli were in the range of
approximately 0.07-3.3 cd/m², which corresponds to 3.2 to 125 photopic trolands calculated with the
Trezona model of pupil size (Eq. (2.1.3)).
Figure 3.2.2-3. SPD of the experimental stimuli.

Long dashes – T primary set; short dashes – PC primary set, solid lines – test stimuli. Normalised to unity
at the peak.
3.2.2.3. Telespectroradiometric measurements
A Minolta CS-1000 telespectroradiometer (TSR) was used to measure the stimuli provided by
the visual colorimeter. It provides SPD data in the 380-780 nm range, at 1 nm intervals interpolated
from 5 nm measurements. A tungsten light source in an integrating sphere (“White light
calibration gauge”) with measurement data provided by the NPL was used for the TSR
calibration. The light source and the TSR were allowed a 30-minute warm-up time before the
measurements started. The supplied 5 nm NPL measurement data and the mean 5 nm values
from 20 measurements were used to construct the correction curve. The 5 nm values were
then interpolated to a curve at 1 nm intervals, which was finally used to correct the experimental
measurements (Figure 3.2.2-4).
[Plot: correction factor (ordinate, 0.9 to 1.4) versus wavelength λ (abscissa, 380 to 780 nm)]
Figure 3.2.2-4. Calibration curve applied to correct the Minolta CS-1000 TSR measurements (see text
for details)
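The construction of the correction curve described above can be sketched as follows. The text does not give the exact formula or the interpolation method, so this sketch makes two loudly stated assumptions: the correction factor is taken as the ratio of the reference (NPL) value to the mean measured value at each 5 nm point, and the 1 nm curve is obtained by linear interpolation. All names and numbers are hypothetical.

```python
def correction_curve(reference, measured_mean):
    """Assumed form: per-wavelength factor = reference value / mean measured value."""
    return [ref / mean for ref, mean in zip(reference, measured_mean)]


def interpolate_linear(x_known, y_known, x_new):
    """Piecewise-linear interpolation of y at x_new (x_known must be ascending)."""
    for i in range(len(x_known) - 1):
        x0, x1 = x_known[i], x_known[i + 1]
        if x0 <= x_new <= x1:
            w = (x_new - x0) / (x1 - x0)
            return y_known[i] + w * (y_known[i + 1] - y_known[i])
    raise ValueError("x_new lies outside the tabulated range")


# Hypothetical 5 nm data, for illustration only.
wavelengths_5nm = [380, 385, 390]
reference = [2.4, 2.2, 2.0]        # supplied reference values
measured_mean = [2.0, 2.0, 2.0]    # mean of repeated TSR readings

factors_5nm = correction_curve(reference, measured_mean)
factor_382 = interpolate_linear(wavelengths_5nm, factors_5nm, 382)

# Apply the interpolated factor to a hypothetical raw reading at 382 nm.
raw_382 = 1.0e-4
corrected_382 = raw_382 * factor_382
print(factor_382, corrected_382)
```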
Performance of the Minolta CS-1000 in resolving the SPD of narrow-band lights was evaluated
by comparing its measurements with ones taken at 1 nm intervals by the Bentham TSR. This
instrument is equipped with Bentham D300 single monochromator with 1 nm bandwidth, and
Bentham DH-3 detector. It was calibrated against an SRS8 luminance gauge (NPL traceable)
and a CL-Hg mercury lamp (NIST/NBS). In order to compare the two sets, the measurements
from the two instruments were normalised to have 1 W/sr/m² under the transmittance curve of
the 540 nm filter. The mean wavelength error at the filters’ peaks was 1 nm, with a maximum of 2 nm. The
mean difference between the areas under the transmittance curves of the same filter taken with the
two instruments was 3.5%. An example of measurements of the same filter taken by both
instruments is given in Figure 3.2.2-5.
[Plot: radiance in W/sr/m² (ordinate, 0 to 0.05) versus wavelength in nm (abscissa, 475 to 525); series: Bentham, Minolta]
Figure 3.2.2-5. Comparison between Minolta CS-1000 and Bentham TSR instruments.
Measurement of 500 nm filter taken by Minolta CS-1000 with 5 nm intervals, interpolated to 1nm
(dashed); and Bentham with 1nm intervals (solid).
The TSR was positioned on a tripod just behind the observer’s head (Figure 3.2.2-2).
Once the observer had declared a match, she moved aside, and two radiometric measurements were
taken, one from each matching field. Thus, the stimuli were measured from almost the same
angle and distance as viewed by the observer.
3.2.2.4. Calculation of the tristimulus values
The tristimulus values were calculated from the radiometric measurements by a procedure
similar to the one described by Thornton (Thornton 1992a): by integrating the radiant energy
within ±20 nm of the peak wavelength of the corresponding primary and normalising it
to 1 W/sr/m² of energy in the test stimulus (Figure 3.2.2-6):
\[
P_\lambda = k \sum_{\lambda_0 - 20\,\mathrm{nm}}^{\lambda_0 + 20\,\mathrm{nm}} w_P
\tag{3.2.13}
\]
Here $P_\lambda$ is the calculated tristimulus value which corresponds to primary λ, $\lambda_0$ is the wavelength
of the peak radiance for that primary, and $w_P$ is the measured radiance (W/sr/m²). k is the
normalising factor calculated as
\[
k = \frac{1}{w_T}
\tag{3.2.14}
\]
where $w_T$ is the radiance measured at the test stimulus.
[Schematic: SPD curve of a primary light with a marked width of 10 nm; the area Pλ between λ-20 nm and λ+20 nm is shaded]
Figure 3.2.2-6. Schematic illustration of calculation of tristimulus value.

The area under the SPD curve of a primary light is integrated to arrive at its corresponding tristimulus
value (see text for details).
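Eqs. (3.2.13) and (3.2.14) can be sketched as below. This is an illustration under stated assumptions: the spectra are sampled at 1 nm intervals, $w_T$ is taken to be the total radiance of the test stimulus, and the function name and all numbers are hypothetical.

```python
def tristimulus_value(wavelengths, match_radiance, peak_wl,
                      test_radiance_total, half_range=20):
    """Sum the radiance within +/- half_range nm of the primary's peak
    (Eq. (3.2.13)) and scale by k = 1 / w_T (Eq. (3.2.14))."""
    k = 1.0 / test_radiance_total
    band = sum(w for wl, w in zip(wavelengths, match_radiance)
               if peak_wl - half_range <= wl <= peak_wl + half_range)
    return k * band


# Hypothetical 1 nm spectrum of the matching field: a flat 11 nm wide band
# around 603 nm, plus a small contribution at 630 nm that lies outside the
# +/- 20 nm window and is therefore ignored.
wavelengths = list(range(570, 641))
match_radiance = [1e-5 if 598 <= wl <= 608 else (5e-6 if wl == 630 else 0.0)
                  for wl in wavelengths]

# Hypothetical total radiance of the test stimulus (assumed meaning of w_T).
w_t = 2.2e-4

r_value = tristimulus_value(wavelengths, match_radiance, 603, w_t)
print(r_value)
```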
3.2.2.5. Observers and observational sessions
Five observers took part in the experiment, all Colour Science research students. There were
three males and two females. Mean age was 30.2 years, with a standard deviation of 4.8.
Three of the observers had considerable experience (>1 year) in colour-critical evaluations, and
two observers had none. One observer, the author, had gained significant experience in
operating the visual colorimeter prior to the experiment; three other observers had some
experience with the operation of the visual colorimeter from their studies; one observer had no such
experience at all and went through a training session. All observers were tested for colour vision
deficiencies with Ishihara pseudoisochromatic plates (Ishihara).
Each colour matching session for each primary set comprised 12 stimuli and lasted
approximately an hour. In order to assess the intra-observer repeatability, four observers
repeated each match three times, and the fifth (observer B, the author) performed 10
repetitions for each stimulus for each primary set. All the repetitions were performed on
different days.
3.3. Results: large field experiment
3.3.1. Variability of colour matching data
In this study, we did not perform the analysis of the correlation between all the elements of the
colour matching experiment setup. Rather, we report the variability in each of the elements, and
the variability of the tristimulus values as measured by the TSR and calculated by means of the
equations (3.2.13)-(3.2.14) is assumed to represent the combined variability of the system and is
used in the statistical evaluations.
Due to large variations between tristimulus values over the wavelength range, expression of
variabilities as standard deviations (i.e. in the units of tristimulus values) is problematic. For
instance, standard deviations of the same magnitude of R tristimulus values would have
different meaning in the "red" region, where the tristimulus values are large, and in the blue
region – where the values are small. Expression of variability relatively to the tristimulus value
would provide a more meaningful measure, allowing variabilities at different wavelength
locations to be compared with each other. Therefore, all the variability values are reported in the
units of coefficient of variation (CV) (also termed relative standard deviation), which expresses
the standard deviation s(q) of a variable q relative to its mean value $\bar{q}$:
\[
CV = \frac{s(q)}{\bar{q}} \times 100
\tag{3.3.1}
\]
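A minimal sketch of Eq. (3.3.1) in Python follows. Two choices here are assumptions, not statements from the text: the sample standard deviation (n - 1 denominator) is used, and the mean is taken in absolute value so that negative tristimulus values yield a positive CV. The function name and the numbers are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation relative to the absolute mean."""
    return statistics.stdev(values) / abs(statistics.mean(values)) * 100.0


# Hypothetical repeated measurements of one tristimulus value.
cv_positive = coefficient_of_variation([0.98, 1.00, 1.02])
cv_negative = coefficient_of_variation([-0.98, -1.00, -1.02])
print(cv_positive, cv_negative)
```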
3.3.1.1. Physical variability
The same instrument was previously evaluated (Luo 2003); however, because of the further
improvements made to the instrument, a new evaluation was needed. The following sources
of physical uncertainty were identified and evaluated:
1. Random temporal fluctuations of the colorimeter’s optical and electrical system,
measured as variations in spectral transmittance of a grey filter projected on the test
field side of the bipartite field
2. Fluctuations of the TSR
3. Bipartite field spatial non-uniformity, separately for each channel and each field side
4. Cross-talk between the channels
5. NPL white calibration gauge (used to calculate the correction curve for TSR
measurements)
Variability values for the physical sources are summarised in Table 3.3.1-1; below is a detailed
description of the methods used to arrive at them.
A)
Source                                                                  Value
TSR + Visual Colorimeter (CV)                                           1.44%
Cross-talk between the colorimeter channels (mean relative magnitude)   1.10%
NPL calibration gauge (% error)                                         1.45%
Combined                                                                2.32%

B)
R (left)   R (right)   G (left)   G (right)   B (left)   B (right)   Test
0.4%       0.9%        0.9%       0.3%        6.9%       2.5%        1.2%
Table 3.3.1-1. Summary of the variability introduced by the instruments.

A) All the evaluated factors with exception of field uniformity; B) field uniformity for every channel and
field side.
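The text does not state how the “Combined” value in part A of the table was obtained. The tabulated figure is, however, consistent with a root-sum-of-squares combination of the three listed components, which is assumed in the following short check (the variable names are hypothetical).

```python
import math

# Component values in percent, taken from part A of Table 3.3.1-1.
components = {
    "TSR + visual colorimeter": 1.44,
    "cross-talk": 1.10,
    "NPL calibration gauge": 1.45,
}

# Assumed combination rule: square root of the sum of squares.
combined = math.sqrt(sum(v * v for v in components.values()))
print(round(combined, 2))
```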
Random fluctuations of TSR and visual colorimeter
As we used the TSR to measure the radiance of the bipartite field presented by the visual
colorimeter, the variations in the measurements can be considered as representing the combined
variability of both instruments – the visual colorimeter and the TSR. Before each experimental
session, SPD of a neutral grey filter was measured at the same current setting of the visual
colorimeter test lamp. By the end of the experiment, 38 measurements were made at random
time intervals during approximately six weeks. The measurement data were analysed to derive
the CV of every value in 420-680 nm range (the range of available filters). The plot of the CV
values versus wavelength is given in Figure 3.3.1-1.
[Plot: CV (ordinate, 0.0% to 2.5%) versus wavelength in nm (abscissa, 420 to 670)]
Figure 3.3.1-1. Long-term variability of the combination TSR – visual colorimeter.

Coefficient of variation (CV) value versus wavelength.
Fields spatial uniformity
The uniformity of the two sides of the bipartite field was evaluated for each colorimeter channel
by taking a series of measurements across the matching field. The difference in luminance at
each location from the mean luminance of the field was calculated; the ratio of this difference to
the mean (expressed in percentage terms) was used as a measure of the spatial uniformity.
Figure 3.3.1-2 illustrates the measurement locations. The numerical results are given in Table
3.3.1-1 (B), and are illustrated graphically in Figure 3.3.1-3.
[Diagram: ten sample points, numbered 1 to 10, across the matching field]
Figure 3.3.1-2. Illustration of position of uniformity measurement sample points

[Plots: % difference from mean (ordinate, -10 to 10) versus sample # (abscissa, 1 to 10). Panels: A-1) Red: Left field; A-2) Red: Right field; B-1) Green: Left field; B-2) Green: Right field; C-1) Blue: Left field; C-2) Blue: Right field]
Figure 3.3.1-3. Evaluation of the uniformity of matching field.
A) Red channel; B) Green channel; C) Blue channel. Column 1: left field; column 2: right field. Data
points represent percentage of deviation from the mean value plotted versus sample number (Figure
3.3.1-2).
The red and green channels are significantly more uniform than the blue one. This is due to
modifications made to the optics of the blue channel in an attempt to increase the light throughput.
While an improvement in luminance of about 230% was achieved, the uniformity of the beam
was affected.
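The uniformity measure described above (the difference between the luminance at each location and the field mean, expressed as a percentage of the mean) can be sketched as follows. The function name and the luminance readings are hypothetical; how the per-location values were condensed into the single figures of Table 3.3.1-1 (B) is not stated in the text and is not reproduced here.

```python
def uniformity_deviations(luminances):
    """Percentage deviation of each sample point from the mean field luminance."""
    mean = sum(luminances) / len(luminances)
    return [(lum - mean) / mean * 100.0 for lum in luminances]


# Hypothetical luminance readings (cd/m^2) at four sample points.
deviations = uniformity_deviations([1.0, 1.1, 0.9, 1.0])
print(deviations)
```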
Cross-talk between the channels
Some amount of light from the test field falling on the match field – and vice versa – caused the
problem of cross-talk between the channels. The way cross-talk shows up in the measurements
is illustrated in Figure 3.3.1-4.
[Plot: radiance in W/sr/m² (ordinate, 0 to 5.0E-05) versus wavelength λ in nm (abscissa, 500 to 640)]
Figure 3.3.1-4. Illustration of the cross-talk between the channels

Some amount of light from the test field (solid line) is falling on the matching field (dotted line), and vice
versa.
The amount of cross-talk was quantified as
\[
c = \frac{S - T}{T} \times 100
\tag{3.3.2}
\]
where c is the magnitude of the cross-talk, T is the sum of the radiant power content of the two
lights mixed at the field, and S is the total radiant power measured from it.
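As a small worked illustration of Eq. (3.3.2), the following sketch computes the cross-talk magnitude from hypothetical radiant power values. It assumes that T is obtained by adding the radiant power of the two lights measured separately; the function name and the numbers are not from the source.

```python
def cross_talk_percent(total_measured, sum_of_components):
    """Eq. (3.3.2): c = (S - T) / T * 100."""
    return (total_measured - sum_of_components) / sum_of_components * 100.0


# Hypothetical values in W/sr/m^2.
t_sum = 0.6e-4 + 0.4e-4   # T: the two lights measured separately, then added
s_total = 1.011e-4        # S: total measured from the field

c = cross_talk_percent(s_total, t_sum)
print(c)
```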
3.3.1.2. Psychophysical variability
As discussed above, the variability of the tristimulus values is the result of both physical and
psychophysical uncertainty. In this study, no attempt was made to separate the variability of the
two kinds from one another. Hence, it might be worthwhile to differentiate between the observer
variability and the variability of tristimulus values in a colour matching experiment. However, to
keep to the accepted terminology, we use the common terms intra- and inter-observer with
the meaning of uncertainties of tristimulus values.
Intra-observer variability
One observer (B) performed ten repetitions of every match. The variability of each
tristimulus value of every stimulus was evaluated as the standard deviation within the ten sets, and
is reported as the intra-observer variability in our experiment. The numerical values in CV terms
are given in Table 3.3.1-2.
A) PC set, tristimulus values           | B) T set, tristimulus values
λ (nm)   R         G         B          | λ (nm)   R         G         B
441       0.0203   -0.0444    1.1014    | 451      -0.0375    0.0426    0.9165
461      -0.0264    0.0631    0.8163    | 461      -0.0819    0.1034    0.7354
500      -0.1247    0.5210    0.0944    | 500      -0.1970    0.5506    0.0807
521      -0.0709    0.9156    0.0107    | 530       0.1794    1.0910   -0.0154
541       0.1355    0.9689   -0.0117    | 541       0.4866    1.1105   -0.0237
584       0.8194    0.3256   -0.0094    | 584       1.9173    0.5718   -0.0249
641       0.4453   -0.1058    0.0093    | 603       2.1419    0.2253   -0.0276
650       0.2830   -0.0697    0.0098    | 650       0.6210   -0.0033    0.0045
661       0.1580   -0.0395    0.0091    | 661       0.3514   -0.0031    0.0041

C) PC set, intra-observer CV            | D) T set, intra-observer CV
λ (nm)   R         G         B          | λ (nm)   R         G         B
441      10.12%    4.90%     3.39%      | 451      11.89%    7.29%     4.14%
461       6.59%    5.57%     2.72%      | 461       4.55%    3.43%     4.93%
500       3.78%    3.51%     6.72%      | 500       6.61%    1.79%     5.95%
521       7.56%    2.45%    28.01%      | 530       6.21%    0.95%    22.36%
541       3.76%    1.30%    35.24%      | 541       2.59%    1.72%    15.48%
584       1.62%    2.18%    39.26%      | 584       3.20%    2.88%    14.17%
641       1.24%    2.96%    21.90%      | 603       2.07%    1.95%     9.12%
650       4.26%    3.45%    16.72%      | 650       1.57%   16.66%    18.25%
661       1.68%    3.44%     6.31%      | 661       1.77%   19.41%    22.72%
Mean      4.51%    3.31%    17.81%      | Mean      4.50%    6.23%    13.01%
Table 3.3.1-2. Summary of mean tristimulus values for individual observer, and corresponding intra-
observer variability. Based on ten repetitions of every colour match made by observer B.
A) PC set – tristimulus values; B) T set – tristimulus values; C) PC set – intra-observer variability in CV
terms; D) T set – intra-observer variability in CV terms
Inter-observer variability
All the observers except observer B performed three repetitions of each colour match. The
arithmetic means of the results of each observer formed five sets of tristimulus values. The
standard deviation across these five sets was calculated to estimate the inter-observer variability; the results
are listed in Table 3.3.1-3.
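The two-step calculation described above (a mean per observer, then the spread across observers) can be sketched as follows for a single tristimulus value of a single stimulus. The data are hypothetical, and two details are assumptions: the sample standard deviation is used, and the result is expressed as a CV relative to the absolute value of the grand mean.

```python
import statistics


def inter_observer_cv(matches_by_observer):
    """CV (%) of the per-observer means of one tristimulus value."""
    observer_means = [statistics.mean(reps) for reps in matches_by_observer.values()]
    grand_mean = statistics.mean(observer_means)
    return statistics.stdev(observer_means) / abs(grand_mean) * 100.0


# Hypothetical repeated matches: observer B has ten repetitions, the others three.
matches = {
    "A": [0.48, 0.50, 0.52],
    "B": [0.55] * 10,
    "C": [0.45, 0.45, 0.45],
    "D": [0.52, 0.50, 0.48],
    "E": [0.50, 0.50, 0.50],
}

cv_inter = inter_observer_cv(matches)
print(cv_inter)
```

Averaging within each observer first gives every observer equal weight, regardless of the number of repetitions.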
A) PC set, tristimulus values           | B) T set, tristimulus values
λ (nm)   R         G         B          | λ (nm)   R         G         B
441       0.0133   -0.0279    0.9618    | 451      -0.0259    0.0402    0.9212
461      -0.0199    0.0558    0.8245    | 461      -0.0612    0.0898    0.7430
500      -0.1047    0.4884    0.1147    | 500      -0.1615    0.5298    0.0972
521      -0.0377    0.9095    0.0091    | 530       0.1805    1.1149   -0.0111
541       0.1273    0.9934   -0.0110    | 541       0.4826    1.1464   -0.0179
584       0.8249    0.3408   -0.0107    | 584       1.9845    0.6117   -0.0344
641       0.4355   -0.0981    0.0125    | 603       2.1990    0.2514   -0.0328
650       0.2808   -0.0756    0.0135    | 650       0.6182   -0.0038    0.0057
661       0.1543   -0.0435    0.0118    | 661       0.3442   -0.0038    0.0050

C) PC set, inter-observer CV            | D) T set, inter-observer CV
λ (nm)   R         G         B          | λ (nm)   R         G         B
441      30.98%   48.66%    16.40%      | 451      32.21%    7.78%     4.66%
461      28.14%   10.81%     5.26%      | 461      25.06%   12.55%     5.35%
500      11.20%    4.26%    11.36%      | 500      11.80%    3.35%     9.53%
521      75.21%    2.00%    89.65%      | 530      11.28%    3.38%    57.24%
541      17.27%    2.89%   109.73%      | 541      10.02%    3.08%    64.21%
584       5.00%    7.42%   118.85%      | 584       9.63%    6.71%    68.89%
641       7.64%   29.61%    58.57%      | 603       5.05%    8.59%    60.59%
650       5.88%   11.94%    36.64%      | 650       1.06%   13.41%    33.67%
661       6.20%   12.83%    46.28%      | 661       2.24%   22.12%    31.03%
Mean     20.84%   14.49%    54.75%      | Mean     12.04%    8.99%    37.24%
Table 3.3.1-3. Summary of mean tristimulus values for all observers, and corresponding inter-observer
variability. Means of repeated measurements of five observers.
A) PC set – tristimulus values; B) T set – tristimulus values; C) PC set – inter-observer variability in CV
terms; D) T set – inter-observer variability in CV terms
Figure 3.3.1-5 shows the plots of CV values of all three types of variability – physical, inter-
and intra-observer for PC and T primary sets.
[Plots: CV (ordinate) versus wavelength λ in nm (abscissa, 440 to 640). Panels: A-1), A-2), B-1), B-2), C-1), C-2)]
Figure 3.3.1-5. Variabilities in PC and T primary sets.

A) R tristimulus value; B) G tristimulus value; C) B tristimulus value.
Column 1: PC primary set; Column 2: T primary set.
Instrumental (thin dashed line), intra-observer (thick dashed line) and inter-observer (thick solid line)
variability.
3.3.2. Proportionality and Additivity test results
The same narrow-band stimulus was matched at two luminance levels, one approximately ½ of
the other. Proportionality was tested by comparing the tristimulus values measured at the two
levels after normalisation to 1 W/sr/m² at the test stimulus (Eqs. (3.2.1)-(3.2.6)). Table 3.3.2-1
shows the results of Nel and Van der Merwe’s multivariate statistical test at the 95% confidence
level, for a single observer and for the mean results of all observers.
Test stimulus   PC, observer B   PC, All observers   T, observer B   T, All observers
461 nm          −                √                   √               √
541 nm          √                √                   √               √
661 nm          −                −                   −               −
Table 3.3.2-1. Results of the proportionality test.

Results of the statistical test for equality of mean vectors of tristimulus values measured for the same
stimulus at two levels of luminance. “−” signifies failure of proportionality; “√” signifies cases where
there was no statistically significant failure. Results are for both primary sets, for observer B and for
the mean of all observers.
The statistical test provides only a pass/fail result; in case of the failure it does not provide
information about the contribution of each tristimulus value to it. Therefore it is useful to
express the test results also as magnitude of proportionality failure (Eq. (3.2.7)). Figure 3.3.2-1
illustrates the magnitudes for observer B, and Figure 3.3.2-2 for mean of all observers.
[Bar charts: % mismatch (ordinate, -80% to 80%) versus test colour in nm (461, 541, 661); series R, G, B. Panels: A), B)]
Figure 3.3.2-1. Magnitude of proportionality failure in results of observer B.
A) PC primary set; B) T primary set. The ordinate values are outputs of Eq. (3.2.7)
[Bar charts: % mismatch (ordinate, -80% to 80%) versus test colour in nm (461, 541, 661); series R, G, B. Panels: A), B)]
Figure 3.3.2-2. Magnitude of proportionality failure in mean results of all observers.
A) PC primary set; B) T primary set. The ordinate values (“% mismatch”) are the values of “degree of
proportionality failure”, as computed by means of Eq. (3.2.7)
3.3.2.2. Additivity test results
The same narrow-band stimuli were matched by two sets of primaries: T and PC. The tristimulus
values measured with one set were transformed into the other, and the transformed values were
compared with the ones measured for the corresponding stimulus by the corresponding primaries.
Table 3.3.2-2 shows the results of Nel and Van der Merwe’s multivariate statistical test at the
95% confidence level, for a single observer and for the mean results of all observers, for inverse- and
forward-matrix transformations.
Similarly to the test of proportionality, the magnitude of failure of additivity is visualised in a
bar chart (Figure 3.3.2-3 and Figure 3.3.2-4).
λ (nm)   PC, B, 6°, FM   PC, B, 6°, IM   T, B, 6°, FM   T, B, 6°, IM   PC, ALL, 6°, FM   PC, ALL, 6°, IM   T, ALL, 6°, FM   T, ALL, 6°, IM
461      −               −               −              −              √                 √                 √                √
500      −               √               √              −              √                 √                 √                √
541      −               √               √              −              √                 √                 √                √
584      −               √               √              −              √                 √                 √                √
650      −               √               √              −              −                 √                 √                √
661      −               −               √              −              −                 √                 √                √
Table 3.3.2-2. Results of the additivity test.

Results of the statistical test for equality of mean vectors of tristimulus values measured by visual colour
matching and calculated by transformation of tristimulus space. “−” signifies failure of additivity; “√”
signifies cases where there was no statistically significant failure. In the column heads, PC and T signify
the primary set; B and ALL signify observer B and the mean of all observers; FM and IM stand for “Forward
Matrix” and “Inverse Matrix” transformations, respectively.
[Figure 3.3.2-3 plots: panels A) and B); bars for the R, G and B tristimulus values at test colours 461, 500, 541, 584, 650 and 661 nm; abscissa: test colour (nm); ordinate: % mismatch, 0% to 100%.]
Figure 3.3.2-3. Magnitude of additivity failure in results of observer B.
A) PC primary set; B) T primary set.
[Figure 3.3.2-4 plots: panel A) “PC set, primaries transformation test”; panel B) “T set, primaries transformation test”; bars for the R, G and B tristimulus values at test colours 461, 500, 541, 584, 650 and 661 nm; abscissa: test colour; ordinate: % mismatch, 0% to 100%.]
Figure 3.3.2-4. Magnitude of additivity failure in mean results of all observers.
A) PC primary set; B) T primary set.
3.4. Data analysis and discussion: large field experiment
3.4.1. Variability of colour-matching data
In the analysis of the variability of the colour matching data collected in our experiment, the following
questions were considered:
1. What are, in quantitative terms, the physical and the psychophysical variabilities?
2. What is the relation between the physical and the psychophysical types of variability?
3. What is the relation between the intra- and inter-observer variability?
4. Do the results reproduce the variability within the S&B colour matching dataset?
5. What is the performance of the CIE Standard Deviate Observer (CIE 1989) in
predicting our experimental data?
6. Do the results depend on the choice of the primary colours?
7. How does the variability of the tristimulus values of a stimulus depend on its spectral
position?
3.4.1.1. Intra-observer, inter-observer and instrumental variability
The comparison of the physical and the psychophysical intra- and inter-observer variability
within each tristimulus space is illustrated in Figure 3.3.1-5. As expected, the instrumental
variability has the smallest contribution to the total uncertainty, followed by the intra-observer
variability, while the individual variations between observers contribute the most.
The mean combined physical variability is about 70% of the total intra-observer variability of the
red and green tristimulus values, and about 45% of that of the blue ones. Given the rather low
relative value of the combined physical variability (about 3% on average), it can be concluded
that the high ratio of the physical to the intra-observer variability is due not to poor performance
of the instruments or inappropriate design of the colorimeter, but rather to the low intra-observer
variation of the data. We consider the variability introduced by the instruments to be the most
significant part of the variability of a single observer’s data in our experiment.
The similarity between the instrumental and the intra-observer variabilities is especially strong
for the green and red tristimulus values and less so for the blue ones. It seems reasonable to attribute
the poorer observer performance in the blues to the very low luminance at which the blue primary
operated (<0.1 cd/m²), where poor discrimination is to be expected. Interestingly, for some
tristimulus values, the relative physical variability exceeds the intra-observer one.
The relationship between the intra- and inter-observer variability is of interest because it
provides some indication of the contribution of observer metamerism per se to the total
uncertainty of colour matching. This relationship is somewhat different for the two primary sets:
the intra-observer variability is 37% of the inter-observer variability in the PC set and 55% in the T set.
When the standard deviations were transformed into CIE 1964 XYZ space (Section 2.2.5), this relationship
remained similar (Figure 3.4.1-1). It seems that with the T set primaries observer
metamerism is less pronounced, which makes this set preferable in this respect.
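A ratio of this kind could be computed as sketched below. The sketch assumes that variability is expressed as a coefficient of variation (sample standard deviation divided by the absolute mean), that the intra-observer figure is the average of the per-observer CVs, and that the inter-observer figure is the CV of the observers' mean matches. These definitions, the function names and the numbers are illustrative assumptions, not the thesis data or necessarily its exact procedure.

```python
from statistics import mean, stdev


def cv(values):
    """Coefficient of variation: sample standard deviation over |mean|."""
    return stdev(values) / abs(mean(values))


def intra_observer_cv(matches_by_observer):
    """Average of the CVs of each observer's repeated matches."""
    return mean(cv(m) for m in matches_by_observer.values())


def inter_observer_cv(matches_by_observer):
    """CV of the observers' mean matches."""
    return cv([mean(m) for m in matches_by_observer.values()])


# Made-up repeated matches of one tristimulus value by three observers.
matches = {
    "obs1": [0.98, 1.00, 1.02],
    "obs2": [1.18, 1.20, 1.22],
    "obs3": [0.78, 0.80, 0.82],
}

intra = intra_observer_cv(matches)
inter = inter_observer_cv(matches)
ratio = intra / inter  # fraction of inter-observer variability
```

A ratio close to 1 would mean that differences between observers are hardly larger than the scatter of a single observer; a small ratio points to a larger contribution of observer metamerism.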
[Figure 3.4.1-1 plots: panels A-1), A-2), B-1), B-2), C-1) and C-2); abscissa: λ, 440 to 640 nm; ordinate: CV (up to 40% in row A, 8% in row B and 350% in row C).]
Figure 3.4.1-1. All types of variability compared.
A) X tristimulus values; B) Y tristimulus values; C) Z tristimulus values. T primary set in the column 1,
PC in column 2.
Coefficients of variation corresponding to instrumental (thin dashed), intra-observer (thick dashed) and
inter-observer (thick solid) variabilities. Transformed to CIE 1964 XYZ primaries (Section 2.2.5).
However, the differences are perhaps not so significant, as illustrated in Figure 3.4.1-2, where
the CV plots corresponding to the two primary sets are superimposed. Although the PC set shows
generally higher variability, this happens mostly in regions where the tristimulus values for
the corresponding primary are very small, such as the blue region for X, the red region for Y and the
green-red region for Z. This general pattern of variabilities is similar to that reported by Stiles and
Burch (1959): the relative variabilities tend to be lowest where the absolute tristimulus values are
highest, and vice versa.
[Figure 3.4.1-2 plots: panels A), B) and C); abscissa: λ, 440 to 640 nm; ordinate: CV (up to 35% in A, 8% in B and 350% in C).]
Figure 3.4.1-2. Inter-observer CV values transformed to CIE 1964 XYZ primaries using error
propagation model (Section 2.4.7).
A) X (red) tristimulus values; B) Y (green) tristimulus values; C) Z (blue) tristimulus values. Dashed
lines: PC; Solid lines: T. In each plot, the thick dotted line is the S&B (Stiles and Burch 1959) CMF
transformed to the CIE 1964 primaries and scaled to fit the graph.
An interesting feature of the plots in Figure 3.4.1-2 is the maxima, which consistently coincide
with certain wavelengths. The red is the most apparent: the values in both sets increase
rapidly around 500 nm, where the variability is almost 10 times higher than in the rest
of the spectrum; also in red, there is a small increase in CV values in the T set at 580 nm and in
the PC set at 650 nm. In the blue channel, the peaks in both primary sets are around 584 nm. There
are also small peaks in green: in the PC set at 650 nm and in the T set at 580 nm. Those three
wavelength positions (500 nm, 584 nm and 650 nm) coincide with those termed by Thornton
“Anti Prime” colours (Thornton 1992a). No publications reporting similar trends could be
found in the literature. However, the significance of these peaks in relative variability is
questionable: they occur in regions of the spectrum where the absolute values of the corresponding
CMF approach zero.
3.4.1.2. Comparison with S&B dataset
One of the major goals of the present experiment was to reproduce the variability in the S&B colour
matching dataset (Stiles and Burch 1959). The mean relative variability values for the X, Y and Z
spectral tristimulus values in the S&B data are 40.5%, 3.5% and 138.9%, respectively; this
compares with 7.2%, 3.9% and 46.0% for the T primary set and 11%, 4.9% and 83.4% for the PC set.
The apparent difference in the X and Z variability values is very large. However, as is evident from
the plots in Figure 3.4.1-3, these values vary strongly across the spectrum,
hence the mean values cannot be considered reliable.
[Figure 3.4.1-3 plots: panels A-1), A-2), B-1), B-2), C-1) and C-2); abscissa: λ, 440 to 640 nm; ordinate: CV (up to 350% in row A, 8% in row B and 800% in row C).]
Figure 3.4.1-3. [...] with ones of (Stiles and Burch 1959).
A-1), B-1) and C-1): T primary set; A-2), B-2) and C-2): PC primary set.
Superimposed with the suitably scaled mean S&B CMF, transformed to 1964 CIE Observer primaries.
Solid line – present experiment; dashed line: S&B experiment, thick dotted line: CMF.
The situation becomes clearer if attention is concentrated on the spectral regions which are
“relevant” for the specific tristimulus value, i.e. those where the tristimulus values are
significantly different from zero. In Figure 3.4.1-3, the two curves indicating the variability
values are superimposed with the suitably scaled CMF curve, so the “relevant” areas can be
easily identified. The plots in Figure 3.4.1-4 are “zoom-in” versions of those in Figure 3.4.1-3, with
the abscissa and ordinate scaled to enlarge the area of interest, i.e. the wavelength region
where the corresponding tristimulus values are significantly different from zero. This time, the
curves are extremely similar: they follow the same trends, and in some regions almost coincide.
The mean variability values in the “relevant” areas reflect this visual resemblance: 4.3%, 3.4%
and 9.9% in the S&B data, 5.0%, 3.9% and 12.9% in the T primary set, and 7.8%, 4.9% and 16.8% in
the PC set.
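Restricting the mean to the “relevant” region can be sketched as follows. The criterion used here, keeping wavelengths where the absolute CMF value is at least 10% of its peak, is an assumption made for illustration; the text does not state the threshold that was actually applied, and the function name and sample values are hypothetical.

```python
def mean_cv_in_relevant_region(cv_values, cmf_values, threshold=0.1):
    """Mean CV over wavelengths where |CMF| >= threshold * max |CMF|.

    cv_values and cmf_values are sampled on the same wavelength grid.
    The 10% default threshold is an illustrative assumption.
    """
    peak = max(abs(c) for c in cmf_values)
    selected = [
        cv for cv, c in zip(cv_values, cmf_values) if abs(c) >= threshold * peak
    ]
    if not selected:
        raise ValueError("no wavelength passes the relevance threshold")
    return sum(selected) / len(selected)


# Made-up example: the CV explodes where the CMF is close to zero.
cmf = [0.01, 0.50, 1.00, 0.40, 0.02]
cvs = [3.00, 0.05, 0.04, 0.06, 2.50]

overall_mean = sum(cvs) / len(cvs)
relevant_mean = mean_cv_in_relevant_region(cvs, cmf)
```

In this made-up example the overall mean is dominated by the two end points, where the CMF is nearly zero, while the restricted mean reflects only the region that contributes to the tristimulus value.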
[Figure 3.4.1-4 plots: panels A-1), A-2), B-1), B-2), C-1) and C-2); abscissa: λ, restricted to the region of interest of each panel; ordinate: CV (up to 10% in rows A and B, 35% in row C).]
Figure 3.4.1-4. [...] with ones of (Stiles and Burch 1959); with abscissa and ordinate scaled to enlarge the areas where CMF
are significantly different from zero.
A-1), B-1) and C-1): T primary set; A-2), B-2) and C-2): PC primary set.
Superimposed with the suitably scaled mean S&B CMF, transformed to 1964 CIE Observer tristimulus
space. Solid line – present experiment; dashed line: S&B experiment.
The results of the T primary set are somewhat closer to the S&B ones. The primaries of the T set were
chosen to be similar to the primaries of the S&B experiment. This can be considered as further,
indirect evidence for the dependence of the variability of CMF on the choice of primaries.
The S&B colour matching dataset contains the CMFs of 49 observers, compared with only five in our
experiment. The S&B CMFs were measured on a colorimeter providing Maxwellian view and
having an optical design and controls entirely different from those of the Tarrant colorimeter used in our
experiment. Considering these differences, the similarity in variability values is remarkable,
suggesting that the variability of CMF data in the S&B dataset can be reproduced even by a small
group of observers.
3.4.1.3. Comparison with CIE SDO
The CIE Standard Deviate Observer (CIE 1989) was designed and tested for predicting the
level of observer metamerism of metameric broadband stimuli. The applicability of this
standard to the conditions of a colour matching experiment with narrow-band lights is not known,
and is indeed rather irrelevant: there are no industrial tasks which provide similar conditions.
Therefore a test of the SDO on the data from our experiment would not be strictly valid.
However, a limited test can still be carried out. The CIE SDO (CIE 1989) defines four deviate
observers. These four sets of CMF are applied to the spectral power distribution of a stimulus;
the resulting four sets of tristimulus values are used to construct the 95% confidence ellipse
indicating the spread of colours that will match the stimulus in question for a range of
colour-normal observers. As a “stress test”, this procedure can be applied to the narrow-band stimuli used
in our colour matching experiment. The outcome of such a test is an evaluation of the similarity
between the SDO prediction and our results.
The test is done as follows:

1. A transformation is derived from each of the primary sets to the CIE XYZ and to each of
the four Standard Deviate Observers.
2. The mean tristimulus values of all observations are transformed to five sets of XYZ(i)
tristimulus values, where i corresponds to one of the five observers: the CIE 1964 observer and the
four Deviate Observers.
3. The variability within the set of five values is used as representative of the
variability of the SDO, and is compared with the variability within our group of
observers.
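The three steps above can be sketched as follows. The five transformation matrices are hypothetical stand-ins (scaled copies of one made-up matrix) for the transformations to the CIE 1964 observer and the four Deviate Observers; deriving the real matrices requires the CMF data and is not shown. Expressing the variability as a CV across the five values is likewise an assumption.

```python
from statistics import mean, stdev


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def sdo_cv(mean_rgb, matrices):
    """CV of X, Y and Z across the observers represented by `matrices`.

    mean_rgb: mean tristimulus values of all observations in a primary space.
    matrices: one 3x3 matrix per observer (step 1), each taking the primary
              space to that observer's XYZ(i) space.
    """
    xyz_sets = [mat_vec(m, mean_rgb) for m in matrices]  # step 2
    # Step 3: spread of each tristimulus value across the five observers.
    return [stdev(channel) / abs(mean(channel)) for channel in zip(*xyz_sets)]


# Hypothetical base matrix and five slightly different observers.
BASE = [
    [0.60, 0.30, 0.20],
    [0.30, 0.65, 0.05],
    [0.00, 0.05, 1.00],
]
SCALES = (0.97, 0.99, 1.00, 1.01, 1.03)
observer_matrices = [[[s * x for x in row] for row in BASE] for s in SCALES]

predicted_cv = sdo_cv([0.35, 0.02, 0.004], observer_matrices)
```

The resulting three CV values would then be compared with the inter-observer CVs of the experiment at the same wavelength.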
The graphical representation of the results of this test is shown in Figure 3.4.1-5.
[Figure 3.4.1-5 plots: panels A), B) and C); abscissa: λ, 440 to 640 nm; ordinate: CV (up to 10% in A, 8% in B and 35% in C).]
Figure 3.4.1-5. Test of the CIE Standard Deviate Observer.

A) Red tristimulus values; B) Green tristimulus values; C) Blue tristimulus values. Thick lines: T set; thin
lines: PC set; dashed lines: experimental data; solid lines: CIE SDO prediction.
The SDO consistently under-predicts the experimental results, predicting variabilities 3 to 6 times
lower than the observed ones. Again, this is an indication rather than a definite result, as
we test the standard under conditions it was not designed for.
3.4.2. Proportionality and additivity tests
Proportionality consistently fails (Table 3.3.2-1) for the 661 nm stimulus, in both primary sets,
in the individual as well as in the mean results of all observers. The character of the failures is
similar in the individual and group results (Figure 3.3.2-1, Figure 3.3.2-2 and Figure 3.4.2-1), but
differs between the two primary sets.
The 95% confidence ellipses in Figure 3.4.2-1 are constructed from ten matches of the 661 nm stimulus
made by observer B at full and half luminance. If proportionality held strictly, the two sets of
tristimulus values would be identical, so the two ellipses in each plane would overlap
completely. The discrepancies in position illustrate the direction and magnitude of the
proportionality failure. Figure 3.4.2-1 shows the results for observer B; the mean results of all
observers exhibit the same trend.
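One common way of constructing such an ellipse from repeated matches in a plane of the tristimulus space is sketched below. It assumes bivariate normal scatter and uses the chi-square quantile with two degrees of freedom, which has the closed form −2 ln(1 − p). Whether the thesis used this large-sample form or a small-sample correction is not stated here, so the scaling is an assumption; the function name and the sample data are hypothetical.

```python
import math
from statistics import mean


def confidence_ellipse(xs, ys, level=0.95):
    """Return (centre, semi_major, semi_minor, angle_rad) of the ellipse.

    xs, ys: repeated matches in one plane, e.g. the G and R tristimulus values.
    """
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    # Sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of the 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    # Chi-square quantile for 2 degrees of freedom (about 5.99 at 95%).
    k2 = -2.0 * math.log(1.0 - level)
    # Orientation of the major axis relative to the x axis.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), math.sqrt(k2 * lam_major), math.sqrt(k2 * lam_minor), angle


# Made-up matches at full luminance in the G-R plane.
g_full = [-0.040, -0.038, -0.041, -0.039, -0.042]
r_full = [0.160, 0.162, 0.159, 0.161, 0.158]
centre, a, b, theta = confidence_ellipse(g_full, r_full)
```

Computing the ellipse once for the full-luminance matches and once for the half-luminance matches, and comparing centres and overlap, reproduces the kind of comparison described above.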
[Figure 3.4.2-1 plots: panels A-1) to A-3) (661 nm, PC set, observer B) and B-1) to B-3) (661 nm, T set, observer B); axes: pairs of the R, G and B tristimulus values.]

Figure 3.4.2-1. 95% confidence ellipses of 661 nm stimulus measured by observer B in full and half
luminance.
A-1) – A-3): PC primary set; B-1) – B-3): T primary set. RG, BG and RB planes in native T and PC
tristimulus spaces. Thick line: match at full luminance; thin line: match at half-luminance.
In the PC set, observers consistently used significantly more of the blue primary light and slightly
(but consistently) less of the green primary light to match the 661 nm light at reduced luminance
than they did at full luminance. On the diagram this trend shows up as a shift in the position of
the mean match towards blue (Figure 3.4.2-1 A). In numerical terms, the increase in blue light in
the match is about 44%, and the reduction in green light is about 4%.
In the T set, the failure takes the form of a shift in the “green” direction: observers used significantly
(about 70%) less green primary light with the stimuli at reduced luminance. The practical
significance of such a shift is, perhaps, questionable, as the absolute “green” tristimulus values in
the 661 nm region are extremely small.
The “blue” shift in the far-red region of the spectrum, of the kind we observe here in the PC primary set,
has long been known in the literature and is believed to be caused by rod participation. Just how rod
signals cause the perceived colour to shift towards blue is still not known, but Buck (2001)
suggested that rods can have a blue effect on the perceived hue. The rod problem was
well known to Stiles, who developed a method in which two different sets of primaries were used
in the red region of the spectrum in order to reduce the effect of rod intrusion (Stiles and
Burch 1959). Even so, Judd (1993) reports that there was still a need to correct the S&B data
for rod intrusion. It was assumed that S cones are not excited at all by light in the yellow-red region
of the spectrum, so that any blue component in that region results from rod participation. This
led to the artificial straightening of the corresponding part of the chromaticity locus in the CIE 1964
chromaticity diagram.
Thus, it was of interest to try to relate the observed failures to rod participation. The analysis
was done as described in Section 2.2.9; the results are shown in Figure 3.4.2-2.
[Figure 3.4.2-2 plots: panels A) and B); abscissa: λ; ordinate: rod mismatch/tvi, −10 to 10; data points labelled with the test wavelengths (nm).]
Figure 3.4.2-2. Results of the analysis of rod participation.
A) T primary set; B) PC primary set. The extent of rod mismatch plotted against wavelength. The red
horizontal lines mark values of ±1; values of rod participation greater than 1 or less than −1 are significant
and are likely to have an effect on the match.
The horizontal lines define the boundaries of ±1; values below the line of −1 and above the line
of +1 signify that the rod mismatch between the two sides of the matching field is higher than
the just-noticeable difference in scotopic luminance, and that the colour match is significantly
affected by rods. The sign of the rod participation value is arbitrary: it changes if the
scotopic luminance values for the left and right fields are swapped in the expression for the rod
mismatch (Eq. (2.2.42)).
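The sign behaviour can be seen in a minimal sketch. Eq. (2.2.42) is not reproduced in this section, so the form below, a difference of scotopic luminances divided by a threshold increment, is an assumption inferred from the ±1 criterion and the “rod mismatch/tvi” axis of Figure 3.4.2-2. The spectra, the coarse V′(λ) samples and the threshold are made-up placeholders, not CIE table values or thesis data.

```python
K_M_SCOTOPIC = 1700.0  # lm/W, maximum luminous efficacy for scotopic vision


def scotopic_luminance(radiance, v_prime, step_nm):
    """Approximate K'm * integral of L(lambda) * V'(lambda) by a sum."""
    if len(radiance) != len(v_prime):
        raise ValueError("spectra must share one wavelength grid")
    return K_M_SCOTOPIC * step_nm * sum(l * v for l, v in zip(radiance, v_prime))


def rod_mismatch(field_a, field_b, v_prime, step_nm, threshold):
    """Scotopic luminance difference of two fields in threshold units (assumed form)."""
    diff = scotopic_luminance(field_a, v_prime, step_nm) - scotopic_luminance(
        field_b, v_prime, step_nm
    )
    return diff / threshold


# Placeholder samples on a 450-650 nm grid in 50 nm steps (illustrative only).
v_prime = [0.4, 1.0, 0.5, 0.03, 0.0007]
test_field = [0.0, 0.0, 0.0002, 0.0, 0.010]    # test light plus desaturating primary
match_field = [0.0005, 0.0, 0.0, 0.0, 0.009]   # mixture of the other two primaries

m_ab = rod_mismatch(test_field, match_field, v_prime, 50.0, threshold=2.0)
m_ba = rod_mismatch(match_field, test_field, v_prime, 50.0, threshold=2.0)
```

Swapping the two fields only flips the sign, so it is the magnitude relative to 1 that matters for the interpretation.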
From Figure 3.4.2-2, the level of rod participation is clearly affected by the choice of primaries,
and the differences are mostly in the green-red region. This is an expected effect, as the main
difference between the PC and T primary sets is in the red primary. For the 661 nm stimulus the rod
participation value is 6.6 in the PC set and 1.9 in the T set. Plots of the relative spectral power
distributions of the matching pairs in the two tristimulus spaces (Figure 3.4.2-3), shown next to
the scotopic luminous efficiency curve V′(λ), provide some insight. At 661 nm the sensitivity
of rods is extremely low. When the stimulus is matched with the PC primary set, at the test side of the
bipartite field the rods are excited only by the small amount of desaturating green primary light. In the
matching field, however, the rods are excited by both the red and blue primary lights, thus creating
an imbalance of rod excitation between the two fields. With the T primary set, the red primary and
the 661 nm test light are both situated in the region of low rod sensitivity, while the level of
the other two primary stimuli is very low; therefore the rod mismatch between the two fields is
significantly lower than with the PC primary set.
[Figure 3.4.2-3 plots: panel A) “PC set”; panel B) “T set”; abscissa: λ, 430 to 680 nm; ordinate: relative intensity, 0 to 1.]
Figure 3.4.2-3. Illustration of effect of rods on 661 nm match.
A) PC set; B) T set. Spectral power distribution of the test field (solid line), the match field (dotted line) and the rod
sensitivity function, i.e. the scotopic luminous efficiency function V′(λ) (thick solid line).
The rod mismatch analysis points to a possible cause of the proportionality failure, but does not
explain the underlying mechanisms leading to the observed effects. Namely, the open
questions are:
1. Why do observers compensate for the imbalance of rod excitation with blue primary light
in the PC set and with green in the T set?
2. How do rod signals combine with the cone signals to produce the failures of
proportionality?
In addition to 661 nm, proportionality also fails for the 461 nm light in observer B’s data. In this
case, however, the failure seems to be minor, and is rendered insignificant in the mean data of
all observers.
3.4.2.2. Additivity test
As in the proportionality test, the same trends can be seen in the single observer’s results and in
the mean results of all observers (Figure 3.3.2-3 and Figure 3.3.2-4). In the PC set, the
discrepancies are mostly due to differences in the blue tristimulus values in the green region
(541 nm and 584 nm), and in the green and blue tristimulus values in the red region of the spectrum (650
nm and 661 nm). In the T set, the discrepancies are mostly due to differences in the blue
tristimulus values in the green-red region (541 nm to 650 nm).
Figure 3.4.2-4 illustrates the correlation between the additivity failure and rod participation in
the results of observer B. The agreement between the two sets of data is rather high: the correlation
coefficients are 87% and 74% for the T and PC sets, respectively; hence the link between
rod participation and additivity failure is rather clear.
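A correlation coefficient of this kind can be computed as sketched below, assuming that the reported values are Pearson coefficients (the text does not name the coefficient). The two data series are made-up placeholders standing for the rod mismatch and the magnitude of additivity failure at the six test wavelengths; they are not the thesis data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values at 461, 500, 541, 584, 650 and 661 nm.
rod_mismatch_abs = [3.5, 2.0, 2.5, 7.0, 3.0, 2.0]
additivity_failure = [0.20, 0.10, 0.15, 0.55, 0.25, 0.30]

r = pearson_r(rod_mismatch_abs, additivity_failure)
```

With only six stimuli per primary set, each coefficient rests on six pairs of values.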
[Figure 3.4.2-4 plots: panels A), B), C) and D); abscissa: λ, test colours 461, 500, 541, 584, 650 and 661 nm; ordinates: additivity failure and rod mismatch.]
Figure 3.4.2-4. Correlation of additivity failure and rod mismatch.
A) T set, forward matrix transform; B) PC set, forward matrix transform; C) T set, inverse matrix
transform; D) PC set, inverse matrix transform. Values of rod mismatch (triangles) and magnitudes of
failure of transformation of tristimulus space (circles). The value of the magnitude of failure of
transformation of tristimulus space for a stimulus is taken as the maximal magnitude of failure of its
tristimulus values.
The correspondence between the individual and group results points to the existence of a
common mechanism underlying the discrepancies. This is also supported by similarities between the
results of the present test and those of the proportionality test: in both, the discrepancies are in the
blue and green tristimulus values in the green-red region. However, the test of transformation of tristimulus
space does not lend itself to a straightforward interpretation, as it does not directly represent any
physiological mechanism operating in the visual system. Its significance is in testing the
application of the principles of additivity and not additivity per se; that is to say, it can point
to inconsistencies in our assumptions about the operation of the visual system, but does not help
to identify the particular reasons for these inconsistencies.
Unlike in the proportionality test, in the additivity test there is a clear difference between the results of
a single observer and those of the group mean (Table 3.3.2-2): the transformation of tristimulus
space mostly fails for observer B, and mostly holds for the mean of all observers. The reason for
this discrepancy lies in the statistical significance of the result in the two cases: the intra-observer spread
of matches is significantly smaller than the inter-observer one, making the discrepancies more
statistically significant in one case and less significant in the other. This, as well as the
ambiguity of the statistical test results, is illustrated in Figure 3.4.2-5.
[Figure 3.4.2-5 plots: panels A-1) to A-3), titled “461 B PC2T FM 6deg”, and B-1) to B-3), titled “461 ALL PC2T FM”; axes: pairs of the R, G and B tristimulus values.]

Figure 3.4.2-5. 95% confidence ellipses of 461 nm light measured by visual colour matching with PC
primaries, and predicted by transformation of tristimulus space from T into PC primaries.
A) Observer B; B) mean of all observers.
Column 1: RG plane; column 2: RB plane; column 3: GB plane. Thin line: measured; thick line:
predicted.
In each plot in Figure 3.4.2-5, one ellipse is constructed directly from the experimental data. The
other ellipse represents the data “predicted” by the model of transformation of tristimulus space
(Section 2.2.5), and is constructed from the covariance matrix estimated by the model of error
propagation (Section 2.4.7). The ellipse parameters in this case do not represent a spread of
experimental values, but serve only for evaluating the uncertainty of the result of the
transformation, and as an input to the statistical test. The degree of overlap between the two
ellipses in each plane corresponds to the degree to which the assumption of additivity, underlying
the transformation, is valid.
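For a purely linear transformation t′ = M t with an exactly known matrix, first-order error propagation gives the covariance of the result as M Σ Mᵀ. The sketch below shows only this textbook case; the model of Section 2.4.7 is not reproduced here and may contain further terms, for example for the uncertainty of the matrix elements themselves. The matrix, the covariance values and the function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def propagate_covariance(m, cov):
    """Covariance of M t when t has covariance `cov` and M is exact."""
    return mat_mul(mat_mul(m, cov), transpose(m))


# Hypothetical transformation matrix and measured covariance matrix.
M = [
    [0.90, 0.05, 0.02],
    [0.08, 0.93, 0.03],
    [0.02, 0.02, 0.95],
]
cov_measured = [
    [4.0e-6, 1.0e-6, 0.0],
    [1.0e-6, 2.0e-6, 0.0],
    [0.0, 0.0, 9.0e-6],
]

cov_predicted = propagate_covariance(M, cov_measured)
```

Any pair of rows and columns of the propagated matrix, for example those of R and G, gives the 2×2 covariance from which the “predicted” ellipse in that plane can be drawn.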
The spread of matches made by five observers is naturally larger than the spread of an individual
observer’s results. Hence, although the direction and the magnitude of the distances between the
centres of the ellipses are similar in both cases, the all-observers ellipses are significantly larger
and overlap to a greater degree. The task of the statistical test is to translate these differences in
overlap into a binary pass/fail decision. As we noted, the correspondence between the statistical
results and the plots is not straightforward. Nel and Van der Merwe’s test considers the two
distributions in the top row (Figure 3.4.2-5 A) to be statistically different, and the two in the bottom
row (Figure 3.4.2-5 B) to be statistically identical. However, the ellipses in the top row, although
overlapping to a lesser degree, still overlap in large part. Hence there is still a significant probability
that a sample could belong to both distributions. We are bound to conclude that the results of Nel and
Van der Merwe’s test must be taken with some scepticism.
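The effect of the ellipse size on the outcome can be illustrated with the two-sample T² statistic for unequal covariance matrices, T² = dᵀ(S₁/n₁ + S₂/n₂)⁻¹d, on which tests of this family are built. The sketch below is an illustration under stated assumptions, not the thesis code: it computes only T² and omits the approximate degrees of freedom with which Nel and Van der Merwe’s test refers T² to an F distribution. Treating the predicted distribution as a sample of size `n2` is an assumption, and all numbers and function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def t2_statistic(mean1, cov1, n1, mean2, cov2, n2):
    """T^2 = d^T (S1/n1 + S2/n2)^-1 d for two mean vectors."""
    d = [a - b for a, b in zip(mean1, mean2)]
    combined = [
        [c1 / n1 + c2 / n2 for c1, c2 in zip(r1, r2)] for r1, r2 in zip(cov1, cov2)
    ]
    w = solve(combined, d)
    return sum(di * wi for di, wi in zip(d, w))


def diag(v):
    """3x3 diagonal covariance matrix with variance v in each channel."""
    return [[v if i == j else 0.0 for j in range(3)] for i in range(3)]


# Same distance between centres, small versus large scatter (made-up values).
measured, predicted = [0.10, 0.80, -0.08], [0.09, 0.80, -0.08]
t2_single = t2_statistic(measured, diag(1e-4), 10, predicted, diag(1e-4), 10)
t2_group = t2_statistic(measured, diag(4e-4), 10, predicted, diag(4e-4), 10)
```

With the same offset between the centres, the larger covariance gives a smaller T², which is why a discrepancy that is significant for a single observer can become insignificant for the group.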
Our visual judgement of the ellipses’ overlap can be either qualitative (more overlap/less
overlap) or binary (overlap/no overlap); it is not sufficient in itself to disqualify the
statistical test. Moreover, the alternative tests performed more poorly in specialised tests (Christensen and
Rencher 1997). While we accept the statistical results, the scepticism remains.
3.4.2.3. Forward- and Inverse-Matrix methods of transformation of tristimulus space
The effect of the mathematical procedure on the results of the statistical test of additivity is
evident from Table 3.3.2-2: for most of the stimuli in the single-observer results, and for one stimulus in
the all-observers results, opposite results are obtained using the two methods with the same experimental
data. This makes an already difficult interpretation of the results almost impossible.
Figure 3.4.2-6 illustrates the effect of the transformation method on the 95% confidence ellipses
in the rg chromaticity diagrams of the PC and T tristimulus spaces, using the 661 nm
stimulus as an example. The ellipses were constructed using the covariance matrices transformed from the
corresponding RGB tristimulus spaces: for the “measured” ellipses the covariances were estimated
directly from the experimental data, and for the “IM” and “FM” ellipses the covariances were
estimated by means of the uncertainty propagation model. If the additivity assumption held
perfectly in our experiment, all three ellipses would coincide.
[Figure 3.4.2-6 plots: panels A) and B), 661 nm stimulus, 6° field; ellipses labelled IM, Measured and FM; abscissa: r; ordinate: g.]
Figure 3.4.2-6. Illustration of the statistical test of tristimulus space transformation.
A) T tristimulus space; B) PC tristimulus space. 95% confidence ellipses of 661 nm stimulus measured by
visual colour matching (thick lines) and transformed from another tristimulus space by inverse-matrix
(IM) and forward-matrix (FM) methods – as indicated by arrows. The plots are in primary-specific rg
chromaticity diagram. See text for details.
The choice of the method affects both the accuracy of the transformation, as shown by the
location of the centres of the “predicted” ellipses relative to the “measured” ones, and the
variances and covariances of the predicted values, as shown by the variations in the size and
orientation of the ellipses. As both the accuracy and the covariances have an effect on the result
of the statistical test for equality, the degree of overlap between the ellipses corresponds
to the results of the statistical test in Table 3.3.2-2.
3.5. Small field colour matching experiment
From the results of the colour matching experiment with the large field, we conclude that the
additivity failures are statistically significant in the results of an individual observer, and are mostly
statistically insignificant in the group results. We were able to show a correlation between the
additivity failures and rod participation, although we were not able to describe its mechanism. If
rods are indeed solely responsible for the observed discrepancies in additivity in the individual
observer’s data, then minimising rod participation should minimise the failure. One way to
achieve this is to conduct a colour matching experiment with a small bipartite field
projected onto the nearly rod-free central region of the retina. This defines the single task of the
second experiment:
− To test the validity of the additivity law for an individual observer in small-field colour
matching.
In order to obtain data fully compatible with the first experiment, it was desirable to
keep the experimental conditions the same as far as possible. Hence the observer, the
apparatus, the filters used to generate the primary and test lights, the telespectroradiometer, the
methods and the calculations were the same as in experiment 1. Minor modifications were made to the
visual colorimeter to reduce the field size; these modifications did not change any other
properties of the colorimeter. The test stimuli were 461 nm, 541 nm and 661 nm. Ten repetitions of
every match were made.
The completely rod-free region occupies the central 1.3° of the retina. However, a bipartite field
subtending 1.3° is very small, which leads to difficulties in obtaining reliable matches.
Historically, a 2° field was often used for small-field matching, and we adopted this practice.
3.5.1. Results: small field colour matching experiment
3.5.1.1. Variability of colour matching data
The intra-observer variability was evaluated in each tristimulus space separately, without
transforming the data to a common space. The numerical data in CV terms are given in Table
3.5.1-1. Plots of the variability in the 2° data are superimposed with those for the same observer with
the 6° field in Figure 3.5.1-1.
A) PC set                              B) T set
λ (nm)  r        g        b            λ (nm)  r        g        b
441     33.41%   21.47%   4.29%        451     23.14%   54.60%   11.58%
461     21.17%   5.97%    9.97%        461     18.92%   5.11%    9.54%
521     10.46%   2.98%    289.46%      530     11.17%   2.44%    30.65%
541     9.24%    2.64%    212.46%      541     9.08%    2.63%    36.65%
641     3.98%    3.43%    216.58%      603     2.03%    3.59%    1.91%
661     2.08%    4.04%    37.55%       661     3.01%    16.91%   0.50%
Mean    13.39%   6.75%    128.39%      Mean    11.23%   14.21%   15.14%
Table 3.5.1-1. Summary of the 2° experiment intra-observer variability.

A) PC primary set; B) T primary set. Values are in CV units, based on ten repetitions of every colour
match made by observer B.
[Figure 3.5.1-1 plots: panels A-1), A-2), B-1), B-2), C-1) and C-2); abscissa: λ, 440 to 640 nm; ordinate: CV.]
Figure 3.5.1-1. Comparison of intra-observer variability in large- and small-field experiments.
A-1, B-1, C-1: T set; A-2, B-2, C-2: PC set. A) – red tristimulus value; B) – green tristimulus value; C) –
blue tristimulus value. CV values for 2° data (solid lines), superimposed with ones for 6° (dashed line)
measured by the same observer.
3.5.1.2. Additivity test results
Table 3.5.1-2 contains the results of the statistical test for equality of the mean vectors measured by
observer B and those obtained by transformation of tristimulus space. The
magnitudes of failure are shown in Figure 3.5.1-2, where they are also compared with those of the 6°
field size experiment.
Stimulus   PC,B,2°,FM   PC,B,2°,IM   T,B,2°,FM   T,B,2°,IM   PC,B,6°,FM   PC,B,6°,IM   T,B,6°,FM   T,B,6°,IM
461 nm     √            −            √           √           −            −            −           −
541 nm     √            √            −           √           −            √            √           −
661 nm     −            −            −           −           −            −            √           −
Table 3.5.1-2. Results of the 2° additivity test.

Results of statistical test for equality of mean vectors of tristimulus values measured by visual colour
matching and calculated by transformation of tristimulus space. "−" signifies failure of additivity; "√"
signifies cases where there was no statistically-significant failure. In the column head, PC and T signify
the primary set; B signifies observer B; 2° or 6° signifies the field size; FM and IM stand for “Forward Matrix” and
“Inverse Matrix” transformations, respectively.
[Figure: eight bar-chart panels (A-1 to D-2) showing % mismatch for the 461, 541 and 661 nm stimuli; panel titles refer to Observer BO.]
Figure 3.5.1-2. Magnitude of additivity failure in results of observer B1.
A) PC set – forward matrix transform; B) PC set – inverse matrix transform; C) T set – forward matrix
transform; D) T set – inverse matrix transform. Column 1: 2° field; column 2: 6° field.
3.6. Discussion: small field colour matching experiment
The variability of small-field colour matching is considerably higher, by a factor of 2-3, than that of
large-field matching. This is a known effect attributed to poorer colour discrimination with small field
size (Brown 1952). The resulting high levels of uncertainty would be expected to render
additivity failures, if there are any, less statistically significant. However, Table
3.5.1-2 shows that additivity evidently fails in small-field colour matching at the same rate as it does
with the large field, but for different test colours and with somewhat different failure patterns
(Figure 3.5.1-2).
The only two “additive” transformations are for 461 nm transformed from the PC to the T set with the
forward matrix, and from the T to the PC set with the inverse matrix. The rest fail with different, but
invariably statistically significant, magnitudes. The patterns of failures, visualised as the height of
the bars in Figure 3.5.1-2, show no commonalities with the 6° transforms except for the 661 nm stimulus
transformed from PC to T by the forward matrix. Hence there is no reason to attribute the failures
to the same cause as we did in large-field matching, namely rod intrusion. We do not know which
other mechanisms operate in the visual system and cause non-linearity in colour
matching. Rods can still operate in our conditions since the field size is slightly
larger than the rod-free area, and also because the viewing in our colorimeter is binocular and
free, and the projected spot does not necessarily fall exactly in the centre of the retina. Properties
of rods can be slightly different in the central retina. Other causes, such as an unknown adaptation
mechanism (Zaidi 1986), are possible, although they have not previously been reported to exist in our
experimental conditions. Whatever the reason, the answer to the question at the base of this
experiment (are additivity failures significant for an individual observer in small-field
matching?) is positive.
3.7. Conclusion: the colour matching experiment
The results of the two colour matching experiments can be summarised as follows:
1. Variability:
a Physical variability accounts for 45%-70% of the intra-observer variability, thus
the variability of observer B's data is probably the lowest attainable in our
experimental conditions.
b Intra-observer variability accounts for 37%-55% of the inter-observer variability, and
depends on the choice of spectral position of the primary lights.
c The intra-observer variability is about 2-3 times higher in the 2° colour matching
experiment than in the 6° one.
d The inter-observer variability depends on the primary set, ranging from 3% to
85% depending on the spectral position of the stimulus and of the primary
lights.
e The relative variability is lowest where the CMF values are highest, and vice
versa.
f The variability within S&B colour matching dataset (Stiles and Burch 1959)
could be reproduced within a group of five observers.
g The CIE SDO does not predict the variability in our experiment; however, this
conclusion cannot be generalised because the SDO was not designed for the
conditions of a colour matching experiment.
2. Proportionality and additivity
a Proportionality test fails for the stimulus at 661 nm to a statistically significant
degree in both primary sets, for the individual as well as for the mean of all
observers. The failure can be attributed to rod participation; however the
underlying mechanism is not known.
b The test of transformation of tristimulus space fails for most of the stimuli in the
results of the individual observer in small and large field matching, but …
c It does hold in most of the cases in mean results of all observers.
d There are indications that the result of multivariate statistical evaluation of colour
matching results can be ambiguous.
e The choice of the mathematical procedure of transformation of tristimulus
space – forward- or inverse-matrix method – affects the results of the statistical
test.
Perhaps the most important outcome of this experiment is the conclusion that variability in S&B
data set was reproduced even within a small group of observers. The immediate implication of
this conclusion is that we may not need any more colour matching data to assess the degree of
observer metamerism. We suggest that S&B dataset can be used for the purposes of modelling
the psychophysical uncertainty of colour matching that is associated with inter-individual
variations of CMF.
Intra-observer variability was also very similar to that in the S&B experiment. The finding that
the intra-observer variability is only slightly larger than the associated instrumental uncertainties
allows us to conclude that observers make very reliable matches. This allows the number
of individual repeated matches to be limited in future experiments.
The results of the proportionality and of the additivity tests largely confirm the previous
publications on the subject. In the present experimental conditions, the failures are likely to be
due to rod intrusion; this is confirmed by our analysis at least for large field matching. Failures
of a kind reported by (Stiles 1963; Crawford 1965; Zaidi 1986) occur when broadband light is
matched in colour by a mixture of narrow-band lights; they are not known to occur in our
conditions.
The outcome that the result of the additivity test depends on the calculation procedure is not
in itself evidence of additivity failure. Both methods of transformation of tristimulus space are
equally valid mathematically, and should lead to identical results if colour matching operated
strictly according to the Trichromatic Generalisation. However, this outcome illustrates how
much the conclusions can be affected by the choice of the calculation procedure. One such
conclusion, for example, was made by Thornton (Thornton 1992b), and confirmed by ourselves
(Oicherman et al. 2005): that the transformation from the PC tristimulus space to any other set
of primaries produces more accurate results than the transformation in the opposite direction. It
is clear now that choosing a different transformation method could lead to an entirely different
conclusion.
It is important to note that the results from the two methods are not really contradictory in a
basic sense; they just appear to be so. Valid conclusions can only be drawn with a proper
analysis of uncertainty. No experiment in the real world can prove that additivity holds exactly
but rather only that it holds – or does not hold – within the uncertainty of the method. Different
methods can raise or lower the "bar" and therefore can appear to give different results. If
uncertainty is not considered at all, it is impossible to decide whether the conclusions are valid.
3.7.1. Uncertainty of colour matching: what is next?
Both phenomena that are the subject of this research, the individual variability of CMF and
additivity failures, seem to undermine the validity of the system of colorimetry. If the CMF vary
between observers, how is accurate prediction of a metameric match possible? And if the basic
principle of the mathematical construct of colorimetry is not valid, how is its successful
implementation possible?
However, the fact is that the CIE colorimetry, which is based on the concept of Standard
Colorimetric Observer and on Grassmann’s assumption of additivity, is successfully used and
implemented in industry. One possible interpretation of this discrepancy between the theory and
practice could be that the success of CIE colorimetry is determined by its suitability for the
entire population of observers with normal colour vision, rather than for an individual observer.
The concept of additivity in this sense is not different from the concept of the Standard
Colorimetric Observer itself – which does not represent any individual observer, but the mean
of all.
Therefore, a valid research question seems to be not whether additivity failures, or observer
metamerism, exist, but – what are the practical implications of these phenomena? Do they have
any practical importance, or do they become insignificant within the uncertainties of industrial
conditions? What are the applications in which these errors can be pronounced? Would the
errors introduced by additivity failures and observer metamerism result in colour differences
detectable by observers in real industrial conditions, and how can these differences be
estimated?
These are the questions that we aim to answer in the second part of our research.
4. Experiment 2: Cross-media colour matching experiment
4.1. Introduction*
4.1.1. Rationale and research question
Colour Proofing, or just proofing, is the term used in the graphic arts industry for simulation of
the output of some colour reproduction device on another device. Usually, the simulated device
is a printing press: offset, gravure or flexographic; and the simulating device is a high-quality
analogue or digital printer. With the introduction of personal computers into the graphic arts
production and design, improvement of quality and reliability of display technologies, and
development of colour management technologies and standards, it became customary to
perform the simulation on the computer display. In the graphic arts jargon, this process is
known as soft-proofing.
In a typical soft-proofing setup, a computer display and a viewing cabinet are positioned next to
each other. The light source in the viewing cabinet simulates some standard CIE illuminant –
usually D50 or D65. The “original” reflective sample is placed in the cabinet, and its
reproduction is shown on the display. The task of the operator is to match the colour of the
reproduction to the original, or simply to evaluate the degree of colour match between the two.
The physical properties of the colorants used to generate colours on the computer display and on
the print are different. The printed and the monitor colours can be made to match visually;
however, the spectral power distribution (SPD) of the lights reflected from the sample and
emitted by the monitor would always be different. In other words – any colour match between
the monitors and the reflective sample is essentially metameric. The term adopted for this type
of colour matching process is cross-media colour matching.
* The author wishes to thank ICI Paints for their kind support in this part of the research.
When the operator of a soft-proofing workstation establishes a match between the monitor and
the print, she performs an asymmetric metameric colour match. Her task is to equate the
sensations of colour triggered by the two stimuli, although these sensations do not originate from
the same photoreceptors, or even from the same retinal regions; hence the term asymmetric
(Section 2.2.6.4). If this process is governed by the basic principles of colour matching – the
principle of cone-quantal metameric match (Section 2.2.2) and the law of additivity (Section
2.2.3) – then the cross-media colour matching can be modelled by application of basic
colorimetry:
− The colour match can be predicted from the knowledge of observer’s colour matching
functions and of the spectral power distributions of the stimuli (Section 2.2.7)
− The variation of colour matches made by different colour-normal observers can be
predicted from knowledge of variation of their colour matching functions (Section 2.8).
These indeed are the working assumptions in the graphic arts and in digital imaging. If the basic
colour matching principles fail when applied to cross-media colour matching, then the
knowledge of spectral power distributions and of colour matching functions alone is not
sufficient. In this case the modelling must be carried out using the tools of advanced
colorimetry, based on the knowledge of higher order colour vision mechanisms. Thus our
enquiry can be viewed in a global context of identification of demarcation lines between the
application of basic and advanced colorimetry. This is the rationale underlying this part of our
research.
In his review of colorimetry, Wyszecki writes (Wyszecki 1973):
Fundamental research in basic colorimetry is concerned with the limitations of
the colour-matching laws…
Two such limitations were known to scientists at the time of his writing: the observer
metamerism and failures of colorimetric additivity. In the first part of this research, we dealt
with reproducing and gaining hands-on understanding of these effects in conditions of classical
maximum saturation colour matching experiment. We reported two major outcomes:
1. The individual variability within S&B (Stiles and Burch 1959) colour matching dataset
is reproducible to high degree even within a relatively small group of observers
2. In the conditions of colour matching experiment, the failures of colorimetric additivity
are statistically significant in results of individual observer.
Based on these outcomes, as well as on the review of the available literature from the last six
decades, we arrive at the same conclusion as Wyszecki (ibid):
What is needed is more work … with regard to the significance of such failures to
practical colorimetry.
This comment was made with regard to additivity failures; it is valid for observer
metamerism as well. In the second part of our research, we choose, somewhat arbitrarily, a
particular industrial colour matching application: soft-proofing. We aim to construct a
laboratory setup which simulates this application, and to carry out a colour matching experiment
which would give us an understanding of the practical implications of the effects in question, at least
for this particular case. We aim to fill in the gap between the theoretical and the practical
knowledge, and to answer the following questions:
1. Are individual variations in matches statistically significant and do they have practical
consequences?
2. Do additivity failures have practical consequences?
3. If the answer to either question is positive, how can these consequences be
modelled and accounted for in practical colorimetry?
4.1.2. Summary of results
Eleven observers made repeated colour matches between LCD and CRT monitors and paint
samples in the viewing conditions similar to those of soft-proofing. The matches were used to
evaluate the practical significance of observer metamerism and of failure of colorimetric
additivity in cross-media colour matching.
Individual variations in matches are of magnitudes that are expected to have practical
consequences in graphic arts applications; they cannot be explained by observer metamerism
and thus cannot be modelled from individual variability of CMF, i.e. by the Standard Deviate
Observer. On the other hand, these variations are modelled well by the CIEDE2000 colour
difference formula. We conclude that the variability of cross-media colour matches is governed
by mechanisms of colour discrimination, while the effect of observer metamerism is significant
only in neutral colours.
Failures of colorimetric additivity lead to systematic disagreements between the matches made
by observers and the ones predicted by the CIE 1964 Standard Colorimetric Observer. The
discrepancies are consistent with all the reports on the subject, but have never been confirmed to
exist in practical colorimetry. We attribute the failures to the post-receptoral adaptation mainly
in blue-yellow chromatic channel. The colorimetric discrepancies can be compensated for by a
suitably designed adaptation transform.
We conclude that additivity failure is a significant contributor to discrepancies in cross-media
colour matching and needs to be accounted for in colour management systems. The practical
implications of inter-individual variability which is not the result of observer metamerism remain
unclear.
4.2. Experimental
4.2.1. Experimental setup
4.2.1.1. Background
In planning an experiment which aims to simulate conditions relevant to the industry, the design
should consider two almost contradicting requirements:
1. The conditions should allow for the uncertainties which are inherent to the application
2. The uncertainties in the results should be kept at levels which allow meaningful
analysis.
In our case, the industrial application we aim to simulate is soft-proofing. In principle,
the only difference between the colour matching conditions with a bipartite field and in the
soft-proofing setup is the spatial separation of the stimuli. The visual colorimeter facilitates
quasi-symmetric colour matching in which the lights being matched fall on adjacent or the same
photoreceptors in the retina. In soft-proofing, the matching is highly asymmetric: the stimuli are
usually spatially separated to an extent that does not allow simultaneous viewing, and requires
turning the head to move the gaze from one to the other (this type of matching has also been termed
“short-memory colour matching” (Braun and Fairchild 1997)). It seems reasonable to assume
that this asymmetry is the principal contributor to the uncertainty in such conditions as far as the
operation of the human visual system is concerned. Additional sources of uncertainty are the
environmental and the viewing conditions: the ambient illumination, sizes and complexity of the
stimuli, varying viewing distances and angles, and others. In the design of our setup and
procedures, we attempted to maintain the sources of uncertainty inherent to the application (the
asymmetry, type of equipment and viewing setup) while minimising the environmental
uncertainties. Thus, we use commercially available industry-standard computer displays as
stimuli generators and a standard graphic arts viewing cabinet, and we arrange these as is accepted by
graphic arts practitioners. At the same time, we conduct the experiment in conditions that are as
strictly controlled as possible, although we realise that such conditions are often not attainable in
industry.
As it was not known how additivity failure and observer metamerism would manifest in our conditions,
the experiment was aimed at collecting colour matching data for later analysis,
with no a priori hypothesis.
4.2.1.2. Viewing cabinet
The viewing cabinet was a VeriVide DTP-60. The inner surface of the cabinet was covered with
black velvet, and the luminance was monitored by a luminance meter affixed to the cabinet’s floor. The
SPD and the corresponding CIE 1964 XYZ tristimulus values of the cabinet illuminant were
derived as follows. A highly-diffusive white plaque of known reflectance was positioned in the
cabinet, and the light reflected from its surface was measured by the TSR. The plaque
reflectance, measured by the spectrophotometer for the 380-750 nm range in 10 nm intervals, was
interpolated to the 380-780 nm range in 1 nm intervals to match the range of the TSR (Figure
4.2.1-1). The illuminant SPD was calculated using the following formula:
\[ I_0(\lambda) = \frac{S_w(\lambda)}{R_w(\lambda)} \qquad (4.2.1) \]
where I0(λ) is the calculated illuminant, Sw(λ) is the SPD of the white plaque as measured in the
cabinet, and Rw(λ) is the reflectance of the white plaque measured by the spectrophotometer.
The resulting SPD is shown in Figure 4.2.1-2.
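The calculation of Eq. (4.2.1) can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the experiment: piecewise-linear interpolation is used here as a simple stand-in for the spline interpolation mentioned in the caption of Figure 4.2.1-1, the reflectance beyond 750 nm is simply held at its last measured value, reflectance is expressed as a fraction, and both input curves are synthetic. All function names are hypothetical.

```python
def interpolate_linear(xs, ys, x):
    """Piecewise-linear interpolation; holds the end values outside the data range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])


def illuminant_spd(wavelengths, plaque_spd, refl_wavelengths, reflectance):
    """Eq. (4.2.1): I0 = Sw / Rw, with Rw resampled onto the SPD wavelengths."""
    return [
        s / interpolate_linear(refl_wavelengths, reflectance, wl)
        for wl, s in zip(wavelengths, plaque_spd)
    ]


def normalise_at(wavelengths, spd, reference_wl=550, target=100.0):
    """Scale the curve so that it equals `target` at `reference_wl`."""
    scale = target / spd[wavelengths.index(reference_wl)]
    return [value * scale for value in spd]


# Synthetic inputs (not measured data): reflectance at 10 nm steps over 380-750 nm,
# plaque SPD at 1 nm steps over 380-780 nm.
refl_wl = list(range(380, 751, 10))
refl = [0.80 + 0.0003 * (wl - 380) for wl in refl_wl]
tsr_wl = list(range(380, 781))
plaque_spd = [1.0 + 0.002 * (wl - 380) for wl in tsr_wl]

i0 = normalise_at(tsr_wl, illuminant_spd(tsr_wl, plaque_spd, refl_wl, refl))
print(f"I0(380) = {i0[0]:.2f}, I0(550) = {i0[tsr_wl.index(550)]:.2f}")
```

The final normalisation to 100 at 550 nm follows the convention stated in the caption of Figure 4.2.1-2.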
[Figure: relative reflectance (75-95) plotted against λ (380-780 nm).]
Figure 4.2.1-1. Spectral reflectance function of the white plaque used to calculate the cabinet illuminant
SPD.
Thick black curve: reflectance measured at 10 nm intervals in 380-750 nm range; thin grey curve: 1 nm
380-780 nm range curve calculated from the 10 nm reflectance by spline interpolation.
[Figure: relative power (0-350) plotted against λ (380-780 nm).]
Figure 4.2.1-2. Relative spectral power distribution of the light reflected by the white plaque in the
viewing cabinet, superimposed with the calculated cabinet illuminant.
Plaque reflectance: dashed line; cabinet illuminant calculated by Eq. (4.2.1): solid line. Both curves are
normalised to have value of 100 at 550 nm.
The XYZ tristimulus values of the cabinet illuminant are 93.7, 100, 102.5 for X, Y and Z
respectively, which correspond to correlated colour temperature (CCT) of approximately 6000K
(Figure 4.2.1-3).
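As a small worked check, the chromaticity coordinates plotted in Figure 4.2.1-3 follow directly from the tristimulus values quoted above. The sketch below uses only the standard definition x = X/(X+Y+Z), y = Y/(X+Y+Z); the function name is hypothetical, and no CCT is computed, since that would require the Planckian locus for the CIE 1964 observer.

```python
def chromaticity(X, Y, Z):
    """Chromaticity coordinates (x, y) from tristimulus values."""
    total = X + Y + Z
    return X / total, Y / total


# Tristimulus values of the cabinet illuminant as quoted in the text.
x, y = chromaticity(93.7, 100.0, 102.5)
print(f"x = {x:.4f}, y = {y:.4f}")  # prints "x = 0.3163, y = 0.3376"
```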
[Figure: y (0.324-0.339) plotted against x (0.306-0.331); marked points: 5500K, 6000K, 6500K and the booth illuminant.]
Figure 4.2.1-3. The plot of the CIE 1964 xy chromaticity of the viewing cabinet illuminant.
The line indicates the locus of Planckian radiators, with correlated colour temperature values marked as data
points.
4.2.1.3. Computer monitors
A cathode ray tube (CRT) and a liquid crystal display (LCD) (Hunt 2004) (p. 376) were used. In
the CRT display, light is produced when a ray of accelerated electrons strikes the screen coated
with phosphor. Colour is generated by three kinds of phosphors emitting light in red, green and
blue regions of the visible spectrum. Thus the spectral power distribution of the light emitted by
the CRT is the sum of the three SPDs of the phosphors.
In LCD displays the light is emitted by a continuously operating light source, usually fluorescent
or LED. Liquid crystals are used to control, by means of polarisation, the amount of light
passing through three types of filters corresponding to red, green and blue. Thus the SPD of the
light emitted by the LCD is a function of the SPD of the light source and of the transmittance
of the filters.
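The two spectral models described above can be illustrated with a toy calculation. The sketch below is an idealisation and not a characterisation of the actual displays: it assumes that the CRT output is a weighted sum of the phosphor SPDs, and that the LCD output is the backlight SPD multiplied by the opening-weighted sum of the filter transmittances. The three-sample "spectra", the channel weights and the function names are all hypothetical.

```python
def crt_spd(phosphor_spds, weights):
    """CRT model: sum of the phosphor SPDs, each scaled by its channel weight."""
    n = len(phosphor_spds[0])
    return [sum(w * spd[i] for w, spd in zip(weights, phosphor_spds)) for i in range(n)]


def lcd_spd(backlight, filter_transmittances, openings):
    """Idealised LCD model: backlight SPD times the opening-weighted filter transmittance."""
    n = len(backlight)
    return [
        backlight[i] * sum(o * t[i] for o, t in zip(openings, filter_transmittances))
        for i in range(n)
    ]


# Toy three-sample spectra (short, middle and long wavelengths); not measured data.
red, green, blue = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]
crt = crt_spd([red, green, blue], weights=(0.5, 1.0, 0.25))

backlight = [2.0, 2.0, 2.0]
lcd = lcd_spd(backlight, [red, green, blue], openings=(1.0, 0.5, 0.0))
print(crt, lcd)
```

Because the two devices build their spectra from different components, equal tristimulus values on the two displays generally correspond to different SPDs.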
Due to the technological differences described above, the SPDs of the lights emitted by the primaries of the
two displays differ; therefore any colour match between them would be metameric. However,
the filters of the LCD display were designed to approximately match the chromaticities of the CRT,
so the gamuts of the two are very similar. Spectral and gamut properties of the displays are
illustrated in Figure 4.2.1-4 and Figure 4.2.1-5.
[Figure: two panels, A and B, plotting radiance (w/sr/m2, 0.000-0.012) against λ (380-780 nm).]
Figure 4.2.1-4. Spectral power distribution functions of monitors’ primaries
A) LaCIE 321 (LCD); B) LaCIE BlueEye IV (CRT)
[Figure: chromaticity diagram, y (0-0.8) plotted against x (0-1).]
Figure 4.2.1-5. Chromaticity coordinates of the primaries and of the secondaries of both displays.
CRT (dashed line) and LCD (solid line).
The displays were manufactured by the LaCIE company. Their model details and basic technical
characteristics are given in Table 4.2.1-1; they are referred to as CRT and LCD in the
remaining parts of this report. Both displays were driven from the same computer equipped with
a dual video card. Both were equipped with hoods to minimise reflections of stray light from the
displays’ surfaces.
Display model             Size     Resolution used in the experiment
LaCIE BlueEye IV (CRT)    20”      1280x1024 pixels
LaCIE 321 (LCD)           21.3”    1600x1200 pixels
Table 4.2.1-1. Basic characteristics of the displays used in the experiment.
4.2.1.4. Reference spectroradiometer
The same Minolta CS-1000 telespectroradiometer as in the colour matching experiment was
used. Its technical characteristics and calibration method are described in Section 3.2.2.
4.2.1.5. Test stimuli
Ten paint samples were selected from a large colour library provided by ICI Paints company
according to two criteria:
− colours are within the gamut of both monitors;
− colours span the monitors' gamut in an approximately uniform manner.
There were two achromatic samples and eight chromatic ones, chosen so they span the CIELAB
a*b* plane in an approximately uniform manner, in hue angle steps of approximately 45°. The
CIELAB coordinates of the samples calculated for cabinet illuminant and CIE 1964 Standard
Colorimetric Observer are given in Table 4.2.1-2; their a*b* projection is shown in Figure
4.2.1-6. The spectral power distributions of the lights reflected by the samples in the viewing
cabinet are shown in Figure 4.2.1-7.
Sample         L*        a*         b*
White          90.02     -0.03      0.23
Grey           62.98     0.23       -1.24
Yellow         79.28     -2.62      77.53
Brown          39.78     26.94      32.27
Magenta        49.50     48.32      5.11
Purple         50.89     18.64      -13.88
Blue           40.40     2.87       -43.66
Cyan           55.63     -10.21     -15.69
Dark green     46.37     -13.53     12.31
Light green    59.24     -26.18     17.75
Background     63.322    -1.1746    1.3918
Table 4.2.1-2. CIELAB coordinates of the paint samples used in the experiment as test stimuli.
Calculated from spectroradiometric measurements taken in the cabinet, for cabinet illuminant and 1964
CIE Standard Colorimetric Observer.
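The distribution of the chromatic samples in the a*b* plane can be inspected by converting the tabulated coordinates to CIELAB chroma and hue angle. The sketch below uses the standard definitions C*ab = sqrt(a*² + b*²) and hab = atan2(b*, a*); the a* and b* values are copied from Table 4.2.1-2, while the function and variable names are hypothetical.

```python
import math

# a*, b* of the eight chromatic samples, copied from Table 4.2.1-2.
samples = {
    "Yellow": (-2.62, 77.53),
    "Brown": (26.94, 32.27),
    "Magenta": (48.32, 5.11),
    "Purple": (18.64, -13.88),
    "Blue": (2.87, -43.66),
    "Cyan": (-10.21, -15.69),
    "Dark green": (-13.53, 12.31),
    "Light green": (-26.18, 17.75),
}


def chroma_and_hue(a, b):
    """CIELAB chroma and hue angle (degrees, in the range 0-360)."""
    return math.hypot(a, b), math.degrees(math.atan2(b, a)) % 360.0


polar = {name: chroma_and_hue(a, b) for name, (a, b) in samples.items()}

# List the samples in order of increasing hue angle.
for name, (chroma, hue) in sorted(polar.items(), key=lambda item: item[1][1]):
    print(f"{name:12s} C*ab = {chroma:6.2f}   hab = {hue:6.1f}°")
```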
[Figure: b* (-80 to 80) plotted against a* (-60 to 60).]
Figure 4.2.1-6. a*b* projection of the CIELAB coordinates of the paint samples.
[Figure: ten panels (white, grey, yellow, brown, magenta, purple, blue, cyan, dark green, light green), each plotting radiance (w/sr/m2) against λ (380-780 nm).]
Figure 4.2.1-7. SPD of the light reflected by the paint samples in the viewing cabinet.
Test colour is indicated at the top of each graph.
4.2.1.6. Setup
The monitors were positioned on both sides of the viewing cabinet, so that the surface planes of each
monitor and of the cabinet were at angles of approximately 45° to each other. This arrangement
was chosen because it corresponds to the industrial practice of soft-proofing setup.* The
distance between the plane of the observer’s eyes and the monitor and cabinet planes was approximately
80 cm. It was not possible for the observer to see both monitors simultaneously; however, when
looking at one of the monitors she could see the cabinet with peripheral vision, and vice
versa. The setup of the experiment is illustrated schematically in Figure 4.2.1-8, and in images
in Figure 4.2.1-9. Observers wore black shirts during the experiment in order to avoid
reflections of their clothing from the monitors’ surfaces. No light sources operated in the
laboratory apart from the two displays and the viewing cabinet.
All stimuli (on the monitors and the paint sample) subtended a viewing angle of 6° at the
experimental viewing distance, and were surrounded by a grey background subtending
approximately 60°×40°. The maximum luminance of both monitors was set to 120 cd/m2. The
viewing cabinet was adjusted so that the luminance of the white paint sample was approximately
110 cd/m2. The difference between the maximum display luminance and the viewing booth luminance
was set so as to allow for some “extra room” in lightness matching of the white patch, to avoid
working at the maximum attainable luminance.
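For orientation, the physical size of a stimulus subtending 6° at the stated viewing distance can be estimated from simple geometry, size = 2·d·tan(θ/2). The sketch below assumes a flat stimulus viewed perpendicularly and centred on the line of sight, and takes the approximate distance of 80 cm at face value; the function name is hypothetical.

```python
import math


def subtended_size(distance_cm, angle_deg):
    """Linear size of a flat, centred stimulus subtending `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


patch_cm = subtended_size(80.0, 6.0)
print(f"{patch_cm:.1f} cm")  # prints "8.4 cm"
```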
* In fact, there are a number of industrial practices for soft-proofing setup; this one was chosen based on the author’s
personal experience, according to which the 45° arrangement provides the most convenient environment for
monitor – viewing cabinet comparisons.
[Figure: schematic with labels: lighting booth, test stimulus, CRT display, LCD display, observer.]
Figure 4.2.1-8. Scheme of the experimental setup
[Figure: two photographs, panels A and B.]
Figure 4.2.1-9. Images of the experimental setup.

A) Overview of the setup. The observer controls the colour of the display by rotating the wheel of the
mouse held in his right hand on the knee. The TSR is not in the picture; it is positioned on a tripod just
behind the observer.
B) The setup in experimental conditions, when no light sources except the monitors and the cabinet are
operating in the lab. The picture shows the final stage of the match, when the colours of the patches on
the monitors match the patch in the cabinet (see text) for a given observer.
4.2.1.7. Colour matching procedure
The colour matching was facilitated by a specially developed software utility, the “Digital Visual
Colorimeter” (DVC). This utility provided the visual stimulus, the means of its colour control,
the user interface, and the facility of saving the results of each observation to a file and loading
the results from the file for the purpose of measurement.
The initial position from which the observers began the match was always black, i.e. [R,G,B] =
[0,0,0]. Observers controlled the colour by rotating the mouse wheel, in one of the two modes
(Zhang and Montag 2004):
1. CIELAB lightness (L*), chroma (C*ab) and hue (hab) mode was used to match the
chromatic stimuli, i.e. all the colours except grey and white.
2. CIELAB lightness (L*), a* and b* mode was used for matching the achromatic stimuli and
the background.
Observers could switch between the modes of matching at any time by pressing a button on the
graphic user interface (Figure 4.2.1-10). In addition, they had control over the “speed” of
matching by choosing one of three options: fast, medium and slow. The fast speed meant
that the change in colour per mouse wheel click was large; this mode was used for the
initial crude match. The medium speed allowed finer control of the match, while the slowest one
allowed for final tuning, operating at the level of single digital RGB counts.
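The logic of the two control modes can be sketched as follows. This is not the DVC code, which is not reproduced in this thesis: the step sizes per wheel click are hypothetical, the adjustment is expressed in CIELAB units rather than in digital RGB counts, and the conversion from CIELAB to display RGB is omitted. All names are invented for the illustration.

```python
import math

# Hypothetical step per mouse-wheel click (CIELAB units, or degrees for hue).
STEP = {"fast": 5.0, "medium": 1.0, "slow": 0.1}


def lab_to_lch(L, a, b):
    """CIELAB rectangular to cylindrical coordinates (hue in degrees, 0-360)."""
    return L, math.hypot(a, b), math.degrees(math.atan2(b, a)) % 360.0


def lch_to_lab(L, C, h_deg):
    """CIELAB cylindrical to rectangular coordinates."""
    h = math.radians(h_deg)
    return L, C * math.cos(h), C * math.sin(h)


def adjust(lab, mode, dimension, clicks, speed):
    """Return the CIELAB colour after `clicks` wheel clicks along one dimension."""
    delta = clicks * STEP[speed]
    L, a, b = lab
    if mode == "Lab":
        if dimension not in ("L", "a", "b"):
            raise ValueError("dimension must be 'L', 'a' or 'b' in Lab mode")
        L += delta if dimension == "L" else 0.0
        a += delta if dimension == "a" else 0.0
        b += delta if dimension == "b" else 0.0
        return min(max(L, 0.0), 100.0), a, b
    if mode == "LCh":
        if dimension not in ("L", "C", "h"):
            raise ValueError("dimension must be 'L', 'C' or 'h' in LCh mode")
        L, C, h = lab_to_lch(L, a, b)
        L = min(max(L + (delta if dimension == "L" else 0.0), 0.0), 100.0)
        C = max(C + (delta if dimension == "C" else 0.0), 0.0)  # chroma cannot be negative
        h = (h + (delta if dimension == "h" else 0.0)) % 360.0  # hue wraps around
        return lch_to_lab(L, C, h)
    raise ValueError("mode must be 'Lab' or 'LCh'")


# One fast click along chroma doubles the chroma of (50, 3, 4) from 5 to 10.
print(adjust((50.0, 3.0, 4.0), "LCh", "C", 1, "fast"))
```

Separating hue and chroma in this way keeps each wheel movement aligned with a single perceptual attribute for chromatic stimuli, whereas near the neutral axis, where hue is poorly defined, the a* and b* controls are the more natural choice.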
[Figure: two screenshots, panels A and B.]
Figure 4.2.1-10. The graphic user interface of the DVC program.
A) With L*C*h* mode enabled; B) with L*a*b* mode enabled.
The top five buttons set the dimension of colour control (i.e. L*a*b* or L*C*h*). The middle three
buttons set the “speed” of the matching. The bottom three buttons set full screen viewing mode (makes
the menu invisible, normal experimental mode); toggle a*b* or C*h* matching modes, and reset the
match.
The matching was done in two stages (Figure 4.2.1-11):

1. Matching the background. At the beginning of their first session, observers adjusted the
background on both monitors to match in colour the grey background in the viewing
cabinet. At the time of this adjustment, a white paint patch was placed in the cabinet.
Observers adjusted both the patch and the grey background simultaneously, by
alternating between the two until the entire image on the display looked identical to the one
in the cabinet. This iterative double-adjustment procedure was necessary in order to
eliminate a possible effect of appearance induction between the background and the
patch. At the beginning of each of the following sessions, observers verified that the
backgrounds on the monitors still matched the one in the cabinet, and made appropriate
adjustments if they did not.
2. Establishing the colour match. Each of the test stimuli in turn was placed in the viewing
cabinet. Observers altered the colour of the central patch on each of the monitors to
match in colour the test stimulus in the viewing cabinet.
The first stage of the background matching may seem somewhat unusual a procedure: the
common approach to standardise the viewing conditions would be to set the background colours
on both monitors to have the same CIE XYZ coordinates as one in the cabinet, and to present
this background to all the observers. However, since one of the goals of this experiment is to
evaluate the significance of observer metamerism in the current setup, it must be remembered
that each observer differs to various degrees from the CIE Standard Colorimetric Observer. The
colour of the monitor background which matches the one in the cabinet for the Standard
Colorimetric Observer (i.e. both having the same CIE XYZ values) will not necessarily match for
a real observer, and we have no means of evaluating the perceptual magnitude of such a
mismatch. Thus, if the common procedure were implemented, the backgrounds would be
expected to mismatch for some or all observers, rendering the matches unreliable.
Figure 4.2.1-11. The stimulus generated by DVC software:
A) Initial setup at the beginning of the first session of each observer; the background and the patch are
black, only controls are visible
B) After the simultaneous background-white adjustment made at the beginning of the first session;
C) Initial position before the colour matching: the background is set to values adjusted in B), and the
patch is reset to initial adjustment point [R,G,B] = [0,0,0];
D) The match is established.
This procedure aims to achieve individualised standardisation of the conditions. Our assumption
is that, after the procedure is carried out, the backgrounds of both monitors will visually match
that in the cabinet for each individual observer, and not for the standard one. In other words,
we do not know what colour the observer sees as the background colour of the monitor, but we
know that this colour is identical to the background in the cabinet and on the other monitor.
In the second stage, the observers established metameric matches between the patch on the
monitor and the test stimulus in the cabinet. Matching between both monitors and the paint
sample was established simultaneously. The observers were instructed to begin the matching
with the LCD monitor; this choice is arbitrary, with the sole purpose of standardising the
conditions for all the observers. Once the match between the LCD and the viewing cabinet was
established, observers turned to the CRT monitor and adjusted its colour to match the same
paint sample. The observer kept on iterating between the two monitors until all three stimuli –
the two monitors and the paint sample – looked identical. This process was repeated for each of
the ten test stimuli within each session. The first match in every session was white, during
which the background colour was verified as well. The remaining nine stimuli were randomised.
Each session included ten matches. The duration of the match was not limited; an average session
lasted about one hour. Radiometric measurements of the matches made on both monitors
were taken upon completion of the observation sessions on the same day. The paint samples in
the viewing cabinet were monitored on a daily basis – also by radiometric measurements.
Hence, all the results reported herein are based on the direct measurements of the stimuli, and
are independent of characterisation, calibration state and signal bit rate of the monitors. These
radiometric data were used to calculate the CIE 1964 XYZ and CIELAB values.
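The last step, from tristimulus values to CIELAB, follows the standard CIE formulae and can be sketched as below. The reference white is an assumption: the D65/10° values are used here only because the display measurements in Section 4.3 are quoted for D65/10°; the white actually used for the paint samples is not stated in this section. The spectral integration from radiometric data to XYZ is omitted.

```python
# Assumed reference white: CIE D65 for the 1964 (10 degree) observer.
D65_10DEG_WHITE = (94.811, 100.0, 107.304)

def xyz_to_lab(xyz, white=D65_10DEG_WHITE):
    """Convert CIE XYZ tristimulus values to CIELAB (L*, a*, b*)."""
    delta = 6.0 / 29.0

    def f(t):
        # Cube root above the CIE threshold, linear segment below it.
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = (f(value / ref) for value, ref in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))
```

A stimulus with the same XYZ as the reference white maps to L* = 100, a* = b* = 0, which is a quick sanity check for any choice of white.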
4.2.1.8. Observers
Eleven observers took part in the experiment, eight males and three females, aged 32 years on
average, all colour science postgraduate students experienced in making colour judgements and
in performing psychophysical tasks. All were screened for colour vision deficiencies by Ishihara
pseudoisochromatic plates (Ishihara), Farnsworth-Munsell 100 Hue Test (Farnsworth 1943),
D&H Colour Rule (Kaiser and Hemmendinger 1980) and a device similar to it – Munsell
Matchpoint rule. All the tests were done in the same viewing cabinet as was used in the colour
matching experiment.
In the Ishihara test and in the Farnsworth-Munsell 100 Hue Test all observers performed well; none
showed any indication of colour vision anomaly. However, the D&H and Munsell
Matchpoint rules revealed some differences (Figure 4.2.1-12). Three observers consistently
stood out of the group, and one of them, S2, showed results particularly far from the average.
Pobboravsky (Pobboravsky 1988) reported an observation that the D&H rule is the most
sensitive of the available hardcopy tests for colour vision anomalies.
[Plots: A) D&H rule results; B) Munsell Matchpoint rule results. Labelled points: S2, H1, D1.]
Figure 4.2.1-12. Plots of the results of tests by metameric rules
A) D&H; B) Munsell Matchpoint.
Labelled datapoints mark the observations which are far from the group mean.
Since the Ishihara and Farnsworth-Munsell 100 Hue Test results did not show any indication of
colour deficiency for any of the observers, we do not exclude any observer’s data from the
analysis. However, the three observers, and especially observer S2, appear to be anomalous. In
Pobboravsky’s experiment, observers who showed anomalies in D&H rule judgments
tended to disagree with the rest of the group on their cross-media colour matches. We will
analyse the experimental results for indications of similar disagreements (Section 4.5.4.1).
4.3. Setup performance evaluation
4.3.1. Repeatability
4.3.1.1. Short-term repeatability
Short-term repeatability characterises the ability of the display to reproduce the same colour
over a period of several minutes. Twenty consecutive measurements of a medium grey patch
were taken at intervals of 1 minute. The CIELAB (D65/10°) values (L*, a*, b*) of the patch were
49.5, -0.7, -1.6 for the LCD and 51.3, -0.2, 1.0 for the CRT display. The repeatability was
evaluated as the Mean Colour Difference from Mean (MCDM, Section 2.3.5.3). The plot of the
results is given in Figure 4.3.1-1.
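As a reminder of what the MCDM figure expresses, a minimal sketch follows. It assumes the usual definition (the mean of the CIELAB distances of each measurement from the mean of all measurements) with the Euclidean ∆E*ab as the difference formula; the function names are illustrative only.

```python
import math

def delta_e_ab(lab1, lab2):
    """CIELAB colour difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)

def mcdm(labs):
    """Mean Colour Difference from Mean of a list of (L*, a*, b*) measurements."""
    n = len(labs)
    mean = tuple(sum(lab[i] for lab in labs) / n for i in range(3))
    return sum(delta_e_ab(lab, mean) for lab in labs) / n
```

For the short-term test, `labs` would hold the twenty consecutive measurements of the grey patch on one display.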
[Plot: ∆E*ab from mean (0 to 0.06) versus time (minutes, 1 to 19).]
Figure 4.3.1-1. Short-term repeatability of the displays, expressed in colour difference (∆E*ab) from
mean.
Thick line: LCD monitor; thin line: CRT monitor
Although a slightly higher MCDM value was calculated for the LCD display than for the CRT (0.031
and 0.02, respectively), the absolute values of the variations are very small and are of no
practical significance. Therefore both displays can be considered stable in the short term.
4.3.1.2. Medium term repeatability
Medium term repeatability characterises the ability of the display to reproduce the same colour
over a period of several days, in this evaluation 65 hours. One hundred and thirty consecutive
measurements of a medium grey patch were taken at intervals of 30 minutes. The colour
coordinates of the patch were the same as in the short-term repeatability test. A graphical
representation of the results in terms of colour difference from mean versus time is given in
Figure 4.3.1-2.
[Plot: ∆E*ab from mean (0 to 0.9) versus time (hours, 0 to 60).]
Figure 4.3.1-2. Medium-term repeatability of displays, expressed in colour difference (∆Eab) from
mean.
Thick line: LCD monitor; thin line: CRT monitor
As expected, the medium-term variability is more significant than the short-term one, with
higher variability values for the CRT display than for the LCD one. The MCDM values
are 0.25 and 0.43 CIELAB units for the LCD and CRT, respectively. The analysis of values in
the dimensions of lightness, chroma and hue reveals that in both monitors the variations occur
mainly in lightness (Figure 4.3.1-3).
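The breakdown into lightness, chroma and hue can be sketched with the standard CIELAB identity ∆E*ab² = ∆L*² + ∆C*ab² + ∆H*ab². The function below is an illustration only (its name and return layout are not from the thesis) and returns the unsigned hue difference.

```python
import math

def lightness_chroma_hue_split(lab, ref):
    """Split the CIELAB difference between lab and ref into (dL, dC, dH, dE).

    dH is the unsigned hue difference obtained from dE^2 = dL^2 + dC^2 + dH^2.
    """
    dL = lab[0] - ref[0]
    dC = math.hypot(lab[1], lab[2]) - math.hypot(ref[1], ref[2])
    dE = math.dist(lab, ref)
    # Guard against tiny negative values caused by rounding.
    dH = math.sqrt(max(dE ** 2 - dL ** 2 - dC ** 2, 0.0))
    return dL, dC, dH, dE
```

When |dL| is close to dE for every measurement, as in Figure 4.3.1-3, the chroma and hue terms contribute almost nothing to the variation.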
[Plots A) and B): ∆L* difference from mean versus ∆E*ab from mean, both axes from 0 to 1.]
Figure 4.3.1-3. Medium-term repeatability of displays: contribution of lightness
A) LCD; B) CRT.
Values of lightness difference from mean plotted versus colour difference from mean form a straight line
at angle close to 45°, illustrating that almost entire medium-term variation in display colour is due to
variation in lightness.
4.3.2. Spatial uniformity
Spatial uniformity characterises the ability of the display to reproduce the same colour in
different locations on the display’s surface. Nine measurements of the medium grey sample were
taken at different locations on both displays as shown in Figure 4.3.2-1. The colour difference
from mean in CIELAB units among these nine measurements was taken as a representation of the
display’s spatial uniformity. Figure 4.3.2-2 shows the results of the evaluation.
1 2 3
4 5 6
7 8 9
Figure 4.3.2-1. Locations of measurements in display uniformity evaluation

[Plot: ∆E*ab from mean versus sample # (1 to 9).]
Figure 4.3.2-2. Results of displays spatial uniformity evaluation
Both monitors displayed some non-uniformity, although the CRT was found to be significantly
more uniform than the LCD. In the CRT, the non-uniformity mainly shows up as an increase in
lightness towards the centre of the display, with a difference of 1.5-2 L* units between the
centre and the periphery. In the LCD, the non-uniformity was due to variations of lightness along
the vertical axis, with the bottom three patches (samples 7-9, Figure 4.3.2-1) differing by about
3-4 L* units from the top six. The MCDM values are 1.54 and 0.87 CIELAB units for the
LCD and CRT displays, respectively.
The non-uniformity of the CRT display was found to be barely detectable by visual inspection.
The non-uniformity of the LCD, however, was easily detectable, with the bottom part of the
display being visibly darker than the top part. It is not clear to us whether the LCD
non-uniformity is the result of a technological limitation or a manufacturing defect of the particular
display that we used in our study. It was not, however, the result of the well-known LCD angular
dependence problem. At any rate, the variation, although visible, was considered to be
insignificant for our purposes.
4.3.3. Spatial channel independency
Spatial channel independency characterises the ability of the display to reproduce consistent
colour independently of the colours of its surround. This was evaluated as the shift in colour of
the grey patch in the centre of the display as the function of the colour of the surround (Figure
4.3.3-1).
surround colour
grey patch
Figure 4.3.3-1. Target for evaluation of channel independency.
The colours of the surround were: black, white (W), red (R), green (G), blue (B), cyan (C),
magenta (M) and yellow (Y). The channel independency was characterised as the CIELAB
difference of the colour of the grey patch on black (reference) surround from the colour of the
same patch on the coloured surrounds. The results are illustrated in Figure 4.3.3-2.
[Plot: ∆E*ab from reference versus background colour (R, Y, G, C, B, M, W).]
Figure 4.3.3-2. Result of channel independency test.

Colour difference from reference colour as the function of the background colour. Black: LCD; Grey:
CRT
The results clearly indicate a significant advantage of the LCD display over the CRT, with the
colour difference introduced by changing the colour of the surround on average more than twice as
large in the CRT monitor: 0.72 CIELAB units for the LCD and 1.66 CIELAB units for the CRT.
A breakdown of the colour differences into tristimulus values reveals interesting properties of
the dependencies of central patch colour on the background (Figure 4.3.3-3). The CIE 1964 XYZ
values are taken to loosely correspond to red, green and blue. In the LCD monitor, the colour of
the patch shifts towards the colour of the background: the “red” X tristimulus value increases with
the red surround, the “red” X and “green” Y tristimulus values increase with the yellow background,
etc. In the CRT monitor, the effect is the opposite: the colour of the patch seems to shift in
the direction opposite to the colour of the background. Although interesting, this effect is not
expected to have any practical consequences for us.
[Plots A) and B): difference from reference versus background colour (R, Y, G, C, B, M, W).]
Figure 4.3.3-3. Difference from reference versus colour of the background – breakdown into CIE 1964
X, Y and Z tristimulus values.
A) LCD; B) CRT. Red bar: X tristimulus values; Green: Y; Blue: Z.
4.3.4. Channel additivity
Channel additivity* characterises the ability of the display to reproduce a colour which is the exact
sum of the colours of the primaries that comprise it. This property is important for the accuracy
of the monitor characterisation. In order to evaluate the channel additivity, radiometric
measurements of the primaries R, G and B of both monitors were taken, and the CIE XYZ
coordinates were calculated. Then measurements of combinations of primaries (R+G, G+B,
R+B, R+G+B) were taken as well and compared with the sum of the XYZ coordinates of
corresponding primaries comprising the mixture. The CIELAB colour difference between the
colour as measured and as predicted by the assumption of perfect additivity represents the extent
of the channel additivity failure. The results are illustrated in Figure 4.3.4-1.
* Note that this is a characteristic of display hardware performance which is not related to the failures of
colorimetric additivity.
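The additivity test described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual code: the reference white (D65/10°) is assumed, no black-level correction is applied because the text does not mention one, and the function names are hypothetical.

```python
import math

# Assumed reference white: CIE D65 for the 1964 (10 degree) observer.
D65_10DEG_WHITE = (94.811, 100.0, 107.304)

def xyz_to_lab(xyz, white=D65_10DEG_WHITE):
    """Convert CIE XYZ tristimulus values to CIELAB (L*, a*, b*)."""
    delta = 6.0 / 29.0

    def f(t):
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = (f(value / ref) for value, ref in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))

def additivity_failure(primaries_xyz, measured_mix_xyz, white=D65_10DEG_WHITE):
    """CIELAB difference between a measured mixture and the sum of its primaries.

    primaries_xyz: XYZ triples of the primaries that make up the mixture,
    each measured on its own (e.g. R and G for the R+G mixture).
    """
    predicted = tuple(sum(p[i] for p in primaries_xyz) for i in range(3))
    return math.dist(xyz_to_lab(predicted, white), xyz_to_lab(measured_mix_xyz, white))
```

A perfectly additive display gives zero for every mixture; the values plotted in Figure 4.3.4-1 correspond to this difference for the C, M, Y and white mixtures.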
[Plot: ∆E*ab from reference (0 to 1.6) versus test colour (C, M, Y, White).]
Figure 4.3.4-1. Results of channel additivity test.

Colour difference from colour expected under assumption of perfect additivity. Black: LCD; Grey: CRT.
The mean value of additivity failure is almost identical for both monitors: 0.67 and 0.66
CIELAB units for LCD and CRT, respectively. The results for the CRT, however, indicate
larger dispersion of the values, with lower minimum and higher maximum. It is worth noting
that our test conditions represent the “worst case”, and the additivity failures occurring in
practical situations are expected to be significantly lower.
4.3.5. Stray light
No light sources operated in the lab during the experiment except for the monitors and the
viewing cabinet. Even though the arrangement of the setup elements was such as to minimise
reflections and mutual illumination, there was still a chance that some stray light was reflected
from the stimuli. This effect was evaluated by taking radiometric measurements of the stimuli in
two conditions: one was the normal experimental situation with both monitors and the cabinet
switched on; in the other, only one monitor or the cabinet operated, while the rest of the setup
elements were switched off. The colour difference between the two conditions corresponds to
the effect of the stray light. This effect was found to be negligibly small, with means of 0.16,
0.17 and 0.04 CIELAB units for LCD, CRT and viewing cabinet, respectively.
4.3.6. Consistency of hard-copy stimulus presentation
Radiometric measurements of all the samples were taken every day after the experimental
sessions took place. By the end of the experiment, 31 measurements of every sample were
accumulated, taken at random intervals during approximately two months. The variability
within this sample set represents the long-term repeatability of the sample presentation
combined with repeatability of the TSR over the same period. The perceptual effect of
radiometric variations was estimated by calculating the MCDM values using the CIEDE2000
formula, and was 1.1 units. Most of the variations occurred in lightness, as illustrated in Figure
4.3.6-1. This variability seems to be the result of small drifts in intensity of the cabinet
illuminant, and will be dealt with in the “Discussion” section.
[Plot: MCDM00 (∆L) versus MCDM00, with linear fit y = 1.0706x - 0.1169, R² = 0.9916.]
Figure 4.3.6-1. Values of MCDM of stimulus presentation variations calculated with CIEDE2000
formula.
The abscissa specifies the full MCDM00 value, the ordinate specifies MCDM00 calculated for L*
differences only. Dots form a 45° line, illustrating that the variations in stimulus presentation colour are
almost exclusively due to variations in lightness.
4.4. Results
In the course of data analysis we found that the conclusions on trends and significance of the
variability strongly depend on the colour difference metric used. In order to illustrate this
dependence, in some cases we report two values for each result: in CIELAB units and in
CIEDE2000 units. The differences between two representations will be dealt with in the
“Discussion” section.
4.4.1. Intra-observer variability
The intra-observer variability characterises the ability of an individual observer to reproduce the
same match twice. In our experimental conditions it results from the limitation of the observer’s
visual system in resolving colour differences between spatially separated stimuli, i.e. colour
discrimination. It also includes the uncertainties introduced by environmental and experimental
variables such as variations in presentation of the stimulus, small variations in distance and
viewing angle, and others.
Every observer carried out five matches of every test stimulus. The intra-observer variability
was evaluated as MCDM in CIELAB and CIEDE2000 units. Mean results for each observer are
given in Table 4.4.1-1, and are illustrated graphically in Figure 4.4.1-1. The statistics summary
per test colour is given in Table 4.4.1-2 and in Figure 4.4.1-2.
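One plausible way of arriving at the per-observer and per-colour summaries is sketched below. The aggregation order is an assumption: an MCDM is computed for each observer and test colour from the five repeats, and these values are then averaged over colours (for the per-observer summary) or over observers (for the per-colour summary). The data layout and names are hypothetical, and ∆E*ab is used as the difference formula.

```python
import math
from statistics import fmean

def mcdm(labs):
    """Mean CIELAB distance of each (L*, a*, b*) triple from the mean triple."""
    mean = [fmean(lab[i] for lab in labs) for i in range(3)]
    return fmean(math.dist(lab, mean) for lab in labs)

def intra_observer_summary(matches):
    """matches[observer][colour] is the list of repeated CIELAB matches.

    Returns (per_observer, per_colour): mean MCDM for each observer across
    colours, and mean MCDM for each colour across observers.
    """
    table = {obs: {col: mcdm(reps) for col, reps in cols.items()}
             for obs, cols in matches.items()}
    per_observer = {obs: fmean(row.values()) for obs, row in table.items()}
    colours = list(next(iter(table.values())))
    per_colour = {col: fmean(table[obs][col] for obs in table) for col in colours}
    return per_observer, per_colour
```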
Observer  MCDM LCD  MCDM00 LCD  MCDM CRT  MCDM00 CRT  MCDM sample  MCDM00 sample
S1 2.40 1.45 2.70 1.66 0.69 0.55
H1 3.00 1.85 2.41 1.46 1.39 1.14
W1 5.93 3.31 5.92 3.53 1.08 0.89
D1 2.53 1.82 2.07 1.42 1.16 0.90
C1 1.50 0.93 1.73 1.19 0.86 0.68
J1 4.02 2.41 4.58 2.89 1.52 1.20
C2 2.35 1.45 3.08 1.85 0.22 0.15
B1 1.13 0.73 1.07 0.72 0.58 0.40
S2 2.48 1.76 3.08 2.10 0.33 0.21
W1 2.95 1.55 2.35 1.38 0.58 0.41
Y1 2.75 1.44 2.84 1.70 0.65 0.47
Mean 2.82 1.70 2.89 1.81 0.82 0.64
Table 4.4.1-1. Intra-observer variability, per observer

MCDM in CIELAB units (headed MCDM) and in CIEDE2000 units (headed MCDM00), for each
observer. Last two columns contain the values of variation of stimulus presentation within corresponding
observer’s sessions.
[Plots A) and B): MCDM and MCDM00 (0 to 7) per observer.]
Figure 4.4.1-1 Intra-observer variability – per observer.
A) In CIELAB units; B) CIEDE2000 units.
Black: LCD, Grey: CRT; White: variation of stimulus presentation within corresponding observer’s
sessions.
Test colour  MCDM LCD  MCDM00 LCD  MCDM CRT  MCDM00 CRT  MCDM sample  MCDM00 sample
White 2.45 1.87 2.41 1.81 1.07 0.71
Grey 1.63 1.49 1.97 1.74 0.76 0.65
Yellow 3.67 1.77 4.18 2.01 1.12 0.62
Brown 3.62 1.88 3.00 1.72 0.86 0.60
Magenta 3.03 1.73 2.91 1.71 0.80 0.67
Purple 2.45 1.65 2.35 1.66 0.70 0.65
Blue 4.11 1.60 3.71 1.50 0.72 0.53
Cyan 1.99 1.56 2.61 1.91 0.73 0.65
Dark green 2.20 1.63 2.45 1.85 0.67 0.62
Bright green 3.07 1.83 3.36 2.19 0.82 0.67
Mean 2.82 1.70 2.89 1.81 0.82 0.64
Table 4.4.1-2. Mean intra-observer variability for each test colour

MCDM in CIELAB units (headed MCDM) and in CIEDE2000 units (headed MCDM00). Last two
columns contain the values of variation of stimulus presentation within corresponding observer’s
sessions.
[Plots A) and B): MCDM and MCDM00 (0 to 7) per test colour.]
Figure 4.4.1-2. Intra-observer variability – per test colour
A) CIELAB units B) CIEDE2000 units. Black: LCD, Grey: CRT; White: variation of stimulus
presentation.
4.4.2. Inter-observer variability
The inter-observer variability characterises the agreement between different observers in
matching the same test colour. It includes the inter-individual variations in visual mechanism, as
well as the intra-observer variability and variations in stimulus presentation reported in the
previous section.
Table 4.4.2-1 lists the variability values for each test colour; Figure 4.4.2-1 illustrates the same
data graphically; Figure 4.4.2-2 shows the projections of 95% confidence ellipsoids onto
CIELAB a*b* plane; and Figure 4.4.2-3 shows the same ellipses enlarged and superimposed
with the ellipse of stimulus presentation variation.
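How such an ellipse can be obtained from the observers’ mean matches is sketched below. The method is an assumption, since it is not described in this section: the matches are treated as normally distributed, the a*b* block of the sample covariance matrix is used, and the scale is the large-sample chi-square quantile (7.815 for the projection of a three-dimensional 95% ellipsoid, 5.991 for an ellipse fitted directly in two dimensions). With only eleven observers a small-sample correction might be preferred.

```python
import math

CHI2_95_3DOF = 7.815  # projection of a 3-D 95% ellipsoid onto a plane
CHI2_95_2DOF = 5.991  # ellipse fitted directly in the a*b* plane

def ab_confidence_ellipse(labs, scale=CHI2_95_3DOF):
    """Return (centre, semi_major, semi_minor, angle_deg) of the a*b* ellipse.

    labs: list of (L*, a*, b*) triples, one per observer.
    angle_deg: orientation of the major axis relative to the a* axis.
    """
    n = len(labs)
    a = [lab[1] for lab in labs]
    b = [lab[2] for lab in labs]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    # Sample covariance matrix [[s_aa, s_ab], [s_ab, s_bb]].
    s_aa = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    s_bb = sum((y - mean_b) ** 2 for y in b) / (n - 1)
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (s_aa + s_bb) / 2.0
    root = math.hypot((s_aa - s_bb) / 2.0, s_ab)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    angle = 0.5 * math.atan2(2.0 * s_ab, s_aa - s_bb)
    return ((mean_a, mean_b), math.sqrt(scale * lam_major),
            math.sqrt(scale * lam_minor), math.degrees(angle))
```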
Test colour  MCDM LCD  MCDM00 LCD  MCDM CRT  MCDM00 CRT  MCDM sample  MCDM00 sample
White 4.03 2.86 3.94 2.91 1.64 1.03
Grey 4.29 3.81 4.00 3.54 1.10 0.93
Yellow 6.74 3.19 5.57 2.69 1.72 0.98
Brown 4.74 2.97 3.89 2.46 1.02 0.80
Magenta 4.72 3.31 3.81 2.51 1.17 1.00
Purple 3.94 3.36 3.55 2.98 1.02 0.99
Blue 5.45 2.56 4.60 2.34 1.00 0.75
Cyan 4.71 3.90 4.15 3.39 1.04 0.96
Dark green 3.31 2.84 3.16 2.65 1.03 0.98
Bright green 4.67 3.50 4.73 3.49 1.28 1.08
Background 4.38 4.06 3.92 3.60 − −
Mean 4.63 3.31 4.12 2.96 1.09 0.86
Table 4.4.2-1. Inter-observer variability.

MCDM in CIELAB units (headed MCDM) and in CIEDE2000 units (headed MCDM00). Last two
columns contain the values of variation of stimulus presentation within corresponding observer’s
sessions.
[Plots A) and B): MCDM and MCDM00 (0 to 7) per test colour, including the background.]
Figure 4.4.2-1. Inter-observer variability
A) CIELAB units; B) CIEDE2000 units. Black: LCD, Grey: CRT; White: variation of stimulus
presentation.
[Plot: b* versus a*, both axes from −100 to 100.]
Figure 4.4.2-2. 95% inter-observer confidence ellipses in CIELAB a*b* plane.

Constructed from eleven mean individual observers’ matches. Thick line: LCD; thin line: CRT.
[Plot panels, each b* versus a*: White, Grey, Yellow, Brown, Magenta, Purple, Blue, Cyan, Dark green, Light green.]
Figure 4.4.2-3. Enlarged 95% inter-observer confidence ellipses in CIELAB a*b* plane.
Data points represent mean individual observers’ matches: “⋅” – LCD; “+” – CRT; “×” – sample.
4.4.3. Agreement with the Standard Colorimetric Observer
Matches made by real observers differ from the prediction of the standard CIE observer. The
magnitude of this mismatch can be related to differences between the real observers and the
standard, failures in the assumptions underlying the use of Standard Colorimetric Observer, and
differences in viewing conditions between the stimuli. The figures reported here reflect the
measurements made from the test stimulus and the observers’ match; their meaning and possible
causes for discrepancies will be dealt with in the “Discussion” section.
Test colour  LCD L*  LCD a*  LCD b*  CRT L*  CRT a*  CRT b*
White 89.6 0.6 -3.7 90.1 1.5 -3.1
Grey 60.8 0.2 -3.3 63.1 0.8 -3.1
Yellow 79.6 -3.8 78.1 80.8 -3.4 76.9
Brown 36.0 25.9 32.6 36.4 26.3 32.6
Magenta 46.0 47.1 2.5 47.3 48.2 3.1
Purple 47.7 18.7 -16.2 48.2 19.5 -16.0
Blue 37.2 5.6 -45.5 37.6 6.7 -46.2
Cyan 52.2 -10.5 -18.7 53.1 -9.4 -18.8
Dark green 42.5 -13.1 11.3 43.0 -12.9 12.2
Bright green 55.9 -26.1 16.9 56.9 -25.7 17.4
Background 59.4 -1.2 -0.1 60.4 -1.1 -0.2
Table 4.4.3-1. Mean CIELAB coordinates of matches made by all observers on both displays.
Observer  ∆E*ab LCD-sample  ∆E00 LCD-sample  ∆E*ab CRT-sample  ∆E00 CRT-sample
S1 8.50 6.80 8.45 6.90
H1 6.54 4.53 5.37 4.47
W1 7.63 5.86 6.89 5.17
D1 3.48 2.49 2.85 2.43
C1 7.82 6.46 8.05 6.65
J1 3.25 1.88 4.19 2.79
C2 6.45 5.29 4.81 2.93
B1 1.99 1.38 2.07 1.35
S2 4.85 3.89 4.38 3.27
W1 5.02 4.30 4.66 3.59
Y1 7.19 5.00 6.95 4.44
Mean 5.70 4.35 5.33 3.99
Table 4.4.3-2. Mean colour difference between each observer’s mean match and the CIE Standard
Colorimetric Observer values of the test colour.
Test colour  LCD-CIE (∆E*ab)  LCD-CIE (∆E00)  CRT-CIE (∆E*ab)  CRT-CIE (∆E00)
White 3.97 3.73 3.68 3.75
Grey 2.95 2.61 1.90 1.82
Yellow 1.36 0.71 1.89 1.23
Brown 3.88 3.26 3.37 2.86
Magenta 4.46 3.68 2.96 2.41
Purple 3.91 3.44 3.50 2.90
Blue 4.62 3.06 5.40 3.02
Cyan 4.55 3.67 4.11 3.13
Dark green 3.94 3.62 3.36 3.15
Bright green 3.39 3.02 2.36 2.09
Background 4.25 3.72 3.30 2.91
Mean 3.75 3.14 3.26 2.66
Table 4.4.3-3. Colour difference between mean match of eleven observers and the test colour.
[Plots A) and B): ∆E and ∆E00 from reference (0 to 10) per observer.]
Figure 4.4.3-1. Mean difference between each observer’s match and the CIE Standard Colorimetric
Observer values of the test colour.
A) CIELAB difference; B) CIEDE2000 difference. Black: LCD; grey: CRT.
[Plots A) and B): ∆E and ∆E00 from reference (0 to 7) per test colour, including the background.]
Figure 4.4.3-2. Difference between mean match of eleven observers and the CIE Standard Colorimetric
Observer values of the test colour.
A) CIELAB difference; B) CIEDE2000 difference. Black: LCD; grey: CRT.
[Plot: b* (−60 to 80) versus a* (−60 to 60).]
Figure 4.4.3-3. Differences between the CIELAB a*b* coordinates of test stimuli and mean matches of
eleven observers.
Vectors in a*b* plane with the origin at the coordinate of the paint sample, and head at the coordinate of
the mean match made by observers. Solid line: LCD display; dotted line: CRT display. Vectors are scaled
up ×5 their original size.
4.5. Data analysis and discussion: Variability of colour matching
4.5.1. Fluctuation of stimulus presentation
The presence of the fluctuations in stimulus presentation is somewhat distracting, as it raises
questions about the validity of the results, and complicates the analysis. These fluctuations,
however, are bound to exist in any real industrial system, to perhaps a much more significant
extent. On average, the stimulus fluctuates with magnitude about 4 times smaller than the mean
intra-observer variability. In the following analysis, no separation is made between the two, but
the relationships are discussed where relevant; the intra-observer data are assumed to be
affected by both – the observer-related and the stimulus-related variabilities.
4.5.2. ∆Eab vs. ∆E00
Unlike with the visual colorimeter, in our present experiment we use object-colour stimuli,
whose colour can be expressed in the familiar CIELAB object-colour space. The values in this
space have a familiar intuitive meaning, and colour differences can be calculated as Euclidean
distances, denoted ∆Eab.
However, CIELAB space has well-known problems of uniformity: equal colour differences
have different perceptual meaning in different locations in the space. No alternative Euclidean
colour-difference space exists; however, an advanced, non-Euclidean colour difference
formula has been proposed: CIEDE2000, or ∆E00 (Luo et al. 2001) (Section 2.3.5.2). As ∆E00
is a relatively new metric, it is customary to report its values alongside the more familiar ∆Eab
– as was done throughout the results section of this chapter.
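For reference, the two metrics can be computed side by side as sketched below. This is an illustrative implementation of the published CIEDE2000 formula with the parametric factors kL = kC = kH = 1 (an assumption, as the values used in this work are not stated in this section); it is not the code used to produce the reported results.

```python
import math

def delta_e_ab(lab1, lab2):
    """CIELAB colour difference: Euclidean distance in (L*, a*, b*)."""
    return math.dist(lab1, lab2)

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 colour difference between two (L*, a*, b*) triples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    # Rescale a* to improve uniformity near the neutral axis.
    C_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    G = 0.5 * (1.0 - math.sqrt(C_bar ** 7 / (C_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + G) * a1, (1.0 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0
    # Lightness, chroma and hue differences.
    dLp = L2 - L1
    dCp = C2p - C1p
    dhp = h2p - h1p
    if C1p * C2p == 0.0:
        dhp = 0.0
    elif dhp > 180.0:
        dhp -= 360.0
    elif dhp < -180.0:
        dhp += 360.0
    dHp = 2.0 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp / 2.0))
    # Mean lightness, chroma and hue.
    L_bar = (L1 + L2) / 2.0
    Cp_bar = (C1p + C2p) / 2.0
    h_sum = h1p + h2p
    if C1p * C2p == 0.0:
        hp_bar = h_sum
    elif abs(h1p - h2p) <= 180.0:
        hp_bar = h_sum / 2.0
    elif h_sum < 360.0:
        hp_bar = (h_sum + 360.0) / 2.0
    else:
        hp_bar = (h_sum - 360.0) / 2.0
    # Weighting functions and rotation term.
    T = (1.0
         - 0.17 * math.cos(math.radians(hp_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * hp_bar))
         + 0.32 * math.cos(math.radians(3.0 * hp_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * hp_bar - 63.0)))
    d_theta = 30.0 * math.exp(-(((hp_bar - 275.0) / 25.0) ** 2))
    R_C = 2.0 * math.sqrt(Cp_bar ** 7 / (Cp_bar ** 7 + 25.0 ** 7))
    S_L = 1.0 + 0.015 * (L_bar - 50.0) ** 2 / math.sqrt(20.0 + (L_bar - 50.0) ** 2)
    S_C = 1.0 + 0.045 * Cp_bar
    S_H = 1.0 + 0.015 * Cp_bar * T
    R_T = -math.sin(math.radians(2.0 * d_theta)) * R_C
    tL = dLp / (kL * S_L)
    tC = dCp / (kC * S_C)
    tH = dHp / (kH * S_H)
    return math.sqrt(tL ** 2 + tC ** 2 + tH ** 2 + R_T * tC * tH)
```

For saturated colours the two metrics diverge strongly: a pair of saturated blues about 4 ∆Eab units apart can differ by only about 2 ∆E00 units, which is the kind of compression seen for the blue and yellow stimuli in Table 4.4.1-2.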
In the course of the data analysis, it was soon realised that ∆E00 is a significantly more
suitable metric than ∆Eab. A particularly striking example in support of ∆E00 is Figure 4.4.1-2,
where the mean intra-observer variations are illustrated. In Figure 4.4.1-2 A), the bars show
MCDM calculated with the ∆Eab formula; the values vary with test colour in the range of
approximately 1.6-4.2 units for both displays. In Figure 4.4.1-2 B), the bars show MCDM
calculated with ∆E00: the heights are nearly identical, varying within the 1.5-2 units range.
The intra-observer variability in our experiment is governed by the ability of the subjects to
detect colour differences between the paint and the monitor stimuli, and to minimise these
differences until they are not detectable. This process is governed by the subject’s colour
discrimination limitations: where no difference is discriminable between the test and the match
colours no adjustments need to be made, and the match can be pronounced. These
discrimination limitations, or the “criteria” that observers are using in “match-mismatch”
decision, must be independent of the test colour: there is no reason to assume that larger – in
perceptual terms – colour differences will be tolerated in some colours than in others. This is
what is meant by “perceptual uniformity”, and what the results calculated by the ∆E00 formula
illustrated in Figure 4.4.1-2 B) express: the variability of matches within individual observers’
data is constant throughout the perceptually-uniform colour space. Also, it shows that the
formula is applicable to viewing conditions significantly different from those used in the course of its
development. In the following analysis, we use only ∆E00 (also denoted as CIEDE2000) values
unless otherwise stated.
4.5.3. Intra-observer variability
Generally, observers show good repeatability. Considering the spatial separation of the stimuli,
the fluctuations of the stimulus presentation, and the relative complexity of the task, values of
1.7 and 1.8 ∆E00 units for LCD and CRT monitors can be considered rather low (Table 4.4.1-1).
4.5.3.1. Dependence on observer
The performance does not seem to depend on the type of the monitor; however, it does vary
considerably between observers. In this respect, the subjects can be roughly divided into three
groups: two with variability below 1 ∆E00 (B1 and C1), six between 1.5 and 2, and two with
variability above 2.5 ∆E00 units. The first (the least varying) group consists of the author and an
observer with considerable experience of colour-sensitive computer work in graphic arts; hence
the relatively good performance can perhaps be attributed to experience and motivation. We
cannot suggest reasons why the two observers in the last group show almost twice as high a
variability as the average: they did not show anomalous results in the Farnsworth-Munsell 100
Hue Test, hence their colour discrimination is not likely to be impaired. Judging from Figure 4.5.3-1, the
high variability values do not seem to be caused by outliers either: there are no values that
are considerably higher than the mean.
[Plots A) and B): MCDM00 (0 to 7) per test colour.]
Figure 4.5.3-1. Variability of the two most-varying observers in CIEDE2000 units.
A) Observer 1; B) Observer 2. Black: LCD, Grey: CRT; White: variation of stimulus presentation.
It seems that the “normal” variability, i.e. the one we would expect in the majority of
colour-normal observers, corresponds to the second group, i.e. 1.5-2 ∆E00 units. Data summarised
by test colour confirm this (Table 4.4.1-2 and Figure 4.4.1-2): the mean intra-observer
variability of matches in all the colours falls within this range.
4.5.3.2. Estimation of threshold sensitivity
The magnitude of the variability of stimulus presentation is about 30% of the intra-observer
variability. This, however, represents merely a numerical relationship: it remains to understand
whether this variation in fact accounts for any part of the intra-observer variability, or whether it
is well below the threshold sensitivity to colour differences (Just Noticeable Difference, JND)
in our conditions. If the threshold is lower than or comparable with the stimulus presentation
variations, then they are likely to affect the judgment, and vice versa. MacAdam’s relationship
between the standard deviation of colour matching (MacAdam 1942) and the JND is that the
latter is three times the former (JND = 3 standard deviations) (Wyszecki and Stiles 1982, p.
306). Applied to our case, with standard deviations of 1.5-2 ∆E00 units, it would imply that the
JND is of the order of 4.5 – 6 ∆E00 units, which is not supported by experience; thus the MacAdam
JND criterion is not suitable for our conditions.
We can try to infer the sensitivity thresholds in our conditions from a study of spatially
separated stimuli (Danilova and Mollon 2006). The results were reported in MacLeod-Boynton
(MacLeod and Boynton 1979) chromaticity values, and have shown a remarkable (in
chromaticity terms) ability to detect colour differences: sensitivity thresholds were under 0.4%
– 2% on M/(L+M) axis and under 3% – 6% on S/(L+M) axis for stimuli separated by 10°. This is
still a significantly closer arrangement than in our case (45°). The similarity in setup is
that the stimuli cannot be compared by direct neural “hard-wired” comparison of cone signals;
the difference is that in our case the stimuli cannot be viewed simultaneously. For the sake of
comparison with Danilova and Mollon’s report, we computed the MacLeod-Boynton chromaticities
for our data using Stockman and Sharpe (Stockman et al. 1999; Stockman and Sharpe 2000) cone
fundamentals and calculated the variations within the resulting set of values (Figure 4.5.3-2).
[Figure 4.5.3-2: bar charts in two panels, A) and B); vertical axis CV, 0% to 2% in A) and 0% to 10% in B); horizontal axis: test colours.]
Figure 4.5.3-2. Intra-observer variability expressed in CV units in MacLeod-Boynton chromaticity
values.
A) L/(L+M) axis; B) S/(L+M) axis. Constructed using Stockman and Sharpe (Stockman et al. 1999;
Stockman and Sharpe 2000) cone fundamentals.
The CVs on the S/(L+M) axis for Yellow and Brown are especially high because these stimuli
barely excite the blue cones, hence the plotted value is the ratio of a standard deviation to
an extremely small number. The mean relative standard deviations in our intra-observer data,
0.5% and 5.8% for M/(L+M) and S/(L+M) respectively, seem to be in exceptionally good
agreement with the threshold data of Danilova and Mollon (2006) cited above. Assuming that
the colour discrimination of our observers in our experimental conditions is similar to that in
Danilova and Mollon’s experiment, we can suggest that one standard deviation of colour
matches is approximately equal to the threshold sensitivity to colour differences.
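The computation behind these figures can be sketched in Python as follows. This is a minimal illustration, not the code used in this study: the function names and the sample cone excitations are hypothetical, and the L/(L+M) axis is computed as in the caption of Figure 4.5.3-2. The cone excitations L, M and S are assumed to have been obtained beforehand by integrating each match spectrum with the cone fundamentals.

```python
import numpy as np

def macleod_boynton(lms):
    """Return (l, s) = (L/(L+M), S/(L+M)) for an (n, 3) array of cone excitations."""
    lms = np.asarray(lms, dtype=float)
    lum = lms[:, 0] + lms[:, 1]          # L + M, the luminance term
    return lms[:, 0] / lum, lms[:, 2] / lum

def coefficient_of_variation(x):
    """Sample standard deviation divided by the mean (relative standard deviation)."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

# Hypothetical cone excitations for five repeated matches of one test colour.
matches_lms = np.array([
    [0.652, 0.348, 0.0210],
    [0.655, 0.345, 0.0200],
    [0.650, 0.350, 0.0220],
    [0.653, 0.347, 0.0190],
    [0.651, 0.349, 0.0205],
])
l, s = macleod_boynton(matches_lms)
cv_l = coefficient_of_variation(l)   # relative spread on the L/(L+M) axis
cv_s = coefficient_of_variation(s)   # relative spread on the S/(L+M) axis
```

Because the CV divides by the mean, a stimulus with a very small S/(L+M) value yields a large CV even for a modest absolute spread, which is the effect noted above for Yellow and Brown.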
Thus the fluctuations of the stimulus in our experiment are about 30% of the sensitivity
threshold, which leads to the conclusion that they are not likely to have an appreciable effect
on the observer variations in colour matches and can be ignored in the analysis. This conclusion
is based on a comparison of MCDM values, which are more widely accepted than the standard
deviation for expressing the spread of colour values about the mean. Assuming that the
CIEDE2000 colour difference formula represents distances in a uniform colour space, in which the
range of JND about the mean forms a sphere, and given the approximate nature of this estimation,
the use of MCDM rather than the standard deviation is not expected to have any practical
consequences for our results.
4.5.3.3. Variations in different colour dimensions
In the description of the stimulus fluctuations, we found that the variations are almost
exclusively due to variations in lightness (Figure 4.3.6-1) as the result of random fluctuations of
intensity of the cabinet illuminant. It is of interest to perform similar evaluation of observer
variability and to understand how the matches vary in dimensions of lightness, chroma and hue.
As CIEDE2000 does not allow for easy separation of colour differences into chroma and hue
dimensions, we use only two categories: lightness and chromaticness, the latter corresponding
to the CIELAB a*b* plane at a constant L* value. The variations in chromaticness were calculated
as the usual MCDM, but with the L* values set to the mean of the set of measurements. The
variations in lightness were calculated similarly, by setting the a* and b* values of the set to
the set’s means. Table 4.5.3-1 gives the numerical summary of such an evaluation; Figure 4.5.3-3
illustrates the same data graphically. With the exception of brown on the LCD, the variation in
lightness is higher than that in chromaticness. On average, lightness variations are
approximately 42% of the total intra-observer MCDM value, and are approximately 20% and
25% higher than the chromaticness variations for LCD and CRT, respectively.
                  ------------- LCD -------------    ------------- CRT -------------
Test colour       MCDM00   MCDM00(∆C)  MCDM00(∆L)    MCDM00   MCDM00(∆C)  MCDM00(∆L)
White             1.87     1.16        1.21          1.81     1.02        1.38
Grey              1.49     0.83        1.05          1.74     0.87        1.51
Yellow            1.77     0.68        1.52          2.01     0.87        1.86
Brown             1.88     1.21        1.19          1.72     1.16        1.23
Magenta           1.73     0.98        1.31          1.71     0.94        1.47
Purple            1.65     0.99        1.09          1.66     0.97        1.06
Blue              1.60     0.97        1.15          1.50     0.90        1.09
Cyan              1.56     0.82        1.17          1.91     1.15        1.59
Dark green        1.63     0.98        1.09          1.85     1.05        1.39
Bright green      1.83     1.09        1.28          2.19     1.14        1.92
Mean              1.70     0.97        1.21          1.81     1.01        1.45
Table 4.5.3-1. Mean intra-observer variability (CIEDE2000) for each test colour separated to
perceptual dimensions of lightness and chromaticness
Total (MCDM00), chromaticness (MCDM00(∆C) ) and lightness (MCDM00(∆L)).
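The separation into lightness and chromaticness described above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the default colour difference metric is the Euclidean CIELAB distance, used here as a stand-in so that the block stays self-contained. To reproduce the values in the table, a CIEDE2000 function would have to be passed as `delta_e`.

```python
import numpy as np

def euclidean_delta_e(lab, mean):
    """Stand-in metric (assumption): Euclidean distance in CIELAB, not CIEDE2000."""
    return np.linalg.norm(lab - mean, axis=1)

def mcdm(lab, delta_e=euclidean_delta_e):
    """Mean colour difference from the mean of an (n, 3) array of L*, a*, b* values."""
    lab = np.asarray(lab, dtype=float)
    mean = lab.mean(axis=0)
    return float(np.mean(delta_e(lab, mean)))

def mcdm_split(lab, delta_e=euclidean_delta_e):
    """Return (total, chromaticness, lightness) MCDM values for one set of matches."""
    lab = np.asarray(lab, dtype=float)
    mean = lab.mean(axis=0)
    chroma_only = lab.copy()
    chroma_only[:, 0] = mean[0]      # L* fixed at the set mean: only a*, b* vary
    light_only = lab.copy()
    light_only[:, 1:] = mean[1:]     # a*, b* fixed at the set means: only L* varies
    return mcdm(lab, delta_e), mcdm(chroma_only, delta_e), mcdm(light_only, delta_e)

# Hypothetical repeated matches of one test colour by one observer.
matches = [[50.0, 10.0, 20.0], [52.0, 11.0, 19.0], [49.0, 9.5, 21.0], [51.0, 10.5, 20.5]]
total, chromaticness, lightness = mcdm_split(matches)
```

Note that the two partial values do not add up to the total, since the total difference combines the two dimensions in quadrature rather than linearly.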
[Figure 4.5.3-3: bar charts in two panels, A) and B); vertical axis MCDM00, 0 to 2.5; horizontal axis: test colours.]
Figure 4.5.3-3. Mean intra-observer variability for each test colour separated to perceptual dimensions.
A) LCD data; B) CRT data. Black bars: variations in chromaticness; grey bars: variations in lightness.
4.5.3.4. Modelling the intra-observer variability
Combined intra-observer variability
Before we attempt to model the intra-observer variability in our data, we first need to establish
that the data indeed allow modelling; that is, that the variabilities of the individual
observers have features in common which can be described by available mathematical tools.
The covariance matrices describing the variabilities of all observers with each monitor type
were combined into common covariance matrix by computing the mean variances and
covariances:
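\[
\Sigma_{\mathrm{common,LCD}} =
\begin{bmatrix}
{}_{\mathrm{LCD}}\sigma^{2}_{L^*} & {}_{\mathrm{LCD}}\sigma_{L^*a^*} & {}_{\mathrm{LCD}}\sigma_{L^*b^*} \\
{}_{\mathrm{LCD}}\sigma_{a^*L^*} & {}_{\mathrm{LCD}}\sigma^{2}_{a^*} & {}_{\mathrm{LCD}}\sigma_{a^*b^*} \\
{}_{\mathrm{LCD}}\sigma_{b^*L^*} & {}_{\mathrm{LCD}}\sigma_{b^*a^*} & {}_{\mathrm{LCD}}\sigma^{2}_{b^*}
\end{bmatrix}
\tag{4.5.1}
\]
\[
\Sigma_{\mathrm{common,CRT}} =
\begin{bmatrix}
{}_{\mathrm{CRT}}\sigma^{2}_{L^*} & {}_{\mathrm{CRT}}\sigma_{L^*a^*} & {}_{\mathrm{CRT}}\sigma_{L^*b^*} \\
{}_{\mathrm{CRT}}\sigma_{a^*L^*} & {}_{\mathrm{CRT}}\sigma^{2}_{a^*} & {}_{\mathrm{CRT}}\sigma_{a^*b^*} \\
{}_{\mathrm{CRT}}\sigma_{b^*L^*} & {}_{\mathrm{CRT}}\sigma_{b^*a^*} & {}_{\mathrm{CRT}}\sigma^{2}_{b^*}
\end{bmatrix}
\tag{4.5.2}
\]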
The two sets of common covariance matrices were used to construct the 95% confidence
ellipses in a*b* plane (Figure 4.5.3-4).
[Figure 4.5.3-4: plot in the CIELAB a*b* plane; horizontal axis a*, −60 to 60; vertical axis b*, −60 to 100.]
Figure 4.5.3-4. 95% intra-observer ellipses in a*b* plane, constructed by averaging variances and
covariances of all observers.
Thick line: LCD; thin line: CRT
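The averaging of covariance matrices and the construction of a 95% ellipse in the a*b* plane can be sketched in Python as follows. This is an illustration under assumptions that are not stated in the text: the matches are treated as bivariate normal in the a*b* plane, so that the 95% contour corresponds to a squared Mahalanobis distance equal to the 0.95 quantile of the chi-square distribution with two degrees of freedom, and the function names are hypothetical.

```python
import numpy as np

# 0.95 quantile of chi-square with 2 degrees of freedom; for 2 d.o.f. the
# quantile has the closed form -2 ln(1 - p), here about 5.99.
CHI2_95_2DOF = -2.0 * np.log(0.05)

def common_covariance(lab_sets):
    """Mean of the 3x3 L*a*b* covariance matrices of several observers.

    lab_sets: iterable of (n_i, 3) arrays, one per observer.
    """
    covs = [np.cov(np.asarray(s, dtype=float), rowvar=False) for s in lab_sets]
    return np.mean(covs, axis=0)

def ellipse_ab(cov, centre, n_points=72):
    """Points on the 95% ellipse in the a*b* plane for a 3x3 L*a*b* covariance."""
    cov_ab = np.asarray(cov, dtype=float)[1:, 1:]   # keep the a*, b* block only
    evals, evecs = np.linalg.eigh(cov_ab)           # principal axes of the ellipse
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    circle = np.stack([np.cos(t), np.sin(t)])
    pts = evecs @ (np.sqrt(CHI2_95_2DOF * evals)[:, None] * circle)
    return pts.T + np.asarray(centre, dtype=float)
```

A combined two-monitor matrix is then simply the mean of the LCD and CRT matrices, and `ellipse_ab` can be called once per colour centre to draw each set of ellipses.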
Figure 4.5.3-4 makes it evident that there are only minor differences between the variations in
the two displays; hence we can proceed and combine the two sets into one:
\[
\Sigma_{\mathrm{common}} = \frac{\Sigma_{\mathrm{common,LCD}} + \Sigma_{\mathrm{common,CRT}}}{2}
\tag{4.5.3}
\]
Two observers were identified: one with the lowest and one with the highest variability. These
are readily available from Figure 4.4.1-1: observer B1 for the least-varying, and observer W1
for the most-varying. Covariance matrices were calculated for each and combined into common
two-monitor matrix as in Eq. (4.5.3).
Using three sets of covariance matrices, three sets of 95% confidence ellipses were constructed
and superimposed. The result is the plot in Figure 4.5.3-5.
[Figure 4.5.3-5: plot in the CIELAB a*b* plane; horizontal axis a*, −60 to 60; vertical axis b*, −60 to 100.]
Figure 4.5.3-5. 95% intra-observer ellipses in a*b* plane constructed from common matrices of both
monitors’ data.
Green: observer B1; Red: observer W1; Blue: mean variability of eleven observers constructed from
common covariance matrices.
The differences between the three sets of ellipses in Figure 4.5.3-5 are mostly in scale; the
shapes and orientations are nearly identical. Thus we conclude that variabilities of different
observers indeed follow the same trends.
Constructing CIEDE2000 ellipses
The short axes of the ellipses in Figure 4.5.3-5 are almost parallel to the lines of constant
chroma, and the long axes are parallel to the lines of constant hue, with the exception of blue,
where CIELAB has the well known “blue hue inconstancy” problem. These features are known to
characterise the sensitivity to colour differences in the CIELAB a*b* plane: the sensitivity is
highest to hue differences and lowest to chroma differences. This suggests that an advanced
colour difference metric can be used to describe the features of intra-observer variability.
From Figure 4.4.1-2 we know that this variability is almost constant for all colours when
calculated with the CIEDE2000 colour difference formula, which makes it a likely candidate to
start with.
The comparison between our data and the prediction of the formula can be done using a similar
method of confidence ellipses in the a*b* plane. One set of ellipses will be that of the mean of
eleven observers and both monitors (blue ellipses in Figure 4.5.3-5). Another will be the set of
loci of constant CIEDE2000 colour difference drawn around the colour centres which coincide
with the mean observers’ matches, i.e. the centres of the blue ellipses of Figure 4.5.3-5.
Figure 4.5.3-6 illustrates the construction of the locus of constant CIEDE2000 difference. Point
A with known coordinates $(a^*_A, b^*_A)$ is the colour centre. The task is to calculate the
coordinates $(a^*_B, b^*_B)$ of point B, which is situated on the line passing through A at an
angle α to the a* axis, such that the colour difference between A and B equals D CIEDE2000 units
(Section 2.3.5.2, Eqs. (2.3.14)-(2.3.34)):
\[ \Delta E_{00}(A, B) = D \tag{4.5.4} \]
First, the coordinates of point R are calculated, so that it lies on the line connecting A and
B at a distance of one a*b* unit from A, i.e.
\[ \Delta E_{a^*b^*}(A, R) = 1 \tag{4.5.5} \]
We have:
\[ a^*_R = a^*_A + \cos(\alpha) \tag{4.5.6} \]
\[ b^*_R = b^*_A + \sin(\alpha) \tag{4.5.7} \]
The CIEDE2000 colour difference between A and R is calculated:
\[ \Delta E_{00}(A, R) = \mathrm{CIEDE2000}(A, R) \tag{4.5.8} \]
Finally, the coordinates of B are calculated as
\[ a^*_B = a^*_A + r \cos(\alpha) \tag{4.5.9} \]
\[ b^*_B = b^*_A + r \sin(\alpha) \tag{4.5.10} \]
where r is equal to
\[ r = \frac{D}{\Delta E_{00}(A, R)} \tag{4.5.11} \]
Carrying out similar calculations for a number of angles spanning the range 0°-360° results in
a series of points all having a CIEDE2000 colour difference from A equal to D. Covariance
matrices are computed for the resulting sets of values, and used to construct the 95% confidence
ellipses as usual.
Figure 4.5.3-6. Calculation of loci of constant colour difference.

See text for details.
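The construction of Eqs. (4.5.4)-(4.5.11) can be sketched in Python as follows. This is an illustration, not the code used in this study: `ciede2000` is a compact transcription of the published CIEDE2000 formula, while the function name `constant_de00_locus`, the number of angles and the example colour centre are assumptions. Because r is obtained by linear scaling of a one-unit step, the colour difference between A and B equals D only approximately.

```python
import math

def _cosd(deg):
    return math.cos(math.radians(deg))

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 colour difference between two (L*, a*, b*) triplets."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360 if c1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360 if c2p else 0.0
    d_l, d_c = L2 - L1, c2p - c1p
    # Hue difference, wrapped to [-180, 180]; zero if either colour is achromatic.
    dh = 0.0 if c1p * c2p == 0 else h2p - h1p
    if dh > 180:
        dh -= 360
    elif dh < -180:
        dh += 360
    d_h = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(dh / 2))
    l_bar, c_barp = (L1 + L2) / 2, (c1p + c2p) / 2
    # Mean hue, taking the wrap-around at 360 degrees into account.
    if c1p * c2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2
    t = (1 - 0.17 * _cosd(h_bar - 30) + 0.24 * _cosd(2 * h_bar)
         + 0.32 * _cosd(3 * h_bar + 6) - 0.20 * _cosd(4 * h_bar - 63))
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(c_barp**7 / (c_barp**7 + 25**7))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_barp
    s_h = 1 + 0.015 * c_barp * t
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c
    t_l, t_c, t_h = d_l / (kL * s_l), d_c / (kC * s_c), d_h / (kH * s_h)
    return math.sqrt(t_l**2 + t_c**2 + t_h**2 + r_t * t_c * t_h)

def constant_de00_locus(centre, D=1.0, n_angles=72, kL=1.0, kC=1.0, kH=1.0):
    """Points (a*, b*) around `centre` at approximately D CIEDE2000 units."""
    L, a_A, b_A = centre
    points = []
    for i in range(n_angles):
        alpha = 2 * math.pi * i / n_angles
        point_R = (L, a_A + math.cos(alpha), b_A + math.sin(alpha))   # Eqs. (4.5.6)-(4.5.7)
        r = D / ciede2000(centre, point_R, kL, kC, kH)                # Eqs. (4.5.8), (4.5.11)
        points.append((a_A + r * math.cos(alpha), b_A + r * math.sin(alpha)))  # Eqs. (4.5.9)-(4.5.10)
    return points

# Hypothetical colour centre.
locus = constant_de00_locus((50.0, 20.0, 30.0), D=1.0)
```

Passing different values of kL, kC and kH to `constant_de00_locus` rescales the locus along the corresponding perceptual dimensions, which is how the parametric factors can be explored.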
Adjusting parametric factors
Figure 4.5.3-7 shows 1 CIEDE2000 unit contours superimposed with the combined intra-
observer ellipses. Orientations of the corresponding ellipses from the two sets are similar,
however the sizes and relative shapes are not in a good fit.
[Figure 4.5.3-7: plot in the CIELAB a*b* plane; horizontal axis a*, −60 to 60; vertical axis b*, −60 to 100.]
CIEDE2000 colour difference equal to 1 with parametric coefficients [1 1 1].
Thin line: intra-observer; thick line: CIEDE2000.
The CIEDE2000 formula has a parametric control, which allows colour differences in different
perceptual dimensions to be weighted differently according to the requirements of the
application. The parameters are kL, kC and kH in Eq. (2.3.34) for Lightness, Chroma and Hue,
respectively, and are set to 1 by default. In Figure 4.5.3-7, the widths (the short radii) of
the two sets of ellipses are very similar, but the lengths (the long radii) are markedly
different. As we noted, the long axes of the ellipses lie along the Chroma direction, therefore
we increase the parametric coefficient kC to 2. The result is shown in Figure 4.5.3-8.
[Figure 4.5.3-8: plot in the CIELAB a*b* plane; horizontal axis a*, −60 to 60; vertical axis b*, −60 to 100.]
CIEDE2000 colour difference equal to 1 with parametric coefficients [1 2 1].
Thin line: intra-observer; thick line: CIEDE2000.
The fit between the two sets is improved significantly, with some pairs of ellipses almost
coinciding (magenta, purple and cyan). The largest difference is between the ellipses
corresponding to blue test colour.
Computational methods exist (Cui 2000) which allow optimising the parametric coefficients in
order to achieve minimum possible mean colour difference between the two sets of ellipses.
However, due to a different scope of our study, as well as a relatively small amount of data and
thus a rather approximate nature of the fitting, employing these methods does not seem
appropriate. Rather, fitting by visual evaluation as we just did would suffice to make the
conclusion that the intra-observer chromaticness variability in our experimental conditions can
be modelled well by one unit of colour difference calculated with the CIEDE2000 formula, with
the chroma parametric coefficient kC set to 2 (i.e. CIEDE2000(1:2:1)).
Variability in Lightness dimension
Variations in lightness contribute at least as much to the total intra-observer variability as
variations in chromaticness (Table 4.5.3-1 and Figure 4.5.3-3). We attempted to model lightness
variations with the CIEDE2000 formula in the same manner as the chromaticness ones, however with
less success. The results are illustrated in Figure 4.5.3-9.
It was not possible to model the experimental ellipses by varying the parametric coefficients.
Although in some colour centres the ellipses almost coincide (greens), mostly the shapes and
orientations are different. The best parametric coefficient for lightness, kL = 4, was identified by
comparing only the vertical dimensions of the ellipses. Considering a very good correspondence
between the CIEDE2000 prediction and experimental data in chromaticness, the lack of
agreement here is perhaps a result of different visual mechanisms operating in lightness
discrimination in our conditions and in conditions in which the formula has been developed; or
possibly insufficient attention to Lightness dimension in developing the formula.
A) B)
Figure 4.5.3-9. Combined 95% ellipses for all observers and both monitors superimposed with ellipses
of constant CIEDE2000 colour difference equal to 1 with parametric coefficients [4 2 1].
A) CIELAB a*L* plane; B) b*L* plane. Thick line: experimental ellipses; thin lines: CIEDE2000
ellipses.
4.5.3.5. Practical implications of intra-observer variability
The values of intra-observer variability have practical implications for the design of display
calibration, ICC profiling and soft-proofing systems. They provide the practical requirements
for the accuracy of on-screen colour simulation: differences between the monitor and the
hardcopy which are larger than the sensitivity threshold would result in perceptible colour
discrepancies; achieving better than the threshold accuracy would mean spending resources on
improvements which cannot be detected by users. More than one experiment is necessary to
reach definite conclusions. The only relevant study we are aware of is the one by Alfvin and
Fairchild (Alfvin and Fairchild 1997), who reported 1.15 CIELAB units as the MCDM of intra-
observer adjustment. However, this result is based on one observer, and the viewing conditions
resembled a quasi-symmetric colour matching experiment more than soft-proofing. Our analysis
shows that 95% of matches of a surface colour stimulus made by an average observer using a
CRT or LCD display with typical primaries would lie within the limits of 1 CIEDE2000 unit
with parametric coefficients [kL : kC : kH] set to [1 2 1].
4.5.4. Inter-observer variability
The variability between observers is slightly but consistently higher in LCD monitor matches
than in CRT ones (Figure 4.4.2-1), although this difference seems to be insignificant for
practical purposes. The variabilities are approximately similar in all test colours, ranging
between 3 and 4 CIEDE2000 units with the averages of 3.3 and 3 units for LCD and CRT,
respectively.
4.5.4.1. Anomalous observers
In the plots of the results of the metameric D&H and Munsell Matchpoint rule tests (Figure
4.2.1-12) we saw three observers producing consistently anomalous (compared to the rest of the
group) results: H1, S2 and D1. The same observers produced normal results in the other two
colour vision tests: Ishihara plates and the Farnsworth-Munsell 100 Hue Test.
In Figure 4.4.2-3, we reproduced 95% confidence ellipses in the a*b* plane constructed from
eleven observers’ matches, whereby each data point on the plot corresponds to the mean of five
matches made by each observer. The points are generally grouped well; however, in some plots
there are “stray” ones which stand out. These plots are for white, grey and yellow, and the
anomalous points are indeed produced by observers H1 and S2 (Figure 4.5.4-1). The third
observer who has shown slightly anomalous metameric rule judgments (D1) does not appear
to make matches significantly different from the group. Further inspection reveals some
other “stray” points, such as the one produced by Y2 marked in the plot for the yellow test
colour in Figure 4.5.4-1; however, they generally follow the same trend as the rest and do not
have as large an effect on the ellipse shape as the two abovementioned observers (H1 and S2).
[Figure 4.5.4-1: three scatter plots in the a*b* plane, titled White, Grey and Yellow; horizontal axis a*, vertical axis b*; labelled points: H1, S2 and Y2.]
Figure 4.5.4-1. 95% inter-observer confidence ellipses in CIELAB a*b* plane, with anomalous
observations marked and labelled with the observer code.
“⋅” – LCD; “+” – CRT; “×” – sample.
Any attempt to analyse the possible origins of the anomalies would be reduced to speculation, as
objective tools for assessing the properties of the colour vision path elements of our
observers, such as cone peak sensitivities or macular pigment density, are not available to us.
At any rate, the decision we have to make is whether to include the data of observers H1 and S2
in our analysis of inter-observer variability. Both observers have passed all the standard tests
for colour vision deficiencies, hence in an industrial setting they would be categorised as
“normals” for all practical purposes. Therefore it seems reasonable to include them in the
analysis.
4.5.4.2. Comparison with intra-observer variability
As expected, the inter-observer variability is higher than the intra-observer one, being
approximately twice as high (Figure 4.5.4-2).
[Figure 4.5.4-2: bar charts in two panels, A) and B); vertical axis MCDM00, 0 to 4; horizontal axis labelled “Observer”, with test colours as categories.]
Figure 4.5.4-2. Comparison of intra- and inter-observer variability values (MCDM00).
A) LCD; B) CRT. Black: intra-observer; grey: inter-observer.
However, when the variations are separated into perceptual dimensions of lightness and
chromaticness, rather unexpected behaviour is revealed: the increase in variability from intra- to
inter-observer is almost exclusively due to an increase in the variability of lightness matching
(Figure 4.5.4-3). This result is surprising, because the variations between observers in our
setup are believed to be the result of observer metamerism, which, in its turn, results mostly
from variations in the optical properties of the lens and of the macular pigment in the blue
region of the spectrum. Variations in the blue region have their effect on chromaticness and
almost none on lightness; thus the behaviour observed here is the complete opposite of the
expected.
[Figure 4.5.4-3: bar charts in four panels, A) to D); vertical axes MCDM00(a*b*), 0 to 3.0, in the chromaticness panels and MCDM(L*), 0 to 3.5, in the lightness panels; horizontal axis: test colours.]
Figure 4.5.4-3. Comparison of intra- and inter-observer variability in dimensions of lightness and
chromaticness.
A) Chromaticness, LCD; B) Lightness, LCD; C) Chromaticness, CRT; D) Lightness, CRT. Black bars:
intra-observer; grey bars: inter-observer.
In order to investigate this finding further, we computed a common covariance matrix by
combining both monitors’ data (Section 4.5.3.4), and constructed the 95% confidence ellipses
for the mean matches of eleven observers. The result is shown in Figure 4.5.4-4, superimposed
with the mean intra-observer ellipses. With the exception of yellow, ellipses corresponding to
the same test colour almost coincide, illustrating that inter- and mean intra-observer
variability in the a*b* plane are qualitatively and quantitatively nearly identical. In yellows,
the inter-observer variability is higher than the intra-observer one mostly in the Chroma
direction. As the difference between the inter- and intra-observer variabilities is mostly in
L*, and due to the high correlation between the C* and L* values in the yellow region of L*a*b*
space, this behaviour is an expected one.
This suggests that the inter-individual variability in colour matches in our experimental
conditions is not the result of observer metamerism; if it were, it would have properties
and magnitude substantially different from the intra-observer variability. This conclusion is a
far-reaching one, as it implies that any model or index of metamerism for change in observer,
such as CIE SDO (CIE 1989), is bound to fail when applied to the prediction of inter-observer
disagreement in cross-media colour matching. Hence it is of interest to try to model the
variability that we would expect if it was the result of observer metamerism, and compare it
with our experimental results.
[Figure 4.5.4-4: plot in the CIELAB a*b* plane; horizontal axis a*, −60 to 60; vertical axis b*, −60 to 100.]
Figure 4.5.4-4. 95% confidence ellipses in a*b* plane constructed using common covariance matrices
for all observers and both monitors.
Thick line: common intra-observer; thin line: common inter-observer.
4.5.4.3. Modelling the observer metamerism from S&B dataset
Colour matching can be mathematically modelled if the SPDs of the primary lights, the SPD of the
test light, and the colour matching functions of the observer are known. For our experiment,
such a model results in the spectral power distribution of the light emitted by the monitor which
matches a paint sample reflecting light of a given SPD for the observer represented by the CMF.
When done for a set of CMFs belonging to a group of observers, the spread of matches thus
constructed corresponds to the magnitude of observer metamerism within this group of
observers for a particular “display-surface colour” metameric pair.
Let Q(λ) be the spectral power distribution of the paint sample, R(λ), G(λ) and B(λ) be the
SPDs of the monitor primaries, and $\bar{r}(\lambda)$, $\bar{g}(\lambda)$ and $\bar{b}(\lambda)$
be the CMF. The tristimulus values of Q(λ) with respect to colour matching functions
$\bar{r}(\lambda)$, $\bar{g}(\lambda)$ and $\bar{b}(\lambda)$ are given by
\[
\begin{aligned}
R_Q &= \sum_{\lambda=380}^{780} \bar{r}(\lambda)\,Q(\lambda) \\
G_Q &= \sum_{\lambda=380}^{780} \bar{g}(\lambda)\,Q(\lambda) \\
B_Q &= \sum_{\lambda=380}^{780} \bar{b}(\lambda)\,Q(\lambda)
\end{aligned}
\tag{4.5.12}
\]
Tristimulus values of each of the monitor primary lights with respect to the same set of CMF are
calculated similarly:
\[
\begin{aligned}
R_R &= \sum_{\lambda=380}^{780} \bar{r}(\lambda)\,R(\lambda) \\
G_R &= \sum_{\lambda=380}^{780} \bar{g}(\lambda)\,R(\lambda) \\
B_R &= \sum_{\lambda=380}^{780} \bar{b}(\lambda)\,R(\lambda)
\end{aligned}
\tag{4.5.13}
\]
Expressions for G(λ) and B(λ) are similar. The tristimulus values of stimulus Q(λ) in tristimulus
space defined by R(λ), G(λ) and B(λ) are calculated by the inverse-matrix transformation
(Section 2.2.5) as
\[
\begin{bmatrix} R_{Q,M} & G_{Q,M} & B_{Q,M} \end{bmatrix}
=
\begin{bmatrix} R_Q & G_Q & B_Q \end{bmatrix}
\begin{bmatrix}
R_R & G_R & B_R \\
R_G & G_G & B_G \\
R_B & G_B & B_B
\end{bmatrix}^{-1}
\tag{4.5.14}
\]
Subscript M stands for “Monitor”. Tristimulus values $R_{Q,M}$, $G_{Q,M}$ and $B_{Q,M}$ are, by
definition of tristimulus values, the multipliers of SPDs of the monitor primaries. Thus, the SPD
of the light emitted by the monitor that matches stimulus Q(λ) for an observer having CMF
$\bar{r}(\lambda)$, $\bar{g}(\lambda)$ and $\bar{b}(\lambda)$ is given by
\[
Q_M(\lambda) = R_{Q,M}\,R(\lambda) + G_{Q,M}\,G(\lambda) + B_{Q,M}\,B(\lambda)
\tag{4.5.15}
\]
This modelling assumes that the cross-media colour matching is governed by the laws of cone-
quantum colour matching and is additive. Under this assumption, using an observer’s colour
matching functions to predict the match is analogous to having the actual observer doing the
matching. Consequently, if the variabilities in the experimental and in the modelled data are
markedly different, then this assumption is wrong: the matching in our experiment is not cone-
quantum, and the experimental variability is not the result of observer metamerism.
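The modelling of Eqs. (4.5.12)-(4.5.15) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the Gaussian-shaped CMFs, primaries and sample SPD are synthetic stand-ins chosen to make the block runnable, not measured data. The linear system solved here is equivalent to the inverse-matrix form of Eq. (4.5.14).

```python
import numpy as np

def tristimulus(cmf, spd):
    """Eqs. (4.5.12)-(4.5.13): wavelength sums of each CMF times the SPD.

    cmf: (n_wavelengths, 3) array; spd: (n_wavelengths,) array.
    """
    return cmf.T @ spd

def matching_monitor_spd(cmf, primaries, q):
    """Return (Q_M, weights): the monitor SPD matching q for this observer.

    primaries: (n_wavelengths, 3) array with the R, G and B SPDs as columns.
    """
    t_q = tristimulus(cmf, q)                 # [R_Q, G_Q, B_Q]
    a = cmf.T @ primaries                     # column j: tristimulus values of primary j
    weights = np.linalg.solve(a, t_q)         # [R_QM, G_QM, B_QM], Eq. (4.5.14)
    return primaries @ weights, weights       # Eq. (4.5.15)

# Synthetic stand-in data (assumptions, for illustration only).
wl = np.arange(380.0, 785.0, 5.0)
gauss = lambda mu, sigma: np.exp(-0.5 * ((wl - mu) / sigma) ** 2)
cmf = np.stack([gauss(600, 40), gauss(550, 40), gauss(450, 30)], axis=1)
primaries = np.stack([gauss(610, 15), gauss(540, 25), gauss(460, 15)], axis=1)
q = 0.5 + 0.3 * gauss(580, 60)

q_m, weights = matching_monitor_spd(cmf, primaries, q)
```

Repeating the call with the CMFs of each observer in a dataset yields one matching monitor SPD per observer; the spread of these SPDs, once converted to CIELAB, is the modelled observer metamerism.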
Using the described procedure, we constructed a set of 47 monitor SPDs using the CMFs of the
individual observers in S&B (Stiles and Burch 1959) dataset.*,† The resulting SPDs were
converted into CIELAB values, which were subjected to analysis similarly to our own
experimental data. The MCDM values and the relationship between the lightness and
chromaticness are illustrated in Figure 4.5.4-5 and Figure 4.5.4-6.
[Figure 4.5.4-5: bar chart; vertical axis MCDM00, 0 to 1.4; horizontal axis: test colour.]
Figure 4.5.4-5. MCDM (CIEDE2000) within the modelled colour matching set using S&B 47
observers’ CMF.
Black: LCD; grey: CRT.
[Figure 4.5.4-6: bar charts in two panels, A) and B); vertical axis MCDM00, 0 to 1.4; horizontal axis: test colours.]
Figure 4.5.4-6. Relationship between the chromaticness and lightness variability in data modelled with
S&B CMFs.
A) LCD; B) CRT. Black: Chromaticness variability; Grey: Lightness variability
Generally, values are rather similar for both displays, with the only significant difference in
the yellow test colour, where the level of observer metamerism in CRT is about 30% higher than
in LCD (Figure 4.5.4-5). The mean MCDM values are 0.51 and 0.55 CIEDE2000 units for LCD
and CRT respectively, with maxima of 1.16 and 1.25 units in the white test colour. The
breakdown into lightness and chromaticness (Figure 4.5.4-6) shows that observer metamerism
has almost no effect on lightness matching, an effect which is expected, and is reflected in
the CIE Standard Deviate Observer by very low values in the y deviate function, as well as in our
own analysis of colour matching data (Section 3.3.1.2).
* The original data is for 49 observers; two observers were excluded from the analysis due to missing entries.
† The author wishes to thank P. Trezona for providing the set of original NPL colour matching data.
Since observer metamerism has no effect in lightness dimension, it only makes sense to
compare the modelled S&B colour matching data with our experimental data in chromaticness.
The mean S&B set variability is significantly smaller than ours: 0.52 versus 1.24 on average for
both displays. The difference is not homogeneous: the variability in neutrals is very similar,
while in the rest of test colours the differences are very large; this is illustrated in Figure 4.5.4-7.
[Figure 4.5.4-7: bar charts in two panels, A) and B); vertical axis MCDM00, 0 to 1.4; horizontal axis: test colour.]
Figure 4.5.4-7. Comparison of variability in MCDM00 terms (limited to chromaticness dimension)
between S&B dataset simulation and experimental data.
A) LCD; B) CRT. Black: S&B simulation; grey: experimental data.
Comparison of absolute variability provides only limited information, and cannot be used for
the identification of sources of variation. A more qualitative comparison of the two sets can be
made by examining the confidence ellipses in the a*b* plane (Figure 4.5.4-8).
Figure 4.5.4-8. 95% confidence ellipses in a*b* plane constructed from mean eleven observers’
matches, superimposed with ellipses constructed for 47 observers’ CMF from S&B dataset
A) LCD monitor; B) CRT monitor. Mean experimental matches: grey; S&B dataset: black.
Evidently, not only the sizes of the ellipses are different, but, in most cases, their
orientation as well. If asymmetric cross-media colour matching is governed strictly by the rules
of metameric matching, then it should be possible to model it from the S&B dataset. The
differences in size and orientation of the ellipses in all colours (with the exception of
neutrals) imply that the variability of our experimental data is not the result of observer
metamerism.
The similarity between the inter- and intra-observer ellipses (Figure 4.5.4-4) implies similarity
of the mechanisms underlying both variations. If the intra-observer variations are governed by
the observer’s colour discrimination thresholds, so are the inter-observer ones. Hence we should
be able to model the inter-observer variability with the CIEDE2000 colour difference formula as
we did with the intra-observer one (Figure 4.5.3-8). An attempt to do so is illustrated in Figure
4.5.4-9.
Figure 4.5.4-9. Combined 95% inter-observer ellipses for both monitors, superimposed with ellipses of
constant CIEDE2000 colour difference equal to 1 with parametric coefficients [1 3 1].
Black: experimental ellipses; grey: CIEDE2000 ellipses
To achieve a better similarity between the two sets, parametric coefficients [1 3 1] were used
this time. The match between the two sets is very good; it is clear that our data is modelled by
the colour difference formula much better than by the model of observer metamerism.
There remain two colours which do not fit into the common pattern, the neutrals: white and
grey (enlarged ellipses for white are shown in Figure 4.5.4-10). In terms of ellipse size and
shape, they seem to be modelled fairly well by both the observer metamerism and the colour
difference models. In this case the colour discrimination mechanisms seem to be as sensitive as
the colour matching ones; however, the CIEDE2000 formula can still be used.
[Figure 4.5.4-10: plot in the CIELAB a*b* plane; horizontal axis a*, −4 to 4; vertical axis b*, −10 to −1.]
Figure 4.5.4-10. Combined 95% inter-observer ellipses of white matches for both monitors,
superimposed with ellipses of constant CIEDE2000 colour difference equal to 1 with parametric
coefficients [1 3 1], and S&B dataset ellipses.
Black: experimental ellipse; Grey: CIEDE2000 ellipse; Red: S&B ellipse
4.5.4.4. Modelling the observer metamerism: the eye optical model
There is still, however, an unlikely possibility that the S&B CMFs are not representative of the
variations in the colour-normal population due to, for example, the mathematical treatment of the
original colour matching data (Alfvin and Fairchild 1997) or a particular choice of observers. We
can test this possibility by modelling the colour matching experiment without the use of the S&B
CMFs, using only the knowledge of the physiology and optics of the eye. The following procedure
is a development based on the formulae and methods published in (CIE 2005).
Cone-quantum colour matching is guided by the laws of physics and optics. Upon entering the
eye, the light is filtered by the lens and by the macular pigment, and is then absorbed by the
photopigment of the cones. Each of these stages can be modelled mathematically if the light
absorbing properties of these substances are known. The knowledge of the ranges and of the
distributions of variation of each of the eye elements among colour-normal observers can be
used to introduce pseudorandom noise which models the inter-individual variations in
colour matching properties. The result is a set of artificially generated cone fundamental
functions, the variation within which is representative of the variations within the colour-normal
population.
The procedure can be broken down into the following stages:
1. Start with the set of standard cone fundamental functions (CFF).
2. Convert the CFF to cone absorptance spectra using the standard values of lens, macular
pigment and photopigment densities.
3. Assign new values to the lens, macular pigment and photopigment densities. The new
densities are chosen from a pseudorandomly generated pool of values, all of which are
within the boundaries chosen for the model (Table 4.5.4-1) and distributed according to
the chosen distribution.
4. Convert the cone absorptance back to CFF, replacing the standard values by the
randomly generated ones.
The formula which relates the CFF to cone absorptance was given in (Section 2.1.6.2); it is
reproduced here in a form more resembling algorithmic statements and thus more convenient
for programming implementation. Let l(λ), m(λ) and s(λ) be the set of CFF, and Dlens(λ)
and Dmac(λ) be the lens and the macular pigment density spectra. The cone fundamental values
are corrected for the effect of macular pigment and lens absorption, and converted to the quantum
spectrum scale:
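\[
\begin{aligned}
l_C(\lambda) &= Q\,k_{lC}\,\frac{l(\lambda)}{T(\lambda)} \\
m_C(\lambda) &= Q\,k_{mC}\,\frac{m(\lambda)}{T(\lambda)} \\
s_C(\lambda) &= Q\,k_{sC}\,\frac{s(\lambda)}{T(\lambda)}
\end{aligned}
\tag{4.5.16}
\]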
where Q is a factor which relates the values of equal energy spectrum to quantum spectrum:
\[
Q = \frac{1}{\lambda} \tag{4.5.17}
\]
T is the total transmittance of the prereceptoral filters – lens and macular pigment:
\[
T(\lambda) = 10^{-\left(D_{lens}(\lambda) + D_{mac}(\lambda)\right)} \tag{4.5.18}
\]
and klC, kmC, ksC are the normalising factors:

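\[
\begin{aligned}
k_{lC} &= \frac{1}{\max\left(l_C(\lambda)\right)} \\
k_{mC} &= \frac{1}{\max\left(m_C(\lambda)\right)} \\
k_{sC} &= \frac{1}{\max\left(s_C(\lambda)\right)}
\end{aligned}
\tag{4.5.19}
\]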
Calculate the cone absorptance spectra from the corrected fundamentals:
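\[
\begin{aligned}
\alpha_l(\lambda) &= -\log\left(1 - t_l\,l_C(\lambda)\right) k_{l\alpha} \\
\alpha_m(\lambda) &= -\log\left(1 - t_m\,m_C(\lambda)\right) k_{m\alpha} \\
\alpha_s(\lambda) &= -\log\left(1 - t_s\,s_C(\lambda)\right) k_{s\alpha}
\end{aligned}
\tag{4.5.20}
\]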
Here ti (i = l, m, s) is the peak transmittance of ith photopigment, calculated from its peak density
Dmax,i as
\[
t_i = 1 - 10^{-D_{max,i}} \tag{4.5.21}
\]
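and klα, kmα, ksα are the normalising factors:
\[
\begin{aligned}
k_{l\alpha} &= \frac{1}{\max\left(\alpha_l(\lambda)\right)} \\
k_{m\alpha} &= \frac{1}{\max\left(\alpha_m(\lambda)\right)} \\
k_{s\alpha} &= \frac{1}{\max\left(\alpha_s(\lambda)\right)}
\end{aligned}
\tag{4.5.22}
\]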
Next, new values are assigned to Dlens(λ) and Dmac(λ), and the absorptance functions αi (i = l, m,
s) are shifted on the wavelength scale to account for new peak sensitivity wavelength. The new
values are taken from the randomly generated pool within the ranges indicated in Table 4.5.4-1.
With the new values assigned, the above calculation procedure is reversed to arrive at the new
cone fundamental functions. The CFF corrected for prereceptoral absorption are given by
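\[
\begin{aligned}
l_{C,new}(\lambda) &= \frac{1 - 10^{-\alpha_{l,new}(\lambda)/k_{l\alpha}}}{1 - 10^{-D_{max,l,new}}} \\
m_{C,new}(\lambda) &= \frac{1 - 10^{-\alpha_{m,new}(\lambda)/k_{m\alpha}}}{1 - 10^{-D_{max,m,new}}} \\
s_{C,new}(\lambda) &= \frac{1 - 10^{-\alpha_{s,new}(\lambda)/k_{s\alpha}}}{1 - 10^{-D_{max,s,new}}}
\end{aligned}
\tag{4.5.23}
\]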
Here, the subscript new indicates the randomly modified values: αi,new (i = l, m, s) are the cone
absorptance spectra with the peak absorptance shifted on the nm scale, and Dmax,i,new is the modified
peak cone density. Lastly, the prereceptoral filtering correction is reversed:
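\[
\begin{aligned}
l_{new}(\lambda) &= \frac{l_{C,new}(\lambda)\,T_{new}(\lambda)}{Q\,k_{lC}} \\
m_{new}(\lambda) &= \frac{m_{C,new}(\lambda)\,T_{new}(\lambda)}{Q\,k_{mC}} \\
s_{new}(\lambda) &= \frac{s_{C,new}(\lambda)\,T_{new}(\lambda)}{Q\,k_{sC}}
\end{aligned}
\tag{4.5.24}
\]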
Here lnew(λ), mnew(λ) and snew(λ) are the new CFFs, and Tnew(λ) is the total transmittance of the
prereceptoral filters (lens and macular pigment), calculated with the randomly modified lens and
macular pigment density values Dlens,new(λ) and Dmac,new(λ):
\[
T_{new}(\lambda) = 10^{-\left(D_{lens,new}(\lambda) + D_{mac,new}(\lambda)\right)} \tag{4.5.25}
\]
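The chain of Eqs. (4.5.16)-(4.5.25) can be sketched for a single cone type as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the function names are hypothetical, the logarithm is taken as base 10 (implied by the powers of 10 in Eq. (4.5.23)), the normalising factors are read as scaling each curve to a peak of 1, the wavelength shift is done by linear interpolation, and the spectra are synthetic stand-ins rather than the Stockman and Sharpe data.

```python
import numpy as np

def cff_to_absorptance(cff, wl, d_lens, d_mac, d_max):
    """Eqs. (4.5.16)-(4.5.22): one cone fundamental -> normalised absorptance."""
    q = 1.0 / wl                                   # Eq. (4.5.17)
    t_pre = 10.0 ** (-(d_lens + d_mac))            # Eq. (4.5.18)
    raw = q * cff / t_pre
    k_c = 1.0 / raw.max()                          # Eq. (4.5.19), peak scaled to 1
    corrected = k_c * raw                          # Eq. (4.5.16)
    t_peak = 1.0 - 10.0 ** (-d_max)                # Eq. (4.5.21)
    density = -np.log10(1.0 - t_peak * corrected)  # assumption: log is base 10
    k_a = 1.0 / density.max()                      # Eq. (4.5.22), peak scaled to 1
    return k_a * density, k_c, k_a                 # Eq. (4.5.20)

def absorptance_to_cff(alpha, wl, d_lens, d_mac, d_max, k_c, k_a, shift_nm=0.0):
    """Eqs. (4.5.23)-(4.5.25): absorptance -> cone fundamental with new parameters."""
    # Shift the absorptance along the wavelength axis (assumed: linear interpolation).
    alpha_new = np.interp(wl, wl + shift_nm, alpha)
    corrected = (1.0 - 10.0 ** (-alpha_new / k_a)) / (1.0 - 10.0 ** (-d_max))  # Eq. (4.5.23)
    t_pre = 10.0 ** (-(d_lens + d_mac))            # Eq. (4.5.25)
    return corrected * t_pre / ((1.0 / wl) * k_c)  # Eq. (4.5.24)

# Hypothetical stand-in spectra (NOT the Stockman and Sharpe tables).
wl = np.arange(390.0, 731.0, 5.0)
cff_l = np.exp(-0.5 * ((wl - 570.0) / 45.0) ** 2)
d_lens = 1.5 * np.exp(-(wl - 390.0) / 40.0)
d_mac = 0.095 * np.exp(-0.5 * ((wl - 460.0) / 25.0) ** 2)

alpha_l, k_c, k_a = cff_to_absorptance(cff_l, wl, d_lens, d_mac, d_max=0.38)

# With unchanged parameters the procedure must return the original fundamental.
same = absorptance_to_cff(alpha_l, wl, d_lens, d_mac, 0.38, k_c, k_a)

# One "simulated observer": lens density +25%, peak shifted by +2 nm.
varied = absorptance_to_cff(alpha_l, wl, 1.25 * d_lens, d_mac, 0.38, k_c, k_a, shift_nm=2.0)
```

Reversing the chain with unchanged parameters is a convenient sanity check before any pseudorandom variation is introduced: the output must then equal the input fundamental.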
The Stockman and Sharpe CFF (Stockman et al. 1999; Stockman and Sharpe 2000) were used, with
their estimates of peak cone density of 0.38, 0.38 and 0.3 for L, M and S, respectively. The
macular pigment density of Bone et al., with a peak of 0.095 (Bone et al. 1992), and the lens
density estimates of Stockman and Sharpe (Stockman et al. 1999) were also used. The literature on the values
and magnitudes of variability of the model elements was reviewed in (Section 2.1); the values
used in the model are given here in Table 4.5.4-1:
Model parameter                                    Variations
Location of peak sensitivity of M and L cones      ±2 nm
Peak cone density                                  σ=0.045
Macular pigment                                    ±45%
Lens density                                       ±25%
Table 4.5.4-1. Values of variations used in modelling of observer metamerism.

Several choices had to be made in the model implementation, some based on the available
literature, and some arbitrary due to lack of agreement or insufficient data. It was assumed that
cones with different locations of peak sensitivity are mixed within the same retina; hence the
cone peak sensitivities were allowed to vary continuously along the wavelength scale within the
indicated range. Cone maximum density was assumed to co-vary in all three
types of cones. It was assumed that the cones are distributed uniformly in the retina, all having
identical effective optical density. The macular pigment layer was assumed to have uniform
density over the entire area corresponding to the 6° external field. The random values for
all the model elements were drawn from a Student t-distribution.
The set of 50 cone fundamental functions generated by the above procedure is illustrated in
Figure 4.5.4-11.
Figure 4.5.4-11. Set of 50 cone fundamental functions generated using the optical model of variability of
colour matching described in the text.
This set was used to simulate the colour matching experiment with the paint samples and the LCD
and CRT monitor primaries, as described in section 4.5.4.3 by Eqs. (4.5.12)-(4.5.15). The
resulting 50 spectra were converted to CIELAB values. The variability within the simulated set
in MCDM terms (CIEDE2000) is illustrated in Figure 4.5.4-12, where it is compared with the
corresponding values calculated for the S&B colour matching dataset. The same data are illustrated in the
form of 95% confidence ellipses in the a*b* plane in Figure 4.5.4-13.
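For readers who wish to reproduce plots of this kind, one common way of constructing a 95% ellipse from a set of a*b* coordinates is sketched below. It assumes a bivariate normal scatter and scales the covariance eigenvalues by the chi-square quantile for two degrees of freedom; this is an assumption made for illustration and not necessarily the exact construction used for the figures here. The function name and the sample points are hypothetical.

```python
import math
import numpy as np

def confidence_ellipse(ab, level=0.95):
    """Return centre, semi-major axis, semi-minor axis and orientation (degrees)
    of the ellipse covering `level` of a bivariate normal fitted to `ab`."""
    ab = np.asarray(ab, dtype=float)
    centre = ab.mean(axis=0)
    cov = np.cov(ab, rowvar=False)                # 2x2 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    # Chi-square quantile for 2 degrees of freedom has a closed form.
    scale = -2.0 * math.log(1.0 - level)
    semi_minor, semi_major = np.sqrt(scale * eigvals)
    major_dir = eigvecs[:, 1]                     # direction of largest variance
    angle = math.degrees(math.atan2(major_dir[1], major_dir[0])) % 180.0
    return centre, semi_major, semi_minor, angle

# Hypothetical a*b* coordinates, elongated along the a* = b* diagonal.
points = [(-2.0, -2.0), (2.0, 2.0), (-0.5, 0.5), (0.5, -0.5)]
centre, semi_major, semi_minor, angle = confidence_ellipse(points)
```

For the sample points above the major axis lies along the diagonal, so the orientation comes out at 45 degrees.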
[Figure: bar charts of MCDM(00) (vertical axis, 0 to 1.4) for the test colours White, Yellow, Magenta, Blue and Dark green; panels A) and B)]
Figure 4.5.4-12. Variability within the set of simulated CFF superimposed with the corresponding
variability values in S&B dataset.
A) LCD; B) CRT. Grey: simulated CFF; black: S&B dataset.
Figure 4.5.4-13. 95% ellipses in a*b* plane, constructed from the simulated CFF, superimposed with the
corresponding S&B dataset ellipses.
A) LCD; B) CRT. Black: Simulated CFF; grey: S&B dataset. The ellipses are scaled up ×5 their real size.
The S&B dataset is the result of a colour matching experiment made by 47 colour-normal
observers. The simulated set is the result of mere optical modelling based on approximate
knowledge of the variations that exist in the eye optical path, mean density estimates and a number
of gross simplifications about the construction of the eye optical and sensory system.
Considering these circumstances, the correspondence between the two sets of data is truly
remarkable: in MCDM terms, the values agree within 14% for the LCD data and within 30% for the CRT data.
The relatively large discrepancies in the CRT data are mostly due to underestimation in the neutral
and green colours.
The 95% a*b* confidence ellipses illustrate that the correspondence is not only quantitative but
also qualitative. The orientation and sizes of the ellipses are very similar; in some cases the
ellipses of the two sets practically coincide. Even in the CRT plot, where the correspondence in
sizes is poorer than in LCD, the correspondence in relative shape and orientation is high.
Our results represent an initial attempt to model the variability of the colour matching process by
purely mathematical means. It should certainly be possible to optimise the model for better
performance and correspondence with the experimental results, and also to reveal the reasons
for the poorer performance with the CRT display spectra. This, however, is outside the scope of
this study. The motivation for this modelling was to perform an additional test of the main
conclusion made so far: the variability of cross-media colour matching in soft-proofing
conditions is not the result of observer metamerism. This conclusion was based on the
modelling of the colour matching experiment using the 47 S&B observers and the monitor and paint
sample spectra. The variability within the S&B data, in its turn, can be simulated well by an
analytical model of the eye optical path. This confirms that the lack of correspondence
between the levels and character of variability in colour matches within the S&B dataset and our
experimental results is not due to low quality of the S&B data, but rather due to differences in
the vision mechanisms operating in the two conditions.
4.5.5. Agreement between observers
In the analysis of the inter-observer agreement we ask two questions:

1. What is the probability that a pair considered as a match by one observer will be
perceived as a mismatch by another, and what is the magnitude of this mismatch?
2. What is the probability that a mean match made by a group of observers will be
perceived as a mismatch by individuals, and what is the magnitude of this mismatch?
The first question is the one of significance of individual variations in colour vision. The second
one is the question of practical feasibility of the concept of Standard Colorimetric Observer in
our conditions and experimental task.
4.5.5.1. Agreement between individual observers
CIELAB coordinates of five matches made by each observer for each test colour and computer
display were used to calculate the mean match, and to construct the covariance matrix
characterising the uncertainty of that match. The mean CIELAB values and covariance matrices
for every test colour, monitor and pair of observers were fed into Nel and Van der Merwe’s
(NVM) test for equality of mean vectors (2.4.6.2, Eqs. (2.4.39)-(2.4.41)). The test resulted in
the binary value of 0 if the means were statistically different, or 1 otherwise. For each observer,
monitor and test colour, the disagreement rate PD was calculated as
\[
P_D = 1 - \frac{\sum_{i=1}^{n-1} d_i}{n-1} \tag{4.5.26}
\]
where di is the result of NVM statistical test for the equality of mean CIELAB values of
observer’s match with those of i-th observer, and n is the number of observers. The value of PD
can be considered to be the probability that the individual observer would disagree on the colour
match made by the other observer – for the particular test colour and monitor.
Consequently, the mean value $\bar{P}_D$ over all the observers for a given test colour,
\[
\bar{P}_D = \frac{\sum_{i=1}^{n} P_{D,i}}{n} \tag{4.5.27}
\]
where $P_{D,i}$ is the disagreement rate of the i-th observer, represents the probability of
disagreement on a particular monitor-paint sample colour match
within our group. The value $\bar{P}_D$ can be viewed as the measure of uncertainty of colour matching for
a given monitor-paint sample colour match.
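Eqs. (4.5.26) and (4.5.27) can be sketched as below for one test colour and one monitor. The sketch assumes that the pairwise NVM test outcomes are already available as a symmetric n × n matrix with 1 where two observers' means are statistically equal and 0 otherwise; the function name and the three-observer matrix are hypothetical illustrations, not experimental data.

```python
import numpy as np

def disagreement_rates(agree):
    """Per-observer disagreement rate (Eq. 4.5.26) and its group mean (Eq. 4.5.27).

    `agree[j, i]` is the NVM test outcome d_i for observer j against observer i:
    1 if the mean matches are statistically equal, 0 if they differ.
    The diagonal (an observer against itself) is ignored.
    """
    agree = np.asarray(agree, dtype=float)
    n = agree.shape[0]
    agreements = agree.sum(axis=1) - np.diag(agree)   # sum over the n-1 other observers
    p_d = 1.0 - agreements / (n - 1)                  # Eq. (4.5.26)
    return p_d, p_d.mean()                            # Eq. (4.5.27)

# Hypothetical outcomes for three observers: 1 and 2 agree, 3 agrees with nobody.
example = [[1, 1, 0],
           [1, 1, 0],
           [0, 0, 1]]
p_d, p_d_mean = disagreement_rates(example)
```

In this toy case the first two observers each disagree with one of their two colleagues, and the third disagrees with both.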
The value D00 of the CIEDE2000 colour difference between matches made by a pair of observers is
the measure of the perceptual discrepancy between the monitor and a paint sample as perceived by
one observer when viewing the match made by the other. For a given test colour and
monitor, the mean colour difference over all observer pairs represents the mean discrepancy
perceived by any observer within the group when viewing a match made by any other observer.
Thus, the uncertainty of colour matching is fully characterised by two values:

− the statistical estimate PD of probability that a pair of samples matching to one observer
would mismatch to another, and
− the colour difference D00 specifying the difference perceived by one observer when
viewing the match made by another.
Our definition of uncertainty of colour matching is completely independent of the causes of
disagreement. This independence is necessitated by our conclusion made in section 4.5.4
that the variability of colour matches in our experiment is not the result of observer
metamerism.
The results of this analysis are given in numerical form in Table 4.5.5-1, and illustrated
graphically in Figure 4.5.5-1. Figure 4.5.5-2 illustrates the mean disagreement rates for each
observer. Table 4.5.5-2, Figure 4.5.5-3 and Figure 4.5.5-4 give the results of the analysis in
chromaticness plane.
Test colour     PD (LCD)   PD (CRT)   D00 (LCD)   D00 (CRT)
White           0.47       0.58       4.28        4.34
Grey            0.79       0.70       5.36        5.18
Yellow          0.56       0.70       4.24        3.95
Brown           0.58       0.70       4.07        3.46
Magenta         0.49       0.43       4.54        3.95
Purple          0.45       0.60       4.71        3.95
Blue            0.58       0.54       3.78        3.57
Cyan            0.63       0.57       5.40        5.15
Dark green      0.47       0.35       4.03        3.83
Bright green    0.51       0.36       4.70        5.08
Mean            0.55       0.55       4.51        4.25
Table 4.5.5-1. Mean values of PD and D00 for each test colour and for both displays.
[Figure: bar charts per test colour; panel A) PD (vertical axis, 0.0 to 1.0); panel B) D00 (vertical axis, 0 to 6)]
Figure 4.5.5-1. Mean values of PD and D00 for each test colour and for both displays.
A) PD B) D00 . Black: LCD; grey: CRT
[Figure: bar chart of PD (vertical axis, 0.0 to 1.0) per observer; horizontal axis labels: S1 H1 W1 D1 C1 J1 C2 B1 S2 W1 Y1]
Figure 4.5.5-2. Mean values of PD for each observer and for both monitors
Black: LCD; grey: CRT
Test colour     PD,a*b* (LCD)   PD,a*b* (CRT)   D00,a*b* (LCD)   D00,a*b* (CRT)
White           0.43            0.53            2.22             2.14
Grey            0.68            0.51            1.57             1.66
Yellow          0.68            0.64            1.78             2.14
Brown           0.48            0.54            2.24             1.77
Magenta         0.35            0.38            1.59             1.54
Purple          0.35            0.30            1.65             1.67
Blue            0.49            0.47            1.62             1.86
Cyan            0.59            0.29            2.10             1.86
Dark green      0.22            0.29            1.35             1.46
Bright green    0.45            0.29            1.85             2.02
Mean            0.47            0.42            1.80             1.81
Table 4.5.5-2. Mean values of PD and D00 for each test colour and for both displays, calculated for
chromaticness only
[Figure: bar charts per test colour; panel A) PD,a*b* (vertical axis, 0.0 to 1.0); panel B) D00,a*b* (vertical axis, 0.0 to 2.5)]
Figure 4.5.5-3. Mean values of PD and D00 for each test colour and for both displays, calculated for
chromaticness only
[Figure: bar chart of PD (vertical axis, 0 to 1) per observer; horizontal axis labels: S1 H1 W1 D1 C1 J1 C2 B1 S2 W1 Y1]
Figure 4.5.5-4. Mean values of PD for each observer, for both monitors, for chromaticness only.
LCD: black; CRT: grey
In the conditions of our experiment, there is on average a 55% probability that a match
established by one observer will be perceived as a mismatch by another, while the mean colour
difference between the samples would be 4.3 CIEDE2000 units. The disagreement rates of
individual observers vary within 10%-15% about the mean, with the exception of observer W1.
The analysis limited to the chromaticness dimension results in disagreement rates that are only
slightly smaller. This is at odds with the inter-individual chromaticness variations, which are
lower than the total colour variations by a factor of more than two. This indicates that the
contribution of lightness perception to disagreements between observers is very low, so the
observers disagree mostly about the chroma and the hue of the matches.
The reported disagreement rates and colour differences are alarmingly high. Moreover, as we
have already shown, discrepancies of this magnitude cannot be caused by individual variations
of colour matching functions, i.e. by observer metamerism. Two explanations can be suggested
for these outcomes:
1. The disagreements result from a combination of observer metamerism with individual
variations in colour vision not directly related to cone sensitivities. These can be
variations in immediate post-receptoral processes, as well as in neural processing and
the processing of visual information in the cortex.
2. The disagreements do not result from individual variations in colour vision mechanisms,
whatever they are, but rather are an artefact of our experimental method.
Neither explanation offers possibilities for development. The possibility that the individual
variations in matches are caused by variations in high-order neural and brain processing would
mean that the concept of a standard deviate observer based on variability of CMF, as proposed by
Nimeroff (Nimeroff et al. 1961) and Allen (Allen 1970), is not applicable to cross-media colour
matching. In fact, such applicability was never shown. On the other hand, any alternative
framework for accounting for individual variations in colour vision must be based on
knowledge of the visual mechanisms underlying these variations; such knowledge is not available.
In this situation, the results of experiments such as the present one provide a general estimate of
the uncertainty of colour matching for applications utilising a similar setup.
It must also be noted that the problem expected to result from a 55% probability of disagreement
between individual observers does not seem to be known in the industry: computer displays have been
employed successfully for decades, and no documented or anecdotal evidence of disagreement
of this magnitude exists. This may point in favour of the second option: the results are an
artefact of our experimental conditions. Although it is a valid possibility, we do not see
evidence for this to be the case, and cannot identify elements in the experimental setup that could
cause such a bias.
A third possibility that must be considered is that the observed variations are real, but
industrial practice neither requires nor provides conditions for inter-individual
comparisons of the kind we are dealing with. That is, in reality it rarely happens that two
observers simultaneously judge a colour match between the monitor and the hardcopy.
Considering the uncertainties of industrial viewing conditions, this might render even the large
variations practically insignificant. However, this option can only be considered as
speculation.
4.5.5.2. Agreement between individual observers and the mean
It is impossible to design a system which would create perfect colour reproduction for each
individual observer. Any system would have to be based on a Standard Colorimetric Observer: a
set of assumptions regarding the average properties of all colour-normal humans. The success of
such a system depends on the degree to which each individual differs from the average. We
performed an analysis of the discrepancies between each individual observer and the mean of all
observers in our group using methods similar to the ones described above for the inter-individual
analysis. In this case, one observer in each comparison was the “Standard Colorimetric
Observer” of the group.
On average for both monitors and all colours and observers, there is a probability of 38% that an
individual observer would disagree with the mean match made by all observers, while the
perceived colour difference would be 3 CIEDE2000 units. These results are illustrated in Figure
4.5.5-5.
[Figure: bar charts per test colour; panel A) PD (vertical axis, 0.0 to 1.0); panel B) D00 (vertical axis, 0.0 to 4.0)]
Figure 4.5.5-5. Mean values of PD for disagreement between each individual observer and the mean of
the group.
As with the inter-individual agreement, these values are significantly higher than ones expected
to result from observer metamerism, and do not seem to correspond to industrial experience.
Consequently, similar explanations can be suggested.
4.6. Data analysis and discussion:

Agreement with the Standard Colorimetric Observer –
adaptation, colour matching and additivity failures
4.6.1. Description of the discrepancies
The geometric configurations of the stimuli, i.e. viewing angle and visual sizes of the stimuli
and of the background, were similar on both displays and in the viewing cabinet. The colour of
the background on both monitors was adjusted individually by each observer to match the
background in the cabinet. Thus, the viewing conditions of the stimuli on the display and in the
cabinet were nearly identical. In such a setup, it is reasonable to expect that, while each observer
will be different from the standard, the mean match of eleven observers would be very similar to
the CIE 1964 Observer prediction. This is not so: the mean matches of all colours except
Yellow and Brown are shifted towards blue-purple – as illustrated in Figure 4.4.3-3 (reproduced
here in Figure 4.6.1-1). On average, observers used more blue light to match the paint sample on
computer displays than the CIE Standard Colorimetric Observer predicts.
The systematic shift of the a*b* coordinates of the mean matches relative to those of the paint samples could
already be seen in the ellipses in Figure 4.4.2-3: the centres of the ellipses in most of the plots
are lower (i.e. “bluer”) than the colour centre. These discrepancies are of different magnitudes
for different observers. However, it must be noted that comparisons of this kind can only be
made on the mean results of the group and cannot be made on individual matches, and they are only
valid within the same system of coordinates. Under the assumption that the CIE 1964 Standard
Colorimetric Observer is representative of our group, we infer that the average observer of the group
is similar to the standard, hence we compare the two colours, the paint sample and the mean
group match, as the CIE observer would “see” them. On the other hand, comparison of an
individual match with the CIE coordinate of the paint sample is a comparison of colours as seen
by two different observers, and thus not valid. Consequently, by noting that some individual
observation is closer to the CIELAB coordinate of the paint sample than the rest, we cannot
conclude that for that particular individual the discrepancy is less significant than for the rest of
the observers: without having the individual’s CMF we have no way of estimating her response
to both stimuli, and have no method of estimating the colour difference between them in her
individual colour space.
Apart from the slight shift in direction of the discrepancy, there is no significant difference in
matches made with the two displays. Therefore mean results of both will be dealt with in the
following discussion.
[Figure: four panels. A) chromaticity diagram, axes x10 and y10; B) a*b* plane; C) a*L* plane; D) b*L* plane]
Figure 4.6.1-1. Differences between the paint samples and the mean stimuli judged by observers to
match these samples in colour, illustrated as chromaticities and as vectors in CIELAB planes with the
origin at the coordinate of the paint sample and the head at the coordinate of the mean match made by
observers.
A) CIE 1964 chromaticity diagram; crosses: chromaticities of the paint samples; dots: chromaticities of
mean matches of 11 observers; B) CIELAB a*b* plane; C) CIELAB a*L* plane; D) CIELAB b*L*
plane. LCD display: solid line; CRT display: dotted line. Vectors are scaled up ×5 their original size.
In Figure 4.6.1-1, the plot from Figure 4.4.3-3 is reproduced along with similar plots for the other
two CIELAB planes. As the a*L* and b*L* plots demonstrate, the consistent discrepancies are not
limited to the chromaticness dimension, but occur in lightness as well; Table 4.6.1-1 and
Figure 4.6.1-2 illustrate their magnitudes.
Test colour       CIEDE2000*   CIEDE2000 (a*b*)   CIEDE2000 (L*)
White             3.74         3.73               0.00
Grey              2.21         1.84               0.37
Yellow            0.97         0.58               0.39
Brown             3.06         0.64               2.42
Magenta           3.04         1.20               1.84
Purple            3.17         1.35               1.82
Blue              3.04         1.56               1.47
Cyan              3.40         1.84               1.56
Dark green        3.38         0.53               2.85
Bright green      2.56         0.34               2.22
Background        3.32         1.50               1.82
Mean              2.90         1.37               1.53
Mean (relevant)   3.10         1.92               1.18
Table 4.6.1-1. CIEDE2000 colour difference between the mean match made by all observers on both
monitors and the paint sample, separated into lightness and chromaticness dimensions.
Row entitled “Mean (relevant)” contains mean values only for the colours in which the discrepancies in
chromaticness plane are significant; names of these colours are accentuated by bold-italic typeface.
[Figure: bar charts of CIEDE2000 colour difference (vertical axis, 0.0 to 4.0) per test colour, including the background; panels A) and B)]
Figure 4.6.1-2. Mean CIEDE2000 colour difference between the mean match made by all observers on
both monitors and the paint sample, separated into lightness and chromaticness dimensions.
A) Total colour difference: Black: LCD, Grey: CRT
B) Mean colour difference for both displays separated to chromaticness (Black) and lightness (Grey)
In chromaticness, the discrepancies between the Standard and the real observers take the
form of a shift towards blue; they are most significant in the lower half of the a*b* diagram and
in the neutrals. Moreover, these discrepancies are stronger for white than for grey. In the
green-yellow-orange region the discrepancies are minimal or non-existent. The mean chromaticness
difference between the mean match and the CIE prediction in colours where the discrepancy is
significant (white, grey, magenta, purple, blue and cyan) is 1.92 CIEDE2000 units. The shift
is strongest in white, at 3.73 CIEDE2000 units, while in the rest of these colour centres it varies from
1.2 in magenta to 1.84 in grey and cyan.
In lightness, the discrepancies take the form of shifts towards lower lightness. The mean shift in
“relevant” colours is 1.18 CIEDE2000 units. It seems to be smallest in colours with highest
lightness: white and yellow, which suggests a possibility of some colour appearance effect.
In the following discussion we will concentrate on discrepancies in chromaticness.
4.6.2. Discrepancies and adaptation
Wyszecki and Stiles define chromatic adaptation as (Wyszecki and Stiles 1982):
…modifications of visual response, particularly the response to chromatic test
stimuli, brought about by chromatic conditioning (adapting) stimuli that are
surrounding or pre-exposed.
The fact that the observed discrepancies are of very systematic fashion suggests a change in
visual response when switching between the two viewing conditions: one of viewing the paint
sample in the cabinet, and another of viewing the patch on computer display. However, there
were no conditioning stimuli in our experiment except for the test stimuli themselves: the paint
patch surrounded by grey board, and a patch on the monitor surrounded by grey background.
Therefore we adopt a working assumption that the discrepancies result from adaptation to the
experimental stimuli, while the exact nature of this adaptation is, for the time being, unknown.
Due to obvious differences from classical adaptation case as described by Wyszecki and Stiles
(i.e. absence of conditioning stimulus), we will use the term adaptation rather than chromatic
adaptation, thus allowing for the possibility that the underlying mechanisms in both cases are
different. This is also consistent with the long-term treatment of adaptation in colour matching
in the literature, which we will discuss later.
The only light sources in the laboratory were the viewing cabinet and the monitors. The paint
samples and the background in the cabinet were illuminated by the fluorescent illuminant, and
viewed in an object-colour mode. Hence, when viewing the paint sample, the assumption is that
the observers are adapted to the cabinet illuminant. In the monitor, however, there was no a
priori illuminant or white point to adapt to: observers began their adjustments from black
screen. The following options can be suggested:
1. Observers adapted to the surrounding stimulus: the grey background of the monitor.
This option is in fact self-contradictory as it cannot explain the discrepancies in the
adjustment of the background itself, which follow the common trend.
2. Observers adapted to the stimulus on the monitor (the matching colour patch).
3. When looking at the monitor, the observers are in a “neutral” state of adaptation, that is,
they are adapted to some “internal” reference illuminant. Different options are
suggested for such an illuminant, for example D65 (Rich 2006) or 5500K (Hurvich and
Jameson 1951; Hurvich and Jameson 1951; Jameson and Hurvich 1951).
Following is a report of testing these possibilities. This testing was done under three working
assumptions:
− The CIE 1964 observer describes correctly the mean colour matching properties of our
group of observers
− The entire observed chromaticness shift is a result of adaptation
− The effect of adaptation is only on chromaticness dimension of colour
The subject of adaptation has been studied extensively, and a vast amount of literature exists. In order to
keep to the scope of this work, we limit the review of the subject to a brief description of the methods
used in the analysis, the results and their interpretation. A comprehensive review of chromatic
adaptation transformations and datasets can be found on pages 429-458 of (Wyszecki and Stiles
1982), and in (CIE 2004).
4.6.2.1. General adaptation model
Adaptation is a change in the response of the colour vision mechanism as the result of exposure to a
visual stimulus. A chromatic adaptation transform (CAT) is a mathematical model that predicts
the change in appearance of the stimulus under different adaptation conditions. In the classical
form, the inputs to a CAT are the XYZ tristimulus values of the stimulus under the test illuminant,
the XwYwZw tristimulus values of the reference white under the test illuminant, and the XrwYrwZrw
tristimulus values of the reference white under the reference illuminant. The output is a set of
tristimulus values XcYcZc describing the colour XYZ when viewed under the reference illuminant. In a
more general case, the tristimulus values XwYwZw and XrwYrwZrw can be thought to describe any
adapting stimuli rather than just illuminants.
[Diagram: inputs to the CAT: XYZ of the patch on the monitor; XYZ of the viewing cabinet illuminant; XYZ of the adapting stimulus (unknown, marked "?"). Output of the CAT: XYZ of the patch in the cabinet]
Figure 4.6.2-1. Schematic illustration of chromatic adaptation transform
We find the CIE terminology of test and reference viewing conditions (or illuminants) to be
confusing and not intuitive: it does not provide immediate information on the function of each
element and condition in the transform. In the following discussion, we will use terms source
viewing conditions (or illuminant) to indicate the conditions from which the stimulus tristimulus
values are transformed, and destination viewing conditions (or illuminant) to indicate the
conditions the stimulus’ tristimulus values are transformed to.
Figure 4.6.2-1 illustrates the chromatic adaptation transform suited to our experimental
conditions. Two of the inputs to the model are given: the XYZ tristimulus values of the mean
observer’s match on the monitor and the illuminant of the viewing cabinet; the model
output is also given: the XYZ of the paint sample in the viewing cabinet. Since it is not known what the
observers adapt to, the tristimulus values of the adapting stimulus are unknown. Finding these
values would hopefully lead to an understanding of the causes of the discrepancies in our data, and
possibly to the ability to model and predict them.
Thus, the problem we need to solve is:
Given tristimulus values XYZ of the mean match of a paint sample, and
tristimulus values XswYswZsw of the “source” cabinet illuminant, find such a set of
tristimulus values of the “destination adapting stimulus” XdwYdwZdw which, when fed
into an adaptation transform, yield tristimulus values XcYcZc so that
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = \begin{bmatrix} X_m \\ Y_m \\ Z_m \end{bmatrix} \tag{4.6.1}$$
where XmYmZm are the tristimulus values of the corresponding paint sample.
4.6.2.2. CIE Chromatic Adaptation Transform 2002 (CAT02)
We used the CAT02 transform, a part of the CIECAM02 colour appearance model. Following is the
description of this model’s mathematical procedure (CIE 2004).
First, the tristimulus values of the stimulus (XYZ), of the source illuminant (XswYswZsw) and of
the destination illuminant (XdwYdwZdw) are transformed to cone responses (R G B), (Rsw Gsw Bsw)
and (Rdw Gdw Bdw):
$$\begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} = M_{\mathrm{CAT02}} \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} \tag{4.6.2}$$
where Ri, Gi and Bi are the corresponding calculated cone responses, and MCAT02 is the
transformation matrix:
$$M_{\mathrm{CAT02}} = \begin{bmatrix} 0.7328 & 0.4296 & -0.1624 \\ -0.7036 & 1.6975 & 0.0061 \\ 0.0030 & 0.0136 & 0.9834 \end{bmatrix} \tag{4.6.3}$$
The degree of adaptation D is calculated as
$$D = F\left[1 - \left(\frac{1}{3.6}\right) e^{\left(\frac{-L_a - 42}{92}\right)}\right] \tag{4.6.4}$$
Here, F is set to 1.0, 0.9 and 0.8 for “average”, “dim” and “dark” surrounding conditions
respectively, and La is the luminance of the adapting field in cd/m². The “adapted”
cone responses Rc, Gc and Bc are then calculated:
$$\begin{aligned}
R_c &= \left[D\left(\frac{R_{dw}}{R_{sw}}\right) + 1 - D\right] R \\
G_c &= \left[D\left(\frac{G_{dw}}{G_{sw}}\right) + 1 - D\right] G \\
B_c &= \left[D\left(\frac{B_{dw}}{B_{sw}}\right) + 1 - D\right] B
\end{aligned} \tag{4.6.5}$$
And finally, calculation of tristimulus values of the stimulus under destination viewing
conditions using the inverted transformation matrix (4.6.3):
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = M_{\mathrm{CAT02}}^{-1} \begin{bmatrix} R_c \\ G_c \\ B_c \end{bmatrix} \tag{4.6.6}$$
In all our calculations, we set the value of degree of adaptation D in Eq. (4.6.5) to 1.
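The procedure of Eqs. (4.6.2)-(4.6.6) can be sketched in a few lines of Python. This is only an illustrative sketch and not the code used in this work: the function names (`mat_vec`, `inverse_3x3`, `degree_of_adaptation`, `cat02_forward`) are hypothetical, and the matrix entries are those of the CIE CAT02 specification.

```python
import math

# CAT02 matrix (Eq. 4.6.3), entries as in the CIE CIECAM02 specification.
M_CAT02 = [
    [0.7328, 0.4296, -0.1624],
    [-0.7036, 1.6975, 0.0061],
    [0.0030, 0.0136, 0.9834],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def degree_of_adaptation(F, L_a):
    """Eq. (4.6.4): F is the surround factor, L_a the adapting luminance (cd/m2)."""
    return F * (1.0 - (1.0 / 3.6) * math.exp((-L_a - 42.0) / 92.0))


def cat02_forward(xyz, xyz_sw, xyz_dw, D=1.0):
    """Eqs (4.6.2), (4.6.5) and (4.6.6): source-to-destination transform."""
    rgb = mat_vec(M_CAT02, xyz)
    rgb_sw = mat_vec(M_CAT02, xyz_sw)
    rgb_dw = mat_vec(M_CAT02, xyz_dw)
    # Per-channel scaling of Eq. (4.6.5).
    rgb_c = [(D * dw / sw + 1.0 - D) * v
             for v, sw, dw in zip(rgb, rgb_sw, rgb_dw)]
    return mat_vec(inverse_3x3(M_CAT02), rgb_c)
```

With D = 1, as used here, feeding the source white itself as the stimulus returns the destination white, which is a convenient sanity check of any implementation.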
4.6.2.3. Inverting CAT02
The usual task in the application of a chromatic adaptation transform is to predict the colour response
to a stimulus under destination conditions given the stimulus and the source conditions. In
our present case the task is the inverse one: we know the response to the stimulus in both
conditions from the measured XYZ tristimulus values of the paint sample and of the monitor,
and we know the viewing conditions in the cabinet from the tristimulus values of the cabinet
illuminant. What we do not know is what observers are adapted to when they adjust the colours
on the display. In our terminology, we know the destination illuminant (the cabinet) and the
stimuli under both conditions; we need to find the source adapting stimulus. This can be done
by inverting the CAT02.
First, transform the tristimulus values of the cabinet illuminant, the paint sample and the
monitor stimuli to cone responses using the transformation matrix (4.6.3):
$$\begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} = M_{\mathrm{CAT02}} \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} \tag{4.6.7}$$
Solving equation (4.6.5) for cone responses of source illuminant gives:
$$\begin{aligned}
R_{sw} &= R_{dw}\left(\frac{D}{\dfrac{R_d}{R_s} - 1 + D}\right) \\
G_{sw} &= G_{dw}\left(\frac{D}{\dfrac{G_d}{G_s} - 1 + D}\right) \\
B_{sw} &= B_{dw}\left(\frac{D}{\dfrac{B_d}{B_s} - 1 + D}\right)
\end{aligned} \tag{4.6.8}$$
But D = 1; hence Eq. (4.6.8) simplifies:
$$\begin{aligned}
R_{sw} &= R_{dw}\,\frac{R_s}{R_d} \\
G_{sw} &= G_{dw}\,\frac{G_s}{G_d} \\
B_{sw} &= B_{dw}\,\frac{B_s}{B_d}
\end{aligned} \tag{4.6.9}$$
Finally, the XYZ tristimulus values of the adapting stimulus are computed using the inverted
matrix (4.6.3):
$$\begin{bmatrix} X_{sw} \\ Y_{sw} \\ Z_{sw} \end{bmatrix} = M_{\mathrm{CAT02}}^{-1} \begin{bmatrix} R_{sw} \\ G_{sw} \\ B_{sw} \end{bmatrix} \tag{4.6.10}$$
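The inversion of Eqs. (4.6.7), (4.6.9) and (4.6.10) can be sketched as follows for the case D = 1. This is a minimal sketch under stated assumptions rather than the code used in this work: the function name `find_source_white` and its argument names are hypothetical, `xyz_s` and `xyz_d` stand for the stimulus under source and destination conditions, and the matrix entries are those of the CIE CAT02 specification.

```python
# CAT02 matrix (Eq. 4.6.3), entries as in the CIE CIECAM02 specification.
M_CAT02 = [
    [0.7328, 0.4296, -0.1624],
    [-0.7036, 1.6975, 0.0061],
    [0.0030, 0.0136, 0.9834],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def find_source_white(xyz_s, xyz_d, xyz_dw):
    """Recover the source adapting stimulus for D = 1.

    xyz_s  : stimulus under source conditions
    xyz_d  : matching stimulus under destination conditions
    xyz_dw : destination white (adapting stimulus)
    """
    rgb_s = mat_vec(M_CAT02, xyz_s)    # Eq. (4.6.7)
    rgb_d = mat_vec(M_CAT02, xyz_d)
    rgb_dw = mat_vec(M_CAT02, xyz_dw)
    # Eq. (4.6.9): per-channel ratio, valid only when D = 1.
    rgb_sw = [dw * s / d for s, d, dw in zip(rgb_s, rgb_d, rgb_dw)]
    return mat_vec(inverse_3x3(M_CAT02), rgb_sw)   # Eq. (4.6.10)
```

Because Eq. (4.6.9) divides by the destination cone responses, the sketch assumes that none of them is zero; very dark or highly saturated stimuli may therefore give unstable results.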
4.6.2.4. Finding the adapting stimulus: the results
The inverted CAT02 model as described by Eqs. (4.6.7)-(4.6.10) was applied to the mean data of
all observers and both displays. The resulting tristimulus values of the adapting stimuli are
given in Table 4.6.2-1, and their chromaticities are illustrated graphically in Figure 4.6.2-2.
Adapting stimulus, mean of both displays

Test colour        X        Y        Z
White            94.33   100.00   108.71
Grey             93.87   100.00   107.00
Yellow           93.39   100.00   105.10
Brown            94.22   100.00    93.95
Magenta          96.36   100.00   109.17
Purple           94.94   100.00   110.17
Blue             94.83   100.00   115.96
Cyan             92.91   100.00   111.86
Dark green       93.50   100.00   102.33
Bright green     93.07   100.00   102.26
Background       93.61   100.00   105.82
Mean             94.09   100.00   106.58
Table 4.6.2-1. The CIE 1964 XYZ tristimulus values of the adapting stimulus computed by inverted
CAT02 model, normalised to value of 100 in Y.
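The chromaticities plotted in the following figure follow from the tabulated tristimulus values by the usual projection x = X/(X+Y+Z), y = Y/(X+Y+Z). The sketch below illustrates this; the dictionary and function names are hypothetical and chosen only for illustration, while the numbers are copied from Table 4.6.2-1.

```python
# Tristimulus values of the adapting stimuli, copied from Table 4.6.2-1.
ADAPTING_STIMULI = {
    "White": (94.33, 100.00, 108.71),
    "Grey": (93.87, 100.00, 107.00),
    "Yellow": (93.39, 100.00, 105.10),
    "Brown": (94.22, 100.00, 93.95),
    "Magenta": (96.36, 100.00, 109.17),
    "Purple": (94.94, 100.00, 110.17),
    "Blue": (94.83, 100.00, 115.96),
    "Cyan": (92.91, 100.00, 111.86),
    "Dark green": (93.50, 100.00, 102.33),
    "Bright green": (93.07, 100.00, 102.26),
    "Background": (93.61, 100.00, 105.82),
}
TABULATED_MEAN = (94.09, 100.00, 106.58)


def xy_chromaticity(X, Y, Z):
    """Project tristimulus values onto the xy chromaticity diagram."""
    total = X + Y + Z
    return X / total, Y / total


def mean_xyz(rows):
    """Component-wise mean of a collection of XYZ triplets."""
    rows = list(rows)
    return tuple(sum(r[i] for r in rows) / len(rows) for i in range(3))


if __name__ == "__main__":
    for name, xyz in ADAPTING_STIMULI.items():
        x, y = xy_chromaticity(*xyz)
        print(f"{name:13s} x = {x:.4f}  y = {y:.4f}")
```

Averaging all eleven rows, including the background, reproduces the tabulated mean to within rounding.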
[Figure: CIE 1964 xy chromaticity plot, x from 0.300 to 0.335 and y from 0.320 to 0.345. Labelled points: mean display
background, booth illuminant, EE, mean adaptation stimulus and D65; the locus is marked at 5500K, 6000K and 6500K.]
Figure 4.6.2-2. CIE 1964 xy chromaticities of the adapting stimuli for each of the ten test colours.
Test colours: coloured triangles. The colour of the triangle corresponds to the represented test colour. Data
points are connected by the grey line for clarity.
Also on the plot: Planckian locus (red line), cabinet illuminant (brown dot), display background averaged
for all observers and both monitors (purple dot), D65 (magenta) and Equal Energy (green) illuminants.
The chromaticities of the adapting stimuli form a polygon elongated in the direction parallel to
the Planckian black body radiator locus. The correlated colour temperatures (CCT) span the
range from approximately 5500K for brown to 6700K for blue. The mean adapting stimulus has
the CCT of approximately 6300K; the nearest to the mean is the adapting stimulus for grey test
colour.
Figure 4.6.2-2 readily allows rejecting the option that the observers adapt to the background on
the display: the corresponding chromaticity is relatively far from the mean adapting stimulus.
The option that the observers adapt to some single “internal” white point also seems unlikely,
as the adapting stimuli span a range of more than 1200K on the CCT scale. The option that
observers adapt to the stimuli themselves, however, is supported by this plot: the adapting
stimuli are shifted relative to the mean in directions that correspond to the colour of the
stimuli; i.e. the adapting stimulus for the blue test colour is shifted towards the blue end of the diagram, etc.
Therefore at this stage we conclude that our working assumption holds.
4.6.2.5. Modelling the relationship between the test colour and the adapting stimuli
As the test stimuli and the corresponding adapting stimuli seem to be related, the next step is to
attempt to model this relationship mathematically. We have found that the most suitable colour
space for this purpose is the MacLeod-Boynton (MacLeod and Boynton 1979) chromaticity
space (Section 2.1.6.3).
The tristimulus values of the paint samples (XYZ) and of the adapting stimuli (XasYasZas) are
transformed to the cone excitation coordinates using the CAT02 transform:
$$\begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} = M_{\mathrm{CAT02}} \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} \tag{4.6.11}$$
In colorimetry, the calculation of the object colour XYZ tristimulus values includes a normalisation to
the “perfect reflecting surface”, which implies that an object cannot reflect more light
than illuminates it; the illuminant XYZ tristimulus values are normalised so that Y = 100. The
adapting stimuli can be thought of as “virtual” illuminants to which the observers adapt, hence
they need to be normalised to a constant “maximum” luminance level. In cone excitation space,
luminance is the sum of R and G signals, so the normalisation is carried out by the following
equations:
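$$\begin{aligned}
R_{as,n} &= R_{as}\,\frac{k}{R_{as} + G_{as}} \\
G_{as,n} &= G_{as}\,\frac{k}{R_{as} + G_{as}} \\
B_{as,n} &= B_{as}\,\frac{k}{R_{as} + G_{as}}
\end{aligned} \tag{4.6.12}$$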
The value of constant k is arbitrary. However, it is convenient to set k = 1, as in this case the
cone excitation values after normalisation are identical to MacLeod-Boynton chromaticities.
MacLeod-Boynton chromaticities of paint samples are computed:
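$$\begin{aligned}
\left[R/(R+G)\right] &= \frac{R}{R+G} \\
\left[G/(R+G)\right] &= \frac{G}{R+G} \\
\left[B/(R+G)\right] &= \frac{B}{R+G}
\end{aligned} \tag{4.6.13}$$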
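The transformation of Eq. (4.6.11), the normalisation of Eq. (4.6.12) and the chromaticities of Eq. (4.6.13) can be sketched as follows. The function names are hypothetical and serve illustration only; the cone signals are the CAT02 responses used throughout this section, with matrix entries taken from the CIE CAT02 specification.

```python
# CAT02 matrix, entries as in the CIE CIECAM02 specification.
M_CAT02 = [
    [0.7328, 0.4296, -0.1624],
    [-0.7036, 1.6975, 0.0061],
    [0.0030, 0.0136, 0.9834],
]


def cone_signals(xyz):
    """Eq. (4.6.11): XYZ tristimulus values to R, G, B cone signals."""
    return [sum(M_CAT02[i][j] * xyz[j] for j in range(3)) for i in range(3)]


def normalise_to_luminance(rgb, k=1.0):
    """Eq. (4.6.12): scale cone signals so that R + G equals k."""
    R, G, B = rgb
    scale = k / (R + G)
    return [R * scale, G * scale, B * scale]


def macleod_boynton(rgb):
    """Eq. (4.6.13): R/(R+G), G/(R+G) and B/(R+G)."""
    R, G, B = rgb
    lum = R + G
    return [R / lum, G / lum, B / lum]


if __name__ == "__main__":
    # Mean adapting stimulus from Table 4.6.2-1.
    rgb = cone_signals([94.09, 100.00, 106.58])
    print(normalise_to_luminance(rgb, k=1.0))
    print(macleod_boynton(rgb))
```

With k = 1 the two functions return the same triplet, which is the identity stated in the text above.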
The relationship between the MacLeod-Boynton chromaticities of the paint stimuli and of the
adapting stimuli describes the change in sensitivity of corresponding chromatic channel, caused
by adaptation to the stimulus having XYZ tristimulus values of the paint sample displayed on the
monitor. This relation is plotted in Figure 4.6.2-3, along with the best fitting
curves and the 95% confidence intervals of the fit. The relationship between R/(R+G) and
G/(R+G) pairs of chromaticities is linear:
$$R_{as}/(R_{as}+G_{as}) = 0.0345\,\left(R/(R+G)\right) + 0.4594 \tag{4.6.14}$$
$$G_{as}/(R_{as}+G_{as}) = 0.0345\,\left(G/(R+G)\right) + 0.5061 \tag{4.6.15}$$
The relationship between B/(R+G) chromaticities is best described by a power function:
$$B_{as}/(R_{as}+G_{as}) = -0.1144\,\left(B/(R+G)\right)^{-0.3234} + 0.6804 \tag{4.6.16}$$
The R-square value for R/(R+G) and G/(R+G) model fits is 0.92, and 0.97 for B/(R+G).
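The three fitted models can be written as small functions, which makes it easy to inspect their behaviour over the range of the data. The function names below are hypothetical. The intercept of the R/(R+G) model is the value given in Eq. (4.6.22); with it, the predicted R and G chromaticities of the adapting stimulus sum to unity, as MacLeod-Boynton chromaticities must.

```python
def adapting_r(r):
    """Linear model for R/(R+G) of the adapting stimulus.

    Intercept as given in Eq. (4.6.22).
    """
    return 0.0345 * r + 0.4594


def adapting_g(g):
    """Linear model for G/(R+G) of the adapting stimulus, Eq. (4.6.15)."""
    return 0.0345 * g + 0.5061


def adapting_b(b):
    """Power-function model for B/(R+G) of the adapting stimulus, Eq. (4.6.16).

    Requires b > 0 because of the negative exponent.
    """
    return -0.1144 * (b ** -0.3234) + 0.6804


if __name__ == "__main__":
    # Evaluate the models at a few paint-sample chromaticities.
    for r in (0.40, 0.50, 0.60):
        print(f"r = {r:.2f}: R_as = {adapting_r(r):.4f}, "
              f"G_as = {adapting_g(1.0 - r):.4f}")
    for b in (0.2, 1.0, 1.8):
        print(f"b = {b:.1f}: B_as = {adapting_b(b):.4f}")
```

Over the plotted range of B/(R+G), roughly 0.2 to 1.8, the power function rises monotonically, whereas the two linear models change very little over their ranges.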
[Figure: three panels, each plotting the chromaticity of the adapting stimulus (vertical axis) against
the chromaticity of the paint sample (horizontal axis).]
Figure 4.6.2-3. MacLeod-Boynton chromaticities of the adapting stimuli plotted against those
of the paint samples.
A) R/(R+G); B) G/(R+G); C) B/(R+G). Data points are the chromaticities of the stimuli, lines illustrate
the fitting models. Dashed lines mark the 95% confidence intervals of the fit. The vertical scale is
adjusted to be identical in all three plots.
Red circle marks the B/(R+G) chromaticity of yellow which was excluded from the fit.
The value corresponding to the yellow test colour was excluded from the model of B/(R+G)
channel signals. This was done because the “yellow” data point did not follow the common
behaviour of the rest of the data points, presumably due to the extremely low ratio of blue cone
signal values to luminance causing erratic results.
The interpretation of the plots in Figure 4.6.2-3 is as follows. If the cone sensitivities are
constant in both viewing conditions, then the adapting stimuli are identical for all colours, and
horizontal lines result. Any line other than horizontal means that there is a change in adaptation
state between monitor and cabinet viewing conditions. If a chromatic channel mechanism
adapts to light of a certain wavelength range, the cone sensitivity decreases and observers
adjust the monitors to emit more light in that range; this results in the positive slope on the
graphs in Figure 4.6.2-3. This indicates that the adaptation to colorimetrically
identical (metameric) stimuli is different, and the law of persistence of colour matches fails.
There is a change in adaptation state of all three chromatic channels. The R/(R+G) and G/(R+G)
are in fact opponents of the same channel; hence the adaptation is identical in both. It is linear,
minor but consistent. The magnitude of sensitivity change as the result of it does not exceed
0.7% and 0.8%. These values are similar to sensitivity thresholds for perception of colour
differences in spatially separated stimuli (Danilova and Mollon 2006). Considering larger
sample separation than used in (Danilova and Mollon 2006), and significantly more relaxed
experimental conditions, the practical effect of adaptation in R and G cone channels is perhaps
negligible.
The adaptation of B/(R+G) channel is strikingly different. Its magnitude is about ten times
higher, and it is highly non-linear. In order to illustrate the difference in adaptation of different
chromatic channels, the plots of Figure 4.6.2-3 are combined in one diagram in Figure 4.6.2-4.
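[Figure: MacLeod-Boynton chromaticity of the adapting stimulus (vertical axis) plotted against
MacLeod-Boynton chromaticity of the paint sample (horizontal axis, 0 to 2).]
Figure 4.6.2-4. MacLeod-Boynton chromaticities of the adapting stimuli plotted against those
of the paint samples; all dimensions are combined in the same diagram.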
Squares: R/(R+G); Triangles: G/(R+G); Circles: B/(R+G). The confidence intervals are omitted for the
sake of plot clarity.
The change in sensitivity of the B/(R+G) channel between two extremes in our data – the adaptation
to blue and to brown – is approximately 20%; approximately 10 times the threshold reported by
(Danilova and Mollon 2006). When observers moved their gaze from the paint sample to the
monitor, their blue-yellow chromatic channel mechanism underwent a rapid adaptation resulting
in loss of sensitivity to blue light (but not to yellow). In response, observers had to use more
blue light to match the paint sample than the CIE Standard Colorimetric Observer predicts. This
rapid adaptation seems to be the reason for the discrepancies that we find in our experimental
data between the CIE 1964 coordinates of the paint samples and mean matches made by
observers.
4.6.3. Statistical significance of the discrepancies
In colour reproduction, the discrepancies between the CIE observer predictions and actual
observations are of interest only if they have perceptual meaning; i.e. if they lead to perceptible
colour differences. The question of significance of our discrepancies can only be answered for
the mean observer of the group. The reason for this has already been discussed above: to
evaluate the discrepancies at the level of individuals we need to know individual CMF,
knowledge that we do not have.
The statistical significance was evaluated by applying Nel and Van der Merwe’s multivariate
test for equality of mean vectors (Section 2.4.6.2). The mean vectors were the mean match of all
observers, and the CIE coordinate of the paint sample. Each of these means was assigned a
covariance matrix identical to one computed for mean results of all observers. The results are
listed in Table 4.6.3-1.
Test colour      Discrepancy is significant
White            √
Grey             √
Yellow           –
Brown            –
Magenta          √
Purple           √
Blue             √
Cyan             √
Dark green       –
Bright green     –
Background       √
Table 4.6.3-1. Results of the statistical test for significance of discrepancies between the CIE observer
prediction and experimental mean matches.
“√” signifies that the difference is significant, i.e. perceptible. “–” signifies that the difference is
statistically insignificant.
The results are rather expected: the discrepancies are significant in all the test colours where
there is significant participation of blue cones relative to luminance. We note that these
results, however, do not provide any information on the acceptability of the mismatches: they
can be perceptible but small enough to be acceptable.
4.6.4. Adaptation, colour matching and additivity failures
4.6.4.1. Similarity with previous studies
A number of figures which were already shown in the literature review (Section 2.5) are
reproduced here (Figure 4.6.4-1). All of these figures are from publications dealing with failures
of additivity in colour matching, which report variants of the same test. Colour matching
functions of an observer are measured. The same observer makes visual quasi-symmetric match
of a broadband stimulus using narrow-band primaries, thus producing a set of tristimulus values.
The tristimulus values of the same test colour are calculated using observer’s own CMF, and the
two sets are compared. In a different variant of this test, the same observer measures her CMF
using maximum-saturation and Maxwell methods. Again, tristimulus values of a broadband
stimulus (white light in Maxwell matching) are compared with those measured with narrow-
band lights (maximum-saturation).
In Figure 2.5.1-1 (A), plots show the results of Ishak’s experiment (reviewed in (Trezona
1954)). The M (visually measured) chromaticities are shifted towards the blue-red corner of the
diagram relatively to C (calculated) ones. In Figure 2.5.1-1 (B), S&B’s observers make matches
which are shifted either towards or against blue end of the chromaticity diagram, depending on
the spectral position of the blue primary (Stiles 1963). In Figure 2.5.1-1 (C), Crawford’s
observers produce Maxwell CMF which have significantly higher blue tristimulus values than
the maximum-saturation ones (Crawford 1965). In Figure 2.5.1-1 (D), Lozano and Palmer’s
observers make significantly “bluer” matches than ones predicted by their own CMF (Lozano
and Palmer 1967), with especially significant deviations in nearly-white matches; their plot
looks almost indistinguishable from the plot of our matches in Figure 4.6.1-1 (A).
[Figure: four panels, A) to D).]
Figure 4.6.4-1. Results of tests of additivity by (Ishak 1951; Stiles 1963; Crawford 1965; Lozano and
Palmer 1967).
A) (Ishak 1951). Results of four observers plotted in WDW chromaticity diagram. M stands for
“Measured” and C stands for “Calculated”. From (Trezona 1954).
B) (Stiles 1963). Results for two alternative blue primaries; 445 nm: differences – circles, mean – square;
470 nm: differences – ×, mean – triangle. From (Stiles 1963).
C) Chromaticity loci measured by four observers by maximum saturation (dashed lines) and Maxwell
matching (solid lines). From (Crawford 1965).
D) Comparison of “calculated” (crosses) and visually measured (black dots) results of 4 observers by
(Lozano and Palmer 1967). From (Lozano and Palmer 1967).
Outcomes of all these studies can be summarised as follows:
When colour matching functions measured with narrow-band lights are applied
to the prediction of metameric matches between narrow-band and broadband
stimuli, the calculated “blue” tristimulus value of the broadband stimulus is smaller
than the one set by visual colour matching using narrow-band stimuli.
Two parallels could be drawn between these previous reports and our study:
1. The light emitted by the LCD and CRT displays is in part narrowband due to the red
primary in CRT, and red and green primaries in LCD. Therefore the conditions of
matching broadband surface colour by display primaries are similar to conditions of
matching broadband filtered light by narrowband lights.
2. The discrepancies observed in our experiment are similar to discrepancies previously
reported for these conditions. The plot in Figure 4.6.1-1 (A) strongly resembles the plot in
Figure 2.5.1-1: colour matches made by observers are bluer than the ones predicted by the
CMF of the Standard Colorimetric Observer. In Lozano and Palmer’s, as well as in our
experiment, the strongest “blue” shift occurs in white.
Since the results from LCD and CRT displays are similar, perhaps the narrow-band nature of the
red primary is the most significant.
Under the assumption that the CIE 1964 Standard Colorimetric Observer represents correctly
the mean observer of our group of subjects, we suggest that our discrepancies result from the
same adaptation phenomenon which led to discrepancies in (Trezona 1953; Trezona 1954; Stiles
1963; Crawford 1965; Lozano and Palmer 1967; Zaidi 1986).
4.6.4.2. Relationship between adaptation and additivity failures
The link between the adaptation and additivity failures is discussed in all the papers cited above.
However, to our knowledge, the nature of the relationship between the two was never stated
explicitly. Here we attempt to do this.
In a cone-quantal colour match, the cone excitations caused by the members of a metameric pair are
identical. Hence, if members of the pair cause adaptation, this adaptation is identical for both
stimuli; consequently, the cone sensitivity adjusts identically to both stimuli, and the match is
not upset. This principle is known as “the law of persistence of colour match” (Wyszecki 1973):
even though the absolute cone signal values are altered by adaptation, relative to each other
they remain the same.
If, for some reason, the adaptation caused by the members of metameric pair is not identical,
then the cone signals from both stimuli are altered differently and the match does not hold any
more. Two stimuli which matched each other before the adaptation do not match after the
adaptation, and the law of persistence of colour matches fails.
Moreover, if the rate of adaptation is different for different types of photoreceptors, then the
balance between the cone signals is upset, and the mismatch is accentuated by changing the
qualitative character (hue/chroma) of the colour. For example, if adaptation for both stimuli is
different, but it is similar for all three types of cones, the difference between the colours which
used to match before the adaptation will be mostly in lightness. However, if blue cones adapt
faster than the red and the green ones, then the mismatch acquires chromatic dimension (less
blue/more yellow), hence the perceptual significance of the mismatch is enhanced.
At the level of principle, adaptation causes additivity failure by definition: the same number of
photons results in different cone signals at different adaptation states. If the adaptation is a slow
enough process that it can be controlled by the experimenter, then the working assumption is of
fixed adaptation and of validity of additivity within limited periods of time. If the adaptation is
so rapid that it causes an almost instantaneous change in sensitivity of photoreceptors, then the same
number of photons absorbed by the receptor within almost adjacent periods of time causes
different cone signals. The cone signal is not proportional to the number of absorbed photons at any
given time, and the additivity fails.
Colour matching functions are measured in conditions where a mixture of two narrow-band
lights is matched by mixture of other two narrow-band lights. The CMF are applied, however, to
calculate the tristimulus values of broadband as well as narrow-band stimuli. If the visual
system adapts differently to both types of stimuli, then the cone sensitivities of the same
observer in two conditions are different. The CMF measured for a certain observer with narrow-
band lights is not representative of the same observer viewing broad-band lights. We suppose
that this is the mechanism behind the discrepancies in our and other studies discussed above.
Our results indicate that the adaptation in question is neural rather than quantal: there is a
change in sensitivity of a chromatic channel rather than of a single type of cones. Viénot (Viénot
1987) suggested that large field matches are governed more by adjustment of cone signal ratios
than by the cone signals themselves. Crawford reported that the discrepancies between the
Maxwell and maximum saturation CMF are larger for a large field than for a small one (Crawford
1965). We suggest that Crawford’s difference between the results for different field sizes is due
to the shift of colour matches from cone-quantal to neural, causing the adaptation of chromatic
channels to play a more significant role.
4.6.5. Additivity failures and display colorimetry: modifying the CAT02
In industrial conditions, virtually all cross-media colour matches are neural, and the effect of
the kind we report should be expected. All the reports of additivity failure that we found in the
literature are concerned with conditions of quasi-symmetric colour matching with bipartite field.
The question of practical relevance of these failures is a long-standing issue. To the best of our
knowledge, the present report is the first to describe, characterise and quantify the failure of
additivity in conditions relevant to practical colorimetry. However, the effect we have found is
long known to the practitioners in the field, and significant anecdotal evidence exists. Hunt in
“The Reproduction of Colour” (Hunt 2004)(p. 390) states that displays adjusted to have a white
point of CCT below 3000K look “intolerably yellow”, although surfaces with colorimetrically
similar white do not look so. Rich notes (Rich 2006) “…the monitor could be made to provide
good visual simulations of the prints … but the tristimulus values … were very far from being
equal”. A similar problem is noted by Brill (Brill and Derefeldt 1991; Brill 2006). Fairchild
describes a demonstration in which a piece of white paper illuminated by incandescent lamp is
compared with colorimetrically identical white on the computer display (Fairchild 1992); he
notes that “The CRT display will appear relatively high-chroma yellow”. In colour management
it is commonly advised to calibrate the displays to a higher temperature white point than the
illuminant in the viewing cabinet to achieve better visual match. Thus, although the standard
white point CCT in graphic arts is 5000K, the informal recommendations in ICC web forums
often advise to calibrate the displays to 6500K. Higher white point results in “bluer” display
affecting mostly neutral colours – this corresponds to adjustments made by our observers.
The problem seems to be confirmed by laboratory studies as well as by practical experience: a
colorimetric match between the computer display and a surface colour does not necessarily
result in a perceptual match, given the same viewing conditions. The next logical step would be to
find a solution. It is tempting to suggest that the CMF should be measured in conditions as close
as possible to conditions of their application. Thus, to predict matches of broadband colours,
CMF measured by Maxwell colour matching might seem more appropriate. However, it is clear
that it is impossible to predict a priori all possible applications of CMF, and properties of
primaries of future display technologies are unknown. It is impractical to measure a new set of
CMF for every new technology that develops. Therefore, from methodology point of view, it
seems that it is best to leave the CMF to be what they are by definition: description of colour
matching properties of visual system, a linear combination of cone sensitivities invariant with
adaptation. If the discrepancies we find indeed result from adaptation, they should be corrected
for – presumably by using the framework of chromatic adaptation transforms.
However, CATs in their present form, such as CAT02 described by Eqs. (4.6.2)-(4.6.6), are
based on the von Kries assumption that the change in chromatic sensitivity is caused by
independent changes in sensitivity of cone mechanisms, triggered exclusively by photons
absorbed by photoreceptors corresponding to adapting mechanism, and is unaffected by photons
absorbed by other receptor types. Zaidi already suggested (Zaidi 1986) that the sensitivity of
blue cone mechanism can change as the result of excitation of green and red cones. This
suggestion is confirmed by Figure 4.6.2-4. In this figure, the axes represent MacLeod-Boynton
chromaticity coordinates: ratios of cone signals to luminance signals (Eqs. (4.6.13)). In other
words, if Figure 4.6.2-4 is representative of the actual operation of the colour vision, the visual
system adjusts the sensitivity of blue-yellow chromatic channel based on the “knowledge” of
signals from all three types of cones. In order to be able to account for the discrepancies caused
by adaptation of this kind, the existing CAT framework needs to be principally modified. Below
we propose a scheme of such a modification based on CAT02. We do not attempt to develop a
working chromatic adaptation transform: it is out of our scope, our experiment was not designed
for such a purpose, and the data is insufficient. Rather, we propose a framework, which can be
improved by incorporating new experimental data.
4.6.5.1. Modified CAT02
The inputs to the model are the tristimulus values of the surface colour stimulus (X Y Z), and of
the illuminant under which this stimulus is viewed (XwYwZw).
Stage 1: The tristimulus values are transformed to cone responses (R G B) and (Rw Gw Bw):
$$\begin{bmatrix} R_i \\ G_i \\ B_i \end{bmatrix} = M_{\mathrm{CAT02}} \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} \tag{4.6.17}$$
where Ri, Gi and Bi are the corresponding calculated cone responses, and MCAT02 is the CAT02
transformation matrix:
$$M_{\mathrm{CAT02}} = \begin{bmatrix} 0.7328 & 0.4296 & -0.1624 \\ -0.7036 & 1.6975 & 0.0061 \\ 0.0030 & 0.0136 & 0.9834 \end{bmatrix} \tag{4.6.18}$$
Stage 2: MacLeod-Boynton chromaticities of the stimulus are computed:
$$\left[R/(R+G)\right] = \frac{R}{R+G} \tag{4.6.19}$$
$$\left[G/(R+G)\right] = \frac{G}{R+G} \tag{4.6.20}$$
$$\left[B/(R+G)\right] = \frac{B}{R+G} \tag{4.6.21}$$
Stage 3: MacLeod-Boynton chromaticities of adapting stimulus are computed from
chromaticities of the stimulus, using models (4.6.14) – (4.6.16):
$$\left[R/(R+G)\right]_{as} = 0.0345\,\left(R/(R+G)\right) + 0.4594 \tag{4.6.22}$$
$$\left[G/(R+G)\right]_{as} = 0.0345\,\left(G/(R+G)\right) + 0.5061 \tag{4.6.23}$$
$$\left[B/(R+G)\right]_{as} = -0.1144\,\left(B/(R+G)\right)^{-0.3234} + 0.6804 \tag{4.6.24}$$
Since in the development of the model we normalised the adapting stimuli to (R + G) = 1 (Eq.
(4.6.12)), the chromaticity values computed by Eqs. (4.6.22)-(4.6.24) are in fact identical to the cone
signals themselves normalised to unit luminance signal (R+G):
$$R_{as} = \left[R/(R+G)\right]_{as} \tag{4.6.25}$$
$$G_{as} = \left[G/(R+G)\right]_{as} \tag{4.6.26}$$
$$B_{as} = \left[B/(R+G)\right]_{as} \tag{4.6.27}$$
Stage 4: Likewise, cone signal values corresponding to illuminant are normalised to lie in the
plane of unit luminance, where the luminance is a sum of red and green cone signals:
$$\left[R/(R+G)\right]_{w,k} = R_w\,\frac{k}{R_w + G_w} \tag{4.6.28}$$
$$\left[G/(R+G)\right]_{w,k} = G_w\,\frac{k}{R_w + G_w} \tag{4.6.29}$$
$$\left[B/(R+G)\right]_{w,k} = B_w\,\frac{k}{R_w + G_w} \tag{4.6.30}$$
Again, as we set k = 1, Eqs. (4.6.28)-(4.6.30) result in a set of MacLeod-Boynton chromaticities
of illuminant identical to cone signals normalised to unit luminance:
$$R_{w,k} = \left[R/(R+G)\right]_{w,k} \tag{4.6.31}$$
$$G_{w,k} = \left[G/(R+G)\right]_{w,k} \tag{4.6.32}$$
$$B_{w,k} = \left[B/(R+G)\right]_{w,k} \tag{4.6.33}$$
The cone signals of the display stimulus are computed using the standard CAT transformation,
and assuming degree of adaptation D = 1:
$$R_c = R\,\frac{R_{as}}{R_{w,k}} \tag{4.6.34}$$
$$G_c = G\,\frac{G_{as}}{G_{w,k}} \tag{4.6.35}$$
$$B_c = B\,\frac{B_{as}}{B_{w,k}} \tag{4.6.36}$$
Finally, monitor XYZ tristimulus values are computed using the inverted MCAT02 matrix:
$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = M_{\mathrm{CAT02}}^{-1} \begin{bmatrix} R_c \\ G_c \\ B_c \end{bmatrix} \tag{4.6.37}$$
The tristimulus values (Xc Yc Zc) specify the stimulus that has to be displayed on a computer
display having typical CRT or LCD primaries, so that it visually matches the surface colour stimulus
having tristimulus values (X Y Z).
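The four stages of the proposed framework, Eqs. (4.6.17)-(4.6.37), can be strung together as in the following sketch. It is meant only to make the data flow explicit, not to serve as a working adaptation transform: the function names are hypothetical, k = 1 and D = 1 are assumed as in the text, the matrix entries are those of the CIE CAT02 specification, and the model coefficients are the preliminary ones of Eqs. (4.6.22)-(4.6.24).

```python
# CAT02 matrix (Eq. 4.6.18), entries as in the CIE CIECAM02 specification.
M_CAT02 = [
    [0.7328, 0.4296, -0.1624],
    [-0.7036, 1.6975, 0.0061],
    [0.0030, 0.0136, 0.9834],
]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def modified_cat02(xyz, xyz_w):
    """Surface colour XYZ and its illuminant XYZ -> display XYZ (k = 1, D = 1)."""
    # Stage 1: cone responses of stimulus and illuminant, Eq. (4.6.17).
    R, G, B = mat_vec(M_CAT02, xyz)
    Rw, Gw, Bw = mat_vec(M_CAT02, xyz_w)

    # Stage 2: MacLeod-Boynton chromaticities of the stimulus, Eqs (4.6.19)-(4.6.21).
    r, g, b = R / (R + G), G / (R + G), B / (R + G)

    # Stage 3: "virtual" adapting stimulus, Eqs (4.6.22)-(4.6.27).
    R_as = 0.0345 * r + 0.4594
    G_as = 0.0345 * g + 0.5061
    B_as = -0.1144 * (b ** -0.3234) + 0.6804

    # Stage 4: illuminant normalised to unit luminance, Eqs (4.6.28)-(4.6.33).
    Rw_k, Gw_k, Bw_k = Rw / (Rw + Gw), Gw / (Rw + Gw), Bw / (Rw + Gw)

    # Cone signals of the display stimulus, Eqs (4.6.34)-(4.6.36).
    rgb_c = [R * R_as / Rw_k, G * G_as / Gw_k, B * B_as / Bw_k]

    # Back to tristimulus values, Eq. (4.6.37).
    return mat_vec(inverse_3x3(M_CAT02), rgb_c)
```

Note that the power function in Stage 3 requires a positive B/(R+G), so the sketch is not defined for stimuli whose CAT02 B signal is zero or negative.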
The principal feature of the proposed framework is that the adaptation of postreceptoral
chromatic channels is modelled, rather than the adaptation of individual cone mechanisms. The
“adapting stimulus” in this framework is a “virtual” one; it can be thought of as “the stimulus
that the observer needs to adapt to in order to perceive the colour on the monitor to be identical
to the surface colour”. As evident from Eqs. (4.6.22)-(4.6.24), this virtual stimulus is a
function of the surface stimulus only. The exact functions in Eqs. (4.6.22)-(4.6.24) are
preliminary and can be modified and updated when new knowledge or data become available; thus
they can be re-written in a general form:
$$\left[R/(R+G)\right]_{as,k} = f\left(R/(R+G)\right) \tag{4.6.38}$$
$$\left[G/(R+G)\right]_{as,k} = f\left(G/(R+G)\right) \tag{4.6.39}$$
$$\left[B/(R+G)\right]_{as,k} = f\left(B/(R+G)\right) \tag{4.6.40}$$
Also, according to our results, the contribution of R/(R+G) and G/(R+G) channels to the
adaptation is rather negligible, and the entire shift in sensitivity can be modelled by adaptation
of blue-yellow mechanism alone. If future studies confirm this, then all the equations modelling
red-green channel adaptation can be eliminated from the model, so only the blue-yellow
adaptation is modified as the function of all three channels signals:
$$R_c = R \tag{4.6.41}$$
$$G_c = G \tag{4.6.42}$$
$$B_c = f\left(B/(R+G),\; B_w/(R_w+G_w)\right) \tag{4.6.43}$$
4.7. Summary and conclusions
During the past century, failures of colorimetric additivity and observer metamerism have been
studied extensively, and significant amount of theoretical knowledge has been accumulated.
However, application of this knowledge in practical colorimetry and imaging has been, and still
is, very limited. The main reason for this seems to be the lack of understanding of the
implications of both effects in practical tasks of colorimetry: specification of colour and
prediction of metameric matches in wide range of viewing and environmental conditions, for
stimuli having wide range of properties. Despite the fact that the conditions of colorimetry
application almost never resemble ones in laboratory-based colour matching experiment, the
working assumption is of identity of conditions. The main task of the cross-media colour
matching experiment was to bridge the gap between the theory and practice for at least one
industrial application: soft-proofing. The results were most surprising.
4.7.1. Variability of colour matches and observer metamerism
The existence of variations in colour matching functions (CMF) between colour-normal observers,
and their consequence, observer metamerism, are long established and well studied. As any colour
match between a computer monitor and a reflective object is metameric, it seemed reasonable to
assume that the inter-observer variations in cross-media colour matches are governed by
observer metamerism and can be modelled from knowledge of inter-observer variations in
CMF. When the CIE Standard Deviate Observer (SDO), which was based on the variability within the
S&B colour matching data set (Stiles and Burch 1959), failed to achieve this, the failure was
explained by imperfections of the SDO and by improper mathematical treatment of the original
colour matching data by the CIE (Alfvin and Fairchild 1997).
Our findings indicate that, with the exception of neutral colours, observer metamerism does not
explain the variability of cross-media colour matches. For all test colours except the neutrals,
the variations predicted from the S&B set of CMFs behave distinctly differently from, and are
much smaller than, the variation in our experimental data. An alternative method, which involved
mathematical modelling of observer metamerism, gave predictions almost indistinguishable
from those of the S&B set.
On the other hand, our data are modelled well by advanced colour difference formulae. These
formulae are based on studies of observers’ ability to discriminate colour differences. It
therefore seems that the visual mechanisms that produced the inter-observer variations in our
data are the same as those that govern colour discrimination.
Consequently, any attempt to model this variability by variations of CMF is bound to fail
and is, in fact, unnecessary: the only modelling required is optimisation of the CIEDE2000
parametric coefficients. Such optimisation was outside the scope of this study and is left for
future research.
Threshold observer sensitivity to colour differences in cross-media colour matching is
approximately equal to the MCDM of the matches. In our conditions, it was approximately 1
CIEDE2000 unit with optimised parametric coefficients [1 2 1].
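As a reminder of how such a figure is obtained, the MCDM (mean colour difference from the mean) is commonly computed as the average colour difference between each repeated match and the mean of all matches. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the sample coordinates are invented, and the Euclidean CIELAB difference is used as the default colour difference only to keep the example short, whereas the figures quoted above use CIEDE2000.

```python
import math
from typing import Callable, Sequence, Tuple

Lab = Tuple[float, float, float]


def delta_e_ab(lab1: Lab, lab2: Lab) -> float:
    """Euclidean CIELAB colour difference (stand-in for CIEDE2000)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


def mcdm(matches: Sequence[Lab],
         diff: Callable[[Lab, Lab], float] = delta_e_ab) -> float:
    """Mean colour difference of the individual matches from their mean."""
    n = len(matches)
    # Component-wise mean of the matches (L*, a*, b*).
    mean = tuple(sum(m[i] for m in matches) / n for i in range(3))
    return sum(diff(m, mean) for m in matches) / n


# Invented example: three repeated matches of one test colour.
example_matches = [(50.0, 0.0, 0.0), (50.0, 2.0, 0.0), (50.0, -2.0, 0.0)]
example_mcdm = mcdm(example_matches)
```

Any other colour difference formula can be supplied through the `diff` argument without changing the rest of the calculation.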
4.7.2. Additivity failures
As with observer metamerism, failures of additivity have long been known and have been studied
in the laboratory in the conditions of the colour matching experiment. Most of the studies have
concentrated on comparing tristimulus values directly measured by an observer with those
calculated from the same observer’s CMF under the assumption of additivity. Invariably, it was
found that the observer’s measured values are “bluer” than the predicted ones, and the
discrepancies were attributed to some unknown mechanism of adaptation. Next to nothing is known
about the implications of this effect for practical colorimetry. Moreover, most studies
indicated that, even in the laboratory environment, these failures are significant only for
individual observers and become insignificant in the results of a group of observers.
We observed similar “blue” shifts in the conditions of cross-media colour matching. Moreover,
these shifts are significant in the results of the group of observers, and not just for individuals.
We were able to show that the discrepancies can be explained by a rapid non-linear adaptation
of, mostly, the blue-yellow chromatic channel. This adaptation does not follow the von Kries
principle: the sensitivity of the chromatic channel is adjusted in response to signals generated
by all three cone mechanisms. We modelled this adaptation using the MacLeod-Boynton
chromaticity space (MacLeod and Boynton 1979). We suggest a framework for modifying the
chromatic adaptation transform to account for adaptation of this kind, but were not able to
develop this transform to a working state because our experimental data are insufficient.
4.7.3. The model of uncertainty of colour matching in soft-proofing
In section 3.1.1, we defined the model of uncertainty of colour matching as a function of
individual variability of CMF and of additivity failure:
$$
\begin{aligned}
U_R(\lambda) &= f\left(U_{w,R}(\lambda), A_R\right) \\
U_G(\lambda) &= f\left(U_{w,G}(\lambda), A_G\right) \\
U_B(\lambda) &= f\left(U_{w,B}(\lambda), A_B\right)
\end{aligned}
\tag{4.6.44}
$$
This model is based on two assumptions:
1. The colour matching is cone-quantal; that is, colour equality is correctly predicted by
equality of tristimulus values. Consequently, the uncertainty of colour matching is the
uncertainty of tristimulus values.
2. The magnitude of additivity failure is unpredictable and cannot be compensated for.
Results of our last experiment have shown that neither assumption holds in cross-media
colour matching in conditions typical of soft-proofing:
1. Colour matching in these conditions is neural rather than cone-quantal. This is shown
by the variability of the colour matching data, which has the characteristics of colour
discrimination rather than of observer metamerism. Thus the uncertainty of a colour match
is described by colour difference formulae rather than by the variability of tristimulus values.
2. The magnitude and direction of failures of additivity can be predicted to a high degree and
compensated for. This was shown by the model describing the relationship between the
stimulus displayed on the monitor and the post-receptoral adaptation, and by showing
that it can be incorporated in an adaptation transform.
Thus the model of uncertainty of a cross-media colour match in the conditions of soft-proofing
is a model of colour difference: in our case, one unit of the CIEDE2000 formula with suitably
adjusted parametric coefficients:
$$U_{Q,a^*b^*} = f\left(1 \cdot \Delta E_{00}\right) \tag{4.6.45}$$
For a surface colour stimulus Q, 95% of the colour matches made by an individual observer on a
display having typical LCD or CRT primaries will lie within a region around the coordinates of Q
in CIELAB space bounded by the locus of one CIEDE2000 unit with the parameter kC set to 2. 95%
of the colour matches made by all observers with normal colour vision will lie within a region
bounded by the locus of one CIEDE2000 unit with the parameter kC set to 3.
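To illustrate what a tolerance of one CIEDE2000 unit with an adjusted kC means in practice, the sketch below implements the published CIEDE2000 equations (Luo et al. 2001) with the parametric factors kL, kC and kH exposed, and tests whether a match falls inside the tolerance region around a reference colour. The function names `ciede2000` and `within_tolerance` and the sample coordinates are assumptions made for illustration; they are not data from the experiment.

```python
import math
from typing import Tuple

Lab = Tuple[float, float, float]


def ciede2000(lab1: Lab, lab2: Lab,
              kL: float = 1.0, kC: float = 1.0, kH: float = 1.0) -> float:
    """CIEDE2000 colour difference between two CIELAB colours."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    # Rescale a* so that the formula behaves better near the neutral axis.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0 if c1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0 if c2p else 0.0

    # Lightness, chroma and hue differences.
    d_lp = L2 - L1
    d_cp = c2p - c1p
    if c1p * c2p == 0.0:
        d_hp_deg = 0.0
    else:
        d_hp_deg = h2p - h1p
        if d_hp_deg > 180.0:
            d_hp_deg -= 360.0
        elif d_hp_deg < -180.0:
            d_hp_deg += 360.0
    d_hp = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_hp_deg / 2.0))

    # Means used by the weighting functions.
    l_bar = (L1 + L2) / 2.0
    c_barp = (c1p + c2p) / 2.0
    if c1p * c2p == 0.0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0

    t = (1.0
         - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * c_barp
    s_h = 1.0 + 0.015 * c_barp * t
    # Rotation term for the blue region.
    d_theta = 30.0 * math.exp(-(((h_bar - 275.0) / 25.0) ** 2))
    r_c = 2.0 * math.sqrt(c_barp ** 7 / (c_barp ** 7 + 25.0 ** 7))
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c

    term_l = d_lp / (kL * s_l)
    term_c = d_cp / (kC * s_c)
    term_h = d_hp / (kH * s_h)
    return math.sqrt(term_l ** 2 + term_c ** 2 + term_h ** 2
                     + r_t * term_c * term_h)


def within_tolerance(reference: Lab, match: Lab,
                     kC: float = 2.0, limit: float = 1.0) -> bool:
    """True if the match lies within `limit` CIEDE2000 units (kL = kH = 1)."""
    return ciede2000(reference, match, kC=kC) <= limit


# Invented example: a surface colour and a display match that is slightly more chromatic.
surface = (50.0, 20.0, 10.0)
display_match = (50.0, 21.0, 10.5)
accepted = within_tolerance(surface, display_match, kC=2.0)
```

Raising kC from 2 to 3 widens the region along the chroma direction, which is how the single-observer and all-observer regions described above differ.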
5. Conclusions
5.1. Variability of colour matching, variability of colour matching functions and observer metamerism
A strictly trichromatic colour match is a cone-level event: in the conditions of a symmetric
colour matching experiment, if the signals from the three types of cones are identical then the
lights which trigger them will match in colour, whatever their spectral power distributions are.
In the conditions of a quasi-symmetric colour match, strict trichromacy can reasonably be assumed
to hold in a small bipartite field (Wyszecki and Stiles 1982). It has been shown that as the size
of the bipartite field is increased, observers begin to compare signals from postreceptoral
chromatic channels rather than the cone signals themselves (Viénot 1987); i.e. a colour match no
longer means equality of cone signals. However, the successful measurement of large-field colour
matching functions (Stiles and Burch 1959) shows that trichromacy can still be assumed to hold in
these conditions. Finally, trichromacy cannot be assumed if more than three types of receptors
participate in the match, i.e. in the event of rod intrusion. In the conditions of strict
trichromacy, the variability of colour matching properties in the colour-normal population is
ultimately characterised by the variability within a set of colour matching functions measured by
a pool of observers representative of this population.
A strictly trichromatic colour match is a physical/optical event: it can be modelled
mathematically, given knowledge of the prereceptoral filters and of the receptor absorptance
properties. Such a model would predict whether two stimuli match. Moreover, given
information about the variability of each of the elements in the eye’s optical path, the model
would predict the range of mismatches that different observers would see in a pair which is
metameric with respect to the average observer. Provided that the knowledge about the eye’s
properties is complete, the result of the modelling would be identical to the result of a colour
matching experiment with human observers, and would eventually make such an experiment unnecessary.
Both methods of modelling observer metamerism (from a measured set of CMFs, or from a model of
the eye’s optics) will successfully predict the variations of matches made by different
colour-normal observers in conditions of symmetric or quasi-symmetric colour matching. The
mechanisms operating in asymmetric colour matching are unknown, but they do not seem to be those
of hard-wired comparison of cone signals (Danilova and Mollon 2006). However, these are the
conditions in which colorimetry is usually applied
in imaging, where the stimuli are almost always spatially separated. If the mechanisms
governing this kind of matching are not the same as those governing quasi-symmetric colour
matching, neither of the two methods would successfully predict the spread of judgments made
by observers. Colour matching functions can be used to predict the match, but the variability of
colour matching functions cannot be used to predict the range of mismatches accepted by
different observers as a match: the degree of observer metamerism does not correspond to the
degree of variability of matches between spatially separated stimuli. This is the main finding
of our study concerning the variability of colour matching.
An open question is that of the perceptibility and acceptability of colour differences between
matches made by different observers, and between an individual observer’s match and the mean
observer’s match. Our results indicate that matches made by different observers are often
statistically different. As these differences do not result from observer metamerism, this
indicates that there are other individual differences, which must reside in higher-order visual
mechanisms: in neural or cortical processing. Thus the question moves from the domain of
variations in colour matching to that of variations in colour vision (Webster et al. 2000a;
Webster et al. 2000b; Webster and Webster 2002; Malkoc et al. 2005). Whether or not these
inter-individual differences lead to perceptually significant colour differences is not known,
and needs to be researched.
We must limit our conclusions to the conditions of our experiment: a typical LCD or CRT
monitor matched to a surface sample illuminated by a fluorescent light source. Recent
developments in imaging include narrow-band displays in which colours are created by LED or
laser light sources. In lighting, narrow-band LED technology is becoming more and more popular.
How the visual system will behave in conditions of extreme metamerism, when both spatially
separated stimuli are narrow-band lights, is not known and needs to be established by further studies.
The variability of CMF can only be used to model variability resulting strictly from observer
metamerism, i.e. in conditions similar to symmetric or quasi-symmetric colour matching. These
conditions are rare in imaging; however, they are common in textiles, paints, automotive and
other industries, where samples are often compared adjacent to one another, with no gap or a
very small gap between them. If a tool for predicting the degree of metamerism is indeed
required, it seems reasonable to base the estimate directly on a set of colour matching
functions, as was suggested by Wyszecki (Wyszecki 1969) and as was done in this study. The need
for an “elegant” mathematical tool such as the Standard Deviate Observer is doubtful: the reason
for the insistence on using two rather than twenty colour matching functions for the estimation
of observer metamerism seems to lie in the period when calculations were done by hand on a piece
of paper. Virtually all modern colorimetric tools are computerised, and having a large number of
CMF stored in memory does not pose a problem.
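The direct estimate described above can be sketched in a few lines: the tristimulus values of two spectra are computed with each observer’s CMF in turn, and the per-observer differences show how far a pair that matches for one observer departs from a match for the others. This is a minimal sketch under stated assumptions: the function names are hypothetical, the four-sample spectra and the two sets of CMF are toy numbers chosen so that the pair is metameric for the first observer, and the conversion of the differences to colour difference units is omitted.

```python
from typing import List, Sequence, Tuple

# One observer's CMF: three sequences sampled at the same wavelengths.
CMF = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def tristimulus(spd: Sequence[float], cmf: CMF) -> Tuple[float, ...]:
    """Tristimulus values as wavelength-by-wavelength sums (unit sampling interval)."""
    return tuple(sum(p * c for p, c in zip(spd, bar)) for bar in cmf)


def mismatch_per_observer(spd_a: Sequence[float],
                          spd_b: Sequence[float],
                          cmf_set: Sequence[CMF]) -> List[Tuple[float, ...]]:
    """Tristimulus difference (b minus a) of the pair for every observer in the set."""
    result = []
    for cmf in cmf_set:
        t_a = tristimulus(spd_a, cmf)
        t_b = tristimulus(spd_b, cmf)
        result.append(tuple(vb - va for va, vb in zip(t_a, t_b)))
    return result


# Toy data (not measured CMF): four wavelength samples, two observers.
observer_1: CMF = ([1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1])
observer_2: CMF = ([1, 0, 0, 0.8], [0, 1, 0, 1], [0, 0, 1, 1.1])

spd_a = [2, 2, 2, 2]   # flat spectrum
spd_b = [3, 3, 3, 1]   # different spectrum, same tristimulus values for observer_1

differences = mismatch_per_observer(spd_a, spd_b, [observer_1, observer_2])
```

With a real set such as the S&B data, the list of differences would contain one entry per observer, and its spread is the quantity of interest.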
5.2. Adaptation and additivity
The fact that self-luminous displays look different from what colorimetry predicts has been
known for decades, in fact since the introduction of colour television (Hunt 2004, p. 390).
The existence of adaptation differences between broadband and narrow-band stimuli, leading to
additivity failures, has also long been acknowledged (Trezona 1953; Trezona 1954; Stiles
1963; Crawford 1965; Lozano and Palmer 1967; Zaidi 1986). In this study we have established
a relationship between the two, and have shown evidence indicating that both effects are caused
by the same mechanism of postreceptoral adaptation. To our knowledge, this report is the first
to show the consequences of additivity failure in conditions relevant to practical colorimetry.
Since the establishment of colorimetry, the additivity laws have been something of a sacred
issue. The general understanding seemed to be that failure of additivity essentially leads to
the breakdown of CIE colorimetry and to the need to redesign it. Our main conclusion concerning
the failure of additivity is that this is not so: the additivity failure can be predicted,
modelled and compensated for.
The results of the present experiment are not sufficient to devise a general model of additivity
failure. We have only characterised the effect for spectra typical of LCD and CRT displays.
Although the difference between the two was insignificant, it is not known how the visual system
reacts to other narrow-band sources. A general model of additivity failure, which would predict
the postreceptoral adaptation for any arbitrary spectral power distribution, is still to be
developed, and it will have to be based on an understanding of the visual mechanisms underlying
the phenomenon. It has recently been suggested that the visual system uses non-linear processing
of colour information to compensate for its limited spectral sensitivity, and to assign hues
according to the properties of the physical environment rather than mere physiological receptor
responses (Mizokami et al. 2006). It is possible that an understanding of the functional purpose
of this mechanism will lead to new possibilities of accounting for it in colorimetry and colour
reproduction.
The Abney effect occurs when a narrow-band light is diluted by equal-energy white: the hue of
the mixture changes together with the saturation (Abney 1910). This effect is a demonstration of
additivity failure: the hue of the mixture is not the sum of the hues of the lights that constitute it
(Mizokami et al. 2006). The similarity between the setup used to demonstrate the Abney effect and
that of matching narrow-band with broadband lights is evident: in both cases the visual system
seems to adjust its response to lights of varying bandwidth. Study of the mechanisms underlying
the Abney effect seems to be a reasonable direction for future research into additivity failures.
If additivity indeed fails to an appreciable extent only when broadband and narrow-band lights
are compared, this might explain why CIE colorimetry, which does not account for the possibility
of such failure, has been successfully implemented for nearly a century. Practical tasks of
matching colours of markedly different bandwidth are rare. Indeed, the only one which comes to
mind is that of matching a display colour to an object colour, the one which was tested in our
experiment. In order to continue to be successfully implemented, colorimetry needs to devise a
generic way of accounting for situations where additivity does not hold: not only in the setup
described in this thesis, but also in a form that would suit possible future imaging technologies.
5.3. Uncertainty of colour matching and uncertainty of colour vision: basic vs. advanced colorimetry
In his 1973 paper, Wyszecki defines the concepts of basic and advanced colorimetry (Wyszecki
1973). Basic colorimetry is
… a tool used to making a prediction on whether two lights (visual stimuli) of
different spectral power distributions will match in colour for certain given
conditions of observation.
Advanced colorimetry, on the other hand, deals with
… assessing the appearance of colour stimuli presented to the observer in
complicated surroundings as they may occur in everyday life.
The present research began as research in basic colorimetry. We saw our target as the
development of a new standard deviate observer. The way to approach this goal seemed to be
to collect as much colour matching data as possible. By the end of the first
experiment we knew that we did not need any more data: the set from the S&B study (Stiles and
Burch 1959) of 50 years ago provides all the information we need. The second experiment
taught us that a new SDO is not needed at all, at least for cross-media colour
matching: observer metamerism does not contribute much to variations in colour matches of
spatially separated stimuli. Moreover, in these conditions the colour matching itself does not
seem to operate according to the classical cone-quantum metamerism model, as the observer’s
adaptation state changes instantaneously when the gaze is moved from one medium to another. As
a result, we had to resort to advanced colour difference formulae and a chromatic adaptation
transform.
Effectively, we have moved from the domain of basic colorimetry to the advanced one.
Observers did not make their matches by equating the cone signals triggered by the two stimuli,
but rather by equating the colour perceptions. Consequently, the tolerances of the matches were
those of colour perception and not of colour matching. In the introduction to the second
experiment, we already quoted Wyszecki’s note from the same paper (Wyszecki 1973):
Fundamental research in basic colorimetry is concerned with the limitations of
the colour-matching laws which govern basic colorimetry…
In our research we seem to have reached these limitations. Basic colorimetry can only be used
under the assumption that, in the conditions of its application, vision operates no differently
in principle from the way it operates in the colour matching experiment. While in many
situations this assumption is justified, the possibility of the contrary must always be taken
into account.
6. References
Abney, W. (1910). "On the changes in hue of spectrum colors by dilution with white light." Proceedings of the Royal
Society of London 82: 120-127.
Aguilar, M. and Stiles, W. S. (1954). "Saturation of the rod mechanism of the retina at high levels of stimulation."
Optica Acta 1: 59.
Alfvin, R. A. (1995). MSc thesis: A computational analysis of the observer metamerism in cross-media color
matching. Rochester Institute of Technology, Rochester, NY, USA.
Alfvin, R. A. and Fairchild, M. D. (1997). "Observer variability in metameric color matches using color reproduction
media." Color Research and Application 22(3): 174-188.
Allen, E. (1970). "An index of metamerism for observer differences." Proceedings of the 1st AIC congress, Color 69,
Musterschmidt, Göttingen: 771-784.
Alpern, M. and Moeller, J. (1977). "The red and green cone visual pigments of deuteranomalous trichromacy." The
Journal of Physiology 266: 647-675.
Anderson, T. W. (2003). An Introduction to multivariate statistical analysis. John Wiley & Sons, Inc., New Jersey.
Beatty, S., Nolan, J., Kavanagh, H. and O’Donovan, O. (2004). "Macular pigment optical density and its relationship
with serum and dietary levels of lutein and zeaxanthin." Archives of Biochemistry and Biophysics 430: 70-76.
Berns, R. S. (1993a). Mathematics of CIE colorimetry. CIE symposium on advanced colorimetry, CIE publication
x007, Central bureau of the CIE, Vienna, Austria: 7-17.
Berns, R. S. (1993b). The mathematical development of CIE TC 1-29 proposed color difference equation: CIELCH.
7th Congress of the International Color Association: C19-1 - C19-4.
Berns, R. S. (2000). Billmeyer and Saltzman's principles of color technology. John Wiley and Sons, Inc., New York,
USA.
Billmeyer, F. W., Jr. (1993). Initial remarks to Thornton presentation CIE Vienna 93. CIE symposium on advanced
colorimetry, CIE publication x007, Central bureau of the CIE, Vienna, Austria: 18.
Billmeyer, F. W., Jr. and Saltzman, M. (1980). "Observer metamerism." Color research and application 5(2): 72.
Blottiau, F. (1947). "Les défaults d’additivité de la colorimétrie trichromatique." Rev. d’Opt. (Théor. Instrum.) 26:
193.
Bone, R. A., Landrum, J. T. and Cains, A. (1992). "Optical density spectra of the macular pigment in vivo and in
vitro." Vision Research 32(1): 105-110.
Bone, R. A. and Sparrock, J. M. (1971). "Comparison of macular pigment densities in human eyes." Vision Research
11: 1057-1064.
Braun, K. M. and Fairchild, M. D. (1997). "Testing five color-appearance models for changes in viewing conditions."
Color Research and Application 22: 165-173.
Brill, M. H. (2006). Personal communication.
Brill, M. H. and Derefeldt, G. (1991). "Comparison of reference-white standards for video display units." Color
research and application 16(1): 26-30.
Brill, M. H. and Robertson, A. R. (2006). "Open problems on the validity of Grassmann's laws." In "Colorimetry:
Understanding the CIE System (CIE preprint edition)". J. Schanda (ed.), John Wiley and Sons, Inc.: 10-1 -
10-13.
Brown, W. R. J. (1952). "The effect of field size and chromatic surroundings on color discrimination." Journal of the
Optical Society of America 42: 837.
Buck, S. L. (2001). "What is the hue of rod vision." Color Research and Application 26: S57-S59.
Burns, P. D. and Berns, R. S. (1997). "Error propagation analysis in color measurement and Imaging." Color
research and application 22(4): 280-289.
Carroll, J., Neitz, J. and Neitz, M. (2002). "Estimates of L:M cone ratio from ERG flicker photometry and genetics."
Journal of Vision 2: 531-542.
Christensen, W. F. and Rencher, A. C. (1997). "A comparison of Type I error rates and power levels for seven
solutions to the multivariate Behrens-Fisher problem." Communications in Statistics - Simulation and
Computation 26(4): 1251-1273.
CIE (1951). CIE proceedings 1951, Central bureau of the CIE, Paris.
CIE (1989). Special metamerism index: change in observer. CIE publication No. 80. Central Bureau of the CIE,
Vienna, Austria.
CIE (1993). CIE symposium on advanced colorimetry, CIE publication x007, Central bureau of the CIE, Vienna,
Austria.
CIE (2004). Publication CIE 15:2004. Colorimetry, 3rd Edition. Vienna, Austria, CIE Central Bureau.
CIE (2004). A review of chromatic adaptation transforms: CIE publication 160:2004. Central Bureau of the CIE,
Vienna, Austria.
CIE (2005). CIE 10 degree photopic photometric observer. CIE publication 165-2005. Central Bureau of the CIE,
Vienna, Austria.
CIE (2005). Fundamental chromaticity diagram with physiological axes - part I. Report of the CIE TC 1-36 (draft).
Central Bureau of the CIE, Vienna, Austria.
Coblentz, W. W. and Emerson, W. B. (1918). "Relative sensibility of the average eye to light of different colors and
some practical applications of radiation problems." U. S. Bureau of Standards Bulletin 14: 167.
Crawford, B. H. (1949). "The scotopic visibility function." Proceedings of the Physical Society B 62: 321-334.
Crawford, B. H. (1965). "Color matching and adaptation." Vision Research 5: 71-78.
Cui, G. (2000). PhD Thesis: Colour-difference evaluation using CRT displays. University of Derby, Derby, UK.
Curcio, C. A., Sloan, K. R., Kalina, R. E. and Hendrickson, A. E. (1990). "Human photoreceptor topography." The
Journal of Comparative Neurology 292: 497-523.
Danilova, M. V. and Mollon, J. D. (2006). "The comparison of spatially separated colours." Vision Research 46:
823-836.
Dartnall, H. J. A., Bowmaker, J. K. and Mollon, J. D. (1983). "Human visual pigments: microspectrophotometric
results from the eyes of seven persons." Proceedings of the Royal Society of London. Series B, Biological
Sciences 220(1218): 115-130.
Dartnall, H. J. A. and Lythgoe, J. N. (1965). "The spectral clustering of visual pigments." Vision Research 5: 81-106.
De Groot, S. G. and Gebhard, J. W. (1952). "Pupil size as determined by adapting illuminant." Journal of Optical
Society of America 42: 492.
Diaz, A. J., Chiron, A. and Not, F. V. (1998). "Tracing a metameric match to Individual variations of color vision."
Color Research and Application 23(6): 379-389.
Fairchild, M. D. (1989). "A novel method for the determination of color matching functions." Color Research and
Application 14(3): 122-130.
Fairchild, M. D. (1992). "Chromatic adaptation to image displays." TAGA Proceedings 2: 803-823.
Fairchild, M. D. (2005). Color appearance models. John Wiley & Sons, Inc.
Farnsworth, D. (1943). "The Farnsworth -Munsell 100 hue and dichotomous tests for color vision." Journal of
Optical Society of America 33: 568-578.
Gibson, K. S. and Tyndall, E. P. T. (1923). "Visibility of radiant energy." Bulletin Bureau of Standards 19: 131.
Grassmann, H. G. (1853). "Theory of compound colors." Philosophical Magazine 4(7): 254-264. In "Sources of
Color Science". D. L. MacAdam (ed.). Cambridge, MA, USA, The MIT press.
Grutzner, P. and Kohlrausch, A. (1961). "Der lichtverlust in der macula lutea und das farbensehen von
deuteranomalen." Pflugers Arch. 274: 318-330.
Guild, J. (1931). "The colorimetric properties of the spectrum." Philosophical Transactions of Royal Society
(London) 230: 149-187.
Hammond, B. R. and Caruso-Avery, M. (2000). "Macular pigment optical density in a southwestern sample."
Investigative Ophthalmology and Visual Science (supplement) 41: 1492-1497.
Hammond, B. R., Wooten, B. R. and Snodderly, D. M. (1997). "Individual variations in the spatial profile of human
macular pigment." Journal of the Optical Society of America A 14(6): 1187-1196.
Hunt, R. W. G. (2004). The Reproduction of Colour. John Wiley & Sons, Chichester.
Hurvich, L. M. and Jameson, D. (1951). "A Psychophysical study of white." Journal of the Optical Society of
America 41(8): 521-527.
Hurvich, L. M. and Jameson, D. (1951). "A Psychophysical study of white. III. Adaptation as variant." Journal of the
Optical Society of America 41(11): 787-801.
Ishak, I. G. H. (1951). (Bibliographical details could not be located).
Ishak, I. G. H. (1952). "The photopic luminosity curve for a group of fifteen Egyptian trichromats." Journal of the
Optical Society of America 42: 529-534.
Ishihara, S. The series of plates designed as a test for colour-blindness. Kanehara Shuppan Co., Tokyo.
ISO (1993). Guide to the Expression of Uncertainty in Measurement, International Organization for Standardization.
Jameson, D. and Hurvich, L. M. (1951). "A Psychophysical study of white. II. Neutral adaptation. Area and duration
as variants." Journal of the Optical Society of America 41(8): 528-536.
Johnson, R. A. and Wichern, D. W. (2002). Applied multivariate statistical analysis. Prentice Hall, Upper Saddle
River, NJ, US.
Judd, D. B. (1993). Judd's method for calculating the tristimulus values of the CIE 10° observer. CIE symposium on
advanced colorimetry, CIE publication x007, Central bureau of the CIE, Vienna, Austria: 107-114.
Kaiser, P. K. and Hemmendinger, H. (1980). "The color rule: a device for color vision testing." Color Research and
Kraft, T. W., Neitz, J. and Neitz, M. (1998). "Spectra of human L cones." Vision Research 38: 3993-3670.
Lix, L. M. and Keselman, H. J. (2004). "Multivariate tests of means in independent groups designs effects of
covariance heterogeneity and nonnormality." Evaluation & the health professions 27(1): 45-69.
Lozano, R. D. and Palmer, D. A. (1967). "The additivity of large-field colour matching functions." Vision Research
7: 929-937.
Lozano, R. D. and Palmer, D. A. (1968). "Large-field color matching and adaptation." Journal of the Optical Society
of America 58: 1653-1656.
Ludvigh, E. and McCarthy, E. F. (1938). "Absorption of visible light of the refractive media of the human eye."
Archives of Ophthalmology 20: 37.
Luo, M. R., Cui, G. and Rigg, B. (2001). "The development of the CIE 2000 colour-difference formula:
CIEDE2000." Color research and application 26(5): 340-350.
Luo, M. R. and Rigg, B. (1986). "Chromaticity-discrimination ellipses for surface colours." Color research and
application 11: 25-42.
Luo, W. (2003). Measuring the uncertainty of colour matching. Derby, UK.
MacAdam, D. L. (1942). "Visual sensitivities to color differences in daylight." Journal of the Optical Society of
America 32: 247.
MacLeod, D. I. A. and Boynton, R. M. (1979). "Chromaticity diagram showing cone excitation by stimuli of equal
luminance." Journal of Optical Society of America 69(8): 1183-1186.
Malkoc, G., Kay, P. and Webster, M. A. (2005). "Variations in normal color vision. IV. Binary hues and hue scaling."
Journal of Optical Society of America A 22(10): 2154-2168.
Martinez, J. A., Perez-Ocon, F., Garsia-Beltran, A. and Hita, E. (2003). "New deviate observer (JF-DO) obtained
from experimental colour-matching functions for small fields of real observers." Color Research and
Martinez, J. A., Vega, F. J., Diaz, J. A., Perez-Ocon, F. and Jimenez, J. R. (2005). "Testing the behavior for
metamerism of the new Deviate Observer (JF-DO)." Color research and application 30(5): 363-370.
Maxwell, J. C. (1860). "On the theory of compound colours, and the relations of the colours to the spectrum."
Philosophical Transactions of Royal Society (London) 10: 404-409.
Merbs, L. S. and Nathans, J. (1992). "Absorption spectra of human cone pigments." Nature 356: 433-435.
Mizokami, Y., Werner, J. S. and Webster, M. A. (2006). "Nonlinearities in color coding: compensating color
appearance for the eye's spectral sensitivity." Journal of Vision 6: 996-1007.
Mollon, J. D. (2003). "The origins of modern color science." In "The science of color". S. K. Shevell (ed.). Oxford,
UK, Optical Society of America and Elsevier.
Moon, P. and Spencer, D. E. (1944). "Visual data applied to lighting design." Journal of Optical Society of America
34: 605.
Nakano, Y., Kurokami, M., Moriki, H., Suehara, K., Kohda, J. and Yano, T. (2003). Polychrometer using digital
micro-mirror device and its application to additivity test of color matching. Proceedings of the 25th session of
the CIE, San Diego, USA: 56-59.
Nardi, M. A. (1980). "Observer metamerism in college-age observers." Color research and application 5(2): 73.
Nayatani, Y. (1994). "Comments to the Articles "Measuring Color Matching Functions, Part I and II" by Amy D.
North and Mark D. Fairchild." Color research and application 19(5): 383-389.
Nayatani, Y., Takahama, K. and Sobagaki, H. (1983). "A proposal of new standard deviate observers." Color
Research and Application 8(1): 47-56.
Neitz, J. and Jacobs, G. H. (1986). "Polymorphism of the long-wavelength cone in normal human color vision."
Nature 323: 623-625.
Neitz, J. and Jacobs, G. H. (1989). "Polymorphism of cone pigments among color normals: evidence from color
matching." In "Colour Vision Deficiencies IX". B. Drum and G. Verriest (ed.). Dordrecht, The Netherlands,
Kluwer Academic: 22-34.
Neitz, J. and Jacobs, H. (1990). "Polymorphism in normal human color vision and its mechanism." Vision Research
30(4): 621-636.
Neitz, J., Neitz, M. and Jacobs, G. H. (1993). "More than three different cone pigments among people with normal
color vision." Vision Research 33(1): 117-122.
Neitz, M., Neitz, J. and Grishok, A. (1994). "Polymorphism in the number of genes encoding long-wavelength-
sensitive cone pigments among males with normal color vision." Vision Research 35(17): 2395-2407.
Neitz, M., Neitz, J. and Jacobs, G. (1991). "Spectral tuning of pigments underlying red-green color vision." Science
252: 971-974.
Neitz, M., Neitz, J. and Jacobs, G. H. (1995). "Genetic basis of photopigment variations in human dichromats."
Vision Research 35(15): 2095-2113.
Nel, D. G. and Merwe, C. A. V. D. (1986). "A solution to the multivariate Behrens-Fisher problem." Communications
in Statistics: Theory and Methods 15: 3719-3735.
Nimeroff, I., Rosenblatt, J. R. and Dannemiller, M. C. (1961). "Variability of spectral tristimulus values." Journal of
Research of the National Bureau of Standards - A 65(6): 475-483.
Norren, D. V. and Tiemeijer, L. F. (1986). "Spectral reflectance of the human eye." Vision Research 26: 313-320.
Norren, D. V. and Vos, J. J. (1974). "Spectral transmission of the human ocular media." Vision Research 14(11):
1237-1243.
North, A. D. and Fairchild, M. D. (1993a). "Measuring color-matching functions, Part I." Color Research and
Application 18: 155-162.
North, A. D. and Fairchild, M. D. (1993b). "Measuring color-matching functions, Part II: New data for assessing
observer metamerism." Color Research and Application 18: 163-170.
Ohta, N. (1985). "Formulation of a standard deviate observer by a nonlinear optimization technique." Color Research
and Application 10(3): 156-164.
Oicherman, B., Luo, R. M., Robertson, A. and Tarrant, A. (2005). Experimental verification of colorimetric additivity
assumption. 10th Congress of the International Colour Association, Granada, Spain.
Packer, O. and Williams, D. R. (2003). "Light, the retinal image, and photoreceptors." In "The science of color". S. K.
Shevell (ed.). Oxford, UK, Optical Society of America and Elsevier: 41-102.
Pease, P. L., Adams, A. J. and Nuccio, E. (1987). "Optical density of human macular pigment." Vision Research 27:
705-710.
Pobboravsky, I. (1988). Effect of small color differences in color vision on the matching of soft and hard proofs.
TAGA Proceedings: 62-79.
Pokorny, J., Smith, V. C. and Lutze, M. (1987). "Aging of the human lens." Applied Optics 26(8): 1437-1440.
Polyak, S. L. (1945). The Retina. University of Chicago Press, Chicago.
Rich, D. (2006). (Personal communication).
Rich, D. and Jalijali, J. (1995). "Effects of observer metamerism in the determination of human color-matching
functions." Color Research and Application 20: 29-35.
Robertson, A. R. (2002). Proposed test of colorimetric additivity law (not published).
Rodieck, R. W. (1998). The first steps in seeing. Sinauer Associates, Inc., Sunderland, Massachusetts, USA.
Roorda, A. and Williams, D. R. (1999). "The arrangement of the three cone classes in the living human eye." Nature
397: 520-522.
Said, F. S. and Weale, R. A. (1959). "The variation with age of the spectral transmissivity of the living human
crystalline lens." Gerontologia 3: 213.
Sanocki, E., Lindsey, D. T., Winderickx, J., Teller, D. Y., Deeb, S. S. and Motulsky, A. G. (1993). "Serine/Alanine
Amino Acid Polymorphism of the L and M Cone Pigments: Effects on Rayleigh Matches among
Deuteranopes, Protanopes and Color Normal Observers." Vision Research 33(15): 2139-2152.
Shapiro, A. G., Pokorny, J. and Smith, V. C. (1996). "An investigation of scotopic threshold-versus-illuminance
curves for the analysis of color-matching data." Color Research and Application 21: 80-86.
Sharma, G., Wu, W. and Dalal, E. N. (2005). "The CIEDE2000 color-difference formula: implementation notes,
supplementary test data, and mathematical observations." Color Research and Application 30(1): 21-30.
Sharpe, L. T., Stockman, A., Jägle, H., Knau, H., Klausen, G., Reitner, A. and Nathans, J. (1998). "Red, green, and
red-green hybrid pigments in the human retina: correlations between deduced protein sequences and
psychophysically measured spectral sensitivities." The Journal of Neuroscience 18(23): 10053-10069.
Sharpe, L. T., Stockman, A., Knau, H. and Jägle, H. (1998). "Macular pigment densities derived from central and
peripheral spectral sensitivity differences." Vision Research 38: 3233-3239.
Smith, V. C., Pokorny, J. and Starr, S. J. (1975). "Variability of color mixture data - I. Interobserver variability in the
unit coordinates." Vision Research 16(10): 1087-1094.
Speranskaya, N. I. (1959). "Determination of spectrum color co-ordinates for twenty-seven normal observers." Optics
and Spectroscopy 7: 424.
Stiles, W. S. (1955). "18th Thomas Young Oration: The basic data of colour matching." Physical Society Year
Book: 44.
Stiles, W. S. (1963). "N.P.L. colour-matching investigation: addendum on additivity." Optica Acta 10: 229-232.
Stiles, W. S. and Burch, J. M. (1959). "N.P.L. colour-matching investigation: final report (1958)." Optica Acta 6: 1.
Stiles, W. S. and Wyszecki, G. (1962). "Field trials of color mixture functions." Journal of the Optical Society of
America 52: 58-75.
Stockman, A. and Sharpe, L. T. (2000). "The spectral sensitivities of the middle- and long-wavelength-sensitive
cones derived from measurements in observers of known genotype." Vision Research 40: 1711-1737.
Stockman, A., Sharpe, L. T. and Fach, C. (1999). "The spectral sensitivity of the human short-wavelength sensitive
cones derived from thresholds and color matches." Vision Research 39: 2901-2927.
Takahama, K., Sobagaki, H. and Nayatani, Y. (1984). "Prediction of observer variation in estimating colorimetric
values." Color Research and Application 10(2): 106-117.
Tansley, B. W. and Boynton, R. M. (1978). "Chromatic border perception: the role of red- and green-sensitive
cones." Vision Research 18: 683-697.
Tarrant, A. W. S. (2002). Visual colour matching equipment for teaching and research. 10th Congress of the
International Colour Association, Proceedings of SPIE: 985-988.
Thornton, W. A. (1992a). "Toward a more accurate and extensible colorimetry, Part I: Introduction. The visual
colorimeter-spectroradiometer. Experimental results." Color Research and Application 17: 79-122.
Thornton, W. A. (1992b). "Toward a more accurate and extensible colorimetry, Part II: Discussion." Color Research
and Application 17: 162-186.
Thornton, W. A. (1992c). "Toward a more accurate and extensible colorimetry, Part III: Discussion (continued)."
Color Research and Application 17: 240-262.
Thornton, W. A. (1997). "Toward a more accurate and extensible colorimetry, Part IV: Visual experiments with
bright fields and both 10° and 1.3° field sizes." Color Research and Application 22: 189-198.
Thornton, W. A. (1998b). "Toward a more accurate and extensible colorimetry, Part VI: Improved weighting
functions. Preliminary results." Color Research and Application 23: 226-233.
Thornton, W. A. (1998c). "Toward a more accurate and extensible colorimetry, Part V: Testing visually matching
pairs of lights for possible rod participation on the Aguilar-Stiles model." Color Research and Application
23: 92-103.
Trezona, P. W. (1953). "Additivity of colour equations I." Proceedings - Physical Society London B66: 548.
Trezona, P. W. (1954). "Additivity of colour equations II." Proceedings - Physical Society London B67: 513.
Trezona, P. W. (1983). "Luminance level conversions to assist lighting engineers to use fundamental visual data."
Lighting Research and Technology 15: 83.
Trezona, P. W. (1993). Maxwell and maximum saturation methods of colour matching: some sources of error. CIE
symposium on advanced colorimetry, CIE publication x007, Central Bureau of the CIE, Vienna, Austria: 88-90.
Upton, G. and Cook, I. (1996). Understanding Statistics. Oxford University Press, Oxford, UK.
Viénot, F. (1977a). "New equipment for measurement of colour-matching functions." Color Research and
Application 2: 165-170.
Viénot, F. (1977b). "About the statistics of color-matching functions." Die Farbe 26(3): 205-212.
Viénot, F. (1980). "Relations between inter- and intra-individual variability of color-matching functions.
Experimental results." Journal of the Optical Society of America 70(12): 1476-1483.
Viénot, F. (1987). "What are observers doing when making color matches." Die Farbe 34: 221-228.
Vos, J. J. (1972). "Literature review of human macular absorption in the visible and its consequences for the cone
receptor primaries." Soesterberg, The Netherlands: Netherlands Organization for applied scientific research,
Institute for Perception.
Vries, H. L., Spoor, A. and Jielof, R. (1953). "Properties of the eye with respect to polarized light." Physica 19: 419-
432.
Wald, G. (1945). "Human vision and the spectrum." Science 101: 653-658.
Wald, G. (1945). "The spectral sensitivity of the human eye; a spectral adaptometer." Journal of the Optical Society
of America 35: 187.
Weale, R. A. (1954). "Light absorption by the lens of the human eye." Optica Acta 1: 107-110.
Weale, R. A. (1968). "The photoreceptor." In "Techniques of photostimulation in biology". B. H. Crawford (ed.).
Amsterdam, North-Holland Publishing Company, Wiley Interscience Division, John Wiley and Sons, Inc.,
New York: 145-200.
Webster, M. A. (1992). "Reanalysis of lambda max variations in the Stiles-Burch 10° color-matching functions."
Journal of the Optical Society of America A 9(8): 1419-1421.
Webster, M. A. and MacLeod, D. I. A. (1988). "Factors underlying individual differences in the color matches of
normal observers." Journal of the Optical Society of America A 5: 1722-1735.
Webster, M. A., Miyahara, E., Malkoc, G. and Raker, V. E. (2000a). "Variations in normal color vision. I. Cone-
opponent axes." Journal of the Optical Society of America A 17(9): 1535-1544.
Webster, M. A., Miyahara, E., Malkoc, G. and Raker, V. E. (2000b). "Variations in normal color vision. II. Unique
hues." Journal of the Optical Society of America A 17(9): 1545-1555.
Webster, M. A. and Webster, S. M. (2002). "Variations in normal color vision. III. Unique hues in Indian and United
States observers." Journal of the Optical Society of America A 19(10): 1951-1962.
Winderickx, J., Lindsey, D. T., Sanocki, E., Teller, D. Y., Motulsky, A. G. and Deeb, S. S. (1992). "Polymorphism in
red photopigments underlies variation in colour matching." Nature 356: 431-433.
Wolf, S., Sharpe, L. T., Knau, H. and Wissinger, B. (1998). "Numbers and ratios of X-chromosomal-linked opsin
genes." Vision Research 38: 3227-3231.
Wright, W. D. (1928). "A re-determination of the trichromatic coefficients of the spectral colours." Transactions of
the Optical Society 30: 141-161.
Wright, W. D. (1951). "The visual sensitivity of normal and aphakic observers in the ultra-violet." Anne Psycho-
Physiologie 50: 169-176.
Wright, W. D. (1964). The Measurement of Colour. Hilger & Watts LTD, London, UK.
Wyszecki, G. (1958). "Evaluation of metameric colors." Journal of the Optical Society of America 48: 451.
Wyszecki, G. (1959). First NRC progress report on field trials (Donaldson colorimeter) of colour-matching functions.
Ottawa, Canada, National Research Council of Canada.
Wyszecki, G. (1969). "The degree of color metamerism and its specification." Textile Chemist and Colorist 1(1): 12-
15.
Wyszecki, G. (1973). Current developments in colorimetry. Proceedings 2nd AIC Congress, Colour 73, York, Adam
Hilger, London: 21-51.
Wyszecki, G. and Stiles, W. S. (1982). Color science, concepts and methods, quantitative data and formulae. John
Wiley and Sons, Inc.
Zaidi, Q. (1986). "Adaptation and color matching." Vision Research 26(12): 1925-1938.
Zhang, H. and Montag, E. D. (2004). How Well Can People Use Different Color Attributes? Twelfth Color Imaging
Conference: Color Science and Engineering Systems, Technologies, Applications, Scottsdale, Arizona, USA:
10-17.

