
Department of Analytical Chemistry and Organic Chemistry

Area of Analytical Chemistry




LINEAR CALIBRATION AND COMPARISON OF ANALYTICAL
METHODS USING REGRESSION TECHNIQUES THAT
INCORPORATE ERRORS IN ALL THE VARIABLES






Dissertation presented by
ÀNGEL MARTÍNEZ BARAMBIO
to obtain the degree of
Doctor in Chemistry
Tarragona, 2001



After almost four and a half years of work in which so many people have helped me, not only scientifically but also personally, I feel a great need to express my gratitude. I am aware of the responsibility involved in mentioning all of you, which explains my fear of leaving out even a single one of you. I think it is better not to give names, because I am sure that not only those of you who have had something to do with this thesis, but also those of you who have ever thought of me, will know how to read your own name in these lines.

Know that I will always carry you in my thoughts, because it is you who have given me the strength to keep going and to finish this thesis. That is why I want to tell you, as simply as sincerely,
... thank you for being there.




To M. Eugènia,
because despite the difficulties,
you have always given me your support and love.
CONTENTS

Chapter 1. Introduction
1.1 Aim of the doctoral thesis
1.2 Structure of the doctoral thesis
1.3 Notation
1.4 Linear regression
1.4.1 Univariate linear regression techniques
1.4.1.1 Regression techniques that consider the errors in one axis
1.4.1.2 Regression techniques that consider the errors in both axes
1.4.2 Multivariate linear regression
1.4.2.1 Multiple linear regression
1.4.2.2 Multivariate least squares
1.5 Hypothesis tests on the regression coefficients
1.5.1 Conditions of application
1.5.2 Importance of the lack of fit
1.5.3 Probabilities of error of the first and second kind
1.6 Application of linear regression considering errors in all the axes
1.6.1 Calibration of analytical methods
1.6.2 Comparison of analytical methods
1.6.3 Prediction
1.7 References

Chapter 2. Lack of fit of the experimental points to the regression line considering errors in both axes
2.1 Aim of the chapter
2.2 Possible approaches for detecting lack of fit
2.3 Lack of fit in linear regression considering errors in both axes (Chemometrics and Intelligent Laboratory Systems, 54 (2000) 61-73)
2.4 Conclusions
2.5 References

Chapter 3. Probability of error of the first and second kind in the individual tests on the intercept and the slope in linear regression considering errors in both axes
3.1 Aim of the chapter
3.2 Estimation of the probability of error of the second kind in the application of individual tests on the regression coefficients
3.3 Relationship between the probabilities of error of the first and second kind and the number of calibration samples
3.4 Detecting proportional and constant bias in method comparison studies by using linear regression with errors in both axes (Chemometrics and Intelligent Laboratory Systems, 49 (1999) 179-193)
3.5 Conclusions
3.6 References

Chapter 4. Detection of bias in analytical methods for the simultaneous determination of multiple analytes. Probability of committing a β error
4.1 Aim of the chapter
4.2 Comparison of analytical methods
4.2.1 Simultaneous determination of several analytes
4.3 Validation of bias in multianalyte determination methods. Application to RP-HPLC derivatizing methodologies (Analytica Chimica Acta 406 (2000) 257-278)
4.4 Probability of error in the joint test
4.5 Evaluating bias in method comparison studies using linear regression with errors in both axes (Journal of Chemometrics, accepted)
4.6 Conclusions
4.7 References

Chapter 5. Comparison of multiple methods by means of maximum likelihood principal component analysis considering the errors in all the axes
5.1 Aim of the chapter
5.2 Maximum likelihood principal component analysis (MLPCA)
5.3 Validation of the approach for comparing multiple methods
5.4 Multiple analytical method comparison by using MLPCA and linear regression with errors in both axes (Analytica Chimica Acta, submitted)
5.5 Conclusions
5.6 References

Chapter 6. Prediction ability using multivariate linear regression considering errors in all the axes in PCR and maximum likelihood PCR (MLPCR)
6.1 Aim of the chapter
6.2 Maximum likelihood multivariate calibration techniques
6.2.1 MLPCR
6.2.2 MLLRR
6.3 Prediction errors
6.4 Application of multivariate least squares regression method to PCR and maximum likelihood PCR techniques (Journal of Chemometrics, submitted)
6.5 Conclusions
6.6 References

Chapter 7. Conclusions
7.1 General conclusions
7.2 Future lines of research

CHAPTER 1
Introduction


1.1 Aim of the doctoral thesis

This doctoral thesis aims to study in depth several aspects of linear regression considering the errors in all the axes, as used in chemistry both in the comparison of analytical methods and in calibration. These regression methods take into account, on the one hand, the uncertainties due to the errors committed in the analysis of a series of samples by each of the analytical methods in method comparison studies and, on the other, the uncertainties associated with all the experimental values in calibration. Among the aspects considered, the following can be distinguished:

1. A critical review of the linear regression techniques used to estimate the regression coefficients.

2. Development and validation of a statistical test for detecting the lack of fit of the experimental results to the regression line.

3. Development and validation of mathematical expressions for estimating the probabilities of committing errors of the first and second kind when individual tests are applied to the regression coefficients.

4. Study of the detection of a significant bias in the results of analytical methods capable of analysing several analytes simultaneously by means of linear regression.

5. Development and validation of a technique for comparing the results of multiple analytical methods that takes into account the uncertainties of the analytical results.

6. Study of the improvement of the prediction ability of multivariate calibration methods by means of a multivariate regression technique that considers the uncertainties in all the experimental values.

7. Generation of computer algorithms to make the practical application of the developed tests easier.
1.2 Structure of the doctoral thesis

This doctoral thesis is structured in seven chapters, which are in turn divided into several sections and subsections. The first chapter reviews the approaches most commonly used to estimate the regression coefficients, both in univariate and in multivariate calibration. It also presents the bivariate least squares (BLS) regression method, which is the basis of the work developed in the following chapters. Finally, it establishes the conditions required for the correct application of statistical tests on the BLS regression coefficients, as well as several applications of this regression technique.

Because of the consequences that may arise from the existence of lack of fit of the experimental points to the regression line, the second chapter presents and discusses two possible statistical tests for detecting it under the regression conditions of the BLS method. The conclusions drawn from this chapter serve as the basis for discussing the relationship between the errors of the first and second kind in the individual tests on the regression coefficients and the number of points used to build the BLS regression line, which is dealt with in the third chapter.

The relevance of the lack of fit and its relationship with the estimation of the error of the second kind appears again in the fourth chapter, this time in the application of the joint confidence test on the BLS regression coefficients for the comparison of two analytical methods. The conclusions drawn from the practical examples of application of the joint confidence test in the fourth chapter justify the importance of developing a calculation procedure for estimating the probabilities of committing an error of the second kind when the joint confidence test is applied to the BLS regression coefficients, which is also presented in that chapter.

After devoting chapters 2, 3 and 4 to the development and practical application of a series of statistical tests applicable to the regression coefficients estimated by the BLS method, the fifth chapter enters the multivariate field and presents a method for comparing the results obtained by more than two analytical methods considering the uncertainties of all the individual results. This method is based, on the one hand, on a multivariate calibration method, maximum likelihood principal component analysis (MLPCA), and, on the other, on the application of the joint confidence test on the coefficients of the BLS regression line.

The sixth chapter is devoted to the study of a multivariate regression method that considers the uncertainties due to the errors in the measurements of the different samples, and to its application in multivariate calibration techniques. This makes it possible to distinguish between the observed and the true prediction error, and to discuss the improvement observed in the true prediction ability. Finally, the general conclusions drawn from this thesis and the possible future lines of research are presented in the seventh chapter.




1.3 Notation

The notation detailed below is the one followed in the running text of this thesis. The articles presented in each chapter have their own notation, defined in their respective Notation sections, which in some cases differs slightly from the one defined here.

Matrices are represented in bold upper case (e.g. R), vectors in bold lower case (e.g. y) and scalars in italics (e.g. $x_i$).

Symbols beginning with a letter of the Latin alphabet

$b_0$: Estimated value of the intercept of the regression line.
$b_0^{H_0}$: Theoretical value of the intercept for which the null hypothesis is postulated.
$b_0^{H_1}$: Theoretical value of the intercept for which the alternative hypothesis is postulated.
$b_1$: Estimated value of the slope of the regression line.
$b_1^{H_0}$: Theoretical value of the slope for which the null hypothesis is postulated.
$b_1^{H_1}$: Theoretical value of the slope for which the alternative hypothesis is postulated.
$b_p$: Estimated value of the slope p of the regression hyperplane.
$\mathbf{b}$: Vector of the estimated regression coefficients.
$e_i$: Residual error at point i.
$F_{\alpha,\nu_1,\nu_2}$: Value of the Fisher F distribution for a significance level α (one-tailed) with ν1 and ν2 degrees of freedom.
$F_{\alpha/2,\nu_1,\nu_2}$: Value of the Fisher F distribution for a significance level α (two-tailed) with ν1 and ν2 degrees of freedom.
$L$: Likelihood function.
$n$: Number of experimental points.
$p_i$: Number of replicate measurements of sample i.
$\mathbf{R}$: Matrix of spectroscopic measurements.
$s^2$: Estimate of the mean squared residual error; experimental error.
$S$: Sum of the (weighted or unweighted) squared residuals.
$s_{b_0}$: Estimate of the standard deviation of the intercept of the regression line.
$s_{b_0^{H_0}}$: Standard deviation of the theoretical value of the intercept for which the null hypothesis is postulated.
$s_{b_0^{H_1}}$: Standard deviation of the theoretical value of the intercept for which the alternative hypothesis is postulated.
$s_{b_1}$: Estimate of the standard deviation of the slope of the regression line.
$s_{b_1^{H_0}}$: Standard deviation of the theoretical value of the slope for which the null hypothesis is postulated.
$s_{b_1^{H_1}}$: Standard deviation of the theoretical value of the slope for which the alternative hypothesis is postulated.
$s^2_{e_i}$: Estimate of the variance of the residual error at point i; weighting factor.
$s^2_x$: Estimate of the variance of the experimental measurements of the predictor variable.
$s^2_{x_i}$: Estimate of the variance of the predictor variable at point i.
$s^2_{x_{ki}}$: Estimate of the variance of predictor variable k at point i.
$s_{xx}$: Sum of the squared distances between each measurement of the predictor variable and its mean value.
$s_{xy}$: Sum of the products of the distances between each measurement of the predictor and response variables and their respective mean values.
$s^2_{y_i}$: Estimate of the variance of the response variable at point i.
$s_{yy}$: Sum of the squared distances between each measurement of the response variable and its mean value.
$\mathbf{T}$: Matrix of scores.
$t_{\alpha,\nu}$: Value of the Student t distribution for a significance level α (one-tailed) and ν degrees of freedom.
$t_{\alpha/2,\nu}$: Value of the Student t distribution for a significance level α (two-tailed) and ν degrees of freedom.
$\mathbf{V}$: Matrix of eigenvectors or loadings.
$\mathbf{X}$: Matrix of the measured values of the predictor variable(s).
$x$: Predictor variable.
$\bar{x}$: Mean value of the experimental measurements of the predictor variable.
$x_i$: Measured value of the predictor variable at point i.
$\hat{x}_i$: Predicted value of the predictor variable at point i.
$x_{ki}$: Measured value of predictor variable k at point i.
$\hat{x}_{ki}$: Predicted value of predictor variable k at point i.
$\bar{x}_p$: x coordinate of the weighted centroid.
$y$: Response variable.
$\mathbf{y}$: Vector of the experimental measurements of the response variable.
$\bar{y}$: Mean value of the experimental measurements of the response variable.
$y_i$: Measured value of the response variable at point i.
$y_{ij}$: Measured value of replicate j at point i.
$\hat{y}_i$: Predicted value of the response variable at point i.
$\bar{y}_p$: y coordinate of the weighted centroid.

Symbols beginning with a letter of the Greek alphabet

$\alpha$: Significance level; probability of an error of the first kind (type I error).
$\beta$: Probability of an error of the second kind (type II error).
$\beta_0$: True value of the intercept of the regression line.
$\beta_1$: True value of the slope of the regression line.
$\chi^2_{\alpha,n-2}$: Value of the χ² distribution for a significance level α with n − 2 degrees of freedom.
Bias; maximum acceptable difference between an estimated value and a reference value.
$\delta_i$: Random error committed in the measurement of the predictor variable at point i.
$\varepsilon_i$: True residual error at point i.
$\tau_i$: Random error committed in the measurement of the response variable at point i.
$\kappa$: Reliability factor.
$\lambda$: Ratio between the errors of the response and predictor variables in CVR.
Diagonal matrix of variances in the space defined by the rows.
$\sigma^2$: True variance of the experimental measurements.
$\sigma^2_i$: Variance of the true residual error at point i.
$\sigma^2_x$: True variance of all the experimental measurements of the predictor variable.
$\sigma^2_{\xi}$: Variance of the true values of the predictor variable.
$\sigma^2_{x_i}$: True variance of the experimental measurements of the predictor variable at point i.
$\sigma^2_y$: True variance of all the experimental measurements of the response variable.
$\sigma^2_{y_i}$: True variance of the experimental measurements of the response variable at point i.
$\eta$: True response variable.
$\eta_i$: True value of the response variable at point i.
Diagonal matrix of variances in the space defined by the columns.
$\xi$: True predictor variable.
$\xi_i$: True value of the predictor variable at point i.

1.4 Linear regression

In general terms, linear regression comprises a set of statistical techniques used to identify the relationships that exist between two or more variables (univariate or multivariate linear regression, respectively). [1] As an indication of how old the concept is, the term regression was first introduced by the British anthropologist and metrologist Sir Francis Galton (1822-1911) in 1885. [2] Among the different types of univariate and multivariate linear regression methods, we will concentrate on the so-called unbiased methods, that is, those that do not introduce a bias into the estimated regression coefficients. [1]

Regarding the use of univariate linear regression in analytical chemistry, there are two situations in which it is mostly applied. The best known is probably the calibration of analytical methods, where an instrumental response is related to the known concentrations of the calibration standards, often on the basis of a theoretical law that justifies this relationship (the Lambert-Beer equation, the Ilkovich equation, the Nernst equation, etc.). This then allows the concentration of unknown samples to be predicted from an instrumental measurement. The second use of linear regression, although less widespread, is equally important, since it allows the results of a new analytical method to be compared with those of an established reference method. [3]

Multivariate linear regression, on the other hand, makes it possible to relate an instrumental response to more than one predictor variable. If the values of the predictor variables are linearly dependent, that is, collinear, [4,5] the regression coefficients cannot be estimated. To overcome this problem, other multivariate calibration techniques were developed which, although biased, [1] are able to establish the calibration model when the values of the predictor variables are highly collinear. These techniques are known, among others, as principal component regression (PCR) [6] and partial least squares (PLS). [4] In these cases multivariate linear regression plays a very important role, since it makes it possible to establish a mathematical model for predicting certain properties of unknown samples.

1.4.1 Univariate linear regression techniques

This section describes the methods that we consider most relevant, among the large number of existing approaches, for estimating the coefficients of the regression line. All these methods have in common the assumption that the true relationship [7-14] between the predictor variable (ξ) and the response variable (η) obeys the equation of a straight line, expressed as:

$\eta_i = \beta_0 + \beta_1 \xi_i$    (1.1)

where β0 and β1 are the regression coefficients of the true, but unknown, straight line. Both in linear calibration and in method comparison, the measurement of the predictor and response variables is affected to a greater or lesser extent by experimental errors. As a result, the experimentally measured values of the predictor (x) and response (y) variables differ from the true values. The relationship between the true and the measured values can be expressed as:

$x_i = \xi_i + \delta_i$    (1.2)

$y_i = \eta_i + \tau_i$    (1.3)

The random errors committed in the measurement of the variables $x_i$ and $y_i$ are represented by the variables $\delta_i$ and $\tau_i$, where $\delta_i \sim N(0, \sigma^2_{x_i})$ and $\tau_i \sim N(0, \sigma^2_{y_i})$. [8] If expressions 1.2 and 1.3 are substituted into equation 1.1 and the variable $y_i$ is isolated, the following expression is obtained: [7,15,16]

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$    (1.4)

This is the equation of the true regression line in terms of the measured values of the predictor and response variables. The term $\varepsilon_i$ is the true residual error at point i, with $\varepsilon_i \sim N(0, \sigma^2_i)$, [17] and it can be expressed as a function of the variables $\delta_i$, $\beta_1$ and $\tau_i$: [8]

$\varepsilon_i = \tau_i - \beta_1 \delta_i$    (1.5)

Figure 1.1 shows the true regression lines expressed in terms of the theoretical (Eq. 1.1) and the measured (Eq. 1.4) values of the predictor and response variables, together with the true residuals (Eq. 1.5) for each of the values $y_i$.

Figure 1.1. Representation of the different variables in univariate linear regression (the true line of Eq. 1.1, the line in terms of the measured points $(x_i, y_i)$, and the residuals $e_i$ and $\varepsilon_i$ associated with each point).

The bivariate normal distribution [18] jointly followed by the errors $\delta_i$ and $\tau_i$ committed in the measurement of each of the variables $x_i$ and $y_i$ (Eqs. 1.2 and 1.3) is also represented in Figure 1.1, and shows the probability density, associated with each calibration point, of observing a given experimental value. As can be seen, the most general situation has been represented, in which the variance of the variables $\delta_i$ and $\tau_i$ is different at every point (heteroscedasticity). This condition of heteroscedasticity is only assumed by some of the methods described in Section 1.4.1.2. Nevertheless, the final aim of all linear regression methods is to find estimates of the true regression coefficients β0 and β1 such that the regression line given in expression 1.6 fits the n experimental points $(x_i, y_i)$ as well as possible according to a given criterion.

$y_i = b_0 + b_1 x_i + e_i$    (1.6)

The term $e_i$ is the residual error observed for point i $(x_i, y_i)$, with $e_i \sim N(0, s^2)$, [8] where $s^2$ is its variance, known as the experimental error.
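To make the measurement-error model of Eqs. 1.1 to 1.6 concrete, the following minimal sketch simulates a small calibration set in which both variables are measured with heteroscedastic random errors. It is added here purely as an illustration; the true coefficients, concentration levels and standard deviations are arbitrary assumptions, not values taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 6                              # number of calibration points
beta0, beta1 = 0.5, 2.0            # true intercept and slope (assumed values)
xi = np.linspace(1.0, 6.0, n)      # true predictor values (functional model)
eta = beta0 + beta1 * xi           # true responses, Eq. 1.1

s_x = 0.05 + 0.02 * xi             # heteroscedastic standard deviations in x
s_y = 0.10 + 0.05 * xi             # heteroscedastic standard deviations in y

x = xi + rng.normal(0.0, s_x)      # measured predictor values, Eq. 1.2
y = eta + rng.normal(0.0, s_y)     # measured response values, Eq. 1.3

eps = y - (beta0 + beta1 * x)      # true residual errors, Eqs. 1.4 and 1.5
print(np.column_stack([x, y, eps]))
```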


Among the regression methods described below, two large groups can be distinguished according to the assumptions they make about the existence and structure of the variances generated by the errors in the measurements of the samples. On the one hand, there are the regression methods that only consider the variances in one axis, described in their different variants in Section 1.4.1.1. On the other hand, Section 1.4.1.2 presents regression techniques that consider the variances in both axes, whether they are based on maximum likelihood estimation or on least squares estimation.

1.4.1.1 Regression techniques that consider the errors in one axis

This section describes three of the methods most widely used in linear regression: ordinary least squares (OLS), weighted least squares (WLS) and generalized least squares (GLS). As can be seen in the following subsections, some of the expressions for estimating the values of the regression coefficients are given in matrix notation because of its simplicity. In this notation, the linear model of equation 1.6 can also be expressed as: [1]

$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}$    (1.7)

where the vector y, of dimensions n×1, contains the values of the response variable and the matrix X, of dimensions n×2, is formed by a first column of ones and a second column with the values of the predictor variable. The vector b, of dimensions 2×1, contains the two regression coefficients, and e is an n×1 vector with the residual errors of the response variable.

Ordinary least squares (OLS)

Of all the linear regression methods that have been developed, the best known and most widely used is ordinary least squares (OLS). Although its discovery is usually attributed to Carl Friedrich Gauss (1777-1855), who used it before 1803, the first bibliographic reference is due to Adrien-Marie Legendre (1752-1833) in 1805. These facts raised a great controversy at the time about who had been the first to discover this regression method. [19,20]

OLS finds the coefficients of the regression line that best fits the experimental points $(x_i, y_i)$ according to a criterion that minimizes a function of the residual distances between the experimental values of the response variable $y_i$ and the predicted values $\hat{y}_i$ obtained from the regression line according to the expression:

$\hat{y}_i = b_0 + b_1 x_i$    (1.8)

The residual distance of point i that appears in equation 1.6 can therefore also be expressed as:

$e_i = y_i - \hat{y}_i$    (1.9)

To guarantee that the regression line obtained by OLS is the one that best fits the experimental points, the quantity minimized by this regression method is the sum of squared residuals:

$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$    (1.10)

The estimates of the intercept and the slope are therefore found by taking the partial derivatives of equation 1.10 with respect to these coefficients and setting them equal to zero:

$\dfrac{\partial S}{\partial b_0} = 0$    (1.11)

$\dfrac{\partial S}{\partial b_1} = 0$    (1.12)

From equations 1.11 and 1.12 the expressions for the least squares estimates of the regression coefficients are obtained:

$b_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$    (1.13)

$b_0 = \bar{y} - b_1 \bar{x}$    (1.14)

where $\bar{x}$ and $\bar{y}$ are the mean values of the predictor and response variables respectively. The regression line found with these regression coefficients passes through the point $(\bar{x}, \bar{y})$, known as the centroid.

A very important quantity in univariate linear regression, needed to estimate the variances of the regression coefficients, is the variance of the residual error ($s^2$), or experimental error. This quantity gives an idea of the dispersion of the experimental points around the regression line and obeys the following expression:

$s^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2} = \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2}$    (1.15)

Using the matrix notation introduced in equation 1.7, the estimates of the intercept and the slope take the form:

$\mathbf{b} = (\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{y}$    (1.16)

where $\mathbf{X}^{\mathrm{T}}$ is the transpose of X. The experimental error (Eq. 1.15) can also be expressed as:

$s^2 = \dfrac{(\mathbf{y}-\mathbf{X}\mathbf{b})^{\mathrm{T}}(\mathbf{y}-\mathbf{X}\mathbf{b})}{n-2}$    (1.17)

If the linear model is correct, the experimental error is an estimate of the true variance of the experimental measurements, that is, of the true experimental error $\sigma^2$.
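As an illustration of the OLS expressions above, the sketch below computes $b_0$, $b_1$ and $s^2$ from Eqs. 1.13 to 1.15 and checks the result against the matrix form of Eq. 1.16. The calibration data are hypothetical values chosen only for the example.

```python
import numpy as np

def ols_fit(x, y):
    """Ordinary least squares for y = b0 + b1*x (Eqs. 1.13-1.17)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)   # Eq. 1.13
    b0 = y.mean() - b1 * x.mean()                                                # Eq. 1.14
    resid = y - (b0 + b1 * x)
    s2 = np.sum(resid ** 2) / (n - 2)                  # experimental error, Eq. 1.15
    # equivalent matrix form, Eq. 1.16: b = (X^T X)^-1 X^T y
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    assert np.allclose(b, [b0, b1])
    return b0, b1, s2

# hypothetical calibration data
print(ols_fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.0, 9.9]))
```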

For the application of this method to be correct and, therefore, for the estimated regression coefficients not to be biased, the experimental data must meet a number of requirements that are implicitly assumed by this regression method: [1,3,21,22]

1. The true values of the predictor variable are not random but fixed (functional model). The error committed in the experimental measurement of the response variable, expressed in terms of variance ($s^2_{y_i}$), must be much larger than the corresponding value for the predictor variable ($s^2_{x_i}$) multiplied by the square of the slope. For this reason OLS considers the errors committed in the measurement of the predictor variable to be practically zero:

$s^2_{y_i} \gg b_1^2\, s^2_{x_i}, \qquad s^2_{x_i} \approx 0$    (1.18)

2. The variances of the values of the response variable must be constant over the whole linearity range (homoscedasticity) and mutually independent. This is equivalent to saying that the true residual errors at the different points (the term $\varepsilon_i$ in equation 1.4) must not be correlated and that $\varepsilon_i \sim N(0, \sigma^2)$ for all i.

If the conditions described above are met, the estimates of the regression coefficients obtained with the OLS method can be considered maximum likelihood estimates. [1,21] This means that the regression line (Eq. 1.6) obtained with these coefficients is the one with the maximum probability of giving predictions of the response variable ($\hat{y}_i$) closest to the true values ($\eta_i$). This can be shown by considering the joint density function of the true residual errors $\varepsilon_i$, also known as the likelihood function:

$L = \prod_{i=1}^{n} \dfrac{1}{(2\pi\sigma^2)^{1/2}} \exp\!\left(-\dfrac{\varepsilon_i^2}{2\sigma^2}\right) = \dfrac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left(-\dfrac{\sum_{i=1}^{n}\varepsilon_i^2}{2\sigma^2}\right)$    (1.19)

Finding the estimates ($b_0$ and $b_1$) of the true coefficients that maximize the probability of the estimated residual error ($e_i$) being equal to the true value ($\varepsilon_i$) is equivalent to maximizing the likelihood function L. To do so, the observable counterpart of the term $\sum_{i=1}^{n}\varepsilon_i^2$ in the exponential, that is, $\sum_{i=1}^{n} e_i^2$, must be minimized. This implies minimizing the sum of squared residuals S (Eq. 1.10), which is precisely the criterion followed by the OLS regression method.

Weighted least squares (WLS)

The homoscedasticity conditions assumed by the OLS method are frequently violated in univariate linear calibration. Often some of the instrumental responses are less reliable than others and, as a result, the variances of the errors associated with these experimental values are not all equal (heteroscedasticity). [1] Under these conditions the regression coefficients estimated by the OLS method can be biased, and the weighted least squares (WLS) regression method must be applied. In this method the predictor variable is still considered to be free of error ($\delta_i \approx 0$), but the regression coefficients are now estimated by minimizing the sum of weighted squared distances, according to the equation:

$S = \sum_{i=1}^{n} \dfrac{e_i^2}{s^2_{e_i}} = \sum_{i=1}^{n} \dfrac{(y_i - b_0 - b_1 x_i)^2}{s^2_{e_i}}$    (1.20)

where the term $s^2_{e_i}$ is the weighting factor, corresponding to the variance of the residual error $e_i$ (Eq. 1.9), which for the WLS regression method can be expressed as:

$s^2_{e_i} = \operatorname{var}(y_i - b_0 - b_1 x_i) = s^2_{y_i}$    (1.21)

This regression method therefore gives more weight to those points where the error in the measurement of the response variable (expressed in terms of variance) is smaller, that is, to the most precise experimental values. In analogy with the ordinary least squares method, the experimental error for the WLS method is expressed as:

$s^2 = \dfrac{1}{n-2}\sum_{i=1}^{n} \dfrac{(y_i - b_0 - b_1 x_i)^2}{s^2_{e_i}}$    (1.22)

The estimates of the intercept and the slope are found, as in the OLS method, by taking the partial derivatives of equation 1.20 and setting them equal to zero (Eqs. 1.11 and 1.12), which leads to the following expressions:

$b_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x}_p)(y_i - \bar{y}_p)/s^2_{e_i}}{\sum_{i=1}^{n} (x_i - \bar{x}_p)^2/s^2_{e_i}}$    (1.23)

$b_0 = \bar{y}_p - b_1 \bar{x}_p$    (1.24)

The variables $\bar{x}_p$ and $\bar{y}_p$ are the weighted means of the predictor and response variables respectively:

$\bar{x}_p = \dfrac{\sum_{i=1}^{n} x_i/s^2_{e_i}}{\sum_{i=1}^{n} 1/s^2_{e_i}}$    (1.25)

$\bar{y}_p = \dfrac{\sum_{i=1}^{n} y_i/s^2_{e_i}}{\sum_{i=1}^{n} 1/s^2_{e_i}}$    (1.26)

These two values define the position of the weighted centroid, the point through which the regression line found by the WLS method passes. Using matrix notation, the estimates of the regression coefficients and of the experimental error can be obtained from the following expressions:

$\mathbf{b} = (\mathbf{X}^{\mathrm{T}}\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{V}^{-1}\mathbf{y}$    (1.27)

$s^2 = \dfrac{(\mathbf{y}-\mathbf{X}\mathbf{b})^{\mathrm{T}}\mathbf{V}^{-1}(\mathbf{y}-\mathbf{X}\mathbf{b})}{n-2}$    (1.28)

where V is a diagonal matrix of dimensions n×n in which the i-th diagonal element is the variance of the corresponding value of the response variable ($s^2_{y_i}$). It should be noted that if the variances of the values of the response variable are constant, the estimates of the regression coefficients obtained from expressions 1.23, 1.24 and 1.27 are the same as those generated by the ordinary least squares method (OLS, Eqs. 1.13, 1.14 and 1.16).
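The following sketch illustrates the WLS expressions of Eqs. 1.22 to 1.26. The variances assigned to the responses are hypothetical and simply grow with concentration to mimic a heteroscedastic calibration.

```python
import numpy as np

def wls_fit(x, y, s2_y):
    """Weighted least squares with weights 1/s2_y (Eqs. 1.22-1.26)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = 1.0 / np.asarray(s2_y, float)
    xp = np.sum(w * x) / np.sum(w)                       # weighted centroid, Eq. 1.25
    yp = np.sum(w * y) / np.sum(w)                       # Eq. 1.26
    b1 = np.sum(w * (x - xp) * (y - yp)) / np.sum(w * (x - xp) ** 2)   # Eq. 1.23
    b0 = yp - b1 * xp                                    # Eq. 1.24
    s2 = np.sum(w * (y - b0 - b1 * x) ** 2) / (x.size - 2)             # Eq. 1.22
    return b0, b1, s2

# hypothetical data with response variances growing with concentration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
s2_y = np.array([0.01, 0.02, 0.04, 0.08, 0.15])
print(wls_fit(x, y, s2_y))
```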

Generalized least squares (GLS)

This regression method, like the WLS method, is applied when the variances of the values of the response variable ($s^2_{y_i}$) are heteroscedastic. Unlike the WLS method, the GLS method [5,23] takes into account the possibility of correlation between the experimental values of the response variable (covariance). The matrix expressions for finding the estimates of the regression coefficients and of the experimental error are the same as those presented for WLS (Eqs. 1.27 and 1.28). In this case, however, the matrix V is no longer diagonal: its off-diagonal elements $s_{ik}$ (i ≠ k, 1 ≤ i ≤ n, 1 ≤ k ≤ n) correspond to the covariances between the values $y_i$ and $y_k$ of the response variable, $\operatorname{cov}(y_i, y_k)$.

If the covariances between the different values of the response variable are zero, the estimates of the regression coefficients and of the experimental error obtained by GLS are identical to those obtained with the WLS method (Eqs. 1.27 and 1.28).
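A minimal sketch of the GLS matrix expressions (Eqs. 1.27 and 1.28) with a non-diagonal variance-covariance matrix is given below. The covariance value linking neighbouring responses is an arbitrary assumption used only to make the example non-trivial.

```python
import numpy as np

def gls_fit(x, y, V):
    """Generalized least squares, Eqs. 1.27-1.28, with a full
    variance-covariance matrix V of the response values."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    X = np.column_stack([np.ones(n), x])
    Vinv = np.linalg.inv(V)
    b = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # Eq. 1.27
    r = y - X @ b
    s2 = (r @ Vinv @ r) / (n - 2)                          # Eq. 1.28
    return b, s2

# hypothetical correlated responses: neighbouring points share covariance 0.005
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 6.1, 7.9])
V = np.diag([0.01, 0.02, 0.02, 0.03]) + 0.005 * (np.eye(4, k=1) + np.eye(4, k=-1))
print(gls_fit(x, y, V))
```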

1.4.1.2 Regression techniques that consider the errors in both axes

In analytical chemistry there are cases in which the assumption made by the regression methods described in Section 1.4.1.1, namely that there is no error in the experimental measurement of the predictor variable, cannot be justified. In some cases this is due to the constant improvement in the precision of the results obtained with chemical analysis instruments, such as atomic absorption or emission spectrometry, [24] which means that the variance due to the errors in the measurements can often no longer be neglected with respect to the variance generated by the random errors committed in the preparation of the calibration standards. Another example can be found in the application of X-ray fluorescence to geological samples. [25] In this case, the complexity of the real samples means that the calibration standards are replaced by certified reference materials. Because of the errors committed in the measurement of the concentrations of these materials, the variances associated with these values are comparable to those generated in the instrumental measurement. The same problem arises in analytical techniques related to radiocarbon dating, [26,27] where the values of the calibration standards show variances due to the instability of these materials over time. Another field of analytical chemistry in which the variances due to the errors committed in the measurement of the predictor and response variables are similar is the comparison of the results of two analytical methods. [3]

If the least squares method is applied in cases such as those described above, neglecting the errors in the predictor variable causes the estimates of the regression coefficients to be affected by a bias [22] determined by a quantity known as the reliability factor, [7,10] which can be expressed as:

$\kappa = \dfrac{\sigma^2_{\xi}}{\sigma^2_{x}}$    (1.29)

where $\sigma^2_{\xi}$ and $\sigma^2_{x}$ are the variances of the true and of the observed values of the predictor variable respectively. For this reason a series of regression techniques have been developed that estimate the coefficients taking into account the errors committed in the measurement of both the predictor and the response variables.
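The attenuation of the OLS slope by the reliability factor of Eq. 1.29 can be reproduced numerically. The short sketch below simulates data in which the predictor is measured with a non-negligible error; all the numerical values (slope, variances, number of points) are arbitrary assumptions introduced only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
beta1 = 2.0
xi = np.linspace(0.0, 10.0, 200)              # true predictor values
sigma2_xi = np.var(xi)                        # variance of the true predictor values
sigma2_delta = 4.0                            # variance of the errors in x (assumed)

x = xi + rng.normal(0.0, np.sqrt(sigma2_delta), xi.size)   # predictor measured with error
y = 1.0 + beta1 * xi + rng.normal(0.0, 0.5, xi.size)        # response with small error

b1_ols = np.polyfit(x, y, 1)[0]                             # slope ignoring errors in x
kappa = sigma2_xi / (sigma2_xi + sigma2_delta)              # reliability factor, Eq. 1.29
print(b1_ols, kappa * beta1)    # the OLS slope is biased towards kappa * beta1
```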

Adcock [28,29] is considered to have been the first person to address seriously the problem of linear regression when measurement errors affect both the predictor and the response variables. The regression method he developed is known today as orthogonal regression (OR), because he assumed that the variances of the errors were equal. Later, Kummel [30] generalized Adcock's result to the case in which the ratio of the error variances is known, thus developing what is nowadays called the constant variance ratio approach (CVR). These two regression methods have been rediscovered many times in a wide variety of fields of knowledge, including chemometrics. [31-34] For this reason the orthogonal regression method is known by several names, such as orthogonal distance regression (ODR) [3] or total least squares (TLS). [35]

Although a large number of regression methods that consider the variances of the errors in the measurement of the predictor and response variables have been developed over the course of the twentieth century, all of them try to solve the problem by minimizing the distances, whether perpendicular or weighted, from the experimental points to the regression line. Below we present some of the regression methods of this kind most widely used in analytical chemistry, divided into two large groups: those that find the regression coefficients by a maximum likelihood criterion and those that do so by least squares.

Maximum likelihood estimation

Maximum likelihood estimation of linear regression models that consider the variances of the errors committed in the measurement of the predictor and response variables seeks the estimates of the regression coefficients ($b_0$ and $b_1$) with the maximum probability of being equal (i.e. of maximum likelihood) to the true values ($\beta_0$ and $\beta_1$). In this way the predicted values of the response variable ($\hat{y}_i$) are those with the maximum probability of being equal to the theoretical, but unknown, values ($\eta_i$).

Like the other regression techniques described in Section 1.4.1.1, the linear model with errors in the measurements assumes that the variables ξ and η are related by equation 1.1. However, these models assume that these two variables are not observable and that only the variables appearing in equations 1.2 and 1.3, which are affected by random errors, can be measured. Three types of models with errors in the measurements can thus be distinguished: [8]

- The functional model, which considers the true values of the predictor variable $\xi_i$ to be constants.

- The structural model, which assumes the $\xi_i$ to be independent and identically distributed random variables.

- The ultrastructural model, [8,36] in which the variables $\xi_i$ are as in the structural model but are not identically distributed and may have different means with a common variance.

Of these three types of linear models we will concentrate on the functional one, since it is the one that best fits the experimental conditions under which linear regression is applied in chemical analysis. This is so because, whether in calibration or in the comparison of analytical methods, the true values of the predictor variable $\xi_i$ are constants corresponding to the unknown values of the analyte concentration in each of the samples to be analysed.

Within maximum likelihood estimation of the functional model several cases can be distinguished: [8]

a) When the ratio of variances $\lambda = \sigma^2_y / \sigma^2_x$ is known.
b) When the reliability factor κ (Eq. 1.29) is known.
c) When the variance of the error committed in measuring the predictor variable ($\sigma^2_x$) is known.
d) When the variance of the error committed in measuring the response variable ($\sigma^2_y$) is known.
e) When both variances of the errors committed in measuring the predictor and response variables ($\sigma^2_x$ and $\sigma^2_y$) are known.
f) When the true value of the intercept ($\beta_0$) is known.

Of all these possibilities we will concentrate on cases a) and e), since they correspond to two of the regression methods most widely used in analytical chemistry, known as orthogonal regression (OR) and the constant variance ratio approach (CVR). Whereas the OR method simply considers λ = 1, the CVR method is more general and considers the ratio of variances $\lambda = \sigma^2_y / \sigma^2_x$ to be constant. In order to find the maximum likelihood estimates of the regression coefficients for a functional model, the likelihood function L, i.e. the joint density function of the true residual errors $\varepsilon_i$, which must be maximized, has first to be defined: [8]

$L \propto \dfrac{1}{\sigma_x^{2n}\,\lambda^{n/2}}\exp\!\left[-\dfrac{1}{2\sigma^2_{x}}\sum_{i=1}^{n}\left((x_i-\xi_i)^2 + \dfrac{(y_i-\beta_0-\beta_1\xi_i)^2}{\lambda}\right)\right]$    (1.30)

The symbol ∝ (proportional to) appears in this expression because the normalization constant has been omitted. To estimate the regression coefficients, the function L must be maximized by taking the partial derivatives with respect to the variables $\beta_0$, $\beta_1$, $\sigma^2_x$ and $\xi_1, \dots, \xi_n$ and setting them equal to zero. [8] Considering λ = 1 (orthogonal regression), and bearing in mind that $x_i - \xi_i$ is the error in the predictor (Eq. 1.2) and $y_i - \beta_0 - \beta_1\xi_i$ the error in the response (Eqs. 1.1 and 1.3), this is equivalent to minimizing the exponential part of equation 1.30, which can also be written as $\sum_{i=1}^{n}(\delta_i^2 + \tau_i^2)$. According to the Pythagorean theorem, this is the sum of the squared orthogonal distances from the experimental points to the orthogonal regression line. Figure 1.2 illustrates this situation for the experimental point $(x_i, y_i)$. [8]


Figure 1.2. Distance to be minimized in orthogonal regression (the experimental point $(x_i, y_i)$, its true counterpart $(\xi_i, \eta_i)$ and the true line $\eta = \beta_0 + \beta_1\xi$).

Once the expressions that maximize L have been found from the partial derivatives mentioned above, the variables corresponding to the regression coefficients $b_0$ and $b_1$ must be isolated. The resulting expressions are: [33]

$b_1 = \dfrac{s_{yy} - s_{xx} + \sqrt{(s_{yy}-s_{xx})^2 + 4\,s_{xy}^2}}{2\,s_{xy}}$    (1.31)

$b_0 = \bar{y} - b_1 \bar{x}$    (1.32)

where

$s_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2$    (1.33)

$s_{yy} = \sum_{i=1}^{n}(y_i-\bar{y})^2$    (1.34)

$s_{xy} = \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$    (1.35)
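The closed-form orthogonal regression solution of Eqs. 1.31 to 1.35 can be coded directly, as in the minimal sketch below; the data points are hypothetical.

```python
import numpy as np

def orthogonal_regression(x, y):
    """Orthogonal regression (lambda = 1), Eqs. 1.31-1.35."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxx = np.sum((x - x.mean()) ** 2)                      # Eq. 1.33
    syy = np.sum((y - y.mean()) ** 2)                      # Eq. 1.34
    sxy = np.sum((x - x.mean()) * (y - y.mean()))          # Eq. 1.35
    b1 = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)   # Eq. 1.31
    b0 = y.mean() - b1 * x.mean()                          # Eq. 1.32
    return b0, b1

print(orthogonal_regression([1, 2, 3, 4, 5], [1.2, 1.9, 3.2, 3.8, 5.1]))
```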

The maximum likelihood solutions for the regression coefficients of the functional model in the two cases considered (a and e) coincide with those obtained for the structural model. However, the maximum likelihood estimate of the experimental error $s^2$ for the functional model is not correct, although this problem was solved by Lindley. [37] In the remaining cases (b, c, d and f), maximum likelihood estimation under a functional model is not possible. In general, therefore, the existence and consistency of the estimates of the different regression parameters of the functional model from a maximum likelihood point of view are not guaranteed. This is because, when a functional model is assumed, the number of parameters ($\xi_i$) grows with the number of calibration samples. Since these regression techniques require the predictor variable to be modelled by a likelihood function, [38] the existence of consistent maximum likelihood estimates cannot be guaranteed in general for large calibration sets. [8] Moreover, there are many cases in which the experimental data are highly heteroscedastic and the estimates of the variances of the measurement errors can only be obtained by replicate analysis of the calibration samples. In these cases the value of λ is unknown and maximum likelihood estimation is therefore not possible for functional models. For this reason a univariate linear regression method must be sought that allows the regression coefficients to be estimated even if it is not through the maximum likelihood principle. As Lindley showed, [37] some methods based on the least squares principle give estimates identical to those of the maximum likelihood methods when a constant ratio of variances λ is assumed. For this reason we decided to use the BLS regression method, an iterative least squares regression method applicable to any set of experimental data without having to make assumptions about the distribution of the true values of the predictor variable ξ.

Least squares estimation

A wide variety of univariate linear regression methods based on the least squares principle are able to estimate the regression coefficients considering the heteroscedastic variances of the errors committed in the measurement of the different values of the predictor and response variables. [39-59] Among all these regression methods, the bivariate least squares method (BLS) developed by Lisý and co-workers [60] was found to be the most suitable because of the simplicity of programming its algorithm, the speed with which the different regression parameters are estimated and the ease with which the variance-covariance matrix is obtained. [61]

Like the methods based on the maximum likelihood principle, this method considers that there is a true linear relationship between the variables ξ and η (Eq. 1.1) and that there are errors in the measurement of both variables, as expressed in equations 1.2 and 1.3. Unlike other regression techniques, and taking these assumptions into account, the BLS method considers the variance associated with the observed residual error $e_i$ (Eq. 1.6) in order to find the estimates of the regression coefficients. This quantity is called the weighting factor ($s^2_{e_i}$) and takes into account the variances of the experimental errors ($s^2_{x_i}$ and $s^2_{y_i}$) committed when the variables $x_i$ and $y_i$ are measured repeatedly for each of the samples. The correlation (covariance) between the values of the variables $x_i$ and $y_i$ is also taken into account, although it is normally assumed to be zero.


$s^2_{e_i} = \operatorname{var}(y_i - b_0 - b_1 x_i) = s^2_{y_i} + b_1^2\, s^2_{x_i} - 2\,b_1 \operatorname{cov}(x_i, y_i)$    (1.36)

The BLS regression method finds the estimates of the regression coefficients by minimizing the sum of weighted squared residuals S according to the expression:

$S = \sum_{i=1}^{n}\left(\dfrac{(x_i-\hat{x}_i)^2}{s^2_{x_i}} + \dfrac{(y_i-\hat{y}_i)^2}{s^2_{y_i}}\right) = \sum_{i=1}^{n}\dfrac{e_i^2}{s^2_{e_i}} = s^2\,(n-2)$    (1.37)

where $s^2$ is the estimate of the experimental error. According to this equation, the BLS method assigns a larger weight to the pairs of data with the smallest values of $s^2_{x_i}$ and $s^2_{y_i}$, that is, to the most precise experimental values, for which the error in the experimental measurement should be smallest. Minimizing the sum of weighted squared residuals S leads to two non-linear equations which, in matrix notation, can be expressed as:

$\mathbf{D}\mathbf{b} = \mathbf{g}$    (1.38)

$\begin{pmatrix}\displaystyle\sum_{i=1}^{n}\frac{1}{s^2_{e_i}} & \displaystyle\sum_{i=1}^{n}\frac{x_i}{s^2_{e_i}}\\[2mm] \displaystyle\sum_{i=1}^{n}\frac{x_i}{s^2_{e_i}} & \displaystyle\sum_{i=1}^{n}\frac{x_i^2}{s^2_{e_i}}\end{pmatrix}\begin{pmatrix}b_0\\ b_1\end{pmatrix} = \begin{pmatrix}\displaystyle\sum_{i=1}^{n}\frac{y_i}{s^2_{e_i}} + \sum_{i=1}^{n}\frac{e_i^2}{2\,s^4_{e_i}}\frac{\partial s^2_{e_i}}{\partial b_0}\\[2mm] \displaystyle\sum_{i=1}^{n}\frac{x_i y_i}{s^2_{e_i}} + \sum_{i=1}^{n}\frac{e_i^2}{2\,s^4_{e_i}}\frac{\partial s^2_{e_i}}{\partial b_1}\end{pmatrix}$    (1.39)

The estimates of the regression coefficients in the vector b (Eq. 1.38) are calculated with an iterative procedure according to the following expression: [61]

$\mathbf{b} = \mathbf{D}^{-1}\mathbf{g}$    (1.40)

With this method, the variance-covariance matrix of the regression coefficients is estimated by multiplying the matrix $\mathbf{D}^{-1}$ resulting from the iterative process by the estimate of the experimental error $s^2$ (Eq. 1.37). When the variances of the experimental errors committed in the measurement of the predictor variable are zero, equation 1.36 reduces to equation 1.21 and the estimates of the regression coefficients given by the BLS method are therefore the same as those obtained with the WLS method. Furthermore, if the variances of all the experimental errors committed in the measurement of the response variable are constant and those of the predictor variables are zero, the weighting factor $s^2_{e_i}$ (Eq. 1.36) is constant and the estimates of the regression coefficients are the same as those obtained with the OLS method.
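A minimal sketch of the iterative BLS scheme of Eqs. 1.36 to 1.40 is given below. It assumes zero covariance between $x_i$ and $y_i$, uses OLS starting values, and works on hypothetical method-comparison data with arbitrary variances; it is an illustration of the scheme described above, not a reproduction of any published program.

```python
import numpy as np

def bls_fit(x, y, s2_x, s2_y, tol=1e-10, max_iter=100):
    """Bivariate least squares (BLS): iterative solution of Eqs. 1.36-1.40,
    assuming zero covariance between x_i and y_i."""
    x, y, s2_x, s2_y = (np.asarray(a, float) for a in (x, y, s2_x, s2_y))
    # ordinary least squares starting values
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    for _ in range(max_iter):
        s2_e = s2_y + b1 ** 2 * s2_x                    # weighting factor, Eq. 1.36
        e = y - b0 - b1 * x
        w = 1.0 / s2_e
        D = np.array([[np.sum(w), np.sum(w * x)],
                      [np.sum(w * x), np.sum(w * x ** 2)]])
        # right-hand side of Eq. 1.39: d(s2_e)/db0 = 0 and d(s2_e)/db1 = 2*b1*s2_x
        g = np.array([np.sum(w * y),
                      np.sum(w * x * y) + np.sum(e ** 2 * b1 * s2_x / s2_e ** 2)])
        b0_new, b1_new = np.linalg.solve(D, g)          # Eq. 1.40
        converged = abs(b0_new - b0) < tol and abs(b1_new - b1) < tol
        b0, b1 = b0_new, b1_new
        if converged:
            break
    s2_e = s2_y + b1 ** 2 * s2_x
    e = y - b0 - b1 * x
    s2 = np.sum(e ** 2 / s2_e) / (x.size - 2)           # experimental error, Eq. 1.37
    return b0, b1, s2

# hypothetical method-comparison data with uncertainties in both axes
x = np.array([1.0, 2.1, 3.0, 4.2, 5.1, 6.0])
y = np.array([1.1, 2.0, 3.2, 4.0, 5.3, 5.9])
s2_x = np.full(6, 0.02)
s2_y = np.full(6, 0.03)
print(bls_fit(x, y, s2_x, s2_y))
```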

In this way the BLS regression method minimizes the distances ($S_i$) between the experimental points and the regression line shown in the following figure.

Figure 1.3. Distances minimized by the BLS regression method (each experimental point $(x_i, y_i)$ is shown with its standard deviations $s_{x_i}$ and $s_{y_i}$ and its weighted distance $S_i$ to the regression line).

For each individual point, $S_i = \dfrac{(x_i-\hat{x}_i)^2}{s^2_{x_i}} + \dfrac{(y_i-\hat{y}_i)^2}{s^2_{y_i}}$ holds. The bold vertical and horizontal lines that appear in Figure 1.3, centred on the experimental points, correspond to twice the standard deviations of the errors committed in the measurement of the experimental values, $s_{x_i}$ and $s_{y_i}$. It is important to note that the regression coefficients estimated by the BLS method do not change when the axes are swapped.

An important point to bear in mind with the BLS regression method concerns the estimation of the variances of the experimental errors committed in the measurement of samples of different concentrations ($s^2_{x_i}$ and $s^2_{y_i}$). To obtain the best possible estimates of the regression coefficients when no previous estimates of the variances of the experimental errors are available, a sufficient number of replicates must be run for each of the samples. Even so, the estimates of the variances of the experimental errors may include sources of variation that have nothing to do with the random errors committed in the analysis of the samples. [62] This may be the case for replicates with different means, lack of homogeneity in the samples (as with geological samples) or interferences that may affect each of the methods being compared in a different way. Under these circumstances, the estimates of the regression coefficients given by the BLS method, as well as by the other regression methods that consider the errors committed in the experimental measurements, may be biased. The bias arises because these regression methods assume that the variability in the replicate analyses of the samples is due solely to random errors. [62] A source of variability is thus ignored which is present when the true values of the predictor and response variables ($\xi_i$ and $\eta_i$) do not follow a linear relationship (there is an error in the equation [7,8]) and the pairs of values ($\xi_i$, $\eta_i$) therefore do not fit a straight line perfectly. However, the use of linear models with an error in the equation is not common in analytical chemistry, because the instrumental responses usually obey a theoretical law (the Lambert-Beer law, the Nernst law, etc.). Moreover, in the comparison of analytical methods, since the two analytical methods measure the same samples, linear models with an error in the equation are not statistically justifiable. [63] Even so, users of regression methods that consider the estimates of the variances of the experimental errors must keep very much in mind the different sources of error that may affect the experimental measurements.

1.4.2 Multivariate linear regression

In multivariate linear regression the response variable (η) is related to several predictor variables ($\xi_k$, k = 1, ..., p) according to the expression:

$\eta_i = \beta_0 + \beta_1 \xi_{1i} + \beta_2 \xi_{2i} + \dots + \beta_p \xi_{pi}$    (1.41)

As in the case of univariate linear regression, the true response and predictor variables cannot be obtained experimentally because of the random errors in the measurements. The observed values of the response variable are thus given by equation 1.3, while those of the predictor variables can be expressed as:

$x_{1i} = \xi_{1i} + \delta_{1i}, \quad x_{2i} = \xi_{2i} + \delta_{2i}, \quad \dots, \quad x_{pi} = \xi_{pi} + \delta_{pi}$    (1.42)

The errors committed in the measurement of the predictor variables are distributed in the same way as in univariate regression. Substituting expressions 1.3 and 1.42 into equation 1.41 gives the equation of the multivariate regression model in terms of the measured values of the predictor and response variables: [10]

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + \varepsilon_i$    (1.43)

This is the equation of the true regression hyperplane in terms of the measured values of the predictor and response variables. The term $\varepsilon_i$ is the true residual error at point i, with $\varepsilon_i \sim N(0, \sigma^2_i)$, [17] and it can be expressed as a function of the variables $\tau_i$, $\beta_1, \dots, \beta_p$ and $\delta_{1i}, \dots, \delta_{pi}$:

$\varepsilon_i = \tau_i - \beta_1 \delta_{1i} - \beta_2 \delta_{2i} - \dots - \beta_p \delta_{pi}$    (1.44)

As in univariate linear regression, the aim of multivariate linear regression methods is to find estimates of the true regression coefficients such that the (p+1)-dimensional regression hyperplane (Eq. 1.45) fits the n experimental points as well as possible according to a given criterion.

$y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_p x_{pi} + e_i$    (1.45)

where the term $e_i$ is the residual error observed for point i $(x_{1i}, x_{2i}, \dots, x_{pi}, y_i)$. From the coefficients of the regression hyperplane, the predicted values of the response variable ($\hat{y}_i$) can be calculated according to the expression:

$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_p x_{pi}$    (1.46)

Given the scarce literature on multivariate linear regression of this kind, we have considered it appropriate to describe the two methods used in Chapter 6. These two multivariate regression methods are analogous to the OLS and BLS methods of univariate regression.

1.4.2.1 Multiple linear regression

Multiple linear regression (MLR) is the method most widely used in multivariate regression to find the coefficients of the regression hyperplane. In the same way as the OLS method in univariate linear regression, the regression coefficients estimated by the MLR method are those that minimize the sum of squared residuals, according to the expression:

$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_{1i} - \dots - b_p x_{pi})^2$    (1.47)

This multivariate regression method makes the same assumptions as those described for the OLS method, under which the estimated coefficients can be considered to be as close as possible to the unknown theoretical values, and MLR can therefore be regarded as a maximum likelihood method. Figure 1.4 shows the residual distances between the experimental points and the regression plane, minimized by MLR, in a three-dimensional space.

Figure 1.4. Multivariate regression by the MLR method (residual distances from the experimental points to the regression plane $y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + e_i$).

Because of the increased number of variables and, therefore, of regression coefficients to be estimated, matrix notation is widely used in multivariate linear regression owing to the simplification it brings to the mathematical expressions. In matrix form, the expression of equation 1.45 is identical to the one already seen for the OLS regression method (Eq. 1.7), with the difference that in this case the dimensions of the matrix X and of the regression vector b are n×(p+1) and (p+1)×1 respectively. Likewise, the mathematical expression for estimating the coefficients of the regression hyperplane is the same as the one presented for the OLS method (Eq. 1.16). In this case it should be stressed that, for the least squares estimate of the vector b to be unique, the columns of the matrix X must be independent. [4]
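As an illustration, the sketch below fits an MLR model with two predictor variables using the matrix form of Eq. 1.16 extended to the multivariate case; the data are hypothetical.

```python
import numpy as np

def mlr_fit(X, y):
    """Multiple linear regression (minimizes Eq. 1.47), matrix form of Eq. 1.16.
    X holds one predictor per column; a column of ones is added for b0."""
    Xa = np.column_stack([np.ones(len(y)), np.asarray(X, float)])
    b = np.linalg.solve(Xa.T @ Xa, Xa.T @ np.asarray(y, float))
    return b        # [b0, b1, ..., bp]

# hypothetical data: two predictors, five samples
X = np.array([[1.0, 0.5], [2.0, 0.9], [3.0, 1.4], [4.0, 2.1], [5.0, 2.4]])
y = np.array([3.1, 5.0, 7.2, 9.4, 10.9])
print(mlr_fit(X, y))
```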

One of the most important drawbacks of MLR arises when the values of the predictor variables in the matrix X are not mutually independent. This phenomenon is known as collinearity; it is common in most spectroscopic data (near infrared, ultraviolet-visible, etc.) and means that the least squares estimate of the vector b is not unique, since the matrix X^T X cannot be inverted. This problem has been solved with the multivariate calibration methods based on the decomposition of the matrix X into principal components or latent variables. Two examples of this type of technique are principal component regression (PCR) [6] and partial least squares (PLS). [4] In these two multivariate calibration methods MLR plays a fundamental role, since it is used to establish the multivariate regression model during the calibration stage, which then allows certain properties of unknown samples to be predicted.

1.4.2.2 Mnims quadrats multivariants

El mtode de mnims quadrats multivariants (multivariate least
squares, MLS) s un mtode de regressi multivariant anleg al mtode BLS
en regressi lineal univariant. El mtode MLS tamb es basa en el treball
desenvolupat per Lis i collaboradors
60
per estimar els coeficients de
lhiperpl de regressi tenint en compte les varincies degudes als errors
comesos en la mesura de les variables predictores ) ,..., , (
2 2 2
2 1
i
p
i i
x x x
s s s i
resposta ) (
2
i
y
s . De manera similar al mtode univariant BLS, en aquest cas,
els coeficients de regressi estimats sn aquells que minimitzen la suma del
quadrat de les distncies residuals ponderades en un espai tridimensional
(vegeu figura 1.5), segons lexpressi:

Captol 1. Introducci

38


= =
=
(
(

=
n
i
i i
e
n
i x x y
i i
y y
s s
x x
s
x x
s
y y
S
i
i
i i
i
i i
i
1
2
2
1
2
2
2 2
2
2
1 1
2
2
)

(
1
)

( )

(
)

(
2 1
(1.48)

on
2
i
e
s s el factor de ponderaci corresponent a la varincia del residual e
i

(eq. 1.45) del punt i ) , ,..., , (
2 1 i p
y x x x
i i i
i
i
y

sobt a partir de lequaci 1.46.


Anlogament al factor de ponderaci per BLS, t en compte les varincies
tant de les variables predictores com de la variable resposta, aix com la
covarincia entre els valors experimentals que se sol assumir igual a zero:



= + = = =
+ + =
p
k
p
k l
l k l k
p
k
k k
p
k
x k x e
i i i i
i
k
i
i
x x b b x x b s b s s
2 1 2
1
2
2 2 2 2
) , cov( 2 ) , cov( 2

1
(1.49)

2
x
1
x
y
) , , (
2 1
i i
x x y
i
) , , (
2 1
i i
x x y
i
) , , (
2 1
i i
x x y
i
) , , (
2 1
i i
x x y
i
) (
2 2
i i
x x
) (
1 1
i i
x x
) (
i i
y y
5 . 0
1
=
x
s
2
2
=
x
s
1 =
y
s
i
S

Figura 1.5. Distncia residual minimitzada pel mtode MLS.

1.4 Regressi lineal

39
Lestimaci de lerror experimental sobt dividint la suma de les distncies
residuals ponderades S (eq. 1.48) entre el nmero apropiat de graus de
llibertat.


p n
S
s

=
2
(1.50)

En minimitzar la suma de residuals ponderats al quadrat S respecte als
coeficients de regressi (b
k
, k=1...p), es generen p equacions no lineals, que
en notaci matricial poden ser expressades com:

\mathbf{D}\,\mathbf{b} = \mathbf{g}     (1.51)

on els elements de la matriu $\mathbf{D}$ ($p \times p$) i del vector $\mathbf{g}$ són, per a $k, l = 1, \ldots, p$:

D_{kl} = \sum_{i=1}^{n} \frac{x_{ki}\, x_{li}}{s^2_{e_i}}, \qquad g_k = \sum_{i=1}^{n} \frac{x_{ki}\, y_i}{s^2_{e_i}} + \frac{1}{2}\sum_{i=1}^{n} \frac{e_i^2}{s^4_{e_i}}\, \frac{\partial s^2_{e_i}}{\partial b_k}     (1.52)

El vector b amb els coeficients de regressi es pot estimar a travs dun
procediment iteratiu seguint lexpressi:

\mathbf{b} = \mathbf{D}^{-1}\,\mathbf{g}     (1.53)

Amb aquest mtode, la matriu de varincies-covarincies dels coeficients
de regressió es pot obtenir multiplicant la matriu $\mathbf{D}^{-1}$ final per l'estimació de l'error experimental $s^2$ (eq. 1.50). En el cas que les variàncies de tots els
errors experimentals comesos en la mesura de la variable resposta siguin
constants i nulles per totes les variables predictores, el valor del factor de
ponderació $s^2_{e_i}$ (eq. 1.49) serà constant i les estimacions dels coeficients de
regressi seran iguals a les obtingudes pel mtode MLR. Sha de destacar
que, tal com succeeix en el mtode BLS, el vector de regressi b estimat per
MLS no varia en intercanviar els eixos.

De forma anloga al mtode de regressi BLS, perqu els coeficients
de lhiperpl de regressi estimats amb el mtode MLS siguin correctes,
tamb es necessiten bones estimacions de les varincies dels errors aleatoris
comesos en les mesures experimentals. Per aquest motiu, els comentaris al
final de la secci 1.4.1.2 referents a lestimaci de les varincies dels errors
experimentals tamb sn vlids per al mtode de regressi MLS.
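
A tall d'orientació, un petit esbós en Python de l'esquema iteratiu de les eqs. 1.48–1.53 (suposant covariàncies nul·les en l'eq. 1.49, una estimació inicial obtinguda per MLR i noms de funcions i variables hipotètics) podria ser el següent:

```python
import numpy as np

def mls_fit(X, y, sx, sy, tol=1e-10, max_iter=100):
    """Esbós orientatiu del mètode MLS: regressió lineal amb errors en tots els eixos.

    X  : matriu (n, p) de variables predictores (columna d'uns inclosa si cal terme independent)
    y  : vector (n,) de la variable resposta
    sx : matriu (n, p) de desviacions estàndard dels predictors (zeros per a la columna d'uns)
    sy : vector (n,) de desviacions estàndard de la resposta
    Se suposen covariàncies nul·les (eq. 1.49 simplificada).
    """
    n, p = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]          # estimació inicial (MLR)
    for _ in range(max_iter):
        se2 = sy**2 + sx**2 @ b**2                    # factor de ponderació s_ei^2 (eq. 1.49)
        w = 1.0 / se2
        e = y - X @ b                                 # residuals e_i
        D = X.T @ (X * w[:, None])                    # D_kl = sum_i x_ki x_li / s_ei^2
        g = X.T @ (y * w) + b * ((e**2 * w**2) @ sx**2)   # vector g (eq. 1.52)
        b_nou = np.linalg.solve(D, g)                 # eq. 1.53
        if np.max(np.abs(b_nou - b)) < tol:
            b = b_nou
            break
        b = b_nou
    se2 = sy**2 + sx**2 @ b**2
    e = y - X @ b
    S = np.sum(e**2 / se2)                            # suma de residuals ponderats (eq. 1.48)
    s2 = S / (n - p)                                  # error experimental (eq. 1.50)
    D = X.T @ (X * (1.0 / se2)[:, None])
    cov_b = np.linalg.inv(D) * s2                     # variàncies-covariàncies dels coeficients
    return b, s2, cov_b
```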

1.5 Tests dhiptesi sobre els coeficients de regressi

Mitjanant laplicaci de tests dhiptesi sobre els coeficients de
regressió és possible detectar, per a un nivell de significança α, errors
sistemtics significatius en els valors dels coeficients de regressi respecte a
uns valors de referncia establerts.
3
Per fer aix inicialment es postulen
dues hiptesis:

1. La hiptesi nulla (H
0
) assumeix que els coeficients de regressi
estimats pertanyen a una distribuci centrada al voltant dun valor de
referncia, s a dir, que no existeixen diferncies significatives entre els
valors dels coeficients de regressi estimats i els de referncia per un nivell
de significança α.


1.5 Tests dhiptesi sobre els coeficients de regressi

41
2. La hiptesi alternativa (H
1
) assumeix que els coeficients de
regressi pertanyen a una distribuci centrada al voltant dun valor
esbiaixat i, per tant, que existeixen diferncies significatives per un nivell
de significança α entre els valors dels coeficients de regressió estimats i els
de referncia postulats per H
0
.

En cas dacceptar la hiptesi alternativa ser necessari revisar el
procediment analtic per identificar la font derror que afecta la mesura dels
valors experimentals. Tot i que laplicaci daquests tests estadstics est
fora estesa en calibraci lineal, s important tenir en compte que shan de
complir una srie de condicions perqu les conclusions extretes sobre les
hiptesi formulades tinguin significana estadstica.

1.5.1 Condicions daplicaci

Els procediments de mesura qumics, a diferncia de les mesures
fsiques, solen estar compostos de diverses etapes que normalment sn
independents. Per tant, la majoria dels resultats obtinguts per anlisis
qumiques acostumen a incorporar errors provinents de les diferents etapes
del procediment danlisi. Segons el teorema del lmit central,
7

independentment de la distribuci dels errors comesos en la mesura de les
variables x i y en les diverses etapes del procediment danlisi, la suma
daquests errors seguir una distribuci normal. Aix assegura que els
residuals vertaders
i
(eqs. 1.4, 1.5 i 1.44), igual que els coeficients de
regressi, estaran distributs normalment.
1
Per aquesta ra els tests
dhiptesi emprats amb els coeficients de regressi, aix com amb la resta
dels parmetres de regressi, assumeixen la hiptesi de normalitat.

En els mtodes de regressi que consideren els errors en tots els eixos,
i particularment en el mtode desenvolupat per Lis i collaboradors,
Kalantar
64
va demostrar que els coeficients de regressi no seguien una
distribuci normal. Per aquest motiu en el tercer captol sestudia el grau de
desviaci de la distribuci real dels coeficients de regressi BLS respecte a
una distribuci normal. En funci del grau de desviaci es determinar si
s raonable lassumpci de normalitat en els coeficients de regressi BLS
per poder aplicar tests dhiptesis.

1.5.2 Importncia de la falta dajust

En els tests estadstics basats en la regressi lineal s important
assegurar ladequaci dels valors experimentals al model de regressi. Si
els punts experimentals no estan prou ajustats a la recta de regressi, el
model lineal pot no ser vlid. En aquest cas, lerror experimental s
2
estar
sobreestimat i no donar una mesura correcta dels errors aleatoris presents
en els valors experimentals.
65
Per evitar aix, en el mtode de mnims
quadrats, se sol emprar el coeficient de correlaci, r (o el seu quadrat, el
coeficient de determinaci). Aquest coeficient, per, no s un parmetre
informatiu de la qualitat de lajust dels punts experimentals a la recta, ja
que no es tracta dun test estadstic.
66
Tot i que ls de grfics de residuals
augmenta la fiabilitat en la detecci de la falta dajust mitjanant aquest
coeficient, la forma ms correcta, tot i que costosa experimentalment, de
detectar la falta dajust s mitjanant un test de lanlisi de la varincia
(ANOVA).
1

Una de les conseqncies ms clares de la falta dajust es manifesta en
laplicaci de tests dhiptesi sobre els coeficients de regressi. La mida dels
intervals de confiana calculats per detectar diferncies significatives
respecte als valors terics s funci directa de lerror experimental s
2
. Per
aquest motiu, en cas dexistir falta dajust entre els punts experimentals i la
recta de regressi, el valor de s
2
estar sobreestimat i els intervals de
confiana seran ms grans.
67
Sota aquestes circumstncies, hi haur una
major probabilitat de confondre els possibles errors sistemtics comesos en
el procs de mesura de les mostres amb errors aleatoris.

1.5.3 Probabilitats derror de primera i segona espcie

s conegut que en els tests dhiptesi per poder acceptar o rebutjar la
hiptesi nulla (H
0
), s'ha de fixar un nivell de significança α que marca la probabilitat de rebutjar-la quan en realitat és la correcta. Aquest error és conegut amb el nom d'error de primera espècie o error de tipus α. D'altra banda, en el cas d'acceptar la hipòtesi nul·la (H0) quan en realitat la correcta és la hipòtesi alternativa (H1), cometrem un error de segona espècie, també conegut com a error de tipus β.3 La figura 1.6 mostra la probabilitat de cometre els errors α i β en el cas d'un test individual sobre l'ordenada de la recta de regressió.

Figura 1.6. Probabilitats de cometre errors α i β en un test individual sobre l'ordenada a l'origen (H0: β0 = 0 enfront de H1: β0 = 0 + Δ; $s_{b_0}$ és la desviació estàndard de $b_0$).

El biaix Δ és la diferència mínima (fixada per l'usuari) entre els valors
dels coeficients a partir dels quals es postulen H
0
i H
1
, que es vol detectar
com a error sistemtic. La taula 1.1 esquematitza les situacions en qu es
cometen aquests dos tipus derrors en els tests dhiptesi.

                              Conclusió mitjançant el test
                              H0 certa        H0 falsa
Situació real   H0 certa      Correcta        Error α
                H0 falsa      Error β         Correcta

Taula 1.1. Situacions en què es poden cometre errors α i β.

Tot i que tradicionalment sha donat ms importncia a les
probabilitats d'error α, hi ha casos on es necessita assegurar una probabilitat d'error β baixa. Aquest és el cas de la verificació de la
traabilitat
68
d'un mètode analític, on una probabilitat d'error β elevada
implica que hi ha moltes probabilitats que es pugui afirmar errniament
que un mtode analtic que dna resultats esbiaixats sigui traable al
mtode de referncia. Segons el tipus de mostres que shan danalitzar, ser
preferible fer un major esfor experimental i fer un major nombre de
rpliques en el procs de verificaci de la traabilitat per no arriscar-se a
emprar un mtode que pot donar resultats esbiaixats. Un exemple s el cas
dels estudis de bioequivalncia de dues drogues, en qu normalment s
ms important no acceptar errniament que lactivitat de dues drogues s
similar. Aix implica assegurar que lerror de segona espcie s baix.
69

Daltra banda, tamb hi ha casos en qu pot interessar assegurar que
la probabilitat d'error α sigui baixa. En estudis farmacològics, per exemple,
interessa que el risc de concloure errniament que una substncia actua
com una droga sigui mnim per evitar ls de substncies que no tenen cap
efecte teraputic.
70
Per aquest motiu sha suggerit que la postulaci de les
hiptesis nulla i alternativa es faci segons el tipus derror que sha de
controlar.
70

Tamb cal destacar que hi ha una relaci entre les probabilitats de
cometre un error de primera espcie i un de segona. En el cas daugmentar
la probabilitat d'error α, disminuirà la probabilitat d'acceptar erròniament la hipòtesi nul·la i, per tant, la probabilitat d'error β. Com es pot veure a la figura 1.6, les probabilitats d'error també depenen de la distància (biaix, Δ) entre el valor de referència (en aquest cas β0 = 0) i l'esbiaixat (en aquest cas β0 = 0 + Δ) i de la desviació estàndard del coeficient de regressió ($s_{b_0}$). Així doncs, per a una probabilitat d'error α i un biaix Δ fixats, existeix una tercera variable que relaciona les dues probabilitats d'error: el nombre de mostres de calibració. Augmentant-ne el nombre es pot disminuir la probabilitat d'error β, ja que d'aquesta manera és possible reduir la desviació estàndard del coeficient de regressió.
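
A tall d'exemple, la probabilitat d'error β d'un test individual bilateral sobre l'ordenada a l'origen es pot aproximar amb poques línies de Python (esbós basat en l'aproximació normal; a la pràctica caldria emprar la distribució t amb els graus de llibertat corresponents):

```python
from scipy import stats

def prob_error_beta(biaix, s_b0, alfa=0.05):
    """Probabilitat aproximada d'error beta per al test bilateral
    H0: beta0 = 0 contra H1: beta0 = biaix (aproximació normal)."""
    z = stats.norm.ppf(1 - alfa / 2)     # valor crític bilateral
    d = biaix / s_b0                     # biaix en unitats de s_b0
    # probabilitat d'acceptar H0 quan el valor vertader és 0 + biaix
    return stats.norm.cdf(z - d) - stats.norm.cdf(-z - d)

# en reduir s_b0 (més mostres de calibració), beta disminueix
for s in (1.0, 0.5, 0.25):
    print(s, round(prob_error_beta(biaix=1.0, s_b0=s), 3))
```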

1.6 Aplicacions de la regressi lineal considerant errors en tots
els eixos

Tot i que ls ms com de la regressi lineal en lanlisi qumica s la
calibraci de mtodes analtics, s a dir, lestabliment de la relaci
matemtica entre les respostes instrumentals i la concentraci de lanalit, hi
ha daltres aplicacions de gran importncia en el camp qumic.

1.6.1 Calibraci de mtodes analtics

Aquest s segurament ls ms conegut de la regressi lineal en el
camp de lanlisi qumica. En la majoria de casos, les varincies degudes als
errors en la mesura instrumental sn majors a les corresponents varincies
generades en la preparaci de les mostres de calibrat. No obstant aix, en
alguns casos, com el de laplicaci de la fluorescncia de raigs-X en mostres
geolgiques,
25
a causa de la complexitat de les mostres reals, els patrons de
calibrat són substituïts per materials de referència certificats. Els valors de
concentraci daquests materials van acompanyats per unes varincies
degudes als errors comesos en lanlisi de la mostra corresponent. Un altre
cas en qu s necessari considerar les varincies pels errors comesos en la
mesura de les concentracions dels patrons de calibrat, s en les tcniques
que utilitzen dataci per radiocarboni, ja que els patrons de calibraci
presenten una gran inestabilitat en el temps.

1.6.2 Comparaci de mtodes analtics

La comparaci de dos o ms mtodes analtics a diversos nivells de
concentraci pot fer-se mitjanant la regressi lineal. En aquest cas, es
busca comparar si els coeficients de la recta de regressi no sn
significativament diferents als valors terics que es trobarien si els dos
mtodes en comparaci donessin resultats idntics. Normalment els
resultats dels mtodes en comparaci presenten varincies degudes als
errors en les mesures, del mateix ordre de magnitud. Per aquest motiu les
tcniques de regressi que consideren les varincies dels errors comesos en
la mesura de les diferents mostres sn les ms adequades.
71

1.6.3 Predicci

En calibraci lineal letapa de predicci s molt important, ja que
sempra per trobar el valor de la concentraci de mostres desconegudes a
partir de la seva resposta instrumental. En el cas de la comparaci de
mtodes analtics, la predicci tamb t importncia perqu de vegades pot
ser interessant conixer el valor i la incertesa duna mostra analitzada per
un nou mètode a partir dels valors obtinguts del mètode ja establert.

1.7 Referncies

1.- Draper N., Smith H., Applied Regression Analysis, 2
nd
ed., John Wiley &
Sons: New York, 1981.
2.- Galton Sir Francis, Journal of the Anthropological Institute, 15 (1885) 246-
263.
3.- Massart D.L., Vandeginste B.M.G.,. Buydens L.M.C, de Jong S., Lewi
P.J., Smeyers-Verbeke J., Handbook of Chemometrics and Qualimetrics: Part A,
Elsevier: Amsterdam, 1997.
4.- Martens H., Ns T., Multivariate Calibration, Wiley: Chichester, 1989.
5.- Rawlings J.O., Applied Regression Analysis: A Research Tool, Wadsworth &
Brooks/Cole Advanced Books & Software: Belmont, 1988.
6.- Beebe K.R., Kowalski B.R., Analytical Chemistry, 59 (1987) 1007A-1017A.
7.- Fuller W.A., Measurement Error Models, John Wiley & Sons: New York,
1987.
8.- Cheng C.L., Van Ness J.W., Statistical Regression with Measurement Error,
Kendalls Library of Statistics 6, Arnold: London, 1999.
9.- Cheng C.L., Van Ness J.W., Journal of the Royal Statistical Society, Series B,
56 (1994) 167.
10.- Cheng C.L., Schneeweiss H., Journal of the Royal Statistical Society, Series
B, 60 (1998) 189.
11.- Chan L.K., Mak T.K., Journal of the Royal Statistical Society, Series B, 41
(1979) 263.
12.- Huwang L., Journal of Multivariate Analysis, 55 (1995) 230.
13.- Czapkiewicz A., Applicationes Mathematicae, 25 (1999) 401.
14.- Isogawa Y., Journal of the Royal Statistical Society, Series B, 47 (1985) 211.
15.- Edland S.D., Biometrics, 52 (1996) 243.
16.- Schaalje G.B., Butts R.A., Biometrics, 49 (1993) 1262.
17.- Sprent P., Models in Regression and related topics, Methuen & Co. Ltd.:
London, 1969.
18.- Mood A.M., Graybill F.A., Introduction to the Theory of Statistics,
McGraw-Hill: New York, 1963.
19.- Plackett R.L., Biometrika, 59 (1972) 239-251.
20.- Eisenhart C., Journal of the Washington Academy of Sciences, 54 (1964) 24.
21.- Myers R.H., Classical and Modern Regression with Applications, 2
nd
ed.,
Duxbury Press: Belmont, 1989.
22.- Irvin J.A., Quickenden T.I., Journal of Chemical Education, 60 (1983) 711-
712.
23.- Meloun M., Militk J., Forina M., Chemometrics for Analytical Chemistry
Volume 2. PC-aided Regression and Related Methods, Ellis Horwood: London,
1994.
24.- Speigelman C.H., Waters R.L., Hungwu L., Chemometrics and Intelligent
Laboratory Systems, 11 (1991) 121.
25.- Bennett H., Olivier G., XRF Analysis of Ceramics, Minerals and Allied
Materials, John Wiley & Sons: New York, 1992.
26.- Clark R.M., Journal of the Royal Statistical Society, Series A, 142 (1979) 47.
27.- Clark R.M., Journal of the Royal Statistical Society, Series A, 143 (1980) 177.
28.- Adcock R.J., Analyst, 4 (1877) 183-184.
29.- Adcock R.J., Analyst, 5 (1878) 53-54.
30.- Kummel C.H., Analyst, 6 (1879) 97-105.
31.- Anderson R.L., Practical Statistics for Analytical Chemists, Van Nostrand
Reinhold: New York, 1987.
32.- Creasy M.A., Journal of the Royal Statistical Society, Series B, 18 (1956) 65-
69.
33.- Mandel J., Journal of Quality and Technology, 16 (1984) 1-14.
34.- Hartmann C., Smeyers-Verbeke J., Penninckx W., Massart D.L.,
Analytica Chimica Acta, 338 (1997) 19-40.
35.- Van Huffel S., Vandewalle J., The Total Least Squares Problem.
Computational Aspects and Analysis, Siam: Philadelphia, 1991.
36.- Dolby G.R., Biometrika, 63 (1976) 39.
37.- Lindley D.V., Journal of the Royal Statistical Society / Series B, 9 (1947)
218-244.
38.- Schafer D.W., Puddy K.G., Biometrika, 83 (1996) 813-824.
39.- York D., Canadian Journal of Physics, 60 (1966) 1079 .
40.- Reed B.C., American Journal of Physics, 60 (1992) 59.
41.- Reed B.C., American Journal of Physics, 57 (1989) 642.
42.- Williamson J.A., Canadian Journal of Physics, 46 (1968) 1845.
43.- Asuero A.G., Gonzlez A.G., Microchemical Journal, 40 (1989) 216.
44.- Ogren P.J., Norton J.R., Journal of Chemical Education, 69 (1992) A130.
45.- Gonzlez A.G., Mrquez A., Fernndez J., Computers Chemistry, 16
(1992) 25.
46.- Neri F., Saitta G., Chiofalo S., Journal of Physics E, Scientific Instruments,
22 (1989) 215.
47.- Brooks C., Went I., Harre W., Journal of Geophysical Research, 73 (1968)
6071.
48.- Lwin T., Spiegelman C.H., Journal of the Royal Statistical Society C, 35
(1986) 256.
49.- Lybanon M., American Journal of Physics, 52 (1984) 22.
50.- Jefferys W.H., Astronomy Journal, 85 (1980) 177.
51.- Jefferys W.H., Astronomy Journal, 86 (1981) 149.
52.-. Britt H.I., Luecke R.H., Technometrics, 15 (1973) 233.
53.- Powell D.R., MacDonald J.R., Computers Journal, 15 (1972) 148.
54.- Powell D.R., MacDonald J.R., Computers Journal, 16 (1973) 51.
55.- Cumming G.L., Rollett J.S., Rossotti F.J.C., Whewell R.J., Journal of the
Chemical Society, Dalton Transactions, 23 (1972) 2652.
56.- Press W.H., Teukolsky S.A., Computational Physics, 6 (1992) 274.
57.- Clutton-Brock M., Technometrics, 9 (1967) 261.
58.- Barker D.R., Diana L.M., American Journal of Physics, 42 (1974) 224.
59.- Orear J., American Journal of Physics, 52 (1984) 278.
60.- Lisý J.M., Cholvadová A., Kutej J., Computers Chemistry, 14 (1990) 189-
192.
61.- Riu J., Rius F.X., Journal of Chemometrics, 9 (1995) 343-362.
62.- Carroll R.J., Ruppert D., American Statistician, 50 (1996) 1-6.
63.- Comunicaci personal del professor C.L. Cheng, Institut dEstadstica,
Acadmia Snica, Taipei, Taiwan, Repblica de China.
64.- Kalantar A.H., Gelb R.I., Alper J.S., Talanta, 42 (1995) 597-603.
65.- Analytical Methods Committee, Analyst, 119 (1994) 2363-2366.
66.- Hunter J.S., Journal Association of Official Analytical Chemists, 64 (1996)
574.
67.- Hahn G.J., Meeker W.Q., Statistical Intervals, a Guide for Practitioners,
John Wiley & Sons: New York, 1991.
68.- Gnzler H., Accreditation and Quality Assurance in Analytical Chemistry,
Springer-Verlag: Heidelberg, 1996.
69.- Steinijans V.W., Hauscke D., Clinical Research Regulatory Affairs, 10
(1993) 203.
70.- Hartmann C., Smeyers-Verbeke J., Pennickx W., Vander-Heyden Y.,
Vankeerberghen P., Massart D.L., Analytical Chemistry, 67 (1995) 4491.
71.- Riu J., Rius F.X., Analytical Chemistry, 68 (1996) 1851-1857.
CAPTOL 2
Falta dajust dels punts experimentals a la recta
de regressi que considera errors en els dos eixos

2.1 Objectiu del captol

Com sha indicat en el captol anterior, quan hi ha falta dajust dels
punts experimentals a la recta de regressi, el model lineal estimat pot no
ser vàlid i es poden obtenir valors sobreestimats de l'error experimental $s^2$.
Per tant, els intervals de confiana construts per detectar la presncia
derrors significatius en els coeficients de regressi seran ms grans.
1
Per
aquest motiu la probabilitat de no detectar la presncia de diferncies
significatives en els coeficients de regressi respecte als valors terics
(probabilitat de cometre un error β) augmentarà. Això pot significar, en el
cas de la comparaci de mtodes analtics, considerar com a correctes els
resultats dun mtode analtic alternatiu que realment no seria traable al
mtode de referncia.

Aquests motius justifiquen la necessitat de desenvolupar un test
estadstic per detectar la falta dajust dels punts experimentals a la recta de
regressi BLS, que considera les incerteses produdes pels errors comesos
en la mesura de les variables predictora i resposta. Laplicaci daquest test
estadstic abans de portar a terme qualsevol test dhiptesi sobre els
coeficients de regressi BLS permet fer-se una idea del grau de fiabilitat
que es pot esperar de les conclusions extretes a partir dels tests dhiptesis
posteriors. Com sindica a lapartat 2.2, cal tenir en compte que quan es fa
servir el mtode de regressi BLS, la falta dajust dels valors experimentals
no s tan fcil de detectar com ho pot ser pel mtode OLS. Aix es deu a
que en el mtode BLS la falta dajust no noms depn de la distncia dels
punts experimentals a la recta de regressi, sin que a ms tamb depn de
la magnitud de les incerteses generades pels errors comesos en la mesura
de les dues variables. Aquells punts amb incerteses ms grans tendiran a
tenir pitjor ajust a la recta de regressi, ja que la recta de regressi BLS els
dna menor importncia. No obstant aix, s probable que aquest tipus de
punts experimentals no contribueixin de manera important a lndex final
de la falta dajust. Daltra banda, aquells punts experimentals amb
incerteses individuals molt petites tendiran a tenir millor ajust a la recta de
regressi, ja que el mtode BLS dna ms importncia a aquest tipus de
valors experimentals, on sassumeix que els errors comesos en la mesura
sn ms petits. En aquests tipus de punts, una petita distncia respecte la
lnia de regressi, pot fer augmentar de manera significativa lndex de falta
dajust. La gran varietat de situacions que es poden donar quant a la
distribuci dels valors experimentals o a lestructura de les seves incerteses
individuals en conjunts de dades reals, fa que la identificaci de la falta
d'ajust pel mètode de regressió BLS sigui complexa.

A lapartat 2.2 es fa un recull de les aproximacions ms emprades
per detectar falta dajust dels punts experimentals a la recta de regressi
OLS. Tamb es presenten els resultats obtinguts per laplicaci dels tests
χ² i ANOVA desenvolupats pel mètode BLS en conjunts de dades simulats. A
lapartat 2.3 es presenta el gruix del treball tractat en aquest captol, com a
part de larticle Lack of fit in linear regression considering errors in both axes,
publicat en la revista Chemometrics and Intelligent Laboratory Systems.
Finalment a lapartat 2.4, es presenten les conclusions del captol.

2.2 Possibles aproximacions per la detecci de falta dajust

De les diferents aproximacions per detectar falta dajust dels valors
experimentals a la recta de regressi, es pot distingir entre el coeficient de
determinació r²,2 el coeficient de qualitat (quality coefficient, QC),3 l'anàlisi de la variància (analysis of variance, ANOVA)4 i el test χ²,5 sent la primera la més emprada. El coeficient de determinació r² es pot expressar com:


r^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}     (2.1)

on $\bar{y}$ és el valor mitjà de les n variables $y_i$. Aquesta variable mesura la variació, explicada per la recta de regressió, dels n valors $y_i$ al voltant del seu valor mitjà $\bar{y}$. Per tant, r² pot prendre valors entre 0 i 1 sempre que no
es considerin els valors de les rpliques a lhora destimar els coeficients de
regressi.
2
En cas contrari, r
2
no podr ser igual a 1, ja que cap model lineal,
per molt b que sajusti a les dades experimentals, pot explicar la variaci
en les mesures a causa de lerror experimental.
2
Daltra banda, diversos
autors
6-8
desaconsellen ls del coeficient de determinaci r
2
, perqu es
tracta dun ndex numric sense sentit estadstic, que a ms no t en compte
els errors experimentals comesos en la mesura de les diferents rpliques y
ij
.

Quant al coeficient de qualitat, tamb mesura el grau dajust dels
valors experimentals a la recta de regressi, segons lexpressi:


QC = 100\,\sqrt{\frac{\sum_{i=1}^{n} \left( \dfrac{y_i - \hat{y}_i}{y_i} \right)^2}{n}}     (2.2)

Aquest coeficient s una mesura de lerror que es pot cometre en predir
amb la recta de regressi les concentracions mesurades y
i
. Aix doncs, com
pitjor sigui lajust dels punts a la recta de regressi, ms gran ser QC.
Aquesta mesura de la falta dajust s preferible al coeficient de
determinaci r
2
, ja que proporciona una idea millor de la dispersi dels
valors experimentals i dna una indicaci de lerror que es pot cometre en
la predicci de les concentracions. Aquest coeficient tamb es pot utilitzar
en models ms complexes substituint el denominador de lequaci 2.2 per
n-p, on p s el nombre de coeficients del model.
4
Tot i aix aquesta
aproximaci per detectar falta dajust continua sense tenir una base
estadstica i en el cas de tenir rpliques, continua sense considerar lerror
experimental coms en la seva mesura.
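
Un petit esbós en Python del càlcul d'aquests dos índexs per a una recta ja ajustada, seguint les eqs. 2.1 i 2.2, podria ser:

```python
import numpy as np

def r2_i_qc(x, y, b0, b1):
    """Coeficient de determinació r2 (eq. 2.1) i coeficient de qualitat QC (eq. 2.2)
    per a la recta y = b0 + b1*x ja ajustada."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_pred = b0 + b1 * x
    r2 = np.sum((y_pred - y.mean())**2) / np.sum((y - y.mean())**2)
    qc = 100.0 * np.sqrt(np.mean(((y - y_pred) / y)**2))
    return r2, qc
```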

Pel que fa al test basat en lanlisi de la varincia, laplicaci al
mètode de regressió OLS requereix fer rèpliques de la variable resposta $y_i$ per cadascun dels valors de la variable predictora $x_i$, amb 1<i<n i 1<j<p_i, on $p_i$ és el nombre de repeticions en $x_i$. Aquesta aproximació
considera que la suma de residuals al quadrat SS
r
est formada per dues
parts:
2,4




SS_r = \sum_{i=1}^{n}\sum_{j=1}^{p_i} (y_{ij} - \hat{y}_i)^2 = \sum_{i=1}^{n}\sum_{j=1}^{p_i} (y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{n} p_i\,(\bar{y}_i - \hat{y}_i)^2     (2.3)


El primer terme (suma de quadrats degut a l'error pur, $SS_\varepsilon$) és una mesura de la variació dels valors de les $p_i$ rèpliques fetes sobre la mostra i ($y_{ij}$) respecte al seu valor mitjà $\bar{y}_i$. Per tant, dóna una idea de la incertesa associada als valors mitjans de la variable resposta $\bar{y}_i$, a causa dels errors experimentals comesos en la mesura de les rèpliques. D'altra banda, el segon terme (suma de quadrats degut a la falta d'ajust, $SS_{lof}$) mesura la variació dels valors mitjans de la variable resposta $\bar{y}_i$ al voltant de la recta de regressió. A la figura següent es representen gràficament aquests dos termes.
Figura 2.1. Descomposició de les distàncies residuals entre les rèpliques $y_{ij}$, el seu valor mitjà $\bar{y}_i$ i el valor predit $\hat{y}_i$ per la recta $\hat{y} = b_0 + b_1 x$.

Dividint els termes $SS_\varepsilon$ i $SS_{lof}$ pels graus de llibertat corresponents, s'obtenen els respectius valors mitjans $MS_\varepsilon$ i $MS_{lof}$:

MS_{lof} = \frac{SS_{lof}}{n-2}     (2.4)

MS_\varepsilon = \frac{SS_\varepsilon}{\sum_{i=1}^{n} (p_i - 1)}     (2.5)

Per detectar la falta d'ajust es calcula el quocient $F_{cal} = MS_{lof}/MS_\varepsilon$, que es compara, per a un nivell de significança α, amb el valor tabulat de la distribució F, $F_{1-\alpha,\, n-2,\, \sum_{i=1}^{n}(p_i-1)}$. Si el valor $F_{cal}$ és major que el valor tabulat, es podrà concloure que hi ha falta d'ajust, perquè la variació dels valors mitjans de la variable resposta $\bar{y}_i$ al voltant de la recta de regressió ($MS_{lof}$) no pot ser explicada per la variació deguda a l'error experimental pur ($MS_\varepsilon$) comès en la mesura de les rèpliques.
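
A tall d'il·lustració, aquest test es pot implementar en poques línies de Python per al cas OLS (esbós orientatiu en què la recta s'ajusta sobre totes les rèpliques individuals):

```python
import numpy as np
from scipy import stats

def test_falta_ajust_ols(x, y_rep, alfa=0.05):
    """Test F de falta d'ajust (eqs. 2.3-2.5) per a una recta OLS.
    x     : vector (n,) amb els nivells de la variable predictora
    y_rep : llista de n llistes amb les p_i rèpliques y_ij de cada nivell
    """
    x = np.asarray(x, dtype=float)
    y_mitja = np.array([np.mean(r) for r in y_rep])
    p = np.array([len(r) for r in y_rep])
    n = len(x)
    x_tot = np.repeat(x, p)                                   # totes les rèpliques
    y_tot = np.concatenate([np.asarray(r, float) for r in y_rep])
    b1, b0 = np.polyfit(x_tot, y_tot, 1)                      # recta OLS
    y_pred = b0 + b1 * x
    ss_pe = sum(np.sum((np.asarray(r, float) - m)**2) for r, m in zip(y_rep, y_mitja))
    ss_lof = np.sum(p * (y_mitja - y_pred)**2)
    ms_lof = ss_lof / (n - 2)                                 # eq. 2.4
    ms_pe = ss_pe / np.sum(p - 1)                             # eq. 2.5
    f_cal = ms_lof / ms_pe
    f_tab = stats.f.ppf(1 - alfa, n - 2, np.sum(p - 1))
    return f_cal, f_tab, f_cal > f_tab                        # True => falta d'ajust
```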

Quant al mètode de regressió WLS, l'Analytical Methods Committee
9
va
desenvolupar un test per detectar falta dajust basat tamb en lanlisi de la
varincia. Aquest s anleg al descrit pel mtode OLS, ja que fa servir les
mateixes equacions, per ponderant les distncies representades a la figura
2.1 per les variàncies individuals de la variable resposta ($s^2_{y_i}$). Pel que fa al
mtode de regressi BLS, atesa la seva equivalncia amb els mtodes dOLS
i WLS sota les condicions apropiades (apartat 1.4.1.2 de la Introducci), en
lapartat 2.3 daquest captol es presenten les expressions desenvolupades
per detectar la presncia de falta dajust dels valors experimentals a la recta
de regressi, basant-se en lanlisi de la varincia.

Daltra banda, tamb s possible detectar la falta dajust dels valors
experimentals a la recta de regressi obtinguda a partir del mtode OLS,
WLS o BLS mitjançant un test χ². Aquest test assumeix que l'estimació de l'error experimental $s^2$ és una variable aleatòria que es distribueix segons una distribució χ² amb n-2 graus de llibertat.5,9 Quan el valor de l'error experimental és superior al valor tabulat per un nivell de significança α, $\chi^2_{1-\alpha,\, n-2}$, es conclou que existeix falta d'ajust dels punts a la recta de regressió. No obstant això, la detecció de falta d'ajust mitjançant aquest test és considerada com aproximada, ja que l'assumpció que l'error experimental es distribueix segons una distribució χ² només és correcta quan el nombre de punts experimentals és elevat,10 condició que rarament es compleix en regressió lineal.
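
Un esbós mínim d'aquest test en Python, prenent com a estadístic la suma de residuals ponderats S = (n−2)·s², podria ser:

```python
from scipy import stats

def test_khi_quadrat(S, n, alfa=0.05):
    """Test khi-quadrat aproximat de falta d'ajust: S = (n-2)*s2 es compara
    amb el valor crític de la distribució khi-quadrat amb n-2 graus de llibertat."""
    valor_critic = stats.chi2.ppf(1 - alfa, n - 2)
    return S, valor_critic, S > valor_critic        # True => falta d'ajust
```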

Per comprovar la capacitat de detecció de falta d'ajust dels tests χ² i ANOVA adaptats al mètode de regressió BLS en diferents tipus de dades,
es faran servir conjunts de dades simulats. Aquests es generen mitjanant el
mtode de Monte Carlo,
11,12
tal com sexplica en la secci Validation Process
de lapartat 2.3 daquest captol. Sobre cada un daquests conjunts
sapliquen els dos tests estadstics per a un determinat nivell de
significança α. En el cas que es detecti falta d'ajust en un α% dels conjunts
de dades generats a partir dun conjunt inicial on se simula un ajust
perfecte de les dades experimentals a la recta de regressi, es podr
concloure que les expressions teriques desenvolupades sn correctes. La
figura segent representa de manera grfica el procs de validaci per un
conjunt de dades inicial heteroscedstic simulant un ajust perfecte dels
valors experimentals a la recta de regressi.

Figura 2.2. Procés de validació mitjançant el mètode de Monte Carlo: a partir del conjunt de dades inicial es generen 100.000 conjunts simulats i a cadascun s'hi apliquen el test F (α) i el test χ² (α) per decidir si hi ha falta d'ajust; el percentatge final de deteccions es compara amb α.
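
Un esquema simplificat d'aquest procés de validació en Python (amb noms de funcions i variables hipotètics i un nombre de simulacions reduït per raons pràctiques) podria ser:

```python
import numpy as np

rng = np.random.default_rng(0)

def validacio_monte_carlo(x0, y0, sx0, sy0, test, n_sim=10_000, n_rep=3, alfa=0.05):
    """Esquema del procés de la figura 2.2: es generen n_sim conjunts simulats a
    partir del conjunt inicial (x0, y0) i es compta el percentatge de vegades que
    el test estadístic (funció passada com a argument) detecta falta d'ajust."""
    deteccions = 0
    for _ in range(n_sim):
        # n_rep rèpliques de cada punt amb error aleatori normal
        x_rep = rng.normal(x0, sx0, size=(n_rep, len(x0)))
        y_rep = rng.normal(y0, sy0, size=(n_rep, len(y0)))
        xm, ym = x_rep.mean(axis=0), y_rep.mean(axis=0)
        sxm = x_rep.std(axis=0, ddof=1)      # incerteses estimades amb les rèpliques
        sym = y_rep.std(axis=0, ddof=1)
        if test(xm, ym, sxm, sym, alfa):     # True si es detecta falta d'ajust
            deteccions += 1
    return 100.0 * deteccions / n_sim        # hauria d'acostar-se a alfa*100
```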

Aquest procs de validaci es va aplicar sobre diferents tipus de
conjunts inicials descrits a la secci Experimental (Experimental Section) de
lapartat 2.3 daquest captol. La taula segent recull els resultats sobre
lexistncia de falta dajust obtinguts a partir de les expressions
desenvolupades pel mtode de regressi BLS, aix com de les ja existents
pel mtode de regressi WLS.
9
Per proporcionar ms claredat a la taula 1
noms es presenten els resultats obtinguts dels conjunts de dades amb 7
punts en el cas ms complex, s a dir, per conjunts inicials amb
heteroscedasticitat aleatria. Aquests conjunts es troben representats en la
figura 1 de l'apartat 2.3. A la columna Rèpliques es presenta el nombre de repeticions generades de cada punt del conjunt inicial, a partir de les quals es van calcular els nous valors mitjans dels punts ($x_i$, $y_i$) i les seves desviacions estàndard $s_{x_i}$ i $s_{y_i}$. Per estudiar l'efecte d'un nombre gran de rèpliques (representat pel símbol ∞), es van simular conjunts de dades en què les desviacions estàndard per a cada nou punt ($x_i$, $y_i$) eren iguals a les desviacions estàndard associades als punts en el conjunt de dades inicial.

α (%)   Rèpliques   Test F (BLS)   Test χ² (BLS)   Test F (WLS)   Test χ² (WLS)
 10         3           19.5           28.9            35.1           44.6
            6           14.7           17.8            21.0           24.9
            ∞            9.9           10.1            10.1           10.1
  5         3           13.0           21.7            26.8           36.9
            6            8.4           11.0            13.4           17.2
            ∞            4.9            4.9             4.9            5.0
  1         3            5.1           11.8            13.7           25.8
            6            2.5            4.1             5.3            8.4
            ∞            0.9            1.1             1.0            1.1

Taula 1. Percentatge de vegades en què es detecta falta d'ajust en els 100.000 conjunts de dades simulats, mitjançant els dos tests estadístics.

Com es pot observar a la taula 1, per als nombres de rpliques ms
baixos el percentatge de conjunts de dades simulats en qu es va detectar
falta dajust, tant amb els tests desenvolupats per BLS com amb els ja
coneguts per WLS,
9
és superior al nivell de significança α fixat en cada cas
perqu el nombre de rpliques s insuficient per donar estimacions
correctes de les incerteses del conjunt inicial. Aix doncs, en aquells punts
on les incerteses estimades (variaci deguda a lerror experimental pur)
siguin inferiors a les incerteses inicials (vertaderes) ser ms difcil explicar,
segons el test F, la variació dels valors mitjans ($x_i$, $y_i$) al voltant de la recta
de regressi. Aquests punts amb incerteses subestimades (amb una major
probabilitat de donar-se com menor sigui el nombre de rpliques) tindran
una contribuci molt important a la variable F
cal
i, per tant, faran
augmentar les probabilitats de detectar falta dajust. s per aix que, com
reflecteix la taula 1, el nombre de rpliques generades per estimar els nous
punts simulats i les seves incerteses individuals s fonamental per a la
detecci correcta de la falta dajust. En aquesta taula es pot veure que el
mtode WLS s ms sensible a aquest efecte que no pas el mtode BLS.
Això és degut al fet que, mentre que pel mètode WLS amb les rèpliques generades s'estima només una desviació estàndard ($s_{y_i}$) per punt, pel mètode BLS se n'estimen dues ($s_{x_i}$ i $s_{y_i}$). Per tant, és més probable que, en estimar una sola desviació estàndard, aquesta pugui estar subestimada (cas WLS) que no pas que ho estiguin les dues a la vegada (cas BLS).

D'altra banda, també s'ha de destacar que els resultats del test χ² en la taula 1 són sempre superiors als obtinguts pel test F pels diferents nivells de significança α. Això demostra que el test χ² detecta erròniament falta
dajust en ms casos que el test F, quan el nombre de rpliques a partir del
qual sestimen les varincies dels nous punts simulats s limitat. Aquest
resultat és degut al fet que, com ja s'ha dit anteriorment, l'assumpció que l'error experimental $s^2$ es distribueix segons una distribució χ² només és correcta quan el nombre de punts experimentals és prou elevat.10 A més, igual que en el test F, els resultats obtinguts del test χ² pel mètode de regressió WLS són sempre superiors als del mètode BLS. Com s'ha explicat
anteriorment, el mtode WLS necessita lestimaci duna sola desviaci
estndard per punt, i per tant, s ms probable que aquesta pugui estar
subestimada que no pas ho estiguin dues a la vegada en el cas del mtode
BLS. Aix significa que pel mtode WLS la suma de residuals ponderats i,
per tant, l'error experimental $s^2$ tindran una major probabilitat d'estar sobreestimats. Per aquesta raó serà més probable que pel mètode WLS se superi el valor crític $\chi^2_{1-\alpha,\, n-2}$ per un nivell de significança α i, en conseqüència, que es detecti falta d'ajust.

Finalment, quan sassocia als nous punts generats les mateixes
varincies que als punts del conjunt inicial (se simula un gran nombre de
rèpliques, símbol ∞), el percentatge de casos en què es detecta falta d'ajust és aproximadament igual al nivell de significança α fixat en cada cas. Això demostra que si es tenen bones estimacions dels valors experimentals ($x_i$, $y_i$) i de les incerteses individuals, les expressions desenvolupades per detectar falta d'ajust pel mètode BLS, igual que les ja conegudes pel mètode WLS,9 són correctes.



2.3 Lack of fit in linear regression considering errors in both
axes (Chemometrics and Intelligent Laboratory Systems, 54
(2000) 61-73).

Àngel Martínez*, Jordi Riu, F. Xavier Rius

Department of Analytical and Organic Chemistry.
Institute of Advanced Studies. Universitat Rovira i Virgili.
Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.

ABSTRACT

Testing for lack of fit of the experimental points to the regression
line is an important step in linear regression. When lack of fit exists,
standard deviations for both regression line coefficients are overestimated
and this gives rise, for instance, to confidence intervals that are too large. If
these confidence intervals are then used in hypothesis tests, bias may not be
detected so there is a greater probability of committing a β error. In this
paper we present a statistical test which analyses the variance of the
residuals from the regression line whenever the data to be handled have
errors in both axes. The theoretical expressions developed were validated
by applying the Monte Carlo simulation method to two real and nine
simulated data sets. Two other real data sets were used to provide
examples of application.

INTRODUCTION

Linear regression has two fundamental uses in analytical chemistry:
it relates the instrumental responses to the analyte concentration (i.e. it
establishes the calibration line within the quantitative analytical process)
and compares analytical methodologies over a set concentration range. For
some analytical methods, such as X-ray fluorescence (XRF), certified
reference materials (CRM) are often used as calibration standards because
real samples (i.e. geological materials) [1] are too complex. For this reason
uncertainties are associated to both CRM concentration values and
instrumental responses (predictor and response variables) and thus placed
on the x and y axis respectively. In method comparison studies replicate
measurements of a set of samples containing the analyte of interest at
different concentration levels, are carried out by the two methods to be
compared. Results can be placed in both axes with their respective
uncertainties and regressed on each other. In this way, bias in the method
being tested can be detected, for instance, by using the joint confidence
interval test for the slope and the intercept of the regression line which was
obtained considering the errors in both axes [2].

Ordinary least-squares (OLS), or weighted least-squares (WLS)
which considers heteroscedasticity in the response variable, are probably
the most widely used regression techniques. However, they are of limited
scope because they consider that the x axis is free of error. For this reason,
OLS and WLS should not be applied in the cases described above since the
uncertainties associated to the results in both axes are habitually of the
same order of magnitude. An alternative may be the errors-in-variables
regression [3], also called constant variance ratio (CVR) approach [4-6]. This
regression method considers the errors in both axes but does not take into
account the individual uncertainties of each experimental point. It also
concludes that the ratio of the variances of the response and predictor
variables is constant for every experimental point (=s
y
2
/s
x
2
). A particular
case is the orthogonal regression method (OR) [7], in which the errors are of
the same order of magnitude in the response and in the predictor variables
(i.e. =1). Bivariate least squares (BLS) regression techniques [8,9],

are
another option because they take into account individual non-constant
errors in both axes to calculate the regression coefficients.

It is essential to check whether lack of fit exists before a statistical
hypothesis test is applied to the regression line coefficients. Confidence
intervals calculated with data which contain lack of fit lead to oversized
regions [10]. The use of these confidence regions in any statistical
hypothesis test may for the BLS regression method, not allow the detection
of constant or proportional bias in the calibration line [11], or in the case of
method comparison studies, may lead to wrongly considering the results
from an alternative analytical method as unbiased (i.e. a β error) [12]. To
prevent these misleading situations in OLS and WLS regression techniques,
residual plots can be used together with statistical tests to detect lack of fit
[13,14]. For this reason, in this paper we present a statistical test which is
adapted to detect lack of fit under BLS regression conditions. This test is
based on the analysis of the variance of the residuals from the regression
line (Anova) obtained when errors in both axes are taken into account.

Two real data sets were used in the validation process to check if the
conclusions reached about the existence of lack of fit under OLS, WLS and
BLS regression conditions were similar. Simulated data sets randomly
generated from nine different initial data sets using the Monte Carlo
method [15,16], were also used to validate the theoretical expressions
proposed. In addition, two more real data sets were considered to provide
real examples of detecting lack of fit in experimental data which have
errors in both axes, using the test based on the analysis of the variance of
the residuals.






BACKGROUND AND THEORY

Notation

In general, the true values of the variables used throughout this study are represented with Greek characters, and their estimates are represented with Latin ones. Thus the true values of the BLS regression coefficients are written $\beta_0$ (intercept) and $\beta_1$ (slope), while their respective estimates are written as $b_0$ and $b_1$. The estimates of the standard deviation of the intercept and the slope for the BLS regression line are written as $s_{b_0}$ and $s_{b_1}$ respectively. The true experimental error (residual mean square error), expressed in terms of variance for the n experimental data pairs $(x_i, y_i)$, is referred to as $\sigma^2$, while its estimate is $s^2$. Values $x_i$ and $y_i$ of each experimental data pair are the mean values of the $p_i$ replicate measurements of the ith sample $x_{ij}$ and $y_{ij}$ (1<j<p_i) by both methods. Predictions of the experimental mean values $x_i$ and $y_i$ are symbolised as $\hat{x}_i$ and $\hat{y}_i$.

Bivariate Least-Squares Regression (BLS)

Of all the least squares approaches for calculating the regression
coefficients when there are errors in both axes, Lisý's method [8] (referred to as BLS) was found to be the most suitable [9]. This technique assumes the true linear model to be:

\eta_i = \beta_0 + \beta_1 \xi_i     (1)

The true variables $\xi_i$ and $\eta_i$ are unobservable. Only the experimental variables can be observed:

x_i = \xi_i + \delta_i     (2)

y_i = \eta_i + \tau_i     (3)

The random errors committed in the measurement of variables $x_i$ and $y_i$ are represented by variables $\delta_i$ and $\tau_i$, where $\delta_i \sim N(0, \sigma^2_{x_i})$ and $\tau_i \sim N(0, \sigma^2_{y_i})$. In this way, when eqs. 2 and 3 are introduced in eq. 1 and the variable $y_i$ is isolated, the following expression is obtained:

y_i = \beta_0 + \beta_1 x_i + \varepsilon_i     (4)

The term $\varepsilon_i$ is the ith true residual error with $\varepsilon_i \sim N(0, \sigma^2_i)$ [17] and can be expressed as a function of $\tau_i$, $\delta_i$ and $\beta_1$:

\varepsilon_i = \tau_i - \beta_1 \delta_i     (5)

To estimate the regression line coefficients whenever there are errors in both variables, several authors have developed procedures based on a maximum likelihood approach [3,18-20]. In most cases these methods

To estimate the regression line coefficients whenever there are
errors in both variables, several authors have developed procedures based
on a maximum likelihood approach [3,18-20]. In most cases these methods
need the true predictor variable to be carefully modelled [18]. This is not
usually possible in chemical analysis, where the true predictor variables
i

are not often randomly distributed (i.e. functional models are assumed).
Moreover there are cases in which the experimental data is heteroscedastic
and estimates of measurement errors are only available through replicate
measurements (i.e. the ratio
i i
y x
can be non-constant or unknown).
These conditions, common in chemical data, make it very difficult to
rigorously apply the principle of maximum likelihood to the estimation of
the regression line coefficients. On the other hand, there is a method to
estimate the regression coefficients using a maximum likelihood approach
even when a functional model is assumed [17]. This method is not
rigorously applicable when individual heteroscedastic measurement errors
are considered. It has been shown that when it is assumed that
i i
y x
=
for any i, least squares methods provide the same estimates of the
regression coefficients as maximum likelihood estimation approaches [21].
For these reasons, we have chosen an iterative least squares method (i.e. the
BLS method) that can be applied to any group of ordered pairs of
observations with no assumptions about the probability distributions [21].
This means that this method can be applied to real chemical data when
individual heteroscedastic errors in both axes are considered. In this way,
the BLS regression method relates the observed variables x
i
and y
i
as
follows [22]:


i i i
e x b b y + + =
1 0
(6)

The term $e_i$ is the observed ith residual error. The variance of $e_i$ is $s^2_{e_i}$ and
will be referred to as the weighting factor. This parameter takes into
consideration the experimental variances of any individual point in both
axes ($s^2_{x_i}$ and $s^2_{y_i}$) obtained from replicate analysis. The covariance between
the variables for each (x
i
,y
i
) data pair, which is normally assumed to be
zero, is also taken into account:


) ( cov 2 ) var(
1
2 2
1
2
1 0
2
i i x y i i e
y x b s b s x b b y s
i i i
+ = =

(7)

The BLS regression method finds the estimates of the regression line
coefficients by minimising the sum of the weighted residuals, S which is
known to follow a
2
distribution with n-2 degrees of freedom [23]:

S = \sum_{i=1}^{n} \left[ \frac{(y_i - \hat{y}_i)^2}{s^2_{y_i}} + \frac{(x_i - \hat{x}_i)^2}{s^2_{x_i}} \right] = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s^2_{e_i}} = (n-2)\, s^2     (8)

where $s^2$ is the estimate of the residual mean squared error, also known as experimental error. Therefore the BLS regression technique assigns less importance to those data pairs with larger $s^2_{x_i}$ and $s^2_{y_i}$ values, that is to say,
the most imprecise data pairs. By minimising the sum of the weighted
residuals (eq. 8), two non-linear equations are obtained, from which the
regression coefficients b
0
and b
1
can be estimated by means of an iterative
process [2].

It should be noted that the BLS regression method is equivalent to
the WLS and OLS methods in the appropriate regression conditions [2].
Thus, when uncertainty is only available for the experimental values on the
y-axis, estimates of the BLS regression line coefficients are the same as those
estimated with the WLS regression technique. This is because the BLS
weighting factor (eq. 7) reduces to the one assumed by the WLS method,
that is $s^2_{e_i} = s^2_{y_i}$, when the uncertainties in the x-axis are zero [14]. On the
other hand, when null and constant uncertainties (homoscedasticity) are
considered in the x and y axes respectively (OLS regression conditions), the
BLS weighting factor in eq. 7 becomes a constant, which means that the
sum of the weighted residuals (eq. 8) minimized by BLS becomes identical
to the one assumed by the OLS regression method. For this reason, the
estimates of the BLS regression line coefficients under homoscedastic
conditions are the same as the ones from the OLS regression method.

Lack of fit

Lack of fit of the experimental points to the regression line under
BLS regression conditions may not be as easy to recognise as it may be
under other regression methods that do not account for individual
uncertainties in the experimental data (e.g. OLS). This is because under BLS
conditions lack of fit not only depends on the distance of the experimental
points from the regression line but also on the magnitude of the
uncertainties on both axes for each individual data pair. So, data pairs with
larger uncertainties will tend to be further from the BLS regression line
since this regression method does not give much importance to low-
precision data pairs. This kind of situation, however, may not make an
important contribution to the overall lack of fit index. On the contrary, data
pairs with low individual uncertainties should lie near the BLS regression
line. This is because the BLS method gives more importance to high-
precision data pairs, that are supposed to have lower measurement errors.
In these cases, low deviations from the regression line may make an
important contribution to the overall lack of fit index. The wide variety of
situations that can arise when real heteroscedastic data is used, make it
very difficult to observe the existence of lack of fit and identify its possible
causes (dispersion of the data, outliers or non-linearities). This means that
a lack of fit test is required that can be used under BLS regression
conditions.

When no lack of fit exists in the BLS regression line, the observed
linear model (eq. 6) can be assumed to be correct and the weighted
residuals can be assumed to follow a normal distribution with mean 0 and
standard deviation $\sigma_i$. If, however, lack of fit is present, the regression


model may not be correct and the residual mean square error $s^2$ (eq. 8) will tend to be overestimated and may not provide a correct measure of the random variation present in the experimental data pairs [12]. In a work by Williamson [23], goodness of fit was tested when errors were considered to be present in both axes by applying a χ² test on the residual mean square error estimate $s^2$. This is a random variable that can be approximated by a χ² distribution with n-2 degrees of freedom. However, this is regarded as a
rough test for detecting lack of fit, and the test based on the analysis of the
residual variance [13] is habitually preferred. This is because a chi-squared
distribution is justified by the asymptotic theory only in large samples [24].
This condition is not usually met in linear regression, where the number of
samples is limited.

As well as using a statistical test for detecting lack of fit, it is also
advisable to take a look at the plot of the weighted residuals from the BLS
regression line [25]. This plot provides a view of the individual residual of
each experimental point corrected by the corresponding weighting factor
(eq. 7). So it gives a better view of the data structure and hence of the
possible causes of lack of fit of the experimental points to the regression
line (low-precision data, outliers or non-linearities).

Test for detecting lack of fit under BLS conditions

This lack of fit test is based on an analysis of the variance of the
residuals (Anova). Given the equivalence between the BLS, WLS and OLS
regression methods under the appropriate experimental conditions, the
expressions for the lack of fit test developed for the BLS method should be
analogous to those for the OLS [14] and WLS [13] regression techniques. In
this way, the variation of n
i
x and
i
y (1<i<n) group means around the
regression line (sum of squares due to lack of fit, SS
lof
) is compared with the
variance of the n data pairs due to pure experimental uncertainty (sum of
squares due to pure error, SS

), generated by p
i
replicate measurements on
each sample. These two sources of variation are included in the residual
sum of squares from regression, or total sum of squares (SS
r
) [14]. The new
expressions take into account the residuals in both x and y axes to evaluate
SS
r
according to the next equation:

SS_r = \sum_{i=1}^{n}\sum_{j=1}^{p_i} \left[ \frac{(y_{ij} - \hat{y}_i)^2}{s^2_{y_i}} + \frac{(x_{ij} - \hat{x}_i)^2}{s^2_{x_i}} \right]     (9)

This expression provides the sum of the weighted squared distances between each single replicate measurement $x_{ij}$ or $y_{ij}$ and the corresponding predicted mean value $\hat{x}_i$ or $\hat{y}_i$. Analogously to the sum of weighted residuals S, the variable $SS_r$ can also be assumed to follow a χ² distribution with $\sum_{i=1}^{n} p_i - 2$ degrees of freedom, since the only difference between equations 8 and 9 are the replicate measurements $x_{ij}$ and $y_{ij}$, that can be assumed to be normally distributed around the mean values $x_i$ and $y_i$ [26]. The residual sum of squares from regression accounts for both the lack of fit of the experimental mean values $x_i$ and $y_i$ around the BLS regression line and the dispersion of the $p_i$ replicate measurements around

The sum of squares due to lack of fit ($SS_{lof}$) included in the total sum of squares in eq. 9 can be expressed as:

SS_{lof} = \sum_{i=1}^{n} p_i \left[ \frac{(y_i - \hat{y}_i)^2}{s^2_{y_i}} + \frac{(x_i - \hat{x}_i)^2}{s^2_{x_i}} \right]     (10)

This equation gives the sum of the weighted squared distances between the
experimental and the predicted mean values in both axes, which is due to
lack of fit of the data pairs around the BLS regression line. Since eq. 10 is
very similar to eq. 8, it is clear that the variable $SS_{lof}$, like the sum of weighted residuals S, follows a χ² distribution, with n-2 degrees of
freedom. It should be noted that a data pair with a high number of
replicates is likely to have lower individual uncertainties, and thus show a
better fit to the BLS regression line than another data pair with the same
experimental mean value obtained with a lower number of replicates. To
offset this effect, the term p
i
in eq. 10 gives greater importance to the
residuals of those data pairs with a higher number of replicates. Finally, the
sum of squares due to pure error (SS

) accounted by eq 9. can be calculated


by subtracting the sum of squares due to lack of fit in eq. 10:

SS

= SS
r
- SS
lof
(11)

Because variables $SS_r$ and $SS_{lof}$ follow a χ² distribution, it is clear from eq. 11 that the sum of squares due to pure error, $SS_\varepsilon$, also has a χ² distribution with $\sum_{i=1}^{n}(p_i - 1)$ degrees of freedom. An F-test can be used to compare the sums of squares $SS_{lof}$ and $SS_\varepsilon$ because they both follow a χ² distribution, with n-2 and $\sum_{i=1}^{n}(p_i - 1)$ degrees of freedom respectively [26]. To apply the F-test these two variables (eqs. 10 and 11) first have to be divided by the appropriate degrees of freedom:

MS_{lof} = \frac{SS_{lof}}{n-2}     (12)

MS_\varepsilon = \frac{SS_\varepsilon}{\sum_{i=1}^{n}(p_i - 1)}     (13)

The F-ratio is therefore given by:

F_{cal} = \frac{MS_{lof}}{MS_\varepsilon}     (14)

If no lack of fit exists, $F_{cal}$ can be expected to be a random variable drawn from an $F_{n-2,\, \sum_{i=1}^{n}(p_i-1)}$ distribution. In this case, $F_{cal}$ will be lower than the corresponding $F_{1-\alpha,\, n-2,\, \sum_{i=1}^{n}(p_i-1)}$ tabulated value for a given level of significance α. It should be pointed out that correctly estimating the individual uncertainties $s^2_{x_i}$ and $s^2_{y_i}$ is very important. If the uncertainties of the points are extremely low, the regression line will tend to perfectly fit these points. However, very slight deviations from the regression line may cause lack of fit to be detected in the data set. This is because the terms $s^2_{x_i}$ and $s^2_{y_i}$ that appear in the denominator of eqs. 9 and 10 make the $F_{cal}$ value in eq. 14 very sensitive to small deviations of abnormally high-precision data pairs from the regression line. Although these situations are not frequent in experimental data, one should pay special attention to the fact that repeated measurements should comprise all the experimental variability of the measurement.
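
As an illustration, a minimal Python sketch of eqs. 9-14 could take the following form; it assumes zero covariance between the errors in both axes, estimates the individual variances from the replicate measurements, and obtains the predicted values x̂_i and ŷ_i as the weighted projection of each mean data pair onto the BLS regression line:

```python
import numpy as np
from scipy import stats

def bls_lack_of_fit(x_reps, y_reps, b0, b1, alpha=0.05):
    """Sketch of the Anova-based lack-of-fit test under BLS conditions (eqs. 9-14).
    x_reps, y_reps : lists of n arrays with the p_i replicate measurements of each sample
    b0, b1         : BLS regression coefficients already estimated from the mean values
    """
    n = len(x_reps)
    p = np.array([len(r) for r in x_reps])
    x = np.array([np.mean(r) for r in x_reps])
    y = np.array([np.mean(r) for r in y_reps])
    sx2 = np.array([np.var(r, ddof=1) for r in x_reps])     # s2_xi from replicates
    sy2 = np.array([np.var(r, ddof=1) for r in y_reps])     # s2_yi from replicates
    e = y - b0 - b1 * x
    se2 = sy2 + b1**2 * sx2                                  # weighting factor (eq. 7)
    x_hat = x + b1 * sx2 * e / se2                           # weighted projection of each
    y_hat = b0 + b1 * x_hat                                  # mean data pair onto the line
    ss_lof = np.sum(p * ((y - y_hat)**2 / sy2 + (x - x_hat)**2 / sx2))   # eq. 10
    ss_r = 0.0                                               # eq. 9
    for xr, yr, xh, yh, vx, vy in zip(x_reps, y_reps, x_hat, y_hat, sx2, sy2):
        xr, yr = np.asarray(xr, float), np.asarray(yr, float)
        ss_r += np.sum((yr - yh)**2 / vy + (xr - xh)**2 / vx)
    ss_pe = ss_r - ss_lof                                    # eq. 11
    ms_lof = ss_lof / (n - 2)                                # eq. 12
    ms_pe = ss_pe / np.sum(p - 1)                            # eq. 13
    f_cal = ms_lof / ms_pe                                   # eq. 14
    f_tab = stats.f.ppf(1 - alpha, n - 2, np.sum(p - 1))
    return f_cal, f_tab, f_cal > f_tab                       # True => lack of fit detected
```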

Validation Process

Since the theoretical expressions 9-14 are a result of adapting
expressions obtained from OLS and WLS regression methods, they need to
be validated. For this reason, a validation process was designed to prove
that correct results are provided by the theoretical expressions for the lack
of fit test considering BLS regression conditions (i.e. lack of fit is detected
when it exists and not detected when it does not). Two strategies were
followed to carry out this validation.

The first one was designed to validate the theoretical expressions
under OLS, WLS and BLS regression conditions using two real data sets.
Uncertainties in both axes for both data sets were modified so that
individual uncertainties in the y axis were approximately constant (i.e.
homoscedasticity) and much higher than the ones in the x axis. This
uncertainty structure, imposed for the BLS regression method, is similar to
the one assumed by WLS (heteroscedastic and null uncertainties in the y
and x axes respectively) and OLS (constant and null uncertainties in the y
and x axes respectively) methods. In this way, if eqs. 9-14 were correct,
conclusions about the existence of lack of fit reached using the lack of fit
test considering uncertainties in both axes should be similar to the ones
reached under OLS or WLS conditions.

The second strategy checked whether lack of fit could be correctly
detected with the expressions developed under BLS conditions using
simulated data. Nine initial simulated data sets in two different groups
were considered; in six of them all the data pairs perfectly fitted a straight
line (simulating no lack of fit) and in the remaining three they fitted a curve
(simulating lack of fit due to a non-linearity). From these two groups of
initial simulated data sets, and using the Monte Carlo method [15,16],
100,000 new simulated data sets were generated. This simulation method
adds a random error to each initial data pair based on the individual
uncertainties present in both axes. In this way, replicates are randomly
generated for each data pair in every new simulated data set. The new $x_i$ and $y_i$ values for each data pair were calculated from the mean value of the $p_i$ simulated replicates $x_{ij}$ and $y_{ij}$. The individual uncertainties in both axes for each new data pair (i.e. $s^2_{x_i}$ and $s^2_{y_i}$) were considered equal to the true
uncertainties from the data pairs in the initial simulated data sets. This
ensured that possible errors in the detection of lack of fit from the
validation process could only be due to the theoretical expressions and not
produced by an inaccurate estimation of the individual uncertainties in
both axes (if calculated from the corresponding p
i
simulated replicates).

When lack of fit was detected in approximately an α% of the data sets generated from an initial data set which simulated no lack of fit for a level of significance α, it could be concluded that the theoretical expressions provided correct results (i.e. lack of fit is only detected in an α% of the cases, when it does not exist). This conclusion would be
confirmed if lack of fit were to be detected in most of the data sets
generated from an initial data set which simulated lack of fit.
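As an illustration of this generation step, the following sketch (in Python; the original computations were carried out with Matlab subroutines) draws p replicates per data pair from normal distributions whose standard deviations are the true uncertainties of the initial data set and returns the replicate means as the new data pairs. All variable names and numerical values are illustrative assumptions, not data from the study.

import numpy as np

rng = np.random.default_rng(0)

def simulate_data_set(x0, y0, sx, sy, p):
    # Draw p replicates of every data pair and return the replicate means.
    x_rep = rng.normal(loc=x0[:, None], scale=sx[:, None], size=(len(x0), p))
    y_rep = rng.normal(loc=y0[:, None], scale=sy[:, None], size=(len(y0), p))
    return x_rep.mean(axis=1), y_rep.mean(axis=1)

# Illustrative 7-point initial set with homoscedastic uncertainties; the true
# sx and sy (not the ones re-estimated from the replicates) are kept for the test.
x0 = np.linspace(0.0, 100.0, 7)
y0 = 2.0 + 0.95 * x0                 # perfectly linear, i.e. no lack of fit
sx, sy = np.full(7, 1.5), np.full(7, 2.0)
simulated = [simulate_data_set(x0, y0, sx, sy, p=3) for _ in range(1000)]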

EXPERIMENTAL SECTION

Data sets and software

Nine initial simulated data sets with different characteristics (such
as number of data pairs or uncertainty patterns, Figs. 1a-1i) and two real
data sets (Figs. 2a and 2b) were used for the validation process. In six of the
simulated data sets goodness of fit was simulated by perfectly fitting all the
data pairs to a straight line (Figs. 1a-1f), whereas in the other three all the
data pairs followed a non-linear pattern simulating lack of fit (Figs. 1g-1i).
Moreover, two supplementary real data sets (Figs. 2c and 2d) were also
used as application examples of the Anova test for detecting lack of fit
under BLS regression conditions.

Simulated data sets

Number of data pairs: Three data sets were composed of seven data
pairs (Figs. 1a-1c) and the other six contained twenty-one data pairs each
(Figs. 1d-1i). In all cases the data pairs were randomly distributed within a
linear range from 0 to 100 units.

Uncertainties: Homoscedastic data sets were composed of data pairs
with constant standard deviations (Figs. 1a, 1d and 1g). The heteroscedastic
data sets were divided into two different groups: those with standard
deviations which increased by 10% (Figs. 1b, 1e and 1h) for each individual
$x_i$ and $y_i$ value and those with random standard deviations (Figs. 1c, 1f and
1i) that were never higher than 10% of each individual $x_i$ and $y_i$ value.

[Figure 1: nine scatter plots, panels (a)-(i), of Y Data versus X Data.]
Figure 1. Initial data sets used to generate the simulated data sets in the validation
process. Crosses around the data pairs represent the standard deviations associated with
each individual mean value in both axes.

For each of the nine simulated data sets, three levels of significance
were considered: 10%, 5% and 1%.
Real data sets

Two real data sets were used to validate the expressions proposed
for the lack of fit test under BLS regression conditions. In these data sets,
the individual uncertainties were modified so that they had a structure
similar to the ones assumed by the OLS or WLS regression methods. In
addition, two other supplementary real data sets were used to provide real
application examples of the test for detecting lack of fit when errors are
considered in both axes.

Data Set 1 [27]. This data set was composed of eight data pairs
generated from the determination of eight polycyclic aromatic
hydrocarbons in various environmental matrices at different concentration
levels through a stepwise interlaboratory study approach. The two
analytical methods being compared are GC/MS-SIM (on the x axis) and
GC-ECD (on the y axis). The linear range is between zero and 6 µg/g (Fig.
2a).

Data Set 2 [28]. Comparative study of two multiresidue methods for
the determination of organochlorine insecticides and polychlorinated
biphenyl congeners in fatty processed foods. Eight data pairs represent the
results from the analysis of -endosulfan using HPLC (results on the x
axis) and GC (results on the y axis). The linear interval is between eighty
and a hundred and ten units presented as percentage of recovery (Fig. 2b).


[Figure 2: panel (a) plots GC-ECD versus GC-MS/SIM for data set 1; panel (b) plots GC versus HPLC for data set 2.]
Figure 2. Regression lines for the two real data sets used in the validation process
under BLS (solid line), WLS (dashed line) and OLS (dotted line) conditions; (a) data set
1 and (b) data set 2. Crosses around the data pairs represent the standard deviations
associated with each individual mean value in both axes.

Data Set 3 [29]. Twelve data pairs distributed between eighty and a
hundred and five units, expressed as a percentage of recovery. Results were
obtained by analysing two synthetic pyrethroid residues (permethrin and
cypermethrin) in fruits, vegetables and grains at six concentration levels for
each analyte using two chromatographic methods based on GC-ECD with
an acetonitrile extraction system and two different types of columns: wide
bore (on the x axis) and narrow bore (on the y axis). Uncertainties in both
axes for each data pair are the result of six replicate measurements (Fig. 2c).

Data Set 4 [30]. The composition of this data set is similar to the one
described for data set 3. In this case, however, the solvent used in the
extraction system is acetone and the number of synthetic pyrethroid
residues analyzed at different concentration levels is increased to eight.
This produces thirty-three data pairs distributed between eighty and a
hundred units expressed as a percentage of recovery (Fig. 2d).

[Figure 2 (cont.): panels (c) and (d) plot GC-ECD narrow bore versus GC-ECD wide bore for data sets 3 and 4.]
Figure 2 (cont.). BLS regression lines for real data sets 3 (c) and 4 (d). Crosses around
the data pairs represent the standard deviations associated with each individual mean
value in both axes.

All the computational work was performed with home-made
Matlab subroutines (Matlab for Microsoft Windows ver. 4.0, The
Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Validation Process

Table 1 shows the results concerning the existence of lack of fit in
the two real data sets used in the validation process (data sets 1 and 2). In
all the cases the level of significance was set at 5%. The calculated and
tabulated values of the F parameter are given in the $F_{cal}$ and $F_{tab}$ columns for the
three regression methods (Reg. column). In this way, lack of fit was
detected when the $F_{cal}$ value was higher than the tabulated value for the
given level of significance. The numbers of degrees of freedom are
summarised in the d.o.f. column; the first and second values correspond
to the numbers of degrees of freedom of the numerator and the denominator
in eq. 14 respectively. The columns n and p present the number of
samples and measurements per sample (considered constant in the cases
studied) respectively.

Data Set   n    p    d.o.f.     Reg.   F_cal   F_tab

1          8    11   6, 80      BLS    1.15    2.21
                                WLS    0.96
                                OLS    0.69
2          8    3    6, 16      BLS    3.40    2.74
                                WLS    2.82
                                OLS    2.83
3          12   6    10, 60     BLS    0.33    1.99
4          33   6    31, 165    BLS    2.02    1.42

Table 1. Results of applying the lack of fit test to the four real
data sets for a level of significance of 5%.
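The decision rule summarised in Table 1 can be reproduced in a few lines of code. The sketch below (Python, illustrative only) computes the tabulated F value for the degrees-of-freedom pattern of the table, n - 2 and n(p - 1), and flags lack of fit when F_cal exceeds it.

from scipy.stats import f

def lack_of_fit_detected(f_cal, n, p, alpha=0.05):
    # Degrees of freedom as in Table 1: (n - 2) and n(p - 1).
    dof_num, dof_den = n - 2, n * (p - 1)
    f_tab = f.ppf(1.0 - alpha, dof_num, dof_den)
    return f_cal > f_tab, round(f_tab, 2)

# Data set 2 of Table 1: F_cal = 3.40, n = 8, p = 3 gives F_tab = 2.74, so lack of fit is detected.
print(lack_of_fit_detected(3.40, 8, 3))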

Data Set 1. As can be seen in Table 1, the application of the lack of fit
test in this data set under BLS, WLS and OLS regression conditions showed
that no lack of fit was detected in any case. The weighted residual plots for
the three regression methods show considerable similarity (Fig. 3). This is
because the structure of the uncertainties in both axes (under BLS
conditions) is very similar to the structure assumed by WLS and OLS
methods. In the three cases the fifth data pair is the only one lying between the
warning and the action limits (twice and three times the sum of the
residuals, S in eq. 8, respectively). Figure 2a shows that this data pair is not
only the furthest from the regression line, but also has one of the lowest
individual uncertainties and is thus one of the most precise data pairs.
This forces most of the data pairs to be placed above the regression line.
The weighted residual of this data pair is higher than the others, which
suggests that this point might be considered an outlier. For this reason, a
test for detecting outliers under BLS regression conditions is needed [25].
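A weighted residual plot like the one in Fig. 3 can be sketched as follows (Python, illustrative only). The warning and action limits are drawn here at plus/minus two and three times the standard deviation of the weighted residuals, which is only an assumed stand-in for S, since eq. 8 of the paper is not reproduced in this excerpt.

import numpy as np
import matplotlib.pyplot as plt

def weighted_residual_plot(x, y, a, b, s_x, s_y):
    s_e = np.sqrt(s_y**2 + b**2 * s_x**2)      # BLS weighting factor (zero covariance)
    r = (y - (a + b * x)) / s_e                # weighted residuals
    S = r.std(ddof=1)                          # assumed stand-in for S
    plt.axhline(0.0, color="k")
    for k, style in ((2, "--"), (3, ":")):     # warning and action limits
        plt.axhline(k * S, ls=style)
        plt.axhline(-k * S, ls=style)
    plt.plot(x, r, "o")
    plt.xlabel("Concentration level")
    plt.ylabel("Weighted residual")
    plt.show()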

[Figure 3: weighted residual plot for data set 1, concentration level 0 to 6, with limits at ±2S and ±3S.]

Figure 3. Residual plots for data set 1 under BLS, WLS and OLS regression
conditions. Residuals were weighted for BLS and WLS according to the corresponding
regression conditions.

Data Set 2. In this example, lack of fit was detected by the lack of fit
test under the three regression conditions (Table 1). As in the previous
example the weighted residual plots (Fig. 4) show considerable similarity.
In the three cases, there are two data pairs (second and third) that are near
the warning limits and might be the cause of lack of fit. Figure 2b shows
that because of the high degree of homoscedasticity in the data set, the data
pairs with the highest weighted residuals are those which are further from
the regression line. As in the previous example, they are near the warning
limits in the weighted residual plot (Fig. 4) and need not, therefore, be
immediately eliminated but it would be interesting to check if they can be
considered as outliers under each of the three regression conditions.


[Figure 4: weighted residual plot for data set 2, concentration level 80 to 110, with limits at ±2S and ±3S.]

Figure 4. Residual plots for data set 2 under BLS, WLS and OLS regression
conditions. Residuals were weighted for BLS and WLS according to the corresponding
regression conditions.

These two examples demonstrate that the conclusions reached
about the existence of lack of fit are similar to the conclusions reached
under WLS and OLS conditions, when the structure of the uncertainties in
both axes is similar to the structure assumed by the WLS and OLS
regression methods. This suggests that the expressions developed for the
lack of fit test which consider errors in both axes provide results which are
consistent with those obtained under OLS or WLS conditions.

Table 2 summarises the percentages of the 100,000 simulated data
sets generated using the Monte Carlo method in which lack of fit was
detected (l.o.f. column) for the nine initial simulated data sets at the three
levels of significance. The three different uncertainty patterns considered
(homoscedasticity and constant and random heteroscedasticity) are
summarized in the Uncertainty column.


n      Uncertainty     α (%)    % l.o.f.

7      homo.           10       9.92
                        5       5.13
                        1       0.94
       hetero.         10       9.87
                        5       4.96
                        1       0.95
       hetero. rnd.    10       9.52
                        5       4.93
                        1       0.92
21     homo.           10       9.95
                        5       4.84
                        1       1.12
       hetero.         10       9.86
                        5       4.84
                        1       1.12
       hetero. rnd.    10       9.68
                        5       4.57
                        1       1.12
21*    homo.           10       96.23
                        5       92.55
                        1       86.27
       hetero.         10       97.85
                        5       94.25
                        1       88.46
       hetero. rnd.    10       98.93
                        5       96.32
                        1       90.53

Table 2. Percentages of detection of lack of fit in simulated data sets with
homoscedasticity (homo.), proportional heteroscedasticity (hetero.) and random
heteroscedasticity (hetero. rnd.) during the validation process. The symbol (*)
denotes the existence of lack of fit in the initial simulated data set.

A paired t-test [31] (with α = 5%) was applied to the results obtained
for the different uncertainty patterns of the data sets which simulated no
lack of fit in Table 2. The differences between the lack of fit percentages and
the percentages of α were not significant. So, the percentages of cases in
which lack of fit was wrongly detected do not significantly differ from the
different levels of significance α that set the probability of wrongly
detecting lack of fit. For this reason, it can be concluded that the theoretical
expressions adapted to perform the lack of fit test whenever errors in both
axes are present provide correct results. Moreover, results were best in
those simulated data sets with homoscedastic uncertainties, followed by
those from data sets with constant and random heteroscedasticity.
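A minimal illustration of this comparison, shown here for the 7-point homoscedastic data set only (the study applies it across the uncertainty patterns), is:

from scipy.stats import ttest_rel

nominal = [10.0, 5.0, 1.0]              # levels of significance, in %
observed = [9.92, 5.13, 0.94]           # detection percentages from Table 2
t_stat, p_value = ttest_rel(observed, nominal)
print(p_value)                          # well above 0.05: no significant difference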

The capability of the test to correctly detect lack of fit was also
checked. Table 2 presents the results of applying the lack of fit test to the
simulated data sets generated using the Monte Carlo method on the three
initial data sets with data pairs following non-linear patterns with different
kinds of uncertainties (i.e. showing an evident lack of fit to the regression
line, Figs. 1g-1i). Lack of fit was detected in a much higher percentage of cases than the set
level of significance α in all three data sets (i.e. lack of fit was correctly
detected when it existed). Table 2 also shows that the detection rates were
highest for the data sets with random heteroscedasticity, followed by those
with constant heteroscedasticity.

Application of the lack of fit test to real data sets under BLS conditions

Data Set 3. The results of applying the lack of fit test to this real data set
are shown in Table 1. It can be seen that no lack of fit was detected under
BLS regression conditions, as the $F_{cal}$ value is lower than the tabulated one
for 10 and 60 degrees of freedom and a level of significance of 5%. This
conclusion seems to be consistent with the data structure observed in the
weighted residual plot (Fig. 5a), in which the dispersion of the data appears
to be moderate except for the 6th data pair. As Figure 2c shows, the high
residual for the 6th data pair is because it is furthest from the regression
line, although it is the most imprecise data pair (i.e. the one with highest
individual uncertainties). The weighted residual for this data pair in Figure
5a appears between the warning and the action limits and thus, a test for
detecting outliers is necessary here.

Data Set 4. In this example lack of fit was detected considering the
uncertainties in both axes, because the $F_{cal}$ value is higher than the
tabulated one for 31 and 165 degrees of freedom and a level of significance
of 5% (Table 1). The weighted residual plot (Fig. 5b) suggests that there is a
non-linear trend at the higher concentration levels. This could not be
observed in the plot of the BLS regression line (Fig. 2d), as the large number
of data pairs with their respective uncertainties provides an unclear image,
typical of relatively large data sets. This example shows the advantages of
using the weighted residual plot, particularly when working with medium-sized
and large data sets. The possible non-linear pattern observed might
therefore cause the dispersion of the data seen in Fig. 2d, which leads to
the detection of lack of fit. From this conclusion, the experimenter should
review the GC-ECD methodology being tested to search for the causes of
non-linear responses at higher concentration levels.

[Figure 5: weighted residual plots with limits at ±2S and ±3S; panel (a): data set 3, concentration level 88 to 100; panel (b): data set 4, concentration level 80 to 100.]

Figure 5. Weighted residual plots under BLS regression conditions for data set 3 (a)
and data set 4 (b).

CONCLUSIONS

In this paper we have proposed and validated a statistical test
which detects lack of fit of the BLS regression line to data with errors in
both axes, based on the analysis of variance (Anova) of the residuals.
When the uncertainty structure in both axes considered by the BLS
technique was similar to the one assumed by OLS and WLS, the conclusions
about lack of fit were similar in all three cases. The test also provided
correct results when detecting lack of fit in simulated data.

The fact that BLS requires replicate measurements of each sample to
perform the lack of fit test does not represent an additional analytical effort
since the BLS method needs replicate measurement data to find the
coefficients of the regression line. This is not so for OLS which needs an
additional analytical effort to be made to provide the replicate
measurement values so that the analogous test can be applied. Despite the
fact that our work suggests that the lack of fit test is suitable when data
have errors in both axes, we recommend that the plot of the weighted
residuals be used to complement the statistical analysis. In this way, the
data structure can be visualised and the reasons for a hypothetical lack of
fit explained.

Finally, if lack of fit is detected in the experimental data, the linear
model may not be correct and the experimental error estimate $s^2$ will no
longer provide an accurate estimate of the random error present in the
experimental data. If lack of fit is caused by the existence of outliers, these
experimental points should be tested and removed if necessary. Depending
on the magnitude of the individual uncertainties, lack of fit might also be
caused by different degrees of dispersion of the experimental data pairs
around the regression line. In such cases, if no outliers are identified, the
analytical methodology should be revised to search for unexpected
measurement errors in the problematic data pairs, and it may be necessary to
perform new analyses for those samples. Special attention should be paid to
the presence of data pairs with abnormally low individual uncertainties,
since very slight deviations from the BLS regression line may cause lack of
fit to be detected in the data set. If, however, lack of fit is due to non-linear
data and the causes of the non-linear responses are neither known nor
identified, polynomial regression considering errors in both axes [32]
should be considered.

ACKNOWLEDGMENTS

The authors thank the DGICyT (project no. BP96-1008) for financial
support, and the Universitat Rovira i Virgili for providing a doctoral
fellowship to A. Martínez.

BIBLIOGRAPHY

1.- K. Govindaraju, I. Roelandts, 1988 compilation report on trace elements
in six ANRT rock reference samples: Diorite DR-N, Serpentine UB-N,
Bauxite BX-N, Disthene DT-N, Granite GS-N and Potash Feldspar FK-N,
Geostandards Newsletter, 13 (1989) 5-67.
2.- J. Riu, F.X. Rius, Assessing the accuracy of analytical methods using
linear regression with errors in both axes, Anal. Chem. 68 (1996) 1851-1857.
3.- W.A. Fuller, Measurement Error Models, John Wiley & Sons, New York,
1987.
4.- R.L. Anderson, Practical Statistics for Analytical Chemists, Van
Nostrand Reinhold, New York, 1987.
5.- M.A. Creasy, Confidence limits for the gradient in the linear functional
relationship, J. Roy. Stat. Soc. B 18 (1956) 65-69.
6.- J. Mandel, Fitting straight lines when both variables are subject to error,
J. Qual. Tech. 16 (1984) 1-14.
7.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, D.L. Massart,
Detection of bias in method comparison by regression analysis, Anal. Chim.
Acta 338 (1997) 19-40.
8.- J.M. Lisý, A. Cholvadová, J. Kutej, Multiple straight-line least-squares
analysis with uncertainties in all variables, Comput. Chem. 14 (1990) 189-192.
9.- J. Riu, F.X. Rius, Univariate regression models with errors in both axes, J.
Chemom. 9 (1995) 343-362.
10.- G.J. Hahn, W.Q. Meeker, Statistical Intervals, a Guide for Practitioners,
John Wiley & Sons, New York, 1991.
11.- A. Martínez, J. del Río, J. Riu, F.X. Rius, Chemom. Intell. Lab. Syst. 49
(1999) 179-193.
12.- A. Martínez, J. Riu, F.X. Rius, submitted for publication.
13.- Analytical Methods Committee, Is my calibration linear?, Analyst 119
(1994) 2363-2366.
14.- N. Draper, H. Smith, Applied Regression Analysis, 2nd ed., John Wiley &
Sons, New York, 1981, pp. 5-128.
15.- P.C. Meier, R.E. Zünd, Statistical Methods in Analytical Chemistry, John
Wiley & Sons, New York, 1993, pp. 145-150.
16.- O. Güell, J.A. Holcombe, Analytical applications of Monte Carlo
techniques, Anal. Chem. 60 (1990) 529A-542A.
17.- P. Sprent, Models in Regression and Related Topics, Methuen & Co. Ltd.,
London, 1969.
18.- D.W. Schafer, K.G. Purdy, Likelihood analysis for errors-in-variables
regression with replicate measurements, Biometrika 83 (1996) 813-824.
19.- K.C. Lai, T.K. Mak, Maximum likelihood estimation of a linear
structural relationship with replication, J. R. Statist. Soc. B 41 (1979) 263-268.
20.- C.L. Cheng, J.W. van Ness, On estimating linear relationships when
both variables are subject to error, J. R. Statist. Soc. B 56 (1994) 167-183.
21.- D.V. Lindley, Regression lines and the linear functional relationship, J.
R. Statist. Soc. Suppl. Series B, 9 (1947) 218-244.
22.- G.A.F. Seber, Linear Regression Analysis, John Wiley & Sons, New York,
1977, pp. 160-211.
23.- J.H. Williamson, Least-squares fitting of a straight line, Can. J. Phys. 46
(1968) 1845-1847.
24.- P. Bentler, D.G. Bonett, Significance tests and goodness of fit in the
analysis of covariance structures, Psychological Bulletin 88 (1980) 588-606.
25.- J. del Río, J. Riu, F.X. Rius, in preparation.
26.- A.M. Mood, F.A. Graybill, Introduction to the Theory of Statistics,
McGraw-Hill, New York, 1963.
27.- P. de Vogt, J. Hinschberger, E.A. Maier, B. Griepink, H. Muntau, J.
Jacob, Improvements in the determination of eight polycyclic aromatic
hydrocarbons through a stepwise interlaboratory study approach,
Fresenius J. Anal. Chem. 356 (1996) 41-48.
28.- A. Sannino, P. Mambriani, M. Bandini, L. Bolzoni, Multiresidue method
for determination of organochlorine insecticides and polychlorinated
biphenyl congeners in fatty processed foods, J. AOAC Int. 79 (1996) 1434-1446.
29.- G.F. Pang, Y.Z. Chao, C.L. Fan, J.J. Zhang, X.M. Li, Modification of
AOAC multiresidue method for determination of synthetic pyrethroid
residues in fruits, vegetables, and grains. Part I: Acetonitrile extraction system
and optimization of Florisil cleanup and gas chromatography, J. AOAC Int.
78 (1995) 1481-1488.
30.- G.F. Pang, Y.Z. Chao, C.L. Fan, J.J. Zhang, X.M. Li, Y.M. Liu,
Modification of AOAC multiresidue method for determination of synthetic
pyrethroid residues in fruits, vegetables, and grains. Part II: Acetone extraction
system, J. AOAC Int. 78 (1995) 1489-1495.
31.- D.L. Massart, B.M.G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.
Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics:
Part A, Elsevier, Amsterdam, 1997.
32.- J.M. Lisý, A. Cholvadová, B. Dbron, Polynomial (linear in parameters)
least squares analysis when all experimental data are subject to random
errors, Comput. Chem. 15 (1991) 135-141.



2.4 Conclusions

Apart from the conclusions drawn from the article presented in
section 2.3, further conclusions can be drawn from the results obtained with
simulated data. It has been shown that enough replicates must be carried out
to correctly estimate both the experimental values ($x_i$, $y_i$) and their
individual variances ($s_{x_i}^2$, $s_{y_i}^2$). These results are of limited applicability
under real analysis conditions, since performing a high number of replicates
per sample would drastically increase both the time and the cost of the
analysis. Even so, these results clearly show the strengths and the limitations
of the two tests studied for detecting the lack of fit of the experimental points
to the regression lines obtained by the WLS or BLS methods. Knowing these
limitations is essential when establishing the experimental design, since it
may sometimes be convenient to reduce the number of samples of different
concentrations to be analysed and to increase the number of replicates.

On the other hand, it has also been shown that when the number of
replicates used to estimate the experimental values and their individual
uncertainties is limited, the ability of the F test under BLS regression
conditions to correctly detect lack of fit is greater than that shown by the
χ² test. Another point worth noting is that applying the F test with the BLS
regression method, unlike the OLS method, does not involve an additional
experimental effort, since BLS already uses the values of the individual
uncertainties to estimate the regression coefficients. However, as will be seen
in section 5.4 of the fifth chapter, there are cases in which the values of the
replicates from which the point values ($x_i$, $y_i$) and their individual
uncertainties are estimated cannot be known. In these cases the only way to
detect the lack of fit of the experimental points to the BLS regression line
will be the χ² test, even though it is less rigorous than the F test.

CHAPTER 3
Probability of error of the first and second kind
in the individual tests on the intercept and
the slope in linear regression
considering errors in both axes

3.1 Objective of the chapter

Having shown in the previous chapter the importance of detecting the
lack of fit of the experimental points to the regression line obtained by the
BLS method, this chapter focuses on different aspects of the statistical tests
based on the individual confidence intervals applied to the BLS regression
coefficients. In method comparison studies, these tests make it possible to
detect the presence of constant or proportional errors in the results of the
new method with respect to the results of the reference method. On the other
hand, in the case of linear calibration, applying the individual confidence
intervals to the intercept and the slope makes it possible to determine
whether blank corrections are needed (by checking whether the intercept of
the regression line is significantly different from an established value) or to
assess the efficiency of recovery processes (by means of the individual
confidence interval for the slope).

One of the most important aspects to take into account when applying
an individual test to one of the BLS regression coefficients is the possibility
of committing a β error. As already discussed in section 1.5.3 of the
Introduction, depending on the analytical problem at hand, its importance
can be very great. A consequence of a high β error probability when
individual tests are applied in the comparison of analytical methodologies
could be the acceptance of a new analytical methodology that gives results
with significant proportional or constant errors with respect to those of the
reference method. In calibration processes, a high β error probability could
lead to not applying blank corrections when they would actually be
necessary, or to not detecting that the recovery level is significantly different
from a given value. For these reasons, sections 3.2 and 3.4 present the
expressions needed to estimate the β error probability in the application of
individual tests to the regression coefficients calculated by the BLS method.
In addition, section 3.3 presents a procedure for determining the number of
samples needed to build the calibration line so as to have fixed probabilities
of committing α and β errors when detecting a given bias (Δ) in the BLS
regression coefficients.

Section 3.4 contains the bulk of the work dealt with in this chapter, as
part of the article "Detecting proportional and constant bias in method
comparison studies by using linear regression with errors in both axes",
published in the journal Chemometrics and Intelligent Laboratory Systems.
Finally, section 3.5 presents the conclusions drawn from this chapter.

3.2 Estimating the probability of error of the second kind in the
application of individual tests to the regression coefficients

In order to apply the individual tests based on the Student's t
distribution to the regression coefficients obtained by the BLS method, it
must be checked that they follow a normal distribution. It is known,
however, that the regression coefficients estimated by the BLS method do
not follow a normal distribution.[1] For this reason, the Background and
Theory section of part 3.4 studies the degree of deviation from normality of
these regression coefficients. As shown in that study, the error made by not
considering the uncertainties in the measurement of the predictor variable
(the case of the OLS and WLS methods, for which the regression coefficients
do follow a normal distribution) is larger than the error made by assuming
normality of the distributions of the BLS regression coefficients. The Results
and Discussion section of part 3.4 shows that the degree of deviation from
normality of the regression coefficients estimated by the BLS method is low
enough to accept the hypothesis that these coefficients are normally
distributed. These results justify the development of individual tests for the
BLS regression coefficients under the assumption of normality.

Since these individual tests are hypothesis tests, it is first necessary to
fix the values of the theoretical regression coefficients from which the null
(H0) and alternative (H1) hypotheses are postulated and with which the
estimated regression coefficients will be compared, as described in section
1.5 of the Introduction. Throughout this work it has been considered that the
distributions followed by the theoretical values in the case of the slope
($b_1^{H_0}$ and $b_1^{H_1}$) have standard deviations different from that of the estimated
regression coefficients. This is because, unlike the OLS and WLS methods, in
the BLS regression method the standard deviation of the slope depends
directly on the weighting factor $s_{e_i}^2$ (see equations 8 and 9 in the Background
and Theory section of part 3.4), which in turn also depends directly on the
value of the slope (eq. 1.36). Therefore, a larger value of the slope will have
associated a distribution with a larger standard deviation, and vice versa. It
should be noted that this distribution is a Student's t, since the number of
points of the regression line is usually low.[2] The confidence interval is
built around the value from which the null hypothesis is postulated, for a
level of significance α, according to the expression:

$$b_0^{H_0} \pm t_{\alpha/2,\,n-2}\; s_{b_0^{H_0}} \qquad (3.1)$$

where $b_0^{H_0}$ is the theoretical value of the intercept for which H0 is
postulated (in this case fixed at 0) and $s_{b_0^{H_0}}$ is its standard deviation, which
is equal to the one estimated for the intercept ($s_{b_0}$), since the weighting
factor $s_{e_i}^2$ is independent of the value of the intercept. For the slope the
analogous expression is:

$$b_1^{H_0} \pm t_{\alpha/2,\,n-2}\; s_{b_1^{H_0}} \qquad (3.2)$$

In this case the value of $b_1^{H_0}$ for which H0 is postulated is 1. If the
value of the estimated regression coefficient falls within the corresponding
confidence interval, H0 is accepted and H1 is rejected. In this case there is the
possibility of committing a β error, since H0 is being accepted when the
correct hypothesis may actually be H1. This situation is represented
graphically as the area of the distribution associated with H1 that falls inside
the confidence interval around the reference value considered to postulate
H0 (Figure 3.1):

[Figure 3.1: the H0 and H1 distributions of the slope, separated by the bias Δ, with the α/2 and β error probabilities shaded and the half-widths $t_{\alpha/2}\, s_{b_1^{H_0}}$ and $t_\beta\, s_{b_1^{H_1}}$ indicated.]

Figure 3.1. Representation of the α and β error probabilities in the application of the
individual test for the slope, considering different distributions for H0 and H1. The
β error probability is calculated from a fixed level of significance α.

This area symbolises the probability that a regression coefficient with
a value equal to the one postulated by H1 (with a bias Δ previously defined
by the experimenter) could be wrongly considered equal to the reference
value established by H0, for a level of significance α, because of the random
errors made in the experimental measurements. This probability can be
estimated from equations 6 and 7 presented in the Background and Theory
section of part 3.4.
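A minimal sketch of this individual test, written in Python for illustration (the example figures are assumptions, not values from the thesis), accepts H0 when the estimated coefficient lies inside the interval of eq. 3.1 or 3.2:

from scipy.stats import t

def individual_test(coef_est, coef_H0, s_coef_H0, n, alpha=0.05):
    # Accept H0 when the estimate falls inside coef_H0 +/- t(alpha/2, n-2) * s_coef_H0.
    half_width = t.ppf(1.0 - alpha / 2.0, n - 2) * s_coef_H0
    return abs(coef_est - coef_H0) <= half_width

# e.g. a slope of 0.97 with standard deviation 0.02 from an 8-point comparison:
print(individual_test(0.97, 1.0, 0.02, 8))      # True: no proportional bias detected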

Another way of estimating the probability of committing a β error is
to consider the maximum level of significance α for which H0 would be
accepted (provided that this value is acceptable to the analyst, e.g. above
5%), instead of the initially fixed value of α. In this way, the β error
probabilities would be lower for those estimates of the regression
coefficients that are closer to the reference value established by H0, since
these are the ones with the highest probability of actually belonging to the
distribution associated with H0. The following figure (Figure 3.2) shows this
second approach for estimating the β error probabilities for individual tests
on the slope.

[Figure 3.2: as in Figure 3.1, but with the half-widths $t_{\alpha/2}\, s_{b_1^{H_0}}$ and $t_\beta\, s_{b_1^{H_1}}$ defined by the maximum significance level at which no bias is detected.]

Figure 3.2. Representation of the α and β error probabilities in the application of the
individual test for the slope, considering different distributions for H0 and H1. The
β error probability is calculated from the maximum level of significance α for which
no bias is detected.

It should be borne in mind that in those cases in which the number of
calibration samples is low, the estimates of the experimental error s² tend
to be overestimated.[3] As a result, the values of the standard deviations of
the BLS regression coefficients (eqs. 8 and 9 in part 3.4) and, therefore, the
distributions associated with both H0 and H1 (when the individual test is
applied to the slope) are larger than they should be. In this case, an
overestimation of the β error probability will be obtained. To compensate
for this effect, given the direct dependence between the standard deviations
and the slope of the BLS line, it is advisable to consider biased values of
$b_1^{H_1}$ lower than 1. This will make the value of the standard deviation of the
slope $s_{b_1^{H_1}}$, and therefore the spread of the distribution associated with H1,
smaller, which will make the estimate of the β error probability more
accurate.

3.3 Relationship between the probabilities of error of the first and
second kind and the number of calibration samples

Because of the consequences that can arise from committing errors of
the first or second kind in the individual tests on the intercept and the slope
of the regression line, controlling the α and β error probabilities can be of
interest depending on the analytical problem at hand. This is possible thanks
to the relationship between the α and β error probabilities, the bias Δ to be
detected in the estimated regression coefficient and the number of samples
used to build the regression line. In the case of the OLS method, there is an
expression that relates these variables:[4]

$$n = \frac{(t_{\alpha/2} + t_\beta)^2\, s^2}{\Delta^2} \qquad (3.3)$$

This expression gives the minimum number of samples needed to detect
a bias Δ in the estimate of the regression coefficient, with given probabilities
of committing α and β errors. The term s corresponds to the experimental
error in standard deviation units. This value is unique because, unlike the
BLS method, for OLS the experimental error s² does not change as a
function of the theoretical values of the slope chosen to establish H0 and H1.
For this reason, the distributions associated with H0 and H1 are identical
for the OLS method.
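Because t_{α/2} and t_β in eq. 3.3 depend on the degrees of freedom, the OLS sample size is itself found iteratively. A short sketch with assumed illustrative values:

from math import ceil
from scipy.stats import t

def ols_sample_size(s, delta, alpha=0.05, beta=0.05, n0=5):
    # Iterate eq. 3.3 until the estimated n reproduces itself.
    n = n0
    for _ in range(100):
        t_a = t.ppf(1.0 - alpha / 2.0, n - 2)
        t_b = t.ppf(1.0 - beta, n - 2)
        n_new = max(ceil((t_a + t_b) ** 2 * s ** 2 / delta ** 2), 3)
        if n_new == n:
            break
        n = n_new
    return n

print(ols_sample_size(s=0.05, delta=0.08))      # about 8 samples for these assumed values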

In the BLS regression method, estimating the number of points of the
regression line needed to detect a bias Δ in the corresponding regression
coefficient with given probabilities of committing α and β errors is more
complicated. The number of samples is predicted from the standard
deviations of the intercept or the slope, following expressions 6 and 7 of the
Background and Theory section of part 3.4. In order to relate the number of
points needed to build the regression line to the standard deviations of the
intercept and the slope, these expressions (eqs. 8 and 9 of part 3.4) must be
decomposed into different factors. This decomposition, however, is only
possible if the weighting factors $s_{e_i}^2$ are considered constant, that is, if the
uncertainties generated by the errors made in the experimental
measurements are considered constant (homoscedasticity).

An iterative procedure, outlined in the following figure (Figure 3.3), is
used to estimate the number of points needed to build the regression line.

[Figure 3.3: flow chart of the iterative procedure; initial estimates of s², the sums over x and the t values give a first estimate of n, new points are then added and the estimate is recalculated until the estimated n equals the number of points already measured.]

Figure 3.3. Scheme of the iterative procedure for estimating the minimum number of
samples needed to detect a bias Δ in the regression coefficient estimated by the BLS
method, with given probabilities of committing α and β errors.

The best possible estimates of both the experimental error s² and the
sums over the predictor variable (eqs. 12 and 13 of part 3.4), when no prior
knowledge of them is available, are obtained from an initial data set. With
the initial estimates of these variables, the number of points of the
regression line can be calculated by an iterative procedure, since the
variables t_{α/2} and t_β in equations 12 and 13 of part 3.4 depend on the
number of points to be estimated. The next step consists of measuring new
experimental values until the estimated number of points needed to build
the BLS regression line is reached. As experimental points are added, the
estimates of the experimental error and of the sums over the predictor
variable become more accurate. The process ends when the estimated
number of points is equal to the number already contained in the data set.
It should be pointed out that this procedure is very sensitive to the initial
estimates of the experimental error and of the sums over the predictor
variable. As described in the Results and Discussion section of part 3.4, in
the first stages of the procedure, when the number of points is still low, the
estimate can even become negative. This is because the estimates of the
experimental error and of the sums over the predictor variable are usually
not very accurate when the number of points in the data set is small.
Although this procedure assumes that the uncertainties associated with the
experimental values are homoscedastic, the estimates of the number of
points needed to build the BLS regression line under the aforementioned
conditions are satisfactory, even for data sets with moderate
heteroscedasticity (see the Results and Discussion section of part 3.4).



3.4 Detecting proportional and constant bias in method
comparison studies by using linear regression with errors in
both axes. (Chemometrics and Intelligent Laboratory Systems,
49 (1999) 179-193)

Àngel Martínez*, F. Javier del Río, Jordi Riu, F. Xavier Rius

Department of Analytical and Organic Chemistry.
Institute of Advanced Studies. Universitat Rovira i Virgili.
Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.

ABSTRACT

Constant or proportional bias in method comparison studies using
linear regression can be detected by an individual test on the intercept or
the slope of the line regressed from the results of the two methods to be
compared. Since there are errors in both methods, a regression technique
that takes into account the individual errors in both axes (bivariate least
squares, BLS) should be used. In this paper we demonstrate that the errors
made in estimating the regression coefficients by the BLS method are smaller
than with the OLS or WLS regression techniques and that the coefficients
can be considered normally distributed. We also present expressions for
calculating the probability of committing a β error in individual tests under
BLS conditions and theoretical procedures for estimating the sample size in
order to obtain the desired probabilities of α and β errors made when
were used for the validation process. Examples for the application of the
theoretical expressions developed are given using real data sets.



INTRODUCTION

Linear regression is widely used in the validation of analytical
methodologies. In method comparison studies, for example, a set of
samples of different concentration levels are analysed by the two methods
to be compared, and the results are regressed on each other. Ordinary least-
squares (OLS), or weighted least-squares (WLS), which considers
heteroscedasticity in the response variable, are the most widely used
regression techniques. However, these techniques have a limited scope,
since they consider the x-axis to be free of error. OLS and WLS should not
usually be applied, for instance, in method comparison studies, since the
uncertainties associated with the methods to be compared are usually of
the same order of magnitude. An alternative is the errors-in-variables
regression [1], also called CVR approach [2-4], which considers the errors in
both axes. It does not take into account the individual uncertainties of each
experimental point but considers the ratio of the variances of the response
to predictor variables to be constant for every experimental point
($\lambda = s_y^2/s_x^2$). A particular case of the CVR approach is the orthogonal
regression (OR) [5], in which the errors are of the same order of magnitude
in the response and predictor variable (i.e. $\lambda = 1$). Another option is a
bivariate least squares (BLS) regression technique [6,7], which takes into
account individual non-constant errors in both axes to calculate the
regression coefficients.

Despite the recent development of a joint confidence interval test for
the BLS regression method [8], no statistical test to individually assess the
presence of bias in the regression coefficients which takes into account the
individual uncertainties in every experimental point has yet been
described. For this reason, we present expressions for the application of the
individual tests which take into account individual errors in both axes.
Although the distributions of the BLS slope and intercept have been
reported to be nongaussian [9], in this paper we show that the results of
applying statistical tests based on the assumption of normality of the BLS
regression coefficients do not show significant errors and that these errors
are smaller than those obtained with the OLS or WLS regression techniques.

Of the two types of error associated with the statistical tests (α and
β), the β error, related to the probability of not detecting an existing
proportional or constant bias, is seldom considered. However, the
theoretical background and the expressions which enable its calculation in
the individual tests which use the OLS method have already been
developed [5]. In this paper we describe the expressions for estimating the
probability of β error when performing an individual test on one of the
regression coefficients to detect a set proportional or constant bias based on
the BLS regression technique. These expressions take into account the
different distributions that may be associated to the reference and to the
selected biased regression coefficient values. These estimates are compared
with the ones from the OLS and the WLS techniques for several real data
sets. Finally, we describe the procedure for estimating the sample size, i.e.
the number of experimental data pairs necessary for detecting the specific
selected bias when performing an individual test with set probabilities of
making α and β errors when the BLS regression method is used. Simulated
data sets have been used to validate the theoretical expressions.

BACKGROUND AND THEORY

Notation

In general, the true values of the different variables used in this
work are represented with Greek characters, while their estimates are
denoted with Latin letters. In this way, the true values of the BLS regression
coefficients are represented by $\beta_0$ (intercept) and $\beta_1$ (slope), while their
respective estimates are denoted as a and b. The estimates of the standard
deviation of the slope and the intercept for the BLS regression line are
symbolised as $s_b$ and $s_a$ respectively. The experimental error, expressed in
terms of variance for the n experimental data pairs ($x_i$, $y_i$), is referred to as
$\sigma^2$, while its estimate is $s^2$. By analogy, $\hat{y}_i$ represents the estimated value
for the predicted $y_i$. The estimated variance-covariance matrix of the regression
coefficients related to the BLS regression technique is denoted as B.

In the individual tests, the terms $a_{H_0}$, $a_{H_1}$, $b_{H_0}$ and $b_{H_1}$ represent
the values of the theoretical regression coefficients from which the null (H0)
and the alternative hypothesis (H1) are assumed. The distance between $a_{H_0}$
and $a_{H_1}$ or between $b_{H_0}$ and $b_{H_1}$, known as bias, is denoted by $\Delta$ and
represents the value of the systematic error that the experimenter wants to
check. By analogy, the values of the standard deviations of the theoretical
regression coefficients defining H0 and H1 are denoted as $s_{a_{H_0}}$ (or $s_{b_{H_0}}$)
and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$).

Bivariate Least-Squares Regression (BLS)

BLS is the generic name given to a set of regression techniques
applied to data which contain errors in both axes. Of all the different
existing approaches for calculating the regression coefficients, Lisý's
method [6] was found to be the most suitable [7]. This technique assumes
the true linear model to be:

$$\eta_i = \beta_0 + \beta_1\,\xi_i \qquad (1)$$

The true variables $\xi_i$ and $\eta_i$ are unobservable; instead, one can only
observe the experimental variables:

$$x_i = \xi_i + \delta_i \qquad \text{and} \qquad y_i = \eta_i + \varepsilon_i \qquad (2)$$

Variables $\delta_i$ and $\varepsilon_i$ are random errors committed in the measurement of
variables $x_i$ and $y_i$ respectively, where $\delta_i \sim N(0, \sigma_{x_i}^2)$ and $\varepsilon_i \sim N(0, \sigma_{y_i}^2)$. In
this way, the observed variables $x_i$ and $y_i$ are related as follows:


$$y_i = a + b\,x_i + e_i \qquad (3)$$

where $e_i$ is the ith residual error. The BLS regression method finds the
estimates of the regression line coefficients by minimising the sum of the
weighted residuals, S, expressed in eq. (4):


$$S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = (n-2)\,s^2 \qquad (4)$$

The weighting factor $s_{e_i}^2$ is expressed as the variance of the ith residual $e_i$
and takes into consideration the variances of any individual point in both
axes ($s_{x_i}^2$ and $s_{y_i}^2$) obtained from the replicate analysis of each sample by
both methods. The covariance between the variables for each ($x_i$, $y_i$) data
pair, which is normally assumed to be zero, is also taken into account:

$$\mathrm{var}(y_i - a - b\,x_i) = \mathrm{var}(e_i) = s_{e_i}^2 = s_{y_i}^2 + b^2 s_{x_i}^2 - 2b\,\mathrm{cov}(x_i, y_i) \qquad (5)$$

For this reason, the BLS regression technique assigns lower weights
to those data pairs with larger $s_{x_i}^2$ and $s_{y_i}^2$ values, i.e. the most imprecise
data pairs. By minimising the sum of the weighted residuals (eq. (4)), two
non-linear equations are obtained, from which the regression coefficients a
and b can be estimated by an iterative process [8].
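As an illustration of this iterative estimation, the following hedged sketch (Python; it is not the exact algorithm of ref. [8]) alternates between recomputing the weighting factors of eq. (5) with the current slope and refitting a weighted straight line until the slope converges:

import numpy as np

def bls_fit(x, y, s_x, s_y, tol=1e-10, max_iter=100):
    x, y = np.asarray(x, float), np.asarray(y, float)
    s_x = np.broadcast_to(np.asarray(s_x, float), x.shape)
    s_y = np.broadcast_to(np.asarray(s_y, float), y.shape)
    b = np.polyfit(x, y, 1)[0]                  # OLS slope as starting value
    a = y.mean() - b * x.mean()
    for _ in range(max_iter):
        w = 1.0 / (s_y**2 + b**2 * s_x**2)      # 1 / s_ei^2, zero covariance assumed
        sw, swx, swy = w.sum(), (w * x).sum(), (w * y).sum()
        swxx, swxy = (w * x * x).sum(), (w * x * y).sum()
        b_new = (sw * swxy - swx * swy) / (sw * swxx - swx**2)
        a = (swy - b_new * swx) / sw
        if abs(b_new - b) < tol:
            return a, b_new
        b = b_new
    return a, b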

Characterisation of the distribution of the BLS regression coefficients

The distribution functions of the regression coefficients a and b
found by the BLS regression technique have been reported to be
nongaussian [9]. This influences the individual tests on the regression
coefficients, since they are usually performed under the assumption of
normality. To determine the degree of non-normality of the distributions of
the BLS coefficients, three different statistical tests were used: Cetama [10]
(which also allows the actual probability function to be characterised), the
Kolmogorov test [11] and the normal probability plot (or Rankit test) [12].
These tests were applied to different types of real data sets to find a
relationship between their structure and the degree of non-normality.
Furthermore, to characterise their distribution, the real distributions and
some theoretical distributions were compared. These comparisons were
carried out with the quantile-quantile graphic method (Q-Q plot) [12].

β error in the individual tests for the BLS regression coefficients

According to the theory of hypothesis testing, when an individual
test is applied on a regression coefficient, the null hypothesis H0 is usually
defined as the one that considers the estimated regression coefficient to
belong to the distribution of a hypothetical regression coefficient ($a_{H_0}$ or
$b_{H_0}$) equal to the reference value, or in other words, that there are no
proportional or constant systematic errors in the method being tested. On
the other hand, the alternative hypothesis H1 considers that the estimated
regression coefficient belongs to the distribution of a hypothetical
regression coefficient ($a_{H_1}$ or $b_{H_1}$) with a given value. This value, which has
to be set by the experimenter according to the systematic error one wants to
detect in the analytical method being tested, defines the distance between
$a_{H_0}$ (or $b_{H_0}$) and $a_{H_1}$ (or $b_{H_1}$), or in other words the so-called bias [13]. The
standard deviations $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$) can be calculated for a
given data set with the values of $a_{H_0}$ (or $b_{H_0}$) and $a_{H_1}$ (or $b_{H_1}$).

The expressions developed for estimating the probability of
committing a β error in the application of an individual test to one of the
regression coefficients calculated by using the OLS regression technique are
established [5]. Analogous expressions can be adapted for the BLS
technique by considering the appropriate standard deviation values:

$$\Delta_b = t_{\alpha/2}\, s_{b_{H_0}} + t_\beta\, s_{b_{H_1}} \qquad\Rightarrow\qquad t_\beta = \frac{\Delta_b - t_{\alpha/2}\, s_{b_{H_0}}}{s_{b_{H_1}}} \qquad (6)$$

$$\Delta_a = t_{\alpha/2}\, s_{a_{H_0}} + t_\beta\, s_{a_{H_1}} \qquad\Rightarrow\qquad t_\beta = \frac{\Delta_a - t_{\alpha/2}\, s_{a_{H_0}}}{s_{a_{H_1}}} \qquad (7)$$

The probability of committing a β error under the assumption of normality
is finally given by the Student's t value for n-2 degrees of freedom for a
fixed level of significance α. The standard deviations $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and
$s_{a_{H_1}}$ (or $s_{b_{H_1}}$) can be estimated in a similar way to the standard deviations
of the intercept and the slope, and are easily obtained from the B variance-
covariance matrix [8] calculated while estimating the regression coefficients
with the BLS technique:


$$s_a^2 = \frac{s^2 \displaystyle\sum_{i=1}^{n} \frac{x_i^2}{s_{e_i}^2}}{\displaystyle\sum_{i=1}^{n} \frac{1}{s_{e_i}^2} \sum_{i=1}^{n} \frac{x_i^2}{s_{e_i}^2} - \left(\displaystyle\sum_{i=1}^{n} \frac{x_i}{s_{e_i}^2}\right)^2} \qquad (8)$$


$$s_b^2 = \frac{s^2 \displaystyle\sum_{i=1}^{n} \frac{1}{s_{e_i}^2}}{\displaystyle\sum_{i=1}^{n} \frac{1}{s_{e_i}^2} \sum_{i=1}^{n} \frac{x_i^2}{s_{e_i}^2} - \left(\displaystyle\sum_{i=1}^{n} \frac{x_i}{s_{e_i}^2}\right)^2} \qquad (9)$$

To calculate the values of $s_{a_{H_0}}$ (or $s_{b_{H_0}}$) and $s_{a_{H_1}}$ (or $s_{b_{H_1}}$) it is only
necessary to recalculate the value of the weighting factor (eq. (5)) according
to the new slope value. Due to the dependence of the weighting factor on
the slope, the values of $s_{a_{H_0}}$ and $s_{a_{H_1}}$ will be equal to the standard
deviation obtained for the estimated regression coefficient ($s_a = s_{a_{H_0}} = s_{a_{H_1}}$),
which is not true for the slope. The experimental error $s^2$ remains
unchanged.
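A small sketch of this calculation for the slope (Python, with assumed example numbers) follows eqs. (6) and (9): the standard deviations under H0 and H1 are obtained by re-evaluating the weighting factor with b_H0 and b_H1, and the resulting t_beta is converted into a probability with the Student's t distribution.

from scipy.stats import t

def beta_error_slope(delta, s_b_H0, s_b_H1, n, alpha=0.05):
    # Eq. (6): delta = t_{alpha/2} s_b_H0 + t_beta s_b_H1, solved for t_beta.
    t_alpha2 = t.ppf(1.0 - alpha / 2.0, n - 2)
    t_beta = (delta - t_alpha2 * s_b_H0) / s_b_H1
    return t.sf(t_beta, n - 2)     # area of the H1 distribution inside the acceptance interval

# e.g. a 5% proportional bias tested with 10 data pairs (all values assumed):
print(beta_error_slope(delta=0.05, s_b_H0=0.015, s_b_H1=0.014, n=10))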

Estimating the sample size

Relating eqs. (8)-(9) to the number of data pairs n, it is possible to
estimate the number of data pairs required to detect a certain bias with set
probabilities of committing α and β errors. This can only be achieved if the
individual uncertainties, and hence the weighting factors, are considered
constant for all the data pairs (i.e. the weighting factors $s_{e_i}^2$ evaluated
under H0 or H1 are taken as constant):


$$s_{a_{H_0}}^2 = \frac{s^2\, s_{e_{H_0}}^2 \displaystyle\sum_{i=1}^{n} x_i^2}{n \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\displaystyle\sum_{i=1}^{n} x_i\right)^2} \qquad (10)$$


$$s_{b_{H_0}}^2 = \frac{n\, s^2\, s_{e_{H_0}}^2}{n \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\displaystyle\sum_{i=1}^{n} x_i\right)^2} \qquad\text{or}\qquad s_{b_{H_1}}^2 = \frac{n\, s^2\, s_{e_{H_1}}^2}{n \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\displaystyle\sum_{i=1}^{n} x_i\right)^2} \qquad (11)$$

Introducing these two expressions into eqs. (6)-(7) respectively, it is possible to
isolate n in terms of the desired variables α, β and Δ:

$$n_a = \frac{(t_{\alpha/2} + t_\beta)^2\, s^2\, s_{e_{H_0}}^2}{\Delta_a^2} + \frac{\left(\displaystyle\sum_{i=1}^{n} x_i\right)^2}{\displaystyle\sum_{i=1}^{n} x_i^2} \qquad (12)$$


$$n_b = \frac{\left(\displaystyle\sum_{i=1}^{n} x_i\right)^2}{\displaystyle\sum_{i=1}^{n} x_i^2 - \dfrac{s^2 \left(t_{\alpha/2}\, s_{e_{H_0}} + t_\beta\, s_{e_{H_1}}\right)^2}{\Delta_b^2}} \qquad (13)$$

Initial estimates of the weighting factor terms under H0 and H1, of $s^2$
and of both sums involving the x data coordinates can be set from an initial
data set containing a few data pairs. After an iterative calculation (due to the
dependence of the $t_{\alpha/2}$ and $t_\beta$ values on the number of data pairs) an
estimate of $n_a$ or $n_b$ is obtained. It is then important to recalculate the
sample size after adding more data to the initial data set, as the estimates of
the terms mentioned in eqs. (12)-(13) are likely to change. In this way a new
estimate of $n_a$ or $n_b$ is obtained. The estimation process ends when the
differences between two consecutive $n_a$ or $n_b$ values are below a set
threshold value.
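The sketch below illustrates one pass of this iterative estimation for the slope, using the homoscedastic form of eq. (13); the data, the weighting factors and the bias are assumptions chosen only to show the mechanics, and in practice new data pairs would be measured between successive estimates.

import numpy as np
from scipy.stats import t

def estimate_n_b(x, s2, s_e_H0, s_e_H1, delta, alpha=0.05, beta=0.05):
    x = np.asarray(x, float)
    sum_x, sum_x2 = x.sum(), (x**2).sum()
    n = len(x)
    for _ in range(100):
        t_a = t.ppf(1.0 - alpha / 2.0, n - 2)
        t_b = t.ppf(1.0 - beta, n - 2)
        denom = sum_x2 - s2 * (t_a * s_e_H0 + t_b * s_e_H1) ** 2 / delta ** 2
        if denom <= 0:                      # early stages can give no (or a negative) solution
            return None
        n_new = int(np.ceil(sum_x ** 2 / denom))
        if abs(n_new - n) <= 1:
            return max(n_new, n)
        n = n_new
    return n

# Assumed example: 7 points spread over 0-100, s2 = 1, s_e = 2 under both hypotheses:
print(estimate_n_b(np.linspace(0, 100, 7), 1.0, 2.0, 2.0, delta=0.1))   # roughly 8 samples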



Validation

The objective of the validation process is twofold. Firstly, to show
that, despite the non-normal distribution of the BLS regression line
coefficients, the confidence interval computed using the t-distribution can
generally be accepted without committing relevant errors. Secondly, to
assess whether the theoretical estimates of the β error and of the number
of data pairs required to perform the individual tests, based on BLS under
defined statistical conditions, provide correct results.

To show the degree of non-normality of the intercept and the slope
distributions under real regression conditions, six real data sets with errors
in both axes were studied. The Monte Carlo method [14] was applied to
generate 200,000 data sets from each of the six initial ones (Figure 1).

[Figure 1: flow chart, the Monte Carlo method generates n simulated data sets from the initial data set, giving n straight lines with coefficients (a, b) to which the tests of normality are applied.]


Figure 1. Scheme of the procedure followed to check the normality of the BLS
regression coefficients using the Monte Carlo simulation method and the three
selected tests of normality.

This method adds a random error to every data pair based on the
individual uncertainties in both axes. In this way, 200,000 simulated data
sets were randomly generated. This gave rise to 200,000 regression lines, to
which the three selected tests for assessing the normality of the
distributions were applied. The error made in estimating the BLS
regression coefficients when their respective distributions were assumed to
be normal (when in fact they are not) was quantified and compared with
the error made in estimating the regression coefficients by the OLS and WLS
techniques. Figure 2 illustrates the comparison procedure. Once the
distribution of the regression coefficients corresponding to the real data set
is obtained by the Cetama method, we can determine its left ($x_{lr}$) and right
($x_{rr}$) limits for a chosen level of significance α. The shaded areas in Figure 2
represent the errors made by estimating the regression coefficients with
each of the three regression techniques studied.
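A minimal sketch of such a normality screening on the Monte Carlo slope estimates is shown below; the study uses the Cetama, Kolmogorov and Rankit procedures, whereas here a Kolmogorov-Smirnov test against a fitted normal distribution is used as a stand-in.

import numpy as np
from scipy import stats

def slope_looks_normal(b_values, alpha=0.05):
    b = np.asarray(b_values, float)
    z = (b - b.mean()) / b.std(ddof=1)          # standardise the simulated slopes
    d_stat, p_value = stats.kstest(z, "norm")
    return p_value > alpha

# b_values would hold the 200,000 slopes from the simulated regression lines.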

[Figure 2: real distribution of a regression coefficient with limits x_lr and x_rr, compared with the limits obtained under the normality assumption for BLS (x_lbls, x_rbls), WLS (x_lwls, x_rwls) and OLS (x_lols, x_rols)]
Figure 2. Error made in estimating the BLS regression coefficients assuming
normal distributions. Comparison with errors made using OLS and WLS
regression techniques.
To validate the expressions for the estimation of the probability of β error, 24 initial simulated data sets were used, with all the data pairs perfectly fitted to a straight line with either a biased slope or a biased intercept value. From each of these initial data sets, 100,000 new simulated ones were randomly generated by adding a random error to every individual data pair (x_i, y_i) in the initial data set with the Monte Carlo method. An individual test was then applied to one of the regression coefficients for every one of these 100,000 data sets to check whether H_0 could be accepted in each case for a fixed level of significance α. So every time H_0 was accepted, a β error was being committed, because the data set had been generated from an initially biased one but, owing to the random errors introduced by the Monte Carlo method, the bias could not be detected. The value of the bias was chosen to provide a probability of β error similar to the level of significance α in each of the four cases. In this way, if the estimate of the probability of β error from the theoretical expressions was similar to the one from the simulation process, we may conclude that the
stated expressions provide correct results.
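The counting itself can be sketched as follows; ordinary least squares is used here only as a runnable stand-in for the BLS fit (whose coefficient standard errors the paper takes from the BLS expressions), so the sketch illustrates the logic rather than the exact computation:

# Empirical beta: fraction of simulated (biased) data sets in which the
# individual t-test on the slope still accepts H0: b = ref.
import numpy as np
from scipy import stats

def ols_slope_and_se(x, y):
    """OLS slope and its standard error (stand-in for the BLS fit)."""
    n = x.size
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    s2 = np.sum((y - a - b * x) ** 2) / (n - 2)
    return b, np.sqrt(s2 / np.sum((x - x.mean()) ** 2))

def empirical_beta(x_sim, y_sim, ref=1.0, alpha=0.05):
    n_sim, n = x_sim.shape
    t_crit = stats.t.ppf(1 - alpha / 2, n - 2)
    accepted = 0
    for xs, ys in zip(x_sim, y_sim):
        b, se_b = ols_slope_and_se(xs, ys)
        if abs(b - ref) <= t_crit * se_b:   # H0 accepted although the truth is biased
            accepted += 1
    return accepted / n_sim                 # empirical probability of a beta error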

Once the estimates of the probability of β error had been proved to be correct, the expressions to estimate the sample size were validated. The probabilities of β error estimated for the different levels of significance α, the calculated standard deviations and the experimental error from the iterative process (the terms t_β, t_α/2, s²_a(H0), s_b(H0) or s_b(H1) and s², respectively) for each of the initial data sets in the validation process were introduced into expressions (12) and (13). If the estimated sample size required to achieve the chosen probabilities of α and β error was similar to the number of data pairs in each data set, the results were considered correct. To show the applicability of the procedure, a real data set was used as a case study.



EXPERIMENTAL SECTION

Data sets and software

Six real data sets with different characteristics (such as number of
data pairs, heteroscedasticity or position within the experimental domain)
were used to check the distribution of the BLS regression coefficients.
Twenty-four different simulated data sets were considered to validate the
expressions for the estimates of the probability of β error (eqs. (6-7)). Finally, one of the six former real data sets was used to show the different estimates of the probability of β error obtained with the BLS, OLS and WLS
regression techniques and provide an example of the sample size
estimation procedure using data with errors in both axes.

Data Set 1 [15]. Data set obtained from the study of the supercritical fluid extraction (SFE) recoveries of polycyclic aromatic hydrocarbons (PAHs) from railroad bed soil using two different modifiers: CO2 (on the x-axis) and a mixture of CO2 with 10% of toluene (on the y-axis). The data set is composed of seven data pairs. The standard deviations (s_xi and s_yi) were the result of a triplicate supercritical fluid extraction at each level of concentration. The units are expressed in terms of μg/g of soil. The data set
and the regression lines obtained by the OLS, WLS and BLS regression
techniques are shown in Figure 3a.

[Figure 3: scatter plots of the six real data sets with their regression lines. (a) CO2 vs. CO2/10% toluene; (b) 1 vs. 2 amalgamations; (c) AAS vs. SIA; (d) AAS/selective reduction vs. AES/cold trapping; (e) kPa vs. mV; (f) solvent vs. solvent/soil]
Figure 3. OLS (dashed line), WLS (dotted line) and BLS (solid line) regression
lines obtained for the six real data sets.

Data Set 2 [16]. Comparative study of mercury determination using
gas chromatography coupled to a cold vapour atomic fluorescence
spectrometer following derivatization with sodium tetraethylborate. One
(x-axis) and two (y-axis) amalgamation steps were used to obtain five data
pairs with their respective uncertainties (s_xi and s_yi) generated from six replicates performed at each point. Units are expressed in terms of pg of
recovered mercury. The data set and the regression lines generated by the
three regression techniques are shown in Figure 3b.

Data Set 3 [17]. Twenty-seven data pairs obtained from a method
comparison study which analysed Ca(II) in water by atomic absorption
spectroscopy (AAS), taken as the reference method (x-axis), and sequential
injection analysis (SIA), taken as the tested method (y-axis). The data set
and the regression lines generated by OLS, WLS and BLS regression
techniques are shown in Figure 3c. Units are expressed in mg/l. The
uncertainties associated with the AAS method were derived from the
analytical procedure, including the linear calibration step [18]. The
uncertainties of the SIA results were calculated with a multivariate
regression model and the PLS technique using the Unscrambler program
(Unscrambler-Ext, ver. 4.0, Camo A/S, Trondheim, Norway).

Data Set 4 [19]. Comparative study for determining arsenic in
natural waters from two techniques: continuous selective reduction and
atomic absorption spectrometry (AAS) as the reference method (x-axis) and
non-selective reduction, cold trapping and atomic emission spectrometry
(AES) as the tested method (y-axis). Thirty experimental data pairs were
obtained with three replicates per data pair. The units are expressed in
terms of μg/l. The data set and the regression lines obtained using all three
regression techniques are shown in Figure 3d.

Data Set 5 [20]. Data set obtained by measuring the CO2 Joule-Thomson coefficient. The data were acquired from thermocouple-
measured voltage differences (mV, on the y-axis) as a function of pressure
increments (kPa, on the x-axis). Eleven equally-distributed data pairs
were obtained with estimated unity x-axis uncertainties. The y-axis
uncertainties were estimated to be between one and two units. The data set
and the three regression lines found by using the stated regression
techniques are shown in Figure 3e.

Data Set 6 [21]. Comparative study of the average recoveries for
organochlorine pesticides present in solvent (on the x-axis) or in
solvent/soil suspension (on the y-axis) after microwave-assisted extraction
(MAE) analysis. Twenty-one data pairs were used in the analysis. The
uncertainties were obtained from triplicate MAE analysis at each point. The
data set and the straight lines regressed by the three regression techniques
are shown in Figure 3f.

To validate the estimates of the probability of β error, twenty-four different initial data sets showing different values of bias in the intercept or in the slope were built to cover several analytical situations: different linear ranges, numbers of data pairs and uncertainty patterns.

Linear Ranges: Two linear ranges were considered during validation,
a short one for values from 0 to 10 units, and a large one for values from 0
to 100 units.

Number of data pairs: Data sets containing five, fifteen, thirty and a hundred data pairs were selected. In all cases the data pairs were randomly
distributed throughout the two different linear ranges.

Uncertainties: Homoscedastic and heteroscedastic data sets were considered. The homoscedastic data sets were made up of data pairs with constant standard deviations on both the x and y values. In the short linear ranges the standard deviations were half a unit, whereas in the large linear ranges they were one unit. The heteroscedastic data sets were divided into two further types: on the one hand, those with increasing standard deviations and, on the other, those with random standard deviations. In both cases, however, the standard deviation values were never higher than 10% of each individual x_i and y_i value.

For every one of the twenty-four different simulated data sets, four levels of significance α were considered: 10, 5, 1 and 0.1%. Depending on the regression coefficient being tested and on the level of significance, the slope (b_H1) or the intercept value (a_H1) defining the selected bias was changed in such a way that the probabilities of β error from the iterative process were similar to the specified values. In this way the accuracy of the estimates from eqs. (6-7) for different magnitudes of bias was also tested.

All the computational work was performed with home-made
Matlab subroutines (Matlab for Microsoft Windows ver. 4.0, The
Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Distribution of the regression coefficients

The results of studying the distributions of the slope (b) and the
intercept (a) using the three tests to check normality are summarised in
Table 1. The variation in the number of iterations needed to achieve non-
normality can be used to identify the degree of normality. The more
iterations needed to achieve non-normality (if finally achieved) the more
normal the distribution is.

                              Cetama    Kolmogorov α=1%   α=5%    α=10%   Rankit plot
Data set   Iterations         a    b    a    b            a   b   a   b   a   b
1 10.000 NSNL NSLRL N NN NN NN NN NN NN NN
30.000 NSNL NSNL NN NN NN NN NN NN NN NN
50.000 NSNL NSNL NN NN NN NN NN NN NN NN
100.000 NSNL NSNL NN NN NN NN NN NN NN NN
200.000 NSNL NSNL NN NN NN NN NN NN NN NN
2 10.000 N NSNL N N N N N N NN NN
30.000 N NSLRL N N N N N N N N
50.000 N NSNL N N N N N N N N
100.000 NSNL NSNL N N N N N N N N
200.000 NSLRL NSLL N N N N N N N N
3 10.000 NSNL NSLRL N N N N N N NN NN
30.000 NSNL NSLRL N N N N N N NN N
Table 1. Normality study results for the BLS regression coefficients.
                              Cetama    Kolmogorov α=1%   α=5%    α=10%   Rankit plot
Data set   Iterations         a    b    a    b            a   b   a   b   a   b
50.000 NSNL NSNL NN N NN N NN N NN N
100.000 NSLRL NSLRL NN N NN N NN N NN N
200.000 NSNL NSNL NN N NN N NN N NN N
4 10.000 N NSNL N N N N N N N N
30.000 N NSNL N NN N NN N NN N NN
50.000 N NSNL N NN N NN N NN N NN
100.000 N NSNL N NN N NN N NN N NN
200.000 N NSNL N NN N NN N NN N NN
5 10.000 N N N N N N N N N N
30.000 N N N N N N N N N N
50.000 N N N N N N N N N N
100.000 N N N N N N N N N N
200.000 N N N N N N N N N N
6 10.000 NSNL NSNL N NN N NN N NN NN NN
30.000 NSNL NSNL N NN NN NN NN NN NN NN
50.000 NSNL NSNL NN NN NN NN NN NN NN NN
100.000 NSNL NSNL NN NN NN NN NN NN NN NN
200.000 NSNL NSNL NN NN NN NN NN NN NN NN

N: Normal distribution.
NN: Non-normal distribution.
NSNL: Non-symmetric and non-limited.
NSLRL: Non-symmetric and left and right limited.
NSLL: Non-symmetric and left limited.

Table 1 (cont.). Normality study results for the BLS regression coefficients.

Data set 1 presents non-normal distributions mainly due to the high
lack of fit of the data pairs to the regression line. Data sets 2 and 5 present
the best goodness of fit of all the sets, which helps the distribution of the
regression coefficients to be normal. In data set 3, the data structure and the
errors in both axes make the regression line mainly change the intercept
value, which leaves the slope almost unmodified. In this way the intercept
value has a larger uncertainty, which leads to a non-normal distribution, whereas a much lower uncertainty is associated with the slope value. In data
set 4, the slope of the regression line does not follow a normal distribution
since the remarkable heteroscedasticity along the experimental range
causes the regression line to move along a conical-shaped region when
considering errors in both axes. This varies the slope and leaves the
intercept almost unmodified. Finally, data set 5 has normal distributions
and data set 6 presents non-normal ones due to the irregular disposition of
the points in the space and the high heteroscedasticity. The more similar
the error pattern to OLS conditions (i.e. larger errors in the y axis than in
the x axis, homoscedasticity) and the better the goodness of fit, the more
normal the distribution is. It has to be pointed out that the Cetama method
was the most sensitive in detecting deviations from normality.

Table 2 shows the quantification of the error made in estimating the
BLS regression coefficients when normality in their distributions is
assumed, and the comparison with the analogous results from OLS and
WLS regression techniques. The error is calculated according to the shaded
areas in Figure 2 (where the error is considered to be the part that belongs
to the OLS, WLS or BLS distribution for a fixed α level and which does not belong to the real distribution, and the part that does not belong to the OLS, WLS or BLS distribution for the same α level and belongs to the real
one). This table shows that the error made from assuming normality for the
BLS regression technique is low, and significantly lower than the ones
obtained for the OLS and WLS regression methods for all the data sets. The
data sets that present BLS regression coefficients as normally distributed
have errors equal to zero. We can also see that the error committed when
using the WLS method is usually lower than when using OLS.
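One way to compute such a disagreement numerically is sketched below, under the assumption that "belonging to a distribution" means lying inside its (1−α) interval and that the probability mass is measured on the simulated (real) distribution; the interval limits are taken as already computed, as in Figure 2.

# Fraction of the empirical ("real") distribution of a coefficient that lies in
# exactly one of the two intervals (the real one and the normality-based one).
import numpy as np

def interval_disagreement(samples, real_limits, approx_limits):
    s = np.asarray(samples)
    in_real = (s >= real_limits[0]) & (s <= real_limits[1])
    in_approx = (s >= approx_limits[0]) & (s <= approx_limits[1])
    return np.mean(in_real ^ in_approx)   # symmetric difference, as a probability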









                               % Error
Data set   Coefficient   BLS     WLS     OLS
1          a             4.69    26.84   58.29
           b             4.46    14.59   16.43
2          a             0       9.81    44.35
           b             0       5.51    3.66
3          a             0.53    1.37    11.42
           b             0.58    6.20    11.03
4          a             0       5.11    88.50
           b             2.79    14.97   25.28
5          a             0       0.26    0.62
           b             0       0.25    3.28
6          a             2.48    2.31    6.60
           b             2.48    3.75    6.45
Table 2. Differences between the theoretical and estimated regression
coefficients by the three regression techniques (normal distributions
assumed).

Once the BLS regression coefficients have been found, in most cases,
to be non-normally distributed, their distributions were compared with
some theoretical ones (beta, binomial, chi-squared, exponential, F, gamma,
geometric, hypergeometric, normal, Poisson, Student's t, uniform, discrete uniform and Weibull distributions) using the quantile-quantile plot graphic
method (Q-Q plot) [12]. As the results provided by the Cetama method
(Table 1) indicate that the regression coefficients that do not follow a
normal distribution are mainly non-symmetric and non-limited, it seems
reasonable to suppose that the regression coefficient distributions follow
some kind of constant pattern. However, the results given by the Q-Q plot
indicate that the theoretical distributions that are most similar to the real
ones are the chi-squared, normal and Student's t, since the differences between them are
very difficult to appreciate.
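A sketch of such a quantile-quantile comparison is given below; the candidate distribution and its parameters are placeholders, whereas the paper compares the simulated coefficient distributions against the fourteen theoretical distributions listed above.

# Quantile-quantile comparison between an empirical coefficient distribution and
# a candidate theoretical distribution; the straighter the plot, the closer they are.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def qq_plot(samples, dist=stats.norm, shape=()):
    s = np.sort(np.asarray(samples))
    p = (np.arange(1, s.size + 1) - 0.5) / s.size   # plotting positions
    q_theor = dist.ppf(p, *shape)                   # theoretical quantiles
    plt.plot(q_theor, s, ".", ms=2)
    plt.xlabel("theoretical quantiles"); plt.ylabel("empirical quantiles")
    plt.show()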

β error and sample size validation

Tables 3 and 4 summarise the results from 100,000 iterations using the Monte Carlo method for the four levels of significance α in the twenty-four simulated data sets. Columns a_H1 and b_H1 show the regression coefficient values which define the chosen bias (the distance between H_0 and H_1). The values in the β exp. column are those from the simulation process, whereas the values shown in the β pred. column are the ones obtained with the theoretical expressions to be validated (eqs. (6-7)). Finally, the values in the n_pred column are the estimated sample sizes of the different simulated data sets for the different levels of significance.

n   Uncertainty   α (%)   a_H1   s_a(H0)   β exp. (%)   β pred. (%)   n_pred

5 homo. 10 2.4 0.641 9.97 12.91 5
5 3.2 5.02 8.39 5
1 5.2 2.22 5.38 5
0.1 10.5 0.13 2.03 5

hetero. 10 0.7 0.189 10.11 13.67 5
5 0.95 4.32 8.26 5
1 1.5 2.75 6.53 5
0.1 3 0.74 3.14 5

heter. rnd. 10 1 0.261 8.36 11.77 5
5 1.3 4.80 8.48 5
1 2.1 2.23 5.71 5
0.1 4.3 0.11 2.59 5

15 homo. 10 1 0.341 13.24 13.34 15
5 1.3 5.73 6.14 15
1 1.9 0.93 1.19 15
0.1 2.6 0.10 0.24 15

hetero. 10 5e-2 1.69e-2 12.02 12.99 15
5 6.5e-2 4.98 4.9 15
Table 3. Estimated and experimentally obtained probabilities of β error for individual tests on the intercept. Predicted sample size to achieve the α and β probabilities of error for each data set.

n   Uncertainty   α (%)   a_H1   s_a(H0)   β exp. (%)   β pred. (%)   n_pred

1 9.5e-2 0.57 1.11 15
0.1 0.125 0.10 0.28 15

heter. rnd. 10 2.5e-2 8.79e-3 13.95 15.12 15
5 3.4e-2 4.39 5.56 15
1 4.5e-2 1.81 2.75 15
0.1 6.4e-2 0.13 0.45 15

30 homo. 10 0.75 0.262 12.93 12.82 30
5 1 4.36 4.43 30
1 1.3 1.74 1.84 30
0.1 1.8 0.12 0.17 30

hetero. 10 5.5e-3 1.92e-3 12.19 12.62 30
5 7e-3 5.53 5.99 30
1 9.5e-3 1.43 1.84 30
0.1 1.2e-2 0.54 0.76 30

heter. rnd. 10 1.9e-2 6.48e-3 11.07 11.46 30
5 2.4e-2 4.97 5.47 30
1 3.2e-2 1.50 1.92 30
0.1 4.3e-2 0.16 0.31 30

100 homo. 10 0.4 0.142 12.78 12.68 100
5 0.5 6.61 6.51 100
1 0.68 1.77 1.70 100
0.1 0.88 0.35 0.32 100

hetero. 10 1.5e-5 5.37e-6 12.89 12.98 100
5 1.9e-5 6.02 6.16 100
1 2.6e-5 1.41 1.45 100
0.1 3.4e-5 0.19 0.20 100

heter. rnd. 10 1.9e-4 6.41e-5 9.49 9.76 100
5 2.4e-4 3.86 4.07 100
1 3e-4 1.91 2.13 100
0.1 4.2e-4 0.07 0.10 100

Table 3 (cont.). Estimated and experimentally obtained probabilities of β error for individual tests on the intercept. Predicted sample size to achieve the α and β probabilities of error for each data set.







n   Uncertainty   α (%)   b_H1   s_b(H0)   s_b(H1)   β exp. (%)   β pred. (%)   n_pred

5 homo. 10 1.45 0.118 0.147 10.39 16.44 5
5 1.6 0.157 5.87 12.60 5
1 2 0.187 3.09 9.87 5
0.1 3.1 0.272 0.62 6.37 5

hetero. 10 1.27 7.48e-2 8.55e-2 12.67 17.64 5
5 1.36 9.02e-2 4.56 10.70 5
1 1.65 0.102 1.11 6.42 5
0.1 2.3 0.132 0.22 4.36 5

heter. rnd. 10 1.27 7.59e-2 9.07e-2 14.41 19.44 5
5 1.4 9.80e-2 3.76 10.24 5
1 1.67 0.113 1.19 6.99 5
0.1 2.35 0.153 0.26 4.78 5

15 homo. 10 0.8 6.92e-2 6.26e-2 10.84 11.91 15
5 0.75 6.11e-2 5.11 6.21 15
1 0.68 5.86e-2 3.75 2.14 15
0.1 0.55 5.58e-2 0.48 0.71 15

hetero. 10 0.93 2.49e-2 2.41e-2 14.59 15.2 15
5 0.91 2.39e-2 6.98 7.73 15
1 0.87 2.34e-2 1.14 1.78 15
0.1 0.83 2.29e-2 0.35 0.72 15

heter. rnd. 10 0.965 1.19e-2 1.16e-2 11.77 12.72 15
5 0.955 1.153e-2 5.07 5.98 15
1 0.94 1.19e-2 1.98 2.74 15
0.1 0.915 1.12e-2 0.15 0.42 15

30 homo. 10 1.12 4.27e-2 4.53e-2 14.92 15.22 30
5 1.16 4.62e-2 5.44 6.38 30
1 1.23 4.78e-2 0.99 1.32 30
0.1 1.32 4.99e-2 0.10 0.14 30

hetero. 10 1.02 7.18e-3 7.25e-3 14.22 14.61 30
5 1.026 7.27e-3 5.92 6.59 30
1 1.036 7.31e-3 1.29 1.77 30
0.1 1.05 7.36e-3 0.082 0.17 30

heter. rnd. 10 1.037 1.26e-2 1.28e-2 10.79 11.62 30
5 1.047 1.29e-2 4.82 5.48 30
1 1.065 1.30e-2 0.95 1.35 30
0.1 1.085 1.31e-2 0.14 0.31 30

100 homo. 10 0.93 2.41e-2 2.32e-2 10.39 9.94 100
5 0.951 2.30e-2 5.81 5.47 100
1 0.89 2.28e-2 2.35 2.13 100
0.1 0.85 2.23e-2 0.16 0.14 100

hetero. 10 0.995 1.89e-3 1.88e-3 15.92 16.16 100
5 0.993 1.88e-3 4.17 4.31 100
Table 4. Estimated and experimentally obtained probabilities of β error for individual tests on the slope. Predicted sample size to achieve the α and β probabilities of error for each data set.
n   Uncertainty   α (%)   b_H1   s_b(H0)   s_b(H1)   β exp. (%)   β pred. (%)   n_pred

1 0.991 1.87e-3 1.56 1.68 100
0.1 0.988 1.87e-3 0.16 0.18 100

heter. rnd. 10 0.986 4.85e-3 4.82e-3 11.02 11.07 100
5 0.983 4.81e-3 6.45 6.48 100
1 0.979 4.80e-3 4.39 4.48 100
0.1 0.972 4.79e-3 0.81 0.90 100

Table 4 (cont.). Estimated and experimentally obtained probabilities of β error for individual tests on the slope. Predicted sample size to achieve the α and β probabilities of error for each data set.

To detect significant differences between the estimated probabilities of β error and the values from the simulation process, paired t-tests [22] (with α = 1%) were applied to the β error values obtained for the different numbers of data points (since this is the most critical factor for achieving good predictions of the probabilities of β error) at the same level of significance. In this way, significant differences between the values in the β exp. and β pred. columns were found only in the data sets with five data pairs, for both the slope and the intercept, at the four levels of significance. The possible sources of error and some important observations concerning the results from the simulation process can be summarised as follows:

(i) In most cases the predicted probabilities of β error from eqs. (6-7)
are higher than the experimental values from the simulation process. This
overestimation may be due to a lack of information, since the
overestimation is higher in those data sets with fewer data pairs (where the
experimental error, and thus the uncertainty of the regression coefficient is
higher [23]), and lower in those data sets with a larger number of points. In
this latter case however, small disagreements still exist due to the
assumption of the normality of the regression coefficients. Figure 4 plots
the differences between the experimentally obtained probabilities of β error
(from the simulation process) and the predicted probabilities against the
number of data pairs of each data set for the slope and intercept with a
level of significance of 5%. Only the results corresponding to the low range
are shown in Figure 4 since the results for the high range were identical.

[Figure 4: difference (%) vs. number of data pairs for the slope (left) and the intercept (right), for homoscedasticity, heteroscedasticity and random heteroscedasticity]
Figure 4. Difference between the experimentally obtained β probabilities (simulation process) and the predicted probabilities of β error for the slope and the intercept (in percent) in relation to the number of data pairs for each data set.

(ii) Results for the intercept show a higher agreement than the ones for the slope (Figure 4). This may be because estimating the slope is more complex, since two different distributions have to be considered for b_H0 and b_H1, whereas only one is needed when the probabilities of β error are estimated for the intercept, as s_a(H0) = s_a(H1).

(iii) There is no clear relationship between the uncertainty patterns and the error made in predicting the β error (in percent) for the different simulated data sets. As Figure 4 shows, the three lines depicting the three patterns of uncertainty do not maintain a constant relative position, as they cross each other. Results for the intercept seem to follow a steadier pattern for the different uncertainties. As previously stated, the number of data pairs on the regression line is the key factor for obtaining a better estimate of the β error.
(iv) The results from predicting the probabilities of β error (eqs. (6-7)) and the sample size (eqs. (12-13)) for data sets with a high linear range were
identical to the ones with a low linear range. Results shown in Tables 3 and
4 correspond to the low linear range, while the ones from the high linear
range have been omitted. These results can be explained because the
distribution of the data pairs in data sets (for a given uncertainty and
number of data pairs) with different linear ranges is identical. So the only
difference between data sets with different linear ranges is that the values
of the individual data pairs and their respective uncertainties (taken as
standard deviations) are ten times higher in the high linear range than in
the low linear range. Only the standard deviation values for the intercept
were exactly ten times higher in the high linear range than the ones in the
low linear range. This is due to the direct dependence of the standard
deviation for the intercept on the sum of the x-axis values (eq. (8)).

If we look at the results of estimating the sample size in Tables 3
and 4 (n_pred columns), we can see that the predicted results in all cases
provide the correct number of data pairs of the different initial data sets
considered. From these results we can conclude that the expressions for
estimating the sample size provide correct results for the three kinds of
distribution of uncertainties considered.

Procedure for β error and sample size estimation in a real data set

Table 5 summarises the results of estimating the probabilities of committing a β error in the individual tests for the BLS slope and intercept for a level of significance of 5% (β column, in percent) for data set 3. Columns a − a_H0 and b − b_H0 show the distance between the estimated regression coefficients and the reference values (a_H0 = 0 and b_H0 = 1). The columns t·s_a(H0) and t·s_b(H0) (α = 5%) show the values of the confidence intervals associated with the reference values. Columns a_H1 and b_H1 represent the bias that the experimenter wants to check in the regression coefficient being tested. Bias is detected in the regression coefficient whenever the difference a − a_H0 or b − b_H0 is higher than its associated confidence interval. Probabilities of β error are not calculated if bias is detected.

        a − a_H0   t·s_a(H0)   a_H1   β (%)
BLS     2.94       5.35               40.2
WLS     4.38       5.19        6      37.6
OLS     3.97       7.11               62.5

        b − b_H0   t·s_b(H0)   b_H1   β (%)
BLS     0.0364     0.0991             2.77
WLS     0.0571     0.100       1.2    2.60
OLS     0.0656     0.110              5.30

Table 5. Results obtained in estimating the probability of β error in the individual tests for the intercept and the slope in data set 3.

Table 5 shows that neither constant nor proportional bias is found in the SIA methodology for the analysis of Ca(II) in water, according to the results from the three regression techniques. The highest probability of β error is estimated at 62.5% for the OLS technique, owing to its largest standard deviation value. On the other hand, the probabilities of β error for BLS and WLS are lower and similar to each other, although the WLS intercept value is nearer the upper confidence interval limit. This means that its results are less reliable, although this is not reflected in the estimated probabilities of β error. Results for the slope show that the estimated probabilities of β error in the three cases are very similar, despite the differences in the slope values from the three regression methods. However, if we look at the slope values, we can be more confident about the accuracy of the one estimated by the BLS method, as it is the closest to the reference value b_H0.
The process for estimating the sample size needed to achieve the calculated probabilities of β error in the slope (2.77%) and intercept (40.2%) for a level of significance of 5% is shown in Table 6. For the intercept, starting with an initial data set of five data pairs (n_a^0 column), thirteen iterations were needed to end up with twenty-seven data pairs. For the slope, twenty-six data pairs were needed to achieve convergence, and there was no estimate of the data pairs until 13 had been considered (n_b^0 column) since, according to the denominator of eq. (13), high experimental errors may produce negative estimates of sample size for the slope (denoted by <0 in Table 6).

Iteration   n_b^0   s_b(H0)   s_b(H1)   n_b^f   n_a^0   s_a(H0)   n_a^f

1 5 0.0974 0.0992 <0 5 6.369 9
2 9 0.131 0.134 <0 9 3.694 11
3 13 0.0753 0.0769 18 11 3.511 13
4 18 0.0666 0.0678 22 13 3.728 16
5 22 0.0609 0.0622 24 16 3.403 18
6 24 0.0530 0.0542 25 18 3.391 20
7 25 0.0511 0.0522 26 20 3.199 22
8 26 0.0492 0.0502 26 22 3.103 23
9 23 3.103 24
10 24 2.954 25
11 25 2.887 26
12 26 2.838 27
13 27 2.657 27

Table 6. Iterations during estimation of the sample size for a and b (data set 3).

CONCLUSIONS

The results of this work show that, in spite of the non-normality of
the distributions of the BLS regression coefficients, the errors made in calculating the confidence intervals for the BLS regression coefficients are lower than the ones made with the OLS or WLS techniques for data with uncertainties in both axes. Thus, the probability of β error in the individual
tests on the BLS regression coefficients can be estimated under the
hypothesis of normality.

We have also demonstrated that the expressions for estimating the probability of committing a β error when testing an individual regression coefficient with the BLS regression technique, considering different distributions for the reference (a_H0 or b_H0) and for the biased (a_H1 or b_H1) regression coefficients, provide correct results. Some sources of error have also been detected and identified to explain the disagreements produced in validating the results. The number of data pairs of the regression line appears to be crucial for better estimating the probability of β error. In addition, the results on real data show that in some cases it may be interesting to calculate the probability of β error not with the set threshold value, but with the maximum level of significance α for which no bias is detected in the regression coefficient. One would then be more confident that the regression coefficient value is accurate than when it falls near one of the boundaries of the confidence interval (in this way the probabilities of α error would be higher, but the probabilities of β error would be lower than in the usual approach).

Finally, we found that it is advisable to estimate the sample size,
since it allows the experimenter to control the probabilities of committing
α and β errors that they consider reasonable for the analytical problem in question. The iterative process for estimating the sample size guaranteed the chosen probabilities of making α and β errors when an individual test is
applied to one of the estimated BLS coefficients and produced correct
results for those data sets with moderate heteroscedasticity, but not for
those with high heteroscedasticity. The experimenter also has to weigh up
the pros and cons of performing the discontinuous series of experiments
that this iterative procedure requires.

ACKNOWLEDGMENTS

We would like to thank the DGICyT (project no. BP96-1008) for
financial support, and the Rovira i Virgili University for providing a
doctoral fellowship to A. Martínez and F. J. del Río.

BIBLIOGRAPHY

1.- W.A. Fuller, Measurement Error Models, John Wiley & Sons, New York,
1987.
2.- R.L. Anderson, Practical Statistics for Analytical Chemists, Van
Nostrand Reinhold, New York, 1987.
3.- M.A. Creasy, Confidence limits for the gradient in the linear functional relationship, J. Roy. Stat. Soc. B 18 (1956) 65-69.
4.- J. Mandel, Fitting straight lines when both variables are subject to error,
J. Qual. Technol. 16 (1984) 1-14.
5.- C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, D.L. Massart,
Detection of bias in method comparison by regression analysis, Anal.
Chim. Acta 338 (1997) 19-40.
6.- J.M. Lisý, A. Cholvadová, J. Kutej, Multiple straight-line least-squares
analysis with uncertainties in all variables, Comput. Chem. 14 (1990)
189-192.
7.- J. Riu, F.X. Rius, Univariate regression models with errors in both axes, J.
Chemom. 9 (1995) 343-362.
8.- J. Riu, F.X. Rius, Assessing the accuracy of analytical methods using
linear regression with errors in both axes, Anal. Chem. 68 (1996) 1851-
1857.
9.- A.H. Kalantar, R.I. Gelb, J.S. Alper, Biases in summary statistics of
slopes and intercepts in linear regression with errors in both variables,
Talanta 42 (1995) 597-603.
10.- Cetama, Statistique appliquée à l'exploitation des mesures, 2nd ed.,
Masson, Paris, 1986.
11.- G. Kateman and L. Buydens, Quality Control in Analytical Chemistry,
2nd ed., John Wiley & Sons, New York, 1993.
12.- M. Meloun, J. Militký and M. Forina, Chemometrics for Analytical
Chemistry. Volume 1: PC-aided statistical data analysis, Ellis Horwood
ltd., Chichester, 1992.
13.- M.R. Spiegel, Theory and Problems of Statistics; McGraw-Hill, New
York, 1988.
14.- O. Güell, J.A. Holcombe, Analytical applications of Monte Carlo techniques, Anal. Chem. 62 (1990) 529A-542A.
15.- J.J. Langenfeld, S.B. Hawthorne, D.J. Miller, J. Pawliszyn, Role of
modifiers for analytical-scale supercritical fluid extraction of
environmental samples, Anal. Chem. 66 (1994) 909-916.
16.- I. Saouter, B. Blattmann, Analyses of organic and inorganic mercury by
atomic fluorescence spectrometry using a semiautomatic analytical
system, Anal. Chem. 66 (1994) 2031-2037.
17.- I. Ruisánchez, A. Rius, M.S. Larrechi, M.P. Callao, F.X. Rius, Automatic
simultaneous determination of Ca and Mg in natural waters with no
interference separation, Chemom. Intell. Lab. Syst. 24 (1994) 55-63.
18.- R. Boqué, F.X. Rius, D.L. Massart, Straight line calibration: something
more than slopes, intercepts and correlation coefficients, J. Chem.
Educ. (Comput. Ser.) 71 (1994) 230-232.
19.- B.D. Ripley, M. Thompson, Regression techniques for the detection of
analytical bias, Analyst 112 (1987) 377-383.
20.- P.J. Ogren, J.R. Norton, Applying a simple linear least-squares
algorithm to data with uncertainties in both variables, J. Chem. Educ.
69 (1992) 130-131.
21.- V. López-Ávila, R. Young, F.W. Beckert, Microwave-assisted extraction
of organic compounds from standard reference soils and sediments,
Anal. Chem. 66 (1994) 1097-1106.
22.- D. L. Massart, B.M.G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.
Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and
Qualimetrics: Part A, Elsevier, Amsterdam, 1997.
23.- G.J. Hahn, W. Q. Meeker. Statistical Intervals, a guide for practitioners,
John Wiley & Sons, New York, 1991.



3.5 Conclusions

Among the conclusions presented in section 3.4, two should be highlighted as the main objectives of this chapter. On the one hand, the importance of considering the probability of committing a β error when an individual test is applied to one of the regression coefficients. As has been mentioned repeatedly, the consequences of accepting high probabilities of committing this type of error can be, depending on the analytical problem, quite serious.

On the other hand, the advantages introduced by calculating the number of samples needed to build the regression line with the BLS method should also be stressed. This procedure makes it possible to estimate the number of samples that have to be measured to build the BLS regression line so that the risk of committing α and β errors when detecting a certain bias in one of the regression coefficients with an individual test is kept under control. Despite the drawbacks of this iterative procedure (described in section 3.4), its use is recommended in those analytical problems in which the consequences of committing type α and/or β errors may be especially problematic.

3.6 References

1.- Kalantar A.H., Gelb R.I., Alper J.S., Talanta, 42 (1995) 597-603.
2.- Cetama, Statistique appliquée à l'exploitation des mesures, 2nd ed., Masson: Paris, 1986.
3.- Hahn G.J., Meeker W.Q., Statistical Intervals, a Guide for Practitioners,
John Wiley & Sons: New York, 1991.
4.- Massart D.L., Vandeginste B.M.G., Buydens L.M.C., de Jong S., Lewi
P.J., Smeyers-Verbeke J., Handbook of Chemometrics and Qualimetrics: Part A,
Elsevier: Amsterdam, 1997.



























CHAPTER 4
Detection of bias in analytical methods for the simultaneous determination of multiple analytes. Probability of committing a type β error

4.1 Objective of the chapter

The previous chapter stressed the importance of estimating the probability of committing a β error when individual tests are applied to the coefficients of the BLS regression line. It also demonstrated the usefulness of estimating the number of points needed to build the BLS regression line so that a certain bias in one of the regression coefficients can be detected with an individual test with given probabilities of committing α and β errors. This chapter continues to deal with the estimation of the probability of committing a β error and its consequences in chemical analysis, but in this case for the joint test on the intercept and the slope of the BLS regression line.

With the joint test on the intercept and the slope of the BLS line it is possible to detect significant errors in the results of the analysis of a single analyte at different concentration levels by a new analytical method, in comparison with those of a reference method.¹ However, there is a wide variety of methods capable of determining the concentration of several analytes at the same time (chromatographic and electrophoretic methods, etc.), which are very important since they are widely used in the field of chemical analysis. For this reason, section 4.2 demonstrates the ability of the joint test on the intercept and the slope of the BLS line to detect significant errors in the results of analytical methods that determine several analytes simultaneously. The consequences of applying the joint confidence test to the results of each analyte separately or to the results of all the analytes at once are also studied. In addition, sections 4.4 and 4.5 present the theoretical basis and the mathematical expressions that make it possible to understand and estimate the probability of committing a type β error when the joint test is applied to the BLS regression coefficients.

Sections 4.3 and 4.5 present most of the work carried out for this chapter, as part of the articles "Validation of bias in multianalyte determination methods. Application to RP-HPLC derivatizing methodologies", published in the journal Analytica Chimica Acta, and "Evaluating bias in method comparison studies using linear regression with errors in both axes", submitted to the Journal of Chemometrics (under review). Finally, section 4.6 presents the conclusions of the chapter.

4.2 Comparison of analytical methods

When two analytical methods (one of which is normally a reference method) are compared by means of linear regression, the results obtained from the analysis of an analyte in a series of samples at different concentration levels by the two methods under comparison are plotted against each other. In general, the results of the reference method are placed on the ordinate axis and those obtained by the new method (candidate method) on the abscissa axis. Since the uncertainties due to the errors made in measuring the different samples with the two methods are usually of the same order of magnitude, it is advisable to use the BLS regression method (see section 1.6 of the Introduction). In the hypothetical case that the results obtained by the two methods were identical, the intercept and the slope of the regression line would be equal to 0 and 1 respectively. In real data sets, the measurements may be affected by both random and systematic errors, which make the regression coefficients differ from the theoretical values. In order to determine whether the differences between the coefficients of the BLS regression line and the reference values (0 for the intercept and 1 for the slope) are significant, a joint test for the intercept and the slope that takes into account the uncertainties associated with the results obtained by both methods must be applied.¹ Section 4.3 presents the expression that generates this joint confidence interval for the BLS regression method (eq. 4). This test was initially developed for the OLS method by Mandel and Linnig² and defines, for a level of significance α, an elliptical confidence interval centred on the point defined by the estimated regression coefficients (b0, b1). If the point defined by the reference values (0,1) falls inside this confidence interval, it is considered that there are no significant differences between the estimated coefficients b0 and b1 and the reference values 0 and 1 respectively.

Nevertheless, to be more consistent with the definition of hypothesis tests³ in the sense of accepting or rejecting the null hypothesis (H0), the joint confidence interval should be centred on the reference point (0,1) defined by the theoretical coefficients from which H0 is postulated. No significant differences with respect to the theoretical regression coefficients are considered to exist (H0 is accepted) for any values of the regression coefficients that fall inside this joint confidence interval for a given level of significance α. The size of this joint confidence interval is essentially defined by three parameters: the estimate of the experimental error s², the variances of the regression coefficients and the chosen level of significance α. As can be seen in Figure 4.1, the elliptical confidence interval is tilted because of the negative correlation between the intercept and the slope of the regression line, which is typical of method comparison studies.

[Figure 4.1: individual confidence intervals (t·s_b0 and t·s_b1) and the joint elliptical confidence interval, showing the reference point (0,1) and the estimated coefficients (b0, b1)]
Figure 4.1. Individual and joint confidence intervals for a level of significance α.

As can be seen in Figure 4.1, using individual confidence intervals (eqs. 3.1 and 3.2 of Chapter 3) to determine whether the differences between the BLS regression coefficients and the reference values are significant is not correct, since these individual intervals do not take into account the correlation between the intercept and the slope of the regression line.⁴ It is important to point out that the joint test for the intercept and the slope is based on the hypothesis of normality of the BLS regression coefficients. As was shown in the previous chapter, although the BLS regression coefficients do not strictly follow a normal distribution, the error made in assuming the hypothesis of
normality is not significantly large.
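As an illustration of the idea behind this joint test, the sketch below implements the classical Mandel-Linnig ellipse test in its OLS form; it is only a simplified stand-in for eq. 4 of section 4.3, whose BLS version replaces these terms with expressions that account for the uncertainties in both axes.

# Joint (ellipse) test for the intercept and the slope, OLS form: checks whether
# the reference point (a_ref, b_ref) lies inside the joint confidence region.
import numpy as np
from scipy import stats

def joint_test_ols(x, y, a_ref=0.0, b_ref=1.0, alpha=0.05):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    s2 = np.sum((y - a - b * x) ** 2) / (n - 2)        # experimental error
    da, db = a - a_ref, b - b_ref
    lhs = (n * da ** 2 + 2 * np.sum(x) * da * db + np.sum(x ** 2) * db ** 2) / (2 * s2)
    return lhs <= stats.f.ppf(1 - alpha, 2, n - 2)     # True: no significant bias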




4.2.1 Simultaneous determination of several analytes

One of the objectives of this chapter has been to demonstrate that, when working with analytical methods that determine several analytes simultaneously, the joint test for detecting significant errors in the results of the candidate method must be applied to the BLS regression coefficients obtained by considering the results of all the analytes simultaneously. In this case, the number of points to be considered when building the joint confidence interval (variable n in eq. 4 of section 4.3) corresponds to the number of concentration levels multiplied by the number of analytes determined simultaneously. Simulated data sets were used to demonstrate that, in these cases, the joint test applied to the BLS regression coefficients correctly detects significant errors in the results of the candidate method when they exist and does not detect them when they do not. These data sets were generated with the Monte Carlo method⁵,⁶ in a similar way to that explained in Chapter 2 (Figure 2.2). The initial data sets, in which all the points lie perfectly on a straight line and from which the new simulated sets are generated, can be divided into two groups. On the one hand, the initial sets that simulate the results obtained for the different analytes individually. On the other hand, when all the data pairs corresponding to each individual analyte are joined into a single set, the global data sets are obtained. The following figure shows the initial data sets considered according to the degree of heteroscedasticity: a) homoscedasticity, b) constant heteroscedasticity and c) random heteroscedasticity. In cases a and b the regression line has an intercept and a slope equal to 0 and 1 respectively, so a case in which the results of the two methods are identical is simulated. In contrast, case c simulates a situation in which the results of the two methods under comparison are different, since the intercept is equal to 0 but the value of the slope is 1.05.
[Figure 4.2: plots of Reference Method vs. Alternative Method for the initial data sets of Analytes 1-7 and the corresponding Global set, for (a) homoscedasticity, (b) constant heteroscedasticity and (c) random heteroscedasticity]
Figure 4.2. Individual and global initial data sets from which simulated data sets are generated using the Monte Carlo method.

From each of the initial global data sets, 100,000 simulated sets are generated. The joint test for the intercept and the slope is applied to the BLS regression coefficients obtained for each of these simulated sets. Table 1 shows the percentage of times in which no significant differences were detected between the results of the two methods for each of the three global data sets, at different levels of significance α.

Uncertainty                    1−α (%)   % Monte Carlo
homoscedasticity               90        88.78
                               95        94.47
                               99        98.95
                               99.9      99.89

constant heteroscedasticity    90        88.95
                               95        94.62
                               99        98.96
                               99.9      99.90

random heteroscedasticity      90        2.28
                               95        5.01
                               99        18.03
                               99.9      47.75

Table 1. Percentages of the 100,000 simulated global data sets for which no significant differences are detected between the results of the two methods.
As can be seen in this table, in the cases in which the results of the two methods under comparison in the initial set are identical (a and b), the percentage of times in which no significant differences are detected is similar to the confidence level 1−α fixed in each case. On the other hand, in the case in which the results of the two methods in the initial set are different (c), the percentages of times in which no significant differences are detected are much lower than the fixed levels. From these results it can be concluded that applying the joint test to the BLS regression coefficients estimated by considering the results of the analysis of all the analytes simultaneously provides correct conclusions about the existence of significant differences between the results of the two methods under comparison: significant differences are detected in a high percentage of cases when the results are not comparable, and are detected in only about α% of the cases when they are comparable.

Besides knowing the conclusions about the existence of significant differences between the two methods under comparison when the global data sets are considered, it is interesting to know the conclusions that can be drawn about the existence of significant differences between the results of the two methods when the joint test is applied to the BLS regression coefficients estimated from each of the 700,000 individual data sets. These data sets contain the results of the analysis of each of the seven simulated analytes, and their union gives rise to each of the 100,000 global sets referred to in Table 1. Significant differences between the results of the two methods are considered to exist if, in more than half of the seven individual sets that form a global set in each of the 100,000 iterations, the BLS regression coefficients are found to be significantly different from the theoretical values 0 and 1. The following figure shows this simulation process schematically.

[Figure 4.3: scheme of the classification of the 100,000 simulated global data sets into global sets where no significant differences are detected (case a) and global sets where significant differences are detected (case b), and of their individual data sets according to whether four or more of them do not (cases a1, b1) or do (cases a2, b2) show significant differences; see Table 2]
Figure 4.3. Scheme of the simulation process for detecting significant differences between the results of two methods by means of the joint test on the BLS regression coefficients, estimated by considering the results of all the analytes simultaneously or each analyte individually.

Table 2 shows the results obtained when the joint test is applied, at different levels of significance α, to the BLS regression coefficients obtained by considering the individual data sets generated from the three different types of initial data sets. There are four possible situations regarding the existence of significant differences in the individual sets:





                         Global data sets
                         Case a                    Case b
                         Individual data sets      Individual data sets
Data set    1−α (%)      Case a1      Case a2      Case b1      Case b2
1           90           99.65        0.34         98.88        1.12
            95           99.98        0.02         99.90        0.09
            99           100          0            100          0
            99.9         100          0            100          0

2           90           99.65        0.35         99.19        0.81
            95           99.98        0.02         99.86        0.14
            99           100          0            100          0
            99.9         100          0            100          0

3           90           99.56        0.44         94.91        5.09
            95           99.96        0.04         99.40        0.6
            99           100          0            100          0
            99.9         100          0            100          0

Table 2. Percentages of the individual data sets that make up the global data sets referred to in Table 1, according to whether the presence or absence of significant differences between the results of the two methods can be concluded.

a) For those global data sets in which no significant differences were detected between the results obtained by the two methods (see the percentages in Table 1), two possibilities may arise when the joint test is applied to the coefficients of the BLS regression line estimated for each of the individual sets:

a1) No significant differences are detected in the results contained in four or more individual sets.
a2) Significant differences are detected in the results contained in four or more individual sets.

b) For those global data sets in which significant differences were detected between the results of the two methods (see the percentages in Table 1), the same two possibilities described in point a) may arise when the joint test is applied to the coefficients of the BLS regression line estimated for each of the individual sets:

b1) No significant differences are detected in the results contained in four or more individual sets.
b2) Significant differences are detected in the results contained in four or more individual sets.

The results obtained for case a1 in Table 2 show that significant differences between the results of the two methods under comparison are correctly not detected when the joint test is applied to the BLS regression coefficients, whether these are estimated by considering the results of all the analytes together or those of each analyte individually. On the other hand, the high percentage obtained for case b1 for the three types of initial sets shows that detecting significant differences is very difficult when the joint test is applied to the BLS regression coefficients estimated from each of the different individual data sets. This happens because, in data sets where the number of points is small, the experimental error s² (eq. 1.37) is more likely to be overestimated.⁷ Since the size of the joint confidence interval for the BLS regression method is directly proportional to the magnitude of the experimental error, an overestimation of s² makes the joint confidence interval oversized.

Therefore, when working with individual data sets, the probability of not detecting significant differences between the results of two methods when they really exist, that is, of accepting H0 when the correct hypothesis is H1 (probability of a β error), is high. For this reason, in order to avoid a large probability of β error when comparing the results of two analytical methods capable of determining several analytes at once, all the experimental values must be considered when estimating the BLS regression coefficients to which the joint test is applied. Section 4.3 presents an example of the comparison of analytical methodologies for determining several analytes simultaneously, in the case of reversed-phase high-performance liquid chromatography techniques for the determination of biogenic amines in wines.

4.3 Validation of bias in multianalyte determination methods.
Application to RP-HPLC derivatizing methodologies.
(Analytica Chimica Acta 406 (2000) 257-278)

Àngel Martínez (1*), Jordi Riu (1), Olga Busto (2), Josep Guasch (2) and F. Xavier Rius (1)


1. Departament de Química Analítica i Química Orgànica. Institut d'Estudis Avançats. Facultat de Química. Universitat Rovira i Virgili.

2. Departament de Química Analítica i Química Orgànica. Unitat d'Enologia del CeRTA. Universitat Rovira i Virgili.

Keywords

HPLC, biogenic amines, method validation, linear regression, joint
confidence interval.

ABSTRACT

This paper reports a new approach for validating bias in analytical
methods that provide simultaneous results on multiple analytes. The
validation process is based on a linear regression technique taking into
account errors in both axes. The validation approach is used to individually
compare two different chromatographic methods with a reference one.
Each of the two methods to be tested is applied on a different set of data
composed of two real data sets each. In addition, three different kinds of
simulated data sets were used. All three methods are based on RP-HPLC
and are used to quantify eight biogenic amines in wine. The two methods
to be tested use different derivatizatizing procedures; precolumn 6-
aminoquinolyl-n-hydroxysuccinimidyl carbamate (AQC) and oncolumn o-
phtalaldehyde (OPA) respectively. On the other hand, the reference
method uses derivatization with OPA precolumn. Various analytes are
determined in a set of samples using each of the methods to be tested and
their results are independently regressed against the results of the reference
method. Bias is detected in the methods to be tested by applying the joint
confidence interval test to the slope and the intercept of the regression line
which takes into account uncertainties in the two methods being compared.
The conclusions about the trueness of the two methods being tested varied
according to whether the joint confidence interval test was applied to data
obtained from various biogenic amines considered simultaneously or
individually.

INTRODUCTION

Biogenic amines need to be determined in fermented beverages
because they are potentially toxic when consumed in large amounts [1].
Many methods for quantifying the biogenic amine content in food have
been described (e.g. gas chromatography [2,3] and HPLC techniques [4-6]).
However, procedures based on RP-HPLC have commonly been used as the
amines can be automatically injected [7], even without previous treatment
of the samples. Most of the RP-HPLC analytical methods used to determine
biogenic amines are based on derivatization reactions which improve the
selectivity and sensitivity of the different procedures.

The Office International de la Vigne et du Vin (OIV) has yet to
propose an official method of analysing biogenic amines in wines.
However, the most used method [7,8] and the only one that has been
validated for this kind of analysis [9], is the method that uses precolumn
derivatization with OPA. For this reason, in this study, this method has
been taken as the reference method. Nevertheless, alternative methods can
be used to analyse biogenic amines in wines, since they have some
advantages. For instance, the use of AQC as derivatizing reagent [10]
provides more stable compounds and greater selectivity than the method
using OPA. On the other hand, the oncolumn derivatization with OPA [11]
does not require the use of an automatic injector, which makes it more
affordable without loss of sensitivity. As an essential step in the validation
process [12] of the methods using AQC precolumn and OPA oncolumn as
derivatizing reagents, bias must be evaluated in order to assess the trueness
[13] of the methodologies. On the other hand, the evaluation of precision is also a relevant and complementary matter that will be addressed in future work.

The trueness of the two methodologies to be tested (AQC
precolumn and OPA oncolumn derivatization) at different concentration
levels can be assessed by comparing each of them to a reference method
(OPA precolumn derivatization), using linear regression. The analytical
results obtained by applying each one of the methodologies to be tested to
a set of samples containing the biogenic amines at different concentration
levels are regressed on the results obtained by the reference method from
the same set of samples. In this way, a straight line is expected. Should the
slope and the intercept values of the straight line not be significantly
different from unity and zero respectively, the methodology being tested
can be considered to be comparable to the OPA precolumn derivatization
one throughout the specific range. This comparison can be performed using
the joint confidence interval test for the slope and the intercept [14].

Traditionally, in order to compare analytical methods using linear
regression, the ordinary least squares (OLS) technique has been used to
find the regression coefficients. However, this technique only assumes the
presence of constant random errors (homoscedasticity) in one of the
methods in comparison, usually in the method being tested represented on
the y-axis, considering the other method free of random errors. Since non
constant errors (heteroscedasticity) are normally present in both methods,
bivariate least squares (BLS) regression techniques which consider errors in
both axes should be used. Recently a joint confidence interval test for the
slope and intercept which considers both homoscedastic and
heteroscedastic errors in both methods has been developed [15].

However, so far, only a single analyte per sample has been
considered when comparing methodologies at multiple concentration
levels using linear regression. Because the RP-HPLC methodologies make it
possible to analyse multiple analytes, in this paper we extend the technique
for assessing bias by method comparison using linear regression taking
into account the errors in both axes to multiple analytical determinations.
In this way, the joint confidence interval test can be applied to the
regression coefficients of the BLS straight line, which is obtained
considering the information of all the different amines in the sample, so
bias in the method to be tested considering all the analytes can be detected.

Four real data sets containing chromatographic data of eight
biogenic amines in red, white and rosé wines from the province of
Tarragona (Spain) were used to check the existence of bias in the two RP-
HPLC derivatizing methods to be tested. It is shown that the conclusions
drawn from statistically analysing the data from individual amines cannot
be used to assess the validity of the AQC precolumn and OPA oncolumn
derivatization methodologies for analysing the whole set of biogenic
amines.

BACKGROUND AND THEORY

Notation

The true values of the bivariate least-squares regression coefficients are represented by a (intercept) and b (slope), while their respective estimates are denoted as â and b̂. The number of experimental data pairs (x_i, y_i) from the analysis of the different analytes at the different levels of concentration is denoted as n. The experimental error associated to the regression line (i.e. residual error), expressed in terms of variance for the experimental data pairs, is referred to as σ², while its estimate will be s². Likewise, ŷ_i will represent the estimated value of y_i.

Bivariate least-squares regression (BLS)

Bivariate least-squares is the generic name given to a set of straight
line regression techniques applied to data containing errors in both axes. Of
all the approaches for calculating the regression coefficients, Lisý's method is one of the most suitable [16].

The regression technique minimises the sum
of the weighted residuals, S, expressed in eq. 1:

S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{w_i} \qquad (1)

where the weighting factor w_i takes into account the variances of each (x_i, y_i) data pair, s_{x_i}² and s_{y_i}², according to eq. 2:

w_i = s_{y_i}^2 + \hat{b}^2 s_{x_i}^2 \qquad (2)

and the estimate of the experimental error is defined as:


s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / w_i}{n-2} \qquad (3)

Therefore, the bivariate regression method assigns larger weighting factors (i.e. less importance) to those data pairs with larger s_{x_i}² and/or s_{y_i}² values, that is to say, the most imprecise data pairs.

By minimising the sum of the weighted residuals, two non-linear equations are obtained, from which the regression coefficients â and b̂ can be found by means of a quick iterative process [17].
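As an illustration only, and not the authors' original Matlab subroutines, the BLS estimation described by eqs. 1-3 can be sketched in a few lines of Python: the weighted residual sum S is minimised numerically, with the weighting factor of each data pair re-evaluated from the current slope. The helper and variable names (bls_fit, sx2, sy2) are assumptions made for the example.

```python
# Minimal sketch of a BLS fit (eqs. 1-3); assumed helper names, not the
# authors' code. S is minimised numerically over (a, b), with
# w_i = s_yi^2 + b^2 * s_xi^2 re-evaluated for every candidate slope.
import numpy as np
from scipy.optimize import minimize

def bls_fit(x, y, sx2, sy2):
    """Return intercept a, slope b, experimental error s2 and weights w."""
    def S(coef):                               # eq. 1, weighted residual sum
        a, b = coef
        w = sy2 + b**2 * sx2                   # eq. 2
        return np.sum((y - a - b * x)**2 / w)

    a0, b0 = np.polyfit(x, y, 1)[::-1]         # OLS coefficients as start values
    a, b = minimize(S, x0=[a0, b0], method="Nelder-Mead").x
    w = sy2 + b**2 * sx2
    s2 = np.sum((y - a - b * x)**2 / w) / (len(x) - 2)   # eq. 3
    return a, b, s2, w
```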

Joint confidence interval test

In order to compare two multianalyte methodologies using
bivariate linear regression, the analytical results obtained from a set of
samples by the method to be tested are regressed on those obtained by the
already established methodology. Different analytes at different
concentration levels are considered to generate the straight line. If neither of the straight-line regression coefficients statistically differs from the unity slope and zero intercept, the results produced by the two methodologies will not be considered to be statistically different at a given level of significance α.

The joint confidence interval test for the slope and the intercept
considering errors in both axes [15] is used to test whether there are
significant differences between the regression coefficients and the
theoretical values of zero intercept and unity slope. This consists of
checking the presence of the theoretical point zero intercept and unity slope
within the limits of the elliptical-shaped joint confidence region defined by
eq. 4:

\sum_{i=1}^{n} \frac{(\hat{a}-a)^2}{w_i} + 2\sum_{i=1}^{n} \frac{x_i(\hat{a}-a)(\hat{b}-b)}{w_i} + \sum_{i=1}^{n} \frac{x_i^2(\hat{b}-b)^2}{w_i} = 2\,s^2\,F_{1-\alpha(2,\,n-2)} \qquad (4)

where F_{1-α(2, n-2)} is the tabulated F-value at a level of significance α with 2 and n-2 degrees of freedom. When using eq. 4 on a data set generated from the individual analysis of a certain type of amine (from now on, individual data set), the value of n indicates the number of different concentration samples (five in this work). If, on the other hand, the data set contains data from the simultaneous analysis of the different amines considered (from now on, global data sets), the variable n indicates the overall number of samples analysed (number of samples x number of analytes). Only if the theoretical point falls inside the elliptical joint confidence region delimited by eq. 4 can it be concluded that there are no significant differences between the two methodologies.
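For illustration, and under the same assumptions as the previous sketch, the joint confidence interval test of eq. 4 reduces to evaluating its left-hand side at the theoretical point (0, 1) and comparing it with 2·s²·F. The helper below (joint_test, a hypothetical name) returns True when no significant bias is detected.

```python
# Sketch of the joint confidence interval test (eq. 4); hypothetical helper.
import numpy as np
from scipy.stats import f

def joint_test(a_hat, b_hat, x, w, s2, alpha=0.05, a0=0.0, b0=1.0):
    """True if the theoretical point (a0, b0) lies inside the joint region."""
    n = len(x)
    lhs = (np.sum((a_hat - a0)**2 / w)
           + 2 * np.sum(x * (a_hat - a0) * (b_hat - b0) / w)
           + np.sum(x**2 * (b_hat - b0)**2 / w))
    return lhs <= 2 * s2 * f.ppf(1 - alpha, 2, n - 2)
```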

The size of the joint confidence region for a given level of significance α depends directly on the estimate of the experimental error. In this way, when few experimental data are available, the values of s² are usually overestimated [18]. This increase in uncertainty is due to the lack of information inherent to a small number of data pairs or, in some cases, to the lack of fit of the experimental data to the BLS regression line [19]. In these cases the joint confidence region is oversized. This may prevent a possible bias from being detected in the method being tested, because there is a higher probability that the theoretical point (0,1) falls within the joint confidence interval. In other words, in these situations there is a higher probability of committing a β error when applying the joint confidence interval test [20].

Evaluating bias

An earlier study assessed the correctness of the joint confidence
interval test for detecting bias in method comparison studies considering
uncertainties in both axes and one single analyte at different concentration
levels [15]. New studies based on simulated data generated using the
Monte Carlo method [21,22] and reproducing typical results from
multianalyte determination methods have been carried out to show the
correctness of the joint confidence interval test when detecting bias in
multianalyte determination methods. Moreover real data sets have also
been used to provide application examples.

In this way, for both real and simulated data sets, bias was detected in the method being tested when the theoretical point (0,1) fell outside the joint confidence region at a given level of significance α. The conclusions drawn about bias from the global data sets were compared with the conclusions drawn from the joint confidence interval test applied to individual data sets (i.e. those that only contain data corresponding to a single analyte). In this latter case, the results from the method being tested were considered biased when, for more than half of the analytes (four or more), the theoretical point (0,1) fell outside the joint confidence region at the same level of significance α.
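A short sketch of how these two decision rules might be coded is given below. It relies on the bls_fit and joint_test helpers sketched in the previous sections (both hypothetical names) and, for the individual approach, simply counts how many analytes lead to rejection.

```python
# Sketch of the individual vs. global decision rules described above.
# Requires the bls_fit and joint_test helpers sketched earlier.
import numpy as np

def evaluate_bias(per_analyte, alpha=0.05):
    """per_analyte: dict of analyte -> (x, y, sx2, sy2) arrays of equal length."""
    rejections = 0
    for x, y, sx2, sy2 in per_analyte.values():
        a, b, s2, w = bls_fit(x, y, sx2, sy2)
        if not joint_test(a, b, x, w, s2, alpha):
            rejections += 1                          # bias flagged for this analyte
    # Global data set: pool all analytes at all concentration levels.
    x, y, sx2, sy2 = (np.concatenate([v[k] for v in per_analyte.values()])
                      for k in range(4))
    a, b, s2, w = bls_fit(x, y, sx2, sy2)
    bias_global = not joint_test(a, b, x, w, s2, alpha)
    bias_individual = rejections > len(per_analyte) / 2   # "more than half" rule
    return bias_individual, bias_global
```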

EXPERIMENTAL SECTION

Eight biogenic amines were initially considered in this research
work: histamine, methylamine, tyramine, ethylamine, phenethylamine, 1,4-
diaminobutane (putrescine), 1,5-diaminopentane (cadaverine) and 3-
methylbutylamine. They were all perfectly resolved by both OPA-
derivatization methods for the different concentration samples. Figures 1a
and 1b show the chromatograms obtained with both methods for a 3 ppm
standard addition in a red wine sample. As can be seen, the eight peaks
corresponding to the eight amines appear perfectly resolved along with
other compounds which do not interfere in the analysis. Figure 1c shows
the chromatogram obtained when the same 3 ppm standard addition
sample was analysed using the AQC-derivatization method. As can be
seen, putrescine is partially overlapped with an interfering compound
when the analysis was performed with this method. This overlap did not
allow putrescine to be quantified by the AQC-derivatization method. For
this reason this analyte was not considered in the subsequent evaluation of
bias between OPA precolumn and AQC derivatization methods. Moreover, the low peak resolution for tyramine, phenethylamine and 3-methylbutylamine shown by the AQC-derivatization method (Fig. 1c) is mostly due to the modification of the gradient program, given the relatively high number of analyses that had to be carried out.

Figure 1. Chromatograms for the OPA precolumn (a) and OPA oncolumn (b) derivatization methods in the analysis of the biogenic amines in 3 ppm standard addition red wine samples. [Chromatogram traces not reproduced; signal vs. time (min) with the eight amine peaks labelled 1-8.]
Figure 1 (cont.). Chromatogram from the AQC (c) derivatization method in the analysis of the biogenic amines in 3 ppm standard addition red wine samples. Peak identification: 1, histamine; 2, methylamine; 3, tyramine; 4, ethylamine; 5, phenethylamine; 6, putrescine; 7, 3-methylbutylamine; 8, cadaverine. [Chromatogram trace not reproduced.]

Chemicals and reagents

All the biogenic amines were supplied by Aldrich-Chemie (Beerse, Belgium). An individual standard solution of 2000 mg l⁻¹ of each amine was prepared in HPLC-grade acetonitrile (Scharlau, Barcelona, Spain) and stored in darkness at 4 °C. A working standard solution containing all the amines was prepared with an aliquot of each solution and subsequently diluted with synthetic wine (3.5 g of tartaric acid in a 12% hydroalcoholic solution, with the pH adjusted to 3.5) in a volumetric flask.
More diluted solutions used in the different studies were prepared by
diluting this standard solution with the synthetic wine.

The Milli-Q quality water (Millipore, Bedford, USA) used in the
chromatographic experiments was filtered through a 0.45 µm nylon
membrane. The methanol, tetrahydrofurane and sodium acetate used to
prepare the mobile phases were of HPLC grade (Scharlau).

For the automatic derivatization methods, the AccQFluor Reagent
Kit (Waters, Milford, MA, USA) and o-phthalaldehyde/mercaptoethanol
(Aldrich) were used as described in previous studies [10,23].

Equipment

Chromatographic experiments were performed using a Hewlett-
Packard (Waldbronn, Germany) 1050 liquid chromatograph with a Hewlett
Packard model 1046A fluorescence detector. In the precolumn
derivatization methods, the samples were derivatized and injected with a
Hewlett Packard Series 1050 automatic injector. Separation of the amine
derivatives was performed using an ODS Basic cartridge (250 x 4.6 mm i.d.,
particle size 5 µm) supplied by Hewlett Packard. In the OPA oncolumn
derivatization method, separation was performed using an Asahipack OP-
50 cartridge (250 x 4.6 mm i.d., particle size 5 µm) also supplied by Hewlett
Packard.

High-performance liquid chromatographic methods

For the oncolumn derivatization method, three solvent reservoirs
containing the following eluents: (A) ACN, (B) 5 mM borate solution (pH 9)
with 12 mM OPA-NAC and (C) 5 mM borate solution (pH 9) with 1% THF
were used to separate all the amines. The HPLC gradient elution is
described below:

OPA oncolumn gradient program

This is a slight modification of the program presented in a previous
study [11]. It began with an isocratic elution of 16% of solution A and 74%
of solution C for 10 minutes, followed by linear gradient elution from these
percentages to 21% and 79% in 30 seconds. This composition was
maintained for 17 minutes before changing to a 23.5% and 76.5% of
solutions A and C in 30 seconds. Finally, another isocratic elution with this
latter composition was applied until minute 40. Determination was
performed at 40 °C with a flow-rate fixed at 0.8 ml min⁻¹. The eluted
derivatives were detected by monitoring their fluorescence using 340 nm
and 450 nm as the excitation and emission wavelengths, respectively.
Under these conditions all eight amines were eluted in under 40 minutes.

Both precolumn derivatization methods used the same mobile
phases and consisted of two solvent reservoirs which contained (A) 0.05M
sodium acetate in 1% THF and (B) methanol. Nevertheless, the gradients
used were different and adjusted as described below:

OPA precolumn gradient program

This is a procedure modified from the one reported in [8] in order to adjust the analysis time to the number of samples of the eight biogenic amines to be analysed. The computer program started with 40% of
methanol in the mobile phase and finished 30 minutes later with a 100% of
this solvent. Finally, the column was cleaned with an isocratic elution at
this percentage of methanol for 2 more minutes. Determination was
performed at 60 °C with a flow-rate of 1 ml min⁻¹ and the eluted OPA-
derivatives were detected by monitoring their fluorescence at excitation
and emission wavelengths of 330 nm and 445 nm, respectively. Under these
conditions all seven amines were eluted in less than 25 minutes.

AQC gradient program

This program [10] consisted of a linear gradient elution from 35% to
100% of methanol in 13 minutes. Then, the column was cleaned up by
eluting 100% of methanol for 2 more minutes. The eluted AQC-derivatives
were detected by monitoring their fluorescence using excitation and
emission wavelengths of 250 nm and 395 nm respectively. The flow rate
was set at 1 ml min⁻¹. Under these conditions all seven amines were eluted
in less than 15 minutes.

Derivatization

Both precolumn derivatization methods were fully automated by
means of two injector programs. The derivatization reagents and the
samples were drawn sequentially into the injection needle. The reactants
were then mixed, injected into the column and separated using the gradient
elutions described above. On the other hand, the derivatization in the OPA
oncolumn method was performed by adding OPA-NAC to the mobile
phase, as explained above.

Samples

In order to ensure that the possible differences between the methods
were due to the experimental methodologies rather than to the samples,
identical groups of samples of red, rosé and white wines from different zones of Tarragona were prepared for the three methods. The procedure developed was as follows. Three wines were chosen: one red, one rosé and one white. A stock solution containing 100 ppm of the amines in synthetic wine was prepared. The samples to be injected were prepared in 25 ml volumetric flasks by adding 0, 250, 750, 1,250, 1,750 and 2,500 µl of the working standard solution and bringing to volume with the red, the rosé and the white wine in each case. In this way, the final 18 solutions of 25 ml
each had a biogenic amine concentration between 1 and 10 ppm. Six
aliquots of 1 ml were taken from each one of the volumetric flasks and
frozen. Finally, they were injected on alternate days, and analysed
according to the methodologies described.

Calibration experiments

In order to verify the linearity of the response of the different
derivatives at the previously specified wavelengths for the working
concentrations (0.5 to 15 mg l⁻¹), standard solutions of amines were
prepared in synthetic wine. Calibration curves for each amine were
constructed by plotting the amine peak-area against the amine
concentration. As in previous studies [8,10,11], linear least-squares
regression was used to calculate the calibration parameters.
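As a minimal illustration (the concentration and peak-area values below are invented for the example, not taken from the study), such a calibration line for one amine can be obtained with an ordinary least-squares fit:

```python
# Sketch of an OLS calibration line for one amine; data values are assumed.
import numpy as np

conc = np.array([0.5, 2.0, 5.0, 10.0, 15.0])        # mg/l, assumed standards
area = np.array([12.0, 49.0, 121.0, 238.0, 361.0])  # peak areas, assumed
slope, intercept = np.polyfit(conc, area, 1)         # calibration parameters
r = np.corrcoef(conc, area)[0, 1]                    # quick linearity check
```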

Data sets and software

Simulated data sets

Three simulated global data sets were used to prove the correctness
of the validation technique for multianalyte determination methods based
on the joint confidence interval test. Two of them simulated a situation in
which two multianalyte determination methods provide identical results.
In the other one results from the method being tested were chosen to be
biased in comparison to the results from the reference method. Moreover
the individual uncertainties associated to the data pairs in each simulated
global and individual data set were different; homoscedasticity,
proportional heteroscedasticity and random heteroscedasticity were
considered.
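The exact simulation settings are not reproduced here, but a Monte Carlo generator of this kind can be sketched as follows; the concentration levels, error sizes and the bias added to the biased set are assumptions made for the example.

```python
# Sketch of Monte Carlo generation of a global data set with the three error
# structures mentioned above; numerical settings are assumed for illustration.
import numpy as np
rng = np.random.default_rng(0)

true = np.tile([1.0, 2.5, 5.0, 7.5, 10.0], 7)     # e.g. 7 analytes x 5 levels

def noisy(values, scheme):
    if scheme == "homoscedastic":
        sd = np.full(values.shape, 0.2)
    elif scheme == "proportional":
        sd = 0.03 * values
    else:                                          # random heteroscedasticity
        sd = rng.uniform(0.05, 0.4, size=values.shape)
    return rng.normal(values, sd), sd**2

x, sx2 = noisy(true, "proportional")               # reference method results
y, sy2 = noisy(true, "proportional")               # unbiased tested method
y_biased = 0.3 + 0.9 * y                           # constant plus proportional bias
```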

Real data sets

Four real data sets were used to show the different conclusions
reached about the correctness of a multianalyte determination method
when considering the experimental data for all the analytes at the different
concentration levels jointly (i.e. global data sets) or independently (i.e.
individual data sets).
Data Sets 1 and 2. Made up of the results obtained from analysing
seven biogenic amines (histamine, methylamine, tyramine, ethylamine,
phenethylamine, 3-methylbutylamine and cadaverine) in red (data set 1)
and white (data set 2) wines spiked at five different concentration levels.
The seven individual data sets were each composed of five data pairs (Figs.
2a-2g for red and 3a-3g for white wines respectively) and the global data
set was built up by joining the individual data sets (Figs. 2h for red and 3h
for white wines respectively). In this way, the global data set contained 35
data pairs distributed along a linear range between 0 and 14 ppm. Two
different RP-HPLC derivatization procedures were used as analytical
methods in this comparative study; OPA precolumn as the reference
method and AQC precolumn as the method to be tested. The uncertainties
present in both axes were a result of a six replicate analysis at each data
pair.

Figure 2. Data sets obtained from analysing the seven biogenic amines in red wines using the AQC and OPA precolumn derivatizing methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) global data set. AQC precolumn results are plotted against OPA precolumn results. Individual uncertainties from the six replicate analyses of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values. [Plots not reproduced.]

Figure 3. Data sets obtained from analysing the seven biogenic amines in white wines using the AQC and OPA precolumn derivatizing methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) global data set. AQC precolumn results are plotted against OPA precolumn results. Individual uncertainties from the six replicate analyses of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values. [Plots not reproduced.]

Data Sets 3 and 4. Because putrescine was perfectly resolved by both
OPA precolumn (reference method) and OPA oncolumn (method to be
tested) RP-HPLC derivatizing methods, all eight biogenic amines could be
analysed in this comparison study in rosé (data set 3) and red wines (data
set 4). For this reason, the global data set for data set 3 (Fig. 4i) and for data
set 4 (Fig. 5i) consisted of forty data pairs from the eight individual data
sets (Figs. 4a-4h for rosé and 5a-5h for red wines respectively) in which five
levels of concentration were considered. The linear range spans from 0 to
14 ppm. The uncertainties for each data pair in both axes were generated
from a six replicate analysis with each method.

Figures 4 and 5. Data sets obtained from analysing the eight biogenic amines in rosé and red wines respectively using the OPA oncolumn and OPA precolumn derivatizing methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) putrescine, (i) global data set. OPA oncolumn results are plotted against OPA precolumn results. Individual uncertainties from the six replicate analyses of each sample by both methods are symbolized as the horizontal and vertical lines around the dots that represent the mean values. [Plots not reproduced.]

All the computational work was performed with home-made
Matlab subroutines (Matlab for Microsoft Windows ver. 4.0, The
Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Simulated data sets

Our results have shown that the joint confidence interval test
provides correct results when different analytes at different concentration
levels (i.e. global data sets) are considered. In addition, when bias was
detected in biased global data sets, it was not often detected in most of the
corresponding individual data sets. In this way, when the method being
tested is used to determine various analytes simultaneously, conclusions
about the presence of bias may be wrong if the joint confidence interval test
is only applied on data sets which contain single analyte data. This is
because overestimated values of the experimental error are likely when
few experimental data is considered. This makes the joint confidence region
too large and thus increases the probability of not detecting the existing
bias. Simulation results are available on request.

Real data sets

Data Set 1. The results of applying the joint confidence interval test
to the individual data sets show significant differences between both
methods only for the histamine (Fig. 6a) and the phenethylamine (Fig. 6e)
at a level of significance of 5%. That is, there are no significant differences
between the two methodologies tested (the theoretical point (0,1) falls
inside the joint confidence region) when determining five of the seven
biogenic amines tested in red wines. So, because no bias was detected for most of the single analytes tested, it could be concluded from an individual testing approach that the RP-HPLC multianalyte determination method using AQC provides correct results when simultaneously analysing the
seven biogenic amines in red wines. On the contrary, when the joint
confidence interval test is applied on the global data set (Fig. 6h) for the
same level of significance stated above, bias between both methods is
detected. The large distance between the theoretical point (0,1) and the boundary of the joint confidence region observed in Figure 6h indicates that the results from the RP-HPLC derivatization methodology using AQC show an important bias. For this reason, the experimenter needs to review the AQC derivatization methodology in search of possible errors, so that bias in the analytical results can be reduced.

Results from the joint confidence test show that, to check the presence of bias in a multianalyte determination method, all the experimental data should be considered. This is because when more data are handled, better estimates of the experimental error are obtained. This avoids the oversizing of the joint confidence region due to high experimental random errors and thus reduces the probability of not detecting the presence of an existing bias (i.e. β error).

Figure 6. Joint confidence regions (slope vs. intercept) for the BLS regression coefficients spanned for the data sets obtained from analysing red wines using the AQC and OPA precolumn derivatization methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) global data set. The level of significance was set at 5% in all cases. [Plots not reproduced.]
Figure 7. Joint confidence regions (slope vs. intercept) for the BLS regression coefficients spanned for the data sets obtained from analysing white wines using the AQC and OPA precolumn derivatization methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) global data set. The level of significance was set at 5% in all cases. [Plots not reproduced.]

Data Set 2. The joint confidence interval test applied to individual
analytes shows that there are statistical differences between the two
chromatographic methods at a 5% level of significance only for the
methylamine (Fig. 7b). In this way, as explained in data set 1, no bias would
be detected in the method being tested considering single analyte data in
white wines. Likewise, looking at the joint confidence region for the global
data set (Fig. 7h), neither is bias detected (the theoretical point (0,1) falls
inside the joint confidence region) when comparing the two methods using
linear regression taking into account all the analytes and the errors
associated to their determination. So in this example it could be concluded
that the RP-HPLC method using AQC provides correct results when
simultaneously analysing the seven biogenic amines in white wines.

Data Set 3. Only in two of the eight biogenic amines analysed in rosé wines (histamine and cadaverine) was bias detected for a level of significance of 5% (Figs. 8a and 8g). As in data set 1, we may conclude from these results that the RP-HPLC method being tested using oncolumn OPA derivatization provides correct results when analysing the eight biogenic amines in rosé wines for a level of significance of 5%, if the experimental data are considered separately. On the other hand, as in data set 1, if the joint confidence interval test is applied to all the experimental data available, bias is detected in the OPA oncolumn derivatization method for the same level of significance (Fig. 8i). This confirms that overestimated values of the experimental error s², often obtained in data sets with a low number of data pairs, can generate oversized joint confidence regions and prevent the bias in the method from being accurately detected. This example, like the one in data set 1, shows that if we had finally concluded that the method being tested provided correct results in the simultaneous analysis of the biogenic amines in rosé wines, we would have accepted a biased multianalyte determination method (i.e. β error).

In this case, however, although significant differences between the results from both methods have been found when simultaneously analysing the eight amines in rosé wines, the experimenter may decide not to revise the OPA oncolumn derivatization methodology. This is because in this case the distance between the point (0,1) and the boundary of the joint confidence region is small. In this way, the experimenter may find it acceptable to set a level of significance lower than the initial 5%, for which no bias would be detected in the experimental results. In such a case, the experimenter should be aware of the consequences of setting low levels of significance in terms of an increase in the probability of committing a β error.

Figures 8 and 9. Joint confidence regions (slope vs. intercept) for the BLS regression coefficients spanned for the data sets obtained from analysing rosé and red wines respectively using the OPA precolumn and OPA oncolumn derivatization methods: (a) histamine, (b) methylamine, (c) tyramine, (d) ethylamine, (e) phenethylamine, (f) i-amylamine, (g) cadaverine, (h) putrescine, (i) global data set. The level of significance was set at 5% in all cases. [Plots not reproduced.]

Data Set 4. As in data set 3, bias was detected in only two of the
eight biogenic amines (cadaverine and putrescine) for a level of significance
of 5% (Figs. 9g and 9h). We would therefore conclude that there is no bias
in the results from the OPA oncolumn RP-HPLC derivatizing method if the
results of analysing the eight biogenic amines in red wines are treated
separately. In this example, like in data set 2, this conclusion is confirmed
when the joint confidence interval test is applied to the results from the
analysis of the eight amines taken simultaneously (Fig. 9i), because the
theoretical point (0,1) falls inside the joint confidence region at the same
level of significance. We may therefore conclude that the RP-HPLC method
using oncolumn OPA derivatization produces correct results when the
eight biogenic amines in red wines are simultaneously analysed.

CONCLUSIONS

The detection of bias is an important step in the process of
validating an analytical method. Bias in methods that provide
simultaneous results on multiple analytes at different concentration levels
can be detected with regression analysis using bivariate least squares (BLS)
and the joint confidence interval test. Systematic errors should be evaluated
considering all the data pairs produced by the two methods when
determining all the analytes. Otherwise, there is a higher probability that the multianalyte methodology will be erroneously interpreted as being valid (i.e. β error) when the joint confidence interval test is applied to single-analyte data.

The probability of committing a β error is higher when there are few data pairs in a data set, because in these cases there is a higher probability that the value of the experimental error s² is overestimated. This generates oversized joint confidence regions, which provides a higher probability for the theoretical point zero intercept and unity slope to fall inside the joint confidence region and, therefore, a higher probability of falsely accepting biased methods. This clearly shows that the joint
confidence interval test should be applied on data sets containing all the
information from the different analytes determined by the analytical
methodologies.

The application of the joint confidence interval test to the results of
analysing the seven biogenic amines (histamine, methylamine, tyramine,
ethylamine, phenethylamine, 1,5-diaminopentane (cadaverine) and 3-
methylbutylamine) in wines with the AQC precolumn derivatization RP-
HPLC method reveals that bias was detected when analysing red wines,
but not when analysing white wines. On the other hand, 1,4-
diaminobutane (putrescine) could be quantified using the OPA oncolumn
derivatization method. For this reason the joint confidence interval test
could be applied to the results of analysing all eight biogenic amines. This
showed that bias was detected when analysing rosé wines, but not when
analysing red wines.

Despite the suitability of the present approach which is based on the
regression technique considering errors in both methods, researchers
should be aware of two weaknesses. The first, inherent in any BLS
regression technique, is that the uncertainties of all the results of the
analysis need to be known [24]. The second is that the regression technique
is not very robust in the presence of outliers with low individual
uncertainty. This limitation of the BLS regression method will be addressed
in future works.

ACKNOWLEDGMENTS

We would like to thank the DGICyT (project no. BP96-1008) and the
CICyT (project no. ALI97-0765) for financial support, and the Rovira i
Virgili University for providing a doctoral fellowship to A. Martínez.

BIBLIOGRAPHY

[1]. J. Stratton, R. Hutkins and S. Taylor. J. Food Protec. 54 (1991) 460.
[2]. P.L. Rogers and W. Staruszkiewicz. J. AOAC Internat. 80 (1997) 591.
[3]. Bonilla, L.G. Enríquez and H.M. Nair. J. Chromatogr. Sci. 35 (1997) 53.
[4]. W.J. Hurst. J. Liq. Chromatogr. 13 (1990) 1.
[5]. O. Busto, J. Guasch and F. Borrull. J. Internat. Sci. Vigne Vin 30 (1996) 85
[6]. P. Lehtonen. Am. J. Enol. Vitic. 47 (1996) 127.
[7]. P. Lehtonen, M. Saarinen, M. Vesanto and M.L. Riekkola. Z Lebensm
Unters Forsch 194 (1992) 434.
[8]. O. Busto, M. Mestres, J. Guasch and F. Borrull. Chromatographia 40
(1995) 404.
[9]. M.J. Pereira and A. Bertrand. Bull. OIV 765-766 (1994) 918.
[10]. O. Busto, J. Guasch and F. Borrull. J. Chromatogr. 737 (1996) 205.
[11]. O. Busto, M. Miracle, J. Guasch and F. Borrull. J. Chromatogr. 757
(1997) 311.
[12]. Eurachem/Welac Guide 1. Accreditation of Chemical Laboratories.
Laboratory of the Government Chemist, London 1993.
[13]. ISO 5725-6: 1994(E). Geneva. 1994.
[14]. J. Mandel and F. J. Linnig. Anal Chem. 29 (1957) 743.
[15]. J. Riu and F. X. Rius. Anal. Chem.; 68 (1996) 1851.
[16]. J. M. Lis, A. Cholvadova and J. Kutej. Comput Chem.; 14 (1990) 189.
[17]. J. Riu and F. X. Rius. J. Chemom. 9 (1995) 343.
[18]. G.J. Hahn and W. Q. Meeker. Statistical Intervals, a guide for
practitioners; John Wiley & Sons: New York, 1991; p. 39.
[19]. A. Martínez, J. Riu and F. X. Rius. In preparation.
[20]. A. Martínez, J. Riu and F. X. Rius. In preparation.
[21]. P. C. Meier, R. E. Zünd. Statistical Methods in Analytical Chemistry;
John Wiley & Sons: New York, 1993; pp. 145-150.
[22]. O. Güell, J.A. Holcombe, Analytical applications of Monte Carlo
techniques, Anal. Chem. 60 (1990) 529A - 542A.
[23]. O. Busto, J. Guasch and F. Borrull. J. Chromatogr. 718 (1995) 309.
[24]. R. J. Carroll and D. Ruppert. Amer. Stat. 50 (1996) 1.


4.4 Probability of β error in the joint test

As pointed out above, committing a β error in the comparison of the results of two analytical methods may lead to considering that the candidate analytical method is traceable to the reference method when, in reality, the results of the candidate method are biased. Depending on the type of analytical problem being dealt with, the consequences of accepting an analytical method that provides biased results can be very serious. In these cases it will be preferable to make more experimental measurements in order to obtain a better estimate of the experimental error s² and, therefore, to run a lower risk of accepting a candidate method that may give biased results.

It is for this reason that a calculation method has been developed to estimate the probability of committing a β error when the joint confidence test is applied to the BLS regression coefficients. The theoretical conception of the probabilities of committing a β error in the case of the joint test is analogous to that presented for the individual tests in Chapter 3. It is important to bear in mind that in this case the starting point is not a one-dimensional individual confidence interval; the joint confidence interval has two dimensions. Thus, it will be necessary to find the mathematical expression that characterises the three-dimensional joint confidence distribution associated with the theoretical regression coefficients that define both H0 and H1 (section 4.5). This joint confidence distribution is analogous to the Student's t distribution used in the individual tests and relates, in a three-dimensional space, all the levels of significance from 0 to 1 to the values of the regression coefficients that satisfy equation 9 of section 4.5. In this way, the probability of committing a β error will not be determined by an area, as in the individual tests, but by a volume.

The theoretical foundation needed to understand the probability of committing a β error in the joint test is detailed in the section "Probabilities of α and β error in the joint confidence interval test" of section 4.5. In addition, the expression of the triple integral that allows it to be estimated is presented (equation 12), and it is shown by means of simulated data sets that the estimates of the β error probabilities are correct. Finally, the probability of committing a β error when applying the joint test to the BLS regression coefficients is also estimated for real data sets.


4.5 Evaluating bias in method comparison studies using
linear regression with errors in both axes (Journal of
Chemometrics, accepted for publication).

Àngel Martínez*, Jordi Riu and F. Xavier Rius

Department of Analytical and Organic Chemistry.
Institute of Advanced Studies. Universitat Rovira i Virgili.
Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.

KEYWORDS

Method bias, probability of β error, method comparison, linear regression,
errors in both axes.

ABSTRACT

This paper presents a theoretical background for estimating the probability of committing a β error when checking the presence of method bias. Results obtained at different concentration levels from the analytical method being tested are compared by linear regression with the results from a reference method. Method bias can be detected by applying the joint confidence interval test to the regression line coefficients from a bivariate least squares (BLS) regression technique, which finds the regression line considering the errors in the two methods. We have validated the estimated probabilities of a β error by comparing them with the experimental values from twenty-four simulated data sets. We also compared the probabilities of β error estimated using the BLS regression method on two real data sets with those estimated by using ordinary least squares (OLS) and weighted least squares (WLS) regression techniques for a given level of significance α. We found that there were important differences in the values predicted with WLS and OLS compared to those predicted with the BLS regression method.

INTRODUCTION

Assessing the accuracy of a new analytical method¹ is an essential
part of method validation studies. This can be done, for example, by
comparing the results of the method being tested with those from a
reference method. Results for a particular analyte, obtained at different
concentration levels, can be evaluated by linear regression. The results from
the method being tested (usually on the y axis) are normally regressed onto
those of the reference method (usually on the x axis). The methodology
being tested will be considered correct over the specified range only if the
slope and the intercept of the straight line are not statistically different from
their reference values of unity and zero respectively. This can be checked
using the joint confidence interval test for the slope and the intercept.²


So far, the most common regression techniques for finding the
regression line between the results of the two methods are ordinary least-
squares (OLS) and weighted least-squares (WLS). Both techniques consider
the predictor variable to be error-free. OLS assumes constant errors while
WLS considers nonconstant ones in the response variable. An alternative is
errors-in-variables regression,³⁻⁶ known as the constant variance ratio (CVR) approach, which considers the errors in both axes. It does not take into account the individual uncertainties of each experimental point but considers the ratio of the variances of the response and predictor variables to be constant for every experimental point (λ = s_y²/s_x²). A particular case of the CVR approach is orthogonal regression (OR),⁷ in which the errors are of the same order of magnitude in the response and predictor variables (i.e. λ = 1). However, the best option is to use bivariate least squares regression techniques (BLS), since they consider the individual non-constant errors in both axes. Recently, a joint confidence interval test for the slope and the intercept considering the BLS regression conditions has been developed.⁸


We know that two kinds of errors may arise from hypothesis tests. Type I (or α) error is the one which is often considered and, within the field of analytical chemistry, deals with wrongly accepting the presence of bias in the results of the analytical method being tested. Type II (or β) error, on the other hand, occurs when its presence is wrongly denied. Although it has not yet been extensively introduced into routine chemical analysis practice, information about the probability of committing a β error is in many cases as important, if not more important, than that provided by the level of significance α. The consequences of introducing bias in future chemical analysis results as a consequence of erroneously accepting a biased method may even affect a laboratory's reputation. Despite its importance, to our knowledge there is no statistically based foundation for estimating the probability of β error in method comparison studies. In this paper we present the theoretical background for estimating these probabilities when comparing the results of two analytical methods at multiple levels of concentration and taking into account the individual errors in both methods (i.e. applying the joint confidence interval test with the BLS technique). We also validate the results from our theoretical expressions and include two practical examples using real data. This technique is not only applicable in method comparison studies but also when the results from two analysts, laboratories or techniques are to be compared.

To validate the expressions we used simulated and real data sets. The simulated data sets, generated by the Monte Carlo method,⁹,¹⁰ were chosen to reproduce usual data set structures found in real analysis data. Applying the technique to two real data sets demonstrated that the probability of β error using OLS and WLS regression techniques can be very different from that obtained with the BLS regression method when errors in

BACKGROUND AND THEORY

Notation

In general, the true values of the different variables used in this
work are represented with Greek characters, while their estimates are
denoted with Latin letters. In this way, the true values of the regression coefficients are represented by β₀ (intercept) and β₁ (slope), while their respective estimates are denoted as a and b. The estimates of the standard deviations of the regression coefficients are s_a and s_b. The experimental error, expressed in terms of variance for the n experimental data pairs (x_i, y_i), is σ², while its estimate is s². Analogously, ŷ_i is the prediction of the experimental y_i.

For the joint confidence interval tests, a_H0, b_H0, a_H1 and b_H1 are the values of the theoretical regression coefficients that define the points (a_H0, b_H0) and (a_H1, b_H1), from which the null and alternative hypotheses (H0 and H1) are postulated. Their respective standard deviation estimates are s_{a_H0}, s_{b_H0} and s_{a_H1}, s_{b_H1}.
s .

Bivariate Least Squares Regression (BLS)

From all the existing least squares approaches for calculating the regression coefficients when errors in both axes are present, Lisý's method¹¹ (referred to as BLS) was found to be the most suitable.¹² This technique assumes the true linear model to be:

\eta_i = \beta_0 + \beta_1 \xi_i \qquad (1)

The true variables ξ_i and η_i are unobservable; instead, one can only observe the experimental variables:

x_i = \xi_i + \delta_i \qquad (2)

y_i = \eta_i + \varepsilon_i \qquad (3)

The random errors committed in the measurement of the variables x_i and y_i are represented by the variables δ_i and ε_i, where δ_i ~ N(0, σ_{x_i}²) and ε_i ~ N(0, σ_{y_i}²). In this way, introducing eqs. 2 and 3 into eq. 1 and isolating the variable y_i, the following expression is obtained:

y_i = \beta_0 + \beta_1 x_i + \tau_i \qquad (4)

The term τ_i is the ith true residual error with τ_i ~ N(0, σ_i²)¹³ and can be expressed as a function of ε_i, δ_i and β₁:

\tau_i = \varepsilon_i - \beta_1 \delta_i \qquad (5)

Many authors³,¹⁴⁻¹⁶ have developed procedures to estimate the regression line coefficients based on a maximum likelihood approach whenever errors in both variables are present. In most cases, these methods need the true predictor variable to be carefully modelled.¹⁶ This is not usually possible in chemical analysis, where the true predictor variables ξ_i are not often randomly distributed (i.e. functional models are assumed). Moreover, there are cases in which the experimental data are heteroscedastic and estimates of measurement errors are only available through replicate measurements (i.e. the ratio σ_{y_i}/σ_{x_i} is not constant or is unknown). These conditions, common in chemical data, make it very difficult to rigorously apply the principle of maximum likelihood to the estimation of the regression line coefficients. On the other hand, there is a method to estimate the regression coefficients using a maximum likelihood approach even when a functional model is assumed.¹³ This method is not rigorously applicable when individual heteroscedastic measurement errors are considered. It has been shown that, when assuming σ_{x_i} = σ_{y_i} for any i, least squares methods provide the same estimates of the regression coefficients as the ones from a maximum likelihood estimation approach.¹⁷ For these reasons, we have chosen an iterative least squares method (i.e. the BLS method) that can be applied to any group of ordered pairs of observations with no assumptions about the probability distributions.¹⁷ This allows the application of this method to real chemical data when individual heteroscedastic errors in both axes are considered. In this way, the BLS regression method relates the observed variables x_i and y_i as follows:¹⁸

y_i = a + b x_i + e_i \qquad (6)

The term e_i is the observed ith residual error. The variance of e_i is s_{e_i}² and will be referred to as the weighting factor. This parameter takes into consideration the experimental variances of any individual point in both axes (s_{x_i}² and s_{y_i}²) obtained from replicate analysis. The covariance between the variables for each (x_i, y_i) data pair, which is normally assumed to be zero, is also taken into account:

\mathrm{var}(e_i) = \mathrm{var}(y_i - a - b x_i) = s_{e_i}^2 = s_{y_i}^2 + b^2 s_{x_i}^2 - 2b\,\mathrm{cov}(x_i, y_i) \qquad (7)
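A one-line sketch of this weighting factor (an illustration with assumed argument names, not the authors' code) makes the role of the covariance term explicit:

```python
# Sketch of the weighting factor of eq. 7; cov_xy is normally taken as zero,
# but it can be supplied when replicate data allow it to be estimated.
def weighting_factor(b, sx2, sy2, cov_xy=0.0):
    """Variance of the i-th observed residual, s_ei^2."""
    return sy2 + b**2 * sx2 - 2.0 * b * cov_xy
```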

The BLS regression method finds the estimates of the regression line coefficients by minimising the sum of the weighted residuals, S, expressed in eq. 8:

S = \sum_{i=1}^{n} \frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = (n-2)\,s^2 \qquad (8)

The experimental error s² is an important variable since it provides a measure of the dispersion of the data pairs around the regression line and can give a rough idea of the lack of fit of the experimental points to the regression line. In this way, the BLS regression technique assigns less importance to those data pairs with larger s_{x_i}² and s_{y_i}² values, that is to say, the most imprecise data pairs. By minimising the sum of the weighted residuals (eq. 8), two non-linear equations are obtained, from which the regression coefficients a and b can be estimated by means of an iterative process.⁸

Special attention should be paid to the estimation of the variances of
the errors made in the measurement of the different concentration samples
(s_{x_i}² and s_{y_i}²) by the methods in comparison. To obtain the best possible
estimates of the regression coefficients when no estimates of the error
variances are available from previous experiments, a sufficient number of
replicates should be made. However, measurement error variances
estimated by means of replicate measurements may include sources of
variation that are not related to the random errors made when analysing
the samples.
19
This is the case of replicate measurements with different
means, lack of homogeneity (i.e. geological samples) or different kinds of
interferences that affect the analysis of the analyte(s) of interest by both
methods. In such cases, estimates of the regression coefficients from the
BLS method, as with those from other regression methods that consider
measurement errors, may be biased. It has been reported that this bias
occurs because these regression methods consider that all the variability in
the data is due to the random errors made when measuring the different
samples.¹⁹ They ignore an important component of variability in the data when the true variables ξ_i and η_i do not follow a linear relationship (i.e. an error in the equation is not considered) and, consequently, the pairs (ξ_i, η_i)
do not fall exactly on a straight line. However, linear models with an error
in the equation are not usual in chemical analysis, since instrumental
responses are justified by theoretical laws (Lambert-Beers law, Nernsts
law, ...). Also in method comparison studies, since the two methods being
compared measure the same samples, linear models with an error in the
equation are not theoretically justified.
20
Despite the infrequent use in
analytical chemistry of linear models with an error in the equation,
researchers should be aware that by using a regression method that needs
to estimate the variances of measurement errors, the error components that
affect experimental measures should be carefully considered.

Joint Confidence Interval Test

To compare two analytical methodologies by bivariate linear regression, the results from a set of samples obtained by the method to be tested are regressed onto those from the already established methodology, with their respective uncertainties. Different concentration levels are considered in order to generate the straight line. If neither of the straight-line regression coefficients statistically differs from unity slope and zero intercept, the results of the two methods will not be considered statistically different at a given level of significance α.

The joint confidence interval test for the slope and the intercept, considering the errors present in both axes, is used to check for the presence of bias in the method to be tested by applying it to the BLS regression coefficients.[8] In other words, this can be regarded as checking the suitability of the null hypothesis H0, which can be defined depending on the risk of error (α or β) to be controlled.[21] In this paper H0 assumes that both estimated BLS regression coefficients belong to a joint confidence distribution centred at the reference point (a_H0, b_H0) (where a_H0 = 0 and b_H0 = 1). Traditionally, no method bias was detected (i.e. H0 was accepted) when the theoretical point (0,1), for a given level of significance α, fell within the limits of the joint confidence region centred at the point (a,b).[2] However, in order to be more consistent with the definition of hypothesis testing in terms of accepting or rejecting H0,[22] the joint confidence interval should be centred at the reference point (0,1), so that if the experimental point (a,b) falls within this joint confidence interval no bias is detected between the two methods in comparison.
between the two methods in comparison.

In this way, as the standard deviations associated with any pair of BLS regression coefficients (s_a and s_b) depend not only on the experimental error s^2 but also on the slope value,[23] the size of the joint confidence region centred at the reference point (0,1) will now be set according to the b_H0 unity value, and not according to the estimated slope b. Considering the stated modifications, the expression developed in a previous work[8] that spans the joint confidence interval for a given level of significance α can be re-defined so that it is centred at the reference point (a_H0, b_H0) as follows:

\sum_{i=1}^{n} \frac{(a_{H_0} - a)^2}{s_{e_i H_0}^2} + 2 \sum_{i=1}^{n} \frac{(a_{H_0} - a)(b_{H_0} - b)\, x_i}{s_{e_i H_0}^2} + \sum_{i=1}^{n} \frac{(b_{H_0} - b)^2\, x_i^2}{s_{e_i H_0}^2} = 2\, s^2\, F_{1-\alpha(2,\, n-2)}          (9)

where F_{1-α(2,n-2)} is the tabulated F value for a level of significance α with 2 and n-2 degrees of freedom. Values of the variables a and b that satisfy eq. 9 define the boundary of the joint confidence interval for a given level of significance α. The term s_{e_i H_0}^2 is the weighting factor from eq. 7 recalculated with the b_H0 unity value. The value of the experimental error s^2 must be the one initially estimated for the BLS regression line (with regression coefficients a and b), because there can be no experimental error associated with the reference values a_H0 and b_H0.
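The decision rule of eq. 9 can be outlined as follows (an illustrative sketch, not the original implementation; the estimates a, b and s^2 are assumed to come from a previous BLS fit and the covariances are again assumed to be zero):

import numpy as np
from scipy.stats import f

def joint_test(x, sx2, sy2, a, b, s2, alpha=0.05, a_h0=0.0, b_h0=1.0):
    n = len(x)
    se2_h0 = sy2 + b_h0**2 * sx2          # weighting factors recalculated with b_H0 = 1
    lhs = (np.sum((a_h0 - a)**2 / se2_h0)
           + 2 * np.sum((a_h0 - a) * (b_h0 - b) * x / se2_h0)
           + np.sum((b_h0 - b)**2 * x**2 / se2_h0))
    rhs = 2 * s2 * f.ppf(1 - alpha, 2, n - 2)
    return lhs <= rhs                     # True: (a, b) inside the region, no bias detected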

Probabilities of α and β error in the joint confidence interval test

The risk of committing a β error is defined as the probability of accepting H0 when the correct hypothesis is H1.[24] This latter hypothesis considers that the estimated regression coefficients a and b belong to a joint confidence distribution centred at the point (a_H1, b_H1). In a more analytical context it can be defined as the probability of accepting that the method being tested provides correct results when in fact these results are biased.

To estimate the probability of β error, it is essential to find the expression that spans the joint confidence distribution for the slope and the intercept. The expression has been developed so that the same joint confidence intervals from eq. 9 can be easily generated for any level of significance α. For this reason, we have derived an expression that relates all the possible values of a and b with the level of significance α (i.e. risk of α error) that satisfies eq. 9 (see Appendix A). This produces a tri-dimensional joint confidence distribution when the three variables are represented on the x, y and z axes respectively (Figure 1):

\alpha = \left[ 1 + \frac{ \sum_{i=1}^{n} \frac{(a_{H_0} - a)^2}{s_{e_i H_0}^2} + 2 \sum_{i=1}^{n} \frac{(a_{H_0} - a)(b_{H_0} - b)\, x_i}{s_{e_i H_0}^2} + \sum_{i=1}^{n} \frac{(b_{H_0} - b)^2\, x_i^2}{s_{e_i H_0}^2} }{(n-2)\, s^2} \right]^{-\frac{n-2}{2}}          (10)
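Eq. 10 can be evaluated directly; the following sketch (an illustration under the same zero-covariance assumption as before) returns the level of significance α associated with a candidate pair (a, b):

import numpy as np

def alpha_surface(a, b, x, sx2, sy2, s2, a_h0=0.0, b_h0=1.0):
    n = len(x)
    se2_h0 = sy2 + b_h0**2 * sx2
    q = (np.sum((a_h0 - a)**2 / se2_h0)
         + 2 * np.sum((a_h0 - a) * (b_h0 - b) * x / se2_h0)
         + np.sum((b_h0 - b)**2 * x**2 / se2_h0))
    return (1 + q / ((n - 2) * s2)) ** (-(n - 2) / 2)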


Figure 1. Joint confidence distribution for the slope and the intercept as a function of the level of significance α. A tri-dimensional distribution is produced (axes: intercept, slope and risk of α error).

The width of this tri-dimensional joint confidence distribution depends on the standard deviations s_a,H0 and s_b,H0 (directly related to the first and third summation terms in eq. 10, respectively). These standard deviations are estimated considering the reference regression values a_H0 = 0 and b_H0 = 1 and depend on the experimental error s^2. The tilt of the joint confidence region depends on the covariance between a and b.

In the tri-dimensional joint confidence distribution, the probability of an α error[24] can be illustrated as the volume of the distribution associated with the reference point (0,1) that falls outside the projection of the bi-dimensional joint confidence region for a given level of significance α. In other words, it covers all those pairs of values (a,b) for which method bias would be detected, although they belong to the joint confidence distribution centred at the reference point (0,1) (Figure 2).

Figure 2. Volume of the joint confidence distribution centred at the reference point (0,1) excluded from the projection of the joint confidence interval for a level of significance α, i.e. risk of α error (dotted section). Volume of the joint confidence distribution centred at the alternative point (a_H1, b_H1) overlapped within the projection of the joint confidence interval for a level of significance α, i.e. risk of β error (shaded section).

On the other hand, the probability of β error can be illustrated as the volume of the tri-dimensional joint confidence distribution centred at the point (a_H1, b_H1) (to be set by the experimenter) that falls inside the projection of the bi-dimensional joint confidence region centred at the reference point (0,1) for a given level of significance α (Figure 2). That is, all the possible pairs of values for which method bias would not be detected even though they belong to the joint confidence distribution centred at the point (a_H1, b_H1).

Therefore, the probability of β error depends not only on the size of the joint confidence region (defined by s_a,H0 and s_b,H0) centred at the reference point, but also on the size of the joint confidence distribution centred at the point (a_H1, b_H1), defined by the values of s_a,H1 and s_b,H1. As happens with the dependence of s_a,H0 and s_b,H0 on b_H0 and s^2, the values of s_a,H1 and s_b,H1 depend on the biased slope value b_H1 and on s^2. This means that, to obtain the correct probability of β error, an accurate estimation of s^2 is necessary. For this reason, a statistical test which detects lack of fit considering errors in both axes is needed to prevent overestimated s^2 values.[25]

To obtain an initial non-normalised value of the probability of β error (β_prev), a triple integral to calculate the intersected volume can be defined as:[26]

\beta_{\mathrm{prev}} = \int_{a_1}^{a_2} \int_{b_1(a)}^{b_2(a)} \int_{0}^{\alpha(a,b)} \mathrm{d}\alpha\, \mathrm{d}b\, \mathrm{d}a          (11)

where α(a,b) is the tri-dimensional joint confidence distribution (eq. 10) centred at the point (a_H1, b_H1) (i.e. considering the a_H1 and b_H1 coefficients instead of a_H0 and b_H0 in eq. 10). The terms b_1(a) and b_2(a) are functions of a that, for intercept values between a_1 and a_2, define the upper and lower halves of the elliptical joint confidence region centred at the reference point (0,1) for a given level of significance α. Also, a_1 and a_2 are the intercept-axis values where the two halves meet their ends (mathematical expressions are presented in Appendix B). This β_prev value must be normalised by dividing it by the total volume inside the three-dimensional joint confidence distribution centred at the point (a_H1, b_H1) to obtain the usual probability of β error ranging from zero to one:

\beta = \frac{\int_{a_1}^{a_2} \int_{b_1(a)}^{b_2(a)} \int_{0}^{\alpha(a,b)} \mathrm{d}\alpha\, \mathrm{d}b\, \mathrm{d}a}{\int_{a} \int_{b} \alpha(a,b)\, \mathrm{d}b\, \mathrm{d}a}          (12)
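Because the innermost integral of dα from 0 to α(a,b) simply evaluates to α(a,b), eqs. 11 and 12 reduce to double integrals of the α surface. A crude numerical sketch follows (a grid approximation written for illustration; alpha_h1 and inside_h0_ellipse are assumed to be user-supplied, vectorised functions of (a, b), the first being eq. 10 centred at (a_H1, b_H1) and the second the indicator of the H0 ellipse):

import numpy as np

def beta_estimate(alpha_h1, inside_h0_ellipse, a_grid, b_grid):
    # The grids must extend far enough for the alpha surface to be negligible at the edges
    A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
    vol = alpha_h1(A, B)                                          # surface centred at (a_H1, b_H1)
    da, db = a_grid[1] - a_grid[0], b_grid[1] - b_grid[0]
    beta_prev = np.sum(vol * inside_h0_ellipse(A, B)) * da * db   # eq. 11
    total = np.sum(vol) * da * db                                 # normalisation (eq. 12)
    return beta_prev / total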
Validation Process

As can be seen in Appendix A, the expression of the joint confidence distribution (eq. 10) for the slope and the intercept, used to calculate the probabilities of β error in eq. 12, is derived from the equation defining the BLS joint confidence interval (eq. 9). As stated previously, the BLS technique is an iterative least-squares method that, unlike those based on a maximum likelihood approach, lacks a rigorous mathematical background. In this way, the derived BLS joint confidence interval, and thus the expression for estimating the probabilities of β error, should be validated. For this reason, a validation process has been designed to assess how correct the estimates of the probability of β error from eq. 12 are. We used simulated data sets showing uncertainties in both axes, plus two real data sets, to estimate and compare the probabilities of β error under BLS regression conditions with the probabilities of β error under WLS and OLS conditions, for a level of significance α of 5%.

The simulated data sets were generated by the Monte Carlo method. In this way, 100,000 new data sets were randomly generated for each initial data set, in which all the data pairs perfectly fitted a straight line with a slope and intercept significantly different from unity and zero, respectively (i.e. results from a two-method comparison study whose differences are statistically significant), by adding random errors based on the individual uncertainties of each point to each of the data pairs. The fraction of simulated data sets in which the experimental point (a,b), although generated from an initial data set with obvious bias in both slope and intercept coefficients, fell within the joint confidence interval (eq. 9) for a given level of significance α was considered the reference estimate of the probability of β error. This estimate was then compared to the analogous value from eq. 12, where the experimental error s^2 has been calculated as the mean value of the 100,000 individual experimental errors generated in the simulation process.
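A schematic version of this simulation is sketched below (fit_bls is a hypothetical helper returning the BLS coefficients and experimental error; joint_test is the decision sketch given earlier):

import numpy as np

def beta_monte_carlo(x0, y0, sx, sy, n_sim=100_000, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    accepted = 0
    for _ in range(n_sim):
        x = x0 + rng.normal(0.0, sx)               # perturb the biased, noise-free data set
        y = y0 + rng.normal(0.0, sy)
        a, b, s2 = fit_bls(x, y, sx**2, sy**2)     # hypothetical BLS fit
        if joint_test(x, sx**2, sy**2, a, b, s2, alpha):
            accepted += 1                          # bias not detected although it exists
    return accepted / n_sim                        # reference estimate of beta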

EXPERIMENTAL SECTION

Data Sets and Software

In the validation step we used twenty-four simulated data sets to reproduce some usual structures of routine analytical measurements. Two linear ranges were combined with four different numbers of data pairs and three different kinds of uncertainty patterns, as described below:

Linear ranges: the short range spans from 0 to 10 units and the large range spans from 0 to 100 units.

Number of data pairs: data sets composed of five, fifteen, thirty and one hundred data pairs were taken. In all cases the data pairs were randomly distributed throughout the two different linear ranges.

Uncertainties: both homoscedastic and heteroscedastic data sets were used. In the homoscedastic data sets the standard deviations of both x_i and y_i values were constant. In one heteroscedastic data set the standard deviations increased and in the other the standard deviations were random. In neither case, however, was the standard deviation higher than 15% of each individual x_i and y_i value.

Two real data sets were used to show the differences in the estimates of the probability of β error considering the BLS, WLS and OLS regression methods.

Data set 1.[27] Eight data pairs were generated from the determination of eight polycyclic aromatic hydrocarbons in various environmental matrices at different concentration levels through a stepwise interlaboratory study. Uncertainties for each concentration sample were generated from three replicate analyses. The two analytical methods compared are high performance liquid chromatography (HPLC) (results on the x axis) and gas chromatography (GC) (results on the y axis). Eleven laboratories took part. The linear range spans from eight to twenty-six µg/g. The regression lines generated for this data set with the three regression techniques are shown in Figure 3a.

Data set 2.[28] Comparative study on the determination of mercury in biological tissue using gas chromatography with sodium tetraethylborate derivatization, coupled to a cold vapour atomic fluorescence spectrometer. One (x axis) and two (y axis) gold amalgamations were used to reduce mercury before the transfer to the atomic fluorescence spectrometer. Five data pairs were obtained with their respective uncertainties generated from six replicates performed at each concentration level. Units are expressed in pg of recovered mercury. The data set and the regression lines generated by the three regression techniques are shown in Figure 3b.

Figure 3. OLS (dashed), WLS (dotted) and BLS (solid) regression lines for data set 1 (a, GC vs. HPLC results) and data set 2 (b, two vs. one gold amalgamation steps).
All the computational work involving simulation processes and integrations for the theoretical estimation of the probability of β error was done with home-made Matlab subroutines (Matlab for Microsoft Windows ver. 4.2, The Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Simulated data sets

Table 1 shows the probabilities of β error from the Monte Carlo simulation process (column β_sim) and the estimated probabilities using eq. 12 (column β). Columns b_H1 and a_H1 show the values of the regression coefficients defining the point (a_H1, b_H1), chosen to provide probabilities of β error similar to the levels of significance α. The linear ranges and kinds of uncertainty for each data set are summarised in the Range and Uncertainty columns.

n     Range    Uncertainty   α (2 tails)   b_H1   a_H1    β_sim (%)   β (%)
5     [0-10]   homo.         0.1           1.1    -2.0    11.01       17.21
5     [0-10]   homo.         0.05          1.1    -2.7    5.98        10.09
5     [0-10]   homo.         0.01          1.1    -5.5    0.75        2.34
5     [0-10]   homo.         0.001         1.1    -13.3   0.08        0.56
5     [0-10]   hetero.       0.1           1      -0.5    11.47       16.67
5     [0-10]   hetero.       0.05          1.75   -0.5    7.17        14.84
5     [0-10]   hetero.       0.01          2.4    -0.5    2.94        5.81
5     [0-10]   hetero.       0.001         4.3    -0.5    0.88        2.25
5     [0-10]   heter. rnd.   0.1           1      -0.25   10.94       16.18
5     [0-10]   heter. rnd.   0.05          1.1    -0.45   5.07        8.67
5     [0-10]   heter. rnd.   0.01          1.2    -0.9    1.69        3.07
5     [0-10]   heter. rnd.   0.001         1.5    -2.3    0.23        0.76
5     [0-100]  homo.         0.1           1.5    -25     16.02       26.57
5     [0-100]  homo.         0.05          1.5    -40     6.72        13.51
5     [0-100]  homo.         0.01          1.5    -67     1.72        4.08
5     [0-100]  homo.         0.001         1.5    -140    0.29        0.95

Table 1. Predicted and experimentally obtained probabilities of β error for the simulated data sets.
n     Range    Uncertainty   α (2 tails)   b_H1    a_H1    β_sim (%)   β (%)
5     [0-100]  hetero.       0.1           1.1     -6      9.13        16.32
5     [0-100]  hetero.       0.05          1.8     -6      5.66        13.42
5     [0-100]  hetero.       0.01          2.4     -6      3.85        6.57
5     [0-100]  hetero.       0.001         3.8     -6      2.61        3.90
5     [0-100]  heter. rnd.   0.1           1.1     -3.3    10.23       15.86
5     [0-100]  heter. rnd.   0.05          1.5     -6.3    5.94        11.14
5     [0-100]  heter. rnd.   0.01          2.2     -9      2.63        4.83
5     [0-100]  heter. rnd.   0.001         3.8     -17     0.758       1.20
15    [0-10]   homo.         0.1           0.9     1       10.09       10.02
15    [0-10]   homo.         0.05          0.9     1.15    6.82        6.34
15    [0-10]   homo.         0.01          0.9     1.5     1.99        1.47
15    [0-10]   homo.         0.001         0.9     2       0.314       0.14
15    [0-10]   hetero.       0.1           0.99    0.02    13.84       14.70
15    [0-10]   hetero.       0.05          0.973   0.02    5.51        5.80
15    [0-10]   hetero.       0.01          0.96    0.02    1.12        1.02
15    [0-10]   hetero.       0.001         0.945   0.02    0.17        0.107
15    [0-10]   heter. rnd.   0.1           0.99    0.018   10.17       10.98
15    [0-10]   heter. rnd.   0.05          0.97    0.023   5.04        5.35
15    [0-10]   heter. rnd.   0.01          0.96    0.033   1.09        0.97
15    [0-10]   heter. rnd.   0.001         0.95    0.045   0.009       0.17
15    [0-100]  homo.         0.1           0.975   0.03    7.76        8.59
15    [0-100]  homo.         0.05          0.975   2.5     5.1         5.45
15    [0-100]  homo.         0.01          0.975   3.3     0.99        0.94
15    [0-100]  homo.         0.001         0.975   4.2     0.27        0.16
15    [0-100]  hetero.       0.1           0.98    0.05    12.40       13.04
15    [0-100]  hetero.       0.05          0.975   0.05    5.82        6.07
15    [0-100]  hetero.       0.01          0.965   0.05    1.21        1.16
15    [0-100]  hetero.       0.001         0.95    0.05    0.14        0.07
15    [0-100]  heter. rnd.   0.1           0.97    0.18    10.56       11.16
15    [0-100]  heter. rnd.   0.05          0.96    0.22    6.59        6.70
15    [0-100]  heter. rnd.   0.01          0.95    0.3     2.89        2.36
15    [0-100]  heter. rnd.   0.001         0.94    0.43    0.53        0.27
30    [0-10]   homo.         0.1           1.1     -0.8    10.05       11.74
30    [0-10]   homo.         0.05          1.1     -0.95   3.93        4.90
30    [0-10]   homo.         0.01          1.1     -1.2    0.32        0.74
30    [0-10]   homo.         0.001         1.1     -1.4    0.08        0.25
30    [0-10]   hetero.       0.1           1.05    -0.015  11.40       12.77
30    [0-10]   hetero.       0.05          1.1     -0.015  5.52        6.84
30    [0-10]   hetero.       0.01          1.16    -0.015  0.37        0.74
30    [0-10]   hetero.       0.001         1.2     -0.015  0.10        0.23
30    [0-10]   heter. rnd.   0.1           1.05    -0.03   10.56       11.91

Table 1 (cont.). Predicted and experimentally obtained probabilities of β error for the simulated data sets.
n     Range    Uncertainty   α (2 tails)   b_H1   a_H1    β_sim (%)   β (%)
30    [0-10]   heter. rnd.   0.05          1.07   -0.035  4.00        5.16
30    [0-10]   heter. rnd.   0.01          1.1    -0.04   1.07        1.81
30    [0-10]   heter. rnd.   0.001         1.15   -0.045  0.01        0.11
30    [0-100]  homo.         0.1           1.02   -0.45   15.04       16.03
30    [0-100]  homo.         0.05          1.02   -1.85   5.20        5.79
30    [0-100]  homo.         0.01          1.02   -2.2    2.53        2.62
30    [0-100]  homo.         0.001         1.02   -2.8    0.12        0.18
30    [0-100]  hetero.       0.1           1.05   -0.15   11.57       12.77
30    [0-100]  hetero.       0.05          1.1    -0.15   5.35        6.84
30    [0-100]  hetero.       0.01          1.15   -0.15   0.82        1.53
30    [0-100]  hetero.       0.001         1.2    -0.15   0.06        0.23
30    [0-100]  heter. rnd.   0.1           1.05   -0.3    10.57       11.91
30    [0-100]  heter. rnd.   0.05          1.07   -0.35   3.99        5.16
30    [0-100]  heter. rnd.   0.01          1.1    -0.4    1.01        1.81
30    [0-100]  heter. rnd.   0.001         1.15   -0.45   0.02        0.11
100   [0-10]   homo.         0.1           0.95   0.43    10.91       11.13
100   [0-10]   homo.         0.05          0.95   0.5     4.48        4.67
100   [0-10]   homo.         0.01          0.95   0.59    1.55        1.59
100   [0-10]   homo.         0.001         0.95   0.7     0.18        0.25
100   [0-10]   hetero.       0.1           0.99   0.04    9.09        9.25
100   [0-10]   hetero.       0.05          0.95   0.04    6.69        6.82
100   [0-10]   hetero.       0.01          0.92   0.04    1.28        1.33
100   [0-10]   hetero.       0.001         0.89   0.04    0.07        0.07
100   [0-10]   heter. rnd.   0.1           0.99   0.017   11.38       11.65
100   [0-10]   heter. rnd.   0.05          0.97   0.022   4.30        4.55
100   [0-10]   heter. rnd.   0.01          0.96   0.028   1.54        1.64
100   [0-10]   heter. rnd.   0.001         0.95   0.035   0.35        0.43
100   [0-100]  homo.         0.1           0.99   0.15    9.27        9.63
100   [0-100]  homo.         0.05          0.99   1       4.79        4.96
100   [0-100]  homo.         0.01          0.99   1.2     1.13        1.28
100   [0-100]  homo.         0.001         0.99   1.4     0.24        0.29
100   [0-100]  hetero.       0.1           0.99   0.28    9.88        10.07
100   [0-100]  hetero.       0.05          0.94   0.28    5.55        5.70
100   [0-100]  hetero.       0.01          0.91   0.28    1.39        1.41
100   [0-100]  hetero.       0.001         0.88   0.28    0.15        0.15
100   [0-100]  heter. rnd.   0.1           0.99   0.35    12.87       13.14
100   [0-100]  heter. rnd.   0.05          0.97   0.45    5.47        5.65
100   [0-100]  heter. rnd.   0.01          0.95   0.57    1.80        1.95
100   [0-100]  heter. rnd.   0.001         0.93   0.7     0.51        0.51

Table 1 (cont.). Predicted and experimentally obtained probabilities of β error for the simulated data sets.

These results show that the probabilities of β error calculated with eq. 12 are usually overestimated. This is because when few data pairs are considered, the experimental error s^2 (and therefore the values of s_a,H0, s_b,H0 and s_a,H1, s_b,H1) is likely to be overestimated.[29] This overestimation is clear in the s^2 values from the simulation process used to estimate the probabilities of β error for the different levels of significance α, since they are the mean of 100,000 observations. For this reason the overestimation of the probabilities of β error is clearer in those data sets with five data pairs, where the amount of information is lower (and hence there is greater uncertainty). As we can see in Figure 4, the more data pairs there are, the better the agreement between the values in columns β_sim and β (more information is added). Moreover, the uncertainty pattern in the data and the linear range do not seem to affect the error in the estimates.

Figure 4. Variation of the error of prediction, 100·|β − β_sim|/β_sim, with the number of data pairs for a level of significance α of 5%, considering homoscedasticity (I), constant (II) and random (III) heteroscedasticity, for the low (a) linear range.
Figure 4 (cont.). Variation of the error of prediction, 100·|β − β_sim|/β_sim, with the number of data pairs for a level of significance α of 5%, considering homoscedasticity (I), constant (II) and random (III) heteroscedasticity, for the high (b) linear range.

Results for data sets with fifteen data pairs (b_H1 values of less than one) are more accurate than those for data sets with thirty (b_H1 values of greater than one), and similar to those for data sets with one hundred data pairs (Table 1). This is because b_H1 values of less than one were chosen, which provides underestimated standard deviations s_b,H1 and s_a,H1. This partially offsets the overestimation of the probabilities of β error caused by high s^2 values due to the uncertainty in the experimental data, which is more obvious in data sets with fewer data pairs. In this way, the intersected volume of the distribution centred at the point (a_H1, b_H1) is lower (the size of the joint confidence distribution directly depends on s_b,H1 and s_a,H1), producing more accurate results, although underestimated in some cases. Finally, as expected, predictions for data sets with one hundred data pairs were the best, thanks to the large amount of information provided by the high number of data pairs and to the b_H1 values of less than one.

To check whether there were significant differences between the results in column β and the results from the Monte Carlo simulation process, we carried out a paired t-test[1] on the error values predicted at the different levels of significance α for each type of data set studied. Although there were no significant differences between the values in columns β_sim and β for a level of significance of 5% in any data set, we found the errors made in estimating the probabilities of β error in data sets with five data pairs (Figure 4) to be excessive (between 70% and 145%). We have therefore considered that five data pairs is generally too low a number to estimate the experimental error s^2 correctly, and so to produce accurate estimates of the probabilities of β error. These results show that, if an accurate estimate of s^2 is available, the probabilities of β error from eq. 12 are correct.

Real data sets

Table 2 shows the results of the joint confidence interval test on both real data sets. Columns a and b show the values of the straight-line coefficients for each of the regression methods (Reg. Meth. column). Values of the biased BLS regression coefficients (in columns a_H1 and b_H1) have been set according to the bias we have considered unacceptable to remain undetected in the regression coefficients. It is important to bear in mind that there are no theoretical rules for setting these biased coefficients. The experimenter, therefore, needs some experience in order to define the alternative hypothesis according to the bias that cannot be accepted to remain undetected in the regression coefficients. The bias depends on the objective of the comparison study, the kind of analytical method being tested or other factors. Another important issue to consider when setting the biased regression coefficients that define H1 is the covariance between the regression line coefficients. This covariance (responsible for the tilt of the joint confidence distribution) means that not all the possible pairs of regression coefficients (a,b) at the same Euclidean distance \sqrt{(a_{H_0}-a_{H_1})^2 + (b_{H_0}-b_{H_1})^2} from the theoretical point (0,1) have the same probability of being experimentally observed. We therefore recommend first setting one of the biased regression coefficients and then calculating the value of the other according to the direction of the major axis of the elliptical joint confidence region. In this way, the bias to be detected in the regression coefficients (defined by the point (a_H1, b_H1)) is the most likely to happen. Therefore, the calculated probabilities of committing a β error can be considered to be the most suitable ones for the experimental data available.
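One possible way of following this recommendation (a sketch based on an assumption about the implementation, not a procedure taken from the paper) is to take the direction of the major axis from the quadratic-form matrix built with the Appendix B sums A_H0, B_H0 and C_H0, fix a_H1 and derive b_H1 from it:

import numpy as np

def b_h1_along_major_axis(a_h1, A, B, C, a_h0=0.0, b_h0=1.0):
    # Ellipse quadratic form [[A, B], [B, C]]; the longest axis corresponds
    # to the eigenvector with the smallest eigenvalue.
    w, vec = np.linalg.eigh(np.array([[A, B], [B, C]]))
    major = vec[:, np.argmin(w)]
    return b_h0 + (a_h1 - a_h0) * major[1] / major[0]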

Data set   Reg. Meth.   a      b      a_H1   b_H1   s^2     β (%)
1          BLS          0.19   0.96   4      0.63   0.21    17.3
1          WLS          2.65   0.67   4.02   0.66   2.24    74.7
1          OLS          1.00   0.87   4.00   0.71   3.54    75.2
2          BLS          2.38   0.99   5      0.90   0.72    27.0
2          WLS          2.79   0.97   5.00   0.96   1.88    71.4
2          OLS          0.22   0.99   5.00   0.99   80.8    93.3

Table 2. Predicted probabilities of β error for the two real data sets.

It should also be noted that, for each of the real data sets in Table 2, the biased regression coefficients differ according to the regression technique. This is done to maintain the biased point (a_H1, b_H1) at the same Euclidean distance from the reference point (0,1), in the direction that makes the set bias most likely to happen. This direction depends on the covariance of the straight-line coefficients, which changes according to the regression technique used. In this way, the calculated probabilities of β error can be compared under the same conditions for each of the three regression methods.

Table 2 also shows the experimental errors for the different regression methods and data sets in column s^2. The predicted probabilities of β error are in column β. Only when the point defined by the estimated regression coefficients (a,b) falls within the joint confidence region for a level of significance α, set at 5% in this case (i.e. no bias is detected, Figure 5), can the probabilities of β error be calculated.

Figure 5. OLS (dashed line), WLS (dotted line) and BLS (solid line) joint confidence regions obtained for data set 1 (a) and data set 2 (b) (α = 5%).

Data set 1. There are no significant differences between the results of the two chromatographic methods, as the experimental point (a,b) falls inside the joint confidence region centred at the theoretical point (0,1) for the three regression techniques (Figure 5a). As an example, we decided that in this case a bias of 4 in the intercept for the BLS method was too big to remain undetected. For this intercept the most likely bias in the BLS slope is 0.63, which sets a distance of 4.017 between the points (a_H1, b_H1) and (0,1) (Figure 6a). For the WLS and OLS regression methods (Figures 6b and 6c), the biased regression coefficients change slightly (due to the different sizes of the joint confidence intervals and the different covariance between the intercept and slope) so that the bias in the regression coefficients is most likely to happen and the distance to the point (0,1) remains at 4.017. Under these circumstances, the probabilities of a β error considering the BLS, WLS and OLS regression methods are estimated at 17.3%, 74.7% and 75.2% respectively (Table 2). These values are consistent with the volume of the joint confidence distribution centred at the point (a_H1, b_H1) and overlapped within the projection of the joint confidence interval, which is centred at the reference point (0,1) and generated under BLS, WLS and OLS regression conditions (Figures 6a, 6b and 6c). In each case, the level of significance α was 5%. Since the estimated probabilities of β error are a direct function of s^2, for a constant distance between the points (a_H1, b_H1) and (0,1) and a level of significance α, high experimental errors (eq. 8) produce high probabilities of committing a β error.

It is clear, therefore, that by neglecting the errors in the results of both analytical methods (OLS conditions) or only partially considering them (WLS conditions), one would wrongly risk β errors of 75.2% and 74.7%, respectively. This means that with the WLS and OLS regression methods it would be very difficult to detect an existing bias of the set magnitude (Table 2) in both regression coefficients jointly, or in other words, in the experimental results from the chromatographic method being tested. On the other hand, with the BLS regression method the probability of β error was estimated at 17.3%. This indicates that if method bias of the set magnitude exists in the experimental results, it will be more likely to be detected when errors in both axes are considered.

Figure 6. Representation of the probability of β error (as seen in Figure 2, but from the z axis) for data set 1, under BLS (a), WLS (b) and OLS (c) regression conditions (α = 5%).

Data set 2. The experimental point (a,b) falls within the joint confidence region centred at the theoretical point (0,1) for the three regression techniques (Figure 5b), and so there are no significant differences between the results from the two amalgamation procedures being compared with the BLS, WLS and OLS regression methods. In this case, we decided to estimate the probability of not detecting a bias of 0.1 in the BLS slope. As in the previous data set, the intercept for which the bias in the regression coefficients is most likely to happen for the BLS method is 5.00. For the WLS and OLS regression methods, the biased coefficients that are most likely to happen at the same distance from the reference point are presented in Table 2. In this way, the probabilities of committing a β error (not detecting the set bias), according to the different s^2 values, are estimated at 27.0%, 71.4% and 93.3% for the BLS, WLS and OLS techniques respectively, for a level of significance α of 5%.

Figure 7. Representation of the probability of β error (as seen in Figure 2, but from the z axis) for data set 2, under BLS (a), WLS (b) and OLS (c) regression conditions (α = 5%).
Figures 7a, 7b and 7c show that these values agree with the volume of the joint confidence distribution centred at the point (a_H1, b_H1) intersected in each case inside the projection of the respective joint confidence intervals.

As in the previous example, it is clear that by neglecting the errors in the results of both analytical methods (OLS conditions) or only partially considering them (WLS conditions), the risk of committing a β error would be wrongly higher than if the presence of errors in both axes is considered. In this example it is especially important to note the difference for the OLS regression method, as the risk of β error would be 93.3%. So, unlike with the BLS regression method (risk of β error estimated to be 27.0%), it would be very difficult to detect an existing bias of the set magnitude in the mercury recovery results with two gold amalgamation steps if errors in both axes are not considered.

CONCLUSIONS

We have developed a theoretical background and mathematical expressions to interpret and estimate the probability of a β error, using the joint confidence interval test for the slope and the intercept of a regression line found by considering the errors in both axes. The immediate use in measurement science is in the comparison of the results from two analytical methods, but this can be extended, for example, to comparisons between analysts, laboratories, or between the chemical composition of various samples.

We have found that if OLS or WLS regression methods are applied to data with errors in both axes, estimates of the probability of committing a β error, unlike those obtained if the BLS technique is used, are not correct. Therefore, correctly detecting bias in the results from the analytical method being tested might be very difficult when neglecting or only partially considering the uncertainties in both axes using OLS or WLS regression methods. This might have serious negative effects on the results of future analytical studies, as in these cases there might be a strong probability of wrongly accepting biased analytical methods. Moreover, an important advantage in method comparison studies is that, for the BLS regression technique, the probabilities of β error when the axes are switched for given a_H1 and b_H1 values are the same.

We have also seen how important it is to obtain accurate estimates of the experimental error s^2. This regression parameter directly affects the standard deviations of both slope and intercept, which is extremely important for obtaining accurate estimates of the probability of β error. For this reason, tests to detect lack of fit in the three regression methods are strongly recommended. Moreover, in data sets with few data pairs there is a higher probability that the experimental error s^2, and therefore the standard deviations s_b,H1 and s_a,H1, are overestimated.[29] In these cases it may be preferable to consider b_H1 values of less than one when defining the alternative hypothesis, in order to reduce the overestimation of the probability of β error. This is because the standard deviations s_b,H1 and s_a,H1 directly depend on b_H1, and so b_H1 values of less than one can partially offset the overestimation of s_b,H1 and s_a,H1 due to a high experimental error s^2. On the other hand, when b_H1 values below one are considered, the estimated probability of β error may be underestimated depending on the magnitude of the bias set.

Defining the values of the biased regression coefficients a_H1 and b_H1 is one of the most difficult steps in estimating the probability of β error. For this reason, we suggest first setting one of the biased regression coefficients and then calculating the other one so that the set bias (defined by the point (a_H1, b_H1)) is the most likely to happen. Also, estimates of the probability of β error were inaccurate in simulated data sets with a low number of data pairs (five in this case). This might be regarded as a limitation of the estimation process, but these results are in fact not due to our theoretical expressions, but to poor estimates of the experimental error s^2 when few experimental data are available.

ACKNOWLEDGMENTS

The authors would like to thank the DGICyT (project no. BP96-1008) for financial support, and the Rovira i Virgili University for providing a doctoral fellowship to A. Martínez.

APPENDIX A: Characterisation of the joint confidence distribution for the slope and the intercept.

The first step towards finding the expression for the tri-dimensional joint confidence distribution, which relates the different levels of significance α (on the z axis) to the slope and the intercept (on the x and y axes), so that the elliptical joint confidence regions from eq. 9 can be easily generated, is to find the relationship between the level of significance α and the parameter F_{1-α(2,n-2)} in eq. 9:[30]

1 - \alpha = \mathrm{Prob}\left(F < F_{1-\alpha(2,\, n-2)}\right) = \int_{0}^{F_{1-\alpha(2,\, n-2)}} h_F(F)\, \mathrm{d}F          (A.1)

where h_F(F) is the statistical function that generates the F distribution, which can be expressed as:[31]

h_F(F) = \frac{\left(\frac{n-2}{2}\right)!}{\left(\frac{n-4}{2}\right)!}\; \frac{2}{n-2} \left(1 + \frac{2F}{n-2}\right)^{-n/2}          (A.2)

This expression is obtained by considering 2 and n-2 degrees of freedom. Introducing eq. A.2 into eq. A.1 and performing the stated integral:

1 - \alpha = \mathrm{Prob}\left(F < F_{1-\alpha(2,\, n-2)}\right) = 1 - \left(1 + \frac{2 F_{1-\alpha(2,\, n-2)}}{n-2}\right)^{-\frac{n-2}{2}}          (A.3)

and therefore:

\alpha = \mathrm{Prob}\left(F > F_{1-\alpha(2,\, n-2)}\right) = \left(1 + \frac{2 F_{1-\alpha(2,\, n-2)}}{n-2}\right)^{-\frac{n-2}{2}}          (A.4)

This expression gives the probability of a given F value being higher than a set threshold value F_{1-α(2,n-2)}, and therefore of not belonging to the F distribution with 2 and n-2 degrees of freedom, for a level of significance α.
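As a quick numerical cross-check (a sketch, not part of the original derivation), eq. A.4 can be compared with the upper-tail probability of the F distribution computed by a statistics library:

import numpy as np
from scipy.stats import f

n, alpha = 20, 0.05
F_crit = f.ppf(1 - alpha, 2, n - 2)
alpha_from_eq = (1 + 2 * F_crit / (n - 2)) ** (-(n - 2) / 2)
print(alpha_from_eq, f.sf(F_crit, 2, n - 2))   # both values should be close to 0.05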
Isolating the term F_{1-α(2,n-2)} from eq. 9 and substituting it into eq. A.4, we find the expression for the joint confidence distribution for any level of significance α:

\alpha = \left[ 1 + \frac{ \sum_{i=1}^{n} \frac{(a_{H_0} - a)^2}{s_{e_i H_0}^2} + 2 \sum_{i=1}^{n} \frac{(a_{H_0} - a)(b_{H_0} - b)\, x_i}{s_{e_i H_0}^2} + \sum_{i=1}^{n} \frac{(b_{H_0} - b)^2\, x_i^2}{s_{e_i H_0}^2} }{(n-2)\, s^2} \right]^{-\frac{n-2}{2}}          (A.5)

The α value (on the z axis) for which the bi-dimensional joint confidence region is generated agrees with the volume of the distribution that falls outside the projection of the elliptical region (i.e. the risk of α error in the tri-dimensional joint confidence distribution, Figure 2).



APPENDIX B: Defining the joint confidence interval for a given level of significance α.

From eq. A.5 we can generate the equations that span the joint confidence region for a level of significance α, referred to as b_1(a) and b_2(a) in eqs. 11 and 12. To do so, one of the two variables must be isolated (b was chosen in this case). To provide clearer expressions, the three summation terms in eq. A.5 have been renamed as:

A_{H_0} = \sum_{i=1}^{n} \frac{1}{s_{e_i H_0}^2}, \qquad B_{H_0} = \sum_{i=1}^{n} \frac{x_i}{s_{e_i H_0}^2}, \qquad C_{H_0} = \sum_{i=1}^{n} \frac{x_i^2}{s_{e_i H_0}^2}

b_1(a) = \frac{1}{C_{H_0}} \left[ B_{H_0}(a_{H_0} - a) + b_{H_0} C_{H_0} + \sqrt{ \left(B_{H_0}(a_{H_0} - a)\right)^2 - C_{H_0}\left( A_{H_0}(a_{H_0} - a)^2 - (n-2)\, s^2 \left( \alpha^{-\frac{2}{n-2}} - 1 \right) \right) } \right]          (B.1)

b_2(a) = \frac{1}{C_{H_0}} \left[ B_{H_0}(a_{H_0} - a) + b_{H_0} C_{H_0} - \sqrt{ \left(B_{H_0}(a_{H_0} - a)\right)^2 - C_{H_0}\left( A_{H_0}(a_{H_0} - a)^2 - (n-2)\, s^2 \left( \alpha^{-\frac{2}{n-2}} - 1 \right) \right) } \right]          (B.2)

These two functions span the upper and lower halves of the ellipse that defines the joint confidence region for a given level of significance α. They do not have real images for every intercept value a, but only for those intercept values between a_1 and a_2 (eqs. 11 and 12). These are the two unique a values that make b_1(a) = b_2(a) and thus define the two intercept-axis points where the two halves of the ellipse defining the joint confidence region meet their ends. The expressions quantifying their values are:

a_1 = a_{H_0} + \sqrt{ \frac{C_{H_0}\, (n-2)\, s^2 \left( \alpha^{-\frac{2}{n-2}} - 1 \right)}{C_{H_0} A_{H_0} - B_{H_0}^2} }          (B.3)

a_2 = a_{H_0} - \sqrt{ \frac{C_{H_0}\, (n-2)\, s^2 \left( \alpha^{-\frac{2}{n-2}} - 1 \right)}{C_{H_0} A_{H_0} - B_{H_0}^2} }          (B.4)
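For reference, the Appendix B quantities can be computed as in the following sketch (same zero-covariance assumption as before; alpha is the chosen level of significance):

import numpy as np

def ellipse_intercept_limits(x, sx2, sy2, s2, alpha, a_h0=0.0, b_h0=1.0):
    n = len(x)
    se2 = sy2 + b_h0**2 * sx2
    A = np.sum(1.0 / se2)                 # A_H0
    B = np.sum(x / se2)                   # B_H0
    C = np.sum(x**2 / se2)                # C_H0
    k = (n - 2) * s2 * (alpha ** (-2.0 / (n - 2)) - 1.0)
    half_width = np.sqrt(C * k / (C * A - B**2))
    return a_h0 + half_width, a_h0 - half_width   # a1 and a2 (eqs. B.3 and B.4)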

REFERENCES

1. D. L. Massart, B. M. G. Vandeginste, L. M. C. Buydens, S. de Jong, P. J. Lewi and J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics: Part A, Elsevier, Amsterdam (1997).
2. J. Mandel and F. J. Linnig, Anal. Chem., 29, 743-749 (1957).
3. W. A. Fuller, Measurement Error Models, John Wiley & Sons, New York (1987).
4. M. A. Creasy, J. R. Statist. Soc. B, 18, 65-69 (1956).
5. J. Mandel, J. Qual. Tech., 16, 1-14 (1984).
6. R. L. Anderson, Practical Statistics for Analytical Chemists, Van Nostrand Reinhold, New York (1987).
7. C. Hartmann, J. Smeyers-Verbeke, W. Penninckx and D. L. Massart, Anal. Chim. Acta, 338, 19-40 (1997).
8. J. Riu and F. X. Rius, Anal. Chem., 68, 1851-1856 (1996).
9. P. C. Meier and R. E. Zünd, Statistical Methods in Analytical Chemistry, John Wiley & Sons, New York, 145-150 (1993).
10. O. Güell and J. A. Holcombe, Anal. Chem., 60, 529A-542A (1990).
11. J. M. Lis, A. Cholvadova and J. Kutej, Comput. Chem., 14, 189-192 (1990).
12. J. Riu and F. X. Rius, J. Chemometrics, 9, 343-362 (1995).
13. P. Sprent, Models in Regression and Related Topics, Methuen & Co. Ltd., London (1969).
14. K. C. Lai and T. K. Mak, J. R. Statist. Soc. B, 41, 263-268 (1979).
15. C. L. Cheng and J. W. Van Ness, J. R. Statist. Soc. B, 56, 167-183 (1994).
16. D. W. Schafer and K. G. Purdy, Biometrika, 83, 813-824 (1996).
17. D. V. Lindley, Suppl. J. R. Statist. Soc., 9, 218-244 (1947).
18. G. A. F. Seber, Linear Regression Analysis, John Wiley & Sons, New York, 160-211 (1977).
19. R. J. Carroll and D. Ruppert, Amer. Stat., 50, 1 (1996).
20. C. L. Cheng, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, Republic of China, personal communication.
21. C. Hartmann, J. Smeyers-Verbeke, W. Penninckx, Y. Vander Heyden, P. Vankeerberghen and D. L. Massart, Anal. Chem., 67, 4491-4499 (1995).
22. M. R. Spiegel, Theory and Problems of Statistics, McGraw-Hill, New York (1988).
23. A. Martínez, J. del Río, J. Riu and F. X. Rius, Chemometrics Intell. Lab. Syst., 49, 179-193 (1999).
24. G. W. Snedecor and W. G. Cochran, Statistical Methods, Iowa State University Press, Ames, Iowa (1989).
25. A. Martínez, J. Riu and F. X. Rius (1999), in preparation.
26. G. B. Thomas Jr. and R. L. Finney, Calculus and Analytic Geometry, Addison-Wesley, Wilmington, Delaware, 943-950 (1987).
27. P. De Vogt, J. Hinschberger, E. A. Maier, B. Griepink, H. Muntau and J. Jacob, Fresenius J. Anal. Chem., 356, 41-48 (1996).
28. E. Saouter and B. Blattmann, Anal. Chem., 66, 2031-2037 (1994).
29. G. J. Hahn and W. Q. Meeker, Statistical Intervals: A Guide for Practitioners, John Wiley & Sons, New York (1991).
30. CETAMA, Statistique appliquée à l'exploitation des mesures, Masson, Paris, 31 (Appendix) (1986).
31. A. M. Mood and F. A. Graybill, Introduction to the Theory of Statistics, McGraw-Hill, New York (1963).
4.6 Conclusions

The conclusions drawn from the article presented in section 4.3 are confirmed by the results obtained with the Monte Carlo simulation procedure presented in section 4.2. By applying the joint test to the BLS regression coefficients estimated for each of the 100,000 global data sets (those containing the experimental results of all the analytes at the same time), it has been shown that the detection of significant differences between the results of the two analytical methods being compared is carried out correctly when all the results of the analysis of the different analytes are considered together. On the other hand, the Monte Carlo simulation process has also shown that, when the joint test is applied to the BLS regression coefficients estimated from the individual data sets (which contain the experimental results of each analyte separately), the detection of significant differences between the results of the two methods is very difficult. The reason is that estimating the BLS regression coefficients from a low number of experimental values produces overestimates of the experimental error s^2 which generate, for a given level of significance α, oversized confidence intervals. In this case there is a high probability of committing a β error, since the null hypothesis (H0) would be accepted when the correct one is the alternative hypothesis (H1).

On the other hand, section 4.5 has shown that it is possible to estimate the probability of committing a β error when the joint test is applied to the BLS regression coefficients. It has also been verified that the mathematical expressions developed provide correct estimates of the probability of committing a β error, provided that a good estimate of the experimental error s^2 is available. Under these conditions, the OLS and WLS regression methods yield erroneous estimates of the probability of committing a β error in comparison with the estimate obtained with the BLS regression method, which takes into account the errors made in the measurements of the samples by the two methods being compared.

4.7 References

1. Riu J., Rius F.X., Analytical Chemistry, 68 (1996) 1851-1857.
2. Mandel J., Linnig F.J., Journal of Quality Technology, 16 (1984) 1-14.
3. Spiegel M.R., Theory and Problems of Statistics, McGraw-Hill: New York, 1988.
4. Massart D.L., Vandeginste B.M.G., Buydens L.M.C., de Jong S., Lewi P.J., Smeyers-Verbeke J., Handbook of Chemometrics and Qualimetrics: Part A, Elsevier: Amsterdam, 1997.
5. Meier P.C., Zünd R.E., Statistical Methods in Analytical Chemistry, John Wiley & Sons: New York, 1993.
6. Güell O., Holcombe J.A., Analytical Chemistry, 60 (1990) 529A-542A.
7. Hahn G.J., Meeker W.Q., Statistical Intervals, a Guide for Practitioners, John Wiley & Sons: New York, 1991.




























CHAPTER 5
Comparison of multiple methods using maximum likelihood
principal component analysis considering errors in all axes

5.1 Objective of the chapter

Desprs dhaver tractat diferents aspectes de la comparaci de dos
mtodes analtics mitjanant regressi lineal considerant els errors en els
dos eixos, en el cinqu captol es presenta lextensi lgica al camp
multivariant estudiant la comparaci dels resultats obtinguts per mltiples
mtodes analtics considerant les incerteses degudes als errors comesos en
la mesura de mostres amb diverses concentracions. Aquesta comparaci
pot ser aplicada, entre daltres, al que es coneix en el camp de la qumica
analtica amb el nom destudis interlaboratori, emprats amb diferents
finalitats de les quals es poden distingir tres tipus:
1

a) Estudis de certificaci dun material (material-certification studies). Hi
participen laboratoris especialitzats, de reconegut prestigi i competncia,
que analitzen amb diferents mtodes un material per tal de determinar la
concentraci dun o ms analits amb la menor incertesa possible. Per tant,
la finalitat daquests estudis s la de proporcionar materials de referncia.

b) Estudis de collaboraci o daptitud dun mtode analtic (collaborative
studies or method-performance). Sutilitzen per establir les caracterstiques
dun mtode especfic danlisi, que normalment tenen a veure amb la
precisi.
2,3
La International Standarization Organisation (ISO) ha publicat un
protocol que permet avaluar la precisi i el biaix dun mtode analtic, per
noms considerant un nivell de concentraci.
4

c) Estudis daptitud de laboratoris (proficiency studies). Diversos
laboratoris participants que volen augmentar el seu nivell de qualitat
analitzen un o ms analits dun material. Els laboratoris participants fan
servir diferents mtodes analtics segons la disponibilitat. Els resultats
obtinguts per cadascun dels laboratoris sn comparats entre ells i aix
Captol 5. Comparaci de mltiples mtodes ...

218
poder millorar laptitud de cada laboratori per analitzar els analits
considerats.
5,6
Per tant, en aquest cas l'objecte d'avaluaci s el laboratori.

Tot i que la comparaci dels resultats de diferents mtodes analtics
s til en qualsevol dels tres casos descrits anteriorment, el seu s s ms
freqent en els estudis daptitud de laboratoris (proficiency studies), en els
quals ens centrarem. Com es comenta ms detalladament a lapartat 5.4, en
aquest tipus destudis interlaboratori sutilitzen uns tests estadstics per
establir diferents parmetres de qualitat com ara la repetibilitat i
reproducibilitat en estudis de collaboraci o lexistncia de biaixos en els
estudis daptitud dun conjunt de laboratoris. Aquests tests estadstics,
per, presenten una srie dinconvenients entre els quals cal destacar el fet
de considerar els resultats de les mostres amb diferents nivells de
concentraci per separat i no conjuntament, o no tenir en compte les
incerteses degudes als errors comesos per cadascun dels laboratoris en
comparaci en la mesura de les mostres.

Per tant, lobjectiu daquest captol s el de desenvolupar una
tcnica que permeti comparar els resultats de diversos mtodes analtics
que solucioni els inconvenients caracterstics dels tests estadstics
convencionals i amb la que es puguin tenir en compte les incerteses
degudes als errors en la mesura comesos pels diferents mtodes (o
laboratoris) en comparaci. Aquesta tcnica de comparaci de mltiples
mtodes es fonamenta en ls de lanlisi per components principals de
mxima versemblana (maximum likelihood principal component analysis,
MLPCA), del qual sen detalla el funcionament a lapartat 5.2, aix com en
ls del test conjunt per lordenada a lorigen i el pendent estimats
mitjanant el mtode de regressi BLS. En lapartat 5.3 es presenten els
resultats obtinguts del procs de validaci daquesta tcnica de comparaci
de mltiples mtodes. El seu funcionament es detalla en lapartat 5.4 com
part de larticle Multiple analytical method comparison by using MLPCA and
linear regression with errors in both axes, enviat per la seva publicaci a la
5.1 Objectiu del captol

219
revista Analytica Chimica Acta. Finalment, les conclusions del captol es
recullen en lapartat 5.5.

5.2 Maximum likelihood principal component analysis

MLPCA is a multivariate modelling method analogous to principal component analysis (PCA),[7,8] but one which considers the errors made in the measurement of the different experimental values contained in the m×n matrix R in order to estimate the coefficients of the best possible p-dimensional model from a maximum likelihood point of view. The decomposition of the matrix R by means of MLPCA can be represented as:

\mathbf{R} = \mathbf{U}\mathbf{S}\mathbf{V}^{\mathrm{T}} + \mathbf{E} = \mathbf{T}\mathbf{V}^{\mathrm{T}} + \mathbf{E}          (5.1)

where E is an m×n matrix of residuals. The matrices T and V have dimensions m×p and n×p for a p-dimensional model, and the matrix V^T is the transpose of V. These matrices contain the maximum likelihood estimates of the scores and of the loadings, respectively. It should be borne in mind that the n columns of the matrix R correspond to the methods or laboratories being compared, whereas the m rows contain the results for the samples of different concentrations analysed by each of the methods. The p-dimensional model estimated with MLPCA has the maximum probability of having given rise to the experimental measurements observed.[9]

MLPCA uses an iterative least squares method to estimate the parameters of the multivariate model (rank p, scores and loadings) that minimises the sum of the squared weighted residuals. In this chapter we have assumed that the errors in the measurement of the different samples by each of the methods are uncorrelated and normally distributed.
Under this condition, the objective function minimised by MLPCA, corresponding to the sum of the squared weighted residuals, is defined as:[9]

S^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(r_{ij} - \hat{r}_{ij})^2}{\sigma_{ij}^2}          (5.2)

In this equation the variables r_ij (elements of the matrix R) represent the measurements of the sample of concentration i by method j. The variable r̂_ij is the maximum likelihood estimate of r_ij provided by the model, and σ_ij is the true standard deviation of the error made in its measurement. In practice it is impossible to obtain the true values of the variables σ_ij, and one has to work with their estimates s_ij. Analogously to the BLS regression method, MLPCA implicitly assumes that the true values σ_ij are known. Working with the estimates s_ij generated from the replicates carried out in the analysis of the m samples by the n methods affects the quality of the estimates of the model coefficients.[9] Even so, it has been a premise of this doctoral thesis to consider that modelling methods such as MLPCA and regression methods such as BLS, which take into account the standard deviations of the errors made in the experimental measurements, even if these are approximate, are better than those that do not take them into account.

In order to understand more easily how the MLPCA algorithm estimates the multivariate model, a brief description of how it works is needed. Before starting, however, it is useful to visualise the different matrices involved in the MLPCA process. Figure 1 shows the matrix of experimentally measured concentrations (R) and the matrix containing the corresponding variances (Q):

R = [ r_11  r_12  ...  r_1n ;  r_21  r_22  ...  r_2n ;  ... ;  r_m1  r_m2  ...  r_mn ]   (m×n)

Q = [ σ²_11  σ²_12  ...  σ²_1n ;  σ²_21  σ²_22  ...  σ²_2n ;  ... ;  σ²_m1  σ²_m2  ...  σ²_mn ]   (m×n)

Figure 1. Matrix of measured concentrations R and matrix of true variances Q.

During the modelling process, the MLPCA algorithm uses the error covariance matrices both in the space defined by the rows (Ψ_i) and in the space defined by the columns (Ψ_j). Since the standard deviations of the errors in the experimental measurements have been assumed to be independent, the covariance matrices of each of the rows and of each of the columns, respectively, are diagonal. Figure 2 represents these matrices for the first row and the first column of the matrix Q:

Ψ_1 (first row of Q)    = diag( σ²_11, σ²_12, ..., σ²_1n )   (n×n)
Ψ_1 (first column of Q) = diag( σ²_11, σ²_21, ..., σ²_m1 )   (m×m)

Figure 2. Covariance matrices for the first row and the first column of the matrix Q.

In this case the model estimated by maximum likelihood must be equivalent in the two spaces, since the objective function to be minimised (eq. 5.2) is the same in both cases:[9]

S^2 = \sum_{i=1}^{m} \Delta\mathbf{r}_i^{\mathrm{T}} \boldsymbol{\Psi}_i^{-1} \Delta\mathbf{r}_i = \sum_{j=1}^{n} \Delta\mathbf{r}_j^{\mathrm{T}} \boldsymbol{\Psi}_j^{-1} \Delta\mathbf{r}_j          (5.3)
In this equation Δr_j and Δr_i are column and row vectors of the matrix ΔR, resulting from the differences between the matrix of measured concentrations R and the matrix of predicted maximum likelihood estimates R̂. In this way the matrix R̂ will be the same both in the space defined by the rows and in the space defined by the columns. For this reason an iterative calculation procedure has been developed that transposes the matrix R̂, so that it alternately uses the maximum likelihood estimates in the space generated by the rows to recalculate them in the space defined by the columns.[9] The algorithm starts by decomposing the initial m×n matrix of concentrations R by means of a singular value decomposition (svd). Contrary to what happens with PCA, in MLPCA the rank of the model has to be specified at the outset, since in this case the model of rank p cannot be obtained from a model of higher rank (the models are not nested):

1. Decomposition of the initial matrix R.

[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\mathbf{R}, p)          (5.4)

The matrices U, S and V have dimensions m×p, p×p and n×p respectively. The second step consists of transposing the matrix R and calculating its maximum likelihood estimates in the space defined by the rows. To do this, MLPCA uses projections that are not orthogonal to the subspace defined by the scores, but that are instead weighted by the uncertainties in the measured concentrations:

2. Estimation of the matrix R in the space defined by the rows.

\mathbf{R} \rightarrow \mathbf{R}^{\mathrm{T}}
\hat{\mathbf{r}}_i = \mathbf{V} \left( \mathbf{V}^{\mathrm{T}} \boldsymbol{\Psi}_i^{-1} \mathbf{V} \right)^{-1} \mathbf{V}^{\mathrm{T}} \boldsymbol{\Psi}_i^{-1} \mathbf{r}_i          (5.5)

In this equation r_i is a column vector of dimensions n×1 of the new matrix R^T (a row vector of the original matrix R). In this way an estimate of the transposed initial matrix, of dimensions n×m, is obtained, that is, R̂^T, and therefore the objective function can be calculated according to eq. 5.3:

S_1^2 = \sum_{i=1}^{m} (\mathbf{r}_i - \hat{\mathbf{r}}_i)^{\mathrm{T}} \boldsymbol{\Psi}_i^{-1} (\mathbf{r}_i - \hat{\mathbf{r}}_i)          (5.6)

3. Estimation of the matrix R in the space defined by the columns.

In the third step a singular value decomposition of the matrix R̂^T estimated in the second step is performed:

[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\hat{\mathbf{R}}^{\mathrm{T}}, p)          (5.7)

In this case the matrices U, S and V will have dimensions n×p, p×p and m×p respectively. The second step is repeated, but now the m×n matrix R̂ is estimated in the space defined by the columns:

\mathbf{R} \rightarrow \mathbf{R}^{\mathrm{T}}

\hat{\mathbf{r}}_j = \mathbf{V} \left( \mathbf{V}^{\mathrm{T}} \boldsymbol{\Psi}_j^{-1} \mathbf{V} \right)^{-1} \mathbf{V}^{\mathrm{T}} \boldsymbol{\Psi}_j^{-1} \mathbf{r}_j          (5.8)

In eq. 5.8, r_j is a column vector of dimensions m×1 (a column of the original matrix R, i.e. a row of R^T). Now the value of the objective function is calculated according to the expression:

S_2^2 = \sum_{j=1}^{n} (\mathbf{r}_j - \hat{\mathbf{r}}_j)^{\mathrm{T}} \boldsymbol{\Psi}_j^{-1} (\mathbf{r}_j - \hat{\mathbf{r}}_j)          (5.9)

4. Calculation of the convergence parameter.

In the fourth step, a singular value decomposition of the matrix R̂ estimated in the previous step is performed once more:

[\mathbf{U}, \mathbf{S}, \mathbf{V}] = \mathrm{svd}(\hat{\mathbf{R}}, p)          (5.10)

Since, in order to obtain a maximum likelihood estimate of the matrix R, the condition S_1^2 = S_2^2 (eq. 5.3) must be fulfilled, the difference between its two latest successive values is checked:

\delta = \frac{\left| S_1^2 - S_2^2 \right|}{S_2^2}          (5.11)

If the value of δ is lower than the established convergence limit (in this case 10^-10), the procedure ends. Otherwise, the algorithm returns to the second step.

It should be noted that, in the case of the comparison of analytical methods, the original matrix R is used without centring. This is because in this case the concentration range is the same for all the methods being compared and, therefore, the differences between the results of the different methods are due exclusively to errors (random or systematic) in the experimental measurements of the different samples. In the case of spectroscopic data, centring of the data is frequently used, since this removes the variations in the absorbances that are not due to the different composition of the samples, but to the variation in the capacity of the analyte to absorb radiation of different wavelengths.
5.3 Validation of the procedure for comparing multiple methods

As described in section 5.4, the procedure developed for comparing multiple analytical methods is based on applying the joint confidence test to the coefficients of the BLS regression line. This regression line relates the m concentrations obtained from the analysis of the samples by each analytical method to the m concentrations of the remaining analytical methods. These latter m concentrations are generated by applying MLPCA to the concentrations obtained from the analysis of the samples by all the remaining analytical methods (see Figure 1 in section 5.4). If, when the concentrations of method j are compared with those of the remaining methods, the point defined by the coefficients of the BLS regression line falls inside the confidence region generated for a level of significance α, it can be concluded that there are no significant differences between the concentrations measured by analytical method j and the concentrations obtained by the rest of the methods.

The BLS regression method (see section 1.4.1.2 of the introduction) is based on an iterative least squares procedure which, unlike the methods based on the maximum likelihood principle, does not have a rigorous mathematical foundation. Therefore, to show that the new procedure correctly detects significant differences between the results of the methods being compared, the whole procedure for comparing the results of multiple methods has to be validated. This validation was carried out using four initial data sets. These data sets simulate the results of ten analytical methods or laboratories that analyse three analytes at five concentration levels. The fifteen resulting points for each laboratory are randomly distributed over a concentration range from 0 to 100 units. Since the number of methods being compared makes it impossible to represent the initial data sets graphically, a detailed description is given below.

Each of the four initial sets simulates a possible situation with regard to the presence of bias in the results of some of the methods. Thus, in one of the initial data sets the results of the ten methods are identical, simulating a case in which all the methods are comparable. In the other three initial data sets, the number of methods with biased results was set to one, three and five respectively. In these cases the results of the methods considered to be biased were 10% higher than those of the unbiased methods. In addition, three types of uncertainty were considered for each of the four kinds of initial data sets: homoscedastic sets, made up of data pairs with constant standard deviations whatever the concentration value, and two classes of heteroscedastic sets, one in which the standard deviations are 10% of the individual concentration values and one in which the standard deviations vary randomly between 6% and 12% of each individual concentration value.

From each of these four initial data sets, 100,000 simulated data sets were generated using the Monte Carlo method.10,11 As explained in previous chapters, this simulation method generates new data sets by adding a random error to each of the individual values of an initial set. The magnitude of this random error depends directly on the uncertainties associated with each of the initial values. In this way, when the procedure for comparing multiple methods is applied for a level of significance α to data sets simulating unbiased methods, significant differences should only be detected in α% of the simulated sets. However, if the method under comparison provides biased results with respect to the rest of the methods, significant differences should be detected in the results of this method in a percentage of cases higher than that observed for the unbiased methods.
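
A minimal sketch of this simulation step, assuming normally distributed measurement errors (the function name and arguments, such as monte_carlo_sets or sd_R0, are purely illustrative), could be:

import numpy as np

rng = np.random.default_rng()

def monte_carlo_sets(R0, sd_R0, n_sets=100_000):
    # R0: initial m x n matrix of concentrations; sd_R0: the standard deviations
    # associated with each individual value. Every simulated set adds to each
    # value a random error drawn from N(0, sd_R0) for that value.
    for _ in range(n_sets):
        yield R0 + rng.normal(0.0, sd_R0)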

Figure 3 shows the results obtained when the procedure for comparing the results of multiple analytical methods was applied to the 100,000 data sets simulated with the Monte Carlo method from each of the four initial data sets, for different levels of significance α.

[Figure 3: four panels, a) to d); x axis: analytical method (A to J); y axis: % of simulated results detected as biased.]

Figure 3. Results obtained from the validation of the procedure for comparing the results of multiple analytical methods by means of simulated data sets.
Figure 3a shows the results for the case with no biased method. Figures 3b, 3c and 3d show the results obtained when 1 (method A), 3 (methods A, E and J) and 5 (methods A, C, E, G and I) biased methods were considered, respectively. The three types of lines represent the results obtained for the different types of uncertainty: homoscedasticity (solid line), constant heteroscedasticity (dashed line) and random heteroscedasticity (dotted line). For each of the four initial sets and each of the three types of uncertainty, levels of significance α of 1%, 5% and 10% were considered.

As can be seen in Figure 3a, when all the methods being compared are simulated to give comparable results, the percentage of times in which significant differences are detected for each of the ten methods with respect to the other nine is approximately α% in each case, whatever the type of uncertainty considered. This is equivalent to saying that the percentage of times in which the presence of significant differences in the results of each method is wrongly detected is similar to the level of significance α, which sets the probability of wrongly flagging a method as biased. It can therefore be concluded that the procedure for comparing multiple analytical methods does not detect bias in the results of the different methods when it does not really exist.

On the other hand, when one of the methods is simulated to be biased with respect to the rest, the validation results in Figure 3b show that the procedure for comparing multiple methods correctly detects the method (method A in this case) that generates results which are biased with respect to those of the other methods being compared. It should be noted that the percentage of times in which significant differences are detected in the results of the unbiased methods (B to J) is slightly higher than the level of significance α set in each case. This is because, when the results of one of the unbiased methods are compared with the rest, method A, which provides biased results, is among the nine remaining methods. This increases the differences between the results of the unbiased method and those of the other nine.

Similarly, when three methods (A, E and J) are simulated to generate biased results, Figure 3c shows that these three methods can be identified with the procedure for comparing the results of multiple analytical methods. On the other hand, when the results of the unbiased methods (B, C, D, F, G, H and I) are compared with the rest, significant differences are detected in a higher percentage of cases. As in the previous case, this is because, among the nine methods with which each unbiased method is compared, there are three methods that generate results different from the rest, which increases the differences between the unbiased methods and the other nine.

Finally, Figure 3d shows the results of the validation process when five methods are simulated to generate biased results and the other five to generate correct results. As can be seen in this figure, the procedure for comparing multiple methods detects significant differences between the results of each method and the other nine in a very similar percentage of cases. This is because in this case the differences between the results of the biased and unbiased analytical methods are very similar, so no method is especially different from the other nine: the procedure cannot reliably detect which methods are biased when approximately 50% of the methods give results different from the rest.

5.4 Multiple analytical method comparison by using MLPCA
and linear regression with errors in both axes (Analytica
Chimica Acta, submitted for publication)

Àngel Martínez*, Jordi Riu and F. Xavier Rius

Department of Analytical and Organic Chemistry.
Institute of Advanced Studies. Universitat Rovira i Virgili.
Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.


ABSTRACT

This paper discusses a new stepwise approach for comparing the
results from several analytical methods which analyse a set of analytes at
different concentration levels, taking into account all the individual
uncertainties produced by measurement errors. This stepwise comparison
approach starts by detecting the methods that provide outlying concentration
results. The concentration results from each one of the remaining analytical
methods are then compared to the ones from the other methods taken
together, by using linear regression. To do this, the concentration results
from the methods considered together and their individual uncertainties are decomposed at each step to obtain a vector of concentrations. This is
achieved by a maximum likelihood principal component analysis
(MLPCA), which takes into account the measurement errors in the
concentration results. The bivariate least squares (BLS) regression method
is then used to regress the concentration results from the method being
tested at a given step on the scores generated from the MLPCA
decomposition (which have the information of the other remaining
methods), considering the uncertainties in both axes. To detect significant
differences between the results from the method being tested at a given
5.4 Multiple analytical method comparison ...

231
step and the results from the other methods (MLPCA scores), the joint
confidence interval test is applied on the BLS regression line coefficients for
a given level of significance α. We have used four real data sets to provide
application examples that show the suitability of the approach.

INTRODUCTION

Interlaboratory studies are used in analytical chemistry for a variety
of purposes. These may be proficiency tests (for comparing the
performance of several laboratories), collaborative studies (for validating a
standard method) or certification trials (to establish the true analyte
concentration in a reference material). Data from these interlaboratory
studies is usually statistically analysed to characterise different figures of
merit such as repeatability and reproducibility in collaborative studies, or
the existence of systematic bias (i.e. laboratory performance) in proficiency
tests. To date, systematic bias has been tested with different kinds of well-
known statistical tests such as ranking [1] and z-score methods [2]. Both of
these methods use scores to evaluate laboratory performance, but are
calculated in different ways. The ranking method uses a ranking of the
results from the different laboratories to calculate the scores. In this way,
the laboratory score is the sum of the ranks of the different samples. The z-
score method, on the other hand, uses the expression $z = (x - \hat{x})/s$, where x is the result or mean of results obtained for a given concentration sample by a laboratory, $\hat{x}$ is the best possible estimate of the true concentration, and s is the standard deviation of all the laboratories after outliers have been eliminated. With good $\hat{x}$ and s estimates, the z-scores can be assumed to


follow a normal distribution. In both cases bias is detected for a given
laboratory if its score is beyond the lower or upper limits defined according
to the corresponding scores distribution in each case, for a given level of
significance α.

With these two methods we can detect the presence of significant
bias in the results from more than two laboratories at each concentration
level independently. In other words, they do not consider samples
containing different concentration levels simultaneously when checking the
presence of significant bias. However, it might be more suitable to consider
all the concentration samples simultaneously, since the main objective of
interlaboratory studies is to establish an overall performance index for each
laboratory. Moreover some statistical aspects discourage the use of the z-
score method; for instance when more than one test material is analysed. In
these cases the z-score method uses different kinds of combination scores,
such as RSZ or SSZ [3]. These combination scores are not generally
recommended for evaluating the performance of the laboratories when
determining one or more analytes in samples with different chemical
matrices [4]. This is because in these cases a statistically heterogeneous
population of z-scores might be obtained, which makes the assumption of
normality, on which the z-score method is based, no longer true.

For these reasons, this paper discusses a new approach, which,
unlike the existing ones, makes it possible to identify methods providing outlying
concentration results and then detect significant bias in the concentrations
from at least one of the remaining methods in test. Moreover, this approach
can cope with heteroscedastic uncertainties from measurement errors (i.e. it
considers the different degrees of precision from the different methods in
comparison) generated by the replicate analysis of one or more analytes in
several concentration samples with similar or different chemical matrices.
In this way, the comparison is made by taking into account not only the
concentration values provided for each method, but also their uncertainties.
Once the outlying methods have been removed, the concentrations from
every method are compared to the scores from applying MLPCA on the
rest of methods, using linear regression and the joint confidence interval for
the intercept and the slope.

The approach presented here is general in nature, that is, it can be
applied to any experimental problem in which the concentration results
from analysing one or more analytes in different concentration samples by
several laboratories, analytical methodologies, techniques, analysts or
instruments are to be compared. To show the suitability of the new
approach, we used four real data sets and drew conclusions about the
validity of the laboratories in comparison according to the concentration
results from the different laboratories in each data set.

BACKGROUND AND THEORY

Notation

In this paper we have used bold uppercase characters to denote
matrices, bold lowercase characters to denote vectors and italic lowercase
characters to denote scalars. The true values of the different variables are
represented with Greek characters, while their estimates are denoted with
Latin ones. The variables used during the comparison process outlined in
Figure 1 are described as follows:

Detection of outlying methods. The concentration results from the replicate analysis of m concentration samples by the n laboratories to be compared are in an m×n matrix R. From the application of MLPCA on matrix R, the loadings of the first principal component are obtained in an n×1 vector p. After using the Grubbs test on the elements of vector p, the number of laboratories with outlying results is represented by the variable l.

[Figure 1: flow chart of the comparison procedure. The initial m×n matrices R and var(R) (Initial Methods) are decomposed by MLPCA (1 PC) to obtain the loading vector p (n×1); the single/paired Grubbs test detects outlying methods, whose results are removed from R and var(R) (set l = 0, l = l + 1, k = n - l). For the k remaining methods and j = 1, ..., k, the jth column is extracted from the m×k matrices R and var(R) to obtain r_j and var(r_j); MLPCA (1 PC) on the remaining m×(k-1) matrices gives the scores t and var(t) (m×1); BLS regression of r_j on t and the joint confidence interval test (α = 5%) on (b_0, b_1) against the reference point (0,1) are then applied; the loop ends when j = k.]

Figure 1. Scheme of the overall process for comparing the concentration results from multiple laboratories. See the Notation section for a description of the variables.

Estimation of scores with MLPCA. After the l laboratories with outlying results have been eliminated, k laboratories are left to be compared (k = n - l). k steps are therefore necessary to compare the concentration results from each one of the laboratories with the others. In the jth step (1 ≤ j ≤ k) the results from the jth laboratory (column vector $r_j$) are compared to those from the other k-1 laboratories in the new m×(k-1) matrix R. The application of MLPCA on matrix R produces a (k-1)×1 vector of loadings p and an m×1 vector of scores t for the first principal component. The individual variances of the concentration results from the replicate analysis of the ith (1 ≤ i ≤ m) concentration sample by the k-1 methods are in the diagonal (k-1)×(k-1) matrix $\Sigma_i$ (uncorrelated measurement errors are considered). Projecting each one of these m diagonal covariance matrices onto the scores subspace yields the scalar $s_{t_i}^2$. This is the estimate of the true variance ($\sigma_{t_i}^2$) of the ith score in t. Therefore the m×1 vector of variances var(t) comprises the m $s_{t_i}^2$ values.

BLS regression method and joint confidence interval test. In the jth step the BLS technique is used to regress the m×1 vector of concentrations $r_j$ on the m×1 vector of scores t, considering the uncertainties associated with the elements of both vectors. Estimates of the variance for the elements of $r_j$ from the replicate analysis of the ith concentration sample are denoted as $s_{r_{ij}}^2$, while their true values are $\sigma_{r_{ij}}^2$. The true values of the BLS regression coefficients are $\beta_0$ (intercept) and $\beta_1$ (slope), while their respective estimates are $b_0$ and $b_1$. The estimates of the standard deviation of the intercept and the slope for the BLS regression line are $s_{b_0}$ and $s_{b_1}$ respectively. The true experimental error (residual mean square error), expressed in terms of variance for the m data pairs ($t_i$, $r_{ij}$), is $\sigma^2$ and its estimate is $s^2$. The predicted values of the results from the jth method being tested from the BLS regression line are $\hat{r}_{ij}$.


Maximum likelihood principal component analysis (MLPCA)

Since the decomposition of matrix R using MLPCA is essential in
this comparison approach we believe that it may be useful to note some
important points. In this multiple comparison approach, and without any loss of generality, the methods are treated like the sensors in spectroscopic data. MLPCA makes it possible to estimate the multivariate model
taking into account the uncertainties of each concentration result due to
measurement errors, so that non-orthogonal projections of the original data
into the scores subspace [5] are obtained. In this case MLPCA projects the
concentrations and standard deviations onto a one-dimensional space
defined by the first principal component (PC) taking into account the
uncertainties in all the individual concentrations. As there is a linear
relationship between the true concentrations from all the methods, most of
the variance in the scores is explained by the first PC, even when
concentrations are affected by measurement errors. We have seen that the
minimum percentage of variance explained by the first PC in any of the
data sets studied (not only those in the experimental section) was never less
than 96%.

Data pairs with lower individual uncertainties (supposedly those
with lower measurement errors) are those from which MLPCA extracts a
greater amount of information to estimate the multivariate model
parameters (i.e. scores and loadings). In this way, even when data from any
of the methods in test is missing, the loadings of the first PC from MLPCA
can still retain most of the original chemical information [6]. To do so, high
standard deviations are associated to the estimates of the missing
concentration values. Therefore these are not taken into account by MLPCA
to estimate the loadings of the first PC.
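
For example, assuming missing values are stored as NaN and that "high" standard deviations are taken as roughly ten times the largest observed one (an illustrative choice, similar in spirit to data set 2 of the experimental section), this treatment could be sketched as:

import numpy as np

def fill_missing(R, sd_R, factor=10.0):
    # replace each missing concentration by a rough estimate and attach a very
    # large uncertainty, so that MLPCA gives it essentially no weight
    R, sd_R = R.copy(), sd_R.copy()
    missing = np.isnan(R)
    R[missing] = np.nanmean(R)
    sd_R[missing] = factor * np.nanmax(sd_R)
    return R, sd_R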


Detection of outlying methods

To identify those analytical methods that provide outlying
concentration results, so that they do not produce wrong conclusions about
the existence of significant bias in the concentration results from the other
methods in test, it is first necessary to decompose the concentrations in the
initial m×n matrix R with MLPCA. This produces an n×1 loading vector p with information about the performance of the methods in test (Figure 1). These loadings are distributed around a theoretical value of $\sqrt{n}/n$, since this would be the value of all the loadings if the true concentrations were considered. This is shown in Figure 2 where, in a three-dimensional case, the loading values are equal to $\sqrt{3}/3$ when all three methods provide concentration results that are identical to the true values. In this way, if the systematic bias in the concentration results from a given method is big enough, the corresponding loading value will differ from the rest. We have checked with normal probability plots (results available on request to the authors) that the distribution followed by the loadings in p is normal. Because the single (or paired) Grubbs' test is based on the assumption of normality [7], it makes it possible to detect, for a level of significance of 2.5% (2-tails) [8], the loadings that can be considered as outliers and, therefore, the methods that should be removed from the initial m×n matrix R.
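
As an illustration, a sketch of the single (two-sided) Grubbs test applied to the loading vector p could look as follows; the paired variant used when two simultaneous outliers are suspected is analogous (names such as single_grubbs are illustrative):

import numpy as np
from scipy import stats

def single_grubbs(p, alpha=0.025):
    # two-sided single Grubbs test on the n loadings of the first PC;
    # returns the index of the outlying loading, or None if no outlier is found
    p = np.asarray(p, dtype=float)
    n = p.size
    dev = np.abs(p - p.mean())
    G = dev.max() / p.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    G_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    return int(dev.argmax()) if G > G_crit else None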

[Figure 2: geometric representation in a three-method space (Method 1, Method 2, Method 3) of the first PC, with $\cos(\alpha) = \cos(\beta) = \cos(\gamma) = \sqrt{3}/3$.]

Figure 2. Loading values (cos(α), cos(β) and cos(γ)) in the hypothetical case that the three laboratories provide identical results to the true concentrations.

Estimation of scores with MLPCA

After the l outlying methods have been eliminated, the concentration results from the k remaining methods are compared (Figure 1). To do this, k steps are carried out so that in each one the concentrations from one method (in the m×1 vector $r_j$) are compared, using linear regression, to the m×1 vector of scores t containing information about the concentrations of the k-1 remaining methods. These scores are generated from the MLPCA decomposition of the concentrations from the methods that comprise the m×(k-1) matrix R in each step, taking into account the uncertainties of the concentrations in matrix R. To estimate the uncertainties of vector t, the uncertainties of the concentrations in matrix R (usually obtained by the replicate analysis of the m samples) are projected onto the scores subspace. By error propagation this can be expressed as [6]:

$s_{t_i}^2 = \left( p^{T} \, \Sigma_i^{-1} \, p \right)^{-1}$     (1)

Estimates of the variances from eq. 1 are approximate because this
expression does not consider the uncertainty inherent to the principal
components from MLPCA. However, it gives an indication of the precision
of the replicate measurements [6], which the BLS regression method takes
into account to find the regression coefficients of the regression line in each
of the k steps.
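
Since $\Sigma_i$ is diagonal here, eq. 1 reduces to a simple weighted sum; a one-line sketch (illustrative names) is:

import numpy as np

def score_variance(p, var_row):
    # eq. 1 with a diagonal Sigma_i: s_ti^2 = 1 / sum_k( p_k^2 / var_row_k ),
    # where var_row holds the error variances of the ith row of R
    return 1.0 / np.sum(p**2 / var_row)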

Before comparing the results in $r_j$ with the scores in t in the jth step, the scores must first be scaled, because the units of the scores are different from those of the concentrations. In other words, the scores are the coordinates of the concentrations in a different coordinate system (one PC), whose direction is defined by the loadings in p. In the hypothetical case where all the results from the k-1 methods were identical to the true concentration values, the m×1 vector of scores t would perfectly fit a straight line corresponding to the first PC. As stated previously, the loadings in p would be equal to $\sqrt{k-1}/(k-1)$ (Figure 2). In such a case, the differences between vectors t and $r_j$ are therefore due only to the change of coordinate system, following the relation $t = \sqrt{k-1}\, r_j$ (Figure 2) for any j (1 ≤ j ≤ k). This shows that the scores in t should be divided by $\sqrt{k-1}$ to offset the change of scale caused by the projection of the m×(k-1) matrix R onto the first PC.
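
In code this rescaling is a single operation; if the score variances are rescaled accordingly (an assumption that follows from dividing t by a constant), it could read:

import numpy as np

def rescale_scores(t, var_t, k):
    # divide the scores by sqrt(k-1) to offset the change of scale; the
    # corresponding variances are divided by (k-1)
    return t / np.sqrt(k - 1), var_t / (k - 1)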


Bivariate Least-Squares Regression (BLS)

Once the scores in vector t have been obtained from applying MLPCA on the m×(k-1) matrix R in the jth step, they have to be regressed against the concentration values from the jth analytical method, $r_j$, considering all the individual uncertainties. From all the existing least squares approaches for estimating the regression coefficients when measurement errors in both axes are present, Lisý's method [9] (referred to as BLS) was found to be the most suitable [10]. This technique assumes that the true linear model between the error-free scores $\tau_i$ (from applying MLPCA on the error-free concentration results from the corresponding k-1 methods in the jth step) and the error-free results $\rho_{ij}$ from the corresponding jth method is:

$\rho_{ij} = \beta_0 + \beta_1 \tau_i$     (2)

The true variables $\tau_i$ and $\rho_{ij}$ are unobservable; instead, only the following variables can be measured:

$t_i = \tau_i + \delta_i$     (3)

$r_{ij} = \rho_{ij} + \varepsilon_{ij}$     (4)

The random errors made in generating the variables $t_i$ and $r_{ij}$ are represented by the variables $\delta_i$ and $\varepsilon_{ij}$, where $\delta_i \sim N(0, \sigma_{t_i}^2)$ and $\varepsilon_{ij} \sim N(0, \sigma_{r_{ij}}^2)$. In this way, introducing eqs. 3 and 4 into eq. 2 and isolating the variable $r_{ij}$, the following expression is obtained:

$r_{ij} = \beta_0 + \beta_1 t_i + \varepsilon_i$     (5)

where $\varepsilon_i$ is the ith true residual error, with $\varepsilon_i \sim N(0, \sigma_i^2)$ [11], and can be expressed as a function of $\delta_i$, $\beta_1$ and $\varepsilon_{ij}$:

$\varepsilon_i = \varepsilon_{ij} - \beta_1 \delta_i$     (6)

Several authors have developed procedures to estimate the regression line coefficients based on a maximum likelihood approach whenever errors in both variables are present [12-15]. In most cases these methods need the true predictor variable to be carefully modelled [14]. This is not usually possible in chemical analysis, where the true predictor variables $\tau_i$ are often not randomly distributed (i.e. functional models are assumed). Moreover, there are cases in which the experimental data are heteroscedastic and estimates of measurement errors are only available through replicate measurements (i.e. the ratio $\sigma_{t_i}/\sigma_{r_{ij}}$ is non-constant or unknown). These conditions, common in chemical data, make it very difficult to rigorously apply the principle of maximum likelihood to the estimation of the regression line coefficients. On the other hand, Sprent [11] presented a method for estimating the regression coefficients using a maximum likelihood approach, even when a functional model is assumed. This method is not rigorously applicable when individual heteroscedastic measurement errors are considered. Moreover, it has been shown that when assuming $\sigma_{t_i} = \sigma_{r_{ij}}$ for any i, least squares methods provide the same estimates of the regression coefficients as those from a maximum likelihood estimation approach [16]. For these reasons, we have chosen an iterative least squares method (i.e. the BLS method) which can be used on any group of ordered pairs of observations with no assumptions about the probability distributions [16]. This allows the method to be applied to real chemical data when individual heteroscedastic errors in both axes are considered. In this way, the BLS regression method relates the measured variables $t_i$ and $r_{ij}$ as follows [17]:

$r_{ij} = b_0 + b_1 t_i + e_i$     (7)

where $e_i$ is the observed ith residual error. The variance of $e_i$ is $s_{e_i}^2$ and will be referred to as the weighting factor. This parameter takes into consideration the experimental variances of any individual point in both axes ($s_{t_i}^2$ and $s_{r_{ij}}^2$). The covariance between the variables for each ($t_i$, $r_{ij}$) data pair, which is normally assumed to be zero, can also be taken into account:

$s_{e_i}^2 = \mathrm{var}(r_{ij} - b_0 - b_1 t_i) = s_{r_{ij}}^2 + b_1^2 s_{t_i}^2 - 2\, b_1 \mathrm{cov}(t_i, r_{ij})$     (8)

The BLS regression method finds the estimates of the regression line coefficients by minimising the sum of the weighted residuals, S:

$S = \sum_{i=1}^{m} \frac{(r_{ij} - \hat{r}_{ij})^2}{s_{e_i}^2} = \sum_{i=1}^{m} \frac{(r_{ij} - b_0 - b_1 t_i)^2}{s_{e_i}^2} = (m-2)\, s^2$     (9)

The experimental error $s^2$ is an important variable since it provides a measure of the dispersion of the data pairs around the regression line and can give a rough idea of the lack of fit of the experimental points to the regression line. The BLS regression technique assigns lower importance to those data pairs with larger $s_{t_i}^2$ and $s_{r_{ij}}^2$ values, i.e. the most imprecise data pairs. In this way, estimates of the missing concentrations (to which we will associate high standard deviations) have a minimal influence on the BLS regression line coefficients. By minimising the sum of the weighted residuals (eq. 9), two non-linear equations are obtained, from which the regression coefficients $b_0$ and $b_1$ can be estimated by means of an iterative process [18].
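
The sketch below is not the published BLS algorithm [18], which solves the two non-linear normal equations directly; it simply minimises the weighted residual sum of eq. 9 numerically, with zero covariances assumed, and should yield essentially the same coefficient estimates (names such as bls_fit are illustrative):

import numpy as np
from scipy.optimize import minimize

def bls_fit(t, r, var_t, var_r):
    # minimise S (eq. 9) with weighting factors s_ei^2 = var_r + b1^2 * var_t
    # (eq. 8 with zero covariance); returns b0, b1 and the residual mean square error s^2
    def weighted_rss(beta):
        b0, b1 = beta
        return np.sum((r - b0 - b1 * t) ** 2 / (var_r + b1**2 * var_t))
    start = np.polyfit(t, r, 1)[::-1]            # OLS intercept and slope as starting values
    res = minimize(weighted_rss, x0=start, method="Nelder-Mead")
    b0, b1 = res.x
    s2 = weighted_rss(res.x) / (len(t) - 2)
    return b0, b1, s2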

Joint confidence interval test

To check whether there is significant bias in the concentration results from the jth method (vector $r_j$) in comparison to the scores from the corresponding k-1 remaining methods in the jth step (vector t), the joint confidence interval test for the slope and the intercept [18] must be applied to the BLS regression coefficients for a given level of significance α. As shown in an earlier paper, the BLS regression method makes it possible to compare the concentration results from analysing more than one analyte [19] in several concentration samples with different chemical matrices. If the intercept and the slope of the regression line do not simultaneously show significant differences from the reference values of 0 and 1 respectively, it can be concluded that the results from the jth method are comparable to those from the remaining k-1 ones for a given level of significance α. In this case, the experimental point ($b_0$, $b_1$) falls within the joint confidence interval centred at the reference point (0,1) and the null hypothesis $H_0$ can be accepted [20]. $H_0$ assumes that both BLS regression coefficients belong to a joint confidence distribution centred at the reference point $(b_0^{H_0}, b_1^{H_0}) = (0, 1)$. The joint confidence region is defined, for a given level of significance α, by all those values ($b_0$, $b_1$) that satisfy the equation:


$(b_0 - b_0^{H_0})^2 \sum_{i=1}^{m} \frac{1}{s_{e_{iH_0}}^2} + 2\,(b_0 - b_0^{H_0})(b_1 - b_1^{H_0}) \sum_{i=1}^{m} \frac{t_i}{s_{e_{iH_0}}^2} + (b_1 - b_1^{H_0})^2 \sum_{i=1}^{m} \frac{t_i^2}{s_{e_{iH_0}}^2} = 2\, s^2 F_{1-\alpha(2,\, m-2)}$     (10)

where $F_{1-\alpha(2, m-2)}$ is the tabulated F value for a level of significance α with 2 and m-2 degrees of freedom. The term $s_{e_{iH_0}}^2$ is the weighting factor associated to the reference regression line coefficients $b_0^{H_0}$ and $b_1^{H_0}$ from which $H_0$ is postulated, and can be recalculated from eq. 8 considering $b_1 = b_1^{H_0}$ [20].
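
A direct transcription of eq. 10 into code (zero covariances assumed; names such as joint_test are illustrative) shows how the test is applied in each of the k steps:

import numpy as np
from scipy import stats

def joint_test(t, r, var_t, var_r, b0, b1, alpha=0.05, b0_h0=0.0, b1_h0=1.0):
    # returns True when the reference point (0, 1) lies inside the joint confidence
    # region of the BLS coefficients, i.e. when no significant bias is detected
    m = len(t)
    se2 = var_r + b1**2 * var_t            # weighting factors (eq. 8)
    se2_h0 = var_r + b1_h0**2 * var_t      # weighting factors recalculated under H0
    s2 = np.sum((r - b0 - b1 * t) ** 2 / se2) / (m - 2)          # eq. 9
    lhs = ((b0 - b0_h0) ** 2 * np.sum(1.0 / se2_h0)
           + 2 * (b0 - b0_h0) * (b1 - b1_h0) * np.sum(t / se2_h0)
           + (b1 - b1_h0) ** 2 * np.sum(t**2 / se2_h0))
    return lhs <= 2 * s2 * stats.f.ppf(1 - alpha, 2, m - 2)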

It is interesting to check the lack of fit of the experimental points ($t_i$, $r_{ij}$) to the BLS regression line. When lack of fit is present, the residual mean square error $s^2$ (eq. 9) tends to be overestimated, and joint confidence regions may therefore be too large. In this case there would be a greater probability for a given method that bias would remain undetected, i.e. there would be a greater probability of committing a β error [20]. Unfortunately, a rigorous test for detecting lack of fit based on the analysis of the residual variance [21] cannot be applied because replicates for each data pair ($t_i$, $r_{ij}$) are not available. This is because the scores $t_i$ and their respective standard deviations $s_{t_i}$ are directly generated from the projection of the m×(k-1) matrices R and var(R). The only option would be to apply a $\chi^2$ test on the residual mean square error estimate $s^2$, which is a random variable that can be approximated by a $\chi^2$ distribution with m-2 degrees of freedom. However, this is a rough test for detecting lack of fit, because a chi-squared distribution is justified by the asymptotic theory only in large samples [22]. This condition is not usually met in linear regression, where the number of samples is limited. For this reason, we decided not to apply this test, since conclusions about lack of fit could be misleading.

The suitability of the approach developed for multiple method
comparison was initially examined with different kinds of simulated data
sets (results are available on request to the authors). The simulated data
sets were generated using the Monte Carlo method [23,24]. Different
uncertainty patterns, which in some cases contained results from methods
simulating bias, were considered. When the multiple method comparison
approach was applied on those data sets in which no biased methods were
simulated for a given level of significance , bias was detected in each
5.4 Multiple analytical method comparison ...

245
method in approximately an % of the cases (i.e. the theoretical results one
expects to find if the procedure is correct). On the other hand, when
concentration results from one or more methods were simulated to be
biased, the percentage of times in which bias was detected in these biased
methods was significantly higher than for the unbiased ones.

EXPERIMENTAL SECTION

Data sets

Data Set 1 [25]. The total Cr content (mg/kg) is determined in six soil samples using four different separation/extraction methods: HNO3, deionized water, KCl and acetate buffer solution. The total Cr content in the soil samples ranges between 1.1 and 1420 mg/kg. Heteroscedasticity is present in the data set in such a way that each standard deviation ranges between 2% and 14% of each individual value. Only one individual point exceeds this range, with a standard deviation of 68% of its individual mean value. Figure 3a shows the Cr content in the different samples and their standard deviations from replicate analysis by the four separation/extraction methods.

Data Set 2 [26]. Collaborative study conducted on a liquid
chromatographic method for determining taurine in infant formula and
milk powders. Twenty laboratories participated in the analysis of eight
blind duplicates ranging from approximately 3 to 65 mg/100 g of sample.
Heteroscedasticity ranges from 0.1% to 29% of the individual mean values.
In six cases one or both concentration values were missing and thus high
standard deviations (about ten times higher than the others) were
associated to the substituting taurine concentration values (see Figure 4a).

Data Set 3 [27]. Five-method comparison study to determine polycyclic aromatic hydrocarbons (PAHs, ng/g) in twelve samples of the sewage sludge CRM 392: Soxhlet method with toluol, SFE with CO2, SFE with CO2 and 5% toluene, SFE with CO2 and TEA in toluene, and SFE with CO2 and TFA in toluene. PAHs concentrations range from 46 to 1071 ng/g. Heteroscedasticity is high in the data set and the standard deviations range from 0.3% to 37% of the individual mean values. Concentrations of PAHs in the samples and their standard deviations are shown in Figure 5a.

Data Set 4 [28]. Collaborative study involving fourteen laboratories
to test a gas chromatographic method for determining putrescine in
seafood. These data are obtained from the analysis of putrescine in fourteen samples of canned tuna and raw mahimahi (including blind duplicates and a spike).
The putrescine content ranges from 0.2 to 9.2 ppm. Heteroscedasticity is
present in the data set, so the standard deviations range from 0.03% to 52%
of the individual mean values. Figure 6a shows the putrescine
concentrations and their standard deviations from the duplicate analysis of
the fourteen samples by the fourteen laboratories.

Computational Aspects

The calculations performed in this study were carried out with a
Pentium III-based personal computer with 64 MB of memory and a clock speed of 500 MHz. Although the MLPCA algorithm has been reported to be
time consuming with spectroscopic data [5], the time needed to compare
the concentration results from the analytical methods in test for all the real
data sets was never more than 3 minutes since the data sets used are
smaller than the spectroscopic ones. All the algorithms were written in
Matlab (Matlab for Microsoft Windows ver. 5.2, The Mathworks, Inc.,
Natick, MA).

RESULTS AND DISCUSSION

Results from the multiple-method comparison approach are
presented in Figures 3 to 6 and Tables 1 to 3. In each figure, the first plot
shows the concentrations (solid lines) and their standard deviations
(dashed lines) from the replicate analysis of the samples by the different
laboratories, whereas the second plot shows the loading values used for
detecting outliers. With these second plots it is possible to visually identify
any suspicious loading values, on which the Grubbs test for detecting
outliers (single or paired) is applied for a level of significance of 2.5% (2-
tails). On the other hand, the values in Tables 1 to 3 under the column α show the maximum level of significance for which no bias can be detected using the joint confidence interval test on the BLS regression coefficients from regressing the results from the jth method against the scores in t in the jth step. In this way, if this level of significance is equal to or higher than 5% (set in this case for the BLS joint confidence interval test), the differences between the results from each method in comparison to the other ones will not be significant, and vice versa (in this latter case the levels of significance are
highlighted in bold).

Data Set 1. The concentration results from the four separation/extraction methods in Figure 3a show that the results from the method using HNO3 are systematically higher than those from the other methods. This is confirmed by the plot of the loadings of the first PC from MLPCA (Figure 3b), since the loading value from the results of the method using HNO3 is much higher than the loadings from the other methods. The single Grubbs test showed that the suspicious loading value should be considered as an outlier and, therefore, the systematic differences between the concentration results from the method using HNO3 and the others were significant for a level of significance of 2.5% (2-tails). Once the results from this method were eliminated, no outlying loading values were detected
(Figure 3c) and the second stage of the multiple-method comparison
approach could therefore be applied on the results from the three
remaining methods.

The maximum levels of significance for which no bias can be detected using the joint confidence interval test on the BLS regression coefficients estimated from the results of the separation/extraction methods using deionized water, KCl and the acetate buffer solution are 69.5%, 70.9% and 61.1% respectively. Therefore, none of the results from the three methods for analyzing the total Cr content in the six soil samples is significantly different from the rest. This conclusion is logical since the concentration results from methods 2, 3 and 4 are very similar (Figure 3a).

[Figure 3, plot a): Cr content (mg/kg) vs. sample number (1 to 6) for the four methods; the curve for the method using HNO3 lies clearly above the others.]

Figure 3. Plot a): Cr contents (solid lines) and their standard deviations (dashed lines) in the six soil samples analysed by the four methods in data set 1. The bold solid line represents results from the method using HNO3 (outlier).
[Figure 3, plots b) and c): loading values of the four methods (1: HNO3, 2: H2O, 3: KCl, 4: Acetate) and of the three remaining methods after removing HNO3, compared with the theoretical values $\sqrt{4}/4$ and $\sqrt{3}/3$.]

Figure 3 (cont.). Plots b) and c): Loadings for detecting outliers with an outlier and after eliminating the results from the outlier laboratory respectively.

Data Set 2. Figure 4a shows that the concentration results from
laboratory 14 are quite different from those of the other laboratories for
some of the samples analysed. The loading plot for this data set (Figure 4b)
confirms this observation since the loading value from the results of
laboratory 14 is far from the others. The Grubbs test showed that the
loading value for the results from laboratory 14 should be considered as an
outlier. After eliminating the results from laboratory 14, the Grubbs test
was again applied on the loadings from the nineteen remaining
laboratories and no outlying loading values were detected (Figure 4c).

[Figure 4, plot a): taurine content (mg/100 g) vs. sample number (1 to 8) for the twenty laboratories; laboratory 14 deviates from the rest. Plots b) and c): loading values with and without laboratory 14, compared with the theoretical values $\sqrt{20}/20$ and $\sqrt{19}/19$.]

Figure 4. Plot a): Concentration values (solid lines) of taurine and their standard deviations (dashed lines) in the eight duplicate samples when analysed by the twenty laboratories in data set 2. The bold solid line represents results from laboratory 14 (outlier). Plots b) and c): Loadings for detecting outliers, with an outlier and after eliminating the results from the outlier laboratory respectively.


Method   α (%)   Method   α (%)
1        1.0     11       71.9
2        46.6    12       58.2
3        97.9    13       11.3
4        14.1    15       42.8
5        66.5    16       3.5
6        66.3    17       55.7
7        37.3    18       12.7
8        1.4     19       2.2
9        0.029   20       21.3
10       69.2

Table 1. Maximum levels of significance (α) for which no bias is detected between the results of the different methods in data set 2.

Table 1 shows the results from the application of the joint confidence interval test on the BLS regression coefficients. Significant differences were detected in the results from laboratories 1, 8, 9, 16 and 19. Results from laboratory 9 appear to be especially different from the rest, since the α value for which bias would not be detected is extremely low. This is confirmed by the high loading value for laboratory 9 in Figure 4c, which indicates that concentration results from this laboratory are higher than those from the others.

Data Set 3. Figure 5a shows that the concentration results from the analysis of PAHs with the five extraction methods are quite different. The plot of the loadings in Figure 5b shows that the loading values increase from the first method to the fifth, which indicates that the extraction efficiency of each successive method is higher. In this example no outlying methods were detected with the single or paired Grubbs test. Table 2 shows that the different α values are lower than the threshold value of 5%. This indicates that, because of the big differences in the concentration results from the five extraction methods in Figure 5a, the results of the five methods are significantly different from each other for a level of significance of 5%.

Method              α (%)
Soxhlet             0.47
SFE/CO2             0.0027
SFE/CO2+Toluene     1.62
SFE/CO2+TEA         0.16
SFE/CO2+TFA         0.0071

Table 2. Maximum levels of significance (α) for which no bias is detected between the results of the different methods in data set 3.


[Figure 5, plot a): PAHs content (ng/g) vs. sample number for the five extraction methods.]

Figure 5. Plot a): Concentrations (solid lines) of PAHs and their standard deviations (dashed lines) in thirteen samples when analysed by the five methods in comparison from data set 3.

[Figure 5, plot b): loading values of the five extraction methods (1: Soxhlet, 2: SFE/CO2, 3: SFE/CO2+Toluene, 4: SFE/CO2+TEA, 5: SFE/CO2+TFA), compared with the theoretical value $\sqrt{5}/5$.]

Figure 5 (cont.). Plot b): Loadings for detecting outliers among the five methods in comparison from data set 3.

Data Set 4. Figure 6a shows that the concentration results from laboratories 10 and 14 are the highest and the lowest respectively. Figure 6b shows that the loadings for these two laboratories are well above and well below the theoretical loading value of $\sqrt{14}/14$ respectively. This indicates that the results of the determination of putrescine in seafood from laboratories 10 and 14 are systematically higher and lower, respectively, than those from the other laboratories. These two loadings were detected as outliers when the paired Grubbs test for the stated level of significance of 2.5% (2-tails) was applied. Once both laboratories were eliminated, the loading plot in Figure 6c was obtained when MLPCA was applied on the results of the determination of putrescine from the remaining twelve laboratories.

[Figure 6, plot a): putrescine content (ppm) vs. sample number for the fourteen laboratories; the results from laboratories 10 and 14 lie above and below the rest. Plots b) and c): loading values with and without laboratories 10 and 14, compared with the theoretical values $\sqrt{14}/14$ and $\sqrt{12}/12$.]

Figure 6. Plot a): Putrescine concentration values (solid lines) and their standard deviations (dashed lines) in fourteen samples when analysed by the fourteen chromatographic methods in data set 4. Bold solid lines represent results from methods 10 and 14 (outliers). Plots b) and c): Loadings for detecting outliers with two outliers and after eliminating the results from the outlier laboratories respectively.

Table 3 shows that bias was detected in the results from laboratories 2, 4, 12 and 13. Since the lowest α values in Table 3 correspond to the concentration results from laboratories 12 and 4, it can be concluded that these laboratories provide the most different concentration results.

Method   α (%)   Method   α (%)
1        83.2    7        13.9
2        2.7     8        8.1
3        5.0     9        52.6
4        0.026   11       73.2
5        30.1    12       0.0013
6        53.6    13       0.99

Table 3. Maximum levels of significance (α) for which no bias is detected between the results of the different methods in data set 4.

On the other hand, although results from laboratory 3 were not
significantly different from the others, doubts may arise about the
performance of this laboratory when analysing putrescine in seafood with
the chromatographic method being tested, since the α value in Table 3 is
equal to the threshold value of 5%.

CONCLUSIONS

In this paper we have developed a stepwise multiple method comparison approach that makes it possible to detect significant bias by comparing the results from several analytical methods or laboratories, considering their individual heteroscedastic uncertainties (i.e. their different levels of precision). This approach can be applied even when more than one analyte is analysed in several concentration samples with different chemical matrices. Moreover, unlike the existing approaches, this one simultaneously considers all the concentration results and their respective uncertainties to detect methods reporting outlying concentration results. In this way, it is possible to have a clear view of the overall performance of each analytical method.

Real data sets were used to check the suitability of the multiple
method comparison approach. Conclusions about laboratories providing
outlying concentration results or results with significant bias in comparison
to the rest, seemed to agree with the differences observed in the plots of the
concentration results from the different laboratories. Although this
comparison procedure has been used to compare results from methods or
laboratories, it can be applied to any experimental problem in which results
from the analysis of several analytes in various chemical matrices at
different concentration levels are obtained with their respective individual
uncertainties.

Despite the promising results, the researcher should be aware of
two important points. The first, which is inherent to both MLPCA and BLS
techniques, is that the uncertainties from the replicate analysis of the
different concentration samples need to be known, so a greater
experimental effort is therefore required to carry out a sufficient number of
replicate analyses [29]. Although in some of the real data sets the number of
replicates was very low (only two), it has been a premise of this work to
assume that a comparison approach that accounts for approximate
estimates of the individual heteroscedastic uncertainties is better than one
that does not consider them at all. The second is that the BLS regression
technique is not very robust in the presence of outliers with low individual
uncertainties. This limitation will be addressed in future works [30].





ACKNOWLEDGEMENTS

The authors thank the DGICyT (project no. BP96-1008) for financial
support, and the Rovira i Virgili University for providing a doctoral
fellowship to A. Martínez.

BIBLIOGRAPHY

1.- G.T. Wernimont, Use of statistics to develop and evaluate analytical methods,
AOAC, Arlington, V.A., 1987.
2.- Analytical Methods Committee, Analyst, 117 (1992) 97-117.
3.- D. L. Massart, B. M. G. Vandeginste, L.M.C. Buydens, S. de Jong, P.J.
Lewi and J. Smeyers-Verbeke, Handbook of Chemometrics and
Qualimetrics, Part A, Elsevier, Amsterdam, 1997.
4.- J. Kučera, P. Mader, D. Miholová, J. Száková, I. Stejskalová and V. Štěpánek, Fresenius J. Anal. Chem., 360 (1998) 439-442.
5.- P. D. Wentzell, D. T. Andrews, D. C. Hamilton, K. Faber and B. R.
Kowalski, J. Chemom., 11 (1997) 339-366.
6.- P. D. Wentzell and D. T. Andrews, Anal. Chim. Acta, 350 (1997) 341-352.
7.- Cetama, Statistique Appliquée à l'Exploitation des Mesures, 2nd ed., Masson, Paris, 1986.
8.- AOAC International Guidelines for Collaborative Study Procedures to
Validate Characteristics of a Method of Analysis, J. AOAC Int., 78 (1995)
143A- 160A.
9.- J.M. Lisý, A. Cholvadová, J. Kutej, Comput. Chem., 14 (1990) 189-192.
10.- J. Riu, F.X. Rius, J. Chemom., 9 (1995) 343-362.
11 .- P. Sprent, Models in Regression and related topics, Methuen & Co.
Ltd., London, 1969.
12.- W. A. Fuller, Measurement Error Models, John Wiley & Sons, New
York, 1987.
13.- C. L. Cheng and J. W. van Ness, J. R. Stat. Soc. B, 56 (1994) 167-183.
14.- D. W. Schafer and K. G. Purdy, Biometrika, 83 (1996) 813-824.
15.- K. C. Lai and T. K. Mak, J. R. Stat. Soc. B, 41 (1979) 263-268.
16.- D. V. Lindley, J. R. Stat. Soc. Suppl. Series B, 9 (1947) 218-244.
17.- G. A. F. Seber, Linear regression analysis, John Wiley & Sons, New
York, 1977.
18.- J. Riu and F.X. Rius, Anal. Chem., 68 (1996) 1851-1857.
19.- A. Martnez, J. Riu, O. Busto, J. Guasch and F. X. Rius, Anal. Chim.
Acta, 406 (2000) 257-278.
20.- A. Martnez, J. Riu and F. X. Rius, J. Chemom., submitted for
publication.
21.- A. Martnez, J. Riu and F. X. Rius, Chemom. Intell. Lab. Syst., accepted
for publication.
22.- P. Bentler and D. G. Bonett, Psychological Bulletin 88 (1980) 588-606.
23.- P. C. Meier and R. E. Zünd, Statistical Methods in Analytical
Chemistry, John Wiley & Sons, New York, 145-150, 1993.
24.- O. Güell and J. A. Holcombe, Anal. Chem., 60 (1990) 529A-542A.
25.- P. Fodor and L. Fischer, Fresenius J. Anal. Chem., 351 (1995) 454-455.
26.- D. C. Woollard, J. AOAC Int., 80 (1997) 860-865.
27.- C. Friedrich, K. Cammann and W. Kleiböhmer, Fresenius J. Anal.
Chem., 352 (1995) 730-734.
28.- P. L. Rogers and W. Staruszkiewicz, J. AOAC Int., 80 (1997) 591-602.
29.- R. J. Carroll and D. Ruppert., Amer. Stat., 50 (1996) 1-6.
30.- J. del Río, J. Riu and F. X. Rius, in preparation.


5.5 Conclusions

The main conclusion of this chapter is that a procedure has been developed that makes it possible to compare the results obtained by several analytical methods when they analyse a series of samples. This comparison is carried out considering all the concentration levels simultaneously, as well as the uncertainties generated by the errors made in the measurement of the analysed samples.

Another important point is that it has been shown, by means of simulated data sets, that this procedure for comparing multiple methods provides correct results as regards the detection and identification of biased methods. In other words, no significant differences are detected between the results of the different methods when such differences do not exist, whereas when some of the methods provide biased results in comparison with the results of the rest, they are correctly detected. It must be stressed that the procedure presented in this chapter has been developed to compare the results obtained by several methods of analysis. This means that, from the results obtained, one cannot conclude which of the analytical methods provide biased results, but rather which methods provide results that are comparable to those of the rest. In order to know which methods provide biased results, the results of each of these methods would have to be compared separately with those of a reference method, using the joint confidence test developed for the BLS regression method.

On the other hand, section 5.4 shows that this procedure is also able to correctly detect the presence of analytical methods whose results can be considered discrepant (outliers) with respect to the rest. Likewise, it has been possible to detect significant differences between the results of the different methods being compared in the case of real analytical problems. It must be remembered that, as already indicated in section 5.4, although the procedure developed seems appropriate for comparing multiple methods considering the uncertainties due to the errors made in the measurement of the samples, there are also a number of drawbacks to be taken into account regarding its application. Finally, it should be said that although in this chapter the comparison procedure has been applied to compare several methods of analysis, it is also applicable to any experimental problem in which results of the analysis of samples of various concentrations are obtained together with their corresponding uncertainties (e.g. comparison of laboratories, analysts or instruments).

5.6 References

1.- Massart D.L., Vandeginste B.M.G., Buydens L.M.C., de Jong S., Lewi
P.J., Smeyers-Verbeke J., Handbook of Chemometrics and Qualimetrics: Part A,
Elsevier, Amsterdam, 1997.
2.- ISO 5725, Precision of test methods - Determination of repeatability and reproducibility for a standard test method by inter-laboratory tests, 1994.
3.- AOAC/IUPAC, Journal of the AOAC International, 78 (1995) 143A-160A.
4.- ISO 5725, Accuracy (trueness and precision) of measurement methods and
results, 1994.
5.- IUPAC, Pure and Applied Chemistry, 65 (1993) 2123-2144.
6.- Analytical Methods Committee, Analyst, 117 (1992) 97-117.
7.- Martens H., Næs T., Multivariate Calibration, Wiley & Sons: Chichester,
1989.
8.- Wold S., Esbensen K., Geladi P., Chemometrics and Intelligent Laboratory
Systems, 2 (1987) 37-52.
9.- Wentzell P.D., Andrews D.T., Hamilton D.C., Faber K., Kowalski B.R.,
Journal of Chemometrics, 11 (1997) 339-366.
10.- Meier P.C., Zünd R.E., Statistical Methods in Analytical Chemistry, John
Wiley & Sons: New York, 1993, 145-150.
11.- Güell O., Holcombe J.A., Analytical Chemistry, 60 (1990) 529A-542A.





























CHAPTER 6
Prediction ability using multivariate linear
regression considering errors in all axes
in PCR and MLPCR

6.1 Aim of the chapter

The previous chapter demonstrated the usefulness of maximum likelihood principal component analysis (MLPCA)1 together with the joint test on the intercept and the slope of the BLS regression line for comparing the results of multiple analytical methods jointly, considering the uncertainties due to the errors made in the measurements of the samples at different concentrations.

En aquest captol es continua treballant en el camp multivariant
considerant els errors comesos en la mesura de les mostres en tots els eixos.
Shi introdueix una tcnica de regressi multivariant, anomenada mnims
quadrats multivariants (multivariate least squares, MLS), que per estimar els
coeficients de regressi, considera les incerteses degudes als errors comesos
en la mesura de les variables predictores i resposta per les diferents
mostres. En aquest captol sabandona la comparaci dels resultats
obtinguts per dos o ms mtodes analtics, per passar a la calibraci
multivariant de dades espectroscpiques. Tot mtode de calibraci
multivariant consta de dues etapes: letapa de descomposici de les
mesures espectroscpiques i letapa en qu es fa la regressi de les
concentracions de referncia sobre els valors resultants de la descomposici
feta en la primera etapa. MLS sha aplicat en l'etapa de regressi dels
mtodes de calibraci multivariant de components principals (principal
compomemt regression, PCR) i de components principals de mxima
versemblana
2
(maximum likelihood principal compomemt regression, MLPCR)
substituint la clssica tcnica de regressi lineal mltiple MLR, que s
l'extensi de la tcnica de mnims quadrats OLS al camp multivariant.

Al contrari que amb PCR, la tècnica de regressió MLPCR té en
compte les incerteses dels errors comesos en les mesures espectroscòpiques
a l'hora de descompondre la matriu de mesures espectrals original R en les
corresponents matrius de scores i loadings. Així, la quantitat d'informació
extreta de cadascuna de les mesures espectroscòpiques per MLPCR és
òptima des d'un punt de vista de màxima versemblança [2]. No obstant
això, en l'etapa de regressió en MLPCR s'utilitza el mètode MLR, que no té
en compte les incerteses degudes als errors comesos en la mesura dels
valors de referència de les propietats d'interès. Tot i que aquest problema
ja s'ha solucionat amb la tècnica de regressió de variables latents de màxima
versemblança (maximum likelihood latent root regression, MLLRR), aquesta és
més complexa i menys intuïtiva que MLPCR, i la interpretació dels
resultats és més difícil [2]. Per aquest motiu, l'objectiu d'aquest capítol és
estudiar els errors de predicció comesos tant per MLPCR com per PCR en
substituir la tècnica de regressió MLR per la de MLS, i comparar-los amb
els obtinguts mitjançant MLLRR.

A lapartat 6.2 es detalla el funcionament de letapa de regressi en
els mtodes de calibraci multivariant de mxima versemblana MLPCR i
MLLRR. Lapartat 6.3 es dedica a destacar les diferncies entre els errors de
predicci vertaders i observats. Lapartat 6.4 cont el gruix del treball
tractat en aquest captol, com a part de larticle Application of multivariate
least squares regression method to PCR and maximum likelihood PCR techniques,
enviat per la seva publicaci a la revista Journal of Chemometrics. Per acabar,
a lapartat 6.5 es presenten les conclusions extretes daquest captol.

6.2 Tècniques de calibració multivariant de màxima versemblança

Aquestes tècniques de calibració multivariant tenen en compte les
incerteses associades a les mesures espectroscòpiques a l'hora de construir
el model de calibratge multivariant. Les tècniques MLPCR i MLLRR fan
servir l'anàlisi per components principals de màxima versemblança
(MLPCA), descrita a la secció 5.2 del capítol anterior, per descompondre la
matriu R de dimensions m×n que conté els perfils espectroscòpics.
Aquestes tècniques de calibració multivariant proporcionen una important
millora en l'habilitat de predicció respecte a tècniques més convencionals
com ara PCR. Aquestes millores són especialment importants quan les
mesures espectroscòpiques presenten heteroscedasticitat (desviacions
estàndard no constants), que pot ser deguda a variacions en la intensitat de
la font, transformacions no lineals en les mesures d'absorbància o
variacions en les característiques del soroll del detector [2].

6.2.1 Regressió per components principals de màxima versemblança (MLPCR)

En aquest apartat ens centrarem en l'etapa de regressió, ja que
l'etapa de descomposició de la matriu de mesures espectrals seguint un
criteri de màxima versemblança es va tractar en el capítol anterior en
detallar el funcionament de l'algoritme MLPCA. En MLPCR l'etapa de
regressió mitjançant el mètode MLR es fa de forma similar que en PCR i els
coeficients del model de regressió s'estimen segons la mateixa expressió:

$\mathbf{q} = (\mathbf{T}^{\mathrm{T}}\mathbf{T})^{-1}\,\mathbf{T}^{\mathrm{T}}\,\mathbf{y}$   (6.1)

on T és una matriu m×p de scores obtinguts mitjançant MLPCA per a p factors
o components principals i Tᵀ és la seva transposada. A diferència de PCR,
per calcular els scores de les mostres desconegudes en MLPCR s'utilitza una
projecció de màxima versemblança que té en compte les respectives
matrius de variàncies Σ_unk:

$\mathbf{t}_{unk} = \mathbf{r}_{unk}\,\boldsymbol{\Sigma}_{unk}^{-1}\,\mathbf{V}\,(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1}$   (6.2)

on V és una matriu n×p de loadings obtinguts per MLPCA i r_unk és un vector
1×n amb el perfil espectroscòpic de la mostra desconeguda. La variable
Σ_unk és una matriu diagonal n×n (es considera que els errors en les mesures
espectroscòpiques són independents) amb les variàncies corresponents a
les diferents rèpliques de les mesures espectroscòpiques fetes per a la mostra
desconeguda. Les concentracions de les mostres desconegudes s'estimen,
igual que en PCR, segons l'expressió:

$\hat{y}_{unk} = \mathbf{t}_{unk}\,\mathbf{q}$   (6.3)

Com que les matrius de variàncies Σ_unk poden ser diferents per a les diverses
mostres desconegudes, no es pot definir un vector de regressió únic per a totes
les mostres desconegudes com en PCR.
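
A tall d'il·lustració, el fragment següent és un esbós mínim (en Python/NumPy, amb noms de funció i de variables hipotètics) de la projecció de màxima versemblança de l'eq. 6.2 i de la predicció de l'eq. 6.3 per a una mostra desconeguda; s'hi assumeix que els loadings V, el vector de coeficients q (dades centrades, eq. 6.1) i les variàncies de les rèpliques ja s'han obtingut prèviament.

    import numpy as np

    def predict_mlpcr(r_unk, var_unk, V, q):
        """Predicció MLPCR d'una mostra desconeguda (esbós).

        r_unk   : vector (n,) amb l'espectre centrat de la mostra desconeguda.
        var_unk : vector (n,) amb les variàncies de les rèpliques a cada longitud d'ona.
        V       : matriu (n, p) de loadings obtinguts amb MLPCA.
        q       : vector (p,) de coeficients de regressió.
        """
        S_inv = np.diag(1.0 / var_unk)                         # Sigma_unk^-1 (errors independents)
        # Projecció de màxima versemblança (eq. 6.2)
        t_unk = r_unk @ S_inv @ V @ np.linalg.inv(V.T @ S_inv @ V)
        # Predicció de la concentració (eq. 6.3)
        return float(t_unk @ q)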

Pel que fa a l'aplicació del mètode MLS a l'etapa de regressió de
MLPCR, cal destacar que les concentracions de les mostres desconegudes
s'estimen amb una expressió diferent de la 6.3, ja que, encara que es treballi
amb dades centrades, el model de regressió MLS no ha de tenir
necessàriament ordenada a l'origen igual a zero, com passa quan s'aplica el
mètode MLR sobre dades centrades (eq. 6.3). Això és degut al fet que MLS
troba aquells coeficients de regressió que fan que el model s'ajusti millor als
punts amb unes incerteses associades més petites. L'expressió a partir de la
qual s'estimen les concentracions de les mostres desconegudes amb MLS
correspon a l'equació 1 de l'apartat 6.4.

6.2.2 Regressió per variables latents de màxima versemblança (MLLRR)

En aquesta tècnica de calibració multivariant la projecció de les
mesures espectroscòpiques sobre el subespai dels scores es fa, igual com en
MLPCR, mitjançant MLPCA. En aquest cas, però, MLPCA s'aplica
sobre una matriu augmentada [R|y] amb les mesures espectroscòpiques i
els valors de les concentracions. També és necessària una segona matriu
que contingui les variàncies dels errors comesos tant en les mesures
espectrals (matriu m×n Q) com en les concentracions, [Q|var(y)].

Una vegada s'ha portat a terme la descomposició espectral aplicant
MLPCA sobre aquestes dues matrius augmentades, s'obtenen les matrius
m×p de scores i (n+1)×p de loadings, T i V respectivament. En MLLRR la
predicció de la concentració de la mostra desconeguda es fa segons
l'expressió:

$[\mathbf{r}_{unk}\,|\,\hat{y}_{unk}] = [\mathbf{r}_{unk}\,|\,0]\;\boldsymbol{\Sigma}_{unk}^{-1}\,\mathbf{V}\,(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1}\,\mathbf{V}^{\mathrm{T}}$   (6.4)

En aquest cas, la variable Σ_unk és una matriu (n+1)×(n+1) de variàncies del
vector augmentat [r_unk | y_unk] corresponent a la mostra desconeguda.
Aquesta equació proporciona un vector de dimensions 1×(n+1), l'últim
element del qual correspon al valor predit ŷ_unk. Per entendre com funciona
aquest tipus de predicció, s'ha de pensar que el terme
$\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V}(\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_{unk}^{-1}\mathbf{V})^{-1}\mathbf{V}^{\mathrm{T}}$
de l'expressió 6.4 representa la projecció de màxima versemblança de
l'espectre de la mostra desconeguda sobre el subespai dels scores. Com que
els loadings s'han obtingut en l'etapa de calibració aplicant MLPCA sobre la
matriu [R|y], contenen informació sobre les concentracions de les mostres
de calibració. Així doncs, si s'assigna a l'últim element de la diagonal de la
matriu Σ_unk un valor numèricament equivalent a infinit, el darrer element
del vector augmentat [r_unk | 0] no es té en compte en projectar l'espectre de
la mostra desconeguda r_unk, cosa que permet predir-ne la concentració. Per
aquest motiu el valor en l'última posició del vector augmentat [r_unk | 0] no
serà important i, per tant, es fixa
igual a 0. En cas de tenir concentracions de diversos analits o propietats a
predir, només caldrà augmentar el nombre de zeros del vector augmentat a
l'equació 6.4 i el nombre de valors numèricament equivalents a infinit al
final de la diagonal de la matriu Σ_unk.
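
Com a il·lustració del truc del valor «numèricament infinit» a la diagonal de Σ_unk, l'esbós següent (Python/NumPy, noms de funció i paràmetres hipotètics) mostra com es podria implementar la predicció de l'eq. 6.4 per a un sol analit.

    import numpy as np

    def predict_mllrr(r_unk, var_unk, V, big=1e12):
        """Predicció MLLRR d'una mostra desconeguda (esbós de l'eq. 6.4).

        r_unk   : espectre centrat (n,) de la mostra desconeguda.
        var_unk : variàncies (n,) de les mesures espectrals de la mostra.
        V       : loadings ((n+1), p) obtinguts amb MLPCA sobre [R | y].
        big     : valor numèricament equivalent a infinit per a la variància de y.
        """
        r_aug = np.append(r_unk, 0.0)              # vector augmentat [r_unk | 0]
        var_aug = np.append(var_unk, big)          # variància "infinita" per a la concentració
        S_inv = np.diag(1.0 / var_aug)             # Sigma_unk^-1
        proj = S_inv @ V @ np.linalg.inv(V.T @ S_inv @ V) @ V.T
        pred = r_aug @ proj                        # vector 1 x (n+1)
        return pred[-1]                            # l'últim element és la concentració predita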

6.3 Errors de predicció

En aquest capítol, l'error de predicció (root mean squared error of
prediction, RMSEP) obtingut pels mètodes de calibració multivariant
utilitzant MLS o MLR s'ha diferenciat en dues classes. Per una banda,
l'error de predicció observat, en el qual es comparen els valors predits per
la tècnica multivariant respecte als mesurats pel mètode de referència.
D'altra banda, en l'error de predicció vertader es comparen els valors
predits respecte als valors de referència vertaders, és a dir, aquells valors
desconeguts a la pràctica que estan lliures d'error (vegeu l'equació 9 a
l'apartat 6.4).

L'habilitat de predicció d'un model de calibració multivariant se sol
mesurar mitjançant l'error de predicció observat. Atès que aquesta és l'única
forma empírica d'establir la capacitat d'un model multivariant per predir el
valor de la concentració (o qualsevol altra propietat d'interès) de mostres
desconegudes, s'han dedicat molts esforços a millorar la predicció dels
valors de les propietats d'interès i, en conseqüència, a minimitzar el
RMSEP observat. No obstant això, s'ha de tenir en compte que les
concentracions de les mostres de calibració i validació estan afectades per
errors en la mesura que, segons el mètode analític de referència emprat,
poden arribar a ser molt importants [3-5]. Per aquest motiu, el fet que l'error
de predicció sigui baix no és sinònim que el model de calibració
multivariant sigui el que millor predigui les concentracions vertaderes de
les mostres desconegudes (malgrat no ser observables experimentalment).

A lapartat 6.4 es demostra que el mtode MLS aplicat en letapa de
regressi de les tcniques de calibraci multivariants proporciona
prediccions de les concentracions de les mostres de calibraci ms
semblants als valors vertaders que no pas les obtingudes pel mtode MLR.
Aix s aix perqu el mtode de regressi MLS considera les desviacions
estndard degudes als errors comesos en la mesura de les mostres de
calibraci. Daquesta forma, el model de regressi estimat per MLS sajusta
millor a aquells valors experimentals que presenten unes desviacions
estndard menors (en qu en teoria lerror en la mesura experimental s
menor) i, per tant, sn ms semblants als valors vertaders. Malauradament,
per, la mesura daquesta habilitat de predicci vertadera no s possible en
conjunts de dades reals, ja que es desconeixen els valors de referncia
vertaders. s per aix que en aquest captol hem treballat amb dades
simulades, que s permeten calcular lerror de predicci vertader mitjanant
les diferents tcniques de calibraci multivariant utilitzant els mtodes de
regressi MLR i MLS i, per tant, la seva comparaci.



6.4 Application of multivariate least squares regression
method to PCR and maximum likelihood PCR techniques
(Journal of Chemometrics, enviat per a la seva publicació).


Àngel Martínez*, Jordi Riu and F. Xavier Rius

Department of Analytical and Organic Chemistry.
Institute of Advanced Studies. Universitat Rovira i Virgili.
Pl. Imperial Tarraco, 1. 43005-Tarragona. Spain.


ABSTRACT

Reference analytical methods that provide correct concentration
values are essential for building valid multivariate calibration models. In
some cases reference analytical methods provide concentration values with
high levels of uncertainty, which may lead to the construction of incorrect
multivariate calibration models. This paper presents a multivariate least
squares regression method (MLS) for regressing the reference concentration
values on the scores from the decomposition of the spectroscopic data on p
factors or principal components. It considers the uncertainties in the
reference concentration values and/or those in the spectroscopic
measurements. We have replaced the traditional ordinary least squares
regression method (OLS, also known as multiple linear regression, MLR) in
both principal component regression (PCR) and maximum likelihood
principal component regression (MLPCR) by the MLS regression method.
We have compared prediction errors of the true concentration values from
the validation step using the MLS regression method and those obtained
using OLS. The true prediction errors are greatly improved when the MLS
6.4 Application of multivariate least squares ...

273
technique is applied to the multivariate calibration methods for real and
simulated data sets.

INTRODUCTION

Over the last few years multivariate calibration methods have been
used as an alternative to well-established analytical techniques, because
they allow fast and reliable predictions of the concentration of the analyte
of interest in unknown samples with interferences, and this makes them
useful for routine analysis. For this reason, a wide variety of multivariate
calibration techniques have appeared. These include multiple linear
regression (MLR) [1], principal component regression (PCR) [2], partial least
squares (PLS) [3] and latent root regression (LRR) [4]. The suitability of each
technique depends on the specific chemical problem and the characteristics
of the experimental data. On the other hand, reference methods usually
require longer analysis times and are more expensive. Moreover, in cases in
which the analytical problem or the analytical methodology is complex, the
uncertainty associated with the estimated concentrations in the calibration
set is high [5-7]. Consequently, some reference concentration values may be
affected by high measurement errors.

Multivariate calibration models constructed with reference
concentration values that contain high random errors may not be correct.
This can lead to high prediction errors from the validation step and, more
importantly, from future working samples because the ordinary least
squares regression method (OLS, also known as multiple linear regression,
MLR) used in multivariate calibration techniques such as PCR or PLS to
regress the reference concentration values on the scores from the
decomposition of the spectroscopic data on the first p factors or principal
components, considers neither the uncertainty (i.e. level of precision) in the
reference concentrations nor the uncertainty in the spectroscopic data. For
Captol 6. Habilitat de predicci utilitzant regressi...

274
this reason, we present a multivariate least squares (MLS) regression
method that can account for the individual uncertainties in the
spectroscopic data and/or the reference concentration values in order to
estimate the regression coefficients of a (p+1)-dimensional model correctly.
This technique may be considered an extension of the bivariate least
squares regression method (BLS) [8] used in univariate calibration. In this
way, the final multivariate regression model assigns smaller weights (i.e. less
importance) to the most imprecise reference concentration and/or
spectroscopic values.

We applied the MLS regression method in the regression stage of
both the PCR and MLPCR [9] techniques. In PCR the dimensionality
reduction of the spectroscopic data on p PCs is carried out without
considering the uncertainties from replicate measurements. This is why the
MLS regression method in this case only considers the uncertainties of the
reference concentration values in the calibration set to estimate the
coefficients of the multivariate calibration model. On the other hand,
MLPCR considers the uncertainties of the spectroscopic data projected onto
the p-dimensional scores subspace when performing the dimensionality
reduction. This produces estimates of the spectroscopic measurements that
are more likely to be experimentally observed (i.e. maximum likelihood
estimates). If the OLS method is used in the regression step after the
maximum likelihood decomposition [10], neither the uncertainties in the
scores nor those in the reference concentration values will be taken into
account, which is not optimal from a maximum likelihood point of view. A
maximum likelihood multivariate calibration method based on the latent
root regression technique (MLLRR) was therefore developed [9] to account for
the uncertainties in both the spectroscopic and the concentration
measurements. However, MLLRR is more cumbersome and less intuitive
than the MLPCR technique, since it simultaneously performs the
dimensionality reduction and the regression steps on an augmented matrix
containing both spectroscopic and concentration values. LRR also has this
disadvantage, which is one of the reasons why this multivariate calibration
method has been virtually ignored in chemistry, compared to other
techniques like PCR. In this paper we prove that true prediction errors
(committed when predicting the true but unobservable concentration
values) using the MLS method in the regression stage of the calibration
process, are similar to those from MLLRR and lower than those from the
conventional OLS (MLR) technique in the regression stage.

To simplify the maximum likelihood spectroscopic decomposition,
we have assumed uncorrelated measurement errors (i.e. diagonal
covariance matrices) in the spectroscopic measurements throughout this
paper. Although this assumption is not correct for real experimental data,
MLPCR has proved its potential under these assumptions [9,10]. Moreover,
it has recently been proved that, although it presents a few practical problems,
MLPCR can also account for correlated measurement errors [11]. In any case,
MLS can easily be applied to the regression step independently of the
measurement error assumptions considered in the maximum likelihood
spectroscopic decomposition. However, the MLS regression technique does
have some limitations and these should be pointed out from the beginning.
Firstly, as is inherent to any technique dealing with uncertainties, only
estimates of the exact measurement error variances are available, through
replicate measurements. A second limitation is related to the projection of
the individual uncertainties of the spectroscopic data onto the scores
subspace. In this paper we have calculated the projections using the theory
of propagation of errors, which considers that the eigenvectors (i.e.
principal components) are error-free. Although we know that eigenvectors
are also affected by uncertainty, the projection of the spectroscopic
uncertainties onto the scores subspace provides an accurate measure of the
precision of the replicate measurements [12],
which the MLS method needs in
the regression step. We have used one real data set and another simulated
one to demonstrate the advantages of the MLS method over the
conventional OLS technique when used in the regression step of the
multivariate calibration methods considered.

BACKGROUND AND THEORY

Notation

We have used bold uppercase characters to denote matrices, bold
lowercase characters to denote vectors and italic lowercase characters to
denote scalars. Since the MLS regression method can be applied to both
PCR and MLPCR, we have made no distinction between the matrices used
by these two multivariate calibration methods. The singular value
decomposition of the m×n matrix R containing the spectroscopic values in a
p-dimensional subspace yields an m×p score matrix T and an n×p loading
matrix V. The scalar p is the pseudorank of R, or the number of observable
components in the mixtures of the calibration set. The individual
spectroscopic uncertainties for the ith sample are contained in the
corresponding diagonal n×n covariance matrix Σ_i (uncorrelated errors are
considered). The projection of this diagonal covariance matrix onto the
scores subspace yields the p×p diagonal matrix Z_i. In addition, we have
used the caret to distinguish between the measured and the predicted
concentrations, y_i and ŷ_i respectively.

Multivariate Least Squares

Multivariate least squares (MLS), which will be used in the
regression stage to build the multivariate model, is a regression technique
that can be applied to multivariate data considering their individual
uncertainties. Of all the accurate approaches for calculating the coefficients
of the regression model, we selected Lisý's method [8] because of its speed in
estimating the correct results for the regression coefficients and the
simplicity of programming its algorithm. In the particular case of
multivariate calibration, this regression method assumes a linear model of
the form:

$y_i = q_1 + \mathbf{t}_i\,\mathbf{q}_2 + e_i$   (1)

where y_i is the ith element (i = 1, ..., m) of the m×1 vector y of analyte
concentrations and t_i is the ith row of matrix T. The term q_1 represents the
first element of the (p+1)×1 regression vector q. Vector q_2 contains the
remaining p elements of q. These values are, respectively, the intercept and
the slopes of the regression hyperplane that best fits the m points defined
by the coordinates (t_{i1}, ..., t_{ip}, y_i) in a (p+1)-dimensional space, taking into
account the uncertainties in the spectroscopic and/or reference
concentration measurements. The m×1 vector e contains the residual errors
between the observed and estimated concentration values, with
$e_i \sim N(0, \sigma_{e_i}^2)$, as expressed in eq. 2:

$e_i = y_i - \hat{y}_i = y_i - q_1 - \mathbf{t}_i\,\mathbf{q}_2$   (2)

The estimate of $\sigma_{e_i}^2$ is $s_{e_i}^2$, which will be referred to as the weighting factor,
expressed as the residual variance of the ith point (t_{i1}, ..., t_{ip}, y_i). The
weighting factor can be expressed using a Taylor series, even when the
covariances between the scores and the concentration values in the
calibration set are not zero:

$s_{e_i}^2 = s_{y_i}^2 + \sum_{j=1}^{p} q_{2_j}^2\,\mathrm{diag}(\mathbf{Z}_i)_j - 2\sum_{j=1}^{p} q_{2_j}\,\mathrm{cov}(y_i, t_{ij}) + 2\sum_{j=1}^{p}\sum_{l=j+1}^{p} q_{2_j}\,q_{2_l}\,\mathrm{cov}(t_{ij}, t_{il})$   (3)
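
As an illustration, the following minimal sketch (Python/NumPy, hypothetical function and variable names) computes the weighting factors of eq. 3 for the common case in which the covariance terms are assumed to be zero, so that only the first two terms survive; Z_i is the projected spectral variance matrix of eq. 4.

    import numpy as np

    def weighting_factors(s2_y, Z_list, q2):
        """Weighting factors s2_e[i] of eq. 3 (sketch, covariance terms assumed zero).

        s2_y   : array (m,) with the variances of the reference concentrations.
        Z_list : list of m (p, p) projected spectral variance matrices (eq. 4).
        q2     : array (p,) with the current estimates of the slopes.
        """
        s2_e = np.empty(len(s2_y))
        for i, Z in enumerate(Z_list):
            # s2_e_i = s2_y_i + sum_j q2_j^2 * diag(Z_i)_j
            s2_e[i] = s2_y[i] + np.sum(q2 ** 2 * np.diag(Z))
        return s2_e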

In MLPCR the p×p matrix Z_i contains the uncertainties of the spectroscopic
measurements projected onto the scores subspace. By error propagation
this can be expressed as:

$\mathbf{Z}_i = (\mathbf{V}^{\mathrm{T}}\boldsymbol{\Sigma}_i^{-1}\mathbf{V})^{-1}$   (4)

Estimates of the uncertainties from eq. 4 are, however, approximate. This is
because this equation does not consider the uncertainty present in the
eigenvectors due to the spectral decomposition. However, this expression is
reported to give a good idea of the precision of the spectral replicate
measurements [12]. This is the information that the MLS regression method
takes into account to find the regression coefficients of the multivariate
calibration model. If, however, no uncertainties are associated with the
spectroscopic measurements (as is assumed by PCR), Z_i is reduced to
a matrix of zeros. Therefore, the expression of the weighting factor (eq. 3) is
simplified to $s_{e_i}^2 = s_{y_i}^2$.

In this way, the MLS method takes into account each of the
individual uncertainties in the weighting factors $s_{e_i}^2$ to find the regression
coefficients of the multivariate calibration model. In other words, the
regression hyperplane is fitted more closely to those points with lower
uncertainties (lower measurement errors and higher precision), in such a
way that it minimises the sum of weighted residuals, S, expressed as:

$S = \sum_{i=1}^{m}\frac{(y_i - \hat{y}_i)^2}{s_{e_i}^2} = \sum_{i=1}^{m}\frac{(y_i - q_1 - \mathbf{t}_i\mathbf{q}_2)^2}{s_{e_i}^2} = s^2\,(m - p - 1)$   (5)

The variable $s^2$ is the estimate of the residual mean square error, also known as
the experimental error, and provides a measure of the dispersion of the data
pairs around the regression hyperplane. By minimising the sum of the
weighted residuals S with respect to the regression coefficients in q, p+1
nonlinear equations are obtained. By including the partial derivatives of the
squared residuals, eq. 6 can be written in the equivalent matrix form
expressed in eq. 7:

$\mathbf{B}\,\mathbf{q} = \mathbf{g}$   (6)

$$\begin{bmatrix}
\sum_{i=1}^{m}\frac{1}{s_{e_i}^2} & \sum_{i=1}^{m}\frac{t_{1i}}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m}\frac{t_{pi}}{s_{e_i}^2} \\
\sum_{i=1}^{m}\frac{t_{1i}}{s_{e_i}^2} & \sum_{i=1}^{m}\frac{t_{1i}^2}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m}\frac{t_{1i}\,t_{pi}}{s_{e_i}^2} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{m}\frac{t_{pi}}{s_{e_i}^2} & \sum_{i=1}^{m}\frac{t_{pi}\,t_{1i}}{s_{e_i}^2} & \cdots & \sum_{i=1}^{m}\frac{t_{pi}^2}{s_{e_i}^2}
\end{bmatrix}\mathbf{q} =
\begin{bmatrix}
\sum_{i=1}^{m}\left[\dfrac{y_i}{s_{e_i}^2} + \dfrac{e_i^2}{2\,s_{e_i}^4}\dfrac{\partial s_{e_i}^2}{\partial q_1}\right] \\
\sum_{i=1}^{m}\left[\dfrac{y_i\,t_{1i}}{s_{e_i}^2} + \dfrac{e_i^2}{2\,s_{e_i}^4}\dfrac{\partial s_{e_i}^2}{\partial q_{2_1}}\right] \\
\vdots \\
\sum_{i=1}^{m}\left[\dfrac{y_i\,t_{pi}}{s_{e_i}^2} + \dfrac{e_i^2}{2\,s_{e_i}^4}\dfrac{\partial s_{e_i}^2}{\partial q_{2_p}}\right]
\end{bmatrix} \quad (7)$$

where the vector containing the estimates of the slopes of the regression
hyperplane is $\mathbf{q}_2 = (q_{2_1}, \ldots, q_{2_p})$. The regression coefficients (i.e. the elements of
vector q) can be determined by carrying out an iterative process on the
following matrix form:

$\mathbf{q} = \mathbf{B}^{-1}\,\mathbf{g}$   (8)

With this method, the variance-covariance matrix of the regression
coefficients is obtained by multiplying the final matrix $\mathbf{B}^{-1}$ by the estimate
of the experimental error $s^2$ (eq. 5). It should be pointed out that, if $s_{e_i}^2$
were constant for all the samples (i.e. if there were only homoscedastic errors in
the concentrations and the spectroscopic uncertainties were neglected), the
expressions obtained would be the same as if the OLS (MLR) regression
method were applied.
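
As an illustration of the iterative process of eqs. 3-8, the following sketch (Python/NumPy, hypothetical function and variable names) alternates between updating the weighting factors and solving the weighted normal equations until the coefficients converge. The covariance terms of eq. 3 are assumed to be zero, so the derivative terms in eq. 7 reduce to ∂s²_{e_i}/∂q_{2_j} = 2 q_{2_j} diag(Z_i)_j.

    import numpy as np

    def mls_fit(T, y, s2_y, Z_list, tol=1e-10, max_iter=100):
        """Sketch of the MLS regression step (eqs. 3-8), uncorrelated errors assumed.

        T      : (m, p) score matrix.
        y      : (m,) reference concentrations.
        s2_y   : (m,) variances of the reference concentrations.
        Z_list : list of m (p, p) projected spectral variance matrices (eq. 4);
                 pass zero matrices to reproduce the PCR case.
        Returns q = [q1, q2_1, ..., q2_p] and the experimental error s2 (eq. 5).
        """
        m, p = T.shape
        X = np.hstack([np.ones((m, 1)), T])              # design matrix [1 | T]
        q = np.linalg.lstsq(X, y, rcond=None)[0]         # OLS starting values
        Zdiag = np.array([np.diag(Z) for Z in Z_list])   # (m, p) projected variances

        for _ in range(max_iter):
            q2 = q[1:]
            s2_e = s2_y + Zdiag @ q2 ** 2                # weighting factors (eq. 3)
            e = y - X @ q                                # residuals (eq. 2)
            W = 1.0 / s2_e
            B = X.T @ (X * W[:, None])                   # matrix B of eq. 7
            g = X.T @ (y * W)                            # first part of vector g
            # derivative correction terms of eq. 7
            g[1:] += ((e ** 2 / s2_e ** 2)[:, None] * Zdiag * q2).sum(axis=0)
            q_new = np.linalg.solve(B, g)                # eq. 8
            if np.max(np.abs(q_new - q)) < tol:
                q = q_new
                break
            q = q_new

        s2_e = s2_y + Zdiag @ q[1:] ** 2
        s2 = np.sum((y - X @ q) ** 2 / s2_e) / (m - p - 1)   # eq. 5
        return q, s2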

The MLS regression model extracts more information from data
pairs that are supposed to have lower measurement errors (lower
uncertainties). It should be noted that the linear model assumed by the OLS
regression method in eq. 1 considers a zero intercept when both the
spectroscopic and the concentration values are mean centred. This is not
the case with centred data for the MLS regression method, where the
regression coefficients, and therefore the intercept, depend on the
uncertainties in the concentration values and/or in the spectroscopic data
(which are projected onto the scores subspace according to eq. 4). In this
way, the MLS regression hyperplane will fit the most precise data pairs
better, which does not necessarily ensure a zero intercept, especially when
highly heteroscedastic data are handled.

Prediction Errors

In this paper we have distinguished between two types of
prediction errors. The first of these refers to the true but unobservable
concentration values (true prediction error), which can be expressed as:


$\mathrm{RMSEP}_{\mathrm{true}} = \sqrt{\dfrac{\sum_{i=1}^{N_{test}} (\hat{y}_i - \tilde{y}_i)^2}{N_{test}}}$   (9)

where $\tilde{y}_i$ is the true concentration value of a given component in the ith
sample and $N_{test}$ is the number of samples used for validation. Although,
theoretically speaking, the true concentration values are unknown, we have
considered values obtained with a high degree of accuracy, such as those
from the dilution of stock solutions in data set 1, to be the true
concentrations. To show how the MLS regression method can improve the
true prediction error obtained with OLS, we have generated new
spectroscopic and concentration data using the Monte Carlo simulation
method [13,14]. This method requires that the error-free spectral matrix and the
corresponding concentration values be known, which was only possible in
data set 1. This simulation method generates the new spectral matrices by
adding random errors to the error-free spectral matrix based on the
individual uncertainty (i.e. standard deviation) of the spectroscopic
measurements at each wavelength. Analogously, the new concentrations
are generated by adding a random error to the error-free concentrations
that depends on the set standard deviation. In this way the new
spectroscopic and concentration values with higher measurement errors
will be those with higher standard deviations, and vice versa. Since the MLS
regression method gives a greater weight to the spectroscopic (scores) and
concentration values with lower standard deviations (the most similar to
the true ones), and a smaller weight to the rest, predictions of the true
concentration values are better than with the OLS regression method.

The second type of prediction error we considered measures the
ability of the multivariate model to predict the measured concentrations
(observed prediction error, RMSEP). It is analogous to the one in eq. 9, but
takes into account the measured concentration values $y_i$ instead of the true
ones.
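
For reference, a minimal sketch (Python/NumPy, hypothetical names) of the two figures of merit: the observed RMSEP uses the measured reference concentrations, whereas the true RMSEP (eq. 9) can only be evaluated when the error-free concentrations are known, as in the simulated data sets below.

    import numpy as np

    def rmsep(y_pred, y_ref):
        """Root mean squared error of prediction over a validation set."""
        y_pred, y_ref = np.asarray(y_pred), np.asarray(y_ref)
        return np.sqrt(np.mean((y_pred - y_ref) ** 2))

    # observed RMSEP: rmsep(y_pred, y_measured); true RMSEP: rmsep(y_pred, y_true)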

EXPERIMENTAL SECTION

Data Sets

Data Set 1. This data set was used elsewhere [9] to compare the prediction
errors from the MLPCR and MLLRR techniques with those from other
multivariate calibration methods, and was downloaded from the P.
Wentzell Research Group web site [15]. The data set was obtained through a
carefully designed experiment involving three-component mixtures of
metal ions (Co²⁺, Cr³⁺, Ni²⁺), a system suggested by Osten and Kowalski [16].
The three spectral profiles of the pure components are shown in Figure 1a.
The spectra of the metal ion mixtures in Figure 1b contain nonuniform
noise produced by a dichroic band-pass filter placed between the source
and the sample to decrease the source intensity at high and low
wavelengths for all measurements.

To compare the true prediction ability of MLPCR using MLS or OLS
regression methods in the regression stage and MLLRR, 200 calibration and
200 test sets of 128 samples each were generated with the Monte Carlo
simulation method. To generate new calibration and test sets, this
simulation method was applied to an error-free spectral matrix (Fig. 1c)
obtained by multiplying the 128×3 measured concentration matrix by
the 3×150 pure component spectra matrix (Fig. 1a). The random errors that
were added to these error-free spectra matrices to obtain the new noisy
spectra matrices had the standard deviation profile shown in Figure 1d.
The new concentration values for the calibration and test sets were also
generated by applying the Monte Carlo method to the measured
concentrations. As stated earlier, because the measured concentration
values were obtained by diluting the stock solutions, we may assume that
they are very similar to the true ones (i.e. very low measurement errors are
committed during dilution). Since no standard deviations were available
for the measured concentration values, uncertainty levels of 1%, 5%, 10%,
15% and 20% were considered for generating the new concentration values
for each simulated calibration and test set.

[Figure 1: four panels (a)-(d), absorbance vs. wavelength (nm), 350-650 nm; panel (a) shows the pure spectra labelled Cr, Co and Ni.]

Figure 1. Spectral profiles for data set 1: (a) Pure component spectra for the three
metal ions, (b) noisy spectra for metal ion mixtures used for calibration, (c) Error-
free spectra for metal ion mixtures used for calibration and (d) standard
deviation profile of the noisy spectra matrix.

Data Set 2. To compare the performance of the MLS regression
method with OLS (MLR) when applied to MLPCR and MLLRR, and also to
PCR, a simulated data set was generated in a similar way to that reported
elsewhere [9]. This data set reproduces the spectral profiles of a three-
component mixture. The pure spectra comprise three Gaussian curves, one
for each component, centred at 480, 500 and 520 nm respectively. The width
(standard deviation) of each curve is 20 nm. Spectroscopic measurements
were taken every 5 nm within the wavelength range from 400 to 600 nm
(Figure 2a). Reference concentration values were randomly generated from
a uniform distribution between 0 and 1 for each of the three components.

The calibration and validation sets are made up of 20 and 100
samples respectively. The Monte Carlo simulation method is used to
randomly generate 10,000 calibration and test sets by adding measurement
errors to the error-free spectral matrices and error-free concentration values
of each of the three components. Figure 2b shows the spectral profile of the
error-free calibration matrix. Measurement errors for the spectral
measurements had both constant and proportional terms. The standard
deviation of the constant term was 1% of the maximum value of the
spectroscopic measurements. The proportional term was set at 2% of the
error-free spectroscopic values. The standard deviation $\sigma_{ij}$ (1 ≤ i ≤ m,
1 ≤ j ≤ n) for each spectroscopic measurement $A_{ij}$ was obtained from the
expression $\sigma_{ij} = \sqrt{0.01^2 + (0.02\,A_{ij})^2}$. Since this standard deviation
structure is much less complex than the one in data set 1 (Figure 1d), the
maximum likelihood spectral decomposition algorithm is much faster. This
allows a dramatic increase in the number of iterations, from 200 to 10,000.
To show the differences between the prediction errors from the
multivariate calibration techniques using the OLS (MLR) and MLS regression
methods and the MLLRR technique, proportional errors with standard
deviations ($\sigma_y$) of 1, 5, 10, 15 and 20 percent, respectively, were added to the
error-free concentration values in both the calibration and test sets.
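
A minimal sketch of this noise-addition scheme (Python/NumPy, hypothetical function name; the 0.01 and 0.02 constants follow the expression for $\sigma_{ij}$ given above) could look as follows.

    import numpy as np

    rng = np.random.default_rng()

    def add_noise(R_true, y_true, sy_rel):
        """Generate one noisy calibration (or test) set by Monte Carlo simulation.

        R_true : (m, n) error-free spectral matrix.
        y_true : (m,) error-free concentrations of one component.
        sy_rel : relative standard deviation of the concentration errors (e.g. 0.05 for 5%).
        """
        # constant (1% of the maximum absorbance, here 0.01) and proportional (2%) noise terms
        sigma = np.sqrt(0.01 ** 2 + (0.02 * R_true) ** 2)
        R_noisy = R_true + rng.normal(0.0, sigma)
        y_noisy = y_true + rng.normal(0.0, sy_rel * y_true)
        return R_noisy, y_noisy, sigma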

[Figure 2: panels (a) and (b), absorbance vs. wavelength (nm), 400-600 nm.]
Figure 2. Spectral profiles for data set 2: (a) Simulated pure components spectra
and (b) error-free spectra of the simulated mixtures used for calibration.

Computational Aspects

Our calculations were performed with a Pentium III-based personal
computer with 64 MB of memory and a clock speed of 500 MHz. All the
algorithms were written in Matlab (Matlab for Microsoft Windows ver. 5.2,
The Mathworks, Inc., Natick, MA).

RESULTS AND DISCUSSION

Data Set 1. Figure 3 shows the true errors of prediction for the three
components (Cr³⁺, Ni²⁺ and Co²⁺). They are the mean values of 200
iterations using the Monte Carlo simulation method for each standard
deviation value associated with the reference concentration values. In all
cases the optimal number of factors was 3.

[Figure 3: true RMSEP (×10⁻⁴) vs. % standard deviation in concentration (1-20%) for Cr³⁺; curves for MLPCR(MLS), MLPCR(OLS) and MLLRR.]

Figure 3. Mean values of the 200 true RMSEPs for Cr³⁺ generated for each
standard deviation level of the measurement errors added to the error-free
concentrations.
[Figure 3 (cont.): true RMSEP (×10⁻³) vs. % standard deviation in concentration for Ni²⁺, and (×10⁻⁴) for Co²⁺; curves for MLPCR(MLS), MLPCR(OLS) and MLLRR.]

Figure 3 (cont.). Mean values of the 200 true RMSEPs for Ni²⁺ and Co²⁺ generated
for each standard deviation level of the measurement errors added to the error-free
concentrations.

As expected, the true prediction errors increase as the standard
deviations of the measurement errors added to the error-free concentration
values increase in all three cases. Predictions of the true concentration
values using MLPCR with MLS and MLLRR are much better than those
with MLPCR when using the OLS regression method. These results are
logical because in the regression step MLPCR with both MLS and MLLRR
takes into account the uncertainties of the scores (from the maximum
likelihood decomposition of the spectra matrix) and the reference
concentration values. In this way, the regression models of the three
analytes give a larger weight (i.e. have a better fit) to the concentration
values in the calibration set with lower uncertainties, which are the most
similar to the true concentration values. Although in this case the
differences are small, MLLRR produced higher true prediction errors than
MLPCR with MLS. The tiny differences in the prediction errors between
the two techniques arise because MLLRR implicitly assumes that the
regression model has a zero intercept (q_1 in eq. 1). As stated in the previous
section, this is not the same for the MLS regression method, since
uncertainties considered in both spectroscopic and concentration values
make the regression hyperplane fit those data pairs with lower individual
uncertainties better, which does not ensure a 0 intercept term in eq. 1.

Results from PCR have been omitted for this data set because the
highly heteroscedastic uncertainty structure of the measurement errors
makes PCR an unsuitable multivariate calibration method here.
This is confirmed by the poor prediction ability of PCR, which has been
thoroughly discussed elsewhere [9].


Data Set 2. Figure 4 shows the mean true errors of prediction
(considering 10,000 iterations) for the five combinations of multivariate
calibration technique and regression method (MLS or OLS/MLR) for the first and
second components in the three-component mixtures. True prediction
errors for the third component in the mixtures are omitted because by
symmetry they are statistically equivalent to those for the first component.
In all cases, the optimum number of PCs was 3. The lowest prediction
errors were always produced by MLPCR using the MLS regression
Captol 6. Habilitat de predicci utilitzant regressi...

288
method. This is because when these two techniques are combined, greater
importance is given to those scores and concentration values with lower
uncertainties (the most similar to the true ones) in the estimation of the
calibration model coefficients. In this way, the calibration model provides
better predictions of the true concentration values. The tiny differences in
the prediction errors between MLPCR with MLS and MLLRR arise for the
same reason as in data set 1. Results for PCR with the MLS regression
method were similar to MLPCR with MLS or MLLRR because
measurement errors in the spectroscopic values in this case are low. This
makes the scores from the singular value decomposition of the maximum
likelihood spectra estimates similar to those from the decomposition of the
measured spectra values.

[Figure 4: true RMSEP vs. % standard deviation in concentration (1-20%) for component 1; curves for MLPCR(MLS), MLPCR(OLS), MLLRR, PCR(MLS) and PCR(OLS).]

Figure 4. Mean values of the 10,000 true RMSEPs from the three simulated
component mixtures generated for each standard deviation level of the
measurement errors added to the error-free concentrations.
[Figure 4 (cont.): true RMSEP vs. % standard deviation in concentration (1-20%) for component 2; same five curves.]

Figure 4 (cont.). Mean values of the 10,000 true RMSEPs from the three
simulated component mixtures generated for each standard deviation level of the
measurement errors added to the error-free concentrations.

On the other hand, the highest prediction errors, especially in the
$\sigma_y$ range between 5% and 20%, were provided by PCR and MLPCR using the
OLS regression method. This is because OLS does not account for the
uncertainties in the reference concentration values. In this way,
concentration values with high measurement errors are considered in the
calibration step, which degrades the prediction ability of the final
calibration model. The differences between the prediction errors of the two
multivariate calibration techniques are similar to those obtained when using the
MLS regression method, for the same reasons as with data set 1.

Figure 5 shows the observed RMSEPs mean values from the 10,000
iterations. These prediction errors are higher than the true ones shown in
Figure 4.

[Figure 5: observed RMSEP vs. % standard deviation in concentration (1-20%) for components 1 and 2; curves for MLPCR(MLS), MLPCR(OLS), MLLRR, PCR(MLS) and PCR(OLS).]

Figure 5. Mean values of the 10,000 observed RMSEPs from the three simulated
component mixtures generated for each standard deviation level of the
measurement errors added to the error-free concentrations.

It is therefore clear that multivariate calibration methods provide
better predictions of the true concentration values than of the
experimentally observed ones. In other words, predictions of the true
values from the multivariate calibration methods are more accurate than
those from the reference method [17], because random errors in the
concentration and/or spectroscopic measurements are averaged out by the
calibration models. This is especially clear for the multivariate calibration
methods using the MLS regression technique. This regression method
extracts more information from the concentration values with lower
uncertainties, which are the most similar ones to the true concentration
values. Moreover, although in this example the observed RMSEPs from
MLS are lower than those from OLS, this is not always the case. The MLS
regression model estimated by considering the uncertainties does not
necessarily improve the prediction errors for the measured reference
concentrations since these reference concentrations contain measurement
errors. The concentrations estimated with MLS are more similar to the true
values than to the measured reference concentration values. For this reason,
there may be cases in which the observed RMSEP from MLS is higher than
the one from OLS (MLR).

CONCLUSIONS

This paper presents a new multivariate least squares regression
method (MLS) that estimates the regression model coefficients by taking
into account the uncertainties of the individual values in all the axes. We
have applied this regression method to two types of multivariate
calibration techniques (PCR and MLPCR) to show the prediction ability of
the true and measured concentration values when the uncertainties in both
the scores and the concentration values are considered (MLPCR conditions)
and when only the uncertainties in the concentration values is considered
(PCR conditions). We have compared both the true and observed
prediction errors made with these multivariate calibration techniques using
the MLS regression method to the corresponding prediction errors made
with OLS (MLR) and MLLRR.

Captol 6. Habilitat de predicci utilitzant regressi...

292
Results using the MLS regression method in PCR and MLPCR show
that the prediction error of the true concentration values is considerably
lower than the true prediction error when using OLS. Although the true
RMSEP with MLS is similar to the one with MLLRR, the use of the MLS
regression method with MLPCR is more suitable than MLLRR, since
MLPCR is a simpler and more intuitive multivariate calibration method [9].


Moreover, we have also shown that the observed prediction errors
of the measured concentrations are not necessarily lower with MLS than
with OLS. Although this variable is the only measure of prediction ability
for real data, low observed RMSEPs should not be the ultimate goal of the
researcher. Rather, more attention should be paid to constructing
multivariate calibration models that provide the best possible estimates of
the true concentrations, since lower observed RMSEPs do not necessarily
mean a better prediction ability of the true concentration values from the
multivariate calibration model [18].


Finally, two important points concerning the MLS regression
method should be noted. Firstly, since uncertainties from the replicate
analysis of the different concentration samples in the calibration set must
be known, a greater experimental effort than with OLS is required.
Secondly, the MLS regression technique is not very robust in the presence
of outliers with low individual uncertainties. It is therefore important to
search for possible outlying samples with low uncertainties in the
concentrations.


ACKNOWLEDGEMENTS

The authors would like to thank the DGICyT (project no. BP96-1008)
for financial support, and the Rovira i Virgili University for providing a
doctoral fellowship to A. Martínez.

6.4 Application of multivariate least squares ...

293
BIBLIOGRAPHY

1.- N. Draper and H. Smith, Applied Regression Analysis, 2nd ed.: John Wiley
& Sons: New York, 5-128 (1981).
2.- K. R. Beebe and B. R. Kowalski, Anal. Chem., 59, 1007A-1017A (1987).
3.- S. Wold, Systems Under Indirect Observation, Part II, North Holland
Publishing Co., Amsterdam, 1-54 (1982).
4.- E. Vigneau and D. Bertrand and E. M. Qannari, Chemom. Intell. Lab. Syst.,
35, 231-238 (1996).
5.- J. D. Hall, B. McNeil, M. J. Rollins, I. Draper, B. G. Thompson and G.
Macaloney, Appl. Spectrosc., 50, 102-108 (1996).
6.- T. Fearn, Appl. Stat., 32, 73-79 (1983).
7.- A. H. Aastveit and P. Marum, Appl. Spectrosc., 49, 67-75 (1995).
8.- J. M. Lisý, A. Cholvadová and J. Kutej, Comput. Chem., 14, 189-192 (1990).
9.- P. D. Wentzell and D. T. Andrews, Anal. Chem., 69, 2299-2311 (1997).
10.- P. D. Wentzell, D. T. Andrews, D. C. Hamilton, K. Faber and B. R.
Kowalski, J. Chemom., 11, 339-366 (1997).
11.- P. D. Wentzell and M. T. Lohnes, Chemom. Intell. Lab. Syst., 45, 65-85
(1999).
12.- P. D. Wentzell and D. T. Andrews, Anal. Chim. Acta, 350, 341-352 (1997).
13.- P. C. Meier and R. E. Zund, Statistical Methods in Analytical Chemistry,
John Wiley & Sons, New York, 145-150 (1993).
14.- O. Güell and J. A. Holcombe, Anal. Chem., 60, 529A-542A (1990).
15.- http://www.dal.ca/~pdwentze/home.htm
16.- D.W. Osten and B.R. Kowalski, Analytical Chemistry, 57, 908-915
(1985).
17.- R. DiFoggio, Appl. Spectrosc., 49, 67-75 (1995).
18.- U. H. Olsson, S. V. Troye and R. D. Howell, Multivariate Behavioral
Research, 34(1), 31-58 (1999).

Captol 6. Habilitat de predicci utilitzant regressi...

294
6.5 Conclusions

En aquest capítol s'ha desenvolupat un mètode de regressió
multivariant (MLS) que considera les incerteses degudes als errors comesos
en la mesura de les diferents mostres. S'ha demostrat que aquest mètode de
regressió és fàcilment aplicable a l'etapa de regressió de dues importants
tècniques de calibració multivariant com són PCR i MLPCR. Mitjançant
MLS hem aconseguit millorar sensiblement els errors de predicció
vertaders tant en PCR com en MLPCR respecte als obtinguts utilitzant el
mètode de regressió MLR.

En el cas de MLPCR, els valors dels errors de predicció vertaders
utilitzant el mètode de regressió MLS són similars, i fins i tot lleugerament
inferiors, als obtinguts per MLLRR. A més, tant la interpretació dels
paràmetres del model multivariant obtinguts amb MLPCR utilitzant MLS,
com el seu ús en termes generals, són més fàcils que emprant el mètode
MLLRR [2].

Finalment cal remarcar que, tot i que el mètode de regressió
multivariant MLS proporciona millors errors de predicció vertaders,
aquesta millora només és observable en dades simulades, en les quals es
coneixen els valors vertaders de les propietats d'interès. En conjunts de
dades reals, els valors experimentals obtinguts pel mètode de referència
incorporen un error de mesura que en alguns casos (per exemple, en la
determinació de diverses propietats en gasolines) pot ser important. Per
aquest motiu, els valors predits de les propietats d'interès mitjançant la
tècnica de regressió MLS no tenen per què ser més semblants als valors
obtinguts pel mètode de referència que quan s'utilitza MLR. En
conseqüència, és possible que l'error de predicció observat mitjançant la
tècnica de regressió MLS no sigui millor per a un determinat conjunt de
validació que l'obtingut amb MLR, encara que els resultats obtinguts amb
MLS s'acostaran més als valors reals que els obtinguts amb MLR.

6.6 Referències

1.- Wentzell P.D., Andrews D.T., Hamilton D.C., Faber K., Kowalski B.R.,
Journal of Chemometrics, 11 (1997) 339-366.
2.- Wentzell P.D., Andrews D.T., Analytical Chemistry, 69 (1997) 2299-2311.
3.- Hall J.D., McNeil B., Rollins M.J., Draper I., Thompson B.G., Macaloney
G., Applied Spectroscopy, 50 (1996) 102-108.
4.- Fearn T., Applied Statistics, 32 (1983) 73-79.
5.- Aastveit A.H., Marum P., Applied Spectroscopy, 49 (1995) 67-75.



























CAPÍTOL 7
Conclusions

7.1 Conclusions generals

En aquest captol es presenten les conclusions generals que shan
extret daquesta tesi doctoral a partir dels objectius plantejats en lapartat
1.1 de la Introducci. Per aquest motiu, la discussi de les conclusions es fa
seguint la mateixa estructura.

Revisi crtica de les tcniques de regressi lineal emprades per estimar els
coeficients de regressi.

En lapartat 1.4.1 de la Introducci shan presentat tres de les
tcniques de regressi ms utilitzades quan les incerteses associades als
valors en leix dabscisses (x) sn negligibles respecte a les incerteses
associades als valors en leix dordenades (y). Aquestes tcniques es
coneixen amb el nom de mnims quadrats ordinaris (OLS), mnims
quadrats ponderats (WLS) i mnims quadrats generalitzats (GLS). Per
poder aplicar correctament aquestes tcniques de regressi s fonamental
tenir en compte el tipus dincerteses degudes als errors comesos en la
mesura de les variables resposta situades sobre leix d'ordenades.

La tcnica de regressi OLS s la ms emprada perqu presenta una
srie de propietats matemtiques que sn ben conegudes (apartat 1.4.1.1) i
a la seva senzillesa i rapidesa en lestimaci dels coeficients de regressi. En
el cas que les incerteses associades als valors de la variable resposta no
siguin iguals en tots els punts experimentals (existncia
dheteroscedasticitat), aquest mtode de regressi no proporciona
estimacions correctes ni dels coeficients de regressi ni de les seves
varincies. Sota aquestes condicions experimentals el mtode WLS
representa una millora respecte a OLS, ja que considera lheteroscedasticitat
de la variable resposta. En cas que tamb shagi de considerar la correlaci
entre les variàncies $s_{y_i}^2$ dels diversos punts experimentals degudes als
errors comesos en la mesura de la variable resposta (covarincia), les
millors estimacions dels coeficients de regressi sobtenen mitjanant el
mtode GLS. Aquests mtodes de regressi es poden utilitzar en la
comparaci de mtodes analtics quan les incerteses degudes als errors
comesos en la mesura de les mostres proporcionades per un dels dos
mtodes en comparaci sn negligibles respecte a les incerteses de laltre
mtode. En aquests casos els valors experimentals del mtode amb ms
precisi es collocaran en leix d'abscisses, mentre que els valors obtinguts
per laltre mtode han de ser a leix dordenades.

Daltra banda, quan les incerteses associades a leix d'abscisses no
sn negligibles en comparaci a les de leix dordenades, laplicaci dels
mtodes de regressi esmentats anteriorment no est justificada
estadsticament. En aquests casos s necessari emprar mtodes de regressi
que considerin les incerteses dels valors experimentals en els dos eixos. En
lapartat 1.4.1.2 shan presentat tres mtodes de regressi adequats a
aquestes condicions experimentals. Aquests mtodes es coneixen com
regressi per relaci de varincies constant (CVR), regressi ortogonal (OR)
i mnims quadrats bivariants (BLS). Si la relaci entre les incerteses dels
valors experimentals de les mostres analitzades pels dos mtodes s
constant, s convenient utilitzar el mtode de regressi CVR, ja que estima
els coeficients de regressi seguint un criteri de mxima versemblana. En
el mètode CVR cal fixar la variable λ (apartat 1.4.1.2), que correspon a la
relació de les variàncies dels valors experimentals obtinguts pels dos
mètodes. Un cas particular de CVR és el mètode de regressió OR, que es dóna
quan λ = 1. Finalment, el mètode de regressió BLS és indicat quan les
incerteses degudes als errors comesos en lanlisi de les mostres mitjanant
els dos mtodes en comparaci sn diferents per a cadascun dels valors
experimentals en els dos eixos. Dentre tots els mtodes de regressi que
consideren les incerteses individuals en els dos eixos, es va triar el mtode
de Lis i collaboradors, ja que no noms dna estimacions correctes dels
coeficients de regressi, sin que proporciona la matriu de varincia-
covarincia dels coeficients de regressi, de gran utilitat per aplicar de
diversos tests estadstics. Daltra banda, el mtode de regressi de Lis i
collaboradors, al contrari que el mtode de regressi multivariant MLR,
tamb permet estimar els coeficients del hiperpl de regressi en un espai
de ms dues dimensions considerant les incerteses individuals associades
als valors experimentals en tots els eixos (mtode de regressi multivariant
MLS). Sha de destacar la importncia de tenir bones estimacions de les
incerteses dels errors experimentals en els mtodes de regressi BLS i MLS.
Aix implica fer prou rpliques en lanlisi de les diferents mostres. Tot i
aix, les incerteses dels errors comesos en les mesures experimentals
estimades mitjanant rpliques poden incloure fonts de variaci no
relacionades amb els errors aleatoris comesos en lanlisi de les mostres. En
aquests casos, els mtodes de regressi BLS i MLS, com tots els altres
mtodes de regressi, donen estimacions esbiaixades dels coeficients de
regressi. Per aquest motiu ls daquests mtodes de regressi requereix
que les fonts de variabilitat que poden afectar les mesures experimentals
estiguin controlades. El diagrama de flux segent esquematitza les
condicions daplicaci de les diferents tcniques de regressi lineal tant
univariant com multivariant descrites en el primer captol daquesta tesi
doctoral:

Captol 7. Conclusions

302
Dades experimentals: més de dues variables predictores?
- Sí: errors en totes les variables? Sí → MLS; No → MLR.
- No: errors en la variable predictora?
   - No: errors constants en la variable resposta?
      - Sí → OLS.
      - No: covariància entre les y_i? Sí → GLS; No → WLS.
   - Sí: relació d'errors constant?
      - No → BLS.
      - Sí: λ = 1? Sí → OR; No → CVR.

Esquema 7.1. Condicions d'aplicació dels diferents mètodes de regressió lineal en
funció de les incerteses associades a les variables predictora i resposta degudes als
errors experimentals comesos en la mesura de les mostres.
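
A tall d'exemple, la lògica de l'esquema 7.1 es pot expressar com una funció senzilla (Python, esbós amb noms de paràmetres hipotètics) que retorna el mètode de regressió recomanat segons les característiques dels errors experimentals.

    def metode_regressio(mes_de_dues_predictores, errors_en_totes_les_variables,
                         errors_en_la_predictora, errors_resposta_constants,
                         covariancia_y, relacio_errors_constant, lambda_igual_a_1):
        """Esbós de l'esquema 7.1: selecciona el mètode de regressió lineal adequat."""
        if mes_de_dues_predictores:                    # cas multivariant
            return "MLS" if errors_en_totes_les_variables else "MLR"
        if not errors_en_la_predictora:                # errors només en la variable resposta
            if errors_resposta_constants:
                return "OLS"
            return "GLS" if covariancia_y else "WLS"
        if relacio_errors_constant:                    # errors en els dos eixos
            return "OR" if lambda_igual_a_1 else "CVR"
        return "BLS"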

Desenvolupament i validaci dun test estadstic per detectar la falta dajust dels
resultats experimentals a la recta de regressi.

Shan desenvolupat i validat dos tests estadstics per la detecci de
falta dajust, tal com es mostra al segon captol daquesta tesi doctoral. A
7.1 Conclusions generals

303
partir de conjunts de dades simulats ha estat possible demostrar que la
capacitat del test F sota condicions de regressi BLS per detectar
correctament falta d'ajust és superior a la mostrada pel test χ². Amb els
conjunts de dades simulats tamb es va concloure que per poder detectar
correctament lexistncia de falta dajust dels punts experimentals a la recta
de regressi BLS es necessita un nombre fora elevat de rpliques en
lanlisi de les mostres. Tot i que aquests resultats no sn gaire aplicables
sota condicions experimentals reals, ha estat possible descriure els
avantatges i els inconvenients del test estadstic desenvolupat. El
coneixement de les limitacions daquests tests per detectar la falta dajust
dels valors experimentals a la recta de regressi BLS, pot proporcionar una
informaci addicional important a lhora destablir el disseny experimental
que sha de seguir.

Desenvolupament i validaci dexpressions matemtiques per estimar les
probabilitats de cometre errors de primera i segona espcie, en laplicaci de tests
individuals sobre els coeficients de regressi.

En el tercer captol shan tractat diferents aspectes referents a
laplicaci de tests individuals sobre els coeficients de regressi estimats
mitjanant el mtode BLS. Sha de destacar la importncia de considerar les
probabilitats de cometre un error β en l'aplicació d'aquests tests
individuals, ja que les conseqüències d'assumir probabilitats d'error β
elevades poden arribar a ser molt greus, segons el problema analític. Per
aquest motiu s'ha demostrat, mitjançant conjunts de dades simulats, que les
expressions matemàtiques desenvolupades per estimar la probabilitat de
cometre un error de tipus β són correctes.

Tamb cal insistir en la importncia de lestimaci del nombre de
mostres necessries per construir la recta de regressi mitjanant el mtode
de regressió BLS, de manera que el risc de cometre errors α i β a l'hora de
detectar un cert biaix en un dels coeficients de regressi mitjanant un test
individual estigui controlat. A causa de la natura iterativa, aquest
procediment de clcul pot resultar una mica complicat en algunes ocasions.
Tot i aix, el seu s s molt recomanable en aquelles situacions en qu les
conseqncies de cometre errors de tipus i/o siguin especialment
problemtiques pels problemes analtics tractats.
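The logic of this iterative estimation can be sketched, under strong simplifying assumptions, with a standard ordinary least-squares power calculation for an individual test on the slope; the evenly spaced calibration design, the residual standard deviation s and the function name below are assumptions of this sketch and do not reproduce the BLS-based expressions of Chapter 3.

import numpy as np
from scipy import stats

def n_samples_for_slope_bias(delta, s, x_lo, x_hi, alpha=0.05, beta=0.05, n_max=100):
    # Smallest number of evenly spaced standards between x_lo and x_hi for which a
    # two-sided t test on the slope detects a slope bias `delta` with risks alpha, beta.
    # `s` is the assumed standard deviation of the residuals (illustrative OLS setting).
    for n in range(4, n_max + 1):
        x = np.linspace(x_lo, x_hi, n)
        sxx = ((x - x.mean()) ** 2).sum()
        se_slope = s / np.sqrt(sxx)            # standard error of the slope
        nc = delta / se_slope                  # non-centrality parameter
        df = n - 2
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        power = (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)
        if power >= 1 - beta:
            return n
    return None                                # target power not reached within n_max

The loop simply increases the number of calibration standards until the power of the test reaches 1 − β for the chosen significance level α.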

Study of the detection of a significant bias in the results of analytical methods capable of analysing several analytes simultaneously by means of linear regression.

In this study, simulated data sets made it possible to show that significant differences between the results of the two methods under comparison are detected correctly when all the results of the analysis of the different analytes are considered at the same time. In other words, when the joint test is applied to the BLS regression coefficients estimated from the individual data sets (which contain the experimental results of each analyte separately), detecting significant differences between the results of the two methods becomes very difficult. In these cases there is a high probability of committing a β error and, consequently, of considering a biased analytical method to be correct. To minimise this risk, it must be understood that estimating the BLS regression coefficients from a small number of experimental values produces overestimates of the experimental error s², which, for a given significance level α, generate oversized confidence intervals.
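For reference, the joint test on the two regression coefficients has the usual quadratic-form structure shown in the Python sketch below; here the coefficient vector b and its covariance matrix V are taken as inputs (under BLS they would come from the BLS estimation itself), and the function name is an assumption of this sketch.

import numpy as np
from scipy import stats

def joint_coefficient_test(b, V, n, b0=(0.0, 1.0), alpha=0.05):
    # Joint F test of H0: (intercept, slope) = b0, given the estimated coefficient
    # vector `b`, its 2x2 covariance matrix `V` and the number of samples `n`.
    d = np.asarray(b, dtype=float) - np.asarray(b0, dtype=float)
    F = float(d @ np.linalg.solve(np.asarray(V, dtype=float), d)) / 2.0
    F_crit = stats.f.ppf(1 - alpha, 2, n - 2)
    return F, F_crit, F > F_crit               # True means a significant joint bias

With few experimental values, an overestimated V inflates the confidence region and the test fails to reject H0, which is precisely the β-error mechanism described above.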

On the other hand, given the importance of the consequences that may derive from a type β error when applying the joint test to the regression coefficients, mathematical expressions have been developed to make it possible to estimate the probability of committing this type of error when the regression method takes into account the uncertainties due to the experimental errors in both axes (BLS). It has also been verified with simulated data sets that a good estimate of the experimental error s² allows the probability of committing a β error to be estimated correctly with the mathematical expressions developed.

Development and validation of a technique for comparing the results of multiple analytical methods that takes into account the uncertainties of the analytical results.

The fifth chapter presents a procedure that makes it possible to compare simultaneously, at several concentration levels, the results of more than two analytical methods, taking into account the uncertainties generated in the analysis of samples of different concentrations. Using simulated data sets, it was possible to show that this procedure for comparing multiple methods gives correct results when it comes to detecting the analytical methods that provide biased results.

Moreover, this procedure is also able to detect correctly the presence of analytical methods whose results can be considered discrepant (outliers) with respect to the rest. The procedure for comparing multiple analytical methods has also been applied to real data sets. This made it possible to verify that both the identification of analytical methods that can be considered outliers and the detection of significant differences between the results of the different methods under comparison are consistent with the experimental data observed.

Study of the improvement of the prediction ability of multivariate calibration methods by means of a multivariate regression technique that takes into account the uncertainties in all the experimental values.

It has been possible to develop a multivariate regression technique (MLS) that estimates the coefficients of the regression hyperplane taking into account the uncertainties associated both with the concentrations and with the scores generated by the decomposition of the spectral data. The MLS regression method is easily applied to the regression step of two important multivariate calibration techniques, PCR and MLPCR. With MLS we have managed to improve appreciably the true prediction errors of these two multivariate calibration techniques with respect to those obtained with the classical MLR regression method.
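As context for the regression step being discussed, a plain PCR pipeline can be sketched in Python as follows; in the thesis it is the final regression on the scores that is replaced by MLS, which also weights the uncertainties of the scores and of the reference concentrations, whereas this sketch keeps the classical least-squares step (function names are assumptions of the sketch).

import numpy as np

def pcr_fit(X, y, n_components):
    # Principal component regression: centre the spectra X, keep the first
    # n_components scores and regress y on them by ordinary least squares.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # PCA of the centred spectra
    P = Vt[:n_components].T                              # loadings
    T = Xc @ P                                           # scores
    q = np.linalg.lstsq(T, yc, rcond=None)[0]            # regression on the scores
    return x_mean, y_mean, P, q

def pcr_predict(model, X_new):
    x_mean, y_mean, P, q = model
    return (X_new - x_mean) @ P @ q + y_mean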

On the other hand, applying the MLS regression method to MLPCR yields true prediction errors very similar to those obtained with MLLRR. This maximum likelihood multivariate calibration technique also takes into account the uncertainties due to the errors made in measuring the concentrations during the regression step. However, both the interpretation of the different parameters of the multivariate model obtained with MLLRR and its use in general terms are considerably more complex than with the MLPCR method.


Another point that must be borne in mind is that, although the MLS multivariate regression method provides better true prediction errors, this improvement can only be observed when the true values of the properties studied are known, that is, in simulated data sets. In real data sets the concentrations measured by the reference method contain experimental errors; therefore, the concentrations predicted with MLS need not be closer to the reference values than the concentrations predicted with MLR. Consequently, it is not unusual for the MLS regression technique to give a worse prediction error for a given validation set than the one obtained with MLR, even though the results obtained with the MLS regression technique are very likely closer to the true values.

Generation of computer algorithms to facilitate the practical application of the tests developed.

All the calculations for this doctoral thesis were carried out with subroutines programmed in MATLAB version 4.0 (MATLAB for Microsoft Windows, The MathWorks Inc., Natick, MA). This computing environment makes it easy and fast to work with data matrices of fairly large dimensions. Although these algorithms are not reproduced here for reasons of space, they are available to anyone interested. It should be noted that they were developed for personal use and their design is therefore not as polished as that of commercial programs. To obtain the code used in each chapter, it is only necessary to contact the author or co-authors of the corresponding articles.

7.2 Future research lines

The Chemometrics and Qualimetrics group of the Universitat Rovira i Virgili is currently working on a number of important topics related to the BLS regression method. Although these are research topics still under development, describing them is useful to understand the direction that future research should take. On the one hand, a robust regression technique is being developed that takes into account the uncertainties due to the random errors made when measuring samples of different concentrations. Another research topic consists of developing a procedure for detecting discrepant points around the BLS regression line. Work is also being done on determining decision, detection and quantification limits for the BLS regression method. Because of the intense activity of the Chemometrics and Qualimetrics group of the Universitat Rovira i Virgili in recent years on univariate linear calibration taking into account the uncertainties of the experimental errors, future research lines focus mainly on multivariate linear regression.

An interesting aspect to study within multivariate linear regression taking into account the uncertainties of the experimental errors in all the axes is the detection of discrepant points around the regression hyperplane. This work is especially important because the MLS regression method, like the BLS method, is not very robust in the presence of experimental points with very low uncertainties (high-precision points). Another future research topic related to the MLS multivariate regression method consists of calculating decision, detection and quantification limits in a way analogous to the studies carried out for BLS. A further possible line of work in this field would be the improvement of the calculation algorithm used to estimate the coefficients of the MLS regression hyperplane. Although the convergence time of this algorithm is short in most of the cases studied, there are a few very specific cases in which the convergence time is too long and convergence can be problematic.

Finally, another research topic should be mentioned, aimed at developing a regression technique for estimating the regression coefficients when the experimental measurements fit a curved line and which, analogously to the BLS and MLS linear regression methods, takes into account the uncertainties due to the errors made when measuring samples of different concentrations. This type of regression is widely used in some areas of analytical chemistry such as, for example, the radiocarbon dating of archaeological materials by liquid scintillation measurements, in which the relationship between concentration and response is usually fitted to a third-degree polynomial.
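As a minimal illustration of the kind of curved calibration referred to here, the following Python lines fit a third-degree polynomial by ordinary least squares, i.e. without the errors-in-both-axes treatment that the proposed technique would add; the numerical values are invented purely for illustration.

import numpy as np

# Hypothetical example: response (e.g. a quench-corrected scintillation signal) versus
# known activity; the numbers below are invented purely for illustration.
x = np.array([0.62, 0.71, 0.78, 0.84, 0.90, 0.95])
y = np.array([12.1, 15.8, 19.2, 22.9, 27.4, 31.6])

coeffs = np.polyfit(x, y, 3)        # ordinary third-degree polynomial fit (no uncertainties)
y_hat = np.polyval(coeffs, x)       # fitted responses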
