
Content

Method evaluation (validation) and method comparison


Introduction
• The analytical quality triangle
• Purpose of method evaluation

Performance standards

Performance characteristics of a method


• Precision
• Limit of detection
• Working range …

Experiments

Making decisions

Method evaluation strategies

References

Statistics & graphics for the laboratory 1


Introduction

Objectives of the course


• Be able to efficiently manage analytical quality by applying a concept that
integrates specification, creation, and control of analytical quality.

• Understand that management of analytical quality needs communication with
the "outside" partners (note: this should be a two-way communication).

• Acquire the means to anticipate future quality needs at an early
stage.

Analytical quality in the medical laboratory


An integrated approach

[Figure: An integrated approach to analytical quality, with three linked elements.
Specification of quality: patient, physician, profession, regulation/legislation, laboratory.
Creation of quality: manufacturer, laboratory.
Control of quality: internal (laboratory), external (EQA, external quality assessment).]


Method evaluation – Place in the overall analytical quality

The analytical quality triangle


[Figure: The analytical quality triangle. For valid measurements, three corners are linked:
Quality specification (clinical, biology, expert, state of the art),
Quality creation (method evaluation/comparison, chemistry, instrument),
Quality management (planning, control, assurance, improvement).]

Purpose of method evaluation

J. Westgard
The inner, hidden, deeper, secret meaning of a method evaluation/validation
= ERROR ASSESSMENT
From: J.O. Westgard, Basic method validation, Westgard Quality Corporation
1999, pp 250. www.westgard.com

Carey et al.$
Evaluate performance & make decisions about performance.
Apply a clinical perspective to the whole task!

Requirements
• Experimental protocols to estimate performance reliably ("Error assessment")
• Standards (specifications, claims) for acceptable performance
• Criteria for comparing estimated performance with performance standards

$Carey RN, Garber CC, Koch DD. Concepts and practices in the evaluation of
laboratory methods. Workshop, AACC 48th Annual Meeting, Chicago (IL), July
28, 1996.


WHAT is validation?
Validation is the confirmation, through the provision of objective evidence, that
requirements for a specific intended use or application have been fulfilled (ISO
9000).
We see, from this definition, that we have to
• specify the intended use of a method,
• define performance requirements,
• provide data from validation experiments (objective evidence), and
• interpret the validation data (confirmation that requirements have been
fulfilled).

WHICH type of performance requirements (specifications) exist?


Performance requirements can be statistical, analytical, or application-
driven/regulatory.
Statistical and analytical specifications are most useful for method evaluation.
Application-driven/regulatory specifications are used for validation. Some
examples are given in the table below.
Performance requirements (specifications)

Statistical         Analytical                     Application-driven#

t-test: P ≥ 0.05    Bias ≤ calibration tolerance   Bias ≤ 3%
F-test: P ≥ 0.05    CV ≤ stable CV                 CV ≤ 3%

#Cholesterol (National Cholesterol Education Program)

WHICH performance characteristics exist?


We have seen that we have to specify performance requirements for a validation.
These requirements refer to the following performance characteristics of an
analytical method:
• Imprecision
• Limit of detection
• Working range
• Linearity
• Recovery
• Interference/Specificity
• Total error (method comparison)
• [Robustness/Ruggedness]: will not be addressed in this book.


WHICH experiments do we have to perform?


The experiments we have to perform depend on the performance characteristic
we want to validate. To estimate method imprecision, for example, we
need to perform repeated measurements on a stable sample. However, the various
application fields of analytical methods do not agree on the
design of such experiments. In this book, we will mainly refer to the experimental
protocols from the Clinical and Laboratory Standards Institute (CLSI). The table
below gives an overview of typical experiments to be performed during a
method validation study.

Performance characteristic      Samples / Measurements

Imprecision                     IQC samples; no target
                                n = 20 (repetition over several days)
LoD/LoQ                         Blank; low sample
                                n = 20 (repetition over several days)
Linearity                       5 related samples/calibrators (mix); no target
                                n = 4 (repetition within day)
Working range                   See: Imprecision/Linearity
Interference                    Samples: interferent spike & control (no target)
                                n = 4 (repetition within day)
Recovery                        Samples: known analyte spike & control, or
(Accuracy/Trueness)             certified reference materials (CRM)
                                n = 4 - 5 (repetition over several days)
Total error                     40 samples (target by reference method)
(method comparison)             n = 1 or 2 (measurement in one or several days)

IQC: Internal Quality Control; LoD: limit of detection; LoQ: limit of quantitation
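
As an illustration of the imprecision experiment listed in the table, the sketch below computes mean, SD and CV from 20 repeated measurements of one stable sample. The data values and the function name `imprecision` are hypothetical, and the calculation is a simple overall SD; it does not separate within-day from between-day components as a full protocol would.

```python
import statistics

def imprecision(results):
    """Return (mean, SD, CV in %) for repeated measurements of one stable sample."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)   # sample SD (n - 1 in the denominator)
    cv = 100.0 * sd / mean
    return mean, sd, cv

# Hypothetical IQC results (mmol/L), n = 20, collected over several days.
iqc = [5.02, 4.95, 5.10, 4.98, 5.05, 5.01, 4.93, 5.07, 4.99, 5.03,
       5.00, 4.96, 5.08, 5.04, 4.97, 5.06, 4.94, 5.02, 5.01, 4.99]

mean, sd, cv = imprecision(iqc)
print(f"n = {len(iqc)}, mean = {mean:.3f}, SD = {sd:.3f}, CV = {cv:.2f}%")
```

The resulting CV is the experimental estimate that would later be compared with a performance standard.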


HOW do we make decisions?


Once we have generated data, we have to decide whether they fulfill the
requirements that have been selected for the application of the method "for a
specific intended use". Currently, it is common practice to make decisions without
considering confidence intervals or statistical significance testing. Modern
interpretation of analytical data, however, requires the use of confidence
intervals or statistical significance testing. These two approaches are compared in
the table below for the case of a recovery experiment.

Decision making approaches

                        “Old”              “Modern”
Experimental recovery   90%                90%
Confidence interval     not considered     ±11% (with n = 4 and CV = 7%)
Limit                   85 – 115%          85 – 115%
Decision                passed             fail (90 – 11 = 79%, below the 85% limit)
Action                                     increase n or reduce CV

In the “old” approach, we compare one “naked” number with the specification.
This approach ignores the number of measurements that have
been performed and the imprecision of the method. If we repeated the
validation, we could easily obtain a recovery estimate of 80%, for example.
Therefore, decision-making should be statistics-based, either by applying a
formal statistical test or by interpreting the confidence interval of an experimental
estimate.
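
The numbers in the recovery example can be reproduced with a short calculation. This is a minimal sketch, assuming the CV of 7% is treated as an SD of 7 percentage points of recovery and that the interval is a two-sided 95% t-interval of the mean; the function names and the small table of t-values are introduced here for illustration only.

```python
import math

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
# (standard table values; only a few entries are listed).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 9: 2.262, 19: 2.093}

def ci_half_width(sd, n):
    """Half-width of the 95% confidence interval of a mean of n results."""
    return T_975[n - 1] * sd / math.sqrt(n)

def decide(estimate, half_width, lower_limit, upper_limit):
    """Pass only if the whole confidence interval lies inside the limits."""
    low, high = estimate - half_width, estimate + half_width
    return "passed" if lower_limit <= low and high <= upper_limit else "fail"

half = ci_half_width(sd=7.0, n=4)
print(round(half, 1))                     # 11.1
print(decide(90.0, 0.0, 85.0, 115.0))     # "old" approach, no interval: passed
print(decide(90.0, half, 85.0, 115.0))    # "modern" approach: fail
```

Increasing n or reducing the SD shrinks `half`, which is why the table lists these two actions.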

Statistics-based decision – Importance of the “test value” (= requirement, specification)

When we make statistics-based decisions, the selection of the test value
depends on the type of requirement we apply (statistical, analytical, validation).
Statistical
• Statistical test versus null hypothesis (F-test, t-test, 95% confidence
intervals, …): Bias = 0; Slope = 1; Intercept = 0; etc.
Analytical
• Statistical test versus estimate of stable performance (F-test, t-test, 95%
confidence intervals, etc.): Bias ≤ calibration tolerance; etc.
Validation case (application-driven; “specific intended use”)
• Statistical test versus validation limit (F-test, t-test, 95% confidence intervals,
etc.): CVexp ≤ CVmax; Biasexp ≤ Biasmax; etc.
Nevertheless, in all three situations, we apply the same type of statistical tests.
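
The point that the same statistical tool gives different decisions depending on the test value can be sketched with a confidence interval of a bias estimate. The interval, the calibration tolerance, and the function names below are hypothetical; the 3% validation limit follows the cholesterol example given for application-driven requirements.

```python
def differs_from_zero(ci_low, ci_high):
    """Statistical requirement: is the null value 0 outside the interval?"""
    return not (ci_low <= 0.0 <= ci_high)

def within_limit(ci_low, ci_high, max_abs):
    """Analytical/validation requirement: is the whole interval inside +/- max_abs?"""
    return -max_abs <= ci_low and ci_high <= max_abs

# Hypothetical 95% confidence interval of a bias estimate, in percent.
ci = (0.5, 2.5)
CAL_TOLERANCE = 2.0   # hypothetical calibration tolerance, %
BIAS_MAX = 3.0        # validation limit, % (cholesterol example)

print("bias differs from 0:", differs_from_zero(*ci))                      # True
print("within calibration tolerance:", within_limit(*ci, CAL_TOLERANCE))   # False
print("within validation limit:", within_limit(*ci, BIAS_MAX))             # True
```

Here the bias is statistically significant and exceeds the assumed calibration tolerance, yet the method still meets the validation limit.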



Evaluation strategies

Strategies
• By comparison "with a reference"
• By the method itself

Evaluation strategy: By comparison "with a reference"


• "Traditional" external quality assessment
• "Traditional" Certified Reference Materials
• IQC materials with target
• Method comparison with "true" reference method → Preferred
= "Complete picture"-type

Evaluation strategy: By the method itself


• Imprecision
• Limit of detection (LoD)
• Working range
• Linearity
• Recovery
• Interference
• Specificity
• Shift/drift/Carryover
• Ruggedness
= "Mosaic"-type

Evaluation strategy "complete picture"
Advantages
• 1 experiment
• Gives the complete picture
Beware
The interpretation heavily depends on the quality of the comparison method
& the samples used!

Disadvantages
• The cause of errors may remain unknown
→ Apply "mosaic"-type

Specific purposes of method comparisons

Sufficient quality of a test method?


– Comparison method is a reference method
Are 2 methods equivalent?
– The 2 methods are of the same hierarchy
Recalibration of the test method
– The comparison method is of higher or of the same hierarchy

Note on the calibration (adaptation) of a method via method comparison studies:

Be aware that minimum criteria have to be fulfilled!

[Figure: three scatter plots of routine method versus reference method results,
both axes 0 to 900 nmol/L. Panel titles: "Reliable outcome", "Not possible",
"Unreliable outcome". Regression annotations shown in the figure:
y = 1.17x - 7.4 (r = 0.9939) and y = 0.97x + 54 (r = 0.9258); one data set is
marked "Subgroup".]
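
The regression line and correlation coefficient annotated in such plots can be obtained as sketched below. This is an illustrative ordinary least-squares fit on hypothetical data; ordinary least squares assumes the reference values are free of error, and method comparison studies often use other regression techniques (for example Deming or Passing-Bablok), which are not shown here.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical paired results (nmol/L): reference method vs. routine method.
reference = [100, 200, 300, 400, 500, 600, 700, 800]
routine = [112, 225, 340, 468, 571, 702, 808, 931]

slope, intercept, r = linear_fit(reference, routine)
print(f"y = {slope:.2f}x {intercept:+.1f}, r = {r:.4f}")
```

A high r alone does not show that recalibration is justified; it depends strongly on the range and distribution of the samples.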

Evaluation strategy "mosaic"
Evaluate the performance characteristics of a method separately
• Imprecision
• LoD
• Interferences
• etc.
(these are the "mosaic stones")

Try to put together the complete picture from these "mosaic stones".

Recommended reference
Westgard JO. Basic method validation. Madison (WI): Westgard Quality
Corporation, 1999, 250pp.
But be aware: the book makes simplifications, and confidence limits are often missing!

Advantages
• Detailed evaluation of the method
– For a commercial test: manufacturers' task
– Task of the lab: performance verification
• Can be done with the method itself
Disadvantages
• Time-consuming experiments
• Are the results reliable?
– SD with IQC materials
– LoD from SD of blank
– Linearity/recovery = trueness
– Interferences: all tested/effect of combinations
– Matrix effects of investigated materials
• Can we establish the complete picture from the mosaic stones?
• May be unnecessary for the laboratory!


Evaluation strategy
Westgard terminology

                  Experiment
Type of error     "Selective"               "Complex"

Random            Repeat control samples    Duplicates, native samples
Constant          Interference studies      Method comparison
Proportional      Recovery                  Method comparison

                  "Mosaic"                  "Complete picture"

The practice

We take performance standards


• From "biology" (westgard.com/biodatabase1.htm)
• Manufacturers' specifications

We use experimental protocols


• CLSI protocols
• "Adapted" CLSI protocols (LoD, recovery)

We compare the experimental estimates with the performance standards


• Statistics/Graphics

Note
Method evaluation/validation is detailed in the book:
Method validation with confidence.



References

Book
J.O. Westgard, Basic method validation, Westgard Quality Corporation 1999, pp
250.

CLSI protocols
Evaluation of Precision Performance of Clinical Chemistry Devices; Approved
guideline. CLSI Document EP5-A. Wayne, PA: CLSI 1999.
Evaluation of the linearity of quantitative measurement procedures: A statistical
approach; Approved guideline. CLSI Document EP6-A. Wayne, PA: CLSI 2003.
Interference testing in clinical chemistry; Approved guideline. CLSI Document
EP7-A. Wayne, PA: CLSI 2002.
Method comparison and bias estimation using patient samples; Approved
guideline. CLSI Document EP9-A2. Wayne, PA: CLSI 2002.
Preliminary evaluation of quantitative clinical laboratory methods; Approved
guideline. CLSI Document EP10-A2. Wayne, PA: CLSI 2002.
Protocols for Determination of Limits of Quantitation. CLSI Document EP17.
Wayne, PA: CLSI in preparation.

Related CLSI protocols


Evaluation of Matrix Effects; Approved guideline. CLSI Document EP14-A.
Wayne, PA: CLSI 2001.
User Demonstration of Performance for Precision and Accuracy; Approved
guideline. CLSI Document EP15-A. Wayne, PA: CLSI 2001.
Estimation of Total Analytical Error for Clinical Laboratory Methods; Approved
guideline. CLSI Document EP21-A. Wayne, PA: CLSI 2003.

Other
Vassault A, et al. Société Française de Biologie Clinique. Protocole de validation
de techniques. Ann Biol Clin 1986;44:686-719 (English: 720-45).
Vassault A, et al. Société Française de Biologie Clinique. Analyses de biologie
médicale: spécifications et normes d’acceptabilité à l’usage de la validation de
techniques. Ann Biol Clin 1999;57:685-95.
Dewitte K, Stöckl D, Van de Velde M, Thienpont LM. Evaluation of intrinsic and
routine quality of serum total magnesium measurement. Clin Chim Acta
2000;292:55-68.
