7

Biological Variation
Callum G. Fraser and Sverre Sandberg

ABSTRACT
Background
There are many sources of variation in numerical results generated by examinations performed in laboratory medicine. Although some measurands have biological variations over the span of life and others have predictable cyclical variation, most measurands have random variation around homeostatic setting points, which differ between individuals. Knowledge of the generation and application of data on within-subject and between-subject variation is essential for the correct interpretation of results.

Content
In this chapter, we explain that numerical estimates of analytical, within-subject, and between-subject biological variation are generated by examination of a series of specimens taken from a cohort of individuals, which is then followed by statistical analysis of the sources of variation. Databases of estimates are available that facilitate applications. The individuality of a measurand and the use of conventional population-based reference intervals are determined by comparison of the within-subject and between-subject biological variations. Within-subject biological variation and analytical imprecision can be used to create reference change values (RCVs) to assess the statistical significance of changes in serial results from an individual or the probability that any change seen is significant. Analytical performance specifications for imprecision, bias, total error allowable, measurement uncertainty, and other characteristics can be created using within-subject and between-subject variation. The data have many other uses. Currently, there are concerns regarding the robustness of some data and estimates for some measurands that show considerable heterogeneity. We support recent recommendations on the evaluation, generation, and application of biological variation data. If followed, more evidence-based data on biological variation will be published.

NATURE OF BIOLOGICAL VARIATION

There are many sources of variation that contribute to the uncertainty of any result generated in laboratory medicine. Biological variation is one of the most important and should be taken into account in any interpretation made. In the previous edition of this textbook, a footnote to the section on Biological Variability stated: the author has based much of the discussion on a monograph (and) this source should be consulted for further details.1 This still holds true, although there has been considerable progress in this field, which will be particularly highlighted in this chapter.
There are various types of biological variation. The concentration or activity of some measurands changes over the span of life, some slowly and some more quickly, particularly at times of rapid physiologic development, such as the neonatal period, childhood, puberty, menopause, and old age. The concentration or activity of measurands can also differ between men and women. This variation is taken care of by the creation of age- and/or sex-stratified (partitioned) reference intervals when these are needed, although a disadvantage is that age-stratified reference intervals are based on chronological age rather than biological age. A number of measurands have predictable cyclical rhythms in their concentrations. These can be daily, monthly, or seasonal in nature. The major ramifications for interpretation are that reference intervals cannot be generated for every point during cycles, knowledge of the expected values throughout the cycle is vital for clinical interpretation, specimen collection must be at appropriate times, and absence of rhythm may indicate disease. These types of biological variations have been described in detail in Chapter 5, and the stratification (or partitioning) of reference intervals is explained in Chapter 8.
The most important type of biological variation is random biological variation. As an example, four specimens were taken from four individuals at daily intervals, and serum sodium activity was examined (reference interval: 135–147 mmol/L). The results are provided in Table 7.1. It is evident that the results for each individual vary from day to day; this is due to three sources of variation, namely, pre-analytical, analytical, and within-subject biological variations. The mean value is termed the homeostatic setting point. In addition, each individual has a different average serum sodium activity; the variation among the homeostatic setting points of individuals is the between-subject variation that can translate into reference intervals, whereas the average variation within each individual is the within-subject variation.
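As a rough illustration of the sodium example, the following Python sketch uses the values from Table 7.1 to compute each individual's mean (the homeostatic setting point), a pooled day-to-day variance within individuals, and the SD of the setting points between individuals. The variable names are illustrative only. Note the assumption: because each specimen was examined once, the pooled day-to-day variation still contains pre-analytical and analytical variation and is not a pure estimate of within-subject biological variation.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Serum sodium results (mmol/L) from Table 7.1, days 1 to 4.
sodium = {
    "Individual 1": [137, 139, 136, 138],
    "Individual 2": [144, 146, 145, 144],
    "Individual 3": [141, 143, 142, 140],
    "Individual 4": [139, 138, 141, 140],
}

# Homeostatic setting point: the mean of each individual's results.
set_points = {name: mean(values) for name, values in sodium.items()}

# Pooled day-to-day variance within individuals (equal numbers of
# results per individual, so a simple average of the variances).
pooled_within_var = mean(variance(values) for values in sodium.values())
pooled_within_sd = sqrt(pooled_within_var)

# Spread of the setting points between individuals.
between_sd = stdev(list(set_points.values()))

for name, point in set_points.items():
    print(f"{name}: setting point {point:.2f} mmol/L")
print(f"Pooled day-to-day SD within individuals: {pooled_within_sd:.2f} mmol/L")
print(f"SD of setting points between individuals: {between_sd:.2f} mmol/L")
```

With so few results, these numbers only illustrate the concepts; they are not usable estimates of the components of variation.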

158 SECTION I Basics of Laboratory Medicine

TABLE 7.1 Serum Sodium Activity in Four Specimens Collected at Daily Intervals From Each of a Cohort of Four Individuals

Sodium         Day 1   Day 2   Day 3   Day 4
Individual 1   137     139     136     138
Individual 2   144     146     145     144
Individual 3   141     143     142     140
Individual 4   139     138     141     140
Values are measured in millimoles per liter.

Generation and subsequent application of numerical data on the components of biological variation are crucial facets of laboratory medicine, and both of these are described in detail in this chapter.

TERMINOLOGY
There is little doubt that global harmonization in laboratory medicine needs to go beyond the examination phase and must include all steps within the total process, including terminology, symbols, and units. Unfortunately, the range of terms and variety of symbols used to define the components of biological variation has grown with the increasing body of literature, which undoubtedly causes confusion. A recent study that investigated papers on biological variation in the 13 most highly cited journals of laboratory medicine found that, from 2009 to 2013, 62 papers contained terms and symbols for components of biological variation. There were 68 terms and 25 symbols for the components applicable to individuals and 47 terms and 18 symbols for the component applicable to groups of individuals.2 It was proposed that the following terms and symbols should be used, because they were mostly applied and were also suggested by Fraser and Harris in their highly cited review of 1989 (reference 3):
• CVI: within-subject biological variation (variation within a single individual estimated as a pooled variation from a [homogeneous] group of individuals)
• CVG: between-subject biological variation (variation between the central tendencies of a group of individuals)
• CVA: analytical variation (analytical imprecision).
These terms and symbols are used throughout this chapter, and the other recommendations in the publication by Fraser and Harris3 are followed. This work has been supported by a prestigious, high-impact journal of laboratory medicine: the journal now recommends that both authors and referees comply with the use of these terms and symbols.4

GENERATION OF DATA ON COMPONENTS OF BIOLOGICAL VARIATION
Production of data on biological variation is very similar to derivation of population-based reference intervals (see Chapter 8) except that, instead of one specimen being taken from a large number of reference individuals, a number of specimens are taken from a smaller cohort of reference individuals. In general terms, the traditional approach recommends that numerical estimates of CVA and both CVI and CVG components of biological variation should be generated using the following experimental approach: select a group of reference individuals (usually apparently healthy volunteers); take a set of specimens from each of the individuals at regular time intervals while minimizing all sources of pre-examination variation in preparation of the subjects for collection, and the collection and handling and transport of specimens; store the samples derived from the specimens until ready for examination; and undertake the examination in duplicate while minimizing examination sources of variation. Then, after removal of statistical outliers and confirmation that all results from the individuals are homogeneous, dissect out the CVA, CVI, and CVG components using nested analysis of variance. This approach is described in detail in the review by Fraser and Harris,3 where it is stated that “the components of variation can be obtained from a relatively small number of specimens collected from a small group of subjects over a reasonably short period of time,” although good evidence for this subjective statement was not provided and has been lacking until recently.5

Design of Studies
The preceding general design has been widely used and is very suitable for those measurands that have low CVI and tight homeostatic control, but it is somewhat simplistic for a number of reasons. The measurand may be unstable, and examinations must be performed soon after the collection of specimens (eg, for some hematological measurands, such as mean cell volume, and numbers of erythrocytes and leukocytes per volume). In this case, to obtain the necessary statistically unconfounded estimate of CVI, the CVA can be estimated by analyzing all of the samples in duplicate, but this is a within-run CV. Thus, quality control materials have to be analyzed between each run during the examinations to ascertain that variance due to systematic deviations in the analytical procedure between each examination is not introduced. If the CVA estimated from examination of the duplicates is less than the CVA estimated from the between-run control material, the CVI might be overestimated. Using this strategy, it must be assured that the concentrations or activities of the quality control materials are similar to those of the specimens from the subjects studied because CVA often varies with concentration or activity. Moreover, it must be assured that the analytical variation of the examinations of specimens from the individuals and the quality control materials are not significantly different. This can be assessed by examining a number of the specimens from the individuals in duplicate, calculating the analytical SD using the formula SD = [(sum of squared differences between duplicates)/(2 × number of pairs)]^{1/2}, that is, $SD = (\sum d^2 / 2n)^{1/2}$, where d is the difference between duplicates and n is the number of pairs, and then comparing this SD with that obtained with the quality control materials, using the F-test for comparison of variances.
There is a school of thought that most biological data are naturally logarithmically distributed, and if this is the case, the calculations must be performed on the natural logarithms of the observations to make the data distribution closer to normal.6 Furthermore, it may be that the measurand is not present in matrices from apparently healthy subjects (such as unusual proteins found in myeloma) or that it would be unacceptable or unethical to collect specimens from the individuals in whom the measurand is most interesting, such as children. In such cases, specimens could be collected from
patients with stable disease, as discussed later and described in recent reviews and editorials.7-9
Estimates in laboratory medicine are usually accompanied by confidence intervals (CIs); this is rarely done in reports that provide estimates of the components of biological variation. The determination of CIs for different balanced designs for a two-level nested variance analysis model with varying analytical imprecision has been examined in detail.5 Data sets based on the model were created to calculate the power of different study designs for estimation of CVI. It was found that the reliability of an estimate for CVI and the power are greatly influenced by the study design and by the ratio between CVA and CVI. For a fixed number of measurements, it is preferable to have a high number of specimens from each individual. If the CVA is high compared with the CVI, the number of replicate examinations should be increased. The study provided tables that indicated the effects of increasing the number of individuals studied and the number of replicates at different levels of imprecision. This work is mandatory reading before studies generating data on the components of biological variation are performed. In addition, estimates of the components of biological variation should always be reported with CIs.

Methods for Data Analysis
The currently accepted best method for data analysis is that described by Fraser and Harris.3 Duplicate examinations are performed on specimens from a cohort of individuals. However, before any estimation of components of variation is made, it is important to examine the data set for outliers, that is, values that do not belong to the set. This assessment of outliers is important because this process might detect either contamination of specimens or an error in the analysis (eg, insufficient sampling); such aberrant values will lead to erroneously inflated values. This assessment is done at three levels. First, the duplicate variances from the individuals are examined; each variance is calculated by squaring the difference between the duplicate results and dividing by two (as per the preceding formula for calculation of the SD of the set of duplicate results). To decide whether an extreme value is different from the remainder of the distribution, the ratio of the maximum variance to the sum of the variances is calculated, and the Cochran test is applied. This assumes that each variance is based on an equal number of observations (two in this case), and that only one variance appears to be an outlier. Failure to examine for outliers of the duplicates can result in a falsely high CVA and a falsely low CVI, with wide CIs. Second, again using the Cochran test, outliers in the variances of the specimen mean values are examined to assess whether any duplicate values are different from the remainder; this is vital, because if the difference between any duplicate results is not significantly different from the rest, it does not mean that both duplicates are not different from those from the other individuals. Although the Cochran test of the variances of the mean values of the specimens is primarily an outlier test, it can also be used as a simple alternative to a homogeneity test, such as Bartlett’s test (see later). Failure to examine for outliers in the mean results from each individual can result in a falsely increased CVI. Finally, outliers among the mean values of the individuals are assessed; a simple strategy to perform this process is to use Reed’s criterion: the difference between any mean value and the next value in the series should be less than one-third of the overall range of all values. In the assessment of outliers, one of the most useful tools to use is a simple graphical approach in which the mean values and the range of all these values are plotted for each individual (on the y-axis) against concentration or activity (on the x-axis); an example is provided in Fig. 7.1 (reference 10) and discussed later in this chapter. Failure to exclude outliers of the mean values will result in a falsely large CVG, and because the mean value will be different, it will also affect the CVI. After exclusion of the outliers, a nested analysis of variance is used to derive the components of analytical and biological variations.

[Figure 7.1: for each subject (1 to 27, y-axis), the mean and extreme values of serum creatinine (μmol/L, x-axis, 60 to 150).]
FIGURE 7.1 Means and extreme values for serum creatinine in 27 older adults.33 Note: 100 μmol/L = 1.13 mg/dL.
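The nested analysis of variance used to derive the analytical, within-subject, and between-subject components can be sketched for a fully balanced design (the same number of specimens per individual and replicates per specimen). The chapter does not give the formulas, so this sketch assumes the standard mean-square expectations for a balanced two-level nested random-effects model; the function name and the example data are hypothetical, and negative variance estimates are simply truncated at zero.

```python
from math import sqrt


def nested_anova_components(data):
    """Estimate variance components from data[i][s][r].

    i indexes individuals, s specimens within an individual, and
    r replicate examinations of a specimen (balanced design assumed).
    """
    n_ind, n_spec, n_rep = len(data), len(data[0]), len(data[0][0])

    spec_means = [[sum(reps) / n_rep for reps in ind] for ind in data]
    ind_means = [sum(means) / n_spec for means in spec_means]
    grand_mean = sum(ind_means) / n_ind

    # Sums of squares at the three levels.
    ss_a = sum((x - spec_means[i][s]) ** 2
               for i, ind in enumerate(data)
               for s, reps in enumerate(ind)
               for x in reps)
    ss_i = n_rep * sum((m - ind_means[i]) ** 2
                       for i, means in enumerate(spec_means)
                       for m in means)
    ss_g = n_spec * n_rep * sum((m - grand_mean) ** 2 for m in ind_means)

    # Mean squares.
    ms_a = ss_a / (n_ind * n_spec * (n_rep - 1))
    ms_i = ss_i / (n_ind * (n_spec - 1))
    ms_g = ss_g / (n_ind - 1)

    # Variance components (truncated at zero if negative).
    var_a = ms_a
    var_i = max((ms_i - ms_a) / n_rep, 0.0)
    var_g = max((ms_g - ms_i) / (n_spec * n_rep), 0.0)

    def as_cv(var):
        return 100 * sqrt(var) / grand_mean

    return {
        "mean": grand_mean,
        "var_a": var_a, "var_i": var_i, "var_g": var_g,
        "cv_a": as_cv(var_a), "cv_i": as_cv(var_i), "cv_g": as_cv(var_g),
    }


# Hypothetical data: 2 individuals x 2 specimens x duplicate examinations.
example = [
    [[9, 11], [13, 15]],
    [[19, 21], [23, 25]],
]
result = nested_anova_components(example)
print(result)
```

A real study would use far more individuals and specimens, would apply the outlier and homogeneity checks first, and should report CIs for each estimate.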

Many have not used formal analysis of variance but instead have used simply subtracted variances. The thesis is that because pre-examination sources of variation have been minimized and can be considered negligible, the total CV (CVT) of a set of results from each cohort of individuals includes CVA, CVI, and CVG. Then, because

$$CV_T = \left[CV_A^2 + CV_I^2 + CV_G^2\right]^{1/2}$$

the components can be calculated. Outliers are often not looked for, and a normal distribution is assumed, but the detection of the former and the checking of the latter assumption (using Kolmogorov-Smirnov or Anderson-Darling or other techniques for assessment of normality) should now be considered mandatory. Usually what is done in practice is that, first, the results from each individual are taken and the mean, SD, and CV are calculated. This CV (CVB) comprises CVA and CVI:

$$CV_B = \left[CV_A^2 + CV_I^2\right]^{1/2}$$

Overall, CVA is then calculated from the duplicate or replicate assays, or from assessment using quality control materials (with the previously noted caveats). Then, by simple subtraction, an estimate of $CV_I = [CV_B^2 - CV_A^2]^{1/2}$ is generated for each individual. If replicates are used and the mean of the values calculated for subsequent examination, then the analytical imprecision is reduced by $n^{1/2}$, where n is the number of replicates, and the correct formula is $CV_I = [CV_B^2 - CV_A^2/n]^{1/2}$. This is not required if only the first result is used, but this also appears to be a waste of half of the results generated. Then an overall estimate of the CVI is calculated by taking the individual CVI, squaring all of these, adding the squares (the variances), dividing by the number of subjects, and taking the square root; thus, average $CV_I = (\sum CV_I^2 / n)^{1/2}$, where n is here the number of subjects. Following this, CVG can be generated using the formula:

$$CV_G = \left[CV_T^2 - \left(CV_A^2 + CV_I^2\right)\right]^{1/2}$$

Many publications dealing with the generation of estimates of the components of biological variation do not dissect out CVA and simply report the previously noted CVB, which is $[CV_A^2 + CV_I^2]^{1/2}$, as the “within-subject biological variation”; this is clearly incorrect. Pure unconfounded estimates of the components of analytical and biological variation are required so that they can be applied correctly, as described in the following.

Homogeneity and Heterogeneity
The calculation of biological variation data assumes that the individuals examined are in a “steady state,” that is, the measurand does not change during the time span of the study. Moreover, data on within-subject biological variation can be applied, particularly for estimation of RCVs, only if the estimates are homogeneous and do not show heteroscedasticity. If the data are not homogeneous, the results are not representative of the population, and ubiquitous application of the estimates is fraught with difficulties. Although estimates of CVI and RCVs can clearly be calculated, these cannot be generalized for the entire population. It is therefore important to know that the variances of samples drawn from one and the same population are “homogeneous” by definition, and consequently, the ranked cumulative distributions of these variances are distributed around the true variance of the population according to χ²/df (χ² distribution for degrees of freedom [df] according to the individual sample sizes). In contrast, when a series of different variances have a dispersion around the pooled variance that does not follow a χ²/df distribution, they are considered to be heterogeneous. This can be illustrated by plotting the cumulated ranked fractions of within-subject variation values as a function of the within-subject variation estimates on a rankit scale. If homogeneous, this curve will fit the theoretical curve given by the square root of the pooled variance times χ²/df. Variance homogeneity can be tested further by Bartlett’s test.6,10
Alternatively, an index of heterogeneity (IH) has been proposed, and the simple mathematical estimation of this and its interpretation are given in the review of Fraser and Harris.3 The IH provides a means of determining whether individuals within a population have similar within-subject variation for a given analyte. It is defined as the ratio of the overall CV of the $(SD_A^2 + SD_I^2)^{1/2}$ of the subjects to $[2/(n - 1)]^{1/2}$, where SDA and SDI are the examination and within-subject biological variations as standard deviations, and n is the number of specimens per subject. The higher the index of heterogeneity is, the greater the heterogeneity of within-subject biological variation. It is important to test for homogeneity (eg, with the Bartlett or Cochran test) and indicate how many individuals have had to be removed to obtain homogeneity of the estimate of CVI. This again would provide an indication of the representative nature of the data and underscore its suitability for wide application.
Many data have been generated over the last 45 years on the components of biological variation in a broad range of measurands. Before considering the uses of the existing data, it is necessary to consider their reliability. A small number of the measurands that have been assessed have had a number of studies done, which has allowed a few reviews to be performed on the robustness of estimates; examples are blood glycated hemoglobin (HbA1c),11,12 serum C-reactive protein (CRP),13 and three serum enzymes.14 Published studies on the biological variation of HbA1c were examined to check the consistency of the available data to accurately define analytical performance specifications; the authors found nine studies and considered that these were limited in a number of ways, including choice of analytical methodology, population selection, protocol application, and statistical analyses.12 A similar evaluation of the 11 available studies on CRP again found deficiencies in all aspects of the generation of the estimates, and only 1 study fulfilled all major preexamination, examination, and postexamination requirements.13 A search of the literature found 10 publications with data on the components of biological variation of alanine aminotransferase (ALT), 14 on aspartate aminotransferase (AST), and 9 on γ-glutamyl transferase (GGT).14 The protocols used for the derivation of the components were varied. The ranges of CVI reported were 11.1 to 58.1% for ALT, 3.0 to 32.3% for AST, and 3.9 to 14.5% for GGT. The range of values is shown in Fig. 7.2. Available CIs are shown: the dark diamond is the estimate in the current database. The median values (ALT: 18.0%, AST: 11.9%, and GGT: 13.8%) were, possibly as expected, similar to those listed in a database commonly used as a reference source. These three studies, and other similar studies, suggest that
there are some concerns regarding the usefulness of the currently available data.

[Figure 7.2: within-subject variation (CV, %; y-axis, 0 to 60) for ALT, AST, and GGT.]
FIGURE 7.2 Estimates of within-subject biological variation for three enzyme activities as coefficients of variation (CV, %). ALT, Alanine aminotransferase; AST, aspartate aminotransferase; GGT, γ-glutamyl transferase. (From Carobene A, Braga F, Roraas T, et al. A systematic review of data on biological variation for alanine aminotransferase, aspartate aminotransferase and γ-glutamyl transferase. Clin Chem Lab Med 2013;51:1997–2007.)

DATABASES

Current Databases and Their Merits and Disadvantages
As stated previously, derivation of the components of biological variation is not without difficulties and requires expenditure of considerable resources of various types, so the need for a database of published information found in the literature was perceived some years ago. In view of the concern over some of the data available on the components of biological variation, examination of these available databases is a necessary prerequisite to their use. Compilations of data were generated,15-17 but these simply provided lists of published data. It was believed that publication of a database giving one value for each measurand for which data were available, based on the objective assessment of the reliability of the data in the literature and updated regularly, would be of major benefit. Generation of such a comprehensive database was initiated in 1997 by the Analytical Quality Commission of the Spanish Society of Clinical Chemistry (SEQC), and it was first published in 1999.18 An update of this table is made available every 2 years on the Internet,19 and the information documented contains a brief introduction concerning the aims of provision of the database, the changes that have been made compared with the previous edition, and three important appendices, namely, a list of the measurands studied with the within-subject and between-subject components of biological variation expressed as CV (CVI and CVG, respectively), the derived examination quality specifications for imprecision, bias, and total allowable error at three different levels (desirable, minimum, and optimum), the number of publications examined for each measurand, a list of the publications examined by measurand, and documentation of the full citation for every publication. Full details of the structure and derivation of the database have been recently published,20 including the criteria for inclusion of published estimates into the database. The criteria for acceptance state that the ratio CVA/0.5 CVI, originally called the index of fiduciality,21 must be less than 2.0, which ensures that the estimates of CVI are not confounded by CVA.
The database, which has been much cited and widely used, has a number of advantages. CVI has been determined for 358 measurands in 247 articles, and this large resource has been developed and refined over nearly 20 years. Data are available on measurands in a number of matrices, namely, serum (n = 185), plasma (n = 74), whole blood (n = 55), and urine (n = 47). In addition, as stated previously, data are systematically updated every 2 years and made available on the Internet.19 Moreover, an introduction informs of changes made to existing estimates and information on included new measurands.
The disadvantages include lack of data for many measurands of interest in laboratory medicine. There is a paucity of new publications available each year for inclusion in the database, with only 25% published since 2000; few data are documented for some measurands because 202 were found in a single publication, 129 had data from 2 to 9 publications, and 27 had data from 10 or more publications. Further, as discussed previously, there appears to be heterogeneity for some measurands, including blood HbA1c,12 serum CRP,13 AST, ALT, and GGT,14 prostate-specific antigen,22 and urinary albumin.23 However, it must be realized that these data are estimates, and as such, it should not be expected that they will be identical across studies in numerical terms, but they will have a distribution. Therefore, in future, CIs for the estimates should be provided to allow for comparison of these. In the current database, in which most of the published papers have no CIs given, the robustness of the data is difficult to assess objectively. However, this was examined in the current database by calculation of the ratio of the maximum CVI to the minimum CVI (CVI max/CVI min), and it was considered that a ratio below seven indicates robustness. This criterion was achieved or surpassed by 86% of the measurands included in the database.20 It has been suggested, in an editorial about the database, that it is difficult to use the data for setting performance specifications or RCVs if the estimates can vary with a factor of seven, and that a ratio of no more than two is more than likely to indicate a significant difference between the estimates.6
It is generally assumed that the data on the components of biological variation are robust, and that the estimates are representative for the specific population and setting in which they will be applied. In view of the growing concerns about the apparent heterogeneity of the estimates, which is probably due in part to the less than ideal methods used in many of the studies, and therefore, their general applicability, an Expert Working Group on Biological Variation of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) has produced a checklist to enable standardized assessment of existing and future publications of biological variation data.24 The checklist identifies key elements to be reported in studies to enable safe, accurate, and effective transfer of biological variation data sets across healthcare systems. The checklist is mapped to the domains of a minimum data set required to enable this process. Following this work, a new EFLM Task and Finish Group has evolved the checklist into a practical tool that will enable existing studies to be classified according to how well the
work fulfills all the required attributes. In addition, work is already in progress to examine the data in the existing database; a new database with high-quality estimates generated in studies that fulfill the criteria laid down will be generated and made available on the EFLM website.25 Moreover, compliance with the checklist for new studies will enable authors, reviewers, and journal editors to ensure that studies are fit for purpose, appropriately powered, share common terminology, and deliver robust estimates of CVI and CVG, accompanied by the key metadata required to enable valid application of the data described.26 It is anticipated that this ongoing work will define a standard for the reporting of studies on biological variation akin to the well-known standard for reporting of studies on diagnostic accuracy (STARD),27 and will mean that only studies accompanied by a complete checklist will be considered acceptable by reviewers and editors. It is hoped that this standard will be included in the requirements documented in instructions for authors.

Biological Variation in Health and Disease
Some argue that many of the data on the components of biological variation are inappropriate for wide use in laboratory medicine because they have been derived, in general, from studies on healthy individuals, and not on patients with disease who are the source of most requests for examinations.28 A database on the components of biological variation in disease has been published29; the group from the Analytical Quality Commission of the SEQC have continued to collect such data and have prepared an update that has recently been made available on the Internet.30 The 2014 database has information on 97 measurands in subjects with 41 disease states. This work has led to further evidence that estimates of CVI are generally independent of the state of health, except when the measurand is one that is pathologically changed, such as tumor markers in patients with cancers.
It is generally assumed that CVI is independent of age, sex, time span of study (unless very frequent specimens are collected when serial correlation of data exist), geographical area, health, and disease, as well as the measurement procedure used. This is hardly surprising because CVI are numerical estimates of the homeostatic mechanisms of the measurands.
As a consequence, it can be considered that estimates of the components of biological variation should be relatively easily available and can be used in a large number of applications fundamental to laboratory medicine. However, because the quality of published papers varies, it is strongly recommended that all users investigate the sources and quality of the data selected before application in their individual laboratories (eg, by using the proposed checklist).24
CVI from results of a number of examinations in diseased individuals performed regularly in the intensive care unit provided reliable estimates.32 These interesting examples showed that, although it might be ideal to undertake the well-documented, previously noted experimental protocols and statistical analyses, further studies using novel approaches such as these might be required for examinations for which the generation and application of traditional estimates of CVI are difficult.

INTERPRETATION AND USE OF DATA

Individuality
The results of most examinations in laboratory medicine are compared with conventional population-based reference intervals or sometimes fixed clinical decision-making limits. This is mandatory when previous results are unavailable, as is often the case in clinical settings of diagnosis, case finding, and screening. However, reference intervals represent the values found in a fractile (usually 0.95) of the reference population rather than the values found in a single individual. The ramification of biological variation on the use of reference intervals is determined by the individuality of the measurand; this has been explained in detail.33 The example therein is reproduced in part here.
Fig. 7.1 shows a graph, which, as stated earlier, should be prepared by all who are generating data on the components of biological variation to assess the presence of outlying observations visually. This shows the means and extreme values on a cohort of 27 older adults for serum creatinine concentration. Subjects 1 to 13 were women and subjects 14 to 27 were men. The conventional reference intervals for creatinine in individuals older than 55 years of age generated in the laboratory were 60 to 98 μmol/L (0.68–1.10 mg/dL) for women and 66 to 128 μmol/L (0.75–1.45 mg/dL) for men. As documented previously,10 analysis of the data in Fig. 7.1 allows the following conclusions to be drawn:
• No individual has observed values that span the entire reference interval, and the range of values from each individual occupies only a small part of the dispersion of the reference interval.
• Most individuals have all observed values within the reference interval.
• The means of the observed values of most individuals lie within the reference interval, but they are different from each other.
• A few individuals have observed values that span the lower reference limit, and these individuals have values that change from usual to unusual over time.
• A few individuals have observed values that span the
Generation of Data From Patient Populations upper reference limit, and these individuals also have
Because estimates of CVI, a measure of the homeostatic values that change from usual to unusual.
mechanisms in individuals, seem constant in general, this It is clear that the CVI of creatinine (the variation around
characteristic can be used to determine this component of the homeostatic setting points) is smaller than the CVG (the
biological variation in difficult settings. For example, estima- difference among homeostatic setting points). In numerical
tion of CVI of HbA1c was done in specimens taken routinely terms, CVI was 4.3% and CVG was 18.3%. Thus, CVI is less
from children with cystic fibrosis who had no evidence of than CVG, and in such situations, the measurand is said to have
diabetes mellitus or impaired glucose tolerance.31 Similarly, it marked individuality. This characteristic can be expressed
would be difficult to justify taking multiple samplings from mathematically as an index of individuality (II) and is best
apparently healthy individuals for examinations such as arte- calculated as the ratio of examination plus within-subject
rial pH, gas partial pressures, and electrolytes; derivation of biological variations to between-subject biological variation,
mathematically: $\mathrm{II} = (CV_A^2 + CV_I^2)^{1/2}/CV_G$. However, it is common now for II to be simply calculated as CVI/CVG. This is satisfactory if CVA is less than 0.5 CVI, as is often the case with modern analytical technology and methodology, because CVA will then contribute little analytical noise to the numerator $(CV_A^2 + CV_I^2)^{1/2}$. Calculation of II from the most recent database shows that most commonly examined measurands have a low II, meaning that they have marked individuality.
This example provides a biological explanation for the fact that serum creatinine concentration, compared with conventional reference intervals, even if partitioned by age or sex, does not have high sensitivity for the detection of mild renal impairment, and provides the reason for the undoubted benefits of estimated glomerular filtration rate. The estimated glomerular filtration rate uses formulas that take age, sex, and ethnicity into account. This example also provides a sound rationale for the well-established fact that most measurands examined in laboratory medicine are not very useful in population screening or in case finding.

Consequences for Population-Based Reference Intervals
The consequences of individuality were first postulated by Harris,34,35 who showed that, when CVI/CVG is high (the criterion usually applied is CVI/CVG >1.4), the distribution of values from any single individual will cover much of the entire dispersion of the reference interval derived from values found in reference individuals. In contrast, if CVI/CVG is low (especially when CVI/CVG is <0.6), the dispersion of values for any individual will span only a small part of the conventional population-based reference interval.
The ramifications of this individuality on the interpretation of the results of examinations are profound. When II is low, most individuals can have values that are unusual for them, but these will often lie within the reference interval. As a consequence, these results would not be flagged by laboratories as deserving of further attention because, although they are unusual for that individual, they are within the reference interval. Moreover, the users of the laboratory results would be highly unlikely to pay attention to such unusual values. Thus, taking one specimen from an individual and comparing the result of an examination with the population-based reference interval will not, as shown previously for creatinine, be an effective way of picking up the small changes often seen in early pathological processes. However, when only one sample is examined in an individual, the II will have no influence on the percentage of false positives and true positives detected, irrespective of whether the upper reference limit or a selected clinical decision-making value is used. However, if a “confirmatory” measurement is performed, the II is of importance. For quantities with a very low II, which is the usual situation in laboratory medicine, a new result of measurement will be close to the first and only provide limited new information. For quantities with a high II, a repeat measurement will decrease the number of true positives and false positives. In a low prevalence situation (eg, in screening and case finding), in which it is important to prevent healthy individuals being incorrectly labelled, a positive result will “confirm” the first. In a relatively high prevalence situation (eg, in diagnosis) in which the number of false positives is low, and it is important to discover most of the diseased patients, the measurement should not be repeated. Thus, the only clear reason for a “confirmation” measurement is in a low prevalence situation when the II is high; this is rare in laboratory medicine.36,37
If II is defined as CVI/CVG, this ratio must be made larger to make conventional population-based reference intervals of higher clinical usefulness, especially in diagnosis, case finding, and screening, when no previous results on an individual are available. This is actually sometimes easily achieved because CVG can be made smaller by stratifying (or partitioning) the data.36 An example is shown in Table 7.2.

TABLE 7.2 Within-Subject (CVI) and Between-Subject (CVG) Biological Variation of Urine Creatinine and Indices of Individuality (II)

Group            CVI (%)   CVG (%)   II
Men (n = 7)      11.0      6.0       1.83
Women (n = 8)    15.7      11.0      1.42
Total            13.0      28.2      0.46

Table 7.2 shows that, for the total cohort, II is 0.46, and therefore, the reference intervals will be of low usefulness, especially for monitoring individuals. However, for men and women taken separately, the II are 1.83 and 1.42,† respectively, and reference intervals will be very useful. Stratification according to gender has vastly increased the usefulness of conventional population-based reference intervals. Because most measurands have low II, stratification must be considered when reference intervals are being developed (see Chapter 8). Knowledge of individuality gives a sound scientific basis for stratification; II must be high for conventional, population-based reference intervals to be of high usefulness. Because CVI is generally considered constant and rarely stratified, it must be CVG that is made smaller, if possible.

REFERENCE CHANGE VALUES AND DIFFERENCES IN SERIAL RESULTS
The results of examinations in laboratory medicine are used for many purposes; most are used for monitoring, either of acute disease in the short term in secondary and tertiary care institutions, or for chronic disease, which is often done in primary care. Monitoring, by definition, means assessment of results over time. Because most measurands have marked individuality and low II, conventional population-based reference intervals have disadvantages as aids to interpretation of serial results in an individual.
Harris and Yasaka38 introduced the concept of the RCV (as discussed earlier), which is also sometimes called the critical difference; the former term is much preferred to indicate an analogy with population-based reference intervals. The generation and application of RCVs have been recently reviewed in depth.6-8
The result of one examination will have: dispersion = $Z \times (CV_A^2 + CV_I^2)^{1/2}$, where Z is the Z-score equal to the
number of standard deviations appropriate for the probability desired. The result of a second examination will have the same dispersion, and so the total dispersion of two results will be $2^{1/2} \times Z \times (CV_A^2 + CV_I^2)^{1/2}$. Thus, for two results on the same individual to be different, this inherent difference due to CVA and CVI must be exceeded, and this is the RCV. It is assumed that preexamination sources of variation are considered negligible, and in clinical and laboratory practice, this means having well-documented standard operating procedures for patient preparation and specimen collection, transport, and handling before examination, and also good training of healthcare staff performing these tasks. Moreover, it is important to realize that changes in the bias of the examination between the collections of the serial specimens can also add to the RCV; if these can be quantitated, as a difference due to bias in percentage terms, ΔB, the formula becomes $\mathrm{RCV} = \Delta B + 2^{1/2} \times Z \times (CV_A^2 + CV_I^2)^{1/2}$. However, in a single laboratory in practice, the main source of ΔB over time is due to recalibration, and this random bias is usually an integral component of the longer term CVA as estimated from replicate examinations of internal quality control materials. Thus, it can be assumed that the bias of the examination does not change during the period between the two examinations, and the simpler formula applies.
It is often assumed that Z-scores of 1.96 for P <0.05 (and sometimes also 2.58 for P <0.01) are appropriate. It is almost ubiquitously stated in studies on biological variation that the RCV is calculated as $2.77 \times (CV_A^2 + CV_I^2)^{1/2}$. This is incorrect for a number of reasons.
First, these Z-scores are termed bidirectional (or two-tailed or two-sided), and this implies that the difference between the two serial results can be either an increase or a decrease. However, in most clinical situations, the decision-making is the assessment of a significant fall (decline, decrease, or reduction; for example, in HbA1c after treatment for diabetes mellitus or in blood glucose after adjustment of insulin dosage), or a significant rise (for example, an increase in serum creatinine to assess acute kidney injury or in serum troponin after acute chest pain). Thus, unidirectional (one-tailed or one-sided) Z-scores must be used in most clinical situations to facilitate correct interpretation; these are 1.65 for P <0.05 and 2.33 for P <0.01. Correct definitions of the clinical decision-making context and the major differences between the terms “change” and “rise or fall,” and their synonyms, are required for correct calculation of appropriate RCVs.
Second, clinical decision-making is not always done at P <0.05, which is the most commonly used probability in analysis of research data. The semantics used are crucial to understanding the probability that is appropriate. An example was given in a recent study on RCVs for dehydration markers, which, in addition to graphs that showed probability against change for plasma osmolality, urine specific gravity, and body mass, integrated a series of semantic interpretative anchors, namely, change was likely at P >0.80, more likely at P >0.90, very likely at P >0.95, and virtually certain at P >0.99.39 RCVs should be used in a spectrum of post-examination processes, including provision of graphs and tables of change versus probability, Δ-checking, and flagging of significant changes at different levels of probability on electronic and paper reports of results of examinations.1
In addition, because RCV is calculated as $\mathrm{RCV} = 2^{1/2} \times Z \times (CV_A^2 + CV_I^2)^{1/2}$, the magnitude of the RCV for each laboratory will depend on the CVA achieved. Furthermore, generation of data on CVI is not easy, and it is important that the CVI used is obtained from a population with a homogeneous CVI and that is similar to the population for whom the RCV is being created. In addition, the time interval used for obtaining the CVI must be comparable to the one used in practice. An overview of the source publications on CVI can be found in the available database,19 and these should be examined for their usefulness23 until the new EFLM database (which is being developed) is available. Thus, laboratories can calculate relevant RCVs by using CVA derived from their own internal quality control programs, using data close to clinical decision-making concentrations or activities, along with the CVI estimates from publications cited in the most up-to-date database to create an RCV to use for a variety of purposes. The RCV is commonly calculated assuming that CVI for both examinations is identical (ie, CVI is constant). A formula has been developed, which specifies that, even if CVA is constant, because the concentrations or activities will be different, the SD of the two examinations can also differ.40 Using the proposed more complex formula, the RCV becomes larger for increases than for decreases. This interesting proposal does not seem to have been translated into routine practice.
Moreover, it is obvious that the RCV generated using the traditional formula will depend on the number of significant figures to which the results of examinations are reported; these should be determined by consideration of the CVA achieved in practice.41 However, it has been suggested that, for examinations in which small changes in results may be of clinical significance for monitoring purposes, the effect of the number of significant figures reported on the RCV should also be taken into consideration.42
Traditional RCVs are believed by some to be rather simplistic, especially because they only address how likely it is that a certain change can be explained by CVA and CVI, but not the probability that a change in the disease state has occurred. It has been suggested that a tool for better understanding and interpretation of measured differences in monitoring is needed; the concepts of sensitivity, specificity, likelihood ratios, and odds used for diagnostic test evaluations were applied to monitoring by substituting measured concentrations with measured differences.43 It was suggested that this idea expanded the earlier concept of RCV by making it possible to have an estimate of the post-test odds for a certain difference to occur. Consequently, the likelihood ratio for change increases with a larger measured difference, and when used together with the pretest odds or pretest probability, the post-test odds and post-test probability, which are related to the clinical situation, can be calculated.
It has been proposed6-8 that the probability of significance of any difference seen in clinical practice between two results can be readily calculated using a simple rearrangement of the RCV equation making the Z-score (and thus, the probability) the unknown, namely: $Z = \text{difference}/[2^{1/2} \times (CV_A^2 + CV_I^2)^{1/2}]$. This does not seem to have been applied in practice, but there is clearly scope for adoption of this technique to enhance interpretation of serial results. Exactly as for RCVs, this application would have important consequences for CVA: the smaller the CVA, the smaller the RCV will be for any probability, and the significance of any difference seen will be of higher probability.
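The RCV formula and its rearrangement for Z can be illustrated with a short calculation. The following Python sketch is not taken from the cited publications; the function names are hypothetical, the CVA of 2.0% is an assumed value chosen only for illustration, and the conversion of Z to a probability assumes normally distributed variation, as the traditional approach does. The CVI of 4.3% is the serum creatinine estimate quoted earlier in this chapter.

```python
import math
from statistics import NormalDist


def rcv(cv_a, cv_i, z=1.65, delta_b=0.0):
    """Reference change value in percent.

    RCV = dB + 2^(1/2) * Z * (CVA^2 + CVI^2)^(1/2); delta_b defaults to 0,
    which corresponds to the simpler formula without a bias term.
    """
    return delta_b + math.sqrt(2.0) * z * math.hypot(cv_a, cv_i)


def z_for_difference(difference_pct, cv_a, cv_i):
    """Rearranged RCV equation: Z = difference / [2^(1/2) * (CVA^2 + CVI^2)^(1/2)]."""
    return difference_pct / (math.sqrt(2.0) * math.hypot(cv_a, cv_i))


def probability_of_change(difference_pct, cv_a, cv_i, bidirectional=False):
    """Probability attached to an observed percentage difference.

    Assumes a normal distribution. Unidirectional (rise or fall): Phi(|Z|).
    Bidirectional (any change): 2 * Phi(|Z|) - 1.
    """
    z = abs(z_for_difference(difference_pct, cv_a, cv_i))
    one_sided = NormalDist().cdf(z)
    return 2.0 * one_sided - 1.0 if bidirectional else one_sided


if __name__ == "__main__":
    cv_i = 4.3  # within-subject CV for serum creatinine quoted in the text (%)
    cv_a = 2.0  # hypothetical analytical CV (%), assumed for illustration only

    print(f"Unidirectional RCV (Z = 1.65): {rcv(cv_a, cv_i, z=1.65):.1f}%")
    print(f"Bidirectional RCV (Z = 1.96):  {rcv(cv_a, cv_i, z=1.96):.1f}%")

    observed = 10.0  # hypothetical observed rise between two serial results (%)
    p = probability_of_change(observed, cv_a, cv_i)
    print(f"Probability that a {observed:.0f}% rise is significant: {p:.3f}")
```

Run with a laboratory's own CVA, a calculation of this kind could be used to tabulate change against probability for the semantic anchors described above (for example, P >0.80, P >0.90, P >0.95, and P >0.99).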

Reference Change Values When the Distribution Is Not Normal
Some would argue that biological variables are not normally distributed but have a natural logarithm distribution. The traditional approach to generation of RCV does assume that CVA is random and also assumes that CVI is a random fluctuation around a homeostatic setting point, both distributed normally. However, recently, a number of examinations have been evaluated, making the assumption that the distributions of results in an individual were ln-normal rather than normal. The strategy proposed44 has since been adopted by others who are investigating examinations such as serum troponin.45,46 CVI was calculated from the total imprecision (CVT) for duplicate examinations by the often used formula: $CV_I = (CV_T^2 - CV_A^2)^{1/2}$. With the ln-normal approach, the median normal deviation of the ln-normal distribution (σ) was calculated as $\sigma = [\ln(CV_T^2 + 1)]^{1/2}$. The asymmetrical limits for the upward (positive) value for the ln-normal RCV (RCVpos) and for the downward (negative) value for the ln-normal RCV (RCVneg) were calculated as: $\mathrm{RCV_{pos}} = [\exp(1.96 \times 2^{1/2} \times \sigma) - 1] \times 100$ and $\mathrm{RCV_{neg}} = [\exp(-1.96 \times 2^{1/2} \times \sigma) - 1] \times 100$, respectively. This approach gives different RCVs for increases in concentration or activity than for decreases. This approach is likely to be used more often in the future when the measurand has an ln-normal rather than a normal distribution.47 The CVT term used in this calculation is equal to $(CV_A^2 + CV_I^2)^{1/2}$, and where it is difficult to dissect out numerical estimates of CV from the total variation of individuals, then calculation of RCV as $\mathrm{RCV} = 2^{1/2} \times Z \times CV_T$ seems appropriate and does provide an estimate of RCV that can be used in clinical and laboratory practices, assuming the CVA is similar between that derived in the assessment of $(CV_A^2 + CV_I^2)^{1/2}$ and in the practice in which it is going to be used.

Reference Change Values for More Than Two Serial Examinations
Frequently, in practice, more than two results of examinations are available for the individuals over time. Using the traditional RCVs described previously, it is only possible to calculate the significance of changes between each pair of consecutive examinations. Thus, an RCV method including all available serial results might be useful for interpretation of significant differences over time.
Mathematical methods have been developed to assess serial results from an individual, which are sometimes called methods for time series analysis.34 One model is called the “homeostatic model.” The model assumes that the measurand varies randomly around a homeostatic setting point. After collection of results from a small number of examinations, the mean and standard deviation are calculated. The next result should fall within the range calculated from the mean and the standard deviation. Then, new data from the same individual are used in an ongoing manner to refine the estimates of the mean and standard deviation for that individual to interpret whether the next new result has changed significantly from previous data. In contrast, the “random walk” model assumes that the measurand behaves randomly, and there is no homeostatic setting point. The dispersion of each result is calculated. However, instead of calculating a mean for the individual and recalculating this mean as further data are gathered, it is assumed that the most recent result is the starting point from which to assess whether change has occurred. These models have not been widely applied; they generate many false positive signals that change has occurred, when there is no obvious usual or pathological change. They also seem too complex and time consuming to be used in everyday practice.
Recently, studies on both unidirectional and bidirectional changes in serial results that used computer simulations have been performed.48,49 Factors used to multiply the first result from an individual were calculated to create the limits for constant cumulated significant differences. The factors were shown to become a simple function of the number of results and the total CV. The first result is multiplied by the appropriate factor for an increase or a decrease, which gives the limits for a significant difference. It remains to be seen if such apparently simple techniques become used in practice.

ANALYTICAL PERFORMANCE SPECIFICATIONS BASED ON BIOLOGICAL VARIATION
Analytical performance specifications are the numerical standards of examination performance required to facilitate optimum patient care. There are many strategies available to set these specifications, and these have been described over time as this facet of laboratory medicine has evolved.50,51 A concise historical perspective has been published recently.52
Following pioneering studies done in the United States53 on the definition of the components of biological variation, a College of American Pathologists conference held in 1976 supported the concept that specifications should be best based on biology.54 The consensus statement, restated in current terms and symbols, was: “For group screening, in which an individual is to be selected from a population, a specification for imprecision (CVA) is defined as: $CV_A = 0.5 \times (CV_I^2 + CV_G^2)^{1/2}$” and “For individual single and multipoint testing, in which an individual is evaluated on the basis of discrimination values: $CV_A = 0.5 \times CV_I$.”

Specifications for Examination Imprecision
This approach became of even greater interest as the quantity of data on the components of biological variation increased. In most cases, the examination performance specification for CVA was simply taken as $CV_A = 0.5 \times CV_I$, with the rationale that the same technology and methodology were used to examine specimens in both of the previously described clinical settings (and other situations). In consequence, the most stringent situation would allow both specifications to be met.

Specifications for Examination Bias
Examination bias was not mentioned at that time, possibly because the thesis was that laboratories all had their own reference intervals to which the results of their examinations were compared. However, as interest in harmonization of data across time and geography developed, it was realized that harmonization of reference intervals was important, and it was proposed that the examination performance specification for bias (B), to allow use of harmonized reference intervals, was $B < 0.25 \times (CV_I^2 + CV_G^2)^{1/2}$.55

Specifications for Total Error Allowable
As the concepts of total laboratory quality management evolved, and the idea that random and systematic sources of
variation (imprecision and bias) were both important gained acceptance, it became clear that total analytical error was the most important clinically.56 It was proposed that the linear model of combining imprecision and bias could be used to set analytical performance specifications for total error allowable (TEa) as a simple linear addition; for P <0.05: $\mathrm{TE_a} < 1.65 \times 0.5\,CV_I + 0.25\,(CV_I^2 + CV_G^2)^{1/2}$. It should be noted that this model for combining bias and imprecision is only one of the models available; the advantages and disadvantages of these have been discussed in detail,57 and the disadvantages of this particular model have been recently restated.58 EFLM has established a Task and Finish Group to address the difficulties with the total error concept and to try to evolve new models that are sounder. However, the present model is widely used in current practice in laboratory medicine, albeit sometimes with a different multiplier for the imprecision term, to deliver different levels of probability.
Application of this has become very widespread because the formula is simple, many data on biological variation exist, and they are directly related to the general use of results of examinations in laboratory medicine. However, concerns have been expressed. First, the biological variation data currently used for some measurands vary,6 and second, the analytical performance standards derived using biological variation are unattainable for some measurands with current technology and methodology, including serum sodium, chloride, and calcium. In contrast, the analytical performance standards are easily attained for others, including serum urea, triglycerides, and many enzyme activities. A three-tier model to cater for this was proposed, giving optimum, desirable, and minimum examination performance specifications based on biological variation using 0.25 CVI, 0.5 CVI, and 0.75 CVI for imprecision and $0.125\,(CV_I^2 + CV_G^2)^{1/2}$, $0.25\,(CV_I^2 + CV_G^2)^{1/2}$, and $0.375\,(CV_I^2 + CV_G^2)^{1/2}$ for bias, respectively.59 Numerical data on these three levels of analytical performance specifications are included for the 358 measurands in the eighth edition of the biological variation database on the Internet.19 Furthermore, because analytical technology has evolved rapidly of late, and analytical performance has improved, it has been suggested that a fourth “ideal” level should be introduced with multipliers of 0.1 for both imprecision and bias.60 Combinations of the three (or four) levels can be done to obtain relevant analytical performance specifications for TEa.

Specifications Derived From Opinions of Clinicians
Because it is the users of the results of examinations in laboratory medicine that make use of the data provided, it might be believed that they should be able to inform about the analytical quality required to facilitate decision-making. One way of attempting to do this is with clinical vignettes, whereby the clinician provides information about the change in serial results that is required to stimulate a clinical decision. Some early studies were carried out, but these were completely unsatisfactory, in that it was assumed that changes were all due to analytical imprecision, and within-subject biological variation was not considered. The probability used was bidirectional, and always at P <0.05 despite the very large differences in probability attributable to words used in the statements.61,62 The principle behind the studies should be that a difference in serial results is due to preexamination sources of variation (CVP), CVA, CVI, and changes in bias (ΔB):

$$\text{difference} = 2^{1/2} \times Z \times [CV_P^2 + CV_I^2 + CV_A^2]^{1/2} + \Delta B$$

Now, let CVP and ΔB be zero and rearrange the equation:

$$CV_A = [(\text{difference}/(2^{1/2} \times Z))^2 - CV_I^2]^{1/2}$$

Studies have been done using this model on a variety of measurands, and have included some studies that involved patients undertaking self-testing. An early example is provided by the studies on blood hemoglobin63; other studies performed since then have investigated HbA1c, glucose, prothrombin time, erythrocyte sedimentation rate, and urinary albumin.64 It was found that the derived analytical performance specification depends on the clinical setting, the semantics of the questions posed, how close the result was to a decision limit or reference limit, and probably on current examination performance. Moreover, there was large interclinician variation in responses. This approach is probably possible only for assessment of a single measurand in a specific clinical situation. In addition, it has been suggested that this strategy is possible only for analytes that have a major role in monitoring and/or diagnosis.

Specifications for Measurement Uncertainty
Because there is now a requirement for medical laboratories to document the measurement uncertainty to comply with ISO 15189, there should be consideration of the setting of analytical performance specifications for this characteristic. Because the fundamental principles are that bias should be eliminated (if possible) and all sources of variation should be added linearly as variances, then the only possible analytical performance specifications for measurement uncertainty are that $CV_A < f \times CV_I$, where f = 0.10, 0.25, 0.50, 0.75 for ideal, optimum, desirable, and minimum levels of performance, respectively,60 but again this is dependent on reliable data being available for CVI.

Specifications for Other Applications
Data on biological variation have been used to generate analytical performance specifications in other than general clinical settings, including an evaluation of systems,65 reference methods,66 and for the allowable difference in bias (ΔB) between two systems used to generate results on the same individuals, which is $\Delta B < 0.33\,CV_I$.67

Role of Biological Variation in Setting Analytical Performance Specifications
The setting of analytical performance specifications in laboratory medicine has been a topic of discussion and debate for more than 50 years; 15 years ago, as this topic matured and a profusion of recommendations appeared, a number of leading professionals in laboratory medicine realized that there was a need for a global consensus on the setting of such specifications. The Stockholm Conference held in 1999 on “Strategies to set global analytical quality specifications in laboratory medicine” achieved this and advocated the ubiquitous application of a hierarchical structure of approaches, based on a model proposed by Fraser and Petersen.68 The hierarchy has five levels, namely: (1) evaluation of the effect
of analytical performance on clinical outcomes in specific OTHER APPLICATIONS


clinical settings; (2) evaluation of the effect of analytical
performance on clinical decisions in general, using (a) data Calculation of Reliability Coefficient
based on components of biological variation or (b) analysis The individuality of tests and the consequences of the marked
of clinicians’ opinions; (3) published professional recom- individuality (low II) of most measurands in laboratory
mendations from (a) national and international expert bodies medicine in diagnosis, case finding, screening, and monitor-
or (b) expert local groups or individuals; (4) performance ing has been discussed. Epidemiologists and others use
goals set by (a) regulatory bodies or (b) organizers of external similar information to the II, but in a slightly different way.
quality assessment (EQA) schemes; and (5) goals based The reliability coefficient is used, which is calculated as
on the current state of the art as (a) demonstrated by data $CV_G^2/(CV_A^2 + CV_I^2 + CV_G^2)$, which is the between-subject
from EQA or proficiency testing scheme, or (b) found in variance divided by the total variance.1 The reliability coef-
current publications on methodology.69 This approach has ficient, usually called R, is numerically equal to the correla-
mostly been used, and there is considerable evidence that the tion coefficient of repeated measurements. It can be between
specifications advocated in level 2, which are based on bio- 0 and 1. If R approached 0, II would be high, and if R
logical variation, are widely implemented. A number of EQA approached 1, then II would be low. Most measurands have
schemes, including those in Australia,70 base their acceptable high values for R.
standards of examination performance on the Stockholm
consensus conference recommendations, especially the use of Number of Samples Needed
biological variation data. Three convocations of experts in In usual clinical practice, only one sample is taken.
quality control in laboratory medicine discussed the genera- Examination result variation can be reduced by multiple
tion and application of analytical performance specifications, sampling (or multiple examinations), and the variation is
and these strategies, based on biological variation, are used made smaller by the square root of the number of replicates.
by many.60,71,72 Furthermore, a recent survey with more than To estimate the number of specimens needed to determine
450 responses from professionals in more than 80 countries the homeostatic setting point within a certain percentage
demonstrated the wide use of specifications based on biologi- error with a stated probability, a simple rearrangement of the
cal variation.73 Since the Stockholm conference, there have usual standard error of the mean formula is used, namely:
been further proposals for setting performance specifications n = (Z × [CVA 2 + CVI2 ] 12 D)2 , where Z is the Z-score appro-
for examination performance characters,74,75 but it is interest- priate for the probability, and D is the desired percentage
ing that all of these include biological variation data in their closeness to the homeostatic setting point. It is important
considerations.76 to note that taking multiple specimens and undertaking
Because laboratory medicine has evolved considerably replicate analyses does affect the overall variability of the
over the last few years, the EFLM organized the 1st Strategic individual examination result; the dispersion (expressed as
Conference on defining analytical goals 15 years after the 1 CV) can be calculated as: dispersion = Z × [(CVA2/nA) +
Stockholm conference; the many insightful presentations are (CVI2/nS)]1/2, where Z is the number of standard deviations
available on the Internet.77 The consensus statement and appropriate to the probability selected, nA is the number of
formal papers emanating from the speakers are included in a replicate examinations, and nS is the number of specimens.
Special Issue of Clinical Chemistry and Laboratory Medicine,78 The relative magnitudes of CVA and CVI are important in
and represent an invaluable up-to-date resource on all aspects deciding if a lower dispersion is required, whether it is better
of setting analytical performance standards, and on genera- to undertake replicate examinations on one specimen or
tion and application of data on biological variation. The con- singleton examinations on multiple specimens. Calculators
sensus was that the Stockholm hierarchical approach was for this can be found, along with a more detailed discussion
supported, but could be simplified. In this revision, the hier- of this topic, with real examples.79 A review has documented
archy was simplified and represented by three different further examples and detailed the reasons why knowledge of
models to set analytical performance specifications. Model 1 numerical data on the components of biological variation is
is based on the effect of analytical performance on clinical of crucial importance.80
outcomes using direct (1a) or indirect (1b) outcome studies,
which basically investigate the impact of analytical perfor-
mance of the test on clinical outcomes. Model 2 is based on Reporting Results, Selecting the Best Specimen
components of biological variation of the measurand. Model to Collect, and Choosing the Best Examination
3 is based on state of the art. The three models use different It is sometimes possible to report the results of examinations
principles, and some models will be better suited for certain in different ways. For example, measurands in urine such
measurands than for others. Application of model 1 is diffi- as creatinine can be reported as concentration or output per
cult and will probably be limited to a few measurands. It has day, and many are reported as a ratio with creatinine concen-
been considered that generally applicable analytical per- tration. Moreover, for some measurands, it is possible to
formance specifications will be best based on biological vari- collect different samples for the same clinical purpose (eg,
ation data for some time to come. In view of the caveats early morning or random or timed urine specimens for low
regarding the reliability of certain data on the components of concentration albumin and protein examinations). In certain
biological variation, it will be significant when the EFLM clinical situations, examinations that might be considered
group on biological variation fulfills its remit and evaluates to have a somewhat similar purpose are available, such as
the current database, creates a database of high quality, and serum creatinine and cystatin-C, or blood HbA1c and serum
its recommendations on standards of generation and report- fructosamine. Knowledge of the components of biological
ing of data are carried through into practice. variation can assist in making decisions about reporting
results, selecting the best specimen to collect, and choosing the best test.1

To undertake such comparisons, the influences of biological variation should be considered. The ideal measurand would have low CVI so that a single examination will give a good measure of the true value for that individual. Moreover, this would allow easy monitoring over time and detection of significant differences, because the RCV would be low, provided that the CVA was also low. In addition, the ideal measurand would have no heterogeneity of CVI among individuals and across studies, and would not be dependent on age and gender and other possible confounding factors so that the simple general formulas given in this chapter would hold for all.

Method Development and Evaluation
Introduction of new examinations is an ongoing task for most medical laboratories. Some years ago, Zweig and Robertson suggested that the introduction of a new procedure should be similar to the structured evolution of a new drug through phase trials.81 They suggested that the phases should be the following: analytical investigation and assessment of reliability and practicability characteristics; overlap investigation involving generation of reference intervals and assessment of values in disease; clinical investigation and evaluation of sensitivity, specificity, and predictive value; outcome investigation of whether individuals gain an advantage; and, finally, investigation of usefulness, which is a cost–benefit analysis with respect to individuals and the population. In contrast to the linear models, another approach has been proposed in which the essential components of analytical and clinical performance, clinical and cost-effectiveness, and the broader impact of testing are assembled in a dynamic cycle. This approach emphasizes the interaction of the different components, and that clinical effectiveness data should be fed back to refine analytical and clinical performances to achieve improved outcomes.82

One aspect that is not covered in any of these phases, however, is the need to generate and apply data on biological variation early in the evolution of any examination. Data can be generated on the components of biological variation through duplicate analysis of a small number of samples from a small cohort of healthy individuals. This allows the setting of analytical performance specifications, calculation of the significance of changes in an individual (otherwise unobtainable and not mentioned in any of the phases), and assessment of the use of the generated population-based reference intervals. Generation and application of data on biological variation are essential prerequisites for introducing new examinations.83 Moreover, the data are also necessary for objective analysis of the often somewhat subjective guidelines from professional bodies that give recommendations on interpretation of the numerical results of examinations and on examination performance specifications; unfortunately, these recommendations are often flawed.80

OVERALL CONCLUSIONS
Numerical estimates of within-subject and between-subject biological variation are best generated by examination of a series of specimens taken from a cohort of individuals, followed by statistical analysis of the sources of variation. The proper design and performance of such studies is complex. However, databases of estimates are available that facilitate application in determining the individuality of a measurand and the usefulness of conventional population-based reference intervals, the statistical significance of changes in serial results from an individual, analytical performance specifications for imprecision, bias, total error allowable, measurement uncertainty, and other characteristics. The estimates have many other uses. There are current concerns regarding the robustness of certain of these data and estimates because some measurands show considerable heterogeneity. Recent recommendations should be followed in the generation and application of biological variation data.

POINTS TO REMEMBER

Biological variation may be daily, monthly or seasonal, but most measurands have random variation around homeostatic setting points.
It is challenging to generate data on within-subject and between-subject components of biological variation, but available databases facilitate application; users should ensure applicability of published data to their applications.
Data on biological variation are used, inter alia, to determine:
• the individuality of a measurand and the usefulness of conventional population-based reference intervals
• the statistical significance of changes in serial results from an individual and the probability that any change documented is significant
• examination performance specifications for imprecision, bias, TEa, measurement uncertainty, and other characteristics.

SELECTED REFERENCES

For a full list of references for this chapter, please refer to ExpertConsult.com.
1. Fraser CG. Biological variation: from principles to practice. Washington, DC: AACC Press; 2001.
3. Fraser CG, Harris EK. Generation and application of data on biological variation in clinical chemistry. Crit Rev Clin Lab Sci 1989;27:409–37.
5. Røraas T, Petersen PH, Sandberg S. Confidence intervals and power calculations for within-person biological variation: effect of analytical imprecision, number of replicates, number of samples, and number of individuals. Clin Chem 2012;58:1306–13.
8. Fraser CG. Reference change values. Clin Chem Lab Med 2011;50:807–12.
19. Minchinela J, Ricós C, Perich C, et al. Biological variation database and quality specifications for imprecision, bias and total error (desirable and minimum). The 2014 update. <http://www.westgard.com/biodatabase-2014-update.htm>.
20. Perich C, Minchinela J, Ricós C, et al. Biological variation database: structure and criteria used for generation and update. Clin Chem Lab Med 2015;53:299–305.
24. Bartlett WA, Braga F, Carobene A, et al. A checklist for critical appraisal of studies of biological variation. Clin Chem Lab Med 2015;53:879–85.
29. Ricos C, Iglesias N, Garcia-Lario JV, et al. Within-subject biological variation in disease: collated data and clinical consequences. Ann Clin Biochem 2007;44:343–52.
33. Fraser CG. Inherent biological variation and reference values. Clin Chem Lab Med 2004;42:758–64.
37. Petersen PH, Sandberg S, Fraser CG, et al. Influence of index of individuality on false positives in repeated sampling from healthy individuals. Clin Chem Lab Med 2001;39:160–5.
43. Hyltoft Petersen P, Sandberg S, Iglesias N, et al. Likelihood-ratio and odds applied to monitoring of patients as a supplement to reference change value (RCV). Clin Chem Lab Med 2007;46:157–64.
59. Fraser CG, Hyltoft Petersen P, Libeer JC, et al. Proposals for setting generally applicable quality goals solely based on biology. Ann Clin Biochem 1997;34:8–12.
68. Fraser CG, Petersen PH. Analytical performance characteristics should be judged against objective quality specifications. Clin Chem 1999;45:321–3.
69. Hyltoft Petersen P, Fraser CG, Kallner A, et al., editors. Strategies to set global analytical quality specifications in laboratory medicine. Scand J Clin Lab Invest 1999;59:475–585.
75. Klee GG. Establishment of outcome-related analytic performance goals. Clin Chem 2010;56:714–22.
78. Sandberg S, Fraser CG, Horvath AR, et al. Defining analytical performance specifications: Consensus Statement from the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine. Clin Chem Lab Med 2015;833–5.
80. Fraser CG. Test result variation and the quality of evidence-based clinical guidelines. Clin Chim Acta 2004;346:19–24.
83. Fraser CG. Data on biological variation; essential prerequisites for introducing new procedures. Clin Chem 1994;40:1671–3.
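
WORKED EXAMPLE

The formulas given under "Calculation of Reliability Coefficient" and "Number of Samples Needed" can be sketched in Python as follows. This is a minimal illustration only, not code from the chapter or from the calculators it cites. The function names and the example values (CVA = 3%, CVI = 6%, CVG = 15%, Z = 1.96, D = 5%) are hypothetical, and R is computed as a plain ratio of variances, in line with its description as the between-subject variance divided by the total variance.

```python
import math


def reliability_coefficient(cv_a, cv_i, cv_g):
    """R = CVG^2 / (CVA^2 + CVI^2 + CVG^2): between-subject over total variance."""
    return cv_g**2 / (cv_a**2 + cv_i**2 + cv_g**2)


def specimens_needed(z, cv_a, cv_i, d):
    """n = (Z * sqrt(CVA^2 + CVI^2) / D)^2, rounded up to a whole specimen."""
    n = (z * math.sqrt(cv_a**2 + cv_i**2) / d) ** 2
    return math.ceil(n)


def dispersion(z, cv_a, cv_i, n_a, n_s):
    """Dispersion = Z * sqrt(CVA^2 / nA + CVI^2 / nS)."""
    return z * math.sqrt(cv_a**2 / n_a + cv_i**2 / n_s)


# Hypothetical CVs in percent, chosen only to illustrate the arithmetic.
CV_A, CV_I, CV_G = 3.0, 6.0, 15.0
Z = 1.96  # two-sided 95% probability

r = reliability_coefficient(CV_A, CV_I, CV_G)
n = specimens_needed(Z, CV_A, CV_I, d=5.0)

# Compare duplicate examinations of one specimen with single
# examinations of two specimens.
replicate_examinations = dispersion(Z, CV_A, CV_I, n_a=2, n_s=1)
replicate_specimens = dispersion(Z, CV_A, CV_I, n_a=1, n_s=2)

print(f"R = {r:.3f}")
print(f"specimens needed = {n}")
print(f"dispersion, 2 examinations x 1 specimen = {replicate_examinations:.2f}%")
print(f"dispersion, 1 examination x 2 specimens = {replicate_specimens:.2f}%")
```

With these assumed values CVI exceeds CVA, so collecting a second specimen narrows the dispersion more than repeating the examination on the same specimen. The ordering would reverse for a measurand whose analytical imprecision dominates.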
REFERENCES

1. Fraser CG. Biological variation: from principles to practice. Washington, DC: AACC Press; 2001.
2. Simundic AM, Kackov S, Miler M, et al. Terms and symbols used in studies on biological variation: the need for harmonization. Clin Chem 2015;61:438–9.
3. Fraser CG, Harris EK. Generation and application of data on biological variation in clinical chemistry. Crit Rev Clin Lab Sci 1989;27:409–37.
4. Plebani M, Padoan A, Lippi G. Biological variation: back to basics. Clin Chem Lab Med 2015;53:155–6.
5. Røraas T, Petersen PH, Sandberg S. Confidence intervals and power calculations for within-person biological variation: effect of analytical imprecision, number of replicates, number of samples, and number of individuals. Clin Chem 2012;58:1306–13.
6. Aarsand AK, Røraas T, Sandberg S. Biological variation – reliable data is essential. Clin Chem Lab Med 2015;53:153–4.
7. Fraser CG. Improved monitoring of differences in serial laboratory results. Clin Chem 2011;57:1635–7.
8. Fraser CG. Reference change values. Clin Chem Lab Med 2011;50:807–12.
9. Fraser CG. Making better use of differences in serial laboratory results. Ann Clin Biochem 2012;49:1–3.
10. Fraser CG. Biological variation in the elderly; implications for reference values. In: Faulkner WR, Meites S, editors. Geriatric clinical chemistry reference values. Washington, DC: AACC Press; 1994. p. 44.
11. Carlsen S, Petersen PH, Skeie S, et al. Within-subject biological variation of glucose and HbA1c in healthy persons and in type 1 diabetes patients. Clin Chem Lab Med 2011;49:1501–7.
12. Braga F, Dolci A, Mosca A, et al. Biological variability of glycated haemoglobin. Clin Chim Acta 2010;411:1606–10.
13. Braga F, Panteghini M. Biological variability of C-reactive protein: is the available information reliable? Clin Chim Acta 2012;413:1179–83.
14. Carobene A, Braga F, Roraas T, et al. A systematic review of data on biological variation for alanine aminotransferase, aspartate aminotransferase and γ-glutamyl transferase. Clin Chem Lab Med 2013;51:1997–2007.
15. Fraser CG. The application of theoretical goals based on biological variation data in proficiency testing. Arch Pathol Lab Med 1988;112:404–15.
16. Fraser CG. Biological variation in clinical chemistry. An update: collated data, 1988 – 1991. Arch Pathol Lab Med 1992;116:916–23.
17. Sebastián-Gámbaro MA, Lirón-Hernández FJ, Fuentes-Arderiu X. Intra- and inter-individual biological variability data bank. Eur J Clin Chem Clin Biochem 1997;35:845–52.
18. Ricós C, Álvarez V, Cava F, et al. Current databases on biological variation: pros, cons and progress. Scand J Clin Lab Invest 1999;59:491–500.
19. Minchinela J, Ricós C, Perich C, et al. Biological variation database and quality specifications for imprecision, bias and total error (desirable and minimum). The 2014 update. <http://www.westgard.com/biodatabase-2014-update.htm>. [accessed 15.02.14].
20. Perich C, Minchinela J, Ricós C, et al. Biological variation database: structure and criteria used for generation and update. Clin Chem Lab Med 2015;53:299–305.
21. Fraser CG, Browning MC. The “index of fiduciality” proposed for use in evaluation and comparison of methods. Clin Chem 1988;34:1356–7.
22. Sölétormos G, Semjonow A, Sibley PE, et al. Biological variation of total prostate-specific antigen: a survey of published estimates and consequences for clinical practice. Clin Chem 2005;51:1342–51.
23. Miller GW, Bruns DE, Hortin GL, et al. Current issues in measurement and reporting of urinary albumin excretion. Clin Chem 2009;55:24–38.
24. Bartlett WA, Braga F, Carobene A, et al. A checklist for critical appraisal of studies of biological variation. Clin Chem Lab Med 2015;53:879–85.
25. European Federation of Clinical Chemistry and Laboratory Medicine. <http://www.efcclm.org>. [accessed 15.02.14].
26. Simundic AM, Bartlett WA, Fraser CG. Biological variation: a still evolving facet of laboratory medicine. Ann Clin Biochem 2015;52:189–90.
27. STARD Guidelines. <http://www.stard-statement.org>. [accessed 15.02.14].
28. Lawson N. Is variation in biological variation a problem? Ann Clin Biochem 2007;44:319–20.
29. Ricos C, Iglesias N, Garcia-Lario JV, et al. Within-subject biological variation in disease: collated data and clinical consequences. Ann Clin Biochem 2007;44:343–52.
30. Westgard QC. Quality requirements. <http://www.westgard.com/biodatabasedisease.htm>. [accessed 15.02.15].
31. Desmeules P, Cousineau J, Allard P. Biological variation of glycated haemoglobin in a paediatric population and its application to calculation of significant change between results. Ann Clin Biochem 2010;47:35–8.
32. Cembrowski GS, Trani DV, Higgins TN. The use of serial patient blood gas, electrolyte and glucose results to derive biologic variation: a new tool to assess the acceptability of intensive care unit testing. Clin Chem Lab Med 2010;48:1447–54.
33. Fraser CG. Inherent biological variation and reference values. Clin Chem Lab Med 2004;42:758–64.
34. Harris EK. Some theory of reference values. II. Comparison of some statistical models of intraindividual variation in blood constituents. Clin Chem 1976;22:1343–50.
35. Harris EK. Statistical aspects of reference values in clinical pathology. Prog Clin Pathol 1981;8:45–66.
36. Hyltoft Petersen P, Fraser CG, Sandberg S, et al. The index of individuality is often a misinterpreted quantity characteristic. Clin Chem Lab Med 1999;37:655–61.
37. Petersen PH, Sandberg S, Fraser CG, et al. Influence of index of individuality on false positives in repeated sampling from healthy individuals. Clin Chem Lab Med 2001;39:160–5.
38. Harris EK, Yasaka T. On the calculation of a “reference change” for comparing two consecutive measurements. Clin Chem 1983;29:25–30.
39. Cheuvront SN, Fraser CG, Kenefick RW, et al. Reference change values for monitoring dehydration. Clin Chem Lab Med 2011;49:1033–7.
40. Jones GRD. Critical difference calculations revised: inclusion of variation in standard deviation with analyte concentration. Ann Clin Biochem 2009;46:517–19.
41. Hawkins RCW. The significance of significant figures. Clin Chem 1990;36:824.
42. Jones GRD. The effect of the reporting interval size on critical difference estimation – beyond ‘2.77’. Clin Chem 2006;52:880–5.
43. Hyltoft Petersen P, Sandberg S, Iglesias N, et al. Likelihood-ratio and odds applied to monitoring of patients as a supplement to reference change value (RCV). Clin Chem Lab Med 2007;46:157–64.
44. Fokkema MR, Herrmann Z, Muskiet FAJ, et al. Reference change values for brain natriuretic peptides revisited. Clin Chem 2006;52:1602–3.
45. Frankenstein L, Wu AH, Hallermayer K, et al. Biological variation and reference change value of high-sensitivity troponin T in healthy individuals during short and intermediate follow-up periods. Clin Chem 2011;57:1068–71.
46. Vasile VC, Saenger AK, Kroning JM, et al. Biologic variation of a novel cardiac troponin I assay. Clin Chem 2011;57:1080–1.
47. Braga F, Ferraro S, Ieva F, et al. A new robust statistical model for interpretation of differences in serial test results from an individual. Clin Chem Lab Med 2015;53:815–22.
48. Lund F, Petersen PH, Fraser CG, Sölétormos G. Calculation of limits for significant unidirectional changes in two or more serial results of a biomarker based on a computer simulation model. Ann Clin Biochem 2015;52:237–44.
49. Lund F, Petersen PH, Fraser CG, Sölétormos G. Calculation of limits for significant bidirectional changes in two or more serial results of a biomarker based on a computer simulation model. Ann Clin Biochem 2015;52:434–40.
50. Fraser CG. Desirable performance standards for clinical chemistry tests. Adv Clin Chem 1983;23:299–339.
51. Fraser CG, Hyltoft Petersen P. Desirable standards for laboratory tests if they are to fulfil medical needs. Clin Chem 1993;39:1447–55.
52. Fraser CG. The 1999 Stockholm Consensus Conference on quality specifications in laboratory medicine. Clin Chem Lab Med 2015;53:837–40.
53. Cotlove E, Harris EK, Williams GZ. Biological and analytic components of variation in long-term studies of serum constituents in normal subjects. III. Physiological and medical implications. Clin Chem 1970;16:1028–32.
54. Elevitch FR, editor. College of American Pathologists Conference II (1976): analytical goals in clinical chemistry. Skokie, IL: College of American Pathologists; 1977.
55. Gowans EM, Hyltoft Petersen P, Blaabjerg O, et al. Analytical goals for the acceptance of common reference intervals for laboratories throughout a geographical area. Scand J Clin Lab Invest 1988;48:757–64.
56. Westgard JO, Hunt MR. Use and interpretation of common statistical tests in method-comparison studies. Clin Chem 1973;19:49–57.
57. Petersen PH, Fraser CG, Jørgensen L, et al. Combination of analytical quality specifications based on biological within- and between-subject variation. Ann Clin Biochem 2002;39:543–50.
58. Oosterhuis WP. Gross overestimation of total allowable error based on biological variation. Clin Chem 2011;57:1334–6.
59. Fraser CG, Hyltoft Petersen P, Libeer JC, et al. Proposals for setting generally applicable quality goals solely based on biology. Ann Clin Biochem 1997;34:8–12.
60. Adams O, Cooper G, Fraser C, et al. Collective opinion paper on findings of the 2011 convocation of experts on laboratory quality. Clin Chem Lab Med 2012;50:1547–58.
61. Elion-Gerritzen WE. Analytic precision in clinical chemistry and medical decisions. Am J Clin Pathol 1980;73:183–95.
62. Skendzel LP, Barnet RN, Platt R. Medically useful criteria for analytic performance of laboratory tests. Am J Clin Pathol 1985;83:200–5.
63. Thue G, Sandberg S, Fugelli P. Clinical assessment of haemoglobin values by general practitioners related to analytical and biological variation. Scand J Clin Lab Invest 1991;51:453–9.
64. Thue B, Sandberg S. Analytical performance specifications based on how clinicians use laboratory tests. Experiences from a post-analytical external quality assessment scheme. Clin Chem Lab Med 2015;53:857–62.
65. Fraser CG, Hyltoft Petersen P, Ricos C, et al. Proposed quality specifications for the imprecision and inaccuracy of analytical systems in clinical chemistry. Eur J Clin Chem Clin Biochem 1992;30:311–17.
66. Thienpont L, Franzini C, Kratochvila J, et al. Analytical quality specifications for reference methods and operating specifications for networks of reference laboratories. Eur J Clin Chem Clin Biochem 1995;33:949–57.
67. Petersen PH, Fraser CG, Westgard JO, et al. Analytical goal-setting for monitoring patients when two analytical methods are used. Clin Chem 1992;38:2256–60.
68. Fraser CG, Petersen PH. Analytical performance characteristics should be judged against objective quality specifications. Clin Chem 1999;45:321–3.
69. Hyltoft Petersen P, Fraser CG, Kallner A, et al., editors. Strategies to set global analytical quality specifications in laboratory medicine. Scand J Clin Lab Invest 1999;59:475–585.
70. RCPAQAP Chemical Pathology. Assessment of performance. <http://dataentry.rcpaqap.com.au/chempath/index.cfm>.
71. Burnett D, Ceriotti F, Cooper G, et al. Collective opinion paper on findings of the 2009 convocation of experts on quality control. Clin Chem Lab Med 2010;48:41–52.
72. Cooper G, DeJonge N, Ehrmeyer S, et al. Collective opinion paper on findings of the 2010 convocation of experts on laboratory quality. Clin Chem Lab Med 2011;49:793–802.
73. Westgard S. Global analytical goal survey results. <http://www.westgard.com/global-goal-results.htm>.
74. Haeckel R, Wosniok W. A new concept to derive permissible limits for analytical imprecision and bias considering diagnostic requirements and technical state-of-the-art. Clin Chem Lab Med 2011;49:623–35.
75. Klee GG. Establishment of outcome-related analytic performance goals. Clin Chem 2010;56:714–22.
76. Hyltoft Petersen P, Sandberg S, Fraser CG. Do new concepts for deriving permissible limits for analytical imprecision and bias have any advantages over existing consensus? Clin Chem Lab Med 2011;49:637–40.
77. <http://www.eflm.eu/index.php/educational-material.html>.
78. Sandberg S, Fraser CG, Horvath AR, et al. Defining analytical performance specifications: Consensus Statement from the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine. Clin Chem Lab Med 2015;833–5.
79. Westgard QC. Dispersion calculator and critical number of test samples. <http://www.westgard.com/dispersion-calculator-and-critical-number-of-test-samples.htm>.
80. Fraser CG. Test result variation and the quality of evidence-based clinical guidelines. Clin Chim Acta 2004;346:19–24.
81. Zweig MH, Robertson EA. Why we need better test evaluations. Clin Chem 1982;28:1272–6.
82. Horvath AR, Lord SJ, StJohn S, et al. From biomarkers to medical tests: the changing landscape of test evaluation. Clin Chim Acta 2014;427:49–57.
83. Fraser CG. Data on biological variation; essential prerequisites for introducing new procedures. Clin Chem 1994;40:1671–3.
