
Computers in Biology and Medicine 29 (1999) 289–301

www.elsevier.com/locate/compbiomed

An algorithm for finding the linear region in a nonlinear data set
Martin H. Kroll a,*, Ken Emancipator b, David Floering c, Daniel Tholen d
a Department of Pathology, Clinical Chemistry Division, The Johns Hopkins School of Medicine, 600 North Wolfe Street, Meyer B125, Baltimore, MD 21287-7065, USA
b Department of Pathology, Diagnostic Pathology/Lab Medicine, Beth Israel Medical Center, First Avenue at 16th Street, New York, NY 10003, USA
c Immediate Response Laboratory, The Christ Hospital, 2139 Auburn Avenue, Cincinnati, OH 45219, USA
d Statistical Consulting Service, 823 Webster, Traverse City, MI 49686, USA

Received 15 May 1998; received in revised form 15 April 1999; accepted 15 April 1999

* Corresponding author. Tel.: +1-410-955-6304; fax: +1-410-955-0767. E-mail address: mkroll@pathlan.path.jhu.edu (Martin H. Kroll)

Abstract

Finding the linear reportable range is an important procedure for each method in clinical chemistry.
One is often called upon to limit the reportable range in order to find the linear region. Limiting the
reportable range by visual techniques is subjective, may introduce bias and is not programmable. Using
Kroll and Emancipator's polynomial method for linearity, we compare the residuals of the first-order fit to
determine whether eliminating a point from one end or the other of the data set worsens or improves
the data set's linearity. In an example of urinary cortisol, the root mean square of the residuals
worsens by 2% when the lowest point is removed, improves by 39% when the highest point is removed and by 82%
when the two highest points are removed. The latter data set is the most linear. © 1999 Elsevier Science
Ltd. All rights reserved.

Keywords: Linear; Nonlinear; Reportable range; Accuracy

1. Introduction

The determination of linearity is an important evaluation for any method. Not only is
linearity of a method important from a regulatory perspective, but it is also a helpful
evaluation tool for solving problems with methods. The actual experimental values are referred
to as the data set. Linearity is a measure of the internal consistency of the points in a data set.
A data set is linear if the linear fit is superior to higher-order polynomial fits and the data set
is visually linear [1].
An important concept is the reportable range. The reportable range is the range of values
that are accurate and consistently linear. The extremes of the reportable range represent the
lowest and highest reportable results without dilution. It behooves one to have the reportable
range as wide as possible, to reduce the number of dilutions and the number of results
reported as less than the minimum reportable result. Typically, one challenges the method
under evaluation beyond the expected upper and lower limits, that is, with samples whose
concentrations are above and below the expected linearity of the method. The result is that
frequently one will have a nonlinear data set. The question then arises which extreme points to
eliminate from the data set in order to make the data set linear. Sometimes it is apparent
which points to remove by looking at the data set; but sometimes it is not. Further, using such
subjective means as visual evaluation may introduce bias into the assessment. The purpose of
our study is to develop an objective procedure to evaluate the effect of selectively removing
points from a nonlinear data set. The procedure will have measurable criteria and will be
programmable in a computer.

2. Materials and methods

2.1. Linearity procedure

We follow the method as outlined previously, with a few minor modifications [1]. We fit the
data set to first-order (linear), second-order and third-order polynomials. We evaluate the
coefficients of each variable for statistical significance by applying the t-test. In this case t is
equal to the coefficient divided by the standard error. The null hypothesis is that the coefficient
is not significantly different from zero. When the probability of the null hypothesis is less than
or equal to 0.05, the corresponding coefficient is considered statistically significant. When the
probability of the null hypothesis is greater than 0.05, the corresponding coefficient is
considered non-significant. Assignment of significance to the coefficients delineates which
variables are important contributors to their polynomial fits. If none of the higher-order
variables are important contributors to their polynomial fits, i.e. variables of the forms x² or
x³ have insignificant coefficients, then the data set is presumed linear.
When even one higher-order variable has a statistically significant coefficient, then the data
set is presumed to be nonlinear [1]. Because each value is rounded up or down, an additional
error is introduced into the regressions. We deal with rounding error by calculating the
difference at nine values, assumed equally spaced for x, between the first- and second-order
polynomial fits and the first- and third-order polynomial fits. When all of the differences are
less than a predetermined value or less than a fixed percentage, the data set is presumed linear,
even though nonlinear coefficients may have been statistically significant. Choice of a
predetermined value and the fixed percentage takes some care. For the predetermined value, we
use the sensitivity of the method, i.e. the smallest difference the method can detect between two
different concentrations of analyte. We use a difference of 3% for electrolytes and routine
analytes and 5% for enzymes, immunoassays and other less precise methods.
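A minimal sketch of this significance-plus-tolerance check is given below. It is our own Python illustration, not the authors' software: the helper names poly_fit and is_linear, and the defaults tol and alpha, are ours, only the percentage criterion is implemented (the absolute-sensitivity criterion is omitted), and the relative comparison assumes y is well away from zero.

```python
import numpy as np
from scipy import stats

def poly_fit(x, y, order):
    """Ordinary least-squares polynomial fit. Returns the coefficients
    (highest power first), their t-values (coefficient / standard error),
    two-sided p-values and the residual sum of squares."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    X = np.vander(x, order + 1)                  # columns: x^order, ..., x, 1
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    df_res = len(x) - (order + 1)                # n - r - 1 residual degrees of freedom
    cov = ss_res / df_res * np.linalg.inv(X.T @ X)
    t = coef / np.sqrt(np.diag(cov))
    p = 2 * stats.t.sf(np.abs(t), df_res)
    return coef, t, p, ss_res

def is_linear(x, y, tol=0.03, alpha=0.05):
    """Linearity check in the spirit of Section 2.1: the set is taken as linear
    if no x^2 or x^3 coefficient is significant, or if the first-order fit
    differs from the second- and third-order fits by less than `tol`
    (e.g. 3% for routine analytes, 5% for immunoassays) at nine equally
    spaced x values."""
    c1, _, _, _ = poly_fit(x, y, 1)
    higher = {k: poly_fit(x, y, k) for k in (2, 3)}
    # nonlinear coefficients precede the linear and constant terms
    significant = any(np.any(p[:k - 1] < alpha)
                      for k, (_, _, p, _) in higher.items())
    if not significant:
        return True
    grid = np.linspace(np.min(x), np.max(x), 9)
    y1 = np.polyval(c1, grid)
    return all(np.all(np.abs(np.polyval(c, grid) - y1) <= tol * np.abs(y1))
               for (c, _, _, _) in higher.values())
```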

2.2. Linearity improvement test

In this section, we describe a way to objectively evaluate whether the linearity is improved
or worsened when the reportable range is reduced. The theory supporting the method is that
the mean square of the residuals from the first-order regression would decrease as the linearity
of a data set improves.
The mean square of the residuals (MSr) is given by SSr/DFr, where SSr is the sum of squares of
the residuals and DFr is the degrees of freedom of the residuals. As the number of values in
the reportable range is reduced, the sum of squares is reduced, as is the number of degrees of
freedom. Thus, if the error remains the same after reduction of the reportable range, the mean
square of the residuals would increase. If the error were slightly decreased, the mean square of
the residuals would remain relatively unchanged or could increase slightly. When the error is
greatly reduced, the mean square of the residuals would be decreased.
Suppose that, for a given nonlinear data set, we remove one distinct point, i.e. all the
replicates for a given concentration of analyte. If that distinct point was actually a
contributor to the linearity of the data set, rather than a detractor, then the new linear fit would be
worse than the old one and the error would increase. On the other hand, if we remove one
distinct data point and that distinct data point is a detractor to the overall linearity, rather
than a contributor, the new linear regression would be a better fit than the old regression and
the error would decrease. In this case, examination of the coefficients would not be a sufficient
means to evaluate whether the error had decreased, because the data set may still be nonlinear and
the nonlinear coefficients may still be statistically significant. Changing the data set affects the
error, as measured by the mean square of the residuals, in a continuous fashion, so changes in
linearity can be easily quantified.
The mean square of the residuals is taken from the analysis of variance (ANOVA) table for
the regression. The analysis of variance takes the same format for any order of polynomial, i.e. it
is the same format for the first-, second- and third-order polynomials. The two key variables in
the analysis of variance are the degrees of freedom (DF) and the sum of squares (SS). The
number of degrees of freedom of the residuals is equal to n, the number of samples, minus the
order of the polynomial (r), minus one (n − r − 1). In the case of a second-order regression, the
number of degrees of freedom for the regression would be two. The number of degrees of
freedom for the residuals would be n − 3. The SS is the sum of the squared differences between
each result and its expected value, as determined by the regression model.
The mean square of the residuals represents the error of fitting the data with the regression
equation. There are two sources for this error; one is the random variation present in the data
set (precision error), the other is the bias or inappropriateness of the particular regression
equation used to fit the data (nonlinearity for the first-order line) [2]. We are interested in
quantifying the nonlinear source of error relative to the random variation.
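For example, the residual mean square falls directly out of the fit. The small helper below builds on the hypothetical poly_fit sketch given earlier; residual_mean_square is our name, not the paper's.

```python
def residual_mean_square(x, y, order=1):
    """MSr = SSr / DFr for a polynomial fit of the given order,
    with DFr = n - r - 1 as described above."""
    _, _, _, ss_res = poly_fit(x, y, order)
    df_res = len(x) - (order + 1)
    return ss_res / df_res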
To compare a reduced data set with the complete one, we examine the first-order regressions
and calculate the root mean square for data sets with one or more points removed from the
highest end or one or more points removed from the lowest end. We calculate the percentage
improvement in the fit of a first-order polynomial with the equation

    % improvement = 100 × (√MSr,A − √MSr,B) / √MSr,A

where A denotes the complete data set and B the reduced data set; SSA = sum of squares of the residuals of A;
SSB = sum of squares of the residuals of B; DFA = degrees of freedom of the residuals of A;
DFB = degrees of freedom of the residuals of B; and

    MSr,A = SSA / DFA,    MSr,B = SSB / DFB

We expect that a more linear data set would improve the root mean square of the residuals
by at least 10%. If the percentage difference is less than 10%, the removal of points does not
affect the linear fit of the model.
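In code, the comparison reduces to two residual mean squares and one percentage. Again this is a hypothetical helper of ours, reusing residual_mean_square from above; data set A is the complete set and B the reduced one.

```python
import math

def rms_improvement(x_a, y_a, x_b, y_b):
    """Percentage improvement in the root mean square of the first-order
    residuals when going from data set A to data set B; values of 10% or
    more are taken as a real improvement, negative values mean B is worse."""
    rms_a = math.sqrt(residual_mean_square(x_a, y_a, order=1))
    rms_b = math.sqrt(residual_mean_square(x_b, y_b, order=1))
    return 100.0 * (rms_a - rms_b) / rms_a
```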

2.2.1. Example — urinary free cortisol

2.2.1.1. Extraction column. Clean Screen extraction column (Worldwide Monitoring, 417 Caradean Dr.,
Horsham, PA 19044). Place 5 ml of urine in the column. Measure the cortisol content of the eluent.

2.2.1.2. Cortisol. Cortisol method, TDx (Abbott Laboratories, Abbott Park, IL 60064). This
method is an immunoassay and uses fluorescence polarization as the detection system. The
method was performed according to the manufacturer's instructions.

2.2.1.3. Sample. To a urine pool we added a cortisol standard, then diluted with water to make
11 concentrations. Standard: 10 mg Steroid Screen (Hydrocortisone, lot no. 031321, expiration
date June 1994; Biochemical Diagnostics, Inc., 180 Heartland Blvd., Edgewood, NY 11717).

2.3. Nonlinear data sets

In addition to the cortisol example, three data sets were constructed. The data sets contain a
linear central portion; a nonlinear set of points was added to each linear portion. In one data set
the lower 10 points are not linear; in one set the upper five points are not linear; and in one
set both lower and upper data points are not linear, giving the data set a sigmoid
appearance.

3. Results

We will use the urinary free cortisol example to demonstrate the algorithm, presenting the
raw data in Table 1, analysis in Tables 2–4 and summary in Table 5 (Fig. 1). Linearity analysis
of the entire data set shows that the coefficients for the second- and third-order polynomials are
significant (Table 2) and therefore the data set is nonlinear. The largest percentage differences
are for the two lowest points. The linear (first-order) fit yields 4.5 and 11.1 for the two lowest
values, while the third-order regression yields 2.4 and 8.0; these values differ by 46% and 28%,
respectively. The linear fit differs from the observed values by 411% for the lowest point and
31% for the next-to-lowest point, whereas the third-order regression differs from the observed
by 172% for the lowest point and 4.5% for the next-to-lowest point. At the middle values, the
linear regression yielded a value of 41.0, the second-order regression, a value of 45.2 and the
third-order regression, a value of 45.2, and these values represent differences of 10% between
the linear and both nonlinear regressions. At the highest point, the observed value was 68, the
value from the first-order regression, 78.3, from the second-order regression, 72.4 and from the
third-order regression, 68.9. The regressed values differ from the observed values by 15%,
6.5% and 1.3% for the first-, second- and third-order fits, respectively. The high percentage
differences between the first-order regression and both the observed values and the second- and
third-order regressions indicate that the first-order regression model represents a poor fit of the
data; therefore, the data set is not linear. The high percentage differences at the lowest point
suggest that the lowest point, at an observed value of 0.88, should be
removed. The root mean square for the first-order regression is 5.1, for the second-order
regression, 3.4 and for the third-order regression, 1.7. The fit improves as the order of
regression increases.
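These figures can be cross-checked with the helper functions sketched in Section 2. The snippet below is our own illustration rather than part of the original analysis; it uses the concentration/result pairs listed in Table 1 and should reproduce, up to rounding, the root mean squares and the improvements summarized in Table 5.

```python
# Urinary free cortisol data from Table 1 (cortisol concentration, determined value)
conc = [1.09, 8.83, 17.42, 25.7, 35.7, 45.3, 53.3, 62.4, 74.0, 86.1, 93.7]
meas = [0.88, 8.48, 17.50, 22.8, 32.5, 42.2, 50.1, 60.0, 69.0, 70.0, 68.0]

subsets = {
    "all points":          (conc,       meas),
    "lowest removed":      (conc[1:],   meas[1:]),
    "two lowest removed":  (conc[2:],   meas[2:]),
    "highest removed":     (conc[:-1],  meas[:-1]),
    "two highest removed": (conc[:-2],  meas[:-2]),
}
for label, (xs, ys) in subsets.items():
    gain = rms_improvement(conc, meas, xs, ys)   # vs. the complete data set
    print(f"{label:>20s}: linear={is_linear(xs, ys, tol=0.05)}, "
          f"improvement={gain:6.1f}%")
```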
Removal of the lowest point still leaves the data set nonlinear, because the second- and
third-order coefficients continue to be statistically significant (Table 3).
The same holds true when the two lowest points are removed: the higher-order coefficients (Table 3)
continue to be statistically significant. The root mean square for the first-order regression with the lowest

Table 1
Urinary free cortisol values for linearity

Cortisol concentration (µg/dl) Determined value (µg/dl)

1.09 0.88
8.83 8.48
17.42 17.50
25.7 22.8
35.7 32.5
45.3 42.2
53.3 50.1
62.4 60.0
74.0 69.0
86.1 70.0
93.7 68.0

Table 2
Linearity analysis of the entire data set

Polynomial order    Coefficient    t-value    DF residuals    SS residuals    MS residuals    RMSa

First               x              15.3       9               238             26.4            5.14
Second              x              9.6        8               91.3            11.4            3.38
                    x²             3.6
Third               x              3.3        7               21.1            3.02            1.74
                    x²             3.6
                    x³             4.8

a RMS = root mean square of the residuals.

point removed is 5.2, while that for the second-order regression is 3.37 and for the third-order
regression is 1.4. None of these root mean squares for the data set with the lowest point
removed show any improvement over the complete data set. Again, with the two lowest points
removed, the root mean square for the first-order regression is 5.3, for the second-order
regression, 3.3 and for the third-order regression, 0.66. The third-order and second-order
regression root mean squares show some improvement compared with those for the complete
data set, but the first-order regression root mean square does not.
When the highest point is removed, the data set is still nonlinear because the second- and third-
order coefficients remain significant (Table 4); however, the root
mean squares do demonstrate a decrease in value, with that of the first-order regression being
Table 3
Linearity analysis of reduced data set — lowest and two lowest points removed

Polynomial order    Coefficient    t-value    DF residuals    SS residuals    MS residuals    RMSa

Lowest point removed
First               x              12.8       8               219             27.4            5.23
Second              x              7.9        7               80.1            11.4            3.38
                    x²             3.5
Third               x              0.7        6               11.4            1.9             1.38
                    x²             4.8
                    x³             6.0

Two lowest points removed
First               x              10.4       7               200             28.6            5.35
Second              x              6.7        6               65.4            10.9            3.30
                    x²             3.5
Third               x              2.9        5               2.21            0.44            0.66
                    x²             9.9
                    x³             12.0

a RMS = root mean square of the residuals.

Table 4
Linearity analysis of reduced data set — highest and two highest points removed

Polynomial order    Coefficient    t-value    DF residuals    SS residuals    MS residuals    RMSa

Highest point removed
First               x              23.6       8               78.8            9.86            3.14
Second              x              10.0       7               46.3            6.62            2.57
                    x²             2.2
Third               x              3.4        6               18.1            3.02            1.74
                    x²             2.5
                    x³             3.1

Two highest points removed
First               x              70.3       7               6.2             0.89            0.94
Second              x              17.4       6               5.9             0.99            0.99
                    x²             0.5
Third               x              6.4        5               5.7             1.13            1.06
                    x²             0.6
                    x³             0.5

a RMS = root mean square of the residuals.

3.1 and that of the third-order regression being 1.7. Further, when the two highest data points
are removed from the data, the root mean squares continue to decrease, with values of 0.94 for
the first-order regression, 0.99 for the second-order regression and 1.06 for the third-order
regression. This data set is, in addition, linear because the nonlinear coefficients are no longer
statistically significant (Table 4). Dropping the two highest data
points shows an improvement of 82% in the root mean square of the residuals for the first-
order regression.
Comparison of the root mean squares of the residuals provides a faster and easier means of
finding the linear region. Removing the lowest and two lowest points fails to provide any
improvement in the size of the residuals; in actuality, the situation becomes worse, with the
improvement being −2% and −4% for the data sets with the lowest and two lowest points
removed, respectively (Table 5). On the other hand, removing the highest point provides a 39%

Table 5
Improvement in percentage of root mean square of the residuals

Point(s) removed    Improvement (%)

Lowest              −2a
Two lowest          −4
Highest             39
Two highest         82

a A negative percentage means the root mean square of the residuals increased.

Fig. 1. Cortisol data set. (a) The two upper data points are obviously nonlinear. (b) Removal of the upper two data
points generates a data set that is statistically linear but not obviously so upon visual inspection.

improvement and removing the two highest points, an 82% improvement (Table 5). The size of
the improvement clearly indicates the correct direction.
The data for the three constructed nonlinear sets are shown in Table 6, with graphs in Figs. 2(a), 2(c) and 3(a).

Fig. 2. (a) The lower ten data points are clearly nonlinear. (b) Removal of the ten lower data points produces a
linear data set. (c) The upper five data points are the nonlinear ones, though their nonlinear nature is not obviously
apparent. (d) The data set appears linear after removal of the upper nonlinear points.

Table 6
Data for three nonlinear curves with the lower, upper or both ends of the curve nonlinear

X Lower Upper Sigmoid

0 −0.007 −0.007 −0.007
0.25 0.210 2.585 0.210
0.50 0.271 4.896 0.271
0.75 0.674 7.424 0.674
1.00 1.475 10.225 1.475
1.25 2.082 12.582 2.082
1.50 3.021 15.146 3.021
1.75 3.880 17.505 3.880
2.00 4.914 19.914 4.914
2.25 6.259 22.509 6.259
2.50 7.534 25.034 7.534
2.75 8.659 27.409 8.659
3.00 10.032 30.032 10.032
3.25 11.252 32.502 11.252
3.50 12.436 34.936 12.436
3.75 13.851 37.601 13.851
4.00 15.021 40.021 15.021
4.25 16.270 42.520 16.270
4.50 17.452 44.952 17.452
4.75 18.715 47.465 18.715
5.00 19.967 49.967 19.967
5.25 21.249 52.499 21.249
5.50 22.706 55.206 22.706
5.75 23.732 57.482 23.732
6.00 25.057 60.057 25.057
6.25 26.528 62.778 26.528
6.50 27.501 65.001 27.501
6.75 28.913 67.663 28.913
7.00 30.058 70.058 30.058
7.25 31.280 72.530 31.280
7.50 32.473 74.973 32.473
7.75 33.744 77.494 33.744
8.00 34.975 79.975 34.975
8.25 36.053 82.303 36.053
8.50 37.500 84.875 37.500
8.75 38.715 87.090 38.715
9.00 39.704 88.954 39.579
9.25 40.818 90.818 40.443
9.50 42.290 92.915 41.540
9.75 44.027 95.152 42.777
10.00 45.003 96.503 43.253

The points in the data sets have an overall imprecision of about one to two
percent over the entire range. Removal of points from the linear portion of the original set never
decreased the root mean square of the residuals by 10% or more. Typically, removal of such
points actually increased the value (Table 7). Removal of the nonlinear points decreased the
root mean square by more than 10% in all three cases (Table 7), even when only one out of
many such points was removed (a 24% improvement). With the sigmoid-shaped nonlinear
data set, removal of the upper nonlinear points after the lower nonlinear points had been
removed still reduced the root mean square by 72% (Table 7).
When only one aberrant point was introduced into a linear data set of 37 points, the
contamination of the data set by this one outlying point failed to make the data set nonlinear,
even though the aberrant point differed from the original point by more than 35% (17.5 to
24.0). The slope with the aberrant point was 4.968 ± 0.0672 and the y-intercept, −4.6815 ± 0.4111,
while the uncontaminated data set had a slope of 4.9944 ± 0.0077 and a y-intercept of
−4.9928 ± 0.0470 (a difference of 0.5% for the slope and 6.6% for the y-intercept). The difference in the

Fig. 3. (a) The lower three points are obviously nonlinear, but not necessarily the lower fifth through tenth points,
nor the upper five points. (b) Removal of the lower ten points generates an improvement in the linearity, but even
though the data looks linear, it still retains significant nonlinearity (Table 7). (c) Removal of the upper five
nonlinear points improves the linearity, but still leaves a nonlinear data set. (d) Removal of both upper and lower
nonlinear points finally produces a linear data set.

Table 7
Root mean squares of the residuals and percent differences based on the nonlinear data sets with the lower, upper
or both ends nonlinear

Points used in analysis            Root mean square    % Change    Is % change significant?

Lower points nonlinear
All                                0.910               –           –
Upper points removed               0.941               3.4         no a
Lower points removed               0.140               85          yes

Upper points nonlinear
All                                0.646               –           –
Upper points removed               0.099               85          yes
Lower points removed               0.675               4.5         no a

Both lower and upper points nonlinear (sigmoid curve)
All                                0.888               –           –
Upper points removed               0.941               6           no a
Lower points removed               0.342               61          yes
Upper and lower points removed     0.0935              72          yes b

a The percentage change is in the wrong direction, showing an increase in the root mean square of the residuals
instead of a decrease.
b The percentage change represents the decrease from the data set with the lower points removed to one where the
upper points are removed as well.

slope was 0.023 and in the y-intercept, 0.311; both differences are less than the standard errors
of the slope and y-intercept, respectively, for the contaminated data set.

4. Discussion

Establishing the linear (reportable) range at the manufacturer's level is an important quality
procedure for every method. Likewise, the user needs to verify the manufacturer's claim for
linearity as well as their own. Such verification needs to be more rigorous if the method is new for the
manufacturer or the laboratory.
Manufacturers and users would like to have the widest linearity ranges possible and
therefore should challenge the method beyond its linearity at both the high and low end. Often
one will obtain nonlinear results. The problem then becomes determining from which end of the
range to remove points to make the data set linear, and whether the high and low points
cause the nonlinearity or the nonlinearity is inherent in the data set.
In the data set for the urinary cortisol, the percentage differences between the linear and
nonlinear regression analyses were greatest for the lowest cortisol values. These differences
mistakenly pointed to the lowest concentrations in the data set as the major contributors to the
nonlinearity. Thus, the percentage difference often may lead one astray and incorrectly
implicate a portion of the curve as the nonlinear contributor.

A rational approach is needed for determining the points that contribute to nonlinearity and therefore
need to be removed from the data set, because the nonlinear points are not always
intuitively obvious. Furthermore, a rational approach can be written into a computer program
and run automatically.
As viewed from its graph, the upper two points of the cortisol data are obviously nonlinear
(Fig. 1). The nonlinearities of the remaining three data sets are not quite as obvious; even with
many more points, these sets still show great reductions in the root mean squares of the residuals
when the appropriate nonlinear points are removed (Table 7, Figs. 2 and 3).
The algorithm proposed in this manuscript compares the root mean squares of the residuals
between data sets, a quantity that decreases as the first-order regression improves in its fit of the
data [2–4]. Use of the mean square of the residuals (SS/DF) controls for the number of degrees
of freedom in the regression model and the number of points in the regression. Further, by using the root
mean square of the residuals (equivalent to the standard error of the estimate), the actual value
is on the same scale as that of the y-values. As one increases the order of the regression,
making the regression model better, or removes points in the nonlinear regions of the curve,
making the data set better, one decreases the root mean square of the residuals. Once the ideal
model has been reached, or the data set is linear, the values for the root mean square of the
residuals no longer decrease, but instead plateau.
The error due to bias or inappropriateness of fit by an incomplete regression model
comprises almost the entire sum of squares of the residuals (SSr). There is
a minor contribution from the deleted point, but its contribution should be small relative to
the random error (also known as the pure error) from the rest of the points. If the random
error of the removed point is much greater than the average random error of the other points
in the data set, one can subtract its contribution from the sum of squares of the residuals.
We have described a rational mechanism to evaluate the suitability of eliminating points
from a data set, either from the upper or lower end, to make the data set more linear. The
algorithm does not require visual inspection of the data points, but does mimic the thought
processes used to subjectively evaluate the nonlinear aspects of a data set on a graph. The
algorithm is easy to perform and requires only minimal calculations, because almost all
regression programs provide the root mean square of the residuals in their results. We chose a
percentage difference as a cutoff for convenience purposes and from experience; changes that
improve the regression fit usually alter the difference between root mean squares by more than
10% [2]. The algorithm could be used in a computer program or for teaching and can easily be
performed from data already produced during the linear regression analysis. One should use
this algorithm because it requires a minimal amount of further calculations, is quick and is
based on well-known properties of linear and nonlinear data sets.
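As one possible embodiment of such a program, the trimming loop below strings together the earlier sketches (our own orchestration, not code published with the paper): it repeatedly drops whichever end point yields the larger improvement in the first-order root mean square, provided the gain is at least 10%, and stops once the remaining set passes the linearity test. On the Table 1 cortisol data this loop should end up removing the two highest points, in line with the result reported above.

```python
def find_linear_region(x, y, tol=0.05, min_improvement=10.0):
    """Iteratively trim a nonlinear data set from its ends until it is linear.
    `tol` is the allowable relative difference between fits (Section 2.1) and
    `min_improvement` the minimum percentage gain in the first-order RMS of
    the residuals required to justify dropping an end point (Section 2.2)."""
    x, y = list(x), list(y)
    while len(x) > 4 and not is_linear(x, y, tol=tol):
        candidates = {"low": (x[1:], y[1:]), "high": (x[:-1], y[:-1])}
        gains = {end: rms_improvement(x, y, xs, ys)
                 for end, (xs, ys) in candidates.items()}
        best = max(gains, key=gains.get)
        if gains[best] < min_improvement:
            break                      # neither end point is clearly the culprit
        x, y = candidates[best]        # drop the offending end point
    return x, y
```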

Acknowledgements

We thank Rosario Delgado and Dr. Clifford Eng for providing us with the data on urinary
free cortisol.

References

[1] M.H. Kroll, K. Emancipator, A theoretical evaluation of linearity, Clin. Chem. 39 (1993) 405–413.
[2] N.R. Draper, H. Smith, Applied Regression Analysis, second ed., John Wiley and Sons, New York, 1981, pp. 33–42, 102–107, 117–121, 294–313.
[3] J. Mandel, Statistical Analysis of Experimental Data, Dover Publications, Inc., New York, 1964, pp. 164–165.
[4] K. Emancipator, M.H. Kroll, A quantitative measure of nonlinearity, Clin. Chem. 39 (1993) 766–772.

Dr. Martin H. Kroll is currently Associate Professor and Associate Director of Clinical Chemistry, Department of Pathology, The
Johns Hopkins School of Medicine, 600 N. Wolfe St., Meyer B-125, Baltimore, Maryland 21287-7065. He received a BS in
Chemistry from the University of Maryland, College Park, Maryland in 1974. He earned his medical degree at the University of
Maryland School of Medicine, Baltimore, Maryland in 1978. Then he trained in pathology at the University of Maryland Hospital,
also in Baltimore. Upon finishing his residency, he went to the Clinical Center at the National Institutes of Health, Bethesda,
Maryland, training in a fellowship in Clinical Pathology with a subspecialty in Clinical Chemistry. Upon finishing his fellowship,
he joined the staff in the Clinical Chemistry Service of the Clinical Pathology Department at the National Institutes of Health,
remaining there until 1993 when he joined the Pathology Department at Johns Hopkins. He is certified as a chemist by the American
Chemical Society, and has been certified as a physician by the National Board of Medical Examiners, and in Anatomic and
Clinical Pathology by the American Board of Pathology. Awards he has received include a National Science Foundation
Summer Research Fellowship in 1973; a Young Investigator Award, Academy of Clinical Laboratory Physicians and Scientists, in
1983; numerous Quality Service and Performance awards from the National Institutes of Health, Department of Health and
Human Services; and the Joseph H. Roe Award, in recognition of contributions in the field of Clinical Chemistry, in 1995, from the
American Association for Clinical Chemistry, Capital Section. He is an active member of the American Association for Clinical
Chemistry, American Chemical Society, College of American Pathologists, and The Society of Mathematical Biology. He has been
active in the American Association for Clinical Chemistry, serving on the Arnold O. Beckman Conference committee and recently
elected as Chair-elect of the Lipids and Lipoproteins Division. He serves as chairman of the Instrumentation Resource Committee
of the College of American Pathologists, where he has been actively involved in surveys that assess linearity of clinical laboratory
methods. He has published over 50 research articles, several establishing modern approaches to assessing linearity for laboratory
methods. In addition, he has spoken widely on many topics, including the statistical assessment of linearity.

Kenneth Emancipator, MD is Director of Clinical Chemistry at Beth Israel Medical Center in New York City. He is also the
Chairholder of the American Society of Clinical Pathologists' Council on Clinical Chemistry and Past Chairman of the American
Association for Clinical Chemistry's New York Metro Section. His expertise includes applications of statistics, computers, and
robotics in the clinical laboratory; diagnosis and monitoring of diabetes mellitus; and strategic planning for the clinical laboratory.
Dr. Emancipator received his A.B. degree with honors in physics from Harvard University in 1979, and his M.D. degree from St.
Louis University in 1983. He is certified in anatomic and clinical pathology by the American Board of Pathology.

Dr. Floering is Medical Director of The Christ Hospital Laboratory and the Chemistry Section of the Alliance Laboratory Services
for the Health Alliance of Cincinnati, Ohio. He is a past member and Chairman of the Instrumentation Resource Committee for
the College of American Pathologists. He was involved in the start and early development of the Linearity Surveys for the
Instrumentation Resource Committee.

Dan Tholen is an independent consultant in statistics and laboratory quality systems, now living in Traverse City, Michigan. He
also teaches statistics at the college level, and serves as an assessor of laboratory quality systems and interlaboratory proficiency
testing programs. He serves a broad range of clients in the medical and environmental laboratory industries. Prior to ``moving
north and going solo'', Dan's work experience included research into burn etiology and survival for the National Institute for Burn
Medicine (Ann Arbor, MI), and research and development of interlaboratory comparison programs for the College of American
Pathologists (Northfield, IL). He is the author or co-author of 29 articles or chapters in textbooks. He has served on committees
for numerous professional organizations, and has co-authored consensus standards and guidelines for the International
Organization for Standardization (ISO), American Society for Testing and Materials (ASTM), and National Committee for
Clinical Laboratory Standards (NCCLS). Dan's experience with problems in linearity testing grew out of the development of an
interlaboratory comparison program in linearity testing for medical laboratories. This required automated processing of linearity
data from over 3000 laboratories for a wide variety of quantitative laboratory tests.
