
Mathematical Association of America

Fitting a Logistic Curve to Data


Author(s): Fabio Cavallini
Source: The College Mathematics Journal, Vol. 24, No. 3 (May, 1993), pp. 247-253
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/2686488
Accessed: 27-10-2015 19:26 UTC


This content downloaded from 137.189.170.231 on Tue, 27 Oct 2015 19:26:16 UTC
All use subject to JSTOR Terms and Conditions
COMPUTER CORNER

EDITOR

Eugene A. Herman
Department of Mathematics
Grinnell College
Grinnell, IA 50112

In this column, readers are encouraged to share their expertise and experiences with computers as
they relate to college-level mathematics. Articles that illustrate how computers can be used to enhance
pedagogy, solve problems, and model real-life situations are especially welcome.

Classroom Computer Capsules features new examples of using the computer to enhance teaching.
These short articles demonstrate the use of readily available computing resources to present or
elucidate familiar topics in ways that can have an immediate and beneficial effect in the classroom.

Send submissions for both columns to Eugene A. Herman.

Fitting a Logistic Curve to Data


Fabio Cavallini

Fabio Cavallini received a degree in physics ("laurea cum
laude") from the University of Trieste in 1975. He carried out
research and teaching activities in mathematical system theory
at the International Centre for Mechanical Sciences in Udine and
at the University of Trieste, respectively. From 1978 to 1982 he
was a secondary school teacher in mathematics, physics and
electronics, while following graduate courses in functional analysis
at the International School for Advanced Studies in Trieste.
Since 1982 he has been working at Osservatorio Geofisico
Sperimentale, Trieste, Italy, mainly on applied numerical modelling
in geophysical fluid dynamics and seismic wave propagation.
His favorite hobby is marathon running.

Ecological problems are nowadays of general concern. In particular, in Italy much
attention is being paid to the problem of algal blooms in the Adriatic Sea, and this
has also led to increased interest in mathematical ecology at all levels, from
high school teaching to advanced research. The logistic differential equation, dealt
with in the next section, is a classical but still useful model for describing the
dynamics of a one-species population in an environment with limited resources.
For example, the data in Table 1 and Figure 1 represent the time evolution of an
algal sample taken in the Adriatic Sea [5] and do seem to follow a logistic curve.

We will explore the problem of fitting, in the least-squares sense, a logistic curve
(i.e. the solution to the logistic equation) to such a set of measured data.
This problem is far less trivial than that of fitting a straight line or even a
polynomial, because the logistic curve depends in a nonlinear way on the parameters
that identify the system. The standard advanced method for nonlinear minimization
is the Marquardt algorithm [4], which usually works very well but still suffers
from some drawbacks, especially from the point of view of students and
non-mathematicians. First, it requires an initial point which is seldom obvious to guess;
second, it has no apparent link with analytical minimization techniques such as
putting the derivative equal to zero; and third, it gives no immediate information
on the sensitivity of the estimate, i.e. on how much the error is affected by changes
in the parameters we are estimating. Similar drawbacks exist also for the
trial-and-error method used in [2].
In the present paper a computer method is described which is quite elementary
and yet overcomes in a rigorous way the difficulties arising in the application of the
Marquardt algorithm. Indeed, it starts with an analytic trick that reduces the
number of parameters and then uses the graphical facilities of the software
Mathematica [3] to get a first approximation of the solution and a feeling for the
sharpness of the estimate. The procedure can be iterated until an arbitrary
accuracy is reached. Alternatively, after a first run, an easy-to-use Mathematica
numerical routine can be used to get high accuracy without iterating the graphical
processing. As an illustration, the whole procedure is applied to fit the data in
Figure 1.

Figure 1
Plot of the data listed in Table 1. (Horizontal axis: time in days.)

The Logistic Equation


The logistic ordinary differential equation can be written in the form

$$\frac{dm}{dt} = r\,m\left(1 - \frac{m}{K}\right)$$

where t is time, m = m(t) is biomass and r, K are positive constant parameters.

Using the method of separation of variables, as in [1], or the Mathematica
command DSolve, the general solution to the logistic equation is readily found to
be

$$m(t) = \frac{K}{1 + c\,e^{-rt}}$$

where c is an arbitrary positive constant. Since the second derivative of the logistic
function m is

$$m''(t) = \frac{c K r^2 e^{rt}\,(c - e^{rt})}{(c + e^{rt})^3}$$

we see that m has an inflection point where exp(rt) = c. So if we put c = exp(r t0),
the logistic function becomes

$$m(t) = \frac{K}{1 + e^{-r(t - t_0)}} \qquad (1)$$

and we see that t0 is the abscissa of the inflection point (this fact will be of use in
the next section). Moreover, the logistic function tends asymptotically to K since r
is positive, which leads us to interpret K as a positive saturation value (this fact
also will be of use in the next section).
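
The separation-of-variables computation mentioned above can be sketched as follows. This is only an outline, assuming 0 < m < K so that the logarithms are defined; the integration constant is written as -ln c (an arbitrary choice of form) so that it matches the constant c of the general solution.

```latex
\begin{align*}
\frac{dm}{m\,(1 - m/K)} &= r\,dt,
\qquad
\frac{1}{m\,(1 - m/K)} = \frac{1}{m} + \frac{1}{K - m},\\
\ln m - \ln(K - m) &= rt - \ln c,\\
\frac{m}{K - m} &= \frac{e^{rt}}{c}
\quad\Longrightarrow\quad
m(t) = \frac{K e^{rt}}{c + e^{rt}} = \frac{K}{1 + c\,e^{-rt}}.
\end{align*}
```

With c = exp(r t0), evaluating at t = t0 gives m(t0) = K/2, and the differential equation then gives m'(t0) = r (K/2)(1 - 1/2) = rK/4.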

The Curve Fitting Algorithm


We now show a procedure for fitting, in the least-squares sense, a logistic curve (1)
to a given data set $(t_i, m_i)_{i=1,\ldots,n}$. In symbols, the problem is to minimize, for K, r
and t0 varying in the real line, the error

$$e = \sum_{i=1}^{n} \bigl(m(t_i) - m_i\bigr)^2$$

We take advantage of the fact that the parameter K appears linearly in equation
(1) to reduce the problem to that of minimizing a function of only two parameters,
r and t0, which appear nonlinearly. To do this, we first rewrite equation (1) in the
form

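$$m(t) = K\,h(t)$$

where

$$h(t) = \frac{1}{1 + e^{-r(t - t_0)}}$$

and hence the error can be written as

$$e = K^2 \langle H, H \rangle - 2K \langle H, M \rangle + \langle M, M \rangle \qquad (2)$$

where $\langle\cdot,\cdot\rangle$ indicates scalar product and

$$H = (h(t_1), \ldots, h(t_n)),$$

$$M = (m_1, \ldots, m_n).$$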
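
As a short check of equation (2), assume the error e is the sum of squared residuals between the model values K h(t_i) and the measurements m_i, with H = (h(t_1), ..., h(t_n)) and M = (m_1, ..., m_n). Expanding the square then gives:

```latex
\begin{align*}
e &= \sum_{i=1}^{n} \bigl(K\,h(t_i) - m_i\bigr)^2
   = K^2 \sum_{i=1}^{n} h(t_i)^2
     - 2K \sum_{i=1}^{n} h(t_i)\,m_i
     + \sum_{i=1}^{n} m_i^2 \\
  &= K^2 \langle H, H\rangle - 2K \langle H, M\rangle + \langle M, M\rangle .
\end{align*}
```

For fixed r and t0 the error is therefore a quadratic polynomial in K with positive leading coefficient, which is why K can be eliminated in closed form.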

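A necessary condition for minimizing the error is $\partial e/\partial K = 0$, which yields

$$K = \frac{\langle H, M \rangle}{\langle H, H \rangle}. \qquad (3)$$

By substituting equation (3) into equation (2), we get

$$e = \langle M, M \rangle - \frac{\langle H, M \rangle^2}{\langle H, H \rangle} \qquad (4)$$

and this is the function to be minimized with respect to r and t0.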
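
The elimination of K can also be checked numerically. The following is a minimal sketch in Python (a substitution for illustration; the article itself works in Mathematica), using the data of Table 1. The function names `best_K`, `reduced_error` and `full_error` are hypothetical, and the trial point (r, t0) = (0.1, 50) is just an example.

```python
import math

# Measured data from Table 1 (time in days, biomass in mm^2).
T = [11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74]
M = [0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09]


def dot(u, v):
    """Scalar product <u, v> of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def h_vector(r, t0, times=T):
    """H = (h(t_1), ..., h(t_n)) with h(t) = 1 / (1 + exp(-r (t - t0)))."""
    return [1.0 / (1.0 + math.exp(-r * (t - t0))) for t in times]


def best_K(r, t0):
    """Equation (3): the K that minimizes the error for fixed (r, t0)."""
    H = h_vector(r, t0)
    return dot(H, M) / dot(H, H)


def reduced_error(r, t0):
    """Equation (4): the error with K already eliminated."""
    H = h_vector(r, t0)
    return dot(M, M) - dot(H, M) ** 2 / dot(H, H)


def full_error(K, r, t0):
    """Sum of squared residuals computed directly from the model K h(t)."""
    return sum((K * h - m) ** 2 for h, m in zip(h_vector(r, t0), M))


r, t0 = 0.1, 50.0  # illustrative trial point, not a fitted value
K_star = best_K(r, t0)
print(K_star, reduced_error(r, t0), full_error(K_star, r, t0))
```

The two printed error values should agree up to rounding, and moving K away from `K_star` in either direction should increase `full_error`.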
In order to do this, we shall take advantage of the built-in facilities of Mathematica
according to the following procedure:

• Make a 3D plot and a contour plot of the error function (4) to get an
  appreciation of the position of its minimum.
• Refine the estimate of the minimizing point (r, t0) using the built-in Mathematica
  function FindMinimum and taking as a starting value the first approximation
  for (r, t0) just obtained by visual inspection of the plots.
• Compute K from equation (3) using the values of (r, t0) just obtained from
  FindMinimum.
• Plot the measured data and the fitted curve as a final output.

An initial difficulty is that the user must give the plotting command lower and
upper bounds for parameters r and t0. This is readily done: first, bounds for t0 can
be guessed directly from the graph of the measured data, since t0 is the abscissa of
the inflection point; second, we can estimate r from the formula r = 4m'(t0)/K,
where the saturation value K can be chosen to be the maximum measured value
and m'(t0) can be the difference quotient of a pair of data points near the
inflection point; third, bounds for r can be obtained by multiplying its estimated
value by suitably small or large factors.
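
The rough estimates just described can be sketched in a few lines of Python (a substitution for illustration; the article works in Mathematica). Two choices here are assumptions rather than part of the text: the inflection point is taken to lie midway between the pair of consecutive data points with the steepest difference quotient, and the widening factors 0.1 and 5 for the bounds on r are hypothetical.

```python
# Measured data from Table 1 (time in days, biomass in mm^2).
T = [11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74]
M = [0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09]


def initial_guess(times, biomass):
    """Rough (K, t0, r) read off the data, before any minimization."""
    K0 = max(biomass)  # saturation value taken as the largest measurement
    # Difference quotients of consecutive points; the steepest one is
    # assumed to straddle the inflection point.
    slopes = [(biomass[i + 1] - biomass[i]) / (times[i + 1] - times[i])
              for i in range(len(times) - 1)]
    i = max(range(len(slopes)), key=slopes.__getitem__)
    t0_guess = 0.5 * (times[i] + times[i + 1])
    r_guess = 4.0 * slopes[i] / K0  # from r = 4 m'(t0) / K
    return K0, t0_guess, r_guess


K0, t0_guess, r_guess = initial_guess(T, M)
# Hypothetical widening factors for the plotting window in r.
r_min, r_max = 0.1 * r_guess, 5.0 * r_guess
print(K0, t0_guess, r_guess, (r_min, r_max))
```

For the data of Table 1 the steepest segment lies between days 39 and 44, so this sketch gives t0 = 41.5 and r of about 0.118, with a window in r of roughly 0.012 to 0.59.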

Example. We first use the measured data in Table 1 to define the vector of
sampling times T = {11,..., 74} and the vector of measured biomass M =
{0.00476,...,5.09}. Then, the four steps listed in the previous section are worked
out in Mathematica as follows:

Table 1
Measured data*

time (days)      11       15      18      23      26     31    39   44    54   64   74
biomass (mm^2)   0.00476  0.0105  0.0207  0.0619  0.337  0.74  1.7  2.45  3.5  4.5  5.09

* Time is expressed in days and biomass is expressed in mm^2, since what is actually measured is
the surface covered by biomass in a microscope sample.

Step 1. We define the vector

H = 1./(1. + Exp[-r (T - t0)])

containing the symbols r and t0. The error function given in formula (4) is then

obtained through

error = (M.M) - (H.M)^2/(H.H)

Next, we estimate the bounds on t0 and r as described above. For example,
reasonable estimates are t0Min = 11, t0Max = 74, rMin = 0.01 and rMax = 0.6.
Then we produce the 3D plot in Figure 2 and the contour plot in Figure 3. For
example, the latter is produced by

ContourPlot[error, {r, 0.01, 0.6}, {t0, 11, 74}]

Figure 2
Three-dimensional plot of the error e as a function of the growth rate r and of the
inflection time t0.

From these figures we may choose, as a first approximation, r = 0.1 and t0 = 50.
The corresponding error can be computed using

error /. {r -> 0.1, t0 -> 50}

Its value is 0.47.
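
Where no plotting facility is at hand, the visual inspection of Step 1 can be imitated by tabulating the error on a grid over the same window. The following Python sketch is such a stand-in, not the article's Mathematica procedure; the grid spacing (0.01 in r, one day in t0) is an arbitrary assumption.

```python
import math

# Measured data from Table 1 (time in days, biomass in mm^2).
T = [11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74]
M = [0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09]


def reduced_error(r, t0):
    """Error function (4): <M,M> - <H,M>^2 / <H,H>."""
    H = [1.0 / (1.0 + math.exp(-r * (t - t0))) for t in T]
    HM = sum(h * m for h, m in zip(H, M))
    HH = sum(h * h for h in H)
    MM = sum(m * m for m in M)
    return MM - HM ** 2 / HH


# Grid over the same window as the plots: r in [0.01, 0.6], t0 in [11, 74].
r_values = [0.01 * i for i in range(1, 61)]
t0_values = [float(t) for t in range(11, 75)]

# Smallest tabulated error together with the grid point where it occurs.
err_best, r_best, t0_best = min(
    (reduced_error(r, t0), r, t0) for r in r_values for t0 in t0_values
)
print(r_best, t0_best, err_best)
print(reduced_error(0.1, 50.0))  # the visually chosen point from the text
```

The grid point with the smallest error plays the role of the first approximation read off the plots, and it can serve as the starting value for the refinement in Step 2.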

Step 2. The statement

FindMinimum[error, {r,0.1}, {t0,50}]

gives the final estimates r = 0.12 and t0 = 45.8 with a corresponding error of 0.24.

Step 3. The saturation value K follows from equation (3) by

((H.M)/(H.H)) /. {r -> 0.12, t0 -> 45.8}

which evaluates to 5.1.
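
Steps 2 and 3 can be reproduced without Mathematica by any local minimizer. The Python sketch below uses a simple derivative-free compass search, which is a swapped-in technique and makes no claim to be what FindMinimum does internally. The function name `compass_search`, the initial step sizes (0.01 in r, 1 day in t0) and the stopping threshold are all assumptions.

```python
import math

# Measured data from Table 1 (time in days, biomass in mm^2).
T = [11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74]
M = [0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09]


def h_vector(r, t0):
    return [1.0 / (1.0 + math.exp(-r * (t - t0))) for t in T]


def reduced_error(r, t0):
    """Error function (4), with K already eliminated."""
    H = h_vector(r, t0)
    HM = sum(h * m for h, m in zip(H, M))
    return sum(m * m for m in M) - HM ** 2 / sum(h * h for h in H)


def compass_search(f, start, steps, shrink_limit=1e-6, max_iter=20000):
    """Try +/- a step along each axis; halve the steps when nothing improves."""
    x, fx = list(start), f(*start)
    scale = 1.0
    for _ in range(max_iter):
        if scale < shrink_limit:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * scale * steps[i]
                f_trial = f(*trial)
                if f_trial < fx:  # accept only strict decreases
                    x, fx, improved = trial, f_trial, True
        if not improved:
            scale *= 0.5
    return x, fx


start = (0.1, 50.0)  # first approximation chosen from the plots
(r_fit, t0_fit), err_fit = compass_search(reduced_error, start, steps=(0.01, 1.0))

# Step 3: saturation value from equation (3) at the refined point.
H = h_vector(r_fit, t0_fit)
K_fit = sum(h * m for h, m in zip(H, M)) / sum(h * h for h in H)
print(r_fit, t0_fit, err_fit, K_fit)
```

The refined values should lie close to the estimates quoted in the text, although the last digits depend on the minimizer and its tolerances.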

Figure 3
Contour plot of the error as a function of the growth rate and of the inflection time.
(Horizontal axis: growth rate, r.)

Step 4. As a final step, we produce an overlay of the fitted curve on top of the
original data by

Fig4a = ListPlot[Transpose[{T, M}]]

Fig4b = Plot[5.1/(1 + Exp[-0.12 (t - 45.8)]), {t, Min[T], Max[T]}]

Show[Fig4a,Fig4b]

which yields Figure 4.
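
As a text-only counterpart to the overlay, the fitted curve can be tabulated against the measurements. This Python sketch uses the rounded estimates K = 5.1, r = 0.12 and t0 = 45.8 quoted in Steps 2 and 3; the table layout is an arbitrary choice.

```python
import math

# Measured data from Table 1 (time in days, biomass in mm^2).
T = [11, 15, 18, 23, 26, 31, 39, 44, 54, 64, 74]
M = [0.00476, 0.0105, 0.0207, 0.0619, 0.337, 0.74, 1.7, 2.45, 3.5, 4.5, 5.09]

K, r, t0 = 5.1, 0.12, 45.8  # rounded estimates quoted in Steps 2 and 3


def logistic(t):
    """Fitted logistic curve, equation (1)."""
    return K / (1.0 + math.exp(-r * (t - t0)))


fitted = [logistic(t) for t in T]
residuals = [m - f for m, f in zip(M, fitted)]
sse = sum(d * d for d in residuals)

print(f"{'time':>5} {'measured':>9} {'fitted':>8} {'residual':>9}")
for t, m, f, d in zip(T, M, fitted, residuals):
    print(f"{t:5d} {m:9.4f} {f:8.4f} {d:9.4f}")
print(f"sum of squared residuals: {sse:.3f}")
```

Because the parameters are rounded, the sum of squared residuals need not coincide exactly with the minimum error, but it should be close to the value of 0.24 reported in Step 2.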

Conclusions

Ecological modelling is a strong motivation for tackling the nontrivial mathematical
problem of fitting a logistic curve to a given set of data. We have seen a
rigorous, though elementary, algorithm which solves the problem. This method
also permits us literally to see how the error changes as a function of the
parameters, by using a combination of analytical, numerical and graphical techniques.
Mathematica is very suitable for this kind of hybrid computation since its
instructions encompass all these areas and mimic very closely mathematical notation.

Acknowledgment. Thanks are due to Professor E. Feoli for discussions on the biological background.

Figure 4
Optimally fitted logistic curve and measured data. (Horizontal axis: time in days.)

References

1. C. W. Clark, Mathematical Bioeconomics: The Optimal Management of Renewable Resources, Wiley,
New York, 1976.
2. R. Hunt, Plant Growth Curves. The Functional Approach to Plant Growth Analysis, Edward Arnold,
1984.
3. Mathematica, ver. 2.0, Wolfram Research, Inc., Champaign, IL, 1991.
4. W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes, Cambridge
University Press, Cambridge, MA, 1986.
5. M. Zangrandi, Growth Rate of Algae of Genera Fosliella and Pheophyllum in the Gulf of Trieste,
Thesis (in Italian), University of Trieste, 1991.

