
Francis X. McConville
Impact Technology Consultants

Functions for Easier Curve Fitting
An overview of empirical relations that can be used to fit your data
Engineers often need to fit experimental data to an empirical relationship for extrapolation or modeling without resorting to a full mathematical treatment based on physical principles or theory. The most widely used platform in science for doing this is MS Excel, and while its "off-the-shelf" Trendline tool is useful, it is limited to only 4 or 5 simple functions. Excel's Solver add-in, on the other hand, offers a simple, powerful means to fit data to user-defined functions.
There are numerous commercial packages for data fitting and statistical analysis [1], but Excel is ubiquitous, and as E. G. John [2] aptly puts it, the use of Solver for data fitting is "simplicity itself". More recently, in this publication, Du Plessis [3] further extols the virtues of Excel's Solver function and describes in detail how to use it for this purpose. For a synopsis, see the box "A Quick Look at Using Excel Solver" (p. 51).
Once one has identified an appropriate function and achieved a good fit, more-advanced modeling tasks become easier: using calculus, one can obtain derivatives and integrals, and thus rates of change, areas under curves, and so on.
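As a minimal sketch of what this makes possible, the snippet below takes a two-parameter exponential, y = a*exp(b*x), and evaluates its slope and the area under it in closed form. The parameter values are hypothetical stand-ins for the output of a fit, and the function names are illustrative only.

```python
import math

# Hypothetical fitted parameters for y = a*exp(b*x); illustrative values only.
a, b = 2.0, -0.5

def y(x):
    """Fitted curve."""
    return a * math.exp(b * x)

def dy_dx(x):
    """Analytical rate of change of the fitted curve."""
    return a * b * math.exp(b * x)

def area(x0, x1):
    """Analytical area under the fitted curve between x0 and x1."""
    return (a / b) * (math.exp(b * x1) - math.exp(b * x0))

def area_trapezoid(x0, x1, n=1000):
    """Numerical cross-check of the analytical area (trapezoid rule)."""
    h = (x1 - x0) / n
    total = 0.5 * (y(x0) + y(x1))
    for i in range(1, n):
        total += y(x0 + i * h)
    return total * h

slope_at_1 = dy_dx(1.0)     # rate of change at x = 1
area_0_to_4 = area(0.0, 4.0)  # area under the curve from x = 0 to x = 4
```

The trapezoid routine is included only as a sanity check; with a closed-form model the analytical expressions are normally all that is needed.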
But selecting the best function for the curve-fitting exercise is never trivial. To simplify the task, this article illustrates a collection of 52 common one-, two- and three-parameter binary functions that cover quite a wide range of behavior and should provide a good starting point. Even the one- and two-parameter relationships can be very versatile in spite of their simplicity.
[Figure 1: curve-shape panels. Panel titles recoverable from the source: Shifted reciprocal; Simple exponential; Modified power; Pareto; Hyperbolic cosine; Exponential (asymptotic); Exponential (root fit). Equations recoverable from the source: $y = 1/(a - x)$; $y = a^{x}$; $y = 1 - a^{x}$; $y = a\cosh(x/a)$; $y = a^{1/x}$; $y = 1 - e^{-ax}$.]
FIGURE 1. Equations (1) to (12) are functions with one fitting parameter, a
48 CHEMICAL ENGINEERING WWW.CHE.COM DECEMBER 2008
[Figure 2: curve-shape panels. Panels recoverable from the source: Linear, $y = ax + b$; Reciprocal, $y = 1/(a + bx)$; Exponential, $y = ae^{bx}$; Logarithmic, $y = a + b\ln x$; Rational, $y = a/(b + x)$; Exponential, $y = ae^{b/x}$; Power, $y = ax^{b}$; (16) Hyperbola (saturation growth), $y = ax/(b + x)$; Exponential (asymptotic), $y = a(1 - e^{bx})$; (24) Shifted power, $y = a(1 + x)^{b}$.]
FIGURE 2. Equations (13) to (32) are functions with two fitting parameters, a and b
The accompanying figures illustrate the basic curve shapes that these equations generate. The names of well-known functions are included where possible.
For the curve-fitting exercise to be most effective, it is important that the data are in the correct form and that all units are consistent. For example, solubility data can often be fit to one of the simple logarithmic functions, but the best results are obtained if solubility is expressed as mole fraction and not weight percent. Thus, some understanding of the underlying principles proves valuable in selecting and properly applying the best empirical model. It is also best to use the minimum number of parameters that will give a good fit, or else the fit may become meaningless.
[Figure 3: curve-shape panels. Panel titles recoverable from the source: Inverse hyperbola; Logarithmic; Shifted power. Equation recoverable from the source: $y = a + b/x + c/x^{2}$.]
FIGURE 3. Equations (33) to (52) are functions with three fitting parameters, a, b and c

A common technique is linearizing a non-linear equation by rearranging it, thus simplifying the fit process. For example, Equation (9) can be linearized by plotting ln(1 - y) versus x. This results in a straight line with slope -a that passes through 0. Another example is using Lineweaver-Burk plots to linearize enzyme kinetic data. This technique can work well in many cases, but it tends to distort the experimental error and amplify the effect of outlying data points. Thus, parameters derived in this way may not be as accurate as those obtained by fitting data to the native model. The use of a program like Excel obviates the need to utilize such methods.
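To make the comparison concrete, here is a small sketch that estimates a for the asymptotic exponential y = 1 - exp(-a*x) in two ways: from the slope of ln(1 - y) versus x, and by minimizing the sum of squared residuals in the native model. The data values are hypothetical, and the golden-section search is simply one convenient one-parameter minimizer chosen for this illustration, not something the article prescribes.

```python
import math

# Hypothetical noisy measurements that roughly follow y = 1 - exp(-a*x).
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
ys = [0.34, 0.54, 0.71, 0.79, 0.92, 0.95]

def sse(a):
    """Sum of squared residuals in the native (untransformed) model."""
    return sum((1.0 - math.exp(-a * x) - y) ** 2 for x, y in zip(xs, ys))

def fit_linearized():
    """Least-squares line through the origin for ln(1 - y) versus x; a = -slope."""
    num = sum(x * math.log(1.0 - y) for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return -num / den

def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate the minimum of a one-variable function on the bracket [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

a_lin = fit_linearized()                       # estimate from the linearized plot
a_direct = golden_section_min(sse, 0.01, 5.0)  # estimate from the native model
```

The two estimates are usually close but not identical, because the logarithm re-weights the residuals; points with y near 1 carry much more influence in the linearized fit.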
It is also possible to treat a data set as a bimodal distribution and fit the data to two different functions, applying one function above a certain value and another function below that value. For example, a data set may be very linear up to a certain value of x, after which it exhibits exponential behavior. This approach can help the user achieve a purely empirical fit more quickly when required.

A QUICK LOOK AT USING EXCEL SOLVER
The Solver procedure is usually based on the "sum of least squares" approach. A typical spreadsheet setup for an x-y data set would look like this:
Column A: independent variable (x) data
Column B: experimental dependent variable (y) data
Column C: y values calculated using the curve fitting function of interest
Column D: the difference between the values in Column C and Column B (residuals)
Column E: the square of the values in Column D
The values in Column E are then summed to generate a "solver parameter". Solver is set up and run to automatically adjust the values of equation parameters a, b and c in such a way as to minimize the value of the "solver parameter" (click Tools>Solver [4]). This generates the values a, b, and c that provide the best possible fit of the data. A simple spreadsheet demonstrating the approach is available at the author's website www.pprbook.com under "Templates".
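The column layout described in the box "A Quick Look at Using Excel Solver" can also be sketched outside a spreadsheet. The example below is an illustration under stated assumptions: the data are synthetic (generated from y = 5*exp(-0.3*x)), the model is the two-parameter exponential y = a*exp(b*x), and a simple compass (pattern) search stands in for Solver's own optimizer, which is not reproduced here. All function and variable names are hypothetical.

```python
import math

# Columns A and B: synthetic x-y data generated from y = 5*exp(-0.3*x),
# used purely for illustration in place of experimental measurements.
col_a = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0]
col_b = [5.0 * math.exp(-0.3 * x) for x in col_a]

def model(x, a, b):
    """Two-parameter exponential, y = a*exp(b*x)."""
    return a * math.exp(b * x)

def solver_parameter(params):
    """Sum of squared residuals, built column by column as in the box."""
    a, b = params
    col_c = [model(x, a, b) for x in col_a]        # Column C: calculated y
    col_d = [c - y for c, y in zip(col_c, col_b)]  # Column D: residuals
    col_e = [d * d for d in col_d]                 # Column E: squared residuals
    return sum(col_e)

def compass_search(objective, start, step=0.5, tol=1e-9, max_iter=100_000):
    """Adjust one parameter at a time; halve the step when nothing improves."""
    params = list(start)
    best = objective(params)
    iterations = 0
    while step > tol and iterations < max_iter:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    params, best = trial, value
                    improved = True
        if not improved:
            step /= 2.0
        iterations += 1
    return params, best

# Starting values matter (see the discussion of starting values in the text).
(a_fit, b_fit), best_sse = compass_search(solver_parameter, [1.0, -1.0])
```

Because the synthetic data contain no noise, the search should return values very close to a = 5 and b = -0.3; with real data the minimum of the "solver parameter" will be greater than zero.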
For the various asymptotic models, a quick look at the function should indicate the expected behavior. For example, Equations (7) and (27) can be expected to approach a value of unity as x increases. Equation (20) approaches the value a as x increases when b is negative, but approaches 0 at diminishing values of x when b is positive. And functions such as the inverse hyperbola [Equation (35)] and the Chapman model [Equation (48)] asymptote to 0. Functions such as Box-Lucas [Equation (21)] fit data that reach a peak and then begin to decline.
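A quick numerical check of limiting behavior can supplement the visual inspection. The sketch below assumes that Equation (9) is y = 1 - exp(-a*x) and that Equation (20) is y = a*(1 - exp(b*x)), as suggested by the linearization example and the Figure 2 panels; the parameter values are arbitrary.

```python
import math

def eq9(x, a):
    """Assumed form of Equation (9): y = 1 - exp(-a*x)."""
    return 1.0 - math.exp(-a * x)

def eq20(x, a, b):
    """Assumed form of Equation (20): y = a*(1 - exp(b*x))."""
    return a * (1.0 - math.exp(b * x))

# Evaluate each model at a large x to see which value it levels off at.
limit_eq9 = eq9(1000.0, 0.5)          # expected to be close to 1
limit_eq20 = eq20(1000.0, 4.0, -0.5)  # expected to be close to a = 4 when b < 0
value_at_zero = eq20(0.0, 4.0, 0.5)   # the curve passes through 0 at x = 0
```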
Some of the models can exhibit unexpected behavior and unusual curve shapes when negative parameters are used. This raises the point that an important part of running the Solver is selecting the starting values of the fit parameters. Where a physical understanding does not provide a framework for selecting starting values, one must resort to trial and error. The values selected must make sense in the physical world. Checking how well the model fits will usually be accomplished first by eye, but can be quantified by using values such as the R² (goodness of fit) parameter, a unitless fraction that usually ranges in value from 1 for a perfect fit to 0 (or even negative values) for a poor fit.
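For reference, one common definition of R² is one minus the ratio of the residual sum of squares to the total sum of squares about the mean. The sketch below uses that definition with made-up numbers; the article does not state which variant it has in mind, so treat the formula as an assumption.

```python
def r_squared(y_obs, y_calc):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y_obs."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_res = sum((o - c) ** 2 for o, c in zip(y_obs, y_calc))
    ss_tot = sum((o - mean_y) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot

y_obs = [1.0, 2.0, 3.0, 4.0]                      # made-up observations
perfect = r_squared(y_obs, [1.0, 2.0, 3.0, 4.0])  # model reproduces the data
mean_only = r_squared(y_obs, [2.5] * 4)           # model predicts only the mean
reversed_fit = r_squared(y_obs, [4.0, 3.0, 2.0, 1.0])  # model worse than the mean
```

With this definition a model that does no better than the mean of the data scores 0, and a model that does worse than the mean scores below 0, which is how negative values arise.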
Good places to look for more information on curve fitting in general are the websites listed in References [5] and [6]. For a more advanced discussion of proper curve fitting using non-linear regression, multivariate analysis, dealing with outlying data points, applying weighting to the data, and other issues, see References [7-9].
The functions
The functions included here represent a number of the most common curve types: power, exponential, logarithmic and trigonometric, among others. Some are well known and historically important, and many correspond directly to well-known physical models. For example, the Arrhenius kinetic equation $k = Ae^{-E/RT}$ is a classic example of a basic exponential form [Equation (19)], where: x = T; y = rate constant, k; a = Arrhenius constant, A; and b = -E/R.
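The mapping between the generic fit parameters and the physical constants can be written out explicitly. The sketch below assumes Equation (19) has the form y = a*exp(b/x), consistent with the variable assignments above; the pre-exponential factor and activation energy are hypothetical values, and the helper names are illustrative.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def eq19(x, a, b):
    """Assumed form of Equation (19): y = a*exp(b/x)."""
    return a * math.exp(b / x)

def arrhenius_to_fit(A, E):
    """Physical constants -> generic fit parameters (a = A, b = -E/R)."""
    return A, -E / R

def fit_to_arrhenius(a, b):
    """Generic fit parameters -> physical constants (A = a, E = -b*R)."""
    return a, -b * R

# Hypothetical values: A in 1/s, E in J/mol.
A_demo, E_demo = 1.0e10, 75_000.0
a, b = arrhenius_to_fit(A_demo, E_demo)
k_300 = eq19(300.0, a, b)              # rate constant at T = 300 K
A_back, E_back = fit_to_arrhenius(a, b)
```

In practice the direction of interest is usually the second one: fit a and b to rate-constant data, then report E = -b*R.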
Radioactive decay is a commonly cited example of an exponential model [Equation (18)]. The basic form of the equation is $L(t) = L(0)e^{-\lambda t}$. In our context, y = L(t), the number of decays per unit time; x = t, elapsed time; a = initial decay rate; and b = -λ, where λ is the probability of a decay event during one time unit.
Similarly, the dilution of a species in a stirred tank being continuously fed fresh media ([A] = 0) is described by Equation (18). Here the relationship is $[A]/[A_0] = e^{-(q/V)t}$, where $[A_0]$ and [A] are the concentrations of A at time = 0 and at time = t, q and V are the flowrate and system volume, and t is time.
The Michaelis-Menten enzyme kinetic model, $v = V_{max}S/(S + K_m)$, characterized by classic non-competitive substrate inhibition (saturation), fits the form of a hyperbola [Equation (16)]. Here x = substrate concentration, S; y = reaction rate, v; a = $V_{max}$; and b = Michaelis constant, $K_m$.
Other interesting examples are described in Bates and Watts [8]. Here the drop in biological oxygen demand (BOD) at a fixed rate constant was reported to follow an exponential decay of the form of Equation (20). And the change in intercellular concentration of ions due to membrane transport out of the cell is effectively fit to the form of Equation (47).
As with any other endeavor, the deeper the understanding of the mechanisms at work, the easier the selection of an appropriate model will be. Hopefully the equations collected here will simplify your data fitting tasks by highlighting typical behaviors and the models that describe them.
Edited by Gerald Ondrey
References
1. MathCAD, FindGraph, SigmaPlot, OriginLab, XlXtrFun. Easily found through an internet search.
2. John, E. G., Simplified Curve Fitting using Spreadsheet Add-ins, Int. J. Engineering Ed., 14 (5), pp. 375-380, 1998.
3. du Plessis, B. J., Using Spreadsheets as Curve Fitting Tools, Chem. Eng., 114 (5), pp. 66-69, May 2007.
4. If 'Solver' does not appear under the Tools menu in Excel, it may be necessary to activate the add-in by selecting the 'Solver add-in' checkbox under Tools>Add-ins.
5. http://www.aip.org/tip/INPHFA/vol-9/iss-2/p24.html (by Marko Ledvij at The Industrial Physicist)
6. http://www.curvefit.com/ (by GraphPad Software)
7. Draper, N. R. and Smith, H., "Applied Regression Analysis," 3rd edition, John Wiley & Sons, N.Y., 1998.
8. Bates, D. M. and Watts, D. G., "Nonlinear Regression Analysis and its Applications," John Wiley & Sons, N.Y., 1988.
9. Bevington, P. R., "Data Reduction and Error Analysis for the Physical Sciences," McGraw-Hill, N.Y., 1969.
Author
Francis McConville is a senior consultant for Impact Technology Consultants in Lincoln, Mass. He is the author of "The Pilot Plant Real Book - A Unique Handbook for the Chemical Process Industry," and an instructor for the Scientific Update professional training course "Secrets of Batch Process Scale-Up". He has over 25 years of experience in the process industries, including 14 years as a pharmaceutical process development engineer at Sepracor, Inc. McConville holds a B.S. degree in Chemistry and M.S. degrees in Chemical Engineering and in Biotechnology from Worcester Polytechnic Institute in Massachusetts. He is a member of the ACS, ISPE, and a lifetime member of the AIChE. He may be reached at fran@fxmtech.com.