Mathematical Modeling: My Students February, 2010

Mathematical Modeling
My Students
February, 2010
It is mostly based on the textbook by Frank R. Giordano, Maurice D. Weir, and William P. Fox, A First Course in
Mathematical Modeling, 3rd Ed., and it has been reorganized and retyped by Jae Lee.
Spring, 2010
Contents

Chapter 1 Modeling Change
  1.1 Modeling Change with Difference Equations
  1.2 Approximating Change with Difference Equations
      Topic I. Discrete Versus Continuous Change
      Topic II. Model Refinement: Modeling Births, Deaths, and Resources
  1.3 Solutions to Dynamical Systems
      Topic I. Method of Conjecture
      Topic II. Homogeneous Linear Dynamical System $a_{n+1} = r a_n$, r constant
      Topic III. Long-Term Behavior of $a_{n+1} = r a_n$, r constant
      Topic IV. Nonhomogeneous Linear Dynamical System $a_{n+1} = r a_n + b$, r and b constant
      Topic V. Finding and Classifying Equilibrium Values
      Topic VI. Nonlinear Systems
  1.4 Systems of Difference Equations

Chapter 2 The Modeling Process
  2.1 Mathematical Models
  2.2 Modeling Using Proportionality
      Topic I. Introduction
      Topic II. Geometric Interpretation
      Topic III. Modeling Vehicular Stopping Distance
  2.3 Modeling Using Geometric Similarity
      Topic I. Introduction
      Topic II. Testing Geometric Similarity
  2.4 Automobile Gasoline Mileage
  2.5 Body Weight and Height, Strength and Agility

Chapter 3 Model Fitting
      Topic I. Relationship Between Model Fitting and Interpolation
      Topic II. Sources of Error in the Modeling Process
  3.1 Fitting Models to Data Graphically
      Topic I. Visual Model Fitting with the Original Data
      Topic II. Transforming the Data
  3.2 Analytic Methods of Model Fitting
      Topic I. Chebyshev Approximation Criterion
      Topic II. Minimizing the Sum of the Absolute Deviations
      Topic III. Least-Squares Criterion
      Topic IV. Relating the Criteria
  3.3 Applying the Least-Squares Criterion
      Topic I. Fitting a Straight Line
      Topic II. Fitting a Power Curve
      Topic III. Transformed Least-Squares Fit
  3.4 Choosing a Best Model

Chapter 7 Discrete Optimization Modeling
  Section 7.4 Linear Programming III: Simplex Method

Chapter 8 Dimensional Analysis
  5.1 Introduction
  5.2 Dimensions as Product

Chapter 10 Modeling with a Differential Equation
  10.5 Numerical Approximation Method

Chapter 11 Modeling with Systems of Differential Equations
  11.1 Graphical Solutions of Autonomous Systems of First-Order Differential Equations
Chapter 1
Modeling Change
A mathematical model is an idealization of a real-world phenomenon and never a completely accurate
representation. In modeling our world, we are often interested in predicting the value of a variable at some
time in the future, such as a population, a real estate value, or the number of people with a communicable
disease.
[Figure 1.1: A flow of the modeling process beginning with an examination of real-world data. Diagram labels: Real-World Data, Simplification, Model, Analysis, Mathematical Conclusions, Interpretation, Predictions/Explanations, Verification.]
One very powerful simplifying relationship is proportionality.
Definition 1.0.1. Two variables, x and y, are proportional (to each other) if there is a nonzero constant k such
that $y = kx$. We write $y \propto x$.
When x and y are proportional, the graph of y versus x is a straight line passing through the origin, and one
of our concerns is to find the constant of proportionality k. Moreover, we observe that x and y are
proportional if and only if $y/x$ or $x/y$ or $(y/x)^p$ or $(x/y)^p$ is constant, where p is any nonzero real number.
Example 1.0.2. Consider a spring-mass system. An experiment gives the following table.

Mass (m)    Elongation (e)
   50          1.000
  100          1.875
  150          2.750
  200          3.250
  250          4.375
  300          4.875
  350          5.675
  400          6.500
  450          7.250
  500          8.000
  550          8.750
A simple computation shows that the ratio of e to m is roughly constant:
$$\frac{e}{m} \approx 0.0171,$$
where the number is the average of the ratios $e_i/m_i$, $i = 1, 2, \ldots, 11$. Since the ratio $e/m$ is roughly
the constant 0.0171, we may say that m and e are proportional, with the relation $e = 0.0171m$.
When we plot the points (m, e), we observe that a graph close to all the points looks like a straight line
passing through the origin. See Figure 1.2.
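The averaging step in Example 1.0.2 can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the data are the eleven (m, e) pairs from the table.

```python
# Spring-mass data from Example 1.0.2: mass m and elongation e.
masses = [50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
elongations = [1.000, 1.875, 2.750, 3.250, 4.375, 4.875,
               5.675, 6.500, 7.250, 8.000, 8.750]

# Ratio e_i / m_i for every measurement, then the plain average.
ratios = [e / m for e, m in zip(elongations, masses)]
k = sum(ratios) / len(ratios)

print(f"average e/m = {k:.4f}")
```

The individual ratios range from about 0.016 to 0.020, which is why the notes describe the ratio as only "roughly" constant.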
A paradigm to use in modeling change is

future value = present value + change,
i.e.,
change = future value - present value.
If the behavior being modeled takes place over discrete time periods, the preceding construction leads to
a difference equation. If it takes place continuously with respect to time, it leads to a
differential equation.
Figure 1.2: Data from spring-mass system with proportionality line

1.1 Modeling Change with Difference Equations.
Definition 1.1.1. For a sequence of numbers $A = \{a_0, a_1, a_2, \ldots\}$, the difference $a_{n+1} - a_n$ is called the nth
first difference and is denoted by $\Delta a_n$, i.e.,
$$\Delta a_n = a_{n+1} - a_n, \qquad n = 0, 1, 2, \ldots.$$
Geometrically, the first difference represents the vertical change in the graph of the sequence during one time
period.
Example 1.1.2 (Savings Certificate). Consider the value of a savings certificate initially worth $1000 that
accumulates interest paid each month at 1% per month.
Let $a_n$ be the value of the certificate after n months. Then $a_0 = 1000$ and, because of the interest,
$$\Delta a_0 = a_1 - a_0 = 0.01a_0$$
$$\Delta a_1 = a_2 - a_1 = 0.01a_1$$
$$\vdots$$
$$\Delta a_n = a_{n+1} - a_n = 0.01a_n.$$
The last equation can be rewritten as
$$a_{n+1} = a_n + 0.01a_n = 1.01a_n,$$
i.e.,
$$a_{n+1} = 1.01a_n \quad\text{with}\quad a_0 = 1000,$$
which is called the dynamical system model; the equation itself is called the dynamical system.
Example 1.1.3. Consider the value of a savings certificate initially worth $1000 that accumulates interest
paid each month at 1% per month. We withdraw $50 from the account each month.
Let $a_n$ be the value of the certificate after n months. Then $a_0 = 1000$ and, because of the interest and the
withdrawal, we have
$$\Delta a_n = a_{n+1} - a_n = 0.01a_n - 50,$$
i.e.,
$$a_{n+1} = 1.01a_n - 50 \quad\text{with}\quad a_0 = 1000.$$
How do we describe a change mathematically? Often it is necessary to plot the change, observe a pattern, and
then describe the change in mathematical terms. In short, we try to find
change = $\Delta a_n$ = some function f.
Example 1.1.4 (Mortgaging a Home). Six years ago your parents purchased a home by financing $80,000
for 20 years, paying monthly payments of $880.87 with a monthly interest rate of 1%. They have now made
72 payments. How much do they currently owe on the mortgage?
ANSWER. Let $b_n$ be the amount of money owed to the bank after n months. Then we have
$$\Delta b_n = b_{n+1} - b_n = 0.01b_n - 880.87,$$
i.e.,
$$b_{n+1} = 1.01b_n - 880.87 \quad\text{with}\quad b_0 = 80{,}000.$$
The answer to the question is $b_{72}$, which is easily computed by iterating the system.

Month (n)    Owed Money (b_n)
    0          80000
    1          79919.1
    2          79837.5
    3          79755
   ...         ...
   72          71532.1
   ...         ...
  237          2589.58
  238          1734.6
  239          871.078
  240          -1.0814

Based on the table of n and $b_n$, Figure 1.3 below shows the balance from month 0 to month 240.
Figure 1.3: Mortgaging a Home
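A numerical solution of the mortgage system is just a loop over the recurrence. The sketch below is one way to build the table; the function name `mortgage_balances` is an illustrative choice, not something from the notes.

```python
def mortgage_balances(principal, monthly_rate, payment, months):
    """Iterate b_{n+1} = (1 + rate) * b_n - payment and return b_0..b_months."""
    balances = [principal]
    for _ in range(months):
        balances.append((1 + monthly_rate) * balances[-1] - payment)
    return balances

# Values from Example 1.1.4: $80,000 financed at 1% per month, $880.87 payments.
b = mortgage_balances(80000.0, 0.01, 880.87, 240)

print(f"b_72  = {b[72]:.1f}")    # amount owed after 72 payments
print(f"b_240 = {b[240]:.4f}")   # balance after the final payment
```

The slightly negative final balance means the last payment overshoots the remaining debt by about a dollar.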

Definition 1.1.5. A sequence is a function whose domain is the set of all nonnegative integers and whose
range is a subset of the real numbers. A dynamical system is a relationship among terms in a sequence. A
numerical solution is a table of values satisfying the dynamical system.
1.2 Approximating Change with Difference Equations.
In this section, we approximate some observed change to complete the expression
change = $\Delta a_n$ = some function f.
Topic I. Discrete Versus Continuous Change.
Some changes take place in discrete time intervals, such as the depositing of interest in an account. In this
case, we consider a difference equation. Other changes happen continuously, such as the change in the
temperature of a cold can of soda on a warm day. In this case, we use a differential equation.
Example 1.2.1 (Growth of a Yeast Culture). An experiment measuring the growth of a yeast culture gives
the following table.

  n      p_n      Δp_n
  0      9.6      8.7
  1      18.3     10.7
  2      29.0     18.2
  3      47.2     23.9
  4      71.1     48.0
  5      119.1    55.5
  6      174.6    82.7
  7      257.3

Here n is the time in hours, $p_n$ is the observed yeast biomass, and $\Delta p_n = p_{n+1} - p_n$ is the change in biomass.

Footnotes.
Mortgage: a legal agreement by which a bank, building society, etc. lends money at interest in exchange for taking title of
the debtor's property, with the condition that the conveyance of title becomes void upon the payment of the debt (Concise Oxford
English Dictionary).
Yeast: a microscopic single-celled fungus capable of converting sugar into alcohol and carbon dioxide.

We observe that the ratio $\Delta p_n / p_n$ is roughly constant,
$$\frac{\Delta p_n}{p_n} \approx 0.6057,$$
which is the average of the ratios $\Delta p_i / p_i$, $i = 0, 1, 2, \ldots, 6$. This suggests that $\Delta p_n$ and $p_n$ are
proportional with the constant of proportionality 0.6057, so that
$$\Delta p_n = 0.6057p_n,$$
i.e.,
$$p_{n+1} = 1.6057p_n.$$
Since the ratio $p_{n+1}/p_n = 1.6057 > 1$, the model predicts that the population will increase forever. See
Figure 1.4.
Figure 1.4: Change in Biomass versus Biomass
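The constant 0.6057 can be recomputed from the table. The following sketch forms the first differences and averages the ratios; the variable names are illustrative.

```python
# Observed yeast biomass p_0..p_7 from Example 1.2.1.
p = [9.6, 18.3, 29.0, 47.2, 71.1, 119.1, 174.6, 257.3]

# First differences: delta[n] = p[n+1] - p[n] for n = 0..6.
delta = [later - earlier for earlier, later in zip(p, p[1:])]

# Average of the ratios delta_n / p_n estimates the constant of proportionality.
ratios = [d / value for d, value in zip(delta, p)]
k = sum(ratios) / len(ratios)

print(f"k = {k:.4f}")
print(f"model: p_(n+1) = {1 + k:.4f} * p_n")
```

The individual ratios vary between roughly 0.47 and 0.91, so the proportionality is a rough fit at best, which motivates the refinement in Topic II.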

Topic II. Model Refinement: Modeling Births, Deaths, and Resources.
Certain resources (e.g., food) can support only a maximum population level rather than one that increases
indefinitely.
Example 1.2.2 (Growth of a Yeast Culture Revisited). Under this restriction, suppose we have the data in Table 1.8
on page 11 of the textbook. The table gives the plot of the population against time (the left-hand plot).
From that graph, the population appears to be approaching a limiting value, which
seems to be 665. So we may propose
$$\Delta p_n \propto (665 - p_n)p_n.$$
By computing the ratio $\Delta p_n / ((665 - p_n)p_n)$, we can estimate a constant
$$\frac{\Delta p_n}{(665 - p_n)p_n} \approx 0.000802886,$$
which is the average of $\Delta p_i / ((665 - p_i)p_i)$, $i = 1, 2, \ldots, 18$. This gives
$$\Delta p_n = 0.000802886(665 - p_n)p_n,$$
$$p_{n+1} = p_n + 0.000802886(665 - p_n)p_n \quad\text{with}\quad p_0 = 9.6.$$
(The textbook uses the constant of proportionality k = 0.00082.)
Let us solve our model $p_{n+1} = p_n + 0.000802886(665 - p_n)p_n$ with $p_0 = 9.6$ numerically. That is, for each n,
we compute $p_n$, plot the data $(n, p_n)$, and compare with the values obtained by the experiment. See Figure 1.5.
Figure 1.5: Red: Experimental Result, Blue: Model Result
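The numerical solution mentioned above amounts to iterating the refined model. Below is a minimal sketch; the number of steps (30) and the function name are assumptions made for illustration, and the experimental data of Table 1.8 are not reproduced here.

```python
def constrained_growth(p0, k, capacity, steps):
    """Iterate p_{n+1} = p_n + k * (capacity - p_n) * p_n."""
    values = [p0]
    for _ in range(steps):
        current = values[-1]
        values.append(current + k * (capacity - current) * current)
    return values

# Constants from Example 1.2.2; 30 steps is an arbitrary illustrative horizon.
p = constrained_growth(p0=9.6, k=0.000802886, capacity=665.0, steps=30)

for n in (0, 5, 10, 15, 20, 30):
    print(f"n = {n:2d}  p_n = {p[n]:8.2f}")
```

With these constants the iterates rise monotonically and level off just below 665, in contrast with the unbounded growth predicted by $p_{n+1} = 1.6057p_n$.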
1.3 Solutions to Dynamical Systems.

Topic I. Method of Conjecture.
The method of conjecture is a powerful mathematical technique to hypothesize the form of a solution to a
dynamical system and then to accept or reject the hypothesis. The method has four steps: Look for Pattern,
Conjecture, Test Conjecture and Conclusion.
Example 1.3.1 (Savings Certificate (Revisited)). A savings certificate initially worth $1000 accumulates
interest paid each month at 1% of the balance. No deposits or withdrawals occur in the account. Letting
$a_n$ be the amount in the account after n months, we deduce the dynamical system
$$a_{n+1} = 1.01a_n \quad\text{with}\quad a_0 = 1000. \qquad (1.3.1)$$
Step 1 Look for Pattern:
$$a_1 = 1.01a_0$$
$$a_2 = 1.01a_1 = 1.01(1.01a_0) = 1.01^2 a_0$$
$$a_3 = 1.01a_2 = 1.01(1.01^2 a_0) = 1.01^3 a_0$$
$$\vdots$$
$$a_n = 1.01^n a_0.$$
Step 2 Conjecture: From Step 1, we conjecture $a_n = 1.01^n a_0$.
Step 3 Test Conjecture: The conjecture implies
$$a_{n+1} = 1.01^{n+1} a_0 = 1.01(1.01^n a_0) = 1.01a_n,$$
which is the dynamical system (1.3.1). Thus, our conjecture is right.
Step 4 Conclusion: The solution of the dynamical system (1.3.1) is $a_n = 1.01^n a_0 = 1.01^n \cdot 1000$.
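The conjecture can also be checked numerically by comparing the iterated values with the closed form. This sketch checks only the first 12 months, an arbitrary horizon chosen for illustration, so it supports the conjecture rather than proving it.

```python
a0 = 1000.0

# Numerical solution: apply a_{n+1} = 1.01 * a_n month by month.
iterated = [a0]
for _ in range(12):
    iterated.append(1.01 * iterated[-1])

# Conjectured closed form: a_n = 1.01**n * a_0.
conjectured = [1.01 ** n * a0 for n in range(13)]

for n in (0, 1, 2, 12):
    print(f"n = {n:2d}  iterated = {iterated[n]:.4f}  closed form = {conjectured[n]:.4f}")
```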
Topic II. Homogeneous Linear Dynamical System $a_{n+1} = ra_n$, r constant.
Theorem 1.3.2. The solution of the linear dynamical system $a_{n+1} = ra_n$ with constant $r \neq 0$ is $a_n = r^n a_0$,
where $a_0$ is the given initial value.
Example 1.3.3 (Sewage Treatment). A sewage treatment plant processes raw sewage to produce usable
fertilizer and clean water by removing all other contaminants. The process is such that each hour 12% of the
remaining contaminants in a processing tank are removed.
Questions:
1. What percentage of the sewage would remain after 1 day?
2. How long would it take to lower the amount of sewage by half?
3. How long until the level of sewage is down to 10% of the original level?
ANSWER. Let $a_n$ be the amount of sewage contaminants after n hours and $a_0$ the initial amount. Then we
build the model
$$a_{n+1} = a_n - 0.12a_n = 0.88a_n, \quad\text{i.e.,}\quad a_{n+1} = 0.88a_n.$$
Using Theorem 1.3.2, we have the solution of the dynamical system,
$$a_n = 0.88^n a_0.$$
Answer to Question 1: Since 1 day is equivalent to 24 hours, the answer is
$$a_{24} = 0.88^{24} a_0 = 0.0465a_0.$$
That is, about 4.65% of the contaminants remain, so the level of contaminants in the sewage is reduced by more
than 95% by the end of the first day.
Answer to Question 2: The question asks for the time n satisfying $a_n = 0.5a_0$. So we solve
$$0.5a_0 = 0.88^n a_0 \implies 0.5 = 0.88^n \implies n = \frac{\ln 0.5}{\ln 0.88} = 5.42.$$
Hence, it takes about 5.42 hours to lower the contaminants to half their original level.
Answer to Question 3: The question asks for the time n satisfying $a_n = 0.1a_0$. So we solve
$$0.1a_0 = 0.88^n a_0 \implies 0.1 = 0.88^n \implies n = \frac{\ln 0.1}{\ln 0.88} = 18.01.$$
Hence, it takes about 18 hours before the contaminants are reduced to 10% of their original level.

Footnote. Sewage: refuse liquids or waste matter usually carried off by sewers.
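The three answers in Example 1.3.3 follow directly from $a_n = 0.88^n a_0$ and can be evaluated as below. This is a small sketch using only the standard `math` module; the helper name `hours_to_reach` is illustrative.

```python
import math

RETAINED_PER_HOUR = 0.88  # 12% of the remaining contaminants are removed each hour

def hours_to_reach(fraction):
    """Solve 0.88**n = fraction for n (in hours)."""
    return math.log(fraction) / math.log(RETAINED_PER_HOUR)

remaining_after_day = RETAINED_PER_HOUR ** 24
half_life = hours_to_reach(0.5)
time_to_tenth = hours_to_reach(0.1)

print(f"fraction remaining after 24 h: {remaining_after_day:.4f}")
print(f"hours to reach 50%: {half_life:.2f}")
print(f"hours to reach 10%: {time_to_tenth:.2f}")
```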
Topic III. Long-Term Behavior of $a_{n+1} = ra_n$, r constant.

By Theorem 1.3.2, we recall that the solution of $a_{n+1} = ra_n$ is $a_n = r^n a_0$. We observe the long-term behavior
of $a_n = r^n a_0$ when n is sufficiently large.

  r           Behavior of a_n
  r = 0       a_n = 0 for all n ≥ 1, so the dynamical system has a constant solution and equilibrium value at 0.
  r = 1       a_n = a_0 for all n, so all initial values are constant solutions.
  r < 0       The sequence a_n oscillates (its terms alternate in sign).
  |r| < 1     The sequence a_n converges to 0, i.e., it decays to the limiting value 0.
  |r| > 1     The sequence a_n diverges, i.e., it grows without bound (for a_0 ≠ 0).
Topic IV. Nonhomogeneous Linear Dynamical System $a_{n+1} = ra_n + b$, r and b constant.
Definition 1.3.4. If a dynamical system $a_{n+1} = f(a_n)$ has a constant solution, say $a_n = c$ for all n, then
the constant c is called an equilibrium value or fixed point of the system.
Example 1.3.5. Consider $a_{n+1} = 0.5a_n + 0.1$.
(1) When $a_0 = 0.1$, the given dynamical system implies
$$a_1 = (0.5)(0.1) + 0.1 = 0.15, \quad a_2 = (0.5)(0.15) + 0.1 = 0.175, \quad \ldots, \quad a_{15} = 0.19999695.$$
So we may expect that $a_n \to 0.2$ as $n \to \infty$.
(2) When $a_0 = 0.2$, the given dynamical system implies
$$a_1 = (0.5)(0.2) + 0.1 = 0.2, \quad a_2 = (0.5)(0.2) + 0.1 = 0.2, \quad \ldots, \quad a_{15} = 0.2.$$
So we can deduce $a_n = 0.2 = a_0$ for any integer n. That is, 0.2 is the equilibrium value.
(3) When $a_0 = 0.3$, the given dynamical system implies
$$a_1 = (0.5)(0.3) + 0.1 = 0.25, \quad a_2 = (0.5)(0.25) + 0.1 = 0.225, \quad \ldots, \quad a_{15} = 0.20000305.$$
So we may expect that $a_n \to 0.2$ as $n \to \infty$.

When we look at the graphs of the sequences (1), (2) and (3), we observe that whatever the initial value is, the
sequence converges to the equilibrium value 0.2. We say this equilibrium value is stable.
Figure 1.6: Red (Lower): a_0 = 0.1, Blue (Middle): a_0 = 0.2, Black (Upper): a_0 = 0.3

Example 1.3.6. Consider $b_{n+1} = 1.01b_n - 1000$.
(1) When $b_0 = 90000$, the given dynamical system implies
$$b_1 = (1.01)(90000) - 1000 = 89900$$
$$b_2 = (1.01)(89900) - 1000 = 89799$$
$$\vdots$$
$$b_{15} = 88390$$
So we may expect that $b_n \not\to 100000$ as $n \to \infty$; the sequence moves away from 100000.
(2) When $b_0 = 100000$, the given dynamical system implies
$$b_1 = (1.01)(100000) - 1000 = 100000$$
$$b_2 = (1.01)(100000) - 1000 = 100000$$
$$\vdots$$
$$b_{15} = 100000$$
So we can deduce $b_n = 100000 = b_0$ for any integer n. That is, 100000 is the equilibrium value.
(3) When $b_0 = 110000$, the given dynamical system implies
$$b_1 = (1.01)(110000) - 1000 = 110100$$
$$b_2 = (1.01)(110100) - 1000 = 110201$$
$$\vdots$$
$$b_{15} = 111610$$
So we may expect that $b_n \not\to 100000$ as $n \to \infty$; the sequence moves away from 100000.
When we look at the graphs of the sequences (1), (2) and (3), we observe that for any initial value other than
100000, the sequence does not converge to the equilibrium value 100000. We say this equilibrium value is unstable.
Figure 1.7: Red (Lower): b_0 = 90000, Blue (Middle): b_0 = 100000, Black (Upper): b_0 = 110000
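The contrast between the stable equilibrium 0.2 and the unstable equilibrium 100000 is easy to reproduce by iterating $a_{n+1} = ra_n + b$ from several starting values. The sketch below uses a small helper, `iterate`, whose name is an illustrative choice.

```python
def iterate(r, b, start, steps):
    """Apply a_{n+1} = r * a_n + b `steps` times, starting from `start`."""
    value = start
    for _ in range(steps):
        value = r * value + b
    return value

# Example 1.3.5: a_{n+1} = 0.5 a_n + 0.1, equilibrium 0.2 (stable).
for a0 in (0.1, 0.2, 0.3):
    print(f"a0 = {a0}:  a15 = {iterate(0.5, 0.1, a0, 15):.8f}")

# Example 1.3.6: b_{n+1} = 1.01 b_n - 1000, equilibrium 100000 (unstable).
for b0 in (90000, 100000, 110000):
    print(f"b0 = {b0}:  b15 = {iterate(1.01, -1000, b0, 15):.0f}")
```

In the first system the distance to 0.2 is halved at every step, while in the second the distance to 100000 grows by 1% per step, which is the behavior summarized by the conditions |r| < 1 and |r| > 1.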
Topic V. Finding and Classifying Equilibrium Values.

Suppose $a_{n+1} = ra_n + b$ has an equilibrium value $a \neq 0$. Then by definition, we get
$$a_{n+1} = a = a_n.$$
Substituting into the dynamical system, we have
$$a = ra + b \implies a = \frac{b}{1-r} \quad (r \neq 1).$$
This implies the following theorem.

Theorem 1.3.7. If $a_{n+1} = ra_n + b$ has a nonzero equilibrium value a, then the equilibrium value a is given
by
$$a = \frac{b}{1-r} \quad (r \neq 1).$$
(1) If $r = 1$ and $b = 0$, then the dynamical system becomes $a_{n+1} = a_n$, i.e., any initial value is an equilibrium
value, so every number is an equilibrium value.
(2) If $r = 1$ but $b \neq 0$, then there is no equilibrium value.
Example 1.3.8. Let us use Theorem 1.3.7 to find the equilibrium values of the dynamical systems
$a_{n+1} = 0.5a_n + 0.1$ in Example 1.3.5 and $b_{n+1} = 1.01b_n - 1000$ in Example 1.3.6 above.
Theorem 1.3.7 gives the equilibrium values a and b of the two systems:
$$a = \frac{0.1}{1 - 0.5} = 0.2, \qquad b = \frac{-1000}{1 - 1.01} = 100000.$$
Through examples, we can observe the following long-term behavior for $a_{n+1} = ra_n + b$, $b \neq 0$.

  r           Long-Term Behavior
  |r| < 1     Stable equilibrium value
  |r| > 1     Unstable equilibrium value
  r = 1       Graph is a straight line with no equilibrium value
Theorem 1.3.9. The dynamical system $a_{n+1} = ra_n + b$ with $r \neq 1$ has the solution
$$a_n = r^n c + \frac{b}{1-r}, \qquad (1.3.2)$$
where c is a constant depending on the initial condition; explicitly,
$$a_0 = r^0 c + \frac{b}{1-r} = c + \frac{b}{1-r} \implies c = a_0 - \frac{b}{1-r}.$$
So the solution can be rewritten as
$$a_n = r^n\left(a_0 - \frac{b}{1-r}\right) + \frac{b}{1-r} = r^n a_0 + \frac{b(1 - r^n)}{1-r}.$$
PROOF. Substituting (1.3.2) into the given system, we have
$$a_{n+1} = r^{n+1} c + \frac{b}{1-r}, \quad\text{and}\quad ra_n + b = r\left(r^n c + \frac{b}{1-r}\right) + b = r^{n+1} c + \frac{b}{1-r}.$$
So $a_n$ given in (1.3.2) satisfies the given system $a_{n+1} = ra_n + b$. Therefore, (1.3.2) is the solution.
We observe that the second term in the solution (1.3.2) is the equilibrium value of the given system
$a_{n+1} = ra_n + b$.
Example 1.3.10. Solve $a_{n+1} = 1.01a_n - 1000$.
ANSWER. By Example 1.3.6 or Example 1.3.8, we recall that the given system has the equilibrium value
100000. Theorem 1.3.9 above gives the solution
$$a_n = 1.01^n c + 100000$$
$$\implies a_0 = c + 100000 \implies c = a_0 - 100000$$
$$\implies a_n = 1.01^n (a_0 - 100000) + 100000,$$
where $a_0$ is the initial value.
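As a sanity check on Theorem 1.3.9, the closed form can be compared against direct iteration. The sketch below does this for the system of Example 1.3.10; the starting value 90000 and the horizon of 15 steps are borrowed from Example 1.3.6 purely for illustration, and the function names are hypothetical.

```python
def closed_form(r, b, a0, n):
    """a_n = r**n * (a_0 - b/(1-r)) + b/(1-r), valid for r != 1."""
    equilibrium = b / (1 - r)
    return r ** n * (a0 - equilibrium) + equilibrium

def iterate(r, b, a0, n):
    """Apply a_{k+1} = r * a_k + b exactly n times."""
    value = a0
    for _ in range(n):
        value = r * value + b
    return value

r, b, a0 = 1.01, -1000.0, 90000.0
for n in (0, 1, 2, 15):
    print(f"n = {n:2d}  closed form = {closed_form(r, b, a0, n):.2f}  "
          f"iterated = {iterate(r, b, a0, n):.2f}")
```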

Topic VI. Nonlinear Systems.
We recall from Example 1.2.2 (Growth of a Yeast Culture Revisited), with the textbook's constant k = 0.00082:
$$p_{n+1} = p_n + 0.00082(665 - p_n)p_n = 1.5453p_n - 0.00082p_n^2 = 1.5453(1 - 0.0005306p_n)p_n,$$
so that
$$0.0005306p_{n+1} = 1.5453(1 - 0.0005306p_n)(0.0005306p_n),$$
which is a nonlinear dynamical system. Letting $a_n = 0.0005306p_n$ and $r = 1.5453$, the equation can be
rewritten as
$$a_{n+1} = r(1 - a_n)a_n.$$
In this system, when we try various values of r, namely r = 1.5453, r = 2.750, r = 3.250, r = 3.525 and r = 3.555,
we obtain plots showing various long-term phenomena. (See Figure 1.21 on page 32 of the textbook.)
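The different long-term behaviors can be explored by iterating $a_{n+1} = r(1 - a_n)a_n$ and inspecting the last few terms. In this sketch the starting value $a_0 = 0.0005306 \times 9.6$ comes from the scaling above with $p_0 = 9.6$, while the 500 steps and the function name are assumptions made for illustration.

```python
def logistic_orbit(r, a0, steps=500):
    """Return [a_0, a_1, ..., a_steps] for a_{n+1} = r * (1 - a_n) * a_n."""
    orbit = [a0]
    for _ in range(steps):
        orbit.append(r * (1 - orbit[-1]) * orbit[-1])
    return orbit

a0 = 0.0005306 * 9.6  # scaled initial biomass from the yeast example

for r in (1.5453, 2.750, 3.250, 3.525, 3.555):
    tail = logistic_orbit(r, a0)[-4:]
    print(f"r = {r}: last terms {[round(x, 4) for x in tail]}")
```

For r = 1.5453 the tail settles at the single value 1 - 1/r, whereas for r = 3.250 it alternates between two values; the larger values of r produce longer repeating patterns.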
1.4 Systems of Difference Equations.
In the previous section, we studied the equilibrium value of a single linear or nonlinear dynamical system. In
this section, we study the equilibrium values of a pair of dynamical systems that interact with each other.
Example 1.4.1 (Car Rental Company). A car rental company has distributorships in Orlando and Tampa.
In analyzing the historical records, it is determined that 60% of the cars rented in Orlando are returned to
Orlando, whereas 40% end up in Tampa. Of the cars rented from the Tampa office, 70% are returned to
Tampa, whereas 30% end up in Orlando.
Questions:
1. Will a sufficient number of cars end up in each city to satisfy the demand for cars in that city?
2. If not, how many cars must the company transport from Orlando to Tampa or from Tampa to Orlando?
ANSWER. Let $O_n$ and $T_n$ be the number of cars in Orlando and Tampa, respectively, at the end of day n.
Then, we can build the following dynamical system model:
$$O_{n+1} = 0.6O_n + 0.3T_n \quad\text{and}\quad T_{n+1} = 0.7T_n + 0.4O_n,$$
which is a system of difference equations.

Suppose the system has the equilibrium values $O_n = O$ and $T_n = T$. Substituting them into the equations, we
have
$$O = 0.6O + 0.3T \quad\text{and}\quad T = 0.7T + 0.4O \implies O = \frac{3}{4}T.$$
So if the Orlando and Tampa offices initially have $O_0 = 3000$ and $T_0 = 4000$ cars, respectively, then we
observe
$$O_1 = 0.6(3000) + 0.3(4000) = 3000, \qquad T_1 = 0.7(4000) + 0.4(3000) = 4000,$$
$$O_2 = 0.6(3000) + 0.3(4000) = 3000, \qquad T_2 = 0.7(4000) + 0.4(3000) = 4000,$$
$$\vdots$$
$$O_n = 3000, \qquad T_n = 4000.$$
That is, the system remains at the initial value $(O_n, T_n) = (3000, 4000) = (O_0, T_0)$.
In fact, even if we change the initial values, we observe that eventually $O_n \to 3000$ and $T_n \to 4000$ (assuming the
total number of cars is 7000).

           Case 1    Case 2    Case 3    Case 4
  O_0      7000      5000      2000      0
  T_0      0         2000      5000      7000

For the various starting values in the table above, Figure 1.8 shows that $O_n$ and $T_n$ approach the
equilibrium values, i.e., $O_n \to 3000$ and $T_n \to 4000$.
Answers to Questions 1 and 2: Even if an office starts with an insufficient number of cars, the numbers move
toward the equilibrium values from the first day on, and within a few days each office has close to the ideal
number of cars (i.e., the equilibrium values). So we do not have to transport any cars from one city to the other.
Figure 1.8: Red: O_n, Blue: T_n
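The convergence claimed for the four cases can be checked by iterating the two equations together. In this sketch the function name and the 30-day horizon are illustrative assumptions; note that both fleets must be updated from the same day's values, which the tuple assignment takes care of.

```python
def rental_fleet(orlando, tampa, days):
    """Iterate O_{n+1} = 0.6 O_n + 0.3 T_n and T_{n+1} = 0.4 O_n + 0.7 T_n."""
    for _ in range(days):
        # Tuple assignment evaluates both right-hand sides before updating.
        orlando, tampa = (0.6 * orlando + 0.3 * tampa,
                          0.4 * orlando + 0.7 * tampa)
    return orlando, tampa

cases = [(7000, 0), (5000, 2000), (2000, 5000), (0, 7000)]
for o0, t0 in cases:
    for days in (1, 2, 7, 30):
        o, t = rental_fleet(o0, t0, days)
        print(f"start ({o0}, {t0}), day {days:2d}: O = {o:7.1f}, T = {t:7.1f}")
```

Each day the distance from the equilibrium (3000, 4000) shrinks by a factor of 0.3, so the fleets are within a few cars of the equilibrium after about a week.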

Example 1.4.2 (Competitive Hunter Model: Spotted Owls and Hawks). Suppose a species of spotted owls
competes for survival in a habitat that also supports hawks. Suppose also that in the absence of the other
species, each individual species exhibits unconstrained growth in which the change in the population during
an interval of time (e.g., 1 day) is proportional to the population size at the beginning of the interval.
Let $O_n$ and $H_n$ denote the sizes of the spotted owl and hawk populations, respectively, at the end of day n.
Then by the proportionality assumption, we have
$$\Delta O_n \propto O_n \quad\text{and}\quad \Delta H_n \propto H_n \implies \Delta O_n = k_1 O_n \quad\text{and}\quad \Delta H_n = k_2 H_n,$$
where $k_1$ and $k_2$ are constants.

Assuming that the decrease of each population is proportional to the product of $O_n$ and $H_n$, the system becomes
$$\Delta O_n = k_1 O_n - k_3 O_n H_n \quad\text{and}\quad \Delta H_n = k_2 H_n - k_4 O_n H_n,$$
$$O_{n+1} = (1 + k_1)O_n - k_3 O_n H_n \quad\text{and}\quad H_{n+1} = (1 + k_2)H_n - k_4 O_n H_n,$$
where $k_1$, $k_2$, $k_3$ and $k_4$ are constants. We fix the constants $k_i$ and consider the system
$$O_{n+1} = 1.2O_n - 0.001O_n H_n \quad\text{and}\quad H_{n+1} = 1.3H_n - 0.002O_n H_n.$$
Letting O and H be the equilibrium values, we have
$$O = 1.2O - 0.001OH, \qquad H = 1.3H - 0.002OH$$
$$\implies 0 = O(0.2 - 0.001H), \qquad 0 = H(0.3 - 0.002O).$$
Besides the trivial solution (0, 0), this gives the equilibrium values (O, H) = (150, 200).

Plotting $O_n$ and $H_n$ with the various initial values given in the table,

           Case 1    Case 2    Case 3
  O_0      151       149       10
  H_0      199       201       10

we observe $O_n \not\to 150$ and $H_n \not\to 200$ in every case. In Case 1, the population of owls grows indefinitely while
the population of hawks goes extinct. In Case 2, the opposite phenomenon occurs. That is, in either case, one
of the two species drives the other to extinction. It is also interesting to see in Case 3 that the population of
hawks grows while the owls die out. See Figure 1.28 on pages 43-45 of the textbook.
Let us compare the results on equilibrium values in Examples 1.4.1 (Car Rental Company) and 1.4.2 (Owls
and Hawks). The equilibrium values in Example 1.4.1 (Car Rental Company) are stable and insensitive to
the initial conditions, while those in Example 1.4.2 (Owls and Hawks) are unstable and very sensitive to the
initial conditions.
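The sensitivity to initial conditions can be seen by iterating the competitive hunter system for the three cases. The sketch below makes one modeling assumption that is not in the notes: populations are clamped at zero, because the raw recurrence can produce negative values once the competing species becomes very large. The function name and the 60-day horizon are also illustrative.

```python
def compete(owls, hawks, days):
    """Iterate O_{n+1} = 1.2 O - 0.001 O H and H_{n+1} = 1.3 H - 0.002 O H.

    Assumption for this sketch: a population that would become negative
    is treated as extinct (clamped to 0).
    """
    for _ in range(days):
        owls, hawks = (max(0.0, 1.2 * owls - 0.001 * owls * hawks),
                       max(0.0, 1.3 * hawks - 0.002 * owls * hawks))
    return owls, hawks

for label, (o0, h0) in {"Case 1": (151, 199),
                        "Case 2": (149, 201),
                        "Case 3": (10, 10)}.items():
    owls, hawks = compete(o0, h0, 60)
    print(f"{label}: start ({o0}, {h0}) -> owls = {owls:.1f}, hawks = {hawks:.1f}")
```

Starting one owl above and one hawk below the equilibrium (Case 1) is enough for the owls to take over, while the mirror-image start (Case 2) hands the habitat to the hawks.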
Chapter 2
The Modeling Process
2.1 Mathematical Models.

Read the textbook.
2.2 Modeling Using Proportionality.
Topic I. Introduction.
From Chapter 1, we recall
$y \propto x$ (i.e., x and y are proportional to each other) if and only if $y = kx$ for some constant $k > 0$.
It is easy to see
$$y \propto x \iff x \propto y.$$
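Example 2.2.1.
1. $y \propto x^2$ if and only if $x \propto y^{1/2}$.
2. $y \propto x^n$ for a fixed nonzero constant n if and only if $x \propto y^{1/n}$.
3. $y \propto \ln x$ if and only if $x = e^{y/k}$ for some constant k.
4. $y \propto e^x$ if and only if $x = \ln(y/k)$ for some constant k.
PROOF. Skip. But one should be able to prove all of them.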
Property 2.2.2 (TRANSITIVITY). If $z \propto y$ and $y \propto x$, then $z \propto x$.

Topic II. Geometric Interpretation.
Suppose x and y are proportional to each other. Then there is a constant k such that $y = kx$. Clearly,
the graph of $y = kx$ is a line with slope k passing through the origin. That is, if x and y are
proportional, the graph of y versus x must be a line through the origin. So u and v satisfying $v = mu + b$ with
constants m and $b \neq 0$ cannot be proportional.
Example 2.2.3. Suppose there is an open box floating in a water tank. As we add heavy objects to the
box, water flows out of the tank. We recall the fact that the volume of the water displaced by the loaded
box is equal to the weight of the loaded box. For example, if we add a ball to the box and the volume
of the water that flows out of the tank is 5, then we can say the weight of the ball together with the box is 5
(ignoring units).
Let y be the volume of the water displaced by the loaded box and x be the weight of the loaded box.
Then we have $y = x$. However, letting z be the weight of the ball alone, we have
$y = z + (\text{weight of the box})$, i.e., y and z cannot be proportional, because of the constant term. See Figure 2.1.
[Figure 2.1: Displaced volume y versus added weight x. It is not a proportionality because the line fails to pass through the origin.]

If we have a case as in Example 2.2.3, are we prohibited from assuming proportionality? No, we are
not. In fact, the answer depends on the problem, specifically on the slope. Let us consider two lines having
the same slope, $L: y = mx$ and $M: y = mx + b$, where $b \neq 0$. Let $(x_0, y_L)$ and $(x_0, y_M)$ be points on the lines L
and M, respectively. That is,
$$y_L = mx_0 \quad\text{and}\quad y_M = mx_0 + b.$$
Then we can compute
$$y_M - y_L = mx_0 + b - mx_0 = b, \quad\text{i.e.,}\quad y_M - y_L = b.$$
We divide both sides by $y_M$:
$$\frac{y_M - y_L}{y_M} = \frac{b}{y_M}. \qquad (2.2.1)$$
Now we observe:
1. If the slope m is relatively large (e.g., compared to 1), then $y_M$ should be relatively large too, which
implies that the ratio (2.2.1) is close to zero, i.e.,
$$\frac{y_M - y_L}{y_M} \approx 0,$$
so the difference is negligible relative to $y_M$, i.e., $y_M \approx y_L$.
So in this case, we may assume proportionality for data fitting the model $y = mx + b$.
2. If the slope m is relatively small (e.g., compared to 1), then $y_M$ should be relatively small too, which
implies that the ratio (2.2.1) is not close to zero, i.e.,
$$\frac{y_M - y_L}{y_M} \not\approx 0,$$
so the difference is not negligible relative to $y_M$, i.e., $y_M \not\approx y_L$.
So in this case, we cannot assume proportionality for data fitting the model $y = mx + b$.
In a nutshell, if the data fit the model $y = mx + b$ and m is relatively large, then we can assume
proportionality even though $b \neq 0$. See Figure 2.2.
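A quick numerical illustration of the ratio (2.2.1) may help. The values of m, b and $x_0$ below are hypothetical and chosen only to contrast a steep line with a shallow one; they do not come from the notes.

```python
def relative_offset(m, b, x0):
    """Ratio (y_M - y_L) / y_M = b / (m * x0 + b) from equation (2.2.1)."""
    y_line_through_origin = m * x0
    y_shifted_line = m * x0 + b
    return (y_shifted_line - y_line_through_origin) / y_shifted_line

# Hypothetical values: same intercept b and same x0, very different slopes.
steep = relative_offset(m=50.0, b=1.0, x0=10.0)
shallow = relative_offset(m=0.05, b=1.0, x0=10.0)

print(f"steep slope:   relative offset = {steep:.4f}")
print(f"shallow slope: relative offset = {shallow:.4f}")
```

With the steep slope the intercept accounts for a fraction of a percent of $y_M$, so treating the data as proportional loses little; with the shallow slope the intercept is most of $y_M$.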
[Figure 2.2: Assumable and Unassumable Proportionality. Two panels, each showing the lines y = mx + b and y = mx against the axes x and y.]

Example 2.2.4 (KEPLER'S THIRD LAW). What is the relationship between the orbital period and the mean
distance between the sun and a planet in the solar system?
ANSWER. Method 1. Kepler's Third Law:
Kepler's Third Law: The square of the orbital period of a planet is directly proportional to the cube of the
semi-major axis of its orbit.
Symbolically, it can be written as
$$T^2 \propto R^3,$$
where T is the orbital period of the planet and R is the semi-major axis of the orbit, i.e., the mean distance between
the sun and the planet. The proportionality constant is the same for any planet around the Sun:
$$\frac{T_{\text{Earth}}^2}{R_{\text{Earth}}^3} = \frac{T_{\text{Mars}}^2}{R_{\text{Mars}}^3} = \frac{T_{\text{Planet}}^2}{R_{\text{Planet}}^3}.$$
Computing the ratio for the Earth, with $T_{\text{Earth}} = 365.25$ days and $R_{\text{Earth}} = 92.9$ millions of miles (= 149.508058 millions of kilometers),
we have the constant of proportionality,
$$\frac{365.25^2}{92.9^3} = 0.166392 \quad (\text{days}^2/\text{millions of miles}^3).$$
Thus, we deduce that for any planet, $T^2 = 0.166392R^3$, i.e., $T = 0.166392^{1/2} R^{3/2} = 0.407912R^{3/2}$.
Method 2. Modeling Method based on Data: The following table is from the 1993 World Almanac.

  Planet     Period T (days)    Mean Distance R (millions of miles)
  Mercury         88.0               36
  Venus          224.7               67.25
  Earth          365.3               93
  Mars           687.0              141.75
  Jupiter       4331.8              483.80
  Saturn       10760.0              887.97
  Uranus       30684.0             1764.50
  Neptune      60188.3             2791.05
  Pluto        90466.8             3653.90

When we plot the points $(R^{3/2}, T)$, they lie approximately on a straight line passing through the origin. The slope
(constant of proportionality) could be estimated from any one of the points. However, let us use the least-squares
criterion in Section 3.3 (Applying the Least-Squares Criterion). Then we deduce $T = 0.40948R^{3/2}$.
See Figure 2.3.
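Both constants can be recomputed from the numbers given. The sketch below assumes that the least-squares fit referred to is the standard fit of a line through the origin, $T = kx$ with $x = R^{3/2}$, whose slope is $k = \sum x_i T_i / \sum x_i^2$; the notes do not spell out the formula at this point, so treat that choice as an assumption.

```python
import math

# (period T in days, mean distance R in millions of miles), 1993 World Almanac table.
planets = {
    "Mercury": (88.0, 36.0),
    "Venus": (224.7, 67.25),
    "Earth": (365.3, 93.0),
    "Mars": (687.0, 141.75),
    "Jupiter": (4331.8, 483.80),
    "Saturn": (10760.0, 887.97),
    "Uranus": (30684.0, 1764.50),
    "Neptune": (60188.3, 2791.05),
    "Pluto": (90466.8, 3653.90),
}

# Method 1: constant from the Earth alone, T = k * R**1.5.
k_earth = math.sqrt(365.25 ** 2 / 92.9 ** 3)

# Method 2 (assumed form): least-squares slope of a line through the origin.
xs = [R ** 1.5 for _, R in planets.values()]
ts = [T for T, _ in planets.values()]
k_fit = sum(x * t for x, t in zip(xs, ts)) / sum(x * x for x in xs)

print(f"Method 1: k = {k_earth:.6f}")
print(f"Method 2: k = {k_fit:.5f}")
```

Because the outer planets have by far the largest values of $R^{3/2}$, they dominate the least-squares slope.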

Figure 2.3: Kepler's Third Law as proportionality

Remark 2.2.5 (ASIDE). Kepler's First Law: Each planet moves along an ellipse with the sun at one focus.
Kepler's Second Law: For each planet, the line from the sun to the planet sweeps out equal areas in equal
times.
Some famous formulas involve proportionalities, as follows:
1. (Hooke's Law) $F = kS$, where F is the restoring force in a spring stretched or compressed a distance S.
2. (Newton's Law) $F = ma$ or $a = F/m$, where a is the acceleration of a mass m subjected to a net external
force F.
3. (Ohm's Law) $V = iR$, where i is the current induced by a voltage V across a resistance R.
4. (Boyle's Law) $V = k/p$, where, at constant temperature, the volume V is inversely proportional to
the pressure p, with constant k.
5. (Einstein's Theory of Relativity) $E = c^2 M$, where c is the constant speed of light, so the energy
E is proportional to the mass M of the object with constant $c^2$.
6. (Kepler's Third Law) $T = cR^{3/2}$, where T is the period (days) and R is the mean distance to the sun.
Topic III. Modeling Vehicular Stopping Distance.
We start with recalling the onecarlength rule (OCL) that allows one car length for every 10 mph of speed.
That rule was also stated as the 2seconds rule (2S) which allows for 2 seconds between cars. In fact, these
two rules are not same and they cannot be compatible. If one is true, the other one should be wrong. First we
will show this incompatibility and then develop a better model on the stopping distance.
If we stick to the rule 2S, then a simple computation shows
(
)
speed in ft
1 car length = distance =
(2 sec)
sec
(
)(
)(
)
10 miles
5280 ft
1 hr
=
(2 sec) = 29.33 ft.
hr
mile
3600 sec
It says that if we follow the rule 2S, then a car should be about 29.33 ft long. However, by statistics, since the
average car length is 15 ft, so the rule OCL should be wrong.
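When we use the correct information on the average car length, we have
\[
15\ \text{ft} = \text{1 car length} = \text{distance} = \left(\frac{\text{speed in ft}}{\text{sec}}\right)(x\ \text{sec})
= \left(\frac{10\ \text{miles}}{\text{hr}}\right)\left(\frac{5280\ \text{ft}}{\text{mile}}\right)\left(\frac{1\ \text{hr}}{3600\ \text{sec}}\right)(x\ \text{sec}) = \frac{44}{3}x
\implies x = \frac{45}{44} = 1.02273.
\]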
That is, the time gap should be 1.02273 seconds rather than 2 seconds. So if we follow the rule OCL, then the
rule 2S fails. Although the two rules are incompatible, it may be preferable to keep using both rules
for road safety.
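The two unit conversions above can be checked numerically. The following is a minimal Python sketch; the function and variable names are illustrative and not part of the original notes, and the 15 ft car length is the average quoted in the text.

```python
# Compare the one-car-length (OCL) rule with the 2-second (2S) rule.
FT_PER_MILE = 5280
SEC_PER_HOUR = 3600
CAR_LENGTH_FT = 15.0  # average car length quoted in the text


def mph_to_ft_per_sec(mph):
    """Convert a speed in miles per hour to feet per second."""
    return mph * FT_PER_MILE / SEC_PER_HOUR


def gap_2s(mph, seconds=2.0):
    """Following distance (ft) prescribed by the 2S rule."""
    return mph_to_ft_per_sec(mph) * seconds


def gap_ocl(mph, car_length_ft=CAR_LENGTH_FT):
    """Following distance (ft) prescribed by the OCL rule."""
    return car_length_ft * mph / 10.0


# Car length implied by the 2S rule at 10 mph.
implied_car_length = gap_2s(10)
# Time gap implied by the OCL rule with a 15 ft car at 10 mph.
ocl_seconds = gap_ocl(10) / mph_to_ft_per_sec(10)

print(f"2S rule implies a car length of {implied_car_length:.2f} ft")
print(f"OCL rule implies a time gap of {ocl_seconds:.5f} sec")
```

Because both rules are linear in the speed, the implied time gap of the OCL rule is the same at every speed, so the mismatch is not specific to 10 mph.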
Now, we recall from Chapter 1,
total stopping distance = reaction distance + braking distance.
Using the collected data on the reaction and braking distances, we observe the proportionalities
\[ d_r \propto v, \qquad d_b \propto v^2, \]
where $v$ is the car speed and $d_r$ and $d_b$ are the reaction and braking distances, respectively. Explicitly, we deduce
\[ d_r = 1.1v, \qquad d_b = 0.054v^2, \qquad d = 1.1v + 0.054v^2, \]
where $d$ is the total stopping distance. See Figure 2.4.
Figure 2.4: Red : Given Data, Blue +: Prediction by Model, Black: OCL rule
In the left panel of Figure 2.4, the black line is the prediction of the OCL rule, whose equation is
$d = 1.5v$, because the rule says $d/v = 15/10$ (ft/mph). However, as we can see from the figure, it falls far
below the data, especially for speeds above 20 mph. So when we take the guideline given in the table,

Speed (mph)        0-10    10-40    40-60    60-75
Guideline (sec)

the modified OCL rule will be better than the original one, as shown in the right panel of Figure 2.4.
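To see how quickly the OCL line falls behind the fitted model, one can tabulate both. This is a small Python sketch using only the coefficients 1.1, 0.054 and 1.5 stated above; the function names are illustrative assumptions.

```python
# Fitted stopping-distance model versus the OCL rule (distances in ft, speed in mph).

def reaction_distance(v):
    return 1.1 * v


def braking_distance(v):
    return 0.054 * v ** 2


def stopping_distance(v):
    """Total stopping distance d = 1.1 v + 0.054 v^2."""
    return reaction_distance(v) + braking_distance(v)


def ocl_distance(v):
    """Distance allowed by the OCL rule, d = 1.5 v."""
    return 1.5 * v


# Speed at which the two curves cross: 1.1 v + 0.054 v^2 = 1.5 v.
crossover = (1.5 - 1.1) / 0.054

for v in range(10, 80, 10):
    print(f"v = {v:2d} mph   model = {stopping_distance(v):6.1f} ft   OCL = {ocl_distance(v):6.1f} ft")
print(f"The OCL rule allows less than the model for v > {crossover:.2f} mph")
```

Under this model the OCL allowance is already too short at low speeds, and the gap grows quadratically with the speed.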
2.3 Modeling Using Geometric Similarity.
Geometric similarity is a concept related to proportionality and can be useful to simplify the mathematical
modeling process.
Topic I. Introduction.
Definition 2.3.1. Two objects are said to be geometrically similar if there is a one-to-one correspondence
between points of the objects such that the ratio of distances between corresponding points is constant for all
possible pairs of points.
Example 2.3.2. Consider two boxes $X$ and $X'$ with lengths $l, l'$, widths $w, w'$, and heights $h, h'$,
respectively. See Figure 2.5. Suppose $X$ and $X'$ are geometrically similar, so that there is a one-to-one
correspondence between the points $A$, $B$, $C$ and $A'$, $B'$, $C'$ (and all other points), and the ratio of the distances
between corresponding points is constant. Then it must be true that
\[ \frac{l}{l'} = \frac{w}{w'} = \frac{h}{h'} = k \]
for some constant $k > 0$.
Figure 2.5: Two geometrically similar objects $X$ and $X'$
1. For the two triangles $ABC$ and $A'B'C'$ in the boxes $X$ and $X'$, respectively, we observe that the angles are
the same, i.e., $\angle BCA = \angle B'C'A'$ and $\angle BAC = \angle B'A'C'$. Since the boxes $X$ and $X'$ are
geometrically similar, those triangles are also geometrically similar.
The shape is the same for two geometrically similar objects and one object is simply an enlarged copy of the
other. We can think of geometrically similar objects as scaled replicas of one another, as in an architectural
drawing in which all the dimensions are simply scaled by some constant factor.
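2. One of the advantages of geometric similarity is that it simplifies computations. For the boxes
$X$ and $X'$ above, the volumes are $V_X = lwh$ and $V_{X'} = l'w'h'$, respectively. It is easy to see
\[ \frac{V_X}{V_{X'}} = \frac{lwh}{l'w'h'} = k^3, \qquad V_X = k^3 V_{X'}, \qquad V_X \propto V_{X'}. \]
Similarly, for the total surface areas $S_X$ and $S_{X'}$ of the boxes $X$ and $X'$, we have
\[ \frac{S_X}{S_{X'}} = \frac{2(lw + wh + hl)}{2(l'w' + w'h' + h'l')} = k^2, \qquad S_X = k^2 S_{X'}, \qquad S_X \propto S_{X'}. \]
Moreover, we can find a relationship between the ratio of volumes and the ratio of surface areas:
\[ \frac{V_X/V_{X'}}{S_X/S_{X'}} = \frac{k^3}{k^2} = k, \qquad \frac{V_X}{V_{X'}} = k\,\frac{S_X}{S_{X'}}, \qquad \frac{V_X}{V_{X'}} \propto \frac{S_X}{S_{X'}}. \]
Now let us choose $l$ and $l'$ among the dimensions of the boxes. Since $l/l' = k$ and $S_X/S_{X'} = k^2$, we have
\[ \frac{S_X}{S_{X'}} = k^2 = \frac{l^2}{l'^2} \implies \frac{S_X}{l^2} = \frac{S_{X'}}{l'^2} = \text{constant} \]
(see the Aside in Remark 2.3.3). It implies
\[ S_X = (\text{constant})\, l^2, \qquad S_X \propto l^2, \qquad \text{similarly} \quad S_{X'} \propto l'^2. \]
By the same argument on $V_X$ and $V_{X'}$, it follows
\[ V_X \propto l^3, \qquad V_{X'} \propto l'^3. \]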
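These scaling relations are easy to confirm numerically. The sketch below is illustrative only: the box dimensions and the scale factor $k = 3$ are hypothetical values chosen for the check, not data from the notes.

```python
# Check that similar boxes scale as S ~ k^2 and V ~ k^3.

def volume(l, w, h):
    return l * w * h


def surface_area(l, w, h):
    return 2 * (l * w + w * h + h * l)


k = 3.0                                # hypothetical similarity ratio
small = (2.0, 1.5, 0.5)                # hypothetical l', w', h'
large = tuple(k * d for d in small)    # l, w, h of the similar box

volume_ratio = volume(*large) / volume(*small)
surface_ratio = surface_area(*large) / surface_area(*small)

# S / l^2 is the same constant for both boxes.
const_large = surface_area(*large) / large[0] ** 2
const_small = surface_area(*small) / small[0] ** 2

print(volume_ratio, surface_ratio, const_large, const_small)
```

The same check works with the width or the height as the characteristic dimension; only the value of the constant changes.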
Remark 2.3.3 (ASIDE). From $S_X/S_{X'} = l^2/l'^2 = $ constant, how can we deduce $S_X/l^2 = S_{X'}/l'^2 = $ constant?
First it is easy to see
\[ \frac{S_X}{S_{X'}} = \frac{l^2}{l'^2} \implies \frac{S_X}{l^2} = \frac{S_{X'}}{l'^2}, \]
by multiplying both sides by $S_{X'}/l^2$. Let us consider two functions $f(x)$ and $g(y)$, where $x$ and $y$ are the only
independent variables. Suppose
\[ \frac{f(x)}{g(y)} = \frac{x}{y} = \text{constant}. \]
Then we have
\[ \frac{f(x)}{x} = \frac{g(y)}{y}. \]
Letting $F(x) = f(x)/x$ and $G(y) = g(y)/y$, the equation says $F(x) = G(y)$. If a function of $x$ alone and a function
of $y$ alone are equal, then both of them must be constant, i.e., $F(x) = G(y) = $ constant. (For instance, one may
recall that a polynomial of degree 0 in any independent variable is a constant.) This kind of technique is
typically used under the topic of separation of variables in Partial Differential Equations.
In Example 2.3.2 above, we argued about $S$ and $V$ in terms of the length $l$. We can develop the same argument
with the width $w$ and the height $h$, i.e.,
\[ S \propto w^2, \qquad S \propto h^2, \qquad V \propto w^3, \qquad V \propto h^3. \]
Once we choose a dimension (in the Example above, it was the length $l$), it is called the characteristic
dimension.
Suppose a function $f$ depends on the length $l$, the surface area $S$, and the volume $V$ of a box. Since we
can express $S$ and $V$ in terms of the length $l$, the function $f$ can eventually be expressed by $l$, $l^2$ and $l^3$.
For instance, if $y = f(l, S, V) = 3l + S - V$, then there are some constants $k_1$ and $k_2$ such that $S = k_1 l^2$ and
$V = k_2 l^3$. So, we have
\[ y = 3l + S - V = 3l + k_1 l^2 - k_2 l^3, \]
which is a function of $l$, $l^2$ and $l^3$.
Example 2.3.4 (RAINDROP FROM A MOTIONLESS CLOUD). Suppose we are interested in the terminal
velocity (i.e., maximum velocity) of a raindrop falling from a motionless cloud. We assume that only two forces act
on the raindrop: $F_d$ due to air resistance and $F_g$ due to gravity. Then the net force $F$ (i.e., the sum of all
forces) is $F = F_g - F_d$, directed downward. By Newton's Second Law, the net force $F$ equals
$ma$, i.e.,
\[ F_g - F_d = ma, \]
where $a$ is the acceleration and $m$ is the mass of the raindrop. Since the maximum velocity occurs when the
acceleration vanishes (i.e., $a = 0$), the equation for the terminal velocity becomes
\[ F_g - F_d = 0, \qquad F_g = F_d. \]
Question: What is the relationship between the terminal velocity and the mass of the raindrop?
Assumptions:
(A1) $F_d$ is proportional to the surface area $S$ times the square of its speed $v$, i.e., $F_d \propto S v^2$.
(A2) $F_g$ is proportional to the weight $w$, i.e., $F_g \propto w$.
(A3) The mass $m$ is proportional to the weight $w$, i.e., $m \propto w$.
(A4) All the raindrops are geometrically similar.
ANSWER. Thanks to (A4), we can use proportionality. We recall in general that the surface area $S$ and
the volume $V$ of an object are proportional to $l^2$ and $l^3$ for any characteristic dimension $l$, i.e.,
\[ S \propto l^2, \quad V \propto l^3 \implies S \propto V^{2/3}. \]
Because weight $w$ and mass $m$ are proportional to volume, the transitive rule for proportionality gives
\[ S \propto V^{2/3} \quad \text{and} \quad V \propto m \implies S \propto m^{2/3}. \]
With this result and (A1), we have
\[ F_d \propto S v^2 \propto m^{2/3} v^2, \quad \text{i.e.,} \quad F_d \propto m^{2/3} v^2. \]
The assumptions (A2) and (A3) yield
\[ F_g \propto w \quad \text{and} \quad w \propto m \implies F_g \propto m. \]
Since $F_d \propto m^{2/3} v^2$ and $F_g \propto m$, there are some positive constants $k_1$ and $k_2$ such that
\[ F_d = k_1 m^{2/3} v^2, \qquad F_g = k_2 m. \]
The equation $F_g = F_d$ given in the problem implies
\[ k_2 m = k_1 m^{2/3} v^2 \implies m^{1/3} = \frac{k_1}{k_2} v^2 \implies m^{1/3} \propto v^2 \implies m^{1/6} \propto v. \]
Thus, the terminal velocity of the raindrop is proportional to its mass raised to the one-sixth power.
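The exponent 1/6 can be recovered numerically from the balance $k_2 m = k_1 m^{2/3} v^2$. In the Python sketch below the constants $k_1$ and $k_2$ and the two masses are hypothetical; only the balance equation itself comes from the example.

```python
import math

# Hypothetical positive constants of proportionality.
K1 = 0.5   # drag:    F_d = K1 * m**(2/3) * v**2
K2 = 9.8   # gravity: F_g = K2 * m


def terminal_velocity(m, k1=K1, k2=K2):
    """Solve k2 * m = k1 * m**(2/3) * v**2 for v > 0."""
    return math.sqrt(k2 * m ** (1.0 / 3.0) / k1)


# Slope of log v against log m between two masses estimates the exponent p in v ~ m**p.
m1, m2 = 1.0e-3, 8.0e-3
exponent = math.log(terminal_velocity(m2) / terminal_velocity(m1)) / math.log(m2 / m1)

print(f"estimated exponent = {exponent:.6f}  (1/6 = {1 / 6:.6f})")
```

The estimate does not depend on the particular constants, because they cancel in the ratio of the two velocities.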
Remark 2.3.5 (ASIDE: STOKES' LAW). Droplets falling in motionless air can be modeled by the differential equation
\[ \frac{d^2y}{dt^2} = 32.2 - \frac{c}{D^2}\frac{dy}{dt}, \]
where 32.2 is the gravitational acceleration, $c$ is a fixed constant, $D$ is the diameter of the spherical
raindrop, and $dy/dt$ is the velocity of the raindrop. So the terminal velocity can be obtained by solving
\[ 0 = \frac{d^2y}{dt^2} = 32.2 - \frac{c}{D^2}\frac{dy}{dt} \implies \frac{dy}{dt} = \frac{32.2}{c} D^2. \]
The diameter $D$ is proportional to the radius $r$ of the spherical raindrop and the volume $V$ of the raindrop is
proportional to $r^3$, so we can deduce
\[ V \propto D^3 \implies D^2 \propto V^{2/3}. \]
Since the mass $m$ is proportional to the volume $V$, the result above becomes
\[ D^2 \propto V^{2/3} \quad \text{and} \quad V \propto m \implies D^2 \propto m^{2/3}. \]
Hence, the differential equation implies
\[ \frac{dy}{dt} \propto D^2 \propto m^{2/3}, \qquad \frac{dy}{dt} \propto m^{2/3}, \]
i.e., the terminal velocity is proportional to the mass raised to the (2/3) power. This is quite different
from the result we deduced in Example 2.3.4 above. Why? What happened? The two models assume
different resistance laws: in the differential equation the resistance term is proportional to the velocity $dy/dt$,
whereas assumption (A1) of Example 2.3.4 takes $F_d$ to be proportional to $S v^2$, the square of the speed.
A droplet falling according to the differential equation above never quite reaches its terminal velocity, but gets
closer and closer to it. Unless its fall is interrupted by hitting the ground, the velocity eventually becomes so
close to the equilibrium solution of the differential equation that, for practical purposes, we consider it equal to the terminal
velocity.
(From the book Concepts of Mathematical Modeling, by Walter J. Meyer.)
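The approach to the terminal velocity can be made concrete with the closed-form solution of the differential equation. The sketch below assumes the drop starts from rest, $v(0) = 0$, and uses hypothetical values for $c$ and $D$; neither assumption comes from the notes.

```python
import math

G = 32.2  # gravitational acceleration used in the differential equation


def terminal_velocity(c, D):
    """Equilibrium of v' = G - (c / D**2) * v."""
    return G * D ** 2 / c


def velocity(t, c, D):
    """Solution of v' = G - (c / D**2) * v with v(0) = 0 (drop starts at rest)."""
    rate = c / D ** 2
    return terminal_velocity(c, D) * (1.0 - math.exp(-rate * t))


c, D = 1.0, 0.5  # hypothetical constant and diameter
v_T = terminal_velocity(c, D)
for t in (0.5, 1.0, 2.0, 5.0):
    print(f"t = {t:3.1f}   v = {velocity(t, c, D):.8f}   terminal = {v_T:.8f}")
```

The velocity stays below the terminal value at every finite time while the difference decays exponentially, and doubling $D$ multiplies the terminal velocity by four, in line with $dy/dt \propto D^2$.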
Topic II. Testing Geometric Similarity.

Are a triangle and a rectangle geometrically similar? Clearly, they are not. Because the two polygons have
different numbers of vertices, there is no one-to-one correspondence of the kind required by Definition 2.3.1. When we use the
geometric similarity assumption, the objects involved should have the same shape.
Now let us fix a shape, for example, a circle. Then we can think of the radius, area, circumference, arc,
angle of the arc, length of the arc, and so on. We consider two circles of diameters $d_1$ and $d_2$, respectively.
Letting $C_1$ and $C_2$ be the circumferences of the circles, it is straightforward to see
\[ \frac{C_1}{d_1} = \frac{C_2}{d_2} = \pi \quad \text{and} \quad \frac{C_1}{C_2} = \frac{\pi d_1}{\pi d_2} = \frac{d_1}{d_2}. \tag{2.3.1} \]
We recall that the length $l$ of an arc (part of a circle) subtending the angle $\theta$ in a circle with radius $r$ is given by
$l = r\theta$. Letting $l_1$ and $l_2$ be the lengths of arcs in the circles above with the angles $\theta_1$ and $\theta_2$, respectively, we
have
\[ l_1 = r_1\theta_1 \quad \text{and} \quad l_2 = r_2\theta_2. \]
If $\theta_1 \neq \theta_2$, then there is no geometric similarity between those two arcs. So assuming $\theta_1 = \theta_2$, we deduce
\[ \frac{l_1}{l_2} = \frac{r_1\theta_1}{r_2\theta_2} = \frac{r_1}{r_2} \quad \text{and} \quad \frac{l_1}{l_2} = \frac{2r_1}{2r_2} = \frac{d_1}{d_2}. \tag{2.3.2} \]
Combining the two results above yields
\[ \frac{C_1}{C_2} = \frac{d_1}{d_2} = \frac{l_1}{l_2}, \quad \text{i.e.,} \quad \frac{C_1}{C_2} = \frac{l_1}{l_2}. \]
From those two equations, it is deduced that the ratio of distances between corresponding points around any
two circles is always the ratio of their diameters.
Example 2.3.6 (MODELING A BASS FISHING DERBY). A sport fishing club wishes to encourage its membership to release their fish immediately after catching them. The club also wishes to grant awards based on
the total weight of fish caught. It is suggested that each individual carry a small portable scale. Question:
How does someone fishing determine the weight of a fish he/she has caught?
ANSWER. Skip. Read the textbook.

2.4 Automobile Gasoline Mileage.
Read the textbook.
2.5 Body Weight and Height, Strength and Agility.
Read the textbook.
Chapter 3
Model Fitting
When analyzing a collection of data points, it is suggested to consider the following three tasks.
1. Fitting a selected model type or types to the data.
2. Choosing the most appropriate model from competing types that have been fitted. For example, we
may need to determine whether the best-fitting exponential model is a better model than the best-fitting
polynomial model.
3. Making predictions from the collected data.
In the first two tasks, we have a model or competing models that explain the observed behavior of the data.
These tasks are discussed in this chapter under model fitting. In the third case, no model explains the
observed behavior, so we try to construct an empirical model based on the collected data, which will be
studied in the following chapter.
Topic I. Relationship Between Model Fitting and Interpolation.
Consider the figure 3.1 of the collected data.
Figure 3.1: Observations relating the variables y and x
There are mainly two ways to approximate the given data.

1. Based on the shape of the data, we assume a model type and find the best model of that type. For
example, for the data in Figure 3.1, we assume a quadratic model and find a best-fitting parabola
$y = ax^2 + bx + c$, as in Figure 3.2. In this way, we may explain the situation underlying the
data. Usually this approach is theory driven.
2. We can find a curve passing through all those points. Finding such a curve is called spline interpolation,
and it will be studied in the following chapter. In this way, we can capture the trend of the data
to predict in between the data points. Usually this approach is data driven. For the collected data,
Figure 3.3 shows the curve obtained by spline interpolation.
Figure 3.2: Fitting a parabola $y = ax^2 + bx + c$ to the data points
Figure 3.3: Interpolating the data using a smooth polynomial
Topic II. Sources of Error in the Modeling Process.
For purposes of easy reference, we classify errors under the following category scheme:
1. Formulation error: for instance, in the Example of Stopping Distance, we ignored the road friction for
the braking distance. Because of this omission, the model may be less effective.
2. Truncation error: for instance, when we compute the value of $\sin x$, we may use only $x - x^3/3! + x^5/5!$,
and because the other terms are truncated, the computation cannot be exact.
3. Round-off error: for instance, rigorously speaking, $0.333333333 - 1/3 \neq 0$. So if we use 0.333333333
in place of 1/3, then we introduce an error.
4. Measurement error: one can understand this error as human error. When we measure an object with the
naked eye, the result may not be as accurate as one measured by a machine.
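The truncation and round-off errors in items 2 and 3 can be measured directly. The following Python sketch is illustrative; the evaluation point $x = 1$ is an arbitrary choice and `sin_taylor` is a hypothetical helper name.

```python
import math


def sin_taylor(x, terms=3):
    """Partial sum x - x^3/3! + x^5/5! - ... with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


x = 1.0  # arbitrary evaluation point
truncation_error = abs(math.sin(x) - sin_taylor(x))

# Using the decimal 0.333333333 in place of 1/3.
roundoff_error = abs(0.333333333 - 1 / 3)

print(f"truncation error at x = {x}: {truncation_error:.3e}")
print(f"round-off error of 0.333333333 versus 1/3: {roundoff_error:.3e}")
```

For this alternating series the truncation error is bounded by the first omitted term, $x^7/7!$, so taking more terms shrinks it quickly for moderate $x$.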
3.1 Fitting Models to Data Graphically.
Topic I. Visual Model Fitting with the Original Data.
Figure 3.4: Minimizing the sum of the absolute deviation from the fitted line
Suppose we want to fit the model $y = ax + b$ to the data shown in Figure 3.4. The data points cannot all be expected
to lie exactly along a single straight line, so there will be some vertical discrepancy between some of the data
points and any particular line under consideration. These vertical discrepancies are called absolute deviations.
Based on the deviations, we may consider two criteria:

1. minimizing the sum of the absolute deviations from the fitted line, and
2. minimizing the largest absolute deviation from the fitted line.
For the best-fitting line, we might try to achieve the first one, minimizing the sum of the deviations. For the
second one, minimizing the largest deviation, see Figure 3.5.
Figure 3.5: Minimizing the largest absolute deviation from the fitted line
Topic II. Transforming the Data.

Suppose we have the following collected data.

Collected data      y       8.1    22.1    60.1    165
Transformed data    ln y    2.1    3.1     4.1     5.1

Since the data points are suspected to follow the form $y = ce^x$, taking the logarithm of each
side gives
\[ \ln y = x + \ln c, \]
which is a line in the $(x, \ln y)$-plane with slope 1 and $(\ln y)$-intercept $(x, \ln y) = (0, \ln c)$.
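The transformed row of the table can be reproduced with a few lines of Python. The sketch assumes that the four observations are taken at equally spaced $x$ values one unit apart; the $x$ values themselves are not shown in the table above.

```python
import math

y = [8.1, 22.1, 60.1, 165]            # collected data from the table
log_y = [math.log(v) for v in y]      # transformed data ln y

# If consecutive x values are one unit apart (an assumption), a model
# y = c * e**x predicts that ln y increases by about 1 at each step.
steps = [b - a for a, b in zip(log_y, log_y[1:])]

print([round(v, 1) for v in log_y])
print([round(s, 3) for s in steps])
```

Steps close to 1 support the slope-1 line $\ln y = x + \ln c$; a clearly different constant step would instead suggest $y = ce^{kx}$ with $k \neq 1$.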
3.2 Analytic Methods of Model Fitting.

Topic I. Chebyshev Approximation Criterion.
Goal: For the collection of $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, and a certain function $y = f(x)$ (given as
in the example below), we want to minimize the largest deviation between the data and the function, i.e.,
\[ \text{minimize} \quad \max_{i} |y_i - f(x_i)|, \qquad i = 1, 2, \dots, m. \tag{3.2.1} \]
In other words, if $f(x) = ax + b$, then we want to find $a$ and $b$ that minimize the maximum value in (3.2.1).
This criterion is often called the Chebyshev approximation criterion.
Rewriting Problem: Let
\[ r_i = y_i - f(x_i), \qquad i = 1, 2, \dots, m. \]
Then, we have
\[ r = \max_i |y_i - f(x_i)| = \max_i |r_i|, \qquad i = 1, 2, \dots, m, \]
and so $r$ is the largest absolute deviation and we want to minimize this $r$. Since $r$ is the maximum value of all the $|r_i|$,
$i = 1, 2, \dots, m$, it is easy to see
\[ |r_i| \le r \implies -r \le r_i \le r \implies 0 \le r - r_i \quad \text{and} \quad 0 \le r + r_i, \qquad i = 1, 2, \dots, m. \]
Thus, the whole problem can be rewritten as follows:
\[ \text{Minimize } r \quad \text{subject to} \quad 0 \le r - r_i \quad \text{and} \quad 0 \le r + r_i, \qquad i = 1, 2, \dots, m, \]
i.e., find the minimum value of $r$ under these constraints.
This kind of problem (finding a maximum/minimum value subject to constraints) is called an optimization problem; when, as here
with $f(x) = ax + b$, the objective and the constraints are linear in the unknowns, it is a linear program.
Strategy: Computer implementation of an algorithm known as the Simplex Method, which will be discussed in
Chapter 7. In the examples below, we will see how to minimize the largest deviation for a given function
via Mathematica.
Example 3.2.1. For the following data set, formulate the mathematical model that minimizes the largest
deviation between the data and the line $y = ax + b$. (We use Mathematica to estimate $a$ and $b$.)

x    1.0    2.3    3.7    4.2    6.1    7.0
y    3.6    3.0    3.2    5.1    5.3    6.8

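ANSWER. Let $r$ be the largest absolute deviation between the data and $f(x) = ax + b$. Then for the
absolute deviations $|y_i - f(x_i)|$, we have
\begin{align*}
|3.6 - f(1.0)| = |3.6 - 1.0a - b| \le r &\implies -r \le 3.6 - 1.0a - b \le r \implies 0 \le r - 1.0a - b + 3.6 \ \text{and} \ 0 \le r + 1.0a + b - 3.6, \\
|3.0 - f(2.3)| = |3.0 - 2.3a - b| \le r &\implies -r \le 3.0 - 2.3a - b \le r \implies 0 \le r - 2.3a - b + 3.0 \ \text{and} \ 0 \le r + 2.3a + b - 3.0, \\
|3.2 - f(3.7)| = |3.2 - 3.7a - b| \le r &\implies -r \le 3.2 - 3.7a - b \le r \implies 0 \le r - 3.7a - b + 3.2 \ \text{and} \ 0 \le r + 3.7a + b - 3.2, \\
|5.1 - f(4.2)| = |5.1 - 4.2a - b| \le r &\implies -r \le 5.1 - 4.2a - b \le r \implies 0 \le r - 4.2a - b + 5.1 \ \text{and} \ 0 \le r + 4.2a + b - 5.1, \\
|5.3 - f(6.1)| = |5.3 - 6.1a - b| \le r &\implies -r \le 5.3 - 6.1a - b \le r \implies 0 \le r - 6.1a - b + 5.3 \ \text{and} \ 0 \le r + 6.1a + b - 5.3, \\
|6.8 - f(7.0)| = |6.8 - 7.0a - b| \le r &\implies -r \le 6.8 - 7.0a - b \le r \implies 0 \le r - 7.0a - b + 6.8 \ \text{and} \ 0 \le r + 7.0a + b - 6.8.
\end{align*}
So the constraints are
\begin{align*}
0 &\le r - 1.0a - b + 3.6, & 0 &\le r + 1.0a + b - 3.6, \\
0 &\le r - 2.3a - b + 3.0, & 0 &\le r + 2.3a + b - 3.0, \\
0 &\le r - 3.7a - b + 3.2, & 0 &\le r + 3.7a + b - 3.2, \\
0 &\le r - 4.2a - b + 5.1, & 0 &\le r + 4.2a + b - 5.1, \\
0 &\le r - 6.1a - b + 5.3, & 0 &\le r + 6.1a + b - 5.3, \\
0 &\le r - 7.0a - b + 6.8, & 0 &\le r + 7.0a + b - 6.8.
\end{align*}
We want to find $a$ and $b$, i.e., $f(x) = ax + b$, minimizing the largest absolute deviation $r$ subject to the 12
constraints above.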
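By the computer, we obtain
\[ f(x) = 0.533333x + 2.14667, \]
and the largest absolute deviation is $r = 0.92$, which is attained at the data points $(1.0, 3.6)$, $(3.7, 3.2)$, and $(7.0, 6.8)$; for instance, $|6.8 - f(7.0)| = |6.8 - 5.88| = 0.92$.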
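The fitted line can be cross-checked without a linear-programming solver. The Python sketch below does not use the Simplex Method; it swaps in a brute-force search that relies on a standard property of the best Chebyshev line, namely that its largest deviation is attained at three data points with alternating signs. The function names are illustrative.

```python
from itertools import combinations

xs = [1.0, 2.3, 3.7, 4.2, 6.1, 7.0]
ys = [3.6, 3.0, 3.2, 5.1, 5.3, 6.8]


def largest_deviation(a, b, xs, ys):
    """Largest absolute deviation of the data from the line y = a x + b."""
    return max(abs(y - a * x - b) for x, y in zip(xs, ys))


def chebyshev_line(xs, ys):
    """Brute-force Chebyshev line; assumes xs is sorted with distinct values."""
    best = None
    for i, j, k in combinations(range(len(xs)), 3):
        # Line whose residuals at points i, j, k are +e, -e, +e.
        a = (ys[k] - ys[i]) / (xs[k] - xs[i])
        e = (ys[i] - ys[j] - a * (xs[i] - xs[j])) / 2.0
        b = ys[i] - a * xs[i] - e
        r = largest_deviation(a, b, xs, ys)
        if best is None or r < best[2]:
            best = (a, b, r)
    return best


a, b, r = chebyshev_line(xs, ys)
print(f"f(x) = {a:.6f} x + {b:.5f}, largest absolute deviation r = {r:.5f}")
```

With six data points there are only 20 triples to try, so the search is instant; for large data sets the linear-programming formulation above is the practical route.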
Topic II. Minimizing the Sum of the Absolute Deviations.

Consider given data $(x_i, y_i)$, $i = 1, 2, \dots, m$, and the model $y = f(x)$. Let $r_i = |y_i - f(x_i)|$. Then the sum
of the absolute deviations is $\sum_{i=1}^{m} r_i$. Let us consider the case of $m = 2$, so that we have only two absolute
deviations $r_1$ and $r_2$ and the sum is $r_1 + r_2$.

If we plot the points $(r_1, 0)$ and $(r_1 + r_2, 0)$ on the line, we observe that
minimizing the sum $r_1 + r_2$ can be interpreted as minimizing the length of the line segment formed by adding together
the numbers $r_i$.
To solve this optimization problem using calculus, the sum of the absolute deviations must be differentiable with respect
to the parameters so that its critical points can be found. However, since the absolute value
function is not differentiable at its corner, the calculus technique may not apply to the sum of the absolute
deviations. Because of this drawback, the following criterion is considered.
Topic III. Least-Squares Criterion.
Currently, the most frequently used curve-fitting criterion is the least-squares criterion. Consider given
data $(x_i, y_i)$, $i = 1, 2, \dots, m$, and the model $y = f(x)$. Let $r_i = |y_i - f(x_i)|$. Then the sum of the squares
of the absolute deviations is $\sum_{i=1}^{m} r_i^2$. Let us consider the case of $m = 3$. If we introduce the vector
$\mathbf{r} = \langle r_1, r_2, r_3 \rangle$ in three-dimensional space, we observe that the sum of the squares of the absolute deviations is in
fact
\[ \|\mathbf{r}\|^2 = \|\langle r_1, r_2, r_3 \rangle\|^2 = \sum_{i=1}^{m} r_i^2 \qquad (m = 3). \]
That is, we may interpret the least-squares criterion as minimizing the magnitude of the vector whose coordinates represent the absolute deviations between the observed and predicted values.
Topic IV. Relating the Criteria.
In the previous topics, we have discussed the geometric interpretations. Now let us compare the criteria
analytically.
Suppose $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, are given and the Chebyshev and least-squares criteria give the models
$y = f_C(x)$ and $y = f_L(x)$, respectively. Let
\[ c_i = |y_i - f_C(x_i)|, \qquad c_{\max} = \max\{c_i : i = 1, 2, \dots, m\}, \]
\[ d_i = |y_i - f_L(x_i)|, \qquad d_{\max} = \max\{d_i : i = 1, 2, \dots, m\}. \]
Then we observe
1. $c_{\max} \le d_{\max}$.
Proof. Because the parameters of the function $y = f_C(x)$ are determined so as to minimize the value
of $c_{\max}$, it is the minimal largest absolute deviation obtainable.
2. Letting
\[ D = \sqrt{\frac{\sum_{i=1}^{m} d_i^2}{m}}, \]
we have $D \le c_{\max} \le d_{\max}$.

Proof. Since $y = f_L(x)$ gives the minimal sum of the squares obtainable, we have
\[ \sum_{i=1}^{m} d_i^2 = d_1^2 + d_2^2 + \cdots + d_m^2 \le c_1^2 + c_2^2 + \cdots + c_m^2 \le c_{\max}^2 + c_{\max}^2 + \cdots + c_{\max}^2 = m c_{\max}^2 \]
\[ \implies \frac{\sum_{i=1}^{m} d_i^2}{m} \le c_{\max}^2 \implies D = \sqrt{\frac{\sum_{i=1}^{m} d_i^2}{m}} \le c_{\max}. \]
With observation 1, we have $D \le c_{\max} \le d_{\max}$.
Through an example, one can apply the criteria and compare the values $D$, $c_{\max}$ and $d_{\max}$, which will be
studied in Section 3.4, Choosing a Best Model.
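As a quick numerical illustration of $D \le c_{\max} \le d_{\max}$, the Python sketch below reuses the data of Example 3.2.1. The Chebyshev line is taken from the coefficients reported in that example, and the least-squares line is computed from the usual closed-form formulas for a straight-line fit; the helper names are illustrative.

```python
import math

xs = [1.0, 2.3, 3.7, 4.2, 6.1, 7.0]
ys = [3.6, 3.0, 3.2, 5.1, 5.3, 6.8]
m = len(xs)

# Chebyshev line: coefficients as reported in Example 3.2.1.
a_c, b_c = 0.533333, 2.14667


def least_squares_line(xs, ys):
    """Closed-form least-squares line y = a x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    den = n * sxx - sx * sx
    return (n * sxy - sx * sy) / den, (sxx * sy - sxy * sx) / den


a_l, b_l = least_squares_line(xs, ys)

c = [abs(y - a_c * x - b_c) for x, y in zip(xs, ys)]   # Chebyshev deviations
d = [abs(y - a_l * x - b_l) for x, y in zip(xs, ys)]   # least-squares deviations

c_max, d_max = max(c), max(d)
D = math.sqrt(sum(v * v for v in d) / m)

print(f"D = {D:.5f}, c_max = {c_max:.5f}, d_max = {d_max:.5f}")
```

The three numbers bracket the Chebyshev value from both sides, which is why $D$ and $d_{\max}$ are useful even when only the least-squares fit is available.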
3.3 Applying the Least-Squares Criterion.

In this section we study the least-squares criterion to estimate the parameters for several types of curves. We
discuss the topics analytically rather than graphically.
Topic I. Fitting a Straight Line.
Suppose a model of the form $y = ax + b$ is expected and $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, are given.
The least-squares criterion minimizes the sum of the squares of the absolute deviations, i.e., it minimizes $S$
defined by
\[ S = \sum_{i=1}^{m} (y_i - a x_i - b)^2. \tag{3.3.1} \]
Considering $S$ as a function of two independent variables $a$ and $b$, finding the minimum value of $S$ is the
problem of minimizing $S(a, b)$ in Calculus. From Calculus, we recall
1. A point $(a_0, b_0)$ is called a critical point of $S(a, b)$ if
(i) $(a_0, b_0)$ is in the domain of $S(a, b)$, and
(ii) either $S_a(a_0, b_0) = 0$ and $S_b(a_0, b_0) = 0$, or
(iii) one or both of $S_a(a_0, b_0)$ and $S_b(a_0, b_0)$ do not exist.
(Here $S_a$ means the partial derivative of $S(a, b)$ with respect to $a$.)
2. If $S(a, b)$ has a local extremum at $(a, b) = (a_0, b_0)$, then $(a, b) = (a_0, b_0)$ must be a critical point of $S(a, b)$.
(However, the converse is not generally true.)
3. (SECOND DERIVATIVE TEST) Suppose that $S(a, b)$ has continuous second-order partial derivatives in
some open disk containing the point $(a_0, b_0)$ and that $S_a(a_0, b_0) = 0 = S_b(a_0, b_0)$. For the discriminant
$D(a, b)$ for the point $(a, b)$ defined by
\[ D(a, b) = S_{aa}(a, b) S_{bb}(a, b) - [S_{ab}(a, b)]^2, \]
(i) if $D(a_0, b_0) > 0$ and $S_{aa}(a_0, b_0) > 0$, then $S$ has a local minimum at $(a_0, b_0)$,
(ii) if $D(a_0, b_0) > 0$ and $S_{aa}(a_0, b_0) < 0$, then $S$ has a local maximum at $(a_0, b_0)$,
(iii) if $D(a_0, b_0) < 0$, then $S$ has a saddle point at $(a_0, b_0)$,
(iv) if $D(a_0, b_0) = 0$, then no conclusion can be drawn.
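Let us find the local minimum of $S(a, b)$ defined in (3.3.1). To find the critical point of $S$, we compute
\[ 0 = \frac{\partial S}{\partial a} = -2\sum_{i=1}^{m} (y_i - a x_i - b)\, x_i = -2\sum_{i=1}^{m} \left( x_i y_i - a x_i^2 - b x_i \right) = -2\left[ \sum_{i=1}^{m} (x_i y_i) - a\sum_{i=1}^{m} x_i^2 - b\sum_{i=1}^{m} x_i \right], \]
\[ 0 = \frac{\partial S}{\partial b} = -2\sum_{i=1}^{m} (y_i - a x_i - b) = -2\left[ \sum_{i=1}^{m} y_i - a\sum_{i=1}^{m} x_i - b\sum_{i=1}^{m} 1 \right]. \]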
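For a simple computation, we introduce the vectors $\mathbf{x} = \langle x_1, x_2, \dots, x_m \rangle$, $\mathbf{y} = \langle y_1, y_2, \dots, y_m \rangle$ and $\mathbf{i} = \langle 1, 1, \dots, 1 \rangle$.
Then we observe
\begin{align*}
\sum_{i=1}^{m} x_i^2 &= x_1^2 + x_2^2 + \cdots + x_m^2 = \|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x}, \\
\sum_{i=1}^{m} x_i &= x_1(1) + x_2(1) + \cdots + x_m(1) = \mathbf{x} \cdot \mathbf{i}, \\
\sum_{i=1}^{m} y_i &= y_1(1) + y_2(1) + \cdots + y_m(1) = \mathbf{y} \cdot \mathbf{i}, \\
\sum_{i=1}^{m} (x_i y_i) &= x_1 y_1 + x_2 y_2 + \cdots + x_m y_m = \mathbf{x} \cdot \mathbf{y}, \\
m &= 1 + 1 + \cdots + 1 = \mathbf{i} \cdot \mathbf{i},
\end{align*}
where $\mathbf{x} \cdot \mathbf{y}$ is the dot product of the vectors $\mathbf{x}$ and $\mathbf{y}$. So the equations on $S_a$ and $S_b$ for the critical points
become
\begin{align*}
0 = \mathbf{x} \cdot \mathbf{y} - a\,\mathbf{x} \cdot \mathbf{x} - b\,\mathbf{x} \cdot \mathbf{i} &\implies a\,\mathbf{x} \cdot \mathbf{x} + b\,\mathbf{x} \cdot \mathbf{i} = \mathbf{x} \cdot \mathbf{y}, \tag{3.3.2} \\
0 = \mathbf{y} \cdot \mathbf{i} - a\,\mathbf{x} \cdot \mathbf{i} - b\,\mathbf{i} \cdot \mathbf{i} &\implies a\,\mathbf{x} \cdot \mathbf{i} + b\,\mathbf{i} \cdot \mathbf{i} = \mathbf{y} \cdot \mathbf{i}. \tag{3.3.3}
\end{align*}
The resulting two equations (3.3.2) and (3.3.3) are called the normal equations. Simply,
\[ \mathbf{x} \cdot (\mathbf{y} - a\mathbf{x} - b\mathbf{i}) = 0 \quad \text{and} \quad \mathbf{i} \cdot (\mathbf{y} - a\mathbf{x} - b\mathbf{i}) = 0. \]
(Be careful! In general, $\mathbf{u} \cdot \mathbf{v} = 0$ implies neither $\mathbf{u} = \mathbf{0}$ nor $\mathbf{v} = \mathbf{0}$. So we should not say $\mathbf{y} - a\mathbf{x} - b\mathbf{i} = \mathbf{0}$.)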
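To find the critical points of $S$, we should solve the normal equations (3.3.2) and (3.3.3) for $a$ and $b$.
(1) $(3.3.2) \times (\mathbf{i} \cdot \mathbf{i}) - (3.3.3) \times (\mathbf{x} \cdot \mathbf{i})$ implies
\[ a\left[ (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i}) \right] = (\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{y} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i}), \quad \text{i.e.,} \quad a = \frac{(\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2}. \]
(2) $(3.3.3) \times (\mathbf{x} \cdot \mathbf{x}) - (3.3.2) \times (\mathbf{x} \cdot \mathbf{i})$ implies
\[ b\left[ (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i}) \right] = (\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i}), \quad \text{i.e.,} \quad b = \frac{(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2}. \]
Thus, $S(a, b)$ defined in (3.3.1) has the critical point
\begin{align*}
(a_0, b_0) &= \left( \frac{(\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2} \right) \\
&= \left( \frac{(\mathbf{x} \cdot \mathbf{y})\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{\|\mathbf{x}\|^2 \|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{y} \cdot \mathbf{i})\|\mathbf{x}\|^2 - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{\|\mathbf{x}\|^2 \|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2} \right), \tag{3.3.4}
\end{align*}
which is clearly in the domain of $S(a, b)$, i.e., $\mathbb{R} \times \mathbb{R}$, under the assumption $(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2 \neq 0$. In fact,
the denominator $(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2$ cannot be zero. See (3.3.5) below.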
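Now we use the SECOND DERIVATIVE TEST to classify the local extremum at the critical point
$(a_0, b_0)$ just found.
\begin{align*}
S_a(a, b) &= -2\sum_{i=1}^{m} (y_i - a x_i - b)\, x_i = -2\sum_{i=1}^{m} \left( x_i y_i - a x_i^2 - b x_i \right), & S_{aa}(a, b) &= 2\sum_{i=1}^{m} x_i^2 = 2\,\mathbf{x} \cdot \mathbf{x}, \\
S_b(a, b) &= -2\sum_{i=1}^{m} (y_i - a x_i - b), & S_{bb}(a, b) &= 2\sum_{i=1}^{m} 1 = 2\,\mathbf{i} \cdot \mathbf{i}, \\
S_{ab}(a, b) &= 2\sum_{i=1}^{m} x_i = 2\,\mathbf{x} \cdot \mathbf{i}, & &
\end{align*}
\[ D(a, b) = S_{aa}(a, b) S_{bb}(a, b) - [S_{ab}(a, b)]^2 = 4(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - 4(\mathbf{x} \cdot \mathbf{i})^2 = 4\left[ (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2 \right]. \]
We observe that $D(a, b)/4 = (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2$ is the denominator in the critical point (3.3.4). It cannot be zero;
see (3.3.5) below.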
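As we can see, $S_{aa}$, $S_{bb}$ and $D(a, b)$ are all constants, because they do not involve the variables $a$ and $b$.
Moreover, the CAUCHY-SCHWARZ INEQUALITY in Calculus (personally, I do believe that it is one of THE MOST IMPORTANT inequalities in MATHEMATICS) says
\[ |\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\| \|\mathbf{v}\|, \]
which implies
\[ |\mathbf{u} \cdot \mathbf{v}|^2 \le \|\mathbf{u}\|^2 \|\mathbf{v}\|^2, \qquad 0 \le \|\mathbf{u}\|^2 \|\mathbf{v}\|^2 - |\mathbf{u} \cdot \mathbf{v}|^2, \qquad 0 \le (\mathbf{u} \cdot \mathbf{u})(\mathbf{v} \cdot \mathbf{v}) - (\mathbf{u} \cdot \mathbf{v})^2, \tag{3.3.5} \]
where the equality holds in the cases: Case 1. $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$, and Case 2. $\mathbf{u}$ and $\mathbf{v}$ are parallel, i.e., $\mathbf{u} = s\mathbf{v}$
for some scalar $s$.
Since $\mathbf{x}$ and $\mathbf{i}$ are not parallel (that is, the $x_i$ are not all equal), we deduce $D(a, b) = 4\left[ (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2 \right] > 0$ for all $(a, b)$ and also
$S_{aa}(a, b) = 2\,\mathbf{x} \cdot \mathbf{x} = 2\|\mathbf{x}\|^2 > 0$ for all $(a, b)$. Therefore, by the SECOND DERIVATIVE TEST, the function $S$
has a local minimum at the critical point $(a_0, b_0)$ found in (3.3.4).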
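Before we find the local minimum value of $S$, let us rewrite the function $S$ using the vectors:
\begin{align*}
S(a, b) &= \sum_{i=1}^{m} [y_i - a x_i - b]^2 = \sum_{i=1}^{m} \left[ y_i^2 + a^2 x_i^2 + b^2 - 2(a x_i y_i - ab x_i + b y_i) \right] \\
&= \sum_{i=1}^{m} y_i^2 + a^2 \sum_{i=1}^{m} x_i^2 + b^2 \sum_{i=1}^{m} 1^2 - 2\left( a\sum_{i=1}^{m} (x_i y_i) - ab\sum_{i=1}^{m} x_i + b\sum_{i=1}^{m} y_i \right) \\
&= \mathbf{y} \cdot \mathbf{y} + a^2\, \mathbf{x} \cdot \mathbf{x} + b^2\, \mathbf{i} \cdot \mathbf{i} - 2\left( a\,\mathbf{x} \cdot \mathbf{y} - ab\,\mathbf{x} \cdot \mathbf{i} + b\,\mathbf{y} \cdot \mathbf{i} \right) \\
&= (a\mathbf{x} + b\mathbf{i}) \cdot (a\mathbf{x} + b\mathbf{i}) - 2(a\mathbf{x} + b\mathbf{i}) \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y} \\
&= (a\mathbf{x} + b\mathbf{i} - \mathbf{y}) \cdot (a\mathbf{x} + b\mathbf{i} - \mathbf{y}) = \|\mathbf{y} - a\mathbf{x} - b\mathbf{i}\|^2,
\end{align*}
which is amazingly nice, because the function $S$ defined by the sum is rewritten as the squared norm of a vector
having the same form as in the sum definition (3.3.1).
Therefore, the local minimum value of $S$ is obtained by putting $(a, b) = (a_0, b_0)$ into the result:
\[ S(a_0, b_0) = \|\mathbf{y} - a_0\mathbf{x} - b_0\mathbf{i}\|^2. \tag{3.3.6} \]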
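Example 3.3.1. For the given data,

x    1    5     8
y    1    10    6

estimate the parameters of the fitting model $y = ax + b$ by using the least-squares criterion.
ANSWER. We use the formulas deduced above. Let $\mathbf{x} = \langle 1, 5, 8 \rangle$, $\mathbf{y} = \langle 1, 10, 6 \rangle$ and $\mathbf{i} = \langle 1, 1, 1 \rangle$. Then the
objective function $S$ is
\[ S(a, b) = \|\mathbf{y} - a\mathbf{x} - b\mathbf{i}\|^2, \]
and by the result (3.3.4), it has the following critical point $(a_0, b_0)$:
\begin{align*}
(a_0, b_0) &= \left( \frac{(\mathbf{x} \cdot \mathbf{y})\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{\|\mathbf{x}\|^2 \|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{y} \cdot \mathbf{i})\|\mathbf{x}\|^2 - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{\|\mathbf{x}\|^2 \|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2} \right) \\
&= \left( \frac{99(3) - 14(17)}{90(3) - 14^2},\ \frac{17(90) - 99(14)}{90(3) - 14^2} \right) = \left( \frac{59}{74},\ \frac{72}{37} \right) = (0.797297,\ 1.94595).
\end{align*}
Putting it into the formula (3.3.6) for the minimum value, we have the minimum value
\begin{align*}
S\left( \frac{59}{74}, \frac{72}{37} \right) &= \left\| \mathbf{y} - \frac{59}{74}\mathbf{x} - \frac{72}{37}\mathbf{i} \right\|^2 \\
&= \left\| \left\langle 1 - \frac{59}{74} - \frac{72}{37},\ 10 - \frac{5(59)}{74} - \frac{72}{37},\ 6 - \frac{8(59)}{74} - \frac{72}{37} \right\rangle \right\|^2 = \frac{1849}{74} = 24.9865.
\end{align*}
Thus, we conclude that the model $y = 0.797297x + 1.94595$ gives the minimum value of the sum of the
squares of the absolute deviations and the minimum value is 24.9865.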
Remark 3.3.2. 1. The minimum value 24.9865 is obtained analytically by the formula (3.3.6). When we use
the model $y = 0.797297x + 1.94595$ and the data, we can make the following table,

x_i                                 1           5          8
y_i                                 1           10         6
y_i - 0.797297 x_i - 1.94595       -1.74324     4.06757    -2.32432

and the sum of the squares of all deviations becomes
\[ \sum_{i=1}^{3} (y_i - 0.797297x_i - 1.94595)^2 = (-1.74324)^2 + (4.06757)^2 + (-2.32432)^2 = 24.9865, \]
which is exactly the same as the one obtained by the formula (3.3.6).

2. When we use Mathematica, we obtain the result shown in Figure 3.6, which agrees with the computation
above.
Figure 3.6: Mathematica Results
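The same numbers can be reproduced with a short script that follows the dot-product formulas (3.3.4) and (3.3.6). This is a minimal Python sketch, not the Mathematica session of Figure 3.6; the helper names `dot`, `fit_line` and `sum_of_squares` are illustrative.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))


def fit_line(x, y):
    """Least-squares line y = a x + b from the critical point formula (3.3.4)."""
    ones = [1.0] * len(x)
    den = dot(x, x) * dot(ones, ones) - dot(x, ones) ** 2
    a0 = (dot(x, y) * dot(ones, ones) - dot(x, ones) * dot(y, ones)) / den
    b0 = (dot(x, x) * dot(y, ones) - dot(x, y) * dot(x, ones)) / den
    return a0, b0


def sum_of_squares(x, y, a, b):
    """S(a, b) = || y - a x - b i ||^2, as in (3.3.6)."""
    residuals = [yi - a * xi - b for xi, yi in zip(x, y)]
    return dot(residuals, residuals)


x = [1.0, 5.0, 8.0]
y = [1.0, 10.0, 6.0]

a0, b0 = fit_line(x, y)
S_min = sum_of_squares(x, y, a0, b0)
print(f"a0 = {a0:.6f}, b0 = {b0:.5f}, S(a0, b0) = {S_min:.4f}")
```

Perturbing `a0` or `b0` slightly and re-evaluating `sum_of_squares` gives a larger value, which is a cheap sanity check that the critical point is a minimum.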
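Topic II. Fitting a Power Curve.

To the given $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, suppose we fit the model $y = ax^n$ by the least-squares
criterion, where $n$ is fixed and the parameter $a$ is to be determined. In this case, the sum $S$ of the squares of
the absolute deviations becomes
\[ S = \sum_{i=1}^{m} (y_i - a x_i^n)^2. \tag{3.3.7} \]
By an argument similar to the one in Topic I above, we introduce the vectors $\mathbf{x} = \langle x_1^n, x_2^n, \dots, x_m^n \rangle$ and
$\mathbf{y} = \langle y_1, y_2, \dots, y_m \rangle$. Then $S$ becomes a function of the one variable $a$ and its critical point is found by
\[ 0 = \frac{\partial S}{\partial a} = -2\sum_{i=1}^{m} (y_i - a x_i^n)\, x_i^n = -2\left[ \sum_{i=1}^{m} (x_i^n y_i) - a\sum_{i=1}^{m} x_i^{2n} \right] = -2\left[ \mathbf{x} \cdot \mathbf{y} - a\|\mathbf{x}\|^2 \right] \implies a = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2}. \]
That is, $S$ has the critical point
\[ a_0 = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2}. \]
Using the vectors, the function $S$ in (3.3.7) becomes
\begin{align*}
S &= \sum_{i=1}^{m} (y_i - a x_i^n)^2 = \sum_{i=1}^{m} \left( y_i^2 + a^2 x_i^{2n} - 2a x_i^n y_i \right) \\
&= \sum_{i=1}^{m} y_i^2 + a^2 \sum_{i=1}^{m} x_i^{2n} - 2a\sum_{i=1}^{m} (x_i^n y_i) = \mathbf{y} \cdot \mathbf{y} + a^2\, \mathbf{x} \cdot \mathbf{x} - 2a\, \mathbf{x} \cdot \mathbf{y} = \|\mathbf{y} - a\mathbf{x}\|^2.
\end{align*}
Putting the critical point $a = a_0$ into the resulting function $S(a)$, we have the minimum value
\[ S(a_0) = \|\mathbf{y} - a_0\mathbf{x}\|^2. \]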
Example 3.3.3 (LEAST-SQUARES WITH FIXED POWER n = 2). For the given data,

x    0.5    1.0    1.5    2.0     2.5
y    0.7    3.4    7.2    12.4    20.1

estimate the parameter of the fitting model $y = ax^2$ by using the least-squares criterion.
ANSWER. Let $\mathbf{x} = \langle 0.5^2, 1.0^2, 1.5^2, 2.0^2, 2.5^2 \rangle$ and $\mathbf{y} = \langle 0.7, 3.4, 7.2, 12.4, 20.1 \rangle$. Then the objective function
$S$ is
\[ S(a) = \|\mathbf{y} - a\mathbf{x}\|^2, \]
and it has the following critical point $a = a_0$:
\[ a_0 = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2} = 3.18693. \]
Putting it into the formula, we have the minimum value
\[ S(3.18693) = \|\mathbf{y} - 3.18693\,\mathbf{x}\|^2 = 0.20954. \]
Thus, we conclude that the model $y = 3.18693x^2$ gives the minimum value of the sum of the squares of the
absolute deviations and the minimum value is 0.20954.
Remark 3.3.4. Check with Mathematica: See the figure 3.7.
Figure 3.7: Mathematica Results
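For readers without Mathematica, the fixed-power fit of Example 3.3.3 can be checked with the formula $a_0 = \mathbf{x} \cdot \mathbf{y} / \|\mathbf{x}\|^2$. The following Python sketch is illustrative; `fit_power_coefficient` is a hypothetical helper name.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))


def fit_power_coefficient(x, y, n):
    """Least-squares a for y = a * x**n with the exponent n held fixed."""
    xn = [xi ** n for xi in x]          # the vector <x_1^n, ..., x_m^n>
    return dot(xn, y) / dot(xn, xn)


x = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [0.7, 3.4, 7.2, 12.4, 20.1]

a0 = fit_power_coefficient(x, y, n=2)
residuals = [yi - a0 * xi ** 2 for xi, yi in zip(x, y)]
S_min = dot(residuals, residuals)

print(f"a0 = {a0:.5f}, S(a0) = {S_min:.5f}")
```

Calling `fit_power_coefficient` with another fixed exponent, for example `n=3`, shows how the minimum sum of squares changes with the assumed power.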
Topic III. Transformed Least-Squares Fit.

When the data can be approximated by a linear function $y = ax + b$ (Topic I) or a power curve with fixed exponent,
$y = ax^n$ with $n$ fixed (Topic II), it is not difficult to estimate the parameters. In this topic, we consider
the fitting model $y = ax^n$, where both $a$ and $n$ are parameters to be determined.
If we apply the least-squares criterion on $y = ax^n$ directly to the $m$ data points, we have to differentiate $\sum_{i=1}^{m} (y_i - a x_i^n)^2$
with respect to $a$ and $n$ to get the critical point. However, it is not easy to get the critical point of such a
function. (One may try to find the critical point!)
The strategy to estimate the parameters of the model $y = ax^n$ is to use the substitution, or simply
transformation of the data, $Y = \ln y$ and $X = \ln x$. Taking the natural logarithm of the data
$(x, y)$, we have the transformed data $(X, Y) = (\ln x, \ln y)$ and the fitting model becomes
\[ \ln y = \ln(ax^n) = \ln a + \ln x^n = \ln a + n\ln x \implies Y = nX + A, \qquad (A = \ln a) \tag{3.3.8} \]
which is linear in terms of $X$ and $Y$ and whose parameters $n$ and $A = \ln a$ can be estimated by the technique
discussed in Topic I above. The reason why we transform the data and the model by the natural logarithm
lies mainly in the properties of the logarithmic function: explicitly, we can pull down the exponent
so that the power model $y = ax^n$ becomes the linear one $Y = nX + \ln a$. Moreover,
since the logarithmic function is a one-to-one correspondence (a bijection) between the positive numbers and the real numbers,
the transformation loses no information contained in the original data. Note, however, that the deviations being
minimized are those of the transformed data $(X, Y)$, not those of the original data $(x, y)$.
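Now let us estimate $a$ and $n$ in the power model $y = ax^n$. We apply the least-squares criterion developed in Topic I to the transformed model (3.3.8). Then the objective function $S$ on the transformed data $(X, Y) = (\ln x, \ln y)$ has the independent variables $n$ and $A$, i.e.,
\[
S(n, A) = \sum_{i=1}^{m} (\ln y_i - n \ln x_i - \ln a)^2 = \sum_{i=1}^{m} (Y_i - nX_i - A)^2,
\]
where $X_i = \ln x_i$ and $Y_i = \ln y_i$. As we did in Topic I, let us introduce vectors
\[
\mathbf{X} = \langle X_1, X_2, \ldots, X_m \rangle = \langle \ln x_1, \ln x_2, \ldots, \ln x_m \rangle,
\]
\[
\mathbf{Y} = \langle Y_1, Y_2, \ldots, Y_m \rangle = \langle \ln y_1, \ln y_2, \ldots, \ln y_m \rangle,
\]
\[
\mathbf{i} = \langle 1, 1, \ldots, 1 \rangle.
\]
Then the objective function $S(n, A)$ becomes
\[
S(n, A) = \|\mathbf{Y} - n\mathbf{X} - A\mathbf{i}\|^2.
\]
(See the argument preceding the result (3.3.6).) It has the normal equations
\[
n\,\mathbf{X} \cdot \mathbf{X} + A\,\mathbf{X} \cdot \mathbf{i} = \mathbf{X} \cdot \mathbf{Y}
\quad\text{and}\quad
n\,\mathbf{X} \cdot \mathbf{i} + A\,\mathbf{i} \cdot \mathbf{i} = \mathbf{Y} \cdot \mathbf{i}.
\]
(See the equations (3.3.2) and (3.3.3).) Solving the equations for $n$ and $A$, we find the critical point $(n, A) = (n_0, A_0)$,
\[
(n_0, A_0) = \left(
\frac{(\mathbf{X} \cdot \mathbf{Y})\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})(\mathbf{Y} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2},\;
\frac{(\mathbf{Y} \cdot \mathbf{i})\|\mathbf{X}\|^2 - (\mathbf{X} \cdot \mathbf{Y})(\mathbf{X} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2}
\right). \tag{3.3.9}
\]
(See the result (3.3.4).) Thus $S(n, A)$ has the minimum value
\[
S(n_0, A_0) = \|\mathbf{Y} - n_0\mathbf{X} - A_0\mathbf{i}\|^2. \tag{3.3.10}
\]
Since we now know the critical point $(n_0, A_0)$, the parameters of the power model $y = ax^n$ are obtained by
\[
n = n_0 \quad\text{and}\quad a = e^{A_0}, \quad\text{i.e.,}\quad y = e^{A_0} x^{n_0}.
\]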
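The formulas (3.3.9) and (3.3.10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dot` and `fit_power_transformed` are hypothetical, and the sample data are made up (generated from $y = 3x^2$ without noise) so that the expected answer is known in advance.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def fit_power_transformed(xs, ys):
    """Fit y = a * x**n by least squares on the transformed data (ln x, ln y).

    Returns (n0, A0, S0): the slope n0, the intercept A0 = ln a, and the
    minimum sum of squared deviations of the transformed data.
    Every x and y must be positive.
    """
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    ones = [1.0] * len(X)  # the vector i = <1, 1, ..., 1>
    denom = dot(X, X) * dot(ones, ones) - dot(X, ones) ** 2
    n0 = (dot(X, Y) * dot(ones, ones) - dot(X, ones) * dot(Y, ones)) / denom
    A0 = (dot(Y, ones) * dot(X, X) - dot(X, Y) * dot(X, ones)) / denom
    S0 = sum((Yi - n0 * Xi - A0) ** 2 for Xi, Yi in zip(X, Y))
    return n0, A0, S0

# Hypothetical noise-free data generated from y = 3 x^2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x ** 2 for x in xs]
n0, A0, S0 = fit_power_transformed(xs, ys)
a = math.exp(A0)  # recover a = e^{A0}
```

For noise-free data the fit should recover the generating parameters up to rounding error, and the minimum value $S_0$ should be essentially zero.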
Example 3.3.5 (TRANSFORMED LEAST-SQUARES WITH UNFIXED POWER n (SAME DATA AS IN EXAMPLE 3.3.3)). For the given data,

x    0.5    1.0    1.5    2.0    2.5
y    0.7    3.4    7.2    12.4   20.1

estimate the parameters of the fitting model $y = ax^n$ by using the Transformed Least-Squares criterion. (Here we have two parameters $a$ and $n$ to estimate.)
ANSWER. Since the model is a power function, we transform the data and the model by the logarithmic function.

x           0.5         1.0        1.5        2.0        2.5
y           0.7         3.4        7.2        12.4       20.1
X = ln x    -0.693147   0          0.405465   0.693147   0.916291
Y = ln y    -0.356675   1.22378    1.97408    2.5177     3.00072

Then the model becomes
\[
\ln y = n \ln x + \ln a, \quad\text{i.e.,}\quad Y = nX + A,
\]
by setting $X = \ln x$, $Y = \ln y$ and $A = \ln a$. Using the results deduced above, the objective function of the transformed data becomes
\[
S(n, A) = \|\mathbf{Y} - n\mathbf{X} - A\mathbf{i}\|^2,
\]
where $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{i}$ are vectors as in the argument above. By the formulas (3.3.9) and (3.3.10), the function $S$ has the critical point $(n_0, A_0)$ and the minimum value $S(n_0, A_0)$:
\[
(n_0, A_0) = \left(
\frac{(\mathbf{X} \cdot \mathbf{Y})\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})(\mathbf{Y} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2},\;
\frac{(\mathbf{Y} \cdot \mathbf{i})\|\mathbf{X}\|^2 - (\mathbf{X} \cdot \mathbf{Y})(\mathbf{X} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2}
\right) = (2.06281, 1.12661),
\]
\[
S(n_0, A_0) = \|\mathbf{Y} - n_0\mathbf{X} - A_0\mathbf{i}\|^2 = \|\mathbf{Y} - 2.06281\mathbf{X} - 1.12661\mathbf{i}\|^2 = 0.014179.
\]
Therefore, we deduce the fitting model $y = e^{A_0} x^{n_0} = e^{1.12661} x^{2.06281} = 3.08519x^{2.06281}$ with the minimum value 0.014179 of the sum of the squares of the deviations of the transformed data.
Example 3.3.6 (TRANSFORMED LEAST-SQUARES WITH FIXED POWER n = 2 (SAME DATA AS IN EXAMPLE 3.3.3)). For the given data,

x    0.5    1.0    1.5    2.0    2.5
y    0.7    3.4    7.2    12.4   20.1

estimate the parameter of the fitting model $y = ax^2$ by using the Transformed Least-Squares criterion. (Here the power $n = 2$ is fixed, so we have only one parameter $a$ to estimate.)
ANSWER. We follow exactly the same argument as in the previous example. The model $y = ax^2$ becomes
\[
\ln y = 2 \ln x + \ln a, \quad\text{i.e.,}\quad Y = 2X + A.
\]
The objective function of the transformed data becomes
\[
S(A) = \|\mathbf{Y} - 2\mathbf{X} - A\mathbf{i}\|^2.
\]
The function $S(A)$ has the critical point $A = A_0$ and the minimum value $S(A_0)$:
\[
A_0 = \frac{\mathbf{Y} \cdot \mathbf{i} - 2(\mathbf{X} \cdot \mathbf{i})}{\|\mathbf{i}\|^2} = 1.14322,
\]
\[
S(A_0) = \|\mathbf{Y} - 2\mathbf{X} - A_0\mathbf{i}\|^2 = \|\mathbf{Y} - 2\mathbf{X} - 1.14322\mathbf{i}\|^2 = 0.0205521.
\]
Therefore, we deduce the fitting model $y = e^{A_0} x^2 = e^{1.14322} x^2 = 3.13684x^2$ with the minimum value 0.0205521 of the sum of the squares of the deviations of the transformed data.
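When the power is fixed, the critical point reduces to an average: $A_0$ is the mean of $Y_i - nX_i$. The following sketch checks this on the data of the example; the function name `fit_fixed_power_transformed` is a hypothetical one chosen for illustration.

```python
import math

def fit_fixed_power_transformed(xs, ys, n):
    """Fit y = a * x**n (n fixed) by least squares on (ln x, ln y).

    Returns (a, A0, S0) with A0 = ln a and S0 the minimum sum of squared
    deviations of the transformed data. Every x and y must be positive.
    """
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    m = len(X)
    # A0 = (Y.i - n (X.i)) / ||i||^2, i.e. the mean of Y_i - n X_i.
    A0 = (sum(Y) - n * sum(X)) / m
    S0 = sum((Yi - n * Xi - A0) ** 2 for Xi, Yi in zip(X, Y))
    return math.exp(A0), A0, S0

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]
a, A0, S0 = fit_fixed_power_transformed(xs, ys, 2)
```

Up to rounding, the values of `A0`, `a` and `S0` should agree with 1.14322, 3.13684 and 0.0205521 from the example.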
Remark 3.3.7. For the data given in Example 3.3.3, we have tested three models with different criteria:
1. $y = ax^2$ with Least-Squares (Example 3.3.3): We have deduced $y = 3.18693x^2$ with the minimum value 0.20954.
2. $y = ax^n$ with Transformed Least-Squares (Example 3.3.5): We have deduced $y = 3.08519x^{2.06281}$ with the minimum value 0.014179.
3. $y = ax^2$ with Transformed Least-Squares (Example 3.3.6): We have deduced $y = 3.13684x^2$ with the minimum value 0.0205521.
The minimum values in 2 and 3 are measured on the transformed data, so they cannot be compared directly with the one in 1. Let us make a table of the deviations on the original data and compare them.

$x_i$                                 0.5          1.0       1.5         2.0         2.5
$y_i$                                 0.7          3.4       7.2         12.4        20.1
1. $y_i - 3.18693x_i^2$               -0.0967325   0.21307   0.0294075   -0.34772    0.181688
2. $y_i - 3.08519x_i^{2.06281}$       -0.0384383   0.31481   0.0792666   -0.489902   -0.32474
3. $y_i - 3.13684x_i^2$               -0.08421     0.26316   0.14211     -0.14736    0.49475

Model                                 $\sum_{i=1}^{m} (y_i - y(x_i))^2$    $\max |y_i - y(x_i)|$
1. $y = 3.18693x^2$                   0.20954                              0.34772
2. $y = 3.08519x^{2.06281}$           0.452326                             0.489902
3. $y = 3.13684x^2$                   0.363032                             0.49475

For the given data, we observe that the first model $y = 3.18693x^2$ gives both the smallest maximum absolute deviation and the smallest sum of the squares of the deviations. So we may say that the best of the three models is the first one. We might have guessed this result before going through all the computations in those three examples. Simply speaking, when we use the transformation, the data are distorted and the transformed model also loses some information. Due to this loss, the transformed model may not be better than the one obtained from the original data. However, in a case such as the power model $y = ax^n$ with unfixed power $n$, we need to transform the data and the model anyway.
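The comparison in the remark can be reproduced with a short script. The sketch below takes the three fitted models as given (coefficients copied from the examples, rounded as printed) and measures each one on the original data; the helper name `deviation_summary` is hypothetical.

```python
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]

# The three fitted models, with coefficients as printed in the examples.
models = {
    "least-squares, y = a x^2": lambda x: 3.18693 * x ** 2,
    "transformed, y = a x^n": lambda x: 3.08519 * x ** 2.06281,
    "transformed, y = a x^2": lambda x: 3.13684 * x ** 2,
}

def deviation_summary(model):
    """Return (sum of squared deviations, largest absolute deviation)."""
    deviations = [y - model(x) for x, y in zip(xs, ys)]
    return sum(d * d for d in deviations), max(abs(d) for d in deviations)

summary = {name: deviation_summary(f) for name, f in models.items()}
for name, (sum_sq, max_abs) in summary.items():
    print(f"{name:28s} sum of squares = {sum_sq:.6f}  max |dev| = {max_abs:.6f}")
```

Measured this way, on the original data rather than on the transformed data, the untransformed least-squares fit has the smallest value under both measures.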
We end this section by quoting one paragraph from the textbook. The preceding examples illustrate two facts.
1. If an equation can be transformed to yield an equation of a straight line (Topic I) in the transformed variables, equations (3.3.2) and (3.3.3) can be used directly to solve for the slope and intercept of the transformed graph.
2. The least-squares best fit to the transformed equations does not coincide with the least-squares best fit of the original equations. The reason for this discrepancy is that the resulting optimization problems are different. In the case of the original problem, we are finding the curve that minimizes the sum of the squares of the deviations using the original data, whereas in the case of the transformed problem we are minimizing the sum of the squares of the deviations using the transformed variables.
3.4 Choosing a Best Model.

We start with the example from the previous section. The data in Table 3.1 are approximated by the model $y = ax^2$.

x    0.5    1.0    1.5    2.0    2.5
y    0.7    3.4    7.2    12.4   20.1
Table 3.1: Collected Data

Least-Squares Criterion: We have $y = 3.1869x^2$.
Transformed Least-Squares Fit: We have $y = 3.1368x^2$.
Chebyshev Criterion: We have $y = 3.17073x^2$.
We can summarize the deviations for each method as follows:
$x_i$    $y_i$    $y_i - 3.1869x_i^2$    $y_i - 3.1368x_i^2$    $y_i - 3.17073x_i^2$
0.5      0.7      -0.0967                -0.0842                -0.0927
1.0      3.4      0.2131                 0.2632                 0.2293
1.5      7.2      0.029475               0.1422                 0.0659
2.0      12.4     -0.3476                -0.1472                -0.2829
2.5      20.1     0.181875               0.4950                 0.28293
Table 3.2: Summary of deviations for each model $y = ax^2$

Criterion                    Model               $\sum [y_i - y(x_i)]^2$    $\max |y_i - y(x_i)|$
Least-Squares                $y = 3.1869x^2$     0.2095                     0.3476
Transformed Least-Squares    $y = 3.1368x^2$     0.3633                     0.4950
Chebyshev                    $y = 3.17073x^2$    0.2256                     0.28293
Table 3.3: Summary of results for the three models

Which model is best? The model obtained by the least-squares criterion is better than the one obtained by the Chebyshev criterion in the sense that the former gives a smaller sum of the squares of the deviations. However, if the purpose is to minimize the largest absolute deviation, the model obtained by the Chebyshev criterion is better. Thus, the best criterion depends on the case, judged by the purpose of the modeling and so on.
Even if the sum of the squares of the deviations is small, we need to be careful before we jump to a conclusion about the trend of the data, because we may be misled into a wrong prediction. See Figure 3.15 on page 121 in the textbook. In each case shown there, the model $y = x$ has the same sum of squared deviations. However, as we can see, the cases give significantly different predictions of the trend of the data. In order to avoid being misled in this way, we need to plot the model and the data together and compare them.
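To see how the two criteria pull the coefficient of $y = ax^2$ in different directions, here is a small sketch. The least-squares coefficient uses the closed form $a = \sum x_i^2 y_i / \sum x_i^4$. For the Chebyshev coefficient the sketch swaps in a plain ternary search, which works because $\max_i |y_i - ax_i^2|$ is a convex function of the single parameter $a$; this is not necessarily the method used in the notes. The function names and the search bracket $[0, 10]$ are assumptions made for illustration.

```python
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]

def deviations(a):
    return [y - a * x ** 2 for x, y in zip(xs, ys)]

def sum_sq(a):
    """Sum of squared deviations for the model y = a x^2."""
    return sum(d * d for d in deviations(a))

def max_abs(a):
    """Largest absolute deviation for the model y = a x^2."""
    return max(abs(d) for d in deviations(a))

def least_squares_a():
    """Closed-form least-squares coefficient for y = a x^2."""
    return sum(x ** 2 * y for x, y in zip(xs, ys)) / sum(x ** 4 for x in xs)

def chebyshev_a(lo=0.0, hi=10.0, iterations=200):
    """Minimize max_abs(a) by ternary search on the assumed bracket [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if max_abs(m1) < max_abs(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a_ls = least_squares_a()
a_ch = chebyshev_a()
print(f"least squares: a = {a_ls:.5f}, sum sq = {sum_sq(a_ls):.4f}, max = {max_abs(a_ls):.4f}")
print(f"Chebyshev:     a = {a_ch:.5f}, sum sq = {sum_sq(a_ch):.4f}, max = {max_abs(a_ch):.4f}")
```

Each coefficient wins under its own criterion and loses under the other, which is the point of the comparison in Table 3.3.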
Example 3.4.1.
1. Find a model using the least-squares criterion either on the data or on the transformed data (as appropriate).
2. Compare your results with the graphical fits obtained in Problem Set 3.1 by computing the deviations, the maximum absolute deviation, and the sum of the squared deviations for each model.
3. Find a bound on $c_{\max}$ if the model was fit using the least-squares criterion.
Problem 3 in Problem Set 3.1. In the following data, $x$ is the diameter of a ponderosa pine in inches measured at breast height and $y$ is a measure of volume (the number of board feet divided by 10).
(1) Test the model $y = ax^b$ by plotting the transformed data.
(2) If the model seems reasonable, estimate the parameters $a$ and $b$ of the model graphically.
ANSWER TO QUESTIONS (1) AND (2) IN SECTION 3.1. Taking the natural logarithm of both sides of $y = ax^b$, we have
\[
\ln y = b \ln x + \ln a,
\]
whose graph in the $(\ln x, \ln y)$-plane is a line with slope $b$ and $(\ln y)$-intercept $(0, \ln a)$. A simple computation gives the transformed data $(\ln x, \ln y)$. Plotting them, we observe a line. Using the two points $(\ln 17, \ln 19)$ and $(\ln 41, \ln 294)$, we have the slope $b = 3.11139$ and the $(\ln y)$-intercept $(0, -5.8708)$, i.e., $a = e^{-5.8708} = 0.00282062$, and so we deduce
\[
y = 0.00282062x^{3.11139}.
\]
See Figure 3.8.
Figure 3.8: Data and model $y = 0.00282062x^{3.11139}$ by transforming data
ANSWER TO QUESTIONS 1, 2, AND 3 IN SECTION 3.4. For the model $y = ax^b$, we transform the data and apply the transformed least-squares fit to
\[
\ln y = b \ln x + \ln a,
\]
whose normal equations, with $m = 15$, have the solution
\[
b = \frac{m \sum_{i=1}^{m} \ln x_i \ln y_i - \sum_{i=1}^{m} \ln x_i \sum_{i=1}^{m} \ln y_i}{m \sum_{i=1}^{m} (\ln x_i)^2 - \left(\sum_{i=1}^{m} \ln x_i\right)^2},
\]
\[
\ln a = \frac{\sum_{i=1}^{m} (\ln x_i)^2 \sum_{i=1}^{m} \ln y_i - \sum_{i=1}^{m} \ln x_i \sum_{i=1}^{m} \ln x_i \ln y_i}{m \sum_{i=1}^{m} (\ln x_i)^2 - \left(\sum_{i=1}^{m} \ln x_i\right)^2}.
\]
By the equations, we deduce $b = 3.09187$ and $a = 0.00320603$ and
\[
y = 0.00320603x^{3.09187}.
\]
x    17   19   20   22   23   25   28    31    32    33    36    37    38    39    41
y    19   25   32   51   57   71   113   141   123   187   192   205   252   259   294

Figure 3.9: Data and model $y = 0.00320603x^{3.09187}$ by transformed least-squares fit

See Figure 3.9.
As we can observe from Table 3.5, in this problem the transformed least-squares model gives a better approximation than the one obtained graphically from the linearity of the transformed data: both its sum of squared deviations and its maximum absolute deviation are smaller.
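The two ponderosa pine models can be checked numerically. The sketch below refits the transformed least-squares model from the data with the summation formulas for $b$ and $\ln a$, takes the graphical model with the rounded parameters printed in the notes, and then measures both on the original data. The function names are hypothetical.

```python
import math

xs = [17, 19, 20, 22, 23, 25, 28, 31, 32, 33, 36, 37, 38, 39, 41]
ys = [19, 25, 32, 51, 57, 71, 113, 141, 123, 187, 192, 205, 252, 259, 294]

def fit_power_transformed(xs, ys):
    """Least-squares line through (ln x, ln y); returns (a, b) for y = a x^b."""
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    m = len(X)
    sx, sy = sum(X), sum(Y)
    sxx = sum(v * v for v in X)
    sxy = sum(u * v for u, v in zip(X, Y))
    denom = m * sxx - sx ** 2
    b = (m * sxy - sx * sy) / denom
    ln_a = (sxx * sy - sx * sxy) / denom
    return math.exp(ln_a), b

def summarize(a, b):
    """Return (sum of squared deviations, max absolute deviation) on the raw data."""
    deviations = [y - a * x ** b for x, y in zip(xs, ys)]
    return sum(d * d for d in deviations), max(abs(d) for d in deviations)

a_ls, b_ls = fit_power_transformed(xs, ys)
graphical = summarize(0.00282062, 3.11139)  # two-point fit, parameters as printed
least_sq = summarize(a_ls, b_ls)
print(f"a = {a_ls:.8f}, b = {b_ls:.5f}")
print(f"graphical:     sum sq = {graphical[0]:.2f}, max = {graphical[1]:.4f}")
print(f"least squares: sum sq = {least_sq[0]:.2f}, max = {least_sq[1]:.4f}")
```

Because the graphical model uses rounded parameters, the last digits of its summary values may differ slightly from the tabulated ones.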
$x_i$    $y_i$    $y_i - 0.0028x_i^{3.1114}$    $y_i - 0.0032x_i^{3.0919}$
17       19       0.0001                        -1.4341
19       25       -1.8563                       -3.8209
20       32       0.4967                        -1.7741
22       51       8.6215                        5.6514
23       57       8.3356                        4.9701
25       71       7.9215                        3.6688
28       113      23.2535                       17.4145
31       141      17.8165                       10.0625
32       123      -12.9732                      -21.4427
33       187      37.3648                       28.1397
36       192      -4.1592                       -15.8991
37       205      -8.6150                       -21.2786
38       252      19.9041                       6.2729
39       259      7.3673                        -7.2763
41       294      0.0022                        -16.8033
Table 3.4: Summary of the deviations for each model

Criterion                    Model                            $\sum [y_i - y(x_i)]^2$    $\max |y_i - y(x_i)|$
Transformed Linearity        $y = 0.00282062x^{3.11139}$      3174.81                    37.3648
Transformed Least-Squares    $y = 0.00320603x^{3.09187}$      2826.26                    28.1397
Table 3.5: Summary of results for the two models
Chapter 4
Chapter 7 Discrete Optimization Modeling
Section 7.4 Linear Programming III: Simplex Method.

PROBLEM: Use the simplex method to solve the following optimization problem.

Maximize    $3x_1 + x_2$
subject to  $2x_1 + x_2 \le 6$
            $x_1 + 3x_2 \le 9$
            $x_1, x_2 \ge 0$.                                   (Original Pr)
ANSWER. We introduce variables $y_1$ and $y_2$ and convert the problem as follows:

Maximize    $3x_1 + x_2$
subject to  $2x_1 + x_2 + y_1 = 6$
            $x_1 + 3x_2 + y_2 = 9$
            $x_1, x_2, y_1, y_2 \ge 0$,

which is called canonical slack maximization; the variables $y_1$ and $y_2$ are called slack variables.
Since $x_1 \ge 0$ and $x_2 \ge 0$ are required by the condition, the objective function satisfies
$-3x_1 - x_2 \le 0$,
and we introduce another slack variable $z$ so that
$-3x_1 - x_2 + z = 0$ and $z \ge 0$.
Hence, we can rewrite the original problem (Original Pr) as follows:

$2x_1 + x_2 + y_1 = 6$                                          (4.0.1)
$x_1 + 3x_2 + y_2 = 9$                                          (4.0.2)
$-3x_1 - x_2 + z = 0$                                           (4.0.3)
$x_1, x_2, y_1, y_2, z \ge 0$,

and we will find the largest value of $z$ and the pair $(x_1, x_2)$ which gives the largest value of $z$.
A note on history and terminology: the simplex algorithm was developed in the 1940s by George B. Dantzig, and we will employ certain refinements of Dantzig's original technique developed in the 1960s by A. W. Tucker. In everyday usage, slack means wanting in activity, or lacking in completeness, finish, or perfection; a tableau is a picture, painting, representation, illustration, or image; and a pivot is a shaft or pin on which something turns.
By collecting the coefficients of (4.0.1), (4.0.2) and (4.0.3), we record the problem in the so-called Tucker tableau.

 x1     x2     y1     y2   |  RHS
  2      1      1      0   |   6
  1      3      0      1   |   9
 -3     -1      0      0   |   0
Table 4.1: Original Tucker Tableau

In the objective function constraint (4.0.3), which is the bottom row of the tableau, we compare the absolute values of the coefficients. Since the coefficient of $x_1$ has the largest absolute value, 3, we choose $x_1$ as the entering variable.
Compute the ratio of the RHS to the entry in the column labeled $x_1$ for each constraint row to determine the minimum positive ratio.

 x1     x2     y1     y2   |  RHS  |  Ratio
  2      1      1      0   |   6   |  3 (= 6/2)
  1      3      0      1   |   9   |  9 (= 9/1)
 -3     -1      0      0   |   0   |
Tables 4.2 and 4.3: Entering Variable x1 and Exiting Variable y1

Choose $y_1$, corresponding to the minimum positive ratio 3, as the exiting variable.
Pivot. Divide the row containing the exiting variable (the first row in this case) by the coefficient of the entering variable in that row (the coefficient of $x_1$ in this case), giving a coefficient of 1 for the entering variable in this row. Then eliminate the entering variable $x_1$ from the remaining rows (which do not contain the exiting variable $y_1$ and have a zero coefficient for it).
Dividing the first row by 2 gives

 x1     x2     y1     y2   |  RHS
  1     1/2    1/2     0   |   3
  1      3      0      1   |   9
 -3     -1      0      0   |   0

and eliminating $x_1$ from the second and third rows gives

 x1       x2         y1        y2     |  RHS
  1       1/2        1/2        0     |   3
 1 - 1    3 - 1/2    0 - 1/2    1     |  9 - 3
 -3 + 3   -1 + 3/2   0 + 3/2    0     |  0 + 9

Simply,

 x1     x2     y1     y2   |  RHS
  1     1/2    1/2     0   |  3 (= x1)
  0     5/2   -1/2     1   |  6 (= y2)
  0     1/2    3/2     0   |  9 (= z)
Table 4.4: Tableau giving Extreme Point and Optimal Value

Since there are no negative coefficients in the bottom row, $x_1 = 3$ and $y_2 = 6$ (i.e., $x_2 = 0$ by (4.0.2)) give the extreme point $(x_1, x_2) = (3, 0)$, at which the optimal objective function value is $z = 9$.
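The pivot step can be sketched in Python with exact fractions, so that entries such as 1/2 and 5/2 appear as they do in the tableau. The function name `pivot` and the list-of-rows layout (the last entry of each row is the RHS) are choices made for this illustration only.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Return a new tableau after pivoting on the entry tableau[row][col].

    The pivot row is divided by the pivot entry, and a multiple of it is
    then subtracted from every other row so that column `col` becomes zero
    everywhere else.
    """
    T = [[Fraction(v) for v in r] for r in tableau]
    p = T[row][col]
    T[row] = [v / p for v in T[row]]
    for r in range(len(T)):
        if r != row:
            factor = T[r][col]
            T[r] = [v - factor * w for v, w in zip(T[r], T[row])]
    return T

# Table 4.1: columns x1, x2, y1, y2, RHS; the bottom row is the objective row.
original = [
    [2, 1, 1, 0, 6],
    [1, 3, 0, 1, 9],
    [-3, -1, 0, 0, 0],
]
# x1 enters (column 0) and y1 exits (row 0).
result = pivot(original, row=0, col=0)
for r in result:
    print([str(v) for v in r])
```

The printed rows should match Table 4.4, with the optimal value 9 in the lower right corner.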

Example 4.0.2. Solve the Carpenter's problem.

Maximize    $25x_1 + 30x_2$
subject to  $20x_1 + 30x_2 \le 690$
            $5x_1 + 4x_2 \le 120$
            $x_1, x_2 \ge 0$.

ANSWER. By introducing slack variables $y_1$, $y_2$ and $z$, we convert the problem as follows:

$20x_1 + 30x_2 + y_1 = 690$                                     (4.0.4)
$5x_1 + 4x_2 + y_2 = 120$                                       (4.0.5)
$-25x_1 - 30x_2 + z = 0$                                        (4.0.6)
$x_1, x_2, y_1, y_2, z \ge 0$.
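By collecting the coefficients, we record the problem in the Tucker tableau:

 x1     x2     y1     y2   |  RHS
 20     30      1      0   |  690
  5      4      0      1   |  120
-25    -30      0      0   |    0

In the bottom row, the coefficient of $x_2$ has the largest absolute value, 30, so we choose $x_2$ as the entering variable. The ratios of the RHS to the entries in the column labeled $x_2$ are as follows:

 x1     x2     y1     y2   |  RHS  |  Ratio
 20     30      1      0   |  690  |  23 (= 690/30)
  5      4      0      1   |  120  |  30 (= 120/4)
-25    -30      0      0   |    0  |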
Choose $y_1$, corresponding to the minimum positive ratio 23, as the exiting variable.
Pivot. Divide the row containing the exiting variable (the first row in this case) by the coefficient of the entering variable in that row (the coefficient 30 of $x_2$):

 x1     x2     y1      y2   |  RHS
 2/3     1     1/30     0   |  690/30
  5      4      0       1   |  120
-25    -30      0       0   |    0

Then eliminate $x_2$ from the remaining rows:

 x1              x2         y1           y2   |  RHS
 2/3             1          1/30         0    |  690/30
 5 - 8/3         4 - 4      0 - 4/30     1    |  120 - 4(690/30)
 -25 + 30(2/3)   -30 + 30   0 + 30/30    0    |  0 + 30(690/30)

Simply,

 x1     x2     y1      y2   |  RHS
 2/3     1     1/30     0   |   23
 7/3     0    -2/15     1   |   28
 -5      0      1       0   |  690
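Since we have a negative coefficient in the bottom row, we repeat the work by comparing the absolute values of the negative coefficients in the bottom row. The only negative coefficient is $-5$, in the column labeled $x_1$, so $x_1$ is the entering variable. The ratios are $23/(2/3) = 69/2$ for the first row and $28/(7/3) = 12$ for the second row, so $y_2$, corresponding to the minimum positive ratio 12, is the exiting variable.

 x1     x2     y1      y2   |  RHS
 2/3     1     1/30     0   |   23
 7/3     0    -2/15     1   |   28
 -5      0      1       0   |  690

Multiply the second row by 3/7:

 x1     x2     y1                       y2     |  RHS
 2/3     1     1/30                     0      |   23
  1      0    -2/35 = (-2/15)(3/7)      3/7    |  12 = 28(3/7)
 -5      0      1                       0      |  690

Then eliminate $x_1$ from the first and third rows:

 x1            x2     y1                      y2               |  RHS
 2/3 - 2/3     1      1/30 + (2/3)(2/35)      0 - (2/3)(3/7)   |  23 - (2/3)12
  1            0     -2/35                    3/7              |  12
 -5 + 5        0      1 - 5(2/35)             0 + 5(3/7)       |  690 + 5(12)

Simply,

 x1     x2     y1      y2    |  RHS
  0      1     1/14   -2/7   |  15 (= x2)
  1      0    -2/35    3/7   |  12 (= x1)
  0      0     5/7    15/7   |  750 (= z)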
Since there are no negative coefficients in the bottom row, by choosing $y_1 = 0 = y_2$ we get $x_1 = 12$ and $x_2 = 15$, which gives the extreme point $(x_1, x_2) = (12, 15)$ and the optimal objective function value $z = 750$.
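The repeated steps (choose the entering variable, run the ratio test, pivot, and stop when the bottom row has no negative coefficient) can be collected into one loop. The following is a minimal sketch, not a robust solver: the name `simplex_max` is hypothetical, it assumes every right-hand side is nonnegative, it takes the most negative bottom-row coefficient as the entering column (the largest absolute value among the negative coefficients), and it has no safeguard against cycling in degenerate problems.

```python
from fractions import Fraction

def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0.

    Returns (x, z): the optimal decision variables and objective value.
    """
    m, n = len(A), len(c)
    # Constraint rows with slack columns, then the objective row -c.x + z = 0.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # the slack variables start in the basis

    while True:
        bottom = T[-1][:-1]
        col = min(range(n + m), key=lambda j: bottom[j])  # entering variable
        if bottom[col] >= 0:
            break  # no negative coefficient left: the tableau is optimal
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("the objective is unbounded")
        _, row = min(ratios)  # exiting variable: minimum ratio
        pivot_entry = T[row][col]
        T[row] = [v / pivot_entry for v in T[row]]
        for r in range(m + 1):
            if r != row:
                factor = T[r][col]
                T[r] = [v - factor * w for v, w in zip(T[r], T[row])]
        basis[row] = col

    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, T[-1][-1]

# The Carpenter's problem of Example 4.0.2.
solution, value = simplex_max([25, 30], [[20, 30], [5, 4]], [690, 120])
print([str(v) for v in solution], value)
```

On the Carpenter's problem the loop performs the same two pivots as the tableaus above, entering $x_2$ first and then $x_1$.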
Example 4.0.3. An electrical firm manufactures circuit boards in two configurations, say configuration #1 and configuration #2. Each circuit board in configuration #1 requires 1 A component, 2 B components, and 2 C components; each circuit board in configuration #2 requires 2 A components, 2 B components, and 1 C component. The firm has 20 A components, 30 B components and 25 C components available. If the profit realized upon sale is $200 per circuit board in configuration #1 and $150 per circuit board in configuration #2, how many circuit boards of each configuration should the electrical firm manufacture so as to maximize profits?
ANSWER. Let $x_1$ and $x_2$ be respectively the number of circuit boards in configuration #1 and #2. Then the mathematical formulation of the problem is

Maximize    $200x_1 + 150x_2$
subject to  $x_1 + 2x_2 \le 20$
            $2x_1 + 2x_2 \le 30$
            $2x_1 + x_2 \le 25$
            $x_1, x_2 \ge 0$.
Using the slack variables $y_1$, $y_2$, $y_3$ and $z$, we can rewrite the original problem as follows:

$x_1 + 2x_2 + y_1 = 20$                                         (4.0.7)
$2x_1 + 2x_2 + y_2 = 30$                                        (4.0.8)
$2x_1 + x_2 + y_3 = 25$                                         (4.0.9)
$-200x_1 - 150x_2 + z = 0$                                      (4.0.10)
$x_1, x_2, y_1, y_2, y_3, z \ge 0$,

and we will find the largest value of $z$ and the pair $(x_1, x_2)$ which gives the largest value of $z$.
By collecting the coefficients, we record the problem in the Tucker tableau.
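  x1     x2     y1     y2     y3   |  RHS
   1      2      1      0      0   |   20
   2      2      0      1      0   |   30
   2      1      0      0      1   |   25
-200   -150      0      0      0   |    0

In the bottom row, the coefficient of $x_1$ has the largest absolute value, 200, so we choose $x_1$ as the entering variable. The ratios of the RHS to the entries in the column labeled $x_1$ are as follows:

  x1     x2     y1     y2     y3   |  RHS  |  Ratio
   1      2      1      0      0   |   20  |  20 (= 20/1)
   2      2      0      1      0   |   30  |  15 (= 30/2)
   2      1      0      0      1   |   25  |  25/2 (= 25/2)
-200   -150      0      0      0   |    0  |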
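Choose $y_3$, corresponding to the minimum positive ratio 25/2 = 12.5, as the exiting variable.
Pivot. Divide the row containing the exiting variable (the third row in this case) by the coefficient of the entering variable in that row (the coefficient 2 of $x_1$):

  x1     x2     y1     y2     y3    |  RHS
   1      2      1      0      0    |   20
   2      2      0      1      0    |   30
   1     1/2     0      0     1/2   |  25/2
-200   -150      0      0      0    |    0

Then eliminate $x_1$ from the remaining rows:

  x1           x2           y1    y2    y3        |  RHS
  1 - 1        2 - 1/2      1     0     0 - 1/2   |  20 - 25/2
  2 - 2        2 - 1        0     1     0 - 1     |  30 - 25
  1            1/2          0     0     1/2       |  25/2
  -200 + 200   -150 + 100   0     0     0 + 100   |  0 + (25/2)200

Simply,

  x1     x2     y1     y2     y3    |  RHS
   0     3/2     1      0    -1/2   |  15/2
   0      1      0      1     -1    |    5
   1     1/2     0      0     1/2   |  25/2
   0    -50      0      0    100    |  2500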
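Since the bottom row still has a negative coefficient, $-50$ in the column labeled $x_2$, the tableau is not yet optimal and we repeat the work with $x_2$ as the entering variable. The ratios are $(15/2)/(3/2) = 5$, $5/1 = 5$ and $(25/2)/(1/2) = 25$, so the minimum ratio 5 is attained in both the first and the second rows; pivoting in either of these rows leaves no negative coefficient in the bottom row and gives $x_2 = 5$, $x_1 = 25/2 - (1/2)5 = 10$ and $z = 2500 + 50(5) = 2750$. Thus the extreme point is $(x_1, x_2) = (10, 5)$, at which all three constraints (4.0.7), (4.0.8) and (4.0.9) hold with $y_1 = y_2 = y_3 = 0$, and the optimal objective function value is $z = 2750$. The firm should manufacture 10 circuit boards in configuration #1 and 5 in configuration #2.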
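As an independent check that does not use the simplex method, one can enumerate the extreme points directly: for a linear program with a bounded, nonempty feasible region, the maximum is attained at an intersection of two constraint boundaries. The sketch below does this for the circuit board problem with exact fractions; the function name `best_vertex` and the triple format `(p, q, r)` for a constraint $px_1 + qx_2 \le r$ are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each triple (p, q, r) encodes p*x1 + q*x2 <= r; the last two are x1 >= 0, x2 >= 0.
constraints = [(1, 2, 20), (2, 2, 30), (2, 1, 25), (-1, 0, 0), (0, -1, 0)]
profit = (200, 150)

def best_vertex(constraints, profit):
    """Return (value, x1, x2) at the feasible vertex with the largest profit."""
    best = None
    for (a1, a2, b1), (c1, c2, b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel boundary lines have no single intersection
        # Cramer's rule for a1*x1 + a2*x2 = b1 and c1*x1 + c2*x2 = b2.
        x1 = Fraction(b1 * c2 - a2 * b2, det)
        x2 = Fraction(a1 * b2 - b1 * c1, det)
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            value = profit[0] * x1 + profit[1] * x2
            if best is None or value > best[0]:
                best = (value, x1, x2)
    return best

value, x1, x2 = best_vertex(constraints, profit)
print(value, x1, x2)
```

Enumeration is practical only for very small problems, but it is a convenient cross-check of a hand-computed tableau.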
Chapter 5
Chapter 8 Dimensional Analysis
5.1 Introduction.
Read the textbook. Studied in class but the lecture note has not been typed.
5.2 Dimensions as Product.
Chapter 6
Chapter 10 Modeling with a Differential Equation
10.5 Numerical Approximation Method.

Chapter 7
Chapter 11 Modeling with Systems of Differential Equations
11.1 Graphical Solutions of Autonomous Systems of FirstOrder Differential Equations.
