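For linear approximation with given shape functions $\xi_i(\mathbf{x})$ (for example $\xi_1 = 1$, $\xi_2 = x$):
$\hat{y} = \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x})$
Difference (residual) between data and surrogate:
$r_j = y_j - \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x}_j)$, or $\mathbf{r} = \mathbf{y} - X\mathbf{b}$ with $X_{ji} = \xi_i(\mathbf{x}_j)$
Minimize square residual:
$\mathbf{r}^T \mathbf{r} = (\mathbf{y} - X\mathbf{b})^T (\mathbf{y} - X\mathbf{b})$
Differentiate to obtain:
$X^T X \mathbf{b} = X^T \mathbf{y}$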
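The normal equations above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the shape functions 1 and x and borrowing the four data points from the example later in this document; the variable names are illustrative only.

```python
import numpy as np

# Four data points, taken from the example later in this document.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

# Shape functions xi_1 = 1 and xi_2 = x, evaluated at every data point.
# Row j of X holds the shape functions at x_j.
X = np.column_stack([np.ones_like(x), x])

# Normal equations: (X^T X) b = X^T y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Residual vector r = y - X b and its sum of squares.
r = y - X @ b
print("b =", b)              # approximately [21.5, -5.0]
print("r^T r =", r @ r)      # approximately 41.0
```

At the solution the residual is orthogonal to the columns of X, which is just another way of reading $X^T X \mathbf{b} = X^T \mathbf{y}$.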
Basic equations
General form $\hat{y} = f(\mathbf{x}, \mathbf{b})$, e.g., $\hat{y} = b_1 x_1 + b_2 \sin(b_3 x_2)$
Residuals: $r_i = y_i - \hat{y}(\mathbf{x}_i, \mathbf{b})$
Rms error: $e_{rms}^2 = \frac{1}{n_y} \sum_{i=1}^{n_y} r_i^2$
Finding the coefficients requires the solution of an
optimization problem.
However, minimizing the sum of squares is a specialized
problem with specialized algorithms. Matlab's lsqnonlin, a
dedicated nonlinear least-squares solver, is very good.
Example Linear vs. Nonlinear Regression
y(1) = 20, y(2) = 7, y(3) = 5, and y(4) = 4.
Data suggests a rational function $\hat{y} = b_1 + \frac{b_2}{x - b_3}$
Compare to quadratic polynomial $\hat{y} = b_1 + b_2 x + b_3 x^2$
Both use three coefficients
Get
$\hat{y}_{rational} = 1.99 + \frac{6.99}{x - 0.61}$
$\hat{y}_{quadratic} = 36.5 - 20x + 3x^2$
[Figure: data, rational function fit, and polynomial fit, plotted for x from 1 to 4]
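The two fits can be reproduced with a short NumPy sketch. It assumes the rational model is written as b1 + b2/(x - b3), a sign convention that matches the reported coefficients, and it uses a plain Gauss-Newton iteration as a stand-in for a dedicated solver such as lsqnonlin. The starting guess and the helper names are assumptions of this sketch.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

# Quadratic polynomial: linear in b, so the normal equations solve it directly.
Xq = np.column_stack([np.ones_like(x), x, x**2])
b_quad = np.linalg.solve(Xq.T @ Xq, Xq.T @ y)

def rational(x, b):
    """Rational model b1 + b2 / (x - b3) (sign convention assumed here)."""
    return b[0] + b[1] / (x - b[2])

def jacobian(x, b):
    """Partial derivatives of the rational model with respect to b1, b2, b3."""
    u = 1.0 / (x - b[2])
    return np.column_stack([np.ones_like(x), u, b[1] * u**2])

# Gauss-Newton iteration: repeatedly solve the linearized least-squares problem.
b_rat = np.array([2.0, 7.0, 0.6])    # starting guess (an assumption)
for _ in range(50):
    r = y - rational(x, b_rat)
    J = jacobian(x, b_rat)
    db = np.linalg.solve(J.T @ J, J.T @ r)
    b_rat = b_rat + db
    if np.linalg.norm(db) < 1e-10:
        break

def rms(r):
    """Root-mean-square of a residual vector."""
    return np.sqrt(np.mean(r**2))

e_quad = rms(y - Xq @ b_quad)
e_rat = rms(y - rational(x, b_rat))
print("quadratic:", b_quad, "rms error:", e_quad)
print("rational: ", b_rat, "rms error:", e_rat)
```

Plain Gauss-Newton has no step control, so it can fail from a poor starting guess. A production solver adds safeguards such as trust regions or damping.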
Estimating uncertainty in coefficients
Brute force approach, generate noise in data
and repeat multiple times
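Alternatively linearize about optimum set of
coefficients b*:
$r_i = y_i - \hat{y}(\mathbf{x}_i, \mathbf{b}^*) - \sum_{j=1}^{n_b} \frac{\partial \hat{y}}{\partial b_j} \Delta b_j$
Now perform linear regression with $\Delta\mathbf{b}$ as the unknown coefficients.
Provides improvement to solution
Provides estimate of uncertainty in $\Delta\mathbf{b}$, which is
an estimate for the uncertainty in $\mathbf{b}$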
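The brute-force approach can be sketched as follows. To keep each refit a direct solve, this illustration uses a hypothetical straight-line model rather than a nonlinear one. The true coefficients, noise level, sample points, and number of repetitions are all assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed so the run is repeatable

# Hypothetical setup: a straight-line "true" model sampled at 11 points.
x = np.linspace(1.0, 4.0, 11)
X = np.column_stack([np.ones_like(x), x])
b_true = np.array([2.0, 0.5])
sigma = 1.0                           # noise standard deviation
n_rep = 2000                          # number of noise realizations

# Refit the model for every noise realization and store the coefficients.
fits = np.empty((n_rep, 2))
for k in range(n_rep):
    y_noisy = X @ b_true + rng.normal(0.0, sigma, size=x.size)
    fits[k] = np.linalg.solve(X.T @ X, X.T @ y_noisy)

# The scatter of the fitted coefficients estimates their uncertainty.
b_mean = fits.mean(axis=0)
b_std = fits.std(axis=0, ddof=1)
print("mean of fitted coefficients:", b_mean)
print("std of fitted coefficients: ", b_std)
```

For a nonlinear model the loop is the same, except that each refit is itself an iterative optimization. That cost is what makes the linearized estimate attractive.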
Model based error for linear regression
The common assumptions for linear regression
Surrogate is in functional form of true function
The data is contaminated with normally distributed
error with the same standard deviation at every point.
The errors at different points are not correlated.
Under these assumptions, the noise standard
deviation (called standard error) is estimated as
$\hat{\sigma}^2 = \frac{\mathbf{r}^T \mathbf{r}}{n_y - n_b}$
Similarly, the standard error in the coefficients is
$se(b_i) = \hat{\sigma} \sqrt{\left[ (X^T X)^{-1} \right]_{ii}}$
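As a worked illustration of these two formulas, the sketch below applies them to the quadratic polynomial fit of the four data points from the earlier example. Variable names are illustrative. With four points and three coefficients there is only one degree of freedom, so the estimates are rough.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

# Quadratic polynomial: shape functions 1, x, x^2.
X = np.column_stack([np.ones_like(x), x, x**2])
n_y, n_b = X.shape

# Fit by the normal equations, then form the residual vector.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
r = y - X @ b

# Noise variance estimate: r^T r / (n_y - n_b).
sigma2_hat = (r @ r) / (n_y - n_b)

# Standard error of each coefficient: sigma_hat * sqrt([(X^T X)^-1]_ii).
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))

print("b =", b)                      # approximately [36.5, -20.0, 3.0]
print("sigma_hat^2 =", sigma2_hat)   # approximately 5.0
print("se =", se)
```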
Rational function example
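Linearize with respect to bs, for $\hat{y} = b_1 + \frac{b_2}{x - b_3}$:
$\hat{y} \approx \hat{y}(x, \mathbf{b}^*) + \Delta b_1 + \frac{\Delta b_2}{x - b_3^*} + \frac{b_2^* \Delta b_3}{(x - b_3^*)^2}$
Perform fit by linear regression, with $\mathbf{r} = \mathbf{y} - \hat{\mathbf{y}}(\mathbf{b}^*)$:
$X^T X \Delta\mathbf{b} = X^T \mathbf{r}$
$\Delta\mathbf{b}$ = 1.0e-007 * [0.1435 0.3685 0.1230]
Finally perform error analysis
$\hat{\sigma}^2 = \frac{\mathbf{r}^T \mathbf{r}}{n_y - n_b}$, $\hat{\sigma} = 0.102$
$se(b_i) = \hat{\sigma} \sqrt{\left[ (X^T X)^{-1} \right]_{ii}}$
Standard errors range from 4% to 10% of
the bs (1.99 6.99 0.612)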
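The linearized error analysis for the rational function can be sketched as below. It assumes the model form b1 + b2/(x - b3), starts from the rounded coefficients quoted above, and refines them with a few Gauss-Newton steps before computing the standard errors. Function and variable names are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

def model(x, b):
    """Rational model b1 + b2 / (x - b3) (sign convention assumed here)."""
    return b[0] + b[1] / (x - b[2])

def design_matrix(x, b):
    """Linearization: column j holds d(y_hat)/d(b_j) at each data point."""
    u = 1.0 / (x - b[2])
    return np.column_stack([np.ones_like(x), u, b[1] * u**2])

# Start from the rounded coefficients and refine with Gauss-Newton steps.
b = np.array([1.99, 6.99, 0.612])
for _ in range(20):
    X = design_matrix(x, b)
    r = y - model(x, b)
    db = np.linalg.solve(X.T @ X, X.T @ r)   # X^T X db = X^T r
    b = b + db

# Error analysis at the refined optimum.
X = design_matrix(x, b)
r = y - model(x, b)
n_y, n_b = X.shape
sigma_hat = np.sqrt((r @ r) / (n_y - n_b))
se = sigma_hat * np.sqrt(np.diag(np.linalg.inv(X.T @ X)))

print("last correction db:", db)
print("sigma_hat:", sigma_hat)
print("se relative to |b|:", se / np.abs(b))
```

Near the optimum the correction `db` shrinks toward zero, so the linear regression step contributes mainly the uncertainty estimate rather than an improved solution.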
Instead of using the data in Slide 4, generate your own
data for 31 uniformly spaced points (1, 1.3, ...) from the
identified algebraic model
$\hat{y} = 1.99 + \frac{6.99}{x - 0.612}$
and
contaminate the data with normally distributed random
noise with zero mean and standard deviation of 1.
Compare the standard error from linear regression with
the value you get by repeating the process multiple times
using different realizations of the noise.