Chapter 4 - Student
Learning outcome
Upon completion of this module, students shall be able to:
Outline
Introduction
Least-squares regression
Interpolation
Curve fitting - Introduction
Typically, data:
• are discrete, but we are interested in the intermediate values
• so these intermediate values must be estimated
Curve fitting:
• finds a curve (an approximation) that best fits a series of discrete data points
• the curve is an estimate of the trend of the dependent variable
• the curve can then be used to estimate intermediate values of the data
Mathematical background
• Regression – background in statistics
• Interpolation – background in Taylor series expansion (TSE) & finite divided differences
Elementary statistics – revisited
Elementary statistics specify the characteristics of data sets:
1. Location of the centre of the distribution
2. Degree of spread of the data (population vs. sample)

Arithmetic mean: ȳ = (Σ yi)/n
Sum of squared errors (SSE): St = Σ (yi − ȳ)² = Σ ei²
Variance [measure of spread]: sy² = Σ (yi − ȳ)² / (n − 1)
Coefficient of variation: c.v. = (sy/ȳ) × 100%
Elementary statistics – revisited
 i     yi      yi²       ei = yi − ȳ   ei² = (yi − ȳ)²
 1     12.0    144.00    −1.61          2.58
 2     15.0    225.00     1.39          1.94
 3     14.1    198.81     0.49          0.24
 4     15.9    252.81     2.29          5.26
 5     11.5    132.25    −2.11          4.44
 6     14.8    219.04     1.19          1.42
 7     11.2    125.44    −2.41          5.79
 8     13.7    187.69     0.09          0.01
 9     15.9    252.81     2.29          5.26
10     12.6    158.76    −1.01          1.01
11     14.3    204.49     0.69          0.48
12     12.6    158.76    −1.01          1.01
13     12.1    146.41    −1.51          2.27
14     14.8    219.04     1.19          1.42
 Σ    190.5   2625.31     0.00         33.15

Number of data points, n = 14
Arithmetic mean, ȳ = (Σ yi)/n = 13.61
DEGREE OF SPREAD:
Range, R = ymax − ymin = 4.70
Sum of squared errors (SSE), St = Σ (yi − ȳ)² = 33.15
Variance, sy² = St/(n − 1) = 2.55
Standard deviation, sy = √(St/(n − 1)) = 1.60
Coefficient of variation, c.v. = (sy/ȳ) × 100% = 11.74%
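As a quick check, the summary statistics above can be reproduced in a few lines of Python (a sketch; the variable names are ours, not from the slides):

```python
# Descriptive statistics for the 14-point sample from the slide.
data = [12, 15, 14.1, 15.9, 11.5, 14.8, 11.2, 13.7,
        15.9, 12.6, 14.3, 12.6, 12.1, 14.8]

n = len(data)
mean = sum(data) / n                        # arithmetic mean, ybar
rng = max(data) - min(data)                 # range, R = ymax - ymin
St = sum((y - mean) ** 2 for y in data)     # sum of squared errors (SSE)
var = St / (n - 1)                          # sample variance, sy^2
sy = var ** 0.5                             # standard deviation
cv = sy / mean * 100                        # coefficient of variation (%)

print(f"mean={mean:.2f}  R={rng:.2f}  St={St:.2f}  sy={sy:.2f}  cv={cv:.2f}%")
```

Running this reproduces the table's values (13.61, 4.70, 33.15, 1.60 and 11.74%).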
Least-squares regression
Least-squares regression - Introduction
Regression?
Given n data points (x1, y1), (x2, y2), …, (xn, yn), find the best fit y = f(x) to the data set. The best fit is generally based on minimizing the sum of the squares of the residuals, Sr.
Regression model: yp = f(x), fitted through the data points (x1, y1), …, (xn, yn).
Sum of the squares of the residuals (sum over i = 1 to n):
Sr = Σ (yi − f(xi))²
This is the basic model for regression.
Linear least-squares regression
Fit a straight line to a set of n data points (x1, y1), (x2, y2), …, (xn, yn):
y = a0 + a1x + e
• a1 – slope
• a0 – intercept
• e – error, or residual, between the model and the measurement
Ideally, if all the residuals were zero, every data point would lie exactly on the model.
The most popular way to minimize the residuals is the least-squares method, in which the estimates of the model constants are chosen so that the sum of the squared residuals, Sr, is minimized.
Linear least-squares regression
Why is the best fit based on the square of the residuals?
1. One might instead minimize the sum of the residual errors over all available data:
Σ ei = Σ (yi − a0 − a1xi)   (sum over i = 1 to n)
This criterion is inadequate: positive and negative residuals cancel, so very different lines can give the same (even zero) sum.
Linear least-squares regression
Minimize the sum of the squares of the residuals, Sr, between the measured y and the y calculated with the linear model:
Sr = Σ ei² = Σ (yi,measured − yi,model)² = Σ (yi − a0 − a1xi)²
Setting the derivatives of Sr with respect to a0 and a1 to zero gives:
a1 = (n Σ xiyi − Σ xi Σ yi) / (n Σ xi² − (Σ xi)²)
a0 = ȳ − a1x̄
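These two closed-form expressions translate directly into code. A minimal sketch in plain Python (the function name `linear_fit` is ours, not from the slides):

```python
def linear_fit(x, y):
    """Least-squares straight line y = a0 + a1*x through the points (x, y)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
    a0 = sy / n - a1 * (sx / n)                      # intercept: ybar - a1*xbar
    return a0, a1

# Sanity check on an exact line y = 1 + 2x:
a0, a1 = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])   # -> a0 = 1.0, a1 = 2.0
```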
Linear least-squares regression – Example 1
Fit a straight line to the data series:

 i    xi     yi      xi²    xiyi
 1    10      25
 2    20      70
 3    30     380
 4    40     550
 5    50     610
 6    60    1220
 7    70     830
 8    80    1450
 Σ   360    5135

a1 = (n Σ xiyi − Σ xi Σ yi) / (n Σ xi² − (Σ xi)²)
a0 = ȳ − a1x̄
The least-squares fit is given by:
Linear least-squares regression – Example 1
The regression model is given as
Linear least-squares regression – Quantification of error
For a straight line, the sum of the squares of the estimate residuals is (sum over i = 1 to n):
Sr = Σ ei² = Σ (yi − a0 − a1xi)²
Standard error of the estimate:
sy/x = √(Sr/(n − 2))
• quantifies the spread of the data around the regression line
• used to quantify the 'goodness' of a fit
Linear least-squares regression – Quantification of error
The standard error of the estimate, sy/x, quantifies the spread of the data around the regression line.
Linear least-squares regression – Quantification of error
Coefficient of determination, r²:
r² is the difference between the sum of the squares of the data residuals, St, and the sum of the squares of the estimate residuals, Sr, normalized by the sum of the squares of the data residuals:
r² = (St − Sr)/St
Linear least-squares regression – Example 1

 i    xi     yi      f(xi)
 1    10      25
 2    20      70
 3    30     380
 4    40     550
 5    50     610
 6    60    1220
 7    70     830
 8    80    1450
 Σ   360    5135

St = Σ (yi − ȳ)² =
Sr = Σ (yi − a0 − a1xi)² =
sy =
sy/x =
r² =
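A sketch that fills in this slide's blanks programmatically: it fits Example 1's data with the normal equations and then evaluates St, Sr, sy, sy/x and r² (plain Python; variable names are ours):

```python
x = [10, 20, 30, 40, 50, 60, 70, 80]
y = [25, 70, 380, 550, 610, 1220, 830, 1450]
n = len(x)

# Least-squares line from the normal equations.
sx, sy_sum = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
a1 = (n * sxy - sx * sy_sum) / (n * sxx - sx ** 2)
a0 = sy_sum / n - a1 * (sx / n)

# Error quantification.
ybar = sy_sum / n
St = sum((yi - ybar) ** 2 for yi in y)                       # data residuals
Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))   # model residuals
sy = (St / (n - 1)) ** 0.5       # standard deviation of the data
syx = (Sr / (n - 2)) ** 0.5      # standard error of the estimate
r2 = (St - Sr) / St              # coefficient of determination
```

With these data the fit works out to a1 ≈ 19.47, a0 ≈ −234.29 and r² ≈ 0.88, i.e. the straight line explains roughly 88% of the variability (values computed here, not taken from the slides).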
How good is the model?
Nonlinear relationships
Nonlinear relationships
Linear regression assumes a linear relationship between the dependent and independent variables. This is not always the case!
Nonlinear relationships
Examples of nonlinear transformations:
In their transformed (linearized) forms, these models can be fitted with linear regression to evaluate the constant coefficients.
Linearization of nonlinear relationships
Example 2:
Given the power model
y = α2 x^β2
taking the natural logarithm of both sides gives ln y = ln α2 + β2 ln x, which is a straight line in the transformed variables ln x and ln y.
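A sketch of this transformation in code: fit a straight line to (ln x, ln y), then read off β2 from the slope and α2 from the intercept. The data here are synthetic (generated from y = 2x^1.5) purely to illustrate; they are not from the slides.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2 * xi ** 1.5 for xi in x]     # synthetic data: alpha2 = 2, beta2 = 1.5

# Transform to a linear problem: ln y = ln(alpha2) + beta2 * ln x.
X = [math.log(xi) for xi in x]
Y = [math.log(yi) for yi in y]

n = len(X)
sx, sy = sum(X), sum(Y)
sxy = sum(a * b for a, b in zip(X, Y))
sxx = sum(a * a for a in X)
beta2 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope -> exponent
alpha2 = math.exp(sy / n - beta2 * (sx / n))        # exp(intercept) -> alpha2
# Because the data follow the power law exactly, the fit recovers
# beta2 = 1.5 and alpha2 = 2.0 (up to floating-point error).
```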
Polynomial regression – nonlinear model
The linear least-squares regression procedure can be readily extended to fit data to a higher-order polynomial:
y = a0 + a1x + a2x² + e
Polynomial regression – nonlinear model
For a second-order polynomial, the best fit minimizes (sums over i = 1 to n):
Sr = Σ ei² = Σ (yi − a0 − a1xi − a2xi²)²
In general, for an mth-order polynomial, this means minimizing:
Sr = Σ ei² = Σ (yi − a0 − a1xi − a2xi² − … − amxi^m)²
The standard error for fitting an mth-order polynomial to n data points is:
sy/x = √(Sr/(n − (m + 1)))
The coefficient of determination, r², is still found using:
r² = (St − Sr)/St
Polynomial regression – nonlinear model
To find the constants of the polynomial model, we set the derivatives of Sr with respect to each coefficient to zero (sums over i = 1 to n):
∂Sr/∂a0 = Σ 2(yi − a0 − a1xi − … − amxi^m)(−1) = 0
∂Sr/∂a1 = Σ 2(yi − a0 − a1xi − … − amxi^m)(−xi) = 0
⋮
∂Sr/∂am = Σ 2(yi − a0 − a1xi − … − amxi^m)(−xi^m) = 0
Polynomial regression – nonlinear model
In matrix form, these normal equations are (all sums over i = 1 to n):

| n         Σxi          …   Σxi^m      | | a0 |   | Σyi      |
| Σxi       Σxi²         …   Σxi^(m+1)  | | a1 |   | Σxiyi    |
| ⋮          ⋮            ⋱   ⋮          | | ⋮  | = | ⋮        |
| Σxi^m     Σxi^(m+1)    …   Σxi^(2m)   | | am |   | Σxi^m yi |
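A sketch of assembling and solving this system in pure Python (Gaussian elimination with partial pivoting; the function name `polyfit_normal` is ours). Note that for large m the normal equations become ill-conditioned, so library routines are preferred in practice.

```python
def polyfit_normal(x, y, m):
    """Least-squares polynomial a0 + a1*x + ... + am*x**m via normal equations."""
    n = m + 1
    # Build the (m+1)x(m+1) system A a = b from the power sums.
    A = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
            b[r] -= f * b[k]
    # Back substitution.
    a = [0.0] * n
    for k in range(n - 1, -1, -1):
        a[k] = (b[k] - sum(A[k][c] * a[c] for c in range(k + 1, n))) / A[k][k]
    return a          # [a0, a1, ..., am]

# Sanity check: fit an exact quadratic y = 1 + 2x + 3x^2.
coeffs = polyfit_normal([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], 2)
```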
Polynomial regression – nonlinear model
Nonlinear model: 2nd-order regression
MEM682
NUMERICAL METHODS
Objectives:
To determine the nth-order interpolating polynomial for n + 1 data points
To use this polynomial interpolant to calculate intermediate values
Approaches:
1. Newton divided-difference polynomial
2. Lagrange polynomial
3. Spline interpolation
1. Newton’s interpolating
polynomial
Polynomial interpolation
In the mathematical subfield of numerical analysis,
interpolation is a method of constructing new data points within
the range of a discrete set of known data points.
Newton’s interpolating polynomial
◦ A simple polynomial interpolant for n + 1 data points is given, for the first two orders, by:
Linear: f1(x) = b0 + b1(x − x0)
Quadratic: f2(x) = b0 + b1(x − x0) + b2(x − x0)(x − x1)
Newton’s Divided-Difference Interpolating Polynomial
Linear (1st order) Interpolation
◦ Is the simplest form of interpolation, connecting two data points with a
straight line.
◦ Given (x0,y0), & (x1,y1), pass a linear interpolant through these data points
◦ The 1st-order Newton polynomial interpolant is:
f1(x) = b0 + b1(x − x0)
◦ f1(x) indicates a first-order polynomial.
◦ The coefficients b0 and b1 are found using linear algebra.
◦ Substituting (x0, y0) and (x1, y1) into the equation and solving for b0 and b1 gives the linear-interpolation formula:
f1(x) = f(x0) + [(f(x1) − f(x0))/(x1 − x0)] (x − x0)
The slope term is a finite divided-difference approximation to the 1st derivative.
Newton’s Divided-Difference Interpolating Polynomial
◦ Alternatively, the first-order Newton interpolating polynomial may be obtained from linear interpolation and similar triangles, as shown.
Linear-interpolation formula:
f1(x) = f(x0) + [(f(x1) − f(x0))/(x1 − x0)] (x − x0)
The slope is a finite divided-difference approximation to the 1st derivative.
Example 1
Estimate ln 2 by linear interpolation, given the tabulated data below (use a calculator, e.g. ln 1 = 0, to get all the f(x) values):

x     f(x) = ln x
1     0
3     1.098612
5     1.609438

Using the two points that bracket x = 2, i.e. x0 = 1 and x1 = 3:
f1(x) = f(x0) + [(f(x1) − f(x0))/(x1 − x0)] (x − x0)
f1(2) = 0 + [(1.098612 − 0)/(3 − 1)] (2 − 1) = 0.549306
The true value is ln 2 = 0.6931472, so the true percent error is
εt = (0.6931472 − 0.549306)/0.6931472 × 100% = 20.8%
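Example 1 as a few lines of Python (`math.log` is the natural log; this reproduces the slide's 0.549306 and εt ≈ 20.8%):

```python
import math

x0, x1 = 1.0, 3.0                      # bracketing points around x = 2
f = math.log                           # f(x) = ln x

slope = (f(x1) - f(x0)) / (x1 - x0)    # finite divided difference
f1 = f(x0) + slope * (2 - x0)          # 1st-order Newton interpolant at x = 2
et = abs(math.log(2) - f1) / math.log(2) * 100   # true percent error
# f1 = 0.549306..., et = 20.8 (to three significant figures)
```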
Newton’s Divided-Difference Interpolating Polynomial
Quadratic Interpolation
◦ Given (x0, y0), (x1, y1) & (x2, y2), fit a quadratic interpolant through these data points.
◦ The estimate is improved by introducing some curvature into the line connecting the points.
◦ The 2nd-order Newton polynomial interpolant (nonlinear) is:
f2(x) = b0 + b1(x − x0) + b2(x − x0)(x − x1)
◦ Substituting x = x0, x = x1 and x = x2 in turn gives the coefficients:
x = x0:  b0 = f(x0)
x = x1:  b1 = [f(x1) − f(x0)]/(x1 − x0)
x = x2:  b2 = { [f(x2) − f(x1)]/(x2 − x1) − [f(x1) − f(x0)]/(x1 − x0) } / (x2 − x0)
b1 is a finite divided difference; b2 is a finite divided-difference approximation to the 2nd derivative.
Example 2
Improve the estimate of ln 2 from Example 1 by quadratic interpolation through x0 = 1, x1 = 3 and x2 = 5 (f(x2) = ln 5 = 1.609438):
b0 = f(x0) = 0
b1 = [f(x1) − f(x0)]/(x1 − x0) = 0.549306
b2 = { [f(x2) − f(x1)]/(x2 − x1) − b1 } / (x2 − x0) = (0.255413 − 0.549306)/(5 − 1) = −0.073473
f2(2) = 0 + 0.549306(2 − 1) + (−0.073473)(2 − 1)(2 − 3) = 0.622779
General form of Newton's Interpolating Polynomial
◦ In general, for (n + 1) data points, the nth-order Newton interpolating polynomial is:
fn(x) = b0 + b1(x − x0) + … + bn(x − x0)(x − x1)…(x − xn−1)
where the coefficients are the finite divided differences:
b0 = f(x0), b1 = f[x1, x0], b2 = f[x2, x1, x0], …, bn = f[xn, xn−1, …, x0]
Example 3
Add a fourth data point, x3 = 4 (f(x3) = ln 4 = 1.386294), to the data of Example 2 and estimate ln 2 with a 3rd-order Newton polynomial.
Divided-difference table:

x     f(x)        1st        2nd         3rd
1     0           0.549306   −0.073473   0.013735
3     1.098612    0.255413   −0.032269
5     1.609438    0.223144
4     1.386294

For example:
f[x3, x2] = (1.386294 − 1.609438)/(4 − 5) = 0.223144
f[x3, x2, x1] = (0.223144 − 0.255413)/(4 − 3) = −0.032269
b3 = f[x3, x2, x1, x0] = (−0.032269 − (−0.073473))/(4 − 1) = 0.013735
The cubic estimate at x = 2 is then:
f3(2) = 0 + 0.549306(2 − 1) + (−0.073473)(2 − 1)(2 − 3) + 0.013735(2 − 1)(2 − 3)(2 − 5) = 0.663984
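The divided-difference table and the cubic estimate of Example 3 can be generated with a short routine (a sketch; the function name `newton_interp` is ours):

```python
import math

def newton_interp(xs, ys, x):
    """Evaluate the Newton interpolating polynomial through (xs, ys) at x."""
    n = len(xs)
    coef = list(ys)                       # becomes the coefficients b_i
    # Build the divided differences in place, column by column.
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    # Nested (Horner-like) evaluation of the Newton form.
    result = coef[-1]
    for i in range(n - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

# Example 3's data: f(x) = ln x at x = 1, 3, 5 and the added point x = 4.
xs = [1.0, 3.0, 5.0, 4.0]
ys = [math.log(v) for v in xs]
est = newton_interp(xs, ys, 2.0)   # cubic estimate of ln 2, about 0.66398
```

Compared with ln 2 = 0.6931472, this cubic estimate has a true percent error of about 4.2%, down from 20.8% (linear) and 10.2% (quadratic).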
Error of Newton’s interpolating polynomial
◦ The structure of interpolating polynomials is similar to that of the Taylor series expansion, in the sense that finite divided differences are added sequentially to capture the higher-order derivatives.
◦ For an nth-order interpolating polynomial, an analogous relationship for
the error is:
Rn = [f^(n+1)(ξ) / (n + 1)!] (x − x0)(x − x1) ⋯ (x − xn)
◦ where ξ is somewhere in the interval containing the unknown and the
data.
◦ Fortunately, an alternative formula can be used that does not require
prior knowledge of the function: