
INDE 2333

ENGINEERING STATISTICS I

CURVE FITTING
(LINEAR REGRESSION)

University of Houston
Dept. of Industrial Engineering
Houston, TX 77204-4812
(713) 743-4195
AGENDA

• Linear Regression
LINEAR REGRESSION

• Statistical method for determining a linear relationship between two variables, assuming the errors are normally distributed and independent
• Two types of variables
   - Independent (x)
   - Dependent (y)
• Method of least squares
DATA
[Figure: scatter plot of the data, Y (dependent variable) versus X (independent variable)]
WHICH PREDICTION LINE
[Figure: scatter plot, Y (dependent variable) versus X (independent variable)]
PREDICTION LINE
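[Figure: prediction line on axes Y (dependent variable) versus X (independent variable), marking the intercept, an observed value y_i, the predicted value a + b * x_i, and the error e_i]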
DETERMINING PREDICTION LINE

• Calculate the total squared error between the
   - Actual value
   - Predicted value according to the prediction line

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right]^2$$
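
As a rough illustration of this criterion, the sketch below computes the total squared error for two candidate lines. The data are borrowed from the evaporation example later in these slides. The function name `sum_squared_error` and the second candidate line (a = 0, b = 0.004) are hypothetical, chosen only for comparison.

```python
# Data from the evaporation example: air velocity (cm/sec) and evaporation coefficient.
x = [20, 60, 100, 140, 180, 220, 260, 300, 340, 380]
y = [0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65]


def sum_squared_error(x, y, a, b):
    """Return the sum over i of [y_i - (a + b * x_i)]^2 for the line a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Candidate 1: the line reported in the example (a = 0.069, b = 0.00383).
sse_fitted = sum_squared_error(x, y, 0.069, 0.00383)
# Candidate 2: a hypothetical line through the origin, for comparison only.
sse_guess = sum_squared_error(x, y, 0.0, 0.004)

print(f"fitted line:  {sse_fitted:.4f}")
print(f"guessed line: {sse_guess:.4f}")
```

The line with the smaller total is the better fit under this criterion. Least squares chooses the a and b that make the total as small as possible.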
DIFFERENTIATE

• Want the minimum of the sum of squared errors
   - Take partial derivatives of the previous equation with respect to a and b
   - Set each to 0
$$2 \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right] (-1) = 0$$
$$2 \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right] (-x_i) = 0$$
NORMAL EQUATIONS

• With some rearrangement
   - The resulting linear equations are called the normal equations

$$\sum_{i=1}^{n} y_i = a n + b \sum_{i=1}^{n} x_i$$
$$\sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2$$
INTERCEPT AND SLOPE

• Substitute the sums of x, y, xy, and x squared into the normal equations
   - Solve for a
      Intercept with the y axis at x = 0
   - Solve for b
      Slope coefficient for the x values

$$y_i = a + b x_i + e_i$$
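
For reference, solving the two normal equations simultaneously for a and b gives the closed forms below. This is the standard result, written in the same notation as the normal equations, with every sum running from i = 1 to n.

$$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}, \qquad a = \frac{\sum y_i - b \sum x_i}{n} = \bar{y} - b \bar{x}$$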
EXAMPLE
Air velocity, x (cm/sec)    Evaporation coefficient, y
 20                         0.18
 60                         0.37
100                         0.35
140                         0.78
180                         0.56
220                         0.75
260                         1.18
300                         1.36
340                         1.17
380                         1.65
NORMAL EQUATIONS

$$\sum_{i=1}^{n} x_i = 2000$$
$$\sum_{i=1}^{n} y_i = 8.35$$
$$\sum_{i=1}^{n} x_i y_i = 2175.40$$
$$\sum_{i=1}^{n} x_i^2 = 532{,}000$$
NORMAL EQUATIONS

$$8.35 = 10a + 2{,}000b$$

$$2{,}175.40 = 2{,}000a + 532{,}000b$$
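
One way to solve this pair of equations is Cramer's rule on the 2 x 2 system. The sketch below assumes that approach. The function name `solve_normal_equations` is hypothetical, and the slides do not say which solution method was used.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y  = a*n     + b*sum_x
              sum_xy = a*sum_x + b*sum_x2   for (a, b) by Cramer's rule."""
    det = n * sum_x2 - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Sums taken from the example: n = 10, sum x = 2000, sum y = 8.35,
# sum xy = 2175.40, sum x^2 = 532,000.
a, b = solve_normal_equations(10, 2000, 8.35, 2175.40, 532000)
print(f"a = {a:.3f}, b = {b:.5f}")  # a = 0.069, b = 0.00383
```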
PREDICTION EQUATION

$$\hat{y}_i = 0.069 + 0.00383 x_i$$
INFERENCES

• R squared…
• t value for coefficients…
R SQUARED

• What percentage of the variation in the data can be accounted for by the equation
   - The closer to 1.0, the better the fit
   - As a rule of thumb, for physical experiments r squared should be > 0.9
   - For people-related experiments r squared should be > 0.30
   - R squared formula…
R SQUARED

$$r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
INTERCEPT PARAMETER ALPHA

• Typically tested against the intercept α being 0
   - Hypothesis test
      H0: α = 0
      Ha: α ≠ 0
   - Two-sided critical value based on n - 2 degrees of freedom
   - Test statistic of…
TEST STATISTIC FOR INTERCEPT
$$s_e^2 = \frac{\sum_{i=1}^{n} (y_i - a - b x_i)^2}{n - 2}$$
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$$
$$t = \frac{(a - \alpha)}{s_e} \sqrt{\frac{n S_{xx}}{S_{xx} + n (\bar{x})^2}}$$
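
A minimal sketch of this test statistic, applied to the evaporation example with H0: α = 0. The function name `intercept_t_statistic` is hypothetical. The coefficients a and b are computed inside the function by least squares, using the equivalent centered form b = S_xy / S_xx and a = ȳ - b x̄.

```python
import math

# Data from the example table: air velocity (cm/sec) and evaporation coefficient.
x = [20, 60, 100, 140, 180, 220, 260, 300, 340, 380]
y = [0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65]


def intercept_t_statistic(x, y, alpha0=0.0):
    """t = (a - alpha0) / s_e * sqrt(n * S_xx / (S_xx + n * x_bar^2))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx              # least-squares slope
    a = y_bar - b * x_bar        # least-squares intercept
    # s_e: square root of the residual sum of squares over n - 2
    s_e = math.sqrt(sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    return (a - alpha0) / s_e * math.sqrt(n * s_xx / (s_xx + n * x_bar ** 2))


t_intercept = intercept_t_statistic(x, y)
print(f"t = {t_intercept:.2f}")  # t = 0.69
```

Assuming a 5% significance level (the slides do not fix one), the two-sided critical value for n - 2 = 8 degrees of freedom is 2.306, so this t value would not reject H0: α = 0.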
SLOPE PARAMETER BETA

• Typically tested against the slope β being 0
   - Hypothesis test
      H0: β = 0
      Ha: β ≠ 0
   - Two-sided critical value based on n - 2 degrees of freedom
   - Test statistic of…
TEST STATISTIC FOR SLOPE
$$s_e^2 = \frac{\sum_{i=1}^{n} (y_i - a - b x_i)^2}{n - 2}$$
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$$
$$t = \frac{(b - \beta)}{s_e} \sqrt{S_{xx}}$$
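
A minimal sketch of the slope test, applied to the evaporation example with H0: β = 0. The function name `slope_t_statistic` is hypothetical. The coefficients a and b are computed inside the function by least squares in the centered form b = S_xy / S_xx and a = ȳ - b x̄.

```python
import math

# Data from the example table: air velocity (cm/sec) and evaporation coefficient.
x = [20, 60, 100, 140, 180, 220, 260, 300, 340, 380]
y = [0.18, 0.37, 0.35, 0.78, 0.56, 0.75, 1.18, 1.36, 1.17, 1.65]


def slope_t_statistic(x, y, beta0=0.0):
    """t = (b - beta0) / s_e * sqrt(S_xx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx              # least-squares slope
    a = y_bar - b * x_bar        # least-squares intercept
    # s_e: square root of the residual sum of squares over n - 2
    s_e = math.sqrt(sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    return (b - beta0) / s_e * math.sqrt(s_xx)


t_slope = slope_t_statistic(x, y)
print(f"t = {t_slope:.2f}")  # t = 8.75
```

Assuming a 5% significance level (the slides do not fix one), the two-sided critical value for n - 2 = 8 degrees of freedom is 2.306, so this t value would reject H0: β = 0.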
IN EXCEL

• Data Analysis - Regression
• Specify
   - Input x range
   - Input y range
• Returns
   - R squared
   - Coefficients
   - t stat
   - P value
