
LEAST SQUARES METHOD

(METODE KUADRAT TERKECIL)

Budi Waluyo

FAKULTAS PERTANIAN
UNIVERSITAS BRAWIJAYA, 2009
The Least Squares Method

• The least squares method is used to obtain estimators of the
  linear regression coefficients.
Simple Regression Model

• The simple linear regression model is stated by the equations:
  – Y = β0 + β1X + ε , the general model
  – Yi = β0 + β1Xi + εi , the model for each observation
• This gives the error, ε or εi:
  – ε = Y – Ŷ = Y – b0 – b1X, or
  – εi = Yi – Ŷi = Yi – b0 – b1Xi
Graphical - Judgmental Solution

• The red points are the experimental values, denoted Yi, which
  appear to form a straight line.
  This line is the model to be estimated, by estimating its
  coefficients b0 and b1, giving the equation b0 + b1Xi.
  The vertical segment (perpendicular to the horizontal axis)
  connecting an experimental point to the fitted straight line is
  called the error.
Graphical - Judgmental Solution

[Figure: fitted line with intercept b0 and slope b1]
The Least Squares Method

yi    xi    ŷi
y1    x1    b0 + b1x1
y2    x2    b0 + b1x2
y3    x3    b0 + b1x3
.     .     .
.     .     .
yn    xn    b0 + b1xn

Min Z = Σ (yi − ŷi)²     (sum over i = 1, …, n)

Min Z = Σ (yi − b0 − b1xi)²
Classic Minimization

Min Z = Σ (yi − b0 − b1xi)²

We want to minimize this function with respect to b0 and b1.

This is a classic optimization problem.

We may remember from calculus that to find the minimum value we
should take the derivatives and set them equal to zero.
The Least Squares Method

Note: Our unknowns are b0 and b1.
xi and yi are known. They are our data.

yi    xi    ŷi
y1    x1    b0 + b1x1
y2    x2    b0 + b1x2
y3    x3    b0 + b1x3
.     .     .
.     .     .
yn    xn    b0 + b1xn

Z = Σ (yi − b0 − b1xi)²

Find the derivatives of Z with respect to b0 and b1 and set them
equal to zero.
Derivatives

Z = Σ (yi − b0 − b1xi)²     (sum over i = 1, …, n)

∂Z/∂b0 = Σ 2(−1)(yi − b0 − b1xi) = 0

∂Z/∂b1 = Σ 2(−xi)(yi − b0 − b1xi) = 0
b0 and b1

( x  y )
 xy  n
b1 
( x ) 2

x 2

n

b0  y  b1 x
Pizza Restaurant Example
We collect a set of data from randomly chosen stores of our Pizza
restaurant chain.

Restaurant   Student Population   Quarterly Sales
             (1000s)              ($1000s)
i            xi                   yi
1 2 58
2 6 105
3 8 88
4 8 118
5 12 117
6 16 137
7 20 157
8 20 169
9 22 149
10 26 202
Example
Restaurant i Xi Yi Xi Yi Xi2

1 2 58 116 4
2 6 105 630 36
3 8 88 704 64
4 8 118 944 64
5 12 117 1404 144
6 16 137 2192 256
7 20 157 3140 400
8 20 169 3380 400
9 22 149 3278 484
10 26 202 5252 676

Total 140 1300 21040 2528


b1 = [Σxy − (Σx)(Σy)/n] / [Σx² − (Σx)²/n]

b1 = [21040 − (140)(1300)/10] / [2528 − (140)²/10]

b1 = 2840 / 568 = 5
b0

Ȳ = b0 + b1X̄

Ȳ = 1300/10 = 130

X̄ = 140/10 = 14

130 = b0 + 5(14)

b0 = 60
Estimated Regression Equation

Ŷ = 60 + 5X

Now we can predict.

For example, if one of the restaurants of this Pizza chain is close
to a campus with 16,000 students, we predict the mean of its
quarterly sales to be

Ŷ = 60 + 5(16)
Ŷ = 140 thousand dollars
Summary: The Simple Linear Regression Model

• Simple Linear Regression Model

  Y = β0 + β1X + ε

• Simple Linear Regression Equation

  E(Y) = β0 + β1X

• Estimated Simple Linear Regression Equation

  Ŷ = b0 + b1X
Summary: The Least Squares Method

• Least Squares Criterion

  min Σ (Yi − Ŷi)²

where
  Yi = observed value of the dependent variable for the ith observation
  Ŷi = estimated value of the dependent variable for the ith observation
Summary: The Least Squares Method

• Slope for the Estimated Regression Equation

  b1 = [ΣXiYi − (ΣXi)(ΣYi)/n] / [ΣXi² − (ΣXi)²/n]

• Y-Intercept for the Estimated Regression Equation

  b0 = Ȳ − b1X̄

  Xi = value of the independent variable for the ith observation
  Yi = value of the dependent variable for the ith observation
  X̄ = mean value of the independent variable
  Ȳ = mean value of the dependent variable
  n = total number of observations
Coefficient of Determination
Question:
How well does the estimated regression line fit the data?

The coefficient of determination is a measure of goodness of fit:
how well the estimated regression line fits the data.

Given an observation with values Yi and Xi, we put Xi in the
equation and get Ŷi = b0 + b1Xi.

(Yi − Ŷi) is called the residual.

It is the error in using Ŷi to estimate Yi.

SSE = Σ (Yi − Ŷi)²
SSE : Pictorial Representation

[Figure: scatter plot with the fitted line Ŷ = 60 + 5X; the vertical
distance y10 − ŷ10 illustrates one residual]
SSE Computations

i Xi Yi
1 2 58
2 6 105
3 8 88
4 8 118
5 12 117
6 16 137
7 20 157
8 20 169
9 22 149
10 26 202
SSE Computations

i Xi Yi Ŷi = 60 + 5Xi
1 2 58 70
2 6 105 90
3 8 88 100
4 8 118 100
5 12 117 120
6 16 137 140
7 20 157 160
8 20 169 160
9 22 149 170
10 26 202 190
SSE Computations

i Xi Yi Ŷi = 60 + 5xi (Yi - Ŷi ) (Yi- Ŷi )2


1 2 58 70 -12 144
2 6 105 90 15 225
3 8 88 100 -12 144
4 8 118 100 18 324
5 12 117 120 -3 9
6 16 137 140 -3 9
7 20 157 160 -3 9
8 20 169 160 9 81
9 22 149 170 -21 441
10 26 202 190 12 144

Total SSE = 1530

SSE = 1530 measures the error in using the estimated equation to
predict sales.
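The SSE column sums above can be reproduced in a few lines (a sketch; variable names are my own):

```python
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

# Predicted values from the estimated equation Ŷ = 60 + 5X
y_hat = [60 + 5 * xi for xi in x]
# SSE = Σ (Yi − Ŷi)²
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
print(sse)  # → 1530
```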
SST Computations
Now suppose we want to estimate sales without using the student
population. In other words, we want to estimate Y without using X.

If Y does not depend on X, then b1 = 0.

Therefore ŷ = b0 + b1x  ===>  b0 = ȳ

Here we do not take x into account; we simply use the average of
y as our sales forecast.

ȳ = (Σyi) / n
ȳ = 1300/10 = 130

This is our estimate for the next value of y.
Given an observation with values yi and xi,
(yi − ȳ) is the error in using ȳ to estimate yi.

SST = Σ (yi − ȳ)²
SST : Pictorial Representation

[Figure: scatter plot with the horizontal line ȳ = 130; the vertical
distance y10 − ȳ illustrates one deviation from the mean]
SST Computations
i Xi Yi (Yi − Ȳ) (Yi − Ȳ)²
1 2 58 -72 5184
2 6 105 -25 625
3 8 88 -42 1764
4 8 118 -12 144
5 12 117 -13 169
6 16 137 7 49
7 20 157 27 729
8 20 169 39 1521
9 22 149 19 361
10 26 202 72 5184

Total SST = 15730

SST = 15730 measures the error in using the mean of the y values to
predict sales.
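The SST column sums can likewise be checked directly (a sketch; variable names are my own):

```python
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

y_bar = sum(y) / len(y)                  # 1300/10 = 130
# SST = Σ (Yi − Ȳ)²
sst = sum((yi - y_bar) ** 2 for yi in y)
print(y_bar, sst)  # → 130.0 15730.0
```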
SSE, SST and SSR

SST : A measure of how well the observations cluster around ȳ

SSE : A measure of how well the observations cluster around ŷ

If x did not play any role in the value of y, then we should have

SST = SSE

If x plays the full role in the value of y, then

SSE = 0

SST = SSE + SSR

SSR : Sum of squares due to regression

SSR is the explained portion of SST.
SSE is the unexplained portion of SST.
Coefficient of Determination for Goodness of Fit

SSE = SST - SSR

The largest value for SSE is

SSE = SST

SSE = SST =======> SSR = 0

SSR/SST = 0 =====> the worst fit

SSR/SST = 1 =====> the best fit


Coefficient of Determination for Pizza example

In the Pizza example,


SST = 15730
SSE = 1530
SSR = 15730 - 1530 = 14200

r² = SSR/SST : Coefficient of Determination

0 ≤ r² ≤ 1

r² = 14200/15730 = .9027
In other words, 90% of the variation in y can be explained by the
regression line.
SST Calculations

SST = Σ (Y − Ȳ)²

SST = Σ Y² − (ΣY)²/n
SST Calculations
SST = Σ Y² − (ΣY)²/n
Observation Xi Yi Yi^2
1 2 58 3364
2 6 105 11025
3 8 88 7744
4 8 118 13924
5 12 117 13689
6 16 137 18769
7 20 157 24649
8 20 169 28561
9 22 149 22201
10 26 202 40804
Total        1300   184730

SST = 184730 − (1300)²/10 = 15730


SSR Calculations

SSR = [ΣXY − (ΣX)(ΣY)/n]² / [ΣX² − (ΣX)²/n]
Observation X Y XY Y^2 X^2
1 2 58 116 3364 4
2 6 105 630 11025 36
3 8 88 704 7744 64
4 8 118 944 13924 64
5 12 117 1404 13689 144
6 16 137 2192 18769 256
7 20 157 3140 24649 400
8 20 169 3380 28561 400
9 22 149 3278 22201 484
10 26 202 5252 40804 676
Total 140 1300 21040 184730 2528

SSR = [21040 − (140)(1300)/10]² / [2528 − (140)²/10]
SSR = (2840)² / 568 = 14200
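The SSR shortcut formula can be checked numerically using the totals above (a sketch; variable names are my own):

```python
n = 10
sum_x, sum_y = 140, 1300
sum_xy, sum_x2, sum_y2 = 21040, 2528, 184730

# SSR = [ΣXY − (ΣX)(ΣY)/n]² / [ΣX² − (ΣX)²/n]
ssr = (sum_xy - sum_x * sum_y / n) ** 2 / (sum_x2 - sum_x ** 2 / n)
# SST = ΣY² − (ΣY)²/n
sst = sum_y2 - sum_y ** 2 / n
print(ssr, sst, sst - ssr)  # → 14200.0 15730.0 1530.0
```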
r² Calculations

r² = SSR / SST = 14200 / 15730

r² = .9027

SSE = SST − SSR

SSE = 15730 − 14200 = 1530


Example : Reed Auto Sales

Reed Auto periodically has a special week-long sale.


As part of the advertising campaign Reed runs one or
more television commercials during the weekend
preceding the sale. Data from a sample of 5 previous
sales showing the number of TV ads run and the
number of cars sold in each sale are shown below.
Number of TV Ads Number of Cars Sold
1 14
3 24
2 18
1 17
3 27
Example : Reed Auto Sales
SSR = [ΣXY − (ΣX)(ΣY)/n]² / [ΣX² − (ΣX)²/n]

SST = ΣY² − (ΣY)²/n

We need to calculate ΣX, ΣY, ΣXY, ΣX², ΣY²

X    Y    XY   X²   Y²
1    14   14   1    196
3    24   72   9    576
2    18   36   4    324
1    17   17   1    289
3    27   81   9    729

10   100  220  24   2114


Example : Reed Auto Sales
SSR = [ΣXY − (ΣX)(ΣY)/n]² / [ΣX² − (ΣX)²/n]

SST = ΣY² − (ΣY)²/n

Σx = 10
Σy = 100
Σxy = 220
Σx² = 24
Σy² = 2114

SST = 2114 − (100)²/5 = 114

SSR = [220 − (10)(100)/5]² / [24 − (10)²/5]
SSR = (220 − 200)² / (24 − 20) = 100
Example : Reed Auto Sales
Alternatively, we could compute SSE and SST and then find
SSR = SST − SSE

y    x    y²    ŷ = 10 + 5x   y − ŷ   (y − ŷ)²
14   1    196   15            -1      1
24   3    576   25            -1      1
18   2    324   20            -2      4
17   1    289   15             2      4
27   3    729   25             2      4

Σy = 100   Σx = 10   Σy² = 2114   SSE = 14

SST = Σy² − (Σy)²/n = 2114 − (100)²/5 = 114
SSR = SST − SSE = 114 − 14 = 100
r² = SSR/SST = 100/114 = 0.877193
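The whole Reed Auto computation can be reproduced from the raw data (a sketch; variable names are my own):

```python
x = [1, 3, 2, 1, 3]       # number of TV ads
y = [14, 24, 18, 17, 27]  # number of cars sold
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

sst = sum_y2 - sum_y ** 2 / n                                        # 114
ssr = (sum_xy - sum_x * sum_y / n) ** 2 / (sum_x2 - sum_x ** 2 / n)  # 100
sse = sst - ssr                                                      # 14
r2 = ssr / sst
print(sst, ssr, sse, round(r2, 6))  # → 114.0 100.0 14.0 0.877193
```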
Example : Reed Auto Sales
• Coefficient of Determination
r² = SSR/SST = 100/114 = .88

The regression relationship is very strong since


88% of the variation in number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.
The Correlation Coefficient
Correlation Coefficient = (sign of b1) × (square root of the
Coefficient of Determination)

rxy = (sign of b1) √r²

Correlation coefficient is a measure of the strength of a linear association


between two variables. It has a value between -1 and +1

rxy = +1 : two variables are perfectly related through a line with


positive slope.

rxy = -1 : two variables are perfectly related through a line with


negative slope.

rxy = 0 : two variables are not linearly related.


The Correlation Coefficient : example

In our Pizza example, r² = .9027 and the sign of b1 is positive.

rxy = (sign of b1) √r²

rxy = +√.9027

rxy = +.9501

There is a strong positive relationship between x and y.
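This sign-and-square-root rule can be sketched in Python (variable names are my own, with the Pizza values from above):

```python
import math

# Values from the Pizza example
r_squared = 14200 / 15730  # coefficient of determination
b1 = 5                     # estimated slope

# rxy = (sign of b1) · √r²
sign = 1 if b1 >= 0 else -1
r_xy = sign * math.sqrt(r_squared)
print(round(r_xy, 4))  # → 0.9501
```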


Correlation Coefficient and
Coefficient of Determination

The Coefficient of Determination and the Correlation Coefficient are
both measures of association between variables.

The Correlation Coefficient measures the linear relationship between
two variables.

The Coefficient of Determination applies to linear and nonlinear
relationships, between two or more variables.
Exercise

• Given the following experimental data on rice


yield (t/ha), plant height (cm) and tiller
number, determine the relationships of these
variables with each other using correlation
and regression analysis. Obtain a model
relating YIELD to the variables PLTHT and
TILLER# and interpret results. Test for the
significance of the parameter estimates and
the regression equation. Evaluate the
adequacy of the model obtained.
• HAPPY STUDYING (SELAMAT BELAJAR)
