
Two-Variable Regression Model: The Problem of Estimation
Gujarati 4e, Chapter 3
The Method of Ordinary Least
Squares (OLS)
 Attributed to Carl Friedrich Gauss (German mathematician)
 OLS has very attractive statistical properties that have made it one of the most powerful and popular methods of regression analysis
 PRF: $Y_i = \beta_1 + \beta_2 X_i + u_i$
 SRF: $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i = \hat{Y}_i + \hat{u}_i$
01/06/22 Prepared by Sri Yani Kusumastuti 2


Least squares criterion

[Figure: scatterplot of Y against X with the SRF $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$; the vertical distances $\hat{u}_1, \hat{u}_2, \hat{u}_3, \hat{u}_4$ at $X_1, X_2, X_3, X_4$ are the residuals]

$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$

 Minimizing the plain sum of the residuals yields zero, so it cannot discriminate between good and bad fits
 OLS therefore minimizes the sum of the squared residuals



Least squares criterion
 Minimize
$\sum \hat{u}_i^2 = \sum \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right)^2$

 First-order conditions:
$\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_1} = -2 \sum \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right) = 0$
$\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_2} = -2 \sum \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i \right) X_i = 0$

The normal equations:
$\sum Y_i = n \hat{\beta}_1 + \hat{\beta}_2 \sum X_i$
$\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$
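As a minimal sketch (the data and variable names here are invented, not from the text), the two normal equations can be solved as a 2×2 linear system for the intercept and slope estimates:

```python
# Solve the OLS normal equations as a 2x2 system A b = c.
# Illustrative data; b1_hat, b2_hat are the intercept and slope estimates.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(X)

# sum(Y)   = n*b1      + b2*sum(X)
# sum(X*Y) = b1*sum(X) + b2*sum(X**2)
A = np.array([[n,       X.sum()],
              [X.sum(), (X ** 2).sum()]])
c = np.array([Y.sum(), (X * Y).sum()])

b1_hat, b2_hat = np.linalg.solve(A, c)  # b1_hat = 2.2, b2_hat = 0.6
```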



Least squares criterion
 Solving the normal equations simultaneously, we obtain
$\hat{\beta}_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left( \sum X_i \right)^2} = \frac{\sum \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right)}{\sum \left( X_i - \bar{X} \right)^2} = \frac{\sum x_i y_i}{\sum x_i^2}$
$\hat{\beta}_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n \sum X_i^2 - \left( \sum X_i \right)^2} = \bar{Y} - \hat{\beta}_2 \bar{X}$
where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from the sample means

 $\hat{\beta}_1$ and $\hat{\beta}_2$ are the least-squares estimators
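A numerical sketch of these closed-form estimators via the deviation form (the data are made up for illustration):

```python
# Closed-form OLS estimates computed from deviations about the sample means.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x = X - X.mean()                          # x_i = X_i - X_bar
y = Y - Y.mean()                          # y_i = Y_i - Y_bar

b2_hat = (x * y).sum() / (x ** 2).sum()   # slope: sum(x*y) / sum(x^2)
b1_hat = Y.mean() - b2_hat * X.mean()     # intercept: Y_bar - b2_hat * X_bar
```

The same numbers come out of the raw-sums version of the formulas, since the deviation form is an algebraic rearrangement.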



The regression line properties

1. It passes through the sample means of Yi and Xi
2. The mean value of the estimated Yi is equal
to the mean value of actual Yi
3. The mean value of the residual is zero
4. The residuals are uncorrelated with the
predicted Yi
5. The residuals are uncorrelated with Xi
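These properties can be checked numerically; the sketch below (made-up data) fits OLS and evaluates properties 2 through 5:

```python
# Check regression-line properties 2-5 for an OLS fit (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()

Y_hat = b1 + b2 * X          # fitted values
u_hat = Y - Y_hat            # residuals

mean_fitted_gap = Y_hat.mean() - Y.mean()            # property 2: zero
mean_resid = u_hat.mean()                            # property 3: zero
resid_yhat = (u_hat * (Y_hat - Y_hat.mean())).sum()  # property 4: zero
resid_X = (u_hat * x).sum()                          # property 5: zero
```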



The classical linear regression
model: the assumptions of LS
1. Linear regression model
2. X values are fixed in repeated sampling
3. Zero mean value of disturbance ui
4. Homoscedasticity or equal variance of ui
5. No autocorrelation between the disturbances
6. Zero covariance between ui and Xi
7. The number of observation n must be greater than
the number of parameters to be estimated
8. Variability in X values
9. The regression model is correctly specified
10. There is no perfect multicollinearity



What happens if the assumptions of
CLRM are violated?

Assumption number   Type of violation
1                   Nonlinearity in parameters
2                   Stochastic regressor(s)
3                   Nonzero mean of $u_i$
4                   Heteroscedasticity
5                   Autocorrelated disturbances
6                   Nonzero covariance between disturbance and regressor
7                   Fewer sample observations than parameters to be estimated
8                   Insufficient variability in regressors
9                   Specification bias
10                  Multicollinearity



Standard errors of Least-Squares
(LS) estimates
$\mathrm{var}\left( \hat{\beta}_2 \right) = \frac{\sigma^2}{\sum x_i^2}$ and $\mathrm{se}\left( \hat{\beta}_2 \right) = \frac{\sigma}{\sqrt{\sum x_i^2}}$

$\mathrm{var}\left( \hat{\beta}_1 \right) = \frac{\sum X_i^2}{n \sum x_i^2} \sigma^2$ and $\mathrm{se}\left( \hat{\beta}_1 \right) = \sqrt{\frac{\sum X_i^2}{n \sum x_i^2}}\, \sigma$

$\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n - k}$ and $\hat{\sigma} = \sqrt{\frac{\sum \hat{u}_i^2}{n - k}}$

$\sum \hat{u}_i^2 = \sum y_i^2 - \frac{\left( \sum x_i y_i \right)^2}{\sum x_i^2} = \sum y_i^2 - \hat{\beta}_2^2 \sum x_i^2$
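A sketch of these variance and standard-error formulas on invented data, with $k = 2$ parameters as in the two-variable model:

```python
# Standard errors of the OLS estimates in the two-variable model (k = 2).
# Illustrative data; sigma^2 is estimated by RSS / (n - k).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n, k = len(X), 2
x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
u_hat = Y - (b1 + b2 * X)

sigma2_hat = (u_hat ** 2).sum() / (n - k)                # sigma_hat^2 = RSS/(n-k)
var_b2 = sigma2_hat / (x ** 2).sum()
var_b1 = sigma2_hat * (X ** 2).sum() / (n * (x ** 2).sum())
se_b1, se_b2 = np.sqrt(var_b1), np.sqrt(var_b2)

# RSS identity: sum(u_hat^2) = sum(y^2) - b2^2 * sum(x^2)
rss_identity = (y ** 2).sum() - b2 ** 2 * (x ** 2).sum()
```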



Properties of LS estimators: the
Gauss-Markov Theorem
 An estimator is said to be a best linear unbiased estimator (BLUE) if it is
1. Linear: a linear function of a random variable
2. Unbiased: its average or expected value, $E\left( \hat{\beta}_2 \right)$, is equal to the true value, $\beta_2$
3. Efficient (best): it has minimum variance

 Gauss-Markov Theorem: given the assumptions of the CLRM, the least-squares estimators, in the class of unbiased linear estimators, have minimum variance; that is, they are BLUE
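The unbiasedness part can be illustrated with a small Monte Carlo sketch (true parameters, sample design, and all numbers invented here): with X fixed across repeated samples and disturbances that satisfy the CLRM assumptions, the OLS slope estimates average out to the true $\beta_2$.

```python
# Monte Carlo sketch of unbiasedness: the OLS slope, averaged over many
# repeated samples with fixed X and well-behaved u, is close to beta2.
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2 = 2.0, 0.5                  # assumed "true" PRF parameters
X = np.linspace(1.0, 10.0, 20)           # fixed in repeated sampling
x = X - X.mean()

estimates = []
for _ in range(5000):
    u = rng.normal(0.0, 1.0, size=X.size)    # zero mean, homoscedastic, uncorrelated
    Y = beta1 + beta2 * X + u
    estimates.append((x * (Y - Y.mean())).sum() / (x ** 2).sum())

mean_b2 = float(np.mean(estimates))      # close to beta2 = 0.5
```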



The coefficient of determination r2: a
measure of “Goodness of Fit”
[Figure: for an observation $Y_i$ around the SRF $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$, the total deviation $\left( Y_i - \bar{Y} \right)$ splits into $\left( \hat{Y}_i - \bar{Y} \right)$, the part due to regression, and the residual $\hat{u}_i$]

 $r^2$ tells how well the SRF fits the data



The coefficient of determination r2: a
measure of “Goodness of Fit”
$Y_i = \hat{Y}_i + \hat{u}_i$
in the deviation form:
$y_i = \hat{y}_i + \hat{u}_i$
$\sum y_i^2 = \sum \hat{y}_i^2 + \sum \hat{u}_i^2 + 2 \sum \hat{y}_i \hat{u}_i = \sum \hat{y}_i^2 + \sum \hat{u}_i^2 = \hat{\beta}_2^2 \sum x_i^2 + \sum \hat{u}_i^2$
TSS = ESS + RSS
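The decomposition TSS = ESS + RSS can be verified numerically; the data below are made up for illustration:

```python
# Verify TSS = ESS + RSS for an OLS fit (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

TSS = (y ** 2).sum()                      # total sum of squares
ESS = ((Y_hat - Y.mean()) ** 2).sum()     # explained: equals b2^2 * sum(x^2)
RSS = ((Y - Y_hat) ** 2).sum()            # residual sum of squares
```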
The coefficient of determination r2: a
measure of “Goodness of Fit”
$r^2 = \frac{\mathrm{ESS}}{\mathrm{TSS}} = \frac{\sum \left( \hat{Y}_i - \bar{Y} \right)^2}{\sum \left( Y_i - \bar{Y} \right)^2} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$

 $r^2$ measures the proportion or percentage of the total variation in Y explained by the regression model
 Two properties of $r^2$:
1. It is a nonnegative quantity
2. Its limits are $0 \le r^2 \le 1$
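A minimal sketch computing $r^2$ both ways, ESS/TSS and 1 − RSS/TSS (illustrative data):

```python
# r^2 as ESS/TSS, cross-checked against 1 - RSS/TSS (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
Y_hat = (Y.mean() - b2 * X.mean()) + b2 * X

TSS = (y ** 2).sum()
ESS = ((Y_hat - Y.mean()) ** 2).sum()
RSS = ((Y - Y_hat) ** 2).sum()

r2 = ESS / TSS            # equivalently 1 - RSS/TSS
```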



Coefficient of correlation r
 r is a measure of the degree of association between two variables
$r = \pm \sqrt{r^2}$
$r = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2 \sum y_i^2}}$
$r = \frac{n \sum X_i Y_i - \left( \sum X_i \right) \left( \sum Y_i \right)}{\sqrt{\left[ n \sum X_i^2 - \left( \sum X_i \right)^2 \right] \left[ n \sum Y_i^2 - \left( \sum Y_i \right)^2 \right]}}$
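The deviation-form and raw-data formulas give the same value, which a short sketch can confirm (data invented for illustration):

```python
# Correlation coefficient from deviations and from the raw-data formula.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(X)
x, y = X - X.mean(), Y - Y.mean()

r_dev = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

num = n * (X * Y).sum() - X.sum() * Y.sum()
den = np.sqrt((n * (X ** 2).sum() - X.sum() ** 2) *
              (n * (Y ** 2).sum() - Y.sum() ** 2))
r_raw = num / den                          # same value as r_dev

# r squared recovers the coefficient of determination
r_squared = r_dev ** 2
```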



The properties of r
1. It can be positive or negative
2. It lies between -1 and +1, that is, $-1 \le r \le +1$
3. It is symmetrical in nature, rxy=ryx
4. It is independent of the origin and scale
5. If X and Y are statistically independent, the correlation coefficient between them is zero; but if r = 0, it does not follow that the two variables are independent
6. It is a measure of linear association or linear
dependence only; it has no meaning for describing
nonlinear relations
7. It does not necessarily imply any cause-and-effect
relationship
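Properties 3 and 4 can be demonstrated numerically; in the sketch below the data and the affine transform (scale 3, shift 7) are arbitrary choices for illustration:

```python
# Illustrative check: r is symmetric (r_xy = r_yx) and unchanged by a
# positive change of scale and origin in either variable.
import numpy as np

def corr(a, b):
    da, db = a - a.mean(), b - b.mean()
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

r_xy = corr(X, Y)
r_yx = corr(Y, X)                   # property 3: same value
r_shifted = corr(3.0 * X + 7.0, Y)  # property 4: positive rescale/shift leaves r unchanged
```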

