Regression with EViews: Module One
Prepared by:
Sayed Hossain
For more articles and videos, visit:
www.sayedhossain.com
Seven assumptions to check:
1. Goodness of fit (R-squared)
2. Individual significance of the independent variables (t-test)
3. Joint significance of the independent variables (F-test)
4. Signs of the coefficients follow theory or expectation
5. No serial (auto) correlation in the residuals (Breusch-Godfrey test)
6. Constant residual variance, i.e. homoscedasticity (Breusch-Pagan-Godfrey test)
7. Normally distributed residuals (Jarque-Bera test)
(Assumption no. 1)
The regression line should fit the data well
(Goodness of Fit)
***
Guideline: an R-squared of 60 percent (0.60) or more is preferable
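R-squared compares the residual sum of squares with the total sum of squares. A minimal sketch in Python (an assumption here, since the module itself uses EViews), using the first five actual/fitted pairs from the sample data later in this module:

```python
import numpy as np

# R-squared = 1 - SS_res / SS_tot, illustrated on the first five
# actual/fitted Y pairs from the module's sample data.
y     = np.array([1.20, 0.65, 0.60, 1.20, 0.50])              # actual Y
y_fit = np.array([0.782833, 0.929257, 0.778327,
                  0.968581, 0.832781])                        # fitted Y

ss_res = np.sum((y - y_fit) ** 2)       # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
r2 = 1 - ss_res / ss_tot
print(round(r2, 4))
```

On this small subset the fit is poor, in line with the low R-squared reported in the regression output later in the module.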
(Assumption no. 2)
Most of the independent variables should
be individually significant
**
t-test
The t-test is done to determine whether each
independent variable (here X1, X2, X3) is individually
significant in influencing the dependent variable Y.
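The t-statistic is the coefficient divided by its standard error, and its two-sided p-value comes from the t distribution with n - k degrees of freedom. A sketch with scipy (an assumption; the module uses EViews), plugging in the X2 row of the regression output reported later (coefficient -0.7527, standard error 0.3319, 35 observations, 4 parameters):

```python
from scipy import stats

# t-test of H0: B2 = 0 against H1: B2 != 0.
coef, se = -0.7527, 0.3319                  # X2 row of the regression output
df = 35 - 4                                 # n observations minus k parameters
t_stat = coef / se
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value
print(round(t_stat, 3), round(p_value, 4))
```

This reproduces (up to rounding) the t-statistic of about -2.267 and p-value of about 0.0305 shown in the output.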
For example:
Variables:
We have four variables: Y, X1, X2, X3
Here Y is dependent and X1, X2, X3 are independent
Population regression model
Y = Bo + B1X1+ B2X2 + B3X3 + u
Tips
If the sample is collected according to statistical
guidelines (proper random sampling procedures),
then the sample regression line can be
representative of the population regression line.
Our target is to estimate the population
regression line from a sample regression line.
Hypothesis Setting
Null hypothesis H0: B1 = 0
Alternative hypothesis H1: B1 ≠ 0
(Assumption no. 3)
Joint Significance
Independent variables should be jointly significant
in explaining the dependent variable
**
F- test
ANOVA
(Analysis of Variance)
Joint significance
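The F-statistic tests whether all slope coefficients are jointly zero, and its p-value comes from the F distribution with (k - 1, n - k) degrees of freedom. A scipy sketch (an assumption; the module uses EViews), plugging in the values reported later in this module's output (F = 2.093, 3 restrictions, 31 residual degrees of freedom):

```python
from scipy import stats

# F-test of H0: B1 = B2 = B3 = 0 (joint significance).
f_stat = 2.093          # F-statistic from the regression output
df1, df2 = 3, 35 - 4    # number of restrictions, residual degrees of freedom
p_value = stats.f.sf(f_stat, df1, df2)
print(round(p_value, 4))
```

This reproduces (up to rounding) the Prob(F-statistic) of 0.1213 shown in the output.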
A few things
Residual (u or e) = Actual Y - Estimated (fitted) Y
Residual, error term, and disturbance term all have
the same meaning.
Serial correlation and autocorrelation have the same
meaning.
(Assumption no. 4)
The signs of the coefficients should follow
economic theory, expectation, the experience of
others (literature review), or intuition.
Residual Analysis
(Assumption no. 5)
No serial correlation (autocorrelation) in the residuals (u).
**
Breusch-Godfrey serial correlation LM test (BG test)
Serial correlation
Serial correlation is a statistical term used to describe
the situation in which the residual is correlated with
lagged values of itself.
In other words, if the residuals are correlated with one
another, we call this situation serial correlation, which
is not desirable.
Hypothesis setting
Null hypothesis H0: no serial correlation (no
correlation between residuals ui and uj)
Alternative hypothesis H1: serial correlation
(correlation between residuals ui and uj)
(Assumption no. 6)
The variance of the residual (u) is
constant (Homoscedasticity)
***
Breusch-Pagan-Godfrey Test
Detection of heteroscedasticity
There are many tests available to detect
heteroscedasticity.
One of them is the Breusch-Pagan-Godfrey test,
which we will employ here.
(Assumption no. 7)
Residuals (u) should be normally distributed
**
Jarque-Bera statistic
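The Jarque-Bera statistic is computed from sample skewness S and kurtosis K as JB = (n/6) * (S^2 + (K - 3)^2 / 4), and is compared against a chi-squared distribution with 2 degrees of freedom. A scipy sketch (an assumption; the module uses EViews), plugging in the residual summary reported later in this module (n = 35, skewness 0.880996, kurtosis 3.508042):

```python
from scipy import stats

# Jarque-Bera normality test from skewness (S) and kurtosis (K).
n, S, K = 35, 0.880996, 3.508042    # residual summary from the module's output
jb = n / 6 * (S ** 2 + (K - 3) ** 2 / 4)
p_value = stats.chi2.sf(jb, df=2)   # H0: residuals are normally distributed
print(round(jb, 4), round(p_value, 4))   # ≈ 4.904 and 0.0861
```

These match the Jarque-Bera value (4.903965) and probability (0.086123) in the histogram output, up to rounding.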
Empirical Model Development
DATA
Sample size is 35, taken from the population.

obs   RESID       X1       X2     X3      Y      YF
1     0.417167    1700     1.20   20000   1.20   0.782833
2    -0.279257    1200     1.03   18000   0.65   0.929257
3    -0.178327    2100     1.20   19000   0.60   0.778327
4     0.231419    937.5    NA     15163   1.20   0.968581
5    -0.332781    7343.3   0.97   21000   0.50   0.832781
6     0.139639    837.9    0.88   15329   1.20   1.060361
7    -0.017457    1648     0.91   16141   1.00   1.017457
8    -0.145733    739.1    1.20   21876   0.65   0.795733
9     0.480882    2100     0.89   17115   1.50   1.019118
10   -0.029701    274.6    0.23   23400   1.50   1.529701
11   -0.327562    231      0.87   16127   0.75   1.077562
12    0.016113    1879.1   0.94   17688   1.00   0.983887
13   -0.346315    1941     0.99   17340   0.60   0.946315
14    0.485755    2317.6   0.87   21000   1.50   1.014245
15    0.972181    471.4    0.93   16000   2.00   1.027819
16   -0.227572    678      0.79   16321   0.90   1.127572
17   -0.268503    7632.9   0.93   18027   0.60   0.868503
18   -0.419018    510.1    0.93   18023   0.60   1.019018
19   -0.425900    630.6    0.93   15634   0.60   1.025900
20    0.076632    1500     1.03   17886   1.00   0.923368
21   -0.373499    1618.3   1.10   16537   0.50   0.873499
22    0.183799    2009.8   0.96   17655   1.15   0.966201
23    0.195833    1562.4   0.96   23100   1.15   0.954167
24   -0.461387    1200     0.88   13130   0.60   1.061387
25    0.309578    13103    NA     20513   1.00   0.690422
26   -0.210732    3739.6   0.92   17409   0.75   0.960732
27   -0.083512    324      1.20   14525   0.75   0.833512
28   -0.020609    2385.8   0.89   15207   1.00   1.020609
29    0.145776    1698.5   0.93   15409   1.15   1.004224
30   -0.060006    544      0.87   18900   1.00   1.060006
31   -0.505102    1769.1   0.45   17677   0.85   1.355102
32    0.870370    1065     0.65   15092   2.10   1.229630
33    0.274774    803.1    0.98   18014   1.25   0.975226
34   -0.149676    1616.7   NA     28988   0.75   0.899676
35    0.062732    210      1.20   21786   0.87   0.807268
Regression Output

Dependent Variable: Y
Included observations: 35

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             1.800         0.4836       3.72         0.0008
X1           -2.11E-05      2.58E-05    -0.820        0.4183
X2           -0.7527        0.3319      -2.267        0.0305
X3           -3.95E-06      2.08E-05    -0.189        0.8509

R-squared            0.1684    Mean dependent var      0.9834
Adjusted R-squared   0.087     S.D. dependent var      0.3912
S.E. of regression   0.3736    Akaike info criterion   0.9762
Sum squared resid    4.328     Schwarz criterion       1.15
Log likelihood      -13.08     F-statistic             2.093
Durbin-Watson stat   2.184     Prob(F-statistic)       0.1213
A few things
t-statistic = Coefficient / Standard error
The t-statistic (in absolute value) and the p-value
always move in opposite directions.
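This inverse relationship is easy to verify with scipy (an assumption; the module uses EViews): for fixed degrees of freedom, a larger |t| always gives a smaller two-sided p-value.

```python
from scipy import stats

# Two-sided p-values shrink as |t| grows (df = 31, as in the output above).
def p_value(t, df=31):
    return 2 * stats.t.sf(abs(t), df)

print(p_value(0.5), p_value(1.0), p_value(2.0))   # strictly decreasing
```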
Output
Actual Y, Fitted Y, and Residuals

obs   Actual   Fitted      Residual
1     1.20     0.782833    0.417167
2     0.65     0.929257   -0.279257
3     0.60     0.778327   -0.178327
4     1.20     0.968581    0.231419
5     0.50     0.832781   -0.332781
6     1.20     1.060361    0.139639
7     1.00     1.017457   -0.017457
8     0.65     0.795733   -0.145733
9     1.50     1.019118    0.480882
10    1.50     1.529701   -0.029701
11    0.75     1.077562   -0.327562
12    1.00     0.983887    0.016113
13    0.60     0.946315   -0.346315
14    1.50     1.014245    0.485755
15    2.00     1.027819    0.972181
16    0.90     1.127572   -0.227572
17    0.60     0.868503   -0.268503
18    0.60     1.019018   -0.419018
19    0.60     1.025900   -0.425900
20    1.00     0.923368    0.076632
21    0.50     0.873499   -0.373499
22    1.15     0.966201    0.183799
23    1.15     0.954167    0.195833
24    0.60     1.061387   -0.461387
25    1.00     0.690422    0.309578
26    0.75     0.960732   -0.210732
27    0.75     0.833512   -0.083512
28    1.00     1.020609   -0.020609
29    1.15     1.004224    0.145776
30    1.00     1.060006   -0.060006
31    0.85     1.355102   -0.505102
32    2.10     1.229630    0.870370
33    1.25     0.975226    0.274774
34    0.75     0.899676   -0.149676
35    0.87     0.807268    0.062732
[Figure: Actual, Fitted, and Residual series plotted over the 35 observations]
[Figure: Y residuals plotted against observation number]
(Assumption no. 1)
Goodness of Fit
R-squared: 0.1684
It means that 16.84 percent of the variation in Y can
be explained jointly by the three independent
variables X1, X2 and X3. The remaining 83.16
percent of the variation in Y is explained by the
residual, that is, by variables other than X1, X2
and X3.
(Assumption no. 3)
Joint significance: F-statistic
F-statistic: 2.093, Prob. 0.1213
Null hypothesis H0: B1 = B2 = B3 = 0
Since the p-value (0.1213) is greater than 5 percent (0.05), we cannot
reject the null; the independent variables are not jointly significant
in explaining Y.
(Assumption no. 2)
Individual significance of the independent variables
For X1, p-value: 0.4183
Null hypothesis H0: B1 = 0
Alternative hypothesis H1: B1 ≠ 0
Since the p-value is more than 5 percent (0.05), we cannot reject the null,
meaning B1 = 0. In other words, X1 does not influence Y in the population.
For X2, p-value: 0.0305 (3.05 percent)
Null hypothesis H0: B2 = 0
Alternative hypothesis H1: B2 ≠ 0
Since the p-value (0.0305) is less than 5 percent, we can reject the null and
accept the alternative hypothesis. It means that X2 can influence Y in the
population, but we cannot say in which direction, as the alternative
hypothesis is two-sided (B2 ≠ 0).
For X3, p-value: 0.8509, so as with X1 we cannot reject B3 = 0.
(Assumption no. 4)
Signs of the coefficients
Our sample model:
Y = b0 + b1X1 + b2X2 + b3X3 + e
The signs we expected were:
Y = b0 - b1X1 + b2X2 - b3X3
Decision: the estimated signs did not match our expectation
(for example, X2's coefficient came out negative, not positive).
So assumption 4 is violated.
(Assumption no. 5)
Serial or autocorrelation
Breusch-Godfrey Serial Correlation LM Test:
F-statistic      1.01     Prob. F(2,29)         0.3751
Obs*R-squared    2.288    Prob. Chi-Square(2)   0.3185
Since both p-values are above 0.05, we cannot reject the null of
no serial correlation in the residuals.
(Assumption no. 6)
Heteroscedasticity test
Breusch-Pagan-Godfrey test (B-P-G test)
F-statistic      1.84     Probability   0.3316
Obs*R-squared    3.600    Probability   0.3080
Null hypothesis H0: residuals (u) are homoscedastic
Alternative hypothesis H1: residuals (u) are heteroscedastic
Since both p-values are above 0.05, we cannot reject the null;
the residuals are homoscedastic.
(Assumption no. 7)
Residual (u) normality test

[Figure: histogram of the residuals]
Series: Residuals
Sample: 1 35
Observations: 35
Mean          1.15e-16
Median       -0.029701
Maximum       0.972181
Minimum      -0.505102
Std. Dev.     0.356788
Skewness      0.880996
Kurtosis      3.508042
Jarque-Bera   4.903965
Probability   0.086123

Since the Jarque-Bera probability (0.086) is above 0.05, we cannot
reject the null hypothesis that the residuals are normally distributed.
Use the information on this website at your own risk. This website
shall not be responsible for any loss or expense suffered in
connection with its use.