In the previous assignment, we analyzed the relationship between electricity consumption and two
independent variables, the average temperature and the growth value of the hydropower, coal and
power industry, and suspected that the two independent variables are collinear to some degree.
In this assignment, we adjust the regression to account for that collinearity.
The data we use are Taiwan's monthly electricity sales from 2017 to 2021, the year-on-year
growth rate of Taiwan's monthly electricity sales from 2017 to 2021, Taiwan's monthly
average temperature from 2017 to 2021 (the average temperature of Kaohsiung, Taipei, Taichung,
and Hualien), and Taiwan's industrial growth value from 2017 to 2021.
The temperature data comes from the "Main Temperature Data Query Page"
https://stat.motc.gov.tw/mocdb/stmain.jsp?sys=100&funid=b8101
The data on the industrial growth value of the electricity and gas supply industry comes
from the Department of Statistics, https://dmz26.moea.gov.tw/
We define the average monthly electricity sales as AE, the average monthly temperature as AT,
and the growth value of the hydropower, coal and power industry as WG.
From the scatter plot, we tentatively conclude that the relationship between electricity
sales and WG is heteroskedastic: as WG increases, AE increases, and the fluctuation
of AE also increases.
3. Multicollinearity Analysis
(1) Regressive Multicollinearity Detection
a. Pearson Correlation
Correlation
                                                        Total           Average         Growth value of
                                                        electricity     temperature     hydropower, coal
                                                        sales (ten                      and power
                                                        million kWh)                    industry
Pearson correlation
  Total electricity sales (ten million kWh)             1.000           .780            .465
  Average temperature                                   .780            1.000           .321
  Growth value of hydropower, coal and power industry   .465            .321            1.000
Significance (one-tailed)
  Total electricity sales (ten million kWh)             .               .000            .000
  Average temperature                                   .000            .               .006
  Growth value of hydropower, coal and power industry   .000            .006            .
Number of cases
  Total electricity sales (ten million kWh)             60              60              60
  Average temperature                                   60              60              60
  Growth value of hydropower, coal and power industry   60              60              60
The Pearson correlation between AT and WG is 0.321, so the collinearity is not obvious.
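As a rough cross-check of the one-tailed significance reported for this correlation, the usual t-test for a Pearson coefficient can be applied to the rounded values r = 0.321 and n = 60 from the table. The sketch below is only illustrative: the function names are our own (hypothetical), and the tail probability is obtained by simple numerical integration rather than from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=20000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

r, n = 0.321, 60  # rounded values taken from the correlation table
# Test statistic for H0: no correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
p_one_tailed = t_upper_tail(t_stat, n - 2)
print(f"t = {t_stat:.3f}, one-tailed p = {p_one_tailed:.4f}")
```

The resulting p-value should be of the same order as the .006 shown in the table; small differences can arise because r is rounded. A significant but small correlation of this kind is consistent with the conclusion that the collinearity is not obvious.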
b. VIF
Coefficients (a)
                                                        Unstandardized coefficients   Standardized coefficient    Collinearity statistics
Model                                                   B             Std. error      Beta                        Tolerance   VIF
1 (Constant)                                            -181.235      390.470
  Average temperature                                   32.495        3.766           .704                        .897        1.115
  Growth value of hydropower, coal and power industry   11.816        4.036           .239                        .897        1.115
a. Dependent variable: total electricity sales (ten million kWh)
The VIF of both independent variables is 1.115, far below 10, so the collinearity is not obvious.
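With only two regressors, the tolerance and the VIF follow directly from their pairwise correlation: tolerance = 1 - r^2 and VIF = 1 / tolerance. A minimal Python sketch using the rounded correlation 0.321 from the Pearson table (the variable names are ours, for illustration only):

```python
# Pairwise Pearson correlation between AT and WG, rounded as in the table.
r_at_wg = 0.321

# With two regressors, the R^2 from regressing one on the other equals r^2.
tolerance = 1 - r_at_wg ** 2
vif = 1 / tolerance

print(f"Tolerance = {tolerance:.3f}, VIF = {vif:.3f}")
```

This reproduces the tolerance (.897) and VIF (1.115) reported in the coefficient table.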
c. Collinearity diagnosis
Collinearity diagnostics (a)
                                               Variance proportions
Model  Dimension  Eigenvalue  Condition index  (Constant)  Average temperature  Growth value of hydropower, coal and power industry
1      1          2.983       1.000            .00         .00                  .00
       2          .016        13.703           .02         .95                  .01
       3          .001        68.895           .98         .05                  .99
a. Dependent variable: total electricity sales (ten million kWh)
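From the table, we can see that the condition indices of dimensions 2 and 3
(13.703 and 68.895) are both greater than 10, so there is still a certain collinearity
problem in the regression. The variance proportions show that dimension 3 loads mainly
on the constant (.98) and the industry growth value (.99).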
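The condition index of each dimension is the square root of the ratio of the largest eigenvalue to that dimension's eigenvalue. The sketch below applies this definition to the eigenvalues as printed in the diagnostics table. Because those are rounded to three decimals, the results only approximate the reported indices (the smallest eigenvalue, .001, is especially sensitive to rounding). Variable names are illustrative.

```python
import math

# Eigenvalues as printed in the collinearity diagnostics (rounded to 3 decimals).
eigenvalues = [2.983, 0.016, 0.001]

largest = max(eigenvalues)
# Condition index: sqrt(largest eigenvalue / eigenvalue of the dimension).
condition_indices = [math.sqrt(largest / ev) for ev in eigenvalues]

for dim, (ev, ci) in enumerate(zip(eigenvalues, condition_indices), start=1):
    flag = "above 10" if ci > 10 else "below 10"
    print(f"dimension {dim}: eigenvalue = {ev:.3f}, condition index = {ci:.2f} ({flag})")
```

Even with the rounded inputs, dimensions 2 and 3 both exceed the threshold of 10 used in the text.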
To sum up, although the Pearson correlation and the VIF did not reveal a collinearity
problem, the collinearity diagnostics show that the collinearity in dimensions 2 and 3
cannot be ignored. Therefore, we examine the collinearity of the regression further.
(2) LASSO Regression
Dependent Variable: AE
Method: Elastic Net Regularization
Date: 10/02/22 Time: 18:08
Sample: 2017M01 2021M12
Included observations: 60
Penalty type: Lasso (alpha = 1)
Lambda at minimum error: 0.05647
Regressor transformation: None
Cross-validation method: K-Fold (number of folds = 5), rng=kn, seed=886243792
Selection measure: Mean Squared Error
Variable Coefficients
df               2           2           2
L1 Norm          225.1470    659.5542    1131.294
R-squared        0.660103    0.618183    0.553820
Dependent Variable: AE
Method: Elastic Net Regularization
Date: 10/02/22 Time: 18:10
Sample: 2017M01 2021M12
Included observations: 60
Penalty type: Elastic Net (alpha = 0.5)
Lambda at minimum error: 0.3449
Regressor transformation: None
Cross-validation method: K-Fold (number of folds = 5), rng=kn, seed=1985126259
Selection measure: Mean Squared Error
                 (minimum)   (+1 SE)     (+2 SE)
Lambda           0.3449      5.621       8.155
Variable Coefficients
df               2           2           2
L1 Norm          212.4047    58.31391    139.4934
R-squared        0.660040    0.647532    0.636673
We found that with the Elastic Net method, R^2 stays high and varies little
across the three reported columns (0.660, 0.648, and 0.637), whereas with the
Lasso it falls from 0.660 to 0.554. We therefore choose the Elastic Net to deal
with the collinearity problem of the regression.
From the table, we can see that the influences of both WG and AT on AE are positive,
but the influence of AT is clearly larger than that of WG. With an R^2 of about
0.66, our model has fairly high explanatory power.
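To illustrate how an elastic net penalty shrinks the two slope coefficients as lambda grows, here is a minimal coordinate-descent sketch in Python. It assumes the commonly used objective (1/2n) * RSS + lambda * (alpha * |b|_1 + (1 - alpha)/2 * |b|_2^2) with an unpenalised intercept; whether the software that produced the output above scales the terms in exactly this way is an assumption. The data are synthetic stand-ins generated only so that the code runs; they are not the data of this report, and all names are hypothetical.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=5000, tol=1e-12):
    """Coordinate descent for
       (1/2n) * sum((y - b0 - X b)^2) + lam * (alpha*|b|_1 + (1-alpha)/2 * |b|_2^2).
    The intercept b0 is not penalised. Returns (b0, b)."""
    n, p = len(y), len(X[0])
    # Centre the data so the intercept can be recovered at the end.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    col_var = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Covariance of regressor j with the residual that excludes regressor j.
            rho = sum(
                Xc[i][j] * (yc[i] - sum(Xc[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam * alpha) / (col_var[j] + lam * (1 - alpha))
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break
    b0 = y_mean - sum(x_mean[j] * beta[j] for j in range(p))
    return b0, beta

# Synthetic stand-in data (NOT the report's data): 60 months, two regressors.
random.seed(0)
at = [random.uniform(15, 30) for _ in range(60)]   # hypothetical temperatures
wg = [random.gauss(100, 5) for _ in range(60)]     # hypothetical growth values
ae = [-180 + 32 * a + 12 * w + random.gauss(0, 60) for a, w in zip(at, wg)]
X = [[a, w] for a, w in zip(at, wg)]

# Lambda values 0.3449, 5.621 and 8.155 are those listed in the output above.
for lam in (0.0, 0.3449, 5.621, 8.155):
    b0, b = elastic_net(X, ae, lam, alpha=0.5)
    print(f"lambda = {lam:<7} intercept = {b0:9.2f}  AT = {b[0]:7.3f}  WG = {b[1]:7.3f}")
```

With lam = 0 the routine reduces to ordinary least squares, and a very large lam shrinks both slopes to zero, which gives a quick sanity check of the implementation.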
(2) Table