Short-term Multi-Region Load Forecasting
Based on Weather and Load Diversity Analysis
S. Fan, K. Methaprayoon, and W. J. Lee, Fellow, IEEE

Abstract—In a power system covering a large geographical area, a single forecasting model for the overall load of the whole region sometimes cannot guarantee satisfactory forecasting accuracy. One of the major reasons is the existence of load diversity, usually caused by weather diversity. In such a system, multi-region load forecasting is a feasible and effective way to provide more accurate forecasting results. This paper aims to demonstrate the existence of weather and demand diversity within the control area of an electric utility in the Midwest US. Based on the analysis, an Artificial Neural Network (ANN) based multi-region load forecasting system has been developed and tested using actual data. Simulation results validate the superiority of the proposed multi-region load forecasting system over the single aggregate forecasting model.

Index Terms—Load forecasting, Multi-region, Load diversity, Neural network

S. Fan and W. J. Lee are with the Energy Systems Research Center at the University of Texas at Arlington (e-mail: shufan@uta.edu; wlee@uta.edu).
K. Methaprayoon is with the Electric Reliability Council of Texas (ERCOT) (e-mail: kmethaprayoon@ercot.com).
978-1-4244-1726-1/07/$25.00 © 2007 IEEE

I. INTRODUCTION
Load forecasting is a key issue for power system operation and marketing. Many operating decisions, such as dispatch scheduling of generating capacity, reliability analysis, and maintenance planning for the generators, are based on load forecasts. Planning for new generation plants and transmission augmentations also depends on load forecasts. In many energy systems, an increase of a few percent in prediction accuracy would bring benefits worth millions of dollars [1]. Therefore, load forecasting has always been a popular topic in the electricity industry.
So far, a wide variety of techniques have been proposed to forecast electricity load [2]. However, most of these works focus on the forecasting model itself, and little attention has been paid to load forecasting for a power system covering a large geographical area, although some technical issues need to be resolved for this specific problem.
In a system covering a large geographical area, the load characteristics and weather conditions are usually diverse in different districts. In particular, when there is a significant change in weather, such as an approaching cold or warm front, the load pattern in one region may differ significantly from that of its neighboring region. In such a situation, it is hard to accurately predict the overall electricity demand of the whole region using a single forecasting model.
The purpose of this paper is to develop a multi-region load forecasting system for an electric utility whose control area covers a large geographical area in the Midwest US. The control area of the utility has been divided into twenty-four regions for the analysis. We first make a detailed investigation of the regional weather and electricity demand characteristics, and quantify the load diversity within the system. An ANN-based multi-region forecasting system is then proposed and examined using real data from the power company. For comparison, a universal forecasting model has also been developed to forecast the aggregate system load. Simulation results validate the superiority of the proposed multi-region load forecasting system over the single aggregate load forecast.

II. TESTING SYSTEM DESCRIPTION
In this research, we use the regional load and weather data from the control area of an electric utility in the Midwest US as the test example. The whole control area has been divided into twenty-four different regions for analysis. The area ID, yearly average load, and peak load are listed in Table I.

Table I
The area ID and load data in 2006
Area ID   Average load (MW)   Peak load (MW)
Area001   8.111838            14.30769
Area002   25.03388            41.20340
Area003   52.70056            96.61885
Area004   61.6222             103.8052
Area005   51.39617            90.45446
Area006   27.85061            50.85374
Area007   78.43186            128.3923
Area008   10.31720            16.65236
Area009   9.618536            22.77330
Area010   23.22630            48.25764
Area011   20.19723            33.64496
Area012   19.39542            31.16093
Area013   31.17934            52.48743
Area014   42.0646             88.07723
Area015   18.36458            30.67214
Area016   1.332454            5.205772
Area017   119.0486            252.3563
Area018   31.15284            60.53848
Area019   31.23019            58.47161
Area020   34.87022            51.81867
Area021   31.82051            63.33005
Area022   16.95924            32.27306
Area023   29.7387             59.166
Area024   2.310241            6.282474

Fig. 1 illustrates the hourly aggregate electricity demand of this system over three years.

Fig. 1 Hourly aggregate electricity demand for the control area from Jan. 1, 2004 to Dec. 31, 2006 (load in MW versus hour)

From Fig. 1, the following common characteristics of the load can be observed: the load series has multiple seasonal patterns, corresponding to daily, weekly, and monthly periodicity respectively; the load is also influenced by calendar effects, i.e., weekends and holidays; and at times the demand shows high volatility and a non-constant mean. From these observations, it can be concluded that different regimes or dynamics exist in the load series during different periods. In particular, with the deregulation of the power system, the electric load series becomes increasingly variable as a result of the dynamic bidding strategies of market players, price-dependent loads, and time-varying prices.
Furthermore, the system load shows a clear upward trend, and load patterns over a short period may vary significantly due to the volatile weather in the Midwest US. Unlike a typical system, its weekend load is often higher because of increased residential consumption on weekends.

2007 39th North American Power Symposium (NAPS 2007)

III. INVESTIGATION OF REGIONAL LOAD AND WEATHER CHARACTERISTICS
In this section, we investigate the regional weather and load characteristics, as well as the dependency of electricity demand on temperature.

A. Weather Variation throughout the Region
It is known that electricity demand is closely related to weather conditions. In recent years, demand levels have become increasingly dependent on weather conditions. This has been attributed to the increased availability of household and commercial air conditioning units. Generally, most meteorological variables, including temperature, humidity, and wind speed, are driving variables for electricity consumption. However, temperature has consistently been found to be the dominant factor compared with the other weather indexes, especially for a system with predominantly residential load. Therefore, we focus on the temperature characteristics in this paper.
The whole control area has been divided into twenty-four different regions for analysis. Table II shows the mean, maximum, and minimum temperature of each area in 2006, and Fig. 2 illustrates the temperature distribution of three typical sites across the control area. It can be seen that the temperatures differ significantly across the entire system; in particular, the temperature distributions are quite different for different sites.

Fig. 2 Histogram plot of temperature distribution of three different sites across the control area in year 2006

Table II
The mean, maximum, and minimum temperature (°F) of each area in 2006
Area ID   Mean       Maximum   Minimum
Area001   65.81763   107       11
Area002   62.13418   108       9
Area003   64.49312   107       11
Area004   64.16981   102       12
Area005   65.2785    104       15
Area006   64.25483   110       2
Area007   64.9285    108       9
Area008   65.38804   110       8
Area009   62.39155   107       2
Area010   64.86872   106       5
Area011   64.07923   106       11
Area012   63.46147   104       11
Area013   61.25713   108       4
Area014   64.88092   106       5
Area015   62.42633   107       2
Area016   69.1116    109       19
Area017   64.98563   107       13
Area018   67.35749   108       15
Area019   69.11051   109       19
Area020   65.4628    106       12
Area021   65.98925   108       4
Area022   66.14058   107       11
Area023   62.23019   104       11
Area024   66.46377   111       12

B. Dependency of electricity demand on weather conditions
Here, we continue to analyze the relationship between load and temperature. The correlation between the aggregate load and the average temperature for the system is shown in Fig. 3.

Fig. 3 Correlation between demand and temperature for the system control area in 2006 (load in MW versus temperature in °F)

According to Fig. 3, there is an approximately piecewise linear relationship between load and temperature, with one segment for cold days and one for hot days. The correlation in each segment can be computed using the following expression:

$$\rho_{t,d} = \frac{\mathrm{Cov}(t,d)}{\sigma_t \sigma_d} \qquad (1)$$

where $\mathrm{Cov}(t,d)$ is the covariance of temperature $t$ and load $d$, and $\sigma_t$ and $\sigma_d$ are the standard deviations of $t$ and $d$. A value of $|\rho_{t,d}| = 1$ corresponds to a perfect linear correlation, an intermediate value describes partial correlation, and $\rho_{t,d} = 0$ represents no correlation at all.
The dotted lines in Fig. 3 indicate the separating point between the two piecewise segments, which is obtained by maximizing the absolute values of the two correlation coefficients on both segments. According to our computation, the separating point is approximately 59.0 °F, and $\rho_{t,d}$ on the two segments is -0.65 and 0.71, which indicates a strong correlation between load and temperature for both winter and summer seasons. Moreover, the correlation for this system in winter is nearly as high as that in summer; a major reason is that the heating appliances in this area are mainly electric, instead of gas or oil heaters.
Fig. 4 shows the correlation between the maximum daily demand and the corresponding temperature for three typical regions in 2006. In this figure, the demands were normalized by their yearly average. As seen, the correlations between electricity demand and temperature are high for every region, while each area displays a different level of dependency on weather factors.

Fig. 4 Correlation between maximum daily demand and corresponding temperature for three different regions in 2006 (Areas 5, 14, and 20; normalized daily peak load versus temperature in °F)

C. Load diversity analysis
Since the load is largely dependent on the ambient temperature and weather conditions are diverse throughout the regions, it is desirable to further quantify the impact of diversity on electricity demand. In this section, we investigate the load characteristics using the load diversity factor within a system.
Load diversity refers to the degree to which different electricity demand patterns affect overall system demand. Different regions can have different daily, weekly, and seasonal load profiles [3]. Load diversity can result in
different areas having noncoincident load peaks. This diversity can be partly due to the existence of weather diversity throughout the wide area of a power system [4]. For a system control area covering a substantial geographic area, load diversity caused by weather diversity may have a large influence on the aggregate load forecasting.
The level of diversity for a group of electrical loads has been defined by a coincidence factor $C$ [5], [6],

$$C = \frac{\sum_i P_i}{P_A} \qquad (2)$$

where $P_i$ stands for the individual peak load, and $P_A$ stands for the aggregate peak load for the group of regions.
In this paper, we calculate the load diversity factor for the twenty-four regions using peak loads over increasing time intervals: from the daily peak to the two-day peak, up to the thirty-day peak. The calculation results are shown in Fig. 5. As seen, the factors are larger than 100%, which indicates the existence of load diversity within the regions; furthermore, the coincidence factor increases over longer calculation time periods.

Fig. 5 Average load diversity factor for increasing calculation intervals (average diversity factor versus time interval for the calculation in days)

IV. MULTI-REGION LOAD FORECASTING
Based on the above analysis, it can be concluded that the regional weather conditions are diverse across the whole region. This, coupled with the differing local demand/weather relationships and the load diversity factor being more than 100%, demonstrates the difference of load patterns within the region. For load forecasting in such a situation, it is sometimes hard to accurately predict the overall electricity demand using a universal model based on the weather data from a single site. A straightforward solution is to use weather information from several sites distributed across the whole region as input data. However, this method increases the complexity of the model and makes the forecasting tool sensitive to its variables. In this paper, we apply a multi-region forecasting model to forecast each regional load separately, and then obtain the aggregate load forecast with higher accuracy. Moreover, the multi-region load forecast makes it possible to transfer 'spare' generation from zones not at their peak demand to zones that are at peak demand. This effective 'sharing' of power supply can lead to reductions in unserved energy (USE) levels, potentially impacting minimum reserve level decisions, as well as new generation and transmission augmentation decisions.

A. ANN-Based Load Forecasting Model
An ANN-based forecasting model is employed for the work in this paper [7]. Since we mainly focus on multi-region load forecasting for a large area, the forecasting model itself is only briefly introduced for completeness. In this paper, we use a three-layer feed-forward network, and the hyperbolic tangent function in (3) is used for the hidden units and output units in our model, since it can produce both positive and negative values, which helps speed up the training process compared with the logistic function, whose output is only positive [8].

$$f(h) = \tanh(1.5h) = \frac{e^{1.5h} - e^{-1.5h}}{e^{1.5h} + e^{-1.5h}} \qquad (3)$$

The Levenberg-Marquardt approach is used to train the model. This approach is suitable for training medium-size ANNs (containing up to a hundred weights) with low Mean Square Error (MSE) [8]. It utilizes an approximation of the Hessian matrix from the first-order gradient to obtain second-order training speed with low computation cost. Network weights and biases are updated in batch mode, i.e., weights are updated after the entire training set is presented to the model. The model is trained three times with different random starting points, and the one with the lowest MSE is picked to produce the forecast. This multiple training reduces the possibility of being trapped in a local minimum during the training process.
Finally, a forward method is applied to select the number of neurons in the hidden layer. The method starts by choosing a small number of hidden neurons and gradually increases this number. Each time, the model is trained and the forecast error on the test set is recorded for comparison. The process stops at the optimal number of hidden neurons, when the error decreases to an acceptable threshold or no significant improvement is observed when the number of hidden neurons is increased.

B. Case Study and Results
The hourly electricity load data from the twenty-four regions and weather data observed at the corresponding sites have been used for the study. Day-ahead load forecasting is performed in this paper. One month of testing data has been selected to forecast and validate the performance of the proposed model. The testing data correspond to January 2007, which is a winter month with high demand. The hourly data used to forecast the testing set are from January 1, 2006 to December 31, 2006. A few missing load and temperature data were filled in by interpolating between neighboring values.
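
The gap-filling step mentioned above could look roughly like the following sketch. The paper states only that missing values were filled in by interpolating between neighboring values; the choice of linear interpolation, the function name `fill_gaps`, the handling of gaps at the ends of the series, and the sample numbers are all assumptions made for illustration.

```python
import math


def fill_gaps(series):
    """Fill missing entries (None or NaN) by linear interpolation.

    Interior gaps are interpolated between the nearest valid neighbors.
    Gaps at the start or end have only one valid neighbor and are filled
    by copying it (an assumption; the paper does not discuss edge gaps).
    """
    values = [
        None if v is None or (isinstance(v, float) and math.isnan(v)) else float(v)
        for v in series
    ]
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("series contains no valid values")

    filled = list(values)
    # Leading and trailing gaps: copy the nearest valid value.
    for i in range(valid[0]):
        filled[i] = values[valid[0]]
    for i in range(valid[-1] + 1, len(values)):
        filled[i] = values[valid[-1]]
    # Interior gaps: weight the two neighbors by distance.
    for left, right in zip(valid, valid[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span
            filled[i] = (1 - w) * values[left] + w * values[right]
    return filled


# Hypothetical hourly load values (MW) with a two-hour gap.
hourly_load_mw = [800.0, None, None, 830.0, 825.0]
print(fill_gaps(hourly_load_mw))
```

Linear interpolation is only reasonable for short gaps such as the "few missing" values described; longer outages in an hourly load series would cut across the daily cycle and need a different treatment.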


The test sets are completely separate from the training sets and are not used during the learning procedure. Note that a larger forecasting lead time does not necessarily imply a larger forecasting error, which depends on the data variability for the different periods.
The input parameters for an ANN model usually span different ranges, so data pre-processing is helpful before feeding them into the network. The process is also critical when a sigmoid function is applied at the output layer, since the output of this function only covers a certain range, from 0 to 1 or from -1 to 1. Accordingly, the corresponding output range of the model should cover only the same range. This paper applies normalization as data pre-processing. The process helps speed up and secure the convergence of training. All input parameters are scaled to have zero mean and unit standard deviation using the following equation:

$$s = \frac{x - \mu_x}{\sigma_x} \qquad (4)$$

where $\sigma_x$ is the standard deviation of the original variable X and $\mu_x$ is the mean of the original variable X.
The performance of the proposed model has been verified with the actual data from the electric utility. The criteria used to compare the performance in this paper are the mean absolute error (MAE) and the mean absolute percentage error (MAPE), which indicate the accuracy of recall.
MAE is defined as:

$$\mathrm{MAE} = \sum_{i=1}^{n} \left( |d_{ai} - d_{fi}| \right) / n \qquad (5)$$

where $d_{ai}$ is the actual value, $d_{fi}$ is the forecast value, and $n$ is the total number of values predicted.
MAPE is given as follows:

$$\mathrm{MAPE} = \sum_{i=1}^{n} \left( |d_{ai} - d_{fi}| \times 100 / d_{ai} \right) / n \qquad (6)$$

For comparative study, we also developed a single forecasting model for the aggregate load forecast. The forecasting errors of the two systems for the testing sets are shown in Table III, and Fig. 6 illustrates the forecasting results for the first week of January 2007.

Table III
Forecasting results
             Single model   Multi-region model
MAE (MW)     51.17          35.45
MAPE (%)     5.49           3.86

Fig. 6 Forecast result for January 1, 2007 to January 7, 2007 (load in MW versus hour; multi-region forecast, single-model forecast, and actual load)

As seen, the multi-region forecasting system performs better than the single model for aggregate load forecasting in a large area.

V. CONCLUSION AND FUTURE WORK
In this paper, the load forecasting problem has been studied for a power system covering a large geographical area in the Midwest US. We first investigated the weather and load characteristics of the twenty-four regions within the system. Through analysis of the actual data, we found that the temperatures differ significantly across the whole area, and that the correlation between load and temperature also varies among regions, although strong correlations exist for all regions. We then calculated the load diversity factor for the system using daily to monthly peak loads, and the results, all greater than 100%, further demonstrate the diverse load patterns in different regions. Based on the analysis results, we developed a multi-region load forecasting system for regional load profile prediction. This multi-region forecasting system has been examined with actual data and compared with the single forecasting model. The simulation results validate the superiority of the multi-region load forecasting system.
This work is part of an ongoing research project on regional load forecasting within an electric utility in the Midwest US. Future work includes an investigation of the influence of region partition or combination on load forecasting accuracy. We are working on a method that can find the optimal partition of the geographical region for load forecasting, and will develop a dynamic multi-region forecasting system that can regroup the regions dynamically to follow changing weather in a given area.
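
For readers who wish to reproduce accuracy figures of the kind reported in Table III, the error measures in (5) and (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `mae` and `mape` and the three sample values are hypothetical and are not taken from the utility's records.

```python
def mae(actual, forecast):
    """Mean absolute error as in (5): sum(|d_a - d_f|) / n."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error as in (6), in percent.

    Assumes every actual value is positive, which holds for load in MW.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) * 100.0 / a for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical hourly loads (MW), for illustration only.
actual_mw = [100.0, 200.0, 400.0]
forecast_mw = [110.0, 190.0, 400.0]
print(mae(actual_mw, forecast_mw), mape(actual_mw, forecast_mw))
```

For the multi-region system, the aggregate forecast would first be formed by summing the regional forecasts hour by hour, and the two measures would then be evaluated against the aggregate actual load.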

REFERENCES
[1] D. W. Bunn and E. D. Farmer, Comparative Models for Electrical Load Forecasting. New York: Wiley, 1985.
[2] H. S. Hippert, C. E. Pedreira, and R. Castro, "Neural networks for short-term load forecasting: A review and evaluation," IEEE Trans. Power Systems, vol. 16, no. 1, pp. 44-55, 2001.
[3] C. J. Ziser, Z. Y. Dong, and T. Saha, "Investigation of Weather Dependency and Load Diversity on Queensland Electricity Demand," Australasian Universities Power Engineering Conference, 25-28 September 2005, Hobart, Tasmania, Australia.
[4] J. D. McQuigg, S. R. Johnson, and J. R. Tudor, "Meteorological Diversity-Load Diversity, A Fresh Look at an Old Problem," Journal of Applied Meteorology, vol. 11, pp. 561, 1972.
[5] A. Sargent, R. P. Broadwater, J. C. Thompson, and J. Nazarko, "Estimation of diversity and kWHR-to-peak-kW factors from load research data," IEEE Trans. Power Systems, vol. 9, pp. 1450-1456, 1994.
[6] H. Lee Willis, Spatial Electric Load Forecasting. Marcel Dekker, Inc., 1996.
[7] W. Charytoniuk, E. Don Box, W. J. Lee, M. S. Chen, P. Kotas, and P. V. Olinda, "Neural-network-based demand forecasting in a deregulated environment," IEEE Trans. Indus. App., vol. 36, pp. 893-898, 2000.
[8] Simon S. Haykin, Neural Networks: A Comprehensive Foundation. Macmillan, 1994.

BIOGRAPHIES
Shu Fan received his B.S., M.S., and Ph.D. degrees from the Department of Electrical Engineering, Huazhong University of Science and Technology (HUST), in 1995, 2000, and 2004, respectively. He was a research scholar sponsored by the Japanese Government at Osaka Sangyo University from 2004 to 2006. Presently, he works in the Energy Systems Research Center at the University of Texas at Arlington. His research interests include energy system forecasting, power system control, and high-power power electronics.

K. Methaprayoon (S'03-M'07) received the B.S. degree from Chulalongkorn University, Bangkok, Thailand, in 2000 and the M.S. and Ph.D. degrees from the University of Texas at Arlington in 2003 and 2007, respectively, all in Electrical Engineering. He is currently an employee of the ERCOT ISO. His research interests include power system analysis, application of artificial neural networks in power systems, and generation planning in a deregulated electricity market.

Wei-Jen Lee (S'85-M'85-SM'97-F'07) received the B.S. and M.S. degrees from National Taiwan University, Taipei, Taiwan, R.O.C., and the Ph.D. degree from the University of Texas, Arlington, in 1978, 1980, and 1985, respectively, all in Electrical Engineering. In 1985, he joined the University of Texas, Arlington, where he is currently a professor in the Electrical Engineering Department and the director of the Energy Systems Research Center. He has been involved in research on power flow, transient and dynamic stability, voltage stability, short circuits, relay coordination, power quality analysis, and deregulation for utility companies.
Prof. Lee is a registered Professional Engineer in the State of Texas.