

loadFromWeather documentation
P. Jacquot

April 3, 2015

1 Introduction
The Python module loadFromWeather.py is designed to generate scenarios of
electricity load for use in stochastic unit commitment. The scenarios depend
on the error made in the load forecast: they should give a set of potential
behaviors of the actual load, each with an associated probability. Since the
load depends strongly on the weather, the program uses the weather forecast
to compute the potential load scenarios.

2 Input data
2.1 Weather forecast database
The weather forecast database can be given either as a pickle or as an Excel
file. Each entry contains, in this order:

• the number of the entry

• the day the forecast was made

• the location or city for which the weather is forecast

• the day for which the weather is forecast

• the hour (from 1 to 24) for which the weather is forecast

• the temperature forecast

• the dew point temperature forecast

• the wind speed forecast

For instance, one line of the Excel file will be:
7 3/30/2007 BDL 3/31/2007 7 29 20 7
The file path is given either by the option forecast database filename or as
an argument to the main function.
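
As a minimal sketch of how this database might be read, the snippet below loads
the eight columns listed above with pandas; the column names, the
read_forecast_db helper and the use of pandas are illustrative assumptions, not
part of the module.

import pandas as pd

# Illustrative column names for the eight fields listed above (an assumption:
# the actual file has no documented header).
FORECAST_COLUMNS = ["entry", "forecast_day", "location", "target_day",
                    "hour", "temperature", "dew_point", "wind_speed"]

def read_forecast_db(path):
    """Read the weather forecast database from an Excel file or a pickle."""
    if path.endswith((".xls", ".xlsx")):
        return pd.read_excel(path, header=None, names=FORECAST_COLUMNS)
    return pd.read_pickle(path)

# e.g. read_forecast_db("forecasts.xls") would turn the example line above into
# the row (7, "3/30/2007", "BDL", "3/31/2007", 7, 29, 20, 7).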

2.2 Load and weather data


This is the data that will be used to generate the scenarios. Scenarios will
be computed for each day present in this file. The name of the file is given
by the option load database filename, and the file is given either as a pickle
or as an Excel file. Each line contains the following entries, for each
geographical zone:
• the date
• the hour of the day: each day has 24 hours, from 1 to 24
• the forecast demand (from the ISO), although we will never use it
• the actual demand at this day and this hour
• the temperature and dew point for this day and hour.
The data given by ISO-NE provides these parameters in columns 1, 2, 3, 12 and
13 respectively, and looks like the following example; we do not use the
other data.

1/4/2011 9 3296.8 3385 48.29 49.43 -0.49 -0.65 55.59 56.10 0.00 -0.51 26
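
A minimal parsing sketch, assuming whitespace-separated columns and the 1-based
column indices quoted above; the helper name and the exact index-to-field
mapping are assumptions to be checked against the actual ISO-NE export.

def parse_load_line(line):
    """Extract the fields used by the module from one raw ISO-NE line."""
    cols = line.split()
    date = cols[0]                  # column 1: date, e.g. "1/4/2011"
    hour = int(cols[1])             # column 2: hour of the day, 1 to 24
    demand = float(cols[2])         # column 3: demand
    temperature = float(cols[11])   # column 12: temperature (assumed mapping)
    dew_point = float(cols[12])     # column 13: dew point (assumed mapping)
    return date, hour, demand, temperature, dew_point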

2.3 Zones
One can process several different zones at the same time, for instance
Connecticut and New Hampshire. One can link the zones to master zones. The
zone data is provided in a .dat file whose name is given by the option
zones filename; each entry contains the zone name and its master zone name,
separated by a space.
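
A minimal sketch of reading this file, assuming one "zone master_zone" pair per
line; the helper name and the example zone names below are illustrative.

def read_zones(path):
    """Return a {zone_name: master_zone_name} dictionary from the zones .dat file."""
    zones = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                zone, master = parts
                zones[zone] = master
    return zones

# e.g. a file with the two lines "CT NewEngland" and "NH NewEngland"
# gives {"CT": "NewEngland", "NH": "NewEngland"}.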

2.4 Locations per zone


If the weather forecast is provided for several locations in the same zone,
one has to give a coefficient to each location in order to compute a single
weather forecast for the zone.

3 Parameters
3.1 Scenarios parameters
One has to specify the different parameters that are used to generate the
scenarios. This is done in a file whose name is defined by the option
loadFromWeather params filename and which contains the following (a parsing
sketch is given after this list):

• the day part separators (dps), that is, the cutting hours that divide the
day into sub-periods, on a 0 basis, so the first dps should be 0 and the
last 23. For instance, this line will be 0 11 23 if we want to separate
mornings and afternoons.

• the distribution cutting points (DistrCutPts) for each day part separator,
which are used to generate the scenario skeleton points for each error
category and for each error sub-segment defined by the category bounds.
For instance, 11 0.0 0.5 1.0 will give two segments and two scenario
skeleton points for noon, for each error category.

• the category bounds (CatBounds) that are used to sub-segment the data
according to the error distribution function within each day part. These
are breakpoints for the CDF of the error, so the first breakpoint must be
0.0 and the last must be 1.0. For example, CatBounds 0.0 0.5 1.0 will
give two equally weighted error categories for each day part.
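
As a sketch, assuming each line starts with a keyword ("dps", "DistrCutPts",
"CatBounds") followed by its values, the file could be parsed as follows; the
exact layout of the real file may differ.

def read_scenario_params(path):
    """Parse the scenario parameters file into a dictionary (illustrative layout)."""
    params = {"dps": [], "DistrCutPts": {}, "CatBounds": []}
    with open(path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue
            if tokens[0] == "dps":
                # e.g. "dps 0 11 23"
                params["dps"] = [int(t) for t in tokens[1:]]
            elif tokens[0] == "DistrCutPts":
                # e.g. "DistrCutPts 11 0.0 0.5 1.0" -> cut points for hour 11
                params["DistrCutPts"][int(tokens[1])] = [float(t) for t in tokens[2:]]
            elif tokens[0] == "CatBounds":
                # e.g. "CatBounds 0.0 0.5 1.0"
                params["CatBounds"] = [float(t) for t in tokens[1:]]
    return params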

3.2 Date ranges and segments


It is more efficient to differentiate periods of the year that show very
different load or weather patterns, for instance summer and spring. Moreover,
one can decide to discard some days that are not significant or for which
data is missing. The date ranges are given in a .dat file whose name is
given by the option segments dates filename, and each line contains,
separated by spaces:

• the name of the period

• the number of segments that will be used to cluster days according to
the weather forecast

• the coefficients given to the temperature and to the dew point value
in the weather forecast

• for each range, date of beginning and date of end.

As an example :
summer 3 0.5 0.5 2010-5-15 2010-5-15 2010-5-17 2010-06-10
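
A minimal parsing sketch for such a line; the helper name and the date
handling are assumptions (the example keeps the non-padded month/day format
shown above).

from datetime import datetime

def parse_period_line(line):
    """Parse one line of the segments/date-range file (illustrative)."""
    tokens = line.split()
    name = tokens[0]                               # name of the period, e.g. "summer"
    n_segments = int(tokens[1])                    # number of weather segments
    temp_coef, dew_coef = float(tokens[2]), float(tokens[3])
    dates = [datetime.strptime(t, "%Y-%m-%d").date() for t in tokens[4:]]
    ranges = list(zip(dates[0::2], dates[1::2]))   # consecutive (begin, end) pairs
    return name, n_segments, (temp_coef, dew_coef), ranges

parse_period_line("summer 3 0.5 0.5 2010-5-15 2010-5-15 2010-5-17 2010-06-10")
# -> ("summer", 3, (0.5, 0.5),
#     [(date(2010, 5, 15), date(2010, 5, 15)), (date(2010, 5, 17), date(2010, 6, 10))])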

4 Options
4.1 Kick-out date
For any day after this date, the module will generate load scenarios. It can
either be given as a parameter or set with the option kickout date. Its
default value is none, which means that all days in the data from Section 2.2
will be used. The date has to be given in the format yyyy-mm-dd.

4.2 Data to fit


The option data to fit determines which of the days given in the data from
Section 2.2 will be used to compute the scenario loads. It can take three
different values: none means that data from all available days will be used;
leave-one-out means that the kick-out date (Section 4.1) will be excluded;
finally, ruling means that all days after the kick-out date will be excluded,
which is what happens in practice.

4.3 Segmenter criteria


This criterion is defined by the option seg criteria. It can be either avg or
any hour h between 1 and 24, which means that the data will be segmented
according to either the average temperature of each day or the temperature
at hour h.

5 Drive everything
The module drive everything.py loads the data and runs every module that is
needed to pre-process the data and prepare the parameters before launching
loadFromWeather.py, in the following order:

5.1 Loading data


The weather forecasts and the load data are read from the files described in
Sections 2.1 and 2.2 and written into the following dictionaries (a minimal
construction sketch is given after the list):

• forecasts[location][date day][hour] = [temperature, dew point,
wind speed]

• loads[zone name][date day][hour] = (actual demand, temperature,
dew point, forecasted demand)
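
A minimal sketch of how these nested dictionaries might be built; the nested
defaultdict construction is an illustrative choice, not necessarily the
module's own.

from collections import defaultdict

def nested_dict():
    """A dictionary whose missing keys create further nested dictionaries."""
    return defaultdict(nested_dict)

forecasts = nested_dict()
loads = nested_dict()

# filling one entry of each structure (values below are illustrative only)
forecasts["BDL"]["3/31/2007"][7] = [29, 20, 7]           # temperature, dew point, wind speed
loads["CT"]["1/4/2011"][9] = (3385.0, 48.0, 40.0, 3296.8)  # actual, temperature, dew point, forecast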

5.2 Segmenter
The segmenter clusters the days given in the data from Section 2.2 according
to the temperature at a certain hour or to the average temperature (see option
4.3). The number of segments for each period of the year is given by the
parameters in Section 3.2.
The segmenter first computes the temperature limits corresponding to the
break points given in Section 3.2. It then creates, for each master zone and
each date range, a list of the days belonging to each segment. According to
the value of option 4.2, the "kick-out day" itself or all days following it
won't be taken into account.
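
A minimal sketch of this clustering step, under the assumption that days are
simply ranked by the segmentation criterion and split into equally sized
groups; the real segmenter works from explicit temperature limits, so this is
only an illustration.

def segmentation_criterion(hourly_temps, seg_criteria):
    """hourly_temps: {hour: temperature} for one day; seg_criteria: "avg" or an hour 1..24."""
    if seg_criteria == "avg":
        return sum(hourly_temps.values()) / len(hourly_temps)
    return hourly_temps[int(seg_criteria)]

def segment_days(day_criterion, n_segments):
    """day_criterion: {date: criterion value}. Returns n_segments lists of dates,
    ordered from the coolest to the warmest days."""
    ordered = sorted(day_criterion, key=day_criterion.get)
    segments = [[] for _ in range(n_segments)]
    for i, day in enumerate(ordered):
        idx = min(i * n_segments // len(ordered), n_segments - 1)
        segments[idx].append(day)
    return segments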
The segmenter writes three types of files that will be used later:

• a .load file, for each zone and each segment, that contains the load
data (Section 2.2) that has been clustered in this segment

• a .forecasts file, for each zone and each segment, that contains the
weather forecasts (Section 2.1)

• a .segnames file, for each zone, that contains the names of the segments
followed by the coefficients given to the temperature and dew point
values in the segmentation, e.g. summer segment0 0.5 0.5.

5.3 Wednesday rules


The load depends on the day of the week, so to have more data available for
each day, we convert every day to a Wednesday. The module wedrules.py uses
the data in each segment to compute a matrix A for each day type (the five
weekdays plus one type for weekend days) that can convert any day to a
Wednesday, and computes the inverse of this matrix.
For each segment, zone and day type, the module creates a .wedrules file
containing the coefficients of this matrix and a .invwedrules file containing
the coefficients of its inverse.
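
The document does not spell out how A is built; as one plausible illustration,
the sketch below uses a diagonal 24x24 matrix whose hourly coefficients map the
average profile of a day type onto the average Wednesday profile. It shows the
role of A and of its inverse, not the exact wedrules.py computation.

import numpy as np

def wednesday_rule(daytype_profiles, wednesday_profiles):
    """daytype_profiles, wednesday_profiles: arrays of shape (n_days, 24) of hourly loads.
    Returns (A, A_inv), two 24x24 matrices."""
    ratio = wednesday_profiles.mean(axis=0) / daytype_profiles.mean(axis=0)
    A = np.diag(ratio)            # maps a day of this type to a "Wednesday-like" day
    A_inv = np.diag(1.0 / ratio)  # maps a Wednesday-like day back to this day type
    return A, A_inv

# usage: wednesday_like_load = A @ day_load, with day_load of shape (24,)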

5.4 Zone rules
As in the Wednesday rule transformation, we aggregate the data for different
zones into a smaller number of master zones. The module zonerules.py
computes and writes the coefficients of the transformation matrix and of its
inverse in .zonerules and .invzonerules files, for each segment and each
zone.

5.5 Epifit
The Epifit.py module first gathers the weather data, and the load data
transformed with the rules from Sections 5.3 and 5.4, into sorted dictionaries,
for each master zone, segment, zone and day.
For each master zone and segment, the module solves the epi-spline fitting
problems to get a regression of the load as a function of the weather forecast.
The epi-spline coefficients are computed using the data for all days and
segments.
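
The epi-spline fit itself is not described here; as a stand-in that shows where
the regression sits in the pipeline, the sketch below fits an ordinary
polynomial least-squares curve of load against a weather index and computes the
per-hour errors that end up in the .Errors files. It is not the epi-spline
method used by the module.

import numpy as np

def fit_load_curve(weather_index, load, degree=3):
    """weather_index, load: 1-D arrays over all hours of the segment.
    Returns a callable giving the regressed load for a weather index value."""
    return np.poly1d(np.polyfit(weather_index, load, degree))

def load_errors(curve, weather_index, load):
    """Error between the regressed (expected) load and the actual load, per hour."""
    return load - curve(weather_index)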
The module writes, for each master zone and each segment :

• a .Epicoeff pickle file that contains the coefficients that define the
epi-spline for this master zone and segment

• a .Eload file that contains the expected load computed from the weather
forecast with this epi-spline, for each hour of each day belonging to the
segment and master zone, after applying the inverse zone and Wednesday
transformations from Sections 5.3 and 5.4

• a .Errors file that contains the error between the computed expected load
and the actual load, for each hour of each day belonging to the segment
and master zone. The errors are kept in the Wednesday and master zone
format.

6 Load From Weather


loadFromWeather.py begins by loading the parameters from Section 3.1 that
are used to generate the scenarios.

6.1 Generate Error distribution


The function Gen Errodists performs the second segmentation of the data.
It takes the data and the category bounds given in Section 3.1, and reorganizes
the days in the data into error subsets according to the error computed in
Section 5.5 for each day part separator hour.

It then computes new epi-spline coefficients for each zone, segment and
error category, which lead to a new estimated load and to new errors with
respect to these estimates. It can then compute the distribution function of
the error in each error category.
With the inverse of the distribution function, the cutting points given in
Section 3.1 for each day part separator hour provide different ranges of the
error. In each error category and each error range, we can compute the
conditional expected value of the error with this distribution function.

The module returns the new epi-spline coefficients, the distribution
functions of the errors, their conditional expected values and their
probabilities.
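
A minimal empirical sketch of this step: sort the errors of one category, cut
the empirical CDF at the distribution cutting points, and take the conditional
mean of the errors inside each range together with the probability of that
range. This is an illustrative reconstruction, not the module's exact code.

import numpy as np

def conditional_error_expectations(errors, cut_points):
    """errors: 1-D array of errors for one category and one day part hour.
    cut_points: CDF breakpoints, e.g. [0.0, 0.5, 1.0].
    Returns (expected_errors, probabilities), one entry per error range."""
    sorted_errors = np.sort(np.asarray(errors))
    n = len(sorted_errors)
    expectations, probabilities = [], []
    for lo, hi in zip(cut_points[:-1], cut_points[1:]):
        i, j = int(round(lo * n)), int(round(hi * n))
        expectations.append(sorted_errors[i:j].mean())  # conditional mean on this range
        probabilities.append(hi - lo)                   # weight of the range
    return expectations, probabilities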

6.2 Generate the scenarios


Using loops, we go over every master zone, zone, segment and day that has
been segmented in Section 5.2 to generate its scenarios.
We apply the Wednesday and zone transformations computed in Sections 5.3
and 5.4.
With the epi-spline coefficients determined above for each day part
separator, loadFromWeather then computes a regression curve of the load as a
function of the weather for the entire day, and for each error category.
To generate the scenarios, we use the function generateScenarios, which
loops over the error categories and over the day part separators, that is,
the skeleton points. We use the computed regression curves to interpolate
the scenario paths between the skeleton points, inside each error category
and between two day part separators.
We save the load value of each skeleton point, its probability, father
node, hour, day, segment and zone in the nodes, and we make a list of the
leaves of the scenario tree.
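
A minimal sketch of one scenario path, assuming the simplest interpolation
between skeleton points: the expected error chosen at each day part separator
is interpolated linearly over the 24 hours and added to the regression curve.
The module itself interpolates along the per-category regression curves, so
this is only an approximation of the idea.

import numpy as np

def scenario_path(regressed_load, skeleton_hours, skeleton_errors):
    """regressed_load: array of 24 hourly loads from the regression curve.
    skeleton_hours: day part separators on a 0 basis, e.g. [0, 11, 23].
    skeleton_errors: expected error chosen at each skeleton hour for this scenario.
    Returns the 24 hourly loads of the scenario."""
    hours = np.arange(24)
    error_profile = np.interp(hours, skeleton_hours, skeleton_errors)
    return regressed_load + error_profile

# The probability of the whole path is the product of the probabilities of the
# error ranges chosen at the successive skeleton points.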

Finally, looping over the nodes of the tree, we write the resulting loads
corresponding to each scenario s in a file scen s zone name scengen out.dat.
For each day, we make a plot of all the computed scenarios, and we write a
summary file summary.dat containing the number of scenarios and their
probabilities.
