Minor Project
By
Mayank Shahabadee (1803096)
Mayank Dubey (1803131)
April 2021
© KALINGA INSTITUTE OF INDUSTRIAL TECHNOLOGY KIIT-
DEEMED TO BE UNIVERSITY, BHUBANESWAR ALL RIGHTS
RESERVED
CERTIFICATE
Signature of Supervisor 1
Prof. Banishree Misra
Assistant Professor
School of Electrical Engineering,
KIIT
................................................................................................................................................
The Project was evaluated by us on
EXAMINER 1 EXAMINER 2
EXAMINER 3 EXAMINER 4
ACKNOWLEDGEMENTS
Mayank Shahabadee
Mayank Dubey
Preface
• Paul Saffo
Chapter 1 - Background and Introduction:
Demand Forecasting (Big Data and Predictive Analysis):
Accurate demand forecasting can help solve many day-to-day challenges. In the current scenario, for example, we have struggled to predict the demand for oxygen cylinders and beds in COVID-19 hospitals.
Forecasting Types:
Problems with Traditional Time Series Methods:
• There is no systematic approach for identifying and selecting an appropriate model; the identification process is mainly trial and error.
• It is difficult to verify the validity of the model.
• Most traditional methods were developed from intuitive and practical considerations rather than from a statistical foundation.
Autoregressive (AR) process:
• The series' current value depends on its own previous values.
• AR(p): the current value depends on the p previous values of the series.
• p is the order of the AR process.
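As a minimal, library-free sketch of the AR idea (the coefficients 0.6 and 0.3 and the noise scale are made up for illustration), an AR(2) series can be simulated and its coefficients recovered by least squares:

```python
import numpy as np

# Simulate an AR(2) process: y(t) = 0.6*y(t-1) + 0.3*y(t-2) + noise
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] + 0.3 * y[t - 2] + rng.normal(scale=0.1)

# Fit AR(2) by least squares: regress y(t) on its two previous values
X = np.column_stack([y[1:-1], y[:-2]])   # lag-1 and lag-2 columns
target = y[2:]
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
print(coef)   # estimates land close to the true [0.6, 0.3]
```

With enough data, the least-squares estimates recover the generating coefficients, which is exactly the "current value depends on p previous values" structure.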
Example y(t) data:
ARIMA (p, d, q) modeling:
To build a time series model using ARIMA, we need to study the time series and identify p, d, and q:
1. Ensuring stationarity:
Determine the appropriate value of d.
2. Identification:
Determine the appropriate values of p and q using the ACF, PACF, and unit root tests.
(p is the AR order, d is the integration order, q is the MA order.)
3. Estimation:
Estimate an ARIMA model using the values of p, d, and q you think are appropriate.
4. Diagnostic checking:
Check the residuals of the estimated ARIMA model(s) to see if they are white noise; pick the best model with well-behaved residuals.
5. Forecasting:
Produce out-of-sample forecasts, or set aside the last few data points for in-sample forecasting.
Stationarity:
To model a time series with the Box-Jenkins approach, the series has to be stationary:
In statistical terms, a stationary process is assumed to be in a particular state of statistical equilibrium, i.e., p(x(t)) is the same for all t.
In particular, even when z(t) itself is not stationary, the first difference Δz(t) = z(t) - z(t-1), or a higher-order difference Δ^d z(t), often is; this is what the integration order d captures.
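A tiny numerical illustration of this equilibrium idea, probing it by comparing the two halves of a series: a random walk's mean wanders between halves, while the mean of its first difference stays put:

```python
import numpy as np

rng = np.random.default_rng(2)
z = np.cumsum(rng.normal(size=1000))   # random walk: not stationary
dz = np.diff(z)                        # first difference: stationary

def half_means(x):
    """Mean of the first half vs. the second half of a series."""
    h = len(x) // 2
    return x[:h].mean(), x[h:].mean()

m1, m2 = half_means(z)    # these can differ substantially
d1, d2 = half_means(dz)   # these agree closely
```

Comparing half-sample statistics is only a rough check; formal unit root tests (step 2 of the ARIMA recipe) make the same comparison rigorously.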
Achieving Stationarity:
The power of differencing:
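Differencing can be applied more than once. For instance, a second difference (d = 2) removes a quadratic trend completely, which is why choosing d is the first ARIMA step. A tiny demonstration with a made-up quadratic series:

```python
import numpy as np

t = np.arange(100, dtype=float)
y = 0.5 * t**2 + 3 * t + 7   # quadratic trend: clearly non-stationary

d1 = np.diff(y)              # one difference still leaves a linear trend
d2 = np.diff(y, n=2)         # two differences: constant, trend fully removed
print(d2[:3])                # prints [1. 1. 1.]
```

Every entry of the twice-differenced series equals 1.0 (twice the quadratic coefficient), so no trend remains.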
Other Methods to Achieve Stationarity:
Chapter 2 – Model Selection
Prophet Model (by Facebook):
Prophet is a generalized additive model that treats time series prediction as a curve-fitting exercise with the formula:
y(t) = g(t) + s(t) + h(t) + ε(t)
g(t) - trend, s(t) - seasonality, h(t) - holiday effects, ε(t) - error/residual
Advantages:
Code Sample:
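A minimal, library-free sketch of the additive curve-fitting view, using a synthetic series with a made-up linear trend and weekly pattern (in practice Prophet is fit through its own Python API on a dataframe with ds and y columns; this only illustrates the formula):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(210)
g = 0.1 * t                                   # trend g(t)
s = np.tile([0, 1, 2, 1, 0, -2, -2], 30)      # weekly seasonality s(t)
y = g + s + rng.normal(scale=0.1, size=210)   # y(t) = g(t) + s(t) + ε(t)

# Recover g(t): least-squares linear trend
A = np.column_stack([np.ones_like(t), t])
(b0, b1), *_ = np.linalg.lstsq(A, y, rcond=None)

# Recover s(t): average the detrended values by day-of-week
detrended = y - (b0 + b1 * t)
s_hat = np.array([detrended[d::7].mean() for d in range(7)])
```

The fitted slope lands near the true 0.1 and the per-day averages recover the weekly shape, which is exactly the decomposition the Prophet formula expresses.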
Recurrent Neural Networks (RNNs):
• Designed to handle sequential data.
• Time series are sequential data, which makes RNNs a natural fit.
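A minimal recurrent cell written from scratch shows how the hidden state carries sequence information forward (the dimensions and random weights here are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(4)
# A minimal recurrent cell: h(t) = tanh(W x(t) + U h(t-1) + b)
n_in, n_hidden = 1, 8
W = rng.normal(scale=0.5, size=(n_hidden, n_in))
U = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

def rnn_forward(xs):
    """Run the cell over a sequence, carrying the hidden state forward."""
    h = np.zeros(n_hidden)
    for x in xs:
        h = np.tanh(W @ np.atleast_1d(x) + U @ h + b)
    return h   # a fixed-size summary of the whole sequence

h = rnn_forward([0.1, 0.2, 0.3, 0.4])
```

Because the same weights are reused at every step, the cell handles sequences of any length, which is what makes the architecture suitable for time series.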
Chapter 3 - Time Series as Regression:
To use time series data as supervised data for a regressor, we use what is called the sliding window method: take t1, t2, t3 as x and t4 as y, so that t1, t2, t3 predict t4; then move the window ahead, so that t2, t3, t4 predict t5. This is how we generate the samples for supervised learning.
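The sliding window described above can be sketched in a few lines:

```python
def sliding_window(series, width):
    """Turn a series into (X, y) pairs: `width` past values predict the next."""
    X, y = [], []
    for i in range(len(series) - width):
        X.append(series[i:i + width])
        y.append(series[i + width])
    return X, y

X, y = sliding_window([10, 20, 30, 40, 50], width=3)
# X = [[10, 20, 30], [20, 30, 40]] and y = [40, 50]
```

Each row of X with its y value is one supervised sample, ready for any ordinary regressor.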
Chapter 4 - Feature Engineering:
Chapter 5 - Common Problems faced in Regression:
2. Lag and Rolling Features: Pandas' rolling-window mean includes the current timestamp by default, which leaks the target directly into the rolling mean. We therefore shift the series by one step before applying the rolling function; this guarantees the target is not leaked.
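A small sketch of the leak and its fix on a toy series:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Leaky: at index 3 this is mean(2, 3, 4), which includes the target s[3] = 4
leaky = s.rolling(3).mean()

# Safe: shift by one first, so index 3 only sees mean(1, 2, 3)
safe = s.shift(1).rolling(3).mean()

print(leaky[3], safe[3])   # prints 3.0 2.0
```

The shifted version uses strictly past values, so the feature at time t never contains the value being predicted at time t.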
4. Direct Forecast: A single multi-output model has advantages over a multi-model direct forecast: forecasting 18 months ahead with one model per horizon step means maintaining 18 models, which is difficult to manage. When the regressor supports multiple targets (chiefly in the neural network world), a single model can take the past data and directly predict t+1, t+2, and t+3.
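A minimal sketch of one multi-target model doing direct forecasting, using plain least squares (which accepts a matrix of targets) in place of a neural network; the series and window sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(5)
series = np.sin(np.arange(200) / 5.0) + rng.normal(scale=0.05, size=200)

# Build samples: 6 past values predict the next 3 values (t+1, t+2, t+3)
width, horizon = 6, 3
n_samples = len(series) - width - horizon + 1
X = np.array([series[i:i + width] for i in range(n_samples)])
Y = np.array([series[i + width:i + width + horizon] for i in range(n_samples)])

# One linear model with 3 outputs: lstsq fits all targets at once
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
pred = series[-width:] @ coef   # direct forecast of the next 3 steps
```

One weight matrix serves all three horizons, which is the managerial advantage over training a separate model per step.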
Chapter 6 - Forecasting at Scale:
Automated Model Selection (AutoML) and Hyperparameter Tuning:
Our default choice for model selection is grid search, on top of which we do hyperparameter tuning. Trying out all possible hyperparameter combinations is computationally expensive for large datasets.
Random Search:
It's basically like throwing darts in a dark room and hoping one hits the target.
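A toy comparison of the two strategies, with a made-up objective function standing in for training and validating a model:

```python
import random

def objective(lr, depth):
    """Toy validation score; stands in for training + evaluating a model."""
    return -(lr - 0.1) ** 2 - 0.01 * (depth - 5) ** 2

# Grid search: every combination (the cost multiplies with each parameter)
grid = [(lr, d) for lr in [0.01, 0.1, 1.0] for d in [3, 5, 7]]
best_grid = max(grid, key=lambda p: objective(*p))

# Random search: a fixed budget of random draws from the same ranges
random.seed(0)
trials = [(random.uniform(0.01, 1.0), random.randint(3, 7))
          for _ in range(20)]
best_rand = max(trials, key=lambda p: objective(*p))
```

Grid search is exhaustive but its cost explodes with each added parameter; random search spends a fixed budget and often lands close to the optimum anyway, which is why it is the common fallback for large datasets.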
- This is the prior belief in the Bayesian world.
3. Now we sample three parameters from these prior distributions and record the value of the metric. Based on that value, we update the distributions of the three parameters; this is called the posterior.
4. We repeat this for N iterations, each iteration sharpening the posterior and moving closer to the optimal solution.
Sample Code:
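A toy sketch of the sample, score, and update loop, assuming a single hyperparameter and a Gaussian "belief" in place of a real surrogate model (a production Bayesian optimizer would use something like a Gaussian process):

```python
import random

random.seed(1)

def metric(x):
    """Toy validation metric to maximize; the optimum sits at x = 2.0."""
    return -(x - 2.0) ** 2

# The prior belief over the hyperparameter: a broad Gaussian
mu, sigma = 0.0, 4.0

for _ in range(25):
    # Sample candidates from the current belief and score them
    candidates = [random.gauss(mu, sigma) for _ in range(3)]
    best = max(candidates, key=metric)
    # Update the belief toward the best candidate (the "posterior"),
    # sharpening it on every iteration
    mu = 0.5 * mu + 0.5 * best
    sigma *= 0.8

# mu has moved close to the optimum at 2.0, and sigma has shrunk
```

Each iteration both re-centers and narrows the belief, which is the "sharpening posterior" behaviour the steps above describe.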
Pooled vs Un-pooled Model:
Chapter 7 - Time Series Segmentation:
Chapter 8 - Unsupervised Clustering for Time Series:
o Euclidean Distance
• Sensitive to time shifts
• Only works for equal-length time series
Autoencoders:
o Use a deep neural network to learn a feature representation of a time series.
o Apply normal clustering algorithms to those representations.
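A dependency-free sketch of the Euclidean case: with equal-length series, each series is just a point in R^n and plain k-means applies directly (the two synthetic groups below are made up; tslearn's TimeSeriesKMeans covers the general case, including DTW):

```python
import numpy as np

rng = np.random.default_rng(6)
# Two groups of equal-length series (Euclidean distance requires equal length)
group_a = rng.normal(loc=0.0, scale=0.2, size=(5, 30))
group_b = rng.normal(loc=3.0, scale=0.2, size=(5, 30))
series = np.vstack([group_a, group_b])

# Plain k-means with k = 2, treating each 30-point series as a 30-dim point
centers = series[[0, 5]].copy()   # seed one center per group for determinism
for _ in range(10):
    dists = np.linalg.norm(series[:, None] - centers[None], axis=2)
    labels = dists.argmin(axis=1)
    centers = np.array([series[labels == k].mean(axis=0) for k in range(2)])
```

This recovers the two groups cleanly, but a time-shifted copy of a series would land far away in Euclidean distance, which is exactly the sensitivity noted above.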
Python Libraries:
tslearn implements the classical time series clustering algorithms. Autoencoders can be coded using any of the popular deep learning frameworks.
From Point Forecast to Probabilistic Forecast:
Chapter 9 - Conclusion:
1.> For large-scale data, neural network models give the best results.
2.> For small datasets, a precisely hyperparameter-tuned model performs best.
3.> Multivariate models can outperform univariate models on larger datasets.
Chapter 10 - Future Challenges and Research Problems:
1.> Self-optimization of the model according to the kind of dataset given or available.
2.> Optimizing pretrained models and customizing them to fit the dynamic nature of the data.
3.> Data analysis still misses insights because access to classified data is restricted. Data should be openly available for analysis without violating personal identity.