
Module 5

5.1 Taxonomy of Time Series Forecasting Methods, Time Series Decomposition
5.2 Smoothing Methods: Average method, Moving Average
smoothing, Time series analysis using linear regression, ARIMA Model,
Performance Evaluation: Mean Absolute Error, Root Mean Square Error,
Mean Absolute Percentage Error, Mean Absolute Scaled Error
5.3 Self-Learning Topics: Evaluation parameters for Classification,
regression and clustering.
Taxonomy of Time Series Forecasting Methods
The taxonomy of time series forecasting can be broadly divided into
three categories:
Univariate Forecasting: It involves forecasting the future values of a
single variable over time, such as sales figures or stock prices.
Multivariate Time Series Forecasting: It involves forecasting multiple
variables together and modelling the interactions between them, such as
the relationship between a company's stock price and its profit.
Time Series Forecasting with External Variables: It involves forecasting
the future values of a variable based on the impact of one or more
independent (exogenous) variables, such as the impact of advertising on sales.
Time Series Data
• Time-series data are observations obtained over time through
repeated measurements and collected together. When the data are
plotted on a graph, one of the axes is always time.
• Time series metrics are specific data tracked at set time increments.
For example, a time series metric might track how much inventory a
store sells each day. A user might plot this data for a month to see
when the busiest sales days were.
• Because time is always an observable factor, time series data is
everywhere. Sensors and systems emit time series data constantly as
the world around us becomes more instrumented, and that data can
be applied across numerous industries in many ways.
Time Series Data Examples
• Stock Prices
• Sales data
• Economic indicators
• Weather data
• Traffic data
• Energy Consumption
• Medical data
Objectives of time series data analysis:
• Trend Analysis
• Seasonality analysis
• Forecasting
• Anomaly Detection
• Feature Extraction
Steps to Analyze Time Series Data
• Data Preparation
• Visualization
• Decomposition
• Model Selection
• Model Fitting
• Model Evaluation
• Forecasting
• Model Refinement
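A minimal sketch of the first few of these steps in Python with pandas; the file sales.csv and its date/sales columns are hypothetical placeholders, and the later steps (decomposition, model fitting, evaluation, forecasting) are illustrated in the sections that follow:

```python
# First steps sketch: preparation, visualization, and a hold-out split.
# "sales.csv" and its "date"/"sales" columns are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt

# Data preparation: parse dates, index by time, enforce a daily frequency
df = pd.read_csv("sales.csv", parse_dates=["date"], index_col="date")
series = df["sales"].asfreq("D").interpolate()  # fill small gaps

# Visualization: a first plot usually reveals trend and seasonality
series.plot(title="Daily sales")
plt.show()

# Keep the last 30 days aside for model evaluation later
train, test = series[:-30], series[-30:]
```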
Time Series Forecasting
Time series forecasting is a technique for the prediction of events through a sequence
of time. It predicts future events by analyzing the trends of the past, on the assumption
that future trends will hold similar to historical trends. It is used across many fields of
study in various applications including:
• Astronomy
• Business planning
• Control engineering
• Earthquake prediction
• Econometrics
• Mathematical finance
• Pattern recognition
• Resources allocation
• Signal processing
• Statistics
• Weather forecasting
Time Series Decomposition Techniques
Time series data consists of observations taken at consecutive points in
time. These data can often be decomposed into multiple components to
better understand the underlying patterns and trends. Time series
decomposition is the process of separating a time series into its
constituent components, such as trend, seasonality, and noise.
Time series decomposition helps us break down a time series dataset into
three main components:
1. Trend: The trend component represents the long-term movement in the
data, representing the underlying pattern.
2. Seasonality: The seasonality component represents the repeating,
short-term fluctuations caused by factors like seasons or cycles.
3. Residual (Noise): The residual component represents random variability
that remains after removing the trend and seasonality.
Types of Time Series Decomposition Techniques
• Additive Decomposition:
In additive decomposition, the time series is expressed as the sum of its
components : Y(t) = Trend(t) + Seasonal(t) + Residual(t)
It’s suitable when the magnitude of seasonality doesn’t vary with the
magnitude of the time series.
• Multiplicative Decomposition:
In multiplicative decomposition, the time series is expressed as the
product of its components : Y(t) = Trend(t) * Seasonal(t) * Residual(t)
It’s suitable when the magnitude of seasonality scales with the magnitude
of the time series.
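The two forms can be contrasted with statsmodels' seasonal_decompose; this is a minimal sketch on a synthetic monthly series built so that the seasonal swing grows with the level, which favours the multiplicative form:

```python
# Contrast additive and multiplicative decomposition on a synthetic
# monthly series whose seasonal swing grows with the level.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2015-01-01", periods=96, freq="MS")
trend = np.linspace(100, 300, 96)
season = 1 + 0.2 * np.sin(2 * np.pi * idx.month / 12)
noise = np.random.default_rng(0).normal(0, 5, 96)
y = pd.Series(trend * season + noise, index=idx)

# Additive: Y(t) = T(t) + S(t) + R(t); multiplicative: Y(t) = T(t) * S(t) * R(t)
add_res = seasonal_decompose(y, model="additive", period=12)
mul_res = seasonal_decompose(y, model="multiplicative", period=12)

# Multiplicative seasonal factors hover around 1; additive ones are in
# the units of the series itself
print(mul_res.seasonal.head(12))
print(add_res.seasonal.head(12))
```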
Methods of Decomposition
Moving Averages:
• Moving averages involve calculating the average of a certain number of
past data points.
• It helps smooth out fluctuations and highlight trends.
Seasonal Decomposition of Time Series:
• The Seasonal and Trend decomposition using Loess (STL) is a popular
method for decomposition, which uses a combination of local regression
(Loess) to extract the trend and seasonality components.
Exponential Smoothing State Space Model:
• This method involves using the ETS framework to estimate the trend and
seasonal components in a time series.
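A short sketch of these methods on a synthetic monthly series: a centred moving average for the trend, STL for a full decomposition, and a Holt-Winters fit for the ETS framework; the window and period choices are illustrative assumptions:

```python
# Three decomposition methods on a synthetic monthly series: a centred
# moving average, STL, and a Holt-Winters (ETS-style) fit.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.holtwinters import ExponentialSmoothing

idx = pd.date_range("2015-01-01", periods=96, freq="MS")
rng = np.random.default_rng(1)
y = pd.Series(np.linspace(50, 150, 96)
              + 10 * np.sin(2 * np.pi * idx.month / 12)
              + rng.normal(0, 3, 96), index=idx)

# Moving average: a 12-month centred window averages out the seasonality
trend_ma = y.rolling(window=12, center=True).mean()

# STL: Loess-based extraction of trend, seasonal and residual components
stl_res = STL(y, period=12).fit()
print(stl_res.trend.head(), stl_res.seasonal.head(), stl_res.resid.head())

# ETS framework: estimate trend and seasonal components by smoothing
ets_fit = ExponentialSmoothing(y, trend="add", seasonal="add",
                               seasonal_periods=12).fit()
print(ets_fit.forecast(6))  # six months ahead
```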
Performance Evaluation
• Performance evaluation metrics are used to assess the accuracy of
time series forecasting models, including smoothing models. The
most commonly used metrics for this purpose are:
• Mean Absolute Error: The mean absolute error (MAE) is the average
of the absolute differences between the actual values in the dataset
and the corresponding predicted values: MAE = (1/N) * Σ |y(i) − ŷ(i)|.
• Root Mean Square Error: Root mean square error or root mean square
deviation is one of the most commonly used measures for evaluating the
quality of predictions. It shows how far predictions fall from measured true
values using Euclidean distance.
To compute RMSE, calculate the residual (difference between prediction and
truth) for each data point, square each residual, compute the mean of the
squared residuals, and take the square root of that mean.
RMSE is commonly used in supervised learning applications, as it requires a
true measurement at every predicted data point.
Root mean square error can be expressed as

RMSE = sqrt( (1/N) * Σ (y(i) − ŷ(i))² )

where N is the number of data points, y(i) is the i-th measurement, and ŷ(i) is its
corresponding prediction.
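A small numeric check of this recipe, on made-up values:

```python
# A small numeric check of the RMSE recipe above, on made-up values.
import numpy as np

y_true = np.array([112.0, 118.0, 132.0, 129.0])
y_pred = np.array([110.0, 120.0, 128.0, 131.0])

residuals = y_pred - y_true                  # difference at each point
rmse = np.sqrt(np.mean(residuals ** 2))      # mean of squares, then root
print(rmse)                                  # ~2.65
```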
• Mean Absolute Percentage Error: MAPE is the mean absolute
percentage error, a relative measure that essentially scales MAE
to percentage units instead of the variable's units:
MAPE = (100/N) * Σ |y(i) − ŷ(i)| / |y(i)|.
• Mean Absolute Scaled Error
• In time series forecasting, Mean Absolute Scaled Error (MASE) is
a measure for determining the effectiveness of forecasts
generated through an algorithm by comparing the predictions
with the output of a naïve forecasting approach. Let’s break this
down to understand in detail:
• Naïve Forecast
The naïve forecast is generated at any step by equating the
current forecast to the output from the last time step. For
example, prediction of sales of a company at the start of a month
is done by equating it to the actual sales from the last month
without considering any seasonal pattern.
Consider a time series with outputs for N steps given as y(1), y(2), …, y(N).
The naïve forecast at any step t is simply the previous observation:

ŷ(t) = y(t−1)

Naïve forecast error at different time steps is given by:

e(t) = |y(t) − y(t−1)|, for t = 2, …, N

MASE is then the model's mean absolute error divided by the mean of these
naïve errors; a value below 1 means the model outperforms the naïve forecast.
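A sketch computing MAE, MAPE, and MASE on made-up values, using the one-step naïve forecast above as the scaling baseline:

```python
# MAE, MAPE and MASE on made-up values, with the one-step naive
# forecast as the MASE scaling baseline.
import numpy as np

y_true = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
y_pred = np.array([110.0, 120.0, 128.0, 131.0, 119.0])

mae = np.mean(np.abs(y_true - y_pred))
mape = 100 * np.mean(np.abs((y_true - y_pred) / y_true))

# Naive baseline: each step predicts the previous actual value,
# so its errors are |y(t) - y(t-1)|
naive_errors = np.abs(np.diff(y_true))
mase = mae / np.mean(naive_errors)

print(f"MAE={mae:.2f}  MAPE={mape:.2f}%  MASE={mase:.2f}")
```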
Autoregressive Integrated Moving Average (ARIMA)
• The Autoregressive Integrated Moving Average (ARIMA) model uses
time-series data and statistical analysis to interpret the data and make
future predictions. It aims to explain a series through its own past
values, using a linear-regression-style equation on those lagged values
to make predictions.
• The ARIMA model uses statistical analyses in combination with accurately collected
historical data points to predict future trends and business needs.
• The ARIMA model is typically denoted with the parameters (p, d, q), which can be
assigned different values to modify the model and apply it in different ways.
• Some of the limitations of the model are its dependency on data collection and the
manual trial-and-error process required to determine parameter values that fit best.
Understanding the ARIMA Model
The following descriptive acronym explains the meaning of each of the key components
of the ARIMA model:
• The “AR” in ARIMA stands for autoregression, indicating that the model uses the
dependent relationship between current data and its past values. In other words, it
shows that the data is regressed on its past values.
• The “I” stands for integrated, meaning the data has been made stationary
by differencing: each observation is replaced by its difference from the
previous value, repeated until the series is stationary.
• The “MA” stands for moving average model, indicating that the forecast
depends linearly on past forecast errors; that is, the errors in
forecasting are linear functions of past errors. Note that moving average
models are different from statistical moving averages.
Each of the AR, I, and MA components are included in the model as
a parameter. The parameters are assigned specific integer values that
indicate the type of ARIMA model. A common notation for the ARIMA
parameters is shown and explained below:
• ARIMA (p, d, q)
• The parameter p is the number of autoregressive terms or the number of
“lag observations.” It is also called the “lag order,” and it sets how
many lagged data points feed into the model.
• The parameter d is known as the degree of differencing. It indicates the
number of times the observations are differenced (each value replaced by
its difference from the previous value) to make the data stationary.
• The parameter q is the number of lagged forecast errors in the model and
is also referred to as the size of the moving average window.
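A minimal fitting sketch with statsmodels; the (1, 1, 1) order and the synthetic trending series are illustrative assumptions, not recommendations:

```python
# Fitting an ARIMA(p, d, q) model with statsmodels; the (1, 1, 1) order
# and the synthetic trending series are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

idx = pd.date_range("2020-01-01", periods=200, freq="D")
rng = np.random.default_rng(42)
y = pd.Series(np.cumsum(rng.normal(0.5, 1.0, 200)), index=idx)

# p=1 lagged observation, d=1 differencing pass, q=1 lagged forecast error
model = ARIMA(y, order=(1, 1, 1))
fitted = model.fit()

print(fitted.forecast(steps=10))  # forecast the next 10 days
```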
Applications of the ARIMA Model
• In business and finance, the ARIMA model can be used to forecast
future quantities (or even prices) based on historical data. Therefore,
for the model to be reliable, the data must be reliable and must show
a relatively long time span over which it’s been collected. Some of the
applications of the ARIMA model in business are listed below:
• Forecasting the quantity of a good needed for the next time period
based on historical data.
• Forecasting sales and interpreting seasonal changes in sales.
• Estimating the impact of marketing events, new product launches, and
so on.
