TIME SERIES
Project Submitted By:
ALEN SUNNY

Under the Guidance of:
MS. DEVI N
Assistant Professor
Department of Mathematics

DEPARTMENT OF MATHEMATICS
ST. BERCHMANS COLLEGE, CHANGANASSERY
2021-2024
CERTIFICATE
This is to certify that Mr. ALEN SUNNY has undergone the Bachelor of Science in Mathematics course at St. Berchmans College, Changanassery, during the period 2021-2024 and has undertaken the dissertation under the guidance of Ms. Devi N, Assistant Professor, Department of Mathematics.
Changanassery
Department of Mathematics
CERTIFICATE
This is to certify that the project entitled TIME SERIES is a record of bonafide project work done by ALEN SUNNY (Reg. No: 12100003) under my guidance and supervision, in partial fulfilment of the requirements for the award of the Bachelor of Science Degree in Mathematics, and that this project has not been previously submitted for the award of any Degree, Diploma, Fellowship, Title or Recognition.
Changanassery
19 / 02 / 2024
DECLARATION
I, ALEN SUNNY (Reg. No: 12100001, 2021-2024), do hereby declare that the dissertation entitled TIME SERIES is a bonafide record of project work done by me under the guidance and supervision of Ms. DEVI N, Assistant Professor, Department of Mathematics, St. Berchmans College (Autonomous), Changanassery, and that this dissertation or any part thereof has not previously formed the basis for the award of any degree, diploma, associateship, fellowship or any other similar title.
19 / 02 / 2024
ACKNOWLEDGEMENT
First and foremost, praises and thanks to the Almighty God for guiding me to complete this project
successfully.
I take this opportunity to express my profound gratitude and deep regards to my guide Ms. DEVI N, Assistant Professor, Department of Mathematics, for her exemplary guidance, monitoring and constant encouragement throughout the course of this project. The blessing, help and guidance given by her from time to time shall carry me a long way in the journey of life on which I am about to embark.
I would like to thank Fr. John J Chavara, Assistant Professor and Head of the Department of Mathematics, for his valuable support and encouragement.
I may add that I am indebted to my family for their valuable encouragement. I express my sincere
thanks to all my colleagues and friends for their help and suggestions to improve this report.
Finally, my thanks go to all the people who have supported me to complete the project work
directly or indirectly.
ALEN SUNNY
Contents
1.1 Definition
1.4 Objective
2 Estimation of Trend
2.1 Method for Measurement of Secular Trend
3.2 Forecasting related models of Time Series
4 FITTING
4.1.1 Identification
4.1.2 Estimation
4.1.3 Verification
4.2.1 Forecasting
INTRODUCTION
Time series analysis serves as a powerful tool in understanding and deciphering patterns within
sequential data. In the first chapter, we discuss the fundamental definition of time series and
explore its various manifestations, unraveling the intricate variations that make it a captivating
field of study.
Moving forward, the second chapter focuses on the pivotal methods employed for measuring sec-
ular trends and deciphering seasonal variations within time series data. This chapter acts as a
crucial foundation for comprehending the nuances of temporal patterns inherent in the datasets
under scrutiny.
The third chapter takes a comprehensive approach, exploring the landscape of time series mod-
els. From additive to multiplicative models, and delving into the intricacies of Autoregressive
(AR), Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA), and more,
this section equips the reader with a diverse toolkit for analyzing and interpreting time-dependent
phenomena.
Our journey culminates in the fourth chapter, where the Box-Jenkins approach takes center stage.
This chapter unfolds the methodology behind fitting time series models, providing a systematic approach to model identification, estimation and verification.
PRELIMINARIES
What is a Time Series? A time series refers to the values of a variable recorded in chronological order over successive periods of time. In other words, it is a set of data collected and arranged in accordance with time. The analysis of a time series means separating out the different components of which it is composed.
Examples
• Weather Data
• Rainfall Measurements
Mean
It is the average of the given numbers and is calculated by dividing the sum of the given numbers by the number of observations:

Mean = Σx / n
Covariance
Covariance is a measure of how much two random variables change together. It provides insights into the direction of the relationship between two variables, i.e. whether they tend to increase or decrease together:

Cov(X, Y) = Σ (Xi − X̄)(Yi − Ȳ) / (n − 1), where the sum runs over i = 1, . . . , n
Method of least squares It is the process of finding the best-fitting curve or line of best fit for a set of data points by minimising the sum of the squares of the offsets (residuals) of the points from the curve. In the linear case, the method fits a line of the form

y = mx + b

where y and x are the variables, m is the slope and b is the y-intercept.
Pearson correlation coefficient It is a number between −1 and 1 that measures the strength and direction of the linear relationship between two variables:

ρ(X, Y) = Cov(X, Y) / (σX σY)
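As an illustration (the numbers below are made up for this sketch), the three formulas above can be checked with a few lines of Python:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

mean_x = x.sum() / len(x)                                         # Mean = (sum of x) / n
cov_xy = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)   # sample covariance
rho = cov_xy / (x.std(ddof=1) * y.std(ddof=1))                    # Pearson correlation

print(mean_x, cov_xy, rho)
```

Here the two series move together perfectly, so the correlation comes out as exactly 1.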
Chapter 1
1.1 Definition
Assume that the series Xt runs throughout time, that is (Xt), t = 0, ±1, ±2, . . ., but is only observed at some of these times; we then speak of the process (Xt), t ∈ Z.
The theory of time series is based on the assumption of 'second-order stationarity': a second-order stationary time series has a constant mean, a constant variance, and an autocovariance that depends only on the lag and not on time.
The variations in a time series can be divided into two parts: long-term variations and short-term variations.
1.2 Long Term Variation
Long-term variations refer to patterns or changes in a phenomenon that occur over extended periods of time, typically spanning years, decades, or even centuries. These variations are characterised by their gradual and persistent nature, and are of two types:
1. Secular Trend
2. Cyclical Variation
Secular Trend shows the definite and basic tendency of the statistical data with the passage of time. Such data often show a consistent upward or downward direction.
Examples
1. Changes in productivity
2. Growth of population
Cyclical variations refer to recurring, periodic fluctuations that extend beyond a year. These
variations are often linked to economic or business cycles. Series relating to prices, production, demand, etc. undergo cyclical variation.
Examples
1. Stock Prices
2. Medical Data
1.3 Short Term Variation
Short-term variations refer to changes that occur over relatively brief periods of time, typically within a year. They are of two types:
1. Seasonal Variations
2. Irregular Variations
Seasonal Variations are those variations which occur with some degree of regularity within a specific period of one year or shorter. These variations are associated with recurring events, customs and seasons.
Examples
• During the Diwali season, the sale of crackers increases.
Irregular Variations, also known as random variations, are caused by unusual, unexpected and random events, and are erratic in nature.
Examples
1. Crime Rates
2. Transportation Trends
1.4 Objective
The objective of this project is to analyse a time series by breaking it down into its components, to fit a mathematical model, and to proceed to forecast the future.
Chapter 2
Estimation of Trend
As we know time series consists of data arranged chronologically. In forecasting (an application of
Time series) it is important to analyse the characteristic movement of variations in the given time
series. The following methods serve as tools for this analysis:

2.1 Method for Measurement of Secular Trend
2.1.1 Freehand curve Method (Graphical Method)
This is the simple method of studying trend. In this method the given time series data are plotted
on graph paper by taking time on X-axis and the other variable on Y-axis. The graph obtained will
be irregular as it would include short-run oscillations. We may observe the up and down movement
of the curve, and if a smooth freehand curve is drawn passing approximately through all points of the curve
previously drawn, it would eliminate the short-run oscillations (seasonal, cyclical and irregular
variations) and show the long-period general tendency of the data. However, It is very difficult to
draw a freehand smooth curve and different persons are likely to draw different curves from the
same data. The following points must be kept in mind in drawing a freehand smooth curve:
1. The number of points above the line or curve should be equal to the number of points below it.
2. The sum of the vertical deviations of the points above the smoothed line should be equal to the sum of the vertical deviations of the points below the line. In this way the positive deviations will cancel the negative deviations; these deviations are the effects of the seasonal, cyclical and irregular variations.
3. The sum of the squares of the vertical deviations from the trend line or curve should be a minimum.
Example
Year 1990 1991 1992 1993 1994 1995 1996 1997 1998
Sales (in Lakhs ) 65 95 115 63 120 100 150 135 172
If we draw a graph taking the year on the x-axis and sales on the y-axis, it will be irregular as shown below. Drawing a freehand curve passing approximately through all these points will represent the secular trend of the data.
Merits
1. If the observations are relatively stable, the trend can easily be approximated by this method.
Demerits
1. It is a highly subjective method; different persons are likely to draw different curves from the same data.

2.1.2 Method of Selected Points
In this method, two points considered to be the most representative or normal are joined by a straight line to get the secular trend. This, again, is a subjective method, since different persons may have different opinions regarding the representative points. Further, only a linear trend can be obtained by this method.

2.1.3 Method of Semi-Averages
In this method, as the name itself suggests, semi-averages are calculated to find out the trend values.
By semi-averages is meant the averages of the two halves of a series. In this method, thus, the
given series is divided into two equal parts (halves) and the arithmetic mean of the values of each
part (half) is calculated. The computed means are termed as semi-averages. Each semi-average is
paired with the centre of time period of its part. The two pairs are then plotted on a graph paper
and the points are joined by a straight line to get the trend. It should be noted that if the data is
for an even number of years, it can easily be divided into two halves. But if it is for an odd number of years, we leave out the middle year of the time series, and the two halves constitute the periods on each side of it.
Merits
1. It is an objective method because anyone applying it to the given data would get identical trend values.
Demerits
1. This method can give only a linear trend of the data, irrespective of whether such a trend exists or not.
2. This is only a crude method of measuring trend, since we do not know whether the effects of the other components have been completely eliminated.
Example
Fit a trend line by the method of semi averages for the given data
Solution
Since the number of years is odd (seven), we leave out the middle year's production value and obtain the averages of the first three years and the last three years.
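Since the data table for this example did not survive, the computation can be illustrated with the nine-year sales series from the freehand example (1990-1998; the number of years is odd, so the middle year 1994 is left out). A Python sketch:

```python
# Semi-averages on the nine-year sales series; 1994 (the middle year) is omitted.
years = [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998]
sales = [65, 95, 115, 63, 120, 100, 150, 135, 172]

first_half = sales[:4]            # 1990-1993
second_half = sales[5:]           # 1995-1998

semi_avg_1 = sum(first_half) / len(first_half)     # centre of the first half
semi_avg_2 = sum(second_half) / len(second_half)   # centre of the second half

# The straight line joining the two semi-averages is the trend line;
# its slope is the average annual increase (the two centres are 5 years apart).
slope = (semi_avg_2 - semi_avg_1) / 5.0
print(semi_avg_1, semi_avg_2, slope)
```

Plotting the two semi-averages against the centres of their halves and joining them by a straight line gives the trend described above.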
Figure 2.1: Trend line
2.1.4 Method of Least Squares
This is one of the most popular methods of fitting a mathematical trend. The fitted trend is
termed as the best in the sense that the sum of squares of deviations of observations, from it, is
minimized. The method of least squares may be used to fit either a linear or a nonlinear trend. Given the data (yt, t) for n periods, where t denotes the time period (year, month, day, etc.), we obtain the values of the two constants 'a' and 'b' of the linear trend equation

yt = a + bt
where the value of 'a' is merely the y-intercept, i.e. the height of the line above the origin (when t = 0, yt = a). The other constant 'b' represents the slope of the trend line: when b is positive the slope is upward, and when b is negative the slope is downward. This line is termed the line of best fit because it is fitted so that the total of the squared deviations of the given data from the line is minimum; the total of deviations is calculated by squaring the difference between the trend value and the actual value of the variable, hence the term "least squares". Using the least squares method, the normal equations for obtaining the values of a and b are:
Σyt = na + b Σt
Σ t·yt = a Σt + b Σt²

Let X = t − A, where A denotes the year of origin, chosen such that ΣX = 0. The above equations can then be written as

ΣY = na + b ΣX
ΣXY = a ΣX + b ΣX²

Since ΣX = 0, i.e. the sum of deviations from the actual mean is zero, we can write

a = ΣY / n ;  b = ΣXY / ΣX²
Merits
1. Given the mathematical form of the trend to be fitted, the least squares method is an
objective method.
2. Unlike the moving average method, it is possible to compute trend values for all the periods
and predict the value for a period lying outside the observed data
3. The results of the method of least squares are most satisfactory because the fitted trend satisfies the two most important properties, i.e. (1) Σ(y0 − yt) = 0 and (2) Σ(y0 − yt)² is minimum. Here y0 denotes the observed values and yt the calculated trend values.
The first property implies that the position of the fitted trend is such that the sum of deviations of observations above and below it is equal to zero. The second property implies that the sum of squares of deviations of observations about the trend equation is minimum.
Demerits
1. It is not flexible like the moving average method. If some observations are added, the entire calculation has to be redone.
2. It can predict or estimate values only for the immediate future or past.
3. This method cannot be used to fit growth curves, the pattern followed by most business and economic time series.
Example
Given below are the data relating to the production of sugarcane in a district.
Fit a straight-line trend by the method of least squares and tabulate the trend values.
Solution
a = ΣY / n = 316 / 7 = 45.143 ;  b = ΣXY / ΣX² = 29 / 28 = 1.036
Year (x)   Production of Sugarcane (Y)   X = x − 2003   X²   XY     Trend Values (yt)
2000       40                            −3             9    −120   42.04
2001       45                            −2             4    −90    43.07
2002       46                            −1             1    −46    44.11
2003       42                             0             0     0     45.14
2004       47                             1             1     47    46.18
2005       50                             2             4    100    47.22
2006       46                             3             9    138    48.23
N = 7      ΣY = 316                      ΣX = 0         ΣX² = 28   ΣXY = 29   Σyt = 316
Y = a + bX, i.e. Y = 45.143 + 1.036 (x − 2003)
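The values of a and b above can be verified with a short Python sketch of the least squares computation (using the X = x − 2003 coding so that ΣX = 0):

```python
# Least-squares trend for the sugarcane example.
X = [-3, -2, -1, 0, 1, 2, 3]
Y = [40, 45, 46, 42, 47, 50, 46]

n = len(Y)
a = sum(Y) / n                                                  # a = ΣY / n = 316 / 7
b = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)    # b = ΣXY / ΣX² = 29 / 28

trend = [a + b * x for x in X]
print(round(a, 3), round(b, 3))
```

Since ΣX = 0, the trend values also sum to ΣY = 316, which matches the last row of the table.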
2.2 Measurement of Seasonal Variations

2.2.1 Method of Simple Averages
This is the easiest and the simplest method of studying seasonal variations. This method is used
when the time series variable consists of only the seasonal and random components. The effect
of taking the average of the data corresponding to the same period (say the first quarter of each year) is to eliminate the effect of the random component; thus the resulting averages consist of only the seasonal component.
It involves the following steps:
1. Arrange the data by years and months (or quarters).
2. Find the sum of all the figures relating to a month, i.e. add all the January values for all the years. Repeat the process for all the months.
3. Find the average of the monthly figures, i.e. divide each monthly total by the number of years.
4. Obtain the average of the monthly averages by dividing the sum of the averages by 12.
5. Taking the average of the monthly averages as 100, find the percentages of the monthly averages. For January this percentage would be: (Monthly Average for Jan / Grand Average) × 100.
This is the simplest method of measuring seasonal variations. However, it is based on the unrealistic assumption that the trend and cyclical variations are absent from the data.
Example
Calculate the seasonal index for the monthly sales of a product using the method of simple averages.
Year   Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2001   15  41  25  31  29  47  41  19  35  38  40  30
2002   20  21  27  19  17  25  29  31  35  39  30  44
2003   18  16  20  28  24  25  30  34  30  38  37  39
Solution
S.I. for Jan = (Monthly Average for Jan / Grand Average) × 100
Grand Average = 355.58 / 12 = 29.63

Month  Jan   Feb   Mar   Apr   May   Jun    Jul    Aug   Sep    Oct    Nov    Dec
S.I.   59.62 87.77 80.99 87.77 78.74 109.12 112.49 94.49 112.49 129.36 120.36 126.89
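The steps above can be sketched in Python; small rounding differences from the table are expected:

```python
# Monthly sales for 2001-2003 (Jan..Dec), as in the example table.
sales = {
    2001: [15, 41, 25, 31, 29, 47, 41, 19, 35, 38, 40, 30],
    2002: [20, 21, 27, 19, 17, 25, 29, 31, 35, 39, 30, 44],
    2003: [18, 16, 20, 28, 24, 25, 30, 34, 30, 38, 37, 39],
}

# Steps 2-3: monthly totals averaged over the years.
monthly_avg = [sum(year[m] for year in sales.values()) / len(sales) for m in range(12)]
# Step 4: grand average of the monthly averages.
grand_avg = sum(monthly_avg) / 12
# Step 5: seasonal index = (monthly average / grand average) x 100.
seasonal_index = [100 * m / grand_avg for m in monthly_avg]
print([round(s, 2) for s in seasonal_index])
```

By construction the twelve indices average to 100, which is a useful sanity check on the arithmetic.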
2.2.2 Ratio to Trend Method
This method is used when the cyclical variations are absent from the data, i.e. the time series variable Y consists of the trend, seasonal and random components. Using symbols, we can write Y = T·S·R. It involves the following steps:
1. Obtain the trend values for each month or quarter, etc., by the method of least squares.
2. Divide the original values by the corresponding trend values. This eliminates the trend component from the data; expressed as a percentage,

(T·S·R / T) × 100 = S·R × 100
2.2.3 Ratio to Moving Average Method
The ratio to moving average method is the most commonly used method of measuring seasonal variations. This method assumes the presence of all four components of a time series. The various steps in the computation of the seasonal indices are:
1. Compute the moving averages with period equal to the period of seasonal variations. This
would eliminate the seasonal components and minimize the effect of random component.
The resulting moving averages would consist of trend, cyclical and random components.
2. The original values for each month are divided by the respective moving average figures and the ratio is expressed as a percentage, i.e. S·R″ = Y / M.A. = T·C·S·R / T·C·R′, where R′ and R″ denote the random components of the moving averages and of the resulting ratios respectively. These percentages are then averaged to obtain the seasonal indices.
Merits and Demerits
This method assumes that all four components of a time series are present and is therefore widely used for measuring seasonal variations. However, the seasonal variations are not completely eliminated if the cycles of these variations are not of a regular nature. Further, some information is lost at the ends of the series, since moving averages cannot be computed for the first and last few periods.
Example
Calculate seasonal indices by Ratio to moving average method from the following data
Solution
Year   Quarter  Given   4-Fig Total   2-Fig Total   4-Fig Avg   % of Moving Avg
2005   I        68
       II       62
       III      61      254           505           63.125       96.63
       IV       63      251           498           62.250      101.20
2006   I        65      247           499           62.375      104.21
       II       58      252           502           62.750       92.43
       III      66      250           503           62.875      104.97
       IV       61      253           511           63.875       95.50
2007   I        68      258           513           64.125      106.04
       II       63      255           516           64.500       97.67
       III      63      261
       IV       67
Arithmetic average of the quarterly averages = 399.32 / 4 = 99.83
By expressing each quarterly average as a percentage of 99.83, we will obtain the seasonal indices.
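The centered moving averages and percentages in the table can be reproduced with a short Python sketch (the quarterly values are taken from the example):

```python
def centered_moving_average(values, period=4):
    """4-figure totals, then 2-figure (centered) totals, then the centered average."""
    four_totals = [sum(values[i:i + period]) for i in range(len(values) - period + 1)]
    two_totals = [four_totals[i] + four_totals[i + 1] for i in range(len(four_totals) - 1)]
    return [t / (2 * period) for t in two_totals]   # aligned with quarters 3 .. n-2

quarters = [68, 62, 61, 63, 65, 58, 66, 61, 68, 63, 63, 67]   # 2005 I .. 2007 IV
cma = centered_moving_average(quarters)
percent_of_ma = [100 * quarters[i + 2] / cma[i] for i in range(len(cma))]
print([round(p, 2) for p in percent_of_ma])
```

Note that the first two and last two quarters have no centered average, which is the information loss mentioned under the demerits.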
2.3 Measurement of Cyclic Variation
Cyclic variation exists in the data when the tendency of the data increases and decreases over a given period, but the time period is not fixed (unlike seasonal variation, which is a short-term variation with a fixed period).
For the measurement of cyclic variation, first calculate the seasonal and trend components, then remove the seasonal and trend components, and finally the irregular component. The irregular component is just like an error term and, as noted earlier, cannot be directly eliminated. To reduce the irregular component the moving average method is used; this way of eliminating the irregular component is known as the residual method.
First estimate the trend (T) and seasonal (S) values of the given time series.
1. Divide the time series values (Y) by the estimated trend (T) and seasonal (S) values to obtain the cyclic and irregular components:
Y / (T·S) = T·C·S·I / (T·S) = C·I
2. Now eliminate the random component from the result of the first step by using a moving average of 3 or 5 periods; what remains is the cyclic component (C).
2.4 Measurement of Irregular Component:
The irregular component is the last component, also known as the error term, of the time series. An error term can never be fully eliminated from a time series because it arises from natural forces. There are no methods available to measure this component directly, but it can be reduced somewhat by averaging the indices. In a multiplicative model of the time series it can be removed by dividing out the other components, and in an additive model by subtracting them.
Chapter 3
MODELS OF TIME SERIES
In the previous chapters we have seen what a time series is, its variables, etc. Now we classify time series models under two heads:
1. Models based on components
2. Forecasting related models

3.1 Models based on components
These are of two types:
• Additive Model
• Multiplicative Model
3.1.1 Additive Model
A data model in which the effects of the individual factors are differentiated and added together to model the data. It assumes that the value Y of the composite series is the sum of the four components. That is,

Y = T + S + C + I

3.1.2 Multiplicative Model
This model assumes that as the data increase, so does the seasonal pattern; most time series plots exhibit such a pattern. In this model the trend, seasonal, cyclical and irregular components are multiplied, so the value Y of the composite series is the product of the four components. That is,

Y = T·S·C·I
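A tiny numerical illustration of the two models, with hypothetical component values chosen only for this sketch:

```python
# Multiplicative model: S, C, I are indices that fluctuate around 1.
T, S, C, I = 100.0, 1.2, 0.95, 1.01
Y_mult = T * S * C * I

# Additive model: the components are measured in the same units as the data.
T_a, S_a, C_a, I_a = 100.0, 20.0, -5.0, 1.0
Y_add = T_a + S_a + C_a + I_a

print(Y_mult, Y_add)
```

In the multiplicative form a seasonal index of 1.2 scales the whole level by 20%, while in the additive form the seasonal effect is a fixed amount regardless of the level; this is why the multiplicative model suits series whose seasonal swing grows with the data.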
3.2 Forecasting related models of Time Series
For forecasting, there are mainly four approaches based on time series data:
1. Exponential smoothing methods
2. Single-equation regression models
3. Simultaneous-equation regression models
4. Autoregressive integrated moving average (ARIMA) models
Exponential Smoothing
Exponential smoothing is a time series method for forecasting univariate time series data. Time series methods work on the principle that a prediction is a weighted linear sum of past observations or lags. The exponential smoothing method works by assigning exponentially decreasing weights to past observations, so that the weight given to each observation decreases with its age. The model assumes that the future will be somewhat the same as the recent past. The only pattern that exponential smoothing learns from demand history is its level: the average value around which the series fluctuates. Exponential smoothing is generally used to make forecasts of time-series data based on prior assumptions by the user, such as seasonality or trend.
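A minimal sketch of simple exponential smoothing, assuming the basic level-only update s_t = α·x_t + (1 − α)·s_{t−1}; the series and the value of α below are illustrative:

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecast: the smoothed level s_t = alpha*x_t + (1-alpha)*s_{t-1}."""
    level = series[0]                     # initialise the level with the first observation
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level                          # the forecast for the next period

forecast = simple_exponential_smoothing([10.0, 12.0, 11.0, 13.0], alpha=0.5)
print(forecast)
```

Expanding the recursion shows that the forecast is indeed a weighted sum of past observations with weights α, α(1 − α), α(1 − α)², . . . , which decrease exponentially with age.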
Single-equation Regression Models
Single-equation regression models in time series analysis use a single equation to explain and predict the behaviour of a dependent variable. Typically, these models express the dependent variable as a linear function of one or more independent variables, including time. The most basic form is the autoregressive model (AR), where the current value of the variable depends on its past values. Another common model is the moving average model (MA), where the current value is expressed as a linear combination of past error terms. The ARIMA model, discussed below, combines both together with differencing.
ACF (Auto Correlation Function):
The autocorrelation function takes into consideration all the past observations, irrespective of their effect on the present or future time period. It calculates the correlation between the time periods t and t − k, and it includes all the lags or intervals between t and t − k. The correlation is calculated using the Pearson correlation formula.
PACF (Partial Auto Correlation Function):
The PACF determines the partial correlation between the time periods t and t − k. Unlike the ACF, it does not take into consideration all the intermediate lags between t and t − k. For example, today's stock price may depend on the stock price 3 days ago without taking yesterday's closing price into consideration. Therefore we consider only the time lags having a direct impact on the future time period, neglecting the insignificant time lags in between.
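A sketch of the sample ACF computed directly from its definition, together with the ±2/√T significance band used later for model identification; the white-noise series here is simulated:

```python
import numpy as np

def sample_acf(x, max_lag):
    """r_k = sum((x_t - xbar)(x_{t+k} - xbar)) / sum((x_t - xbar)^2)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = (d ** 2).sum()
    return [float((d[:-k] * d[k:]).sum() / denom) if k else 1.0 for k in range(max_lag + 1)]

rng = np.random.default_rng(0)
white = rng.standard_normal(200)          # simulated white noise, T = 200
acf = sample_acf(white, 10)
bound = 2 / np.sqrt(len(white))           # rule-of-thumb significance band
print(acf[1], bound)
```

For white noise every autocorrelation beyond lag 0 should lie inside the ±2/√T band, whereas a series with genuine serial dependence will show spikes outside it.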
ARIMA is a forecasting algorithm based on the assumption that previous values carry inherent information and can be used to predict future values. In order to understand ARIMA, we first look at its three components:
1. AR (Auto-Regressive)
2. I (Integrated)
3. MA (Moving Average)
The ARIMA model takes in three parameters (p, d, q), one for each of these components.
AR (auto-Regressive) Model
The AR model depends only on past values to estimate future values. The generalised form of the AR model is

AR(p): xt = α + β1 xt−1 + β2 xt−2 + · · · + βp xt−p + εt

The value p determines the number of past values taken into account for the prediction: the higher the order of the model, the more past values are used.
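The AR idea can be illustrated by simulating an AR(1) series and re-estimating its coefficient; the sketch below uses the closed-form least-squares solution for a single lag, not the estimation methods discussed in the next chapter:

```python
import numpy as np

# Simulate x_t = 0.7 * x_{t-1} + eps_t, then re-estimate the coefficient
# by least squares of x_t on x_{t-1}.
rng = np.random.default_rng(42)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()

beta_hat = float((x[:-1] @ x[1:]) / (x[:-1] @ x[:-1]))
print(beta_hat)
```

With a long enough simulated series the estimate lands close to the true coefficient 0.7, illustrating that the AR model is simply a linear regression on the series' own past values.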
Consider the example of a milk distribution company that produces milk every month. We want to calculate the amount of milk to be produced in the current month, considering the production in the previous 12 months. We begin by calculating the PACF values of all 12 lags with respect to the current month. If the PACF value of a particular lag is above the significance threshold, only those lags are considered in the model. For example, if the lags 1 to 12 display the direct effect (PACF) of milk production in the current month with respect to the given lag t, and two significant values lie above the threshold, the model will be termed AR(2). (The AR model can simply be thought of as a linear combination of p past values.)
MA (Moving-Average) Model
The moving-average model depends on past forecast errors to make predictions. The generalised form of the MA model is

MA(q): xt = μ + εt + θ1 εt−1 + θ2 εt−2 + · · · + θq εt−q
Consider an example of cake distribution at a birthday function. Assume that a person asks you to bring pastries to the party. Every year you misjudge the number of invitees and end up bringing more or fewer pastries than required; the difference between the actual and the expected number is the error. To avoid this error this year, we apply the moving average model to the time series and calculate the number of pastries needed based on past collective errors. Next, calculate the ACF values of all the lags in the time series. If the ACF value of a particular lag is above the significance threshold, only those lags are considered in the model. For example, if the lags 1 to 12 display the total error (ACF) of the pastry count in the current month with respect to the given lag t (considering all the in-between lags between time t and the current month), and two significant values lie above the threshold, the model will be termed MA(2).
(The MA model can simply be thought of as a linear combination of q past forecast errors.)
ARMA Model
This model combines the AR and MA models. In this model, the impact of previous lags along with the residual errors is considered for forecasting the future values of the time series. Here β represents the coefficients of the AR part and α represents the coefficients of the MA part.
Integrated ( I ) :
1. The integrated part refers to differencing the time series data to make it stationary.
2. Stationarity means that the statistical properties of the time series, such as the mean and variance, remain constant over time.
3. The differencing parameter ”d” represents the number of times differencing is needed to
achieve stationarity.
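A small sketch of the differencing step: first differences remove a linear trend exactly, which is what the parameter d = 1 accomplishes:

```python
import numpy as np

# A series with a linear trend is not stationary in the mean;
# first differencing (d = 1) removes the trend.
t = np.arange(100)
series = 5.0 + 2.0 * t            # deterministic upward trend
diffed = np.diff(series)          # x_t - x_{t-1}
print(diffed[:5])
```

After differencing, the series is constant (every value equals the slope), so its mean no longer changes with time; for a stochastic trend one or two rounds of differencing play the same role.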
The ARIMA model is quite similar to the ARMA model except that it includes one more component, Integrated (I), i.e. differencing, which stands for the I in ARIMA. In short, an ARIMA model combines the number of differences applied to make the series stationary (d) with the number of previous lags (p) and residual errors (q) used to forecast future values.
Chapter 4
FITTING
The Box-Jenkins procedure is concerned with fitting an ARIMA model to data. It has three parts:
• Identification
• Estimation
• Verification
The data may require pre-processing to make it stationary. To achieve stationarity we may do any of the following:
• Examine a plot of the time series for trend or seasonal patterns;
• Difference the series (possibly more than once) to remove trend;
• Apply a variance-stabilising transformation, such as the logarithm.
4.1.1 Identification
For the moment we will assume that our series is stationary. The initial model identification is carried out by estimating the sample autocorrelations and partial autocorrelations and comparing the resulting sample autocorrelograms and partial autocorrelograms with the theoretical ACF and PACF.
As we have noted, very approximately, both the sample ACF and PACF have standard deviation of around 1/√T, where T is the length of the series. A rule of thumb is that ACF and PACF values are negligible when they lie between ±2/√T. For a pure MA(q) process the sample ACF should be negligible beyond lag q, and for a pure AR(p) process the sample PACF should be negligible beyond lag p; these patterns guide the initial choice of the orders.
4.1.2 Estimation
Estimation: ARMA processes. Now we consider an ARMA(p, q) process. If we assume a parametric model for the white noise (this parametric model will be that of Gaussian white noise), we can use maximum likelihood.
We rely on the prediction error decomposition. That is, X1, . . . , Xn have joint density

f(X1, . . . , Xn) = f(X1) · f(X2 | X1) · · · f(Xn | X1, . . . , Xn−1)

Suppose the conditional distribution of Xt given X1, . . . , Xt−1 is normal with mean X̄t and variance Pt−1, and suppose that X1 ∼ N(X̄1, P0). Then for the log likelihood we obtain

−2 log L = Σ [ log(2π) + log Pt−1 + (Xt − X̄t)² / Pt−1 ]

where the sum runs over t = 1, . . . , n.
Here X̄t and Pt−1 are functions of the parameters α1, . . . , αp, β1, . . . , βq, and so maximum likelihood estimators can be found (numerically) by minimising −2 log L with respect to these parameters. The matrix of second derivatives of −2 log L, evaluated at the maximum likelihood estimates, is the observed information matrix, and its inverse is an approximation to the covariance matrix of the estimators. Hence we can obtain approximate standard errors for the parameters.
Estimation: AR processes
For an AR(p) process the autocorrelations satisfy

ρk = Σ αi ρ|i−k| , for k > 0, with the sum over i = 1, . . . , p    (4.1.2)

Replacing the theoretical autocorrelations ρk by the sample autocorrelations rk gives the Yule-Walker equations

rk = Σ α̂i r|i−k| , k = 1, . . . , p    (4.1.3)

These are p equations for the p unknowns α̂1, . . . , α̂p which, as before, can be solved using a Levinson-Durbin recursion. The Levinson-Durbin recursion also gives the residual variance

σ̂p² = (1/n) Σ ( Xt − Σ α̂j Xt−j )²    (4.1.4)

where the outer sum runs over t = p + 1, . . . , n and the inner sum over j = 1, . . . , p. This can be used to select the appropriate order p, by defining an approximate log likelihood in terms of σ̂p².
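Equations (4.1.3) can also be solved directly as a small linear system; the sketch below uses numpy rather than the Levinson-Durbin recursion, and the autocorrelations are the theoretical ones of an AR(1) with α = 0.6:

```python
import numpy as np

def yule_walker(r, p):
    """Solve r_k = sum_i alpha_i * r_|i-k| (k = 1..p) for alpha_1..alpha_p.
    `r` holds autocorrelations r_0 (= 1), r_1, ..., r_p."""
    R = np.array([[r[abs(i - k)] for i in range(p)] for k in range(p)])  # Toeplitz matrix
    rhs = np.array(r[1:p + 1])
    return np.linalg.solve(R, rhs)

# The theoretical ACF of an AR(1) with alpha = 0.6 is r_k = 0.6 ** k.
alphas = yule_walker([0.6 ** k for k in range(3)], p=2)
print(alphas)
```

Fitting with p = 2 recovers α̂1 = 0.6 and α̂2 = 0: the second coefficient vanishes because the underlying process is genuinely first order, which is exactly how the residual variance in (4.1.4) helps choose p.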
4.1.3 Verification
The third step is to check whether the model fits the data. Two main techniques for model verification are:
• Overfitting: add extra parameters to the model and use likelihood ratio or t tests to check that they are not significant.
• Residual analysis: calculate the residuals from the fitted model and plot their ACF, PACF, 'spectral density estimates', etc., to check that they are consistent with white noise.
White noise: white noise refers to a random signal with a constant mean and constant variance. It is a type of stochastic process where each data point is independent and identically distributed. Its defining properties are:
• Constant mean
• Constant variance
• Independence
• Randomness
Tests for white noise
The Box-Pierce test is based on the statistic

Qm = T Σ rk² , with the sum over k = 1, . . . , m    (4.1.6)

where rk is the kth sample autocorrelation coefficient of the residual series, and p + q < m ≪ T. It is called a 'portmanteau test' because it is based on an all-inclusive statistic. If the model is correct, Qm is approximately χ² distributed with m − p − q degrees of freedom.
In fact rk has variance (T − k)/(T(T + 2)), so an improved test is the Box-Ljung procedure, which replaces Qm by

Qm = T(T + 2) Σ (T − k)⁻¹ rk² , with the sum over k = 1, . . . , m    (4.1.7)
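Statistic (4.1.7) can be computed directly from its definition; as a sketch, a strongly alternating residual series has r1 close to −1, so Q1 should come out large and the white-noise hypothesis would be rejected:

```python
import numpy as np

def ljung_box(residuals, m):
    """Box-Ljung statistic Q_m = T (T + 2) * sum_{k=1}^m r_k^2 / (T - k)."""
    x = np.asarray(residuals, dtype=float)
    T = len(x)
    d = x - x.mean()
    denom = (d ** 2).sum()
    q = 0.0
    for k in range(1, m + 1):
        r_k = (d[:-k] * d[k:]).sum() / denom   # kth sample autocorrelation
        q += r_k ** 2 / (T - k)
    return T * (T + 2) * q

alternating = [1.0, -1.0] * 50    # r_1 is approximately -1
print(ljung_box(alternating, m=1))
```

The value is of order T, far beyond any χ² critical value with one degree of freedom, whereas genuine white-noise residuals would give a small Qm.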
4.2.1 Forecasting
The following is the prediction of Bitcoin price data obtained from Yahoo Finance using the yfinance library in Python.
# pip install matplotlib yfinance statsmodels

import math
import datetime
import warnings
import numpy as np
import pandas as pd
import statsmodels
import yfinance as yf
import matplotlib.pyplot as plt
df = yf.download('BTC-USD')          # Bitcoin price data from Yahoo Finance
to_row = int(len(df) * 0.8)          # train/test split point (assumed 80/20)

plt.figure()
plt.grid(True)
plt.plot(df[0:to_row]['Adj Close'], 'green', label='Train data')
plt.plot(df[to_row:]['Adj Close'], 'blue', label='Test data')
plt.legend()
plt.show()
model_predictions = []
Figure 4.1: Bitcoin price data
p, d, q = arima_model.order          # order of the fitted ARIMA model
print(output)                        # model summary
print('ARIMA Order (p, d, q):', (p, d, q))
for i in range(n_test_obser):
    # rolling forecast: re-fit on the expanding training window at each step
    # (middle lines reconstructed; assumes statsmodels' ARIMA class)
    model = ARIMA(training_data, order=(p, d, q))
    model_fit = model.fit()
    yhat = model_fit.forecast()[0]
    model_predictions.append(yhat)
    actual_test_value = testing_data[i]
    training_data.append(actual_test_value)
plt.figure(figsize=(11, 7))
plt.plot(data_range, model_predictions, color='blue', label='BTC Predicted Price')
plt.plot(data_range, testing_data, color='red', label='BTC Actual Price')
plt.legend()
plt.show()
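As a follow-up, forecast accuracy can be summarised numerically with RMSE and MAPE; the arrays below are hypothetical stand-ins for model_predictions and testing_data:

```python
import numpy as np

# Hypothetical predicted vs actual values, standing in for the Bitcoin forecasts.
predicted = np.array([100.0, 102.0, 101.0, 105.0])
actual = np.array([101.0, 103.0, 100.0, 104.0])

rmse = float(np.sqrt(np.mean((predicted - actual) ** 2)))            # root mean squared error
mape = float(100 * np.mean(np.abs((actual - predicted) / actual)))   # mean absolute % error
print(rmse, mape)
```

MAPE is scale-free, which makes it convenient for a volatile series like Bitcoin, while RMSE stays in the units of the price itself.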
CONCLUSION
In the introductory chapter, the fundamental concepts of the time series were explored. A clear
definition of time series was provided, emphasizing its sequential nature and the inherent patterns
within. The discussion delved into the distinctions between long-term and short-term variations; by examining the various kinds of variations within time series, readers gained insight into the complexity and diversity of temporal data. This groundwork set the stage for an in-depth exploration of techniques to analyse and
model time-dependent phenomena. The second chapter focused on estimating trends in time se-
ries data, employing various methodologies for measuring secular trends and seasonal variations.
The analysis included methods such as the Method of Selected Points, Least Squares, Ratio to
Trend, and Moving Average Method. The exploration extended to cyclic and irregular varia-
tions, with concrete examples elucidating the application of each approach. By comprehensively
evaluating different trend estimation methods, the chapter provided a robust foundation for sub-
sequent modeling and forecasting endeavours. The third chapter delved into the crucial task of
identifying time series models, employing a two-fold classification based on components and fore-
casting. The first classification distinguished between additive and multiplicative models, offering insight into how the components of a series combine; the second covered forecasting related models, including Autoregressive (AR), Moving Average (MA) and Autoregressive Integrated Moving Average (ARIMA) models. The incorporation of graph representations enhanced the conceptual
clarity, facilitating a more intuitive grasp of the modeling approaches presented. The final chapter culminated in the application of the ARIMA model, as outlined in Chapter 3, through the Box-Jenkins approach. The process of model fitting was intricately detailed, emphasizing the
systematic steps involved. To exemplify the practical utility of the developed models, a real-world
case study involving Bitcoin data was undertaken. The application side showcased the forecasting
capabilities of the ARIMA model, demonstrating its adaptability to dynamic and volatile datasets.
The chapter provided a conclusive demonstration of the theoretical concepts discussed throughout
the project, offering valuable insights into the applicative potential of time series modeling tech-
niques.
BIBLIOGRAPHY
Books
Gunasekar
Websites
1. data.world
2. finance.yahoo.com
3. towardsdatascience.com