# QAFD Project: Forecasting Using EViews

## Executive Summary

This project forecasts future values from a series of past stock market data using EViews. The procedure followed is the Box-Jenkins method. First, the data is tested for stationarity, and the collected data is found to be non-stationary. A stationary series is obtained by taking the first difference, after which an appropriate model is estimated. Once the model is obtained and its residuals are shown to be uncorrelated, forecasting is done.

## Introduction

The stock market is a dynamic environment. Predicting the future movement of the market is helpful for anyone who wants to make a profit in it. For projection, we use the Box-Jenkins method. Before starting with this method, it is essential to look at some concepts.

## Stationary and Non-stationary Data Series

In mathematics, a stationary process is a stochastic process whose joint probability distribution does not change when shifted in time or space. Consequently, parameters such as the mean and variance, if they exist, also do not change over time or position.

A non-stationary process is one whose joint probability distribution changes when shifted in time or space. Consequently, parameters such as the mean and variance, if they exist, may also change over time or position.

Stationarity is of two types:

• Weak stationarity: A time series Y(t) is said to be weakly stationary if and only if:
1. E(Y(t)) is a constant independent of t
2. Var(Y(t)) is independent of t
3. Cov(Y(t), Y(t-k)) is independent of t for every lag k

• Strong stationarity: If the joint probability distribution of Y(t) does not change with time, the series is said to be strongly stationary.

To test whether a time series is stationary, we use the Dickey-Fuller test or the Augmented Dickey-Fuller (ADF) test.
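As a rough illustration of the definitions above, the sketch below simulates a random walk (a standard example of a non-stationary series) and compares the mean and variance of its two halves, before and after taking the first difference. The project itself uses EViews; Python is used here only for illustration, and the function names, the seed and the +1/-1 step distribution are assumptions made for this example. It is not a substitute for the ADF test.

```python
import random
from statistics import mean, pvariance


def first_difference(series):
    """Return the first difference: series[t] - series[t-1] for t >= 1."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def half_summary(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    halves = (series[:mid], series[mid:])
    return [(mean(h), pvariance(h)) for h in halves]


# Hypothetical data: a random walk built from +1/-1 steps (not the project's data).
rng = random.Random(42)
steps = [rng.choice([-1, 1]) for _ in range(1000)]
walk = [0]
for s in steps:
    walk.append(walk[-1] + s)

# Differencing the walk recovers the underlying steps.
diffed = first_difference(walk)

# For the level series the two halves typically differ in mean and variance;
# for the differenced series they should be close to each other.
print("level:           ", half_summary(walk))
print("first difference:", half_summary(diffed))
```

If the two halves of the differenced series have similar summaries while those of the level series do not, that is consistent with a series that becomes stationary after one difference.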
Non-stationary time series can be converted into stationary time series by the following methods:

• Taking the log
• Taking differences (first or second)
• Subtracting the mean from each item

After the series has been made stationary, we model it. There are three main models for this:

• Autoregressive (AR) model
• Moving average (MA) model
• ARMA model

The MA model of order n can be expressed as:

Y(t) = μ + α(1)e(t-1) + α(2)e(t-2) + … + α(n)e(t-n) + e(t)

where e(t) is a white-noise error term and μ is a constant. For the model to be invertible in the MA(1) case, |α(1)| < 1.

The AR model of order m can be expressed as:

Y(t) = δ + β(1)Y(t-1) + β(2)Y(t-2) + … + β(m)Y(t-m) + e(t)

The ARMA model combines both sets of terms:

Y(t) = δ + β(1)Y(t-1) + … + β(m)Y(t-m) + α(1)e(t-1) + … + α(n)e(t-n) + e(t)

## ACF & PACF

After a time series has been stationarized by differencing, the next step in fitting an ARIMA model is to determine whether AR or MA terms are needed to correct any autocorrelation that remains in the differenced series. Of course, with software like Statgraphics, you could just try some different combinations of terms and see what works best. But there is a more systematic way to do this. By looking at the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the differenced series, you can tentatively identify the numbers of AR and/or MA terms that are needed. You are already familiar with the ACF plot: it is merely a bar chart of the coefficients of correlation between a time series and lags of itself. The PACF plot is a plot of the partial correlation coefficients between the series and lags of itself.

In general, the "partial" correlation between two variables is the amount of correlation between them which is not explained by their mutual correlations with a specified set of other variables. For example, if we are regressing a variable Y on other variables X1, X2, and X3, the partial correlation between Y and X3 is the amount of correlation between Y and X3 that is not explained by their common correlations with X1 and X2. This partial correlation can be computed as the square root of the reduction in variance that is achieved by adding X3 to the regression of Y on X1 and X2.

A partial autocorrelation is the amount of correlation between a variable and a lag of itself that is not explained by correlations at all lower-order lags. The autocorrelation of a time series Y at lag 1 is the coefficient of correlation between Y(t) and Y(t-1), which is presumably also the correlation between Y(t-1) and Y(t-2). But if Y(t) is correlated with Y(t-1), and Y(t-1) is equally correlated with Y(t-2), then we should also expect to find correlation between Y(t) and Y(t-2). (In fact, the amount of correlation we should expect at lag 2 is precisely the square of the lag-1 correlation.) Thus, the correlation at lag 1 "propagates" to lag 2 and presumably to higher-order lags. The partial autocorrelation at lag 2 is therefore the difference between the actual correlation at lag 2 and the expected correlation due to the propagation of correlation at lag 1.

## Box-Jenkins Model

This model has five stages:

• Stabilizing the variance of the data and making it stationary
• Identification of the model using the ACF and PACF
• Estimation of the model
• Diagnostic checking
• Forecasting

Step 1: Take the log or the first or second differences to reduce the variance and make the data stationary.
Step 2: Plot the ACF and PACF.
Step 3: Find the ACF and PACF functions. These are used to identify which AR and MA terms to go for.
Step 4: Estimate the model by regression. The estimated model should have all its coefficients significant.
Step 5: Note down each model's AIC and BIC.
Step 6: The model with the minimum AIC and BIC is taken to be the best model.
Step 7: Plot the ACF of the residuals of the estimated model.
Step 8: If the ACF of the residuals of the estimated model is that of white noise, we go for forecasting. Otherwise we repeat from Step 2.

To check whether the residuals are white noise, we use hypothesis testing:

H0: There is no correlation between the lags
H1: There is correlation between the lags

We calculate a Q-statistic, Q = n ∑ (ACF)^2, where n is the number of observations and the sum runs over the first k lags. We compare it with the chi-square table value with degrees of freedom = k – p, where p is the number of parameters in the estimated model.

## Methodology

Dow Jones past data are taken for one year. Number of observations = 261.

1. Checking stationarity using the ADF test. For this we use hypothesis testing:

H0: Series is non-stationary (it has a unit root)
H1: Series is stationary

We compare the t-statistic with the critical values at different significance levels. If the t-statistic is less than (more negative than) the critical value, we reject H0.

For level: t-statistic > critical value, so H0 is not rejected. Series is non-stationary.
For first difference: t-statistic < critical value, so H0 is rejected. Series is stationary for first difference.

2. Plotting the ACF and PACF for the first difference, it is evident from the plots that we have to employ an ARMA model.

3. Now we estimate different combinations of the ARMA model using the following equation in the object box:

D(nifty_Past_data) c AR(m) MA(n)

where m and n take different values. If the probabilities of AR(m) and MA(n) are less than 0.05, we take the term as significant; otherwise we ignore it. First we take the ARIMA(1,1,1) model, that is, AR(1) and MA(1) terms on the first-differenced series. For the rest of the models, the probabilities exceed 0.05 and hence we ignore them.

4. Now we plot the ACF of the residuals obtained from the ARIMA(1,1,1) model. We need to test whether all lags have zero correlation or not. If the residual is white noise, all lags have zero correlation. For this, we use hypothesis testing:

H0: The lags have zero correlation
H1: The lags do not have zero correlation

5. We calculate the Q-statistic and compare it with the chi-square value.

Q-statistic = (number of observations) * ∑(ACF)^2

From the table, we get the value of the Q-statistic as:

Q-statistic = 31.7

Since the Q-statistic lies well within the chi-square critical value, we do not reject H0: the lags have zero correlation and our estimated model is adequate.
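The Q-statistic calculation described above can be sketched in a few lines. This is a minimal illustration, not the EViews procedure: the function names `acf` and `box_pierce_q` and the residual values are hypothetical, and the chi-square critical value still has to be read from a table for k – p degrees of freedom.

```python
from statistics import mean


def acf(series, max_lag):
    """Sample autocorrelations r(1), ..., r(max_lag) of a series."""
    n = len(series)
    avg = mean(series)
    dev = [x - avg for x in series]
    denom = sum(d * d for d in dev)  # zero for a constant series
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def box_pierce_q(series, max_lag):
    """Q = n * (sum of squared autocorrelations up to max_lag)."""
    n = len(series)
    return n * sum(r * r for r in acf(series, max_lag))


# Hypothetical residuals, for illustration only (not the project's residuals).
residuals = [0.5, -0.3, 0.1, -0.4, 0.6, -0.2, 0.0, 0.3, -0.5, 0.2]
q = box_pierce_q(residuals, max_lag=3)
print("ACF:", acf(residuals, 3))
print("Q statistic:", q)
```

A Q value below the tabulated chi-square critical value means H0 (zero correlation at all tested lags) is not rejected.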