Second Semester 2011-2012 EViews Tutorial MSc Accounting and Finance MSc Finance and Investment MSc Carbon Finance
Preface
EViews is a widely used statistical and econometric package for data analysis, regression, time series analysis and forecasting. This booklet is intended to help you learn how to use EViews by taking you step by step through some basic features of the program using the data supplied. It is divided into three main parts:

Part I - Introduction to EViews and OLS Regression
- Data input
- Basic analysis of data
- Estimation of multivariate regression models
- Diagnostic checks for residuals
Data used: Regression Analysis.xls

Part II - Time Series Analysis
- Stationarity vs. Non-Stationarity
- ARIMA Models
- Engle Granger 2 Step Test
Data used: Regression Analysis.wf1 and ECM.xls

Part III - GARCH Types Models
- Estimation of symmetric GARCH (1, 1) model and Residuals Diagnostic check
- Estimation of asymmetric EGARCH (1, 1) model and Residuals Diagnostic check
- Demo of Variance Ratio Test programme in EViews
Data used: ftse.xls

This booklet is not a substitute for the online EViews Manual, which provides comprehensive technical descriptions of many of the statistical procedures and related references. You should use the Manual frequently, especially when you want to know "How do I do XXX using EViews?". Please consult the Manual when in doubt; most if not all of the questions you may have will be answered there.

Acknowledgments: We would like to thank previous classes and teachers for compiling this booklet. Some minor changes have been made; all remaining errors are our own.
Table of Contents
1. Regression Analysis
2. Example
2.1. How to input data into EViews file?
2.2. Basic Statistics and graphs
2.3. Generate series
2.4. Estimation of the Models
2.5. Understanding the outputs
2.6. Assumptions Underlying the Model
3. Model Diagnostic Check
3.1. Functional Form of Regression
3.1.1. Test for the linearity assumption
3.1.2. Test for Omitted Variables
3.1.3. Redundant Variables
3.2. Parameter Stability
3.3. Multicollinearity
3.4. Normality on Errors
3.5. Autocorrelation on Errors
3.6. Heteroscedasticity Test
1. Regression Analysis
Regression is one of the most versatile and powerful statistical procedures. The primary goal is to predict the variability of one variable (the dependent variable) using one or more known variables (the independent variables). Although regression and correlation are different concepts, they are often confused and misinterpreted by researchers. Correlation measures the degree of linear association between two variables, which means that the two variables are treated symmetrically. Correlation states that movements in the two variables, y and x, are related in a specific way (given by the value of the correlation coefficient), but it provides no information about causality between the two variables. Correlation does not allow us to draw conclusions about the impact of x on y (i.e. that a change in x will cause a change in y) or vice versa. The causality effect is captured through regression. In regression, we treat y as the dependent variable and therefore assume that it is random or stochastic, i.e. it has a probability distribution. The xs, on the other hand, are assumed to be the same in repeated samples, i.e. they are non-stochastic. Therefore, we can draw conclusions about the impact of the xs on y.
2. Example:
Given that British Airways (BAY) is a company with international operations, where a large proportion of its sales come from abroad, it is interesting to examine its economic exposure. Economic exposure, in this context, refers to the variability of share price returns. Following Bilson's (1994) paradigm, let us examine whether the variability in BAY's share price returns (RBAY) can be explained by changes in exchange rates, oil prices, and market movements. More specifically, examine whether percentage changes in RBAY can be explained by:
- Percentage changes in the FTSE All-Share Index (RFTSE) returns,
- Percentage changes in the S&P500 Index (RS&P500) returns, since a large proportion of overseas sales comes from the United States,
- Changes in oil prices (DPOIL), since they directly influence costs, and
- Percentage changes in the £/$ exchange rate (E£/$), since a large proportion of overseas sales comes from the United States.
In order to estimate percentage changes (or elasticities) you need to estimate the following model:

$R_{BAY,t} = c + \beta_1 R_{FTSE,t} + \beta_2 R_{S\&P500,t} + \beta_3 DP_{OIL,t} + \beta_4 E_{\pounds/\$,t} + u_t$

where D refers to changes and $u_t$ is the error term (or, if you want, the deviation of an individual $R_{BAY,t}$ around its expected value).
Use data in the Regression Analysis.xls file for the above analysis.
2.1. How to input data into EViews file?
1) Run the EViews software. It should be located under Start --> All Programs:
2) When you run the program the following should appear on your screen:
3) In order to start working in EViews we must first create a Workfile. To do this, click on File, New, and then Workfile. The following appears on your screen.
5) Next, you will need to define the structure of your data (dated, undated, balanced panel), their frequency (monthly, yearly, daily etc.), the number of observations and so on. In our example we have dated (regular frequency) data, so choose Dated - regular frequency for the structure type. For the data specification, choose Monthly for frequency, Jan 1989 for start date and Dec 2003 for end date. Leave Names blank. Your screen should appear as follows:
Please note that EViews also provides other data frequencies (weekly, daily and so on). For daily data, EViews gives you the option to choose either a 5-day week or a 7-day week. This is very convenient if you want to insert daily stock returns, since stock markets are not open on weekends. EViews also controls for holidays.
6) After clicking OK, the untitled Workfile screen will appear. Variable c stands for the constant and resid for the residuals. These variables always appear in your Workfile by default, and they are updated after every model you estimate.
7) Now we need to input our data into the Workfile we have just created.
a. Click File, Import from file.
b. Your Workfile will look like this:
8) When you click on Import from file the following window will appear:
9) Choose the location where you want to retrieve your Excel file from and then choose the file where your data are stored. Type in the name of the Excel file in which the data you want to import are stored. If you have named your series in the Excel file, then you do not need to name your series in the EViews file.
10) When EViews opens the Excel file, it determines that the file is in Excel file format, analyses the contents, and opens the Excel Open wizard. The first page of the wizard includes a preview of the data found in the spreadsheet. In most cases, you need not worry about any of the options on this page. The second page of the wizard contains various options for reading the Excel data. These options are set to the most likely choices given the EViews analysis of the contents of your workbook. In most cases you should simply click on Finish to accept the default settings. In other cases, where the preview window does not correctly display the desired data, you may click on Next and adjust the options that appear on the second page of the wizard.
11) Make sure that your data have been imported correctly. In order to do this, highlight (use Ctrl) all the imported variables, click View --> Show and then click OK. Your screen will show as follows:

Note: In some cases, after you input the data set from Excel to EViews, there are some "NAs" in the final section of a column although the data points are not NAs in the original Excel file. In other words, the data set shown in EViews is different from the data in the Excel spreadsheet. If this happens, one way to solve the problem is to restructure the dataset as follows:
(1) Follow the normal procedure to input the data (variable series and date series). You also need to input the column that shows the dates as a series, alongside the other series of variables!
(2) In the Workfile window, go to Proc --> Structure/Resize Current page --> in the pop-up window, select Dated - specified by date series as the Workfile structure type; input the name of the series of dates in the Identifier series area; tick the box Insert empty obs to remove gaps if you want to show NAs in the gap cells, or leave it blank if you simply want to delete the gap cells.
(3) Click "Yes" in the pop-up dialogue boxes. You should then get the correct data set.
Caution: this does not apply to the residual series.

12) Now that you have created your Workfile, you need to save it as an EViews file in order to avoid following the same procedure every time you want to use your data. In order to do this, click File, Save as, and then choose the name and location of the file. Note that the file will be saved with a *.wf1 extension.
13) If you are asked about accuracy, choose 16-digit accuracy and then click OK.
14) Next time you open the file, click File, Open, EViews Workfile.
2.2. Basic Statistics and graphs
1) Now that you have your data in an EViews workfile, you can use basic EViews tools to investigate your data. As we have already mentioned, the best way to start investigating your data is with a graphical representation. Let us examine the characteristics of our series. Double-click on the Bay variable. The spreadsheet with the values of Bay will appear. Now click View --> Descriptive Statistics & Tests --> Histogram and Stats:
A complement of standard descriptive statistics is displayed along with the histogram. All of the statistics are calculated using the observations in the current sample. A table reporting the main characteristics of your series will appear on your screen.
Mean is the average value of the series, obtained by adding up the series and dividing by the number of observations.

Median is the middle value (or the average of the two middle values) of the series when the values are ordered from the smallest to the largest. The median is a robust measure of the center of the distribution that is less sensitive to outliers than the mean.

Max and Min are the maximum and minimum values of the series in the current sample.

Std. Dev. (standard deviation) is a measure of dispersion or spread in the series. The standard deviation is given by:

$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{y})^2}$

where $N$ is the number of observations in the current sample and $\bar{y}$ is the mean of the series.

Skewness is a measure of asymmetry of the distribution of the series around its mean. Skewness is computed as:

$S = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i - \bar{y}}{\hat{\sigma}}\right)^3$

where $\hat{\sigma}$ is an estimator for the standard deviation that is based on the biased estimator for the variance, $\hat{\sigma} = s\sqrt{(N-1)/N}$. The skewness of a symmetric distribution, such as the normal distribution, is zero. Positive skewness means that the distribution has a long right tail and negative skewness implies that the distribution has a long left tail.

Kurtosis measures the peakedness or flatness of the distribution of the series. Kurtosis is computed as:

$K = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i - \bar{y}}{\hat{\sigma}}\right)^4$

where $\hat{\sigma}$ is again based on the biased estimator for the variance. The kurtosis of the normal distribution is 3. If the kurtosis exceeds 3, the distribution is peaked (leptokurtic) relative to the normal; if the kurtosis is less than 3, the distribution is flat (platykurtic) relative to the normal.

Jarque-Bera is a test statistic for testing whether the series is normally distributed. The test statistic measures the difference of the skewness and kurtosis of the series from those of the normal distribution. The statistic is computed as:

$JB = \frac{N}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$

where $S$ is the skewness and $K$ is the kurtosis.

Under the null hypothesis of a normal distribution, the Jarque-Bera statistic is distributed as $\chi^2$ with 2 degrees of freedom. The reported Probability is the probability that a Jarque-Bera statistic exceeds (in absolute value) the observed value under the null hypothesis; a small probability value leads to the rejection of the null hypothesis of a normal distribution. For the BAY series displayed above, we reject the hypothesis of a normal distribution.

2) It is always useful to have a visual inspection of all your series in order to see how they behave over time and whether they exhibit any pattern. Highlight all series (use Ctrl). Now click Show and then OK. When you get the spreadsheet with the five series, click View --> Graph --> Line.
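The descriptive statistics defined earlier in this section (standard deviation, skewness, kurtosis and Jarque-Bera) can be sketched outside EViews as follows. This is only an illustrative sketch in Python using the textbook formulas; the function name `describe` and the toy data are assumptions made for the example and are not part of the BAY workfile.

```python
import math

def describe(y):
    """Descriptive statistics following the textbook formulas (illustrative helper)."""
    n = len(y)
    mean = sum(y) / n
    ss = sum((v - mean) ** 2 for v in y)      # sum of squared deviations
    std_dev = math.sqrt(ss / (n - 1))         # Std. Dev. divides by N - 1
    sigma_hat = math.sqrt(ss / n)             # based on the biased variance estimator
    skew = sum(((v - mean) / sigma_hat) ** 3 for v in y) / n
    kurt = sum(((v - mean) / sigma_hat) ** 4 for v in y) / n
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # For a chi-square variable with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return {"mean": mean, "std_dev": std_dev, "skewness": skew,
            "kurtosis": kurt, "jarque_bera": jb, "probability": p_value}

# Hypothetical toy data, NOT the BAY series from the workfile.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = describe(toy)
for name, value in stats.items():
    print(f"{name:12s} {value: .6f}")
```

With a real series, a small value of `probability` would point towards rejecting normality, in the same way as the Probability entry in the EViews histogram table.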
Be careful how you interpret your graphs. You have plotted raw data and therefore you need to re-adjust the axes of your graph in order to see the exchange rate clearly as well. In order to do this, double-click on the graph and choose Axes and Scaling.

Under Series axis assignment, highlight the second series, 2 Left (the name of the series, EUS_UK, appears at the top of the box).

Now choose Right, which is located at the right of the box, and click OK.

Now you can clearly see the exchange rate but you cannot see the oil price graph; you need to view that graph separately. It is very important to have a visual idea of what your series look like. You need to know if there are trends, outliers, volatility clustering, or mean reversion. The line graphs above exhibit trends, i.e. the series are not stationary, but apart from that we cannot say much.

3) For our example it is a good idea to plot the series together in a scatter diagram in order to see if there is any kind of relationship between the variables. In order to do this, click View --> Graph --> Scatter --> Scatterplot matrix.
2.3. Generate series

1. Let us say now that we want to compare the returns of the series rather than the raw data. There are two ways you could do this. You could either make the calculations in Excel and then re-import the data, or you could generate new return series in EViews. In EViews you can:
1) click Genr (Workfile window); or
2) go to the main tool bar and click Quick --> Generate series.
2. Some common expressions you may need to create new series:

Expression   Operator
+            add
-            subtract
*            multiply
/            divide
^            raise to power

Function     Name
abs(x)       absolute value of x
exp(x)       exponential
inv(x)       reciprocal, 1/x
log(x)       natural logarithm of x
3. Now type the new series you want to create, for example Rbay (BAY's share price returns), together with its formula:
Rbay=(bay-bay(-1))/bay(-1)

4. Repeat the procedure above to generate the new series Reus_uk, Rs_p, and Rftse (the returns for EUS_UK, S&P500, and FTSE respectively):
Reus_uk=(eus_uk-eus_uk(-1))/eus_uk(-1)
Rs_p=(s_p-s_p(-1))/s_p(-1)
Rftse=(ftse-ftse(-1))/ftse(-1)

In addition, in order to generate DPoil (the difference in oil prices) you need to type the formula:
DPoil=poil-poil(-1)
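For readers who want to check the Genr formulas by hand, the following Python sketch mirrors the return and first-difference calculations. The helper names and the price values are hypothetical and chosen only for illustration; `None` stands in for the NA that EViews shows for the first observation, where no lagged value exists.

```python
def simple_returns(prices):
    """(p_t - p_{t-1}) / p_{t-1}; the first entry is None because there is no lag."""
    return [None] + [(prices[t] - prices[t - 1]) / prices[t - 1]
                     for t in range(1, len(prices))]

def first_difference(series):
    """p_t - p_{t-1}; the first entry is None because there is no lag."""
    return [None] + [series[t] - series[t - 1] for t in range(1, len(series))]

# Hypothetical values, NOT taken from Regression Analysis.xls.
bay = [100.0, 110.0, 99.0]
poil = [20.0, 22.5, 21.0]

rbay = simple_returns(bay)        # counterpart of Rbay=(bay-bay(-1))/bay(-1)
dpoil = first_difference(poil)    # counterpart of DPoil=poil-poil(-1)
print(rbay)
print(dpoil)
```

Because the first observation of every generated series is missing, a regression on these series uses one observation fewer than the raw data, which is why the regression output later in this part reports an adjusted sample.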
Now draw a scatterplot graph to see how your series are correlated with Rbay, in two steps:
Step 1. In the Workfile window, select (highlight) the five series (highlight Rbay first, and then the other four series), then click View --> Show (and click OK).
Step 2. In the Group window, click View --> Graph, select Scatter, and select Multiple graphs - First vs. All in the drop-down menu of Multiple series.
Note: The order in which you highlight the series matters in Step 1! In this example, you need to highlight Rbay first in order to see how all the other series are correlated with Rbay.
The scatterplot shows a clear positive relationship between the return on BAY and the returns on FTSE and S&P500. However, correlation does not provide any information about causality between the variables. Therefore, in order to find out the causality effect, we normally use regression estimated by ordinary least squares (OLS), which minimises the sum of the squared residuals under a number of assumptions.
2.4. Estimation of the Models
1) Now let us run the regression of interest:

$R_{BAY,t} = c + \beta_1 R_{FTSE,t} + \beta_2 R_{S\&P500,t} + \beta_3 DP_{OIL,t} + \beta_4 E_{\pounds/\$,t} + u_t$

In order to estimate the model, click Quick and then Estimate Equation.
2) In the dialog box, type the equation you want to estimate following the instruction provided in the dialog box: the dependent variable followed by a list of regressors, all separated by spaces. In this case, type:
rbay c rftse rs_p dpoil reus_uk
The first variable is the dependent variable and c stands for the constant. The estimation method we are using is LS - Least Squares (NLS and ARMA), so do not make any changes to the Method for the moment. Leave the sample range at its default, which is the full sample.
3) When you click OK the following results appear on your screen:
4) In order to avoid running the same model every time you open EViews, save your results in your workfile. Click Name, give the equation a name, and then click OK.
5) Exporting the output to Word: let us now see how to copy and paste the output into a Word document. Highlight the table and then click Edit, Copy:
Choose Formatted - Copy numbers as they appear in table and click OK; then, in a Word document, click Edit and Paste:
Dependent Variable: RBAY
Method: Least Squares
Date: 11/04/07   Time: 14:04
Sample (adjusted): 1989M02 2003M12
Included observations: 179 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -0.003638     0.007083     -0.513571     0.6082
RFTSE         1.442363     0.236502      6.098732     0.0000
RS_P          0.334504     0.252306      1.325786     0.1866
DPOIL        -0.001350     0.004469     -0.302099     0.7629
REUS_UK      -0.228651     0.265540     -0.861077     0.3904

R-squared            0.425113     Mean dependent var       0.012629
Adjusted R-squared   0.411897     S.D. dependent var       0.119716
S.E. of regression   0.091807     Akaike info criterion   -1.910712
Sum squared resid    1.466576     Schwarz criterion       -1.821679
Log likelihood       176.0088     Hannan-Quinn criter.    -1.874610
F-statistic          32.16708     Durbin-Watson stat       2.185461
Prob(F-statistic)    0.000000
2.5. Understanding the outputs

The above output provides information about the estimated coefficients, their significance and some diagnostic tests about the specification of the model.

Variable refers to the independent variables in our model.

Coefficient refers to the coefficient estimates of our model.

Standard error is a measure of the reliability or precision of the estimators. We need this measure because our data are likely to change from sample to sample and therefore the estimates will change ipso facto. In this case (and in statistics in general), this precision is measured with the standard error (se), the square root of the variance.

t-statistic is the ratio between the Coefficient and its Std. Error and helps us evaluate the significance of the coefficient estimates. Compare the t-statistic with the Student's t distribution. Alternatively, check the p-values (Prob.).

P-value (or Probability): if the p-value is less than the significance level you are testing at, say 0.05, you reject the null hypothesis.

R-squared measures the success of the regression in predicting the values of the dependent variable within the sample.

Adjusted R-squared. One problem with using $R^2$ as a measure of goodness of fit is that the $R^2$ will never decrease as you add more regressors. In the extreme case, you can always obtain an $R^2$ of one if you include as many independent regressors as there are sample observations. The adjusted $R^2$, commonly denoted as $\bar{R}^2$, penalizes the $R^2$ for the addition of regressors which do not contribute to the explanatory power of the model.

Standard Error of the Regression (S.E. of regression) is a summary measure based on the estimated variance of the residuals.

Sum of Squared Residuals refers to the sum of squared residuals generated from the model.

Log Likelihood: EViews reports the value of the log likelihood function (assuming normally distributed errors) evaluated at the estimated values of the coefficients.

Durbin-Watson Statistic measures the serial correlation in the residuals. As a rule of thumb, a DW well below 2 is evidence of positive serial correlation. The DW statistic in our output is 2.19, which is close to 2 and therefore gives no strong indication of first-order serial correlation in the residuals.

Mean and Standard Deviation (S.D.) of the Dependent Variable: the mean and standard deviation of $y$ are computed using the standard formulae:

$\bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t, \qquad s_y = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}(y_t - \bar{y})^2}$

Akaike Information Criterion (AIC) is often used in model selection for non-nested alternatives; smaller values of the AIC are preferred.

Schwarz Criterion (SC) is an alternative to the AIC that imposes a larger penalty for additional coefficients.

F-Statistic is from a test of the hypothesis that all of the slope coefficients (excluding the constant, or intercept) in a regression are zero. Under the null hypothesis with normally distributed errors, this statistic has an F-distribution with $k-1$ numerator degrees of freedom and $T-k$ denominator degrees of freedom, where $k$ is the number of estimated coefficients (including the constant) and $T$ is the number of observations.

Prob(F-statistic) is the marginal significance level of the F-test. If the p-value is less than the significance level you are testing at, say 0.05, you reject the null hypothesis that all slope coefficients are equal to zero.
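Several entries of the output table can be reproduced by hand from the other entries, which is a useful check on how they are defined. The Python sketch below uses only numbers copied from the regression output above together with the standard textbook formulas; the variable names and the `durbin_watson` helper (shown on a hypothetical residual vector) are illustrative assumptions.

```python
import math

# Values copied from the regression output in this section.
n_obs = 179            # included observations
n_coef = 5             # constant plus four regressors
r2 = 0.425113          # R-squared
ssr = 1.466576         # sum of squared residuals
coef_rftse, se_rftse = 1.442363, 0.236502

# t-statistic = coefficient / standard error
t_rftse = coef_rftse / se_rftse

# Adjusted R-squared penalises R-squared for the number of coefficients
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_coef)

# S.E. of regression = sqrt(SSR / (T - k))
se_reg = math.sqrt(ssr / (n_obs - n_coef))

# F-statistic for H0: all slope coefficients are zero
f_stat = (r2 / (n_coef - 1)) / ((1 - r2) / (n_obs - n_coef))

def durbin_watson(resid):
    """DW = sum of squared changes in the residuals / sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

print(round(t_rftse, 4), round(adj_r2, 6), round(se_reg, 6), round(f_stat, 3))
# Hypothetical residuals that alternate in sign (negative serial correlation):
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

The computed values agree with the t-Statistic, Adjusted R-squared, S.E. of regression and F-statistic reported in the table up to rounding.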
From the above output, we can see that the sample regression function (SRF) is given from the following equation:
$R_{BAY,t} = c + 1.44 R_{FTSE,t} + 0.33 R_{S\&P500,t} - 0.001 DP_{OIL,t} - 0.23 E_{\pounds/\$,t} + u_t$

5) No serial correlation in the error terms, $cov(u_i, u_j) = 0$, $i \neq j$
6) Errors are homoscedastic, i.e. constant volatility, $var(u_i) = \sigma^2$
7) Zero covariance between error term and independent variables, $cov(u_i, X_i) = 0$
In this section, we will introduce the ways to run the following diagnostic checks:
- Functional form of regression: 1) Linearity assumption, 2) Omitted variables, 3) Redundant variables
- Parameter stability
- Multicollinearity
- Normality on errors
- Autocorrelation on errors
- Heteroscedasticity test
In EViews these checks are grouped under Coefficient Tests, Residual Tests, Stability Tests and Covariance.
For a detailed discussion of each of the tests and assumptions, see Brooks (2002) and Gujarati (2003).

The consequences of model specification errors are:
- Underfitting a model (omitting a relevant variable)
- Overfitting a model (including an irrelevant variable)
In the RESET Specification dialog box, set the number of fitted terms to 1 (1 is the default) and then click OK.
We cannot reject the null hypothesis, so the functional form is correct in this case.
H0: The additional set of variables are not jointly significant, i.e. omitted variables are not significant.
The p-value is not less than α (the significance level), so we cannot reject the null hypothesis; the additional set of variables is not statistically significant.
wars, weather disasters etc.) or policy changes (e.g. tax changes). In our example, we deal with time series data and the possibility of structural breaks (changes in the parameter values) is very high. A possible structural break might exist during the period 1990-1991, when OPEC (the oil cartel) imposed oil embargoes due to the Gulf War of 1990-1991. Let us examine this hypothesis using the Chow test, according to which the hypotheses are:
H0: No structural change
H1: Structural change
In order to perform this test in EViews, in the Equation window click View --> Stability Diagnostics --> Chow Breakpoint Test.

If you insert one date in the box above you essentially break your sample up into two sub-samples. If you enter two dates you break your sample into three sub-samples. For our example let's use the period 1990 until 1992 as the hypothesised structural break (try other months as well). Therefore type 1990M01 1992M12 in the window that appears if you follow the above procedure.
EViews provides you with three separate tests for structural breaks. According to the p-values (probabilities) obtained we do not reject the null hypothesis and therefore there is no indication of structural breaks.
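The idea behind the F-statistic form of the Chow test can be sketched as follows: compare the residual sum of squares of the regression on the full sample with the combined residual sums of squares of separate regressions on each sub-sample. The Python function below implements this textbook formula; it is only a sketch and is not claimed to reproduce the exact EViews computation. The function name `chow_f` and the sub-sample RSS values in the usage line are hypothetical.

```python
def chow_f(rss_pooled, rss_sub, n_obs, n_coef):
    """Textbook Chow breakpoint F-statistic.

    rss_pooled : residual sum of squares from the full-sample regression
    rss_sub    : list with the RSS of the regression on each sub-sample
    n_obs      : total number of observations
    n_coef     : number of coefficients in the regression (including the constant)
    """
    m = len(rss_sub)                        # number of sub-samples
    rss_unrestricted = sum(rss_sub)
    numerator = (rss_pooled - rss_unrestricted) / ((m - 1) * n_coef)
    denominator = rss_unrestricted / (n_obs - m * n_coef)
    return numerator / denominator

# Hypothetical RSS values for a two-date break (three sub-samples);
# only 179 and 5 correspond to the regression in this tutorial.
f_value = chow_f(rss_pooled=1.50, rss_sub=[0.20, 0.40, 0.80], n_obs=179, n_coef=5)
print(round(f_value, 3))
```

A large F-value (small p-value) would indicate that the coefficients differ across sub-samples, i.e. a structural break.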
3.3. Multicollinearity
This problem refers to the existence of a perfect or exact linear relationship among some or all explanatory variables of a regression model. If perfect multicollinearity is present, the coefficients of the X variables are indeterminate and their standard errors cannot be estimated. In the case where multicollinearity is present but less than perfect (or exact), we can determine the coefficients of the X variables but their standard errors are very high, which is a problem when doing statistical inference.

There are several sources of multicollinearity (see Gujarati, 2003):
- The data collection method employed
- Constraints on the model or in the population being sampled
- Model specification
- An overdetermined model

The consequences of multicollinearity are the following:
- Large variances and covariances of OLS estimators
- Wider confidence intervals
- Insignificant t ratios
- A high R2 but few significant t ratios
- Sensitivity of OLS estimators and their standard errors to small changes in the data

How do we detect multicollinearity? There are several methods (rules of thumb):
- Pair-wise correlations among regressors
- Estimation of partial correlations
- Auxiliary regressions
- Eigenvalues and condition index

The steps to obtain pair-wise correlations among regressors in EViews are as follows:
1) highlight all the independent variables (use Ctrl key + click) in our model;
2) in the Workfile window, click View --> Show;
3) in the Group window, click View --> Covariance analysis and, in the pop-up window, select correlation.
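The pair-wise correlation check described in the steps above can be sketched in Python as follows. The helper names and the three short series are hypothetical; they only illustrate how a correlation close to +1 or -1 between two regressors would show up in the matrix.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(series):
    """Pair-wise correlations for a dict of {name: list of values}."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}

# Hypothetical regressors, NOT the tutorial data.
regressors = {
    "x1": [1.0, 2.0, 3.0, 4.0],
    "x2": [2.0, 4.0, 6.0, 8.0],   # exact multiple of x1: perfect collinearity
    "x3": [4.0, 1.0, 3.0, 2.0],
}
corr = correlation_matrix(regressors)
for name, row in corr.items():
    print(name, [round(v, 3) for v in row.values()])
```

In this toy example x1 and x2 have a correlation of one, which is the extreme case in which the individual coefficients could not be estimated separately.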
Note that the correlation coefficient for the FTSE and S&P500 indices is quite high. This is an indication of multicollinearity. Several remedial measures have been suggested over time (and are very well summarised in Gujarati, 2003):
- Do nothing
- Use a-priori information (from previous empirical work) to eliminate the variables causing multicollinearity
- Combine cross-sectional and time series data
- Drop one of the collinear variables
- Transform your variables (e.g. divide by a specific variable, or estimate in first differences)
- Add new data
- Factor analysis or principal components analysis
We have:
Jarque-Bera test examines formally whether residuals are normally distributed: H0: Residuals are normally distributed. In our example p-value<0.10, therefore we reject the null.
If our objective is estimation only, then violation of normality is not a problem [1]. Violation of normality is also not a problem in big samples (if the variances of the residuals are constant) [2]. It becomes a problem when we want to carry out hypothesis testing and prediction.

[1] Since the estimators are BLUE (best linear unbiased estimators). See Gujarati (2003).
[2] Since OLS estimators are normally distributed asymptotically (in large samples) even if the residuals are not normally distributed.
Click OK.

From the results above (F-value) we cannot reject H0; therefore, according to the LM test, the residuals are not serially correlated. What do we do when we find autocorrelation? We have several options [3]:
- Make sure that the autocorrelation is not due to mis-specification biases in the model
- If it is pure autocorrelation, we have to use some type of generalised least squares (GLS) method
- In large samples you can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation
- In some cases we can continue to use the OLS estimators

[3] For a full discussion (which is beyond the scope of this course), see Gujarati (2003), pp. 475-488.
View, Residual Diagnostics, and then Heteroskedasticity Tests. The hypotheses of this test are:
H0: Error terms are homoskedastic.
H1: Error terms are heteroskedastic.

Select the White test and DESELECT the option to include the cross terms.
From the results above it seems that we cannot reject the null hypothesis at the 5% level (see Obs*R-squared), which means that the errors are homoskedastic at the 5% level. However, we can reject homoskedasticity at the 10% level. There are several diagnostic tests available (apart from White's heteroskedasticity test), but one cannot tell for sure which will work in a given situation. It is not very easy to correct for heteroskedasticity if it is detected. In large samples, however, one can obtain White's heteroskedasticity-corrected standard errors of the OLS estimators and conduct statistical inference based on these standard errors. In order to do this in EViews, first click Estimate in order to retrieve your regression model. Now click on Options and select White; if you would like to correct for both heteroskedasticity and autocorrelation, select Newey-West instead.

As you can see, when using the White or Newey-West estimates the coefficients remain the same, while the standard errors, t-statistics and p-values change.
Part II
Time Series Analysis
Stationarity vs. Non-Stationarity
ARIMA models
Cointegration Test

TABLE OF CONTENTS
1. Stationary vs. Non-Stationary time series
1.2. Forms of (weak) Non-Stationarity
2. Testing for Non-Stationarity
2.1. Graphical Analysis
2.2. Unit Root Tests
3. ARIMA Models
3.1. AR(p) Processes
3.2. MA(q) Processes
3.3. Practical issues for ARIMA models
4. Relationships between Non-Stationary Time Series
4.1. Create the workfile
4.2. The eyeball method
4.3. Unit Root Test
4.4. Engle Granger 2 Step Test
where F denotes the joint distribution function. In other words, a strictly stationary process is one where the probability density function for the sequence {$y_t$} is the same as that for {$y_{t+k}$} for any k. Put differently, a series is strictly stationary if the distribution of its values remains the same as time progresses, implying that the probability that y falls within a particular interval is the same now as at any time in the past or the future (Brooks, 2002).
Weak Stationarity

A weakly stationary series is one that satisfies the following conditions:
a) $E(y_t) = \mu$
b) $E(y_t - \mu)(y_t - \mu) = \sigma^2 < \infty$
c) $E(y_{t_1} - \mu)(y_{t_2} - \mu) = \gamma_{t_1 - t_2} \quad \forall\, t_1, t_2$

The three conditions above state that a series is weakly stationary if it has a constant mean, a constant variance, and a constant autocovariance structure. The third equation (constant autocovariance structure) implies that the autocovariance depends on the number of lags ($t_1 - t_2$) and not on the actual point in time. In other words, if the series is weakly stationary, the autocovariance between $y_{t_1}$ and $y_{t_{10}}$ is the same as the autocovariance between $y_{t_{11}}$ and $y_{t_{20}}$, since both pairs are nine periods apart.

Empirical evidence shows that financial data are often weakly non-stationary, and therefore for the rest of the analysis we will concentrate only on the weak form of stationarity.
$y_t = \alpha + \beta t + u_t$

[Figure: Deterministic process ($\alpha = 0.5$, $\beta = 0.2$)]
gradually die. In other words, the impact of $y_{t-T}$ on $y_t$ will disappear as time moves on. Under these circumstances the series is stationary: shocks do not persist forever but fade away gradually, and the three conditions mentioned in 5.1.2 are satisfied.
[Figure: simulated series for case 1 (parameter = 0.02)]
2) $|\phi| = 1$

In this case, $y_t = T\mu + y_{t-T} + u_{t-T+1} + \dots + u_{t-2} + u_{t-1} + u_t$, which means that shocks persist in the system and do not die away. In other words, the current value of y is just an infinite sum of past shocks plus the drift plus some starting value $y_{t-T}$. In this case the series is a stochastic non-stationary series and has a unit root.
[Figure: Random Walk with Drift (Drift = 0.03, Starting Point = 0.5, $\phi = 1$)]
3) $|\phi| > 1$

In this case a given shock becomes more important as time moves on. We can see this from equation 12 since, if $|\phi| > 1$, $|\phi| < |\phi|^2 < |\phi|^3 < |\phi|^4$, etc. This is an
[Figure: simulated series for case 3 (parameter = 1.01)]
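The three cases discussed above can be explored with a small simulation. The Python sketch below assumes the process $y_t = \mu + \phi y_{t-1} + u_t$ with normally distributed shocks; this specification, the function names, the shock standard deviation and the seeds are assumptions made for illustration, and only the drift of 0.03 and the starting point of 0.5 are taken from the random walk figure caption.

```python
import random

def simulate_ar1(phi, drift=0.0, start=0.0, n=500, sigma=1.0, seed=0):
    """Simulate y_t = drift + phi * y_{t-1} + u_t with u_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [start]
    for _ in range(n - 1):
        y.append(drift + phi * y[-1] + rng.gauss(0.0, sigma))
    return y

def shock_weights(phi, horizon):
    """Weight phi**k carried today by a shock that occurred k periods ago."""
    return [phi ** k for k in range(horizon + 1)]

# Case 1: |phi| < 1, shocks die away.
print(shock_weights(0.5, 3))
# Case 2: phi = 1, shocks persist with full weight (unit root).
print(shock_weights(1.0, 3))
# Case 3: |phi| > 1, shocks become more important over time.
print([round(w, 4) for w in shock_weights(1.01, 3)])

stationary = simulate_ar1(phi=0.5)
random_walk = simulate_ar1(phi=1.0, drift=0.03, start=0.5)
explosive = simulate_ar1(phi=1.01)
print(len(stationary), len(random_walk), len(explosive))
```

Plotting the three simulated series (for example by exporting them to a spreadsheet) should give pictures similar in character to the figures in this section, although the exact paths depend on the random draws.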
2.1. Graphical Analysis
Before we perform the tests, let us see diagrammatically how the series behaves:

[Figure: line graph of BAY, 1990-2002]
From the above graph it seems that BAY might not be stationary. In order to confirm this conclusion, we can look at the correlogram of the series. Click View/Correlogram.
After you do this a Correlogram Specification window will appear on your screen.
Specify the highest lag order to display in the correlogram by typing a positive integer in the field box (e.g., 36). (For non-stationary series, the autocorrelation coefficients at various lags are very high.) The series view displays the correlogram and associated statistics:
In this case, we can reject the null hypothesis that there is no autocorrelation up to order 36.
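To see what the correlogram reports, the sample autocorrelations and the Ljung-Box Q-statistic can be computed by hand. This is a rough sketch under our own assumptions (hypothetical function names, a made-up trending series instead of BAY); it is not the code EViews runs.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(y, max_lag):
    """Ljung-Box Q = n(n+2) * sum_k rho_k^2 / (n-k), for k = 1..max_lag."""
    n = len(y)
    rho = sample_acf(y, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


if __name__ == "__main__":
    trending = [float(t) for t in range(100)]  # illustrative non-stationary series
    print("ACF, lags 1-5:", sample_acf(trending, 5))
    print("Q(5):", ljung_box_q(trending, 5))
```

Under the null of no autocorrelation up to lag m, Q is compared with a chi-squared distribution with m degrees of freedom; a large Q leads to rejection, as in the BAY output.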
2.2.2 Example
To perform the unit root test for BAY, click View/Unit Root Test.
After you do this a unit root test window will appear on your screen.
Since we are not sure about the number of lags required in the model, if any, it is better to choose the Augmented Dickey-Fuller test type. We want to test for unit roots (non-stationarity) in the levels, so we choose Level, and we assume that only an intercept is required. The new version of EViews can automatically select the number of lags needed in the model using a specified information criterion.
Choose the Akaike Information Criterion (ideally you should repeat the tests with a different information criterion in order to verify your results). Alternatively, if you know the number of lags required, you can specify the number in the User Specified section. When you click OK the following output will appear.
The p-value is larger than alpha, so we cannot reject the null hypothesis, which means that BAY has at least one unit root. However, we do not yet know the order of integration of the series. To find out, we need to repeat the same test on the differences of the series. Follow the same procedure as before, but now choose to test for a unit root in 1st Difference.
Click OK.
The unit root test on the first differences is statistically significant at the 1% level; therefore we reject H0, which means that the first difference is stationary and BAY is integrated of order 1.
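The logic of testing first in levels and then in first differences can be sketched as follows. This is a simplified Dickey-Fuller regression with an intercept and no lagged differences (so not the full ADF test with automatic lag selection that EViews performs), applied to a simulated random walk rather than BAY. All names are hypothetical.

```python
import math
import random


def df_tstat(y):
    """t-statistic on rho in: dy_t = c + rho * y_{t-1} + e_t."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - md) for a, b in zip(x, dy))
    rho = sxy / sxx
    c = md - rho * mx
    ssr = sum((b - c - rho * a) ** 2 for a, b in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se


def random_walk(n, seed=0):
    """Simulated I(1) series: cumulative sum of standard normal shocks."""
    rng = random.Random(seed)
    y, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        y.append(level)
    return y


if __name__ == "__main__":
    y = random_walk(500)
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    # The statistic must be compared with Dickey-Fuller critical values
    # (roughly -2.86 at the 5% level with an intercept), not with t tables.
    print("DF t-statistic, level:           ", df_tstat(y))
    print("DF t-statistic, first difference:", df_tstat(dy))
```

A series whose level does not reject but whose first difference rejects strongly is treated as integrated of order 1.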
3. ARIMA Models
The univariate or non-seasonal ARIMA model is a linear function of past stationary observations and present and past forecasting errors; it can be presented as follows:
$\Delta^d Y_t (1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p) = (1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q)\varepsilon_t$
3.1. AR (p) Processes
A time series is said to be an AR (p) if it is a weighted linear sum of the past p values plus a random shock. The simplest example of an AR process is the first-order case given by:
$Y_t = \phi_1 Y_{t-1} + \varepsilon_t$
In order to estimate the model, click Quick and then Estimate Equation:
If you want to try a different order, for example AR (3), just type:
3.2. MA (q) Processes
A time series is said to be an MA (q) if it is a weighted linear sum of the last q random shocks. Again, the simplest example of an MA process is the first-order case given by:
$Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$
Again, if you want to estimate different orders, for example MA (3), just type:
Process       ACF (Autocorrelation)
AR(p)         Geometrically declining
MA(q)         Curtailed after lag q
ARMA(p,q)     Declining after lag q
2) Estimation: test a number of different models, for example AR(1), MA(2), ARMA(1,1), etc.
3) Diagnostic checking: test whether the residual series from the model is white noise (e.g., using the Ljung-Box (LB) statistic).
4) Model selection: choose the model which has the smallest AIC.
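The ACF patterns used at the identification stage can be checked by simulation. The sketch below is only an illustration: the coefficient 0.7, the sample size and the function names are our own choices. It simulates an AR(1) and an MA(1) process and prints their sample autocorrelations.

```python
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def simulate_ar1(phi, n, seed=0):
    """Y_t = phi * Y_{t-1} + e_t."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y


def simulate_ma1(theta, n, seed=0):
    """Y_t = e_t + theta * e_{t-1}."""
    rng = random.Random(seed)
    y, e_prev = [], 0.0
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        y.append(e + theta * e_prev)
        e_prev = e
    return y


if __name__ == "__main__":
    print("AR(1) ACF, lags 1-5:", sample_acf(simulate_ar1(0.7, 5000), 5))
    print("MA(1) ACF, lags 1-5:", sample_acf(simulate_ma1(0.7, 5000), 5))
```

The AR(1) autocorrelations should decline roughly geometrically (close to 0.7, 0.49, 0.34, ...), while the MA(1) autocorrelations should be close to zero after lag 1.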
For more information on ARIMA model selection, please see Box and Jenkins (1976), Harvey (1990) and Chatfield (2004). Here we provide an example of estimating the ARIMA (1, 1, 1) model in EViews. To estimate the ARIMA (1, 1, 1), click Quick and then Estimate Equation, and type the following equation in the dialog box:
d(bay) c ar(1) ma(1)
example to test the cointegration relationship based on the Engle-Granger two-step test.
Example:
Let us examine whether the stock price of BT Group and the FTSE All Share Index are stationary series and, if not, whether there is a long-term equilibrium relationship between them. Open EViews and create a new workfile for BT and FTSE from the Excel file ECM.xls.
4.2. The eyeballing method
Before we perform the tests, let us see graphically how the series behave. Highlight the series BT and FTSE (use Ctrl key + click), then choose View (in the Workfile window)/Show, and then View (in the pop-up Group window)/Graph (select Basic graph, Line & Symbol, Single graph).
[Figure: line graph of BT and FTSE, 1986 to 2002]
4.3. Unit Root Test
In order to perform the unit root test for FTSE and BT, click on the FTSE or BT series and then, in the Series window, click View/Unit Root Test.
The p-value is larger than alpha, so we cannot reject the null hypothesis, which means that the FTSE All Share Index has at least one unit root. Follow the same procedure as before, but now choose to test for a unit root in 1st Difference.
The unit root test on the first differences is statistically significant at the 1% level; therefore we reject H0, which means that the FTSE All Share Index is integrated of order 1.
The p-value is larger than alpha, so we cannot reject the null hypothesis, which means that BT has at least one unit root. Again, we do not know of which order. In order to find out, we need to repeat the same test on the differences of the series. Follow the same procedure as before, but now choose to test for a unit root in 1st Difference.
The unit root test on the first differences is statistically significant at the 1% level; therefore we reject H0, which means that BT is also integrated of order 1. We have now found that both the BT and FTSE series are integrated of order one, and therefore a cointegrating relationship might exist.
We first estimate the impact of the FTSE All Share Index on the price of BT using the model below:
$P_{BT,t} = a + \beta P_{FTSE,t} + u_t$
Click Quick and then Estimate Equation.
We need to examine the residuals of this regression and test whether they are stationary. In order to do so, we need to save the residuals obtained from this model. Return to the original workfile and generate a new series named resid1 by clicking on the Genr button in the Workfile window and typing resid1=resid in the pop-up window.
Now create a line graph of the new series: double-click on the series resid1 and, in the pop-up Series window, select View/Graph (select Basic graph, Line & Symbol). You will get the following graph.
In the above graph it seems that the residuals might not be stationary. In order to test this, we need to perform a unit root test by clicking on View (in the resid1 Series window)/Unit Root Test.
Select the default settings (Phillips-Perron and the default estimation method), then click OK.
From the above outcome you can see that the error term is non-stationary (under the Phillips-Perron test). However, if we use a different type of test, we might get a different picture of the stationarity of our series.
For example, if you use the ADF (Augmented Dickey-Fuller) test, you will see that the non-stationarity hypothesis can be rejected at the 10% level, as follows (select Augmented Dickey-Fuller, test in Levels, AIC, and then click OK):
In cases like this, one should be prudent: it is better to accept that the series is non-stationary and use the differences instead of the levels. In general, if the series are not cointegrated, you need to take the right number of differences of each of the variables in your model in order to transform them into stationary series. Then you can proceed with the normal/standard regression analysis.
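The two steps of the Engle-Granger procedure can be sketched on simulated data. This is an illustration under our own assumptions, not a reproduction of the BT/FTSE results: the series are simulated so that they are cointegrated by construction, the second step uses a plain Dickey-Fuller regression without lagged differences, and all names and parameter values are hypothetical.

```python
import math
import random


def ols(x, y):
    """Step 1: regress y on a constant and x; return intercept, slope, residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid


def df_tstat_no_const(e):
    """Step 2: t-statistic on rho in de_t = rho * e_{t-1} + v_t (no intercept)."""
    lag = e[:-1]
    de = [e[t] - e[t - 1] for t in range(1, len(e))]
    s_ll = sum(a * a for a in lag)
    rho = sum(a * b for a, b in zip(lag, de)) / s_ll
    ssr = sum((b - rho * a) ** 2 for a, b in zip(lag, de))
    se = math.sqrt(ssr / (len(de) - 1) / s_ll)
    return rho / se


def simulate_pair(n, a=1.0, b=2.0, seed=0):
    """x is a random walk with drift; y = a + b*x + stationary AR(1) error."""
    rng = random.Random(seed)
    x, y, level, err = [], [], 0.0, 0.0
    for _ in range(n):
        level += 0.05 + rng.gauss(0.0, 1.0)
        err = 0.5 * err + rng.gauss(0.0, 1.0)
        x.append(level)
        y.append(a + b * level + err)
    return x, y


if __name__ == "__main__":
    x, y = simulate_pair(500)
    intercept, slope, resid = ols(x, y)
    print("estimated long-run slope:", slope)
    # Compare with Engle-Granger critical values, which are more negative
    # than the standard Dickey-Fuller ones (about -3.3 at 5% for two variables).
    print("DF t-statistic on residuals:", df_tstat_no_const(resid))
```

If the residual test rejects a unit root, the two I(1) series are treated as cointegrated; otherwise the regression in levels may be spurious and differences should be used.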
Term 2: 2011-2012
TABLE OF CONTENTS
1. Testing for non-linearity
2. Symmetric GARCH Estimations
2.1. Estimate GARCH (1, 1) model
2.2. Understand the Options
2.3. Draw the GARCH conditional standard deviation graph
2.4. GARCH Model Diagnostics
3. Asymmetric GARCH Estimations
3.1. EGARCH
3.1.1. EGARCH (1, 1) Model
3.1.2. Draw the GARCH conditional standard deviation graph
3.1.3. EGARCH Model Diagnostics
3.2. TGARCH Estimations
4. Running a program in EViews: Variance Ratio Tests
Leptokurtosis: the series have distributions that exhibit fat tails and excess peakedness at the mean.
Volatility clustering or volatility pooling: large changes in returns are expected to be followed by large changes in returns, and small changes in returns are expected to be followed by small changes in returns.
Leverage effects: the tendency for volatility to rise more following a large price fall than following a price rise of the same magnitude; for example, a 1% price drop will lead to a larger increase in volatility than a 1% price rise.
One particular class of non-linear models in widespread use in finance is the Autoregressive Conditional Heteroskedasticity (ARCH) models, which were introduced by Engle (1982) and generalized as GARCH (Generalized ARCH) by Bollerslev (1986) and Taylor (1986). GARCH-type models are designed to model and forecast the conditional variance. See Bollerslev, Chou and Kroner (1992), Figlewski (1997), and Poon and Granger (2003) for recent surveys.
Example:
You are given daily time series data for the FTSE100 stock market index for the period 1 Jan 1990 to 31 Oct 2006. Total observations of returns amount to 4,392 (n). Daily returns are calculated as follows:
$R_t = \ln\left(\frac{X_t}{X_{t-1}}\right)$
where $X_t$ and $R_t$ denote the level of the index and the continuously compounded return on day t, respectively. The EViews labels ftse and r_ftse denote the time series observations of the index and the log daily returns, respectively.
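The return formula above can be written in a few lines. This is a small sketch with a hypothetical function name and made-up index levels, not the FTSE100 data.

```python
import math


def log_returns(levels):
    """Continuously compounded returns R_t = ln(X_t / X_{t-1})."""
    return [math.log(levels[t] / levels[t - 1]) for t in range(1, len(levels))]


if __name__ == "__main__":
    index_levels = [100.0, 110.0, 104.5, 104.5]  # illustrative values only
    print(log_returns(index_levels))
```

A series of n+1 index levels yields n returns, which is why the return series has one observation fewer than the index series.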
1) Run the linear regression, for example:
r_ftse c r_ftse(-1) r_ftse(-2) r_ftse(-3)
2) Check for autoregressive conditional heteroskedasticity by clicking View (in the Equation window)/Residual Diagnostics/Correlogram of squared residuals. It shows positive autocorrelations:
3) ARCH LM test: click View/Residual Diagnostics/Heteroskedasticity Tests/ARCH.
Both the F-statistic and the Chi-squared statistic are highly significant, suggesting the presence of ARCH effects in the FTSE100 index returns.
4) Normality test for the residuals:
The Jarque-Bera statistic shows that the residuals are not normally distributed; they also exhibit high kurtosis.
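The idea behind the ARCH LM test in step 3 can be sketched as follows: regress the squared residuals on their own lag and use the number of observations times R-squared as the test statistic. The sketch assumes a single lag and simulated ARCH(1) data in place of the regression residuals; the function names and parameter values are hypothetical.

```python
import math
import random


def arch_lm_stat(resid):
    """LM statistic n * R^2 from regressing e_t^2 on a constant and e_{t-1}^2."""
    e2 = [e * e for e in resid]
    y, x = e2[1:], e2[:-1]
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r_squared = sxy * sxy / (sxx * syy)
    return n * r_squared


def simulate_arch1(n, alpha0=0.2, alpha1=0.5, seed=0):
    """e_t = sqrt(alpha0 + alpha1 * e_{t-1}^2) * z_t with standard normal z_t."""
    rng = random.Random(seed)
    out, prev = [], 0.0
    for _ in range(n):
        prev = math.sqrt(alpha0 + alpha1 * prev * prev) * rng.gauss(0.0, 1.0)
        out.append(prev)
    return out


if __name__ == "__main__":
    lm = arch_lm_stat(simulate_arch1(3000))
    # With one lag, LM is compared with a chi-squared(1) distribution,
    # whose 5% critical value is about 3.84.
    print("ARCH LM statistic:", lm)
```

A statistic far above the critical value indicates ARCH effects, which is the situation reported for the FTSE100 returns.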
$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$
The first equation is the mean equation with the conditional variance, and the second equation is the variance equation. The estimated volatility is symmetric; in other words, whether positive or negative, the forecast errors will have the same effect on the conditional volatility.
The default is GARCH (1, 1), which is not a bad starting point:
If you wish to estimate an asymmetric model, you should enter the number of asymmetry terms in the Threshold order edit field. We will give examples later.
Coefficient Covariance Option: the Heteroskedasticity Consistent Covariance option is used to compute the quasi-maximum likelihood (QML) covariances and standard errors using the methods described by Bollerslev and Wooldridge (1992).
This option should be used if you suspect the residuals are not conditionally normally distributed.
We have the GARCH parameter $\beta_1 = 0.913292$, which is close to 1, implying that movements of the conditional variance away from its long-run mean last a long time.
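How long such movements last can be quantified with the persistence of a GARCH (1, 1) model, the sum of the ARCH and GARCH coefficients. In the sketch below only the GARCH coefficient 0.913292 comes from the text; the ARCH coefficient and the constant are hypothetical placeholders, because the estimation output is not reproduced here.

```python
import math

BETA_1 = 0.913292   # GARCH coefficient quoted in the text
ALPHA_1 = 0.08      # hypothetical ARCH coefficient, for illustration only
ALPHA_0 = 1e-6      # hypothetical constant, for illustration only


def half_life(alpha1, beta1):
    """Periods needed for a variance shock to decay by half."""
    return math.log(0.5) / math.log(alpha1 + beta1)


def long_run_variance(alpha0, alpha1, beta1):
    """Unconditional variance alpha0 / (1 - alpha1 - beta1)."""
    return alpha0 / (1.0 - alpha1 - beta1)


def variance_forecast(h, var_next, alpha0, alpha1, beta1):
    """h-step-ahead forecast: reverts to the long-run variance at rate alpha1 + beta1."""
    lr = long_run_variance(alpha0, alpha1, beta1)
    return lr + (alpha1 + beta1) ** (h - 1) * (var_next - lr)


if __name__ == "__main__":
    print("persistence:", ALPHA_1 + BETA_1)
    print("half-life in days:", half_life(ALPHA_1, BETA_1))
    print("long-run variance:", long_run_variance(ALPHA_0, ALPHA_1, BETA_1))
```

The closer the persistence is to one, the longer the half-life, which is the sense in which deviations from the long-run variance last a long time.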
2.3. Draw the GARCH conditional standard deviation graph
View/GARCH Graph/Conditional Standard Deviation
The conditional standard deviation graph shows the periods of high volatility.
b) Correlogram of Squared Residuals
This displays the correlogram of the squared standardized residuals. If the variance equation is correctly specified, there should be no ARCH left in the standardized residuals, and none of the Q-statistics should be significant.
$\log(\sigma_t^2) = \omega + \sum_{i=1}^{p} \alpha_i \left| \frac{\varepsilon_{t-i}}{\sigma_{t-i}} \right| + \sum_{j=1}^{q} \beta_j \log(\sigma_{t-j}^2) + \sum_{k=1}^{r} \gamma_k \frac{\varepsilon_{t-k}}{\sigma_{t-k}}$
Note that the left-hand side of the second equation is the log of the conditional variance. This implies that the leverage effect is exponential, rather than quadratic, and that forecasts of the conditional variance are guaranteed to be nonnegative. The presence of leverage effects can be tested by the hypothesis that $\gamma_k < 0$. The impact is asymmetric if $\gamma_k \neq 0$.
Example: Using the same data to estimate the EGARCH (1, 1) model
3.1.1. EGARCH (1, 1) Model
Quick/Estimate Equations/Methods/ARCH/EGARCH
Remember to also set the Asymmetric order to 1. Below we have the EGARCH (1, 1) model:
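$\log(\sigma_t^2) = \omega + \alpha_1 \left| \frac{\varepsilon_{t-1}}{\sigma_{t-1}} \right| + \beta_1 \log(\sigma_{t-1}^2) + \gamma_1 \frac{\varepsilon_{t-1}}{\sigma_{t-1}}$
In the EViews output, $\omega$ = C(2), $\alpha_1$ = C(3), $\gamma_1$ = C(4) and $\beta_1$ = C(5).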
The asymmetric term C(4) is negative and highly statistically significant, which means that negative shocks imply a higher next-period conditional variance than positive shocks of the same magnitude.
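The effect of a negative asymmetric coefficient can be illustrated with the EGARCH (1, 1) variance equation. The coefficient values below are hypothetical and chosen only to have a negative asymmetry term; they are not the estimates from the FTSE100 output.

```python
import math


def egarch_next_variance(omega, alpha, beta, gamma, eps_prev, var_prev):
    """One-step EGARCH(1,1) variance:
    log(var_t) = omega + alpha*|z| + beta*log(var_{t-1}) + gamma*z,
    where z = eps_{t-1} / sigma_{t-1} is the standardized shock."""
    z = eps_prev / math.sqrt(var_prev)
    log_var = omega + alpha * abs(z) + beta * math.log(var_prev) + gamma * z
    return math.exp(log_var)


if __name__ == "__main__":
    # Hypothetical coefficients; gamma < 0 plays the role of C(4).
    params = dict(omega=-0.2, alpha=0.1, beta=0.98, gamma=-0.08)
    after_fall = egarch_next_variance(eps_prev=-0.02, var_prev=1e-4, **params)
    after_rise = egarch_next_variance(eps_prev=0.02, var_prev=1e-4, **params)
    print("variance after a negative shock:", after_fall)
    print("variance after a positive shock:", after_rise)
```

Because the model is written in logs, the exponential guarantees a positive variance whatever the signs of the coefficients.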
3.1.2. Draw the GARCH conditional standard deviation graph
View/GARCH Graph/Conditional Standard Deviation
One can use the Jarque-Bera statistic to test whether the standardized residuals are normally distributed.
In this case, the residuals are highly leptokurtic and the Jarque-Bera statistic rejects the hypothesis of a normal distribution. Note that the kurtosis is smaller than before, but still not 3.
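For reference, the Jarque-Bera statistic combines the sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4). The sketch below uses hypothetical names and made-up data, not the standardized residuals from the model.

```python
import math


def jarque_bera(x):
    """Return (JB, skewness, kurtosis, p-value) for the sample x."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-squared with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, skew, kurt, p_value


if __name__ == "__main__":
    sample = [-3.1, -0.4, -0.2, -0.1, 0.0, 0.1, 0.1, 0.3, 0.5, 4.2]  # illustrative
    print(jarque_bera(sample))
```

A kurtosis of 3 and a skewness of 0 correspond to the normal distribution, so a kurtosis well above 3 pushes the statistic up and the p-value down.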
a. Correlogram of Squared Residuals
This displays the correlogram of the squared standardized residuals. If the variance equation is correctly specified, there should be no ARCH left in the standardized residuals, and none of the Q-statistics should be significant.
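$Y_t = \mu + \sum_{i=1}^{k} \delta_i X_{i,t} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_t^2)$
$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i^{+} \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \alpha_i^{-} \varepsilon_{t-i}^2 K_{t-i} + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$
where $K_{t-1}$ is a dummy variable taking the value 1 if the previous day's forecast error is negative ($\varepsilon_{t-1} < 0$) and 0 otherwise. Furthermore, if the coefficient $\alpha_i^{-}$ on the dummy term differs significantly from zero, the null of no asymmetry in conditional volatility is rejected. A significantly positive $\alpha_i^{-}$ shows the existence of the leverage effect.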
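The role of the dummy variable can be seen in a one-step calculation for a model with one ARCH term, one threshold term and one GARCH term. The parameter names (in particular `asym` for the coefficient on the dummy term) and all numerical values are hypothetical.

```python
def tgarch_next_variance(alpha0, alpha_plus, asym, beta, eps_prev, var_prev):
    """One-step threshold GARCH variance.
    The dummy equals 1 when the previous forecast error is negative,
    so `asym` is the extra weight given to negative shocks."""
    dummy = 1.0 if eps_prev < 0 else 0.0
    return (alpha0
            + alpha_plus * eps_prev ** 2
            + asym * eps_prev ** 2 * dummy
            + beta * var_prev)


if __name__ == "__main__":
    params = dict(alpha0=1e-6, alpha_plus=0.03, asym=0.10, beta=0.90)  # hypothetical
    after_fall = tgarch_next_variance(eps_prev=-0.02, var_prev=1e-4, **params)
    after_rise = tgarch_next_variance(eps_prev=0.02, var_prev=1e-4, **params)
    print("variance after a negative shock:", after_fall)
    print("variance after a positive shock:", after_rise)
```

With a positive coefficient on the dummy term, the negative shock raises the next-period variance by the extra amount `asym * eps_prev**2`, which is the leverage effect.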
4. Running a program in EViews: Variance Ratio Tests
Variance Ratio Test (VRT)
The new version of EViews, EViews 7, has a Click and Run procedure for performing the test, which you can practice on your own. Here you are shown how to run an EViews program by carrying out the steps below. The description of the VRT below is based on the chapter of Campbell, Lo and MacKinlay (1997) available on WebCT. You can refer to that if you want details.
For the next step we need to locate the program. Select File/Open/Programs; in the same location where you found the FTSE data you will find the program file named vrtest. Open the program.
You need to adjust the program lines for your own series and dates. In this example the following changes are necessary:
In the line where you specify your series, enter: lnftse
In the line where you specify your dates (period date), enter: 1/1/1990 and 10/31/2006.
This contains the results. To view the results, double-click on vrtest.
The vrtest table provides the variance ratio, the Z (q) test statistic for the variance ratio assuming RW1 (random walk with constant variance of the innovations), and the Z*(q) test statistic assuming RW3 (adjusted for the possible effect of heteroscedasticity). Adjust again the data period and return horizon of the test when necessary and select Run for additional results.
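The three quantities in the table can be sketched directly from their definitions. The code below is our own simplified illustration of the overlapping variance ratio estimator and its two test statistics; it is not the vrtest program, and the function name and sample data are hypothetical.

```python
import math


def variance_ratio(returns, q):
    """Return (VR(q), Z(q), Z*(q)) for one-period returns and horizon q >= 2."""
    t = len(returns)
    mu = sum(returns) / t
    dev = [r - mu for r in returns]
    ss = sum(d * d for d in dev)
    var_1 = ss / (t - 1)
    # Variance of overlapping q-period returns, with the usual bias correction.
    m = q * (t - q + 1) * (1.0 - q / t)
    sums = [sum(dev[s - q + 1:s + 1]) for s in range(q - 1, t)]
    var_q = sum(v * v for v in sums) / m
    vr = var_q / var_1
    # Z(q): assumes constant variance of the innovations (RW1).
    z = math.sqrt(t) * (vr - 1.0) / math.sqrt(2.0 * (2 * q - 1) * (q - 1) / (3.0 * q))
    # Z*(q): robust to heteroscedasticity (RW3).
    theta = 0.0
    for j in range(1, q):
        delta = sum(dev[s] ** 2 * dev[s - j] ** 2 for s in range(j, t)) / ss ** 2
        theta += (2.0 * (q - j) / q) ** 2 * delta
    z_star = (vr - 1.0) / math.sqrt(theta)
    return vr, z, z_star


if __name__ == "__main__":
    sample = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012, -0.004, 0.009,
              -0.011, 0.006, 0.002, -0.013]  # illustrative returns only
    print(variance_ratio(sample, q=2))
```

Under the random walk hypothesis the variance ratio is close to one and both statistics are approximately standard normal, so values beyond about plus or minus 1.96 reject at the 5% level.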
Important sources:
Brooks, C. (2008), Introductory Econometrics for Finance, Second edition, UK: Cambridge University Press.
Campbell, Lo and MacKinlay (2003), The Econometrics of Financial Markets, USA: Princeton University Press (Chapters 1 and 7).
Gujarati, D. N. (2003), Basic Econometrics, Fourth edition, Singapore: McGraw-Hill.
Koop, G. (2009), Analysis of Economic Data, Third edition, UK: Wiley.
EViews 7 User's Guide (most recent version).
Course booklet.
Lecture materials will be made available online via WebCT, which is accessible from the quick links area on MyBiz: http://www.business-school.ed.ac.uk/mybiz/home