Summary
This term paper outlines and demonstrates the use of statistics in understanding ten years of stock market prices and returns from fifty countries, using basic statistics, linear regression and time series forecasting models.

Keywords
Term paper, Statistics, Linear Regression, Average, Logarithmic Transformation, Kurtosis, F-Test for Linear Regression, T-Test for Linear Regression, P-value, Partial Linear Regression, Exponential Time Series Forecasting, Moving Averages Forecasting
Contents:
1. Abstract
2. Introduction
3. Basic Statistics
4. Correlation
5. Linear Regression Model
   Introduction
   Predicting the Dependent Variable Y
   Coefficient of Multiple Determination
   Test for the Significance of the Overall Multiple Regression Model
   Residual Analysis
   Inferences Concerning the Population Regression Coefficients
   Line Plots
   Further Analysis
   Multiple Linear Regression Conclusions
6. Forecasting
   Models
   Forecast Error Measures
   Charts for Forecast Models
7. Conclusion
Appendix
A. References
B. List of Excel files used for calculations
1. Abstract
In this paper we demonstrated the use of statistical concepts to understand stock market returns for fifty different countries. We had monthly MSCI index prices for fifty countries, and Excel and PHStat were our tools of choice for the statistical analysis. We created basic statistics for the monthly stock market returns of the fifty countries. We showed the use of linear regression to relate the returns of Finland first to the stock markets of the top seven economies of the world, then to seven bigger neighbors and in the end to eight other important economies. Finally, we demonstrated the use of time series prediction to predict returns of the Austrian stock index.

2. Introduction
Most of us have some relationship with the stock market. We either have direct stock holdings or indirect stock holdings through mutual funds in retirement plans. Stock markets are where investors invest in companies big and small, and the stock market is a direct indicator of the economic environment. Hence understanding stock markets is helpful for managers when making investment and capital expenditure decisions. Typically stock brokers, stock investors, government officials and banks are interested in stock market data and analysis.

The data is the monthly closing price of the MSCI country index in US dollars. It is not preferable to work directly with the price series for any statistical analysis, so the raw price series are converted into series of returns. The index prices are not distributed normally but the returns are. Additionally, returns have the added benefit that they are unit-free and currency-free, allowing comparisons to be made across markets.

There are two methods used to calculate returns from a series of prices: simple returns and continuously compounded returns. They are obtained as follows:

Simple returns:
$$R_t = \frac{p_t - p_{t-1}}{p_{t-1}} \times 100\%$$

Continuously compounded returns:
$$r_t = 100\% \times \ln\left(\frac{p_t}{p_{t-1}}\right)$$

where:
$R_t$ denotes the simple return at time t
$r_t$ denotes the continuously compounded return at time t
$p_t$ denotes the asset price at time t
ln denotes the natural logarithm.

We used continuously compounded returns, obtained through the logarithmic transformation, for our analysis.
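The two return definitions can be sketched in a few lines of Python. This is only an illustration: the price list is hypothetical, and the function names `simple_returns` and `log_returns` are not part of the paper's Excel workbooks.

```python
import math

def simple_returns(prices):
    """Simple returns in percent: 100 * (p_t - p_{t-1}) / p_{t-1}."""
    return [100.0 * (p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    """Continuously compounded returns in percent: 100 * ln(p_t / p_{t-1})."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical month-end index levels, for illustration only.
prices = [100.0, 105.0, 102.0, 110.0]

simple = simple_returns(prices)
logs = log_returns(prices)

# Log returns are additive: their sum equals the log return over the whole span.
whole_span = 100.0 * math.log(prices[-1] / prices[0])
print(simple)
print(logs)
print(sum(logs), whole_span)
```

The additivity shown in the last line is one practical reason for preferring continuously compounded returns when monthly figures are aggregated to longer periods.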
3. Basic Statistics:
We found out which countries on average gave the highest returns over the ten year period. Colombia, Czech Republic, Austria and Denmark were the top four, with 18.68%, 18.22%, 17.98% and 14.89% annual compounded returns for the nine-plus-year period of 10/31/1997 to 7/31/2007.

Investing in the stock market always bears some risk, large or small, depending on the volatility of the stock price. A stock that has large volatility may give higher or negative returns depending on when the investor enters and exits the stock holdings. We found that Russia, Turkey and Indonesia have the highest volatility, whereas the USA, United Kingdom and Canada are mostly stable (as of Sep 2007).

The chart below shows the average rate of return against the standard deviation. We were able to easily identify that while Colombia, Czech Republic, Austria and Denmark gave investors the biggest returns, it was Denmark and Austria which provided lesser volatility among these four countries. The standard deviations of Denmark and Austria were 17.44 and 23.99, compared to 32.16 and 40.72 for Czech Republic and Colombia.

[Chart: average rate of return against standard deviation, by country]

We made an attempt to understand the statistical measure kurtosis. Kurtosis measures whether the data is sharp or flat relative to a normal distribution: it describes how bunched around the center or spread at the endpoints a frequency distribution is. Higher kurtosis means more of the variance is due to infrequent extreme deviations, as opposed to frequent modestly sized deviations. Since kurtosis measures the shape of the distribution (the fatness of the tails), it focuses on how returns are ranged around the mean. A kurtosis coefficient of three indicates a normal distribution. Kurtosis greater than three indicates a sharp, high peak with a thin midrange and fat tails (leptokurtic). Conversely, kurtosis of less than three indicates a low peak with a fat midrange on either side (platykurtic). Sometimes kurtosis is also called "the volatility of volatility."

The chart below shows kurtosis in increasing order against average monthly returns with their standard deviation. Russia has the largest standard deviation and kurtosis, meaning that the Russian stock market is the most volatile among the 50 countries.

[Chart: kurtosis in increasing order against average monthly return and standard deviation]

4. Correlation
The relationship between two variables is expressed through correlation. We used Excel to create a 50 X 50 correlation matrix, which is shown in the table.

Any investment requires diversification so that the investing risks are spread among unrelated investments. A quick glance at the correlation matrix table shows how to diversify. We found that Netherlands and France had a correlation of 0.886035, which was one of the highest correlations. Put simply, investments in France and Netherlands, with a correlation coefficient of 0.886, will lead to the same type of returns or risks. On the other hand, we found that Pakistan and Denmark had a correlation of -0.19975. Therefore, correlation can be a good tool for finding diversification.
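As a rough illustration of the two measures discussed above, the sketch below computes kurtosis on the scale where a normal distribution scores three, and the Pearson correlation coefficient. The helper names and the sample data are hypothetical. Note that Excel's KURT function reports excess kurtosis, for which a normal distribution scores about zero rather than three, so values taken from Excel need to be read on that scale.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def kurtosis(xs):
    """Population kurtosis m4 / m2**2 (a normal distribution gives 3)."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m4 = mean([(x - m) ** 4 for x in xs])
    return m4 / m2 ** 2

def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly returns (percent) for three markets.
market_a = [1.0, -2.0, 3.0, 0.5, -1.5]
market_b = [2.0 * r + 1.0 for r in market_a]   # moves with market_a
market_c = [-r for r in market_a]              # moves against market_a

print(kurtosis(market_a))
print(correlation(market_a, market_b))
print(correlation(market_a, market_c))
```

Two markets with a correlation near one, like `market_a` and `market_b` here, offer little diversification; a pair with low or negative correlation spreads the risk.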
5. Linear Regression Model

Introduction
We picked Finland for creating linear regression models. Finland is 33rd among the top 50 economies of the world. The top 7 economies are:
United States
Japan
Germany
China
United Kingdom
France
Italy
(Source: The Economist Pocket World Figures, 2009 edition)

The multiple regression model with k independent variables is given as:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$$

In our case we have 7 independent variables, so the model to be developed is:
$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_7 X_{7i}$$

Here $\hat{Y}_i$ is the estimated rate of return for Finland, for which the linear model was to be created, and $X_{ji}$, for j = 1 to 7, are the monthly rates of return for the seven biggest economies of the world.

We first used Microsoft Excel to compute the values of the eight regression coefficients.

[Figure: Output of Microsoft Excel Multiple Regression Analysis]

The computed values of the regression coefficients $b_0, b_1, \dots, b_7$ were read from the figure above and substituted into the model to give the multiple regression equation.

The sample Y intercept ($b_0$) estimates the return of the Finland stock market when the returns of all other seven stock markets are zero. Because the stock returns cannot be zero for all markets at the same time, the value of $b_0$ has no practical interpretation.

Regression coefficients in multiple regression are called net regression coefficients. They estimate the mean change in Y per unit change in a particular X, holding constant the effect of the other X variables. The slope for the US rate of return indicates that for each one-unit increase in the US rate of return, Finland's rate of return is estimated to decrease by 0.14045 units, holding constant the effect of the other X variables. The estimates of all $b_j$ allowed us to better understand the effect of the rates of return of the biggest seven economies on Finland.

Predicting the Dependent Variable Y
We used the multiple regression equation to predict the value of the dependent variable. We took the monthly rates of return from Feb 27, 1998 for all seven countries and used PHStat's confidence interval estimate and prediction interval function to arrive at a range. The model predicted a range of 1.43428 to 8.31292 at the 95% confidence level. The actual value of the dependent variable, Finland's monthly rate of return, was 9.80996, which lies above the upper limit of that interval.

The table below shows the interval estimate for Finland's monthly return at the 95% confidence level on Feb 27, 1998, produced from the given rates of return for the seven countries:

Confidence Level                   95%
Interval Half Width                3.439322
Confidence Interval Lower Limit    1.434281
Confidence Interval Upper Limit    8.312925
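To make the mechanics of fitting and predicting with a multiple regression equation concrete, here is a minimal sketch that solves the least-squares normal equations in plain Python. It is not the Excel or PHStat procedure used in the paper; the function names and the two-variable data set are hypothetical and were chosen so that the true coefficients are known.

```python
def fit_ols(X, y):
    """Least-squares coefficients [b0, b1, ..., bk] from the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(A[i][c]))
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(p):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][p] for i in range(p)]

def predict(coeffs, x):
    """Y-hat = b0 + b1*x1 + ... + bk*xk."""
    return coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))

# Hypothetical data generated exactly from y = 1 + 2*x1 - 0.5*x2.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [1 + 2 * x1 - 0.5 * x2 for x1, x2 in X]

coeffs = fit_ols(X, y)
y_hat = predict(coeffs, [2, 4])
print(coeffs, y_hat)
```

With real return data the fit is not exact, and each slope is read as a net regression coefficient: the estimated change in Y per unit change in one X with the other X variables held constant.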
Coefficient of Multiple Determination
The coefficient of multiple determination is equal to the regression sum of squares (SSR) divided by the total sum of squares (SST):

$$r^2 = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}} = \frac{SSR}{SST}$$

For Finland's monthly rate of return we have $r^2$ = 3299.203 / 7011.618 = 0.47053.

Excel also created the same result for us when we did the regression analysis:

SUMMARY OUTPUT
Regression Statistics
Multiple R           0.685955
R Square             0.470534
Adjusted R Square    0.43684
Standard Error       5.809409
Observations         118

The coefficient of multiple determination ($r^2$ = 0.47053) indicates that 47.05% of the variation in Finland's rate of return is explained by the rates of return of the seven biggest economies. One would have expected a higher correlation, but this is not the case here.

We also looked at the PHStat output in which $r^2$ was recalculated with one of the independent variables removed each time. Across the seven conditions (all but Italy, all but France, all but United Kingdom, all but China, all but Germany, all but Japan, all but USA) the resulting $r^2$ values were 0.47041, 0.47012, 0.46883, 0.46662, 0.46382, 0.44104 and 0.43121, listed here in decreasing order.
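The reported figures can be checked with a few lines of arithmetic. The sketch below uses the sums of squares quoted above; the adjusted value uses the standard textbook formula 1 - (1 - r²)(n - 1)/(n - k - 1), which the paper does not state explicitly and is assumed here.

```python
ssr = 3299.203   # regression sum of squares
sst = 7011.618   # total sum of squares
n = 118          # observations
k = 7            # independent variables

r2 = ssr / sst
# Standard adjusted r-squared formula (assumed, not quoted from the paper).
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 5), round(adj_r2, 5))
```

Both rounded values agree with the Excel summary output, which suggests that Excel's Adjusted R Square follows the same formula.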
Test for the Significance of the Overall Multiple Regression Model
We performed the significance test of the overall multiple regression model using the F-test. Here we tried to find whether there is a significant relationship between the dependent variable and the entire set of independent variables. Since there is more than one independent variable, we used the following null and alternate hypotheses:

$H_0$: $\beta_1 = \beta_2 = \dots = \beta_7 = 0$ (There is no linear relationship between the dependent variable and the independent variables.)
$H_1$: At least one $\beta_j \neq 0$, j = 1, 2, …, 7 (There is a linear relationship between the dependent variable and at least one of the independent variables.)

The overall F test statistic is equal to the regression mean square (MSR) divided by the error mean square (MSE):

F = MSR / MSE

where:
F = test statistic from an F distribution with k and n – k – 1 degrees of freedom
k = number of independent variables in the regression model
n = number of samples used to create the regression model

The decision rule is to reject $H_0$ at the α level of significance if F exceeds the upper-tail critical value of the F distribution with k and n – k – 1 degrees of freedom; otherwise, do not reject $H_0$.

ANOVA table for our model

              df     SS             MS             F              Significance F
Regression    7      3299.203009    471.3147155    13.96519865    7.54666E-13
Residual      110    3712.415413    33.74923103
Total         117    7011.618422

Using a 0.05 level of significance, the critical value of the F distribution with 7 and 110 (118 – 7 – 1) degrees of freedom found from F tables is approximately 2.09. From the ANOVA table the F statistic is 13.9652. Because 13.9652 > 2.09, we rejected $H_0$ and found statistical proof to conclude that at least one of the independent variables (the rates of return of the seven biggest economies) is related to Finland's rate of return.

[Figure: Significance of the Overall Multiple Regression Model F-Test]
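As a quick check of the ANOVA arithmetic, the sketch below recomputes the mean squares and the F statistic from the sums of squares and degrees of freedom reported for this model. It only reproduces F = MSR / MSE; the critical value still has to come from F tables or a statistics package.

```python
ssr = 3299.203009   # regression sum of squares
sse = 3712.415413   # residual (error) sum of squares
n = 118             # observations
k = 7               # independent variables

msr = ssr / k              # regression mean square, k degrees of freedom
mse = sse / (n - k - 1)    # error mean square, n - k - 1 degrees of freedom
f_stat = msr / mse

print(msr, mse, f_stat)
```

The regression and residual sums of squares also add up to the total sum of squares (7011.618422), which is a useful consistency check on any ANOVA table.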
Residual Analysis
We evaluated the appropriateness of the multiple linear regression model using residual analysis. We created the seven residual plots using Excel, along with a plot of the residuals against the predicted values of Y. In all of these charts the pattern is random, so the use of linear regression was appropriate in this case.

[Figure: residual plots for the seven independent variables and for predicted Y]
Inferences Concerning the Population Regression Coefficients
To determine the existence of a significant linear effect on Y (Finland's rate of return) of an independent variable (the monthly rate of return of one of the biggest seven economies), the null and the alternate hypotheses are:

$H_0$: $\beta_j = 0$ (There is no linear relationship)
$H_1$: $\beta_j \neq 0$ (There is a linear relationship)

The t-statistic equals the difference between the sample slope and the hypothesized value of the population slope, divided by the standard error of the slope:

$$t = \frac{b_j - \beta_j}{S_{b_j}}$$

where
$b_j$ = slope of variable j with Y, holding constant the effects of the other independent variables
$S_{b_j}$ = standard error of the regression coefficient $b_j$
t = test statistic for a t distribution with n – k – 1 degrees of freedom
k = number of independent variables
$\beta_j$ = hypothesized value of the population slope for variable j, holding constant the effects of the other independent variables.
The table below summarizes our findings. We used a 95% confidence level, so the critical t value was 1.9799, and for a p-value > 0.05 the null hypothesis was not rejected.

Country           Critical t   t Stat in area of nonrejection   Null Hypothesis   p-value > 0.05
USA               1.9799       Yes                              Not rejected      Yes
JAPAN             1.9799       No                               Rejected          No
GERMANY           1.9799       No                               Rejected          No
CHINA             1.9799       Yes                              Not rejected      Yes
UNITED KINGDOM    1.9799       Yes                              Not rejected      Yes
FRANCE            1.9799       Yes                              Not rejected      Yes
ITALY             1.9799       Yes                              Not rejected      Yes

The t statistics were 2.85814 for Germany (p-value 0.005098) and 2.47559 for Japan (p-value 0.014826). For the other five countries the t statistics ranged from 0.15824 to 1.18083, with p-values between 0.240219 and 0.874559.

We found that only Japan and Germany make a significant contribution to the Y value, Finland's rate of return. So the linear equation reduced to one containing only Germany and Japan as independent variables.

We decided to perform a PHStat Stepwise Regression to confirm the findings above. We found that PHStat's forward selection (best model fit) selected only Germany and Japan as significant independent variables. This confirmed what we found earlier.

PHStat Stepwise Regression Analysis
Finland and 7 biggest economies
Table of Results for Forward Selection

GERMANY entered.

              df     SS             MS             F              Significance F
Regression    1      2745.549597    2745.549597    74.65509029    3.55615E-14
Residual      116    4266.068825    36.77645538
Total         117    7011.618422

              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -0.034545387    0.562181166      -0.061448851    0.951107497    -1.148015985    1.078925211
GERMANY       0.730094046     0.084498518      8.640317719     3.55615E-14    0.562734089     0.897454003

JAPAN entered.

              df     SS             MS             F              Significance F
Regression    2      3150.704997    1575.352499    46.92297325    1.2593E-15
Residual      115    3860.913425    33.57316021
Total         117    7011.618422

              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -0.117655288    0.537672497      -0.218823333    0.82717555     -1.18268099     0.947370414
GERMANY       0.66036799      0.083192301      7.937849769     1.51106E-12    0.495580058     0.825155923
JAPAN         0.351600388     0.101212614      3.473879136     0.000724094    0.151117685     0.55208309

No other variables could be entered into the model. Stepwise ends.

Line Plots
We created line plots for Finland against each of the independent variables:

[Figure: line plots of Finland's rate of return against each independent variable]
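The t statistic defined earlier can be illustrated with the Germany figures from the first step of the stepwise output (coefficient 0.730094046, standard error 0.084498518). The sketch below is a hedged check rather than a reproduction of PHStat: it computes t for a hypothesized slope of zero and uses the fact that, in a regression with a single independent variable, the overall F statistic equals the square of that t statistic.

```python
# Values reported for GERMANY when it is the only variable in the model.
coef = 0.730094046        # sample slope b_j
std_err = 0.084498518     # standard error of the slope
hypothesized = 0.0        # population slope under the null hypothesis
critical_t = 1.9799       # critical value quoted in the paper

t_stat = (coef - hypothesized) / std_err
reject_null = abs(t_stat) > critical_t

# With one independent variable, F for the overall model equals t squared.
f_from_t = t_stat ** 2

print(t_stat, reject_null, f_from_t)
```

The computed t agrees with the reported 8.640317719 and its square with the reported F of 74.65509029, so the two tests reach the same decision in the single-variable step.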
Further Analysis
At this point we wondered whether Finland's stock market was correlated more closely with the bigger economies in its neighborhood. We picked the following countries for further analysis:

Country        Economic Rank
Russia         11
Netherlands    16
Belgium        18
Sweden         19
Poland         24
Norway         25
Denmark        28

We obtained the following Excel output:

[Figure: Excel regression output for Finland and seven neighboring economies]

Here $r^2$ = 0.596 told us that Finland's stock market has a better linear relationship with its seven neighboring countries than with the biggest seven economies of the world. The F statistic is 23.1811. Again with a 0.05 level of significance, the critical value of the F distribution with 7 and 110 (118 – 7 – 1) degrees of freedom found from F tables is approximately 2.09. Because 23.1811 > 2.09, we rejected $H_0$ and found enough statistical proof to conclude that at least one of the independent variables (the rates of return of the seven neighboring economies) is related to Finland's rate of return. The multiple linear regression equation was then written from the coefficients in this output.
Again we created the t-test table and found that only Sweden and Poland make a significant contribution to the Y value, Finland's rate of return.

Country        Critical t   t Stat in area of nonrejection   Null Hypothesis   p-value > 0.05
RUSSIA         1.9799       Yes                              Not rejected      Yes
NETHERLANDS    1.9799       Yes                              Not rejected      Yes
BELGIUM        1.9799       Yes                              Not rejected      Yes
SWEDEN         1.9799       No                               Rejected          No
POLAND         1.9799       No                               Rejected          No
NORWAY         1.9799       Yes                              Not rejected      Yes
DENMARK        1.9799       Yes                              Not rejected      Yes

The t statistics were 4.76688076 for Sweden (p-value 5.77E-06) and 4.229502 for Poland (p-value 4.86E-05). For the other five countries the absolute t statistics ranged from 0.1346789 to 0.87458724, with p-values between 0.383704 and 0.893112.

With this new insight, that Finland is more closely related to Sweden and Poland, we reduced the linear equation to one containing only those two variables. Once again we performed a PHStat Stepwise Regression to confirm the findings, and we found the same.
The findings above got us curious. We wanted to see if Finland's rate of return had any linear relationship with other countries in Asia Pacific and Latin America. We decided to keep the four countries which had a linear relationship in our earlier analysis and added eight others which were not considered before. These countries were picked (Germany, Japan, Poland and Sweden are the four carried over from the earlier analysis):

ARGENTINA, AUSTRALIA, BRAZIL, CHINA, GERMANY, INDIA, INDONESIA, JAPAN, MEXICO, POLAND, SWEDEN, SINGAPORE

We obtained the following Excel output:

[Figure: Excel regression output for Finland and twelve countries]
We found that $r^2$ = 0.62, which told us that Finland's stock market has a better linear relationship with the twelve countries picked in the list. Again with a 0.05 level of significance, the F statistic of 14.715 is well above the critical value of the F distribution with 12 and 105 (118 – 12 – 1) degrees of freedom, which is approximately 1.85. We therefore rejected $H_0$ and found statistical proof to conclude that at least one of the independent variables (the rates of return from the twelve countries) is related to Finland's rate of return.
We created the t-test table and found that only Sweden, Poland, China and Australia had a linear relationship with the Y value, Finland's rate of return.

Country      Critical t (absolute)   t Stat in area of nonrejection   Null Hypothesis   p-value > 0.05
ARGENTINA    1.9799                  Yes                              Not rejected      Yes
AUSTRALIA    1.9799                  No                               Rejected          No
BRAZIL       1.9799                  Yes                              Not rejected      Yes
CHINA        1.9799                  No                               Rejected          No
GERMANY      1.9799                  Yes                              Not rejected      Yes
INDIA        1.9799                  Yes                              Not rejected      Yes
INDONESIA    1.9799                  Yes                              Not rejected      Yes
JAPAN        1.9799                  Yes                              Not rejected      Yes
MEXICO       1.9799                  Yes                              Not rejected      Yes
POLAND       1.9799                  No                               Rejected          No
SWEDEN       1.9799                  No                               Rejected          No
SINGAPORE    1.9799                  Yes                              Not rejected      Yes

The p-values of the four significant countries ranged from 0.001162327 to 0.044244445; those of the other eight countries ranged from 0.374116105 to 0.913007487.
Again the PHStat Stepwise Regression confirmed the same findings: Finland is related linearly to Sweden, Poland, China and Australia.

Multiple Linear Regression Conclusions
We found that regression can be a very good tool to model stock market returns and to find relationships among different market returns. We were able to find that Finland's rate of return was related to the rates of return of these countries: Sweden, Poland, China and Australia. Another way to understand this is that these five countries present similar investment risks, and this needs to be kept in mind while making investing decisions. With any statistical analysis there is always going to be uncertainty.
6. Forecasting
We exercised forecasting modeling techniques with Austria's stock index prices. First we plotted the monthly rate of return against time. Here we found that the monthly returns are non-seasonal and have no consistent upward or downward trend. In this case the use of exponential and moving average smoothing models for forecasting purposes was most appropriate.

Models
We used the following forecasting techniques to create our forecasting models:
1. First Order Naïve
2. 2nd Order Naïve
3. 3 Period Moving Averages
4. 4 Period Moving Averages
5. 5 Period Moving Averages
6. Exponential Smoothing Forecast with ω = 0.758
7. Exponential Smoothing Forecast with ω = 0.2
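Three of the listed techniques can be sketched as follows. This is an illustrative sketch, not the paper's Excel implementation: the function names are hypothetical, the exponential smoother is assumed to start from the first observation, and the second-order naïve model is left out because the paper does not define it.

```python
def naive(series):
    """First-order naive: the forecast for period t is the value at t-1."""
    return [None] + list(series[:-1])

def moving_average(series, window):
    """Forecast for period t is the mean of the previous `window` values."""
    out = [None] * window
    for t in range(window, len(series)):
        out.append(sum(series[t - window:t]) / window)
    return out

def exponential_smoothing(series, w):
    """Forecast F[t+1] = w * Y[t] + (1 - w) * F[t], assuming F[1] = Y[0]."""
    out = [None]
    level = series[0]
    for y in series[1:]:
        out.append(level)                  # forecast made before observing y
        level = w * y + (1 - w) * level    # update once y is known
    return out

# Hypothetical monthly returns, for illustration only.
returns = [2.0, 4.0, 6.0, 8.0]

f_naive = naive(returns)
f_ma3 = moving_average(returns, 3)
f_exp = exponential_smoothing(returns, 0.5)
print(f_naive, f_ma3, f_exp)
```

A smoothing weight close to one, such as 0.758, makes the forecast follow the latest observation closely; a small weight such as 0.2 averages over a longer history and produces a smoother forecast.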
Forecast Error Measures
We created these seven models using Excel. After creating the forecasted series we calculated forecast error measures. These are given as:

Bias (average error):
$$\bar{e} = \frac{1}{n} \sum_{t=1}^{n} e_t$$

Variability (mean squared error and standard deviation):
$$MSE = \frac{1}{n} \sum_{t=1}^{n} e_t^2, \qquad s = \sqrt{MSE}$$

Mean absolute error:
$$MAD = \frac{1}{n} \sum_{t=1}^{n} |e_t|$$

We calculated the error measures for each model:

[Table: Average Error, MSE, SE and MAD for the models FNAIV, 2nd NAIV, F_MA_3, F_MA_4, F_MA_5, EXP_0.758 and Exp_0.2]

We found that the Exponential Smoothing Forecast with ω = 0.758 had the least bias, with an average error of -0.00982, whereas the Exponential Smoothing Forecast with ω = 0.2 had the least variability. The mean absolute error (MAD) was least for the 4 Period Moving Averages model, but the Exponential Smoothing Forecast with ω = 0.758 also had a MAD close to that of the moving averages model.
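The error measures can be computed from any pair of actual and forecasted series as sketched below. The divisor n for each measure is an assumption about the paper's formulas, and the function name and sample series are hypothetical.

```python
import math

def error_measures(actual, forecast):
    """Bias, MSE, SE and MAD over the periods that have a forecast."""
    errors = [a - f for a, f in zip(actual, forecast) if f is not None]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "bias": sum(errors) / n,              # average error
        "mse": mse,                           # mean squared error
        "se": math.sqrt(mse),                 # standard deviation of the errors
        "mad": sum(abs(e) for e in errors) / n,
    }

# Hypothetical actual series and a forecast with no value for the first period.
actual = [2.0, 4.0, 6.0, 8.0]
forecast = [None, 2.0, 3.0, 4.5]

measures = error_measures(actual, forecast)
print(measures)
```

A bias near zero means the forecasts are not systematically high or low, while MSE, SE and MAD describe how far individual forecasts miss; a model can do well on one measure and poorly on another, as the comparison of the seven models shows.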
Charts for Forecast Models
We created charts of actual versus forecasted values for all these models, to show visually the different models used for forecasting.

[Chart: First Order Naïve Forecast Model (Blue Actual, Red Forecasted)]
[Chart: Second Order Naïve Forecast Model (Blue Actual, Red Forecasted)]
[Chart: 3 Period Moving Averages Forecast Model (Blue Actual, Red Forecasted)]
[Chart: 4 Period Moving Averages Forecast Model (Blue Actual, Red Forecasted)]
[Chart: 5 Period Moving Averages Forecast Model (Blue Actual, Red Forecasted)]
[Chart: Exponential Smoothing Forecast with ω = 0.758 (Blue Actual, Red Forecasted)]
[Chart: Exponential Smoothing Forecast with ω = 0.2 (Blue Actual, Red Forecasted)]

7. Conclusion
In this paper we used different statistical techniques to analyze international stock market returns. The study as such could be very exhaustive, but within our limited scope we successfully demonstrated the use of basic statistics, linear regression and time series forecasting.
Appendix

A. References
1. Levine, Stephan, Krehbiel, Berenson, Statistics for Managers Using Microsoft Excel, 5th Edition, 2008.
2. Anderson, Sweeney, Williams, Essentials of Modern Business Statistics, 4th Edition, 2009.
3. Statistical terms at http://en.wikipedia.org/
4. Class notes.

B. List of Excel files used for calculations
1. country_data_in_pc.xlsm is the main Excel file, with the following worksheets:
   a. The base data for the fifty countries is in the HistoryIndex worksheet.
   b. The MarketReturns worksheet transforms the HistoryIndex worksheet to log returns in percentages using the formula =100*LN(HistoryIndex!B11/HistoryIndex!B10). It also has the following basic statistics for each country: Monthly Average Return, Monthly Variance, Monthly Std. Deviation, Yearly Avg Return, Yearly Std. Dev, Coefficient of Variance, Skewness, Kurtosis.
   c. The BasicStatsUsingExcel worksheet contains the basic statistics once again, but calculated using the Data Analysis Descriptive Statistics tools instead of the functions and calculations used for each statistical measure in b.
   d. The list of all countries in one column is in the CountryNames worksheet.
   e. The line diagram of average yearly rates of return with standard deviation is in the YRLY_RETURN_CHART_BY_COUNTRIES worksheet.
   f. The Kurtosis_Chart worksheet has the monthly kurtosis measure plotted with monthly average returns and standard deviation.
   g. The Histograms worksheet has histograms of the rates of return for all countries, including their frequency tables.
   h. The CorrelationMatrix worksheet has the 50 X 50 correlation matrix table for the 50 countries.
2. Correlationmatrix.xlsx has the 50 X 50 correlation matrix table for the 50 countries (CorrelationMatrix).
3. Finland_n_other_7_big_economies.xlsx has Finland's regression calculation, along with residual plots and line diagrams, in the first worksheet Finland_Regression_Model.
4. finland_vs_7_Neighboring _big_countries.xlsx has Finland's regression calculations with the seven biggest neighboring economies. The MR worksheet has the linear regression model.
5. Finland_and_other_tweleve_countries.xlsx has Finland's regression calculations with 12 other major economies. The MR worksheet has the linear regression model.
   In files 3 to 5 the other worksheets contain output from PHStat; where present, the Stepwise worksheet has the PHStat Stepwise Regression output.
6. Austria_time_forecasting.xlsm has the time series modeling output. These are the worksheets in the Excel file:
   a. Worksheet Main has the basic time series and columns showing all time series models with values and errors.
   b. The Basic Charts worksheet has Austria's stock price and returns plotted against time.
   c. Worksheets FNAIV, FNAIV, F_MA_3, F_MA_4, F_MA_5, EXP_0.758 and EXP_0.2 are used to calculate Avg Error, MSE, SE and MAD.