
Using the Bureau of Meteorology (BOM) raw temperature data provided for 8 cities, our group chose 3 cities, Adelaide, Sydney and Melbourne, for comparison and analysis. In short, the aim of the project is to analyse the variability and error in weather forecasts for the chosen cities. Along with the data set we were given 5 multiple linear regression models that might improve the forecast temperatures, bringing them as close as possible to the observed temperatures recorded by BOM. We applied all 5 models to each of the 3 cities, using Excel and Minitab to get the forecast temperatures as close as possible to the observed temperatures. Along with each regression model, Minitab also gave us a P-value for every constant and coefficient; the P-value tells us how significant that constant or coefficient is to the model. If the P-value is greater than the chosen alpha, alpha = 0.05 in our case, that constant or coefficient is not significant to the corresponding model. The following is a summary and brief description of the outcomes produced by Excel:

Adelaide
Y = a+b*ecsp is the first model we used to try to improve the forecast temperatures, where Y is the improved temperature, a is a constant and b is the coefficient of the variable ecsp, where ecsp is the European Central Model (given to us as part of the BOM data). We took the difference between the observed temperatures and the improved temperatures, squared that difference, and summed the squares to get the Sum Of Squares (SOS). We then calculated the Root Mean Square (RMS) by dividing the SOS by the number of squared differences and taking the square root. Finally, we used Solver in Excel to find the constant and coefficient that minimise this error for the model. The results are as follows:
Y = 2.4603+1.0148*ecsp   SOS = 1608.6942   RMS = 2.1139

Y = a+b*laps is the second model we used to improve the forecast temperatures, where Y is the improved temperature, a is a constant and b is the coefficient of the variable laps, where laps is the Limited Area Prediction Scheme (also given to us as part of the BOM data). Following the same steps as for the first model, we got the following results:
Y = -0.0362+1.1026*laps   SOS = 1865.6611   RMS = 2.2765

Y = a+b*ecsp+c*laps is the third model we used, where Y is the improved temperature, a is a constant, b is the coefficient of the variable ecsp and c is the coefficient of the variable laps. We followed the usual steps; the only difference is that this model combines the ecsp and laps temperatures to see whether together they improve the forecast temperatures more than the first 2 models. We got the following results:
Y = 1.0492+0.6227*ecsp+0.4476*laps   SOS = 1414.5925   RMS = 1.9823

Y = a+b*ecsp+c*laps+d*[T(obs)-3] is the fourth model we used, where Y is the improved temperature, a is a constant, b is the coefficient of the variable ecsp, c is the coefficient of the variable laps and d is the coefficient of the variable [T(obs)-3], where [T(obs)-3] is the observed temperature 3 days beforehand. We followed the usual steps, but extended model 3 with the term d*[T(obs)-3] to see whether it would further improve the forecast temperatures. The results are as follows:
Y = 1.2504+0.6331*ecsp+0.4553*laps-0.0249*[T(obs)-3]   SOS = 1408.1738   RMS = 1.9861

Y = a+b*ecsp+c*laps+d*[T(obs)-3]+e*[T(obs)-4] is the fifth and final model we used. It is the same as model 4 except for an extra coefficient e and variable [T(obs)-4], where [T(obs)-4] is the observed temperature 4 days beforehand. This model is also intended to improve on the other models. The results are as follows:
Y = 1.3021+0.6382*ecsp+0.4522*laps-0.0156*[T(obs)-3]-0.0130*[T(obs)-4]   SOS = 1405.4852   RMS = 1.9870

Minitab constant/coefficient and P-value for each model:
Model        1                2                3                4                5
a/P-value    2.4601/0.000     -0.0363/0.935    1.0491/0.009     1.2503/0.004     1.3020/0.003
b/P-value    1.01480/0.000    1.10263/0.000    0.62272/0.000    0.63309/0.000    0.63823/0.000
c/P-value    N/A              N/A              0.44764/0.000    0.45532/0.000    0.45216/0.000
d/P-value    N/A              N/A              N/A              -0.02490/0.224   -0.01563/0.584
e/P-value    N/A              N/A              N/A              N/A              -0.01303/0.649
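The fitting steps described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the workflow we ran: it solves the least-squares problem directly through the normal equations instead of using Solver in Excel, and it runs on synthetic stand-in series, not the BOM data. The function names (solve_linear_system, fit_model) and the synthetic data are hypothetical.

```python
import math
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_model(predictors, observed):
    """Least-squares fit of observed = a + sum(coef_k * predictor_k).

    Returns (coefficients, SOS, RMS); the constant a comes first.
    """
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(observed))]
    k = len(rows[0])
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, observed)) for i in range(k)]
    coeffs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
    sos = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    rms = math.sqrt(sos / len(observed))  # RMS = sqrt(SOS / count)
    return coeffs, sos, rms

# Synthetic stand-in data (an assumption for illustration, NOT the BOM data).
random.seed(0)
n_days = 360
observed = [22 + 6 * math.sin(2 * math.pi * t / 365) + random.gauss(0, 2)
            for t in range(n_days)]
ecsp = [0.95 * y - 1.5 + random.gauss(0, 2.0) for y in observed]
laps = [0.90 * y + 0.5 + random.gauss(0, 2.5) for y in observed]

# Model 1: Y = a + b*ecsp
coeffs1, sos1, rms1 = fit_model([ecsp], observed)
# Model 3: Y = a + b*ecsp + c*laps
coeffs3, sos3, rms3 = fit_model([ecsp, laps], observed)
# Model 4: adds the observation from 3 days earlier, so the first 3 days drop out.
lag = 3
coeffs4, sos4, rms4 = fit_model(
    [ecsp[lag:], laps[lag:], observed[:-lag]], observed[lag:]
)

for name, (sos, rms) in {"model 1": (sos1, rms1),
                         "model 3": (sos3, rms3),
                         "model 4": (sos4, rms4)}.items():
    print(f"{name}: SOS = {sos:.4f}  RMS = {rms:.4f}")
```

Model 5 would follow the same pattern with lag 4 and both lagged series as predictors. Because the lagged models are fitted on fewer days, their SOS values are not directly comparable with those of models 1 to 3, which is why the RMS is the more useful figure for comparison.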

Analysing the results from both Excel and Minitab and comparing the 5 models, we identified the fourth model as the most effective: it gave a lower SOS than models 1 to 3 and a lower RMS than model 5, although the RMS listed for model 3 (1.9823) is marginally lower than that of model 4 (1.9861). Model 5 was almost as effective, with an RMS differing from that of model 4 by only 0.0009. Looking at the constants and coefficients in the Minitab results, the P-values show that the model 2 constant a, the model 4 coefficient d of the variable [T(obs)-3], and the model 5 coefficients d and e of the variables [T(obs)-3] and [T(obs)-4] respectively are not significant to their models, as each has a P-value greater than the chosen alpha = 0.05. We followed the same procedure for Melbourne and Sydney; the results of the 5 models, in order 1-5, are:
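The significance rule applied above (a term is not significant when its P-value exceeds alpha = 0.05) can be sketched as a short Python check. The P-values are copied from the Adelaide Minitab table; the function name not_significant and the dictionary layout are illustrative assumptions.

```python
ALPHA = 0.05  # chosen significance level

# (model, term) -> P-value, copied from the Adelaide Minitab table
adelaide_p_values = {
    (1, "a"): 0.000, (1, "b"): 0.000,
    (2, "a"): 0.935, (2, "b"): 0.000,
    (3, "a"): 0.009, (3, "b"): 0.000, (3, "c"): 0.000,
    (4, "a"): 0.004, (4, "b"): 0.000, (4, "c"): 0.000, (4, "d"): 0.224,
    (5, "a"): 0.003, (5, "b"): 0.000, (5, "c"): 0.000,
    (5, "d"): 0.584, (5, "e"): 0.649,
}

def not_significant(p_values, alpha=ALPHA):
    """Return the (model, term) pairs whose P-value is greater than alpha."""
    return sorted(key for key, p in p_values.items() if p > alpha)

flagged = not_significant(adelaide_p_values)
print(flagged)  # prints [(2, 'a'), (4, 'd'), (5, 'd'), (5, 'e')]
```

The same function could be reused for the other cities by swapping in their P-values.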

Melbourne
Y = 1.5196+1.0520*ecsp   SOS = 1104.8829   RMS = 1.7519
Y = 0.4826+1.0678*laps   SOS = 1881.2783   RMS = 2.2860
Y = 1.1162+0.8798*ecsp+0.1870*laps   SOS = 1071.2407   RMS = 1.7250
Y = 1.3447+0.8871*ecsp+0.1998*laps-0.0287*[T(obs)-3]   SOS = 1060.6494   RMS = 1.7237
Y = 1.4750+0.8946*ecsp+0.2037*laps-0.0014*[T(obs)-3]-0.0439*[T(obs)-4]   SOS = 1049.0557   RMS = 1.7166

Minitab constant/coefficient and P-value for each model:
Model        1                2                3                4                5
a/P-value    1.5194/0.000     0.4824/0.243     1.1162/0.000     1.3448/0.000     1.4750/0.000
b/P-value    1.05196/0.000    1.06784/0.000    0.87983/0.000    0.88717/0.000    0.89471/0.000
c/P-value    N/A              N/A              0.18699/0.001    0.19970/0.000    0.20361/0.000
d/P-value    N/A              N/A              N/A              -0.02871/0.101   -0.00135/0.952
e/P-value    N/A              N/A              N/A              N/A              -0.01303/0.050
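As a quick consistency check on the Melbourne figures, the relation RMS = sqrt(SOS / n) can be inverted to recover the number of days n behind each model. The sketch below is only an illustration: the SOS and RMS values are copied from the results above, and the function name implied_count is hypothetical.

```python
# model -> (SOS, RMS), copied from the Melbourne results
melbourne = {
    1: (1104.8829, 1.7519),
    2: (1881.2783, 2.2860),
    3: (1071.2407, 1.7250),
    4: (1060.6494, 1.7237),
    5: (1049.0557, 1.7166),
}

def implied_count(sos, rms):
    """Invert RMS = sqrt(SOS / n) to get n = SOS / RMS^2."""
    return sos / rms ** 2

counts = {model: round(implied_count(sos, rms))
          for model, (sos, rms) in melbourne.items()}
print(counts)  # prints {1: 360, 2: 360, 3: 360, 4: 357, 5: 356}
```

The smaller counts for models 4 and 5 would be consistent with the lagged observations [T(obs)-3] and [T(obs)-4] being unavailable for the first few days of the record, although the sample sizes are not stated in the data summary, so this remains an inference from the reported figures.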

Looking at the Excel results for the 5 Melbourne models, we can see that the most effective model was model 5, as it gave the lowest RMS. Looking at the Minitab results, i.e. the constants and coefficients with their corresponding P-values, the following are not significant to their respective models: the model 2 constant a, the model 4 coefficient d of the variable [T(obs)-3], and the model 5 coefficients d and e of the variables [T(obs)-3] and [T(obs)-4] respectively (e only marginally, with a reported P-value of 0.050).

Sydney
Y = 0.9217+1.0401*ecsp   SOS = 1046.8208   RMS = 1.7052
Y = 1.2618+1.0340*laps   SOS = 1788.9680   RMS = 2.2292
Y = 0.7047+0.9383*ecsp+0.1128*laps   SOS = 1036.8434   RMS = 1.6971
Y = 0.9822+0.9404*ecsp+0.1271*laps-0.0269*[T(obs)-3]   SOS = 1026.9851   RMS = 1.6961
Y = 1.1900+0.9363*ecsp+0.14103*laps-0.0100*[T(obs)-3]-0.0346*[T(obs)-4]   SOS = 1020.4495   RMS = 1.6931

Minitab constant/coefficient and P-value for each model:
Model        1                2                3                4                5
a/P-value    0.9212/0.037     1.2618/0.034     0.7044/0.121     0.9819/0.052     1.1897/0.023
b/P-value    1.04015/0.000    1.03398/0.000    0.93842/0.000    0.94058/0.000    0.93648/0.000
c/P-value    N/A              N/A              0.11274/0.065    0.12694/0.040    0.14087/0.025
d/P-value    N/A              N/A              N/A              -0.02690/0.202   -0.01005/0.674
e/P-value    N/A              N/A              N/A              N/A              -0.03460/0.144

As for Melbourne, the Excel results show that model 5 was also the most effective model for Sydney, as it gave the lowest RMS. Turning to the Minitab results, we again judged a constant or coefficient to be not significant to its model if its P-value was higher than the chosen alpha of 0.05. Unlike Adelaide and Melbourne, Sydney had more constants and coefficients that were not significant: the model 3 constant a and coefficient c of the variable laps, the model 4 constant a and coefficient d of the variable [T(obs)-3], and the model 5 coefficients d and e of the variables [T(obs)-3] and [T(obs)-4] respectively.

In summary, looking at all 3 cities and comparing all 5 models and the significance of their constants and coefficients, we can conclude that model 5 was the most effective at bringing the forecast temperatures closest to the observed temperatures given to us by BOM (except for Adelaide, where model 4 was the most effective). We can also conclude that coefficient d in model 4 and coefficients d and e in model 5 were not significant for any of the 3 cities we analysed. The city for which regression model 5 performed best was Sydney: it gave the lowest RMS, and for that reason the forecast temperatures closest to the observed temperatures.
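The city comparison in the summary can be reproduced with a few lines of Python. This is only an illustrative sketch: the RMS values for models 1 and 5 are copied from the results for each city, and the variable names are hypothetical.

```python
# city -> {model: RMS}, copied from the reported results for models 1 and 5
rms_by_city = {
    "Adelaide": {1: 2.1139, 5: 1.9870},
    "Melbourne": {1: 1.7519, 5: 1.7166},
    "Sydney": {1: 1.7052, 5: 1.6931},
}

# City with the lowest RMS under model 5
best_city = min(rms_by_city, key=lambda city: rms_by_city[city][5])

# Percentage reduction in RMS when moving from model 1 to model 5
reduction_pct = {
    city: 100 * (rms[1] - rms[5]) / rms[1]
    for city, rms in rms_by_city.items()
}

print("Lowest model 5 RMS:", best_city)
for city, pct in reduction_pct.items():
    print(f"{city}: RMS reduced by {pct:.2f}% from model 1 to model 5")
```

Sydney has the lowest model 5 RMS, while the relative improvement over model 1 is largest for Adelaide, so "closest forecast" and "largest improvement" are different rankings.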