
Weather Forecasting: A Time Series Analysis using R
Abstract
In the past several years, weather forecasting has grown in importance as a subject of study.
Most researchers have tried to establish a linear relationship between the target data and the
related input meteorological data. However, once nonlinearity in the structure of
meteorological data was discovered, attention shifted to nonlinear forecasting of
weather data. Although there is a wealth of literature on nonlinear statistics for
forecasting the weather, most of it requires the model in question to be specified before
estimation is carried out. In this study, a forecast model was run in R using the weather
history dataset from Kaggle. The study offers an analysis of various methods for weather
forecasting and draws on datasets that are freely accessible.

Table of Contents
Abstract
1. Introduction
2. Problem of Current Service
3. Analysis of Data
3.1 Descriptive Analysis
3.2 Data Visualization
3.3 Forecasting
3.3.1 Trend of the Data
3.3.2 Training data set trend for prediction
3.3.3 Predicted Trend for next 4 years
4. Conclusion
5. References

1. Introduction
Ambient air temperatures and rainfall are two of the climatic factors that have the greatest
impact on the biosphere and human activities. This is why their measurement has received a
lot of attention. At the end of the 19th century, in 1883, Köppen made the first attempt to
construct a mean worldwide surface air temperature time series [1]. His series spanned a
substantial amount of time.
Our investigation has two goals. First, we consider the possible advantages of using
fine-scale meteorology data products, rather than a single monitoring station, to
estimate the acute health consequences of high temperatures. We are particularly interested
in comparing exposure assessment approaches that take spatial variability into account with
those that assume spatially homogeneous exposures. The
second goal of this study is to close a significant knowledge gap about the relationship
between high temperatures and morbidity, as measured by visits to emergency departments
(EDs). Most previous research has focused on mortality [19,20,21] and
hospitalisations [22,23,24,25], and many of the multi-city US morbidity studies have
included only Medicare beneficiaries 65 years of age or older.

One crucial use of neural networks is forecasting time series data. Time series analysis
supports the study of past observations, which is useful for predicting the future. Forecasting
is crucial in many fields, including business and finance, where it helps groups plan their
policies; chemical process control and digital transactions, where it helps identify unusual or
fraudulent scenarios; electric power distribution, where it helps manage power flow problems
and load through advanced monitoring; weather prediction, where it supports
well-informed choices for agriculture, aviation, and maritime navigation; and biological
sciences, where it helps with understanding biological processes.

At present, linear and nonlinear time series models are typically treated separately in
books (Paolella, 2018; Tsay and Chen, 2018).
Furthermore, we could find no recent study that offers a sufficiently wide perspective on
time series modelling for downstream prediction. Also absent was a compilation that puts
streaming data preprocessing, the different elements of time series modelling, and forecasting
in context and is accessible to scientists without deep experience in time series.
In short, one of the objectives of this survey is to offer a comprehensive
description of the most recent developments that we discuss.

2. Problem of Current Service

The primary goal of this study was to assess the accuracy of the forecasts. This is one of the
most difficult research tasks in the field of weather services. Weather forecasting means
making predictions about future atmospheric conditions from a large amount of data, and it
is a difficult endeavour. Although technology and modelling methods
have advanced, forecasts are not always accurate. Because weather systems are
chaotic, even minor changes to the starting conditions or underlying assumptions in a
model can result in considerable departures from the forecast weather. Weather
agencies are constantly working to increase forecast accuracy. In this study, the time series
approach was applied to historical weather data to assess forecasting accuracy. It
was not an easy process, and much remains to be done on this topic on larger
platforms in the future. The forecasts are evaluated and visualised using R.
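
Assessing forecast accuracy usually comes down to comparing forecasts with held-out observations. The study does not name the error measures it used, so the following is only a minimal sketch, written in Python rather than R, of two common choices: mean absolute error (MAE) and root mean squared error (RMSE). The function name and the sample values are hypothetical.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE and RMSE for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Hypothetical held-out temperatures (°C) and the forecasts for the same periods.
held_out = [10.0, 12.0, 11.0, 9.0]
forecast = [11.0, 11.0, 11.0, 11.0]
scores = forecast_errors(held_out, forecast)
print(scores)
```

RMSE penalises large misses more heavily than MAE, so reporting both gives a fuller picture of how a forecast fails.
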
The data set used in this research contains over 2,500 observations for each of 13 variables.
It holds a wealth of information on weather history as well as a number of columns
that may be irrelevant for risk evaluation. The first step was cleaning: the data were
cleaned, outliers were identified, and they were removed before charts were used
to conduct univariate and bivariate analyses.
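
The text does not say which rule was used to identify outliers. One widely used convention is the interquartile range (IQR) rule, sketched below in Python under that assumption. The multiplier of 1.5 and the sample readings are illustrative, not taken from the study.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical temperature readings (°C) with one implausible value.
readings = [9.5, 10.1, 10.4, 9.8, 10.0, 10.3, 9.9, 55.0]
cleaned = remove_outliers_iqr(readings)
print(cleaned)
```

Whether a flagged value should actually be deleted depends on the variable: an extreme but physically plausible reading may be worth keeping.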

Descriptive analysis was used for data visualisation, and the weather was then projected
several years into the future. R was the tool employed in this study because it
supports exploring and analysing the data, which makes it possible to discover new
information and understand it fully. The historical weather dataset can be downloaded from
Kaggle. It was chosen for this evaluation with the value of business analytics in mind.
Researchers are working to improve the accuracy and lead time of weather forecasts. They
create simulations of the atmosphere using information on humidity, temperature, wind, and
altitude. These models provide useful data for disaster preparedness, rescue efforts,
and public safety by helping to forecast conditions such as powerful winds, storms, and other
dangerous events.

Researchers examine the processes that govern weather patterns on many scales, from
local hurricanes to global climate trends, through intensive data gathering, analysis, and
simulation. To gather and analyse large volumes of data, they make use of advanced
instruments including weather stations, infrared systems, satellites, and mathematical
models. By looking at past weather data and evaluating the present state of the atmosphere,
researchers find trends and anomalies, which helps to improve weather forecasting
methods.

3. Analysis of Data
Descriptive statistics are computed on the hourly weather time series records that represent
a typical year from 2006 to 2016, to determine the mean, the standard deviation, and the
maximum and minimum values. Additionally, temperature frequency histograms are
produced. Two time-related features are added because the weather data
vary during the day and are seasonal: hours represents the hours from the start of the day,
whereas time represents the hours from the start of the year.
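
The two time-related features described above can be derived directly from each record's timestamp. The following Python sketch shows one way to do it. The function name is hypothetical, and time zones are ignored for simplicity.

```python
from datetime import datetime


def time_features(timestamp):
    """Return hours since the start of the day and since the start of the year."""
    start_of_year = datetime(timestamp.year, 1, 1)
    hours_into_year = int((timestamp - start_of_year).total_seconds() // 3600)
    return {"hours": timestamp.hour, "time": hours_into_year}


# 2 January 2006 at 05:00 is 5 hours into the day and 29 hours into the year.
features = time_features(datetime(2006, 1, 2, 5, 0))
print(features)
```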

3.1 Descriptive Analysis


In the weather history, the yearly average temperature was 10.10°C (M = 10.10, SD = 0.876).
Temperatures over the year ranged from
a minimum of -8.59°C to a maximum of 32.64°C. The standard deviation of 0.876 indicates
that the average temperature was strongly concentrated around the mean.
According to the weather records, the annual average actual temperature was
9.022°C (M = 9.02, SD = 0.6576), with temperatures
ranging from a minimum of -11.91°C to a maximum of 35.49°C over the year. Here too, the
average temperature was strongly concentrated around the mean, as
indicated by the small standard deviation.
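
As an illustration of how summary figures of this kind can be produced, the Python sketch below first averages readings within each year and then summarises the yearly means. It assumes the reported M and SD describe yearly averages, which the text suggests but does not state outright. The records are made up and are not the Kaggle data.

```python
import statistics
from collections import defaultdict


def yearly_means(records):
    """Average the temperature readings within each year."""
    by_year = defaultdict(list)
    for year, temperature in records:
        by_year[year].append(temperature)
    return {year: statistics.mean(temps) for year, temps in sorted(by_year.items())}


def describe(values):
    """Mean, sample standard deviation, minimum and maximum of a sequence."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical (year, temperature in °C) records.
records = [(2006, 8.0), (2006, 10.0), (2007, 9.0), (2007, 11.0), (2008, 10.0), (2008, 12.0)]
means = yearly_means(records)
summary = describe(list(means.values()))
print(means, summary)
```

Averaging within years removes most of the seasonal swing, which is why the spread of yearly means can be much narrower than the range of individual readings.
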

3.2 Data Visualization
For temperature, which Precip Type categories contribute the most and the least?
The pie chart shows the Precip Type categories for the temperature variable.
● 88% of the records in this dataset have the precipitation type rain.
● 11% of the records in this dataset have the precipitation type snow.
For temperature, which summary categories contribute the most and the least?
The pie chart shows the summary categories for the temperature variable.
● 33% of the records in this dataset are partly cloudy.
● 29% of the records in this dataset are mostly cloudy.
● 38% of the records in this dataset fall into other categories.
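
The percentages behind a pie chart like these are simply the relative frequencies of each category label. A minimal Python sketch, with hypothetical labels rather than the actual dataset columns:

```python
from collections import Counter


def category_shares(labels):
    """Return each category's share of the records as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.most_common()}


# Hypothetical precipitation labels: 8 rain records and 2 snow records.
precip = ["rain"] * 8 + ["snow"] * 2
shares = category_shares(precip)
print(shares)
```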

How is the temperature variable distributed in the weather history?

The chart shows that the distribution of the data is approximately normal.
For summary, which detailed summary categories contribute the most and the least?
The pie chart shows the detailed summary categories.
● 48.25% of the records in this dataset are humid and mostly cloudy.
● 25.91% of the records in this dataset are foggy.
● 21.88% of the records in this dataset are overcast.
● 3.95% of the records in this dataset fall into other categories.

3.3 Forecasting

3.3.1 Trend of the Data


Over the observed period, the time series chart showed a clear upward trend.
The statistics showed that temperatures rose steadily between 2017 and 2020. Although
there were notable variations throughout 2013, the general trend was still upward. This
suggests that the temperature will increase steadily over time. The time series graphic
represents the data gathered over the duration of the study and shows how the
temperature has changed over time, giving important insights into the fundamental trends
and patterns.

The chart makes clear that the temperature has been increasing over the course of
the recorded time period. Although there is an upward tendency overall, the chart also
shows oscillations and deviations at particular intervals. For example, the
temperature decreased significantly between 2013 and 2014. These variations might be
attributable to factors or events that affected the underlying temperature pattern.

Nevertheless, despite these brief shifts, the predominant pattern continues to be one of
steady increase. The upward trajectory shows a smooth and constant rise in temperature
over the years and suggests that persistent underlying processes or influences are at work.

Finally, the time series graphic offers a clear representation of the
temperature changes across the observed time period. The increasing trend and sporadic
variations offer important information for understanding temperature dynamics and the
overall development of the observed phenomenon.

The additive decomposition of the time series offers insight into the underlying factors
influencing the observed data. Long-term behaviour is reflected by the trend component,
shorter-term recurring behaviour by the seasonal component, and unaccounted-for variability
by the residuals. Understanding and interpreting these components supports forecasting,
pattern recognition, and well-informed decisions based on the time series data.
The seasonal component depicts recurring patterns or variations
that take place over shorter timescales, such as months, quarters, or years. These seasonal
patterns might represent cyclical repetitions or regular variations that are seen over a certain
length of time. For instance, a definite seasonality is seen in the quarters of 2013 and 2017,
with consistently larger values appearing during these times.
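
To make the three components concrete, the Python sketch below implements a classical additive decomposition: a centred moving average estimates the trend, the detrended values are averaged by seasonal position, and the remainder is the residual. This is a simplified stand-in, not necessarily the routine used in the study, and the quarterly series is synthetic.

```python
def centered_moving_average(series, period):
    """Trend estimate; positions too close to either end are left as None."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight each.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (y = T + S + R)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")
    trend = centered_moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period  # centre the seasonal effects on zero
    pattern = [s - offset for s in raw]
    seasonal = [pattern[t % period] for t in range(len(series))]
    residual = [None if tr is None else y - tr - s
                for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic quarterly series: a linear trend plus a repeating seasonal pattern.
quarterly_pattern = [2.0, -1.0, -2.0, 1.0]
series = [float(t) + quarterly_pattern[t % 4] for t in range(12)]
trend, seasonal, residual = decompose_additive(series, period=4)
print(seasonal[:4])
```

Because the synthetic series contains no noise, the residuals come out as zero. On real weather data they carry whatever the trend and seasonal parts fail to explain.
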

3.3.2 Training data set trend for prediction


The pattern seen in the training data set offers useful information for making
predictions. The data clearly show a downward trend. This implies that the
phenomenon being measured follows a consistent trajectory throughout the training period.

The declining trend in the training data suggests that the same pattern may persist in
future forecasts, because the underlying causes or
influences behind the pattern observed during the training period are likely to
continue.
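
A simple way to check the direction of the trend in a training set is to hold out the most recent observations and fit a least-squares line to the rest. The Python sketch below does this. The split size and the sample values are hypothetical, and the study does not state how its training set was formed.

```python
def split_train_test(series, test_size):
    """Chronological split: the last test_size points are held out."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def trend_slope(series):
    """Least-squares slope of the series against its time index 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    return numerator / denominator


# Hypothetical yearly mean temperatures (°C), oldest first.
temperatures = [12.0, 11.5, 11.0, 10.5, 10.0, 9.5]
train, test = split_train_test(temperatures, test_size=2)
slope = trend_slope(train)  # a negative slope indicates a downward trend
print(slope)
```

The split must be chronological: shuffling a time series before splitting would leak future information into the training set.
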

3.3.3 Predicted Trend for next 4 years
The forecast produced by the combined STL and ETS(1,1,0) model offers insight
into how the phenomenon under study will behave in the future. The model
took into consideration the exponential smoothing properties of the ETS(1,1,0) model as well
as the seasonal and trend components identified by the STL decomposition.

The forecast shows no additive or multiplicative seasonal effects
and indicates that the value of the variable of interest will grow in the future in a smooth
exponential manner. As a result, it is likely that the variable will grow steadily over time without
experiencing any substantial periodic oscillations or systematic changes.

The confidence or prediction intervals presented with the forecast should be
taken into account when evaluating it, because they express the degree of uncertainty
surrounding the predicted values.
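
To show how a point forecast and its prediction interval fit together, the Python sketch below uses simple exponential smoothing. This is a deliberately simpler technique than the STL and ETS combination described above, and it produces a flat forecast. The interval uses the standard result that, for this model, the h-step forecast variance is sigma^2 * (1 + (h - 1) * alpha^2). The smoothing parameter, the initial level, the normal-error assumption behind z = 1.96, and the sample series are all illustrative assumptions.

```python
import math


def ses_forecast(series, alpha, horizon, z=1.96):
    """Simple exponential smoothing with approximate prediction intervals.

    Returns a list of (point, lower, upper) tuples for steps 1..horizon.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]  # simple initialisation: start at the first observation
    errors = []
    for y in series[1:]:
        errors.append(y - level)        # one-step-ahead forecast error
        level += alpha * (y - level)    # update the smoothed level
    sigma = math.sqrt(sum(e * e for e in errors) / len(errors))
    result = []
    for h in range(1, horizon + 1):
        half_width = z * sigma * math.sqrt(1 + (h - 1) * alpha ** 2)
        result.append((level, level - half_width, level + half_width))
    return result


# Hypothetical yearly mean temperatures (°C), forecast four years ahead.
history = [10.0, 12.0, 11.0, 13.0]
for point, lower, upper in ses_forecast(history, alpha=0.5, horizon=4):
    print(round(point, 2), round(lower, 2), round(upper, 2))
```

The intervals widen as the horizon grows, which is the reason a four-year-ahead value deserves less trust than a one-year-ahead value.
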

4. Conclusion
In conclusion, the time series analysis of the data has shed important light on the behaviour
and trends of the variable under investigation.

Over the examined period, the data showed a decreasing pattern, indicating a negative
tendency in the phenomenon. This implies that there are underlying causes or influences
behind the observed changes.

Furthermore, decomposing the time series with the STL decomposition
method demonstrated the existence of separate components. While the
seasonal component suggested recurring patterns, the trend component displayed a negative
trend. These elements contributed to a deeper understanding of both the short-term and
long-term trends affecting the variable.

5. References

Parker, D. J., Blyth, A. M., Woolnough, S. J., Dougill, A. J., Bain, C. L., de Coning, E., et al.
(2021). The African SWIFT project: Growing science capability to bring about a revolution in
weather prediction. Bull. Am. Meteorological Soc. 103 (2), E349–E369.

Radeny, M., Desalegn, A., Mubiru, D., Kyazze, F., Mahoo, H., Recha, J., et al. (2019).
Indigenous knowledge for seasonal weather and climate forecasting across East Africa.
Clim. Change 156, 509–526. doi:10.1007/s10584-019-02476-9

Sultan, B., and Gaetani, M. (2016). Agriculture in West Africa in the twenty-first century:
Climate change and impacts scenarios, and potential for adaptation. Front. Plant Sci. 7,
1262.

Waindi, I. O., and Khalid, A. (2011). Renewable energy in East Africa: An introductory
evaluation using a systems approach to assess alternatives to providing electricity. J. Afr.
Bus. 12, 387–418.

Webber, S. (2019). Putting climate services in contexts: Advancing multi-disciplinary
understandings: Introduction to the special issue. Clim. Change 157, 1–8.

World Meteorological Organisation [WMO] (2015). WMO guidelines on multi-hazard
impact-based forecast and warning services. WMO Doc. 1150, 34.
