
Received: 4 February 2021 Accepted: 24 February 2021

DOI: 10.1002/dac.4805

RESEARCH ARTICLE

Weather forecasting and prediction using hybrid C5.0 machine learning algorithm

Sudhan Murugan Bhagavathi1 | Anitha Thavasimuthu2 | Aruna Murugesan3 | Charlyn Pushpa Latha George Rajendran2 | Vijay A4 | Laxmi Raja5 | Rajendran Thavasimuthu5

1 Department of Electronics and Communication Engineering, Vins Christian College of Engineering, Nagercoil, India
2 Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
3 Department of Computer Science and Engineering, College of Engineering and Technology, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Chennai, India
4 Department of Business Administration and Information Systems, Arba Minch University, Sawla Campus, Sawla, Ethiopia
5 Faculty of Engineering, Karpagam Academy of Higher Education, Coimbatore, India

Correspondence
Anitha Thavasimuthu, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamilnadu, India.
Email: anitha.bioinformatics@gmail.com

Summary
In this research, a weather forecasting model based on machine learning is proposed for improving the accuracy and efficiency of forecasting. The aim of this research is to propose a weather prediction model for short-range prediction based on numerical data. Daily weather prediction builds on the work of thousands of meteorologists and observers worldwide. Modernized computers make predictions more precise than ever, and earth-orbiting weather satellites capture pictures of clouds from space. However, in many cases, the forecast under many conditions is not accurate. Numerical weather prediction (NWP) is one of the popular methods for forecasting weather conditions: it is a major weather modeling tool for meteorologists and contributes to higher forecasting accuracy. In this research, the weather forecasting model uses the C5.0 algorithm with K-means clustering. C5.0 is one of the better decision tree classifiers, and the decision tree is a strong alternative for forecasting and prediction. The K-means clustering algorithm is used to group similar data together; it is first applied to divide the dataset into the K closest clusters. For training and testing, the meteorological data collection obtained from the Modern-Era Retrospective Analysis for Research and Applications (MERRA) database is used. The model's performance is assessed through mean absolute error (MAE) and root mean square error (RMSE), and the proposed model is validated with accuracy, sensitivity, and specificity. The results obtained are compared with other current machine learning approaches, and the proposed model achieved a predictive accuracy of 90.18%.

KEYWORDS
C5.0 algorithm, K-means clustering, machine learning, MERRA database, numerical weather
prediction, weather forecasting

Int J Commun Syst. 2021;e4805. wileyonlinelibrary.com/journal/dac © 2021 John Wiley & Sons Ltd. 1 of 14
https://doi.org/10.1002/dac.4805

1 | INTRODUCTION

Weather essentially refers to the state of air on the earth at a given time and place. It is a persistent, multidimensional,
data-intensive, chaotic, and dynamic process. These characteristics make weather forecasting a difficult challenge.1
Forecasting is the process of making assessments of uncertain future conditions from recorded data. Weather forecasting has been one of the most technologically and scientifically complicated problems worldwide over the last century, and making accurate forecasts is one of the significant difficulties that meteorologists confront globally. Since ancient times, weather forecasting has been a fascinating and interesting domain. Scientists have attempted to predict meteorological attributes utilizing various techniques, some of them more precise than others.2
Weather predictions can be interpreted as a translation of atmospheric data that involves temperature, precipitation, humidity, wind speed, and wind direction. These conditions may change gradually. A wide range of resources is available for weather prediction; however, the volume of data produced is large and unstructured. Hence, forecasting from this weather information is not an easy process, because it comprises many parameters that can change rapidly with atmospheric conditions. Forecasters use space and ground observations, along with formulas and rules based on what has happened before, to make their predictions. Meteorologists use a combination of many techniques to produce their daily weather forecasts.3 They are as follows.

1.1 | Persistence forecasting

Persistence forecasting is a simplified technique for estimating weather. It relies on the current conditions to forecast tomorrow's conditions. It can be a reasonable forecasting technique when the weather is in a steady state, for example, in the tropics throughout the summer season. This technique relies firmly on the presence of a static weather pattern, and it can be helpful in both short- and long-range forecasting.

1.2 | Synoptic forecasting

This technique applies essential forecasting rules: meteorologists take their observations and apply those rules to provide short-term forecasts.

1.3 | Statistical forecasting

Over the years, records of average temperatures, precipitation, and snowfall give forecasters an understanding of what the weather is "expected to be like" in a particular season (Figure 1).

1.4 | Computer forecasting

Forecasters take their observations and feed the numbers into complicated equations. Ultra-high-speed computers run these different equations to produce "models" that provide a prediction for the following few days. Usually, different equations produce diverse results, so meteorologists should consistently use the other forecasting techniques alongside this one.4,5
Utilizing all the above techniques, forecasters propose their "best guess" of what the weather conditions will be throughout the following few days. Weather prediction now has a wide range of forecasting types, classified as follows:6

• Nowcasting—a report of current weather parameters and 0–2 h report of forecasted weather parameters
• Very short-range forecast—up to 12-h report of weather parameters
• Short-range forecast—above 12 h and up to 72-h report of weather parameters
• Medium-range forecast—above 72 h and up to 240-h report of weather parameters
• Extended-range forecast—above 10 days and up to 30-day report of weather parameters
• Long-range forecast—from 30 days up to 2 years

FIGURE 1 Weather forecasting process (Websource: https://bit.ly/2PGrka2)
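The lead-time ranges above can be expressed as a small helper function. This is an illustrative sketch only; the function name and the inclusive boundary handling are our own reading of the list, not part of the paper:

```python
def forecast_type(lead_hours):
    """Classify a forecast by its lead time in hours, following the
    ranges listed above (nowcasting through long-range)."""
    if lead_hours <= 2:
        return "nowcasting"
    if lead_hours <= 12:
        return "very short-range"
    if lead_hours <= 72:
        return "short-range"
    if lead_hours <= 240:
        return "medium-range"
    if lead_hours <= 30 * 24:           # up to 30 days
        return "extended-range"
    if lead_hours <= 2 * 365 * 24:      # up to 2 years
        return "long-range"
    raise ValueError("lead time beyond the listed forecasting ranges")
```

Under this reading, a 48-h forecast falls in the short-range class targeted by the proposed model.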

Numerical weather prediction (NWP) is a main technique for meteorologists to predict the weather and leads to higher forecasting accuracy. The objective of NWP is to compute the state of the atmosphere as it evolves over time. That implies the simulation aims to compute the speed, pressure, density, humidity, and temperature at every point in the atmosphere. As it is impossible to treat every point, owing to observational and computational restrictions, a 2D or 3D grid is used for forecasting.7
In this research, an NWP model is proposed using a decision tree (DT) algorithm with a k-means clustering approach to forecast weather conditions based on a meteorological dataset collected from the Modern-Era Retrospective Analysis for Research and Applications (MERRA) database. Chennai city is selected, and all data regarding weather conditions from 1980 to 2020 were collected in CSV format. The data collected from this database were provided by the National Aeronautics and Space Administration (NASA)/Goddard Space Flight Center. Using the proposed DT algorithm with K-means clustering, the model is trained on the dataset using different attributes like temperature, humidity, and rainfall. For other attributes like snow accumulation and sea breeze, related datasets can be used and evaluated. After the training, experimental testing is carried out, and the weather forecast for a certain time period is predicted and compared with the actual forecast for that period for model validation.

The sea breeze is a local circulation that occurs at coastal locations throughout the world. It can provide relief from oppressively hot weather, trigger thunderstorms, provide moisture for fog, and may result in either improved or reduced air quality near Earth's surface. As air masses move around the globe, air pressure changes. Areas of high pressure are called anticyclones, while low-pressure areas are known as cyclones or depressions. Each brings with it different weather patterns: anticyclones typically result in stable, fine weather with clear skies, while depressions are associated with cloudier, wetter, windier conditions.
The rest of the paper is organized as follows: Section 2 discusses related work on weather prediction using different techniques and algorithms, Section 3 presents the proposed methodology, Section 4 presents the performance analysis, and Section 5 presents the conclusion and future work of this research.

2 | RELATED WORKS

The data for weather prediction are generated from different sources like radar, ships, flights, and ground observations. They include both significant and insignificant information for forecasting, in unstructured form. Pandey et al. analyzed big data applications in the field of weather forecasting. A Hadoop system was implemented to process this unstructured information, and the word count algorithm was utilized to discover the overall condition of each day. Additionally, fuzzy logic and adaptive neuro-fuzzy inference system (ANFIS) techniques were applied for precise forecasting of weather information based on mean square error.8 It is well known that NWP models need high-performance computers to solve difficult scientific equations to attain a prediction dependent on climate conditions. Hewage et al. proposed a new lightweight data-driven weather prediction model by analyzing the temporal modeling techniques of long short-term memory (LSTM) and temporal convolutional networks (TCN). Moreover, the arbitrage of forecasting experts (AFE) was used as the dynamic ensemble technique. This deep model comprises various layers that used surface weather parameters over a given timeframe for weather prediction. The deep learning network with TCN and LSTM layers was evaluated in two distinct regression settings, namely multi-input single-output and multi-input multi-output, and produced weather predictions up to 12 h ahead.9
Liu et al. proposed a computational intelligence technique called the stacked auto-encoder for simulating 30 years of hourly weather information. This technique can learn features automatically from a large dataset by means of layer-by-layer feature granulation, and the huge size of the dataset helps keep the complicated deep model from overfitting.10 Yahya and Seker designed a model to predict selected weather factors based on an artificial neural network (ANN) comprising radial basis function (RBF), fuzzy c-means, and a nonlinear autoregressive network with exogenous input. The performance analysis of this model showed close predicted outcomes with very small statistical errors for the years 2015 to 2050; beyond that, the model starts to decline, and its outcomes were inconsistent.11 Andrade and Bessa proposed a prediction system to analyze data from an NWP grid for both solar and wind energy. The technique integrated the gradient boosting trees algorithm with feature engineering approaches that extract the maximum information from the NWP grid. For solar energy, consolidating all the NWP grid predictions, an integration of temporal and spatial indexes computed for the shortwave flux, together with PCA applied to the cloud cover at various levels, obtained the best outcomes. This led to a decrease in the total sharpness of probabilistic predictions, more significant for clear-sky days, eliminating some of the unusually high instability levels observed in those cases.12
Karevan and Suykens proposed an application for weather forecasting called transductive LSTM (T-LSTM) that uses local data in time-series prediction. In transductive learning, samples in the region of the test point are given higher weight when fitting the model. A quadratic cost function was examined for the regression problem in this research. Localizing the objective function was accomplished by analyzing a weighted quadratic cost function in which the neighborhood samples of the test point have large weights. Two weighting methods based on the cosine similarity between the training samples and the test point were analyzed. To evaluate the technique in various weather situations, the analysis was conducted on two distinct time periods of a year.13 Oana and Spataru analyzed the implementation of a genetic algorithm (GA) along with the Weather Research and Forecasting - Numerical Weather Prediction (WRF-NWP) model for optimizing the physical parameterization configuration and for enhancing the prediction of two significant weather parameters: 2-m temperature and relative humidity. The outcomes showed that the runs started for temperature prediction optimization performed well, but without substantial outcomes.14
Bhatkande and Hubball analyzed the implementation of data mining methods in predicting factors like maximum and minimum temperature in weather forecasting. This was done with DT algorithms and a meteorological dataset from 2012 to 2015 from various cities in India. The DT algorithm was utilized to remove irrelevant data from the dataset. This DT model predicted weather conditions like full cold, full hot, and snowfall, which can be life-saving information.14 Zhang et al. designed a technique utilizing multimodal fusion to create a weather visibility forecast model. An advanced NWP model and a technique for emission identification were utilized to create a multimodal fusion visibility forecast model. Advanced regression algorithms like XGBoost and LightGBM were utilized to train the numerical prediction fusion model, and the forecasting was performed based on Landsat-8 satellite images.15

3 | PROPOSED METHOD

In this research, an NWP model is proposed using a DT algorithm with a k-means clustering approach to forecast weather conditions based on the meteorological dataset collected from the MERRA database. Initially, the dataset is processed, and the attributes in the dataset, such as temperature, pressure, humidity, wind speed, rainfall, snowfall, and snow depth, are selected. The C5.0 algorithm is the DT classifier used for the classification and prediction model along with k-means clustering. The K-means clustering algorithm is used for grouping similar data together: it is first applied to divide the dataset into the K closest clusters (Figure 2).

3.1 | K-means clustering

It is a partitional clustering technique used for clustering numerical data. Partitional clustering divides the objects into a non-overlapping, un-nested (one-level) set of clusters so as to optimize the clustering criterion within every cluster. K-means clustering is applied to group objects into k disjoint clusters based on their attributes: objects within a cluster are similar, while objects from different clusters are different. K-means is a data mining/machine learning algorithm used to cluster observations into groups of related observations without prior knowledge of those relationships. The k-means algorithm is one of the basic clustering methods and is commonly utilized in many fields.16 In this research, the value of k is 2.

FIGURE 2 Proposed model

Step 1: Place K points into the space represented by the objects being clustered. These points represent the initial group centroids.
Step 2: Allocate every object to the group whose centroid is nearest.
Step 3: Recalculate the positions of the K centroids once all objects have been assigned.
Step 4: Repeat Steps 2 and 3 until the centroids no longer move. This produces a partition of the objects into groups from which the metric to be minimized can be measured.

3.2 | C5.0 algorithm

While there are numerous implementations of DTs, one of the most well-known is the C5.0 algorithm. The C5.0 algorithm has become an industry standard for producing DTs because it does well on most types of problems directly out of the box. Compared with more advanced and sophisticated machine learning models (e.g., neural networks and support vector machines [SVMs]), the DTs produced by the C5.0 algorithm generally perform nearly as well but are much easier to understand and deploy. The DT is a technique for performing classification, and DTs have turned out to be among the most effective and notable techniques in machine learning and data mining. DTs need two sorts of data: training and testing. The training data, which are commonly large, are used to build the tree. The testing data are used to obtain the accuracy and misclassification rate of the DT.
The C5.0 algorithm evolved from the C4.5 algorithm; it is a classification algorithm suitable for large datasets and improves on C4.5 in efficiency, memory, and speed. The C5.0 algorithm uses a pruning technique. Once a DT is built, some branches may reflect errors in the training data caused by noise, and these are removed by tree pruning. Tree pruning uses statistical measures to remove the least reliable branches; pre-pruning and post-pruning are the two usual methods. In pre-pruning, the tree is pruned by deciding not to further split the subset of training tuples at a given node. Post-pruning removes subtrees from a fully grown tree by exchanging a subtree with a leaf labeled with the most prevalent class in it. C5.0 also uses a boosting method to generate and integrate various classifiers to deliver enhanced predictive accuracy. Compared with C4.5, the error rate of the C5.0 classifier is around one third that of the C4.5 classifier. Ross Quinlan developed the C5.0 algorithm with solutions for classification problems. C5.0 works in three essential stages: initially, all samples are considered at the root node at the top of the tree and forwarded to the second node, called the "branch node." The branch node produces rules for a set of samples based on an entropy measure. At this point, C5.0 creates a large tree by considering all attribute values and concludes the decision rule by pruning, using a heuristic pruning technique based on a statistical measure of the splits. Having fixed the better rule, the branch nodes send the final class value to the last node, known as the "leaf node." Machine learning enables the prediction of an outcome using data about past events; the C5.0 algorithm is used here to develop a DT for classification.16
There are two sequences: X = {X1, X2, …, XN} is a training dataset, and Y = {y1, y2, …, yN} is the set of corresponding classes. Here Xi = (xi1, xi2, …, xid) is a vector of attributes, where i ∈ {1, …, N}, d is the total number of attributes, and N is the total number of vectors in the training dataset; yi ∈ C = {1, …, M} is the class number of vector Xi. An attribute xij is a discrete or a real-valued variable, that is, xij ∈ {1, …, Tj} for some integer Tj. Let DOMj = R if xj is real-valued, and DOMj = {1, …, Tj} if xj is a discrete-valued attribute. The problem is to develop a classifier, that is, a function F : DOM1 × … × DOMd → C. The function classifies a new vector X = (x1, …, xd) that is not from the training set.
A DT is a tree in which every node tests some condition on the input variables. Assume that B is some test with outputs b1, b2, …, bt that is evaluated in a node. Then there are t outgoing edges from the node, one for every output. Every leaf is linked with a result class from C. The testing process is as follows: test conditions are evaluated starting from the root node, and edges are followed based on the result of each condition. The label on the reached leaf is the output of the classification process. In a training set, the number of vectors is N, the number of classes is M, and the number of attributes is d. Let the height h of the tree being built be the algorithm's parameter. Let RVA be the set of real-valued attribute indexes, and DVA the set of discrete-valued attribute indexes.
The procedure for the enhancements is as follows. Assume that a binary tree of height h is to be constructed. The primary process constructs a classifier through a recursive procedure, Form-Tree, which develops the nodes. Form-Tree constructs a subtree from X′, the set used for constructing that subtree, in two steps. The first, choose-split, selects the test B, that is, it selects an attribute and the split by this attribute that maximizes the objective function G(X′, B)/P(X′, B). The resulting attribute index is att, and the resulting split is stored in a split variable. The second step, divide, is the splitting process itself.

3.3 | Proposed C5.0 algorithm

• A—attribute.
• DS—total dataset.
• T—an attribute with n mutually exclusive outputs T1, T2, …, Tn.
• c—count of classes.
• p(DS, j)—the ratio of instances in DS belonging to the jth class.
• Di ⊆ DS—the division of the dataset where each record has value Ti for attribute T.
• |Di|—size of the division Di.

The C5.0 algorithm initially computes the entropy of the total dataset (DS) as follows:

I(DS) = − Σ_{j=1..c} p(DS, j) log2(p(DS, j))   (1)

If T is a categorical variable, the C5.0 algorithm in the following stage computes the entropy within each division Di where every record has value Ti for T. It calculates the entropy as

I(Di) = − Σ_{j=1..c} p(Di, j) log2(p(Di, j))   (2)

Hence, the total dataset's weighted entropy when variable T is analyzed at the initial node is

I(DS, T) = Σ_{i=1..n} (|Di| / |DS|) × I(Di)   (3)

If a numerical attribute T has a domain [l, u], the data are first re-organized so that the values of T are arranged in ascending or descending order. The dataset is then separated into two divisions, D1 and D2, based on a split point p, so that the range of T in D1 is [l, p] and in D2 is [p + 1, u], where p + 1 denotes the next higher value after p in the domain. Then I(DS, T) is computed as

I(Di) = − Σ_{j=1..c} p(Di, j) log2(p(Di, j)),  for 1 ≤ i ≤ 2   (4)

I(DS, T) = Σ_{i=1..2} (|Di| / |DS|) × I(Di)   (5)

I(DS, T) is computed for every feasible split point of T. Finally, the minimal I(DS, T) is taken as the I(DS, T) of T, and the split point that generates this minimal value is considered the best split point of T.
The entropy reduction obtained by selecting attribute T as the test variable is the information gain for that variable and is computed as

Gain(DS, T) = I(DS) − I(DS, T)   (6)

Gain(DS, T) is affected by the domain size of T and is maximal when there is just a single record in every subset Di. Thus, this gain computation favors attributes with a large domain size over those with a small one. To reduce this excessive favor, the gain ratio of the attribute is used to choose the test variable for the node. The gain ratio is computed by

Gain Ratio(DS, T) = (I(DS) − I(DS, T)) / Split(DS, T)   (7)

The split information of attribute T, Split(DS, T), grows as the attribute's domain becomes larger. The split information of every attribute is computed as

Split(DS, T) = − Σ_{i=1..k} (|Di| / |DS|) × log2(|Di| / |DS|)   (8)

where the domain size of T is |T| = k.
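Equations (1), (3), and (6)–(8) for a categorical attribute translate into a short Python sketch. This is illustrative only, not the authors' implementation; the function names are our own:

```python
import math
from collections import Counter

def entropy(labels):
    """Equations (1)/(2): I(D) = -sum_j p(D, j) * log2(p(D, j))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Equations (3) and (6)-(8): information gain of a categorical
    attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Equation (3): weighted entropy after splitting on the attribute
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - weighted                    # Equation (6)
    split = -sum(len(g) / n * math.log2(len(g) / n)      # Equation (8)
                 for g in groups.values())
    return gain / split if split > 0 else 0.0            # Equation (7)
```

An attribute whose values perfectly separate the classes yields a gain ratio of 1.0, which is why it would be chosen as the root node in the procedure described below.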


Finally, the attribute holding the maximum gain ratio among all non-class attributes is selected as the root node of the DT. If the selected variable T is a categorical variable with domain size |T| = k, the dataset DS is divided into k disjoint partitions D1, D2, …, Dk. Instead, if the selected variable T is a numerical variable with domain [l, u], the dataset DS is divided into divisions D1 and D2 using the best split point of T. When the test variable of the DT has been selected for the root node, the same operations are repeated recursively on every division of the dataset until an end condition is met.
The proposed weather prediction model is designed to predict weather conditions using a machine learning-based classifier, the C5.0 algorithm. This prediction model works in simple steps: the initial step is to analyze the dataset and select the attributes, namely temperature, rainfall, pressure, humidity, wind speed, snowfall, and snow depth. The classifier is then trained on 75% of the data, and the remaining 25% is used for testing. In the final step, based on the training and testing, the classifier can predict the weather conditions.
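The pipeline described above can be sketched with scikit-learn, where the CART-style DecisionTreeClassifier stands in for C5.0 (which has no standard Python implementation); appending the k-means cluster id as an extra feature and the 75/25 split follow the paper's setup, but the function name and feature layout are our own assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def build_model(X, y, k=2, seed=0):
    """Cluster the records with k-means (k=2, as in the paper), append
    the cluster id as a feature, then fit a decision tree on a 75/25 split."""
    clusters = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    X_aug = np.column_stack([X, clusters])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_aug, y, test_size=0.25, random_state=seed)
    tree = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    return tree, tree.score(X_te, y_te)   # held-out accuracy
```

In practice, X would hold the selected MERRA attributes (temperature, rainfall, pressure, humidity, wind speed, snowfall, snow depth) and y the weather-condition label.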

4 | PERFORMANCE ANALYSIS

4.1 | Dataset description

This meteorological dataset was collected from the MERRA database. Chennai city is selected as the location, and the weather conditions for the period 1980–2020 were collected in CSV format. The data collected from this database were provided by NASA/Goddard Space Flight Center. This is a publicly available database for collecting meteorological datasets around the world. The data are available from 1980 onward and can be collected at hourly, daily, or monthly resolution. The attributes present in the dataset are temperature, pressure, relative humidity, wind speed, rainfall, wind direction, snow depth, snowfall, and short-wave irradiation; the attributes selected for this research are temperature, humidity, pressure, rainfall, and wind speed, as shown in Table 1.
The highest temperature ever recorded in Chennai was 45°C on May 31, 2003. The average high temperature was 37.1°C, and the average low temperature recorded was 21.2°C. The lowest temperature was 18.3°C, recorded on December 29, 2008, but the lowest temperature ever recorded in Chennai was 13.9°C on December 11, 1895. The highest rainfall recorded in Chennai was 108 mm (4.25 inch) on September 19, 2019. The lowest rainfall recorded was 22.4 mm in October 2016, before that 11.2 mm in October 1918.

4.2 | Performance metrics

The performance of the model is evaluated by the mean absolute error (MAE) and root mean square error (RMSE), and for validation, the proposed model is evaluated with accuracy, sensitivity, and specificity. The achieved results are compared with other machine learning methods: Naïve Bayes (NB), C4.5, GA, ANN, SVM, LSTM, and RBF techniques.17,18

MAE = (1/n) Σ_{i=1..n} |f_i − o_i|   (9)

RMSE = sqrt((1/n) Σ_{i=1..n} (f_i − o_i)^2)   (10)

where n is the number of samples, f_i are the forecasts, and o_i are the observed values.
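Equations (9) and (10) translate directly into Python (illustrative helpers; the function names are our own):

```python
import math

def mae(forecasts, observed):
    """Equation (9): mean of the absolute forecast errors."""
    n = len(forecasts)
    return sum(abs(f - o) for f, o in zip(forecasts, observed)) / n

def rmse(forecasts, observed):
    """Equation (10): square root of the mean squared forecast error."""
    n = len(forecasts)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observed)) / n)
```

Because the errors are squared in Equation (10), a single large miss raises the RMSE far more than the MAE, which is the property discussed below.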

TABLE 1 Dataset description


Attributes Description
Temperature (K) Temperature observed at 2 m above ground
Relative Humidity (%) Humidity observed at 2 m above ground
Pressure (hPa) Pressure at ground level
Wind Speed (m/s) Wind speed observed at 10 m above ground
Rainfall (mm) Rain depth in mm

The MAE estimates the average magnitude of the errors in a set of forecasts without taking their direction into consideration; it measures accuracy for continuous variables. The MAE is the average of the absolute differences between each forecast and the corresponding observation over the verification sample. The MAE is a linear score, which means that all the individual differences are weighted equally in the average.19–23
The RMSE is a quadratic scoring rule that also measures the average magnitude of the error. The difference between each forecast and the corresponding observed value is squared, the squared differences are averaged over the sample, and finally the square root of the average is taken. Since the errors are squared before they are averaged, the RMSE gives relatively high weight to large errors; this means the RMSE is most useful when large errors are particularly undesirable.24,25
The MAE and RMSE can be used together over a series of forecasts to identify the variability in the errors, as shown in Table 2. The RMSE will always be greater than or equal to the MAE; the greater the difference between them, the greater the variance of the individual errors in the sample.
The weather forecasts predicted by the proposed model are shown in Figures 3 to 7. The predicted values are the forecasted results for July 1, 2020, to July 31, 2020. After the prediction, the predicted results are compared with the actual results for validation. The proposed model predicted more or less equally well in every major prediction. The temperature prediction is shown in Figure 3, where the predicted and actual results are very close. In Figures 4–6, the predictions of pressure, humidity, and wind speed are plotted. The rainfall prediction is plotted in Figure 7, where some of the results nearly match the actual results and some do not. The proposed model is evaluated with accuracy, sensitivity, and specificity.
Accuracy = (number of correct evaluations)/(number of all evaluations):

Accuracy = (TN + TP) / (TN + TP + FN + FP)   (11)

TABLE 2 MAE and RMSE analysis


Algorithm MAE RMSE
NB 7.62 8.32
C4.5 7.36 7.40
GA 7.20 7.25
SVM 6.25 6.33
ANN 4.31 5.56
RBF 3.02 3.70
LSTM 2.34 3.64
C5.0-KNN 1.99 2.81

FIGURE 3 Actual vs. predicted temperature



FIGURE 4 Actual vs. predicted pressure

FIGURE 5 Actual vs. predicted humidity

FIGURE 6 Actual vs. predicted wind speed

Sensitivity = (number of true positive evaluations)/(number of all positive evaluations):

Sensitivity = TP / (TP + FN)   (12)

FIGURE 7 Actual vs. predicted rainfall

TABLE 3 Performance analysis of prediction

Algorithm Accuracy Sensitivity Specificity
NB 84.62 88.32 83.23
C4.5 85.36 90.13 84.44
GA 86.27 90.25 84.15
SVM 83.25 87.33 82.65
ANN 84.31 85.56 82.42
RBF 85.02 88.70 82.89
LSTM 88.34 90.64 85.34
C5.0-KNN 90.18 92.81 87.11

FIGURE 8 Graphical plot of performance analysis for predicted results

Specificity = (number of true negative evaluations)/(number of all negative evaluations):

Specificity = TN / (TN + FP)   (13)
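Equations (11)–(13) can be computed together from confusion-matrix counts (an illustrative helper, not the authors' code):

```python
def classification_metrics(tp, tn, fp, fn):
    """Equations (11)-(13): accuracy, sensitivity, and specificity
    from true/false positive and negative counts."""
    accuracy = (tn + tp) / (tn + tp + fn + fp)   # Equation (11)
    sensitivity = tp / (tp + fn)                 # Equation (12)
    specificity = tn / (tn + fp)                 # Equation (13)
    return accuracy, sensitivity, specificity
```

Note that sensitivity is computed over actual positives only and specificity over actual negatives only, which is why the two can differ substantially even at the same accuracy.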

The performance of this weather forecasting model is evaluated with accuracy, sensitivity, and specificity, where
these parameters are computed and tabulated in Table 3. Compared with the other machine learning algorithms, the
MURUGAN BHAGAVATHI ET AL. 13 of 14

obtained results are efficiently better in all the parameters. The proposed model achieved 1.8% to 6.9% more accuracy,
2.1% to 7.2% more sensitivity, and 1.7% to 4.6% more specificity than the other compared techniques as the related
graph is plotted in Figure 8.
The proposed weather prediction model was designed to predict weather conditions using the C5.0 classifier combined with a K-means clustering approach. The prediction proceeds as follows. First, the dataset is analyzed, and the attributes temperature, rainfall, pressure, humidity, and wind speed are selected. The selected attributes are then used to train and test the proposed model, with 75% of the data used for training and 25% for testing. Finally, based on this training and testing, the classifier predicts the weather conditions.
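The 75/25 split described above can be sketched as follows. The record values and the fixed seed are illustrative assumptions, not the paper's actual MERRA data, and the shuffle-then-cut scheme is one common way to realize such a split.

```python
import random

def split_train_test(records, train_fraction=0.75, seed=42):
    """Shuffle the records and split them 75/25 into training and test sets,
    mirroring the split ratio used by the forecasting model."""
    shuffled = records[:]                       # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)       # fixed seed for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical daily records: (temperature, rainfall, pressure, humidity, wind speed)
records = [(30.1, 0.0, 1012, 68, 3.2), (28.4, 2.1, 1009, 81, 4.0),
           (31.0, 0.0, 1013, 60, 2.8), (27.9, 5.4, 1006, 88, 5.1),
           (29.5, 0.3, 1010, 74, 3.6), (30.6, 0.0, 1014, 62, 2.5),
           (28.8, 1.2, 1008, 79, 4.4), (29.9, 0.0, 1011, 70, 3.0)]

train, test = split_train_test(records)
print(len(train), len(test))  # 6 training records, 2 test records
```

Shuffling before cutting matters for meteorological data, since consecutive days are correlated; a fixed seed keeps the split reproducible across runs.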

5 | CONCLUSION

In this research, a machine learning-based weather forecasting model was proposed. The C5.0 algorithm with K-means clustering was used for weather forecasting: C5.0 is one of the best decision tree classifiers and is well suited for forecasting and prediction, while the K-means algorithm groups similar data together. In this model, K-means clustering was first applied to partition the dataset into K clusters. The meteorological dataset collected from the MERRA database was used for training and testing. The performance of the model was evaluated with MAE and RMSE, and for validation, the proposed model was evaluated with accuracy, sensitivity, and specificity. Compared with the other machine learning algorithms, the proposed model performed better on all parameters, achieving 1.8% to 6.9% higher accuracy, 2.1% to 7.2% higher sensitivity, and 1.7% to 4.6% higher specificity than the compared techniques. The proposed approach is suitable for short-range forecasting and can be extended for long-range forecasting in real-time applications. In the future, a deep learning-based classifier can be used to improve the accuracy and efficiency of this model, and more attributes can be added to obtain more accurate predictions.

ORCID
Sudhan Murugan Bhagavathi https://orcid.org/0000-0003-4007-6774
Anitha Thavasimuthu https://orcid.org/0000-0002-5341-1247
Aruna Murugesan https://orcid.org/0000-0002-7187-7964
Charlyn Pushpa Latha George Rajendran https://orcid.org/0000-0002-8869-8487
Vijay A https://orcid.org/0000-0003-4083-8927
Laxmi Raja https://orcid.org/0000-0001-6040-8794
Rajendran Thavasimuthu https://orcid.org/0000-0003-0759-1846


How to cite this article: Murugan Bhagavathi S, Thavasimuthu A, Murugesan A, et al. Weather forecasting
and prediction using hybrid C5.0 machine learning algorithm. Int J Commun Syst. 2021;e4805. https://doi.org/10.
1002/dac.4805
