Paresh Chandra Deka, Amit Prakash Patil, P. Yeswanth Kumar & Sujay
Raghavendra Naganna
To cite this article: Paresh Chandra Deka, Amit Prakash Patil, P. Yeswanth Kumar &
Sujay Raghavendra Naganna (2017): Estimation of dew point temperature using SVM and
ELM for humid and semi-arid regions of India, ISH Journal of Hydraulic Engineering, DOI:
10.1080/09715010.2017.1408037
Download by: [The National Library - Kolkata] Date: 27 November 2017, At: 21:04
ISH Journal of Hydraulic Engineering, 2017
https://doi.org/10.1080/09715010.2017.1408037
Estimation of dew point temperature using SVM and ELM for humid and semi-arid
regions of India
Paresh Chandra Dekaᵃ, Amit Prakash Patilᵇ, P. Yeswanth Kumarᵃ and Sujay Raghavendra Nagannaᵃ
ᵃ Department of Applied Mechanics and Hydraulics, National Institute of Technology Karnataka, Surathkal, India; ᵇ Department of Civil Engineering, Dr. Daulatrao Aher College of Engineering, Karad, India
[...] weather stations in the earlier studies. Also, studies estimating DPT in humid and semi-arid regions are very scarce in the literature. Hence, in this study, ELM and SVM models are developed to estimate the daily DPT of the 3rd hour and 12th hour UTC (Coordinated Universal Time) using only three input variables, viz. wet bulb temperature, relative humidity and vapor pressure. Daily data of these input variables from humid (Bajpe) and semi-arid (Hyderabad) regions of India are used in the present study. The performance of the ELM models is compared with that of the SVM models using statistical indices, namely root mean square error (RMSE), mean absolute error (MAE) and Nash–Sutcliffe efficiency (NSE).

2. Study area and data descriptions

In the present study, a model was developed for estimating DPT of two diverse climatic regions. The weather data of Bajpe station 17.44° N, 78.47° E which come under humid tropical [...]

[...] 2006 to October 2006 were reserved for testing of the models. Similarly, the data sample of the Hyderabad station covers the period January 2007 to December 2009, with a total of 1047 daily observations, of which twenty-six months of data (January 2007 to February 2009) were used for model training and the remaining ten months of data (March 2009 to December 2009) were reserved for testing of the models. After finding the correlation of every influencing physical parameter (such as dry bulb temperature, wet bulb temperature, vapor pressure, relative humidity, average wind speed, and sunshine hours) with DPT, only the three parameters mentioned above were considered as inputs for estimating the DPT. Table 1 presents the statistical properties of the data-set.

3. Methodology and model development

3.1. Extreme learning machine
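As a rough illustration of the extreme learning machine idea with RBF hidden nodes, the sketch below draws the hidden-node centers and widths at random and then solves for the output weights in closed form, in the spirit of Huang and Siew (2004). It is a sketch under stated assumptions, not the authors' implementation: the function names (`rbf_hidden`, `solve`, `train_elm`, `predict_elm`), the choice of drawing centers from the training samples, the width range, the small ridge term used in place of a Moore–Penrose pseudoinverse, and the synthetic three-input data are all illustrative. Inputs are assumed to be scaled to [0, 1]. The list `hidden_sizes` mirrors the 10 to 200 neurons in steps of ten reported in Section 3.3.

```python
import math
import random


def rbf_hidden(X, centers, widths):
    """Hidden-layer matrix H: Gaussian RBF response of every node to every sample."""
    return [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (w * w))
         for c, w in zip(centers, widths)]
        for x in X
    ]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def train_elm(X, y, n_hidden, seed=0, ridge=1e-6):
    """Random RBF hidden layer; output weights from regularized least squares."""
    rng = random.Random(seed)
    # Assumption: centers are random training samples, widths are uniform random.
    centers = [rng.choice(X) for _ in range(n_hidden)]
    widths = [rng.uniform(0.5, 2.0) for _ in range(n_hidden)]
    H = rbf_hidden(X, centers, widths)
    # Normal equations (H^T H + ridge * I) beta = H^T y; the ridge term is an
    # assumption added for numerical stability.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    Hty = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    return centers, widths, solve(HtH, Hty)


def predict_elm(model, X):
    """Single output neuron: weighted sum of the hidden-node responses."""
    centers, widths, beta = model
    return [sum(b * h for b, h in zip(beta, row))
            for row in rbf_hidden(X, centers, widths)]


# Candidate hidden-layer sizes: 10, 20, ..., 200.
hidden_sizes = list(range(10, 201, 10))

# Synthetic stand-in for three scaled inputs (hypothetical data, not the paper's).
rng = random.Random(1)
X_demo = [[rng.random() for _ in range(3)] for _ in range(60)]
y_demo = [x[0] + 0.5 * x[1] for x in X_demo]
model = train_elm(X_demo, y_demo, n_hidden=hidden_sizes[0])
pred_demo = predict_elm(model, X_demo)
```

In practice one would repeat `train_elm` for each entry of `hidden_sizes` and keep the size with the lowest error on held-out data; the demo fits only the smallest size to keep the run short.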
Table 2. Correlation coefficient of dew point temperature with model input parameters.

| Correlation coefficient | Bajpe station, 3rd hour UTC DPT (°C) | Bajpe station, 12th hour UTC DPT (°C) | Hyderabad station, 3rd hour UTC DPT (°C) | Hyderabad station, 12th hour UTC DPT (°C) |
|---|---|---|---|---|
| WBT (°C) | 0.97 | 0.90 | 0.90 | 0.87 |
| RH (%) | 0.75 | 0.80 | 0.69 | 0.83 |
| VP (hPa) | 0.99 | 0.99 | 0.98 | 0.99 |
| DPT (°C) | 1 | 1 | 1 | 1 |

[...] hydrological forecasting is provided by Sujay Raghavendra and Deka (2014).

3.3. Model development

The input–output combination devised for formulating the ELM and SVM models is based on the weather variables having a good correlation with DPT (refer to Table 2). The DPTs of both the 3rd hour and 12th hour UTC are estimated by the input–output structure given below.

Dew point temperature → f [Wet Bulb Temperature (WBT) + Relative Humidity (RH) + Vapor Pressure (VP)]

A three-layered architecture was adopted for ELM model development. The first layer (input layer) used different meteorological parameters (refer to Table 2) as inputs. The output layer had one neuron representing the estimated DPT. For the hidden layer, a maximum of 200 neurons was tested for each model. To determine the optimum number of neurons in the hidden layer, initially 10 neurons were tested and subsequently the number of neurons was gradually increased to 200 in steps of ten. ELM can be extended to single-hidden-layer feedforward neural networks with radial basis function (RBF) kernels. The ELM algorithm with RBF kernels can complete learning at extremely fast speed and produce generalized performance. Huang and Siew (2004) showed that instead of tuning the centers and impact widths of RBF kernels, one may simply choose values for these parameters at random and analytically calculate the output weights of the RBF network. The radial basis activation function was employed for all the ELM models tested in this study.

The SVM model was developed using LIBSVM software version 3.21 (Chang and Lin 2011) with the RBF kernel function. The accuracy of an RBF kernel-based SVR model depends principally on the selection of the model parameters: the regularization parameter (C), the kernel parameter gamma (γ) and the epsilon parameter (ε). The V-fold cross-validation parameter selection method was used to search for the optimal parameters of SVM using the error computed from the training data. The optimal values of the hyper-parameters obtained during training of the RBF kernel-based SVM models are given in Table 3.

3.4. Performance evaluation

The statistical indices assess the level of confidence that one can have in the estimates of the model. The performance of the ELM and SVM models was evaluated using the following statistical indices:

Root Mean Square Error (RMSE),

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - y_i)^2}{n}}$$

Mean Absolute Error (MAE),

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n}$$

Nash–Sutcliffe Efficiency (NSE),
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (y_i - x_i)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where $x_i$ is the true value; $y_i$ is the model-estimated value; $\bar{x}$ is the mean of the true values; $\bar{y}$ is the mean of the model-estimated values; and $n$ is the number of data points.

Table 3. Optimal values of RBF kernel-based SVM hyper-parameters.

| Station | SVM model | C | γ | ε | No. of support vectors |
|---|---|---|---|---|---|
| Bajpe station | 3rd hour model | 22 | 8 | 0.01 | 235 |
| Bajpe station | 12th hour model | 28 | 7 | 0.01 | 254 |
| Hyderabad station | 3rd hour model | 37 | 12 | 0.01 | 262 |
| Hyderabad station | 12th hour model | 41 | 14 | 0.01 | 269 |

4. Results and discussion

This part of the paper focuses on evaluating the performance of the proposed SVM and ELM models for estimating the daily DPT of the 3rd and 12th hour UTC, respectively. Basically, the ability of any model or technique to provide a precise estimation depends on the appropriate selection of input parameters. For this study, three meteorological variables, namely WBT, RH and VP, were selected as inputs considering the correlation between these inputs and the DPT (refer to Table 2). The results of the SVM and ELM models of the humid Bajpe weather station for the 3rd hour and 12th hour UTC are presented in Table 4. The results presented here are with respect to the performance of the models in the testing phase. The 3rd hour SVM model performed poorly, with RMSE, MAE and NSE of 0.48 °C, 0.21 °C and 0.52, respectively. However, the performance of the 3rd hour ELM model is satisfactory, with RMSE, MAE and NSE of 0.38 °C, 0.04 °C and 0.69, respectively. The performance of both SVM and ELM models is better while estimating the 12th hour DPT as compared to the 3rd hour DPT. Also, it can be observed that the performance of the ELM models is better than that of the SVM models for estimating both 3rd hour and 12th hour DPT. The 3rd hour ELM model used forty neurons in the hidden layer and presented a better performance than the SVM model. However, there is a drastic variation in the performance of SVM and ELM models while estimating 12th hour DPT. It can be observed that the 12th hour ELM model with ninety neurons in the hidden layer delivered the best performance amongst all the models tested at the Bajpe station.

Table 4. Testing performance of the SVM and ELM models at the Bajpe station.

| Model | Model parameters (SVM: C, γ, ε; ELM: input-hidden-output neurons) | RMSE (°C) | MAE (°C) | NSE |
|---|---|---|---|---|
| 3rd hr. SVM | 28, 8, 0.01 | 0.48 | 0.21 | 0.52 |
| 3rd hr. ELM | 3-40-1 | 0.38 | 0.04 | 0.69 |
| 12th hr. SVM | 28, 7, 0.01 | 0.52 | 0.28 | 0.62 |
| 12th hr. ELM | 3-90-1 | 0.10 | 0.02 | 0.90 |

To graphically analyze the capability of the developed ELM and SVM models for Bajpe station, the predicted values are plotted against the measured data. The scatter plots of predicted daily DPTs by ELM and SVM against the measured data for the testing phase are illustrated in Figure 2. There exists a favorable correlation between the measured and predicted values by the ELM models for both stations; the dispersion degree of the data points in the scatter plots of the ELM model is substantially lower than that of the plots of SVM models. This clearly indicates the high potential of ELM to predict daily DPT. It can be observed from the scatter plots that the SVM model underestimated the DPT values.

A similar analysis was also carried out for the semiarid Hyderabad weather station to evaluate the performance of the proposed models. The results of the models tested at Hyderabad station are presented in Table 5. The results at the Hyderabad station are also in line with the results at the Bajpe station. The minimum RMSE and maximum NSE were obtained by the ELM models for both 3rd hour and 12th hour DPT estimations. The 12th hour ELM and SVM models performed better than the 3rd hour UTC models. Further, based on all indices it can be inferred that the ELM models have performed better than the SVM models. This may be because the ELM models require less parameter optimization. The 12th hour ELM model, with an RMSE of 0.59 °C, is the best model for estimating DPT at the Hyderabad station. This model used seventy nodes in the hidden layer.

Table 5. Testing performance of the SVM and ELM models at the Hyderabad station.

| Model | Model parameters (SVM: C, γ, ε; ELM: input-hidden-output neurons) | RMSE (°C) | MAE (°C) | NSE |
|---|---|---|---|---|
| 3rd hr. SVM | 28, 8, 0.01 | 2.36 | 1.04 | 0.63 |
| 3rd hr. ELM | 3-50-1 | 0.63 | 0.32 | 0.95 |
| 12th hr. SVM | 28, 7, 0.01 | 1.98 | 1.05 | 0.82 |
| 12th hr. ELM | 3-70-1 | 0.59 | 0.14 | 0.97 |
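The RMSE, MAE and NSE values reported in the performance tables follow the definitions of Section 3.4. A minimal sketch of how these three indices could be computed is given below; the function names and the four observed/estimated values are hypothetical and serve only to illustrate the formulas, not to reproduce the paper's results.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between true values x_i and estimates y_i."""
    n = len(observed)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error between true values x_i and estimates y_i."""
    n = len(observed)
    return sum(abs(y - x) for x, y in zip(observed, estimated)) / n


def nse(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of the truth."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((y - x) ** 2 for x, y in zip(observed, estimated))
    denominator = sum((x - mean_obs) ** 2 for x in observed)
    return 1.0 - numerator / denominator


# Hypothetical DPT values in deg C, for illustration only.
observed = [20.0, 22.0, 24.0, 26.0]
estimated = [21.0, 22.0, 23.0, 26.0]

print(rmse(observed, estimated))  # about 0.707
print(mae(observed, estimated))   # 0.5
print(nse(observed, estimated))   # about 0.9
```

An NSE of 1 corresponds to a perfect match, while an NSE of 0 means the model is no better than always predicting the mean of the observations.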
Scatter plots for the Hyderabad station also depict the superiority of ELM models in estimating the DPT for both 3rd hour UTC and 12th hour UTC, as evident from Figure 3. It can be further observed that both the SVM models underestimated the smaller values and overestimated the larger values of DPT. In contrast, the dispersion degree of the scatter plots of the ELM models is lower than that of the plots of the SVM models. This demonstrates the higher degree of linear relationship between the estimated values of ELM models and the measured values.

Further, box plots were also used to analyze the spread of the data points estimated by the models. Figures 4 and 5 present the box plots for the models tested at the Bajpe and Hyderabad stations. The box plots clearly present the superiority of the ELM models in estimating the DPT of both Bajpe and Hyderabad stations. For both Bajpe and Hyderabad stations, it can be observed that the SVM model has underestimated the smaller values of DPT. However, similar patterns of data spread can be seen between the observed data-set and the ELM model at both the stations. The analysis clearly demonstrates that the ELM model has high capabilities to estimate daily DPT in both humid and semiarid regions.

5. Conclusions

Accessibility of reliable and accurate estimates of DPT is of immense importance in many fields like hydrology, climatology and agriculture. Artificial intelligence techniques like ELM and SVM possess particular features that are instrumental in providing reliable and accurate DPT estimates. The present study investigated the applicability of ELM and SVM models to estimate the DPT of humid and semiarid regions of India and compared the two approaches with each other. The performance of the models was evaluated using different statistical parameters, scatter plots and box plots.

From the analysis of the results, it is evident that both the SVM and ELM perform well in the estimation of DPT. Based on the performance indices, the ELM models were found to perform better than the SVM models. For instance, the ELM models at Bajpe and Hyderabad stations had NSE = 0.90 and 0.97, respectively. Further, it was observed that these models were more efficient in modeling 12th hour DPT at both humid and semi-arid regions. The forecasting efficiency of SVM depends on the optimal choice of C, ε and the kernel parameter (γ), whereas the performance of the ELM is governed by the optimum number of neurons in the hidden layer. These models are particularly effective for estimating DPT records that are missing due to faulty equipment or routine maintenance schedules. The suggested strategy can be adopted to model other weather parameters of similar statistical behavior.

Acknowledgements

The authors would like to thank the Indian Meteorological Department for providing the necessary data required for research and the Department of Applied Mechanics & Hydraulics, National Institute of Technology Karnataka for the necessary infrastructural support.

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

Sujay Raghavendra Naganna http://orcid.org/0000-0002-0482-1936

References

Amirmojahedi, M., Mohammadi, K., Shamshirband, S., Seyed Danesh, A., Mostafaeipour, A., and Kamsin, A. (2016). “A hybrid computational intelligence method for predicting dew point temperature.” Environ. Earth Sci., 75(5), 415. doi:10.1007/s12665-015-5135-7.
Chang, C.-C., and Lin, C.-J. (2011). “LIBSVM: a library for support vector machines.” ACM Trans. Intell. Syst. Technol., 2, 27:1–27:27. doi:10.1145/1961189.1961199.
Chung, C.H., Chiang, Y.M., and Chang, F.J. (2012). “A spatial neural fuzzy network for estimating pan evaporation at ungauged sites.” Hydrol. Earth Syst. Sci., 16, 255–266. doi:10.5194/hess-16-255-2012.
Cortes, C., and Vapnik, V. (1995). “Support-vector networks.” Mach. Learn., 20(3), 273–297. doi:10.1007/BF00994018.
Cristianini, N., and Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods, Cambridge University Press, Cambridge.
Feng, Y., Cui, N., Zhao, L., Hu, X., and Gong, D. (2016). “Comparison of ELM, GANN, WNN and empirical models for estimating reference evapotranspiration in humid region of Southwest China.” J. Hydrol., 536, 376–383. doi:10.1016/j.jhydrol.2016.02.053.
Hill, A.J., Dawson, T.E., Shelef, O., and Rachmilevitch, S. (2015). “The role of dew in Negev Desert plants.” Oecologia, 178(2), 317–327. doi:10.1007/s00442-015-3287-5.
Huang, G.-B., and Siew, C.-K. (2004). “Extreme learning machine: RBF network case.” Control, automation, robotics and vision conference, 2004. ICARCV, Vol. 2, 1029–1036. IEEE. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.217.2036&rep=rep1&type=pdf>
Huang, G.-B., Zhu, Q.-Y., and Siew, C.-K. (2006). “Extreme learning machine: Theory and applications.” Neurocomputing, 70(1–3), 489–501. doi:10.1016/j.neucom.2005.12.126.
Kim, S., Singh, V.P., Lee, C.-J., and Seo, Y. (2015). “Modeling the physical dynamics of daily dew point temperature using soft computing techniques.” KSCE J. Civ. Eng., 19(6), 1930–1940. doi:10.1007/s12205-014-1197-4.
Kisi, O., Kim, S., and Shiri, J. (2013). “Estimation of dew point temperature using neuro-fuzzy and neural network techniques.” Theor. Appl. Climatol., 114(3–4), 365–373. doi:10.1007/s00704-013-0845-9.
Lawrence, M.G. (2005). “The relationship between relative humidity and the dewpoint temperature in moist air: A simple conversion and applications.” Bull. Am. Meteorol. Soc., 86(2), 225–233. doi:10.1175/BAMS-86-2-225.
Mohammadi, K., Shamshirband, S., Motamedi, S., Petković, D., Hashim, R., and Gocic, M. (2015). “Extreme learning machine based prediction of daily dew point temperature.” Comput. Electron. Agric., 117, 214–225. doi:10.1016/j.compag.2015.08.008.
Oudin, L., Moulin, L., Bendjoudi, H., and Ribstein, P. (2010). “Estimating potential evapotranspiration without continuous daily data: possible errors and impact on water balance simulations.” Hydrol. Sci. J., 55(2), 209–222. doi:10.1080/02626660903546118.
Patil, A.P., and Deka, P.C. (2016). “An extreme learning machine approach for modeling evapotranspiration using extrinsic inputs.” Comput. Electron. Agric., 121, 385–392. doi:10.1016/j.compag.2016.01.016.
Roberts, J.S. (2003). “Dew point temperature.” Encyclopedia of agricultural, food and biological engineering, D.R. Heldman, ed., CRC Press, Boca Raton, FL, 186–191. http://doi.org/10.1081/E-EAFE120007052
Robinson, P.J. (2000). “Temporal trends in United States dew point temperatures.” Int. J. Climatol., 20(9), 985–1002. doi:10.1002/1097-0088(200007)20:9<985::AID-JOC513>3.0.CO;2-W.
Shank, D.B., McClendon, R.W., Paz, J., and Hoogenboom, G. (2008). “Ensemble artificial neural networks for prediction of dew point temperature.” Appl. Artif. Intell., 22(6), 523–542. doi:10.1080/08839510802226785.
Shiri, J., Kim, S., and Kisi, O. (2014). “Estimation of daily dew point temperature using genetic programming and neural networks approaches.” Hydrol. Res., 45(2), 165. doi:10.2166/nh.2013.229.
Sujay Raghavendra, N., and Deka, P.C. (2014). “Support vector machine applications in the field of hydrology: a review.” Appl. Soft. Comput., 19, 372–386. doi:10.1016/j.asoc.2014.02.002.
Vapnik, V.N. (1995). The nature of statistical learning theory. Springer, New York (Vol. 8). doi:10.1109/TNN.1997.641482
Vapnik, V.N. (1999). “An overview of statistical learning theory.” IEEE. Trans. Neural. Netw., 10(5), 988–999. doi:10.1109/72.788640.
Zounemat-Kermani, M. (2012). “Hourly predictive Levenberg–Marquardt ANN and multi linear regression models for predicting of dew point temperature.” Meteorol. Atmospheric. Phys., 117(3–4), 181–192. doi:10.1007/s00703-012-0192-x.