
Ocean Dynamics (2017) 67:357–368

DOI 10.1007/s10236-017-1032-9

Prediction of daily sea surface temperature using efficient neural networks
Kalpesh Patil 1 & Makaranad Chintamani Deo 1

Received: 16 August 2016 / Accepted: 17 January 2017 / Published online: 2 February 2017
© Springer-Verlag Berlin Heidelberg 2017

Abstract Short-term prediction of sea surface temperature (SST) is commonly achieved through numerical models. Numerical approaches are more suitable for use over a large spatial domain than at a specific site because of the difficulties involved in resolving various physical sub-processes at local levels. Therefore, for a given location, a data-driven approach such as neural networks may provide a better alternative. The application of neural networks, however, needs extensive experimentation in their architecture, training methods, and formation of appropriate input–output pairs. A network trained in this manner can provide more attractive results if the advances in network architecture are additionally considered. With this in mind, we propose the use of wavelet neural networks (WNNs) for prediction of daily SST values. The prediction of daily SST values was carried out using WNN over 5 days into the future at six different locations in the Indian Ocean. First, the accuracy of site-specific SST values predicted by a numerical model, ROMS, was assessed against the in situ records. The result pointed out the necessity for alternative approaches. Next, traditional networks were tried and, after their poor performance was noticed, WNN was used. This approach produced attractive forecasts when judged through various error statistics. When all locations were viewed together, the mean absolute error was within 0.18 to 0.32 °C for a 5-day-ahead forecast. The WNN approach was thus found to add value to the numerical method of SST prediction when location-specific information is desired.

Keywords Sea surface temperature · SST prediction · Neural networks · Wavelet networks

Responsible Editor: Pierre F.J. Lermusiaux
* Makaranad Chintamani Deo, mcdeo@civil.iitb.ac.in
1 Indian Institute of Technology Bombay, Mumbai 400 076, India

1 Introduction

The knowledge of potential variations in the daily values of sea surface temperature (SST) at a given geographical location is applied for planning specific activities in coastal areas, such as fishing expeditions and recreational events, and for studies monitoring the growth of marine microorganisms. However, the prediction of SST is fraught with a high amount of complexity, with large variations arising out of irregularities in heat radiation and flux, and the uncertain nature of the wind blowing over the sea surface.

The SST prediction can be made with the help of physics-based numerical methods or, alternatively, with data-driven methods. The numerical approach is well suited for predictions over a large spatial domain, while data-driven approaches can be tailor-made for specific sites. There are alternative numerical and data-based procedures currently in use for this purpose. The latter methods range from common statistical and stochastic ones to the latest machine learning and artificial intelligence tools. Examples of the traditional statistical techniques applied until now for this purpose include the Markov model (Xue and Leetmaa 2000), pattern searching (Agarwal et al. 2001), and regression models (Laepple and Jewson 2007). Among modern data-driven methods, artificial neural networks (ANNs) appear to be the most popular because of their flexibility in fitting to random data and their relatively simple development. Nonetheless, approaches such as genetic algorithms and support vector machine (SVM) have also been attempted (Martinez and Hsieh 2009; Wu et al. 2006).

In the past, many researchers have used ANNs to successfully predict the SST. Tangang et al. (1997) made seasonal

SST predictions over a specific region in the tropical Pacific by devising an ANN with the input of empirical orthogonal functions of wind-stress and SST anomalies. Pozzi et al. (2000) demonstrated the effectiveness of ANN as an alternative to the conventional approaches in paleoceanography. Tripathi et al. (2006) explored the monthly averaged SST predictions for a region in the Indian Ocean. Tang et al. (2000) made a comparison between canonical correlation analysis, ANN, and linear regression for the prediction of equatorial SST in the Pacific. Using meteorological variables as inputs, Garcia-Gorriz and Garcia-Sanchez (2007) made SST predictions in the western Mediterranean Sea. Gupta and Malmgren (2009) assessed the prediction capabilities of several methods based on transfer functions, regression, and ANN. In a comparative study across SVM, regression, and ANN, Martinez and Hsieh (2009) found that ANN was able to provide better predictions than SVM. Lee et al. (2011), using ANN, identified the sources of errors in satellite-based SST estimates. Mahongo and Deo (2013) predicted monthly and seasonal SST anomalies near an eastern African shore by using autoregressive integrated moving average (ARIMA) and ANN methods; they found that a non-linear autoregressive network performed better than ARIMA in the prediction of SST, and in capturing the El Niño Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) events.

It can be seen that the data-driven methods used in several earlier works consisted of time series forecasting, in which a historical sequence of observations was used to recognize a hidden pattern across them, and the SST forecast was made by sliding such pattern forward along the time axis. Such temporal predictions, according to Takens (1981), were seen as equivalent to the causal mapping of physics-based numerical methods and will be referred to later.

A review of the past publications mentioned above shows that the ANN is an attractive technique to predict SST, and there is an opportunity to make it more useful by taking advantage of rapid advances in the ANN technology. The present study is oriented toward this aim, and it attempts to make daily predictions of SST over five time steps into the future with the help of a hybrid "wavelet transform–ANN" technique. This study goes beyond a previous one (Patil et al. 2016), by comparing the predictions with in situ instrumental records and testing the results at several locations in the Indian Ocean. It is considered that such revalidation with a more rigorous analysis could make the suggested approach more acceptable to the user community. However, it is emphasized that the suggested method is site specific and it is targeted at daily predictions of SST.

2 Locations of study and data

The data required for predicting SST with the help of data-driven approaches can be derived from the product runs of numerical models of atmosphere and ocean, over geographical grids of a certain size. While this approach can yield SST data for a long period and over a close grid spacing, the resulting SST is nonetheless synthetic and, therefore, may involve a certain amount of inherent inaccuracy. For accurate predictions, remotely sensed and in situ measurements are indispensable. The remotely sensed or satellite-based techniques consist of sensing the ocean radiation of certain wavelengths of an electromagnetic spectrum and relating it to SST. The moderate resolution imaging spectroradiometer or microwave radiometry, based on an imaging radiometer, is another technique for this purpose. Regarding in situ data collection, a variety of instruments is in use, such as thermometers and thermistors mounted on drifting or moored buoys, Argo type of buoys, and moorings.

In order to improve the accuracy of the numerical products, actual SST measurements can be assimilated by a variety of methods such as optimal interpolation or Kalman Filter (Reynolds et al. 2007; Wan and Van Der Merwe 2000). Similarly, site-specific SST data collected by alternative methods can be synthesized and combined together to provide more accurate input in modeling. However, this may involve some gaps in data filled up by the outcome of numerical model runs, constituting the so-called reanalysis datasets.

In this study, we first examine the accuracy of the site-specific information yielded by a numerical model by comparing it with in situ records. Thereafter, we attempt to predict daily SST values more accurately, over a few time steps into the future, by means of neural networks trained with the help of reanalysis data. Therefore, the datasets used in this study are of three different types: reanalysis, numerical, and in situ SST observations.

2.1 Reanalysis data

The reanalysis data used in this work were extracted from the daily SST product of the National Oceanic and Atmospheric Administration (NOAA), over grids of size 0.25° × 0.25° each, available on the web portal http://www.esrl.noaa.gov/psd/. This data product, called "High Resolution OI v2," had resulted from the optimum interpolation of high-resolution information from the Advanced Very High Resolution Radiometer (AVHRR) and the Advanced Microwave Scanning Radiometer (AMSR). The radiometer information has been adjusted for the satellite bias using in situ data from ships and buoys. For further information on the blending of AVHRR and AMSR in the SST product, reference is made to Reynolds et al. (2007). The considered SST data ranged from 1981 to 2015.

2.2 In situ data

The in situ measurements of daily SST came from the instrumented buoys deployed under the "Research Moored Array

for African-Asian-Australian Monsoon Analysis and Prediction (RAMA)" project. The data were recorded by special moorings with the help of sensors at a nominal depth of 1 m from the sea surface. More details of the sensors and the RAMA buoys can be seen in McPhaden et al. (2009). The time length of this in situ data was around 13.5 years, ranging from January 2002 to June 2015.

2.3 Numerical data

The numerical predictions of daily SST values used herein belong to the Regional Ocean Modelling System (ROMS). ROMS is a primitive-equation ocean model used for a wide range of oceanographic applications (Moore et al. 2004; Wilkin et al. 2005). The Indian National Centre for Ocean Information Services (INCOIS) made these data available on request (Francis et al. 2013). This information corresponds to a period of 29 months, from January 2013 to May 2015.

2.4 Study locations

Depending on the simultaneous availability of all the three datasets mentioned above, six sites located in the Indian Ocean were selected, code-named as L4, L11, L14, L16, L19, and L24. Arrows in Fig. 1 show these sites. This figure also indicates the sites numbered from 1 to 27 where instrumented floating buoys were deployed under the RAMA project. The exact latitudes and longitudes of these sites are given in Table 1.

Table 1 Coordinates of the study locations
Location          L4     L11      L14     L16     L19    L24
Latitude (°)      0      1.5 S    12 S    15 N    4 N    5 S
Longitude (°E)    67     80.5     80.5    90      90     95

It was necessary to interpolate the reanalysis data to the exact study locations, which was carried out with the help of a bilinear interpolation. The numerical predictions of SST did not need any interpolation because the coordinates of the locations matched with the grid points of ROMS SST data. However, the same predictions had to be averaged to daily scales from the original 6-h intervals.

3 Methodology

The modeling scheme adopted in this work to make future SST predictions is based on the time series forecasting principle. In such forecasting, the ANN considers a selected number (typically 5 to 16) of preceding values, identifies an unknown pattern in it, and by sliding one step ahead and maintaining the same pattern, it makes the desired future prediction. As per the well-known Takens' theorem (Takens 1981) in signal processing, modeling a cause–effect structure (essentially the underlying physical process) is the same as modeling the time series of the effect because it is assumed that the underlying causal relationship is reflected in the very sequence of the occurrence of the preceding values of the time series. Hence, if we model the preceding observations, there is no need to model the causal behavior. Thus, the modeling procedure adopted here implicitly takes care of the past changes in the SST arising out

Fig. 1 Locations of the buoys deployed under the RAMA project. Stations 1 to 27 are those where SST is recorded and these include the study locations: 4, 11, 14, 16, 19, and 24 shown with small arrows
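The bilinear interpolation of gridded reanalysis SST to a buoy position, mentioned in Section 2.4, can be illustrated with a minimal sketch. The function name `bilinear_sst`, the 2 × 2 patch of temperatures, and the grid origin used here are hypothetical and serve only as an illustration; they are not taken from the study.

```python
import math

def bilinear_sst(grid, lat0, lon0, step, lat, lon):
    """Bilinearly interpolate a gridded SST field to the point (lat, lon).

    grid[i][j] is the SST at latitude lat0 + i*step and longitude lon0 + j*step.
    The point is assumed to lie inside the grid.
    """
    # Fractional grid indices of the target point
    fi = (lat - lat0) / step
    fj = (lon - lon0) / step
    # Lower-left corner of the enclosing cell (clamped so i+1, j+1 exist)
    i = min(int(math.floor(fi)), len(grid) - 2)
    j = min(int(math.floor(fj)), len(grid[0]) - 2)
    ty = fi - i
    tx = fj - j
    # Interpolate along longitude on both latitude rows, then along latitude
    south = (1 - tx) * grid[i][j] + tx * grid[i][j + 1]
    north = (1 - tx) * grid[i + 1][j] + tx * grid[i + 1][j + 1]
    return (1 - ty) * south + ty * north

# Hypothetical 2 x 2 patch of 0.25-degree cells around a buoy at 4.0 N, 90.0 E
patch = [[29.0, 29.2],
         [29.4, 29.8]]
sst_at_buoy = bilinear_sst(patch, lat0=3.875, lon0=89.875, step=0.25,
                           lat=4.0, lon=90.0)
```

In this made-up case the buoy sits at the centre of the cell, so the result is simply the mean of the four corner values.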

of flow reversals, eddies, and upwelling typically occurring in the Indian Ocean, for example.

Considering the availability of simultaneous sets of data, the analysis was restricted to a period of 29 months ranging from January 2013 to May 2015. The first 70% of the sample was used for calibration or training of models, while the remaining 30% was employed for their testing.

In order to arrive at the best choice, a large number of trials were conducted with traditional ANN architecture, such as feed forward, recurrent, and radial basis, and also with alternative network training algorithms of ordinary gradient descent, conjugate gradient, and resilient back-propagation (Addison 2002; Haykin 1999; Kosko 1992). However, it was finally found that one of the latest and hybrid types of network, called the wavelet neural network (WNN), performed the most satisfactorily. The WNN is a combination of a discrete wavelet transform (DWT) and ANN, as will be briefly explained below.

3.1 Discrete wavelet transform

Discrete wavelet transform is the most popular form of signal transformation because it provides in a signal a high resolution in time and low resolution in frequency at higher frequencies, and vice versa for lower frequencies. A wavelet transform fundamentally isolates frequency in a given signal at different time resolutions. Transformation of the signal is done by scaling (dilation) and translation (shifting) functions, which are derived from a mother wavelet function.

The wavelet Ψ_{s,u}(x) derived from the mother wavelet ψ, where s and u are scale and position parameters, respectively, and x is an independent variable, can be expressed as follows:

\[ \Psi_{s,u}(x) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{x-u}{s}\right) \tag{1} \]

The parameter "u" relates the location of the wavelet function when it is shifted through the signal, and thus it represents the time-related information, and the parameter "s" represents the frequency-related information. If ψ*(.) is the complex conjugate of ψ(.), then the wavelet transform WT_{s,u}[f(x)] for a function f(x) is mathematically given as

\[ WT_{s,u}[f(x)] = \int_{-\infty}^{\infty} f(x)\,\frac{1}{\sqrt{s}}\,\psi^{*}\!\left(\frac{x-u}{s}\right) dx \tag{2} \]

The mother wavelet either compresses the signal (high pass), which provides the detailed hidden information in the signal, or expands it (dilates, low pass) to provide approximate information. The transformed signals with high (detail) and low (approximate) frequencies are analyzed independently. The approximate component is further transformed into sub-level detail and approximate signals, which constitutes multi-level decomposition.

There are alternate mother wavelet functions available for use, including Haar, Symlet, Coiflet, Daubechies, and Meyer. These wavelet families are classified according to the width of the filter window (which influences the localization) and number of vanishing moments (which represents the polynomial behavior of data). For further reading on the wavelet families, please refer to Addison (2002), Dghais and Ismail (2013), and Shoiab et al. (2014). In this study, the "discrete approximation of Meyer wavelet" was selected as wavelet family because it has not only a quick decay and infinite differentiability but also a compact spread in the frequency domain (Lu et al. 2012), facilitating efficient filtering. Using this wavelet, the original signal was transformed into detail and approximation sub-signals up to three levels in this application.

The discrete Meyer wavelet has a shape as in Fig. 2a, which shows the variation of signal strength with time during low and high passes.

3.2 Artificial neural network

ANN is made up of artificial neurons. Typically, it has three layers: input, hidden, and output. Each neuron is associated with some weight. Every neuron gets an input from its previous layer, and then the weighted sum of inputs, after adding some bias, is transferred to the next layer through a transfer function. A stepwise flow of information in an ANN is explained below, assuming that it has "i" neurons in the input layer, "h" neurons in the single hidden layer, and "o" neurons in the output layer (Fig. 2b).

(a) Sum up weighted inputs at each node in hidden layer:

\[ N_h = \sum_{i} \left( W_{i,h}\, X_i \right) + B_h \tag{3} \]

where N_h = weighted sum for hth hidden node, W_{i,h} = weight of ith input and hth hidden node, X_i = input at ith input node, and B_h = bias value at the hth hidden node.

(b) Transfer the weighted input to hidden neurons by the tan-sigmoid function, as below:

\[ H_h = \frac{2}{1 + e^{-2 N_h}} - 1 \tag{4} \]

where H_h = transferred weighted input at hth hidden node.

(c) Sum up the weighted information at each node in output layer:

\[ N_o = \sum_{h} \left( W_{h,o}\, H_h \right) + B_o \tag{5} \]

where N_o = weighted sum for oth output node, W_{h,o} = weight of hth hidden and oth output node, and B_o = bias value at the oth output node.

Fig. 2 a Variation of signal strength (y-axis) versus time (x-axis) in a discrete Meyer wavelet for low and high passes. b Typical ANN architecture (input layer, hidden layer, and output layer, with neurons connected through weights and biases)
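The flow of information through the three-layer network of Fig. 2b can be sketched in a few lines. This is only an illustration under stated assumptions: the tan-sigmoid is taken in its usual form 2/(1 + e^(−2n)) − 1, which equals tanh(n), the output layer is linear, and the function names and the weight values in the example are hypothetical rather than those of the trained networks.

```python
import math

def tansig(n):
    """Tan-sigmoid transfer function; algebraically equal to tanh(n)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a single-hidden-layer network.

    x      : list of inputs X_i
    w_ih   : w_ih[i][h], weight between input i and hidden node h
    b_h    : bias of each hidden node
    w_ho   : w_ho[h][o], weight between hidden node h and output node o
    b_o    : bias of each output node
    """
    hidden = []
    for h in range(len(b_h)):
        # Weighted sum at hidden node h, then tan-sigmoid transfer
        n_h = sum(w_ih[i][h] * x[i] for i in range(len(x))) + b_h[h]
        hidden.append(tansig(n_h))
    outputs = []
    for o in range(len(b_o)):
        # Weighted sum at output node o; linear transfer leaves it unchanged
        n_o = sum(w_ho[h][o] * hidden[h] for h in range(len(hidden))) + b_o[o]
        outputs.append(n_o)
    return outputs

# Made-up example: 2 inputs, 2 hidden neurons, 1 output
y = forward(x=[0.2, -0.1],
            w_ih=[[0.5, -0.3], [0.8, 0.1]],
            b_h=[0.0, 0.1],
            w_ho=[[1.0], [-0.5]],
            b_o=[0.05])
```

Training then amounts to adjusting `w_ih`, `b_h`, `w_ho`, and `b_o` so that the outputs match the targets; that step is not shown here.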

(d) Transfer the above weighted sum, N_o, from the hidden neurons to output neurons by linear transfer function, as below:

\[ O_o = N_o \quad \text{(purelin transfer function)} \tag{6} \]

where O_o = transferred weighted input from hidden layer to oth output node.

This inter-connected network is trained before its actual use with the help of a training algorithm. The training of ANN, or fixing connection weights and biases, is done by iteratively processing input–output pairs, and by minimizing the errors for each pair, in order to obtain an optimum combination of weights and biases. A cross-validation method is applied to avoid overtraining and biasing of prediction results. Various training algorithms are available to train the neural network, such as quasi-Newton, Bayesian regularization, Levenberg–Marquardt (LM), resilient back-propagation, and scaled conjugate gradient. Among available algorithms, the LM is considered as the fastest to train the moderately sized neural networks, up to several hundred weights (Hagan and Menhaj 1994). Therefore, the ANN used in this study was trained by the LM algorithm. For details of this algorithm, readers are referred to Marquardt (1963).

Assessed by the trials aimed at achieving the best testing performance, and considering the uniformity across various ANNs, the number of hidden neurons was kept as four.

3.3 Wavelet neural network

Wavelet neural network (WNN) is a combination of DWT and ANN. As mentioned earlier, in this work, the original signal was decomposed up to three levels of sub-signals using DWT. Thus, three details (D1, D2, and D3) and one approximate (A3) series are obtained as in Fig. 3, showing the decomposition of the SST time series at a typical location (L19).

The WNN was employed to forecast daily SST over a 5-day horizon. For training the ANN, a segment of preceding SST values of these details and approximate series were given as input, and SST values in the original (not decomposed) series served as the output or target values. If "S" is the original SST series and D1, D2, D3, and A3 are its decomposed series, then targets were given from S and inputs were given

Fig. 3 Discrete wavelet transform of SST anomaly series with three-level wavelet decomposition at location "L19" over a period of 881 days (01-Jan-2013 to 31-May-2015). Panels from top to bottom: SST (°C), A3 (°C), D3 (°C), D2 (°C), D1 (°C)
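The idea of a three-level decomposition into A3, D3, D2, and D1 can be shown with a short sketch. To stay self-contained it uses the Haar wavelet, not the discrete Meyer wavelet selected in this study, and it returns decimated coefficient arrays, whereas the panels of Fig. 3 are drawn at the full length of the series. The function names and the eight anomaly values are made up for illustration. With the PyWavelets package, a comparable call would be `pywt.wavedec(x, 'dmey', level=3)`.

```python
import math

def haar_step(signal):
    """One DWT level with the Haar wavelet.

    Returns (approximation, detail), each half the input length.
    A trailing sample of an odd-length input is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[k] + signal[k + 1]) * s for k in pairs]   # low pass
    detail = [(signal[k] - signal[k + 1]) * s for k in pairs]   # high pass
    return approx, detail

def wavedec(signal, levels=3):
    """Multi-level decomposition: returns [A_levels, D_levels, ..., D1]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the approximation is decomposed further at each level
        approx, d = haar_step(approx)
        details.append(d)
    return [approx] + details[::-1]

# Made-up SST anomalies (deg C), length 8 so that three levels are possible
anomalies = [0.3, 0.1, -0.2, -0.4, 0.0, 0.5, 0.6, 0.2]
a3, d3, d2, d1 = wavedec(anomalies, levels=3)
```

Because the Haar filters are orthonormal, the sum of squared coefficients over all four bands equals that of the input series, which gives a quick consistency check on the decomposition.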

from D1, D2, D3, and A3. Table 2 gives the number of values of inputs (from each decomposed series) and targets for each location, while Fig. 4 depicts the WNN architecture along with an inserted table that typically presents input and output at location L24 mentioned in the last row of Table 2.

The performance of the WNN model was assessed by three statistical error criteria, namely, correlation coefficient, r; root mean square error, RMSE; and mean absolute error, MAE. The "r" measures a linear association between the modeled and the target SST and it is highly influenced by errors or deviations among the high values. The RMSE gives an overall error structure but it is also sensitive to large deviations. The MAE, commonly understood in engineering assessments, similarly provides an overall picture of errors, although it disregards the sign of the errors.

The development of networks presented in this work was done with the help of the MATLAB toolbox, version 8 (Misiti et al. 2010).

Table 2 Inputs and targets used for WNN at each location
Location    Inputs (from each of D1, D2, D3, and A3)    Targets (from S)
L4          7                                           5
L11         11                                          5
L14         13                                          5
L16         11                                          5
L19         16                                          5
L24         5                                           5

4 Results and discussion

The basic statistics in the form of mean, standard deviation, and coefficient of variation of SST, along with their observed range, are presented in Table 3. It may be noticed that station 16, being far away from the equator, has the lowest mean and maximum range. It also has the highest variance in SST, in terms of the standard deviation and coefficient of variation. However, such variability was not found to pose any difficulty for the accurate prediction of SST, owing to the strength of the models used. Similarly, small differences in SST values at other stations did not lead to specific problems in modeling, as will be seen later in this section.

To begin with, the accuracy of the numerical SST prediction over a 5-day horizon was assessed within the testing period of December 10, 2014 to May 27, 2015, which formed the last 20% segment of the total sample. Table 4 shows the results of the comparison between numerical prediction of SST and in situ data, at all six stations over the five lead times. It can be seen that while the r values were fairly high and thus attractive, the RMSE and MAE, which should have been low, indicated considerable room for improvement in the obtained accuracy.

Therefore, and as mentioned in the preceding section, WNNs were developed to predict daily values of SST at all six locations of study and over a time horizon of 5 days in advance. The training of WNN was done by the reanalysis (NOAA OI v2) data, and the trained networks were tested as per the actual SST observations made by the RAMA buoy network. Additionally, the outcome of WNN was compared

Fig. 4 WNN architecture for location "L24"
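The formation of input–output pairs for the WNN, with a fixed number of preceding values taken from each of D1, D2, D3, and A3 and five target values taken from the original series S (Table 2), can be sketched as follows. The alignment assumed here, in which the targets start on the day immediately after the last input value, the function name `make_pairs`, and the toy series are assumptions made for illustration only.

```python
def make_pairs(components, target, n_lags, horizon=5):
    """Build (input, target) pairs by sliding a window along the series.

    components : dict of equally long decomposed series, e.g. D1, D2, D3, A3
    target     : the original (not decomposed) series S
    n_lags     : number of preceding values taken from each component
    horizon    : number of future values of S to predict
    """
    names = sorted(components)          # fixed ordering of the sub-series
    pairs = []
    for t in range(n_lags, len(target) - horizon + 1):
        x = []
        for name in names:
            # n_lags values preceding day t, from each decomposed series
            x.extend(components[name][t - n_lags:t])
        y = target[t:t + horizon]       # days t ... t + horizon - 1 of S
        pairs.append((x, y))
    return pairs

# Toy series of 12 days; with 5 lags per component (as listed for L24 in
# Table 2) each input vector holds 4 x 5 = 20 values
s = [0.1 * k for k in range(12)]
parts = {"D1": [0.0] * 12, "D2": [0.0] * 12, "D3": [0.0] * 12, "A3": s}
pairs = make_pairs(parts, s, n_lags=5, horizon=5)
```

Each pair could then be presented to a network with 20 input neurons and 5 output neurons.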

with the more common and traditional three-layered feed-forward back-propagation ANN (exemplified in Fig. 2) and with a base level persistent model (PER). The input and output of such ANN belonged to the original (not decomposed) SST series. The PER consisted in taking the predicted value as equal to the current one, i.e., SST(t + Δt) = SST(t), in which SST(t) is SST at the current time t while SST(t + Δt) is the predicted value of SST at time (t + Δt).

The reanalysis data were interpolated at the study locations. Because the changes in SST are very small compared to their absolute values, for the model development, the anomalies of SST from the mean were preferred rather than their absolute values. Hence, the reanalysis data were converted into anomalies by subtracting the long-term mean corresponding to the period of 32 years (1982 to 2014) from the absolute values. These anomalies at each location were then decomposed by DWT at three levels to produce details (D1, D2, and D3) and approximate (A3) components, which were further supplied to ANN as input to make the predictions. These predicted anomalies were then added to long-term means to get the absolute values of SST predictions.

Figure 5a gives an example of comparison between SST predicted from RAMA buoy-based observations, ANN, and WNN, while Fig. 5b shows the same comparison from RAMA buoys, WNN, and the numerical ROMS model. In addition, Fig. 5c depicts the same comparison across RAMA buoys, WNN, and the PER model. These comparisons are for lead time of 4 days at location L19 and over a testing period of 169 days, ranging from December 10, 2014 to May 27, 2015.

As can be seen from these figures, the WNN SST predictions are better than the numerical ROMS, traditional ANN,

Table 3 Basic statistics of SST data
                               L16      L19      L4       L11      L24      L14
Minimum (°C)                   26.38    28.45    27.40    28.20    28.25    27.66
Average (°C)                   28.54    29.68    29.28    29.45    29.30    28.99
Maximum (°C)                   31.76    30.93    31.43    30.53    30.68    30.5
SD (°C)                        1.42     0.63     1.2      0.61     0.68     0.64
Coefficient of variation (%)   4.97     2.11     4.11     2.06     2.34     2.22

Table 4 Performance indices of numerical SST predictions when compared with in situ SST data
Location   Index        Lead time (in days)
                        1       2       3       4       5
L4         r            0.83    0.81    0.81    0.80    0.79
           RMSE (°C)    0.84    0.88    0.90    0.93    0.94
           MAE (°C)     0.73    0.76    0.77    0.79    0.79
L11        r            0.90    0.89    0.89    0.86    0.86
           RMSE (°C)    0.59    0.61    0.62    0.64    0.64
           MAE (°C)     0.53    0.55    0.55    0.57    0.57
L14        r            0.73    0.73    0.72    0.68    0.63
           RMSE (°C)    0.56    0.56    0.57    0.61    0.66
           MAE (°C)     0.50    0.49    0.49    0.50    0.52
L16        r            0.96    0.96    0.96    0.96    0.96
           RMSE (°C)    0.55    0.53    0.52    0.52    0.52
           MAE (°C)     0.44    0.42    0.41    0.41    0.42
L19        r            0.76    0.76    0.75    0.76    0.76
           RMSE (°C)    0.77    0.78    0.79    0.79    0.80
           MAE (°C)     0.67    0.68    0.68    0.69    0.70
L24        r            0.90    0.89    0.88    0.87    0.84
           RMSE (°C)    0.42    0.44    0.47    0.50    0.56
           MAE (°C)     0.33    0.35    0.38    0.41    0.45

Fig. 5 Performance during testing showing time series of WNN and PER models, and in situ buoy observations
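The base level persistent model (PER) used as a reference in Fig. 5 takes the forecast at a given lead time to be the value observed on the day of issue, SST(t + Δt) = SST(t). A minimal sketch is given below; the function name and the six SST values are made up for illustration.

```python
def persistence_forecast(series, lead):
    """Pair each persistence forecast with the value it is meant to predict.

    The forecast for day t + lead is the value observed on day t.
    Returns (forecasts, observed), two lists of equal length.
    """
    if lead < 1 or lead >= len(series):
        raise ValueError("lead must be between 1 and len(series) - 1")
    forecasts = series[:-lead]   # SST(t)
    observed = series[lead:]     # SST(t + lead)
    return forecasts, observed

# Made-up daily SST values (deg C) and a lead time of 4 days
sst = [29.1, 29.3, 29.2, 29.6, 29.8, 29.7]
fc, obs = persistence_forecast(sst, lead=4)
```

The two returned lists can be passed directly to any error statistic, which makes the persistence model a convenient baseline against which other forecasts are judged.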

and PER model-based predictions, and compare well with the in situ measurements, even at the longer lead time.

The qualitative comparison of WNN with ROMS, PER, and ANN is further quantified with more detail in Figs. 6, 7, and 8 in terms of the error statistics of r, RMSE, and MAE, respectively, computed with the in situ SST observations as a basis. The comparison is at all six locations and all five lead times.

It is clear from Figs. 6 to 8 that the WNN-based SST predictions showed better agreement with actual observations, in terms of higher r and lower RMSE and MAE, than the SST predictions by ROMS, PER, and ANN models. The r value of the WNN-based SST predictions ranged from 0.83 to 0.98 for the 5-day-ahead forecast, while it ranged from 0.63 to 0.96 for the numerical ROMS-based predictions. The RMSE of the 5-day-ahead WNN-based prediction ranged from 0.23 to 0.40 °C, while for the ROMS predictions, it ranged from 0.52 to 0.94 °C. Finally, the MAE for a 5-day-ahead prediction based on WNN ranged from 0.18 to 0.32 °C, while for the numerical ROMS prediction, it varied from 0.42 to 0.79 °C.

The statistical significance of the derived correlation coefficients was ascertained through the "t" statistic test. The sample size was 170 days (December 10, 2014 to May 27, 2015) and, thus, the number of degrees of freedom was 168 and the t statistic was 1.974 (as per the standard "t-distribution" table). Hence, the critical value γ of the correlation was 0.15 at a 5% significance level (see Altman and Krzywinski 2015 for calculation of γ and other details). Because all correlations were far above this value, it was concluded that they were statistically significant.

If we look at Fig. 6 that shows the "r" values at all six locations, no significant variations are seen across the different sites, except at station L24, located toward the eastern end of the domain, where "r" is relatively much lower. The low correlation at L24 may be caused by a possible lower data quality rather than by a special local oceanography because the latter

Fig. 6 Correlation coefficients between WNN, ANN, PER, numerical ROMS-based SST predictions, and the in situ SST measurements by buoys (testing period; lead time varying from 1 to 5 days)
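The critical correlation of 0.15 quoted for the significance test can be reproduced from the t statistic and the degrees of freedom. The sketch below assumes the standard relation between the t statistic and the correlation coefficient, t = r·sqrt(df)/sqrt(1 − r²), solved for r; the function name is hypothetical.

```python
import math

def critical_r(t_crit, dof):
    """Smallest |r| that is significant for a given critical t and degrees of freedom.

    Obtained by solving t = r * sqrt(dof) / sqrt(1 - r**2) for r.
    """
    return t_crit / math.sqrt(t_crit ** 2 + dof)

# Values quoted in the text: t = 1.974 with 168 degrees of freedom
gamma = critical_r(1.974, 168)
```

Rounded to two decimals, `gamma` agrees with the value of 0.15 stated in the text, so any correlation in Fig. 6 above this threshold is significant at the 5% level.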

would have been reflected in a high variability of the basic data statistics in Table 3. It was observed that the ANN models were trained as per the reanalysis data, and such data at L24 did not have a very good agreement with the corresponding buoy data.

The above discussion indicates that, in general, the WNN can make daily SST predictions more efficiently than the numerical model and the base level persistent and traditional ANN. This could be because of the capability of the wavelet transform to provide less confusing input to the subsequent ANN, which on its own was unable to come up with predictions that are more accurate. The comparison with the baseline persistent model emphasized the need to pursue more non-linear approaches than the base level one. Numerical models are usually set up with initial conditions from other ocean analysis systems or by assimilating the observations into the model setup. However, no such procedures were used in this ROMS model, which could explain its less satisfactory performance.

An additional observation from Figs. 6, 7, and 8 is that the performance of ANN, PER, and ROMS decreases with increasing lead times, which in general does not happen with WNN. The reduction in the performance is due to the decreasing dependency between neighboring values as the prediction horizon rises. WNN, on the contrary, de-correlates the signal

Fig. 7 The root mean square error between WNN, ANN, PER, numerical ROMS-based SST predictions, and the in situ SST measurements by buoys (testing period; lead time varying from 1 to 5 days)

and acts at the root of such dependency; hence, we get almost similar performance over multiple future time steps. However, there is a limit in the correlation structure across neighboring values that can be acted upon by its finite architecture. Additionally, the given sample size also governs the ability of the architecture to de-correlate the signals. In this work, it was further noticed that if we pursue predictions beyond 5 days, the errors with WNN also increase. This analysis is not elaborated here because our objective in this work was to understand the relative merit of WNN vis-à-vis the ROMS predictions, which are restricted up to 5 days in advance. Thus, it can be seen that the very nature of processing input information in WNN, which consists in filtering the noise in the time series and using such filtered information as input for all lead times in one go, is advantageous compared to the alternative techniques.

It can be noted that the models developed herein are site specific. Such site-specific models can possibly be developed at every geographical grid in a given spatial domain, and therefore, 2D SST fields can be predicted. This was earlier attempted by Alvarez et al. (2000, 2003, 2004) and Youzhuan et al. (2008), in which genetic algorithms or statistical regression were used as a modeling tool, rather than ANN. However, such approach is more suitable for weekly,

Fig. 8 The mean absolute error between WNN, ANN, PER, numerical ROMS-based SST predictions, and the in situ SST measurements by buoys (testing period; lead time varying from 1 to 5 days)
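The three error statistics plotted in Figs. 6 to 8, namely the correlation coefficient r, the RMSE, and the MAE, can be computed from paired predictions and observations as in the sketch below. The function name and the sample values are illustrative assumptions; r is taken to be the usual Pearson correlation coefficient.

```python
import math

def error_stats(pred, obs):
    """Return (r, RMSE, MAE) for paired lists of predictions and observations."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    # Pearson correlation coefficient
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    r = cov / math.sqrt(var_p * var_o)
    # Root mean square error: squaring weights large deviations more heavily
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)
    # Mean absolute error: the sign of each error is disregarded
    mae = sum(abs(p - o) for p, o in zip(pred, obs)) / n
    return r, rmse, mae

# Made-up predicted and observed SST values (deg C)
predicted = [29.0, 29.4, 29.1, 29.8, 30.2]
observed = [29.2, 29.3, 29.5, 29.9, 30.0]
r, rmse, mae = error_stats(predicted, observed)
```

A constant offset between the two series leaves r at 1 while RMSE and MAE both equal the offset, which is why r alone can look attractive even when the errors are large, as noted for the ROMS predictions in Table 4.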

monthly, or seasonal SST predictions, rather than the daily forecasts of the present study, because at daily intervals, a high amount of noise gets introduced in satellite data as a result of cloud covering, airplane wakes, transient ocean structure, and so forth. Moreover, the sample size required for model calibration becomes very large (Alvarez et al. 2000, 2003, 2004).

5 Conclusions

The development of wavelet neural networks for predicting daily SST values over 5 days into the future was discussed. Such networks were trained using reanalysis data and tested against in situ SST measurements, made at six different locations in the Indian Ocean.

The level of match between SST predicted by the physically based numerical model ROMS and in situ data pointed out to the necessity of improving the numerical forecasts.

The alternative data-driven methods of persistent modeling and traditional feed-forward back-propagation types of ANN were also found to exhibit a low performance.

In contrast, the WNN model was in general found to be more accurate and reliable in SST predictions over a 5-day lead time and at all six stations. Such predictions were
associated with a mean error as low as 0.18–0.32 °C, when all sites and lead times were viewed together.

Therefore, the WNN approach was found to be valuable as an addition to numerical methods for SST predictions, when location-specific information is required.

Acknowledgements This study was made as part of a research project (no. 13MES001) funded by ESSO-INCOIS, Ministry of Earth Sciences, Government of India, Hyderabad, India, under the “High resolution operational ocean forecast and reanalysis system (HOOFS)” program. The authors gratefully acknowledge the help of Dr. Francis P. A. and Dr. M. Ravichandran, INCOIS, Hyderabad, in releasing numerical model-based data and for helpful suggestions in implementing the research project. Special thanks are due to Mrs. Anuradha Modi and Mr. K Kaviyazhahu from INCOIS for their help in compiling and providing the ROMS data.

References

Addison PS (2002) The illustrated wavelet transform handbook. Institute of Physics Publishing, London
Agarwal N, Kishtawal CM, Pal PK (2001) An analogue prediction method for global sea. Curr Sci 80(1)
Altman N, Krzywinski M (2015) Points of significance: association, correlation and causation. Nat Methods 12(10):899–900
Alvarez A, Lopez C, Riera M, Hernández-Garcia E, Tintore J (2000) Forecasting the SST space-time variability of the Alboran Sea with genetic algorithms. Geophys Res Lett 27(17):2709–2712
Alvarez A, Orfila A, Sellschopp J (2003) Satellite based forecasting of sea surface temperature in the Tuscan Archipelago. Int J Remote Sens 24(11):2237–2251
Alvarez A, Orfila A, Tintore J (2004) Real-time forecasting at weekly timescales of the SST and SLA of the Ligurian Sea with a satellite-based ocean forecasting (SOFT) system. Journal of Geophysical Research: Oceans, 109(C3)
Dghais AAA, Ismail MT (2013) A comparative study between discrete wavelet transform and maximal overlap discrete wavelet transform for testing stationarity. Int J Math Comput Sci Eng 7:1184–1188
Francis PA, Vinayachandran PN, Shenoi SSC (2013) The Indian Ocean Forecast System. Curr Sci 104(10):1354–1368
Garcia-Gorriz E, Garcia-Sanchez J (2007) Prediction of sea surface temperatures in the western Mediterranean Sea by neural networks using satellite observations. Geophys Res Lett 34(11)
Gupta SM, Malmgren BA (2009) Comparison of the accuracy of SST estimates by artificial neural networks (ANN) and other quantitative methods using radiolarian data from the Antarctic and Pacific Oceans. e-Journal Earth Science India, Vol. 2 (II), pp 52–75
Hagan MT, Menhaj MB (1994) Training feed forward networks with the Marquardt algorithm. Neural Networks, IEEE Transactions on 5(6):989–993
Haykin S (1999) Adaptive filters. Signal Processing Magazine, 6
Kosko B (1992) Neural networks and fuzzy systems: a dynamical systems approach to machine intelligence. Vol. 1, Prentice Hall
Laepple T, Jewson S (2007) Five year ahead prediction of Sea Surface Temperature in the Tropical Atlantic: a comparison between IPCC climate models and simple statistical methods. arXiv preprint physics/0701165
Lee YH, Ho CR, Su FC, Kuo NJ, Cheng YH (2011) The use of neural networks in identifying error sources in satellite-derived tropical SST estimates. Sensors 2011:7530–7544
Lu J, Liu HP, Hsu CY (2012) Discrete Meyer wavelet transform features for online Hangul script recognition
Mahongo SB, Deo MC (2013) Using artificial neural networks to forecast monthly and seasonal sea surface temperature anomalies in the Western Indian Ocean. The International Journal of Ocean and Climate Systems 4(2):133–150
Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial & Applied Mathematics 11(2):431–441
Martinez SA, Hsieh WW (2009) Forecasts of tropical Pacific sea surface temperatures by neural networks and support vector regression. International Journal of Oceanography, Vol 2009
McPhaden MJ, Meyers G, Ando K, Masumoto Y, Murty VSN, Ravichandran M, Syamsudin F, Vialard J, Yu L, Yu W (2009) RAMA: the research moored array for African–Asian–Australian monsoon analysis and prediction
Misiti M, Misiti Y, Oppenheim G, Poggi JM (2010) Wavelet Toolbox 4 user’s guide. The Math Works, Inc, 1–27
Moore AM, Arango HG, Di Lorenzo E, Cornuelle BD, Miller AJ, Neilson DJ (2004) A comprehensive ocean prediction and analysis system based on the tangent linear and adjoint of a regional ocean model. Ocean Model 7(1):227–258
Patil KR, Deo MC, Ravichandran M (2016) Prediction of sea surface temperature by combining numerical and neural techniques. Journal of Atmospheric and Oceanic Technology, American Meteorological Society, 33(8), doi:10.1175/JTECH-D-15-0213.1
Pozzi M, Malmgren BA, Monechi S (2000) Sea surface-water temperature and isotopic reconstructions from nannoplankton data using artificial neural networks. Palaeontol Electron 3(2):1–14
Reynolds RW, Smith TM, Liu C, Chelton DB, Casey KS, Schlax MG (2007) Daily high-resolution-blended analyses for sea surface temperature. J Clim 20(22):5473–5496
Shoaib M, Shamseldin AY, Melville BW (2014) Comparative study of different wavelet based neural network models for rainfall–runoff modeling. J Hydrol 515:47–58
Takens F (1981) Detecting strange attractors in fluid turbulence. In: Rand D, Yeung LS (eds) Dynamical systems and turbulence, Lecture Notes Math, vol. 898. Springer-Verlag, New York, pp 366–369
Tang B, Hsieh WW, Monahan AH, Tangang FT (2000) Skill comparisons between neural networks and canonical correlation analysis in predicting the equatorial Pacific sea surface temperatures. J Clim 13(1):287–293
Tangang FT, Hsieh WW, Tang B (1997) Forecasting the equatorial Pacific sea surface temperatures by neural network models. Clim Dyn 13(2):135–147
Tripathi KC, Das ML, Sahai AK (2006) Predictability of sea surface temperature anomalies in the Indian Ocean using artificial neural networks. Indian Journal of Marine Sciences 35(3):210–220
Wan E, Van Der Merwe R (2000) The unscented Kalman filter for nonlinear estimation. In Adaptive systems for signal processing, communications, and control. Symposium 2000. AS-SPCC. The IEEE 2000:153–158
Wilkin JL, Arango HG, Haidvogel DB, Lichtenwalner C, Glenn SM, Hedström KS (2005) A regional ocean modeling system for the Long-term Ecosystem Observatory. Journal of Geophysical Research: Oceans (1978–2012), 110(C6)
Wu A, Hsieh WW, Tang B (2006) Neural network forecasts of the tropical Pacific sea surface temperatures. Neural Netw 19(2):145–154
Xue Y, Leetmaa A (2000) Forecasts of tropical Pacific SST and sea level using a Markov model. Geophys Res Lett 27(17):2701–2704
Youzhuan D, Dongyang F, Zhihui W, Xianqiang H, Haiqing H, Delu P (2008) A study of predictability of SST at different time scales based on satellite time. In Proc. of SPIE Vol (Vol. 7149, pp. 714917–1)
