Remote Sensing Applications: Society and Environment 24 (2021) 100651
Journal homepage: www.elsevier.com/locate/rsase

Spatiotemporal imputation of MODIS land surface temperature using machine learning techniques (Case study: New Mexico's Lower Rio Grande Valley)

Esmaiil Mokari a,*, Hamid Mohebzadeh b,*, Zohrab Samani a, David DuBois c, Prasad Daggupati b

a Dept. of Civil Engineering, New Mexico State University, Las Cruces, NM, 88003, USA
b School of Engineering, University of Guelph, Guelph, ON, N1G 2W1, Canada
c Dept. of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM, 88003, USA

ARTICLE INFO

Keywords: Support vector regression; Boosted regression trees; Extreme learning machine; K-nearest neighbors; Missing value imputation

ABSTRACT

Land surface temperature (LST) is one of the main factors in the physical processes of land surface energy and water balance at the global scale. The importance of LST is progressively increasing in a variety of water resource fields. However, time series of LST observations regularly have missing values for various reasons such as cloudiness. Missing value imputation is a technique that uses a realistic value to estimate or replace the missing value. The objective of this study was to investigate and compare the potential of machine learning (ML) models including K-Nearest Neighbors (KNN), Support Vector Regression (SVR), Boosted Regression Trees (BRT), and Extreme Learning Machine (ELM) for spatiotemporal imputation of LST satellite images in New Mexico's Lower Rio Grande Valley (LRGV), where LST is a critical variable for water resource studies. The cross-validation approach was applied to obtain the optimized parameters for each model separately. The model comparison results showed that the SVR model is the most accurate model compared to the other models from missing ratio 0.1 to 0.8 (0.24 < mean RMSE (°C) < 0.38). The BRT and ELM models showed the same performance level in general, with almost the same mean CV-RMSE of 0.5 °C.
However, the ELM imputed maps showed noisy estimations, particularly when the missing ratios were increased. The BRT model showed a better performance than ELM at lower missing ratios from 0.1 to 0.5. The KNN model was found to be the least reliable ML model for spatiotemporal imputation of LST among all the models, with a mean RMSE of 0.982 °C. Also, blurring effects were more obvious in the KNN imputed maps than in those of the other models, particularly at larger missing ratios. Overall, the findings confirmed the superiority of the SVR and BRT over the ELM and KNN for spatiotemporal imputation of LST.

* Corresponding authors (E. Mokari, H. Mohebzadeh). Department of Civil Engineering, New Mexico State University, Las Cruces, NM, 88003, USA.
https://doi.org/10.1016/j.rsase.2021.100651
Received 16 June 2021; Received in revised form 8 October 2021; Accepted 10 October 2021; Available online 14 October 2021
2352-9385/© 2021 Elsevier B.V. All rights reserved.

1. Introduction

Land surface temperature (LST) is one of the major factors in the physical processes of land surface energy and water balance at the global scale (Karnieli et al., 2010; Kustas and Anderson, 2009; Zhang et al., 2008). Several successful uses of LST in a variety of water resource fields, including evapotranspiration (Kisaca et al., 2018; Le et al., 2009; Wang and Liang, 2008) and drought (Karnieli et al., 2010; Zhang et al., 2017a), are available in the literature. However, time series of LST observations regularly have missing values because of clouds, sensor failure, dust, aerosol, and ozone (Ghafarian Malamiri et al., 2018; Zhou et al., 2015). The energy reflected from the surface can be influenced noticeably by clouds, which causes errors when thermal sensors are in operation (Julien and Sobrino, 2010).

[Fig. 1. Lower Rio Grande Valley location in the Rio Grande Basin.]
The existence of missing values may limit the use of the data for various research purposes (Kornelsen and Coulibaly, 2014). Therefore, it is crucial to apply an effective imputation model to fill the gaps created by cloud cover. Missing value imputation is a method that uses a realistic value to estimate or replace the missing value (Zhang et al., 2017). Spatiotemporal imputation of LST data through interpolation methods (Bhattacharjee et al., 2020; Chen et al., 2014; Shiode and Shiode, 2011; Spadavecchia and Williams, 2005) and passive microwave satellite observations (Duan et al., 2020; Zhang et al., 2020) has been widely proposed. As an example, Bhattacharjee et al. (2020) developed a spatiotemporal semantic kriging method to reconstruct the Moderate Resolution Imaging Spectroradiometer (MODIS) LST data. They reported that the proposed method can estimate the missing LST data more accurately than spatiotemporal ordinary kriging and other existing interpolation methods. However, the performance of this approach is highly dependent on several contributing factors such as the study region and the correlation between land use/land cover and LST. Although spatiotemporal LST data imputation using interpolation and satellite passive microwave methods has been widely practiced, these methods have limitations, assumptions, and difficulties, particularly the second method, which is highly challenging (Duan et al., 2020). Duan et al. (2020) extensively reviewed various methods for predicting LST from passive microwave satellite observations. They highlighted several issues such as mathematical underestimations and difficulties in implementing atmospheric correction (Han et al., 2017) when satellite passive microwave data are used to reconstruct LST data.
Alternatively, machine learning (ML) techniques can be easily applied to meet the above-mentioned challenges with LST reconstruction. ML techniques have shown several successful applications in a variety of remote sensing studies including imagery classification (Hong et al., 2021a, 2021b), downscaling (Ebrahimy and Azadbakht, 2019; Mohebzadeh et al., 2020), and gully erosion (Amiri et al., 2019; Rahmati et al., 2017). ML techniques for imputing missing spatiotemporal data have also been proposed by various investigators (Carvalho et al., 2016; Feng et al., 2014; … et al., 2020; Poloczek et al., 2014; Rustum and Adeloye, 2007; Sharpe and Solly, 1995; Zhang et al., 2017b). Carvalho et al. (2016) proposed a spatiotemporal model for daily rainfall data imputation. The results were compared with the geostatistical techniques of ordinary kriging and ordinary cokriging. The results by Carvalho et al. (2016) showed more than 17% improvement in estimating missing daily precipitation compared to the kriging and cokriging methods. Zhang et al. (2017b) proposed a novel spatiotemporal hybrid method that combines three advanced methods for imputing spatiotemporal missing values. The findings by Zhang et al. (2017b) confirmed higher performance than the k-nearest neighbor (KNN) and CUTOFF (Feng et al., 2014) imputation models.

[Fig. 2. Cloud percentage over the study area: (a) average cloud coverage; (b) summer cloud coverage; (c) winter cloud coverage.]

[Fig. 3. Ratio of missing daily MODIS LST over the LRGV from 2010 to 2020: (a) temporal percentage of missing LST; (b) spatial percentage of missing LST.]
In the present study, four different ML techniques including K-Nearest Neighbors (KNN), Boosted Regression Trees (BRT), Extreme Learning Machine (ELM), and Support Vector Regression (SVR) were applied to see whether they can be used for spatiotemporal imputation of LST data. To the best of our knowledge, the BRT (Araya and Ghezzehei, 2019; Elith et al., 2008; Natekin and Knoll, 2013), ELM (Ebrahimy and Azadbakht, 2019; Parisouj et al., 2020), and SVR (Gizaw and Gan, 2016; Jhong et al., 2017; Yu et al., 2017) have shown successful performance compared to other ML techniques when used for different purposes. Therefore, this research was conducted to investigate the capability of ML techniques including KNN, BRT, ELM, and SVR for spatiotemporal imputation of LST data in New Mexico's Lower Rio Grande Valley (LRGV), where LST is a critical variable. Comparing the performance of these four ML techniques was the second objective of this study.

2. Material and methods

2.1. Study area

New Mexico's LRGV, a part of the Rio Grande Basin, is located in south-central New Mexico, extending from Elephant Butte Dam to the borders of Texas and Chihuahua, Mexico (Fig. 1). Irrigated agriculture is the major activity in the LRGV, and pecan is the main crop in the area. New Mexico is currently the leading pecan producer in the United States, with 43.8 million kilograms of in-shell nuts produced (USDA-NASS, 2020). The LRGV is classified as an arid climate with a low annual precipitation of 203-255 mm (Office of the State Engineer, 2017). The average annual temperature fluctuates between 16 and 24 °C. The main sources of water supply in the LRGV are the Rio Grande River (60%) and groundwater (40%).
More than 90% of the water use in the area is for agriculture (Office of the State Engineer, 2017). Fresh surface water for irrigation in the LRGV is getting scarce due to low precipitation, high evaporation, and increasing demand for water from competing sources (Mokari et al., 2019). Water availability for irrigation is crucial for pecan productivity because pecan is a major water user, with an average of 1200-1300 mm of annual evapotranspiration (ET) (Samani et al., 2011).

[Fig. 4. The schematic diagram of spatiotemporal missing imputation by ML techniques. Recoverable panel text: (1) preprocessing; KNN process: calculate the distance between the prediction set and the train set (x); SVR, ELM, and BRT process: find the set of model parameters using the train set to make a generalized function, then use the function and the prediction set to calculate the missing value.]

2.2. Data description

Daily MODIS LST (MOD11A1) products from the Terra satellite were downloaded from January 2010 to July 2020. The MODIS LST product includes LST data from instantaneous measurements at specific hours of the day. The spatial resolution is 1 km × 1 km and the temporal resolution is daily. Fig. 2 shows the map of cloud coverage over the study area within the study period (2010-2020). It is clear that the study area has a high level of cloud coverage, particularly in summer, which is the main reason for the high number of missing values in the LST data over the study area (Fig. 3). The temporal and spatial percentage of missing LST in the LRGV over the study period is shown in Fig. 3. The review of the daily LST showed that 13,753 pixels are lost when the images are temporally assessed during the study period. The highest number of lost pixels was related to the lowest ratio of missing data, which was less than 30% (Fig. 3(a)).
From a spatial perspective, around 60% of the images were lost when the ratio of missing data was less than 20% (Fig. 3(b)).

2.3. Imputation methods for comparison

Four different ML techniques including KNN, SVR, BRT, and ELM were assessed with different ratios of missing data and compared to find the best ML technique for spatiotemporal imputation of LST satellite images in the LRGV. The schematic diagram of the missing imputation process by ML techniques is shown in Fig. 4. Missing values were imputed by identifying their non-missing spatial neighbors and then considering the temporal values within these spatial neighbors. To accomplish this, the MODIS LST images were first arranged in an M × N matrix X corresponding to M days (time points) and N pixels (spatial locations), as can be seen in Fig. 4. Let $x_{i,j}$ be the known observation value on day $i$ at pixel $j$, for $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$. Also, let $x_{t,s}$ represent a missing value. The missing values are imputed using the following steps:

Step 1. The non-missing spatial neighbors of the missing value $x_{t,s}$ are identified and considered as the prediction set. Then, the temporal values within these spatial neighbors are considered as the train set (x), and the temporal values within the pixel of $x_{t,s}$ are selected as the train set (y).

Step 2. Each ML model is trained using the train sets (x) and (y), and the trained ML model is used to predict the missing value $x_{t,s}$ from the prediction set. To impute the missing value, the KNN model uses a different procedure from the other three models (SVR, ELM, and BRT), as seen in Fig. 4. In KNN, the distance between the prediction set and each row of the train set (x) is computed first. Then, the list of K nearest neighbors and their indices is stored from the train set (x). Finally, the average of the train set (y) values corresponding to the stored indices is computed to impute the missing value.
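The KNN branch of Step 2 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function name `knn_impute_value`, the indices `t` (day) and `s` (pixel), and the toy matrix are hypothetical; "spatial neighbors" is taken to mean every pixel observed on day `t`; and training days are restricted to those where the target pixel and all neighbor pixels are observed. The unweighted average follows the wording of Step 2.

```python
import numpy as np


def knn_impute_value(X, t, s, k):
    """Impute the missing entry X[t, s] of a days-by-pixels matrix.

    Assumptions (not from the paper): NaN marks a missing value, every
    pixel observed on day t counts as a spatial neighbor, and only days
    with complete neighbor data are used for training.
    """
    # Prediction set: pixels observed on day t (the target pixel is NaN).
    neighbors = np.flatnonzero(~np.isnan(X[t]))
    neighbors = neighbors[neighbors != s]

    # Training days: target pixel observed and no gaps among the neighbors.
    target_known = ~np.isnan(X[:, s])
    neighbors_complete = ~np.isnan(X[:, neighbors]).any(axis=1)
    rows = np.flatnonzero(target_known & neighbors_complete)

    train_x = X[np.ix_(rows, neighbors)]  # train set (x)
    train_y = X[rows, s]                  # train set (y)
    query = X[t, neighbors]               # prediction set

    # Euclidean distance between the prediction set and each training row.
    dist = np.sqrt(((train_x - query) ** 2).sum(axis=1))

    # Average the targets of the k nearest training days.
    nearest = np.argsort(dist)[:k]
    return float(train_y[nearest].mean())


# Toy example: 4 days x 3 pixels, with pixel 2 missing on day 3.
X = np.array([
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
    [30.0, 31.0, 32.0],
    [11.0, 12.0, np.nan],
])
value_k1 = knn_impute_value(X, t=3, s=2, k=1)
value_k2 = knn_impute_value(X, t=3, s=2, k=2)
```

With `k=1` the closest day is the first row, so its value at the target pixel is returned; with `k=2` the two closest days are averaged.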
In SVR, BRT, and ELM, the ML model is trained using the train sets (x) and (y), and the generalized model is then used to impute the missing value $x_{t,s}$ from the prediction set. This procedure is repeated to impute all missing values in the dataset.

2.3.1. K-nearest neighbors (KNN)

The KNN is one of the simplest ML algorithms and can be effectively applied to regression and classification problems. This technique is highly dependent on the closest neighboring station to provide the imputation information (Poloczek et al., 2014). The basic idea of KNN imputation for spatiotemporal data is that a missing value at a target station is imputed by a weighted mean of data from the k nearest stations to the candidate station. The weights depend on the distance of the stations from the candidate station; Euclidean distance is the most commonly used measure and is used here (Feng et al., 2014).

2.3.2. Support Vector Regression (SVR)

Support vector machine (SVM) is a commonly used machine learning technique for classification and regression purposes (Cortes and Vapnik, 1995). The structural risk reduction (SRR) concept is implemented in this technique instead of the empirical risk reduction concept that is commonly used by artificial neural network (ANN) models. Based on the SRR concept, the upper bound of the generalization error is minimized rather than the training error, which results in an optimum network structure (Li et al., 2006). The SVR is founded on a fundamental hypothesis, namely the nonlinear mapping of the original data into a higher-dimensional feature space. Linear regression in the feature space is carried out through kernels. Of the several kernels used in SVR, the radial basis function (RBF) has been found to perform best compared to other kernels (Barzegar et al., 2017). Thus, the RBF kernel was used in this study.

2.3.3. Extreme Learning Machine (ELM)

Extreme Learning Machine (ELM) is an advanced technique for single-hidden layer feed-forward neural networks (SLFNs) (Guang-Bin et al., 2004). The standard form of ELM, which is a type of ANN model, has a single input layer, a hidden layer, and an output layer. The computation procedure of ELM is much quicker than that of the traditional ANN. In ELM, the hidden nodes are chosen first. Then, the hidden biases and input-hidden weights are generated randomly. After that, the hidden layer outputs are computed and, finally, the hidden-output weights are determined using the Moore-Penrose generalized inverse. More information about the ELM model is available in the literature (Huang et al., 2005).

2.3.4. Boosted Regression Trees (BRT)

BRT is an ML model that increases model performance by combining two different techniques: decision trees and gradient boosting (Elith et al., 2008). Decision trees are commonly used because they present information in an intuitive way that is easy to visualize. Preparing predictors is not complicated because predictor variables can be defined in different forms such as numeric, binary, and categorical. The model outputs are not influenced by monotone transformations or by differing scales of measurement among predictors. Gradient boosting is a method to improve model accuracy. In gradient boosting, the model is fitted iteratively to the data in the training phase, progressively increasing the emphasis on observations that are modelled poorly by the existing collection of trees (Elith et al., 2008).

2.4. Cross validation approach

In this study, a 10-fold cross validation (10-CV) approach was used to determine the optimum parameters of all the applied ML models (Feng et al., 2014; Zhang et al., 2017). First, the pixels with almost complete time series over the study period were selected.
A total of 93 pixels were chosen, and the data sets were arranged in an M × 93 data matrix, in which M = 3831 is the number of days. In the next step, 10-CV was performed considering each column of the data matrix as the target variable and the remaining columns as the predictor variables. The predictor variables therefore formed a complete M × 92 matrix. Then, different missing data ratios from 10% to 80% were randomly set in each target column as the validation set, and the 10-CV approach was used to find the optimum parameters of each ML model by minimizing the performance criterion in the training set. For this purpose, the performance criterion was computed between the non-missing values and the imputed values and averaged over all 10 folds to produce the 10-CV error. Following this procedure, 93 imputed 10-CV errors were obtained for each missing ratio. In this study, the Root Mean Square Error (RMSE) was used as the performance criterion, written as CV-RMSE. The RMSE is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_i - M_i\right)^2}$$

where $S_i$ is the $i$th imputed value, $M_i$ is the $i$th observed value, and $N$ is the total number of non-missing values.

Normalization was applied to all input variables to match the consistency of the ML models as follows:

$$x_{\mathrm{normalized}} = \frac{x - \mu}{\sigma}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of the training data set. Normalization, which is a standard part of data preparation for machine learning, changes the values of numeric columns in the dataset to a common scale without distorting differences in the range of values. The mean and standard deviation of the training data set were also used for normalizing the test data, to prevent any values computed on the test data from being used in model development.

[Table 1. Optimized parameters of each ML model. Recoverable entries: SVR with RBF kernel (structural parameter, penalty coefficient, and tolerance threshold); BRT with max_depth = 10, min_samples_leaf = 10, min_samples_split = 20.]

2.5. Model parameterization

2.5.1. KNN

The performance of the KNN model can be largely affected by the choice of the tuning parameter k. As the value of k was decreased to 1, the predictions appeared to be less stable. Conversely, when the value of k was increased, the predictions were found to be more stable due to majority voting or averaging, which results in more accurate predictions. However, a large k cannot always guarantee a more accurate prediction. In this study, the cross-validation algorithm was used to find the optimal value of k over a range of 1-30.

2.5.2. SVR

The RBF kernel function was selected, which contains three main parameters: the structural parameter (γ), the penalty coefficient (C), and the tolerance threshold (ε, or ε-precision). Different values for each parameter, including γ (20 values between 0.0001 and 10000), C (20 values from 0.0001 to 10000), and ε-precision (10 values between 0.001 and 1), were evaluated and the optimized values were determined (Zhang et al., 2018).

2.5.3. BRT

To determine the optimized values of the BRT parameters, the initial values of the regression tree structure parameters were set first. In this study, the depth of the tree ranged from 5 to 10 depending on the total number of samples in the training set. The smaller value of 5 was selected to avoid over-fitting in the initial model. The initial value of min_samples_split was set to 30 because the number of samples in the training set was 3831. Also, the initial values of min_samples_leaf and max_features were set to 10 and 6, respectively. Once the initial values were set, n_estimators was optimized with values from 20 to 90 in increments of 10. Then, max_depth was increased from 5 to 10, and min_samples_split was increased from 10 to 300 in increments of 10 to determine the structure of the regression tree. After that, the values of min_samples_leaf were increased from 10 to 100 in increments of 10.
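The one-parameter-at-a-time BRT tuning described above can be sketched as a staged search. The sketch below is illustrative only: `staged_search` and `toy_cv_error` are hypothetical names, the starting value of 20 for `n_estimators` is an assumption (the text gives only its search range), and the toy error function merely stands in for the 10-CV RMSE that a real run would compute by fitting a boosted tree model for each candidate.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Params = Dict[str, int]


def staged_search(
    initial: Params,
    stages: List[Tuple[str, Iterable[int]]],
    cv_error: Callable[[Params], float],
) -> Tuple[Params, float]:
    """Tune one parameter per stage, keeping earlier choices fixed."""
    best = dict(initial)
    best_err = cv_error(best)
    for name, candidates in stages:
        for value in candidates:
            trial = {**best, name: value}
            err = cv_error(trial)
            if err < best_err:  # keep a candidate only if it improves the error
                best, best_err = trial, err
    return best, best_err


# Initial values and ranges quoted from the text, except n_estimators=20,
# which is an assumed starting point.
initial = {
    "n_estimators": 20,
    "max_depth": 5,
    "min_samples_split": 30,
    "min_samples_leaf": 10,
    "max_features": 6,
}
stages = [
    ("n_estimators", range(20, 91, 10)),
    ("max_depth", range(5, 11)),
    ("min_samples_split", range(10, 301, 10)),
    ("min_samples_leaf", range(10, 101, 10)),
]


def toy_cv_error(p: Params) -> float:
    """Hypothetical stand-in for the 10-CV RMSE (a simple bowl-shaped surface)."""
    return (
        ((p["n_estimators"] - 60) / 10) ** 2
        + (p["max_depth"] - 8) ** 2
        + ((p["min_samples_split"] - 20) / 10) ** 2
        + ((p["min_samples_leaf"] - 10) / 10) ** 2
    )


best_params, best_err = staged_search(initial, stages, toy_cv_error)
```

A staged search evaluates far fewer combinations than a full grid, at the cost of possibly missing interactions between parameters; that trade-off is a property of the sketch, not a claim about the study's results.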
The optimized values of the above-mentioned parameters were determined based on the cross-validation results.

2.5.4. ELM

In this model, the weights and biases of the hidden layer are generated randomly. The random initialization of the weights in ELM can result in different network outputs for identical numbers of neurons. To find the best weights, 1000 ELMs were trained in the training period with the selected number of hidden neurons, and the best weights were retained by minimizing the objective function over the validation period. A three-layer ELM model with a sigmoid activation function was developed. The optimum number of hidden neurons was found using the 10-CV.

[Fig. 5. Cross-validation results for searching the optimal k values for the KNN method from the missing data ratio 0.1 to 0.8. Red circles represent the most repeated k in every ratio that produced the lowest CV-RMSE (°C). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)]

3. Result and discussion

3.1. Parameter optimization using cross validation

The cross-validation approach described in Section 2.4 was used to find the best parameters of the four applied models. Table 1 shows the optimized values for each ML model. Fig. 5 illustrates how the CV-RMSE varies with the parameter k for missing ratios from 0.1 to 0.8. For each missing ratio, one column was removed and predicted with k values from 1 to 30 to obtain the optimum k value for the removed column. Therefore, 93 optimum values were produced for each missing ratio (Fig. 5). The most repeated k in each ratio is highlighted with red circles.
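The per-ratio selection of the most repeated k can be sketched as follows. This is a hedged illustration: the function name `most_repeated_k` and the small error matrix are made up for the example, and the array layout (one row per removed column, one column per candidate k) is an assumption about how the CV-RMSE values might be stored.

```python
import numpy as np


def most_repeated_k(cv_rmse):
    """Return the best k per target column and the most frequent best k.

    cv_rmse has shape (n_columns, n_k); entry [c, j] is the CV-RMSE for
    target column c with k = j + 1 (assumed layout).
    """
    best_k = cv_rmse.argmin(axis=1) + 1  # k values start at 1
    counts = np.bincount(best_k, minlength=cv_rmse.shape[1] + 1)
    return best_k, int(counts.argmax())


# Made-up CV-RMSE values for 4 target columns and k = 1, 2, 3.
cv_rmse = np.array([
    [0.9, 0.5, 0.7],
    [0.8, 0.4, 0.6],
    [0.3, 0.5, 0.9],
    [0.7, 0.2, 0.8],
])
best_k, mode_k = most_repeated_k(cv_rmse)
```

In the study's setting the matrix would have 93 rows and 30 columns for each missing ratio, and the function would be called once per ratio.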
The mode of the k values was found to range from 4 to 8 and thus the middle k value of 6 was selected as the optimum k value (Fig. 5).

3.2. Comparison of models

Table 2 summarizes the 93 imputation errors, expressed as CV-RMSE, for all applied models under varying missing data ratios in the training set. The mean CV-RMSE for the SVR was between 0.24 and 0.33 °C, the best accuracy among all applied models from missing ratio 0.1 to 0.8. The mean CV-RMSE results showed that SVR had the best overall performance for all missing data ratios, with a mean value of 0.271 °C (Table 2). Fig. 6 compares the boxplots of the 93 RMSE results for the four different imputation models for missing data ratios of 0.1-0.8 in the validation set. The RMSE results showed that the SVR model had the lowest error and the lowest error fluctuations compared to the other models (Fig. 6).

[Table 2. Comprehensive analysis results of CV-RMSE (°C) for all the applied methods under varying missing data ratios in the training set. Bold type is used to emphasize the optimal performance.]

[Fig. 6. Comparative boxplot results (reference to 93 RMSE results in °C) for the four imputation models with different missing data ratios in the validation set: (a) KNN, (b) SVR-RBF, (c) BRT, (d) ELM. Horizontal axis: missing ratio, 0.1 to 0.8.]
The results showed that SVR has better generalization performance than the other models due to the structural risk minimization procedure, which can lead to an optimal global solution (Kumar et al., 2016; Gizaw and Gan, 2016). The SVR model applies the Lagrangian approach to simplify the quadratic optimization problem in its calculations. Also, the complexity of high-dimensional computations is reduced by applying the kernel function. Thus, the SVR is capable of achieving efficient characteristics (Zhang et al., 2018). The BRT and ELM were found to have the same performance level, with mean CV-RMSE values of 0.552 and 0.566 °C, respectively. However, the ELM was relatively stable, with a smaller standard deviation (Std) than the BRT (Table 2). The BRT was found to perform better than the ELM at low missing ratios between 0.1 and 0.5 (Table 2). The KNN model, with a mean CV-RMSE value of 0.982 °C, showed the least accurate performance for estimating LST among all the applied models. This could be explained by the poor generalization of the KNN model compared to the other models and by possible drawbacks of the KNN algorithm, such as treating the complex nonlinear relationship of the spatiotemporal data as a simple relationship using the nearest mean (Zhang et al., 2017b).

[Fig. 7. Comparison of the ML methods for spatial imputation of the MODIS LST for four images with 20, 40, 60, and 80% of missing values: (a) original MODIS LST, (b) KNN, (c) SVR-RBF, (d) BRT, and (e) ELM.]
The KNN model has difficulties when used on large datasets with high dimensions, because a large number of dimensions makes it harder for the algorithm to calculate the distance in each dimension, which results in inaccurate values. The accuracy of the models decreased as the missing ratio increased, despite some fluctuations across the various missing ratios. This could be explained by the size of the predictor set in the training process, which decreases when the missing ratio is larger. This can weaken model generalization, which results in inaccurate predictions of missing values. Overall, the obtained results confirmed the superiority of the SVR and BRT methods over the ELM and KNN in both the training and validation phases.

[Fig. 8. Comparison of the ML methods for temporal imputation of the MODIS LST for three pixels with 30, 35, and 40% of missing values: (a) missing ratio 30%, (b) missing ratio 35%, (c) missing ratio 40%. The two blue dashed lines in the figures of the left panel separate a part of the time series for a larger representation in the right panel. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)]
Therefore, the accuracy ranking is SVR > BRT > ELM > KNN according to the comparison of the mean CV-RMSE of the models (Table 2).

3.3. Spatiotemporal missing imputation

Fig. 7 shows the comparison of the different ML models for spatial imputation of the MODIS LST for images with different percentages of missing values. For spatial missing imputation, four images with 20, 40, 60, and 80% missing values (May 20, 2016; August 24, 2019; August 16, 2019; July 10, 2019) were selected and imputed with all the applied ML models. Comparison of columns "b", "c", "d", and "e" with column "a" shows how the missing values (white areas in column "a") are filled by the different ML models (Fig. 7). The ELM model did not fill the missing values properly, and some noisy estimations were observed, particularly when the missing ratios were increased ("e" column in Fig. 7). This can be attributed to the random initialization of the weights in ELM, which leads to weak generalization of the model for spatial imputation. Also, some blurring effects can be seen in the results of the ML methods, increasing from a missing ratio of 20% to 80%. This effect is more obvious in the KNN imputed maps at larger missing ratios ("b" column in Fig. 7). Blurring effects are also an issue in other remote sensing studies such as downscaling. However, this effect appears to be less evident when ML models are used compared to other techniques (Ebrahimy and Azadbakht, 2019). The SVR imputed maps showed that the SVR model can be used for spatial imputation of the MODIS LST regardless of the missing ratio percentage ("c" column in Fig. 7).

Fig. 8 shows the comparison of the different ML models for temporal imputation of the MODIS LST for three pixels with different missing ratios. To better represent the time series data, a part of the left panel was separated by blue dashed lines and presented in the right columns (Fig. 8).
For temporal missing imputation, three pixels with 30, 35, and 40% missing values were selected and imputed with all the ML models. It can be seen from these reconstructed time series that the ML methods are able to impute the missing values quite well, especially in the large gaps (Fig. 8). The imputed values for each pixel seem reasonable in that the variability in the imputed values is very similar to that of the non-missing values (Fig. 8).

4. Conclusion

This study was carried out to investigate the performance of four different ML models including KNN, SVR, BRT, and ELM for spatiotemporal imputation of LST satellite images in the LRGV. The final results showed that the SVR spatiotemporal imputation model had the most accurate performance for imputing the missing LST values, regardless of the missing data ratio, compared with the other models (mean CV-RMSE = 0.271 °C). This successful imputation by the SVR model could be due to its better generalization performance. The BRT and ELM showed the same performance level, with mean CV-RMSE values of 0.552 and 0.566 °C, respectively. However, the BRT was observed to perform better than the ELM when the missing ratios were between 0.1 and 0.5. The ELM imputed maps showed several noisy estimations, particularly when the missing ratios were increased, due to the random initialization of the weights in the ELM model. The KNN model was found to be the least accurate model compared to the other models for spatiotemporal imputation of the LST values (mean RMSE = 0.982 °C). Blurring effects were observed more in the KNN imputed maps at larger missing ratios than in those of the other models. Overall, the results of this study showed the superiority of the SVR and BRT models over the ELM and KNN for spatiotemporal imputation of the LST values in the LRGV.
In summary, our results showed that imputation methods, particularly SVR, can be used to reconstruct a MODIS LST data set. The procedures are equally applicable to other MODIS data sets such as Aerosol Optical Depth and ocean color products.

Author statement

Esmaiil Mokari: Conceptualization, Methodology, Writing (Original draft preparation); Hamid Mohebzadeh: Conceptualization, Methodology, Writing (Original draft preparation); Zohrab Samani: Methodology, Technical Investigation, Supervision; David DuBois: Technical investigation; Prasad Daggupati: Writing (review and editing). All authors have read and agreed to the published version of the manuscript.

Ethical statement

The authors consciously assure that the following items for this paper are fulfilled:
1) This material is the authors' own original work, which has not been previously published elsewhere.
2) The paper is not currently being considered for publication elsewhere.
3) The paper reflects the authors' own research and analysis in a truthful and complete manner.
4) The paper properly credits the meaningful contributions of co-authors and co-researchers.
5) The results are appropriately placed in the context of prior and existing research.
6) All sources used are properly disclosed (correct citation). Literally copying of text must be indicated as such by using quotation marks and giving proper reference.
7) All authors have been personally and actively involved in substantial work leading to the paper, and will take public responsibility for its content.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors thank the National Institute of Food and Agriculture, U.S. Department of Agriculture (award number: 2017-68007-26318) for supporting this work.
The authors also thank NASA for providing the MODIS LST data.

References

Amiri, M., Pourghasemi, H.R., Ghanbarian, G.A., Afzali, S.F., 2019. Assessment of the importance of gully erosion effective factors using Boruta algorithm and its spatial modeling and mapping using three machine learning algorithms. Geoderma 340, 55-69. https://doi.org/10.1016/j.geoderma.2018.12.042.
Resour, Rex 5, 5715-5797. sp oor 101025, 2018RO287.
Barzegar, R., Asghari Moghaddam, A., Adamowski, J., Fijani, E., 2017. Comparison of machine learning models for predicting fluoride contamination in groundwater. Stoch. Environ. Res. Risk Assess. 31, 2705-2718.
Bhattacharjee, S., Chen, J., Ghosh, S.K., 2020. Spatio-temporal prediction of land surface temperature using semantic kriging. Trans. GIS 24, 189-202.
Carvalho, J.R.P., Nakai, A.M., Monteiro, J.E.B.A., 2016. Spatio-temporal modeling of data imputation for daily rainfall series in homogeneous zones. Revista Brasileira de Meteorologia 31, 196-201. https://doi.org/10.1590/0102-778631220150025.
Sen. 6 biG oxg/10.90/1 6042848,
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273-297. https://doi.org/10.1007/BF00994018.
Duan, S.-B., Han, X.-J., Huang, C., Li, Z.-L., Wu, H., Qian, Y., Gao, M., Leng, P., 2020. Land surface temperature retrieval from passive microwave satellite observations: state-of-the-art and future directions. Rem. Sens. 12.
Ebrahimy, H., Azadbakht, M., 2019. Downscaling MODIS land surface temperature over a heterogeneous area: an investigation of machine learning techniques, feature selection, and impacts of mixed pixels. Comput. Geosci. 124, 93-102. https://doi.org/10.1016/j.cageo.2019.01.004.
Elith, J., Leathwick, J.R., Hastie, T., 2008. A working guide to boosted regression trees. J. Anim. Ecol. 77, 802-813. https://doi.org/10.1111/j.1365-
Feng, L., Nowak, G., O'Neill, T.J., Welsh, A.H., 2014. CUTOFF: a spatio-temporal imputation method. J. Hydrol. 519, 3591-3605.
Ghafarian Malamiri, H.R., Rousta, I., Olafsson, H., Zare, H., Zhang, H., 2018. Gap-filling of MODIS time series land surface temperature (LST) products using singular spectrum analysis (SSA). Atmosphere 9.
Gizaw, M.S., Gan, T.Y., 2016. Regional flood frequency analysis using support vector regression under historical and future climate. J. Hydrol. 538, 387-398. https://doi.org/10.1016/j.jhydrol.2016.04.041.
Guang-Bin, H., Qin-Yu, Z., Chee-Kheong, S., 2004. 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541). Extreme Learning Machine: a New Learning Scheme of Feedforward Neural Networks, vol. 2, pp. 985-990. https://doi.org/10.1109/IJCNN.2004.1380068.
Han, X.-J., Duan, S.-B., Li, Z.-L., 2017. Atmospheric correction for retrieving ground brightness temperature at commonly-used passive microwave frequencies. Opt. Express 25, A36-A57.
Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70, 489-501.
Hong, D., Gao, L., Yao, J., Zhang, B., Plaza, A., Chanussot, J., 2021. Graph convolutional networks for hyperspectral image classification. IEEE Trans. Geosci. Rem. Sens. 59, 5966-5978. https://doi.org/10.1109/TGRS.2020.3015157.
Hong, D., Gao, L., Yokoya, N., Yao, J., Chanussot, J., Du, Q., Zhang, B., 2021. More diverse means better: multimodal deep learning meets remote-sensing imagery classification. IEEE Trans. Geosci. Rem. Sens. 59, 4340-4354. https://doi.org/10.1109/TGRS.2020.3016820.
Jhong, B.-C., Wang, J.-H., Lin, G.-F., 2017. An integrated two-stage support vector machine approach to forecast inundation maps during typhoons. J. Hydrol. 547, 236-253. https://doi.org/10.1016/j.jhydrol.2017.01.057.
Julien, Y., Sobrino, J.A., 2010. Comparison of cloud-reconstruction methods for time series of composite NDVI data. Remote Sensing of Environment 114, 618-625. https://doi.org/10.1016/j.rse.2009.11.001.
Karnieli, A., Agam, N., Pinker, R.T., Anderson, M., Imhoff, M.L., Gutman, G.G., Panov, N., Goldberg, A., 2010. Use of NDVI and land surface temperature for drought assessment: merits and limitations. J. Clim. 23, 618-633. https://doi.org/10.1175/2009JCLI2900.1.
Kitsara, G., Papaioannou, G., Retalis, A., Paronis, D., Kerkides, P., 2018. Estimation of air temperature and reference evapotranspiration using MODIS land surface temperature over Greece. Int. J. Rem. Sens. 39, 924-948.
Kornelsen, K., Coulibaly, P., 2014. Comparison of interpolation, statistical, and data-driven methods for imputation of missing values in a distributed soil moisture dataset. J. Hydrol. Eng. 19, 26-43.
Kumar, D., Pandey, A., Sharma, N., Flügel, W.-A., 2016. Daily suspended sediment simulation using machine learning approach. Catena 138, 77-90.
Kustas, W., Anderson, M., 2009. Advances in thermal infrared remote sensing for land surface modeling. Agric. For. Meteorol. 149, 2071-2081. https://doi.org/10.1016/j.agrformet.2009.05.016.
Li, L., Franklin, M., Girguis, M., Lurmann, F., Wu, J., Pavlovic, N., Breton, C., Gilliland, F., Habre, R., 2020. Spatiotemporal imputation of MAIAC AOD using deep learning with downscaling. Remote Sensing of Environment 237, 111584. https://doi.org/10.1016/j.rse.2019.111584.
Li, Z.-L., Tang, R., Wan, Z., Bi, Y., Zhou, C., Tang, B., Yan, G., Zhang, X., 2009. A review of current methodologies for regional evapotranspiration estimation from remotely sensed data. Sensors 9. https://doi.org/10.3390/s90503801.
Lin, J.-Y., Cheng, C.-T., Chau, K.-W., 2006. Using support vector machines for long-term discharge prediction. Hydrol. Sci. J. 51, 599-612.
Mohebzadeh, H., Yeom, J., Lee, T., 2020. Spatial downscaling of MODIS chlorophyll-a with genetic programming in South Korea. Rem. Sens. 12.
Mokari, E., Shukla, M.K., Šimůnek, J., Fernandez, J.L., 2019. Numerical modeling of nitrate in a flood-irrigated pecan orchard. Soil Sci. Soc. Am. J. 83, 555-564.
Natekin, A., Knoll, A., 2013. Gradient boosting machines, a tutorial. Front. Neurorob. 7, 21. https://doi.org/10.3389/fnbot.2013.00021.
Office of the State Engineer, State of New Mexico, 2017. Lower Rio Grande Regional Water Plan.
Parisouj, P., Mohebzadeh, H., Lee, T., 2020. Employing machine learning algorithms for streamflow prediction: a case study of four river basins with different climatic zones in the United States. Water Resour. Manag.
Poloczek, J., Treiber, N.A., Kramer, O., 2014. KNN regression as geo-imputation method for spatio-temporal wind data. In: de la Puerta, J.G., Ferreira, I.G., Bringas, P.G., Klett, F., Abraham, A., de Carvalho, A.C.P.L.F., Herrero, Á., Baruque, B., Quintián, H., Corchado, E. (Eds.), International Joint Conference SOCO'14-CISIS'14-ICEUTE'14. Springer International Publishing, Cham, pp. 188-195.
Rahmati, O., Tahmasebipour, N., Haghizadeh, A., Pourghasemi, H.R., Feizizadeh, B., 2017. Evaluation of different machine learning models for predicting and mapping the susceptibility of gully erosion. Geomorphology 298, 118-137.
Rustum, R., Adeloye, A.J., 2007. Replacing outliers and missing values from activated sludge data using Kohonen self-organizing map. J. Environ. Eng. 133, 909-916. https://doi.org/10.1061/(ASCE)0733-9372(2007)133:9(909).
Samani, Z., Bawazir, S., Skaggs, R., Longworth, J., Piñon, A., Tran, V., 2011. A simple irrigation scheduling approach for pecans. Agric. Water Manag. 98, 661-664.
Sharpe, P.K., Solly, R.J., 1995. Dealing with missing values in neural network-based diagnostic systems. Neural Comput. Appl. 3, 73-77.
Shiode, N., Shiode, S., 2011. Street-level spatial interpolation using network-based IDW and ordinary kriging. Trans. GIS 15, 457-477. https://doi.org/10.1111/j.1467-9671.2011.01278.x.
Spadavecchia, L., Williams, M., 2009. Can spatio-temporal geostatistical methods improve high resolution regionalisation of meteorological variables? Agric. For. Meteorol. 149, 1105-1117. https://doi.org/10.1016/j.agrformet.2009.01.008.
U.S. Department of Agriculture-NASS, 2020. Pecan Production, Crops and Crop Products category.
Wang, K., Liang, S., 2008. An improved method for estimating global evapotranspiration based on satellite determination of surface net radiation, vegetation index, temperature, and soil moisture. J. Hydrometeorol. 9, 712-727.
Yu, P.-S., Yang, T.-C., Chen, S.-Y., Kuo, C.-M., Tseng, H.-W., 2017. Comparison of random forests and support vector machine for real-time radar-derived rainfall forecasting. J. Hydrol. 552, 92-104.
Zhang, D., Lin, J., Peng, Q., Wang, D., Yang, T., Sorooshian, S., Liu, X., Zhuang, J., 2018. Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm. J. Hydrol. 565, 720-736. https://doi.org/10.1016/j.jhydrol.2018.08.050.
Zhang, L., Jiao, W., Zhang, H., Huang, C., Tong, Q., 2017a. Studying drought phenomena in the Continental United States in 2011 and 2012 using various drought indices. Remote Sensing of Environment 190, 96-106. https://doi.org/10.1016/j.rse.2016.12.010.
Zhang, R., Tian, J., Su, H., Sun, X., Chen, S., Xia, J., 2008. Two improvements of an operational two-layer model for terrestrial surface heat flux retrieval. Sensors 8.
Zhang, Z., Yang, X., Li, H., Li, W., Yan, H., Shi, F., 2017b. Application of a novel hybrid method for spatiotemporal data imputation: a case study of the Minqin County groundwater level. J. Hydrol. 553, 384-397.
Zhou, J., Jia, L., Menenti, M., 2015. Reconstruction of global MODIS NDVI time series: performance of harmonic ANalysis of time series (HANTS). Remote Sensing of Environment 163, 217-228. https://doi.org/10.1016/j.rse.2015.03.018.
