
PRODUCING A GAP-FREE LANDSAT TIME SERIES FOR THE TAITA HILLS, SOUTHEASTERN KENYA

Zhipeng Tang 1,2,3,*, Hari Adhikari 1,2, Petri K. E. Pellikka 1,2, Janne Heiskanen 1,2

1 Department of Geosciences and Geography, University of Helsinki, P.O. Box 68, FI-00014, Finland
2 Institute for Atmospheric and Earth System Research, University of Helsinki, Finland
3 School of Forestry and Environmental Studies, Yale University, New Haven, CT, USA

ABSTRACT

Long-term Landsat time series imagery provides valuable opportunities for monitoring land surface changes. However, missing observations that result from clouds, cloud shadows, and scan line corrector failure make the Landsat data record neither a continuous nor a consistent time series. In this study, we present an approach to produce a gap-free Landsat time series for the Taita Hills in southeastern Kenya. We used simulated gaps in nearly cloud-free images to assess the performance of the approach while considering two factors: the size of the gaps and the effect of the wet or dry season. Filling the image with the largest area of simulated gaps, equivalent to 52% of the test image area, yielded almost the same root mean square error (RMSE), relative RMSE, and coefficient of determination as filling the smallest gaps, equivalent to 25%. Furthermore, gaps in both dry and wet season images could be filled without visual artifacts, although dry season images were better predicted. Finally, the time series images produced for the study area showed consistent temporal variation across land use/land cover types.

Index Terms: Tropical areas, image processing, image reconstruction, gap-filling

1. INTRODUCTION

Providing almost 50 years of surface records, Landsat acquires data at a spatial resolution of 30 meters and a temporal resolution of 16 days [1]. However, a less attractive feature of Landsat data, as of any other optical satellite data, is missing observations, which result from clouds and cloud shadows (CCS) [2]. In addition, the failure of the Landsat Enhanced Thematic Mapper Plus (ETM+) Scan Line Corrector (SLC) in May 2003 caused wedge-shaped gaps in each image thereafter, amounting to roughly 22% missing values per image [3]. Therefore, producing gap-free Landsat images continues to be one of the major challenges in applications of Landsat data. Filling absent pixels in an individual Landsat image is an efficient and relatively easy way to solve the problem. For example, a new method was proposed to complete Landsat reflectance time series using a gap-filling and harmonic fitting procedure [4].

The Eastern Arc Mountains stretch from southeastern Kenya to eastern Tanzania. The remaining montane forests in the region are important reservoirs of biodiversity and carbon stocks, and play a fundamental role in the hydrological cycle. The Taita Hills form the northernmost part of the mountains. Because of the unique geographical and biological characteristics of the area, several previous studies have focused on it, for example, on mapping forest cover changes [5], tree species diversity [6], and biomass [7]. However, because of the large amount of missing data in medium resolution satellite imagery of the area caused by CCS, applications based on dense time series have so far been limited. Although methods for removing CCS have been applied in tropical landscapes [8], the remaining gaps limit the usage of the datasets. Therefore, in this study, our objective was to examine a temporal interpolation approach to produce a gap-free Landsat time series for the Taita Hills.

2. METHODOLOGY

2.1. Study area

The Taita Hills study area (3°18’S, 38°30’E) is located in southeastern Kenya (Fig. 1). The area has a variable topography, where the altitude in the hills ranges from around 1000 m to 2200 m. The surrounding plains have an approximate altitude of between 500 and 1000 m.

Fig. 1. Location of the study area in (a) Kenya and (b) Taita Hills from a Landsat 8 image displayed in true color composites.

*Corresponding author. E-mail: zhipeng.tang@helsinki.fi

978-1-7281-6374-1/20/$31.00 ©2020 IEEE 1319 IGARSS 2020


Influenced by the Intertropical Convergence Zone, this area has a bimodal rainfall pattern: a long wet season between March and June, and a short wet season between October and December [5]. Land use/land cover (LULC) includes bushland, cropland, montane and plantation forests, grassland, and built-up areas.

2.2. Satellite data

We obtained Landsat Collection 1 Level-2 surface reflectance data (2013–2017) from the USGS website¹. We removed CCS from all images using the pixel quality band and scaled the surface reflectance to the 0–1 range. From the time series, two nearly cloud-free images (Fig. 2), detailed in Table 1, were selected to test the gap-filling approach presented below. For convenience, we hereafter refer to any image that will be filled as a target image.

Fig. 2. Nearly cloud-free target images in (a) 2014 and (b) 2017, displayed in true color composites.

Table 1. Landsat 8 OLI images and acquisition dates.
Path, Row    Year, DOY    Season    Area (km²)    Bands    Resolution (m)
167, 62      2017, 259    Dry       7266          7        30
167, 62      2014, 331    Wet       7266          7        30

2.3. Gap-filling algorithm

There are three steps in the gap-filling method. First, statistical metrics (STMs) consisting of various percentiles (10%, 25%, 50%, 75%, and 90%) and mean reflectance were computed from a one-year time series of all available valid observations of each pixel. A total of 42 bands of statistical metrics were obtained. Second, a pool was created by stacking these metrics with the target images. Third, using this pool, k-NN regression was used to predict missing values in the target image.

Statistical metrics have been commonly used for land cover and vegetation attribute mapping [9]. In most cases, it is reasonable to employ one year of data to compute STMs, as one year covers a full phenological cycle, and it can be assumed that LULC changes are relatively insignificant during that period; therefore, we used a one-year time series to compute statistical metrics. Using k-NN regression, a relationship between pixels in a target image and STMs is built. Based on this relationship, gaps in the target image can be predicted.

3. ALGORITHM TESTS

3.1. Experimental design

Two experiments were designed:
(1) Approximately half of the images in the time series have less than 30% clear pixels. Although almost cloud-free images also occur occasionally, the fraction of missing values is in general high and variable. To verify how the gap size affects accuracy, we simulated gaps of three different sizes (300×300, 400×400, and 500×500 pixels) in the target images. The gaps are equivalent to 25%, 37%, and 52% of missing observations, respectively.
(2) To verify whether the method performs well in both dry and wet seasons, we compared the results between the two using gaps of 500×500 pixels.

For accuracy assessment, we evaluated the performance of the method both qualitatively and quantitatively. On the one hand, we visually assessed the spatial continuity of the filled gaps and the presence of noise. On the other hand, we compared the filled reflectance with the actual reflectance using root mean square error (RMSE), relative RMSE (rRMSE), and the coefficient of determination (R²). Furthermore, we randomly selected about 1000 pixels for bushland, cropland, forest, grassland, and built-up areas based on the “S2 prototype LC map at 20m of Africa 2016” downloaded from the ESA CCI LC 2016² to assess how gap-filling accuracy varies among LULC types.

3.2. Results of simulated gaps

The target image acquired in 2017 with simulated gaps is shown in Fig. 3. All the filled images appear similar to the actual image without clear artifacts. Table 2 shows no obvious differences between the three gap sizes. The mean rRMSE values for the three gap sizes (300×300, 400×400, and 500×500 pixels) were 4.7%, 4.8%, and 4.9%, respectively. The mean R² was greater than 0.90, which indicates good agreement between the predicted and actual reflectance. Our method was also capable of predicting regions where land cover had changed. Subsets of the gap-filled results showed that the built-up area was well predicted without a road in 2014 and with a road in 2017 (see red frames in Fig. 4).

1 https://earthexplorer.usgs.gov/ 2 http://2016africalandcover20m.esrin.esa.int/
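
The three steps of Section 2.3 can be sketched in plain Python for a single spectral band. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`percentile`, `statistical_metrics`, `knn_fill`), the Euclidean distance, the unweighted neighbor average, and the default `k=5` are hypothetical choices, since the text does not specify them. Missing observations are represented by `None`. Computing the six metrics for each of the seven bands listed in Table 1 would give 6 × 7 = 42 metric bands, which is consistent with the number reported in Section 2.3.

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile (q in 0-100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def statistical_metrics(series):
    """Step 1: five percentiles plus the mean of one pixel's valid observations."""
    valid = [v for v in series if v is not None]
    metrics = [percentile(valid, q) for q in (10, 25, 50, 75, 90)]
    metrics.append(sum(valid) / len(valid))
    return metrics


def knn_fill(metrics, target, k=5):
    """Steps 2-3: pair each pixel's metrics with its target value, then predict
    every missing target value as the mean of the k clear pixels whose metric
    vectors are nearest (Euclidean distance; an assumed choice)."""
    pool = [(m, t) for m, t in zip(metrics, target) if t is not None]
    filled = []
    for m, t in zip(metrics, target):
        if t is None:
            nearest = sorted(pool, key=lambda pair: math.dist(m, pair[0]))[:k]
            t = sum(value for _, value in nearest) / len(nearest)
        filled.append(t)
    return filled


# Toy example: four pixels, each with a one-year series of one band (0-1 reflectance).
series = [
    [0.10, 0.12, None, 0.11, 0.13],
    [0.11, None, 0.12, 0.10, 0.13],
    [0.40, 0.42, 0.41, None, 0.43],
    [0.41, 0.40, None, 0.42, 0.43],
]
target = [0.12, None, 0.42, None]  # None marks masked pixels in the target image
metrics = [statistical_metrics(s) for s in series]
filled = knn_fill(metrics, target, k=1)
```

In this toy example, with `k=1`, each masked pixel takes the value of the clear pixel with the most similar annual statistics, so pixels 1 and 3 are filled from pixels 0 and 2. A brute-force neighbor search like this one is only practical for small inputs; a full Landsat scene would call for an indexed or vectorized search.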

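
The three accuracy measures used in the tests (RMSE, rRMSE, and R²) can be written down compactly. The following is a minimal sketch, assuming that rRMSE is the RMSE divided by the mean of the actual reflectance, expressed in percent, and that R² is one minus the ratio of the residual to the total sum of squares. The paper does not state which definitions were used, and the function name `accuracy` is hypothetical.

```python
import math


def accuracy(actual, predicted):
    """Return RMSE, rRMSE (%), and R2 for paired actual/predicted reflectance.

    Assumed definitions: rRMSE = 100 * RMSE / mean(actual);
    R2 = 1 - SS_res / SS_tot (undefined if all actual values are equal).
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    rmse = math.sqrt(ss_res / n)
    return {
        "rmse": rmse,
        "rrmse": 100.0 * rmse / mean_actual,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Reflectance of the simulated-gap pixels before masking vs. after filling.
scores = accuracy([0.10, 0.20, 0.30], [0.11, 0.19, 0.31])
```

Because the gaps are simulated on nearly cloud-free images, the actual values are known for every filled pixel, which is what makes this pixel-by-pixel comparison possible.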
Fig. 3. Gap-filling results based on three different gap sizes in the target image acquired in 2017. Gaps of different sizes were simulated: (a) 300×300, (b) 400×400, and (c) 500×500 pixels. (d)–(f) are the filled images for (a)–(c). Images are true color composites. Yellow wireframes show the areas that were filled.

Fig. 4. Subsets of gap-filled results (500×500 pixels simulated gaps) for images acquired in 2014 and 2017. Built-up area in the (a) original image acquired in 2014, (b) gap-filled result of (a), (c) original image acquired in 2017, and (d) gap-filled result of (c).

Table 2. Accuracy of the gap-filling results for the two experiments (mean values for all spectral bands).
Season       Dry         Dry         Dry         Wet
Gap size     300 pix.    400 pix.    500 pix.    500 pix.
RMSE         0.008       0.008       0.008       0.013
rRMSE (%)    4.7         4.8         4.9         9.6
R²           0.94        0.94        0.93        0.86

In terms of performance in different LULC types, grassland had higher accuracy than other types, particularly for the largest gap size (Fig. 5). The reason may be that grassland has semi-natural vegetation, so the area is relatively homogeneous, resulting in stable spectral reflectance all year round.

Fig. 5. Accuracy assessment of the effect of gap sizes in the five LULC types (dry season image): (a) rRMSE and (b) R².

As for the dry and wet seasons, the dry season image had smaller rRMSE and greater R² (Table 2). The mean rRMSE in the wet season was 9.6%, which is about twice as high as that in the dry season (4.8%). Although R² in the wet season was high (0.86), it was even higher in the dry season (> 0.90). The results suggest that our method performs well for both dry and wet season images, although the accuracy is higher in the dry season.

3.3. Results based on the actual time series

Next, we applied the method to fill gaps in all time series images. The results for three additional images are shown in Fig. 6. All the images appear to be spatially continuous without any remaining gaps or errors. This indicates that the method is robust whether gaps occur in the dry or wet season, in the hills or lowlands, and across homogeneous or heterogeneous areas.

We also extracted the Normalized Difference Vegetation Index (NDVI) time series for five LULC types: bushland, cropland, forest, grassland, and built-up areas (Fig. 7). These examples demonstrated good temporal consistency of the gap-filled values, even when there were only a few cloud-free observations for the particular pixel. Changes in observations are smooth, and the double-peak form of the NDVI seasonality caused by the bimodal rainfall pattern is also well recovered.
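
For reference, NDVI is computed per pixel from the red and near-infrared (NIR) reflectance as (NIR − Red) / (NIR + Red). A minimal sketch follows, assuming 0–1 scaled reflectance as in Section 2.2 and the Landsat 8 OLI band assignment (band 4 for red, band 5 for NIR). The function name `ndvi` and the handling of missing or zero-sum inputs are illustrative assumptions, not details given in the paper.

```python
def ndvi(red, nir):
    """Return NDVI for one pixel, or None if an input is missing or the sum is zero."""
    if red is None or nir is None or red + nir == 0:
        return None
    return (nir - red) / (nir + red)


# One pixel through time: (red, nir) pairs, with None where the pixel was masked.
observations = [(0.05, 0.35), (None, None), (0.08, 0.24)]
series = [ndvi(red, nir) for red, nir in observations]
```

Applied to the gap-filled images, the masked date in such a series would carry a predicted value instead of `None`, which is what allows a continuous NDVI curve to be drawn for each LULC type.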

Fig. 6. Examples of the actual and gap-filled Landsat OLI images from the full time series, displayed in true color composites. The images were acquired on (a) January 11, 2014, (b) May 22, 2015, and (c) January 1, 2016; (d)–(f) are the gap-filled images for (a)–(c), respectively.

Fig. 7. The NDVI time series based on the gap-filled images in five years: (a) bushland, (b) cropland, (c) forest, (d) grassland and (e) built-up areas. Red and blue points correspond to the observed and gap-filled reflectance values, respectively.

4. DISCUSSION AND CONCLUSION

We presented a method to produce a gap-free Landsat time series for a cloud-prone tropical study area in Kenya. We noticed that as the size of the gaps increased, the accuracy remained nearly the same, meaning that the gap-filling method performs well even when a very large fraction of the image is missing. In terms of different LULC types, all five LULC types had R² over 0.97.

Likewise, it is understandable that the results in the wet season are not as accurate as in the dry season because the statistical metrics collected in a wet season are lower in quality. As continuous cloud obstruction often occurs when it rains, the valid observations are mostly obtained from the dry season in the one-year period. Thus, missing values in the wet season are less accurately reconstructed.

The produced gap-free time series provides promising opportunities for remote sensing applications such as monitoring land cover change and mapping the biophysical properties of vegetation. Future studies should focus on testing this procedure in a larger number of study areas on a global scale.

5. ACKNOWLEDGMENTS

This work was supported by the China Scholarship Council. We also thank the Academy of Finland for the SMARTLAND project (decision number 318645). We thank Mr. Clayton Snider for revising the language.

6. REFERENCES

[1] M. A. Wulder et al., "Current status of Landsat program, science, and applications," Remote Sensing of Environment, vol. 225, pp. 127-147, 2019.
[2] F. Gerber, R. de Jong, M. E. Schaepman, G. Schaepman-Strub, and R. Furrer, "Predicting Missing Values in Spatio-Temporal Remote Sensing Data," IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 5, pp. 2841-2853, 2018.
[3] J. Ju and D. P. Roy, "The availability of cloud-free Landsat ETM+ data over the conterminous United States and globally," Remote Sensing of Environment, vol. 112, no. 3, pp. 1196-1211, 2008.
[4] L. Yan and D. P. Roy, "Spatially and temporally complete Landsat reflectance time series modelling: The fill-and-fit approach," Remote Sensing of Environment, vol. 241, p. 111718, 2020.
[5] P. K. E. Pellikka, M. Lötjönen, M. Siljander, and L. Lens, "Airborne remote sensing of spatiotemporal change (1955–2004) in indigenous and exotic forest cover in the Taita Hills, Kenya," International Journal of Applied Earth Observation and Geoinformation, vol. 11, no. 4, pp. 221-232, 2009.
[6] E. Schäfer, J. Heiskanen, V. Heikinheimo, and P. Pellikka, "Mapping tree species diversity of a tropical montane forest by unsupervised clustering of airborne imaging spectroscopy data," Ecological Indicators, vol. 64, pp. 49-58, 2016.
[7] J. Heiskanen, H. Adhikari, R. Piiroinen, P. Packalen, and P. K. E. Pellikka, "Do airborne laser scanning biomass prediction models benefit from Landsat time series, hyperspectral data or forest classification in tropical mosaic landscapes?," International Journal of Applied Earth Observation and Geoinformation, vol. 81, pp. 176-185, 2019.
[8] S. Martinuzzi, W. A. Gould, and O. M. R. González, "Creating cloud-free Landsat ETM+ data sets in tropical landscapes: cloud and cloud-shadow removal," US Department of Agriculture, Forest Service, International Institute of Tropical Forestry, Gen. Tech. Rep. IITF-32, vol. 32, 2007.
[9] P. V. Potapov et al., "Quantifying forest cover loss in Democratic Republic of the Congo, 2000–2010, with Landsat ETM+ data," Remote Sensing of Environment, vol. 122, pp. 106-116, 2012.
