Remote Sensing and GIS Integration for Amazon Rainforest Wildfires Applications

University of South Florida
Digital Commons @ University of South Florida
USF Tampa Graduate Theses and Dissertations USF Graduate Theses and Dissertations
November 2021
Remote Sensing and GIS Integration for Amazon Rainforest Wildfires Applications
Cong Ma
Part of the Geographic Information Sciences Commons
Scholar Commons Citation

Ma, Cong, "Remote Sensing and GIS Integration for Amazon Rainforest Wildfires Applications" (2021).
USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/9175
This Thesis is brought to you for free and open access by the USF Graduate Theses and Dissertations at Digital
Commons @ University of South Florida. It has been accepted for inclusion in USF Tampa Graduate Theses and
Dissertations by an authorized administrator of Digital Commons @ University of South Florida. For more
information, please contact digitalcommons@usf.edu.
Remote Sensing and GIS Integration for Amazon Rainforest Wildfires Applications
by
Cong Ma
A thesis submitted in partial fulfillment
of the requirements for a degree of
Master of Arts in Geography
Department of Geosciences
College of Arts and Sciences
Co-Major Professor: Ruiliang Pu, Ph.D.
Joni Downs Firat, Ph.D.
He Jin, Ph.D.
Date of Approval:
Nov 19, 2021
Keywords: wildfire occurrence, spatial-temporal pattern, geographically weighted regression
Copyright © 2021, Cong Ma

Dedication
I dedicate my thesis work to my lovely family. I am particularly grateful to my loving parents,
whose encouragement and tenacity have inspired me and without whom I would not have been able to
accomplish my study and life abroad. Additionally, I dedicate this thesis to my dearly beloved
friends, who have always supported me and helped me cope with stress throughout my thesis
journey.
Thank all of you.

Acknowledgments
Without the support of my committee, this work would not have been possible. I sincerely
appreciate your help and time. Special thanks go to Dr. Ruiliang Pu for always providing his
support and expert advice. Dr. Pu always made time for discussions with me, corrected my writing
mistakes, and responded to my questions as soon as possible. Furthermore, I would like to
thank the other members of my thesis committee, Dr. Joni Downs Firat and Dr. He Jin, for all the
advice and encouragement that they provided during this process. They imparted a lot of knowledge
about GIS analysis and statistics to me.
I wish to acknowledge NASA LANCE FIRMS for making their wildfire data available. I also
acknowledge NASA LP DAAC, MapBiomas, and GTOPO30 for providing the land cover, land
use, vegetation cover, and elevation data; NASA GES DISC at NASA Goddard Space Flight
Center for providing weather and climate data; WorldPop for providing the population data; and
DNIT for providing the highway and other geographic data.

Table of Contents
List of Tables
List of Figures
List of Abbreviations
Abstract
Chapter 1. Introduction
Chapter 2. Questions and Significance
2.1 Research Questions
2.2 Thesis Significance
Chapter 3. Literature Review
3.1 Background
3.2 Spatial Analysis Tools
3.3 Causes of Amazon Wildfires
3.4 Temporal Distribution Characteristics
3.5 Spatial Distribution Characteristics
3.6 Wildfires Driving Factors
3.7 Scale Effects
Chapter 4. Study Area and Data Sets
4.1 Study Area
4.1.1 Physiography
4.1.2 Climate
4.1.3 Terrain
4.1.4 Forest Resources
4.1.5 Socioeconomic Status
4.1.6 Wildfires Regimes
4.2 Data Sets/Sources
Chapter 5. Methodology
5.1 Overarching Study Design
5.2 Data Preparation and Processing
5.2.1 Dependent and Explanatory Variables
5.2.1.1 Topographic
5.2.1.2 Vegetation Cover
5.2.1.3 Land Use
5.2.1.4 Anthropogenic
5.2.1.5 Meteorological
5.2.2 Projection
5.2.3 Defining Analysis Field
5.2.4 Integration and Collect Events
5.2.5 Create Fishnet
5.2.6 Fill Missing Values
5.3 Exploratory Data Analysis (EDA)
5.4 Measuring Geographic Distribution
5.5 Spatial Density Analysis
5.6 Spatial Pattern Analysis
5.6.1 Average Nearest Neighbor (ANN)
5.6.2 Spatial Autocorrelation
5.6.3 Hot Spot Analysis
5.6.4 Cluster and Outlier Analysis
5.7 Modeling Spatial Relationship
5.7.1 Ordinary Least Squares (OLS)
5.7.2 Poisson Regression
5.7.3 Overdispersion: Quasi-Poisson Regression
5.7.4 Geographically Weighted Gaussian Regression (GWGR)
5.7.5 Geographically Weighted Poisson Regression (GWPR)
5.7.6 Model Evaluation and Comparison
Chapter 6. Results
6.1 Wildfire Data Extraction
6.2 Directional Distribution
6.3 Temporal Distribution
6.4 Spatial Distribution
6.5 Spatial Relationship
6.5.1 Global Model Using OLS
6.5.2 Global Model Using Poisson
6.5.3 Geographically Weighted Gaussian Regression
6.5.4 Geographically Weighted Poisson Regression
6.5.5 Model Evaluation and Comparison
Chapter 7. Discussion
7.1 Spatial Density
7.2 Temporal Pattern
7.3 Spatial Pattern
7.4 Spatial Modeling
7.4.1 Comparison of Prediction Models
7.4.2 Accuracy of Prediction Models
7.4.3 Influence of Drivers on Wildfire Occurrence
7.5 Overdispersion
Chapter 8. Conclusions
Chapter 9. References
Appendix A: Layer Creation Workflows
Appendix B: Spatial Distribution
Appendix C: Spatial Relationship
Appendix D: Supplementary Information
List of Tables
Table 4.1. Sources and descriptions for variables
Table 5.1. Wildfire records (MCD14ml) and their corresponding attributes
Table 5.2. Descriptive statistics of dependent and explanatory variables
Table 5.3. Brazil administrative boundaries
Table 6.1. Summary of variable significance and multicollinearity
Table 6.2. Summary of global OLS results
Table 6.3. OLS diagnostics statistics
Table 6.4. Poisson and quasi-Poisson diagnostics statistics
Table 6.5. Summary statistics for estimated local coefficients from the OLS model and relative spatial variation status
Table 6.6. Summary statistics for estimated local coefficients from the Poisson model and relative spatial variation status
Table 6.7. Comparison of the performance of the four models
List of Figures
Figure 3.1 Number of wildfires in Amazônia Biome
Figure 3.2 Fire triangle
Figure 4.1 Location of the study area
Figure 4.2 Köppen-Geiger climate classification map for the study area
Figure 4.3 States with the most fires
Figure 4.4 Spatial distribution of wildfires of the Amazônia Biome in 2013–2015
Figure 5.1 The logical framework of the study
Figure 5.2 Frequency distribution of wildfire occurrences
Figure 5.3 Wildfires by states in Amazônia Biome
Figure 5.4 Spatial distribution of observed wildfire counts
Figure 5.5 Continuous spatial distribution of observed wildfire counts
Figure 5.6 Spatial distribution of (a) elevation, (b) slope, and (c) aspect
Figure 5.7 Spatial distribution of fractional vegetation cover (FVC)
Figure 5.8 Spatial distribution of forest patches and edge proportion
Figure 5.9 Spatial distribution of pastureland area and cropland area
Figure 5.10 Spatial distribution of road density, population density, and forest loss
Figure 5.11 Spatial distribution of meteorological variables
Figure 5.12 Spatial distribution of MCWD
Figure 5.13 Identification of wildfire hotspots
Figure 5.14 Spatial join for wildfires
Figure 5.15 The quadrants of the cluster and outlier analysis
Figure 5.16 Regression analysis workflow
Figure 6.1 The standard deviation ellipse of wildfire incidents
Figure 6.2 Monthly number of wildfires in the Amazônia Biome in 2019
Figure 6.3 Optimized hot spot analysis for each month
Figure 6.4 Average nearest neighbor summary of wildfires
Figure 6.5 Kernel density map of wildfire incidents in Amazônia Biome in 2019
Figure 6.6 Global Moran’s I summary of wildfire incidents
Figure 6.7 GIZScore map
Figure 6.8 GI_BIN map
Figure 6.9 IDW of GIZScore map
Figure 6.10 Optimized outlier analysis
Figure 6.11 Hotspot analysis at municipality level
Figure 6.12 The synthesized table of the explanatory and dependent variables
Figure 6.13 Moran’s I results of residuals
Figure 6.14 Spatial distribution of significant model coefficients
Figure 6.15 Comparison between original SEs and robust SEs of coefficients
Figure 6.16 Moran scatter plot
Figure 6.17 The standardized residuals from (a) OLS, (b) Poisson, (c) GWGR, and (d) GWPR
Figure 6.18 Spatial distributions of model predictions from (a) OLS, (b) GWGR, (c) Poisson, and (d) GWPR
Figure 6.19 Spatial distribution of the R² and local percent deviance by the (a) GWGR and (b) GWPR
List of Abbreviations
3D - Three-Dimensional
AIC - Akaike Information Criterion
AICc - Corrected Akaike Information Criterion
ANOVA - Analysis of variance
ArcGIS - Aeronautical Reconnaissance Coverage Geographic Information System
ANN - Average Nearest Neighbor
BIC - Bayesian Information Criterion
BRT - Boosted Regression Trees
CHIRPS - Climate Hazards Group InfraRed Precipitation with Station data
CSR - Complete Spatial Randomness
CSV - Comma-Separated Values
CV - Cross-validation Criteria
DEM - Digital Elevation Model
EDA - Exploratory Data Analysis
EP - Edges Proportion
ESRI - Environmental Systems Research Institute
FVC - Fractional Vegetation Cover
Gi* - Getis-Ord Spatial Statistics
GIS - Geographic Information Systems
GLM - Generalized Linear Model
GTOPO30 - Global 30 Arc-Second Elevation
GWGR - Geographically Weighted Gaussian Regression
GWNBR - Geographically Weighted Negative Binomial Regression
GWPR - Geographically Weighted Poisson Regression
GWR - Geographically Weighted Regression
IDW - Inverse Distance Weighted
IGBP - International Geosphere-Biosphere Program
IPCC - Intergovernmental Panel on Climate Change
IQR - Interquartile Range
KDE - Kernel Density Estimation
KS - Kolmogorov-Smirnov
LR - Logistic Regression
MAUP - Modifiable Areal Unit Problem
MCWD - Maximum Cumulative Water Deficit
ML - Machine Learning
MODIS - Moderate Resolution Imaging Spectroradiometer
MSE - Mean Squared Error
NB - Negative Binomial
NDVI - Normalized Difference Vegetation Index
NetCDF - Network Common Data Form
NFP - Number of Forest Patches
NN - Nearest Neighbor
NNI - Nearest Neighbor Index
OHSA - Optimized Hot Spot Analysis
OLS - Ordinary Least Squares
OSM - OpenStreetMap
PPA - Point Pattern Analysis
RF - Random Forest
RS - Remote Sensing
SPP - Spatial Point Process
STSSP - Space-Time Scan Statistics Permutation
TRASP - Transformation of Aspect
UN - United Nations
VIF - Variance Inflation Factor
ZIP - Zero Inflated Poisson
Abstract
Known as the “Lung of the world”, the Amazon rainforest produces more than 20% of the world’s
oxygen. It was originally a carbon pool that mitigated climate change, but in recent years it has
become a significant carbon emitter due to wildfires. In 2019, a large-scale fire occurred in the
Amazon rainforest, causing serious damage to the ecosystem and to humans. Managing wildfires
effectively has therefore become an urgent task for fire authorities. This thesis incorporates
spatial analysis and spatial statistics approaches to study wildfires in the Brazilian Amazônia
Biome in 2019 from two aspects: their temporal and spatial distribution, and their spatial
relationships with driving factors.
First, this thesis uses several spatial analysis approaches to characterize broad-scale
spatiotemporal patterns of wildfire occurrence in the Amazônia Biome. Kernel density, average
nearest neighbor (ANN), spatial autocorrelation (Global Moran’s I), and hot spot analysis were
adopted to assess spatial patterns of wildfire occurrence, such as spatial distribution
characteristics, spatial clustering, and spatial hotspots. Using the ANN tool, statistically
significant clusters of wildfire occurrences were identified at the state level. Global Moran’s I
was employed to determine spatial autocorrelation based on the locations of wildfires, and the
results revealed that the wildfire incidents were aggregated. The Getis-Ord Gi* statistic and the
Optimized Hot Spot Analysis (OHSA) tool were used to identify statistically significant clusters
of high values (hotspots) of wildfire occurrences at the state and municipality levels. Further,
several outliers at the state level were detected using the Optimized Cluster and Outlier
Analysis. Second, spatial statistical models were used to demonstrate differences in the
contributions of driving factors and variation in their effects across locations. The relationship
between the number of wildfires and topographical, vegetation cover, land use, anthropogenic, and
meteorological factors was investigated using ordinary least squares (OLS), global (quasi-)Poisson,
geographically weighted Gaussian regression (GWGR), and geographically weighted Poisson
regression (GWPR) models, and the fit and performance of the models were compared and evaluated.
Results indicate that wildfire spatial patterns exhibit strong regional differences and that
seasonal patterns differ as well. The temporal pattern analysis indicates that the spatial
variation and temporal distribution of wildfires differ among the states of the biome. Therefore,
fire authorities should formulate region-specific fire plans and firefighting schedules. In the
vast and heterogeneous territory of the Amazônia Biome, fire prevention and suppression can be
challenging and require strategic planning to prioritize resources for the most critical areas. In
terms of spatial distribution, wildfire occurrences were concentrated in southern Amazonas,
central Roraima, eastern Acre, northernmost Rondônia, southwestern, southern, and northeastern
Pará, and southeastern Mato Grosso.
The spatial relationship analysis reveals that the distribution and frequency of wildfires in the
Amazônia Biome are strongly influenced by anthropogenic factors (e.g., deforestation and
agricultural activities). In both global models, forest loss, pastureland area, edge proportion,
and the number of forest patches were strongly and positively associated with the number of
wildfire occurrences. As a result, areas with extensive deforestation and agriculture, along
with the wildland-farmland interface, require more attention from fire brigades.
In comparison with the global models, the GWR models provided a more detailed understanding
of spatial relationships between contributing drivers and wildfires. For example, the number of
forest patches was positively related to wildfire counts in eastern Acre, eastern Amazonas,
western Pará, and central Mato Grosso but negatively related to wildfire counts in western
Maranhão. The assessment of model fit and predictions showed that the GWR models fit the data
better than the global models, yielded more accurate and consistent parameter estimates, and
generated more realistic spatial distributions of model predictions.
Chapter 1. Introduction
Wildfires play a significant role in ecosystem management, land cover change, and wildlife
habitat studies (Levine, 1991) because they have the potential to be both beneficial and harmful.
On the one hand, wildfires are a natural part of several ecosystems and maintain their diversity
and health in numerous ways, such as regulating plant succession and fuel accumulation, shaping
the structure and composition of vegetation, controlling insect and disease populations, regulating
metabolic pathways, and affecting the distribution of nutrients and energy. On the other hand,
wildfires result in huge economic losses of forestry resources, as well as numerous environmental
impacts, such as destroying animal habitat and biodiversity (Lovejoy, 1991), modifying vegetation
successional patterns (Christensen, 1993), and altering biological nutrient cycling (Menaut et al.,
1993). Meanwhile, burning vegetation emits large quantities of trace gases (e.g., CO, CO₂, CH₄,
NOₓ), particulates, and smoke into the atmosphere (Andreae et al., 1996). Without proper
management, these impacts will threaten the ecological balance in one of the most megadiverse
regions of the world. In addition, the recovery of forest structure and species composition
probably requires decades or even centuries, which will cause negative impacts on human
livelihoods and well-being, as well as enormous economic damage. In the context of global climate
change, with rising regional temperatures, economic development, and human activities, the
frequency of wildfires has increased significantly. Therefore, understanding the occurrence,
change, and influencing factors of wildfire is a focus of forestry management departments and has
important practical significance for forestry management.
Wildfire is one of the prominent disturbance factors in most vegetation zones throughout the
world, especially in the Amazon. It may be caused naturally or by humans. Most natural
wildfires are caused by lightning and sometimes by volcanic eruptions (Houston, 1973). Humans
can accidentally start wildfires through careless slash-and-burn practices or sparks from
machinery. Many fires are ignited intentionally to clear forest for farming and cattle ranching,
as is the case for some of the wildfires in the Amazon rainforest. The Amazon rainforest is the
world’s largest tropical rainforest, with an area of more than 2 million square miles (Jennifer,
2019). It serves as one of Earth’s largest reservoirs of carbon dioxide, plays a critical role in
global and regional carbon and water cycles, and provides important ecological services to our
planet, such as stabilizing the world’s climate and rainfall patterns, regulating global
temperatures, preventing soil erosion, mitigating air pollution, and maintaining global ecosystem
balance (Bawa and Markham, 1995). Yet even small changes to tropical rainforests may result in
global impacts on climate dynamics and warming and the water cycle, as well as the loss of
biodiversity (Lewis, 2006).
The 2019 fire season was the worst to hit the Amazon Rainforest in over a decade
(Justine, 2019). There were more than 80,000 fires in the Amazon in 2019, a nearly 80
percent jump compared with the same period in 2018, according to data from Brazil’s
National Institute for Space Research (INPE) (BBC, 2019). In view of figures of this magnitude,
it is evident that reducing wildfires is an important issue of public concern. Considering
the increasing sensitivity of environmental issues in the current political climate, Amazon
wildfires have also become an incendiary topic that has garnered international attention
(Rosen, 2019). A NASA study indicates that the Amazon rainforest did not experience sustained
drought that year and that wildfires in the area were mostly due to human activity (Shanna, 2020).
For agricultural purposes, people continue to cut down rainforest and clear the site by burning
tree trunks, branches, and leaves to expand the area of land available for grazing or farming.
This practice has caused uncontrolled fires and released "thousands of tons of carbon" into the
planet's atmosphere, which will contribute to climate chaos. Moreover, such deforestation
activities have also increased forest fragmentation (Soares-Filho et al., 2012), which makes the
microclimate along forest edges drier and raises tree mortality rates; higher mortality in turn
increases fuel loads, finally turning the forest into a fire-prone system.
Furthermore, due to the lack of fire management practices in this region, human
activities have caused wildfires to spread into areas that have not been cleared, and the
affected area encompasses several countries and cities throughout South America. For example,
due to the high biomass of the areas being burnt, the smoke from these fires reached cities as
far away as São Paulo, more than 2,700 km away (Lovejoy and Nobre, 2019). In the long term, such
fires can cause severe environmental, socioeconomic, and public health impacts, bringing not only
huge economic losses of forestry resources but also a host of environmental impacts worldwide.
The Amazon may not belong to all of us, but what happens there affects all of us.
Therefore, understanding wildfire patterns in the Amazon Rainforest and their temporal and
spatial variation is required to better support the planning of sustainable fire management and
risk reduction activities. In addition, knowing the likelihood of wildfire at different locations
helps to strengthen the construction of fire prevention facilities, such as isolation zones, in
fire-prone areas, which has practical significance for wildfire prevention and management.
The main objectives of this study were: (1) to digitize and visualize the wildfire events in the
Amazon Rainforest in 2019 and to analyze their spatial distribution characteristics and
regularity using spatial analysis methods; and (2) to identify the most influential wildfire
drivers and their relative importance for fire occurrence. This thesis aims to improve the
understanding of wildfire patterns and spatial variation in the target areas and thereby support
agencies as they prepare and plan for wildfire and land management activities in the Amazônia
Biome. The thesis is organized as follows:
Chapter 1 - Introduction: This chapter provides the significance, justification, and motivation of
this thesis research.
Chapter 2 - Research Questions: This chapter describes a set of research questions and significance.
Chapter 3 - Literature Review: This chapter introduces the research background context and
development trend of the spatial analysis of wildfire in a macro view, and also offers an overview
of literature related to different types of variables used in wildfire studies and techniques and
methods with remote sensing and Geographic Information Systems (GIS) spatial technologies for
wildfire detection and mapping.
Chapter 4 - Study Area and Data Sets: This chapter introduces the study area, the Amazon
Rainforest, including geographical location, climate condition, terrain feature, socioeconomic
status, forest resources, and wildfires, and a description of the data sets.
Chapter 5 - Methodology: This chapter describes the methods, the conceptual framework, and the
tools used to perform the analysis. The different statistical methods are introduced, and each
step is outlined.
Chapter 6 - Results: This chapter presents the results of the data analysis, including verified results.
Chapter 7 - Discussion: This chapter discusses relevant issues associated with the analysis
results, their contributions to the related literature, limitations, and directions for future
research.
Chapter 8 - Conclusions: This chapter summarizes the conclusions and findings derived from the
analysis results and discussion and provides closing remarks for this thesis.
Chapter 2. Questions and Significance
2.1. Research Questions
Known as the “lungs of the world”, the Amazon Rainforest is reported to produce more than 20% of
the world's oxygen (Maeda et al., 2009). It was originally a carbon sink that mitigated climate
change, but in recent years it has become a significant carbon emitter (Welch, 2021) because of
wildfires. Wildfires in the Amazon Rainforest have been increasing in both spatial extent and
frequency. According to a report of the United Nations (UN) Intergovernmental Panel on
Climate Change (IPCC), fire in tropical forests is a principal source of greenhouse gas emissions
in Brazil, accounting for approximately 75% of the total volume of CO2 released in the
country (MCT, 2004). Massive greenhouse gas emissions exacerbate climate change (Fearnside,
2007), which in turn further intensifies the loss of rainforests. Rainforests can create their own
weather and influence climate around the world. More seriously, this change will be shared by the
whole world and become a “Tragedy of the Commons” (Aishani, 2019). Unfortunately, this fragile
ecosystem faces the constant threat of wildfires caused by natural or anthropogenic factors.
Therefore, managing wildfires effectively has become an urgent task for fire authorities.
Although considerable research has been conducted on the contributing factors and spatial
distribution of wildfires in many regions across the globe, the results differ widely
because of differences in topography, vegetation, meteorology, infrastructure, and other
socioeconomic factors, making generalization challenging. In addition, research in Brazil
currently examines the relationship between wildfire and its driving factors mainly at the
relatively small spatial scales of states (Silva et al., 2020), municipalities (Bustillo Sánchez
et al., 2021), and districts (Bem et al., 2019), and spatial research at a large regional scale is
lacking. To address this gap, this thesis attempts to establish a large-scale wildfire occurrence
prediction model for the Amazônia Biome region and to evaluate the model results, with the aim of
providing a reference for local large-scale wildfire prediction and management.
Given the social and ecological losses caused by wildfires, this thesis integrates
remote sensing and GIS geospatial technologies to address the following research
questions:
1) What is the spatial distribution of wildfires: random, clustered, or dispersed?
2) Where are the hotspots concentrated?
3) Is there a directional trend in the distribution of wildfires?
4) What are the main driving factors that contributed to the Amazonia fires of 2019 (human
activities such as deforestation or agricultural operations, etc.)?
5) Where is spatial variability found? What are the region-specific factors that deviate from
the general trends that predominate in wildfire occurrences?
6) What are the driving factors that can eventually explain the spatially varying parameters?
7) Which models can be used for wildfire prediction? Which model shows better performance and
fit?
2.2. Thesis Significance
The significance of this thesis is threefold. First, this thesis uses several different spatial
analysis approaches to provide information on broad-scale spatiotemporal patterns of wildfire
occurrence in the Amazônia Biome. Second, the thesis illustrates the environmental and
anthropogenic factors driving the spatial distribution of wildfire activity; spatial statistics
models are used to demonstrate differences in the contributions of these factors and variation in
their effects across different ecoregions. In addition to common driving factors such as climate,
topography, and land use, this thesis also introduces several habitat configuration metrics, aiming
to identify relationships between forest fragmentation and wildfire. These issues have rarely been
systematically studied in the Amazônia Biome, especially in active deforestation frontiers, where
the interactions between deforestation, forest fragmentation, and wildfire are evident. The fit
and performance of the models are also compared and evaluated. Third, this thesis aims to
improve the understanding of fire regimes and risk, and the results
have clear implications for land and fire management practices, providing references for fire
managers to formulate effective regional conservation plans and management strategies.
Chapter 3. Literature Review
3.1. Background
Humans have used fire since their earliest history, and the control of fire is one of the most
important developments in the course of human civilization. Fire occupies a unique position in
environmental history. Wildfire is a natural catastrophe, but it is also a result of human
intervention in ecological and environmental conditions. On the one hand, fires can aid some
native species with regeneration, germination, and establishment (Keeley, 2000). On the other hand,
wildfires may injure or kill fire-sensitive vegetation species (Smith and Whelan, 1996). Wildfires
are not only a regional issue but also a global one. Fire affects the carbon cycle and global
climate by altering carbon exchange between land and the atmosphere (David et al., 2009). Because
of its sudden, destructive, and frequent occurrence, wildfire has become one of the major disasters
in the world (Claire, 2019). In various ecosystems around the world, numerous studies have been
carried out to identify the spatial distribution and driving factors of wildfires. This chapter
reviews the importance of wildfires as an ecological disturbance, their effects on human societies
and economies, and the research methodologies and technologies used in wildfire research in
various countries.
Wildfires occur frequently all over the world. Global news reports describe blazes
across the Arctic, the Mediterranean, the Amazon, Africa, Australia, the UK, and
the US. Wildfires have a significant socio-economic impact. The 2018 Camp Fire, for instance,
is the deadliest and most destructive wildfire on record for California. The fire caused at least
85 civilian fatalities, destroyed more than 18,000 structures, and burned 621 km² of land
(Priyanka, 2019). The 2019 Siberia wildfires generated significant publicity; as of 30 July 2019,
the fires covered 26,000 km² (Novaya Gazeta, 2019). By the end of 2019, the Australian
forest fires that started in September had destroyed more than 5 km² of forest area, caused 173
deaths, destroyed more than 2,030 houses, and displaced over 7,562 people, resulting in an
estimated cost of more than $4 billion, seriously affecting Australian ecology and causing
devastating damage to local species (Center for Disaster Philanthropy, 2019). In Latin America,
over 700 people in the Brazilian Amazon were killed by smoke, and at least 92,000 km² of land was
burned, causing estimated damage of US$10 to 15 billion (Cochrane and Laurance, 2002).
The study of the spatial and temporal distribution patterns and driving factors of wildfires has
been an active field of wildfire research in many countries. In temperate high-latitude forests
and Mediterranean climate regions, especially in North America (Bartlein et al., 2008; Gralewicz
et al., 2012; Massada et al., 2007), Europe (Martínez et al., 2009), Australia (Collins et al.,
2015), and South Africa (Silva et al., 2005), research on the temporal and spatial distribution of
wildfire and its driving factors has a long history and is relatively mature. These regions
dominate wildfire research, although wildfire research in China, Brazil (Nogueira et al., 2017),
and Southeast Asia (Thach et al., 2018) has also developed rapidly during the last decade.
Initially, the temporal and spatial distribution of wildfires was mostly analyzed by
combining historical fire records with mathematical statistics (Zheng and Jin, 2013). In recent
years, scholars around the world have mainly studied the spatial-temporal pattern of wildfires
based on historical occurrence data and remote sensing data, using the spatial analysis and
mathematical statistics methods of SPSS, ArcGIS, MATLAB, and other software. Several trends
emerge in spatial and temporal studies of wildfires in terms of research procedures and
methodologies. From the analysis of causal relationships between variables to the comparison of
different statistical techniques, the methods used to understand wildfire patterns share a few
recurring themes. First, research on the spatial distribution of wildfire mainly addresses the
species composition of the forest ecosystem, the dynamics of wildfire, and the pattern of the
vegetation landscape. Second, wildfire variables are varied and expansive, ranging from social and
economic impacts to meteorological effects and clustering patterns. The occurrence of wildfire is
affected by many factors, such as climate, fire source, vegetation type and distribution, and human
activities. In addition, researchers have developed an assortment of research designs that include
hotspot analysis, cluster analysis, and correlation methods.
3.2. Spatial Analysis Tools
With the rapid development of science and technology, GIS and remote sensing (RS) technology are
becoming more and more mature in wildfire prevention. In particular, with the development of
remote sensing technology in the 1980s, the acquisition of large-scale data became
possible. RS technology provides a new way to study wildfire, offering an efficient and
quantitative means of studying fire ecology at multiple scales. With
their wide area coverage and repeatability, as well as information on non-visible spectral regions,
satellite sensors are a very valuable tool for detecting, mapping, and preventing wildfires. In
addition to remote sensing, GIS has helped fire departments reduce risk and increase efficiency.
With its powerful spatial analysis and network analysis functions, GIS can illustrate the spatial
extent of wildfire hotspots and allow fire control personnel to predict where and when wildfires
will most likely occur, which should increase their ability to locate and extinguish wildfires
before they become large. With GIS, complex analysis and modeling of wildfire data can be
performed quickly for monitoring and prediction. In addition,
this information can be shared with other stakeholders through various means of data sharing.
By tying the various spatial analysis functions of GIS to satellite imagery and putting that
information in the hands of trained analysts, the location of a wildfire can be accurately
detected, the extent or perimeter of the wildfire can be estimated, and its severity can be
assessed. Spatial analysis and spatial statistics methods can then be used to determine the
patterns, causes, impacts, and likelihood of wildfires, enabling temporal and spatial analysis
of wildfires and more scientific and effective wildfire management and prediction.
Research on wildfire spatial distribution increasingly favors direct visual
expression; the factors studied have grown from single to multiple, and research methods and
standards are becoming more consistent through mutual reference. To sum up, geospatial tools
comprising GIS and RS are powerful for evaluating spatial fire distribution patterns by
integrating temporal data and are widely adopted as analysis systems for wildfire
management.
3.3. Causes of Amazon Wildfires
Wildfires have always played a crucial role in nature by clearing deadwood and decomposing
foliage, thereby supplying nutrients and space for plants and trees to grow. However, in recent
years, wildfires in the Amazon Rainforest region have been increasing (Laurance et al., 2001) (see
Figure 3.1), causing tremendous damage and disrupting nature’s delicately balanced
ecosystems. Our literature review indicates that a complex mixture of natural and anthropogenic
factors influences the occurrence of large wildfires. Global climate change, deforestation, and
socioeconomic drivers are often considered the main causal agents of Amazon Rainforest
wildfires (Juárez-Orozco et al., 2017).
Figure 3.1. Number of wildfires in Amazônia Biome

Note: Modified from Statista.
Global warming is a key factor here. When the ozone layer is destroyed, ultraviolet light from
the sun hits the forest directly. Then, as the temperature rises, trees dry out, and forests
inevitably experience drought. Dried-out trees catch fire and burn easily (Harris et al., 2008).
Meanwhile, a warmer climate in the tropical
Atlantic decreases precipitation over a wide area of the Amazon, resulting in
droughts and increasing the susceptibility of the rainforest to fire. El Niño events,
caused by warming in the Pacific, have likewise become major forces in wildfire occurrence: more
than 39,000 km² of the Amazon Rainforest was affected by wildfires during the severe drought of
1997 and 1998 (Alencar et al., 2006).
On the other hand, human activity has a significant influence on fire regimes. Aldersley et al.
(2011), for instance, found that human drivers tend to be positively related to burned area in
tropical forests and outweigh climate-induced factors in South America. Deforestation is one of
the causes. Agriculture and ranching are two of the main activities related to wildfires (Sanford
et al., 1985), and agriculture displays the strongest association with wildfires (Cochrane, 2013).
Some people clear the forest to settle, and rather than cut down the trees, many choose to burn
them (Nepstad et al., 2006). Silva Junior et al. (2018) analyzed the association between
deforestation-induced fragmentation and wildfire occurrence in Central Brazilian Amazonia; the
results indicated that the frequency and intensity of wildfires are affected by forest loss and
forest fragmentation. In forest areas, about 95% of active fires were found within one kilometer
of the forest edge.
Socio-economic drivers are another cause. Some wildfire incidents may be caused by mining
activities (shown as a stem of the rose pattern) that take place along the banks of the river that
passes through the Amazon (Sanford et al., 1985). Also, in Brazil, the government is pushing
through an ambitious infrastructure plan that would convert Amazonian waterways into electricity
producers (Voices, 2019).
3.4. Temporal Distribution Characteristics
Several biotic and abiotic factors determine the occurrence of a wildfire. Nevertheless, the
effects of each factor differ between ecosystems and between spatiotemporal scales (Ren et al.,
2014). Many regions of the world have long strengthened their research on wildfire
issues. A number of statistical approaches and models have been developed to
understand the spatial-temporal pattern and driving factors of wildfire occurrence and to
predict potential threats to lives, property, and the environment.
The assertion that 80% of all information has some geographic reference is well known
among geoscientists (Robert, 1987). The occurrence of wildfires also has a spatial component.
Wildfires are not randomly distributed but follow a certain spatial distribution
pattern. The temporal and spatial distribution of wildfire is influenced by various factors and
shows a certain regularity, especially within a specific time and space: it often exhibits
continuous periodicity in time and aggregation in space (Bai and Ma, 2017).
Garcia et al. (1995) employed a logit model to predict the number of fire-days in the
Whitecourt Forest in Alberta. Keane et al. (2003) analyzed the spatial and temporal distribution
characteristics of wildfires, using time series to analyze trends in the monthly number of
wildfires and the burned area; the results showed that
fires had obvious seasonal characteristics. Lee et al. (2006) showed that the average number
of wildfires in South Korea was 429 per year from 1970 to 2003 and that March and April were the
months with the most wildfires. Yang et al. (2009) studied the temporal
and spatial distribution of wildfire in Beijing from 1986 to 2006 using statistical analysis
methods and GIS spatial analysis methods. Their results indicated that the annual number of
wildfires differed significantly from year to year between 1986 and 2006, but the trend was
generally downward, and that Beijing wildfires
mainly occurred in the spring. Wittenberg and Malkinson (2009) studied the variation
characteristics of wildfire hotspots in the Mediterranean region, and the results showed that
wildfires became more frequent over time. Wen et al. (2009) analyzed the burned area
and frequency of wildfire in Guangxi, its distribution by season, month, and city,
as well as influencing factors such as atmospheric circulation, terrain, drought, and oceanic
currents. In addition, several studies have demonstrated a temporal association between wildfire
and human activities in the Brazilian Amazon rainforest (Bucini and Lambin, 2002; Morton et al.,
2008). In the future, as human development continues to expand into undeveloped areas, wildfire
has the potential to become more frequent and more costly.
Structural models and assessments based on long-term historical wildfire records play a key
role in identifying and revealing the functions of the most significant wildfire drivers. At
present, although there are studies on the temporal variation of wildfires, the data mainly come
from statistical yearbooks, and little research is devoted to the spatial location of wildfires.
Consequently, research on the temporal distribution and spatial heterogeneity of wildfires
in different ecological regions is also lacking. Hence, building on the above research, this
thesis takes the Amazônia Biome as its study area and uses remote sensing data to analyze the
spatial-temporal variation characteristics of wildfires in different regions of the study area.
3.5. Spatial Distribution Characteristics
The studies above share a focus on the temporal regularity of wildfires
at time scales of year, month, and day. The spatial heterogeneity of wildfire refers to its spatial
distribution characteristics in a certain area. Maingi and Henry (2007) analyzed the spatial
variation of wildfire in terms of abiotic and human factors. They pointed out that, among abiotic
factors, the spatial distribution of wildfire was correlated with drought severity index, slope,
aspect, and altitude. Among the human factors, fire locations were closely related to the
distance from highways or residential areas. Aragão et al. (2007) used the NOAA-12 data set
to extract burn scars in the Brazilian Amazon and analyzed the response characteristics of
precipitation and deforestation to wildfires. Romero-Calcerrada et al. (2008) investigated the
socioeconomic factors related to wildfires in the southwestern area of Madrid, Spain. Rather than
a hot spot analysis, a Bayesian method was used, and different socioeconomic variables were
incorporated with conditional probabilities to determine the spatial association between
socioeconomic factors and wildfire occurrence.
Podur et al. (2003) applied spatial point pattern analysis to analyze the spatial pattern of
lightning fires in Ontario from 1976 to 1998 and found that the fire locations were
spatially clustered. Wittenberg and Malkinson (2009) analyzed the spatial and temporal
distribution characteristics of wildfires in mixed pine forests in the Mediterranean region by
analyzing GIS data and establishing a prediction model. They pointed out that the spatial
distribution of wildfires on Mount Carmel in Israel was not random and that fire locations in
this area were clearly closer to roads. Cáceres (2011) identified wildfire risk
zones from FIRMS fire hotspots reported between 2003 and 2009 in the Yeguare region,
southeastern Honduras, and modelled fire hotspots through Gi(d) statistics to determine how and
to what extent commonly known driving factors contribute to fire occurrences.
Orozco et al. (2012) used the space-time scan statistics permutation model (STSSP) and a GIS
to identify hot spots in wildfire sequences in Switzerland. Moreira de Araújo et al. (2012) used
the MCD45A1 and TRMM rainfall data sets to explore the temporal and spatial distribution patterns
and influencing factors of the burned areas in different ecological zones in Brazil and concluded
that the precipitation anomaly caused by the La Niña phenomenon was the main reason for the
inter-annual dynamics of the wildfire burned areas in Brazil. Wing and Long (2014) examined
patterns in the spatial and temporal distribution of large fires in Oregon and Washington over 25
years. These patterns and relevant driving factors (temperature, rainfall, etc.) were
investigated using GIS methods. The researchers performed an average nearest neighbor analysis
and quadrat analysis, and hot spot analysis using Getis-Ord Gi* and Moran's I was also performed
to identify significant clustering. It was concluded that over the 25 years, fire frequency,
magnitude, and duration had increased. Based on the wildfire situation in
Portugal in 2001 and 2014, Parente et al. (2018) studied the spatial distribution characteristics
and driving factors of wildfires in Portugal and found that the driving factors affecting
wildfires differ between the south and the north.
Although the above studies have explored the spatial distribution characteristics of wildfires,
these characteristics differ between regions, and the spatial distribution
of wildfires estimated by different models in the same region also differs. Therefore, building on
previous studies, this thesis uses wildfire data monitored by remote sensing satellites to
determine the spatial patterns and directional trends of wildfires, measured mainly by the Nearest
Neighbor Index (NNI), Moran’s I, Getis-Ord Gi*, Kernel Density Estimation, and directional
distribution analysis.
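As a rough illustration of the first of these measures, the sketch below computes a Nearest Neighbor Index in the common Clark-Evans form: the observed mean nearest-neighbor distance divided by the distance expected under complete spatial randomness. It is a minimal sketch and not the procedure used in this thesis. The function name, the sample coordinates, and the study-area size are hypothetical, coordinates are assumed to be in a projected (planar) system, and no edge correction is applied. Values below 1 suggest clustering, values near 1 suggest randomness, and values above 1 suggest dispersion.

```python
import math

def nearest_neighbor_index(points, area):
    """Clark-Evans style nearest neighbor index for planar (x, y) points.

    points: list of (x, y) tuples in projected units (assumption)
    area:   size of the study area in the same squared units (assumption)
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    # Observed mean distance from each point to its nearest neighbor (brute force).
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    observed = total / n
    # Expected mean nearest-neighbor distance under complete spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical fire locations in a 5 x 5 window: a regular grid and a tight cluster.
grid = [(x + 0.5, y + 0.5) for x in range(5) for y in range(5)]
cluster = [(2.0 + 0.01 * i, 2.0 + 0.01 * j) for i in range(5) for j in range(5)]
nni_grid = nearest_neighbor_index(grid, 25.0)        # dispersed pattern, index above 1
nni_cluster = nearest_neighbor_index(cluster, 25.0)  # clustered pattern, index below 1
```

The brute-force search is quadratic in the number of points, so for large hotspot data sets a spatial index would be a more practical choice.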
3.6. Wildfires Driving Factors
Climate change is projected to alter evaporation and precipitation patterns around the globe,
resulting in wetter climates in some areas and drier climates in others. As droughts become
increasingly severe, wildfires will also become larger and more widespread (Jessica, 2019).
Wildfire not only has a considerable impact on the ecological environment but also poses a
significant threat to the safety of people's lives and property (Associated Press, 2019). Yet wildfire
regimes and patterns are varied in both space and time as a result of the interaction of a multitude
of factors. A comprehensive understanding of the spatial distribution of
wildfire occurrences and of the underlying reasons behind them is critical
for effective wildfire management. To define effective strategies for reducing the impacts of
wildfires, it is crucial to have robust information about the nature of wildfires, especially
regarding the underlying causes and conditions that promote a wildfire. Strong background
knowledge, based on awareness of risk and influencing factors and combined with experience and
relevant literature, is needed to design and operate appropriate prevention programs. However, such
knowledge is not yet sufficiently developed for application in the real world. For instance,
Brazil’s National System for Forest Fire Prevention and Control has tried to allocate fire control
resources by evaluating the municipalities that experienced the most fires in previous years, but
the results were unsatisfactory, with an accuracy rate of only 30% (Morello et al., 2020).
Exploring wildfire drivers is crucial to developing wildfire prediction models. The spatial
pattern of wildfires is affected by many factors, and wildfire drivers are generally divided into
four categories: meteorology, vegetation, topography, and human activities (including
infrastructure and socio-economic factors) (Ali et al., 2009; Schulte and Mladenoff, 2005).
Wildfires can be divided into natural and man-made wildfires (Prestemon et al., 2013). Studies of
wildfire regularity show that natural factors include relative humidity, wind, temperature,
precipitation, and other meteorological conditions, which are the basic conditions for the
occurrence of wildfire, whereas man-made factors are the inducing factors of wildfires (Mhawej et
al., 2015). With continuing socio-economic development in recent years, human-caused wildfires and
total wildfires have been found to follow similar patterns in many areas (Sen and Hu, 2002). In
recent years, scholars in various countries have carried out extensive research on the
methodological schemes, driving factors, and scales of wildfires and have made considerable
progress.
Genton et al. (2006) used point pattern analysis to study the spatio-temporal structure of
wildfire occurrences in north-eastern Florida. The results revealed that occurrences by lightning,
arson, and railroad are spatially more clustered than occurrences by other accidental causes. In
Mexico, Román-Cuesta and Martínez-Vilalta (2006) used spatial analysis and spatial statistics
methods and identified a significant relationship between road density, agricultural land
expansion, and wildfire within Biosphere Reserves. According to Rodriguez et al. (2008), a
combination of factors such as maximum wind speed, number of producers, and education level is the
most prominent influence on wildfires in Mexico. Moreover, in the State of Durango, Mexico,
research conducted on the spatial pattern of wildfire occurrence in Las Bayas Forest Reserve
suggested that wildfires are related to extreme weather (Drury and Veblen, 2008).
Furthermore, in terms of modes of influence, meteorological factors are generally considered to
be the decisive factors affecting the occurrence of wildfires (Minnich and Bahre, 1995). Vegetation
is the fuel in the "three elements of combustion" (see Figure 3.2) and directly affects
ignitability (Pew and Larsen, 2001). Topography affects the amount and structure of
vegetation and thus the likelihood of a wildfire, as well as the speed and direction of
wildfire spread. Human activities affect the spatial distribution and frequency of wildfires
through building expansion, transportation network construction, and outdoor activities (Cardille
et al., 2001). These factors also influence each other in various ways. The study of these factors
and their interactions can provide a more detailed analysis of the spatial distribution and
probability of wildfires.
Although the models discussed above are well-defined and conceptually useful, it is not easy
to identify the true fire determinants in a given environment, and the interaction between
bottom-up and top-down controls is often complex and variable (Parisien and Moritz, 2009). In
addition, the relationship between fire and its determinants varies not only by scale but also by
region. All of this makes site-specific case studies necessary.
Figure 3.2. Fire triangle.
Note: Modified from (Hannah, 2019).
Since the 1970s, research on the driving factors and distribution patterns of wildfire has been
mostly based on mathematical and statistical methods (Castro and Chuvieco, 1998; Cunningham
and Martell, 1973; Gill et al., 1987; Liu et al., 2018; Martell et al., 1987). The probability of
wildfire occurrence is calculated by establishing a mathematical model of the relationship between
wildfire and various factors such as weather, terrain, and human activities. The main methods
include artificial neural networks (Bisquert et al., 2012b), the MaxEnt algorithm (Martín et al.,
2019), classification and regression tree models (Lozano et al., 2008), the Poisson regression
model (Dayananda, 1977), Negative Binomial (NB) regression (Chas-Amil et al., 2015), Zero Inflated
Poisson (ZIP) models, and the Ordinary Least Squares (OLS) model (Guo et al., 2010). Moreover, the
Logistic Regression (LR) model is also a common method in the study of wildfires. This model has
high prediction accuracy for the occurrence of wildfires and is applied frequently
(Koutsias et al., 2010a; Rodrigues et al., 2014b; Rodrigues et al., 2018).
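To make the logistic regression idea concrete, the following minimal sketch shows how such a model turns a linear combination of driving factors into an occurrence probability between 0 and 1. The function name, the two covariates, and all coefficient values are hypothetical and chosen only for illustration; they are not estimates from any of the cited studies.

```python
import math

def fire_probability(intercept, coefficients, covariates):
    """Logistic model: P(fire) = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two illustrative covariates,
# e.g. a dryness index and a distance-to-road measure (both assumptions).
coefs = [0.8, -0.5]
p_baseline = fire_probability(0.0, coefs, [0.0, 0.0])  # linear predictor 0 gives 0.5
p_dry_site = fire_probability(-2.0, coefs, [3.0, 0.2])  # positive predictor, above 0.5
```

In practice the coefficients would be estimated from observed fire and non-fire locations by maximum likelihood rather than set by hand.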
Cunningham and Martell (1973) studied the occurrence of man-made forest fires during the
summer fire season in Ontario and developed a simple Poisson model to describe the distribution
of man-made fires. The results indicated that the average number of fires per day depends on the
Fine Fuel Moisture Code. Martell et al. (1987) used logistic regression analysis techniques to
predict the probability of a fire day and applied Poisson regression to model daily occurrences
of forest fires caused by humans in the Northern Region of the province of Ontario. The results
indicated that this procedure works well during relatively wet periods. Castro and Chuvieco (1998)
obtained the coefficients of fire danger variables from multiple regression models in the county
of Valparaiso, Chile, and presented fire danger maps using a Geographic Information System.
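The Poisson approach described above can be sketched as follows. Given an average number of fires per day, the Poisson distribution gives the probability of observing exactly k fires, and the probability of a "fire day" (at least one fire) follows directly. This is only an illustrative sketch: the rate value is hypothetical, and how the rate would be derived from the Fine Fuel Moisture Code is not specified here.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k fires in a day when the daily mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def prob_fire_day(lam):
    """Probability of at least one fire in a day: 1 - P(0 fires)."""
    return 1.0 - math.exp(-lam)

lam = 0.7  # hypothetical mean number of fires per day (assumption)
probs = [poisson_pmf(k, lam) for k in range(4)]  # P(0), P(1), P(2), P(3) fires
p_any = prob_fire_day(lam)
```

A Poisson regression extends this by letting the logarithm of the daily mean depend linearly on covariates such as fuel moisture or weather indices.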
Wotton et al. (2010) applied Poisson regression analysis to develop a way to predict
the daily number of fires occurring in each of the ecoregions within the fire management region
of Ontario. Del Vilar Hoyo et al. (2011) used binary logistic regression and kernel
estimation models to analyze the occurrence and density of human-caused wildfires in Madrid,
Spain. It was found that the binary logistic regression model performed slightly better. Oliveira
et al. (2012) used multiple linear regression and Random Forest models to explore the spatial
pattern of wildfires in the Mediterranean region. Results indicated that the Random Forest model
provided better predictive ability than multiple linear regression.
Bisquert et al. (2012a) chose the Galicia region (north-west of Spain) as the study area to
create fire danger models with artificial neural networks and logistic regression. Rodrigues and
La Riva (2014) explored the use of machine learning (ML) models for the assessment of
occurrences of human-caused wildfires. They modeled man-made fires in Spain based on
non-meteorological factors. The results showed that the Random Forest (RF) and Boosted Regression
Trees (BRT) algorithms produced better-fitting models.
Faivre et al. (2014) modeled ignition occurrence and frequency using linear and
Poisson regression to analyze wildfires in Southern California. According to the models,
proximity to buildings, distance from the nearest road, and slope were the
most important determinants of ignition frequency. Chas-Amil et al. (2015) modeled counts of
deliberately set fires in Galicia, Spain, using a Generalized Linear Model (GLM) with an NB2
negative binomial specification. Wildfire risk was found to be greater in areas
with very dense populations and development/urbanization pressures, as well as in areas with
unattended forests.
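To make the count models reviewed above concrete, the following minimal Python sketch fits a log-linear Poisson regression by iteratively reweighted least squares. It is an illustration only, not the implementation used in any of the cited studies: the data are synthetic, and the predictor name `road_dist` and the coefficient values are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit a log-linear Poisson regression by iteratively reweighted least squares."""
    X = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-9)           # start from the mean count
    for _ in range(n_iter):
        eta = X @ beta                          # linear predictor
        mu = np.exp(eta)                        # expected counts (log link)
        z = eta + (y - mu) / mu                 # working response
        WX = X * mu[:, None]                    # IRLS weights equal mu for the log link
        beta_new = np.linalg.solve(X.T @ WX, WX.T @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic example: fire counts that decline with distance to the nearest road
rng = np.random.default_rng(0)
n = 2000
road_dist = rng.uniform(0, 5, n)                # hypothetical predictor (km)
true_beta = np.array([1.0, -0.5])               # hypothetical intercept and slope
counts = rng.poisson(np.exp(true_beta[0] + true_beta[1] * road_dist))

beta_hat = fit_poisson(road_dist.reshape(-1, 1), counts)
print("estimated intercept and slope:", beta_hat)
```

A negative binomial model would add a dispersion parameter to the same linear predictor, which is why it is preferred when counts are overdispersed relative to the Poisson assumption.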
The above methods are global mathematical models (i.e., they assume that the relationship
between the dependent variable and the explanatory variables is spatially stable, so that a single
equation with one set of parameters can describe the entire study area), which limits their ability
to explain nonlinear relationships (Rodrigues et al., 2018).
Nevertheless, fire is a spatially complex phenomenon, and there is a complex nonlinear
relationship between wildfire occurrence and some driving factors. The term nonstationarity or
spatial heterogeneity refers to the process whereby responses vary with geography, size, or other
attributes of the spatial units (Anselin and Griffith, 1998).
As research has deepened, many scholars have recognized that spatial heterogeneity must be
taken into account, because the assumption of stationarity is often violated in real-world situations,
leaving spatial models affected by spatial heterogeneity (Brunsdon et al., 1996). Considering
the varying importance and contributions of explanatory factors across space, the application of
global regression methods over large areas might not be appropriate because stationary coefficients
are applied to the entire research area, which could obscure interactions between explanatory
factors at the local level.
Thanks to spatial statistical models, land managers can now quantify uncertainties in older
mathematical models and develop additional analytical tools that will aid in their response to
wildfire occurrences. Fotheringham et al. (2003) proposed geographically weighted regression
(GWR) as a means of modeling relationships that are spatially non-stationary, that is, relationships
that vary across locations. GWR is therefore used for local modeling.
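The idea of local modeling can be sketched in a few lines of Python: at every observation, a weighted least-squares fit is computed in which nearby points receive larger weights. This is a simplified illustration, assuming a fixed Gaussian kernel and a hand-picked bandwidth; the function name, the synthetic data, and the bandwidth value are hypothetical, and dedicated GWR software additionally selects the bandwidth and reports diagnostics.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local weighted least squares at every observation (fixed Gaussian kernel)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])       # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)   # distances to point i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian distance decay
        XtW = Xd.T * w                                   # X^T W without forming W
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)    # local coefficient vector
    return betas

# Synthetic example: the slope of y on x drifts from west to east
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
local_slope = 0.5 + 0.2 * coords[:, 0]          # hypothetical non-stationary effect
y = 1.0 + local_slope * x + rng.normal(scale=0.1, size=n)

betas = gwr_coefficients(coords, x, y, bandwidth=1.5)
west = betas[coords[:, 0] < 3, 1].mean()
east = betas[coords[:, 0] > 7, 1].mean()
print("mean local slope, west vs east:", west, east)
```

A global regression would return one slope for the whole area, whereas the local estimates recover the west-to-east drift that was built into the synthetic data.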
Koutsias et al. (2005) applied the OLS regression model, GWR model, classical logistic
regression model, and GWR logistic model to analyze the occurrence of wildfire in southern
Europe. Their results indicate that GWR models were more appropriate than the classical global
ones. Avila-Flores et al. (2010) used the GWR method to determine contributing factors that are
spatially associated with the wildfire occurrence in the State of Durango, Mexico. Results showed
that the land-use change is one of the main explanatory variables for the spatial pattern of wildfires
throughout the study area. There was a positive correlation between land-use intensity and forest
fire frequency. Furthermore, vegetation type and precipitation play an important role as well.
Martínez-Fernández et al. (2013) used two different statistical methods to evaluate human-
caused fires in Spain: multiple linear regression and binary logistic regression. The aim was to
identify factors associated with fire density and fire presence, respectively. Further, they utilized
GWR to explore and better understand the local and regional variations of those factors
contributing to man-made fire occurrence. Rodrigues et al. (2014a) used GWR methods to develop
models of the varying spatial relationships between man-made wildfires and the explanatory
variables. According to the results, high wildfire occurrence rates were primarily associated with
wildland-agricultural interfaces and wildland-urban interfaces.
The literature reviewed above reveals that spatial statistics provide important insights into the
behavior and distribution of wildfires, especially wildfires caused by humans. Many models of
wildfire occurrence provide a good understanding of the physical and meteorological factors (such
as topographic slope or fuel moisture content) that influence and promote wildfire spread and
recurrence.
Several studies have shown that wildfires are more likely to occur near places frequented by
humans and machinery than elsewhere, and these patterns can be explained by many socio-
economic variables. Nevertheless, the literature on this topic is predominantly site-specific due to
the complexity of predicting human behavior, both in space and time (Krawchuk et al., 2009; Le
Page et al., 2010; Martínez et al., 2009).
In addition, due to the regional spatial heterogeneity, the previous global models may produce
large errors in wildfire prediction. GWR can effectively deal with the problem of spatial non-
stationarity. It has been extensively used in a variety of topics (Chalkias et al., 2013; Wang et al.,
2013; Xiao et al., 2013) and specifically in wildfire science. It is also expected that, as technology
and data processing capabilities continue to improve, the spatial and temporal study of wildfires
will become more precise and valuable to those who require it.
This thesis study aims to identify the key drivers of wildfire occurrence,
to analyze spatiotemporal changes in the contribution of those drivers in the
Amazon Rainforest, and to provide deeper insights into the influence of fire characteristics.
3.7. Scale Effects
Point patterns are scale-specific. The multi-scale effect on the distribution characteristics of
environmental factors is one of the core issues in ecological research. The conclusions of a study
of an ecological problem or phenomenon largely depend on the spatial scale adopted by
researcher(s). The Modifiable Areal Unit Problem (MAUP) also affects the analysis of spatial
patterns of wildfire data (Dark and Bram, 2007).
In wildfire spatial analysis, it is common to estimate and use aggregate statistics to describe
wildfire patterns at different temporal and spatial scales (e.g., from local to global, city to country,
and short to long term). However, the aggregation scheme is arbitrary and modifiable, which gives
rise to both scaling and zoning problems (Openshaw, 1981).
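The scaling and zoning problems can be illustrated with a small Python sketch that sums the same fine grid of fire counts into coarser zones. The grid size, the background rate, and the position of the cluster are all hypothetical values chosen for illustration.

```python
import numpy as np

def aggregate(grid, factor):
    """Sum a 2-D count grid into coarser cells of factor x factor fine cells."""
    r, c = grid.shape
    return grid.reshape(r // factor, factor, c // factor, factor).sum(axis=(1, 3))

rng = np.random.default_rng(2)
fine = rng.poisson(0.05, size=(64, 64))                 # sparse background counts
fine[10:14, 40:44] += rng.poisson(3.0, size=(4, 4))     # one hypothetical fire cluster

for factor in (1, 4, 16):
    coarse = aggregate(fine, factor)
    zero_share = (coarse == 0).mean()                   # share of cells with no fire
    print(factor, coarse.shape, round(float(zero_share), 3), int(coarse.max()))
```

The total number of fires never changes, but the share of empty cells and the largest cell count depend on the cell size (the scaling problem). With a factor of 4, the cluster in rows 10 to 13 is also split between two zones because the zone boundary falls at row 12 (the zoning problem).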
Scale is the unifying concept among all scientific disciplines, and it is also a common
challenge. It is well known in fire ecology that patterns and processes are scale-dependent. The
wildfire spot is a kind of point feature. The spatial distribution pattern of wildfire is related to the
spatial scale. As a phenomenon's scale changes, new processes and patterns are revealed, and
influencing factors alter. The characteristics of wildfire distribution observed at small spatial scales
may not exist at large spatial scales, and vice versa.
In a heterogeneous landscape structure, wildfire patterns are the result of the complex
interaction of climate, terrain, vegetation type, fuel moisture content, infrastructure, and other
factors. Scale in fire ecological systems can be interpreted as an organized interactive complexity.
However, depending on a study's purpose and on the zoning method used to partition the data,
these factors show strong heterogeneity across spatial scales, which leads to different
distribution patterns of wildfire at different spatial and temporal scales, and a process at any one
scale may be influenced by factors at other scales. Because the results change with the chosen
partition, concerns about bias, reliability, and validity can arise.
Therefore, for wildfire patterns to be fully understood, a multi-scale approach to analyses
should be considered. Gonzalez-Olabarria et al. (2011) analyzed the spatial distribution
characteristics of wildfires in Catalonia at the municipal, county, and 1000 km² hexagonal grid
scales. Their research pointed out that multi-scale analysis can not only reflect the changing trend of
wildfires in the overall space but also highlight the variation characteristics of wildfires in local
space.
The factors affecting the occurrence of wildfires are non-stationary at different spatial scales
(Parisien and Moritz, 2009). Wildfires are regulated by top-down and bottom-up factors across a
range of spatial and temporal scales.
To be specific, at a large scale (continental scale), wildfires are mainly controlled by fire
sources, vegetation, and climate, which are considered as a ‘top-down’ control (Morgan et al.,
2001). At a mesoscale (landscape scale), the occurrence of wildfires is mainly controlled by
weather, terrain, human activities, and fuel, which is a ‘bottom-up’ control because they are
spatially more variable and are associated with local fire patterns (Brown et al., 1999; Rollins et
al., 2002). At a small scale (stand scale), the spread of wildfires is mainly controlled by
combustibles, microtopography, and microclimate (Yang et al., 2008). The scale determines the
level at which a phenomenon is observed. It is crucial to use correct scales to detect the phenomena
of interest and to generate relevant results.
Chapter 4. Study Area and Data Sets
4.1. Study Area
4.1.1. Physiography
The Amazon Rainforest is located in the Amazon basin of South America, with a total land area
of 4.2 million km², accounting for half of the total area of the earth's rainforests. The area chosen
for this thesis study (See Figure 4.1) is the tropical forest biome within the Brazilian Amazônia
Biome, which encompasses the states of Acre, Amapá, Amazonas, Pará, Rondônia, Roraima,
Tocantins, Mato Grosso, and part of Maranhão (west of the 44°W meridian).
The Amazônia Biome is recognized as a heterogeneous region, with different land uses and vegetation
types (Malhi et al., 2008). With about 5 million km², it covers almost 25% of South America,
extending into eight adjacent countries, and over half of Brazil's land area. It sustains 40% of the
world's remaining tropical forests, and one-third of the world's germplasm is restricted to this biome. In
2016, the total forest cover in this biome was estimated to be 3,494,643 km². It is bounded by the
Guiana Highlands to the north, the Andes Mountains to the west, the Brazilian central plateau to
the south, and the Atlantic Ocean to the east.
Figure 4.1. Location of the study area.
Note: The red line indicates the Amazônia Biome; the green line indicates the Legal Amazon.
4.1.2. Climate
The tropical rainforest is humid throughout the year. By virtue of its tropical location, the
Amazon forest has no periodic seasons. According to the Köppen-Geiger
climate classification, the study area can be subdivided into three sub-climatic regions (See Figure
4.2): tropical rainforests (“Af”), tropical monsoon (“Am”), and tropical savannah (“Aw”) (Beck et
al., 2018). In general, the climate is characterized by warm temperatures (24 to 27°C
year-round), high rainfall (around 3200 mm/year), and high relative humidity (80% to 94%
throughout the year) (Hilker et al., 2014). The seasonality of rainfall and the rapid transition
between wet and dry seasons are primarily caused by the South American Monsoon System, which
is controlled by the sea surface temperature close to the Equator (Nobre et al., 2009). Within this
study area, intra- and interannual rainfall variability is governed by sea surface temperatures in the
Pacific and Tropical Atlantic (Nobre et al., 2009).
Figure 4.2. Köppen-Geiger climate classification map for the study area.
4.1.3. Terrain
The Amazon rainforest lies on an alluvial plain with a surface elevation of less than 100
meters, with lowland plains particularly around the rivers. There is a highland area along the border
between Brazil, Venezuela, and Guyana called the Guiana Shield. The southern Amazon highlands
comprise parts of Rondônia and Mato Grosso, as well as the southern parts of Amazonas and Pará.
Within the biome, the highest point is Neblina Peak (3040 meters) in northern Brazil.
4.1.4. Forest Resources
In the Amazon rainforest, trees are the most common plants; the number of trees has been
estimated at 3.9 × 10^11 with approximately 16,000 species where palms and members of the
families Myristicaceae and Lecythidaceae account for half of them (Steege et al., 2013). Plants in
the Amazon are vital in regulating the ecosystem. As plants in the Amazon photosynthesize, they
create their own weather.
4.1.5. Socioeconomic Status
The Amazon has a long history of human settlement, but over the past few decades, the pace
of change in the Amazon has accelerated due to a growing population, the introduction of
mechanized agriculture, and the integration of the Amazon region into the global economy. Due
to settlers' clearing of the Amazon for lumber and grazing pastures and cropland, the size of the
Amazon forest shrank dramatically. It has been estimated that 1.4 million acres of forest have been
cleared since the 1970s. The rainforest once covered 14% of the Earth's surface and now covers 6%
(Butler, 2001). Many areas are already affected by selective logging
and forest fires. In today's global economy, the Amazon is a major producer of cattle, beef,
leather, timber, soy, oil and gas, and minerals, which are exported widely to China, Europe,
the United States, Russia, and other countries. This shift has had a substantial impact on the
Amazon. The USDA reports
that Brazil is the world's largest exporter of soybeans and the world's top producer of cattle and
veal. Human activity has damaged the Amazon Rainforest in many ways. In exchange for economic
growth (i.e., agribusiness), a great deal of fire-setting and deforestation occurs here (Aishani, 2019).
4.1.6. Wildfires Regimes
Rainfall and wildfire occurrence over the study area are highly
seasonal. The dry season extends from July to September in non-drought years for most of the
Brazilian Amazon (Phillips et al., 2019), and satellites detect the majority of active fires from July to
September, coinciding with the dry season (Anderson et al., 2010). Spontaneous fires are
rare in humid rainforests because intact old-growth forests are able to maintain
microclimates moist enough to prevent fire spread even during dry seasons. According to prior
research, wildfire events in the Amazon Rainforest were historically associated mainly with extreme
drought events, such as those linked to El Niño, which has affected the region since pre-Columbian times
(Meggers, 1994).
However, the Amazon rainforest has not seen sustained severe dry weather in recent years;
thus, scientists think most fires in the Amazon rainforest are man-made, not natural (Rebecca,
2021). These fires can occur for many reasons, such as deforestation, agricultural burning, and illegal
logging. For example, workers often use a method called slash-and-burn to
clear forests to make way for agriculture, logging, livestock, and mining. These "controlled" fires
are particularly prone to getting out of control and spreading to forested areas, where they are
extremely difficult to put out. Although this practice is illegal in some countries in the Amazon
rainforest, the enforcement of the regulations is lax.
In recent years (2013-2019), budget cuts at the Brazilian Environment Agency have led to
a significant reduction in the manpower and resources available for enforcement in the regions most
affected (See Figure 4.3). Within the Legal Amazon, Pará has the most fires of all states by a
large margin, and the states of Mato Grosso, Amazonas, Rondônia, Maranhão, and Acre have also been
particularly badly affected (See Figure 4.4).
Figure 4.3. States with the most fires.
Figure 4.4. Spatial distribution of wildfires of the Amazônia Biome in 2013–2015.
Note: Nine federal states: Acre (AC), Amazonas (AM), Amapá (AP), Maranhão (MA), Mato Grosso (MT),
Pará (PA), Rondônia (RO), Roraima (RR) and Tocantins (TO).
4.2. Data Sets/Sources
In this thesis, we used Moderate Resolution Imaging Spectroradiometer (MODIS) data to record the
spatial distribution of fire pixels in the Amazon Rainforest in 2019. This product is considered a
reliable and suitable source for monitoring wildfires (Devisscher et al., 2016; Silva Junior et al.,
2018). Using the surface reflectance images collected by MODIS, scientists can accurately locate
wildfire hotspots. Further analysis of these data (yearly wildfire frequency)
indicates the seasonality patterns of wildfire in the Amazon, as well as the spatial distribution of
wildfires. To improve geolocation accuracy, we used the MODIS Collection 5 MCD14ML global
monthly fire location product instead of the gridded composite fire product (MOD14A1). MCD14ML
provides monthly geospatial information, including location, date, confidence, and additional
information for each 1 km fire pixel detected by the Terra and Aqua MODIS sensors. The thermal
anomalies / active fires are detected using the contextual algorithm described by Giglio et al. (2003).
Each detection represents the center of a 1 km pixel that contains one or more fires. This
product has been widely used in recent wildfire studies. In addition, a range of datasets was
collected and transformed for this study. The data sets and their sources are listed in
Table 4.1.
Table 4.1. Sources and descriptions for variables.
Note: All variables were generated or resampled at a resolution of 1 km.
Chapter 5. Methodology
5.1. Overarching Study Design
This thesis can be divided into two main sections. The first concerns spatial
analysis of the spatial patterns of wildfire occurrence. The second concerns spatial
statistics, using regression models to model relationships between the explanatory variables and
wildfire spatial patterns and thus to identify the spatial heterogeneity of the explanatory variables.
It is often claimed in the geospatial industry that 80% of data has a geospatial component
(Robert, 1987). The occurrence of wildfires also has spatial components in a certain spatial range.
The development of spatial statistics provides many methods for the study of wildfire spatial
distribution patterns. Point pattern analysis (PPA) is commonly used to study the spatial
distribution of points (Boots and Getis, 1985). Generally speaking, to study the distribution of
points, we can adopt random sampling, for example by placing quadrats of a certain size in the
study area and counting the points that fall in each quadrat. When the number
of quadrats is large enough, we can infer the properties of the whole sample area from
the quadrat statistics. The quadrats must be selected randomly and
independently, which yields the number of points in each quadrat, x_1, x_2, ..., x_n.
Because the sequence x_n resembles a random process in time, it is also called a Spatial Point Process
(SPP). SPP models typically include two types of analysis: exploratory data analysis and
parametric modeling analysis. The former tests observed patterns against
a null hypothesis of complete spatial randomness, while the latter is used to model
relationships between explanatory variables and wildfire spatial patterns, considering the spatial
heterogeneity.
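The quadrat approach described above can be sketched in Python by counting points in a regular grid of quadrats and computing the variance-to-mean ratio (VMR) of the counts. Under complete spatial randomness the VMR is close to 1, while clustered patterns give values well above 1. The sketch uses a full regular grid rather than randomly placed quadrats, and the point sets and the function name `quadrat_vmr` are hypothetical.

```python
import numpy as np

def quadrat_vmr(points, extent, n_side):
    """Variance-to-mean ratio of point counts on an n_side x n_side quadrat grid."""
    xmin, xmax, ymin, ymax = extent
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1],
        bins=n_side, range=[[xmin, xmax], [ymin, ymax]],
    )
    return counts.var(ddof=1) / counts.mean()

rng = np.random.default_rng(3)

# Pattern 1: points placed uniformly at random in the unit square
random_pts = rng.uniform(0, 1, size=(1000, 2))

# Pattern 2: points scattered tightly around ten hypothetical cluster centres
centers = rng.uniform(0.1, 0.9, size=(10, 2))
clustered_pts = centers[rng.integers(0, 10, 1000)] + rng.normal(scale=0.02, size=(1000, 2))

vmr_random = quadrat_vmr(random_pts, (0, 1, 0, 1), 10)
vmr_clustered = quadrat_vmr(clustered_pts, (0, 1, 0, 1), 10)
print("VMR random:", vmr_random, "VMR clustered:", vmr_clustered)
```

Because the result depends on the quadrat size, the same calculation is usually repeated for several values of `n_side`.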
Wildfires can be regarded as point events on the map. The key objective of the study is
to identify the spatial distribution of the geographic entities or events (random, clustered, or
dispersed). This research focuses not only on individual points, but also on the interaction between
points and their mutual influence. Several SPP analysis methods will be used in this thesis.
Density-based methods, such as kernel density estimation (cell counts), mostly address the
first-order properties of point
patterns. By contrast, distance-based methods measure second-order properties by calculating the
distance between adjacent point pairs. Global spatial autocorrelation statistics such as Moran's I
give an overall estimate of the degree of spatial autocorrelation in a
dataset. Local spatial autocorrelation statistics, which are estimates disaggregated
by spatial analysis unit, show how dependency relationships differ between
areas. Various statistics such as Getis-Ord Gi*(d) and Ripley's K(d) can be used to compare
autocorrelations across different neighborhoods. However, this thesis does not use Ripley's K
function, because the threshold distance calculation has been built into updated tools
such as the Optimized Hot Spot Analysis (OHSA).
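For readers unfamiliar with the statistic, the following Python sketch computes global Moran's I for values on a regular grid. It assumes binary rook-contiguity weights (cells sharing an edge are neighbours), which is a simplification chosen for brevity and is not necessarily the spatial weighting used by the GIS tools applied in this thesis.

```python
import numpy as np

def morans_i_grid(z):
    """Global Moran's I on a 2-D grid with binary rook (edge-sharing) weights."""
    d = z - z.mean()                                   # deviations from the mean
    # Cross-products over horizontal and vertical neighbour pairs;
    # the factor 2 counts every pair in both directions (w_ij and w_ji).
    cross = 2.0 * ((d[:, :-1] * d[:, 1:]).sum() + (d[:-1, :] * d[1:, :]).sum())
    r, c = z.shape
    s0 = 2.0 * (r * (c - 1) + (r - 1) * c)             # sum of all weights
    return (z.size / s0) * cross / (d ** 2).sum()

# A checkerboard is perfectly dispersed; a smooth gradient is strongly clustered
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
gradient = np.tile(np.arange(8.0)[:, None], (1, 8))

i_checker = morans_i_grid(checker)
i_gradient = morans_i_grid(gradient)
print("checkerboard:", i_checker, "gradient:", i_gradient)
```

Values near +1 indicate clustering of similar values, values near -1 indicate dispersion, and values near 0 are consistent with a random arrangement.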
Directional distribution trends, correlation analyses, and hot spot analyses of wildfires in the
Amazon rainforest are the spatial pattern analysis components in this thesis research. Firstly,
the Standard Deviational Ellipse is used to summarize the directional trends of wildfires. Secondly,
to investigate whether the distribution of wildfire occurrence is spatially clustered, global and local
measures of spatial association were calculated by using Moran's I statistics (Anselin, 1995), and
the Getis-Ord Gi* statistics were used to identify the hotspots. In addition, kernel density
estimation and the average nearest neighbor statistic were used to describe the global trend of
wildfire occurrences.
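The hot spot statistic mentioned above can be illustrated with a short Python sketch of the Getis-Ord Gi* z-score on a regular grid. It assumes binary weights over each cell's 3 x 3 window (the cell itself included), whereas the Optimized Hot Spot Analysis tool derives its own distance threshold; the grid, the injected hot spot, and the function name are hypothetical.

```python
import numpy as np

def gi_star_grid(x):
    """Getis-Ord Gi* z-scores on a 2-D grid; binary weights over each 3x3 window."""
    x = x.astype(float)
    n = x.size
    xbar = x.mean()
    s = np.sqrt((x ** 2).mean() - xbar ** 2)           # global standard deviation
    padded = np.pad(x, 1)                              # zeros outside the grid
    inside = np.pad(np.ones_like(x), 1)                # 1 inside the grid, 0 outside
    r, c = x.shape
    local_sum = np.zeros((r, c))
    w_sum = np.zeros((r, c))
    for dr in range(3):
        for dc in range(3):
            local_sum += padded[dr:dr + r, dc:dc + c]  # sum of x over the window
            w_sum += inside[dr:dr + r, dc:dc + c]      # number of cells in the window
    # With binary weights, the sum of squared weights equals the sum of weights
    denom = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
    return (local_sum - xbar * w_sum) / denom

rng = np.random.default_rng(4)
counts = rng.poisson(1.0, size=(20, 20)).astype(float)
counts[5:8, 5:8] += 10                                 # hypothetical hot spot

z = gi_star_grid(counts)
print("largest z-score at", np.unravel_index(z.argmax(), z.shape))
```

Cells with large positive z-scores are surrounded by high counts (hot spots), and cells with large negative z-scores are surrounded by low counts (cold spots).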
The spatial relationships analysis was mainly conducted by applying several different
regression models (i.e., Ordinary Least Squares, Poisson, Geographically Weighted Gaussian
Regression, and Geographically Weighted Poisson Regression). The dependent variable,
wildfire occurrence in 2019, was extracted using ArcGIS 10.8 in grid format together with the
explanatory variables. Model fit and performance were compared and evaluated using the corrected
Akaike information criterion (AICc) and Moran’s Index. Based on
these model evaluation criteria, spatial heterogeneity can be identified for the explanatory
variables. Detailed descriptions of the data extraction procedures for the explanatory and
dependent variables are given below. The logical framework of the thesis is shown in Figure 5.1.
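As a brief illustration of the model comparison criterion, the Python sketch below computes AICc from a log-likelihood using the standard small-sample correction. The residual sums of squares and parameter counts are hypothetical, and GWR software typically uses a variant in which the number of parameters is replaced by the trace of the hat matrix, so the numbers here are not comparable with the thesis results.

```python
import math

def gaussian_log_likelihood(rss, n):
    """Maximised Gaussian log-likelihood of a least-squares fit with residual sum of squares rss."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)

def aicc(log_likelihood, k, n):
    """AIC plus the small-sample correction; k is the number of estimated parameters."""
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical comparison of a global and a local fit to the same n = 200 observations
n = 200
global_fit = aicc(gaussian_log_likelihood(rss=180.0, n=n), k=5, n=n)
local_fit = aicc(gaussian_log_likelihood(rss=120.0, n=n), k=25, n=n)   # more (effective) parameters
preferred = "local" if local_fit < global_fit else "global"
print(global_fit, local_fit, preferred)
```

The model with the lower AICc is preferred; the correction term penalizes the extra parameters of the local model, so it is selected only when the improvement in fit outweighs that penalty.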
Figure 5.1. The logical framework of the study.
5.2. Data Preparation and Processing
Since most of the data obtained were initially raw and could not be used directly, data preparation
and transformation were needed. This stage can often take the bulk of the time. In this thesis research, the
primary software used for spatial analysis, spatial statistics, mapping, and visualization is ESRI
ArcGIS version 10.8 (Esri Inc, 2021). This GIS software was chosen because it offers numerous
extensions for spatial analysis and spatial statistics (including kernel density estimation, Global
Moran’s I, Getis-Ord Gi*, OLS, GWR, and other analysis tools).
5.2.1. Dependent and Explanatory Variables
For this thesis, wildfire occurrences are the dependent variable. Figure 5.2 shows the
frequency distribution of wildfire counts. The distribution
is positively skewed with a long right tail: pixels with zero fire counts are by far the
most frequent, and the frequency of higher wildfire counts decreases rapidly. Figure 5.3 illustrates the spatial
distribution of the wildfire points. Figure 5.4 reveals high wildfire occurrences in 2019 for the
states of Pará, Mato Grosso, Amazonas, Rondônia, Roraima, and Acre. Figure 5.5 visualizes the
continuous distribution of wildfire occurrences by using the empirical Bayesian kriging model.
Figure 5.2. Frequency distribution of wildfire occurrences.

Note that the y-axis represents the relative frequency of wildfires, and the x-axis represents the number of
wildfires per unit grid.
Figure 5.3. Wildfires by states in Amazônia Biome.
Figure 5.4. Spatial distribution of observed wildfire counts.
Figure 5.5. Continuous spatial distribution of observed wildfire counts.
Fire behaviors are affected by many factors; based on an extensive literature review and
available data, the explanatory or independent variables in this thesis study included five broad
categories: 1) topographic, 2) vegetation cover, 3) land use, 4) anthropogenic, and 5)
meteorological factors, with a total of 15 explanatory sub-variables, which include 2 landscape
metrics.
5.2.1.1. Topographic
Topographic factors consisted of three explanatory variables: elevation (m), slope (degree),
and aspect (an index ranging from −1 to 1). Elevation influences fire behavior by changing the timing and
amount of precipitation, as well as by influencing exposure to wind (Scott and Alicia, 2008). To
be specific, at lower elevations, the wind speed tends to be faster. Moreover, elevation also
influences the drying of the fuel. Fuels are drier at lower elevations due to the higher
temperature and lower precipitation, whereas the opposite holds for fuels at higher elevations;
lightning strikes and subsequent ignitions are also more likely at higher elevations. Slope affects the
speed and direction of fire spread; usually, fire spreads faster uphill than downhill,
so fire on steep slopes spreads faster. In addition, the steeper the slope, the more serious the soil
erosion and the lower the water content of forest combustibles (Huang et al. 2015). Because it
receives direct sunlight, the north aspect gets more heat, and its soil and vegetation are dry. Compared
to the south aspect, combustibles on the north aspect are dry and have a low density. Moreover, the
north aspect usually has a higher temperature and stronger wind (Williams, 2018).
The elevation layer was collected from the Global 30 Arc-Second Elevation (GTOPO30)
(https://www.usgs.gov/centers/eros/science/), which was built in 2010. After that, the slope and
aspect were derived from the Digital Elevation Model (DEM) by using the 3D Analyst tools in
ArcGIS 10.8. Because aspect is a circular variable that cannot be used directly in linear statistics,
the aspect layer was cosine-transformed to obtain a linear index that better distinguishes
the degree of exposure, using the following formula
(Stage, 1976):
$$\text{Aspect index} = \cos\left(\theta \times \frac{\pi}{180}\right) \tag{5.1}$$
where θ is the degree of aspect generated in ArcGIS and the range is 0-360°. The values of the
aspect index ranged from −1 (southward) to 1 (northward); slopes with a north orientation receive
less direct sunlight than slopes with a south orientation in the Northern Hemisphere. On the other
hand, north-facing slopes in the Southern Hemisphere are generally warmer since they receive
more sunlight (Williams, 2018). Thus, the greater the aspect index, the closer the region is to the
north and northeast, where it will have more sunshine, higher temperatures, and faster evaporation.
Figure 5.6 illustrates the spatial distribution of the elevation, slope, and aspect, respectively.
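The cosine transform of equation (5.1) can be sketched in Python as follows. The handling of flat cells is an assumption added for illustration: the ArcGIS Aspect tool assigns −1 to flat cells, and the sketch maps any negative input to NaN, which may differ from how such cells were treated in this thesis.

```python
import numpy as np

def aspect_index(aspect_deg):
    """Cosine-transform aspect in degrees (0-360, clockwise from north) into an index in [-1, 1]."""
    a = np.asarray(aspect_deg, dtype=float)
    index = np.cos(np.deg2rad(a))          # equation (5.1): cos(theta * pi / 180)
    # Assumption: negative inputs mark flat cells with no aspect and become NaN
    return np.where(a < 0, np.nan, index)

# North (0 or 360) maps to 1, south (180) to -1, east and west (90, 270) to 0
vals = aspect_index([0, 90, 180, 270, 360, -1])
print(vals)
```

The index therefore separates north-facing from south-facing slopes but does not distinguish east from west, since both map to 0.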
Figure 5.6. Spatial distribution of (a) elevation, (b) slope, and (c) aspect.
5.2.1.2. Vegetation Cover
In this section, the following metrics were adopted: (1) Fractional Vegetation Cover (FVC),
(2) Number of Forest Patches (NFP), and (3) Edges Proportion (EP). Vegetation cover is an
important indicator of the level of vegetation at the ground surface. Remote sensing
provides various methods for measuring vegetation cover. In this thesis, we used Fractional
Vegetation Cover (FVC) to represent the corresponding fuel amount within the grid. FVC was
calculated from the NDVI, a graphical indicator that can be used to determine the moisture
content of live fuels. The data were derived from the MODIS NDVI (MOD13Q1)
(https://lpdaac.usgs.gov/products/mod13q1v006/) with 500 m spatial resolution. Since the stored
values are scaled by 10,000, we used the Raster Calculator in ArcGIS 10.8 to rescale them. Gutman and
Ignatov (1997) proposed the relationship model between NDVI and FVC as:
$$\mathrm{FVC} = \frac{\mathrm{NDVI} - \mathrm{NDVI}_{\mathrm{soil}}}{\mathrm{NDVI}_{\mathrm{veg}} - \mathrm{NDVI}_{\mathrm{soil}}} \tag{5.2}$$
where $\mathrm{NDVI}_{\mathrm{veg}}$ and $\mathrm{NDVI}_{\mathrm{soil}}$ are the NDVI of dense vegetation canopy (close to 1) and bare
soil (close to 0), respectively. Based on the reliable correlation between NDVI and vegetation coverage (Zribi et
al., 2003), equation (5.2) can be expressed as:

$$\mathrm{FVC} = \frac{\mathrm{NDVI} - \mathrm{NDVI}_{\min}}{\mathrm{NDVI}_{\max} - \mathrm{NDVI}_{\min}} \tag{5.3}$$
where $\mathrm{NDVI}_{\max}$ is the maximum NDVI value of a pure vegetation pixel, which is theoretically
close to 1, and $\mathrm{NDVI}_{\min}$ is the minimum NDVI value of a pure non-vegetation pixel, which is close to
0. However, due to the influence of meteorological conditions, vegetation type and distribution, season, and
other factors, $\mathrm{NDVI}_{\mathrm{veg}}$ and $\mathrm{NDVI}_{\mathrm{soil}}$ differ between images. Thus, to
define $\mathrm{NDVI}_{\mathrm{veg}}$ and $\mathrm{NDVI}_{\mathrm{soil}}$, this thesis did not manually select the values of bare soil or dense
vegetation canopy areas as the minimum and maximum but took the minimum and maximum
values within the 1%-99% confidence interval. Figure 5.7 illustrates the spatial distribution of the
FVC.
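The FVC calculation of equation (5.3) can be sketched in Python as follows. The sketch rests on several assumptions: the 1%-99% confidence interval is interpreted as the 1st and 99th percentiles of the NDVI values, results outside [0, 1] are clipped, and the input array is synthetic. In the thesis the calculation was carried out with the Raster Calculator in ArcGIS 10.8.

```python
import numpy as np

def fractional_vegetation_cover(ndvi_scaled, scale=1e-4, lo_pct=1.0, hi_pct=99.0):
    """Compute FVC from scaled integer NDVI using percentile-based min and max."""
    ndvi = np.asarray(ndvi_scaled, dtype=float) * scale        # undo the x10,000 scaling
    # Assumption: NDVI_min and NDVI_max are the 1st and 99th percentiles
    ndvi_min, ndvi_max = np.nanpercentile(ndvi, [lo_pct, hi_pct])
    fvc = (ndvi - ndvi_min) / (ndvi_max - ndvi_min)            # equation (5.3)
    return np.clip(fvc, 0.0, 1.0)                              # assumption: clip to [0, 1]

# Synthetic raster of scaled NDVI values (hypothetical data)
rng = np.random.default_rng(5)
raw = rng.integers(-2000, 10001, size=(50, 50))
fvc = fractional_vegetation_cover(raw)
print(fvc.min(), fvc.max(), fvc.mean())
```

Using percentiles instead of the absolute extremes makes the result less sensitive to a few anomalous pixels, such as cloud-contaminated or water pixels.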
Figure 5.7. Spatial distribution of fractional vegetation cover (FVC).
In addition, based on previous research, the interaction between deforestation, forest
fragmentation, and wildfires is clear in the Amazon rainforest (Alencar et al., 2006; Armenteras et
al., 2013). Large, contiguous forested areas are deforested and split apart for the purpose of
road construction, agriculture, or other human development (Cochrane, 2001), and these activities
directly cause forest fragmentation and forest loss. Forest fragmentation and forest loss then often
change the microenvironment at the fragment edge and the configuration of the ecosystem, resulting
in increased light intensity, higher daytime temperatures, higher wind speeds, and lower humidity
(Coe et al., 2013). Increased wind, lower humidity, and higher daytime temperatures in turn make fires
more likely in forest fragments (Aragão et al., 2007). In conclusion, wildfire has a higher tendency
to spread in fragmented landscapes with smaller patches of forest and more edges than in intact
and continuous forests. For these reasons, in this thesis, the forest cover map (total 34156)
was first used to calculate landscape metrics using the Fragstats software (University of Massachusetts,
Amherst, MA, USA). The results were then brought back into ArcGIS 10.8, and the “Zonal
Statistics as Table” tool was used to extract landscape metrics in each unit grid (See Figure 5.8).
Figure 5.8. Spatial distribution of forest patches and edge proportion.
5.2.1.3. Land Use
Despite the fact that fragmentation renders forests more vulnerable to fire, the existence of
ignition sources is required for fire to occur. As a matter of fact, wildfires throughout the Amazon
are primarily caused by extensive forest conversions and agricultural practices (Silva Junior et al.,
2018). To boost economic growth, local businesses are rampantly deforesting land for cattle
grazing and soybean cultivation, and the remaining forest fragments appear unusually susceptible
to wildfire (Cano-Crespo et al., 2015). Slaughterhouses and soy companies have been appearing
near wildfire hotspots (Glenn et al., 2019). Because the synergistic interactions of deforestation,
forest fragmentation, and human-induced fires pose serious threats to Amazonian forests, the
pastureland area and cropland area (see Figure 5.9) were also considered in this thesis. To obtain
the land use data, the official data from MapBiomas (Appendix D-1) (https://mapbiomas.org/)
were collected. MapBiomas was launched in July 2015 to make LULC dynamics in Brazil
clearer (Mapbiomas, 2015). This dataset is based on the Landsat archive available on Google Earth
Engine and contains Brazilian land cover and land use maps at 30-meter resolution for the years 1985 to
the present day. There are 6 primary classes, and 30 sub-classes of land use: (1) forest, (2) non-
forest formation, (3) farming, (4) non-vegetated area, (5) water, and (6) not-observed. A detailed
classification can be found in Appendix D-2.
Figure 5.9. Spatial distribution of pastureland area and cropland area.
5.2.1.4. Anthropogenic
The majority of wildfires are anthropogenic or accident-caused, indicating a connection
between human-caused factors and wildfire occurrence. In recent decades, in the context of
continuous socioeconomic development and increasing mobility of people, the spread of tourism
and recreational activities, as well as the recent popularity of ecotourism (Obua, 1997), have
increased the opportunity for people to influence forest ecosystems. In this thesis study, the
anthropogenic factors included socioeconomic variables (population density), infrastructural
variables, and forest loss (percentage of deforestation) (See Figure 5.10).
The population density was determined by dividing the mean of the grid's population during
the study period by its geographical area. Population data (https://www.worldpop.org/) were
obtained from WorldPop at a 1-km resolution. The infrastructural variable was road density
(federal, state, and other roads), expressed in km/km2 as the ratio of road length to grid area. The
road dataset was extracted from the National Department of Transport Infrastructure (DNIT) vector
map at a scale of 1:25,000.
Forest loss was calculated from annual deforestation in 2019, obtained from
PRODES/INPE Amazonia (http://terrabrasilis.dpi.inpe.br/downloads/). These spatial data were
derived primarily from Landsat imagery with 30 meters spatial resolution, but also from
Sentinel-2 and CBERS imagery, both with 10 to 20 meters spatial resolution.
Forest loss is the sum of all deforested areas within a cell, divided by total cell area, and multiplied
by 100 (to convert to a percentage). A detailed illustration of the process used to create these layers
can be found in Appendix A-4.
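The three anthropogenic variables described above reduce to simple per-cell ratios. The following is a minimal sketch of that arithmetic, not the ArcGIS workflow itself; the function names and the input numbers are hypothetical, and the 20 km × 20 km cell size is taken from Section 5.2.5.

```python
# Illustrative per-cell calculations; names and sample values are hypothetical.

def population_density(mean_population, cell_area_km2):
    """Mean population of the cell over the study period divided by cell area."""
    return mean_population / cell_area_km2


def road_density(road_length_km, cell_area_km2):
    """Total road length inside the cell divided by cell area (km/km2)."""
    return road_length_km / cell_area_km2


def forest_loss_percent(deforested_areas_km2, cell_area_km2):
    """Sum of deforested areas in the cell as a percentage of cell area."""
    return 100.0 * sum(deforested_areas_km2) / cell_area_km2


cell_area = 20.0 * 20.0  # one 20 km x 20 km grid cell, in km2
print(population_density(1200.0, cell_area))
print(road_density(50.0, cell_area))
print(forest_loss_percent([10.0, 6.0, 4.0], cell_area))
```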
Figure 5.10. Spatial distribution of road density, population density, and forest loss.
5.2.1.5. Meteorological
It is widely believed that wildfires are caused by intentional human activities or by human-caused
landscape changes, but it has also been suggested that meteorological conditions play
a role. Weather and climate factors are top-down controls on wildfires that have been
emphasized in several regional and continental studies (Flannigan et al., 2009; Marlon et al., 2008;
San-Miguel-Ayanz et al., 2013; Schelhaas et al., 2010; Westerling et al., 2006; Wotton et al., 2010).
Wind has been identified as one of the most important factors because it supplies
fresh oxygen to the wildfire and pushes the wildfire toward new sources of fuel
(Beer, 1991). For example, wind speed strongly affects the propagation of fire by increasing the
radiative heat transfer (McArthur, 1967). Fuel temperature depends on the ambient temperature,
because fuels obtain heat by absorbing the solar radiation around them. The ambient
temperature therefore determines the combustibility of fuels. Humidity and precipitation affect the moisture
level of fuel (Živanović et al., 2020). When fuels are dry, they catch fire more easily and burn
faster than when they are humid (Chen et al., 2014; Živanović et al., 2020).
In this thesis, ambient weather elements include average annual temperature (℃), relative
humidity (%), wind speed (m/s), and maximum cumulative water deficit (MCWD) (mm). Data for
temperature and wind speed are derived from the TerraClimate dataset
(http://www.climatologylab.org/terraclimate.html), which provides monthly climate and
climatic water balance data for global terrestrial surfaces from 1958 to the present
at ~4 km spatial resolution. The relative humidity (%) data were obtained from NASA's
Prediction Of Worldwide Energy Resources (POWER) project (https://power.larc.nasa.gov/),
with a spatial resolution of 0.5° × 0.625° (latitude × longitude) for meteorological data.
The data are produced each hour and averaged into daily values (Sparks, 2018).
First, the Make NetCDF Raster Layer tool was used to open all meteorological variables (file
extension .nc). NetCDF (Network Common Data Form) is a file format designed to simplify the
creation, access, and exchange of scientific data across a network (ArcGIS, 2021b). Figure 5.11
illustrates the spatial distribution of the temperature, relative humidity, and wind speed,
respectively.
Figure 5.11. Spatial distribution of meteorological variables.

Note: (a) temperature, (b) specific humidity and (c) wind speed.
Water stress in tropical forests is often measured by the maximum cumulative water deficit
(MCWD). This metric was proposed by Aragão et al. (2007). According to several studies in
different parts of the Amazonian Rainforest, the amount of evapotranspiration (E) is usually about
100 mm per month on average. As a result, monthly precipitation below 100 mm means that more
water will be lost and evaporation will be greater than precipitation, thus generating a water deficit
state. To generate the MCWD, precipitation data from the Climate Hazards Group InfraRed
Precipitation with Station (CHIRPS) dataset (https://www.chc.ucsb.edu/data/chirps) were
collected. CHIRPS provides precipitation estimates based on both NASA and NOAA satellite
information and interpolations from in situ weather stations, with 0.05° spatial resolution (Climate
Hazards Center, 2015). Moreover, Anderson et al. (2018) assessed the accuracy of these
precipitation data and found that the R2 for the Amazônia Biome region is 0.73 and the root mean
square error is below 15 mm/month. To generate the maximum cumulative water deficit (MCWD)
for each pixel, the Raster Calculator tool in ArcGIS 10.8 was first used to obtain
the average monthly precipitation data from the global annual precipitation data and to subtract 100
mm from the monthly precipitation values. Then, the most negative (minimum) monthly
value for each pixel (see Figure 5.12) was used as the MCWD. It can be seen that the trend in water deficit levels
declined from east to west throughout the biome in 2019. This process is also shown in Appendix
A-6.
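The water deficit logic can be sketched for a single pixel as follows. This is an illustration only, not the Raster Calculator expression used in the thesis. Two variants are shown: the per-month deficit described above, and a running (cumulative) deficit that resets to zero in wet months, which is how the rule of Aragão et al. (2007) is commonly stated and is included here as an assumption. The function names and the precipitation series are hypothetical.

```python
# Single-pixel sketch; names and sample values are hypothetical.

def monthly_deficits(precip_mm, et_mm=100.0):
    """Precipitation minus the assumed 100 mm/month evapotranspiration."""
    return [p - et_mm for p in precip_mm]


def mcwd(precip_mm, et_mm=100.0):
    """Most negative value of the running water deficit.

    The running deficit accumulates while P < E and is capped at zero
    (no deficit) once precipitation has refilled it.
    """
    cwd = 0.0
    worst = 0.0
    for p in precip_mm:
        cwd = min(0.0, cwd + p - et_mm)
        worst = min(worst, cwd)
    return worst


precip = [150.0, 50.0, 50.0, 150.0, 150.0]  # mm per month
print(min(monthly_deficits(precip)))  # most negative single-month deficit
print(mcwd(precip))                   # most negative cumulative deficit
```

The two variants differ whenever dry months occur consecutively, because the cumulative form carries the deficit forward from one month to the next.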
Figure 5.12. Spatial distribution of MCWD.
5.2.2. Projection
The wildfire data for Amazônia Biome were projected to an appropriate projected coordinate
system before analyzing them. For this investigation, the South America Albers Equal Area (meter)
projected coordinate system was adopted. Biomes in low-latitude zones, such as Amazonia, can
be measured more accurately with this coordinate system.
5.2.3. Defining Analysis Field
The Analysis Field (Input Field) must be defined before analyzing hot
spot data, which requires aggregating the incident data first. There are several
ways to do this:
1) For the polygon data set (See Figure 5.13(1)), a spatial union was made with these
polygons to count the number of wildfires (events) in each polygon, therefore the result
of this union was the analysis variable for this thesis.
2) Create a polygon grid over the point features using the Create Fishnet tool. Then, use the
Spatial Join tool to count the number of events falling within each grid polygon (See
Figure 5.13(2)).
3) For the point data set (See Figure 5.13(3)), the Integrate tool together with the Collect
Events tool can be used to detect the hotspots.
These methods are described separately below.
Figure 5.13. Identification of wildfire hotspots.
5.2.4. Integration and Collect Events
The Integrate tool treats nearby features as identical and coincident if they
lie within a short distance of one another. In addition, the Integrate tool can address issues
caused by inaccurate GPS coordinate positioning. The step of gathering events was important
because the Hot Spot analysis relies on weighted points rather than individual incidents. Many
events that happen close together are visually stacked on top of one another and appear
as one point. To obtain an aggregate of overlapping points, this study used the Collect Events tool
in ArcGIS 10.8 (Appendix B-1). The Collect Events tool transforms wildfire incidents into
weighted point data. It combines coincident features and creates a new feature class containing a
field named ICOUNT, which holds the sum of all the events at each location. The resultant
ICOUNT field was then used as the Input Field for analysis.
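The idea behind the two steps can be sketched in a few lines. This is a simplified illustration, not the ArcGIS implementation: snapping coordinates to a regular tolerance grid is an assumed stand-in for the Integrate tool, and the function names `integrate` and `collect_events` are hypothetical. Only the ICOUNT field name comes from the text above.

```python
from collections import Counter

# Simplified stand-ins for Integrate and Collect Events; not the ArcGIS algorithms.

def integrate(points, tolerance):
    """Snap each point to a grid of spacing `tolerance` so near-coincident
    points end up with identical coordinates (an assumed simplification)."""
    return [
        (round(x / tolerance) * tolerance, round(y / tolerance) * tolerance)
        for x, y in points
    ]


def collect_events(points):
    """Collapse coincident points into weighted points with an ICOUNT field."""
    counts = Counter(points)
    return [
        {"x": x, "y": y, "ICOUNT": n}
        for (x, y), n in sorted(counts.items())
    ]


fires = [(0.0, 0.0), (0.4, 0.1), (10.0, 10.0)]
for event in collect_events(integrate(fires, tolerance=1.0)):
    print(event)
```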
5.2.5. Create Fishnet
After the study area was defined, a fishnet grid was created over the wildfire maps (Create
Fishnet in ArcGIS 10.8), dividing the study area into 20 km × 20 km grids (a total of 7158 grids).
The Spatial Join tool (see Figure 5.14) was then used to count the number of events
falling within each grid. Thus, the dependent variable was the number of wildfires in each grid in
2019. In this thesis, the grid resolution (20 km × 20 km) was chosen to increase the
reliability of climate data, while assuming moderate commission error. Because weather
stations are lacking in sparsely populated areas such as the Amazonian region, the interpolated data
for those regions are derived from limited satellite information (Hijmans et al., 2005). For those areas,
uncertainty is more pronounced and climate predictions are therefore less accurate than for other
well-sampled areas.
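The fishnet and spatial join steps amount to binning point coordinates into square cells and counting them. The sketch below illustrates this under the assumption of projected coordinates in meters; the function name, the grid origin, and the sample points are hypothetical.

```python
import math
from collections import Counter

# Hypothetical helper: counts points per square cell of a regular grid.

def fishnet_counts(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Return a row-major table of event counts per grid cell.

    Points outside the grid extent are ignored.
    """
    counts = Counter()
    for x, y in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y - y_min) / cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]


# Two 20 km cells side by side; coordinates in meters.
fires = [(1000.0, 1000.0), (19000.0, 500.0), (25000.0, 1000.0), (-5.0, 0.0)]
print(fishnet_counts(fires, 0.0, 0.0, 20000.0, n_cols=2, n_rows=1))
```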
Figure 5.14. Spatial Join for wildfires.
In addition, since the variables were collected from different types of data (e.g.,
vector data and raster data), the resolutions of the raster data differed. For this reason, it is
necessary to unify all variables to the same scale (resolution) in order to ensure the accuracy of the
analysis and avoid systematic errors. Given that the area of the Amazônia Biome is about 4.2
million km2, the resolution of all analyzed data in this study was set to 20 km × 20 km.
A finer grid resolution would lead to a large sample size, which would be difficult to
compute. Meanwhile, a smaller cell size would result in a greater number of zero-value cells,
resulting in overdispersion of the variables.
5.2.6. Fill Missing Values
After the spatial processing of the above layers, some of the fishnet layers, such as the
elevation and temperature layers, contained null values. For example, the coastline boundaries of
the digital elevation model (GTOPO30) do not coincide with the boundaries of the
Atlantic Northeast of the study area. Therefore, some elevation values are missing within the study
area, and the data need to be cleaned so that the layers coincide. The most common
solution to this situation is to remove the observation from the dataset. However, discarding
those null values can affect the analysis results. According to Waldo Tobler’s first law of geography,
an estimate for missing values can be constructed based on the values of neighboring features in
space. Filling in missing values is commonly referred to as imputation, or geoimputation in the
case of spatial data (Dilekli et al., 2018). Filling in missing values on the surface can potentially
lead to problematic or undesirable consequences. In spite of this, it is often necessary to fill in
missing values in order to conduct spatial analysis on the entire study area. In this thesis,
the Fill Missing Values tool in the Space-Time Pattern Mining toolbox of ArcGIS Pro 2.4 was
adopted to fill in missing values for several layers.
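A neighbor-based fill can be sketched as follows. This is only an illustration of the general idea of estimating a missing cell from its spatial neighbors; it assumes the mean of the available values in the eight surrounding cells and is not claimed to reproduce the settings used with the Fill Missing Values tool. The function name and the sample grid are hypothetical.

```python
# Hypothetical neighbor-mean imputation on a small grid (None marks a null cell).

def fill_missing(grid):
    """Replace each None with the mean of its non-null 8-neighbors.

    Cells with no non-null neighbors are left as None. Estimates are based
    on the original values only, so fill order does not matter.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] is not None:
                continue
            neighbors = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c) and grid[rr][cc] is not None
            ]
            if neighbors:
                filled[r][c] = sum(neighbors) / len(neighbors)
    return filled


elevation = [
    [1.0, 2.0, 3.0],
    [4.0, None, 6.0],
    [7.0, 8.0, 9.0],
]
print(fill_missing(elevation))
```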
5.3. Exploratory Data Analysis (EDA)
The first step in understanding the spatial pattern of wildfires is to judge the overall situation
of wildfire data. Exploring data is commonly the first step in building predictive models in data
science (Hoaglin et al., 2011). Based on the classical statistical principles and some statistical
charts, diagrams, and tabular data, the characteristics of spatial information can be analyzed, which
provides a deeper understanding of the phenomena under investigation and the overall situation
of the data. Moreover, preparation can be made for subsequent hypotheses or model building. Table
5.1 shows dataset attributes (wildfires) and their short description. Table 5.2 lists the descriptive
statistics of the explanatory and dependent variables.
Table 5.1. Wildfire records (MCD14ML) and their corresponding attributes.
Table 5.2. Descriptive statistics of dependent and explanatory variables.
5.4. Measuring Geographic Distribution
Some tools are used to calculate basic descriptive statistics that are used as a starting point in
the spatial analysis process to help summarize the main characteristics of spatial distribution
(Holcomb, 2017). In this thesis, the Directional Distribution (or Standard Deviational Ellipse) method
was adopted to summarize the spatial characteristics of wildfire incidents: central tendency,
dispersion, and directional trends. The ellipse shows whether the distribution of wildfire incidents
is elongated and hence has a particular orientation (ESRI, 2019).
5.5. Spatial Density Analysis
Kernel Density Estimation (KDE) is a useful tool for determining the spatial
intensity of a point process and for understanding and predicting potential event patterns (Gavin et
al., 1993). The density-based method of statistical analysis is also called the analysis of the first
property of spatial distribution. It describes the global trend of a parameter, and the kernel density
estimation method uses kernel density interpolation to reveal the distribution of event points in the
study area (Silverman, 1952). The location of the wildfire ignition is not a precise geographical
position as it is the post-fire record. A continuous surface can minimize the effect of fire location
uncertainty (Amatulli et al., 2007). Kernel density estimation is a non-parametric statistical method
for estimating probability densities. It places a symmetric probability density
function at each point location and generalizes, or smooths, the discrete point data
into a continuous surface (Amatulli et al., 2007). KDE is employed to model the distribution
of wildfire occurrences in the Amazon Rainforest in this study. The expression of the density f(x)
at the point x where the wildfire occurs is:

$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{5.4}$$
where $K$ is the kernel function, and $h > 0$ is a smoothing parameter called the bandwidth.
$x_1, x_2, \dots, x_n$ are the wildfire incidents in the dataset, and $x - x_i$ is the distance
between the estimation point $x$ and the sample point $i$. The bandwidth (named ‘search radius’
in ArcGIS 10.8) is an essential
factor that affects how “smooth” the resulting surface is. If the selected bandwidth is large, the
hotspots appear smoother and more spread out and the visualization is cleaner,
but important details may be lost, and smaller hotspots may be merged into
one larger hotspot. When the selected bandwidth is small, the hotspots show more
detail, and their distribution is relatively concentrated. The choice of bandwidth
depends mainly on experience and requires multiple trials to achieve the best results (Koutsias et
al., 2004).
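The effect of the bandwidth in Eq. 5.4 can be seen with a small numerical sketch. The code below evaluates the one-dimensional form of the estimator with a Gaussian kernel; the Gaussian kernel is an assumption chosen for illustration and may differ from the kernel used by the ArcGIS tool, and the sample values are hypothetical.

```python
import math

# One-dimensional kernel density estimate following the form of Eq. 5.4.

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Estimate the density at x from the samples with bandwidth h > 0."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)


fire_positions = [0.0, 0.1, 5.0]
for bandwidth in (0.5, 5.0):
    # A small bandwidth gives a sharp peak near the two close events;
    # a large bandwidth flattens the estimate.
    print(bandwidth, kde(0.05, fire_positions, bandwidth))
```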
5.6. Spatial Pattern Analysis
According to the ArcGIS guidance, if the number of events in each polygon is used as the
Input Field for analysis, the results of Getis-Ord Gi* are only reliable when the input feature class
contains at least 30 features. This rule prevents analysis at the state level because there
are only nine federal states in the study area. The spatial analysis in this thesis is therefore mainly
conducted at two levels (see Table 5.3): at the biome level (based on the point data set), and at the
municipal level (Municipio in Portuguese) (based on the polygon data set).
Table 5.3. Brazil administrative boundaries.
Brazil Administrative Boundaries

Country
Unidade
Municipio
Distrito
Subdistrito
Setore
5.6.1. Average Nearest Neighbor (ANN)
First, the ANN method calculates the Euclidean distance of each point and its nearest
neighboring (NN) feature, and it then averages all these nearest neighbor distances. Second, it
compares the average distance with the average for the complete spatial randomness (CSR). If the
average distance is less than the average for a hypothetical random distribution, the wildfire
incidents in the study area are more concentrated, and the features are considered clustered. If the
ANN distance of the actual measurement is equal to the ANN distance in a random spatial pattern,
the distribution of wildfires is considered random. If the average distance is greater than that of a
hypothetical random distribution, the distribution is considered dispersed. Note that the observed
mean distance was used in another step of the analysis that works towards a hot spot analysis and
is discussed below.
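The comparison described above can be sketched as a ratio of the observed mean nearest neighbor distance to the expected mean distance under complete spatial randomness. The expected value 0.5 / sqrt(n / A), where n is the number of points and A is the study area, is the standard form of this expectation; the function name and the sample points are hypothetical.

```python
import math

# Brute-force nearest neighbor search; adequate for small illustrative inputs.

def average_nearest_neighbor(points, area):
    """Return (observed mean NN distance, expected under CSR, ratio).

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(
            min(
                math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points)
                if j != i
            )
        )
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed, expected, observed / expected


corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(average_nearest_neighbor(corners, area=4.0))
```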
5.6.2. Spatial Autocorrelation
To begin the hot spot analysis, we need to determine whether the data are statistically clustered.
Moran's I is a correlation coefficient that can be used to test for global and local spatial
autocorrelation (Stephanie, 2016). The Global Moran Index ranges from -1 to 1, approaching zero
when spatial patterns are random. As an inferential statistic, its result is interpreted
in terms of the null hypothesis of spatial randomness. For statistical hypothesis testing,
Moran's I values are transformed to z-scores. A z-score near zero indicates no apparent
clustering, so the null hypothesis cannot be rejected, whereas a statistically significant positive
(negative) z-score indicates that wildfire incidence is clustered (dispersed).
Moreover, Moran’s I was also used to test the residuals produced by the OLS regression for
spatial autocorrelation. GWR is used if the residuals from the OLS regression are spatially
autocorrelated, because it incorporates spatial relationships into the analysis.
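For reference, the Global Moran's I statistic itself can be computed as in the following sketch, which uses the standard definition I = (n / S0) · Σi Σj w_ij (x_i − x̄)(x_j − x̄) / Σi (x_i − x̄)², with S0 the sum of all weights. The binary adjacency weights and the four sample values are hypothetical, and the z-score transformation is not shown.

```python
# Global Moran's I from a list of values and a square spatial weights matrix.

def morans_i(values, weights):
    """Compute Moran's I; weights[i][j] is the spatial weight between i and j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


# Four cells in a row; each cell neighbors the adjacent cells only.
counts = [1.0, 2.0, 3.0, 4.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(counts, adjacency))  # positive: similar values sit next to each other
```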
5.6.3. Hot Spot Analysis
The tools described so far were employed to answer questions
such as “What kind of spatial patterns did the study area display?”. Hot Spot Analysis in this part
does more than provide a simple yes-no answer to this question. Instead, it answers questions such
as “where is the statistically significant clustering?” or “where are the statistically significant
hotspots and coldspots in the study area?” Hot spots are areas that show statistically higher
tendencies to cluster spatially. This is determined by looking at each event within the context of
neighboring features. In other words, a single point with a high value is not necessarily a hot spot.
It becomes a hot spot only when its neighbors also have high values. The most commonly used
and well-known tools are Getis-Ord Gi* (i.e., traditional) and OHSA (i.e., optimized). Both
are valuable for cluster analysis, and statistical cluster analysis can help minimize
the subjectivity in maps.
Getis-Ord Gi* (pronounced “G-i-star”) is an index of spatial autocorrelation that can identify
statistically significant spatial clusters of high values (hot spots) and low values (cold spots). The Gi*
statistic quantifies the degree of this association as a result of the concentration of weighted points
(or area represented by a weighted point), and all other weighted points included within a radius
of distance from the original weighted point (including the point itself). Getis-Ord Gi* identifies
whether a local pattern of wildfire occurrence for a given feature (e.g., a municipality) and its
neighbors is different from what is generally observed across the whole study area. It computes a
z-statistic by comparing the proximity-weighted sum of total fires at a particular feature to the sum
across the entire sample to identify areas of more intense clustering of high (low) wildfire
occurrence. A high positive z-score and a small p-value for a feature indicate a spatial clustering
of high values. A low negative z-score and a small p-value indicate a spatial clustering of low
values. The higher (or lower) the z-score, the more intense the clustering. A z-score near zero
indicates no apparent spatial clustering. When points are analyzed, the Getis-Ord Gi* tool also
requires a threshold distance to run the analysis.
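The z-score described above can be illustrated with the standard Gi* formula, in which the weighted sum of values around feature i (including i itself) is compared with its expectation under the global mean and scaled by the global standard deviation. The sketch below assumes binary weights within a neighborhood; the function name and sample data are hypothetical, and no significance testing or correction is included.

```python
import math

# Getis-Ord Gi* z-score for one feature, using the standard formula.

def getis_ord_gi_star(values, weights, i):
    """Return the Gi* z-score of feature i.

    weights[i][j] is the spatial weight between i and j and must include
    the self-weight weights[i][i] (that is what the star denotes).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    w = weights[i]
    sum_w = sum(w)
    sum_w2 = sum(wj * wj for wj in w)
    numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator


# Four cells in a row: two with many fires, two with none.
fire_counts = [9.0, 9.0, 0.0, 0.0]
neighborhood = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
print(getis_ord_gi_star(fire_counts, neighborhood, 0))  # positive: hot spot side
print(getis_ord_gi_star(fire_counts, neighborhood, 3))  # negative: cold spot side
```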
OHSA is a newer tool that combines the Incremental Spatial Autocorrelation tool
and the Hot Spot Analysis (Getis-Ord Gi*) tool. It interrogates the data to determine appropriate
parameters for analysis and calculates the threshold distance directly to create a map of statistically
significant hot and cold spots. This is more efficient and comprehensive, and it eliminates
the manual steps mentioned above. Whether to use the optimized tools, the traditional tools, or
both depends on the analysis and workflow.
The calculation of Global Moran's I and Gi* statistics was conducted using the Hot-Spot
analysis function in ArcGIS 10.8. Inverse Distance Weighted (IDW) was then applied to the
hotspot map generated using Getis-Ord Gi* analysis for the visualization of hotspots or cold spots.
Among the many interpolation techniques used for mapping spatial hotspots, IDW is one of the
most widely used.
5.6.4. Cluster and Outlier Analysis
In the next step, outliers were examined. The existence of outliers (High-Low and Low-High) in a
dataset contradicts the first law of geography, which states that near features are similar to each
other. The distribution of such outlier features is not random; therefore, more investigation is
needed to determine the reasons. In this thesis, Optimized Outlier
Analysis (ESRI, 2021b) was adopted to determine outliers, and the tool used the Anselin Local
Moran's I statistic (see Figure 5.15). Although this tool identifies clusters much as the Hot Spot
Analysis tool does, it also provides a different view of the analyzed data.
Figure 5.15. The quadrants of the cluster and outlier analysis.
5.7. Modeling Spatial Relationship
After analyzing the spatial patterns of wildfires, I am also interested in understanding some of
the reasons behind these phenomena. For example,
why are there so many wildfires in these hot spots? What are some of the factors that contribute to
these hot spots? These are the types of questions that can be answered using regression analysis
techniques.
Regression analysis (See Figure 5.16) is a widely used set of statistical techniques or tools for
examining, modeling, and exploring spatial relationships (Anselin, 2005). This thesis used the
regression analysis tools to demonstrate the strength of relationships and impacts of several
contributing factors that potentially promote negative or positive change in the number of wildfire
occurrences. Regression analysis tools can help in better understanding the reasons for the wildfire
occurrences and possibly support decisions to mitigate the damage caused by wildfires. Both global
regression and local Geographically Weighted Regression (GWR) spatial models (i.e.,
Geographically Weighted Gaussian Regression (GWGR) and Geographically Weighted Poisson
Regression (GWPR)) can be estimated in the GWR software (current version 4.09), as well
as in ArcGIS Pro 2.8 (ESRI, 2021d), with a SAS 9.4 macro (SAS Institute Inc, 2021), and in R using
the "spgwr" package (R Development Core Team, 2021).
Figure 5.16. Regression analysis workflow.

Note: Adapted from ArcGIS.
In this study, both global and local modeling methods were used to model the relationships
between the number of wildfires and a set of explanatory variables, and thus to find the
spatial patterns of driving factors on the occurrence of wildfire in the study area. The objectives
were as follows: (1) to develop global Gaussian and Poisson models to regress the occurrence of
wildfire by using 15 driving factors; (2) to develop Geographically Weighted Gaussian regression
(GWGR) and Geographically Weighted Poisson regression (GWPR) models to regress the
occurrence of wildfire; (3) to compare and evaluate the global and local models in terms of model
performance; and (4) to visually represent the distributions of local model coefficients across the
study area and identify the spatial patterns of the geographically varying model coefficients.
5.7.1. Ordinary Least Squares (OLS)
Ordinary Least Squares (OLS) regression is a starting point for regression analysis. The OLS
is defined by Hutcheson (2011) as a global model that generates a single equation that describes
the data relationships between a dependent variable and each of the explanatory variables in the
study region. The OLS model’s equation for this analysis is presented as (Scott and Pratt, 2009):
$$Y = \beta_0 + \beta_1 \chi_1 + \beta_2 \chi_2 + \beta_3 \chi_3 + \dots + \beta_n \chi_n + \varepsilon \tag{5.5}$$
where $Y$ is the dependent variable; $\beta_i$ is coefficient $i$ ($i = 1, 2, \dots, n$); $\beta_0$ is the
intercept; $\chi_i$ are the explanatory variables; and $\varepsilon$ is the random error term (residual).
Therefore, for modelling wildfire occurrences, the OLS regression equation will be:
$$\text{Wildfire Occurrence} = \beta_0 + \beta_1 (\text{Elevation}) + \beta_2 (\text{Relative Humidity}) + \beta_3 (\text{Road Density}) + \dots + \varepsilon$$
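To make the roles of the intercept, slope, and residual concrete, the sketch below fits the single-predictor special case of Eq. 5.5 with the closed-form least squares solution. It is an illustration only: the thesis model has many explanatory variables and was fitted in ArcGIS, and the function name and data here are hypothetical.

```python
# Closed-form OLS for one explanatory variable (special case of Eq. 5.5).

def simple_ols(x, y):
    """Return (intercept, slope, residuals) minimizing the squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return intercept, slope, residuals


road_density = [1.0, 2.0, 3.0, 4.0]   # hypothetical explanatory variable
fire_count = [3.0, 5.0, 7.0, 9.0]     # hypothetical dependent variable
print(simple_ols(road_density, fire_count))
```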
The OLS is often used as a diagnostic tool and for choosing the appropriate explanatory
variables for the GWR model (ArcGIS, 2021a). According to ArcGIS, there are six essential
statistical checks before employing the GWR model:
Check 1: Whether these explanatory variables are statistically significant.
Check 2: Whether the relationship (positive or negative) meets the expectations.
Check 3: Whether there are redundant explanatory variables.
Check 4: Whether the model is biased.
Check 5: Whether all key explanatory variables are found.
Check 6: What is the performance of the created model?
After reviewing theory and previous research, a group of candidate explanatory variables was
identified. However, some of them are statistically significant while others are not
(Check 1). In ArcGIS, OLS calculates a coefficient for each explanatory variable. An asterisk
(*) next to a coefficient value indicates that the variable is statistically significant to the model.
However, due to the nonstationarity of the wildfire data in this thesis, statistical significance
can only be judged from the robust probability. OLS also calculates the Koenker (BP) statistic
(Koenker's studentized Breusch-Pagan statistic), which tests whether the explanatory variables
have a consistent (stationary) relationship with the dependent variable. When the Koenker
(BP) statistic is statistically significant, GWR should be considered. In addition, when a
list of candidate explanatory variables is made, the expected relationship for each variable
should be included (Check 2).
The statistical check of the redundant explanatory variables is also called the multicollinearity
test (Check 3). Multicollinearity refers to a situation in which two or more explanatory variables
in a multiple regression model are highly linearly related, that is, one explanatory variable can be
represented by a linear combination of other explanatory variables. This leads to redundant
information that can skew regression results.
Therefore, when regression analysis is performed on the relationship between dependent
variables and explanatory variables, a multicollinearity test should first be performed on the
explanatory variables to eliminate those that lead to a significant collinearity
relationship. Given a set of nonzero variables $X_1, X_2, \dots, X_n$, if there are constants
$b_1, b_2, \dots, b_n$, not all zero, that satisfy the following equation:
$$b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_n X_n = 0 \tag{5.6}$$
then there is a linear relationship between $X_1, X_2, \dots, X_n$.
The Variance Inflation Factor (VIF) value, which is calculated by OLS, is a measure of
variable redundancy and can assist in determining which variables can be removed without
compromising the explanatory power. VIF can be calculated by the formula below (Wheeler and
Tiefelsdorf, 2005):
$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2} = \frac{1}{\mathrm{Tolerance}} \tag{5.7}$$
where $R_i^2$ represents the unadjusted coefficient of determination for regressing the $i$-th
explanatory variable on the remaining ones. The reciprocal of VIF is known as tolerance. If VIF
is greater than 7.5, it indicates that there is multicollinearity between explanatory variables.
Moreover, the model residuals of a proper regression model should be normally distributed (Check
4). The Jarque-Bera Statistic tests a null hypothesis that the residuals are normally distributed.
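The VIF threshold in Eq. 5.7 can be made concrete with a short sketch. A VIF of 7.5 corresponds to an $R_i^2$ of about 0.867, since $1 - 1/7.5 \approx 0.867$. The code below computes VIF from a given $R_i^2$ and, for the special case of only two explanatory variables, from their Pearson correlation (in that case $R_i^2$ equals the squared correlation). The function names and sample data are hypothetical; with more than two variables, $R_i^2$ would come from a multiple regression, which is not shown.

```python
import math

# VIF from R^2 (Eq. 5.7) and the two-predictor special case.

def vif_from_r2(r_squared):
    """Variance inflation factor for a given R_i^2 (must be below 1)."""
    return 1.0 / (1.0 - r_squared)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def vif_two_predictors(x1, x2):
    """With exactly two explanatory variables, R_i^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return vif_from_r2(r * r)


print(vif_from_r2(0.8))
print(vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0]))
```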
To avoid missing important explanatory variables (Check 5), it is necessary to return to
the relevant literature and come up with as many candidate spatial variables as
possible. Statistically significant spatial autocorrelation (clustering) of the residuals is
frequently a signal of misspecification caused by the absence of crucial explanatory variables.
The last check focuses on the model's performance (Check 6). The adjusted R2 value is an
important measure of how well the explanatory variables are modeling the dependent variable. R2
values vary from 0 to 1 and are expressed as a percentage. The Akaike information criterion (AIC)
is another measure of model performance. Models with a lower AIC value are better.
In this thesis, to identify the explanatory variables that significantly explain the dependent
variable, the OLS and the Exploratory Regression tool in ArcGIS 10.8 were used. The Exploratory
Regression tool generated a summary report that comprised the maximum Variance Inflation
Factor (VIF) value, the explanatory variables’ significance and sign, the Jarque-Bera p-value, the
spatial autocorrelation p-value, the Akaike’s Information Criterion (AIC), and the adjusted R2
value, etc.
5.7.2. Poisson Regression
In addition to transforming discrete count data to a continuous scale as described above, the
expected number of wildfires can be estimated by a Poisson generalized linear model (GLM).
Since the 1970s, the Poisson model has been regarded as a reliable method for wildfire occurrence
modeling and risk assessment (Cunningham and Martell, 1973). ArcGIS Pro 2.8 provides a
Generalized Linear Regression tool, which combines three different model types: in addition to the
existing OLS model type, it offers a Poisson model type for count data.
OLS assumes a bell-shaped (normal) distribution. If the data distribution is not bell-shaped, as is
the case for our dependent variable (i.e., wildfire occurrence), the Poisson regression model might
be appropriate. The Poisson regression model is written as (Cameron and Trivedi, 2013):

$$P(Y_i = y_i \mid \mu_i, t_i) = \frac{e^{-\mu_i t_i} (\mu_i t_i)^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, 3, \dots \tag{5.8}$$
where $P(Y_i = y_i \mid \mu_i, t_i)$ is the probability that $y_i$ wildfire events occur during a given time
period $t_i$, and $\mu_i$ is the parameter that denotes the expected value of $Y_i$ per unit of time. The
symbol “!” denotes the factorial. The Poisson distribution assumes that the variance and the mean are equal
(equidispersion), expressed as μ = E(Y) = Var(Y), where µ is the mean of the count data Y.
Poisson regression assumes that Y is a Poisson-distributed random variable whose expected value
can be modeled as a linear combination of unknown parameters, either directly through an identity
link function (Eq. 5.9) or through a natural-logarithm link function (Eq. 5.10).
Identity link: \[ \mu_i = \beta_0 + \beta \chi_i \qquad (5.9) \]
Log link: \[ \ln(\mu_i) = \beta_0 + \beta \chi_i \qquad (5.10) \]
where 𝜒𝑖 represents the explanatory variables; 𝛽0 is the intercept coefficient, and β is the vector
of the model slope coefficients.
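The log-link model of Eq. 5.10 can be illustrated with a minimal sketch. It assumes a single explanatory variable and fits the two coefficients by Fisher scoring using only the Python standard library; the data and function names are hypothetical and are not the procedure used by ArcGIS Pro or SPSS.

```python
import math

def _score_and_information(b0, b1, x, y):
    """Score vector and 2x2 Fisher information for ln(mu_i) = b0 + b1 * x_i."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    g0 = sum(yi - mi for yi, mi in zip(y, mu))
    g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
    h00 = sum(mu)
    h01 = sum(mi * xi for xi, mi in zip(x, mu))
    h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
    return (g0, g1), (h00, h01, h11)

def fit_poisson_log_link(x, y, max_iter=100, tol=1e-12):
    """Fit the log-link Poisson model; returns (b0, b1, se_b0, se_b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = _score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det  # solve information * step = score
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard errors are the square roots of the diagonal of the inverse information.
    _, (h00, h01, h11) = _score_and_information(b0, b1, x, y)
    det = h00 * h11 - h01 * h01
    return b0, b1, math.sqrt(h11 / det), math.sqrt(h00 / det)

# Hypothetical data: one covariate value and one wildfire count per grid cell.
covariate = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
counts = [1, 3, 2, 5, 6, 9]
b0, b1, se0, se1 = fit_poisson_log_link(covariate, counts)
predicted = [math.exp(b0 + b1 * xi) for xi in covariate]  # inverse link
```

At the fitted coefficients the predicted counts sum to the observed counts, which is a quick sanity check on any Poisson fit with an intercept.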
5.7.3. Overdispersion: Quasi-Poisson Regression
Although the Poisson model has been considered reliable for predicting the number of wildfires,
it is often challenged by the strict requirement of the Poisson distribution that the mean equal
the variance. Although this equal mean-variance relationship holds in theory, it is rarely found
in observational data. In most cases, the observed variance is larger than the assumed variance,
which is called overdispersion (Ismail and Jemain, 2007).
Because wildfires are rare events, zero counts tend to occur more often than higher counts,
so the number of wildfires is also subject to overdispersion.
If overdispersion is ignored, the model fitting will underestimate the standard errors for Poisson
regression model coefficients and lead to biased hypothesis testing.
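A first, informal check for overdispersion is to compare the sample variance of the counts with their mean. The sketch below uses hypothetical per-cell counts dominated by zeros; the function name is illustrative only.

```python
import statistics

def variance_to_mean_ratio(counts):
    """Return (mean, sample variance, variance / mean) for a list of counts."""
    mean = statistics.fmean(counts)
    variance = statistics.variance(counts)  # sample variance (n - 1 denominator)
    return mean, variance, variance / mean

# Hypothetical wildfire counts per grid cell: many zeros and a few large values.
cell_counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 35, 0, 4, 0, 60]
mean, variance, ratio = variance_to_mean_ratio(cell_counts)
overdispersed = ratio > 1.0  # equidispersion would give a ratio near 1
```

A ratio far above 1, as here, suggests that the equal mean-variance assumption of the Poisson model does not hold.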
One way to adjust the Poisson model for dispersion is the quasi-Poisson, or scaled Poisson,
model. A dispersion parameter is introduced into the Poisson variance so that the Poisson model
is scaled, and inference is based on robust standard errors (Lee et al., 2012).
The dispersion parameter (σ²) is estimated as the deviance or Pearson's χ² statistic from the
goodness-of-fit test divided by its degrees of freedom. If the estimated dispersion is > 1, the data
may be overdispersed, while a dispersion < 1 indicates that the data may be underdispersed. Once
the dispersion parameter is calculated, the quasi-likelihood method is applied. It requires the
standard errors generated from the Poisson model to be multiplied by the factor ω,
which is the square root of the dispersion parameter:
\[ \omega = \sqrt{\sigma^2} \qquad (5.11) \]
This yields the scaled, or quasi-Poisson, regression model, which has the same mean function as the
Poisson regression model but a variance that is ω² (= σ²) times the mean (Lee et al., 2012):
\[ \mathrm{Var}(Y) = \omega^2 \mu = \sigma^2 \mu \qquad (5.12) \]
The model is fit in the usual way. The resulting coefficients are the same as those estimated by
the Poisson model; only their standard errors are larger. The t statistic is the coefficient
divided by its standard error, and the critical values at the 95 percent confidence level are
-1.96 and +1.96. Once the t statistic is found, the associated p-value is obtained. The t
statistics are therefore calculated from the robust standard errors and the original coefficients
so that inference is made at the intended confidence level.
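The scaling step can be sketched as follows. The deviance (367909.652) and degrees of freedom (7142) are the values reported for the Poisson model in Section 6.5.2; the coefficient and its Poisson standard error are hypothetical and serve only to show how a result can lose significance once the standard error is scaled.

```python
import math

def quasi_poisson_inference(coef, poisson_se, deviance, df, critical=1.96):
    """Scale a Poisson standard error by the square root of the dispersion."""
    dispersion = deviance / df          # estimated dispersion parameter
    omega = math.sqrt(dispersion)       # multiplier for the standard errors
    scaled_se = poisson_se * omega
    t_stat = coef / scaled_se           # the coefficient itself is unchanged
    return {
        "dispersion": dispersion,
        "omega": omega,
        "scaled_se": scaled_se,
        "t": t_stat,
        "significant": abs(t_stat) > critical,
    }

# Hypothetical coefficient and Poisson standard error (illustration only).
result = quasi_poisson_inference(coef=0.10, poisson_se=0.01,
                                 deviance=367909.652, df=7142)
naive_z = 0.10 / 0.01  # the unscaled statistic would look highly significant
```

With a dispersion of about 51.5, every standard error grows by a factor of about 7.2, so a coefficient that appears significant under the naive Poisson model may no longer be significant.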
5.7.4. Geographically Weighted Gaussian Regression (GWGR)
As a spatial regression technique, geographically weighted regression has certain advantages
in spatial data mining and analysis. The traditional regression model, namely the global model,
assumes before analysis that the relationship between variables is spatially constant, and the
model parameters obtained are single estimates for the whole study area. In practice, however,
observations are generally sampled at given geographical locations, and the relationship or
structure among variables changes with location; this is known as spatial nonstationarity.
Different parts of the study area have different properties, which means that spatial
heterogeneity exists in the spatial analysis. Therefore, the results of a single global model may
misrepresent the actual local situation.
Tobler's first law of geography, “Everything is related to everything else, but near things are
more related than distant things” (Tobler, 1970), can be applied to the study of wildfire in the
Amazon Rainforest. Fire activity differs with variations in environment, vegetation
type, etc. (Luke and McArthur, 1978). Spatial autocorrelation reflects the degree to which
observations are located near other observations with similar values (Tobler, 1970). It can also
be considered a causal process that measures the influence of the neighborhood effect
(Goodchild, 1986).
Fotheringham et al. (2003) compared classic GWR to a "spatial microscope" because of its
ability to measure and visualize variations in relationships that are unobservable in aspatial,
global models. This modeling approach focuses on spatial differences and the search for
exceptions or local "hot spots." GWGR constructs a separate OLS equation (as a spatial
disaggregation of the OLS) for every location in the dataset, using the dependent and explanatory
variables of the locations that fall within the bandwidth of each target location. Therefore, it
is critical to choose an optimal bandwidth for model fitting (Fotheringham et al., 2003).
Bandwidth can be defined based on the literature, or it can be determined in statistical software
such as ArcGIS or GWR4 using the corrected Akaike Information Criterion (AICc) or the
cross-validation (CV) criterion. According to Fotheringham et al. (2003), the AICc is generally
more applicable than the CV.
In line with Tobler's (1970) First Law of Geography, the observed fire points in the GWGR
wildfire model are weighted according to their proximity to the regression point, with a higher
weight placed on observations located closer to the regression point than on those farther away
(Brunsdon et al., 2002). Fotheringham et al. (2003) gave the general form of a basic GWGR model
as:
\[ y_i = \beta_{i0} + \sum_{k=1}^{m} \beta_{ik} x_{ik} + \varepsilon_i \qquad (5.13) \]
where yi is the dependent variable at location i; xik is the kth explanatory variable at location
i; m is the number of explanatory variables; βi0 is the intercept parameter at location i; βik is
the local regression coefficient for the kth explanatory variable at location i; and εi is the
random error at location i.
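A minimal sketch of Eq. 5.13 with a single explanatory variable (m = 1) is given below. It assumes a Gaussian distance-decay kernel with a fixed bandwidth, which is one common choice rather than necessarily the kernel used in this thesis; the transect data and function name are hypothetical.

```python
import math

def gwr_local_fit(coords, x, y, bandwidth):
    """Return one (intercept, slope) pair per location from weighted least squares."""
    estimates = []
    for ui, vi in coords:
        # Gaussian kernel: observations near the regression point get more weight.
        w = [math.exp(-0.5 * (math.hypot(ui - uj, vi - vj) / bandwidth) ** 2)
             for uj, vj in coords]
        sw = sum(w)
        swx = sum(wj * xj for wj, xj in zip(w, x))
        swy = sum(wj * yj for wj, yj in zip(w, y))
        swxx = sum(wj * xj * xj for wj, xj in zip(w, x))
        swxy = sum(wj * xj * yj for wj, xj, yj in zip(w, x, y))
        det = sw * swxx - swx * swx
        slope = (sw * swxy - swx * swy) / det      # local coefficient beta_i1
        intercept = (swy - slope * swx) / sw       # local intercept beta_i0
        estimates.append((intercept, slope))
    return estimates

# Hypothetical west-to-east transect in which the true slope changes from 1 to 3.
transect = [(float(i), 0.0) for i in range(10)]
x_obs = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0]
true_slope = [1.0 if i < 5 else 3.0 for i in range(10)]
y_obs = [s * xv for s, xv in zip(true_slope, x_obs)]
local = gwr_local_fit(transect, x_obs, y_obs, bandwidth=1.5)
```

A single global OLS fit would return one compromise slope for this transect, whereas the local estimates recover a slope close to 1 in the west and close to 3 in the east.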
5.7.5. Geographically Weighted Poisson Regression (GWPR)
Although the GWR model can effectively address spatial non-stationarity, the technique is
appropriate only when the data follow a Gaussian distribution. Overdispersion is one of the most
common problems encountered in spatial count data (Haining et al., 2009), and most explanatory
variables are skewed, which makes classic GWR inappropriate for modeling this type of data.
Therefore, to address this issue, classic GWR has been extended to the framework of generalized
linear models. For example, Da Silva and Rodrigues (2013) developed the Geographically Weighted
Poisson Regression (GWPR), which is more appropriate for this thesis research because the
dependent variable is discrete and represents the number of wildfire occurrences. The GWPR model
is developed by adding geographical location to the standard Poisson regression. The GWPR model is
written as:
\[ \ln(\mu_i) = \beta_0(v_{xi}, v_{yi}) + \sum_{k=1}^{p} \beta_k(v_{xi}, v_{yi}) X_{ki} \qquad (5.14) \]
where β0 and βk are the GWPR model parameters at location i, which is described by its x and y
coordinates (vxi, vyi). Thus, the expected value μi can be predicted by the inverse link function
$\hat{\mu}_i = e^{\hat{\beta}_0 + \hat{\beta}\chi_i}$. For modelling wildfire occurrences, the GWPR
model can be re-written as Equation 5.15, where a constant was added to the average observed
number of wildfire occurrences to mitigate problems related to counts of zero.
\[ \ln(W_i + 1) = \ln(\mu_i) = \beta_0(v_{xi}, v_{yi}) + \beta_1(v_{xi}, v_{yi})(\text{Elevation}) + \beta_2(v_{xi}, v_{yi})(\text{Slope}) + \ldots \text{ (all explanatory variables)} + \varepsilon_i \qquad (5.15) \]
5.7.6. Model Evaluation and Comparison
The AICc and adjusted R2 values may be used to evaluate the performance of all
global and local models. Adjusted R2 values indicate how well a regression line is fitted, while
the AICc is an information criterion used to compare the goodness of fit of models. According
to the evaluation criteria proposed by Fotheringham, a model with a higher adjusted R2 value has
greater explanatory power, whereas a smaller AICc implies better model fitting performance. The
spatial autocorrelation (global Moran's I) tool is also used to evaluate the spatial
autocorrelation of the global and local model residuals. Moran's index is calculated at the
significance level of 0.05. The closer the index value is to -1 or +1, the higher the degree of
spatial autocorrelation in the residuals, indicating that the model cannot adequately explain the
change in the dependent variable; the closer the index value is to 0, the lower the spatial
dependence of the residuals, and hence the better the model accounts for spatial structure. In
global regression, spatial autocorrelation is a commonly encountered issue when the model cannot
be adjusted for existing spatial heterogeneity.
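The global Moran's I used for the residual check can be written out directly. The sketch below assumes binary contiguity weights for cells arranged in a single row, purely for illustration; the residual values are hypothetical and the significance test performed by ArcGIS (z-score and p-value) is not reproduced.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Binary contiguity for n cells in a row (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

w = chain_weights(4)
trending_residuals = [1.0, 2.0, 3.0, 4.0]        # similar values are adjacent
alternating_residuals = [1.0, -1.0, 1.0, -1.0]   # dissimilar values are adjacent
i_trend = morans_i(trending_residuals, w)
i_alternating = morans_i(alternating_residuals, w)
```

Residuals that trend across space give a positive index, while residuals that alternate between neighbors give a negative one; a well-specified model should leave residuals with an index near 0.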
To evaluate the spatial variation or heterogeneity of the explanatory variables in the GWGR and
GWPR models, Chen et al. (2012) developed an approach that compares the interquartile range (IQR)
of the coefficient estimates computed by the localized GWR models with the standard
error of the global estimates derived with a classic regression model (Chen et al., 2012). When
the IQR is twice as large as the standard error, spatial non-stationarity exists in the
relationships between the dependent variable and the explanatory variables.
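The IQR rule can be expressed in a few lines. The sketch below interprets "twice as large" as IQR >= 2 x SE, which is an assumption about the threshold, and it uses the inclusive quartile method of the Python standard library, which may differ slightly from the quartile definition used by GIS software. The coefficient values are hypothetical.

```python
import statistics

def nonstationarity_check(local_coefficients, global_standard_error):
    """Compare the IQR of local coefficients with twice the global standard error."""
    q1, _median, q3 = statistics.quantiles(local_coefficients, n=4,
                                           method="inclusive")
    iqr = q3 - q1
    return iqr, iqr >= 2.0 * global_standard_error

# Hypothetical local GWR coefficients for one variable and its global OLS standard error.
local_slopes = [0.0, 1.0, 2.0, 3.0, 4.0]
iqr, varies_over_space = nonstationarity_check(local_slopes,
                                               global_standard_error=0.5)
```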
Chapter 6. Results
6.1. Wildfire Data Extraction
To avoid false detections and improve the accuracy of the analysis, all fires with a confidence
level of 30% or lower were removed from the final wildfire dataset, in line with the performance
of the MCD14ML algorithm reported by Giglio et al. (2016). The error rate of fire detection is
around 4% for South America overall. In total, 81,010 fire points were used as study objects in
this thesis. The number of wildfire points at each particular location is recorded as ICOUNT.
6.2. Directional Distribution
The standard deviation ellipse of wildfire incidents in 2019 showed directionality, with a
rotation of about 73.06° from north, which means that the spatial distribution pattern was
dominated by a Northeast-Southwest orientation (See Figure 6.1). There is more variation along the
x-axis than along the y-axis because the data stretch farther across the western and central
Amazônia Biome than they stretch to the north or south.
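The rotation of a standard deviational ellipse can be sketched from the deviations of the points about their mean center. The code below uses the commonly cited closed-form expression for the angle, measured clockwise from north; it is an illustration on hypothetical points and is not guaranteed to match every detail of the ArcGIS Directional Distribution tool.

```python
import math

def ellipse_rotation_degrees(points):
    """Rotation of the long axis, in degrees clockwise from north (0 to 180)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    if sxy == 0.0:
        # Axes already aligned with the grid: east-west if x varies more, else north-south.
        return 90.0 if sxx >= syy else 0.0
    a = sxx - syy
    b = math.sqrt(a * a + 4.0 * sxy * sxy)
    return math.degrees(math.atan2(a + b, 2.0 * sxy))

# Hypothetical point sets (projected coordinates).
northeast_trend = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
mostly_east_west = [(0.0, 0.0), (4.0, 1.0), (8.0, 2.0)]
angle_ne = ellipse_rotation_degrees(northeast_trend)      # points on the line y = x
angle_ew = ellipse_rotation_degrees(mostly_east_west)     # points on the line y = x / 4
```

An angle between 45° and 90°, such as the 73.06° reported above, describes a Northeast-Southwest ellipse that is stretched more along the east-west direction than along the north-south direction.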
Figure 6.1. The standard deviation ellipse of wildfire incidents.
6.3. Temporal Distribution
Within the Amazônia Biome region, wildfires also showed temporal heterogeneity. Figure 6.2
presents the temporal distribution of wildfires in 2019, which was strongly inhomogeneous across
the months of the year. According to the figure, the number of wildfires averaged 7,153 per month
but ranged from 405 in January to 35,257 in August. August and September had the highest numbers
of wildfires, and June to September was the wildfire-prone time of the year, accounting for 85.5%
of all cases in the year. Apart from the well-known wildfire high season in summer, a secondary
peak appeared in March.
Figure 6.3 shows the results of the hotspot and directional trend analyses performed on wildfire
data from January to December 2019. The dark red hexagon bins signify areas with intense
clustering of high values at 99% confidence. These are areas where high numbers of wildfires
occurred, and as such they require special attention from wildfire authorities. The distribution
of wildfires in the Amazônia Biome region presents important spatial and temporal inhomogeneities,
which preventive wildfire policy should take into account.
According to Figure 6.3, four wildfire hotspots were detected in the study area in January: one in
the southeast of Roraima state, one in the northwest corner of Pará state, one hotspot group in
the southern part of Mato Grosso state, and a cluster of adjacent hotspots in northern Maranhão
state. The standard deviation ellipse in January showed a strong directionality, and the
spatial distribution pattern was dominated by a Northeast-Southwest orientation. In February,
March, and April, the hotspots were mainly distributed in the state of Roraima, and the spatial
distribution pattern was dominated by a Northwest-Southeast orientation. In May and June, wildfires
mainly occurred in the southern part of Mato Grosso state. Wildfires occurred most frequently in
July, August, and September (which can be considered the fire season), and hot spots were widely
distributed in the states of Acre, Amazonas, Pará, Mato Grosso, and Rondônia. During October,
November, and December, the hotspots were mainly detected in the eastern states (Pará and
Maranhão) and in a small part of Mato Grosso state.
Figure 6.2. Monthly number of wildfires in the Amazônia Biome in 2019.
Figure 6.3. Optimized hot spot analysis for each month.
Note: The ellipse in each plot presents the directional trend of wildfires for each month.
6.4. Spatial Distribution
For the wildfire incidents dataset, Figure 6.4 shows that the observed mean nearest neighbor
distance was 6574.7863 meters and the expected mean nearest neighbor distance was
11311.2084 meters. These two values were compared using the normally distributed
Z statistic. The calculated Z value was much less than -2.58, so the null hypothesis of complete
spatial randomness was rejected. The negative Z value indicated that a clustered
pattern exists. Moreover, the nearest neighbor index (NNI), the ratio of the observed to the
expected distance, was Rn = 0.581263; a ratio below 1 also indicates that the distribution of
wildfires was clustered, at the 99% confidence level.
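The average nearest neighbor statistics can be sketched with a brute-force distance search. The expected distance and standard error below follow the usual formulas for a random pattern of n points in a study area A; the coordinates and area are hypothetical, and the last line simply recomputes the ratio from the two distances reported above.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (observed mean distance, expected mean distance, ratio, z-score)."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)            # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed, expected, observed / expected, (observed - expected) / std_error

# Hypothetical pattern: two tight groups of points inside a 100 x 100 study area.
fires = [(10, 10), (10, 11), (11, 10), (11, 11),
         (90, 90), (90, 91), (91, 90), (91, 91)]
observed, expected, ratio, z = average_nearest_neighbor(fires, area=100.0 * 100.0)

# Ratio recomputed from the distances reported in Figure 6.4.
reported_ratio = 6574.7863 / 11311.2084
```

A ratio below 1 together with a negative z-score indicates clustering, which is the pattern reported for the wildfire data.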
Figure 6.4. Average nearest neighbor summary of wildfires.
As mentioned in section 5.5, the density surface can provide a way to visualize the
concentration of wildfire occurrences. As seen from Figure 6.5, wildfires were mainly concentrated
in the northern part of Rondônia, the southern part of Amazonas, eastern Acre, and most of the
central part of Roraima. In addition, there were certain degrees of aggregation in the northeast and
southwest part of Pará, and southeast part of Mato Grosso. Note that the mean center of wildfire
occurrences also aligned well with the highest density.
Figure 6.5. Kernel density map of wildfire incidents in Amazônia Biome in 2019.
The spatial distribution of wildfire incidents in 2019 was clustered, which gives it analytical
value. The results of the spatial autocorrelation tool suggested that the pattern of total
wildfires at each feature location is clustered (See Figure 6.6). The Moran's Index was 0.1821,
the z-score was 159.6426, and the p-value was 0.00000. Since the z-score was greater than the
critical value of 2.58, there was less than a 1% likelihood that the clustered pattern was the
result of random chance. The p-value was statistically significant, and the z-score was positive,
so the null hypothesis was rejected. The spatial distribution of high values and/or low values in
the dataset was more spatially clustered than would be expected if the underlying spatial
processes were truly random.
Figure 6.6. Global Moran’s I summary of wildfire incidents.
Figure 6.7 (A) shows the output of the Getis-Ord Gi* analysis in the study area. The tool's
output field, GIZScore, holds a z-score for each feature that indicates the statistical
significance of feature clusters and, ultimately, the hot and cold spots. The red color
indicates areas that have statistically significant clusters of high wildfire occurrence values,
referred to as hot spots. The blue color indicates areas that have statistically significant
clusters of low wildfire occurrence values, referred to as cold spots. The varying shades of
these colors indicate how confident one can be that these clusters of high and low values are not
random, accidental results but represent meaningful spatial patterns.
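The z-score behind each feature can be illustrated with the standard Getis-Ord Gi* formula. The sketch below assumes binary weights within a fixed distance band, with each feature counted as its own neighbor; the coordinates and counts are hypothetical, and the false discovery rate correction applied by the optimized tool is not included.

```python
import math

def getis_ord_gi_star(coords, values, distance_band):
    """Return the Gi* z-score of every feature using binary fixed-distance weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for ui, vi in coords:
        # Weight 1 inside the band (including the feature itself), 0 outside.
        w = [1.0 if math.hypot(ui - uj, vi - vj) <= distance_band else 0.0
             for uj, vj in coords]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        weighted_sum = sum(wj * xj for wj, xj in zip(w, values))
        # The denominator is zero if every feature falls inside the band.
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append((weighted_sum - mean * sum_w) / denom)
    return z_scores

# Hypothetical row of ten cells with a cluster of high counts at the western end.
cells = [(float(i), 0.0) for i in range(10)]
fire_counts = [10, 10, 10, 1, 1, 1, 1, 1, 1, 1]
gi_z = getis_ord_gi_star(cells, fire_counts, distance_band=1.0)
```

Cells inside the cluster of high counts receive large positive z-scores (hot spots), whereas cells surrounded by low counts receive negative z-scores.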
Hot spots and cold spots were represented by points with z-scores of
-3.350883 < z < 5.397312 (red) and -9.917025 < z < -6.628708 (blue) standard deviations,
respectively. In addition, Figure 6.7 shows that features with a high positive z-score were
specified as hot spots (red), whereas features with a low z-score were specified as
cold spots (blue).
Figure 6.7 (B) shows the result layer of OHSA in the hexagon grids. This analysis validated
all wildfires and aggregated them into 4,836 weighted hexagonal polygons. The optimal fixed
threshold distance based on peak clustering was found at 37.9 km. At this threshold distance, none
of the features had fewer than eight neighbors, which is a good indication of the accuracy of the
results.
At the state level, because the existing polygons inside the Amazônia Biome region (e.g.,
Country, Unidade, Municipio, Distrito, etc.) are not identical in size and shape, the study area
had to be divided into identical polygons to avoid size bias. Thus, the “Count incidents within
the hexagon grid” option was selected, which divides the study area into similar hexagonal
polygons. For each polygon, the number of wildfire occurrences located in it is stored in a
column (Counts) created by the software.
In the “Bounding Polygons Defining Where Incidents Are Possible” field, the shapefile of the
Amazônia Biome boundary was supplied to identify the border of the study area. Moreover, because
the study area is relatively large, the hexagon grid aggregation method was adopted, which
reduces distortion compared to the fishnet grid. Before other factors related to
wildfire occurrence were identified, it was assumed that a distance band of 37.9 km was
appropriate for this analysis.
In addition, this thesis also used method 3) in the above section of Define Analysis Field.
Figure 6.7 (C) shows the result layer of Getis-Ord Gi* in the fishnet grids. This analysis used the
Fixed Distance Band as the Conceptualization of Spatial Relationships, which means that all
features within the neighborhood have the same weight. Specifically, all the wildfire points in
a grid cell, whether at its center or in a corner, had the same effect on the analysis.
Figure 6.7. GIZScore Map.
Note: (A) Getis-Ord Gi* hotspot analysis (Collect Events) (B) Optimized hot spot analysis in the hexagon grids
(C) Getis-Ord Gi* hotspot analysis in the fishnet grids.
The visualized results of the Optimized Hot Spot Analysis and Getis-Ord Gi* for wildfires are
shown in Figure 6.8. Figure 6.8 (A) shows six hotspots with different levels of significance: one
in the central part of Roraima; one spanning the four states in the southwest part of
the Amazônia Biome region (Acre, Amazonas, Rondônia, and Mato Grosso); one in the southeastern
part of Mato Grosso; and the remaining three in the central part of Pará state. However, no cold
spots were identified, and the rest of the states were classified as “Not Significant,” which
means that the wildfires that occurred in those areas were distributed randomly, with no
significant pattern. Compared with the Getis-Ord Gi* result, the Optimized Hot Spot
Analysis result (See Figure 6.8 (A)) indicates large hot spots and cold spots that occurred over
all or nearly all states, while the Hot Spot Analysis (Getis-Ord Gi*) result (See Figure 6.8 (B))
indicates smaller, more refined hot spots and cold spots that are typically distributed over
sections of states.
Figure 6.8. GI_Bin Map.

Note: (A) Optimized hotspot analysis in the hexagon grids (B) Getis-Ord Gi* hotspot analysis in the fishnet
grids.
The smooth surface was classified into five classes of hotspots (See Figure 6.9). The
areas characterized as high, moderate, and low require different levels of attention
depending on the specific area. The Very High (red) areas require more attention
from the local government in managing wildfire incidence, as wildfires are more concentrated
there, with high z-scores. In contrast, the Very Low areas (blue) show a statistically
significant clustering pattern with negative z-scores. These areas require the least
attention as they are considered cold spots.
In addition, Figure 6.9 reveals the wildfire hotspots in the study area. There were significantly
high wildfire occurrences in states across the east and west parts and a few across the northern
states, such as the middle of Mato Grosso, the southeast and middle south of Amazonas, the
northeast of Rondônia, and the middle of Roraima. There were significantly low wildfire
occurrences in the east and some of the north-central states, such as Acre and Pará, and across
the middle of Amazonas. The red zones indicate the areas with significant spatial clustering,
which have very high z-scores.
Figure 6.9. IDW of GIZScore map.
The result shown in Figure 6.10 is a layer of clusters with high wildfire occurrences, clusters
with low wildfire occurrences, and spatial outliers. The pink color indicates hexagon areas that
have statistically high values, referred to as hot spots. The green color indicates hexagon areas
that have statistically low values, referred to as cold spots. The hot spot analysis identified
north-central, north-west, central, and south areas where wildfire occurrences were significantly
intense. The outlier analysis provided additional insight into these patterns, noting spatial
outliers within these intense areas. These wildfire occurrences could be examined further, or the
relationship between wildfire and different explanatory factors in these areas could be explored.
As shown in Figure 6.10, there were 25 High-Low outliers shown in red, and they were mostly
in the Amazonas state. This means 25 features had an unexpectedly high number of wildfire
occurrences compared to their neighboring features. However, a total of 237 features were
identified as Low-High outliers, which are shown in blue hexagonal polygons located frequently
around hotspots. These features had a low number of wildfire occurrences, but their surrounding
features had a high or not statistically significant number of wildfire occurrences. This means that
these features unexpectedly had fewer wildfire occurrences in comparison with their neighbors.
In terms of clusters, a total of 1,973 features were identified as Low-Low clusters, or
coldspots (most of them located in the northern part of the Amazônia Biome). The Optimized Hot
Spot Analysis tool did not identify any coldspot features. Furthermore, a total of 423 features
were identified as High-High clusters, or hotspots, which were in almost the same
locations as those identified by the Optimized Hot Spot Analysis tool (671 features). The Low-Low
clusters mean that these locations had a low number of wildfire occurrences and that their
surrounding locations also had a statistically significant low number of wildfire occurrences,
and vice versa for the High-High clusters. The analysis report shows that all 4,836 weighted
hexagonal polygons were used. The optimal threshold distance was found at 93,225 meters. At this
threshold distance, all of the features had at least eight neighbors, which is a robust indicator
of the accuracy of the results.
Figure 6.10. Optimized outlier analysis.
In addition, the Hot Spot Analysis was also conducted at the municipality level based on a
polygon data set, as shown in Figure 6.11. For OHSA, a specific distance band did not need
to be defined because the tool determines the neighborhood size automatically; the distance band
used here is 208 km. For the Getis-Ord Gi* analysis, since it was conducted on polygons
(municipalities) with different sizes and shapes, the distance relationship was eliminated: the
“Contiguity edges corners” conceptualization of spatial relationships was used, which means that
municipalities that touch one another were counted as neighbors, and thus the
distance threshold becomes irrelevant.
The results from both methods shown in Figure 6.11 were approximately identical; only a few
municipalities differed slightly. An additional analysis may help further refine the focus of
wildfire management work.
Figure 6.11. Hotspot analysis at municipality level.
6.5. Spatial Relationship
Since the wildfire occurrence locations cannot be directly examined as points, the first step
was to group the explanatory and dependent variables per grid. Then, the “Are_Identical_To”
option in the Spatial Join tool in ArcGIS 10.8 was used to aggregate all the variable tables into a
synthesized table (See Figure 6.12). The synthesized table includes the number of wildfire
occurrences and the values of the explanatory variables for each grid. Because a large number
of explanatory variables were thought to be significantly associated with wildfire occurrences,
this thesis used a data-mining tool called Exploratory Regression in ArcGIS 10.8 to evaluate every
possible combination of potential explanatory variables and find the OLS models that best
explained the dependent variable.
Figure 6.12. The synthesized table of the explanatory and dependent variables.
6.5.1. Global Model Using OLS
First, the global Exploratory Regression tool was used to create a single equation that
describes relationships between wildfire occurrences and each of the explanatory variables.
Fifteen explanatory variables were examined by the exploratory regression in order to select
appropriate variables for the OLS model. The first test of OLS investigated the significance
of each explanatory variable, in order to select proper variables for a more in-depth investigation.
Table 6.1 summarizes the significance of the explanatory variables: the Significant (%) column
shows how often each variable was statistically significant across every possible combination,
and the Negative (%) and Positive (%) columns show the direction of its influence. The strong
candidate variables are those that were significant over 50 percent of the time.
Even though every model passed the VIF multicollinearity test because all VIF values were
less than 7.5 (See Table 6.1), the Exploratory Regression failed to produce a passing
model. The spatial distribution of the model residuals was clustered, which implies that a
crucial explanatory variable may be missing. In addition, the Jarque-Bera (JB) test showed a
statistically significant JB score, which denotes that there is bias and potential
heteroscedasticity within the model and may result in overfitting or underestimation. The
Adjusted R-squared values of the models were also not satisfactory: the highest was 0.45,
meaning that the best model explained only about forty-five percent of the variation in the
dependent variable (ESRI, 2021a). The results of the Exploratory Regression are included in
Appendix C-1.
Due to the low explanatory power of the above model, steps were taken to
increase the strength of the models. According to the above diagnostic information, all models
failed the Jarque-Bera (JB) test, which could be due to data outliers and nonlinear
relationships. Curvilinearity can often be remedied by transforming skewed variables to give them
a normal distribution (ESRI, 2021c). Therefore, the normality of the dependent and
explanatory variables was first tested using SPSS. The Kolmogorov-Smirnov (KS) normality test
examines whether variables are normally distributed. The KS result was statistically significant
for each of the variables analyzed, indicating that the data were not normally distributed. To
produce a more meaningful model, the data needed to be transformed. The result of the KS
test is included in Appendix C-2.
Table 6.1. Summary of variable significance and multicollinearity.
Although estimation of the OLS regression does not necessarily require the normality
assumption, meaningful statistical inference from OLS estimators relies on the assumption that the
error term is normally distributed or at least asymptotically normally distributed. Therefore, to
ensure an effective statistical analysis, after comparing transformation approaches, a logarithm
transformation (Song et al., 2017) was applied to the dependent variable, and the explanatory
variables were converted using an unconditional Box-Cox transformation (Box and Cox, 1964).
All of the explanatory variables were then standardized to remove their units (dimensions),
avoiding numerical problems caused by very large values and balancing the contribution of each
factor. In this thesis, the data normalization (ESRI, 2021c) was performed with the Z-score method
($X' = (X - \bar{X}) / \sigma_X$) by using the Standardize Field and Transform Field tools of
ArcGIS Pro, in order to reduce skewness in the distribution and bring it closer to a normal
distribution, as required by the normality assumption of linear regression and checked with the
normality tests.
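The three transformations named above can be sketched with the Python standard library. The Box-Cox parameter lambda is passed in by hand here (its estimation is omitted), the Z-score uses the population standard deviation as an assumption, and all input values are hypothetical; the ArcGIS Pro tools may differ in these details.

```python
import math
import statistics

def log_plus_one(counts):
    """Logarithm transformation of counts, with 1 added so that zeros are defined."""
    return [math.log(c + 1) for c in counts]

def box_cox(values, lam):
    """Box-Cox transformation for strictly positive values and a given lambda."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def z_score(values):
    """Standardize values to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (assumption)
    return [(v - mean) / sd for v in values]

# Hypothetical wildfire counts and a skewed explanatory variable.
wildfire_counts = [0, 3, 12, 150]
road_density = [1.0, 4.0, 9.0, 100.0]
log_counts = log_plus_one(wildfire_counts)
standardized = z_score(box_cox(road_density, lam=0.5))
```

After standardization every explanatory variable has mean 0 and standard deviation 1, so the regression coefficients become comparable across variables measured in different units.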
After the transformation and normalization, the data were still not normally distributed, which
was caused by the overdispersion of the data. Although the result was not ideal, the histogram
showed that the data became smoother (Appendix C-3), and these data processing steps improved the
performance of the modelling: the R2 value increased from 0.45 to 0.52. The models
from the second Exploratory Regression, which used the transformed data, were much stronger than
those from the analysis of the original data and passed every test except the JB test and
the Spatial Autocorrelation test (Appendix C-4). However, due to the outliers in the data, the
relationship was still not perfectly linear.
The OLS model was then used to analyze the variables in the model and their relationship
with wildfire occurrences. The OLS regression was conducted using the transformed data; all
fifteen variables were tested and were not multicollinear. This global OLS model (See Table 6.2)
explained about 52 percent (adjusted R2 = 0.52) of the variation in wildfire
occurrence, with AICc = 20929.89. The explanatory variables listed below were
significant at the 95% confidence level (α = 0.05): the final OLS model showed that elevation,
slope, aspect, number of forest patches, edge proportion, population density, road density, forest
loss, cropland area, pastureland area, and wind speed were significantly correlated with the
wildfire counts (See Table 6.3).
Table 6.2. Summary of global OLS results.
Table 6.3. OLS diagnostics statistics.
The analysis of variance (ANOVA) returned a significant F-value (548.39), and the Wald
statistic had a significant chi-squared value (6096.61), so the model as a whole was
statistically significant. The chi-squared value of the Koenker statistic (1419.04) was also
statistically significant; the null hypothesis of stationarity was therefore rejected, indicating a
nonstationary condition in the model. Moreover, the Jarque-Bera statistic returned a significant
chi-squared value (p < 0.01), showing at the 99% confidence level that the residuals were not
normally distributed and that a biased model was obtained. These results revealed that in the OLS
model, the relationship between the dependent variable and the independent variables had spatial
heterogeneity, and therefore the GWR model would be a more desirable choice. One of the OLS
model assumptions is the independence of the residuals from one another. If spatial patterning is
present, it can indicate a violation of this assumption. The results of the OLS regression should
therefore be used with caution because the residuals are not normally distributed and
there is heteroscedasticity in the model. However, the histogram of the residuals generated by the
OLS regression still followed a near bell-shaped curve and did not appear to depart severely from
a normal distribution. The histogram is included in Appendix C-5.
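The normality diagnostic discussed above can be illustrated with a minimal sketch. The function below computes the Jarque-Bera statistic from the sample skewness and kurtosis of a set of residuals and compares it with a chi-squared distribution with two degrees of freedom. The function name and the example residuals are hypothetical; this is not the thesis data, nor necessarily the exact computation performed by the GIS software.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its chi-squared(2) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals (population form, divided by n).
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical residuals: one extreme outlier among many zeros.
jb_stat, p = jarque_bera([0.0] * 20 + [100.0])
print(f"JB = {jb_stat:.2f}, p = {p:.4g}")
```

A small p-value leads to rejecting normality of the residuals; it says nothing by itself about their spatial arrangement, which is assessed separately with Moran's I.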
6.5.2. Global Model Using Poisson
Table 6.4 shows that, in the Poisson regression model, the VIF values of the explanatory variables
indicated that the Poisson results were not biased by multicollinearity. All explanatory variables
were significantly associated with the number of wildfire occurrences at the 95% confidence level.
The Poisson model (see Table 6.4) explained about 55 percent of the variation in wildfire
occurrence, with AICc = 15594.00.
In this thesis, the deviance/df ratio (367909.652/7142) of the Poisson model was 51.514 according
to the goodness-of-fit output in SPSS, showing the existence of an overdispersion problem (a ratio
close to 1 would be expected without overdispersion). Moreover, from the summary statistics of the
dependent variable, the mean wildfire occurrence count is 12, while the variance is 1348; a
variance this much larger than the mean is typical of overdispersed data sets. To adjust for
overdispersion, the association between wildfire incidence and several explanatory variables was
modeled using the multivariate quasi-Poisson regression model. Table 6.4 summarizes the coefficient
estimates and standard errors for the Poisson and quasi-Poisson models. Overdispersion was evident
from the reported standard errors. While the Poisson and quasi-Poisson models had the same
estimated regression coefficients, the quasi-Poisson coefficient standard errors were larger. This
is evidence of overdispersion and indicates that the Poisson model underestimated the standard
errors. In contrast to the naive Poisson model, elevation, aspect, FVC, road density, and
temperature were not significant in the quasi-Poisson model.
Table 6.4. Poisson and quasi-Poisson diagnostics statistics.
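The overdispersion arithmetic above can be reproduced with a short sketch. It uses the deviance, degrees of freedom, mean, and variance reported in the text; the function names are hypothetical. Using the deviance/df ratio as the dispersion parameter is an assumption made here for illustration, since statistical packages often estimate the quasi-Poisson dispersion from the Pearson chi-squared statistic instead.

```python
import math


def dispersion_ratio(deviance, degrees_of_freedom):
    """Deviance divided by residual degrees of freedom (about 1 if no overdispersion)."""
    return deviance / degrees_of_freedom


def quasi_poisson_se(poisson_se, dispersion):
    """Inflate a Poisson standard error by the square root of the dispersion."""
    return poisson_se * math.sqrt(dispersion)


phi = dispersion_ratio(367909.652, 7142)   # values reported in the text
variance_to_mean = 1348 / 12               # variance and mean of the wildfire counts

print(f"deviance/df = {phi:.3f}")
print(f"variance/mean = {variance_to_mean:.1f}")
# Coefficients are unchanged; only the standard errors grow by sqrt(phi).
print(f"SE inflation factor = {math.sqrt(phi):.2f}")
```

Because the coefficients stay the same while the standard errors grow, variables that were marginally significant in the Poisson model can lose significance in the quasi-Poisson model.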
6.5.3. Geographically Weighted Gaussian Regression
As mentioned above, the Jarque-Bera and Koenker statistics calculated from the OLS
model indicated non-normal residuals and nonstationary relationships in the global model.
However, these results do not indicate which explanatory variables have spatial
heterogeneity. Therefore, in this thesis, the interquartile range (IQR) approach was used to evaluate
the spatial variation or heterogeneity of the explanatory variables. The results indicate that the
parameter estimates for all independent variables had spatial heterogeneity in the study area (see
Table 6.5) because their IQR values were at least twice as large as the standard errors of the
corresponding global model coefficients. Additionally, it is important to note that when the GWGR
model was calculated, the software reported global or local multicollinearity in the
model. Investigation showed that the locally redundant explanatory variables were road density and
cropland area. Therefore, these two variables were excluded from the GWGR models in further
analysis.
Table 6.5. Summary statistics for estimated local coefficients from the OLS model and relative spatial
variation status.
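The IQR criterion described above can be sketched as follows. The function compares the interquartile range of a set of local coefficients with twice the standard error of the corresponding global coefficient. The function name and the input values are hypothetical, and the quartile convention ("inclusive") is an assumption; other conventions give slightly different IQRs.

```python
import statistics


def has_spatial_variation(local_coefficients, global_standard_error):
    """Return (IQR, flag); the flag is True when IQR >= 2 x the global SE."""
    q1, _median, q3 = statistics.quantiles(
        local_coefficients, n=4, method="inclusive"
    )
    iqr = q3 - q1
    return iqr, iqr >= 2.0 * global_standard_error


# Hypothetical local estimates for one explanatory variable.
iqr, varies = has_spatial_variation([1, 2, 3, 4, 5], global_standard_error=0.5)
print(iqr, varies)
```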
Next, Global Moran's I analysis was used to determine the spatial autocorrelation
of the OLS and GWGR regression residuals. The Global Moran's I results indicated that the
residuals of the OLS model exhibited significant spatial autocorrelation, while the pattern of the
residuals from the GWGR model was random (see Figure 6.13). In this case, the GWGR model performed
better than the OLS model.
Figure 6.13. Moran’s I results of residuals.

Note: Left: OLS; Right: GWGR.
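For reference, Global Moran's I can be written out in a few lines. The sketch below uses a hypothetical four-cell strip with binary adjacency weights rather than the thesis fishnet, and it omits the significance test (z-score) that the GIS tool reports alongside the index.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    total = sum(d * d for d in deviations)
    return (n / s0) * (cross / total)


# Four cells in a row; each cell neighbors the cells next to it.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 1, 5, 5], adjacency))  # similar neighbors: positive I
print(morans_i([1, 5, 1, 5], adjacency))  # alternating values: negative I
```

Values near zero suggest a random pattern, positive values suggest clustering of similar residuals, and negative values suggest dispersion.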
The parameter estimates of each explanatory variable in the GWGR model were calculated with the
GWR4 software, which generated individual parameter estimates for each location and stored the
results in a .csv table. The Spatial Join tool was then used to spatially associate
the event points with the research fishnet for visualization, which makes the
complex, spatially varying relationships easier to comprehend. Since there is currently no
consensus on how to assess confidence in the coefficients from a GWGR model, the t-tests in the
GWGR model were adopted as an informal reference: the magnitude of each estimate was scaled by
its associated standard error, and the results were visualized to look for clusters of high standard
errors relative to their coefficients. Based on the 95% confidence level used in this thesis, a simple
mapping technique that combines the local parameter estimates with the local t-values on one map
was used, in which local t-values ranging from -1.96 to +1.96 (nonsignificant parameters) are
shown in gray, whereas the t-value layer was set to 100% transparency over the significant
parameters (see Figure 6.14).
Figure 6.14. Spatial distribution of significant model coefficients.

Note: (a) Elevation, (b) Slope, (c) Aspect, (d) FVC, (e) Number of Forest Patches, (f) Edge Proportion, (g)
Population Density, (h) Forest Loss, (i) Pasture Area, (j) Temperature, (k) Maximum Cumulative Water Deficit,
(l) Wind Speed, and (m) Relative humidity of the GWGR model. Grey pixels are non-significant coefficients.
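The masking rule used for these maps can be sketched in a few lines. The function below keeps a local coefficient only when its t-value (estimate divided by standard error) falls outside the interval from -1.96 to +1.96; the function name and sample values are hypothetical, and `None` stands in for a gray (nonsignificant) cell.

```python
def mask_nonsignificant(coefficients, standard_errors, critical=1.96):
    """Replace coefficients whose |t| <= critical with None (gray cells)."""
    masked = []
    for estimate, se in zip(coefficients, standard_errors):
        t_value = estimate / se
        masked.append(estimate if abs(t_value) > critical else None)
    return masked


# Hypothetical local estimates and standard errors for three grid cells.
print(mask_nonsignificant([0.5, 0.1, -0.6], [0.1, 0.1, 0.2]))
```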
6.5.4. Geographically Weighted Poisson Regression
Poisson regression models are more appropriate for modeling events where the
outcomes are count data. Therefore, in this thesis, a preliminary GWPR model was
fitted in order to find the best model by comparing different methods. ArcGIS Pro 2.8 was used to
calibrate the GWPR model. The best bandwidth size was determined automatically using the
golden section search method, based on the lowest corrected Akaike Information Criterion (AICc).
The estimated bandwidth was selected to be 211642.28 meters, depending on the residuals of the
global model.
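The golden section search mentioned above can be sketched generically. The routine below narrows a bracket around the minimum of a one-dimensional function; in a GWR setting the function would return the AICc of the model fitted with a given bandwidth. Here `score` is a hypothetical stand-in (a simple parabola), and the search assumes a single minimum inside the bracket. This is not the ArcGIS Pro implementation.

```python
import math


def golden_section_minimize(score, low, high, tol=1e-3):
    """Locate the minimizer of a unimodal function on [low, high]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = score(c), score(d)
    while abs(b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = score(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = score(d)
    return (a + b) / 2.0


# Hypothetical AICc-like curve with its minimum at a bandwidth of 3.
best = golden_section_minimize(lambda h: (h - 3.0) ** 2 + 1.0, 0.0, 10.0)
print(round(best, 2))
```

Each iteration reuses one of the two previous evaluations, which matters when every evaluation means refitting a local regression model.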
A Gaussian kernel with an adaptive bandwidth was used as the geographic weighting function to
estimate the local coefficients for each fishnet grid cell. The weight of the Gaussian
kernel decreases continuously from the center of the kernel but never reaches zero. Additionally,
in this thesis, an adaptive kernel was appropriate because the regression points were irregularly
positioned in the study area. According to the IQR results, the coefficients of all thirteen
explanatory variables used for the GWPR models had spatial heterogeneity, because the IQRs of all
the estimated coefficients were at least twice as large as the robust standard error of the
corresponding global model (see Table 6.6).
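The weighting scheme described above can be illustrated with a minimal sketch. The Gaussian weight form exp(-0.5 (d/b)^2) is one common definition and is an assumption here, as is the nearest-neighbor rule used for the adaptive bandwidth; the function names and distances are hypothetical.

```python
import math


def gaussian_weights(distances, bandwidth):
    """Gaussian kernel weights: 1 at distance 0, decaying but never exactly 0."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]


def adaptive_bandwidth(distances, k):
    """Adaptive bandwidth: distance to the k-th nearest neighbor."""
    return sorted(distances)[k - 1]


# Hypothetical distances (km) from one regression point to its neighbors.
distances = [5.0, 1.0, 3.0, 2.0]
bandwidth = adaptive_bandwidth(distances, k=2)
print(bandwidth)
print([round(w, 3) for w in gaussian_weights(distances, bandwidth)])
```

With an adaptive bandwidth, the kernel widens where regression points are sparse and narrows where they are dense, so each local fit draws on a comparable number of observations.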
Table 6.6. Summary statistics for estimated local coefficients from the Poisson model and relative spatial
variation status.
For the GWPR model, we also calculated the robust standard errors based on the
theory of the quasi-Poisson model and compared the original t-values with the recalculated
t-values, which were based on the robust standard errors. Our results suggested that the spatial
distributions of the significant model coefficients of the explanatory variables in GWGR and GWPR
have a certain similarity (see Figures 6.14 and 6.15).
Figure 6.15. Comparison between original SEs and robust SEs of coefficients.
Note: (a) Elevation, (b) Slope, (c) Aspect, (d) FVC, (e) Number of Forest Patches, (f) Edge Proportion, (g)
Population Density, (h) Forest Loss, (i) Pasture Area, (j) Temperature, (k) Maximum Cumulative Water Deficit,
(l) Wind Speed, and (m) Relative humidity of the GWPR model.
6.5.5. Model Evaluation and Comparison
The calibrated local models were a significant improvement on the global
models. The adjusted R-square values of the OLS and Poisson regression models were 0.516 and
0.550, respectively, while the corresponding Percent Deviance Explained values for the GWGR
model and GWPR model were 0.746 and 0.784, respectively. Comparing the AICc values, the
AICc values were 20929 for the OLS model and 17069 for the Poisson model, which were
clearly greater than those of both the GWGR model (15594) and the GWPR model (5183). As
expected, the explanatory power of the local models is better than that of the global models. The
explained variation of the GWGR model was 23 percentage points higher than that of the global OLS
model, and that of the GWPR model was 23.4 percentage points higher than
that of the global Poisson model (see Table 6.7).
Table 6.7. Comparison of the performance of the four models.
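The improvement figures quoted above are differences between goodness-of-fit values expressed in percentage points. A short sketch using the values reported in the text makes the arithmetic explicit; the function name is hypothetical.

```python
def percentage_point_gain(local_fit, global_fit):
    """Difference between two goodness-of-fit values, in percentage points."""
    return round((local_fit - global_fit) * 100, 1)


print(percentage_point_gain(0.746, 0.516))  # GWGR versus OLS
print(percentage_point_gain(0.784, 0.550))  # GWPR versus Poisson
```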
Additionally, the results show that the Moran's I values for all models were significantly positive
(Z-values > 1.96); the model residuals of the OLS regression had a slightly higher Moran's I index,
while the model residuals of the local Gaussian and Poisson models had a relatively lower Moran's I
index. In general, the GWR models greatly improved the model fitting and performance over the
corresponding global models and produced more desirable model residuals. Figure 6.16 and
Figure 6.17 show the Moran scatter plot and the map of spatial autocorrelation of the standardized
residuals calculated by the four models, respectively.
In Figure 6.17, overpredictions (negative residuals) are represented by red, indicating that
there were fewer wildfires in reality than predicted by the model. Underpredictions (positive
residuals) are represented by blue, indicating that there were more wildfires in reality than
predicted by the model.
Figure 6.16. Moran scatter plot.

Note: (a) OLS, (b) GWGR, (c) Poisson, and (d) GWPR.
Figure 6.17. The standardized residuals from (a) OLS, (b) Poisson, (c) GWGR, and (d) GWPR.
In terms of model predictions, the two GWR models were closer to the observed wildfire counts
than the global models. The predicted counts from the global models did not exceed 370 (see
Figures 6.18(a) and 6.18(c)). In contrast, the predictions from the GWR models showed wider
ranges, with some locations predicted to have more than 500 wildfire occurrences (see Figures
6.18(b) and 6.18(d)), which was closer to reality. Additionally, it can be seen visually that there
was also spatial consistency between the wildfire locations predicted by the four models and the
actual locations of wildfire occurrences. The areas with frequent wildfires were mainly concentrated in
the northern part of Rondônia, the southern part of Amazonas, the central part of Roraima, the
northeastern and southwestern parts of Pará, and the southeastern part of Mato Grosso.
Figure 6.18. Spatial distributions of model predictions from (a) OLS, (b) GWGR, (c) Poisson, and (d) GWPR.
Figure 6.19 shows the spatial distribution of the local R² and local percent deviance values
calculated by the GWGR model and GWPR model, respectively. As seen in Figure 6.19, there was
regional variation in the strength of the relationship in the study region in both models. The
values of local R² for the GWGR model ranged from 0.10 to 0.96; the local R² values were higher in
the central (i.e., Amazonas and Pará states), northern (i.e., Roraima state), and southwestern
(i.e., Acre state) parts of the study area, and lower in the northwestern, southern, northeastern,
and most of the eastern parts of the study area.
R² is not used in the GWPR model because the R² equation does not hold for the Poisson model.
Therefore, the Local Percent Deviance field, which takes a value between 0 and 1, was mapped
as a substitute for R². This is a goodness-of-fit metric, and it can play the same role in
the Poisson (and logistic) model as R² does in the Gaussian model. The values of Local Percent
Deviance for the GWPR model ranged from 0.32 to 0.98, which indicates that the GWPR model fit
better than the GWGR model, and the fit was significantly improved in the northwestern,
north-central, and southern parts of the study area. The lower values in some regions were mainly
caused by the overdispersion issue. Specifically, because excessive zeros occurred in those areas,
future work should pay more attention to these regions and further correct their model standard
errors.
Figure 6.19. Spatial Distribution of the R2 and local percent deviance by the (a) GWGR and (b) GWPR.
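The percent deviance explained by a Poisson model can be sketched as one minus the ratio of the model deviance to the deviance of an intercept-only (null) model. The sketch below is the global, unweighted form with hypothetical counts; the local version reported by GWPR software additionally applies the geographic kernel weights at each location, which is omitted here.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with 0 * ln(0) = 0."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total


def percent_deviance_explained(observed, fitted):
    """1 - model deviance / null deviance (null model predicts the mean count)."""
    mean_count = sum(observed) / len(observed)
    null_deviance = poisson_deviance(observed, [mean_count] * len(observed))
    return 1.0 - poisson_deviance(observed, fitted) / null_deviance


# Hypothetical wildfire counts and fitted values for five grid cells.
counts = [0, 2, 5, 9, 30]
fitted = [0.5, 2.5, 4.0, 10.0, 29.0]
print(round(percent_deviance_explained(counts, fitted), 3))
```

A value of 1 corresponds to a perfect fit and a value of 0 to a model that does no better than the mean count, which is why the metric can stand in for R² on the map.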
Chapter 7. Discussion
This chapter discusses the objectives of the research. This thesis analyzed the
spatial distribution of 2019 wildfire occurrences across the Amazônia Biome region. Regression
models were used to examine probable explanatory variables connected to wildfire incidence.
Furthermore, the limitations encountered while conducting this research are also
discussed. Finally, further refinements and opportunities for expanding this thesis in the future
are outlined.
7.1. Spatial Density
In the case of KDE, it is difficult to give a general recommendation for the parameter settings
(i.e., the bandwidth, which is the only parameter the user controls in the estimation), because they
depend on user requirements. However, the bandwidth is an essential factor that affects how “smooth”
the resulting surface is. If the bandwidth is large, the hotspots will be more spread out and
appear smoother, which is visually appealing, but important details may be lost and smaller
hotspots will be merged into one larger hotspot. If the bandwidth is small, the hotspots will show
more detail and their distribution will be relatively concentrated. Usually, the choice of
bandwidth depends mainly on experience and requires multiple trials to achieve the best results.
To ease this process, the bandwidth used in this thesis was calculated using the default search
radius formula (Silverman's rule-of-thumb bandwidth estimation formula) recommended by ESRI
scientists. Meanwhile, the KDE output from this process leaves room for additional
calculations and further discussion.
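A minimal sketch of a Silverman-type default search radius is given below. The constants follow the form commonly attributed to ESRI's Kernel Density documentation, namely 0.9 x min(SD, sqrt(1/ln 2) x Dm) x n^(-0.2), where SD is the standard distance and Dm the median distance from the mean center; this form is stated here as an assumption and should be checked against the tool documentation. The function name and coordinates are hypothetical, and no population weighting is applied.

```python
import math
import statistics


def default_search_radius(points):
    """Silverman-type rule of thumb for a 2D kernel density search radius."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Distance of every point from the mean center.
    distances = [math.hypot(x - mean_x, y - mean_y) for x, y in points]
    standard_distance = math.sqrt(sum(d * d for d in distances) / n)
    median_distance = statistics.median(distances)
    spread = min(standard_distance, math.sqrt(1.0 / math.log(2.0)) * median_distance)
    return 0.9 * spread * n ** -0.2


# Hypothetical projected coordinates (same linear unit as the output radius).
print(round(default_search_radius([(0, 0), (2, 0), (0, 2), (2, 2)]), 4))
```

Taking the minimum of the two spread measures makes the radius less sensitive to a few distant outliers, which would otherwise inflate the standard distance.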
Moreover, it should be understood that the hotspots resulting from the KDE map were not
tested for statistical significance, and different cell sizes and bandwidths might clearly affect
the results. This is a main difference between KDE and Gi* hot spot analysis. The results of KDE
and Gi* often look the same visually, but they are completely different analyses. KDE aims
to detect clusters of high values within the data, while the Gi* statistic not only detects
clusters but also identifies statistically significant hot spots in the dataset.
In addition, the second important issue related to KDE is the classification of the kernel
density output raster. The goal of the classification is to get as close to the original surface as
possible by preserving the phenomenon's characteristic patterns. In visualizing density values,
given the limited ability of the human eye to discriminate shades, the Magic Number 7 (plus or
minus two) (Lee et al., 1995) was used to provide a suitable number of classes; this
number is thought to reflect the capacity of short-term memory. Jenks (1967)
proposed a method for keeping similar data values in the same class that uses a measure of
classification error (the sum of absolute deviations about class means). As a result, the
classification provides the most accurate and objective summary of the original data. The method is
commonly called natural breaks and has been implemented in a variety of GIS software.
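The idea behind natural breaks can be sketched with a small dynamic program that splits sorted values into contiguous classes so that the total classification error is as small as possible. Following the description above, the error used here is the sum of absolute deviations about class means; many software implementations instead minimize squared deviations, so this is an illustration of the principle rather than the algorithm used by any particular GIS package. The function name and data are hypothetical, and the cubic running time makes it suitable only for small inputs.

```python
def natural_breaks(values, n_classes):
    """Split sorted values into n_classes groups minimizing absolute deviations."""
    data = sorted(values)
    n = len(data)

    def class_error(start, stop):
        segment = data[start:stop]
        mean = sum(segment) / len(segment)
        return sum(abs(v - mean) for v in segment)

    infinity = float("inf")
    # best[k][j]: minimal error when data[:j] is split into k classes.
    best = [[infinity] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + class_error(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    split[k][j] = i
    # Walk back through the stored split points to recover the classes.
    classes = []
    stop = n
    for k in range(n_classes, 0, -1):
        start = split[k][stop]
        classes.append(data[start:stop])
        stop = start
    return classes[::-1]


# Hypothetical density values with three visible groups.
print(natural_breaks([1, 2, 3, 10, 11, 12, 50, 51], 3))
```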
7.2. Temporal Pattern
A study of the temporal distribution of wildfires is of great significance for the rational
deployment and allocation of fire control resources. Based on the above research, the wildfires
were not homogeneously distributed throughout the year. Public policies should be
adapted to take these heterogeneous patterns into account.
At present, there are two relatively extreme positions on the formulation of temporal wildfire
policies. On the one hand, to minimize the impact of wildfire on ecology, decision-makers
formulate plans from the perspective of a time-homogeneous policy, and this scheme
requires that the firefighting capacity reach or come close to handling the largest wildfire event
of the whole year. However, this would be extremely inefficient economically, as many
firefighting personnel would be idle for most of the wildfire low season. On the other hand, from
the perspective of reducing government expenditure, decision-makers eliminate all permanent
firefighter positions and cooperate only with military emergency response units and volunteers.
This greatly increases the potential safety hazard because, during the peak of the wildfire season,
firefighting resources would be insufficient for basic damage control. Thus, to design the wildfire
control scheme reasonably and efficiently, the optimum should lie somewhere in the middle. That
is, the temporal peaks should determine the number of firefighters required and the wildfire
prevention plan. In any case, it is critical to have a thorough understanding of the temporal
inhomogeneity patterns to find the optimum balance. In addition, by combining temporal
distribution information and spatial distribution information, decision-makers are able to make
more appropriate programs according to local conditions and time.
7.3. Spatial Pattern
Comparing the two analysis methods for mapping clusters, first, the OHSA tool can
automatically aggregate wildfire location data, but with the Getis-Ord Gi* tool, users have to
perform the aggregation manually before running the tool. Second, the OHSA tool makes many
intelligent choices for some parameters by itself, whereas the Getis-Ord Gi* tool gives users more
control. Moreover, when the wildfire occurrence locations were analyzed, the optimized tool did
not require the features to be aggregated into weighted features beforehand, whereas the
Getis-Ord Gi* tool required the Collect Events tool to be run first to weight the wildfire
occurrences. Hence, if the user is not familiar enough with the data, opting for the OHSA tool
could be very beneficial, due to its efficiency and its comprehensiveness.
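The manual aggregation step required before a Gi* analysis can be sketched as counting coincident points, which is the idea behind weighting events by their frequency. The function below is a hypothetical illustration with made-up coordinates and does not reproduce the behavior of any specific GIS tool (for example, it applies no distance tolerance).

```python
from collections import Counter


def collect_events(points):
    """Collapse coincident (x, y) points into (x, y, count) weighted points."""
    counts = Counter(points)
    return [(x, y, count) for (x, y), count in counts.items()]


# Hypothetical fire detections; two of them share the same location.
events = [(1.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
print(sorted(collect_events(events)))
```

The resulting count column is what a Gi* statistic would then analyze as the input field.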
7.4. Spatial Modeling
7.4.1. Comparison of Prediction Models
Global models are the most effective in describing the overall data relationships in a study
area. OLS regression is effective in estimating homogeneous relationships. However, the
large-scale wildfire prediction model discussed in this thesis faces the problem of changes in
terrain, combustibles, meteorology, and other conditions (Prasad et al., 2006). The parameters of
the traditional generalized linear model (e.g., OLS or Poisson) are constant; when the relationships
behave differently in different parts of the study area, the global regression equation can only use
one mean value to describe the existing relationships, and the global average does not model
extremes well and may mask local interactions with the explanatory factors.
Therefore, it is difficult to obtain high estimation accuracy and good interpretability. It is
therefore critical to analyze the spatial variation of the driving factors that lead to wildfires in
order to understand their causes. To overcome this limitation, GWR was adopted in the present
thesis to incorporate the spatial variation of the explanatory variables in the models. Compared
with the global models, the GWR models were able to explain more spatial relationships, and the
predictors did not have a consistently positive or negative influence on wildfire occurrence across
the study area. The analysis results also indicate that the GWR models had better performance and
fit than the global models, which is in line with previous studies (Koutsias et al., 2010b; Nkeki
and Osirike, 2013).
Not only global redundancy but also local redundancy can affect the regression result.
Even though global redundancy (redundancy among predictors) can be explored by using OLS,
local multicollinearity is sometimes not easy to detect. If there is a high level of spatial
homogeneity in the data (i.e., the map reveals spatial clustering of identical values),
multicollinearity can arise within a single variable, such as the Road Density explanatory variable
in this model. Since there were only a few roads within the study area, most fishnet grids did not
contain roads, and the road density was zero in those grids. In terms of cropland, high significance
values occurred only in Amazonas, Mato Grosso, and Roraima. Since GWR is a local
moving window regression, the multicollinearity issue concerns redundant values within a
local fit and not only between variables. Two approaches are commonly used to address the local
multicollinearity of such variables: removing those variables from
the local model, or combining those variables with other explanatory variables, thus increasing
the variation of the variables. In this thesis, those variables were excluded when GWR was run.
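The degenerate case described above, in which a variable takes identical values throughout a local window, can be checked with a short sketch. The function tests whether a variable has (near) zero variance among the neighbors used in one local fit, in which case it is redundant with the intercept there. The function name, the tolerance, and the data are hypothetical; GWR software typically reports local condition numbers for a fuller diagnosis.

```python
import statistics


def locally_constant(values, neighbor_indices, tolerance=1e-12):
    """True if the variable barely varies within one local regression window."""
    local_values = [values[i] for i in neighbor_indices]
    return statistics.pvariance(local_values) <= tolerance


# Hypothetical road density per grid cell; most cells contain no roads.
road_density = [0.0, 0.0, 0.0, 0.0, 2.5, 3.0]
print(locally_constant(road_density, [0, 1, 2, 3]))  # window with no roads
print(locally_constant(road_density, [3, 4, 5]))     # window with some roads
```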
7.4.2. Accuracy of Prediction Models
To minimize the possibility of clusters of false positives being identified in the GWR model,
it will be necessary to make corrections to the traditional t-test in the future. Because the
p-value threshold is specified arbitrarily, a small p-value only indicates a low probability of a
false positive for a single test; it does not ensure that the result is true. Even if the threshold
is small (such as 0.05), the chance of obtaining at least one false positive grows rapidly with the
total number of tests, and a GWR model performs one test per coefficient at every location.
Therefore, future studies can consider using multiple testing correction methods (e.g., the False
Discovery Rate (FDR)) to reduce the number of false-positive results.
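The effect of the number of tests, and one widely used FDR correction, can be sketched as follows. The first function gives the probability of at least one false positive among m independent tests at level alpha; the second applies the Benjamini-Hochberg step-up procedure. The function names and p-values are hypothetical, and independence of the tests is an assumption that local GWR tests generally do not satisfy.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank whose p-value is below its step-up threshold rank * q / m.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected


print(round(familywise_error_rate(0.05, 100), 3))
# Hypothetical local p-values for five grid cells.
print(benjamini_hochberg([0.6, 0.001, 0.041, 0.008, 0.039]))
```

In the hypothetical example, three p-values fall below 0.05 individually, but only two survive the correction.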
In order to achieve good performance, the OLS and GWGR models both require the relationships
between the explanatory variables and the dependent variable to be linear. The Kolmogorov-Smirnov
normality test was applied to assess the distributions of the variables. In practice,
to normalize the residual distributions, the variables with nonlinear relationships were transformed
using the natural logarithm or the Box-Cox transformation (also known as a power transformation)
(Box and Cox, 1964). Typically, this kind of transformation enables the analysis to fall within its
legitimate application domain and to produce reliable statistical results. For example, in this
thesis, the dependent variable (wildfire occurrence) was transformed by applying the natural
logarithm to its values, while most of the explanatory variables were transformed using the
Box-Cox transformation. However, when the regression residuals are highly non-normally distributed,
it is important to know what problems may be caused by the misspecification, which can be fully
discussed in future work. As Fotheringham et al. (2003) pointed out, GWR may act as a microscope
by magnifying the details of the relationship, while also amplifying any existing issues.
Moreover, although the Box-Cox transformation is one of the most effective
techniques for obtaining the best results from linear regression, the scale change of the transformed
variables makes model interpretation difficult. Due to the existence of the error term,
back-transformation is complicated. Given the mixture of log and Box-Cox transformations that
exists in our models, back-transformation is almost impossible.
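The Box-Cox transformation and its algebraic inverse can be written compactly. The sketch below uses the standard one-parameter form for positive values; the lambda values shown are hypothetical and are not the ones estimated in the thesis. Applying the inverse to a model prediction recovers a value on the original scale but not, in general, the expected value, because of the error term noted above.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Algebraic inverse of the Box-Cox transform."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


transformed = box_cox(4.0, 0.5)      # (sqrt(4) - 1) / 0.5
print(transformed)
print(inverse_box_cox(transformed, 0.5))
```

With lambda equal to zero the transformation reduces to the natural logarithm, which is why the two are often discussed together.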
7.4.3. Influence of Drivers on Wildfire Occurrence
For both global models, the explanatory variables of forest loss, number of forest patches,
edge proportion, cropland area, pastureland area, population density, and wind speed were
statistically significantly associated with the wildfire counts. In particular, forest
loss, pastureland area, and the number of forest patches were strongly positively associated with
an increased number of wildfire occurrences in both global models. The results therefore
corroborate that the wildfires in the Amazon Rainforest are predominantly of anthropogenic
origin, and that deforestation and agricultural activities are the two major factors that determine
the occurrence of wildfires in the Amazon Rainforest (Silva Junior et al., 2018).
There was spatial variation in the coefficients of the different explanatory variables at different
locations in the study area. Consequently, the coefficients can be interpreted using the direction
of influence (the sign of the coefficients) and the magnitude (the estimated coefficient values).
For example, the number of forest patches was positively related to the wildfire counts in eastern
Acre, eastern Amazonas, western Pará, and central Mato Grosso but negatively related to the
wildfire counts in western Maranhão. Similarly, the pastureland area was positively related to
the wildfire counts in most parts of southern Amazonas, southwestern Pará, and central
Rondônia, but negatively related in southwestern Mato Grosso. In addition, forest loss
was positively correlated with wildfire occurrence in central Pará, southern Amazonas,
southeastern Acre, northwestern and central Mato Grosso, and most of Rondônia.
To sum up, the results indicated that wildfire counts for 2019 were abnormally high in Acre,
Amazonas, Pará, Mato Grosso, and Rondônia. Additionally, deforestation rates in these states were
higher than in any other state in 2019.
Furthermore, the pastureland area and cropland area were found to be the two variables with
relatively weak spatial heterogeneity, with a greater mean value for pasture and a lower mean value
for cropland in the fishnet grids. High significance values occurred for pastureland and cropland
only in Amazonas, Pará, and Roraima. This also explains why cropland area was negatively correlated
with wildfires in the global models: the cropland coverage rate in most of the study
area was relatively low or close to zero.
Topography is also a main factor influencing wildfires. According to the physical
characteristics of wildfire occurrence, wildfires were prone to burn on flat areas, ridges, and
steep north-facing slopes, while valleys and steep south-facing slopes were the least likely to
burn (Wood et al., 2011). Flat areas can provide a fire with enough space to expand and sustain
itself. In contrast, fire is less likely to burn on a steep slope because there is limited space
and fuel. However, in practice, wildfire occurrence is often not directly related to topographic
drivers but is indirectly affected through other drivers. Wildfires in the Amazon Rainforest have
been shown to have significant relations with human activities and to exhibit close spatial
relations with deforestation. Thus, a given forest parcel with a flat slope and low elevation is
often expected to be converted to agriculture. In other words, the slope and elevation drivers
strongly affect the suitability of a given location for different land uses.
The results of the global model suggested that the number of wildfire occurrences
increased with lower elevation and flatter terrain. This is because human activities (e.g.,
agricultural activities) are more likely to occur at relatively low elevations and on flatter
terrain, which may lead to higher wildfire risks. The findings are generally supported by the two
GWR models. Similar to the global models, elevation is negatively correlated with wildfire
occurrence in the south of Amazonas, the southwest of Pará, the west of Rondônia, and most areas
of Roraima, because there are a large number of farms or pastures there, and these agricultural
activities tend to be located in flatter, lower-altitude areas. Additionally, slope was negatively
associated with wildfire occurrence in the populated southern regions of Rondônia, whereas slope
was positively correlated with wildfire occurrence in eastern Roraima because more grassland is
distributed in these areas, and when the slope is steep, the grass becomes drier and is likely to
cause more fires. Many studies have also confirmed that there is a positive correlation between
road networks and wildfire occurrence. However, due to the multicollinearity of road density, this
study could not confirm this quantitatively. Nevertheless, it can be seen visually that the denser
the road network, the denser the wildfires. For instance, there was a concentration of wildfire
incidents in the southwestern Pará region along the federal highway.
Additionally, both global models indicated that the aspect index was positively associated with
wildfire occurrence. The larger the aspect index, the closer the slope faces to the north and
northeast, with a larger amount of sunlight and higher temperatures; these characteristics lead
to drier fuel and thus a higher potential for wildfire occurrence. For instance,
the GWGR models showed that the aspect index is positively correlated with wildfire occurrence
in the central and southern parts of Amazonas state, and in the northwest and southeast of Pará
state. In general, a high FVC results in higher fuel loads and greater wildfire probabilities and
occurrences, but this driver was not significant in the global models for this study area. This is
because the FVC value of the entire Amazon rainforest is high, and the vegetation cover is dense.
In addition, according to the statistical results of the FVC data, the standard deviation of FVC is
very small (0.1), which indicates that the data are clustered around the mean and have little
dispersion.
Moreover, this study also found that wildfires tended to occur in areas with low population
density. The main reason is that densely populated areas are settlement areas where human activities
are relatively concentrated and far from the forest, so wildfires are less likely to occur. In
contrast, sparsely populated areas are mostly distributed at the intersection of farmland and forest,
where human activities have a greater impact on forests and are more likely to generate wildfires;
over 95% of fires, including the most severe ones, occurred within the first kilometer from the
forest edges, which is consistent with Sturtevant's research conclusion (Sturtevant and Cleland,
2007). The finding was consistent with the two GWR models, which showed that wildfire occurrence
was negatively correlated with population density in the densely populated southern part of
Rondônia. In contrast, the results were the opposite in agricultural settlement areas, and
wildfires had a notable concentration near the wildland-farmland interface in the protected areas
in the southern part of Pará state, as well as in the agricultural settlements in southeastern Pará.
The meteorological variables (i.e., temperature, relative humidity, and MCWD) had negative
model coefficients in both global models, indicating that the wildfire proportion would
decrease if any of the meteorological variables increased. In terms of statistical significance,
the meteorological conditions were found to be a poor discriminant of wildfire occurrence in the
2019 fishnet grid, which is consistent with the conclusion of Kelley et al. (2021). In the work by
Kelley et al. (2021), during August 2019, the correlation between the area burned by wildfires and
the weather conditions in Acre and southern Amazonas was less than 10%, and the correlation in
northern Mato Grosso was even less than 1%. Although temperature, MCWD, and relative
humidity were not statistically significant for wildfire occurrence in the global OLS
model, they were statistically significant within some regions; for example, temperature negatively
affected the wildfire proportion in the central area of Roraima state, while it positively affected
it in the south of Amazonas state. In brief, the anthropogenic drivers are positively associated
with the wildfire occurrences in the Amazônia Biome and outweigh climate drivers in South
America.
The negative correlation between wind speed and wildfire was unexpected. One explanation may be
that the average wind speed across the Amazon rainforest was very low: when wind speed is below
around 10 km/h, fire typically burns slowly and does not have a clear direction of spread. Besides,
the main causes of wildfire in this region were human activities and deforestation. In contrast,
the GWR models provided more detailed spatial relationships between meteorological factors and
wildfire occurrence. For instance, wind speed was positively correlated with wildfire occurrence in
most of Roraima state.
In conclusion, human activities are the main factors determining the occurrence of wildfire,
followed by topography and relative humidity, and these should be considered fundamental
elements in the design of wildfire prevention programs. In addition, enforcement of mandatory
regulations and the monitoring of fire activity are also important techniques for preventing
wildfires, particularly in areas where fire is a significant part of the productive cycle.
7.5. Overdispersion
In terms of wildfire prediction, because of the influence of environmental and meteorological
factors, wildfire occurrence is concentrated in a few months, and the number of wildfire
occurrences is highly dispersed. So, even though the GWPR model has been regarded as an effective
and reliable model for predicting the count of wildfires, it often faces the challenge of the
strict Poisson requirement that the mean equal the variance. Fortunately, even if the dependent
variable is overdispersed, this does not prevent us from obtaining asymptotically valid predicted
values when the sample is large enough (Wooldridge, 2015). Moreover, the quasi-Poisson model used
in this thesis is an alternative way to adjust for this issue: because it uses a different variance
function to account for overdispersion, the validity of the assumed distribution can be checked.
However, a large dispersion estimate could also indicate other issues, such as an incorrectly
specified model or data outliers. Therefore, consideration should be given to whether this type of
model is appropriate for the given data set. A further limitation of the quasi-Poisson model is
that it cannot yield prediction intervals, because there is no corresponding distribution or
likelihood for the model; hence, likelihood-based tests and fit measures such as the AIC or BIC
cannot be used to compare it with other types of models. Many other approaches have been developed
to overcome overdispersion. For instance, the negative binomial model may be used instead of the
Poisson model. Where excessive zeros are the issue, the zero-inflated Poisson and zero-inflated
negative binomial models are also good choices. In addition, Da Silva and Rodrigues (2013)
incorporated the negative binomial model into a spatial model to develop Geographically Weighted
Negative Binomial Regression (GWNBR) for overdispersed count data. In future work, we can also try
these models on our data to build a more reliable model.
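The equidispersion requirement discussed above can be illustrated with a minimal sketch. It is not
the GWPR or quasi-Poisson fit used in this thesis: it assumes synthetic counts, an intercept-only
Poisson model whose fitted value is simply the sample mean, and hypothetical helper names
(`poisson_sample`, `pearson_dispersion`). The Pearson dispersion statistic should be close to 1 when
the variance equals the mean (Var(Y) = mu) and clearly above 1 when counts are overdispersed, which
is the case the quasi-Poisson variance function Var(Y) = phi * mu is meant to accommodate.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def pearson_dispersion(counts, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    return chi2 / (len(counts) - n_params)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 2000

# Hypothetical data: 'plain' cells share one Poisson rate (equidispersed);
# 'mixed' cells draw their rate from two very different values (overdispersed).
samples = {
    "plain": [poisson_sample(rng, 3.0) for _ in range(n)],
    "mixed": [poisson_sample(rng, rng.choice([0.5, 5.5])) for _ in range(n)],
}

dispersion = {}
for name, counts in samples.items():
    mean = sum(counts) / n
    # Intercept-only Poisson model: every fitted value equals the sample mean.
    dispersion[name] = pearson_dispersion(counts, [mean] * n, n_params=1)
    print(f"{name}: mean = {mean:.2f}, Pearson dispersion = {dispersion[name]:.2f}")
```

In a quasi-Poisson fit, the coefficient estimates are the same as in the Poisson fit and the
standard errors are multiplied by the square root of this dispersion estimate. A value far above 1
is therefore a signal to consider the alternatives named above, such as the negative binomial
model.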
Chapter 8. Conclusions
In this thesis, wildfires were described not only in spatial patterns and characteristics but also
in terms of temporal patterns and characteristics. In addition, regression models were applied to
explore the driving factors of wildfire and the relative effects of each factor on wildfire occurrence
in the Amazônia Biome region. The analysis results indicate that there are strong regional
differences in the spatial and seasonal patterns of wildfires.
Results of wildfire temporal pattern analysis indicate that the spatial variation and temporal
contribution of wildfires across the biome of each state are different. Wildfires were found to be
more frequent in five states, which include the states of Acre, Amazonas, Rondônia, Pará, and
Mato Grosso, and were most prominent in the summer season. Wildfire activity in most states was
concentrated between July and September, which is confirmed to be related to the rainfall
seasonality in most of the Amazonia region (Silveira et al., 2020). However, in Amapá and Maranhão,
wildfires occurred mostly in February-April and October-November. From October to November,
Pará experienced an increase in wildfire activity that could exceed the annual fire occurrence of
some states. Therefore, given that the peaks in fire activity vary from state to state, fire
departments should formulate state-specific plans and schedules for firefighting. Moreover, we
recognize that fire seasons also vary within states; our future research could therefore provide
up-to-date and detailed information on such patterns to aid firefighting operations.
Wildfire occurrence is well known to vary over space and time. However, owing to time constraints,
this thesis studied wildfire occurrences only in 2019 and focused mainly on the spatial
relationship between wildfire drivers and wildfire occurrence. In our future work, we can therefore
examine the relationship between time series and wildfire occurrence, as well as changes in climate
conditions (e.g., drought years vs. wet years) before and after the wildfires, to achieve
multi-time-scale research on wildfire occurrences. In addition, the GWR model has been extended to
the Geographical and Temporal Weighted Regression (GTWR) model, which deals with both spatial and
temporal non-stationarity (Fotheringham et al., 2015).
In the vast and heterogeneous territory of the Amazônia Biome, fire prevention and
suppression can be a daunting task, requiring strategic planning to prioritize resources to the most
critical areas. Our spatial distribution results revealed that wildfire occurrence had a concentration
in some areas, including the central part of Roraima, southern Amazonas, eastern Acre, the
northernmost part of Rondônia, southwestern Pará along the federal highway, the southern part of
Pará near the protected areas, the northeastern Pará, and the southeastern part of Mato Grosso.
Results of spatial relationship analysis indicate that spatial clustering is primarily caused by
anthropogenic factors (i.e., deforestation and agricultural activities), which had a strong
influence on the distribution and frequency of wildfires in the Amazônia Biome. In both global
models, forest loss, pastureland area, edge proportion, and the number of forest patches were
strongly positively associated with the number of wildfire occurrences. Therefore, areas with
intense deforestation and agricultural activities, and the wildland-farmland interface, are those
that will require the greatest attention from fire brigades.
The assessment of model fit and predictions shows that the GWR models performed better than the
global models. Where relationships are non-stationary or variables are missing, the local models
can provide a helpful complement to the global models: model parameters can be estimated more
reliably and accurately, and more realistic spatial distributions of model predictions are
derived. Our results were consistent with previous studies, which demonstrated that the GWR
model fits and predicts better than the global models.
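The contrast between a global and a local model can be illustrated with a minimal sketch. It is not
the GWR or GWPR implementation used in this thesis: it assumes a synthetic 10 x 10 grid, a single
predictor, a Gaussian distance-decay kernel with a fixed bandwidth chosen by hand rather than
optimized, and hypothetical helper names (`weighted_slope`, `local_slope`). In this toy data set the
true slope is +1 in the western half of the grid and -1 in the eastern half, so a single global
slope averages the two regimes away while locally weighted fits recover the sign in each half.

```python
import math


def weighted_slope(xs, ys, ws):
    """Slope b of the weighted least-squares line y = a + b * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return sxy / sxx


def local_slope(coords, xs, ys, target, bandwidth):
    """Slope at `target`, weighting each cell with a Gaussian kernel of its distance."""
    ws = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    return weighted_slope(xs, ys, ws)


# Hypothetical grid: cell (i, j) has predictor x and response y = beta(i) * x,
# where beta is +1 for western columns (i < 5) and -1 for eastern columns.
coords, xs, ys = [], [], []
for i in range(10):
    for j in range(10):
        x = float(j % 5)
        beta = 1.0 if i < 5 else -1.0
        coords.append((float(i), float(j)))
        xs.append(x)
        ys.append(beta * x)

# Global fit: every cell gets the same weight, as in ordinary least squares.
global_slope = weighted_slope(xs, ys, [1.0] * len(xs))
# Local fits: one regression point in each half of the grid.
west_slope = local_slope(coords, xs, ys, target=(1.0, 4.5), bandwidth=1.5)
east_slope = local_slope(coords, xs, ys, target=(8.0, 4.5), bandwidth=1.5)

print(f"global slope: {global_slope:.3f}")
print(f"local slope (west): {west_slope:.3f}")
print(f"local slope (east): {east_slope:.3f}")
```

A full GWR repeats the local fit at every regression point, uses all predictors at once, and selects
the bandwidth by a criterion such as cross-validation or AICc; the sketch only shows why local
coefficients can differ in sign from region to region when the relationship is non-stationary.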
The scale of the study area is a possible limitation: a large study area can mask subtle local
variations that would be visible at smaller scales. Nevertheless, large-scale research is still
meaningful. Broadscale spatial analysis and statistical evaluations of wildfire occurrences offer
the opportunity to identify the factors that have a statistically significant impact on wildfire
risks and damages and can identify subjects needing further study. This methodology can be applied
to other Brazilian or global regions. It serves to isolate knowledge gaps and so provides a basis
for subsequent analyses at a finer scale. In a coarse grid, spatial differences in the data are
less apparent and the statistics are too coarse, which may cause information loss. Fortunately, the
variety of geographic data made it possible to analyze the spatial distribution of wildfires at
different spatial scales, contributing to a better understanding of Amazon Rainforest wildfires.
The different scales used in this analysis could lead to different results, because results depend
on the chosen areal units, and this could introduce reliability, bias, and validity concerns.
Therefore, towards a full understanding of wildfire patterns, a multi-scale approach should be
considered in our future work. In addition, because the study was data-driven, its findings depend
heavily on the data and are limited by data accuracy. Given that data sets are not completely
homogeneous among countries, the regression analysis might be limited by the data provided, which
may negatively affect the analysis and discussion of the results.
Additionally, other explanatory variables (e.g., the El Niño-Southern Oscillation) could be
taken into account in future studies. For instance, the Multivariate ENSO Index (MEI) has been
shown to be a powerful indicator in evaluating the severity of seasonal drought in Amazonian
tropical rainforest regions, according to Aragão et al. (2007). The MEI had a greater impact in
northeastern Amazonas, Roraima, Amapá, most of Pará, and southeast Mato Grosso, where
wildfires also occurred in significant amounts.
Overall, this thesis aims to provide better insight into wildfire mapping, understanding,
prevention, and management, and to contribute to the fields of GIS, wildfire modeling, and spatial
statistics. The results present the spatial-temporal patterns of wildfire occurrence in the
Amazônia Biome in 2019 and statistically demonstrate that spatial and temporal heterogeneity exists
in wildfire occurrence. In addition, the GWR models exhibited a better fit and proved to be more
predictive than the global models when modeling spatial data. Finally, this thesis hopes to improve
the comprehensive study and understanding of fire regime and risk. The results have clear
implications for land and fire management practices, thus providing references for fire managers to
formulate effective regional conservation plans and management strategies.
Chapter 9. References
Aishani, J. (2019). Amazon forest fires: The tragedy of the Global Commons. Observer Research
Foundation. https://www.orfonline.org/expert-speak/amazon-forest-fires-the-tragedy-of-the-
global-commons-55801/.
Aldersley, A., Murray, S.J., Cornell, S.E. (2011). Global and regional analysis of climate and
human drivers of wildfire. The Science of the total environment 409 (18), 3472–3481.
Alencar, A., Nepstad, D., Diaz, M.C.V. (2006). Forest Understory Fire in the Brazilian Amazon
in ENSO and Non-ENSO Years: Area Burned and Committed Carbon Emissions. Earth
Interactions 10 (6), 1–17.
Ali, A.A., Carcaillet, C., Bergeron, Y. (2009). Long‐term fire frequency variability in the eastern
Canadian boreal forest: the influences of climate vs. local factors. Global Change Biology 15
(5), 1230–1241.
Amatulli, G., Peréz-Cabello, F., La Riva, J. de (2007). Mapping lightning/human-caused
wildfires occurrence under ignition point location uncertainty. Ecological Modelling 200 (3-
4), 321–333.
Anderson, L.O., Malhi, Y., Aragão, Luiz E. O. C., Ladle, R., Arai, E., Barbier, N., Phillips, O.
(2010). Remote sensing detection of droughts in Amazonian forest canopies. New Phytologist
187 (3), 733–750.
Anderson, L.O., Ribeiro Neto, G., Cunha, A.P., Fonseca, M.G., Mendes de Moura, Y., Dalagnol,
R. (2018). Vulnerability of Amazonian forests to repeated droughts. Philosophical
Transactions of the Royal Society B: Biological Sciences. 373 (1760).
Andreae, M.O., Atlas, E., Cachier, H., Cofer III, W.R., Harris, G.W., Helas, G., Ward, D.E.
(1996). Trace gas and aerosol emissions from savanna fires. Biomass burning and global
change 1, 278–295.
Anselin, L. (1995). Local Indicators of Spatial Association-LISA. Geographical Analysis 27 (2),
93–115.
Anselin, L. (2005). Spatial regression analysis in R: a workbook. Urbana 51.
Anselin, L., Griffith, D.A. (1998). Do spatial effects really matter in regression analysis? Papers
in Regional Science 65 (1), 11–34.
Aragão, L.E.O.C., Malhi, Y., Roman-Cuesta, R.M., Saatchi, S., Anderson, L.O., Shimabukuro,
Y.E. (2007). Spatial patterns and fire response of recent Amazonian droughts. Geophysical
Research Letters 34 (7).
ArcGIS (2021a). Ordinary Least Squares (OLS)—Help | ArcGIS Desktop. ArcGIS.
https://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-statistics-toolbox/ordinary-least-
squares.htm. Accessed 14 August 2021.
ArcGIS (2021b). Reading netCDF data as a raster layer—ArcMap | Documentation. ArcGIS.
https://desktop.arcgis.com/en/arcmap/latest/manage-data/netcdf/reading-netcdf-data-as-a-
raster-layer.htm. Accessed 3 September 2021.
Armenteras, D., González, T.M., Retana, J. (2013). Forest fragmentation and edge
influence on fire occurrence and intensity under different management types in Amazon
forests. Biological Conservation 159, 73–79.
Associated Press (2019). 200,000 people under evacuation order in California wildfires.
Associated Press. https://fox8.com/2019/10/28/200000-people-under-evacuation-order-in-
california-wildfires/. Accessed 10 November 2019.
Avila-Flores, D., Pompa-Garcia, M., Antonio-Nemiga, X., Rodriguez-Trejo, D.A., Vargas-Perez,
E., Santillan-Perez, J. (2010). Driving factors for forest fire occurrence in Durango State of
Mexico: A geospatial perspective. Chinese Geographical Science 20 (6), 491–497.
Bai, S., Ma, F. (2017). Temporal and spatial distribution of forest fire sources in Shandong
Province. Journal of Shandong Forestry Science and Technology 232 (5), 6–10.
Bartlein, P.J., Hostetler, S.W., Shafer, S.L., Holman, J.O., Solomon, A.M. (2008). Temporal and
spatial structure in a daily wildfire-start data set from the western United States (1986–96).
International Journal of Wildland Fire 17 (1), 8–17.
Bawa, K.S., Markham, A. (1995). Climate Change and Tropical Forests. Trends in ecology &
evolution 10 (9), 348–349.
BBC (2019). Amazon fires increase by 84% in one year - space agency. BBC.
https://www.bbc.com/news/world-latin-america-49415973.
Beck, H.E., Zimmermann, N.E., McVicar, T.R., Vergopolan, N., Berg, A., Wood, E.F. (2018).
Present and future Köppen-Geiger climate classification maps at 1-km resolution. Scientific
Data 5 (1), 180214.
Beer, T. (1991). The interaction of wind and fire. Boundary-Layer Meteorology 54 (3), 287–308.
Bem, P.P. de, Carvalho Júnior, O.A. de, Matricardi, E.A.T., Guimarães, R.F., Gomes, R.A.T.
(2019). Predicting wildfire vulnerability using logistic regression and artificial neural
networks: a case study in Brazil's Federal District. International Journal of Wildland Fire 28
(1), 35.
Bisquert, M., Caselles, E., Sánchez, J.M., Caselles, V. (2012a). Application of artificial neural
networks and logistic regression to the prediction of forest fire danger in Galicia using
MODIS data. International Journal of Wildland Fire 21 (8), 1025.
Bisquert, M., Caselles, E., Sánchez, J.M., Caselles, V. (2012b). Application of artificial neural
networks and logistic regression to the prediction of forest fire danger in Galicia using
MODIS data. International Journal of Wildland Fire 21 (8), 1025–1029.
Boots, B.N., Getis, A. (1985). Point Pattern Analysis. Wholbk.
Box, G.E.P., Cox, D.R. (1964). An Analysis of Transformations. Journal of the Royal Statistical
Society: Series B (Methodological) 26 (2), 211–243.
Brown, P.M., Kaufmann, M.R., Shepperd, W.D. (1999). Long-term, landscape patterns of past
fire events in a montane ponderosa pine forest of central Colorado. Landscape Ecology 14
(6), 513–532.
Brunsdon, C., Fotheringham, A.S., Charlton, M. (2002). Geographically weighted summary
statistics — a framework for localised exploratory data analysis. Computers, Environment
and Urban Systems 26 (6), 501–524.
Brunsdon, C., Fotheringham, A.S., Charlton, M.E. (1996). Geographically Weighted Regression:
A Method for Exploring Spatial Nonstationarity. Geographical Analysis 28 (4), 281–298.
Bucini, G., Lambin, E.F. (2002). Fire impacts on vegetation in Central Africa: a remote-sensing-
based statistical analysis. Applied Geography 22, 27–48.
Bustillo Sánchez, M., Tonini, M., Mapelli, A., Fiorucci, P. (2021). Spatial Assessment of
Wildfires Susceptibility in Santa Cruz (Bolivia) Using Random Forest. Geosciences 11 (5),
224.
Butler, R.A. (2001). The Amazon Rainforest. Mongabay, June 5.
Cáceres, C.F. (2011). Using GIS in hotspots analysis and for forest fire risk zones mapping in the
Yeguare Region, Southeastern Honduras. Papers in Resource Analysis 13, 1–14.
Cameron, A.C., Trivedi, P.K. (2013). Regression Analysis of Count Data. Cambridge University
Press.
Cano-Crespo, A., Oliveira, P.J.C., Boit, A., Cardoso, M., Thonicke, K. (2015). Forest edge
burning in the Brazilian Amazon promoted by escaping fires from managed pastures. Journal
of Geophysical Research: Biogeosciences 120 (10), 2095–2107.
Cardille, J.A., Ventura, S.J., Turner, M.G. (2001). Environmental and social factors influencing
wildfires in the Upper Midwest, United States. Ecological Applications 11 (1), 111–127.
Castro, R., Chuvieco, E. (1998). Modeling forest fire danger from geographic information
systems. Geocarto International 13 (1), 15–23.
Center for Disaster Philanthropy (2019). 2019-2020 Australian Bushfires. Center for Disaster
Philanthropy. https://disasterphilanthropy.org/disaster/2019-australian-wildfires/.
Chalkias, C., Papadopoulos, A.G., Kalogeropoulos, K., Tambalis, K., Psarra, G., Sidossis, L.
(2013). Geographical heterogeneity of the relationship between childhood obesity and socio-
environmental status: Empirical evidence from Athens, Greece. Applied Geography 37, 34–
43.
Chas-Amil, M.L., Prestemon, J.P., McClean, C.J., Touza, J. (2015). Human-ignited wildfire
patterns and responses to policy shifts. Applied Geography 56, 164–176.
Chen, F., Niu, S., Tong, X., Zhao, J., Sun, Y., He, T. (2014). The impact of precipitation regimes
on forest fires in Yunnan Province, southwest China. TheScientificWorldJournal 2014,
326782.
Chen, V.Y.-J., Deng, W.-S., Yang, T.-C., Matthews, S.A. (2012). Geographically Weighted
Quantile Regression (GWQR): An Application to U.S. Mortality Data. Geographical
Analysis 44 (2), 134–150.
Christensen, N.L. (1993). Fire regimes and ecosystem dynamics. In Fire in the environment.
Wiley, 233–244.
Claire, W. (2019). Here's how wildfires get started—and how to stop them.
https://www.nationalgeographic.com/environment/natural-disasters/wildfires/.
Climate Hazards Center (2015). Climate Hazards Center InfraRed Precipitation with Station
(CHIRPSv2.0).
Cochrane, M.A. (2013). Current fire regimes, impacts and likely changes-V: Tropical South America.
Cochrane, M.A., Laurance, W.F. (2002). Fire as a large-scale edge effect in Amazonian forests. Journal
of Tropical Ecology 18 (3), 311–325.
Cochrane, M.A. (2001). Synergistic Interactions between Habitat Fragmentation and Fire in
Evergreen Tropical Forests. Conservation Biology 15 (6), 1515–1521.
Coe, M.T., Marthews, T.R., Costa, M.H., Galbraith, D.R., Greenglass, N.L., Imbuzeiro, H.M.A.,
Levine, N.M., Malhi, Y., Moorcroft, P.R., Muza, M.N., Powell, T.L., Saleska, S.R.,
Solorzano, L.A., Wang, J. (2013). Deforestation and climate feedbacks threaten the
ecological integrity of south-southeastern Amazonia. Philosophical transactions of the Royal
Society of London. Series B, Biological sciences 368 (1619), 20120155.
Collins, K.M., Price, O.F., Penman, T.D. (2015). Spatial patterns of wildfire ignitions in south-
eastern Australia. International Journal of Wildland Fire 24 (8), 1098–1108.
Cunningham, A.A., Martell, D.L. (1973). A Stochastic Model for the Occurrence of Man-caused
Forest Fires. Canadian Journal of Forest Research 3 (2), 282–287.
Da Silva, A.R., Rodrigues, T.C.V. (2013). Geographically Weighted Negative Binomial
Regression—incorporating overdispersion. Statistics and Computing.
Dark, S.J., Bram, D. (2007). The modifiable areal unit problem (MAUP) in physical geography.
Progress in Physical Geography: Earth and Environment 31 (5), 471–479.
David, M.J.S., Bowman, Jennifer, K.B., Paulo, A., William, J.B., Jean, M.C., Mark, A.C., Carla,
M.D., Ruth, S.D., John, C.D., Sandy, P.H., Fay, H.J., Jon E. Keeley, Meg A. Krawchuk,
Christian A. Kull, J. Brad Marston, Max A. Moritz, I. Colin Prentice, Christopher I. Roos,
Andrew C. Scott, Thomas W. Swetnam, Guido R. van der Werf, Stephen J. Pyne (2009). Fire
in the Earth System. Science 324 (5926), 481–484.
Dayananda, P. (1977). Stochastic models for forest fires. Ecological Modelling 3 (4), 309–313.
Del Vilar Hoyo, L., Martín Isabel, M.P., Martínez Vega, F.J. (2011). Logistic regression models
for human-caused wildfire risk estimation: analysing the effect of the spatial accuracy in fire
occurrence data. European Journal of Forest Research 130 (6), 983–996.
Devisscher, T., Anderson, L.O., Aragão, Luiz E. O. C., Galván, L., Malhi, Y. (2016). Increased
Wildfire Risk Driven by Climate and Development Interactions in the Bolivian Chiquitania,
Southern Amazonia. PLOS ONE 11 (9), e0161323.
Dilekli, N., Janitz, A.E., Campbell, J.E., Beurs, K.M. de (2018). Evaluation of geoimputation
strategies in a large case study. International Journal of Health Geographics 17 (1), 30.
Drury, S.A., Veblen, T.T. (2008). Spatial and temporal variability in fire occurrence within the
Las Bayas Forestry Reserve, Durango, Mexico. Plant Ecology 197 (2), 299–316.
ESRI (2019). Directional Distribution (Standard Deviational Ellipse). ESRI.
https://desktop.arcgis.com/en/arcmap/latest/tools/spatial-statistics-toolbox/directional-
distribution.htm. Accessed 12 November 2019.
ESRI (2021a). Interpreting Exploratory Regression results. ESRI.
https://desktop.arcgis.com/en/arcmap/latest/tools/spatial-statistics-toolbox/interpreting-
exploratory-regression-results.htm.
ESRI (2021b). Optimized Outlier Analysis. ESRI.
https://desktop.arcgis.com/en/arcmap/10.7/tools/spatial-statistics-
toolbox/optimizedoutlieranalysis.htm.
ESRI (2021c). Regression analysis basics. ESRI.
https://desktop.arcgis.com/en/arcmap/latest/tools/spatial-statistics-toolbox/regression-
analysis-basics.htm#GUID-6D27B3A1-FFC6-4BF5-893F-F6D60AB2E783.
ESRI (2021d). Geographically Weighted Regression (GWR) (Spatial Statistics)—ArcGIS Pro |
Documentation. ESRI. https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-
statistics/geographically-weighted-regression.htm. Accessed 18 October 2021.
Esri Inc (2021). ArcMap. Esri Inc.
Faivre, N., Jin, Y., Goulden, M.L., Randerson, J.T. (2014). Controls on the spatial pattern of
wildfire ignitions in Southern California. International Journal of Wildland Fire 23 (6), 799.
Faraway, J.J. (2002). Practical regression and ANOVA using R.
Fearnside, P. (2007). Uso da terra na Amazônia e as mudanças climáticas globais. Brazilian
Journal of Ecology 10 (2), 83–100.
Flannigan, M.D., Krawchuk, M.A., Groot, W.J. de, Wotton, B.M., Gowman, L.M. (2009).
Implications of changing climate for global wildland fire. International Journal of Wildland
Fire 18 (5), 483.
Fotheringham, A.S., Brunsdon, C., Charlton, M. (2003). Geographically Weighted Regression.
The Analysis of Spatially Varying Relationships. John Wiley & Sons.
Fotheringham, A.S., Crespo, R., Yao, J. (2015). Geographical and Temporal Weighted
Regression (GTWR). Geographical Analysis 47 (4), 431–452.
Garcia, C.V., Woodard, P.M., Titus, S.J., Adamowicz, W.L., Lee, B.S. (1995). A Logit Model for
Predicting the Daily Occurrence of Human Caused Forest-Fires. International Journal of
Wildland Fire 5 (2), 101.
Gavin, J., Haberman, S., Verrall, R. (1993). Moving weighted average graduation using kernel
estimation. Insurance: Mathematics and Economics 12 (2), 113–126.
Genton, M.G., Butry, D.T., Gumpertz, M.L., Prestemon, J.P. (2006). Spatio-temporal analysis of
wildfire ignitions in the St Johns River Water Management District, Florida. International
Journal of Wildland Fire 15 (1), 87.
Giglio, L., Descloitres, J., Justice, C.O., Kaufman, Y.J. (2003). An Enhanced Contextual Fire
Detection Algorithm for MODIS. Remote sensing of Environment 87 (2-3), 273–282.
Giglio, L., Schroeder, W., Justice, C.O. (2016). The collection 6 MODIS active fire detection
algorithm and fire products. Remote sensing of Environment 178, 31–41.
Gill, A.M., Christian, K.R., Moore, P.H.R. (1987). Bushfire incidence, fire hazard and fuel
reduction burning. Australian Journal of Ecology 12 (3), 299–306.
Glenn, H., Mat, J., Etelle, H., Lucia, von, Reusner (2019). The Companies Behind the Burning of
the Amazon. Mighty Earth. https://stories.mightyearth.org/amazonfires/index.html.
Gonzalez-Olabarria, J.R., Mola-Yudego, B., Pukkala, T., Palahi, M. (2011). Using multiscale
spatial analysis to assess fire ignition density in Catalonia, Spain. Annals of Forest Science 68
(4), 861–871.
Goodchild, M.F. (1986). Spatial autocorrelation. Geo Books.
Gralewicz, N.J., Nelson, T.A., Wulder, M.A. (2012). Spatial and temporal patterns of wildfire
ignitions in Canada from 1980 to 2006. International Journal of Wildland Fire 21 (3), 230–
242.
Guo, F.T., Hu, H.Q., Ma, Z.H., Zhang, Y. (2010). Applicability of different models in simulating
the relationships between forest fire occurrence and weather factors in Daxing'an Mountains.
Chinese Journal of Applied Ecology 21 (1), 159–164.
Gutman, G., Ignatov, A. (1997). Satellite-derived green vegetation fraction for the use in
numerical weather prediction models. Advances in Space Research 19 (3), 477–480.
Haining, R., Law, J., Griffith, D. (2009). Modelling small area counts in the presence of
overdispersion and spatial autocorrelation. Computational Statistics & Data Analysis 53 (8),
2923–2937.
Hannah, F. (2019). How does a controlled burn help stop a fire? Quora.
https://www.quora.com/How-does-a-controlled-burn-help-stop-a-fire.
Harris, P.P., Huntingford, C., Cox, P.M. (2008). Amazon Basin climate under global warming:
the role of the sea surface temperature. Philosophical transactions of the Royal Society of
London. Series B, Biological sciences 363 (1498), 1753–1759.
Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G., Jarvis, A. (2005). Very high resolution
interpolated climate surfaces for global land areas. International Journal of Climatology 25
(15), 1965–1978.
Hilker, T., Lyapustin, A.I., Tucker, C.J., Hall, F.G., Myneni, R.B., Wang, Y., Bi, J., Mendes de
Moura, Y., Sellers, P.J. (2014). Vegetation dynamics and rainfall sensitivity of the Amazon.
Proceedings of the National Academy of Sciences 111 (45), 16041–16046.
Hoaglin, D.C., Mosteller, F., Tukey, J.W. (2011). Exploring Data Tables, Trends, and Shapes.
John Wiley & Sons.
Holcomb, Z.C. (2017). Fundamentals of descriptive statistics, 0th ed. Routledge, London.
Houston, D.B. (1973). Wildfires in northern Yellowstone National Park. Ecology 54 (4), 1111–
1117.
Hutcheson, G.D. (2011). Ordinary least-squares regression. The SAGE dictionary of quantitative
management research, 224–228.
Ismail, N., Jemain, A.A. (2007). Handling overdispersion with negative binomial and
generalized Poisson regression models.
Jenks, G.F. (1967). The Data Model Concept in Statistical Mapping. International Yearbook of
Cartography 7, 186–190.
Jennifer, L. (2019). 4 Reasons Why We Need the Amazon Rainforest. Popular Mechanics.
https://www.popularmechanics.com/science/environment/a28910396/amazon-rainforest-
importance/#:~:text=The%20Amazon%20rainforest%20is%20the,%2C%20Suriname%2C%
20and%20French%20Guiana.
Jessica, M. (2019). A Drier Future Sets the Stage for More Wildfires. NASA.
https://www.nasa.gov/feature/goddard/2019/a-drier-future-sets-the-stage-for-more-wildfires.
Juárez-Orozco, S.M., Siebe, C., Fernández y Fernández, D. (2017). Causes and Effects of Forest
Fires in Tropical Rainforests: A Bibliometric Approach. Tropical Conservation Science 10,
194008291773720.
Justine, C. (2019). Everything you need to know about the fires in the Amazon.
https://www.theverge.com/2019/8/28/20836891/amazon-fires-brazil-bolsonaro-rainforest-
deforestation-analysis-effects.
Keane, R.E., Cary, G.J., Parsons, R. (2003). Using simulation to map fire regimes: an evaluation
of approaches, strategies, and limitations. International Journal of Wildland Fire 12 (4), 309.
Keeley, J. (2000). Role of fire in regeneration from seed. Seeds: The Ecology of Regeneration in
Plant Communities, 311–330.
Kelley, D.I., Burton, C., Huntingford, C., Brown, M.A.J., Whitley, R., Dong, N. (2021).
Technical note: Low meteorological influence found in 2019 Amazonia fires. Biogeosciences
18 (3), 787–804.
Koutsias, N., Kalabokidis, K.D., Allgöwer, B. (2004). Fire occurrence patterns at landscape
level: beyond positional accuracy of ignition points with kernel density estimation methods.
Natural Resource Modeling 17 (4), 359–375.
Koutsias, N., Martínez, J., Chuvieco, E., Allgöwer, B. (2005). Modeling wildland fire occurrence
in southern Europe by a geographically weighted regression approach. Proceedings of the 5th
international workshop on remote sensing and GIS applications to forest fire management:
fire effects assessment., 57–60.
Koutsias, N., Martínez-Fernández, J., Allgöwer, B. (2010a). Do factors causing wildfires vary
in space? Evidence from geographically weighted regression. GIScience & Remote Sensing
47 (2), 221–240.
Koutsias, N., Martínez-Fernández, J., Allgöwer, B. (2010b). Do Factors Causing Wildfires Vary
in Space? Evidence from Geographically Weighted Regression. GIScience & Remote Sensing
47 (2), 221–240.
Krawchuk, M.A., Moritz, M.A., Parisien, M.A., van Dorn, J., Hayhoe, K. (2009). Global
pyrogeography: the current and future distribution of wildfire. PloS one 4 (4).
Laurance, W.F., Cochrane, M.A., Bergen, S., Fearnside, P.M., Delamônica, P., Barber, C.,
D'Angelo, S., Fernandes, T. (2001). Environment. The future of the Brazilian Amazon.
Science 291 (5503), 438–439.
Le Page, Y., Oom, D., Silva, J.M., Jönsson, P., Pereira, J.M. (2010). Seasonality of vegetation
fires as modified by human action: Observing the deviation from eco‐climatic fire regimes.
Global Ecology and Biogeography 19 (4), 575–588.
Lee, B., Park, P.S., Chung, J. (2006). Temporal and spatial characteristics of forest fires in South
Korea between 1970 and 2003. International Journal of Wildland Fire 15 (3), 389.
Lee, H.Y., Ong, H.L., Quek, L.H. (1995). Exploiting Visualization in Knowledge Discovery.
Lee, J.-H., Han, G., Fulp, W.J., Giuliano, A.R. (2012). Analysis of overdispersed count
data: application to the Human Papillomavirus Infection in Men (HIM) Study. Epidemiology
and Infection 140 (6), 1087–1094.
Levine, J. (Ed.) (1991). Global biomass burning: atmospheric, climatic, and biospheric
implications.
Lewis, S.L. (2006). Tropical forests and the changing earth system. Philosophical Transactions
of the Royal Society B: Biological Sciences 361 (1645), 195–210.
Liu, K.Z., Shu, L.F., Liang, Y. (2018). A study review of summer forest fires occurrence and
spreading in cold temperate zone. Journal of Temperate Forestry Research 1 (4), 1–8.
Lovejoy, T.E. (1991). Biomass burning and the disappearing tropical rainforest. In Global
Biomass Burning. MIT Press, 77–82.
Lovejoy, T.E., Nobre, C. (2019). Amazon tipping point: Last chance for action. Science Advances
5 (12), eaba2949.
Lozano, F.J., Suárez-Seoane, S., Kelly, M., Luis, E. (2008). A multi-scale approach for modeling
fire occurrence probability using satellite data and classification trees: A case study in a
mountainous Mediterranean region. Remote sensing of Environment 112 (3), 708–719.
Luke, R.H., McArthur, A.G. (1978). Bush Fires in Australia.
Maeda, E.E., Formaggio, A.R., Shimabukuro, Y.E. (2009). Predicting forest fire in the Brazilian
Amazon using MODIS imagery and artificial neural networks. International Journal of
Applied Earth Observation and Geoinformation 11 (4), 265–272.
Maingi, J.K., Henry, M.C. (2007). Factors influencing wildfire occurrence and distribution in
eastern Kentucky, USA. International Journal of Wildland Fire 16 (1), 23.
Malhi, Y., Roberts, J.T., Betts, R.A., Killeen, T.J., Li, W., Nobre, C.A. (2008). Climate change,
deforestation, and the fate of the Amazon. Science 319 (5860), 169–172.
Mapbiomas (2015). Mapbiomas Brasil. Mapbiomas. https://mapbiomas.org/en/project. Accessed
18 April 2021.
Marlon, J.R., Bartlein, P.J., Carcaillet, C., Gavin, D.G., Harrison, S.P., Higuera, P.E., Joos, F.,
Power, M.J., Prentice, I.C. (2008). Climate and human influences on global biomass burning
over the past two millennia. Nature Geoscience 1 (10), 697–702.
Martell, D.L., Otukol, S., Stocks, B.J. (1987). A logistic model for predicting daily people-caused
forest fire occurrence in Ontario. Canadian Journal of Forest Research 17 (5), 394–401.
Martín, Y., Zúñiga-Antón, M., Rodrigues Mimbrero, M. (2019). Modelling temporal variation of
fire-occurrence towards the dynamic prediction of human wildfire ignition danger in
northeast Spain. Geomatics, Natural Hazards and Risk 10 (1), 385–411.
Martínez, J., Vega-Garcia, C., Chuvieco, E. (2009). Human-caused wildfire risk rating for
prevention planning in Spain. Journal of environmental management 90 (2), 1241–1252.
Martínez-Fernández, J., Chuvieco, E., Koutsias, N. (2013). Modelling long-term fire occurrence
factors in Spain by accounting for local variations with geographically weighted regression.
Natural Hazards and Earth System Sciences 13 (2), 311–327.
Massada, A.B., Radeloff, V.C., Stewart, S.I., Hawbaker, T.J. (2007). Wildfire risk in the
wildland–urban interface: a simulation study in northwestern Wisconsin. Forest ecology and
management 258 (9), 1990–1999.
McArthur, A.G. (1967). Fire behaviour in eucalypt forests.
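MCT (2004). Comunicação Nacional Inicial do Brasil à Convenção-Quadro das Nações Unidas
sobre Mudança do Clima [Brazil's Initial National Communication to the United Nations
Framework Convention on Climate Change].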
Meggers, B.J. (1994). Archeological evidence for the impact of mega-Niño events on Amazonia
during the past two millennia. Climatic Change 28 (4), 321–338.
Menaut, J.C., Abbadie, L., Vitousek, P.M. (1993). Nutrient and organic matter dynamics in
tropical ecosystems. In Fire in the Environment. Wiley, 215–230.
Mhawej, M., Faour, G., Adjizian-Gerard, J. (2015). Wildfire likelihood’s elements: A literature
review. Challenges 6 (2), 282–293.
Minnich, R.A., Bahre, C.J. (1995). Wildland fire and chaparral succession along the California
Baja-California boundary. International Journal of Wildland Fire 5 (1), 13–24.
Moreira de Araújo, F., Ferreira, L.G., Arantes, A.E. (2012). Distribution Patterns of Burned Areas
in the Brazilian Biomes: An Analysis Based on Satellite Data for the 2002–2010 Period.
Remote Sensing 4 (7), 1929–1946.
Morello, T.F., Ramos, R.M., Anderson, L.O., Owen, N., Rosan, T.M., Steil, L. (2020).
Predicting fires for policy making: Improving
accuracy of fire brigade allocation in the Brazilian Amazon. Ecological Economics 169,
106501.
Morgan, P., Hardy, C.C., Swetnam, T.W., Rollins, M.G., Long, D.G. (2001). Mapping fire
regimes across time and space: understanding coarse and fine-scale fire patterns.
International Journal of Wildland Fire 10 (4), 329–342.
Morton, D.C., Defries, R.S., Randerson, J.T., Giglio, L., Schroeder, W., van der Werf, G.R.
(2008). Agricultural intensification increases deforestation fire activity in Amazonia. Global
Change Biology 14 (10), 2262–2275.
Nepstad, D., Schwartzman, S., Bamberger, B., Santilli, M., Ray, D., Schlesinger, P., Lefebvre, P.,
Alencar, A., Prinz, E., Fiske, G., Rolla, A. (2006). Inhibition of Amazon deforestation and fire
by parks and indigenous lands. Conservation biology: the journal of the Society for
Conservation Biology 20 (1), 65–73.
Nkeki, F.N., Osirike, A.B. (2013). GIS-Based Local Spatial Statistical Model of Cholera
Occurrence: Using Geographically Weighted Regression. Journal of Geographic Information
System 05 (06), 531–542.
Nobre, C.A., Obregón, G.O., Marengo, J.A., Fu, R., Poveda, G. (2009). Characteristics of
Amazonian climate: Main features. In: Walker, R. (Ed.) The expansion of intensive
agriculture and ranching in Brazilian Amazonia. American Geophysical Union, Washington,
D.C., pp. 149–162.
Nogueira, J.M., Rambal, S., Barbosa, J.P.R., Mouillot, F. (2017). Spatial pattern of the seasonal
drought/burned area relationship across Brazilian biomes: Sensitivity to drought metrics and
global remote-sensing fire products. Climate 5 (2), 42.
Novaya Gazeta (2019). Площадь лесных пожаров в Сибири превысила 2,6 млн га [The area of
forest fires in Siberia has exceeded 2.6 million ha]. Novaya Gazeta.
https://novayagazeta.ru/news/2019/07/28/153741-ploschad-lesnyh-pozharov-v-sibiri-i-yakutii-prevysila-2-6-mln-ga.
Obua, J. (1997). Environmental Impact of Ecotourism in Kibale National Park, Uganda. Journal
of Sustainable Tourism 5 (3), 213–223.
Oliveira, S., Oehler, F., San-Miguel-Ayanz, J., Camia, A., Pereira, J.M. (2012). Modeling spatial
patterns of fire occurrence in Mediterranean Europe using Multiple Regression and Random
Forest. Forest ecology and management 275, 117–129.
Openshaw, S. (1981). The modifiable areal unit problem. Quantitative geography: a British view,
60–69.
Orozco, C.V., Tonini, M., Conedera, M., Kanevski, M. (2012). Cluster recognition in spatial-
temporal sequences: the case of forest fires. GeoInformatica 16 (4), 653–673.
Parente, J., Pereira, M.G., Amraoui, M., Tedim, F. (2018). Negligent and intentional fires in
Portugal: Spatial distribution characterization. The Science of the total environment 624, 424–
437.
Parisien, M.A., Moritz, M.A. (2009). Environmental controls on the distribution of wildfire at
multiple spatial scales. Ecological Monographs 79 (1), 127–154.
Pew, K.L., Larsen, C.P.S. (2001). GIS analysis of spatial and temporal patterns of human-caused
wildfires in the temperate rain forest of Vancouver Island, Canada. Forest ecology and
management 140 (1), 1–18.
Phillips, O.L., Aragão, L.E., Lewis, S.L., Fisher, J.B., Lloyd, J., López-González, G., Torres-
Lezama, A. (2019). Drought sensitivity of the Amazon rainforest. Science 323 (5919), 1344–
1347.
Podur, J., Martell, D.L., Csillag, F. (2003). Spatial patterns of lightning-caused forest fires in
Ontario, 1976–1998. Ecological Modelling 164 (1), 1–20.
Prasad, A.M., Iverson, L.R., Liaw, A. (2006). Newer Classification and Regression Tree
Techniques: Bagging and Random Forests for Ecological Prediction. Ecosystems 9 (2), 181–
199.
Prestemon, J.P., Hawbaker, T.J., Bowden, M., Carpenter, J. (2013). Wildfire ignitions: a review
of the science and recommendations for empirical modeling. Gen. Tech. Rep. SRS-GTR-171.
Asheville, NC: USDA-Forest Service, Southern Research Station 171, 1–20.
Priyanka, B. (2019). Camp Fire: By the Numbers.
https://www.pbs.org/wgbh/frontline/article/camp-fire-by-the-numbers/.
R Development Core Team (2021). Geographically Weighted Regression [R package spgwr
version 0.6-34]. Comprehensive R Archive Network (CRAN).
Rao, C.R., Rubin, H. (1964). On a characterization of the Poisson distribution. The Indian
Journal of Statistics, 295–298.
Rebecca, L. (2021). From Forest to Field: How Fire is Transforming the Amazon. NASA.
https://earthobservatory.nasa.gov/features/AmazonFire. Accessed 28 March 2021.
Ren, H., Li, L., Liu, Q., Wang, X., Li, Y., Hui, D., Jian, S., Wang, J., Yang, H., Lu, H., Zhou, G.,
Tang, X., Zhang, Q., Wang, D., Yuan, L., Chen, X. (2014). Spatial and temporal patterns of
carbon storage in forest ecosystems on Hainan island, southern China. PLOS ONE 9 (9),
e108163.
Robert, E.W. (1987). Selling a geographical information system to government policymakers.
URISA.
Rodrigues, M., Jiménez-Ruano, A., Peña-Angulo, D., La Riva, J. de (2018). A comprehensive
spatial-temporal analysis of driving factors of human-caused wildfires in Spain using
geographically weighted logistic regression. Journal of environmental management 225,
177–192.
Rodrigues, M., La Riva, J. de (2014). An insight into machine-learning algorithms to model
human-caused wildfire occurrence. Environmental Modelling & Software 57, 192–201.
Rodrigues, M., La Riva, J. de, Fotheringham, S. (2014a). Modeling the spatial variation of the
explanatory factors of human-caused wildfires in Spain using geographically weighted
logistic regression. Applied Geography 48, 52–63.
Rodrigues, M., La Riva, J. de, Fotheringham, S. (2014b). Modeling the spatial variation of the
explanatory factors of human-caused wildfires in Spain using geographically weighted
logistic regression. Applied Geography 48, 52–63.
Rodriguez, T.D.A., Ramirez, M.H., Tchikoue, J.H. (2008). Factors affecting the accident rate of
forest fires. Ciencia Forestal en Mexico 33 (104), 38–57.
Rollins, M.G., Morgan, P., Swetnam, T. (2002). Landscape-scale controls over 20th century fire
occurrence in two large Rocky Mountain (USA) wilderness areas. Landscape Ecology 17 (6),
539–557.
Román-Cuesta, M.R., Martínez-Vilalta, J. (2006). Effectiveness of protected areas in mitigating
fire within their boundaries: case study of Chiapas, Mexico. Conservation biology: the
journal of the Society for Conservation Biology 20 (4), 1074–1086.
Romero-Calcerrada, R., Novillo, C.J., Millington, J.D.A., Gomez-Jimenez, I. (2008). GIS
analysis of spatial patterns of human-caused wildfire ignition risk in the SW of Madrid
(Central Spain). Landscape Ecology 23 (3), 341–354.
Rosen, J. (2019). The Amazon rainforest is on fire. Climate scientists fear a tipping point is near.
Los Angeles Times, August 26.
Sanford, R.L., Saldarriaga, J., Clark, K.E., Uhl, C., Herrera, R. (1985). Amazon rain-forest fires.
Science 227 (4682), 53–55.
San-Miguel-Ayanz, J., Moreno, J.M., Camia, A. (2013). Analysis of large fires in European
Mediterranean landscapes: Lessons learned and perspectives. Forest ecology and
management 294, 11–22.
SAS Institute Inc (2021). SAS Help Center: SAS 9.4 Macro Language: Reference, Fifth Edition.
SAS Institute Inc.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/titlepage.htm. Accessed
18 October 2021.
Schelhaas, M.-J., Hengeveld, G., Moriondo, M., Reinds, G.J., Kundzewicz, Z.W., Maat, H. ter,
Bindi, M. (2010). Assessing risk and adaptation options to fires and windstorms in European
forestry. Mitigation and Adaptation Strategies for Global Change 15 (7), 681–701.
Schulte, L.A., Mladenoff, D.J. (2005). Severe wind and fire regimes in northern forests:
historical variability at the regional scale. Ecology 86 (2), 431–445.
Scott, D., Alicia, R. (2008). Fire Behavior and Effects in Fuel Treatments and Protected Habitat
on the Moonlight Fire. USDA Forest Service.
https://www.fs.fed.us/adaptivemanagement/reports/fbat/MoonlightFinal_8_6_08.pdf.
Accessed 25 November 2019.
Scott, L., Pratt, M. (2009). Answering Why Questions: An introduction to using regression
analysis with spatial data.
Sen, H., Hu, J. (2002). Study on Forest Fire Regime of Heilongjiang Province II. Analysis of
Factors Affecting Fire Dynamics and Distributions. Scientia Silvae Sinicae 38 (2), 98–102.
Shanna, H. (2020). Satellite data show Amazon rainforest likely drier, more fire-prone this year.
Mongabay Environmental News. https://news.mongabay.com/2020/04/satellite-data-show-
amazon-rainforest-likely-drier-more-fire-prone-this-year/. Accessed 6 April 2021.
Silva, Valle, M.E., Barros, L.C., Meyer, J. (2020). A wildfire warning system applied to the state
of Acre in the Brazilian Amazon. Applied Soft Computing 89, 106075.
Silva, J.M., Sá, A.C., Pereira, J.M. (2005). Comparison of burned area estimates derived from
SPOT-VEGETATION and Landsat ETM+ data in Africa: Influence of spatial pattern and
vegetation type. Remote sensing of Environment 96 (2), 188–201.
Silva Junior, C., Aragão, L., Fonseca, M., Almeida, C., Vedovato, L., Anderson, L. (2018).
Deforestation-Induced Fragmentation Increases Forest Fire Occurrence in Central Brazilian
Amazonia. Forests 9 (6), 305.
Silveira, M.V.F., Petri, C.A., Broggio, I.S., Chagas, G.O., Macul, M.S., Leite, C.C.S.S.,
Ferrari, E.M.M., Amim, C.G.V., Freitas, A.L.R., Motta, A.Z.V., Carvalho, L.M.E., Silva
Junior, C.H.L., Anderson, L.O., Aragão, L.E.O.C. (2020). Drivers of Fire Anomalies
in the Brazilian Amazon: Lessons Learned from the 2019 Fire Crisis. Land 9 (12), 516.
Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis.
Smith, J.K., Whelan, R.J. (1996). The Ecology of Fire. Ecology 77 (5), 1647.
Soares-Filho, B., Silvestrini, R., Nepstad, D., Brando, P., Rodrigues, H., Alencar, A., Coe, M.,
Locks, C., Lima, L., Hissa, L., Stickler, C. (2012). Forest fragmentation, climate change and
understory fire regimes on the Amazonian landscapes of the Xingu headwaters. Landscape
Ecology 27 (4), 585–598.
Song, C., Kwan, M.-P., Zhu, J. (2017). Modeling Fire Occurrence at the City Scale: A
Comparison between Geographically Weighted Regression and Global Linear Regression.
International Journal of Environmental Research and Public Health 14 (4), 396.
Sparks, A. (2018). nasapower: A NASA POWER Global Meteorology, Surface Solar Energy and
Climatology Data Client for R. Journal of Open Source Software 3 (30), 1035.
Stage, A.R. (1976). An Expression for the Effect of Aspect, Slope, and Habitat Type on Tree
Growth. Forest Science 22 (4), 457–460.
Steege, H. ter, Pitman, N.C., Sabatier, D., Baraloto, C., Salomão, R.P., Guevara, J.E., Silman,
M.R. (2013). Hyperdominance in the Amazonian tree flora. Science 342 (6156).
Stephanie (2016). What is Moran’s I? Statistics How To.
https://www.statisticshowto.datasciencecentral.com/morans-i/. Accessed 13 November 2019.
Sturtevant, B.R., Cleland, D.T. (2007). Human and biophysical factors influencing modern fire
disturbance in northern Wisconsin. International Journal of Wildland Fire 16 (4), 398.
Thach, N.N., Ngo, D.B.T., Xuan-Canh, P., Hong-Thi, N. (2018). Spatial pattern assessment of
tropical forest fire danger at Thuan Chau area (Vietnam) using GIS-based advanced machine
learning algorithms: A comparative study. Ecological informatics 46, 74–85.
Tobler, W.R. (1970). A Computer Movie Simulating Urban Growth in the Detroit Region.
Economic Geography 46, 234–240.
Voices, E. (2019). Why the Amazon is burning: 4 reasons. EarthSky, August 27.
Wang, K., Zhang, C., Li, W. (2013). Predictive mapping of soil total nitrogen at a regional scale:
A comparison between geographically weighted regression and cokriging. Applied
Geography 42, 73–85.
Welch, C. (2021). First study of all Amazon greenhouse gases suggests the damaged forest is
now worsening climate change. National Geographic, March 11.
Wen, D.X., Zhang, M.J., Long, D.H., Deng, X.W., Wen, D.Y. (2009). The influence of
environment conditions in guangxi on the temporal and spatial distribution of forest fire.
Journal of Central South University of Forestry & Technology 29 (2), 21–26.
Westerling, A.L., Hidalgo, H.G., Cayan, D.R., Swetnam, T.W. (2006). Warming and earlier
spring increase western U.S. forest wildfire activity. Science 313 (5789), 940–943.
Wheeler, D., Tiefelsdorf, M. (2005). Multicollinearity and correlation among local regression
coefficients in geographically weighted regression. Journal of Geographical Systems 7 (2),
161–187.
Williams, D.K. (2018). Differences Between North- and South-Facing Slopes. Sciencing.
https://sciencing.com/differences-between-north-southfacing-slopes-8568075.html.
Wing, M.G., Long, J. (2014). A 25-Year History of Spatial and Temporal Trends in Wildfire
Activity in Oregon and Washington, U.S.A. Modern Applied Science 9 (3).
Wittenberg, L., Malkinson, D. (2009). Spatio-temporal perspectives of forest fires regimes in a
maturing Mediterranean mixed pine landscape. European Journal of Forest Research 128 (3),
297–304.
Wood, S.W., Murphy, B.P., Bowman, D.M.J.S. (2011). Firescape ecology: how topography
determines the contrasting distribution of fire and rain forest in the south-west of the
Tasmanian Wilderness World Heritage Area. Journal of Biogeography 38 (9), 1807–1820.
Wooldridge, J.M. (2015). Introductory Econometrics: A Modern Approach. Cengage Learning.
Wotton, B.M., Nock, C.A., Flannigan, M.D. (2010). Forest fire occurrence and climate change in
Canada. International Journal of Wildland Fire 19 (3), 253.
Xiao, R., Su, S., Wang, J., Zhang, Z., Jiang, D., Wu, J. (2013). Local spatial modeling of paddy
soil landscape patterns in response to urbanization across the urban agglomeration around
Hangzhou Bay, China. Applied Geography 39, 158–171.
Yang, G.B., Tang, X.M., Ning, J.J., Yu, D.Y. (2009). Spatial and Temporal Distribution Pattern of
Forest Fire Occurred in Beijing from 1986 to 2006. 45 (7), 90–95.
Yang, J., He, H.S., Shifley, S.R. (2008). Spatial controls of occurrence and spread of wildfires in
the Missouri Ozark Highlands. Ecological Applications 18 (5), 1212–1225.
Zheng, Q., Jin, S. (2013). Temporal and Spatial Patterns of Forest Fires in Yichun Area during
1980–2010 and the Influence of Meteorological Factors. Scientia Silvae Sinicae 49 (4),
157–163.
Živanović, S., Ivanović, R., Nikolić, M., Đokić, M., Tošić, I. (2020). Influence of air temperature
and precipitation on the risk of forest fires in Serbia. Meteorology and Atmospheric Physics
132 (6), 869–883.
Zribi, M., Le Hégarat-Mascle, S., Taconet, O., Ciarletti, V., Vidal-Madjar, D., Boussema, M.R.
(2003). Derivation of wild vegetation cover density in semi-arid regions: ERS2/SAR
evaluation. International Journal of Remote Sensing 24 (6), 1335–1352.
Appendix A: Layer Creation Workflows
The following flowcharts illustrate the steps followed in creating the data layers for this
study. Rectangles represent processes and manipulations, while ovals represent inputs and
outputs. Figure A-1 shows how true wildfire points were filtered from the MODIS dataset.
Figure A-2 shows how the spatial pattern of wildfire occurrences was identified. We used the
Tabulate Intersection tool in ArcGIS 10.8 to compute the intersection between deforestation
patches and each fishnet grid cell and to cross-tabulate the area of the intersecting features.
We then aggregated the total intersecting area for each fishnet grid cell. Figures A-3, A-4,
and A-5 show the creation of the pasture area, forest loss, and road density layers, which used
a similar method to calculate area (pasture and forest loss) and length (roads), respectively.
Figure A-6 shows the calculation of the MCWD layer.
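The cross-tabulation and aggregation step described above can be illustrated with a minimal Python sketch. It is not the ArcGIS implementation: it assumes, for simplicity, that grid cells and patches are axis-aligned rectangles given as (xmin, ymin, xmax, ymax) and that patches do not overlap one another. The names `rect_overlap_area`, `tabulate_intersection`, `grid_cells`, and `patches` are hypothetical.

```python
from collections import defaultdict


def rect_overlap_area(a, b):
    """Overlap area of two axis-aligned rectangles (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def tabulate_intersection(grid_cells, patches):
    """Sum the intersecting patch area for each grid cell.

    Assumes patches do not overlap each other; otherwise shared
    area would be counted more than once.
    """
    totals = defaultdict(float)
    for cell_id, cell in grid_cells.items():
        for patch in patches:
            totals[cell_id] += rect_overlap_area(cell, patch)
    return dict(totals)


# Two hypothetical 10 x 10 fishnet cells and two deforestation patches.
grid_cells = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10)}
patches = [(5, 5, 15, 8), (18, 2, 20, 4)]
areas = tabulate_intersection(grid_cells, patches)
```

With real polygon data, the rectangle overlap would be replaced by a true polygon intersection; the aggregation by grid cell stays the same.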
Figure A-1. Flowchart illustrating the process of creating the wildland fires layers.
Figure A-2. Flowchart illustrating the process of identifying spatial distribution.
Figure A-3. Flowchart illustrating the process of creating pasture area layer.
Figure A-4. Flowchart illustrating the process of creating the forest loss layer.
Figure A-5. Flowchart illustrating the process of creating the road density layer.
Figure A-6. Flowchart illustrating the process of creating the maximum cumulative water deficit.
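As a rough illustration of the quantity behind Figure A-6, the sketch below computes a maximum cumulative water deficit from twelve monthly precipitation values. It assumes the widely used formulation in which monthly evapotranspiration is fixed at 100 mm and the running deficit is reset to zero whenever it would become positive; the flowchart in this study may differ in detail. The function name `mcwd`, the parameter `et_mm`, and the sample values are hypothetical.

```python
def mcwd(monthly_precip_mm, et_mm=100.0):
    """Maximum cumulative water deficit (most negative running deficit, in mm).

    Assumption: fixed evapotranspiration of et_mm per month; the running
    deficit accumulates while precipitation is below et_mm and is capped
    at zero when the balance turns positive.
    """
    deficit = 0.0
    worst = 0.0
    for precip in monthly_precip_mm:
        deficit = min(0.0, deficit + precip - et_mm)
        worst = min(worst, deficit)
    return worst


# Hypothetical monthly precipitation (mm) for one grid cell.
precip = [250, 180, 90, 60, 40, 120, 200, 260, 300, 280, 270, 260]
cell_mcwd = mcwd(precip)
```

A more negative value indicates a more severe dry-season water deficit for the cell.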
Appendix B: Spatial Distribution
The Collect Events step was critical because Hot Spot Analysis relies on weighted points
rather than individual incidents. The ICOUNT field represents the number of points accumulated
at a particular location. The result is shown in Figure B-1.
Figure B-1. Output of Collect Events of wildfire incidents.
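The idea behind collecting events can be sketched in a few lines of Python: coincident points are merged into a single weighted point whose count plays the role of the ICOUNT field. This is only an illustration of the concept, not the ArcGIS tool; the function name `collect_events` and the sample coordinates are hypothetical, and points are treated as coincident only when their coordinates match exactly.

```python
from collections import Counter


def collect_events(points):
    """Merge coincident (x, y) points into weighted points with an ICOUNT value."""
    counts = Counter(points)
    return [{"x": x, "y": y, "ICOUNT": n} for (x, y), n in counts.items()]


# Hypothetical fire detections; three of them share one location.
fires = [(-60.0, -3.1), (-60.0, -3.1), (-55.2, -9.4), (-60.0, -3.1)]
weighted = collect_events(fires)
```

The weighted points, rather than the raw incidents, are then what a hot spot statistic would take as input.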
Appendix C: Spatial Relationship
Figure C-1. The results of the exploratory regression by using original data.
Figure C-2. The results of the Kolmogorov-Smirnov test.
Figure C-3. The histograms of transformed data.
Note: Using the Box-Cox transformation, the original values of the dependent variable were transformed to an
approximately normal distribution.
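For readers unfamiliar with the Box-Cox transformation mentioned in the note, a minimal Python sketch follows. The value of lambda used in this study is not stated here, so `lam=0.5` below is purely an illustrative assumption, as are the function name `box_cox` and the sample values. The transformation requires strictly positive inputs, so zero counts would need an offset before it is applied.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform: (v**lam - 1) / lam, or ln(v) when lam == 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical right-skewed counts and an assumed lambda of 0.5.
counts = [1, 4, 9, 100]
transformed = box_cox(counts, 0.5)
```

In practice lambda is usually chosen by maximum likelihood so that the transformed values are as close to normal as possible.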
Figure C-4. The results of the Exploratory Regression by using transformed data.
Figure C-5. The histogram of residuals resulting from the OLS regression.
Appendix D: Supplementary Information
Figure D-1 illustrates the spatial distribution of land use in the Amazônia Biome. Table D-2
shows the raster value and RGB color palette of each land use class defined by MapBiomas.
Figure D-1. Spatial distribution of land use.
Table D-2. The classification of land use by MapBiomas.
ID CLASS
1 1. Forest
2 1.1. Natural Forest
3 1.1.1. Forest Formation
4 1.1.2. Savanna Formation
5 1.1.3. Mangrove
9 1.2. Forest Plantation
10 2. Non Forest Natural Formation
11 2.1. Wetland
12 2.2. Grassland
32 2.3. Salt Flat
29 2.4. Rocky Outcrop
13 2.5. Other non Forest Formations
14 3. Farming
15 3.1. Pasture
18 3.2. Agriculture
19 3.2.1. Temporary Crop
39 3.2.1.1. Soy bean
20 3.2.1.2. Sugar Cane
41 3.2.1.3. Other Temporary Crops
36 3.2.2. Perennial Crop
21 3.3. Mosaic of Agriculture and Pasture
22 4. Non vegetated area
23 4.1. Beach and Dune
24 4.2. Urban Infrastructure
30 4.3. Mining
25 4.4. Other Non Vegetated Areas
26 5. Water
33 5.1. River, Lake and Ocean
31 5.2. Aquaculture
27 6. Non Observed
Note: Modified from MapBiomas. There are 16 categories of land use in the study area.