
Evaluating Region Inference Methods by Using Fuzzy Spatial Inference Models


2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) | 978-1-6654-6710-0/22/$31.00 ©2022 IEEE | DOI: 10.1109/FUZZ-IEEE55066.2022.9882658

Anderson Chaves Carniel
Department of Computer Science
Federal University of São Carlos
São Carlos, SP 13565-905, Brazil
Email: accarniel@ufscar.br

Felippe Galdino
Department of Meteorology
Federal University of Rio de Janeiro
Rio de Janeiro, RJ 21941-901, Brazil
Email: ocfgaldino@gmail.com

Markus Schneider
Department of Computer & Information Science & Engineering
University of Florida
Gainesville, FL 32611, USA
Email: mschneid@cise.ufl.edu

Abstract—Increasingly, geoscientists and spatial data scientists have shown interest in modeling and analyzing spatial phenomena characterized by the feature of spatial fuzziness. Applying fuzzy logic and fuzzy inference methods to fuzzy spatial objects leads to fuzzy spatial inference models. These models pursue the goal of discovering new meaningful findings from fuzzy spatial data, hence contributing to data knowledge discovery and sharing this goal with spatial data science. In this paper, we introduce a novel type of inference method called region inference; it combines spatial query processing with fuzzy inference methods. The objective is to capture all points that intersect a search object (e.g., a query window) and whose output values fulfill some specific user requirements (e.g., the points with the maximum or minimum inferred values). For this, we propose, evaluate, and compare query window inference methods in fuzzy spatial inference models. In addition, we show their characterization and applicability in real spatial applications.

Index Terms—Spatial fuzziness, fuzzy spatial inference model, region inference, fuzzy spatial data type, spatial data science

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. Anderson C. Carniel has been supported by the Google Research Scholar program.

I. INTRODUCTION

Crisp spatial objects such as land properties and school districts are characterized by an exact location and a precisely defined extent, shape, and boundary in space. Spatial data types enable the representation of crisp point, line, and region objects and include a large variety of geometric operations (e.g., geometric intersection, overlap, area). They form the core components of geographical information systems (GIS), spatial database systems (SDBS), and spatial data science (SDS) projects. But increasingly, geoscientists and spatial data scientists have shown interest in modeling and analyzing spatial phenomena characterized by the feature of spatial fuzziness. It captures the inherent property of many spatial objects in reality that have inexact locations, vague boundaries, and/or blurred interiors, and hence cannot be adequately represented by crisp spatial objects. Examples are air polluted areas, soil strata, and habitats of species.

In the geoscience and GIS domains, fuzzy set theory has become a popular tool for modeling such fuzzy spatial objects. Fuzzy spatial data types for fuzzy points, fuzzy lines, and fuzzy regions have been formally defined for representing them. The central idea is to relax the strict decision of belonging (value 1) or non-belonging (value 0) of a point to a spatial object. Instead, partial belonging is allowed and expressed by a membership degree in the interval [0, 1]. Further, multiple belonging of a point to several spatial objects is allowed with equal or different membership degrees.

Applying fuzzy logic [1] and fuzzy inference methods [2] to fuzzy spatial objects leads to fuzzy spatial inference models (FSI models). These models pursue the goal of discovering new meaningful findings from fuzzy spatial data, hence contributing to data knowledge discovery and sharing this goal with spatial data science [3]. For each point location, an FSI model infers an output value that represents the reasoning conclusion of the application by employing a knowledge base composed of fuzzy spatial objects and fuzzy rules.

This paper has four main goals. The first goal is to introduce a novel type of inference called region inference (RI). Region inference combines spatial query processing with fuzzy inference methods. The objective is to capture all points that intersect a given search object and whose output values fulfill some specific user requirements (e.g., the points with the maximum or minimum inferred values). We also identify different types of questions that RIs can answer in applications. A search object denotes the area of interest of a user and can have any particular shape. In this paper, we mainly deal with query windows. A query window is an axis-aligned rectangle that represents an infinite set of points in the Euclidean plane. The second goal is to provide different methods for performing region inferences with query windows in fuzzy spatial inference models. We propose two main methods for this purpose, called the discretization method and the optimization method. The third goal refers to the comparative analysis of these two methods to characterize their behavior. For this, we build a scenario based on two real datasets to evaluate different types of RIs. The fourth goal relates to an understanding of the applicability of the proposed methods. By using statistical measures of the results of our comparative analysis, we compare the methods and study their properties, similarities, differences, and relationships. This helps us to determine and deploy the best method to adequately answer an RI.

Section II discusses related work. Section III introduces a running example to later illustrate our concepts of fuzzy spatial data types, fuzzy inference on (fuzzy) spatial data, and RI

Authorized licensed use limited to: Universitas Brawijaya. Downloaded on January 01,2024 at 01:19:06 UTC from IEEE Xplore. Restrictions apply.
methods. Section IV sketches some basics of fuzzy inference systems. Section V introduces some needed concepts of the Spatial Plateau Algebra as an implementation of fuzzy spatial data types. Section VI describes FSI models, presents different kinds of spatial inferences, and proposes algorithms for executing the RI methods. Section VII provides a comparative analysis of the introduced algorithms. Finally, Section VIII draws some conclusions and describes future work.

II. RELATED WORK

Several approaches propose inference models for fuzzy spatial data. In this paper, we discuss the studies in [4]–[9], which allow us to identify the main features of available approaches. We characterize them with respect to (i) the representation of spatial phenomena, (ii) the information employed by the inference model, and (iii) the generality of the model.

The first characterization refers to the underlying spatial data model employed by the approach. For instance, the work in [6] represents spatial phenomena by using instances of crisp spatial data types where each object is associated with a set of alphanumerical attributes. Hence, the vagueness present in such attributes is not explicitly stored and handled in the internal structure of spatial objects. Other approaches employ specific spatial data models to deal with spatial fuzziness. In [9], the authors represent a fuzzy spatial object by using two crisp spatial objects named kernel and conjecture. While the kernel part contains the locations that certainly belong to the spatial phenomenon, the conjecture part consists of the locations that possibly belong to the spatial phenomenon. This two-part representation limits the expressiveness of spatial fuzziness. On the other hand, the remaining approaches in [4], [5], [7] and [8] employ fuzzy set theory to assign a value between 0 and 1 to each point in space that indicates the degree to which a point belongs to the possible characterizations of a spatial phenomenon.

With respect to the second characterization, the available approaches may consider different input values in their inference models. Some approaches such as [4] and [6] make use of alphanumerical attributes that are associated with spatial objects or extract numerical properties from spatial objects (e.g., area of regions). This strategy is limited since it does not consider the vague boundaries and blurred interior of spatial phenomena. In contrast, the approaches in [5], [7] and [8] employ points annotated with their corresponding membership degrees as input of fuzzy inference systems. Hence, they exploit the intrinsic spatial fuzziness of spatial objects in the decision-making process. The work in [9] differs from these approaches since it uses sensor data as input to determine the kernel and conjecture parts of a fuzzy spatial object.

As for the third characterization, the generality of FSI models permits their application in different scenarios and situations. The majority of the approaches are specifically designed for dealing with a particular context. An example is the FSI model in [4], which is designed to analyze land cover changes. Another example is the hierarchical system presented in [8] that is applied to the assessment of risky issues such as wildfires. The approaches in [5]–[7] propose tools and models that can be applied to different contexts. Among them, FIFUS [7] distinguishes itself due to the formal specification of a model that combines fuzzy inference systems and fuzzy spatial data. Hence, it is the closest related work to this paper. However, it has two main limitations. First, it presents an algorithm that infers output values on point locations only. Thus, it cannot evaluate RI queries, which are more useful and require sophisticated algorithms to be processed. Second, it does not conduct experiments to analyze the behavior and properties of FSI models. This limits the understanding of the practical applicability of FSI models.

Unlike the related work discussed, this paper introduces the concept of region inferences and its variants that are generic and can be applied to different contexts. We describe two novel methods to evaluate RI queries on query windows by employing fuzzy spatial objects as core components of the underlying inference engine (e.g., Mamdani’s method). To exhibit the applicability and usefulness of our methods in real applications, we compare and analyze them by using a running application built with real spatial datasets.

III. RUNNING EXAMPLE

In this paper, we extend the application in [10] based on two real datasets that contain point objects labeled with several alphanumerical attributes. The first one consists of the locations of Airbnb accommodations in New York City¹ extracted from April 7th to April 12th, 2021. Here, we are interested in using the attributes related to the price and overall ratings of accommodations. The definition of overall ratings is based on values between 0 and 100, where a higher value implies a better review. We have removed the observations with missing data in these attributes from the dataset. The second dataset refers to inspection results of restaurants in New York City provided by the Department of Health and Mental Hygiene (DOHMH)². Since a restaurant can be reinspected multiple times, we are interested in considering only the most recent graded inspection results valid on April 12th, 2021. Hence, we have employed the R script supplied by the DOHMH² and excluded the rows with a negative score or with missing latitude and longitude coordinates. As a result, this dataset has only scores with values greater than or equal to 0, where a smaller score implies a better grade.

The goal of our application is to recommend the locations in New York City that provide a great visiting experience. The visiting experience is given by the attractiveness of staying in a particular location based on prices and overall ratings of accommodations and sanitary conditions of restaurants situated near the location. The attractiveness is given as a value between 0 and 100, where a larger value indicates a better visiting experience. In the rest of this paper, we illustrate how the goal of this application can be achieved by processing RIs on an FSI model designed for the application.

¹ http://insideairbnb.com/get-the-data.html
² https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j
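The dataset preparation described for the running example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`id`, `date`, `score`, `lat`, `lon`, `price`, `rating`) and both function names are hypothetical, and the sketch does not reproduce the R script supplied by the DOHMH.

```python
from datetime import date

def latest_valid_inspections(rows, cutoff):
    """Keep the most recent inspection per restaurant up to `cutoff`,
    then drop rows with a negative score or missing coordinates."""
    latest = {}
    for row in rows:
        if row["date"] > cutoff:
            continue  # not yet valid on the reference date
        kept = latest.get(row["id"])
        if kept is None or row["date"] > kept["date"]:
            latest[row["id"]] = row
    return [
        row for row in latest.values()
        # Assumption: a missing score is treated like an invalid one.
        if row["score"] is not None and row["score"] >= 0
        and row["lat"] is not None and row["lon"] is not None
    ]

def drop_incomplete_listings(listings):
    """Remove accommodations with a missing price or overall rating."""
    return [item for item in listings
            if item["price"] is not None and item["rating"] is not None]

# Hypothetical usage with the reference date of the running example.
sample = [
    {"id": "r1", "date": date(2021, 3, 1), "score": 10, "lat": 40.7, "lon": -74.0},
    {"id": "r1", "date": date(2021, 4, 1), "score": 5, "lat": 40.7, "lon": -74.0},
]
print(latest_valid_inspections(sample, date(2021, 4, 12)))
```

Selecting the latest inspection before filtering means that a restaurant whose most recent result is invalid is dropped entirely instead of falling back to an older inspection; whether the original script behaves this way is not stated in the text.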

IV. FUZZY SETS AND FUZZY INFERENCE SYSTEMS

In applications, fuzzy sets are labeled by linguistic values (LVals), which characterize different instances or occurrences of a linguistic variable (LVar) [11]. We apply this concept in the sense that fuzzy spatial objects can also be associated with LVals as described below.

Our running example has four attributes of interest denominated by the LVars accommodation price, accommodation review, food safety, and visiting experience. The LVals of the first three LVars are represented by fuzzy spatial objects built from their corresponding numerical attributes (see Section V). These objects are used as a basis to perform the recommendations. The LVar accommodation price represents the daily price of the accommodation and is classified as cut-rate, affordable, and expensive. The LVals of the accommodation review are reasonable, good, and excellent; they characterize the overall rating of the accommodation. The LVar food safety has the LVals low, medium, and high that classify levels of sanitary conditions according to the scores of the sanitary inspection results. The last LVar specifies how attractive it is to visit a specific location. We employ the LVals awful, average, and great to distinguish the levels of visiting experience. Since an LVal labels a fuzzy set, we denote the membership function of an LVal lv as $\mu_{lv}$.

Expert knowledge in applications can be expressed by fuzzy propositions composed of LVars and LVals [11]. A fuzzy proposition has the format L is v where L is an LVar and v is an LVal of the scope of L. For instance, visiting experience is great is a fuzzy proposition for our running example. A fuzzy rule has the format IF A THEN B where A and B are fuzzy propositions so that A implies B. Further, A is the antecedent and B the consequent of the rule. The antecedent and consequent can be formed by multiple fuzzy propositions combined by logical connectives such as AND and OR. Our running example has five fuzzy rules to represent the cost-benefit of staying in a location so that it expresses the visiting experience. An example of a fuzzy rule is IF accommodation price is cut-rate AND accommodation review is excellent AND food safety is high THEN visiting experience is great.

A fuzzy inference system makes use of fuzzy logic to determine output values based on a knowledge base that consists of fuzzy sets and a set of fuzzy rules. There are different methods for determining the output value of such a system. An example is Mamdani’s method [2], which is employed in this paper.

V. NEEDED CONCEPTS OF THE SPATIAL PLATEAU ALGEBRA

The Spatial Plateau Algebra (SPA) is an executable type system [12]. Its spatial plateau data types, spatial plateau operations, and spatial plateau predicates implement formally defined fuzzy spatial data types, fuzzy spatial operations, and fuzzy topological predicates. They are specified and implemented in terms of the spatial data types, operations, and predicates of crisp spatial algebras for which implementations (e.g., sf³) are available.

A spatial plateau object can be a (spatial) plateau point, plateau line, or plateau region. In this paper, only plateau regions will be of interest. In general, a spatial plateau object is specified by a list of pairs where each pair (called component) consists of a crisp spatial object and a membership degree in ]0, 1]. The crisp spatial objects of all components of a spatial plateau object must be of the same crisp spatial data type, have different membership degrees, and be disjoint or adjacent to each other. The single membership degree of each region component of a plateau region is implicitly assigned to all interior points of that component. However, each boundary point of a region component gets the maximum membership degree of all region components to which it belongs. A formal definition of the spatial plateau data types is given in [12].

While the SPA provides a large number of operations and predicates, in this paper only one of them, namely the function membership, is needed. Assume that point and region are the spatial data types for crisp point and region objects and that fregion is the spatial plateau data type for plateau region objects. Let (i) $p \in \mathit{point}$, (ii) $r_1, \ldots, r_n \in \mathit{region}$ such that $\mathit{disjoint}(r_i, r_j) \lor \mathit{meet}(r_i, r_j)$ holds for $i \neq j$, (iii) $m_1, \ldots, m_n \in\ ]0, 1]$ with $m_i < m_j$ for $1 \le i < j \le n$, and (iv) $fr \in \mathit{fregion}$. The Boolean functions disjoint and meet are well-known topological predicates on crisp region objects. Intuitively, two regions are disjoint if they do not have any point in common. They meet if only their boundaries share at least one point. We can now describe $fr$ with $n$ components as $fr = \langle (r_1, m_1), \ldots, (r_n, m_n) \rangle$. Given $p$ and $fr$, the function membership yields the degree of belonging of $p$ to $fr$, that is,

\[
\mathit{membership}(fr, p) =
\begin{cases}
m_i & \text{if } \exists\, 1 \le i \le n : p \ \mathit{inside}\ r_i \\
\max(m_{i_1}, \ldots, m_{i_k}) & \text{if } 1 \le k \le n,\ 1 \le i_1 < \ldots < i_k \le n,\ \forall\, i \in \{i_1, \ldots, i_k\} : p \ \mathit{on}\ \mathit{boundary}(r_i) \\
0 & \text{otherwise}
\end{cases}
\]

The topological predicate inside checks if a point is located in the interior (but not on the boundary) of a region object. The operation boundary yields the boundary of a region object as a line object. Finally, the operation on checks if a point is located on a line object.

Figure 1 shows the plateau region objects for the LVar accommodation review of our running example. In this figure, each point has a membership degree that indicates to which extent the point belongs to an LVal. For instance, black areas in Figure 1b contain accommodations that have definitely received a good review since such areas are represented by a fuzzy region component labeled with membership degree 1 in the LVal good.

³ https://cran.r-project.org/package=sf

VI. FUZZY SPATIAL INFERENCE MODELS

FSI models make use of fuzzy spatial objects labeled with LVals to make reasoning conclusions based on a fuzzy

inference method like Mamdani’s method. Similarly to FIFUS, we use LVals of the fuzzy spatial objects in the antecedent of a fuzzy rule. The consequent of a fuzzy rule makes use of the LVals of the output variable of the application. The main advantage of this strategy is that users can express the knowledge about an application by using the same format of fuzzy rules as it is employed by traditional fuzzy inference systems (Section IV). Differently from these systems, FSI models deal with fuzzy spatial objects in the antecedent of rules and get spatial objects as inputs. This leads to the identification of new types of inference queries (Section VI-A). In this paper, we introduce a novel type of inference named region inference (RI), which has two variants. We also propose algorithms for implementing these variants (Sections VI-B, VI-C, and VI-D) and conduct a comparative analysis (Section VII).

Fig. 1. The plateau region objects for each LVal of the LVar accommodation review including New York city boundaries: (a) reasonable, (b) good, (c) excellent. The side bar depicts how membership degrees (0.00 to 1.00) are associated with grayscale colors.

A. Types of Spatial Inference Queries

We identify the following types of spatial inference queries that an FSI model can answer:

Point inference query: what is the inferred value for a given single point location?

Linguistic value-based RI query: what are the points that intersect a given search object and have inferred values that belong to a target linguistic value?

Optimal RI query: what are the points that intersect a given search object and have the maximum (or minimum) inferred values?

Fig. 2. Examples of query windows for our running example: (a) a query window, (b) a set of six query windows.

Algorithm 1: Evaluation of an FSI model by assuming a simple point object as input.
  Input: An FSI model (fsi) and a simple point object (p).
  Output: A point labeled with an inferred value.
   1  Function eval(fsi, p)
   2    Let implication_results be an empty list of membership functions
   3    foreach rule in get_rules(fsi) do
   4      Let degrees be an empty list of numeric values
   5      foreach ant in get_ants(rule) do
   6        add(degrees, membership(ant, p))
   7      degree_fulfillment ← connective_method(degrees)
   8      if degree_fulfillment > 0 then
   9        add(implication_results, implication_method(get_conseq(rule), degree_fulfillment))
  10    result_fset ← aggregation_method(implication_results)
  11    return defuzz(result_fset)
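As a rough illustration of the point inference in Algorithm 1, the following Python sketch combines the membership function of Section V with a Mamdani-style evaluation. It is not the authors' implementation. Several simplifications are assumptions made only for this example: region components are axis-aligned rectangles, antecedent parts are combined with AND (minimum), aggregation is the pointwise maximum, defuzzification is a centroid over the integer domain 0 to 100, and `None` is returned when no rule fires. The names `Rule`, `triangle`, and `eval_point` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)
Plateau = List[Tuple[Rect, float]]         # components (region, degree)

def membership(plateau: Plateau, p: Point) -> float:
    """Degree to which p belongs to a plateau region."""
    x, y = p
    best = 0.0
    for (xmin, ymin, xmax, ymax), degree in plateau:
        # Closed-rectangle test: an interior point lies in exactly one
        # component, while a boundary point gets the maximum degree of
        # all components it touches.
        if xmin <= x <= xmax and ymin <= y <= ymax:
            best = max(best, degree)
    return best

def triangle(a: float, b: float, c: float) -> Callable[[float], float]:
    """Triangular membership function with peak b (assumed LVal shape)."""
    def mu(v: float) -> float:
        if v == b:
            return 1.0
        if a < v < b:
            return (v - a) / (b - a)
        if b < v < c:
            return (c - v) / (c - b)
        return 0.0
    return mu

@dataclass
class Rule:
    antecedents: List[Plateau]             # one plateau region per proposition
    consequent: Callable[[float], float]   # membership function of the output LVal

def eval_point(rules: List[Rule], p: Point, domain=range(0, 101)) -> Optional[float]:
    implication_results = []
    for rule in rules:
        degrees = [membership(ant, p) for ant in rule.antecedents]
        fulfillment = min(degrees)         # AND connective
        if fulfillment > 0:
            implication_results.append((fulfillment, rule.consequent))
    if not implication_results:
        return None                        # assumption: no rule fired
    # Implication clips each consequent at its degree of fulfillment;
    # aggregation takes the union (pointwise maximum) of the clipped sets.
    aggregated = [max(min(f, mu(v)) for f, mu in implication_results) for v in domain]
    total = sum(aggregated)
    if total == 0:
        return None
    return sum(v * a for v, a in zip(domain, aggregated)) / total  # centroid
```

A model for the running example would supply one `Rule` per fuzzy rule, with the plateau regions of the LVals cut-rate, excellent, high, and so on as antecedents; real region components would require a geometry library in place of the rectangle test.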
To answer a point inference query, we consider a simple point object as input of an FSI model. This leads to the kind of inference discussed in [7]. The other two types of inferences are variants of the RI. They receive a search object as input and evaluate an RI method. In this paper, a search object is a query window, that is, an axis-aligned rectangle that represents an infinite set of points in the Euclidean plane. The proposal of methods for RIs handling query windows is the main focus of this paper. It improves the usability of an FSI model since users can retrieve those points of interest that satisfy a given set of constraints as stated in the queries.

Figure 2 shows examples of query windows for our running example. For instance, the user wants to know the locations where the visiting experience is great in the query window depicted in Figure 2a. This request corresponds to a linguistic value-based RI query. Another example is to retrieve those point locations inside a query window with the maximum inferred value, that is, the best locations to visit. This means that the locations have the largest membership degree in the LVal great. This corresponds to the evaluation of an optimal RI query. In this paper, we propose two RI methods to answer these variants of RIs. We analyze their behavior through a comparative analysis that employs the six query windows shown in Figure 2b, as discussed in Section VII.

B. Evaluation of Point Inference Queries

Algorithm 1 presents the function eval, which evaluates a point inference query (line 1). Its inputs are an FSI model (fsi) and a crisp simple point object (p). Algorithm 1 is based on

Mamdani’s method (Section IV) and is employed by the other algorithms that evaluate the RI methods.

First, Algorithm 1 defines an auxiliary list to store the implication results of the fuzzy rules (line 2). Then, it employs the function get_rules to capture the fuzzy rules of the FSI model. For each fuzzy rule (lines 3 to 9), it determines the degree to which the point belongs to the fuzzy spatial object represented by the LVal of each part of the antecedent of the rule (lines 4 to 6). The parts of the antecedent of a rule are captured by the function get_ants. The resulting degrees are added to a list of numeric values (line 6). This list is used as input of the function connective_method to calculate the degree of fulfillment according to the logical connective employed in the antecedent of the rule (line 7). Here, we assume that only one logical connective is allowed to combine the antecedent parts. Next, the algorithm uses the function implication_method to process the implication of the rule if the degree of fulfillment of the rule is greater than 0 (lines 8 and 9). The implication method applies the minimum operator to the degree of fulfillment and the membership degrees of each domain value of the LVar of the consequent of the rule. Consequently, it reshapes the membership function represented by the LVal of the consequent (captured by the function get_conseq). This result is a fuzzy set and is added to an auxiliary list. Then, Algorithm 1 applies the union between all fuzzy sets stored in the auxiliary list (line 10). Finally, it returns the defuzzification of the resulting union by deploying the function defuzz (line 11).

Since Algorithm 1 executes the inference process for one simple point object only, it does not handle query windows with their infinitely many points. To answer the other types of spatial inference queries, we propose the RI methods described in Sections VI-C and VI-D. Their basic idea is to extract a finite number of points from the query window, evaluate them by Algorithm 1, and then decide whether the resulting points contribute to the final answer of the question or not.

C. The Discretization Method

Algorithm 2 implements the discretization method. Its inputs are an FSI model (fsi), a query window (qw), an integer number (k), and a target LVal (target) belonging to the LVar used in the consequent part of the fuzzy rules set. The goal is to draw the reasoning conclusion for k points distributed on a regular grid covering qw and determine the m ≤ k points with inferred values belonging to the target. Hence, the algorithm can answer linguistic value-based RI queries, as discussed in Section VII. To this end, Algorithm 2 firstly applies the function make_grid, which is available in many spatial libraries like sf, to tessellate the query window into a grid of rectangles of equal size with c rows and c columns, where c is the square root of k, and employs the function centers to collect the centers of all squares in a list (line 1). Then, Algorithm 2 creates an empty list of pairs where each pair stores a point p and an inferred value v (line 2). For each center point of a grid square (lines 3 to 6), we invoke Algorithm 1 to compute its inferred value (line 4). Next, we check whether the resulting inferred value has a degree greater than 0 in the membership function represented by the target LVal (line 5). If this is the case, we add this point and its inferred value to the list of pairs (line 6). Finally, Algorithm 2 returns the complete list of these pairs (line 7).

Algorithm 2: Evaluation of the discretization method.
  Input: An FSI model (fsi), a query window (qw), an integer number (k), and an LVal (target).
  Output: A set of points, each point labeled with an inferred value.
  1  center_points ← centers(make_grid(qw, sqrt(k), sqrt(k)))
  2  Let qw_inferred_points be an empty list of pairs in the format (p, v)
  3  foreach p in center_points do
  4    inferred_value ← eval(fsi, p)
  5    if µ_target(inferred_value) > 0 then
  6      add(qw_inferred_points, (p, inferred_value))
  7  return qw_inferred_points

Figure 3 depicts the evaluation of the discretization method for one query window of our running example (Figure 2a). We are interested in capturing the points whose inferred values have some degree in the LVal great. It assumes k = 100 to build a regular grid with 10 rows and 10 columns, as shown in Figure 3a. For each point, the method invokes Algorithm 1 to check whether the point belongs to the final answer of the query. The qualifying points are depicted in Figure 3b where each point is labeled with its inferred value.

Fig. 3. The execution of an RI query by using the discretization method: (a) regular grid (k = 100), (b) resulting points. The color bar shows the inferred value (70.0 to 80.0).

The discretization method determines all points that are labeled with values of some degree of membership regarding the target LVal of the consequent of an FSI model. In this sense, if the target LVal encompasses the maximum (minimum) domain values of the consequent, this method can also be a candidate to answer Optimal RI queries. For this, we have only to select those points with the maximum (minimum) inferred value. This means that we can invoke this method for k points, then get m ≤ k points as a result, and pick the n ≤ m points with the maximum (minimum) inferred value. We discuss the effect of using this strategy in Section VII.

D. The Optimization Method

Algorithm 3 presents the optimization method. Its inputs are an FSI model (fsi), a query window (qw), an integer number (max_depth), and an optimization technique (optim). This method differs from the discretization method in the sense that its goal is to discover points with the maximum (minimum)

inferred value. The main strategy is to recursively divide the query window into subquadrants and explore them to find such points. The parameter max_depth determines the maximum number of times the subdivision is performed. In addition, this method requires an optimization technique that allows us to get the points with the maximum (minimum) inferred value in a subquadrant. Examples of optimization techniques include genetic algorithms [13] and the particle swarm optimization (PSO) [14]. While the optimization techniques have their own specific parameters, in general, they all deploy a fitness function to evaluate candidate solutions to the problem. In our case, a candidate solution is a point with an inferred value calculated by Algorithm 1 and the objective is to select only the point inside a range (i.e., subquadrant) with the maximum (minimum) inferred value. This means that the fitness function

Algorithm 3: Evaluation of the optimization method.
  Input: An FSI model (fsi), a query window (qw), an integer number (max_depth), and an optimization technique (optim).
  Output: A set of points where each point is labeled with an inferred value.
  1  Let qw_inferred_points be an empty list of pairs in the format (p, v)
  2  Let target be the LVal with the largest (smallest) domain values of the consequent part from the rules set of fsi
  3  qw_inferred_points ← optim_method_recursive(fsi, qw, max_depth, 0, qw_inferred_points, optim, target)
  4  return qw_inferred_points

Algorithm 4: The algorithm of the recursive function employed by Algorithm 3.
  Input: An FSI model (fsi), a query window (qw), two integer numbers (max_depth and curr_depth), a list of pairs (result_list), an optimization technique (optim), and an LVal (target).
  Output: A set of points where each point is labeled with an inferred value.
   1  Function optim_method_recursive(fsi, qw, max_depth, curr_depth, result_list, optim, target)
   2    if max_depth = curr_depth then
   3      return result_list
   4    subquadrants ← make_grid(qw, 2, 2)
   5    foreach q in subquadrants do
   6      optim_res ← optim(fsi, q)  // eval is used as fitness function
   7      if µ_target(optim_res.value) > 0 then
   8        add(result_list, (optim_res.point, optim_res.value))
   9        result_list ← optim_method_recursive(fsi, q, max_depth, curr_depth + 1, result_list, optim, target)
  10    return result_list

The optimization method (line 3) itself is a recursive approach and shown in Algorithm 4. In addition to the input parameters of Algorithm 3, Algorithm 4 gets an integer value that denotes the current depth of the recursive call (curr_depth), an auxiliary list of pairs (result_list), and the target LVal (target) as input parameters. The stop condition of the recursive call of Algorithm 4 is whether it has reached the maximum depth given as input (lines 2 and 3). In this case, the algorithm returns the list of pairs obtained so far and, consequently, Algorithm 3 stops its execution as well (line 4). Otherwise, Algorithm 4 splits the query window into four subquadrants (line 4), which are rectangles of equal size covering the query window. For each subquadrant (lines 5 to 9), it applies the optimization technique optim to find the point with the maximum (minimum) inferred value by considering the subquadrant as the search space (line 6). As previously discussed, we assume that the optimization technique is given by the user and employs Algorithm 1 as its fitness function. Then, Algorithm 4 tests if the inferred value of the resulting point has a membership degree greater than 0 in the target LVal (line 7). If the point satisfies this test, the point is appended to the result (line 8). Further, the subquadrant containing the point can be recursively explored (line 9) so that the query window of the next call is now the subquadrant and the current depth of the recursive call is incremented by one. In this way, Algorithm 4 pursues a pruning strategy since it does not explore subquadrants whose maximum (minimum) inferred values do not contribute to the final answer of the evaluation.

We can calculate the maximum number of resulting points of the optimization method as follows. Since the maximum number of recursive calls is determined by the parameter max_depth and at most four points can be returned in each level of the recursive call, we obtain at most $\sum_{i=1}^{\mathit{max\_depth}} 4^i$. Note that the minimum value for max_depth is 1 since if it is 0, the query window is not split.

Figure 4 depicts the evaluation of the optimization method for one query window of our running example (Figure 2b). To better visualize this query window, we zoom into the southwest part of Staten Island of New York City. We aim to discover the best locations to visit. Thus, we are interested in gathering the points with the maximum inferred values. The example assumes max_depth = 2 to execute Algorithm 3. Only the top-right subquadrant of the first level of the recursive call has a point with some degree in the LVal great. Thus, Algorithm 4 is recursively called to deal with this subquadrant, which is in turn split into its four contained subquadrants, as
corresponds to the function eval in Algorithm 1 and that shown in Figure 4a. Three subquadrants have selected points
the input optimization technique optim in Algorithm 3 is leading to the resulting points depicted in Figure 4b. Note that
parameterized with this fitness function. the method has pruned four subquadrants in its evaluation.
Algorithm 3 starts by creating an auxiliary list of pairs to
store its result (line 1). Each pair has the format (p, v) where p VII. C OMPARATIVE A NALYSIS OF R EGION I NFERENCE
is a point object and v is its inferred value. Next, the algorithm Q UERIES
sets the target according to the user’s goal (line 2). If the user In this section, we conduct a comparative analysis of the
wants to maximize (minimize) inferred values, Algorithm 3 RI methods described in Sections VI-C and VI-D. The goal
considers the LVal that contains the largest (smallest) values is to characterize the behavior of these methods so that we
of the domain of the consequent as the target. can understand their properties, similarities, and relationships.
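As a companion to the description of Algorithm 4 in Section VI-D, the recursive subdivision and its pruning test can be sketched in Python. This is a minimal sketch under stated assumptions, not the fsr implementation: the names Window, random_search, toy_eval, and mu_great are hypothetical, a plain random search stands in for PSO, and a toy function replaces the eval function of Algorithm 1. Only the maximization case is covered. The helper max_points evaluates the bound $\sum_{i=1}^{\mathit{max\_depth}} 4^{i}$.

```python
import math
import random
from dataclasses import dataclass
from functools import partial


@dataclass(frozen=True)
class Window:
    """Axis-aligned query window (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def make_grid(qw, nx, ny):
    """Split qw into nx * ny rectangles of equal size covering qw."""
    dx = (qw.xmax - qw.xmin) / nx
    dy = (qw.ymax - qw.ymin) / ny
    return [
        Window(qw.xmin + i * dx, qw.ymin + j * dy,
               qw.xmin + (i + 1) * dx, qw.ymin + (j + 1) * dy)
        for j in range(ny) for i in range(nx)
    ]


def random_search(fitness, q, rng, samples=200):
    """Assumed stand-in for PSO: best of `samples` uniform random points in q."""
    best_point, best_value = None, -math.inf
    for _ in range(samples):
        p = (rng.uniform(q.xmin, q.xmax), rng.uniform(q.ymin, q.ymax))
        v = fitness(p)
        if v > best_value:
            best_point, best_value = p, v
    return best_point, best_value


def optim_method_recursive(fitness, qw, max_depth, curr_depth,
                           result_list, optim, mu_target):
    """Follows the steps of Algorithm 4 for the maximization case."""
    if max_depth == curr_depth:
        return result_list
    for q in make_grid(qw, 2, 2):
        point, value = optim(fitness, q)
        # Pruning: descend only if the best value found lies in the target LVal.
        if mu_target(value) > 0:
            result_list.append((point, value))
            result_list = optim_method_recursive(
                fitness, q, max_depth, curr_depth + 1,
                result_list, optim, mu_target)
    return result_list


def max_points(max_depth):
    """Upper bound on returned points: sum of 4**i for i = 1..max_depth."""
    return sum(4 ** i for i in range(1, max_depth + 1))


def toy_eval(p):
    """Toy inferred value in [0, 100], peaked at (0.8, 0.8) (assumption)."""
    return 100.0 * math.exp(-((p[0] - 0.8) ** 2 + (p[1] - 0.8) ** 2) / 0.05)


def mu_great(v):
    """Toy membership function for 'great': 0 at 70, rising to 1 at 80."""
    return min(1.0, max(0.0, (v - 70.0) / 10.0))


rng = random.Random(42)
optim = partial(random_search, rng=rng)
points = optim_method_recursive(
    toy_eval, Window(0.0, 0.0, 1.0, 1.0), 2, 0, [], optim, mu_great)
print(len(points), "points; bound for max_depth = 2 is", max_points(2))
```

With the toy function, only the top-right quadrant of the unit window contains values in the toy LVal, so the other three quadrants are pruned at the first level. For max_depth = 1 to 4 the bound evaluates to 4, 20, 84, and 340.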

Fig. 4. The execution of a query window inference by using the optimization method. (a) quadrants (b) resulting points

Fig. 5. The relationship between the average inferred values and the number of returned points for each RI method.

(a) discretization method (cell labels; shading: average inferred value, scale 70.0 to 80.0)

                 Query Window
k            1      2      3      4      5      6
100         14      6     67     61     55     71
10,000   1,827    714  6,531  6,716  6,434  6,528

(b) optimization method (cell labels; shading: average inferred value, scale 73 to 79)

                 Query Window
max_depth    1      2      3      4      5      6
4          242     33    315    335    338    339
3           74     11     83     84     84     84
2           19      4     20     20     20     20
1            4      1      4      4      4      4

The experimental setup of the comparative analysis is detailed in Section VII-A, while the results are discussed in Section VII-B.

A. Experimental Setup

The comparative analysis employs our running example. Hence, it uses the spatial plateau region objects built for each LVar of the application (Sections IV and V) to specify an FSI model and evaluate the RI methods by using the six query windows depicted in Figure 2b. These windows have been randomly generated inside the geometric shape of New York City. They have the unique identifiers 1 to 6 and different sizes with respect to the total area of the bounding box
encompassing New York City. The query windows 1 and 2 have 0.1% of the total area, the windows 3 and 4 have 1%, and the last two windows have 5%. These sizes allow us to analyze the effect of different query window sizes on our methods.

We have employed the package fsr [10] to implement the RI methods and the running example and to conduct our comparative analysis. It is an R package that implements the data types and operations defined by the Spatial Plateau Algebra and allows users to create FSI models.

To understand the behavior of these methods, we have selected the following parameter values. For the discretization method, we have used 100 and 10,000 as values for the parameter k and great as the value for the target LVal. For the optimization method, we have used the PSO algorithm with the value 10 chosen for the PSO-specific parameters population size and maximum number of iterations [14]. Further, we have used the values 1 to 4 for the parameter max_depth.

B. Understanding the Characterization and Applicability of the Region Inference Methods

Figure 5 depicts some statistical results when executing the RI methods in the context of our running example. We use heat maps to relate the number of points returned by each method, shown by the labels in the cells, with the average inferred value, represented by a grayscale color. The x-axis refers to the identifier of the query window, and the y-axis relates to the parameters of the respective RI method. Figure 5 allows us to understand the following characterizations.

With respect to the number of points, the discretization method has distinguished itself in returning the largest number of points. This is expected since this method picks points whose inferred value belongs to the target LVal, while the optimization method is stricter in the selection process. Both methods have in common that the number of returned points grows as their parameter values increase. We observe that the growth rates for the different query windows can be approximated by linear functions. With respect to the query window size, if it is small, the number of returned points tends to be small too. This aspect is more evident for query window 2 in the optimization method because of its pruning strategy. If the query window is small, its division generates very small subquadrants that possibly do not have points belonging to the final answer.

As for the average inferred value of the points returned by the methods, we highlight the optimization method since it has shown the highest averages in all query windows. The parameter max_depth exerts a strong influence on the selected points. If we increase it, we decrease the average inferred value of the returned points. This behavior mainly appears in small query windows (e.g., windows 1 and 2) due to the following fact. For each subquadrant, the optimization method finds one point with the maximum inferred value inside the subquadrant only. Thus, the method possibly adds points that do not have the global maximum inferred value. On the other hand, the inclusion of points that have inferred values lower than but very near to the maximum value can provide a diversity of good choices for the users in applications.

Figure 6 compares the methods for processing optimal RI queries of our running example. While the optimization method is a natural candidate for this type of RI, we can also apply the discretization method by selecting only those points with the maximum (minimum) inferred value (as initially discussed in Section VI-A). Let n be the number of points returned by a method. Let further m ≤ n be the subset of

points with the maximum inferred value. We compare the precision of each configuration of the methods by considering the ratio m/n. Clearly, the optimization method shows the highest accuracy results. The discretization method returns more points than the optimization method (Figure 5) but the majority of these points does not have the maximum inferred value. Figure 6 also shows the influence of the parameter max_depth. If we divide the query window multiple times and collect points with the maximum inferred value in the subquadrants only (these points represent local maxima), we tend to decrease the accuracy rate. This is intensified when small query windows are employed (e.g., windows 1 and 2).

Fig. 6. The accuracy rate of the methods when evaluating Optimal RIs. (One panel per query window 1 to 6; y-axis: RI methods and their parameters, k = 100, k = 10,000, and max_depth = 1 to 4; x-axis: % of points with the maximum inferred value, 0 to 100.)

Supported by our comparative analysis, we highlight two important conclusions. First, we recommend that the discretization method be employed to process linguistic value-based RI queries. There are two essential factors to be considered when determining the value of k. If the query window is large (small) and users want to increase (decrease) the number of points returned by the method, then the value of k should be large (small). Second, we recommend the use of the optimization method for processing optimal RI queries. Its strategy of subdividing the query window in subquadrants helps one find only points with the maximum (minimum) inferred values. This method has provided the best accuracy rates. For small query windows, the value 2 for the parameter max_depth is desirable; it provided accuracy rates from 84.21% to 100%. For large query windows, this parameter should be increased to 3 to augment the number of points returned by the method. It showed accuracy rates ranging from 67.57% to 96.43%.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we have introduced a new type of inference method called region inference. Its novelty consists in the combination of spatial query processing and fuzzy inference methods. Based on a method for point inference, we have introduced novel methods for two spatial inference types named linguistic value-based region inference and optimal region inference. A comparative analysis supported by a running application example has revealed the characteristics, benefits, and advantages of these inference methods.

For future work, we see the following interesting topics. Our query windows are static. But there are many applications that would benefit from fuzzy spatial inference on moving windows. For example, consider a user who is driving in a car and repeatedly asking for the most interesting locations to visit nearby. This would lead us to continuous fuzzy spatial inference queries. Another possible topic is that based on the behavior identified in our comparative analysis, we aim to propose an automatic approach to determining the maximum depth that the optimization method can go without provoking disturbance in the maximum (minimum) inferred values of the points (i.e., to minimize the standard deviation). More precisely, the goal could be to propose a variable optimization method which performs the exploration of the search space in a different way. For example, a subquadrant should be visited only if it satisfies a given condition related to the history of previously identified points. Finally, we also aim to compare and analyze the runtime of the proposed RI methods. Further, we plan to improve their performance by using spatial indexing structures provided by spatial libraries [15].

REFERENCES

[1] L. A. Zadeh, “Fuzzy sets,” Information and Control, vol. 8, no. 3, pp. 338–353, 1965.
[2] C.-C. Lee, “Fuzzy logic in control systems: Fuzzy logic controller, part II,” IEEE Trans. on Systems, Man, and Cybernetics, vol. 20, no. 2, pp. 419–435, 1990.
[3] A. C. Carniel and M. Schneider, “A survey of fuzzy approaches in spatial data science,” in IEEE Int. Conf. on Fuzzy Systems, 2021, pp. 1–6.
[4] X. Tang, “Spatial object modeling in fuzzy topological spaces with applications to land cover change,” Ph.D. dissertation, International Institute for Geo-information Science & Earth Observation, 2004.
[5] J. Jasiewicz, “A new GRASS GIS fuzzy inference system for massive data analysis,” Computers and Geosciences, vol. 37, no. 9, pp. 1525–1531, 2011.
[6] A. Calzada, J. Liu, H. Wang, and A. Kashyap, “A GIS-based spatial decision support tool based on extended belief rule-based inference methodology,” in Int. Workshop on Knowledge Discovery, Knowledge Management and Decision Support, 2013, pp. 388–395.
[7] A. C. Carniel and M. Schneider, “Fuzzy inference on fuzzy spatial objects (FIFUS) for spatial decision support systems,” in IEEE Int. Conf. on Fuzzy Systems, 2017, pp. 1–6.
[8] L. Boudet, J.-P. Poli, L.-P. Bergé, and M. Rodriguez, “Situational assessment of wildfires: a fuzzy spatial approach,” in IEEE Int. Conf. on Tools with Artificial Intelligence, 2020, pp. 1180–1185.
[9] R. C. N. Njila, M. A. Mostafavi, and J. Brodeur, “A decentralized semantic reasoning approach for the detection and representation of continuous spatial dynamic phenomena in wireless sensor networks,” ISPRS Int. Journal of Geo-Information, vol. 10, no. 3, 2021.
[10] A. C. Carniel, F. Galdino, J. S. Philippsen, and M. Schneider, “Handling fuzzy spatial data in R using the fsr package,” in ACM SIGSPATIAL Int. Conf. on Advances in Geographic Information Systems, 2021, pp. 526–535.
[11] L. A. Zadeh, “Outline of a new approach to the analysis of complex systems and decision processes,” IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-3, no. 1, pp. 28–44, 1973.
[12] A. C. Carniel and M. Schneider, “Spatial Plateau Algebra: An executable type system for fuzzy spatial data types,” in IEEE Int. Conf. on Fuzzy Systems, 2018, pp. 1–8.
[13] M. Srinivas and L. Patnaik, “Genetic algorithms: a survey,” Computer, vol. 27, no. 6, pp. 17–26, 1994.
[14] R. Poli, J. Kennedy, and T. Blackwell, “Particle swarm optimization,” Swarm Intelligence, vol. 1, pp. 33–57, 2007.
[15] A. C. Carniel, R. R. Ciferri, and C. D. A. Ciferri, “FESTIval: A versatile framework for conducting experimental evaluations of spatial indices,” MethodsX, vol. 7, p. 100695, 2020.

