P. 1
PAM an Efficient and Privacy-Aware Monitoring Framework for Continuously Moving Objects

PAM an Efficient and Privacy-Aware Monitoring Framework for Continuously Moving Objects

|Views: 16|Likes:
Published by Kinjalk Kamal

More info:

Published by: Kinjalk Kamal on Jul 22, 2011
Copyright:Attribution Non-commercial

Availability:

Read on Scribd mobile: iPhone, iPad and Android.
download as PDF, TXT or read online from Scribd
See more
See less

07/22/2011

pdf

text

original

PAM: An Efficient and Privacy-Aware

Monitoring Framework for Continuously
Moving Objects
Haibo Hu, Jianliang Xu, Senior Member, IEEE, and Dik Lun Lee
Abstract—Efficiency and privacy are two fundamental issues in moving object monitoring. This paper proposes a privacy-aware
monitoring (PAM) framework that addresses both issues. The framework distinguishes itself from the existing work by being the first to
holistically address the issues of location updating in terms of monitoring accuracy, efficiency, and privacy, particularly, when and how
mobile clients should send location updates to the server. Based on the notions of safe region and most probable result, PAM performs
location updates only when they would likely alter the query results. Furthermore, by designing various client update strategies, the
framework is flexible and able to optimize accuracy, privacy, or efficiency. We develop efficient query evaluation/reevaluation and safe
region computation algorithms in the framework. The experimental results show that PAM substantially outperforms traditional
schemes in terms of monitoring accuracy, CPU cost, and scalability while achieving close-to-optimal communication cost.
Index Terms—Spatial databases, location-dependent and sensitive, mobile applications.
Ç
1 INTRODUCTION
I
N mobile and spatiotemporal databases, monitoring con-
tinuous spatial queries over moving objects is needed in
numerous applications such as public transportation, logis-
tics, and location-based services. Fig. 1 shows a typical
monitoring system, which consists of a base station, a
database server, application servers, and a large number of
moving objects (i.e., mobile clients). The database server
manages the location information of the objects. The applica-
tion servers gather monitoring requests and register spatial
queries at the database server, which then continuously
updates the query results until the queries are deregistered.
The fundamental problem in a monitoring system is
when and how a mobile client should send location updates
to the server because it determines three principal perfor-
mance measures of monitoring—accuracy, efficiency, and
privacy. Accuracy means how often the monitored results
are correct, and it heavily depends on the frequency and
accuracy of location updates. As for efficiency, two
dominant costs are: the wireless communication cost for
location updates and the query evaluation cost at the
database server, both of which depend on the frequency of
location updates. As for privacy, the accuracy of location
updates determines how much the client’s privacy is
exposed to the server.
In the literature, very few studies on continuous query
monitoring are focused on location updates. Two commonly
used updating approaches are periodic update (every client
reports its new location at a fixed interval) and deviation
update (a client performs an update when its location or
velocity changes significantly) [24], [32], [35], [47]. However,
these approaches have several deficiencies. First, the
monitoring accuracy is low: query results are correct only
at the time instances of periodic updates, but not in between
them or at any time of deviation updates. Second, location
updates are performed regardless of the existence of
queries—a high update frequency may improve the mon-
itoring accuracy, but is at the cost of unnecessary updates
and query reevaluation. Third, the server workload using
periodic update is not balanced over time: it reaches the
peak when updates arrive (they must arrive simultaneously
for correct results) and trigger query reevaluation, but is idle
for the rest of the time. Last, the privacy issue is simply
ignored by assuming that the clients are always willing to
provide their exact positions to the server.
Some recent work attempted to remedy the privacy
issue. Location cloaking was proposed to blur the exact client
positions into bounding boxes [18], [14], [31], [26]. By
assuming a centralized and trustworthy third-party server
that stores all exact client positions, various location
cloaking algorithms were proposed to build the bounding
boxes while achieving the privacy measure such as
/-anonymity. However, the use of bounding boxes makes
the query results no longer unique. As such, query
evaluation in such uncertain space is more complicated. A
common approach is to assume that the probability
distribution of the exact client location in the bounding
box is known and well formed. Therefore, the results are
defined as the set of all possible results together with their
probabilities [14], [31], [7]. However, all these approaches
focused on one-time cloaking or query evaluation; they
cannot be applied to monitoring applications where
continuous location update is required and efficiency is a
critical concern.
404 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
. H. Hu and J. Xu are with the Department of Computer Science, Hong
Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong SAR,
China. E-mail: {haibo, xujl}@comp.hkbu.edu.hk.
. D.L. Lee is with the Department of Computer Science and Engineering, The
Hong Kong University of Science and Technology, Clear Water Bay,
Kowloon, Hong Kong SAR, China. E-mail: dlee@cse.ust.hk.
Manuscript received 2 Apr. 2008; revised 30 July 2008; accepted 25 Mar.
2009; published online 15 Apr. 2009.
Recommended for acceptance by S. Wang.
For information on obtaining reprints of this article, please send e-mail to:
tkde@computer.org, and reference IEEECS Log Number TKDE-2008-04-0175.
Digital Object Identifier no. 10.1109/TKDE.2009.86.
1041-4347/10/$26.00 ß 2010 IEEE Published by the IEEE Computer Society
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
In [21], we proposed a monitoring framework where the
clients are aware of the spatial queries being monitored, so
they send location updates only when the results for some
queries might change. Our basic idea is to maintain a
rectangular area, called safe region, for each object. The safe
region is computed based on the queries in such a way
that the current results of all queries remain valid as long
as all objects reside inside their respective safe regions. A
client updates its location on the server only when the
client moves out of its safe region. This significantly
improves the monitoring efficiency and accuracy com-
pared to the periodic or deviation update methods.
However, this framework fails to address the privacy
issue, that is, it only addresses “when” but not “how” the
location updates are sent.
In this paper, we take a more comprehensive approach—
instead of dealing with “when” and “how” separately like
most existing work, we propose a privacy-aware monitor-
ing (PAM) framework that incorporates the accuracy,
efficiency, and privacy issues altogether. We adapt for the
monitoring environment the privacy model that has been
employed by location cloaking and other privacy-aware
approaches. More specifically, a client encapsulates its exact
position in a bounding box, and the timing and mechanism
with which the box is updated to the server are decided by
a client-side location updater as part of PAM.
However, the integration of privacy into the monitoring
framework poses challenges to the design of PAM. First,
with the introduction of bounding boxes, the result of a
query is no longer unique. Among all possible results, we
argue that the most probable result, i.e., the one with the
highest probability, is most promising for approximating
the genuine result (the result derived based on the exact
positions). The probability is computed by assuming a
uniform distribution of the exact client position in the
bounding box. Fig. 2 shows two clients o. / together with
their bounding boxes. Both the genuine and most probable
result for the 1NN query Q are ¦o¦. However, even
monitoring only the most probable result adds great
complexity to query evaluation. As such, one of the main
contributions of this paper is to devise efficient query
processing algorithms for common spatial query types.
Second, the most probable result also adds complexity to the
definition of safe region. New algorithms must be designed
to compute maximum safe regions in order to reduce the
number of location updates, and thus, improve efficiency.
Third, as the location updater decides when and how a
bounding box is updated, its strategy determines the
accuracy, privacy, and efficiency of the framework. The
standard strategy is to update when the centroid of the bounding
box moves out of the safe region, which guarantees accuracy—
no miss of any change of the most probable result. To
optimize privacy or efficiency, however, alternative strate-
gies must be devised. Compared to the previous work, the
PAM framework has the following advantages:
. To our knowledge, this is the first comprehensive
framework that addresses the issue of location
updating holistically with monitoring accuracy,
efficiency, and privacy altogether. This framework
extends from our previous work [21] by introducing
a common privacy model, and therefore, suits
realistic scenarios.
. As for efficiency, the framework significantly re-
duces location updates to only when an object is
moving out of the safe region, and thus, is very likely
to alter the query results.
. As for accuracy, the framework offers correct
monitoring results at any time, as opposed to only
at the time instances of updates in systems that are
based on periodic or deviation location update.
. The framework is generic in the sense that it is not
designed for a specific query type. Rather, it
provides a common interface for monitoring various
types of spatial queries such as range queries and
kNN queries. Moreover, the framework does not
presume any mobility pattern on moving objects.
. The framework is flexible in that by designing
appropriate location update strategies, accuracy,
privacy, or efficiency can be optimized.
In the rest of this paper, we will explore the PAM
framework, especially on the aspects of query evaluation
and safe region computation. The remainder of this paper is
organized as follows: Section 2 reviews the related work.
Section 3 overviews the framework components, followed
by Sections 4 and 5 where query evaluation and safe region
computation are presented, with an emphasis on range and
kNN queries. Dynamic client update strategies are given in
Section 6 to optimize privacy and efficiency. Experimental
results of PAM are shown in Section 7.
2 RELATED WORK
There is a large body of research work on spatial temporal
query processing. Early work assumed a static data set and
focused on efficient access methods (e.g., R-tree [19]) and
query evaluation algorithms (e.g., [20], [37]). Recently, a lot
of attention has been paid to moving-object databases,
where data objects or queries or both of them move.
Assuming that object movement trajectories are known a
priori, Saltenis et al. [38] proposed the Time-Parameterized
R-tree (TPR-tree) for indexing moving objects, where the
location of a moving object is represented by a linear
function of time. Benetis et al. [3] developed query
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 405
Fig. 2. Monitoring example.
Fig. 1. The system architecture.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
evaluation algorithms for NN and reverse NN search based
on the TPR-tree. Tao et al. [41] optimized the performance
of the TPR-tree and extended it to the TPR
+
-tree. Chon et al.
[10] studied range and kNN queries based on a grid model.
Patel et al. [34] proposed a novel index structure called
STRIPES using a dual transformation technique.
The work on monitoring continuous spatial queries can
be classified into two categories. The first category assumes
that the movement trajectories are known. Continuous kNN
monitoring has been investigated for moving queries over
stationary objects [40] and linearly moving objects [22], [36].
Iwerks et al. [22] extended to monitor distance semijoins for
two linearly moving data sets [23]. However, as pointed out
in [39], the known-trajectory assumption does not hold for
many application scenarios (e.g., the velocity of a car
changes frequently on road).
The second category does not make any assumption on
object movement patterns. Xu et al. [44] and Zhang et al.
[48] suggested returning to the client both the query result
and its validity scope where the result remains the same. As
such, the query is reevaluated only when the query exits the
validity scope. However, their solutions work for stationary
objects only. For continuous monitoring of moving objects,
the prevailing approach is periodic reevaluation of queries
[24], [32], [35], [47]. Prabhakar et al. [35] proposed the
Q-index, which indexes queries using an R-tree-like
structure. At each evaluation step, only those objects that
have moved since the previous evaluation step are
evaluated on the Q-index. While this study is limited to
range queries, Mokbel et al. [32] proposed a scalable
incremental hash-based algorithm (SINA) for range and
kNN queries. SINA indexes both queries and objects, and
achieves scalability by employing shared execution and
incremental evaluation of continuous queries [32], [43].
Kalashnikov et al. and Yu et al. suggested grid-based in-
memory structures for object and query indexes to speed up
reevaluation process of range queries [25] and kNN queries
[47]. Access methods to support frequent location updates
of moving objects have also been investigated [24], [29]. Our
study falls into this category but distinguishes itself from
existing studies with a comprehensive framework focusing
on location update.
Uncertainty andprivacyissues have beenrecently studied
in moving object monitoring. To protect location privacy,
various cloaking or anonymizing techniques have been
proposed to hide the client’s actual location. Among them
are the spatiotemporal cloaking [18], the Clique-Cloak [14],
[15], the Casper anonymizer [31], /i|/¹o1 [17], [26], and
peer-to-peer cloaking [11], [16]. In spatiotemporal cloaking,
for each location update, the server divides the space
recursively in a quad-tree-like format till a suitable subspace
is found to cloak the updated location. The CliqueCloak
algorithm constructs a clique graph to combine some clients
who can share the same cloaked spatial area. The Casper
anonymizer is associated with a query processor to ensure
that the anonymizedarea returns the same query result as the
actual location. In /i|/¹o1, all user locations are sorted by
Hilbert space-filling curve ordering, and then, every / users
are groupedtogether inthis order. Besides, location cloaking,
pseudonym, dummy, andtransformation were also proposedfor
privacy preservation. Pseudonym decouples the mapping
between the user identity and the location so that an
untrusted server only receives the location without the user
identity [33], [4]. Dummy generates fake user locations
(called dummies) and mixes them together with the genuine
user location into the request [28], [46], [45]. Transformation
utilizes certain one-way spatial transformations (e.g., a space
filling curve) to map the query space to another space and
resolves query blindly in the transformed space [27].
As for location uncertainty, a common model for
characterizing the uncertainty of an object is a closed region
with a predefined probability distribution of this object in the
region. Based on this probabilistic model, query processing
and indexing algorithms have been proposed to evaluate
probabilistic range queries [12], [31] and kNN queries [9].
While in these studies, the objects are uncertain, the queries
themselves are still certain. Chen and Cheng extended the
probabilistic processing to more general cases where the
queries are also uncertain [7]. Our study, on the other hand,
addresses the continuous monitoring issue. By adopting the
notion of “safe region,” the frequency of query reevaluation
on uncertain location information is reduced, and hence, the
system efficiency and scalability are improved.
Distributed approaches have been investigated to moni-
tor continuous range queries [6], [13] and continuous kNN
queries [42]. The main idea is to shift some load from the
server to the mobile clients. Monitoring queries have also
been studied for distributed Internet databases [8], data
streams [1], and sensor databases [30]. However, these
studies are not applicable to monitoring of moving objects,
where a two-dimensional space is assumed.
3 FUNDAMENTALS OF PAM FRAMEWORK
3.1 Privacy-Aware Location Model
In this paper, we assume that the clients are privacy
conscious. That is, the clients do not want to expose their
genuine point locations to the database server to avoid
spatiotemporal correlation inference attack [14], by which an
adversary may infer users’ private information such as
political affiliations, alternative lifestyles, or medical pro-
blems. For example, knowing that a user is inside a heart
specialty clinic during business hours, the adversary can
infer that the user might have a heart problem. This has been
cited as a major privacy threat in location-based services and
mobile computing. To protect against it, most existing work
suggests replacing accurate point locations by bounding
boxes to reduce location resolutions [18], [14], [31], [26], [7],
[17]. With a large enough location box covering the sensitive
place (e.g., the clinic) as well as a good number of other
insensitive places, the success rate or confidence of such
spatiotemporal correlation inference can be reduced sig-
nificantly. In our monitoring framework, we take the same
privacy-aware approach. Specifically, each time a client
detects his/her genuine point location, it is encapsulated
into a bounding box. Then, the client-side location updater
decides whether or not to update that box to the server.
1
Without any other knowledge about the client locations or
moving patterns, upon receiving such a box, the server can
only presume that the genuine point location is distributed
uniformly in this box. To simplify the presentation in this
406 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
1. The computation of a proper bounding box to satisfy a certain privacy
metric (such as /-anonymity) has been extensively studied in the literature
[14], [26], [17] and is beyond the scope of this paper. Nonetheless, the larger
the box is, the less successful and confident the adversary’s inference
becomes.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
paper, we further restrict the shape of such a bounding box
to a c-by-c square (or in short c-square), where c is
customizable for each object. Our problem is therefore to
monitor result changes of spatial queries as objects move,
and monitor them as accurately as possible and at the lowest
cost of location updates.
The key idea to solving the problem is “safe region,”
which was defined in [21] as a rectangle within which the
change of object location does not change the result of any
registered spatial query. Now that locations are c-squares
instead of points, to clarify the definition of “within,” we
use the centroid point of the square as a representative, so
the safe region is essentially a safe region for the centroid of
the c-square. However, the consequence of introducing
c-square is more than that—the result of a spatial query is
no longer unique. For example, if the c-square of an object
partially overlaps with a range query, this object could be
either a result object or a nonresult object of this query. As
such, a unique definition of query result under c-squares is
a prerequisite of safe region.
Since the genuine point location of an object is distributed
uniformly in its c-square, we can define the (unique) query
result as the one with the highest probability among all
possible results. As in the previous range query example, if
the majority of the c-square falls inside the range query, that
object is most probably a result object of this query;
otherwise, that object is most probably a nonresult object.
With the notion of most probable result, we thereby define
the safe region as a rectangle within which the change of the
centroid of the object’s c-square does not change the most
probable result of any registered spatial query. The standard
update strategy of the client is therefore “to update when the
centroid of the c-square is out of the safe region.”
The reason why we exclude all other less probable
results in this definition is threefold: 1) monitoring
continuous queries usually trades accuracy for efficiency—
although the most probable result does not always align
with the genuine result (the result derived based on genuine
point locations of all objects), we will show in Section 4
that it is efficient to compute, and therefore, prevents the
server from being computationally overloaded; 2) if the
query result were defined as the set of all possible results,
the safe region would have to be extremely small to report
location updates if any of the possible results changes,
which makes the update cost overwhelmingly high; and
3) we do not want the choice of c-square—which is made
by the client—to affect query results heavily, and obviously
the most probable results are less vulnerable than other
result definitions.
3.2 Framework Overview
As shown in Fig. 3, the PAM framework consists of
components located at both the database server and the
moving objects. At the database server side, we have the
moving object index, the query index, the query processor,
and the location manager. At moving objects’ side, we have
location updaters. Without loss of generality, we make the
following assumptions for simplicity:
. The number of objects is some orders of magni-
tude larger than that of queries. As such, the query
index can accommodate all registered queries in
main memory, while the object index can only
accommodate all moving objects in secondary
memory. This assumption has been widely
adopted in many existing proposals [25], [47], [21].
. The database server handles location updates se-
quentially; in other words, updates are queued and
handled on a first-come-first-serve basis. This is a
reasonable assumption to relieve us from the issues
of read/write consistency.
. The moving objects maintain good connection with
the database server. Furthermore, the communication
cost for any location update is a constant. With the
latter assumption, minimizing the cost of location
updates is equivalent to minimizing the total number
of updates.
PAM framework works as follows (see Fig. 3): At any
time, application servers can register spatial queries to the
database server (step (1). When an object sends a location
update (step (2), the query processor identifies those queries
that are affected by this update using the query index, and
then, reevaluates them using the object index (step (3). The
updated query results are then reported to the application
servers who register these queries. Afterward, the location
manager computes the new safe region for the updating
object (step (4), also based on the indexes, and then, sends it
back as a response to the object (step (5). The procedure for
processing a new query is similar, except that in step (2 , the
new query is evaluated from scratch instead of being
reevaluated incrementally, and that the objects whose safe
regions are changed due to this new query must be notified.
Algorithm 1 summarizes the procedure at the database
server to handle a query registration/deregistration or a
location update.
Algorithm 1. Overview of Database Behavior
1: while receiving a request do
2: if the request is to register query c then
3: evaluate c;
4: compute its quarantine area and insert it into the
query index;
5: return the results to the application server;
6: update the changed safe regions of objects;
7: else if the request is to deregister query c then
8: remove c from the query index;
9: else if the request is a location update from object j
then
10: determine the set of affected queries;
11: for each affected query c
/
do
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 407
Fig. 3. PAM framework overview.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
12: reevaluate c
/
;
13: update the results to the application server;
14: recompute its quarantine area and update the
query index;
15: update the safe region of j;
It is noteworthy that although in this paper, the most
probable result is used, this framework can also adapt to
other query result definitions such as over a probability
confidence (e.g., “returns objects that have 90 percent
probability inside the query range”). The only changes
needed to reflect the new result definition are the query
evaluation algorithms in the query processor and safe
region computation in the location manager. In the rest of
this paper, we stick to the definition of the most probable
result and leave the modification details for other defini-
tions to interested readers.
The following sections explain the components at the
database server in detail, and Section 6 describes the update
strategy of the client-side location updater.
3.3 The Object Index
The object index is the server-side view on all objects. More
specifically, to evaluate queries, the server must store the
spatial range, in the form of a bounding box, within which
each object can possibly locate. Note that this bounding box
is different from a c-square because its shape also depends
on the client-side location updater. That is, it must be a
function (denoted by ) of the last updated c-square and
the safe region. As such, this box is called a bbox as a mark
of distinction. In particular, for the standard update
strategy, the //or is the safe region enlarged by c´2 on
each side, or formally, the “Minkowski sum”
2
of the safe
region and a c´2-square.
With the same rationale for which we assume the
genuine point location of an updating object to distribute
uniformly in the c-square, we assume that the genuine point
locations are distributed uniformly in their respective //ores
when queries are evaluated or reevaluated. The object index
is built on the //ores to speed up the evaluation. While
many spatial index structures can serve this purpose, this
paper employs the R
+
-tree index [2], [19], which is most
widely adopted in the literature. Since the //or changes each
time the object updates, the index is optimized to handle
frequent updates [29].
3.4 The Query Index
For each registered query, the database server stores:
1) the query parameters (e.g., the rectangle of a range
query, the query point, and the / value of a kNN query);
2) the current query results; and 3) the quarantine area of
the query. The quarantine area is used to identify the
queries whose results might be affected by an incoming
location update. It originates from the quarantine line,
which is a line that splits the entire space into two regions:
the inner region and the outer region. An object becomes a
result object if it enters the inner region; likewise, it
becomes a nonresult object once it enters the outer region.
However, the ideal quarantine line is difficult to compute,
especially in the context of the most probable result. In
addition, as object locations have extensions rather than
points, the quarantine line is not unique for a query. As
such, we allow fuzziness by relaxing the line to an area
called “quarantine area.” That is, the entire space is split
into three regions: the inner region, the quarantine area,
and the outer region. The former two are separated by the
inner bound of the quarantine area, whereas the latter two
are separated by theouter bound of the quarantine area. To
ease the computation of these two bounds, an object
becomes a result object if its c-square moves totally inside
the inner bound; on the other hand, an object becomes a
nonresult object once its c-square crosses or is outside the
outer bound. Therefore, a query Q is not affected only if
“of the updated c-square j and its last updated c-square
j
|:t
, both of them are totally inside the inner bound or
both of them cross or are outside the outer bound of the
quarantine area.”
3
For a range query c, the query window can serve as an
inner bound of the quarantine area, because any object
whose c-square is fully inside c is a trivial result of c. On the
other hand, an outer bound can be the Minkowski sum of c
and a c´2-square, i.e., enlarging c by c´2 on each side. The
correctness of this bound can be verified by the observation
that for any c-square that crosses this bound, the majority of
this square must be outside c, thus making the object a
nonresult object. In case, there are different cs for different
objects, the largest c is used. Fig. 4a shows the inner and
outer bounds of c’s quarantine area.
For a kNN query, since only the distance to the query
point c matters, we set both the inner and the outer bounds
as circles centered at c. Furthermore, since the /th NN o
/
determines whether other object is or is not a result object,
we set the radii of the two circles based on o
/
. More
specifically, the inner bound circle is set to be the minimum
distance between c and the //or of o
/
so that if a c-square is
totally inside this circle, it is guaranteed to be closer to c than
o
/
. On the other hand, the outer bound circle is set to be the
maximum distance between c and the //or of o
/
, plus c. If
d(:. t) denotes the distance between two points : and t,
d(o. T) (1(o. T)) denotes the minimum(maximum) distance
between a pair of points in areas o and T, then the radii of
the inner and outer circle are d(c. o
/
) and 1(c. o
/
) ÷c,
respectively.
To quickly find all affected queries, an in-memory grid-
based index is built on the quarantine areas of all queries.
408 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Fig. 4. Quarantine area. (a) Range query. (b) kNN query.
2. The Minkowski sum of two shapes ¹ and 1 in euclidean space is the
result of adding every point in 1 to every point in ¹, i.e., the set
¦o ÷/[o ÷ ¹. / ÷ 1¦.
3. For kNN queries, if the order of the result objects is sensitive, Q is not
affected only if both of them cross or are outside the outer region.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
The index partitions the entire space into ` ×` uniform
grid cells, and the bucket for each cell points to those
queries whose quarantine areas overlap with or fully
enclose this cell. If we define the cell(s) that overlap with
the c-square of the updating object j as the home cell(s) of j,
then only queries pointed at by the home cell(s) of j or j
|:t
are affected, and thus, need reevaluation.
3.5 Query Processor and Location Manager
In the PAM framework, based on the object index, the query
processor evaluates the most probable result when a new
query is registered, or reevaluates the most probable result
when a query is affected by location updates. Obviously,
the reevaluation is more efficient as it can be based on
previous results. The detailed algorithms of query evalua-
tion and reevaluation will be presented in Section 4.
The location manager computes the safe region of an
object j (denoted as j.:i). Recall that a safe region is a
rectangle within which the change of the centroid of
j’s c-square does not change the most probable result of
any registered query. As queries are independent of each
other, we can further define the safe region for a single
query Q (denoted as j.:i
Q
) as a rectangle in which the
change of the centroid of j’s c-square does not change the
most probable result of Q. By this definition, j.:i
Q
is a
rectangular approximation, or more accurately an inscribed
rectangle, of Q’s inner (if j is a result object) or outer (if j is
a nonresult object) regions, which are separated by the
quarantine line. The reason why the safe region is based on
the quarantine line rather than the quarantine area is that
the latter is much coarser. Furthermore, the quarantine area
is used only to filter out the queries that are not affected by
a location update, so we trade accuracy for efficiency. The
safe region, on the other hand, directly dictates the
frequency, and hence, the cost of location updates, so we
compute it based on the more accurate quarantine line.
After each individual j.:i
Q
is computed, j.:i is simply
the intersection of these j.:i
Q
from all registered queries. To
eliminate those queries whose safe regions do not contribute
to j.:i, the location manager further requires every j.:i
Q
(and thus, the j.:i) to be fully contained in the home cell(s).
Recall that the home cell(s) are the grid cell(s) of the query
index where the c-square of j is contained or overlaps. By
this means, the location manager only needs to compute the
safe regions for those queries (subsequently called relevant
queries) whose quarantine areas are contained or overlap
with the home cell(s). These relevant queries are exactly
those indexed by the home cell(s) of the query index.
The location manager recomputes the safe region of an
object j in two cases: 1) after a new query Q is evaluated
and 2) after j sends a location update. In the former case,
since no existing queries change their quarantine lines, the
new safe region j.:i
/
is simply the intersection of the
current safe region j.:i and j.:i
Q
, the safe region for this
new query Q. If j.:i
/
is different from j.:i, the new safe
region should be updated to j. In the latter case, the
quarantine areas of some existing queries might change;
therefore, j.:i
/
needs to be completely recomputed by
computing the j.:i
Q
for each relevant query and then
getting the intersection.
As the objective of the PAM framework is to minimize
the number of location updates, the following theorem
shows that the safe region should be the inscribed rectangle
of the inner or outer region with the maximum perimeter:
Theorem 3.1. Assume that the object j moves in a randomly
chosen direction with a constant speed c (see Fig. 5), and that
c-square is small enough to be ignored. Given a convex safe
region 1 and the updated location j, the amortized location
update cost for j over time Co:t
j
is
Co:t
j
= C
|


0
/(0)d0
2¬c

÷1
=
.
C
|
2¬c
1ciiictci(1)
.
where C
|
is the cost for one location update, 0 is the angle
between the moving direction and the positive x-axis, /(0) is
the length of segment ji, i is the intersection point of this
direction and the boundary of 1, in other words, i is the
location at which the next location update occurs.
Proof. First of all, i must be unique for every 0. Otherwise, if
there were another i
/
, the points in segment ii
/
do not
belong to 1, which contradicts the convex assumption.
As such, given 0, the elapsed time before the next location
update is
/(0)
c
. The average elapsed time over all 0 is


0
/(0)
c
d0


0
d0
=


0
/(0)d0
2¬c
.
Therefore, we have
Co:t
j
= C
|


0
/(0)d0
2¬c

÷1
=
.
C
|
2¬c
1ciiictci(1)
.
because


0
/(0)d0 =
.
1ciiictci(1). .¯
Therefore, the optimal safe region j.:i
Q
is the inscribed
rectangle with the longest perimeter, or shortly 1i ÷|j, of
Q’s inner or outer region. Section 5 will present the
detailed algorithms to compute the optimal j.:i
Q
for each
type of query.
4 QUERY PROCESSING
In this section, we present the detailed algorithms to
evaluate or reevaluate a spatial query Q in terms of the
most probable result. Aside from the definition of the query
result, we know that Q also differs from a conventional
spatial query in that the object locations are in the form of
c-square (for updating objects) or //or (for other objects),
both of which are rectangular. In this section, instead of
regarding Q as a special query type, we take an alternative
approach by regarding the space where the object locations
are defined as a special euclidean space. In this space,
spatial relations such as overlapping, containment, or even
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 409
Fig. 5. Random movement.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
distance are implemented differently from a conventional
euclidean space. By using the new implementations of
spatial relations, existing spatial query processing algo-
rithms can be applied directly to the new space.
In the following sections, we implement two relations
that are required for spatial queries, namely, containment
and closer.
4.1 Spatial Relations
In this new space, an object j is contained in a rectangle 1 if
in the euclidean space, the majority of j is in 1. The
rectangle divides the enlarged safe region of any object j
into two regions: the region inside rectangle c (where j is a
result object of c) and the region outside c (where j is not a
result). The region with the larger area decides the most
probable result.
In this new space, an object j
1
is closer to a point c than
object j
2
if and only if in the euclidean space, for two
randomly picked points o. / from j
1
and j
2
, respectively, o
is equally or more probably closer to c than /. The closer
relation has a nice property that it is a total order relation,
which is proved by the following preposition:
Proposition 4.1. The closer relation is a total order relation, that
is, it satisfies
1. reflexivity,
2. antisymmetry,
3. transitivity, and
4. comparability.
Proof sketch. Cases 1 and 2 are trivial.
3. Transitivity: if j
1
is closer than j
2
and j
2
is closer
than j
3
, then o (from j
1
) is more probably closer to c than
/ (from j
2
), which is, in turn, more probably closer to c
than c (from j
3
). As such, j
1
is closer than j
2
.
4. Comparability: \j
1
. j
2
, either j
1
is closer than j
2
, or
j
2
is closer than j
1
. .¯
Therefore, the most probable result of kNN query c is
defined as the top-k objects of all objects in the closer order
of their enlarged safe regions.
To implement the “closer” relation, we present an
efficient algorithm that is based on finding out which object
has more portion of area closer to point c. Instead of
computing the exact shape of such area, which is forbid-
dingly costly, the algorithm is based on the divide-and-
conquer paradigm. It maintains a priority queue O whose
elements are pairs of subrectangles of j
1
and j
2
that have
not yet been compared. Initially, the pair <j
1
. j
2
is
inserted into O and the portion of area where j
1
(or j
2
) is
closer is 0. Each time an element <j
/
1
. j
/
2
pops up from O
(where j
/
1
is a subrectangle of j
1
and j
/
2
is a subrectangle of
j
2
), the algorithm checks if any point in j
/
1
(or j
/
2
) is always
closer than any point in j
/
2
(or j
/
1
). If this is the case (case 1),
the multiple of the area j
/
1
(or j
/
2
) is added to the portion of
area where j
1
(or j
2
) is closer. If this is not the case (case 2),
j
/
1
or j
/
2
, whichever is larger, is split into four equal
subrectangles, and thus, four new pairs are inserted to O.
The reason to split the larger rectangle is that the resulted
pairs are more probable to become pairs of case 1. The
algorithm continues until either the portion of area where j
1
(or j
2
) is closer exceeds 0.5, or the queue O becomes empty.
It is noteworthy that the portion of area where j
1
(or j
2
) is
closer is essentially the probability that j
1
(or j
2
) is closer.
As such, this algorithm always returns the correct result. On
the other hand, the algorithm is efficient because it
terminates as soon as one portion of area exceeds 0.5. In
order to let the portion of area converge to the actual
probability more quickly, we use the multiple portions of
area as the key to sort the pairs in O.
4.2 Query Evaluation and Reevaluation on Object
Index
In conventional euclidean space, a new range query is
evaluated as follows: We start from the index root and
recursively traverse down the index entries that overlap
with the query window until the leaf entries storing the
objects are reached. Then, we test each object using the
containment relation in the new space.
Reevaluation of an existing range query c is even
simpler—only the c-square of the updating object needs to
be tested on the containment relation.
The best-known algorithm to evaluate a kNN query c in
conventional euclidean space is the best-first search (BFS)
[20]. It uses a priority queue H to store the to-be-explored
index entries which may contain kNNs. The entries in H
are sorted by their minimum distances to the query point c.
BFS works by always popping up the top entry from H,
pushing its child entries into H, and then, repeating the
process all over. When a leaf entry, i.e., an entry of a leaf
node, is popped, the corresponding object is returned as a
nearest neighbor. The algorithm terminates if / objects have
been returned.
In the new space, the query is evaluated similarly, which
is shown in Algorithm 2. However, the algorithm maintains
an additional priority queue H besides H. It is a priority
queue of objects sorted by the “closer” relation. The reason
to introduce H is that when an object j is popped from H, it
is not guaranteed a kNN in the new space. Therefore, H is
used to hold j until it can be guaranteed a kNN. This occurs
when another object j
/
is popped from H, and its minimum
distance to c (d(c. j
/
)) is larger than the maximum distance
of j to c (1(c. j)). In general, when an object n is popped
from H, we need to do the following. If d(c. n) is larger than
1(c. .), where . is the top object in H, then . is guaranteed a
kNN and removed from H. Then, d(c. n) is compared with
the next 1(c. .) until it is no longer the larger one. Then, n
itself is inserted to H and the algorithm continues to pop up
the next entry from H. The algorithm continues until
/ objects are returned.
Algorithm 2. Evaluating a new kNN Query
Input: ioot: root node of object index
c: the query point
Output: C: the set of kNNs
Procedure:
1: initialize queue H and H;
2: enqueue ¹ioot. d(c. ioot)) into H;
3: while [C[ < / and H is not empty do
4: n = H.pop();
5: if n is a leaf entry then
6: while d(c. n) 1(c. .) do
410 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
7: . = H.pop();
8: insert . to C;
9: enqueue n into H;
10: else if n is an index entry then
11: for each child entry . of n do
12: enqueue ¹.. d(.. c)) into H;
To reevaluate an existing kNN query that is affected by
the updating object j, the first step is to decide whether j
is a result object by comparing j with the /th NN using
the “closer” relation: if j is closer, then it is a result object;
otherwise, it is a nonresult object. This then leads to three
cases: 1) case 1: j was a result object but is no longer so;
2) case 2: j was not a result object but becomes one; and
3) case 3: j is and was a result object.
4
For case 1, there
are fewer than / result objects, so there should be an
additional step of evaluating a 1NN query at the same
query point to find a new result object n. The evaluation
of such a query is almost the same as Algorithm 2, except
that all existing kNN result objects are not considered.
The final step of reevaluation is to locate the order of new
result object j in the kNN set. This is done by comparing
it with other existing objects in the kNN set using the
“closer” relation. For cases 1 and 2, since this object is a
new result object, the comparison should start from the
/th NN, then /-1th NN, and so on. However, for case 3,
since j was in the set, the comparison can start from
where j was. Algorithm 3 shows the pseudocode of kNN
query reevaluation, where j
+
denotes the starting position
of the comparison.
Algorithm 3. Reevaluating a kNN Query
Input: C: existing set of kNNs
j: the updating object
Output: C: the new set of kNNs
Procedure:
1: if j is closer to the /-th NN then
2: if j ÷ C then
3: j
+
= the rank of j in C;
4: else
5: j
+
= k;
6: enqueue j into C;
7: else
8: if j ÷ C then
9: evaluate 1NN query to find n;
10: j
+
= k;
11: remove j and enqueue n into C;
12: relocate j or n in C, starting from j
+
;
5 SAFE REGION COMPUTATION
As mentioned in Section 3, the location manager computes
the optimal safe region for an individual query Q, which is
the inscribed rectangle with the longest perimeter (1i ÷|j)
of Q’s inner or outer region, separated by the quarantine
line. Therefore, the safe region is obtained in two steps:
finding the quarantine line, and then, finding the 1i ÷|j. It
is noteworthy that the safe region must contain the
updating object j (i.e., its centroid), because otherwise, this
object has to send an immediate location update after it
receives this safe region. In this section, we present the
detailed algorithms to compute the quarantine line, and
hence, the safe region for various types of queries.
5.1 Safe Region for Range Query
We first consider the case when object j is a result object.
Fig. 6b shows an example of range query where c is the
centroid of the query. The gray box shows the c-square of j.
Without loss of generality, let us consider the first quadrant,
and let the same j ((r. n)) denote the centroid of the
c-square. Fig. 6a is the close-up image of Fig. 6b. According
to the definition of the most probable result, more than half
of the c-square must reside in the query window. To obtain
the quarantine line, we only need to consider the special
case when exactly half of the square resides in the query
window, which can be further divided into two subcases. In
the first subcase, j is “on” the window border as box “1”
shows, we have either “n = / and r ÷c´2 _ /” or “r = o and
n ÷c´2 _ o.” In the second subcase, j is not on the border as
box “2” shows, we have (c´2 ÷o ÷r)(c´2 ÷/ ÷n) _ c
2
´2.
The two subcases give us the quarantine line in the first
quadrant (the bold curve in Fig. 6a), which is defined by the
following formulae:
r = o. i1 n _ / ÷c´2. oi
n = /. i1 r _ o ÷c´2. oi
(c´2 ÷o ÷r)(c´2 ÷/ ÷n) = c
2
´2. ot/cini:c.

(1)
And the inner region in the first quadrant is therefore the
shaded shape. Summing up all the four quadrants, the total
inner region of this query is the bold shape in Fig. 6b.
The second step is to find the 1i ÷|j of the inner region.
For any inscribed rectangle whose corner point in the first
quadrant is (:. t), the perimeter is 2: ÷ 2t. On the other
hand, since (:. t) must also be on the quarantine line, r =
:. n = t must be a solution to (1). This equation shows that
the perimeter 2: ÷ 2t is maximized at j
+
when
c
2
÷o ÷r =
c
2
÷/ ÷n =
c
ffiffi
2

. Thus, the optimal safe region is the solid
rectangle whose corner point is j
+
(see Fig. 6b). However,
this safe region may not contain the centroid of the
updating object j. For example, in Fig. 6b, if the centroid
is at j
/
, then all inscribed rectangles that contain j lie
between the two dotted rectangles whose horizontal sides
and vertical sides pass j
/
. In this case, the optimal safe
region is one of the two dotted rectangles with longer
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 411
Fig. 6. Safe region for range query. (a) Quarantine line. (b) The optimal
safe region.
4. There is a fourth case where j was not and is not a result object. In this
case, the reevaluation is completed by doing nothing.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
perimeter. Therefore, we reach the following proposition on
the safe region for a result object:
Proposition 5.1. For a result object j of a range query (size 2o-
by-2/), the corner of the safe region is:
(r. n) =
o ÷
ffiffi
2

÷1
2
c. / ÷
ffiffi
2

÷1
2
c

. i1 j.r _ o ÷
ffiffi
2

÷1
2
c.
& j.n _ / ÷
ffiffi
2

÷1
2
c.
j.r.
c
2
2(o÷c´2÷j.r)
÷
2/÷c
2

.
oi
c
2
2(/÷c´2÷j.n)
÷
2o÷c
2
. j.n

. otherwise.

If j is a nonresult object, the safe region is an inscribed
rectangle of the outer region. Such rectangle has the longest
perimeter when its corner point j
+
is at (o. 0) or (0. /).
Similar to the case when j is a result object, this rectangle
can serve as the safe region only if it contains the centroid of
updating object j; otherwise, the safe region is chosen from
the two dotted rectangles that has a longer perimeter.
5.2 Safe Region for kNN Query
We first consider the case when object j is the ith NN
(denoted by o
i
) of the query. By definition, its c-square must
be closer than the //or of o
i÷1
, but farther than the //or of
o
i÷1
. However, the exact quarantine line (and hence, the
inner or outer region) for j based on this line is complex. In
what follows, we approximate the inner region with a ring
centered at the query point c.
As the first step, we show that a circle centered at c splits
a c-square into inside and outside parts, and their areas are
dependent on the angle of the c-square to c.
Lemma 5.2. Among all squares of the same size and the same
distance to c, the diagonal square, whose diagonal coincides
with the line of jc, has the smallest inside part, while the side
square, whose sides are parallel to jc, has the largest inside
part. (refer Fig. 7a).
On the other hand, the area of the inside part also depends
on the length of jc. For example, in Fig. 7b, the two c-squares
are of the same angle, but the square that is closer to c has a
larger inside part (area 1) than the farther square (area 11).
Lemma 5.3. For squares of the same angle to c, the closer j to c,
the larger the inside part.
Applying these two lemmas, we can define the lower and
upper bounding circles for an object o. In Fig. 8, there are two
circles, plotted by solid arcs, that touch the near and the far
endpoints of the //or of o. Then, there must be a diagonal
square and a side square that are split by these two arcs into
inside and outside parts of equal area, respectively. The
lower and upper bounding circles, plotted by dotted arcs,
are the circles that cross the centers of these two squares. By
this definition, as long as the centroid of j’s c-square is
within the lower bounding circle, j is always closer than o;
on the other hand, as long as the centroid is beyond the
upper bounding circle, j is always farther than o. The
following proposition proves the correctness:
Proposition 5.4. Any c-square whose centroid is within (beyond)
the lower (upper) bounding circle for object o must be closer
(farther) to c than o.
Proof. Since any point in the inside part of the diagonal
square (i.e., area 11) is always closer than any point in the
//or of o, and since the inside part is half of the square, by
definition, the diagonal square is closer than the //or of o.
On the other hand, by Lemmas 5.2 and 5.3, any square
whose center is closer than that of the diagonal square
must be closer than the diagonal square. Applying the
transitivity of the “closer” relation, any square whose
centroid is within the lower bounding circle is closer to c
than o. The proof for the upper bound is similar. .¯
Based on Proposition 5.4, the inner region for j (i.e., o
i
)
can be approximated by a ring that is formed by the lower
bounding circle for o
i÷1
and the upper bounding circle for
o
i÷1
. To find the radii of the upper and lower bounding
circles, we further adopt an approximation algorithm as
follows: As shown in Fig. 9, to compute the lower bounding
circle, the diagonal square is first partitioned into
`×` (e.g., 4 × 4) subsquares. Then, the distance between
c and the farthest endpoint (the small hollow or solid circles
in the figure) of each subsquare is computed. The medium
(i.e., the
`
2
2
th shortest) distance is set to the radius for the
lower bounding circle. This bounding circle is guaranteed to
satisfy Proposition 5.4 because the subsquares of the first
`
2
2
shortest distances (their farthest endpoints are shown as
hollow circles) must be inside the bounding circle, and these
subsquares already account for half of the total area.
412 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Fig. 7. Lemmas on squares. (a) Diagonal and side squares. (b) c and
inside part.
Fig. 8. Upper and lower bounding circles.
Fig. 9. Bounding circle.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
Therefore, this circle can serve as an approximation of the
lower bounding circle. The same approximation can be
applied to upper bounding circles. Obviously, better
approximation can be achieved using larger `, which is at
the cost of higher computation overhead.
Once the ring is obtained, the safe region is the inscribed
rectangle of the ring that has the longest perimeter (1i ÷|j).
In [21], we showed the following proposition (see Fig. 10a):
Proposition 5.5. The Ir-lp of a ring that is centered at c with
inner radius i and outer radius 1 is the one of the following
two Ir-lp which has a longer perimeter. The perimeter of the
first (horizontal) Ir-lp is 41:ii0
1
÷ 2(1co:0
1
÷i), where 0
1
is
0
1
=
oicto2. i1 0
r
_ oicto2 _ 0
n
. oi
0
r
. i1 0
r
< oicto2. oi
0
n
. i1 oicto2 < 0
n
.

where 0
r
= oic:ii
j.r÷c.r
1
and 0
n
= oicco:
c.n÷j.n
1
. The peri-
meter of the second (vertical) Ir-lp is 41co:0
2
÷ 2(1:ii0
2
÷i),
where 0
2
is
0
2
=
oiccto2. i1 0
r
_ oiccto2 _ 0
n
. oi
0
r
. i1 0
r
< oiccto2. oi
0
n
. i1 oiccto2 < 0
n
.

Finally, we reach the following proposition on the safe
region for a result object o
i
:
Proposition 5.6. The safe region of the ith NN o
i
is the Ir-lp of
the ring that consists of the upper bounding circle for o
i÷1
and
the lower bounding circle for o
i÷1
.
It is noteworthy that for the first NN (i.e., i = 1), the ring
degenerates to a circle. On the other hand, if object j is a
nonresult object, we can approximate the outer region by
the complement of the upper bounding circle of o
/
. As such,
the safe region is the 1i ÷|j of the complement of a circle. In
[21], we showed that (see Fig. 10b):
Proposition 5.7. The Ir-lp of the complement of a circle centered
at c with radius i is the inscribed rectangle with one corner
being the cell corner corresponding to j and the opposite corner
is r. r is either on the 1´4 circle whose 0 is
0 =
¬´4. i1 0
n
_ ¬´4 _ 0
r
. oi
0
r
. i1 0
r
< ¬´4. oi
0
n
. i1 0
n
¬´4.

where 0
r
= oic:ii
j.r÷c.r
i
and 0
n
= oicco:
j.n÷c.n
i
.
6 DYNAMIC CLIENT UPDATE STRATEGY
The standard update strategy, which updates when the
centroid of c-square is out of the safe region, guarantees
100 percent monitoring accuracy in the context of the most
probable result. This is a static strategy where the decision
is made independent of previous decisions. In this section,
we discuss two dynamic strategies that achieve objectives
other than monitoring accuracy.
6.1 Mobility-Aware Update Strategy
Previously, we ignore the fact that the server receives a
series of location updates from an object. Although the
server cannot speculate the genuine object location from an
individual c-square, by considering consecutive updates
with certain background knowledge about the object’s
mobility, the server might produce better speculations.
Figs. 11a and11bshowtwoexamples where the maximum
speed .
i
or the exact direction of the movement is known,
respectively. In these examples, a c-square is updated at time
t
0
, then at time t
1
, the object must reside in the dotted shape,
which is called the reachable area from t
0
. In Fig. 11a, the
reachable area is the Minkoski sumof the c-square at t
0
and a
circle with a radius of .
i
(t
1
÷t
0
), i.e., the c-square expanded
by the circle at each point. Likewise, in Fig. 11b, the reachable
area is the half-openspace formedby the rays whose ends are
from the c-square. If the c-square at t
1
overlaps with the
reachable area, then the object can only locate in the part that
is inside the area (shaded in Figs. 11a and 11b).
To prevent the server from narrowing down the object
location like this, we propose the following mobility-
aware strategy:
Definition 6.1 (Mobility-aware update strategy). Update
when the centroid of c-square is out of the safe region and the
c-square is completely inside the reachable area of all previous
c-squares.
The intuitive version of this strategy must maintain the
entire set of historic c-squares. However, due to its dynamic
property, we showin the following lemma that it is sufficient
to maintain only the reachable area for the last c-square:
Lemma 6.1. For a set of c-squares of ¦t
0
. t
1
. . . . . t
i
¦ and
i _ , _ i, the reachable area of t
,
is completely inside that
of t
i
as long as the c-square of any t
i
is completely inside the
reachable area of t
i÷1
(i _ 1).
In what follows, we develop an algorithm to test
whether a c-square at t
1
is completely inside the reachable
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 413
Fig. 10. Computing Ir-lp. (a) Ring. (b) Complement of a circle.
Fig. 11. Problems with mobility knowledge. (a) Known maximum speed.
(b) Known direction.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
area of t
0
when the direction is known in the range of
(u
|
. u
/
). This is a general case for Fig. 11b. The idea is to take
an analytic view of this area. As is illustrated in Fig. 12, let o
with coordinates (o. /) denote a point in the c-square of t
0
,
and let j with coordinates (r. n) denote a point in the
c-square of t
1
. Then, the condition that line oj falls in
between direction (u
|
. u
/
) is equivalent to the inequality
to(u
|
) _ (n ÷/)´(r ÷o) _ to(u
/
) (we consider only the first
quadrant for simplicity). Therefore, to test whether the
c-square at t
1
(r
|
_ r _ r
/
. n
|
_ n _ n
/
) is completely inside
the reachable area of t
0
is equivalent to testing whether any
of the following two sets of inequalities with regard to
r. n. o. / can be satisfied simultaneously:
÷(r ÷o)to(u
|
) ÷(n ÷/) _ 0.
0 _ o _ c. 0 _ / _ c.
r
|
_ r _ r
/
. n
|
_ n _ n
/
.

and
(r ÷o)to(u
/
) ÷(n ÷/) _ 0.
0 _ o _ c. 0 _ / _ c.
r
|
_ r _ r
/
. n
|
_ n _ n
/
.

Either of them can be regarded as the set of linear
constraints in a linear programming (11) problem regard-
ing variables r. n. o. /. We build two 11 problems 1
1
. 1
2
with a (dummy) objective function C = 0 and the same
linear constraints as above. Determining whether any of the
two sets of inequalities can be satisfied simultaneously is
then equivalent to testing whether 1
1
or 1
2
has a feasible
solution. The feasibility can be tested by any 11 solver such
as the classic Simplex or Ellipsoid method. The c-square of
t
1
is completely inside the reachable area of t
0
only if neither
1
1
nor 1
2
is feasible.
6.2 Minimum-Cost Update Strategy
In previous sections, we use a rectangular safe region to
approximate the ideal safe area in which the change of the
centroid j of a c-square does not change the most probable
result of any query. Fig. 13 illustrates the relation between a
safe region and the ideal safe area. The gap between them is
inevitable and could be arbitrarily large due to the
following reasons: 1) a safe region for an individual query
is already a rectangular approximation of the inner or outer
region for this query and 2) the whole safe region is
obtained by intersecting the safe regions for all individual
queries, which makes it far smaller than the ideal safe area.
To guarantee 100 percent monitoring accuracy on the
most probable result, the standard strategy updates when-
ever j moves out of the safe region, but this could be an
unnecessary update as j might still be in the ideal safe area.
We therefore believe that in applications where 100 percent
accuracy is not mandatory and location update costs are
serious issues, a strategy that can trade accuracy with costs
is desirable. In this section, we develop such a strategy that
tries to minimize the cost by adding a `-rule to the standard
strategy to filter out unnecessary updates.
More specifically, symbol ` is the probability of j
moving out of the ideal safe area. Let co:t
j
denote the cost
of not updating j in this case. As j causes a result change,
co:t
j
is essentially the penalty of loss of monitoring
accuracy. On the other hand, let co:t
n
denote the cost of
updating j. Therefore, to minimize the expected cost, the
`-rule updates only if co:t
n
< `
+
co:t
j
, i.e., ` co:t
n
´co:t
j
.
To test whether ` at j is larger than co:t
n
´co:t
j
without
sending it to the server, we need to know an additional
point j
0
inside the ideal safe area (see Fig. 13). To find j
0
, we
continue to use the standard strategy. If the next updated
location j
+
causes no result changes (a feedback from the
server), j
+
is regarded as j
0
. Otherwise, if j
+
changes the
result, the ideal safe area changes as well, so we continue to
find j
0
for the new ideal safe area. In general, the `-rule is
only applicable after two consecutive location updates by
the standard strategy, and the second update must cause no
result changes from the first update. This prerequisite is
useful in filtering out those ideal safe areas that are not
significantly larger than their safe regions.
If we regard the space as a space of ` values, ` = 0 when j
is in the safe region and gradually increases as j moves away
from the safe region. As ` at any point is independent of the
` values at other points, the movement of ` from the border
of the safe region to j can be regarded as a discrete random
walk for simplicity (see Fig. 13). Initially, at the border ` = 0,
and by taking steps of length Á, it walks away from the
border toward j. In each step, ` is increased by `
Á
with a
probability t, and not increased with probability 1 ÷t. As
such, the total number of steps ` = di:t(j. 1)´Á. Since the
maximum value for ` is 1, `
Á
= 1´`. In any step, if the `-
rule is satisfied, the rule must also be satisfied at j, because `
is monotonously increasing as it moves. Thus, the strategy
updates the location and stops the walk in any step when the
`-rule is satisfied. On the other hand, if the `-rule is not
satisfied till the last step, then the strategy does not update
the location.
We are yet to estimate t. As j
0
is known to be inside the
ideal safe area, a random walk to j
0
can be conducted in the
same way as above. By the maximum likelihood estimation, we
should maximize the probability that ` at j
0
does not satisfy
the `-rule, i.e., the probability of ` _ co:t
n
´co:t
j
. By the
theory of Bernoulli process, ` at j
0
follows a Binomial
414 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Fig. 12. Reachable area.
Fig. 13. Minimum-cost update strategy: `-rule.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
distribution whose cumulative distribution function is
bounded by crj(÷2
(`0t÷``|)
2
`
0
). The function reaches the
maximum when `
0
t = ``|. Putting ` = co:t
n
´co:t
j
, we
have t =
co:tn`´co:tj|
`0
.
The state transition diagram of this strategy is illustrated
in Fig. 14 where the shaded texts mean rule satisfaction and
unshaded texts mean otherwise. There are four states in this
strategy: “initial state,” “no update,” “first update,” and
“second update.” The `-rule is applicable only at “second
update” and “no update” where j
0
is obtained. To transit to
“second update,” there must be two location updates, and
the latter (regarded as j
0
) must cause no result changes.
Once the strategy decides to update, the `-rule is suspended
and the state is reset to “first update” to wait for the next j
0
.
7 PERFORMANCE EVALUATION
To evaluate the monitoring performance, we implement a
simulation test bed, where ` moving objects move within a
unit-square space [0..1, 0..1]. Each object detects its point
location at frequency 1, encapsulates it into a c-square, and
forwards the square to the location updater. Each object has
an individual c and it follows a normal distribution with
mean value j. We compare our PAM framework with two
other frameworks, namely, the optimal monitoring (denoted
as OPT) and the periodic monitoring (denoted as PRD). In
optimal monitoring, every object has the perfect knowledge
of the registered queries and the c-squares of other moving
objects at any time. Therefore, it knows precisely when the
most probable result of any query changes, and only then
does it send a location update to the server. OPT serves as
the lower bound for all monitoring frameworks. In periodic
monitoring, all objects periodically send out location
updates simultaneously and the server reevaluates all
registered queries based on these updates. Obviously, its
monitoring accuracy and cost depend on the updating
interval. In this paper, we test PRD with updating intervals
0.1 and 1, denoted as PRD(0.1) and PRD(1) hereafter.
7.1 Simulation Setup
In the simulation test bed, each object moves according to
the random waypoint mobility model: the client chooses a
random point in the space as its destination and moves to
it at a speed randomly selected from the range [0. 2.[;
upon arrival or expiration of a constant movement period
(randomly picked from the range [0. 2t
.
[), it chooses a
new destination and repeats the same process. This is a
well accepted and studied model in the mobile comput-
ing literature [5].
The workload consists of \ queries, half of which are
range queries and half are kNN queries. For range queries,
the query rectangle is a square and its side length is
uniformly distributed in a range of [0.5c
|ci
. 1.5c
|ci
]. For kNN
queries, the query points are randomly distributed and /
ranges from 1 to /
ior
. In all the three frameworks, the
database server maintains an in-memory grid index
(` ×` cells) on the queries and an R
+
-tree index [2] on
the objects. The database server is simulated on a Pentium 4
2.4 GHz PC with 1 GB RAM running WinXP SP2. Table 1
summarizes the default parameter settings.
To eliminate the effect from hardware configuration, the
simulation uses logical time units instead of clock time
units. Each simulation run lasts for 5,000 time units or until
the measured value stabilizes (for those simulations that
take 12 hours or more).
The performance metrics for comparison include:
. Monitoring accuracy: The monitoring accuracy at
time t, io(t) ÷ ¦0. 1¦, is defined as whether the
monitored results for all queries accord with the
results from the OPT framework. Note that the latter
are the most probable results based on the c-squares
of all objects at t. The monitoring accuracy for a time
period [t
/
. t
c
[ is defined as the amortized accuracy
over time, i.e., io(t
/
. t
c
) =
1
tc÷t/

t
c
t/
io(t)dt.
. Wireless communication cost: It is the amortized
number of location updates sent by a moving object
over time.
. CPU time: This is measured by the amortized server
CPU time, which includes the time for query
evaluation and safe region computation.
7.2 Validity of Most Probable Result
The first set of experiments is to validate the definition of
most probable result. Under various j (the mean of c) and 1
(the location detection frequency), we compare the most
probable result from the OPT framework with the genuine
result (the result as if all the point locations were known) for
all \ queries. Fig. 15a shows the consistency rate, i.e., the
proportion of time when the two results are the same. As j
or 1 increases, the consistency rate drops. However, the
curve is not linear: the drop becomes slower when j and 1
become larger. As such, even when j or 1 is very large, the
consistency rate is above 70 percent. This justifies our claim
that the most probable result is a nice approximation of the
genuine result for monitoring tasks.
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 415
Fig. 14. State transition diagram.
TABLE 1
Simulation Parameter Settings
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
7.3 Overall Performance
The next set of experiments evaluates the overall perfor-
mance of three frameworks with default parameter settings.
Fig. 15b shows the monitoring accuracy and communication
cost (normalized by the cost of OPT). As is guaranteed, our
PAM framework achieves 100 percent accuracy, while PRD
gets only 80-90 percent. Obviously, PRD(0.1) is more
accurate than PRD(1) but the performance gap is less than
10 percent. Further, it is at the cost of 10 times higher
communication overhead. On the other hand, the commu-
nication cost of PAM is much smaller than PRD and
remains close to OPT.
7.4 Effects of c
In this section, we evaluate the effect of c on the
performance. We vary j (the mean of c) from 0 to 0.01
and Fig. 16a shows the corresponding monitoring accuracy.
While PAM achieves 100 percent accuracy, the accuracy of
PRD(1) and PRD(0.1) drops significantly as j increases. The
drop is mainly caused by the increasing spatial vagueness
introduced by the c-square. However, the rate of the drop
decreases as j increases, which, in turn, verifies the fact that
the most probable result is stable for even large c-squares.
Fig. 16b shows the communication cost. The cost for OPT
is almost the same for all settings, because the change of j
(and thus, c) merely changes the query results, not
necessarily the frequency of result changes. Similarly,
PRD(1) and PRD(0.1) (not plotted) have constant costs of
1 and 10, respectively. On the other hand, the cost of PAM
consistently grows as j increases, but even for j = 0.01,
which is already large in practice, it still outperforms
PRD(1) by more than 40 percent. Furthermore, the rate of
the increase drops as j increases, which indicates that the
approximation ratio of the safe region to the quarantine area
becomes steady.
Fig. 16c shows the CPU cost of PRD and PAM. PAM
increases faster than PRD(1) and PRD(0.1), because the
increase for PAM arises from two aspects—more location
updates and more complex query reevaluation (especially
for kNN queries), whereas the increase for PRD arises only
from the latter. Nonetheless, even for j = 0.01, PAM is still
about 1´20 that of PRD(1). Therefore, we can conclude that
PAM is robust and efficient for various privacy settings.
7.5 Scalability
This section evaluates the scalability of all frameworks in
terms of the server’s CPU time and communication cost.
Fig. 17a shows the CPU time when the number of registered
queries (\) increases from 10 to 1,000. PAM only increases
by less than 10 times because the grid-based query index
filters out most of the unaffected and irrelevant queries.
However, for PRD(1) and PRD(0.1), the CPU time is linear
to \, as they need to reevaluate every query at each batch
of location updates. When \ = 1.000, for one logical time
unit, the server needs 1.6 CPU seconds to monitor the
100,000 moving objects using PAM, 53 seconds using
PRD(1), and 217 seconds using PRD(0.1). As PRD updates
locations periodically, the high CPU cost imposes on it a
416 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Fig. 16. c versus monitoring accuracy, communication, and CPU cost.
(a) Accuracy. (b) Communication cost. (c) CPU cost.
Fig. 17. Performance versus query numbers (\). (a) CPU time.
(b) Communication cost.
Fig. 15. Performance evaluation. (a) Consistency of most probable
result. (b) The overall performance.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
maximum update frequency: in this example, the update
frequency is at most once every 21.7 seconds. PAM has no
such limitation. In terms of communication cost, although
PAM increases linearly with respect to \, it is still less than
double of OPT. All the above results suggest that PAM is
robust under various \ settings.
Similarly, we conduct simulations to vary the number of
objects (`) from 100 to 100,000. Fig. 18a shows that the CPU
cost only increases by about 40 times when ` increases by
1,000 times, due to the R
+
-tree index which is incrementally
maintained. Incontrast, PRD(1) andPRD(0.1) bothincrease at
least linearly to `, as they need to build a newR
+
-tree at each
update to reevaluate all queries. Similarly, Fig. 18b shows
that the communication cost of PAMonly increases by about
200 times when ` increases by 1,000 times. This suggests that
although a denser object distribution makes safe regions
shrink, only a decreasing portion of objects affects the
quarantine area of any query, and hence, the safe region of
any object. In summary, PAM is more scalable than PRD in
terms of CPU and communication cost.
7.6 Effects of Query Types
In this section, we study the performance of PAM on range
and kNN queries separately. We vary the average query
lengthc
|ci
of range queries and/
ior
—the maximum/of kNN
queries. The communication costs are plottedin Figs. 19a and
19b, respectively. We observe that for any parameter setting,
PAM’s communication cost is at most three times as much as
that of OPT. For range queries, as c
|ci
increases, the
communication cost of OPT always increases at a steady
pace. However, the communication cost of PAM increases
more slowly when c
|ci
is relatively small (at 0.001) or large
(_0.01). This can be explained by the fact that when c
|ci
is
relativelysmall or large, the safe regions are determinedmore
by the home cell than by the relevant queries. Since the size of a
cell is fixed, the cost tends to saturate. On the other hand, for
kNN queries, as /
ior
increases, the costs of both OPT and
PAM grow steadily. Even so, PAM manages to narrow the
gap when /
ior
becomes larger. This suggests that for a heavy
workload when results change frequently, the safe region
achieves even better approximation to the ideal safe area.
7.7 Sensitivity of PAM
In this section, we study the sensitivity of PAM to other
influential factors, namely, the average moving speed (.)
and the average constant movement period (t
.
) for the
moving objects. Fig. 20a shows the communication cost
when . varies from 0.001 to 1 per logical time unit. The costs
of both PAM and OPT increase linearly as . increases,
because the time of an object staying in a safe region is
inversely proportional to .. To eliminate this effect, we also
plot the communication cost per distance unit on the
secondary y-axis in the same figure, and observe that this
cost is independent of .. In other words, the update cost of
PAM is not heavily dependent on the speed of object
movement on a trajectory, but on the length of the
trajectory. The CPU time shows a similar trend, and hence,
is not plotted. In Fig. 20b, we also vary t
.
from 0.001 to
1 time unit and find that it has little effect on the
performance of PAM. As such, we conclude that PAM is
robust to various moving parameters.
The next influential factor is the ` ×` grid partitioning
of the query index. We vary ` from 5 to 100 and plot both
the communication cost and CPU time in Fig. 21. The larger
is the value of `, the smaller is the grid cell size. The
communication cost increases monotonously with `
because the grid cell sets the largest possible safe region
of an object. Nonetheless, the cost difference between ` = 5
and ` = 50 is not significant but there is a sharp increase
from ` = 50 to ` = 100. The explanation is the same as in
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 417
Fig. 19. Performance versus query types. (a) Range queries. (b) kNN
queries.
Fig. 20. Communication cost versus . and t
.
. (a) .. (b) t
.
.
Fig. 21. Performance versus grid partitioning.
Fig. 18. Performance versus object numbers (`). (a) CPU time.
(b) Communication cost.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
Section 7.6, which is, when ` is small and moderate, the
safe regions are determined more by the relevant queries
than by the grid cell, but when ` is large, the cell becomes
dominate. On the other hand, the CPU time decreases
monotonously because the number of relevant queries in
the cell decreases, and hence, the safe region computation is
faster. In this figure, ` = 50 yields a fairly low commu-
nication cost as well as a fairly low CPU time. From this
experiment, we can see that it is advantageous to adapt the
cell size to the server’s workload: we first use a large ` to
partition the grid, and later if the workload turns out to be
low, we enlarge the cell size by merging the cell where the
update occurs with its neighboring cells within a certain
distance. In this way, we can take full advantage of the CPU
resource and reduce the communication cost (by enlarging
the safe region) as much as possible.
7.8 Dynamic Update Strategy
The last set of experiments is conducted to evaluate the
minimum cost update strategy. We vary the threshold for
the `-rule, i.e., co:t
n
´co:t
j
from 0.01 to 1. Figs. 22a and 22b
show the monitoring accuracy and communication cost in
comparison with the standard strategy. The two curves of
PAM show a similar trend as ` increases, which means that
through `, the strategy effectively trades accuracy for
communication cost, or vice versa. Interestingly, when
` _ 0.1, the decrease of the communication cost is accom-
panied by almost the same decrease of accuracy; however,
when ` 0.1, the accuracy drops more slowly than the
communication cost. This shows that most ideal safe area is
far larger than the safe region, so even an aggressive ` can
still keep the object inside the ideal safe area.
8 CONCLUSIONS
This paper proposes a framework for monitoring contin-
uous spatial queries over moving objects. The framework is
the first to holistically address the issue of location updating
with regard to monitoring accuracy, efficiency, and privacy.
We provide detailed algorithms for query evaluation/
reevaluation and safe region computation in this frame-
work. We also devise three-client update strategies that
optimize accuracy, privacy, and efficiency, respectively. The
performance of our framework is evaluated through a series
of experiments. The results show that it substantially
outperforms periodic monitoring in terms of accuracy and
CPU cost while achieving a close-to-optimal communica-
tion cost. Furthermore, the framework is robust and scales
well with various parameter settings, such as privacy
requirement, moving speed, and the number of queries
and moving objects.
As for future work, we plan to incorporate other types of
queries into the framework, such as spatial joins and
aggregate queries. We also plan to further optimize the
performance of the framework. In particular, the minimum-
cost update strategy shows that the safe region is a crude
approximation of the ideal safe area, mainly because we
separately optimize the safe region for each query, but not
globally. A possible solution is to sequentially optimize the
queries but maintain the safe region accumulated by the
queries optimized so far. Then, the optimal safe region for
each query should depend not only on the query, but also
on the accumulated safe region.
ACKNOWLEDGMENTS
This work was supported by the Research Grants Council,
Hong Kong SAR, China under Project No. HKBU211206,
HKBU211307, HKBU210808, HKBU1/05C, HKBU/FRG08-
09/II-48, RGC GRF 615806, and CA05/06.EG03.
REFERENCES
[1] S. Babu and J. Widom, “Continuous Queries over Data Streams,”
Proc. ACM SIGMOD, 2001.
[2] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-
Tree: An Efficient and Robust Access Method for Points and
Rectangles,” Proc. ACM SIGMOD, pp. 322-331, 1990.
[3] R. Benetis, C.S. Jensen, G. Karciauskas, and S. Saltenis, “Nearest
Neighbor and Reverse Nearest Neighbor Queries for Moving
Objects,” Proc. Int’l Database Eng. and Applications Symp. (IDEAS),
2002.
[4] A. Beresford and F. Stajano, “Location Privacy in Pervasive
Computing,” IEEE Pervasive Computing, vol. 2, no. 1, pp. 46-55,
Jan.-Mar. 2003.
[5] J. Broch, D.A. Maltz, D. Johnson, Y.-C. Hu, and J. Jetcheva, “A
Performance Comparison of Multi-Hop Wireless Ad Hoc Net-
work Routing Protocols,” Proc. ACM/IEEE MobiCom, pp. 85-97,
1998.
[6] Y. Cai, K.A. Hua, and G. Cao, “Processing Range-Monitoring
Queries on Heterogeneous Mobile Objects,” Proc. IEEE Int’l Conf.
Mobile Data Management (MDM), 2004.
[7] J. Chen and R. Cheng, “Efficient Evaluation of Imprecise Location-
Dependent Queries,” Proc. IEEE Int’l Conf. Data Eng. (ICDE),
pp. 586-595, 2007.
[8] J. Chen, D. DeWitt, F. Tian, and Y. Wang, “NiagaraCQ: A Scalable
Continuous Query System for Internet Databases,” Proc. ACM
SIGMOD, 2000.
[9] R. Cheng, D.V. Kalashnikov, and S. Prabhakar, “Querying
Imprecise Data in Moving Object Environments,” IEEE Trans.
Knowledge and Data Eng., vol. 16, no. 9, pp. 1112- 1127, Sept. 2004.
[10] H.D. Chon, D. Agrawal, and A.E. Abbadi, “Range and kNN
Query Processing for Moving Objects in Grid Model,” ACM/
Kluwer MONET, vol. 8, no. 4, pp. 401-412, 2003.
[11] C.-Y. Chow, M.F. Mokbel, and X. Liu, “A Peer-to-Peer Spatial
Cloaking Algorithm for Anonymous Location-Based Services,”
Proc. ACM Int’l Symp. Geographic Information Systems (GIS),
pp. 171-178, 2006.
[12] D. Pfoser and C.S. Jensen, “Capturing the Uncertainty of Moving-
Objects Representations,” Proc. Int’l Conf. Scientific and Statistical
Database Management (SSDBM), 1999.
[13] B. Gedik and L. Liu, “MobiEyes: Distributed Processing of
Continuously Moving Queries on Moving Objects in a Mobile
System,” Proc. Int’l Conf. Extending DataBase Technology (EDBT),
2004.
[14] B. Gedik and L. Liu, “Location Privacy in Mobile Systems: A
Personalized Anonymization Model,” Proc. IEEE Int’l Conf.
Distributed Computing Systems (ICDCS), pp. 620-629, 2005.
418 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 3, MARCH 2010
Fig. 22. Minimum cost strategy versus standard strategy. (a) Accuracy.
(b) Communication cost.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.
[15] B. Gedik and L. Liu, “Protecting Location Privacy with Persona-
lized k-Anonymity: Architecture and Algorithms,” IEEE Trans.
Mobile Computing, vol. 7, no. 1, pp. 1-18, Jan. 2008.
[16] G. Ghinita, P. Kalnis, and S. Skiadopoulos, “Mobihide: A Mobile
Peer-to-Peer System for Anonymous Location-Based Queries,”
Proc. Int’l Symp. Spatial and Temporal Databases (SSTD), 2007.
[17] G. Ghinita, P. Kalnis, and S. Skiadopoulos, “Prive: Anonymous
Location-Based Queries in Distributed Mobile Systems,” Proc. Int’l
World Wide Web Conf. (WWW ’07), pp. 371-380, 2007.
[18] M. Gruteser and D. Grunwald, “Anonymous Usage of Location-
Based Services through Spatial and Temporal Cloaking,” Proc.
MobiSys, 2003.
[19] A. Guttman, “R-Trees: A Dynamic Index Structure for Spatial
Searching,” Proc. ACM SIGMOD, 1984.
[20] G.R. Hjaltason and H. Samet, “Distance Browsing in Spatial
Databases,” ACM Trans. Database Systems, vol. 24, no. 2, pp. 265-
318, 1999.
[21] H. Hu, J. Xu, and D.L. Lee, “A Generic Framework for Monitoring
Continuous Spatial Queries over Moving Objects,” Proc. ACM
SIGMOD, pp. 479-490, 2005.
[22] G. Iwerks, H. Samet, and K. Smith, “Continuous k-Nearest
Neighbor Queries for Continuously Moving Points with Up-
dates,” Proc. Int’l Conf. Very Large Data Bases (VLDB), 2003.
[23] G.S. Iwerks, H. Samet, and K. Smith, “Maintenance of Spatial
Semijoin Queries on Moving Points,” Proc. Int’l Conf. Very Large
Data Bases (VLDB), 2004.
[24] C.S. Jensen, D. Lin, and B.C. Ooi, “Query and Update Efficient B+-
Tree Based Indexing of Moving Objects,” Proc. Int’l Conf. Very
Large Data Bases (VLDB), 2004.
[25] D.V. Kalashnikov, S. Prabhakar, and S.E. Hambrusch, “Main
Memory Evaluation of Monitoring Queries over Moving Objects,”
Distributed Parallel Databases, vol. 15, no. 2, pp. 117-135, 2004.
[26] P. Kalnis, G. Ghinita, K. Mouratidis, and D. Papadias, “Preventing
Location-Based Identity Inference in Anonymous Spatial
Queries,” IEEE Trans. Knowledge and Data Eng., vol. 19, no. 12,
pp. 1719-1733, Dec. 2007.
[27] A. Khoshgozaran and C. Shahabi, “Blind Evaluation of Nearest
Neighbor Queries Using Space Transformation to Preserve
Location Privacy,” Proc. Int’l Symp. Spatial and Temporal Databases
(SSTD), 2007.
[28] H. Kido, Y. Yanagisawa, and T. Satoh, “An Anonymous
Communication Technique Using Dummies for Location-Based
Services,” Proc. Second Int’l Conf. Pervasive Services (ICPS), 2005.
[29] M.-L. Lee, W. Hsu, C.S. Jensen, B. Cui, and K.L. Teo, “Supporting
Frequent Updates in R-Trees: A Bottom-Up Approach,” Proc. Int’l
Conf. Very Large Data Bases (VLDB), 2003.
[30] S.R. Madden, M.J. Franklin, J.M. Hellerstein, and W. Hong, “TAG:
A Tiny Aggregation Service for Ad-Hoc Sensor Networks,” Proc.
USENIX Symp. Operating Systems Design and Implementation
(OSDI), 2002.
[31] M.F. Mokbel, C.-Y. Chow, and W.G. Aref, “The New Casper:
Query Processing for Location Services without Compromising
Privacy,” Proc. Int’l Conf. Very Large Data Bases (VLDB), pp. 763-
774, 2006.
[32] M.F. Mokbel, X. Xiong, and W.G. Aref, “SINA: Scalable
Incremental Processing of Continuous Queries in Spatio-Temporal
Databases,” Proc. ACM SIGMOD, 2004.
[33] G. Myles, A. Friday, and N. Davies, “Preserving Privacy in
Environments with Location-Based Applications,” Pervasive Com-
puting, vol. 2, no. 1, pp. 56-64, 2003.
[34] J.M. Patel, Y. Chen, and V.P. Chakka, “STRIPES: An Efficient
Index for Predicted Trajectories,” Proc. ACM SIGMOD, 2004.
[35] S. Prabhakar, Y. Xia, D.V. Kalashnikov, W.G. Aref, and S.E.
Hambrusch, “Query Indexing and Velocity Constrained Indexing:
Scalable Techniques for Continuous Queries on Moving Objects,”
IEEE Trans. Computers, vol. 51, no. 10, pp. 1124-1140, Oct. 2002.
[36] K. Raptopoulou, A. Papadopoulos, and Y. Manolopoulos, “Fast
Nearest-Neighbor Query Processing in Moving Object Data-
bases,” GeoInfomatica, vol. 7, no. 2, pp. 113-137, 2003.
[37] N. Roussopoulos, S. Kelley, and F. Vincent, “Nearest Neighbor
Queries,” Proc. ACM SIGMOD, 1995.
[38] S. Saltenis, C.S. Jensen, S.T. Leutenegger, and M.A. Lopez,
“Indexing the Positions of Continuously Moving Objects,” Proc.
ACM SIGMOD, 2000.
[39] Y. Tao, C. Faloutsos, D. Papadias, and B. Liu, “Prediction and
Indexing of Moving Objects with Unknown Motion Patterns,”
Proc. ACM SIGMOD, 2004.
[40] Y. Tao, D. Papadias, and Q. Shen, “Continuous Nearest Neighbor
Search,” Proc. Int’l Conf. Very Large Data Bases (VLDB), 2002.
[41] Y. Tao, D. Papadias, and J. Sun, “The TPR*-Tree: An Optimized
Spatio-Temporal Access Method for Predictive Queries,” Proc.
Int’l Conf. Very Large Data Bases (VLDB), 2003.
[42] W. Wu, W. Guo, and K.-L. Tan, “Distributed Processing of
Moving k-Nearest-Neighbor Query on Moving Objects,” Proc.
IEEE Int’l Conf. Data Eng. (ICDE), 2007.
[43] X. Xiong, M.F. Mokbel, and W.G. Aref, “SEA-CNN: Scalable
Processing of Continuous k-Nearest Neighbor Queries in Spatio-
Temporal Databases,” Proc. IEEE Int’l Conf. Data Eng. (ICDE),
2005.
[44] J. Xu, X. Tang, and D.L. Lee, “Performance Analysis of Location-
Dependent Cache Invalidation Schemes for Mobile Environ-
ments,” IEEE Trans. Knowledge and Data Eng., vol. 15, no. 2,
pp. 474-488, Mar./Apr. 2003.
[45] M.L. Yiu, C.S. Jensen, X. Huang, and H. Lu, “Spacetwist:
Managing the Trade Offs among Location Privacy, Query
Performance, and Query Accuracy in Mobile Services,” Proc. IEEE
Int’l Conf. Data Eng. (ICDE ’08), 2008.
[46] T.-H. You, W.-C. Peng, and W.-C. Lee, “Protect Moving
Trajectories with Dummies,” Proc. Int’l Workshop Privacy-Aware
Location-Based Mobile Services, 2007.
[47] X. Yu, K.Q. Pu, and N. Koudas, “Monitoring k-Nearest Neighbor
Queries over Moving Objects,” Proc. IEEE Int’l Conf. Data Eng.
(ICDE), 2005.
[48] J. Zhang, M. Zhu, D. Papadias, Y. Tao, and D.L. Lee, “Location-
Based Spatial Queries,” Proc. ACM SIGMOD, 2003.
Haibo Hu received the PhD degree in computer
science from the Hong Kong University of
Science and Technology (HKUST) in 2005. He
is an assistant professor in the Department of
Computer Science, Hong Kong Baptist Univer-
sity (HKBU). Prior to this, he held several
research and teaching posts at HKUST and
HKBU. His research interests include mobile
and wireless data management, location-based
services, and privacy-aware computing. He has
published 20 research papers in international conferences, journals, and
book chapters. He is also the recipient of many awards, including the
ACM Best PhD Paper Award and Microsoft Imagine Cup.
Jianliang Xu received the BEng degree in
computer science and engineering fromZhejiang
University, Hangzhou, China, in 1998, and the
PhD degree in computer science from the Hong
Kong University of Science and Technology in
2002. He is an associate professor in the
Department of Computer Science, Hong Kong
Baptist University. He was a visiting scholar in
the Department of Computer Science and
Engineering, Pennsylvania State University, Uni-
versity Park. His research interests include data management, mobile/
pervasive computing, wireless sensor networks, and distributed sys-
tems. He has published more than 70 technical papers in these areas,
most of which appeared in prestigious journals and conference
proceedings. He currently serves as a vice chairman of ACM Hong
Kong Chapter. He is a senior member of the IEEE.
Dik Lun Lee received the BSc degree in
electronics from the Chinese University of Hong
Kong, and the MS and PhD degrees in computer
science from the University of Toronto, Canada.
He is currently a professor in the Department of
Computer Science and Engineering, Hong Kong
University of Science and Technology. He was
an associate professor in the Department of
Computer Science and Engineering, Ohio State
University, Columbus. He was the founding
conference chair for the International Conference on Mobile Data
Management and served as the chairman of the ACM Hong Kong
Chapter in 1997. His research interests include information retrieval,
search engines, mobile computing, and pervasive computing.
HU ET AL.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 419
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2010 at 09:47:07 UTC from IEEE Xplore. Restrictions apply.

. for each object. this framework fails to address the privacy issue. privacy. Rather. Fig. its strategy determines the accuracy. Among all possible results. The probability is computed by assuming a uniform distribution of the exact client position in the bounding box. b together with their bounding boxes. and privacy issues altogether. In this paper. Both the genuine and most probable result for the 1NN query Q are fag. . First. so they send location updates only when the results for some queries might change. Benetis et al. a client encapsulates its exact position in a bounding box. This significantly improves the monitoring efficiency and accuracy compared to the periodic or deviation update methods. The system architecture. with an emphasis on range and kNN queries. or efficiency can be optimized. we will explore the PAM framework. and privacy altogether. privacy. one of the main contributions of this paper is to devise efficient query processing algorithms for common spatial query types. this is the first comprehensive framework that addresses the issue of location updating holistically with monitoring accuracy. New algorithms must be designed to compute maximum safe regions in order to reduce the number of location updates. To optimize privacy or efficiency. However. As for efficiency. 2 RELATED WORK There is a large body of research work on spatial temporal query processing. Third. The safe region is computed based on the queries in such a way that the current results of all queries remain valid as long as all objects reside inside their respective safe regions. R-tree [19]) and query evaluation algorithms (e.g. However. it provides a common interface for monitoring various types of spatial queries such as range queries and kNN queries. however. We adapt for the monitoring environment the privacy model that has been employed by location cloaking and other privacy-aware approaches. 1. improve efficiency. Second. even monitoring only the most probable result adds great complexity to query evaluation. alternative strategies must be devised. we argue that the most probable result. The framework is generic in the sense that it is not designed for a specific query type. accuracy. Compared to the previous work.e. [3] developed query Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. which guarantees accuracy— no miss of any change of the most probable result. Dynamic client update strategies are given in Section 6 to optimize privacy and efficiency. we propose a privacy-aware monitoring (PAM) framework that incorporates the accuracy. The standard strategy is to update when the centroid of the bounding box moves out of the safe region. we proposed a monitoring framework where the clients are aware of the spatial queries being monitored. 2. where data objects or queries or both of them move. the PAM framework has the following advantages: To our knowledge. where the location of a moving object is represented by a linear function of time. Section 3 overviews the framework components. Our basic idea is to maintain a rectangular area. [20]. Early work assumed a static data set and focused on efficient access methods (e.. as the location updater decides when and how a bounding box is updated. Moreover. . the integration of privacy into the monitoring framework poses challenges to the design of PAM. More specifically. However. efficiency. the framework significantly reduces location updates to only when an object is moving out of the safe region. 2 shows two clients a. the framework does not presume any mobility pattern on moving objects.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 405 Fig. a lot of attention has been paid to moving-object databases. The framework is flexible in that by designing appropriate location update strategies. with the introduction of bounding boxes. Fig. As for accuracy. especially on the aspects of query evaluation and safe region computation. called safe region.HU ET AL. and therefore. and the timing and mechanism with which the box is updated to the server are decided by a client-side location updater as part of PAM. [37]). as opposed to only at the time instances of updates in systems that are based on periodic or deviation location update. and efficiency of the framework. Assuming that object movement trajectories are known a priori. efficiency. that is. . Restrictions apply. suits realistic scenarios. and thus. the most probable result also adds complexity to the definition of safe region. followed by Sections 4 and 5 where query evaluation and safe region computation are presented. . This framework extends from our previous work [21] by introducing a common privacy model. it only addresses “when” but not “how” the location updates are sent. Recently. In the rest of this paper. the result of a query is no longer unique. A client updates its location on the server only when the client moves out of its safe region. is very likely to alter the query results.. and thus.g.2010 at 09:47:07 UTC from IEEE Xplore. As such. Monitoring example. we take a more comprehensive approach— instead of dealing with “when” and “how” separately like most existing work. is most promising for approximating the genuine result (the result derived based on the exact positions). In [21]. . [38] proposed the Time-Parameterized R-tree (TPR-tree) for indexing moving objects. the one with the highest probability. Experimental results of PAM are shown in Section 7. i. the framework offers correct monitoring results at any time. Downloaded on July 09. The remainder of this paper is organized as follows: Section 2 reviews the related work.. Saltenis et al.

[7]. the Clique-Cloak [14]. the Casper anonymizer [31]. the success rate or confidence of such spatiotemporal correlation inference can be reduced significantly. [4]. where a two-dimensional space is assumed. Tao et al. the server divides the space recursively in a quad-tree-like format till a suitable subspace is found to cloak the updated location.” the frequency of query reevaluation on uncertain location information is reduced.1 Privacy-Aware Location Model In this paper.. However. That is. we assume that the clients are privacy conscious. With a large enough location box covering the sensitive place (e. [26]. While in these studies. the less successful and confident the adversary’s inference becomes. [29]. dummy. 3 FUNDAMENTALS OF PAM FRAMEWORK 3. the system efficiency and scalability are improved. The second category does not make any assumption on object movement patterns. suggested grid-based inmemory structures for object and query indexes to speed up reevaluation process of range queries [25] and kNN queries [47]. the adversary can infer that the user might have a heart problem. and peer-to-peer cloaking [11]. Restrictions apply. most existing work suggests replacing accurate point locations by bounding boxes to reduce location resolutions [18]. 3. Continuous kNN monitoring has been investigated for moving queries over stationary objects [40] and linearly moving objects [22]. The CliqueCloak algorithm constructs a clique graph to combine some clients who can share the same cloaked spatial area. the client-side location updater decides whether or not to update that box to the server. Uncertainty and privacy issues have been recently studied in moving object monitoring.g. these studies are not applicable to monitoring of moving objects. NO. Dummy generates fake user locations (called dummies) and mixes them together with the genuine user location into the request [28]. [16]. or medical problems. [43]. As for location uncertainty. Then. Mokbel et al. At each evaluation step. [17] and is beyond the scope of this paper. the known-trajectory assumption does not hold for many application scenarios (e. location cloaking. 22. [10] studied range and kNN queries based on a grid model. a common model for characterizing the uncertainty of an object is a closed region with a predefined probability distribution of this object in the region. we take the same privacy-aware approach. for each location update. alternative lifestyles. their solutions work for stationary objects only. Xu et al. knowing that a user is inside a heart specialty clinic during business hours. and hence. For continuous monitoring of moving objects. [32]. Downloaded on July 09. To protect location privacy. To protect against it. [45]. In spatiotemporal cloaking. This has been cited as a major privacy threat in location-based services and mobile computing. Patel et al. [31] and kNN queries [9]. The work on monitoring continuous spatial queries can be classified into two categories. the prevailing approach is periodic reevaluation of queries [24]. Transformation utilizes certain one-way spatial transformations (e. and then. on the other hand. the velocity of a car changes frequently on road). Chon et al. upon receiving such a box. all user locations are sorted by Hilbert space-filling curve ordering. by which an adversary may infer users’ private information such as political affiliations. Monitoring queries have also been studied for distributed Internet databases [8]. every k users are grouped together in this order. and achieves scalability by employing shared execution and incremental evaluation of continuous queries [32]. query processing and indexing algorithms have been proposed to evaluate probabilistic range queries [12]. The Casper anonymizer is associated with a query processor to ensure that the anonymized area returns the same query result as the actual location. Based on this probabilistic model. Nonetheless. [13] and continuous kNN queries [42]. In our monitoring framework.406 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. VOL. To simplify the presentation in this 1. which indexes queries using an R-tree-like structure. MARCH 2010 evaluation algorithms for NN and reverse NN search based on the TPR-tree. [46]. the clients do not want to expose their genuine point locations to the database server to avoid spatiotemporal correlation inference attack [14]. the larger the box is. a space filling curve) to map the query space to another space and resolves query blindly in the transformed space [27]. and sensor databases [30].g. SINA indexes both queries and objects. hilbASR [17]. [26]. Our study falls into this category but distinguishes itself from existing studies with a comprehensive framework focusing on location update. and transformation were also proposed for privacy preservation. Pseudonym decouples the mapping between the user identity and the location so that an untrusted server only receives the location without the user identity [33]. only those objects that have moved since the previous evaluation step are evaluated on the Q-index. Our study. By adopting the notion of “safe region.1 Without any other knowledge about the client locations or moving patterns. the server can only presume that the genuine point location is distributed uniformly in this box. [17]. [35] proposed the Q-index. data streams [1]. and Yu et al. As such. Kalashnikov et al. [14]. [34] proposed a novel index structure called STRIPES using a dual transformation technique. However. pseudonym. The main idea is to shift some load from the server to the mobile clients. Iwerks et al. [22] extended to monitor distance semijoins for two linearly moving data sets [23]. [41] optimized the performance of the TPR-tree and extended it to the TPRÃ -tree. [31].. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Access methods to support frequent location updates of moving objects have also been investigated [24]. [36]. various cloaking or anonymizing techniques have been proposed to hide the client’s actual location. The first category assumes that the movement trajectories are known. Specifically. Chen and Cheng extended the probabilistic processing to more general cases where the queries are also uncertain [7]. Prabhakar et al. it is encapsulated into a bounding box. the query is reevaluated only when the query exits the validity scope. [15]. . For example. the clinic) as well as a good number of other insensitive places. each time a client detects his/her genuine point location. [32] proposed a scalable incremental hash-based algorithm (SINA) for range and kNN queries. the objects are uncertain.2010 at 09:47:07 UTC from IEEE Xplore. However. In hilbASR. the queries themselves are still certain. addresses the continuous monitoring issue. [47].. as pointed out in [39]. Distributed approaches have been investigated to monitor continuous range queries [6]. While this study is limited to range queries. Besides. [44] and Zhang et al.g. [48] suggested returning to the client both the query result and its validity scope where the result remains the same. [35]. Among them are the spatiotemporal cloaking [18]. [26]. The computation of a proper bounding box to satisfy a certain privacy metric (such as k-anonymity) has been extensively studied in the literature [14].

Fig. However. Now that locations are -squares instead of points. we make the following assumptions for simplicity: . [47]. Overview of Database Behavior 1: while receiving a request do 2: if the request is to register query q then 3: evaluate q. Afterward. we have the moving object index. Since the genuine point location of an object is distributed uniformly in its -square. Furthermore. 7: else if the request is to deregister query q then 8: remove q from the query index. and the location manager. The key idea to solving the problem is “safe region. Our problem is therefore to monitor result changes of spatial queries as objects move. 9: else if the request is a location update from object p then 10: determine the set of affected queries. prevents the server from being computationally overloaded. At moving objects’ side. As in the previous range query example. The standard update strategy of the client is therefore “to update when the centroid of the -square is out of the safe region. At the database server side. reevaluates them using the object index (step ).2 Framework Overview As shown in Fig. so the safe region is essentially a safe region for the centroid of the -square. Without loss of generality. The database server handles location updates sequentially. that object is most probably a nonresult object. a unique definition of query result under -squares is a prerequisite of safe region. and therefore. also based on the indexes. this object could be either a result object or a nonresult object of this query. 3. Downloaded on July 09. and then.” The reason why we exclude all other less probable results in this definition is threefold: 1) monitoring continuous queries usually trades accuracy for efficiency— although the most probable result does not always align with the genuine result (the result derived based on genuine point locations of all objects). With the notion of most probable result. The procedure for 5 processing a new query is similar. 2) if the query result were defined as the set of all possible results. sends it 4 back as a response to the object (step ).” we use the centroid point of the square as a representative. 11: for each affected query q0 do 3. As such. and then. the location manager computes the new safe region for the updating object (step ). to clarify the definition of “within. 6: update the changed safe regions of objects. if the -square of an object partially overlaps with a range query. the 2 new query is evaluated from scratch instead of being reevaluated incrementally. and 3) we do not want the choice of -square—which is made by the client—to affect query results heavily. and that the objects whose safe regions are changed due to this new query must be notified. As such. in other words. and obviously the most probable results are less vulnerable than other result definitions. PAM framework overview. while the object index can only Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. With the latter assumption. which makes the update cost overwhelmingly high. 4: compute its quarantine area and insert it into the query index. we have location updaters. the query index can accommodate all registered queries in main memory. that object is most probably a result object of this query.” which was defined in [21] as a rectangle within which the change of object location does not change the result of any registered spatial query. application servers can register spatial queries to the database server (step ). if the majority of the -square falls inside the range query. This is a reasonable assumption to relieve us from the issues of read/write consistency. the query processor identifies those queries 2 that are affected by this update using the query index. The 3 updated query results are then reported to the application servers who register these queries. except that in step . we can define the (unique) query result as the one with the highest probability among all possible results. we will show in Section 4 that it is efficient to compute. . the safe region would have to be extremely small to report location updates if any of the possible results changes. Algorithm 1. . updates are queued and handled on a first-come-first-serve basis. we thereby define the safe region as a rectangle within which the change of the centroid of the object’s -square does not change the most probable result of any registered spatial query. and monitor them as accurately as possible and at the lowest cost of location updates. Algorithm 1 summarizes the procedure at the database server to handle a query registration/deregistration or a location update. minimizing the cost of location updates is equivalent to minimizing the total number of updates.HU ET AL.2010 at 09:47:07 UTC from IEEE Xplore. [21]. we further restrict the shape of such a bounding box to a -by- square (or in short -square). the PAM framework consists of components located at both the database server and the moving objects.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 407 paper. PAM framework works as follows (see Fig. The moving objects maintain good connection with the database server. the query index. otherwise. This assumption has been widely adopted in many existing proposals [25]. For example. accommodate all moving objects in secondary memory. Restrictions apply. the communication cost for any location update is a constant. The number of objects is some orders of magnitude larger than that of queries. When an object sends a location 1 update (step ). 3): At any time. 5: return the results to the application server. the query processor. the consequence of introducing -square is more than that—the result of a spatial query is no longer unique. where  is customizable for each object. . 3.

(b) kNN query. which is a line that splits the entire space into two regions: the inner region and the outer region. the bbox is the safe region enlarged by =2 on each side. we assume that the genuine point locations are distributed uniformly in their respective bboxes when queries are evaluated or reevaluated. MARCH 2010 12: 13: 14: 15: reevaluate q0 . for the standard update strategy. Q is not affected only if both of them cross or are outside the outer region.”3 For a range query q. “returns objects that have 90 percent probability inside the query range”).. the query window can serve as an inner bound of the quarantine area. Fig. we set both the inner and the outer bounds as circles centered at q. or formally. likewise. which is most widely adopted in the literature. Downloaded on July 09. More specifically. i. It is noteworthy that although in this paper. i. within which each object can possibly locate.408 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. it must be a function (denoted by ) of the last updated -square and the safe region. and 3) the quarantine area of the query. the quarantine area.4 The Query Index For each registered query. in the form of a bounding box. and the k value of a kNN query). thus making the object a nonresult object. The correctness of this bound can be verified by the observation that for any -square that crosses this bound. If dðs.. dðS. The following sections explain the components at the database server in detail. Therefore. The object index is built on the bboxes to speed up the evaluation. it becomes a nonresult object once it enters the outer region. the index is optimized to handle frequent updates [29]. 4a shows the inner and outer bounds of q’s quarantine area. the most probable result is used. Furthermore. An object becomes a result object if it enters the inner region. [19]. the server must store the spatial range. this box is called a bbox as a mark of distinction. 2. Note that this bounding box is different from a -square because its shape also depends on the client-side location updater. b 2 Bg. ok Þ and Dðq. the “Minkowski sum”2 of the safe region and a =2-square. However.. to evaluate queries. especially in the context of the most probable result. respectively. this framework can also adapt to other query result definitions such as over a probability confidence (e. tÞ denotes the distance between two points s and t. The quarantine area is used to identify the queries whose results might be affected by an incoming location update. VOL. For kNN queries. and the outer region. plus . because any object whose -square is fully inside q is a trivial result of q. While many spatial index structures can serve this purpose. 22. the query point. ok Þ þ . The Minkowski sum of two shapes A and B in euclidean space is the result of adding every point in B to every point in A. 4. on the other hand. More specifically. Since the bbox changes each time the object updates. the quarantine line is not unique for a query. a query Q is not affected only if “of the updated -square p and its last updated -square plst . In the rest of this paper.2010 at 09:47:07 UTC from IEEE Xplore.. the largest  is used. As such.e. As such. an outer bound can be the Minkowski sum of q and a =2-square. That is. then the radii of the inner and outer circle are dðq. Fig. since the kth NN ok determines whether other object is or is not a result object. In case. 2) the current query results. . 3. an object becomes a nonresult object once its -square crosses or is outside the outer bound. To ease the computation of these two bounds. we stick to the definition of the most probable result and leave the modification details for other definitions to interested readers. update the results to the application server. if the order of the result objects is sensitive. In particular. NO. it is guaranteed to be closer to q than ok . whereas the latter two are separated by theouter bound of the quarantine area. Quarantine area. the rectangle of a range query.3 The Object Index The object index is the server-side view on all objects. T Þ) denotes the minimum (maximum) distance between a pair of points in areas S and T . both of them are totally inside the inner bound or both of them cross or are outside the outer bound of the quarantine area. recompute its quarantine area and update the query index. (a) Range query. an in-memory gridbased index is built on the quarantine areas of all queries. the entire space is split into three regions: the inner region. and Section 6 describes the update strategy of the client-side location updater. On the other hand. T Þ (DðS. the outer bound circle is set to be the maximum distance between q and the bbox of ok . this paper employs the RÃ -tree index [2].” That is. 3. For a kNN query. Restrictions apply. On the other hand. With the same rationale for which we assume the genuine point location of an updating object to distribute uniformly in the -square. an object becomes a result object if its -square moves totally inside the inner bound. To quickly find all affected queries. It originates from the quarantine line. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. The former two are separated by the inner bound of the quarantine area. update the safe region of p. the ideal quarantine line is difficult to compute. The only changes needed to reflect the new result definition are the query evaluation algorithms in the query processor and safe region computation in the location manager. we allow fuzziness by relaxing the line to an area called “quarantine area.g.e. since only the distance to the query point q matters. 3. there are different s for different objects. the inner bound circle is set to be the minimum distance between q and the bbox of ok so that if a -square is totally inside this circle. the set fa þ bja 2 A. 3. as object locations have extensions rather than points.g. the majority of this square must be outside q. enlarging q by =2 on each side. we set the radii of the two circles based on ok . In addition. the database server stores: 1) the query parameters (e.

given . since no existing queries change their quarantine lines. P erimeterðRÞ u t : kðÞd ¼ P erimeterðRÞ. or even Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. so we compute it based on the more accurate quarantine line. which are separated by the quarantine line. kðÞ is the length of segment pr. we can further define the safe region for a single query Q (denoted as p:srQ ) as a rectangle in which the change of the centroid of p’s -square does not change the most probable result of Q. in other words. First of all.HU ET AL. if there were another r0 . the new safe region p:sr0 is simply the intersection of the current safe region p:sr and p:srQ . P erimeterðRÞ where Cl is the cost for one location update. the optimal safe region p:srQ is the inscribed rectangle with the longest perimeter. In the former case. so we trade accuracy for efficiency. then only queries pointed at by the home cell(s) of p or plst are affected. Recall that the home cell(s) are the grid cell(s) of the query index where the -square of p is contained or overlaps. containment. Otherwise. Furthermore. spatial relations such as overlapping. the reevaluation is more efficient as it can be based on previous results. Obviously. The average elapsed time over all  is  R 2 kðÞ Z 2 kðÞd 0  d ¼ : R 2 2 d 0 0 Therefore. The location manager recomputes the safe region of an object p in two cases: 1) after a new query Q is evaluated and 2) after p sends a location update. Section 5 will present the detailed algorithms to compute the optimal p:srQ for each type of query.  is the angle between the moving direction and the positive x-axis. the points in segment rr0 do not belong to R. the quarantine areas of some existing queries might change. These relevant queries are exactly those indexed by the home cell(s) of the query index. 5. The reason why the safe region is based on the quarantine line rather than the quarantine area is that the latter is much coarser. After each individual p:srQ is computed. p:srQ is a rectangular approximation. . both of which are rectangular. Therefore. Fig. r must be unique for every . The safe region. or shortly Ir À lp. In this section.1. we know that Q also differs from a conventional spatial query in that the object locations are in the form of -square (for updating objects) or bbox (for other objects). and hence. we present the detailed algorithms to evaluate or reevaluate a spatial query Q in terms of the most probable result.2010 at 09:47:07 UTC from IEEE Xplore. As such. need reevaluation. As queries are independent of each other. r is the location at which the next location update occurs. The detailed algorithms of query evaluation and reevaluation will be presented in Section 4. Downloaded on July 09. of Q’s inner (if p is a result object) or outer (if p is a nonresult object) regions. instead of regarding Q as a special query type. or more accurately an inscribed rectangle. and the bucket for each cell points to those queries whose quarantine areas overlap with or fully enclose this cell. the location manager further requires every p:srQ (and thus. of Q’s inner or outer region. As the objective of the PAM framework is to minimize the number of location updates. Recall that a safe region is a rectangle within which the change of the centroid of p’s -square does not change the most probable result of any registered query. In this space. and that -square is small enough to be ignored. r is the intersection point of this direction and the boundary of R. the safe region for this new query Q. p:sr is simply the intersection of these p:srQ from all registered queries. the p:sr) to be fully contained in the home cell(s). and thus. By this means. Restrictions apply. the elapsed time before the next location update is kðÞ . 5). we have Costp ¼ Cl Á because R 2 0 Z 0 2 kðÞd 2 À1 : ¼ Cl Á 2 . Assume that the object p moves in a randomly chosen direction with a constant speed  (see Fig. the query processor evaluates the most probable result when a new query is registered. the amortized location update cost for p over time Costp is Costp ¼ Cl Á Z 0 2 kðÞd 2 À1 : ¼ Cl Á 2 . Aside from the definition of the query result.5 Query Processor and Location Manager In the PAM framework. therefore. or reevaluates the most probable result when a query is affected by location updates. we take an alternative approach by regarding the space where the object locations are defined as a special euclidean space. The location manager computes the safe region of an object p (denoted as p:sr). the cost of location updates. the quarantine area is used only to filter out the queries that are not affected by a location update. If p:sr0 is different from p:sr. the following theorem shows that the safe region should be the inscribed rectangle of the inner or outer region with the maximum perimeter: Theorem 3. the new safe region should be updated to p. p:sr0 needs to be completely recomputed by computing the p:srQ for each relevant query and then getting the intersection. based on the object index. which contradicts the convex assumption. the location manager only needs to compute the safe regions for those queries (subsequently called relevant queries) whose quarantine areas are contained or overlap with the home cell(s).: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 409 The index partitions the entire space into M Â M uniform grid cells. directly dictates the frequency. 4 QUERY PROCESSING In this section. To eliminate those queries whose safe regions do not contribute to p:sr. Given a convex safe region R and the updated location p. In the latter case. Proof. 3. By this definition. If we define the cell(s) that overlap with the -square of the updating object p as the home cell(s) of p. Random movement. on the other hand.

The best-known algorithm to evaluate a kNN query q in conventional euclidean space is the best-first search (BFS) [20]. p2 > is inserted into Q and the portion of area where p1 (or p2 ) is closer is 0.2010 at 09:47:07 UTC from IEEE Xplore. As such. the algorithm checks if any point in p01 (or p02 ) is always closer than any point in p02 (or p01 ). . pushing its child entries into H. we present an efficient algorithm that is based on finding out which object has more portion of area closer to point q. Downloaded on July 09. H is used to hold p until it can be guaranteed a kNN. This occurs when another object p0 is popped from H. vÞ do 6: 4. rootÞi into H. uÞ is compared with the next Dðq.5. which is shown in Algorithm 2. then v is guaranteed a kNN and removed from H. The region with the larger area decides the most probable result. 4. Transitivity: if p1 is closer than p2 and p2 is closer than p3 . The algorithm terminates if k objects have been returned. dðq. the algorithm is efficient because it terminates as soon as one portion of area exceeds 0. Comparability: 8p1 .e. Proof sketch. u itself is inserted to H and the algorithm continues to pop up the next entry from H. then a (from p1 ) is more probably closer to q than b (from p2 ). the majority of p is in R. repeating the process all over. is split into four equal subrectangles. The algorithm continues until either the portion of area where p1 Query Evaluation and Reevaluation on Object Index In conventional euclidean space. 22. 3. p0 Þ) is larger than the maximum distance of p to q (Dðq. The reason to split the larger rectangle is that the resulted pairs are more probable to become pairs of case 1. it is not guaranteed a kNN in the new space. It uses a priority queue H to store the to-be-explored index entries which may contain kNNs. The closer relation is a total order relation. 4. reflexivity. On the other hand. Therefore. p1 is closer than p2 . a new range query is evaluated as follows: We start from the index root and recursively traverse down the index entries that overlap with the query window until the leaf entries storing the objects are reached. Algorithm 2. containment and closer. To implement the “closer” relation. p01 or p02 . Instead of computing the exact shape of such area. However. 3. When a leaf entry. Therefore. an object p is contained in a rectangle R if in the euclidean space. the query is evaluated similarly. and then. In order to let the portion of area converge to the actual probability more quickly.2 Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. In the following sections. uÞ is larger than Dðq. we implement two relations that are required for spatial queries. respectively. VOL. or the queue Q becomes empty. we test each object using the containment relation in the new space.410 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. for two randomly picked points a. and comparability. By using the new implementations of spatial relations.. antisymmetry. the algorithm maintains an additional priority queue H besides H. 2. dðq. Each time an element <p01 . that is. an object p1 is closer to a point q than object p2 if and only if in the euclidean space. or u t p2 is closer than p1 . which is forbiddingly costly. an entry of a leaf node. 3. It maintains a priority queue Q whose elements are pairs of subrectangles of p1 and p2 that have not yet been compared. more probably closer to q than c (from p3 ). whichever is larger. vÞ. is popped. it satisfies 1. It is a priority queue of objects sorted by the “closer” relation. Restrictions apply. BFS works by always popping up the top entry from H. which is. where v is the top object in H. Initially.1 Spatial Relations In this new space. we use the multiple portions of area as the key to sort the pairs in Q. and its minimum distance to q (dðq.5. when an object u is popped from H. If this is the case (case 1). Then. either p1 is closer than p2 . In the new space. vÞ until it is no longer the larger one. i. The rectangle divides the enlarged safe region of any object p into two regions: the region inside rectangle q (where p is a result object of q) and the region outside q (where p is not a result). Then. in turn. the most probable result of kNN query q is defined as the top-k objects of all objects in the closer order of their enlarged safe regions. and thus. The reason to introduce H is that when an object p is popped from H. It is noteworthy that the portion of area where p1 (or p2 ) is closer is essentially the probability that p1 (or p2 ) is closer. p02 > pops up from Q (where p01 is a subrectangle of p1 and p02 is a subrectangle of p2 ).1. pÞ). Then. The entries in H are sorted by their minimum distances to the query point q. transitivity. 4. the algorithm is based on the divide-andconquer paradigm. In this new space. MARCH 2010 distance are implemented differently from a conventional euclidean space. uÞ > Dðq. Cases 1 and 2 are trivial. existing spatial query processing algorithms can be applied directly to the new space. p2 . a is equally or more probably closer to q than b. (or p2 ) is closer exceeds 0. the multiple of the area p01 (or p02 ) is added to the portion of area where p1 (or p2 ) is closer. If this is not the case (case 2). As such. The closer relation has a nice property that it is a total order relation. 2: enqueue hroot. b from p1 and p2 . NO. Evaluating a new kNN Query Input: root: root node of object index q: the query point Output: C: the set of kNNs Procedure: 1: initialize queue H and H. 3: while jCj < k and H is not empty do 4: u ¼ H.pop(). this algorithm always returns the correct result. Reevaluation of an existing range query q is even simpler—only the -square of the updating object needs to be tested on the containment relation. In general. If dðq. we need to do the following. 5: if u is a leaf entry then while dðq. the corresponding object is returned as a nearest neighbor. namely. four new pairs are inserted to Q. which is proved by the following preposition: Proposition 4. The algorithm continues until k objects are returned. the pair <p1 .

x ¼ s. we present the detailed algorithms to compute the quarantine line. To reevaluate an existing kNN query that is affected by the updating object p. 10: pà ¼ k. since this object is a new result object. its centroid). otherwise. This is done by comparing it with other existing objects in the kNN set using the “closer” relation. 2) case 2: p was not a result object but becomes one. 6a). starting from pà . p is “on” the window border as box “1” shows.HU ET AL.4 For case 1. the optimal safe region is the solid rectangle whose corner point is pà (see Fig. This then leads to three cases: 1) case 1: p was a result object but is no longer so. Algorithm 3 shows the pseudocode of kNN query reevaluation. then k-1th NN. For example. else if u is an index entry then for each child entry v of u do enqueue hv. The final step of reevaluation is to locate the order of new result object p in the kNN set. for case 3.pop(). updating object p (i. For any inscribed rectangle whose corner point in the first quadrant is ðs. we have ð=2 þ a À xÞð=2 þ b À yÞ ! 2 =2. more than half of the -square must reside in the query window. there are fewer than k result objects. Reevaluating a kNN Query Input: C: existing set of kNNs p: the updating object Output: C: the new set of kNNs Procedure: 1: if p is closer to the k-th NN then 2: if p 2 C then 3: pà ¼ the rank of p in C. 6b). then all inscribed rectangles that contain p lie between the two dotted rectangles whose horizontal sides and vertical sides pass p0 . Fig.” In the second subcase. Fig. and 3) case 3: p is and was a result object. the location manager computes the optimal safe region for an individual query Q. tÞ must also be on the quarantine line. enqueue u into H. On the other hand. and then. Without loss of generality. Safe region for range query. For cases 1 and 2. y ¼ t must be a solution to (1). y ¼ b. However. let us consider the first quadrant. 7: else 8: if p 2 C then 9: evaluate 1NN query to find u. separated by the quarantine line. 6b. where pà denotes the starting position of the comparison. if x a À =2. the perimeter is 2s þ 2t. then it is a result object. which is the inscribed rectangle with the longest perimeter (Ir À lp) of Q’s inner or outer region. the comparison should start from the kth NN. and hence. 6b shows an example of range query where q is the centroid of the query. or < x ¼ a. the comparison can start from where p was. so there should be an additional step of evaluating a 1NN query at the same query point to find a new result object u.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 411 7: 8: 9: 10: 11: 12: v = H. which is defined by the following formulae: 8 if y b À =2. since p was in the set. 6: enqueue p into C. 6b. this safe region may not contain the centroid of the updating object p.. The gray box shows the -square of p. tÞ. Summing up all the four quadrants. The evaluation of such a query is almost the same as Algorithm 2. In this section. 6b. . and so on. because otherwise. the total inner region of this query is the bold shape in Fig. otherwise: ð1Þ And the inner region in the first quadrant is therefore the shaded shape. dðv. There is a fourth case where p was not and is not a result object. It is noteworthy that the safe region must contain the 4. which can be further divided into two subcases. The two subcases give us the quarantine line in the first quadrant (the bold curve in Fig. the optimal safe region is one of the two dotted rectangles with longer 5 SAFE REGION COMPUTATION As mentioned in Section 3. since ðs. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. if the centroid is at p0 . 4: else 5: pà ¼ k. Restrictions apply. 12: relocate p or u in C. we only need to consider the special case when exactly half of the square resides in the query window. except that all existing kNN result objects are not considered.e. or : ð=2 þ a À xÞð=2 þ b À yÞ ¼ 2 =2. (b) The optimal safe region. In the first subcase. in Fig. Therefore. insert v to C. the safe region is obtained in two steps: finding the quarantine line. the safe region for various types of queries. Fig. This equation shows that  the perimeter 2s þ 2t is maximized at pà when 2 þ a À x ¼  ffiffi p 2 þ b À y ¼ 2 . 5.1 Safe Region for Range Query We first consider the case when object p is a result object. Algorithm 3. the reevaluation is completed by doing nothing. and let the same p (ðx. qÞi into H. it is a nonresult object. 6a is the close-up image of Fig. 6. Thus. 11: remove p and enqueue u into C. In this case. However. this object has to send an immediate location update after it receives this safe region. yÞ) denote the centroid of the -square. The second step is to find the Ir À lp of the inner region. (a) Quarantine line. p is not on the border as box “2” shows. According to the definition of the most probable result. finding the Ir À lp. we have either “y ¼ b and x þ =2 b” or “x ¼ a and y þ =2 a. the first step is to decide whether p is a result object by comparing p with the kth NN using the “closer” relation: if p is closer.2010 at 09:47:07 UTC from IEEE Xplore. To obtain the quarantine line. In this case. Downloaded on July 09.

7. to compute the lower bounding circle.  ðx. perimeter. To find the radii of the upper and lower bounding circles. we show that a circle centered at q splits a -square into inside and outside parts.3. MARCH 2010 Fig. there must be a diagonal Fig. plotted by solid arcs. and since the inside part is half of the square. any square whose centroid is within the lower bounding circle is closer to q than o. the corner of the safe region is: 8  pffiffi pffiffi pffiffi > a À 2À1 . However.e. Upper and lower bounding circles. 5. On the other hand. the safe region is an inscribed rectangle of the outer region. On the other hand.4. Then. are the circles that cross the centers of these two squares.. 7b. plotted by dotted arcs. there are two circles. the safe region is chosen from the two dotted rectangles that has a longer perimeter. Then. Lemmas on squares.2. the two -squares are of the same angle. the diagonal square. and their areas are dependent on the angle of the -square to q. any square whose center is closer than that of the diagonal square must be closer than the diagonal square. the area of the inside part also depends on the length of pq. (a) Diagonal and side squares. we can define the lower and upper bounding circles for an object o. p is always farther than o. has the largest inside part. 8. the M th shortest) distance is set to the radius for the 2 lower bounding circle. 4  4) subsquares. otherwise: 2ðbþ=2Àp:yÞ 2 If p is a nonresult object. this rectangle can serve as the safe region only if it contains the centroid of updating object p. Such rectangle has the longest perimeter when its corner point pà is at ða. the diagonal square is closer than the bbox of o. u t Based on Proposition 5. whose diagonal coincides with the line of pq. Bounding circle. By definition. we approximate the inner region with a ring centered at the query point q. 22. in Fig. Applying these two lemmas. For example. 2ðaþ=2Àp:xÞ À 2bþ .412 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. the closer p to q. the larger the inside part. has the smallest inside part. The medium 2 (i. 7a). (b) q and inside part. 0Þ or ð0.. its -square must be closer than the bbox of oiþ1 .1. but farther than the bbox of oiÀ1 . area II) is always closer than any point in the bbox of o. b À 2À1  .. respectively.. whose sides are parallel to pq. Downloaded on July 09. Proof. For squares of the same angle to q. otherwise. oi ) can be approximated by a ring that is formed by the lower bounding circle for oiþ1 and the upper bounding circle for oiÀ1 .3. the inner or outer region) for p based on this line is complex. 9. we further adopt an approximation algorithm as follows: As shown in Fig. > 2 >  >  > > 2 : or À 2aþ . In what follows. bÞ. Therefore. while the side square.e.2010 at 09:47:07 UTC from IEEE Xplore.2 and 5. but the square that is closer to q has a larger inside part (area I) than the farther square (area II). For a result object p of a range query (size 2aby-2b). The proof for the upper bound is similar. By this definition. the diagonal square is first partitioned into M  M (e. the inner region for p (i. Restrictions apply. and these subsquares already account for half of the total area. Lemma 5. This bounding circle is guaranteed to satisfy Proposition 5. Fig. 9.e. that touch the near and the far endpoints of the bbox of o. (refer Fig. Among all squares of the same size and the same distance to q. VOL. by definition. yÞ ¼  2 > p:x. 8. we reach the following proposition on the safe region for a result object: Proposition 5. The following proposition proves the correctness: Proposition 5. .2 Safe Region for kNN Query We first consider the case when object p is the ith NN (denoted by oi ) of the query. > > 2 2 > pffiffi > > < & p:y b À 22À1 .4 because the subsquares of the first M2 2 shortest distances (their farthest endpoints are shown as hollow circles) must be inside the bounding circle. Lemma 5. p:y .4. the distance between q and the farthest endpoint (the small hollow or solid circles in the figure) of each subsquare is computed. In Fig. p is always closer than o. by Lemmas 5. the exact quarantine line (and hence. square and a side square that are split by these two arcs into inside and outside parts of equal area. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. NO. Similar to the case when p is a result object. As the first step. as long as the centroid is beyond the upper bounding circle. Any -square whose centroid is within (beyond) the lower (upper) bounding circle for object o must be closer (farther) to q than o. if p:x a À 22À1 . Since any point in the inside part of the diagonal square (i. Applying the transitivity of the “closer” relation.g. on the other hand. 3. as long as the centroid of p’s -square is within the lower bounding circle. The lower and upper bounding circles.

(a) Ring.HU ET AL.5. where x ¼ arcsin p:xÀq:x and y ¼ arccos q:yÀp:y . 6. As such. if object p is a nonresult object. . r r The standard update strategy. x is either on the 1=4 circle whose  is 8 < =4. the reachable area is the Minkoski sum of the -square at t0 and a circle with a radius of vm ðt1 À t0 Þ. In this section. The periR R meter of the second (vertical) Ir-lp is 4Rcos2 þ 2ðRsin2 À rÞ.1. we show in the following lemma that it is sufficient to maintain only the reachable area for the last -square: Lemma 6.2010 at 09:47:07 UTC from IEEE Xplore. . 11b. or 2 ¼ x . In these examples. Fig. we discuss two dynamic strategies that achieve objectives other than monitoring accuracy. The perimeter of the first (horizontal) Ir-lp is 4Rsin1 þ 2ðRcos1 À rÞ. then at time t1 . Figs. The intuitive version of this strategy must maintain the entire set of historic -squares. 6 DYNAMIC CLIENT UPDATE STRATEGY Therefore. we reach the following proposition on the safe region for a result object oi : Proposition 5. the reachable area is the half-open space formed by the rays whose ends are from the -square. if y > =4. which is called the reachable area from t0 . the safe region is the inscribed rectangle of the ring that has the longest perimeter (Ir À lp).1 Mobility-Aware Update Strategy Previously. i ¼ 1). (b) Known direction. in Fig. 11a and 11b). In Fig.. we ignore the fact that the server receives a series of location updates from an object. : y . a -square is updated at time t0 . or if x < =4. Likewise. then the object can only locate in the part that is inside the area (shaded in Figs. To prevent the server from narrowing down the object location like this. 10a): Proposition 5.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 413 Fig. tn g and i j n. Downloaded on July 09. if x arctg2 y . guarantees 100 percent monitoring accuracy in the context of the most probable result. (a) Known maximum speed. Once the ring is obtained. 10b): Proposition 5. The safe region of the ith NN oi is the Ir-lp of the ring that consists of the upper bounding circle for oiÀ1 and the lower bounding circle for oiþ1 . The Ir-lp of a ring that is centered at q with inner radius r and outer radius R is the one of the following two Ir-lp which has a longer perimeter. by considering consecutive updates with certain background knowledge about the object’s mobility. where 1 is 8 < arctg2. : if arcctg2 < y : y . we develop an algorithm to test whether a -square at t1 is completely inside the reachable Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. or if x < arcctg2. : y . the -square expanded by the circle at each point.. if x arcctg2 y . The Ir-lp of the complement of a circle centered at q with radius r is the inscribed rectangle with one corner being the cell corner corresponding to p and the opposite corner is x. the object must reside in the dotted shape. Problems with mobility knowledge. this circle can serve as an approximation of the lower bounding circle. If the -square at t1 overlaps with the reachable area. Although the server cannot speculate the genuine object location from an individual -square. (b) Complement of a circle. the ring degenerates to a circle. which updates when the centroid of -square is out of the safe region. It is noteworthy that for the first NN (i. In [21].1 (Mobility-aware update strategy). . Update when the centroid of -square is out of the safe region and the -square is completely inside the reachable area of all previous -squares. we can approximate the outer region by the complement of the upper bounding circle of ok . Computing Ir-lp. 11a and 11b show two examples where the maximum speed vm or the exact direction of the movement is known. or 1 ¼ x .e. The same approximation can be applied to upper bounding circles. due to its dynamic property. where x ¼ arcsin p:xÀq:x and y ¼ arccos p:yÀq:y . better approximation can be achieved using larger M. However. if arctg2 < y . In what follows. For a set of -squares of ft0 . . On the other hand. which is at the cost of higher computation overhead. we showed that (see Fig. In [21]. Finally. This is a static strategy where the decision is made independent of previous decisions. the server might produce better speculations. 11a. we showed the following proposition (see Fig. respectively. t1 .6. Restrictions apply. or  ¼ x . Obviously. we propose the following mobilityaware strategy: Definition 6. the safe region is the Ir À lp of the complement of a circle.e. i. 11. if y =4 x . 10. . or if x < arctg2. where 2 is 8 < arcctg2.7. the reachable area of tj is completely inside that of ti as long as the -square of any ti is completely inside the reachable area of tiÀ1 ði ! 1Þ.

MARCH 2010 Fig. NO.414 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. 3. area of t0 when the direction is known in the range of ð. Reachable area. 22. 12. VOL.

l . .

yÞ denote a point in the -square of t1 .h Þ. let o with coordinates ða. As is illustrated in Fig. bÞ denote a point in the -square of t0 . This is a general case for Fig. 12. The idea is to take an analytic view of this area. the condition that line op falls in between direction ð. 11b. and let p with coordinates ðx. Then.

l . .

h Þ is equivalent to the inequality tgð.

l Þ ðy À bÞ=ðx À aÞ tgð.

h Þ (we consider only the first quadrant for simplicity). b can be satisfied simultaneously: 8 < Àðx À aÞtgð. to test whether the -square at t1 (xl x xh . y. yl y yh ) is completely inside the reachable area of t0 is equivalent to testing whether any of the following two sets of inequalities with regard to x. Therefore. a.

and 8 < ðx À aÞtgð. : xl x xh . 0 b .l Þ þ ðy À bÞ 0. 0 a . y l y y h .

h Þ À ðy À bÞ 0. Downloaded on July 09. and by taking steps of length Á. at the border  ¼ 0. Restrictions apply. By the theory of Bernoulli process. the rule must also be satisfied at p. the total number of steps N ¼ distðp. As p causes a result change. y.. pà is regarded as p0 . On the other hand. a strategy that can trade accuracy with costs is desirable. if pà changes the result. As  at any point is independent of the  values at other points. The gap between them is inevitable and could be arbitrarily large due to the following reasons: 1) a safe region for an individual query is already a rectangular approximation of the inner or outer region for this query and 2) the whole safe region is obtained by intersecting the safe regions for all individual queries. if the rule is satisfied. 13). yl y yh : Either of them can be regarded as the set of linear constraints in a linear programming (LP ) problem regarding variables x. the movement of  from the border of the safe region to p can be regarded as a discrete random walk for simplicity (see Fig. We therefore believe that in applications where 100 percent accuracy is not mandatory and location update costs are serious issues.  ¼ 0 when p is in the safe region and gradually increases as p moves away from the safe region. b. the standard strategy updates whenever p moves out of the safe region. but this could be an unnecessary update as p might still be in the ideal safe area. we develop such a strategy that tries to minimize the cost by adding a -rule to the standard strategy to filter out unnecessary updates. a. 13. The feasibility can be tested by any LP solver such as the classic Simplex or Ellipsoid method. If we regard the space as a space of  values. 13 illustrates the relation between a safe region and the ideal safe area. which makes it far smaller than the ideal safe area.2010 at 09:47:07 UTC from IEEE Xplore.  > costu =costp . As p0 is known to be inside the ideal safe area. By the maximum likelihood estimation. symbol  is the probability of p moving out of the ideal safe area. On the other hand. Minimum-cost update strategy: -rule. the strategy updates the location and stops the walk in any step when the -rule is satisfied. i. Initially. More specifically. The -square of t1 is completely inside the reachable area of t0 only if neither P1 nor P2 is feasible. then the strategy does not update the location. We are yet to estimate . To guarantee 100 percent monitoring accuracy on the most probable result. Therefore. This prerequisite is useful in filtering out those ideal safe areas that are not significantly larger than their safe regions.e. if the -rule is not satisfied till the last step. To test whether  at p is larger than costu =costp without sending it to the server. we continue to use the standard strategy. so we continue to find p0 for the new ideal safe area. P2 with a (dummy) objective function C ¼ 0 and the same linear constraints as above.  is increased by Á with a probability . If the next updated location pà causes no result changes (a feedback from the server). Thus. Fig. the probability of  costu =costp . As such. i. Otherwise. costp is essentially the penalty of loss of monitoring accuracy.  at p0 follows a Binomial Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. . because  is monotonously increasing as it moves. the ideal safe area changes as well. We build two LP problems P1 . we need to know an additional point p0 inside the ideal safe area (see Fig. 0 a .2 Minimum-Cost Update Strategy In previous sections. 13).. In this section. : xl x xh . the -rule updates only if costu < à costp . let costu denote the cost of updating p. RÞ=Á. In general. and not increased with probability 1 À . Since the maximum value for  is 1.e. To find p0 . Á ¼ 1=N. we should maximize the probability that  at p0 does not satisfy the -rule. it walks away from the border toward p. In any step. and the second update must cause no result changes from the first update. Let costp denote the cost of not updating p in this case. to minimize the expected cost. Determining whether any of the two sets of inequalities can be satisfied simultaneously is then equivalent to testing whether P1 or P2 has a feasible solution. 0 b . Fig. 6. we use a rectangular safe region to approximate the ideal safe area in which the change of the centroid p of a -square does not change the most probable result of any query. a random walk to p0 can be conducted in the same way as above. In each step. the -rule is only applicable after two consecutive location updates by the standard strategy.

As  or f increases.” “first update.. For range queries. is defined as whether the monitored results for all queries accord with the results from the OPT framework. 0. it knows precisely when the most probable result of any query changes. 2tv Š).HU ET AL. Note that the latter are the most probable results based on the -squares of all objects at t. which includes the time for query evaluation and safe region computation. the consistency rate drops.2 Validity of Most Probable Result The first set of experiments is to validate the definition of most probable result.1 and 1. encapsulates it into a -square.” there must be two location updates. we implement a simulation test bed. we bcostu N=costp c . even when  or f is very large.4 GHz PC with 1 GB RAM running WinXP SP2. (randomly picked from the range ½0. In optimal monitoring. and the latter (regarded as p0 ) must cause no result changes. the optimal monitoring (denoted as OPT) and the periodic monitoring (denoted as PRD).1) and PRD(1) hereafter.000 time units or until the measured value stabilizes (for those simulations that take 12 hours or more). and only then does it send a location update to the server. This justifies our claim that the most probable result is a nice approximation of the genuine result for monitoring tasks.2010 at 09:47:07 UTC from IEEE Xplore. 7. Fig. have  ¼ N0 The state transition diagram of this strategy is illustrated in Fig. However. the consistency rate is above 70 percent.. Once the strategy decides to update. This is a well accepted and studied model in the mobile computing literature [5]. it chooses a new destination and repeats the same process. 7 PERFORMANCE EVALUATION To evaluate the monitoring performance. all objects periodically send out location updates simultaneously and the server reevaluates all registered queries based on these updates. the query points are randomly distributed and k ranges from 1 to kmax . . we test PRD with updating intervals 0. To eliminate the effect from hardware configuration. State transition diagram. the curve is not linear: the drop becomes slower when  and f become larger. denoted as PRD(0. Monitoring accuracy: The monitoring accuracy at time t. i. Therefore. Each object detects its point location at frequency f. Putting  ¼ costu =costp . Obviously. Each object has an individual  and it follows a normal distribution with mean value . 14 where the shaded texts mean rule satisfaction and unshaded texts mean otherwise. 1g.. To transit to “second update. The workload consists of W queries.1.1]. the simulation uses logical time units instead of clock time units. its monitoring accuracy and cost depend on the updating interval. Restrictions apply.1 Simulation Setup In the simulation test bed. te Þ ¼ te Àtb tbe maðtÞdt. . half of which are range queries and half are kNN queries. each object moves according to the random waypoint mobility model: the client chooses a random point in the space as its destination and moves to it at a speed randomly selected from the range ½0.” “no update. and forwards the square to the location updater. The database server is simulated on a Pentium 4 2. the proportion of time when the two results are the same. distribution whose cumulative distribution function is 2 bounded by expðÀ2 ðN0 ÀbNcÞ Þ. The monitoring accuracy for a time period ½tb . OPT serves as the lower bound for all monitoring frameworks. CPU time: This is measured by the amortized server CPU time. the query rectangle is a square and its side length is uniformly distributed in a range of [0:5qlen . i. upon arrival or expiration of a constant movement period 7. 15a shows the consistency rate. 1:5qlen ]. maðtb . the -rule is suspended and the state is reset to “first update” to wait for the next p0 . Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. every object has the perfect knowledge of the registered queries and the -squares of other moving objects at any time. where N moving objects move within a unit-square space [0. In all the three frameworks.” The -rule is applicable only at “second update” and “no update” where p0 is obtained. Table 1 summarizes the default parameter settings. There are four states in this strategy: “initial state. Downloaded on July 09. we compare the most probable result from the OPT framework with the genuine result (the result as if all the point locations were known) for all W queries. 2vŠ. In this paper. Each simulation run lasts for 5. the database server maintains an in-memory grid index (M  M cells) on the queries and an Rà -tree index [2] on the objects. maðtÞ 2 f0. We compare our PAM framework with two other frameworks. The performance metrics for comparison include: . 14.e. The function reaches the N0 maximum when N0  ¼ bNc. As such. For kNN queries.” and “second update. . namely. te Š is defined as the amortized accuracy Rt 1 over time. Under various  (the mean of ) and f (the location detection frequency).: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 415 TABLE 1 Simulation Parameter Settings Fig. Wireless communication cost: It is the amortized number of location updates sent by a moving object over time.e.. In periodic monitoring.

in turn. 15b shows the monitoring accuracy and communication cost (normalized by the cost of OPT). Fig. the CPU time is linear to W . However. which is already large in practice. 53 seconds using PRD(1). not necessarily the frequency of result changes. the rate of the drop decreases as  increases. because the increase for PAM arises from two aspects—more location updates and more complex query reevaluation (especially for kNN queries). while PRD gets only 80-90 percent. 17a shows the CPU time when the number of registered queries (W ) increases from 10 to 1. PRD(1) and PRD(0. Fig.1) drops significantly as  increases. NO. and CPU cost. 17. it is at the cost of 10 times higher communication overhead.2010 at 09:47:07 UTC from IEEE Xplore. the high CPU cost imposes on it a Fig.000 moving objects using PAM.000. communication. PAM only increases by less than 10 times because the grid-based query index filters out most of the unaffected and irrelevant queries. Similarly. (b) The overall performance. we evaluate the effect of  on the performance.4 Effects of  In this section. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. VOL.  versus monitoring accuracy. which. We vary  (the mean of ) from 0 to 0. Fig. (b) Communication cost. The cost for OPT is almost the same for all settings.1). we can conclude that PAM is robust and efficient for various privacy settings. 3. 16a shows the corresponding monitoring accuracy. PRD(0. for PRD(1) and PRD(0.1) is more accurate than PRD(1) but the performance gap is less than 10 percent. 7. and 217 seconds using PRD(0.1) (not plotted) have constant costs of 1 and 10. the communication cost of PAM is much smaller than PRD and remains close to OPT. even for  ¼ 0:01. 7.000. As is guaranteed. but even for  ¼ 0:01. Downloaded on July 09. 16b shows the communication cost. it still outperforms PRD(1) by more than 40 percent. the cost of PAM consistently grows as  increases. Nonetheless. (a) Consistency of most probable result. the rate of the increase drops as  increases. Fig. our PAM framework achieves 100 percent accuracy. On the other hand. 15. the accuracy of PRD(1) and PRD(0. Performance evaluation. 16. 7. 16c shows the CPU cost of PRD and PAM. 22. Obviously. Furthermore. (b) Communication cost.1). (a) CPU time. Restrictions apply.6 CPU seconds to monitor the 100. While PAM achieves 100 percent accuracy. because the change of  (and thus.3 Overall Performance The next set of experiments evaluates the overall performance of three frameworks with default parameter settings. respectively. ) merely changes the query results. However. MARCH 2010 Fig. which indicates that the approximation ratio of the safe region to the quarantine area becomes steady. Therefore.1). verifies the fact that the most probable result is stable for even large -squares. the server needs 1. (a) Accuracy.416 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. whereas the increase for PRD arises only from the latter. The drop is mainly caused by the increasing spatial vagueness introduced by the -square. PAM increases faster than PRD(1) and PRD(0. Performance versus query numbers (W ). Fig.5 Scalability This section evaluates the scalability of all frameworks in terms of the server’s CPU time and communication cost. for one logical time unit.01 and Fig. Further. On the other hand. PAM is still about 1=20 that of PRD(1). as they need to reevaluate every query at each batch of location updates. (c) CPU cost. . As PRD updates locations periodically. When W ¼ 1.

the safe regions are determined more 7.001 to 1 per logical time unit. 20.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 417 Fig. However. The costs of both PAM and OPT increase linearly as v increases.6 Effects of Query Types In this section. 18b shows that the communication cost of PAM only increases by about 200 times when N increases by 1. and hence. (a) CPU time. the average moving speed (v) and the average constant movement period (tv ) for the moving objects. We vary the average query length qlen of range queries and kmax —the maximum k of kNN queries. In Fig. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. but on the length of the trajectory. PAM’s communication cost is at most three times as much as that of OPT. maximum update frequency: in this example. Performance versus query types. 20b. 19a and 19b. and hence. To eliminate this effect. Fig.001) or large (!0:01).1) both increase at least linearly to N. PAM has no such limitation.000 times. by the home cell than by the relevant queries. Nonetheless. we conduct simulations to vary the number of objects (N) from 100 to 100. The next influential factor is the M Â M grid partitioning of the query index. 18. although PAM increases linearly with respect to W . we conclude that PAM is robust to various moving parameters.000. only a decreasing portion of objects affects the quarantine area of any query. . (a) Range queries. for kNN queries. the communication cost of OPT always increases at a steady pace. because the time of an object staying in a safe region is inversely proportional to v. we study the sensitivity of PAM to other influential factors. Performance versus object numbers (N). as kmax increases. as qlen increases. and observe that this cost is independent of v. The explanation is the same as in Fig. 21. Performance versus grid partitioning. Communication cost versus v and tv . The larger is the value of M. (a) v.000 times.7 seconds. Similarly. Downloaded on July 09. we study the performance of PAM on range and kNN queries separately. PRD(1) and PRD(0. respectively.2010 at 09:47:07 UTC from IEEE Xplore. 19. On the other hand. Fig. we also plot the communication cost per distance unit on the secondary y-axis in the same figure. Fig. The communication cost increases monotonously with M because the grid cell sets the largest possible safe region of an object. is not plotted. Restrictions apply. the safe region achieves even better approximation to the ideal safe area. it is still less than double of OPT. As such. due to the RÃ -tree index which is incrementally maintained. the safe region of any object. In summary. the update cost of PAM is not heavily dependent on the speed of object movement on a trajectory.7 Sensitivity of PAM In this section. 7. In terms of communication cost. the costs of both OPT and PAM grow steadily. the smaller is the grid cell size. Fig. the cost tends to saturate. the cost difference between M ¼ 5 and M ¼ 50 is not significant but there is a sharp increase from M ¼ 50 to M ¼ 100. The CPU time shows a similar trend. 18a shows that the CPU cost only increases by about 40 times when N increases by 1. Similarly. the update frequency is at most once every 21.HU ET AL. Since the size of a cell is fixed. This can be explained by the fact that when qlen is relatively small or large. The communication costs are plotted in Figs. 20a shows the communication cost when v varies from 0. as they need to build a new RÃ -tree at each update to reevaluate all queries. (b) kNN queries. PAM is more scalable than PRD in terms of CPU and communication cost. We observe that for any parameter setting. This suggests that although a denser object distribution makes safe regions shrink. we also vary tv from 0. This suggests that for a heavy workload when results change frequently. All the above results suggest that PAM is robust under various W settings. We vary M from 5 to 100 and plot both the communication cost and CPU time in Fig. For range queries. In contrast. (b) tv . PAM manages to narrow the gap when kmax becomes larger. In other words.001 to 1 time unit and find that it has little effect on the performance of PAM. Fig. 21. the communication cost of PAM increases more slowly when qlen is relatively small (at 0. Even so. (b) Communication cost. namely.

ACM Int’l Symp.E. RGC GRF 615806. HKBU211307. Beckmann. Chow. we plan to incorporate other types of queries into the framework. [4] [5] [6] [7] [8] [9] 8 CONCLUSIONS [10] [11] This paper proposes a framework for monitoring continuous spatial queries over moving objects.D. MARCH 2010 Fig. “Location Privacy in Mobile Systems: A Personalized Anonymization Model. Int’l Conf. 22. the safe region computation is faster. Chon. the safe regions are determined more by the relevant queries than by the grid cell. such as spatial joins and aggregate queries. Data Eng. Tian. but when M is large. and J. D. pp. Wang. Sept. IEEE Int’l Conf. Int’l Database Eng. (IDEAS). “MobiEyes: Distributed Processing of Continuously Moving Queries on Moving Objects in a Mobile System. we enlarge the cell size by merging the cell where the update occurs with its neighboring cells within a certain distance.” Proc. 2001. C. 2002. “Processing Range-Monitoring Queries on Heterogeneous Mobile Objects. Pfoser and C. 4. 2005. the strategy effectively trades accuracy for communication cost. 1112.S. HKBU/FRG0809/II-48.A. Distributed Computing Systems (ICDCS).-Y. “A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-Based Services. vol. F. Restrictions apply. and efficiency.” Proc. R. and S.” Proc. pp.” ACM/ Kluwer MONET. Gedik and L. Y. This shows that most ideal safe area is far larger than the safe region. H. 2006. and A. Agrawal. the minimumcost update strategy shows that the safe region is a crude approximation of the ideal safe area. the framework is robust and scales [12] [13] [14] Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. which is. On the other hand. which means that through . We provide detailed algorithms for query evaluation/ reevaluation and safe region computation in this framework. Jensen. 171-178. We also plan to further optimize the performance of the framework. pp. D. 401-412. or vice versa. We vary the threshold for the -rule. Cao. Abbadi. R. Downloaded on July 09. Chen.01 to 1. HKBU210808. no. Cheng. (b) Communication cost. 2004. Jan. ACM SIGMOD.” IEEE Pervasive Computing.8 Dynamic Update Strategy The last set of experiments is conducted to evaluate the minimum cost update strategy. 2004.-C. Benetis. 1. 2004. when  > 0:1. In particular. Beresford and F.” Proc.” Proc.6. pp. Liu. the cell becomes dominate. IEEE Int’l Conf. efficiency. 586-595. Extending DataBase Technology (EDBT). Cheng. “Range and kNN Query Processing for Moving Objects in Grid Model. NO. Karciauskas. China under Project No. “Location Privacy in Pervasive Computing. vol. and Y. Minimum cost strategy versus standard strategy. 46-55. 7. Hu..S. HKBU211206. From this experiment. As for future work. The two curves of PAM show a similar trend as  increases.V. Interestingly.e. “NiagaraCQ: A Scalable Continuous Query System for Internet Databases. but not globally. 16. Johnson. Scientific and Statistical Database Management (SSDBM). “Nearest Neighbor and Reverse Nearest Neighbor Queries for Moving Objects. we can take full advantage of the CPU resource and reduce the communication cost (by enlarging the safe region) as much as possible. ACKNOWLEDGMENTS This work was supported by the Research Grants Council. The framework is the first to holistically address the issue of location updating with regard to monitoring accuracy. well with various parameter settings. we can see that it is advantageous to adapt the cell size to the server’s workload: we first use a large M to partition the grid. 1999. “A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols. A. so even an aggressive  can still keep the object inside the ideal safe area. 2003. Chen and R. such as privacy requirement. “Continuous Queries over Data Streams. “The R*Tree: An Efficient and Robust Access Method for Points and Rectangles. J. 3. and CA05/06. mainly because we separately optimize the safe region for each query. the decrease of the communication cost is accompanied by almost the same decrease of accuracy. B. Kalashnikov. 1990. M.” Proc. moving speed. pp. and G. 9. (ICDE). costu =costp from 0. Liu. 322-331.1127. Furthermore. vol. C. ACM/IEEE MobiCom.” Proc.” Proc. and X. but also on the accumulated safe region. and Applications Symp. ACM SIGMOD. Maltz. however. Saltenis.A. A possible solution is to sequentially optimize the queries but maintain the safe region accumulated by the queries optimized so far. B. and later if the workload turns out to be low. H.-Mar. 2. Jensen.. Schneider. Knowledge and Data Eng. 2000. 22. IEEE Int’l Conf. G. and hence.” Proc. VOL. Liu. . 8.” Proc.418 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. DeWitt. The performance of our framework is evaluated through a series of experiments. K. the CPU time decreases monotonously because the number of relevant queries in the cell decreases. Then. N. HKBU1/05C. In this way. Prabhakar. Section 7. pp. The results show that it substantially outperforms periodic monitoring in terms of accuracy and CPU cost while achieving a close-to-optimal communication cost. “Capturing the Uncertainty of MovingObjects Representations. J.” IEEE Trans. D. Kriegel. pp. the accuracy drops more slowly than the communication cost. D. Hong Kong SAR.EG03. D. D. and the number of queries and moving objects. “Querying Imprecise Data in Moving Object Environments.F. Mokbel. when  0:1. 1998. ACM SIGMOD. Int’l Conf. 620-629. Babu and J. Gedik and L. (a) Accuracy. R. Hua. and S. respectively. and privacy. privacy. Seeger. Mobile Data Management (MDM). no. 2003. J. Widom. In this figure.2010 at 09:47:07 UTC from IEEE Xplore. 22a and 22b show the monitoring accuracy and communication cost in comparison with the standard strategy.” Proc. and B. Stajano. Jetcheva. 85-97. the optimal safe region for each query should depend not only on the query. M ¼ 50 yields a fairly low communication cost as well as a fairly low CPU time. 2007. Y. i. pp. Figs. Cai. when M is small and moderate. Broch. Geographic Information Systems (GIS). no. We also devise three-client update strategies that optimize accuracy. REFERENCES [1] [2] [3] S. “Efficient Evaluation of Imprecise LocationDependent Queries.

and W. Myles. Lee.R. Data Eng. He is also the recipient of many awards. Ooi. Chakka. pp. and the PhD degree in computer science from the Hong Kong University of Science and Technology in 2002. [19] A. Hong. [42] W.” Proc. 2. He was an associate professor in the Department of Computer Science and Engineering. Lu. Pervasive Services (ICPS). Papadias. “Distributed Processing of Moving k-Nearest-Neighbor Query on Moving Objects. “The New Casper: Query Processing for Location Services without Compromising Privacy. Manolopoulos. Tao.-Y. and the MS and PhD degrees in computer science from the University of Toronto.” Proc. 2004. D. D. Int’l Conf. Y.” ACM Trans. and K. 7.E. 19.” IEEE Trans. P.-H. “Fast Nearest-Neighbor Query Processing in Moving Object Databases. He was a visiting scholar in the Department of Computer Science and Engineering. and Query Accuracy in Mobile Services. ACM SIGMOD. pp. [45] M.” Proc. (ICDE).A. University Park.” Proc. Very Large Data Bases (VLDB). Lee. vol. [18] M. Y. vol. Chen.” Proc. 2007. 371-380. “Blind Evaluation of Nearest Neighbor Queries Using Space Transformation to Preserve Location Privacy. “Anonymous Usage of LocationBased Services through Spatial and Temporal Cloaking. Xiong. Prabhakar. and D. ACM SIGMOD. M. “Mobihide: A Mobile Peer-to-Peer System for Anonymous Location-Based Queries.” IEEE Trans. and D. Hjaltason and H. Kalnis. Oct. and S. “TAG: A Tiny Aggregation Service for Ad-Hoc Sensor Networks. and V. Restrictions apply. Downloaded on July 09. [46] T. including the ACM Best PhD Paper Award and Microsoft Imagine Cup. “Indexing the Positions of Continuously Moving Objects. “Protect Moving Trajectories with Dummies. Very Large Data Bases (VLDB). B. and T. His research interests include data management. no. “Protecting Location Privacy with Personalized k-Anonymity: Architecture and Algorithms. Cui. 1999. no. and S. Papadopoulos.G.M. Prabhakar. “Spacetwist: Managing the Trade Offs among Location Privacy. [23] G.: PAM: AN EFFICIENT AND PRIVACY-AWARE MONITORING FRAMEWORK FOR CONTINUOUSLY MOVING OBJECTS 419 [15] B. “STRIPES: An Efficient Index for Predicted Trajectories. Aref. and H.L. 2007. Int’l Conf. 2005. C. Knowledge and Data Eng. Dik Lun Lee received the BSc degree in electronics from the Chinese University of Hong Kong.V. Data Eng. and K. “Prediction and Indexing of Moving Objects with Unknown Motion Patterns. Aref. Columbus. Lee. Hong Kong Baptist University (HKBU).” Proc. Very Large Data Bases (VLDB). Operating Systems Design and Implementation (OSDI). Dec. D.” Proc. Int’l Conf. Aref. and W. and Q. “Main Memory Evaluation of Monitoring Queries over Moving Objects. “Nearest Neighbor Queries. 2003. 1.” Proc. D.” Proc. [25] D. 1124-1140. [20] G. Tan.” Proc.-C. and distributed systems. IEEE Int’l Conf. Franklin. 56-64. Very Large Data Bases (VLDB). Kalnis.L. and Y. and S. vol. “Query and Update Efficient B+Tree Based Indexing of Moving Objects.-L. 12. Int’l Conf.S. J.” IEEE Trans. “Supporting Frequent Updates in R-Trees: A Bottom-Up Approach. Hong Kong Baptist University.G. A. Lee. Guo. Papadias. Peng.V. C. [41] Y. “Maintenance of Spatial Semijoin Queries on Moving Points. S. 265318. in 1998. Int’l Symp. Roussopoulos. and N.” Proc. [21] H. Mokbel.. IEEE Int’l Conf. 2. and B. Mokbel. [24] C. X. 2003. Saltenis. 7. S. [28] H. mobile computing. Iwerks. Canada. Int’l Symp. (ICDE). Smith. Liu. Hu. no. C.” Proc. search engines. no. 2005. Wu. 2003. “The TPR*-Tree: An Optimized Spatio-Temporal Access Method for Predictive Queries. X. Yu. [30] S. .L. pp. pp. vol. 2002.” Proc. [22] G. 2.HU ET AL. Pennsylvania State University. Hambrusch. Mokbel. G. Tao. Ghinita. J. no. Kalnis. Database Systems. Gedik and L. IEEE Int’l Conf. Papadias. 15. Jan. C. Ghinita. H.. W. He is an assistant professor in the Department of Computer Science. 15. and W. 2008. [37] N. no. Tang. Jensen. He has published 20 research papers in international conferences.L. 1984. D. Raptopoulou. ACM SIGMOD. 2003.R. [43] X. 2.S.G. 2007. 1719-1733. He is currently a professor in the Department of Computer Science and Engineering. [27] A. Iwerks. and M. Query Performance. M. 2004.” Proc. Leutenegger. Very Large Data Bases (VLDB). Very Large Data Bases (VLDB). 113-137. 1995. [33] G.” Proc. [40] Y. Lee. Papadias.” Proc. vol. and book chapters. Patel. Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Data Eng.” Proc. He is an associate professor in the Department of Computer Science. Madden. He is a senior member of the IEEE. “Monitoring k-Nearest Neighbor Queries over Moving Objects. [29] M. H. Hsu. Int’l World Wide Web Conf. MobiSys. Tao. Int’l Workshop Privacy-Aware Location-Based Mobile Services. 479-490. 2004. K. K. Mobile Computing. China. 2003. and B. [47] X. Y.P. and K. 117-135. and W. USENIX Symp. “SINA: Scalable Incremental Processing of Continuous Queries in Spatio-Temporal Databases. 2002. 24. Jensen. Kelley.” Proc. Zhu.S. Huang. Samet. and pervasive computing.-C. 2006. Hambrusch.” Pervasive Computing. Ghinita. vol.” Proc. 2007. 1. Teo. [16] G. 2004. “Preserving Privacy in Environments with Location-Based Applications.G. “Prive: Anonymous Location-Based Queries in Distributed Mobile Systems. journals. “Continuous k-Nearest Neighbor Queries for Continuously Moving Points with Updates. Very Large Data Bases (VLDB). “Preventing Location-Based Identity Inference in Anonymous Spatial Queries. He has published more than 70 technical papers in these areas. ACM SIGMOD.C. 2003. Shen. Skiadopoulos. location-based services. Friday. IEEE Int’l Conf. Jensen.” GeoInfomatica. Int’l Conf. 2007. [34] J.S. W. Knowledge and Data Eng. and W.” Proc. 10. Grunwald. S. Kalashnikov. Smith. [35] S. 2005. pp. Chow. pp. “SEA-CNN: Scalable Processing of Continuous k-Nearest Neighbor Queries in SpatioTemporal Databases. no. He was the founding conference chair for the International Conference on Mobile Data Management and served as the chairman of the ACM Hong Kong Chapter in 1997. Aref. Lopez. Vincent.F. and K. Xu. pp. Tao. Y. Koudas.T. Davies.” Proc. Shahabi. Jianliang Xu received the BEng degree in computer science and engineering from Zhejiang University. D. [26] P.J.” Proc. His research interests include mobile and wireless data management. Kido. pp. [44] J. ACM SIGMOD.” Proc.E. pp. pp. Prior to this. His research interests include information retrieval.F. ACM SIGMOD. Hellerstein. 2003.S. Guttman. wireless sensor networks. and D. 2007. Hangzhou. and J. and privacy-aware computing. [48] J./Apr. and N. [32] M. Satoh. ACM SIGMOD.2010 at 09:47:07 UTC from IEEE Xplore. Lin. (WWW ’07). 2005. and D. Spatial and Temporal Databases (SSTD). 2. Skiadopoulos.” Proc.F. [36] K. C. [39] Y. Spatial and Temporal Databases (SSTD).M. Liu. vol. 2003. Mouratidis. M. Gruteser and D. 51. vol. no. 763774. [17] G. “Continuous Nearest Neighbor Search. Int’l Conf. “Query Indexing and Velocity Constrained Indexing: Scalable Techniques for Continuous Queries on Moving Objects. Data Eng. Xia. Zhang. [31] M. 2000. Papadias. Haibo Hu received the PhD degree in computer science from the Hong Kong University of Science and Technology (HKUST) in 2005.Q. Hong Kong University of Science and Technology. Khoshgozaran and C. Faloutsos.” IEEE Trans.-L. P. “Performance Analysis of LocationDependent Cache Invalidation Schemes for Mobile Environments. pp. he held several research and teaching posts at HKUST and HKBU. “A Generic Framework for Monitoring Continuous Spatial Queries over Moving Objects.” Proc. and F. (ICDE ’08). 2004. W. He currently serves as a vice chairman of ACM Hong Kong Chapter.” Distributed Parallel Databases. Xiong. most of which appeared in prestigious journals and conference proceedings. Jensen. A. “LocationBased Spatial Queries.” Proc. Pu. ACM SIGMOD. “An Anonymous Communication Technique Using Dummies for Location-Based Services. “R-Trees: A Dynamic Index Structure for Spatial Searching. “Distance Browsing in Spatial Databases. You. Samet. 474-488. mobile/ pervasive computing. (ICDE). and S. X. Int’l Conf. 1-18. [38] S. Sun. Ohio State University. Computers. 2008. Second Int’l Conf. Yanagisawa. Kalashnikov. 2002. Xu. 2004. Samet. Yiu. W.L. Mar.

You're Reading a Free Preview

Download
scribd
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->