You are on page 1of 11

20178 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO.

20, 15 OCTOBER 2022

PPAQ: Privacy-Preserving Aggregate Queries for


Optimal Location Selection in Road Networks
Songnian Zhang , Suprio Ray , Member, IEEE, Rongxing Lu , Fellow, IEEE,
Yandong Zheng , Yunguo Guan , and Jun Shao , Senior Member, IEEE

Abstract—Aggregate nearest neighbor (ANN) query, which can


find an optimal location with the smallest aggregate distance
to a group of query users’ locations, has received considerable
attention and been practically useful in many real-world location-
based applications. Nevertheless, query users still hesitate to use
these applications due to privacy concerns, as there is a worri-
some that the location-based service (LBS) providers may abuse
their locations after collecting them. In this article, to tackle this
issue, we propose a novel privacy-preserving aggregate query
(PPAQ) scheme to select an optimal location for query users in
road networks. Specifically, we first analyze the problem of the Fig. 1. Example of ANN query in a road network.
ANN query in road networks and identify two basic operations,
i.e., addition and comparison, in the query. Then, we carefully
design efficient addition and comparison circuits to securely add {v1 , v2 , . . . , v5 } and three POIs {p1 , p2 , p3 }. Assume a group
and compare two bit-based inputs, respectively. With these two of colleagues want to find a restaurant, which should have the
secure circuits, we propose our PPAQ scheme, which can simul-
taneously protect the users’ locations, query results, and access minimum aggregated distance to all group members, to have
patterns from leaking. Detailed security analysis shows that our dinner on the weekend. In Fig. 1, we take two query users as
proposed scheme is indeed privacy-preserving. In addition, exten- an example {q1 , q2 }, and they can launch the ANN query by
sive performance evaluations are conducted, and the results separately sending their own locations to the service provider.
indicate that our proposed scheme has an acceptable efficiency After receiving locations, the service provider performs the
for non-real-time applications.
ANN query to select the optimal restaurant and distribute it
Index Terms—Aggregate queries, location-based service (LBS), to all colleagues in the group.
optimal location, privacy preservation, road networks. In the above example, if the group of colleagues want to
retrieve a restaurant with the minimum total distance, the ANN
query will return p2 as the optimal point since it has the min-
imum sum distance. While if they hope to find a restaurant
I. I NTRODUCTION minimizing the maximum distance for every member, p1 will
GGREGATE nearest neighbor (ANN) query is one of the
A classic problems due to its quite wide range of applica-
tions [1]–[8]. Especially in road networks [5]–[8], ANN query
be selected. See Section III-A for the formal definition of
the ANN queries in a road network. Hereinafter, we will use
“location” and “point” interchangeably.
has huge practical significance and demand for location-based However, since the query users need to provide their loca-
service (LBS) providers to select the optimal point in road tions to the service providers when enjoying the ANN query
networks for a group of users. A representative example of an services, they may hesitate to use such services due to privacy
ANN query is illustrated as follows. concerns. As reported in [9], the disclosure of locations indeed
Example 1 (Motivation): An LBS provider, e.g., Yelp, who incurs serious personal safety threats. Therefore, it is impera-
owns the road network information, e.g., locations of junctions tive to preserve the location privacy of users when populating
and points of interest (POI), offers the ANN query services the ANN query services in road networks. Unfortunately, many
for users. Fig. 1 shows a road network with five junctions previously reported works [5]–[8] on ANN queries in road
networks focused on improving the query performance over
Manuscript received 22 March 2022; accepted 28 April 2022. Date of pub- plaintexts and did not consider the privacy issues. Though
lication 11 May 2022; date of current version 7 October 2022. This work was
supported in part by the NSERC Discovery under Grant 04009, Grant 03787, the works in [10]–[12] exploited the problem of privacy-
and Grant RGPIN-2022-03244. (Corresponding author: Rongxing Lu.) preserving ANN queries, there still exist privacy issues in
Songnian Zhang, Suprio Ray, Rongxing Lu, Yandong Zheng, and these techniques. In [10], a cloak-based technique is adopted
Yunguo Guan are with the Faculty of Computer Science, University of New
Brunswick, Fredericton, NB E3B 5A3, Canada (e-mail: szhang17@unb.ca; to protect users’ locations, and the service providers return a
sray@unb.ca; rlu1@unb.ca; yzheng8@unb.ca; yguan4@unb.ca). super-set of the query result to the query users. It is clear that
Jun Shao is with the School of Computer and Information Engineering, the approach cannot fully protect the locations of users and
Zhejiang Gongshang University, Hangzhou 310018, China (e-mail:
chn.junshao@gmail.com). query results since service providers know their approximate
Digital Object Identifier 10.1109/JIOT.2022.3174184 locations. Regarding [11], the query users can perform group
2327-4662 
c 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: PPAQ: PRIVACY-PRESERVING AGGREGATE QUERIES 20179

kNN queries by using the secure multiparty computation tech-


nique. Although it can protect users’ locations, this approach
leaks the query result to service providers. Recently, the work
in [12] proposed a privacy-preserving ANN query scheme
by combining the dummy technique and Paillier encryption.
However, when a query user launches multiple queries in a
fixed location, e.g., at home, the service provider may infer or
narrow down the range of the user’s location by intersecting
the location lists that include the dummy location and user’s
actual location.
In this article, we aim to propose a fully privacy-preserving
ANN query (PPAQ) scheme in road networks, which can Fig. 2. System model under consideration.
simultaneously preserve the privacy of users’ locations, the
query result, and access patterns [13] while returning the exact
query result. Note that the privacy of query results and access scheme, and the results show that our PPAQ scheme has
patterns are also crucial since leaking them is equivalent to an acceptable efficiency for non-real-time applications.
leaking users’ locations if the service providers know the The remainder of this article is organized as follows. In
selected location or which POI is selected as the optimal loca- Section II, we introduce our system model, security model, and
tion. In addition, our proposed scheme should guarantee the design goal. Then, we review our preliminaries in Section III.
privacy of the user’s location no matter how many times the After that, we present our proposed scheme in Section IV,
user queries at the same location. To achieve them, we employ followed by security analysis and performance evaluation in
a lightweight symmetric homomorphic encryption (SHE) [14] Sections V and VI, respectively. Finally, we discuss some
scheme as the cryptographic primitive to encrypt users’ loca- related works in Section VII and draw our conclusion in
tions. It is worth noting that SHE will lower the data utility, Section VIII.
especially for the operations of comparison and equality tests.
Although a two-server model can easily achieve these oper- II. M ODELS AND D ESIGN G OAL
ations over encrypted data [15]–[17], it is still challenging
to practically implement them over a single-server model. To In this section, we formalize our system model, security
tackle it, we carefully devise a bit-based secure comparison model, and identify our design goal.
scheme to obtain the order relation of two values. Meanwhile,
a novel matching approach is also proposed to select the equal A. System Model
value in a given set with the encrypted value without leaking In our system model, we consider a typical and practical
the selected value. Furthermore, our proposed scheme can hide client-server model, which mainly consists of two types of
access patterns in a single server, which prevents the service entities: 1) an LBSs provider LSP and 2) a group of users
provider from knowing the selected POI. Specifically, the main U = {u1 , u2 , . . . , um }, as shown in Fig. 2.
contribution of this article is three-fold as follows. LBSs Provider LSP: In our system model, LSP is a
1) We analyze the problem of ANN queries in road real-world service provider company, e.g., Yelp, who can pro-
networks and identify two basic operations: a) addi- vide LBSs to users. LSP deploys servers to interact with its
tion and b) comparison. Correspondingly, we propose client apps that have been installed in users’ mobile phones.
secure addition and secure comparison circuits to respec- Meanwhile, LSP is equipped with enriched spatial databases,
tively perform these two operations over ciphertexts including road networks and business location information, for
in a single-server model. Note that, when designing example, the locations of restaurants.
the secure comparison circuit, we also extend the SHE Users U = {u1 , u2 , . . . , um }: In our system, some users at
scheme to support dividing a value by 2 over encrypted different locations, who want to obtain the common optimal
data, which significantly makes SHE more practical. location of interest, initially form a virtual group. In Fig. 2, we
2) Based on the above secure circuits, we propose our assume there are m users, i.e., m ≥ 2, in a group. The users
privacy-preserving aggregate query (PPAQ) scheme to in the group can provide their locations to LSP through the
find the optimal location in road networks while installed app (here, we assume all users have installed LSP’s
preserving the privacy of users’ locations, query results, app and have registered themselves). With the received loca-
and access patterns simultaneously. To the best of our tions, LSP can perform the ANN queries in road networks [5]
knowledge, we are the first to consider the fully privacy- to find the optimal location and return it to the group of users.
preserving ANN queries in road networks to find the For the ease of description, hereafter, we denote u1 as an
optimal location. initiator in the group, who is responsible for sending a task
3) We formally analyze the security of our proposed request to LSP, receiving the selected optimal location from
scheme and demonstrate that our PPAQ scheme can LSP, and distributing the location to other users in the group.
indeed achieve our privacy requirements. Furthermore, Specifically, u1 first sends a task generation request to LSP,
we conduct extensive experiments to evaluate the including the identity information (id) of other users. Upon
performance of our proposed secure circuits and PPAQ receiving the query, LSP generates a unique task id tid for

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
20180 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 20, 15 OCTOBER 2022

the request and disseminates it to all users in the group. Each into the ANN query, we first define an aggregate network
user then responds his/her location together with the received distance function.
tid to LSP. After receiving all users’ responses, LSP chooses Definition 1 (Aggregate Network Distance Function):
an optional meeting location and forwards it to the initiator Given a set of query points Q = {q1 , q2 , . . . , qm } and a point
u1 . After receiving the optimal meeting location, ui forwards p in a road network, the aggregate network distance function
it to other group users. F is monotonically increasing and can compute the aggregate
distance from p to Q
B. Security Model
F(p, Q) = F(d(q1 , p), d(q2 , p), . . . , d(qm , p)) (1)
In our security model, we consider all users U to be honest,
i.e., they will honestly report location information and task id where d(qt , p), t ∈ [1, m], is the shortest distance between p
to LSP. For the initiator u1 , he/she honestly distributes the and qt along a given road network.
received optimal location to other users. This is reasonable Generally, F is comprised of two basic functions sum and
since all users expect to correctly obtain an optimal location max [5], [7], in which sum indicates the total distance from p
and have no motivation to deviate from the purpose. However, to all query points Q, while max means the maximum distance
due to the sensitive nature of personal data, users do not want between p and Q.
LSP to know their locations or trajectories. For LSP, we Definition 2 (ANN Queries in Road Networks): Given Q
assume it is semi-honest [18], i.e., LSP will faithfully fol- and a set of data points P = {p1 , p2 , . . . , pn } in a road
low the protocol to select the optimal location (performing network, the ANN queries can return an optimal point po ∈ P
ANN queries) for users but may be curious to learn the loca- that has a minimal network distance, i.e., there does not exist
tion information of users. From our system model, we know pi ∈ P/{po } such that F(pi , Q) < F(po , Q).
that LSP receives the report locations from users and finds
the optimal location, which reveals the potential location of
B. Symmetric Homomorphic Encryption
users. As a result, in this work, the privacy threats are from
semi-honest LSP, who may attempt to obtain: 1) query users’ SHE is a lightweight symmetric homomorphic encryption
location information and 2) the selected optimal location, that can support homomorphic addition and multiplication. It
in the process of performing ANN queries. Note that, since was first proposed in [19] and proved to be IND-CPA secure
this work mainly focuses on secure computation techniques, in [14]. Concretely, SHE consists of three algorithms, namely:
other active attacks, e.g., data poisoning and modification, are 1) key generation KeyGen(); 2) encryption Enc(); and 3)
beyond the scope of this article and will be discussed in our decryption Dec(), as follows.
future work. 1) KeyGen(k0 , k1 , k2 ): Given three security parameters
{k0 , k1 , k2 } satisfying k1  k2 < k0 , the algorithm first
C. Design Goal chooses two large prime numbers p and q with |p| =
|q| = k0 and sets N = pq. Then, it generates the secret
Under the above system model and security model, we aim
key sk = (p, L), where L is a random number with
to design a privacy-preserving ANN query scheme for select-
|L| = k2 , and the public parameter pp = (k0 , k1 , k2 , N ).
ing the optimal location in road networks. In particular, the
Besides, the algorithm sets the basic message space
following objectives should be attained.
M = [ − 2k1 −1 , 2k1 −1 ).
1) Preserving Users’ Location Privacy: The basic require-
2) Enc(sk, m): On input of a secret key sk and a message
ment of our proposed scheme is to preserve the location
m ∈ M, the encryption algorithm outputs the ciphertext
privacy of users, i.e., the location information of users
E(m) = (rL + m)(1 + r p) mod N , where r ∈ {0, 1}k2
should be always kept secret from LSP.
and r ∈ {0, 1}k0 are random numbers.
2) Preserving Optimal Location Privacy: The selected
3) Dec(sk, E(m)): Taking the secret key sk and a ciphertext
optimal location should also be kept secret from LSP,
E(m) as inputs, the decryption algorithm recovers a mes-
which implies that we need to hide access patterns in
sage m = (E(m) mod p) mod L = (rL+m) mod L. If
the process of finding the optimal location, i.e., LSP
m < (L/2), it indicates m ≥ 0 and m = m . Otherwise,
does not know which location has been selected as the
m < 0 and m = m − L.
optimal one.
SHE satisfies the homomorphic addition and multiplication
properties as follows.
III. P RELIMINARIES
1) Homomorphic Addition-I: E(m1 ) + E(m2 ) mod N →
In this section, we first state the problem of the ANN queries E(m1 + m2 ).
in road networks. Then, we recall an SHE scheme, which 2) Homomorphic Multiplication-I: E(m1 ) · E(m2 )
serves as the building block of our proposed scheme. mod N → E(m1 · m2 ).
3) Homomorphic Addition-II: E(m1 ) + m2 mod N →
A. Aggregate Nearest Neighbor Queries in Road Networks E(m1 + m2 ).
The ANN query in road networks is a classic problem and 4) Homomorphic Multiplication-II: E(m1 ) · m2 mod N →
has been used to find the optimal location or point of interest E(m1 · m2 ) when m2 > 0.
in road networks, which minimizes an aggregate distance func- SHE Encryption Under Public Key Setting: In order to real-
tion with respect to a set of query points [5]. Before delving ize the SHE encryption under public key setting, we take

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: PPAQ: PRIVACY-PRESERVING AGGREGATE QUERIES 20181

sk = (p, L) as the private key, and use sk to generate two


ciphertexts E(0)1 , E(0)2 of 0 with different random numbers,
and set the public key pk = {E(0)1 , E(0)2 , pp}. In such a way,
one can use the above homomorphic properties to encrypt a
message m ∈ M by the following:
E(m) = m + r1 · E(0)1 + r2 · E(0)2 mod N (2)
Fig. 3. Truth table of addition, in which Carry is the current carry bit, o is
where r1 , r2 ∈ {0, 1}k2 are two random numbers. Note that the the output of (a + b + Carry mod 2), and ĉ is the new carry bit.
encryption in (2) was proven as IND-CPA secure [20].

IV. O UR P ROPOSED PPAQ S CHEME From the above analysis, we can simplify the problem of
In this section, we first analyze the problem of ANN query ANN queries in road networks by converting the distance cal-
over plaintexts and extract ANN query’s two basic operations. culation between P and Q into the distance between V and
Then, we design secure circuits as building blocks to deal with P, which can be precomputed by LSP. Furthermore, from (4)
the two basic operations over ciphertexts. After that, based on and (5), we can see that the ANN queries essentially have two
these building blocks, we present our PPAQ scheme. basic operations: 1) addition and 2) comparison. Accordingly,
we need to perform these two operations over ciphertexts if we
A. Analyzing ANN Queries in Road Network apply SHE to encrypt users’ locations. Note that, although it is
A typical road network is usually abstracted as road seg- easy to have the addition operation over ciphertexts by using
ments (edges) and junctions (vertices) [5], as shown in Fig. 1. the homomorphic addition property, it is not easy to compare
Assume that there are k vertices in a road network, denoted values over their ciphertexts in a single-server model. This
as V = {v1 , v2 , . . . , vk } and n POI P = {p1 , p2 , . . . , pn } that is because the operator is expected to obtain the comparison
lie on edges. Given m query points Q = {q1 , q2 , . . . , qm } that result without knowing any information about the underlying
also lie on edges, we have the following network distances: plaintexts. Although the arithmetic operations (multiplication
for each pi , i = 1, 2, . . . , n: and addition) can be performed on ciphertexts due to the
      homomorphic properties of SHE, the operator cannot directly
F(pi , Q) = F d(qt , pi ) = d qt , v̂t + d v̂t , pi |m t=1 (3) use them to compare two ciphertexts and get the comparison
result in a single server. As a result, to tackle this challenge, we
where v̂t ∈ V is one of the vertices lying on the same edge as
carefully devise two novel bit-based secure circuit schemes to
the query point qt ∈ Q. For the simplicity sake, we assume
perform addition and comparison operations over ciphertexts.
that v̂t ∈ V is the closest neighbor vertex to qt in the road
network. If we let F be the sum function, the ANN query will
return the point in P that has the minimum total distance B. Secure Addition and Comparison Circuits
arg min (F(pi , Q)) In order to securely compare two integers without leaking
i∈[1,n]
 m  them and the result, we design a secure comparison scheme
     based on the encrypted bit sequences. Since we employ SHE
= arg min d qt , v̂t + d v̂t , pi
i∈[1,n] to encrypt each bit, and only the addition and multiplication
 t=1  (homomorphic) operations are used in our comparison scheme,
 m
  m
 
= arg min d qt , v̂t + d v̂t , pi we denote it as secure comparison circuit. Correspondingly, to
i∈[1,n] keep consistency, we also design a secure addition circuit to
t=1  t=1 achieve the addition operation over encrypted bit sequences.
m
 
⇔ arg min d v̂t , pi . (4) In the following, we first introduce the secure addition circuit,
i∈[1,n] and then present the secure comparison circuit.
t=1
Meanwhile, since LSP usually holds the road network, i.e., 1) Secure Addition Circuit: Assume there are two non-
the location of nodes V and data points P, LSP can precom- negative integers {x, y} of the same bit length, and the
 corresponding encrypted bit sequences are E(x) = (E(x j )|0j=l )
pute m t=1 v̂t , pi ) and quickly obtain the optimal point if it
d(
knows the neighbor node of the query points. and E(y) = (E(y j )|0j=l ), where x j , y j ∈ {0, 1}, j = l, . . . , 1, 0,
On the other hand, if F is the max function, the ANN query and l is the most significant bit position of inputs. The secure
can return the point with the following distance: addition circuit is to compute E(z) = (E(z j )|0j=l ) by operating
E(x) and E(y) to satisfy z = x + y, where z is an integer and
arg min (F(pi , Q)) E(z) is its corresponding encrypted bit sequence. For exam-
i∈[1,n]
 ple, if we set x = 5, y = 7, E(5) = (E(0), E(1), E(0), E(1))
    
= arg min arg max d qt , v̂t + d v̂t , pi . (5) and E(7) = (E(0), E(1), E(1), E(1)), then we can compute
i∈[1,n] t∈[1,m]
z = 5 + 7 = 12 and E(12) = (E(1), E(1), E(0), E(0)).
Similarly, d(v̂t , pi ) can be precomputed by LSP. If users send Fig. 3 illustrates the truth table of the addition operation, in
d(qt , v̂t ) and v̂t to LSP, it is simple for LSP to obtain the which o is the output of the current bit and ĉ is the new carry
maximum distance by matching vertex v̂t . bit. It is easy to obtain the logical expressions of o and ĉ as

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
20182 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 20, 15 OCTOBER 2022

follows:
¬
o= a ∧¬ b ∧ c ∨ ¬
a ∧ b ∧¬ c

∨ a ∧¬ b ∧¬ c ∨ (a ∧ b ∧ c)
= a + b + c − 2(ab + ac + bc) + 4abc
¬
ĉ = a ∧ b ∧ c ∨ a ∧¬ b ∧ c

∨ a ∧ b ∧¬ c ∨ (a ∧ b ∧ c)
= ab + ac + bc − 2abc.
Fig. 4. Relations of α j , β j , and d j with an example of x = 5, y = 7.
According to the above logical expressions, we have E(z) =
(E(zl ), . . . , E(z j ), . . . , E(z1 ), E(z0 )), for each j = l, . . . , 1, 0
Then, over the ciphertexts E(x) = (E(x j )|0j=l ) and E(y) =
   
E z j = E x j + y j + ĉ j (E(y j )|0j=l )}, we have the output of each bit ρ j as follows:
 
+ E(−1) · E(2) · E x j y j + x j ĉ j + y j ĉ j   j    
    Eα  + E β j · E ρ j−1 , j > 0
+ E(4) · E x j y j ĉ j E ρj = (9)
(6) E αj , j=0
where where j = l, . . . , 1, 0. Finally, the secure comparison circuit
⎧  j−1 j−1  produces E(δ) = E(ρ l ) as the output.
 j ⎨ E x y + x j−1 ĉ j−1
 + y j−1 ĉ j−1 
Correctness: We say our secure comparison circuit is correct
E ĉ = + E(−1) · E(2) · E x j−1 y j−1 ĉ j−1 , j > 0 (7)
⎩ if it outputs E(δ) = E(ρ l ) = E(1) when x > y, and outputs
E(0), j = 0.
E(δ) = E(ρ l ) = E(0) otherwise.
Next, we depict the details about the process of E(12) = Proof: When x > y, it means ∃j ∈ [0, l], d j = 1 and

E(5) + E(7) with our secure addition circuit. First of all, since d = 0, for all j = l, . . . , j + 1. In this case, E(ρ j ) = E(1) +
j

ĉ0 = 0, we have z0 = x0 + y0 − 2x0 y0 = 0. Then, for j = 1, E(0) · E(ρ j−1 ) = E(1). Since d j = 0 for j = l, . . . , j + 1,
 
we can obtain we have E(α j ) = E(0) and E(β j ) = E(1), leading to
 
E(ρ j ) = E(ρ j −1 ). As a result, E(ρ l ) = E(ρ j ) = E(1).
ĉ1 = x0 y0 + x0 ĉ0 + y0 ĉ0 − 2x0 y0 ĉ0 = 1; Similarly, when x < y, we have E(ρ l ) = E(ρ j ) = E(0).
z1 = x1 + y1 + ĉ1 − 2(x1 y1 + x1 ĉ1 + y1 ĉ1 ) + 4x1 y1 ĉ1 When x = y, it means ∀j ∈ [0, l], d j = 0, leading to E(α j ) =
E(0) and E(β j ) = E(1). Therefore, E(ρ l ) = E(ρ 0 ) =
= 2 − 2 = 0.
E(α 0 ) = E(0).
Repeatedly, with (6) and (7), we can calculate ĉ2 = 1, Nevertheless, for obtaining E(α j ), we still need to further
z2 = 1 and ĉ3 = 1, z3 = 1. Finally, we have (E(z j )|0j=3 ) = compute E(d j ·(d j +1))/E(2). To tackle the challenge, we can
choose an odd random number L in SHE, then it is easy for us
(E(1), E(1), E(0), E(0)) = E(12).
to find an inverse element E() satisfying E()·E(2) = E(1).
2) Secure Comparison Circuit: Given E(x) = (E(x j )|0j=l )
Theorem 1: In the KeyGen(k0 , k1 , k2 ) algorithm of SHE,
and E(y) = (E(y j )|0j=l )}, the secure comparison circuit is used
when an odd random number L is chosen, i.e., gcd(2, L) = 1,
to determine whether the corresponding integers x, y satisfy
we can set  = [(1 + L)/2] to satisfy E() · E(2) = E(1).
x > y without leaking x, y, and the comparison result. If yes,
Proof: Since mod L is applied in the SHE decryption
it outputs E(δ) = E(1), otherwise E(δ) = E(0).
algorithm, it is easy to see E() · E(2) = E( · 2) = E([(1 +
The main idea over the plaintexts is to check whether there
L)/2] · 2) = E(1 + L) = E(1). The correctness follows.
exists the most significant differing bit position j such that
  According to Theorem 1, we can obtain E(α j ) by computing
x j > y j and x j = y j for all j = l, . . . , j+1. If yes, then x > y;
E(α j ) = E(d j · (d j + 1)) · E().
otherwise x ≤ y. For j = l, . . . , 1, 0, we can compute d j =
In Fig. 4, we also demonstrate an example of comparing
x j − y j , then there are three cases for d j , i.e., d j = 1, 0, −1.
x = 5 and y = 7. Since x < y, the result δ = ρ 3 = 0.
When d j = 1, or −1, we can directly obtain the output from
Note that, in the above analysis, we only show that our secure
the difference of the jth bit. While when d j = 0, we have to
comparison circuit can determine whether x > y. In fact, we
recursively use the (j − 1)th bit to determine the order relation
can also determine any order relations of two encrypted bit
until j = 0. When the operations work on the ciphertexts, since
sequences by constructing a α-function
we cannot determine whether E(d j ) is E(1), E(0), or E(−1),    j  
we have to go through all j = l, . . . , 1, 0.  j dj dj + 1 d − 1 dj + 1
α = fα d = λ 1
j
+ λ2
For the ease of description, we denote the current jth bit
  2 −1
value as α j and the transitive bit value from j to j − 1 as β j . dj dj − 1
Since we consider x > y, we have the relation of α j , β j , and + λ3 . (10)
2
d j shown in Fig. 4, where
 j  j  j  The correctness of α-function can be easily verified. In Fig. 4,
α = d · d + 1 /2 if the ith row of Table (a) is 1, we can set the ith term of
(8)
β j = 1 − d j · d j. fα (d j ) is 1, and others are 0. For example, if we want to

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: PPAQ: PRIVACY-PRESERVING AGGREGATE QUERIES 20183

determine x ≥ y, we should make λ1 = λ2 = 1 and λ3 = 0. vs in the k tuples that satisfies vs = vt with encrypted E(vt ),
For determining whether x > y, we can set λ1 = 1 and λ2 = and has no idea about which vertex in V is the same as vt .
λ3 = 0. Thus, α j = fα (d j ) = ([d j (d j + 1)]/2). To address it, we adopt the following matching approach.
1) First, LSP calculates a flag fs for each vs as follows:
C. Description of PPAQ 
l
j
In this section, we present our PPAQ scheme, which is E(fs ) = vs  E(vt ) = vsj  E v̂t (11)
comprised of four phases: 1) system initialization; 2) location j=0
report; 3) optimal location selection; and 4) location recov- j j
ery. Before delving into the details, we first assume LSP where the operation vs  E(v̂t ) is defined as

holds a road network with k vertices V = {v1 , v2 , . . . , vk } ⎨ E 1 − v̂tj , vsj = 0
and n POI P = {p1 , p2 , . . . , pn }, and there are m users in the j
vs  E v̂t =
j
(12)
group, where k, n, m ≥ 2. We say vs ∈ V and pi ∈ P, where ⎩ E v̂tj , j
vs = 1.
s ∈ [1, k] and i ∈ [1, n], are unique integers encoded by LSP.
The corresponding binary formats are denoted as vs and pi . It means that only if vs is the same as vt , i.e., for each
j j
j j
Besides, we say E(vs ) = (E(vs )|0j=l ) and E(pi ) = (E(pi )|0j=l ), bit vs = v̂t , the corresponding flag fs is 1, otherwise it
where j = l, . . . , 1, 0, indicate that the binary is encrypted bit is 0.
by bit, and l is the common maximum bit length. 2) Then, with the k tuples (vs , σsi , pi  for s = 1, . . . , k)
1) System Initialization: In our scheme, the initiator u1 ini- and the corresponding fs , LSP computes E(σti ) =
j
tializes a privacy-preserving aggregate query. First, u1 obtains (E(σti )|0j=l )
the user ids of other users in the group and sends a task  k 
k 
generation request with these ids to LSP. Then, LSP ran- E σti =
j j
E(fs ) · E σsi = E
j
fs · σsi (13)
domly generates a task id tid and disseminates it to users s=1 s=1
U according to the received user ids. Next, given security
j j
parameters (k0 , k1 , k2 ), u1 generates the sk = (p, L), where in which LSP can first encrypt σsi into E(σsi )
L is odd, and pp = (k0 , k1 , k2 , N ). Afterward, u1 calls with (2). Next, LSP can construct a two-element tuple
Enc(sk, m) to generate {E(0)1 , E(0)2 , E(−1), E()}, where E(σti ), pi  that indicates qt can reach pi via its neigh-
 = [(1 + L)/2]. Finally, u1 sets the public key pk = {E(0)1 , bor v̂t = vs with the shortest distance σti . Now, LSP
E(0)2 , E(−1), E(), pp}. totally holds m × n tuples: E(σti ), pi , where t ∈ [1, m]
2) Location Report: Assume that each user ut , t ∈ [1, m], and i ∈ [1, n].
has installed the client (LSP’s application) in his/her mobile Step 2 (Calculating Aggregate Values): If F is sum, LSP
phone. First, the client senses the location of ut , denoted as qt , computes the total distance from Q to pi . According to (4),
and calculates the distance from qt to its closest neighbor ver- LSP can only add up all users’ shortest distances m from their
tex v̂t . Since V is embedded into clients by LSP in advance, neighbor
 vertices v̂t to pi , i.e., E(σi ) = t=1 E(σti ) =
the client can easily compute θt = d(qt , v̂t ) with the sensed E( m t=1 σti ). It can be achieved by our secure addition cir-
qt . After that, θt is encoded into a bit sequence and encrypted cuit (Section IV-B1) with inputs E(σti ). Note that it is simple
j
into E(θt ) = (E(θt )|0j=l ) using (2). With the same encryption for LSP to set a proper l to guarantee that there is no overflow
j
approach, v̂t is encrypted into E(vt ) = (E(v̂t )|0j=l ). Finally, ut when using the secure addition circuit.
If F is max, LSP computes the maximum distance from
sends tid, E(θt ), E(vt ) to LSP.
Q to pi . First, LSP adds up E(θt ) and E(σti ) using the secure
3) Optimal Location Selection: Assume there are m users
addition circuit, i.e., E(τti ) = E(θt ) + E(σti ), in which the
in the group. LSP will receive m tuples tid, E(θt ), E(vt ),
inputs of the secure addition circuit are E(θt ) and E(σti ), and
where t ∈ [1, m]. Since LSP holds spatial databases, it
the output is E(τti ). Then, LSP determines the maximum
can precompute the shortest distances {σsi = d(vs , pi )|s ∈
value among E(τti ) for t = 1, 2, . . . , m. Take comparing E(τ1i )
[1, k], i ∈ [1, n]} over plaintexts using existing shortest path
and E(τ2i ) as an example. LSP employs our secure compar-
algorithms, such as Dijkstra [21] and HiTi [22], and encode
ison circuit (Section IV-B2) to determine whether τ1i > τ2i
{σsi , vs } into bit sequences {σsi , vs }, respectively. Afterward,
with inputs E(τ1i ) and E(τ2i ). If yes, our secure comparison
LSP can construct tuples vs , σsi , pi , where s ∈ [1, k], i ∈
circuit outputs E(δ) = E(1), otherwise E(δ) = E(0). Using the
[1, n] in advance. It is worth noting that both calculating short-
following equation, LSP can calculate the maximum value,
est path and encoding bit sequences can be achieved offline.
denoted as E(τmaxi ):
With the constructed and received tuples, LSP can perform
the following steps to find the optimal location. j j j j
E τmaxi = E(δ) · E τ1i − E τ2i + E τ2i . (14)
Step 1 (Matching): For each pi , if all vertices can reach
it, there will be k tuples related to pi : vs , σsi , pi  for s = If δ is 1, E(τmaxi ) = E(τ1i ). Otherwise, E(τmaxi ) = E(τ2i ). It
1, . . . , k. Consequently, there must exist a tuple in the k tuples means τmaxi is always the larger one. After that, LSP can con-
that has a joint vertex with the received vt . That is, ∃s ∈ [1, k] tinue to compare E(τmaxi ) and E(τ3i ) with the secure compar-
such that v̂t (a.k.a vt ) = vs (a.k.a vs ). It means qt can reach ison circuit. Repeating m − 1 times, LSP can obtain the max-
pi with the shortest distance θt + σsi = d(qt , v̂t ) + d(vs , pi ) imum distance from Q to pi : E(σi ) = arg maxt∈[1,m] (E(τti )).
passing the vertex vs that is also the neighbor vertex v̂t of qt . After calculating aggregating values, LSP constructs n
As a result, the problem is converted into: LSP needs to find tuples: E(σi ), pi  for i = 1, 2, . . . , n. Since E(σi ) is computed

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
20184 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 20, 15 OCTOBER 2022

Algorithm 1 Optimal Location Selection can achieve it by adding up all shortest


 distances from the
Input: Encrypted locations of query users, {tid, E(θt ), E(vt ) | t ∈ [1, m]}. matched vertices to pi , i.e., E(σi ) = m
t=1 E(σti ). Next, in the
Spatial database, {vs , σsi , pi  | s ∈ [1, k], i ∈ [1, n]}. last step, by repeatedly using our secure comparison circuit
Output: The selected optimal location, E(po ).
1: for s ← 1 to k do and (15), we can obtain E(po ), where po has the minimum
 j j
2: E(fs ) ← vs  E(vt ) = lj=0 vs  E(v̂t ); total distances. The correctness of our secure comparison cir-
k k
3: E(σti ) ← s=1 E(fs ) · E(σsi ) = E( s=1 (fs · σsi )); cuit (see details in Section IV-B) and (15) ensures that po is
4: if F is sum then
 m the optimal location when F is sum. From Algorithm 1, we
5: E(σi ) ← m t=1 E(σti ) = E( t=1 σti );  Sadd
6: else if F is max then know that the only difference between the sum function and
7: E(τti ) ← E(θt ) + E(σti );  Sadd the max function is the aggregating step, in which the maxi-
8: E(σi ) ← arg maxt∈[1,m] (E(τti ));  Scom and Eq. (14) mum distance is guaranteed by the correctness of our secure
9: E(po ) ← E(p1 ); E(σo ) ← E(σ1 ); comparison circuit. Similarly, we can prove that our PPAQ
10: for i ← 2 to n do  
11: E(δi ) ← Scom E(σo ), E(σi ) ;  Inputs are bit sequences scheme is correct when F is max.
12: E(σo ) ← E(δi ) · (E(σo ) − E(σi )) + E(σi );
13: E(po ) ← E(δi ) · (E(po ) − E(pi )) + E(pi );
V. S ECURITY A NALYSIS
14: return E(po );
In this section, we analyze the security of our proposed
scheme PPAQ. Following our design goals, we will prove that
using the secure addition circuit (sum) or the secure compar- our PPAQ scheme can preserve the privacy of users’ locations
ison circuit (max), it is in binary format E(σi ) here. and the query result, i.e., the selected optimal location. Since
Step 3 (Calculating Optimal Location): After obtaining a set LSP holds the plaintext information of road networks, it is
of distances E(σi ), pi , i ∈ [1, n], LSP can determine a POI essential for our PPAQ scheme to hide access patterns.
that has the minimum distance among these n distances, i.e., First, we briefly review the security model for securely
arg mini∈[1,n] (E(σi )). Specifically, LSP computes the mini- realizing an ideal functionality in the presence of the static
mum value among E(σi ) for i = 1, 2, . . . , n. Take comparing semi-honest adversary [23]. Since only LSP is semi-honest
E(σ1 ) and E(σ2 ) as an example. LSP employs our secure in our security model, our proof will focus on the service
comparison circuit to determine whether σ1 > σ2 with inputs provider LSP.
E(σ1 ) and E(σ2 ). If yes, our secure comparison circuit outputs Real World Model: The real world execution of a scheme
E(δ) = E(1), otherwise E(δ) = E(0). Then, LSP calculates takes place in LSP and adversary A, who corrupts LSP.
the minimum distance and the corresponding POI, denoted as Assume that x indicates the input of users’ locations, y repre-
E(σmin ) and E(pmin ), respectively sents the input held by LSP, and z is auxiliary input, e.g., the
 length of ciphertexts and bit sequence. With the inputs of x, y,
j j j j
E σmin = E(δ) · E σ2 − E σ1 + E σ1 and z, the execution of under A in the real world mode is
(15)
E(pmin ) = E(δ) · (E(p2 ) − E(p1 )) + E(p1 ) defined as
def
REAL ,A,z (x, y) = {Output (x, y), View (x, y), z}
where E(p1 ) and E(p2 ) are encrypted by LSP using (2). If
in which Output (x, y) is the output of LSP after an execution
δ is 1, E(σmin ) = E(σ2 ) and E(pmin ) = E(p2 ). Otherwise,
of on (x, y), and View (x, y) is the view of LSP during
E(σmin ) = E(σ1 ) and E(pmin ) = E(p1 ). It means that pmin
an execution of on (x, y).
always has the minimum distance σmin . After that, LSP can
Ideal World Model: In the ideal world execution, there is
continue to compare E(σmin ) and E(σ3 ) with the secure com-
an ideal functionality F for a function f , and LSP interacts
parison circuit. Repeating n − 1 times, LSP can obtain the
only with F. Here, the execution of f under simulator Sim in
optimal location E(po ) that has minimum distance cost E(σo )
the ideal world model on input pair (x, y) and auxiliary input
from Q to P, i.e., E(σo ) = arg mini∈[1,n] (E(σi )). Notably,
z is defined as
the parallelization technique can be used here to improve def
performance. Finally, LSP returns E(po ) to the initiator u1 . IDEALF ,Sim,z (x, y) = {f (x, y), Sim(x, f (x, y)), z}.
See Algorithm 1 for the complete process of optimal location Definition 3 (Security Against Semi-Honest Adversary): Let
selection, in which Sadd and Scom indicate our secure addition F be a deterministic functionality and be a scheme in LSP.
circuit and secure comparison circuit, respectively. We say that securely realizes F if there exists Sim of the
4) Location Recovery: Upon receiving E(po ), the initiator probabilistic polynomial time (PPT) transformations (where
u1 recovers the plaintext po with the secret key sk. Then, u1 Sim = Sim(A)) such that for semi-honest PPT adversary A,
distributes po to other group users via a secure channel. for x, y and z, for LSP holds
Correctness: We say our PPAQ scheme is correct if LSP c
REAL ,A,z (x, y) ≈ IDEALF ,Sim,z (x, y)
obtains the optimal location E(po ) with regard to F.
Proof: First of all, we prove that our PPAQ scheme is c
where ≈ compactly denotes computational
correct when F is the sum function. In the matching step, indistinguishability.
 we know that fs = 1 iff vs = vt . As a
according to (12),
result, E(σti ) = ks=1 E(fs ) · E(σsi ) must be the shortest dis-
A. PPAQ Scheme Is Privacy-Preserving
tance from qt to pi since only the matched vertex has fs = 1,
while others are 0. Then, in the aggregating step, since F is First, we use Definition 3 to demonstrate that our PPAQ
sum, we need to obtain the minimum total distances from all scheme can preserve the privacy of users’ locations and the
query locations to one POI location pi . According to (4), we optimal location.

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: PPAQ: PRIVACY-PRESERVING AGGREGATE QUERIES 20185

Theorem 2: The PPAQ scheme securely selects the optimal


location in the presence of semi-honest adversary A.
Proof: Here, we show how to construct the simulator.
Sim randomly chooses m tuples: tid , θt , vt , t ∈ [1, m]
j j
and po , where θt , vt ∈ {0, 1}, and j = l, . . . , 1, 0. Then,
Sim simulates A as follows: 1) it generates ciphertexts
tid , E(θt ), E(vt ) and E(po ) by the encryption algorithm
of SHE and 2) it outputs A’s view. In the real execution, (a) (b)
A receives m tuples: tid, E(θt ), E(vt ) and obtains E(po ),
whereas Sim gives tid , E(θt ), E(vt ) and E(po ) in the ideal Fig. 5. Average execution time of secure addition and secure comparison
execution. The semantic security of SHE [14] guarantees that circuits. (a) Varying with bit length l, k0 =8192; (b) Varying with k0 , l = 8.
the views of A in the real and the ideal worlds are indistin-
guishable. Here, we ignore tid and tid since both of them are
selection phase is the core part of our PPAQ scheme, and
randomly chosen.
other phases have negligible computational costs compared to
Since all operations are performed over ciphertexts, it is
the phase, we will focus on the performance of the optimal
impossible to break the intermediate results during the scheme
location selection phase in this section. Notably, the commu-
without the secret key. Therefore, there is no advantage for A
nication cost in our PPAQ scheme is evident and trivial. As a
to distinguish the real world and the ideal world.
result, we do not provide such evaluations.
From the above proof, we can see that our PPAQ scheme
Experimental Setting: We implemented our scheme in Java
can ensure the privacy of users’ locations {θt , vt |t ∈ [1, m]}
and executed it on a machine with 16-GB memory, 3.4-
and the optimal location po . As a result, our PPAQ scheme is
GHz Intel Core i7-3770 processors, and Ubuntu 16.04 OS.
privacy-preserving.
In our experiments, we adopt a synthetic data set to evalu-
ate our secure addition and secure comparison circuits and
B. PPAQ Scheme Hides Access Patterns two real-world data sets: Sequoia [24] used in [12] and [25]
Next, we show that our PPAQ scheme can preserve the and DIMACS [26] used in [8], to evaluate our PPAQ scheme.
privacy of access patterns. In real-world scenarios, since query users usually obtain the
Theorem 3: The PPAQ scheme can hide the information optimal location of a specific type of POI, e.g., restaurant, we
about which POI in P is selected as the optimal location. evaluate the POI of “bar” as the target points in Sequoia and
Proof: We prove Theorem 3 by demonstrating LSP does “cafe” in DIMACS, in which there are 150 bars and 100 cafes,
not know which point in P is the optimal location po though respectively. Since the location space in the Sequoia data set
LSP holds the plaintext information of the road network: V = has been normalized into a square space, we randomly choose
{v1 , v2 , . . . , vk } and P = {p1 , p2 , . . . , pn }. 200 vertices in the data set. For the DIMACS data set, we
First of all, LSP receives m tuples tid, E(θt ), E(vt ), t ∈ select 200 real locations of junctions in the New York City as
[1, m] and then calculates k flags E(fs ), s ∈ [1, k] for each vertices with the help of OpenStreetMap [27]. As the shortest
user with the plaintexts vs by (11). However, from (12), distances between vertices and POIs are calculated by LSP
we can see that E(fs ) are calculated from E(vt ) that are over plaintexts in advance, and how to calculate them is not the
kept secret from LSP. As a result, LSP has no idea about focus of this work, we employ the Euclidean distance between
E(fs ) and thus does not know E(σti ) even though σsi [calcu- vertices and POIs in our experiments.
lated by (13)] are in plaintext. Note that k ≥ 2. Next, with
E(σti ), LSP calculates E(σi ) under the aggregate distance A. Performance of Secure Addition and Comparison Circuits
function F.  Here, we take F = sum as an example, in which
m In this section, we evaluate the execution time of our
E(σi ) = m t=1 E(σ ti ) = E( t=1 σti ). Since it is achieved by proposed secure circuits over the synthetic data set, in which
our secure addition circuit that operates on SHE ciphertexts
we randomly generate 2000 integers for each bit length from 8
bit by bit, E(σi ) are kept secret from LSP due to the security
to 20, i.e., l ∈ [8, 20]. Recalling Section IV-B, the performance
of E(σti ). After that, LSP uses (15) to calculate E(pmin ) with
of our secure addition and secure comparison circuits is related
the minimum distance cost. Since all calculations in our secure
to the inputs’ bit length l and the key size k0 (see details in
comparison circuit are performed on encrypted bits, it ensures
the KeyGen() algorithm of SHE). As a result, we explore the
that LSP learns nothing about the inputs E(σi ) and the out-
impact of these two parameters in Fig. 5, where the secure
put E(δ). Therefore, LSP does not know which of the two
addition circuit and secure comparison circuit are denoted
locations to be compared is selected as E(pmin ). Thus, LSP
as Sec-Addition and Sec-Comparison, respectively. Fig. 5(a)
cannot know which location in P is selected as the optimal
depicts the average execution time of Sec-Addition and Sec-
location E(po ).
Comparison varying with the bit length l from 8 to 20. We
can see that the secure addition circuit is always more expen-
VI. P ERFORMANCE E VALUATION sive than the secure comparison circuit in execution time. By
In this section, we will first evaluate the performance of comparing (6) and (9), it is easy to see that our devised secure
our proposed secure addition and comparison circuits and comparison circuit is simpler than the secure addition cir-
then evaluate our PPAQ scheme. Since the optimal location cuit due to the fewer homomorphic operations. In Fig. 5(b),

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
20186 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 20, 15 OCTOBER 2022

(a) (b) (a) (b)

Fig. 6. Search time of our PPAQ scheme varying with the number of POIs Fig. 7. Search time of our PPAQ scheme varying with the number of query
n. (a) Over the Sequoia data set. (b) Over the DIMACS data set. users m. (a) Over the Sequoia data set. (b) Over the DIMACS data set.

we explore the impact of key size k0 from 2048 to 8192.


Similarly, the secure addition circuit has a higher execution
time for the same reason. Although our proposed secure cir-
cuits adopt the bit-based homomorphic operations to achieve
addition and comparison, both Fig. 5(a) and (b) show their
efficiency because they run at the millisecond level and have
no communication with other entities.
(a) (b)
B. Performance of Our PPAQ Scheme Fig. 8. Search time of our PPAQ scheme varying with the number of vertices
In this section, we evaluate the search time of our PPAQ k. (a) Over the Sequoia data set. (b) Over the DIMACS data set.
scheme over Sequoia and DIMACS data sets. Since the total
distance in both data sets is less than 216 , we set l = 16 in the
following experiments. In addition, for the security parameters 2) Impact of the Number of Query Users m: Fig. 7(a) and
of SHE, we let k0 = 8192, k1 = 20, and k2 = 80. Note that, (b) illustrate the search time varying with m over the
since SHE is a leveled homomorphic encryption scheme, and Sequoia and DIMACS data sets, respectively. We ran-
its maximum homomorphic multiplication depth is related to domly generate the locations of query users in the spaces
k0 , we can either adopt a bootstrapping protocol [14] between of these two data sets, set the ranges from 4 to 32,
LSP and u1 to refresh ciphertexts or enlarge k0 to support and fix k = 100 and n = 50. Similar to the impact
more homomorphic multiplication operations. For simplicity of n, both the figures show a linearly increasing tread.
sake, here we set k0 = 8192 and adopt the bootstrapping However, the difference of Fig. 7(a) is larger than that of
protocol to refresh ciphertexts. Fig. 7(b). That is because the max function needs to add
From Section IV-C, we know that the search time of our each user’s uploaded distance E(θt ) to the shortest path
PPAQ scheme is related to the number of POIs n, the number distance E(σti ), i.e., E(τti ) = E(θt ) + E(σti ). Therefore,
of query users m, and the number of vertices k. Accordingly, as m increases, so does the difference of the search time
Figs. 6–8 plot the search time of our PPAQ scheme varying between the sum and the max functions.
with n, m, and k, respectively. In particular, we compare the 3) Impact of the Number of Vertices k: Fig. 8 plots the
sum function and max function of the ANN queries, which search time of our PPAQ scheme varying with the num-
are denoted as ANN-Sum and ANN-Max in the following ber of vertices k from 40 to 200, in which Fig. 8(a)
evaluations. shows the search time over the Sequoia data set, while
1) Impact of the Number of POIs n: Since there are 150 Fig. 8(b) is for the DIMACS data set. Very differently,
bars in the Sequoia data set and 100 cafes in the the differences of the search time between the sum and
DIMACS data set, we vary n from 30 to 150 in Fig. 6(a) the max functions remain stable in Fig. 8(a) and (b). It is
and from 20 to 100 in Fig. 6(b). In both figures, we set due to that the number of vertices only affects the search
k = 100 and m = 16. We can see that the search time lin- time of the matching step (step 1). When performing the
early increases with the number of POIs in both figures. calculating aggregate values step (step 2), it has nothing
The reason is clear since our PPAQ scheme is to select to do with the number of vertices.
an optimal location from n POIs, and there is no other From Figs. 6–8, we can see that the search time linearly
impacts when we make m and k be fixed. In addition, increases with the corresponding parameter in all figures. In
we can see that the sum function requires slightly less real-world scenarios, the numbers of POIs and vertices are usu-
search time than the max function in our PPAQ scheme. ally fixed. As a result, our PPAQ scheme can achieve linear
That is because the max function needs to calculate sum complexity. Note that our PPAQ scheme is not designed for
value E(τti ) by using the secure addition circuit before real-time applications but for plan-ahead systems, in which
obtaining the maximum distance, which entails the extra the group of users launch the ANN queries in advance.
computational costs compared to the sum function. Therefore, it is acceptable for an application to select the

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: PPAQ: PRIVACY-PRESERVING AGGREGATE QUERIES 20187

TABLE I
P ERCENTAGE OF E XECUTION T IME OF S TEP 1 Existing privacy-preserving ANN query schemes mainly
concentrated on the Euclidean space [10]–[12]. Although their
privacy-preserving techniques can be used in the road network
scenario, there are privacy problems with these techniques.
Hashem et al. [10] converted each user’s location into a region,
and the service provider returns a super-set of the exact query
result. This approach not only increases the communication
cost but also has privacy issues in terms of users’ locations
optimal location with several minutes for ensuring the fully and query results since they are not fully protected. Ashouri-
privacy preservation. Talouki et al. [11] adopted a secure multiparty computation
Besides, we have observed that the matching step (Step 1) technique to make each user participate in the calculation
consumes the most of execution time in our PPAQ scheme, of the query results. However, this approach cannot pro-
and it is significantly affected by the number of vertices. To tect the privacy of query results against the service provider.
illustrate it, we list the percentage of execution time of Step 1 Recently, Wu et al. [12] presented a novel privacy-preserving
in Table I, where percentage = (execution time of Step 1/ scheme for ANN queries by integrating the dummy technique
[execution time of (Step 1 + Step 2 + Step 3)]). and Paillier encryption. However, this approach has a pri-
It is worth noting that the number of POIs n and query users vacy issue in terms of the users’ real locations when a query
m has little impact on the percentage of the execution time of user launches multiple queries in a fixed location. Note that
Step 1, i.e., their ratios remain stable (around 95%) when vary- although Yilmaz et al. [28] also proposed a privacy-preserving
ing n and m. In order to improve the efficiency, LSP can have optimal location selection scheme, their work is not for the
the following approaches to optimize Step 1: 1) adopt multiple ANN query of a group considered in this article. Moreover, it
threads and 2) select fewer vertices to calculate shortest path mainly focused on the private set intersection (PSI) technique
distances. The second approach may affect the accuracy of and did not protect the location privacy of the client.
the selected optimal location, and how to select vertices is an
interesting topic. We leave the research on these optimization VIII. C ONCLUSION
approaches as future work. In this article, we have proposed a PPAQ scheme in road
networks, which can preserve the privacy of users’ locations,
query results, and access patterns. Specifically, we first ana-
VII. R ELATED W ORK
lyzed the problem of the ANN queries in road networks
The ANN query has attracted considerable attention due over plaintexts. Then, we designed secure addition and secure
to its wide range of applications [1]–[8]. Among them, comparison circuits to securely perform addition and com-
the works in [1]–[4] focus on the Euclidean space, while parison operations in a single-server model without leaking
the others [5]–[8] pay their attention to road networks. operands and results. Based on these secure circuits, we
Papadias et al. [1], [2] proposed several algorithms to effi- proposed our PPAQ scheme in road networks by designing
ciently perform the ANN queries by assuming that the users’ a secure matching approach. Security analysis indicated that
locations fit in memory and POIs are indexed by an R-tree. our PPAQ scheme is indeed privacy preserving, and exten-
Lian and Chen [3] considered the ANN queries over the sive performance evaluations validated its efficiency. In our
uncertain database and integrated effective pruning methods future work, we will further exploit its efficiency and prac-
to facilitate reducing the search space of probabilistic ANN tically implement our proposed scheme, such as developing
queries. In 2010, Li et al. [4] designed efficient algorithms for the mobile application for users and ANN query services for
the group enclosing queries, which actually is a variant of the service providers.
ANN query by calculating the maximum distance instead of
minimum distance after applying aggregate functions. In the R EFERENCES
road network scenario, Yiu et al. [5] first studied the ANN
[1] D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, “Group nearest neigh-
queries in road networks and presented three algorithms to bor queries,” in Proc. 20th Int. Conf. Data Eng., 2004, pp. 301–312.
explore the network around the query points until the desired [2] D. Papadias, Y. Tao, K. Mouratidis, and C. K. Hui, “Aggregate near-
results are discovered. Zhu et al. [6] presented a voronoi-based est neighbor queries in spatial databases,” ACM Trans. Database Syst.,
vol. 30, no. 2, pp. 529–576, 2005.
approach to solve the ANN problem in road networks. The [3] X. Lian and L. Chen, “Probabilistic group nearest neighbor queries in
advantage of this work is that it does not need to retrieve uncertain databases,” IEEE Trans. Knowl. Data Eng., vol. 20, no. 6,
the network from disk and just utilizes the look-up tables pp. 809–824, Jun. 2008.
[4] F. Li, B. Yao, and P. Kumar, “Group enclosing queries,” IEEE Trans.
of the voronoi diagram generated in advance. Yan et al. [7] Knowl. Data Eng., vol. 23, no. 10, pp. 1526–1540, Oct. 2011.
adopted the ANN query to find the optimal meeting point [5] M. L. Yiu, N. Mamoulis, and D. Papadias, “Aggregate nearest neighbor
in road networks, which has the same scenario as our work. queries in road networks,” IEEE Trans. Knowl. Data Eng., vol. 17, no. 6,
pp. 820–833, Jun. 2005.
Recently, Yao et al. [8] studied the flexible ANN queries [6] L. Zhu, Y. Jing, W. Sun, D. Mao, and P. Liu, “Voronoi-based aggre-
in road networks and proposed a series of universal meth- gate nearest neighbor query processing in road networks,” in Proc. 18th
ods to solve this problem, e.g., the Dijkstra-based algorithm. SIGSPATIAL Int. Conf. Adv. Geograph. Inf. Syst., 2010, pp. 518–521.
[7] D. Yan, Z. Zhao, and W. Ng, “Efficient algorithms for finding optimal
However, all of the above works aimed to improve ANN query meeting point on road networks,” Proc. VLDB Endowment, vol. 4,
efficiency and did not consider the privacy issues. no. 11, pp. 968–979, 2011.

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.
20188 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 20, 15 OCTOBER 2022

[8] B. Yao, Z. Chen, X. Gao, S. Shang, S. Ma, and M. Guo, “Flexible Suprio Ray (Member, IEEE) received the Ph.D.
aggregate nearest neighbor queries in road networks,” in Proc. IEEE degree from the Department of Computer Science,
34th Int. Conf. Data Eng. (ICDE), 2018, pp. 761–772. University of Toronto, Toronto, ON, Canada, in
[9] Q. Zhao, C. Zuo, G. Pellegrino, and L. Zhiqiang, “Geo-locating drivers: 2015.
A study of sensitive data leakage in ride-hailing services,” in Proc. Annu. He is an Associate Professor with the Faculty of
Netw. Distrib. Syst. Security Symp., Feb. 2019, pp. 1–15. Computer Science, University of New Brunswick,
[10] T. Hashem, L. Kulik, and R. Zhang, “Privacy preserving group nearest Fredericton, NB, Canada. His research interests
neighbor queries,” in Proc. 13th Int. Conf. Extending Database Technol., include big data and database management systems,
2010, pp. 489–500. run-time systems for scalable data science, prove-
[11] M. Ashouri-Talouki, A. Baraani-Dastjerdi, and A. A. Selçuk, “GLP: A nance and privacy issues in big data, and query
cryptographic approach for group location privacy,” Comput. Commun., processing on modern hardware.
vol. 35, no. 12, pp. 1527–1533, 2012.
[12] Y. Wu et al., “Enhanced privacy preserving group nearest neighbor
search,” IEEE Trans. Knowl. Data Eng., vol. 33, no. 2, pp. 459–473, Rongxing Lu (Fellow, IEEE) received the Ph.D.
Feb. 2021. degree from the Department of Electrical and
[13] M. S. Islam, M. Kuzu, and M. Kantarcioglu, “Access pattern disclosure Computer Engineering, University of Waterloo,
on searchable encryption: Ramification, attack and mitigation,” in Proc. Waterloo, ON, Canada, in 2012.
NDSS, 2012, pp. 1–15. He is a Mastercard IoT Research Chair, a
[14] Y. Zheng, R. Lu, Y. Guan, J. Shao, and H. Zhu, “Efficient and privacy- University Research Scholar, an Associate Professor
preserving similarity range query over encrypted time series data,” with the Faculty of Computer Science (FCS),
IEEE Trans. Dependable Secure Comput., early access, Feb. 23, 2021, University of New Brunswick (UNB), Fredericton,
doi: 10.1109/TDSC.2021.3061611. NB, Canada. Before that, he worked as an
[15] Y. Elmehdwi, B. K. Samanthula, and W. Jiang, “Secure k-nearest neigh- Assistant Professor with the School of Electrical
bor query over encrypted data in outsourced environments,” in Proc. and Electronic Engineering, Nanyang Technological
IEEE 30th Int. Conf. Data Eng., 2014, pp. 664–675. University, Singapore, from April 2013 to August 2016. He worked as a
[16] J. Liu, J. Yang, L. Xiong, and J. Pei, “Secure skyline queries on cloud Postdoctoral Fellow with the University of Waterloo from May 2012 to April
platform,” in Proc. IEEE 33rd Int. Conf. Data Eng. (ICDE), 2017, 2013. His research interests include applied cryptography, privacy-enhancing
pp. 633–644. technologies, and IoT-big data security and privacy. He has published exten-
[17] J. Chen, L. Liu, R. Chen, W. Peng, and X. Huang, “SecRec: A privacy- sively in his areas of expertise.
preserving method for the context-aware recommendation system,” Dr. Lu was awarded the most prestigious Governor General’s Gold Medal
IEEE Trans. Dependable Secure Comput., early access, Jun. 7, 2021, from the University of Waterloo and won the 8th IEEE Communications
doi: 10.1109/TDSC.2021.3085562. Society (ComSoc) Asia–Pacific Outstanding Young Researcher Award, in
[18] J. Brickell and V. Shmatikov, “Privacy-preserving graph algorithms in 2013. He is the Winner of the 2016 and 2017 Excellence in Teaching
the semi-honest model,” in Proc. Int. Conf. Theory Appl. Cryptol. Inf. Award, FCS, UNB. He was the recipient of nine best (student) paper awards
Security, 2005, pp. 236–252. from some reputable journals and conferences. He currently serves as the
[19] H. Mahdikhani, R. Lu, Y. Zheng, J. Shao, and A. A. Ghorbani, Chair of IEEE ComSoc Communications and Information Security Technical
“Achieving O (log3 n) communication-efficient privacy-preserving range Committee and the Founding Co-Chair of IEEE TEMS Blockchain and
query in fog-based IoT,” IEEE Internet Things J., vol. 7, no. 6, Distributed Ledgers Technologies Technical Committee.
pp. 5220–5232, Jun. 2020.
[20] Y. Guan, R. Lu, Y. Zheng, S. Zhang, J. Shao, and G. Wei, “Toward
privacy-preserving Cybertwin-based Spatio-temporal keyword query for Yandong Zheng received the M.S. degree from
ITS in 6G era,” IEEE Internet Things J., vol. 8, no. 22, pp. 16243–16255, the Department of Computer Science, Beihang
Nov. 2021. University, Beijing, China, in 2017. She is cur-
[21] E. W. Dijkstra, “A note on two problems in connexion with graphs,” rently pursuing the Ph.D. degree with the Faculty of
Numerische Mathematik, vol. 1, no. 1, pp. 269–271, 1959. Computer Science, University of New Brunswick,
[22] S. Jung and S. Pramanik, “An efficient path computation model for Fredericton, NB, Canada.
hierarchically structured topographical road maps,” IEEE Trans. Knowl. Her research interest includes cloud computing
Data Eng., vol. 14, no. 5, pp. 1029–1046, Sep./Oct. 2002. security, big data privacy, and applied privacy.
[23] O. Goldreich, Foundations of Cryptography: Volume 2, Basic
Applications. Cambridge, U.K.: Cambridge Univ. Press, 2009.
[24] 2012, “Sequoia: Locations of California,” ChoroChronos. [Online].
Available: http://chorochronos.datastories.org/?q=node/58
[25] R. Paulet, M. G. Kaosar, X. Yi, and E. Bertino, “Privacy-preserving
and content-protecting location based queries,” IEEE Trans. Knowl. data Yunguo Guan is currently pursuing the Ph.D.
Eng., vol. 26, no. 5, pp. 1200–1210, May 2014. degree with the Faculty of Computer Science,
[26] “DIMACS: Locations of New York City.” 2006. [Online]. Available: University of New Brunswick, Fredericton, NB,
http://www.diag.uniroma1.it/challenge9/download.shtml Canada.
[27] “OpenStreetMap.” [Online]. Available: https://www.openstreetmap.org His research interests include applied cryptogra-
(Accessed: May 13, 2022). phy and game theory.
[28] E. Yilmaz, H. Ferhatosmanoglu, E. Ayday, and R. C. Aksoy, “Privacy-
preserving aggregate queries for optimal location selection,” IEEE
Trans. Dependable Secure Comput., vol. 16, no. 2, pp. 329–343,
Mar./Apr. 2019.

Jun Shao (Senior Member, IEEE) received the


Ph.D. degree from the Department of Computer
Songnian Zhang received the M.S. degree from Science and Engineering, Shanghai Jiao Tong
Xidian University, Xi’an, China, in 2016. He is cur- University, Shanghai, China, in 2008.
rently pursuing the Ph.D. degree with the Faculty of He was a Postdoctoral Fellow with the School of
Computer Science, University of New Brunswick, Information Sciences and Technology, Pennsylvania
Fredericton, NB, Canada. State University, Pennsylvania, PA, USA, from 2008
His research interest includes cloud computing to 2010. He is currently a Professor with the School
security, big data query, and query privacy. of Computer and Information Engineering, Zhejiang
Gongshang University, Hangzhou, China. His cur-
rent research interests include network security and
applied cryptography.

Authorized licensed use limited to: Jinan University. Downloaded on April 16,2023 at 09:19:35 UTC from IEEE Xplore. Restrictions apply.

You might also like