
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSC.2020.3031061, IEEE
Transactions on Services Computing

FedCrowd: A Federated and Privacy-Preserving Crowdsourcing Platform on Blockchain
Yu Guo, Member, IEEE, Hongcheng Xie, Student Member, IEEE, Yinbin Miao, Member, IEEE,
Cong Wang, Senior Member, IEEE, and Xiaohua Jia, Fellow, IEEE

Abstract—Crowdsourcing has attracted widespread attention in recent years and developed into various applications. An indispensable
service of crowdsourcing systems is task recommendation, which means tasks should be accurately recommended to the workers with
aligned interests. However, existing systems rely on their separate servers to conduct recommendation services, resulting in computing
resources locked inside each isolated system. Moreover, due to the wide attack surface of the traditional centralized-server setting,
existing systems are subject to single points of failure and malicious data breaches. Failure to address these inherent limitations
properly will hinder the wide adoption of crowdsourcing.
In this paper, we propose and implement FedCrowd, the first federated and privacy-preserving crowdsourcing platform built on
blockchain technology. Our main idea is to employ the smart contract as a trusted platform for systems to release encrypted tasks, and
carefully craft matching protocols to enable efficient task recommendations in the ciphertext domain. Our task-matching protocols are
highly customized for the decentralized settings, where users can securely perform keyword and range-based queries over federated
task indexes without sharing secret keys. We formally analyze the security strengths and complete the prototype implementation on
Ethereum. Experiment results demonstrate the feasibility and usability of the FedCrowd platform.

Index Terms—Crowdsourcing, Privacy-Preserving, Decentralized Application, Searchable Encryption, Blockchain

• Y. Guo is with the School of Artificial Intelligence, Beijing Normal University, Beijing, China. E-mail: yuguo@bnu.edu.cn.
• H. Xie, C. Wang and X. Jia are with the Department of Computer Science, City University of Hong Kong, Hong Kong, China. C. Wang is also with City University of Hong Kong Shenzhen Research Institute, Shenzhen, China. E-mail: hongcheng.xie@my.cityu.edu.hk, {congwang, csjia}@cityu.edu.hk.
• Y. Miao is with the School of Cyber Engineering, Xidian University, Xian 710071, China. E-mail: ybmiao@xidian.edu.cn.

1939-1374 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Carleton University. Downloaded on November 02,2020 at 20:44:53 UTC from IEEE Xplore. Restrictions apply.

1 INTRODUCTION
Crowdsourcing has become a popular computing paradigm in the sharing economy, which harnesses the power of collective intelligence to solve complicated tasks [1]. Its unparalleled flexibility and economic advantage are motivating both individuals and enterprises to outsource their local complicated tasks to various crowdsourcing systems for attracting task workers. Along with this trend, many well-known applications have been developed and widely implemented in practice, such as Upwork [2], CrowdFlower [3], and Amazon Mechanical Turk [4].

In a traditional crowdsourcing model, service providers (called brokers) typically rely on their centralized servers, such as public clouds, to execute task recommendation services. They host various tasks from requesters and independently match them with workers of aligned interests. However, this service model leaves computing resources isolated in separate information silos. Workers are locked inside each individual system, and tasks cannot reach numerous potential workers outside the boundary of the system. Brokers also suffer from this barrier, as they cannot attract more workers from their competitors. Motivated by this observation, an interesting research question is: would it be possible to fuse the computing resources in a common platform by achieving task recommendation across different systems?

In light of the above observations, we are motivated to develop a federated crowdsourcing platform. In such a new federated platform, each broker can keep its own autonomy while being able to use potential resources (tasks and workers for crowdsourcing) from other brokers. To this end, a promising solution is leveraging the smart contract technique of blockchain [5] to interconnect different brokers to form a crowdsourcing federation. The smart contract is an emerging decentralized computing paradigm where all operations are transparent and reliable. Due to the intrinsic properties of immutability and verifiability, it can establish trustworthiness in a fully distributed environment. Thus, a blockchain can provide an ideal platform for building the crowdsourcing federation.

Nevertheless, the major challenge of the above blueprint comes from privacy concerns. Broker-servers might be vulnerable to security breaches, and unauthorized data disclosure incidents have occurred from time to time in recent years [6]. Meanwhile, the original design of blockchain does not ensure strong protection of the data recorded on the chain. Directly outsourcing sensitive task requirements and workers' interests to untrusted broker-servers and the transparent blockchain platform may inevitably expose the private information. Thus, it is of critical importance to ensure requesters'/workers' privacy and task protection in the blockchain-based crowdsourcing platform.

In the literature, several studies have been proposed for privacy-preserving task-matching in crowdsourcing [7], [8], [9], [10]. However, existing solutions only focus on a centralized server setting, which is not directly suitable for our blockchain-based crowdsourcing platform. Firstly,

all existing schemes [7], [9], [10] for secure task-matching rely on a centralized broker-server for ciphertext transformation. But in a blockchain-based federated platform, it is impossible for a third-party broker-server to be trusted by various systems. Secondly, the deterministic encryption algorithms of existing schemes [8] are susceptible to the latest statistical attacks [11], [12]. Thirdly, most existing solutions can only support keyword matching, but crowdsourcing tasks cannot always be expressed in keywords. They are often expressed in numeric ranges. However, existing smart contracts can only support limited cryptographic tools. This restriction makes it difficult to use existing cryptographic techniques [9], [13] to implement range-based task-matching on smart contracts. The above-mentioned issues, if not properly addressed, may impede the successful deployment of the federated crowdsourcing platform.

Our contributions: In this paper, we take the first step toward building a federated and privacy-preserving crowdsourcing platform. Our proposed design, called FedCrowd, enables independent systems to be interconnected into a loosely coupled crowdsourcing federation. FedCrowd leverages the blockchain as the underlying platform to build this federation of the systems. In particular, different brokers are allowed to publish encrypted task requirements on the blockchain platform by using smart contracts and then let these smart contracts perform secure task-matching services. Deploying the service code to the smart contracts allows the blockchain to transparently perform matching services on behalf of these broker-servers. Besides, to make the blockchain light-weight and on-chain matching efficient, FedCrowd adopts a hybrid-storage framework [14], [15], as shown in Fig. 1. Specifically, broker-servers maintain the encrypted raw tasks from their requesters at the local storage and only upload task requirements or workers' interests to the blockchain for federated task-matching. With our federated platform, requesters can always have a large number of potential workers, and workers can have a lot more options for tasks.

Fig. 1: The federated system framework. (Each broker keeps an encrypted task storage for its own requesters and workers and posts encrypted requirements/interests to the blockchain, which provides secure and federated on-chain task recommendation services.)

To address the privacy issues, we develop new on-chain task-matching schemes that can be adapted to our federated crowdsourcing platform while ensuring user privacy and task protection. In particular, different from existing works, we devise a new distributed Proxy Re-Encryption (PRE) scheme tailored for our decentralized service settings. PRE [16] is a well-studied algorithm in which different ciphertexts can be transformed into the same format without sharing keys. By treating the PRE ciphertexts as task keywords, it is possible for distributed broker-servers to generate ciphertexts with the same format, and later upload them to the blockchain for federated task-matching. To mitigate the leakage of deterministic PRE, we propose to leverage the idea of random masking and let requesters/workers encrypt their task information with fresh nonces. Meanwhile, broker-servers are also required to embed their re-encrypted PRE ciphertexts into random masks via bilinear mapping. Thus, the smart contracts can use bilinear mapping operations to compare the randomized PRE ciphertexts at the blockchain platform, without learning anything from them.

The above task-matching scheme is tailored for the federated crowdsourcing platform. Furthermore, we also consider how to extend our proposed scheme to serve richer functionalities such as range-based task-matching. It should be noted that existing cryptographic primitives for range queries, such as Order-Preserving/Revealing Encryption (OPE/ORE) schemes [17], [18], are not suitable for the federated scenario because they require the same key for token generation and order comparisons. To solve this issue, we further devise a new scheme for encrypted range-matching operations in multi-owner settings. In our scheme, numeric values are represented as binary strings and encrypted into ciphertext blocks with pre-defined comparison results by using bilinear mapping. Thus, the first differing bit blocks of two values reveal the comparison result. Smart contracts can directly perform range queries on ciphertexts by conducting pattern-matching operations. The thorough security analysis confirms that our schemes achieve strong protection of the task confidentiality during on-chain task recommendation services.

To evaluate the performance, we implement the system prototype on top of Ethereum and provide the open-source code of our implementation. Extensive experiments demonstrate the effectiveness and efficiency of our schemes. Overall, FedCrowd overcomes the limitations in previous solutions and achieves federated crowdsourcing without sacrificing user privacy. We believe that our proposed privacy-preserving protocol lays a solid foundation for other decentralized applications to work under malicious adversary models. In summary, our contributions are listed as follows:
• We present FedCrowd, the first federated crowdsourcing platform that enables encrypted task-matching over the smart contracts. It stores encrypted raw tasks at the individual crowdsourcing systems and conducts federated task-matching at the blockchain platform.
• We develop privacy-preserving schemes highly customized for on-chain task recommendation. Our design uniquely bridges PRE and bilinear pairing techniques, with secure and efficient mechanisms for achieving encrypted keyword and range-based task-matching.
• We provide a thorough analysis of the security guarantees of the proposed schemes. By characterizing and analyzing the leakage profiles from both the ciphertext generation and task-matching protocols, we prove the security of our schemes.
• We implement the system prototype and deploy it on Ethereum. The evaluation results show that our security design can support efficient on-chain task-matching services with a practically affordable gas cost.
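
As a rough plaintext illustration of the range-matching idea outlined above (values written as fixed-width binary strings, cut into blocks, with the first differing block deciding the order), the following Python sketch compares two integers block by block. It is only an analogue: no ciphertexts or bilinear maps are involved, and the function names `to_blocks`, `compare_by_blocks`, and `in_range`, the 16-bit width, and the 4-bit block size are all assumptions made for illustration rather than parameters taken from FedCrowd.

```python
from typing import List


def to_blocks(value: int, bits: int = 16, d: int = 4) -> List[str]:
    """Encode a non-negative integer as a fixed-width bit string cut into d-bit blocks."""
    if value < 0 or value >= 1 << bits:
        raise ValueError("value does not fit in the chosen bit width")
    if bits % d != 0:
        raise ValueError("bit width must be a multiple of the block size")
    bit_string = format(value, f"0{bits}b")
    return [bit_string[i:i + d] for i in range(0, bits, d)]


def compare_by_blocks(a: int, b: int, bits: int = 16, d: int = 4) -> int:
    """Return -1, 0, or 1 by inspecting only the first pair of blocks that differ."""
    for block_a, block_b in zip(to_blocks(a, bits, d), to_blocks(b, bits, d)):
        if block_a != block_b:
            # Plaintext stand-in: in an encrypted setting this outcome would
            # have to be derived from ciphertext blocks, not from raw bits.
            return -1 if int(block_a, 2) < int(block_b, 2) else 1
    return 0


def in_range(value: int, low: int, high: int) -> bool:
    """Range match expressed as two block-wise comparisons (low <= value <= high)."""
    return compare_by_blocks(low, value) <= 0 and compare_by_blocks(value, high) <= 0
```

For example, `in_range(50, 10, 100)` holds because both comparisons are settled at the third 4-bit block, without looking at the blocks after it.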


The rest of this paper is organized as follows. Section 2 describes the related work. Section 3 introduces the background knowledge about the techniques involved in FedCrowd. Section 4 presents our problem statement and threat assumptions. Section 5 introduces the detailed design of FedCrowd. Our security analysis is conducted in Section 6, and an extensive array of evaluation results is shown in Section 7. Section 8 concludes the paper.

2 RELATED WORK
2.1 Privacy-preserving Crowdsourcing
Crowdsourcing has become an active research topic with the rapid growth of the sharing economy on the internet [19]. Many well-known crowdsourcing applications have been developed, such as Freelancer [20], Upwork [2], Waze [21], and Amazon Mechanical Turk [4]. However, privacy issues in task recommendations have been largely ignored in these applications. As the user's data (i.e., interests and requirements) often contains sensitive information like health records [22] and geographic locations [23], [24], it should not be exposed to the public or to any untrusted crowdsourcing system. To address the security concern, several privacy-preserving crowdsourcing systems [9], [10], [25] have been proposed. In [10], To et al. introduced a trusted third party to protect the location privacy based on differential privacy. Shen et al. [9] proposed task recommendation protocols by using homomorphic encryption with the assistance of a semi-honest third party. However, these works only considered the confidentiality of workers' interests. To enhance the security, Shu et al. [8] proposed to utilize proxy re-encryption to protect the profiles of both workers and tasks when performing encrypted task-matching. They further extended their scheme to support proxy-free task matching [7] by using the bilinear pairing. However, the above designs cannot support range-based task-matching services. A recent design called pRide [26] was proposed to use homomorphic encryption and Yao's garbled circuit to support secure range comparison. To improve the matching efficiency, the authors later enhanced their design by using trusted enclave techniques [27], but its security assumption relied on trusted execution environments. Despite the extensive research on privacy-preserving crowdsourcing, the above-mentioned systems are not directly suitable for decentralized blockchain settings, which are the main focus of this work.

2.2 Searchable Encryption Scheme
Secure task-matching services in crowdsourcing applications are often realized through Searchable Encryption (SE) techniques [28]. SE allows an untrusted server to search directly over encrypted data without server-side decryption. Early studies of SE focus on query functionality [29], [30], [31], [32] and update operations [33], [34], [35] in a single-user scenario. That is, the data owner holding the secret key can only search or update his own data. To enrich the application scope of SE, research in a multi-user setting [16], [36], [37] has also attracted attention recently. In this setting, multiple users are allowed to search over encrypted data outsourced by multiple owners. Curtmola et al. [38] proposed to use broadcast encryption to enable searching by multiple users, but it only allows a single user to encrypt its data. In [39], Blaze et al. first introduced the notion of proxy re-encryption (PRE), which was later improved by Bao et al. [36] and Dong et al. [16] to support SE in a multi-user scenario. However, these PRE-based schemes suffer from statistical attacks [11], [12] because of the deterministic nature of the PRE algorithm. To cope with this security issue, our work redesigns the encryption scheme by adding randomness to the PRE scheme.

Another line of related work targets cryptographic primitives for encrypted range queries [17], [18], [40]. In [41], Agrawal et al. first proposed the notion of Order-preserving Encryption (OPE), which allows order comparisons between ciphertexts as if they were performed on plaintexts. However, OPE schemes are deterministic and the order relations are directly leaked from ciphertexts, which makes OPE suffer from sorting attacks [12]. To solve this issue, Popa et al. [42] proposed to encrypt values with standard encryption, but it requires multiple interactions for user-side comparison. To mitigate the leakage, the notion of Order-revealing Encryption (ORE) was proposed [17], [18]. ORE ciphertexts preserve semantic security, while the order relations can be revealed only after a dedicated comparison. Besides the schemes mentioned above, there are also several works relying on other mechanisms, such as Homomorphic Encryption (HOM) [13]. However, all these searchable encryption schemes are built on symmetric key or public key encryption, which only allows a single user to search its encrypted data. Therefore, secure and efficient range queries in the federated multi-user setting need to be studied.

2.3 Blockchain and Decentralized Application
The advancement of Bitcoin [43] has vigorously pushed forward the innovation of blockchain, which is now recognized as a revolutionary technology in a wide spectrum of industries throughout the world [44], [45]. To improve the utility of blockchain, several efforts have been made to enable fast and useful queries over on-chain data [46]. For instance, Google has recently announced a toolset for searching and analyzing blockchain data via the BigQuery service [47]. Meanwhile, another project called BigchainDB [48] allows MongoDB queries to search the contents of transactions, assets, and metadata on the customized blockchain database. However, all these projects lack security measures for data protection.

To address the privacy issues, Hu et al. [49] first proposed to use the smart contract to conduct keyword search over encrypted on-chain indexes. In the follow-up design [14], Cai et al. improved the query efficiency by storing encrypted indexes off-chain. Very recently, Guo et al. [50] devised the first blockchain-assisted SE scheme with forward security. However, all these schemes only apply to single-owner settings because of the adoption of symmetric key encryption. In [51], the authors proposed to use a trusted enclave for secure crowdsensing, but its security assumption relies on trusted hardware at the server side. A recent work named CrowdBC [52] proposed a blockchain-based crowdsourcing framework, but it does not handle the problem of secure task-matching. It mainly deals with


the framework design and smart contract implementation. ZebraLancer [53] leveraged a similar framework that mainly focuses on protecting the anonymity of users. Moreover, the proposed framework is a private blockchain design, which requires modifying the underlying blockchain architecture. Besides, all above-mentioned solutions only consider leveraging the blockchain to replace existing centralized broker-servers. In contrast, FedCrowd aims to utilize the blockchain with our tailored cryptographic primitives to interconnect independent broker-servers.

3 BACKGROUND
3.1 Blockchain and Smart Contract
Blockchain is an emerging technology driven by the boom of cryptocurrencies, such as Bitcoin [43] and Ethereum [5]. It can be regarded as a cryptographic and distributed ledger that records all transactions (which can be any format of data) in an ordered manner. Once a transaction is written to the blockchain, it cannot be altered and everybody can see it. Features of the blockchain network are elaborated as follows: 1) Transparency: Any participant within the network can access the data recorded on the blockchain; 2) Consensus: Every participant in the network will reach consensus on the blockchain and valid transactions will surely be recorded on the chain; and 3) Verifiability: All transactions recorded on the blockchain are auditable by participants within the network and cannot be tampered with.

Along with the development of blockchain technology, smart contracts [54] are implemented in blockchain platforms such as Ethereum [5] and Hyperledger [55] to make blockchain more powerful. Each smart contract has a unique address and can be triggered by addressing a transaction to it. Once a smart contract is deployed, it can execute the pre-defined program without involving any central authority. Through the program and data stored on the smart contract, the user-defined operations will be triggered and the outputs are recorded on the blockchain. In FedCrowd, we aim to utilize smart contracts to honestly perform task-matching services on behalf of individual broker-servers.

3.2 Cryptographic Primitives
3.2.1 Bilinear Pairings
Let $G_1$, $G_2$, and $G_T$ be multiplicative cyclic groups of prime order $p$. Let $g_1$, $g_2$ be the generators of $G_1$ and $G_2$, respectively. A bilinear map $e: G_1 \times G_2 \to G_T$ has the following properties: 1) Bilinear: For all $u \in G_1$, $v \in G_2$ and $a, b \in \mathbb{Z}_p$, $e(u^a, v^b) = e(u, v)^{ab}$; 2) Computable: There exists an efficiently computable algorithm for computing the map $e$; and 3) Non-degenerate: $e(g_1, g_2) \neq 1$. It is worth noticing that the asymmetric bilinear pairing used in our paper is Type-3, in which there is no known efficiently computable isomorphism $\psi: G_1 \to G_2$.

3.2.2 Complexity Assumptions
Definition 1 (Symmetric External Diffie-Hellman (SXDH)). In Type-3 pairings, the Decisional Diffie-Hellman Problem (DDH) is intractable both in $G_1$ and $G_2$. Namely, given a tuple $(g_1, g_1^a, g_1^b, g_1^c)$ or $(g_2, g_2^a, g_2^b, g_2^c)$, the goal of the DDH problem is to decide whether $c = ab$ or $c$ is a random element of $\mathbb{Z}_p$, where $a, b \in \mathbb{Z}_p$.

4 PROBLEM FORMULATION
4.1 System Overview
Fig. 2 shows our system architecture, containing four types of entities: the blockchain platform, task requesters, workers, and brokers (i.e., crowdsourcing systems). The blockchain serves as a hub to bridge the other three parties. Each broker hosts encrypted tasks in its own local database for its clients under its administrative domain. To attract workers from different domains to solve tasks together, the broker posts encrypted task requirements to the blockchain in the form of encrypted indexes within its smart contract, while workers can submit query transactions to the blockchain network, containing encrypted task interests. The blockchain nodes will execute the query transactions to search the indexes embedded in the smart contracts and record the outputs, which are the tasks matching the workers' queries, in the blockchain. Finally, the worker can ask for the tasks from the respective brokers. In such a federated platform, requesters interact with their respective brokers to publish tasks, while workers have the opportunity to choose tasks posted by all the brokers in the blockchain platform.

Fig. 2: Overview of the FedCrowd architecture. (Requesters post tasks and workers post interests through their brokers; the brokers connect to the blockchain, and recommendations are returned to the requesters and workers.)

In this work, we refer to the sensitive task requirements as keywords $w_R$ (or values $v_R$), while the sensitive worker interests are referred to as $w_I$ (or $v_I$). Following the framework of privacy-preserving crowdsourcing in [8], we formalize the target problem of secure task recommendation as follows. Given an interest $w_I$ ($v_I$) and a set of $n$ task requirements $\{w_R^1, .., w_R^n\}$ ($\{v_R^1, .., v_R^n\}$) from different brokers, the problem we aim to tackle is to enable the smart contracts to accurately perform the on-chain task matching between the interest $w_I$ ($v_I$) and requirements $w_R$ ($v_R$) in a secure and efficient manner. In addition, we consider that there is a key manager service (KMS) to manage public/private keys. KMS has been widely adopted in many blockchain-based applications (e.g., [44], [52]) to facilitate security, and we consider our adoption to be in this trend.

Remarks: Our proposed system framework has three advantages: 1) FedCrowd leverages a hybrid storage architecture, which makes the blockchain light-weight and the matching services more efficient; 2) FedCrowd is a federation of multiple individual crowdsourcing systems, where requesters can reach out to broader potential workers and workers can have more choices of tasks; and 3) FedCrowd is


a fair and transparent platform where matching results and workers' credibility are recorded in the blockchain. With this observation, we believe that the proposed blockchain-based framework provides useful guidelines and new insights for designing federated platforms.

4.2 Threat Model
Our security goal is to provide strong protection for the confidentiality of tasks and interests. Specifically, we consider two potential threats, from the brokers and from the blockchain peers. Brokers are semi-honest adversaries, who faithfully follow the prescribed protocols but attempt to infer the task information from the local storage and encrypted requests. Besides, each broker may also be curious about the task information of other brokers. Consistent with the security assumption in prior works [8], we assume that brokers are reputable companies and do not collude with other parties. In addition, the detection of malicious users who submit invalid data to intentionally disrupt the platform is not the focus of this paper, and it can be addressed via other complementary techniques like zero-knowledge proofs [56].

In light of the properties of blockchain, we consider the blockchain peers to be potential adversaries with access to the chain. They honestly execute the designated matching protocols, yet intend to learn task privacy from the on-chain indexes, query transactions, and matching results. Finally, we consider the KMS, requesters, and workers to be always trusted. They will not expose private keys to other parties.

5 THE FEDCROWD DESIGN
This section presents our design of federated and privacy-preserving task recommendation services via the blockchain platform in detail. We first introduce the main processes in FedCrowd. Then, we propose the basic scheme that supports secure task-matching services with encrypted keywords. Moreover, we further extend our basic scheme to support secure range-match services. The corresponding index building procedures and on-chain matching algorithms are also introduced in this section.

5.1 The Process in FedCrowd
In this subsection, we first introduce the general process of our federated crowdsourcing platform. As shown in Fig. 3, FedCrowd consists of three main steps as follows:
• INITIALIZATION:
1) The KMS presets system security parameters and broadcasts its public key. For each requester or worker, the KMS generates a pair of private keys and sends the private keys with security parameters to the broker and corresponding requester/worker, respectively.
2) Each broker obtains a unique address of its smart contract and broadcasts the address to the platform.
• TASK PUBLICATION:
1) To publish a task to the FedCrowd platform, a requester first uses the private key to encrypt the task requirement. Then, the task requirement together with the task content is sent to the broker in encrypted form.
2) The broker generates the encrypted task indexes based on the requirements and maintains the task ciphertexts at the local databases. The task indexes are recorded in the broker's smart contract for later task-matching.
• TASK RECOMMENDATION:
1) To search for the tasks matching the interests, a worker submits a search query to the broker, containing the encrypted keywords (or values) of his interests.
2) Upon receiving the search query from the worker, the broker will re-encrypt it with the private key. A query transaction, containing the encrypted interest and the addresses of different smart contracts, is then broadcast to the blockchain network.
3) Blockchain nodes search the task indexes embedded in the smart contracts and record the task-matching results in the blockchain.

Fig. 3: The process of task-matching in FedCrowd. (1) Initialization: the KMS runs InitKey, giving $K_{ur}$, $\alpha$ to requesters/workers and $K_{br}$, $\beta$ to brokers. (2.1) Post tasks: requesters send $t^1_{w_R}$, $t^2_{w_R}$, $ct$ to the broker (PostTask). (2.2) Post task indexes: the broker sends $I^1_{ext}$, $I^2_{ext}$, $id$ to the task index in its smart contract (PostIndex). (3.1) Post interests: workers send $t^1_{w_I}$, $t^2_{w_I}$ to the broker (PostInt). (3.2) Post query transaction: the broker issues an interest transaction with $T^1_{ext}$, $T^2_{ext}$ (PostTrans). (3.3) Task-matching: ExtMatch is run between the task index and the interest transaction. The figure is divided into an off-chain part and an on-chain part.

5.2 Design Rationale
To enable federated crowdsourcing among different broker-servers, the main challenge is to construct the encrypted task-matching scheme. Specifically, the task requirements and interests are encrypted by different requesters/workers with their private keys. Thus, it is difficult for smart contracts to compare these ciphertexts without sharing keys. To address this problem, one promising technique is Proxy Re-Encryption (PRE) [16], [36]. At a high level, PRE is a searchable encryption scheme that enables keyword search over encrypted data in a multi-owner setting. PRE schemes assign different types of secret keys to the users and service providers so that ciphertexts of the same keyword encrypted by different keys can be transformed into the same format for matching purposes. However, existing PRE schemes are not directly suitable for use in our federated crowdsourcing platform. Firstly, in a federated platform, there does not exist a centralized server that can perform ciphertext re-encryption for different systems. The reason is that it is difficult for a third-party server to be trusted by various systems. Secondly, the algorithm of PRE is deterministic and thus is susceptible to chosen-keyword attacks. Since the plaintext domain of the task keywords is limited, an attacker can infer the plaintext by encrypting keywords with the same PRE algorithm [11], [12]. Therefore,


we are aware that existing solutions still fall short of meeting the service requirements and security guarantees.

To resolve the aforementioned challenges, we resort to a hybrid approach that bridges PRE schemes with bilinear pairing techniques. In particular, our main idea is to employ a distributed PRE scheme to transform different ciphertexts into the same format at the individual broker-server side. Then, when the re-encrypted ciphertexts are uploaded to the blockchain, smart contracts utilize our customized bilinear pairing scheme to conduct federated task-matching. To cope with the security issue of existing PRE schemes, our design requires users to encrypt their data with fresh nonces and security parameters. Specifically, each requester/worker can encrypt the task keyword $w$ with its private key $k_{ur}$ as:

$$t^1_w = g_1^{-\gamma_{ur}} g_1^{F(w)\alpha}, \quad t^2_w = g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(w)\alpha}$$

where $\gamma_{ur}$ is a fresh nonce, $F$ is a standard hash function, and $\alpha$ is the security parameter. The private keys $\{k_{ur}, k_{br}\}$ are generated from the private key $x$ in the KMS, such that $k_{ur} = x - k_{br}$. Thus, the ciphertext $t^2_w$ is actually equal to:

$$t^2_w = g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(w)\alpha} \Leftrightarrow g_1^{k_{br}\gamma_{ur}} g_1^{k_{ur}F(w)\alpha}$$

With the private key $k_{br}$, the broker can transform different keyword ciphertexts $\{t^1_w, t^2_w\}$ to the same format as:

$$t_w = (t^1_w)^{k_{br}} t^2_w \Leftrightarrow (g_1^{-\gamma_{ur}} g_1^{F(w)\alpha})^{k_{br}} (g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(w)\alpha}) \Leftrightarrow g_1^{-k_{br}\gamma_{ur}} g_1^{k_{br}F(w)\alpha} g_1^{k_{br}\gamma_{ur}} g_1^{k_{ur}F(w)\alpha} \Leftrightarrow g_1^{xF(w)\alpha} \quad (1)$$

Under our enhanced PRE scheme, we construct a secure protocol that achieves the privacy of task keywords by inserting randomness into the transformed ciphertexts. However, directly uploading the PRE ciphertext $t_w$ to the blockchain does not facilitate secure task-matching services because the equality of PRE ciphertexts is not considered to be protected originally. Thus, we further carefully design masking techniques that allow smart contracts to securely perform ciphertext comparison while hiding auxiliary information of input data. In particular, each broker-server randomly chooses a fresh nonce $\gamma^*_{br}$ and computes the on-chain ciphertexts for $t_w$ as:

$$ct^*_1 = g_1^{s\beta\gamma^*_{br}}, \quad ct^*_2 = g_1^{\beta H(t_w)\gamma^*_{br}}$$

where $\{\beta, s\}$ are security parameters for brokers, and $H$ is a standard hash function. Given a query token $\{ct_1, ct_2\} = \{g_2^{s\beta\gamma_{br}}, g_2^{\beta H(t_w)\gamma_{br}}\}$, the smart contract can accordingly conduct encrypted on-chain matching by checking whether the following equation holds or not:

$$e(ct_1, ct^*_2) = e(ct^*_1, ct_2) \quad (2)$$

In addition, we extend the above scheme to support encrypted range-matching. Our main idea is to formulate the problem of encrypted range-matching as block-based pattern matching. We propose to transform the numeric value into a set of binary bit-blocks with equal length and encrypt each block with the order condition and related bit distances. Specifically, the tokens for a $d$-bit block $v$ are:

$$ct_1 = g_2^{s\beta\gamma_{br}}, \quad ct_2 = \{g_2^{\beta H(F(v^{-1}), v \pm 1)\gamma_{br}}, \ldots, g_2^{\beta H(F(v^{-1}), v \pm j)\gamma_{br}}\} \quad (3)$$

where $v^{-1}$ is the prefix of bit block $v$ and $j \in \{1, \ldots, 2^d - 1\}$ are all possible distances for a $d$-bit block. Given an index ciphertext $\{ct^*_1 = g_1^{s\beta\gamma^*_{br}}, ct^*_2 = g_1^{\beta H(F(v^{*-1}), v^*)\gamma^*_{br}}\}$ and query tokens $\{ct_1, ct_2\}$, the smart contract can leverage Eq. 2 to check whether $e(ct_1, ct^*_2) = e(ct^*_1, ct_2)$, i.e., $v \pm j = v^*$. By employing our proposed block-based range query scheme, the smart contract can find the comparison results from the first different bit-block of two values. The detailed building procedures and matching protocols are presented in the following subsections.

5.3 Exact-match Task Recommendation

5.3.1 System Initialization

The system initialization stage is mainly conducted by the KMS and requesters/workers associated with their brokers before joining FedCrowd. To enable authorized users to use the federated crowdsourcing services, each requester/worker and his broker should obtain the secret keys from the KMS in advance. Algorithm 1 illustrates the procedure of system initialization. Specifically, for each registered $U_r^i$ from the broker $Br^i$, the KMS uses its private key $x$ to generate a pair of secret keys $\{k^i_{ur}, k^i_{br}\}$, and sends them together with role parameters $\{\alpha, \beta\}$ to $U_r^i$ and $Br^i$, respectively [1]. The role parameter will later be used as a security component for generating different private keys. Thus, the roles of requesters/workers and brokers cannot be switched. The parameter $b$ is used for encrypting range values. Finally, the broker initializes a hash table $L$ to maintain the keys for each authorized requester/worker and broadcasts its smart contract address to the platform. In our design, the table $L$ stored in each broker-server can be viewed as an access control layer. It can easily revoke illegal workers from the system by removing their keys from the table.

[1] $g_1^\alpha$, $g_2^\beta$ are public parameters.

Algorithm 1: System Initialization
Input: Private key $x$, public key $g_1^x$, requesters/workers $\{U_r^1, \ldots, U_r^n\}$, security parameters $\{b, s, \alpha, \beta\}$.
Output: Authenticated requesters/workers table $L$.
 1  KMS.InitKey($x, s, \alpha, \beta$):
 2  begin
 3      for each registered $U_r^i$, $i \in \{1, n\}$ do
 4          $k^i_{ur} \leftarrow \mathbb{Z}_p$, $k^i_{br} = x - k^i_{ur}$;
 5          Send $\{k^i_{ur}, \alpha, b\}$ to the registered $U_r^i$;
 6          Send $\{U_r^i : k^i_{br}, s, \beta\}$ to the broker $Br$;
 7  Broker $Br$ initializes a hash table $L$;
 8  for each key $\{U_r^i : k^i_{br}\}$, $i \in \{1, n\}$ do
 9      Add new requesters/workers $L[U_r^i] = k^i_{br}$;
10  Broker broadcasts the address of its smart contract;
11  $U_r^i$ stores the secret key $k^i_{ur}$ and parameters $\{\alpha, b\}$;

5.3.2 Exact-match Task Publication

To publish a task to the FedCrowd platform, the requester encrypts the task requirement $w_R$ with its secret key $k_{ur}$ and uploads the ciphertext $\{t^1_{w_R}, t^2_{w_R}\}$ to the broker-server.
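The key split of Algorithm 1 and the re-encryption identity of Eq. 1 can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a tiny prime-order subgroup of $\mathbb{Z}_P^*$ in place of the pairing group $G_1$, uses SHA-256 reduced modulo the group order as a stand-in for $F$, and all function names (`init_key`, `user_encrypt`, `broker_reencrypt`) are hypothetical. The parameters are far too small to be secure and serve only to confirm the algebra.

```python
import hashlib
import secrets

# Toy group (assumption): order-Q subgroup of Z_P^*, with P = 2*Q + 1.
# These sizes are for illustration only and offer no security.
P = 2039
Q = 1019
G1 = 4  # 4 = 2^2 is a quadratic residue, so it generates the order-Q subgroup


def F(word):
    """Stand-in for the paper's hash F: map a keyword to an exponent in Z_Q."""
    return int.from_bytes(hashlib.sha256(word.encode()).digest(), "big") % Q


def init_key(x):
    """Split the KMS key x so that k_ur + k_br = x (mod Q), as in Algorithm 1."""
    k_ur = secrets.randbelow(Q)
    k_br = (x - k_ur) % Q
    return k_ur, k_br


def user_encrypt(word, k_ur, pk, alpha):
    """Compute (t1, t2) for a keyword with a fresh nonce gamma_ur."""
    gamma = secrets.randbelow(Q)
    fa = F(word) * alpha % Q
    t1 = pow(G1, (-gamma + fa) % Q, P)
    # t2 = (g1^x)^gamma * g1^(-k_ur*gamma) * g1^(k_ur*F(w)*alpha)
    t2 = pow(pk, gamma, P) * pow(G1, (-k_ur * gamma + k_ur * fa) % Q, P) % P
    return t1, t2


def broker_reencrypt(t1, t2, k_br):
    """Compute t_w = t1^k_br * t2, which Eq. 1 reduces to g1^(x*F(w)*alpha)."""
    return pow(t1, k_br, P) * t2 % P


if __name__ == "__main__":
    x, alpha = 777, 5               # hypothetical KMS key and role parameter
    pk = pow(G1, x, P)              # public key g1^x
    k_ur, k_br = init_key(x)
    t1, t2 = user_encrypt("python", k_ur, pk, alpha)
    t_w = broker_reencrypt(t1, t2, k_br)
    print(t_w == pow(G1, x * F("python") * alpha % Q, P))
```

Two users holding different key shares of the same $x$ obtain the same $t_w$ for the same keyword, even though their $(t^1_w, t^2_w)$ pairs differ because of the nonces. This is the property the broker-side transformation relies on.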


Meanwhile, the task content $D_{ur}$ is encrypted with a shared session key $k_s$. It is shared between requesters and workers via key exchange protocols [57], so that later workers can use the same key to decrypt the matched ciphertexts. In our design, $\{t^1_{w_R}, t^2_{w_R}\}$ are masked with fresh nonces $\gamma_{ur}$, which are unique for different requesters. After checking the requester authorization, the broker uses $k_{br}$ from table $L$ to re-encrypt the ciphertext via computing $t_{w_R} = (t^1_{w_R})^{k_{br}} t^2_{w_R}$. As mentioned in Eq. 1, we note that $(t^1_{w_R})^{k_{br}} t^2_{w_R} = g_1^{xF(w)\alpha}$, where $F$ is a secure hash function and $\alpha$ is the security parameter. Since $t_{w_R}$ is masked by the parameter $\alpha$, the broker cannot learn the keyword $w$ by using chosen keyword attacks. Subsequently, the broker generates the encrypted indexes $\{I^1_{ext}, I^2_{ext}\}$ with a random nonce $\gamma^*_{br}$ and transforms the indexes into a smart contract recorded in the blockchain, making the tasks available for workers to search. The purpose of introducing the nonce $\gamma^*_{br}$ is to prevent the blockchain nodes from knowing that newly posted tasks are associated with previously recorded indexes. After completing the procedures of index generation, the broker uploads the encrypted task indexes to FedCrowd and stores the corresponding encrypted task content $ct$ at its local database. The details of the exact-match task publication are given in Algorithm 2.

Algorithm 2: Exact-match Task Publication
Input: Secret keys $\{k_{ur}, k_{br}\}$, standard encryption $G$, hash functions $\{F, H\}$, table $L$, keyword $w_R$ for the task content $D_{ur}$.
Output: Encrypted task index $I_{ext}$.
 1  Requester.PostTask($w_R, D_{ur}$):
 2  begin
 3      Generate a random nonce $\gamma_{ur} \xleftarrow{\$} \{0, 1\}^\lambda$;
 4      Generate a session key $k_s$;
 5      $t^1_{w_R} \leftarrow g_1^{-\gamma_{ur}} g_1^{F(w_R)\alpha}$;
 6      $t^2_{w_R} \leftarrow g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(w_R)\alpha}$;
 7      $ct \leftarrow G(k_s, D_{ur})$;
 8      Post task $\{U_r : t^1_{w_R}, t^2_{w_R}, ct\}$ to the broker $Br$;
 9  Broker.PostIndex($t^1_{w_R}, t^2_{w_R}, ct$):
10  begin
11      Generate a random nonce $\gamma^*_{br} \xleftarrow{\$} \{0, 1\}^\lambda$;
12      $k_{br} \leftarrow L[U_r]$, $t_{w_R} \leftarrow (t^1_{w_R})^{k_{br}} t^2_{w_R}$;  /* shown in Eq. 1 */
13      $I^1_{ext} \leftarrow g_1^{s\beta\gamma^*_{br}}$, $I^2_{ext} \leftarrow g_1^{\beta H(t_{w_R})\gamma^*_{br}}$;
14      Generate a random string $id$ as the task id of $ct$;
15      Deploy $I_{ext} = \{I^1_{ext}, I^2_{ext}, id\}$ to smart contracts;
16      Maintain the encrypted task $ct$ at local databases;

5.3.3 On-chain Task Recommendation

The secure task-matching scheme following the on-chain index design is presented in Algorithm 3. To search for tasks matching the interest, the worker first sends an encrypted search query $\{t^1_{w_I}, t^2_{w_I}\}$ to the broker-server, containing encrypted keywords of his interests $w_I$. After computing $(t^1_{w_I})^{k_{br}} t^2_{w_I}$, the broker submits the re-encrypted token $T_{ext}$ to FedCrowd in the form of query transactions, so that blockchain peers can execute the smart contracts to search the encrypted keyword indexes $I_{ext}$. Particularly, it compares the requirement $t_{w_R}$ with the interest $t_{w_I}$ in ciphertext via a bilinear map operation, i.e.,

$$e(g_1^{s\beta\gamma^*_{br}}, g_2^{\beta H(t_{w_I})\gamma_{br}}) \Leftrightarrow e(g_1, g_2)^{s\beta\gamma^*_{br}\,\beta H(t_{w_I})\gamma_{br}} \Leftrightarrow e(g_2^{s\beta\gamma_{br}}, g_1^{\beta H(t_{w_R})\gamma^*_{br}}) \Leftrightarrow e(g_2, g_1)^{s\beta\gamma_{br}\,\beta H(t_{w_R})\gamma^*_{br}}$$

By using this scheme, task matching can be done efficiently over the encrypted on-chain indexes. Finally, the smart contract records the outputs, which are the tasks matching the worker's query, in the blockchain. The worker can then ask for the task contents from the respective brokers.

Algorithm 3: Exact-match Task Recommendation
Input: Secret keys $\{k_{ur}, k_{br}\}$, public key $g_1^x$, hash functions $\{F, H\}$, table $L$, task interest $w_I$.
Output: Matched task id.
 1  Worker.PostInt($w_I$):
 2  begin
 3      Generate a random nonce $\gamma_{ur} \xleftarrow{\$} \{0, 1\}^\lambda$;
 4      $t^1_{w_I} \leftarrow g_1^{-\gamma_{ur}} g_1^{F(w_I)\alpha}$;
 5      $t^2_{w_I} \leftarrow g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(w_I)\alpha}$;
 6      Post interest $\{U_r : t^1_{w_I}, t^2_{w_I}\}$ to the broker $Br$;
 7  Broker.PostTrans($t^1_{w_I}, t^2_{w_I}$):
 8  begin
 9      Generate a random nonce $\gamma_{br} \xleftarrow{\$} \{0, 1\}^\lambda$;
10      $k_{br} \leftarrow L[U_r]$, $t_{w_I} \leftarrow (t^1_{w_I})^{k_{br}} t^2_{w_I}$;
11      $T^1_{ext} \leftarrow g_2^{s\beta\gamma_{br}}$, $T^2_{ext} \leftarrow g_2^{\beta H(t_{w_I})\gamma_{br}}$;
12      Broadcast $T_{ext} = \{T^1_{ext}, T^2_{ext}\}$ to the blockchain;
13  Blockchain.ExtMatch($T_{ext}, I_{ext}$):
14  begin
15      for each exact-match index $\{I^1_{ext}, I^2_{ext}\}$ do
16          $p \leftarrow e(g_1^{s\beta\gamma^*_{br}}, g_2^{\beta H(t_{w_I})\gamma_{br}})$;
17          $q \leftarrow e(g_2^{s\beta\gamma_{br}}, g_1^{\beta H(t_{w_R})\gamma^*_{br}})$;
18          if $p == q$ then
19              Return matched id;  /* shown in Eq. 2 */
20  Worker can get tasks with id from the broker-server;

5.4 Range-match Task Recommendation

5.4.1 Range-match Task Publication

Recall that as introduced in Section 5.2, we formulate the problem of encrypted range-matching as block-based pattern matching, so that order comparison can be computed efficiently through our tailored cryptographic primitives. Algorithm 4 illustrates the secure index building process. Given an index value of task requirements, the requester firstly converts it to a binary string $v$. Then, $v$ is divided into $b$ blocks with $d$-bit lengths. For each bit-block $v_{R,i}$, it uses the private key $k_{ur}$ with a fresh nonce $\gamma_{ur}$ to encrypt the block ciphers $\{t^1_{v_{R,i}}, t^2_{v_{R,i}}\}$ and sends the encrypted task requirements to the broker-server. Then the broker can obtain the encrypted block value $t_{R,i}$ via computing $(t^1_{v_{R,i}})^{k_{br}} t^2_{v_{R,i}}$, and embeds $t_{R,i}$ into random masks as $g_1^{\beta H(F(t^{-1}_{R,i}), t_{R,i})\gamma^{i*}_{br}}$. Finally, the encrypted indexes $\{I^1_{rng}, I^2_{rng}\}$ are uploaded to


the smart contracts for range-based task recommendation services.

Algorithm 4: Range-match Task Publication
Input: Secret keys $\{k_s, k_{ur}, k_{br}\}$, public key $g_1^x$, encryption scheme $G$, hash functions $\{F, H\}$, table $L$, value $v_R$ for the task content $D_{ur}$.
Output: Encrypted task index $I_{rng}$.
 1  Requester.PostTask($v_R, D_{ur}$):
 2  begin
 3      Generate a random nonce $\gamma_{ur} \xleftarrow{\$} \{0, 1\}^\lambda$;
 4      Divide $v_R$ into $b$ blocks with $d$-bit lengths;
 5      for each block $v_{R,i}$, $i \in \{1, b\}$ do
 6          $t^1_{v_{R,i}} \leftarrow g_1^{-\gamma_{ur}} g_1^{F(v_{R,i})\alpha}$;
 7          $t^2_{v_{R,i}} \leftarrow g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(v_{R,i})\alpha}$;
 8      $t^1_{v_R} = \{t^1_{v_{R,1}}, \ldots, t^1_{v_{R,b}}\}$, $t^2_{v_R} = \{t^2_{v_{R,1}}, \ldots, t^2_{v_{R,b}}\}$;
 9      $ct \leftarrow G(k_s, D_{ur})$;
10      Post task $\{U_r : t^1_{v_R}, t^2_{v_R}, ct\}$ to the broker $Br$;
11  Broker.PostIndex($t^1_{v_R}, t^2_{v_R}, ct$):
12  begin
13      $k_{br} \leftarrow L[U_r]$;
14      for each block $\{t^1_{v_{R,i}}, t^2_{v_{R,i}}\}$, $i \in \{1, b\}$ do
15          Generate a random nonce $\gamma^{i*}_{br} \xleftarrow{\$} \{0, 1\}^\lambda$;
16          $t_{R,i} \leftarrow (t^1_{v_{R,i}})^{k_{br}} t^2_{v_{R,i}}$, $z_{R,i} \leftarrow g_1^{\beta H(F(t^{-1}_{R,i}), t_{R,i})\gamma^{i*}_{br}}$;
17      $I^1_{rng} = \{g_1^{s\beta\gamma^{1*}_{br}}, \ldots, g_1^{s\beta\gamma^{b*}_{br}}\}$, $I^2_{rng} = \{z_{R,1}, \ldots, z_{R,b}\}$;
18      Generate a random string $id$ as the task id of $ct$;
19      Deploy $I_{rng} = \{I^1_{rng}, I^2_{rng}, id\}$ to smart contracts;
20      Maintain the encrypted task $ct$ at local databases;

5.4.2 On-chain Task Recommendation

The procedure of encrypted range comparison is illustrated in Algorithm 5. To search tasks with a range condition $(v_I, cmp)$, the worker first needs to compute the tokens of each sub-block $\{z^1_{i,j}, z^2_{i,j}\}$, where $v_{I,i}$ is the bit value of the $i$-th block, and $cmp \in \{-, +\}$ represents the order condition, with "$-$" denoting "$\geq$" and "$+$" denoting "$\leq$". $j \in \{1, \ldots, 2^d - 1\}$ is the pre-defined value distance of each block. For instance, all possible value distances of each 2-bit block are $\{01, 10, 11\}$. After that, the broker can re-encrypt these sub-block tokens as $t_{i,j}$ with its private keys. These sub-block tokens $z_{I,i}$ can be treated as the encrypted differences between the index item and the search token, and are later used for the secure range-matching operation. In our design, we add randomness $\gamma^i_{br}$ into each block cipher $\{g_2^{s\beta\gamma^i_{br}}, g_2^{\beta H(F(t^{-1}_{i,j}), t_{i,j})\gamma^i_{br}}\}$, so that the attacker cannot learn the equality among different blocks. From Eq. 2, we know that the smart contract can leverage the bilinear map operation to check whether the query token lies in the range of index values. In particular, if the comparison

$$e(g_1^{s\beta\gamma^{i*}_{br}}, g_2^{\beta H(F(v^{-1}_{I,i}), v_{I,i} \pm j)\gamma^i_{br}}) = e(g_2^{s\beta\gamma^i_{br}}, g_1^{\beta H(F(v^{-1}_{R,i}), v_{R,i})\gamma^{i*}_{br}})$$

holds, then $v_{I,i} \pm j = v_{R,i}$, i.e., the underlying value matches the task indexes and the corresponding task id should be returned. The details of range-matching operations are presented from line 26 to line 31 in Algorithm 5.

Algorithm 5: Range-match Task Recommendation
Input: Secret keys $\{k_{ur}, k_{br}\}$, public key $g_1^x$, hash functions $\{F, H\}$, table $L$, task interest $v_I$ with query conditions $cmp \in \{-, +\}$.
Output: Matched task id.
 1  Worker.PostInt($v_I$):
 2  begin
 3      Generate a random nonce $\gamma_{ur} \xleftarrow{\$} \{0, 1\}^\lambda$;
 4      Divide $v_I$ into $b$ blocks with $d$-bit lengths;
 5      for each block $v_{I,i}$, $i \in \{1, b\}$, $j = 1$ do
 6          while $j \neq 2^d$ do
 7              $z^1_{i,j} \leftarrow g_1^{-\gamma_{ur}} g_1^{F(v_{I,i} + (cmp)\, j)\alpha}$;
 8              $z^2_{i,j} \leftarrow g_1^{x\gamma_{ur}} g_1^{-k_{ur}\gamma_{ur}} g_1^{k_{ur}F(v_{I,i} + (cmp)\, j)\alpha}$;
 9              $j$++  /* compute pre-defined results */
10          $t^1_{v_{I,i}} = \{z^1_{i,1}, \ldots, z^1_{i,2^d-1}\}$;
11          $t^2_{v_{I,i}} = \{z^2_{i,1}, \ldots, z^2_{i,2^d-1}\}$;
12      $t^1_{v_I} = \{t^1_{v_{I,1}}, \ldots, t^1_{v_{I,b}}\}$, $t^2_{v_I} = \{t^2_{v_{I,1}}, \ldots, t^2_{v_{I,b}}\}$;
13      Post interest $\{U_r : t^1_{v_I}, t^2_{v_I}\}$ to the broker $Br$;
14  Broker.PostTrans($t^1_{v_I}, t^2_{v_I}, cmp$):
15  begin
16      $k_{br} \leftarrow L[U_r]$;
17      for each block $\{t^1_{v_{I,i}}, t^2_{v_{I,i}}\}$, $i \in \{1, b\}$ do
18          Generate a random nonce $\gamma^i_{br} \xleftarrow{\$} \{0, 1\}^\lambda$;
19          $t_{i,j} \leftarrow (z^1_{i,j})^{k_{br}} z^2_{i,j}$;
20          $z_{I,i} = \{g_2^{\beta H(F(t^{-1}_{i,1}), t_{i,1})\gamma^i_{br}}, \ldots, g_2^{\beta H(F(t^{-1}_{i,2^d-1}), t_{i,2^d-1})\gamma^i_{br}}\}$
21      $T^1_{rng} = \{g_2^{s\beta\gamma^1_{br}}, \ldots, g_2^{s\beta\gamma^b_{br}}\}$, $T^2_{rng} = \{z_{I,1}, \ldots, z_{I,b}\}$;
22      Broadcast $T_{rng} = \{T^1_{rng}, T^2_{rng}\}$ to the blockchain;
23  Blockchain.RngMatch($T_{rng}, I_{rng}$):
24  begin
25      for each range-match index $\{I^1_{rng}, I^2_{rng}\}$ do
26          for each ciphertext block, $i \in \{1, b\}$, $j = 1$ do
27              $p \leftarrow e(g_1^{s\beta\gamma^{i*}_{br}}, g_2^{\beta H(F(t^{-1}_{i,j}), t_{i,j})\gamma^i_{br}})$;
28              $q \leftarrow e(g_2^{s\beta\gamma^i_{br}}, g_1^{\beta H(F(t^{-1}_{R,i}), t_{R,i})\gamma^{i*}_{br}})$;
29              if $p == q$ then
30                  Return matched id;
31              $j$++;

5.4.3 Correctness Analysis on Range-match Scheme

The correctness of our range-match scheme is guaranteed by the deterministic property of hash functions. We prove that there is only one bit-block pair matching the query condition during the range comparison. Accordingly, we present the following lemma:

Lemma 1. Given two values $\{v_R, v_I\}$ and the query condition $cmp \in \{-, +\}$, if the query condition stands, there exists one and only one matched bit-block pair $\{v_{R,i}, v_{I,i}\}$, where $i \in \{1, b\}$.

Proof of lemma 1. Recall that $v_{R,i}$ is actually masked as $g_1^{xF(v_{R,i})\alpha}$, and $v_{I,i}$ is masked as $g_1^{xF(v_{I,i} + (cmp)\, j)\alpha}$, where


$j = \{1, \ldots, 2^d - 1\}$. If two bit-blocks match the query condition $cmp$, it means that $v_{R,i} = v_{I,i} + (cmp)\, j$. We assume that there exists another bit-block $\bar{v}_{I,i}$ that matches the query condition. As mentioned, $j = \{1, \ldots, 2^d - 1\}$ are all possible distances for each bit-block. The corresponding entries $v_{I,i} + (cmp)\, j$ contain all possible values, including $\bar{v}_{I,i}$. Thus, the bit-block matches the query condition if and only if $v_{I,i} = \bar{v}_{I,i}$. Therefore, there exists only one encrypted bit-block that matches the query condition.

5.4.4 Secure Range-match Instantiation

To better understand the protocol of our encrypted range-matching scheme, Fig. 4 uses an example to show how it works to check whether the block token "11 >" matches the value "01" of the encrypted index. In this example, we directly use the bit values "{11, 01}" to represent the block ciphers $\{t_{i,j}, t_{R,i}\}$ in Algorithm 5 for clarity of presentation. As shown in Fig. 4, the pre-defined value distances "{−01, −10, −11}" for the query block "11" are masked with random nonces $\gamma^*$ and the parameter $\beta$, where "−" denotes the ">" condition. During the procedure of searching indexes, the smart contract can test whether "11 > 01" by checking if $e(g_1^{s\beta\gamma}, g_2^{\beta H(\perp, 11-10)\gamma^*}) = e(g_2^{s\beta\gamma^*}, g_1^{\beta H(\perp, 01)\gamma})$, where $\perp$ denotes that the prefix is null. After a sequence of bilinear pairing operations, the result shows that the "−10" block matches the query condition, as highlighted in Fig. 4. By using our range-match scheme, encrypted requirements and interests generated by different broker-servers can be efficiently compared in our platform.

Fig. 4: Our proposed range-match algorithm.

6 SECURITY ANALYSIS

In this section, we present a rigorous security analysis to demonstrate the security guarantees of our proposed scheme. Recall that we uniquely bridge our enhanced PRE scheme and bilinear pairing techniques to design the secure task-matching protocols. The random masking mechanism in our PRE scheme can effectively prevent broker-servers from learning the content of input data. Meanwhile, the security of the bilinear pairing operation enables smart contracts to securely perform ciphertext comparison while protecting the on-chain data privacy. Following the security notion of searchable encryption schemes [16], [28], we formally define the security model and analyze the security guarantees following the adopted cryptographic primitives.

6.1 Security Model

We first provide the security model of our PRE-based task encryption scheme. The Chosen-Plaintext-Attack (IND-CPA for short) security game between a challenger $C$ and an adversary $A$ is shown as follows:

• Setup. $C$ runs the secure system initialization and gives the public parameters to $A$.
• Phase 1. $A$ adaptively asks the index generation oracle for any keyword of his choice.
• Challenge. $A$ outputs two keywords $(w_0, w_1)$ with the same length. $C$ chooses a random bit $b \in \{0, 1\}$ and then encrypts $w_b$ by calling the index generation algorithm. Finally, $C$ returns the ciphertexts $(t^1_{w_b}, t^2_{w_b})$ to $A$.
• Phase 2. $A$ can continue to ask for the keyword ciphertext of any keyword $w$ of his choice as long as $w \notin \{w_0, w_1\}$.
• Guess. $A$ guesses a random bit $b' \in \{0, 1\}$ and wins this game if $b' = b$.

Definition 2. The task encryption algorithm in FedCrowd is IND-CPA secure against the broker if there exist no Probabilistic Polynomial Time (PPT) adversaries that can break the above security game with non-negligible advantages.

6.2 Security on Task Encryption Scheme

In this subsection, we prove that the task encryption algorithm in FedCrowd is IND-CPA secure, indicating that the broker learns nothing about the underlying plaintext in a chosen plaintext attack.

Theorem 1. FedCrowd is IND-CPA secure if there exist no PPT adversaries that can break the DDH assumption with non-negligible advantages.

Proof of theorem 1: Let us consider the challenger $C$ that attempts to break the DDH assumption using the adversary $A$ as a sub-routine. Assume that $C$ is given the public parameters $g_0 = g_1^\alpha$ (or $g_0 = g_2^\beta$), hash function $F$ and a tuple $(g_1, g_1^a, g_1^b, g_1^c)$ (or $(g_2, g_2^a, g_2^b, g_2^c)$), where $g_1^a = g_1^x$ (or $g_2^a = g_1^x$) for random elements $a, b, c \in \mathbb{Z}_p$. Then, $C$ simulates the following security game:

• Setup. $C$ sends the public keys $G_1, G_2, G_T, p, g_1, g_2$ to $A$. Then, $C$ chooses a random element $k_{br} \in \mathbb{Z}_p$ for each registered user and computes $g_1^{k_{ur}} = g_1^a g_1^{-k_{br}} = g_1^{a - k_{br}} = g_1^{x - k_{br}}$. Finally, $C$ returns $k_{br}$ to $A$ and keeps the tuple $(k_{br}, g_1^{k_{ur}})$.
• Phase 1. Whenever $A$ attempts to query the task encryption oracle, he/she sends an arbitrary keyword $w_R$ to $C$. $C$ first chooses a random element $r \in \mathbb{Z}_p$ and calls the index generation algorithm to compute $t^1_{w_R} = g_1^{-ar} g_0^{F(w_R)}$, $t^2_{w_R} = g_1^{r k_{ur}} g_0^{F(w_R) k_{br}} = g_1^{ar} g_1^{-r k_{br}} g_0^{F(w_R) k_{br}}$. Finally, $C$ sends $(t^1_{w_R}, t^2_{w_R})$ to $A$.
• Challenge. $A$ submits two challenging keywords $w_0, w_1$. $C$ first chooses a random bit $b \in \{0, 1\}$, and then computes $t^1_{w_b} = g_1^{-b} g_0^{F(w_R)}$, $t^2_{w_b} = g_1^c g_1^{-b k_{br}} g_0^{F(w_R) k_{br}}$. Finally, $C$ sends $(t^1_{w_b}, t^2_{w_b})$ to $A$.
• Phase 2. $A$ repeats the process of Phase 1, but he/she cannot ask the task encryption oracle for $(w_0, w_1)$.
• Guess. $A$ outputs a guess bit $b' \in \{0, 1\}$. If $b' = b$, $A$ wins this security game; otherwise, he/she fails. There are two cases to consider:


– If $g_1^c$ is a random element in group $G_1$, then the challenging keyword ciphertext $t^1_{w_b} = g_1^{-b + F(w_R)\alpha}$, $t^2_{w_b} = g_1^{c - b k_{br} + F(w_R) k_{br}\alpha}$ consists of two random group elements in group $G$, indicating that the distribution of $(t^1_{w_b}, t^2_{w_b})$ is always uniform no matter what keyword $w_b$ takes. Thus, $A$ cannot distinguish the keyword ciphertext of $w_0$ from that of $w_1$, and $C$'s probability in successfully breaking this security game is defined as $Pr[C(g_1, g_1^a, g_1^b, g_1^c)] = \frac{1}{2}$.
– If $g_1^c = g_1^{ab}$, the challenging keyword ciphertext is $t^1_{w_b} = g_1^{-b} g_0^{F(w_R)} = g_1^{-b} g_1^{F(w_R)\alpha}$, $t^2_{w_b} = g_1^{ab} g_1^{-b k_{br}} g_0^{F(w_R) k_{br}} = g_1^{ab} g_1^{-b k_{br}} g_1^{F(w_R) k_{br}\alpha}$. Then, $(t^1_{w_b}, t^2_{w_b})$ will have the proper ciphertext form, and $C$'s probability in successfully breaking this security game is defined as $Pr[C(g_1, g_1^a, g_1^b, g_1^{ab})] < \frac{1}{2} + negl$, where $negl$ is the negligible function.

If the SXDH assumption holds, $A$'s probability in successfully guessing $b' = b$ is defined as $Adv_A < \frac{1}{2} + negl$. Thus, $A$ cannot distinguish the keyword ciphertext in a chosen plaintext attack with non-negligible advantage. Similar to the above simulation process, we can also derive the following inequalities $Pr[C(g_2, g_2^a, g_2^b, g_2^c)] = \frac{1}{2}$ and $Pr[C(g_2, g_2^a, g_2^b, g_2^{ab})] < \frac{1}{2} + negl$ on condition that $C$ is given the public parameters $g_0 = g_2^\beta$, and a tuple $(g_2, g_2^a, g_2^b, g_2^c)$, where $g_2^a = g_1^x$. This completes the proof.

6.3 Security on Task-matching Scheme

Regarding the security analysis of the task-matching scheme, we define leakages during the protocols and quantify the security guarantees. Firstly, we define the leakage $\mathcal{L}_{Stp}$ for the index generation algorithm for a given task set $t_w$ as:

$$\mathcal{L}_{Stp} = (m, \langle |I^1|, |I^2| \rangle),$$

where $m$ is the number of encrypted task indexes, and $|I^1|, |I^2|$ are bit lengths of index ciphertexts. When a broker sends a search request, the view of an adversary is defined in the leakage $\mathcal{L}_{Mtch}$ as:

$$\mathcal{L}_{Mtch} = (\langle T^1, T^2 \rangle, \langle I^1, I^2 \rangle_q),$$

where $\{T^1, T^2\}$ are tokens and $\langle I^1, I^2 \rangle_q$ are $q$ matched partial results. Following the simulation-based security definition [28], we give the formal security definition as follows:

Definition 3. Let $\Omega$ = (InitKey, PostIndex, ExtMatch) be our scheme for secure task-matching, and let $\mathcal{L}_{Stp}$ and $\mathcal{L}_{Mtch}$ be the leakage functions. We define the following probabilistic experiments $\mathbf{Real}_{\Omega,A}(k)$ and $\mathbf{Ideal}_{\Omega,A,S}(k)$ with a probabilistic polynomial time (PPT) adversary $A$ and a PPT simulator $S$:

$\mathbf{Real}_{\Omega,A}(k)$: The broker calls InitKey to get a private key $K$. $A$ selects a task set $t_w$ and asks the broker to build the real indexes via the PostIndex algorithm. Then $A$ adaptively conducts a polynomial number of queries via the ExtMatch algorithm. Finally, $A$ returns a bit as the output.

$\mathbf{Ideal}_{\Omega,A,S}(k)$: $A$ selects a task set $t_w$, and $S$ simulates the index for $A$ based on $\mathcal{L}_{Stp}$. Then $A$ adaptively performs a polynomial number of queries. From the leakage $\mathcal{L}_{Mtch}$ in each query, $S$ simulates tokens and ciphertext, which are processed over the simulated indexes. Finally, $A$ returns a bit as the output.

The task-matching scheme in FedCrowd is restricted-chosen-input secure, if for all PPT adversaries $A$, there exists a simulator $S$ such that: $Pr[\mathbf{Real}_{\Omega,A}(k) = 1] - Pr[\mathbf{Ideal}_{\Omega,A,S}(k) = 1] \leq negl(k)$, where $negl(k)$ is a negligible function in $k$.

Theorem 2. The task-matching scheme in FedCrowd is restricted-chosen-input secure, assuming that $H$ is an anti-collision hash function and the scheme holds the DDH assumption.

Proof of theorem 2. The objective is to prove that the adversary $A$ cannot distinguish between the real experiment $\mathbf{Real}_{\Omega,A}(k)$ and the simulated one $\mathbf{Ideal}_{\Omega,A,S}(k)$ as defined in Definition 3. Given $\mathcal{L}_{Stp}$, the simulator $S$ can generate the simulated encrypted indexes, which are indistinguishable from the real ones. They contain $m$ index entries, where each entry has the same size as the real indexes. From $\mathcal{L}_{Mtch}$, $S$ can simulate the matching results. $S$ forms a table to maintain the hash value of $H(t_w)$ and implicitly sets each value as $b$. For each previously unseen query, $S$ randomly selects the same number of $q$ entries as the query results on the real indexes. Specifically, it chooses a random value $\gamma$ as the random mask and simulates each query result as $g_1^{s\beta\gamma}$, $g_1^{\beta b\gamma}$. During the challenge phase, if the challenge term is $g_1^{bc}$, $S$ can compute the simulated result as $g_1^{s\beta c}$, $g_1^{\beta bc}$, where $b, c$ are uniform over $\mathbb{Z}_p$. Otherwise, $S$ can choose a random element $R$ in group $G$ and returns $g_1^{s\beta c}$, $R$. Therefore, if the adversary $A$ has an advantage to distinguish $\mathbf{Real}_{\Omega,A}(k)$ and $\mathbf{Ideal}_{\Omega,A,S}(k)$, the simulator $S$ has the same advantage to break the DDH assumption. This completes the proof.

7 EXPERIMENTAL EVALUATION

7.1 Prototype Implementation

To assess the performance of FedCrowd, we implement a prototype [2] in Go and construct smart contracts of Ethereum using Solidity 0.5.4. The smart contract is deployed to the Ethereum test network TestRPC and evaluated via a PC with an Intel(R) Core(TM) i7-9700K processor (3.6 GHz) and 64GB RAM. We use Go-Ethereum to implement the interaction between the broker-servers and Ethereum. In this experiment, we select real-world task requirements from 20 Newsgroups [3] to evaluate the matching performance. We create 2K task IDs and randomly assign them to different task requirements. We also generate a Redis (v5.0.3) cluster to maintain the encrypted task content. For cryptographic primitives, we implement the index generation algorithms including the PRE scheme and bilinear pairing functions via bn256 in the cryptography library of Go. Besides, we use bn256 pre-compiled contracts in Ethereum to implement the encrypted task-matching protocols. Overall, FedCrowd's implementation consists of 600 lines of code, which includes 200 lines for smart contracts.

[2] Online at git@github.com:JerryXie96/fedEnc.git.
[3] http://qwone.com/~jason/20Newsgroups/.

7.2 Performance Evaluation

Our experimental evaluation targets testing the practicality of the proposed federated crowdsourcing design, including the initialization time, space cost, gas consumption, and on-chain matching performance.
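Before turning to the measurements, it may help to see the block-based matching logic of Algorithm 5 with the cryptography stripped away. The sketch below is an illustration under explicit assumptions, not the authors' code: it replaces the masked group elements and the pairing equality of Eq. 2 with plain SHA-256 tags, omits the PRE step and all nonces, and follows the Fig. 4 reading in which "−" tests whether the query value is strictly greater than the indexed value. The helper names (`blocks`, `tag`, `build_index`, `build_tokens`, `range_match`) are hypothetical.

```python
import hashlib


def blocks(value, b, d):
    """Split a (b*d)-bit value into b blocks of d bits, most significant first."""
    mask = (1 << d) - 1
    return [(value >> (d * (b - 1 - i))) & mask for i in range(b)]


def tag(prefix, block):
    """Stand-in for H(F(prefix), block): hash the prefix blocks and the block."""
    data = ",".join(map(str, prefix)) + "|" + str(block)
    return hashlib.sha256(data.encode()).hexdigest()


def build_index(v_r, b, d):
    """Index side: one tag per block, bound to the blocks that precede it."""
    bl = blocks(v_r, b, d)
    return [tag(bl[:i], bl[i]) for i in range(b)]


def build_tokens(v_i, cmp, b, d):
    """Query side: for each block, tags of block -/+ j for every distance j."""
    sign = -1 if cmp == "-" else 1
    bl = blocks(v_i, b, d)
    return [
        {tag(bl[:i], bl[i] + sign * j) for j in range(1, 1 << d)}
        for i in range(b)
    ]


def range_match(index, tokens):
    """A hit at block i means equal prefixes and a nonzero distance at block i."""
    return any(index[i] in tokens[i] for i in range(len(index)))


if __name__ == "__main__":
    # Fig. 4 example: one 2-bit block, query "11 >" against index "01".
    idx = build_index(0b01, b=1, d=2)
    print(range_match(idx, build_tokens(0b11, "-", b=1, d=2)))
```

Because a tag binds each block to its prefix, a hit can only occur at the first block where the two values differ, which is the uniqueness property stated in Lemma 1. Out-of-range shifted values (negative or at least $2^d$) never equal an indexed block, so they cannot cause false matches in this model.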


TABLE 1: Time consumption on cryptographic operations.

# Tasks | Keyword Encryption | Keyword Matching | Range value Encryption | Range value Comparison | Homomorphic Encryption [13] | Homomorphic Comparison [13]
100     | ∼0.34 s            | ∼0.23 s          | ∼3.80 s                | ∼1.55 s                | ∼3.96 s                     | ∼4.82 s
150     | ∼0.49 s            | ∼0.35 s          | ∼5.67 s                | ∼2.33 s                | ∼6.15 s                     | ∼7.23 s
200     | ∼0.63 s            | ∼0.50 s          | ∼7.79 s                | ∼3.38 s                | ∼8.06 s                     | ∼7.72 s
250     | ∼0.81 s            | ∼0.58 s          | ∼9.63 s                | ∼3.90 s                | ∼10.09 s                    | ∼12.05 s
300     | ∼1.12 s            | ∼0.69 s          | ∼11.23 s               | ∼4.66 s                | ∼11.88 s                    | ∼15.06 s

Additional leakages in [13] (all rows): the order relations between plaintext values.
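The comparison columns of Table 1 can be turned into ratios to make the gap to the homomorphic baseline [13] concrete. The short sketch below only divides the reported timings; the variable and function names are illustrative, and the numbers are copied from the table rather than re-measured.

```python
# Timings in seconds, copied from Table 1.
tasks = [100, 150, 200, 250, 300]
range_cmp = [1.55, 2.33, 3.38, 3.90, 4.66]   # range value comparison
homo_cmp = [4.82, 7.23, 7.72, 12.05, 15.06]  # homomorphic comparison [13]


def slowdowns(baseline, ours):
    """Ratio of baseline time to our time, rounded to two decimals."""
    return [round(b / o, 2) for b, o in zip(baseline, ours)]


ratios = slowdowns(homo_cmp, range_cmp)
for n, r in zip(tasks, ratios):
    print(f"{n} tasks: homomorphic comparison takes {r}x as long")
```

On these figures the homomorphic comparison takes between roughly 2.3 and 3.2 times as long as the range value comparison, with the smallest ratio at 200 tasks.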

7.2.1 Evaluation on Cryptographic Primitives
To enable secure and federated task-matching services, the task information (i.e., task keywords and range values) needs to be encrypted by broker-servers with our proposed cryptographic primitives. In Table 1, we first measure the time cost of data encryption and comparison for different numbers of task data items. As shown in Table 1, our design takes only around 0.63s and 0.5s to encrypt 200 keywords and retrieve the matching results, respectively, which is efficient for privacy-preserving applications. For encrypted range queries, our scheme introduces additional computation cost due to the adopted block-based encryption approach. Nonetheless, the comparison performance is still significantly faster than that of the existing homomorphic encryption scheme [13]. Specifically, when comparing 100 random values, our range-match scheme takes about 1.55s, which is over 3× faster than Gentry's scheme [13]. Besides, it is worth noting that our design prevents order leakage by conducting comparisons via the pattern-matching method. In contrast, homomorphic encryption is subject to order leakage, in which attackers can infer the plaintexts by monitoring the comparison results [12]. In Table 2, we then investigate the memory cost of the different encryption algorithms. The results show that the space cost of both index designs grows linearly with the number of data items. For instance, encrypting 800 keywords with our exact-match index design requires about 0.17MB, which is roughly half of the space cost for encrypting 1600 keywords. Regarding the memory consumption of the range-based design, the ciphertext size depends on the bit length of each block. Specifically, encrypting a 32-bit value with the b-bits design requires 32/b ciphertext blocks, where each block pair contains 192 bits. As shown in Table 2, when the block size is 4 bits, the space cost of building 200 range-based indexes is 0.3MB. According to the evaluation results, our proposed cryptographic design provides a good balance between space utilization and time efficiency.

TABLE 2: Space consumption on encrypted indexes.

# Tasks   Ext-match   Rng-match (1-bit)   Rng-match (2-bits)   Rng-match (4-bits)
100       0.02 MB     0.58 MB             0.29 MB              0.15 MB
200       0.04 MB     1.18 MB             0.59 MB              0.30 MB
400       0.09 MB     2.36 MB             1.18 MB              0.60 MB
800       0.17 MB     4.71 MB             2.37 MB              1.20 MB
1600      0.35 MB     9.42 MB             4.74 MB              2.40 MB

7.2.2 Evaluation on System Initialization
In the system initialization, the KMS needs to generate the public keys and secret keys, and transmit the secret keys to authorized requesters/workers and related broker-servers. In Fig. 5, we evaluate the time cost of user registration and revocation. The results show that both time costs grow linearly with the number of users. Specifically, it takes around 0.16s to complete the registration for 350 authorized system users. For worker revocation, broker-servers can revoke a malicious worker by removing its PRE key kbr from the user list L. As shown in Fig. 5(b), it takes about 0.7s to revoke 300 workers from the broker-server, which is similar to the computational efficiency in the plaintext domain.

[Fig. 5: Evaluation for the FedCrowd initialization. (a) Authorized user registration: time cost (s) vs. number of authorized users. (b) Worker revocation: time cost (s) vs. number of revoked users.]

7.2.3 Evaluation on Task Publication
Recall that FedCrowd requires the requesters to encrypt their keywords or values with our enhanced PRE scheme, and later send the ciphertexts to related broker-servers for on-chain index generation. Fig. 6 presents the performance evaluation of these procedures at the user side. As we can see from Fig. 6 (a) and (b), both keyword and range value encryption costs increase linearly with the number of data items. Encrypting range values takes more time because the requesters need to divide the values into bit-blocks and generate the value ciphertext block by block. Specifically, given 800 data items, requesters take about 2.3s for keyword encryption, which is basically the same as the existing PRE scheme [8]. Meanwhile, Fig. 6 (b) also compares the time cost of the range value encryption for different block sizes. The result shows that encryption with the 4-bits design is faster than with the other designs. It takes about 2.6s to encrypt 100 values, which saves almost 40% of the time cost compared with the 2-bits design.
After receiving the PRE ciphertexts from requesters, the broker-server is required to generate the encrypted on-chain indexes as illustrated in Section 5. In Fig. 6 (c) and Fig. 6 (d), we measure the total time cost of building the encrypted indexes by using Algorithm 2 and Algorithm 4. The results show that the broker-server only needs to take less than


[Fig. 6: Evaluation for the FedCrowd off-chain performance. (a) PRE keyword encryption: time cost (s) vs. number of task keywords, PRE Keyword compared with Shu et al. [8]. (b) PRE value encryption: time cost (s) vs. number of task values, for 1-bit, 2-bits, and 4-bits PRE value. (c) Ext index generation: time cost (s) vs. number of Exact-match indexes. (d) Rng index generation: time cost (s) vs. number of Range-match indexes, for 1-bit, 2-bits, and 4-bits Rng Index.]

2.5s to complete the build procedure when generating 1600 index entries. For the range-based index generation, we note that it introduces additional time cost because each index value must be divided into bit-blocks and the ciphertext blocks must be generated independently with random masks. Specifically, it takes around 3s to generate 200 index entries with the 4-bits design, which is roughly half of the time cost of the 2-bits encryption scheme. The results also show that the building time of range-based indexes is roughly 10× higher than that of the exact-match indexes. Nonetheless, the time cost is still within an acceptable level, and this building procedure is a one-time setup cost.
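The bit-block decomposition mentioned above can be sketched on plaintext values as follows. This is a minimal illustration under stated assumptions, not the scheme itself: the function name `split_into_blocks` is hypothetical, the 32-bit width follows the example in Section 7.2.1, and the per-block PRE encryption and random masks are deliberately not modeled.

```python
def split_into_blocks(value, block_bits, width=32):
    """Split an unsigned `width`-bit integer into blocks of `block_bits` bits.

    Blocks are returned most significant first. Plaintext only: the
    per-block encryption and random masking are NOT modeled here.
    """
    if width % block_bits != 0:
        raise ValueError("block_bits must divide width")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in width bits")
    mask = (1 << block_bits) - 1
    n_blocks = width // block_bits  # 32/b blocks for a 32-bit value
    return [
        (value >> (block_bits * (n_blocks - 1 - i))) & mask
        for i in range(n_blocks)
    ]


if __name__ == "__main__":
    for b in (1, 2, 4):
        blocks = split_into_blocks(1000, b)
        print(f"{b}-bit design: {len(blocks)} blocks")
```

A 32-bit value yields 32, 16, and 8 blocks for the 1-bit, 2-bits, and 4-bits designs, respectively, so a larger block size means fewer blocks to process per value.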
7.2.4 Evaluation on Task Recommendation
In FedCrowd, the broker-servers need to post the encrypted task indexes and query requests on the blockchain platform for task-matching services. To assess the practicality of our design, Fig. 7(a) measures the gas cost of our implemented contract on Ethereum, including posting task indexes, exact-match task recommendation, and range-match task recommendation. In particular, the gasPrice is set to 2 Gwei, where 1 Gwei = 10^-9 ether. The total cost in ether is computed under the Ethereum gas rule: gasCost × gasPrice. As shown in Fig. 7(a), deploying 50 exact-match indexes on Ethereum costs about $4.8 USD at an exchange rate of 1 ether = $200 USD. Meanwhile, it takes $6.07 USD to run the task-matching service 50 times, of which approximately 20% is used for on-chain pairings. The results indicate that the monetary cost is not a burden for brokers.
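The gas rule above reduces to simple arithmetic, sketched below. The function names are assumptions introduced for illustration; the default gas price (2 Gwei) and exchange rate (1 ether = $200 USD) are the values stated in the text.

```python
GWEI_PER_ETHER = 10**9  # 1 Gwei = 10^-9 ether


def gas_to_usd(gas_used, gas_price_gwei=2.0, usd_per_ether=200.0):
    """gasCost x gasPrice gives the cost in ether; then apply the exchange rate."""
    ether = gas_used * gas_price_gwei / GWEI_PER_ETHER
    return ether * usd_per_ether


def usd_to_gas(usd, gas_price_gwei=2.0, usd_per_ether=200.0):
    """Inverse of gas_to_usd: the gas amount implied by a USD cost."""
    ether = usd / usd_per_ether
    return ether * GWEI_PER_ETHER / gas_price_gwei


if __name__ == "__main__":
    for usd in (4.8, 6.07):
        print(f"${usd:.2f} corresponds to {usd_to_gas(usd):.3e} gas")
```

Under these parameters, the quoted $4.8 USD corresponds to roughly 1.2 × 10^7 gas and the quoted $6.07 USD to roughly 1.5 × 10^7 gas. These are values implied by the stated prices, not separately reported measurements.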
In Fig. 7(b) and Fig. 7(c), we further measure the transaction confirmation time of deploying different task indexes to the smart contracts. In this experiment, the average block time for mining is set to 2s. The results show that even when the index load is heavy, the task deployment can be fast. The time cost of deploying 100 exact-match indexes is approximately 2.2s, which is roughly the same as deploying 150 tasks. Meanwhile, we also found that when more than 150 index entries are uploaded, the remaining exact-match indexes are recorded on the blockchain in subsequent transactions. As shown in Fig. 7(b), it takes about 4.6s to deploy 200 and 250 indexes to the smart contracts. Likewise, when the number of newly added range-match indexes is 100, it takes about 19s to deploy these index entries to the smart contracts, which is quite modest for blockchain-based applications.

[Fig. 7: Evaluation for the FedCrowd on-chain performance. (a) Gas consumption: gas cost (×10^7) vs. number of task indexes (10, 20, 50 tasks), for Post tasks, Ext-match, and Rng-match. (b) Posting Ext indexes: transaction time (s) vs. number of experiment times, for #100, #150, #200, #250. (c) Posting Rng indexes: transaction time (s) vs. number of experiment times, for #100, #150, #200, #250. (d) Throughput comparison: tasks/min vs. number of matched results, for Ext-match, 2-bits Rng-match, and 4-bits Rng-match. (e) Ext task-matching ratio: ratio (%) vs. transaction confirmation time (s), for #50 to #250. (f) Ext task-matching latency: transaction time (s) vs. number of task indexes, Ext-match compared with Hu et al. [51]. (g) Rng task-matching ratio: ratio (%) vs. transaction confirmation time (s), for #50 to #250. (h) Rng task-matching latency: transaction time (s) vs. number of task indexes, for 1-bit, 2-bits, and 4-bits Rng-match.]

To assess the efficiency of FedCrowd, Fig. 7(d) compares the throughput of keyword and range-based on-chain matching when varying the number of matched results. In our design, smart contracts execute bilinear pairing operations to scan the whole on-chain indexes for task-matching services. Thus, the throughput of both task-matching schemes remains stable. In particular, it shows that our exact-match implementation can handle over 600 keyword queries per minute. Regarding the throughput of range-based matching, we measure the cryptographic overhead by varying the block size of the encryption scheme.


The result shows that the 4-bits index design achieves better efficiency than the other designs. Specifically, the throughput with the 4-bits index design is about 43.5% higher than with the 2-bits design, and it can handle around 200 range-based queries per minute. According to the results, our proposed schemes achieve acceptable on-chain matching efficiency and strong security guarantees.
To gain a deeper understanding of the FedCrowd performance, we accordingly evaluate the matching latency for various numbers of query requests. The total time cost consists of generating the query transactions at the broker side, the cryptographic operations at the blockchain side, and the corresponding transaction confirmation time. In particular, Fig. 7(e) and Fig. 7(g) present the query latency CDFs for on-chain task matching under various numbers of task indexes. As we can see from the results, the transaction confirmation time follows a similar upward trend as the number of tasks increases. The average query latency for returning 150 exact-match results ranges from 10s to 12s, which is fast for privacy-preserving applications. Meanwhile, Fig. 7(g) shows that the latency for over 90% of range-match requests is less than 45s when returning 50 matched results. Accordingly, as shown in Fig. 7(f) and Fig. 7(h), as the number of on-chain indexes increases, the latency of exact-match and range-match queries rises gradually in similar proportions. Specifically, the latency for exact-match rises from around 8s to 15s when the number of on-chain indexes increases from 100 to 200. Compared with the existing blockchain-based SE scheme [49], both schemes have similar query efficiency. It is worth noting that the scheme in [49] only supports single-user settings because it is constructed with symmetric encryption, whereas our design is customized for multi-user settings. Fig. 7(h) shows that the latency of range queries is higher than that of exact-match queries because the smart contracts need to conduct block-based pairing operations to find the range-match ciphertexts. It takes about 17s to find 50 range-matched tasks over the blockchain platform. Nevertheless, the performance can be further improved via binary search, i.e., sorting values before encryption. Overall, the evaluation results confirm that FedCrowd can support federated on-chain matching efficiently.

8 CONCLUSION
In this paper, we introduce a new federated crowdsourcing platform, called FedCrowd, that enables secure task recommendations across multiple independent systems. Leveraging the cryptographic primitives bridging PRE and bilinear mappings as our starting point, we have designed and implemented a secure task-matching protocol based on smart contracts to achieve privacy-preserving task recommendations via the blockchain platform. Besides, we devise a range-matching scheme that supports efficient numeric range-matching for federated crowdsourcing. We provide a thorough security analysis to illustrate that our design can protect the confidentiality of task information. We complete the prototype implementation and deploy it on Ethereum. Extensive experimental results on real-world datasets confirm that FedCrowd is provably secure and highly efficient. As future work, we plan to explore advanced searchable encryption schemes to support other rich task-matching functions, such as multi-keyword queries. Meanwhile, we leave the detection of malicious requesters/workers who intentionally leak their private keys as future work.

REFERENCES
[1] J. Howe, “The rise of crowdsourcing,” Wired magazine, vol. 14, no. 6, pp. 1–4, 2006.
[2] Upwork, “Upwork project.” Online at https://www.upwork.com, 2015.
[3] CrowdFlower, “Crowdflower project.” Online at https://www.crowdflower.com, 2015.
[4] Amazon Mechanical Turk, “Amazon mechanical turk project.” Online at https://www.mturk.com, 2005.
[5] The Ethereum Project, Online at https://ethereum.org, 2014.
[6] hackread, “data breach news.” Online at https://www.hackread.com, 2020.
[7] J. Shu, K. Yang, X. Jia, X. Liu, C. Wang, and R. Deng, “Proxy-free privacy-preserving task matching with efficient revocation in crowdsourcing,” IEEE TDSC, 2018.
[8] J. Shu, X. Jia, K. Yang, and H. Wang, “Privacy-preserving task recommendation services for crowdsourcing,” IEEE TSC, 2018.
[9] Y. Shen, L. Huang, L. Li, X. Lu, S. Wang, and W. Yang, “Towards preserving worker location privacy in spatial crowdsourcing,” in Proc. of IEEE GLOBECOM, 2015.
[10] H. To, G. Ghinita, and C. Shahabi, “A framework for protecting worker location privacy in spatial crowdsourcing,” IEEE VLDB Endowment, vol. 7, no. 10, pp. 919–930, 2014.
[11] M. Naveed, S. Kamara, and C. V. Wright, “Inference attacks on property-preserving encrypted databases,” in Proc. of ACM CCS, 2015.
[12] D. Cash, P. Grubbs, J. Perry, and T. Ristenpart, “Leakage-abuse attacks against searchable encryption,” in Proc. of ACM CCS, 2015.
[13] C. Gentry, “Fully homomorphic encryption using ideal lattices,” in Proc. of ACM STOC, 2009.
[14] C. Cai, J. Weng, X. Yuan, and C. Wang, “Enabling reliable keyword search in encrypted decentralized storage with fairness,” IEEE TDSC, 2018.
[15] C. Zhang, C. Xu, J. Xu, Y. Tang, and B. Choi, “Gem^2-tree: A gas-efficient structure for authenticated range queries in blockchain,” in Proc. of IEEE ICDE, 2019.
[16] C. Dong, G. Russello, and N. Dulay, “Shared and searchable encrypted data for untrusted servers,” Journal of Computer Security, vol. 19, no. 3, pp. 367–397, 2011.
[17] K. Lewi and D. J. Wu, “Order-Revealing Encryption: New Constructions, Applications, and Lower Bounds,” in Proc. of ACM CCS, 2016.
[18] N. Chenette, K. Lewi, S. A. Weis, and D. J. Wu, “Practical Order-Revealing Encryption with Limited Leakage,” in Proc. of FSE, 2016.
[19] K. Huang, J. Yao, J. Zhang, and Z. Feng, “When human service meets crowdsourcing: Emerging in human service collaboration,” IEEE TSC, vol. 12, no. 3, pp. 460–473, 2018.
[20] Freelancer, “Freelancer project.” Online at https://www.freelancer.com, 2015.
[21] Waze, “Waze project.” Online at https://www.Waze.com, 2015.
[22] P. Créquit, G. Mansouri, M. Benchoufi, A. Vivot, and P. Ravaud, “Mapping of crowdsourcing in health: Systematic review,” Journal of medical Internet research, vol. 20, no. 5, p. e187, 2018.
[23] R. Liu, J. Cao, K. Zhang, W. Gao, J. Liang, and L. Yang, “When privacy meets usability: unobtrusive privacy permission recommendation system for mobile apps based on crowdsourcing,” IEEE TSC, vol. 11, no. 5, pp. 864–878, 2018.
[24] R. Liu, J. Liang, J. Cao, K. Zhang, W. Gao, L. Yang, and R. Yu, “Understanding mobile users’ privacy expectations: A recommendation-based method through crowdsourcing,” IEEE TSC, vol. 12, no. 2, pp. 304–318, 2019.
[25] J. Shu, X. Liu, K. Yang, Y. Zhang, X. Jia, and R. H. Deng, “Sybsub: Privacy-preserving expressive task subscription with sybil detection in crowdsourcing,” IEEE IoT-J, vol. 6, no. 2, pp. 3003–3013, 2018.
[26] Y. Luo, X. Jia, S. Fu, and M. Xu, “pride: Privacy-preserving ride matching over road networks for online ride-hailing service,” IEEE TIFS, vol. 14, no. 7, pp. 1791–1802, 2018.


[27] H. Yu, X. Jia, H. Zhang, X. Yu, and J. Shu, “Psride: Privacy-preserving shared ride matching for online ride hailing systems,” IEEE TDSC, 2019.
[28] S. Kamara, C. Papamanthou, and T. Roeder, “Dynamic searchable symmetric encryption,” in Proc. of ACM CCS, 2012.
[29] R. A. Popa, C. Redfield, N. Zeldovich, and H. Balakrishnan, “CryptDB: protecting confidentiality with encrypted query processing,” in Proc. of ACM SOSP, 2011.
[30] Y. Guo, X. Yuan, X. Wang, C. Wang, B. Li, and X. Jia, “Enabling encrypted rich queries in distributed key-value stores,” IEEE TPDS, vol. 30, no. 7, pp. 1283–1297, 2018.
[31] B. Fuller, M. Varia, A. Yerukhimovich, E. Shen, A. Hamlin, V. Gadepally, R. Shay, J. D. Mitchell, and R. K. Cunningham, “Sok: Cryptographically protected database search,” in Proc. of IEEE S&P, 2017.
[32] X. Yuan, Y. Guo, X. Wang, C. Wang, B. Li, and X. Jia, “Enckv: An encrypted key-value store with rich queries,” in Proc. of ACM AsiaCCS, 2017.
[33] R. Bost, “Sophos - forward secure searchable encryption,” in Proc. of ACM CCS, 2016.
[34] S. Kamara and C. Papamanthou, “Parallel and dynamic searchable symmetric encryption,” in Proc. of Financial Cryptography, 2013.
[35] E. Stefanov, C. Papamanthou, and E. Shi, “Practical Dynamic Searchable Symmetric Encryption with Small Leakage,” in Proc. of NDSS, 2014.
[36] F. Bao, R. H. Deng, X. Ding, and Y. Yang, “Private query on encrypted data in multi-user settings,” in Proc. of ISPEC, 2008.
[37] R. A. Popa and N. Zeldovich, “Multi-key searchable encryption,” Cryptology ePrint Archive, Report 2013/508, 2013.
[38] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” in Proc. of ACM CCS, 2006.
[39] M. Blaze, G. Bleumer, and M. Strauss, “Divertible protocols and atomic proxy cryptography,” International Conference on the Theory and Applications of Cryptographic Techniques, 1998.
[40] A. Boldyreva, N. Chenette, Y. Lee, and A. O’Neill, “Order-Preserving Symmetric Encryption,” in Proc. of EUROCRYPT, 2009.
[41] R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order preserving encryption for numeric data,” in Proc. of ACM SIGMOD, 2004.
[42] R. A. Popa, F. H. Li, and N. Zeldovich, “An ideal-security protocol for order-preserving encoding,” in Proc. of IEEE S&P, 2013.
[43] The Bitcoin Project, Online at https://bitcoin.org/en/, 2009.
[44] K. Peterson, R. Deeduvanu, P. Kanjamala, and K. Boles, “A blockchain-based approach to health information exchange networks,” NIST Workshop Blockchain Healthcare, vol. 1, pp. 1–10, 2016.
[45] Storj, “Storj project.” Online at https://storj.io/storj.pdf, 2015.
[46] X. Cheng, C. Zhang, and J. Xu, “vchain: Enabling verifiable boolean range queries over blockchain databases,” in Proc. of SIGMOD, 2019.
[47] Google, “Introducing six new cryptocurrencies in bigquery public datasets and how to analyze them,” Online at https://cloud.google.com/blog/products/data-analytics/, 2019.
[48] BigchainDB, “Bigchaindb project.” Online at https://www.bigchaindb.com, 2019.
[49] S. Hu, C. Cai, Q. Wang, C. Wang, X. Luo, and K. Ren, “Searching an encrypted cloud meets blockchain: A decentralized, reliable and fair realization,” in Proc. of IEEE INFOCOM, 2018.
[50] G. Yu, C. Zhang, and X. Jia, “Verifiable and forward-secure encrypted search using blockchain techniques,” in Proc. of IEEE ICC, 2020.
[51] H. Duan, Y. Zheng, Y. Du, A. Zhou, C. Wang, and M. H. Au, “Aggregating crowd wisdom via blockchain: A private, correct, and robust realization,” in Proc. of IEEE PerCom, 2019.
[52] L. Ming, J. Weng, A. Yang, W. Lu, Y. Zhang, L. Hou, J.-N. Liu, Y. Xiang, and R. H. Deng, “Crowdbc: A blockchain-based decentralized framework for crowdsourcing,” IEEE TPDS, vol. 30, no. 6, pp. 1251–1266, 2018.
[53] Y. Lu, Q. Tang, and G. Wang, “Zebralancer: Private and anonymous crowdsourcing system atop open blockchain,” in Proc. of IEEE ICDCS, 2018.
[54] S. Nick, “Formalizing and securing relationships on public networks,” First Monday, vol. 2, no. 9, pp. 1–2, 1997.
[55] The Hyperledger Project, Online at https://Hyperledger.org, 2015.
[56] B. Bünz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, and G. Maxwell, “Bulletproofs: Short proofs for confidential transactions and more,” in Proc. of IEEE S&P, 2018.
[57] M. Steiner, G. Tsudik, and M. Waidner, “Diffie-hellman key distribution extended to group communication,” in Proc. of ACM CCS, 1996.

Yu Guo (M’19) is currently a Lecturer at the School of Artificial Intelligence, Beijing Normal University. He received his B.E. degree in Software Engineering from Northeastern University in 2013, the M.Sc degree in Electronic Commerce and Ph.D. degree in Computer Science from City University of Hong Kong in 2014 and 2019. He has also been a Postdoctor and Research Fellow at City University of Hong Kong. His research interests include cloud computing security, network security, privacy-preserving data processing, and blockchain technology.

Hongcheng Xie received the B. Eng. in software engineering from University of Electronic Science and Technology of China in 2019. He is currently pursuing the Ph.D. degree in Department of Computer Science, City University of Hong Kong. His research interests include information security on cloud computing and blockchain.

Yinbin Miao (M’18) received the B.E. degree with the Department of Telecommunication Engineering from Jilin University, Changchun, China, in 2011, and Ph.D. degree with the Department of Telecommunication Engineering from Xidian University, Xi’an, China, in 2016. He is also a postdoctor in Nanyang Technological University from September 2018 to September 2019. He is currently a Lecturer with the Department of Cyber Engineering in Xidian University, Xi’an, China, and a postdoctor in City University of Hong Kong, Hong Kong, China. His research interests include information security and applied cryptography.

Cong Wang is an associate professor at the Department of Computer Science, City University of Hong Kong. He received his B.E. and M.E. degrees from Wuhan University in 2004 and 2007, and the Ph.D. degree from Illinois Institute of Technology in 2012, all in Electrical and Computer engineering. He worked at the Palo Alto Research Center in the summer of 2011. His research interests include cloud computing security, multimedia security, mobile security, targeted advertising, etc.

Xiaohua Jia (F’13) received the BSc and MEng degrees in 1984 and 1987, respectively, from the University of Science and Technology of China, and the DSc degree in 1991 in information science from the University of Tokyo. He is currently the chair professor with the Department of Computer Science at the City University of Hong Kong. His research interests include cloud computing and distributed systems, computer networks, wireless sensor networks and mobile wireless networks. He is an editor of the IEEE Transactions on Parallel and Distributed Systems (2006-2009), Wireless Networks, Journal of World Wide Web, Journal of Combinatorial Optimization, etc. He is the general chair of ACM MobiHoc 2008, TPC co-chair of IEEE MASS 2009, area-chair of IEEE INFOCOM 2010, TPC co-chair of IEEE GlobeCom 2010-Ad Hoc and Sensor Networking Symposium, and Panel co-chair of IEEE INFOCOM 2011. He is a Fellow of the IEEE Computer Society.
