Abstract—Crowdsourcing is a promising computing paradigm that utilizes collective intelligence to solve complex tasks. While it is valuable, traditional crowdsourcing systems lock computation resources inside each individual system, where tasks cannot reach numerous potential workers among the other systems. Therefore, there is a great need to build a federated platform for different crowdsourcing systems to share resources. However, the security issue lies at the center of constructing the federated crowdsourcing platform. Although many studies focus on privacy-preserving crowdsourcing, existing solutions require a trusted third party to perform the key management, which is not applicable in our federated platform. The reason is that it is difficult for a third party to be trusted by various systems. In this paper, we present a secure crowdsourcing framework as our initial effort toward this direction, which bridges together the recent advancements of blockchain and cryptographic techniques. Our proposed design, named PFcrowd, allows different crowdsourcing systems to perform encrypted task-worker matching over the blockchain platform without involving any third-party authority. The core idea is to utilize the blockchain to assist the federated crowdsourcing by moving the task recommendation algorithm to the trusted smart contract. To avoid third-party involvement, we first leverage the re-writable deterministic hashing (RDH) technique to convert the problem of federated task-worker matching into secure query authorization. We then devise a secure scheme based on RDH and searchable encryption (SE) to support privacy-preserving task-worker matching via the smart contract. We formally analyze the security of our proposed scheme and implement the system prototype on Ethereum. Extensive evaluations on real-world datasets demonstrate the efficiency of our design.

Index Terms—Federated Crowdsourcing, Re-writable Deterministic Hashing, Searchable Encryption, Blockchain.

I. INTRODUCTION

With the development of the internet and the sharing economy, crowdsourcing [1] has emerged as a compelling computing paradigm, which utilizes collective intelligence to solve complex tasks, such as sentiment analysis [2] and data collection [3]. Nowadays, an increasing number of companies choose crowdsourcing as a problem-solving method, and many well-known platforms such as Upwork [4], UBER [5], and Amazon Mechanical Turk [6] are active in our lives.

Despite their advantages, crowdsourcing systems have some serious security concerns. In current crowdsourcing systems, the crowdsourcing platforms (called brokers) perform task-worker matching based on the plaintext of task requirements and interests submitted by task-requesters and workers, respectively. These submitted task requirements and interests contain a lot of sensitive information, such as business secrets and the location of workers. The privacy of these task-requesters and workers may be compromised when sensitive information is exposed to brokers.

To address the privacy concern, several privacy-preserving solutions for crowdsourcing have been proposed. Shu et al. [7]–[9] first proposed a secure task-worker matching scheme for crowdsourcing using proxy cryptography and later extended it to SybSub, which supports expressive task subscription with sybil detection. Unfortunately, since these schemes rely on central servers, they inherit the weaknesses of traditional trust-based models, such as a single point of failure. Moreover, they are vulnerable to distributed denial of service (DDoS) attacks. To solve these issues, CrowdBC [10] devised a blockchain-based decentralized framework for crowdsourcing without relying on any third-party trusted institution. This system enhances the security of users and the availability of service. However, all the existing schemes assume a paradigm in which only a single broker exists. The task-requesters and workers are locked inside a single and isolated system. The task-requesters cannot reach numerous potential workers outside the boundary of the system and the workers have no access to the tasks published on other platforms, which significantly limits the power of the open market of crowdsourcing systems.

This motivates us to investigate a privacy-preserving and federated crowdsourcing system. Our research aims at designing a privacy-preserving and federated framework that interconnects the existing independent brokers to form a loosely coupled federation without relying on any third-party authority. In such a framework, each broker can keep its own autonomy while being able to use potential resources (tasks and workers for crowdsourcing) in other brokers if authorized by them. Inspired by the immutability and verifiability of blockchain, we utilize the blockchain as the underlying platform for building the federation of crowdsourcing systems. Existing solutions that use blockchain for crowdsourcing [10]–[12] assume the existence of a trusted authority that can faithfully perform key management. The trusted authority can assign different types of secret keys to participants so that the ciphertexts of the same keyword encrypted by different keys can be transformed into the same format. However, in
Authorized licensed use limited to: University of Liverpool. Downloaded on November 13,2020 at 11:53:08 UTC from IEEE Xplore. Restrictions apply.
2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS)
token and uploads it to the smart contract. Then the smart contract will search over the on-chain task indexes uploaded by the brokers that authorize $b_i$ to access.

Task taking: After collecting all the tasks matching the worker's interest, the smart contract records the query result on the blockchain and the worker can then ask for the tasks from the respective brokers. To confirm that a worker takes a task, the broker will sign a task-contract with the worker and record the contract on the blockchain.

B. Threat Model

Brokers: The brokers are considered semi-honest. They will honestly execute the pre-defined protocols but may be curious about the sensitive information regarding received task requirements and interests. Moreover, they are also curious about

cess is based on authorization. It should be guaranteed that a broker can only search over the on-chain task indexes posted by the brokers that authorize it.
• Federation: In PFcrowd, all brokers should be interconnected and form a loosely coupled group. For the task indexes posted by one broker, other authorized brokers can access them by searching on the blockchain.
• Traceable: All operations, task indexes and authorization relationships recorded on the blockchain should be traceable. Additionally, the matching results of all task recommendation queries are required to be recorded on the blockchain.

IV. THE PROPOSED CONSTRUCTION

We aim to design a privacy-preserving and federated crowdsourcing framework, which interconnects existing independent
14  Set $Aut_{b \to b^*} \leftarrow g^{F_b^1 / F_{b^*}^2}$;
15  Send $\{\tilde{F}_{b^*}, Aut_{b \to b^*}\}$ to the smart contract;
16  Smart contract build on-chain authorization list
17  for each $\{\tilde{F}_{b^*}, Aut_{b \to b^*}\}$ received from broker $b$ do
18      Add $Aut_{b \to b^*}$ in $A[\tilde{F}_{b^*}]$;

Algorithm 2: Build Encrypted On-chain Task Index
Input: secure PRFs $\{G_1, G_2\}$; symmetric encryption function $Enc$; states $s$; encrypted requirement $H(w)$; $B$; task ID set $Tset$.
Output: On-chain task index $I$, table $S$.
1   for each broker $b \in B$ do
2       for each encrypted requirement $H(w)$ do
3           Set $\alpha \leftarrow \lceil |Tset(H(w))| / p \rceil$, where $p$ denotes the number of tasks that can be packed;
4           Divide $Tset(H(w))$ into $\alpha$ blocks;
5           Pad the last block to $p$ entries if needed, $s \leftarrow 0$;
6           Set $trap \leftarrow g^{H(w) \cdot F_b^1}$;
7           for each block in $Tset(H(w))$ do
8               $\widetilde{Tid} \leftarrow Tid_1 \| Tid_2 \| \cdots \| Tid_p$;
9               $label \leftarrow G_1(trap \| s)$;
10              $C \leftarrow Enc(k, \widetilde{Tid})$;
11              $P \leftarrow G_2(trap \| s) \oplus C$;
12              Put $[label : P]$ to index $I$, $s$++;
13          Put $[H(w) : s]$ to table $S$;
14  send $I$ to the smart contract;

search over the encrypted on-chain task indexes posted by the brokers authorizing $b_1$, and then record the matching tasks on the blockchain. However, brokers encrypt their uploaded task indexes with their secret key and no trusted third-party exists in our platform. It is a challenging issue to perform authorization-based task recommendations over task indexes encrypted by different brokers without a trusted party.

To solve this problem, we resort to the RDH [13] technique. The core idea of it is to convert the search tokens uploaded by brokers into on-chain search trapdoors based on the authorization relationship among brokers. For two brokers $b_1, b_2 \in B$, the authorization of $b_1$ to $b_2$ is denoted as $g^{b_1/b_2}$, which is stored in the authorization relationship set of $b_2$. The on-chain search trapdoor for matching task indexes posted by $b_1$ about interest $H(w)$ is $g^{H(w) \cdot b_1}$. When $b_2$ searches for $H(w)$, $b_2$ generates search token $H(w) \cdot b_2$ and uploads it to the smart contract. When receiving the search token, the smart contract first gets the authorization relationship set of $b_2$ and computes on-chain search trapdoors. For the authorization value $g^{b_1/b_2}$, one on-chain search trapdoor can be computed by the smart contract as:

$(g^{b_1/b_2})^{H(w) \cdot b_2} = g^{H(w) \cdot b_1}$    (2)

which is equal to the on-chain search trapdoor pre-defined by $b_1$ about interest $H(w)$. Hence $b_2$ can get the tasks uploaded by $b_1$. Note that, if $b_1$ does not authorize $b_2$ to search, the smart contract cannot generate the right on-chain search trapdoor which can return matching tasks posted by $b_1$ successfully. In this way, we realize the authorization-based task-worker matching without involving any third-party authority.
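The trapdoor conversion in (2) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a toy subgroup of prime order $q = 1019$ inside $\mathbb{Z}_p^*$ with $p = 2039$ and $g = 4$, uses SHA-256 reduced to a nonzero exponent as a stand-in for $H$, and picks small fixed broker secrets. All of these parameters and function names are illustrative assumptions and are far too small to be secure. Division in the exponent is realized as a modular inverse modulo $q$ (Python 3.8+).

```python
import hashlib

# Toy group (assumption, insecure sizes): P = 2*Q + 1 with P and Q prime.
# G = 4 is a square mod P, so it generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4

def H(keyword: str) -> int:
    """Stand-in for H(w): map a keyword to a nonzero exponent modulo Q."""
    digest = hashlib.sha256(keyword.encode()).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1

def authorize(owner_secret: int, searcher_secret: int) -> int:
    """Authorization value g^(b1/b2); the division is an inverse mod Q."""
    exponent = owner_secret * pow(searcher_secret, -1, Q) % Q
    return pow(G, exponent, P)

def search_token(keyword: str, searcher_secret: int) -> int:
    """Search token H(w) * b2, kept as an exponent modulo Q."""
    return H(keyword) * searcher_secret % Q

def convert(aut: int, token: int) -> int:
    """What the contract computes: (g^(b1/b2))^(H(w) * b2)."""
    return pow(aut, token, P)

def owner_trapdoor(keyword: str, owner_secret: int) -> int:
    """Trapdoor the index owner used when posting: g^(H(w) * b1)."""
    return pow(G, H(keyword) * owner_secret % Q, P)

b1, b2, b3 = 123, 456, 789          # hypothetical broker secrets
aut_b1_to_b2 = authorize(b1, b2)    # b1 authorizes b2

authorized = convert(aut_b1_to_b2, search_token("python", b2))
# b3 tries to reuse the value issued for b2 with its own token.
unauthorized = convert(aut_b1_to_b2, search_token("python", b3))

print(authorized == owner_trapdoor("python", b1))
print(unauthorized == owner_trapdoor("python", b1))
```

In this sketch the converted value matches the owner's trapdoor only when the token and the authorization value refer to the same searching broker, which mirrors the argument made for (2).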
However, as RDH is a deterministic algorithm, the labels of on-chain task indexes about the same keyword posted by the same broker are identical. Therefore, it leaks the relation between queried interests and task indexes to peer nodes before searching. To solve this problem, we introduce the state variable and utilize SSE technology to enhance security. Each broker maintains a local state table, in which each encrypted task requirement corresponds to a state variable. Whenever one broker uploads a task index about one requirement, the state variable corresponding to the requirement is incremented by one. In this way, the relation between task requirements and tasks is protected.
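The effect of the state variable can be illustrated with a small sketch. It is only an approximation under stated assumptions: $G_1$ and $G_2$ are modelled as domain-separated SHA-256 hashes, the trapdoor is an arbitrary byte string, and each block is treated as opaque bytes of at most 32 bytes (in the scheme it would be a ciphertext $C = Enc(k, \cdot)$; encryption is omitted here). The names `post_blocks` and `search` are hypothetical.

```python
import hashlib

BLOCK_LEN = 32  # packed block size in bytes, equal to the SHA-256 output length

def G1(data: bytes) -> bytes:
    """Stand-in for G1: derives the index label."""
    return hashlib.sha256(b"G1" + data).digest()

def G2(data: bytes) -> bytes:
    """Stand-in for G2: derives the mask XORed onto the block."""
    return hashlib.sha256(b"G2" + data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def post_blocks(index: dict, states: dict, requirement: bytes,
                trap: bytes, blocks: list) -> None:
    """Broker side: store each block under G1(trap || s), then advance s."""
    s = states.get(requirement, 0)      # resume from the local state table
    for block in blocks:
        if len(block) > BLOCK_LEN:
            raise ValueError("block longer than BLOCK_LEN")
        tag = trap + s.to_bytes(4, "big")
        index[G1(tag)] = xor(G2(tag), block.ljust(BLOCK_LEN, b"\0"))
        s += 1
    states[requirement] = s             # next unused state for this requirement

def search(index: dict, trap: bytes) -> list:
    """Contract side: count c upward until G1(trap || c) is not in the index."""
    found, c = [], 0
    while True:
        tag = trap + c.to_bytes(4, "big")
        label = G1(tag)
        if label not in index:
            return found
        found.append(xor(index[label], G2(tag)).rstrip(b"\0"))
        c += 1

index, states = {}, {}
trap = b"trapdoor-for-H(w)"             # placeholder for the group element
post_blocks(index, states, b"H(w)", trap, [b"t1|t2", b"t3|t4"])
post_blocks(index, states, b"H(w)", trap, [b"t5"])   # a later batch
results = search(index, trap)
print(results, states)
```

Because every entry is stored under a fresh label, two entries for the same requirement are not linkable from the index alone, while a party holding the trapdoor can still enumerate them by counting upward from 0.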
Algorithm 3: Task Recommendation
Input: secure PRFs $\{G_1, G_2\}$; search interest $H(w)$; on-chain task index $I$; on-chain authorization list $A$; broker $b \in B$.
Output: Matching task ID set $T$.
1   Broker $b$ generates search token
2   Set $T_{b,w} \leftarrow H(w) \cdot F_b^2$;
3   Send $\{T_{b,w}, \tilde{F}_b\}$ to blockchain;
4   Smart contract search reply
5   Get $\{T_{b,w}, \tilde{F}_b\}$ received from broker $b$;
6   $T \leftarrow \emptyset$;
7   for each $Aut$ in $A[\tilde{F}_b]$ do
8       $ct = Aut^{T_{b,w}}$, $c \leftarrow 0$;
9       while $G_1(ct \| c) \in I$ do
10          $C = I[G_1(ct \| c)] \oplus G_2(ct \| c)$, $c$++;
11          $T \leftarrow T + C$;
12  Record $T$ on the blockchain;
13  Decryption
14  Read $T$ on the blockchain;
15  Request session keys from corresponding brokers and decrypt $T$ to get the matching tasks;

Algorithm 4: Post New On-chain Task Index
Input: secure PRFs $\{G_1, G_2\}$; newly added encrypted task requirement $H(w)$; newly added task ID set $Tset^*$; on-chain task index $I$; table $S$.
Output: Newly generated on-chain task index $I^*$.
1   Broker $b$ posts new tasks
2   for each encrypted requirement $H(w)$ do
3       $trap \leftarrow g^{H(w) \cdot F_b^1}$;
4       if $H(w)$ not in $S$ then
5           $s \leftarrow 0$;
6           Put $[H(w) : s]$ to table $S$;
7       else
8           $s \leftarrow S[H(w)]$;
9       Set $\alpha \leftarrow \lceil |Tset^*(H(w))| / p \rceil$;
10      Divide $Tset^*(H(w))$ into $\alpha$ blocks and pad the last block to $p$ entries if needed;
11      for each block in $Tset^*(H(w))$ do
12          $\widetilde{Tid} \leftarrow Tid_1 \| Tid_2 \| \cdots \| Tid_p$;
13          $C \leftarrow Enc(k, \widetilde{Tid})$;
14          $label \leftarrow G_1(trap \| s)$;
15          $P \leftarrow G_2(trap \| s) \oplus C$;
16          Put $[label : P]$ to index $I^*$, $s$++;
17      $S[H(w)] \leftarrow s$;
18  send $I^*$ to the smart contract;

[Fig. 2: image not recoverable from the extracted text. The surviving labels indicate a smart contract holding an on-chain authorization index, on-chain search trapdoors, and an on-chain task index of label : payload entries, together with the search token sent by broker $b_1$ and the three brokers $b_1$, $b_2$, $b_3$.]

B. Protocol Details

System initialization: In the system initialization phase, brokers first generate their corresponding parameters, as shown in lines 1-9 of Algorithm 1. Specifically, given the security parameter $\lambda$, brokers get the parameter $g$ from the group $G$, which is used to generate the on-chain authorization list and task indexes. Each broker $b \in B$ randomly selects three master keys $\tilde{k}_b, k_b, k'_b \xleftarrow{\$} \{0,1\}^{\lambda}$, and then computes $\tilde{F}_b$, $F_b^1$, and $F_b^2$ via secure PRF $G_1$. $ID_b$ represents the unique ID number of broker $b$, such as the public key. $S_b$ denotes the set of brokers authorizing broker $b$ to access their uploaded task indexes. It is initialized to empty and added to the authorization list $A$ in the smart contract. After that, brokers communicate with each other to determine the authorization relationship among them, and then the smart contract generates the authorization list $A$ according to their constructed relationship. Specifically, broker $b \in B$ first connects with each broker $b^* \in AC_b$ and obtains $F_{b^*}^2$ and $\tilde{F}_{b^*}$ offline. $AC_b$ denotes the set of brokers that are authorized to access the tasks posted by broker $b$. Broker $b$ computes $Aut_{b \to b^*}$, which is an encrypted value indicating the authorization relationship of broker $b$ to $b^*$, and then sends $\{\tilde{F}_{b^*}, Aut_{b \to b^*}\}$ to the smart contract. Finally, the smart contract adds the received $Aut_{b \to b^*}$ to $A[\tilde{F}_{b^*}]$.

Encrypted on-chain task index design: After initializing the system, brokers will analyze their local requirement-task indexes and build on-chain task indexes. The building procedures of encrypted on-chain task indexes are presented in Algorithm 2. For a given encrypted requirement $H(w)$, broker $b \in B$ first computes the on-chain search trapdoor $trap$ and divides $Tset(H(w))$ into $\alpha$ blocks. $Tset(H(w))$ is the set of tasks corresponding to $H(w)$ in the current batch of tasks. $p$ denotes the number of tasks in each block, which is a parameter chosen by brokers. Then, for each block, broker $b$ packages the task IDs in it into one through concatenation and generates the label via computing $G_1(trap \| s)$. $s$ denotes the state of $H(w)$, which increments by 1 each time. We use $s$ to ensure that peer nodes cannot distinguish whether multiple on-chain task indexes posted by the same broker correspond to the same requirement. By traversing all states, we can get all task indexes about $H(w)$ posted by broker $b$. After that, we compute $G_2(trap \| s)$ and use it as an overlay to protect the ciphertexts $C$. Finally, broker $b$ maintains the state table $S$ locally and sends index $I$ to the smart contract.

Task recommendation protocol: Based on the constructed on-chain task indexes and authorization list, Algorithm 3 presents the task recommendation protocol. To find the tasks matching his interest, a worker submits a search request to its broker $b$, containing the encrypted interest $H(w)$. When broker $b$ receives $H(w)$, it generates the search token $\{T_{b,w}, \tilde{F}_b\}$, and sends it to the blockchain in the form of smart contract. After receiving the search token, the smart contract first gets the authorization relationship set $A[\tilde{F}_b]$ of broker $b$. For each element $Aut$ in $A[\tilde{F}_b]$, which denotes the authorization relationship of broker $b^* \in S_b$ to broker $b$, the smart contract computes on-chain search trapdoor $ct$. Then the smart contract calculates all labels $G_1(ct \| c)$ and gets the encrypted matching task IDs $C$ in each block posted by broker $b^*$ via computing $I[G_1(ct \| c)] \oplus G_2(ct \| c)$. After traversing all elements in
$A[\tilde{F}_b]$, the smart contract will get the ID set of all tasks matching the searched interest. Eventually, broker $b$ and the worker can read the matching result on the blockchain and ask for tasks from respective brokers.

Post new on-chain task index: After receiving a certain batch of new tasks uploaded by task-requesters, brokers will generate new task indexes and submit them on the blockchain. Algorithm 4 shows the process of posting new tasks. For an encrypted task requirement $H(w)$, if broker $b$ has not published tasks about it on the blockchain before, broker $b$ will initialize its state $s$ to 0. Otherwise, broker $b$ gets the value of $s$ through $S[H(w)]$ which is stored in broker $b$ locally, as shown in line 4-8 of Algorithm 4. Then broker $b$ packages these task IDs into blocks and generates new on-chain task indexes. Finally, broker $b$ publishes these newly generated task indexes on the blockchain.

C. On-chain Task Recommendation Instantiation

We use an example to illustrate how PFcrowd works to support privacy-preserving and authorized task-worker matching across multiple platforms. As shown in Fig. 2, there are three brokers $b_1$, $b_2$, and $b_3 \in B$ in our federated crowdsourcing framework. Each broker is authorized to access the on-chain task indexes posted by others. The on-chain authorization list and task indexes constructed by brokers are shown in the orange rectangle part of the figure. To find the tasks matching his interest, a worker submits $H(w)$ to his broker $b_1$. After receiving the encrypted interest, $b_1$ calculates and sends search token $\{H(w) \cdot F_{b_1}^2, \tilde{F}_{b_1}\}$ to the smart contract. The smart contract gets the $\tilde{F}_{b_1}$ in search token and fetches the authorization relationship between broker $b_1$ and other brokers by using $A[\tilde{F}_{b_1}]$. Then the smart contract uses each element in $A[\tilde{F}_{b_1}]$ together with received $H(w) \cdot F_{b_1}^2$ to calculate on-chain search trapdoors which are shown in the purple rectangle part. We can see that each authorization relationship in $A[\tilde{F}_{b_1}]$ corresponds to one on-chain search trapdoor. For one on-chain search trapdoor, such as $g^{H(w) \cdot F_{b_2}^1}$, the smart contract calculates the first task index label by concatenating $g^{H(w) \cdot F_{b_2}^1}$ with a variable $c$ that starts at 0 and increments by 1 each time. Then the smart contract traverses the on-chain task indexes and obtains the corresponding ciphertext $C_1$. Similarly, the smart contract obtains the ciphertext $C_2$ by concatenating $g^{H(w) \cdot F_{b_2}^1}$ with $c$ which equals 1. When the value of $c$ is 2, we find that the task index label $G_1(g^{H(w) \cdot F_{b_2}^1} \| 2)$ is not in the existing on-chain task indexes. It means all the tasks uploaded by broker $b_2$ matching the searched interest have been found. Then the smart contract will choose another on-chain search trapdoor and repeat the above work until the IDs of all matching tasks are returned.

V. SECURITY ANALYSIS

In this section, we provide formal security analysis to show our proposed on-chain matching services are able to address the threats as mentioned in Section III-B. Firstly, we define the setup leakage $\mathcal{L}_{Step}$ for a given task set $Tset$ as:

$\mathcal{L}_{Step} = (|A|, \langle |L|, |P| \rangle_m)$,

where $|A|$ is the size of the encrypted authorization list, and $\langle |L|, |P| \rangle_m$ are ciphertext lengths of $m$ label-task pairs. When a broker $b$ sends a search request $w$, the view of an adversary is defined in the leakage $\mathcal{L}_{Match}$ as:

$\mathcal{L}_{Match} = (\{T_{b,w}, \tilde{F}_b\}, \langle L, P \rangle_q)$,

where $\{T_{b,w}, \tilde{F}_b\}$ are the query trapdoors and $\langle L, P \rangle_q$ are $q$ matched task set. During the update operations, the leakage function $\mathcal{L}_{Update}$ captured by an adversary is defined as:

$\mathcal{L}_{Update} = (add, \mu)$,

where $add$ represents insert operations and $\mu$ denotes the number of newly added entries. Following the simulation-based security definition in [15], we give the formal security definition as follows:

Definition 1. Let $\Omega = (Setup, Match, Update)$ be our scheme for secure task-matching services, and let $\mathcal{L}_{Step}$, $\mathcal{L}_{Match}$ and $\mathcal{L}_{Update}$ be the leakage functions. We define the following probabilistic experiments $Real_C(k)$ and $Ideal_{C,S}(k)$ with a probabilistic polynomial time (PPT) adversary $C$ and a PPT simulator $S$:

$Real_C(k)$: $C$ selects a task set $Tset$ and asks the broker to build or update the real on-chain indexes via Setup or Update algorithm with the private key $k$. Then $C$ adaptively conducts a polynomial number of queries via Match algorithm. Finally $C$ returns a bit as the output.

$Ideal_{C,S}(k)$: $C$ selects a task set $Tset$, and $S$ simulates the index for $C$ based on $\mathcal{L}_{Step}$. From $\mathcal{L}_{Update}$, $S$ performs the
update operations. Then $C$ adaptively performs a polynomial number of queries. From the leakage $\mathcal{L}_{Match}$ in each task matching request, $S$ simulates trapdoors and ciphertext, which are processed over the simulated index. Finally, $C$ returns a bit as the output.

$\Omega$ is a $(\mathcal{L}_{Step}, \mathcal{L}_{Match}, \mathcal{L}_{Update})$-secure scheme, if for all PPT adversaries $C$, there exists a simulator $S$ such that:

$\Pr[Real_C(k) = 1] - \Pr[Ideal_{C,S}(k) = 1] \le negl(k)$,

where $negl(k)$ is a negligible function in $k$.

Theorem 1. $\Omega$ is a $(\mathcal{L}_{Step}, \mathcal{L}_{Match}, \mathcal{L}_{Update})$-secure scheme under the random-oracle model if $\{G_1, G_2\}$ are secure PRFs.

Proof. The objective is to prove that the adversary $C$ cannot distinguish between the real index and the simulated one as defined in Definition 1. Given $\mathcal{L}_{Step}$, the simulator $S$ generates the simulated encrypted indexes, which are indistinguishable from the real ones. They contain $|A|$-bit random strings as the authorization list and $m$ index entries with size $|L|$ and $|P|$ bits. From $\mathcal{L}_{Match}$, $S$ can simulate the query and results. For the simulated index, $S$ randomly selects the same number $q$ of entries as the query on the real index. The token is selected as a random string $T'$, and the label can be simulated as $L' = G'_1(g^{F'/T'})$, where $F'$ is a random string from the simulated list $A$ and $G'_1$ is a random oracle. The result can be simulated as $P' = G'_2(L') \oplus \gamma$, where $G'_2$ is a random oracle and $\gamma$ is a random string. The bit length of the corresponding simulated indexes and the real one is the same, but $S$ picks random strings every time a previously unseen token is used. For the update operation, $S$ can add $\mu$ simulated entries from $\mathcal{L}_{Update}$ in the same way as in the build procedure. Due to the pseudo-randomness of the PRFs and the semantic security of symmetric encryption, $C$ should not be able to distinguish the outputs of the real experiment $Real_C(k)$ and the simulated one $Ideal_{C,S}(k)$.

VI. IMPLEMENTATION AND EVALUATION

A. Prototype Implementation

The goal of PFcrowd is to design a privacy-preserving and federated crowdsourcing framework. We implement the prototype in Python and use Solidity¹ to construct the smart contract of Ethereum, with about 2000 lines of code². We use TestRPC to test the smart contract of Ethereum and run PFcrowd on a laptop with 16 GB RAM, a 4-core 2.4 GHz Intel i5 CPU, and the macOS 10.15.1 operating system. The interaction between brokers and Ethereum is based on web3.py³. In this experiment, we select the task requirements database from Upwork, a real-world crowdsourcing platform. We generate $(r, t)$ pairs for each broker in PFcrowd, where $r$ and $t$ represent task requirements and task identifiers respectively. For cryptographic primitives, we implement the symmetric encryption and pseudo-random functions via functions AES and Web3.keccak in the cryptography library⁴ of Python. The number of tasks $p$ in each block for brokers is set to 2.

¹ Online at: https://solidity.readthedocs.io/en/develop/
² Online at: https://github.com/groupJia/IWQOS2020/
³ Online at: https://web3py.readthedocs.io/en/stable/
⁴ Online at: https://cryptography.io/en/latest/

TABLE II: Local Time Cost

Task Index Initialization    Posting New Tasks      Task Recommendation
Number    Time (ms)          Number    Time (ms)    Number    Time (ms)
1000      103.2              100       82.1         50        5.34
1400      112.9              200       84.2         100       6.08
1800      133.1              300       86.6         150       6.92
2200      143.8              400       89.8         200       7.63
2600      150.1              500       91.7         250       8.35
3000      166.2              600       94.1         300       9.14

TABLE III: Local Authorization Latency

Number of Brokers    4       6       8       10      12       14
Time (ms)            11.2    25.1    43.2    71.7    102.9    135.2

B. Performance Evaluation

Local performance evaluation: To evaluate the performance of PFcrowd, we first assess the local time cost of three types of operations: task index initialization, posting new tasks, and task recommendation. As shown in Table II, the time cost of task index initialization increases slowly with the growing number of $(r, t)$ pairs. Specifically, it only takes 166.2 ms to initialize task indexes when processing 3000 index entries. We also assess the incremental scalability by measuring the latency of posting new tasks. The time cost of posting new tasks includes both generating on-chain task indexes and updating the local state table of brokers. For the task recommendation operation, we test the relationship between the latency caused by searching one interest and the number of matching tasks. We can see that as the number of matching tasks increases, the latency of task recommendation grows steadily. Specifically, when the numbers of matching tasks and brokers are 300 and 6 respectively, the recommendation latency is only 9.14 ms, which is much less than the time cost of generating on-chain task indexes.

Moreover, we analyze the change in local authorization latency with the number of brokers, which is shown in Table III. We can observe that as the number of brokers grows, the authorization latency increases significantly. Specifically, when the number of brokers rises from 4 to 14, the authorization latency increases about 12 times. This is because the authorization operation is pairwise among brokers. When one broker joins PFcrowd, it requires authorization from other brokers and grants access to other brokers. Therefore, the authorization latency presents a non-linear increase as the number of brokers rises, which is consistent with the experimental result.

On-chain performance evaluation: To evaluate the efficiency and overhead of PFcrowd, we further assess its on-chain performance from the perspective of task indexes, authorization, task recommendation, and smart contract cost.
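The non-linear growth reported in Table III can be checked with a little arithmetic. The sketch below compares the measured growth from 4 to 14 brokers against two reference curves: linear growth in the number of brokers, and growth in the number of ordered broker pairs $n(n-1)$. The pair count is only one possible reading of the pairwise authorization argument and is an assumption of this example, not a cost model stated in the paper.

```python
# Values copied from Table III (local authorization latency).
brokers = [4, 6, 8, 10, 12, 14]
latency_ms = [11.2, 25.1, 43.2, 71.7, 102.9, 135.2]

def ordered_pairs(n: int) -> int:
    """Number of (authorizer, authorized) pairs among n brokers (assumed model)."""
    return n * (n - 1)

# Growth factors between the smallest and largest configuration.
observed_growth = latency_ms[-1] / latency_ms[0]
linear_growth = brokers[-1] / brokers[0]
pairwise_growth = ordered_pairs(brokers[-1]) / ordered_pairs(brokers[0])

print(f"observed: x{observed_growth:.2f}")
print(f"linear in n: x{linear_growth:.2f}")
print(f"ordered pairs n(n-1): x{pairwise_growth:.2f}")
```

Under this reading, the measured factor of about 12 lies well above the linear factor of 3.5 and below the pairwise factor of about 15, which is compatible with the qualitative claim of a non-linear increase.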
TABLE IV: Contract Cost Evaluation. The capital cost of smart contract deployment, posting tasks, authorization and task recommendation are $0.236, $0.024, $0.029 and $0.948 respectively, with an exchange rate of 1ether =

[Fig. 3: On-chain Performance Evaluation. Plots not recoverable from the extracted text. Surviving legend entries: "Task indexes initialization", "Authorization". Surviving panel captions: (c) The recommendation latency vs. the number of matching tasks; (d) The recommendation latency vs. the number of brokers.]

about 10s to generate the authorization list for 14 brokers in the system initialization phase. Moreover, when the number of brokers is 12 and 2 new brokers join PFcrowd, the time cost of updating the authorization list is only a bit more than 2s, which is acceptable in real practice.

We also evaluate the recommendation latency of PFcrowd within the blockchain network. Fig. 3(c) depicts the relationship among the transaction confirmation time of task recommendation, the number of matching tasks, and the number of task recommendation requests (searched interests). Note that the transaction confirmation time includes the time to obtain the task matching result and record the result on the blockchain. We can observe that the transaction confirmation time for one task recommendation request grows slowly with the increasing number of matching tasks. As the number of matching tasks increases from 50 to 300, the transaction confirmation time of searching one interest grows by only approximately one second. However, when the number of matching tasks is the same, searching for five interests takes much more time than searching for one interest. Specifically, when the number of matching tasks is 50, the recommendation confirmation time for five interests is approximately four times the recommendation confirmation time for one interest. We can conclude that for the task recommendation operation, the time cost of authorization relationship computation is longer than that of other operations.

To gain a deeper understanding of the recommendation performance, we further measure the time cost of task recommendation for a varying number of brokers under different numbers of matching tasks, as shown in Fig. 3(d). From the figure, we can see that when the number of matching tasks remains the same, the recommendation latency increases with the growing number of brokers. The reason is that the number of elements in each broker's authorization set increases with the number of brokers. When one broker sends a task recommendation request, the smart contract needs to traverse the authorization set of the broker and generate the corresponding on-chain search trapdoors. In this process, the more brokers there are, the more trapdoors will be generated and thus the more transaction confirmation time will be required. Furthermore, as shown in Fig. 3(d), when the number of brokers remains the same, the time cost of the task recommendation process rises as the number of matching tasks grows. This is because when the number of tasks matching the searched interest rises, more time is required for the smart contract to compute all labels matching the searched interest and record the task matching result on the blockchain.

The smart contract in PFcrowd is designed for three purposes. First, it generates the authorization list for brokers and records the list on the blockchain. Second, it records the task indexes posted by brokers on the blockchain. Third, when one broker submits a task recommendation request, it recommends matching tasks uploaded by brokers authorizing the broker to access and then records the matching results on the blockchain. In PFcrowd, each broker is represented by an address that is generated in the system setup stage. Only registered brokers can post task indexes and send task recommendation requests on the blockchain. Note that we use gas to evaluate the performance of our implemented smart contract on Ethereum. Table IV shows the gas usage and capital cost of contract deployment, task posting, authorization, and task recommendation operation. We can see that the capital cost of each operation is less than $1, which is practical to use.

VII. RELATED WORK

A. Crowdsourcing System

With the growing popularity of crowdsourcing applications, an increasing number of people are paying attention to the
privacy issues that arise in task recommendations. Since the data uploaded by task-requesters and workers often contains sensitive information, such as health indicators [16] and geographic locations [17], it should not be exposed to the public or to untrusted crowdsourcing platforms. To address this concern, several secure crowdsourcing frameworks [17]–[19] have been proposed. In [18], To et al. first studied location privacy for spatial crowdsourcing and proposed a privacy-aware crowdsourcing scheme based on differential privacy and geocasting. Under this scheme, the location privacy of workers is preserved. Later, Gong et al. [19] proposed a task recommendation framework for mobile crowdsourcing that does not violate the privacy of workers. This framework used differential privacy and cloaking techniques to protect worker statistics and location information, respectively. However, location cloaking is not a good choice for protecting privacy, as it is vulnerable to some attacks, such as background knowledge attacks. To enhance security, Shu et al. [7] proposed a privacy-preserving task recommendation scheme. The proposed scheme realized multi-keyword matching between multiple task-requesters and multiple workers by using proxy re-encryption and asymmetric scalar-product-preserving encryption [20] techniques. However, due to the adoption of proxy re-encryption, this scheme cannot protect the identity privacy of workers. Liu et al. [21] utilized the dual-server model and adopted the Paillier and ElGamal cryptosystems to protect the privacy of tasks and workers. Despite extensive research on secure crowdsourcing, these prior works rely on a centralized trusted authority for encryption key management [22] and cannot be directly applied to our federated scenario.

B. Searchable Encryption

Searchable encryption (SE) [23]–[25] is a cryptographic primitive that enables untrusted servers to directly search over encrypted data without server-side decryption. It is recognized as a fundamental component for building privacy-preserving applications [26]–[28]. In [23], Song et al. first introduced the concept of searchable symmetric encryption (SSE). Subsequently, many studies have explored this direction [29]–[33]. In [31], Curtmola et al. introduced the formal security definition of SSE. Later, Cash et al. [30] presented the first SSE scheme supporting conjunctive search and general boolean queries on encrypted data. However, early studies of SSE focused on the single-user setting [34]–[36], in which the data owner holding the secret key can search or update only their own data. Because of the limited application scope of single-user SE schemes, SE in the multi-user scenario [37]–[41] has also attracted attention recently. In this setting, data users are allowed to search over encrypted data outsourced by multiple data owners. In [38], Blaze et al. first introduced the notion of proxy re-encryption, which was later improved by Bao et al. [39] to achieve SE in a multi-user scenario. Nevertheless, as proxy re-encryption is a deterministic algorithm, the schemes based on it are susceptible to statistical attacks [40]. To solve this problem, several studies [41] transform different keys by introducing a trusted third party as the key manager. However, it is difficult to find an authority that can be trusted by various parties, and sometimes such an authority may not exist. Moreover, the above studies assumed that all users have access to the same set of documents, which precludes authorization-based access. To solve this problem, Popa et al. [42] introduced a multi-key searchable encryption scheme that supports searching over data encrypted under different keys. However, Grubbs et al. [43] pointed out that this design cannot support the revocation of sharing and that its security notion is insufficient in real settings. Later, Patel et al. [13] proposed RDH, which can be viewed as an important starting point in this design space. However, RDH cannot be directly employed in our federated framework, as its deterministic construction may leak the relation between queried keywords and task indexes submitted by multiple brokers.

C. Blockchain and Smart Contract

Due to the boom of cryptocurrencies, such as Bitcoin [44] and Ethereum [45], the blockchain as their underlying technology has attracted great attention. To make the blockchain more powerful, the smart contract was introduced to blockchain platforms such as Ethereum [45] and Hyperledger [46]. A smart contract is a pre-defined program that can be executed automatically on the blockchain without involving any central authority. Recently, some efforts [47]–[52] have been made to design verifiable and blockchain-based search schemes. In [47], Hu et al. constructed a decentralized, financially fair, and privacy-preserving search scheme by leveraging the smart contract. In this scheme, all encrypted indexes were stored in the smart contract, which greatly increases the storage cost of the blockchain. Later, Cai et al. [48] proposed a lightweight blockchain-based keyword search scheme, where the files together with the constructed indexes are stored on the server. In recent years, a few studies [10]–[12] have focused on the integration of blockchain and crowdsourcing. CrowdBC [10] developed a blockchain-based decentralized framework for crowdsourcing, which does not rely on any third party. This work mainly presented the design of the system architecture and the implementation of the smart contract. Lu et al. [11] proposed a similar crowdsourcing framework, which aims to ensure the privacy and anonymity of participants. In [12], Wu et al. proposed a blockchain-based task matching scheme for crowdsourcing, in which the task matching process is transparent and reliable. However, these three solutions still lock task-requesters and workers in an individual crowdsourcing system and lack a federated crowdsourcing platform that interconnects existing independent brokers, which is the focus of this paper.

VIII. CONCLUSION

In this paper, we design and implement PFcrowd, a privacy-preserving and federated crowdsourcing framework. We utilize blockchain technology to interconnect the existing crowdsourcing platforms with guaranteed security and fairness. Specifically, we first devise a secure task recommendation scheme by using our enhanced RDH scheme. We then
leverage the structure of SSE and propose a customized design of on-chain task indexes. We provide a thorough security analysis to show its security strengths. Extensive evaluations on real-world datasets demonstrate the practicality of PFcrowd.

ACKNOWLEDGMENT

This work was supported by the National Natural Science Foundation of China Project under Grant No. 61772154, No. 61732022, and No. 61672195, and the Research Grants Council of Hong Kong under Project No. CityU 11202419 and CityU C1008-16G.

REFERENCES

[1] J. Howe, “The rise of crowdsourcing,” IEEE Wired magazine, vol. 53, no. 10, pp. 1–14, 2006.
[2] A. Fasoulis, M. Virvou, G. A. Tsihrintzis, C. Patsakis, and E. Alepis, “Sensus vox: Sentiment mapping through smartphone multi-sensory crowdsourcing,” in Proc. of IEEE ICTAI, 2018.
[3] S. Murata, H. Yanagida, K. Katahira, S. Suzuki, T. Ogata, and Y. Yamashita, “Large-scale data collection for goal-directed drawing task with self-report psychiatric symptom questionnaires via crowdsourcing,” in Proc. of IEEE SMC, 2019.
[4] Upwork, Online at https://www.upwork.com, 2015.
[5] UBER, Online at https://www.uber.com, 2009.
[6] Amazon Mechanical Turk, Online at https://www.mturk.com, 2005.
[7] J. Shu, X. Jia, K. Yang, and H. Wang, “Privacy-preserving task recommendation services for crowdsourcing,” IEEE TSC, 2018.
[8] J. Shu, X. Liu, K. Yang, Y. Zhang, X. Jia, and R. H. Deng, “Sybsub: Privacy-preserving expressive task subscription with sybil detection in crowdsourcing,” IEEE IoT-J, vol. 6, no. 2, pp. 3003–3013, 2019.
[9] J. Shu, K. Yang, X. Jia, X. Liu, C. Wang, and R. Deng, “Proxy-free privacy-preserving task matching with efficient revocation in crowdsourcing,” IEEE TDSC, vol. PP, pp. 1–1, 2018.
[10] M. Li, J. Weng, A. Yang, W. Lu, Y. Zhang, L. Hou, J. Liu, Y. Xiang, and R. H. Deng, “Crowdbc: A blockchain-based decentralized framework for crowdsourcing,” IEEE TPDS, vol. 30, no. 6, pp. 1251–1266, 2019.
[11] Y. Lu, Q. Tang, and G. Wang, “Zebralancer: Private and anonymous crowdsourcing system atop open blockchain,” in Proc. of IEEE ICDCS, 2018.
[12] Y. Wu, S. Tang, B. Zhao, and Z. Peng, “BPTM: blockchain-based privacy-preserving task matching in crowdsourcing,” IEEE Access, vol. 7, pp. 45 605–45 617, 2019.
[13] S. Patel, G. Persiano, and K. Yeo, “Symmetric searchable encryption with sharing and unsharing,” in Proc. of ESORICS, 2018.
[14] D. Cash and S. Tessaro, “The locality of searchable symmetric encryption,” in Proc. of EUROCRYPT, 2014.
[15] D. Cash, J. Jaeger, S. Jarecki, C. S. Jutla, H. Krawczyk, M. Rosu, and M. Steiner, “Dynamic searchable encryption in very-large databases: Data structures and implementation,” in Proc. of NDSS, 2014.
[16] P. Créquit, G. Mansouri, M. Benchoufi, A. Vivot, and P. Ravaud, “Mapping of crowdsourcing in health: Systematic review,” J Med Internet Res, vol. 20, no. 5, p. e187, 2018.
[17] Y. Shen, L. Huang, L. Li, X. Lu, S. Wang, and W. Yang, “Towards preserving worker location privacy in spatial crowdsourcing,” in Proc. of IEEE GLOBECOM, 2015.
[18] H. To, G. Ghinita, and C. Shahabi, “A framework for protecting worker location privacy in spatial crowdsourcing,” PVLDB, vol. 7, no. 10, pp. 919–930, 2014.
[19] Y. Gong, L. Wei, Y. Guo, C. Zhang, and Y. Fang, “Optimal for mobile crowdsourcing with privacy control,” IEEE IoT-J, vol. 3, no. 5, pp. 745–756, 2016.
[20] W. K. Wong, D. W. Cheung, B. Kao, and N. Mamoulis, “Secure knn computation on encrypted databases,” in Proc. of ACM SIGMOD, 2009.
[21] A. Liu, W. Wang, S. Shang, Q. Li, and X. Zhang, “Efficient task assignment in spatial crowdsourcing with worker and task privacy protection,” GeoInformatica, vol. 22, no. 2, pp. 335–362, 2018.
[22] K. Yang, K. Zhang, J. Ren, and X. Shen, “Security and privacy in mobile crowdsourcing networks: challenges and opportunities,” IEEE Commun Mag, vol. 53, no. 8, pp. 75–81, 2015.
[23] D. X. Song, D. A. Wagner, and A. Perrig, “Practical techniques for searches on encrypted data,” in Proc. of IEEE S&P, 2000.
[24] J. Yao, Y. Zheng, Y. Guo, and C. Wang, “Sok: A systematic study of attacks in efficient encrypted cloud data search,” in Proc. of SBC, 2020.
[25] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: Improved definitions and efficient constructions,” in Proc. of CCS, 2006.
[26] Y. Guo, C. Wang, X. Yuan, and X. Jia, “Enabling privacy-preserving header matching for outsourced middleboxes,” in Proc. of IEEE/ACM IWQoS, 2018.
[27] Y. Guo, C. Wang, and X. Jia, “Enabling secure and dynamic deep packet inspection in outsourced middleboxes,” in Proc. of ACM AsiaCCS-SCC, 2019.
[28] Y. Guo, M. Wang, C. Wang, X. Yuan, and X. Jia, “Privacy-preserving packet header checking over in-the-cloud middleboxes,” IEEE IoT-J, 2020.
[29] R. Bost, B. Minaud, and O. Ohrimenko, “Forward and backward private searchable encryption from constrained cryptographic primitives,” in Proc. of ACM SIGSAC, 2017.
[30] D. Cash, S. Jarecki, C. S. Jutla, H. Krawczyk, M. Rosu, and M. Steiner, “Highly-scalable searchable symmetric encryption with support for boolean queries,” in Proc. of CRYPTO, 2013.
[31] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: Improved definitions and efficient constructions,” JCS, vol. 19, pp. 895–934, 2011.
[32] S. Kamara, C. Papamanthou, and T. Roeder, “Dynamic searchable symmetric encryption,” in Proc. of ACM CCS, 2012.
[33] R. Kui, Y. Guo, J. Li, X. Jia, C. Wang, Y. Zhou, S. Wang, N. Cao, and F. Li, “Hybridx: New hybrid index for volume-hiding range queries in data outsourcing services,” in Proc. of IEEE ICDCS, 2020.
[34] H. Cui, Z. Wan, R. H. Deng, G. Wang, and Y. Li, “Efficient and expressive keyword search over encrypted data in cloud,” IEEE TDSC, vol. 15, no. 3, pp. 409–422, 2018.
[35] Y. Guo, X. Yuan, X. Wang, C. Wang, B. Li, and X. Jia, “Enabling encrypted rich queries in distributed key-value stores,” IEEE TPDS, vol. 30, no. 6, pp. 1283–1297, 2019.
[36] X. Yuan, Y. Guo, X. Wang, C. Wang, B. Li, and X. Jia, “Enckv: An encrypted key-value store with rich queries,” in Proc. of ACM AsiaCCS, 2017.
[37] W. Zhang, Y. Lin, S. Xiao, J. Wu, and S. Zhou, “Privacy preserving ranked multi-keyword search for multiple data owners in cloud computing,” IEEE TC, vol. 65, no. 5, pp. 1566–1577, 2016.
[38] M. Blaze, G. Bleumer, and M. Strauss, “Divertible protocols and atomic proxy cryptography,” in Proc. of EUROCRYPT, 1998.
[39] F. Bao, R. H. Deng, X. Ding, and Y. Yang, “Private query on encrypted data in multi-user settings,” in Proc. of ISPEC, 2008.
[40] M. Naveed, S. Kamara, and C. V. Wright, “Inference attacks on property-preserving encrypted databases,” in Proc. of ACM CCS, 2015.
[41] Q. Wang, Y. Guo, H. Huang, and X. Jia, “Multi-user forward secure dynamic searchable symmetric encryption,” in Proc. of NSS, 2018.
[42] R. A. Popa and N. Zeldovich, “Multi-key searchable encryption,” IACR Cryptology ePrint Archive, vol. 2013, p. 508, 2013.
[43] P. Grubbs, R. McPherson, M. Naveed, T. Ristenpart, and V. Shmatikov, “Breaking web applications built on top of encrypted data,” in Proc. of ACM CCS, 2016.
[44] The Bitcoin Project, Online at https://bitcoin.org/en, 2009.
[45] The Ethereum Project, Online at https://ethereum.org, 2014.
[46] The Hyperledger Project, Online at https://hyperledger.org, 2015.
[47] S. Hu, C. Cai, Q. Wang, C. Wang, X. Luo, and K. Ren, “Searching an encrypted cloud meets blockchain: A decentralized, reliable and fair realization,” in Proc. of IEEE INFOCOM, 2018.
[48] C. Cai, J. Weng, X. Yuan, and C. Wang, “Enabling reliable keyword search in encrypted decentralized storage with fairness,” IEEE TDSC, 2018.
[49] C. Zhang, C. Xu, J. Xu, Y. Tang, and B. Choi, “Gem^2-tree: A gas-efficient structure for authenticated range queries in blockchain,” in Proc. of IEEE ICDE, 2019.
[50] C. Xu, C. Zhang, and J. Xu, “vchain: Enabling verifiable boolean range queries over blockchain databases,” in Proc. of ACM SIGMOD, 2019.
[51] J. Yao, Y. Zheng, Y. Guo, C. Cai, A. Zhou, C. Wang, and X. Gui, “A privacy-preserving system for targeted coupon service,” IEEE Access, vol. 7, pp. 120 817–120 830, 2019.
[52] Y. Guo, C. Zhang, and X. Jia, “Verifiable and forward-secure encrypted search using blockchain techniques,” in Proc. of IEEE ICC, 2020.