Article
A Practical Multiparty Private Set Intersection Protocol Based
on Bloom Filters for Unbalanced Scenarios
Ou Ruan 1,† , Changwang Yan 1,† , Jing Zhou 2, * and Chaohao Ai 1
Abstract: Multiparty Private Set Intersection (MPSI) is dedicated to finding the intersection of
datasets of multiple participants without disclosing any other information. Although many MPSI
protocols have been presented, there are still some important practical scenarios that require in-depth
consideration such as an unbalanced scenario, where the server’s dataset is much larger than the
clients’ datasets, and in cases where the number of participants is large. This paper proposes a
practical MPSI protocol for unbalanced scenarios. The protocol uses the Bloom filter, an efficient
data structure, and the ElGamal encryption algorithm to reduce the computation of clients and the
server; adopts randomization technology to solve the encryption problem of the 0s in the Bloom
filter; and introduces the idea of the Shamir threshold secret-sharing scheme to adapt to multiple
environments. A formal security proof and three detailed experiments are given. The results of the
experiments showed that the new protocol is very suitable for unbalanced scenarios with a large
number of participants, and it has a significant improvement in efficiency compared with the typical
related protocol (TIFS 2022).
Keywords: Multiparty Private Set Intersection; unbalanced scenario; ElGamal cryptography; Shamir
threshold key-sharing scheme; Bloom filter
Multiparty Private Set Intersection for an unbalanced scenario, where the clients’ address
books consist of small datasets, while the server’s database is very large.
However, most of the existing MPSI protocols are not applicable to such unbalanced
scenarios. For example, the protocol in [22] can only derive the approximate number of
intersections; the protocols in [23,24] require the participants’ datasets to be of equal size;
the protocol in [25] is only applicable to three participants and small datasets; the protocol
in [28] can be applied to such scenarios, but the computation for the clients and the server
is high.
This paper presents a practical MPSI protocol for unbalanced scenarios in which the
server with high computational power handles the majority of the computations, while the
clients only need to perform a small amount of computation. Moreover, the protocol has a
minimal impact on the execution time of clients as the number of participants increases.
algorithm cannot encrypt 0. To address this issue, we randomized the 0s in the Bloom filter,
i.e., we obtained the encrypted Bloom filter by following this process: if an item in the
Bloom filter is 1, it is encrypted directly; otherwise, if the item is 0, we select a random value
to represent it and, then, encrypt the random value. To ensure that participants can obtain
the intersection properly, we adopted the idea of the Shamir t-threshold secret-key-sharing
scheme, in which the private key is divided into n copies and at least t shares are needed
for decryption.
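The randomization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `randomize_bloom_filter` and the modulus argument `p` are hypothetical, and drawing the replacement values from 2..p−1 (rather than 1..p−1) is an assumption made here so that a randomized 0 never coincides with 1.

```python
import secrets

def randomize_bloom_filter(bf, p):
    """Keep every 1 as 1 and replace every 0 with a random value in [2, p - 1].

    ElGamal cannot encrypt the plaintext 0, so each 0 is represented by a
    random non-zero group element before encryption.
    """
    return [1 if bit == 1 else 2 + secrets.randbelow(p - 2) for bit in bf]

# Example: a 6-entry Bloom filter and a toy prime modulus (hypothetical values).
rbf = randomize_bloom_filter([1, 0, 0, 1, 0, 1], p=2039)
```

Each entry of the resulting list is non-zero and can therefore be encrypted entry by entry.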
The contributions of this paper are as follows:
(1) A secure MPSI protocol based on the Bloom filter and threshold ElGamal encryption
scheme for unbalanced scenarios is proposed. In this protocol, the runtime of the client
hardly varies with the number of participants, and it is linearly correlated with the data
size of the server.
(2) We present three comprehensive experiments. The results show that, in our
protocol, the number (t) of participants has a minimal influence on the computation and
runtime of the client. When $t \geq 2^4$, the client’s runtime is approximately $1/2^{\log(t)}$ of the
server’s runtime. Therefore, our protocol is highly suitable for unbalanced scenarios with
a large number of participants. Furthermore, our protocol is significantly more efficient
than the related protocol [28]: the server’s and clients’ runtimes
were approximately 1/6 of those in [28].
(3) We provide a formal security proof of our protocol against semi-honest adversaries.
2.1. Notations
We show the notations used in the paper in Table 1.
Notation Meaning
p p is a large prime number.
$\mathbb{Z}_p^*$ $\mathbb{Z}_p^*$ is the multiplicative group modulo p.
Enc( M) The result of encrypting the plaintext M.
Dec(C ) The result of decrypting the ciphertext C.
t The number of participants in the protocol is t.
Pi The i-th participant. P1 , . . . , Pt−1 are the clients, and Pt is the server.
Si Vector of datasets for participant Pi , i ∈ {1, 2, . . . , t}.
ni The dataset size of Si , i ∈ {1, 2, . . . , t}.
Si [ j ] The j-th element in the dataset vector Si of participant Pi , i ∈ {1, 2, . . . , t}, j ∈ {1, 2, . . . , ni }.
m The length of the Bloom filter.
k The number of hash functions used by the Bloom filter.
hu The u-th hash function, u ∈ {1, 2, . . . , k}.
hu ( x ) The hash value of x under hu , which lies in {1, 2, . . . , m}, where u ∈ {1, 2, . . . , k}.
BFi The Bloom filter obtained by mapping the dataset Si , i ∈ {1, 2, . . . , t}.
BFi [l ] The l-th bit of BFi , i ∈ {1, 2, . . . , t}, l ∈ {1, 2, . . . , m}.
RBFi The randomized Bloom filter obtained by randomizing BFi , i ∈ {1, 2, . . . , t}.
RBFi [l ] The l-th element of RBFi , i ∈ {1, 2, . . . , t}, l ∈ {1, 2, . . . , m}.
ERBFi The encrypted randomized Bloom filter obtained by encrypting RBFi , i ∈ {1, 2, . . . , t}.
ERBFi [l ] The l-th element of ERBFi , i ∈ {1, 2, . . . , t}, l ∈ {1, 2, . . . , m}.
rand() Generate a random number between 1 and p − 1.
Appl. Sci. 2023, 13, 13215 4 of 17
pk The public key.
sk The private key.
sk i The share of private key sk distributed to the i-th participant.
$c^u_{j,i}$ $c^u_{j,i} = ERBF_i[h_u(S_t[j])]$, where i ∈ {1, 2, . . . , t}, j ∈ {1, 2, . . . , nt }, u ∈ {1, 2, . . . , k}.
Comb($sh_{j,1}, \ldots, sh_{j,t}$) For j ∈ {1, 2, . . . , nt }, joint decryption on ($sh_{j,1}, \ldots, sh_{j,t}$).
$\Delta_i$ $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, where i ∈ {1, 2, . . . , t}.
$sh_{j,i}$ $sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i}$, where i ∈ {1, 2, . . . , t}, j ∈ {1, 2, . . . , nt }.
$sh_i$ $sh_i = \{sh_{1,i}, \ldots, sh_{j,i}, \ldots, sh_{n_t,i}\}$, where i ∈ {1, 2, . . . , t}.
× Homomorphic multiplication calculations. All computations between ciphertexts in the article are
homomorphic multiplicative computations.
2.2. Preliminaries
2.2.1. ElGamal Encryption Algorithm
In 1985, Taher ElGamal [36] proposed an asymmetric encryption algorithm based
on the Diffie–Hellman key exchange. The security of the system primarily relies on the
difficulty of solving the discrete logarithm problem in a finite field. The ElGamal encryption
algorithm can be divided into three main parts: key generation, encryption, and decryption.
Key generation: First, randomly select a large prime number p such that p − 1 has
large prime factors. Next, choose a primitive element g modulo p. Finally, choose d
(2 ≤ d ≤ p − 2) as the private key; then $y = g^d \bmod p$ is the public key.
Encryption: Suppose the plaintext is x. Calculate the ciphertext pair $C_1 = g^r \bmod p$
and $C_2 = x \cdot y^r \bmod p$, where r is a random number with 2 ≤ r ≤ p − 2.
Decryption: Compute the plaintext
\[ x = \frac{C_2}{C_1^{d}} = \frac{x \cdot y^{r}}{g^{r \cdot d}} = \frac{x \cdot g^{r \cdot d}}{g^{r \cdot d}} \bmod p. \]
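Multiplicative homomorphism: Assume that the plaintexts are $m_1$ and
$m_2$, that the ciphertext pair generated by encrypting $m_1$ is $(C_{11}, C_{21})$, and that the ciphertext pair
generated by encrypting $m_2$ is $(C_{12}, C_{22})$, where $C_{11} = g^{r_1} \bmod p$, $C_{21} = m_1 \cdot g^{d \cdot r_1} \bmod p$,
$C_{12} = g^{r_2} \bmod p$, and $C_{22} = m_2 \cdot g^{d \cdot r_2} \bmod p$; then:
\[ (C_{11} \cdot C_{12},\; C_{21} \cdot C_{22}) = \big(g^{r_1 + r_2},\; m_1 \cdot m_2 \cdot g^{d \cdot (r_1 + r_2)}\big) \bmod p, \]
which is a valid encryption of $m_1 \cdot m_2 \bmod p$.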
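The key generation, encryption, decryption, and multiplicative homomorphism described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the modulus `P` is the Mersenne prime 2^127 − 1, chosen only because it is a known prime, and the base `G = 3` is arbitrary and has not been checked to be a primitive element. The function names are hypothetical, and the parameters are far too small for real use.

```python
import secrets

# Toy parameters (assumptions for illustration only).
P = 2**127 - 1   # a known prime; real deployments need a much larger group
G = 3            # arbitrary base, NOT verified to be a primitive element

def keygen():
    """Return (d, y) with private key 2 <= d <= p - 2 and public key y = g^d mod p."""
    d = 2 + secrets.randbelow(P - 3)
    return d, pow(G, d, P)

def encrypt(y, x):
    """Encrypt a plaintext x in [1, p - 1] as the pair (C1, C2)."""
    r = 2 + secrets.randbelow(P - 3)
    return pow(G, r, P), (x * pow(y, r, P)) % P

def decrypt(d, c):
    """Recover x = C2 / C1^d mod p."""
    c1, c2 = c
    return (c2 * pow(pow(c1, d, P), -1, P)) % P

def hom_mul(ca, cb):
    """Component-wise product; the result decrypts to the product of the plaintexts."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

d, y = keygen()
m1, m2 = 12345, 67890
assert decrypt(d, hom_mul(encrypt(y, m1), encrypt(y, m2))) == (m1 * m2) % P
```

Note that decryption is algebraically correct for any non-zero base, which is why the unverified `G` does not affect the check at the end; it would matter for security.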
Bloom filter for dataset X: Each bit of the Bloom filter is initialized to 0. Each $x_i$ in the
dataset $X = \{x_1, \ldots, x_n\}$ is hashed k times as $(h_1(x_i), \ldots, h_k(x_i))$, and $BF[h_u(x_i)]$ is set to 1
for each u ∈ {1, 2, . . . , k}.
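The construction above can be sketched as follows. The hash family is an assumption: the paper does not fix one in this passage, so this sketch derives the k hash functions from SHA-256 with the index u as a salt. Positions are 0-indexed here, whereas the paper indexes them from 1 to m. All function names are hypothetical.

```python
import hashlib

def bloom_positions(item, k, m):
    """Return h_1(item), ..., h_k(item), each reduced to a position in {0, ..., m - 1}."""
    return [
        int.from_bytes(hashlib.sha256(f"{u}:{item}".encode()).digest(), "big") % m
        for u in range(1, k + 1)
    ]

def build_bloom_filter(dataset, k, m):
    """Initialize m bits to 0, then set the k hashed positions of every item to 1."""
    bf = [0] * m
    for x in dataset:
        for pos in bloom_positions(x, k, m):
            bf[pos] = 1
    return bf

def maybe_contains(bf, item, k):
    """True if all k positions are 1 (no false negatives, false positives possible)."""
    return all(bf[pos] == 1 for pos in bloom_positions(item, k, len(bf)))

bf = build_bloom_filter(["alice", "bob"], k=3, m=64)
assert maybe_contains(bf, "alice", 3) and maybe_contains(bf, "bob", 3)
```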
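where
\[ f_j(x) = \prod_{i=1,\, i \neq j}^{n} \frac{x - x_i}{x_j - x_i} \tag{6} \]
then
\[ f(x) = \sum_{i=1}^{n} \Big( y_i \cdot \prod_{j=1,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j} \Big) \tag{7} \]
\[ f(0) = \sum_{i=1}^{t} \Big( f(i) \cdot \prod_{j=1,\, j \neq i}^{t} \frac{j}{j - i} \Big) \bmod p \tag{9} \]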
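Equation (9) is Lagrange interpolation evaluated at x = 0, and it can be checked with a short sketch of Shamir sharing. The assumptions here are the toy prime modulus `Q` (the Mersenne prime 2^127 − 1), the hypothetical function names, and a slight generalization: the product runs over whichever t share indices are supplied rather than only over 1, ..., t.

```python
import secrets

Q = 2**127 - 1  # toy prime field modulus (assumption for illustration)

def make_shares(secret, t, n):
    """Split `secret` into n shares f(1), ..., f(n) of a random degree-(t-1) polynomial."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]
    def f(x):
        return sum(a * pow(x, i, Q) for i, a in enumerate(coeffs)) % Q
    return {j: f(j) for j in range(1, n + 1)}

def recover(shares):
    """Recover f(0) = sum_i f(i) * prod_{j != i} j / (j - i) mod Q from {index: share}."""
    secret = 0
    for i, f_i in shares.items():
        delta = 1
        for j in shares:
            if j != i:
                delta = delta * j * pow((j - i) % Q, -1, Q) % Q
        secret = (secret + f_i * delta) % Q
    return secret

shares = make_shares(123456789, t=3, n=5)
subset = {i: shares[i] for i in (1, 3, 5)}   # any 3 of the 5 shares suffice
assert recover(subset) == 123456789
```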
the ElGamal encryption algorithm into n copies using the Shamir threshold secret-sharing
scheme. Decryption can then be achieved when at least t copies are gathered together.
For a (t, n) threshold ElGamal encryption algorithm, where n is the total number of
participants and t is the minimum number of participants required for decryption, the basic
scheme is described as follows:
Key generation: Randomly select a large prime p, and then choose a primitive element
g modulo p. Randomly pick sk = d as the private key for the ElGamal algorithm such
that 2 ≤ d ≤ p − 2. Compute $y = g^d \bmod p$ and set pk = y as the public key. The private key is
shared as follows: let $a_0 = d$, and generate the polynomial $f(x) = \sum_{i=0}^{t-1} a_i x^i \bmod p$; the key
share of the j-th participant is $sk_j = f(j) \bmod p$, where j ∈ {1, . . . , n}.
Encryption: Assume that the message to be encrypted is M. Choose a random $r \in \mathbb{Z}_p^*$,
and compute the ciphertexts $c_1 = g^r$ and $c_2 = M y^r = M g^{dr}$.
Share decryption: Among the t participants involved in decryption, the i-th participant
calculates $c_{1,i} = (c_1)^{\Delta_i \cdot sk_i}$, where $\Delta_i = \prod_{j=1,\, j \neq i}^{t} \frac{j}{j-i}$.
Combination calculation (Comb): The results of decryption are obtained by performing
ElGamal homomorphic multiplication operations on the shared decryption results of
the t participants:
\[ \prod_{i=1}^{t} c_{1,i} = (c_1)^{\sum_{i=1}^{t} \Delta_i \cdot f(i)} \tag{10} \]
From the Shamir threshold secret-sharing scheme (Section 2.2.4), it is known that
the key sk = d is restored as follows:
\[ sk = f(0) = d = \sum_{i=1}^{t} \Big( f(i) \cdot \prod_{j=1,\, j \neq i}^{t} \frac{j}{j-i} \Big) = \sum_{i=1}^{t} \Delta_i \cdot f(i) \tag{11} \]
where $\Delta_i = \prod_{j=1,\, j \neq i}^{t} \frac{j}{j-i}$.
Thus, the decryption result can be obtained as follows:
\[ M = \frac{c_2}{c_1^{d}} = \frac{c_2}{(c_1)^{\sum_{i=1}^{t} \Delta_i \cdot f(i)}} = \frac{c_2}{\prod_{i=1}^{t} c_{1,i}} \tag{12} \]
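The share decryption and combination steps of Equations (10) to (12) can be sketched as below. Two points are assumptions of this sketch rather than statements from the paper. First, the group is a toy Schnorr-style group with P = 2Q + 1 (P = 2039, Q = 1019, both prime) and G = 4 generating the subgroup of prime order Q. Second, shares and Lagrange coefficients are reduced modulo Q, the order of G, rather than modulo p, so that every denominator j − i is invertible and interpolation in the exponent is well defined. All function names are hypothetical.

```python
import secrets

# Toy group (assumption, insecure sizes): P = 2*Q + 1, G has prime order Q.
P, Q, G = 2039, 1019, 4

def keygen(t, n):
    """Return (pk, shares): pk = g^d mod P and Shamir shares sk_j = f(j) mod Q of d."""
    d = 1 + secrets.randbelow(Q - 1)
    coeffs = [d] + [secrets.randbelow(Q) for _ in range(t - 1)]
    shares = {j: sum(a * pow(j, i, Q) for i, a in enumerate(coeffs)) % Q
              for j in range(1, n + 1)}
    return pow(G, d, P), shares

def encrypt(pk, message):
    """Return (c1, c2) = (g^r, M * pk^r) for a message in [1, P - 1]."""
    r = 1 + secrets.randbelow(Q - 1)
    return pow(G, r, P), message * pow(pk, r, P) % P

def lagrange_at_zero(i, indices):
    """Delta_i = prod_{j != i} j / (j - i) mod Q over the participating indices."""
    delta = 1
    for j in indices:
        if j != i:
            delta = delta * j * pow((j - i) % Q, -1, Q) % Q
    return delta

def share_decrypt(c1, i, sk_i, indices):
    """Participant i's share c_{1,i} = c1^(Delta_i * sk_i)."""
    return pow(c1, lagrange_at_zero(i, indices) * sk_i % Q, P)

def combine(c2, partials):
    """Comb: the product of the shares equals c1^d, so M = c2 / prod(c_{1,i})."""
    c1_d = 1
    for s in partials:
        c1_d = c1_d * s % P
    return c2 * pow(c1_d, -1, P) % P

pk, shares = keygen(t=3, n=5)
c1, c2 = encrypt(pk, 1234)
idx = [1, 3, 5]
assert combine(c2, [share_decrypt(c1, i, shares[i], idx) for i in idx]) == 1234
```

No single participant ever reconstructs d; only the product of the t partial results is needed.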
( P1 , . . . , Pt−1 ). Let $S = (S_1, \ldots, S_t)$ represent the set of inputs from the clients and the server
and $f(S) = (\perp, \cap)$ represent a function in which the server $P_t$ obtains the intersection and
the clients obtain nothing. Assume a protocol $\Pi_t$ with t participants to compute the function
f(S). During the execution of the protocol $\Pi_t$, the server’s view is $View^{\Pi_t}_{P_t} = (S_t, r_t, \ell_t, out_t)$,
and the clients’ views are $View^{\Pi_t}_{P_i} = (S_i, r_i, \ell_i, out_i)$, in which i ∈ {1, . . . , t − 1}, $r_i$ and $r_t$
represent the random numbers generated by each of the clients $P_i$ and server $P_t$, $\ell_i$ and $\ell_t$
are the messages received by each of the clients $P_i$ and server $P_t$, respectively, and $out_i$ and
$out_t$ represent the outputs of each of the clients $P_i$ and server $P_t$.
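If $\Pi_t$ satisfies the following condition: there exist polynomial-time simulators
$Sim_t(S_t, \cap)$ and $Sim_i(S_i, \perp)$ such that $Sim_t(S_t, \cap)$ and $View^{\Pi_t}_{P_t}$, and $Sim_i(S_i, \perp)$ and $View^{\Pi_t}_{P_i}$,
are computationally indistinguishable, i.e.,
\[ Sim_t(S_t, \cap) \simeq View^{\Pi_t}_{P_t}, \qquad Sim_i(S_i, \perp) \simeq View^{\Pi_t}_{P_i} \tag{14} \]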
From Equation (14), it is clear that such a secure multiparty computation protocol ∏t
is secure in the presence of a semi-honest adversary.
Online stage: The server Pt receives the encrypted randomized Bloom filters
{ ERBF1 , ERBF2 , . . . , ERBFt−1 } and starts the intersection calculation.
(1) i. For each element $S_t[j]$ of the server $P_t$, the hash value $h_u(S_t[j])$ is computed,
where j ∈ {1, . . . , nt }, u ∈ {1, . . . , k}. Eventually, $n_t$ sets of data are obtained, each with k
values, as follows:
\[
\begin{matrix}
\{h_1(S_t[1]), \cdots, h_u(S_t[1]), \cdots, h_k(S_t[1])\} \\
\vdots \\
\{h_1(S_t[j]), \cdots, h_u(S_t[j]), \cdots, h_k(S_t[j])\} \\
\vdots \\
\{h_1(S_t[n_t]), \cdots, h_u(S_t[n_t]), \cdots, h_k(S_t[n_t])\}
\end{matrix} \tag{15}
\]
ii. Substitute $h_u(S_t[j])$ into each client’s encrypted randomized Bloom filter $ERBF_i$
to obtain $c^u_{j,i}$, where i ∈ {1, . . . , t − 1}, j ∈ {1, . . . , nt }, u ∈ {1, . . . , k } and
$c^u_{j,i} = ERBF_i[h_u(S_t[j])]$, as follows:
\[
\begin{matrix}
\{\{c^1_{1,1}, \cdots, c^k_{1,1}\} \cdots \{c^1_{1,i}, \cdots, c^k_{1,i}\} \cdots \{c^1_{1,t-1}, \cdots, c^k_{1,t-1}\}\} \\
\vdots \\
\{\{c^1_{j,1}, \cdots, c^k_{j,1}\} \cdots \{c^1_{j,i}, \cdots, c^k_{j,i}\} \cdots \{c^1_{j,t-1}, \cdots, c^k_{j,t-1}\}\} \\
\vdots \\
\{\{c^1_{n_t,1}, \cdots, c^k_{n_t,1}\} \cdots \{c^1_{n_t,i}, \cdots, c^k_{n_t,i}\} \cdots \{c^1_{n_t,t-1}, \cdots, c^k_{n_t,t-1}\}\}
\end{matrix} \tag{16}
\]
iii. For each set of data in Equation (16), a homomorphic multiplication is performed
to compute $c_{j,i} = c^1_{j,i} \times \cdots \times c^k_{j,i}$, where i ∈ {1, . . . , t − 1}, j ∈ {1, . . . , nt }. The resulting
data are as follows:
\[
\begin{matrix}
\{c_{1,1}, \cdots, c_{1,i}, \cdots, c_{1,t-1}\} \\
\vdots \\
\{c_{j,1}, \cdots, c_{j,i}, \cdots, c_{j,t-1}\} \\
\vdots \\
\{c_{n_t,1}, \cdots, c_{n_t,i}, \cdots, c_{n_t,t-1}\}
\end{matrix} \tag{17}
\]
iv. For each set of data in Equation (17), homomorphic multiplication is performed to
compute $c_j = c_{j,1} \times \cdots \times c_{j,t-1} \times w_j$, where j ∈ {1, . . . , nt }.
(2) $P_i$ performs the computation of the shared decryption after obtaining the data $c_j$,
which is computed as follows:
\[ sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i} \tag{18} \]
where $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, i ∈ {1, . . . , t}, j ∈ {1, 2, . . . , nt }.
(3) $P_t$ performs the combination computation, which is denoted as
$Comb(sh_{j,1}, \ldots, sh_{j,t})$. Then, $Dec(c_j)$ can be obtained by $Comb(sh_{j,1}, \ldots, sh_{j,t})$.
If $Dec(c_j) = w_j = S_t[j] \cdot r_j$, the server adds the corresponding $S_t[j]$ to the
intersection: $S = \{S_t[j]\} \cup S$. The range of j above is {1, 2, · · · , nt }.
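The online stage can be sketched end to end under several simplifying assumptions, all of which are choices made for this illustration rather than details taken from the paper. Threshold decryption (Equation (18) and Comb) is replaced by ordinary single-key ElGamal decryption to keep the example short. The blinding term is taken to be $w_j = Enc(r_j) \times Enc(S_t[j])$, following the form used in the security proof. The modulus is the toy prime 2^127 − 1 with an unverified base 3, the hash functions are salted SHA-256, set elements are small positive integers, and all function names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption)
G = 3            # arbitrary base, not verified to be a primitive element (assumption)

def positions(item, k, m):
    """k Bloom filter positions of item, 0-indexed."""
    return [int.from_bytes(hashlib.sha256(f"{u}:{item}".encode()).digest(), "big") % m
            for u in range(1, k + 1)]

def encrypt(pk, x):
    r = 2 + secrets.randbelow(P - 3)
    return pow(G, r, P), x * pow(pk, r, P) % P

def decrypt(sk, c):
    return c[1] * pow(pow(c[0], sk, P), -1, P) % P

def hom_mul(a, b):
    return a[0] * b[0] % P, a[1] * b[1] % P

def client_erbf(dataset, pk, k, m):
    """One client's offline work: Bloom filter, randomized zeros, entry-wise encryption."""
    bf = [0] * m
    for x in dataset:
        for pos in positions(x, k, m):
            bf[pos] = 1
    rbf = [1 if bit else 2 + secrets.randbelow(P - 2) for bit in bf]
    return [encrypt(pk, v) for v in rbf]

def server_online(server_set, erbfs, pk, k, m):
    """Steps i to iv: look up c^u_{j,i}, multiply them all, and multiply by w_j."""
    out = []
    for x in server_set:
        r_j = 2 + secrets.randbelow(P - 2)
        c_j = hom_mul(encrypt(pk, r_j), encrypt(pk, x))   # w_j
        for erbf in erbfs:
            for pos in positions(x, k, m):
                c_j = hom_mul(c_j, erbf[pos])
        out.append((x, r_j, c_j))
    return out

def intersect(server_set, client_sets, k=3, m=1024):
    sk = 2 + secrets.randbelow(P - 3)
    pk = pow(G, sk, P)
    erbfs = [client_erbf(s, pk, k, m) for s in client_sets]
    result = set()
    for x, r_j, c_j in server_online(server_set, erbfs, pk, k, m):
        # Single-key decryption stands in for Comb(sh_{j,1}, ..., sh_{j,t}).
        if decrypt(sk, c_j) == r_j * x % P:
            result.add(x)
    return result
```

For example, `intersect([10, 20, 30, 40], [[20, 30, 99], [30, 20, 7]])` is expected to return `{20, 30}`, up to the small false-positive probability inherent in Bloom filters.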
∀i ∈ {1, . . . , t − 1}, BFi (h1 (St [ j])) × · · · × BFi (hk (St [ j])) = 1 (19)
In this case, by the nature of Bloom filters, we know that St [ j] is in the sets of all
{S1 , · · · , St−1 }, i.e., St [ j] is an intersection element of the participants {P1 , · · · , Pt }.
If $Dec(c_j) \neq r_j \times S_t[j] \bmod p$, then
\[ \exists i \in \{1, \ldots, t-1\},\; RBF_i(h_1(S_t[j])) \times \cdots \times RBF_i(h_k(S_t[j])) \neq 1 \tag{20} \]
In this case, there exists i ∈ {1, ..., t − 1} such that the values in the randomized Bloom
filter $RBF_i$ that $S_t[j]$ maps to are not all 1, i.e., $S_t[j]$ is not an intersection element of the
participants {P1 , · · · , Pt }.
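Both cases can be read off from a single expression. As a sketch, assuming $w_j = Enc(r_j) \times Enc(S_t[j])$ (the form used for $w'_j$ in the security proof) and applying the multiplicative homomorphism of ElGamal to the product that defines $c_j$:

```latex
\begin{aligned}
Dec(c_j) &= Dec\big(c_{j,1} \times \cdots \times c_{j,t-1} \times w_j\big) \\
         &= r_j \cdot S_t[j] \cdot \prod_{i=1}^{t-1} \prod_{u=1}^{k} RBF_i[h_u(S_t[j])] \bmod p
\end{aligned}
```

If every looked-up entry equals 1, the double product is 1 and $Dec(c_j) = r_j \cdot S_t[j] \bmod p$. If at least one entry is a randomized 0, the product is a random-looking value, so equality can hold only by chance, which is negligible for a large p if the randomized entries are uniformly distributed.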
In summary, we can determine whether an element belongs to the intersection by
using the nature of the Bloom filter. Therefore, the protocol for secure computation of the
Theorem 1. If the threshold ElGamal encryption scheme in Section 2.2.5 is secure, then the
unbalanced MPSI protocol ∏ in this paper is secure in the case of semi-honest adversaries.
Proof. To analyze the security of our protocol, we assume that ℓ participants are
controlled by an adversary, where ℓ < t. The protocol is executed by t
participants, and we only consider the case where ℓ < t; if all participants are controlled
by the adversary, the analysis is meaningless. We divide the analysis into two cases:
in the first, the adversary controls ℓ clients, and the server $P_t$ is not among the ℓ participants
controlled by the adversary; in the second, the server $P_t$ is among the ℓ participants
controlled by the adversary, i.e., the adversary controls ℓ − 1 clients and the server $P_t$.
Scenario 1: Server $P_t$ is not among the ℓ participants controlled by the adversary.
Suppose that ℓ participants $P_1, \ldots, P_\ell$ are controlled by the adversary, and the
adversary can access the inputs of these ℓ participants, as well as the intermediate values generated
by the computation. From the Shamir threshold secret-sharing scheme, it is clear that,
when ℓ < t, the private key corresponding to the public key cannot be inferred from the
information of the ℓ participants in our protocol. We denote the view of the participants $P_1, \ldots, P_\ell$
in the real protocol as $View^{\Pi}_{RN}$, shown in Equation (21), where $\{S_1, \ldots, S_\ell\}$ denote the
original datasets of the participants $\{P_1, \ldots, P_\ell\}$, $\{ERBF_1, \ldots, ERBF_\ell\}$ are the encrypted
randomized Bloom filters, and $\{sh_1, \ldots, sh_\ell\}$ denote the intermediate results of
the computation.
\[ View^{\Pi}_{RN} = (\{S_1, \ldots, S_\ell\}, \{ERBF_1, \ldots, ERBF_\ell\}, \{sh_1, \ldots, sh_\ell\}) \tag{21} \]
\[ w'_j = Enc(r'_j) \times Enc(S_t[j]') \bmod p \tag{22} \]
(5) For each $S_t[j]'$, j ∈ {1, 2, . . . , nt }, each $ERBF_i$ (i ∈ {1, ..., ℓ}), and each $ERBF'_i$
(i ∈ {ℓ + 1, ..., t − 1}), compute $\{(c^1_{j,i})', \ldots, (c^k_{j,i})'\}$ and $c'_{j,i}$:
\[ c'_{j,i} = (c^1_{j,i})' \times \cdots \times (c^k_{j,i})' \bmod p \tag{23} \]
(6) For j ∈ {1, 2, . . . , nt }, obtain $c'_j$ by a homomorphic multiplication operation on all
the mapped values in the encrypted Bloom filters corresponding to $S_t[j]'$.
\[ c'_j = c'_{j,1} \times \cdots \times c'_{j,t-1} \times w'_j \bmod p \tag{24} \]
(7) Compute $sh'_i = \{sh'_{1,i}, \ldots, sh'_{n_t,i}\}$ by the shared decryption of $c'_j$ with the private key
share $sk_i$, which is expressed as Equation (25), where $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, i ∈ {1, . . . , ℓ},
j ∈ {1, 2, . . . , nt }.
\[ sh'_{j,i} = (c'_j)^{\Delta_i \cdot sk_i} \bmod p \tag{25} \]
(8) Insert $\{sh'_1, \ldots, sh'_\ell\}$ into the view. Thus, the view of simulator $Sim_N$ is:
\[ Sim_N(\{S_1, \ldots, S_\ell\}, \perp) = (\{S_1, \ldots, S_\ell\}, \{ERBF_1, \ldots, ERBF_\ell\}, \{sh'_1, \ldots, sh'_\ell\}) \tag{26} \]
Because the simulator $Sim_N$ controls only ℓ participants other than the server, it can
only obtain the key shares of these ℓ participants. The joint computation in
the threshold ElGamal encryption requires the decryption shares of at least t participants,
so the simulator $Sim_N$ cannot perform the joint computation in the threshold ElGamal to
obtain the final decryption result.
$View^{\Pi}_{RN} = (\{S_1, \ldots, S_\ell\}, \{ERBF_1, \ldots, ERBF_\ell\}, \{sh_1, \ldots, sh_\ell\})$ represents the view
of the real protocol with participants $P_1, \ldots, P_\ell$, and $Sim_N(\{S_1, \ldots, S_\ell\}, \perp) =
(\{S_1, \ldots, S_\ell\}, \{ERBF_1, \ldots, ERBF_\ell\}, \{sh'_1, \ldots, sh'_\ell\})$ represents the view of the simulator
$Sim_N$, where $\{S_1, \ldots, S_\ell\}$ and $\{ERBF_1, \ldots, ERBF_\ell\}$ are the same. Since the threshold ElGamal
encryption algorithm and random values are used, $\{sh_1, \ldots, sh_\ell\}$ and $\{sh'_1, \ldots, sh'_\ell\}$
are computationally indistinguishable. Thus,
\[ View^{\Pi}_{RN} \simeq Sim_N(\{S_1, \ldots, S_\ell\}, \perp) \tag{27} \]
\[ View^{\Pi}_{RY} = (\{S_1, \ldots, S_{\ell-1}, S_t\}, \{ERBF_1, \ldots, ERBF_{\ell-1}\}, \{sh_1, \ldots, sh_{\ell-1}, sh_t\}, \cap) \tag{28} \]
(4) Generate random values $r'_j$, and calculate $w'_j$, where j ∈ {1, 2, . . . , nt }.
\[ w'_j = Enc(r'_j) \times Enc(S_t[j]) \bmod p \tag{29} \]
(5) For each $S_t[j]$ (j ∈ {1, 2, . . . , nt }), each $ERBF_i$ (i ∈ {1, ..., ℓ − 1}), and each $ERBF'_i$
(i ∈ {ℓ, ..., t − 1}), compute $\{(c^1_{j,i})', \ldots, (c^k_{j,i})'\}$ and $c'_{j,i}$:
\[ c'_{j,i} = (c^1_{j,i})' \times \cdots \times (c^k_{j,i})' \bmod p \tag{30} \]
(6) For j ∈ {1, 2, . . . , nt }, compute $c'_j$.
\[ c'_j = c'_{j,1} \times \cdots \times c'_{j,t-1} \times w'_j \bmod p \tag{31} \]
(7) Compute $sh'_i = \{sh'_{1,i}, \ldots, sh'_{n_t,i}\}$ by the shared decryption of $c'_j$ with the private key
share $sk_i$, which is expressed as Equation (32), where $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, i ∈ {1, . . . , ℓ − 1, t},
j ∈ {1, 2, . . . , nt }.
\[ sh'_{j,i} = (c'_j)^{\Delta_i \cdot sk_i} \bmod p \tag{32} \]
(8) Insert $\{sh'_1, \ldots, sh'_{\ell-1}, sh'_t\}$ into the view. Thus, the view of the simulator $Sim_Y$ is:
\[ Sim_Y(\{S_1, \ldots, S_{\ell-1}, S_t\}, \cap) = (\{S_1, \ldots, S_{\ell-1}, S_t\}, \{ERBF_1, \ldots, ERBF_{\ell-1}\}, \{sh'_1, \ldots, sh'_{\ell-1}, sh'_t\}, \cap) \tag{33} \]
Because the simulator controls ℓ participants, it can only obtain the key shares of these
ℓ participants, but the joint computation of the data in the threshold ElGamal encryption
requires at least t participants’ shares, so the simulator cannot perform the joint computation
of the threshold ElGamal encryption and cannot obtain the final decryption results. In this
case, the simulator can obtain the final intersection because it controls server $P_t$.
In the view $View^{\Pi}_{RY}$, as well as in the simulator’s view $Sim_Y(\{S_1, \ldots, S_{\ell-1}, S_t\}, \cap)$,
$\{S_1, \ldots, S_{\ell-1}, S_t\}$ and the encrypted Bloom filters $\{ERBF_1, \ldots, ERBF_{\ell-1}\}$ are identical.
$\{sh_1, \ldots, sh_{\ell-1}, sh_t\}$ and $\{sh'_1, \ldots, sh'_{\ell-1}, sh'_t\}$ are computationally indistinguishable due to the
security of the threshold ElGamal encryption algorithm. Thus,
\[ View^{\Pi}_{RY} \simeq Sim_Y(\{S_1, \ldots, S_{\ell-1}, S_t\}, \cap) \tag{34} \]
From the above two scenarios, it can be seen that, in the face of a semi-honest adversary
who controls ℓ participants when ℓ < t, the adversary cannot infer the inputs of the honest
participants through the intermediate process and the final result, so the proposed protocol
in this paper is secure in the face of a semi-honest adversary.
4.2.1. Experiment 1
The main difference between our protocol and [28] is the encryption algorithm.
The Paillier encryption algorithm was used in [28], while we adopted the ElGamal en-
cryption algorithm. Firstly, with the same security parameters, the ElGamal encryption
algorithm produces two ciphertexts, while the Paillier encryption algorithm produces only
one ciphertext. However, the ciphertexts produced by the Paillier encryption algorithm
are twice as long as those produced by the ElGamal encryption algorithm. Therefore, both
algorithms have the same communication complexity.
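As a worked instance of this size comparison, assume both schemes use a modulus of $k$ bits (a prime $p$ for ElGamal, a composite $n$ for Paillier). It is a standard property of Paillier that ciphertexts are elements modulo $n^2$, which gives:

```latex
\begin{aligned}
|C_{\text{ElGamal}}|  &= |C_1| + |C_2| = k + k = 2k \text{ bits}, \\
|C_{\text{Paillier}}| &= |n^2| = 2k \text{ bits}.
\end{aligned}
```

For the parameter k = 1024 used in the experiments, both come to 2048 bits per encrypted value.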
Secondly, we conducted experiments with these two algorithms using the NTL [41]
library. We set the value of the encrypted data to $10^7$ and chose a security parameter of
k = 1024 bits for public key encryption. The size of the dataset was increased from $10^2$ to
$10^5$, and the encryption process used the same random value. The experimental results are
presented in Table 2 and show that the ElGamal encryption algorithm is more
efficient than the Paillier encryption algorithm: it encrypts approximately twice
as fast and decrypts about four times as fast.
Table 2. Comparison of Paillier encryption algorithm and ElGamal encryption algorithm (in seconds).
4.2.2. Experiment 2
We set the amount of data for the client and server to $2^8$ and increased the number
of participants from $2^4$ to $2^9$ for these experiments, which aimed at testing the runtime
of individual clients and the server. The experimental results are shown in Table 3 and
Figure 2. From the results, it is evident that the runtime of the clients remained roughly constant
in both our protocol and the protocol of [28], regardless of the increase in the number (t) of
participants. Moreover, in our protocol, the client’s runtime was approximately $1/2^{\log(t)}$
of the server’s runtime when $t \geq 2^4$. Our protocol ran more efficiently than the protocol of [28].
This is mainly because the clients in [28] performed the ShDec0() operation, which involves
exponential operations on a large integer nt with large random values. The client $P_i$ in
our protocol needs to compute $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$ only once and $sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i}$ $n_t$ times.
The only computation that depends on the number of participants is the single evaluation of $\Delta_i$, which is
insignificant in the whole runtime of the client.
Table 3. Comparison for different MPSI protocols with different numbers of participants (in seconds).
Protocols        Role    No. of Participants
                         $2^4$   $2^5$   $2^6$   $2^7$   $2^8$   $2^9$
Bay et al. [28]  Client  2.234   2.445   2.176   2.441   2.279   2.405
Bay et al. [28]  Server  3.881   5.913   10.202  15.242  25.371  41.615
Ours             Client  0.356   0.422   0.387   0.401   0.390   0.381
Ours             Server  0.389   0.785   1.504   3.350   6.845   12.807
The server and client dataset size was fixed at $2^8$. The number of participants gradually increased from $2^4$ to $2^9$.
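For readers who want to derive ratios from Table 3 themselves, the following sketch computes, for each number of participants, the runtime of Bay et al. [28] divided by ours and the client-to-server runtime ratio in our protocol. The numbers are copied from the table; the variable and function names are hypothetical.

```python
# Runtimes in seconds from Table 3; EXPONENTS holds e for t = 2^e participants.
EXPONENTS = [4, 5, 6, 7, 8, 9]
BAY = {
    "client": [2.234, 2.445, 2.176, 2.441, 2.279, 2.405],
    "server": [3.881, 5.913, 10.202, 15.242, 25.371, 41.615],
}
OURS = {
    "client": [0.356, 0.422, 0.387, 0.401, 0.390, 0.381],
    "server": [0.389, 0.785, 1.504, 3.350, 6.845, 12.807],
}

def speedups(role):
    """Runtime of Bay et al. divided by our runtime, per number of participants."""
    return [b / o for b, o in zip(BAY[role], OURS[role])]

def client_server_ratio(table):
    """Client runtime divided by server runtime, per number of participants."""
    return [c / s for c, s in zip(table["client"], table["server"])]

for e, sc, ss, r in zip(EXPONENTS, speedups("client"), speedups("server"),
                        client_server_ratio(OURS)):
    print(f"t = 2^{e}: client speedup {sc:.2f}, server speedup {ss:.2f}, "
          f"client/server ratio (ours) {r:.3f}")
```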
Figure 2. Comparison for different MPSI protocols with different numbers of participants.
4.2.3. Experiment 3
We fixed the size of the clients’ datasets at $2^8$ and the number of participants at $2^5$.
Additionally, we increased the size of the server’s set from $2^{10}$ to $2^{14}$ for the experiment to test
the runtime of the clients and the server. The experimental results are shown in Table 4
and Figure 3. It can be seen that the client and server runtimes of both our protocol and [28]
were linearly related to the size of the server dataset. However, our protocol was significantly
more efficient than the protocol of Bay et al. [28]: the runtimes of
the server and clients were approximately 1/6 of those in [28].
Figure 3. Comparison for different MPSI protocols with different volumes of the server’s dataset.
Table 4. Comparison for different MPSI protocols with different volumes of the server’s dataset
(in seconds).
4.3. Discussion
From the analysis and comparison of the above experiments, we can conclude that
our protocol offers the following advantages:
(1) In our protocol, the number (t) of participants has almost no impact on the
computation and runtime of the client. When $t \geq 2^4$, the client’s runtime was about $1/2^{\log(t)}$ of
the server’s runtime. Therefore, our protocol is very suitable for unbalanced scenarios with
a large number of participants.
(2) Compared to the typical related protocol [28], our protocol demonstrated a significant
improvement in efficiency. The server and client runtimes were approximately 1/6 of those in [28].
5. Conclusions
This paper proposed an MPSI protocol based on the Bloom filter and Shamir threshold
secret-sharing scheme, which is highly suitable for unbalanced scenarios with a large
number of participants. Compared to the typical related protocol [28], our protocol demon-
strated a significant improvement in efficiency. The server’s and clients’ runtimes were
approximately 1/6 of [28]. Extending the approach to the model of malicious adversaries
is our future work.
Author Contributions: Conceptualization, O.R., C.Y., J.Z. and C.A.; methodology, O.R., C.Y., J.Z. and
C.A.; software, C.Y. and C.A.; validation, O.R., C.Y., J.Z. and C.A.; formal analysis, O.R., C.Y., J.Z.
and C.A.; investigation, O.R., C.Y., J.Z. and C.A.; resources, O.R., C.Y., J.Z. and C.A.; data curation,
O.R., C.Y., J.Z. and C.A.; writing—original draft preparation, C.Y. and C.A.; writing—review and
editing, O.R. and J.Z.; visualization, O.R., C.Y., J.Z. and C.A.; supervision, O.R. and J.Z.; project
administration, O.R.; funding acquisition, O.R. All authors have read and agreed to the published
version of the manuscript.
Funding: This research is supported by the National Natural Science Foundation of China under
Grant 62202146 and Enterprise Technology Innovation Development Project of Hubei Province of
China Grant Number 2021BAB009.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data are contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest. The funders had no role in the
design of the study; in the collection, analyses, or interpretation of the data; in the writing of the
manuscript; nor in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Demmler, D.; Rindal, P.; Rosulek, M.; Trieu, N. PIR-PSI: Scaling Private Contact Discovery. Proc. Priv. Enhancing Technol. 2018,
4, 159–178. [CrossRef]
2. Nagy, M.; De Cristofaro, E.; Dmitrienko, A.; Asokan, N.; Sadeghi, A.-R. Do i know you? efficient and privacy-preserving
common friend-finder protocols and applications. In Proceedings of the 29th Annual Computer Security Applications Conference,
New Orleans, LA, USA, 9–13 December 2013; pp. 159–168. Available online: https://ia.cr/2013/620 (accessed on 15 May 2023).
3. Yuan, X.; Wang, X.; Wang, C.; Squicciarini, A.; Ren, K. Enabling privacy-preserving image-centric social discovery. In Proceedings
of the 2014 IEEE 34th International Conference on Distributed Computing Systems, Madrid, Spain, 30 June–3 July 2014;
pp. 198–207. [CrossRef]
4. Kim, S.P.; Gil, M.S.; Kim, H.; Choi, M.-J.; Moon, Y.-S.; Won, H.-S. Efficient two-step protocol and its discriminative feature
selections in secure similar document detection. Secur. Commun. Netw. 2017, 2017, 6841216. [CrossRef]
5. Phuong, T.T. Privacy-preserving deep learning via weight transmission. IEEE Trans. Inf. Forensics Secur. 2019, 14, 3003–3015. [CrossRef]
6. Fischlin, M.; Pinkas, B.; Sadeghi, A.R.; Schneider, T.; Visconti, I. Secure set intersection with untrusted hardware tokens. In
Proceedings of the CT-RSA 2011, LNCS, San Francisco, CA, USA, 14–18 February 2011; Volume 6558, pp. 1–16. [CrossRef]
7. Bogdanov, D.; Niitsoo, M.; Toft, T.; Willemson, J. High-performance secure multi-party computation for data mining applications.
Int. J. Inf. Secur. 2012, 11, 403–418. [CrossRef]
8. Wang, Y.-W.; Wu, J.-L. A Privacy-Preserving Symptoms Retrieval System with the Aid of Homomorphic Encryption and Private
Set Intersection Schemes. Algorithms 2023, 16, 244. [CrossRef]
9. Fan, C.; Jia, P.; Lin, M.; Wei, L.; Guo, P.; Zhao, X.; Liu, X. Cloud-Assisted Private Set Intersection via Multi-Key Fully Homomorphic
Encryption. Mathematics 2023, 11, 1784. [CrossRef]
10. Resende, A.C.D.; de Freitas Aranha, D. Faster unbalanced Private Set Intersection in the semi-honest setting. J. Cryptogr. Eng.
2021, 11, 21–38. [CrossRef]
11. Falk, B.H.; Noble, D.; Ostrovsky, R. Private set intersection with linear communication from general assumptions. In Proceedings
of the 18th ACM Workshop on Privacy in the Electronic Society, London, UK,
11 November 2019; pp. 14–25. [CrossRef]
12. Le, P.H.; Ranellucci, S.; Gordon, S.D. Two-party private set intersection with an untrusted third party. In Proceedings of the 2019 ACM
SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 2403–2420. [CrossRef]
13. Ciampi, M.; Orlandi, C. Combining private set-intersection with secure two-party computation. In Security and Cryptography
for Networks (SCN 2018); Catalano, D., De Prisco, R., Eds.; Lecture Notes in Computer Science; Springer: Amalfi, Italy, 2018;
Volume 11035, pp. 464–482. [CrossRef]
14. Wang, Z.S.; Banawan, K.; Ulukus, S. Multi-party private set intersection: An information-theoretic approach. IEEE J. Sel. Areas Inf. Theory
2021, 2, 366–379. [CrossRef]
15. Debnath, S.K.; Sakurai, K.; Dey, K.; Kundu, N. Secure outsourced private set intersection with linear complexity. In Proceedings
of the 2021 IEEE Conference on Dependable and Secure Computing (DSC), Aizuwakamatsu, Japan, 30 January–2 February 2021;
pp. 1–8. [CrossRef]
16. Blanton, M.; Aguiar, E. Private and Oblivious Set and Multiset Operations; Springer: Berlin/Heidelberg, Germany, 2012. [CrossRef]
17. Chen, H.; Huang, Z.; Laine, K.; Rindal, P. Labeled PSI from fully homomorphic encryption with malicious security. In Proceedings
of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018;
pp. 1223–1237. [CrossRef]
18. Chen, H.; Laine, K.; Rindal, P. Fast private set intersection from homomorphic encryption. In Proceedings of the 2017 ACM SIGSAC
Conference on Computer and Communications Security, New York, NY, USA, 30 October–3 November 2017; pp. 1243–1255. [CrossRef]
19. Lv, S.; Ye, J.; Yin, S.; Cheng, X.; Feng, C.; Liu, X.; Li, R.; Li, Z.; Liu, Z.; Zhou, L. Unbalanced private set intersection cardinality
protocol with low communication cost. Future Gener. Comput. Syst. 2020, 102, 1054–1061. [CrossRef]
20. Ma, J.P.K.; Chow, S.S.M. Secure-Computation-Friendly Private Set Intersection from Oblivious Compact Graph Evaluation. In
Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, Nagasaki, Japan, 30 May–3 June
2022; pp. 1086–1097. [CrossRef]
21. Resende, A.C.D.; Aranha, D.F. Faster unbalanced private set intersection. In Proceedings of the International Conference on
Financial Cryptography and Data Security, Nieuwpoort, Curaçao, 26 February–2 March 2018; pp. 203–221. [CrossRef]
22. Freedman, M.J.; Nissim, K.; Pinkas, B. Efficient private matching and set intersection. In Proceedings of the International Conference
on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; pp. 1–19. [CrossRef]
23. Kissner, L.; Song, D. Privacy-preserving set operations. In Proceedings of the 25th Annual International Cryptology Conference
on Advances in Cryptology, Santa Barbara, CA, USA, 14–18 August 2005; pp. 241–257. [CrossRef]
24. Sang, Y.; Shen, H. Efficient and secure protocols for privacy-preserving set operations. ACM Trans. Inf. Syst. Secur. 2009,
13, 1–35. [CrossRef]
25. Zhang, L.; He, C.; Wei, L. Efficient and malicious secure three-party private set intersection computation protocols for small sets.
J. Comput. Res. Dev. 2022, 59, 2286–2298. [CrossRef]
26. Miyaji, A.; Nakasho, K.; Nishida, S. Privacy-preserving integration of medical data: A practical Multiparty Private Set Intersection.
J. Med Syst. 2017, 41, 1–10. [CrossRef]
27. Davidson, A.; Cid, C. An efficient toolkit for computing private set operations. In Proceedings of the Information Security and
Privacy: 22nd Australasian Conference, ACISP 2017, Auckland, New Zealand, 3–5 July 2017; Proceedings, Part II 22; Springer
International Publishing: Berlin/Heidelberg, Germany, 2017; pp. 261–278. [CrossRef]
28. Bay, A.; Erkin, Z.; Hoepman, J.-H.; Samardjiska, S.; Vos, J. Practical Multi-Party Private Set Intersection Protocols.
IEEE Trans. Inf. Forensics Secur. 2022, 17, 1–15. [CrossRef]
29. Kolesnikov, V.; Matania, N.; Pinkas, B.; Rosulek, M.; Trieu, N. Practical multi-party private set intersection from symmetric-key
techniques. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA,
30 October–3 November 2017; pp. 1257–1272. [CrossRef]
30. Kavousi, A.; Mohajeri, J.; Salmasizadeh, M. Efficient scalable multi-party private set intersection using oblivious PRF. In
Proceedings of the 17th International Workshop on Security and Trust Management, Darmstadt, Germany, 8 October 2021;
pp. 81–99. [CrossRef]
31. Inbar, R.; Omri, E.; Pinkas, B. Efficient scalable multiparty private set-intersection via garbled Bloom filters. In Proceedings of the 11th
International Conference on Security and Cryptography for Networks, Amalfi, Italy, 5–7 September 2018; pp. 235–252. [CrossRef]
32. Zhang, E.; Liu, F.; Lai, Q.; Jin, G.; Li, Y. Efficient multi-party private set intersection against malicious adversaries. In Proceedings of the
2019 ACM SIGSAC Conference on Cloud Computing Security Workshop, London, UK, 11–15 November 2019; pp. 93–104. [CrossRef]
33. Ben-Efraim, A.; Nissenbaum, O.; Omri, E.; Paskin-Cherniavsky, A. PSImple: Practical multiparty maliciously-secure private set
intersection. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, Nagasaki, Japan,
30 May–3 June 2022; pp. 1098–1112. [CrossRef]
34. Nevo, O.; Trieu, N.; Yanai, A. Simple, fast malicious Multiparty Private Set Intersection. In Proceedings of the 2021 ACM SIGSAC
Conference on Computer and Communications Security, Seoul, Korea, 15–19 November 2021; pp. 1151–1165. [CrossRef]
35. Gordon, S.D.; Hazay, C.; Le, P.H. Fully Secure PSI via MPC-in-the-Head [EB/OL]. 2022. Available online:
https://eprint.iacr.org/2022/379 (accessed on 15 May 2023).
36. ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985,
31, 469–472. [CrossRef]
37. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426. [CrossRef]
38. Dong, C.; Chen, L.; Wen, Z. When private set intersection meets big data: An efficient and scalable protocol. In Proceedings
of the 2013 ACM SIGSAC conference on Computer & Communications Security, Berlin, Germany, 4–8 November 2013;
pp. 789–800. [CrossRef]
39. Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [CrossRef]
40. Lindell, Y. How to simulate it—A tutorial on the simulation proof technique. In Tutorials on the Foundations of Cryptography;
Lindell, Y., Ed.; Information Security and Cryptography; Springer: Berlin/Heidelberg, Germany, 2017; pp. 277–346. [CrossRef]
41. Shoup, V. NTL: A Library for Doing Number Theory. [Online]. 2020. Available online: https://www.shoup.net/ntl/ (accessed
on 15 May 2023).