Article
A Practical Multiparty Private Set Intersection Protocol Based
on Bloom Filters for Unbalanced Scenarios
Ou Ruan 1,† , Changwang Yan 1,† , Jing Zhou 2, * and Chaohao Ai 1
1 School of Computer Science, Hubei University of Technology, Wuhan 430068, China; ruanou@hbut.edu.cn (O.R.); 102211192@hbut.edu.cn (C.Y.); 102011051@hbut.edu.cn (C.A.)
2 Digital Art Industry Institute, Hubei University of Technology, Wuhan 430068, China
* Correspondence: zhoujing0076@hbut.edu.cn
† These authors contributed equally to this work.
Abstract: Multiparty Private Set Intersection (MPSI) is dedicated to finding the intersection of
datasets of multiple participants without disclosing any other information. Although many MPSI
protocols have been presented, there are still some important practical scenarios that require in-depth
consideration such as an unbalanced scenario, where the server’s dataset is much larger than the
clients’ datasets, and in cases where the number of participants is large. This paper proposes a
practical MPSI protocol for unbalanced scenarios. The protocol uses the Bloom filter, an efficient
data structure, and the ElGamal encryption algorithm to reduce the computation of clients and the
server; adopts randomization technology to solve the encryption problem of the 0s in the Bloom
filter; and introduces the idea of the Shamir threshold secret-sharing scheme to adapt to multiple
environments. A formal security proof and three detailed experiments are given. The results of the
experiments showed that the new protocol is very suitable for unbalanced scenarios with a large
number of participants, and it has a significant improvement in efficiency compared with the typical
related protocol (TIFS 2022).
Keywords: Multiparty Private Set Intersection; unbalanced scenario; ElGamal cryptography; Shamir
threshold key-sharing scheme; Bloom filter
Citation: Ruan, O.; Yan, C.; Zhou, J.; Ai, C. A Practical Multiparty Private Set Intersection Protocol Based on Bloom Filters for Unbalanced Scenarios. Appl. Sci. 2023, 13, 13215. https://doi.org/10.3390/app132413215
Academic Editor: David Megías
Received: 27 October 2023
Revised: 9 December 2023
Accepted: 9 December 2023
Published: 13 December 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
In the era of Big Data, the intersection of large amounts of data becomes inevitable.
Private Set Intersection (PSI) addresses the current need for a solution to compute the
intersection of two or more datasets without disclosing information about the data of
each party, except for the intersection itself. PSI has wide applications such as social
discovery [1–3], document-like detection [4], joint learning in neural network models [5],
suspect detection [6], privacy-preserving data mining [7], privacy-preserving retrieval
systems [8], cloud-based applications [9], and so on. Many high-performance PSI protocols
have been proposed recently [10–16]. Additionally, there are some PSI protocols [10,17–21]
specifically designed for unbalanced scenarios, where the server has a large dataset and the
client has a dataset much smaller than the server’s.
Multiparty Private Set Intersection (MPSI) protocols are designed to solve the intersection
problem among multiple participants. Traditional approaches for designing MPSI
protocols mainly use the homomorphic encryption technology of public keys [22–28] and
oblivious transfer methods [29–35]. For MPSI protocols, the unbalanced scenario is also a
common real-life situation in which the server possesses a large dataset, while multiple
clients have smaller datasets. For example, some WeChat (version 3.9.8) users want to
find their mutual friends in order to create a friend circle or group chat. In this scenario,
there is a server that has information about all registered users and their phone numbers
and multiple clients with their own phone address books who want to perform MPSI with
the server in order to find common phone numbers among them. We refer to this as a
Multiparty Private Set Intersection for an unbalanced scenario, where the clients’ address
books consist of small datasets, while the server’s database is very large.
However, most of the existing MPSI protocols are not applicable to such unbalanced
scenarios. For example, the protocol in [22] can only derive the approximate number of
intersections; the protocols in [23,24] require the participants’ datasets to be of equal size;
the protocol in [25] is only applicable to three participants and small datasets; the protocol
in [28] can be applied to such scenarios, but the computation for the clients and the server
is high.
This paper presents a practical MPSI protocol for unbalanced scenarios in which the
server with high computational power handles the majority of the computations, while the
clients only need to perform a small amount of computation. Moreover, the protocol has a
minimal impact on the execution time of clients as the number of participants increases.
1.1. Related Works

Freedman et al. [22] proposed the first MPSI protocol based on the additive homo-
morphic encryption technique. This protocol represents the set elements as the roots
of polynomials and implements two-party PSI under the semi-honest adversary model.
Additionally, they introduced the idea of constructing a Multiparty PSI protocol for the
case of malicious adversaries. In 2005, Kissner et al. [23] used additive homomorphic
encryption and private key secret-sharing techniques to design an MPSI protocol, whose
computational and communication complexity was twice the size of the set and the num-
ber of participants. The computational complexity of such algorithms does not meet the
requirements of practical applications. After that, reducing the computational and commu-
nication overhead of MPSI protocols based on homomorphic cryptography has become an
important research goal.
In 2017, Miyaji et al. [26] gave an MPSI protocol based on Bloom filters and homomor-
phic encryption. This protocol maps the set into a Bloom filter, encrypts it, and performs a
homomorphic multiplication operation on the encrypted Bloom filter. The intersection set
is then obtained by comparing the results based on the corresponding decryption operation.
However, the protocol of Miyaji et al. [26] has a rather obvious drawback that the Bloom
filter to be encrypted by each participant is equally large, even if the original set is small.
In the same year, Davidson et al. [27] also proposed a PSI protocol based on
Bloom filters and Paillier homomorphic encryption, which overcomes the drawback of
Miyaji et al. [26]. In the protocol [27], the operations of mapping the set data and
encrypting the Bloom filter are basically the same as in [26]. The difference is that the
participant who wants to obtain the intersection maps his/her set data into the
encrypted Bloom filter using the same hash functions as those used in the mapping. The
intersection is obtained by performing the interactive decryption operation and utilizing the
homomorphic properties of the Paillier encryption scheme.
In 2022, Bay et al. [28] extended the protocol [27] to multiple participants by perform-
ing threshold Paillier additive homomorphic encryptions on the Bloom filters of each of
them. They conducted interactive decryption operations, utilized homomorphic properties,
and obtained the intersection as long as the number of decryptors exceeded the threshold.
Although their protocol has a longer runtime, it is an open-source Multiparty PSI protocol,
and it has a greater advantage in scenarios where multiple participants have small datasets.
1.2. Our Contribution

This paper presents a secure MPSI protocol based on the Bloom filter and Shamir
threshold secret-sharing scheme for a large number of participants and unbalanced scenarios.
Our protocol follows the methods of [27,28], which were based on the Bloom filter and
homomorphic encryption. The Bloom filter is an efficient data structure, which can be used
to reduce the computation for both clients and server. The main difference is that we used
the relatively efficient ElGamal multiplicative homomorphic encryption algorithm instead
of the Paillier additive homomorphic encryption algorithm. The ElGamal encryption
algorithm cannot encrypt 0. To address this issue, we randomized the 0s in the Bloom filter,
i.e., we obtained the encrypted Bloom filter by following this process: if an item in the
Bloom filter is 1, it is encrypted directly; otherwise, if the item is 0, we select a random value
to represent it and, then, encrypt the random value. To ensure that participants can obtain
the intersection properly, we adopted the idea of the Shamir t-threshold secret-key-sharing
scheme, in which the private key is divided into n copies and at least t shares are needed
for decryption.
The contributions of this paper are as follows:
(1) A secure MPSI protocol based on the Bloom filter and threshold ElGamal encryption
scheme for unbalanced scenarios is proposed. In this protocol, the runtime of the client
hardly varies with the number of participants, and it is linearly correlated with the data
size of the server.
(2) We present three comprehensive experiments, and the results showed that, in our
protocol, the number (t) of participants has a minimal influence on the computation and
runtime of the client. When t ≥ 24 , the client’s runtime is approximately 1/(2log(t) ) of the
server’s runtime. Therefore, our protocol is highly suitable for unbalanced scenarios with
a large number of participants. Furthermore, our protocol exhibited a significant efficiency
improvement compared to the related protocol [28]: the server’s and clients’ runtimes
were approximately 1/6 of those in [28].
(3) We provide a formal security proof of our protocol against semi-honest adversaries.
2. Materials and Methods

This section first describes the notations used in the paper and, then, provides an
overview of the preliminaries about the protocol.
2.1. Notations
We show the notations used in the paper in Table 1.
Table 1. Table of notations.
Notation | Meaning
$p$ | A large prime number.
$Z_p^*$ | The multiplicative group modulo $p$.
$Enc(M)$ | The result of encrypting the plaintext $M$.
$Dec(C)$ | The result of decrypting the ciphertext $C$.
$t$ | The number of participants in the protocol.
$P_i$ | The $i$-th participant. $P_1, \ldots, P_{t-1}$ are the clients, and $P_t$ is the server.
$S_i$ | Vector of datasets for participant $P_i$, $i \in \{1, 2, \ldots, t\}$.
$n_i$ | The dataset size of $S_i$, $i \in \{1, 2, \ldots, t\}$.
$S_i[j]$ | The $j$-th element in the dataset vector $S_i$ of participant $P_i$, $i \in \{1, 2, \ldots, t\}$, $j \in \{1, 2, \ldots, n_i\}$.
$m$ | The length of the Bloom filter.
$k$ | The number of hash functions used by the Bloom filter.
$h_u$ | The $u$-th hash function, $u \in \{1, 2, \ldots, k\}$.
$h_u(x)$ | The hash value of $x$ under $h_u$, which lies in $\{1, 2, \ldots, m\}$, where $u \in \{1, 2, \ldots, k\}$.
$BF_i$ | The Bloom filter obtained by mapping the dataset $S_i$, $i \in \{1, 2, \ldots, t\}$.
$BF_i[l]$ | The $l$-th bit of $BF_i$, $i \in \{1, 2, \ldots, t\}$, $l \in \{1, 2, \ldots, m\}$.
$RBF_i$ | The randomized Bloom filter obtained by randomizing $BF_i$, $i \in \{1, 2, \ldots, t\}$.
$RBF_i[l]$ | The $l$-th element of $RBF_i$, $i \in \{1, 2, \ldots, t\}$, $l \in \{1, 2, \ldots, m\}$.
$ERBF_i$ | The encrypted randomized Bloom filter obtained by encrypting $RBF_i$, $i \in \{1, 2, \ldots, t\}$.
$ERBF_i[l]$ | The $l$-th element of $ERBF_i$, $i \in \{1, 2, \ldots, t\}$, $l \in \{1, 2, \ldots, m\}$.
$rand()$ | Generate a random number between 1 and $p - 1$.
$pk$ | The public key.
$sk$ | The private key.
$sk_i$ | The share of private key $sk$ distributed to the $i$-th participant.
$c^u_{j,i}$ | $c^u_{j,i} = ERBF_i[h_u(S_t[j])]$, where $i \in \{1, 2, \ldots, t\}$, $j \in \{1, 2, \ldots, n_t\}$, $u \in \{1, 2, \ldots, k\}$.
$Comb(sh_{j,1}, \ldots, sh_{j,t})$ | For $j \in \{1, 2, \ldots, n_t\}$, joint decryption on $(sh_{j,1}, \ldots, sh_{j,t})$.
$\Delta_i$ | $\Delta_i = \prod_{j'=1, j' \ne i}^{t} \frac{j'}{j' - i}$, where $i \in \{1, 2, \ldots, t\}$.
$sh_{j,i}$ | $sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i}$, where $i \in \{1, 2, \ldots, t\}$, $j \in \{1, 2, \ldots, n_t\}$.
$sh_i$ | $sh_i = \{sh_{1,i}, \ldots, sh_{j,i}, \ldots, sh_{n_t,i}\}$, where $i \in \{1, 2, \ldots, t\}$.
$\times$ | Homomorphic multiplication. All computations between ciphertexts in the article are homomorphic multiplicative computations.
2.2. Preliminaries
2.2.1. ElGamal Encryption Algorithm
In 1985, Taher ElGamal [36] proposed an asymmetric encryption algorithm based
on the Diffie–Hellman key exchange. The security of the system primarily relies on the
difficulty of solving the discrete logarithm problem in a finite field. The ElGamal encryption
algorithm can be divided into three main parts: key generation, encryption, and decryption.
Key generation: First, randomly select a large prime number, $p$, such that $p - 1$ has
large prime factors. Next, choose a primitive element $g$ modulo $p$. Finally, choose $d$
($2 \le d \le p - 2$) as the private key; then $y = g^d \bmod p$ is the public key.
Encryption: Suppose the plaintext is $x$. Calculate the ciphertext pair $C_1 = g^r \bmod p$
and $C_2 = x \cdot y^r \bmod p$, where $r$ is a random number with $2 \le r \le p - 2$.
Decryption: Compute the plaintext
$$x = \frac{C_2}{C_1^d} = \frac{x \cdot y^r}{g^{r \cdot d}} = \frac{x \cdot g^{r \cdot d}}{g^{r \cdot d}} \bmod p.$$
Multiplicative homomorphism: Assume that the plaintexts are $m_1$ and $m_2$, the
ciphertext pair generated by encrypting $m_1$ is $(C_{11}, C_{21})$, and the ciphertext pair
generated by encrypting $m_2$ is $(C_{12}, C_{22})$, where $C_{11} = g^{r_1} \bmod p$,
$C_{21} = m_1 \cdot g^{d \cdot r_1} \bmod p$, $C_{12} = g^{r_2} \bmod p$, and
$C_{22} = m_2 \cdot g^{d \cdot r_2} \bmod p$; then:
$$C_{11} \times C_{12} = g^{r_1} \times g^{r_2} = g^{r_1 + r_2} \bmod p \tag{1}$$
$$C_{21} \times C_{22} = m_1 \cdot g^{d \cdot r_1} \times m_2 \cdot g^{d \cdot r_2} = m_1 \cdot m_2 \cdot g^{d \cdot (r_1 + r_2)} \bmod p \tag{2}$$
Therefore, multiplying two ciphertexts encrypted with random numbers $r_1$ and $r_2$
component-wise modulo $p$ gives the same result as encrypting the product of the
corresponding plaintexts with the random number $r_1 + r_2$, so both decrypt to the
same value, as shown in Equation (3):
$$Dec(Enc(m_1 \cdot m_2)) = Dec(Enc(m_1) \cdot Enc(m_2)) \tag{3}$$
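The key generation, encryption, decryption, and multiplicative homomorphism described in this subsection can be sketched in Python as follows. This is a toy illustration under stated assumptions: the modulus p = 467 and the primitive element g = 2 are small values chosen only for readability, and the function names (keygen, encrypt, decrypt, mul) are hypothetical rather than taken from the authors' implementation.

```python
import secrets

# Toy parameters (assumption): 467 is prime and 2 is a primitive element modulo 467.
# A real deployment needs a much larger prime.
P = 467
G = 2

def keygen():
    """Return (public key y, private key d) with 2 <= d <= p - 2."""
    d = secrets.randbelow(P - 3) + 2
    return pow(G, d, P), d

def encrypt(y, x):
    """Encrypt a plaintext x in {1, ..., p-1}; returns the pair (C1, C2)."""
    r = secrets.randbelow(P - 3) + 2          # 2 <= r <= p - 2
    return pow(G, r, P), (x * pow(y, r, P)) % P

def decrypt(d, c):
    """Recover x = C2 / C1^d mod p (modular inverse via pow, Python 3.8+)."""
    c1, c2 = c
    return (c2 * pow(pow(c1, d, P), -1, P)) % P

def mul(ca, cb):
    """Homomorphic multiplication: component-wise product of two ciphertexts."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

if __name__ == "__main__":
    y, d = keygen()
    product = mul(encrypt(y, 5), encrypt(y, 7))
    print(decrypt(d, product))   # plaintext product 5 * 7 mod p
```

Note that a plaintext of 0 would give C2 = 0 regardless of the key, which is why the protocol later replaces the 0s of the Bloom filter with random non-zero values.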
2.2.2. Bloom Filters

A Bloom filter, $BF = \{BF[1], BF[2], \ldots, BF[m]\}$, is an array of bits of length $m$ with
$k$ random hash functions $\{h_1, h_2, \ldots, h_k\}$, which was proposed by Bloom [37] in 1970.
According to the analysis by Dong [38], when a set’s size is $n$ and the error rate is $P_{err}$, we
can generally set $m$ and $k$ as follows:
$$m \ge -n \cdot \log_2 e \cdot \log_2 P_{err}, \quad k \ge -\log_2 P_{err} \tag{4}$$
Bloom filter for dataset $X$: Each bit of the Bloom filter is initialized to 0. Each $x_i$ in the
dataset $X = \{x_1, \ldots, x_n\}$ is hashed $k$ times as $(h_1(x_i), \ldots, h_k(x_i))$, and
$BF[h_u(x_i)]$ is set to 1 for $u \in \{1, 2, \ldots, k\}$.
Verify whether an element $x$ belongs to a set $X$: If $BF[h_u(x)] = 1$ holds for each
$u \in \{1, 2, \ldots, k\}$, $x$ is considered to belong to the set $X$ at an acceptable error rate $P_{err}$.
Randomized Bloom filter (RBF): The BF after performing the following operation is
denoted as the RBF: for each element BF [i ] of the BF, i ∈ {1, 2, . . . , m}, modify BF [i ] =
rand() if BF [i ] = 0; if BF [i ] = 1, then no other operation is performed.
Encrypted randomized Bloom filters (ERBF): The encrypted randomized Bloom filter
is denoted as ERBF, where ERBF = { Enc( RBF [1]), Enc( RBF [2]), . . . , Enc( RBF [m])}.
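The Bloom filter operations of this subsection (parameter choice from Equation (4), insertion, membership test, and randomization into an RBF) can be sketched as follows. Several details are assumptions made for illustration only: the hash functions are instantiated with SHA-256 over an index-prefixed string, positions are 0-based rather than 1-based, and randomized entries are drawn from {2, ..., p-1} instead of {1, ..., p-1} so that a randomized 0 can never coincide with the value 1. All function names are hypothetical.

```python
import hashlib
import math
import secrets

def bloom_params(n, p_err):
    """Smallest integers m and k satisfying Equation (4)."""
    k = math.ceil(-math.log2(p_err))
    m = math.ceil(-n * math.log2(math.e) * math.log2(p_err))
    return m, k

def h(u, x, m):
    """u-th hash of x, mapped to a 0-based position in [0, m) (assumed instantiation)."""
    digest = hashlib.sha256(f"{u}|{x}".encode()).digest()
    return int.from_bytes(digest, "big") % m

def build_bf(items, m, k):
    """Set BF[h_u(x)] = 1 for every item x and every hash index u."""
    bf = [0] * m
    for x in items:
        for u in range(k):
            bf[h(u, x, m)] = 1
    return bf

def contains(bf, x, k):
    """x is considered a member if all k mapped positions hold 1."""
    return all(bf[h(u, x, len(bf))] == 1 for u in range(k))

def randomize(bf, p):
    """RBF: keep every 1, replace every 0 by a random value in [2, p-1]."""
    return [1 if bit == 1 else secrets.randbelow(p - 2) + 2 for bit in bf]

if __name__ == "__main__":
    m, k = bloom_params(n=1000, p_err=0.001)
    bf = build_bf(["alice", "bob"], m, k)
    print(m, k, contains(bf, "alice", k))
```

Encrypting every entry of the randomized filter with the public key then yields the ERBF defined above.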
2.2.3. Lagrange Interpolation

Lagrange interpolation is a polynomial interpolation method named after Joseph
Lagrange. Given $n$ points $((x_1, y_1), \ldots, (x_n, y_n))$, a polynomial passing through all points
and having a polynomial degree at most $n - 1$ can be expressed as follows:
$$f(x) = \sum_{j=1}^{n} y_j \cdot f_j(x) \tag{5}$$
where
$$f_j(x) = \prod_{i=1, i \ne j}^{n} \frac{x - x_i}{x_j - x_i} \tag{6}$$
then
$$f(x) = \sum_{i=1}^{n} \left( y_i \cdot \prod_{j=1, j \ne i}^{n} \frac{x - x_j}{x_i - x_j} \right) \tag{7}$$
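As a small worked illustration of the interpolation formula, the following Python sketch evaluates the Lagrange polynomial through a list of points using exact rational arithmetic. The function name lagrange_eval and the sample points are assumptions chosen for the example.

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree <= n-1 through the n given points."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        basis = Fraction(1)
        for i, (xi, _) in enumerate(points):
            if i != j:
                # One factor (x - x_i) / (x_j - x_i) of the basis polynomial f_j(x)
                basis *= Fraction(x - xi, xj - xi)
        total += yj * basis
    return total

if __name__ == "__main__":
    # Three points sampled from f(x) = x^2 + 1
    pts = [(1, 2), (2, 5), (3, 10)]
    print(lagrange_eval(pts, 0), lagrange_eval(pts, 4))
```

Evaluating at x = 0 recovers the constant term of the polynomial, which is exactly the step that secret sharing relies on.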
2.2.4. Shamir Threshold Secret-Sharing Scheme

The Shamir threshold secret-sharing scheme [39], also known as Shamir secret sharing,
is a threshold secret-sharing scheme based on the Lagrange interpolation formula.
There are two important parameters in the threshold secret-sharing scheme: t and n.
In this scheme, n represents the number of parties involved in splitting the secret, and t is
the minimum number of participants needed to recover the secret. The basic principle of
the threshold secret-sharing scheme is as follows: a secret s is divided into n sub-secrets,
and each party holds one sub-secret. When the secret s needs to be restored, at least t
parties need to retrieve the sub-secret and restore the secret s.
The Shamir threshold secret-sharing scheme is as follows:
(i) Choose a large prime $p$, and assume that $s$ ($s \in Z_p^*$) is the secret that requires at least
$t$ parties to recover.
(ii) Construct a random polynomial of degree $t - 1$, $f(x) = a_0 + a_1 \cdot x + \cdots + a_{t-1} \cdot x^{t-1}$,
where $a_0 = s$ and $a_1, \ldots, a_{t-1} \in Z_p^*$.
(iii) Each participant $i$, $i \in \{1, \ldots, n\}$, holds a sub-secret $(i, f(i))$, which can be viewed
as a point $(x_i, y_i)$ in the Lagrange interpolation formula.
(iv) Since $a_0 = s$, calculating $f(0) = a_0$ restores the secret. Using Lagrange interpolation,
$f(x)$ is restored with the sub-secrets of any $t$ parties (written here for parties $1, \ldots, t$),
and then $f(0)$ is computed as follows:
$$f(x) = \sum_{i=1}^{t} \left( f(i) \cdot \prod_{j=1, j \ne i}^{t} \frac{x - j}{i - j} \right) \bmod p \tag{8}$$
$$f(0) = \sum_{i=1}^{t} \left( f(i) \cdot \prod_{j=1, j \ne i}^{t} \frac{j}{j - i} \right) \bmod p \tag{9}$$
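Steps (i) to (iv) can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, the prime 2^61 - 1 is an arbitrary choice for the example, and the recovery routine is written for an arbitrary set of t share indices, which generalizes the formula above from parties 1, ..., t to any t parties.

```python
import secrets

def make_shares(secret, t, n, p):
    """Split secret into n shares (i, f(i)) of a random degree-(t-1) polynomial mod p."""
    coeffs = [secret] + [secrets.randbelow(p - 1) + 1 for _ in range(t - 1)]

    def f(x):
        return sum(a * pow(x, e, p) for e, a in enumerate(coeffs)) % p

    return [(i, f(i)) for i in range(1, n + 1)]

def recover(shares, p):
    """Recover f(0) from t shares by Lagrange interpolation at x = 0."""
    secret = 0
    for i, yi in shares:
        delta = 1
        for j, _ in shares:
            if j != i:
                # Factor j / (j - i) mod p, using a modular inverse (Python 3.8+)
                delta = delta * j * pow((j - i) % p, -1, p) % p
        secret = (secret + yi * delta) % p
    return secret

if __name__ == "__main__":
    p = 2**61 - 1                      # a Mersenne prime, used here as a toy modulus
    shares = make_shares(123456789, t=3, n=5, p=p)
    print(recover(shares[:3], p))      # any 3 of the 5 shares suffice
```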
2.2.5. Threshold ElGamal Encryption Scheme

Based on the Shamir secret-sharing scheme [39] and the threshold Paillier encryption
scheme proposed by Bay et al. [28], we present a threshold ElGamal encryption scheme.
The basic idea of the threshold ElGamal encryption scheme is to divide the private key of
the ElGamal encryption algorithm into n copies using the Shamir threshold secret-sharing
scheme. Decryption can then be achieved when at least t copies are gathered together.
For a (t, n) threshold ElGamal encryption algorithm, where n is the total number of
participants and t is the minimum number of participants required for decryption, the basic
scheme is described as follows:
Key generation: Randomly select a large prime $p$, and then choose a primitive element
$g$ modulo $p$. Randomly pick $sk = d$ as the private key for the ElGamal algorithm such
that $2 \le d \le p - 2$. Compute $y = g^d \bmod p$ and set $pk = y$ as the public key. The private key is
shared as follows: let $a_0 = d$, and generate the polynomial $f(x) = \sum_{i=0}^{t-1} a_i x^i \bmod p$; the key
share of the $j$-th participant is $sk_j = f(j) \bmod p$, where $j \in \{1, \ldots, n\}$.
Encryption: Assume that the message to be encrypted is $M$. Choose a random $r \in Z_p^*$,
and compute the ciphertexts $c_1 = g^r$ and $c_2 = M y^r = M g^{dr}$.
Share decryption: Among the $t$ participants involved in decryption, the $i$-th participant
calculates $c_{1,i} = (c_1)^{\Delta_i \cdot sk_i}$, where $\Delta_i = \prod_{j=1, j \ne i}^{t} \frac{j}{j - i}$.
Combination calculation (Comb): The results of decryption are obtained by performing
ElGamal homomorphic multiplication operations on the shared decryption results of
the $t$ participants:
$$\prod_{i=1}^{t} c_{1,i} = (c_1)^{\sum_{i=1}^{t} \Delta_i \cdot f(i)} \tag{10}$$
From the Shamir threshold secret-sharing scheme (Section 2.2.4), it is known that
restoring the key $sk = d$ is calculated as follows:
$$sk = f(0) = d = \sum_{i=1}^{t} \left( f(i) \cdot \prod_{j=1, j \ne i}^{t} \frac{j}{j - i} \right) = \sum_{i=1}^{t} \Delta_i \cdot f(i) \tag{11}$$
where $\Delta_i = \prod_{j=1, j \ne i}^{t} \frac{j}{j - i}$.
Thus, the decryption result can be obtained as follows:
$$M = \frac{c_2}{c_1^d} = \frac{c_2}{(c_1)^{\sum_{i=1}^{t} \Delta_i \cdot f(i)}} = \frac{c_2}{\prod_{i=1}^{t} c_{1,i}} \tag{12}$$
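The key sharing, share decryption, and combination steps can be sketched in Python as follows. One assumption deviates from the text and should be read carefully: the sketch works in a subgroup of prime order q (with p = 2q + 1) and reduces the polynomial, the shares, and the Lagrange coefficients modulo q, the order of g, so that interpolation in the exponent is well defined. This is a common variant swapped in for the illustration, not necessarily the authors' exact parameter choice. The toy values p = 467, q = 233, g = 4 and all function names are hypothetical.

```python
import secrets

# Toy parameters (assumption): P = 2Q + 1 with Q prime, and G = 4 has order Q modulo P.
P, Q, G = 467, 233, 4

def keygen(t, n):
    """Return (pk, sk, shares) where shares[j] = f(j) mod Q for j = 1..n (requires n < Q)."""
    d = secrets.randbelow(Q - 1) + 1
    coeffs = [d] + [secrets.randbelow(Q) for _ in range(t - 1)]
    shares = {j: sum(a * pow(j, e, Q) for e, a in enumerate(coeffs)) % Q
              for j in range(1, n + 1)}
    return pow(G, d, P), d, shares

def encrypt(y, M):
    """Encrypt M in {1, ..., P-1} as (c1, c2) = (g^r, M * y^r)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), M * pow(y, r, P) % P

def delta(i, ids):
    """Lagrange coefficient of party i at x = 0 over the participating indices ids."""
    out = 1
    for j in ids:
        if j != i:
            out = out * j * pow((j - i) % Q, -1, Q) % Q
    return out

def share_decrypt(c1, i, sk_i, ids):
    """Party i's partial decryption c1^(delta_i * sk_i)."""
    return pow(c1, delta(i, ids) * sk_i % Q, P)

def combine(c2, partials):
    """Comb: multiply the partial decryptions and divide c2 by the product."""
    prod = 1
    for s in partials:
        prod = prod * s % P
    return c2 * pow(prod, -1, P) % P

if __name__ == "__main__":
    y, d, shares = keygen(t=3, n=5)
    c1, c2 = encrypt(y, 42)
    ids = [1, 2, 3]
    parts = [share_decrypt(c1, i, shares[i], ids) for i in ids]
    print(combine(c2, parts))
```

No party ever reconstructs d itself in this sketch: each one only publishes a power of c1, and the product of those powers equals c1 raised to d.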
2.2.6. Security Model

Negligible function: For a security parameter $\lambda$, we call a function $negl(\lambda)$ negligible
if it satisfies the following condition: $negl(\lambda) < \frac{1}{p(\lambda)}$ for all possible polynomials $p$ and
sufficiently large $\lambda$.
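As a brief worked example of this definition (the two sample functions are chosen here for illustration and do not come from the paper), an inverse exponential is negligible, while an inverse polynomial is not:

```latex
% Negligible: for every polynomial p, p(\lambda) < 2^{\lambda} once \lambda is large enough, so
2^{-\lambda} < \frac{1}{p(\lambda)} \quad \text{for all sufficiently large } \lambda .

% Not negligible: take p(\lambda) = \lambda^{3}; then for every \lambda > 1
\frac{1}{\lambda^{2}} > \frac{1}{\lambda^{3}} = \frac{1}{p(\lambda)} ,
% so the required inequality fails for this polynomial.
```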
Computational indistinguishability: For a sufficiently large $\lambda$, $X = \{X_\lambda\}_{\lambda \in N}$ and
$Y = \{Y_\lambda\}_{\lambda \in N}$ represent two distributions of length $\lambda$. $X$ and $Y$ are computationally
indistinguishable if, for any possible polynomial-time algorithm $T$, the following condition
is satisfied:
$$|Pr[T(X_\lambda) \to 1] - Pr[T(Y_\lambda) \to 1]| \le negl(\lambda) \tag{13}$$
where $Pr[E]$ represents the probability of the event $E$ occurring.
We denote this by $X \simeq Y$.
Semi-honest model: Each participant strictly adheres to the normal operation of
the protocol, including the inputs, outputs, and intermediate processes. In the semi-
honest adversary model, however, an adversary attempts to obtain any information about
other participants.
Security model under semi-honest adversaries: In a general multiparty computation,
one of the participants is called the server $P_t$, and the others are called the clients
($P_1, \ldots, P_{t-1}$). Let $S = (S_1, \ldots, S_t)$ represent the set of inputs from the clients and the server
and $f(S) = (\perp, \cap)$ represent a function in which the server $P_t$ obtains the intersection and
the clients obtain nothing. Assume a protocol $\Pi_t$ with $t$ participants computes the function
$f(S)$. During the execution of the protocol $\Pi_t$, the server’s view is
$View^{\Pi_t}_{P_t} = (S_t, r_t, \ell_t, out_t)$, and the clients’ views are
$View^{\Pi_t}_{P_i} = (S_i, r_i, \ell_i, out_i)$, in which $i \in \{1, \ldots, t - 1\}$, $r_i$ and $r_t$
represent the random numbers generated by each of the clients $P_i$ and server $P_t$, $\ell_i$ and $\ell_t$
are the messages received by each of the clients $P_i$ and server $P_t$, respectively, and $out_i$ and
$out_t$ represent the outputs of each of the clients $P_i$ and server $P_t$.
If there exist polynomial-time simulators $Sim_t(S_t, \cap)$ and $Sim_i(S_i, \perp)$ such that
$Sim_t(S_t, \cap)$ and $View^{\Pi_t}_{P_t}$ are computationally indistinguishable, and $Sim_i(S_i, \perp)$ and
$View^{\Pi_t}_{P_i}$ are computationally indistinguishable for every $i \in \{1, \ldots, t - 1\}$, then $\Pi_t$
securely computes the function $f(S)$:
$$Sim_t(S_t, \cap) \simeq View^{\Pi_t}_{P_t}, \qquad Sim_i(S_i, \perp) \simeq View^{\Pi_t}_{P_i} \tag{14}$$
From Equation (14), it is clear that such a secure multiparty computation protocol $\Pi_t$
is secure in the presence of a semi-honest adversary.
3. Our Multiparty Private Set Intersection Protocol

In the protocol, we used Bloom filters to reduce the computational complexity, ElGamal
multiplicative homomorphic encryption to achieve message confidentiality, and Shamir
secret sharing and Lagrange interpolation to apply to multiparty scenarios.
3.1. Protocol Description

This section presents our new MPSI protocol, which is based on the Bloom filter and a
threshold ElGamal encryption scheme.
Firstly, the protocol uses an efficient data structure called the Bloom filter, which was
also utilized in [27,28], to minimize the computation on both the clients and the server.
Secondly, the protocol adopts a relatively efficient ElGamal encryption algorithm instead
of the Paillier algorithm. The ElGamal encryption algorithm cannot be used to encrypt 0,
but we can avoid this issue by randomizing the 0s in the Bloom filter. Thirdly, in order
to be suitable for multiparty environments, a threshold ElGamal encryption scheme was
designed by incorporating the concept of the Shamir threshold secret-key-sharing scheme.
This scheme divides the private key into n copies for n participants and requires no less
than t participants to decrypt the data accurately. Finally, we incorporated time-consuming
encryptions into a preprocessing phase, which facilitates efficient execution during the
online interaction stage.
The protocol is shown in Figure 1, which is described as follows:
Data input: Participant Pi ’s dataset Si , i ∈ {1, 2, . . . , t}.
Data output: The intersection set S = {S1 ∩ S2 ∩ · · · ∩ St }.
Initialization:
(1) A trusted third party generates an ElGamal public–private key pair $(pk, sk)$ and a
polynomial of degree $t - 1$, $f(x) = sk + a_1 x + \cdots + a_{t-1} x^{t-1}$, for $sk$, then distributes $pk$ to
$P_1, \ldots, P_t$ and $sk_i = f(i)$ to $P_i$, $i \in \{1, 2, \ldots, t\}$.
(2) The trusted third party then picks $k$ hash functions $h_1, \ldots, h_k$ and sends them to
participants $P_1, \ldots, P_t$.
Preprocessing stage:
(1) Each client Pi , i ∈ {1, 2, . . . , t − 1}, generates a hash-mapped Bloom filter BFi for
its own private set Si , which is then randomized to obtain RBFi , and finally, ERBFi is
computed by encrypting each value in the randomized Bloom filter RBFi using public
key pk.
(2) For each j ∈ {1, 2, . . . , nt }, the server Pt generates a random number r j and, then,
calculates w j = Enc(r j ) × Enc(St [ j]) mod p.
Online stage: The server Pt receives the encrypted randomized Bloom filters
{ ERBF1 , ERBF2 , . . . , ERBFt−1 } and starts the intersection calculation.
(1) i. For each of the data St [ j] of the server Pt , the hash value hu (St [ j]) is computed,
where j ∈ {1, . . . , nt }, u ∈ {1, . . . , k}. Eventually, nt sets of data are obtained, each with k
values, as follows:
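$$\begin{matrix}
\{h_1(S_t[1]) & \cdots & h_u(S_t[1]) & \cdots & h_k(S_t[1])\} \\
\vdots & & \vdots & & \vdots \\
\{h_1(S_t[j]) & \cdots & h_u(S_t[j]) & \cdots & h_k(S_t[j])\} \\
\vdots & & \vdots & & \vdots \\
\{h_1(S_t[n_t]) & \cdots & h_u(S_t[n_t]) & \cdots & h_k(S_t[n_t])\}
\end{matrix} \tag{15}$$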
ii. Substitute $h_u(S_t[j])$ into each client’s encrypted randomized Bloom filter $ERBF_i$
to obtain $c^u_{j,i}$, where $i \in \{1, \ldots, t - 1\}$, $j \in \{1, \ldots, n_t\}$, $u \in \{1, \ldots, k\}$ and
$c^u_{j,i} = ERBF_i[h_u(S_t[j])]$, as follows:
$$\begin{matrix}
\{\{c^1_{1,1}, \cdots, c^k_{1,1}\} & \cdots & \{c^1_{1,i}, \cdots, c^k_{1,i}\} & \cdots & \{c^1_{1,t-1}, \cdots, c^k_{1,t-1}\}\} \\
\vdots & & \vdots & & \vdots \\
\{\{c^1_{j,1}, \cdots, c^k_{j,1}\} & \cdots & \{c^1_{j,i}, \cdots, c^k_{j,i}\} & \cdots & \{c^1_{j,t-1}, \cdots, c^k_{j,t-1}\}\} \\
\vdots & & \vdots & & \vdots \\
\{\{c^1_{n_t,1}, \cdots, c^k_{n_t,1}\} & \cdots & \{c^1_{n_t,i}, \cdots, c^k_{n_t,i}\} & \cdots & \{c^1_{n_t,t-1}, \cdots, c^k_{n_t,t-1}\}\}
\end{matrix} \tag{16}$$
iii. For each set of data in Equation (16), a homomorphic multiplication is performed
to compute $c_{j,i} = c^1_{j,i} \times \cdots \times c^k_{j,i}$, where $i \in \{1, \ldots, t - 1\}$, $j \in \{1, \ldots, n_t\}$. The resulting
data are as follows:
$$\begin{matrix}
\{c_{1,1} & \cdots & c_{1,i} & \cdots & c_{1,t-1}\} \\
\vdots & & \vdots & & \vdots \\
\{c_{j,1} & \cdots & c_{j,i} & \cdots & c_{j,t-1}\} \\
\vdots & & \vdots & & \vdots \\
\{c_{n_t,1} & \cdots & c_{n_t,i} & \cdots & c_{n_t,t-1}\}
\end{matrix} \tag{17}$$
iv. For each set of data in Equation (17), homomorphic multiplication is performed to
compute $c_j = c_{j,1} \times \cdots \times c_{j,t-1} \times w_j$, where $j \in \{1, \ldots, n_t\}$.
(2) $P_i$ performs the computation of the share decryption after obtaining the data $c_j$,
which is computed as follows:
$$sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i} \tag{18}$$
where $\Delta_i = \prod_{j'=1, j' \ne i}^{t} \frac{j'}{j' - i}$, $i \in \{1, \ldots, t\}$, $j \in \{1, 2, \ldots, n_t\}$.
(3) $P_t$ performs the combination computation, which is denoted as
$Comb(sh_{j,1}, \ldots, sh_{j,t})$. Then, $Dec(c_j)$ can be obtained by $Comb(sh_{j,1}, \ldots, sh_{j,t})$.
If $Dec(c_j) = Dec(w_j) = S_t[j] \cdot r_j$, the server adds the corresponding $S_t[j]$ to the
intersection, $S = \{S_t[j]\} \cup S$. The range of $j$ above is $\{1, 2, \cdots, n_t\}$.
Figure 1. Our Multiparty Private Set Intersection protocol.
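The preprocessing and online stages can be illustrated end to end with the following simplified Python sketch. It is not the authors' implementation and rests on several assumptions: a single ElGamal key pair stands in for the threshold scheme (so the call to dec replaces the share decryption and Comb steps, and the private key d is handed to the checking routine only for this demonstration), the modulus 2^127 - 1 and base 3 are toy choices, hash functions are instantiated with SHA-256, positions are 0-based, and all names and sample datasets are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1      # toy prime modulus (assumption)
G = 3               # base; not checked to be primitive, which decryption correctness does not need
M_BITS, K = 1024, 8  # Bloom filter length m and number of hash functions k (assumed values)

def h(u, x):
    digest = hashlib.sha256(f"{u}|{x}".encode()).digest()
    return int.from_bytes(digest, "big") % M_BITS

def enc(y, x):
    r = secrets.randbelow(P - 3) + 2
    return pow(G, r, P), x * pow(y, r, P) % P

def dec(d, c):
    return c[1] * pow(pow(c[0], d, P), -1, P) % P

def mul(a, b):
    return a[0] * b[0] % P, a[1] * b[1] % P

def client_erbf(y, items):
    """Preprocessing on a client: BF -> RBF (0s randomized) -> ERBF (entry-wise encryption)."""
    bf = [0] * M_BITS
    for x in items:
        for u in range(K):
            bf[h(u, x)] = 1
    rbf = [1 if bit else secrets.randbelow(P - 2) + 2 for bit in bf]
    return [enc(y, v) for v in rbf]

def server_intersect(y, d, server_items, erbfs):
    """Online stage: multiply w_j with the mapped ERBF entries, then decrypt and compare."""
    result = []
    for x in server_items:
        r = secrets.randbelow(P - 1) + 1
        c = mul(enc(y, r), enc(y, x))            # w_j = Enc(r_j) x Enc(S_t[j])
        for erbf in erbfs:
            for u in range(K):
                c = mul(c, erbf[h(u, x)])        # c^u_{j,i} = ERBF_i[h_u(S_t[j])]
        if dec(d, c) == r * x % P:               # single-key stand-in for Comb
            result.append(x)
    return result

def demo():
    d = secrets.randbelow(P - 3) + 2
    y = pow(G, d, P)
    erbfs = [client_erbf(y, [1002, 1003, 1005, 2001]),
             client_erbf(y, [1003, 1005, 3001])]
    return server_intersect(y, d, [1001, 1002, 1003, 1004, 1005], erbfs)

if __name__ == "__main__":
    print(demo())
```

With these sample sets, the elements common to both clients and the server are 1003 and 1005. An element outside the intersection passes the check only if a Bloom filter false positive occurs or the random values happen to multiply to 1, both of which are very unlikely for the parameters assumed here.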
3.2. Protocol Correctness

In the protocol, the server obtains the final intersection result, from which we start to
illustrate the correctness of our protocol.
For $j \in \{1, 2, \ldots, n_t\}$:
$$\begin{aligned}
Dec(c_j) &= Dec(c_{j,1} \times \cdots \times c_{j,t-1} \times w_j) \\
&= Dec((c^1_{j,1} \times \cdots \times c^k_{j,1}) \times \cdots \times (c^1_{j,t-1} \times \cdots \times c^k_{j,t-1}) \times Enc(r_j) \times Enc(S_t[j])) \\
&= Dec((ERBF_1[h_1(S_t[j])] \times \cdots \times ERBF_1[h_k(S_t[j])]) \times \cdots \\
&\qquad \times (ERBF_{t-1}[h_1(S_t[j])] \times \cdots \times ERBF_{t-1}[h_k(S_t[j])]) \times Enc(r_j) \times Enc(S_t[j])) \\
&= (RBF_1[h_1(S_t[j])] \times \cdots \times RBF_1[h_k(S_t[j])]) \times \cdots \\
&\qquad \times (RBF_{t-1}[h_1(S_t[j])] \times \cdots \times RBF_{t-1}[h_k(S_t[j])]) \times r_j \times S_t[j]
\end{aligned}$$
If $Dec(c_j) = r_j \times S_t[j] \bmod p$, then
$$\forall i \in \{1, \ldots, t - 1\}, \quad RBF_i[h_1(S_t[j])] \times \cdots \times RBF_i[h_k(S_t[j])] = 1 \tag{19}$$
In this case, all positions that $S_t[j]$ maps to hold the value 1 in every client’s filter (up to
the negligible probability that random entries multiply to 1), so, by the nature of Bloom
filters, we know that $S_t[j]$ is in all of the sets $\{S_1, \cdots, S_{t-1}\}$, i.e., $S_t[j]$ is an intersection
element of the participants $\{P_1, \cdots, P_t\}$.
If $Dec(c_j) \ne r_j \times S_t[j] \bmod p$, then
$$\exists i \in \{1, \ldots, t - 1\}, \quad RBF_i[h_1(S_t[j])] \times \cdots \times RBF_i[h_k(S_t[j])] \ne 1 \tag{20}$$
In this case, there exists $i \in \{1, \ldots, t - 1\}$ such that the values in the randomized Bloom
filter $RBF_i$ that $S_t[j]$ maps to are not all 1, i.e., $S_t[j]$ is not an intersection element of the
participants $\{P_1, \cdots, P_t\}$.
In summary, we can determine whether an element belongs to the intersection by
using the nature of the Bloom filter. Therefore, the protocol for secure computation of the
intersection of unbalanced multiparty sets based on the threshold ElGamal encryption
scheme is correct.
3.3. Security Analysis

We employed the widely accepted simulation paradigm [40] to demonstrate the
security of the new protocol. The essence of the simulation paradigm is that an actual
multiparty computation protocol is considered secure if the participants do not gain more
information from it than they would from an ideal protocol.
We simulated the protocols for two scenarios: one where the adversary-controlled
participant includes the server and another where the adversary-controlled participant
does not include the server. We demonstrate that the information obtained from the actual
protocol in both scenarios is computationally indistinguishable from that obtained from
the ideal protocol, assuming a semi-honest adversary model.
Theorem 1. If the threshold ElGamal encryption scheme in Section 2.2.5 is secure, then the
unbalanced MPSI protocol ∏ in this paper is secure in the case of semi-honest adversaries.
Proof. To analyze the security of our protocol, we assume that $\ell$ of the $t$ participants are
controlled by an adversary, where $\ell < t$; if all participants were controlled by the
adversary, the analysis would be meaningless. We divide the analysis into two cases:
in the first, the adversary controls $\ell$ clients and the server $P_t$ is not among the $\ell$ participants
controlled by the adversary; in the second, the server $P_t$ is among the $\ell$ participants
controlled by the adversary, i.e., the adversary controls $\ell - 1$ clients and the server $P_t$.
Scenario 1: Server $P_t$ is not among the $\ell$ participants controlled by the adversary.
Suppose that the $\ell$ participants $P_1, \ldots, P_\ell$ are controlled by the adversary, and the
adversary can access the inputs of these $\ell$ participants, as well as the intermediate values
generated by the computation. From the Shamir threshold secret-sharing scheme, it is clear that,
when $\ell < t$, the private key corresponding to the public key cannot be inferred from the
information of the $\ell$ participants in our protocol. We denote the view of the participants
$P_1, \ldots, P_\ell$ in the real protocol as $View^{\Pi}_{RN}$, shown in Equation (21), where $\{S_1, \ldots, S_\ell\}$ denote the
original datasets of the participants $\{P_1, \ldots, P_\ell\}$, $\{ERBF_1, \ldots, ERBF_\ell\}$ are the encrypted
randomized Bloom filters, and $\{sh_1, \ldots, sh_\ell\}$ denote the intermediate results of
the computation.
$$View^{\Pi}_{RN} = (\{S_1, \ldots, S_\ell\}, \{ERBF_1, \ldots, ERBF_\ell\}, \{sh_1, \ldots, sh_\ell\}) \tag{21}$$
Construct a simulator $Sim_N$ to simulate a situation where the adversary controls $\ell$
participants, with inputs from these $\ell$ participants, and performs the following steps:
(1) Create an empty view $Sim_N(\{S_1, \ldots, S_\ell\}, \perp)$, and add the sets $\{S_1, \ldots, S_\ell\}$ and the
encrypted randomized Bloom filters $\{ERBF_1, \ldots, ERBF_\ell\}$ to the view.
(2) Simulate the inputs of the participants $\{P_{\ell+1}, \ldots, P_t\}$, and create random sets
$\{S'_{\ell+1}, \ldots, S'_{t-1}\}$ and a random set $S'_t$ containing $n_t$ elements.
(3) Use the sets $\{S'_{\ell+1}, \ldots, S'_{t-1}\}$ to generate the Bloom filters $\{BF'_{\ell+1}, \ldots, BF'_{t-1}\}$ and
the encrypted randomized Bloom filters $\{ERBF'_{\ell+1}, \ldots, ERBF'_{t-1}\}$.
(4) Generate random values $r'_j$, then compute $w'_j$, where $j \in \{1, 2, \ldots, n_t\}$.
$$w'_j = Enc(r'_j) \times Enc(S'_t[j]) \bmod p \tag{22}$$
(5) For each $S'_t[j]$, $j \in \{1, 2, \ldots, n_t\}$, and each $ERBF_i$ ($i \in \{1, \ldots, \ell\}$) and each $ERBF'_i$
($i \in \{\ell + 1, \ldots, t - 1\}$), compute $\{(c^1_{j,i})', \ldots, (c^k_{j,i})'\}$ and $c'_{j,i}$:
$$c'_{j,i} = (c^1_{j,i})' \times \cdots \times (c^k_{j,i})' \bmod p \tag{23}$$
Appl. Sci. 2023, 13, 13215 11 of 17
(6) For $j \in \{1, 2, \dots, n_t\}$, obtain $c'_j$ by a homomorphic multiplication operation on all
the mapped values in the encrypted Bloom filters corresponding to $S'_t[j]$.
$$c'_j = c'_{j,1} \times \cdots \times c'_{j,t-1} \times w'_j \bmod p \tag{24}$$
(7) Compute $sh'_i = \{sh'_{1,i}, \dots, sh'_{n_t,i}\}$ by the shared decryption of $c'_j$ with the
private key share $sk_i$, as expressed in Equation (25), where
$\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, $i \in \{1, \dots, \ell\}$, and
$j \in \{1, 2, \dots, n_t\}$.
$$sh'_{j,i} = (c'_j)^{\Delta_i \cdot sk_i} \bmod p \tag{25}$$
(8) Insert $\{sh'_1, \dots, sh'_\ell\}$ into the view. Thus, the view of the simulator $Sim_N$ is:
$$Sim_N(\{S_1, \dots, S_\ell\}, \perp) = (\{S_1, \dots, S_\ell\}, \{ERBF_1, \dots, ERBF_\ell\}, \{sh'_1, \dots, sh'_\ell\}) \tag{26}$$
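The blinding and homomorphic multiplication used in the steps above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' C++ implementation: it uses plain (non-threshold) ElGamal with a single key holder, a toy 127-bit modulus, and a SHA-256-based choice of hash positions. The names `enc`, `dec`, `mul`, `positions`, `encrypted_randomized_bf`, and `server_intersection` are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption); far too small for real use
G = 3            # toy base (assumption); illustration only
M, K = 512, 5    # toy Bloom filter length and number of hash functions

def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def enc(pk, msg):
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), msg * pow(pk, r, P) % P

def dec(sk, ct):
    c1, c2 = ct
    return c2 * pow(c1, P - 1 - sk, P) % P   # c1^(-sk) via Fermat's little theorem

def mul(a, b):
    # The component-wise product of two ciphertexts encrypts the product of the plaintexts.
    return a[0] * b[0] % P, a[1] * b[1] % P

def positions(x):
    # K Bloom filter positions for element x (hypothetical SHA-256 construction).
    return [int.from_bytes(hashlib.sha256(f"{u}:{x}".encode()).digest(), "big") % M
            for u in range(K)]

def encrypted_randomized_bf(pk, items):
    bits = [0] * M
    for x in items:
        for pos in positions(x):
            bits[pos] = 1
    # 1-bits become Enc(1); 0-bits become encryptions of random values.
    return [enc(pk, 1 if b else secrets.randbelow(P - 2) + 2) for b in bits]

def server_intersection(sk, pk, server_set, erbfs):
    result = []
    for x in server_set:
        r = secrets.randbelow(P - 2) + 1
        c = mul(enc(pk, r), enc(pk, x))      # w_j = Enc(r_j) * Enc(S_t[j])
        for erbf in erbfs:
            for pos in positions(x):
                c = mul(c, erbf[pos])        # multiply in the k mapped entries
        if dec(sk, c) == r * x % P:          # holds when every mapped entry was Enc(1)
            result.append(x)
    return result

sk, pk = keygen()
clients = [[11, 22, 33, 44], [22, 33, 55], [22, 33, 44, 66]]
erbfs = [encrypted_randomized_bf(pk, s) for s in clients]
found = server_intersection(sk, pk, [10, 22, 33, 44, 77], erbfs)
print(found)
```

Barring a Bloom filter false positive, only the elements held by every client (22 and 33 here) pass the check, because any mapped 0-bit multiplies a random factor into the plaintext. In the protocol itself, decryption is performed jointly with key shares rather than by a single key holder.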
Because the simulator $Sim_N$ controls only $\ell$ participants, none of which is the server, it can
obtain only the key shares of these $\ell$ participants. Joint decryption in the threshold ElGamal
encryption requires decryption shares from at least $t$ participants, so $Sim_N$ cannot perform the
joint computation in the threshold ElGamal scheme to obtain the final decryption result.
$View^{\Pi}_{RN} = (\{S_1, \dots, S_\ell\}, \{ERBF_1, \dots, ERBF_\ell\}, \{sh_1, \dots, sh_\ell\})$
represents the view of the real protocol for participants $P_1, \dots, P_\ell$, and
$Sim_N(\{S_1, \dots, S_\ell\}, \perp) = (\{S_1, \dots, S_\ell\}, \{ERBF_1, \dots, ERBF_\ell\}, \{sh'_1, \dots, sh'_\ell\})$
represents the view of the simulator $Sim_N$, where $\{S_1, \dots, S_\ell\}$ and
$\{ERBF_1, \dots, ERBF_\ell\}$ are the same in both views. Since the threshold ElGamal encryption
algorithm and random values are used, $\{sh_1, \dots, sh_\ell\}$ and $\{sh'_1, \dots, sh'_\ell\}$ are
computationally indistinguishable. Thus,
$$View^{\Pi}_{RN} \simeq Sim_N(\{S_1, \dots, S_\ell\}, \perp) \tag{27}$$
Scenario 2: Server $P_t$ is among the $\ell$ participants controlled by the adversary.

Suppose that clients $\{P_1, \dots, P_{\ell-1}\}$ and server $P_t$ are controlled by the adversary and
that the adversary can access the inputs and outputs of these $\ell$ participants. As in Scenario 1,
when $\ell < t$, the private key corresponding to the public key cannot be inferred from the
information of the $\ell$ participants. In the real protocol, the view of these $\ell$ participants
is denoted $View^{\Pi}_{RY}$, as in Equation (28), where $\{S_1, \dots, S_{\ell-1}\}$ denote the input
sets of clients $\{P_1, \dots, P_{\ell-1}\}$, $S_t$ is the original set of server $P_t$,
$\{ERBF_1, \dots, ERBF_{\ell-1}\}$ are the encrypted randomized Bloom filters,
$\{sh_1, \dots, sh_{\ell-1}, sh_t\}$ denote the intermediate results of the computation, and $\cap$
is the final intersection obtained by server $P_t$.
$$View^{\Pi}_{RY} = (\{S_1, \dots, S_{\ell-1}, S_t\}, \{ERBF_1, \dots, ERBF_{\ell-1}\}, \{sh_1, \dots, sh_{\ell-1}, sh_t\}, \cap) \tag{28}$$
Construct a simulator $Sim_Y$ for the situation where the adversary controls $\ell$ participants,
including the server. Given the inputs and outputs of these $\ell$ participants, $Sim_Y$ performs the
following steps:
(1) Create an empty view $Sim_Y(\{S_1, \dots, S_{\ell-1}, S_t\}, \cap)$, and add the sets
$\{S_1, \dots, S_{\ell-1}, S_t\}$, the encrypted randomized Bloom filters
$\{ERBF_1, \dots, ERBF_{\ell-1}, ERBF_t\}$, and the intersection "$\cap$" obtained by the server to
the view.
(2) Simulate the inputs of the clients $\{P_\ell, \dots, P_{t-1}\}$, and create random sets
$\{S'_\ell, \dots, S'_{t-1}\}$ whose intersection with the server's set is "$\cap$".
(3) Use the sets $\{S'_\ell, \dots, S'_{t-1}\}$ to generate the Bloom filters
$\{BF'_\ell, \dots, BF'_{t-1}\}$ and the encrypted randomized Bloom filters
$\{ERBF'_\ell, \dots, ERBF'_{t-1}\}$.
(4) Generate random values $r'_j$, and calculate $w'_j$, where $j \in \{1, 2, \dots, n_t\}$.
$$w'_j = Enc(r'_j) \times Enc(S_t[j]) \bmod p \tag{29}$$
(5) For each $S_t[j]$ ($j \in \{1, 2, \dots, n_t\}$), each $ERBF_i$ ($i \in \{1, \dots, \ell-1\}$), and
each $ERBF'_i$ ($i \in \{\ell, \dots, t-1\}$), compute $\{(c_1^{j,i})', \dots, (c_k^{j,i})'\}$ and
$c'_{j,i}$:
$$c'_{j,i} = (c_1^{j,i})' \times \cdots \times (c_k^{j,i})' \bmod p \tag{30}$$
(6) For $j \in \{1, 2, \dots, n_t\}$, compute $c'_j$.
$$c'_j = c'_{j,1} \times \cdots \times c'_{j,t-1} \times w'_j \bmod p \tag{31}$$
(7) Compute $sh'_i = \{sh'_{1,i}, \dots, sh'_{n_t,i}\}$ by the shared decryption of $c'_j$ with the
private key share $sk_i$, as expressed in Equation (32), where
$\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$, $i \in \{1, \dots, \ell-1, t\}$, and
$j \in \{1, 2, \dots, n_t\}$.
$$sh'_{j,i} = (c'_j)^{\Delta_i \cdot sk_i} \bmod p \tag{32}$$
(8) Insert $\{sh'_1, \dots, sh'_{\ell-1}, sh'_t\}$ into the view. Thus, the view of the simulator
$Sim_Y$ is:
$$Sim_Y(\{S_1, \dots, S_{\ell-1}, S_t\}, \cap) = (\{S_1, \dots, S_{\ell-1}, S_t\}, \{ERBF_1, \dots, ERBF_{\ell-1}\}, \{sh'_1, \dots, sh'_{\ell-1}, sh'_t\}) \tag{33}$$
Because the simulator controls only $\ell$ participants, it can obtain only the key shares of these
$\ell$ participants. Joint decryption in the threshold ElGamal encryption requires shares from at
least $t$ participants, so the simulator cannot perform the joint computation of the threshold
ElGamal encryption and cannot obtain the final decryption results in this way. In this scenario,
however, the simulator does obtain the final intersection, because it controls server $P_t$.
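The requirement of $t$ shares can be illustrated with a small Python sketch of threshold ElGamal decryption over Shamir shares. It rests on several assumptions that are not taken from the paper: a toy safe prime `P = 2039` with subgroup order `Q = 1019`, the standard formulation in which each party raises the first ciphertext component to $\Delta_i \cdot sk_i$, and hypothetical helper names (`share_secret`, `lagrange_at_zero`, `partial_decrypt`, `combine`).

```python
import secrets

P = 2039   # toy safe prime, P = 2*Q + 1 (assumption; far too small for real use)
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of the order-Q subgroup (a quadratic residue)

def share_secret(secret, t):
    """Shamir-share `secret` among parties 1..t with a degree-(t-1) polynomial mod Q."""
    coeffs = [secret] + [secrets.randbelow(Q - 1) + 1 for _ in range(t - 1)]
    return {i: sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q
            for i in range(1, t + 1)}

def lagrange_at_zero(i, ids):
    """Delta_i = prod_{j != i} j / (j - i) mod Q over the participating ids."""
    num, den = 1, 1
    for j in ids:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, Q - 2, Q) % Q   # modular inverse via Fermat's little theorem

def partial_decrypt(c1, i, sk_i, ids):
    # Decryption share of party i: c1^(Delta_i * sk_i) mod P.
    return pow(c1, lagrange_at_zero(i, ids) * sk_i % Q, P)

def combine(c2, partials):
    mask = 1
    for s in partials:
        mask = mask * s % P
    return c2 * pow(mask, P - 2, P) % P   # divide c2 by the product of the shares

t = 5
sk = secrets.randbelow(Q - 1) + 1
pk = pow(G, sk, P)
shares = share_secret(sk, t)

msg = pow(G, 123, P)                      # a message inside the subgroup
r = secrets.randbelow(Q - 1) + 1
c1, c2 = pow(G, r, P), msg * pow(pk, r, P) % P

ids = list(shares)
full = combine(c2, [partial_decrypt(c1, i, shares[i], ids) for i in ids])
few = ids[:t - 1]
short = combine(c2, [partial_decrypt(c1, i, shares[i], few) for i in few])
print(full == msg, short == msg)
```

With all $t$ shares, the exponents sum to $sk$ and the message is recovered. With only $t-1$ shares, the interpolated exponent differs from $sk$ and the combined value is not the message.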
In the real view $View^{\Pi}_{RY}$ and in the simulator's view
$Sim_Y(\{S_1, \dots, S_{\ell-1}, S_t\}, \cap)$, the sets $\{S_1, \dots, S_{\ell-1}, S_t\}$ and the
encrypted randomized Bloom filters $\{ERBF_1, \dots, ERBF_{\ell-1}\}$ are identical.
$\{sh_1, \dots, sh_{\ell-1}, sh_t\}$ and $\{sh'_1, \dots, sh'_{\ell-1}, sh'_t\}$ are computationally
indistinguishable due to the security of the threshold ElGamal encryption algorithm. Thus,
$$View^{\Pi}_{RY} \simeq Sim_Y(\{S_1, \dots, S_{\ell-1}, S_t\}, \cap) \tag{34}$$
From the above two scenarios, it can be seen that a semi-honest adversary who controls $\ell$
participants, with $\ell < t$, cannot infer the inputs of the honest participants from the
intermediate values or the final result. The protocol proposed in this paper is therefore secure
against a semi-honest adversary.
4. Evaluation and Results’ Discussion

This section presents the C++ implementation of the protocol proposed in this paper and a
comparative analysis with the protocol of Bay et al. [28].
4.1. Protocol Implementation

We implemented our protocol in C++ on a Linux platform and compared it with the related
protocol [28]. In our experiments, we set the false-positive probability $P_{err}$ of the Bloom
filter to $2^{-30}$, the length $m$ of the Bloom filter to $-size \cdot \log_2 e \cdot \log_2 P_{err}$,
and the number of hash functions $k$ to $-\log_2 P_{err}$, where $size$ is the server's set size. For
the ElGamal encryption used in the protocol, we used a security parameter of $k = 1024$ bits.
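As a worked example of these parameter choices, the following Python sketch evaluates the two formulas for a set of $2^8$ elements. The function name `bloom_parameters` and the rounding of $m$ up to the next integer are assumptions made for illustration.

```python
import math

def bloom_parameters(size, log2_perr=-30):
    """Return (m, k) with m = -size * log2(e) * log2(P_err) and k = -log2(P_err)."""
    k = -log2_perr
    m = math.ceil(-size * math.log2(math.e) * log2_perr)  # rounding up is an assumption
    return m, k

m, k = bloom_parameters(2**8)
print(m, k)
```

For $P_{err} = 2^{-30}$, this gives 30 hash functions and roughly 43.3 filter bits per element.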
4.2. Performance Analysis

We compared our MPSI protocol with the related protocol [28] by executing both protocols multiple
times in the same environment and averaging the results. We ran three experiments: Experiment 1,
Experiment 2, and Experiment 3. To ensure the accuracy of the comparison, we modified the
multithreaded computation in the source code of [28] to be single-threaded. Additionally, for both
protocols, we did not measure the time the clients took to generate their encrypted Bloom filters.
4.2.1. Experiment 1
The main difference between our protocol and [28] is the encryption algorithm: the Paillier
encryption algorithm is used in [28], while we adopt the ElGamal encryption algorithm. Firstly,
with the same security parameters, an ElGamal ciphertext consists of two components, while a
Paillier ciphertext consists of only one. However, a Paillier ciphertext is twice as long as a
single ElGamal ciphertext component. Therefore, both algorithms have the same communication
complexity.
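The size argument can be written out for $k = 1024$, assuming textbook ElGamal over $\mathbb{Z}_p^*$ with a $k$-bit prime $p$ and textbook Paillier with a $k$-bit modulus $n$ and ciphertexts in $\mathbb{Z}_{n^2}^*$ (both standard constructions, stated here as assumptions about the setup):

```latex
\begin{aligned}
|c_{\mathrm{ElGamal}}|  &= |c_1| + |c_2| = 2\,|p| = 2 \cdot 1024 = 2048 \text{ bits},\\
|c_{\mathrm{Paillier}}| &= |n^2| = 2\,|n| = 2 \cdot 1024 = 2048 \text{ bits}.
\end{aligned}
```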
Secondly, we conducted experiments with these two algorithms using the NTL [41] library. We set the
value of the encrypted data to $10^7$ and chose a security parameter of $k = 1024$ bits for
public key encryption. The size of the dataset was increased from $10^2$ to $10^5$, and the
encryption process used the same random value. The experimental results are presented in Table 2,
which shows that the ElGamal encryption algorithm is more efficient than the Paillier encryption
algorithm: ElGamal encrypts approximately twice as fast as Paillier and decrypts about four times
as fast.
Table 2. Comparison of the Paillier and ElGamal encryption algorithms (in seconds).

Algorithm   Operation     Data size: 10^2   10^3     10^4      10^5
Paillier    encryption    0.502             4.158    39.910    328.851
Paillier    decryption    0.404             3.735    39.926    366.583
ElGamal     encryption    0.184             1.957    16.695    160.772
ElGamal     decryption    0.105             0.950    9.656     92.483

A security parameter of k = 1024 bits was selected for public key encryption. The value of the
encrypted data was fixed at 10^7. The encryption process used the same random number. The data
volume increased from 10^2 to 10^5.
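The speedups implied by Table 2 can be recomputed directly from the reported runtimes. The short Python sketch below only re-derives ratios from the table values; the variable names are illustrative.

```python
# Runtimes in seconds, copied from Table 2 (dataset sizes 10^2 .. 10^5).
paillier = {"encryption": [0.502, 4.158, 39.910, 328.851],
            "decryption": [0.404, 3.735, 39.926, 366.583]}
elgamal = {"encryption": [0.184, 1.957, 16.695, 160.772],
           "decryption": [0.105, 0.950, 9.656, 92.483]}

# Ratio Paillier / ElGamal for each operation and dataset size.
speedup = {op: [p / e for p, e in zip(paillier[op], elgamal[op])] for op in paillier}
for op, ratios in speedup.items():
    print(op, [round(x, 2) for x in ratios])
```

The encryption ratios fall between 2 and 3, and the decryption ratios are close to 4, in line with the statement above.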
4.2.2. Experiment 2
We set the dataset size of each client and of the server to $2^8$ and increased the number of
participants from $2^4$ to $2^9$ in this experiment, which aimed at testing the runtime of an
individual client and of the server. The experimental results are shown in Table 3 and Figure 2.
The results show that the runtime of the clients remained essentially constant in both our
protocol and the protocol of [28], regardless of the increase in the number $t$ of participants.
Moreover, in our protocol, the client's runtime was approximately $1/(2^{\log(t)})$ of the server's
runtime when $t \geq 2^4$. Our protocol ran more efficiently than the protocol of [28]. This is
mainly because the clients in [28] perform the ShDec0() operation, which involves exponentiation
operations on a large integer $n_t$ with large random values. The client $P_i$ in our protocol
needs to compute $\Delta_i = \prod_{j'=1,\, j' \neq i}^{t} \frac{j'}{j'-i}$ only once and
$sh_{j,i} = (c_j)^{\Delta_i \cdot sk_i}$ $n_t$ times. The only computation that depends on the number
of participants is the single computation of $\Delta_i$, which is insignificant in the whole
runtime of the client.
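To make the one-off cost of $\Delta_i$ concrete, the following Python sketch evaluates the product with exact fractions for a small $t$. Working over the rationals is an assumption made for readability; in a threshold ElGamal setting the same product would be reduced modulo the group order. The function name `lagrange_at_zero` is hypothetical.

```python
from fractions import Fraction

def lagrange_at_zero(i, t):
    """Delta_i = prod_{j'=1, j' != i}^{t} j' / (j' - i), as an exact fraction."""
    delta = Fraction(1)
    for j in range(1, t + 1):
        if j != i:
            delta *= Fraction(j, j - i)
    return delta

t = 4
deltas = [lagrange_at_zero(i, t) for i in range(1, t + 1)]
print(deltas)

# Sanity check on a sample polynomial f(x) = 5 + 2x + x^3 of degree t - 1:
f = lambda x: 5 + 2 * x + x**3
recovered = sum(d * f(i) for i, d in enumerate(deltas, start=1))
print(recovered)
```

Each $\Delta_i$ takes $t - 1$ multiplications and depends only on $i$ and $t$, so a client can compute it once and reuse it for all $n_t$ decryption shares.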
Table 3. Comparison of MPSI protocols with different numbers of participants (in seconds).

Protocol          Role     No. of participants: 2^4   2^5     2^6      2^7      2^8      2^9
Bay et al. [28]   Client   2.234                      2.445   2.176    2.441    2.279    2.405
Bay et al. [28]   Server   3.881                      5.913   10.202   15.242   25.371   41.615
Ours              Client   0.356                      0.422   0.387    0.401    0.390    0.381
Ours              Server   0.389                      0.785   1.504    3.350    6.845    12.807

The server and client dataset size was fixed at 2^8. The number of participants gradually
increased from 2^4 to 2^9.
Figure 2. Comparison for different MPSI protocols with different numbers of participants.
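The trends in Table 3 can be checked with a few lines of Python. The sketch below only recomputes ratios from the reported runtimes; the variable names are illustrative.

```python
# Runtimes in seconds, copied from Table 3 (t = 2^4 .. 2^9 participants).
participants = [2**e for e in range(4, 10)]
ours_client = [0.356, 0.422, 0.387, 0.401, 0.390, 0.381]
ours_server = [0.389, 0.785, 1.504, 3.350, 6.845, 12.807]
bay_server = [3.881, 5.913, 10.202, 15.242, 25.371, 41.615]

client_to_server = [c / s for c, s in zip(ours_client, ours_server)]
server_speedup = [b / o for b, o in zip(bay_server, ours_server)]
for t, ratio, s in zip(participants, client_to_server, server_speedup):
    print(f"t={t:4d}  client/server={ratio:.3f}  server speedup over [28]={s:.2f}")
```

In these measurements, the client runtime of our protocol varies by less than 0.1 s across all values of $t$, while the client-to-server ratio roughly halves each time $t$ doubles.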
4.2.3. Experiment 3
We fixed the size of the clients' datasets at $2^8$ and the number of participants at $2^5$.
Additionally, we increased the size of the server's set from $2^{10}$ to $2^{14}$ to test the
runtime of the clients and the server. The experimental results are shown in Table 4 and Figure 3.
It can be seen that the client and server runtimes of both our protocol and [28] were linearly
related to the size of the server dataset. However, our protocol showed a significant improvement
in efficiency compared to the protocol of Bay et al. [28]: the runtimes of the server and clients
were approximately 1/6 of those of [28].
Figure 3. Comparison for different MPSI protocols with different volumes of the server’s dataset.
Table 4. Comparison of MPSI protocols with different volumes of the server's dataset (in seconds).

Protocol          Role     Server data size: 2^10   2^11     2^12     2^13      2^14
Bay et al. [28]   Client   9.516                    18.893   37.601   76.489    154.536
Bay et al. [28]   Server   22.962                   43.629   87.133   166.917   345.266
Ours              Client   1.486                    3.000    5.903    13.320    28.393
Ours              Server   3.096                    6.078    13.614   27.983    56.617

The client's dataset was fixed at 2^8. The number of participants was fixed at 2^5. The volume of
the server dataset increased from 2^10 to 2^14.
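Both the speedup and the linear scaling can be recomputed from Table 4. The Python sketch below uses only the reported runtimes; the variable names are illustrative.

```python
# Runtimes in seconds, copied from Table 4 (server set sizes 2^10 .. 2^14).
bay = {"client": [9.516, 18.893, 37.601, 76.489, 154.536],
       "server": [22.962, 43.629, 87.133, 166.917, 345.266]}
ours = {"client": [1.486, 3.000, 5.903, 13.320, 28.393],
        "server": [3.096, 6.078, 13.614, 27.983, 56.617]}

# Speedup of our protocol over [28] for each role and server set size.
speedup = {role: [b / o for b, o in zip(bay[role], ours[role])] for role in bay}
# Growth factor of our runtimes each time the server set size doubles.
growth = {role: [b / a for a, b in zip(ours[role], ours[role][1:])] for role in ours}

for role in ours:
    print(role, "speedup:", [round(x, 2) for x in speedup[role]],
          "growth:", [round(x, 2) for x in growth[role]])
```

The speedups lie between about 5.4 and 7.4, and each doubling of the server set roughly doubles the runtime, which matches the linear relationship described above.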
4.3. Discussion
From the analysis and comparison of the above experiments, we can conclude that our protocol offers
the following advantages:
(1) In our protocol, the number $t$ of participants has almost no impact on the computation and
runtime of the client. When $t \geq 2^4$, the client's runtime was about $1/(2^{\log(t)})$ of the
server's runtime. Therefore, our protocol is very suitable for unbalanced scenarios with a large
number of participants.
(2) Compared to the typical related protocol [28], our protocol demonstrated a significant
improvement in efficiency. The server and client runtimes were approximately 1/6 of those of [28].
5. Conclusions
This paper proposed an MPSI protocol based on the Bloom filter and the Shamir threshold
secret-sharing scheme, which is highly suitable for unbalanced scenarios with a large number of
participants. Compared to the typical related protocol [28], our protocol demonstrated a
significant improvement in efficiency: the server's and clients' runtimes were approximately 1/6
of those of [28]. Extending the approach to the malicious adversary model is our future work.
Author Contributions: Conceptualization, O.R., C.Y., J.Z. and C.A.; methodology, O.R., C.Y., J.Z. and
C.A.; software, C.Y. and C.A.; validation, O.R., C.Y., J.Z. and C.A.; formal analysis, O.R., C.Y., J.Z.
and C.A.; investigation, O.R., C.Y., J.Z. and C.A.; resources, O.R., C.Y., J.Z. and C.A.; data curation,
O.R., C.Y., J.Z. and C.A.; writing—original draft preparation, C.Y. and C.A.; writing—review and
editing, O.R. and J.Z.; visualization, O.R., C.Y., J.Z. and C.A.; supervision, O.R. and J.Z.; project
administration, O.R.; funding acquisition, O.R. All authors have read and agreed to the published
version of the manuscript.
Funding: This research is supported by the National Natural Science Foundation of China under
Grant 62202146 and Enterprise Technology Innovation Development Project of Hubei Province of
China Grant Number 2021BAB009.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data are contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest. The funders had no role in the
design of the study; in the collection, analyses, or interpretation of the data; in the writing of the
manuscript; nor in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
MPSI Multiparty Private Set Intersection

TIFS IEEE Transactions on Information Forensics & Security
References
1. Demmler, D.; Rindal, P.; Rosulek, M.; Trieu, N. PIR-PSI: Scaling Private Contact Discovery. Proc. Priv. Enhancing Technol. 2018,
4, 159–178. [CrossRef]
2. Nagy, M.; De Cristofaro, E.; Dmitrienko, A.; Asokan, N.; Sadeghi, A.-R. Do I know you? Efficient and privacy-preserving
common friend-finder protocols and applications. In Proceedings of the 29th Annual Computer Security Applications Conference,
New Orleans, LA, USA, 9–13 December 2013; pp. 159–168. Available online: https://ia.cr/2013/620 (accessed on 15 May 2023).
3. Yuan, X.; Wang, X.; Wang, C.; Squicciarini, A.; Ren, K. Enabling privacy-preserving image-centric social discovery. In Proceedings
of the 2014 IEEE 34th International Conference on Distributed Computing Systems, Madrid, Spain, 30 June–3 July 2014;
pp. 198–207. [CrossRef]
4. Kim, S.P.; Gil, M.S.; Kim, H.; Choi, M.-J.; Moon, Y.-S.; Won, H.-S. Efficient two-step protocol and its discriminative feature
selections in secure similar document detection. Secur. Commun. Netw. 2017, 2017, 6841216. [CrossRef]
5. Phuong, T.T. Privacy-preserving deep learning via weight transmission. IEEE Trans. Inf. Forensics Secur. 2019, 14, 3003–3015. [CrossRef]
6. Fischlin, M.; Pinkas, B.; Sadeghi, A.R.; Schneider, T.; Visconti, I. Secure set intersection with untrusted hardware tokens. In
Proceedings of the CT-RSA 2011, LNCS, San Francisco, CA, USA, 14–18 February 2011; Volume 6558, pp. 1–16. [CrossRef]
7. Bogdanov, D.; Niitsoo, M.; Toft, T.; Willemson, J. High-performance secure multi-party computation for data mining applications.
Int. J. Inf. Secur. 2012, 11, 403–418. [CrossRef]
8. Wang, Y.-W.; Wu, J.-L. A Privacy-Preserving Symptoms Retrieval System with the Aid of Homomorphic Encryption and Private
Set Intersection Schemes. Algorithms 2023, 16, 244. [CrossRef]
9. Fan, C.; Jia, P.; Lin, M.; Wei, L.; Guo, P.; Zhao, X.; Liu, X. Cloud-Assisted Private Set Intersection via Multi-Key Fully Homomorphic
Encryption. Mathematics 2023, 11, 1784. [CrossRef]
10. Resende, A.C.D.; de Freitas Aranha, D. Faster unbalanced Private Set Intersection in the semi-honest setting. J. Cryptogr. Eng.
2021, 11, 21–38. [CrossRef]
11. Falk, B.H.; Noble, D.; Ostrovsky, R. Private set intersection with linear communication from general assumptions. In Proceedings
of the 18th ACM Workshop on Privacy in the Electronic Society. London: Association for Computing Machinery, London, UK,
11 November 2019; pp. 14–25. [CrossRef]
12. Le, P.H.; Ranellucci, S.; Gordon, S.D. Two-party private set intersection with an untrusted third party. In Proceedings of the 2019 ACM
SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 2403–2420. [CrossRef]
13. Ciampi, M.; Orlandi, C. Combining private set-intersection with secure two-party computation. In Security and Cryptography
for Networks (SCN 2018); Catalano, D., De Prisco, R., Eds.; Lecture Notes in Computer Science; Springer: Amalfi, Italy, 2018;
Volume 11035, pp. 464–482. [CrossRef]
14. Wang, Z.S.; Banawan, K.; Ulukus, S. Multi-party private set intersection: An information-theoretic approach. IEEE J. Sel. Areas Inf. Theory
2021, 2, 366–379. [CrossRef]
15. Debnath, S.K.; Sakurai, K.; Dey, K.; Kundu, N. Secure outsourced private set intersection with linear complexity. In Proceedings
of the 2021 IEEE Conference on Dependable and Secure Computing (DSC), Aizuwakamatsu, Japan, 30 January–2 February 2021;
16. Blanton, M.; Aguiar, E. Private and Oblivious Set and Multiset Operations; Springer: Berlin/Heidelberg, Germany, 2012. [CrossRef]
17. Chen, H.; Huang, Z.; Laine, K.; Rindal, P. Labeled PSI from fully homomorphic encryption with malicious security. In Proceedings
of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018;
18. Chen, H.; Laine, K.; Rindal, P. Fast private set intersection from homomorphic encryption. In Proceedings of the 2017 ACM SIGSAC
Conference on Computer and Communications Security, New York, NY, USA, 30 October–3 November 2017; pp. 1243–1255. [CrossRef]
19. Lv, S.; Ye, J.; Yin, S.; Cheng, X.; Feng, C.; Liu, X.; Li, R.; Li, Z.; Liu, Z.; Zhou, L. Unbalanced private set intersection cardinality
protocol with low communication cost. Future Gener. Comput. Syst. 2020, 102, 1054–1061. [CrossRef]
20. Ma, J.P.K.; Chow, S.S.M. Secure-Computation-Friendly Private Set Intersection from Oblivious Compact Graph Evaluation. In
Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, Nagasaki, Japan, 30 May–3 June
2022; pp. 1086–1097. [CrossRef]
21. Resende, A.C.D.; Aranha, D.F. Faster unbalanced private set intersection. In Proceedings of the International Conference on
Financial Cryptography and Data Security, Nieuwpoort, Curaçao, 26 February–2 March 2018; pp. 203–221. [CrossRef]
22. Freedman, M.J.; Nissim, K.; Pinkas, B. Efficient private matching and set intersection. In Proceedings of the International Conference
on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; pp. 1–19. [CrossRef]
23. Kissner, L.; Song, D. Privacy-preserving set operations. In Proceedings of the 25th Annual International Cryptology Conference
on Advances in Cryptology, Santa Barbara, CA, USA, 14–18 August 2005; pp. 241–257. [CrossRef]
24. Sang, Y.; Shen, H. Efficient and secure protocols for privacy-preserving set operations. ACM Trans. Inf. Syst. Secur. 2009,
13, 1–35. [CrossRef]
25. Zhang, L.; He, C.; Wei, L. Efficient and malicious secure three-party private set intersection computation protocols for small sets.
J. Comput. Res. Dev. 2022, 59, 2286–2298. [CrossRef]
26. Miyaji, A.; Nakasho, K.; Nishida, S. Privacy-preserving integration of medical data: A practical Multiparty Private Set Intersection.
J. Med Syst. 2017, 41, 1–10. [CrossRef]
27. Davidson, A.; Cid, C. An efficient toolkit for computing private set operations. In Proceedings of the Information Security and
Privacy: 22nd Australasian Conference, ACISP 2017, Auckland, New Zealand, 3–5 July 2017; Proceedings, Part II 22; Springer
International Publishing: Berlin/Heidelberg, Germany, 2017; pp. 261–278. [CrossRef]
28. Bay, A.; Erkin, Z.; Hoepman, J.-H.; Samardjiska, S.; Vos, J. Practical Multi-Party Private Set Intersection Protocols.
IEEE Trans. Inf. Forensics Secur. 2022, 17, 1–15. [CrossRef]
29. Kolesnikov, V.; Matania, N.; Pinkas, B.; Rosulek, M.; Trieu, N. Practical multi-party private set intersection from symmetric-key
techniques. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA,
30 October–3 November 2017; pp. 1257–1272. [CrossRef]
30. Kavousi, A.; Mohajeri, J.; Salmasizadeh, M. Efficient scalable multi-party private set intersection using oblivious PRF. In
Proceedings of the 17th International Workshop on Security and Trust Management, Darmstadt, Germany, 8 October 2021;
31. Inbar, R.; Omri, E.; Pinkas, B. Efficient scalable multiparty private set-intersection via garbled Bloom filters. In Proceedings of the 11th
International Conference on Security and Cryptography for Networks, Amalfi, Italy, 5–7 September 2018; pp. 235–252. [CrossRef]
32. Zhang, E.; Liu, F.; Lai, Q.; Jin, G.; Li, Y. Efficient multi-party private set intersection against malicious adversaries. In Proceedings of the
2019 ACM SIGSAC Conference on Cloud Computing Security Workshop, London, UK, 11–15 November 2019; pp. 93–104. [CrossRef]
33. Ben-Efraim, A.; Nissenbaum, O.; Omri, E.; Paskin-Cherniavsky, A. PSImple: Practical multiparty maliciously-secure private set
intersection. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, Nagasaki, Japan,
30 May–3 June 2022; pp. 1098–1112. [CrossRef]
34. Nevo, O.; Trieu, N.; Yanai, A. Simple, fast malicious Multiparty Private Set Intersection. In Proceedings of the 2021 ACM SIGSAC
Conference on Computer and Communications Security, Seoul, Korea, 15–19 November 2021; pp. 1151–1165. [CrossRef]
35. Gordon, S.D.; Hazay, C.; Le, P.H. Fully Secure PSI via MPC-in-the-Head [EB/OL]. 2022. Available online:
https://eprint.iacr.org/2022/379 (accessed on 15 May 2023).
36. ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985,
31, 469–472. [CrossRef]
37. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426. [CrossRef]
38. Dong, C.; Chen, L.; Wen, Z. When private set intersection meets big data: An efficient and scalable protocol. In Proceedings
of the 2013 ACM SIGSAC conference on Computer & Communications Security, Berlin, Germany, 4–8 November 2013;
39. Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [CrossRef]
40. Lindell, Y. How to simulate it—A tutorial on the simulation proof technique. In Tutorials on the Foundations of Cryptography;
Lindell, Y., Ed.; Information Security and Cryptography; Springer: Berlin/Heidelberg, Germany, 2017; pp. 277–346. [CrossRef]
41. Shoup, V. NTL: A Library for Doing Number Theory. [Online]. 2020. Available online: https://www.shoup.net/ntl/ (accessed
on 15 May 2023).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

Appl. Sci. 2023, 13, 13215. https://doi.org/10.3390/app132413215 https://www.mdpi.com/journal/applsci
