Optimization of Homomorphic Comparison Algorithm o

Received December 14, 2021, accepted February 15, 2022, date of publication March 2, 2022, date of current version
March 11, 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3155882
Optimization of Homomorphic Comparison

Algorithm on RNS-CKKS Scheme
EUNSANG LEE 1 , JOON-WOO LEE 1 , (Graduate Student Member, IEEE),
YOUNG-SIK KIM 2 , (Member, IEEE), AND JONG-SEON NO 1 , (Fellow, IEEE)
1 Department of Electrical and Computer Engineering, INMC, Seoul National University, Seoul 08826, South Korea
2 Department of Information and Communication Engineering, Chosun University, Gwangju 61452, South Korea
Corresponding author: Young-Sik Kim (iamyskim@chosun.ac.kr)
This work was supported in part by the Institute of Information & Communications Technology Planning & Evaluation (IITP) Grant
funded by the Ministry of Science, Information and Communication Technology (ICT) and Future Planning (MSIT) (Development of
Highly Efficient post-quantum cryptography (PQC) Security and Performance Verification for Constrained Devices) (contribution: 80%)
under Grant 2021-0-00400; and in part by the Brain Korea 21 Project (BK21) FOUR Program of the Education and Research Program for
Future ICT Pioneers, Seoul National University, in 2022 (contribution: 20%).
ABSTRACT The sign function can be adopted to implement the comparison operation, max function, and
rectified linear unit (ReLU) function in the Cheon–Kim–Kim–Song (CKKS) scheme; hence, several studies
have been conducted to efficiently evaluate the sign function in the CKKS scheme. Recently, Lee et al. (IEEE
Trans. Depend. Sec. Comp.) proposed a practically optimal approximation method for the sign function in the
CKKS scheme using a composition of minimax approximate polynomials. In addition, Lee et al. proposed
a polynomial-time algorithm that finds the degrees of component polynomials that minimize the number
of non-scalar multiplications. However, homomorphic comparison/max/ReLU functions using Lee et al.’s
approximation method have not been successfully implemented in the residue number system variant
CKKS (RNS-CKKS) scheme. In addition, the degrees of component polynomials found by Lee et al.’s
algorithm are not optimized for the RNS-CKKS scheme because the algorithm does not consider that the
running time of non-scalar multiplication depends significantly on the ciphertext level in the RNS-CKKS
scheme. In this study, we propose a fast algorithm for the inverse minimax approximation error, which is a
subroutine required to find the optimal set of degrees of component polynomials. The proposed algorithm
facilitates determining the optimal set of degrees of component polynomials with higher degrees than in the
previous study. In addition, we propose a method to find the degrees of component polynomials optimized
for the RNS-CKKS scheme using the proposed algorithm for the inverse minimax approximation error.
We successfully implement the homomorphic comparison, max function, and ReLU function algorithms on
the RNS-CKKS scheme with a low comparison failure rate (< 2−15 ), and provide various parameter sets
according to the precision parameter α. We reduce the depth consumption of the homomorphic comparison,
max function, and ReLU function algorithms by one depth for several values of α. In addition, the numerical
analysis demonstrates that the homomorphic comparison, max function, and ReLU function algorithms using
the degrees of component polynomials found by the proposed algorithm reduce the running time by 6%, 7%,
and 6% on average, respectively, compared with those using the degrees of component polynomials found
by Lee et al.’s algorithm.
INDEX TERMS Cheon–Kim–Kim–Song (CKKS) scheme, fully homomorphic encryption (FHE),

homomorphic comparison operation, minimax approximate polynomial, Remez algorithm, residue number
system variant CKKS (RNS-CKKS) scheme.
I. INTRODUCTION A HE that allows all algebraic operations on encrypted

Homomorphic encryption (HE) is a cryptosystem that data is known as fully homomorphic encryption (FHE).
allows certain algebraic operations on encrypted data. Gentry proposed the first FHE scheme using bootstrap-
ping in [1]. FHE has garnered considerable attention in
The associate editor coordinating the review of this manuscript and various applications, and its standardization process is in
approving it for publication was Mehdi Sookhak . progress.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 10, 2022 26163
E. Lee et al.: Optimization of Homomorphic Comparison Algorithm on RNS-CKKS Scheme
The Cheon–Kim–Kim–Song (CKKS) [2] scheme, a repre- provide better comparison operation performance. Although
sentative FHE scheme, allows the addition and multiplication the authors of [9] also proposed a polynomial-time algorithm
of real and complex numbers. Because data are usually that determines the set of degrees that minimizes the number
represented by real numbers, the CKKS scheme, which can of non-scalar multiplications, this set of degrees is not
deal with real numbers, has garnered significant attention optimized for the RNS-CKKS scheme, unlike the CKKS
in several applications, such as machine learning [3]–[6]. scheme. This is because the running time of non-scalar
Thus, several studies have been conducted to optimize the multiplication alters significantly with the current ciphertext
CKKS scheme [7]–[11]. Cheon et al. [7] proposed a residue level in the RNS-CKKS scheme. Thus, if we optimize
number system variant CKKS scheme (RNS-CKKS). The the degrees of component polynomials by considering the
running time of the RNS-CKKS scheme is ten times faster running time of non-scalar multiplication according to
than that of the original CKKS scheme with one thread. the ciphertext level, the performance will be improved
In addition, the running time performance can be improved further.
in a multicore environment because the RNS-CKKS scheme
enables parallel computation. Thus, several HE libraries, A. OUR CONTRIBUTIONS
such as SEAL [12], PALISADE [13], and Lattigo [14], are The contributions of this study are presented as follows.
implemented using the RNS-CKKS scheme. 1) For the first time, we successfully implement the
Although the CKKS scheme can virtually support all homomorphic comparison, max function, and ReLU
arithmetic operations on encrypted data, several applica- function algorithms using a composition of minimax
tions require nonarithmetic operations. One of the core approximate polynomials on the RNS-CKKS scheme
non-arithmetic operations is the comparison operation, with a low failure rate (< 2−15 ), and provide proper
denoted as comp(a, b), which outputs 1 if a > b, 1/2 if parameter sets.
a = b, and 0 if a < b. This comparison operation 2) We improve the performance of an algorithm to
is widely employed in various real-world applications, determine the inverse minimax approximation error,
including machine learning algorithms, such as support which is a subroutine to determine the optimal set
vector machines, cluster analysis, and gradient boosting [15], of degrees of component polynomials. In a previous
[16]. The max function and rectified linear unit (ReLU) study, the optimal set of degrees of component
functions are other essential nonarithmetic operations that polynomials that minimizes the number of non-scalar
are widely adopted in deep learning applications [17], [18]. multiplications was determined among degrees only
These three non-arithmetic operations can all be implemented up to 31 [9]; however, we determine the optimal set
using the sign function sgn(x); that is, of degrees of component polynomials among degrees
up to 63, using the improved algorithm for inverse
1
comp(a, b) = (sgn(a − b) + 1), minimax approximation error (see Algorithm 7). Con-
2 sequently, the depth consumption of the homomorphic
1
max(a, b) = (a + b + (a − b) sgn(a − b)), comparison operation (resp. max/ReLU function) is
2 reduced by one depth when α is 9 or 14 (resp. when
1
ReLU(x) = (x + x sgn(x)), α is 16, 17, or 18), thereby enabling an additional
2 multiplication operation. In addition, this improved
where sgn(x) = x/|x| for x 6 = 0 and 0 otherwise. Thus, algorithm for inverse minimax approximation error
several studies have been conducted to implement the sign enables the identification of a set of degrees of
function in the CKKS scheme efficiently [9], [19]. A method component polynomials optimized for homomorphic
to approximate sgn(x) using the composition of component comparison operation, max function, or ReLU function
polynomials was proposed in [19], and it was proven that this in the RNS-CKKS scheme (see Section IV). Our source
method achieves optimal asymptotic complexity. In addition, code for determining the optimized degrees is avail-
the authors of [9] proposed a practically optimal method that able at https://github.com/eslee3209/
approximates sgn(x) with the minimum number of non-scalar MinimaxComp_degrees.
multiplications using a composition of minimax approximate 3) We propose a method to determine the set of degrees
polynomials. of component polynomials optimized for the homo-
Although the authors of [9] proposed a comparison morphic comparison, max function, and ReLU function
operation algorithm with practically optimal performance on on the RNS-CKKS scheme using the proposed fast
the CKKS scheme, there are several limitations in using algorithm for inverse minimax approximation error.
the comparison operation for the RNS-CKKS scheme. First, Using the optimized set of degrees for the RNS-CKKS
because the rescaling error is relatively large in the RNS- scheme, we reduce the running time of the homomor-
CKKS scheme, unlike in the CKKS scheme, it is necessary phic comparison, max function, and ReLU function
to deal with this comparatively large rescaling error to algorithms by 6%, 7%, and 6%, respectively, compared
achieve low approximation failure rates. Another issue is with the previous work in [9] on the RNS-CKKS
determining a set of degrees of component polynomials that scheme library SEAL [12].
26164 VOLUME 10, 2022

B. RELATED WORKS where ζ̄ = exp(−π i/N ) is the (2N )th root of unity in C.
Although it is not difficult to perform a comparison operation HWTN (h) is the set of signed binary vectors in {0, ±1}N
(or max/ReLU function) in bit-wise FHE, such as the fastest with a Hamming weight h. For 0 < a, b ∈ R, we denote
homomorphic encryption in the West (FHEW) [20] or fast [−b, −a] ∪ [a, b] as R̃a,b . In particular, if a = 1 − τ and b =
fully homomorphic encryption over the torus (TFHE) [21], 1 + τ for some τ ∈ (0, 1), then R̃a,b = R̃1−τ,1+τ is denoted
the comparison operation is very challenging in word-wise by Rτ . |{(n1 , n2 , · · · , ni ); S(n1 , · · · , ni )}| denotes the number
FHE, such as the CKKS scheme. Thus, several studies of tuples (n1 , · · · , ni ), such that statement S(n1 , · · · , ni )
have been conducted on comparison operations in the is true. αmax , `max , mmax , nmax , and tmax denote the upper
CKKS scheme that adopts the evaluation of approximate bound of the precision α, ciphertext level, number of non-
polynomials [9], [19], [22]. Among them, the comparison scalar multiplications, depth consumption, and running time,
operation proposed in [9] exhibits the best performance, and respectively. These values should be sufficiently large; thus,
we improve the performance of [9]. we set αmax = 20, `max = 30, mmax = 70, nmax = 40, and
Another comparison operation method for the CKKS tmax = 240 in this study. dmax denotes the upper bound of
scheme that uses FHEW/TFHE bootstrapping was recently the degrees of the component polynomials, and dmax of 31 or
studied [23]–[26]. Although this approach uses FHEW/TFHE 63 is used in this study.
bootstrapping, users can still employ efficient word-wise
operations in the CKKS scheme. When a comparison opera- B. RNS-CKKS SCHEME
tion is required, users switch the ciphertexts to FHEW/TFHE Before describing the RNS-CKKS scheme, some basic
ciphertexts and perform a comparison operation using operations for the RNS are presented. Let B =
FHEW/TFHE bootstrapping. This comparison method that {p0 , · · · , pk−1 }, C = {q0 , · · · , q`−1 }, and D =
uses FHEW/TFHE bootstrapping can be less efficient than {p0 , · · · , pk−1 , q0 , · · · , q`−1 }, where pi and qj are the distinct
the proposed homomorphic comparison that fully uses CKKS primes.
packing in terms of the amortized running time (running – ConvC →B : For [a]C = (a(0) , a(1) , · · · , a(`−1) ) ∈ Zq0 ×
time per comparison operation). However, this comparison · · · × Zq`−1 , output
method is still interesting research topic because this can have
ConvC →B ([a]C )
advantages in the case of large-precision comparison.  
`−1
X
=  [a(j) · q̂−1
j ]qj · q̂j mod pi
 ,
C. OUTLINE j=0 0≤i<k
The remainder of this paper is organized as follows. Section II Q
describes the notations, RNS-CKKS scheme, scaling factor where q̂j = j0 6=j qj0 Q ∈ Z. This algorithm over
integers ConvC →B (·) : `−1
Qk−1
management technique, and homomorphic comparison oper- j=0 Zqj → i=0 Zpi can
ation using a minimax composite polynomial. In Section III, be extended to Qan algorithm over polynomial rings as
ConvC →B (·) : `−1
Qk−1
a fast algorithm for determining the inverse minimax approx- j=0 R q j → i=0 R p i by applying it
imation error is proposed. A new algorithm that determines coefficient-wise.
– ModUpC →D : For [a]C ∈ `−1
Q
the set of degrees of component polynomials optimized for j=0 Rqj , output
the homomorphic comparison of the RNS-CKKS scheme
k−1 `−1
is proposed in Section IV. The application of min/max and Y Y
(ConvC →B ([a]C ), [a]C ) ∈ Rpi × Rqj .
ReLU functions is presented in Section V. In Section VI,
i=0 j=0
the numerical results for the homomorphic comparison, max
function, and ReLU function algorithms that use the proposed – ModDownD→C : For ([a]B , [b]C ) ∈
Qk−1
Rpi ×
Q`−1 i=0
set of degrees for the component polynomials are provided in
j=0 Rqj , output
the RNS-CKKS scheme library SEAL. Finally, concluding
remarks are presented in Section VII. ([b]C − ConvB→C ([a]B )) · [P−1 ]C ,
where P = k−1 i=0 pi .
Q
II. PRELIMINARIES The basic algorithms in the RNS-CKKS scheme are
A. NOTATION described as follows:
Let R = Z[X ]/(X N + 1) and Rq = R/qR be – Setup(λ; 1, L): For a security parameter λ, scaling
polynomial rings, where N is a power-of-two integer. Let C = factor 1, and the number of levels L (also called
{q0 , q1 , · · · , q`−1 } be the set of positive coprime integers. the maximum level), we set some parameters. The
Then, for a ∈ ZQ , where ZQ is the set of integers modulo polynomial degree N of R is chosen such that the
Q`−1
Q and Q = i=0 qi , we denote the RNS representation number of levels L can be supported by security
of a with respect to C as [a]C = ([a]q0 , · · · , [a]q`−1 ) ∈ λ. A secret key distribution χkey , error distribution
Zq0 × · · · × Zq`−1 . For the set of real numbers R and set of χerr over R, and encryption key distribution χenc are
complex numbers C, the field isomorphism τ̄ : R[X ]/(X N + chosen according to the security parameter λ. Bases
1) → CN /2 is defined as τ̄ : r(X ) 7 → (r(ζ̄ 5 ))0≤j<N /2 ,
j
with prime numbers B = {p0 , p1 , · · · , pk−1 } and
VOLUME 10, 2022 26165

(j) (j) (j)

C = {q0 , q1 , · · · , qL } are selected such that pi ≡ 1 mod • ctmult = (ctmult )0≤j≤` , where ctmult = (ĉ0 +
(j) (j) (j)
2N for 0 ≤ i ≤ k −1 and qj ≡ 1 mod 2N for 0 ≤ j ≤ L. d0 , ĉ1 + d1 ) mod qj for 0 ≤ j ≤ `.
q0 is usually set close to 260 and qj − 1 is as small as Then, output the ciphertext ctmult .
(j) (j)
possible for 1 ≤ j ≤ L. All prime numbers are distinct. 1) RS(ct): For a ciphertext ct = (ct(j) = (c0 , c1 ))0≤j≤` ,
We assume that D = B ∪ C. Let C` = {q0 , Q q1 , · · · , q` }, (j) (j)
output the ciphertext ct0 = (ct0 (j) = (c0 0 , c0 1 ))0≤j≤`−1 ,
and D` = B ∪ C` for 0 ≤ ` ≤ L. Let P = k−1 i=0 pi and (j) (`)
where c0 (j) −1
r = q` · (cr − cr ) mod qj for r = 0, 1 and
Q = Lj=0 qj . Let Q
Q Q
p̂i = 0≤i0 ≤k−1,i0 6=i pi0 for 0 ≤ i ≤
0 ≤ j ≤ ` − 1.
k − 1 and q̂`,j = 0≤j0 ≤`,j0 6=j qj0 for 0 ≤ j ≤ ` ≤ L. In this study, we set the key distribution χkey =
The following numbers are then computed: HWTN (256), which samples an element in R with ternary
• [P−1 ]qj for 0 ≤ j ≤ L coefficients that have 256 nonzero values uniformly at
• [p̂i ]qj and [p̂i−1 ]pi for 0 ≤ i ≤ k − 1, 0 ≤ j ≤ L random.
• [q̂`,j ]pi and [q̂−1 `,j ]qj for 0 ≤ i ≤ k − 1, 0 ≤ j ≤ ` ≤
L C. SCALING FACTOR MANAGEMENT
– Ecd(z; 1): For z ∈ CN /2 , output bτ̄ −1 (1z)e ∈ R. A technique for eliminating the large rescaling error in the
– Dcd(m; 1): For m ∈ R, output 1−1 · τ̄ (m) ∈ CN /2 . RNS-CKKS scheme was proposed in [27], where different
– KSGen(s1 , s2 ): This algorithm generates the switching scaling factors at different levels were utilized instead of the
key for switching the secret key Q s1 to s2 . First,
QLsample same scaling factor for each level. If the maximum level is L
(a0(0) , · · · , a0(k+L) ) ← U ( k−1 i=0 R pi × j=0 Rqj ) and the ciphertext modulus for level i is qi , then the scaling
and e0 ← χerr . Then, for the given s1 , s2 ∈ factor for each level is set as follows: 1L = qL and 1i =
R, output the switching key swk = (swk(0) = 12i+1 /qi+1 for i = 0, · · · , L − 1.
(b0(0) , a0(0) ), · Q
· · , swk(k+L) = (b0(k+L) , a0(k+L) )) ∈ When two ciphertexts at the same level are multiplied
Q k−1 2 L 2 0(i) ← −a0(i) · s + e0
i=0 Rpi × j=0 Rqj , where b 2 homomorphically, an approximate rescaling error is not
mod pi for 0 ≤ i ≤ k − 1 and b 0(k+j) ← −a 0(k+j) · s2 + introduced. Then, we consider when two ciphertexts are at
[P]qj · s1 + e0 mod qj for 0 ≤ j ≤ L. different levels, levels i and j, such that i > j. In this
– KeyGen(λ): This algorithm generates the secret, pub- case, the moduli qi , qi−1 , · · · , qj+1 in the first ciphertext
lic, and evaluation keys. First, sample s ← χkey and set are dropped, and the first ciphertext is then multiplied
1q
theQ secret key sk ← (1, s). Sample (a(0) , · · · , a(L) ) ← by a constant b j1j+1i
e. Subsequently, we rescale the first
U ( Lj=0 Rqj ) and e ← χerr . Then, the public key is ciphertext using qj+1 . Because both ciphertexts are now at the
pk ← (pk(j) = (b(j) , a(j) ) ∈ R2qj )0≤j≤L , where b(j) ← same level, conventional homomorphic multiplication can be
−a(j) · s + e mod qj for 0 ≤ j ≤ L. The evaluation key performed. The approximate rescaling error decreased in this
is evk ← KSGen(s2 , s). manner.
– Enc(z; pk, 1): For a message slot z ∈ CN /2 , compute
m = Ecd(z; 1). Then, sample v ← χenc andQe0 , e1 ← D. HOMOMORPHIC COMPARISON OPERATION USING
χerr . The ciphertext is ct = (ct(j) )0≤j≤L ∈ Lj=0 R2qj , MINIMAX COMPOSITE POLYNOMIAL
In this study, the required depth consumption and number
where ct(j) ← v · pk(j) + (m + e0 , e1 ) mod qj for 0 ≤
of non-scalar multiplications for evaluating a polynomial
j ≤ L.
of degree d with odd-degree terms using the odd baby-
– Dec(ct; sk, 1): For a ciphertext ct = (ct(j) )0≤j≤` ∈
Q` step giant-step algorithm and optimal level consumption
j=0 Rqj , obtain m̃ = hct , ski mod q0 . Then, output
2 (0)
technique are denoted by dep(d) and mult(d), respectively.
z = Dcd(m̃; 1).
(j) The values of dep(d) and mult(d) for odd degrees d of up to
– Add(ct1 , ct2 ): For two ciphertexts ctr = (ctr )0≤j≤` for
(j) 63 are presented in Table 1.
r = 1, 2, output the ciphertext ctadd = (ctadd )0≤j≤` , The minimax approximate polynomial of degree at most
(j) (j) (j)
where ctadd ← ct1 + ct2 mod qj for 0 ≤ j ≤ `. d on D for sgn(x) is denoted by MP(D; d). In addition, for
(j)
– Mult(ct1 , ct2 ; evk): For two ciphertexts ctr = (ctr = the minimax approximate polynomial p(x) = MP(D; d),
(j) (j)
(cr0 , cr1 ))0≤j≤` for r = 1, 2, compute the following: the minimax approximation error maxD kp(x) − sgn(x)||∞ is
(j) (j) (j) (j) (j) (j)
• d0 = c00 c10 mod qj , d1 = c00 c11 + c01 c10 mod
(j) (j) denoted by ME(D; d). It is known that for any continuous
(j) (j) (j) function f on D, the minimax approximate polynomial of
qj , and d2 = c01 c11 mod qj for 0 ≤ j ≤ `.
(0) (`) degree d at most on D is unique [29]. Furthermore, the
• ModUpC` →D` (d2 , · · · , d2 ) = minimax approximate polynomial can be obtained using the
(0) (k−1) (0) (`)
(d̃2 , · · · , d̃2 , d2 , · · · , d2 ). improved multi-interval Remez algorithm [30].
(k+`) (j) (j) (i)
• ct˜ = (ct˜ = (c̃0 , c̃1 ))0≤j≤k+` , where ct˜ = For a domain D = R̃a,b and set of odd integers
(i) (k+j) (j)
d̃2 · evk(i) mod pi and ct˜ = d2 · evk(k+j) mod {di }1≤i≤k , a composite polynomial pk ◦ · · · ◦ p1 is called a
qj for 0 ≤ i ≤ k − 1 and 0 ≤ j ≤ `. minimax composite polynomial on D for {di }1≤i≤k , denoted
(0) (`)
• (ĉr , · · · , ĉr ) = ModDownD` →C` by MCP(D; {di }1≤i≤k ) if the followings are satisfied:
(0) (k+`)
(c̃r , · · · , c̃r ) for r = 0, 1. • p1 = MP(D; d1 )
26166 VOLUME 10, 2022

TABLE 1. Required depth consumption and the number of non-scalar

multiplications for evaluating polynomials of degree d with odd-degree
Algorithm 1: AppIMEbinary(τ 0 ; d, ī) [9]
terms using the optimal level consumption technique [10] and the odd Input: Target maximum error τ 0 , an odd degree d, and
baby-step giant-step algorithm [28].
iteration number ī
Output: An approximate value of IME(τ 0 ; d)
1 min ← 2
−21 and max ← 1 − 2−21
2 while ī > 0 do
3 if ME(R(min+max)/2 ; d) < τ 0 then
4 min ← min+max
2
5 else
6 max ← min+max2
7 end
8 ī ← ī − 1
9 end
min+max
10 return
2
Algorithm 2: ComputehG(τ ) [9]

Input: An input τ and an odd maximum degree dmax
Output: 2-dimensional tables h̃ and G̃
1 Generate 2-dimensional tables h̃ and G̃, both of which
have size of (mmax + 1) × (nmax + 1).
2 for m ← 0 to mmax do
3 for n ← 0 to nmax do
4 if m ≤ 1 or n ≤ 1 then
5 h̃(m, n) ← τ
6 G̃(m, n) ← φ
7 else
8 j ← argmax IME(h̃(m − mult(2i +
−1
1≤i≤ dmax
2
mult(2i+1)≤m
dep(2i+1)≤n
• pi = MP(pi−1 ◦ pi−2 ◦ · · · ◦ p1 (D); di ) for i, 2 ≤ i ≤ k.
1), n − dep(2i + 1)); 2i + 1)
Because ME(Rτ ; d) is a strictly increasing function of τ , its
9 h̃(m, n) ← IME(h̃(m − mult(2j + 1), n −
inverse function exists, which is called the inverse minimax
dep(2j + 1)); 2j + 1)
approximation error, and is denoted by IME(τ 0 ; d). Thus, for
10 G̃(m, n) ←
τ ∈ (0, 1) and d ∈ N, IME(τ 0 ; d) is equal to the value τ 0 ∈
{2j+1}∪G̃(m−mult(2j+1), n−dep(2j+1))
(0, 1) that satisfies ME(Rτ 0 ; d) = τ . An approximate value of 11 end
IME(τ 0 ; d) can be obtained using binary search, as indicated 12 end
in Algorithm 1 [9]. 13 end
The comparison operation is denoted as

1
 if a > b
comp(a, b) = 1/2 if a = b ComputeMinDep and ComputeMinMultDegs algo-
rithms use ComputehG algorithm as a subroutine, and
if a < b.

0

MinimaxComp algorithm uses ComputeMinMultDegs
algorithm as a subroutine. Subsequently, the output of the
The procedure for obtaining an approximate value of 1 (a−b)+1
MinimaxComp algorithm, p̃(a, b) = pk ◦pk−1 ◦···◦p
comp(a, b) for the given precision parameters α, and inputs 2
satisfies the following comparison operation error
a, b ∈ [0, 1] is summarized as follows:
condition.
1) Obtain the minimum depth consumption Mdep from
ComputeMinDep algorithm in Algorithm 3. |p̃(a, b) − comp(a, b)| ≤ 2−α
2) Choose depth consumption D(≥ Mdep ) and obtain for any a, b ∈ [0, 1] satisfying |a − b| ≥ .
the optimal set of degrees Mdegs from ComputeMin
(1)
MultDegs algorithm in Algorithm 4.
3) For some appropriate margin η, perform the homo- The set of degrees, Mdegs = {d1 , · · · , dk }, obtained from
morphic comparison algorithm MinimaxComp in the ComputeMinMultDegs algorithm satisfies deg(pi ) =
Algorithm 5 using Mdegs . di , 1 ≤ i ≤ k. Mdegs is the optimal set of degrees,
VOLUME 10, 2022 26167

Algorithm 3: ComputeMinDep(α, ) [9] III. FAST ALGORITHM FOR INVERSE MINIMAX

Input: Precision parameters α and APPROXIMATION ERROR
Output: Minimum depth consumption Mdep The ComputehG algorithm in Algorithm 2 [9] should
1 h̃, G̃ ← ComputehG(2
1−α ) be performed to obtain the optimal set of degrees from
2 for i ← 0 to nmax do the ComputeMinMultDegs algorithm. To apply the
if h̃(mmax , i) ≥ δ = 1− ComputehG algorithm, the inverse minimax approximation
3
1+ then
4 Mdep ← i error IME(τ 0 ; d) should be computed many times. That
5 return Mdep is, the AppIMEbinary algorithm [9] determining an
6 end approximate value of IME(τ 0 ; d) should be called many
7 if i = nmax then times. However, a single call of the AppIMEbinary
8 return ⊥ algorithm also requires multiple computations of ME(Rτ ; d),
9 end that is, several calls for the improved multi-interval Remez
10 end
algorithm [30]. Accordingly, the ComputehG algorithm
requires a significant running time. Specifically, the number
of computations for IME(τ 0 ; d) in ComputehG for each
Algorithm 4: ComputeMinMultDegs(α, , D) [9] precision parameter α is provided as follows:
Input: Precision parameters α and , and depth m
Xmax n
X max n
consumption D i; mult(2i + 1) ≤ m, dep(2i + 1) ≤ n,
Output: Minimum number of multiplications Mmult and m=0 n=0
the optimal set of degrees Mdegs dmax − 1 o
1−α ) 1≤i≤
1 h̃, G̃ ← ComputehG(2 2
2 for j ← 0 to mmax do n
3 if h̃(j, D) ≥ δ = 1+
1−
then = (m, n, i); 0 ≤ m ≤ mmax , 0 ≤ n ≤ nmax ,
4 Mmult ← j mult(2i + 1) ≤ m, dep(2i + 1) ≤ n,
5 Go to line 11 dmax − 1 o
6 end 1≤i≤
2
7 if j = mmax then dmax −1
8 return ⊥ 2
X n
9 end = (m, n); mult(2i + 1) ≤ m ≤ mmax ,
10 end i=1
11 Mdegs ← G̃(Mmult , D) ;
o
// Mdegs : ordered set dep(2i + 1) ≤ n ≤ nmax
12 return Mmult and Mdegs
dmax −1
X 2
= (mmax −mult(2i + 1) + 1)(nmax −dep(2i + 1)+1)
Algorithm 5: MinimaxComp(a, b; α, , D, η) [9] i=1
Input: Inputs a, b ∈ (0, 1), precision parameters α and dmax − 1
, depth consumption D, and margin η = (mmax + 1)(nmax + 1)
2
Output: Approximate value of comp(a, b) dmax −1
2
1 Mdegs = {d1 , d2 , · · · , dk } ←
X
−(mmax + 1) dep(2i + 1)
ComputeMinMultDegs(α, , D)
i=1
2 p1 ← MP(R̃1−,1 ; d1 )
dmax −1
3 τ1 ← ME(R̃1−,1 ; d1 ) + η X 2
4 for i ← 2 to k do
−(nmax + 1) mult(2i + 1)
5 pi ← MP(Rτi−1 ; di ) i=1
dmax −1
6 τi ← ME(Rτi−1 ; di ) + η X 2
7 end + mult(2i + 1)dep(2i + 1).
pk ◦pk−1 ◦···◦p1 (a−b)+1
8 return i=1
2
If we set dmax = 31, mmax = 70, and nmax = 40, as in [9],
the number of computations for IME(τ 0 ; d) is
such that the homomorphic comparison operation minimizes
15(mmax + 1)(nmax + 1) − (mmax + 1) · 64
the number of non-scalar multiplications and satisfies the
comparison operation error condition in (1) for the given −(nmax + 1) · 113 + 64 · 113
depth consumption D. = 15 · 71 · 41 − 71 · 64 − 41 · 113 + 64 · 113
26168 VOLUME 10, 2022

= 41, 720. Algorithm 6: StoreME(dmax , n̄)

Input: Maximum degree dmax and precision n̄
If we want to adopt dmax = 63, the number of computations Output: 2-dimensional tables X̃ and Ỹ
for IME(τ 0 ; d) is 117, 675. 1 Generate 2-dimensional tables X̃ and Ỹ , both of which
To obtain a precise approximate value of IME(τ 0 ; d)
have size of dmax2 −1 × 2(n̄αmax + 1)
using the AppIMEbinary algorithm, we iterate at least d −3
2 for i ← 0 to max do
ten times. Then, the expected number of computations for 2
3 for j ← 0 to n̄αmax do
ME(Rτ ; d) in ComputehG is at least 117, 675 × 10 =
4 τ ← 2−αmax −1+j/n̄
1, 176, 750. Notably, this is the case for only one value of the
precision parameter α, where the input τ of the ComputehG 5 X̃ (i, j) ← τ
algorithm is 21−α . To perform the ComputehG algorithm for 6 Ỹ (i, j) ← ME(Rτ ; 2i + 3)
α ranging from 4 to 20, approximately 1, 176, 750 × 17 = 7 end
20, 004, 750 calls for ME(Rτ ; d) are required. Because of 8 for j ← 0 to n̄αmax do
the large number of calls for ME(Rτ ; d), a value of dmax 9 τ ← 1 − 2−(1+j/n̄)
larger than 31 could not be used in [9], failing to improve the 10 X̃ (i, j + n̄αmax + 1) ← τ
performance of homomorphic comparison operations using 11 Ỹ (i, j + n̄αmax + 1) ← ME(Rτ ; 2i + 3)
higher degrees. Thus, it is desirable to study how to efficiently 12 end
determine the approximate value of IME(τ 0 ; d). 13 end
14 return X̃ and Ỹ
A. PROPOSED ALGORITHM FOR INVERSE MINIMAX
APPROXIMATION ERROR
We propose a fast method to determine the approximate value algorithm. Thus, StoreME is performed only once for
of IME(τ 0 ; d), which enables the use of a value of dmax larger various precision parameters, α.
than 31. The procedure for the proposed method is as follows.
1) Sample the values of τ at moderate intervals. B. RUNNING TIME OF THE PROPOSED ALGORITHM
2) Compute the values of ME(Rτ ; d) for the sampled τ . We compare the running time of the ComputehG algorithm
3) For τ 0 ∈ (0, 1), obtain an approximate value of using the previous algorithm for the inverse minimax
IME(τ 0 ; d) by interpolation using the computed sample approximation error with that using the proposed algorithm.
values of ME(Rτ ; d). Numerical analysis is conducted on a Linux PC with an
For αmax , which is the upper-bound of α, we consider Intel Core i7-10700 CPU at 2.90 GHz with one thread. One
sampling τ between 2−αmax −1 and 1−2−αmax −1 . If τ is close to call for the improved multi-interval Remez algorithm takes
zero or one, sampling should be very dense; however, densely approximately 1.4 s on average when dmax = 31, and
sampling the entire range between 2−αmax −1 and 1−2−αmax −1 4.9 s on average when dmax = 63. The expected running
requires a large number of samples. Thus, we propose to time of ComputehG can then be obtained. Table 2 presents
sample densely when τ is close to zero or one, and sparsely the expected running time of the ComputehG algorithm
otherwise. for α in the range 4–20 using the previous and proposed
Specifically, we first sample t uniformly between algorithms for the inverse minimax approximation error.
−αmax − 1 and −1. Furthermore, we compute ME(R2t ; d) for As presented in Table 2, the proposed AppIME algorithm
the sampled values of t. In addition, we sample t uniformly requires significantly less running time than the previous
between 1 and αmax + 1 and compute ME(R1−2−t ; d) for AppIMEbinary algorithm, which enables the execution of
the sampled values of t. By interpolating these samples, the ComputehG algorithm for dmax = 63. Although the
we can achieve a precise approximation of IME(τ 0 ; d) running time of 17 h might still seem to be large, it should
using a smaller number of samples. Precision n̄ determines be noted that this process only needs to be performed once
how frequently the values of t are sampled, and we set because its goal is to determine the optimal set of degrees.
n̄ = 10. For a given maximum degree, dmax , and precision, Whereas the ComputehG algorithm in Algorithm 2 can
n̄, the StoreME algorithm in Algorithm 6 stores 2t (resp. only be performed for dmax ≤ 31 in [9], we apply the
1 − 2−t ) in a two-dimensional table X̃ , and ME(R2t ; d) (resp. ComputehG algorithm for dmax ≤ 63 using the proposed
ME(R1−2−t ; d)) in a two-dimensional table Ỹ for all degrees fast AppIME algorithm in Algorithm 7. Table 3 presents
d, 3 ≤ d ≤ dmax . For n̄ = 10 and αmax = 20, the number of the optimal sets of degrees, Mdegs , and the corresponding
calls for ME(Rτ ; d) is 6, 030 for dmax = 31 and 12, 462 for minimum depth consumption, Dmin , for dmax = 31 and
dmax = 63. dmax = 63. From Table 3, it can be observed that depth
The AppIME algorithm in Algorithm 7 outputs an consumption is reduced by one when α is nine or 14. That is,
approximate value of IME(τ 0 ; d) using tables X̃ and Ỹ for α = 9 or α = 14, a high dmax enables one more non-scalar
obtained from the StoreME algorithm. Here, several calls multiplication per homomorphic comparison operation in the
for the AppIME algorithm require only one computation FHE setting, where the available number of operations is
of tables X̃ and Ỹ , that is, one execution of the StoreME very limited per bootstrapping. Furthermore, the proposed
VOLUME 10, 2022 26169

Algorithm 7: AppIME(τ, d; dmax , n̄) TABLE 3. Optimal sets of degrees Mdegs and corresponding minimum
depth consumption for homomorphic comparison operation for
Input: Target maximum error τ , degree d, the odd dmax = 31 and dmax = 63. Dmin denotes the minimum depth
maximum degree dmax , and precision n̄ consumption for the homomorphic comparison operation.
Output: An approximate value of IME(τ ; d)

1 X̃ , Ỹ ← StoreME(dmax , n̄)
2 for j ← 0 to 2n̄αmax + 1 do
3 xj ← X̃ ( d−32 , j)
4
2 , j)
yj ← Ỹ ( d−3
5 end
6 for j ← 1 to 2n̄αmax + 1 do
7 if yj ≥ τ then
8 if xj−1 < 0.5 and xj ≥ 0.5 then
9 return xj
10 else if yj−1 < 0.5 and yj ≥ 0.5 then
11 return xj
12 else if xj < 0.5 and yj < 0.5 then
log x −log xj−1
13 p ← log xj−1 + (τ − log yj−1 ) log yjj −log yj−1
failure rate in the homomorphic comparison operation using a
14 return 2p minimax composite polynomial [9]. We replace all additions
15 else if xj−1 ≥ 0.5 and yj < 0.5 then and multiplications required for polynomial evaluation with
16 p ← − log(1 − xj−1 ) additions and multiplications that use the scaling factor
log(1−x )−log(1−x )
17 −(τ − log yj−1 ) log yj j −log yj−1 j−1 management technique. It can be observed in Section VI
18 return 1 − 2−p that a low failure rate is achieved using this technique and
19 else if yj−1 ≥ 0.5 then appropriate parameter sets.
20 p ← − log(1 − xj−1 ) Another difference exists between the homomorphic
21
log(1−x )−log(1−xj−1 )
+(τ + log(1 − yj−1 )) log(1−yjj )−log(1−yj−1 comparison operation in the CKKS scheme and that in
)
the RNS-CKKS scheme. For a given depth consumption
22 return 1 − 2−p
D, the set of degrees Mdegs that minimizes the number
23 else
of non-scalar multiplications can be obtained using the
24 return ⊥
ComputeMinMultDegs algorithm. Because the compu-
25 end
tation time of a non-scalar multiplication does not depend
26 end
significantly on the current ciphertext modulus in the CKKS
27 return ⊥
scheme, minimizing the number of non-scalar multiplications
corresponds to minimizing the running time. However,
TABLE 2. Expected running time of ComputehG algorithm for α in the
range 4–20 using the previous and proposed algorithms for inverse because the computation time of non-scalar multiplication
minimax approximation error. depends significantly on the current level in the RNS-CKKS
scheme, minimizing the number of non-scalar multiplications
does not always correspond to minimizing the running time.
Fig. 1 illustrates the computation time of an example
polynomial of degree seven, according to the current level
on the RNS-CKKS scheme library SEAL [12]. From Fig. 1,
it can be observed that the computation time of a polynomial
tends to increase quadratically according to the maximum
level. For example, we consider two ordered sets Mdegs =
AppIME algorithm enables the identification of a set of {7, 7, 31, 31} and Mdegs = {31, 31, 7, 7}. In the CKKS
degrees optimized for the RNS-CKKS scheme, as described scheme, the computation time of the homomorphic compari-
in Section IV. son operation using two sets of degrees will be approximately
the same. However, the homomorphic comparison operation
IV. DETERMINING DEGREES OF COMPONENT using the former is faster in the RNS-CKKS scheme because
POLYNOMIALS OPTIMIZED FOR THE RNS-CKKS SCHEME a high degree polynomial is computed at a lower level.
Unlike the previous study on the homomorphic comparison Our core idea is to determine the set of degrees that
operation in the CKKS scheme [9], we study the homo- minimizes the running time rather than the number of non-
morphic comparison operation in the RNS-CKKS scheme scalar multiplications. Specifically, we modify the previous
as well, and there are additional aspects to be considered. ComputehG and ComputeMinMultDegs algorithms so
Unlike the CKKS scheme, the RNS-CKKS scheme has that the modified algorithms can determine the set of degrees
a relatively large rescaling error with the risk of a high Mdegs that minimizes the running time.
26170 VOLUME 10, 2022

Algorithm 8: ComputeuV(τ ; L)
Input: τ , maximum level L
Output: 2-dimensional tables ũ and Ṽ
1 Generate 2-dimensional tables ũ and Ṽ , both of which
have size of (tmax + 1) × (nmax + 1)
2 for m ← 0 to tmax do
3 for n ← 0 to nmax do
4 if m ≤ 1 or n ≤ 1 then
5 ũ(m, n) ← τ
6 Ṽ (m, n) ← φ
7 else
8 j ← argmax IME(ũ(m − CL (i − 1, n),
FIGURE 1. Running time of a polynomial of degree seven according to the 1≤i
current level of the ciphertext in the RNS-CKKS scheme library SEAL [12]. CL (i−1,n)≤m
dep(2i+1)≤n
n − dep(2i + 1)); 2i + 1)
Here, we define C` (i, `0 ) for 0 ≤ i ≤ dmax2 −3 , 0 ≤ 9 ũ(m, n) ← IME(ũ(m − CL (j − 1, n),
` ≤ `max , 0 ≤ `0 ≤ `max . First, we set the maximum n − dep(2j + 1)); 2j + 1)
level to `. Then, starting from level `0 (≤ `), any polynomial 10 Ṽ (m, n) ← {2j + 1} ∪ Ṽ (m − CL (j − 1, n),
of degree 2i + 3 is evaluated using the optimal level n − dep(2j + 1))
consumption technique [10] and an odd baby-step giant-step 11 end
algorithm [28]. If t is the running time of the polynomial 12 end
evaluation in milliseconds, we define C` (i, `0 ) as C` (i, `0 ) = 13 end
t
b 100 e. Here, if ` < `0 or `0 < dlog2 (2i + 3)e, polynomial
evaluation is infeasible; thus, C` (i, `0 ) is set to a sufficiently k i−1
large value of 100, 000 in this case. We obtain the values of
X di − 3 X
CL ( ,L − dep(dj )) ≤ m,
C` (i, `0 ) by performing polynomial evaluation on encrypted i=1
2
j=1
data. This computation is done on Intel Core i7-10700 CPU
at 2.90 GHz in single thread with an Ubuntu 20.04 LTS
where ki=1 CL ( di 2−3 , L − i−1
P P
distribution. Subsequently, uτ,L (m, n) and Vτ,L (m, n) are j=1 dep(dj )) denotes the running
time of the homomorphic comparison operation using a set
defined recursively using the values of C` (i, `0 ) as follows:
of degrees Mdegs = {di }1≤i≤k . For Vτ,L (m, n) = {di0 }1≤i≤k 0 ,
P0
uτ,L (m, n) MCE(Ruτ,L (m,n) ; {di0 }1≤i≤k 0 ) ≤ τ , ki=1 dep(di0 ) ≤ n, and
Pk di −3 i−1
i=1 CL ( 2 , L −
P
j=1 dep(dj )) ≤ m.

τ,
 if m ≤ 1 or n ≤ 1
The ComputeuV algorithm in Algorithm 8 outputs
= IME(uτ,L (m − CL (jm,n − 1, n), n two-dimensional tables ũ and Ṽ that store the values of

−dep(2jm,n + 1)); 2jm,n + 1), otherwise,

uτ,L and Vτ,L , respectively. This ComputuV algorithm
Vτ,L (m, n) requires several computations for IME(τ 0 ; d). However, these
computations can be performed rapidly using the proposed

φ,
 if m ≤ 1 or n ≤ 1
AppIME algorithm.
= {2jm,n + 1} ∪ Vτ,L (m − CL (jm,n − 1, n),
 Then, the ComputeMinTimeDegs algorithm in Algo-
n − dep(2jm,n + 1)), otherwise,

rithm 9 outputs the minimum running time Mtime (in 100 ms)
and optimal set of degrees Mdegs using the two tables ũ and
where
Ṽ obtained from the ComputeuV algorithm.
jm,n = argmax IME(uτ,L (m − CL (i − 1, n), We now propose the homomorphic comparison algorithm
1≤i OptMinimaxComp in Algorithm 10, which uses the
CL (i−1,n)≤m ComputeMinTimeDegs algorithm. This is a modified
dep(2i+1)≤n
version of the previous MinimaxComp algorithm in Algo-
n − dep(2i + 1)); 2i + 1).
rithm 5 [9], minimizing the running time of the RNS-CKKS
uτ,L (m, n) implies the maximum value of τ 0 ∈ (0, 1), such scheme for a given depth consumption D.
that there exists a set of degrees {di }1≤i≤k that satisfy the
following: V. APPLICATION TO MIN/MAX AND ReLU FUNCTION
In this section, we apply the methods of improving the
MCE(Rτ 0 ; {di }1≤i≤k ) ≤ τ homomorphic comparison operation proposed in Sections III
X k and IV to the max and ReLU functions. First, the max
dep(di ) ≤ n function is an important operation employed in several
i=1 applications, including deep learning. The max function is
VOLUME 10, 2022 26171

Algorithm 9: ComputeMinTimeDegs(α, , D) Algorithm 11: OptMinimaxMax(a, b; α, ζ, D, η)

Input: Precision parameters α and , and depth Input: Inputs a, b ∈ [0, 1], precision parameter α, max
consumption D function factor ζ , depth consumption D, and
Output: Minimum running time Mtime and the optimal margin η
set of degrees Mdegs Output: Approximate value of max(a, b)
1 ũ, Ṽ ← ComputeuV(2
1−α ; D) 1 Mdegs = {d1 , d2 , · · · , dk } ←
2 for j ← 0 to tmax do ComputeMinTimeDegs(α, ζ · 2−α , D − 1)
3 if ũ(j, D) ≥ δ = 1+
1−
then 2 p1 ← MP(R̃1−,1 ; d1 )
4 Mtime ← j 3 τ1 ← ME(R̃1−,1 ; d1 ) + η
5 Go to line 11 4 for i ← 2 to k do
6 end 5 pi ← MP(Rτi−1 ; di )
7 if j = tmax then 6 τi ← ME(Rτi−1 ; di ) + η
8 return ⊥ 7 end
(a−b)pk ◦pk−1 ◦···◦p1 (a−b)+(a+b)
9 end 8 return
2
10 end
11 Mdegs ← Ṽ (Mtime , D) ; // Mdegs : ordered set on the previous MinimaxMax algorithm, and we obtain the
12 return Mtime and Mdegs
set of degrees using the ComputeMinTimeDegs algorithm
instead of the ComputeMinMultDegs algorithm.
Algorithm 10: OptMinimaxComp(a, b; α, , D, η)
In addition, the authors of [17] proposed a method to
Input: Inputs a, b ∈ (0, 1), precision parameters α and precisely approximate the ReLU function using the approx-
, depth consumption D, and margin η imation of the sign function. This precise approximation of
Output: Approximate value of comp(a, b) the ReLU function is necessary to evaluate the pretrained
1 Mdegs = {d1 , d2 , · · · , dk } ←
convolutional neural networks on FHE. The ReLU and sign
ComputeMinTimeDegs(α, , D) functions have the following relationship
2 p1 ← MP(R̃1−,1 ; d1 )
x + x sgn(x)
3 τ1 ← ME(R̃1−,1 ; d1 ) + η ReLU(x) = .
4 for i ← 2 to k do 2
5 pi ← MP(Rτi−1 ; di ) Thus, the approximate polynomial r(x) for the ReLU func-
6 τi ← ME(Rτi−1 ; di ) + η tion can be implemented using the approximate polynomial
7 end p(x) for the sign function, as follows:
pk ◦pk−1 ◦···◦p1 (a−b)+1
8 return x + xp(x)
2 r(x) = . (3)
2
easily implemented using the sign function, that is, Then, r(x) should satisfy the following ReLU function
error condition for precision parameter α:
(a + b) + (a − b) sgn(a − b)
max(a, b) = . |r(x) − ReLU(x)| ≤ 2−α for any x ∈ [−1, 1]. (4)
2
Thus, the approximate polynomial for the max function The previous ReLU function algorithm that uses equa-
p̃(a, b) can be obtained from the approximate polynomial for tion (3) can be described as Algorithm 12, which we refer to
the sign function p(x) as as MinimaxReLU. The proposed ReLU function algorithm
(a + b) + (a − b)p(a − b) in Algorithm 13 improves on the previous ReLU function
p̃(a, b) = . algorithm MinimaxReLU, and we obtain the set of degrees
2
using the ComputeMinTimeDegs algorithm instead of the
Then, p̃(a, b) should satisfy the following max function ComputeMinMultDegs algorithm. It should be noted that
error condition for precision parameter α: the ReLU function algorithm uses the same value of the max
|p̃(a, b) − max(a, b)| ≤ 2−α for any a, b ∈ [0, 1]. (2) function factor ζ as the max function algorithm for a given
precision parameter α.
Because we have min(a, b) = a + b − max(a, b), the As explained in Section III, the proposed AppIME
approximate polynomial for the min. function p̂(a, b) can also algorithm makes it possible to apply the ComputehG
be easily obtained, that is, p̂(a, b) = a + b − p̃(a, b). algorithm for dmax = 63, which enables us to obtain
The previous homomorphic max function MinimaxMax a better set of degrees of component polynomials using
in [9] adopts the set of degrees of component polynomials ComputeMinMultDegs. Table 4 presents the optimal set
obtained by executing the ComputeMinMultDegs algo- of degrees Mdegs for max/ReLU functions and corresponding
rithm in Algorithm 4 for inputs α, ζ · 2−α , and D − 1, where minimum depth consumption Dmin for dmax = 31 and
ζ is the max function factor that can be determined experi- dmin = 63. From Table 4, it can be observed that the
mentally. The proposed algorithm in Algorithm 11 improves depth consumption is reduced by one when α is 16, 17,
26172 VOLUME 10, 2022

Algorithm 12: MinimaxReLU(x; α, ζ, D, η) [17] D using the ComputeMinTimeDegs algorithm instead of

Input: Inputs x ∈ [−1, 1], precision parameter α, max the ComputeMinMultDegs algorithm.
function factor ζ , depth consumption D, and
margin η VI. NUMERICAL RESULTS
Output: Approximate value of ReLU(x) In this section, numerical results of the proposed
1 Mdegs = {d1 , d2 , · · · , dk } ← OptMinimaxComp, OptMinimaxMax, and Opt
ComputeMinMultDegs(α, ζ · 2−α , D − 1) MinimaxReLU algorithms in Algorithms 10, 11, and 13,
2 p1 ← MP(R̃1−,1 ; d1 ) respectively, are presented. The performances of
3 τ1 ← ME(R̃1−,1 ; d1 ) + η the OptMinimaxComp, OptMinimaxMax, and
4 for i ← 2 to k do OptMinimaxReLU algorithms using the proposed
5 pi ← MP(Rτi−1 ; di ) ComputeMinTimeDegs algorithm are evaluated and
6 τi ← ME(Rτi−1 ; di ) + η compared with those of the MinimaxComp, MinimaxMax,
7 end and MinimaxReLU algorithms using the previous
8 return
xpk ◦pk−1 ◦···◦p1 (x) +x ComputeMinMultDegs algorithm. Numerical analyses
2 are conducted using the representative RNS-CKKS scheme
library SEAL [12] on an Intel Core i7-10700 CPU at
Algorithm 13: OptMinimaxReLU(x; α, ζ, D, η) 2.90 GHz in a single thread with an Ubuntu 20.04 LTS
Input: Inputs x ∈ [−1, 1], precision parameter α, max distribution.
function factor ζ , depth consumption D, and
A. PARAMETER SETTING
margin η
Output: Approximate value of ReLU(x) Precision parameters, and α, are the input and
1 Mdegs = {d1 , d2 , · · · , dk } ←
output precisions of the homomorphic comparison
ComputeMinTimeDegs(α, ζ · 2−α , D − 1) algorithm MinimaxComp or OptMinimaxComp in
2 p1 ← MP(R̃1−,1 ; d1 )
the RNS-CKKS scheme. We set = 2−α , which
implies that the input and output precisions are the
3 τ1 ← ME(R̃1−,1 ; d1 ) + η
same. By contrast, the homomorphic max function
4 for i ← 2 to k do
algorithms, MinimaxMax and OptMinimaxMax, and the
5 pi ← MP(Rτi−1 ; di )
homomorphic ReLU function algorithms, MinimaxReLU
6 τi ← ME(Rτi−1 ; di ) + η
and OptMinimaxReLU, use only the input precision
7 end
xpk ◦pk−1 ◦···◦p1 (x) +x parameter α. We set N = 216 . MinimaxComp,
8 return
2 MinimaxMax, MinimaxReLU, OptMinimaxComp,
OptMinimaxMax, and OptMinimaxReLU are performed
TABLE 4. Optimal set of degrees Mdegs and corresponding minimum
depth consumption for the homomorphic max/ReLU function for
simultaneously for N /2 tuples of real numbers. The
dmax = 31 and dmax = 63. Dmin is the minimum depth consumption for amortized running time is then obtained by dividing the
the homomorphic max and ReLU functions. running time by N /2.
1) SCALING VALUES AND MARGINS
We adopt the scaled Chebyshev polynomials T̃i (t) = Ti (t/w)
for a scaling value w > 1 as the basis polynomials. The scaled
Chebyshev polynomials are computed using the following
recursion:
T̃0 (x) = 1
T̃1 (x) = x/w
T̃i+j (x) = 2T̃i (x)T̃j (x) − T̃i−j (x) for i ≥ j ≥ 1.
The scaling values, w, and margins, η, are obtained
experimentally. The obtained scaling values and margins used
in our numerical analyses of the homomorphic comparison
operation and homomorphic max/ReLU functions are pre-
sented in Tables 5 and 6, respectively.
2) SCALING FACTOR
or 18, enabling one more non-scalar multiplication per If the output of the homomorphic comparison operation or
homomorphic max or ReLU function. homomorphic max function for one input tuple for two real
Furthermore, the proposed OptMinimaxMax and numbers a and b does not satisfy the comparison operation
OptMinimaxReLU algorithms minimize the running time error condition in (1) or the max function error condition
of the RNS-CKKS scheme for a given depth consumption in (2), then it is considered to have failed. In addition, if the
VOLUME 10, 2022 26173

TABLE 5. Set of scaling values and margins for the previous homomorphic comparison algorithm MinimaxComp and the proposed algorithm
OptMinimaxComp.
TABLE 6. Set of scaling values and margins for the previous homomorphic max/ReLU function algorithms MinimaxMax/MinimaxReLU and the proposed
algorithms OptMinimaxMax/OptMinimaxReLU.
TABLE 7. Comparison between the running time (amortized running time) of the previous homomorphic comparison algorithm MinimaxComp and that of
the proposed algorithm OptMinimaxComp in the RNS-CKKS scheme.
output of the homomorphic ReLU function for one real input p0 ≈ 260 . In the numerical analysis of the homomorphic
number x does not satisfy the ReLU function error condition comparison operation that consumes depth D, we set the
in (4), it is said to have failed. The homomorphic comparison maximum level L = D. We set q0 ≈ 260 and qj ≈ 1 to
operation, max function, or ReLU function is performed for 1 ≤ j ≤ L.
215 inputs for each α, and the number of failures is obtained.
The failure rate is the number of failures divided by the B. PERFORMANCE OF THE PROPOSED HOMOMORPHIC
total number of inputs, 215 . We set the scaling factor to COMPARISON ALGORITHM
be sufficiently large such that the homomorphic comparison The previous homomorphic comparison operation uses a
operation, max function, or ReLU function does not fail in set of degrees Mdegs from ComputeMinMultDegs for
any slot; the failure rate is said to be less than 2−15 in this dmax = 31. By contrast, the proposed homomorphic
case. We set the scaling factor 1 = 250 in all our numerical comparison operation obtains Mdegs for dmax = 63 from
analyses, and the number of failures is zero in all numerical ComputeMinTimeDegs. Depth consumption D should
results. satisfy D ≥ Mdep , where Mdep is the minimum depth
consumption obtained from the ComputeMinDep algo-
3) BASES WITH PRIME NUMBERS rithm. The set of degrees and running times (amortized
Bases with prime numbers B = {p0 , p1 , · · · , pk−1 } and running times) of the previous homomorphic comparison
C = {q0 , q1 , · · · , qL } should be selected. We set k = 1 and algorithm MinimaxComp and the proposed algorithm
26174 VOLUME 10, 2022

TABLE 8. Comparison between the running times (amortized running times) of the previous homomorphic max/ReLU function algorithms
MinimaxMax/MinimaxReLU and that of the proposed algorithms OptMinimaxMax/OptMinimaxReLU in the RNS-CKKS scheme.
OptMinimaxComp are presented in Table 7. It can be is optimized for the RNS-CKKS scheme using the proposed
observed that the proposed homomorphic comparison algo- fast algorithm for inverse minimax approximation error.
rithm reduces the running time by 6% on average compared We reduced the depth consumption of the homomorphic
to the previous algorithm. comparison operation (resp. max/ReLU functions) by one
Increasing the depth consumption D sometimes increases depth when α is 9 or 14 (resp. when α is 16, 17, or 18).
running time. In this case, a depth consumption greater than In addition, the numerical analysis demonstrated that the
D does not need to be used, and Table 7 does not include this proposed homomorphic comparison, max function, and
case. Table 7 also does not include cases in which the previous ReLU function algorithms reduced the running time by 6%,
and proposed algorithms use the same set of degrees, Mdegs . 7%, and 6% on average, respectively, compared with the
previous algorithms.
C. PERFORMANCE OF THE PROPOSED HOMOMORPHIC
MAX/ReLU FUNCTION ALGORITHM REFERENCES
As in the numerical analysis of the homomorphic [1] C. Gentry, ‘‘Fully homomorphic encryption using ideal lattices,’’ in
Proc. 41st Annu. ACM Symp. Symp. Theory Comput. (STOC), 2009,
comparison operation, the proposed homomorphic pp. 169–178.
max and ReLU function algorithms obtain Mdegs from [2] J. H. Cheon, A. Kim, M. Kim, and Y. Song, ‘‘Homomorphic encryption
ComputeMinTimeDegs for dmax = 63. The set of for arithmetic of approximate numbers,’’ in Proc. Int. Conf. Theory Appl.
Cryptol. Inf. Secur., 2017, pp. 409–437.
degrees and running times (amortized running times) of [3] F. Boemer, Y. Lao, R. Cammarota, and C. Wierzynski, ‘‘nGraph-HE: A
the previous homomorphic max/ReLU function algorithms graph compiler for deep learning on homomorphically encrypted data,’’ in
MinimaxMax/MinimaxReLU and the proposed algorithms Proc. 16th ACM Int. Conf. Comput. Frontiers, 2019, pp. 3–13.
[4] F. Boemer, A. Costache, R. Cammarota, and C. Wierzynski, ‘‘nGraph-
OptMinimaxMax/OptMinimaxReLU are presented in HE2: A high-throughput framework for neural network inference on
Table 8. It can be observed that the proposed homomorphic encrypted data,’’ in Proc. 7th ACM Workshop Encrypted Comput. Appl.
max and ReLU function algorithms reduce the running Homomorphic Cryptogr., 2019, pp. 45–56.
time by 7% and 6% on average, respectively, compared [5] X. Jiang, M. Kim, K. Lauter, and Y. Song, ‘‘Secure outsourced matrix
computation and application to neural networks,’’ in Proc. ACM SIGSAC
with the previous homomorphic max and ReLU function Conf. Comput. Commun. Secur., Oct. 2018, pp. 1209–1222.
algorithms. As in the numerical analysis of the homomorphic [6] F. Bourse, M. Minelli, M. Minihold, and P. Paillier, ‘‘Fast homomorphic
comparison operation, Table 8 does not include cases when a evaluation of deep discretized neural networks,’’ in Proc. Annu. Int.
Cryptol. Conf., 2018, pp. 483–512.
larger depth increases the running time or when the previous [7] J. H. Cheon, K. Han, A. Kim, M. Kim, and Y. Song, ‘‘A full RNS variant
and proposed algorithms use the same set of degrees Mdegs . of approximate homomorphic encryption,’’ in Proc. Int. Conf. Sel. Areas
Cryptogr., 2018, pp. 347–368.
[8] Y. Lee, J.-W. Lee, Y.-S. Kim, and J.-S. No, ‘‘Near-optimal polynomial
VII. CONCLUSION for modulus reduction using L2-norm for approximate homomorphic
We implemented the optimized homomorphic compari- encryption,’’ IEEE Access, vol. 8, pp. 144321–144330, 2020.
[9] E. Lee, J.-W. Lee, J.-S. No, and Y.-S. Kim, ‘‘Minimax approximation of
son, max function, and ReLU function algorithms for sign function by composite polynomial for homomorphic comparison,’’
the RNS-CKKS scheme using a composition of minimax IEEE Trans. Dependable Secure Comput., early access, Aug. 18, 2021,
approximate polynomials for the first time. We successfully doi: 10.1109/TDSC.2021.3105111.
[10] J.-P. Bossuat, C. Mouchet, J. Troncoso-Pastoriza, and J.-P. Hubaux,
implemented the algorithms on the RNS-CKKS scheme ‘‘Efficient bootstrapping for approximate homomorphic encryption with
with a low failure rate (< 2−15 ) and provided parameter non-sparse keys,’’ in Proc. Annu. Int. Conf. Theory Appl. Cryptograph.
sets according to the precision parameter α. In addition, Techn., 2021, pp. 587–617.
[11] Y. Lee, J.-W. Lee, Y.-S. Kim, H. Kang, and J.-S. No, ‘‘High-precision
we proposed a fast algorithm for the inverse minimax approximate homomorphic encryption by error variance minimization,’’
approximation error, which is a subroutine required to in Proc. Annu. Int. Conf. Theory Appl. Cryptograph. Techn. (Eurocrypt),
determine the optimal set of degrees. This algorithm enabled 2022.
us to determine the optimal set of degrees for a higher [12] (Apr. 2020). Microsoft SEAL (Release 3.5). Microsoft Research, Redmond,
WA, USA. [Online]. Available: https://github.com/Microsoft/SEAL
maximum degree than that in the previous study. Finally, [13] (Sep. 2020). Lattice Cryptography Library (Release 1.10.4). [Online].
we proposed a method to determine the set of degrees that Available: https://palisade-crypto.org/
VOLUME 10, 2022 26175

[14] (Jul. 2021). Lattigo V2.2.0. [Online]. Available: http://github.com/ldsec/ JOON-WOO LEE (Graduate Student Member,
lattigo IEEE) received the B.S. degree in electrical
[15] C. Cortes and V. Vapnik, ‘‘Support-vector networks,’’ Mach. Learn., and computer engineering from Seoul National
vol. 20, no. 3, pp. 273–297, 1995. University, Seoul, South Korea, in 2016, where
[16] J. A. Hartigan and M. A. Wong, ‘‘Ak-means clustering algorithm,’’ J. Roy. he is currently pursuing the Ph.D. degree. His
Stat. Society: Ser. C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979. current research interests include homomorphic
[17] J. Lee, E. Lee, J.-W. Lee, Y. Kim, Y.-S. Kim, and J.-S. No, ‘‘Precise encryption and lattice-based cryptography.
approximation of convolutional neural networks for homomorphically
encrypted data,’’ 2021, arXiv:2105.10879.
[18] J.-W. Lee, H. Kang, Y. Lee, W. Choi, J. Eom, M. Deryabin, E. Lee,
J. Lee, D. Yoo, Y.-S. Kim, and J.-S. No, ‘‘Privacy-preserving machine
learning with fully homomorphic encryption for deep neural network,’’
2021, arXiv:2106.07229. YOUNG-SIK KIM (Member, IEEE) received the
[19] J. H. Cheon, D. Kim, and D. Kim, ‘‘Efficient homomorphic comparison B.S., M.S., and Ph.D. degrees in electrical engi-
methods with optimal complexity,’’ in Proc. Int. Conf. Theory Appl. neering and computer science from Seoul National
Cryptol. Inf. Secur., 2020, pp. 221–256.
University, in 2001, 2003, and 2007, respectively.
[20] L. Ducas and D. Micciancio, ‘‘FHEW: Bootstrapping homomorphic
He joined the Semiconductor Division, Samsung
encryption in less than a second,’’ in Proc. Annual Int. Conf. Theory Appl.
Cryptograph. Techn., 2015, pp. 617–640. Electronics, where he worked in the research and
[21] I. Chillotti, N. Gama, M. Georgieva, and M. Izabachène, ‘‘TFHE: Fast development of security hardware IPs for various
fully homomorphic encryption over the torus,’’ J. Cryptol., vol. 33, no. 1, embedded systems, including modular exponenti-
pp. 34–91, 2020. ation hardware accelerator (called Tornado 2MX2)
[22] J. H. Cheon, D. Kim, D. Kim, H. H. Lee, and K. Lee, ‘‘Numerical method for RSA and elliptic-curve cryptography in smart-
for comparison on homomorphically encrypted numbers,’’ in Proc. Int. card products and mobile application processors with Samsung Electronics,
Conf. Theory Appl. Cryptol. Inf. Secur., 2019, pp. 415–445. until 2010. He is currently a Professor with Chosun University, Gwangju,
[23] I. Chillotti, D. Ligier, J.-B. Orfila, and S. Tap, ‘‘Improved programmable South Korea. He is also a Submitter for two candidate algorithms (McNie
bootstrapping with larger precision and efficient arithmetic circuits for and pqsigRM) in the first round for the NIST Post Quantum Cryptography
TFHE,’’ Cryptol. ePrint Arch., IACR, Tech. Rep. 2021/729, 2021. Standardization. His research interests include post-quantum cryptography,
[24] K. Kluczniak and L. Schild, ‘‘FDFB: Full domain functional boot- the IoT security, physical-layer security, data hiding, channel coding, and
strapping towards practical fully homomorphic encryption,’’ 2021, signal design. He is selected as one of the 2025’s 100 Best Technology
arXiv:2109.02731. Leaders (for Crypto-Systems) by the National Academy of Engineering of
[25] Z. Liu, D. Micciancio, and Y. Polyakov, ‘‘Large-precision homomorphic
Korea.
sign evaluation using FHEW/TFHE bootstrapping,’’ Cryptol. ePrint Arch.,
IACR, Tech. Rep. 2021/1337, 2021.
[26] W.-J. Lu, Z. Huang, C. Hong, Y. Ma, and H. Qu, ‘‘PEGASUS: Bridging
polynomial and non-polynomial evaluations in homomorphic encryption,’’ JONG-SEON NO (Fellow, IEEE) received the
in Proc. IEEE Symp. Secur. Privacy (SP), May 2021, pp. 1057–1073. B.S. and M.S.E.E. degrees in electronics engineer-
[27] A. Kim, A. Papadimitriou, and Y. Polyakov, ‘‘Approximate homomorphic ing from Seoul National University, Seoul, South
encryption with reduced approximation error,’’ Cryptol. ePrint Arch., Korea, in 1981 and 1984, respectively, and the
IACR, Tech. Rep. 2020/1118, 2020. Ph.D. degree in electrical engineering from the
[28] J.-W. Lee, E. Lee, Y.-W. Lee, and J.-S. No, ‘‘Optimal minimax University of Southern California, Los Angeles,
polynomial approximation of modular reduction for bootstrapping of CA, USA, in 1988. He was a Senior MTS with
approximate homomorphic encryption,’’ Cryptol. ePrint Arch., IACR, Hughes Network Systems, from 1988 to 1990.
Tech. Rep. 2020/552/20200803:084202, 2020.
He was an Associate Professor with the Depart-
[29] E. W. Cheney, Introduction to Approximation Theory. Providence, RI,
ment of Electronic Engineering, Konkuk Univer-
USA: AMS, 1966.
[30] J.-W. Lee, E. Lee, Y. Lee, Y.-S. Kim, and J.-S. No, ‘‘High-precision sity, Seoul, from 1990 to 1999. He joined the Faculty of the Department of
bootstrapping of RNS-CKKS homomorphic encryption using optimal Electrical and Computer Engineering, Seoul National University, in 1999,
minimax polynomial approximation and inverse sine function,’’ in Proc. where he is currently a Professor. His research interests include error-
Annu. Int. Conf. Theory Appl. Cryptograph. Techn., 2021, pp. 618–647. correcting codes, cryptography, sequences, LDPC codes, interference
alignment, and wireless communication systems. He became an IEEE
Fellow through the IEEE Information Theory Society, in 2012. He became
a member of the National Academy of Engineering of Korea (NAEK),
in 2015, where he served as the Division Chair for Electrical, Electronic,
and Information Engineering, from 2019 to 2020. He was a recipient of
EUNSANG LEE received the B.S. and the IEEE Information Theory Society Chapter of the Year Award in 2007.
Ph.D. degrees in electrical and computer From 1996 to 2008, he served as the Founding Chair for the Seoul Chapter of
engineering from Seoul National University, the IEEE Information Theory Society. He was the General Chair of Sequence
Seoul, South Korea, in 2014 and 2020, and Their Applications 2004 (SETA 2004), Seoul. He also served as the
respectively. He is currently a Postdoctoral General Co-Chair for the International Symposium on Information Theory
Researcher at the Institute of New Media and and Its Applications 2006 (ISITA 2006) and the International Symposium
Communications, Seoul National University. His on Information Theory, 2009 (ISIT 2009), Seoul. He served as the
current research interests include homomorphic Co-Editor-in-Chief for the IEEE JOURNAL OF COMMUNICATIONS AND NETWORKS,
encryption and lattice-based cryptography. from 2012 to 2013.
26176 VOLUME 10, 2022

Optimization of Homomorphic Comparison Algorithm o

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Optimization of Homomorphic Comparison Algorithm o

Uploaded by

Copyright:

Available Formats

Received December 14, 2021, accepted February 15, 2022, date of publication March 2, 2022, date of current version

March 11, 2022.

Optimization of Homomorphic Comparison

INDEX TERMS Cheon–Kim–Kim–Song (CKKS) scheme, fully homomorphic encryption (FHE),

I. INTRODUCTION A HE that allows all algebraic operations on encrypted

26164 VOLUME 10, 2022

VOLUME 10, 2022 26165

(j) (j) (j)

26166 VOLUME 10, 2022

TABLE 1. Required depth consumption and the number of non-scalar

Algorithm 2: ComputehG(τ ) [9]

VOLUME 10, 2022 26167

Algorithm 3: ComputeMinDep(α, ) [9] III. FAST ALGORITHM FOR INVERSE MINIMAX

26168 VOLUME 10, 2022

= 41, 720. Algorithm 6: StoreME(dmax , n̄)

VOLUME 10, 2022 26169

Output: An approximate value of IME(τ ; d)

26170 VOLUME 10, 2022

VOLUME 10, 2022 26171

Algorithm 9: ComputeMinTimeDegs(α, , D) Algorithm 11: OptMinimaxMax(a, b; α, ζ, D, η)

26172 VOLUME 10, 2022

Algorithm 12: MinimaxReLU(x; α, ζ, D, η) [17] D using the ComputeMinTimeDegs algorithm instead of

VOLUME 10, 2022 26173

26174 VOLUME 10, 2022

VOLUME 10, 2022 26175

26176 VOLUME 10, 2022

You might also like

Algorithm 3: ComputeMinDep(α, ) [9] III. FAST ALGORITHM FOR INVERSE MINIMAX

Algorithm 9: ComputeMinTimeDegs(α, , D) Algorithm 11: OptMinimaxMax(a, b; α, ζ, D, η)