You are on page 1of 22

Proceedings on Privacy Enhancing Technologies ; 2019 (2):187–208

Ágnes Kiss*, Masoud Naderpour, Jian Liu, N. Asokan, and Thomas Schneider

SoK: Modular and Efficient Private Decision


Tree Evaluation
Abstract: Decision trees and random forests are widely
used classifiers in machine learning. Service providers
1 Introduction
often host classification models in a cloud service and
Machine learning is pervasively being applied in various
provide an interface for clients to use the model remotely.
real-world scenarios. All major IT companies, including
While the model is sensitive information of the server,
Amazon, Apple, Google, Facebook and Microsoft, are
the input query and prediction results are sensitive infor-
working on and use machine learning technologies. First,
mation of the client. This motivates the need for private
a training phase takes place where a model is trained on
decision tree evaluation, where the service provider does
a large dataset so that later it can make predictions on
not learn the client’s input and the client does not learn
an input, which happens in the evaluation phase. One
the model except for its size and the result.
of the most commonly used predictive models are deci-
In this work, we identify the three phases of pri-
sion trees, whose extension to random forests – sets of
vate decision tree evaluation protocols: feature selec-
smaller decision trees trained on different random sub-
tion, comparison, and path evaluation. We systematize
sets of features – is considered to be among the most
constant-round protocols for each of these phases to
accurate classification models [DCBA14].
identify the best available instantiations using the two
In many scenarios, predictive models such as deci-
main paradigms for secure computation: garbling tech-
sion trees and neural networks are trained or queried
niques and homomorphic encryption. There is a natural
on sensitive data and should be handled in a pri-
tradeoff between runtime and communication consider-
vate manner [WGC18]. Solutions for private training
ing these two paradigms: garbling techniques use fast
of decision trees exist that have been heavily opti-
symmetric-key operations but require a large amount of
mized [LP00, LP02, VC05, VCKP08, BS09, FW12].
communication, while homomorphic encryption is com-
In this work, we focus on the private evaluation of
putationally heavy but requires little communication.
decision trees with security against passive adversaries.
Our contributions are as follows: Firstly, we systemat-
In this scenario, the server holds a decision tree and
ically review and analyse state-of-the-art protocols for
offers its evaluation as a service to its client with sen-
the three phases of private decision tree evaluation. Our
sitive input to the decision tree. Moreover, the model
methodology allows us to identify novel combinations
may leak information about the training data used for
of these protocols that provide better tradeoffs than ex-
its generation, which is a valuable asset of the server.
isting protocols. Thereafter, we empirically evaluate all
The server’s goal is to offer the service without compro-
combinations of these protocols by providing communi-
mising the client’s input, or the client learning anything
cation and runtime measures, and provide recommenda-
about the decision tree (beyond its size and the result).
tions based on the identified concrete tradeoffs.
Note that recent works [TZJ+ 16, PMG+ 17, SSSS17,
Keywords: Privacy-preserving protocols, secure compu- JSD+ 18] have shown that it is possible for an attacker
tation, garbling techniques, homomorphic encryption, who has only blackbox oracle access to prediction APIs
decision tree evaluation, machine learning of machine learning models to succeed in learning model
DOI 10.2478/popets-2019-0026
parameters by repeated adaptive queries to the API.
Received 2018-08-31; revised 2018-12-15; accepted 2018-12-16. Defending against such attacks is an active area of
research and several proposals have already appeared,
e.g., [TZJ+ 16, KMAM18, JSD+ 18]. We deem dealing
with blackbox attacks out of scope for this paper.
*Corresponding Author: Ágnes Kiss: TU Darmstadt,
E-mail: kiss@encrypto.cs.tu-darmstadt.de
Masoud Naderpour: University of Helsinki, E-mail: ma-
soud.naderpour@helsinki.fi N. Asokan: Aalto University, E-mail: asokan@acm.org
Jian Liu: University of California, Berkeley, E-mail: Thomas Schneider: TU Darmstadt, E-mail:
liu.jian@berkeley.edu (This work was done when the author schneider@encrypto.cs.tu-darmstadt.de
was in Aalto University)
SoK: Modular and Efficient Private Decision Tree Evaluation 188

1.1 Applications Sub-protocol PathH PathG


SelH + CompG HGH §4.1 HGG [BFK+ 09]
Cloud-hosted evaluation of machine learning models is SelG + CompG GGH §4.1 GGG [BFK+ 09]
SelH + CompH HHH [TMZC17] HHG §4.1
a widely used service that companies such as Amazon,
SelG + CompH × ×
Google or IBM provide to their customers. While doing
Table 1. Combinations of sub-protocols for private decision tree
so [Bar15], Amazon Web Services (AWS) recognizes that evaluation ABC = SelA + CompB + PathC . SelH /SelG denotes
their clients care about the privacy of their data [Ser18]. oblivious feature selection, CompH /CompG oblivious comparison,
These services can be enhanced by using private evalu- and PathH /PathG oblivious path evaluation with homomorphic
ation of machine learning models to allow for providing encryption (H) and garbling techniques (G).
a service while violating the privacy of neither the re-
sulting model nor the customers’ input features. Most
machine learning APIs such as BigML [Big18], ML- the most efficient constant-round private decision tree
JAR [MLJ17], Wise.io [Wis18], or Azure Machine Learn- evaluation protocols to date.
ing Studio [Mic18] support decision tree or random for- In more details, our contributions are as follows:
est classifiers. Decision tree evaluation has a tremendous Systematization and analysis of existing tech-
amount of applications, and is being used by companies niques (§3). Private decision tree evaluation consists
such as Facebook [IK17] and Microsoft [RG16]. of three sub-protocols: The first is private selection of
Private decision tree evaluation can provide a so- features that are to be compared with in each of the
lution for private medical diagnosis [TASF09], such decision nodes, the second is private comparison, and
as electrocardiogram classification [BFK+ 09, BFL+ 11], the third is private evaluation of the path correspond-
where the medical history and genomic data of the pa- ing to the given input in the decision tree. We analyze
tient are sensitive information. Decision trees and ran- the state-of-the-art techniques for these sub-protocols
dom forests can be applied for malware [AM18], and that use additively homomorphic encryption (H) as
text classification [RMD18]. Moreover, decision trees well as those using garbling techniques (G). We de-
and random forests have been successfully used for de- note the different approaches for oblivious selection,
tecting insider threat [MAAG15] and multimedia pro- comparison, and path evaluation by SelH /SelG (§3.1),
tocol tunneling [BSR18], as well as spam campaign- CompH /CompG (§3.2), and PathH /PathG (§3.3), re-
ers [GKG+ 18], and predicting the interruptability in- spectively, and describe their efficient instantiations.
tensity of mobile users [YGL17], data hotspots in cellu- Protocols for private decision tree evalua-
lar networks [NIZ+ 16], and even private interactions in tion (§4). We show that different sub-protocols can
underground forums [OTGM18]. Private evaluation of be combined efficiently, resulting in the six protocols
decision trees and random forests can enhance the pri- shown in Tab. 1. We adapt all resulting protocols to
vacy that these techniques provide, while protecting the an offline-online setting, where all operations indepen-
sensitive business information of the service provider. dent of the input features are precomputed in an offline
Recently, secure evaluation of branching programs, an phase, which results in a more efficient online phase
extension of decision trees, has also been utilized in a (§4.1). Thereafter, we present protocol extensions for
protocol for private set intersection [CO18]. some of our protocols (§4.2).
Empirical evaluation (§5). We evaluate the per-
formance of the resulting private decision tree evalua-
1.2 Outline and Our Contributions tion protocols and study the tradeoff between runtime
and communication on example real-world datasets. We
There exist two main paradigms for secure computation: show the competitive performance of general-purpose se-
homomorphic encryption, which often has little commu- cure two-party computation protocols for sub-protocols
nication but high computation complexity, and garbling of private decision tree evaluation that has been claimed
techniques, which require more communication, but uti- to be inefficient in [WFNL16, TMZC17].
lize efficient symmetric-key primitives. In this work, Improvements and recommendations (§6).
we systematically analyze the state-of-the-art constant- We show that our identified hybrid protocols that com-
round protocols for private decision tree evaluation that bine homomorphic encryption with garbling techniques
make use of these paradigms. We explore the tradeoffs achieve an order of magnitude runtime improvement
and combinations of the identified sub-protocols, com-
pare the performance of all combinations and identify
SoK: Modular and Efficient Private Decision Tree Evaluation 189

S Server (holds decision tree) evaluation T (x) means evaluating the comparisons at
C Client (holds input features) each decision node of a path, the result of which defines
T Decision tree the branch taken next. When a leaf node is reached on
n Dimension of feature vector
the path, the corresponding classification is outputted.
t Bitlength of a feature
m Number of decision nodes
The depth d of the decision tree is the length of the
m0 Number of depth-padded decision nodes longest path between the root and any leaf.
m Number of possibly depth-padded decision Additionally, we define depth-padding, where
nodes, i.e., m or m0 , depending on the un-
dummy decision nodes are introduced at every leaf un-
derlying protocol til depth d is reached in order for each path to have
d
Depth of decision tree
the same depth. A dummy node has one outgoing edge,
x = {x1 , . . . , xn }
Client’s feature vector
y = {y1 , . . . , ym } i.e., independently from the comparison result it leads
Server’s thresholds for (padded) decision nodes
κ
Symmetric security parameter (= 128) to the same node as shown in Fig. 1b. The number of
s
Statistical security parameter (= 40) these dummy nodes, which depends on the structure of
τsym
Size of symmetric ciphertext (= 128)
Pm−1
the tree, is at most i=1 i, i.e., the number of decision
τElGamalSize of ciphertext in lifted ElGamal (= 514)
nodes in the depth-padded tree is m ≤ m0 ≤ 12 m(m + 1).
τPaillier
Size of ciphertext in Paillier (= 4096)
τDGK
From here on, we use m whenever both m or m0 can be
Size of ciphertext in DGK encryption (= 2048)
Table 2. Notationsused throughout the paper. used in a protocol.
Real-world datasets. In this paper, we use the
real-world datasets summarized in Tab. 3 for which we
trained decision trees using scikit-learn [sld17]. These
over state-of-the-art protocols. We provide recommen-
datasets are taken mostly from the UCI machine learn-
dations for different settings based on our findings.
ing repository [Lic18]: the classical iris dataset for pat-
Related work (§7) and open questions. Fi-
tern recognition of the iris flower, the wine and (breast)
nally, we discuss our results in relation to related work.
cancer datasets for classification based on the chem-
The protocol based solely on garbling techniques can
ical analysis of wines and the characteristics of a di-
easily be extended to security against malicious clients
agnostic image, resp., as well as the boston dataset
(cf. §4.1). Tai et al. [TMZC17] propose the same (more
that includes house prices, the digits dataset that con-
costly) extension using zero-knowledge proofs for the
tains images of hand-written digits, which has been used
protocol based solely on homomorphic encryption. As
by NIST [GBC+ 97], and the diabetes dataset that in-
interesting future work, one could study and extend the
cludes automatic and paper records of diabetes patient
security of hybrid protocols against malicious clients.
glucose measurements for classification. The linnerud
dataset has been used in [Ten98] for regression and in-
cludes physiological measurements. The decision trees
2 Preliminaries are of varying sizes and depths which allows us to study
the behaviour of our identified protocol combinations.
In this section, we detail necessary preliminaries to our In Tab. 3, we show the values of n (number of input
work. The notations we use are summarized in Tab. 2. features), d (depth), m (number of decision nodes), and
m0 (number of depth-padded decision nodes).

2.1 Decision Trees


2.2 Cryptographic Techniques
A decision tree is a binary tree T with m internal
nodes called decision nodes, as shown in the exam- Private decision tree evaluation protocols can be con-
ple in Fig. 1a. Every leaf node is called a classifica- structed based on both paradigms for secure computa-
tion node and is associated with a classification value tion: homomorphic encryption and garbling techniques.
v = {v1 , . . . , vm+1 }. A decision tree has a threshold vec- We detail the techniques used in this work.
tor y = {y1 , . . . , ym } and receives an n-dimensional fea- Oblivious transfer. 1-out-of-2 oblivious trans-
ture vector x = {x1 , . . . , xn } as input. At the j th deci- fer (OT) allows a sender to obliviously send one of
sion node (j ∈ {1, . . . , m}), a comparison between an xi two messages m0 , m1 to a receiver according to the re-
and yj takes place (i ∈ {1, . . . , n}), where σ : m → n ceiver’s selection bit b, without the receiver learning any-
denotes the mapping for input selection. Decision tree thing about m1−b or the sender learning b. OT generally
SoK: Modular and Efficient Private Decision Tree Evaluation 190

col [Yao82, Yao86] is one of the most widely used solu-


? ?
x2 < y1 x2 < y1 tions for secure two-party computation. Yao’s protocol
0 1 0 1 works as follows: one of the parties acts as the garbler,
who garbles the circuit, i.e., assigns two random keys cor-
v1 ? ?
x1 < y2 x1 < y2 responding to 0 and 1 to each of its wires and encrypts
0 1 0 1 1 the gates using these keys. The garbler sends this gar-
0
bled circuit GC to the other party, the evaluator, who
v2 v3 v1 v2 v3
evaluates it after having obliviously received the keys
(a) Decision tree with m = 2. (b) Depth-padded decision tree with corresponding to its input bits from the garbler through
m0 = 3. OTs. The protocol therefore relies on OTs linear in the
input length of the evaluator and symmetric-key oper-
Fig. 1. Decision trees with depth d = 2.
ations linear in the size of the Boolean circuit describ-
ing the function. Various optimizations exist including
Dataset n d m m0 Source
iris 3 5 8 19 UCI [Lic18]
point-and-permute [BMR90], free-XOR [KS08a], fixed-
wine 7 5 11 26 UCI [Lic18] key AES garbling [BHKR13], and half-gates [ZRE15],
linnerud 3 6 19 47 [Ten98] while at the implementation level Single Instructions
(breast) cancer 12 7 21 66 UCI [Lic18] Multiple Data (SIMD) operations can be used [BJL12].
digits 47 15 168 1,161 UCI [Lic18] All these state-of-the-art optimizations are implemented
diabetes 10 28 393 6,432 UCI [Lic18]
in the ABY framework [DSZ15], which we use in our im-
boston 13 30 425 6,768 UCI [Lic18]
plementation. With all these optimizations, each AND
Table 3. Example decision trees of varying sizes with n input
features, m decision nodes, m0 padded decision nodes, and d gate of the circuit needs 2τsym bits of communication.
depth for real-world datasets, trained using scikit-learn [sld17] Additionally, we use the point-and-permute tech-
which contains a library with these datasets. nique in our protocols [BMR90]. In the classic garbling
scheme, each wire i has two labels (ki0 for 0 and ki1 for 1)
associated with it. In order to hide the truth values of
requires expensive public-key operations [IR89]. How- these wires with minimal overhead, a color bit can be
ever, a small (symmetric security parameter κ) number appended to each label (opposite color bits for 0 and 1).
of base OTs can be extended to any polynomial num- The relation of the color bits and the truth values are
ber of OTs efficiently using only symmetric-key oper- known only to the garbler, while the evaluator holds one
ations [IKNP03]. Improvements to such OT extension label from each wire, the color bit of which it can use to
protocols have been presented with security against pas- proceed with evaluation. In the end of the protocol, the
sive [ALSZ13] and active adversaries [ALSZ15, KOS15]. evaluator possesses the wire labels for the output value,
OTs can be efficiently precomputed [Bea95], which is which have the color bits appended to them.
suited for an offline-online scenario, where the parties We use the following notations for the garbling
can precompute parts of the protocol before providing scheme: (GC, out0 , out1 ) ← Garble(C, in0 , in1 ) denotes
their inputs to the computation: During precomputa- the garbling method that returns the garbled circuit GC,
tion in the offline phase, it is possible to run random including the output wire keys (out0 , out1 ) for both
OTs on random inputs, which are then used to mask the 0 and 1 values, respectively. It gets as input circuit C
the actual messages and choice bits. To retrieve the out- and input wire key pairs (in0 , in1 ) for the evaluator’s in-
put of the OT in the online phase, the parties blind put wires. (GC, color) ← Garble’(C, in0 , in1 ) denotes
their messages using inexpensive XOR operations, and the same garbling method but returns the color bits
require |m0 | + |m1 | + 1 bits of communication, whereas along with GC. out ← Eval(GC, in) denotes GC evalu-
all cryptographic operations are shifted to the offline ation where the evaluator inputs his input wire keys in
phase, which requires τsym bits of communication per (received via OTs from the garbler), and receives the out-
random OT [ALSZ13]. put wire keys out corresponding to the output bits. The
Yao’s garbled circuit protocol. Secure two- actual output remains oblivious to the evaluator, since
party computation allows two parties to compute an he does not learn the values the wire keys correspond to.
arbitrary functionality on their private inputs without out’ ← Eval’(GC, in) denotes a similar method, where
learning anything about the other party’s input (be- the evaluator receives blinded output bits out’, i.e., the
yond the output). Yao’s garbled circuit (GC) proto- color bits of the outputs.
SoK: Modular and Efficient Private Decision Tree Evaluation 191

Homomorphic encryption. Additively homomor-


Ideal functionality FPDTE .
phic encryption allows anyone holding the public key pk
to calculate [x1 + x2 ] = [x1 ]  [x2 ], given two cipher- 1. Private input of S : decision tree T .
texts [x1 ] := Encpk (x1 ) and [x2 ] := Encpk (x2 ). Multiply- 2. Private input of C : inputs x = {x1 , . . . , xn }.
ing [c · x] = c [x] with constant c is supported since this 3. Compute: evaluate v = T (x).
is equal to adding [x] with itself c times, which can be 4. Output of C : classification result v.
efficiently computed using the double-and-add method.
We use semantically secure additively homomorphic
encryption schemes. More specifically, we use Paillier
Fig. 2. FPDTE – Ideal functionality for private decision tree evalu-
encryption with packing [Pai99, DJ01], DGK encryp-
ation (PDTE). A decision tree T is defined by its topology, input
tion [DGK08], and Lifted ElGamal encryption [ElG85]. selection σ : m → n, thresholds y = {y1 , . . . , ym } and classifica-
With a plaintext space of ZN , the ciphertext space of tion values v = {v1 , . . . , vm+1 }.
Paillier and DGK encryption are Z∗N 2 and Z∗N , respec-
tively, where N is an RSA modulus. Paillier encryp-
Ideal functionality FSel , given a secret sharing
tion allows for packing of plaintexts and ciphertexts
scheme S1 .
when the encrypted values are smaller than the plaintext
space. We use ciphertext packing since the ciphertext [x]
1. Private input of S : selection σ : m → n.
in our case is much larger than the plaintext x. Thus,
2. Private input of C : inputs x = {x1 , . . . , xn }.
we can pack n ciphertexts [x1 ], ..., [xn ] into a single one:
3. Compute: z = {xσ(1) , . . . , xσ(m) }, i.e., select
[X] = [xn ||xn−1 ||...||x1 ], and reduce the communication
m values from the n inputs with selection σ.
from n · τPaillier to n · τPaillier /|xi |.
4. Output of S , C : secret S1 shares of z.
Lifted ElGamal encryption is similar to the classical
ElGamal scheme, with the exception that the plaintext
message is in the exponent, i.e., m is replaced by g m ,
Fig. 3. FSel – Ideal functionality for obliviously selecting m ele-
where g is the group generator. This allows for additive
ments from n ≤ m elements.
homomorphism but makes decryption more difficult due
to the discrete logarithm that needs to be solved. Lifted
ElGamal can be instantiated using elliptic curve (EC) two protocols for privately selecting features from n in-
cryptography, where the ciphertext is represented by put features that are assigned to a decision node to be
two EC points. This scheme is thus very efficient but compared at (selection phase). In §3.2, we recapitulate
can only be used when the encrypted plaintext values two sub-protocols for private comparison (comparison
are in a known small subset of the plaintext space. phase): one based on garbled circuits and another one
based on additively homomorphic encryption. There-
after in §3.3, we provide protocols for obliviously eval-
uating the path to the classification leaf based on the
3 Protocol Building Blocks results of the previous phase (path evaluation phase).
In all sub-protocols presented in this section, the
Private decision tree evaluation is run between two par-
number of decision nodes in decision tree T is ei-
ties: a server S holding a classification tree T (with
ther m or the number of decision nodes m0 after depth-
threshold vector y, input selection σ and classification
padding (cf. §2.1). This is according to the approach cho-
values v) and a client C providing its input feature vec-
sen for path evaluation (§3.3). We use m decision nodes
tor x. After the protocol evaluation, C obtains the classi-
in our description, where possibly m = m or m = m0 .
fication result, without learning any information about
For completeness, we give in §A the communication
the tree beyond its size and the output, and S learns
of all sub-protocols in Tab. 5 and their asymptotic com-
nothing about C ’s input features. The corresponding
putation complexities in Tab. 6.
ideal functionality FPDTE is depicted in Fig. 2.
In this section, we recapitulate the state-of-the-art
sub-protocols for private decision tree evaluation with
security against semi-honest adversaries. We describe
each protocol and refer to the original publications for
specific details on each construction. In §3.1, we detail
SoK: Modular and Efficient Private Decision Tree Evaluation 192

tical blinding using fresh randomness (pads are longer


Ideal functionality FComp , given two secret sharing
than the plaintext value by s bits, where s is the sta-
schemes S1 , S2 .
tistical security parameter). S then sends these to C ,
1. Private inputs of S , C : secret S1 shares of the who decrypts them, retrieving a blinded selection of its
selection xσ(1) . . . , xσ(m) . inputs. In this sub-protocol, we use either Paillier en-
2. Private input of S : y = {y1 , . . . , ym }. cryption with ciphertext packing for the m ciphertexts
3. Compute: for i ∈ {1, . . . , m}: ci = 1 if to reduce the communication to (n + m/(t + s)) · τPaillier
xσ(i) < yi , else ci = 0. bits (cf. §2.2), or DGK encryption with smaller cipher-
4. Output of S , C : secret S2 shares of c1 , . . . , cm . texts and (n + m) · τDGK bits of communication.

Selection using a garbled selection net-


work (SelG , Fig. 7). An alternative for obliviously
Fig. 4. FComp – Ideal functionality for obliviously comparing m
selecting the features is evaluating a selection net-
pairs of elements.
work [KS08b, BFK+ 09], which is based on Waksman’s
permutation network [Wak68]. Firstly, S sends the GC
Ideal functionality FPath , given secret sharing corresponding to the selection network to C . They then
scheme S2 . perform nt OTs for each bit of the client’s input. In
most cases, n ≤ m, since features are used more than
1. Private inputs of S , C : secret S2 shares of the once. The size of a selection network that maps n ≤ m
comparison results {c1 , . . . , cm }. n
t-bit elements to m elements is t · Sm≥n = t · (0.5(n +
2. Private input of S : decision tree T . m) log(n) + m log(m) − n + 1) [KS16, Appendix C]. The
3. Compute: evaluation of the decision tree path communication of this protocol is (n + 2Sm≥n n )t · τsym
using the comparison results resulting in leaf v. bits offline and nt(2τsym + 1) bits online.
4. Output of S : classification result v.

3.2 Comparison Phase


Fig. 5. FPath – Ideal functionality for oblivious path evaluation.
This section details two oblivious comparison protocols,
the ideal functionality FComp of which is given in Fig. 4.
3.1 Selection Phase
The outputs of S , C are secret shares of the result. The
first uses homomorphic encryption (instantiating both
The selection phase obliviously selects from the n fea-
SelH and CompH ), the second uses Yao’s GC (CompG ).
tures of the client C a value for each of the m decision
Comparison (including selection) using addi-
nodes (where they are to be compared to thresholds) ac-
tively homomorphic encryption (SelH + CompH ,
cording to the selection σ : m → n of the server S . The
Fig. 8). This comparison protocol, often referred
ideal functionality FSel that describes this selection is
to as the DGK comparison protocol, was proposed
shown in Fig. 3. We identify two ways to instantiate it,
in [DGK07, DGK08, DGK09] and used for private deci-
one based on additively homomorphic encryption (SelH ),
sion tree evaluation in [WFNL16, TMZC17]. Since the
and another one based on the secure evaluation of a
comparison protocol is performed using homomorphi-
Boolean selection network using garbled circuits (SelG ).
cally encrypted bits of the client’s input and known
In both protocols, the outputs of the parties are secret
thresholds, the selection phase SelH can be realized
shares of the result: none of the parties learns any infor-
within this protocol without additional overhead. Thus,
mation about the output, but they are able to continue
it already includes the oblivious selection protocol
secure computation with it.
of §3.1 for free (cf. Tab. 1). It is observed in [DGK07]
Selection using additively homomorphic en-
that x < y is true if and only if ∃i ∈ {1, . . . , t} such
cryption (SelH , Fig. 6). Additively homomorphic en- P
that fi (x, y) = xi − yi + 1 + 3 j<i (xj ⊕ yj ) = 0. C en-
cryption has been used to perform oblivious selection
crypts each bit of its input with an additively homo-
in [BPSW07, BFK+ 09]. In the protocol, the client C
morphic encryption scheme (cf. §2.2), and sends the
encrypts its inputs, and sends the ciphertexts to the
ciphertexts to S , who chooses the appropriate feature
server S . S then selects the values according to selec-
to compare with at each decision node and generates
tion σ and homomorphically blinds these with statis-
a random bit ak for each comparison (k ∈ {1, . . . , m}).
SoK: Modular and Efficient Private Decision Tree Evaluation 193

Server S Client C
Input: m, selection σ, public key pk Input: x = (x1 , . . . , xn ), key pair pk, sk
Output: r1 , . . . , rm random Output: z1 , . . . , zm blinded values

Offline
Generate r1 , . . . , rm random (t + s)-bit values

[x1 ], . . . , [xn ] For l ∈ {1, . . . , n} : [xl ] ← Encpk (xl )


For i ∈ {1, . . . , m} ←−−−−−−−−−−−−

Online
[zi ] ← [xσ(i) + ri ] [z1 ], . . . , [zm ] For i ∈ {1, . . . , m} :
−−−−−−−−−−−−→ zi ← Decsk ([zi ])
Return r1 , . . . , rm Return z1 , . . . , zm

Fig. 6. SelH – Oblivious selection using additively homomorphic encryption.

Server S Client C
Input: m, selection σ Input: x = (x1 , . . . , xn )
Output: K0 = (K1,10 , . . . , K 0 ), K1 = (K 1 , . . . , K 1 )
m,t 1,1 m,t
Output: K = (K1 , . . . , Km,t ) keys

Generate circuit C of selection network realizing σ : m → n

Offline
0 , . . . , K 0 ), (K 1 , . . . , K 1 )) ←
(GC, (K1,1 GC
m,t 1,1 m,t
Garble(C, (s01,1 , . . . , s0n,t ), (s11,1 , . . . , s1n,t )) −−−−−−−−−−−−→

For i ∈ {1, . . . , n} For i ∈ {1, . . . , n}


For j ∈ {1, . . . , t} For j ∈ {1, . . . , t}

Online
s0i,j , s1i,j −−−−−−−−−→ ←−−−−−−−−− xi,j
OT x
−−−−−−−−−→ si,ji,j
x1,1 xn,t
Return K0 = 0 , . . . , K 0 ), K1
(K1,1 m,t
= 1 , . . . , K1 )
(K1,1 m,t
Return (K1,1 , . . . , Km,t ) ← Eval(GC, (s1,1 , . . . , sn,t ))

Fig. 7. SelG – Oblivious selection using a garbled selection network.

Then, the above comparison is performed that outputs tween two values are performed as in plain decision tree
a blinded ciphertext [ri · fi (x, y)] for each bit of each evaluation. We compare with this protocol in §4.2.
feature. If any of these ciphertexts decrypts to zero, Comparison using garbled circuits (CompG ,
then xi < yi . The comparison bit is not revealed to the Fig. 9). All oblivious comparisons can also be per-
client in clear, but is secret shared between the parties formed with generic secure two-party computation pro-
(S holds random bit ak , C holds the result blinded by ak , tocols. This method has been used for comparison
i.e., bk = ak ⊕ [xσ,k < yk ]). The efficient lifted ElGamal in [BPSW07, BFK+ 09]. Yao’s garbled circuit is partic-
encryption scheme can be used here (cf. §2.2), since the ularly efficient for private comparison, and requires t
client needs only to check if the result is g 0 or not. The AND gates per t-bit comparisons [KSS09]. Therefore, m
communication in this variant is (n + m)t · τElGamal bits. comparisons require 2mt·τsym bits communication. Vari-
Joye and Salehi [JS18] present an optimization of ant (a) of this comparison protocol outputs the Boolean
the DGK comparison protocol that improves the com- shares of the output of the comparisons, i.e., the color
munication roughly by factor two when comparing two bits in Yao’s GC protocol (cf. §2.2). Variant (b) out-
values with each other. However, as opposed to the DGK puts the keys that correspond to the output wires: the
comparison protocol presented in Fig. 8, it does not al- server holds key pairs for both 0 and 1, while the client
low for including the selection step when more values holds the keys for the output wires (without knowing
are compared. This incurs an overhead when the client the values they correspond to).
encrypts its elements, i.e., it now needs to encrypt m
instead of n elements. However, Joye and Salehi also
propose a private decision tree evaluation protocol that
uses their protocol such that only d comparisons be-
SoK: Modular and Efficient Private Decision Tree Evaluation 194

Server S Client C
Input: y = (y1 , . . . , ym ), selection σ, public key pk Input: x = (x1 , . . . , xn ), key pair pk, sk
Output: a1 , . . . , am share Output: b1 , . . . , bm share

Generate a1 , . . . , am random bits For i ∈ {1, . . . , n}


For j ∈ {1, . . . , t}
[x1,1 ], . . . , [xn,t ] [xi,j ] = Encpk (xi,j )
For k ∈ {1, . . . , m} ←−−−−−−−−−−−−

Online
For i ∈ {1, . . . , t}
rk,i ← Z∗p , [ck,i ] = [rk,i · (fk,i (xσ , y) − 2ak )] [c1,1 ], . . . , [cm,t ] For k ∈ {1, . . . , m}
Return (a1 , . . . , am ) −−−−−−−−−−−−→ If ∃i ∈ {1, . . . , t} : Decsk ([ck,i ]) = 0 then bk = 1
else bk = 0
Return (b1 , . . . , bm )

Fig. 8. SelH +CompH – Oblivious selection and comparison using additively homomorphic encryption from [WFNL16, TMZC17].

Server S Client C
Input: y = (y1 , . . . , ym ), K0 , K1 Input: K = (K1,1 , . . . , Km,t ) keys
Output: (a) a = (a1 , . . . , am ) share Output:(a) b = (b1 , . . . , bm ) share
Output: (b) k0 = (k10 , . . . , km
0 ), k1 = (k 1 , . . . , k 1 ) keys
1 m
Output: (b) k = (k1 , . . . , km ) keys

Generate C of m t-bit comparisons with y thresholds

Offline
(a) (GC, (a1 , . . . , am )) ← Garble’(C, K0 , K1 ) GC
0 ), (k 1 , . . . , k 1 )) ← Garble(C, K0 , K1 ) −
(b) (GC, (k10 , . . . , km −−−−−−−−−−−→
1 m

Online
Return (a) color bits a = (a1 , . . . , am ), Return (a) b = (b1 , . . . , bm ) ← Eval’(GC, K)
Return (b) k0 = (k10 , . . . , km
0 ), k1 = (k 1 , . . . , k 1 )
1 m
Return (b) k = (k1 , . . . , km ) ← Eval(GC, K)

Fig. 9. CompG – Oblivious comparison using garbled circuits.

3.3 Path Evaluation Phase can decrypt the classification label. This path evaluation
protocol has (3m + 2) · τElGamal bits of communication.
This section describes two path evaluation protocols, the Path evaluation using a garbled decision
first using homomorphic encryption (PathH ) and the tree (PathG , Fig. 11). The method for garbled deci-
second using a garbled decision tree (PathG ). The ideal sion tree (GDT) evaluation is analogous to Yao’s gar-
functionality FPath is given in Fig. 5. bled circuit technique [Yao86] described in §2.2, but
Path evaluation using additively homomor- Eval(GDT, in) returns the output classification in the
phic encryption (PathH , Fig. 10). The path eval- clear. Our description is similar to that of [BPSW07]
uation protocol of [TMZC17] is highly optimized and that has been improved in [Sch08, BFK+ 09, BFL+ 11].
depends only on the number of decision nodes m. The It relies on the idea of encrypting two keys at each de-
client C encrypts his share of the comparison bit for each cision node, one to the left and one to the right child
decision node, and sends it to the server. The server S nodes, along with their node indices, such that the eval-
homomorphically computes the encryption of the actual uator C cannot deviate from the path corresponding to
comparison bit (by removing the blinding), which is the the comparison results with his input vector x. We in-
edge cost of the edge from the decision node that corre- troduce dummy decision nodes for hiding the length
sponds to 0 (left). The other edge, corresponding to 1 of the path to the classification node. Revealing this
(right), gets the opposite of this bit as its edge cost. For length would reveal information about the topology of
each leaf node its path cost is computed by summing the tree, especially when multiple protocol runs with
up the edge costs along its path from the root. This re- the same model are possible. Alternatively, one can use
quires O(m) operations since intermediate values can be full decision trees [BPSW07], but then the overhead is
reused. These and the classification labels on each leaf exponential in the depth d. This is much higher for
node are then blinded by S using fresh randomness, and large trees than using depth-padding with m0 decision
sent to the client C . If the path cost decrypts to zero, C nodes (cf. §2.1) as shown in Fig. 12. Moreover, we re-
SoK: Modular and Efficient Private Decision Tree Evaluation 195

Server S Client C
Input: a = (a1 , . . . , am ), public key pk Input: b = (b1 , . . . , bm ), key pair pk, sk
Output: ⊥ Output: classification v

Offline
For j ∈ {1, . . . , m + 1} : rj ←R Z∗p , rj0 ←R Z∗p

[bi ] For i ∈ {1, . . . , m} : [bi ] ← Enc(bi )


For i ∈ {1, . . . , m} ←−−−−−−−−−−−−
[Bi ] ← [bi ⊕ ai ], [eci,0 ] ← [Bi ], [eci,1 ] ← [1 − Bi ]

Online
For j ∈ {1, . . . , m + 1}
P
[pcj ] ← [ e ∈P ecj,k ], [pcj ] ← [rj ·pcj ], [vj ] ← [rj0 ·pcj +vj ] Permuted [pcj ], [vj ] For j ∈ {1, . . . , m + 1}
j,k
−−−−−−−−−−−−→ If Decsk ([pcj ]) = 0 then
Return v ← Decsk ([vj ])

Fig. 10. PathH – Oblivious path evaluation using additively homomorphic encryption from [TMZC17].

Server S Client C
Input: k0 = (k10 , . . . , km
0 ), k1 = (k 1 , . . . , k 1 )
0 1 m0
Input: k = (k1 , . . . , km0 )
Output: ⊥ Output: classification v

Offline
DT ← DepthPad(DT ), GDT ← Garble(DT , k0 , k1 ) GDT
−−−−−−−−−−−−→

Online
Return v ← Eval(GDT, k)

Fig. 11. PathG – Oblivious path evaluation using a garbled decision tree.

veal only the depth-padded number of decision nodes m0 ,


and nothing about m since both of them would reveal
4 Protocols for Private Decision
information about the tree topology. Tree Evaluation
The server randomly permutes the nodes and gar-
bles the decision tree so that each node has an index i In this section, we discuss all possible combinations of
and an encryption key ki corresponding to it. At each the sub-protocols from §3 that result in private deci-
node, the keys and indices of the child nodes are stored sion tree evaluation protocols. We assess these in Tab. 4,
in an encrypted manner using the key of the node and which is an extended version of Tab. 1. In addition to
keys ki0 or ki1 depending on the comparison result. This the sub-protocols, we recapitulate the number of online
means that at node i, Encki ,k0 (kl ||l) and Encki ,k1 (kr ||r) rounds and the leakage about the model. In private de-
i i
are stored, where l is the index of the node on the left cision tree evaluation, the client only learns the result
and r is the index of the node on the right. The order of and some information about the size of the server’s deci-
the two ciphertexts is permuted according to the color sion tree: either the number of decision nodes m or the
bits of ki0 and ki1 . The server sends GDT to the client, number of padded decision nodes m0 and potentially
who can evaluate it by decrypting one path leading to the depth d. We adapt all resulting protocols with secu-
the leaf node storing the classification. The communi- rity against passive adversaries to an offline-online set-
cation in this protocol is sending the GDT , each node ting. We note that it is possible to precompute parts
of which stores two τsym -bit values and two indices, i.e., of the protocols even if not shown in the respective fig-
2m0 · (τsym + log2 (m0 + 1)). An extension of this proto- ures. In sub-protocols based on homomorphic encryp-
col presented in §4.2 reduces the client’s computation tion, we can precompute encryptions independent from
to O(dt) symmetric key operations. the client’s input, while in sub-protocols based on gar-
bling techniques, we can perform the base OTs and pre-
compute OTs (cf. §2.2).
SoK: Modular and Efficient Private Decision Tree Evaluation 196

Protocol m Selection Interface Comparison Interface Path evaluation Online rounds Leakage
(Sel∗ ) (Comp∗ ) (Eval∗ )
[BPSW07] m H G G 4 m, dpath
[WFNL16] m H (Fig. 8) H (Fig. 8) H 6 m
HGG [BFK+ 09] m0 H (Fig. 6) G (Fig. 13) G (Fig. 9) G (Fig. 11) 4 m0 , d
GGG [BFK+ 09] m0 G (Fig. 7) G (Fig. 9) G (Fig. 11) 2 m0 , d
HHH [TMZC17] m H (Fig. 8) H (Fig. 8) H (Fig. 10) 4 m
HHG m0 H (Fig. 8) H (Fig. 8) G (Fig. 14) G (Fig. 11) 4 m0 , d
HGH m H (Fig. 6) G (Fig. 13) G (Fig. 9) H (Fig. 10) 6 m
GGH m G (Fig. 7) G (Fig. 9) H (Fig. 10) 4 m
Table 4. Concrete protocols based on the presented sub-protocols for selection, comparison and path evaluation, where m denotes
either the number of decision nodes m or padded decision nodes m0 in the protocols. Round complexity and leakage of different
private decision tree evaluation protocols based on homomorphic encryption H and garbling techniques G.

1010 to garbling techniques. Garbling techniques have been


2d (Wu et al. [WFNL16])
109 m0 (depth-padded)
used for oblivious selection SelG (Fig. 7) [BFK+ 09]
m (Tai et al. [TMZC17]) and comparison CompG (Fig. 9) [BFK+ 09]. Thereafter,
108
these techniques have been claimed to be inefficient
107
in [WFNL16, TMZC17]. However, our experiments in §5
106
show that they have a very efficient online phase, and
105
perform better than the method based only on homo-
104 morphic encryption [TMZC17] with respect to total run-
103 time in most cases as well.
102 Path Evaluation Strategy. As discussed in §3.3, the
101 path evaluation sub-protocol of [TMZC17] using homo-
100 morphic encryption via PathH (Fig. 10) depends only
on the number of decision nodes m and is therefore es-
s

ts

n
ce
iri

te
in

ru

o
gi

st
e
w

n
ne

di

ab
ca

bo

pecially efficient with sparse and large decision trees.


lin

di

When this sub-protocol is used, m = m in all preceding


Fig. 12. Number of decision nodes in the different protocols for
the decision trees trained on real-world datasets from Tab. 3. sub-protocols. This has been a tremendous improvement
Note that the y-axis is in logarithmic scale. for real-world decision trees over the path evaluation
method of Wu et al. [WFNL16], which depends expo-
nentially on the depth d of the tree, due to expanding
Security guarantees. We combine sub-protocols se-
the tree to a full decision tree.
cure against passive adversaries with secret shared
As opposed to this, in PathG (Fig. 11) the paths
output between the two parties. The outputs of our
need to have the same length in order to hide the tree
primitives for selection (SelH and SelG ) and compari-
topology. Therefore, the tree is depth-padded to m = m0
son (CompH and CompG ) as well as the inputs for com-
decision nodes. m0 depends on the topology of the deci-
parison and path evaluation (PathH and PathG ) are se-
sion tree, i.e., the level of all leaves (cf. §2.1).
cret shared as indicated in Figs. 3-5, i.e., the parties
We depict in Fig. 12 the different numbers of de-
do not learn any information by themselves. The secret
cision nodes that are used in state-of-the-art proto-
sharing technique is either XOR-sharing, additive blind-
cols and our depth-padded approach, i.e., the orig-
ing or garbled inputs/outputs with the corresponding
inal number of decision nodes m (PathH , Fig. 10,
key(s). Therefore, our sequentially composed protocols
Tai et al. [TMZC17]), the depth-padded number of de-
remain secure against passive adversaries.
cision nodes m0 (PathG , Fig. 11), and the number of
Selection and Comparison Strategy. For oblivi-
decision nodes in a full tree 2d (Wu et al. [WFNL16]).
ous selection and comparison, the state-of-the-art pro-
tocols use additively homomorphic encryption SelH
(Fig. 6) [BFK+ 09] and CompH (Fig. 8) [TMZC17]. This
strategy is beneficial when it comes to communication
and storage, but implies an increased runtime compared
SoK: Modular and Efficient Private Decision Tree Evaluation 197

4.1 Protocol Combinations the output. After the comparison, the path evaluation
PathH of [TMZC17] can be used as before (cf. Fig. 10).
In this section, we describe all protocols resulting from Combining homomorphic selection and garbled com-
valid combinations of sub-protocols presented in §3, and parison (SelH →CompG , Fig. 13). The result of SelH
adapt them to the offline-online setting. Here, all oper- in Fig. 6 is a set of blinded (t + s)-bit values for the
ations that depend only on the input of the server S ’s client C , and the blinding s-bit values for the server S .
decision tree are performed in the offline phase, whereas In order to perform the comparison on t-bit values, S
all operations that depend on the client C ’s input are per- and C must unblind the result using a GC for subtract-
formed in the online phase. Protocols that are efficient ing each random value obliviously [BFK+ 09]. The sub-
in the online phase are especially important in appli- tractions are performed on t-bit values, since the most
cations where the client uses a computationally weak significant s bits of the blinded values can be dropped.
device. Most of the computation can be done offline, The keys corresponding to the result are used in CompG
when the device is idle, and the query itself triggers (cf. Fig. 9). Overall, the m garbled subtraction circuits
the online phase, which is performed more efficiently. require 2mt · τsym bits offline communication, mt · τsym
We show that protocols based on garbling techniques bits for precomputing random OTs, and mt(2τsym + 1)
are especially efficient in this setting, whereas the sub- bits online communication for the OTs.
protocols based on homomorphic encryption require the GGH: SelG + CompG + PathH . Instead of in-
input directly at the start and therefore only few of the stantiating the selection with homomorphic encryption,
computationally intense tasks can be performed offline. one can use SelG , a selection network (with O(mt log m)
ABC denotes the protocol that uses SelA for selection, complexity) using garbled circuits (cf. Fig. 7). Here,
CompB for comparison and PathC for path evaluation. most of the selection phase can be computed offline
HHH: SelH + CompH + PathH [TMZC17]. due to precomputed OTs and the independence of the
The state-of-the-art protocol using only homomorphic GC from the client’s inputs. Then, CompG (Fig. 9)
encryption for all three sub-protocols was presented can be directly performed, and the path evaluation sub-
in [TMZC17], the comparison protocol of which was first protocol PathH of [TMZC17] (Fig. 10) is used as before.
used for private decision tree evaluation in [WFNL16]. The advantage of this protocol is that though the offline
We recapitulate it here as a combination of the ho- communication is higher than in the previous protocols
momorphic sub-protocol SelH + CompH (Fig. 8) with due to the selection network with superlinear complex-
PathH (Fig. 10). The advantage of this protocol is that ity, it performs well in the online phase and due to the
very low communication is necessary between the two path evaluation phase, it depends only on the number of
parties and the selection happens along with the com- decision nodes m, since depth-padding is not required.
parison. However, most computationally expensive cryp- HGG: SelH + CompG + PathG [BFK+ 09]. Ho-
tographic operations must be computed online, since the momorphic selection SelH (Fig. 6) and garbled compari-
computation depends on the client’s input. son CompG (Fig. 9) (with the interface from Fig. 13 be-
HGH: SelH + CompG + PathH . An oblivious tween) can also be combined with the path evaluation
comparison protocol that can be used to output shares technique PathG (Fig. 11). This induces overhead in the
of the comparison bits is Yao’s GC protocol is shown in number of decision nodes, since the depth-padded deci-
CompG (Fig. 9). Here, m garbled comparison circuits sion tree has a larger number of internal nodes m0 ≥ m
can be precomputed in an offline phase. Thereafter, the (cf. Fig. 12). However, this path evaluation protocol can
client evaluates these in the online phase (using SIMD more efficiently be transformed to an offline-online set-
in parallel as described in §2.2) to retrieve the com- ting, since the GCs for comparison and the garbled de-
parison color bits that are shares of the output bits. cision tree can be sent offline.
However, this comparison protocol requires a selection GGG: SelG + CompG + PathG [BFK+ 09].
sub-protocol to choose m inputs to the m comparisons This protocol consists exclusively of sub-protocols that
from the client’s n ≤ m inputs. In this protocol, we use use garbling techniques: selection SelG (Fig. 7) and
SelH with Paillier or DGK encryption (cf. Fig. 6) in or- comparison CompG (Fig. 9) with garbled circuits, and
der to achieve linear complexity. When combining these garbled path evaluation PathG (Fig. 11). It has the
approaches, the result of SelH is not the same as the highest communication, though most data can be sent
input of CompG , so the interface described below and in the offline phase.
depicted in Fig. 13 is inserted to obliviously unblind Another advantage of this protocol is that it can
be easily extended to provide security against a mali-
SoK: Modular and Efficient Private Decision Tree Evaluation 198

Server S Client C
Input: r1 , . . . , rm Input: z1,1 , . . . , zm,t blinded values (truncated)
Output: K0 = (K1,1 0 , . . . , K 0 ), K1 = (K 1 , . . . , K 1 )
m,t 1,1 m,t
Output: K = (K1,1 , . . . , Km,t ) keys

Generate circuit C for m t-bit subtractions of r1 , . . . , rm

Offline
0 , . . . , K 0 ), (K 1 , . . . , K 1 )) ←
(GC, (K1,1 GC
m,t 1,1 m,t
Garble(C, (s01,1 , . . . , s0m,t ), (s11,1 , . . . , s1m,t )) −−−−−−−−−−−−→

For i ∈ {1, . . . , m} For i ∈ {1, . . . , m}


For j ∈ {1, . . . , t} For j ∈ {1, . . . , t}

Online
s0i,j , s1i,j −−−−−−−−−−−−→ ←−−−−−−−−−−−− zi,j
OT zi,j
−−−−−−−−−−−−→ si,j
zi,j
Return K0 = (K1,1
0 , . . . , K 0 ), K1 = (K 1 , . . . , K 1 )
m,t 1,1 m,t
Return K = (K1,1 , . . . , Km,t ) = Eval(GC, (s1,1 ))

Fig. 13. SelH → CompG – Interface between the instantiation for SelH (Fig. 6) and CompG (Fig. 9).

Server S Client C
Input: a1 , . . . , am0 share Input: b1 , . . . , bm0 share
Output: k0 = (k10 , . . . , km0 ), k1 = (k 1 , , . . . , k 1 )
1 m0
Output: k = (k1 , . . . , km0 ) keys

For i ∈ {1, . . . , m0 } For i ∈ {1, . . . , m0 }

Online
←−−−−−−−−−−−− bi
a 1⊕a C-OT a ⊕b
ki i , ki i ←−−−−−−−−−−−− −−−−−−−−−−−−→ ki i i
Return k0 = (k10 , . . . , km
0 ), k1 = (k 1 , , . . . , k 1 )
1 m0
Return k = (k1 , . . . , km0 ) keys

Fig. 14. CompH → PathG – Interface between the instantiation for CompH (Fig. 8) and PathG (Fig. 11).

cious client: Since the only messages sent by the client since we can use the correlated OT (C-OT) extension
are in the OTs, one can use OT extension with secu- optimization of [ALSZ13].
rity against malicious clients, e.g., [ALSZ15, KOS15,
ALSZ17], which are only slightly less efficient than OT
extension with passive security. This adds little over- 4.2 Protocol Extensions
head, whereas securing the other solutions against ma-
licious clients needs more expensive conditional OTs or In this section, we describe protocol extensions, includ-
zero-knowledge proofs, e.g., as for HHH [TMZC17]. ing a natural modification of the HGG and GGG pro-
HHG: SelH + CompH + PathG . Due to the tocols, which allows the client to perform less computa-
worse efficiency of the homomorphic selection and com- tion and evaluate the decision tree in a similar manner
parison SelH + CompH (Fig. 8) and the high number as in plain evaluation with only O(dt) cryptographic
of decision nodes m0 for the garbled path evaluation operations. We compare this method with the protocol
method PathG (Fig. 11), this protocol combination is of [JS18] that has the same asymptotic complexity.
the least efficient. Moreover, it requires an interface of Path evaluation with computationally re-
OTs described below and depicted in Fig. 14. stricted client. A natural extension of HGG and GGG
Combining homomorphic comparison and garbled is to perform only the required d comparisons during the
path evaluation (CompH →PathG , Fig. 14). The result path evaluation phase as in the case of plain evaluation,
0
of SelH + CompH (Fig. 8) is for both parties a share of instead of doing all m comparisons in advance. This
the comparison bit ci = ai ⊕ bi for all i ∈ {1, . . . , m0 }, improves computation, though we cannot use SIMD op-
while the input to PathG (Fig. 11) is a key for all com- erations for the comparisons anymore, since the GCs are
parison bits for the client C and two keys for both possi- evaluated sequentially along the evaluation path. Simi-
0
ble bits for the server S . After S generates these keys, C larly, the client can decrypt only d (instead of m ) homo-
and S engage in a 1-out-of-2 OT protocol for each com- morphically encrypted blinded values in HGG in SelH
parison bit (cf. §2.2). This conversion requires m0 τsym (cf. Fig. 6) before these comparisons. However, the com-
bit offline and m0 (τsym + 1) bit online communication, munication remains unchanged, since the actual evalua-
tion path taken needs to remain oblivious to the server.
SoK: Modular and Efficient Private Decision Tree Evaluation 199

Joye and Salehi present a similar protocol in [JS18]. tails on our implementation in §C. We show the perfor-
In their protocol, full decision trees are utilized to hide mance of the protocols for the datasets from Tab. 3 as
the topology of the tree, and a 1-out-of-2i OT is per- well as for a full tree full(13) with d = n = 13 for a
formed at the ith level (i ∈ {1, . . . , d}). However, the comparison for dense trees.
full tree can be replaced with the depth-padded tree Communication. The offline, online, and total
described in §2.1, in which case 1-out-of-m0 OTs are suf- communication of the protocols is given in Fig. 15a,
ficient for levels where m0 < 2i . Though the communica- Fig. 15b, and Fig. 15c, resp. We observe that gar-
tion of this protocol (O(n+d(t+m0 ))) is better than that bling techniques allow for precomputation and offline
of HGG (O(n + m0 t)) for small decision trees, its depen- communication, but require more communication in to-
dency on the depth makes it worse for larger examples tal. Homomorphic encryption-based methods have less
such as the boston dataset (cf. §2.1). Moreover, it has communication, however, almost all expensive compu-
O(d) online rounds instead of the constant number of 4 tation and all communication happen in the online
rounds provided by HGG. Therefore, we conclude that phase. Methods using PathH (i.e., GGH, HGH and
our HGG protocol is more efficient in most scenarios. HHH [TMZC17]) have a clear advantage for large sparse
Other extensions. In §B.1, we describe modifica- trees, which is lost in case extremely dense trees are con-
tions for categorical variables, and in §B.2 for securely sidered (such as our example full tree full(13)). For
evaluating random forests. Classification confidence val- these kind of trees, Wu et al.’s protocol [WFNL16] has
ues (correctness probabilities), can be appended to the lower communication as discussed in §7.
classification labels in a straightforward manner.
Runtime. For our benchmarks we use two ma-
chines equipped with an Intel Core i7-4790 CPU @
3.6 GHz and 32 GB of RAM that support Intel’s AES-NI
5 Performance Evaluation for fast AES operations. Our benchmarks are run in a
LAN setting with 1 Gbit/s and 0.5 ms latency. Run-
In this section, we compare the runtime and communica-
times are reported from an average of 10 executions. In
tion of all resulting protocols from §4. In our implemen-
our experiments, we neglect the costs for the base OTs
tation, we instantiate primitives corresponding to secu-
and for generating the keypair for additively homomor-
rity level κ = 128 (and κ = 112 for public-key primitives)
phic encryption, since they are a one-time expense that
as recommended by NIST for use until 2030 [BBB+ 16],
can be re-used over multiple protocol executions. We
and statistical security parameter s = 40, and as in prior
also neglect the cost for trial decryption of the classifi-
work, we set the bitlength to t = 64. τ∗ denotes the ci-
cation value in PathH , and note that for a 16-bit value
phertext sizes: τsym = 128 for symmetric-key encryption,
it is around 40 ms on average.
τPaillier = 4096 for Paillier encryption, τDGK = 2048 for
The offline runtimes are given in Fig. 16a, the on-
DGK encryption, and τElGamal = 514 for ElGamal en-
line runtimes are given in Fig. 16b, while the total run-
cryption with elliptic curves and point compression.
times are depicted in Fig. 16c. We can see that homo-
The underlying frameworks for our open-source im-
morphic encryption-based methods (SelH andCompH )
plementation available at https://encrypto.de/code/
perform an order of magnitude worse than their gar-
PDTE are the ABY secure two-party computation frame-
bling technique-based alternatives (SelG and CompG ).
work [DSZ15]1 and the mcl library2 that includes a
However, we observe the advantage of the homomorphic
highly optimized lifted ElGamal implementation. ABY
encryption-based path evaluation technique (PathH ),
implements the state-of-the-art optimizations of Yao’s
where the number of decision nodes m is unchanged,
garbled circuit protocol described in §2.2. In addition,
whereas in the path evaluation method using a garbled
we implement an analogues technique to that of point-
decision tree (PathG ) depth-padding increases the num-
and-permute in order to avoid trial decryption of gar-
ber of decision nodes to m ≤ m0 ≤ 21 m(m+1), which can
bled nodes in oblivious path evaluation PathG [Sch08].
be significantly larger than m (cf. Fig. 12 and Tab. 3).
We precompute OTs and homomorphic encryption when
Our protocol GGH has the fastest total and online run-
possible (i.e., Figs. 6, 7, 8, 10, 14). We give further de-
time of below 1 second for our largest real-world exam-
ple boston. For extremely dense decision trees such as
our example full tree full(13), this advantage is lost
1 https://github.com/encryptogroup/ABY (since no padding is necessary) and the runtime of pro-
2 https://github.com/herumi/mcl tocols based on garbling techniques is an order of mag-
SoK: Modular and Efficient Private Decision Tree Evaluation 200

106 106 106


GGG [BFK+ 09] HHG GGG [BFK+ 09]
HGG [BFK+ 09] HGG [BFK+ 09] HGG [BFK+ 09]
5 5 5
10 GGH 10 HHH [TMZC17] 10 HHG
HGH HGH GGH
HHG GGH HGH
104 104 GGG [BFK+ 09] 104 HHH [TMZC17]

103 103 103

102 102 102

101 101 101

100 100 100


s

er

ts

ud

er

ts

es

on

er

ts

es

on

)
13

13

13
iri

iri

iri
in

in

in
ru

ru
gi

gi

gi
nc

nc

nc
et

et

et
st

st

st
r
w

w
ll(

ll(

ll(
ne

ne

ne
di

di

di
ab

ab

ab
ca

ca

ca
bo

bo

bo
fu

fu

fu
lin

lin

lin
di

di

di
(a) Offline communication in KBytes. (b) Online communication in KBytes. (c) Total communication in KBytes.

Fig. 15. Communication of protocols using the example datasets from Tab. 3. Note that the y-axis is in logarithmic scale.

105 105 105


HGG [BFK+ 09] (HHG) (HHG)
GGG [BFK+ 09] HGG [BFK+ 09] HGG [BFK+ 09]
HGH HHH [TMZC17] HHH [TMZC17]
104 GGH 104 HGH 104 GGG [BFK+ 09]
(HHG) GGG [BFK+ 09] HGH
HHH [TMZC17] GGH GGH
103 103 103

102 102 102

101 101 101

100 100 100


s

st

ts

on

ud

st

ts

es

on

st

ts

on

)
13

13

13
iri

te

iri

iri

te
in

in

in
ru

ru
ea

gi

ea

gi

ea

gi
et
st

st

st
r
e

be
w

w
ll(

ll(

ll(
ne

ne

ne
di

di

di
ab

ab
br

br

br
bo

bo

bo
a
fu

fu

fu
lin

lin

lin
di

di

di
(a) Offline runtime in milliseconds. (b) Online runtime in milliseconds. (c) Total runtime in milliseconds.

Fig. 16. Runtime of protocols using the example datasets from Tab. 3. Note that the y-axis is in logarithmic scale.

nitude lower than that of the protocols based on homo- protocols using garbling techniques for path evaluation
morphic encryption in the LAN setting. lose their performance advantage with growing num-
Tradeoff. In order to show the communication ber of dummy nodes, and the protocols using garbling
(x-axis) vs. computation (y-axis) tradeoff of all resulting techniques for selection become less practical due to
protocols, we give figures for decision trees trained on the largely increased amount of offline communication
two example real-world datasets: a small dataset wine, for transmitting the selection network (which implies
and our largest dataset boston. storage requirements as well). For instance, our largest
While the number of input features n is rela- decision tree trained on the boston dataset requires
tively small for all datasets, the number of (poten- around 1.8 MBytes of total communication with HHH,
tially padded) decision nodes significantly affects the 3 MBytes with HGH, 9 MBytes with GGH, and more
efficiency of our sub-protocols. We depict in Fig. 17a than 200 MBytes with GGG. The designer of an applica-
and Fig. 17b the tradeoff between offline and online run- tion using private decision tree evaluation can consider
time and communication for all protocols for the wine these tradeoffs and choose the best suited protocol.
and boston datasets, respectively, while Fig. 18a and
Fig. 18b depict the corresponding total complexities.
The larger the dataset, the more dummy nodes
are introduced during depth-padding (cf. §2.1) when
6 Concrete Improvements and
PathG is used for path evaluation. Moreover, the selec- Recommendations
tion network in SelG has superlinear size O(m log m),
which implies a significant overhead in offline commu- We investigate the concrete tradeoffs between our iden-
nication. This gap can be observed on the figures: the tified hybrid protocols GGH and HGH from §4.1 and
SoK: Modular and Efficient Private Decision Tree Evaluation 201

103 105
(HHG)
HGG [BFK+ 09]
(HHG)
HHH [TMZC17] 104
2 HGG [BFK+ 09]
Runtime in ms

Runtime in ms
10 HHH [TMZC17]
HGH HGH HGG [BFK+ 09]
103 GGG [BFK+ 09] GGG [BFK+ 09]
+
HGG [BFK 09]
101 GGG [BFK+ 09] HGH
(HHG) HGH GGH
HHH [TMZC17] GGH 102 GGH
GGH
GGG [BFK+ 09]
(HHG)
100 101 HHH [TMZC17]
10 0

10 1

10 2

10 3

10 1

10 2

10 3

10 4

10 5

10 6
0

0
Communication in KBytes Communication in KBytes
(a) Tradeoff for small dataset wine. (b) Tradeoff for large dataset boston.

Fig. 17. Tradeoff between communication (x-axis) and runtime (y-axis). The figures show the offline (unfilled squares) and online
(filled squares) complexities (connected with a line). Note that both axes are in logarithmic scale, with an additional 0 on the x axis
for depicting the offline communication of the HHH protocol. Pareto points, i.e., protocols where one property (computation or com-
munication) cannot be improved without the other property becoming worse, for online complexities are marked in bold.

103 105
(HHG) (HHG)
HHH [TMZC17]
HGG [BFK+ 09] HGG [BFK+ 09]
4
10
Total runtime in ms

Total runtime in ms

HGH
102
HHH [TMZC17] HGH GGG [BFK+ 09]
GGG [BFK+ 09] 103
GGH
101
GGH
102

100 101
10 0

10 1

10 2

10 3

10 1

10 2

10 3

10 4

10 5

10 6
0

Total communication in KBytes Total communication in KBytes


(a) Tradeoff for small dataset wine. (b) Tradeoff for large dataset boston.

Fig. 18. Tradeoff between total communication (x-axis) and runtime (y-axis). Note that both axes are in logarithmic scale, with an
additional 0 on the x axis for depicting the offline communication of the HHH protocol. Pareto points are marked in bold.

the state-of-the-art protocols HHH of [TMZC17] opti- in total and online runtime by 2-5x with only slightly
mized for communication and GGG of [BFK+ 09] opti- increased total communication. The online communica-
mized for online computation. We show that the state- tion is better for both GGH and HGH than for HHH.
of-the-art protocols that exclusively use one of the two Tradeoff with GGG [BFK+ 09]. GGH improves over
paradigms HHH with additively homomorphic encryp- GGG for large datasets by an order of magnitude in
tion or GGG with garbling techniques can be replaced all complexities due to the fact that it does not need
by our hybrid protocols HGH and GGH that provide a any padding for the decision nodes. For small datasets,
more reasonable tradeoff for larger decision trees. GGG is better than GGH in online runtime and com-
In Figs. 17 and 18, we mark in bold pareto points for munication by up to a factor of 2, but is worse in both
online and total tradeoffs, respectively, i.e., we bold the total complexities. HGH improves even more (by 5-73x)
protocols where there is no possibility to improve one for our datasets over the communication of GGG due to
property without the other property becoming worse. the difference in the number of decision nodes. The on-
Tradeoff with HHH [TMZC17]. The improvement in line runtimes of HGH are larger by 2-18x for real-world
total and online runtime of GGH over HHH is more than datasets, and the total runtimes are also larger by a fac-
an order of magnitude for our examples with increased tor of 5 for small datasets. However, it gets better with
total communication by 2-5x. HGH improves over HHH
SoK: Modular and Efficient Private Decision Tree Evaluation 202

larger (sparse) datasets due to the increasing number of col [WFNL16] would perform better than HHH for cir-
decision nodes in the depth-padded tree. cuits with m ∼ 2d [TMZC17]. However, already for a
Recommendations. We identify the following as- very dense tree with m = 2d−2 , HHH has about half
pects to take into consideration when deciding which the communication compared to [WFNL16], whereas
protocol to use for an application: the dataset size, and the necessary computation is around the same.
the boundaries on communication and computational The protocols of [WFNL16] and [TMZC17] use the
power (e.g., network throughput and client storage ca- DGK comparison protocol [DGK07]. Joye and Salehi
pacity). When the client has high storage capacity, pro- present an optimization on the DGK protocol and a pri-
tocols GGG [BFK+ 09] or GGH from §4.1 provide the vate decision tree evaluation protocol with O(d) rounds
best online performance, depending on the dataset size. in [JS18] (cf. §3.2). For private decision tree evaluation
The former may have a slightly more efficient online with constant rounds, their optimization on the DGK
phase, but higher communication due to its selection protocol is not applicable. The alternative solution to
protocol (cf. §3.1). When the client is computationally the DGK protocol of [LZS18] uses fully homomorphic
bounded and has little storage capacity, our protocols encryption and performs well for small bitlengths t of
HGH and GGH from §4.1 provide the best solutions. the features and inputs.
The state-of-the-art protocol HHH of [TMZC17] has A different solution for evaluating full private deci-
the lowest total communication, but uses computation- sion trees using the so-called commodity-based model
ally heavy public-key operations that depend almost was proposed in [CDH+ 17], where correlated random-
entirely on the client’s input and hence do not allow ness is distributed by a trusted authority to the com-
for much offline precomputation. Our protocols HGH puting parties or pre-computed during the offline phase,
or GGH from §4.1 that combine its path evaluation with which are used in the online phase.
garbling techniques yield an order of magnitude faster In concurrent and independent related work
runtimes while slightly increasing communication. Tueno et al. [TKK19] represent the decision tree as an
array, and implement oblivious array indexing. For this,
they use either garbled circuits, oblivious transfer or
oblivious RAM (ORAM), the latter of which results in
7 Related Work a protocol with sub-linear complexity in the size of the
decision tree. Their protocols require dt comparisons as
Private decision tree evaluation was firstly considered
that of [JS18] and our protocol extension for GGG and
in [BPSW07], with application to private evaluation
HGG §4.2, but require at least O(d) rounds of commu-
of remote diagnostic programs. A protocol based on
nication [TKK19, Tab. 2].
homomorphic encryption where the server evaluates
Alternatively, private decision tree evaluation can
a branching program on the client’s encrypted in-
be solved using generic private function evalua-
put was proposed in [IP07]. This has been improved
tion (PFE) protocols such as [KM11, MS13, MSS14].
in [BFK+ 09, BFL+ 11] where a protocol based on mainly
However, these solutions utilize Boolean circuits as the
symmetric-key operations was proposed. Bost et al. use
underlying representation of the functionality. Trans-
additively homomorphic encryption to evaluate the de-
forming the decision tree into a Boolean circuit would
cision tree expressed as a polynomial [BPTG15]. Re-
imply additional unnecessary overhead on the size of
cently, Wu et al. [WFNL16], Tai et al. [TMZC17] and
the function (i.e., O(tm log m) gates for the selection
Joye and Salehi [JS18] improved the state-of-the-art of
of inputs as in SelG in §3.1). Secure evaluation of
private decision tree evaluation protocols. These works
universal circuits (UCs) is an equivalent solution for
rely on additively homomorphic encryption using the
PFE [Val76, KS08b, KS16, GKS17]. UC- and OT-
Diffie-Hellman assumption and present protocols that
based PFE protocols [KS08b, MS13, KS16, GKS17]
achieve security against semi-honest adversaries or ma-
need O((tm log m) log(tm log m)), and HE-based proto-
licious clients. Tai et al. [TMZC17] eliminate the expo-
cols [KM11, MS13, MSS14] need O(tm log m) computa-
nential dependency on the depth of the tree that was
tion and communication. These complexities are larger
present in [WFNL16] by representing decision trees as
than that of protocols designed specifically for DTs.
linear functions. This implies enormous improvement
There are alternative approaches for classification
when large decision trees are considered. These, in prac-
based on machine learning, such as deep neural net-
tice, are usually not very dense, and therefore, we use
works, that can be evaluated in a private manner [SS08,
this protocol to instantiate HHH. Wu et al.’s proto-
BFL+ 11, LJLA17, MZ17, RWT+ 18, JVC18, BFR+ 18,
SoK: Modular and Efficient Private Decision Tree Evaluation 203

BDK+ 18]. Private evaluation of machine learning mod-


els incur a natural overhead, but they can perform clas-
References
sifications of real-world datasets in the order of seconds. [ALSZ13] Gilad Asharov, Yehuda Lindell, Thomas Schneider,
and Michael Zohner. More efficient oblivious transfer
and extensions for faster secure computation. In ACM
Computer and Communications Security (CCS’13),
Acknowledgements pages 535–548. ACM, 2013.
[ALSZ15] Gilad Asharov, Yehuda Lindell, Thomas Schneider,
We thank the anonymous reviewers for their valuable and Michael Zohner. More efficient oblivious transfer
feedback on the paper. This work was supported by the extensions with security for malicious adversaries. In
Advances in Cryptology – EUROCRYPT’15, volume
German Federal Ministry of Education and Research
9056 of LNCS, pages 673–701. Springer, 2015.
(BMBF) and the Hessen State Ministry for Higher Edu- [ALSZ17] Gilad Asharov, Yehuda Lindell, Thomas Schneider,
cation, Research and the Arts (HMWK) within CRISP, and Michael Zohner. More efficient oblivious transfer
by the DFG as part of project E4 within the CRC 1119 extensions. J. Cryptology, 30(3):805–858, 2017.
CROSSING, by the Intel Collaborative Research Insti- [AM18] Bushra A. AlAhmadi and Ivan Martinovic. Malclassi-
fier: Malware family classification using network flow
tute for Collaborative Autonomous & Resilient Systems
sequence behaviour. In 2018 APWG Symposium on
(ICRI-CARS), and by Business Finland (CloSer project,
Electronic Crime Research (eCrime’18), pages 1–13.
3881/31/2016). IEEE, 2018.
[Bar15] Jeff Barr. Amazon machine learning – make data-
driven decisions at scale. aws.amazon.com/blogs/
aws/amazon-machine-learning-make-data-driven-
decisions-at-scale, 2015. Accessed: 2018-08-19.
[BBB+ 16] Elaine B. Barker, William C. Barker, William E.
Burr, W. Timothy Polk, and Miles E. Smid. Sp 800-
57. recommendation for key management, part 1:
General (revised). Technical report, Gaithersburg, MD,
United States, 2016.
[BDK+ 18] Niklas Büscher, Daniel Demmler, Stefan Katzen-
beisser, David Kretzmer, and Thomas Schneider.
HyCC: Compilation of hybrid protocols for practical
secure computation. In ACM Computer and Commu-
nications Security (CCS’18), pages 847–861. ACM,
2018.
[Bea95] Donald Beaver. Precomputing oblivious transfer. In
Advances in Cryptology – CRYPTO’95, volume 963
of LNCS, pages 97–109. Springer, 1995.
[BFK+ 09] Mauro Barni, Pierluigi Failla, Vladimir Kolesnikov,
Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and
Thomas Schneider. Secure evaluation of private linear
branching programs with medical applications. In
European Symposium on Research in Computer Se-
curity (ESORICS’09), volume 5789 of LNCS, pages
424–439. Springer, 2009.
[BFL+ 11] Mauro Barni, Pierluigi Failla, Riccardo Lazzeretti,
Ahmad-Reza Sadeghi, and Thomas Schneider.
Privacy-preserving ECG classification with branch-
ing programs and neural networks. IEEE Transactions
on Information Forensics and Security, 6(2):452–468,
2011.
[BFR+ 18] Ferdinand Brasser, Tommaso Frassetto, Korbinian
Riedhammer, Ahmad-Reza Sadeghi, Thomas Schnei-
der, and Christian Weinert. Voiceguard: Secure and
private speech processing. In Annual Conference of
the International Speech Communication Association
(INTERSPEECH’18), pages 1303–1307. ISCA, 2018.
[BHKR13] Mihir Bellare, Viet Tung Hoang, Sriram Keelveedhi,
and Phillip Rogaway. Efficient garbling from a fixed-
SoK: Modular and Efficient Private Decision Tree Evaluation 204

key blockcipher. In IEEE Symposium on Security and on-line auctions’. International Journal of Advanced
Privacy (S&P’13), pages 478–492. IEEE, 2013. Computer Technology (IJACT), 1(4):323–324, 2009.
[Big18] Inc. BigML. Machine learning made easy, beautiful [DJ01] Ivan Damgård and Mads Jurik. A generalisation,
and understandable. https://bigml.com/, 2018. a simplification and some applications of Paillier’s
Accessed: 2018-08-24. probabilistic public-key system. In Public Key Cryp-
[BJL12] Dan Bogdanov, Roman Jagomägis, and Sven Laur. A tography (PKC’01), volume 1992 of LNCS, pages
universal toolkit for cryptographically secure privacy- 119–136. Springer, 2001.
preserving data mining. In Pacific Asia Workshop [DSZ15] Daniel Demmler, Thomas Schneider, and Michael
on Intelligence and Security Informatics (PAISI’12), Zohner. ABY - A framework for efficient mixed-
volume 7299 of LNCS, pages 112–126. Springer, protocol secure two-party computation. In Net-
2012. work and Distributed System Security Symposium
[BMR90] Donald Beaver, Silvio Micali, and Phillip Rogaway. (NDSS’15). The Internet Society, 2015.
The round complexity of secure protocols (extended [ElG85] Taher ElGamal. A public key cryptosystem and a
abstract). In ACM Symposium on Theory of Comput- signature scheme based on discrete logarithms. In
ing (STOC’90), pages 503–513. ACM, 1990. Advances in Cryptology – CRYPTO’85, volume 196
[BPSW07] Justin Brickell, Donald E. Porter, Vitaly Shmatikov, of LNCS, pages 10–18. Springer, 1985.
and Emmett Witchel. Privacy-preserving remote [FW12] Pui Kuen Fong and Jens H. Weber-Jahnke. Privacy
diagnostics. In ACM Computer and Communications preserving decision tree learning using unrealized
Security (CCS’07), pages 498–507. ACM, 2007. data sets. IEEE Transactions on Knowledge and Data
[BPTG15] Raphael Bost, Raluca Ada Popa, Stephen Tu, and Engineering, 24(2):353–364, 2012.
Shafi Goldwasser. Machine learning classification [GBC+ 97] Michael D. Garris, James L. Blue, Gerald T. Candela,
over encrypted data. In Network and Distributed Sys- Patrick J. Grother, Stanley Janet, and Charles L.
tem Security Symposium (NDSS’15). The Internet Wilson. NIST form-based handprint recognition
Society, 2015. system (release 2.0). Interagency/Internal Report
[BS09] Justin Brickell and Vitaly Shmatikov. Privacy- (NISTIR) - 5959, 1997.
preserving classifier learning. In Financial Cryp- [GKG+ 18] Srishti Gupta, Abhinav Khattar, Arpit Gogia, Pon-
tography and Data Security (FC’09), volume 5628 of nurangam Kumaraguru, and Tanmoy Chakraborty.
LNCS, pages 128–147. Springer, 2009. Collective classification of spam campaigners on
[BSR18] Diogo Barradas, Nuno Santos, and Luís Rodrigues. Twitter: A hierarchical meta-path based approach. In
Effective detection of multimedia protocol tunnel- World Wide Web Conference on World Wide Web
ing using machine learning. In USENIX Security (WWW’18), pages 529–538. ACM, 2018.
Symposium’18, pages 169–185. USENIX, 2018. [GKS17] Daniel Günther, Ágnes Kiss, and Thomas Schneider.
[CDH+ 17] Martine De Cock, Rafael Dowsley, Caleb Horst, Raj More efficient universal circuit constructions. In
Katti, Anderson C. A. Nascimento, Wing-Sea Poon, Advances in Cryptology – ASIACRYPT’17, volume
and Stacey C. Truex. Efficient and private scoring of 10625 of LNCS, pages 443–470. Springer, 2017.
decision trees, support vector machines and logistic [HEK12] Yan Huang, David Evans, and Jonathan Katz. Pri-
regression models based on pre-computation. IEEE vate set intersection: Are garbled circuits better than
Transactions on Dependable and Secure Computing, custom protocols? In Network and Distributed Sys-
To appear., 2017. tem Security Symposium (NDSS’12). The Internet
[CO18] Michele Ciampi and Claudio Orlandi. Combining Society, 2012.
private set-intersection with secure two-party compu- [IK17] Aleksandar Ilic and Oleksandr Kuvshynov. Evaluating
tation. In Security and Cryptography for Networks boosted decision trees for billions of users. https:
(SCN’18), volume 11035 of Lecture Notes in Com- //code.facebook.com/posts/975025089299409/
puter Science, pages 464–482. Springer, 2018. evaluating-boosted-decision-trees-for-billions-of-
[DCBA14] Manuel Fernández Delgado, Eva Cernadas, Senén users, 2017. Accessed: 2018-08-19.
Barro, and Dinani Gomes Amorim. Do we need hun- [IKNP03] Yuval Ishai, Joe Kilian, Kobbi Nissim, and Erez
dreds of classifiers to solve real world classification Petrank. Extending oblivious transfers efficiently. In
problems? Journal of Machine Learning Research, Advances in Cryptology – CRYPTO’03, volume 2729
15(1):3133–3181, 2014. of LNCS, pages 145–161. Springer, 2003.
[DGK07] Ivan Damgård, Martin Geisler, and Mikkel Krøigaard. [IP07] Yuval Ishai and Anat Paskin. Evaluating branching
Efficient and secure comparison for on-line auctions. programs on encrypted data. In Theory of Cryptog-
In Australasian Conference on Information Security raphy Conference (TCC’07), volume 4392 of LNCS,
and Privacy (ACISP’07), volume 4586 of LNCS, pages 575–594. Springer, 2007.
pages 416–430. Springer, 2007. [IR89] Russell Impagliazzo and Steven Rudich. Limits on
[DGK08] Ivan Damgård, Martin Geisler, and Mikkel Krøi- the provable consequences of one-way permuta-
gaard. Homomorphic encryption and secure compar- tions. In ACM Symposium on Theory of Computing
ison. International Journal of Applied Cryptography, (STOC’89), pages 44–61. ACM, 1989.
1(1):22–31, 2008. [JS18] Marc Joye and Fariborz Salehi. Private yet efficient
[DGK09] Ivan Damgård, Martin Geisler, and Mikkel Krøigaard. decision tree evaluation. In Data and Applications
A correction to ’Efficient and secure comparison for Security and Privacy (DBSec’18), volume 10980 of
SoK: Modular and Efficient Private Decision Tree Evaluation 205

LNCS, pages 243–259. Springer, 2018. from homomorphic encryption. In ACM Asia Con-
[JSD+ 18]Mika Juuti, Sebastian Szyller, Alexey Dmitrenko, ference on Computer and Communications Security
Samuel Marchal, and N. Asokan. PRADA: protect- (AsiaCCS’18), pages 67–74. ACM, 2018.
ing against DNN model stealing attacks. CoRR, [MAAG15] Michael J. Mayhew, Michael Atighetchi, Aaron Adler,
abs/1805.02628, 2018. and Rachel Greenstadt. Use of machine learning in
[JVC18] Chiraag Juvekar, Vinod Vaikuntanathan, and Anan- big data analytics for insider threat detection. In IEEE
tha Chandrakasan. GAZELLE: A low latency frame- Military Communications Conference (MILCOM’15),
work for secure neural network inference. In USENIX pages 915–922. IEEE, 2015.
Security Symposium’18, pages 1651–1669. USENIX, [Mic18] Microsoft. Azure machine learning studio. https:
2018. //azure.microsoft.com/, 2018. Accessed: 2018-08-24.
[KM11] Jonathan Katz and Lior Malka. Constant-round [MLJ17] Inc. MLJAR. MLJAR: Machine learning for all.
private function evaluation with linear complexity. In https://mljar.com/, 2016-2017. Accessed: 2018-08-
Advances in Cryptology – ASIACRYPT’11, volume 24.
7073 of LNCS, pages 556–571. Springer, 2011. [MS13] Payman Mohassel and Seyed Saeed Sadeghian. How
[KMAM18] Manish Kesarwani, Bhaskar Mukhoty, Vijay Arya, to hide circuits in MPC an efficient framework for
and Sameep Mehta. Model extraction warning in private function evaluation. In Advances in Cryptology
mlaas paradigm. In Annual Computer Security – EUROCRYPT’13, volume 7881 of LNCS, pages
Applications Conference (ACSAC’18), pages 371–380. 557–574. Springer, 2013.
ACM, 2018. [MSS14] Payman Mohassel, Seyed Saeed Sadeghian, and
[KOS15] Marcel Keller, Emmanuela Orsini, and Peter Scholl. Nigel P. Smart. Actively secure private function evalu-
Actively secure OT extension with optimal overhead. ation. In Advances in Cryptology – ASIACRYPT’14,
In Advances in Cryptology – CRYPTO’15, volume volume 8874 of LNCS, pages 486–505. Springer,
9215 of LNCS, pages 724–741. Springer, 2015. 2014.
[KS08a] Vladimir Kolesnikov and Thomas Schneider. Im- [MZ17] Payman Mohassel and Yupeng Zhang. SecureML:
proved garbled circuit: Free XOR gates and appli- A system for scalable privacy-preserving machine
cations. In International Colloquium on Automata, learning. In IEEE Symposium on Security and Privacy
Languages and Programming (ICALP’08), volume (S&P’17), pages 19–38. IEEE, 2017.
5126 of LNCS, pages 486–498. Springer, 2008. [NIZ+ 16] Ana Nika, Asad Ismail, Ben Y. Zhao, Sabrina Gaito,
[KS08b] Vladimir Kolesnikov and Thomas Schneider. A Gian Paolo Rossi, and Haitao Zheng. Understand-
practical universal circuit construction and secure ing and predicting data hotspots in cellular net-
evaluation of private functions. In Financial Cryp- works. Mobile Networks and Applications (MONET),
tography and Data Security (FC’08), volume 5143 of 21(3):402–413, 2016.
LNCS, pages 83–97. Springer, 2008. [OTGM18] Rebekah Overdorf, Carmela Troncoso, Rachel Green-
[KS16] Ágnes Kiss and Thomas Schneider. Valiant’s universal stadt, and Damon McCoy. Under the underground:
circuit is practical. In Advances in Cryptology – Predicting private interactions in underground fo-
EUROCRYPT’16, volume 9665 of LNCS, pages rums. CoRR, abs/1805.04494, 2018.
699–728. Springer, 2016. [Pai99] Pascal Paillier. Public-key cryptosystems based on
[KSS09] Vladimir Kolesnikov, Ahmad-Reza Sadeghi, and composite degree residuosity classes. In Advances
Thomas Schneider. Improved garbled circuit building in Cryptology – EUROCRYPT’99, volume 1592 of
blocks and applications to auctions and comput- LNCS, pages 223–238. Springer, 1999.
ing minima. In Cryptology and Network Security [PMG+ 17] Nicolas Papernot, Patrick D. McDaniel, Ian J. Good-
(CANS’09), volume 5888 of LNCS, pages 1–20. fellow, Somesh Jha, Z. Berkay Celik, and Ananthram
Springer, 2009. Swami. Practical black-box attacks against machine
[Lic18] Moshe Lichman. UCI machine learning repository. learning. In ACM Asia Conference on Computer and
https://archive.ics.uci.edu/ml. Irvine, CA: University Communications Security (AsiaCCS’17), 2017, pages
of California, School of Information and Computer 506–519. ACM, 2017.
Science, 2018. Accessed: 2018-08-24. [RG16] Carl Rabeler and Craig Guyer. Microsoft decision
[LJLA17] Jian Liu, Mika Juuti, Yao Lu, and N. Asokan. Obliv- trees algorithm. https://docs.microsoft.com/en-
ious neural network predictions via MiniONN trans- us/sql/analysis-services/data-mining/microsoft-
formations. In ACM Computer and Communications decision-trees-algorithm, 2016. Accessed: 2018-08-19.
Security (CCS’17), pages 619–631. ACM, 2017. [RMD18] Alejandro Rago, Claudia Marcos, and J. Andres
[LP00] Yehuda Lindell and Benny Pinkas. Privacy pre- Diaz-Pace. Using semantic roles to improve text
serving data mining. In Advances in Cryptology – classification in the requirements domain. Language
CRYPTO’00, volume 1880 of LNCS, pages 36–54. Resources and Evaluation, 52(3):801–837, 2018.
Springer, 2000. [RWT+ 18] M. Sadegh Riazi, Christian Weinert, Oleksandr
[LP02] Yehuda Lindell and Benny Pinkas. Privacy preserving Tkachenko, Ebrahim M. Songhori, Thomas Schnei-
data mining. Journal of Cryptology, 15(3):177–206, der, and Farinaz Koushanfar. Chameleon: A hybrid
2002. secure computation framework for machine learning
[LZS18] Wenjie Lu, Jun-Jie Zhou, and Jun Sakuma. Non- applications. In ACM Asia Conference on Computer
interactive and output expressive private comparison and Communications Security (AsiaCCS’18), pages
SoK: Modular and Efficient Private Decision Tree Evaluation 206

707–721. ACM, 2018. trees and random forests. Proceedings on Privacy


[Sch08] Thomas Schneider. Practical secure function evalua- Enhancing Technologies (PoPETs), 2016(4):335–355,
tion. Master’s thesis, Friedrich-Alexander University 2016.
Erlangen-Nürnberg, Germany, February 27, 2008. [WGC18] Sameer Wagh, Divya Gupta, and Nishanth Chan-
[Ser18] Amazon Web Services. Data privacy. https://aws. dran. SecureNN: Efficient and private neural network
amazon.com/compliance/data-privacy-faq, 2018. training. Cryptology ePrint Archive, 2018/442, 2018.
Accessed: 2018-08-19. [Wis18] Wise.io. Machine learning for the industrial internet of
[sld17] scikit-learn developers. scikit-learn – machine learning things. wise.io, 2018. Accessed: 2018-08-24.
in python. http://scikit-learn.org/stable/modules/tree. [Yao82] Andrew C.-C. Yao. Protocols for secure computations
html, 2017. Accessed: 2018-08-22. (extended abstract). In Foundations of Computer
[SS08] Ahmad-Reza Sadeghi and Thomas Schneider. General- Science (FOCS’82), pages 160–164. IEEE, 1982.
ized universal circuits for secure evaluation of private [Yao86] Andrew C.-C. Yao. How to generate and exchange
functions with application to data classification. In secrets (extended abstract). In Foundations of
Information Security and Cryptology (ICISC’08), vol- Computer Science (FOCS’86), pages 162–167. IEEE,
ume 5461 of LNCS, pages 336–353. Springer, 2008. 1986.
[SSSS17] Reza Shokri, Marco Stronati, Congzheng Song, and [YGL17] Fengpeng Yuan, Xianyi Gao, and Janne Lindqvist.
Vitaly Shmatikov. Membership inference attacks How busy are you?: Predicting the interruptibility
against machine learning models. In IEEE Symposium intensity of mobile users. In Conference on Human
on Security and Privacy (S&P’17), pages 3–18. IEEE, Factors in Computing Systems (CHI’17), pages
2017. 5346–5360. ACM, 2017.
[TASF09] Ajay Kumar Tanwani, M. Jamal Afridi, M. Zubair [ZRE15] Samee Zahur, Mike Rosulek, and David Evans. Two
Shafiq, and Muddassar Farooq. Guidelines to se- halves make a whole - reducing data transfer in
lect machine learning scheme for classification of garbled circuits using half gates. In Advances in
biomedical datasets. In Evolutionary Computation, Cryptology – EUROCRYPT’15, volume 9057 of
Machine Learning and Data Mining in Bioinformatics LNCS, pages 220–250. Springer, 2015.
(EvoBIO’09), volume 5483 of LNCS, pages 128–139.
Springer, 2009.
[Ten98] Michel Tenenhaus. La régression PLS: théorie et
pratique. Editions technip, 1998.
[TKK19] Anselme Tueno, Florian Kerschbaum, and Stefan A Complexities of Sub-protocols
Katzenbeisser. Private evaluation of decision trees us-
ing sublinear cost. Proceedings on Privacy Enhancing
In this section, we describe the complexities of all sub-
Technologies (PoPETs), 2019(1):266–286, 2019.
[TMZC17] Raymond K. H. Tai, Jack P. K. Ma, Yongjun Zhao, protocols described in §3. Note that oblivious transfers
and Sherman S. M. Chow. Privacy-preserving decision and homomorphic encryptions are precomputed in the
trees evaluation via linear functions. In European offline phase when possible.
Symposium on Research in Computer Security (ES- Communication. The communication of each sub-
ORICS’17), volume 10493 of LNCS, pages 494–512.
protocol is presented in Tab. 5. For all phases we observe
Springer, 2017.
[TZJ+ 16] Florian Tramèr, Fan Zhang, Ari Juels, Michael K.
a direct tradeoff: The homomorphic encryption-based
Reiter, and Thomas Ristenpart. Stealing machine methods have lower communication, while in case of gar-
learning models via prediction APIs. In USENIX bling techniques, most communication can be shifted of-
Security Symposium’16, pages 601–618. USENIX, fline, so these achieve the lowest online communication.
2016. Computation. The asymptotic computational
[Val76] Leslie G. Valiant. Universal circuits (preliminary
complexity of each sub-protocol is shown in Tab. 6. We
report). In ACM Symposium on Theory of Computing
(STOC’76), pages 196–203. ACM, 1976. observe that most computationally intense tasks can
[VC05] Jaideep Vaidya and Chris Clifton. Privacy-preserving be shifted offline for the protocols based on garbling
decision trees over vertically partitioned data. In Data techniques, while almost all operations are performed
and Applications Security (DBSec’05), volume 3654 online when using homomorphic encryption. Unfortu-
of LNCS, pages 139–152. Springer, 2005.
nately, even when using garbling techniques, the client
[VCKP08] Jaideep Vaidya, Chris Clifton, Murat Kantarcioglu,
and A. Scott Patterson. Privacy-preserving decision
needs to perform most cryptographic operations online,
trees over vertically partitioned data. ACM Transac- since he plays the role of the evaluator (cf. §2.2). In
tions on Knowledge Discovery from Data (TKDD), these techniques, however, the client requires less com-
2(3):14:1–14:27, 2008. putational power than using homomorphic encryption.
[Wak68] Abraham Waksman. A permutation network. Journal
of the ACM, 15(1):159–163, 1968.
[WFNL16] David J. Wu, Tony Feng, Michael Naehrig, and
Kristin E. Lauter. Privately evaluating decision
SoK: Modular and Efficient Private Decision Tree Evaluation 207

Sub-protocol Fig. Offline Online


m
SelH Paillier Fig. 6 - (n + t+s ) · τPaillier
SelH DGK Fig. 6 - (n + m) · τDGK
SelG Fig. 7) n
(n + 2Sm≥n )t · τsym nt(2τsym + 1)
SelH + CompH Fig. 8 - (n + m)t · τElGamal
CompG Fig. 9 2mt · τsym -
PathH Fig. 10 - (3m + 2) · τElGamal
PathG Fig. 11 2m0 · (τsym + log2 (m0 + 1)) -
SelH →CompG Fig. 13 3mt · τsym mt(2τsym + 1)
CompH →PathG Fig. 14 m0 · τsym m0 (τsym + 1)
0
Table 5. Offline and online communication of all sub-protocols, where n, t, m, m and m denote the number input features, the
bitlength of the features and thresholds, the number of decision nodes, padded decision nodes and either of the former two, respec-
n
tively. Sm≥n denotes the size of the selection network that selects m bits from n bits. τsym , τDGK , τPaillier and τElGamal denote the size
1
of the ciphertexts in the respective encryption schemes. Note that m ≤ m0 ≤ 2
m(m + 1).

Sub-protocol Fig. Offline Server Offline Client Online Server Online Client
SelH Paillier Fig. 6 O(m)νPaillier - O(m)νPaillier O(n + m)νPaillier
SelH DGK Fig. 6 O(m)νDGK - O(m)νDGK O(n + m)νDGK
SelG Fig. 7) O(t(m log m + n))νsym O(nt)νsym - O(tm log m)νsym
SelH + CompH Fig. 8 - O(nt)νElGamal(off) O(tm)νElGamal O(mt)νElGamal + O(nt)νElGamal(on)
CompG Fig. 9 O(mt)νsym - - O(mt)νsym
PathH Fig. 10 - O(m)νElGamal(off) O(md)νElGamal O(m)νElGamal + O(m)νElGamal(on)
PathG Fig. 11 O(m0 )νsym - - O(m0 )νsym
SelH →CompG Fig. 13 O(mt)νsym O(mt)νsym - O(mt)νsym
CompH →PathG Fig. 14 O(m0 )νsym O(m0 )νsym - -
Table 6. The number of cryptographic operations in the offline and online phases of all sub-protocols, with notations as in Tab. 5.
νsym , νDGK , νPaillier and νElGamal denote the cost of an operation (i.e., encryption, decryption or addition) in symmetric, DGK, Paillier
and lifted ElGamal encryption, respectively. νElGamal(off/on) denote the offline/online costs of precomputed ElGamal encryption.

central. At the end of the protocol, the patient’s profile


B Extensions is classified as no-recurrence-event or recurrence-event
that determines the possibility of cancer recurrence.
In this section, we present extensions to CompG and
Given categorical variable xj ∈ U = {u1 , . . . , u|U | },
PathG presented in §3, which directly apply to the re-
the decision criteria is a set membership test of the form
sulting protocols that use those as sub-protocols in §4.
xj ∈ U 0 , for some U 0 ⊂ U (j ∈ {1, . . . , n}). Wu et al.
supply the SelH + CompH subprotocol with the exten-
sion to handle categorical variables in [WFNL16]. We
B.1 Categorical Variables
briefly describe a private approach to evaluate the deci-
sion nodes that operate over categorical variables when
Since not every member of a feature vector is a nu-
using CompG . We notice that for testing the set inclu-
merical variable, we discuss the useful extension of the
sion, the server can create the |U |-bit masking value
comparison phase in order to add support for categor-
Mask based on U 0 such that Maski = 1 if ui ∈ U 0 ,
ical variables. In fact, a considerable portion of real
and 0 otherwise. This representation was used for pri-
datasets are a mixture of numerical and categorical
vate set intersection (PSI) in [HEK12]. Accordingly, for
variables. We measured the ratio of datasets contain-
xj = ui , the client inputs the transformed |U |-bit vari-
ing categorical variables in the UCI machine learning
able x0j which has only one set bit at position i. Then
repository [Lic18], and found that 33% of datasets con-
the original decision criteria xj ∈ U 0 can be evaluated by
tain categorical variables. For instance, a feature vector
the Boolean circuit which implements x0j ∧ Mask = x0j .
for the breast cancer dataset has the feature breast
quadrant that specifies the position of the tumor in the
patient’s breast, i.e. it takes one of the following five
values: left-up, right-up, left-low, right-low or
SoK: Modular and Efficient Private Decision Tree Evaluation 208

B.2 Random Forests PathH (Fig. 10). We implement the path evalua-
tion phase using lifted EC-ElGamal encryption over the
Random forests improve the accuracy of decision trees same curve and library as for SelH + CompH .
by aggregating the result of n ≥ 2 decision trees trained PathG (Fig. 11). We implement garbled decision
on different random subsets of features. Before evaluat- tree path evaluation in the ABY framework using two
ing the n decision trees, in order to hide all information AES encryptions with AES-NI per decision node.
about the subset of features, all decision tree evalua-
tions need to look the same from the client’s point of
view. Therefore, dummy comparisons are necessary to
pad the decision trees to the maximal m or m0 value.
Additionally, the aggregation step depending on
the aggregation function has to be implemented.
Wu et al. [WFNL16] provide a method that allows for
any affine function of the results using additive secret
sharing, which can be generalized to PathH (cf. Fig. 10)
of [TMZC17]. For PathG (cf. Fig. 11) [BFK+ 09], instead
of returning the results to the client, he can obtain keys
corresponding to these. These can then be used to se-
curely evaluate the GC corresponding to the aggrega-
tion function if the server uses these keys to generate it.
The complexity depends on the aggregation function.

C Implementation Choices
We describe our implementation choices used in §5.
SelH (Fig. 6). The ABY framework includes an
implementation for both Paillier and DGK encryp-
tions (cf. §2.2). The ciphertext lengths are τPaillier =
4096 and τDGK = 2048. However, Paillier encryption al-
lows for ciphertext packing which reduces the number
of ciphertexts and decryptions. We utilize ABY to build
two versions of SelH , and conclude the advantage of the
Paillier encryption both in runtime and communication.
SelG (Fig. 7). We implement the garbled selection
network introduced in [KS08a] in ABY.
SelH + CompH (Fig. 8). We use the mcl library
to implement protocols using lifted ElGamal encryption
over elliptic curve (EC) secp256k1 with 256-bit key as
Tai et al. [TMZC17]. This library has a highly opti-
mized lifted EC-ElGamal implementation and supports
point compression, i.e., an elliptic curve point can be
expressed in 256 + 1 bits. Since an ElGamal ciphertext
consists of two EC points, τElGamal = 514. Our baseline
implementation is that of [LZS18].
CompG (Fig. 9). We use comparison circuits in
ABY using SIMD operations to enhance the perfor-
mance of this sub-protocol.

You might also like