Professional Documents
Culture Documents
The primary cryptographic tools we use are homomor- Although the techniques we propose could be applied to
phic encryption, oblivious transfer, and garbled circuits. We many different biometric-matching systems, our implemen-
summarize each of these standard techniques briefly here. tation targets fingerprint recognition.
Fingerprint recognition (or fingerprint identification) is
Homomorphic encryption. Given a number a, we write
the task of searching for the best match in a database of
the encryption of a using public key pk as JaKpk , or sim-
fingerprints with a given candidate fingerprint. In contrast,
ply JaK when the public key is clear from the context. An
fingerprint authentication seeks to determine if a candidate
encryption scheme is additively homomorphic if given JaK
fingerprint matches a particular registered fingerprint. Tech-
and JbK it is possible to compute Ja + b (mod p)K, for some
niques for matching fingerprints have been extensively stud-
integer p which may depend on pk, without the decryption
ied, and we only provide a brief summary here; Maltoni et
key. (From now on, we leave p implicit.) It follows that
al. [12] provides more comprehensive information.
given JaK and an integer c, one can also compute Jc · aK.
Depending on the sensing technology, fingerprint im-
There are many public-key cryptosystems satisfying this
ages exhibit traits at different levels of image quality. At the
property. In our implementation we use Paillier’s cryptosys-
global level, ridge-lines shapes fall into one of several pat-
tem [16]. In this scheme, the public key is a modulus n;
terns such as loop, whorl, and arch. At the local level, there
encryption of m ∈ Zn is done by choosing a random r ∈ Z∗n
are about 150 different types of local ridge characteristics
and computing (1 + n)m · rn mod n2 .
(minute detail). At an even finer level, intra-ridge details
Oblivious transfer. An oblivious transfer protocol allows a are identified and used in high-end fingerprint applications.
2
Figure 1. System Overview
In the last decade, many fingerprint-recognition techniques (where 1 ≤ i ≤ M), represents the vector corresponding to
have been developed that combine various features of the some fingerprint. The database can also be viewed as N col-
fingerprint [2, 18, 19, 21]. Most of them involve sophisti- umn vectors [c1 , . . . , cN ]. The client’s input is denoted by a
cated training and classification algorithms, which are not vector v0 = [v0 1 , . . . , v0 N ], where each v0j is an 8-bit integer.
suitable for developing an efficient privacy-preserving fin- Our Euclidean-distance protocol is based on an addi-
gerprint recognition system. tively homomorphic encryption scheme. The server’s in-
In our work we use the filterbank-based approach [3] put to this protocol is the matrix [vi, j ]M×N , and the client’s
(also used by Barni et al. [1]) because it provides good accu- input is a single feature vector [v0 1 , . . . , v0 N ]. The squared
racy and leads to an efficient privacy-preserving protocol. In Euclidean distance between vi and v0 is denoted di :
this approach, fingerprints are represented by a FingerCode
N
derived from the raw fingerprint image. For our purposes,
it is important only to know that the FingerCode of each
di = ∑ (vi, j − v0j )2 .
j=1
fingerprint is an N-dimensional (typically N = 640) feature
vector, each entry of which is an 8-bit integer. The dis- At the end of this protocol, the server obtains a list of M
tance between two fingerprints is defined as the Euclidean random numbers r = [r1 , . . . , rM ] and the client obtains d0 =
distance between the two corresponding feature vectors. [d 0 1 , . . . , d 0 M ], where di0 = di + ri . (Addition here is done
We assume a server (“Alice”) holding a database that over the integers, but statistical masking of the di values
contains M fingerprint feature vectors, each of which is as- can be achieved by setting the bit-length of ri large enough
sociated with corresponding profile information (e.g., name, relative to the maximum possible value of di .)
age, criminal record). Given a candidate fingerprint image, In the second phase, the client and server (implicitly)
a client (“Bob”) first locally derives the associated Finger- compute the minimum difference between the candidate fin-
Code feature vector. An advantage of using the filterbank- gerprint vector and the vectors in the database, if this dis-
based algorithm is that fingerprint images can be trans- tance is less than some threshold ε. If no fingerprint in the
formed to their feature vector representation locally; hence, database is within distance ε of the candidate fingerprint,
each party can do the relevant image-processing on its own the client (implicitly) receives a “no match” response; oth-
using a standard image-processing program. erwise, the client (implicitly) learns the index i∗ of the clos-
Our system design can be decomposed into the three est match. We implement this phase using a garbled circuit
stages shown in Figure 1: a secure Euclidean-distance pro- which takes as inputs r and d0 as output by the previous
tocol (Section 4), a secure closest-match protocol (Sec- phase. Conceptually, the garbled circuit computes d0 − r to
tion 5), and an oblivious retrieval protocol (Section 6). In produce d=[d1 , . . . , dM ], which is then fed into a minimum
our implementation, the first two phases are each divided circuit to find the minimal component di∗ . Finally, di∗ is
into a preparation stage which can be performed off-line compared to ε to see if it is a close-enough match. For ef-
(i.e., independently of the client’s candidate fingerprint) and ficiency, our design combines the difference, comparison,
an on-line execution stage. and threshold check into one circuit.
Looking only at the feature vectors (and ignoring the as- In the final phase, the client learns the record associated
sociated profile records for now), we may view the server’s with the closest match (if a close enough match exists). This
database as an M × N matrix [vi, j ]M×N , where each vi, j is is done using the wire labels from the garbled circuit used
an 8-bit integer. Each of the M row vectors, written as vi in the previous phase. Section 6 describes the backtracking
3
protocol we use to efficiently and obliviously retrieve the The distance protocol in the basic settings of both
matching profile record. Erkin’s secure face recognition systems begins with the
As in other scenarios where garbled circuits are used, server (the server) having all the vi, j values while the client
our system is flexible in terms of who learns the result. If (the client) has the v0j value encrypted under the server’s
only the server should learn the outcome, then the client can public key (pkS ). For the secure face recognition algorithm,
just send back the wire labels of the final outputs. Then the the client cannot learn v0j ’s because v0j ’s can carry informa-
server, who knows the label-to-signal mappings, learns the tion about the
r z eigenfaces that must be kept private to the
index of the closest match (if that match is close enough). 0
server. The v j ’s are derived from eigenfaces using secure
On the other hand, if the client is supposed to be the only
dot-product computations. In our application, the client can
party learning the final
outcome, the server only needs to compute v0 from the fingerprint image directly, so can com-
send to the client a pair r00 kHλ 0 (r00 ), r10 kHλ 1 (r10 ) , where H
pute JS3 K locally. This is similar to the Public Eigenfaces
U
is a random oracle, r00 , r10 ← {0, 1}n , and λ 0 , λ 1 are the wire scenario mentioned by Erkin et al. [5].
labels denoting 0 and 1 respectively, for each of the final The next subsection describes the previous privacy-
output wires, then followed by the backtracking tree proto- preserving Euclidean-distance protocols [5, 20]. Section 4.2
col. In this case, the client knows from the additional pairs presents our improved protocol that reduces the bandwidth
how his final output wire labels binds to wire signals. After and computation cost by an order of magnitude.
seeing if a match really happens, he then decides whether to
go on evaluating the backtracking tree. 4.1 Prior Euclidean-Distance Protocols
Note that even a secure two party computation can leak
information about both participants’ private inputs via re- A basic version of the protocol begins by having the
vealing the correct final output. For both fingerprint recog- client publish a public key pkC for a homomorphic encryp-
nition system setups above, the server’s threshold value ε tion scheme. The client computes J−2v01 K, . . . , J−2v0N K and
can be used as a security parameter that controls the infor- JS3 K and sends these ciphertexts to the server. The server
mation leakage through final outputs. By choosing a small computes JSi,1 K (for 1 ≤ i ≤ M) by herself. The server then
enough ε, the server can ensure that no information (other computes JSi,2 K (for 1 ≤ i ≤ M) using the following for-
than the absence of a close match) is revealed unless the mula:
client has a candidate fingerprint that is a close match to N q
t
N
|
0 vi, j 0
y
one in the database. This satisfies the requirements for ap- JSi,2 K = ∏ −2v j = ∑ −2vi, j · v j .
plications such as identifying a criminal while keeping the j=1 j=1
database of known criminals secret while preserving the pri- Finally, for 1 ≤ i ≤ M the server chooses random ri from
vacy of non-criminals. some appropriate range and computes:
q 0y
4 Euclidean-Distance Protocol di = Jdi + ri K = JSi,1 + Si,2 + S3 + ri K
= JSi,1 K · JSi,2 K · JS3 K · Jri K .
As has been observed before (see, e.g., Erkin et al. [5]),
The {Jdi0 K} are sent to the client, who decrypts and recovers
computation of the squared Euclidean distance di between
d10 , . . . , dM
0 ; the server outputs r , . . . , r .
1 M
ri (one of the vectors in the server’s database) and v0 (the
As noted by Sadeghi et al. [20], packing can be ap-
candidate vector) can be broken into three parts:
plied to save bandwidth in the second round of the proto-
N M
col.
q 0 The basic idea y is to send ` ciphertexts of the form
di = kvi − v0 k2 = ∑ (vi, j − v0j )2 0 0
di kdi+1 k · · · kdi+` instead of M ciphertexts of the form
j=1
Jdi0 K, where the maximum value of ` depends on the max-
N N N
2 imum range of the di0 and the bit-length of plaintexts in
= ∑ v2i, j + ∑ (−2vi, j · v0j ) + ∑ v0j the encryption scheme being used. If each di0 satisfies
j=1 j=1 j=1
| {z } | {z } | {z } 0 ≤ di0 < 2σ , then ciphertexts of the required form can be
Si,1 Si,2 S3 computed as,
(Note that the last component does not depend on i.) q 0 0 0
y `+1 q 0 y2(`+1− j)·σ
We will compute each of the above values in encrypted d1 kd2 k · · · kd`+1 = ∏ dj .
j=1
form. In our application (in contrast to the settings con-
sidered in [5, 20]) the client holds v0 and so can compute Note that this method of packing cannot be applied to the
JS3 K locally. Similarly, the server can compute JSi,1 K lo- initial message from the client to the server in the basic pro-
cally. Thus, all that remains is to provide a secure method tocol described above, hence it only reduces the bandwidth
for evaluating JSi,2 K. required for the final response from the server to the client.
4
Improved Privacy-Preserving Euclidean-Distance Protocol
Input to the server: a matrix {vi, j }M×N .
Input to the client: a vector v0 = [v0 1 , . . . , v0 N ].
Output of the server: M random integers [d 0 1 , . . . , d 0 M ], where di0 = di + ri .
Output of the client: M integers [r1 , . . . , rM ].
Preparation:
1. The server generates a key pair hpkSq, skSy i.
2. For 1 ≤ j ≤ N, the server computes 2c j pk .
q S y
3. The server computes JS1 K = S1,1 kS2,1 k · · · kSM,1 .
4. The server sends J2c1 K, . . . , J2cN K, and JS1 K to the client.
Execution:
Server Client
4.2 Improved Euclidean-Distance Protocol fore encrypting. The server also computes
5
Protocol Encryptions Decryptions Exponentiations Bandwidth
Previous (4.1) N +M+1 M M(N + 1) N + M/κ + 1
Improved (4.2) M/κ M/κ MN/κ M/κ
a single homomorphic operation. The number of ho- Section 4.1) as the size of the modulus for Paillier’s encryp-
momorphic operations is reduced by the number of tion varies. Details on the implementation and experimental
scalars, κ, that can be packed into a single ciphertext. setup are provided in Section 7.
3. Our protocol uses less bandwidth. If we assume a The results are consistent with our quantitative analy-
client making multiple queries, then the communica- sis in Table 1 and demonstrate nearly twenty-fold improve-
tion cost of the first round can be amortized over a large ments in both time and bandwidth for a 1024-bit Paillier
number of queries (with the client only sending a new modulus. Our distance protocol also has better scalabil-
second-round message for each query). ity with respect to the security parameter of Paillier’s cryp-
tosystem. This is because as the length of the modulus in-
Security. Security of this protocol (as well as the prior pro- creases we can pack more values into each ciphertext: thus,
tocols [5, 20]) depends on the r values adequately masking e.g., each encryption with a 2048-bit modulus can be used
the entries of d. We stress that addition here is computed to perform twice as many underlying computations as with
over the integers rather than modulo some value; thus, we a 1024-bit modulus.
obtain statistical hiding rather than perfect hiding. Con-
cretely, if each di is a δ -bit integer and ri is a uniform ρ-
bit integer, then releasing di0 = di + ri gives statistical secu- 5 Finding the Closest Match
rity roughly 2δ −ρ for the value of di . (Formally, for any
fixed di0 , di1 the statistical difference between the random
variables di0 + ri and di1 + ri is approximately 2δ −ρ .) By This stage begins with the server knowing [d 0 1 , . . . , d 0 M ]
choosing ρ suitably, we can make this probability arbitrar- and the client knowing [r1 , . . . , rM ]. This stage can be
ily low. Increasing the maximum mask value, however, re- viewed as allowing the client to learn the index i∗ minimiz-
duces κ as discussed next. ing di = di0 − ri , assuming di∗ < ε. In fact, though, the client
does not learn i∗ explicitly; rather, the client learns wire la-
Packing Implementation. Suppose each of vi, j and v0j is bels for “active” wires in a garbled circuit prepared by the
a σ -bit integer and each additive random mask ri is a ρ- server, and this will be sufficient to enable to client to learn
bit integer. From the formulas for computing di and di0 , it is the desired record (that corresponds to index i∗ ) in the back-
clear that δ = 2σ +dlog Ne bits are sufficient to represent di ; tracking stage we describe in Section 6.
as for di0 , since we need ρ > δ for statistical security we see
that δ 0 = ρ + 1 bit suffice to represent di0 . Section 5.1 describes the circuit we evaluate securely
Let θ ≥ δ 0 denote the number of bits designated for each to find the closest match, and Section 5.2 explains how we
of the κ units being “packed” in one ciphertext. We cannot implement the matching phase using this circuit.
set θ = δ 0 because we need to handle possible overflow of
intermediate values in the computation. We allocate σ +
5.1 Circuit Design
dlog Ne bits for overflow; thus,
6
Paillier Modulus 1024 2048 3072
Time/Bandwidth s KB s KB s KB
Standard 123.6 190.8 574.3 350.6 1510.6 511.4
Exec. Improved 6.8 9.8 14.4 8.9 23.9 8.3
Savings 94.5% 94.9% 97.5% 97.5% 98.4% 98.4%
7
Figure 4. Circuits for SubReduce and 2-to-1 Min.
own input bits. We use OT extension [6] to reduce labels can serve as keys for encrypting useful information.
an arbitrary number of OTs to secure evaluation of The wire labels of the output wires of GT comparison
just k OTs, where k is a statistical security parame- circuits in the n-to-1 Minimum tree can be used to reveal a
ter. As our base OT we use the Naor-Pinkas proto- path from the inputs to the minimum value. Figure 5 shows
col [14] which achieves semi-honest security based on an example comparison tree for a four record database. In
the decisional Diffie-Hellman assumption. We also use each of the 2-to-1 Min circuits the GT circuit takes two in-
the standard technique of pre-processing [4] to push puts and outputs a bit indicating which value is greater. We
most of the cost of the oblivious transfer step into a denote that bit as gh,i and the corresponding wire labels λh,i 0
8
Figure 5. Example Find Closest Match Circuit
through the tree, learning the keys along that path, and even- of the following properties hold:
tually the key needed to decrypt the encrypted record infor-
mation at the leaf. When he evaluates the garbled SubRe- 1. The generator (the server) gains nothing.
duceMin circuit, the client obtains either λh,i0 or λ 1 from 2. The evaluator (the client) gains nothing other than the
h,i
each label pair. Since each key in the backtracking tree de- data associated with the closest match.
pends on a complete path from the root to that tree, this The first property trivially holds since the client sends noth-
means the client can only open a single path through the ing back to the server. The second property follows from
tree; specifically, the single path from the root to the leaf two facts:
corresponding to the closest match.
Figure 7 summarizes the backtracking protocol. The 1. With a semantically secure encryption scheme, no in-
algorithms to generate and evaluate the tree are shown in formation is leaked by the encryption (i. e., without
Algorithm 1 and Algorithm 2. For the tree generation al- also revealing the keys).
gorithm (Algorithm 1), the inputs a vector of the profile 2. Wire labels in a garbled circuit convey no information
data to send and an array of the wire label pairs for each unless their mappings to wire signals are known some-
comparison gate in the generated circuit. The output is the how. This follows from the garbled circuit security
backtracking tree, but only the node labels are transmitted proof [10].
to the evaluator. For the tree evaluation algorithm (Algo-
In every iteration of the loop in Algorithm 2, the client
rithm 2), the inputs are the tree (where the label for each
only gets to know the nonce of one of current node’s two
node is the encrypted label in the generated tree) and the
children, and proceeds using that value. The whole subtree
wire labels learned by the evaluator in evaluating the gar-
of the failed branch remains unknown to the client since the
bled circuit. The output is the decrypted profile information
nonce is needed to open any configurations on that subtree.
for the closest matching entry, if there is one within ε.
Thus, the client can only follow a single path in the tree
Security. The backtracking tree protocol is secure if both corresponding to the path leading to the closest match.
9
Backtracking Tree Protocol
Input to Tree-Generator: (1) an array [data1 , . . . , dataM ] denoting M pieces of data stored in the database;
(2) a number of pairs [label pair1 , . . . , label pairM−1 ],
def
where label pair = hlabel0 , label1 i, organized as a tree.
Input to Tree-Evaluator: (1) a backtracking tree tree;
(2) a number of labels [label1 , . . . , labelM−1 ] organized as a tree.
Execution:
Tree-Generator Tree-Evaluator
Note that if the server’s database is released in encrypted and a MUX for each 2-to-1 Min to randomly choose a num-
format beforehand as in the Improved Euclidean Distance ber to output. Instead, we use a simple fix that does not re-
protocol, the server may not want the client to learn extra in- quire adding any new circuitry. We modify the circuit gen-
formation from the position of the matched records. Hence, erator to randomly choose the internal carry-in bit for GT,
the server should randomly permute the order of database instead of always using signal 0. Setting this internal carry-
records before beginning the Euclidean distance protocol. in bit to 1 is equivalent to making the GT test x+1 > y. This
This random permutation needs only to be done once, since change does not affect the functionality when x > y. When
once the database is permuted relative positions of opened x = y, the modified GT outputs 0 (if the internal carry-in
records reveal no information. bit is 0) and 1 (if the internal carry-in bit is 1) with equal
probability 12 .
Multiple matches. In the unlikely case where multiple fin-
gerprints match the candidate fingerprint equally well, a
7 Evaluation
straightforward implementation of garbled circuits always
returns either the left-most or the right-most matched leaf
node. This poses potential threats to the server’s privacy. To measure the impact of our improvements and eval-
For example, if the minimum tree is known to always re- uate the practicality of privacy-preserving biometrics, we
turn the left-most matched leaf node, then the client learns implemented a privacy-preserving fingerprint matching sys-
there can’t be another equally-well matched fingerprint in tem. Our implementation comprises about 5400 lines of
the server’s database. A straightforward but costly way to Java 1.6 code, available from http://mightbeevil.org.
fix this vulnerability would be to add an equality test circuit We set up the server and the client on separate machines
10
Algorithm 2 treeEval(tree, wire labels)
1: msg ← 0;
2: current node ← tree.root;
3: while current node has children do
4: m ← Decmsg||wire labels[pos(current node)] (current node.leftChild.label);
5: if m is valid then
6: msg ← m;
7: current node ← current node.leftChild;
8: else
9: msg ← Decmsg||wire labels[pos(current node)] (current node.rightChild.label);
10: current node ← current node.rightChild;
11: end if
12: end while
13: return msg;
connected by a LAN. Both machines are homogeneously OT protocol depends only on the security parameters we
configured, each with an Intel Xeon CPU (E5504) running choose, so its cost does not scale with M.
at 2.0GHz. The JVMs are configured with a memory cap of The final five sub-stages are the execution sub-stages
4GB, both on the server and the client. which must be done once for each candidate fingerprint ex-
We use randomly generated 640-entry vectors as our ecution: (4) Euclidean distance protocol execution (Dis-
benchmark. Note that we are not evaluating the finger- tance), (5) the server resetting the initial input wire la-
print matching algorithm here, since our privacy-preserving bels and transmitting the wire labels representing her in-
protocol uses exactly the original filterbank-based finger- put to the client (Reset Labels), (6) the client learning the
print matching algorithm which has been extensively eval- wire labels corresponding to his inputs obliviously (OT);
uated [7]. The evaluation time is independent of the actual (7) garbled circuit evaluation, including the server’s gen-
fingerprint vectors. In our experiments, the client’s feature erating and transmitting the intermediate wire labels and
vector is randomly picked from the feature vectors in the garbled truth tables and the client’s evaluating the circuit
database. (Circuit), and (8) the backtracking protocol (Backtracking),
Our implementation used the following parameters: in which comprises generating, transmitting, and evaluating
the Euclidean-distance protocol, the bit length allocated for the backtracking step.
each packed value (i.e., θ ) was 64 and the bit length of the The distance protocol dominates the execution time.
random mask was 45. We use Paillier encryption with a The other two substantial sub-stages are circuit and OT, and
1024-bit modulus. We set ε to be a 16-bit integer. In our the time for the backtracking protocol is negligible. As ex-
garbled circuit implementation, 80-bit wire labels are used. pected, the results in Table 4 confirm that every sub-stage of
the execution phase scales approximately linearly with the
Table 4 shows the running time and bandwidth usage for
size of the database. The bandwidth in the execution phase
our protocol as a function of M, the number of entries in the
is dominated by the circuit, due to transmitting a large num-
database. We report the computation time and bandwidth
ber of garbled truth tables. This is dominated by traffic from
for each of eight protocol sub-stages. The first three sub-
the server to the client, which accounts for 88% of the over-
stages are preparation sub-stages, which do not depend on
all traffic.
the candidate fingerprint and only need to be done once: (1)
Euclidean distance preparation (Distance): the server com-
puting and transmitting the encrypted packed columns and 8 Related Work
JS1 K); (2) Garbled circuit preparation (Circuit): the server
generating the SubReduceMin garbled circuit (except for The protocols presented here build upon, and could be
the wire labels and garbled tables, which must be regen- applied to improve the efficiency of, several previous sys-
erated for each execution); (3) OT preparation (OT): the tems for privacy-preserving biometric identification.
preparation steps for the oblivious transfer. The prepara- Erkin et al. [5] developed an efficient privacy-preserving
tion time is dominated by the time required to compute the face recognition system based on the standard Eigenfaces
encrypted distance vectors, which scales approximately lin- recognition algorithm. Similar to our work, it also com-
early with the size of the database. Since this is done only putes Euclidean distances between vectors using additive
once by the server, though, it is not prohibitively expen- homomorphic encryption. Their work does not use garbled
sive even for large databases. The preparation phase of the circuits but relies heavily on homomorphic encryption for
11
Database Size (M) 128 256 512 1024
Time/Bandwidth s KB s KB s KB s KB
Distance 145.08 1288.99 277.35 2577.25 555.87 5153.77 1089.42 10306.81
Prep. Circuit 0.35 None 1.09 None 2.95 None 6.28 None
OT 0.52 21.91 0.48 21.91 0.48 21.91 0.45 21.91
Distance 1.68 2.93 3.36 5.21 6.82 9.79 13.33 18.95
Reset Labels 0.01 57.94 0.03 115.72 0.08 231.28 0.24 462.40
Exec. OT 0.13 237.13 0.25 467.69 0.61 928.82 1.38 1851.06
Circuit 0.39 656.14 0.67 1313.64 1.58 2628.64 3.12 5258.63
Backtracking 0.01 12.70 0.02 25.45 0.03 50.95 0.04 101.94
Exec Sub-Total 2.22 966.84 4.33 1927.71 9.12 3849.48 18.11 7692.98
Table 4. Running Time (seconds) and Bandwidth (KB) for Protocol Phases
all the core computation. This limits the scalability of their minimum, their implementation produces the indexes of all
system. The length of vectors is 12 in their system (com- entries within a threshold value of the candidate.
pared to 640 in our experiments). For a database of 320 The most similar work to ours is the privacy-preserving
faces, the whole system takes 18 seconds online computa- fingerprint authentication by Barni et al. [1] which uses the
tion to serve a query, and generates about 7.25MB network same FingerCode biometric as we do. Like Erkin et al.’s
traffic. approach, it is also a system based purely on homomorphic
Sadeghi et al. [20] improved the efficiency of Erkin’s encryption. They do not support the computation of global
work by constructing a hybrid protocol that uses homomor- minimum, but instead output the indexes of all matches
phic encryption to compute Euclidean distances and garbled within some threshold. They report results from an exper-
circuits for minimum. They also devised the idea of pack- iment with a database of 320x16 7-bit where their proto-
ing, which, however, they use merely used to save commu- col completes in 16 seconds and uses 9.11MB bandwidth.
nication cost. Our Euclidean distance protocol builds on In contrast, our system’s performance results for the most
this protocol, but improves its efficiency by also incorpo- comparable but larger experiment are 3.47s and 3.76MB for
rating packing in the computation steps. They do not use a database of 512x16 8-bit integers.
any minimum circuit to identify the best match. Instead,
matches were found by securely comparing distance values 9 Conclusion
with individual threshold values. In contrast to our work,
Sadeghi et al. [20] generated their garbled circuit using a Privacy-preserving computation offers the promise of
generic compiler FairplaySPF [17], which is in turn based obtaining results dependent on private data without expos-
on Fairplay [11]. Such compilers are convenient, but cannot ing that private data. The main drawback is that current pro-
take advantage of application-specific properties to develop tocols for privacy-preserving computations are very expen-
more efficient custom circuits. sive and impractical for real-scale problems. In this work,
SCiFI [15] is a practical privacy-preserving face identi- we have shown that those costs can be substantially reduced
fication system. It uses a component-based face identifica- for a large class of biometric matching applications by de-
tion technique with a binary index into a vocabulary repre- veloping efficient protocols for Euclidean distance, finding
sentation. The distance between faces is the Hamming dis- the closest match, and retrieving the associated record. Our
tance between their bit-vectors. One notable design choice approach involves using the normal by-products of a gar-
the authors made for SCiFI is that both the secure Hamming bled circuit evaluation to enable very efficient oblivious in-
distance and secure minimum algorithms are purely based formation retrieval, and we believe this technique can be ex-
on additive homomorphic encryption and oblivious trans- tended to many other applications. Our experimental results
fer. The authors present several optimization techniques support the hope that privacy-preserving biometrics are now
specific to their application. In contrast, we argue that a within reach for practical applications.
hybrid scheme combining both homomorphic encryption
and garbled circuits tends to be superior. They report that
identification takes 31 seconds of online computation for a
database of size 100 (with 900-bit vectors, in comparison
to our 640-byte = 5120-bit vectors), while no bandwidth
consumption is reported. Rather than computing the global
12
Acknowledgements [10] Y. Lindell and B. Pinkas. A Proof of Security of
Yao’s Protocol for Two-Party Computation. Journal
This work was partially supported by a MURI award of Cryptology, 22(2), 2009.
from the Air Force Office of Scientific Research, and grants
from the National Science Foundation and DARPA. The [11] D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella. Fair-
contents of this paper do not necessarily reflect the posi- play — A Secure Two-Party Computation System. In
tion or the policy of the US Government, and no official USENIX Security Symposium, 2004.
endorsement should be inferred. The authors thank Yikan
Chen and Aaron Mackey for insightful discussions about [12] D. Maltoni, D. Maio, A. Jain, and S. Prabhakar. Hand-
this work. book of Fingerprint Recognition. Springer, 2009.
[2] A. Bazen and S. Gerez. Systematic Methods for the [15] M. Osadchy, B. Pinkas, A. Jarrous, and B. Moskovich.
Computation of the Directional Fields and Singular SCiFI: A System for Secure Face Identification. In
Points of Fingerprints. IEEE Transactions on Pattern IEEE Symposium on Security and Privacy (Oakland),
Analysis and Machine Intelligence, 2002. 2010.
[3] A. Bazen, G. Verwaaijen, S. Gerez, L. Veelenturf, and [16] P. Paillier. Public-key Cryptosystems Based on Com-
B. van Der Zwaag. A Correlation-Based Fingerprint posite Degree Residuosity Classes. In 17th Interna-
Verification System. In ProRISC2000 Workshop on tional Conference on Theory and Application of Cryp-
Circuits, Systems and Signal Processing, 2000. tographic Techniques (EUROCRYPT), 1999.
[4] D. Beaver. Precomputing Oblivious Transfer. In 15th [17] A. Paus, A. R. Sadeghi, and T. Schneider. Practical
International Conference on Cryptology (CRYPTO), Secure Evaluation of Semi-Private Functions. In In-
1995. ternational Conference on Applied Cryptography and
[5] Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser, Network Security (ACNS), 2009.
I. Lagendijk, and T. Toft. Privacy-Preserving Face
Recognition. In 9th International Symposium on Pri- [18] S. Prabhakar and A. Jain. Decision-Level Fusion in
vacy Enhancing Technologies, 2009. Fingerprint Verification. Pattern Recognition, 2002.
[6] Y. Ishai, J. Kilian, K. Nissim, and E. Petrank. Extend- [19] A. Ross, A. Jain, and J. Reisman. A Hybrid Finger-
ing Oblivious Transfers Efficiently. In 23rd Interna- print Matcher. Pattern Recognition, 2003.
tional Conference on Cryptology (CRYPTO), 2003.
[20] A. Sadeghi, T. Schneider, and I. Wehrenberg. Efficient
[7] A. Jain, S. Prabhakar, L. Hong, and S. Pankanti. Privacy-Preserving Face Recognition. In International
Filterbank-Based Fingerprint Matching. IEEE Trans- Conference on Information Security and Cryptology,
actions on Image Processing, 2000. 2009.
[8] V. Kolesnikov, A. Sadeghi, and T. Schneider. Im-
proved Garbled Circuit Building Blocks and Applica- [21] H. Xu, R. Veldhuis, A. Bazen, T. Kevenaar, T. Akker-
tions to Auctions and Computing Minima. In Cryptol- mans, and B. Gokberk. Fingerprint Verification us-
ogy and Network Security, 2009. ing Spectral Minutiae Representations. IEEE Trans-
actions on Information Forensics and Security, 2009.
[9] V. Kolesnikov and T. Schneider. Improved Garbled
Circuit: Free XOR Gates and Applications. In 35th In- [22] A. C. Yao. How to Generate and Exchange Secrets. In
ternational Colloquium on Automata, Languages and 27th Symposium on Foundations of Computer Science,
Programming (ICAPL), 2008. 1986.
13
Appendix: Oblivious Transfer Protocol crypt xi,0 .
2. When ri = 1, the value of si selects whether qi =
Our protocol is summarized in Figure 8. It combines ti , or r ⊕ ti . However, this “selection” effect is can-
the protocols from Naor and Pinkas [13] (which we refer celed by xor-ing s and qi , so that it is always true that
to as NPOT) and Ishai et al. [6] using an aggressive pre- s ⊕ qi = ti , which is the key used to encrypt xi,1 .
computation strategy to produce an efficient OT protocol.
We denote the ith column vector of a matrix T by ti , Security. The security of our protocol follows from these
and the ith row vector of T by ti . The preparation phase two points:
can be done before any of the selection bits are known.
At the end of the preparation phase, S NDER has k1 keys 1. The R CVER can never learn anything about xi,ri be-
keyi,si , and R CVER has k1 key pairs hkeyi,0 , keyi,1 i, where cause it is encrypted using a different secret key which
(1 ≤ i ≤ k1 ). These keys are later used to transmit the ma- differs from that used for xi,ri by s, the S NDER’s ran-
trix Q efficiently. By using pre-computation, the on-line dom bit vector that is never revealed to the R CVER.
phase of our OT implementation requires only 2(k1 + m) The security property of NPOT used in the preparation
symmetric encryptions and k1 + m symmetric decryptions. phase guarantees that the selection bits of s are not re-
The correctness and security of this protocol follow di- vealed to R CVER.
rectly from the proofs for NPOT [13] and Ishai et al.’s ex- 2. In the first round of communication, either ti or r ⊕ti is
tended OT protocol [6]. sent to the S NDER, but not both. Thus, the fact that the
S NDER can never learn anything about the R CVER’s
Correctness. The R CVER can learn xi,ri for all 1 ≤ i ≤ m selection bits r is derived directly from the security
following this case analysis: property of NPOT used in the preparation phase [13].
1. If ri = 0, then qi = ti no matter what value si takes.
Thus, R CVER knows the key ti , which is used to en-
Execution:
S NDER R CVER
14