Jian Yin, Jean-Philippe Martin, Arun Venkataramani, Lorenzo Alvisi, Mike Dahlin
Laboratory for Advanced Systems Research
Department of Computer Sciences
The University of Texas at Austin
that at most f of n servers are faulty. Second, we assume that no machine can subvert the assumptions about crypto-
Figure 2: Illustration of confidentiality filtering properties of (a) traditional BFT architectures, (b) architectures that separate agreement and execution, and (c) architectures that separate agreement and execution and that add privacy firewall nodes.
[Figure 3 plot. Legend: Separate/Same/MAC, Separate/Different/MAC, Separate/Different/Thresh, Priv/Different/Thresh. X-axis: Workload (send/recv), with values 40/40, 40/4096, and 4096/40.]
Figure 3: Latency for null-server benchmark for three request/reply sizes.

uninterruptible power supplies, so power failures are a potentially significant source of correlated failures across our system that could cause our current configuration to lose data. Second, our privacy firewall architecture assumes a network configuration that physically restricts communication paths between agreement machines, privacy filter machines, and execution machines. Our current configuration uses a single 100 Mbit Ethernet hub and does not enforce these restrictions. We would not expect either of these differences to affect the results we report in this section. Third, to reduce the risk of correlated failures, the nodes should be running different operating systems and different implementations of the agreement, privacy, and execution cluster software. We implemented these libraries only once, and we use only one version of the application code.

5.2 Latency

Past studies have found that Byzantine fault tolerant state machine replication adds modest latency to network applications [11, 12, 40]. Here, we examine the same latency microbenchmark used in these studies. Under this microbenchmark, the application reads a request of a specified size and produces a reply of a specified size with no additional processing. We examine request/reply sizes of 40 bytes/40 bytes, 40 bytes/4 KB, and 4 KB/40 bytes.

Figure 3 shows the average latency (all runs within 5%) for ten runs of 200 requests each. The bars show performance for different system configurations, with the algorithm/machine configuration/authentication algorithm indicated in the legend. BASE/Same/MAC is the BASE library with 4 machines hosting both the agreement and execution servers and using MAC authenticators; Separate/Same/MAC shows our system that separates agreement and execution, with agreement running on 4 machines and with execution running on 3 of the same set of machines and using MAC authenticators; Separate/Different/MAC moves the execution servers to 3 machines physically separate from the 4 agreement servers; Separate/Different/Thresh uses the same configuration but uses threshold signatures rather than MAC authenticators for reply certificates; finally, Priv/Different/Thresh adds an array of privacy firewall servers between the agreement and execution cluster with a bottom row of 4 privacy firewall servers sharing the agreement ma-

when running on the same machines: 4.0 ms, 4.3 ms, and 5.3 ms. The increase is largely caused by three inefficiencies in our current implementation: (1) rather than using the agreement certificate produced by the BASE library, each of our message queue nodes generates a piece of a new agreement certificate from scratch, (2) in our current prototype, we do a full all-to-all multicast of the agreement certificate and request certificate from the agreement nodes to the execution nodes, and of the reply certificate from the execution nodes to the agreement nodes, and (3) our system does not use hardware multicast. We have not implemented the optimization of first having one node send and having the other nodes send only if a timeout occurs, and we have not implemented the optimization of clients sending requests directly to the execution nodes. However, we added the optimization that the execution nodes send their replies directly to the clients. Separating the agreement machines from the execution machines adds little additional latency. But switching from MAC authenticator certificates to threshold signature certificates increases latencies to 18 ms, 19 ms, and 20 ms for the three workloads. Adding two rows of privacy firewall filters (one of which is co-located with the agreement nodes) adds a few additional milliseconds.

As expected, the most significant source of latency in the architecture is public key threshold cryptography. Producing a threshold signature takes 15 ms and verifying a signature takes 0.7 ms on our machines. Two things should be noted to put these costs in perspective. First, the latency for these operations is comparable to I/O costs for many services of interest; for example, these latency costs are similar to the latency of a small number of disk seeks and are similar to or smaller than wide area network round trip latencies. Second, signature costs are expected to fall quickly as processor speeds increase; the increasing importance of distributed systems security may also lead to widespread deployment of hardware acceleration of encryption primitives. The FARSITE project has also noted that technology trends are making it feasible to include public-key operations as a building block for practical distributed services [1].

5.3 Throughput and cost

Although latency is an important metric, modern services must also support high throughput [49]. Two aspects of the privacy firewall architecture pose challenges to providing high throughput at low cost. First, the privacy firewall architecture requires a larger number of physical machines in order to physically restrict communication. Second, the privacy firewall architecture relies on relatively high-overhead public key threshold signatures for reply certificates. Two factors mitigate these costs.

First, although the new architecture can increase the total number of machines, it also can reduce the number of application-specific machines required. Application-specific machines may be more expensive than generic machines both in terms of hardware (e.g., they may require more storage, I/O, or processing resources) and in terms of software (e.g., they may require new versions of application-specific software). Thus, for many systems we would expect the application costs (e.g., the execution servers) to dominate. Like router and switch box costs today, agreement node and pri-

[Plot legend: No replication, Sep/Priv (batch=1), Sep/Priv (batch=10), Sep/Priv (batch=100), Sep (batch=1)]
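
The machine-count trade-off discussed in Section 5.3 can be illustrated with a small sketch. It assumes the usual replication bounds of 3f+1 agreement nodes and 2g+1 execution nodes, which match the 4 agreement machines and 3 execution machines used in the experiments when f = g = 1. The names `MachineCount`, `coupled`, `separated`, and `total_cost`, the `firewall_machines` parameter, and the unit costs are hypothetical and chosen only for illustration; they are not taken from the prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineCount:
    generic: int       # agreement (and any extra firewall) machines
    app_specific: int  # machines that run the application itself


def coupled(f: int) -> MachineCount:
    """Traditional BFT: all 3f+1 replicas host agreement and the application."""
    return MachineCount(generic=0, app_specific=3 * f + 1)


def separated(f: int, g: int, firewall_machines: int = 0) -> MachineCount:
    """Separate clusters: 3f+1 agreement nodes and 2g+1 execution nodes.

    firewall_machines is a hypothetical count of additional generic
    machines used for privacy firewall rows.
    """
    return MachineCount(generic=3 * f + 1 + firewall_machines,
                        app_specific=2 * g + 1)


def total_cost(m: MachineCount, generic_cost: float, app_cost: float) -> float:
    """Total cost under assumed per-machine prices (illustrative only)."""
    return m.generic * generic_cost + m.app_specific * app_cost


if __name__ == "__main__":
    # Assumed prices: an application-specific machine costs 10x a generic one.
    for name, m in [("coupled", coupled(1)), ("separated", separated(1, 1))]:
        print(name, m, total_cost(m, generic_cost=1.0, app_cost=10.0))
```

Under these assumed prices the separated configuration uses more machines in total (7 rather than 4) yet costs less, because only 3 of them are application-specific. With f = g = 1 the separated layout is cheaper only when an application-specific machine costs more than four times a generic one, so the conclusion depends on the actual price ratio.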
5 5374 6050 7071
TOTAL 9763 10361 11939
Figure 7: Andrew-500 benchmark times in seconds with failures.