You are on page 1of 16

Asynchronous Byzantine

Fault Tolerance: A Time-


independent & Future-proof
Byzantine Fault Tolerance
for the Brave New World
Summary: This is a light-speed review on the history of
distributed computation in order give the reader an intuitive
understanding on the nature of asynchronous Byzantine Fault
Tolerant (aBFT) consensus algorithms, how they came to be and
why these protocols are the most suitable to be used as consensus
mechanisms in permissionless blockchains than its ancestors, the
classical Byzantine Fault Tolerant (BFT)consensus algorithm and
its many variants. Among the upcoming projects that employ
aBFT consensus algorithms is Fantom which with its Lachesis
consensus algorithm (apart from its partnerships with government
entities such as the Afghan Ministry of Health, the Korean Food
Industry and Dubai’s smart cities) is a serious contender in the
blockchain sphere from a technological point of view.
1. A Quick Review on the History of
Centralized Computing: Breakthroughs &
Problems
Since the initial conceptualization and development of the first
mechanical computer in 1822, the Difference Engine, by Charles
Babbage and Ada Lovelace passing through the creation of the first
programmable computer, the Z, by Konrad Zuse in 1936–1938 and
the Turing machine proposal by Alan Turing around the same
time, the trend has seemingly been to centralize computing power
through the centralization of computing hardware¹. An early
example of this trend in the early digital age can be seen when the
first mainframes like the IBM 700 Series² came into the scene
which resulted in an ecosystem where computation was limited
and only adopted by well-off corporations and governments who
could buy and maintain this equipment which was used mainly for
business purposes and scientific research.

The Defense Calculator

The IBM 701 (“Defense Calculator”) was the first production


computer developed by IBM and was designed primarily for
scientific computing.
As the technology evolved, it became evident that centralized
computing systems had their inherent problems resulting from its
centralized nature. One of main problems was the possibility of
complete failure due to accidents (e.g. fire or water). Another
problem was the continuous insecurity the servers were
susceptible to as results of hacks . Another important problem was
the inefficient allocation of resources due to the localized nature of
hardware equipment, resulting in scalability issues³.

2. The Emergence of Distributed Computing


Systems: from PCs to the World Wide Web
These known problems of centralized computing settings
gradually led to the creation of the first distributed computing
systems like ARPANET⁴ in 1969 which consisted in a network of
interconnected computers that can communicate for research
purposes through the TCP/IP protocol which would later become
the foundation of the internet (the biggest distributed system).
Additionally , the computer revolution in 1970 resulted in the
creation and proliferation of personal computers (PCs) which gave
us a glimpse of the power of distributed computing through single
computers which were in itself a distributed system if we assume
the central control unit, the memory units and the input-output
channels were separate processes⁵. However, these personal
computers didn’t’ interconnect to each other to form a distributed
network, so no real distributed computing at scale was
experienced at that time. Initial applications of the personal
computer were video-games but without inter-connectivity among
players/computers we all currently know about⁶. So even when the
personal computer was initially useful, gradually all the problems
and limitations resulting from its centralized nature became clear
as the internet began to shows us little by little the future of
distributed computing systems that laid ahead of us.

Altair 8800, the first personal computer.

3. Initial Fault Tolerant Systems…but


Permissioned
The initial distributed computing systems were by the most part
limited to highly centralized governance where malicious attacks
were not possible as only permissioned access was allowed which
normally was granted by an admin. For example, a corporation
could claim fault tolerance by utilizing a distributed network of
computers (even located in different geographies) but connected
securely while full administrative control over the network was
preserved by a central authority, thus avoiding any malicious
attack. A well-known example of such as system was Google’s fault
tolerant lock system called Chubby which had 5 nodes and
tolerated up to 2 crashes⁷. In this scenario, adversarial attacks
were not even considered as Chubby only allowed permissioned
access, hence the fault tolerance referred to was only against
crashes due to malfunctioning or damage in the nodes/computers.
However, in real case scenarios, there is always the possibility of
failure due to other factors such as nodes/computers acting
maliciously (byzantine nodes).

4. The Byzantine General Problem: Byzantine


Fault Tolerant (BFT) Consensus Algorithms
Therefore, systems that were resistant to malicious behavior such
as the Byzantine Fault Tolerant⁸ (BFT) consensus algorithms were
created. Just to mention, BFT systems were derived by Lamport as
a general case of the “Two Generals Problem” ⁹(which has been
proven unsolvable) when communication occurs through an
unreliable link (e.g. the internet) known as the “Byzantine General
Problem”. Mathematically, Lamport proved that BFT systems
could only reach deterministic consensus only if less than 1/3 of all
the nodes were malicious (byzantine). As a result, BFT systems
and some of its variations such as the practical Byzantine Fault
Tolerant (pBFT) algorithm which facilitates consensus (with 3%
increase in latency) in a practical manner and other similar BFT
algorithms proliferated through approximately 3 decades and have
had many applications in diverse settings such as in aviation,
space and nuclear power plants industries¹⁰. For example, in
aeronautics, that correct functioning of an airplane depends on the
correct transmission of information relayed by the many
thousands of independent sensors, hence the system must be BFT
to be able to continue its flying course even in the presence of
many sensors failing to correctly communicate, thus avoiding
systemic failure which could lead to an airplane crash¹¹.

Byzantine Fault Tolerant (BFT) Airplanes

5. Nakamoto’s Consensus: A New Solution to


the Byzantine General Problem
Bitcoin White Paper

Nakamoto’s type of consensus (as used in Bitcoin¹²) was the first


solution to the Byzantine General’s problem that probabilistically
guaranteed trustless consensus in a permissionless setting in the
presence of less than 50% of byzantine nodes (a huge
improvement over the less than 1/3 allowed by BFT systems and
its variants at the time). This was really a breakthrough and
worked following a 2-step process: first an anti-sybil mechanism
called Proof of Work (PoW) was used in a computational
competition to select a leader who would mine a block, and then in
a second the other nodes in the network would validate the
correctness of the computation by choosing the longest chain (the
chain with more hash power or height) as the real chain, giving
rise to the blockchain era. However, even when Nakamoto
consensus was a significant breakthrough to the Byzantine
General’s problem as applied to fully permissionless settings and
with no central authority, there were some problems. One of them
is that the PoW mechanism is not environmentally-friendly and
makes inefficient energy consumption¹³. Another problem is the
inability to scale which results from the fact that the ledger needs
to be replicated among all the nodes to reach consensus, and this
requires approximately 10 minutes in the Bitcoin blockchain,
making it unsuitable for applications that require instant finality
and high throughput. Other potential weakness is the synchronous
nature of blockchains such as Bitcoin which can be brought to a
halt by using Distributed Denial of Service (DDoS) attacks. Apart
from this, cryptocurrencies such as Bitcoin are just one application
on top of a blockchain, while smart contacts is another application
that in itself allows the creation of a myriad number of distributed
applications (dApps) in finance, insurance, Internet of Things
(IoT), etc. Unfortunately, smart contracts are not scalable in linear
and synchronous blockchains such as Bitcoin or Ethereum
because in those blockchains every single node is needed to arrive
at consensus which would render the network impractical for
decentralized application that require high throughput and fast
finality while being able to scale globally.
6. Alternatives to Nakamoto’s Consensus
Therefore, many researches have tried to implement more
efficient solutions to the Byzantine General Problem than
Nakamoto’s consensus, while still guaranteeing trustless
consensus in a permissionless setting. Among the most prominent
solutions include the combination of an anti-sibyl deterrent
mechanisms (e.g. PoW, PoS, etc) with any of the many variations
of Byzantine Fault Tolerant (BFT) consensus algorithms. Example
of these variations of BFT-based consensus algorithms include: an
optimized version of practical Byzantine Fault Tolerance
(optimized pBFT) as used in Zilliqa¹⁴, delegated Byzantine Fault
Tolerance (dBFT) as used in Neo blockchain¹⁵, and PoS BFT as
implemented in EOS blockchain¹⁶. Nonetheless, all these
protocols employ at their core a synchronous (or partially/weak
synchronous) BFT consensus algorithm which makes them
unsuitable for TRULY permissionless blockchain settings where
“network links can be unreliable, network speeds change rapidly,
and network delays may even be adversarially induced” ¹⁷. All
these BFT variants might be prepared to tolerate malfunction
faults due to adversarial collusion or damage in the nodes
according to the classical definition of a byzantine attack but the
internet has taught us that byzantine behavior can evolve beyond
to what a legacy understanding of what is a byzantine attack could
have predicted. For instance, Botnetsd¹⁸ and Distributed Denial of
Service (DDoS) attacks can easily be used to flood with packets an
important node involved heavily in consensus and thus delaying
the transmission of packets said node must send/receive for the
network to achieve consensus, leading to the halting of the entire
network. In this scenario is evident that synchronous consensus
algorithms are susceptible to time-delayed attack due to the fact
that the transmission of messages needed to achieve consensus are
time-bounded. Therefore, synchronous BFT consensus algorithms
are not the strongest form BFT because time-assumption are made
which makes provably impossible to guarantee liveness in
adversarial settings where time-based attacks are constant as can
certainly happen when we use the internet¹⁹.
Botnet-multi-server-topology by Mohammed Zahid

7. Enter the world of Asynchronous Byzantine


Fault Tolerant Systems
Only BFT consensus protocols that are really asynchronous make
no timing assumptions to achieve consensus and hence, guarantee
liveness and robustness to thrive in the internet. There are not
many aBFT consensus algorithms for permissionless blockchains
but among the first ones is: the HoneyBadgerBFT consensus
protocol that uses a novel atomic broadcast mechanism to achieve
liveness wihtout making time assumptions²⁰. This was certainly
an improvement over any previous BFT-consensus protocol that is
applied to blockchains. As a seminal research is important to
mention that the HoneyBadger proved the potential of
asynchronous BFT consensus algorithms for upcoming
cryptocurrencies that want to achieve both high throughput and
liveness without compromising in decentralization, censorship-
resistance and security²¹. In the same venue, it is critical to
emphasize that the choice of architecture for the underlying
blockchain, the type of application layer as well as the choice of
middleware can limit or augment the potential benefits of any
asynchronous BFT consensus algorithm.
The HoneyBager BFT Protocol Whitepaper

8. Enter the Fantom: The First Scalable &


Asynchronous BFT DAG-based Smart
Contract Platform
Fantom White paper

A recent leap of innovation in asynchronous BFT consensus


protocols occurred with the work done by Fantom, specifically
through its Lachesis consensus protocol which is a Directed
Acyclic Graph (DAG) based asynchronous non-deterministic
protocol that guarantees a pBFT consensus by creating blocks
asynchronously and where the OPERA chain (DAG) is used for
faster consensus by confirming how many nodes share the
blocks²². Basically, validation by the nodes in the network is not
needed to arrive at consensus as occurs in typical BFT schemes.
Fantom instead reaches probabilistic and asynchronously
consensus by leveraging the concepts of distributed common
knowledge, dominator relations in graph theory and broadcast
based gossip to achieve a local view with high probability of being
a global view, thus increasing throughput near linearly as nodes
enter the network²³ while securing a probabilistically high enough
certainty about the finality of the transactions. In a nutshell,
Fantom achieves probabilistic consensus which mathematically
ensures high throughput and where the state of the transactions is
appended to the Opera underlying chain (not a linear chain or
blockchain as in Bitcoin or Ethereum but a DAG chain which
makes scalability at the protocol level possible!). This is a critical
advantage that brings Fantom to different level if we consider the
fact that Lachesis will unavoidably lead to probabilistic consensus
where there is no need to wait for all the nodes to reach
consensus(hence it will be fast) and once consensus is reached,
then the state of the transactions will be stored in the always
scalable DAG-based Opera. All these details in the structural parts
of the Fantom architecture make it suitable for Smart Contacts
with high throughput and fast finality (probabilistic but fast) and a
prime choice (if not the prime choice) for dApps that will scale as
the masses are onboarded into Fantom.
The Opera DAG constructed through the Lachesis protocol. Scalability at its finest!

You might also like