Analyzing the Success of Selfish Mining with Multiple Players

Shiquan Zhang

Master of Science

School of Computer Science

McGill University, Montreal


June, 2020

A thesis submitted to McGill University in partial fulfillment of the
requirements of the degree of Master of Computer Science

© SHIQUAN ZHANG, 2020


Abstract

Proof-of-Work (PoW) is the most popular consensus protocol among current mainstream
blockchain systems, including Bitcoin. However, selfish mining has been identified as a
possible threat to PoW blockchain systems, and recent work has shown that a selfish
miner can undermine others’ benefit and gain a reward higher than its proportion of
mining power if it holds more than 25% of the overall mining power. While this
percentage is much larger than the mining power of any current mining pool in Bitcoin,
the threshold is only valid for a single-attacker scenario. In this work, we analyze
multi-attacker scenarios. We first formulate a multi-dimensional Markov Decision Process
(MDP) model for selfish mining with multiple independent attackers, and apply it to
thoroughly analyze two-player scenarios. Furthermore, we use the model to build a
simulator, and use it to present extensive simulation results for multi-attacker games.
We show that as the number of selfish miners increases, each of them requires less mining
power to gain an advantage, but the interval of mining power in which each selfish miner
benefits shrinks. Our work is the first to show that, in practice, there are scenarios
where 12% of the mining power is enough to gain from selfish mining, but that having more
than 7 selfish miners that benefit simultaneously is highly unlikely. Furthermore, we
infer that it is always more beneficial for selfish miners to collude than to mine
independently. We also propose a safe limit for the size of mining pools in Bitcoin to
avoid multi-player attacks. Finally, we show results for Ethereum in a three-player
scenario to demonstrate the flexibility of our simulator in simulating various blockchain
reward systems.

Abrégé

La preuve de travail (en anglais PoW) est le protocole de consensus le plus populaire
parmi les systèmes de blockchain courants, y compris Bitcoin. Cependant, l’exploitation
minière égoïste a été trouvée comme une menace possible pour les systèmes de blockchain
PoW et des travaux récents ont montré qu’un mineur égoïste peut compromettre les
avantages des autres et obtenir une récompense supérieure à sa proportion de puissance
minière s’il détient plus de 25% de la puissance minière globale. Bien que ce pourcentage
soit beaucoup plus élevé que la puissance d’extraction de tout pool d’extraction actuel
en Bitcoin, ce seuil n’est valable que pour un scénario à attaquant unique. Dans ce tra-
vail, nous analysons des scénarios multi-attaquants. Nous formulons d’abord un modèle
multidimensionnel processus de décision markovien (en anglais MDP) pour l’exploitation
minière égoïste avec plusieurs attaquants indépendants, et l’appliquons pour analyser en
profondeur les scénarios à deux joueurs. De plus, nous utilisons le modèle pour constru-
ire un simulateur et l’utiliser pour présenter des résultats de simulation détaillés pour
les jeux multi-attaquants. Nous montrons que lorsque le nombre de mineurs égoïstes
augmente, chacun d’eux a besoin de moins de puissance minière pour obtenir un avan-
tage, mais que l’intervalle de la puissance minière où chaque mineur égoïste bénéficie
diminue. Notre travail est le premier à montrer que dans la pratique, il existe des scé-
narios où il suffit d’avoir 12% de puissance minière pour gagner de l’exploitation minière
égoïste mais qu’il est très improbable d’avoir plus de 7 mineurs égoïstes qui bénéficient
simultanément. De plus, nous déduisons qu’il est toujours plus avantageux pour les mineurs
égoïstes de s’entendre que de miner indépendamment. Nous proposons également une
limite de sécurité pour la taille des pools de minage en Bitcoin afin d’éviter les attaques
multi-joueurs. Enfin, nous montrons également des résultats pour Ethereum dans un scé-
nario à trois joueurs pour montrer la flexibilité de notre simulateur pour simuler divers
systèmes de récompense de blockchain.

Acknowledgements

Firstly, I would like to express my sincere appreciation to my supervisor, Professor
Bettina Kemme, for her continuous support of my Master’s study and this thesis work. Her
consistent guidance on the research and this work has carried me through many challenges.
Her patience, encouragement and broad knowledge inspire me to continue my research
career and explore more in this field.
I would also like to give my special thanks to Professor Kaiwen Zhang for providing
many suggestions on technical details. His detailed advice offered me many valuable
insights into both the theoretical model and the simulation experiments.
I would also like to thank Joseph D’Silva, Maximilian Schiedermeier, Jianhao Cao and
Yu Ting Gu from the Distributed Information System Lab at McGill University, Yahya
Shahsavari and Syed Muhammad Danish from the FUSÉE Lab at École de technologie
supérieure Montréal, and many other lab mates for sharing their precious thoughts and
research experiences throughout this work.
Finally, I would also like to express my sincere gratitude to my parents for their
continuous love and support throughout my two years of Master’s study.

Table of Contents

Abstract
Abrégé
Acknowledgements
List of Figures
List of Tables

1 Introduction
1.1 Preface
1.2 Motivation
1.3 Contribution
1.4 Structure of the Thesis

2 Background
2.1 Blockchain
2.1.1 Overview
2.1.2 Consensus Mechanism And Mining
2.1.3 Ethereum
2.2 Selfish Mining
2.2.1 Approach and Algorithm
2.2.2 Multi-attacker Selfish Mining
2.2.3 Stubborn Mining
2.3 Related Work

3 Theoretical Analysis of Selfish Mining
3.1 Basic Assumptions
3.2 Multi-Dimensional MDP Model
3.2.1 Profitability with regard to Time
3.3 Two-player Game
3.3.1 Two-Dimensional MDP Model
3.3.2 Two-Player Selfish Mining
3.3.3 Alternative Mining Strategies
3.4 Discussion
3.4.1 Advantages
3.4.2 Deficiency
3.4.3 Multi-player MDP Model for Simulation

4 Simulation On Multi-player Selfish Mining
4.1 Implementation
4.1.1 Structure of Simulator
4.1.2 Simulation Settings
4.2 Three-player Selfish Mining Game
4.2.1 Profitability
4.2.2 Threshold of Three-player Attack
4.3 Selfish Mining with More Players
4.3.1 Upper Bound on The Number of Attackers
4.3.2 Collusion among Attackers And Collapse of Multi-player Attack
4.3.3 Feasibility in Bitcoin
4.4 Three-player Game in Ethereum

5 Conclusions
5.1 Conclusions
5.2 Future Work

Appendix A Proof of Lemma 3.3.1
A.1 Catalan Triangle And Catalan Numbers
A.2 Stationary Distribution
A.3 Rewards

List of Figures

2.1 Structure of A Blockchain
2.2 Workflow of The Blockchain Systems
2.3 Branching Cases in 2-Player Selfish Mining Attack
2.4 New Branching Cases in Multi-player Selfish Mining Attack

3.1 Timeline of The Attack And The Difficulty Adjustment
3.2 Selfish Mining Strategy on 2D MDP Model
3.3 Naive Stubborn Mining Strategy on 2D MDP Model
3.4 Stubborn Mining Strategy on 2D MDP Model

4.1 Structure of the Simulator
4.2 Reward of Attacker 1 by the Power of Each Attacker
4.3 Reward of Each Player in 3-Player Game
4.4 Number of Beneficial Attackers with Different Number of Players (n = 3, 4, 5, 6, 7, 8, 9, 10)
4.5 Attackers’ Mining Power in Common Beneficial Areas
4.6 Mining Power Distribution in Bitcoin [11]
4.7 Minimal Total Mining Power of n − 1 Attackers in Common Beneficial Areas
4.8 Number of Beneficial Attackers in 3-player Game in Ethereum

List of Tables

3.1 List of Symbols Used in Section 3.3

4.1 Mining Power and Reward Rate of Miners in 2,3,4-Player Games

A.1 Catalan Triangle

Chapter 1

Introduction

1.1 Preface

Blockchain is the underlying data structure of cryptocurrency systems such as Bitcoin,
developed by Nakamoto in 2008 [64]. A blockchain is essentially a series of immutable
records, which typically reflect monetary transactions. Users submit their transactions to
the blockchain system, which determines a total order of all transactions and stores them
in a chain of blocks in a secure way. The blockchain system itself consists of a network of
computers, called miners, each having a copy of the blockchain. Miners compete with
each other to add new blocks to the chain. As new transactions are submitted to the
system, a miner adds them to a block and appends the block to the end of the blockchain.
Most mainstream blockchain systems adopt Proof-of-Work (PoW) as the core of
their consensus protocol to determine the next block in the blockchain [85].
During the production of a block, a miner is required to solve a cryptopuzzle, which
is hard to solve but easy to verify. Therefore, the process is also called mining. Once
the puzzle is solved, the solution is attached to the new block as a valid certificate.
Other miners verify the certificate and append the new block to their local blockchain. The
miner of the new block then receives the reward for this block. In a PoW system, a miner
holding more computational power has a higher probability of producing a new block and
a proportionally higher chance of gaining the rewards. Nakamoto proves that as long
as a miner has less than 50% of the total mining power, it cannot arbitrarily control the
production of blocks.
However, Eyal and Gün Sirer [34] proposed an attack, named selfish mining, in which a
miner gains a higher proportion of block rewards than its proportion of the mining power.
With selfish mining, an attacker does not immediately release a new block once it finds it,
but withholds its new blocks in a private chain. It allows the public chain to grow (with
honest miners adding blocks to it) until the public chain catches up in length with its
private chain. Then, the attacker publishes its private chain, and the honest miners abandon
their public branch when they see the attacker’s longer chain. The attacker thus gains
a proportion of the block rewards higher than its mining power by overwriting
the public chain and wasting the mining power of honest miners. Eyal and Gün Sirer [34]
show that mining selfishly can be beneficial once a miner has more than 25% of the overall
mining power. Selfish mining is harmful to the blockchain mainly because honest miners gain
fewer rewards than expected, which breaks the Nash equilibrium in the incentive mechanism
[65, 67].

1.2 Motivation

Most previous works only analyze a 2-player scenario with one single selfish miner and
one honest miner. In principle, each of these players can be a group of players (or pool),
where the mining power is the sum of all the individual miners’ powers. However, the
current Bitcoin structure is divided into several smaller independent pools, with no one
approaching 25% [37, 63]. Hence it is unlikely for the miners who aim at maximizing
their own profits to launch the selfish mining attack. As such, selfish mining has not been
observed in the real Bitcoin system yet [82].
Furthermore, although selfish mining is easy to detect, an attacker can avoid being
caught by, for example, using a Sybil attack [29]. The existence of multiple pools with
considerable mining power (albeit below 25%) and the possible camouflaging of the
attacker make it possible that more than one miner could independently launch a selfish
mining attack at the same time. Thus, in our work, we offer a detailed look at multi-player
games where multiple attackers mine selfishly at the same time but independently, and
analyze the rewards of the attackers and the sustainability of the attack.

1.3 Contribution

Most of our work is based on the Bitcoin protocol and the selfish mining attack, but we
also extend it to the Ethereum protocol and to another attack type.

1. Based on a previously proposed one-dimensional Markov Decision Process (MDP) model
[34], which was developed to study two-player games, we develop a multi-dimensional
model for various alternative mining attacks in Bitcoin.

2. We use our model to analyze selfish mining, and a further mining strategy, called
stubborn mining. Our computation shows that a combination of selfish and stubborn
mining has a threshold slightly lower than selfish mining alone. We also provide
an analysis of rewards with regard to the time of the attack to justify our choice of
measurement of the miner’s revenue.

3. We design and implement a simulator based on our multi-dimensional MDP model,
which can run scenarios with multiple selfish miners with adjustable mining power.

4. We demonstrate that the threshold for profitable selfish mining with 2 attackers is
21%, which is lower than the 25% needed when there is only one attacker.

5. We generalize the above threshold to scenarios with more than 2 attackers and
show that there are ranges of mining power distributions where up to 7 attackers
can independently gain an advantage. However, scenarios with 7 or more attackers are
practically not sustainable, and it is unlikely that small mining pools with less than
12% of the mining power benefit from selfish mining in any scenario.

6. We observe that a necessary but not sufficient condition for every selfish miner in
an n-player game (representing n − 1 independent attackers and 1 honest miner)
to benefit is that the mining power of every attacker ranges from $\frac{1}{n+2}$
to $\frac{1}{n-1}$. The interval shrinks as the number of attackers increases, as it is
asymptotic to $\frac{1}{n}$.

7. We observe that it is beneficial for selfish miners to collude and pool their hashing
power together instead of mining independently.

8. We analyze the current Bitcoin network and show that it currently does not satisfy
the conditions for a sustainable selfish mining attack.

9. We propose that, to avoid a multi-player selfish mining attack, the mining power of
the top k mining pools be limited to under $\frac{k}{k+3}$.

10. We extend our simulation to Ethereum [92], a different blockchain system, and show
that the Ethereum system is more vulnerable to a multi-player selfish mining attack.
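
The bounds in items 6 and 9 are easy to tabulate. The following is a minimal sketch that only evaluates the two stated expressions; the function names are hypothetical and not taken from the thesis. It also makes visible that k/(k+3) coincides numerically with k attackers each sitting at the lower bound 1/(n+2) of an (n = k+1)-player game.

```python
from fractions import Fraction


def attacker_power_interval(n):
    """Necessary (not sufficient) power range per attacker in an n-player game.

    The game has n - 1 attackers and 1 honest miner, so n must be at least 2.
    """
    return Fraction(1, n + 2), Fraction(1, n - 1)


def top_k_pool_limit(k):
    """Proposed upper limit on the combined power of the top k pools."""
    return Fraction(k, k + 3)


if __name__ == "__main__":
    for n in range(2, 9):
        low, high = attacker_power_interval(n)
        width = high - low
        print(f"n={n}: {float(low):.3f} to {float(high):.3f} (width {float(width):.3f})")
    for k in range(1, 8):
        print(f"k={k}: combined limit {float(top_k_pool_limit(k)):.3f}")
```

For n = 2 the lower bound is 1/4, which matches the 25% single-attacker threshold quoted above.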

1.4 Structure of the Thesis

The remainder of the thesis is structured as follows.


We first introduce the concept of blockchain and review previous work about selfish
mining and multi-player attacks in Chapter 2. Then, in Chapter 3, we propose a multi-
dimensional Markov model for selfish mining and give some theoretical analysis based
on this model. In Chapter 4, we implement a simulator of the mining game and show
the results and analysis of the simulation on multi-player selfish mining. Finally, we
summarize our simulation results and point out some possible aspects for future work in
Chapter 5.

Chapter 2

Background

In this chapter, we first introduce the concept and some basic protocols of blockchain,
including the consensus protocol and the incentive mechanism. Then, we present the
algorithm of selfish mining and some extended attacks, for example stubborn mining and
optimal selfish mining. At the end, we review some previous research on selfish mining.

2.1 Blockchain

2.1.1 Overview

Invented by Satoshi Nakamoto [64], Bitcoin has attracted a lot of attention ever since. As
the underlying technology of Bitcoin, blockchain is an open, decentralized and distributed
digital ledger, which integrates cryptographic algorithms, such as digital signatures
and hashing, with a distributed system to create a reliable, transparent and traceable
environment [28, 90]. The exchange of cryptocurrency is stored in the form of transactions
in the blockchain in a secure manner, so that the transaction history is clearly visible and
cannot be altered.
Specifically, a blockchain is a data format consisting of a chain of blocks. Each of the
blocks contains data items, which are called transactions. Figure 2.1 shows the structure
of a blockchain.

Figure 2.1: Structure of A Blockchain

A block is composed of a block body and a block head. The block body contains a
Merkle tree [62], which stores the transactions in its leaves. Besides, the meta-information
relevant to the block, such as the block height, the root of the Merkle tree of the block body,
a nonce for verification (which we will discuss in Section 2.1.2) and other components are
kept in the block head.
The blockchain is a chain of these blocks. However, unlike with a linked list, blocks
are not connected by pointers. Each block includes the hash of its parent block in its
block head as an indicator. The hash value implicitly contains all of the information in
the parent block, including the hash of the grandparent block. Therefore, it implicitly
encodes the order of the blocks in the chain and their contents. This ensures that a block
in the chain cannot be modified without altering all of the descendant blocks.
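
The hash linking described above can be illustrated with a small sketch. It is a simplification under stated assumptions: the block head is a plain dictionary, a flat hash of the transaction list stands in for the Merkle root, and all function names are hypothetical.

```python
import hashlib
import json


def block_hash(head):
    """Hash a block head; the parent hash is part of the hashed data."""
    encoded = json.dumps(head, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(parent_hash, height, transactions):
    # ASSUMPTION: a flat digest of the transactions replaces the Merkle root.
    body_digest = hashlib.sha256(json.dumps(transactions).encode("utf-8")).hexdigest()
    return {"parent": parent_hash, "height": height, "body": body_digest}


def build_chain(tx_batches):
    """Build a chain in which each block stores the hash of its parent."""
    chain, parent = [], "0" * 64  # all-zero parent for the first block
    for height, txs in enumerate(tx_batches):
        block = make_block(parent, height, txs)
        chain.append(block)
        parent = block_hash(block)
    return chain


def is_consistent(chain):
    """Check that every block stores the hash of its predecessor."""
    return all(
        chain[i]["parent"] == block_hash(chain[i - 1]) for i in range(1, len(chain))
    )
```

Changing any field of an early block changes its hash, so `is_consistent` fails unless every descendant block is rebuilt as well.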
Furthermore, several blocks may share a common parent block. In other words, a
block may be followed by multiple child blocks. Such a case is called forking, and the
sub-chains created by the fork are called branches [13]. Forking is an undesirable behavior
that may undermine the consistency of some blockchain systems, including Bitcoin [35,
69, 70]. A consistency algorithm must ensure that eventually only one sequential chain
becomes the valid chain.
The blockchain is maintained by the blockchain system. A blockchain system is a
distributed, decentralized digital ledger system using a peer-to-peer network [85]. Different
from a centralized database system, each node in the blockchain system stores a copy of
the blockchain and manages the transaction data independently. The decentralized design
alleviates the participants’ concern about the reliability of the data. Meanwhile, it
makes the system vulnerable to many kinds of attacks, such as double spending¹ [21, 50],
and requires a robust protocol for chain maintenance.
According to their degree of openness, blockchain systems are often categorized into
permissionless, semi-permissioned (consortium) and permissioned (private). Most of the current
public blockchain systems are permissionless, including Bitcoin, Ethereum [14], Ripple
[73, 78], EOS [30], Ouroboros [52], etc. A permissionless blockchain is fully decentralized
and open to the public. Any computer with an internet connection can access the
peer-to-peer network and participate in the process without any permission. As such, it
faces some security threats [8, 23]. To mitigate this vulnerability, consortium blockchain
systems have been proposed, for example, Quorum [41], Hyperledger [1, 17] and Corda [24]. A
consortium blockchain system is semi-centralized: some privileged nodes form a
consortium, which one needs permission to join. These privileged nodes create the blocks
and verify the transactions. Other nodes are only allowed to send and query transactions.
Finally, a permissioned system is a private ledger system that requires permission both for
creating blocks and for submitting or querying transactions. It is more like a conventional
centralized ledger system that uses blockchain as its data structure.
In our study, we will focus on the safety of the permissionless blockchain systems,
especially the Bitcoin protocol.
¹ With double spending, the attacker exploits some loophole of the system and spends the
coins in its account twice.

2.1.2 Consensus Mechanism And Mining

Aiming to reach agreement on the data in the decentralized system, the consensus
mechanism is the core of the blockchain protocol. Developed from the conventional
Byzantine Fault Tolerant (BFT) algorithms [19, 71], several kinds of consensus protocols
prevail in current blockchain systems [95], for example, Proof of Work (PoW) [49],
Proof of Stake (PoS) [5, 53, 91], Delegated Proof of Stake (DPoS) [7, 56, 77], Proof of
Elapsed Time [20], Proof of Learning [10], DAG-based Avalanche Consensus [74], etc. Among
these protocols, Proof of Work is the most mature and the most widely used.
Proposed in 1993, Proof of Work [36, 49] is a consensus protocol that requires
participants to perform a certain amount of computational work to prove the validity of the
data, which prevents spam and denial-of-service attacks. Specifically, Bitcoin and other
similar blockchain systems use a solution-verification mechanism to agree on the
transactions and block data. In this mechanism, a node that wants to create a new block,
also known as a miner, needs to solve a difficult problem to show the validity of the block.
Other nodes can verify the block by checking the solution of the problem and the
transactions. The problem in PoW protocols needs to be difficult to solve, while the solution
should be easy to verify. For example, in Bitcoin, it is a hash inequality problem. Once a
node finishes assembling a new block, it has to add a nonce to the block head so that the
hash of the block² is smaller than a certain number, which is called the difficulty D. Due to
the unpredictability and irreversibility of the hash function, a miner needs to repeatedly
try a random nonce and check the hash to find a solution, a process that demands a
considerable amount of computational power. In contrast, other nodes that receive the
block can easily verify its correctness by hashing only once. In Bitcoin, D is regularly
adjusted so that the main chain grows by one block per 10 minutes on average. Considering
the difficulty of solving the problem to create a valid block, this process is also called
mining. Additionally, with PoW, the more computational power a node holds, the higher its
probability of mining a new block, and the more rewards it can gain from mining.
² Bitcoin uses the SHA-256 hash function, whose result is a 256-bit number.
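
The asymmetry between solving and verifying can be sketched as follows. This is an illustration, not Bitcoin's actual block format: the head is an arbitrary byte string, a single SHA-256 pass is used, nonces are tried in order rather than at random, and the function names are hypothetical.

```python
import hashlib


def block_digest(head_bytes, nonce):
    """SHA-256 of the block head plus nonce, read as a 256-bit integer."""
    data = head_bytes + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def mine(head_bytes, threshold):
    """Try nonces until the hash falls below the threshold (the value D above)."""
    nonce = 0
    while block_digest(head_bytes, nonce) >= threshold:
        nonce += 1
    return nonce


def verify(head_bytes, nonce, threshold):
    """Verification needs only a single hash evaluation."""
    return block_digest(head_bytes, nonce) < threshold
```

With a threshold of `1 << 244`, a hash passes with probability 2^-12, so `mine` needs about 4096 attempts on average while `verify` always needs one. Lowering the threshold makes mining proportionally harder.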

To encourage miners to contribute their computational power to maintain the
consensus process, blockchain systems have an incentive mechanism. In Bitcoin, miners
are incentivized by block rewards and transaction fees. A block in a chain is called
confirmed once it is followed by six consecutive blocks. A miner gains a certain amount
of bitcoins (BTC) if the block it mined is included in the blockchain and gets confirmed.
In addition to the block rewards, nodes that submit transactions need to pay a transaction
fee that depends on the size of the transaction and the confirmation time (the time needed
to add a transaction to a block and write it to the chain). The miner can claim the
transaction fees for the transactions included in blocks it has successfully mined. However,
the transaction fees are insignificant compared to the block rewards in most cases³.
Therefore, in our study, we only consider the block reward as the reward of the miners.
Figure 2.2 shows the workflow of the blockchain system. At the beginning, every
node who needs to transfer cryptocurrencies to other nodes submits the transactions to
the peer-to-peer network. Nodes collect these transactions and start to solve the hash
problem. After a certain time span, the first miner to succeed in guessing the solution can
pack the block and publish it to other miners. Next, other nodes who receive the new
block will independently verify the solution and the transactions in the block. If the block
passes the verification, they will append the block to their local blockchains. Otherwise,
the block will be discarded. After updating the local blockchain, the miners will move on
to the next round and mine on top of the new block.
It can take some time, called network latency, for a block to reach all nodes [31]. Thus, if
two or more miners succeed in mining a new block before receiving the new blocks of the
other nodes, forks might occur, leading to two or more tails to the chain. When a miner
detects two or more branches, it will continue to mine on the longest one and eventually
all miners follow the same branch. Thus, eventually the problem is resolved.
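
The fork resolution rule above can be sketched in a few lines. Branches are modeled as plain lists of block labels, and the tie-break (keep the branch received first) is an assumption of this sketch, since the text only specifies that the longest branch is followed.

```python
def choose_branch(branches):
    """Pick the branch to mine on: the longest one.

    ASSUMPTION: on a tie, the branch that appears first in the list
    (the one received first) is kept; max() returns the first maximum.
    """
    return max(branches, key=len)


if __name__ == "__main__":
    common = ["genesis", "b1"]
    fork_x = common + ["x2"]
    fork_y = common + ["y2"]
    print(choose_branch([fork_x, fork_y]))  # tie: the first-seen branch
    fork_y = fork_y + ["y3"]
    print(choose_branch([fork_x, fork_y]))  # the longer branch
```

Once one branch gains an extra block, every miner applying this rule switches to it, which is how the fork is eventually resolved.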
³ In June 2020, the average transaction reward was around 0.6 BTC per block [12] and the
block reward was 6.25 BTC [6].

Figure 2.2: Workflow of The Blockchain Systems

2.1.3 Ethereum

Proposed by Vitalik Buterin in 2014, Ethereum [15] is the second largest public blockchain
project after Bitcoin. Although aiming to shift to the PoS consensus protocol [16, 92],
Ethereum currently is still under the threat of selfish mining as it partially adopts the
PoW at the present stage. Therefore, our study involves the Ethereum protocol in our
simulation.
Ethereum has two major differences compared to Bitcoin. First, Ethereum contains
a virtual machine (EVM) allowing the nodes to embed code within the transactions and
execute it when mining and verifying the blocks, which significantly expands the applica-
bility and flexibility of the transactions. An executable code is called a smart contract. With
smart contracts and EVMs, the transactions in the Ethereum system are no longer limited
to an amount of money transferred from one account to another. The smart contract pro-
vides a broader range of actions to the participants, such as contracting and recording
various information. It is often used in insurance, healthcare systems, business manage-
ment, etc.
Furthermore, and more relevant to this thesis, the block time in Ethereum is much
shorter than in Bitcoin. Due to the lower difficulty of the PoW mining problem, the block
time in Ethereum is controlled to be between 10 and 19 seconds. Currently (June 2020), the
average block time is around 13 seconds [32].
In Ethereum, given the shorter mining time, it is more likely than in Bitcoin that
multiple miners mine new blocks simultaneously. In other words, Ethereum has a higher rate
of forking. To mitigate the impact of the higher forking rate and maintain the consistency
of the blockchain among the nodes, Ethereum has introduced uncle block rewards into
the incentive mechanism. The sibling of a given block’s parent (or of its nth-generation
ancestor, n ≥ 1) is referred to as an uncle block. An uncle block is a block that will likely
be in a branch that does not become the final confirmed chain, that is, a block that was
mined in vain. A miner can gain a fraction of extra reward, called the nephew block reward
($\frac{1}{32}$ of the regular block reward), if it includes the hash of the uncle block in
the new block it has just mined. Meanwhile, the miner of the referenced uncle block gains an
uncle reward ranging from $\frac{1}{4}$ to $\frac{7}{8}$ of the regular block reward. The
uncle block reward encourages miners to reference stale blocks when forking occurs, which
maintains the traceability and the consistency of the blockchain. In the current Ethereum
system, the natural forking rate is 6% [54, 86], which is higher than the 1.69% in Bitcoin
[26] and needs to be taken into account.
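
The reward fractions above can be turned into a small calculation. The text only gives the range of the uncle reward; the linear rule used below, (8 − d)/8 for a height distance d between 1 and 6, is an assumption based on Ethereum's PoW reward rules and is consistent with the stated range of 1/4 to 7/8. The function names are hypothetical.

```python
from fractions import Fraction

NEPHEW_SHARE = Fraction(1, 32)  # extra reward for referencing an uncle


def uncle_share(distance):
    """Uncle reward as a fraction of the regular block reward.

    ASSUMPTION: the share is (8 - distance) / 8, where distance is the
    height gap between the uncle and the block referencing it, and only
    distances 1..6 are rewarded.
    """
    if not 1 <= distance <= 6:
        return Fraction(0)
    return Fraction(8 - distance, 8)


def fork_rewards(block_reward, distance):
    """Rewards (nephew miner, uncle miner) when one uncle is referenced."""
    nephew = block_reward * (1 + NEPHEW_SHARE)
    uncle = block_reward * uncle_share(distance)
    return nephew, uncle
```

Under this rule, a block that loses a fork and is referenced by the very next block still earns its miner 7/8 of a regular reward, which is why uncle rewards matter when comparing selfish mining in Ethereum and Bitcoin.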

2.2 Selfish Mining

2.2.1 Approach and Algorithm

Since Bitcoin came out in 2008, many attacking methods have been proposed. Selfish
mining is one of the best-known attacks. It was first proposed by Eyal and Gün Sirer [34]
in 2013. In selfish mining, the attacker participates in the mining process. However,
once the miner mines a valid new block, the miner withholds it and continues to develop
its own private chain until the main public chain is catching up with it. Then, it will
release its private branch and overwrite the public chain. Hence, the attacker gains more
rewards and undermines the benefit of other miners by wasting their mining power. The
full version of the selfish mining algorithm is provided in Alg. 1 and Fig. 2.3 illustrates
the selfish mining strategy in an example 2-player game.
Their model considers a blockchain system using PoW as its consensus protocol, and
two major players that mine and compete for the block rewards on the blockchain. One is
the group of all honest miners, which follows the PoW protocol and allocates the rewards
the group receives to individual miners according to each one’s mining power. The other
is an attacker (possibly also a pool of several colluding miners) adopting the selfish
mining strategy. To simplify the situation and focus on the mining process of the system,
the model assumes that the system works without any network latency, that miners gain
their revenues only from the block rewards, and that the reward for each block is identical.
With this, the system is modeled as a round-based Markov process, $(P, S, \gamma)$, where
$p_a, p_h \in P$

Algorithm 1 Selfish Mining Algorithm [34]
if My pool found a block then
    ∆prev ← length(private chain) − length(public chain)
    append new block to private chain
    privateBranchLen ← privateBranchLen + 1
    if ∆prev = 0 and privateBranchLen = 2 then
        // Win in the tie case
        publish all of the private chain
        privateBranchLen ← 0
    Mine at the new head of the private chain
if Others found a block then
    ∆prev ← length(private chain) − length(public chain)
    append new block to public chain
    if ∆prev = 0 then
        // Lose and follow up the public chain
        private chain ← public chain
        privateBranchLen ← 0
    else if ∆prev = 1 then
        // Tie case
        publish last block of the private chain
    else if ∆prev = 2 then
        // Overwrite the public chain and win
        publish all of the private chain
        privateBranchLen ← 0
    else
        // Hold the private chain
        publish first unpublished block in private chain
    Mine at the head of the private chain

(pa ∈ [0, 0.5), ph ∈ (0.5, 1], pa + ph = 1) are the mining powers of the two players. In each
step, a new block is mined. The blockchain then transitions from a state si ∈ S to another
state sj ∈ S according to the mining power and the strategy of each miner. γ
is a parameter that indicates the proportion of honest miners that follow the attacker’s
branch when two published branches are tied.
We explain the individual steps of Algorithm 1 using Fig. 2.3, which shows an example
of the selfish mining attack in the round-based two-player Markov process model.
The squares are blocks in the public chain mined by the honest miners with the mining
power of pH and the circles are private blocks mined by attacker Alice with the mining

Figure 2.3: Branching Cases in 2-Player Selfish Mining Attack

power of pA. Blocks with a dashed outline are unpublished and only visible to their miner. At
the beginning, the attacker Alice and the honest miners work on the public chain. Whenever
some honest miner finds a new block, Alice adopts the new block and mines on it.
However, when Alice mines a new block, rather than releasing it to the other miners
immediately, she conceals it and adds it to her private chain, while the honest miners are
unaware of the new block and keep working on the original block. If an honest miner later
finds a new block and publishes it, Alice immediately publishes her private
block, and the two blocks are tied. Alice and a fraction γ of the honest miners (γ ∈ [0, 1])
mine on her branch and the rest mine on the other. Whoever finds a new
block first makes its branch win and overwrites the other branch. Moreover, if Alice
mines two or more blocks in her private chain before the other miners find a new block, she
keeps her branch private until the public branch is only one block behind hers. When she
then unveils it, her private branch overwrites the public branch and all the work
done by honest miners in the public branch is wasted. Through this method, a selfish
mining attacker may gain a higher proportion of rewards than its mining power even if
its power is less than 50% of the total hash power. The previous study [34] showed that when
γ = 0.5, the selfish miner benefits from the attack if it holds a mining power higher than
25%.
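
The behaviour described above can be explored with a small Monte Carlo sketch. It is not the simulator used in this thesis; it is a minimal illustration that follows the case distinctions of Algorithm 1 under the same simplifying assumptions (no latency, identical block rewards). The function name `simulate_selfish_mining` and its parameters are hypothetical, and blocks still held privately when the run ends are simply ignored.

```python
import random


def simulate_selfish_mining(p_a, gamma, rounds, seed=0):
    """Estimate the attacker's relative reward in the two-player game.

    p_a    -- mining power of the attacker (probability of mining each block)
    gamma  -- fraction of honest power mining on the attacker's branch in a tie
    rounds -- number of blocks mined in total (one per round)
    """
    rng = random.Random(seed)
    lead = 0       # private branch length minus public branch length
    tie = False    # True while two published branches of equal length compete
    attacker = 0   # finalized attacker blocks
    honest = 0     # finalized honest blocks

    for _ in range(rounds):
        attacker_found = rng.random() < p_a
        if tie:
            if attacker_found:
                attacker += 2            # attacker extends its own branch and wins
            elif rng.random() < gamma:
                attacker += 1            # honest miner extends the attacker's branch
                honest += 1
            else:
                honest += 2              # honest miner extends the public branch
            tie = False
        elif attacker_found:
            lead += 1                    # withhold the new block
        elif lead == 0:
            honest += 1                  # nothing withheld: adopt the public chain
        elif lead == 1:
            tie = True                   # publish the single private block: tie
            lead = 0
        elif lead == 2:
            attacker += 2                # publish everything and overwrite
            lead = 0
        else:
            attacker += 1                # publish one block, keep the remaining lead
            lead -= 1

    return attacker / (attacker + honest)
```

Calling, for instance, `simulate_selfish_mining(0.4, 0.5, 200_000)` and comparing the result with the attacker's mining power of 0.4 gives a quick empirical check of whether the attack pays off for that configuration.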

2.2.2 Multi-attacker Selfish Mining

Although the existence of a selfish mining attack can be detected because honest min-
ers will receive less reward than their mining power and because of a higher-than-usual
forking rate, the attacker is still able to conceal itself from the public. Also, other min-
ers are unaware of the longer private branch during the attack. Therefore, it is possible
that multiple attackers launch selfish mining independently and concurrently. If there
are multiple independent attackers, each attacker mines on their own private chain and
multiple private branches may exist at the same time. Thus, we have to consider more
branching cases than in the 2-player game. Fig. 2.4 shows examples for the two new main
situations in a 3-player game. In such a scenario, we have a new attacker Bob with the
mining power of pB, whose blocks are represented by triangles.
The first case in Fig. 2.4 depicts the situation where multiple branches are published
that have the same length, and thus, there is a tie. For example, both Alice and Bob have
a private chain with 2 blocks. When now an honest miner mines a new block, both of the
attackers release their private branches according to the selfish mining algorithm. The
new block mined by the honest miner is overwritten and a percentage of honest miners
choose Alice’s chain as their main chain and the rest choose Bob’s. If they choose Alice’s
block, she wins, otherwise Bob wins.
There are other situations where there are ties involving multiple branches. While one
of these branches might be the public one where the last block was just mined by one of
the honest miners, and this one miner will obviously choose its own branch, other miners
in the group of honest miners may still choose other branches. Thus, we let the group of
honest miners randomly choose one of the branches in tie. Furthermore, we also assume
that the selfish miners who have not published in that round choose one of the published

Figure 2.4: New Branching Cases in Multi-player Selfish Mining Attack

branches in the tie at random. The selfish miners who have published one of the tied
branches always choose their own branch.
The second case is cascading release. Assume as an example that Alice and Bob both
hold a private branch, Alice’s with the length of 2 and Bob’s with length 3. When some
honest miner mines a new block, Alice releases her branch first. Alice’s branch overwrites
the public chain. Then Bob finds a public chain that is one block shorter than his private
branch, so he also decides to release his private branch, and Alice’s branch is in turn
overwritten. At the end of this round, Bob is the final winner and all the honest miners
and Alice choose his branch as their main chain. In fact, within a single round, the release
of one private branch may trigger a chain reaction of releases of other branches.
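
The cascading release can be sketched in a few lines of Python. This is an illustration only, assuming that every attacker applies the selfish rule "release as soon as the public chain is exactly one block behind" and ignoring the tie-publishing and adopt steps; the function `cascading_release` and its arguments are hypothetical names, not part of the model defined later.

```python
def cascading_release(private_lengths, public_length):
    """Resolve the chain reaction of releases within one round.

    private_lengths -- dict mapping attacker name to private branch length,
                       measured from the common forking point
    public_length   -- length of the public branch after the newest block
    Returns the final public length and the owners of the winning branch
    (more than one owner means the round ends in a tie).
    """
    owners = ["honest"]
    released = set()
    while True:
        # Attackers whose private branch is exactly one block ahead release now.
        ready = [name for name, length in private_lengths.items()
                 if name not in released and length == public_length + 1]
        if not ready:
            return public_length, owners
        released.update(ready)
        public_length += 1       # the released branch becomes the public view
        owners = sorted(ready)


# Example from the text: Alice holds 2 blocks, Bob holds 3,
# and an honest miner has just mined the first public block.
final_length, winners = cascading_release({"Alice": 2, "Bob": 3}, 1)
```

In this example Alice releases first and Bob's release then overwrites her branch, so `winners` is `["Bob"]` and `final_length` is 3. With both attackers holding 2 blocks, the function returns both names, which corresponds to the tie case of Fig. 2.4.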

2.2.3 Stubborn Mining

Selfish mining provided insight into attacks that alter the mining strategy. Later, in
2015, Nayak et al. [66] proposed an alternative strategy called stubborn mining. In
stubborn mining, the attacker does not easily give up even when its private branch lags behind
the main chain, and does not readily release the private branch to claim victory when it
is leading the competition. The stubborn mining strategies can be categorized into three
kinds, lead stubborn, equal fork stubborn and trail stubborn (Tj-stubborn), depending on

Algorithm 2 T1-Stubborn Mining Algorithm
if My pool found a block then
    ∆prev ← length(private chain) - length(public chain)
    append new block to private chain
    privateBranchLen ← privateBranchLen + 1
    if ∆prev = 0 and privateBranchLen ≥ 2 then
        // Win in the tie case
        publish all of the private chain
        privateBranchLen ← 0
    Mine at the new head of the private chain
if Others found a block then
    ∆prev ← length(private chain) - length(public chain)
    append new block to public chain
    if (∆prev = 0 and privateBranchLen = 0) or ∆prev = −1 then
        // Lose and follow up the public chain
        private chain ← public chain
        privateBranchLen ← 0
    else if ∆prev = 1 then
        // Tie case
        publish last block of the private chain
    else if ∆prev = 2 then
        // Overwrite the public chain and win
        publish all of the private chain
        privateBranchLen ← 0
    else
        // Hold the private chain
        publish first unpublished block in private chain
    Mine at the head of the private chain

when and how the attacker adopts the public branch or releases its private branch. In this
study, we mainly focus on the T1-stubborn mining attack.
Alg. 2 gives the details of the T1-stubborn mining attack. Compared to selfish
mining, the major change in stubborn mining is that when a private branch exists
and falls one block behind the public branch (∆ = −1 after the new block, i.e. ∆prev = 0),
the attacker keeps mining on its private branch. It publishes the private branch once it
takes the lead again. In contrast, it adopts the public chain if the public branch leads by
two blocks. We will provide more details when we model this behaviour in Section 3.3.3,
showing that the stubborn strategy does work and improves the attacker’s rewards.

2.3 Related Work

The concept of selfish mining was derived from the block-withholding attack [2, 25], in
which the attacker deliberately withholds blocks or delays the transmission of blocks to
undermine the system. Although it is different from selfish mining, the related studies
[33, 39, 58, 84] still provide some insights into the selfish mining work, such as the system
model and some defense methods.
Eyal and Gün Sirer show in the first paper on selfish mining [34] that a selfish miner
may gain a proportion of rewards that is higher than its mining power even if it holds less
than 50% of the mining power. The threshold of this attack varies from 0% to 33%
depending on how many of the honest miners choose to mine on the attacker’s block in
case of a tie. In a realistic scenario where the honest miner is actually a group of many
miners, if there is a tie between two branches of equal length (one created through
honest mining and the other by the selfish miner), one can assume that each miner in
the honest group randomly chooses one of these branches. Thus, 50% of honest miners
would choose the attacker’s chain. In this case, the threshold for selfish mining to
be beneficial is 25%.
Since selfish mining was introduced, several other attacks using different mining strate-
gies have been proposed [38,81,88]. As mentioned, Nayak et al. [66] introduce the concept
of Stubborn Mining. Sapirshtein et al. [76] explore the optimal mining strategy in a finite
mining strategy space with the help of a Markov Decision Process model (MDP) solver.
They provide a tight upper bound on the gain of an alternative mining strategy of
$\frac{p_a}{1 - p_a}$, where $p_a$ is the mining power of the attacker. Recently, Hou et al. [47]
proposed a framework named SquirRL, which uses reinforcement learning to search for the
optimal mining strategy. Wang et al. [87] also apply machine learning to train optimal
mining strategies. Although not strictly proven to be optimal, the results of those optimal
strategy studies are very close. However, which strategy is sub-optimal varies for different
distributions of
mining power. Thus, we only consider two strategies, namely selfish mining and stubborn
mining, in our study. For Ethereum, Gencer et al. [37] studied the impact of
the uncle block reward, and Niu and Feng [68] studied selfish mining in Ethereum. In addition,
Göbel et al. [40] discussed selfish mining in the presence of network latency, and Carlsten [18]
took transaction fees into account. Furthermore, some defense methods against selfish
mining have been proposed by Zhang et al. [94], Heilman et al. [45] and Solat et
al. [80].
Among the previous research on alternative mining strategies, some has been con-
ducted on the theoretical analysis of such attacks from different perspectives. Baccutti
and Jaag [4] show the equilibria in games with honest miners only. Kiayias et al. [51],
Laszka et al. [57] and Dimitri [27] apply game theory models to analyze the block mining
game. Sompolinsky and Zohar [81] suggest some bounds on the benefit threshold for
different mining strategies. Grunspan and Pérez-Marco [42–44] provide a rather detailed
analysis of selfish mining, trail mining and stubborn mining attacks using a Poisson pro-
cess model and conclude that considering the difficulty adjustment in the current Bitcoin
protocol, once the mining difficulty is adjusted, the total block rewards in regard to time
remain the same as that before the attack is launched. Their analysis results also con-
firm the correctness of the Markov models developed by previous work and offer some
theoretical support to the assumptions in this study. We also derive the same conclusion
regarding block rewards with regard to time as presented in Section 3.2.1 using a different
model and process.
Further research has combined the above attacks with other attacking methods. For
example, Heilman et al. [46] proposed the Eclipse attack on the Bitcoin network
and Nayak et al. [66] demonstrated stubborn mining combined with the Eclipse attack.
Most of these studies focus on a 2-player game with one attacker and one honest
miner. In contrast, our approach goes further and explores multi-attacker scenarios,
using previous research on single-attacker games as our foundation.

For selfish mining with multiple attackers, a few studies have been carried out in
recent years. Liu et al. [60] consider 3- and 4-player games (with 2 and 3 attackers,
respectively) and Bai et al. [3] later provide theoretical results for a 3-player game using
a Markov model, but only with a limited number of states. In contrast, we use simulation to
show attacks with an infinite range of states and with more players. The study closest to
ours is [59] by Leelavimolsilp et al. In their study, a strategic miner can switch between
honest mining and selfish mining depending on which one is more beneficial. The authors
determine that with an increasing number of strategic miners, fewer of them choose
to be selfish (as it is no longer beneficial) and instead fall back to being honest. In our
work, by contrast, we provide a much closer look at these situations concerning the
relationship between the benefits of selfish mining, mining power, and expected reward in
case of honest mining. Furthermore, we discuss the stability of the attack with more
attackers, with insights regarding the feasibility and success of multi-player attacks.

Chapter 3

Theoretical Analysis of Selfish Mining

In this chapter, we present our multi-dimensional Markov Decision Process (MDP) model
[48] of a multi-player mining game, together with some basic assumptions. With the help
of this model, we discuss the profitability of selfish mining with regard to time and Bitcoin
difficulty adjustment. From there, we present in more detail the particular case of the
two-dimensional model and use it for a detailed analysis of selfish and stubborn mining
in two-player scenarios. Our two-dimensional MDP model is able to compute both the
attacker’s profit and the threshold of the attack. Finally, we present an adaptation of our
multi-player model for the simulation in Chapter 4.
Note that our model differs from previously proposed models such as [34] and [60],
which are either one-dimensional or only cover a finite state space.

3.1 Basic Assumptions

Before presenting our model, we list here the main assumptions for our multi-player sce-
narios.

• The mining process depends on the mining power of each miner. According to
the blockchain protocol, every miner solves the hash problem simultaneously and
independently. The probability of a miner mining a new block is proportional to its
computational power, which is also called the hash power.

• The private branch of a selfish miner is invisible to the other miners, until its miner
decides to publish it.

• Mining is an independent process. The hash problems in the mining process are
random and independent of each other. Solving the previous hash problem does
not affect current or subsequent mining. This provides independence among
the states in the MDP model.

• The propagation time is much smaller than the block mining time. In most
Proof-of-Work (PoW) blockchain systems, the propagation time, i.e. the time that
a new block needs to travel from its miner to all other nodes, is much lower than
the average block mining time. For example, in Bitcoin, the average propagation
time ∆bitcoin ≈ 12.6 sec [26] and the average block time τbitcoin = 600 sec, that is
∆bitcoin ≪ τbitcoin. Therefore, in our MDP model, the mining process is modeled to
have round-based behaviour. Each round lasts 10 min and the
propagation time is considered negligible. Every node has adequate time to execute
its strategy and reach a stable state in each round.

• The honest miners can be regarded as a group. According to the PoW protocol, the
honest miners mine on top of the longest chain and share the block rewards proba-
bilistically depending on the hash power of each one. Even under the selfish mining
attack, the block reward of each honest miner among the honest miners is still pro-
portional to its mining power in the group. Therefore, in our model, the honest
miners are regarded as a single individual player. Note that such honest miners’
group is different from real mining pools. Every honest miner is assumed to work
independently and not cooperate with other miners.

• Last but not least, we assume that all of the attackers are rational in our model. A
miner is called rational if it always seeks to maximize its gain. In other words, an
attacker deviates from honest mining if the expected reward of the attack (e.g.
selfish mining) is higher than that of honest mining and abandons the attack once
the expected reward is lower.

3.2 Multi-Dimensional MDP Model

In the multi-player case, we formulate a round-based MDP model (S, A, P, R) [9, 72].
Assuming n players, we denote the first n − 1 players as attackers and the nth player
as the group of honest miners. Each attacker forks its own private branch. We denote the
length of each branch from the forking point as li, (i = 1, ..., n), where ln is the length
of the public branch mined by the honest miners since the forking point and the rest are
the lengths of the attackers’ private branches. We define the state of the MDP model as
the tuple of the lengths of the branches, S = {(l1, ..., ln)}. Note that the initial state of
the mining game is always the origin, s0 = (0, ..., 0), and the lengths of the branches are
non-negative integers, li ∈ ℕ, i = 1, ..., n.
Then, we define the strategy space of the miners as A = {idle, publish, adopt}. In
each round, a miner starts with mining and keeps solving the hash problem until some-
one successfully mines a new block. Then, each miner chooses some of the actions from
A to execute based on the lengths of the branches that have been published, including
the public branch and other published private branches. Idle represents that the miner
keeps mining on the top of its branch in the next round. Publish represents that the miner
chooses to reveal its branch or blocks to the other miners, so that others will choose their
action considering this newly published branch. Adopt represents that the miner gives
up its branch and continues mining on another public branch. Note that the action of
publish changes the public view of the blockchain, and other miners respond to the newly
published block/branch with further actions. Thus, in the multi-player game, multiple
actions may be taken in sequence in a single round. Finally, when every
miner chooses idle with respect to the current chain (e.g. after adopting it), the round
ends and everyone continues to mine in the next round.
With the definitions of the action space above, honest mining can be described as
follows. If a miner mines a new block, it will publish it at once. If a miner receives a valid
block or a branch that is longer than the public branch, it will adopt it. Otherwise, it will
take the idle action at this round.
Next, we formulate the transition probability in our model. Each miner holds the
mining power of pi, (i = 1, ..., n), which is equivalent to the probability of it finding a
new block in each round, and $\sum_{i=1}^{n} p_i = 1$. In particular, the honest miners have the
mining power of pn, and the attackers have the power of pj < 0.5, (j = 1, 2, ..., n − 1) each. In
the first phase of each round, a miner i can generate a new block with a probability pi
and the length of its branch increases by one. Then, every miner chooses actions from
the action space and executes them. If the miner of the new block decides to publish and
another miner, upon receiving it, chooses adopt, the length of its branch becomes 0 with a
probability 1. In contrast, if the receiver chooses idle or publish as a response, the length of
its branch remains unchanged.
In addition to the states, the actions and the transition probabilities, our model also
contains the rewarding mechanism in the blockchain systems. To determine the block
rewards, we first define that a block is finalized if it is admitted into the public chain and
no block is at the same height in any of the private chains, which means that it will be
adopted and confirmed by all of the miners and eventually contained in everyone’s local
copy of the blockchain. Therefore, when a block is finalized, its miner gains a unit of
reward. In other words, a miner gets the reward for its block if all other miners adopt it.
As we mentioned in Section 2.1.2, the reward of a miner has two parts, block reward
and transaction fees, but our model only considers the block reward as it constitutes by far
the largest part of the overall remuneration. We assume the reward of one block to be
1 unit (BTC).

Then, we define $R_i$, the absolute reward of miner i, as the expected number of blocks per
round mined by player i (assuming the game is played sufficiently long) and finalized in
the public chain. The unit of $R_i$ is BTC per round and $0 \le R_i \le 1$. With this,
$\sum_{j=1}^{n} R_j$ is the expected number of blocks mined in the main chain per round. It is
equal to 1 before the attack, when every miner is honest. However, it can be smaller than 1
when forking occurs, which is the case in selfish mining as sometimes the blocks mined by the
honest miners are lost.
With this, we define $r_i \in \mathbb{R}$, the relative reward of miner i, as the proportion of its
absolute reward to the total block rewards produced per round, which is

$$ r_i = \frac{R_i}{\sum_{j=1}^{n} R_j} \tag{3.1} $$

Note that $r_i$ is a relative value with no unit and $\sum_{j=1}^{n} r_j = 1$. In our study, we take the
relative reward $r_i$, rather than the absolute reward $R_i$, as the measurement of player i’s
revenue and we will justify it in Section 3.3.3. For simplicity, we call the relative value
the reward of a miner if no further specification is given.
According to the rationality assumption, the reward of honest mining serves as a base-
line for the selfish mining attackers. An attacker launches the attack if and only if the rel-
ative reward of the attack is higher than that it can expect from the honest mining, which
is equivalent to its mining power. Therefore, we call the mining power at which the
attacker’s relative reward from the attack equals its mining power the threshold
of the attack.
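
As a small illustration of Eq. (3.1) and of the rationality criterion, the following sketch normalizes a list of absolute rewards and compares an attacker's relative reward with its mining power. The function names and the numbers in the example are hypothetical and chosen only for illustration; they are not results from this study.

```python
def relative_rewards(absolute_rewards):
    """Eq. (3.1): divide each absolute reward by the sum of all absolute rewards."""
    total = sum(absolute_rewards)
    if total <= 0:
        raise ValueError("at least one miner must earn a positive reward")
    return [reward / total for reward in absolute_rewards]


def attack_is_beneficial(relative_reward, mining_power):
    """A rational attacker attacks only if it earns more than by honest mining."""
    return relative_reward > mining_power


# Hypothetical two-player example: attacker first, honest group second.
powers = [0.35, 0.65]
absolute = [0.30, 0.50]          # blocks per round; the sum is below 1 due to forks
relative = relative_rewards(absolute)
beneficial = attack_is_beneficial(relative[0], powers[0])
```

Here the attacker's relative reward is 0.30 / 0.80 = 0.375, which exceeds its mining power of 0.35, so `beneficial` is `True` even though its absolute reward per round is lower than 0.35.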
Additionally, in the multi-attacker game, the attack can be maintained only when every
attacker holds a mining power above its threshold. In the multi-dimensional
space, the threshold is therefore not a point, but an (n − 1)-dimensional surface/curve. We
call the side of the threshold where a miner’s reward is higher than its power the Beneficial
Area. Hence, to show the feasibility of a multi-player attack, we need to look for the overlap
of all n − 1 miners’ beneficial areas, which we name the Common Beneficial Area (CBA). If
such an area exists, an attack is stable when the mining power distribution falls within the
CBA. We will discuss this further with the simulation results in Chapter 4.
Furthermore, since the rewards of the miners sum up to 1 while each attacker’s reward
is larger than its mining power, the rewards of the honest miners must be lower than
their mining power. Thus, selfish mining and other alternative mining strategy
attacks are indeed zero-sum games.

3.2.1 Profitability with regard to Time

According to the Bitcoin protocol, the system automatically adjusts the mining difficulty
to counteract the changes of mining power and maintain the growing speed of the main
chain at one block per 10 minutes [55]. Note that this growth rate is different from the block
mining time. Every time the main chain grows by 2016 blocks (around 14 days), the system
compares the mining time of these blocks with the 10-minute target and adjusts the mining
difficulty accordingly. If the block time has been longer than expected, the mining difficulty will
become lower, so that the new blocks are easier to mine and the block time in the main
chain can fall back to 10 minutes.
Assume now a player decides to become selfish. More forking will occur and some-
times the selfish miner’s private chain will overtake the public chain leading to lost
blocks. Overall, this leads to a longer perceived average block time in the main
chain, although the processing time of each block (successful or discarded) has actually
been 10 minutes. As a result, Bitcoin adjusts the difficulty, and after that, the actual
block mining time is less than 10 minutes.
The rewards of a miner are defined as the proportion of blocks it mined in the main
chain over a time period. However, in the real world, a miner’s revenue is often counted
as the profitability per unit of time, e.g. BTC per minute. Thus, we introduce the third
measurement for revenue, the actual reward $\mathcal{R}_i$, as the amount of BTC miner i receives
per minute. The relationship between the actual reward and the absolute reward is

$$ \mathcal{R}_i = \frac{R_i}{\tau} $$

where τ is the block mining time, which, in our model, is equivalent to the round time. It
is the time it takes to mine a block either by an attacker or by honest miners. Due to the
difficulty adjustment mechanism in the Bitcoin system, the block mining time is no longer
a constant during the attack. It decreases once a selfish mining attack starts and forking occurs.
Therefore, the actual reward is not always proportional to the absolute reward.
Nevertheless, the other measurement we mentioned before, the relative reward, does
reflect the actual reward. Obviously, before the attack, i.e. all players are honest, a miner’s
actual reward is proportional to its mining power and its relative reward. Next, we will
show that with the difficulty adjustment, a miner’s actual reward remains proportional to
its relative reward during the attack. Therefore, using the relative reward as our reward
count is a proper mechanism.

Theorem 3.2.1. After the difficulty adjustment during the attack, the actual reward of a miner is
proportional to its relative reward.

Proof. The timeline of the attack and the difficulty adjustment is shown in Fig. 3.1.

Figure 3.1: Timeline of The Attack And The Difficulty Adjustment

Let $R_i$, $R'_i$ and $R''_i$ be the absolute rewards of miner i (i = 1, ..., n) before the attack,
during the attack but before the difficulty adjustment, and after the difficulty adjustment,
respectively. Similarly, let $D$, $D'$, $D''$ and $\tau$, $\tau'$, $\tau''$ be the mining difficulty and
the block mining time in these three periods, respectively.

At the beginning, both the attackers and the honest miners mine honestly. Then, when
the attacker launches the attack, the growth rate of the main chain decreases due to the
forking and block overwriting, but the difficulty and the block mining time remain
unchanged at this moment, that is, $\sum_{j=1}^{n} R'_j < \sum_{j=1}^{n} R_j$, $D = D'$ and $\tau = \tau'$. Later, within
a period in which the main chain grows by at most 2016 blocks, the system finds that it takes
longer to mine a block in the main chain, hence the difficulty goes down, as does the
block mining time, for both attackers and honest miners, that is, $D'' < D'$ and $\tau'' < \tau'$.
So the main chain grows by 1 block per 10 minutes while the actual block mining time is
lower than 10 minutes. At the same time, the absolute reward of every miner per round
does not change as the mining power distribution and the mining strategies do not alter,
that is, $R'_i = R''_i$ (i = 1, ..., n). After the difficulty adjustment, the attackers continue to
attack and exploit the benefits from the honest miners.
Here, we assume that the duration of the attack is much longer than the period
between the launch of the attack and the difficulty adjustment. Thus, the attacker’s rev-
enue is measured by its relative reward after the difficulty adjustment, that is

$$ r_i^{attack} = \frac{R''_i}{\sum_{j=1}^{n} R''_j} $$

Furthermore, as the growth rate (blocks per round) of the main chain, which is equal
to the sum of the absolute rewards of all the miners, decreases from $\sum_{j=1}^{n} R_j$
to $\sum_{j=1}^{n} R'_j$, the difficulty and the block mining time also decrease proportionally
from $D'$ and $\tau'$ to $D''$ and $\tau''$ in the difficulty adjustment. So it goes as follows,

$$ \frac{\tau''}{\tau} = \frac{\tau''}{\tau'} = \frac{D''}{D'} = \frac{\sum_{j=1}^{n} R'_j}{\sum_{j=1}^{n} R_j} = \frac{\sum_{j=1}^{n} R''_j}{\sum_{j=1}^{n} R_j} $$

Hence, the actual reward of miner i after attack and adjustment is

$$ \mathcal{R}_i^{attack} = \frac{R''_i}{\tau''} = \frac{R''_i}{\tau} \cdot \frac{\sum_{j=1}^{n} R_j}{\sum_{j=1}^{n} R''_j} = \frac{R''_i}{\sum_{j=1}^{n} R''_j} \cdot \frac{\sum_{j=1}^{n} R_j}{\tau} = r_i^{attack} \cdot \frac{1\ \text{BTC/round}}{10\ \text{min/round}} = \frac{r_i^{attack}}{10}\ \text{BTC/min} $$

The original block time $\tau$ is a constant (e.g. 10 min in Bitcoin). Thus, regardless of the
attacking strategy and the mining power distribution, the actual reward of miner i, $\mathcal{R}_i$, is
proportional to its relative reward $r_i$ during the attack and after the difficulty adjustment.

Therefore, compared to the absolute reward $R''_i$, which is not proportional to the
actual reward during the attack, the relative reward is a proper measurement of a miner’s
revenue in our model.
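
The argument of Theorem 3.2.1 can be checked numerically with a short sketch. It assumes, as in the proof, that the block mining time shrinks in proportion to the drop in main-chain growth and that the total absolute reward before the attack is 1 BTC per round. The function name and the example rewards are hypothetical.

```python
def actual_rewards_after_adjustment(abs_rewards_attack, tau=10.0, total_before=1.0):
    """Return (actual rewards in BTC/min, adjusted block mining time in min).

    abs_rewards_attack -- absolute rewards R''_i per round during the attack
    tau                -- block mining time before the attack, in minutes
    total_before       -- sum of absolute rewards before the attack (1 BTC/round)
    """
    total_attack = sum(abs_rewards_attack)
    # Difficulty adjustment: tau'' / tau = sum(R''_j) / sum(R_j).
    tau_adjusted = tau * total_attack / total_before
    actual = [reward / tau_adjusted for reward in abs_rewards_attack]
    return actual, tau_adjusted


# Hypothetical attack in which 20% of the mined blocks are lost to forks.
actual, tau_adjusted = actual_rewards_after_adjustment([0.30, 0.50])
relative = [reward / 0.80 for reward in [0.30, 0.50]]
```

With these numbers the adjusted block mining time is about 8 minutes, and each actual reward equals the corresponding relative reward divided by 10 (for the attacker, 0.30 / 8 = 0.375 / 10 BTC/min), as the theorem states.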

3.3 Two-player Game

With the basic MDP model shown above, we now apply the model to analyze various
strategies in the two-player mining game, including selfish mining and two kinds of stub-
born mining.

3.3.1 Two-Dimensional MDP Model

We start from the simplest scenario with two players. One of the players is the single
attacker and the other is the group of honest miners. Table 3.1 lists the symbols used in
this section.
For the two-player game, the states in the MDP model are two-dimensional: each state
is the tuple of the lengths of the attacker’s private branch and the honest miners’ public
branch, (la, lh). Therefore, the graph of the two-dimensional MDP model can be regarded as a
chessboard consisting of the integral points in the first quadrant and the non-negative section of the
Symbol            Definition
pa                mining power of the attacker
ph                mining power of the honest miners
SM, SW, SL, SU    areas of mining, winning, losing and unavailable states
π(la, lh)         stationary distribution of state (la, lh)
la                length of the attacker’s private branch
lh                length of the honest miners’ public branch
lb                length of the public branch when the attacker restarts mining after losing
γ                 proportion of honest miners adopting the attacker’s branch in a tie
Table 3.1: List of Symbols Used in Section 3.3

two axes. The initial state of the mining game is always the origin (0, 0). Additionally,
the chessboard can be divided into four areas according to the actions of the players:
mining, winning, losing and unavailable. The mining area consists of the states where both
of the players keep mining in this round. The winning area consists of the states where
all of the honest miners adopt the attacker’s branch. The losing area consists of the states
where the attacker adopts the public branch. The rest of the states form the unavailable
area, which is never reached during the mining game. For example, in selfish mining,
the attacker gives up when the public branch is longer than its private branch. Thus,
states in which the public branch is two blocks longer than the private branch do not occur
in the mining game, and the states (la, la + 2) are in the unavailable area.
Meanwhile, as discussed in Section 3.1, in each round, the attacker with mining power
pa has a probability of pa to mine a new block. In contrast, the honest miners have a
probability of ph to be the miner of the new block, where ph = 1 − pa . Hence, the transition
probabilities of the states in each area are

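$$
\begin{aligned}
\Pr\left[ s_{t+1} = (l_a + 1, l_h) \mid s_t = (l_a, l_h) \in S_M \right] &= p_a \\
\Pr\left[ s_{t+1} = (l_a, l_h + 1) \mid s_t = (l_a, l_h) \in S_M \right] &= p_h = 1 - p_a \\
\Pr\left[ s_t = (0, 0) \mid s_t = (l_a, l_h) \in S_W \right] &= 1 \\
\Pr\left[ s_t = (0, l_b) \mid s_t = (l_a, l_h) \in S_L \right] &= 1
\end{aligned}
\tag{3.2}
$$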


where lb is any length from 0 to lh − 1 depending on the attacker’s strategy. Note that
when the attacker loses, it can choose to adopt part of the public branch and start to mine
below the top of the current public chain. However, for selfish mining or other similar
strategies, the attacker is more likely to win if it always chooses to restart its mining at the
top of the public chain after the losing round, hence we set lb to 0 in our study.
Furthermore, we have discussed the tie cases in Section 2.2. When the attacker
publishes its private branch at the same length as the public branch, the two branches
are in competition. The group of honest miners will divide into two groups, adopting
different branches. Let γ be the proportion of the honest miners adopting the attacker’s
branch. Thus, the hash power that mines on the attacker’s branch is expanded in this
round and the probability that the attacker’s branch reaches the winning area in the next
round is pa + γph = (1 − γ)pa + γ. Accordingly, the probability that it reaches the losing
area in the next round is (1 − γ)ph = −(1 − γ)pa + (1 − γ).
Given the range of the areas and the transition probabilities, we are now able to com-
pute the stationary distribution of each state, then calculate the reward of the attacker and
finally compare it to the attacker’s mining power to determine the benefit of the attack.
Next we will give detailed analysis with some specific attacking strategies.
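
Before turning to specific strategies, the generic transition rules of this section can be sketched in Python. This is only an illustration of Eq. (3.2) and of the tie case: the function names `mining_transitions` and `tie_transitions` are assumptions made here and do not come from the thesis.

```python
def mining_transitions(la, lh, pa):
    """Successor states of a mining-area state (la, lh), following Eq. (3.2)."""
    ph = 1.0 - pa  # honest miners hold the remaining mining power
    return {(la + 1, lh): pa, (la, lh + 1): ph}


def tie_transitions(pa, gamma):
    """Win/lose probabilities of the attacker's branch in the round after a tie.

    gamma is the fraction of honest power that mines on the attacker's branch.
    """
    ph = 1.0 - pa
    return {"win": pa + gamma * ph, "lose": (1.0 - gamma) * ph}


if __name__ == "__main__":
    print(mining_transitions(2, 0, 0.3))
    print(tie_transitions(0.25, 0.5))
```

In both cases the returned probabilities sum to 1, which is a quick sanity check on the transition rules.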

3.3.2 Two-Player Selfish Mining

With the selfish mining attack given by Algorithm 1, the areas in the MDP graph can be
defined as follows.

$$
\begin{aligned}
S_M &= \{(0, 0), (1, 0), (1, 1)\} \cup \{(l_a, l_h) \mid l_a > l_h + 1\} \\
S_W &= \{(l_a, l_h) \mid l_a = l_h + 1,\ l_h > 0\} \\
S_L &= \{(0, 1), (1, 2)\}
\end{aligned}
$$

Figure 3.2: Selfish Mining Strategy on 2D MDP Model

Figure 3.2 shows the two-dimensional MDP transition graph of selfish mining. In
particular, for the state (1, 1), which is the tie case, the transition probabilities are

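$$
\begin{aligned}
\Pr\left[(2, 1) \mid (1, 1)\right] &= p_a + \gamma p_h = (1 - \gamma) p_a + \gamma \\
\Pr\left[(1, 2) \mid (1, 1)\right] &= (1 - \gamma) p_h = -(1 - \gamma) p_a + (1 - \gamma)
\end{aligned}
\tag{3.3}
$$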

Then, the reward of the selfish mining attack can be calculated from the MDP model.

Lemma 3.3.1. The reward of the two-player selfish mining attack is

$$
r_a^{SM} = \frac{p_a (4p_a^2 - 9p_a + 4) + (1 - p_a)^2 (1 - 2p_a) \cdot \gamma}{p_a^3 - 2p_a^2 - p_a + 1} \cdot p_a
\tag{3.4}
$$

Proof. From the transition probabilities (3.2) and (3.3), the stationary distribution Π can
be solved (the derivations of the stationary distribution Π and the reward raSM are
attached in Appendix A).

$$
\pi(l_a, l_h) =
\begin{cases}
\dfrac{1 - 2p_a}{2p_a^3 - 5p_a^2 + 1} \cdot p_a^{l_a} \cdot p_h^{l_h}, & (l_a, l_h) \in \{(0, 0), (1, 0), (1, 1)\} \\[2ex]
\dfrac{1 - 2p_a}{2p_a^3 - 5p_a^2 + 1} \cdot C(l_a - 2, l_h) \cdot p_a^{l_a} \cdot p_h^{l_h}, & l_a \geq 2,\ l_a \geq l_h + 2
\end{cases}
\tag{3.5}
$$

where C(la − 2, lh ) is the Catalan Triangle Number,

   
$$
C(n, k) = \binom{n + k}{k} - \binom{n + k}{k - 1} = \frac{(n + k)!\,(n - k + 1)}{k!\,(n + 1)!}, \qquad n, k \in \mathbb{N},\ n \geq k
$$
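
As a small illustration, the Catalan triangle number can be computed from the closed form above. The helper name `catalan_triangle` is hypothetical, and the sketch assumes integer arguments with 0 ≤ k ≤ n.

```python
from math import comb


def catalan_triangle(n, k):
    """Catalan triangle number C(n, k) for integers n >= k >= 0."""
    if k < 0 or k > n:
        raise ValueError("requires 0 <= k <= n")
    # Closed form (n + k)! (n - k + 1) / (k! (n + 1)!), kept in exact integers.
    return comb(n + k, k) * (n - k + 1) // (n + 1)


if __name__ == "__main__":
    for n in range(4):
        print([catalan_triangle(n, k) for k in range(n + 1)])
```

Using the closed form rather than the difference of binomials avoids evaluating a binomial coefficient with a negative lower index when k = 0.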

Then, the reward of the attacker is


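$$
\begin{aligned}
R_a &= 2 \cdot p_a \cdot \pi(1, 1) + 1 \cdot \gamma (1 - p_a) \cdot \pi(1, 1) + \sum_{i=2}^{\infty} i \cdot (1 - p_a) \cdot \pi(i, i - 2) \\
    &= \frac{p_a^2 (4p_a^2 - 9p_a + 4)}{2p_a^3 - 5p_a^2 + 1} + \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 5p_a^2 + 1} \cdot \gamma \\
R_h &= 1 \cdot (1 - p_a) \cdot \pi(0, 0) + 2 \cdot (1 - \gamma)(1 - p_a) \cdot \pi(1, 1) + 1 \cdot \gamma (1 - p_a) \cdot \pi(1, 1) \\
    &= \frac{(1 - p_a)(1 - 2p_a)(-2p_a^2 + 2p_a + 1)}{2p_a^3 - 5p_a^2 + 1} - \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 5p_a^2 + 1} \cdot \gamma \\
r_a^{SM} &= \frac{R_a}{R_a + R_h} \\
    &= \frac{p_a (4p_a^2 - 9p_a + 4) + (1 - p_a)^2 (1 - 2p_a) \cdot \gamma}{p_a^3 - 2p_a^2 - p_a + 1} \cdot p_a
\end{aligned}
$$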

Therefore, we can determine the profitability of the two-player selfish mining attack.

Theorem 3.3.2. The attacker can benefit from the two-player selfish mining attack if and only
if its mining power is larger than (1 − γ)/(3 − 2γ), where γ is the proportion of honest miners
adopting the attacker's branch in the tie scenario.
Specifically, when γ = 0, the threshold of the attack is 1/3. When γ = 50%, the threshold is 25%.

Proof. Given the reward of the attacker and the rationality assumption, an attacker can
benefit if and only if

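$$
r_a^{SM} = \frac{p_a (4p_a^2 - 9p_a + 4) + (1 - p_a)^2 (1 - 2p_a) \cdot \gamma}{p_a^3 - 2p_a^2 - p_a + 1} \cdot p_a \geq p_a
$$

thus,

$$
\gamma \geq \frac{1 - 3p_a}{1 - 2p_a} \quad \text{or} \quad p_a \geq \frac{1 - \gamma}{3 - 2\gamma}
$$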

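Therefore, when γ = 0, the threshold of the attack is

$$
\left. \frac{1 - \gamma}{3 - 2\gamma} \right|_{\gamma = 0} = \frac{1}{3}
$$

When γ = 50%, the threshold is

$$
\left. \frac{1 - \gamma}{3 - 2\gamma} \right|_{\gamma = 0.5} = \frac{1}{4}
$$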

Thus, we derive the same results as presented in [34], validating our multi-dimensional
model.
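
The closed forms of Lemma 3.3.1 and Theorem 3.3.2 can be checked numerically. The following is a minimal sketch with hypothetical function names (`selfish_reward`, `selfish_threshold`); it only evaluates Eq. (3.4) and the threshold (1 − γ)/(3 − 2γ) and is not part of the thesis's tooling.

```python
def selfish_reward(pa, gamma):
    """Relative reward of the two-player selfish miner, Eq. (3.4)."""
    num = pa * (4 * pa**2 - 9 * pa + 4) + (1 - pa) ** 2 * (1 - 2 * pa) * gamma
    den = pa**3 - 2 * pa**2 - pa + 1
    return num / den * pa


def selfish_threshold(gamma):
    """Minimum profitable mining power from Theorem 3.3.2."""
    return (1 - gamma) / (3 - 2 * gamma)


if __name__ == "__main__":
    for gamma in (0.0, 0.5):
        pa = selfish_threshold(gamma)
        # At the threshold the relative reward equals the mining power.
        print(gamma, pa, selfish_reward(pa, gamma))
```

At the threshold the reward coincides with the attacker's mining power, so the attack breaks even; above it the reward exceeds the mining power.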

3.3.3 Alternative Mining Strategies

The two-dimensional MDP model can be generalized to other attacks with alternative
mining strategies, such as Stubborn Mining discussed in Section 2.2. We will first
analyze a simpler strategy, naive stubborn mining (NStM), then show another version of
stubborn mining (StM), T1 -stubborn mining, which has a threshold slightly lower than that of
selfish mining.
The main difference between stubborn mining and selfish mining is that the attacker
does not give up even when its private branch is behind the public branch. Although
intuitively such a strategy may not increase the attacker's chance of winning, our model
shows that even without selfish mining, the stubborn behaviour itself has a beneficial
threshold below 50%. Algorithm 3 presents the algorithm of naive stubborn mining.
Figure 3.3 shows the MDP graph of naive stubborn mining. Then, we can calculate
the attacker's relative reward, and the threshold of the attack.

Theorem 3.3.3. The attacker’s reward from the two-player naive stubborn mining is

$$
r_a^{NStM} = \frac{p_a}{p_a^4 - 3p_a^3 + 5p_a^2 - 4p_a + 2}
\tag{3.6}
$$

Algorithm 3 Naive Stubborn Mining Algorithm
    if My pool found a block then
        ∆prev ← length(private branch) − length(public branch)
        append new block to private branch
        privateBranchLen ← privateBranchLen + 1
        if ∆prev = 0 then
            publish its private branch
            privateBranchLen ← 0
        Mine at the new head of the private chain
    if Others found a block then
        ∆prev ← length(private branch) − length(public branch)
        append new block to public branch
        if ∆prev = −1 then
            private branch ← public branch
            privateBranchLen ← 0
        Mine at the head of the private chain

Figure 3.3: Naive Stubborn Mining Strategy on 2D MDP Model

The threshold of the two-player naive stubborn mining attack is 43.1%.

Proof. Given the transition probabilities (3.2) and the transition graph Fig. 3.3, the station-
ary distribution Π can be solved.

$$
\pi(l_a, l_h) = \frac{1 - p_a + p_a^2}{2 - p_a} \cdot p_a^{l_a} \cdot p_h^{l_h}, \qquad (l_a, l_h) \in S_M
\tag{3.7}
$$

The reward of the attacker is


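$$
\begin{aligned}
R_a &= \sum_{i=0}^{\infty} (i + 1) \cdot p_a \cdot \pi(i, i) = \frac{p_a}{(2 - p_a)(1 - p_a + p_a^2)} \\
R_h &= \sum_{i=0}^{\infty} (i + 2) \cdot (1 - p_a) \cdot \pi(i, i + 1) = \frac{(1 - p_a)^2 (2 - p_a + p_a^2)}{(2 - p_a)(1 - p_a + p_a^2)} \\
r_a^{NStM} &= \frac{R_a}{R_a + R_h} = \frac{p_a}{p_a^4 - 3p_a^3 + 5p_a^2 - 4p_a + 2}
\end{aligned}
$$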

To profit from the attack,

$$
r_a^{NStM} = \frac{p_a}{p_a^4 - 3p_a^3 + 5p_a^2 - 4p_a + 2} \geq p_a \;\Rightarrow\; (1 - p_a)(p_a^3 - 2p_a^2 + 3p_a - 1) \geq 0
$$

The solution of the inequality is pa ≥ 0.43016, thus, the threshold of the naive stubborn
mining attack is 43.1%.
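
The numerical threshold can be reproduced by locating the root of the cubic factor p³ − 2p² + 3p − 1 on (0, 0.5). The sketch below uses plain bisection; the names `nstm_reward`, `bisect_root` and `nstm_threshold` are illustrative assumptions, not part of the thesis.

```python
def nstm_reward(pa):
    """Relative reward of naive stubborn mining, Eq. (3.6)."""
    return pa / (pa**4 - 3 * pa**3 + 5 * pa**2 - 4 * pa + 2)


def bisect_root(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return (lo + hi) / 2


def nstm_threshold():
    """Smallest mining power at which naive stubborn mining breaks even."""
    return bisect_root(lambda p: p**3 - 2 * p**2 + 3 * p - 1, 0.0, 0.5)


if __name__ == "__main__":
    p_star = nstm_threshold()
    print(p_star, nstm_reward(p_star))
```

The cubic is strictly increasing (its derivative 3p² − 4p + 3 has no real roots), so the root found by bisection is the only one.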

From the analysis above, we can see that the naive stubborn behaviour itself is actually
beneficial, but its threshold is higher than that of selfish mining. Stubborn mining is a
combination of naive stubborn mining and selfish mining; the algorithm was presented
in Alg. 2 in Section 2.2.3. That is, one adds the stubborn behaviour on top of selfish
mining. More specifically, in selfish mining, the attacker gives up when the public branch
grows first in the tie case. With the stubborn behaviour, however, the attacker continues
to mine in the states of (la , la ) and (la , la +1) and gives up in the states of (la , la +2) (la ≥ 1).
Figure 3.4 shows the MDP model of the stubborn mining strategy.
Now, we can calculate the profitability and the threshold of the stubborn mining at-
tack. To simplify the calculation, we set the parameter γ to 0. That is, none of the honest
miners adopts the attacker’s branch when two branches are in tie.

Theorem 3.3.4. When γ = 0, the attacker’s reward from the two-player stubborn mining is

$$
r_a^{StM} = \frac{p_a^2 (-3p_a^5 + 10p_a^4 - 18p_a^3 + 20p_a^2 - 14p_a + 4)}{p_a^7 - 6p_a^6 + 13p_a^5 - 16p_a^4 + 10p_a^3 - 2p_a^2 - 2p_a + 1}
$$

Figure 3.4: Stubborn Mining Strategy on 2D MDP Model

The threshold of the two-player stubborn mining attack is 33.0%, which is slightly lower than 1/3,
the threshold of the selfish mining attack (γ = 0).

Proof. With the transition probabilities (3.2) and the transition graph Fig. 3.4, the station-
ary distribution Π can be solved.

$$
\pi(l_a, l_h) =
\begin{cases}
\dfrac{-3p_a^4 + 7p_a^3 - 6p_a^2 + 1}{(1 - p_a + p_a^2)(1 - 2p_a)} \cdot p_a^{l_a} \cdot p_h^{l_h}, & (l_a, l_h) \in \{(0, 0), (1, 0)\} \cup \{(l_a, l_a), (l_a, l_a + 1) \mid l_a \geq 1\} \\[2ex]
\dfrac{-3p_a^4 + 7p_a^3 - 6p_a^2 + 1}{(1 - p_a + p_a^2)(1 - 2p_a)} \cdot C(l_a - 2, l_h) \cdot p_a^{l_a} \cdot p_h^{l_h}, & l_a \geq 2,\ l_a \geq l_h + 2
\end{cases}
\tag{3.8}
$$

Then, with the winning and losing areas given by Fig. 3.4, the reward of the attacker is

$$
r_a^{StM} = \frac{R_a}{R_a + R_h} = \frac{p_a^2 (-3p_a^5 + 10p_a^4 - 18p_a^3 + 20p_a^2 - 14p_a + 4)}{p_a^7 - 6p_a^6 + 13p_a^5 - 16p_a^4 + 10p_a^3 - 2p_a^2 - 2p_a + 1}
$$

To benefit from the attack,

$$
r_a^{StM} \geq p_a \;\Rightarrow\; (1 - p_a)^3 (p_a^4 + 3p_a - 1) \geq 0
$$

The solution of the inequality is pa ≥ 0.32941, thus, the threshold of the two-player stub-
born mining attack is 33.0%.
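
To make the gap between the two thresholds concrete, the following sketch evaluates both reward formulas (with γ = 0) at a mining power just below 1/3. The function names `sm_reward` and `stm_reward` are assumptions introduced here; the formulas are those of Lemma 3.3.1 and Theorem 3.3.4.

```python
def sm_reward(pa, gamma=0.0):
    """Relative reward of selfish mining, Eq. (3.4)."""
    num = pa * (4 * pa**2 - 9 * pa + 4) + (1 - pa) ** 2 * (1 - 2 * pa) * gamma
    return num / (pa**3 - 2 * pa**2 - pa + 1) * pa


def stm_reward(pa):
    """Relative reward of stubborn mining for gamma = 0 (Theorem 3.3.4)."""
    num = pa**2 * (-3 * pa**5 + 10 * pa**4 - 18 * pa**3 + 20 * pa**2 - 14 * pa + 4)
    den = (pa**7 - 6 * pa**6 + 13 * pa**5 - 16 * pa**4
           + 10 * pa**3 - 2 * pa**2 - 2 * pa + 1)
    return num / den


if __name__ == "__main__":
    pa = 0.331  # between the two thresholds 0.32941 and 1/3
    print("selfish: ", sm_reward(pa) - pa)   # negative: not yet profitable
    print("stubborn:", stm_reward(pa) - pa)  # positive: already profitable
```

For a mining power between the two thresholds, stubborn mining is already profitable while selfish mining with γ = 0 is not.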

3.4 Discussion

In this section, we will briefly discuss some advantages and disadvantages of our model,
and also show the adaptation of the multi-player MDP model for the simulation in the
next chapter.

3.4.1 Advantages

Most of the previous studies analyze selfish mining with a one-dimensional MDP model,
which uses the difference between the two branches as its state. Although the calculation of
the one-dimensional model is simpler, it can only analyze the scenario with a single
attacker. Moreover, to cover more strategies, the model needs to be extended with more states.
For some strategies, it becomes over-complicated or is even unable to fully represent the strategy.
Compared to the previous one-dimensional model, our MDP model extends the states
to multiple dimensions by representing the length of each branch. Thus, it is suitable
for a scenario with any number of attackers. Furthermore, as shown in Section 3.3, it
is more flexible and can present various kinds of strategies. To analyze a strategy, we
first define the areas of mining, winning and losing according to the strategy. Then it is
straightforward to list the transition probabilities among the states. Next, we calculate
the miners’ rewards. For some complicated cases, we can compute the rewards with the
help of the MDP solving software. Finally, we can compare the attacker’s reward with its
proportion of mining power to determine whether it can benefit from the attack.

3.4.2 Deficiency

Despite its flexibility and other advantages, our MDP model still has some deficiencies.
Although our model can represent the multi-attacker scenario, it is still unable to
formalize the analysis beyond the two-player scenario due to the existence of cascading
release. Cascading release means that in our model, the state may transition several times
in one round. Thus, transitions are not limited to adjacent states,
which makes it difficult to formulate all the possible transition probabilities for the
computation. Therefore, for the multi-attacker scenario, we only apply the model as the basis for
simulation, considering the difficulties in the theoretical analysis.

3.4.3 Multi-player MDP Model for Simulation

As shown in Section 2.2, multi-player selfish mining has two new branching cases,
multiple tie and cascading release. In order to cope with these cases in our simulator, we
make two adaptations to the theoretical MDP model from Section 3.2.
First, we extend the parameter γ, which covers the tie cases in the two-player scenario,
to a matrix Γ = (γij ) ∈ [0, 1]^((n−1)×(n−1)), in which γij is the preference of miner i for miner
j's branch in multiple tie cases. Specifically, when a group of players M = {m1 , m2 , ..., mk }
is in a tie, the chance of a miner m′i ∉ M choosing miner mj 's branch is

$$
\Pr = \frac{\gamma_{m'_i m_j}}{\sum_{\mu \in M} \gamma_{m'_i \mu}}
$$

For simplicity, we assume that miners choose a branch uniformly at random in the
multiple tie cases, that is,

$$
\Gamma = \left( \frac{1}{n - 1} \right)_{(n-1) \times (n-1)}
$$
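
The branch choice in a multiple tie can be sketched as a normalization of one row of Γ over the tied miners. The function name `branch_choice_probs` and the dictionary layout are assumptions made for illustration; the thesis does not specify how its simulator stores Γ.

```python
def branch_choice_probs(gamma_row, tied):
    """Probability that a miner outside the tie picks each tied branch.

    gamma_row maps a branch owner to this miner's preference weight (one row
    of the matrix); tied is the set M of miners whose branches are in the tie.
    The weights of the tied miners are assumed to sum to a positive value.
    """
    total = sum(gamma_row[m] for m in tied)
    return {m: gamma_row[m] / total for m in tied}


if __name__ == "__main__":
    n = 4                                    # four players in total
    uniform_row = {m: 1 / (n - 1) for m in ("A1", "A2", "A3")}
    print(branch_choice_probs(uniform_row, ["A1", "A2"]))
```

With the uniform matrix, every tied branch is chosen with probability 1/k, where k is the number of branches in the tie.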
The other difference of the multi-player model is that a miner may conduct multiple
adopt actions in one round. Because of the existence of the cascading release, in a single
round, a miner may adopt new branches multiple times or adopt a further branch even
after publishing its own branch. Therefore, a player may take multiple actions in sequence

(with a negligible propagation time) and a round ends when all players return an idle action for
the current public chain.
Furthermore, to have a clearer picture of the rewards in the simulation of the multi-
player game, we collect the rewards of the miners at the end of each simulation, rather
than each time the honest miners adopt other branches. In the simulation, the
reward of a miner is counted as the number of blocks it mined that are finally admitted into
the main chain.

Chapter 4

Simulation On Multi-player Selfish Mining

In order to have a clear picture of selfish mining with multiple attackers, we build a
multi-player mining game simulator based on the multi-player MDP model in Chapter 3. In
this chapter, we first introduce the implementation of our simulator and the simulation
settings. Then, we analyze the results of our simulation and summarize each part of our
analysis with some observations. We start with the simplest scenario, the three-player
game with two attackers, then we look deeper into scenarios with more attackers. We
also analyze the Common Beneficial Areas (CBA) for different numbers of players and the
cases where attackers decide to fall back to honest mining. Furthermore, we relate our
simulation results to the real Bitcoin system and propose some methods to avoid the
multi-player selfish attack. Finally, we briefly present the simulation results for
three-player selfish mining in Ethereum.

4.1 Implementation

4.1.1 Structure of Simulator

The mining game simulator is developed based on our multi-player MDP model
presented in Sections 3.2 and 3.4.3. The simulator is implemented in Python; its structure is
shown in Fig. 4.1.

Figure 4.1: Structure of the Simulator

To accelerate the simulation process, instead of simulating the whole distributed en-
vironment, we use a centralized mediator to control the block mining and manage the
blockchain. With the mediator, the mining process can be simplified as randomly choos-
ing a miner in each round and creating a new block in the name of the miner. Then, the
miner decides whether to publish the new block. If so, the mediator will continue to query
actions from other miners, otherwise, the mediator will just add the new block to the
miner’s branch.
When querying actions, the mediator first sends the new block to all miners and every
miner makes a decision based on the published information and its private branch. Then,
if any of the miners returns a publish action, the mediator will again send the branches of
those who choose to publish to all other miners. And it repeats querying until every miner
returns an idle or adopt action. Next, the mediator adds those newly mined/published
blocks to the blockchain. Finally, after updating the blockchain, this round ends and the
mediator continues to the next round.
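
The round structure described above can be illustrated with a heavily simplified sketch. It is not the thesis's simulator: it assumes only two players (one selfish attacker and the honest group), folds the mediator into a single loop that draws the miner of each round, and tracks the state (la, lh) of Section 3.3.2. The function name `simulate_selfish_mining` and its parameters are hypothetical.

```python
import random


def simulate_selfish_mining(pa, gamma, rounds, seed=0):
    """Two-player selfish mining on the (la, lh) state space.

    Returns the attacker's share of the blocks that end up in the main chain.
    """
    rng = random.Random(seed)
    la = lh = 0              # private / public branch lengths since the fork
    reward_a = reward_h = 0  # main-chain blocks credited to each side
    for _ in range(rounds):
        u = rng.random()     # the mediator draws the miner of this round
        if u < pa:                          # attacker mined the block
            la += 1
            if (la, lh) == (2, 1):          # broke the tie: publish and win
                reward_a += 2
                la = lh = 0
        elif (la, lh) == (1, 1):            # honest block during a tie
            if u < pa + gamma * (1 - pa):
                reward_a += 1               # built on the attacker's branch
                reward_h += 1
            else:
                reward_h += 2               # built on the public branch
            la = lh = 0
        elif la == 0:                       # nothing private: attacker adopts
            reward_h += 1
        else:
            lh += 1
            if la == lh + 1:                # lead shrank to 1: publish all
                reward_a += la
                la = lh = 0
    return reward_a / (reward_a + reward_h)


if __name__ == "__main__":
    print(simulate_selfish_mining(0.4, 0.0, 200_000, seed=1))
```

For pa = 0.4 and γ = 0, Eq. (3.4) gives a relative reward of about 0.484, so the printed share should lie close to that value, up to sampling noise.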

4.1.2 Simulation Settings

The multi-player selfish mining simulations were run with up to 10 players and with
different combinations of mining powers. By default, the mining power of each miner
varies from 1% to 49% with a granularity of 1%. However, the number of possible
mining power combinations grows exponentially as the number of players increases.
Therefore, to keep the number of simulations at a reasonable level, for simulations with higher
numbers of players, we confine the simulation to ranges in which the attackers are
more likely to launch the attack. More specifically, we run combinations of mining powers
ranging from 1% to 49% for 2 to 5 players, from 5% to 20% for 6 to 7 players, and from
5% to 15% for 8 to 10 players. For every combination of mining powers, we conduct 10
simulation runs, and each simulation consists of 100,000 rounds.
For each combination, we calculate the average reward for each player over the 10 runs
and count the number of beneficial attackers.
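
The growth of the configuration space can be illustrated by counting power combinations. The sketch below rests on two assumptions that the text does not state: attacker order is irrelevant (unordered tuples), and the honest group keeps at least one granularity step of power. The function name `count_power_combinations` is hypothetical.

```python
from itertools import combinations_with_replacement


def count_power_combinations(num_attackers, low, high, step=1):
    """Count unordered attacker power tuples, in percent.

    Assumption: the honest pool must keep at least `step` percent.
    """
    levels = range(low, high + 1, step)
    return sum(
        1
        for combo in combinations_with_replacement(levels, num_attackers)
        if sum(combo) <= 100 - step
    )


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, "attackers:", count_power_combinations(k, 1, 49))
```

Even under these restrictive assumptions the count rises steeply with each additional attacker, which motivates the narrower ranges used for larger games.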

As mentioned in Section 2.1.3, the Ethereum model adds a natural forking rate
and uncle rewards on top of the Bitcoin model. Therefore, for the Ethereum
experiments with 3 players, we add a natural forking rate of 6% to the simulator, which
means that with a probability of 0.06 in each round, the honest miners who follow the
Ethereum protocol will create a fork. Also for simplicity, we only consider uncle blocks
with one-block depth, which receive 7/8 of the regular block reward as the default uncle
reward, with 1/32 of the regular block reward as the nephew reward [15]. Moreover, to briefly
study the impact of the uncle block rewards in the selfish mining game, we then compare it
with the situation in which both the uncle and nephew rewards are halved to 7/16 and 1/64.

4.2 Three-player Selfish Mining Game

In this section, we have a closer look at the rewards of each player for a 3-player game.

4.2.1 Profitability

Figure 4.2: Reward of Attacker 1 by the Power of Each Attacker

Fig. 4.2 shows the reward of attacker 1 depending on the mining power of attacker 1
and attacker 2, respectively.
Without any selfish mining attacker, an honest miner will gain the portion of rewards
that equals its mining power. This is represented by the dashed line in the first plot of Fig. 4.2.

Above this baseline, the blue line with square markers is the reward of selfish mining in a
conventional 2-player game, where the second attacker's mining power is 0%. This curve
is identical to the results of the previous studies, with the attacker benefiting once its
mining power is over 25%. The reproduction of this result confirms the correctness of
our model and our simulation. Then, when the second attacker comes in with powers of
10%, 20% and 30% respectively, the reward of the first attacker increases, as long as the
power of the first attacker is larger than or equal to that of the second attacker. Although
the second attacker also undermines the public chain, the more powerful the first attacker
is, the more it benefits from undermining both honest miners and the second attacker.
However, when the power of the second attacker is 40%, the first attacker loses a lot
unless it also has 40% or more. This is because if an attacker becomes the largest player
among the three, its private branch tends to be longer with a higher probability and wins
out more often than the other private branch or the public branch. Thus, the dominant
attacker gains most of the rewards. Also in the second plot of Fig. 4.2, when looking at
this from a different angle with the mining power of attacker 2 in the x-axis, the reward of
attacker 1 slightly increases at first and then drops drastically when the second attacker’s
power is larger than both the first attacker and honest miners.

Observation 4.2.1. The reward of the first attacker increases when a second attacker comes in
with less mining power, but drops if the second one has the largest power among three players.

4.2.2 Threshold of Three-player Attack

As we mentioned in the rationality assumption in Section 2.1, rational players will
only attack (mine selfishly) when it is worth it, i.e. when they gain more than the
expected rewards of honest mining, which are equal to their mining power. Thus, in the
first plot of Fig. 4.2, the intersection points between the reward lines and the dashed line
(honest mining) represent the thresholds of the attack for a single attacker as determined
by our simulation. A rational miner will not perform selfish mining if its mining
power is below that threshold. Compared to the 2-player case, these intersections move

Figure 4.3: Reward of Each Player in 3-Player Game

leftwards, when the second attacker’s mining power is between 20% and 30%. That is,
the overall threshold decreases in a 3-player game. However, in order to maintain the
3-player game, both attackers need to maintain their mining powers in a certain range so
that both of them hold a power above the threshold.
To have a better view of the rewards, Fig. 4.3 shows the relation between each player's
reward and the mining power of both attackers as determined by our simulation. The
x-axis and y-axis represent the mining power of each attacker from 0% to 49%. In the
first three subplots, each pixel shows the reward of each player under the mining power

configuration of its coordinate. The deeper the color is, the more the player can gain
under the current distribution of mining powers. We also indicate the area in which the
player gains more than its mining power. This is shown by the red dashed line in the first
three subplots. We can see in the first plot that the reward of attacker 1 is larger than its
mining power in the right area of the dashed line. For the second heat map, the reward of
attacker 2 is higher than its mining power above the dashed line. And for the third heat
map, the reward of the honest miner is higher than its mining power left of the dashed
line. So clearly, attacking only makes sense in the areas where the reward
is higher than the attacker's mining power. If attackers have too little power, then it is
actually the honest miner who gains most. Thus, two-attacker selfish mining can only
be maintained when the mining power distribution falls in the beneficial areas of both
attackers.
To better understand the interplay of these beneficial areas, we show in the fourth sub-
plot the merge of these three dashed lines. More precisely, the red line, brown line and
orange line show the beneficial thresholds of the two attackers and the honest miners,
respectively. There exists, in fact, an overlap between the two attackers’ beneficial area. It
is filled with blue dots and represents the Common Beneficial Area as mentioned in Section
3.2 for the 2 attackers in the 3-player game. This area, of course, cannot overlap with
the beneficial area of the honest miner as the rewards of all players must sum up to 1. If
the mining power of both attackers falls in this Common Beneficial Area, both of them
gain more rewards than if they did honest mining and, therefore, they will stick to selfish
mining. Hence in such cases, the honest miners become the only loser. We can see that
the bottom left corner of this area is (0.21, 0.21). In other words, the minimum power by
which an attacker can benefit is 21%, while this minimum threshold is 25% in the original
2-player game. It means that when there exist two major mining pools with power larger
than 21% and within the CBA, e.g. 22% each, they may choose selfish mining
independently and can benefit if both launch the attack. The lowering of the threshold reflects a
higher probability of a selfish mining attack in the actual PoW blockchain system.

Observation 4.2.2. The beneficial threshold of selfish mining decreases from 25% to 21% in a
3-player game.

4.3 Selfish Mining with More Players

As the threshold of the selfish mining attack decreases with two independent attackers,
does it decrease monotonically with increasing number of attackers? In this section, we
analyze multi-attacker scenarios in regard to the threshold, the stability of the attack and
the feasibility in the real Bitcoin system.

4.3.1 Upper Bound on The Number of Attackers

Given the exponential number of possible combinations of mining power with an
increasing number of players, discussing and visualizing in detail the relationship between the
mining powers of the different players is infeasible. Thus, our discussion focuses on the
Common Beneficial Area, especially for scenarios with more than 5 players. Fig. 4.4 is
a visual approximation of this area. It shows the number of beneficial attackers among
different mining power configurations. Each plot shows the results for a given number of
players. The x-axis shows the mining power of the attacker(s) with the smallest mining
power (minimal mining power of attackers), the y-axis shows the mining power of the
attacker(s) with the highest mining power. Thus, all attackers have a mining power that
lies between the value of the x- and the y-axis. Thus, except for the first plot with only
two attackers, each pixel actually represents several mining power combinations rather than
only one. The color of the pixel indicates the largest number of beneficial attackers pos-
sible within the possible combinations. Yellow parts in the background show areas that
are either not a possible configuration or that we did not simulate because we knew that
it would not result in a significant number of beneficial attackers.
The darkest area in each plot is the Common Beneficial Area, in which all n − 1 at-
tackers benefit from the attack. As shown in the plots, this area shrinks when n increases.

Figure 4.4: Number of Beneficial Attackers with Different Number of Players (n = 3, 4, 5,
6, 7, 8, 9, 10)

And for n larger than 8, it disappears, which means that the area is negligible under a 1%
mining power granularity. In other words, with 8 or more attackers, even if those attack-
ers successfully launch the attack simultaneously, it cannot be maintained if the mining
power of some attacker varies by ±0.5%. Therefore, it is unlikely to launch and hold a
multi-player selfish mining attack with more than 7 attackers in practice.
Furthermore, we can observe that the beneficial threshold for selfish mining decreases
when n increases. For instance, in a 5-player game it is only 15% and it goes further down
to 12% in an 8-player game. This also implies that selfish miners with mining power less
than 12% can benefit from the multi-player attack only when the number of attackers is
larger than 7, which is highly unlikely in practice. Therefore, we may infer that rational
miners with less than 12% of mining power will not choose to launch the attack but there
are feasible Common Beneficial Areas above this threshold.

Observation 4.3.1. The multi-player rational selfish attack is highly unlikely to be maintained
with more than 7 independent rational attackers. Furthermore, it is unlikely for a rational miner
with less than 12% of mining power to benefit from the multi-player selfish mining attack in any
configuration.

The box plots in Fig. 4.5 present which mining power attackers are frequently oper-
ating at in order for all attackers to simultaneously benefit from selfish mining in all the
simulation cases tested. In other words, it provides lower and upper bounds on the Com-
mon Beneficial Areas with different number of players. For example, in the three-player
(two-attacker) scenario, our simulation results show that the majority of cases in which selfish
mining is beneficial for both attackers fall in the range of [0.26, 0.35], which spans the
second and the third quartiles of the box plot.
In general, our results show that the majority of the cases, between the lower and the
upper quartiles, concentrate below 1/n. Also, the upper bounds of the areas, represented
by the top of the whiskers in the figure, lie between 1/n and 1/(n − 1), and the lower bounds,
represented by the bottom of the whiskers, lie between 1/(n + 2) and 1/n.

Figure 4.5: Attackers’ Mining Power in Common Beneficial Areas

Theoretically, we can explain the significance of the boundary lines (1/(n + 2), 1/n, and 1/(n − 1))
as follows.
Consider some special cases where every attacker has at least a power of 1/n. For any
value of n, the player representing the honest miners has no more power than any
attacker, and thus will not be able to collect any reward since the selfish miners can reliably
overwrite any block mined honestly. Furthermore, the attackers have to have relatively
similar mining powers, because if there is a significant difference, the attacker with a
higher power will win most of the time, while the other attackers lose, which would not
be a sustainable multi-player scenario within the CBA. Finally, if every attacker has the
same power, the mining power of each attacker cannot exceed 1/(n − 1), since the sum of the
attackers' power cannot be more than 1 (100% of the total mining power of the network).
In light of the above explanations, we observe that the region bounded by 1/n and 1/(n − 1)
falls inside the CBA at all player counts, meaning every selfish attacker will benefit from
the situation.
For the region below 1/n, a different explanation is necessary. Below this threshold, it
is not guaranteed that the honest miner is dominated by the attackers, i.e., that the
honest player is unable to collect any block reward. Still, it is possible that the attackers
benefit from selfish mining while leaving a portion of rewards to the honest miner.
We observe that the boundary line 1/(n + 2) reflects the minimum threshold of mining
power required by an attacker for a scenario to be sustainable (i.e., fall within the CBA).
Note that this fraction raises the sum of mining power controlled by selfish miners as the
number of players increases (this sum is expressed as (n − 1)/(n + 2)). For example, with 3 players
only, the mining power of the selfish attackers is 40%, whereas for 10 players, it is 75%.
The reason is that the attackers compete with each other overwriting their chains, which
wastes more power than in scenarios with fewer attackers (more details in Section ??).
In order to eventually benefit they must hold, as a whole, a larger share of the mining
power as the number of players increases. Thus, the lower bound must decrease
sub-linearly with respect to the number of players. From the simulation results, we observe
that the lower bound is located between 1/(n + 2) and 1/n.
Observation 4.3.2. The Common Beneficial Area is situated between 1/(n + 2) and 1/(n − 1), and the
beneficial threshold of multi-player selfish mining attack converges to 1/n with increasing n.
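
The three boundary lines and the combined attacker power at the lower bound can be tabulated with exact fractions. The function name `cba_bounds` and the dictionary keys are illustrative assumptions; the formulas are the ones stated in this section.

```python
from fractions import Fraction


def cba_bounds(n):
    """Reference lines for an n-player game (n - 1 attackers plus honest miners)."""
    return {
        "lower": Fraction(1, n + 2),
        "middle": Fraction(1, n),
        "upper": Fraction(1, n - 1),
        # Combined power of all attackers if each sits exactly on the lower line.
        "attacker_total_at_lower": Fraction(n - 1, n + 2),
    }


if __name__ == "__main__":
    for n in range(3, 11):
        b = cba_bounds(n)
        print(n, float(b["lower"]), float(b["middle"]), float(b["upper"]),
              float(b["attacker_total_at_lower"]))
```

For n = 3 the combined attacker power at the lower line is 2/5 (40%), and for n = 10 it is 3/4 (75%), matching the example in the text.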

4.3.2 Collusion among Attackers And Collapse of Multi-player Attack

Real-life blockchain systems are dynamic and miners can choose to join in or quit. Thus,
the proportion of mining powers of individual miners and pools can change. This volatil-
ity in mining power can have an impact on the behavior of rational miners as the mining
power is so crucial in determining the Common Beneficial Area. In particular, for situa-
tions with 3 to 7 attackers, the areas are relatively small, and thus, can be unstable. If one
or more attackers quit selfish mining due to fluctuation or miscalculation of mining
power or other unpredictable reasons, the attack degrades to a game with fewer players.
The rest of the attackers may turn from benefiting to losing rewards.
For example, assume a 5-player mining game with the mining power configuration
(pA1 , pA2 , pA3 , pA4 , pHM ) = (0.18, 0.18, 0.19, 0.20, 0.25), pHM being the honest miner and
the others the attackers. All 4 attackers benefit in this case. However, if the mining

power of weakest attacker A1 decreases only by 0.01 of power to 0.17 (and the honest
mining pool increases by 0.01), A1 begins to lose and may decide to fall back to honest
mining. As a result, we have now a 4-player game with a mining power configuration
(pA2 , pA3 , pA4 , pHM ) = (0.18, 0.19, 0.20, 0.43). A2, the second weakest attacker with 0.18 of
mining power also begins to lose rewards and quits from selfish mining, resulting in a
3-player game (pA3 , pA4 , pHM ) = (0.19, 0.20, 0.61), which is not in the common beneficial
area of a 3-player game. Therefore, none of the miners would stay in selfish mining at the
end and the attack collapses.
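
The bookkeeping of this collapse can be sketched as follows. The helper `fall_back` is hypothetical, and powers are held as integer percentages so the sums stay exact. Whether a given attacker is losing in each intermediate game comes from the simulation results and is not computed here.

```python
def fall_back(attackers, honest, quitter):
    """Configuration after `quitter` returns to honest mining.

    attackers maps attacker names to mining power in percent; the quitter's
    power is added to the honest group.
    """
    remaining = {name: p for name, p in attackers.items() if name != quitter}
    return remaining, honest + attackers[quitter]


if __name__ == "__main__":
    # State after A1 dropped from 18% to 17% and the honest pool grew to 26%.
    attackers, honest = {"A1": 17, "A2": 18, "A3": 19, "A4": 20}, 26
    for quitter in ("A1", "A2"):
        attackers, honest = fall_back(attackers, honest, quitter)
        print(sorted(attackers.items()), "honest:", honest)
```

Each departure enlarges the honest group, which is why the remaining attackers face a progressively less favourable game.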
Thus, we have to be aware that configurations with many attackers are quite unstable
as the Common Beneficial Area is small and a slight deviation in mining power can lead
to selfish miners quitting.

Observation 4.3.3. Situations with a large number of attackers are unstable and may degrade to
cases with fewer or even no attackers.

So far, we have assumed that all selfish miners work independently. In the real Bitcoin
system, there are several large mining pools and the assumption is that they do not work
together. However, collusion could be possible through off-chain channels. For example,
a very large mining pool could choose to publicly divide itself into two smaller pools to
avoid detection, but the two smaller pools still work together. With this, two attackers
can be externally regarded as two entities but actually mine together on the same private
branch and proportionally distribute the rewards from selfish mining [61, 75].

In our model, two colluding attackers can be treated as one attacker, and the n-player
game reduces to an (n − 1)-player game. Table 4.1 shows some examples of the effect
of collusion. In the table, HM denotes the group of honest miners, and A1, A2 and A3
denote the three attackers. For each player, P is its share of the mining power
and R is its relative reward rate compared to its mining power. A
positive R means that the player gains more than it would if all miners mined
honestly, and a negative R means that it gains less.

           A1                A2                A3                HM
Miners     P      R          P      R          P      R          P      R
Case 1     0.09   -28.62%    0.22     1.45%    -      -          0.69     3.27%
Case 2     0.31    11.85%    -      -          -      -          0.69    -5.32%
Case 3     0.22     1.65%    0.24     7.56%    -      -          0.54    -4.03%
Case 4     0.46    60.17%    -      -          -      -          0.54   -51.25%
Case 5     0.12   -13.28%    0.14    -9.86%    0.18     2.25%    0.56     4.59%
Case 6     0.12   -32.20%    0.32    27.99%    -      -          0.56    -4.96%
Case 7     0.14   -24.84%    0.30    23.90%    -      -          0.56    -6.59%
Case 8     0.18   -10.71%    0.26    14.07%    -      -          0.56    -3.09%
Case 9     0.44    49.01%    -      -          -      -          0.56   -38.51%

Table 4.1: Mining Power (P) and Reward Rate (R) of Miners in 2-, 3- and 4-Player Games

Case 1 shows a 3-player game in which attacker A1 loses, and case 3 shows a game in which
attacker A1 gains, but only a little, as its power is lower than that of A2. In contrast, A2
gains in both cases, and gains more than A1 in case 3. If A1 now joins A2, as shown in case 2
and case 4, together they gain significantly more than their combined mining power
would yield in an honest-only game.
Case 5 is a 4-player game with two attackers losing while the third attacker and the
honest miners benefit. When any two of the attackers join forces, as depicted in cases 6, 7
and 8 (the coalition is listed under A2 and the remaining attacker under A1), the reward
rate of the coalition is much higher than before, while the remaining attacker loses
considerably. In case 9, where all three attackers collude, their relative gain is even
higher than in cases 6, 7 and 8.
In a more systematic analysis, we have gone through all possible collusion situations
in our 2- to 4-player games. In all cases, attackers gain a higher reward rate if they do
selfish mining together and distribute the rewards proportionally to their actual mining
power. Therefore, collusion is a serious threat as the benefits can be significant, as shown
in Table 4.1.
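
The effect can be made concrete with a short Python sketch that converts the table entries into reward shares. It assumes that R is defined as (share − P)/P, so that a player's share of all rewards is P · (1 + R); this reading is an assumption, although under it the shares in each of the four cases used here add up to 1 within rounding. The names `CASES`, `reward_share` and `attacker_share` are illustrative only.

```python
# (mining power P, relative reward rate R) per player, copied from Table 4.1.
CASES = {
    1: {"A1": (0.09, -0.2862), "A2": (0.22, 0.0145), "HM": (0.69, 0.0327)},
    2: {"A1+A2": (0.31, 0.1185), "HM": (0.69, -0.0532)},
    3: {"A1": (0.22, 0.0165), "A2": (0.24, 0.0756), "HM": (0.54, -0.0403)},
    4: {"A1+A2": (0.46, 0.6017), "HM": (0.54, -0.5125)},
}


def reward_share(power, rate):
    """Share of all rewards, assuming R = (share - P) / P."""
    return power * (1.0 + rate)


def attacker_share(case):
    """Total reward share of all attackers (everyone except HM) in one case."""
    return sum(
        reward_share(p, r)
        for name, (p, r) in CASES[case].items()
        if name != "HM"
    )


# Compare each independent game with its colluding counterpart.
for independent, colluding in [(1, 2), (3, 4)]:
    before = attacker_share(independent)
    after = attacker_share(colluding)
    print(f"case {independent} -> case {colluding}: {before:.4f} -> {after:.4f}")
```

Under this assumption the attackers' combined share rises from roughly 0.29 to 0.35 when moving from case 1 to case 2, and from roughly 0.48 to 0.74 when moving from case 3 to case 4. Because the coalition distributes rewards proportionally to mining power, every member then obtains the coalition's reward rate.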

Observation 4.3.4. Collusion is always beneficial. Two attackers that join forces to mine selfishly
on a single private chain gain more than if they mine selfishly but independently.

Figure 4.6: Mining Power Distribution in Bitcoin [11]

4.3.3 Feasibility in Bitcoin

In our analysis above, we have shown that multi-player selfish-mining attacks are possi-
ble where all selfish miners benefit compared to honest miners. In this section, we look
at the actual Bitcoin network and see whether the current network has the possibility of
multi-attacker situations.
Fig. 4.6 shows the overall mining power distribution of the Bitcoin network over one
year (July 2019 to June 2020). Although the pools switch ranking frequently, it gives an
idea of the typical sizes of pools.
As shown in Fig. 4.6, the largest mining pool holds 16.7% of the mining power, which is
less than the lower bound of the Common Beneficial Area in a 4-player game and larger
than the upper bound in a 7-player game. Thus, if this largest mining pool mines selfishly,
only a 5- or 6-player game can place it in the Common Beneficial Area. However, the
fourth largest mining pool has only 11.6% of the mining power, which is below the lower
bound for a 5-player game, and the fifth largest mining pool has only 6.7%,
well below the lower bound for a 6-player game. Therefore, a multi-player selfish mining
attack cannot be maintained under the mining power distribution of the current Bitcoin
system.

Figure 4.7: Minimal Total Mining Power of n − 1 Attackers in Common Beneficial Areas

Observation 4.3.5. It is unlikely for large mining pools in the current Bitcoin network to launch
a multi-player selfish mining attack under the current configuration of mining power.

Having said this, the mining powers in the current network are not far
from configurations with feasible attacks. To avoid such configurations, one can impose limits
on the sizes of mining pools. For instance, to avoid a 2-player selfish mining attack, [34]
proposed that no mining pool should hold more than 25% of the mining power. We now
apply such limits to multi-player attacks. Fig. 4.7 shows the smallest total mining power
of all n − 1 attackers. Since the lower bound for every attacker is $\frac{1}{n+2}$, the sum
of these lower bounds over all attackers is $\frac{n-1}{n+2}$. If the overall mining power of
the top n − 1 mining pools is less than $\frac{n-1}{n+2}$, the necessary condition for a
multi-player attack cannot be met. For example, the combined mining power of the two
largest mining pools should be below 40%, that of the three largest below 50%, etc.
Therefore, an early detection method for multi-player selfish-mining attacks is to monitor
the mining power of the top 7 mining pools in the network, since attacks with 8 or more
attackers are unlikely to occur. Let k = n − 1. If the total power of the top k pools is less
than $\frac{k}{k+3}$ (1 ≤ k ≤ 7), the system is safe from the attack. Once the mining power
exceeds the threshold, we need to check whether the mining power distribution of these miners
is located in the Common Beneficial Areas. If so, the system is at risk of a multi-player
attack from these miners. Then, we need to further scrutinize the mining behaviour of
the miners to tell whether the system is under attack.
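
A minimal sketch of the first monitoring step in Python, under stated assumptions: the function names `risk_threshold` and `pools_at_risk` are hypothetical, the pool shares passed in are made-up illustrative values rather than measured data, and a top-k total that exactly equals the threshold is treated as exceeding it. The sketch only covers the necessary condition; checking the Common Beneficial Areas still requires the simulation results.

```python
from fractions import Fraction

MAX_K = 7  # attacks with 8 or more attackers are considered unlikely


def risk_threshold(k):
    """Threshold k / (k + 3) on the total power of the top k pools."""
    return Fraction(k, k + 3)


def pools_at_risk(pool_shares):
    """Return every k (1 <= k <= 7) whose top-k total reaches the threshold."""
    ranked = sorted(pool_shares, reverse=True)
    flagged = []
    total = 0.0
    for k, share in enumerate(ranked[:MAX_K], start=1):
        total += share
        if total >= risk_threshold(k):
            flagged.append(k)
    return flagged


# Hypothetical share lists, chosen only to exercise both outcomes.
print(pools_at_risk([0.167, 0.12, 0.11, 0.10]))
print(pools_at_risk([0.30, 0.15, 0.10, 0.05]))
```

An empty result means that no group of top pools meets the necessary condition; a non-empty result lists the values of k for which the distribution should be checked against the Common Beneficial Areas.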

Observation 4.3.6. By monitoring the total power of the top k mining pools (1 ≤ k ≤ 7) and
comparing it with the threshold $\frac{k}{k+3}$, we can tell whether the system is at risk of a
multi-player selfish-mining attack.

4.4 Three-player Game in Ethereum

We also extend our simulation to another protocol, Ethereum, which introduces the uncle
block rewards into the incentive mechanism.

Figure 4.8: Number of Beneficial Attackers in 3-player Game in Ethereum

Our simulation results are presented in Fig. 4.8, which shows the number of beneficial
attackers in a 3-player game in Ethereum with different uncle block rewards. Compared to
Bitcoin, where the uncle block reward can be considered to be 0, the Common Beneficial
Area in Ethereum expands as the uncle rewards increase, and the beneficial threshold
decreases. For the standard setting of the current Ethereum system, which has $\frac{7}{8}$
units of uncle reward and $\frac{1}{32}$ units of nephew reward, the beneficial threshold of a
3-player selfish mining attack decreases to 19% of the total mining power, from 21% in Bitcoin
without uncle block rewards.
Introducing uncle block rewards stimulates selfish mining. A selfish
miner undermines the interests of honest miners by forking a private branch and
overwriting the public branch, while uncle block rewards give additional credit to
blocks that are not included in the final main chain. Therefore, selfish miners tend
to release their private branches even when those fail to override the public branch. If the
blocks in their own branch get referenced by new blocks in the main chain, selfish miners
gain additional rewards on top of the original block rewards. Also, selfish miners will
not actively reference the blocks in the public branch, since the nephew rewards are much
smaller than the uncle rewards. Although the absolute rewards of both selfish miners and
honest miners increase through uncle block referencing, selfish miners
still gain proportionally more under this mechanism. Additionally, the lower block time in
Ethereum causes a higher forking rate in the public chain, which also weakens the power of
honest miners.
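
The size of the incentive can be illustrated with the reward values quoted above. The following Python sketch is only a back-of-the-envelope illustration: it uses the fixed setting of 7/8 and 1/32 units stated in the text, expresses everything in units of one main-chain block reward, and the constant and function names are hypothetical.

```python
from fractions import Fraction

BLOCK_REWARD = Fraction(1)       # reward of a main-chain block, used as the unit
UNCLE_REWARD = Fraction(7, 8)    # reward for a block referenced as an uncle
NEPHEW_REWARD = Fraction(1, 32)  # bonus for the block that references an uncle


def reward_of_losing_block(referenced):
    """Reward for a block that did not make it into the main chain."""
    return UNCLE_REWARD if referenced else Fraction(0)


print("no uncle reward (Bitcoin-like):", reward_of_losing_block(False))
print("referenced as uncle (Ethereum):", reward_of_losing_block(True))
print("uncle / nephew reward ratio:   ", UNCLE_REWARD / NEPHEW_REWARD)
```

A block that loses the fork race still earns most of a block reward once it is referenced, whereas referencing someone else's block earns the referencing miner only a small bonus.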

Observation 4.4.1. Selfish mining is more advantageous in Ethereum because of the uncle block
reward mechanism. The beneficial threshold of 3-player selfish mining decreases from 21% in
Bitcoin to 19% in Ethereum.

Chapter 5

Conclusions

5.1 Conclusions

In this work, we first reviewed the blockchain mechanism and the concept of selfish mining.
Considering the possible existence of multi-player selfish mining, we discussed some
new branching cases that only occur in a multi-player attack. We also reviewed
previous studies related to selfish mining and other alternative mining strategy attacks.
Then, given a few basic assumptions, we formally defined a multi-dimensional Markov
Decision Process (MDP) model for multi-player selfish mining attacks. We justified the
relative reward in our model as a proper measure of a miner’s revenue by analyzing
the actual profitability with regard to real time. Furthermore, we used our two-dimensional
MDP model to calculate the expected rewards and the thresholds of both
selfish mining and stubborn mining in the two-player scenario. Our calculation shows
that the threshold of selfish mining aligns with the results of previous studies. Additionally,
we showed that the threshold of stubborn mining is even slightly lower than that
of selfish mining. Because of difficulties in the theoretical computation of the
multi-attacker cases, we turned to simulation to scrutinize scenarios with more than one
attacker.

From the simulation results of a 3-player game, we found that the reward of the first
attacker increases when a second, weaker attacker joins in, and that there exists an area in
which both attackers benefit, which implies the possibility of a 3-player attack. Also, in a
3-player game, the threshold of selfish mining is only 21%, compared to 25% in a 2-player
game.
For games with more attackers, the Common Beneficial Area lies between $\frac{1}{n+2}$ and
$\frac{1}{n-1}$, and it shrinks as the number of players increases. When the number of attackers
is more than 7, the Common Beneficial Area becomes so small that it is barely possible for
8 or more attackers to launch and maintain the selfish mining attack at the same time.
In practice, miners with less than 12% of the mining power are unlikely to benefit from
selfish mining. We also observe that with more attackers, the attack is more unstable and
collapses more easily into games with fewer or no attackers, and that colluding instead of
attacking independently is always beneficial. Furthermore, we applied the simulation results
to the current Bitcoin system and showed that its distribution of mining power
does not satisfy the condition for a beneficial multi-player selfish mining attack.
One can ensure that selfish mining is not possible by restricting the total mining power of
the top n mining pools to be below $\frac{n}{n+3}$ of the overall mining power. Finally, we ran the
3-player selfish mining simulation on Ethereum, showing that the introduction of uncle
block rewards and a higher forking rate lowers the threshold from 21%
to 19%.

5.2 Future Work

We can envision the following future work.


For the theoretical part, as the multiple-release cases cannot be handled properly in the
current model, we will continue to adapt our multi-player model. Another possible way
to resolve the problem is to use an MDP solver, although this requires a more concise
definition of the transition probabilities among the states.

For the simulation part, selfish mining is only one of the possible Byzantine behaviors,
and other malicious mining strategies exist. We will continue to work on the Ethereum
simulation and will include stubborn mining and sub-optimal mining in our simulation
studies. Also, we will try to analyze the results based on the distribution of rewards,
because currently the mining power distribution is counted by the blocks in the public
chain, which does not necessarily represent the actual mining power [93].

Appendix A

Proof of Lemma 3.3.1

To prove Lemma 3.3.1, we first introduce the Catalan triangle and the Catalan numbers.
Then, we list the transition probabilities and boundary conditions and solve for the stationary
distribution. Finally, we calculate the rewards of both the attacker and the honest
miners.

A.1 Catalan Triangle And Catalan Numbers

Catalan Triangle [79], C(n, k), (n ≥ k ≥ 0, n, k ∈ Z), is a number pattern which satisfies

\[
\begin{aligned}
C(n, 0) &= 1, && n \ge 0 \\
C(n, k) &= C(n-1, k) + C(n, k-1), && n - 1 \ge k \ge 1,\ n \ge 2 \\
C(n, n) &= C(n, n-1), && n \ge 1
\end{aligned}
\tag{A.1}
\]

Table A.1 shows a part of the triangle. The general formula of the numbers in the
Catalan triangle is

   
\[
C(n, k) = \binom{n+k}{k} - \binom{n+k}{k-1} = \frac{(n+k)!\,(n-k+1)}{k!\,(n+1)!}, \quad n \ge k \ge 0
\]

n \ k    0    1    2     3     4     5      6      7      8
0        1
1        1    1
2        1    2    2
3        1    3    5     5
4        1    4    9    14    14
5        1    5   14    28    42    42
6        1    6   20    48    90   132    132
7        1    7   27    75   165   297    429    429
8        1    8   35   110   275   572   1001   1430   1430

Table A.1: Catalan Triangle
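
As a cross-check, the triangle can be rebuilt from the recurrence (A.1) and compared with the closed-form expression. The Python sketch below is illustrative only; the function names `catalan_triangle` and `closed_form` are not part of the original derivation.

```python
from math import comb


def catalan_triangle(rows):
    """Build rows 0..rows-1 of the Catalan triangle from recurrence (A.1)."""
    tri = []
    for n in range(rows):
        row = [1]                                   # C(n, 0) = 1
        for k in range(1, n):
            row.append(tri[n - 1][k] + row[k - 1])  # C(n-1, k) + C(n, k-1)
        if n >= 1:
            row.append(row[n - 1])                  # C(n, n) = C(n, n-1)
        tri.append(row)
    return tri


def closed_form(n, k):
    """C(n, k) = binom(n+k, k) - binom(n+k, k-1), with the second term 0 for k = 0."""
    return comb(n + k, k) - (comb(n + k, k - 1) if k >= 1 else 0)


triangle = catalan_triangle(9)
# The recurrence and the closed form agree on every entry of Table A.1.
assert all(
    triangle[n][k] == closed_form(n, k)
    for n in range(9)
    for k in range(n + 1)
)
print(triangle[8])
```

The last entry of each row is the corresponding Catalan number, which is the sequence used in the remainder of the proof.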

Moreover, the sequence of numbers along the hypotenuse of the triangle is known as the
Catalan numbers [22, 83], $C_n$:

\[
C_n = C(n, n) = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{n!\,(n+1)!}, \quad n \ge 0
\]

The generating function of the Catalan numbers [89] is

\[
c(x) = \sum_{n=0}^{\infty} C_n x^n = \frac{1 - \sqrt{1 - 4x}}{2x}
\]

We can also calculate the generating function of another series Dn = n · Cn .

\[
d(x) = \sum_{n=0}^{\infty} D_n x^n = \sum_{n=0}^{\infty} n\, C_n x^n
     = x \sum_{n=0}^{\infty} C_n \frac{d x^n}{dx} = x \frac{dc(x)}{dx}
     = \frac{1 - 2x - \sqrt{1 - 4x}}{2x\sqrt{1 - 4x}}
\]

Furthermore, to prepare for the following proof, let x = pa (1 − pa ), 0 < pa < 0.5,

\[
c\big(p_a(1-p_a)\big) = \sum_{n=0}^{\infty} C_n\, p_a^n (1-p_a)^n
  = \frac{1 - \sqrt{1 - 4p_a(1-p_a)}}{2p_a(1-p_a)} = \frac{1}{1-p_a}
\tag{A.2}
\]

\[
d\big(p_a(1-p_a)\big) = \sum_{n=0}^{\infty} n\, C_n\, p_a^n (1-p_a)^n
  = \frac{1 - 2p_a(1-p_a) - \sqrt{1 - 4p_a(1-p_a)}}{2p_a(1-p_a)\sqrt{1 - 4p_a(1-p_a)}}
  = \frac{p_a}{(1-p_a)(1-2p_a)}
\tag{A.3}
\]
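
Both simplifications rely on $\sqrt{1 - 4p_a(1-p_a)} = 1 - 2p_a$, which holds because $0 < p_a < 0.5$. The identities can also be checked numerically by truncating the two series. The following Python sketch is only a sanity check under assumptions of our own choosing: the function name `catalan_series`, the truncation at 400 terms and the test value $p_a = 0.3$ are not part of the original proof.

```python
def catalan_series(p_a, terms=400):
    """Truncated sums of C_n x^n and n C_n x^n at x = p_a (1 - p_a)."""
    x = p_a * (1.0 - p_a)
    term = 1.0  # term = C_n x^n, starting with n = 0
    c_sum = 0.0
    d_sum = 0.0
    for n in range(terms):
        c_sum += term
        d_sum += n * term
        # Catalan recurrence: C_{n+1} = C_n * 2 (2n + 1) / (n + 2)
        term *= x * 2.0 * (2 * n + 1) / (n + 2)
    return c_sum, d_sum


p_a = 0.3
c_sum, d_sum = catalan_series(p_a)
print(c_sum, 1 / (1 - p_a))                      # both sides of (A.2)
print(d_sum, p_a / ((1 - p_a) * (1 - 2 * p_a)))  # both sides of (A.3)
```

The series converge because $p_a(1-p_a) < 1/4$ for $p_a < 0.5$, which is the radius of convergence of the Catalan generating function.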

A.2 Stationary Distribution

From the definitions of the transition probabilities (3.2) and (3.3), we write down the
following equations.


\[
\begin{cases}
\pi(l_a, 0) = p_a \cdot \pi(l_a - 1, 0), & l_a \ge 1, \\
\pi(l_a, l_h) = p_h \cdot \pi(l_a, l_h - 1), & l_h = l_a - 2,\ l_a \ge 3, \\
\pi(l_a, l_h) = p_a \cdot \pi(l_a - 1, l_h) + p_h \cdot \pi(l_a, l_h - 1), & l_a - 3 \ge l_h \ge 1,\ l_a \ge 4, \\
\pi(1, 1) = p_h \cdot \pi(1, 0), \\
\pi(0, 0) + \pi(1, 0) + \pi(1, 1) + \sum_{l_a=2}^{\infty} \sum_{l_h=0}^{l_a-2} \pi(l_a, l_h) = 1, \\
p_a + p_h = 1
\end{cases}
\tag{A.4}
\]

where $p_a$, $p_h$ and $l_a$, $l_h$ are the mining power and the branch length of the
attacker and the honest miners, respectively, and $\pi(l_a, l_h)$ is the stationary probability
of the state $(l_a, l_h)$.
We can observe that the coefficients of the stationary probabilities, except for $\pi(0, 0)$,
$\pi(1, 0)$ and $\pi(1, 1)$, form the number pattern of the Catalan triangle. More specifically,
$\pi(l_a, l_h) = C(l_a - 2, l_h) \cdot p_a^{l_a-2} \cdot p_h^{l_h} \cdot \pi(2, 0)$, $l_a - 2 \ge l_h \ge 0$, $l_a \ge 2$.


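Besides, let $S_n = \sum_{l_a=n}^{\infty} \pi(l_a, l_a - n)$, $n \ge 2$. Then, with (A.2), we have

\[
\begin{aligned}
S_2 &= \sum_{l_a=2}^{\infty} \pi(l_a, l_a - 2)
     = \sum_{l_a=2}^{\infty} C(l_a - 2, l_a - 2) \cdot p_a^{l_a-2} \cdot p_h^{l_a-2} \cdot \pi(2, 0) \\
    &= \sum_{l_a=2}^{\infty} C_{l_a-2} \cdot p_a^{l_a-2} \cdot (1 - p_a)^{l_a-2} \cdot \pi(2, 0) \\
    &= \frac{\pi(2, 0)}{p_h}
\end{aligned}
\]

\[
\begin{aligned}
S_2 &= \pi(2, 0) + \sum_{l_a=3}^{\infty} \pi(l_a, l_a - 2)
     = \pi(2, 0) + p_h \cdot \sum_{l_a=3}^{\infty} \pi(l_a, l_a - 3) \\
    &= p_h S_2 + p_h S_3
\end{aligned}
\]

\[
p_a S_2 = p_h S_3
\]

\[
\begin{aligned}
S_{n+1} &= \pi(n + 1, 0) + \sum_{l_a=n+2}^{\infty} \pi(l_a, l_a - n - 1) \\
        &= p_a \cdot \pi(n, 0) + p_a \cdot \sum_{l_a=n+1}^{\infty} \pi(l_a, l_a - n)
           + p_h \cdot \sum_{l_a=n+2}^{\infty} \pi(l_a, l_a - n - 2) \\
        &= p_a S_n + p_h S_{n+2}
\end{aligned}
\]

\[
p_h S_{n+2} - p_a S_{n+1} = p_h S_{n+1} - p_a S_n = \cdots = p_h S_3 - p_a S_2 = 0
\]

\[
S_n = \frac{p_a}{p_h} S_{n-1} = \left(\frac{p_a}{p_h}\right)^{2} S_{n-2} = \cdots
    = \left(\frac{p_a}{p_h}\right)^{n-2} S_2
\]

\[
\sum_{i=2}^{\infty} S_i = \sum_{i=2}^{\infty} \left(\frac{p_a}{p_h}\right)^{i-2} S_2
  = \frac{S_2}{1 - \frac{p_a}{p_h}} = \frac{\pi(2, 0)}{p_h} \cdot \frac{p_h}{p_h - p_a}
  = \frac{\pi(2, 0)}{1 - 2p_a}
\]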
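
The last identity can be checked numerically by summing the Catalan-triangle form of the stationary probabilities directly. The Python sketch below is a sanity check only; the function names, the truncation at $l_a = 300$ and the test value $p_a = 0.3$ are assumptions made for illustration.

```python
from math import comb


def catalan_triangle_entry(n, k):
    """Closed form of the Catalan triangle entry C(n, k)."""
    return comb(n + k, k) - (comb(n + k, k - 1) if k >= 1 else 0)


def tail_mass(p_a, max_la=300):
    """Truncated sum of pi(la, lh) / pi(2, 0) over la >= 2, 0 <= lh <= la - 2."""
    p_h = 1.0 - p_a
    total = 0.0
    for la in range(2, max_la + 1):
        for lh in range(la - 1):
            coeff = catalan_triangle_entry(la - 2, lh)
            total += coeff * p_a ** (la - 2) * p_h ** lh
    return total


p_a = 0.3
# The double sum should match 1 / (1 - 2 p_a), i.e. the sum of all S_i over pi(2, 0).
print(tail_mass(p_a), 1 / (1 - 2 * p_a))
```

Summing over all states with $l_a \ge 2$ is the same as summing $S_i$ over all $i \ge 2$, so the two printed values should agree up to truncation and rounding error.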

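Therefore, given (A.4), we can solve for the stationary distribution:

\[
\begin{aligned}
&\pi(0, 0) + \pi(1, 0) + \pi(1, 1) + \sum_{l_a=2}^{\infty} \sum_{l_h=0}^{l_a-2} \pi(l_a, l_h) \\
&\quad = \pi(0, 0) + p_a \pi(0, 0) + p_a (1 - p_a) \pi(0, 0) + \sum_{i=2}^{\infty} \sum_{l_a=i}^{\infty} \pi(l_a, l_a - i) \\
&\quad = (1 + p_a + p_a - p_a^2) \cdot \pi(0, 0) + \sum_{i=2}^{\infty} S_i \\
&\quad = \left(-p_a^2 + 2p_a + 1 + \frac{p_a^2}{1 - 2p_a}\right) \cdot \pi(0, 0) = 1
\end{aligned}
\]

\[
\pi(0, 0) = \left(-p_a^2 + 2p_a + 1 + \frac{p_a^2}{1 - 2p_a}\right)^{-1}
          = \frac{1 - 2p_a}{2p_a^3 - 4p_a^2 + 1}
\]

\[
\pi(l_a, l_h) = p_a^{l_a} \cdot p_h^{l_h} \cdot \pi(0, 0)
  = \frac{1 - 2p_a}{2p_a^3 - 4p_a^2 + 1} \cdot p_a^{l_a} \cdot p_h^{l_h},
  \quad (l_a, l_h) \in \{(0, 0), (1, 0), (1, 1)\}
\]

\[
\pi(l_a, l_h) = C(l_a - 2, l_h) \cdot p_a^{l_a-2} \cdot p_h^{l_h} \cdot \pi(2, 0)
  = \frac{1 - 2p_a}{2p_a^3 - 4p_a^2 + 1} \cdot C(l_a - 2, l_h) \cdot p_a^{l_a} \cdot p_h^{l_h},
  \quad (l_a, l_h) \in \{(l_a, l_h) \mid l_a \ge 2,\ l_a \ge l_h + 2\}
\]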

A.3 Rewards

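Given the sums of the Catalan series (A.2) and (A.3), we first calculate the absolute reward
of each player.

\[
\begin{aligned}
R_a &= 2 \cdot p_a \cdot \pi(1, 1) + 1 \cdot \gamma (1 - p_a) \cdot \pi(1, 1)
       + \sum_{l_a=2}^{\infty} l_a \cdot (1 - p_a) \cdot \pi(l_a, l_a - 2) \\
    &= \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \cdot \gamma
       + \frac{2p_a^2 (1 - p_a)(1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \\
    &\quad + \left[\sum_{l_a=2}^{\infty} (l_a - 2) \cdot C_{l_a-2} \cdot p_a^{l_a-2} \cdot p_h^{l_a-2}
       + 2 \sum_{l_a=2}^{\infty} C_{l_a-2} \cdot p_a^{l_a-2} \cdot p_h^{l_a-2}\right]
       \cdot \frac{p_a^2 (1 - p_a)(1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \\
    &= \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \cdot \gamma
       + \frac{2p_a^2 (1 - p_a)(1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \\
    &\quad + \left[\frac{p_a}{(1 - p_a)(1 - 2p_a)} + \frac{2}{1 - p_a}\right]
       \cdot \frac{p_a^2 (1 - p_a)(1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \\
    &= \frac{4p_a^4 - 9p_a^3 + 4p_a^2}{2p_a^3 - 4p_a^2 + 1}
       + \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \cdot \gamma
\end{aligned}
\]

\[
\begin{aligned}
R_h &= 1 \cdot (1 - p_a) \cdot \pi(0, 0) + 2 \cdot (1 - \gamma)(1 - p_a) \cdot \pi(1, 1)
       + 1 \cdot \gamma (1 - p_a) \cdot \pi(1, 1) \\
    &= \frac{(1 - p_a)(1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1}
       + 2 \cdot \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1}
       - \gamma \cdot \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \\
    &= \frac{-4p_a^4 + 10p_a^3 - 6p_a^2 - p_a + 1}{2p_a^3 - 4p_a^2 + 1}
       - \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \cdot \gamma
\end{aligned}
\]

Therefore, the relative reward of the attacker from the two-player selfish mining game is

\[
\begin{aligned}
r_a^{SM} &= \frac{R_a}{R_a + R_h} \\
  &= \left[\frac{4p_a^4 - 9p_a^3 + 4p_a^2}{2p_a^3 - 4p_a^2 + 1}
     + \frac{p_a (1 - p_a)^2 (1 - 2p_a)}{2p_a^3 - 4p_a^2 + 1} \cdot \gamma\right]
     \Bigg/ \frac{p_a^3 - 2p_a^2 - p_a + 1}{2p_a^3 - 4p_a^2 + 1} \\
  &= \frac{p_a (4p_a^2 - 9p_a + 4) + (1 - p_a)^2 (1 - 2p_a) \cdot \gamma}{p_a^3 - 2p_a^2 - p_a + 1} \cdot p_a
\end{aligned}
\]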
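
The closed form can be evaluated directly to see where the attacker breaks even. The Python sketch below is illustrative: the function name `relative_reward` is hypothetical, and the choice $\gamma = 1/2$ is an assumption made here because it is the value for which the known threshold expression $(1 - \gamma)/(3 - 2\gamma)$ of [34] equals the 25% quoted in this work.

```python
from fractions import Fraction


def relative_reward(p_a, gamma):
    """Relative reward of the attacker in the two-player selfish mining game."""
    numerator = p_a * (
        p_a * (4 * p_a**2 - 9 * p_a + 4)
        + (1 - p_a) ** 2 * (1 - 2 * p_a) * gamma
    )
    denominator = p_a**3 - 2 * p_a**2 - p_a + 1
    return numerator / denominator


half = Fraction(1, 2)
# With gamma = 1/2 the attacker breaks even at exactly 25% of the mining power.
assert relative_reward(Fraction(1, 4), half) == Fraction(1, 4)
# Slightly more power yields more than the fair share, slightly less yields less.
assert relative_reward(Fraction(26, 100), half) > Fraction(26, 100)
assert relative_reward(Fraction(24, 100), half) < Fraction(24, 100)
```

Exact fractions are used so that the break-even point can be tested with equality rather than with a floating-point tolerance.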

Bibliography

[1] Androulaki, E., Barger, A., Bortnikov, V., Cachin, C., Christidis, K.,
De Caro, A., Enyeart, D., Ferris, C., Laventman, G., Manevich, Y., et al.
Hyperledger fabric: a distributed operating system for permissioned blockchains. In
Proceedings of the thirteenth EuroSys conference (2018), pp. 1–15.

[2] Bag, S., Ruj, S., and Sakurai, K. Bitcoin block withholding attack: Analysis and
mitigation. IEEE Transactions on Information Forensics and Security 12, 8 (2016), 1967–1978.

[3] Bai, Q., Zhou, X., Wang, X., Xu, Y., Wang, X., and Kong, Q. A deep dive
into blockchain selfish mining. In ICC 2019-2019 IEEE International Conference on
Communications (ICC) (2019), IEEE, pp. 1–6.

[4] Beccuti, J., Jaag, C., et al. The bitcoin mining game: On the optimality of honesty
in proof-of-work consensus mechanism. Swiss Economics Working Paper 0060
(2017).

[5] Bentov, I., Gabizon, A., and Mizrahi, A. Cryptocurrencies without proof of
work. In International conference on financial cryptography and data security (2016),
Springer, pp. 142–157.

[6] Bitcoinblockhalf.com. Bitcoin block reward halving countdown, June 2020.
https://bitcoinblockhalf.com.

[7] bitcoin.it. Delegated proof of stake.
https://en.bitcoin.it/wiki/Delegated_proof_of_stake.

[8] Bonneau, J., Miller, A., Clark, J., Narayanan, A., Kroll, J. A., and Felten, E. W.
Sok: Research perspectives and challenges for bitcoin and cryptocurrencies.
In 2015 IEEE symposium on security and privacy (2015), IEEE, pp. 104–121.

[9] Boucherie, R. J., and van Dijk, N. M. Markov decision processes in practice,
vol. 248. Springer, 2017.

[10] Bravo-Marquez, F., Reeves, S., and Ugarte, M. Proof-of-learning: A
blockchain consensus mechanism based on machine learning competitions. In 2019
IEEE International Conference on Decentralized Applications and Infrastructures
(DAPPCON) (2019), IEEE, pp. 119–124.

[11] Btc.com. Pool stats. https://btc.com/stats/pool?pool_mode=year.

[12] Btc.com. Transaction fees. https://btc.com/stats/fee.

[13] Buterin, V. Bitcoin network shaken by blockchain fork, 2013.
https://bitcoinmagazine.com/articles/bitcoin-network-shaken-by-blockchain-fork-1363144448.

[14] Buterin, V. What is ethereum? Ethereum Official webpage. Available:
http://www.ethdocs.org/en/latest/introduction/what-is-ethereum.html (2016).

[15] Buterin, V., et al. Ethereum white paper: a next generation smart contract &
decentralized application platform. First version 53 (2014).

[16] Buterin, V., and Griffith, V. Casper the friendly finality gadget. arXiv preprint
arXiv:1710.09437 (2017).

[17] Cachin, C., et al. Architecture of the hyperledger blockchain fabric. In Workshop
on distributed cryptocurrencies and consensus ledgers (2016), vol. 310.

[18] Carlsten, M. The impact of transaction fees on bitcoin mining strategies. PhD thesis,
Princeton University, 2016.

[19] Castro, M., Liskov, B., et al. Practical byzantine fault tolerance. In OSDI (1999),
vol. 99, pp. 173–186.

[20] Chen, L., Xu, L., Shah, N., Gao, Z., Lu, Y., and Shi, W. On security analysis of
proof-of-elapsed-time (poet). In International Symposium on Stabilization, Safety, and
Security of Distributed Systems (2017), Springer, pp. 282–297.

[21] Chohan, U. W. The double spending problem and cryptocurrencies. Available at
SSRN 3090174 (2017).

[22] Chu, W. A new combinatorial interpretation for generalized catalan number.
Discrete mathematics 65, 1 (1987), 91–94.

[23] Conti, M., Kumar, E. S., Lal, C., and Ruj, S. A survey on security and privacy
issues of bitcoin. IEEE Communications Surveys & Tutorials 20, 4 (2018), 3416–3452.

[24] corda.net. Corda, open source blockchain platform for business.
https://www.corda.net/.

[25] Courtois, N. T., and Bahack, L. On subversive miner strategies and block
withholding attack in bitcoin digital currency. arXiv preprint arXiv:1402.1718 (2014).

[26] Decker, C., and Wattenhofer, R. Information propagation in the bitcoin network.
In IEEE P2P 2013 Proceedings (2013), IEEE, pp. 1–10.

[27] Dimitri, N. Bitcoin mining as a contest. Ledger 2 (2017), 31–37.

[28] Dinh, T. T. A., Liu, R., Zhang, M., Chen, G., Ooi, B. C., and Wang, J. Untangling
blockchain: A data processing view of blockchain systems. IEEE Transactions
on Knowledge and Data Engineering 30, 7 (2018), 1366–1385.

[29] Douceur, J. R. The sybil attack. In International workshop on peer-to-peer systems
(2002), Springer, pp. 251–260.

[30] EOS.io. Eos blockchain software architecture, Jun 2020. https://eos.io/.

[31] Ersoy, O., Ren, Z., Erkin, Z., and Lagendijk, R. L. Information propagation on
permissionless blockchains. arXiv preprint arXiv 1712 (2017).

[32] etherscan.io. Ethereum block time. https://etherscan.io/chart/blocktime.

[33] Eyal, I. The miner’s dilemma. In 2015 IEEE Symposium on Security and Privacy
(2015), IEEE, pp. 89–103.

[34] Eyal, I., and Sirer, E. G. Majority is not enough: Bitcoin mining is vulnerable.
In International conference on financial cryptography and data security (2014), Springer,
pp. 436–454.

[35] Garay, J., Kiayias, A., and Leonardos, N. The bitcoin backbone protocol with
chains of variable difficulty. In Annual International Cryptology Conference (2017),
Springer, pp. 291–323.

[36] Garay, J. A., Kiayias, A., and Panagiotakos, G. Proofs of work for blockchain
protocols. IACR Cryptol. ePrint Arch. 2017 (2017), 775.

[37] Gencer, A. E., Basu, S., Eyal, I., van Renesse, R., and Sirer, E. G. Decentralization
in bitcoin and ethereum networks. In International Conference on Financial
Cryptography and Data Security (2018), Springer, pp. 439–457.

[38] Gervais, A., Karame, G. O., Wüst, K., Glykantzis, V., Ritzdorf, H., and
Capkun, S. On the security and performance of proof of work blockchains. In
Proceedings of the 2016 ACM SIGSAC conference on computer and communications security
(2016), pp. 3–16.

[39] Gervais, A., Ritzdorf, H., Karame, G. O., and Capkun, S. Tampering with the
delivery of blocks and transactions in bitcoin. In Proceedings of the 22nd ACM SIGSAC
Conference on Computer and Communications Security (2015), pp. 692–705.

[40] Göbel, J., Keeler, H. P., Krzesinski, A. E., and Taylor, P. G. Bitcoin blockchain
dynamics: The selfish-mine strategy in the presence of propagation delay.
Performance Evaluation 104 (2016), 23–41.

[41] goquorum.com. Quorum. http://docs.goquorum.com/en/latest/.

[42] Grunspan, C., and Pérez-Marco, R. On profitability of selfish mining. arXiv
preprint arXiv:1805.08281 (2018).

[43] Grunspan, C., and Pérez-Marco, R. On profitability of stubborn mining. arXiv
preprint arXiv:1808.01041 (2018).

[44] Grunspan, C., and Pérez-Marco, R. On profitability of trailing mining. arXiv
preprint arXiv:1811.09322 (2018).

[45] Heilman, E. One weird trick to stop selfish miners: Fresh bitcoins, a solution for the
honest miner. In International Conference on Financial Cryptography and Data Security
(2014), Springer, pp. 161–162.

[46] Heilman, E., Kendler, A., Zohar, A., and Goldberg, S. Eclipse attacks on
bitcoin’s peer-to-peer network. In 24th USENIX Security Symposium (USENIX
Security 15) (2015), pp. 129–144.

[47] Hou, C., Zhou, M., Ji, Y., Daian, P., Tramer, F., Fanti, G., and Juels, A.
Squirrl: Automating attack discovery on blockchain incentive mechanisms with
deep reinforcement learning. arXiv preprint arXiv:1912.01798 (2019).

[48] Howard, R. A. Dynamic probabilistic systems: Markov models, vol. 1. Courier
Corporation, 2012.

[49] Jakobsson, M., and Juels, A. Proofs of work and bread pudding protocols. In
Secure information networks. Springer, 1999, pp. 258–272.

[50] Karame, G. O., Androulaki, E., and Capkun, S. Double-spending fast payments
in bitcoin. In Proceedings of the 2012 ACM conference on Computer and
communications security (2012), pp. 906–917.

[51] Kiayias, A., Koutsoupias, E., Kyropoulou, M., and Tselekounis, Y.
Blockchain mining games. In Proceedings of the 2016 ACM Conference on Economics
and Computation (2016), pp. 365–382.

[52] Kiayias, A., Russell, A., David, B., and Oliynykov, R. Ouroboros: A provably
secure proof-of-stake blockchain protocol. In Annual International Cryptology
Conference (2017), Springer, pp. 357–388.

[53] King, S., and Nadal, S. Ppcoin: Peer-to-peer crypto-currency with proof-of-stake.
self-published paper, August 19 (2012), 1.

[54] Kotow, E. What is ethereum’s uncle rate?, Dec 2019.
https://hedgetrade.com/what-is-ethereums-uncle-rate/.

[55] Kraft, D. Difficulty control for blockchain-based consensus systems. Peer-to-Peer
Networking and Applications 9, 2 (2016), 397–413.

[56] Larimer, D. Delegated proof-of-stake (dpos). Bitshare whitepaper (2014).

[57] Laszka, A., Johnson, B., and Grossklags, J. When bitcoin mining pools run
dry. In International Conference on Financial Cryptography and Data Security (2015),
Springer, pp. 63–77.

[58] Lee, S., and Kim, S. Countering block withholding attack efficiently. In IEEE
INFOCOM 2019-IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS) (2019), IEEE, pp. 330–335.

[59] Leelavimolsilp, T., Tran-Thanh, L., and Stein, S. On the preliminary
investigation of selfish mining strategy with multiple selfish miners. arXiv preprint
arXiv:1802.02218 (2018).

[60] Liu, H., Ruan, N., Du, R., and Jia, W. On the strategy and behavior of bitcoin
mining with n-attackers. In Proceedings of the 2018 on Asia Conference on Computer and
Communications Security (2018), pp. 357–368.

[61] Luu, L., Saha, R., Parameshwaran, I., Saxena, P., and Hobor, A. On power
splitting games in distributed computation: The case of bitcoin pooled mining. In
2015 IEEE 28th Computer Security Foundations Symposium (2015), IEEE, pp. 397–411.

[62] Merkle, R. C. A digital signature based on a conventional encryption function.
In Conference on the theory and application of cryptographic techniques (1987), Springer,
pp. 369–378.

[63] Miller, A., Litton, J., Pachulski, A., Gupta, N., Levin, D., Spring, N., and
Bhattacharjee, B. Discovering bitcoin’s public topology and influential nodes. et
al (2015).

[64] Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. Tech. rep., Manubot,
2019.

[65] Nash, J. F., et al. Equilibrium points in n-person games. Proceedings of the national
academy of sciences 36, 1 (1950), 48–49.

[66] Nayak, K., Kumar, S., Miller, A., and Shi, E. Stubborn mining: Generalizing
selfish mining and combining with an eclipse attack. In 2016 IEEE European Symposium
on Security and Privacy (EuroS&P) (2016), IEEE, pp. 305–320.

[67] Nisan, N., Roughgarden, T., Tardos, E., and Vazirani, V. V. Algorithmic
Game Theory. Cambridge University Press, 2007.

[68] Niu, J., and Feng, C. Selfish mining in ethereum. arXiv preprint arXiv:1901.04620
(2019).

[69] Pass, R., Seeman, L., and Shelat, A. Analysis of the blockchain protocol in
asynchronous networks. In Annual International Conference on the Theory and Applications
of Cryptographic Techniques (2017), Springer, pp. 643–673.

[70] Pass, R., and Shi, E. Hybrid consensus: Efficient consensus in the permissionless
model. In 31st International Symposium on Distributed Computing (DISC 2017) (2017),
Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.

[71] Pease, M., Shostak, R., and Lamport, L. Reaching agreement in the presence of
faults. Journal of the ACM (JACM) 27, 2 (1980), 228–234.

[72] Puterman, M. L. Markov decision processes: discrete stochastic dynamic programming.
John Wiley & Sons, 2014.

[73] ripple.com. Xrp. https://ripple.com/xrp/.

[74] Rocket, T. Snowflake to avalanche: A novel metastable consensus protocol family
for cryptocurrencies. Available [online].[Accessed: 4-12-2018] (2018).

[75] Rosenfeld, M. Analysis of bitcoin pooled mining reward systems. arXiv preprint
arXiv:1112.4980 (2011).

[76] Sapirshtein, A., Sompolinsky, Y., and Zohar, A. Optimal selfish mining strategies
in bitcoin. In International Conference on Financial Cryptography and Data Security
(2016), Springer, pp. 515–532.

[77] Schuh, F., and Larimer, D. Bitshares 2.0: general overview. accessed June-2017.[Online].
Available: http://docs.bitshares.org/downloads/bitshares-general.pdf (2017).

[78] Schwartz, D., Youngs, N., Britto, A., et al. The ripple protocol consensus
algorithm. Ripple Labs Inc White Paper 5, 8 (2014).

[79] Shapiro, L. W. A catalan triangle. Discrete Mathematics 14, 1 (1976), 83–90.

[80] Solat, S., and Potop-Butucaru, M. Zeroblock: Preventing selfish mining in
bitcoin. arXiv preprint arXiv:1605.02435 (2016).

[81] Sompolinsky, Y., and Zohar, A. Bitcoin’s security model revisited. arXiv preprint
arXiv:1605.09193 (2016).

[82] Soni, A., and Maheshwari, S. A survey of attacks on the bitcoin system. In 2018
IEEE International Students’ Conference on Electrical, Electronics and Computer Science
(SCEECS) (2018), IEEE, pp. 1–5.

[83] Stanley, R. P. Catalan numbers. Cambridge University Press, 2015.

[84] Tosh, D. K., Shetty, S., Liang, X., Kamhoua, C. A., Kwiat, K. A., and Njilla,
L. Security implications of blockchain cloud with analysis of block withholding
attack. In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid
Computing (CCGRID) (2017), IEEE, pp. 458–467.

[85] Tschorsch, F., and Scheuermann, B. Bitcoin and beyond: A technical survey
on decentralized digital currencies. IEEE Communications Surveys & Tutorials 18, 3
(2016), 2084–2123.

[86] Vujičić, D., Jagodić, D., and Ranđić, S. Blockchain technology, bitcoin, and
ethereum: A brief overview. In 2018 17th international symposium infoteh-jahorina
(infoteh) (2018), IEEE, pp. 1–6.

[87] Wang, T., Liew, S. C., and Zhang, S. When blockchain meets ai: Optimal mining
strategy achieved by machine learning. arXiv preprint arXiv:1911.12942 (2019).

[88] Wang, W., Hoang, D. T., Hu, P., Xiong, Z., Niyato, D., Wang, P., Wen, Y.,
and Kim, D. I. A survey on consensus mechanisms and mining strategy management
in blockchain networks. IEEE Access 7 (2019), 22328–22370.

[89] Weisstein, E. W. Catalan number. https://mathworld.wolfram.com/ (2002).

[90] wikipedia.org. Bitcoin, Jun 2020. https://en.wikipedia.org/wiki/Bitcoin.

[91] wikipedia.org. Proof of stake, May 2020.
https://en.wikipedia.org/wiki/Proof_of_stake.

[92] Wood, G., et al. Ethereum: A secure decentralised generalised transaction ledger.
Ethereum project yellow paper 151, 2014 (2014), 1–32.

[93] Wu, K., Peng, B., Xie, H., and Zhan, S. A coefficient of variation method to measure
the extents of decentralization for bitcoin and ethereum networks. IJ Network
Security 22, 2 (2020), 191–200.

[94] Zhang, R., and Preneel, B. Publish or perish: A backward-compatible defense
against selfish mining in bitcoin. In Cryptographers’ Track at the RSA Conference (2017),
Springer, pp. 277–292.

[95] Zhang, S., and Lee, J.-H. Analysis of the main consensus protocols of blockchain.
ICT Express (2019).
