
Pricing and Arbitrage in Cryptocurrency Markets

by
Neel Hajare
S.B., Massachusetts Institute of Technology (2013)
Submitted to the Department of Electrical Engineering and Computer
Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2018
Copyright 2018 Neel Hajare. All rights reserved.
The author hereby grants to MIT permission to reproduce and to
distribute publicly paper and electronic copies of this thesis document in
whole or in part in any medium now known or hereafter created.

Author
Department of Electrical Engineering and Computer Science
September 4, 2018

Certified by
Prof. Haoxiang Zhu, Thesis Supervisor
September 4, 2018

Accepted by
Katrina LaCurts, Chair, Masters of Engineering Thesis Committee
Pricing and Arbitrage in Cryptocurrency Markets

by

Neel Hajare

Submitted to the Department of Electrical Engineering and Computer Science
on September 4, 2018, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science

Abstract

Cryptocurrencies have garnered an increasing amount of attention and grown dramatically
in value. Like financial assets, they are traded continuously across a number of exchanges.
This thesis presents the design and implementation of a system for real-time collection of
pricing and trading activity data for four cryptocurrencies across three exchanges. Using
the data collected for the US Dollar to cryptocurrency order books from May 4, 2018 to
May 9, 2018, we find that arbitrage opportunities exist in 0.03% to 40.38% of five-second
intervals depending on the specific cryptocurrency and exchanges considered. Analysis of
the signed trading volume shows that trading behavior differs in the presence of these
arbitrage opportunities, but we find only weak evidence suggesting that market participants
actively exploit such opportunities on sub-minute timescales.

Thesis Supervisor: Prof. Haoxiang Zhu

Acknowledgments

I would like to thank the many people without whose help this thesis would not have been
possible.

First, I would like to express my sincerest gratitude to my thesis supervisor, Professor
Haoxiang Zhu, for advising me throughout this work. I truly appreciate his willingness to
let me explore and run with my ideas and his incredible patience and understanding as this
thesis came to fruition. His feedback and guidance were critical, and his encouragement
pushed me beyond what I thought I was capable of.

I am incredibly thankful for all the guidance and assistance I have received from Anne
Hunter throughout my MIT career. I would not have been able to complete either of
my degrees without her help. She has truly been instrumental in helping me realize my
dreams.

This thesis could not have been completed without the continual love and support from
those closest to me throughout this process. I will be forever grateful to Lisl Esherick,
Zach Zappala, Joseph Ong, and Belinda Gu for being there for me as I worked towards
this goal.

Key contributions from Jimmy Myatt and Colin McSwiggen were pivotal in getting me
unstuck and allowing me to continue progressing. Kind words of advice from Ilica Mahajan
and Ryder Moody offered comfort at moments when I felt most discouraged.

Finally, I would not be where I am today without my family. My parents and my sister
have provided so much for me throughout my life, and this thesis is a testament to their
effort as well as mine.

Contents

1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Outline
2 Background
2.1 Cryptocurrencies
2.1.1 Transaction time
2.1.2 Transaction cost
2.2 Cryptocurrency markets
2.2.1 Exchanges
2.2.2 Regulation
2.2.3 Arbitrage
2.3 API formats
2.3.1 FIX
2.3.2 REST over HTTP
2.3.3 WebSocket
3 Data Collection
3.1 Choosing currencies and exchanges
3.1.1 Currency pairs
3.1.2 Exchanges
3.2 Collecting ticker data
3.2.1 GDAX
3.2.2 Bitfinex
3.2.3 Bitstamp
3.3 Issue of recorded time
3.4 Ticker data evaluation and learnings
3.4.1 Variation in data stream frequency
3.4.2 Bitfinex update frequency mitigation strategy
3.4.3 Updated data stream frequency evaluation
3.4.4 Exchange latency evaluation
3.5 Collecting bid/ask data
3.5.1 GDAX
3.5.2 Bitfinex
3.5.3 Bitstamp
4 Trading Volume and Bid/Ask Spreads
4.1 Trading Volume
4.2 Bid/Ask Spreads
5 Arbitrage
5.1 Frequency of arbitrage opportunities
5.2 Arbitrage window length
5.3 Predicting direction of net volume based on price differences
5.4 Relationship between net volumes on exchanges
5.5 Predicting arbitrage window length based on trading volume
5.6 Predicting arbitrage opportunities based on trading volume
6 Future Work
6.1 Expansion to more currency pairs
6.2 Longer Sample Interval
6.3 Triangular arbitrage on a single exchange
6.4 Comparing arbitrage opportunities with exchange APIs
6.5 Comparing arbitrage opportunities with deposit and withdrawal friction
6.6 Comparing arbitrage opportunities with exchange consumer confidence
7 Conclusion
A Data pipeline code detail
A.1 Connection stability issues
A.1.1 GDAX
A.1.2 Bitfinex
A.1.3 Bitstamp
A.2 Bitfinex order book data processing

List of Figures

3-1 Preliminary GDAX BTC Update Time Deltas
3-2 Preliminary GDAX ETH Update Time Deltas
3-3 Preliminary GDAX LTC Update Time Deltas
3-4 Preliminary Bitfinex BTC Update Time Deltas
3-5 Preliminary Bitfinex ETH Update Time Deltas
3-6 Preliminary Bitfinex LTC Update Time Deltas
3-7 Preliminary Bitstamp BTC Update Time Deltas
3-8 Preliminary Bitstamp ETH Update Time Deltas
3-9 Preliminary Bitstamp LTC Update Time Deltas
3-10 Bitfinex Web Trading Interface
3-11 Final GDAX BTC Update Time Deltas
3-12 Final GDAX ETH Update Time Deltas
3-13 Final GDAX LTC Update Time Deltas
3-14 Final GDAX BCH Update Time Deltas
3-15 Final Bitfinex BTC Update Time Deltas
3-16 Final Bitfinex ETH Update Time Deltas
3-17 Final Bitfinex LTC Update Time Deltas
3-18 Final Bitfinex BCH Update Time Deltas
3-19 Final Bitstamp BTC Update Time Deltas
3-20 Final Bitstamp ETH Update Time Deltas
3-21 Final Bitstamp LTC Update Time Deltas
3-22 Final Bitstamp BCH Update Time Deltas
3-23 GDAX BTC Reported vs Recorded Time Deltas
3-24 GDAX ETH Reported vs Recorded Time Deltas
3-25 GDAX LTC Reported vs Recorded Time Deltas
3-26 GDAX BCH Reported vs Recorded Time Deltas
3-27 Bitfinex BTC Reported vs Recorded Time Deltas
3-28 Bitfinex ETH Reported vs Recorded Time Deltas
3-29 Bitfinex LTC Reported vs Recorded Time Deltas
3-30 Bitfinex BCH Reported vs Recorded Time Deltas
3-31 Bitstamp BTC Reported vs Recorded Time Deltas
3-32 Bitstamp ETH Reported vs Recorded Time Deltas
3-33 Bitstamp LTC Reported vs Recorded Time Deltas
3-34 Bitstamp BCH Reported vs Recorded Time Deltas
4-1 GDAX BTC Bid/Ask Spread
4-2 GDAX ETH Bid/Ask Spread
4-3 GDAX LTC Bid/Ask Spread
4-4 GDAX BCH Bid/Ask Spread
4-5 Bitfinex BTC Bid/Ask Spread
4-6 Bitfinex ETH Bid/Ask Spread
4-7 Bitfinex LTC Bid/Ask Spread
4-8 Bitfinex BCH Bid/Ask Spread
4-9 Bitstamp BTC Bid/Ask Spread
4-10 Bitstamp ETH Bid/Ask Spread
4-11 Bitstamp LTC Bid/Ask Spread
4-12 Bitstamp BCH Bid/Ask Spread

List of Tables

2.1 GDAX Trading Fee Structure [31]
2.2 Bitfinex Trading Fee Structure [29]
2.3 Bitstamp Trading Fee Structure [42]
3.1 Bitfinex Ticker API Message Format
3.2 Bitstamp Ticker API Request Format
3.3 Bitstamp Ticker API Message Format
3.4 Bitfinex Trades API Message Format
3.5 Bitfinex Order Book API Message Format
3.6 Bitstamp Live Order Book API Request Format
3.7 Bitstamp Live Order Book API Message Format
4.1 Average daily trading volume by exchange and currency pair
4.2 GDAX BTC Bid/Ask Spread Data
4.3 GDAX ETH Bid/Ask Spread Data
4.4 GDAX LTC Bid/Ask Spread Data
4.5 GDAX BCH Bid/Ask Spread Data
4.6 Bitfinex BTC Bid/Ask Spread Data
4.7 Bitfinex ETH Bid/Ask Spread Data
4.8 Bitfinex LTC Bid/Ask Spread Data
4.9 Bitfinex BCH Bid/Ask Spread Data
4.10 Bitstamp BTC Bid/Ask Spread Data
4.11 Bitstamp ETH Bid/Ask Spread Data
4.12 Bitstamp LTC Bid/Ask Spread Data
4.13 Bitstamp BCH Bid/Ask Spread Data
5.1 Percentage of 5-second intervals with arbitrageable price gap
5.2 Mean arbitrage window length by trading pair
5.3 Regression results using 5-second intervals
5.4 Regression results using 5-second intervals, indifferent to arbitrage direction
5.5 Regression results using 5-second intervals, excluding samples without arbitrage pricing
5.6 Regression results using 5-second intervals, excluding samples without arbitrage pricing net of the minimum exchange fees
5.7 Regression results using 5-second intervals, excluding samples without arbitrage pricing net of the maximum exchange fees
5.8 Correlations between Vi and Vj for BTC
5.9 Correlations between Vi and Vj for ETH
5.10 Correlations between Vi and Vj for LTC
5.11 Correlations between Vi and Vj for BCH
5.12 Correlations between Vi and Vj for BTC, excluding Bitfinex
5.13 Correlations between Vi and Vj for ETH, excluding Bitfinex
5.14 Correlations between Vi and Vj for LTC, excluding Bitfinex
5.15 Correlations between Vi and Vj for BCH, excluding Bitfinex
5.16 Regression of window length against trading volume
5.17 Regression results for arbitrage window based on trading volume for BTC
5.18 Regression results for arbitrage window based on trading volume for ETH
5.19 Regression results for arbitrage window based on trading volume for LTC
5.20 Regression results for arbitrage window based on trading volume for BCH

List of Listings

3.1 Specification for request for GDAX ticker update subscription
3.2 Specification for response for GDAX ticker update subscription
3.3 Specification for GDAX ticker update message
3.4 JavaScript code for reading from GDAX ticker update stream
3.5 Specification for request and response for Bitfinex ticker update subscription
3.6 Specifications for Bitfinex ticker messages
3.7 JavaScript code for reading from Bitfinex ticker update stream
3.8 JavaScript code for reading from Bitstamp ticker update stream
3.9 Specification for request and response for Bitfinex ticker update subscription
3.10 Specifications for Bitfinex trades messages
3.11 JavaScript code for reading from Bitfinex trades update stream
3.12 Specification for request for GDAX level2 update subscription
3.13 Specification for response for GDAX level2 update subscription
3.14 Specification for GDAX level2 snapshot message
3.15 Specification for GDAX level2 update message
3.16 JavaScript code for reading from GDAX level2 update stream
3.17 Specification for request and response for Bitfinex order books update subscription
3.18 Specifications for Bitfinex order books messages
3.19 JavaScript code for reading from Bitfinex order books update stream
3.20 Python code to process Bitfinex Order Book updates
3.21 JavaScript code for reading from Bitstamp ticker update stream
A.1 JavaScript code for reading from GDAX ticker update stream with logging
A.2 Final JavaScript code for reading from GDAX ticker update stream
A.3 JavaScript code for reading from Bitfinex ticker update stream with logging
A.4 Final JavaScript code for reading from Bitfinex ticker update stream
A.5 Python code to process Bitfinex Order Book updates
A.6 Updated Python code to process Bitfinex Order Book updates
A.7 Python code to pre-process time deltas between Bitfinex Order Book updates

Chapter 1

Introduction

During 2017, cryptocurrencies moved from the technology fringe to the mainstream.
Near the start of the year, Bitcoin, the oldest and best-known cryptocurrency, surpassed
its previous all-time high price of $1,000 (last reached in 2013) and continued to rise
throughout the year to nearly $20,000 in December [15]. This price implied a total market
cap of over $300 billion [26]. Traditional television news outlets began covering price
movements daily and included segments explaining Bitcoin and other cryptocurrencies to
viewers [34]. During the course of the year, Coinbase, a leading cryptocurrency broker in
the United States, grew to have more customers than Charles Schwab [12]. Wall Street
financial firms went from publicly deriding cryptocurrencies to scrambling to create new
desks to trade them [61]. Bitcoin futures began trading on the Chicago Mercantile
Exchange [11].

1.1 Motivation

Much has been published about the theoretical underpinnings of cryptocurrencies, their
implementations, and analyses of their network activities and transaction histories. As of
the start of this work, however, relatively little had been said about their pricing and
the functioning of crypto-to-fiat markets [47, 19, 55, 54]. Cryptocurrencies have seen
different prices across exchanges in the past, but it might have been expected that through
2017, as more infrastructure was developed, exchanges became more established, and major
financial institutions entered the space, prices would converge. At the end of 2017,
however, there was more media coverage than ever about price disparities across
exchanges [27, 20].

The price premium for Bitcoin in certain countries garnered a lot of attention. In India,
Zimbabwe, and South Korea, premiums reached more than 20% over prices in the United
States [16, 8, 67]. In those cases, it seemed that the premiums were caused by local
monetary policy issues and government regulation.

Conventional wisdom might suggest that such price differences would not exist on
US-dollar-denominated exchanges that have fully automated trading. While exchanges make
prices publicly available, there is no freely available dataset of fine-grained prices
across exchanges with which to investigate this, nor is there any publicly available tool
to collect this data.

Beyond rising popularity and rapidly developing infrastructure, another theme for
cryptocurrencies in 2017 was the rise of “altcoins,” cryptocurrencies other than Bitcoin.
Even though other cryptocurrencies have existed for years, at the start of 2017 Bitcoin
comprised over 87% of the total value of all cryptocurrencies; by the end of the year it
was only 37% [36]. One of the motivations driving adoption of altcoins was concern
about the scalability of the Bitcoin technology and network. This concern became
evident in December of 2017, when average Bitcoin transaction fees topped $28 and some
transactions took days to complete [39, 56]. Core selling points of many of the alternative
cryptocurrencies are technological differences from Bitcoin that promise faster and/or
cheaper transactions. In practice, these other cryptocurrencies do have faster and
cheaper transactions, and these properties make them more suitable as media of exchange
than Bitcoin. For example, coffee shops that have stopped taking Bitcoin due to transaction
friction are readily accepting other cryptocurrency alternatives. However, there is no
information as to how these properties affect the exchange prices and markets for these
cryptocurrencies.

To the best of our knowledge, the only prior paper that studies arbitrage in cryptocurrency
markets is [45]. Though its authors are also interested in arbitrage opportunities across
cryptocurrency exchanges, our approaches differ. At the highest level, we are primarily
interested in the differences between cryptocurrencies and how they differ in trading
patterns and arbitrageability, and we make decisions throughout this work with that intent
in mind. By comparison, they take a broader view of the markets they survey and examine
how a range of factors, such as foreign currency capital controls, play a role in arbitrage
opportunities in these markets.

Their work examines a larger number of exchanges with more variety in trading environments
over a longer period of time. The cryptocurrencies they look at, too, vary on several
dimensions. Their dataset is obtained from a third-party firm and comes with second-level
resolution for timestamps, apparently provided by each exchange independently. The
arbitrage index they use is computed on a per-minute basis, comparing the highest price
across all exchanges in a minute with the lowest price available during that minute. The
order book data they use is also snapshotted once per minute. From this, they aggregate
to multi-minute or even day-long timescales and examine patterns at these frequencies.

Because of our primary interest in differences between cryptocurrencies, we limit our
focus to markets we presume to be the most likely to be free of exogenous barriers to
arbitrage. We include only cryptocurrencies that are open, well-understood, and not
largely controlled by a single entity with significant market power. We examine only the
biggest and most liquid exchanges for US-dollar-denominated markets and limit our scope
to exchanges that both allow US Dollar deposits from US-based investors and provide
persistent-connection real-time streaming data feeds and automated trading. Accordingly,
because of the facilities for automated trading on these exchanges, we aim to collect data
with as high a resolution as possible, specifically millisecond-level timestamps. To do
this, we design and implement an original system for collecting and aggregating
this data directly from the exchanges. We can then be confident in our ability to examine
patterns in market activity for time intervals on the order of seconds. Furthermore,
throughout our work, we account for the presence of exchange fees and their role as a
barrier to arbitrage.

1.2 Contributions

In this work we present the design and implementation of a system for collecting real-time
pricing data from multiple cryptocurrency exchanges, techniques for offline processing of
these raw data streams into a normalized time-series format suitable for analysis, and
finally, the results of statistical analysis of pricing and trading activity of various
cryptocurrencies across exchanges.

We detail our implementation, which uses WebSocket APIs to stream trades and order book
changes from three of the largest cryptocurrency exchanges (Bitfinex, GDAX¹, and Bitstamp)
for the US-dollar-denominated markets of four of the most popular cryptocurrencies:
Bitcoin, Ethereum, Litecoin, and Bitcoin Cash. We show that, at peak periods, we may see
hundreds of updates per second for a single trading pair on an exchange. We observe that
latency varies by exchange from tens of milliseconds to over a second.

¹ GDAX has since been renamed to Coinbase Pro, but since it was still called GDAX when the
data was collected, we will refer to it as such.

We detail the development of processing techniques for transforming these raw exchange
data streams into a format suitable for determining trades during arbitrary intervals and
best bid/ask price points at arbitrary times for each exchange. We document the behavior
observed from the exchange APIs and the heuristics created to compensate for invalid
data. We explain performance optimization techniques that speed up this process.
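
The two queries named above (the best bid/ask in force at an arbitrary time, and the trades within an arbitrary interval) can be illustrated with a minimal Python sketch. It assumes the processed data is already sorted by millisecond timestamp; the names `BestQuoteSeries`, `quote_at`, and `trades_between` and the tuple layouts are hypothetical and are not taken from the pipeline described in this thesis.

```python
from bisect import bisect_left, bisect_right


class BestQuoteSeries:
    """Best bid/ask snapshots ordered by millisecond timestamp (hypothetical layout)."""

    def __init__(self, snapshots):
        # snapshots: iterable of (timestamp_ms, best_bid, best_ask)
        self._rows = sorted(snapshots)
        self._times = [row[0] for row in self._rows]

    def quote_at(self, t_ms):
        """Return the (best_bid, best_ask) in force at t_ms, or None before the first snapshot."""
        i = bisect_right(self._times, t_ms)
        if i == 0:
            return None
        _, bid, ask = self._rows[i - 1]
        return bid, ask


def trades_between(trades, start_ms, end_ms):
    """Return trades with start_ms <= timestamp < end_ms.

    trades: list of (timestamp_ms, price, signed_size), sorted by timestamp.
    """
    times = [trade[0] for trade in trades]
    return trades[bisect_left(times, start_ms):bisect_left(times, end_ms)]
```

Binary search keeps each lookup logarithmic in the number of updates, which matters when a single trading pair can produce hundreds of updates per second.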

Using our processed data, we show that average daily trading volume varies from less than
$6.5 million to over $250 million depending on the exchange and cryptocurrency examined.
For a given cryptocurrency on a given exchange, we show that bid/ask spreads vary from
$0.01 to over $10.

Looking across exchanges, we show that arbitrage opportunities, net of the highest possible
exchange fees, exist in 0.35% to 40.38% of 5-second intervals depending on the
cryptocurrency and exchanges considered. When we exclude Bitfinex, we find that this range
narrows to 0.35% to 1.03%. Moreover, we show that cryptocurrencies with higher market caps
and trading volume exhibit lower frequencies of arbitrage opportunities and that these
opportunities, when they do arise, last for shorter periods of time. We further explore
several approaches for finding relationships between trading volume and arbitrage
opportunities across exchanges. We find that trading behavior across exchanges differs
greatly during these periods, but, using logistic regression techniques, we find only weak
evidence linking these price differences to arbitrage-exploitative trading behavior in the
subsequent 5-second intervals.
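
To make "arbitrage opportunity net of fees" concrete, the following Python sketch shows one plausible way to flag an interval. It assumes an opportunity exists when buying at the best ask on one exchange and selling at the best bid on the other is profitable after proportional fees on both legs; this is an illustrative assumption, not necessarily the exact definition used in Chapter 5, and the function names are hypothetical.

```python
def arbitrage_margin(ask_buy, bid_sell, fee_buy=0.0, fee_sell=0.0):
    """Per-unit profit from buying at ask_buy and selling at bid_sell.

    Fees are fractions of notional (0.002 means 0.2%) charged on each leg.
    """
    cost = ask_buy * (1.0 + fee_buy)
    proceeds = bid_sell * (1.0 - fee_sell)
    return proceeds - cost


def has_arbitrage(quote_a, quote_b, fee_a=0.0, fee_b=0.0):
    """Each quote is (best_bid, best_ask). True if either direction is profitable."""
    bid_a, ask_a = quote_a
    bid_b, ask_b = quote_b
    return (arbitrage_margin(ask_a, bid_b, fee_a, fee_b) > 0
            or arbitrage_margin(ask_b, bid_a, fee_b, fee_a) > 0)


def fraction_with_arbitrage(quotes_a, quotes_b, fee_a=0.0, fee_b=0.0):
    """Share of aligned intervals (for example, 5-second samples) with an opportunity."""
    flags = [has_arbitrage(qa, qb, fee_a, fee_b) for qa, qb in zip(quotes_a, quotes_b)]
    return sum(flags) / len(flags) if flags else 0.0
```

Under this sketch, a gap of a few dollars on a price near $9,000 counts as an opportunity with zero fees but disappears once a 0.2% fee is charged on each leg, which is why the fee assumption strongly affects the reported frequencies.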

1.3 Outline

Chapter 2 provides brief background on cryptocurrencies, cryptocurrency markets, and the
types of APIs offered by cryptocurrency exchanges. Chapter 3 provides details about data
collection and processing. Chapter 4 presents the aggregation of collected trading volume
and bid/ask data. Chapter 5 presents the analysis of the data for evidence of arbitrage
opportunities and corresponding trading behaviors. Chapter 6 discusses potential topics
for further research. Chapter 7 concludes.

Chapter 2

Background

2.1 Cryptocurrencies

The first cryptocurrency, Bitcoin, was introduced in 2009 [47]. Since then, many more
variants have been introduced. While cryptocurrencies vary in specific properties and
details, they share a core group of characteristics. Cryptocurrencies are software-defined
currencies. Mechanisms for ownership, transactions, and varying money supply are all defined
in software. The state of the entire system (distribution of currency amongst owners, total
amount of currency available, transaction history, etc.) is known as a “blockchain” and is
altered through distributed consensus protocols that progressively add more “blocks” to
the blockchain [66, 62]. Security (such as the prevention of arbitrary currency creation or
double-spending) is provided by cryptographic methods. There are many more technical
details and subtleties, but we provide brief explanations only of the concepts that bear on
the similarity of cryptocurrencies to tradeable financial assets.

2.1.1 Transaction time

For a cryptocurrency transaction to occur, the state of the system must be updated to
reflect the transaction. A candidate transaction is broadcast to nodes in the network. It
then may be included in a future block. Creating a block is a computationally difficult
task that is designed to require a certain amount of time in expectation (this amount of
time, however, varies between different cryptocurrencies). Even then, because of the
distributed consensus nature of cryptocurrencies, if consensus cannot be reached to include
that block in the blockchain (history of blocks), no record of the transaction will exist.
Disputes are resolved by selecting the longest chain. It is therefore customary not to trust
a transaction until a certain number of blocks have been created on top of the block
containing the transaction. As this number grows, the probability of the transaction being
removed from history becomes vanishingly small [48]. Therefore, minimum cryptocurrency
transaction times are governed by the “block time” and the number of additional block
confirmations required by the parties to trust the permanence of the transaction.
Additional time beyond the minimum transaction time may be required if the candidate
transaction is not immediately chosen for inclusion in the next block. This is a possibility
because the number of transactions that can be included in a block is finite; if more than
this number of transactions are broadcast for inclusion, some selection must occur.
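
The relationship between block time, confirmations, and trust can be sketched numerically. The Python below is an illustration under stated assumptions, not part of the thesis pipeline: `expected_settlement_seconds` assumes the transaction is included in the very next block and that blocks arrive with the given mean spacing, and `reversal_probability` follows the attacker catch-up calculation from the Bitcoin white paper, used here only as an illustrative model (it is not necessarily the analysis in [48]). Both function names are hypothetical.

```python
from math import exp


def expected_settlement_seconds(block_time_s, extra_confirmations):
    """Expected wait until a transaction has the required confirmations.

    One block time (in expectation) for inclusion, plus one block time for
    each additional confirmation built on top of the including block.
    """
    return block_time_s * (1 + extra_confirmations)


def reversal_probability(q, z):
    """Probability that an attacker with hash-power share q (q < 0.5) ever
    overtakes the honest chain when starting z blocks behind."""
    p = 1.0 - q
    lam = z * q / p
    remaining = 1.0
    poisson = exp(-lam)  # Poisson(lam) probability mass at k = 0
    for k in range(z + 1):
        if k > 0:
            poisson *= lam / k  # advance to the probability mass at k
        remaining -= poisson * (1.0 - (q / p) ** (z - k))
    return remaining
```

With Bitcoin's 600-second target block time and five additional confirmations, `expected_settlement_seconds(600, 5)` evaluates to 3600 seconds, while `reversal_probability` falls rapidly as `z` grows, which is the sense in which removal from history becomes vanishingly unlikely.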

2.1.2 Transaction cost

Since a transaction can only occur through inclusion in a block, and finding a block is
computationally difficult, a fee must generally be paid to incentivize a miner to include
a transaction in a block. When there are more candidate transactions than can be included
in a block, transactions offering the highest fees are usually included first, since most
miners are profit-maximizing and exhibit greedy behavior. It is therefore generally
possible to ensure that a transaction is included in the next block by offering a
sufficiently high fee. Since this is effectively an auction, the fee required depends on
the market conditions for a transaction for a particular cryptocurrency at a particular
point in time.
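
The auction described above can be sketched as a greedy selection. The Python below is a deliberate simplification: it treats block capacity as a fixed number of transaction slots and ranks candidates by total fee, whereas Bitcoin miners typically rank by fee per unit of transaction size under a block size limit. The function names and the `(txid, fee)` layout are hypothetical.

```python
def select_for_block(candidates, capacity):
    """Greedy miner policy: take the `capacity` highest-fee candidates.

    candidates: list of (txid, fee) pairs; capacity: number of slots (>= 1).
    """
    ranked = sorted(candidates, key=lambda tx: tx[1], reverse=True)
    return [txid for txid, _ in ranked[:capacity]]


def fee_to_beat(candidates, capacity):
    """Fee a new transaction must exceed to enter the next block.

    Returns 0.0 when the block still has spare room; assumes capacity >= 1.
    """
    if len(candidates) < capacity:
        return 0.0
    fees = sorted((fee for _, fee in candidates), reverse=True)
    return fees[capacity - 1]
```

When demand is below capacity any fee suffices, but once candidates outnumber slots the clearing fee rises with the bids of competing transactions, consistent with the fee spikes observed during periods of congestion.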

2.2 Cryptocurrency markets

2.2.1 Exchanges

The cryptocurrency markets we explore in this work are exchanges. There are, of course,
other options for buying and selling cryptocurrencies. Some companies offer opaque price
quotes for currency exchange, and it is also possible to engage in one-off private sales as
in OTC markets. Exchanges, however, offer much more information, and they most closely
mirror the markets for the other financial products we wish to compare against, so that is
where our efforts are focused.

We consider exchanges that serve as matching platforms, allowing users to deposit both

cryptocurrencies and fiat currencies (such as US dollars) and place orders to trade cer-

tain currencies against one another. These exchanges hold users’ currencies in exchange

accounts and match buyers with sellers. They typically generate revenue by charging fees

for each match. The fee structure varies by exchange. Exchanges may also make money by

earning interest on fiat deposits or by charging other fees, such as deposit or withdrawal

fees that vary by payment method (for example, with credit card fees often being the high-

est).

30-Day Trading Volume        Maker Fee   Taker Fee
$0-$10,000,000               0%          0.30%
$10,000,000-$100,000,000     0%          0.20%
$100,000,000+                0%          0.10%

Table 2.1: GDAX Trading Fee Structure [31]

30-Day Trading Volume              Maker Fee   Taker Fee
$0-$500,000                        0.100%      0.200%
$500,000-$1,000,000                0.080%      0.200%
$1,000,000-$2,500,000              0.060%      0.200%
$2,500,000-$5,000,000              0.040%      0.200%
$5,000,000-$7,500,000              0.020%      0.200%
$7,500,000-$10,000,000             0.000%      0.200%
$10,000,000-$15,000,000            0.000%      0.180%
$15,000,000-$20,000,000            0.000%      0.160%
$20,000,000-$25,000,000            0.000%      0.140%
$25,000,000-$30,000,000            0.000%      0.120%
$30,000,000-$300,000,000           0.000%      0.100%
$300,000,000-$1,000,000,000        0.000%      0.090%
$1,000,000,000-$3,000,000,000      0.000%      0.085%
$3,000,000,000-$10,000,000,000     0.000%      0.075%
$10,000,000,000-$30,000,000,000    0.000%      0.060%
$30,000,000,000+                   0.000%      0.055%

Table 2.2: Bitfinex Trading Fee Structure [29]

30-Day Trading Volume       Maker Fee   Taker Fee
$0-$500,000                 0.25%       0.25%
$500,000-$1,000,000         0.24%       0.24%
$1,000,000-$2,500,000       0.22%       0.22%
$2,500,000-$5,000,000       0.20%       0.20%
$5,000,000-$7,500,000       0.15%       0.15%
$7,500,000-$10,000,000      0.14%       0.14%
$10,000,000-$15,000,000     0.13%       0.13%
$15,000,000-$20,000,000     0.12%       0.12%
$4,000,000-$20,000,000      0.11%       0.11%
$20,000,000+                0.10%       0.10%

Table 2.3: Bitstamp Trading Fee Structure [42]

Cryptocurrency exchanges typically trade 24 hours per day, 7 days per week. Because

each exchange has its own order book, prices for currencies may vary across exchanges.

Exchanges are operated independently by private companies, and each exchange may

offer different trading pairs, fee structures, and order types. Market, limit, maker-only,

immediate-or-cancel, and fill-or-kill order types are all relatively common.
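A volume-tiered fee schedule such as those in Tables 2.1 to 2.3 behaves like a step function of 30-day trading volume. The sketch below encodes only the first seven rows of Table 2.2 (Bitfinex) and looks up the applicable rates. It assumes that the lower bound of each tier is inclusive, which the tables do not state, and the function and field names are hypothetical.

```javascript
// First seven tiers of Table 2.2, sorted by ascending minimum volume.
const BITFINEX_TIERS = [
  { minVolumeUsd: 0,        makerPct: 0.100, takerPct: 0.200 },
  { minVolumeUsd: 500000,   makerPct: 0.080, takerPct: 0.200 },
  { minVolumeUsd: 1000000,  makerPct: 0.060, takerPct: 0.200 },
  { minVolumeUsd: 2500000,  makerPct: 0.040, takerPct: 0.200 },
  { minVolumeUsd: 5000000,  makerPct: 0.020, takerPct: 0.200 },
  { minVolumeUsd: 7500000,  makerPct: 0.000, takerPct: 0.200 },
  { minVolumeUsd: 10000000, makerPct: 0.000, takerPct: 0.180 }
];

// Return the last tier whose minimum volume does not exceed volumeUsd.
// Assumes tiers are sorted ascending and lower bounds are inclusive.
function feeTier(tiers, volumeUsd) {
  let match = tiers[0];
  for (const tier of tiers) {
    if (volumeUsd >= tier.minVolumeUsd) {
      match = tier;
    }
  }
  return match;
}

// Fee in USD for a trade of the given notional value.
function tradeFeeUsd(tiers, volumeUsd, notionalUsd, isMaker) {
  const tier = feeTier(tiers, volumeUsd);
  const pct = isMaker ? tier.makerPct : tier.takerPct;
  return notionalUsd * pct / 100;
}

console.log(feeTier(BITFINEX_TIERS, 750000));
console.log(tradeFeeUsd(BITFINEX_TIERS, 0, 10000, false));
```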

2.2.2 Regulation

Cryptocurrencies are relatively new, and the industries developing around them are even

newer. As the idea itself is so new, there are often situations in which it is unclear what, if

any, laws or regulations apply. The result, so far, has been a rapidly changing environment

that is quite different from more mature, stable, and regulated financial markets, such as

those for foreign exchange or equities.

Many exchanges allow anyone to create an account, and there is no formalized regulated

process like that for opening a brokerage account. Similarly, on the other side, there is lit-

tle stopping someone from creating a new exchange. As one might expect, this area has

been rife with fraud and theft [63, 13, 51, 21, 17]. Exchanges have been hacked and de-

posits have vanished [46, 38]. Some have been found to be insolvent, creating a run on

what deposits were left [14, 25]. Others have been shut down by governments [9, 49, 52].

2.2.3 Arbitrage

The cost of exploiting an arbitrage opportunity across cryptocurrency exchanges differs
from that in other financial markets. An arbitrageur would need to have accounts on multiple
exchanges and, in the cases we examine, US Dollar and cryptocurrency deposits in each
account. This exposes the arbitrageur to the risk of holding cryptocurrencies, which are
very volatile, and to the risk of having uninsured exchange deposits. In executing trades,
arbitrageurs incur exchange fees based on the exchange and their 30-day trading volume.
They would then need to be able to move US Dollars and cryptocurrencies between their
accounts on the different exchanges. Moving US Dollars would require involving a bank
as an intermediary and would subject them to the fees and processing times of both exchanges
as well as the bank. Transferring cryptocurrencies from one exchange to another can be done
directly, but it requires a variable and inconsistent amount of time and a variable fee
depending on the cryptocurrency and the state of the network when the transfer occurs.
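To illustrate how these costs erode an apparent price gap, the following sketch computes the net profit of a single cross-exchange trade. It is a simplified model, assuming proportional trading fees on both legs and one flat transfer cost. It ignores price movement during the transfer and the risk of holding deposits, and all parameter names and numbers are hypothetical.

```javascript
// Net USD profit from buying `size` units on one exchange and selling
// them on another. Fees are percentages of the traded notional value;
// transferCostUsd is an assumed flat cost of moving funds afterwards.
function netArbitrageProfit({ buyPrice, sellPrice, size, buyFeePct, sellFeePct, transferCostUsd }) {
  const cost = buyPrice * size * (1 + buyFeePct / 100);
  const proceeds = sellPrice * size * (1 - sellFeePct / 100);
  return proceeds - cost - transferCostUsd;
}

// Hypothetical example: a $50 gross price gap on one unit.
const profit = netArbitrageProfit({
  buyPrice: 9000,
  sellPrice: 9050,
  size: 1,
  buyFeePct: 0.2,
  sellFeePct: 0.25,
  transferCostUsd: 5
});
console.log(profit.toFixed(3));
```

In this example most of the gross gap is consumed by fees and the transfer cost, and a somewhat smaller gap would make the trade unprofitable.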

2.3 API formats

There are various API formats that exchanges use to provide programmatic access to market
participants. We briefly survey the basics of the different formats and their pros and
cons to provide further background for our technical implementation decisions.

2.3.1 FIX

FIX (Financial Information eXchange) is the industry-standard protocol for the dissem-

ination of market data and order placement in traditional financial markets [43]. It has

been in use since 1992 and is used across equities, fixed income, derivatives, and foreign

exchange markets.

The FIX protocol runs on top of persistent TCP connections and allows for data messages

to be initiated either by the client or by the server. This is appropriate and advantageous

in a financial market setting in that the client can make requests, such as asking for or-

der book updates for a trading pair or placing an order, and the server can respond with

data as it becomes available such as new orders added to the book or completion status

of a placed order. Since the default behavior is for the connection to be persistent, under

normal conditions a continuous stream of updates is received without missed messages.

While adoption is widespread within the financial markets industry, the FIX protocol is

not used in other contexts, and most code making use of this protocol is proprietary and

not open source or freely available. This presents an additional challenge and obstacle to

using such an API to collect market data as a FIX client would need to be implemented

from scratch, which would be a significant undertaking.

FIX API interfaces are offered by some cryptocurrency exchanges, but not many. They are

primarily offered by exchanges catering to institutional accounts, and some (for example,

CEX.IO) exclusively offer FIX API access to institutional customers.

2.3.2 REST over HTTP

HTTP is the application-layer protocol on top of which the world wide web is built.

HTTP is also generally run on top of TCP, but the connection is only maintained for the

duration of a single request from a client and the corresponding response from a server.

This is appropriate in the context of navigating to a web page in a browser, but is less

than ideal for maintaining a real-time stream of financial market data. A client can re-

quest the most recent market state from the server, but after the server responds, the con-

nection is closed. A new request for market state requires the setup of a new connection.

This repeated overhead cost adds latency and processing time. Furthermore, HTTP is

stateless. Each new connection could be routed to a different server, and that server will

not know which updates the client has already received. This can be mitigated, say, if up-

dates are assigned monotonically increasing identifiers and the client includes with each

request the last identifier it has received. This, too, however, requires extra bandwidth.
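The mitigation based on monotonically increasing identifiers can be sketched as follows. To keep the example self-contained, the REST endpoint is replaced by an in-memory stand-in; a real client would issue an HTTP request at the marked point. All names (`makeFakeServer`, `makePollingClient`, `getUpdates`) are hypothetical and do not correspond to any exchange's API.

```javascript
// Stand-in for a stateless REST endpoint: returns every update whose id
// is greater than sinceId. The server keeps no per-client state.
function makeFakeServer(updates) {
  return {
    getUpdates(sinceId) {
      return updates.filter(u => u.id > sinceId);
    },
    publish(update) {
      updates.push(update);
    }
  };
}

// The client remembers only the last identifier it has received and
// sends it with every request.
function makePollingClient(server) {
  let lastId = 0;
  return function poll() {
    const fresh = server.getUpdates(lastId); // an HTTP request in practice
    if (fresh.length > 0) {
      lastId = fresh[fresh.length - 1].id;
    }
    return fresh;
  };
}

const server = makeFakeServer([{ id: 1, price: '100.0' }, { id: 2, price: '100.5' }]);
const poll = makePollingClient(server);
console.log(poll().length); // first poll returns both existing updates
server.publish({ id: 3, price: '101.0' });
console.log(poll().length); // second poll returns only the new update
```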

Given its usage for the web, HTTP is universally supported, and libraries supporting

HTTP are widely available in essentially any mainstream programming language. In terms

of adoption by cryptocurrency exchanges, many, if not all, offer REST APIs over HTTP.

2.3.3 WebSocket

The WebSocket protocol is much newer than FIX or HTTP. It was standardized in 2011.

It is also built on top of TCP and allows for full-duplex communication over a long-lasting

persistent connection. It was designed for low-overhead real-time data transfer for modern

web applications.

Though not specifically designed for financial market settings, the WebSocket protocol

is well suited to transmission of market data. A persistent connection is maintained be-

tween the client and server, so there is less overhead compared to the repeated setup and

teardown required for each new request with HTTP. Furthermore, either the client or the

server can initiate data transmission, so the server can push new market updates to the

client as they become available rather than having to wait to respond to the next request

from the client. Connection persistence also allows for continuous transmission of messages

without having to juggle out-of-order or missed updates under normal conditions.

Being a modern web standard, the WebSocket protocol is also well-supported. Though not

as universal as HTTP, WebSocket libraries are both readily available and mature. As an

API protocol, WebSocket interfaces are offered by a minority of cryptocurrency exchanges.

Because of the comparative advantages offered, we use WebSocket APIs exclusively for our

data collection.

Chapter 3

Data Collection

3.1 Choosing currencies and exchanges

There are thousands of cryptocurrencies and hundreds of exchanges, some of which offer as

many as hundreds of trading pairs. Therefore, to narrow our scope, we must choose which

exchanges and which currency pairs to focus on for this work.

From a market data perspective, we are interested in choosing currency pairs and ex-

changes that carry the highest daily trading volume in order to be able to study the most

active and important markets. We are also concerned with technical considerations; it is

important to select exchanges for which reliable continuous data collection is possible.

Ease of technical implementation is also a factor. Finally, the choices of currency pairs

and exchanges are not independent in that exchanges each only offer a limited set of cur-

rency pairs for trading.

3.1.1 Currency pairs

When considering currency pairs, we only consider US Dollar denominated trading pairs.

For most of the top exchanges by trading volume, the top trading pairs by volume have

US Dollars as the fiat currency. The exchanges for which this is not true are all based in

South Korea (with top trading pairs by volume denominated in South Korean Won) and

do not have English-language websites or technical documentation that would be required

for data collection implementation. The position of the US Dollar as a widely-used reserve

currency is also an important factor. Trading pairs that involve two cryptocurrencies

rather than one fiat currency and one cryptocurrency are not included as such pairs would

have more confounding effects present in trading. For example, we assume that an arbi-

trageur is ultimately interested in a profit in fiat currency and that both having to hold

multiple cryptocurrencies and then having to make an additional trade to convert to fiat

would add friction to achieving price parity.

An issue with choosing the top trading pairs by volume is that the ranking differs between

exchanges. At this point, since we want US Dollars as one currency, we consider the mar-

ket cap rankings of the various cryptocurrencies. The top six cryptocurrencies by market

cap at the start of this work are Bitcoin, Ripple, Ethereum, Bitcoin Cash, Cardano, and

Litecoin. We remove Ripple from consideration because it is the product of a for-profit

company, Ripple Labs, Inc., which controls many of the nodes in the network and a significant
amount of the cryptocurrency itself, so it is quite different from the other cryptocurrencies,
which are more distributed. We eliminate Cardano from consideration because of its

age; it was only released at the end of September 2017 and is therefore less likely to be as

well studied and understood by the market as more mature cryptocurrencies. For exam-

ple, it seems more likely for a debilitating bug or security flaw to exist in its codebase than

those of cryptocurrencies that have undergone more scrutiny [10, 24, 65]. We consider ex-

cluding Bitcoin Cash on similar grounds, but, because of its relationship to Bitcoin, we

include it. Bitcoin Cash is a “fork” of Bitcoin that includes the entire history of the Bitcoin
blockchain prior to its inception, and Bitcoin Cash includes very few changes in its

functionality and implementation. Bitcoin Cash was introduced with the express purpose

of having lower transaction times and fees than Bitcoin, and this makes it an interesting

cryptocurrency to consider. We therefore select Bitcoin (BTC), Ethereum (ETH), Bitcoin

Cash (BCH), and Litecoin (LTC).

3.1.2 Exchanges

The top three exchanges by trading volume offering US Dollar denominated trading pairs

at the beginning of this work are Binance, OKEx, and Bitfinex. However, as we begin

work on implementation for data collection for Binance, we find that Binance and OKEx

do not actually offer trading against US Dollars, but rather only offer trading against

USDT, a cryptocurrency guaranteed to be worth $1 by Tether Limited and backed by US

Dollar reserves the company holds. However, at the time of exchange selection, it does

not appear to be possible to freely convert between US Dollars and USDT, and there are

widespread rumors that the company does not actually have the reserve funds to back

USDT as they claim to [40, 28, 3, 18, 41, 35]. Because of this risk, we exclude Binance and

OKEx from consideration.

The next three exchanges by US Dollar denominated trading volume are Kraken, GDAX,

and Bitstamp. Kraken does not have a WebSocket API available for data collection, but

Bitfinex, GDAX, and Bitstamp do. Given the technical advantages of a WebSocket API

compared to a REST API as offered by Kraken as its only option, we move forward with

Bitfinex, GDAX, and Bitstamp. Availability of WebSocket APIs provides for simpler im-

plementations and stronger confidence in continuous data collection without interruption.

All three of these exchanges offer the chosen currency pairs: BTC/USD, ETH/USD, LTC/

USD, and BCH/USD.

Across these exchanges we expect external barriers to arbitrage to be low. All three ex-

changes accept US Dollar deposits through a variety of means, including bank wires.

Cryptocurrencies are movable across exchange accounts as well. Bitstamp has been op-

erating since 2011, and Bitfinex and the predecessor to GDAX (Coinbase) launched in

2012 [37, 32, 6]. The operator of GDAX (Coinbase, Inc.) is based in San Francisco, CA

in the US and is presumably subject to US regulations [33]. It is, however, more difficult

to find concrete information about Bitfinex and Bitstamp. While it appears that Bitfinex

is registered in the British Virgin Islands, there is no reliable information as to where their

money, employees, or infrastructure are located [57]. Bitstamp discloses that it has entities

in Luxembourg, the UK, and the US, though again, it is not clear where their people or

servers are [7]. Despite not listing any Slovenian address, there are reports of Bitstamp’s

presence there, and they appear to be actively hiring personnel for positions in Slovenia

[59, 1].

3.2 Collecting ticker data

Each of the three exchanges has an interface providing a streaming ticker, so we use that

to collect pricing data for each of the currencies across the exchanges. Each exchange has

a slightly different interface with a different data specification.

3.2.1 GDAX

For GDAX, the request and response specifications for ticker updates are as follows:

    // Request
    // Subscribe to BTC-USD and LTC-USD ticker updates
    {
      "type": "subscribe",
      "product_ids": [
        "BTC-USD",
        "LTC-USD"
      ],
      "channels": [
        "ticker"
      ]
    }

Listing 3.1: Specification for request for GDAX ticker update subscription

    // Response
    {
      "type": "subscriptions",
      "channels": [
        {
          "name": "ticker",
          "product_ids": [
            "BTC-USD",
            "LTC-USD"
          ]
        }
      ]
    }

Listing 3.2: Specification for response for GDAX ticker update subscription

Data messages sent on this channel have the following form:

    {
      "type": "ticker",
      "trade_id": 20153558,
      "sequence": 3262786978,
      "time": "2017-09-02T17:05:49.250000Z",
      "product_id": "BTC-USD",
      "price": "4388.01000000",
      "side": "buy", // Taker side
      "last_size": "0.03000000",
      "best_bid": "4388",
      "best_ask": "4388.01"
    }

Listing 3.3: Specification for GDAX ticker update message

As we can see, with GDAX, updates are received for each trade and each update con-

tains the time, price, side, and size of the trade along with the post-trade best bid and ask

prices. To read from this stream, we use the following JavaScript code:

    const Gdax = require('gdax');

    // Subscribe to the ticker channel for the four chosen trading pairs
    const websocket = new Gdax.WebsocketClient(
      ['BTC-USD', 'ETH-USD', 'LTC-USD', 'BCH-USD'],
      'wss://ws-feed.gdax.com',
      null, // no credentials are supplied for this public feed
      { channels: ['ticker'] }
    );

    websocket.on('message', data => {
      // Keep only ticker updates that carry an exchange timestamp
      if (data['type'] === 'ticker' && data['time'] !== undefined) {
        const logItem = {
          date: new Date(), // local receipt time
          data: data
        };
        console.log(JSON.stringify(logItem));
      }
    });

Listing 3.4: JavaScript code for reading from GDAX ticker update stream

We record time separately because we do not know whether the clocks between the ex-

changes are synchronized, so the only known reference point is on the machine where the

data is being collected and recorded. We include the conditional specifying type and time

to ensure that we only record the updates of interest and ignore extraneous unexpected

data present on the stream.
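For later analysis, each recorded line can be read back into numeric form. The sketch below is one possible way to do this, assuming the log item shape produced by Listing 3.4 and the message fields shown in Listing 3.3. The function name `normalizeGdaxTicker` and the output field names are hypothetical and are not part of the pipeline described in this work.

```javascript
// Convert one recorded GDAX ticker log item into numeric fields.
// logItem.date is the local receipt time; logItem.data is the raw message.
function normalizeGdaxTicker(logItem) {
  const data = logItem.data;
  return {
    pair: data.product_id,
    recordedMs: new Date(logItem.date).getTime(), // local clock
    reportedMs: Date.parse(data.time),            // exchange clock
    price: Number(data.price),
    size: Number(data.last_size),
    takerSide: data.side,
    spread: Number(data.best_ask) - Number(data.best_bid)
  };
}

// Message values from Listing 3.3; the local receipt time is made up.
const record = normalizeGdaxTicker({
  date: '2017-09-02T17:05:49.300Z',
  data: {
    type: 'ticker',
    time: '2017-09-02T17:05:49.250000Z',
    product_id: 'BTC-USD',
    price: '4388.01000000',
    side: 'buy',
    last_size: '0.03000000',
    best_bid: '4388',
    best_ask: '4388.01'
  }
});
console.log(record);
```

Prices arrive as strings in the GDAX messages, so the explicit `Number` conversion is needed before any arithmetic.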

3.2.2 Bitfinex

Subscribing to the Bitfinex WebSocket Ticker API feed for the BTC/USD currency pair

has the following request and response specification:

    // request
    {
      "event": "subscribe",
      "channel": "ticker",
      "pair": "BTCUSD"
    }

    // response
    {
      "event": "subscribed",
      "channel": "ticker",
      "chanId": "<CHANNEL_ID>",
      "pair": "BTCUSD"
    }

Listing 3.5: Specification for request and response for Bitfinex ticker update subscription

Once subscribed, data messages sent on the channel come in one of the two following formats:

    // snapshot
    [
      "<CHANNEL_ID>",
      "<BID>",
      "<BID_SIZE>",
      "<ASK>",
      "<ASK_SIZE>",
      "<DAILY_CHANGE>",
      "<DAILY_CHANGE_PERC>",
      "<LAST_PRICE>",
      "<VOLUME>",
      "<HIGH>",
      "<LOW>"
    ]

    // updates
    [
      "<CHANNEL_ID>",
      "<BID>",
      "<BID_SIZE>",
      "<ASK>",
      "<ASK_SIZE>",
      "<DAILY_CHANGE>",
      "<DAILY_CHANGE_PERC>",
      "<LAST_PRICE>",
      "<VOLUME>",
      "<HIGH>",
      "<LOW>"
    ]

Listing 3.6: Specifications for Bitfinex ticker messages

Note that there is no difference in the specification Bitfinex gives for the snapshot mes-

sage compared to the update message. It is not clear why any distinction is made between

these two message types.

The data fields in this stream have the following descriptions:

Field               Type     Description
CHANNEL_ID          integer  Channel ID
BID                 float    Price of last highest bid
BID_SIZE            float    Size of the last highest bid
ASK                 float    Price of last lowest ask
ASK_SIZE            float    Size of the last lowest ask
DAILY_CHANGE        float    Amount that the last price has changed since yesterday
DAILY_CHANGE_PERC   float    Amount that the price has changed expressed in percentage terms
LAST_PRICE          float    Price of the last trade
VOLUME              float    Daily volume
HIGH                float    Daily high
LOW                 float    Daily low

Table 3.1: Bitfinex Ticker API Message Format

Accordingly, we use the following JavaScript code to record this data:

    const BFX = require('bitfinex-api-node');

    const bfx = new BFX();

    // WebSocket client; the argument selects version 1 of the Bitfinex API
    const ws = bfx.ws(1);

    ws.on('open', () => {
      // Subscribe to the ticker channel for each chosen trading pair
      ws.subscribeTicker('BTCUSD');
      ws.subscribeTicker('ETHUSD');
      ws.subscribeTicker('LTCUSD');
      ws.subscribeTicker('BCHUSD');
    });

    ws.on('ticker', (pair, ticker) => {
      const logItem = {
        date: new Date(), // local receipt time
        pair: pair,
        data: ticker
      };
      console.log(JSON.stringify(logItem));
    });

    ws.open();

Listing 3.7: JavaScript code for reading from Bitfinex ticker update stream

As we can see, messages from Bitfinex do not include any timestamp, so the only time
reference we have is our own.
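Because the raw ticker messages are positional arrays, the field order in Table 3.1 is the only key to their meaning. The following sketch maps a raw array onto named fields. It operates on the raw message format from Listing 3.6 rather than on whatever the client library passes to its event handlers, and the function name and the camelCase field names are hypothetical.

```javascript
// Field order taken from Table 3.1.
const TICKER_FIELDS = [
  'channelId', 'bid', 'bidSize', 'ask', 'askSize', 'dailyChange',
  'dailyChangePerc', 'lastPrice', 'volume', 'high', 'low'
];

// Map a raw ticker array onto named fields. Returns null for anything
// that does not have the expected number of elements.
function parseBitfinexTicker(message) {
  if (!Array.isArray(message) || message.length !== TICKER_FIELDS.length) {
    return null;
  }
  const ticker = {};
  TICKER_FIELDS.forEach((name, i) => {
    ticker[name] = message[i];
  });
  return ticker;
}

// Hypothetical message values, for illustration only.
const parsedTicker = parseBitfinexTicker(
  [7, 9000, 1.5, 9001, 2, -50, -0.55, 9000.5, 12345, 9100, 8900]
);
console.log(parsedTicker.bid, parsedTicker.ask);
```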

3.2.3 Bitstamp

Bitstamp provides the following specification for its Ticker API:

CHANNEL      live_trades (for BTC/USD trades)
             live_trades_{currency_pair}
             The currency_pair placeholder can be replaced with the following
             values that correspond to different orderbooks: btceur, eurusd,
             xrpusd, xrpeur, xrpbtc, ltcusd, ltceur, ltcbtc, ethusd, etheur,
             ethbtc, bchusd, bcheur, bchbtc
EVENT        trade
PUSHER KEY   de504dc5763aeef9ff52

Table 3.2: Bitstamp Ticker API Request Format

The fields have the following descriptions:

Field           Description
id              Trade unique ID.
amount          Trade amount.
price           Trade price.
type            Trade type (0 - buy; 1 - sell).
timestamp       Trade timestamp.
buy_order_id    Trade buy order id.
sell_order_id   Trade sell order id.

Table 3.3: Bitstamp Ticker API Message Format

The data is thus collected using the following code:

    const Pusher = require('pusher-js');

    // Pusher key published by Bitstamp (see Table 3.2)
    const pusher = new Pusher('de504dc5763aeef9ff52');

    // Map each Bitstamp channel name to the trading pair label we record
    const channels = {
      live_trades: 'BTCUSD',
      live_trades_ltcusd: 'LTCUSD',
      live_trades_ethusd: 'ETHUSD',
      live_trades_bchusd: 'BCHUSD'
    };

    Object.entries(channels).forEach(([channelName, pair]) => {
      const channel = pusher.subscribe(channelName);
      channel.bind('trade', (data) => {
        const logItem = {
          date: new Date(), // local receipt time
          pair: pair,
          data: data
        };
        console.log(JSON.stringify(logItem));
      });
    });

Listing 3.8: JavaScript code for reading from Bitstamp ticker update stream

3.3 Issue of recorded time

GDAX and Bitstamp provide timestamps with each data point in the stream. In our

pipeline, we also record our own timestamp with each data point. For Bitfinex, since no

timestamp is provided, our own timestamp is the only reference we have. For making com-

parisons across exchanges, it is not clear which timestamps we should use. Using our own

recorded timestamps provides a guarantee of consistency of clock, though each data point

would include delays between the actual time of the trade on the exchange and the time

recorded, and these delays would vary by exchange. Alternatively, using timestamps as re-

ported by the exchanges would eliminate these delays, but there is no guarantee of clock

synchrony across exchanges.

3.4 Ticker data evaluation and learnings

Our initial implementation using the code described above struggles to remain stable and

encounters frequent crashes. We describe the mitigation strategies used and their technical

implementation in more detail in Appendix A.1.

3.4.1 Variation in data stream frequency

After collecting multiple days’ worth of data, we perform a summary analysis to gain
preliminary insight into the differences between the sources and to confirm the validity

of the collected data. In this process we first look at the frequency of updates received and

recorded across the different exchanges and currencies.

We create histograms for each trading pair on each exchange using 100ms wide bins and

observe the distribution of time deltas in between consecutive updates received for each

data stream.
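The histogram construction described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in this work. It assumes the recorded timestamps for one data stream are available as an ascending array of milliseconds, and the function names are hypothetical.

```javascript
// Differences between consecutive recorded timestamps, in milliseconds.
function timeDeltas(timestampsMs) {
  const deltas = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    deltas.push(timestampsMs[i] - timestampsMs[i - 1]);
  }
  return deltas;
}

// Count values per fixed-width bin. Keys are the bin start values,
// so key 0 with a width of 100 covers the range [0, 100).
function histogram(values, binWidth) {
  const counts = new Map();
  for (const v of values) {
    const binStart = Math.floor(v / binWidth) * binWidth;
    counts.set(binStart, (counts.get(binStart) || 0) + 1);
  }
  return counts;
}

// Hypothetical receipt times for six consecutive updates.
const receivedMs = [0, 40, 90, 350, 360, 1500];
const bins = histogram(timeDeltas(receivedMs), 100);
console.log([...bins.entries()]);
```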

We note that for each of the three GDAX data streams, the distributions look similar.

The fullest bin is the first (0-100ms), with a significant decline after that, though there

is an uptick between 10 and 13 seconds to a frequency not otherwise seen above 1.5 sec-

onds. We do not have a clear explanation for this phenomenon, but we suspect that it is

due to some timeout or retry logic that is activated after a period of 10 seconds.

Figure 3-1: Preliminary GDAX BTC Update Time Deltas

Figure 3-2: Preliminary GDAX ETH Update Time Deltas

Figure 3-3: Preliminary GDAX LTC Update Time Deltas

The histograms for the Bitfinex data streams are very different. Here we see large peaks

at each of 15, 30, 45, and 60 seconds and smaller peaks 0.5s or 0.6s on either side of these

large peaks with steady fall off on both sides. Given such regular intervals, one possible

reason is that the large peaks are due to Bitfinex intentionally aiming to provide updates

every 15 seconds and that the decreasing peaks at 30s, 45s, and 60s are caused by some of

the updates that are meant to be every 15s not arriving. The normal-like fall off on either

side of the peaks makes sense if we assume that their “errors” when attempting to send

data every 15 seconds are approximately normally distributed, but the smaller peaks 0.5-

0.6s on each side of the large peaks are less explainable.

Figure 3-4: Preliminary Bitfinex BTC Update Time Deltas

Figure 3-5: Preliminary Bitfinex ETH Update Time Deltas

Figure 3-6: Preliminary Bitfinex LTC Update Time Deltas

The Bitstamp data returns to a more expected pattern. Here we see the most samples in

the 100ms-200ms bin and the second-most in the 0-100ms bin. There is a general fall-off
thereafter, with small spikes at 5s and 10s. We again suspect that the spikes at 5s and 10s
are thresholded and likely caused by timeout or retry logic present in the system.

Observing slightly slower update times for Bitstamp compared to GDAX is expected given

that GDAX sees more trading volume (assuming size of trades across the two exchanges

are comparable).

Figure 3-7: Preliminary Bitstamp BTC Update Time Deltas

Figure 3-8: Preliminary Bitstamp ETH Update Time Deltas

Figure 3-9: Preliminary Bitstamp LTC Update Time Deltas

Though the most striking resemblance when looking at the histograms is the consistency

on each exchange across the various trading pairs, we do also observe differences between

the trading pairs on each exchange and note that there does seem to be a difference in fre-

quency of trades.

The concerning realization from these observations, however, is that it appears that up-

dates from Bitfinex across all trading pairs are significantly less frequent than from the

other two exchanges. This is particularly surprising because Bitfinex is the exchange with

the highest trading volume, and a cursory qualitative examination of the user-facing trad-

ing interface suggests that many trades are happening per second.

Figure 3-10: Bitfinex Web Trading Interface

Furthermore, the distribution of update time deltas makes it appear as though they are

intentionally only providing updates every 15 seconds rather than providing a real-time stream

of trades as we require.

3.4.2 Bitfinex update frequency mitigation strategy

In order to not be limited by the apparent artificial throttling Bitfinex is placing on their

Ticker API, we explore other options. First we write code to instead read from Bitfinex’s

Trades WebSocket channel.

The Trades channel has the following subscription request and response specifications:

    // request
    {
      "event": "subscribe",
      "channel": "trades",
      "pair": "BTCUSD"
    }

    // response
    {
      "event": "subscribed",
      "channel": "trades",
      "chanId": "<CHANNEL_ID>",
      "pair": "<PAIR>"
    }

Listing 3.9: Specification for request and response for Bitfinex trades update subscription

Data messages sent on these channels are in the following forms:

    // snapshot
    [
      "<CHANNEL_ID>",
      [
        [
          "<SEQ> OR <ID>",
          "<TIMESTAMP>",
          "<PRICE>",
          "<AMOUNT>"
        ],
        [
          "..."
        ]
      ]
    ]

    // updates
    [
      "<CHANNEL_ID>",
      "te",
      "<SEQ>",
      "<TIMESTAMP>",
      "<PRICE>",
      "<AMOUNT>"
    ]

    [
      "<CHANNEL_ID>",
      "tu",
      "<SEQ>",
      "<ID>",
      "<TIMESTAMP>",
      "<PRICE>",
      "<AMOUNT>"
    ]

Listing 3.10: Specifications for Bitfinex trades messages

The fields involved in these messages have the following descriptions:

Field       Type     Description
SEQ         string   Trade sequence id
ID          int      Trade database id
TIMESTAMP   int      Unix timestamp of the trade.
PRICE       float    Price at which the trade was executed
±AMOUNT     float    How much was bought (positive) or sold (negative).

Table 3.4: Bitfinex Trades API Message Format

Based on this spec, we adapt the Bitfinex data collection code as shown:

    const BFX = require('bitfinex-api-node');

    const bfx = new BFX();

    const ws = bfx.ws(1);

    ws.on('open', () => {
      // Subscribe to the trades channel for each chosen trading pair
      ws.subscribeTrades('BTCUSD');
      ws.subscribeTrades('ETHUSD');
      ws.subscribeTrades('LTCUSD');
      ws.subscribeTrades('BCHUSD');
    });

    ws.on('trade', (pair, trade) => {
      const logItem = {
        date: new Date(), // local receipt time
        pair: pair,
        data: trade
      };
      console.log(JSON.stringify(logItem));
    });

    // Diagnostics go to stderr so that stdout contains only data records
    ws.on('error', (err) => {
      console.error(err);
    });

    // Reopen the connection shortly after it closes
    ws.on('close', () => {
      console.error('close');
      setTimeout(() => {
        console.error('reconnecting');
        ws.open();
      }, 500);
    });

    ws.on('info', (msg) => {
      console.error(msg);
    });

    ws.open();

Listing 3.11: JavaScript code for reading from Bitfinex trades update stream

We run this new code for Bitfinex alongside the existing code being used to collect data

from GDAX and Bitstamp.
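The raw trade update formats in Listing 3.10 differ in length between the `te` and `tu` variants, so a parser has to branch on the second element. The sketch below shows one way to do this for raw messages. The client library used in Listing 3.11 may already deliver processed objects, and the function name and output field names are hypothetical. The timestamp is passed through unchanged, since its unit is an assumption we do not make here.

```javascript
// Parse a raw Bitfinex trade update as laid out in Listing 3.10.
// Returns null for snapshots and any other message type.
function parseBitfinexTrade(message) {
  if (!Array.isArray(message)) {
    return null;
  }
  const kind = message[1];
  let fields;
  if (kind === 'te') {
    // [CHANNEL_ID, "te", SEQ, TIMESTAMP, PRICE, AMOUNT]
    const [seq, timestamp, price, amount] = message.slice(2);
    fields = { seq, id: null, timestamp, price, amount };
  } else if (kind === 'tu') {
    // [CHANNEL_ID, "tu", SEQ, ID, TIMESTAMP, PRICE, AMOUNT]
    const [seq, id, timestamp, price, amount] = message.slice(2);
    fields = { seq, id, timestamp, price, amount };
  } else {
    return null;
  }
  return {
    kind: kind,
    seq: fields.seq,
    id: fields.id,
    timestamp: fields.timestamp,
    price: fields.price,
    size: Math.abs(fields.amount),
    // Per Table 3.4, a positive amount is a buy and a negative one a sell.
    side: fields.amount > 0 ? 'buy' : 'sell'
  };
}

// Hypothetical message values, for illustration only.
console.log(parseBitfinexTrade([5, 'te', '1234-BTCUSD', 1443659698, 236.42, 0.5]));
console.log(parseBitfinexTrade([5, 'tu', '1234-BTCUSD', 9869875, 1443659698, 236.42, -0.5]));
```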

3.4.3 Updated data stream frequency evaluation

After several days, we construct new histograms to compare the update time deltas of our

new dataset. The GDAX and Bitstamp datasets look similar to our previous observations,

but now the Bitfinex histograms look very different. We observe a more expected pattern

where we most frequently see samples in the 0-100ms range and a drop-off thereafter. It

is, if anything, more consistent with our expectations than the GDAX or Bitstamp distributions
in that we see a frequency almost monotonically decreasing with increasing time

intervals.

Figure 3-11: Final GDAX BTC Update Time Deltas

Figure 3-12: Final GDAX ETH Update Time Deltas

Figure 3-13: Final GDAX LTC Update Time Deltas

Figure 3-14: Final GDAX BCH Update Time Deltas

Figure 3-15: Final Bitfinex BTC Update Time Deltas

Figure 3-16: Final Bitfinex ETH Update Time Deltas

Figure 3-17: Final Bitfinex LTC Update Time Deltas

Figure 3-18: Final Bitfinex BCH Update Time Deltas

Figure 3-19: Final Bitstamp BTC Update Time Deltas

Figure 3-20: Final Bitstamp ETH Update Time Deltas

Figure 3-21: Final Bitstamp LTC Update Time Deltas

Figure 3-22: Final Bitstamp BCH Update Time Deltas

Given the constant stream of updates from all three exchanges across all of the trading

pairs, we are now confident in the integrity of the data streams for further use in our anal-

ysis.

3.4.4 Exchange latency evaluation

Since the new Bitfinex feed includes timestamps for each trade with each update, it is now

possible to both record the time the trade actually took place as reported by the exchange

as well as the time the record of the trade entered our data recording pipeline. The
difference between these two timestamps is a conflated measure of both latency (the delay
between when the trade occurred and when we learn of it) and the clock skew between

the exchange’s clock and ours. For each trading pair on each exchange, we construct a

histogram with 5ms wide bins to examine the distribution of reported vs recorded trade

times. Here, again, we observe pronounced differences between exchanges as well as minor

differences between trading pairs.
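The binning described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce the figures: the function name `latency_histogram` is hypothetical, and we assume both the exchange-reported and locally recorded timestamps are available as `datetime` values.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def latency_histogram(pairs, bin_width_ms=5):
    """Count (reported, recorded) timestamp pairs per latency bin.

    Returns a Counter mapping the lower edge of each bin (in ms) to the
    number of samples in it. The delta mixes network latency with clock
    skew, so negative bins are possible.
    """
    histogram = Counter()
    bin_width = timedelta(milliseconds=bin_width_ms)
    for reported, recorded in pairs:
        # Floor division of two timedeltas yields an exact integer bin index.
        bin_index = (recorded - reported) // bin_width
        histogram[bin_index * bin_width_ms] += 1
    return histogram


# Hypothetical sample: four trades observed 3, 7, 9, and 42 ms after they occurred.
t0 = datetime(2018, 5, 4, tzinfo=timezone.utc)
example = [(t0, t0 + timedelta(milliseconds=ms)) for ms in (3, 7, 9, 42)]
print(sorted(latency_histogram(example).items()))
```

Working in whole timedeltas rather than floating-point seconds keeps samples that fall exactly on a bin edge from being misassigned by rounding.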

For GDAX, with each of the trading pairs, we observe generally low latency, often under

40ms, with a small smattering of samples spread out over a long tail. We observe differences between trading pairs in that, for BTC and BCH, only roughly 1.5% of samples exhibit delays of less than 5ms compared to over 5% for ETH and nearly 20% for LTC. It is hard to

say with any confidence what causes these differences, but one possibility is that GDAX

vertically shards its infrastructure by trading pair, and, due to more trading activity with

BTC and BCH, latencies are higher compared to their ETH and LTC markets.

Figure 3-23: GDAX BTC Reported vs Recorded Time Deltas

Figure 3-24: GDAX ETH Reported vs Recorded Time Deltas

Figure 3-25: GDAX LTC Reported vs Recorded Time Deltas

Figure 3-26: GDAX BCH Reported vs Recorded Time Deltas

The distributions for Bitfinex look dramatically different. There are no samples for any

trading pair showing delays less than 60ms, and most samples exhibit delays between

100ms and 1100ms with a fairly uniform distribution within that range. We then observe

a rapid decline with a long tail extending past 2 seconds. In contrast to GDAX, there appears to be almost no difference between the different trading pairs on this exchange. The

only observable point of note is that there appear to be spikes on the LTC data stream

every 100ms along the long tail (1.5s, 1.6s, 1.7s, ...). There is no clear reason why trades

would be bunched along these boundaries, so once again we assume that this behavior is

due to application or network timeouts or retries.

Figure 3-27: Bitfinex BTC Reported vs Recorded Time Deltas

Figure 3-28: Bitfinex ETH Reported vs Recorded Time Deltas

Figure 3-29: Bitfinex LTC Reported vs Recorded Time Deltas

Figure 3-30: Bitfinex BCH Reported vs Recorded Time Deltas

The histograms for Bitstamp are very similar to those for Bitfinex, albeit slightly shifted.

We observe no samples with delays of less than 230ms, and most samples seem to fall between 250ms and 1250ms with a fairly uniform spread within that range. Once again, there is a rapid decline followed by a long tail extending past 2s. In slight contrast, for the

ETH, LTC, and BCH data streams, we observe slight peaks between 340ms and 370ms.

Figure 3-31: Bitstamp BTC Reported vs Recorded Time Deltas

Figure 3-32: Bitstamp ETH Reported vs Recorded Time Deltas

Figure 3-33: Bitstamp LTC Reported vs Recorded Time Deltas

Figure 3-34: Bitstamp BCH Reported vs Recorded Time Deltas

In all, we observe significant differences across exchanges, and minor differences across

trading pairs. We presume a large factor in the differences across exchanges is physical

location of infrastructure and network connectivity. GDAX’s infrastructure is located in

Northern Virginia within the Amazon Web Services us-east-1 region [30] which is also

where our infrastructure is located. This offers an explanation as to why, without any serious consideration given to network optimization, we are able to observe trades just tens of milliseconds after they occur. Bitfinex and Bitstamp, meanwhile, are not US companies and, while there is no reliable information as to where their servers are located, we

presume that they likely do not have US-based infrastructure. Thus, receiving data from

them requires more network hops and more distance traveled leading to higher latencies.

While some difference in the distributions is likely attributable to network latency, the particular shape of the Bitfinex and Bitstamp distributions arouses suspicion. In both cases, we observe a relatively uniform distribution over a one-second-wide interval rather than a sharp drop-off on either side of a particular peak. We suspect that the response times

are being purposefully shaped by the exchanges. It is possible this is a security or technical consideration to smooth out traffic and avoid bursts of data inundating their servers

at once. Another possibility is that they intentionally introduce these delays in the data

streams for their general clientele and charge a premium to institutional customers willing

to pay for low latency access in order to better employ high frequency trading strategies.

This would be a variation on techniques for exchanges described in the financial literature

[50].

In any case, we must choose a method to use for our analysis going forward. Absent any

indication of reliable time coordination between the exchanges, it seems most prudent to

use timestamps as recorded by the data collection pipeline as our time of record. With this method, despite the fact that we know the data we are considering from Bitfinex and Bitstamp is "stale" compared to GDAX, we are more certain that the timestamps are consistent throughout the dataset. Furthermore, using this method is most reflective of the experience of a US-based market participant. Given the physical separation between the exchanges, it would seem to be impossible for any actor to be able to act with zero or minimal delay from all exchanges simultaneously.

3.5 Collecting bid/ask data

In addition to collecting data about the trades executed on the exchanges, we are also interested in pricing and how it varies over time across exchanges and trading pairs. Accordingly, we seek to record data allowing us to determine the best bid and best ask price for each trading pair on each exchange at each point in time.

3.5.1 GDAX

Although the GDAX Ticker channel supplies data for the best bid and best ask along with

each trade, these data do not provide a full picture in that updates are only provided with

each executed trade rather than with each change to the order book. To access changes

to the best bid and best ask in between executed trades, we must subscribe to the level2

channel.

The request and response specifications for level2 updates are as follows:

// Request
// Subscribe to BTC-USD and LTC-USD level2 updates
{
    "type": "subscribe",
    "product_ids": [
        "BTC-USD",
        "LTC-USD"
    ],
    "channels": [
        "level2"
    ]
}

Listing 3.12: Specification for request for GDAX level2 update subscription

// Response
{
    "type": "subscriptions",
    "channels": [
        {
            "name": "level2",
            "product_ids": [
                "BTC-USD",
                "LTC-USD"
            ]
        }
    ]
}

Listing 3.13: Specification for response for GDAX level2 update subscription

Data messages sent on this channel have the following forms:

{
    "type": "snapshot",
    "product_id": "BTC-USD",
    "bids": [["6500.11", "0.45054140"]],
    "asks": [["6500.15", "0.57753524"]]
}

Listing 3.14: Specification for GDAX level2 snapshot message

{
    "type": "l2update",
    "product_id": "BTC-USD",
    "changes": [
        ["buy", "6500.09", "0.84702376"],
        ["sell", "6507.00", "1.88933140"],
        ["sell", "6505.54", "1.12386524"],
        ["sell", "6504.38", "0"]
    ]
}

Listing 3.15: Specification for GDAX level2 update message

We use the following code to record this data:

const Gdax = require('gdax');

// Subscribe to the level2 channel for all four USD trading pairs.
const websocket = new Gdax.WebsocketClient(
  ['BTC-USD', 'ETH-USD', 'LTC-USD', 'BCH-USD'],
  'wss://ws-feed.gdax.com',
  null,
  { channels: ['level2'] }
);

// Log every message together with the local time at which it was received.
websocket.on('message', data => {
  const logItem = {
    date: new Date(),
    data: data
  };
  console.log(JSON.stringify(logItem));
});

// Reconnect whenever the connection is closed.
websocket.on('close', () => {
  websocket.connect();
});

Listing 3.16: JavaScript code for reading from GDAX level2 update stream

Even though we collect this data, given the frequency of bid/ask updates available from

the ticker channel, the decision was made to not implement the data processing necessary

to reconstruct the best bid and best ask prices from the order book updates.
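Although this processing was not implemented for GDAX, the idea can be illustrated with a short Python sketch based on the snapshot and l2update message formats shown in Listings 3.14 and 3.15. The class name `Level2Book` and its methods are hypothetical, and the sketch assumes that a size of "0" in an update removes the price level, as the final change in Listing 3.15 suggests.

```python
from decimal import Decimal


class Level2Book:
    """Order book rebuilt from level2 messages (illustrative sketch only)."""

    def __init__(self):
        self.bids = {}  # price -> size
        self.asks = {}  # price -> size

    def apply(self, message):
        """Apply one parsed `snapshot` or `l2update` message."""
        if message["type"] == "snapshot":
            self.bids = {Decimal(p): Decimal(s) for p, s in message["bids"]}
            self.asks = {Decimal(p): Decimal(s) for p, s in message["asks"]}
        elif message["type"] == "l2update":
            for side, price, size in message["changes"]:
                levels = self.bids if side == "buy" else self.asks
                price, size = Decimal(price), Decimal(size)
                if size == 0:
                    # Assumption: zero size means the level is now empty.
                    levels.pop(price, None)
                else:
                    levels[price] = size

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None
```

Prices are kept as `Decimal` values so that price levels parsed from the exchange's strings compare exactly.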

3.5.2 Bitfinex

To collect bid/ask data from Bitfinex, we need to read from the Order Books channel. The

interface for subscribing to this feed has the following request and response specifications:

// request
{
    "event":"subscribe",
    "channel":"book",
    "pair":"<PAIR>",
    "prec":"<PRECISION>",
    "freq":"<FREQUENCY>",
    "length":"<LENGTH>"
}
// response
{
    "event":"subscribed",
    "channel":"book",
    "chanId":"<CHANNEL_ID>",
    "pair":"<PAIR>",
    "prec":"<PRECISION>",
    "freq":"<FREQUENCY>",
    "len":"<LENGTH>"
}

Listing 3.17: Specification for request and response for Bitfinex order books update sub-
scription

Once subscribed, data messages sent on the channel come in one of the following two formats:

// snapshot
[
    "<CHANNEL_ID>",
    [
        [
            "<PRICE>",
            "<COUNT>",
            "<AMOUNT>"
        ],
        ...
    ]
]
// updates
[
    "<CHANNEL_ID>",
    "<PRICE>",
    "<COUNT>",
    "<AMOUNT>"
]

Listing 3.18: Specifications for Bitfinex order books messages

The fields included have the following descriptions:

Field       Type     Description

PRECISION   string   Level of price aggregation (P0, P1, P2, P3). The default is P0.

FREQUENCY   string   Frequency of updates (F0, F1). F0=realtime / F1=2sec. The default is F0.

PRICE       float    Price level.

COUNT       int      Number of orders at that price level.

±AMOUNT     float    Total amount available at that price level. Positive values mean bid, negative values mean ask.

LENGTH      string   Number of price points ("25", "100") [default="25"]

Table 3.5: Bitfinex Order Book API Message Format

Since we are interested in as fine-grained data as possible, we subscribe with parameters

P0 for precision and F0 for frequency. Length is left at the default 25 since we are only

interested in the best bid and best ask.

The following code is used to record this data:

const BFX = require('bitfinex-api-node');

const bfx = new BFX();

const ws = bfx.ws(1);

// Subscribe to the order book channel for each trading pair once connected.
ws.on('open', () => {
  ws.subscribeOrderBook('BTCUSD');
  ws.subscribeOrderBook('ETHUSD');
  ws.subscribeOrderBook('LTCUSD');
  ws.subscribeOrderBook('BCHUSD');
});

// Log every order book message with the local time at which it was received.
ws.on('orderbook', (pair, update) => {
  const logItem = {
    date: new Date(),
    pair: pair,
    data: update
  };
  console.log(JSON.stringify(logItem));
});

ws.on('error', (err) => {
  console.error(err);
});

// Reconnect after a short delay whenever the connection is closed.
ws.on('close', () => {
  console.error('close');
  setTimeout(() => {
    console.error('reconnecting');
    ws.open();
  }, 500);
});

ws.on('info', (msg) => {
  console.error(msg);
});

ws.open();

Listing 3.19: JavaScript code for reading from Bitfinex order books update stream

This code simply records the data as it is received. Because Bitfinex provides an initial snapshot followed by incremental updates, further processing is required to reconstruct the best bid and best ask prices at any given time. Each message in the raw data stream describes only a single change to the order book: it gives the amount available at a specific price point at that instant, but without knowledge of the existing state of the order book, it is impossible to determine whether that is the best price point on either side of the order book.

To reconstruct the state of the order book at each update time, the stream of data needs

to be processed sequentially. The procedure for doing so would appear to be straightforward: begin with an order book defined by the initial snapshot and record the best bid

and best ask at that time; then, apply each update and record the best bid and best ask

at that timestamp, continuing until all the data has been processed.

This is accomplished with the following Python code:

import math

# bidsdict, asksdict, bestbid, bestask, and output are module-level state,
# assumed to have been initialized from the snapshot before the first update.


def process_update(update):
    global bestbid, bestask

    if update['data']['amount'] > 0:
        # A positive amount means the update applies to the bid side.
        if update['data']['count'] == 0:
            # A count of zero removes the price level from the book.
            del bidsdict[update['data']['price']]
        else:
            bidsdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if bidsdict:
            curbestbid = max(bidsdict)
            # Emit a record only when the best bid price changes.
            if curbestbid != bestbid['price']:
                bestbid = {
                    'price': curbestbid,
                    'amount': bidsdict[curbestbid]['amount'],
                    'count': bidsdict[curbestbid]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            # No bids remain on the book.
            bestbid = {
                'price': -math.inf,
                'date': update['date'],
            }
    elif update['data']['amount'] < 0:
        # A negative amount means the update applies to the ask side.
        if update['data']['count'] == 0:
            del asksdict[update['data']['price']]
        else:
            asksdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if asksdict:
            curbestask = min(asksdict)
            # Emit a record only when the best ask price changes.
            if curbestask != bestask['price']:
                bestask = {
                    'price': curbestask,
                    'amount': asksdict[curbestask]['amount'],
                    'count': asksdict[curbestask]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            # No asks remain on the book.
            bestask = {
                'price': math.inf,
                'date': update['date'],
            }

Listing 3.20: Python code to process Bitfinex Order Book updates

In practice, however, there are additional challenges detailed in Appendix A.2. After solving these, we are then able to output best bid and best ask prices at each update time for Bitfinex and use this for further analysis.

3.5.3 Bitstamp

For Bitstamp, to collect bid/ask data, we use the Live Order Book stream which has the

following specification:

CHANNEL      order_book (for BTC/USD order book)
             order_book_{currency_pair}
             The currency_pair placeholder can be replaced with the following values that correspond to different orderbooks: btceur, eurusd, xrpusd, xrpeur, xrpbtc, ltcusd, ltceur, ltcbtc, ethusd, etheur, ethbtc, bchusd, bcheur, bchbtc

EVENT        data

PUSHER KEY   de504dc5763aeef9ff52

Table 3.6: Bitstamp Live Order Book API Request Format

The field descriptions for this stream are as follows:

Field       Description

bids        List of top 100 bids.

asks        List of top 100 asks.

timestamp   Order book timestamp.

Table 3.7: Bitstamp Live Order Book API Message Format

Accordingly, here is the JavaScript code that is used to collect this data:

const Pusher = require('pusher-js');

const pusher = new Pusher('de504dc5763aeef9ff52');

// Map each Live Order Book channel to the trading pair label we record.
const channels = [
  ['order_book', 'BTCUSD'],
  ['order_book_ltcusd', 'LTCUSD'],
  ['order_book_ethusd', 'ETHUSD'],
  ['order_book_bchusd', 'BCHUSD']
];

channels.forEach(([channelName, pair]) => {
  const channel = pusher.subscribe(channelName);

  // Keep only the best bid and best ask from each message, along with
  // the local time at which the message was received.
  channel.bind('data', (data) => {
    const logItem = {
      date: new Date(),
      pair: pair,
      bestBid: data.bids[0],
      bestAsk: data.asks[0]
    };
    console.log(JSON.stringify(logItem));
  });
});

Listing 3.21: JavaScript code for reading from Bitstamp live order book stream

Note that since we are only interested in the best bid and best ask, we truncate the data before recording it. This is possible with Bitstamp where it was not with GDAX or Bitfinex because each message includes the top 100 bids and asks rather than incremental updates from an initial snapshot. This means that the data as recorded is usable with almost no further processing.

Furthermore, missed or out-of-order messages from Bitstamp do not pose data integrity issues. Since each message provides the complete state, there is no possibility of inconsistency. Out-of-order messages can be readily detected by comparing the supplied timestamp with the timestamps of previously received messages, and reconstruction of the proper timeline is therefore straightforward.
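As a minimal sketch of this check, the Python below flags messages that arrive with an older timestamp than one already seen and restores exchange order with a stable sort. The function names are hypothetical, and the sketch assumes that the `timestamp` field of each message is retained when recording and holds an integer number of seconds (possibly as a string).

```python
def find_out_of_order(messages):
    """Return messages whose timestamp is older than one already received."""
    out_of_order = []
    latest = None
    for message in messages:
        timestamp = int(message["timestamp"])
        if latest is not None and timestamp < latest:
            out_of_order.append(message)
        else:
            latest = timestamp
    return out_of_order


def restore_order(messages):
    """Sort messages by exchange timestamp, keeping arrival order for ties."""
    return sorted(messages, key=lambda message: int(message["timestamp"]))
```

Because Python's `sorted` is stable, messages sharing a timestamp keep the order in which they were received.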

Chapter 4

Trading Volume and Bid/Ask Spreads

Our data collection implementation and data processing pipeline allow us a view into each

match that occurs as well as the best bid and best ask price on each exchange for each of

the trading pairs at any given time during our sample period which spans from midnight

(00:00:00) on May 4, 2018 until midnight May 9, 2018. Using this aggregated data, we begin by presenting summaries of trading volume and bid/ask spreads for each trading pair

on each exchange. We then continue with building regression models to test hypotheses

related to arbitrage and trading behavior across exchanges.

4.1 Trading Volume

Defining a match, $r$, to have price $p_r$ and size $s_r$, we can compute the total trade volume for a given trading pair on each exchange as $\sum_r p_r s_r$.

We observe that Bitfinex sees the most USD trading volume across all cryptocurrencies

and that BTC is the highest-volume traded cryptocurrency across all three exchanges.

           Bitfinex          GDAX             Bitstamp

BTC/USD    $253,778,539.85   $55,987,187.13   $90,809,431.88

ETH/USD    $241,227,475.03   $44,384,284.14   $25,651,480.21

BCH/USD    $93,585,682.11    $27,001,784.91   $7,158,037.02

LTC/USD    $33,610,196.44    $25,565,410.78   $6,415,241.34

Table 4.1: Average daily trading volume by exchange and currency pair

4.2 Bid/Ask Spreads

To standardize the data streams and make them more manageable and comparable, we first sample in 1-second increments across our 5-day dataset. We use whole-second increments beginning at 00:00:00 on the first day of our sample. If we say that $t_i$ represents the time $i$ seconds (zero-indexed) after the start of the interval and that each orderbook update, $u$, has timestamp $\tau_u$, we define the best bid and best ask at time $t_i$ to be the prices reflected by the most recent orderbook update such that $\tau_u < t_i$.
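The sampling rule above can be sketched in Python with a binary search over the update timestamps. This is an illustration rather than the code used in this work: the function name `sample_quotes` and the tuple layout of `updates` are assumptions, and timestamps are taken to be seconds on a common clock.

```python
from bisect import bisect_left


def sample_quotes(updates, start, n_samples, step=1.0):
    """Return the (best_bid, best_ask) in force at each t_i = start + i * step.

    `updates` is a list of (timestamp, best_bid, best_ask) tuples sorted by
    timestamp. Sample i uses the most recent update whose timestamp is
    strictly less than t_i, or None if no update has occurred yet.
    """
    timestamps = [update[0] for update in updates]
    samples = []
    for i in range(n_samples):
        t_i = start + i * step
        # bisect_left counts the updates with timestamp < t_i.
        index = bisect_left(timestamps, t_i) - 1
        samples.append(updates[index][1:] if index >= 0 else None)
    return samples
```

Using `bisect_left` rather than `bisect_right` enforces the strict inequality: an update stamped exactly at the sample time is not reflected until the following sample.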

In presenting plots of bid/ask spreads, we do not bin or aggregate the dollar-denominated amounts; rather, we use the same tick size the exchange does for each trading pair ($0.01 in all cases except for BTC/USD and BCH/USD on Bitfinex, which use $0.10 tick sizes). Since we simultaneously collect data for all trading pairs over the same interval, the number of samples in each case is the same (432,000), but we present percentages for readability. In many cases the plots become unwieldy because they are dominated by the first two tick size levels and have very long tails. In each plot, we have therefore removed the first two tick sizes and only plot spreads up to $10; the omitted values are instead shown in the accompanying tables below.
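A minimal Python sketch of how such a spread distribution might be tabulated is shown below. The function name `spread_distribution` is hypothetical, and the sketch assumes each sample is a (bid, ask) pair of price strings so that spreads can be computed exactly at the exchange's tick size.

```python
from collections import Counter
from decimal import Decimal


def spread_distribution(samples):
    """Map each observed spread (ask - bid) to its share of samples in percent."""
    counts = Counter(Decimal(ask) - Decimal(bid) for bid, ask in samples)
    total = sum(counts.values())
    return {spread: 100.0 * count / total for spread, count in counts.items()}


def share_above(distribution, threshold):
    """Percentage of samples with a spread strictly greater than `threshold`."""
    return sum(pct for spread, pct in distribution.items() if spread > threshold)
```

Here `share_above(distribution, 10)` would correspond to the "> $10" row of the accompanying tables.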

We begin with the bid/ask spreads for the BTC/USD market on GDAX. Here we observe

tight spreads with over 94% of samples exhibiting the minimum $0.01 spread. We observe

generally decaying frequencies of wider spreads, but we do note spikes at $1.01, $3.01, and

$5.01 suggesting trading behavior influenced by psychological factors at round-number

thresholds, analogous to behavior described in public markets for other financial assets

[5, 58].

Bid/Ask Spread Samples

$0.01 94.14%

$0.02 0.41%

> $10 0.08%

Table 4.2: GDAX BTC Bid/Ask Spread Data


Truncated from Figure 4-1

Figure 4-1: GDAX BTC Bid/Ask Spread


See Table 4.2 for truncated data

For ETH we are slightly less likely to see the minimum $0.01 spreads, but we observe a

much smoother and faster drop-off and shorter tail. We never observe a spread greater

than $3.73. It is surprising that these distributions look so different given that they are

the two most active markets on GDAX and involve the two most widely traded cryptocurrencies.

Bid/Ask Spread Samples

$0.01 72.71%

$0.02 1.79%

> $10 0.00%

Table 4.3: GDAX ETH Bid/Ask Spread Data


Truncated from Figure 4-2

Figure 4-2: GDAX ETH Bid/Ask Spread


See Table 4.3 for truncated data

For the LTC/USD trading pair we observe a more compressed distribution with no samples exhibiting spreads greater than $1.25.

Bid/Ask Spread Samples

$0.01 80.39%

$0.02 2.60%

> $10 0.00%

Table 4.4: GDAX LTC Bid/Ask Spread Data


Truncated from Figure 4-3

Figure 4-3: GDAX LTC Bid/Ask Spread


See Table 4.4 for truncated data

The data for BCH shows far fewer samples with the minimum $0.01 spread, though generally a pattern similar to BTC with a slow drop-off, long tail, and spike at $1.01.

Bid/Ask Spread Samples

$0.01 43.27%

$0.02 1.80%

> $10 0.01%

Table 4.5: GDAX BCH Bid/Ask Spread Data


Truncated from Figure 4-4

Figure 4-4: GDAX BCH Bid/Ask Spread


See Table 4.5 for truncated data

On Bitfinex, BTC trades in $0.10 ticks. Despite Bitfinex charging higher fees for matched maker orders for traders with less than $7,500,000 in 30-day trading volume, we still observe the minimum $0.10 spread for over 86% of samples. We also see a much smoother distribution of spreads than we do for GDAX and do not observe the round-number spikes.

Bid/Ask Spread Samples

$0.10 86.23%

$0.20 1.89%

> $10 0.02%

Table 4.6: Bitfinex BTC Bid/Ask Spread Data


Truncated from Figure 4-5

Figure 4-5: Bitfinex BTC Bid/Ask Spread


See Table 4.6 for truncated data

For ETH/USD the distribution of bid/ask spreads on Bitfinex is largely similar to that on

GDAX with a smooth drop-off and shorter tail.

Bid/Ask Spread Samples

$0.01 75.71%

$0.02 3.04%

> $10 0.00%

Table 4.7: Bitfinex ETH Bid/Ask Spread Data


Truncated from Figure 4-6

Figure 4-6: Bitfinex ETH Bid/Ask Spread


See Table 4.7 for truncated data

For LTC, the plotted distributions appear similar between Bitfinex and GDAX in that

they are compressed, but there is a significant difference in that Bitfinex exhibits $0.01

spreads in just over 40% of samples compared to over 80% on GDAX. We also observe no

spread greater than $0.86.

Bid/Ask Spread Samples

$0.01 42.28%

$0.02 7.14%

> $10 0.00%

Table 4.8: Bitfinex LTC Bid/Ask Spread Data


Truncated from Figure 4-7

Figure 4-7: Bitfinex LTC Bid/Ask Spread


See Table 4.8 for truncated data

Bitfinex also uses a $0.10 minimum tick size for BCH, and here we find a different pattern

with fewer than 30% of samples exhibiting $0.10 spreads and more than 3% of samples

exhibiting spreads of each of $0.20, $0.30, $0.40, $0.50, $0.60, $0.70, $0.80, $0.90, $1.00,

and $1.10. The frequency of spreads decreases with sizes greater than $1, though much

more slowly than we see in the other distributions. It is not clear what causes these differences, as they are not repeated in other cases. The BCH/USD market on Bitfinex is

in between ETH/USD and LTC/USD volume-wise, so that does not seem to be a factor.

The fee structure is identical to other trading pairs, and the tick size is the same as for

BTC/USD.

Bid/Ask Spread Samples

$0.10 27.77%

$0.20 4.36%

> $10 0.00%

Table 4.9: Bitfinex BCH Bid/Ask Spread Data


Truncated from Figure 4-8

Figure 4-8: Bitfinex BCH Bid/Ask Spread


See Table 4.9 for truncated data

The distribution of bid/ask spreads for BTC/USD on Bitstamp is dramatically different.

Only 3.63% of samples exhibit the minimum $0.01 spread, and there is no other spread

level that is observed in more than 1% of samples. After $0.01 and $0.02, the third most

observed bid/ask spread is $4.98. We observe spikes around each whole-dollar level similar

to those seen on GDAX, but these are even more pronounced. Furthermore, over 14% of

samples exhibit spreads greater than $10. It is unclear what causes such a dramatic varia-

tion.

Bid/Ask Spread Samples

$0.01 3.22%

$0.02 0.40%

> $10 14.49%

Table 4.10: Bitstamp BTC Bid/Ask Spread Data


Truncated from Figure 4-9

Figure 4-9: Bitstamp BTC Bid/Ask Spread
See Table 4.10 for truncated data

Examining the ETH/USD data for Bitstamp also reveals a different pattern. Fewer than

3% of samples exhibit the minimum $0.01 spread, and the second-most frequent spread observed is $0.97. Following this, the third, fourth, and fifth most frequent observed spreads

are $0.99, $1.00, and $0.98, respectively.

Bid/Ask Spread Samples

$0.01 3.63%

$0.02 0.59%

> $10 0.00%

Table 4.11: Bitstamp ETH Bid/Ask Spread Data


Truncated from Figure 4-10

Figure 4-10: Bitstamp ETH Bid/Ask Spread
See Table 4.11 for truncated data

When looking at LTC, we see that the lowest possible spread is not the spread that is observed most frequently. Rather, we observe a distribution centered around $0.30. The only

mark of similarity to the other exchanges is that there are no samples which show a spread

of greater than $1.28.

Bid/Ask Spread Samples

$0.01 2.24%

$0.02 0.55%

> $10 0.00%

Table 4.12: Bitstamp LTC Bid/Ask Spread Data


Truncated from Figure 4-11

Figure 4-11: Bitstamp LTC Bid/Ask Spread
See Table 4.12 for truncated data

For BCH we observe $0.01 spreads for 3.11% of samples. The second most frequently observed spread is $0.99 followed by $1.99 and $2.97. This appears most similar to BTC on Bitstamp where we observe a wide distribution and spikes around whole-dollar spread levels.

Bid/Ask Spread Samples

$0.01 3.11%

$0.02 0.40%

> $10 0.22%

Table 4.13: Bitstamp BCH Bid/Ask Spread Data


Truncated from Figure 4-12

Figure 4-12: Bitstamp BCH Bid/Ask Spread
See Table 4.13 for truncated data

Across all trading pairs, Bitstamp sees far fewer instances of bid/ask spreads at the minimum tick size. Though a contributing factor may be that Bitstamp sees lower trading activity than the other exchanges, it appears more likely that this is related to their fee structure, which always levies a charge on both makers and takers. This is supported by the fact that the pattern holds for BTC/USD even though Bitstamp sees more BTC/USD trading volume than GDAX, which has $0.01-wide spreads in 94.14% of samples.

It is not clear why we observe spikes at whole-number dollar amount spreads for some trading pairs (BTC and BCH) on some exchanges (GDAX and Bitstamp) and not others. On GDAX and Bitstamp, BCH is close in trading volume to LTC, and these volumes are much lower than that of BTC on either exchange, so trading volume does not appear to be a factor. It is possible that these two exchanges see more activity from traders who exhibit psychological tendencies toward trades at these anchoring points. Still, the phenomenon does not appear to be exhibited for ETH or LTC, so we would also have to further assume that market participants with these characteristics are also more drawn to BTC and BCH in particular. A draw to BTC is plausible given its position as the most widely traded and well-known cryptocurrency. It is less obvious why this would also be true for BCH, though it is possible that confusion between BTC and BCH among newcomers to cryptocurrency and retail investors plays a nontrivial role. It is also true that when BCH was created by forking BTC on August 1, 2017, everyone with BTC at the time of the fork then also had an equivalent number of units of BCH. It is therefore possible that traders exhibiting this behavior trade BCH as well as BTC as a result of being "given" BCH at the time of the fork.

Chapter 5

Arbitrage

We are interested in determining if there are arbitrage opportunities across exchanges,

their characteristics, and how they vary across trading pairs. We therefore seek to use our

collected data to build models to demonstrate evidence of relationships between bid/ask

spreads indicative of arbitrage opportunity and trading activity.

5.1 Frequency of arbitrage opportunities

We first examine the 5-second-sampled bid/ask data for each currency pair for each exchange pair and compute the percentage of 5-second intervals that begin with arbitrage possible net of the highest possible exchange fees. For a cryptocurrency $c$ and pair of exchanges $(i, j)$ with bids at time $t$ (seconds after the start of the sample), $b^c_{it}$ and $b^c_{jt}$, asks $a^c_{it}$ and $a^c_{jt}$, and maximum taker fees $f_i$ and $f_j$, we compute this fraction of intervals as

\[ \frac{5}{n} \sum_{t=0}^{\frac{n}{5}-1} R^c_{i,j}(5t) \]

\[ R^c_{i,j}(t) = \begin{cases} 1 & \text{if } b^c_{it}(1 - f_i) > a^c_{jt}(1 + f_j) \text{ or } b^c_{jt}(1 - f_j) > a^c_{it}(1 + f_i) \\ 0 & \text{otherwise} \end{cases} \]
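The indicator and the resulting fraction can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used to produce the results: the function names are hypothetical, quotes are assumed to be equal-length lists of (bid, ask) pairs sampled once per second, and fees are expressed as fractions (for example, 0.0025 for a hypothetical 0.25% taker fee).

```python
def arbitrage_possible(bid_i, ask_i, fee_i, bid_j, ask_j, fee_j):
    """True if buying on one exchange and selling on the other profits net of fees."""
    sell_i_buy_j = bid_i * (1 - fee_i) > ask_j * (1 + fee_j)
    sell_j_buy_i = bid_j * (1 - fee_j) > ask_i * (1 + fee_i)
    return sell_i_buy_j or sell_j_buy_i


def arbitrage_fraction(quotes_i, quotes_j, fee_i, fee_j, stride=5):
    """Fraction of every `stride`-th 1-second sample at which arbitrage is possible.

    Assumes at least one sample is present.
    """
    picked = range(0, len(quotes_i), stride)
    hits = sum(
        arbitrage_possible(*quotes_i[t], fee_i, *quotes_j[t], fee_j)
        for t in picked
    )
    return hits / len(picked)
```

Checking both directions in `arbitrage_possible` mirrors the "or" in the condition: the opportunity counts regardless of which exchange is the more expensive one.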

           Bitfinex/GDAX   Bitfinex/Bitstamp   GDAX/Bitstamp

BTC/USD    27.63%          22.04%              0.03%

ETH/USD    40.38%          25.93%              0.23%

BCH/USD    32.76%          19.50%              0.35%

LTC/USD    39.03%          22.15%              1.03%

Table 5.1: Percentage of 5-second intervals with arbitrageable price gap

The percentages for the exchange pairs involving Bitfinex are so high as to arouse suspicion that the bid/ask data may be inaccurate. To check this, we compare the bid/ask data from Bitfinex to the trades data collected separately, and we find that the prices at which trades are executed are consistent with the bid/ask data recorded.

The stark difference in pricing on Bitfinex compared to the other two exchanges suggests additional friction in using this exchange compared to the others, despite it being the highest-volume exchange of the three and offering a competitive fee structure. This would be consistent with additional perceived risk based on reports of suspected fraudulent behavior on their part [40]. There are numerous reports of being unable to withdraw fiat currencies, particularly US Dollars, from Bitfinex [23]. There are also reports of deposits being unreliable and cryptocurrency withdrawals taking unpredictable amounts of time [64, 60, 2, 4]. Additionally, most recently, there is convincing research suggesting that Bitfinex is involved in cryptocurrency price manipulation [22]. All of these would explain why arbitrageurs would be less willing to engage with Bitfinex and why there would be such persistent price differences.

Continuing, we see that the frequencies for the currency pairs between GDAX and Bitstamp do appear consistent with arbitrage activity. At the outset of this work, we had thought arbitrage activity might vary by cryptocurrency and be related to the transaction time and transaction cost associated with each cryptocurrency. Higher transaction times or higher transaction costs might add additional friction and risk, discouraging arbitrage traders. This, however, appears not to be the case. Bitcoin, with both the highest transaction time and highest transaction cost, has the fewest incidences of arbitrageable price differences and the lowest mean arbitrage interval time. The relationship, rather, appears to be between arbitrage opportunity and aggregate trading volume. Both GDAX and Bitstamp have the same relative ordering of these currency pairs by average daily volume: BTC/USD, ETH/USD, BCH/USD, and LTC/USD. This is also the same as the ordering of cryptocurrencies ranked by market cap.

5.2 Arbitrage window length

Based on the above findings, we then further examine arbitrage window length. Using 1-second samples of the bid/ask data and discarding arbitrage windows lasting fewer than 3 seconds, we compute the mean window length per currency pair considering only the GDAX and Bitstamp data.

Length (seconds)

BTC/USD 21.50

ETH/USD 22.86

BCH/USD 23.25

LTC/USD 31.12

Table 5.2: Mean arbitrage window length by trading pair
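The window-length computation can be sketched in Python as follows. The function name `mean_window_length` is hypothetical, and the sketch assumes the input is a sequence of booleans, one per 1-second sample, indicating whether the arbitrage condition held at that sample.

```python
from itertools import groupby


def mean_window_length(flags, min_length=3):
    """Mean length, in samples, of runs of True lasting at least `min_length`.

    Returns None if no run is long enough.
    """
    # groupby yields consecutive runs of equal values; keep the True runs.
    runs = [sum(1 for _ in group) for is_open, group in groupby(flags) if is_open]
    kept = [length for length in runs if length >= min_length]
    return sum(kept) / len(kept) if kept else None
```

With 1-second samples, a run of k consecutive True values is treated as a window of k seconds.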

This same ordering holds true for both frequency of arbitrage opportunity and mean arbitrage window length. These results suggest that the more valuable or more actively traded

a cryptocurrency is, the more arbitrage activity it draws, and these factors overshadow

other technical properties of the cryptocurrency.

5.3 Predicting direction of net volume based on price differences

Given the bid/ask and trades data assembled across the exchanges, we attempt to find

evidence of arbitrage activity by examining the bid and ask prices on each exchange at

each time t and the net volume during the subsequent time interval.

Hypothesis 1: For a given cryptocurrency, if the bid on the first exchange exceeds the ask

on the second exchange, then we expect to see net sell on the first exchange and net buy on

the second exchange in the following time interval.

If there were active arbitrageurs participating in the market, we would expect that, if, at

time t, the arbitrage condition were true, specifically that the bid on one exchange ex-

ceeded the ask on another, we would see negative net volume on the first exchange and

positive net volume on the second. We construct several logistic regression models of the form

\[ \Pr(y = 1) = \frac{1}{1 + e^{-(\beta x + \alpha)}} \]

to test this. In order to align our trades data with our sampled bid/ask data, we must aggregate trades by second. We define a match, \(r\), to have price \(p_r\), size \(s_r\), timestamp \(\tau_r\), and side \(u_r\) (all obtained from the exchange), where

\[ u_r = \begin{cases} 1 & \text{if buyer is taker} \\ -1 & \text{if seller is taker} \end{cases} \]

Accordingly, we define the net volume at time \(t\) as

\[ V_t = \sum_{r \,\mid\, t \le \tau_r < t+1} p_r s_r u_r \]
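The per-second aggregation of signed volume can be sketched as below. The match object shape (`price`, `size`, `time`, `takerSide`) is an assumption made for illustration; each exchange reports these fields under its own names and conventions, so a real pipeline would first normalize them.

```javascript
// Hypothetical match shape: { price, size, time, takerSide }, where time is
// a Unix timestamp in seconds and takerSide is 'buy' or 'sell'.
// Returns a Map from second t to the net volume V_t.
function netVolumeBySecond(matches) {
  const volumes = new Map();
  for (const m of matches) {
    const t = Math.floor(m.time);             // bucket so that t <= time < t + 1
    const u = m.takerSide === 'buy' ? 1 : -1; // side: +1 buyer is taker, -1 seller is taker
    volumes.set(t, (volumes.get(t) || 0) + m.price * m.size * u);
  }
  return volumes;
}
```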

In the first case, we construct one model for each cryptocurrency and each pair-wise permutation of exchanges. We once again say that exchanges \(i\) and \(j\) at time \(t\) have bids \(b_{cit}\) and \(b_{cjt}\) and asks \(a_{cit}\) and \(a_{cjt}\) for cryptocurrency \(c\). We are interested in varying the time interval length beyond the 1-second samples we have already defined, so, given that we want to aggregate \(n\) samples into \(m\) intervals of width \(w = \frac{n}{m}\), we define our \(x\) values to be

\[ x_k = \max(b_{cikw} - a_{cjkw},\ 0) \]

as a measure of the magnitude of the arbitrage opportunity and \(y\) values to be

\[ y_k = \begin{cases} 1 & \text{if } \sum_{\ell=kw}^{(k+1)w-1} V_{ci\ell} < 0 \text{ and } \sum_{\ell=kw}^{(k+1)w-1} V_{cj\ell} > 0 \\ 0 & \text{otherwise} \end{cases} \]

as an indicator of whether trading activity reflected arbitrage exploitation. We use a minimum interval period of 5 seconds to account for the varying delays in data streams from the exchanges. This results in 24 models (6 permutations of 3 exchanges and 4 cryptocurrencies) for each interval length. The expectation here is that the resulting logistic regression curve would be centered around the price delta at which arbitrageurs determined there was sufficient profit incentive to engage. Though we observe positive coefficients and low p-values suggesting a relationship between pricing and subsequent net trading volume, in each of these cases the resulting model predicts 0 for nearly all or all of the range of input data.
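A minimal sketch of how the (x, y) samples for one model might be assembled from per-second data follows. The function name and array-based inputs are assumptions for illustration. The x value is clamped to be non-negative, in line with the non-negative x values reported in the regression tables.

```javascript
// Assumed inputs: per-second arrays of equal length for one cryptocurrency.
//   bidI[t], askJ[t]: best bid on exchange i and best ask on exchange j
//   netI[t], netJ[t]: signed net volume on each exchange during second t
// w is the interval width in seconds.
function buildSamples(bidI, askJ, netI, netJ, w) {
  const samples = [];
  const m = Math.floor(bidI.length / w); // number of complete intervals
  for (let k = 0; k < m; k++) {
    const start = k * w;
    // Magnitude of the opportunity at the start of the interval.
    const x = Math.max(bidI[start] - askJ[start], 0);
    let sumI = 0;
    let sumJ = 0;
    for (let l = start; l < start + w; l++) {
      sumI += netI[l];
      sumJ += netJ[l];
    }
    // y = 1 only when exchange i shows net selling and exchange j net buying.
    samples.push({ x, y: sumI < 0 && sumJ > 0 ? 1 : 0 });
  }
  return samples;
}
```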

We present the regression results below, including the threshold arbitrage price delta at which the model outputs a probability of 0.5, indicating a prediction of arbitrage-directional trading activity. We also present the probability outputs of each model for the 0.1th percentile and 99.9th percentile x values (i.e. price discrepancies). From these we can see that even though each model shows a positive relationship between magnitude of arbitrage opportunity and arbitrage-directional trading activity, no model reaches the threshold prediction level of 0.5 even on the 99.9th percentile sample.
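For a model of the logistic form used here, the threshold is the x at which βx + α = 0, i.e. x = −α/β. The sketch below shows how the threshold and prediction columns follow from the fitted constant and coefficient, using the BTC GDAX/Bitstamp row of Table 5.3 as input. The helper names are illustrative, and because the published coefficients are rounded, the outputs agree with the table only to about three decimal places.

```javascript
// Logistic model: Pr(y = 1) = 1 / (1 + exp(-(beta * x + alpha))).
const sigmoid = (z) => 1 / (1 + Math.exp(-z));
const predict = (alpha, beta, x) => sigmoid(beta * x + alpha);

// The model outputs 0.5 exactly where beta * x + alpha = 0.
const threshold = (alpha, beta) => -alpha / beta;

// Coefficients copied from the BTC GDAX/Bitstamp row of Table 5.3.
const alpha = -2.3051;
const beta = 0.0491;

console.log(threshold(alpha, beta)); // close to the reported $46.95
console.log(predict(alpha, beta, 0)); // close to the reported 0.0907
console.log(predict(alpha, beta, 29.05)); // close to the reported 0.2933
```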

Even if we are less confident in the Bitfinex data and only look at the Bitstamp/GDAX

results, it appears that using this method we are unable to build models that successfully

predict arbitrage trading activity.

We attempt this using interval periods of 5, 10, 20, and 30 seconds, exploring interval peri-

ods of up to 30 seconds due to the infrequency of trades of some cryptocurrencies on some

exchanges. We only include the results from the 5-second interval models below as they

are the most predictive—using wider interval lengths results in noisier outputs and even

less conclusive results.

Pair   const   coef   coef p-value   Threshold   0.1th %ile prediction (x)   99.9th %ile prediction (x)   Samples

BTC GDAX/Bitfinex -2.1543 0.0406 0.003 $53.06 0.1039 ($0.00) 0.1427 ($8.91) 86,400

BTC Bitfinex/GDAX -2.3124 0.0027 0.000 $856.44 0.0901 ($0.00) 0.1198 ($117.70) 86,400

BTC GDAX/Bitstamp -2.3051 0.0491 0.000 $46.95 0.0907 ($0.00) 0.2933 ($29.05) 86,400

BTC Bitstamp/GDAX -2.4290 0.0136 0.000 $178.60 0.0810 ($0.00) 0.1419 ($46.26) 86,400

BTC Bitfinex/Bitstamp -2.0960 0.0050 0.000 $419.20 0.1095 ($0.00) 0.1777 ($111.83) 86,400

BTC Bitstamp/Bitfinex -1.9963 0.2205 0.000 $9.05 0.1196 ($0.00) 0.3453 ($6.15) 86,400

ETH GDAX/Bitfinex -2.1965 0.3173 0.000 $6.92 0.1001 ($0.00) 0.1729 ($1.99) 86,400

ETH Bitfinex/GDAX -2.3670 0.0298 0.000 $79.43 0.0857 ($0.00) 0.1161 ($11.32) 86,400

ETH GDAX/Bitstamp -3.2522 0.8102 0.000 $4.01 0.0372 ($0.00) 0.2722 ($2.80) 86,400

ETH Bitstamp/GDAX -3.5825 0.3311 0.000 $10.82 0.0271 ($0.00) 0.1271 ($5.00) 86,400

ETH Bitfinex/Bitstamp -3.2871 0.1096 0.000 $29.99 0.0360 ($0.00) 0.1055 ($10.49) 86,400

ETH Bitstamp/Bitfinex -3.1290 0.7055 0.000 $4.44 0.0419 ($0.00) 0.1370 ($1.83) 86,400

LTC GDAX/Bitfinex -3.0095 1.5110 0.000 $1.99 0.0470 ($0.00) 0.0887 ($0.45) 86,400

LTC Bitfinex/GDAX -3.1883 0.5254 0.000 $6.07 0.0396 ($0.00) 0.1425 ($2.65) 86,400

LTC GDAX/Bitstamp -4.1180 2.4607 0.000 $1.67 0.0160 ($0.00) 0.0977 ($0.77) 86,400

LTC Bitstamp/GDAX -4.1422 1.5386 0.000 $2.69 0.0156 ($0.00) 0.0967 ($1.24) 86,400

LTC Bitfinex/Bitstamp -4.6585 1.4725 0.000 $3.18 0.0094 ($0.00) 0.1224 ($1.83) 86,400

LTC Bitstamp/Bitfinex -4.1281 12.7227 0.000 $0.32 0.0159 ($0.00) 0.1229 ($0.17) 86,400

BCH GDAX/Bitfinex -2.8973 0.1866 0.000 $15.53 0.0523 ($0.00) 0.1167 ($4.68) 86,400

BCH Bitfinex/GDAX -3.2958 0.0670 0.000 $49.19 0.0357 ($0.00) 0.1554 ($23.92) 86,400

BCH GDAX/Bitstamp -4.3352 0.2951 0.000 $14.69 0.0129 ($0.00) 0.1560 ($8.97) 86,400

BCH Bitstamp/GDAX -4.5634 0.2611 0.000 $17.48 0.0103 ($0.00) 0.1243 ($10.00) 86,400

BCH Bitfinex/Bitstamp -4.7983 0.1489 0.000 $32.23 0.0082 ($0.00) 0.1769 ($21.90) 86,400

BCH Bitstamp/Bitfinex -4.1415 0.6965 0.000 $5.95 0.1156 ($0.00) 0.2411 ($4.30) 86,400

Table 5.3: Regression results using 5-second intervals

Next, noting that at a given time \(t\) it is impossible for an arbitrage opportunity to exist in both directions for a given pair of exchanges, the construction is simplified by reducing exchange permutations to combinations. Rather than keeping exchanges \(i\) and \(j\) fixed for the training of a model, we instead consider an unordered pair of exchanges and, at each time \(t\), assign exchanges \(i\) and \(j\) such that \(b_{cit} - a_{cjt} > b_{cjt} - a_{cit}\). The case in which these quantities are equal is irrelevant, as it implies that no arbitrage opportunity exists. We then continue as before and define \(x\) and \(y\) at each time \(t\) as follows:

\[ x_k = \max(b_{cikw} - a_{cjkw},\ 0) \]

\[ y_k = \begin{cases} 1 & \text{if } \sum_{\ell=kw}^{(k+1)w-1} V_{ci\ell} < 0 \text{ and } \sum_{\ell=kw}^{(k+1)w-1} V_{cj\ell} > 0 \\ 0 & \text{otherwise} \end{cases} \]

This simplification can lose resolution if there are real differences in the ability or cost to engage in arbitrage in one direction compared to the other. Such differences may exist, for example, if one exchange makes depositing USD more expensive or time consuming (or even impossible) compared to depositing cryptocurrencies (or vice versa). This construction results in 12 models, but the outcomes are similar. Each model predicts 0 for nearly the entire range of input data. However, we do observe wider separations between the 0.1th percentile and 99.9th percentile predictions for the GDAX/Bitstamp models than we do for the others.

Pair   const   coef   coef p-value   Threshold   0.1th %ile prediction (x)   99.9th %ile prediction (x)   Samples

BTC GDAX/Bitfinex -2.2700 0.0019 0.000 $1194.74 0.0936 ($0.00) 0.1147 ($117.70) 86,400

BTC GDAX/Bitstamp -2.8166 0.0408 0.000 $69.03 0.0564 ($0.00) 0.2891 ($46.94) 86,400

BTC Bitfinex/Bitstamp -2.1515 0.0063 0.000 $341.51 0.1042 ($0.00) 0.1904 ($111.83) 86,400

ETH GDAX/Bitfinex -2.3849 0.0337 0.000 $70.77 0.0843 ($0.00) 0.1188 ($11.32) 86,400

ETH GDAX/Bitstamp -3.9157 0.5264 0.000 $7.44 0.0195 ($0.00) 0.2169 ($5.00) 86,400

ETH Bitfinex/Bitstamp -3.4033 0.1374 0.000 $24.77 0.0322 ($0.00) 0.1232 ($10.49) 86,400

LTC GDAX/Bitfinex -3.2747 0.5969 0.000 $5.49 0.0364 ($0.00) 0.1555 ($2.65) 86,400

LTC GDAX/Bitstamp -4.8525 2.7746 0.000 $1.75 0.0077 ($0.00) 0.1959 ($1.24) 86,400

LTC Bitfinex/Bitstamp -4.9412 1.7496 0.000 $2.82 0.0071 ($0.00) 0.1485 ($1.83) 86,400

BCH GDAX/Bitfinex -3.4781 0.0840 0.000 $41.41 0.0299 ($0.00) 0.1872 ($23.92) 86,400

BCH GDAX/Bitstamp -5.1460 0.3898 0.000 $13.20 0.0058 ($0.00) 0.2486 ($10.36) 86,400

BCH Bitfinex/Bitstamp -4.8133 0.1530 0.000 $31.46 0.0081 ($0.00) 0.1881 ($21.90) 86,400

Table 5.4: Regression results using 5-second intervals, indifferent to arbitrage direction

The next consideration is that, by providing every time interval, many of which have no potential for arbitrage and thus arbitrage magnitudes (i.e. x values) of 0, we may be biasing the models too strongly towards outputs of 0. Accordingly, we attempt to train models excluding intervals with x values of 0. This, however, yields the same result of nearly no predictions of arbitrage activity and no model predicting arbitrage trading activity for the 99.9th percentile arbitrage differential. Compared to the last construction, we see smaller separations between predictions for the 0.1th percentile and 99.9th percentile samples, even for GDAX/Bitstamp comparisons.

Pair   const   coef   coef p-value   Threshold   0.1th %ile prediction (x)   99.9th %ile prediction (x)   Samples

BTC GDAX/Bitfinex -2.2545 0.0016 0.000 $1409.06 0.0950 ($0.10) 0.1127 ($117.80) 85,973

BTC GDAX/Bitstamp -2.3788 0.0176 0.000 $135.16 0.0130 ($0.01) 0.1646 ($48.27) 68,118

BTC Bitfinex/Bitstamp -1.9962 0.0029 0.000 $688.35 0.0102 ($0.04) 0.1214 ($112.21) 81,396

ETH GDAX/Bitfinex -2.3193 0.0222 0.000 $104.47 0.0896 ($0.01) 0.1123 ($11.35) 84,466

ETH GDAX/Bitstamp -3.2916 0.2596 0.000 $12.68 0.0359 ($0.01) 0.1229 ($5.11) 57,601

ETH Bitfinex/Bitstamp -3.1734 0.0893 0.000 $35.54 0.0402 ($0.01) 0.0982 ($10.71) 77,442

LTC GDAX/Bitfinex -3.1483 0.4997 0.000 $6.30 0.0414 ($0.01) 0.1402 ($2.67) 81,470

LTC GDAX/Bitstamp -4.1487 1.6875 0.000 $2.46 0.0158 ($0.01) 0.1169 ($1.85) 49,457

LTC Bitfinex/Bitstamp -4.5924 1.4126 0.000 $3.25 0.0102 ($0.01) 0.1214 ($1.26) 68,881

BCH GDAX/Bitfinex -3.2473 0.0637 0.000 $50.98 0.0374 ($0.01) 0.1519 ($23.97) 77,768

BCH GDAX/Bitstamp -4.3330 0.2462 0.000 $17.60 0.0130 ($0.01) 0.1646 ($11.00) 43,562

BCH Bitfinex/Bitstamp -4.5068 0.1260 0.000 $35.77 0.0109 ($0.01) 0.1501 ($22.00) 68,959

Table 5.5: Regression results using 5-second intervals, excluding samples without arbitrage pricing

Taking the idea further, we note that even if we remove samples with x values of 0, we still

have many data points that do not represent arbitrage opportunities due to exchange fees.

All three exchanges charge fees for taker orders, even for traders exceeding their highest

30-day volume thresholds. We therefore again build 12 models, this time excluding data

points in which the arbitrage price difference is assuredly not large enough to be prof-

itable.

Here, many of the results are once again inconclusive, but we note that the model for BTC on GDAX/Bitstamp does predict arbitrage-directional trading for more than 0.1% of samples. Given the exclusionary criteria, the total number of samples considered has been substantially reduced, and in total we see predictions of 0.5 or greater for only 11 samples. The threshold arbitrage gap implied by the model is $55.87.

Pair   const   coef   coef p-value   Threshold   0.1th %ile prediction (x)   99.9th %ile prediction (x)   Samples

BTC GDAX/Bitfinex -2.1621 0.0001 0.861 $21,621 0.1032 ($0.04) 0.1043 ($100.11) 57,143

BTC GDAX/Bitstamp -2.2126 0.0396 0.000 $55.87 0.0986 ($0.00) 0.5654 ($62.54) 7532

BTC Bitfinex/Bitstamp -1.9447 0.0029 0.000 $670.59 0.1251 ($0.03) 0.1599 ($97.77) 43,384

ETH GDAX/Bitfinex -2.3734 0.0421 0.000 $56.38 0.0852 ($0.00) 0.1242 ($9.99) 60,318

ETH GDAX/Bitstamp -2.8681 0.2854 0.000 $10.05 0.0538 ($0.00) 0.1533 ($4.06) 13,811

ETH Bitfinex/Bitstamp -3.1823 0.1314 0.000 $24.22 0.0398 ($0.00) 0.1258 ($9.46) 47,970

LTC GDAX/Bitfinex -2.9469 0.4665 0.000 $6.32 0.0499 ($0.00) 0.1425 ($2.47) 53,439

LTC GDAX/Bitstamp -3.6090 1.7994 0.000 $2.01 0.0264 ($0.00) 0.1633 ($1.10) 13,239

LTC Bitfinex/Bitstamp -4.3194 1.7054 0.000 $2.53 0.0131 ($0.00) 0.1576 ($1.55) 45,252

BCH GDAX/Bitfinex -3.1848 0.0808 0.000 $39.42 0.0397 ($0.02) 0.1814 ($20.75) 56,852

BCH GDAX/Bitstamp -3.6202 0.2761 0.000 $13.11 0.0261 ($0.00) 0.2710 ($9.53) 10,935

BCH Bitfinex/Bitstamp -4.1359 0.1331 0.000 $31.07 0.0109 ($0.00) 0.1501 ($19.05) 41,157

Table 5.6: Regression results using 5-second intervals, excluding samples without arbitrage pricing net of the minimum exchange fees

We take this approach further and attempt a construction including only data points for which arbitrage opportunities exist net of the highest possible exchange fees. That is to say, if exchanges \(i\) and \(j\) have maximum taker fees \(f_i\) and \(f_j\), we compute \(x\) values as

\[ x_k = b_{cikw}(1 - f_i) - a_{cjkw}(1 + f_j) \]

and exclude intervals in which \(x_k \le 0\).
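The fee-adjusted gap can be illustrated with a short sketch. The function name is hypothetical, and the fee rates in the usage note are placeholders chosen for the arithmetic, not the exchanges' actual fee schedules.

```javascript
// feeI and feeJ are taker fee rates expressed as fractions
// (for example 0.003 for 0.30%). Selling at the bid on exchange i yields
// bidI * (1 - feeI); buying at the ask on exchange j costs askJ * (1 + feeJ).
function feeAdjustedGap(bidI, askJ, feeI, feeJ) {
  return bidI * (1 - feeI) - askJ * (1 + feeJ);
}

// An interval is kept for training only when the gap is strictly positive.
function isProfitableNetOfFees(bidI, askJ, feeI, feeJ) {
  return feeAdjustedGap(bidI, askJ, feeI, feeJ) > 0;
}
```

With placeholder fees of 0.30% and 0.25%, a bid of 10,000 against an ask of 9,900 leaves a gap of about 45.25 and is kept, while the same bid against an ask of 9,980 gives a negative gap and is excluded even though the raw prices cross.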

We observe that predictions for 99.9th percentile arbitrage opportunities exceed 0.5 for each cryptocurrency between Bitstamp and GDAX. However, since such large arbitrage opportunities are rarer between Bitstamp and GDAX, the sample sizes have become much smaller and we no longer see results that are statistically significant at the 5% level for BTC or ETH.

For LTC and BCH we observe thresholds of $0.69 and $12.28, respectively. However, even

though the 99.9th percentile prediction exceeds 0.5, this is a very low bar. For LTC we

observe only 10 samples for which predictions exceed 0.5, and for BCH, we observe only 2.

Pair   const   coef   coef p-value   Threshold   0.1th %ile prediction (x)   99.9th %ile prediction (x)   Samples

BTC GDAX/Bitfinex -2.1894 0.0012 0.424 $1824.50 0.1007 ($0.03) 0.1089 ($75.61) 23,876

BTC GDAX/Bitstamp 0.1402 0.0302 0.377 $4.64 0.5426 ($1.02) 0.8224 ($98.30) 27

BTC Bitfinex/Bitstamp -2.2208 0.0178 0.000 $124.76 0.0979 ($0.03) 0.3846 ($46.12) 19,049

ETH GDAX/Bitfinex -2.3045 0.0530 0.000 $43.48 0.0908 ($0.00) 0.1329 ($8.09) 34,892

ETH GDAX/Bitstamp -1.5970 0.4925 0.145 $3.24 0.1692 ($0.00) 0.5104 ($3.32) 203

ETH Bitfinex/Bitstamp -3.0596 0.1782 0.000 $17.17 0.0448 ($0.00) 0.1616 ($7.93) 22,404

LTC GDAX/Bitfinex -2.8832 0.7217 0.000 $4.00 0.0530 ($0.00) 0.1911 ($2.00) 33,726

LTC GDAX/Bitstamp -2.9163 4.2313 0.000 $0.69 0.0514 ($0.00) 0.6804 ($0.87) 886

LTC Bitfinex/Bitstamp -3.8906 2.2953 0.000 $1.70 0.0200 ($0.00) 0.2167 ($1.14) 19,136

BCH GDAX/Bitfinex -2.7721 0.0814 0.000 $34.06 0.0588 ($0.00) 0.1901 ($16.24) 28,304

BCH GDAX/Bitstamp -1.7542 0.1428 0.038 $12.28 0.1475 ($0.00) 0.7982 ($22.31) 306

BCH Bitfinex/Bitstamp -3.5254 0.1272 0.000 $27.72 0.0286 ($0.00) 0.1730 ($15.41) 16,852

Table 5.7: Regression results using 5-second intervals, excluding samples without arbitrage pricing net of the maximum exchange fees

Even setting aside the Bitfinex data, we are unable to confidently identify meaningful relationships between trading activity and prices suggesting arbitrage opportunity. It may be the case that our attempts to look at net volume are impacted by the relative differences in volume seen on the two exchanges for each currency pair. With the volumes for BCH/USD and LTC/USD so much higher on GDAX than on Bitstamp, it may be that trades to “correct” the price on Bitstamp have little impact on GDAX and therefore do not move net volume in the direction we would expect. Additionally, since these trading pairs also exist on other exchanges, and related pairs (such as BCH/BTC or LTC/BTC) exist even on these exchanges, looking only at volume for these pairs on these exchanges may be too narrow a view.

5.4 Relationship between net volumes on exchanges

The next approach taken is to investigate the relationship between the net volumes on the

highest and lowest price exchanges in the time intervals following the emergence of an ar-

bitrage opportunity.

Hypothesis 2: For a given cryptocurrency, if the bid on the first exchange exceeds the

ask on the second exchange, then we expect a more negative correlation between the signed

volumes on the two exchanges in the following time interval compared to samples that do

not start with arbitrage opportunities.

Instead of limiting the period we examine to just a single time interval following the appearance of an arbitrage opportunity, in this approach we examine the following five periods. For each cryptocurrency, we evaluate the bid/ask data for all three exchanges simultaneously (rather than pairwise as described previously). At each time \(t\), the exchange with the highest bid price is denoted \(i\) and the exchange with the lowest ask price is denoted \(j\). We then collect net volumes \(V_i\) and \(V_j\) for the next 5 time intervals: \(t\) to \(t + w\), \(t + w\) to \(t + 2w\), ..., \(t + 4w\) to \(t + 5w\). For each time interval we compute the correlation between \(V_i\) and \(V_j\).
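The two building blocks of this procedure, summing net volume over a lagged interval and computing a Pearson correlation across samples, can be sketched as follows. Both function names and the per-second array input are assumptions for illustration.

```javascript
// Sum of per-second net volume over the interval
// [t + lag * w, t + (lag + 1) * w), where lag runs from 0 to 4.
function laggedVolume(net, t, w, lag) {
  let total = 0;
  for (let l = t + lag * w; l < t + (lag + 1) * w; l++) {
    total += net[l];
  }
  return total;
}

// Pearson correlation coefficient between two equal-length arrays.
function pearson(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}
```

For each lag, one would collect the pairs of lagged volumes on exchanges i and j over all qualifying start times t and pass the two resulting arrays to `pearson`.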

We expect \(V_i\) and \(V_j\) to be more negatively correlated due to selling on the high-priced exchange and buying on the low-priced exchange, with a return towards non-arbitrage correlation coefficients as time intervals increase, due to the arbitrage opportunity being exploited away and the volume in subsequent periods being dominated by other types of trading activity.

When looking at periods following arbitrage opportunities, for each of the time intervals,

Vi and Vj showed very small positive correlations for all four of the cryptocurrencies. This

is in sharp contrast to the much more positive correlations shown for other periods. We

use Fisher’s r to z transformation and then compute the z test statistic to show the sta-

tistical significance of these differences. Note that while, in most cases, we do see statisti-

cally significant differences between the correlations, we do not observe convergence over

the five subsequent time intervals. This suggests that trading behavior, even over 25 sec-

onds following the appearance of an arbitrage opportunity, does not exhibit evidence of

exploitation.
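The comparison of two correlation coefficients via Fisher's r-to-z transformation can be sketched as below. The standard form of the test is shown; the sign convention (no-arbitrage minus arbitrage) and the sample sizes in the usage note are assumptions, since the sample counts behind each correlation are not listed here.

```javascript
// Fisher's transformation maps r to atanh(r) = 0.5 * ln((1 + r) / (1 - r)),
// which is approximately normal with variance 1 / (n - 3).
// Returns the z statistic for the difference between two independent
// correlations: rArb from nArb samples and rNoArb from nNoArb samples.
function fisherZTest(rArb, nArb, rNoArb, nNoArb) {
  const zArb = Math.atanh(rArb);
  const zNoArb = Math.atanh(rNoArb);
  const standardError = Math.sqrt(1 / (nArb - 3) + 1 / (nNoArb - 3));
  return (zNoArb - zArb) / standardError;
}
```

As a purely hypothetical example, correlations of 0.0 and 0.5 each measured on 103 samples give a z statistic of roughly 3.88. Converting z to a two-sided p-value additionally requires the standard normal CDF, which is omitted from this sketch.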

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0425 0.0000 0.6053 0.0000 7.1833 0.0000

2 0.0412 0.0000 0.0735 0.4213 0.3533 0.7239

3 0.0402 0.0000 0.4197 0.0000 4.4379 0.0000

4 0.0395 0.0000 0.1482 0.1034 1.1967 0.2314

5 0.0413 0.0000 0.1308 0.0003 0.9836 0.3253

Table 5.8: Correlations between Vi and Vj for BTC

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0814 0.0000 0.3610 0.0000 8.555 0.0000

2 0.0737 0.0000 0.5887 0.0000 17.3682 0.0000

3 0.0840 0.0000 0.4862 0.0000 12.8960 0.0000

4 0.0776 0.0000 0.4218 0.0000 10.7389 0.0000

5 0.0778 0.0000 0.3561 0.0000 8.4975 0.0000

Table 5.9: Correlations between Vi and Vj for ETH

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0785 0.0000 0.5860 0.0000 31.2305 0.0000

2 0.0908 0.0000 0.6092 0.0000 32.4793 0.0000

3 0.0933 0.0000 0.6788 0.0000 38.6273 0.0000

4 0.1021 0.0000 0.2366 0.0000 7.3067 0.0000

5 0.0947 0.0000 0.1984 0.0000 5.5879 0.0000

Table 5.10: Correlations between Vi and Vj for LTC

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0857 0.0000 0.1120 0.0000 1.6442 0.1001

2 0.0683 0.0000 0.3371 0.0000 17.4821 0.0000

3 0.0765 0.0000 0.7207 0.0000 51.5315 0.0000

4 0.0843 0.0000 0.2237 0.0000 8.8551 0.0000

5 0.0848 0.0000 0.6929 0.0000 47.5732 0.0000

Table 5.11: Correlations between Vi and Vj for BCH

Given our prior experience with the apparent noisiness in the Bitfinex data, we also com-

pute these figures excluding Bitfinex. The results are even more conclusive.

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0425 0.0000 0.8208 0.0000 134.0510 0.0000

2 0.0444 0.0000 0.8727 0.0000 156.0331 0.0000

3 0.0450 0.0000 0.9180 0.0000 183.7935 0.0000

4 0.0433 0.0000 0.8572 0.0000 148.7705 0.0000

5 0.0441 0.0000 0.9020 0.0000 172.7007 0.0000

Table 5.12: Correlations between Vi and Vj for BTC, excluding Bitfinex

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0506 0.0000 0.8875 0.0000 188.3433 0.0000

2 0.0517 0.0000 0.8961 0.0000 194.0145 0.0000

3 0.0543 0.0000 0.8843 0.0000 185.7687 0.0000

4 0.0555 0.0000 0.8765 0.0000 180.7950 0.0000

5 0.0582 0.0000 0.8844 0.0000 185.2902 0.0000

Table 5.13: Correlations between Vi and Vj for ETH, excluding Bitfinex

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0447 0.0000 0.8594 0.0000 181.2290 0.0000

2 0.0395 0.0000 0.8806 0.0000 194.6938 0.0000

3 0.0442 0.0000 0.8801 0.0000 193.6860 0.0000

4 0.0445 0.0000 0.9025 0.0000 209.5394 0.0000

5 0.0429 0.0000 0.9149 0.0000 220.1347 0.0000

Table 5.14: Correlations between Vi and Vj for LTC, excluding Bitfinex

Arbitrage No Arbitrage Comparison

interval r p-value r p-value z p-value

1 0.0384 0.0000 0.8392 0.0000 173.4142 0.0000

2 0.0360 0.0000 0.8530 0.0000 180.9097 0.0000

3 0.0324 0.0000 0.8623 0.0000 186.6094 0.0000

4 0.0341 0.0000 0.8680 0.0000 189.6902 0.0000

5 0.0327 0.0000 0.8766 0.0000 195.1839 0.0000

Table 5.15: Correlations between Vi and Vj for BCH, excluding Bitfinex

We see that, during ordinary periods, net volumes on GDAX and Bitstamp are strongly correlated, with r-values greater than 0.8 for all trading pairs. In contrast, in periods following arbitrage opportunities, the correlation coefficients drop to less than 0.06 in all cases. This shows that arbitrage opportunities are associated with differences in trading behavior, even if those opportunities are not quickly exploited.

5.5 Predicting arbitrage window length based on trading volume

Since we find little evidence of arbitrage exploitation in trading volume during subsequent time periods, the next approach is to investigate whether the duration of an arbitrage opportunity can be predicted using the trading volume on those exchanges during the affected time window. The theory being tested is that if there is little trading volume on the relevant exchanges when an arbitrage opportunity exists, it is likely to persist, whereas high volume would indicate more liquidity and active trading likely to eliminate the arbitrage opportunity more quickly.

Hypothesis 3: For a given cryptocurrency, the length of the arbitrage window between the

two exchanges is shorter if the per-second trading volume (unsigned) on the two exchanges

within the arbitrage window is higher.

For this, we construct a linear regression model. The dependent variable is the length of the arbitrage window in seconds, and the independent variable is the mean of the unsigned trade volumes on the two exchanges during the period. We use volume per second to account for the fact that longer windows would naturally be expected to have more volume. Arbitrage windows lasting fewer than 3 seconds are excluded because they risk being artifacts of the varying time delays in streaming data from the different exchanges. Using this model, we find no significant relationship.
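A single-variable least-squares fit of this kind can be sketched in a few lines. This is a generic ordinary least squares illustration with a hypothetical function name, not the statistical package used to produce the reported results; it returns the slope, intercept, and r-squared but not a p-value.

```javascript
// Ordinary least squares for y = intercept + slope * x.
// Here xs would hold per-second unsigned volume for each arbitrage window
// and ys the corresponding window lengths in seconds.
function simpleLinearRegression(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const r2 = (sxy * sxy) / (sxx * syy); // squared correlation
  return { slope, intercept, r2 };
}
```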

coef p-value

BTC 0.0027 0.367

ETH -0.0003 0.463

BCH -0.0016 0.051

LTC -0.0016 0.382

Table 5.16: Regression of window length against trading volume

Even though we see a much lower p-value for BCH, we note the model has \(r^2 = 0.059\), so it does not appear to be descriptive.

5.6 Predicting arbitrage opportunities based on trading volume

Next, we try to see if the opening of an arbitrage window can be predicted using the trad-

ing volume on the relevant exchanges in the prior time interval.

Hypothesis 4: For a given cryptocurrency, if trading volume is lower in a time interval, then an arbitrage window is more likely to open in the following few seconds.

For this, we include only time samples that reflected arbitrage opportunities where no

such opportunity existed at the time of the previous sample. We build a logistic regres-

sion model in which the x values are the sum of the unsigned trade sizes on the relevant

exchanges during the prior 5 seconds and the y values are 1 if an arbitrage opportunity

existed at the sample time and 0 otherwise. For cryptocurrency \(c\), exchange pairs \((i, j)\), time \(t\), and matches \(r\) each having price \(p_r\), size \(s_r\), and timestamp \(\tau_r\), we have:

\[ x_{c(i,j)t} = \sum_{r \,\mid\, t-5 \le \tau_r < t} p_r s_r \]

\[ y_{c(i,j)t} = \begin{cases} 1 & \text{if } b_{cit}(1 - f_i) > a_{cjt}(1 + f_j) \text{ or } b_{cjt}(1 - f_j) > a_{cit}(1 + f_i) \\ 0 & \text{otherwise} \end{cases} \]
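The two quantities defined above can be sketched as follows. The match object shape and the function names are assumptions for illustration, and `matches` is taken to contain the trades from both exchanges of the pair.

```javascript
// Unsigned traded value over the lookback window [t - lookback, t).
// Hypothetical match shape: { price, size, time } with time in seconds.
function priorVolume(matches, t, lookback = 5) {
  let total = 0;
  for (const m of matches) {
    if (m.time >= t - lookback && m.time < t) {
      total += m.price * m.size;
    }
  }
  return total;
}

// Returns 1 when an opportunity exists in either direction net of the
// taker fees feeI and feeJ (expressed as fractions), and 0 otherwise.
function arbitrageExists(bidI, askI, bidJ, askJ, feeI, feeJ) {
  const sellOnI = bidI * (1 - feeI) > askJ * (1 + feeJ);
  const sellOnJ = bidJ * (1 - feeJ) > askI * (1 + feeI);
  return sellOnI || sellOnJ ? 1 : 0;
}
```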

However, the results of the model show no evidence of such a relationship. The fitted coefficients for x are zero to four decimal places, suggesting that trading volume in the preceding interval is unrelated to prices diverging across exchanges.

coef p-value

const -6.2134 0.0000

x 0.0000 0.0000

Table 5.17: Regression results for arbitrage window based on trading volume for BTC

coef p-value

const -5.6036 0.0000

x 0.0000 0.0000

Table 5.18: Regression results for arbitrage window based on trading volume for ETH

coef p-value

const -5.2950 0.0000

x 0.0000 0.0000

Table 5.19: Regression results for arbitrage window based on trading volume for LTC

coef p-value

const -4.9343 0.0000

x 0.0000 0.0000

Table 5.20: Regression results for arbitrage window based on trading volume for BCH

Chapter 6

Future Work

What we have learned through this work, particularly with regard to the existence of long-

term price discrepancies for cryptocurrencies across exchanges that are indicative of avail-

able arbitrage opportunities, suggests several areas worthy of future research.

6.1 Expansion to more currency pairs

In the currency pair selection phase of this work, we decided to only consider US

Dollar-denominated trading pairs. This was based on the observation that US Dollar-

denominated pairs are typically the highest volume trading pairs and we were optimizing

for data points collected per unit time. In addition, noting the US Dollar’s position as a

commonly used reserve currency made this seem like a natural choice. Given the results

showing a relationship between arbitrage opportunities and trading volume, it would be

interesting to test if this extends further to more trading pairs.

6.2 Longer sample interval

The sample we have used in this work has a duration of five days. This seemed reasonable when our prior belief was that arbitrage opportunities would be closed within seconds. However, now that we have shown that arbitrage opportunities exist for much longer, even on well-known exchanges allowing for automated trading of well-known cryptocurrencies, it would be interesting to explore a much longer sample. With a longer sample, it would also be possible to note the effects of returns and volatility in the prices of cryptocurrencies.

6.3 Triangular arbitrage on a single exchange

If inter-exchange arbitrage appears to be possible, does this extend to intra-exchange arbi-

trage through triangular trading across multiple currency pairs? For example, is it possible

to perform simultaneous trades on a single exchange to buy Bitcoin with US dollars, buy

Litecoin with Bitcoin, and sell Litecoin for US dollars with the end result being no net po-

sition change in Bitcoin or Litecoin but a net increase in US dollar balance?
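The triangular check posed in this question can be sketched as a single calculation. The function name, the prices, and the flat per-trade fee are all illustrative assumptions; a real check would use the exchange's actual order book depth and fee schedule.

```javascript
// Starting from a USD balance, buy BTC at the BTC/USD ask, buy LTC at the
// LTC/BTC ask, then sell LTC at the LTC/USD bid. fee is a taker fee rate
// (as a fraction) assumed to apply to each of the three trades.
function triangularReturn(usd, btcUsdAsk, ltcBtcAsk, ltcUsdBid, fee) {
  const btc = (usd / btcUsdAsk) * (1 - fee); // buy Bitcoin with US dollars
  const ltc = (btc / ltcBtcAsk) * (1 - fee); // buy Litecoin with Bitcoin
  return ltc * ltcUsdBid * (1 - fee);        // sell Litecoin for US dollars
}

// An opportunity exists when the ending balance exceeds the starting one.
function hasTriangularArbitrage(usd, btcUsdAsk, ltcBtcAsk, ltcUsdBid, fee) {
  return triangularReturn(usd, btcUsdAsk, ltcBtcAsk, ltcUsdBid, fee) > usd;
}
```

With made-up prices of 10,000 USD per BTC, 0.01 BTC per LTC, and 101 USD per LTC, 1,000 USD would return about 1,010 USD before fees; a fee of 0.5% per trade would turn that into a loss.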

At the outset of this work, we assumed that this would be impossible, but the assumptions behind that belief now appear to be incorrect. First, we believed that cryptocurrency markets, while still relatively young, were mature enough that arbitrage opportunities would actively be exploited. Though we only examined inter-exchange price differences in this work, it is clear that these are not being exploited. Second, from a casual inspection of the web trading interfaces of exchanges, we assumed that trades happen frequently enough that there would be substantial barriers to entry in operating a trading system with low enough latency to engage in these types of arbitrage trades. However, the data collected shows that, without any consideration toward latency optimization, we were able to build a system to collect market data with much lower latency than the general frequency of trades. As such, this is deserving of further exploration.

6.4 Comparing arbitrage opportunities with exchange APIs

At the exchange-selection phase of this work, we considered only exchanges with WebSocket APIs. This was because such interfaces provided stronger assurances of continuous data collection and were easier to implement against. It would be interesting to also collect data from exchanges with less suitable interfaces and determine what role, if any, this technology-related friction plays in pricing.

6.5 Comparing arbitrage opportunities with deposit and withdrawal friction

The results presented showed significant differences in arbitrage opportunities for the same

currency pairs between different pairs of exchanges. It was suggested based on uncon-

firmed secondhand reports that this exchange also did not provide strong guarantees of

timely or reliable deposits or withdrawals. It would seem intuitive that this unreliability

would add significant friction to achieving price parity across exchanges. It would be in-

teresting to collect data on deposit and withdrawal reliability and latency across many

exchanges and see how these factors affect apparent arbitrage opportunities.

6.6 Comparing arbitrage opportunities with exchange consumer confidence

There are a number of cryptocurrency exchanges available with significant overlap in cur-

rency pairs offered. Given the nascent state of the industry, the relative lack of regula-

tory certainty, and notable events of fraud in the past, it might be the case that would-

be-arbitrageurs see significant risk in doing business with some exchanges. The discount

factor applied due to this risk may outweigh potential profits from what would otherwise

appear to be “risk-free” trades. It would be interesting to compare arbitrage opportunities

across exchange pairs while noting either confidence in those exchanges or signals thereof

(such as presence of significant funding from reputable investors, public partnerships with

international financial institutions, favorable regulatory rulings in relevant jurisdictions,

etc.).

Chapter 7

Conclusion

In this work we designed and implemented a system for collecting real-time pricing and trading data from three cryptocurrency exchanges: GDAX, Bitfinex, and Bitstamp. We collected continuous data streams for order book changes and trades for trading pairs involving four cryptocurrencies (Bitcoin, Ethereum, Litecoin, and Bitcoin Cash) against the US Dollar. We showed our system to be capable of handling hundreds of updates per second and robust to undocumented server behavior and unexpected network conditions.

We implemented a data pipeline for processing the raw data streams into a time-series for-

mat. We developed heuristics to infer missing data points due to exchange server errors.

We normalized the data from the different exchanges to be able to compare them.

We analyzed the resulting data and presented bid/ask spreads and trading volume. We

showed how these vary across the different cryptocurrencies and exchanges. We demon-

strated how the market environment on Bitfinex differs from other exchanges and how this

affected our analysis. We showed that, when excluding Bitfinex, trading pairs involving

cryptocurrencies exhibiting more trading volume and higher market capitalizations were

less prone to observable arbitrageable price discrepancies. We examined relationships between price differences suggesting arbitrage opportunity and several variables relating to

trading activity. We presented evidence from the signed trading volume to show that trad-

ing behavior differs in periods following arbitrage opportunities, though we were only able

to find weak evidence that trading behavior during the seconds following the presence of

an arbitrage opportunity reflected market participants’ active exploitation of these mis-

pricings.

Appendix A

Data pipeline code detail

A.1 Connection stability issues

Within the first few hours of beginning ticker data collection from the three exchanges, we

observe program crashes that result in data collection interruption. We discuss the issues

we face with each of the exchanges and mitigation strategies we use.

A.1.1 GDAX

In the first 24 hours of data collection, the stream from GDAX experiences three interruptions due to program crashes. Given this frequency, it appears to be a recurrent issue that requires attention. We update the code to add logging when the connection is closed or experiences an error (neither of which is expected behavior based on the GDAX API specification).

const Gdax = require('gdax');

const websocket = new Gdax.WebsocketClient(
  ['BTC-USD', 'ETH-USD', 'LTC-USD', 'BCH-USD'],
  'wss://ws-feed.gdax.com',
  null,
  { channels: ['ticker'] }
);

websocket.on('message', data => {
  if (data['type'] === 'ticker' && data['time'] !== undefined) {
    const logItem = {
      date: new Date(),
      data: data
    };
    console.log(JSON.stringify(logItem));
  }
});

websocket.on('close', () => {
  console.error('close');
});

websocket.on('error', (err) => {
  console.error('error', err, new Date());
});

Listing A.1: JavaScript code for reading from GDAX ticker update stream with logging

With this logging, we determine that these WebSocket connections are being closed contrary to documented behavior. Some of these closures coincide with ECONNRESET errors, which suggests that, on occasion, the connection is being closed forcibly without respect to the WebSocket protocol. This also appears similar to problematic behavior that others report experiencing with the GDAX API [53, 44]. We therefore change the code to automatically reconnect upon connection closure.

const Gdax = require('gdax');

const websocket = new Gdax.WebsocketClient(
  ['BTC-USD', 'ETH-USD', 'LTC-USD', 'BCH-USD'],
  'wss://ws-feed.gdax.com',
  null,
  { channels: ['ticker'] }
);

websocket.on('message', data => {
  if (data['type'] === 'ticker' && data['time'] !== undefined) {
    const logItem = {
      date: new Date(),
      data: data
    };
    console.log(JSON.stringify(logItem));
  }
});

websocket.on('close', () => {
  websocket.connect();
});

Listing A.2: Final JavaScript code for reading from GDAX ticker update stream

This approach works in all cases except for GDAX’s planned outages for maintenance. Since these planned outages are infrequent and also result in temporary changes to trading rules, the sample we use for analysis is taken from a time period that does not contain any such outage.

A.1.2 Bitfinex

The connection with Bitfinex also experiences similar recurrent, unexpected crashes. We employ a similar investigatory strategy, adding logging as shown.

const BFX = require('bitfinex-api-node');

const bfx = new BFX();

const ws = bfx.ws(1);

ws.on('open', () => {
  ws.subscribeTicker('BTCUSD');
  ws.subscribeTicker('ETHUSD');
  ws.subscribeTicker('LTCUSD');
  ws.subscribeTicker('BCHUSD');
});

ws.on('ticker', (pair, ticker) => {
  const logItem = {
    date: new Date(),
    pair: pair,
    data: ticker
  };
  console.log(JSON.stringify(logItem));
});

ws.on('error', (err) => {
  console.error(err);
});

ws.on('close', () => {
  console.error('close');
});

ws.on('info', (msg) => {
  console.error(msg);
});

ws.open();

Listing A.3: JavaScript code for reading from Bitfinex ticker update stream with logging

This logging reveals three types of behavior that cause crashes. Two of the types are similar to the issues with GDAX: unexpected WebSocket closes without warning, and ECONNRESET errors where the underlying connection is forcibly closed. The most frequent type of halt, though, occurs when the server closes the WebSocket connection after sending an info message that informs the client of an impending server stop and requests that the client reconnect. In light of these discoveries, we employ a similar mitigation strategy of automatically reconnecting upon close. However, this needs to be adjusted slightly to include a 500 millisecond delay between close and reconnect to work reliably with Bitfinex.

const BFX = require('bitfinex-api-node');

const bfx = new BFX();

const ws = bfx.ws(1);

ws.on('open', () => {
  ws.subscribeTicker('BTCUSD');
  ws.subscribeTicker('ETHUSD');
  ws.subscribeTicker('LTCUSD');
  ws.subscribeTicker('BCHUSD');
});

ws.on('ticker', (pair, ticker) => {
  const logItem = {
    date: new Date(),
    pair: pair,
    data: ticker
  };
  console.log(JSON.stringify(logItem));
});

ws.on('close', () => {
  setTimeout(() => {
    ws.open();
  }, 500);
});

ws.open();

Listing A.4: Final JavaScript code for reading from Bitfinex ticker update stream

With these changes, we are able to maintain continuous data collection from Bitfinex.

A.1.3 Bitstamp

Although there are no issues with data collection from Bitstamp in the same time frame,

we deem it prudent to investigate similar mitigation strategies in the event that spurious

disconnection issues arise in the future. However, unlike GDAX and Bitfinex, Bitstamp’s

WebSocket API is provided through a third-party company called Pusher that offers its

own publish/subscribe messaging service with an abstraction level and library on top of

native WebSockets. We can see this by examining the differences in code between what is

used for Bitstamp and what is used for the other exchanges. The result is that it is not

possible to monitor for WebSocket closures or errors in the same way it is in the other

cases. While this initially seems worrisome, we eventually observe that the Bitstamp code

runs for weeks continuously without any further modification.

A.2 Bitfinex order book data processing

To reconstruct the state of the order book at each update time, we first attempt to use the

following Python code:

import math

# Module-level state. The initial values are assumed; the original
# listing does not show them.
bidsdict, asksdict = {}, {}
bestbid = {'price': -math.inf}
bestask = {'price': math.inf}
output = []

def process_update(update):
    global bestbid, bestask

    if update['data']['amount'] > 0:
        # Positive amount: bid side. A count of 0 deletes the price level.
        if update['data']['count'] == 0:
            del bidsdict[update['data']['price']]
        else:
            bidsdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if bidsdict:
            curbestbid = max(bidsdict)
            if curbestbid != bestbid['price']:
                bestbid = {
                    'price': curbestbid,
                    'amount': bidsdict[curbestbid]['amount'],
                    'count': bidsdict[curbestbid]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            bestbid = {
                'price': -math.inf,
                'date': update['date'],
            }
    elif update['data']['amount'] < 0:
        # Negative amount: ask side.
        if update['data']['count'] == 0:
            del asksdict[update['data']['price']]
        else:
            asksdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if asksdict:
            curbestask = min(asksdict)
            if curbestask != bestask['price']:
                bestask = {
                    'price': curbestask,
                    'amount': asksdict[curbestask]['amount'],
                    'count': asksdict[curbestask]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            bestask = {
                'price': math.inf,
                'date': update['date'],
            }

Listing A.5: Python code to process Bitfinex Order Book updates

However, this code does not work on the data stream that we collect. The first error that we encounter is an update indicating that a price level should be removed from the order book even though that price level is not present in the order book immediately prior to the update. At first, we assume that there is a bug in the code processing the data stream, but examining all of the recorded data between the beginning of the stream and the problematic update reveals that the price level in question has never been in use. At this point, assuming that this is a single error, we modify the code to ignore this particular update. However, proceeding further, we eventually encounter another such invalid update that deletes a nonexistent price level. This requires us to change the code to handle this type of error in the general case as follows:

import math

# Module-level state. The initial values are assumed; the original
# listing does not show them.
bidsdict, asksdict = {}, {}
bestbid = {'price': -math.inf}
bestask = {'price': math.inf}
output = []

def process_update(update):
    global bestbid, bestask

    if update['data']['amount'] > 0:
        # Positive amount: bid side. A count of 0 deletes the price level.
        if update['data']['count'] == 0:
            # Ignore deletes for price levels that are not in the book.
            if update['data']['price'] in bidsdict:
                del bidsdict[update['data']['price']]
        else:
            bidsdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if bidsdict:
            curbestbid = max(bidsdict)
            if curbestbid != bestbid['price']:
                bestbid = {
                    'price': curbestbid,
                    'amount': bidsdict[curbestbid]['amount'],
                    'count': bidsdict[curbestbid]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            bestbid = {
                'price': -math.inf,
                'date': update['date'],
            }
    elif update['data']['amount'] < 0:
        # Negative amount: ask side.
        if update['data']['count'] == 0:
            # Ignore deletes for price levels that are not in the book.
            if update['data']['price'] in asksdict:
                del asksdict[update['data']['price']]
        else:
            asksdict[update['data']['price']] = {
                'count': update['data']['count'],
                'amount': update['data']['amount'],
            }
        if asksdict:
            curbestask = min(asksdict)
            if curbestask != bestask['price']:
                bestask = {
                    'price': curbestask,
                    'amount': asksdict[curbestask]['amount'],
                    'count': asksdict[curbestask]['count'],
                }
                output.append({
                    'date': update['date'],
                    'pair': update['pair'],
                    'best_bid': bestbid,
                    'best_ask': bestask,
                })
        else:
            bestask = {
                'price': math.inf,
                'date': update['date'],
            }

Listing A.6: Updated Python code to process Bitfinex Order Book updates

It is unclear whether we see these invalid updates because we missed earlier additions to the order book at the relevant price level or because Bitfinex spuriously inserts the deletes into the stream due to a bug in its software.
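One way to gauge how widespread these invalid deletes are is to count them per side while replaying the stream. The following is a minimal sketch, not the code used for the thesis: the function name `count_orphan_deletes` is hypothetical, and it assumes each update has the same `data` fields (`price`, `count`, `amount`) used in the listings above.

```python
from collections import Counter

def count_orphan_deletes(updates):
    """Count deletes (count == 0) that refer to a price level not in the book."""
    bids, asks = {}, {}
    orphans = Counter()
    for update in updates:
        data = update['data']
        if data['amount'] == 0:
            continue  # neither bid nor ask; the listings above skip these too
        side, book = ('bid', bids) if data['amount'] > 0 else ('ask', asks)
        if data['count'] == 0:
            if data['price'] in book:
                del book[data['price']]
            else:
                orphans[side] += 1  # delete for a level we never saw
        else:
            book[data['price']] = data
    return orphans
```

A count that stays small relative to the total number of updates would be consistent with occasional glitches, but it would not by itself distinguish between the two explanations above.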

With these modifications in place, it is possible to run this code over the entire Bitfinex

order book dataset to compute the best bid and best ask prices at each point in time.

However, while confirming the validity of this output, we note that there are instances where the best bid is greater than the best ask even though this should not be possible. Further analysis reveals that this condition persists for hours at a time. Upon inspection of the raw data updates from Bitfinex, it appears that this happens because a single price level remains on the bid side of the order book that is much higher than the rest (and higher than many, if not all, of the ask price levels present). We assume that this is caused by a missed delete message for that price level. A manual inspection of the recorded data stream shows that no such update is present. On the chance that this is a rare error, we attempt inserting such a delete into the stream, but this only results in analogous issues later in the data stream. We therefore determine that this is a pervasive issue, likely caused by bugs at Bitfinex, that we need to account for.
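A check of this kind can be sketched as a scan over the best bid and best ask records for stretches in which the book is crossed. This is an illustrative sketch only: the function name `crossed_intervals` is hypothetical, and the record layout (`date`, `best_bid`, `best_ask`, each with a `price`) is assumed to match the output records of Listing A.6.

```python
def crossed_intervals(records):
    """Return (start, end) date pairs during which best bid >= best ask.

    An interval still open at the end of the data has end set to None.
    """
    intervals = []
    start = None
    for rec in records:
        crossed = rec['best_bid']['price'] >= rec['best_ask']['price']
        if crossed and start is None:
            start = rec['date']  # the book has just become crossed
        elif not crossed and start is not None:
            intervals.append((start, rec['date']))
            start = None
    if start is not None:
        intervals.append((start, None))
    return intervals
```

The lengths of the returned intervals show whether a crossed book is a momentary glitch or a condition that lasts for hours.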

Dealing with this issue is more complex than the case of a delete for a record that does not exist, as discussed previously. In this case, we need to formulate a heuristic to determine when an update should have been sent even though we receive no such update. The first step in our approach is to assume that the invariant of the best bid being strictly less than the best ask should always hold. When the invariant is violated, we assume that the newer update (bid or ask price level update) is valid and that there has been a missed update deleting the older of the two price levels. However, in our manual inspection in the previous phase, we note that there are often cases in which both sides of the order book visible to us (25 price levels in each direction) are cleared in a quick succession of updates with the same timestamp. Timestamps are recorded to the nearest millisecond, which is to say that we receive up to 50 updates within one millisecond. Since this data restoration involves inferring missed data points, we deem it desirable to minimize the number of inferred data points we add to the stream. It therefore seems best to process the updates we receive together as a batch before testing the invariant and possibly adding an inferred update.
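The invariant check described above could look roughly like the following sketch, which would run once after each batch has been applied. It is an illustration under stated assumptions rather than the thesis implementation: the function name `enforce_invariant` is hypothetical, each side of the book is assumed to map a price to the sequence number of the last update at that level, and ties are resolved by dropping the bid.

```python
def enforce_invariant(bids, asks):
    """Drop stale levels until max(bids) < min(asks).

    bids and asks map price -> sequence number of the last update at that
    price (an assumed representation). Returns the inferred deletes as
    (side, price) tuples, in the order they were applied.
    """
    inferred = []
    while bids and asks:
        best_bid, best_ask = max(bids), min(asks)
        if best_bid < best_ask:
            break  # invariant holds
        # Treat the more recently updated level as valid and infer a
        # missed delete for the older one.
        if bids[best_bid] <= asks[best_ask]:
            del bids[best_bid]
            inferred.append(('bid', best_bid))
        else:
            del asks[best_ask]
            inferred.append(('ask', best_ask))
    return inferred
```

For example, with `bids = {100.0: 5, 250.0: 1}` and `asks = {101.0: 6, 102.0: 4}`, the stale bid at 250.0 is removed and a single inferred delete is reported.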

We first employ this method to batch updates with the same timestamp. After this is done, we examine the amount of time between updates. We determine that the most common delta between successive updates is 0 milliseconds. The second most common delta is 1ms, and 2ms is third. When we examine the logs around instances of these phenomena, it appears as though these cases represent a single logical update from the Bitfinex servers, as we observe both sides of the order book being replaced over the course of a few milliseconds. Based on these observations, we decide to process a stream of updates as a single batch as long as the time delta between successive updates does not exceed 5ms. With this threshold, it is possible to maintain our invariant with an average of fewer than one inferred update per million updates processed.
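The batching rule can be sketched as a generator that starts a new batch whenever the gap to the previous update exceeds the threshold. This is a minimal sketch: the name `batch_updates` and its parameters are hypothetical, and timestamps are assumed to be integer milliseconds so that the comparison against the 5ms threshold is exact.

```python
def batch_updates(timestamped_updates, max_gap_ms=5):
    """Yield lists of updates whose successive gaps are at most max_gap_ms.

    timestamped_updates is an iterable of (timestamp_ms, update) pairs in
    arrival order.
    """
    batch = []
    last_ts = None
    for ts, update in timestamped_updates:
        if batch and ts - last_ts > max_gap_ms:
            yield batch  # gap too large: close the current batch
            batch = []
        batch.append(update)
        last_ts = ts
    if batch:
        yield batch
```

Note that the gap is measured between successive updates, not from the start of the batch, so a batch can span more than 5ms in total.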

As we develop these data cleansing techniques, testing each new iteration of code over even just one day’s worth of updates for a single currency pair can take hours. This is frustratingly slow, so we seek to improve the performance. Using a profiler, we determine that the most costly computation step is parsing the timestamp embedded in each update. This parsing becomes necessary once we decide to use the time deltas between successive updates to determine batching. To make our processing more efficient, we use separate code to handle this timestamp parsing and compute the relevant time deltas as shown.

import dateutil.parser
import json
import sys

last_date = None
last_date_str = ''

for line in sys.stdin:
    parsed = json.loads(line)
    date_str = parsed['date']
    if date_str == last_date_str:
        # Same timestamp string as the previous update: skip the parse.
        print(0)
    else:
        date = dateutil.parser.parse(date_str)
        # The first update has no predecessor, so no delta is printed for it.
        if last_date is not None:
            diff = date - last_date
            print(diff.total_seconds())

        last_date_str = date_str
        last_date = date

Listing A.7: Python code to pre-process time deltas between Bitfinex Order Book updates

We then read these time deltas in parallel when processing the data stream so that we can avoid timestamp parsing in this step. In all, we are able to consolidate more than 1 million updates per day per currency pair to approximately 100,000 updates per day per currency pair. We are then able to output best bid and best ask prices at each update time for Bitfinex and use this for further analysis.
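Reading the two streams in parallel might be sketched as follows. This is an assumption-laden illustration, not the thesis code: the name `updates_with_deltas` is hypothetical, and it relies on the observation that Listing A.7, as written, prints no delta for the first update, so the delta stream is one line shorter than the update stream.

```python
import itertools
import json

def updates_with_deltas(update_lines, delta_lines):
    """Pair each JSON update with the seconds elapsed since the previous one.

    The first update has no predecessor and is paired with None.
    """
    deltas = itertools.chain([None], (float(d) for d in delta_lines))
    for line, delta in zip(update_lines, deltas):
        yield json.loads(line), delta
```

Both arguments can be open file objects, so neither file needs to be loaded into memory in full.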

Bibliography

[1] Bitstamp hiring Senior Developer. https://si.linkedin.com/jobs/view/senior-developer-at-bitstamp-616662986, 2018.

[2] almondryan. IOTA withdrawal over 24 hours waiting. https://www.reddit.com/r/bitfinex/comments/7ubyqu/iota_withdrawal_over_24_hours_waiting/, 2018.

[3] Tony Arcieri. The Tether Conundrum: A Quick Backstory. https://tonyarcieri.com/the-tether-conundrum, 2018.

[4] Atribecalledmeuw. Withdraw from bitfinex?! https://www.reddit.com/r/Iota/comments/7uiss7/withdraw_from_bitfinex/, 2018.

[5] Utpal Bhattacharya, Craig W Holden, and Stacey Jacobsen. Penny wise, dollar foolish: Buy–sell imbalances on and around round numbers. Management Science, 58(2):413–431, 2012.

[6] Bitfinex. Bitfinex on Twitter. https://twitter.com/bitfinex/status/261856016679985152, 2012.

[7] Bitstamp. About Us - Bitstamp. https://www.bitstamp.net/about_us/, 2018.

[8] Robert Brand, Brian Latham, and Godfrey Marawanyika. Zimbabwe Doesn’t Have Its Own Currency and Bitcoin Is Surging. https://www.bloomberg.com/news/articles/2017-11-15/bitcoin-surges-in-zimbabwe-after-military-moves-to-seize-power, 2017.

[9] Russell Brandom and Sarah Jeong. Why the feds took down one of Bitcoin’s largest exchanges. https://www.theverge.com/2017/7/29/16060344/btce-bitcoin-exchange-takedown-mt-gox-theft-law-enforcement, 2017.

[10] Bruno. The Curious Case of 184 Billion Bitcoin. https://bitfalls.com/2018/01/14/curious-case-184-billion-bitcoin/, 2018.

[11] Evelyn Cheng. Bitcoin debuts on the world’s largest futures exchange, and prices fall slightly. https://www.cnbc.com/2017/12/17/worlds-largest-futures-exchange-set-to-launch-bitcoin-futures-sunday-night.html, 2017.

[12] Evelyn Cheng. Bitcoin exchange Coinbase has more users than stock brokerage Schwab. https://www.cnbc.com/2017/11/27/bitcoin-exchange-coinbase-has-more-users-than-stock-brokerage-schwab.html, 2017.

[13] Evelyn Cheng. Japanese cryptocurrency exchange loses more than $500 million to hackers. https://www.cnbc.com/2018/01/26/japanese-cryptocurrency-exchange-loses-more-than-500-million-to-hackers.html, 2018.

[14] Catalin Cimpanu. BitGrail Cryptocurrency Exchange Becomes Insolvent After Losing $170 Million. https://www.bleepingcomputer.com/news/cryptocurrency/bitgrail-cryptocurrency-exchange-becomes-insolvent-after-losing-170-million/, 2018.

[15] CoinMarketCap. Bitcoin (BTC) Historical Data. https://coinmarketcap.com/currencies/bitcoin/historical-data.

[16] Samburaj Das. Bitcoin Price Nears $13,000 in India as Investors Join Boom Time Despite 30% Premium. https://www.ccn.com/bitcoin-price-nears-13000-india-investors-join-boom-time/, 2017.

[17] John Detrixhe. A South Korean bitcoin exchange has filed for bankruptcy after being hacked again. https://qz.com/1160573/bitcoin-exchange-youbit-files-for-bankruptcy-in-south-korea-after-latest-hack/, 2017.

[18] John Detrixhe and Joon Ian Wong. Bitcoin could fall below $5,000 if this report on a mysterious cryptotoken is right. https://qz.com/1196866/bitcoin-prices-could-be-40-lower-because-tether-propped-it-up/, 2018.

[19] Ittay Eyal and Emin Gün Sirer. Majority is not enough: Bitcoin mining is vulnerable. In International conference on financial cryptography and data security, pages 436–454. Springer, 2014.

[20] Stephen Gandel. Bitcoin’s Price Isn’t Always What You Think It Is. https://www.bloomberg.com/gadfly/articles/2017-12-08/bitcoin-s-price-isn-t-always-what-you-think-it-is, 2017.

[21] Samuel Gibbs. Bitcoin: $64m in cryptocurrency stolen in 'sophisticated' hack, exchange says. https://www.theguardian.com/technology/2017/dec/07/bitcoin-64m-cryptocurrency-stolen-hack-attack-marketplace-nicehash-passwords, 2017.

[22] John M Griffin and Amin Shams. Is Bitcoin Really Un-Tethered? 2018.

[23] Samuel Haig. Bitfinex Experiences Withdrawal Difficulties. https://news.bitcoin.com/bitfinex-experience-withdrawal-difficulties/, 2017.

[24] Ethan Heilman, Neha Narula, Thaddeus Dryja, and Madars Virza. IOTA Vulnerability Report: Cryptanalysis of the Curl Hash Function Enabling Practical Signature Forgery Attacks on the IOTA Cryptocurrency. https://github.com/mit-dci/tangled-curl/blob/master/vuln-iota.md, 2017.

[25] Stan Higgins. Cryptsy Threatens Bankruptcy, Claims Millions Lost in Bitcoin Heist. https://www.coindesk.com/cryptsy-bankruptcy-millions-bitcoin-stolen/, 2016.

[26] Stan Higgins. $300 Billion: Bitcoin Price Boosts Crypto Market Value to Record High. https://www.coindesk.com/300-billion-bitcoin-price-boosts-crypto-market-value-record-high/, 2017.

[27] Stan Higgins. As Bitcoin Soars, Prices Diverge Wildly Across Exchanges. https://www.coindesk.com/bitcoin-soars-prices-diverge-wildly-across-exchanges/, 2017.

[28] Marc Hochstein. Tether Confirms Its Relationship With Auditor Has 'Dissolved'. https://www.coindesk.com/tether-confirms-relationship-auditor-dissolved/, 2018.

[29] iFinex Inc. Fee Schedule. https://www.bitfinex.com/fees.

[30] Coinbase Inc. API Reference. https://docs.gdax.com.

[31] Coinbase Inc. Fee Structure. https://gdax.com/fees.

[32] Coinbase Inc. You Can Now Buy And Sell Bitcoin By Connecting Any U.S. Bank Account. https://blog.coinbase.com/you-can-now-buy-and-sell-bitcoin-by-connecting-any-u-s-bank-account-72457ab182c5, 2012.

[33] Coinbase Inc. About Coinbase. https://www.coinbase.com/about?locale=en-US, 2018.

[34] KGO Television Inc. Bitcoin expert explains the cryptocurrency. http://abc7news.com/finance/bitcoin-expert-explains-the-cryptocurrency/2801193/, 2017.

[35] Arjun Kharpal. All you need to know about tether, the cryptocurrency that could have 'devastating' effects on the market. https://www.cnbc.com/2018/02/02/tether-what-you-need-to-know-about-the-cryptocurrency-worrying-markets.html, 2018.

[36] Arjun Kharpal. Bitcoin’s dominance of the cryptocurrency market is at its lowest level ever. https://www.cnbc.com/2018/01/02/bitcoin-dominance-of-cryptocurrency-market-lowest-level-ever.html, 2018.

[37] Nejc Kodrič. www.BITSTAMP.net Bitcoin exchange site for USD/BTC. https://bitcointalk.org/index.php?topic=38711.0, 2011.

[38] Timothy B. Lee. A brief history of Bitcoin hacks and frauds. https://arstechnica.com/tech-policy/2017/12/a-brief-history-of-bitcoin-hacks-and-frauds/, 2017.

[39] Timothy B. Lee. Skyrocketing fees are fundamentally changing bitcoin. https://arstechnica.com/tech-policy/2017/12/bitcoin-fees-rising-high/, 2017.

[40] Timothy B. Lee. Why experts are worried about Tether, a dollar-pegged cryptocurrency. https://arstechnica.com/tech-policy/2018/02/tether-says-its-cryptocurrency-is-worth-2-billion-but-its-audit-failed/, 2018.

[41] Matthew Leising. U.S. Regulators Subpoena Crypto Exchange Bitfinex, Tether. https://www.bloomberg.com/news/articles/2018-01-30/crypto-exchange-bitfinex-tether-said-to-get-subpoenaed-by-cftc, 2018.

[42] Bitstamp Ltd. Unified Fee Schedule. https://www.bitstamp.net/fee_schedule/.

[43] FIX Protocol Ltd. What is FIX? https://www.fixtrading.org/what-is-fix/.

[44] madzthakz. ConnectionError: ('Connection aborted.', error(54, 'Connection reset by peer')). https://www.reddit.com/r/GDAX/comments/7qe2g6/connectionerror_connection_aborted_error54/, 2018.

[45] Igor Makarov and Antoinette Schoar. Trading and Arbitrage in Cryptocurrency Markets. 2018.

[46] Robert McMillan. The Inside Story of Mt. Gox, Bitcoin’s $460 Million Disaster. https://www.wired.com/2014/03/bitcoin-exchange/, 2014.

[47] Satoshi Nakamoto. Bitcoin: A Peer-to-Peer Electronic Cash System, 2008.

[48] Arvind Narayanan, Joseph Bonneau, Edward Felten, Andrew Miller, and Steven Goldfeder. Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction. Princeton University Press, 2016.

[49] BBC News. China orders Bitcoin exchanges in capital city to close. https://www.bbc.com/news/business-41320568, 2017.

[50] Maureen O’Hara. High-frequency trading and its impact on markets. Financial Analysts Journal, 69(2), 2013.

[51] Rob Price. One of the world’s biggest bitcoin exchanges has been hacked. http://www.businessinsider.com/south-korean-bitcoin-exchange-bithumb-hacked-ethereum-2017-7, 2017.

[52] Kenneth Rapoza. Good Luck Buying Bitcoin In India As Central Banker Bans. https://www.forbes.com/sites/kenrapoza/2018/04/05/good-luck-buying-bitcoin-in-india-as-central-banker-bans/, 2018.

[53] Raukk. Error: socket hang up. https://github.com/coinbase/gdax-node/issues/97, 2017.

[54] Fergal Reid and Martin Harrigan. An analysis of anonymity in the bitcoin system. In Security and privacy in social networks, pages 197–223. Springer, 2013.

[55] Dorit Ron and Adi Shamir. Quantitative analysis of the full bitcoin transaction graph. In International Conference on Financial Cryptography and Data Security, pages 6–24. Springer, 2013.

[56] Blockchain Luxembourg S.A. Average Confirmation Time. https://blockchain.info/charts/avg-confirmation-time, 2017.

[57] Kai Sedgwick. Bitfinex Starts Sharing Customer Tax Data with Authorities. https://news.bitcoin.com/bitfinex-starts-sharing-customer-tax-data-with-authorities/, 2018.

[58] Robert J Shiller. Irrational exuberance: Revised and expanded third edition. Princeton University Press, 2015.

[59] Laura Shin. Bitstamp Becomes First Nationally Licensed Bitcoin Exchange; License Applies In 28 EU Countries. https://www.forbes.com/sites/laurashin/2016/04/25/7886/, 2016.

[60] SJorritsma. Withdrawals Still in Processing. https://www.reddit.com/r/bitfinex/comments/7eggi9/withdrawals_still_in_processing/, 2017.

[61] Hugh Son, Dakin Campbell, and Sonali Basak. Goldman Is Setting Up a Cryptocurrency Trading Desk. https://www.bloomberg.com/news/articles/2017-12-21/goldman-is-said-to-be-building-a-cryptocurrency-trading-desk, 2017.

[62] Simon Sprankel. Technical basis of digital currencies, 2013.

[63] Sujha Sundararajan. Bitcoin Exchange Implicates Employee In $3 Million Theft. https://www.coindesk.com/bitcoin-exchange-names-security-officer-suspect-3-3-million-theft/, 2018.

[64] Swartzcenter. HAS ANYONE BEEN ABLE TO WITHDRAW FROM BITFINEX? https://www.reddit.com/r/bitfinex/comments/7g92u8/has_anyone_been_able_to_withdraw_from_bitfinex/, 2017.

[65] Neer Varshney. This hacker made $120K in a week by finding bugs in EOS cryptocurrency. https://github.com/mit-dci/tangled-curl/blob/master/vuln-iota.md, 2018.

[66] Gavin Wood. Ethereum: A secure decentralised generalised transaction ledger, 2014.

[67] Joseph Young. Bitcoin Price Surpasses $17,000 in South Korea, an Extreme $3,500 Premium. https://www.ccn.com/bitcoin-price-surpasses-17000-south-korea-extreme-3500-premium/, 2017.
