Professional Documents
Culture Documents
Stephen Roberts
University of Oxford
sjrob@robots.ox.ac.uk
Abstract
Market makers are faced with the complex optimisation problem of maximising their profit while
minimising their inventory risk. This dual mandate exerts inherently opposing forces: the simplest way to
avoid inventory is to maintain wide bid-offer spreads, yet doing so will prevent deal flow and therefore
neuter profitability. Conversely, showing competitive two-way prices will ensure heavier volumes on
both sides of the order book and therefore lock-in gains from clients crossing the bid-offer spread, but
opens the hazard of building up sizeable inventory risk if trade flow is temporally asymmetric. A further
concern for dealers is the threat of adverse selection by market professionals exploiting an information
advantage. Existing work in finance and stochastic control provides a robust framework for dynamically
balancing inventory, but does little to address adverse selection in over-the-counter markets where prices
can be individually tuned for each counterparty. In this paper, we consider a data-driven adjustment to
the state-of-the-art model based on the entropy of a high-dimensional, probabilistic representation of each
connection’s behaviour. Bayesian community detection techniques are applied to an adjacency matrix of
broker distributions, to reduce dimensionality and guarantee the scalability of our approach. Our entropy
metrics provide a basis for measuring adverse selection and feed directly into the Hamilton-Jacobi-Bellman
framework for deriving inventory-optimal bid and offer prices.
1
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
2
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
equation: and
ut + 21 σ2 uss + max λb (δb ) u(s, x − s + δb ,
a a
δb
JB = r ( s, q, t ) − s
δH
q + 1, t) − u(s, x, q, t) + max λ a (δ a ) u(s,
λ a (δ a )
1
+ log 1 − γ
δa (11)
γ (∂λ a /∂δ)(δ a )
+ s + δ a , q − 1, t) − u(s, x, q, t)
x =0
u(s, x, q, T ) = − exp(−γ( x + qs)) The HJB solution to inventory risk manage-
(7) ment can be viewed as a two-step process: first
solve for r b (s, q, t) and r a (s, q, t) in the PDE
Avellaneda and Stoikov’s choice of an ex- given by Equation (9), then solve Equations
ponential utility allows a simplification of the (10) and (11) to obtain the optimal distance
problem. They postulate that there exists a b a
δH JB ( s, q, t ) and δH JB ( s, q, t ) between the mid
function θ such that price and the optimal bid and ask quotes.
Direct substitution of Equation (8) into In this section we describe both the market
Equation (7) yields the following system for maker dataset provided by BNP Paribas, and
θ: the market metrics derived from EURUSD tick
data for the same time period.
λb (δb )
θt + 12 σ2 θss − 12 σ2 γθs2 + max
δb
γ I. The BNP Paribas FX Dataset
λ a (δ a )
1 − exp(γ(s − δb − r b ))
The BNP Paribas dataset includes a large set
+ max γ
δa
of FX trades executed by their market makers
in 2013. The complete dataset counts 916,683
1 − exp(−γ(s + δ a − r a )) = 0
trades split across 12 currencies, with EURUSD
by far the most actively traded pair. For each
θ (s, q, T ) = qs
trade, the raw dataset provides a timestamp,
(9)
the currency pair, deal size, whether it was a
Furthermore, applying Equation (8) to the
buy or a sell, as well as both client ID and bro-
reservation bid and ask prices given by Equa-
ker ID numbers. In the analysis that follows,
tions (4) and (5) yields the relations
we restrict our attention to EURUSD trades
taking place in January 2013 - this dataset’s
b b dealcount is 35,001, large enough to remain
δH JB = s − r ( s, q, t )
! representative. Figure 1 overleaf provides the
1 λb (δb ) full currency breakdown for the January 2013
+ log 1 − γ (10)
γ (∂λb /∂δ)(δb ) subset.
3
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
Figure 1: Histogram of tradecount by currency in the BNP Paribas dataset for January 2013, N=74958. Note that 46.7% of
trade orders are in EURUSD.
Nt t∈[t,t∑
t EURUSD(mid)t − µt∈[t,t+1m]
+1m]
(12) Bid-Offer Spread(t) =
1-minute Leading Volatility(t) =
v EURUSD(offer)t − EURUSD(bid)t (17)
u 1 2
u
Nt t∈[t∑
t EURUSD(mid)t − µt∈[t−1m,t]
−1m,t] IV. ARD Gaussian Processes
(13)
We briefly recall the fundamentals of Gaussian
• Three return metrics were devised to Process modelling before describing ARD ker-
search for systematic bias in a client’s returns nels and the associated relevance metrics. For
4
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
a comprehensive treatment of Gaussian Pro- adapt to any given dataset: inputs with large
cesses, please refer to [4]. length scales cause only marginal variations in
A Gaussian Process is a collection of ran- the covariance function, whereas inputs with
dom variables, any finite subset of which has a small length scales effectively magnify those
joint Gaussian distribution. Gaussian Processes variations. We can therefore define the rele-
are fully parametrised by a mean function and vance score of each feature to be the reciprocal
covariance function, or kernel. Given a real of its input length scale, and rank the salience
process f (x), we write the Gaussian Process as: of inputs by descending relevance.
5
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
6
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
Table 1: Correlation between Return Metrics and Signed Deal Size, N=35001.
Correlation p-value
Feature Pearson Spearman Pearson Spearman
1-min Return −0.0379 −0.0405 < 0.0001 < 0.0001
1-hour Return −0.0061 −0.0178 0.2516 0.0009
1-day Return +0.0032 −0.0024 0.5494 0.6589
Table 2: Correlation between Time of Day, Volatility, Bid-Offer Spread and Absolute Deal Size, N=35001.
Correlation p-value
Feature Pearson Spearman Pearson Spearman
Time of Day +0.0366 +0.0134 < 0.0001 < 0.0001
1-min Lagging Volatility +0.0345 +0.0801 < 0.0001 < 0.0001
1-min Leading Volatility +0.0478 +0.0570 < 0.0001 < 0.0001
Bid-Offer Spread −0.0047 +0.0247 0.3835 < 0.0001
Table 3: Relevance across all polarity-sensitive features measured on the January 2013 dataset, N=35001.
Table 4: Relevance across all polarity-insensitive features measured on the January 2013 dataset, N=35001.
7
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
8
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
Table 5: Bayesian NMF clustering on the January 2013 dataset across 25 broker connections.
and 1-day feature set. It nevertheless allows us In our case, f and g are GPs defined by
to produce meaningful bivariate visualisations their mean and covariance functions. Detrend-
of client sensitivity to its dominant features. ing during data preprocessing ensures that the
As an example of the framework’s ability to distributions have zero mean, µ f = µ g = 0.
represent behaviour, we provide heatmaps of This simplifies the distance calculation to
broker connection #1’s response to its most
salient features in Figures 4 and 5.
1 | Σ |
DB ( f (x), g(x)) = log q (27)
IV. Community Detection 2 |Σ f ||Σ g |
Having constructed Gaussian Processes for
each connection, we wish to measure the sim- where
ilarity between behaviours in order to cluster
Σ f + Σg
counterparties and apply different adjustments Σ= (28)
2
to the HJB bid-offer price based on the clus-
ter’s entropy. In light of the extreme relevance The Bhattacharyya distance for GPs is there-
scores found in the GPs regressing absolute fore only dependent on the choice of kernel
volumes on time of day, volatility and bid-offer and optimised hyperparameters. Estimation
spread, we have adopted those features as our of each connection’s GP hyperparameters is
basis for community detection. achieved by maximising the marginal likeli-
hood of the data, and allows us to compute co-
IV.1 Bhattacharyya Distance variance matrices for each connection on a com-
mon input scale. Formation of a Bhattacharyya
Detection of clusters requires the creation of a similarity structure follows straightforwardly
matrix of distances that capture the pairwise from Equation (27), calculating the inverse dis-
overlap between any two connections [5]. GPs tance between each GP pair.
by definition have a probabilistic interpretation,
as a draw from a probability distribution over
IV.2 Bayesian Non Negative Matrix Factori-
functions. Amidst the wide range of available
sation
symmetric distance measures, we choose the
Bhattacharyya distance as our measure of simi- Inverse Bhattacharyya distances between GPs
larity between two probability distributions f allows us to map the client dataset to a rela-
and g, defined as tional space, where every pair i, j of connec-
tions is given a similarity value sij . The result-
ing similarity matrix S is input as an adjacency
DB ( f (x), g(x)) = − log( BC ( f (x), g(x))) (25)
structure to a Bayesian non-negative matrix fac-
where BC ( f (x),g(x)) is the Bhattacharyya Co- torisation (NMF) scheme [6], a method with a
efficient given by history of successful application to a diverse
set of community detection challenges rang-
Z q ing from social network structure in ecological
BC ( f (x), g(x)) = f (x) g(x)dx (26) systems to maritime anomaly detection [7].
9
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
Table 6: 2-medoids clustering on the January 2013 dataset across 25 broker connections.
10
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
δ b (i ) = δH
b
JB + f ( H (i )) (30)
and
δ a (i ) = δH
a
JB + f ( H (i )) (31)
11
Optimal FX Market Making under Inventory Risk and Adverse Selection Constraints
September 2015 • S. Ghoshal, S. Roberts
[3] Guéant, O., Lehalle, C. and Fernandez- cesses. Knowledge and Information Systems,
Tapia, J. (2013). Dealing with the inven- March 2014, Volume 38.
tory risk: a solution to the market making
problem. Mathematics and Financial Eco- [6] Psorakis, I., Roberts, S., Ebden, M. et al.
nomics, 2013. (2011). Overlapping community detection
using Bayesian non-negative matrix fac-
[4] Rasmussen, C. and Williams, C. (2006). torisation. Physical Review 83, 2011.
Gaussian Processes for Machine Learning.
MIT Press. [7] Psorakis, I., Rezek, I., Roberts, S. et al.
(2012). Inferring social network struc-
[5] Smith, M., Reece, S., Roberts, S., Psorakis, ture in ecological systems from spatio-
I. and Rezek, I. (2014). Maritime Ab- temporal data streams. Journal of the Royal
normality Detection using Gaussian Pro- Society Interface 9, 2012.
12