You are on page 1of 26

Trading Strategies via Book Imbalance

Umberto Pesavento
joint work with Alexander Lipton and Michael G. Sotiropoulos

Algorithmic Trading Quantitative Research


Bank of America Merrill Lynch

Financial Engineering Workshops, Cass Business School


City University London, 8 October 2014

U. Pesavento, Bank of America Merrill Lynch 1 of 26 8 October 2014


Contents

Limit order books and algorithmic trading

Empirical observations

Modeling the bid and ask queues

Adding trade arrival dynamics

Calibration

Conclusions and current work

Appendix

References

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 2 of 26 8 October 2014
Limit order books and algorithmic trading: a top down approach

volume

t0 t1 t2 t3 t4 t5 t6 t7 t8 time

Buy trade 135 1000


135 500 134 1500
ask

134 1000 134 1000 133 800


133 1500 133 1500 132 100
134 500
132 800 132 800 133 1000
131 300 131 100 132 1500
131 800
131 500 130 300
130 1000
ti 129 1500
130 500 130 500 128 800
129 1000 129 1000 127 300
128 1500 128 1500 Sell trade ti+1
129 500
bid

127 800 127 800


126 300 128 1000
126 300 127 1500
126 800
125 300

A divide and conquer approach based on 3 phases:


• calculating a time-volume schedule;
• limit order placement: optimal trading of the allocated shares within the given horizon;
• venue allocation.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 3 of 26 8 October 2014
Empirical observations: stopping times and book averaging

t0 p0b p0a q0b q0a


A separation of time scales: t1 p1b p1a q1b q1a
• queue length updates (≈ 3 s); t2 p2b p2a q2b q2a
t˜0 p˜0 q˜0 s˜0
• best bid-ask updates (≈ 8 s);
t3 p3b p3a q3b q3a
• trade arrivals (≈ 12 s). t4 p4b p4a q4b q4a
t˜1 p˜1 q˜1 s˜1
t5 p5b p5a q5b q5a
t6 p6b p6a q6b q6a
t˜2 p˜2 q˜2 s˜2
... ... ... ... ...

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 4 of 26 8 October 2014
Empirical observations: mid price movements conditional on an book imbalance

Non-martingale properties of prices at small time scale:


• future price variations can be predicted by using the imbalance in the bid and ask queues of
the order book I = (q b − q a )/(q b + q a );
• statistically significant (about 107 data points for the plot above);
• the effect is not large enough to lead to a straightforward arbitrage but significant enough to
yield savings in execution costs.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 5 of 26 8 October 2014
Empirical observations: trade arrivals and related stopping times

Changing the stopping time:


• the overall trend in the expected price movements as a function of the book imbalance is the
same;
• conditioning on the arrival of trades on a particular side of the book breaks the symmetry in
the expected waiting time and price movements.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 6 of 26 8 October 2014
Modeling the bid and ask queues: replenishment processes and the constant
spread approximation
ask queue (down−tick)

137 1500
135 3700 135 3700 135 3700
initial queues
133 6500 133 6500 133 6500

ask
132 9000 132 9000 132 9000
131 7200 depletion replenishment
replenished queues
131 7200
130 8600 130 8600 130 8600

bid
129 5000 129 5000 129 5000
128 3000 128 3000 128 3000
127 4500 127 4500

bid queue (up−tick)

Simple models for queues and price dynamics: .


• two correlated diffusion processes to represent the bid and ask queues (q b , q a ) = (W b , W a );
• price moves are associated with queues depletion;
• queues replenishment, drawing from a stationary distribution;
• assume constant spreads (not a bad approximation for liquid stocks)

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 7 of 26 8 October 2014
Modeling the bid and ask queues: time dependent dynamics

1 1
Pxx + Pyy + ρxy Pxy = 0,
Pt + (1)
2 2
where ρxy is the correlation between the processes governing the depletion and replenishment of
the bid and ask queues, which typically takes a negative value in a normal market.

 α(x, y ) = x


(ρxy x − y ) (2)
 β(x, y ) = − q ,
1 − ρ2


xy

yielding equation:
1 1
Pt +
Pαα + Pββ = 0. (3)
2 2
And the second to cast the problem in polar coordinates:
 p
2 2
 r = α +β
(
α = − r sin(ϕ − $)

←→ 
α
 (4)
β =r cos(ϕ − $)  ϕ =$ + arctan −
 .
β
where cos $ = −ρxy , so to yield the following equation for the hitting probabilities:
 
1 1 1
Pt + Vrr + Pr + 2 Pϕϕ = 0. (5)
2 r r
with the final condition: P(T , T , r , ϕ) = 0 and boundary conditions:

P(t, T , 0, ϕ) = 0, P(t, T , ∞, ϕ) = 0, P (t, T , r , 0) = P0 , P(t, T , r , $) = P1 .


Bank of America Merrill Lynch
U. Pesavento, Bank of America Merrill Lynch 8 of 26 8 October 2014
Modeling the bid and ask queues: Green’s function formulation
We seek the Green’s function to equation (5) by separating its radial and angular components:
0 0 0 0
G(τ, r , ϕ ) = g(τ, r )f (ϕ ), (6)
This leads to two equations coupled by the positive constant Λ2 :
!
1 1 Λ2
gτ = gr 0 r 0 + 0
gr 0 − 02 g , (7)
2 r r
2
fϕ0 ϕ0 = − Λ f . (8)
The radial part is solved by:

r 02 +r 2
0
e−
r 0 r0
 
0 2τ
g(τ, r ) = , IΛ (9)
τ τ
where IΛ (ξ) is the modified Bessel function of the first kind corresponding to Λ. After applying the
boundary conditions on the angular part of the equation, the final formula for the Green’s function is:

r 02 +r 2
0 ∞
2e− 2τ r 0 r0
 
0 0 0
X
G τ, r0 , r , ϕ0 , ϕ = Iνn sin νn ϕ sin (νn ϕ0 ). (10)
$τ n=1
τ
where νn = nπ ω̄ . Finally, we integrate the equation above to obtain the hitting probability for the of
an up-tick (or down-tick) conditional on the initial condition of the queue.

ˆTˆ∞
1 0 1 0
P(t, T , r0 , ϕ0 ) = − Gϕ (t − t, r , $) dr dt . (11)
2 r
t 0

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 9 of 26 8 October 2014
Modeling the bid and ask queues: infinite time limit

By writing out the explicit form for the Green’s function we obtain:
 2 2

∞ ˆ T ˆ ∞ − r +r0  
X  e 2t rr0  n+1
P (0, r0 , φ0 ) =  Iνn dtdr 
 (−1) νn sin (νn φ0 ) . (12)

n=1 0$tr 0 t

We reverse the order of integration and evaluate the time integral using the following expression:

r 2 +r 2
ˆ 0
∞ e− 2t
 
rr0 1
Iνn dt = p νn (13)
0 $tr t $νn r s2 − 1 + s

where s = (r 2 + r02 )/2rr0 . We can then integrate along the radial component,

ˆ ˆ ˆ
∞ 1 1 r0
νn −1 r0νn ∞
−νn −1 2
νn dr = r dr + r dr = . (14)
$νn r0νn
 
0 $νn r max rr ,
r0
0 $νn r0 $νn2
0 r

Finally, we sum the series to obtain:



2 X (−1)n+1
 
πn φ0
P (0, r0 , φ0 ) = sin φ0 = . (15)
π n=1 n $ $
As expected, the result depends only on the angular distance from the barrier.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 10 of 26 8 October 2014
Modeling the bid and ask queues: time independent formulation I

We now consider the time-independent problem from the onset:


1 1
Pxx + Pyy + ρxy Pxy = 0, (16)
2 2
P (x, 0) = 1, P (0, y ) = 0. (17)
Again, we perform a change of coordinates to eliminate the correlation term,


 α(x, y ) = x

(−ρxy x + y ) (18)
 β(x, y ) = q ,
1 − ρ2


xy

yielding equation:

Pαα + Pββ = 0. (19)

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 11 of 26 8 October 2014
Modeling the bid and ask queues: time independent formulation II

We then perform a the second transformation to casts the modified problem in polar coordinates:
 p
2 2
 r = α +β
(
α =r sin(ϕ)

←→  
α (20)
β =r cos(ϕ)  ϕ =arctan
 ,
β
where cos $ = −ρxy . Then the equation becomes

Pϕϕ (ϕ) = 0, (21)


with boundary conditions P(0) = 0 and P($) = 1. In this coordinate set the solution is
straightforward P(ϕ) = ϕ/$, which in the original set of coordinates has the form:

 r 
1+ρ y −x
arctan( 1−ρxy )
xy y +x 
1
P(x, y ) = 1− r .
2 1+ρ

arctan( 1−ρxyxy
)
(22)

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 12 of 26 8 October 2014
Adding trade arrival dynamics: the trade arrival process
In analogy with the two Brownian processes representing the bid and ask queues, we add a third
(unobservable) process to model trade arrival on the near side of the book:
b a b a φ
(dq , dq , dφ) = (dw , dw , dw )

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 13 of 26 8 October 2014
Adding trade arrival dynamics: handling correlation

1 1 1
Pxx + Pyy + Pzz + ρxy Pxy + ρxz Pxz + ρyz Pyz = 0 (23)
2 2 2
as in two dimensions, it is possible to eliminate the correlation terms,

α(x, y , z) =x




 β(x, y , z) = (−ρ xy x + y )



 q
1 − ρ2xy


(24)
 h i
(ρxy ρyz − ρxz ) x + (ρxy ρxz − ρyz ) y + (1 − ρ2xy )z




 γ(x, y , z) = q ,


 q
1 − ρ2xy 1 − ρ2xy − ρ2xz − ρ2yz + 2ρxy ρxz ρyz

to obtain:

Pαα + Pββ + Pγγ = 0, (25)

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 14 of 26 8 October 2014
Adding trade arrival dynamics: changing the domain

Again, we can write the exit probability problem in a simpler form by changing the computational
domain Ω:
1 1 ∂
Pφφ (φ, θ) + (sin θPθ (φ, θ)) = 0, (26)
sin2 θ sin θ ∂θ
P (0, θ) = 0, P ($, θ) = 0, P (φ, Θ (φ)) = 1. (27)


 α =r sin θ sin ϕ

β =r sin θ cos ϕ

γ =r cos θ

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 15 of 26 8 October 2014
Adding trade arrival dynamics: semi analytical solutions I
We introduce a new variable ζ = ln tan θ/2 and rewrite the exit problem again as [Lipton 2013]:

Pφφ (φ, ζ) + Pζζ (φ, ζ) = 0, (28)


computational domain is now a semi-infinite strip with curvilinear boundary
  
Θ (φ)
ζ = Z (φ) = ln tan . (29)
2
We look for the solution of the Dirichlet problem for the Laplace equation in the form

X πn
P(ϕ, ζ) = cn sin(kn ϕ), kn = (30)
n=1
$
where the values of expansion coefficients cn can be determined by enforcing the boundary
condition
P(ϕ, Θ(ϕ)) = 1 (31)

(
ζ = ln tan θ/2
ϕ =ϕ

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 16 of 26 8 October 2014
Adding trade arrival dynamics: semi analytical solutions II
In order to compute the coefficients, we introduce the integrals
ˆ$
(kn +km )Z (ϕ)
Jmn = sin(km ϕ) sin(kn ϕ)e dϕ, (32)
0
ˆ$
km Z (ϕ)
Im = sin(km ϕ)e dϕ. (33)
0
Then the boundary condition (31) becomes
X
Jmn cn = Im , (34)
n

and cn can be computed by matrix inversion as c = J −1 I.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 17 of 26 8 October 2014
Calibration: putting everything together

Book event probabilities as a function of the bid-ask imbalance:


• left region of the plot, price improvement is likely: get ready to reprice;
• central region of the plot, a trade on the near side is likely to anticipate an adverse price move:
stay posted;
• right region of the plot: consider crossing the spread.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 18 of 26 8 October 2014
Calibration: the role of correlation

• Correlation is the main effect responsible for the symmetry breaking in the evolution of the
price expectation as a function of imbalance.
• It can also explain a big part of the adverse selection effect which we observe when posting
orders in a limit order book.
• The model can capture the main features of symmetry breaking in the trade arrival process.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 19 of 26 8 October 2014
Conclusions and current work: from trade arrival rates to empirical fill
probabilities

Empirical fill probabilities, learning from our own execution data:


• real data tends to be noisy, but it displays consistent trends
• parametric forms of fill probabilities as a function of the limit order placement x can be
estimated, i.e. P(x) = 1 − e−βx

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 20 of 26 8 October 2014
Conclusions and current work: from empirical fill probabilities to optimization
schedules, a dynamic programming approach

Given an approximate functional form for the fill probability, we can solve the recursive optimization
problem given by:

E[Pi ] = min ((1 − p(x))E[Pi+1 ] + p(x)x) (35)


x

where p(x) is the fill probability of a limit order with a limit price of x.
• what is the optimal placement of a limit order? (blue line)
• what is the expected fill price? (red line)

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 21 of 26 8 October 2014
Conclusions and current work: optimizing thresholds

Going on step further, parameter selection and price slippage estimation as a function of the slice
size:
• as expected, larger slices will produce a larger slippage;
• the optimal trade off between waiting and crossing the spread depends on the size of the slice
to be executed;
• an optimal ridge in the parameter space can be calculated under certain assumptions.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 22 of 26 8 October 2014
Appendix: order flow and impact

We can also attempt to predict price movements and arrival times by conditioning on local
measurements of prevailing order flows rather than book imbalance.
Bank of America Merrill Lynch
U. Pesavento, Bank of America Merrill Lynch 23 of 26 8 October 2014
Appendix: a time dependent slice of the problem

Average time evolution of the mid-price across a trade event:


• Before trade arrival prices tends to drift towards the near side of the book;
• At trade arrival impact dominates and prices moves towards the far side of the book.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 24 of 26 8 October 2014
Appendix: queues depletion and replenishment

Depletion of the bid and ask queues across bid up-ticks and down-tick price movements:
• down-tick move, the initial queue size is thin while the next layer if fully formed;
• up-tick move, the previous layer is fully formed and the next queue distribution is thin;
• the ask queue is statistically unaffected.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 25 of 26 8 October 2014
References
[1] A. Lipton, U. Pesavento, M.Sotiropoulos, Risk, April, 2014.

[2] R. Almgren, C. Thum, H. L. Hauptmann, and H. Li. Equity market impact. Risk, 18:57, 2005.

[3] M. Avellaneda and S. Stoikov. High-frequency trading in a limit order book. Quantitative
Finance, 8:217–224, 2008.

[4] J.-P. Bouchaud, J. D. Farmer, and F. Lillo. How markets slowly digest changes in supply and
demand. In T. Hens and K Schenk-Hoppe, editors, Handbook of Financial Markets: Dynamics
and Evolution.
[5] J.-P. Bouchaud, D. Mezard, and M. Potters. Statistical properties of stock order books:
empirical results and models. Quantitative Finance Finance, 2:251–256, 2002.

[6] R. Cont and A. de Larrard. Order book dynamics in liquid markets: limit theorems and
diffusion approximations. Working paper, 2012.

[7] R. F. Engle. The econometrics of ultra-high frequency data. Econometrica, 68:1–22, 2000.

[8] J. Hasbrouck. Measuring the information content of stock trades. Journal of Finance,
46:179–207, 1991.

[9] A. Lipton and I. Savescu. CDSs, CVA and DVA - a structural approach. Risk, 26(4), 2013.

[10] S. Stoikov R. Cont and R. Talreja. A stochastic model for order book dynamics. Operations
research, 58(3):549–563, 2010.

[11] E. Smith, J.D. Farmer, L. Gillemot, and S. Krishnamurthy. Statistical theory of the continuous
double auction. Quantitative Finance, 3:481– 514, 2003.

Bank of America Merrill Lynch


U. Pesavento, Bank of America Merrill Lynch 26 of 26 8 October 2014

You might also like