
Order Anticipation around Predictable Trades∗

Mehmet Sağlam†
Lindner College of Business
University of Cincinnati
email: mehmet.saglam@uc.edu

Initial Version: August 2016


This Version: August 2018

Abstract

I study the presence of order anticipation strategies by examining predictable patterns in


large order trades. I construct three simple signals based on child-order execution patterns and
find empirical evidence that stronger signals are correlated with higher execution costs. I use
the SEC’s ban on unfiltered access and an increase in noise trading as shocks to order anticipatory
activities of algorithmic traders and show that the price impact of predictability is smaller when
order anticipation becomes difficult. The empirical findings are mostly consistent with the back-
running theory which predicts delayed price impact as strategic traders learn about the large
order gradually.

Keywords: Order Anticipation, Algorithmic Trading, Predatory Trading, Back-running


JEL Classification: G12, G14.


I am grateful for helpful comments from Robert Battalio, Hendrik Bessembinder, Jonathan Brogaard (EFM
discussant), Shane Conway, Brian Hatch, Terry Hendershott, Peter Hoffmann, Albert Menkveld, Maureen O’Hara,
Richard Philip (FIRN discussant), Vikas Raman (FMA discussant), Andriy Shkilko, Elvira Sojli, Mao Ye, and Haoxi-
ang Zhu and conference participants at Econometrics of Financial Markets, FIRN 2017 Sydney Market Microstructure
Meeting and FMA 2017.

Please address correspondence to: Mehmet Sağlam, University of Cincinnati, 2925 Campus Green Drive, Cincin-
nati, OH, 45221-0195, Phone: (513) 556-9108; Fax: (513) 556-0979; Email: mehmet.saglam@uc.edu

1. Introduction

The recent advances in trading technology combined with machine learning theory have provided

tools to sophisticated investors to extract valuable signals about asset prices and make dynamic

trading decisions at high frequency. Usually referred to as high-frequency traders (HFTs), they

follow various trading strategies ranging from market making to arbitrage trading between multiple

assets or venues. In response to their rise in trading activity in multiple markets, there have been

several academic studies to decipher the costs and benefits associated with this new form of trading.1

The empirical evidence has been mostly positive by pointing to smaller bid-ask spreads and faster

price discovery. However, the implications are still controversial especially in public perception

thanks in large part to the appearance of Michael Lewis and his book, Flash Boys, in mainstream

media arguing that “the stock market is rigged.” At the center of this debate was the potential

ability of these strategic algorithmic traders to use order anticipation strategies to sniff out large

orders and exploit this information to profit at the expense of other investors.

Schedule-based algorithms aiming to match time-weighted average price (TWAP) or volume-

weighted average price (VWAP) in the market are believed to be the main source of information

leakage in large order executions. In a recent survey done by ITG, a financial technology com-

pany implementing algorithmic execution services, roughly 50% of buy side investors reported that

their biggest source of information leakage occurs in schedule-based algorithms.2 In some cases,

predictable patterns emerging from these algorithms can even be discerned by human traders. On

July 19, 2012, four large-cap stocks, Coca-Cola, IBM, McDonald’s and Apple, displayed identical

price patterns potentially due to a schedule-based algorithm.3 In Figure 1, I plot the trade prices

for Coca-Cola. In odd half-hour intervals, the price is roughly decreasing and in even half-hour in-

tervals the price is increasing. Interestingly, the peaks occur roughly at the half-hour mark whereas

the lows come at the start of the hour. These price dynamics can be exploited by strategic traders.
1. For example, Hendershott et al. (2011) provide empirical evidence that algorithmic trading (AT) reduces spreads
and adverse selection using an exogenous event in NYSE’s quote dissemination. Brogaard et al. (2014b) find that
HFTs increase price efficiency by trading in the direction of permanent price changes. Jones (2013) and Menkveld
(2016) survey the empirical and theoretical literature in high-frequency trading.
2. “Put a Lid on it: New Way to Measure Information Leakage”, ITG, August 2017.
3. “Sawtooth Trading Hits Coke, IBM, McDonald’s, and Apple Shares”, The Wall Street Journal, July 19, 2012.

[Figure 1 plot: Price (vertical axis, roughly 76.4 to 77.6) against Hours (horizontal axis, 0 to 6.5).]

Figure 1: This figure illustrates the trade prices for Coca-Cola on July 19, 2012 from the TAQ database. There
is a strong sawtooth trading pattern potentially caused by schedule-based algorithms.

For example, in anticipation of the continuing pattern, the low price corresponding to t = 4.5 occurs

a few minutes earlier leading to a trade priced at $76.75 which is significantly lower than its past

30-minute average price. Overall, this real-world example provides evidence that some strategic

traders may take advantage of such patterns with further help from machine-learning techniques.

Investors and policymakers are concerned with these order anticipation strategies due to poten-

tial negative effects on price discovery and liquidity. If an adversary trader can infer the presence

of a large buy order submitted by a long-term investor for liquidity reasons, he can trade along

with the investor during the initial period of the execution to overshoot the price and then sell

back to the investor his accumulated position at the elevated prices to achieve roughly riskless

profits. In return, this means larger execution costs for the long-term investors as examined in

Brunnermeier and Pedersen (2005). If the long-term investor is trading due to private information

that he generated with costly effort, then a strategic adversary can extract this information using

order anticipation strategies and share in the resulting profit at the expense of the investor. This

implies that the incentive of the investor to invest in information acquisition would decrease in the

presence of order anticipation. In this case, as argued by Stiglitz (2014) and Weller (2015), the

markets can be less informative if algorithmic traders can share the information rents of the fun-

damental investors who spend resources to obtain information about the real economy. In order to

differentiate these two different motivations, I will use “predatory trading” and “back-running” (as

introduced by Yang and Zhu (2015)) to refer to the first and second types of anticipatory trading,

respectively. In predatory trading, the resulting price impact is expected to be temporary as the

investor is not informed whereas in the presence of back-running, the price impact is permanent

due to the information-based trading.

In theory, order anticipation strategies may also lead to desirable market liquidity conditions

usually referred to as “sunshine trading.” This possibility is often ignored in the academic dis-

cussions surrounding the effects of order anticipation. If a large order execution is submitted by

an uninformed investor, predictable executions may actually motivate market makers to provide

additional liquidity knowing that there is no adverse selection risk. Admati and Pfleiderer (1991)

formalize this theory and illustrate that public announcement of an uninformed liquidation may

reduce trading costs. They directly assume that market makers have perfect knowledge about the

informationless liquidation ex ante. In the context of large order executions, anticipatory traders

can gradually learn whether the large execution is informed or not by analyzing the price impact of

the past trades. Depending on the accuracy of this exploration process, algorithmic traders may be

incentivized to provide more liquidity during the lifetime of the execution. In the empirical liter-

ature, only the perfect information case has been extensively studied. For example, Bessembinder

et al. (2016) find supporting evidence of sunshine trading in large executions occurring in crude oil

ETF rolls.

Given the mixed implications of predictable trading both theoretically and empirically, it is

important to study the net effects arising from recurring patterns in the order flow data. In this

paper, I empirically investigate whether predictable patterns in large order executions lead to higher

or lower trading costs. Using natural experiments, I construct the potential channels between order

anticipation and predictable patterns. I then examine the consistency of my empirical findings

with predatory trading, back-running and sunshine trading. The empirical analysis utilizes more

than 20,000 parent-orders constituting more than 2.5 million child-order executions. My sample

includes 15 months of data on liquid S&P 500 stocks from January 1, 2011 to March 31, 2012. The

dataset consists of large orders submitted by 146 distinct investors comprised of mostly institutional

portfolio managers. All orders in the dataset are executed to match the VWAP realized during the

lifetime of the parent-order. Average order size is roughly $1 million and corresponds to roughly

1.8% of the volume traded during the execution.

If algorithmic traders follow anticipatory trading, their activities should be particularly easy

to detect in predictable executions. Here, I define execution predictability as the likelihood that a

strategic trader can succeed in inferring the presence of a large order execution. I provide

three simple signals that quantify this measure of predictability utilizing statistics from the child-

orders of the execution. These signals are constructed based on the recurring intuition about

minimizing information leakage.

The first signal is computed using the volatility of the size of the child-order trades. For

example, if a large order is executed in a series of equal trade sizes, e.g., 150 shares, the pattern

recognition algorithms may deduce the existence of a large order with high likelihood. Second,

using a similar intuition, I consider the regularity of the trading intervals as a signal that can leak

order-size information and propose a signal that computes the volatility of time intervals between

successive trades. This is motivated by the empirical evidence in the literature that execution

algorithms exhibit robust clock-time periodicity, the tendency to make trades around full-seconds

or half-seconds as identified by Hasbrouck and Saar (2013). It is also possible to consider the

order size and trading frequency at the same time and propose an aggregate measure involving

the volatility of trading rate. Thus, the final signal is based on the correlation between executed

quantity and elapsed time and will be very close to 1 for executions with almost constant trading

rate.
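
As a rough illustration of the three signals described above, the following Python sketch computes them from child-order fill times and sizes. The exact definitions used in the paper appear in Section 4; here the coefficient of variation stands in for "volatility" as an assumption, and all function and key names are hypothetical. Under these assumptions, lower dispersion and a correlation close to 1 would indicate a more predictable execution.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed proxy for 'volatility')."""
    return statistics.pstdev(values) / statistics.mean(values)


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def predictability_signals(times, sizes):
    """times: sorted fill times in seconds; sizes: shares per child-order fill."""
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    elapsed = [t - times[0] for t in times]
    cumulative, total = [], 0
    for shares in sizes:
        total += shares
        cumulative.append(total)
    return {
        "size_dispersion": coefficient_of_variation(sizes),      # signal 1
        "interval_dispersion": coefficient_of_variation(gaps),   # signal 2
        "rate_correlation": pearson(elapsed, cumulative),        # signal 3
    }


# Perfectly regular execution: 150 shares every 60 seconds.
regular = predictability_signals([0, 60, 120, 180, 240], [150] * 5)
# Irregular execution with varying sizes and gaps.
irregular = predictability_signals([0, 5, 200, 210, 900], [100, 400, 50, 300, 150])
```

The regular schedule has zero dispersion in both sizes and intervals and a rate correlation of 1, while the irregular schedule scores visibly lower on all three measures of predictability.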

It is worthwhile to emphasize that in the optimal execution literature, some of these measures

are argued to be factors that actually decrease execution costs. However, these results are based on

the absence of any other competing traders in the marketplace. For example, in the seminal paper

on the optimal liquidation of a large block of shares, Bertsimas and Lo (1998) illustrate that an equal-

partitioning policy is optimal if the price impact is permanent and linear. This suggests that the

volatility of the traded quantities of child-orders or trading intervals should be at a minimum. Similar

deterministic algorithms are proposed in order to accommodate U-shaped volume profiles in the

markets, e.g., VWAP. All three signals are particularly motivated by information leakage due to

deterministic patterns and the presence of strategic traders with order anticipation skills would be

the most significant factor increasing the cost of the predictable executions.
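
The Bertsimas and Lo (1998) benchmark mentioned above can be checked numerically. The sketch below assumes a buy order, zero-mean price noise (so only expected prices matter), no competing strategic traders, and a linear permanent impact coefficient `theta`. The parameter values and the function name are illustrative and are not taken from the paper.

```python
def expected_cost(schedule, p0, theta):
    """Expected cost of buying according to `schedule` (shares per period).

    Each trade of s shares moves the price permanently by theta * s,
    and the trade is filled at the post-impact price.
    """
    price = p0
    cost = 0.0
    for shares in schedule:
        price += theta * shares  # linear, permanent impact
        cost += price * shares
    return cost


p0, theta = 50.0, 0.0001           # hypothetical starting price and impact coefficient
equal = [150, 150, 150, 150]       # equal partitioning of 600 shares
front_loaded = [300, 150, 100, 50]
back_loaded = [50, 100, 150, 300]

costs = {
    "equal": expected_cost(equal, p0, theta),
    "front_loaded": expected_cost(front_loaded, p0, theta),
    "back_loaded": expected_cost(back_loaded, p0, theta),
}
```

In this setting the expected cost depends on the schedule only through the sum of squared trade sizes, which is why the equal split is cheapest. The comparison says nothing about the case studied in this paper, where strategic traders can detect the regular pattern.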

This paper contributes to the literature by providing robust signals that can quantify execution

predictability and analyzing their impact on execution costs using liquid S&P 500 stocks. Since the

execution strategy is known and unique along with a single broker, the dataset allows me to clearly

verify the impact of predictable executions. The empirical findings are consistent with earlier

literature studying the link between HFT activity and institutional trading costs and uncover

another direct channel for the cost increase through execution predictability. Using a diverse

universe of institutional investors in terms of short-term trading skill, I find evidence of back-running

strategies and the cost increase is economically significant. Analyzing uninformed executions, I do

not find evidence of sunshine trading, i.e., predictable uninformed executions do not have lower

trading costs. Specifically, the empirical analysis provides four main contributions.

First, the signals are significantly correlated with execution costs after controlling for the stan-

dard determinants of price impact implying the potential presence of order anticipation. As the

predictability of the execution goes up, I find that the execution cost measured by its implemen-

tation shortfall, the percentage deviation of the average execution price from its starting price,

increases by economically substantial amounts. The median execution cost is 2.7 bps and I find

that a one-standard deviation increase in my signals increases the execution cost by a range of 1.6

to 1.8 bps after controlling for a rich set of execution level statistics including the use of marketable

and passive limit orders. Further, I show that lagged signals predict higher execution costs to

control for potential contemporaneous correlation between illiquidity and signals.
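
For concreteness, implementation shortfall as defined above can be computed as in the following sketch. The sign convention (positive values mean a cost for both buys and sells) and the function name are assumptions made for this illustration.

```python
def implementation_shortfall_bps(avg_exec_price, start_price, side):
    """Percentage deviation of the average execution price from the
    starting price, in basis points, signed so that a cost is positive."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1 if side == "buy" else -1
    return sign * (avg_exec_price - start_price) / start_price * 1e4


# A buy order that starts at 77.00 and fills at an average of 77.02
buy_cost = implementation_shortfall_bps(77.02, 77.00, "buy")
# A sell order that starts at 77.00 and fills at an average of 76.98
sell_cost = implementation_shortfall_bps(76.98, 77.00, "sell")
```

Both hypothetical orders cost roughly 2.6 bps, which is close to the median execution cost of 2.7 bps reported above.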

Second, I exploit the SEC ban on unfiltered access and an exogenous increase in noise trading to

directly link the increase in trading costs to successful order anticipation ability. In the presence

of the SEC ban, supervised access to the market centers imposes additional controls and latency.

These speed bumps slow down the order submission of some fast algorithmic traders, limiting their

order anticipation abilities. I verify that proxies of AT activity point to a significant reduction

after the ban and show that when order anticipation becomes more difficult after the ban, the price

impact of predictability drops. Further, order anticipation ability will be weaker when there is an

increase in noise trading volume. Using the potential behavioral bias of traders to round prices, I

document that when noise trading disguises the patterns leaking from large order executions, the

cost of predictability again decreases.

Third, I decompose the realized cost increase into permanent and temporary components for

each signal. I find that the permanent component explains most of the cost increase suggesting that

these signals may be leaking the potential private information of the investor to the back-runners.

Further, I find that the signals are correlated with the cost increase realized at the later stage of

the execution implying a learning period and delayed reaction by back-runners. These results are

broadly consistent with back-running theory.

Fourth, I find that the broker responsible for the executions is mostly successful in achieving the

main objective of the execution strategy, i.e., minimizing the deviation between average execution

price and the market VWAP. I observe that the correlation between implementation shortfall and

VWAP slippage is negligible. This finding implies that execution predictability may still emerge in

equilibrium despite sharing informational rents with strategic traders.

These contributions have important implications for market structure. In the presence of order

anticipation, fundamental investors may either shy away from information acquisition or engage

in costly encryption strategies to minimize information leakage. In this case, policymakers can

potentially minimize this aggregate social cost as a centralized social planner. If algorithmic traders

are using public order flow data such as trade sizes and timestamps in the SIP, these data might

be rearranged to minimize its signal-to-noise ratio. For example, trade sizes and their timestamps

can be either published with noise or aggregated across different trades by the reporting market

venues. Specifically, Harris (2013) proposes that markets should report only approximate trade

sizes within various buckets or only aggregated volumes at 5-minute time intervals.

The methodology and the findings of the paper can be also utilized in the study of the Consol-

idated Audit Trail (CAT) that will be updated significantly in November 2018. On November 15,

2016, the SEC approved a plan to implement a software system that includes information about

the identity of the traders.4 Thus, it is expected that the regulators will soon be able to test theories

related to order anticipation using an extensive dataset. The proposed signals in this paper and

their underlying intuition can be influential for future studies.

This paper is primarily related to the growing empirical literature analyzing the relationship

between institutional trading costs and HFTs’ activities. Van Kervel and Menkveld (2015) study the

cost dynamics of large orders by studying the trading behavior of the HFTs during their execution.

They find that HFTs initially provide liquidity to the execution but eventually reverse their trading to

the same direction as the large order. My paper differs from this study by providing a specific

mechanism on how the HFTs may learn about the presence of the large order. Van Kervel and

Menkveld (2015) use trading data reported from four institutional investors based in Sweden who

might have used various trading algorithms due to different brokers or trading needs. Instead, my

dataset uses trading data from a much larger set of 146 distinct investors that utilize a specific

broker’s single trading algorithm. The benefit of focusing on one algorithm is that it removes

the variation in execution costs due to heterogeneity across brokers and their various algorithms.

A larger investor base helps me to study the impact across a heterogeneous group of informed and

uninformed traders. I focus on the footprints left by the trading algorithm whereas Van Kervel and

Menkveld (2015) examine how HFTs’ trading activity evolves during the execution.

Brogaard et al. (2014a) examine the shocks to the latency of London Stock Exchange that

increase HFT activity over time and study the variation of institutional trading costs around these

changes using a dataset provided by Abel Noser, a consulting firm specialized in the analysis

of institutional trading costs.5 They cannot find any clear evidence of change in trading costs

during these latency upgrades. In a closely related study to Brogaard et al. (2014a), Tong (2015)

documents evidence that institutional trading cost is higher as the HFT activity increases using Abel

Noser and NASDAQ HFT datasets. Tong (2015) also analyzes the relationship at the stock-day

or parent-order level due to unavailability of the child-order data. Instead, my analysis focuses on
4. http://www.wsj.com/articles/sec-to-vote-on-consolidated-audit-trail-to-detect-market-manipulation-1479240411
5. Hu et al. (Forthcoming) provide detailed information about the Abel Noser dataset and survey the academic
papers that utilize this dataset.

identifying predictable patterns using child-order data and illustrates an alternative perspective that

institutional investors may ease exploration activities of the algorithmic traders with predictable

trading.

Hirschey (2013) finds empirical evidence that HFTs may increase the trading cost of

non-HFTs by trading ahead of them. He obtains his dataset from NASDAQ, which labels trading

firms as either HFT or non-HFT. The data is focused on this distinction and does not allow him

to identify a specific instance of a large order execution and its corresponding cost. For this reason,

his paper does not consider whether non-HFTs leave footprints while trading large orders.

Korajczyk and Murphy (2015) use order-level data with masked trader ids provided by

the Investment Industry Regulatory Organization of Canada, a regulatory organization for Canada’s

equity markets. As in the case of Hirschey (2013), the dataset does not fully reveal how a parent

order is split into child orders but they try to identify both large institutional orders and the set

of HFTs using a reasonable set of assumptions. Using these labels, they find that HFTs initially

provide liquidity to the large order but then compete with it due to inventory management and

back-running. My dataset exactly identifies the series of the child-orders of the parent order.

Constructing the signals for predictability, I document another channel for the potential increase

in execution costs.

The rest of the paper is organized as follows: Section 2 reviews the relevant theoretical models

that allow me to form my hypotheses regarding the empirical analysis on order anticipation. In

Section 3, I describe the dataset and provide its summary statistics. Section 4 includes the detailed

information about the construction of the signals and uses multivariate analysis to test whether

the presence of the signals leads to higher execution costs. Section 5 examines the impact of

reduced order anticipatory activities on the cost of predictable executions after a regulatory shock

slowed down the order submissions of a group of algorithmic traders. Section 6 illustrates that

when patterns cannot be detected easily, the cost of predictability drops. Section 7 examines the

relationship between the empirical results and the theoretical models summarized in Section 2.

Section 8 examines the performance of the algorithm and illustrates that execution predictability

may emerge in equilibrium under misaligned objectives. Finally, I conclude in Section 9.

2. Theoretical Framework and Hypotheses Development

In this section, I review the relevant theoretical models that guide my empirical analysis on order

anticipation. I summarize my expectations from the perspectives of three competing theories:

sunshine trading, predatory trading and back-running. Sunshine trading implies lower trading costs

in predictable executions whereas the remaining two imply higher trading costs. One difference

between predatory trading and back-running is the type of price impact: a temporary impact due to

urgent liquidity trading in the former, or a permanent impact due to private information in the latter.

Another difference between predatory trading and back-running is the timing of the price impact.

Early impact on prices is expected in predatory trading whereas there will be delayed price impact

in back-running due to initial learning about the large order.

Brunnermeier and Pedersen (2005) provide a model of predatory trading in which a distressed

large investor is forced to sell his position and other strategic adversaries aim to exploit this

liquidity need. Instead of providing liquidity, predatory traders trade in the same direction initially

and cause the price to decrease further. At further depressed prices, predatory traders then buy

from the investor, which ultimately increases the total liquidation cost of the investor.6

Admati and Pfleiderer (1991) provide an alternative theory of forced liquidation based on sun-

shine trading in which investors can potentially signal that their trade is not motivated by private

information. This announcement can induce other strategic traders to provide liquidity and may

actually reduce trading costs. Specifically, Degryse et al. (2014) illustrate that order splitting can

serve as a noisy form of preannouncing trades. Bessembinder et al. (2016) extend the model of

Brunnermeier and Pedersen (2005) for resilient markets in which the immediate price impact of

trades may be transitory. In this model, in addition to the same-side trading before the liquidation,

the strategic anticipators trade in the opposite direction to the liquidator and decrease the

liquidator’s transitory price impact if the market is largely resilient. This benefit to the liquidator from

strategic trading persists at any level of market resiliency if there are multiple strategic traders. In

the context of large order executions, these theories suggest potential decrease in trading costs for
6. Moallemi et al. (2012) consider a model with a predatory trader who does not have perfect information about
the large order but learns about it from the noisy price impact of the past trades.

predictable executions.

If predatory trading increases the cost of the execution for an uninformed order, one would

expect this cost to be transitory. This implies that if predictable executions induce the stock price

to go up during a large buy order, it will be expected to revert back to its pre-execution level after a

short period. However, if the large order is submitted by an informed investor, the price changes due

to predictable patterns should be permanent. Such a permanent price impact would be consistent

with the back-running theory presented in Madrigal (1996) and Yang and Zhu (2015). These papers

use two-period Kyle models in which an informed investor trades on private information in the first

period and strategic traders receive an imperfect signal about this in the second period. They then

exploit this signal to trade in the same direction implied by the private information of the investor.

In this sense, adversary traders are trying to “steal” the private information of the investor by

inferring from the past order flow.

Yang and Zhu (2015) illustrate that the optimal strategy for the investor facing the risk of leaking

valuable order flow information is to randomize his first period trade. Back-running theory implies

that any additional price impact due to a predictable signal in a large order execution should be

permanent as the prices ultimately reflect the private information of the investor. Although both

predatory trading and back-running imply higher trading costs, they have particularly different

implications with regard to initial price movements during the execution. In predatory trading,

one would expect larger price impact during the initial period as adversary traders have perfect

information about the liquidity trade of the distressed investor and they would trade in the same

direction of the large order at the beginning of the execution. However, in back-running, there is

a learning period initially, so the price impact due to the adversary traders would be delayed until

the later stages of the execution. This difference can therefore also be exploited to distinguish between

these two theories.

The theoretical model in Yang and Zhu (2015) directly links the success of the back-runner

to the volatility of the signal. When the signal’s volatility is higher, the back-running ability and

the resulting profit become lower. Therefore, this model implies that even though an algorithm

may be using the same instructions from the same code, market conditions can determine the net

impact on trading costs. For example, the algorithm’s footprints can be detected with less accuracy

during an increased activity from noise traders. Relatedly, predictable signals can be exploited with

higher success if the adversary trader is faster both in processing the patterns and trading on them.

Aït-Sahalia and Sağlam (2013) study this effect in the context of market-making employed by a

high-frequency trader. The high-frequency market maker is able to predict a low-frequency trader’s

liquidity need with an imperfect signal. When the market maker is confident that he is going to buy

from an impatient trader, he lowers his bid quote before his order reaches the market. Aït-Sahalia

and Sağlam (2013) show that when the market maker is faster, such strategic quote widening also

occurs more frequently. This theory implies that predictable signals will lead to lower trading costs

when the adversary traders lose their speed advantage. Overall, these two papers directly imply

that when order anticipation becomes more difficult because of noisier signals or slower trading

technology, the cost of predictability decreases.

3. Data

I compile the data from several sources. Stock returns, volume, outstanding shares and prices

come from the Center for Research in Security Prices (CRSP). Intraday trade and quote data come

from the Trade and Quote (TAQ) database. The proprietary data on institutional large trades is

provided by the global execution desk of a large investment bank. In the next subsection, I describe

this dataset.

3.1. Proprietary Execution Data

For my empirical study, I use detailed execution data from the historical order databases of a large

investment bank (“the broker”) in the US. The broker is one of the top five banks in the United

States with respect to total assets and is one of the top five providers of execution services by

market share.

The orders originate mainly from institutional portfolio managers but the investor identities are

masked. All of the large orders in this dataset are executed according to a single execution strategy,

VWAP. This algorithm is the most commonly employed strategy, constituting roughly 50% of all

of the broker’s execution volume. According to this strategy, parent-orders are executed in smaller

child-order trades over the course of a trading day to achieve an average execution price that is as

close as possible to the volume-weighted average price observed in the whole market during this

trading period.

Through personal communication with the data provider, I obtained the implementation details

of the VWAP algorithm. First, the client submits a large order with either a target completion time

or an urgency score in the broker’s order submission platform. The algorithm then slices the large

order using the historical volume curve over the past month between the initiation of the order and

targeted completion time. The algorithm does not explicitly model volume spikes around scheduled

announcements. In this set-up, the client may affect the initiation of the VWAP algorithm with

order size, start time and targeted end time, but once the order is initiated, the clients do not have

any control over how the size, price, or timing of each child order is selected. In summary, the client

may shape the parameters of the VWAP algorithm implying that execution costs can depend on

the client identity.
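
The slicing step described above can be sketched as follows. This is only a schematic under stated assumptions: the broker's actual algorithm is proprietary, the historical volume curve is represented here as integer share volumes per time bucket between the start and target end time, and the function name and the rounding rule are hypothetical. Child-order pricing, timing within a bucket, and venue selection are not modeled.

```python
def slice_by_volume_curve(order_size, curve):
    """Split `order_size` shares across time buckets in proportion to `curve`.

    curve: historical volume (integer shares) for each bucket between the
    order's start time and its target completion time.
    """
    total = sum(curve)
    # Integer share allocation per bucket, rounded down.
    slices = [order_size * volume // total for volume in curve]
    leftovers = [order_size * volume % total for volume in curve]
    remainder = order_size - sum(slices)
    # Hand the shares lost to rounding to the buckets with the largest leftovers.
    by_leftover = sorted(range(len(curve)), key=lambda i: leftovers[i], reverse=True)
    for i in by_leftover[:remainder]:
        slices[i] += 1
    return slices


# Hypothetical U-shaped volume curve over six buckets
u_shaped = [30, 20, 10, 10, 10, 20]
schedule = slice_by_volume_curve(10_000, u_shaped)
```

Because the schedule depends only on the historical curve and the client's start and end times, two orders with the same parameters would be sliced the same way, which is the kind of determinism the predictability signals are designed to pick up.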

This dataset provides rich attributes at the parent- and child-order level. At the parent-order

level, most of the statistics are based on the execution horizon. These statistics include order size,

direction of the order (buy or sell), order start and end times, participation rate (the ratio of order

size to the total volume during the trading interval), average execution price, proportional bid-ask

spread and mid-quote volatility based on the duration of the execution. For each parent-order, I also

have information at the child-order level. Child-order level statistics include the time (timestamped

to the millisecond), size, venue (market center) and price of each child trade.

Merging the information on child order executions with the quote data from TAQ, I infer

whether the trade execution was a result of the algorithm’s submission of a passive non-marketable

order or an aggressive marketable order. I label a buy (sell) execution at the child-level as a passive

order if the trade price was below (above) the NBBO mid-point at the time of the fill. Similarly,

I label a buy (sell) execution at the child-level as an aggressive order if the trade price was above

(below) the NBBO mid-point at the time of the fill. The remaining orders are filled at the

NBBO mid-point. I then create the corresponding parent-order level statistics, passive order ratio

(PO) and aggressive order (AO) ratio, by computing the fraction of the shares executed via each

order type.
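The labeling rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the tuple layout of a fill, and the tolerance `tol` used to decide that a price equals the mid-point are assumptions made for the example, and the NBBO mid-point is taken as already matched to each fill.

```python
def classify_fill(side, price, nbbo_mid, tol=1e-9):
    """Label one child fill as 'passive', 'aggressive', or 'mid'.

    side is 'buy' or 'sell'; tol is a hypothetical tolerance for
    treating the trade price as equal to the NBBO mid-point.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    diff = price - nbbo_mid
    if abs(diff) <= tol:
        return "mid"
    # A buy below the mid (or a sell above it) is labeled passive.
    favorable = diff < 0 if side == "buy" else diff > 0
    return "passive" if favorable else "aggressive"


def order_type_ratios(side, fills):
    """Return (PO, AO): fractions of shares filled passively and aggressively.

    fills is a list of (shares, price, nbbo_mid) tuples for one parent order.
    """
    total = sum(shares for shares, _, _ in fills)
    if total <= 0:
        raise ValueError("parent order must have positive executed shares")
    shares_by_label = {"passive": 0, "aggressive": 0, "mid": 0}
    for shares, price, mid in fills:
        shares_by_label[classify_fill(side, price, mid)] += shares
    return shares_by_label["passive"] / total, shares_by_label["aggressive"] / total
```

Because mid-point fills fall in neither category, PO and AO need not sum to one.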

The traded asset universe includes 498 stocks from the S&P 500 index. I keep executions with a duration

greater than 10 minutes but no longer than 6.5 hours, the duration of a regular trading day. All

executions occur between January 2011 and March 2012, inclusive. I also exclude executions that

have fewer than 5 child-order trades or a value below $50,000 at the arrival time of the order,

which corresponds to approximately 1,500 executions. I finally exclude an additional 200 executions

with missing entries of participation rate, spread, volatility, or duration.

This is the first research paper that is completely based on this dataset of detailed child-order

trades. The dataset has been partially used in Sağlam et al. (2014) and Sağlam and Tuzun (2018).

In Sağlam et al. (2014), the child-order dataset has been used to examine whether skilled investors

prefer to trade in dark pools. In Sağlam and Tuzun (2018), the dataset has been used to examine

the price resiliency of stocks with high ETF ownership.

3.2. Summary Statistics

The final sample consists of 20,335 executions coming from 9,856 buy and 10,479 sell orders on

498 stocks. There are 146 distinct investors submitting the orders. Each investor has at least 1

execution and at most 477 executions. Table 1 provides additional summary statistics for my final

execution data.

On average, a parent-order has a value of roughly $1 million and is executed in 128 child trades

with an average participation rate of 1.8%. The mean duration of the executions is a little above

3 hours. Overall, these statistics are similar in terms of order of magnitude when compared to other

datasets studied in the contemporaneous literature, for example, the Canadian and the Swedish

dataset used by Korajczyk and Murphy (2015) and Van Kervel and Menkveld (2015), respectively.

3.3. Measurement of Institutional Trading Costs

In this section, I introduce the most commonly used cost measure for institutional trading, imple-

mentation shortfall (IS). Perold (1988) introduced this measure to quantify the difference between

the performance of a theoretical and the implemented portfolio. Over the years, IS has gained

popularity especially as a proxy for institutional trading cost. It is computed as the normalized

difference between the average execution price and the price of the asset prior to the start of the

execution. Formally, the IS of the ith parent-order is given by

$$\mathrm{IS}_i = \operatorname{sgn}(Q_i)\,\frac{P_i^{\mathrm{avg}} - P_{i,0}}{P_{i,0}}, \tag{1}$$

where Qi is the order size with Qi > 0 (Qi < 0) for buy (sell) orders, Piavg is the volume-weighted

execution price of the parent-order and Pi,0 is the mid-quote price of the security (arrival price)

when the parent order starts being executed.
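As a minimal sketch of this definition, the following Python function computes IS in basis points from a list of child fills. The function name and the input layout are assumptions made for the example; the formula itself follows Equation (1).

```python
def implementation_shortfall_bps(signed_qty, child_fills, arrival_mid):
    """Implementation shortfall of one parent order, in basis points.

    signed_qty  : parent order size, positive for buys and negative for sells
    child_fills : list of (shares, price) tuples for the child trades
    arrival_mid : mid-quote price when the parent order starts executing
    """
    if signed_qty == 0:
        raise ValueError("order size must be non-zero")
    total_shares = sum(shares for shares, _ in child_fills)
    # Volume-weighted average execution price of the parent order.
    avg_price = sum(shares * price for shares, price in child_fills) / total_shares
    sign = 1.0 if signed_qty > 0 else -1.0
    return 1e4 * sign * (avg_price - arrival_mid) / arrival_mid
```

For instance, a buy order filled at an average price of 50.02 against an arrival mid-quote of 50.00 has an IS of about 4 bps, while the same fills for a sell order give about -4 bps.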

Table 1 provides the summary statistics of IS. The mean (median) IS is roughly 3.1 (2.7)

bps. The empirical literature on institutional trading costs often uses participation rate, bid-offer

spreads, volatility, order duration, turnover and market capitalization as the main drivers of IS.

For example, Almgren et al. (2005) include participation rate, bid-offer spread and volatility in

analyzing the variation in IS. Van Kervel and Menkveld (2015) use order duration, volatility and

turnover as control variables. Tong (2015) uses the logarithm of market capitalization as an additional

control variable. Most of these studies also use stock and client fixed effects. Following these

studies, I will use these variables as controls throughout my empirical analysis.

4. Identification of Predictable Executions

In this section, I consider certain execution characteristics of parent-orders which will potentially

allow strategic algorithmic traders to realize that there is a large order being traded. If the

child-order executions display a particular pattern with regard to randomness in order sizes or trading

intervals, strategic adversaries may exploit this information to obtain imperfect signals about the

presence of impatient, informed or uninformed investors. If these signals were at all informative,

one would expect them to have an ultimate effect on the price impact of the large order.

4.1. Persistent Trade Size

A strategic trader may infer from the noisy order flow that there is a large order being executed if

a series of child-orders is executed in a highly deterministic schedule, e.g., same trade size on each

child order. I expect that such executions can be subject to higher cost due to their predictable

nature as they may be exploited to generate imperfect signals about the information or urgency

level of the investors. In an opposing scenario, if the investor is uninformed, this predictability may

incentivize some algorithmic traders to provide liquidity as well. In order to test these hypotheses,

I will consider the standard deviation of the child-order sizes as a measure of potential information

leakage. Formally, for the ith parent-order in my data, I define


$$\mathrm{NegChildOrderVol}_i = -\sqrt{\frac{1}{N_i - 1}\sum_{j=1}^{N_i}\left(q_{i,j} - \bar{q}_i\right)^2}, \tag{2}$$

where $q_{i,1}, q_{i,2}, \ldots, q_{i,N_i}$ denote the percentage of the parent order executed with each child order

and $\bar{q}_i$ is the corresponding average of these values. Thus, NegChildOrderVol measures the sign-adjusted

volatility of relative child-order sizes which is comparable across executions. I add a

leading negative sign to match my prior that higher NegChildOrderVol (i.e., less randomness in

order size) should be correlated with higher costs.
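The signal in Equation (2) can be sketched as follows. The function name is hypothetical, and the example expresses child-order sizes as fractions of the parent order rather than percentages, which only rescales the signal by a constant.

```python
import statistics


def neg_child_order_vol(child_sizes):
    """Negative sample standard deviation of relative child-order sizes.

    child_sizes: executed shares of each child order in one parent order.
    Sizes are normalized by the parent-order total so that the measure is
    comparable across executions (fractions, not percentages, by assumption).
    """
    if len(child_sizes) < 2:
        raise ValueError("need at least two child orders")
    total = sum(child_sizes)
    fractions = [size / total for size in child_sizes]
    # statistics.stdev uses the N - 1 denominator, as in Equation (2).
    return -statistics.stdev(fractions)
```

An execution that always trades the same child size attains the maximum value of zero, while irregular child sizes push the signal further below zero.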

If strategic traders were able to detect the magnitude of NegChildOrderVol, I would expect

that an execution with high (low) NegChildOrderVol would be more (less) costly to trade. Figure

2 illustrates this conjecture from two sample executions from the dataset. On the left, I have

an execution that is in the top quintile when sorted on NegChildOrderVol. I observe that out of

more than 70 child orders, only 4 of these do not have a size of 100. Therefore, this execution,

relatively speaking, has a very high value of NegChildOrderVol. On the right, I have an execution

that is in the bottom quintile of NegChildOrderVol statistic. I observe that this execution is pretty

random in terms of traded child-order sizes and has ultimately low NegChildOrderVol. Comparing

the IS values, I find that the execution with high NegChildOrderVol has much higher cost which

is monotonically increasing during the lifetime of the execution. Of course, IS is a noisy measure

that can be affected by many other factors, but nevertheless this simple visual evidence illustrates

the intuition of the signal construction based on leaking valuable order flow information.

[Figure 2 plot omitted: two panels, each plotting Cumulative IS and Child Order Size against the child-order index; axes are Quantity and Implementation Shortfall (bps).]

Figure 2: This figure illustrates two executions with high (left) and low (right) values of NegChildOrderVol and compares their IS trajectory during the lifetime of the order. This figure serves as motivating visual evidence for the formal multivariate analysis. I expect the execution with high NegChildOrderVol to be costlier to execute.

4.2. Trading in Constant Intervals

Strategic traders may infer the presence of a large order if its child orders are being traded in nearly

constant trading intervals. For example, Hasbrouck and Saar (2013) report that agency algorithms

exhibit robust clock-time periodicity, the tendency to make trades around full-seconds or half-

seconds. Motivated by this finding, it is reasonable to expect that adversary traders analyzing

the time-stamps of the trades may obtain imperfect signals about the presence of informed or

impatient investors. Similar to the earlier measure on child-order sizes, I will consider the standard

deviation of the time between two consecutive child-orders as a measure of potential information

leakage. Formally, for the ith parent-order in my data, I define

$$\mathrm{NegIntervalVol}_i = -\sqrt{\frac{1}{N_i - 2}\sum_{j=1}^{N_i - 1}\left(\Delta t_{i,j+1} - \overline{\Delta t}_i\right)^2}, \tag{3}$$

where $\Delta t_{i,j+1} \triangleq t_{i,j+1} - t_{i,j}$ denotes the time elapsed between child-orders as a fraction of total

duration and $\overline{\Delta t}_i$ is the mean value of these time intervals. Thus, NegIntervalVol measures the

sign-adjusted volatility of normalized trading intervals between consecutive child-orders. I again

add a leading negative sign to match my prior that higher NegIntervalVol (i.e., nearly constant

trading intervals) should be correlated with higher costs. With this definition, high (low) values of

NegIntervalVol are associated with high (low) periodicity of child-order trades.

[Figure 3 plot omitted: two panels, each plotting Cumulative IS and Trade Intervals against the child-order trade index; axes are Time Elapsed since Last Trade (s) and Implementation Shortfall (bps).]

Figure 3: This figure illustrates two executions with high (left) and low (right) values of NegIntervalVol and compares their IS trajectory during the lifetime of the order. This figure serves as motivating visual evidence for the formal multivariate analysis. I expect the execution with high NegIntervalVol to be costlier to execute.
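A minimal sketch of the NegIntervalVol computation is given below. The function name is hypothetical, and the example assumes that total duration is measured from the first to the last child trade, which may differ from the parent-order start and end times recorded in the data.

```python
import statistics


def neg_interval_vol(trade_times):
    """Negative sample standard deviation of normalized inter-trade intervals.

    trade_times: chronologically sorted child-trade times (seconds) of one
    parent order. Intervals are divided by the time between the first and
    last child trade (an assumption made for this sketch).
    """
    if len(trade_times) < 3:
        raise ValueError("need at least three child trades")
    duration = trade_times[-1] - trade_times[0]
    gaps = [(later - earlier) / duration
            for earlier, later in zip(trade_times, trade_times[1:])]
    # N trades give N - 1 gaps, so the sample standard deviation
    # divides by N - 2, matching Equation (3).
    return -statistics.stdev(gaps)
```

An execution whose child trades are separated by exactly 10 seconds attains the maximum value of zero.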

Similar to the intuition developed for NegChildOrderVol, strategic traders may be able to infer

that a large order is being executed if they can also detect that a series of orders exhibits consistent

periodicity. I hypothesize that an execution with high NegIntervalVol may allow strategic traders

to form a strong belief about the presence of a large order. Figure 3 illustrates this conjecture

visually from two sample executions. On the left, I have an execution from the top quintile of

executions sorted on NegIntervalVol. I observe that each child order is separated by exactly 10

seconds throughout the execution of 23 child-orders. Therefore, this execution has zero volatility

with regards to trading intervals. On the right, I have an execution that is in the bottom quintile

of the NegIntervalVol statistic. I observe that the trading intervals of this execution fluctuate

considerably. I again find that the execution with high NegIntervalVol has roughly increasing IS during

the lifetime of the execution whereas the other execution ends with negative cost with more volatile

IS trajectory. This simple visual evidence is again consistent with the potential signal extraction

by algorithmic traders exploiting deterministic patterns.

4.3. Constant Trading Rate

Another measure of interest to adversary algorithmic traders may be the constancy of the execu-

tion’s trading rate. The earlier measures do not account for the joint dynamics of the executed

quantity and its timing and thus may not reflect much information on the dynamic nature of the

trading rate. I expect that executions with roughly constant trading rates may be again subject to

higher cost due to potential signal extractions by strategic algorithmic traders. As an approximate

measure for this tendency, I consider the correlation between cumulative executed quantity and the

elapsed time since the start of the execution. Formally, I define this correlation, QtyTimeCorrel, as

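$$\mathrm{QtyTimeCorrel}_i = \frac{\sum_{j=1}^{N_i}\left(q^{\mathrm{cum}}_{i,j} - \bar{q}^{\mathrm{cum}}_i\right)\left(t_{i,j} - \bar{t}_i\right)}{\sqrt{\sum_{j=1}^{N_i}\left(q^{\mathrm{cum}}_{i,j} - \bar{q}^{\mathrm{cum}}_i\right)^2}\,\sqrt{\sum_{j=1}^{N_i}\left(t_{i,j} - \bar{t}_i\right)^2}}, \tag{4}$$

where $q^{\mathrm{cum}}_{i,j} \triangleq \sum_{k=1}^{j} q_{i,k}$ denotes the cumulative sum of $q_{i,j}$ and $\bar{q}^{\mathrm{cum}}_i$ denotes its corresponding mean,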

respectively. Note that ti,j measures the time elapsed between the start of the ith execution and its

jth child-order trade. Thus, QtyTimeCorrel measures the correlation between cumulative executed

quantity and total time and high values of QtyTimeCorrel essentially signify a deterministic trading

schedule with fixed trading rate.
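The correlation in Equation (4) can be sketched directly from its definition. The function name is hypothetical. Since a correlation is invariant to rescaling, the example works with raw share quantities instead of percentages of the parent order.

```python
import math


def qty_time_correl(child_sizes, elapsed_times):
    """Pearson correlation between cumulative executed quantity and elapsed time.

    child_sizes   : executed quantity of each child trade, in time order
    elapsed_times : time elapsed from the start of the execution to each trade
    """
    if len(child_sizes) != len(elapsed_times) or len(child_sizes) < 2:
        raise ValueError("need two aligned sequences with at least two trades")
    cumulative, running = [], 0.0
    for size in child_sizes:
        running += size
        cumulative.append(running)
    n = len(cumulative)
    mean_q = sum(cumulative) / n
    mean_t = sum(elapsed_times) / n
    cov = sum((q - mean_q) * (t - mean_t)
              for q, t in zip(cumulative, elapsed_times))
    var_q = sum((q - mean_q) ** 2 for q in cumulative)
    var_t = sum((t - mean_t) ** 2 for t in elapsed_times)
    return cov / math.sqrt(var_q * var_t)
```

Equal child sizes traded at equal time steps give a correlation of one, whereas an execution that trades most of its quantity at the end gives a clearly lower value.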

Consistent with my earlier intuition, strategic traders will be able to exploit more information

leaking from an execution with high QtyTimeCorrel. Figure 4 illustrates this expectation visually

from my dataset. On the left, I have an execution from the top quintile of executions sorted on

QtyTimeCorrel. I observe that the correlation between executed quantity and time is nearly perfect,

i.e., QtyTimeCorrel is roughly one. On the right, I have an execution that is in the bottom quintile

of the QtyTimeCorrel statistic. I observe that the trading rate for this execution is much higher in the

second half of the execution. Consistent with my findings so far, I again observe that the IS of the

predictable execution is higher than its unpredictable counterpart.

[Figure 4 plot omitted: two panels, each plotting Cumulative IS and Cumulative Quantity against Time (mins); axes are Quantity and Implementation Shortfall (bps).]

Figure 4: This figure illustrates two executions with high (left) and low (right) values of QtyTimeCorrel and compares their IS trajectory during the lifetime of the order. This figure serves as motivating visual evidence for the formal multivariate analysis. I expect the execution with high QtyTimeCorrel to be costlier to execute.

4.4. Multivariate Regressions

Execution cost can be a function of multiple trade-level and stock-level characteristics. Thus, to test

formally whether each signal is associated with higher IS, I run the following multivariate regression

at the execution level with a rich set of control variables:

$$\mathrm{IS}_i = \alpha + \beta\,\mathrm{Signal}_i + \sum_{j}\delta_j\,\mathrm{Control}_{j,i} + \sum_{s=1}^{S}\gamma_s\,I_{\{m(i)=s\}} + \sum_{k=1}^{K}\nu_k\,I_{\{c(i)=k\}} + \sum_{h=1}^{13}\zeta_h\,I_{\{\text{Active in } h\text{th half-hour}\}} + \epsilon_i, \tag{5}$$

where the mapping $i \xrightarrow{m} s$ is used to identify the executed stock s, s = 1, . . . , 498 and the mapping

$i \xrightarrow{c} k$ is used to identify the client k submitting the order where k = 1, . . . , 146. In addition to
these stock and client dummies, I control for intraday volume patterns using a dummy for every 30

minutes. Further, I also consider execution-level control variables including participation rate, the

ratio of passive and aggressive orders, bid-offer spread, mid-quote volatility, execution duration,

turnover, and the logarithm of market capitalization.

Participation rate and order duration can control for the urgency of the trade and client-fixed

effects can control for the different trading strategies or the skill level of the investor that may

be correlated with the price movements during the execution. Controlling for the ratio of passive

or aggressive orders is also important as fills due to passive orders can have more randomness in

inter-trade time intervals and may be negatively correlated with the signals overall. Since passive

orders would earn a fraction of the bid-ask spread, if the lower realizations of the signals are

driven by passive orders, I may obtain a spurious relationship between the signals and execution

costs. Similarly, aggressive orders are costlier and may be positively correlated with the signals and

thus they may lead to an omitted variable bias if they are not properly controlled for.

Table 2 reports the regression results with standard errors clustered at the calendar day level.7

In the first three columns, I observe that the estimated coefficient for each signal is positive and

statistically significant aligning with my expectations from the earlier visual evidence. The coeffi-

cients are also economically large. In the final three columns, I standardize all of the continuous

independent variables and find that a one-standard-deviation increase in each signal is associated with

an IS increase of approximately 1.6 to 1.8 bps. Given that the median implementation shortfall is

2.7 bps, the coefficients on the signals are economically significant.

4.5. Lagged Signals and Execution Costs

Despite controlling for various execution-level statistics, one potential concern with the earlier

multivariate analysis is that there can be a contemporaneous missing variable that is driving both

the signals and IS. In order to mitigate this concern, I examine the relationship between the lagged

signals and the future execution costs by constructing the signals from the first half of the execution

and testing for a potential cost increase in the remaining portion of the execution. This is a more

natural test as the information leakage to potential adversary traders may take some time which

is consistent with the learning mechanism in Yang and Zhu (2015).

7 Throughout the analysis, I adjust standard errors by clustering on calendar day. Double clustering on calendar day and executed stock provides approximately identical standard errors.

Formally, I partition the execution into two parts by assigning the child-order trades from the

first 50% of the order size to the Initial bin and the remaining 50% to the Final bin.8 I then

construct a new set of three signals by using child-order trades only from the Initial bin and denote

them by InitNegChildOrderVol, InitNegIntervalVol and InitQtyTimeCorrel, respectively. Finally, I

compute the average trade prices in the Initial and Final bins of the execution to define the price

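impact measures corresponding to the initial and final periods:

$$\mathrm{InitIS}_i = \operatorname{sgn}(Q_i)\,\frac{\bar{P}_{i,1} - P_{i,0}}{P_{i,0}}, \qquad \mathrm{FinalIS}_i = \operatorname{sgn}(Q_i)\,\frac{\bar{P}_{i,2} - \bar{P}_{i,1}}{\bar{P}_{i,1}}, \tag{6}$$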

where P̄i,1 is the average trade price for the Initial bin and P̄i,2 is the average trade price for the

remaining Final bin.
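The partition and the two cost measures can be sketched as follows. The function names are hypothetical. The example assigns to the Initial bin the longest prefix of child trades whose cumulative size does not exceed 50% of the order, and it assumes that the average trade price in each bin is share-weighted, which the text does not specify.

```python
def split_initial_final(child_fills):
    """Split chronological (shares, price) fills into Initial and Final bins.

    The Initial bin holds the longest prefix of fills whose cumulative
    shares stay at or below half of the total executed shares.
    """
    total = sum(shares for shares, _ in child_fills)
    cutoff, running = 0, 0
    for shares, _ in child_fills:
        if running + shares > total / 2:
            break
        running += shares
        cutoff += 1
    return child_fills[:cutoff], child_fills[cutoff:]


def _average_price(fills):
    # Share-weighted average price (an assumption for this sketch).
    return sum(s * p for s, p in fills) / sum(s for s, _ in fills)


def init_final_is_bps(signed_qty, child_fills, arrival_mid):
    """Return (InitIS, FinalIS) in basis points, following Equation (6)."""
    initial, final = split_initial_final(child_fills)
    if not initial or not final:
        raise ValueError("both bins must contain at least one fill")
    sign = 1.0 if signed_qty > 0 else -1.0
    p1, p2 = _average_price(initial), _average_price(final)
    init_is = 1e4 * sign * (p1 - arrival_mid) / arrival_mid
    final_is = 1e4 * sign * (p2 - p1) / p1
    return init_is, final_is
```

The signals InitNegChildOrderVol, InitNegIntervalVol and InitQtyTimeCorrel would then be computed from the fills in the Initial bin only.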

I regress InitIS and FinalIS on the new signals constructed from the first half of the execution

using the same set of control variables:

$$\mathrm{Cost}_i = \alpha + \beta\,\mathrm{InitSignal}_i + \sum_{j}\delta_j\,\mathrm{Control}_{j,i} + \sum_{s=1}^{S}\gamma_s\,I_{\{m(i)=s\}} + \sum_{k=1}^{K}\nu_k\,I_{\{c(i)=k\}} + \sum_{h=1}^{13}\zeta_h\,I_{\{\text{Active in } h\text{th half-hour}\}} + \epsilon_i, \tag{7}$$

where Cost is either InitIS or FinalIS and InitSignal is either InitNegChildOrderVol,

InitNegIntervalVol or InitQtyTimeCorrel.

Table 3 reports the regression results. I find that InitIS is not correlated with any of the three

signals constructed from the initial stage. Even the signs of the coefficients are mixed. These

findings alleviate the endogeneity concern in the main regressions as they suggest that the signals are

not correlated with an omitted variable of illiquidity as this would have increased (decreased) the

average price observed in the initial stage of the execution for a buy (sell) order implying a positive

correlation between the signals and InitIS. Interestingly, all of the signals are significantly correlated

with FinalIS. Overall, this finding suggests that signals are gradually processed by adversary traders

consistent with the learning mechanism in the back-running theory.

8 Not all of the executions can be exactly split into two equal portions. Formally, the Initial bin for the ith execution includes the maximum number of child orders from the start of the execution such that the cumulative sum of these orders is still less than or equal to 50% of the total order.

4.6. Deterministic Patterns

Deterministic patterns in the algorithm may leak order flow signals to the strategic traders. In

Figure 3, each child order was separated by exactly 10 seconds throughout the execution of 23 child-

orders. In this section, I examine whether similar deterministic patterns, which can be exploited

by strategic traders, exist in the data.

First, I examine the distribution of the time periods between two consecutive trades in a parent-

order execution. I compute the trading intervals at the precision of hundredths of a second and

focus on intervals that are greater than 200 milliseconds. In this universe, the most frequent trading

interval is surprisingly 600.00 seconds, i.e., exactly 10 minutes. There are 4,476 consecutive trades

which are spaced with this time interval. Further, the second-most frequently employed trading

interval is 599.99 seconds which is again approximately 10 minutes. This trading interval occurs

703 times in the data. To interpret the abnormality of these frequencies, note that there are only 4

consecutive trades which are spaced apart with 599.00 seconds. These two statistics strongly imply

the imperfect randomization of the trading algorithm.9
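The tabulation described in this paragraph can be sketched as below. The function name and input layout are assumptions made for the example. Intervals are rounded to hundredths of a second and stored as integers so that equal intervals are counted together without floating-point comparisons.

```python
from collections import Counter


def most_frequent_intervals(trade_times_by_order, min_gap_sec=0.2, top=3):
    """Most common inter-trade intervals across parent orders.

    trade_times_by_order: list of chronologically sorted child-trade
    times (seconds), one list per parent order. Intervals are measured in
    hundredths of a second and only those above min_gap_sec are kept.
    Returns a list of (interval_in_seconds, count) pairs.
    """
    min_gap_cs = round(min_gap_sec * 100)
    counts = Counter()
    for times in trade_times_by_order:
        for earlier, later in zip(times, times[1:]):
            gap_cs = round((later - earlier) * 100)
            if gap_cs > min_gap_cs:
                counts[gap_cs] += 1
    return [(gap_cs / 100, n) for gap_cs, n in counts.most_common(top)]
```

Intervals are only formed within a parent order, so the last trade of one order is never paired with the first trade of the next.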

Second, I study whether trading intervals occur more frequently around round seconds. To

examine this issue formally, I compare the number of consecutive trades that are spaced with X.00

seconds versus X.50 seconds where X is an integer. If the algorithm were to be fully randomized,

one would expect to see roughly equal number of consecutive trades with both of these spacings.

Table 4 reports the frequencies of the number of consecutive trades corresponding to these two

trading interval lengths. X is chosen to be from 1 second to 30 seconds. Contrary to the null

hypothesis of perfect randomization, I find that for every X, consecutive trades

with round-second trading intervals are more frequent. The mean difference in frequency

is 97.2 with a corresponding t-statistic of 8.1.10
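A sketch of this comparison is given below. The function names are hypothetical, and the test statistic is computed as a paired t-statistic on the differences across X, which is an assumption about how the reported t-statistic is obtained.

```python
import math
import statistics
from collections import Counter


def round_vs_half_second_counts(intervals_sec, max_x=30):
    """Count intervals of exactly X.00 and X.50 seconds for X = 1..max_x.

    intervals_sec: inter-trade intervals in seconds. Values are rounded to
    integer hundredths of a second before counting.
    Returns a list of (X, count at X.00, count at X.50) tuples.
    """
    counts = Counter(round(interval * 100) for interval in intervals_sec)
    return [(x, counts[x * 100], counts[x * 100 + 50])
            for x in range(1, max_x + 1)]


def paired_t_stat(rows):
    """Mean difference (round minus half-second counts) and its t-statistic."""
    diffs = [round_count - half_count for _, round_count, half_count in rows]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err
```

Under perfect randomization the differences should be centered at zero, so a large positive t-statistic indicates clustering at round seconds.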


9 The third-most frequent trading interval is 1.00 second with 632 occurrences.

10 The findings are identical if I exclude parent-orders that start at the round second (e.g., 9:30:00AM) from the
Overall, these analyses strongly suggest that the trading algorithm displays imperfect random-

ization across trading intervals that can be exploited by anticipatory traders.

4.7. Signal Heterogeneity

Although all of the executions are traded according to a single execution strategy, there can still

be heterogeneity in how the parent order is traded due to client instructions (e.g., order size),

market conditions and algorithm’s intrinsic randomness. In this section, I investigate what drives

this heterogeneity in detail and examine the correlation between signals and order-level statistics.

One reason for the heterogeneity may be the imperfect ability to predict volume patterns. If

the broker were to perfectly predict the future volume profile in the market, achieving zero slippage

against market VWAP would be trivial. However, the broker can only inaccurately predict the

upcoming volume patterns by using the historical volume profiles from the previous month. To

assess the predictive power of this approach, using TAQ data, Figure 5 plots the histogram of the

prediction error, the percentage difference between realized volume and average volume. This plot

implies that predicting the future volume patterns using historical volume curves is indeed very

difficult and this would be a major source of heterogeneity between executions.
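The prediction error plotted in the histogram can be sketched as a one-line computation. The function name and inputs are hypothetical, and the example assumes that the forecast is the simple mean of the volumes observed in the same intraday interval on past days.

```python
import statistics


def volume_prediction_error_pct(realized_volume, past_interval_volumes):
    """Percentage difference between realized and historical average volume.

    realized_volume       : market volume during the execution interval
    past_interval_volumes : volumes in the same intraday interval on each
                            day of the look-back window (past month)
    """
    expected = statistics.mean(past_interval_volumes)
    if expected <= 0:
        raise ValueError("historical average volume must be positive")
    return 100.0 * (realized_volume - expected) / expected
```

By construction the error is bounded below by -100%, reached when no volume is realized, but it is unbounded above.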

Relatedly, client instructions in the pre-trade phase can affect the predictability of the execution.

If the client chooses a large-order size, the algorithm may not have enough flexibility in randomizing

the order. Finally, market conditions during the execution may directly affect the algorithm’s child-

order selections. For example, when the volume profile is roughly constant, the execution schedule

could also mimic a similar profile and lead to more predictability. On the other hand, in these quiet

times, the algorithm may try to randomize the execution schedule as well in order to minimize the

information leakage. Note that market conditions can also change the exploitability of the signals.

For example, when noise trading is higher, algorithmic traders may not fully detect the patterns. I

examine such issues in detail in Section 5 and Section 6.

To better understand the drivers of the signals, I study each signal in detail by regressing it on

the control variables I have utilized in the multivariate regressions. I also add stock fixed-effects,

client and intraday dummies.

analysis. The findings are robust to the use of other non-round intervals such as X.01.

[Figure 5 plot omitted: histogram with Prediction Error (%) on the horizontal axis and Frequency on the vertical axis.]

Figure 5: Histogram of the percentage difference between realized volume during the execution and the average volume observed during the same interval in the past month.

Table 5 reports the regression results. Interestingly, all signals have the same sign in each

coefficient implying the robustness of the intuition behind the signal construction. For each signal,

the coefficient on participation rate is positive suggesting that the signals are correlated with the

order size and they measure the footprints of the order. This finding highlights the role of client

instructions on execution predictability. All signals load positively on Aggressive Ratio and Passive

Ratio implying that predictability is lower when the executions occur at the mid-quote.

Spreads are negatively correlated with the signals implying that liquidity shocks cannot fully

explain the presence of execution predictability. Volatility is positively correlated with the signals

suggesting that the algorithm shies away from volatile trading in times of price uncertainty. This

could be a strategic choice as adversary traders may also find it difficult to extract precise signals

during these volatile periods. Finally, small-cap stocks seem to have higher predictability due to

NegChildOrderVol and NegIntervalVol. Overall, these findings suggest that market conditions may

also drive the heterogeneity in the signals and imply that the algorithm could be leaving more

footprints in a certain set of executions.

4.8. Relationship with Publicly Available Data

Execution predictability signals are constructed using proprietary data. Since these signals are not

publicly available, it is not immediately clear that strategic traders can exploit these signals to

detect predictable large executions. In order to address this concern, I show in this section that

similar easy-to-construct signals can proxy the proprietary signals implying that strategic traders

are indeed capable of detecting these signals.

Publicly available data in TAQ do not provide any information about the presence of large

orders or how they are traded in particular child orders. However, if the participation rate of the order

is high, most of the reported trades will belong to the broker’s trades. If the client is also using

multiple brokers, due to potential correlation between VWAP algorithms across brokers, my private

signal may be also correlated with its public counterpart. For these reasons, I utilize all reported

trades in TAQ to construct the publicly replicable version of the signals using the same definitions.

Let PubNegChildOrderVol, PubNegIntervalVol, and PubQtyTimeCorrel be the publicly avail-

able counterparts of NegChildOrderVol, NegIntervalVol, and QtyTimeCorrel, respectively. Table

6 illustrates the Spearman and Pearson correlations between the public and proprietary signals.

Spearman (Pearson) correlations are between 0.25 (0.16) and 0.48 (0.39) and they are statistically

significant. As illustrated in Table 7, the correlation between the signals can go up to 0.77 when

executions with participation rates of at least 0.6% (the median) are considered. Overall, these

values point to a high positive rank correlation between public and proprietary signals.
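The two correlation measures can be sketched without external libraries. The function names are hypothetical. Spearman correlation is computed here as the Pearson correlation of ranks, with tied values receiving their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1, with ties assigned the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold a proprietary signal and `y` its public counterpart across executions. A monotone but non-linear relationship yields a Spearman correlation of one and a Pearson correlation below one.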

4.9. Buy versus Sell Orders

There is an important stream of the microstructure literature that studies the potential asymmetry

between the cost of buy and sell orders. In the earliest example, Kraus and Stoll (1972) find that

block purchases have larger permanent price impact than block sales. In a review paper, Macey and

O’Hara (1997) highlight that the direction of the order is an important determinant of execution

costs. Saar (2001) develops a theoretical model to explain why buy orders could be costlier than

sell orders. The model relates the asymmetric effect to the historical price performance of the stock.

After a long period of price run-ups, the model predicts a smaller asymmetry between buys and

sells. Chiyachantana et al. (2017) find supporting evidence of this theory by illustrating that price

impact asymmetry varies based on the history of the price run-up. Chiyachantana et al. (2004)

document that the asymmetry depends on the contemporaneous market condition and report that

sells (buys) have larger price impact than buys (sells) in bear (bull) markets. Hu (2009) finds that

the buy–sell asymmetry is a function of the type of the benchmark used. If one uses pre-trade

benchmarks, buys (sells) have higher implicit trading costs during rising (falling) markets but this

relationship reverses when post-trade benchmarks are used. During-trade measures such as VWAP

slippage are neutral to the market movements.

In my dataset, I have perfect knowledge of the trade direction, thus, I can directly test whether

the cost of information leakage due to predictable signals is different across buy and sell orders. To

test this hypothesis, I separately run the multivariate regression in Equation (5) for buy and sell

orders. Table 8 reports the regression results for each signal and trade direction. The coefficients

on the signals are positive in all cases and statistically significant in five out of six cases. Only the

coefficient on NegChildOrderVol in the case of buy orders is insignificant.

Further, I statistically test whether the coefficients are different in buys than in sells. From buy

orders to sell orders, the coefficient on NegChildOrderVol increases by 94.69 with the corresponding

t-statistic of 1.19; the coefficient on NegIntervalVol decreases by 16.55 with the corresponding t-

statistic of 0.68 and the coefficient on QtyTimeCorrel increases by 7.68 with the corresponding

t-statistic of 0.15. These statistics imply that for all three signals, the difference is statistically

insignificant suggesting that predictable signals are costly in both buys and sells without any

significant asymmetry. This finding on lack of asymmetry contributes to the literature studying

differences in execution costs between buy and sell orders. As mentioned earlier in this section,

there have been several studies documenting buy-sell asymmetry in execution costs. Focusing on a

particular component of execution costs, the cost of predictability, I find that no matter the trade

direction of the order, footprints of the algorithm can be exploited by strategic traders.

4.10. Stock-by-stock Analysis

The cost of predictable trades can be stock specific. If only a few stocks suffer from order antici-

pation activities, it would not be correct to generalize the findings. To address this issue, I run my

main regression on a stock-by-stock basis. I focus on 385 stocks with more than 20 parent-order

executions. The findings are robust to different choices of minimum parent-order executions at the

stock level. For this group of stocks, I estimate the following regression for all executions of stock

k:

$$\mathrm{IS}_{i,k} = \alpha_k + \beta_k\,\mathrm{Signal}_{i,k} + \sum_{j}\delta_{j,k}\,\mathrm{Control}_{j,i,k} + \epsilon_{i,k}, \tag{8}$$

where I include execution-level control variables including participation rate, the ratio of passive and

aggressive orders, bid-offer spread, mid-quote volatility, execution duration, turnover, and the logarithm of

market capitalization.

Table 9 reports the summary of the β coefficients estimated for every stock. For each signal,

the average β is positive and statistically significant. The fraction of positive coefficients ranges

from 62% to 68%. Overall, these findings confirm that the cost increase due to predictability is not

driven by a small group of stocks.
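The stock-by-stock summary can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions, not the estimation behind Table 9: it fits a univariate slope of IS on the signal for each stock and omits the control variables of equation (8). The function names, the tuple layout of `executions`, and the simple t-statistic on the cross-section of slopes are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope from a univariate OLS fit of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def stock_by_stock_betas(executions, min_obs=21):
    """Estimate one slope per stock.

    executions: iterable of (stock, signal, is_bps) tuples (assumed layout).
    min_obs=21 mirrors "more than 20 parent-order executions".
    """
    by_stock = defaultdict(list)
    for stock, signal, is_bps in executions:
        by_stock[stock].append((signal, is_bps))

    betas = {}
    for stock, rows in by_stock.items():
        if len(rows) < min_obs:
            continue  # too few executions for this stock
        signals, costs = zip(*rows)
        if len(set(signals)) < 2:
            continue  # slope is undefined when the signal never varies
        betas[stock] = ols_slope(signals, costs)
    return betas


def summarize(betas):
    """Average slope, its cross-sectional t-statistic, and share of positive slopes."""
    values = list(betas.values())
    n = len(values)
    avg = mean(values)
    t_stat = avg / (stdev(values) / sqrt(n))
    frac_positive = sum(b > 0 for b in values) / n
    return avg, t_stat, frac_positive
```

Summarizing the cross-section of slopes, rather than pooling all executions, is what guards against a handful of stocks driving the result.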

4.11. Further Robustness Checks

I perform a number of robustness checks to make sure that my findings are not sensitive to alterna-

tive specifications or any potential data-related biases. First, to control for a potential time trend

or endogenous changes in the trading algorithm, I run my regressions with month dummies. I do

not find any significant change in the signal coefficients.

My findings are also largely unchanged if I extend the set of control variables with the ratio

executed in dark pools, inverse of the stock price, number of child orders, or past stock returns.

The cost estimates also do not change if I use a restricted dataset in which I exclude executions

occurring during the first and last 30 minutes. I undertake this robustness check to mitigate the

potential bias in my signals due to higher volume at the beginning and end of the trading day.

5. Evidence from the SEC’s Ban on Unfiltered Access

In this section, I investigate whether the predictable patterns are harder to exploit when a particular

group of algorithmic traders faces technological rigidity in implementing their trading strategies.

This hypothesis emerges from the theoretical model of Aït-Sahalia and Sağlam (2013) in which a

high-frequency market maker takes advantage of a signal about an investor’s urgency to trade. The

model predicts that when the market maker is faster, he can track this signal at a higher level of

accuracy and be more successful in predatory quoting, i.e., offering lower prices to an impatient

buyer. To test this implication, I utilize a regulatory shock implemented in November 2011 that

limits the order submission activities of some algorithmic traders that utilize direct market access

through their brokers. Thus, I expect that order anticipatory activities of some algorithmic traders

will weaken after the regulatory change. Consequently, this new regulation allows me to directly

study the relationship between the cost of predictability and order anticipation ability.

5.1. Regulating Unfiltered Access

On November 30, 2011, the SEC required brokers with market access to prohibit their customers

from having a direct link to market centers without any supervisory control.11 Before this enforcement,

clients of broker-dealers would be able to route their orders to a market center in an unfiltered

way to achieve faster order placement and trade execution. In the presence of the ban, supervised

access to the market centers imposes additional controls and latency and hence slows down the order

submission of some fast algorithmic traders.

The ban did not affect large HFT firms that were registered as broker-dealers, but its impact was
still expected to be significant: unfiltered access had been prevalent before the ban, accounting
for nearly 40% of trading volume in the U.S. according to Aite Group, a Boston-based research
firm.12 Further evidence of the widespread use of unfiltered access is that a regional and relatively
small brokerage firm, Wedbush Securities, consistently ranked as Nasdaq’s largest liquidity provider
before the ban, with most of the volume generated by unfiltered-access clients.13 After the
implementation of the ban, an affected high-frequency trading firm could trade on an exchange
either by becoming a registered broker-dealer with the SEC or by becoming a member of an exchange.
Both of these options are costly to implement. Due to the large market share of unfiltered access
and the compliance costs of bypassing the regulation, I expect the ban’s effect on all AT activity
to be economically significant.

11 https://www.sec.gov/rules/final/2011/34-64748fr.pdf
12 “Study Lays Bare Breadth of ‘Naked’ Access”, The Wall Street Journal, December 15, 2009.

5.2. The Impact of the Ban on Algorithmic Traders

Order anticipatory activities are not directly observable and thus it is impossible to quantify the

impact of the ban solely on strategic trading. However, given that order anticipation is a subset of

overall AT activity, I expect it to be correlated with well-established AT proxies. Recent empirical

evidence documents that average trade sizes and trade-to-order ratios are negatively correlated

with broad AT activity (e.g., Hendershott et al. (2011), Hagströmer and Norden (2013), O’Hara

et al. (2014) and Weller (2015)). In this section, I use the TAQ database to compute these two

proxies of AT activity and merge them with my execution dataset. Since the ban was enforced

on November 30, 2011, I use the trade and quote data from the beginning of November 2011 to

the end of December 2011. The findings are robust to different durations of pre-ban and post-ban

periods.

Formally, I run the following regressions for both proxies at the stock-day level to test whether

the algorithmic trading activity dropped after the ban:

$$
\mathrm{AT}_{i,t} = \alpha + \beta\,\mathrm{Ban}_t + \sum_j \delta_j\,\mathrm{Control}_{j,i,t} + \mu_i + \epsilon_{i,t}, \tag{9}
$$

where ATi,t is either given by the average trade size or the trade-to-order ratio (%), Bant is a

binary variable with the value of 0 before the ban and 1 after the ban and µi denotes the stock

fixed-effects. Control variables include the bid-offer spread, mid-quote volatility and the logarithm

of market capitalization of the stock.

13 “SEC’s ‘Naked’ Proposal May Hurt Small Dealers”, Securities Industry News, January 25, 2010.

Table 10 reports the regression results. I find that the coefficients on the Ban are positive

and highly statistically significant for both proxies suggesting that AT activity dropped after the

ban. The coefficients are also economically significant. The median trade size is approximately 200

shares and the median trade-to-order ratio is 2.35% before the ban. Thus, the coefficients imply a

7% increase in the average trade size and a 22% increase in the trade-to-order ratio compared to these

pre-ban values.
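As a rough back-of-the-envelope reading of these percentages, the coefficient magnitudes they correspond to can be recovered from the pre-ban medians. The values below are back-computed from the rounded figures quoted in the text and are only approximate; they are not read from Table 10.

$$
\hat{\beta}_{\text{trade size}} \approx 0.07 \times 200 = 14 \text{ shares}, \qquad
\hat{\beta}_{\text{trade-to-order}} \approx 0.22 \times 2.35\% \approx 0.52 \text{ percentage points}.
$$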

These findings are consistent with the analysis in Chakrabarty et al. (2014) who also study this

regulatory change in detail. Using 150 stocks equally chosen from groups of most active, normal

and least active in terms of market capitalization and trading volume, they find that the quote-to-trade

ratio declines by 15.2% and the number of quote submissions declines by 25.6% after the ban.

Furthermore, using NASDAQ’s TotalView-ITCH data feed, they report that the reaction times

to order book events slow down to 344 milliseconds from 221 milliseconds after the regulation. All of this

empirical evidence suggests that order anticipation has become more difficult after the ban.

5.3. The Price Impact of Predictability after the Ban

I have confirmed with the data provider that the broker has not changed the VWAP algorithm in

response to this new regulation and hence the ban on unfiltered access provides me the opportunity

to examine the sensitivity of price impact of predictability to order anticipation. Consistent with

the drop in AT activity, I expect that the predictable signals leaking from large order executions

will now be examined by a constrained group of algorithmic traders. The additional latencies due

to imposed pre-trade checks will slow down the order anticipatory activities and the algorithm’s

footprints may not be fully exploited by the algorithmic traders. Consequently, I hypothesize that

my three signals will lead to smaller execution costs in the post-ban period.

I test this hypothesis by interacting the independent variables in my original regression in

equation 5 with post-ban dummies, Ban. Following Chakrabarty et al. (2014), I use execution

data from October 3, 2011 to January 31, 2012 that straddles the implementation of the ban on

November 30, 2011. With these definitions, I have 2,245 executions in the pre-ban period and 2,870

in the post-ban period. The findings are robust to various definitions of pre- and post-ban periods.

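Formally, I run the following multivariate regression:

$$
\begin{aligned}
\mathrm{IS}_i = {} & \alpha + \beta\,\mathrm{Signal}_i + \vartheta\,\mathrm{Ban}_i + \theta\,\mathrm{Ban}_i \times \mathrm{Signal}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} \\
& + \sum_j \kappa_j\,\mathrm{Ban}_i \times \mathrm{ControlAndDummies}_{j,i} + \epsilon_i,
\end{aligned}
$$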

where Bani is equal to 0 for executions occurring before the ban and 1 for executions occurring

after the ban and I include ControlAndDummies to denote all of the control variables, stock, client

and intraday dummies. Note that the ban may increase or decrease the overall trading costs which

will be reflected in the estimated coefficient of the Ban. However, if order anticipation does not

exacerbate the cost of predictable executions, the interaction coefficients between the signal and

post-ban dummy would be indistinguishable from zero.
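Because every regressor is interacted with the ban dummy, the interaction coefficient θ is simply the post-ban slope on the signal minus the pre-ban slope. The sketch below illustrates this reading with a stripped-down version of the regression that keeps only the signal, the dummy, and their interaction (no controls or fixed effects). The function names and the `(signal, ban, is_bps)` row layout are assumptions made for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Intercept and slope from a univariate OLS fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def interaction_from_split(rows):
    """Recover the fully interacted coefficients from two split-sample fits.

    rows: iterable of (signal, ban, is_bps) with ban in {0, 1} (assumed layout).
    """
    pre = [(s, c) for s, ban, c in rows if ban == 0]
    post = [(s, c) for s, ban, c in rows if ban == 1]
    alpha_pre, beta_pre = fit_line(*zip(*pre))
    alpha_post, beta_post = fit_line(*zip(*post))
    return {
        "alpha": alpha_pre,                    # pre-ban intercept
        "beta": beta_pre,                      # pre-ban cost of predictability
        "vartheta": alpha_post - alpha_pre,    # level shift after the ban
        "theta": beta_post - beta_pre,         # change in the cost of predictability
    }
```

A negative `theta` means that the same signal value is associated with a smaller cost after the ban, which is the pattern the hypothesis predicts.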

Table 11 reports the regression results. I find that the coefficients on the interaction terms

between the signals and the ban period are all negative. The coefficients are statistically signifi-

cant for NegIntervalVol and QtyTimeCorrel. These results suggest that the ban has reduced the

exploitability of the signals when a group of algorithmic traders has slower order submission tech-

nology. Overall, these findings underline the important role of algorithmic traders for the positive

correlation between predictability signals and execution costs. As order anticipation becomes more

difficult after the ban, the same level of predictability in the trading algorithm leads to smaller

costs.

6. Evidence from Low Signal-to-Noise Periods

Anticipating orders would be difficult when order flow signals emerging from large order executions

get confounded with additional noise. The theoretical model in Yang and Zhu (2015) directly links

the back-running ability to the volatility of the signal. When the signal’s volatility is higher, the

back-runner’s expected profit becomes lower. Therefore, this model implies that even though an

algorithm is leaving similar predictable patterns, strategic traders may not exploit them fully as

they may not have perfect access to them. In this section, I investigate whether the potential

volatility in the signals reduces the increase in execution costs.

To test this hypothesis formally, I study the price impact of predictability around higher unin-

formed trading activity. The algorithm’s footprints can be detected with less accuracy if there is a

large amount of noise trading in the market as these trades would lower the signal-to-noise ratios of

the patterns. Empirically, it is hard to distinguish periods of high volume due to non-informational

motives. To overcome this challenge, I will exploit the potential behavioral bias of traders to round

prices. Price clustering literature documents that the propensity to trade is higher around round

prices (see e.g., Osborne (1962)) and retail human traders seem to be more sensitive to this bias (see

e.g., Chiao and Wang (2009)). I conjecture that when the arrival mid-price is close to a round price,

there will be more non-informational trading during the execution period that may cloud the signal

extraction of algorithmic traders. Thus, examining these noise trading periods in detail allows me

to directly link predictable signals to execution costs through the variation in order anticipation

ability of algorithmic traders.

6.1. Trading Volume around Round Prices

The decimal part of the arrival mid-price, the average of the prevailing bid and offer price in the

market, can take the following values: {.000, .005, .010, . . . , .985, .990, .995}. If the decimals of the

arrival mid-price are in the set of {.000, .005, .010, .990, .995}, I expect that there will be more

noise trading during these executions. Formally, let the ith execution have the binary attribute

Noisyi = 1, if its arrival mid-price is close to a whole dollar, i.e., its decimal is in the set of

{.000, .005, .010, .990, .995}. Otherwise, the binary attribute is Noisyi = 0 for the ith execution.

There are 522 executions with Noisyi = 1.
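The definition of the Noisy attribute translates directly into code. The following is a minimal sketch, assuming that bid and offer prices are quoted in whole cents and passed as strings so that the half-cent grid of mid-prices is represented exactly; the function names are hypothetical.

```python
from decimal import Decimal

# Decimal parts of the arrival mid-price treated as "close to a whole dollar".
NOISY_DECIMALS = {
    Decimal("0.000"),
    Decimal("0.005"),
    Decimal("0.010"),
    Decimal("0.990"),
    Decimal("0.995"),
}


def arrival_mid(bid, offer):
    """Average of the prevailing bid and offer, computed exactly."""
    return (Decimal(bid) + Decimal(offer)) / 2


def noisy_flag(mid):
    """Return 1 if the decimal part of the mid-price is in the round-price set, else 0."""
    decimal_part = mid % 1
    return int(decimal_part in NOISY_DECIMALS)
```

For instance, `noisy_flag(arrival_mid("25.00", "25.01"))` evaluates to 1, whereas a mid-price of 25.11 yields 0. Exact decimal arithmetic is used here because binary floating point cannot represent values such as 0.005 exactly, which would make set membership unreliable.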

First, I verify formally that when the arrival mid-price is close to whole prices, there is an
increase in volume realized during the execution period. Let $\mathrm{TotQ}_i$ be the total volume realized
during the execution period excluding the client’s order. I run the following regression with standard
controls and dummies to test the hypothesis:

$$
\log(\mathrm{TotQ}_i) = \alpha + \beta\,\mathrm{Noisy}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} + \epsilon_{i,t}. \tag{10}
$$

Table 12 reports that the coefficient on Noisy is positive and statistically significant. The results
are similar if I use $Q_i / \mathrm{TotQ}_i$ on the left-hand side. In this case, the coefficient is negative, suggesting
that the volume increase is still present when a potential change in the client’s order size is taken
into account. These findings provide evidence of abnormal noise trading activity during executions
with round stock prices.

6.2. The Price Impact of Predictability during Abnormal Trading

I have confirmed with the data provider that the VWAP algorithm does not use a special routine

for round arrival mid-prices, so these periods of higher noise trading allow me to test the

relationship between price impact of predictability and ease of order anticipation. Due to the

noise introduced by additional trading volume, I expect that the patterns leaking from large order

executions will now be detected by strategic traders with lower accuracy. For example, detecting

orders with equidistant trading intervals is much more difficult due to the combinatorial nature

of the pattern recognition problem. As order anticipation becomes more difficult, the predictable

signals will be expected to be less costly.

I test this hypothesis by interacting the independent variables in my original regression in equa-

tion 5 with dummies of abnormal noise trading, Noisy. Formally, I run the following multivariate

regression:

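$$
\begin{aligned}
\mathrm{IS}_i = {} & \alpha + \beta\,\mathrm{Signal}_i + \vartheta\,\mathrm{Noisy}_i + \theta\,\mathrm{Noisy}_i \times \mathrm{Signal}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} \\
& + \sum_j \kappa_j\,\mathrm{Noisy}_i \times \mathrm{ControlAndDummies}_{j,i} + \epsilon_i,
\end{aligned}
$$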

where Noisyi is equal to 1 for executions with arrival mid-price having a decimal in the set of

{.000, .005, .010, .990, .995} and 0 otherwise. I again use ControlAndDummies to denote all of the

control variables, stock, client and intraday dummies. If order anticipation exacerbates the cost of

predictable executions, the interaction coefficients between the signal and Noisy dummies would

be negative.

Table 13 reports the regression results. I find that the coefficients on the interaction terms

with the noisy volume dummies are all negative. The coefficients are statistically significant for

NegIntervalVol and QtyTimeCorrel. Overall, these findings support the order anticipation channel.

As pattern recognition becomes more difficult due to the complexity of processing additional trading

volume, the same level of predictability in the trading algorithm leads to smaller costs.

7. Further Consistency with Existing Theories

I have already examined various hypotheses emerging from the existing theoretical models, but

in this section, I would like to run additional analyses to further differentiate between the main

competing theories outlined in Section 2. Due to the cost increase, sunshine trading does not

seem to be consistent with my evidence. On the other hand, the findings can still be consistent with

predatory trading and back-running. Recall that predatory traders take advantage of the

urgency or impatience of uninformed investors by trading in the same direction as the investor at

the beginning of the execution. In the back-running theory, an adversary trader first detects the

presence of private information by exploiting an order flow signal and aims to profit by trading in

the same direction at relatively later stages of the execution.

The execution data cover 146 different investors, who may differ with regard

to their investment objectives. Consistent with the empirical study in Sağlam et al. (2014), I

expect that these investors will be very heterogeneous in their short-term beliefs about the funda-

mental value of the asset. Thus, it is convenient to test competing theories of order anticipation in

my dataset. Suppose that the investor initiating the execution has private information about the

fundamental value of the stock. According to the back-running theory, if the execution becomes

predictable (as gauged by the signals), the adversary algorithmic traders also trade on this infor-

mation during the later stages of the execution and the price impact of the execution increases. In

this case, one would expect this price change to be permanent. On the contrary, assume that the

investor has no private information about the fundamental value of the asset. If predatory traders

use order anticipation strategies during a predictable execution of a large order, then the potential

price change would be immediate and expected to be temporary. Finally, sunshine trading may

arise if strategic traders increase their liquidity provision to the executions of uninformed investors

and lower the price impact of the large order. Following the approach in Van Kervel and Menkveld

(2015), I first provide a measure for permanent price impact in order to differentiate the informed

and uninformed investors and use it in the decomposition of total price impact.

7.1. Price Impact Measures

The overall price impact (OPI ) of an execution is defined as the percentage return realized between

the start and end of the execution. One can break down this total cost measure into permanent

and temporary components. Let the permanent price impact (PPI ) of an execution be defined as

the percentage return realized between the start of the execution and the fundamental value of the

asset realized on the next trading day. Then, the temporary price impact (TPI ) is just equal to

the difference between OPI and PPI. Formally,

$$
\begin{aligned}
\mathrm{OPI}_i &= \operatorname{sgn}(Q_i)\,\frac{P_{i,T_i} - P_{i,0}}{P_{i,0}}, \\
\mathrm{PPI}_i &= \operatorname{sgn}(Q_i)\,\frac{X_{m(i),d(i)+1} - P_{i,0}}{P_{i,0}}, \\
\mathrm{TPI}_i &= \mathrm{OPI}_i - \mathrm{PPI}_i,
\end{aligned} \tag{11}
$$

where the mapping $i \xrightarrow{d} u$ is used to identify the date of the execution, $P_{i,0}$ and $P_{i,T_i}$ denote the
mid-quote prices at the start and end of the execution, and $X_{j,u}$ is the fundamental value of
asset $j$ on day $u$. I proxy this fundamental value by the volume-weighted average
price computed from the TAQ database using all trades and their corresponding sizes.
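The three measures in equation (11) can be computed from four inputs per execution. The sketch below is illustrative only: the function names, the `(price, size)` trade layout, and the conversion to basis points are assumptions, and the next-day VWAP is taken as the fundamental-value proxy described above.

```python
def vwap(trades):
    """Volume-weighted average price of (price, size) pairs."""
    total_size = sum(size for _, size in trades)
    return sum(price * size for price, size in trades) / total_size


def price_impacts_bps(side, p_start, p_end, next_day_value):
    """Overall, permanent, and temporary price impact in basis points.

    side: +1 for a buy execution, -1 for a sell (the sign of Q_i).
    p_start, p_end: mid-quote prices at the start and end of the execution.
    next_day_value: fundamental-value proxy, e.g. next-day VWAP.
    """
    opi = side * (p_end - p_start) / p_start * 1e4
    ppi = side * (next_day_value - p_start) / p_start * 1e4
    return {"OPI": opi, "PPI": ppi, "TPI": opi - ppi}
```

As a worked case, a buy that starts at a mid-quote of 100.00, ends at 100.50, and is followed by a next-day VWAP of 100.20 has an OPI of 50 bps, of which 20 bps is permanent and 30 bps is temporary.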

The competing theories are based on the presence or absence of private information, and thus

I now examine investor heterogeneity in the trader universe by studying the distribution of average

PPI at the investor level.

7.2. Investor Heterogeneity

Using PPI as a proxy for the investor’s informed trading, I compute the distribution of
information-based trading by computing the average price impact at the investor level:

$$
\mathrm{AvgPPI}_I = \frac{\sum_{i=1}^{N} \mathrm{PPI}_i\,\mathbb{I}_{\{f(i)=I\}}}{\sum_{i=1}^{N} \mathbb{I}_{\{f(i)=I\}}}, \tag{12}
$$

where the mapping $i \xrightarrow{f} I$ is used to identify the investor who submitted the order, with $I = 1, \ldots, 146$.

Table 14 reports each investor’s AvgPPI with its corresponding t-statistic. Recall that on

average each investor has approximately 139 executions in the dataset. I observe that there is

wide heterogeneity in AvgPPI values. Specifically, I find that 33 investors have positive AvgPPI

with statistical significance at the 10% level. This group of investors constitutes roughly 23% of my

investor universe. Similarly, there exist 26 investors with negative AvgPPI at the 10% significance

level. I find that the maximum (minimum, mean and median) average return that an investor in

this group achieves per execution is approximately 260 (-991, -8 and 2) bps. These statistics show

that my investor universe is suitable to study the applicability of competing theories in predictable

executions.
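Equation (12) is a group average, so the investor-level statistics can be sketched as follows. The t-statistic here is a plain one-sample t-statistic of the investor's PPI values against zero; this is an assumption for illustration, as the text does not state how the t-statistics in Table 14 are computed. The function name and row layout are likewise hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def avg_ppi_by_investor(rows):
    """Average PPI and a simple t-statistic for each investor.

    rows: iterable of (investor_id, ppi_bps) pairs (assumed layout).
    Returns {investor_id: (avg_ppi, t_stat)}; t_stat is None when it
    cannot be computed (single execution or zero dispersion).
    """
    by_investor = defaultdict(list)
    for investor, ppi in rows:
        by_investor[investor].append(ppi)

    result = {}
    for investor, values in by_investor.items():
        avg = mean(values)
        t_stat = None
        if len(values) > 1:
            sd = stdev(values)
            if sd > 0:
                t_stat = avg / (sd / sqrt(len(values)))
        result[investor] = (avg, t_stat)
    return result
```

Investors with a significantly positive average would be classified as informed on this proxy, and the remainder as uninformed or wrongly informed.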

7.3. Predatory Trading, Back-running or Sunshine Trading

I first test the consistency of the cost increase with predatory trading and back-running. I use my

standard regression model with stock fixed effects to study the correlation between price impact

measures and the cost of predictable executions. For each of my price impact measures, PI, and

execution predictability signals, Signal, I estimate the following regression models:

$$
\mathrm{PI}_i = \alpha + \beta\,\mathrm{Signal}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} + \epsilon_i, \tag{13}
$$

where PI is OPI, PPI or TPI and Signal equals either NegChildOrderVol, NegIntervalVol or

QtyTimeCorrel and each regression includes the standard set of control variables, stock, client

and intraday dummies.

Table 15 reports the regression results for each price impact measure. First, I observe that all of

the signals are significant in explaining the variation in OPI, which corroborates the IS analysis.

The regressions illustrate that none of the signals is significant in explaining the variation in TPI.

On the other hand, NegIntervalVol and QtyTimeCorrel are statistically significant in explaining

the variation in PPI at 10% and 1% level, respectively.

This evidence is broadly consistent with back-running as it predicts same-side trading with

informed investors. However, Brunnermeier and Pedersen (2005) also assume permanent price

impact in price dynamics so documenting the significant correlation between signals and PPI may

not fully differentiate between the theories of predatory trading and back-running. Fortunately,

these theories also have different implications with regard to initial price movements during the

execution. In predatory trading, adversary traders would trade in the same direction as the large

order during the initial period whereas in back-running, adversary trading would be delayed due to

the potential learning period about the large order. My analysis in Section 4.5 sheds light on the

mechanism by examining the cost increase across the initial and final periods. Recall that signals

constructed from the initial period were not correlated with the initial cost increase but they were

significantly correlated with final price impact measures. These findings point to initial learning by

adversary traders and thus they are not consistent with the predictions of the predatory trading

theory.

Overall, these results suggest that the order flow information leaking from the executions can in-

duce back-running activity by algorithmic traders. Signals are positively correlated with permanent

price movements and signals constructed from the first half of the execution point to higher execu-

tion costs in the second half of the execution aligning with the learning mechanism in back-running

theory.

Finally, I investigate whether there can be potential sunshine trading in uninformed executions.

To test this formally, I examine the joint impact of informed trading and the signals on IS by

separately controlling for each variable. Using the PPI as a proxy for informed trading, I test this

hypothesis by running my main regressions with an interaction term with the signals and PPI.

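Formally, I estimate the following regressions:

$$
\mathrm{IS}_i = \alpha + \theta\,\mathrm{PPI}_i \times \mathrm{Signal}_i + \beta\,\mathrm{Signal}_i + \vartheta\,\mathrm{PPI}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} + \epsilon_i,
$$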

where ControlAndDummies denotes all of the control variables, stock, client and intraday dummies.

Table 17 reports the regression results. Consistent with the back-running theory, I find that the

interaction terms between PPI and NegChildOrderVol and QtyTimeCorrel are significant suggest-

ing that the predictable executions are costlier if they are more informed. However, I also observe

that the coefficients on the signals are still positive and significant suggesting that sunshine trading

is not applicable. This finding further implies that back-running theory can partially explain the

increase in execution costs due to predictable patterns.

7.4. Spread Costs or Adverse Price Movements

The average trading cost, IS, can be evaluated in two components: spread costs (SC ) and adverse

price movements (APM ). For example, spread costs would be negative if the algorithm were to fill

all of the child orders with passive limit orders. Let $N_i$ be the number of child-order trades in the
$i$th execution, let $Q_{i,j}$ be the $j$th child-order size (in shares) of the $i$th execution priced at $P_{i,j}$, and
let $M_{i,j}$ be the mid-point of the best available quotes at the same second of the fill (i.e., NBBO
mid-quote price). Then, I can decompose IS into these two components as follows:

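$$
\begin{aligned}
\mathrm{IS}_i &= \operatorname{sgn}(Q_i)\,\frac{P_i^{\mathrm{avg}} - P_{i,0}}{P_{i,0}} \\
&= \operatorname{sgn}(Q_i)\,\frac{\frac{1}{Q_i}\sum_{j=1}^{N_i} P_{i,j} Q_{i,j} - P_{i,0}}{P_{i,0}} \\
&= \operatorname{sgn}(Q_i)\,\frac{\sum_{j=1}^{N_i} \left(\frac{P_{i,j} - P_{i,0}}{P_{i,0}}\right) Q_{i,j}}{Q_i} \\
&= \operatorname{sgn}(Q_i) \sum_{j=1}^{N_i} \left(\frac{P_{i,j} - M_{i,j} + M_{i,j} - P_{i,0}}{P_{i,0}}\right) \frac{Q_{i,j}}{Q_i} \\
&= \underbrace{\operatorname{sgn}(Q_i) \sum_{j=1}^{N_i} \left(\frac{P_{i,j} - M_{i,j}}{P_{i,0}}\right) \frac{Q_{i,j}}{Q_i}}_{\triangleq\, \mathrm{SC}_i}
 + \underbrace{\operatorname{sgn}(Q_i) \sum_{j=1}^{N_i} \left(\frac{M_{i,j} - P_{i,0}}{P_{i,0}}\right) \frac{Q_{i,j}}{Q_i}}_{\triangleq\, \mathrm{APM}_i}.
\end{aligned} \tag{14}
$$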

In my dataset, I find that the average SC and APM are 0.68 bps and 2.44 bps, respectively,

suggesting that on average APM constitutes approximately 78% of the total IS.
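The decomposition in equation (14) can be checked numerically. The sketch below is a minimal illustration in which the function name and the `(fill_price, fill_size, mid_at_fill)` layout are assumptions and child-order sizes are unsigned; it returns the two components together with IS so that the identity IS = SC + APM can be verified.

```python
def decompose_is(side, arrival_mid, fills):
    """Split implementation shortfall into spread costs and adverse price movements.

    side: +1 for a buy execution, -1 for a sell (the sign of Q_i).
    arrival_mid: mid-quote price at the start of the execution (P_{i,0}).
    fills: list of (fill_price, fill_size, mid_at_fill) tuples with unsigned sizes.
    Returns (SC, APM, IS) as fractional returns; multiply by 1e4 for bps.
    """
    total_size = sum(size for _, size, _ in fills)

    # Spread cost: fill price relative to the contemporaneous mid-quote.
    sc = side * sum(
        (price - mid) / arrival_mid * size / total_size
        for price, size, mid in fills
    )
    # Adverse price movement: drift of the mid-quote away from the arrival price.
    apm = side * sum(
        (mid - arrival_mid) / arrival_mid * size / total_size
        for _, size, mid in fills
    )
    avg_price = sum(price * size for price, size, _ in fills) / total_size
    is_total = side * (avg_price - arrival_mid) / arrival_mid
    return sc, apm, is_total
```

In a buy execution where most shares are filled passively below the prevailing mid-quote while the mid-quote drifts upward, SC is negative and APM is positive, which matches the remark that passive fills produce negative spread costs.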

To examine the relationship between signals and the components of the IS, I run the following

multivariate regressions at the execution level:

$$
\mathrm{IS\_Part}_i = \alpha + \beta\,\mathrm{Signal}_i + \sum_j \delta_j\,\mathrm{ControlAndDummies}_{j,i} + \epsilon_i, \tag{15}
$$

where IS_Part is either SC or APM, Signal equals either NegChildOrderVol, NegIntervalVol or

QtyTimeCorrel and each regression includes the standard set of control variables, stock, client and

intraday dummies.

Table 16 reports the regression results. I find that all signals have insignificant coefficients when

the cost component is SC but the coefficients become positive and significant in the case of APM.

Consistent with the earlier analysis, these findings suggest that predictable signals increase the

price impact of the execution rather than the spread costs and align with the predictions from the

back-running theory.

8. Meeting the Benchmark and Equilibrium Implications

In this section, I provide empirical evidence that the broker is mostly successful in achieving the

market benchmark price for the average execution price. This finding is important because it

illustrates that the objective functions of the broker and the algorithmic traders engaging in order

anticipation are not exactly the opposite, i.e., the trading game between them is not zero-sum.

This highlights that even if the broker is very skilled, execution predictability can still emerge

in equilibrium due to the misalignment of interests. Furthermore, this evidence reinforces the

well-established proposition that the VWAP algorithm is a suboptimal choice if the investor is highly

informed.

In a VWAP execution algorithm, the main objective is to minimize the VWAP slippage (VS)

rather than the IS. Recall that in computing VS, the volume-weighted average price realized during

the execution (including all trades) is the benchmark price rather than the arrival mid-quote price,

i.e.,

$$
\mathrm{VS}_i = \operatorname{sgn}(Q_i)\,\frac{P_i^{\mathrm{avg}} - \mathrm{VWAP}_i}{\mathrm{VWAP}_i}, \tag{16}
$$

where VWAPi is the realized volume-weighted average price over the ith parent-order period. The

mean (median) VS is roughly 1.6 (1.1) bps. Surprisingly, the correlation between VS and IS,

despite being small in magnitude, is actually negative at −0.06. Figure 6 illustrates the lack of

strong relationship visually. This plot indicates that execution predictability may lead to higher IS

on average, but may not prevent the broker from achieving its objective, i.e., meeting the market

benchmark price.
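To make the difference between the two benchmarks concrete, VS in equation (16) can be computed as below. This is a sketch under stated assumptions: the function names and `(price, size)` layout are hypothetical, and `market_trades` is taken to contain every trade in the stock during the parent-order period, including the client's own fills.

```python
def vwap(trades):
    """Volume-weighted average price of (price, size) pairs."""
    total_size = sum(size for _, size in trades)
    return sum(price * size for price, size in trades) / total_size


def vwap_slippage_bps(side, own_fills, market_trades):
    """VWAP slippage in basis points.

    side: +1 for a buy execution, -1 for a sell (the sign of Q_i).
    own_fills: the execution's child-order fills as (price, size) pairs.
    market_trades: all trades during the parent-order period (assumed to
        include own_fills), as (price, size) pairs.
    """
    avg_exec_price = vwap(own_fills)
    benchmark = vwap(market_trades)
    return side * (avg_exec_price - benchmark) / benchmark * 1e4
```

Because the benchmark moves with the market during the execution, a price run-up triggered by order anticipation raises both the average execution price and the VWAP, so VS can stay small even when IS is large.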

These findings illustrate that the cost increase due to execution predictability is less pronounced

on the primary objective of the algorithm. However, this lack of impact still has very interesting

implications as it illustrates that the predictable patterns are not completely at odds with the

broker’s performance. This is consistent with the hypothesis that even if the broker is very skilled,

execution predictability may still emerge because the performance of the algorithm does not suffer

from this information leakage. This feature can also decrease the incentive of the brokers to change

their algorithms even if they are aware that some strategic traders can detect predictability patterns.

Figure 6: Scatter plot of VS and IS (vertical axis: VS in bps; horizontal axis: IS in bps).

9. Conclusion

This paper tests the presence of order anticipation strategies by examining predictable signals

appearing in large order executions. In this framework, I consider three easy-to-construct signals

based on the child-order executions of the parent-order and establish that higher predictability

implied by the signals results in higher trading costs. The signals are all constructed with the basic

intuition that deterministic patterns are easy for sophisticated algorithmic traders to decipher.

The cost increase due to these predictable patterns in each signal points to a robust conclusion that

strategic traders exploit the urgency or the information of the investor using order anticipation

strategies. I exploit the SEC’s ban on unfiltered access and exogenous increase in noise trading

volume as a shock to order anticipatory activities of algorithmic traders and illustrate that the

price impact of predictability is smaller when order anticipation becomes more difficult.

My findings are mostly inconsistent with predictions of sunshine trading whereas they are

consistent with back-running. Back-running occurs when the execution is submitted by an investor

having private information about the fundamental value of the asset and algorithmic traders would

like to decrypt this information from the order flow signals to share the profits from the permanent

price impact. Decomposing the price changes into temporary or permanent, initial or delayed

responses and spread costs versus adverse price movements, my empirical analysis reveals that

back-running theory seems to be more consistent with the empirical findings.

This paper highlights the insensitivity of VWAP execution strategy to predictable patterns.

The empirical evidence highlights that this algorithm may be successful in meeting its benchmark

price but may leave substantial footprints that can be exploited by adversary traders. Despite its

drawbacks, this algorithm is still widely used in the industry for its ease in performance benchmarking

and low-cost benefits. This framework can also be utilized to reassess the effectiveness of

the VWAP algorithm, especially in informed executions.

Given the absence of sunshine trading, at least at the aggregate level, my results have important

implications for policymakers. In the context of back-running, the price discovery can be argued

to be faster whereas predatory trading may delay the price discovery. For example, Aït-Sahalia

and Sağlam (2013) find that in the presence of a skilled predatory high-frequency market maker,

market liquidity also drops. It is also not evident whether back-running is a good practice. Stiglitz

(2014) shows an important negative implication of back-running arguing that it can disincentivize

costly information acquisition about the real economy.

Finally, the empirical evidence may have potential implications in the design of financial mar-

kets. One potential policy recommendation emerging from the analysis is to randomize trade

quantities and their timestamps in public data feeds. When the signal-to-noise ratio is very low,

algorithmic traders will be potentially discouraged from engaging in order anticipation. For exam-

ple, Harris (2013) proposes that markets should report only approximate trade sizes within various

buckets or only aggregated volumes at 5-minute time intervals to prevent harmful HFT activity.

Controlled pilot studies can ultimately evaluate the effectiveness of such policies with regard to

their unintended consequences.

References
Admati, Anat R., and Paul Pfleiderer. “Sunshine Trading and Financial Market Equilibrium.”
Review of Financial Studies, 4 (1991), 443–481.

Aït-Sahalia, Yacine, and Mehmet Sağlam. “High Frequency Market Making: Optimal Quoting.”
(2013).

Almgren, R.; C. Thum; E. Hauptmann; and H. Li. “Direct Estimation of Equity Market Impact.”
Risk, 18 (2005), 58–62.

Bertsimas, Dimitris, and Andrew W Lo. “Optimal Control of Execution Costs.” Journal of Financial
Markets, 1 (1998), 1–50.

Bessembinder, Hendrik; Allen Carrion; Laura Tuttle; and Kumar Venkataraman. “Liquidity, Resiliency
and Market Quality Around Predictable Trades: Theory and Evidence.” Journal of
Financial Economics, 121 (2016), 142–166.

Brogaard, Jonathan; Terrence Hendershott; Stefan Hunt; and Carla Ysusi. “High-Frequency Trading
and the Execution Costs of Institutional Investors.” Financial Review, 49 (2014), 345–369.

Brogaard, Jonathan; Terrence Hendershott; and Ryan Riordan. “High-Frequency Trading and
Price Discovery.” Review of Financial Studies, 27 (2014), 2267–2306.

Brunnermeier, Markus K, and Lasse Heje Pedersen. “Predatory Trading.” The Journal of Finance,
60 (2005), 1825–1863.

Chakrabarty, Bidisha; Pankaj K Jain; Andriy Shkilko; and Konstantin Sokolov. “Speed of Market
Access and Market Quality: Evidence From the SEC Naked Access Ban.” Available at SSRN
2328231, (2014).

Chiao, Chaoshin, and Zi-May Wang. “Price Clustering: Evidence Using Comprehensive Limit-
Order Data.” Financial Review, 44 (2009), 1–29.

Chiyachantana, Chiraphol; Pankaj K Jain; Christine Jiang; and Vivek Sharma. “Permanent Price
Impact Asymmetry of Trades with Institutional Constraints.” Journal of Financial Markets,
36 (2017), 1–16.

Chiyachantana, Chiraphol N; Pankaj K Jain; Christine Jiang; and Robert A Wood. “International
Evidence on Institutional Trading Behavior and Price Impact.” The Journal of Finance, 59 (2004),
869–898.

Degryse, Hans; Frank de Jong; and Vincent van Kervel. “Does Order Splitting Signal Uninformed
Order Flow?” (2014).

Hagströmer, Björn, and Lars Norden. “The Diversity of High-Frequency Traders.” Journal of
Financial Markets, 16 (2013), 741–770.

Harris, Larry. “What to Do About High-Frequency Trading.” Financial Analysts Journal, 69 (2013).

Hasbrouck, Joel, and Gideon Saar. “Low-Latency Trading.” Journal of Financial Markets,
16 (2013), 646–679.

Hendershott, Terrence; Charles M Jones; and Albert J Menkveld. “Does Algorithmic Trading
Improve Liquidity?” The Journal of Finance, 66 (2011), 1–33.

Hirschey, Nicholas. “Do High-Frequency Traders Anticipate Buying and Selling Pressure?” Available
at SSRN 2238516, (2013).

Hu, Gang. “Measures of Implicit Trading Costs and Buy–Sell Asymmetry.” Journal of Financial
Markets, 12 (2009), 418–437.

Hu, Gang; Koren Jo; Yi Alex Wang; and Jing Xie. “Institutional Trading and Abel Noser Data.”
Journal of Corporate Finance, (Forthcoming).

Jones, Charles M. “What Do We Know About High-Frequency Trading?” Columbia Business School
Research Paper, (2013).

Korajczyk, Robert A, and Dermot Murphy. “High Frequency Market Making to Large Institutional
Trades.” Available at SSRN 2567016, (2015).

Kraus, Alan, and Hans R Stoll. “Price Impacts of Block Trading on the New York Stock Exchange.”
The Journal of Finance, 27 (1972), 569–588.

Macey, Jonathan R, and Maureen O’Hara. “The Law and Economics of Best Execution.” Journal
of Financial Intermediation, 6 (1997), 188–223.

Madrigal, Vicente. “Non-Fundamental Speculation.” The Journal of Finance, 51 (1996), 553–578.

Menkveld, Albert J. “The Economics of High-Frequency Trading: Taking Stock.” Available at
SSRN, (2016).

Moallemi, Ciamac C; Beomsoo Park; and Benjamin Van Roy. “Strategic Execution in the Presence
of an Uninformed Arbitrageur.” Journal of Financial Markets, 15 (2012), 361–391.

O’Hara, Maureen; Chen Yao; and Mao Ye. “What’s Not There: Odd Lots and Market Data.” The
Journal of Finance, 69 (2014), 2199–2236.

Osborne, Maury FM. “Periodic Structure in the Brownian Motion of Stock Prices.” Operations
Research, 10 (1962), 345–379.

Perold, Andre F. “The Implementation Shortfall: Paper Versus Reality.” The Journal of Portfolio
Management, 14 (1988), 4–9.

Saar, Gideon. “Price Impact Asymmetry of Block Trades: An Institutional Trading Explanation.”
The Review of Financial Studies, 14 (2001), 1153–1181.

Sağlam, Mehmet, and Tugkan Tuzun. “Do ETFs Increase Liquidity?” Available at SSRN 3142081,
(2018).

Sağlam, Mehmet; Ciamac C Moallemi; and Michael Sotiropoulos. “Short-Term Trading Skill: An
Analysis of Investor Heterogeneity and Execution Quality.” Available at SSRN 2463952, (2014).

Stiglitz, Joseph E. “Tapping the Brakes: Are Less Active Markets Safer and Better for the Economy?” In
Federal Reserve Bank of Atlanta 2014 Financial Markets Conference: Tuning Financial Regulation
for Stability and Efficiency, April, volume 15, 2014.

Tong, Lin. “A Blessing or a Curse? The Impact of High Frequency Trading on Institutional Investors.”
SSRN Working Paper, (2015).

Van Kervel, Vincent, and Albert J Menkveld. “High-Frequency Trading Around Large Institutional
Orders.” Available at SSRN 2619686, (2015).

Weller, Brian M. “Efficient Prices At Any Cost: Does Algorithmic Trading Deter Information
Acquisition?” Available at SSRN 2662254, (2015).

Yang, Liyan, and Haoxiang Zhu. “Back-Running: Seeking and Hiding Fundamental Information in
Order Flows.” (2015).

Table 1: Summary statistics for the main attributes in my execution data. Participation rate is equal to the ratio of executed volume to
total volume during the lifetime of the order. The bid-ask spread is normalized using the mid-quote price. Order duration is expressed as
a fraction of full trading day. The duration of a full trading day in U.S. equity markets is 6.5 hours.

Statistic                        Mean      Min       Pctl(25)  Median    Pctl(75)  Max
Order Value ($ M)                1.015     0.050     0.131     0.343     1.001     62.864
Participation Rate               0.018     0.00001   0.002     0.006     0.019     0.521
Number of Child-Order Trades     127.801   5         26        60        148       4,533
Spread (bps)                     3.960     0.711     2.347     3.219     4.623     45.128
Volatility                       0.015     0.001     0.008     0.012     0.018     0.274
Order Duration                   0.525     0.026     0.159     0.519     0.899     1.000
Implementation Shortfall (bps)   3.12      −678.30   −27.89    2.65      35.24     698.10
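As a small worked illustration of two definitions in the caption of Table 1, the Python sketch below computes a participation rate and an order duration. The function names and the input values are hypothetical; only the 6.5-hour trading day and the two ratios come from the caption.

```python
# Length of a full U.S. equity trading day, as stated in the caption (6.5 hours).
TRADING_DAY_SECONDS = 6.5 * 3600


def participation_rate(executed_shares, total_market_shares):
    """Executed volume divided by total volume during the order's lifetime."""
    return executed_shares / total_market_shares


def order_duration(start_second, end_second):
    """Order lifetime expressed as a fraction of a full trading day."""
    return (end_second - start_second) / TRADING_DAY_SECONDS


if __name__ == "__main__":
    # Hypothetical order: 6,000 shares executed while 1,000,000 shares traded.
    print(participation_rate(6_000, 1_000_000))
    # Hypothetical order that is worked for 3 hours 15 minutes.
    print(order_duration(0, 11_700))
```

With these hypothetical inputs the order sits at the sample median participation rate (0.6%) and lasts half a trading day, close to the median duration in Table 1.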
Table 2: This table presents the regression results where IS is the dependent variable. The main inde-
pendent variables are three execution signals: NegChildOrderVol measures the sign-adjusted volatility of
relative child-order sizes. NegIntervalVol measures the sign-adjusted volatility of normalized trading
intervals between consecutive child-orders. QtyTimeCorrel measures the correlation between cumulative
executed quantity and total time. I add participation rate, the ratio of aggressive and passive orders in the
parent-order, bid-offer spread, mid-quote volatility, execution duration, turnover, and logarithm of market
capitalization as control variables. Participation rate is the ratio of order size to the total volume during
the trading interval. Aggressive (passive) order ratio is the fraction of child-orders of the parent-order
that pays (earns) additional spread. Bid-offer spread is normalized using the mid-quote price. Volatility
is computed using mid-quote returns based on five seconds. Order duration is expressed as a fraction of
full trading day. Turnover is the total volume during the execution interval divided by the number of
shares outstanding. Log Market Cap is the logarithm of the market capitalization computed using the
arrival price of the order. All regressions include stock, client and intraday dummies. In columns (4)-(6),
I standardize all of the continuous independent variables. Standard errors are given in parentheses and
are adjusted by clustering on calendar day.

Dependent variable: IS
                      (1)        (2)        (3)        (4)        (5)        (6)
NegChildOrderVol      88.52∗∗                          1.61∗∗
                      (42.29)                          (0.77)
NegIntervalVol                   53.83∗∗                          1.77∗∗
                                 (21.83)                          (0.72)
QtyTimeCorrel                               70.94∗∗∗                         1.75∗∗∗
                                            (25.03)                          (0.62)
Participation Rate    41.08∗∗    39.31∗     46.96∗∗    1.39∗∗     1.33∗      1.58∗∗
                      (20.73)    (21.16)    (20.19)    (0.70)     (0.71)     (0.68)
Aggressive Ratio      7.70∗      8.53∗      8.80∗      1.82∗      2.01∗      2.08∗
                      (4.55)     (4.58)     (4.64)     (1.07)     (1.08)     (1.09)
Passive Ratio         6.63       6.96       7.76       1.40       1.46       1.63
                      (5.42)     (5.40)     (5.41)     (1.14)     (1.14)     (1.14)
Spread                0.12       0.12       0.10       0.32       0.34       0.28
                      (0.67)     (0.67)     (0.67)     (1.89)     (1.88)     (1.88)
Volatility            −1.05      −5.35      −0.85      −0.01      −0.06      −0.01
                      (290.31)   (291.73)   (291.61)   (3.17)     (3.18)     (3.18)
Order Duration        0.99       0.33       1.57       0.35       0.12       0.56
                      (38.60)    (38.49)    (38.59)    (13.75)    (13.71)    (13.75)
Turnover              0.34       0.34       0.34       2.67       2.68       2.68
                      (0.33)     (0.33)     (0.33)     (2.65)     (2.65)     (2.66)
Log Market Cap        4.32       4.39       3.76       4.84       4.92       4.21
                      (7.57)     (7.57)     (7.59)     (8.48)     (8.48)     (8.50)
Standardized          No         No         No         Yes        Yes        Yes
Observations          20,335     20,335     20,335     20,335     20,335     20,335
Adjusted R2           0.07       0.07       0.07       0.07       0.07       0.07
*** p < 0.01, ** p < 0.05, * p < 0.10
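The three execution signals described in the caption of Table 2 can be illustrated with a minimal Python sketch. It is not the paper's implementation, and the normalizations are assumptions made here: relative child-order size is taken as child size divided by total executed quantity, normalized intervals as gaps divided by total execution time, "sign-adjusted volatility" as the negative of the population standard deviation, and the correlation as Pearson's. The function name `execution_signals` is hypothetical.

```python
from statistics import pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def execution_signals(times, sizes):
    """Compute the three signals for one parent order.

    `times` are child-order timestamps in seconds (sorted, at least two,
    spanning a positive duration); `sizes` are the matching share quantities.
    All normalizations below are assumptions, not taken from the paper.
    """
    total_qty = sum(sizes)
    duration = times[-1] - times[0]
    rel_sizes = [q / total_qty for q in sizes]
    gaps = [(b - a) / duration for a, b in zip(times, times[1:])]
    cum_qty, running = [], 0
    for q in sizes:
        running += q
        cum_qty.append(running)
    elapsed = [t - times[0] for t in times]
    return {
        # Negated so that a larger value means a more regular pattern.
        "NegChildOrderVol": -pstdev(rel_sizes),
        "NegIntervalVol": -pstdev(gaps),
        "QtyTimeCorrel": pearson(cum_qty, elapsed),
    }


if __name__ == "__main__":
    regular = execution_signals([0, 10, 20, 30, 40], [100] * 5)
    irregular = execution_signals([0, 1, 30, 31, 40], [500, 10, 10, 10, 470])
    print(regular)
    print(irregular)
```

Under these assumptions, an order sliced into equal sizes at equal intervals attains the maximum of all three signals (zero, zero, and one), while an irregular schedule scores lower on each.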

Table 3: This table presents the regression results where the initial and final price impact, InitIS and
FinalIS, are the dependent variables. The main independent variables are three execution signals: Init-
NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes in the first half of
the execution. InitNegIntervalVol measures the sign-adjusted volatility of normalized trading
intervals between consecutive child-orders in the first half of the execution. InitQtyTimeCorrel measures
the correlation between cumulative executed quantity and total time in the first half of the execution. I
add participation rate, the ratio of aggressive and passive orders, bid-offer spread, mid-quote volatility,
execution duration, turnover, and logarithm of market capitalization as control variables. All regressions
include stock, client and intraday dummies. Standard errors are given in parentheses and are adjusted by
clustering on calendar day.

Dependent Variables:
                      InitIS                           FinalIS
                      (1)        (2)        (3)        (4)        (5)        (6)
InitNegChildOrderVol  −4.04                            59.06∗∗
                      (15.89)                          (25.76)
InitNegIntervalVol               10.76                            26.00∗∗
                                 (9.62)                           (12.39)
InitQtyTimeCorrel                           16.62                            54.86∗∗
                                            (16.61)                          (25.31)
Participation Rate    33.58∗∗    30.25∗     33.24∗∗    23.03      21.96      29.41∗
                      (15.61)    (15.94)    (15.49)    (18.20)    (18.48)    (17.48)
Aggressive Ratio      8.77∗∗∗    8.34∗∗     8.56∗∗     −0.02      0.36       0.78
                      (3.40)     (3.40)     (3.41)     (4.03)     (4.07)     (4.08)
Passive Ratio         7.77∗∗     7.09∗      7.52∗∗     0.49       0.35       1.25
                      (3.67)     (3.67)     (3.66)     (4.76)     (4.86)     (4.84)
Spread                0.25       0.28       0.27       −0.32      −0.35      −0.36
                      (0.53)     (0.53)     (0.53)     (0.46)     (0.46)     (0.46)
Volatility            108.19     103.12     104.64     −171.82    −174.34    −173.04
                      (165.91)   (165.41)   (165.54)   (287.13)   (289.19)   (288.68)
Order Duration        −5.96      −8.01      −7.01      16.15      14.98      17.11
                      (26.39)    (26.33)    (26.33)    (42.03)    (42.14)    (41.95)
Turnover              0.13       0.13       0.13       0.41       0.41       0.41
                      (0.21)     (0.21)     (0.21)     (0.32)     (0.32)     (0.32)
Log Market Cap        6.76       7.14       6.90       −4.72      −4.73      −5.20
                      (5.29)     (5.30)     (5.32)     (7.16)     (7.14)     (7.22)
Observations          20,296     20,296     20,296     20,296     20,296     20,296
Adjusted R2           0.07       0.07       0.07       0.07       0.07       0.07
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 4: Frequency of consecutive child order trades with trading intervals of X.00 seconds versus X.50 seconds.
X is chosen to be from 1 second to 30 seconds. In the final column, I report the difference in frequencies.

X.00 seconds Frequency X.50 seconds Frequency Difference


1.00 632 1.50 451 181
2.00 554 2.50 293 261
3.00 406 3.50 247 159
4.00 359 4.50 197 162
5.00 424 5.50 159 265
6.00 305 6.50 169 136
7.00 233 7.50 135 98
8.00 232 8.50 98 134
9.00 213 9.50 96 117
10.00 296 10.50 109 187
11.00 198 11.50 89 109
12.00 198 12.50 70 128
13.00 138 13.50 85 53
14.00 162 14.50 85 77
15.00 175 15.50 79 96
16.00 145 16.50 107 38
17.00 155 17.50 82 73
18.00 147 18.50 69 78
19.00 125 19.50 93 32
20.00 156 20.50 87 69
21.00 132 21.50 73 59
22.00 116 22.50 83 33
23.00 107 23.50 75 32
24.00 116 24.50 75 41
25.00 133 25.50 88 45
26.00 159 26.50 78 81
27.00 96 27.50 86 10
28.00 142 28.50 78 64
29.00 109 29.50 78 31
30.00 157 30.50 90 67
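The tabulation in Table 4 can be reproduced in outline with a short Python sketch. It assumes, as a hypothetical input format, a list of inter-trade intervals in seconds recorded to two decimals; the function name `round_vs_half_counts` is illustrative and not from the paper.

```python
from collections import Counter


def round_vs_half_counts(intervals_seconds, max_x=30):
    """Count intervals of exactly X.00 s versus X.50 s for X = 1..max_x.

    Intervals are converted to integer centiseconds so that the comparison
    does not depend on floating-point representation.
    Returns a list of (X, count_X00, count_X50, difference) tuples.
    """
    centis = Counter(round(dt * 100) for dt in intervals_seconds)
    rows = []
    for x in range(1, max_x + 1):
        whole = centis[x * 100]
        half = centis[x * 100 + 50]
        rows.append((x, whole, half, whole - half))
    return rows


if __name__ == "__main__":
    sample = [1.00, 1.00, 1.50, 2.00, 2.50, 2.50, 0.75]
    for row in round_vs_half_counts(sample, max_x=2):
        print(row)
```

A persistently positive difference column, as in Table 4, indicates that child orders cluster on whole-second intervals more often than on the neighboring half-second ones.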

Table 5: This table presents the regression results where dependent variables are three execution signals:
NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes. NegIntervalVol
measures the sign-adjusted volatility of normalized trading intervals between consecutive child-orders.
QtyTimeCorrel measures the correlation between cumulative executed quantity and total time. I regress
these signals on participation rate, the ratio of aggressive and passive orders, bid-offer spread, mid-quote
volatility, turnover, and logarithm of market capitalization. I also add stock fixed-effects, client and
intraday dummies. Standard errors are given in parentheses and are adjusted by clustering on calendar
day.

Dependent variable:
                      NegChildOrderVol   NegIntervalVol   QtyTimeCorrel
Participation Rate    0.07∗∗∗            0.15∗∗∗          0.01
                      (0.01)             (0.01)           (0.01)
Aggressive Ratio      0.02∗∗∗            0.01∗∗∗          0.01∗∗∗
                      (0.002)            (0.003)          (0.002)
Passive Ratio         0.02∗∗∗            0.03∗∗∗          0.01∗∗∗
                      (0.002)            (0.003)          (0.002)
Spread                −0.0005∗∗∗         −0.001∗∗∗        −0.0003∗∗
                      (0.0001)           (0.0002)         (0.0001)
Volatility            0.09∗              0.22∗∗∗          0.11∗∗∗
                      (0.05)             (0.07)           (0.03)
Turnover              −0.0000            −0.0001          −0.0001∗∗
                      (0.0000)           (0.0001)         (0.0000)
Log Market Cap        −0.01∗∗∗           −0.01∗∗∗         −0.0005
                      (0.001)            (0.002)          (0.001)
Observations          20,335             20,335           20,335
Adjusted R2           0.37               0.44             0.33
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 6: Spearman and Pearson correlations between public and proprietary signals using all executions.

Public                Proprietary        Spearman Correlation   Pearson Correlation
PubNegChildOrderVol   NegChildOrderVol   0.48∗∗∗                0.39∗∗∗
PubNegIntervalVol     NegIntervalVol     0.25∗∗∗                0.16∗∗∗
PubQtyTimeCorrel      QtyTimeCorrel      0.48∗∗∗                0.28∗∗∗
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 7: Spearman and Pearson correlations between public and proprietary signals when executions with partic-
ipation rates of at least 0.6% (the median) are considered.

Public                Proprietary        Spearman Correlation   Pearson Correlation
PubNegChildOrderVol   NegChildOrderVol   0.77∗∗∗                0.58∗∗∗
PubNegIntervalVol     NegIntervalVol     0.33∗∗∗                0.25∗∗∗
PubQtyTimeCorrel      QtyTimeCorrel      0.50∗∗∗                0.29∗∗∗
*** p < 0.01, ** p < 0.05, * p < 0.10
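Tables 6 and 7 report both Spearman and Pearson correlations between the public and proprietary versions of each signal. The Python sketch below shows how the two measures differ, using the standard definition of Spearman's coefficient as the Pearson correlation of ranks, with tied values given their average rank. The helper names and the sample data are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    public = [1, 2, 3, 4, 5]            # hypothetical public-signal values
    proprietary = [1, 8, 27, 64, 125]   # monotone but nonlinear in `public`
    print(spearman(public, proprietary), pearson(public, proprietary))
```

A monotone but nonlinear relation yields a Spearman correlation of one and a smaller Pearson correlation, which is one possible reading of why the Spearman figures in both tables exceed the Pearson figures.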

Table 8: Analysis for buy and sell orders. This table presents the regression results where IS is the
dependent variable in separate datasets of buy and sell orders. The main independent variables are
three execution signals: NegChildOrderVol measures the sign-adjusted volatility of relative child-order
sizes. NegIntervalVol measures the sign-adjusted volatility of normalized trading intervals between
consecutive child-orders. QtyTimeCorrel measures the correlation between cumulative executed quantity
and total time. I add participation rate, bid-offer spread, the ratio of aggressive and passive orders, mid-
quote volatility, execution duration, turnover, and logarithm of market capitalization as control variables
but only the coefficients on the execution signals are shown. All regressions include stock, client and intraday
dummies. Standard errors are given in parentheses and are adjusted by clustering on calendar day.

Dependent Variable: IS
                      Buy        Sell       Buy        Sell       Buy        Sell
                      (1)        (2)        (3)        (4)        (5)        (6)
NegChildOrderVol      25.91      120.60∗∗
                      (62.99)    (47.10)
NegIntervalVol                              79.65∗∗∗   63.10∗∗
                                            (27.20)    (31.85)
QtyTimeCorrel                                                     90.54∗∗    98.22∗∗∗
                                                                  (41.12)    (31.18)
Observations          9,856      10,479     9,856      10,479     9,856      10,479
Adjusted R2           0.08       0.12       0.02       0.12       0.08       0.12
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 9: Stock-by-stock analysis. IS is regressed on execution control variables on a stock-by-stock basis. 385
stocks with more than 20 parent-order executions are used in the analysis. I use participation rate, the ratio of
passive and aggressive orders, bid-offer spread, mid-quote volatility, execution duration, turnover, logarithm of
market capitalization as control variables. For each signal, I report the average β coefficient across stocks and the
corresponding t-statistic. In the third row, the fraction of positive coefficients is given.

NegChildOrderVol NegIntervalVol QtyTimeCorrel


(1) (2) (3)

Mean β 251.43 160.84 137.74

t(β) 3.26 3.43 2.09

% positive 0.66 0.68 0.62
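The three summary rows of Table 9 can be computed from a list of per-stock coefficients as in the following Python sketch. It assumes the reported t-statistic is the usual cross-sectional one (mean divided by the standard error of the mean), which the caption does not state explicitly; the function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_betas(betas):
    """Return (mean beta, t-statistic of the mean, fraction of positive betas).

    Assumes a simple cross-sectional t-statistic: mean / (sample sd / sqrt(n)).
    Requires at least two coefficients with nonzero dispersion.
    """
    n = len(betas)
    avg = mean(betas)
    t_stat = avg / (stdev(betas) / sqrt(n))
    frac_positive = sum(b > 0 for b in betas) / n
    return avg, t_stat, frac_positive


if __name__ == "__main__":
    # Hypothetical per-stock coefficients, not taken from the paper.
    print(summarize_betas([1.0, 3.0, -1.0, 5.0]))
```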

Table 10: This table presents the regression results where dependent variables are two proxies of AT
activity: average trade size or trade-to-order ratio (%). I use execution data from November 2011 and
December 2011. Ban is a binary variable with the value of 0 before the ban and 1 after the ban. I
add bid-offer spread, mid-quote volatility, and logarithm of market capitalization and stock fixed effects.
Standard errors are given in parentheses and are adjusted by clustering on calendar day.

Dependent variable:
                  Average Trade Size   Trade-to-Order Ratio (%)
Ban               14.15∗∗∗             0.52∗∗∗
                  (4.17)               (0.18)
Spread            0.07∗∗               0.003∗∗∗
                  (0.03)               (0.001)
Volatility        2,118.07∗            42.53
                  (1,139.72)           (38.59)
Log Market Cap    −44.40               −0.35
                  (67.80)              (1.22)
Constant          128.49∗∗∗            0.47
                  (48.79)              (1.07)
Stock FE          Yes                  Yes
Observations      2,141                2,141
Adjusted R2       0.97                 0.57
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 11: This table presents the regression results where IS is the dependent variable. The main in-
dependent variables are three execution signals and post-ban dummy, Ban and their interaction terms:
NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes. NegIntervalVol
measures the sign-adjusted volatility of normalized trading intervals between consecutive child-orders.
QtyTimeCorrel measures the correlation between cumulative executed quantity and total time. Ban is
a binary variable with the value of 0 before the ban and 1 after the ban. I add participation rate, the
ratio of aggressive and passive orders, bid-offer spread, mid-quote volatility, execution duration, turnover,
and logarithm of market capitalization as control variables. For conciseness, I only report the coefficients
of interest. All regressions include stock, client and intraday dummies. Standard errors are given in
parentheses and are adjusted by clustering on calendar day.

Dependent variable: IS
                            (1)          (2)          (3)
NegChildOrderVol            151.82
                            (137.11)
NegIntervalVol                           216.36∗∗
                                         (86.03)
QtyTimeCorrel                                         245.30∗∗∗
                                                      (52.27)
Ban × NegChildOrderVol      −143.25
                            (154.12)
Ban × NegIntervalVol                     −204.09∗∗
                                         (100.88)
Ban × QtyTimeCorrel                                   −204.21∗∗∗
                                                      (77.65)
Ban × Participation Rate    55.34        81.55        49.04
                            (101.09)     (102.76)     (100.11)
Ban × Aggressive Ratio      18.39        17.86        16.71
                            (21.61)      (21.23)      (21.30)
Ban × Passive Ratio         −29.48       −25.19       −31.48
                            (24.41)      (24.13)      (23.26)
Observations                5,115        5,115        5,115
Adjusted R2                 0.10         0.10         0.10
*** p < 0.01, ** p < 0.05, * p < 0.10
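In an interaction specification of this kind, the implied post-ban sensitivity of IS to a signal is the sum of the signal coefficient and its Ban interaction. The Python sketch below carries out that arithmetic with the point estimates of Table 11, for example 216.36 − 204.09 = 12.27 for NegIntervalVol. It is a worked illustration only: it says nothing about the statistical significance of the sums, which would require the covariance of the two estimates. The names `TABLE_11` and `post_ban_slopes` are hypothetical.

```python
# Point estimates copied from Table 11: (signal coefficient, Ban x signal coefficient).
TABLE_11 = {
    "NegChildOrderVol": (151.82, -143.25),
    "NegIntervalVol": (216.36, -204.09),
    "QtyTimeCorrel": (245.30, -204.21),
}


def post_ban_slopes(coefficients):
    """Sum each pre-ban slope and its interaction term, rounded to 2 decimals."""
    return {
        name: round(pre_ban + interaction, 2)
        for name, (pre_ban, interaction) in coefficients.items()
    }


if __name__ == "__main__":
    for name, slope in post_ban_slopes(TABLE_11).items():
        print(name, slope)
```

The implied post-ban slopes are an order of magnitude smaller than the pre-ban ones, consistent with the abstract's statement that the price impact of predictability is smaller when order anticipation becomes difficult.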

Table 12: This table presents the regression results where dependent variables are logarithm of interval
volume (excluding client’s order) and the ratio of client’s order size to interval volume. Noisy is a binary
variable with the value of 1 if the arrival mid-price has a decimal in the set of {.000, .005, .010, .990, .995}
and 0 otherwise. I add bid-offer spread, mid-quote volatility, execution duration, and logarithm of market
capitalization as control variables. All regressions include stock, client and intraday dummies. Standard
errors are given in parentheses and are adjusted by clustering on calendar day.

Dependent variable:
                  log(TotQ_i)   Q_i / TotQ_i
                  (1)           (2)
Noisy             0.05∗∗        −0.003∗∗
                  (0.02)        (0.001)
Spread            −0.01         −0.0001
                  (0.01)        (0.0003)
Volatility        22.35∗∗∗      −0.34∗∗∗
                  (1.87)        (0.03)
Order Duration    1.78∗∗∗       −0.02
                  (0.46)        (0.02)
Log Market Cap    −0.20∗∗∗      −0.004∗
                  (0.06)        (0.002)
Observations      20,335        20,335
Adjusted R2       0.51          0.34
*** p < 0.01, ** p < 0.05, * p < 0.10
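The Noisy indicator defined in the caption of Table 12 can be sketched in Python as follows. The sketch assumes mid-prices are supplied as strings or floats quoted to at most three decimals and uses `decimal.Decimal` to avoid binary floating-point artifacts; the function name `is_noisy` is hypothetical.

```python
from decimal import Decimal

# Fractional parts that flag a "noisy" arrival mid-price, as listed in the caption.
NOISY_FRACTIONS = {
    Decimal("0.000"),
    Decimal("0.005"),
    Decimal("0.010"),
    Decimal("0.990"),
    Decimal("0.995"),
}


def is_noisy(arrival_mid):
    """Return 1 if the mid-price's fractional part is in the noisy set, else 0."""
    price = Decimal(str(arrival_mid)).quantize(Decimal("0.001"))
    return int(price % 1 in NOISY_FRACTIONS)


if __name__ == "__main__":
    for mid in ["25.005", "24.99", "25.37", "10.010", "10.015"]:
        print(mid, is_noisy(mid))
```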

Table 13: This table presents the regression results where IS is the dependent variable. The main in-
dependent variables are three execution signals and noise trading dummy, Noisy and their interaction
terms: NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes. NegIntervalVol
measures the sign-adjusted volatility of normalized trading intervals between consecutive
child-orders. QtyTimeCorrel measures the correlation between cumulative executed quantity and total
time. Noisy is a binary variable with the value of 1 if the arrival mid-price has a decimal in the set of
{.000, .005, .010, .990, .995} and 0 otherwise. I add participation rate, bid-offer spread, the ratio of aggres-
sive and passive orders, mid-quote volatility, execution duration, and logarithm of market capitalization
as control variables. All regressions include stock, client and intraday dummies. For conciseness, I only
report the coefficients of interest. Standard errors are given in parentheses and are adjusted by clustering
on calendar day.

Dependent variable: IS
                              (1)         (2)         (3)
Noisy × NegChildOrderVol      −65.09
                              (291.10)
Noisy × NegIntervalVol                    −378.63∗∗
                                          (165.94)
Noisy × QtyTimeCorrel                                 −220.21∗∗
                                                      (93.33)
Noisy × Participation Rate    466.14∗∗    508.72∗∗    492.33∗∗
                              (235.65)    (232.29)    (225.75)
Noisy × Aggressive Ratio      −28.50      −10.34      −19.65
                              (38.98)     (36.80)     (37.06)
Noisy × Passive Ratio         −58.13      −32.40      −44.20
                              (45.34)     (47.47)     (40.09)
Observations                  20,335      20,335      20,335
Adjusted R2                   0.07        0.07        0.07
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 14: This table reports each investor’s average PPI with its corresponding t-statistic. The dataset has 146 distinct investors. Client
identifiers are masked with numerical aliases. I define PPI according to Equation 11 and compute the average according to Equation 12.
C78 has only one execution and thus the t-statistic is not defined.

ID AvgPPI t-stat ID AvgPPI t-stat ID AvgPPI t-stat ID AvgPPI t-stat ID AvgPPI t-stat
(bps) (bps) (bps) (bps) (bps)
C26 259.91 11.46 C123 33.52 3.70 C88 6.63 0.22 C84 -7.65 -0.55 C9 -37.30 -1.03
C64 162.30 7.88 C7 32.87 2.07 C31 6.30 0.13 C19 -9.39 -0.45 C41 -38.73 -0.95
C138 123.73 3.96 C30 31.85 1.87 C15 6.25 0.32 C75 -10.20 -0.43 C76 -41.25 -3.45
C28 123.56 3.82 C6 27.93 1.53 C22 5.73 0.06 C102 -11.64 -0.94 C131 -41.32 -4.05
C70 116.60 6.03 C81 26.07 2.16 C106 5.72 0.37 C144 -11.65 -0.88 C77 -46.09 -2.45
C73 115.85 6.72 C99 23.20 1.84 C67 4.83 0.24 C46 -13.14 -0.24 C12 -47.50 -1.76
C51 107.63 4.71 C117 22.48 0.92 C132 4.52 0.45 C135 -13.54 -0.62 C38 -49.04 -2.47
C96 93.51 2.88 C114 21.87 2.24 C125 4.46 0.40 C142 -13.66 -1.32 C57 -49.23 -3.04
C50 92.57 4.18 C133 20.09 1.78 C136 4.33 0.54 C90 -13.76 -0.67 C134 -53.27 -4.11
C119 87.90 2.81 C89 19.88 1.05 C101 4.17 0.26 C44 -14.17 -1.01 C3 -61.05 -1.82
C93 85.77 7.16 C86 19.46 0.83 C137 4.17 0.25 C118 -14.78 -1.22 C120 -62.99 -3.66
C34 84.86 4.92 C141 19.20 2.77 C69 3.41 0.23 C122 -14.88 -1.13 C23 -63.89 -3.21

C65 80.12 1.66 C2 18.68 0.29 C140 3.08 0.11 C52 -15.83 -1.03 C129 -78.95 -6.85
C71 79.31 2.73 C59 17.96 1.24 C130 1.68 0.15 C108 -15.85 -1.38 C25 -81.14 -3.73
C60 79.17 6.36 C55 16.80 0.60 C143 1.17 0.05 C62 -16.26 -0.44 C78 -84.55 N/A
C107 78.10 2.27 C5 15.23 0.18 C74 0.69 0.05 C37 -16.40 -0.86 C54 -88.37 -1.11
C92 67.32 1.45 C14 15.13 0.79 C13 -0.70 -0.04 C79 -17.77 -1.05 C95 -92.75 -5.20
C83 65.24 2.23 C21 15.07 0.62 C29 -1.08 -0.06 C43 -17.87 -0.53 C20 -94.73 -4.28
C115 64.31 7.54 C126 14.52 1.65 C1 -1.47 -0.07 C87 -18.79 -1.71 C16 -94.98 -7.78
C48 63.99 3.17 C63 13.83 0.60 C139 -1.62 -0.17 C24 -19.60 -1.53 C58 -97.25 -3.94
C61 59.98 2.88 C85 13.41 0.98 C80 -2.85 -0.26 C36 -21.52 -0.38 C4 -115.97 -6.57
C103 55.55 1.42 C98 12.54 1.05 C97 -3.01 -0.31 C27 -22.28 -0.53 C112 -173.99 -7.42
C8 54.43 2.96 C40 10.94 0.78 C145 -3.11 -0.33 C32 -25.79 -0.86 C56 -263.00 -5.86
C100 49.03 2.27 C72 10.62 0.98 C110 -4.03 -0.26 C49 -26.86 -1.49 C53 -268.00 -13.53
C17 48.23 1.07 C116 10.46 1.15 C146 -4.81 -0.16 C104 -27.03 -0.48 C18 -511.08 -25.06
C105 40.97 2.23 C113 9.79 0.93 C47 -4.94 -0.24 C45 -28.07 -1.21 C33 -991.23 -1.04
C42 38.69 1.87 C35 9.25 0.42 C91 -5.33 -0.36 C109 -30.43 -2.14
C124 35.10 1.17 C127 8.83 0.24 C94 -6.22 -0.52 C68 -35.21 -2.53
C11 34.61 1.38 C128 7.20 0.55 C82 -6.39 -0.50 C39 -36.52 -2.31
C10 34.05 1.43 C121 7.16 0.46 C111 -6.45 -0.31 C66 -36.56 -2.76
Table 15: This table presents the regression results where OPI, PPI and TPI are the dependent variables respectively. The main independent
variables are three execution signals: NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes. NegIntervalVol
measures the sign-adjusted volatility of normalized trading intervals between consecutive child-orders. QtyTimeCorrel measures the
correlation between cumulative executed quantity and total time. I add participation rate, bid-offer spread, the ratio of aggressive and
passive orders, mid-quote volatility, execution duration, turnover, and logarithm of market capitalization as control variables but only
the coefficients on the execution signals are shown. All regressions include stock, client and intraday dummies. Standard errors are given in
parentheses and are adjusted by clustering on calendar day.

Dependent Variables:
                    NegChildOrderVol                 NegIntervalVol                   QtyTimeCorrel
                    OPI        PPI        TPI        OPI        PPI        TPI        OPI        PPI        TPI
                    (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
NegChildOrderVol    161.69∗∗∗  92.32      69.37
                    (61.14)    (118.87)   (87.80)
NegIntervalVol                                       84.20∗     64.15∗     20.05
                                                     (49.20)    (37.36)    (50.67)
QtyTimeCorrel                                                                         104.30∗∗∗  167.24∗∗   −62.93
                                                                                      (34.60)    (84.70)    (68.46)
Observations        20,335     20,335     20,335     20,335     20,335     20,335     20,335     20,335     20,335
Adjusted R2         0.10       0.07       0.07       0.10       0.07       0.07       0.10       0.07       0.07
*** p < 0.01, ** p < 0.05, * p < 0.10
Table 16: This table presents the regression results where IS is the dependent variable. The main indepen-
dent variables are three execution signals and their interaction with PPI, the permanent price impact of the
order (in bps) that can proxy informed trading. NegChildOrderVol measures the sign-adjusted volatility of
relative child-order sizes. NegIntervalVol measures the sign-adjusted volatility of normalized trading
intervals between consecutive child-orders. QtyTimeCorrel measures the correlation between cumulative
executed quantity and total time. I add participation rate, bid-offer spread, the ratio of aggressive and
passive orders, mid-quote volatility, execution duration, turnover, and logarithm of market capitalization
as control variables. All regressions include stock, client and intraday dummies. Standard errors are given
in parentheses and are adjusted by clustering on calendar day.

Dependent variable: IS
                            (1)        (2)        (3)
PPI × NegChildOrderVol      0.87∗∗
                            (0.40)
PPI × NegIntervalVol                   0.15
                                       (0.24)
PPI × QtyTimeCorrel                               1.07∗∗
                                                  (0.45)
NegChildOrderVol            68.76∗∗
                            (33.45)
NegIntervalVol                         37.54∗∗
                                       (17.36)
QtyTimeCorrel                                     66.34∗∗
                                                  (30.61)
PPI                         0.25∗∗∗    0.24∗∗∗    −0.81∗
                            (0.01)     (0.02)     (0.44)
Observations                20,335     20,335     20,335
Adjusted R2                 0.38       0.38       0.38
*** p < 0.01, ** p < 0.05, * p < 0.10

Table 17: This table presents the regression results where SC and APM are the dependent variables respectively. The main independent
variables are three execution signals: NegChildOrderVol measures the sign-adjusted volatility of relative child-order sizes. NegIntervalVol
measures the sign-adjusted volatility of normalized trading intervals between consecutive child-orders. QtyTimeCorrel measures the
correlation between cumulative executed quantity and total time. I add participation rate, bid-offer spread, the ratio of aggressive and
passive orders, mid-quote volatility, execution duration, turnover, and logarithm of market capitalization as control variables but only
the coefficients on the execution signals are shown. All regressions include stock, client and intraday dummies. Standard errors are given in
parentheses and are adjusted by clustering on calendar day.

Dependent Variables:
                      NegChildOrderVol      NegIntervalVol        QtyTimeCorrel
                      SC         APM        SC         APM        SC         APM
                      (1)        (2)        (3)        (4)        (5)        (6)
NegChildOrderVol      1.29       86.88∗∗
                      (3.87)     (43.51)
NegIntervalVol                              0.16       54.35∗∗
                                            (0.48)     (21.84)
QtyTimeCorrel                                                     −1.54      72.43∗∗∗
                                                                  (2.02)     (25.38)
Observations          20,335     20,335     20,335     20,335     20,335     20,335
Adjusted R2           0.12       0.07       0.11       0.07       0.12       0.07
*** p < 0.01, ** p < 0.05, * p < 0.10
