
# Investigating the Impact of Delta-Gamma Hedging on S&P 500 Returns from 2007-2020

Lily
December 2020

## 1 Dataset
For this assignment, perhaps against better judgment, I set out to investigate
the effect of market maker delta-gamma hedging in the options market on
S&P 500 returns from 2007 to 2020. To build the dataset, I combined end-of-day
index options data (a snapshot taken at 3:45 PM EST) from Option Research &
Technology Services (orats.com) with daily OHLC (open, high, low, close) data
for SPY (the major exchange-traded fund, operated by State Street Global
Advisors, that tracks the S&P 500 Index), pulled from the Yahoo! Finance API.

## 1.1 Preprocessing
Given the structure of the option API (as described in the ORATS documentation,
footnote 1), I had to preprocess the individual option chains received to
generate the options-related features (net delta of put options, net delta of
call options, and the Net Options Pricing Effect (NOPE)) used for the analysis.
These features were calculated as follows:
• Total Call Delta - The sum of delta per call contract c, weighted by daily
traded volume, over all option chains v and expirations q from day t to 2023
(latest option chain available).

$$TCD_t = \sum_{v \in V} \sum_{q \in Q} \sum_{c \in C} \text{volume}^{c}_{v,q} \cdot \text{delta}^{c}_{v,q}$$

• Total Put Delta - The sum of delta per put contract p, weighted by daily
traded volume, over all option chains v and expirations q from day t to 2023
(latest option chain available).

$$TPD_t = \sum_{v \in V} \sum_{q \in Q} \sum_{p \in P} \text{volume}^{p}_{v,q} \cdot \text{delta}^{p}_{v,q}$$
1 https://docs.orats.io/datav2-api-guide/core-research.htmlexecutive-summary

• Net Option Delta - The net delta obtained by subtracting (if put delta is
represented as positive) or adding (if put delta is represented as negative)
total put delta and total call delta for day t.

$$NOD_t = TPD_t + TCD_t$$

• Net Options Pricing Effect (NOPE) - The proposed metric under primary
analysis. It divides the net option delta by the total traded share volume
during the market session (TSM) for day t for the ticker (conventionally,
one share is equal to one delta).

$$NOPE_t = \frac{TPD_t + TCD_t}{TSM_t}$$

• NOPE MAD(30) - A simple statistical application of NOPE, using median
absolute deviation to compare the NOPE on day t to the previous 30 trading
days.
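
The feature definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(kind, volume, delta)` tuple layout and all function names are hypothetical (they are not the ORATS schema), put deltas are taken as negative, no contract-size scaling is applied, and reading NOPE MAD(30) as a median/MAD-scaled deviation is one possible interpretation rather than the exact calculation used here.

```python
from statistics import median

def net_option_delta(contracts):
    """Return (TCD, TPD, NOD) for one day.

    contracts: iterable of (kind, volume, delta) tuples (hypothetical layout);
    kind is "call" or "put", and put deltas are assumed to be negative.
    """
    contracts = list(contracts)
    tcd = sum(vol * delta for kind, vol, delta in contracts if kind == "call")
    tpd = sum(vol * delta for kind, vol, delta in contracts if kind == "put")
    return tcd, tpd, tcd + tpd

def nope(contracts, share_volume):
    """NOPE_t = (TPD_t + TCD_t) / TSM_t, with no extra scaling applied."""
    _, _, nod = net_option_delta(contracts)
    return nod / share_volume

def nope_mad(previous, today, window=30):
    """Distance of today's NOPE from the median of the previous `window` days,
    measured in units of their median absolute deviation (assumed reading)."""
    recent = list(previous)[-window:]
    center = median(recent)
    mad = median(abs(x - center) for x in recent)
    return (today - center) / mad if mad else float("nan")

# Toy example: one call line and one put line traded on the same day.
day = [("call", 100, 0.5), ("put", 40, -0.25)]
print(net_option_delta(day))
print(nope(day, share_volume=1000))
```

In practice the same sums would run over every strike and expiration in the day's option chains.
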
Similarly, to analyze return data, I needed to define time intervals (for my
analysis, I used N ∈ {2, 5, 14, 30} days) to explore as derived features, including:
• Next Day Return - The percent change between the close of market
session (4 P.M. EST) on day t and day t+1.
• Next N -Days Return - The percent change between the close of market
session on day t and day t + N.
• Lowest N -Days Low Return - The percent change between the close
of market session on day t and the lowest low occurring in the following
N days.
• Highest N -Days High Return - The percent change between the close
of the market session on day t and the highest high occurring in the
following N days.
• Days to Lowest N-Days Low Return - The number of trading days between
day t and the day on which the lowest N-day low occurs.
• Days to Highest N-Days High Return - The number of trading days between
day t and the day on which the highest N-day high occurs.
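
As a rough sketch of the return features listed above, the function below computes them for a single day index from plain lists of daily prices. The function name, the dictionary keys, and the choice to count "days to" as a 1-based offset from day t are assumptions made for illustration.

```python
def forward_features(closes, highs, lows, t, n):
    """Forward-looking return features for day index t over the next n trading days.

    closes, highs, lows: equal-length lists of daily prices (hypothetical inputs).
    """
    if t + n >= len(closes):
        raise ValueError("not enough future data for this window")
    base = closes[t]
    future_lows = lows[t + 1 : t + 1 + n]
    future_highs = highs[t + 1 : t + 1 + n]
    lowest, highest = min(future_lows), max(future_highs)
    return {
        "next_day_return": closes[t + 1] / base - 1,
        "next_n_return": closes[t + n] / base - 1,
        "lowest_low_return": lowest / base - 1,
        "highest_high_return": highest / base - 1,
        # 1-based offsets: 1 means the extreme occurs on day t + 1
        "days_to_lowest_low": future_lows.index(lowest) + 1,
        "days_to_highest_high": future_highs.index(highest) + 1,
    }

features = forward_features(
    closes=[100, 102, 99, 105],
    highs=[101, 103, 100, 106],
    lows=[99, 100, 97, 103],
    t=0,
    n=3,
)
print(features)
```
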
In interpreting option data on SPY (or any ticker with similarly regular
dividends), I needed to be mindful of outlier data that arises from the behavior
of options before the ticker goes ex-dividend. In particular, before ex-dividend
dates certain market participants engage in dividend arbitrage, purchasing put
and call options ahead of the ex-dividend date for arbitrage profit. This vastly
inflates the call and put volume (and the associated delta and NOPE features) and
can be treated as removable noise for analysis. Hence, I created a separate
feature that flags dates before ex-dividend dates as 0 (false) or 1 (true).
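
A minimal sketch of such a flag, assuming the trading calendar and the ex-dividend dates are available as sorted lists of `datetime.date` values (the function name and inputs are hypothetical), could look like this:

```python
from datetime import date

def day_before_ex_dividend_flags(trading_days, ex_dividend_dates):
    """Return 1 for each trading day whose next trading day is an ex-dividend date, else 0."""
    ex_dates = set(ex_dividend_dates)
    flags = []
    for i in range(len(trading_days)):
        next_day = trading_days[i + 1] if i + 1 < len(trading_days) else None
        flags.append(1 if next_day in ex_dates else 0)
    return flags

# Toy calendar around one ex-dividend date taken from this dataset's range.
days = [date(2020, 9, 15), date(2020, 9, 16), date(2020, 9, 17), date(2020, 9, 18)]
print(day_before_ex_dividend_flags(days, [date(2020, 9, 17)]))
```

Rows flagged with 1 can then be dropped or handled separately before computing NOPE statistics.
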

## 1.2 Features and Composition
In total, the dataset includes trading days (non-weekends, non-holidays) from
1/3/07 to 11/12/20, for a total of 3486 days. While this is smaller than would
be ideal (e.g. 50,000 days), it covers nearly 14 years of trading history. The
dataset includes the following information per date:
• Date (Integer) - The number of days elapsed at date t since the first date
studied (1/3/2007). This is not used in the analysis except to derive
features (year, month) and during pre-processing.
• Year (One-Hot Encoding) - The year of each date analyzed (0-13).
• Month (One-Hot Encoding) - The month of each date analyzed (0-11).
• Share Volume (Integer) - Total traded share volume during the market
session on day t.
• Call Volume (Integer) - Call volume traded during the market session on
day t.
• Put Volume (Integer) - Put volume traded during the market session on
day t.
• Call Delta (Integer) - Described under preprocessing.
• Put Delta (Integer) - Described under preprocessing.
• Net Option Delta (Integer) - Described under preprocessing.
• NOPE (Float) - Described under preprocessing.
• NOPE MAD(30) (Float) - Described under preprocessing.
• Open (Float) - SPY opening price on day t.
• High (Float) - SPY daily high on day t.
• Low (Float) - SPY daily low on day t.
• Close (Float) - SPY daily close on day t, unadjusted for dividends.
• Volume (Integer) - SPY daily trading volume on day t.
• Day Before Ex-Dividend (Boolean) - Described under preprocessing.
• Next Day Return (Float) - Described under preprocessing.
• Next N-Days Return (Float) - Described under preprocessing.
• Lowest N-Days Low Return (Float) - Described under preprocessing.
• Highest N-Days High Return (Float) - Described under preprocessing.
• Days to Lowest N-Days Low Return (Float) - Described under
preprocessing.
• Days to Highest N-Days High Return (Float) - Described under
preprocessing.

## 1.3 Statistics & Properties

The time period studied in this analysis runs from 1/3/2007 to 11/12/2020, a
continuous set of 3486 trading days. During this time period, there were 56
ex-dividend dates for SPY, starting from 3/15/07 and extending to 9/17/20.
During this time period, SPY achieved a dividend-adjusted total return of 148.3%
(nearly 6.7% per year, compounded annually). However, there were also two major
crashes/recessionary periods responsible for negative growth during this time:
• The Global Financial Crisis (2007-2009) - In the stock market, this period
ran from a high in October 2007 to a bottom in March 2009.
• The Coronavirus Crash (2020) - In the stock market, this period ran from a
high on February 19, 2020 to a bottom on March 23, 2020.

| SPY Statistic | Value |
|---|---|
| Minimum Low | \$67.10 |
| Maximum Low | \$355.06 |
| Minimum High | \$70.00 |
| Maximum High | \$364.38 |
| Minimum Open | \$67.95 |
| Maximum Open | \$363.97 |
| Minimum Close | \$68.11 |
| Maximum Close | \$357.70 |

## 2 Literature & Related Studies

While this dataset and its associated theoretical basis were developed by me de
novo, there is significant prior literature examining the impact of index
options dynamic hedging and the ability of the options market to predict future
returns.

Figure 1: Histogram of Close-to-Close Return (1 day) vs Frequency for Given Days

Figure 2: Histogram of maximum 14-day High % Return vs Frequency for Given Days

Figure 3: Histogram of maximum 30-day High % Return vs Frequency for Given Days

For example, in a recent related study (“Gamma Fragility”), Andrea Barbon
and Andrea Buraschi explored the impact of market maker positioning
(specifically, the gamma imbalance of options holdings) on the behavior of both
single-stock and index equities. This study looked at security data from
1996-2017 using the IvyDB dataset (for index and equity options, including
Greeks) from OptionMetrics, merged with index and equity return data from
TAQ/CRSP (Barbon and Buraschi). In particular, similar to my hypothesis, they
found that market maker gamma imbalance contributes to both intra-day and
multi-day abnormal returns. Specifically, positive market maker (dealer) gamma
was associated with more muted equity/index movements and a tendency to reverse
intra-day, while negative gamma imbalance was associated with strong price
movement.
Similarly, in 2016, SqueezeMetrics Research published a white-paper de-
scribing the Gamma Exposure Index (GEX), a computation of the net gamma
implied by the open interest of call and put options on a given ticker. In its
paper, SqueezeMetrics found strong correlation between SPX returns and Prior
Day GEX Close; specifically that negative GEX was associated with higher-
than-normal next-day volatility, while positive GEX was associated with lower-
than-normal next-day volatility (Zambito).
However, this effect isn’t a recent phenomenon; it was first surmised and
documented by Pearson, Poteshman, and White in their 2007 paper “Does Option
Trading Have a Pervasive Impact on Underlying Stock Prices?”. In the
paper, the authors used OptionMetrics IvyDB on all CBOE optionable stocks
from 1990 to 2001, using stacked regression for each of the 2,308 tickers to com-
pare daily returns over time with computed market maker gamma imbalance.
They discovered that up to 12% of daily stock return, on average, can be at-
tributed to the impact of market maker re-hedging (delta-gamma hedging), with
the cause being a strong negative correlation between net positive gamma and
stock market volatility (Pearson et al.).
Overall, the observations in the papers described seem to match the findings
on this dataset, although they use related metrics rather than exactly the same
formula (GEX, for instance, measures open interest gamma rather than traded net
delta). In particular, we can see in Figure 5 and Figure 6 a correlative
relationship between end-of-day NOPE and next-day volatility. This may have
deeper implications beyond this paper’s analysis, including about the nature
of market maker gamma exposure (e.g. that it may be driven mostly by day t
delta exposure, rather than by all open interest).

## 3 Hypothesis & Prediction

## 3.1 Hypothesis
Based on existing literature and observation in forward testing, my hypothe-
sis was that the Net Options Pricing Effect (NOPE) metric would correlate
significantly to market crash and crash-like (corrections) movements caused or
exacerbated by options hedging effects. This was based on the following as-
sumptions:

• The options market is better informed about future movements than the
stock market.

• The effect of options hedging and re-hedging becomes dominant in cases
of low market liquidity (low share trading volume).
• Periods of mania and irrational exuberance (Minsky moment cycles) often
precede market crashes and corrections.

Quantifying what defines such a period is difficult in practice, however, and it
is almost always defined in retrospect. For the purpose of this research, we can
constrain this hypothesis to the following:
• There is a correlation between day t NOPE and day t + N return. In
particular, I anticipate in scenarios where NOPE is significantly positive
and elevated, day t + N return should be worse than average.
• In cases of significantly large negative day t NOPE, I anticipate day t + N
return should be better than average.
• I anticipate a relationship between highly positive day t NOPE and

• We can define the ’bottom’ of a given N day period as the day in which
the lowest close is achieved in that period.
• I anticipate an inverse correlation between highly positive NOPE on day
t and Days to Highest N -Days High Return for a given t + N day time
interval.

• I anticipate an inverse correlation between highly negative NOPE on day
t and Days to Lowest N-Days Low Return for a given t + N day time
interval.
Given the theoretical basis (that an abnormally high or low NOPE has a causal
relationship with anomalous market behavior), I first investigated the baseline
properties of N-day (lowest low, highest high) returns over time.

| Return | Mean | Median | Standard Deviation |
|---|---|---|---|
| N = 1 Close-to-Low | -0.0067283 | -0.0041603 | 0.01167004 |
| N = 1 Close-to-High | 0.006445 | 0.004566 | 0.009861297 |
| N = 1 Close-to-Close | 0.0003561 | 0.0006179 | 0.01306043 |
| N = 2 Lowest-Close | -0.003550 | -0.001165 | 0.01544957 |
| N = 2 Highest-Close | 0.004595777 | 0.003615 | 0.01401055 |
| N = 5 Lowest-Close | -0.010561 | -0.005255 | 0.02188044 |
| N = 5 Highest-Close | 0.011816 | 0.009290 | 0.01768162 |
| N = 14 Lowest-Close | -0.022530 | -0.012575 | 0.03578984 |
| N = 14 Highest-Close | 0.023800 | 0.019820 | 0.02471563 |
| N = 30 Lowest-Close | -0.036304 | -0.021620 | 0.05135804 |
| N = 30 Highest-Close | 0.03803 | 0.03390 | 0.03295866 |

Additionally, I investigated the values of other derived baseline population
statistics in the dataset. One notable metric is the proportion of days where
SPY has a positive net return (as measured from the previous day’s close to
today’s close) (Figure 4).

Figure 4: Given NOPE thresholding (X axis) vs Proportion of Close-to-Close
Change Greater Than Zero

| Statistic | Baseline |
|---|---|
| Close-to-Close Change Greater Than Zero | 0.54790595 |
| Intra-day High-Low Mean Change | 0.01317114 |
| Median Days to Highest 14-Days High | 12 |
| Median Days to Highest 30-Days High | 25 |
| Median Days to Lowest 14-Days Low | 7 |
| Median Days to Lowest 30-Days Low | 11 |

From this, I performed a simple exploratory analysis via linear thresholding
on day t NOPE compared to the baseline proportion “Close-to-Close Change
Greater Than Zero” (Figure 4). This was done by bucketing NOPE into 5-unit
increments and, for each threshold value d, comparing all NOPE values greater
than or equal to d with the values below d, recomputing the proportion above
and below the threshold.
We can clearly observe from Figure 4 that there is a monotonic decrease in the
observed proportion versus the baseline for thresholds beginning around
$NOPE_t = 0$. Even at sufficiently high positive end-of-day NOPE values, the
proportion of positive next-day SPY returns (Close-to-Close Change Greater Than
Zero) does not fall below about 30%, but this is likely due to the small sample
size matching those thresholds.
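
The thresholding comparison described above can be sketched as follows. The function and variable names are hypothetical, and the inputs are assumed to be aligned lists of day t NOPE values and the corresponding next-day close-to-close returns.

```python
def share_positive(returns):
    """Proportion of returns greater than zero, or None for an empty bucket."""
    return sum(r > 0 for r in returns) / len(returns) if returns else None

def proportion_up_by_threshold(nope_values, next_day_returns, thresholds):
    """For each threshold d, compare the share of positive next-day returns
    among days with NOPE >= d against days with NOPE < d."""
    rows = []
    for d in thresholds:
        above = [r for x, r in zip(nope_values, next_day_returns) if x >= d]
        below = [r for x, r in zip(nope_values, next_day_returns) if x < d]
        rows.append((d, share_positive(above), len(above), share_positive(below), len(below)))
    return rows

# Toy data; real thresholds would step in 5-unit increments, e.g. range(-50, 55, 5).
rows = proportion_up_by_threshold(
    nope_values=[-10, 0, 5, 20],
    next_day_returns=[0.01, 0.02, -0.01, -0.02],
    thresholds=[0, 5],
)
for row in rows:
    print(row)
```

Reporting the bucket sizes alongside each proportion makes the small-sample caveat at extreme thresholds visible.
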
Similarly, we can observe correlations between end-of-day NOPE and close-
to-close change in the graphs below (Figure 5, Figure 6). These graphs imply
a correlation between NOPE and magnitude of next-day returns of some de-
gree: negative NOPE seems correlated to higher variance of SPY daily return
(volatility), while positive NOPE seems correlated to lower variance.
Finally, an important metric to analyze, especially over the given time range
(2007-2020), is the evolution of NOPE’s variance over time (Figure 8). This
has implications for comparing extremes, as well as for performing
regression-based prediction. My initial idea in computing the value was to use
median absolute deviation to identify outliers and to normalize $NOPE_t$
against peer days (the previous 30 trading days). However, as is easily seen in
Figure 8, NOPE seems to exhibit significant heteroskedasticity. This is not
unexpected for stock market-related time series data, and hinted that I should
check for auto-correlation.

Figure 5: End of Day NOPE (X axis) vs Close-to-Close Returns

Figure 6: End of Day NOPE (X axis) vs Close-to-Close Returns, removing all
cases where |NOPE| < 20

Figure 7: Auto-correlation of $NOPE_t$ to lag = 40

I confirmed this with an auto-correlation plot (Figure 7), and could observe
auto-correlation between day t and preceding NOPE values up to lag = 5,
significant at the α = 0.05 level. Given that, in my linear models I attempted
to include prior days’ values as part of the regression analysis.
Therefore, my assumption is that simple comparison via NOPE MAD(30) will
not be predictive over long timescales.
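
For reference, a check of this kind can be sketched with the sample autocorrelation and the usual approximate 95% white-noise band of ±1.96/√n. This is a generic illustration with hypothetical function names, not necessarily the routine that produced Figure 7.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n))
    return num / denom

def significant_lags(series, max_lag, z=1.96):
    """Lags whose autocorrelation falls outside the approximate band +/- z / sqrt(n)."""
    bound = z / math.sqrt(len(series))
    return [k for k in range(1, max_lag + 1) if abs(autocorrelation(series, k)) > bound]

# Toy series with an obvious trend, hence positive lag-1 autocorrelation.
print(autocorrelation([1, 2, 3, 4, 5], lag=1))
```

Applied to the daily NOPE series with `max_lag=40`, the returned lags would correspond to the bars that cross the significance band in an auto-correlation plot.
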

## 3.2 Predictive Task

While other correlations were observed during analysis (which I detail in
References), for the purpose of this assignment I will focus on next-day return
behavior, with the end goal of devising an alpha-generating (absolute returns)
strategy. As a benchmark, we can use the simple strategy of going long SPY
(purchasing and holding shares) over the period tested. For simplicity,
we can assume zero transaction fees, and I’ve ignored dividends in return
calculations (in fact, in the model’s predictions we can safely ignore any
scenario of going short the next day before ex-dividend days).
For this exercise, we can divide our dataset into a training and a test period.
Given the heteroskedasticity and auto-correlation of NOPE over time, one
configuration for testing the task is to randomly sample approximately 20%
of days per month. However, given our exploratory analysis, it seems unlikely
that a random sample would capture enough high-magnitude end-of-day NOPE
samples to actually show a marked improvement.

Figure 8: End of Day NOPE (Y axis) vs Day t

We can instead divide the data into a more conventional training period
(2007-2017) and test period (2017-2020), in order to better capture multi-day
events and safely threshold over time based on NOPE’s heteroskedasticity.
Lastly, we can also try a windowing approach, given the auto-correlation and
heteroskedasticity observed: instead of fixing a set threshold or training
data set, we can have the model look back at the previous N trading days
and keep relearning the best strategy. I applied this strategy directly in the
Naive Bayesian approach, with admittedly mixed results.
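
The sequential split and the windowing approach can both be expressed as index ranges over the trading days. The sketch below uses hypothetical helper names and parameters; it only illustrates the bookkeeping, not the model fitting itself.

```python
def sequential_split(n_days, train_fraction=0.8):
    """Index ranges for a chronological train/test split (no shuffling)."""
    cut = int(n_days * train_fraction)
    return range(0, cut), range(cut, n_days)

def rolling_windows(n_days, lookback, step):
    """Yield (train_range, predict_range) pairs: fit on the previous `lookback`
    days, predict the next `step` days, then slide forward and refit."""
    start = lookback
    while start < n_days:
        stop = min(start + step, n_days)
        yield range(start - lookback, start), range(start, stop)
        start = stop

train, test = sequential_split(10)
print(list(train), list(test))
for fit_days, predict_days in rolling_windows(10, lookback=4, step=3):
    print(list(fit_days), "->", list(predict_days))
```

Both helpers keep the days in order, so no future observation leaks into the data a model is fitted on.
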

## 4 The Model/Analysis
Given the focus on next-day behavior only for generating alpha, the models will
use the following features to test and predict next-day returns and volatility:
• Date (Integer)
• Year (One-Hot Encoding)
• Month (One-Hot Encoding)
• Share Volume (Integer)
• Call Volume (Integer)
• Put Volume (Integer)
• Call Delta (Integer)
• Put Delta (Integer)
• Net Option Delta (Integer)
• NOPE (Float)
• NOPE MAD(30) (Float)
• NOPE (Binned) (One-Hot Encoding) - This feature decomposes the
NOPE metric into q bins (e.g. quantiles, deciles) which are used as
categorical regression variables.
• Today is Green (Boolean) - Derived feature returning true if the close price
for day t is greater than the opening price.
• Volume (Integer)
• Day Before Ex-Dividend (Boolean)

During the process of selecting the best model for predicting next-day return,
I looked into multiple avenues for maximizing total return using the
simple long-short technique discussed above. These included:
• Linear Regression - Here, the output is the predicted return for the
following day (day t + 1); the strategy goes short if the predicted
close-to-close return is less than 0 for that day alone.
• Logistic Regression - Here, the output predicts the close-to-close return
categorically (whether it is less than or greater than 0), and the strategy
goes short or long accordingly.
• Naive Bayesian Thresholding - As a naive model, we can find a threshold
value of $NOPE_t$ above which the average return implied in the test set is
negative, and look to optimize it by examining the coefficient of variation
and the number of days short required.

## 4.1 Linear & Logistic Regression

To investigate this model, I applied both modified linear and logistic regression
approaches to determine whether alpha generation based on predicted next-day
return was plausible. I surmised that linear regression would be more effective
at predicting absolute return and avoiding undesirable boundary behavior (given
that the mean daily return of SPY is 0.0003561, a simple categorical variable
checking for greater than 0 might end up ineffective so close to the boundary
condition).
However, based on the observed behavior of NOPE deciles and coefficient
of variation (Figure 9), I anticipated a non-linear relationship between
NOPE and next-day return (extreme NOPE values seem to have a substantial
correlation, while moderate positive and negative ones much less so). Therefore,
I expected that linear regression overall would be a poor estimator of next-day
returns, which would imply a fairly low $r^2$ and also poor performance using the
most naive evaluation approach (short SPY when the model predicts returns
< 0, long otherwise).
Conversely, logistic regression, while suffering from the mean return being
close to the logistic boundary, would highly weight picking the actual direction of
returns correctly (since whether the return is +0.10% or +1.00% is irrelevant if
we only consider the sign of the return). Therefore, for simple long-short it may
outperform the linear regression model, especially with a modified probability
cutoff.
In both cases, I observed a tradeoff in using categorical binning of
NOPE, as well as one-hot encoding of year, in the regression: while it did
improve accuracy, at higher binning levels it also led to overfitting between
the train and test portions of the dataset (using both random and sequential
sampling, as described above). This makes sense intuitively: while NOPE is not
continuously distributed in its apparent effect range (for low-magnitude values,
it seems to show a weak relationship to SPY, if any), the actual value is
continuous, falling in a range between -150 and +150 in the non-ex-dividend
circumstances observed. Therefore, when the number of categorical bins becomes
sufficiently large, the model will overfit each bin based on the training data
(n = 2745).
Similarly, the year parameter is also prone to overfitting, given the time
range involved. In the given dataset, there are two fairly large-scale crashes
(the Global Financial Crisis in 2008 and the Coronavirus Crash in 2020) which
heavily distort yearly (in the GFC case) and monthly (February/March, in the
Coronavirus Crash case) returns. Therefore, by one-hot encoding the date/time
variables, I did observe increases in accuracy (logistic) and $r^2$ (linear)
which were likely due in part to overfitting.

## 4.2 Naive Bayesian Thresholding

Perhaps the most versatile model for a simple long-short strategy, however, is to
establish a simple scalar threshold for NOPE values, which can be adjusted
or re-selected on a varying basis in response to new data (this could rectify
some of the heteroskedasticity observed over years, while the auto-correlation
was observed mostly on the order of days). This has a significant benefit in
working with a recent trend (and hence fairly limited data): there is no real
worry of overfitting, since the labels are matched exactly. However, it can also
be skewed by the low sample size it matches (especially for positive NOPE, where
the sample size is small at the high values at which we anticipate observing an
effect) and may also be insufficient for predicting future trends.

Figure 9: End of Day NOPE Decile (X axis) vs Next Day Coefficient-of-
Variation in Daily Returns

To build this, I had the model window a period (the lookback period) before
the given day observed, and look to find the threshold maximizing the absolute
dollar return of the strategy over that period. This would be achieved simply
through long-short (e.g. find the total return of the lookback period for threshold
h, and short if end-of-day NOPE was greater than h, long otherwise).
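
The lookback search described above can be sketched as a brute-force scan over candidate thresholds. The names are hypothetical, and the sketch assumes that a short position earns the negative of the day's return (no borrowing costs or fees) and that returns compound daily.

```python
def strategy_return(nope_values, next_day_returns, threshold):
    """Compounded return of: short the next day if NOPE > threshold, long otherwise."""
    growth = 1.0
    for x, r in zip(nope_values, next_day_returns):
        growth *= (1 - r) if x > threshold else (1 + r)
    return growth - 1

def best_threshold(nope_values, next_day_returns, candidates):
    """Candidate threshold h with the highest compounded return over the lookback period."""
    return max(candidates, key=lambda h: strategy_return(nope_values, next_day_returns, h))

# Toy lookback period of three days.
lookback_nope = [10, 30, 50]
lookback_returns = [0.01, -0.02, -0.03]
h = best_threshold(lookback_nope, lookback_returns, candidates=[20, 40, 60])
print(h, strategy_return(lookback_nope, lookback_returns, h))
```

The chosen h would then be applied to the days following the lookback period until the next re-thresholding.
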
Though it ends up probably too simplistic for real world use, this thresh-
old can additionally be used as a future feature for more complicated models
incorporating this analysis.

## 5 Results & Conclusions

For my results, I tried two basic approaches for evaluating return:
• Windowing - Given the time dependence on NOPE thresholding and auto-
correlation, I created a sliding window of variable size (N = 50, 100 trading
days) which was used to train each of the models in the selection task
(whether to go long or short next day). This would be used by the model
to predict for the next 20 days (mostly a limitation of computing power).
• Train/Test Split - Through this approach, I segmented the dataset into
20% test, 80% training data, kept sequential. This is because of the time
dependence, but I did surmise it would be more likely to overfit (since
anomalous years did occur in the dataset).

As mentioned in “Linear & Logistic Regression”, I did observe substantial
overfitting when fitting NOPE data into categorical bins, as well as through the
categorical encoding of years. To address the latter issue, I tried multiple
timeframes, including removing the 2007-2009 range (the Global Financial Crisis).
In all cases, I compared the following strategies:

• Long SPY only - In this case, the model predicts the return (minus
ex-dividend dates) of holding SPY long for the test period.
• Long SPY except over threshold - In this case, the model holds SPY
long except for the day t + 1 where the model predicts negative returns,
in which case it makes 0% return (akin to leaving the market).

• Long/Short - In this case, the model follows the strategy outlined above
(long SPY except for day t + 1 where the model predicts negative returns,
in which case it goes short).
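
To make the comparison concrete, the three strategies can be evaluated side by side from a list of realized next-day returns and the model's negative-return predictions. This is a sketch with hypothetical names, assuming daily compounding, zero transaction fees, and a short position that earns the negative of the day's return.

```python
def compare_strategies(next_day_returns, predicted_negative):
    """Compounded returns of long-only, long-or-flat, and long/short.

    predicted_negative[i] is True when the model predicts a negative return
    for the day whose realized return is next_day_returns[i].
    """
    long_only = long_flat = long_short = 1.0
    for r, negative in zip(next_day_returns, predicted_negative):
        long_only *= 1 + r
        long_flat *= 1.0 if negative else 1 + r      # sit out: 0% return that day
        long_short *= (1 - r) if negative else (1 + r)
    return {
        "long_only": long_only - 1,
        "long_flat": long_flat - 1,
        "long_short": long_short - 1,
    }

print(compare_strategies([0.10, -0.10], [False, True]))
```
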
In the linear and logistic regression cases, I observed substantially improved
accuracy (and $r^2$) by including the previous 3-4 days’ values as part of the
regression parameters (due to the auto-correlation observed in Figure 7).
For logistic regression, instead of using the categorical variable output alone
(0 vs. 1), I was also curious to determine the model’s “surety” of going short,
and tried various probability cutoffs in order to determine the optimal
threshold for action. I did the same for linear regression.
The following results are the absolute returns generated by the various
strategies when splitting the dataset sequentially into 80% training and 20%
testing data.

Figure 10: Simple Logistic Regression Returns (Threshold = 0.5) - Day t (X
axis) vs Absolute Return over Time

Figure 11: Modified Logistic Regression Returns (Threshold = 0.3) - Day t (X
axis) vs Absolute Return over Time

In the Naive Bayesian model, I implemented a sliding window of size w
($w \in \{50, 100\}$ trading days) which is re-thresholded every 20th day (mostly
for computational resource reasons). In this model, the only factor considered
is the raw NOPE metric. The results at various window sizes are shown
in the figures below. On testing, I noticed in particular higher-than-expected
accuracy/returns due to the model choosing to stay short during the GFC period
(2007-2009), so I have provided returns with and without that period
below.

Figure 12: Simple Logistic Regression Returns (Threshold = 0) - Day t (X axis)
vs Absolute Return over Time

Figure 13: Simple Linear Regression Returns (Threshold = -0.20) - Day t (X
axis) vs Absolute Return over Time

Figure 14: Naive Bayesian Window Thresholding Model (Window = 50 trading
days) including Global Financial Crisis - Day t (X axis) vs Absolute Return
over Time

Figure 15: Naive Bayesian Window Thresholding Model (Window = 50 trading
days) without Global Financial Crisis - Day t (X axis) vs Absolute Return over
Time

Figure 16: Naive Bayesian Window Thresholding Model (Window = 100 trading
days) including Global Financial Crisis - Day t (X axis) vs Absolute Return over
Time

Figure 17: Naive Bayesian Window Thresholding Model (Window = 100 trading
days) without Global Financial Crisis - Day t (X axis) vs Absolute Return over
Time

## 5.1 Conclusion
We can clearly observe a few notable results from the modeling. By and large,
the naive Bayesian approach with windowing performed the worst, exceeding
baseline performance only when the Global Financial Crisis is included. This is
potentially due to the special nature of that period: the model, rather than
finding an actual optimal threshold, likely learns to stay short during that
period based on the negative performance that characterized 2007-2009, leading
to higher returns (which remain higher even after the following decade of poorer
performance). Removing that time range, however, we can clearly observe that
dynamically finding a threshold with our given parameters (a lookback window
of 50 and 100 trading days, and resampling every 20th day) performs worse,
even without going short.
Interestingly, the highest performing model is likely the linear regression
model, which was trained on data from approximately 2007 to 2017, and then
tested on the time range following that period. As we can see in Figure 12,
both the long/short and long/null (taking no position, long or short, when the
model predicts a negative return the next day) strategies outperformed simple
buy-and-hold over the period tested by a fairly wide margin. In fact,
due to the coronavirus crash period of 2020, the long/short strategy returned
almost double what SPY did over the same period using linear regression.
The worst performer was the simple logistic model, which is not a surprise
given the mean daily return’s close proximity to 0 (Figure 10). Interestingly,
lowering the prediction probability threshold (Figure 11) enhanced
the returns by an appreciable degree versus SPY, but it’s difficult to determine
whether the threshold holds in general or just for the given period observed.
The optimal features selected through the model tuning process (given
Figure 12’s returns) were:
• Intercept - This was weighted per the coefficients at 0, so it had no effect
on the final results.
• Year (Represented as one-hot encoding) - This had a moderate effect in
model tuning, but except for the first year in the test there was no overlap
with the training data years. This is also observable in the remaining two
years, 2019 and 2020, having nearly 0 weight as coefficients.
• NOPE Daily Values (Float) - This was a composite of 5 NOPE values
recorded end of day ($NOPE_t$, $NOPE_{t-1}$, $NOPE_{t-2}$, $NOPE_{t-3}$,
$NOPE_{t-4}$). These had a moderate weighting coefficient-wise.
• Same Day is Green (Boolean) - This had a large weight as an individual
parameter, and refers to whether the day being observed had a
positive return from open to close.
I observed no real utility nor increase in return using Month one-hot en-
coding, which suggests that any seasonal impact in prediction was negligible.
Similarly, I observed no significant predictive relationship with NOPE MAD(30)
(simple scaling of NOPE using median absolute deviation) which suggests that
the actual magnitude of the NOPE value itself, rather than its peculiarity versus
recent peers, is important for predictive effect.
All in all, the model was able to achieve higher performance in the period
tested than buy-and-hold alone, primarily using the feature in question (NOPE).
However, it remains to be seen whether this is simply emblematic of the period
tested (higher volatility and potentially more opportunity for return going
short). In general, all models using the features tested seem to outperform in
periods of crashes, which may provide some deeper hints about the relationship
of the Net Options Pricing Effect to crash and correction market periods.

## References
[1] Barbon, Andrea, and Andrea Buraschi. “Gamma Fragility.” SSRN, 16 Nov. 2020,
papers.ssrn.com/sol3/papers.cfm?abstract_id=3725454.
[2] Pearson, Neil D., et al. “Does Option Trading Have a Pervasive Impact on
Underlying Stock Prices?” SSRN, 16 Mar. 2007,
papers.ssrn.com/sol3/papers.cfm?abstract_id=970592.
[3] Zambito, Matthew. “Gamma Exposure (GEX).” SqueezeMetrics, SqueezeMetrics
Research, Dec. 2017,
