
Appendix A. Preprocessing

Before performing the actual sentiment analysis, we apply the following
preprocessing steps:

1. Filtering. We apply the following filter rules: first, each announcement must
have at least 50 words. Second, we focus only on ad hoc press releases from
German companies which are written in the English language. Our final
corpus consists of 14,427 ad hoc announcements. To study stock market re-
action, we use the daily stock market returns of the corresponding company,
originating from Thomson Reuters Datastream. We only include business
days and take the first message of each day, yielding a total of 1,892 observations.
In addition, we adjust the publication days of ad hoc announcements
according to the opening times of the stock exchange. This is achieved by
assigning all disclosures filed after 8 p. m. to the next day.

2. Tokenization. Corpus entries are split into single words named tokens [1].

3. Negations. Negations invert the meaning of words and sentences [2, 3, 4].
When encountering the word no, each of the subsequent three words
(i. e. the object) is counted as a word from the opposite dictionary. When
other negating terms are encountered (rather, hardly, couldn’t, wasn’t, didn’t,
wouldn’t, shouldn’t, weren’t, don’t, doesn’t, haven’t, hasn’t, won’t, hadn’t, never),
the meaning of all succeeding words is inverted.

4. Stop word removal. Words without a deeper meaning, such as the, is, of,
etc. are named stop words [5] and can be removed. We use a list of 571 stop
words proposed in [6].

5. Synonym merging. Synonyms, though spelled differently, convey the same
meaning. In order to group synonyms by their meaning, we follow a method
that is referred to as pseudoword generation [5]. Table A.1 provides a list of
approximately 175 frequent synonyms or phrases from the finance domain,
which we utilize to aggregate synonyms according to their meanings. In
the case of e. g. at least, we use pseudoword generation to map a common
phrase with more than one word onto a single token.

6. Stemming. Stemming refers to the process of reducing inflected words to
their stem [5]. Here, we use the so-called Porter stemming algorithm [7].
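
The negation rule of step 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mark_negated` is hypothetical, the scope of "all succeeding words" is assumed to end with the token list passed in (e.g. one sentence), and negation cues themselves are assumed not to be inverted.

```python
# Negation cues from step 3 whose scope covers all succeeding words.
# Straight apostrophes are assumed after tokenization.
NEGATORS_ALL = {
    "rather", "hardly", "couldn't", "wasn't", "didn't", "wouldn't",
    "shouldn't", "weren't", "don't", "doesn't", "haven't", "hasn't",
    "won't", "hadn't", "never",
}


def mark_negated(tokens):
    """Return one boolean per token: True if its meaning is inverted."""
    flags = []
    remaining = 0        # tokens still inside the three-word scope of "no"
    invert_rest = False  # set once a cue from NEGATORS_ALL has been seen
    for token in tokens:
        word = token.lower()
        if word == "no":
            flags.append(False)  # the cue itself is not inverted (assumption)
            remaining = 3
        elif word in NEGATORS_ALL:
            flags.append(False)
            invert_rest = True
        else:
            flags.append(invert_rest or remaining > 0)
            if remaining > 0:
                remaining -= 1
    return flags


print(mark_negated("there is no sign of a recovery".split()))
print(mark_negated("profits never improved again".split()))
```

A token flagged as inverted would then be counted as a word from the opposite dictionary when the sentiment metric is computed.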
Pseudoword Synonyms
acquisition takeover
annual_results full year results
annual_results annual results
annual_results annual result
antitrust_authority antitrust authority
antitrust_authority antitrust authorities
at_least at least
bylaws by-laws
bylaws articles of association
capex capital expenditures
capital_increase rights issue
capital_increase capital raise
capital_increase capital increase
ceo chief executive officer
cfo chief financial officer
close closing
common_share voting shares
common_share voting share
common_share common stocks
common_share common stock
common_share common shares
common_share common share
company incorporation
company firm
company corporation
consolidated_income consolidated income
consolidated_loss consolidated loss
consolidated_profit consolidated profit
cost cost structure
debt subordinated debt
debt senior notes
debt net financial debt
debt net debt
debt junk
debt financial liabilities
debt bonds
Dollar $
ebit operating result
ebit operating profits
ebit operating profit
ebit ebt
ebit ebits
ebit earnings before taxes on income
ebit earnings before taxes and interests
ebit earnings before taxes and interest
ebit earnings before taxes
ebit earnings before tax on income
ebit earnings before tax and interest
ebit earnings before tax
ebit earnings before interests and taxes
ebit earnings before interests and tax
ebit earnings before interest and taxes
ebit earnings before interest and tax
ebit earnings after taxes
ebit earnings after tax
ebit earnings
ebtda earnings before interest, taxes, depreciation and amortization
ebtda earnings before interest, tax, depreciation and amortization
economic_cycle economic cycle
eps earnings per share
equity shareholders’ equity
equity shareholder equity
equity book value
Euro €
Euro EUR
first_quarter first-quarter
first_quarter first three months
first_quarter first quarter
five_trading_days five trading days
five_trading_days five days
five_trading_days 5 trading days
five_trading_days 5 days
fourth_quarter fourth-quarter
fourth_quarter fourth quarter
general_meeting shareholders’ meeting
general_meeting shareholder meeting
general_meeting general meeting
general_meeting annual meeting of shareholders
general_meeting annual meeting
half_year first six months
half_year six months
half_year_results semi-annual results
half_year_results semi-annual result
half_year_results half-year results
half_year_results half-year result
half_year_results half year results
half_year_results half year result
half_year_results first half-year results
half_year_results first half-year result
half_year_results first half year results
half_year_results first half year result
hybrid hybrid capital
ipo initial public offering
jv joint venture
last_year last year
legal_dispute legal proceedings
legal_dispute legal proceeding
legal_dispute lawsuit
legal_dispute investigations
legal_dispute investigation
long_term long term
long_term long-term
loss_per_share loss per share
management the board
management executive members
management board of directors
market_leader market leadership
market_leader market leader
market_share market share
merger moe
merger merger of equals
net_income net profit
net_income net income
net_loss net loss
net_operating_income net operating income
net_operating_loss net operating loss
net_operating_profit net operating profit
nine_months first nine months
nine_months nine months
number_of_shares stock split
number_of_shares number of shares
number_of_shares forward split
number_of_shares conversion of shares
number_of_shares change of number of shares
paragraph §
partnership strategic partnership
partnership alliance
percent percentage points
percent per cent
percent pct
percent %
procurement purchasing
purchase_price purchase price
restructuring turnaround
restructuring staffing cutbacks
restructuring staff reduction
restructuring staff cut
restructuring reorganisation
restructuring refocus
restructuring redundancies
restructuring reduction in staff
restructuring personnel reductions
restructuring layoff
restructuring cost cut
restructuring capacity reduction
revenue turnover
revenue sales
revenue revenues
sec securities and exchange commission
sec securities exchange commission
sec security exchange commission
second_quarter second-quarter
second_quarter second quarter
shareholder owner
share_buyback stock repurchase
share_buyback stock buyback program
share_buyback stock buyback
share_buyback share repurchase
share_buyback share buyback program
share_buyback share buyback
share_buyback buyback programme
share_buyback buy back up to
share_price share price
short_term short term
short_term short-term
spinoff spin-off
spinoff spin off
supervisory_board_chairman chairman of the supervisory board
third_quarter third-quarter
third_quarter third quarter
this_year this year
treasury_stock treasury stock
write_off write off
write_off written off
write_off wrote off
Table A.1: Words used for synonym merging and pseudoword generation.
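
As an illustration of how Table A.1 could be applied to a token sequence, the following Python sketch maps phrases onto pseudowords. Only a handful of rows from the table are included, the function name `merge_pseudowords` is hypothetical, and matching the longest phrase first is an assumption, since the text does not state how overlapping phrases are resolved.

```python
# A few rows of Table A.1, keyed by the token sequence of the phrase.
PSEUDOWORDS = {
    ("at", "least"): "at_least",
    ("chief", "executive", "officer"): "ceo",
    ("takeover",): "acquisition",
    ("net", "debt"): "debt",
    ("net", "financial", "debt"): "debt",
}
MAX_PHRASE_LEN = max(len(phrase) for phrase in PSEUDOWORDS)


def merge_pseudowords(tokens):
    """Replace known phrases by their pseudoword, longest match first."""
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in PSEUDOWORDS:
                merged.append(PSEUDOWORDS[key])
                i += n
                break
        else:
            # No phrase starts here, so the token is kept unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


sentence = "the chief executive officer expects at least stable net financial debt"
print(merge_pseudowords(sentence.split()))
```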

Appendix B. Sentiment analysis

As sentiment analysis is applied to a broad variety of domains and textual
sources, decision support research has devised various approaches to measure
sentiment. A recent literature overview [8] provides a comprehensive domain-
independent survey and, within the domain of finance, a number of surveys [9,
10] compare studies aimed at stock market prediction. In this paper, we want
to address only the trading simulation itself and so utilize a dictionary-based
approach to allow for easier verification of our results. Furthermore, dictionary
approaches seem to be the more widespread technique nowadays in finance
literature [11, 12].
Having completed the preprocessing, we can continue to analyze news sen-
timent. We choose the Net-Optimism sentiment metric [13] in coordination
with Henry’s Finance-Specific Dictionary [14]. The metric is calculated as the
difference between the number of positive words $W_{\text{pos}}(A)$ and negative words $W_{\text{neg}}(A)$,
divided by the total number of words $W_{\text{tot}}(A)$ in an announcement $A$. Thus,
Net-Optimism sentiment $S(A)$ is defined by

\[ S(A) = \frac{W_{\text{pos}}(A) - W_{\text{neg}}(A)}{W_{\text{tot}}(A)} \in [-1, +1] . \tag{B.1} \]
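
The Net-Optimism metric of Eq. (B.1) can be computed as in the following Python sketch. The two word sets are tiny hypothetical stand-ins and not the actual entries of Henry's dictionary [14]; returning 0 for an empty announcement is likewise an assumption.

```python
def net_optimism(tokens, positive_words, negative_words):
    """Net-Optimism S(A): (positive - negative) / total number of words."""
    if not tokens:
        return 0.0  # assumption: empty announcements are treated as neutral
    n_pos = sum(1 for token in tokens if token in positive_words)
    n_neg = sum(1 for token in tokens if token in negative_words)
    return (n_pos - n_neg) / len(tokens)


# Hypothetical stemmed word lists, for illustration only.
positive = {"increas", "strong"}
negative = {"loss"}
tokens = ["revenue", "increas", "strong", "despit", "loss", "ebit"]
print(net_optimism(tokens, positive, negative))
```

Because each word is counted at most once in the numerator, the result always lies in the interval [-1, +1].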

Appendix C. Abnormal returns

One of the most common ways to study stock price reactions from news
disclosures is to use the event study methodology [15]. We now utilize this ap-
proach to measure the stock price reaction caused by ad hoc announcements.
Abnormal returns are defined as the difference between the actual and the nor-
mal return of a security at time τ. The actual return R(τ) is measured by the
price change of the security, while the normal return originates from the market
model. Then, the abnormal return $AR(\tau)$ at a news disclosure $\tau$ is calculated
via

\[ AR(\tau) = R(\tau) - E[R(\tau) \mid \neg X_\tau], \tag{C.1} \]

where $E[R(\tau) \mid \neg X_\tau]$ represents the expected return in the absence of an event
$X_\tau$.
In our study, we calculate the normal return, i. e. the expected return in
the absence of a news release, based on the market model [15]. The market
model assumes a stable linear relation between market return and normal re-
turn. Thereby, we model the market return using a reference market index,
namely, the CDAX. Our estimation period is set to 20 trading days prior to the
event window. By definition, we set the abnormal return to zero for the index
itself. This might be relevant when we analyze strategies that decide between
individual stocks and an index.
Altogether, we then can expect that the abnormal return following the dis-
closure of new and relevant information is mostly due to the news release. In
other words, the abnormal return can be regarded as some kind of excess return
caused by the news release.
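
A minimal Python sketch of Eq. (C.1) under the market model is given below. It assumes that the linear relation is fitted by ordinary least squares over the estimation window, which is the usual choice but is not spelled out in the text; the function name and the symbols `alpha` and `beta` are hypothetical.

```python
def abnormal_return(stock_est, market_est, stock_event, market_event):
    """Abnormal return AR = R - (alpha + beta * R_market).

    stock_est and market_est hold the daily returns of the stock and the
    market index during the estimation window (e.g. 20 trading days).
    """
    if len(stock_est) != len(market_est) or len(stock_est) < 2:
        raise ValueError("need two equally long return series")
    n = len(stock_est)
    mean_s = sum(stock_est) / n
    mean_m = sum(market_est) / n
    var_m = sum((m - mean_m) ** 2 for m in market_est)
    if var_m == 0:
        raise ValueError("market returns are constant")
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_est, stock_est))
    beta = cov / var_m               # OLS slope
    alpha = mean_s - beta * mean_m   # OLS intercept
    normal_return = alpha + beta * market_event
    return stock_event - normal_return


# Synthetic example: the stock follows 0.001 + 1.5 * market exactly.
market = [0.010, -0.020, 0.005, 0.015]
stock = [0.001 + 1.5 * m for m in market]
print(abnormal_return(stock, market, stock_event=0.05, market_event=0.01))
```

In the synthetic example the normal return on the event day is 0.016, so an actual return of 0.05 corresponds to an abnormal return of about 0.034.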

Appendix D. Benchmarks: momentum trading and portfolio approach

Past stock returns can be a predictor of future firm performance. This is
what we define as momentum, whereby historic stock prices continue moving
in their previous direction. The (partly) predictable connection between past
and future returns has been documented in the finance literature, e. g. [16].
Nevertheless, finance academics have trouble with the finding that a simple strategy
of buying winners and selling losers can apparently be profitable, since this
contradicts the theory of efficient markets, where markets quickly absorb new
information and adjust asset prices accordingly. Momentum is, consequently,
also named a “premier anomaly” in stock returns [17]. By extrapolating his-
toric stock trends, we motivate the following momentum trading, which picks
up the subtle patterns in returns. Developing a successful momentum trading
strategy is primarily a product of the manual efforts of finance academics and
practitioners to hand-engineer features from historical prices [18].
Let us define the terms momentum and rate-of-change. The so-called
momentum $\mathrm{Mom}_{i,t}$ is the absolute price difference of stock $i$, defined by

\[ \mathrm{Mom}_{i,t} = p_{i,t} - p_{i,t-\delta} \tag{D.1} \]

with a time span of $\delta$ days. In short, momentum denotes the difference between
today’s closing price and the closing price $\delta$ days ago, thus referring to
prices continuing to trend. In comparison, the rate-of-change $\mathrm{RoC}_{i,t}$ represents
the relative change as a fraction, i. e.

\[ \mathrm{RoC}_{i,t} = \frac{p_{i,t} - p_{i,t-\delta}}{p_{i,t-\delta}} = \frac{\mathrm{Mom}_{i,t}}{p_{i,t-\delta}} . \tag{D.2} \]

Both the momentum and rate-of-change indicators reveal a trend by remaining
positive during an uptrend or negative during a downtrend.
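
Both indicators translate directly into code. The following Python sketch assumes that `prices` is a list of daily closing prices of a single stock indexed by day; the function names are hypothetical.

```python
def momentum(prices, t, delta):
    """Eq. (D.1): absolute price change over the last delta days."""
    if t - delta < 0:
        raise IndexError("not enough price history")
    return prices[t] - prices[t - delta]


def rate_of_change(prices, t, delta):
    """Eq. (D.2): momentum relative to the price delta days ago."""
    return momentum(prices, t, delta) / prices[t - delta]


closing_prices = [100.0, 110.0, 121.0]
print(momentum(closing_prices, t=2, delta=2))        # 21.0
print(rate_of_change(closing_prices, t=2, delta=2))  # approximately 0.21
```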
Altogether, this comprises the momentum trading strategy [19], formally
introduced by the following pseudocode. In short, the key idea is to always
choose the stock that has the highest rate-of-change. Step 1 initializes the variable
$s$, which stores the stock that our Decision Support System currently holds.
The subsequent for-loop iterates through all time steps of our simulation horizon
$T$. In each iteration, Step 3 updates the rate-of-change values for all stocks,
excluding the current business day. If no stock is currently held, then
Step 5 invests in the stock with the highest absolute value of all historic
rate-of-change values. However, if the rate-of-change of the currently held stock drops
below a threshold $\theta_{\text{RoC}}$, then we trigger transactions to sell the previous stock
(Step 7) and buy (or short-sell) the new stock with the highest rate-of-change in
Step 8.
As free parameters, we can vary the time span $\delta$ for calculating the
rate-of-change and the threshold $\theta_{\text{RoC}}$. For the former, we find good results with $\delta$ set
to 200 business days. This value serves as a good trade-off within the range
of 20 days to 12 months proposed in the literature [16]. We choose the latter
variable $\theta_{\text{RoC}}$ by testing different values, and set $\theta_{\text{RoC}} = 50\,\%$.

1: Initialize stock $s \leftarrow \bot$.
2: for $t$ in $T$ do
3:     Compute $\mathrm{RoC}_{i,t-1} \leftarrow \frac{p_{i,t-1} - p_{i,t-1-\delta}}{p_{i,t-1-\delta}}$ for all stocks $i$.
4:     if $s = \bot$ then
5:         Buy or short-sell stock $s \leftarrow \arg\max_i |\mathrm{RoC}_i|$.
6:     else if $\mathrm{RoC}_{s,t-1} < \theta_{\text{RoC}}$ then
7:         Remove investment in stock $s$.
8:         Buy or short-sell stock $s \leftarrow \arg\max_i |\mathrm{RoC}_i|$.
9:     end if
10: end for
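
The pseudocode above can be turned into a small simulation, sketched here in Python. This is an illustrative reading and not the authors' code: `prices` is assumed to map each stock to an equally long list of daily closing prices, the function name is hypothetical, and the sketch only records which stock is held, leaving out the choice between buying and short-selling as well as any return calculation.

```python
def simulate_momentum_trading(prices, delta, theta):
    """Return the stock held on each simulated day."""
    horizon = len(next(iter(prices.values())))
    held = None  # Step 1: no stock is held initially
    holdings = []
    # Step 2: start once delta + 1 days of history are available.
    for t in range(delta + 1, horizon):
        # Step 3: rate-of-change up to the previous business day.
        roc = {
            stock: (p[t - 1] - p[t - 1 - delta]) / p[t - 1 - delta]
            for stock, p in prices.items()
        }
        # Steps 4 to 8: (re)invest if nothing is held or the held stock
        # has fallen below the threshold; both branches pick the stock
        # with the largest absolute rate-of-change.
        if held is None or roc[held] < theta:
            held = max(roc, key=lambda stock: abs(roc[stock]))
        holdings.append(held)
    return holdings


prices = {
    "A": [10.0, 11.0, 12.0, 12.0, 12.0],
    "B": [10.0, 10.0, 10.0, 13.0, 16.0],
}
print(simulate_momentum_trading(prices, delta=1, theta=0.05))
```

In this toy example, stock A is held for two days and then replaced by stock B once the rate-of-change of A falls below the threshold.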

The portfolio strategy builds upon the previous momentum trading. But
instead of always choosing a single stock, it updates a selection of stocks on a daily
basis. The sole criterion for the decision is the rate-of-change. For that purpose, this
strategy picks a total of $N$ stocks $s = 1, \ldots, N$ by maximizing the rate-of-change
$\mathrm{RoC}_{s,t-1}$, i. e. it selects the $N$ stocks with the highest values. By
utilizing a larger portfolio of stocks, this strategy adjusts to the relative riskiness
by spreading the potential risk of defaults.
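
The daily selection step of the portfolio strategy could look as follows in Python. The sketch assumes that the rate-of-change values of the previous business day are already available as a dictionary; the function name `select_portfolio` is hypothetical.

```python
def select_portfolio(roc, n):
    """Return the n stocks with the highest rate-of-change, best first."""
    return sorted(roc, key=roc.get, reverse=True)[:n]


roc_yesterday = {"A": 0.10, "B": 0.30, "C": -0.20, "D": 0.05}
print(select_portfolio(roc_yesterday, n=2))  # ['B', 'A']
```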

Appendix E. Pseudocode

Appendix E.1. Rule-Based News Trading


The following pseudocode specifies the simple rule-based trading. Steps 2
and 3 trigger buy and short-sell decisions whenever the news sentiment metric
of an incoming announcement exceeds a positive threshold or falls below a
negative threshold. This decision is given by the if-statement in Step 1, i. e.
the condition that $S(A)$ is smaller than a negative threshold $\theta_S^-$ or larger than a
positive threshold $\theta_S^+$ must be fulfilled. We choose suitable threshold values for both $\theta_S^-$
and $\theta_S^+$ as part of our evaluation.

Input: Released announcement $A$ that corresponds to stock $i$.

1: if $S(A) > \theta_S^+$ or $S(A) < \theta_S^-$ then
2:     Remove investment in previous stock $s$.
3:     Buy or short-sell stock $s \leftarrow i$.
4: end if
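
A possible Python rendering of this rule is sketched below. It assumes that a positive signal leads to a buy and a negative signal to a short-sale, which the pseudocode implies but does not state explicitly; the function name and the representation of a position as a `(stock, side)` tuple are hypothetical.

```python
def rule_based_trade(sentiment, stock, held, theta_pos, theta_neg):
    """Return the new position after an announcement for `stock`.

    `held` is the current position, either None or a (stock, side) tuple.
    """
    if sentiment > theta_pos:
        return (stock, "buy")    # Steps 2 and 3 with a long position
    if sentiment < theta_neg:
        return (stock, "short")  # Steps 2 and 3 with a short position
    return held                  # condition in Step 1 not fulfilled


position = None
position = rule_based_trade(0.05, "X", position, theta_pos=0.02, theta_neg=-0.02)
print(position)  # ('X', 'buy')
position = rule_based_trade(0.00, "Y", position, theta_pos=0.02, theta_neg=-0.02)
print(position)  # ('X', 'buy'), since the signal is too weak to switch
```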

Appendix E.2. Combined Strategy with News and Momentum

The subsequent trading strategy combines the above approaches by utilizing
both news sentiment and historic prices in the form of momentum. We
develop this trading strategy around the idea that we want to invest in assets
with both (1) a news disclosure with a high polarity and (2) previous momentum
in the same direction. The combined pseudocode is given below. Only if
both the news release and historic prices give an indication of a development in
the same direction do Steps 3 and 4 trigger a corresponding trading decision.
Thus, this strategy expects the same direction in terms of the rate-of-change
and sentiment metric as tested in Step 2.

Input: Released announcement $A$ that corresponds to stock $i$ at day $t$.

1: Compute $\mathrm{RoC}_{i,t-1} \leftarrow \frac{p_{i,t-1} - p_{i,t-1-\delta}}{p_{i,t-1-\delta}}$ for stock $i$.
2: if $|S(A)| > \theta_S$ and $\operatorname{sign} S(A) = \operatorname{sign} \mathrm{RoC}_{i,t-1}$ then
3:     Remove investment in stock $s$.
4:     Buy or short-sell stock $s \leftarrow i$.
5: end if
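
The combined rule can be sketched in Python as follows. As before, the mapping of a positive signal to a buy and a negative signal to a short-sale is an assumption, and the names `sign` and `combined_trade` are hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def combined_trade(sentiment, roc, stock, held, theta_s):
    """Return the new position after an announcement for `stock`.

    `roc` is the rate-of-change of `stock` up to the previous business day,
    `held` is the current position, either None or a (stock, side) tuple.
    """
    # Step 2: strong sentiment and momentum pointing in the same direction.
    if abs(sentiment) > theta_s and sign(sentiment) == sign(roc):
        side = "buy" if sentiment > 0 else "short"
        return (stock, side)  # Steps 3 and 4
    return held


print(combined_trade(0.05, 0.20, "X", None, theta_s=0.02))   # ('X', 'buy')
print(combined_trade(0.05, -0.20, "X", None, theta_s=0.02))  # None
```

The second call leaves the position unchanged because the positive sentiment is contradicted by a negative rate-of-change.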

Appendix F. Evaluation: benchmarks

The results of both benchmarks, namely, the CDAX stock market index and
momentum trading, are presented in Fig. F.1, where we see different performance
patterns. This figure shows how the value of an investment
portfolio evolves over time when starting with 1 monetary unit. After a
highly volatile beginning (including periods of negative cumulative returns), the value of the CDAX
increases gradually over time, while momentum trading faces a sharp drop in
the beginning. It recovers only in early 2005, followed by a substantial
rise. The final valuation is higher in the case of momentum trading
than for the CDAX.

Appendix G. Evaluation: return-risk comparison

Figures G.2 and G.3 provide a comparison of risk and (abnormal) returns.

[Figure F.1: line chart of Cumulative Return (in %) from Jan 2004 to Jul 2005; series: CDAX Stock Index, Momentum Trading.]

Figure F.1: Cumulative returns of both the CDAX index and the momentum trading strategy
compared across the first 400 business days.

[Figure G.3: scatter plot; x-axis: Volatility of Daily Returns; y-axis: Average Daily Abnormal Return (in %); strategies shown: Supervised Learning, Reinf. Learning, Simple News Trading, Combined News & Momentum Trading, News Trading with Min. Holding Time, News Trading [Chemicals], News Trading [Automobile], Momentum Trading, Index.]

Figure G.3: Comparison of abnormal returns versus risks across different trading strategies.
Consistent with the literature, risk is measured in terms of volatility.

[Figure G.2: scatter plot; x-axis: Volatility of Daily Returns; y-axis: Average Daily Return (in %); strategies shown: Supervised Learning, Reinf. Learning, Simple News Trading, Combined News & Momentum Trading, News Trading with Min. Holding Time, News Trading [Chemicals], News Trading [Automobile], Momentum Trading, Index.]

Figure G.2: Comparison of returns versus risks across different trading strategies. Consistent
with the literature, risk is measured in terms of volatility.

References

[1] G. Grefenstette, P. Tapanainen, What Is a Word, What Is a Sentence?
Problems of Tokenization, 1994.

[2] M. Dadvar, C. Hauff, F. de Jong, Scope of Negation Detection in Sentiment
Analysis, in: Proceedings of the Dutch-Belgian Information Retrieval
Workshop (DIR 2011), Amsterdam and Netherlands, 2011, pp. 16–20.

[3] N. Pröllochs, S. Feuerriegel, D. Neumann, Enhancing Sentiment Analysis
of Financial News by Detecting Negation Scopes, in: 48th Hawaii International
Conference on System Sciences (HICSS), IEEE Computer Society,
2015, pp. 959–968.

[4] N. Pröllochs, S. Feuerriegel, D. Neumann, Negation Scope Detection in
Sentiment Analysis: Decision Support for News-Driven Trading, Decision
Support Systems (2016).

[5] C. D. Manning, H. Schütze, Foundations of Statistical Natural Language
Processing, MIT Press, Cambridge, MA, 1999.

[6] D. D. Lewis, Y. Yang, T. G. Rose, F. Li, RCV1: A New Benchmark Collection
for Text Categorization Research, Journal of Machine Learning Research
5 (2004) 361–397.

[7] M. F. Porter, An Algorithm for Suffix Stripping, Program 14 (1980) 130–137.

[8] B. Pang, L. Lee, Opinion Mining and Sentiment Analysis, Foundations and
Trends in Information Retrieval 2 (2008) 1–135.

[9] M.-A. Mittermayer, G. F. Knolmayer, Text Mining Systems for Market
Response to News: A Survey, 2006.

[10] M. Minev, C. Schommer, T. Grammatikos, News and Stock Markets:
A Survey on Abnormal Returns and Prediction Models, 2012. URL:
http://publications.uni.lu/record/9947/files/TR.Survey.News.Analytics.pdf.

[11] C. Kearney, S. Liu, Textual Sentiment in Finance: A Survey of Methods and
Models, International Review of Financial Analysis 33 (2014) 171–185.

[12] I. E. Fisher, M. R. Garnsey, M. E. Hughes, Natural Language Processing
in Accounting, Auditing and Finance: A Synthesis of the Literature with a
Roadmap for Future Research, Intelligent Systems in Accounting, Finance
and Management (2016).

[13] E. A. Demers, C. Vega, Soft Information in Earnings Announcements:
News or Noise? INSEAD Working Paper No. 2010/33/AC, SSRN Electronic
Journal (2010).

[14] E. Henry, Are Investors Influenced By How Earnings Press Releases Are
Written?, Journal of Business Communication 45 (2008) 363–407.

[15] Y. Konchitchki, D. E. O’Leary, Event Study Methodologies in Information
Systems Research, International Journal of Accounting Information Systems
12 (2011) 99–115.

[16] N. Jegadeesh, S. Titman, Returns to Buying Winners and Selling Losers:
Implications for Stock Market Efficiency, Journal of Finance 48 (1993)
65–91.

[17] E. F. Fama, K. R. French, Dissecting Anomalies, Journal of Finance 63
(2008) 1653–1678.

[18] L. Takeuchi, Y.-Y. Lee, Applying Deep Learning to Enhance Momentum
Trading Strategies in Stocks, 2013.

[19] K.-J. Kim, Financial Time Series Forecasting using Support Vector Machines,
Neurocomputing 55 (2003) 307–319.
