Abstract
This article studies the link between Twitter announcements and the stock prices of sports
companies during the COVID crisis. In many instances, news, announcements and social media
content affect the evolution of stock prices. This paper assesses the relationship between
the sentiment expressed on social media and the evolution of stock prices. The study focuses on
companies from the sports sector because of their popularity and their considerable number of
followers on social networks, which provide a sound basis for the analysis. Two aspects are
explored: a Granger causality analysis of the effect of tweets on stock prices and an event
study related to the COVID crisis. The approach is implemented for a sample of 18
listed companies in the sports sector.
Keywords: COVID-19, Pandemic, Panic, Sport sector, Juventus, Lazio, Cristiano
Ronaldo, NLP, Tweets, Sentiment analysis, Granger causality, Event study
1. Introduction
Social media increasingly shapes public opinion, and financial markets are no
exception. In recent years, more and more investment decisions have been made based on
opinions shared through social media. The "mass sentiment" has become a critical factor
in understanding price dynamics. On April 23, 2013, a false tweet announcing an attack
on the White House induced a massive dip in the leading stock market indices, another
illustration of the growing and sometimes problematic influence of social networks on the markets.
It is increasingly difficult to measure the ”mass sentiment” and to assess the cor-
relation with the stock prices. Manipulating stock prices through mass sentiment is a
concerning issue.
News announcements concerning listed companies can have a huge impact on fi-
nancial markets through stock prices or investor behavior. This leads to rapid changes
or abnormal effects in financial portfolios. Wysocki (1998) showed that, depending on
the quality of the messages, there was a strong positive correlation between the volume
of messages posted on discussion boards during the hours that the stock market was
closed and the next trading day's volume and stock returns. Wang et al. (2013)
extended the analysis to stock return volatility. The study found particular
significance in the sentiment carried by the words of financial reports: certain
words correlate with company risk. This emphasizes the importance of the words used
to predict economic indicators and motivates the use of Natural Language Processing.
Investors are constantly looking for new information to make forecasts. The ef-
ficient market hypothesis states that stock prices are a function of new information
and follow a random walk (Fama, 1970). Recent studies have shown that behavioral
economics theories, which take into account the predominant role of emotion in
decision-making, can be used to predict investment decisions. More broadly, Prechter Jr et al.
(2012) find that the relationship between public mood and trust in government matters
to stock market investors, suggesting that public mood over politics is
a factor in investment decisions.
Twitter is a natural gateway to sentiment analysis. Meador and Gluck (2009) have ex-
plored the opportunities that sentiment analysis offers on platforms like Twitter. They
found that sentiment analysis is a tool that can determine how the audience received
a film. They also studied sentiment against the trading volumes of certain companies.
They found that, for Microsoft and Yahoo, there was a correlation between
the sentiment towards these companies and their stock trading volumes.
In addition, Go et al. (2009) show that Twitter plays an important role in measuring
consumer satisfaction: some companies use the social network to characterize
consumer sentiment towards their products.
The work of Bollen et al. (2011) is one of the most popular in the field. In their pub-
lication, the researchers investigate whether public mood as measured from a large-scale
collection of tweets posted on twitter.com is correlated or maybe predictive of DJIA
values. They use two main tools to measure variations within the public mood from
tweets submitted to the Twitter service from February 28, 2008 to December 19, 2008.
The primary tool, OpinionFinder, analyzes the text content of tweets submitted on a
given day to produce a positive-versus-negative daily statistic of public mood; the
second, GPOMS, measures mood along six dimensions.
Mittal and Goel (2012) continued the work of Bollen et al. (2011) by integrating
neural networks and cross-validation. The study divides the
sample into N folds, trains on N−1 of them and tests on the remaining one, repeating
the operation N times. This method achieved accuracy of up to 75% for Dow Jones
stocks. They also created a questionnaire with the words to analyze based on their
sentiment.
Zhang et al. (2011) analyzed the positive and negative mood of some tweets on
twitter, and they compared it with stock market indices such as Dow Jones, S&P 500,
and NASDAQ. In order to improve the existing methodology, they decided to use mood
words, for instance ”fear”, ”worry”, ”hope” etc., as emotional tags of a tweet. Initially
they expected that the correlation between optimistic mood and exchange indicators
would be positive, and the pessimistic mood would correlate negatively. They found a
direct correlation for all of them with the VIX, and correlations with the Dow, NASDAQ
and S&P 500. This suggests that people use more emotional words like hope, fear
and worry in times of economic uncertainty, independent of whether the context is
positive or negative. They also showed that the number of retweets could be
a better baseline than the number of followers, but that simply taking the whole number
of tweets gives the simplest results. The main limitation of the article lies in the
sample of tweets used, which the authors considered not large enough.
More recently, Kordonis et al. (2016) used various Natural Language Processing
techniques to forecast stock prices from Twitter sentiment.
The sports industry is an economic sector where measuring the relationship between
tweets’ sentiment and stock prices provides unique insights on how the public mood
impacts the market valuation. Indeed, the sports sector is a fertile ground to assess
and measure the mass sentiment. Laypersons can give their opinions on a sporting
event through social networks, thereby making it possible to aggregate their views and
draw conclusions.
The global lock-down related to the COVID-19 pandemic put a halt to all sport
activities across all countries, with the exception of Belarus. Sport fans were left
in a hiatus filled by the negative perspectives of the pandemic, and the global sentiment
of the sport sector was dominated by bad news. Therefore, studying the tweet dynamics
of sport companies during the COVID pandemic can reveal how the sentiment
of sport fans is reflected in social media. Moreover, the study aims to assess
whether that persistent negative sentiment of fans amplified the fall of the market
prices of sport clubs and companies.
This study enriches the academic literature covering the impact of social networks
on the financial markets. It unveils a new area of research related to the impact of
popular opinion on the stock prices of sports companies. The remainder of this article
is organized as follows:
Section 4 describes the compilation process of the dataset used in this study
and the construction of the main indexes related to tweets' sentiment and share
price returns.
Section 6 concludes.
2. Sentiment analysis
Sentiment and opinion analysis is an NLP topic aiming to extract emotions from
text. The technical academic literature on Natural Language Processing grew very
quickly over the past decade. Pak and Paroubek (2010) present
a method to collect a corpus with positive and negative sentiments, and a corpus of
objective texts. Their method extracts the polarity of a tweet, whether
negative, positive or neutral, thanks to emoticons. They then perform a statistical
linguistic analysis of the collected corpus. However, the assumption that leads the au-
thors to conclude that a tweet containing an emoticon necessarily carries the corresponding
sentiment is questionable. Microblogging has changed a lot since then and sentiment
analysis techniques have evolved considerably. Agarwal et al. (2011) proposed a method
based on three different models. The article is now quite old: resources such as NLP
training corpora have evolved a lot, and the set of available tweets was also limited.
Saif et al. (2012) propose a new analysis method for corpus classification.
After several experiments, they conclude that the best classifier is Naive Bayes.
Associating a semantic concept with each entity refines the sentiment analysis:
sentiment is detected far better than with the Unigram and Part-of-Speech methods
used by authors like Pak and Paroubek (2010).
The first step computes the mutual information of a phrase based on the semantics
of two consecutive words, i.e. a noun followed by an adjective or vice versa
(e.g. 'high profits'). This step is therefore based on Part-of-Speech tagging:
two consecutive words whose tags match the suitable patterns
are extracted from a given phrase. Table 1 shows a few rules for selecting relevant
consecutive word pairs when analyzing the sentiment of a sentence.
Table 1: Tagging
In the second step the Pointwise Mutual Information (PMI) (Church and Hanks,
1990) between two words, w1 and w2, is computed as follows:

    PMI(w1, w2) = log2( P(w1 ∩ w2) / (P(w1) · P(w2)) )    (1)
The words 'buy' and 'sell' are chosen to exemplify the sentiment related to an-
alysts' views concerning a stock: 'buy' denotes a bullish sentiment with a
positive opinion, while 'sell' denotes a bearish sentiment with a negative forecast.
The last step calculates an index of semantic orientation of the phrases based
on a dictionary of positive and negative words wp and wn, with sizes Np and
Nn respectively:
    ISO(X) = Σ_{i=1..Np} PMI(X, wp_i) − Σ_{j=1..Nn} PMI(X, wn_j)    (3)
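As an illustration, the PMI of eq. (1) and the semantic orientation index of eq. (3) can be sketched on a toy corpus; the tweets and the one-word sentiment dictionaries below are invented for the example.

```python
import math
from collections import Counter
from itertools import combinations

# Toy tokenized tweets and tiny sentiment dictionaries (all invented).
tweets = [
    ["stock", "strong", "buy"],
    ["stock", "buy", "rally"],
    ["stock", "sell", "panic"],
    ["market", "sell", "fear"],
]
positive_words = ["buy"]    # w_p dictionary, Np = 1
negative_words = ["sell"]   # w_n dictionary, Nn = 1

n_docs = len(tweets)
doc_freq = Counter()   # how many tweets contain each word  -> P(w)
co_freq = Counter()    # how many tweets contain both words -> P(w1 ∩ w2)
for toks in tweets:
    uniq = set(toks)
    doc_freq.update(uniq)
    co_freq.update(frozenset(p) for p in combinations(sorted(uniq), 2))

def pmi(w1, w2):
    """Eq. (1): PMI(w1, w2) = log2(P(w1 ∩ w2) / (P(w1) P(w2)))."""
    p12 = co_freq[frozenset((w1, w2))] / n_docs
    p1, p2 = doc_freq[w1] / n_docs, doc_freq[w2] / n_docs
    # unseen pairs give -inf; a real implementation would smooth the counts
    return math.log2(p12 / (p1 * p2)) if p12 > 0 else float("-inf")

def iso(x):
    """Eq. (3): ISO(X) = sum_i PMI(X, wp_i) - sum_j PMI(X, wn_j)."""
    return (sum(pmi(x, wp) for wp in positive_words)
            - sum(pmi(x, wn) for wn in negative_words))

print(iso("stock"))  # > 0: "stock" co-occurs more often with 'buy' than 'sell'
```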
Figure 1 shows the relationship between the class and the various words in the doc-
ument. The words are attributes of the class and they are assumed to be independent.
In log space the classification rule becomes:

    c_map = argmax_{c ∈ C} [ log P(c) + Σ_{i ∈ 1..nd} log P(wi | c) ]    (4)
Under this formalism a document d is classified in the class c = Positive for Fc > 0
and c = Negative for Fc < 0.
If the classifier encounters a word that has not been seen in the training set, the
probability of both classes (positive and negative) becomes zero and the
classification function generates an error. This issue can be addressed by Laplacian
smoothing:

    P(wi | c) = (#(wi) + k) / ( (k + 1) · Σ_c #(wi) )    (7)

where k is a constant, usually set to 1, and Σ_c #(wi) is the sum of all word counts in
class c.
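A minimal Naive Bayes classifier with this smoothing idea can be sketched as follows. The training data are toy examples, and the denominator uses the common variant that adds k once per vocabulary word rather than the exact form of eq. (7).

```python
import math
from collections import Counter

# Toy labeled tweets; real training data would be far larger (assumption).
train = [
    (["great", "win", "happy"], "pos"),
    (["love", "this", "team"], "pos"),
    (["bad", "loss", "sad"], "neg"),
    (["terrible", "game"], "neg"),
]
k = 1  # Laplacian smoothing constant, as in eq. (7)

counts = {"pos": Counter(), "neg": Counter()}
priors = Counter()
for toks, c in train:
    counts[c].update(toks)
    priors[c] += 1
vocab = {w for c in counts for w in counts[c]}

def log_p(w, c):
    # Smoothed likelihood: (#(w in c) + k) / (total words in c + k * |V|),
    # so an unseen word no longer forces a zero probability.
    return math.log((counts[c][w] + k) / (sum(counts[c].values()) + k * len(vocab)))

def classify(tokens):
    """Eq. (4): argmax_c [ log P(c) + sum_i log P(w_i | c) ]."""
    scores = {c: math.log(priors[c] / len(train)) + sum(log_p(w, c) for w in tokens)
              for c in counts}
    return max(scores, key=scores.get)

print(classify(["happy", "win"]))   # → pos
print(classify(["sad", "unseen"]))  # "unseen" is smoothed away, not an error
```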
Sentiment analysis achieves acceptable performance, as shown in the literature. In-
deed, Lewis (1998) and Domingos and Pazzani (1997) show that Naive Bayes classifiers
are optimal for certain problem classes even with highly dependent features.
Naive Bayes is the simplest form of Bayesian network, in which all attributes are
independent given the value of the class variable. One approach to enrich a Naive
Bayes classifier is to extend its structure to represent explicitly the dependencies among
attributes. An augmented naive Bayes (ANB) is an extended naive Bayes in which the
class node directly points to all attribute nodes and there exist links among the
attribute nodes (Zhang, 2004).
where f (xi ) assigns a value of −1 to one class and a value of +1 to the other
class.
Consider two training samples Tr+ and Tr− corresponding to documents previously
labeled as positive and negative (Frunza, 2015). The Support Vector Machine finds
a hyperplane that separates the two sets with maximum margin (the largest pos-
sible distance from both sets). This search corresponds to a constrained optimization
problem: letting cj ∈ {1, −1} be the correct class of document xj, the parameters d
and b are found by minimizing

    F(x) = 0.5 · |d|² + α Σ_{j ∈ (+,−)} max(0, 1 − cj (d · xj + b))    (9)
The construction of the VADER model (Hutto and Gilbert, 2014) was based on three
main stages. The first is the development of a sentiment lexicon sensitive both to the
polarity and to the intensity of the sentiment. Drawing on well-established sentiment
word banks (LIWC, ANEW and GI), the authors created a list incorporating many of
the lexical characteristics common to sentiment expression in microblogs, such as
emoticons, slang, acronyms and initialisms (like LOL and WTF, two intense sentiment
acronyms). This process gave them a lexicon of more than 9,000 lexical features.
The second stage is the identification and evaluation of grammar and syntax rules
that affect the intensity of sentiment. The authors analyzed a targeted sample of 400
positive and 400 negative tweets, selected from a larger initial set of 10,000 random
tweets from Twitter's public timeline based on their sentiment scores under the
Pattern sentiment analysis engine. Two human experts individually examined the 800
tweets and independently assessed their sentiment intensity on a scale of −4 to +4.
Using a coding technique similar to Strauss and Corbin (1998), the authors applied
qualitative analysis techniques to identify the properties and characteristics that affect
sentiment intensity in a text. This in-depth qualitative analysis isolated five
heuristics: punctuation, capitalization, degree adverbs, the contrastive conjunction
'but', and the three preceding words.
The last stage is the comparison of performance with existing models. The
composite score is calculated by adding the valence scores of each word in the lexicon,
adjusted according to the rules, then normalized to lie between −1 (most extreme
negative) and +1 (most extreme positive). The VADER lexicon performs very well
in the social media domain.
VADER competes with TextBlob; both tools use the same lexicon-based method.
TextBlob is, however, an open source tool that allows greater flexibility, and the
results of the two tools are quite similar.
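The composite-score stage can be illustrated with a minimal lexicon-based scorer. The tiny valence table is invented, and the squashing function x/sqrt(x² + α) is the normalization VADER uses to map the summed valence into (−1, 1); the grammar and syntax rules are omitted here.

```python
import math

# Tiny illustrative valence lexicon; VADER's real lexicon has 9,000+ entries.
valence = {"love": 3.2, "great": 3.1, "bad": -2.5, "worst": -3.1, "lol": 1.9}

def composite_score(tokens, alpha=15.0):
    """Sum word valences, then squash into (-1, 1) as VADER does:
    x / sqrt(x^2 + alpha). Rule-based intensity adjustments are omitted."""
    x = sum(valence.get(t.lower(), 0.0) for t in tokens)
    return x / math.sqrt(x * x + alpha)

print(round(composite_score(["love", "this", "great", "team"]), 3))  # positive
print(round(composite_score(["worst", "game"]), 3))                  # negative
```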
Identifying the event date(s) of interest and the event window is summarized in Figure
3 (MacKinlay, 1997). The key moments are the beginning and the end of the event
period, T1 and T2 respectively. The event period is a window of a few days around the
time recent news or information arrives. T0 indicates the start of the pre-event period,
serving for the estimation of the model corresponding to the normal returns. The post-
event period after the moment T2 can be used to assess the impact of the event over
the longer term.
    AR_{i,t} = R_{i,t} − R*_{i,t},   t ∈ (T1, T2)    (10)

where AR_{i,t} is the abnormal return of firm i at event date t with t ∈ (T1, T2),
R_{i,t} is the observed return of firm i at event date t, and R*_{i,t} is the normal
return of firm i conditioned on the information prior to the start of the event (T1).
For determining R*_{i,t}, several approaches are proposed by the academic literature
and are presented below:
The market-adjusted return model considers that the return of the stock is
equal to the return of the market index.
The market model is the most commonly used in the event study literature;
αi and βi are estimated through a linear regression. For time series
with auto-correlation and heteroskedasticity, appropriate methods should be
employed in the estimation process.
The CAPM model leverages the typical capital asset pricing model through a time-
series regression based on realized returns.
The multi risk factor model proposes a multivariate regression for
modelling the returns. The model introduced by Fama and French (1993)
improves on the univariate CAPM,
where βi,m, βi,SMB and βi,HML are the model parameters, SMBt is the excess
return of small over big stocks measured by market cap, and HMLt is the excess
return of stocks with a high book-to-market ratio over stocks with a low book-
to-market ratio at moment t.
    ln(1 + Vi,t) = ln(1 + Vm,t) + ln(1 + Vi,t−1) + ln(1 + Vi,t−2) + γi · Dayi,t    (16)

where Vi,t is the traded volume of firm i at moment t, Vm,t is the market turnover
volume at time t, and Dayi,t are weekday dummy variables equal to one
for firm i if the trade took place on that day and zero otherwise.
The news sentiment driven models enhance the market model (Delort et al.
(2009), Siering (2013)) with factors based on the sentiment analysis of web-based
news concerning the firm.
where α̂i , β̂i are the estimates of the parameters of the market model.
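As a sketch of the market-model approach with invented return series: αi and βi are estimated by OLS over the estimation window, and eq. (10)'s abnormal returns are then computed over the event window.

```python
# Market-model sketch: estimate alpha_i, beta_i by OLS over the estimation
# window, then compute abnormal returns AR_{i,t} = R_{i,t} - (a + b R_{m,t})
# over the event window, as in eq. (10). All return data below are invented.

def ols(x, y):
    """Simple-regression OLS; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Estimation window (normal period): market vs firm daily returns.
rm_est = [0.01, -0.02, 0.015, 0.005, -0.01]
ri_est = [0.012, -0.025, 0.018, 0.004, -0.011]
alpha, beta = ols(rm_est, ri_est)

# Event window: observed returns and the model's "normal" returns.
rm_evt = [-0.03, -0.05]
ri_evt = [-0.06, -0.09]
ar = [r - (alpha + beta * m) for r, m in zip(ri_evt, rm_evt)]
car = sum(ar)   # cumulative abnormal return over the event window
print(car < 0)  # the stock underperformed its market-model benchmark
```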
    SAR_{i,t} = AR_{i,t} / s_{AR_{i,t}}    (22)

where the adjusted standard error is computed as:

    s_{AR_{i,t}} = σ̂_{AR_i} · sqrt( 1 + 1/(T1 − T0) + (R_{m,t} − R̄_m)² / Σ_{s=T0..T1} (R_{m,s} − R̄_m)² )

where σ̂²_{AR_i} = (1 / (T1 − T0 − 2)) · Σ_{s=T0..T1} (AR_{i,s})² and
R̄_m = (1 / (T1 − T0)) · Σ_{s=T0..T1} R_{m,s} is the mean rate of return of the market
index over the estimation period. It can be noticed that σ̂²_{AR_i} is the variance
estimated from the abnormal returns over the estimation window, while s_{AR_{i,t}} is
the standard deviation adjusted over the event window.
The cumulated standardized abnormal return for firm i over the event window (T1, T2)
is:

    CSAR_{i,(T1,T2)} = Σ_{t=T1..T2} AR_{i,t} / s_{AR_{i,t}}    (23)
The null hypothesis H0: SAR_{i,t} = 0 is tested against:

    Ha: SAR_{i,t} ≠ 0

The test follows the t-statistic:

    SAR_{i,t} = AR_{i,t} / s_{AR_{i,t}} → t(T1 − T0 − 2)   (N(0,1) for a big estimation window)    (25)

For the full test window and cross-sectionally for a portfolio of N firms, the null
hypothesis

    H0: E(SCAR) = (1/√N) · Σ_{i=1..N} CSAR_{i,(T1,T2)} / s_{CSAR_i} = 0

states that the cumulated standardized abnormal return is equal to zero, and is tested
against:

    Ha: E(SCAR) ≠ 0

    T = CAAR_{(T1,T2)} / σ̂_{CAAR(T1,T2)} → N(0, 1)    (27)

where the variance estimator of the test is:

    σ̂_{CAAR(T1,T2)} = (1/√N) · sqrt( Σ_{i=1..N} (CAR_{i,(T1,T2)} − CAAR_{(T1,T2)})² )    (28)
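The cross-sectional test of eqs. (27) and (28) reduces to a few lines; the CAR values below are invented for illustration.

```python
import math

# Cross-sectional test of eq. (27)-(28): is the average cumulative abnormal
# return (CAAR) across N firms different from zero? CARs are invented.
cars = [-0.05, -0.03, -0.08, -0.02, -0.06, -0.04]   # CAR_i over (T1, T2)

n = len(cars)
caar = sum(cars) / n
# eq. (28): sigma_hat = (1/sqrt(N)) * sqrt( sum (CAR_i - CAAR)^2 )
sigma = math.sqrt(sum((c - caar) ** 2 for c in cars)) / math.sqrt(n)
t_stat = caar / sigma   # eq. (27), approximately N(0,1) under H0
print(t_stat < -1.96)   # reject H0 at the 5% level: negative abnormal impact
```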
1. A set of tweets that contain the name of the company, published over the period
of interest. We collected a total of nearly 400,000 tweets on these sports entities,
which we analyze against the stock market prices associated with them.
2. Time series of daily close prices for the company's shares. We collected a
total of 18 time series from Yahoo Finance.
Table 2: Example of Tweets from our dataset: Tweet text, Date, Length and Tokens
To analyze the impact of COVID-19 on the sports industry, we divide our study
into two parts: the first focuses on sentiment analysis and the second on the event
study. For the sentiment analysis part, Table 3 summarizes the dataset of tweets
collected over the COVID period from January 1 to May 30, 2020.
Table 3: Dataset of tweets collected over the COVID period from January 1 to May 30 2020.
Then, for the event study part, we selected three different analysis intervals.
March 24, 2020 is the corresponding lockdown date for the UK; we use this date
for Manchester United.
We analyze the rest of the companies as of March 21, the lockdown date
in the US.
The TextBlob package for Python is a convenient way to perform many Natural Lan-
guage Processing (NLP) tasks. It relies on a sentiment lexicon that provides a polarity
and a subjectivity score. We created a utility function to classify the sentiment of a
passed tweet.
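A minimal sketch of such a utility, assuming a TextBlob-style polarity in [−1, 1]; the neutral band eps is an illustrative threshold, not necessarily the one used in the paper.

```python
def classify_sentiment(polarity, eps=0.05):
    """Map a TextBlob-style polarity in [-1, 1] to a label.
    The neutral band eps is an illustrative choice, not the paper's."""
    if polarity > eps:
        return "positive"
    if polarity < -eps:
        return "negative"
    return "neutral"

print(classify_sentiment(0.6))    # → positive
print(classify_sentiment(0.0))    # → neutral
print(classify_sentiment(-0.4))   # → negative
```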
Figure 4: Histogram of the polarity of all WWE (World Wrestling Entertainment) tweets. We can
see that positive tweets form the majority; neutral tweets are those which are neither positive nor
negative.
To study the relationship between stock prices and the tweets' sentiment over a
period of time, it is crucial to build a time series of daily sentiment indicators. We
therefore build and compute two sentiment indicators as follows:
ScoreAbsTwitter(t), the Absolute Sentiment Score at day t
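The exact index formulas are not reproduced in this excerpt, so the aggregation below is only a plausible construction (an assumption): the absolute score as the daily net count of positive minus negative tweets, and a relative variant normalized by the daily total.

```python
from collections import defaultdict

# (day, label) pairs, invented for illustration; the paper's real index
# definitions are elided in this excerpt, so this is an assumed construction.
tweets = [
    ("2020-03-20", "positive"), ("2020-03-20", "negative"),
    ("2020-03-20", "negative"), ("2020-03-21", "positive"),
]

counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
for day, label in tweets:
    counts[day][label] += 1

def score_abs(day):
    """Assumed absolute score: net count of positive minus negative tweets."""
    c = counts[day]
    return c["positive"] - c["negative"]

def score_rel(day):
    """Assumed relative score: net count normalized by the daily total."""
    c = counts[day]
    total = c["positive"] + c["negative"] + c["neutral"]
    return score_abs(day) / total if total else 0.0

print(score_abs("2020-03-20"), round(score_rel("2020-03-20"), 3))
```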
Figure 5: Evolution of the Absolute Sentiment score computed for WWE tweets between June 2018
and April 2019.
The excess log-return is computed with respect to the S&P 500's Close.
The volatility index is built from the daily High and Low price values:

    Vol(t) = (High(t) − Low(t)) / (High(t) + Low(t))    (33)
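Both indicators can be computed in a few lines. The prices are invented, and the excess log-return formula (the stock's log-return minus the S&P 500's, on Close prices) is the standard construction assumed here since the exact equation is elided in this excerpt.

```python
import math

# One invented trading day for a stock and the S&P 500. Assumption: the
# excess log-return is the stock's Close-to-Close log-return minus the
# index's, since the exact formula is not shown in this excerpt.
close = {"t-1": 100.0, "t": 97.0}
sp500 = {"t-1": 3000.0, "t": 2955.0}
high, low = 99.0, 95.0

r_excess = (math.log(close["t"] / close["t-1"])
            - math.log(sp500["t"] / sp500["t-1"]))
vol = (high - low) / (high + low)   # eq. (33)
print(round(r_excess, 5), round(vol, 5))
```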
We obtained stock prices through the Yahoo! Finance API: the Open, Close, High
and Low values for each trading day over the studied period. We then preprocessed
the data to make it suitable for our analysis, dealing in particular with data missing
on weekends and holidays. We use a basic approximation to fill the missing values:

    PriceMissing(t) = (Price_{t−1} + Price_{t+1}) / 2    (34)
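A sketch of this fill rule on a toy series; consecutive gaps are filled left to right, so eq. (34) is applied recursively, and the first and last values are assumed present.

```python
# Eq. (34): fill weekend/holiday gaps with the average of the neighbouring
# known prices. None marks a missing close price (toy series; the first and
# last values are assumed present).
prices = [100.0, 102.0, None, 101.0, None, None, 99.0]

def fill_missing(series):
    out = list(series)
    for i, p in enumerate(out):
        if p is None:
            # nearest known neighbour on each side (earlier gaps already filled)
            left = next(out[j] for j in range(i - 1, -1, -1) if out[j] is not None)
            right = next(out[j] for j in range(i + 1, len(out)) if out[j] is not None)
            out[i] = (left + right) / 2
    return out

print(fill_missing(prices))
```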
4.2. Methodology
To analyze the relationship between tweets’ sentiment and stock prices two main
approaches are employed:
the Granger causality analysis, aiming to assess whether tweets' sentiment
determines the moves in the share price of a given company
    R_Excess(t) = α0a + β1a · ScoreAbs_twitter(t − 1) + β2a · ScoreAbs_twitter(t) + ε(t)    (35)
    R_Excess(t) = α0r + β1r · ScoreRel_twitter(t − 1) + β2r · ScoreRel_twitter(t) + ε(t)    (36)
The regressions study the relationship between the excess log-returns and
the two sentiment indicators (the Absolute and Relative indicators defined above),
contemporaneously and with a one-day lag.
When the estimated β are significant at a 95% confidence level, we conclude that there
is a dependency between the sentiment scores (with a lag) and the stock returns. The
regression of the excess log-returns on the Absolute scores does not provide sound
results, thus only the results from the Relative sentiment scores are discussed in the
following sections.
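The lagged dependence in eq. (36) can be illustrated with a one-regressor OLS on invented data; the paper's regression also includes the contemporaneous term.

```python
# Minimal OLS sketch of the lagged term of eq. (36), restricted to a single
# regressor for clarity (assumption; all series below are invented).
score_rel = [0.2, -0.1, -0.3, 0.1, -0.2, 0.0, 0.3]      # ScoreRel_twitter(t)
r_excess  = [None, 0.011, -0.004, -0.013, 0.006, -0.009, 0.001]

# pair R_Excess(t) with ScoreRel_twitter(t - 1)
x = score_rel[:-1]
y = r_excess[1:]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
beta1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
alpha0 = my - beta1 * mx
print(beta1 > 0)  # positive beta1: yesterday's sentiment moves today's return
```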
Table 4: Granger causality results for sentiment relative index and excess log-returns during the
COVID crisis
Table 5: Granger causality results for sentiment relative index and volatility during the COVID crisis
We note that the results are not significant. A first explanation is that there is not
necessarily a negative sentiment towards football clubs because of COVID-19, so
their sentiment score does not decrease as much as their share price. Some large
companies have fairly significant results (Dick's Sporting Goods and MGTI). There
may well be a dependency between the sentiment and the stock market performance
indexes we have chosen, but it is difficult to speak of causality.
The results are significant for most companies. However, no single date stands out,
which seems logical given the linear, continuous unfolding of the COVID crisis. We
can also see that some companies have a positive cumulative abnormal return; several
companies might thus have profited from the crisis.
Figures 9, 10 and 11 depict the event study results, showing the evolution of
the CAR around the time of the event for Lazio, Juventus and Foot Locker.
Figure 10: Event study results showing the evolution of the CAR around the time of the Covid
lock-down for Juventus
6. Conclusions
This paper studies the interaction between people’s sentiment and stock prices from
a list of leading sports brands. We use daily Tweets to construct a sentiment score
towards 18 sports brands. We analyze the dynamics of the interaction between tweets’
sentiment, stock returns and volatility to assess whether social media can impact fi-
nancial markets. We also use event study methodologies to support our hypothesis.
We assess the relationship between tweets and stock prices for the full scope sample of
18 companies over the period of the COVID crisis.
Our findings indicate that the interplay between the sentiment score and the excess
log-return: (i) is more significant for football clubs; (ii) is more significant the day
before, which suggests a causal relationship; (iii) is less relevant for big companies. This
does not allow us to conclude with certainty on the potential influence of tweets on
stock prices. Moreover, we show that volatility is not a metric showing dependency
with the sentiment score. In a future study, we intend to test volatility with the number
Agarwal, A., Xie, B., Vovsha, I., Rambow, O., Passonneau, R.J., 2011. Sentiment
analysis of twitter data, in: Proceedings of the Workshop on Language in Social
Media (LSM 2011), pp. 30–38.
Bollen, J., Mao, H., Zeng, X., 2011. Twitter mood predicts the stock market. Journal
of Computational Science 2, 1–8.
Brown, S.J., Warner, J.B., 1980. Measuring security price performance. Journal of
financial economics 8, 205–258.
Cha, M., Haddadi, H., Benevenuto, F., Gummadi, K.P., 2010. Measuring user influence
in twitter: The million follower fallacy, in: fourth international AAAI conference on
weblogs and social media.
Church, K.W., Hanks, P., 1990. Word association norms, mutual information, and
lexicography. Computational linguistics 16, 22–29.
Cutler, D.M., Poterba, J.M., Summers, L.H., 1988. What moves stock prices? Tech-
nical Report. National Bureau of Economic Research.
Delort, J.Y., Arunasalam, B., Milosavljevic, M., Leung, H., 2009. The impact of
manipulation in internet stock message boards. International Journal of Banking
and Finance, Forthcoming .
Domingos, P., Pazzani, M., 1997. On the optimality of the simple bayesian classifier
under zero-one loss. Machine learning 29, 103–130.
Fama, E., French, K., 1993. Common risk factors in the returns on stocks and bonds.
Journal of financial economics 33, 3–56.
Fama, E.F., 1970. Efficient capital markets: A review of theory and empirical work.
The Journal of Finance 25, 383–417.
Go, A., Bhayani, R., Huang, L., 2009. Twitter sentiment classification using distant
supervision. CS224N project report, Stanford 1, 2009.
Hutto, C.J., Gilbert, E., 2014. Vader: A parsimonious rule-based model for sentiment
analysis of social media text, in: Eighth international AAAI conference on weblogs
and social media.
Java, A., Song, X., Finin, T., Tseng, B., 2007. Why we twitter: understanding mi-
croblogging usage and communities, in: Proceedings of the 9th WebKDD and 1st
SNA-KDD 2007 workshop on Web mining and social network analysis, pp. 56–65.
Joachims, T., 1998. Text categorization with support vector machines: Learning with
many relevant features. Springer.
Klibanoff, P., Lamont, O., Wizman, T.A., 1998. Investor reaction to salient news in
closed-end country funds. The Journal of Finance 53, 673–699.
Kordonis, J., Symeonidis, S., Arampatzis, A., 2016. Stock price forecasting via sen-
timent analysis on twitter, in: Proceedings of the 20th Pan-Hellenic Conference on
Informatics, pp. 1–6.
Lewis, D.D., 1998. Naive (bayes) at forty: The independence assumption in information
retrieval, in: Machine learning: ECML-98. Springer, pp. 4–15.
MacKinlay, A.C., 1997. Event studies in economics and finance. Journal of economic
literature , 13–39.
Meador, C., Gluck, J., 2009. Analyzing the relationship between tweets box-office
performance and stocks. Methods .
Mittal, A., Goel, A., 2012. Stock prediction using twitter sentiment analysis. Stanford
University, CS229 project report.
Pak, A., Paroubek, P., 2010. Twitter as a corpus for sentiment analysis and opinion
mining., in: LREc, pp. 1320–1326.
Patell, J.M., 1976. Corporate forecasts of earnings per share and stock price behavior:
Empirical test. Journal of accounting research , 246–276.
Prechter Jr, R.R., Goel, D., Parker, W.D., Lampert, M., 2012. Social mood, stock
market performance, and us presidential elections: A socionomic perspective on
voting results. SAGE Open 2, 2158244012459194.
Ranco, G., Aleksovski, D., Caldarelli, G., Grčar, M., Mozetič, I., 2015. The effects of
twitter sentiment on stock price returns. PloS one 10.
Saif, H., He, Y., Alani, H., 2012. Semantic sentiment analysis of twitter, in: Interna-
tional semantic web conference, Springer. pp. 508–524.
Siering, M., 2013. All pump, no dump? the impact of internet deception on stock
markets., in: ECIS, p. 115.
Skrepnek, G.H., Lawson, K.A., 2001. Measuring changes in capital market security
prices: The event study methodology. Journal of Research in Pharmaceutical Eco-
nomics 11, 1–18.
Strauss, A., Corbin, J., 1998. Basics of qualitative research techniques. Sage publica-
tions Thousand Oaks, CA.
Surowiecki, J., 2004. The wisdom of crowds, 2004. New York: Anchor .
Wang, C.J., Tsai, M.F., Liu, T., Chang, C.T., 2013. Financial sentiment analysis
for risk prediction, in: Proceedings of the Sixth International Joint Conference on
Natural Language Processing, pp. 802–808.
Wysocki, P.D., 1998. Cheap talk on the web: The determinants of postings on stock
message boards. University of Michigan Business School Working Paper .
Zhang, X., Fuehres, H., Gloor, P.A., 2011. Predicting stock market indicators through
twitter. Procedia-Social and Behavioral Sciences 26, 55–62.