Moods in Social Networks as an Indicator for Stock Market Performance

Jonathan Frei

Ongoing research in the fields of semantic analysis and language processing has showcased many instances where aggregated opinions and moods uttered on the Internet lead to a collective intelligence capable of predicting events in the real world. So far, much of this research has focused on large datasets produced by numerous contributors. This thesis, however, tries to identify smaller groups of people and to make more accurate predictions based on the content they produce. First, it examines a possible correlation between the stock price of a single company and the mood of that company's employees; second, it examines a possible correlation between the volume of tweets and the trading volume of that company's stock. The six companies observed were Cisco, Dell, Google, Intel, Microsoft, and Oracle, which were chosen because of their relatively high number of employees on Twitter.

A mood graph was generated by analyzing the tweets from the employees of each company. The employees were identified with the help of Amazon's collaborative marketplace, Mechanical Turk¹. First, the bios² of Twitter users were searched for keywords such as “google” or “cisco”. This method yielded 3'981 potential employees, who were collected along with their user names and bios. They were then displayed in a Mechanical Turk task, and several workers³ were assigned to filter out the actual employees based on the descriptions the users gave in their bios. In total, 1'004 employees were successfully identified. Cross-checking a sample of the employees showed that the workers had achieved a very high accuracy.

Furthermore, a Multinomial Naive Bayes classifier was programmed to automatically classify the moods expressed in the tweets of the 1'004 identified employees. The tweets were classified into three categories based on the mood they conveyed: positive, neutral, or negative. The classifier was again trained with the help of Mechanical Turk: a total of 5'000 tweets were manually classified by workers in order to extract a sufficiently large vocabulary that the classifier could use to evaluate tweets. However, the comparison of moods and stock prices of the six companies did not yield a significant correlation, except in one instance, and this exception can more likely be attributed to statistical error than to a genuine correlation.⁴
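To make the classification step concrete, the following is a minimal sketch of a Multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not the implementation used in this thesis: the whitespace tokenizer, the smoothing choice, the handling of unknown words, the function names, and the six training tweets are all assumptions made for illustration, whereas the actual classifier was trained on 5'000 manually labeled tweets.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace only.
    return text.lower().split()


def train(labeled_tweets):
    """Count documents per mood and words per mood from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the mood with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training tweets carrying this mood.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # Assumption: words never seen in training are ignored.
            # Add-one smoothing keeps unseen (word, mood) pairs from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training tweets, invented for this sketch.
TRAINING = [
    ("great launch today love my team", "positive"),
    ("so happy with the new release", "positive"),
    ("meeting at noon in building four", "neutral"),
    ("the release notes are posted online", "neutral"),
    ("terrible outage again so frustrated", "negative"),
    ("hate this broken build", "negative"),
]

model = train(TRAINING)
print(classify("love the new launch", model))
```

Scores are summed in log space so that long tweets do not underflow to zero when many small probabilities are multiplied.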

¹ Mechanical Turk is a crowdsourcing marketplace that allows small assignments to be posted online and completed in return for a small payment.
² Twitter bio: “A short personal description used to define who you are on Twitter” (Twitter, 2011c).
³ A worker on Mechanical Turk is a person assigned to a certain task. After completion, the worker usually receives a small compensation, which can be specified by the creator of the task.
⁴ 18 observations were made on tweet volume and trading volume. 12 observations were made on stock prices and moods.


In terms of causation, various tests were run comparing emotions with stock prices on the previous day and vice versa. Delays in either direction did not, however, result in a significantly better correlation, which means that it could not be determined whether changes in stock prices influence employees' moods or whether moods are an indicator of changes in the stock price on the next day. Since trading hours and office hours roughly coincide, it is assumed that changes in stock prices affect employees mostly on the same day. This assumption is further supported by the observations made on the number of tweets and the trading volume. In a total of six cases, a significant correlation was found. Of those six correlations, four were observed when data from the same day was compared, and two, which were weaker but still significant, when emotions were delayed by one day.
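The delayed comparisons described above can be sketched as a lagged correlation. The sketch assumes the Pearson correlation coefficient, which this text does not name as the measure used; the function names and the two short series are hypothetical and were constructed so that trading volume follows tweet volume by exactly one day.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(tweets, volume, lag):
    """Correlate tweets on day t with trading volume on day t + lag.

    lag > 0: tweets precede the trading volume.
    lag < 0: trading volume precedes the tweets.
    lag = 0: both series are compared on the same day.
    """
    if lag > 0:
        return pearson(tweets[:-lag], volume[lag:])
    if lag < 0:
        return pearson(tweets[-lag:], volume[:lag])
    return pearson(tweets, volume)


# Hypothetical daily series, invented for this sketch.
tweets = [10, 20, 15, 30, 25, 40]
volume = [0, 20, 40, 30, 60, 50]

for lag in (-1, 0, 1):
    print(lag, round(lagged_correlation(tweets, volume, lag), 3))
```

Each lag shortens the overlapping part of the two series by one day per step, so with short observation periods the delayed comparisons rest on fewer data points than the same-day comparison.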
