

Monitoring mood in the UK via Twitter

Rob Hawkes
BA (Hons) Interactive Media Production The Media School 2010-11



This study sets out to experiment with the possibility of using free Web technologies to build a tool capable of storing and processing large volumes of data, which can then be used to analyse trends in sentiment across the UK using the ANEW dataset. A system is created to gather, store, and analyse messages on Twitter. Throughout the study this system is optimised to handle larger volumes of data and to increase the performance of sentiment analysis. By moving from single-core to multi-core processing, the time taken to perform the analysis is cut by a factor of around twelve. Initial analysis of the sentiment data shows that maps are an ineffective way to visualise sentiment over time, particularly over a large geographical area. Moving to temporal line graphs reveals significant fluctuations in sentiment data across the period of a week. Further analysis finds a regular sentiment heartbeat on Twitter that, when smoothed, highlights a consistent and key trend in sentiment across the week. Statistical analysis of these trends with t tests shows a significant difference in sentiment on weekends compared with weekdays, correlating with existing studies from the US. Overall, the study demonstrates that existing Web technologies are capable of this type of analysis, that there are significant trends in sentiment on Twitter, and that these trends can fluctuate during key public events across the UK.


This study would not have happened without the support of my friends and family. I would like to particularly thank my girlfriend, Lizzy Robins, who is always there for me whatever the problem, even if she does not fully understand what I am doing sometimes. I look forward to supporting her through the same dissertation process as she moves into her final year of university. Both Ernesto Jiménez and Mark Embling helped massively with my struggles with multi-core processing. Without them I am sure that I would still be hacking away at it today. I extend a special thank you to my good friend Hannah Wolfe, whose persistence and help have undoubtedly brought this study into much better focus. I owe you one (perhaps even two)!


Contents

Introduction
Defining sentiment
Using Twitter as a corpus
Receiving data
Gathering data
Looking for sentiment
Mapping sentiment
Shaking off the geographical restraints
Upgrading the technology
The sentiment heartbeat
Taking the analysis further
People love a good smooch on a balcony
Conclusion
Bibliography
Appendices


Introduction

I am an inquisitive person. I get a kick out of experimenting with stuff and learning new things. It is because of this that I constantly explore new areas and put myself in situations that make me uncomfortable. And it is because of this that I was excited to focus on the Twitter API and explore the secrets hidden within it.

Inspiration for this study initially came from the Spirit of Christmas project (Hawkes 2009). In this project a system was produced that analysed tweets over the Christmas period to produce a level of "spirit" for each tweet. This was done by cross-referencing each word in a tweet with a pre-defined list of positive and negative words relating to Christmas. The list of words was created by hand and never came under any proper scrutiny – the purpose was to have fun, not to be statistically fair or accurate.

The lessons learnt from the Spirit of Christmas project helped throughout this study. They allowed me to scrape the Twitter API for tweets in a more efficient way (being aware of its limitations), to analyse sentiment using a method that is much more robust and academically sound (not using a manual dataset), and to be aware of possible trends in the data (most tweets had a negative Christmas spirit).

This time around I was particularly interested in the relationship between sentiment and the geographic location of tweets. I believed that there would be a significant relationship between physical locations and sentiment. I wanted to investigate the following: Can general trends be discovered in Twitter sentiment? Does Twitter sentiment vary significantly in different areas of the country, and does this change over time? Are existing Web technologies able to provide these insights? These are the questions I set out to build a tool capable of exploring.

Defining sentiment
Sentiment is a feeling or attitude. For the purposes of this study I am specifically referring to positive and negative sentiment – levels of happiness. Sentiment is incredibly subjective; to see this you need only ask two people who support different sports teams whether they felt happy or sad about the score. It is for this reason that I am not looking at the sentiment of a particular topic (e.g. a football match), but rather at sentiment at a particular point in time – the subject of the sentiment is irrelevant. Given that sentiment is subjective, how do you find a method of accurately measuring it in a reliable and reproducible way? There are two main approaches: subjective opinion given by human judgement, or systematic values given by scientific analysis of sentiment. Subjective opinion can instantly be ruled out as an accurate and reliable measure. Fortunately, thorough studies into sentiment at the University of Florida (Bradley and Lang 2010) have resulted in the Affective Norms for English Words (ANEW), a method of accurately and reliably attributing sentiment to English words. The ANEW dataset contains 2,476 English words, each attributed with a value for three different dimensions of sentiment: valence (happiness), arousal (calmness), and dominance (strength). Each of the three measures has a value on a scale from 1 (e.g. low valence – unhappy) to 9 (e.g. high valence – happy). As I am only interested in positive and negative sentiment, valence alone is sufficient for this study. Danforth and Dodds (2009) point out that ANEW is of particular use for analysing text because the values for average valence are distributed nicely across the 1–9 scale. They state that this distribution gives sufficient sensitivity for discriminating texts based on sentiment.
To put this in context, the word triumphant has a valence of 8.82 – very positive – while the word rape has a valence of 1.25 – very negative. This simple, continuous, and measurable scale applied to specific words, along with ANEW's availability for research projects, made it the obvious solution for this study. I should take this moment to point out that it is beyond the scope of this project to analyse how fairly ANEW represents how an individual person feels. What I am interested in is finding a method that I can rely on to give me accurate and reproducible measurements of sentiment for a corpus, which is exactly what ANEW has been created to do.

Using Twitter as a corpus
Herring and Honeycutt (2009) show that the most popular kind of content on Twitter answers the question, "What are you doing?" It is this personal question, coupled with the exponential growth of Twitter (Penner 2011) and the small format of the messages (tweets), that played a part in deciding where to gather my data from. Twitter also stands out as a research tool because the majority of the 140-character tweets sent through it are publicly accessible via a free and easy-to-use API. Surely 140 million tweets per day across the world are enough to get an idea of the general sentiment level in the UK? I was determined to find out.

A number of studies have looked into the analysis of sentiment through Twitter. The interesting part has not been finding out whether sentiment can be gleaned from 140 characters of text – it seems that it can (Gilbert and Kim 2009) – but rather what the resulting sentiment data can be used for. One prominent example is the study by Bollen et al. (2011), which investigates the correlation between Twitter sentiment and the Dow Jones Industrial Average (DJIA). By utilising the OpinionFinder (OF) and Google Profile of Mood States (GPOMS) methods of sentiment analysis, the study found that the calmness of Twitter can help predict the DJIA a whole three days in advance. Beyond the DJIA prediction, this study also showed that the positive measure of sentiment through both OF and GPOMS correlated with large public events that occurred within the US; events that you would naturally associate with a change in happiness. For example, both methods of sentiment analysis indicated that Twitter was significantly happier on both election day (4th November 2008) and Thanksgiving (27th November 2008). The fact that they did not use ANEW is also particularly interesting, and left me keen to investigate whether ANEW would show similar correlations between Twitter sentiment and public events within the UK.

Another study in the US suggests that there could be a significant correlation between sentiment on Twitter and geographic location (Ahn et al. 2010). This study, based on over 300 million tweets, also shows that sentiment within the US follows a general trend across the period of a single day (early morning and late evening being happiest), as well as across an entire week (weekends being happiest). This, along with the DJIA study, clearly answers my initial question of whether trends can be found in Twitter sentiment. It will be interesting to see whether the tools that I am using show the same trends with data from the UK.

Receiving data
My initial study into Twitter sentiment started late in 2010, but before I could even start analysing tweets for sentiment I needed a reliable way to gather and store them. Gathering the tweets was first attempted with the Twitter Search API, although I immediately ran into some issues: results are restricted to tweets from the last seven days, each request returns at most 1,500 tweets, and calls to the API are rate limited (Twitter ca.2011). The Search API has these protections in place to stop abuse and overuse of the system – bad news for researchers. The reason is that the Search API is a REST API, meaning that to get data from it you have to send an HTTP request to Twitter and then get an HTTP response back containing the tweets that you are after. Given that you can only receive a maximum of 1,500 tweets per request, gathering 1,000,000 tweets, for example, would take 667 separate HTTP requests. That is a lot of requests – so many, in fact, that Twitter temporarily banned me from the Search API for abuse of the rate limit when I first attempted to use it. Since a large corpus was needed, another method was required to gather the tweets.

The good news is that Twitter recognise this as a legitimate problem and have released a method for gathering as many tweets as you want, without any of the performance overhead. This method is called the Streaming API, and with a single HTTP request it effectively lets you connect a pipe between yourself and Twitter. This pipe is a persistent connection that Twitter uses to send tweets down to you in near real time. The benefit of this method is that it allows you to gather an unlimited number of tweets with just a single HTTP request, which is much better than 667. But all good things come at a price, and the price with the Streaming API is that you only have access to ~1% of tweets for free. You can pay for access to the 10%, 50% and 100% streams, but these cost a considerable amount of money and are therefore unsuitable for the scope of this project (Kirkpatrick 2010). For context, my initial research showed that the ~1% stream returns around 60,000 to 100,000 geo-located tweets per day from the UK, which provides an adequate corpus for this study.

Another issue to be aware of is that the Streaming API delivers tweets in real time, meaning that you cannot gather an entire corpus in one sitting. This is important because it means that gathering 1,000,000 tweets from the UK could take anything from 10 to 17 days (at a rate of around 60,000 to 100,000 tweets a day). Unfortunately there is little option but to put up with this situation, as Twitter's Terms of Service (TOS) state that collections of raw tweets cannot be distributed by anyone without a licence to do so. Research tools, such as 140kit, are attempting to work around the problem (Gaffney et al. 2011).

Gathering data
To gather the data for this study I knew that I needed to use the Streaming API, but a method was still required to actually connect to it and deal with the incoming tweets in real time. For this I looked at Node, a server-side JavaScript environment. Node stands out because it is built around an event loop, which means it does not get locked up performing a single task, as can be the case with more traditional programming environments like PHP. This event-based approach lends itself to the consumption of real-time data, particularly when you want to continue receiving new data while adding the previously received data to a storage system. It also worked well as a solution because there are a variety of extensions that can be used to consume the Streaming API.

When dealing with so many tweets, it is important to find a storage system that is both stable and fast. The two options available were a traditional relational database, like MySQL, or a document-oriented database, like MongoDB. Normally I would reach for MySQL, as it is what I am used to, but I was keen to try a document-oriented solution because of its rapid and flexible nature. What swayed the decision in MongoDB's favour was that it uses BSON to structure and store data, a near-identical format to the data from the Streaming API (JSON). This means that I could maintain the integrity of the Twitter data within MongoDB without any conversion or cutting down of the data, unlike with MySQL. The ability to work rapidly and without repercussion makes MongoDB an ideal solution, helped also by the fact that there are plenty of Node extensions to interface with it.

Combined, these JavaScript technologies allow the consumption and storage of tweets from the UK to be performed with ease. This is helped massively by the ability of the Streaming API to send only tweets that fall within a rectangular geographical boundary, designated by latitude and longitude coordinates.
The coordinates I used to define this rectangle are 49.76707, -12.72216 for the bottom left (south-west of Land's End), and 61.06891, 1.97753 for the top right (north-east of the Shetland Islands). As the UK is separated from other nations by water, this method will do until a better one is put in place by Twitter – perhaps by enabling the API to intelligently determine which country a tweet was sent from, based on its coordinates. That way you need only request tweets from the UK, rather than tweets within a geographic boundary.
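The effect of this rectangle can be sketched as a simple point-in-box check. To be clear, this is not the code used in the study – the filtering happens on Twitter's servers via the Streaming API's `locations` parameter – it is a minimal illustration of what the boundary does, including its known flaw of also covering Ireland and a corner of France:

```javascript
// Bounding box used for the Streaming API's `locations` filter,
// expressed as west/south/east/north longitude and latitude limits.
const UK_BOUNDS = { west: -12.72216, south: 49.76707, east: 1.97753, north: 61.06891 };

// True when a coordinate falls inside the rectangle. Note that the
// rectangle also covers Ireland and a sliver of northern France –
// a known limitation of this crude, water-bounded approach.
function inUKBounds(longitude, latitude) {
  return longitude >= UK_BOUNDS.west && longitude <= UK_BOUNDS.east &&
         latitude >= UK_BOUNDS.south && latitude <= UK_BOUNDS.north;
}
```

A point in central London passes the check, while a point in New York does not, which is all the precision this study requires.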

Looking for sentiment
By gathering a small initial batch of tweets, I ran them through a simple Node script to analyse sentiment based on the ANEW dataset. Each tweet was broken down into its individual words, with each word cross-referenced against the ANEW dataset to get its sentiment value. The average sentiment for the tweet (v_text) is calculated by finding the frequency (f_i) with which each word (i) appears in the tweet, multiplying the sentiment value for each word from the ANEW dataset (v_i) by that frequency, summing, and then dividing by the total frequency:

$$v_{\text{text}} = \frac{\sum_{i=1}^{n} v_i f_i}{\sum_{i=1}^{n} f_i}$$

This gives the weighted average sentiment for the entire tweet, and is based on the method used by Danforth and Dodds (2009) to find sentiment in blog posts and song lyrics.
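A minimal sketch of this calculation in Node-style JavaScript might look like the following. The two valence values are the ones quoted earlier in this chapter; the function name and the tiny word list are illustrative, not the study's actual script:

```javascript
// Two ANEW valence values quoted in the text; the real dataset
// holds 2,476 scored words.
const ANEW = new Map([
  ['triumphant', 8.82], // very positive
  ['rape', 1.25],       // very negative
]);

// Frequency-weighted average valence of a tweet. Summing the valence
// of every ANEW-word occurrence and dividing by the occurrence count
// is equivalent to the Σ(v_i f_i) / Σ(f_i) formula above. Tweets with
// no scored words return null rather than a misleading zero.
function tweetValence(text) {
  let weighted = 0;
  let total = 0;
  for (const word of text.toLowerCase().split(/\W+/)) {
    const valence = ANEW.get(word);
    if (valence !== undefined) {
      weighted += valence;
      total += 1;
    }
  }
  return total === 0 ? null : weighted / total;
}
```

Returning null for tweets with no ANEW words matters: averaging in a default value would skew the corpus-wide statistics.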

Mapping sentiment
At this point the system had the average sentiment for each tweet, as well as the geographic coordinates that it was sent from. Using Polymaps I then plotted each tweet on a map of the UK, giving a basic visualisation of sentiment across the country on a single day (see Figure 1). Each point on the map represents a tweet, with the colour of each point representing its sentiment.


The colours range from red (unhappy), through to yellow (neutral), then green (happy).

Figure 1. Basic visualisation of sentiment on a map of the UK

I chose Polymaps for producing the maps for a number of important reasons. Firstly, it uses JavaScript, meaning that it runs within the browser without compilation. This is important because it allows for rapid development and instant visualisation, unlike other methods which can incur a lengthy wait while code is compiled. Another reason is that it is well documented, which ties in with the requirement for rapid development. The most important reason is that it uses the GeoJSON data format, which is the same format as the coordinates output by the Twitter Streaming API. This symbiosis of the two technologies means that the data from Twitter can be used without compromising its integrity, increasing its reliability by avoiding a conversion process and thereby lessening the chance of accidental corruption.

The first sentiment map of the UK highlighted some key issues that hampered the analysis of sentiment. Chief among these was that it was too hard to distinguish between the full range of happy and sad tweets because the colours were not varied enough. This was happening because the majority of tweets had a sentiment value between 5 and 7 (see Figure 2), and this lack of distribution meant that most tweets showed up as a green-yellow or green point. This, coupled with the size of each point being too large, prompted a much more refined map (see Figure 3).

Figure 2. Distribution of sentiment across a single day in the UK

Reading the map suddenly becomes much easier simply by reducing the size of the points and adding a threshold to the colour values. This threshold dictates that any point with a sentiment value less than 5 should be red, and any with a value greater than 8 should be green. Points with a sentiment value between 5 and 8 will be graded from red through to green respectively, as with the previous map. Adding this level of granularity to the map resulted in a more interesting visualisation, but there simply was not any obvious pattern or trend to the data. The only thing the map did show conclusively was the profound statement that people do not tweet much in the countryside.
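The colour threshold can be sketched as a small function. This is an illustrative reconstruction rather than the study's actual Polymaps styling code: it clamps valence to the 5–8 band described above and grades linearly from red to green:

```javascript
// Maps a valence value to a plot colour using the thresholds above:
// anything below 5 is solid red, anything above 8 is solid green,
// and values in between grade linearly from red to green.
function sentimentColour(valence) {
  const t = Math.min(Math.max((valence - 5) / 3, 0), 1); // clamp to 0–1
  const red = Math.round(255 * (1 - t));
  const green = Math.round(255 * t);
  return `rgb(${red},${green},0)`;
}
```

The clamp is what makes the map readable: without it, the narrow 5–7 cluster of real tweets would occupy only a thin slice of the colour range.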


Figure 3. Refined map of sentiment in the UK

Another issue is that every tweet across the day is plotted on the same map, resulting in a rather hectic build-up of tweets in populated areas. To overcome this, each tweet needs to be visualised only at the time that it occurs. Doing this on a map would make it hard to see whether any trends are occurring, as you would only ever see tweets at one specific moment in time. I decided at this point to drop the map visualisations and turn to a time-series line graph, which allows sentiment values at previous and future points in time to be visualised all at once. As a result, my second investigation focusses on sentiment across the UK as a whole, rather than sentiment in different areas of the country.

Shaking off the geographical restraints
These line graphs tell a different story when compared to the maps. For example, the graph of average sentiment per hour on Valentine's day shows some potentially significant trends (see Figure 4). The graph seems to indicate that people were sending the happiest tweets between 2 and 3pm, and that the day followed a negative trend in general. For the most romantic day of the year (apparently) this seemed odd. So, in an effort to shed some light on it I decided to dig further into the collected data and look at the average sentiment per minute instead.

Figure 4. Average sentiment per hour on Valentine's day

Figure 5. Average sentiment per minute on Valentine's day


What is interesting is that visualising average sentiment on a per-minute basis results in a graph that looks wildly different to the per-hour graph (see Figure 5). In hindsight, the reason for this is obvious – fewer tweets per period of analysis means that outlying and erratic values find it easier to bubble up to the surface. However, before I attempted to better visualise the data I was interested to see whether this erratic pattern held over a period longer than a single day.

Upgrading the technology
The problem was that, up to now, the data gathering and analysis technique I was using was severely limited: it only ran while my personal computer was switched on, the MongoDB database for each day had a huge file size, the Node script sometimes crashed without warning, and it was susceptible to my unpredictable wireless Internet connection. Measuring sentiment over a longer period of time requires a larger corpus, so I needed to find a more reliable solution.

The solution I came up with was threefold. I first moved part of the system onto a hosted virtual private server (VPS), which allowed me to gather tweets 24/7 with a more predictable Internet connection. I then moved from Node to Ruby, because Ruby's Twitter Streaming API extension was more reliable in testing. The move to Ruby meant sacrificing the event-based approach, but it was a better option than having to build a complex and unnecessary crash manager into the Node script. To solve the MongoDB file size issue I first moved the database to a hosted environment. I then had to sacrifice my so-far-maintained data integrity and use software optimisation techniques. This included reducing the data stored for each tweet to the bare minimum: text, date sent, and a unique identification number. By collecting only that data I was able to reduce the size of each tweet from 2KB to 0.3KB, which may not sound like much but ends up saving 1.7GB when storing 1,000,000 tweets. This need for optimisation is a key issue in the management and processing of large quantities of data, and reflects the issues constantly faced by massive online systems such as Twitter (Catt 2010).

One particular optimisation proved valuable in the analysis of tweets to produce the sentiment data. Previously, I was using a MongoDB map/reduce function that took tens of minutes to work through 60,000 to 100,000 tweets. This was acceptable for such a small dataset, but working through 2,000,000 tweets took over 10 hours – an unacceptable amount of time. It is already known that map/reduce functionality in MongoDB is not always suitable for large-scale number crunching because it uses a single processor core (Horowitz 2009). To overcome this I implemented my own technique that broke the tweets up into smaller batches, spreading the analysis across a series of Node processes run over multiple processor cores. This alone reduced the time to work through 2,000,000 tweets from 10 hours to just a few minutes – quicker, even, than the original map/reduce with 60,000 tweets!

The resulting system comprises several scripts: one scrapes Twitter and puts the tweets in MongoDB, a multi-core script converts the tweets into sentiment values, another turns those sentiment values into per-hour or per-minute averages, and one last script converts those averages into a JavaScript data file that I then visualise with Protovis.
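The batching idea behind the multi-core optimisation can be sketched as follows. This is a simplified illustration, not the study's actual script: in practice each batch would be handed to a separate worker process (for example via Node's child_process.fork), whereas here only the round-robin split is shown:

```javascript
// Deals tweets round-robin into one batch per processor core, so
// that each batch can be analysed by a separate Node process.
function makeBatches(items, cores) {
  const batches = Array.from({ length: cores }, () => []);
  items.forEach((item, i) => batches[i % cores].push(item));
  return batches;
}
```

Because sentiment scoring is independent per tweet, the batches need no coordination beyond merging their results at the end, which is what makes the multi-core speed-up close to linear.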

The sentiment heartbeat
Visualising average sentiment on Twitter over a period of a few days makes it quite obvious that the erratic pattern is still apparent (see Figure 6). Fascinatingly, there seems to be a consistent rhythm in sentiment that always occurs early in the morning. This fluctuation correlates with the study by Balasubramanyan et al. (2010), which also found that day-to-day sentiment is volatile.


Figure 6. Erratic daily sentiment

This volatility can be thought of in terms of signal-to-noise, and is directly related to the quantity of tweets sent throughout the day. The connection is clearly highlighted by adding the quantity of tweets sent per minute to the same graph (see Figure 7). Let's not be mistaken though: the early-morning fluctuations are still significant variations in average sentiment. This is Twitter's underlying sentiment heartbeat, and in another study I would certainly like to explore this heartbeat pattern in more detail to derive the cause of the fluctuations.

Figure 7. Erratic daily sentiment compared with volume of tweets

You can mitigate these fluctuations in sentiment by smoothing out the data over time. This process is referred to as calculating the moving average, or temporal smoothing, and it stops the sentiment values from responding too quickly to change. This in turn allows consistent trends in the data to appear over longer periods of time. Balasubramanyan et al. (2010) use this method to overcome the signal-to-noise ratio in their own study, as do Ahn et al. in their study of Twitter sentiment in the US (2010). To apply a moving average at a tweet (t) over a window of k tweets, you first find the sum of the sentiment values of the current tweet (v_t) and the preceding k − 1 tweets. The moving average (MA_t) is then given by dividing this sum by k:

$$MA_t = \frac{v_t + v_{t-1} + v_{t-2} + \dots + v_{t-(k-1)}}{k}$$
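A direct translation of this formula into JavaScript might look like the following (an illustrative sketch, not the study's actual script). Note that the first k − 1 positions produce no output, because a full window of preceding values is not yet available there:

```javascript
// Moving average with window k: each output value is the mean of
// the current value and the preceding k − 1 values, matching the
// MA_t formula. The first k − 1 positions are skipped because a
// full window is not yet available.
function movingAverage(values, k) {
  const smoothed = [];
  for (let t = k - 1; t < values.length; t++) {
    let sum = 0;
    for (let i = 0; i < k; i++) sum += values[t - i];
    smoothed.push(sum / k);
  }
  return smoothed;
}
```

A larger k gives a smoother line at the cost of responsiveness; the 24-hour window used below is large enough to suppress the per-minute heartbeat while keeping day-to-day movement visible.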

When applied to 659,030 tweets over the period of a week, a moving average with a 24-hour window turns an erratic graph that does not say much (see Figure 8) into a smoothed graph that tells us a lot about the change in sentiment over a single week (see Figure 9). For example, the un-smoothed graph shows a few peaks and troughs, but nothing that could be considered an obvious trend. The smoothed graph, on the other hand, shows a period of happy sentiment on the weekend, followed by a sharp drop into an unhappy period on Monday which does not recover until the next weekend. As interesting a finding as this is, it would be wrong to take the results from a single week and assume that all weeks follow the same trend.

Figure 8. Erratic daily sentiment over a period of a week


Figure 9. Smoothed daily sentiment over a period of a week

To get a more reliable picture of the trend of sentiment over a week I analysed the largest continuous set of tweets collected by the Ruby-based system. These 2,981,622 tweets cover a period of just over five weeks (38 days). By grouping the tweets by each day of the week I was then able to calculate the average sentiment for each day across the whole dataset (see Figure 10). This visualisation of combined average sentiment quite clearly shows a similar weekly trend across the whole five weeks. Not only has combining and smoothing the data resulted in a readable graph that uncovers important trends in sentiment on Twitter, but it also correlates with the findings by Ahn et al. with weekly variations in sentiment in the US (2010). It also further confirms that ANEW is a usable dataset that is comparable to OF and GPOMS.
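The grouping by day of the week can be sketched as follows. The field names (sent, valence) are illustrative, not those used in the actual system, and days with no tweets come back as null rather than zero:

```javascript
// Averages sentiment per day of the week across the whole corpus.
// Each tweet is assumed to be a { sent: Date, valence: number }
// object; days with no tweets are reported as null.
function averageByWeekday(tweets) {
  const sums = new Array(7).fill(0);
  const counts = new Array(7).fill(0);
  for (const { sent, valence } of tweets) {
    const day = sent.getUTCDay(); // 0 = Sunday … 6 = Saturday
    sums[day] += valence;
    counts[day] += 1;
  }
  return sums.map((sum, day) => (counts[day] ? sum / counts[day] : null));
}
```

Grouping by weekday before averaging is what lets five separate weeks reinforce each other: each day's figure pools tweets from every occurrence of that day in the dataset.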

Figure 10. Smoothed average sentiment for each day of the week

By applying a t test I was able to show a significant and reliable difference between sentiment on weekdays compared with sentiment on weekends: t(892) = −6.658, p < .001, α = .05 (see Appendix 1). A further t test showed that there is no statistically reliable difference between sentiment on a Monday compared with that of a Tuesday: t(233) = −1.241, p = .216, α = .05 (see Appendix 2). Together, these two tests back up the visualisation depicting a weekly trend in sentiment. Taking the time to do these statistical tests served to further confirm that the data output by the system was accurate and reliable.
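For illustration, the pooled two-sample t statistic underlying tests of this kind can be computed as follows. This is a sketch of the statistic only; the original analysis, including the p values, was produced with a statistics package, not with this code:

```javascript
// Pooled two-sample t statistic for comparing two groups of daily
// sentiment values (e.g. weekdays vs weekends). Degrees of freedom
// are n_a + n_b − 2, matching the t(df) convention used above.
function tStatistic(a, b) {
  const mean = xs => xs.reduce((s, x) => s + x, 0) / xs.length;
  const sumSq = (xs, m) => xs.reduce((s, x) => s + (x - m) ** 2, 0);
  const ma = mean(a);
  const mb = mean(b);
  const df = a.length + b.length - 2;
  const pooledVariance = (sumSq(a, ma) + sumSq(b, mb)) / df;
  const standardError = Math.sqrt(pooledVariance * (1 / a.length + 1 / b.length));
  return { t: (ma - mb) / standardError, df };
}
```

A large negative t, as reported for the weekday/weekend comparison, indicates that the first group's mean sits well below the second group's relative to the spread of the data.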

Taking the analysis further
Knowing that sentiment falls at the beginning of the week only to rise again towards the end, I wanted to briefly test this method of sentiment analysis further. To do this I used the Ruby-based system to gather tweets relating to two important events in the UK calendar: Easter Sunday, and the royal wedding of Prince William and Catherine Middleton. Both of these events undoubtedly affect the entire country, and I was particularly interested to see how Twitter reacted to them and whether they affected the weekly trend.

The system quickly produced some interesting new findings. In the week leading up to Easter the trend is vaguely in line with the weekly trend, but it does not follow the dramatic drop in happiness on Monday shown previously (see Figure 11). The Easter trend also deviates by rising dramatically on bank holiday Friday, at a greater incline than that of the average weekly trend. This dramatic increase in happiness is in line with what I would expect on a day when the majority of people have just started a four-day weekend.

Figure 11. Sentiment in the week leading up to Easter


Turning our attention to the royal wedding, happiness again rises dramatically on Friday (see Figure 12) – the day of the royal wedding and another bank holiday – which corresponds with the idea that people would be happy about the wedding and about having a day off work (bar a few grumpy folk).

Figure 12. Sentiment in the week leading up to the royal wedding

It is good to see that the weekly trend shows up around both events, albeit at slight extremes on particular key days. Further analysis should certainly be undertaken to work out whether the events are truly affecting the weekly trend, or whether the change is a random or natural fluctuation.

People love a good smooch on a balcony
By further analysing the day of the royal wedding I have uncovered some more potential trends in the sentiment (see Figure 13). The happiest event of the day is the kiss on the balcony of Buckingham Palace. It may sound a little odd, but that moment was one of the most anticipated of the day, and it is fascinating to see that my analysis of the collected data is sensitive enough to highlight it so clearly. By this point the Ruby-based system was running smoothly and robustly collecting data. The average weekly graph shows a clear trend, backed up both by the example “normal” weekly graph and by its correlation with the Ahn et al. study. This is very much a light-hearted look at the data, but it is clear that something is affecting the sentiment during a normal week and on key national events. As with the weekly graphs, further research should be carried out into these fluctuations to see whether they are typical for a Friday, or whether they are genuinely a result of the day's events.
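Separating a genuine trend from short-term noise like this is what the smoothing step does throughout the study. A minimal sketch of the idea as a simple moving average (the window size here is an assumption, not the value used in the study):

```ruby
# Simple moving average: each output point is the mean of a sliding
# window over the raw series, damping short-term noise.
def moving_average(series, window)
  series.each_cons(window).map { |chunk| chunk.sum / window.to_f }
end

# Illustrative noisy hourly valence readings:
noisy  = [5.8, 6.2, 5.9, 6.1, 5.7, 6.3, 6.0]
smooth = moving_average(noisy, 3)
# smooth has (noisy.size - window + 1) points with far smaller swings.
```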

Figure 13. Sentiment per minute during the royal wedding
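A per-minute series like the one in Figure 13 can be produced by bucketing scored tweets by minute and averaging the valence in each bucket. A minimal sketch, assuming each tweet already carries a precomputed ANEW valence score (the timestamps and scores here are illustrative):

```ruby
require 'time'

# Group scored tweets into minute buckets and average the valence
# in each bucket, giving one sentiment reading per minute.
def sentiment_per_minute(tweets)
  tweets.group_by { |t| t[:at].strftime('%Y-%m-%d %H:%M') }
        .transform_values { |ts| ts.sum { |t| t[:valence] } / ts.size.to_f }
end

tweets = [
  { at: Time.parse('2011-04-29 13:25:10'), valence: 6.1 },
  { at: Time.parse('2011-04-29 13:25:40'), valence: 6.5 },
  { at: Time.parse('2011-04-29 13:26:05'), valence: 5.9 }
]
series = sentiment_per_minute(tweets)
```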

At the beginning of this journey I aimed to find trends in sentiment specifically related to geographic location. It did not take me long to realise that such a focus on location, visualised with maps, was going to prove difficult when measuring such a quantity of tweets over a long period of time. Although some lovely maps were made that highlighted hotbeds of Twitter activity in the UK, it is because of these difficulties that the method of analysis for this study was switched to a purely temporal one, dropping geographic location entirely. This shift in focus opened doors to a hidden world of data that simply would not have surfaced if visualised on a map. Temporal analysis has shown that there are consistent fluctuations in sentiment throughout each and every day on Twitter: a sentiment heartbeat of sorts. By smoothing this heartbeat it has been possible to break through the noise and shed light on the signals that lie within it. The most noticeable finding has been that there is a consistent rise and fall in sentiment across the period of a week, correlating directly with the study by Ahn et al. into sentiment in the US, and indicating that the ANEW dataset can be used to derive trends from a large corpus of tweets. What is even more profound is that these trends have been shown to be statistically significant, not just random fluctuations in the data. By analysing sentiment surrounding key events in the UK and comparing it to the expected weekly trend, I have been able to show that these events cause sentiment to deviate from what is expected. These findings correlate with those of the study by Bollen et al., which clearly shows that national events can have an effect on expected sentiment levels.

It has also been shown that existing and free Web technologies are indeed able to provide insights into this kind of data. On a technical level, this study has uncovered and subsequently overcome a variety of hurdles. From gathering and storing the data, all the way to full-blown analysis of millions of tweets, improvements were made throughout the journey to increase performance and refine results. One key improvement was the move to multi-core processing for analysing tweets for sentiment; this alone led to improvements in performance that allowed results to be found in a fraction of the time. Together, it is these constant improvements in technology and process that allowed for the creation of a robust and sensitive system capable of supporting both large-scale and quite fine-grained data analysis.

So I achieved what I set out to discover with Twitter sentiment, particularly that there are significant trends in the data that can be used for further analysis. But how can this area of study be improved? I would first suggest that further thought be put into the types of tweets that are included. At the moment no tweets are culled from the dataset, meaning that re-tweets (duplicates) and other potentially problematic tweets remain. Secondly, I would like to perform the same analysis with a much larger dataset, over a much longer period of time.
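The multi-core improvement can be sketched with nothing more than Ruby's built-in Process.fork: split the tweets into chunks, score each chunk in a child process, and collect the results back over pipes. This is a minimal, POSIX-only sketch of the technique rather than the study's actual implementation, and the tiny valence table stands in for the full ANEW word list:

```ruby
# Minimal multi-core sketch: fork one worker per chunk of tweets and
# collect each chunk's scores back through a pipe. POSIX-only (fork).
VALENCE = { 'happy' => 8.21, 'sad' => 1.61, 'wedding' => 7.23 }.freeze

# Mean ANEW-style valence of the recognised words in a tweet (nil if none).
def score(text)
  scored = text.downcase.scan(/\w+/).map { |w| VALENCE[w] }.compact
  scored.empty? ? nil : scored.sum / scored.size.to_f
end

def parallel_score(tweets, workers)
  chunk_size = (tweets.size / workers.to_f).ceil
  pipes = tweets.each_slice(chunk_size).map do |chunk|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(Marshal.dump(chunk.map { |t| score(t) }))
      writer.close
      exit!(0)
    end
    writer.close
    [pid, reader]
  end
  # Read back in chunk order, so the output lines up with the input.
  pipes.flat_map do |pid, reader|
    scores = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    scores
  end
end

tweets = ['Such a happy wedding day!', 'feeling sad today', 'nothing recognised here']
scores = parallel_score(tweets, 2)
```

Because each chunk is scored independently, the work scales across however many cores the machine has, which is consistent with the order-of-magnitude speed-up reported above.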
I have shown that my system is sensitive enough to detect even small changes in sentiment over a relatively short period of time, but I would like to look at data across the period of an entire year, or even longer, to see whether any different fluctuations are found. I would also like to look into alternative methods of sentiment analysis to see how the results compare to those of ANEW. For example, would it be possible to infer the same level of sentiment from emoticons alone?
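As a hint of how that emoticon question might be approached, here is a toy classifier that ignores words entirely and simply counts emoticons; the emoticon lists are my own illustrative choice, not an established lexicon:

```ruby
# Toy emoticon-only sentiment: positive emoticons minus negative ones.
POSITIVE = [':)', ':-)', ':D', ':P'].freeze
NEGATIVE = [':(', ':-(', ":'("].freeze

def emoticon_sentiment(text)
  pos = POSITIVE.count { |e| text.include?(e) }
  neg = NEGATIVE.count { |e| text.include?(e) }
  pos - neg
end

emoticon_sentiment('What a lovely wedding :) :D')  # → 2
emoticon_sentiment('Back to work tomorrow :(')     # → -1
```

A real comparison would need a far richer emoticon set and a way to handle tweets containing no emoticons at all, which are likely to be the majority.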

What this study has shown is that, when combined, individual tweets can highlight huge trends in areas such as nationwide sentiment. It is these trends that prove the importance of publicly accessible services like Twitter, and I see this study helping others on their own journey with this kind of data analysis.


Ahn, Y.A., Lehmann, S.L., Mislove, A.M., Onnela, J.O. & Rosenquist, J.R., 2010. Pulse of the Nation: U.S. Mood Throughout the Day inferred from Twitter. Northeastern University. Available from: twittermood/ [Accessed May 1, 2011].

Balasubramanyan, R.B., OʼConnor, B.O., Routledge, B.R. & Smith, N.S., 2010. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Pittsburgh, PA: Carnegie Mellon University.

Bollen, J., Mao, H. & Zeng, X., 2011. Twitter mood predicts the stock market. Journal of Computational Science.

Bradley, M.M. & Lang, P.J., 2010. Affective norms for English words (ANEW): Instruction manual and affective ratings. Gainesville, FL: University of Florida.

Catt, D.C., 2010. A userʼs guide to websites, part 1: If it wasnʼt broken why fix it? Rev Dan Catt's Blog. Available from: [Accessed May 1, 2011].

Danforth, C.D. & Dodds, P.D., 2009. Measuring the Happiness of Large-Scale Written Expression: Songs, Blogs, and Presidents. Journal of Happiness Studies, 1-16.

Gaffney, D.G. & Gilbert, S.G., 2011. Regarding Twitter's API Rules and Data Export. 140kit. Available from: Regarding_API_Change.pdf [Accessed May 1, 2011].

Gilbert, S.G. & Kim, E.K., 2009. Detecting Sadness in 140 Characters: Sentiment Analysis and Mourning Michael Jackson on Twitter. Web Ecology Project (03).


Hawkes, R.A., 2009. My Involvement in Redwebʼs Spirit of Christmas 2009. Rawkes. Available from: [Accessed May 1, 2011].

Herring, S.H. & Honeycutt, C.H., 2009. Beyond Microblogging: Conversation and Collaboration via Twitter. Bloomington: Indiana University.

Horowitz, E.H., 2009. MapReduce. MongoDB. Available from: http:// [Accessed May 1, 2011].

Kirkpatrick, M.K., 2010. Twitter to Sell 50% of All Tweets for $360k/Year Through Gnip. Read Write Web. Available from: archives/twitter_to_sell_50_of_all_tweets_for_360kyear_thro.php [Accessed May 1, 2011].

Penner, C.P., 2011. #numbers. Twitter. Available from: 2011/03/numbers.html [Accessed May 1, 2011].

Twitter, [no date]. Rate Limiting. Twitter. Available from: pages/rate-limiting#search [Accessed May 1, 2011].



Appendix 1. T test finding significance between weekdays and weekends

Appendix 2. T test finding significance between Mondays and Tuesdays
