
Draft version, originally published in: Patrick Thonhauser, Selver Softic, and Martin Ebner. 2012. Thought Bubbles: a conceptual prototype for a Twitter based recommender system for research 2.0. In Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies (i-KNOW '12). ACM, New York, NY, USA, Article 32, 4 pages. DOI=10.1145/2362456.2362496

THOUGHT BUBBLES
A Conceptual Prototype for a Twitter based Recommender System for Research 2.0

Patrick Thonhauser1, Selver Softic1, Martin Ebner1
1 Department for Social Learning, Institute for Information Systems and Computer Media, Graz University of Technology, Austria
patrick.thonhauser@gmail.com, softic.s@gmail.com, martin.ebner@tugraz.at

Keywords: Recommender System, Twitter, Thought Bubble, Classification, Social, Data Mining

Abstract: The concept of so-called Thought Bubbles deals with the problem of finding appropriate new connections within social networks, especially Twitter. As a side effect of exploring new users, Tweets are classified and rated and are used for generating a kind of news feed, which extends the personal Twitter feed. Each user has several interests that can be classified by evaluating, in the first place, his or her Tweets and, secondly, the user's already existing contacts. By categorizing a user and the related connections, the user can be placed in imaginary category specific subsets of users, called Thought Bubbles. Following the trace of people who are also active within the same specific Thought Bubble should reveal interesting and helpful connections between like-minded users.

INTRODUCTION

Twitter has grown tremendously in the last few years and is generating 200 million Tweets and 1.6 million search queries each day. As of now (2012), Twitter has over 250 million users1. These are impressive numbers for a microblogging and social network platform, and Twitter has already become a cultural phenomenon. Every day people all over the world communicate via Twitter, exchanging the latest news and discussing millions of diverse topics. The list of tweetable actions is almost infinite, and everybody who is interested in a specific person or topic can consume that knowledge by reading certain Tweets or exploring the tweeted resources. However, the interesting questions for researchers are how to make use of the information contained within millions of Tweets and what to extract from those 140-character microblog posts. How much useful information is in a Tweet, and how can we separate useful information from noise? This paper presents a novel concept for finding new interesting users and information for a specific Twitter account. Many researchers have already solved parts of this puzzle, and several parts of this concept are based on the findings of (Softic et al., 2010), (Mika and Laniado, 2010) and (Choudhury and Breslin, 2010). To Semantic Web researchers, Twitter has become one of the most popular applications for the dissemination of information (Kraker et al., 2010), and it is therefore a legitimate candidate to serve as the main source for mining data about users and information of scientific interest. This paper does not serve as a detailed description of a forthcoming semantic recommender system for research 2.0, but rather as a brief overview of a proof of concept application whose main task is the classification and recommendation of Twitter users. Preliminary results of this extensive categorization task are also presented.

1 http://thesocialskinny.com/100-social-media-statistics-for-2012/ (April 2012)

CONCEPT

Twitter users follow other users for specific reasons. In the majority of cases these reasons are related to similar fields of interest. Nonetheless, this doesn't mean that the connection between Twitter users with similar interests is bidirectional. When social network connections aren't bidirectional, an individual user doesn't necessarily know his or her followers. Obviously, the follower is interested in and involved with similar topics as the person he or she follows. Therefore, there is a high probability that friends and colleagues of the followed user have similar connections, which can be of interest for a specific user. A user is active in several kinds of topic based bubbles, where the participating users do not necessarily know all participants of such a bubble. In most cases, one doesn't have just one special kind of interest, so he or she is part of several topic based subsets of users. Hence, users within one user's specific bubble might be of interest for each other. Figure 1 shows an example of a so-called network graph, which reveals the sphere of activity within diverse Thought Bubbles. Users marked with a star (*) are potentially of great interest for this account (highlighted in blue in figure 1). These users belong to the same topic specific bubble, here the Science Bubble. However, the connection between the yellow marked account and the accounts marked with a star isn't bidirectional.

[Figure 1 labels: Developer Bubble, Science Bubble, Music Bubble; accounts of potential interest are marked with *]

3.1 Finding potentially interesting users

The first sub module deals with the problem of separating users that merely produce noise or spam from those that spread news, personal thoughts and facts. To simplify this process we have to define the pool of people who are connected to one's Twitter account. This connection exists because one is following other users or because other users are following oneself. We call this pool of people the inner circle. Filtering the inner circle by the useful information its members provide helps to reveal further accounts of potential interest, which are hidden in the so-called outer circle. The outer circle comprises every person connected to someone acting within one's inner circle. Subsequently, a second cycle of filtering is performed to efficiently narrow down and identify the people of potential interest. (Horn, 2010) uses Support Vector Machines (SVMs) for this quite rough classification task. SVMs are a commonly used technique for text classification and are recommended by many researchers like (Rios and Zha, 2004), (Hsu et al., 2010) or (Nakagawa et al., 2001). By applying this method, a potentially interesting set of users would remain for further consideration. In addition, running a POS-tagger and a chunker in advance could help to ascertain whether a Twitter account belongs to a person. Eliminating duplicates within this set, as well as the accounts that one already follows, usually leads to a clearly arranged set of Twitter accounts that is worth exploring in depth.
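The inner circle and outer circle selection described above can be sketched with plain set operations. This is only a minimal illustration under stated assumptions: the follow graph is a hypothetical in-memory dictionary rather than data fetched from Twitter, the names `build_followers`, `connections` and `candidate_accounts` are invented for this example, and the `is_noise` predicate merely stands in for the SVM based noise and spam classification, which is not implemented here. The sketch also assumes that the whole inner circle is excluded from the result, which covers the accounts one already follows.

```python
from collections import defaultdict


def build_followers(following):
    """Invert a {user: accounts the user follows} mapping."""
    followers = defaultdict(set)
    for user, followed in following.items():
        for target in followed:
            followers[target].add(user)
    return followers


def connections(account, following, followers):
    """Everyone the account follows plus everyone following the account."""
    return set(following.get(account, ())) | set(followers.get(account, ()))


def candidate_accounts(account, following, is_noise=lambda name: False):
    """Return outer-circle accounts that survive both filtering cycles."""
    followers = build_followers(following)
    inner_all = connections(account, following, followers)
    # First filtering cycle: drop inner-circle accounts flagged as noise or spam.
    inner = {name for name in inner_all if not is_noise(name)}
    # Outer circle: connections of the remaining inner-circle members.
    outer = set()
    for member in inner:
        outer |= connections(member, following, followers)
    # The set removes duplicates; also drop the user and the whole inner
    # circle, which contains every account the user already follows.
    outer -= inner_all
    outer.discard(account)
    # Second filtering cycle on the outer circle (placeholder predicate).
    return {name for name in outer if not is_noise(name)}


# Hypothetical follow graph: user -> accounts that user follows.
follow_graph = {
    "me": {"alice"},
    "alice": {"bob", "carol"},
    "dave": {"me", "erin"},
    "spambot": {"me", "victim"},
}
candidates = candidate_accounts(
    "me", follow_graph, is_noise=lambda name: name.startswith("spam")
)
print(sorted(candidates))  # ['bob', 'carol', 'erin']
```

Because "spambot" is removed in the first cycle, its connection "victim" never enters the outer circle, which mirrors the idea of narrowing the candidate set in two passes.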

Figure 1: This is an example of how a user can be placed in a Twitter network graph.

This implies that following a specific user of a certain field offers a high probability of finding further relevant users who are also active in a field of specific relevance. The missing bidirectionality of certain user connections hints at interest-only relationships. Being conscious of this led to the concept of Thought Bubbles. It offers the possibility to recommend people and information that are contained within a bubble and haven't been explored by a specific Twitter user so far.

SYSTEM MODULES

The conceptual realization of Thought Bubbles can be split into several sub modules.

3.2 Categorization of users

Granular categorization of users is the most complex task within this system. In the first place, it's necessary to categorize the active user who uses the Thought Bubble service. At the very beginning, a set of appropriate categories that covers all possible interests a user could have has to be defined. Examples of such categories are developing, science and teaching. To be able to classify a user, it's necessary to process the user's Tweet history. The first step is to annotate the words in a user's Tweets, which can be done by applying Natural Language Processing (NLP) techniques (Ritter et al., 2011). Classifying Tweets is a special task compared to the usual classification of text artefacts. The reasons are: (a) the shortness of Tweets (140-character strings), (b) the frequently changing context in which a word is used and (c) the above-average occurrence of out-of-vocabulary words. By tagging all words in Tweets (Part-Of-Speech tagging), unimportant words like copulas or prepositions can be eliminated. (Gimple et al., 2011), for example, already developed a POS-tagger especially for the needs of Twitter. Summarizing the results of all categorized Tweets of a user leads to a percentage based classification of that user. There are several established techniques for realizing this classification task. As mentioned in section 3.1, SVMs can be used for such a task, as applied by (Nakagawa et al., 2001). But there are several other ways of accomplishing this classification, like Bayesian approaches (Goldwater and Griffiths, 2005). Future testing and evaluation will clarify the best way of realizing the categorization of Twitter users. One possibility for evaluation is presented by (Chen et al., 2009).
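The step from categorized Tweets to a percentage based classification of a user can be illustrated as follows. The sketch assumes that each Tweet has already been assigned exactly one category by some classifier (SVM, Bayesian or other), which is not shown; the function name `category_profile` and the sample labels are hypothetical.

```python
from collections import Counter


def category_profile(tweet_categories):
    """Turn per-Tweet category labels into percentages per category."""
    counts = Counter(tweet_categories)
    total = sum(counts.values())
    if total == 0:
        return {}  # no classified Tweets, so no profile
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical labels for four Tweets of one user.
labels = ["science", "science", "developing", "teaching"]
profile = category_profile(labels)
print(profile)  # {'science': 50.0, 'developing': 25.0, 'teaching': 25.0}
```

A profile of this kind gives every user a comparable vector of category shares, which is what the later comparison between the active user and candidate users needs.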

3.4 Recommendation

Recommendation decisions are made by calculating a rating for each potentially interesting user, based on the user's category classification and the additional ratios mentioned in section 3.2 and section 3.3. The category classification of the active service user is compared to the classified categories of the potentially interesting other users. In addition, all additional ratios have different weights, which finally influence the position of a user in the final recommendation list. Definite values for these weights have to be found during development and test runs of the system and therefore can't be predicted in advance.
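One conceivable way to combine the category comparison with weighted ratios is a simple weighted sum, sketched below. This is an assumption made for illustration, not the scheme used by the system: the text leaves the weighting open, so the weight values, the ratio names and the normalization of every ratio to the range 0 to 1 are all hypothetical.

```python
def rating(category_similarity, ratios, weights):
    """Category similarity plus a weighted sum of additional ratios."""
    score = category_similarity
    for name, value in ratios.items():
        # Ratios without a configured weight do not influence the score.
        score += weights.get(name, 0.0) * value
    return score


def recommendation_list(candidates, weights):
    """Order candidate names by descending rating."""
    scored = [
        (rating(c["category_similarity"], c["ratios"], weights), c["name"])
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical weights and candidates; all ratio values assumed in [0, 1].
weights = {"tweet_frequency": 0.5, "follower_ratio": 0.25}
candidates = [
    {"name": "a", "category_similarity": 0.5,
     "ratios": {"tweet_frequency": 0.5, "follower_ratio": 0.0}},
    {"name": "b", "category_similarity": 0.25,
     "ratios": {"tweet_frequency": 1.0, "follower_ratio": 1.0}},
    {"name": "c", "category_similarity": 0.5, "ratios": {}},
]
print(recommendation_list(candidates, weights))  # ['b', 'a', 'c']
```

In this example candidate "b" overtakes "a" despite a lower category similarity, which shows how strongly the choice of weights can shift the final list and why they have to be tuned in test runs.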

3.3 Additional Ratios

In addition to measuring the similarity of Thought Bubble attributes, i.e. the affiliation of a user with a category, several other ratios for determining the importance of a user's recommendation are used to sharpen the prediction accuracy. The following ratios are legitimate candidates for additionally influencing whether a user within a topic related bubble will be recommended or not.

Tweet frequency is the number of Tweets a Twitter user sends within a defined period of time.

The follower ratio. The more followers a user has, the more influence or credibility he or she might possess. On the other hand, a user who has very few followers but is following a huge number of other users might be a Blast Follower2.

The number of retweets a user's Tweets receive indicates the reach of that user's reputation.

If an observed user isn't connected with the inner circle bidirectionally, this denotes not a friendship but a purely interest related relationship.

Clients will have the possibility to rate recommended users or Tweets as interesting or not interesting for a specified category. The similarity between users rated as interesting and potential recommendations for a Thought Bubble can also influence a user's overall rating score within a bubble.

These ratios could help to sharpen the selection of recommended Tweets and Twitter users. However, the main task in applying these ratios is to find an appropriate weighting scheme for every ratio.

2 http://www.makeuseof.com/dir/blastfollow-mass-follow-twitter-users/ (April 2012)

DEMO APPLICATION

The Thought Bubble Server will be implemented in Python and runs on an Apache 2 web server. Figure 2 visualizes the potential infrastructure of this system.

[Figure 2 components: External Clients (iOS, Web, etc.), Twitter API; internal: REST API, Tweet Collector, Classification Worker Threads, Rater, Database Operations Thread, Database Wrapper, SQLite Database]

Figure 2: Thought Bubble infrastructure.

Twitter related API calls that affect or are significant for the classification and recommendation task are processed and cached by the Thought Bubble server. The REST API acts as a junction between the Twitter REST API and the client. All requests that don't affect the functionality of the Thought Bubble system are processed directly by the Twitter REST API. When a user starts to use this service for the first time, the system categorizes and rates potential recommendations; once this is completed, it starts to enrich the user's Twitter stream with Tweets from recommended persons. The recommendation of single Tweets is based on the influence a Tweet has had during the classification of a certain user. Thought Bubble clients can be used just like usual Twitter clients for reading one's personal Twitter stream, tweeting or direct messaging. However, the big difference is that the user gets recommendations in the form of other Twitter accounts that most likely fit into his or her specific Thought Bubble. Twitter also features a system for recommending Twitter users for specific categories, but these recommendations aren't user specific at all. The server is just one part of the whole proof of concept application. The second part is an iOS app for iPhone users, which uses the Thought Bubble service. This app can be used as a common iOS Twitter client, with the additional feature of being able to explore newly recommended, topic specific information and rated users. Another interesting feature, which would provide great potential for future applications, is the visualization of Thought Bubbles, which would let users actively explore their own topic related bubbles. Our group is currently developing an advanced prototype of such an app. The server application's implementation is also in progress, and preliminary results are discussed in the next section.

After applying POS tagging, sentences are brought into the form of so-called chunk trees (Abney, 1994). By iterating through the trees and searching for detected phrases, names and nouns, feature vectors are compiled. To strengthen the influence of hashtags, they are counted twice within a vector. Additionally, words that occur very often in the English language (the 200 most used English words) and aren't useful for proper categorization are removed from the vectors. This task is performed for all Twitter users within a potential Thought Bubble, and the resulting vectors are afterwards compared by applying cosine similarity to rate the similarity of the tweeted content. This similarity is measured by comparing the frequency counts of words and phrases that were classified as relevant by the preceding operations (POS tagging, chunking, and phrase, noun and name filtering).
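The vector construction and comparison described above can be approximated in a few lines. The following is a simplified sketch and not the implemented pipeline: a regular expression tokenizer stands in for POS tagging and chunking, a tiny illustrative stop list replaces the 200 most used English words, and a hashtag is assumed to share a vector dimension with the same word written without "#". The names `feature_vector`, `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list; the text removes the 200 most used English words.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "at"}

# Matches plain words and hashtags (simplification of tagging and chunking).
TOKEN = re.compile(r"#?\w+")


def feature_vector(tweets):
    """Count relevant terms; hashtags are counted twice."""
    vector = Counter()
    for tweet in tweets:
        for token in TOKEN.findall(tweet.lower()):
            if token.startswith("#"):
                vector[token[1:]] += 2
            elif token not in STOP_WORDS:
                vector[token] += 1
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(reference_tweets, accounts):
    """Sort accounts by similarity of their Tweets to the reference Tweets."""
    reference = feature_vector(reference_tweets)
    scores = {
        name: cosine_similarity(reference, feature_vector(tweets))
        for name, tweets in accounts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical Tweets, for illustration only.
ranking = rank_by_similarity(
    ["Research on #elearning and mobile learning"],
    {
        "similar": ["New #elearning research results"],
        "other": ["Great concert in Graz tonight"],
    },
)
print([name for name, _ in ranking])  # ['similar', 'other']
```

Counting a hashtag twice raises its share of both the dot product and the vector norm, so accounts that share hashtags move closer together than accounts that merely share ordinary words.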

PRELIMINARY RESULTS AND APPLICATION SETUP

A first proof of concept application according to the idea of Thought Bubbles has already been implemented. Figure 2 visualizes the architecture of the current Thought Bubble Server implementation. Django3 is used as the Web framework and the Natural Language Toolkit (NLTK)4 is used for classification and word processing tasks. Data storage is handled by SQLite35, and Twitter related requests are generated by the Python Twitter framework6. The current stage of development contains the following implementations.

5.1 Proof of Concept Setup

Currently, the classification is done by filtering hashtags within the Tweets of users and by POS tagging and chunking the last 200 Tweets in a user's Twitter timeline. POS tagging is done by using NLTK's Trigram Tagger7, trained with CoNLL 2000 training data (Tjang and Buchholz, 2000), similar to (Ritter et al., 2011).

3 https://www.djangoproject.com/ (April 2012)
4 http://www.nltk.org/ (April 2012)
5 http://www.sqlite.org/ (April 2012)
6 http://code.google.com/p/python-twitter/ (April 2012)
7 http://nltk.googlecode.com/svn/trunk/doc/howto/tag.html (April 2012)

5.2 First Test Results

This first test run included all operations mentioned in section 5.1. Test data was fetched via the Twitter REST API and cached for further processing of the Twitter accounts to be observed. Caching ensured that all observed accounts were in exactly the same state during testing, because users keep tweeting, which would otherwise affect the test results. 49 Twitter accounts were compared to @mebner. Within the test set, 21 Twitter accounts of people who work in similar or the same fields as @mebner, or who are actually students of his, were added to measure the reliability of the system. The rest of the Twitter accounts for this test run were chosen randomly. Figure 3 visualizes the results of a first test run, based on Martin Ebner's (@mebner) Twitter account. Nearly all of the best scoring users were either students of TUGraz or researchers whose profession is very similar to @mebner's. @gargamit100, for example, scored a similarity of 0.28 and was therefore the best match in this test set. This person is an e-learning specialist from India and is already followed by @mebner. No random pick reached 0.1; the best of them scored slightly above 0.09. A tech blogger's Twitter account scored best in the non-researcher dataset, which could indeed also be of potential interest to a professor at a university of technology. Five of the 21 manually added researchers and students scored lower than expected. By applying more ratios as discussed in section 3.3, we expect to reduce the error rate to a satisfying level. Nonetheless, the 0.1 mark seems to be a good threshold for deciding whether a Twitter account should still be considered for further analysis, at least in the case of @mebner.

Figure 3: Test run with 50 Twitter users.

In addition to this first test, similar test runs were done for every member of the manually picked users. Figure 4 visualizes all optimal thresholds found, which would enable the categorization of an account to reach an accuracy similar to @mebner's test run.

Figure 4: Thresholds of the 22 hand picked users.

By observing each result set of the tested users, thresholds were defined. These thresholds were set to meet a minimum limit of 75%, where at least three-fourths of the hand picked users were categorized as potentially interesting8. By summing all specific thresholds and dividing the sum by the number of tested users, we got an average threshold of 0.098. Although the average threshold of 0.098 is very close to the predicted 0.1 of @mebner's case, the specific thresholds deviate by up to 50% and more from the average threshold. So a threshold may not be the best choice for pre-elimination, because the number of accounts for further processing may vary too much. Applying a simple k-nearest-neighbor approach would be more appropriate to limit the number of potential recommendations in advance. A limit for selecting the top n neighbors will be defined during the forthcoming tests.

5.3 Bubble Selection and Recommendation

All top n similar users within a test set are now part of the Thought Bubbles of a user of the service. Within this set of potentially interesting users, category specific bubbles can be extracted and then recommended as topic based subsets of users. Unfortunately, this feature is currently in a very early stage of development and therefore not part of the first proof of concept application and test runs. In addition, the Thought Bubbles of a user of this service will be available via the REST API as visualized in figure 2. The bubbles will be delivered as JSON9 objects and presented in the user's client application, according to the client platform, as category specific lists, where users will be able to explore the newly recommended Twitter profiles on their own and decide whether a recommendation is usable and interesting or not. The ability to rate the recommendations will again sharpen the classification task, as mentioned in section 3.3.

DISCUSSION

Categorization within the additional 21 test runs of course delivered different results. That's to be expected, simply because different people use different words and phrases and have different interests in addition to their professions. Hence students were often identified as potentially interested in musicians or sportsmen. Moreover, the presence of Retweets in the set of Tweets that were POS tagged and chunked lowered the scores significantly within the set of accounts that should at least score close to a specific threshold. As a result, future test runs will exclude Retweets from the classification task. Instead, the number of Retweets a user's Tweets receive will be considered as an additional ratio, as mentioned in section 3.3. Another problem occurred during the test runs: the limit of 350 requests per hour of Twitter's REST API was reached very quickly. This problem could be solved in future test runs either by scheduling the worker threads according to this limit or by using an alternative service like Grabeeter10 for grabbing people's Tweets. The disadvantage of the second alternative is that the majority of users that are observed and categorized would have to be users of Grabeeter (Muehlberger et al., 2010). Scheduling threads in a way that they don't exceed the Twitter REST API's limit would, on the other hand, be very time consuming. Maybe a combination of those alternatives could solve this problem with an acceptable loss of time.

A big advantage of this system compared to similar approaches like (DeVoch et al., 2011) is that the concept of Thought Bubbles isn't limited to a specific community like Research 2.0, but can be used in any kind of topic related community. The fact that people are classified based on the content of their Tweets, and not only on hashtags, mentions or already existing connections, leads to new and so far undiscovered personalized recommendations of like-minded people. At the same time, all recommendations are always based on the context of the latest n Tweets of a user. Therefore, recommendations change automatically when a user changes his or her interests or the projects he or she is currently working on, assuming that the user tweets about his or her current activities.

8 The 75% rate of correct classification is motivated by the results of @mebner's Twitter account.
9 http://www.json.org/ (April 2012)
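The first option discussed in this section, scheduling the worker threads according to the request limit, could be sketched as a shared scheduler that spaces requests evenly. This is a hedged illustration only: the class `RequestScheduler` and its parameters are hypothetical, even spacing is just one of several possible scheduling strategies, and the 350 requests per hour figure is the limit reported in the text for 2012.

```python
import threading
import time

REQUESTS_PER_HOUR = 350  # limit reported in the text
MIN_INTERVAL = 3600.0 / REQUESTS_PER_HOUR  # roughly 10.3 seconds per request


class RequestScheduler:
    """Spaces API requests so that the hourly limit is never exceeded."""

    def __init__(self, min_interval=MIN_INTERVAL,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None
        self._lock = threading.Lock()  # shared by all worker threads

    def wait_for_slot(self):
        """Block until the next request may be sent; return seconds waited."""
        with self._lock:
            now = self._clock()
            waited = 0.0
            if self._next_allowed is not None and now < self._next_allowed:
                waited = self._next_allowed - now
                self._sleep(waited)
            # Reserve the following slot relative to the later of the two times.
            start = max(self._clock(), self._next_allowed or 0.0)
            self._next_allowed = start + self.min_interval
            return waited
```

Each worker thread would call `wait_for_slot()` before issuing a request. With about 10.3 seconds between requests, fetching data for many accounts takes a long time, which matches the concern raised above that this option is very time consuming.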
Although we haven't made use of any classic semantic technologies like FOAF11 or SIOC12 so far, we consider using them before finishing this proof of concept application. This would enable the system to link people beyond the borders of Twitter. (DeVoch et al., 2011), for example, already conceived and partially validated a system using these classic semantic approaches to mine specific science related events and their participants. Nonetheless, one of this project's main purposes is to answer the question to what extent Twitter is beneficial for discovering useful and interesting information for a community like Research 2.0. Considering the fact that recommendations depend on the quality of the Tweets of a user, we aim to find metrics and techniques that enable us to filter out as much noise as possible and to detect relevant facts within a dynamically changing context.

10 http://grabeeter.tugraz.at/ (April 2012)

CONCLUSION AND FUTURE WORK

Classification of user profiles in social networks isn't just a Twitter related topic but can be applied to similar networks as well. It can help to establish connections between people with similar interests, especially regarding scientific interests or expertise (Stankovic et al., 2010). New connections to new users often lead to novel and useful information. Moreover, this kind of categorization of virtual individuals isn't only useful for user recommendations, but also for focusing resources on the needs and interests of a specific user, which will probably be the next step in this project. A personalized stream of information, similar to a personalized search engine, is a powerful tool for personalization in any kind of business field that deals with supplying information. This scientific field is still in its early stages. In the near future we plan to finish a first complete proof of concept application, which will enable us to evaluate our chosen methods for classifying and recommending users. This will certainly help us to further assess the full potential of such applications. Future users of the Thought Bubble service will have the opportunity to access other people's knowledge just by doing, and tweeting about, what they do. This isn't just a fast and convenient way of finding new interesting people, but rather a way to create one's personal subset of people who might be able to answer one's questions or influence one's work. In other words, this is one step towards a personalized and focused stream of information for everyone. Based on our future findings during development, we hope to be able to answer whether Twitter is, in general, a useful source for mining the information one needs, especially for researchers and science related content, or whether the huge amount of noise can't be eliminated at a satisfying level within acceptable computation time.

11 http://www.foaf-project.org/ (April 2012)
12 http://sioc-project.org/ (April 2012)

REFERENCES
Abney, S. P. (1994). Parsing by chunks.
Chen, J., Geyer, W., Dugan, C., Muller, M., and Guy, I. (2009). Make new friends, but keep the old: recommending people on social networking sites.
Choudhury, S. and Breslin, J. G. (2010). Extracting semantic entities and events from sports tweets.
DeVoch, L., Softic, S., and Ebner, M. (2011). Semantically driven social data aggregation interfaces for research 2.0.
Gimple, K., Schneider, N., Brendan, O., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogamata, D., Flanigan, J., and Smith, N. A. (2011). Part-of-speech tagging for twitter: Annotation, features, and experiments.
Goldwater, S. and Griffiths, T. L. (2005). A fully bayesian approach to unsupervised part-of-speech tagging.
Horn, C. (2010). Analysis and classification of twitter messages.
Hsu, C.-W., Chih-Chung, C., and Lin, C.-J. (2010). A practical guide to support vector classification.
Kraker, P., Wagner, C., Jeanquartier, F., and Lindstaed, S. (2010). On the way to a science intelligence: Visualizing tel tweets for trend detection.
Mika, P. and Laniado, D. (2010). Making sense of twitter.
Muehlberger, H., Ebner, M., and Taraghi, B. (2010). @twitter try out #grabeeter to export, archive and search your tweets.
Nakagawa, T., Kudoh, T., and Matsumoto, Y. (2001). Unknown word guessing and part-of-speech tagging using support vector machines.
Rios, G. and Zha, H. (2004). Exploring support vector machines and random forests for spam detection.
Ritter, A., Mausam, C. S., and Etzioni (2011). Named entity recognition in tweets: An experimental study.
Softic, S., Ebner, M., Muehlburger, H., Altmann, T., and Taraghi, B. (2010). @twitter mining microblogs using semantic technologies.
Stankovic, M., Wagner, C., Jovanovic, J., and Laubert, P. (2010). Looking for experts? what can linked data do for you?
Tjang, K. S. and Buchholz, S. (2000). Introduction to the conll-2000 shared task: Chunking.