Why We Twitter: Understanding Microblogging Usage and Communities

Akshay Java
University of Maryland Baltimore County
1000 Hilltop Circle, Baltimore, MD 21250, USA
aks1@cs.umbc.edu

Xiaodan Song
NEC Laboratories America
10080 N. Wolfe Road, SW3-350, Cupertino, CA 95014, USA
xiaodan@sv.nec-labs.com

Tim Finin
University of Maryland Baltimore County
1000 Hilltop Circle, Baltimore, MD 21250, USA
finin@cs.umbc.edu

Belle Tseng
NEC Laboratories America
10080 N. Wolfe Road, SW3-350, Cupertino, CA 95014, USA
belle@sv.nec-labs.com

ABSTRACT
Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. Twitter, a popular microblogging tool, has seen a lot of growth since it launched in October, 2006. In this paper, we present our observations of the microblogging phenomena by studying the topological and geographical properties of Twitter’s social network. We find that people use microblogging to talk about their daily activities and to seek or share information. Finally, we analyze the user intentions associated at a community level and show how users with similar intentions connect with each other.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information Search and Retrieval - Information Filtering; J.4 [Computer Applications]: Social and Behavioral Sciences - Economics

General Terms
Social Network Analysis, User Intent, Microblogging, Social Media

1. INTRODUCTION
Microblogging is a relatively new phenomenon defined as “a form of blogging that lets you write brief text updates (usually less than 200 characters) about your life on the go and send them to friends and interested observers via text messaging, instant messaging (IM), email or the web.”¹ It is provided by several services including Twitter², Jaiku³ and, more recently, Pownce⁴. These tools provide a light-weight, easy form of communication that enables users to broadcast and share information about their activities, opinions and status. One of the popular microblogging platforms is Twitter [29]. According to ComScore, within eight months of its launch, Twitter had about 94,000 users as of April, 2007 [9]. Figure 1 shows a snapshot of the first author’s Twitter homepage. Updates or posts are made by succinctly describing one’s current status within a limit of 140 characters. Topics range from daily life to current events, news stories, and other interests. IM tools including Gtalk, Yahoo and MSN have features that allow users to share their current status with friends on their buddy lists. Microblogging tools facilitate easily sharing status messages either publicly or within a social network.

Figure 1: An example Twitter homepage with updates talking about daily experiences and personal interests.

¹ http://en.wikipedia.org/wiki/Micro-blogging
² http://www.twitter.com
³ http://www.jaiku.com
⁴ http://www.pownce.com

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Joint 9th WEBKDD and 1st SNA-KDD Workshop ’07, August 12, 2007, San Jose, California, USA. Copyright 2007 ACM 1-59593-444-8... $5.00.
 
Compared to regular blogging, microblogging fulfills a need for an even faster mode of communication. By encouraging shorter posts, it lowers users’ requirement of time and thought investment for content generation. This is also one of its main differentiating factors from blogging in general. The second important difference is the frequency of update. On average, a prolific blogger may update her blog once every few days; on the other hand, a microblogger may post several updates in a single day.

With the recent popularity of Twitter and similar microblogging systems, it is important to understand why and how people use these tools. Understanding this will help us evolve the microblogging idea and improve both microblogging client and infrastructure software. We tackle this problem by studying the microblogging phenomena and analyzing different types of user intentions in such systems.

Much of the research in user intention detection has focused on understanding the intent of search queries. According to Broder [5], the three main categories of search queries are navigational, informational and transactional. Understanding the intention for a search query is very different from understanding user intention for content creation. In a survey of bloggers, Nardi et al. [26] describe different motivations for “why we blog”. Their findings indicate that blogs are used as a tool to share daily experiences, opinions and commentary. Based on their interviews, they also describe how bloggers form communities online that may support different social groups in the real world. Lento et al. [21] examined the importance of social relationships in determining if users would remain active in a blogging tool called Wallop. A user’s retention and interest in blogging could be predicted by the comments received and continued relationships with other active members of the community. Users who are invited by people with whom they share pre-existing social relationships tend to stay longer and remain active in the network. Moreover, certain communities were found to have a greater retention rate due to the existence of such relationships. Mutual awareness in a social network has been found effective in discovering communities [23].

In computational linguistics, researchers have studied the problem of recognizing the communicative intentions that underlie utterances in dialog systems and spoken language interfaces. The foundations of this work go back to Austin [2], Strawson [32] and Grice [14]. Grosz [15] and Allen [1] carried out classic studies in analyzing the dialogues between people and between people and computers in cooperative task-oriented environments. More recently, Matsubara [24] has applied intention recognition to improve the performance of an automobile-based spoken dialog system. While their work focuses on the analysis of ongoing dialogs between two agents in a fairly well defined domain, studying user intention in Web-based systems requires looking at both the content and the link structure.

In this paper, we describe how users have adopted a specific microblogging platform, Twitter. Microblogging is relatively nascent, and to the best of our knowledge, no large-scale studies have been done on this form of communication and information sharing. We study the topological and geographical structure of Twitter’s social network and attempt to understand the user intentions and community structure in microblogging. From our analysis, we find that the main types of user intentions are: daily chatter, conversations, sharing information and reporting news. Furthermore, users play different roles of information source, friends or information seeker in different communities.

The paper is organized as follows: in Section 2, we describe the dataset and some of the properties of the underlying social network of Twitter users. Section 3 provides an analysis of Twitter’s social network and its spread across geographies. Next, in Section 4 we describe aggregate user behavior and community level user intentions. Section 5 provides a taxonomy of user intentions. Finally, we summarize our findings and conclude with Section 6.

2. DATASET DESCRIPTION
Twitter is currently one of the most popular microblogging platforms. Users interact with this system by either using a Web interface, IM agent or sending SMS updates. Members may choose to make their updates public or available only to friends. If a user’s profile is made public, her updates appear in a “public timeline” of recent updates. The dataset used in this study was created by monitoring this public timeline for a period of two months, from April 01, 2007 to May 30, 2007. A set of recent updates was fetched once every 30 seconds. There are a total of 1,348,543 posts from 76,177 distinct users in this collection.

Twitter allows a user, A, to “follow” updates from other members who are added as “friends”. An individual who is not a friend of user A but “follows” her updates is known as a “follower”. Thus friendships can either be reciprocated or one-way. By using the Twitter developer API⁵, we fetched the social network of all users. We construct a directed graph $G(V, E)$, where $V$ represents a set of users and $E$ represents the set of “friend” relations. A directed edge $e$ exists between two users $u$ and $v$ if user $u$ declares $v$ as a friend. There are a total of 87,897 distinct nodes with 829,053 friend relations between them. There are more nodes in this graph than users in the post collection because some users discovered through the link structure do not have any posts during the period in which the data was collected. For each user, we also obtained their profile information and mapped their location to a geographic coordinate, details of which are provided in the following section.

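The graph construction described above can be illustrated with a minimal Python sketch. It assumes the friend relations are already available as (u, v) pairs meaning "u declares v as a friend"; the function name `build_friend_graph` and the toy user names are hypothetical and are not part of the original study or the Twitter API.

```python
from collections import defaultdict


def build_friend_graph(friend_pairs):
    """Build a directed graph from (u, v) pairs, where u declares v as a friend.

    Returns the node set V and a mapping u -> set of v (the edge set E).
    Duplicate pairs collapse into a single directed edge.
    """
    successors = defaultdict(set)
    nodes = set()
    for u, v in friend_pairs:
        successors[u].add(v)  # directed edge u -> v
        nodes.update((u, v))
    return nodes, successors


# Hypothetical toy data, not taken from the dataset.
friend_pairs = [
    ("alice", "bob"),
    ("bob", "alice"),    # reciprocated friendship
    ("carol", "alice"),  # one-way: carol follows alice
    ("alice", "bob"),    # duplicate, ignored by the set
]

nodes, successors = build_friend_graph(friend_pairs)
edge_count = sum(len(targets) for targets in successors.values())

# Everyone who lists alice as a friend receives her updates.
followers_of_alice = {u for u, targets in successors.items() if "alice" in targets}
# "Followers" in the paper's sense: they follow alice, but alice has not added them.
one_way_followers = followers_of_alice - successors["alice"]

print(len(nodes), edge_count, sorted(one_way_followers))
```

Storing edges as sets keeps the node and edge counts distinct, which matches how the text reports "distinct nodes" and friend relations.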
3. MICROBLOGGING IN TWITTER
This section describes some of the characteristic properties of Twitter’s social network, including its network topology and geographical distribution.
3.1 Growth of Twitter
Since Twitter provides a sequential user and post identifier, we can estimate the growth rate of Twitter. Figure 2 shows the growth rate for users and Figure 3 shows the growth rate for posts in this collection. Since we do not have access to historical data, we can only observe its growth for a two-month time period. For each day we identify the maximum value for the user identifier and post identifier as provided by the Twitter API. By observing the change in these values, we can roughly estimate the growth of Twitter. It is interesting to note that even though Twitter launched in 2006, it really became popular soon after it won the South by SouthWest (SXSW) conference Web Awards⁶ in March, 2007. Figure 2 shows the initial growth in users as a result of the interest and publicity that Twitter generated at this conference. After this period, the rate at which new users are joining the network has slowed. Despite the slowdown, the number of new posts is constantly growing, approximately doubling every month, indicating a steady base of users generating content.

⁵ http://twitter.com/help/api

[Figure 2 plot: "Twitter Growth Rate (Users)"; x-axis: April - May 2007; y-axis: Max User ID; series: Growth of Users]
Figure 2: Twitter User Growth Rate. Figure shows the maximum user ID observed for each day in the dataset. After an initial period of interest around March 2007, the rate at which new users are joining Twitter has slowed.

Following Kolari et al. [18], we use the following definitions of user activity and retention:
Definition: A user is considered active during a week if he or she has posted at least one post during that week.
Definition: An active user is considered retained for the given week if he or she reposts at least once in the following X weeks.
Due to the short time period for which the data is available and the nature of microblogging, we decided to use X as a period of one week. Figure 4 shows the user activity and retention for the duration of the data. About half of the users are active, and of these, half repost in the following week. Lower activity is recorded during the last week of the data because updates from the public timeline are not available for two days during this period.
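The two definitions above translate directly into set operations. The following Python sketch is an illustration only: it assumes posts are available as (user, date) pairs and that weeks are counted in seven-day blocks from a chosen start date. The function names, the toy posts, and the week-bucketing rule are assumptions, not details taken from the original analysis.

```python
from collections import defaultdict
from datetime import date


def weekly_activity(posts, start):
    """Map week index -> set of users with at least one post in that week."""
    active = defaultdict(set)
    for user, day in posts:
        week = (day - start).days // 7  # assumed bucketing: 7-day blocks from start
        active[week].add(user)
    return active


def retained_users(active, week, x=1):
    """Users active in `week` who post again in any of the following x weeks."""
    later = set()
    for k in range(week + 1, week + x + 1):
        later |= active.get(k, set())
    return active.get(week, set()) & later


# Hypothetical toy posts, not taken from the dataset.
start = date(2007, 4, 1)
posts = [
    ("u1", date(2007, 4, 2)),   # week 0
    ("u2", date(2007, 4, 3)),   # week 0
    ("u1", date(2007, 4, 9)),   # week 1
    ("u3", date(2007, 4, 10)),  # week 1
    ("u2", date(2007, 4, 20)),  # week 2
]

active = weekly_activity(posts, start)
print(sorted(retained_users(active, 0)))       # X = 1, as used in the paper
print(sorted(retained_users(active, 0, x=2)))  # a longer window, for comparison
```

With X set to one week, only users who post again in the immediately following week count as retained; widening X admits users who return later.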
3.2 Network Properties
The Web, blogosphere, online social networks and human contact networks all belong to a class of “scale-free networks” [3] and exhibit a “small world phenomenon” [33]. It has been shown that many properties, including the degree distributions on the Web, follow a power law distribution [19, 6]. Recent studies have confirmed that some of these properties also hold true for the blogosphere [31].

⁶ http://2007.sxsw.com/

[Figure 3 plot: "Twitter Growth Rate (Posts)"; x-axis: April - May 2007; y-axis: Max Post ID; series: Growth of Posts]
Figure 3: Twitter Posts Growth Rate. Figure shows the maximum post ID observed for each day in the dataset. Although the rate at which new users are joining the network has slowed, the number of posts is increasing at a steady rate.

Property                 Twitter    WWE
Total Nodes              87,897     143,736
Total Links              829,247    707,761
Average Degree           18.86      4.924
Indegree Slope           -2.4       -2.38
Outdegree Slope          -2.4       NA
Degree correlation       0.59       NA
Diameter                 6          12
Largest WCC size         81,769     107,916
Largest SCC size         42,900     13,393
Clustering Coefficient   0.106      0.0632
Reciprocity              0.58       0.0329

Table 1: Twitter Social Network Statistics
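
As an illustration of how statistics such as the reciprocity and degree entries in Table 1 might be computed, the Python sketch below works on a toy edge list. The paper does not spell out its formula for reciprocity; the sketch assumes one common definition, the fraction of directed edges whose reverse edge is also present. The function name and the toy edges are hypothetical.

```python
from collections import Counter


def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) also exists.

    This is one common definition, assumed here for illustration.
    """
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


# Hypothetical toy graph: a<->b and a<->d are mutual, c->a is one-way.
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("a", "d"), ("d", "a")]

r = reciprocity(edges)
in_degree = Counter(v for _, v in edges)   # how many users list v as a friend
out_degree = Counter(u for u, _ in edges)  # how many friends u has declared

print(r, in_degree["a"], out_degree["a"])
```

The in-degree and out-degree counters are the raw quantities behind the degree distributions discussed in this section.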
Table 1 describes some of the properties of Twitter’s social network. We also compare these properties with the corresponding values for the Weblogging Ecosystems Workshop (WWE) collection [4] as reported by Shi et al. [31]. The Twitter network shows a high degree correlation (also shown in Figure 6) and high reciprocity. This implies that there are a large number of mutual acquaintances in the graph. New Twitter users often initially join the network on invitation from friends. Further, new friends are added to the network by browsing through user profiles and adding other known acquaintances. High reciprocity has also been observed in other online social networks like Livejournal [22]. Personal communication and contact networks such as cell phone call graphs [25] also have high degree correlation. Figure 5 shows the cumulative degree distributions [27, 8] of Twitter’s network. It is interesting to note that the slopes $\gamma_{in}$ and $\gamma_{out}$ are both approximately -2.4. This value for the power law exponent is similar to that found for
