
TEDx Mumbai

Making sense of the voices on twitter

By Elroy Serrao

Photograph: Day one hundred and twelve by Dustin Diaz (http://www.flickr.com/photos/polvero/3466964233/)


CONTENTS
Introduction
Methodology
The Official Twitter Account
The Tweets
  A quick overview
  Who Tweeted? - The people behind the tweets
  When did they Tweet? - The Tweet Timeline
  What did they Tweet about? – The Content
  Where did they Tweet From? – The Source
Conclusion

INTRODUCTION
TEDx Mumbai was an independently organized TED event held at BlueFrog, Mumbai, on 3rd April
2010. Unfortunately for me, the event fell on the day before Easter, so even applying to attend
was out. Around the same time I was dabbling with a Ruby script to automatically collate tweets
matching a few keywords and tags. So I thought, let’s collate all the tweets from the event and
then see the event through the eyes of Twitter.

Some frantic coding later, my script was live on my server and I was set. Or so I thought.

The very next day, halfway to work, I realized that I had forgotten to set up a script to
monitor the hashtag for the event. Out came my new Android phone to the rescue! I managed to
access my server through Net2FTP, copied the existing script, modified it using a web-based
editor and set it up using my control panel, all from the phone. Whew!

The net result was that I was set up, but about an hour late, so I missed a few tweets. I let my
scripts run for the entire day as well as the next. What follows is an attempt to make some sense
of the many voices out there. I do hope this bit of random craziness gives you something
interesting to read.

METHODOLOGY
To collate tweets, I set up two scripts. One script searched the Twitter timeline for the words
“Tedx Mumbai”, while the other searched the timeline for the hashtag #tedxmumbai. The first
script logged the first 25 search results, while the second logged the first 30. Both scripts
ran every 5 minutes, logging the search results to a text file. A new text file was created for
each day, with the day determined by the system clock on the server.
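
As an illustration of the one-file-per-day logging described above, here is a minimal Ruby sketch. It is not the original script: the helper names, the file-name pattern and the tab-separated layout are all assumptions, and the search call and the 5-minute scheduling are left out.

```ruby
# Hypothetical helper: one log file per calendar day, named from the
# server's local clock. The file-name pattern is an assumption.
def log_path_for(prefix, now = Time.now)
  "#{prefix}-#{now.strftime('%Y-%m-%d')}.txt"
end

# Append one search result to the day's file as a tab-separated line.
# The tab-separated layout is also an assumption.
def append_result(prefix, fields, now = Time.now)
  File.open(log_path_for(prefix, now), "a") do |f|
    f.puts(fields.join("\t"))
  end
end
```

Because the file name is derived from the clock on every write, a run that starts after midnight simply begins a new file without any extra bookkeeping.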

Each script logged the date, day, time, tweet contents, user, tags, the application from which
the tweet originated, the type of the tweet and whether it was a retweet. A tweet containing the
word “RT” was tagged as a retweet. The scripts also assigned each tweet a type automatically:
tweets containing the “@” character were considered conversations, those with links were
classified as links and everything else was classified as general.
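
The tagging rules above can be sketched in a few lines of Ruby. This is an illustration rather than the original script: the method names are made up, matching “RT” as a whole word is one reading of the rule, and checking for “@” before links is an assumption, since the text does not say which rule wins when a tweet has both.

```ruby
# Retweet rule from the text: the tweet contains the word "RT".
# Matching it as a whole word is an assumption.
def retweet?(text)
  !!(text =~ /\bRT\b/)
end

# Type rule from the text: "@" means conversation, a link means link,
# anything else is general. The order of the checks is an assumption.
def tweet_type(text)
  return "conversation" if text.include?("@")
  return "link" if text =~ %r{https?://}i
  "general"
end
```

Under these rules a retweet of the form “RT @user ...” is tagged as a retweet and also typed as a conversation, because the two tags are assigned independently.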

Finally, the data was scrubbed of all duplicates before any analysis was attempted.
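
Since both scripts re-logged the first page of search results every 5 minutes, the same tweet would normally appear in the logs many times. A minimal sketch of the scrubbing step in Ruby follows. Treating two rows as the same tweet when the user and the text match is an assumption, as are the method name and the row layout; the text does not say how duplicates were identified.

```ruby
# Keep the first logged copy of each tweet. The duplicate key
# (user plus text) is an assumption, not the author's stated rule.
def dedupe(rows)
  rows.uniq { |row| [row[:user].downcase, row[:text].strip] }
end

# Invented sample rows, for illustration only.
rows = [
  { user: "alice", text: "At #tedxmumbai" },
  { user: "Alice", text: "At #tedxmumbai " },
  { user: "bob",   text: "At #tedxmumbai" }
]
unique_rows = dedupe(rows) # keeps the first and third rows
```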

There are some important caveats, though. First of all, collating tweets by keyword was by no
means comprehensive, so not everything that was said about the event was collected. Only two
search terms were used, and Twitter applies some sort of filtering to its search results.
Twitter search also excludes private accounts and returns only public tweets.

However, I believe that the tweets are a good representative sample and include the bulk of the
tweets about the event.

THE OFFICIAL TWITTER ACCOUNT
The official TEDx Mumbai account had the handle @tedxmumbai. Below you can see a visual
summary of the tweets from the account:

Figure 1 Tweets by Time of Day and Week

As expected, the largest number of tweets was posted on the day of the event. The tweets
peaked in the middle of the day, about halfway into the event, and then tapered off. Replies to
tweets posted from the account also seem to have peaked at about 3 pm.

Why 3 pm? Well, that’s an interesting question. I strongly suspect that’s when more people are
actively looking at their Twitter timelines (possibly to evade the inevitable lethargy of the
afternoon?).

THE TWEETS
A QUICK OVERVIEW
The event produced 1222 tweets, of which 29% were retweets. A total of 254 people tweeted,
posting 266 links and engaging in 195 conversations. Tweets picked up notably on Saturday once
the event started, after a lot of low-intensity chatter about the event the day before. I will
now try to analyze these tweets.

WHO TWEETED? - THE PEOPLE BEHIND THE TWEETS


A total of 254 people tweeted about the event. A brief breakup of the kind of people who
tweeted can be seen below:

Figure 2 The people behind the tweets

The sex of the user tweeting was determined from the display picture, description and name
given on the user’s Twitter page. Organizations included accounts such as TEDx Mumbai and
Cleartrip. Bots were accounts whose content clearly seemed auto-generated. Of course, there
could be cases of misclassification; however, I am confident that the net impact on the final
breakup would not be severe.

When I grouped the data by the number of tweets per person, some very interesting results
appeared. The top 20 tweeters (roughly 8% of the 254 users) accounted for about 56% of the
total tweets, while most accounts tweeted only once. Twitter user Arpit Agarwal (@arpiit)
tweeted the most about the event. The results conform to Pareto’s rule to a large extent, with
a small percentage of the users accounting for the majority of the tweets.
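
The share held by the most active tweeters can be computed from nothing more than the list of user names, one entry per tweet. Here is a hedged Ruby sketch. The method name and the sample data are invented for illustration and are not the event data.

```ruby
# Percentage of all tweets posted by the n most active users.
# tweet_users holds one user name per tweet.
def top_share(tweet_users, n)
  return 0.0 if tweet_users.empty?
  counts = Hash.new(0)
  tweet_users.each { |user| counts[user] += 1 }
  top = counts.values.sort.reverse.first(n).inject(0) { |sum, c| sum + c }
  100.0 * top / tweet_users.size
end

# Invented sample: 10 tweets from 6 users.
sample_users = %w[a a a a b b c d e f]
top_two_share = top_share(sample_users, 2) # 6 of 10 tweets, so 60.0
```

Sorting `counts.values` in descending order also gives the series plotted in a long-tail graph such as Figure 3.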

Organizing the number of tweets by person in descending order gave me a nice long-tail effect,
which is illustrated in the graph below.

[Graph: No of Tweets per person (vertical axis, up to 120), people sorted in descending order]

Figure 3 The Long Tail of Tweets

WHEN DID THEY TWEET? - THE TWEET TIMELINE


The graph below shows the number of tweets vs. the time of day and date.

[Graph: No of Tweets (vertical axis, 0 to 300) against Day and Time in Hours (horizontal axis, Fri-0 to Sun-21)]

Figure 4 The Twitter Timeline for TEDx Mumbai

The number of tweets picked up sharply once the event started and tapered off as the day
ended. Sharp spikes in activity were seen as each session began and as it reached the halfway
mark. On the day before and the day after, the chatter on Twitter was of very low intensity. The
bulk of the conversations on Sunday centered around the Bombay Gym controversy, with a few
congratulatory messages and links to blog posts exchanged.
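
The day-and-hour buckets on the horizontal axis of Figure 4 (labels such as Sat-13) could be produced along the following lines. This is a minimal Ruby sketch with assumed method names, not the code behind the original graph.

```ruby
# Label a timestamp with its weekday and hour, for example "Sat-13".
def bucket_label(time)
  "#{time.strftime('%a')}-#{time.hour}"
end

# Count tweets per day-and-hour bucket from a list of Time objects.
def tweets_per_hour(times)
  counts = Hash.new(0)
  times.each { |t| counts[bucket_label(t)] += 1 }
  counts
end
```

The weekday alone is enough to tell the days apart here because the logs cover only a Friday, a Saturday and a Sunday. A longer collection period would need the full date in the label.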

WHAT DID THEY TWEET ABOUT? – THE CONTENT
The tweets on the day preceding the event concentrated on the buildup to the event.
Conversations included people checking to see who was coming and a few latecomers realizing
that TEDx Mumbai was planned for the next day. The tweets on Saturday naturally concentrated
on the event, while the tweets on the day after concentrated to some extent on the Bombay Gym
incident.

The overall sentiment about the event was positive, with most people quite satisfied by the
event. However, one of the sponsors, Cleartrip, seems to have incurred some ire from people at
the event for commandeering a small segment to showcase its new product.

A tag cloud showing the various tags used in the tweets is illustrated below.

Figure 5 Tag Cloud

A total of 266 links to content related to the event were posted on Twitter. A classification of the
links posted can be seen below.

[Pie chart: Blogs 35%, Webcast 31%, Pictures 19%, Others 11%, Facebook 4%]

Figure 6 Types of links in Tweets

A number of bloggers were live blogging from the event, including blogger/author Amit Varma.
The table below lists the most popular blogs, ranked by the number of tweets about them.

Blogger           Number of Tweets
Amit Varma        48
Navin Kabra       21
Dina              11
Media Reveries    5
Others            8

Table 1 Bloggers who wrote about the event

A number of the links were also shortened using a URL shortening service. Bit.ly, the current
market leader, was predictably the source of a number of short URLs. The next closest source of
short URLs was TwitPic.

A similar analysis of the top sources of image links reveals TwitPic to be the runaway
favorite. Is this because TwitPic is the default service for most applications, or is it just that
TwitPic is everyone’s favorite? I don’t know the answer, but I am certain that it would shine
some light on the benefits of being a preferred partner on the web.
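
Counts like those for shortening and image services can be obtained by grouping the logged links by host name. Below is a hedged Ruby sketch using the standard `uri` library. The method names are assumptions, and apart from the Flickr address from the cover credit the sample links are invented.

```ruby
require "uri"

# Host name of a link, lower-cased and without a leading "www.".
# Links that cannot be parsed are grouped under "(unknown)".
def host_of(url)
  host = URI.parse(url).host
  host ? host.downcase.sub(/\Awww\./, "") : "(unknown)"
rescue URI::InvalidURIError
  "(unknown)"
end

# Number of links per host, most frequent first, ties in name order.
def link_counts(urls)
  counts = Hash.new(0)
  urls.each { |url| counts[host_of(url)] += 1 }
  counts.sort_by { |host, n| [-n, host] }
end

# Sample links for illustration only.
sample_links = [
  "http://bit.ly/abc",
  "http://twitpic.com/1",
  "http://bit.ly/def",
  "http://www.flickr.com/photos/polvero/3466964233/"
]
ranked = link_counts(sample_links)
```

Grouping by host only says which service a link points at. Separating a shortened link from a direct one would still need a list of known shortener hosts.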

[Bar chart: No of Tweets (vertical axis, 0 to 100) by service: bit.ly, Direct Link, twitpic, j.mp, tinyurl, others]

Figure 7 URL shortening services used

[Bar chart: No of Tweets (vertical axis, 0 to 50) by service: twitpic, yfrog, Misc, ow.ly]

Figure 8 Image services used

WHERE DID THEY TWEET FROM? – THE SOURCE


Twitter also provides, in the search results, the name of the application from which each tweet
originated. I think it’s a very useful way to see where your target audience is tweeting from.
In the long run, it may help to know whether the people tweeting about you are doing so on the
move or from a computer, that is, whether or not they are likely to be physically close to your
brand or service.

While Twitter for the web still dominates as a source of tweets, other applications are catching
up. Notably, Gravity, a Symbian S60 application, has the second largest share of the pie. I
think this reflects the prevalence of S60 devices in India compared with other smartphones.
The other point to note is that the mobile web as a whole clearly overshadows the web. Again,
given the nature of the event and the fact that most of the tweets came from people at the
event, this is to be expected.

[Pie chart. Legend: web, Gravity, TweetDeck, Mobile Web, UberTwitter, Echofon, Tweetie, Seesmic, Others. Slice values (order not tied to the legend): 25%, 19%, 14%, 12%, 11%, 8%, 5%, 4%, 2%]

Figure 9 The sources of the tweets

CONCLUSION
To be very honest, this compilation of statistics is quite random and might not serve any useful
purpose. For me, the single most important insight was that it’s possible to monitor an event
and the reactions to it in real time, and to do this with some trivial coding effort. With a
little more effort, it may be possible to automatically categorize and analyze tweets in real
time. I think a tool developed along these lines would be a great way to measure the
effectiveness of the Twitter promotion of a real event.

On a side note, I do hope you learned something useful from this rant of mine.
