
Off the Reservation

with TweetReach

Hayes Davis
Co-Founder, Appozite
hayes@appozite.com
@hayesdavis
So what's this all about?
TweetReach is a simple Twitter app that helps
you see how many people have seen a
particular “message” you've sent
A few requirements
Show people useful stats
Be simple(ish)
Be reasonably fast for a reasonable number of users
Fit within Twitter API limits

March 24, 2009 Austin on Rails 2


What to use?
Web app framework: Sinatra
Persistence layer: Tokyo Cabinet + Tokyo Tyrant
+ Memcache-client
Twitter API: Grackle



What is Sinatra?
Framework for simple web applications
Handles just the view and controller part
Controller is a simple file with a DSL for defining
routes and the blocks that handle them
Views can use ERB (and others)
Can use ActiveRecord or another model and/or
persistence mechanism (as you'll see)



Sinatra Example
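require 'rubygems'
require 'sinatra'

set :port, 3000

get '/' do
  erb :index
end

get '/reach' do
  @query = params[:q]
  # username and pass: Twitter credentials, assumed to be defined elsewhere
  tr = TweetReach.new(username, pass)
  @results = tr.measure_reach(@query)
  erb :reach_results
end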
What is Tokyo Cabinet?
Persistent (and fast) key-value store from Mikio
Hirabayashi at mixi (large Japanese social
network)
Stats: 2.5M inserts/second, 3M queries/second,
can store up to 8 exabytes
Has a server called Tokyo Tyrant



More Tokyo Cabinet
Offers multiple database engines:
Hash: simple key-value store
B-tree: functionally the same as the hash DB but with
ordered keys based on a user-defined function
Fixed-length: basically a giant array which you index
into by offset keys
Table: similar to a relational DB except no predefined
schema (à la CouchDB). Can index columns and
query them



Tokyo Tyrant
Server for Tokyo Cabinet databases
Provides replication, failover, etc
Speaks memcached protocol, so it looks just like
memcached with a couple exceptions
Data is persistent
Does not provide expiration



Why Tokyo Cabinet/Tyrant?
Relevant Twitter information is easily indexed by
either a Twitter screen name or id
Reach calculations require O(n) looping and
retrievals – much faster with a dedicated key-
value store vs. a relational DB
Need a read-through cache so I don't have to hit
Twitter so often
Can store stuff for relatively long periods that I
may not want to keep solely in memory
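A minimal sketch of the read-through idea, not TweetReach's actual code. ReadThroughCache, HashStore, the key format and the follower count are all made up for illustration. HashStore stands in for a memcache-client connection to Tokyo Tyrant so the example runs without a server; it only assumes the store responds to get and set.

```ruby
# Stand-in for a memcache-client connection (assumption: the real store
# also responds to get(key) and set(key, value)).
class HashStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

# Hypothetical read-through wrapper around any get/set store.
class ReadThroughCache
  def initialize(store)
    @store = store
  end

  # Return the cached value for key. On a miss, run the block (the
  # expensive Twitter call), store its result, and return it.
  def fetch(key)
    cached = @store.get(key)
    return cached unless cached.nil?

    value = yield
    @store.set(key, value)
    value
  end
end

cache = ReadThroughCache.new(HashStore.new)
twitter_calls = 0

# Pretend the block hits the Twitter API; 1234 is a made-up follower count.
lookup = lambda do |screen_name|
  cache.fetch("followers:#{screen_name}") do
    twitter_calls += 1
    1234
  end
end

first  = lookup.call('hayesdavis')
second = lookup.call('hayesdavis')
```

The second lookup is served from the store, so the block (and therefore Twitter) is only hit once per key.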
What is Grackle?
Grackle is a simple Twitter REST and Search API
library designed not to break when the Twitter
API changes (or breaks)
I wrote it based on my experience building
CheapTweet.com
Dynamically makes requests to the APIs via a
generic syntax that maps to Twitter URIs
Builds OpenStructs of returned JSON or XML
dynamically
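To make the "generic syntax that maps to URIs" idea concrete, here is a toy sketch using method_missing. This is not Grackle's implementation; ToyClient and everything inside it are hypothetical, and it only builds the request description rather than sending anything.

```ruby
require 'uri'

# Toy request builder (hypothetical, NOT Grackle's code): chained method
# calls accumulate path segments; a trailing ? means GET, ! means POST.
class ToyClient
  def initialize(host = 'twitter.com', path = [])
    @host = host
    @path = path
  end

  def method_missing(name, params = {})
    name = name.to_s
    if name.end_with?('?', '!')
      verb = name.end_with?('?') ? :get : :post
      segments = @path + [name.chop]
      uri = "http://#{@host}/#{segments.join('/')}.json"
      query = URI.encode_www_form(params)
      # GET parameters go in the query string; POST parameters would go
      # in the request body, which this sketch does not model.
      uri += "?#{query}" if verb == :get && !query.empty?
      [verb, uri]
    else
      # Any other name is treated as one more path segment.
      ToyClient.new(@host, @path + [name])
    end
  end
end

client  = ToyClient.new
request = client.users.show?(:id => 'hayesdavis')
update  = client.statuses.update!(:status => 'howdy world')
```

Because no method names are hard-coded, a new API endpoint needs no library change, which is the property the slide is after.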
Grackle Example
require 'grackle'

client = Grackle::Client.new(
  :username=>'some_user',
  :password=>'secret'
)

#GET http://twitter.com/users/show.json?id=hayesdavis
client.users.show? :id=>'hayesdavis'

#POST http://twitter.com/statuses/update.json
client.statuses.update! :status=>'howdy world'

#GET http://search.twitter.com/search.json?q=AoR
client[:search].search? :q=>'AoR'

Building TweetReach



Basic Structure

[Diagram: Sinatra, TweetReach Calculator, Read-Through Cache, Tokyo Tyrant, Grackle, Twitter]



Using Sinatra
Install the sinatra gem
Sinatra apps are a single ruby file (I called it
app.rb)
Just require rubygems and sinatra
Run “ruby app.rb”



Using Tokyo Cabinet/Tyrant
Get the source for Tokyo Cabinet and Tokyo
Tyrant and build it
Run “ttserver -port <port> filename.<ext>”
The type of database comes from the filename
extension. I used “tch” which is the Hash
engine
Get Mike Perham's memcache-client gem
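A small sketch of the extension-to-engine rule, with a helper that assembles the ttserver command shown above. The helper names and the tweetreach.tch filename are made up for illustration; 1978 is assumed here as the port (it is Tokyo Tyrant's usual default).

```ruby
# Tokyo Cabinet picks the database engine from the file extension.
ENGINES = {
  '.tch' => 'Hash',
  '.tcb' => 'B-tree',
  '.tcf' => 'Fixed-length',
  '.tct' => 'Table'
}.freeze

# Hypothetical helper: name the engine a database file would use.
def engine_for(filename)
  ENGINES.fetch(File.extname(filename)) do |ext|
    raise ArgumentError, "unknown Tokyo Cabinet extension: #{ext.inspect}"
  end
end

# Hypothetical helper: build the ttserver command line for a database file.
def ttserver_command(port, filename)
  engine_for(filename) # fail early on an unrecognized extension
  "ttserver -port #{port} #{filename}"
end

engine  = engine_for('tweetreach.tch')
command = ttserver_command(1978, 'tweetreach.tch')
```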



Using Grackle
Make sure you've added gems.github.com to
your list of gem sources
Install the hayesdavis-grackle gem
Just require “grackle”



Lessons Learned



Sinatra Lessons
Very little built-in anything:
No page or action caching
No helpers for easy formatting (stole things from
Rails)
No validation against models
Make sure you really only want something
extremely simple
You'll most likely need to roll one or two of your
own things
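As an example of the kind of thing you end up rolling yourself, here is a hand-written stand-in for Rails' number_with_delimiter. It is a sketch, not the Rails implementation and not the code TweetReach uses. In a Sinatra app it could be placed inside a helpers block so views can call it.

```ruby
# Insert a delimiter between groups of three digits in the whole-number
# part, leaving any fractional part untouched.
def number_with_delimiter(number, delimiter = ',')
  whole, fraction = number.to_s.split('.', 2)
  whole = whole.gsub(/(\d)(?=(\d{3})+\z)/, "\\1#{delimiter}")
  fraction ? "#{whole}.#{fraction}" : whole
end

reach = number_with_delimiter(1234567)
```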
Tokyo Cabinet Lessons
Lack of auto-expiration is annoying when using it
mostly as a key-value cache
Would definitely use it again for this type of task
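One way to work around the missing expiration is to store a timestamp next to each value and check it on read. The sketch below is an assumption about how that could look, not what TweetReach does. ExpiringCache is a made-up name, a plain Hash stands in for the Tyrant-backed store, and the clock is injectable so the example does not need to sleep.

```ruby
# Emulate expiration on top of a store that never expires entries.
class ExpiringCache
  def initialize(store, ttl_seconds, clock = -> { Time.now.to_i })
    @store = store
    @ttl = ttl_seconds
    @clock = clock
  end

  # Store the value together with the time it was written.
  def set(key, value)
    @store[key] = [@clock.call, value]
  end

  # Return the value, or nil if it is missing or older than the TTL.
  def get(key)
    stored_at, value = @store[key]
    return nil if stored_at.nil?
    return nil if @clock.call - stored_at > @ttl

    value
  end
end

now = 1_000
cache = ExpiringCache.new({}, 60, -> { now })
cache.set('reach:hello', 42)
fresh = cache.get('reach:hello')
now += 61
stale = cache.get('reach:hello')
```

Stale entries are only ignored, not deleted, so a store used this way still needs occasional cleanup.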



Grackle
Twitter is Twitter so make sure you've got decent
error handling in place – things will go wrong or
not respond, etc
Still a very new library so I'm sure there are
places that need cleaning up
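A sketch of what "decent error handling" might mean in practice: retry a flaky call a few times with a growing pause. with_retries and its parameters are hypothetical, and the sleeper is injectable so the example runs instantly. In real code you would probably narrow the rescue to the errors your Twitter client actually raises.

```ruby
# Run the block, retrying on failure up to `attempts` times and doubling
# the pause after each failed attempt.
def with_retries(attempts: 3, base_delay: 0.5, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError
    raise if tries >= attempts # out of attempts: re-raise the last error

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

calls = 0
delays = []
# Simulated Twitter call that fails twice and then succeeds.
result = with_retries(sleeper: ->(s) { delays << s }) do
  calls += 1
  raise 'Twitter is over capacity' if calls < 3

  :ok
end
```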



Handy Stuff
http://www.sinatrarb.com/
http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/
http://www.scribd.com/doc/12016121/Tokyo-Cabinet-and-Tokyo-Tyrant-Presentation
http://github.com/hayesdavis/grackle

