You are on page 1of 16

Pragmaticraft

t

Non-relational data stores
Overview, coding and assessment: MongoDB, Tokyo Tyrant & CouchDB github.com/igal/ruby_datastores

Igal Koshevoy, Pragmaticraft Business-Technology Consultant igal@pragmaticraft.com @igalko on Twitter & Identi.ca

"Non-relational data stores for OpenSQL Camp" - Igal Koshevoy - 2009-11-14

typically mapping key to arbitrary columns. CouchDB... e.Pragmaticraft t Terminology ● Relational database: Rigidly-defined relations of table columns to data rows.Igal Koshevoy . "Non-relational data stores for OpenSQL Camp" . Tokyo Tyrant's table. TokyoTyrant hash. PostgreSQL. etc. MongoDB. Berkely DB. e. etc.g.g. MySQL.. etc. Key-value store: Hash-like struct.2009-11-14 ● ● . Document-oriented DB: Hash-like struct. often mapping a string key to a string value.g. e.

Pragmaticraft t Why non-relational? ● Easier replication and scaling Incremental schema migrations Better performance.2009-11-14 .Igal Koshevoy . sometimes ● ● "Non-relational data stores for OpenSQL Camp" .

young and brash with teething pains Require different coding patterns and skills to use effectively ● ● "Non-relational data stores for OpenSQL Camp" .Pragmaticraft t Why NOT non-relational? ● Most are bad for transactions and durability Most are newfangled.2009-11-14 .Igal Koshevoy .

Igal Koshevoy .2009-11-14 ● ● ● .Pragmaticraft t Non-relational coding patterns ● Denormalization to reduce finds/queries due to lack of relational join: – – – Calculations Foreign keys Foreign values ● Track schema versions per document Use as-needed incremental migrations Store associations externally or internally Shard data "Non-relational data stores for OpenSQL Camp" .

soon) Cons ● No transactions across collections No ACID or MVCC Not the fastest ● ● "Non-relational data stores for OpenSQL Camp" .0 Document-oriented ● ● ● ● ● ● ● ● ● ● Pros Great site.2009-11-14 . docs.org/ A 10gen project under GNU AGPL v3.Pragmaticraft t MongoDB http://mongodb.Igal Koshevoy . shards Resilient: replica pair Many data types Multiple indexes Sophisticated queries Many atomic operations Map-Reduce (on shards. API & community General purpose Quick performance Scalable: master-slaves.

p collection. collection = db.find_one(:number => 1) # Query items.new("localhost". db = Mongo::Connection. collection. :message => "Hello" } # Retrieve an item. collection << { :number => 1.find(:message => /ello/).to_a "Non-relational data stores for OpenSQL Camp" .db("mydb") # Get a collection.Igal Koshevoy .16 driver from GemCutter # Connect.2009-11-14 .collection("mycollection") # Add an index. p collection. 27017).Pragmaticraft t MongoDB (cont.create_index("number") # Insert an item.) require 'mongo' # Using mongo 0.

sourceforge.1 Key-value.2009-11-14 .Pragmaticraft t Tokyo Cabinet + Tyrant http://tokyocabinet.jp project under GNU LGPL v2. document-oriented & other engines Pros Specialized engines Very fast Scalable: master-slaves Resilient: dual master Multiple indexes Can do transactions memcache-compatible API Cons Fewer features Strings only Simplistic queries ● ● ● ● ● ● ● ● ● ● "Non-relational data stores for OpenSQL Camp" .net/ A mixi.Igal Koshevoy .

) require 'rufus/tokyo/tyrant' # Connect.query do |q| q. p db["foo"] # Query items. "message" => "Hello" } # Retrieve an item. db["foo"] = { "number" => "1".limit(5) end "Non-relational data stores for OpenSQL Camp" .2009-11-14 . "ello") q.new('localhost'.Pragmaticraft t Tokyo Cabinet + Tyrant (cont. db = Rufus::Tokyo::TyrantTable. 1978) # Insert an item. p db.add_condition("message". :includes.Igal Koshevoy .

Igal Koshevoy .org/ An Apache project under Apache License 2. very.apache.2009-11-14 .0 Document-oriented Pros Very scalable: multi-master MVCC ACID Versioned documents REST Sophisticated queries Map-Reduce Cons ● ● ● ● ● ● ● ● Very. very slow Must create views Harder to use than others Site and docs: FAIL ● ● ● "Non-relational data stores for OpenSQL Camp" .Pragmaticraft t CouchDB http://couchdb.

0.Igal Koshevoy .2009-11-14 . db.Pragmaticraft t CouchDB (cont.database!("http://127. "number" => 1. db = CouchRest.save_doc("_id" => "foo". "message" => "Hello") # Retrieve an item.get("foo") "Non-relational data stores for OpenSQL Camp" .1:5984/couchrest-test") # Insert an item. p db.0.) require 'couchrest' # Connect.

.get("_design/queries") rescue nil db.2009-11-14 .view("queries/by_number". :key => 1)["rows"].continued from last slide # Add an view. :views => { :by_number => { :map => "function(doc) { if (doc.number.Igal Koshevoy .number) { emit(doc. } }" } } }) # Query items. doc). db.map{|row| row["value"]} "Non-relational data stores for OpenSQL Camp" .) # .Pragmaticraft t CouchDB (cont.delete_doc db.save_doc({ "_id" => "_design/queries".. p db.

Query one Above benchmarks are naive: They use serial operations in tight loops with small datasets from single host to localhost.2009-11-14 .Pragmaticraft t Naive benchmarks Columns in graphs.Igal Koshevoy .Insert one 2. rather than concurrent mixture across many clients & servers with much data. left to right: 1. "Non-relational data stores for OpenSQL Camp" .Retrieve one 3.

Igal Koshevoy .Pragmaticraft t "Non-relational data stores for OpenSQL Camp" .2009-11-14 .

Pragmaticraft t "Non-relational data stores for OpenSQL Camp" .2009-11-14 .Igal Koshevoy .

With caching. Drizzle) & better replication. CouchDB has promise.g. Non-relational databases have shown their worth at larger sites when used cleverly. rework (e.Igal Koshevoy .Pragmaticraft t Conclusions ● MongoDB and Tokyo Tyrant are useful now. they will continue to be competitive. powerful and proven. stability & features. but is too slow currently. Relational databases are still a great choice: fast.2009-11-14 ● ● ● . denormalization. Non-relational databases will continue to improve performance. "Non-relational data stores for OpenSQL Camp" .