Apache CouchDB has started. Time to relax.

Wednesday 31 August 2011

A Small Intro
Cluster of Unreliable Commodity Hardware
Written in Erlang
Self-contained documents
No schema

Objective
CouchDB features
Spectrum of use
Map/Reduce
Replication / Decentralization
Load balancing
Demo
Q&A

CAP Theorem
[Figure: CAP triangle with corners Consistency, Availability and Partition Tolerance. Systems placed on it: Cassandra, BigTable etc.; RDBMS; CouchDB]

Features

Document Structure
Self-contained documents
Schema free
_id and _rev maintained for each document
Documents are plain-text JSON
Design documents
Can also have attachments as blobs
Document revisions are maintained on each update
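
As an illustration of the points above, a document might look like the following. Every field value here is invented for the example; the `_rev` string in particular is a placeholder, since CouchDB assigns the real one.

```javascript
// Example document; all field values are invented for illustration.
const doc = {
  _id: "talk-couchdb-intro",        // unique within the database
  _rev: "1-0123456789abcdef",       // placeholder; CouchDB assigns the real value
  title: "Time to relax",
  tags: ["couchdb", "nosql"],       // no schema: any JSON structure is allowed
  _attachments: {
    // inline attachment with base64-encoded content
    "slides.pdf": { content_type: "application/pdf", data: "JVBERi0=" }
  }
};

// The plain-text JSON form is what gets stored and sent over HTTP.
const json = JSON.stringify(doc);
```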

MVCC
Multi-Version Concurrency Control
No locks (writes don’t acquire a lock)
All nodes are immutable
Recovering from a crash is simple
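
A minimal sketch of how revision checks can stand in for locks. It assumes a hypothetical in-memory `store` and a simplified counter as `_rev` (real CouchDB revisions also carry a hash); `put` is an illustrative helper, not a CouchDB API.

```javascript
// Hypothetical in-memory store standing in for a database.
const store = new Map();

// Illustrative helper: accepts a write only if the caller holds the current _rev.
function put(doc) {
  const current = store.get(doc._id);
  if (current && current._rev !== doc._rev) {
    return { ok: false, error: "conflict" }; // CouchDB answers 409 Conflict here
  }
  const nextRev = current ? current._rev + 1 : 1; // simplified revision counter
  // Old versions are never modified in place; a new frozen object takes over.
  store.set(doc._id, Object.freeze({ ...doc, _rev: nextRev }));
  return { ok: true, rev: nextRev };
}

const first = put({ _id: "a", title: "draft" });                 // creates rev 1
const second = put({ _id: "a", _rev: 1, title: "final" });       // updates to rev 2
const stale = put({ _id: "a", _rev: 1, title: "late writer" });  // rejected, rev is old
```

The late writer is not blocked by a lock; it is told that its revision is out of date and has to re-read the document before retrying.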

Embrace HTTP
REST interface
GET, PUT, POST, DELETE supported on documents
Even database creation/deletion
ETags
NAT/firewall friendly
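
The request shapes behind this slide can be sketched as plain data. The helper name `couch`, the database name `talks` and the base URL are assumptions for illustration (5984 is CouchDB's default port); the resulting objects could be handed to any HTTP client.

```javascript
// Assumed base URL; 5984 is CouchDB's default port.
const BASE = "http://localhost:5984";

// Hypothetical helper that describes a request without sending it.
function couch(method, path, body) {
  const req = { method, url: `${BASE}/${path}` };
  if (body !== undefined) {
    req.headers = { "Content-Type": "application/json" };
    req.body = JSON.stringify(body);
  }
  return req;
}

const createDb = couch("PUT", "talks");                            // create a database
const createDoc = couch("PUT", "talks/couch-intro", { year: 2011 }); // create a document
const readDoc = couch("GET", "talks/couch-intro");                 // response carries an ETag
const deleteDoc = couch("DELETE", "talks/couch-intro?rev=1-abc");  // rev is a placeholder
const deleteDb = couch("DELETE", "talks");                         // delete the database
```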

Replication
Incremental replication with the _changes API
For offline usage
Bidirectional
Replication at database / document level
Again through HTTP
Filter functions (the filter parameter) to replicate only portions of the database
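
A sketch of the request bodies one might POST to `/_replicate`. The `replication` helper, the database names, the remote host and the filter name `app/by_owner` are made up for illustration; `source`, `target`, `continuous` and `filter` are the fields CouchDB reads.

```javascript
// Hypothetical helper that builds a body for POST /_replicate.
function replication(source, target, options = {}) {
  return { source, target, ...options };
}

const local = "notes";                            // assumed local database name
const remote = "http://example.org:5984/notes";   // assumed remote database URL

// Bidirectional sync is simply two one-way replications.
const push = replication(local, remote, { continuous: true });
const pull = replication(remote, local, { continuous: true });

// Replicate only part of a database through a filter function
// stored in a design document (here the made-up "app/by_owner").
const partial = replication(local, remote, { filter: "app/by_owner" });
```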

Conflict Resolution
Pass revisions on each update
Conflicts can happen only on replication of concurrent writes in a cluster
Both revisions are maintained for the application to resolve the conflict
_conflicts property set on documents that conflict
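
One possible application-side resolution strategy, sketched under assumptions: the `updatedAt` field, the `resolve` helper and the revision strings are hypothetical. The idea is to keep one revision and mark the others as deleted; the result has the shape of a `_bulk_docs` request body.

```javascript
// revisions: all conflicting versions of one document, fetched by the application.
function resolve(revisions) {
  // Assumed policy: the most recently edited version wins.
  const sorted = [...revisions].sort((a, b) => b.updatedAt - a.updatedAt);
  const [winner, ...losers] = sorted;
  // Deleting the losing revisions clears the conflict.
  const deletions = losers.map((d) => ({ _id: d._id, _rev: d._rev, _deleted: true }));
  return { docs: [winner, ...deletions] };
}

const payload = resolve([
  { _id: "cart", _rev: "2-aaa", items: ["tea"], updatedAt: 100 },
  { _id: "cart", _rev: "2-bbb", items: ["tea", "milk"], updatedAt: 250 },
]);
```

Other policies, such as merging the two item lists, fit the same structure; CouchDB leaves that choice to the application.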

views using CouchDB
No relations, no joins
Views collate documents
Map/Reduce is used to simulate joins
Views are stored in design docs
function map(doc) { if (doc.name) { emit(doc._id, doc); } }

function reduce(keys, values) { return sum(values); } // sum() needs numeric values
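
To see what the view engine does with such functions, here is a small in-memory imitation. `runView`, the sample documents and the `city` field are invented for this sketch; real views live in a design document and are indexed incrementally by CouchDB.

```javascript
// Minimal imitation of a view: run map over every doc, group by key, then reduce.
function runView(docs, map, reduce) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  docs.forEach((doc) => map(doc, emit));
  const groups = new Map();
  for (const { key, value } of rows) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return [...groups].map(([key, values]) => ({ key, value: reduce([key], values) }));
}

const docs = [
  { _id: "1", name: "Ann", city: "Pune" },
  { _id: "2", name: "Raj", city: "Pune" },
  { _id: "3", city: "Delhi" }, // no name, so the map skips it
];

// emit is passed in explicitly here; in CouchDB it is provided by the query server.
const byCity = runView(
  docs,
  (doc, emit) => { if (doc.name) emit(doc.city, 1); },
  (keys, values) => values.reduce((a, b) => a + b, 0)
);
```

Emitting `1` per document and summing gives a count per key, which is the usual pairing for a summing reduce.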

Query Server

Uses a SpiderMonkey process by default to run map/reduce jobs written in JavaScript
Replaceable by a query server in any language, such as Ruby, Python, Erlang or Lisp

_changes API
Used for replication
Modes: continuous (streaming), polling, long polling
Filter functions (the filter parameter) to get changes on specified documents
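
A sketch of how a polling consumer might use the feed: ask for everything after the last sequence it saw and apply a filter function. The `(doc, req)` signature follows CouchDB filter functions; the in-memory `log`, the `changesSince` helper, the integer sequence numbers and the `type` field are assumptions for illustration.

```javascript
// Filter function in the style stored under "filters" in a design document.
const onlyInvoices = (doc, req) => doc.type === "invoice";

// Hypothetical stand-in for GET /db/_changes?since=...&filter=...
function changesSince(log, since, filter) {
  const results = log
    .filter((change) => change.seq > since)
    .filter((change) => filter(change.doc, {}));
  const lastSeq = log.length ? log[log.length - 1].seq : since;
  return { results, last_seq: lastSeq };
}

const log = [
  { seq: 1, doc: { _id: "a", type: "invoice" } },
  { seq: 2, doc: { _id: "b", type: "note" } },
  { seq: 3, doc: { _id: "c", type: "invoice" } },
];

// Everything after sequence 1 that passes the filter.
const page = changesSince(log, 1, onlyInvoices);
// A poller would store page.last_seq and pass it as `since` next time.
```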

validations API
Executed on document creation/update
Can throw({"forbidden": "message"}) or throw({"unauthorized": "message"})
User permissions and roles, self-managed
OAuth / cookie-based _session API
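
A sketch of a validation function in the shape CouchDB expects for `validate_doc_update` in a design document. The rules themselves (an `author` field, a required `title`) and the sample user are made up; the dry run at the end only exercises the function locally.

```javascript
// Same parameters as a validate_doc_update function in a design document.
function validate(newDoc, oldDoc, userCtx) {
  if (!userCtx.name) {
    throw { unauthorized: "Please log in first." };
  }
  const isAdmin = userCtx.roles.indexOf("_admin") !== -1;
  if (oldDoc && oldDoc.author !== userCtx.name && !isAdmin) {
    throw { forbidden: "Only the author may change this document." };
  }
  if (!newDoc._deleted && !newDoc.title) {
    throw { forbidden: "Documents need a title." };
  }
}

// Local dry run with a made-up user context: bob edits alice's document.
let rejection = null;
try {
  validate(
    { title: "Hi", author: "bob" },
    { title: "Old", author: "alice" },
    { name: "bob", roles: [] }
  );
} catch (err) {
  rejection = err;
}
```

Returning without throwing means the write is accepted.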

Map/Reduce vs SQL

Why Map/Reduce
Highly parallelizable
Complex operations on huge datasets can be broken down
Memoizable
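
These points can be illustrated with a toy sum. This sketch is not CouchDB code; it only shows that reducing chunks separately and then reducing the partial results matches a single pass, which is what lets partial results be computed in parallel and cached.

```javascript
// A reduce that can also be applied to its own partial results.
const sum = (values) => values.reduce((a, b) => a + b, 0);

const values = [3, 1, 4, 1, 5, 9, 2, 6];

// Split the work; each chunk could be reduced on a different core or node.
const chunks = [values.slice(0, 3), values.slice(3, 6), values.slice(6)];
const partials = chunks.map(sum); // [8, 15, 8], cacheable intermediate results

// Combining the partial results gives the same answer as one big pass.
const combined = sum(partials);
const direct = sum(values);
```

If one chunk changes, only its partial result needs recomputing before the final combine.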

Advantages of moving away from the SQL approach
No need to model, for a single object, the complex relationships that an RDBMS enforces

When to couch?

Document Oriented
Applications where the object representation varies for each document
Like applications written on Lotus Notes
Ex: Pulse

Decentralized
Decentralized offline applications (maintain local copies)
The application is replicated as a design document along with the data
Ex: Ubuntu One

Load Balancing
Intensive reads/writes
N couch instances taking part in reads/writes and replicating with each other
Lounge - set up a couch cluster easily

Flipsides
Disk size (increases enormously because of revisions)
No built-in support for %like% queries - use Lucene
Transactional support???
No cascades (do we need them?)

Couching Apps
Web Apps served from CouchDB
CouchApp, JQCouch
Web Apps & Desktop Apps
Query the HTTP REST interface and provide functionality
Infinite possibilities - any framework
Libraries available for almost all languages

Demo

Q&A
