Why NoSQL?

Hinnerk Haardt <hinnerk@randnotizen.de>

Not only SQL

SQL: Structured Query Language

Programming language for relational databases

Why?

Blame the Internet!

1980s: data bank

ACID

• Atomicity: all or nothing
• Consistency: guarantees integrity
• Isolation: concurrent transactions are shielded from each other
• Durability: all changes are persisted
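
Atomicity ("all or nothing") can be sketched with a transfer between two accounts. This is a minimal illustration using Python's built-in `sqlite3` module; the `account` table, the names, and the `transfer` helper are assumptions made up for this example, not part of the talk.

```python
import sqlite3

# In-memory database with two hypothetical accounts (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no overdraft
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the second call the credit to `bob` has already been executed when the debit fails; the rollback undoes it, so no half-finished transfer becomes visible.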

»Big« databases

Scale vertically

[Figure: a single server (RAM, CPU, storage) shown in four steps; labels: »bigger →«, »more expensive →«]

21st century

Example: Facebook
• 30,000 servers
• 25 terabytes of log data per day
• 300,000,000 users
• 230 engineers

Blame the Internet.

Horizontal

[Figure: a growing number of identical servers (RAM, CPU, storage) side by side]

[Figure: chart comparing horizontal scaling with vertical scaling; axis labels: »more data →«, »more throughput & higher availability →«]

[Figure: a large grid of identical servers (RAM, CPU, storage)]
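
One way to picture horizontal scaling is to spread keys over several small servers by hashing the key. The sketch below is an assumption for illustration only: the "servers" are plain dictionaries, and `shard_for`, `put`, and `get` are hypothetical helper names.

```python
import hashlib


def shard_for(key: str, num_servers: int) -> int:
    """Map a key to a server index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


# Four "servers", each modelled as an ordinary in-memory dictionary.
servers = [dict() for _ in range(4)]


def put(key, value):
    servers[shard_for(key, len(servers))][key] = value


def get(key):
    return servers[shard_for(key, len(servers))].get(key)


for i in range(1000):
    put(f"user:{i}", {"id": i})

# How many keys each server ended up holding.
sizes = [len(s) for s in servers]
```

Plain modulo hashing moves most keys when a server is added or removed; systems such as Dynamo use consistent hashing to limit that.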

Availability

• Safety (ACID)
• Availability
• Unlimited growth

CAP theorem

Consistency

Availability

Partition Tolerance

»in larger distributed-scale systems, network partitions are a given; therefore, consistency and availability cannot be achieved at the same time«
Werner Vogels, Amazon.com
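
The trade-off in the quote can be shown with a toy model of two replicas and a simulated network partition. Everything here (`TwoReplicaStore`, the `"CP"`/`"AP"` modes, `Unavailable`) is a hypothetical sketch, not how any real database is built; a real consistency-first system would typically also restrict reads during a partition.

```python
class Unavailable(Exception):
    """Raised when the store refuses a request to stay consistent."""


class TwoReplicaStore:
    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False  # True: the replicas cannot reach each other

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write, giving up availability.
                raise Unavailable("cannot reach the other replica")
            # Availability first: accept locally, replicas may now diverge.
            self.replicas[replica][key] = value
            return
        for r in self.replicas:  # healthy network: update both replicas
            r[key] = value

    def read(self, replica, key):
        return self.replicas[replica].get(key)


cp, ap = TwoReplicaStore("CP"), TwoReplicaStore("AP")
for store in (cp, ap):
    store.write(0, "x", 1)
    store.partitioned = True  # the network splits

try:
    cp.write(0, "x", 2)
    cp_available = True
except Unavailable:
    cp_available = False

ap.write(0, "x", 2)
ap_views = (ap.read(0, "x"), ap.read(1, "x"))
```

During the partition the CP store stays consistent but rejects the write, while the AP store accepts it and its two replicas return different values.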

2009: NoSQL

Definition…

»a group of non-conventional databases«

Welcome to the zoo!

• CouchDB
• MongoDB
• Redis
• MemcacheDB
• Tokyo Cabinet
• Google Bigtable
• Amazon Dynamo
• Apache Cassandra
• Project Voldemort
• Mnesia (Erlang)
• HBase (Apache Hadoop)
• Hypertable
• Twitter Gizzard

no ACID

limited transactions

no »JOIN«

no SQL

simple to talk to

schema-free

scales horizontally

replication

eventual consistency

probabilistic worldview
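
Replication with eventual consistency can be sketched as two replicas that accept writes independently and reconcile later. The conflict rule used here, "last write wins" by timestamp, is an assumption chosen for brevity (Dynamo, for instance, uses vector clocks instead); `Replica` and `sync` are hypothetical names.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        """Keep the value only if it is newer than what is already stored."""
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange entries in both directions so each replica keeps the newest version."""
    for src, dst in ((a, b), (b, a)):
        for key, (timestamp, value) in list(src.data.items()):
            dst.write(key, value, timestamp)


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"], timestamp=1)
r2.write("cart", ["book", "pen"], timestamp=2)

before = (r1.read("cart"), r2.read("cart"))  # replicas disagree for a while
sync(r1, r2)
after = (r1.read("cart"), r2.read("cart"))   # both converge on the newer write
```

Between the write and the sync a reader may see stale data; once the replicas have exchanged their state, they agree.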

Amazon's Dynamo
• »applications have received successful responses […] for 99.9995% of its requests«
• »no data loss event has occurred to date« [2007, after 2 years in operation]
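
To put the 99.9995% figure into perspective, a short calculation, assuming every request that did not get a successful response counts as a failure:

```python
from decimal import Decimal

# Success rate quoted for Dynamo, as an exact decimal fraction.
success_rate = Decimal("99.9995") / 100

# Everything else is treated as a failed request (an assumption of this sketch).
failure_rate = 1 - success_rate

# Expected number of failed requests per one million requests.
failures_per_million = failure_rate * 1_000_000
```

That works out to about 5 unsuccessful requests per million.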

Event: »ACID vs. BASE«
• more on NoSQL, ACID, BASE and the CAP theorem
• Objekt-Stammtisch Kiel
• 27 April 2010, 7:00 pm
• Eckernförder Straße 20, Kiel (Toppoint)

References
• Bigtable: A Distributed Storage System for Structured Data
• Eventually Consistent - Revisited
• Keynote address to the PODC conference in 2000 by Eric Brewer
• Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services