Scaling MySQL and Java in High Write Throughput Environments Presentation

 
 
 
 
 
by zmg

Document Information

2,717 Reads | 0 Comments

Description



We present the backend architecture behind Spinn3r – our scalable web and blog crawler.

Most existing work on scaling MySQL has focused on high read throughput environments such as web applications. In contrast, at Spinn3r we needed to complete thousands of write transactions per second in order to index the blogosphere at full speed.

We have achieved this through our ground-up development of a fault-tolerant distributed database and compute infrastructure, all built on top of cheap commodity hardware.

We’ve built out a number of technologies on top of MySQL that help us scale operations easily.

We’ve implemented an open source load-balancing JDBC driver named lbpool (http://code.tailrank.com/lbpool). Lbpool lets us loosely couple our MySQL slaves, which allows us to handle system failures gracefully. It also supports load balancing, reprovisioning, slave lag handling, and other advanced features not available in the stock MySQL JDBC driver.
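
As a rough illustration of lag-aware slave selection, the sketch below round-robins over replicas and skips any that are unreachable or too far behind. It is not lbpool's actual API: the class names, the lag threshold, and the JDBC URLs are all hypothetical, and the health data is assumed to come from a separate monitor (for example, one polling Seconds_Behind_Master from SHOW SLAVE STATUS).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of lag-aware replica selection; NOT lbpool's real API.
class LagAwareBalancer {

    /** Health snapshot for one replica, assumed to be filled in by a monitor. */
    static final class Replica {
        final String jdbcUrl;      // illustrative URL, not a real host
        final boolean reachable;   // false if the last health check failed
        final long lagSeconds;     // e.g. Seconds_Behind_Master

        Replica(String jdbcUrl, boolean reachable, long lagSeconds) {
            this.jdbcUrl = jdbcUrl;
            this.reachable = reachable;
            this.lagSeconds = lagSeconds;
        }
    }

    private final List<Replica> replicas;
    private final long maxLagSeconds;
    private int next = 0; // index of the next replica to try

    LagAwareBalancer(List<Replica> replicas, long maxLagSeconds) {
        this.replicas = new ArrayList<>(replicas);
        this.maxLagSeconds = maxLagSeconds;
    }

    /** Round-robin over replicas that are reachable and within the lag limit. */
    synchronized Optional<Replica> pick() {
        for (int i = 0; i < replicas.size(); i++) {
            Replica candidate = replicas.get(next);
            next = (next + 1) % replicas.size();
            if (candidate.reachable && candidate.lagSeconds <= maxLagSeconds) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // no healthy replica; caller decides what to do
    }

    public static void main(String[] args) {
        List<Replica> pool = new ArrayList<>();
        pool.add(new Replica("jdbc:mysql://db1.example/app", true, 0));
        pool.add(new Replica("jdbc:mysql://db2.example/app", true, 120)); // lagging
        pool.add(new Replica("jdbc:mysql://db3.example/app", true, 5));

        LagAwareBalancer balancer = new LagAwareBalancer(pool, 30);
        for (int i = 0; i < 3; i++) {
            System.out.println(balancer.pick().map(r -> r.jdbcUrl).orElse("none"));
        }
    }
}
```

Because callers ask the balancer for a replica on each request rather than holding a fixed host, a failed or lagging slave simply drops out of rotation until its health snapshot recovers.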

We’ve also built out a sharded database similar to infrastructure built at other companies such as Google (AdWords) and Yahoo (Flickr). Our sharded DB has a number of interesting properties, including ultra-high throughput requirements (we process 52TB per month), distributed sequence generation, and query plan execution.

- Kevin Burton (Tailrank), Jonathan Moore (Tailrank/Spinn3r)

20 Pages


Date Added

07/12/2008

Category

Uncategorized.

Awards

Rising

Copyright

Attribution Non-commercial
