NYC
• Ryan Rawson
• Senior Software Developer @ Stumbleupon
• HBase committer, core contributor
Stumbleupon
• Similar to HDFS:
• Master = Namenode (ish)
• Regionserver = Datanode (ish)
• Often run these alongside each other!
Server Architecture 2
• But not quite the same: HBase stores state in HDFS
• HDFS provides robust data storage across machines, insulating against failure
• Master and Regionserver are fairly stateless and machine independent
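A minimal sketch of why stateless servers make reassignment cheap, assuming a toy model in which a dictionary stands in for HDFS and each server holds only the names of the regions it serves. The names `shared_storage`, `reassign`, `rs1`..`rs3` and the least-loaded placement rule are illustrative assumptions, not HBase's actual assignment logic.

```python
# Toy model: all durable data sits in shared storage (standing in for HDFS),
# so a server holds only the *names* of the regions it currently serves.
shared_storage = {
    "region-a": {"row1": "v1"},
    "region-b": {"row7": "v7"},
    "region-c": {"row9": "v9"},
}


def reassign(assignments, failed, live_servers):
    """Move every region of a failed server to the least-loaded live server.

    Hypothetical placement rule, chosen only to keep the example short.
    """
    # Copy the assignments of the surviving servers.
    new = {s: list(r) for s, r in assignments.items() if s != failed}
    for s in live_servers:
        new.setdefault(s, [])
    # No data is copied: only the region names move between servers.
    for region in assignments.get(failed, []):
        target = min(live_servers, key=lambda s: len(new[s]))
        new[target].append(region)
    return new


assignments = {"rs1": ["region-a", "region-b"], "rs2": ["region-c"]}
after = reassign(assignments, failed="rs1", live_servers=["rs2", "rs3"])
```

Because the region contents never left `shared_storage`, any live server can pick up a region without a data transfer step in this model.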
Region assignment
• HBase
- Community and project viability (no major users beyond zvents)
- HBase local and good community
Stumbleupon & HBase
• Picked HBase:
• Community
• Features
• Map-reduce, cascading, etc
• Now highly involved and invested
su.pr marketing
• A bigger table
• History of Data
• RDBMS Issues
• HBase to the Rescue
• Streamy Today and Tomorrow
• Future of HBase
About Me
• Transparent partitioning
• Transparent distribution
• Fast random writes
• Good data locality
• Fast random reads
What We Got
• Transparent partitioning: Regions
• Transparent distribution: RegionServers
• Fast random writes: MemStore
• Good data locality: Column Families
• Fast random reads: HBase 0.20
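To make "transparent partitioning" concrete, here is a small sketch of how a row key could be routed to a region when each region owns a contiguous, sorted range of keys. The split points, server names and the `locate` helper are hypothetical, and the lookup is a plain binary search rather than HBase's own client code.

```python
import bisect

# Hypothetical region table: each region covers [start_key, next start_key).
# The empty start key means "from the beginning of the table".
region_starts = ["", "g", "p"]
region_servers = ["rs1", "rs2", "rs3"]


def locate(row_key):
    """Return (region index, server) for the region whose range contains row_key."""
    # The owning region is the last one whose start key is <= row_key.
    i = bisect.bisect_right(region_starts, row_key) - 1
    return i, region_servers[i]
```

A caller only supplies the row key, for example `locate("apple")` or `locate("zebra")`; which region and server hold the row is resolved by the lookup.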
What Else We Got
• Transparent replication: HDFS
• High availability: No SPOF
• MapReduce: Input/OutputFormats
• Versioning: Column Versions
• Fast Sequential Reads: Scanners
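The versioning and scanner items can be illustrated with a toy in-memory table, assuming rows are kept in sorted key order and each row keeps a bounded number of timestamped values. `ToyTable` and its methods are invented for this sketch and are not the HBase client API.

```python
import bisect


class ToyTable:
    """In-memory stand-in for a sorted, versioned table (not HBase's API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.rows = {}         # row key -> list of (timestamp, value), newest first
        self.sorted_keys = []  # row keys kept in sorted order for scans

    def put(self, row, value, ts):
        if row not in self.rows:
            bisect.insort(self.sorted_keys, row)
            self.rows[row] = []
        versions = self.rows[row]
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, row, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        return self.rows.get(row, [])[:versions]

    def scan(self, start, stop):
        """Yield (row, newest value) for start <= row < stop, in key order."""
        lo = bisect.bisect_left(self.sorted_keys, start)
        hi = bisect.bisect_left(self.sorted_keys, stop)
        for row in self.sorted_keys[lo:hi]:
            yield row, self.rows[row][0][1]


t = ToyTable(max_versions=2)
t.put("user2", "b", ts=1)
t.put("user1", "a1", ts=1)
t.put("user1", "a2", ts=2)
t.put("user1", "a3", ts=3)
t.put("user3", "c", ts=1)
```

Because keys are sorted, a scan over a key range reads adjacent rows in order, which is the property that makes sequential reads cheap in this model.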
HBase @ Streamy Today