Understanding Bigtable: Data Model & API

Bigtable is a distributed storage system built by Google to handle large amounts of structured data. It uses a sparse, distributed, persistent multidimensional sorted map as its data model. The API allows for creating/deleting tables and column families as well as reading, writing, and querying data. It is built upon other Google technologies like Google File System for storage and Chubby for locking.

Uploaded by

Mahesh Yadav

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

41 views16 pages

Understanding Bigtable: Data Model & API

Uploaded by

Mahesh Yadav

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Bigtable

David Wyrobnik, MEng

Overview
● What is Bigtable?
● Data Model
● API
● Building Blocks
● Implementation
What is Bigtable (high level)
● “Distributed storage system for structured data” - title of paper
● “BigTable is a compressed, high performance, and proprietary data storage
system built on Google File System, Chubby Lock Service, SSTable (log-
structured storage like LevelDB) and a few other Google technologies.” -
wikipedia
● “A Bigtable is a sparse, distributed, persistent multidimensional sorted map” -
paper
Data Model
Data Model
● (row:string, column:string, time:int64) → array of bytes
Data Model continued
● Timestamps can be assigned automatically (“real time”) or by client
● Versioned data management, two per-column-family settings for garbage-
collection
○ last n versions of a cell should be kept
○ only new-enough versions kept (e.g. only values that were written in the last seven days)
API
API
● Functions for creating and deleting
○ tables and column families
● Functions for changing
○ clusters, table, and column family metadata (such as control rights)
● Write, delete, and lookup values in individual rows
● Iterate over subset of data in table
● Single-row transactions → perform atomic read-modify-write sequences
● No general transactions across rows, but supports batching writes across rows
● Bigtable can be used with MapReduce (common use case)
Building Blocks and Implementation
Building Blocks
● Google-File-System (GFS) to store log and data files.
● SSTable file format.
● Chubby as a lock service
● Bigtable uses Chubby
○ to ensure at most one active master exists
○ to store bootstrap location of Bigtable data
○ to discover tablet servers
○ to store Bigtable schema information (column family info for each table)
○ to store access control lists
Implementation
● Three major components:
○ library that is linked into every client
○ one master server
○ many tablet servers
● Master mainly responsible for assigning tablets to tablet servers
● Tablet servers can be added or removed dynamically
● Tablet server store typically 10-1000 tablets
● Tablet server handle read and writes and splitting of tablets that are too large
● Client data does not move through master.
Tablet Location
Tablet Assignment
● Master keeps track of live tablet servers, current assignments, and of
unassigned tablets
● Master assigns unassigned tablets to tablet servers by sending a tablet load
request
● Tablet servers are linked to files in Chubby directory (servers directory)
● When new master starts:
○ Acquires unique master lock in Chubby
○ Scans live tablet servers
○ Gets list of tablets from each tablet server, to learn which tablets are assigned
○ Scans METADATA table to learn set of existing tablets → adds unassigned tablets to list
Tablet Serving
Consistency
● Bigtable has a strong consistency model, since operations on rows are atomic
and tablets are only served by one tablet server at a time
Discussion

BigTable: Google's Scalable Storage
No ratings yet
BigTable: Google's Scalable Storage
38 pages
Bigtable: Scalable Distributed Storage System
No ratings yet
Bigtable: Scalable Distributed Storage System
12 pages
Understanding BigTable Architecture
No ratings yet
Understanding BigTable Architecture
9 pages
BigTable: Scalable Data Storage
No ratings yet
BigTable: Scalable Data Storage
23 pages
Bigtable - A Distributed Storage System For Structured Data
No ratings yet
Bigtable - A Distributed Storage System For Structured Data
2 pages
Bigtable: A Distributed Storage System For Structured Data
100% (1)
Bigtable: A Distributed Storage System For Structured Data
26 pages
(En) Bigtable
No ratings yet
(En) Bigtable
26 pages
Bigtable Overview - Google Cloud
No ratings yet
Bigtable Overview - Google Cloud
8 pages
Google Bigtable Architecture Overview
100% (1)
Google Bigtable Architecture Overview
6 pages
Bigtable for Data Engineers
No ratings yet
Bigtable for Data Engineers
14 pages
Unit 5 Lecture 3
No ratings yet
Unit 5 Lecture 3
18 pages
Google Bigtable: Cloud Storage Guide
No ratings yet
Google Bigtable: Cloud Storage Guide
10 pages
BigTable: Scalable Data Storage System
No ratings yet
BigTable: Scalable Data Storage System
9 pages
ADO Lecture II 2024-26
No ratings yet
ADO Lecture II 2024-26
67 pages
Overview of Google Cloud Bigtable
No ratings yet
Overview of Google Cloud Bigtable
21 pages
20 GFS BigTable
No ratings yet
20 GFS BigTable
36 pages
Net 2010 08 2 - 06
No ratings yet
Net 2010 08 2 - 06
6 pages
Bigtable: A Distributed Storage System For Structured Data
100% (1)
Bigtable: A Distributed Storage System For Structured Data
4 pages
Understanding Google Bigtable System
No ratings yet
Understanding Google Bigtable System
3 pages
Column-Family Databases Guide
No ratings yet
Column-Family Databases Guide
32 pages
Bigtable Schema Design Best Practices
No ratings yet
Bigtable Schema Design Best Practices
14 pages
Bigtable: Scalable Data Storage System
No ratings yet
Bigtable: Scalable Data Storage System
24 pages
Google Architecture Case Study
No ratings yet
Google Architecture Case Study
44 pages
5.cloud Computing Lecture
No ratings yet
5.cloud Computing Lecture
7 pages
Cloud Bigtable for Developers
100% (1)
Cloud Bigtable for Developers
18 pages
Overview of BigTable and Cloud Services
No ratings yet
Overview of BigTable and Cloud Services
21 pages
BigTable: Scalable Distributed Storage
No ratings yet
BigTable: Scalable Distributed Storage
2 pages
05 Data Storage Services
No ratings yet
05 Data Storage Services
75 pages
Storage Systems
No ratings yet
Storage Systems
23 pages
HBase Architecture and Performance Insights
No ratings yet
HBase Architecture and Performance Insights
46 pages
Overview of Google File System and Bigtable
No ratings yet
Overview of Google File System and Bigtable
23 pages
Google Storage Architecture Insights
No ratings yet
Google Storage Architecture Insights
25 pages
Mastering Google Bigtable Database
No ratings yet
Mastering Google Bigtable Database
248 pages
GCP Data Storage & BigQuery Guide
No ratings yet
GCP Data Storage & BigQuery Guide
15 pages
Hypertable: Scalable NoSQL Database
100% (2)
Hypertable: Scalable NoSQL Database
37 pages
OD 03 PDE Building and Operationalizing Data Processing Systems
No ratings yet
OD 03 PDE Building and Operationalizing Data Processing Systems
34 pages
Bigtable: Scalable Data Storage System
No ratings yet
Bigtable: Scalable Data Storage System
65 pages
Cloud Application Requirements Overview
No ratings yet
Cloud Application Requirements Overview
21 pages
Google Cloud Storage Solutions
No ratings yet
Google Cloud Storage Solutions
69 pages
Google Compute Engine & Storage Guide
No ratings yet
Google Compute Engine & Storage Guide
3 pages
Chubby System and Google API
No ratings yet
Chubby System and Google API
13 pages
Week7 Slides
No ratings yet
Week7 Slides
70 pages
Understanding Big Data Concepts and Structures
No ratings yet
Understanding Big Data Concepts and Structures
3 pages
Guha Roy 2017
No ratings yet
Guha Roy 2017
3 pages
GCP Storage and Database Services Guide
No ratings yet
GCP Storage and Database Services Guide
98 pages
Google Cloud Storage & Database Services
No ratings yet
Google Cloud Storage & Database Services
64 pages
Google Cloud Storage Guide
No ratings yet
Google Cloud Storage Guide
64 pages
Ccomputing Madurya
No ratings yet
Ccomputing Madurya
20 pages
Week 5 GCP Notes
No ratings yet
Week 5 GCP Notes
5 pages
Introduction-to-Data-Storage-and-Retrieval
No ratings yet
Introduction-to-Data-Storage-and-Retrieval
26 pages
BQ Solutions-1
No ratings yet
BQ Solutions-1
19 pages
Bigquery
No ratings yet
Bigquery
25 pages
Exam Overview: GCP Data Engineer
100% (1)
Exam Overview: GCP Data Engineer
12 pages
Resume of Rakshitha P G - Web Developer
No ratings yet
Resume of Rakshitha P G - Web Developer
4 pages
Documentum Content Server Administration
No ratings yet
Documentum Content Server Administration
23 pages
Satellite L655 Detailed Product Specification: Genuine
No ratings yet
Satellite L655 Detailed Product Specification: Genuine
4 pages
Sruthi Datastage Resume
No ratings yet
Sruthi Datastage Resume
7 pages
J.Burrows Mini Trackball Wireless Keyboard Instruction Manual
No ratings yet
J.Burrows Mini Trackball Wireless Keyboard Instruction Manual
9 pages
MapReduce Basics for Big Data Beginners
No ratings yet
MapReduce Basics for Big Data Beginners
32 pages
Presentation Guidelines
No ratings yet
Presentation Guidelines
4 pages
Oracle DBA Performance Tuning Q&A
100% (1)
Oracle DBA Performance Tuning Q&A
7 pages
English Vocabulary WE Tech School Grade10
No ratings yet
English Vocabulary WE Tech School Grade10
3 pages
Untitled
No ratings yet
Untitled
1 page
Maximum of Three Numbers Project Report
No ratings yet
Maximum of Three Numbers Project Report
24 pages
L11 Intro Logix PPT
No ratings yet
L11 Intro Logix PPT
19 pages
ADAudit Plus: Real-Time AD & Server Auditing
No ratings yet
ADAudit Plus: Real-Time AD & Server Auditing
31 pages
Parallel Computers Architecture and Programming V. Rajaraman Ebook Improved Readability Version
No ratings yet
Parallel Computers Architecture and Programming V. Rajaraman Ebook Improved Readability Version
50 pages
nighthawk-app.com
No ratings yet
nighthawk-app.com
11 pages
WOD Unsupported Device
No ratings yet
WOD Unsupported Device
430 pages
Import Data From Excel To Data Grid View in C#
100% (1)
Import Data From Excel To Data Grid View in C#
8 pages
Hospital Management Database Report
No ratings yet
Hospital Management Database Report
10 pages
Guía de Usuario para S10 Presupuestos
No ratings yet
Guía de Usuario para S10 Presupuestos
190 pages
CWtype Keyer Setup Guide
No ratings yet
CWtype Keyer Setup Guide
12 pages
Gigabyte Motherboard GA Z87 HD3 R102
No ratings yet
Gigabyte Motherboard GA Z87 HD3 R102
5 pages
Boundary Scan Architecture Explained
No ratings yet
Boundary Scan Architecture Explained
59 pages
Setup and Deployment in Visual Basic 2010
No ratings yet
Setup and Deployment in Visual Basic 2010
15 pages
You Completed This Test On 25/03/2021, 06:36: ACDA Level 1
67% (3)
You Completed This Test On 25/03/2021, 06:36: ACDA Level 1
9 pages
CS203 Mid-Term Exam: Computer Programming
No ratings yet
CS203 Mid-Term Exam: Computer Programming
2 pages
Middle Term Test. Tanversion 3
No ratings yet
Middle Term Test. Tanversion 3
4 pages
Qgis-1.4.0 Coding-Compilation Guide en
No ratings yet
Qgis-1.4.0 Coding-Compilation Guide en
135 pages
Azure Migration & Management Courses
No ratings yet
Azure Migration & Management Courses
5 pages
C Programming Csit Micro Syllabus Redefined PDF
No ratings yet
C Programming Csit Micro Syllabus Redefined PDF
7 pages
Computer Systems and Architecture Overview
No ratings yet
Computer Systems and Architecture Overview
73 pages

Understanding Bigtable: Data Model & API

Uploaded by

Understanding Bigtable: Data Model & API

Uploaded by

Bigtable

David Wyrobnik, MEng

You might also like