Email: ray.kao@microsoft.com
Cosmos DB Workshop
https://aka.ms/cosmosdb-workshop
Ray Kao
Open Source Software Data Lead
Azure Global Black Belt
Microsoft Canada
OSS Canada Leadership Team
CanadaOpenSource@Microsoft.com
Data models: column-family, document, key-value, graph
APIs shown: MongoDB, Table API
Multi-homing APIs
Transparent server-side partition management
Elastically scale storage (GB to PB) and throughput (100 to 100M req/sec) across many machines and multiple regions
[Chart labeled "Black Friday"; y-axis ticks from 2,000,000 to 12,000,000]
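The idea of spreading data across many machines by partition can be sketched in a few lines. The slides do not describe how Cosmos DB actually assigns items to partitions, so the MD5-modulo scheme, the function names, and the `customer-N` keys below are all illustrative assumptions, not the service's algorithm.

```python
import hashlib

def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index using a stable hash.

    Assumption: MD5 modulo the partition count stands in for whatever
    scheme the service really uses.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

def distribute(keys, partition_count):
    """Group keys by the partition they hash to."""
    placement = {index: [] for index in range(partition_count)}
    for key in keys:
        placement[partition_for(key, partition_count)].append(key)
    return placement

# Hypothetical keys: 1000 customers spread over 4 partitions.
keys = [f"customer-{n}" for n in range(1000)]
placement = distribute(keys, 4)
```

Because the mapping depends only on the key, every client computes the same placement without coordination, which is what lets partition management stay on the server side.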
Physical index
Multi-model, multi-API
Database engine operates on Atom-Record-Sequence type system
Azure Cosmos DB
Azure region B (app + session state)
Azure Storage (logs, static catalog content)
Azure Cosmos DB (session state)
Retail Order Processing Pipelines
Azure Functions (Microservice 1: Tax)
Azure Functions (Microservice 2: Payment)
...
Azure Functions (Microservice N: Fulfillment)
Real-time Recommendations
Online Recommendations Service
Shoppers
E-commerce Store
Apache Spark on Azure Databricks
Order Transactions
Multiplayer Gaming
Azure CDN
Azure Storage (game files)
Spark SQL
Spark Streaming
MLlib (machine learning)
GraphX (graph)
Scale-out Database
Azure Cosmos DB
Let’s zoom in on Azure Cosmos DB
Resource Model
Account
Database
Container
Item
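The Account > Database > Container > Item nesting can be modeled as a small sketch. The class names, the `example-account`, `store`, and `orders` identifiers, and the `item_link` helper are hypothetical; the `dbs/.../colls/.../docs/...` shape follows the ID-based resource links used by the service's REST interface.

```python
from dataclasses import dataclass, field

@dataclass
class Container:
    id: str
    items: dict = field(default_factory=dict)  # item id -> JSON-like dict

@dataclass
class Database:
    id: str
    containers: dict = field(default_factory=dict)  # container id -> Container

@dataclass
class Account:
    name: str
    databases: dict = field(default_factory=dict)  # database id -> Database

def item_link(database_id: str, container_id: str, item_id: str) -> str:
    """Build an ID-based resource link that addresses one item."""
    return f"dbs/{database_id}/colls/{container_id}/docs/{item_id}"

# Hypothetical account with one database, one container, one item.
account = Account("example-account")
db = account.databases.setdefault("store", Database("store"))
orders = db.containers.setdefault("orders", Container("orders"))
orders.items["order-1"] = {"id": "order-1", "total": 42}

link = item_link(db.id, orders.id, "order-1")
```

Each level only knows its children, so an item is always addressed through its container and database.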
********.azure.com
Account
Database IGeAvVUp …
Container
Item
Account
Database
Container, User (both under Database)
Item (under Container), Permission (under User)
Note: Throughput can also be provisioned at the database level and shared across a set of containers (collections).
Tenants
[Diagram: one leader and two followers, each holding key-value (K, V) data]
Consistency w.r.t. transactions is NOT the same thing as consistency w.r.t. replication.
• Transactions: this is about moving from one valid state to another for a single given transaction
• Replication: this is about getting a consistent view across replicated copies of data
[Diagram: three replicas (West US, East US, North Europe), each starting at Value = 5; a write changes the value from 5 to 6 and propagates between replicas]
What happens when a network partition is introduced?
Reader: What is the value?
• Should it see 5? (prioritize availability)
• Or does the system go offline until the network is restored? (prioritize consistency)
Brewer’s CAP theorem: it is impossible for a distributed data store to simultaneously provide more than two of the following three guarantees: consistency, availability, partition tolerance.
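The reader's dilemma during a partition can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not how any real store is implemented: `RegionReplica`, `replicate_write`, `read`, and the `prioritize` flag are hypothetical names, and replication is modeled as an instantaneous copy to every connected replica.

```python
class PartitionedError(Exception):
    """Raised when a read is refused to protect consistency."""

class RegionReplica:
    def __init__(self, region, value):
        self.region = region
        self.value = value
        self.connected = True  # False while cut off by a network partition

def replicate_write(replicas, value):
    """Apply a write to every replica the network can still reach."""
    for replica in replicas:
        if replica.connected:
            replica.value = value

def read(replica, prioritize):
    """Answer from local state, or refuse if consistency comes first."""
    if replica.connected or prioritize == "availability":
        return replica.value
    raise PartitionedError(f"{replica.region} is partitioned; refusing a possibly stale read")

west, east, north = (RegionReplica(r, 5) for r in ("West US", "East US", "North Europe"))

north.connected = False                 # the partition is introduced
replicate_write([west, east, north], 6)  # the write cannot reach North Europe

available_answer = read(north, prioritize="availability")  # stale but answered
try:
    read(north, prioritize="consistency")
    refused = False
except PartitionedError:
    refused = True
```

With availability prioritized the isolated reader still sees 5; with consistency prioritized it gets an error until the partition heals.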
Latency: a packet of information can travel no faster than the speed of light.
Replication between distant geographic regions can take hundreds of milliseconds.
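A rough lower bound on that latency follows from distance divided by signal speed. The 10,000 km distance below is an illustrative figure, not a measurement between any two Azure regions, and the two-thirds factor is a commonly quoted approximation for light in optical fiber.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum

def min_one_way_ms(distance_km, medium_factor=1.0):
    """Lower bound on one-way delay in milliseconds.

    medium_factor scales the signal speed: 1.0 for vacuum,
    about 2/3 as an assumed approximation for optical fiber.
    """
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * medium_factor) * 1000.0

def min_round_trip_ms(distance_km, medium_factor=1.0):
    """Lower bound on a request/acknowledge round trip in milliseconds."""
    return 2 * min_one_way_ms(distance_km, medium_factor)

# Illustrative distance only.
vacuum_rtt = min_round_trip_ms(10_000)
fiber_rtt = min_round_trip_ms(10_000, medium_factor=2 / 3)
```

Under these assumptions a single round trip already costs roughly 67 ms in vacuum and about 100 ms in fiber, before any routing, queuing, or processing time is added.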
[Diagram: some replicas show the update from 5 to 6 while others still hold Value = 5]
Reader A: What is the value?
Choice for most distributed apps
• Clear tradeoffs
• Latency
• Availability
• Throughput
Consistency Level | Guarantees
Consistent Prefix | Reads will never see out-of-order writes (no gaps).
Eventual | Potential for out-of-order reads. Lowest cost for reads of all consistency levels.
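The difference between the two guarantees above can be checked mechanically. This is a minimal sketch: the write names `w1` to `w4`, the helper `is_consistent_prefix`, and the sample observations are hypothetical, chosen only to show what each level permits.

```python
def is_consistent_prefix(observed, write_log):
    """True if `observed` is exactly the first len(observed) writes, in order."""
    return list(observed) == list(write_log[:len(observed)])

# The order in which writes were committed.
write_log = ["w1", "w2", "w3", "w4"]

prefix_read = write_log[:2]          # lagging, but ordered and without gaps
eventual_read = ["w1", "w3", "w2"]   # out of order: possible under eventual consistency
gap_read = ["w1", "w3"]              # skips w2: a gap, so not a prefix

prefix_ok = is_consistent_prefix(prefix_read, write_log)
eventual_ok = is_consistent_prefix(eventual_read, write_log)
gap_ok = is_consistent_prefix(gap_read, write_log)
```

A consistent-prefix reader may be behind, but whatever it has seen is always a leading slice of the write log; an eventual reader has no such ordering guarantee.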
Bounded-Staleness: Bounds are set server-side via the Azure Portal
Session Consistency: Session is controlled using a “session token”.
• Session tokens are automatically cached by the Client SDK
• Can be pulled out and used to override other requests (to preserve session between multiple clients)
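// .NET SDK v2 sketch; client and documentLink are assumed to be in scope.
string sessionToken;

// Relax consistency for this one request; the response carries a session token.
var response = await client.ReadDocumentAsync(
    documentLink,
    new RequestOptions { ConsistencyLevel = ConsistencyLevel.Eventual }
);
sessionToken = response.SessionToken; // pass via RequestOptions.SessionToken later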
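How a session token lets a client read its own writes, and how handing the token to another client preserves the session, can be sketched with a toy model. The real token is an opaque string produced by the service; the integer version counter, `VersionedReplica`, and `SessionClient` below are stand-ins invented for illustration.

```python
class VersionedReplica:
    """A replica that records the version of the last write it has applied."""
    def __init__(self, value, version=0):
        self.value = value
        self.version = version

class SessionClient:
    def __init__(self, session_token=0):
        # Highest version this session has observed (stand-in for the opaque token).
        self.session_token = session_token

    def write(self, primary, value):
        primary.version += 1
        primary.value = value
        self.session_token = primary.version

    def read(self, replicas):
        # Use the first replica that has caught up to this session's token.
        for replica in replicas:
            if replica.version >= self.session_token:
                self.session_token = replica.version
                return replica.value
        raise RuntimeError("no replica has caught up to the session token")

primary = VersionedReplica(value=5)
lagging = VersionedReplica(value=5)   # has not received the upcoming write

client_a = SessionClient()
client_a.write(primary, 6)

own_read = client_a.read([lagging, primary])            # skips the lagging replica
stale_read = SessionClient().read([lagging, primary])   # no token: may see old data
shared_read = SessionClient(client_a.session_token).read([lagging, primary])
```

The client without a token is served the stale 5 by the lagging replica, while the client that was handed the writer's token is routed to a replica that already holds 6.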
Security & Compliance
Always encrypted at rest and in transit
• Encryption at rest – AES-256
• Encryption in transit – SSL/TLS