
Apache Cassandra

For Oracle DBA


Robert Bialek

@RobertPBialek doag2018
Who Am I

Senior Principal Consultant and Trainer at Trivadis GmbH in Munich


– Master of Science in Computer Engineering
– At Trivadis since 2004
– Trivadis Partner since 2012

Focus:
– Data and service high availability, disaster recovery
– Architecture design, optimization, automation
– Troubleshooting
– Trainer: O-RAC, O-DG

2 21.11.2018 Apache Cassandra for Oracle DBA


We help to generate added value from data



With over 650 specialists and IT experts in your region

– 16 Trivadis branches and more than 650 employees
– Experience from more than 1,900 projects per year at over 800 customers
– 250 Service Level Agreements
– Over 4,000 training participants
– Research and development budget: CHF 5.0 million
– Financially self-supporting and sustainably profitable



Agenda

1. Introduction

2. Key Components

3. Data Replication

4. Scalability

5. Read/Write Operations

6. Data Consistency

7. Summary



Introduction



What is Apache Cassandra?

Distributed NoSQL (wide column) partitioned row store database, which runs within a
JVM

Decentralized, highly fault tolerant database with no single point of failure

Horizontally scalable system (computing resources/performance)

According to the CAP theorem, it is an AP system: it favors availability and partition tolerance over strict consistency

– Consistency
– Availability
– Partition tolerance



Key Components



Keyspaces & Tables

Keyspace
– Grouping of data, similar to a schema
– Defines replication properties

Table (Column Family)
– Stores data based on a primary key
• Primary key: partition key plus optional clustering columns
– Physically split into partitions
– Denormalization (data duplication) is necessary
– Once written to disk, the data is immutable

[Figure labels: Keyspace, Table, Cassandra Ring]



Cassandra Ring – Virtual Nodes Architecture

[Figure: the partitioner maps the data 'Cassandra' to token 356242581507269238 on the Cassandra ring; each of the four nodes is configured with num_tokens: 256]
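
The mapping from a partition key to a node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 stands in for Cassandra's default Murmur3 partitioner, the node names are hypothetical, and the tokens of each node are derived from its name instead of being randomly generated or allocated.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    # Stand-in hash (assumption): MD5 truncated to a signed 64-bit integer.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(nodes, num_tokens=256):
    # Every node claims num_tokens positions (virtual nodes) on the ring.
    ring = []
    for node in nodes:
        for i in range(num_tokens):
            ring.append((token_for(f"{node}#{i}"), node))
    ring.sort()
    return ring


def owner(ring, key):
    # The owner is the node holding the first token >= the key's token,
    # wrapping around to the start of the ring if necessary.
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token_for(key))
    return ring[idx % len(ring)][1]


ring = build_ring(["node1", "node2", "node3", "node4"])
print(owner(ring, "Cassandra"))
```

With 256 tokens per node, each node owns many small token ranges instead of one large range, which spreads the data more evenly across the ring.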



Gossip – Internode Communication

Peer-to-peer communication protocol to exchange ring state information

Gossip process runs every second and exchanges messages with up to three other nodes in the ring

Eventually, all nodes learn (indirectly) about all other nodes
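
How knowledge spreads indirectly can be illustrated with a toy simulation. It is a heavily simplified sketch, not Cassandra's gossip implementation: nodes only exchange sets of known node names (no versioned state), every node initially knows just itself and one seed node, and the names `n0` to `n19` are hypothetical.

```python
import random


def gossip_round(known, rng, fanout=3):
    # Each node exchanges its view with up to `fanout` peers it already knows.
    for node in sorted(known):
        peers = sorted(p for p in known[node] if p != node)
        for peer in rng.sample(peers, min(fanout, len(peers))):
            merged = known[node] | known[peer]
            known[node] = set(merged)
            known[peer] = set(merged)


def rounds_until_converged(n=20, seed=1):
    rng = random.Random(seed)
    names = [f"n{i}" for i in range(n)]
    # Initially every node knows only itself and the seed node n0.
    known = {name: {name, "n0"} for name in names}
    everyone = set(names)
    rounds = 0
    while any(view != everyone for view in known.values()):
        gossip_round(known, rng)
        rounds += 1
    return rounds


print(rounds_until_converged())
```

Even though no node is told about all other nodes directly, the views converge after a small number of rounds.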



Scalability



Cassandra Ring – Scale Out

Increases computing power and throughput of a Cassandra ring

Online and transparent to the applications

[Figure: a new node bootstraps into the Cassandra ring. Labels: software & configuration files, ring information from the SEED node, generate tokens, START joining ring, data streaming, FINISH joining ring]

Cassandra Ring – Scale In

Decreases computing power of a Cassandra ring

Online and transparent to the applications

[Figure: a node leaves the Cassandra ring. Labels: DECOMMISSION, data streaming, remove tokens, DECOMMISSIONED]



Data Replication



Replication – Data High Availability

To ensure data and service high availability, Cassandra stores data on multiple nodes in a cluster

All replicas are equally important (no primary or secondary data)

Replication strategy and replication factor (RF) are defined at the keyspace (application) level
– RF can be set differently in different data centers

Two replication strategies are available:
– SimpleStrategy
– NetworkTopologyStrategy

[Figure: two data centers (DC 1, DC 2), each with nodes in Rack 1 and Rack 2]



Replication – SimpleStrategy (RF: 2)

[Figure: a single ring in Data Center 1 with four nodes, all in Rack 1]
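
The placement rule of SimpleStrategy (first replica on the node owning the token, further replicas on the next nodes clockwise, ignoring racks and data centers) can be sketched as follows. The ring with four single-token nodes A to D and the token values are hypothetical and chosen only for readability.

```python
import bisect


def simple_strategy_replicas(ring, token, rf):
    # ring: sorted list of (token, node) pairs.
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    replicas = []
    # Walk the ring clockwise and collect rf distinct nodes.
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == rf:
            break
    return replicas


ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]
print(simple_strategy_replicas(ring, 30, 2))  # ['C', 'D']
```

Because topology is ignored, both replicas may end up in the same rack, which is why this strategy is usually limited to single data center or test setups.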



Replication – NetworkTopologyStrategy (RF/DC: 2)

[Figure: one ring spanning Data Center 1 and Data Center 2, each with nodes in Rack 1 and Rack 2]
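
A simplified sketch of rack-aware placement, in the spirit of NetworkTopologyStrategy, is shown below. It is not the actual Cassandra algorithm: it walks the ring clockwise and, per data center, prefers nodes in racks that do not yet hold a replica, falling back to already used racks only when there are fewer racks than replicas. Node names, tokens, and the topology mapping are hypothetical.

```python
import bisect


def nts_replicas(ring, topology, token, rf_per_dc):
    # ring: sorted (token, node) pairs; topology: node -> (dc, rack).
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    walk = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in walk:
            walk.append(node)

    result = {}
    for dc, rf in rf_per_dc.items():
        chosen, racks, skipped = [], set(), []
        for node in walk:
            node_dc, rack = topology[node]
            if node_dc != dc or len(chosen) == rf:
                continue
            if rack in racks:
                skipped.append(node)  # same rack: keep only as a fallback
            else:
                chosen.append(node)
                racks.add(rack)
        # Fewer racks than replicas: fill up from the skipped nodes.
        chosen += skipped[: rf - len(chosen)]
        result[dc] = chosen
    return result


ring = [(0, "A"), (10, "E"), (20, "B"), (30, "F"),
        (40, "C"), (50, "G"), (60, "D"), (70, "H")]
topology = {
    "A": ("DC1", "rack1"), "B": ("DC1", "rack1"),
    "C": ("DC1", "rack2"), "D": ("DC1", "rack2"),
    "E": ("DC2", "rack1"), "F": ("DC2", "rack1"),
    "G": ("DC2", "rack2"), "H": ("DC2", "rack2"),
}
print(nts_replicas(ring, topology, 5, {"DC1": 2, "DC2": 2}))
```

With RF 2 per data center, each data center ends up with one replica per rack, so the loss of a rack or of a whole data center still leaves a copy available.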



Read/Write Operations



Read Request Flow on a Cassandra Node

[Figure: read path for a partition key. Memory: memtable, row cache, bloom filter, partition key cache, partition summary, compression offset map. Disk: partition index, SSTables]



Write Request Flow on a Cassandra Node

[Figure: write path. Memory: memtable. Disk: commit log, SSTable (Data.db, Index.db, CompressionInfo.db, Filter.db, Statistics.db, Digest.crc32, TOC.txt), compaction process]
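
The order of steps in the write path can be sketched with a small in-memory model. This is an illustration only: the class `Node`, the parameter `memtable_limit`, and the flush trigger are hypothetical simplifications, and nothing is actually written to disk.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []  # append-only log (durable on a real node)
        self.memtable = {}    # key -> (value, timestamp), held in memory
        self.sstables = []    # immutable, sorted snapshots ("on disk")
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        # 1. Append to the commit log first, for durability.
        self.commit_log.append((key, value, timestamp))
        # 2. Apply the write to the memtable.
        self.memtable[key] = (value, timestamp)
        # 3. Flush once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # The memtable is written out, sorted by key, as a new immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Commit log entries for flushed data are no longer needed.
        self.commit_log.clear()


node = Node()
node.write("b", "TEST2", 50)
node.write("a", "TEST1", 100)  # memtable limit reached, triggers a flush
node.write("c", "TEST3", 120)
print(node.sstables, node.memtable)
```

Existing SSTables are never modified by a write; merging them is left to the compaction process.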



Upserts on a Cassandra Node

Partition Key: TAG
Primary Key: TAG, ID

SSTables (partition TAG: CASSANDRA)

ID  C1  C2     TSTAMP
1   2   TEST1  100
2   3   TEST2  50

Memtable

INSERT INTO t (TAG, ID, C1, C2)
VALUES ('CASSANDRA', 1, 5, 'TEST3');

ID  C1  C2     TSTAMP
1   5   TEST3  150

UPDATE t SET C2 = 'PROD1'
WHERE TAG = 'CASSANDRA' AND ID = 1;

ID  C2     TSTAMP
1   PROD1  200

DELETE FROM t
WHERE TAG = 'CASSANDRA' AND ID = 2;

ID  Tombstone (marked_deleted)  TSTAMP
2                               250
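
Since nothing is updated in place, a read has to merge all fragments of a row and keep the newest value per column. The following sketch uses the timestamps from the slide; the function `merge_row` and the `TOMBSTONE` marker are hypothetical names, and the model is simplified to one timestamp per fragment.

```python
TOMBSTONE = object()  # marker for a row-level delete


def merge_row(fragments):
    # fragments: list of (timestamp, cells); cells is a dict or TOMBSTONE.
    latest = {}        # column -> (timestamp, value)
    deleted_at = None  # newest tombstone timestamp, if any
    for ts, cells in fragments:
        if cells is TOMBSTONE:
            deleted_at = ts if deleted_at is None else max(deleted_at, ts)
            continue
        for col, val in cells.items():
            if col not in latest or ts > latest[col][0]:
                latest[col] = (ts, val)
    # Only values written after the newest tombstone survive.
    row = {c: v for c, (ts, v) in latest.items()
           if deleted_at is None or ts > deleted_at}
    return row or None


row1 = merge_row([(100, {"C1": 2, "C2": "TEST1"}),
                  (150, {"C1": 5, "C2": "TEST3"}),
                  (200, {"C2": "PROD1"})])
row2 = merge_row([(50, {"C1": 3, "C2": "TEST2"}), (250, TOMBSTONE)])
print(row1)  # {'C1': 5, 'C2': 'PROD1'}
print(row2)  # None
```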



Compaction Process on a Cassandra Node
Compaction Strategies
– SizeTieredCompactionStrategy (STCS)
– LeveledCompactionStrategy (LCS)
– TimeWindowCompactionStrategy (TWCS)

SSTables before compaction

ID  C1  C2     TSTAMP
1   2   TEST1  100

ID  C1  C2     TSTAMP
2   3   TEST2  50

ID  C1  C2     TSTAMP
3   4   TEST3  120

ID  C1  C2     TSTAMP
1   5   TEST3  150

ID  C2     TSTAMP
1   PROD1  200

ID  Tombstone (marked_deleted)  TSTAMP
2                               250

New SSTable

ID  C1  C2     TSTAMP
1   5   PROD1  300
3   4   TEST3  120

ID  Tombstone (marked_deleted)  TSTAMP
2                               250

gc_grace_seconds reached? No (the tombstone is kept)

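
The effect of gc_grace_seconds on compaction can be sketched as follows. This is a row-level simplification under stated assumptions: each SSTable is a dict of row ID to (timestamp, value), `None` marks a tombstone, and timestamps are treated as seconds. The value 864000 (10 days) is the Cassandra default for gc_grace_seconds.

```python
def compact(sstables, now, gc_grace_seconds):
    # Keep only the newest version of every row across all input SSTables.
    merged = {}
    for table in sstables:
        for row_id, (ts, value) in table.items():
            if row_id not in merged or ts > merged[row_id][0]:
                merged[row_id] = (ts, value)
    # Tombstones are kept until gc_grace_seconds has passed, then dropped.
    return {
        row_id: (ts, value)
        for row_id, (ts, value) in sorted(merged.items())
        if value is not None or now - ts < gc_grace_seconds
    }


sstables = [
    {1: (100, "TEST1")},
    {2: (50, "TEST2")},
    {3: (120, "TEST3")},
    {1: (150, "TEST3")},
    {2: (250, None)},  # tombstone for row 2
]
early = compact(sstables, now=300, gc_grace_seconds=864000)
late = compact(sstables, now=250 + 864000, gc_grace_seconds=864000)
print(early)  # tombstone for row 2 still present
print(late)   # tombstone purged
```

Keeping the tombstone for the grace period gives replicas that missed the delete time to receive it through repair.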
Data Consistency



Data Consistency – Overview

Cassandra offers tunable data consistency for read and write operations

Two types of read requests:
– Direct read request
– Digest read request

Inconsistent data can be repaired automatically by:
– Background read repair request
– NodeSync continuous background repair (DSE 6 only)

Inconsistent data can be repaired manually by:
– Anti-entropy repair



Tunable Consistency

A tradeoff between data consistency and availability

WRITE Consistency Level    READ Consistency Level
ALL                        ALL
EACH_QUORUM                Not supported
QUORUM                     QUORUM
LOCAL_QUORUM               LOCAL_QUORUM
ONE, TWO, THREE            ONE, TWO, THREE
LOCAL_ONE                  LOCAL_ONE
ANY                        Not supported
Not supported              SERIAL
Not supported              LOCAL_SERIAL
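
The tradeoff can be made concrete with the usual quorum arithmetic: a quorum is a majority of the replicas, and a read is guaranteed to see the latest acknowledged write when the read and write replica sets overlap, that is, when R + W > RF. The function names below are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    # Majority of the replicas.
    return replication_factor // 2 + 1


def read_sees_latest_write(read_replicas: int, write_replicas: int, rf: int) -> bool:
    # Read and write sets must share at least one replica.
    return read_replicas + write_replicas > rf


rf = 3
print(quorum(rf))                                          # 2
print(read_sees_latest_write(quorum(rf), quorum(rf), rf))  # True
print(read_sees_latest_write(1, 1, rf))                    # False
```

Writing and reading with QUORUM therefore gives consistent reads at RF=3 while tolerating one unavailable replica, whereas ONE/ONE favors availability and latency at the cost of possibly stale reads.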



Read Requests & Tunable Consistency (1)

One DC, CONSISTENCY=QUORUM, RF=3

[Figure: the coordinator sends a direct read to one replica and a digest read to another; the third replica is covered by a background read repair with read_repair_chance=0.10]
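
The interplay of direct and digest reads can be sketched as follows. It is a simplified model, not the coordinator's actual code: replicas are a dict of name to (timestamp, value), MD5 is used as a stand-in digest function, and the repair is applied synchronously to the contacted replicas.

```python
import hashlib


def digest(value: str) -> str:
    # A digest read returns only a hash of the data, not the data itself.
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def coordinator_read(replicas, contacted):
    # The first contacted replica gets the direct read, the rest digest reads.
    direct, *others = contacted
    _, value = replicas[direct]
    if all(digest(replicas[r][1]) == digest(value) for r in others):
        return value
    # Digest mismatch: compare full data, newest timestamp wins,
    # and stale contacted replicas are repaired.
    newest = max(replicas[r] for r in contacted)
    for r in contacted:
        replicas[r] = newest
    return newest[1]


replicas = {"r1": (100, "old"), "r2": (200, "new"), "r3": (200, "new")}
result = coordinator_read(replicas, ["r1", "r2"])  # QUORUM with RF=3: 2 replicas
print(result, replicas["r1"])
```

Sending digests instead of full rows keeps network traffic low in the common case where the replicas already agree.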



Read Requests & Tunable Consistency (2)

Two DC, CONSISTENCY=QUORUM, RF=3

[Figure: the coordinator sends one direct read and three digest reads to replicas across DC=1 and DC=2]



Write Requests & Tunable Consistency (1)

One DC, CONSISTENCY=ONE, RF=3

Coordinator



Write Requests & Tunable Consistency (2)

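One DC, CONSISTENCY=QUORUM, RF=3

[Figure: the coordinator sends a DELETE to the replicas and stores a hint (hinted handoff) for a replica that is down; deleted data may reappear on that replica as a possible zombie]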



Summary



Summary

Cassandra is a very powerful distributed and decentralized NoSQL database with no single point of failure

It guarantees service and data availability in case of a partitioned network, though the data might be stale

Designed for large data stores which require a performant and scalable system (e.g. IoT)

The application data model needs to be designed for Cassandra

Many ways to interact with the database:
– CQLSH (Cassandra Query Language Shell)
– Drivers and tools provided by DataStax



Trivadis @ DOAG 2018
#opencompany
Booth: 3rd Floor – next to the escalator

We share our know-how!

Just drop by: live presentations and document archive
T-shirts, contest and much more

We look forward to your visit

