APACHE CASSANDRA
Gökhan Atıl
GÖKHAN ATIL
➤ Database Administrator
➤ Twitter: @gokhanatil
INTRODUCTION TO APACHE CASSANDRA
➤ What is Apache Cassandra? Why to use it?
➤ Cassandra Architecture
➤ Cassandra nodetool
WHAT IS APACHE CASSANDRA? WHY TO USE IT?
➤ Fast, distributed (column-family NoSQL) database: high availability, linear scalability, high performance
➤ Fault tolerant on Commodity Hardware
➤ Multi-Data Center Support
➤ Easy to operate
➤ Proven: CERN, Netflix, eBay, GitHub, Instagram, Reddit
HIGH AVAILABILITY: CAP THEOREM AND CASSANDRA
[Figure: CAP theorem diagram; surviving labels: RDBMS, Availability]
HIGH AVAILABILITY: THE RING
NO MASTER, NO SLAVE: PEER TO PEER
[Figure: ring of nodes exchanging gossip messages ("I'm online!")]
LINEAR SCALABILITY
CASSANDRA ARCHITECTURE
CASSANDRA PARTITIONS
REPLICATION FACTOR
[Figure: the partition key (EMAIL, "gokhan@") is hashed by Murmur3Partitioner to a token (60 in the example)]
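As a rough illustration of how a partitioner and the replication factor decide where a row lives, the sketch below places a key on a small token ring. It makes several assumptions that are not in the slides: MD5 stands in for Murmur3 (which is not in the Python standard library), tokens are squeezed into 0..99 instead of the real 64-bit range, the four-node `ring` is hypothetical, and replicas are simply the next nodes clockwise, without any rack or data center awareness.

```python
import bisect
import hashlib


def token_for(partition_key, ring_size=100):
    """Map a partition key to a token in [0, ring_size).

    Assumption: MD5 is a stand-in for Murmur3, which the standard library lacks.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ring_size


def replicas_for(token, ring, replication_factor):
    """Return the nodes holding a token: the owner plus the next nodes clockwise."""
    node_tokens = sorted(ring)
    # The owner is the first node whose token is >= the key's token (wrapping around).
    start = bisect.bisect_left(node_tokens, token) % len(node_tokens)
    count = min(replication_factor, len(node_tokens))
    return [ring[node_tokens[(start + i) % len(node_tokens)]] for i in range(count)]


# Hypothetical four-node ring; each node owns the range that ends at its token.
ring = {25: "node1", 50: "node2", 75: "node3", 100: "node4"}

print(replicas_for(60, ring, replication_factor=3))  # ['node3', 'node4', 'node1']
print(replicas_for(token_for("some partition key"), ring, replication_factor=3))
```

With token 60 and a replication factor of 3, the row lands on the node that owns the range containing 60 and on the two nodes that follow it on the ring.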
WRITE PATH (CLUSTER)
[Figure: client -> coordinator -> nodes; hinted handoff]
WRITE PATH (NODE)
[Figure: memtable (memory) -> flush -> disk -> compaction]
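The node-level write path (memtable, flush, compaction) can be sketched as a toy model. This is only an illustration under stated assumptions: the `Node` class, the `memtable_limit` threshold, and the dictionary-based "files" are hypothetical, and the commit log, tombstones, and real on-disk formats are left out entirely.

```python
class Node:
    """Toy write path: a memtable in memory and immutable sorted files on 'disk'."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # key -> (timestamp, value), held in memory
        self.sstables = []   # flushed files, oldest first (stands in for the disk)
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        """Writes only touch the memtable; a full memtable triggers a flush."""
        self.memtable[key] = (timestamp, value)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted, immutable file and start a new one."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def compact(self):
        """Merge all files into one, keeping the newest timestamp for each key."""
        merged = {}
        for table in self.sstables:
            for key, (ts, value) in table.items():
                if key not in merged or ts > merged[key][0]:
                    merged[key] = (ts, value)
        self.sstables = [dict(sorted(merged.items()))] if merged else []


node = Node(memtable_limit=2)
node.write("a", "1", timestamp=1)
node.write("b", "2", timestamp=2)   # memtable is full: first flush
node.write("a", "3", timestamp=3)
node.write("c", "4", timestamp=4)   # second flush
print(len(node.sstables))           # 2
node.compact()
print(node.sstables)                # [{'a': (3, '3'), 'b': (2, '2'), 'c': (4, '4')}]
```

After compaction the two flushed files collapse into one, and the older value of `a` is gone because the newer timestamp wins.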
13
READ PATH (CLUSTER)
[Figure: client -> coordinator; the coordinator requests the data from one node and digests from the others]
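➤ Read repair: when the digests returned by replicas do not match, the coordinator compares timestamps and writes the newest value back to the stale replicas during the read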
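A minimal sketch of read repair using digests and timestamps follows. It is a simplification under stated assumptions: replicas are plain dictionaries mapping a key to `(timestamp, value)`, every replica is assumed to hold the key, MD5 is used as the digest, and the function names are hypothetical rather than part of Cassandra.

```python
import hashlib


def digest(value):
    """Short hash so replicas can be compared without shipping full rows."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def read_with_repair(replicas, key):
    """Read `key` from a list of replica dicts of the form key -> (timestamp, value)."""
    # Full data comes from the first replica, digests from the others.
    _, data_value = replicas[0][key]
    digests = [digest(replica[key][1]) for replica in replicas[1:]]
    if all(d == digest(data_value) for d in digests):
        return data_value
    # Mismatch: compare full cells, keep the newest timestamp, repair stale replicas.
    newest = max((replica[key] for replica in replicas), key=lambda cell: cell[0])
    for replica in replicas:
        if replica[key] != newest:
            replica[key] = newest
    return newest[1]


replicas = [{"k": (1, "old")}, {"k": (2, "new")}, {"k": (2, "new")}]
print(read_with_repair(replicas, "k"))  # new
print(replicas[0])                      # {'k': (2, 'new')}
```

The stale first replica is overwritten as a side effect of the read, which is the behavior the bullet above describes.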
14
READ PATH (NODE)
[Figure: a read checks the memtable and the row (read) cache in memory; if the row is not found there, it is read from disk]
CONSISTENCY LEVELS
CASSANDRA QUERY LANGUAGE (CQL)
➤ Create a table (email is the partition key, name is the clustering key):
create table demo.democlients ( email text, name text,
phone text, primary key (email, name));
➤ Alter a table:
alter table democlients add money int;
➤ Remove a table:
drop table democlients;
➤ Remove all rows in a table:
truncate table democlients;
➤ Retrieve rows:
select * from democlients where name='Gokhan Atil'
ALLOW FILTERING; -- or create a secondary index
➤ Retrieve distinct values (DISTINCT is limited to partition key columns; here EMAIL is the partition key):
select distinct email from democlients;
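To see why a query on name alone needs ALLOW FILTERING (or a secondary index) while a query on email does not, it can help to model the table as nested dictionaries. This is only a mental model with hypothetical helper names and made-up sample rows; it ignores distribution across nodes and on-disk storage.

```python
from collections import defaultdict

# partition key (email) -> {clustering key (name) -> remaining columns}
table = defaultdict(dict)


def insert(email, name, phone):
    table[email][name] = {"phone": phone}


def select_by_email(email):
    """Partition key given: a single partition is read, rows ordered by clustering key."""
    return [(email, name, cols) for name, cols in sorted(table.get(email, {}).items())]


def select_by_name(name):
    """No partition key given: every partition must be scanned (the ALLOW FILTERING case)."""
    return [(email, name, rows[name]) for email, rows in table.items() if name in rows]


def select_distinct_emails():
    """DISTINCT on the partition key: one entry per partition."""
    return list(table)


# Made-up sample rows for illustration.
insert("ann@example.com", "Ann", "111")
insert("bob@example.com", "Bob", "222")
insert("bob@example.com", "Ann", "333")

print(select_by_email("bob@example.com"))
print(select_by_name("Ann"))
print(select_distinct_emails())  # ['ann@example.com', 'bob@example.com']
```

The lookup by email touches one dictionary entry, whereas the lookup by name has to walk all of them, which is why the second form becomes expensive on a large cluster.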
➤ Update records:
update democlients set phone='535' where
email='gokhan at gokhanatil.com' and
name='Gokhan' IF EXISTS;
➤ Update records with a condition:
update democlients set money=20
where email='gokhan at gokhanatil.com' and name='Gokhan Atil'
IF phone='542';
➤ Delete rows:
delete from democlients
where email='gokhan at gokhanatil.com' and name='Gokhan Atil'
IF EXISTS; -- a conditional delete must specify the full primary key
➤ Delete row with a condition:
delete from democlients
where email='gokhan at gokhanatil.com' and name='Gokhan Atil'
IF money > 10;
➤ Delete columns in a row:
delete money from democlients
where email='gokhan at gokhanatil.com' and name='Gokhan Atil';
CASSANDRA DATA MODELING
➤ Query-Driven Data Modeling
➤ Use Denormalization
HOW TO INSTALL AND RUN CASSANDRA?
HOW TO INSTALL AND RUN CASSANDRA CLUSTER?
➤ Make sure you have JDK (8u40 or newer) installed
➤ Download apache-cassandra-VERSION-bin.tar.gz
➤ Extract the file to a folder
➤ Make data and logs directories in cassandra folder
➤ Run bin/cassandra
➤ Alternatively, use Docker to pull the latest image:
docker pull cassandra
➤ Run it as standalone:
docker run --name cas1 -p 9042:9042 \
  -e CASSANDRA_CLUSTER_NAME=MyCluster -d cassandra
CASSANDRA NODETOOL
➤ Get a quick summary of the node:
nodetool info
➤ Get status of the cluster/keyspace:
nodetool status <keyspace_name>
➤ Repair a node (for example, weekly during off-peak hours):
nodetool repair
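➤ Decommission a node (run it on the node that is leaving the cluster; it takes no node ID):
nodetool decommission
➤ Remove a node that is already down, by its host ID:
nodetool removenode <node_UUID>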
BACKUP AND RECOVERY
➤ Back up a cluster:
1. Take a snapshot on each node (nodetool snapshot).
2. Copy the snapshots to separate storage (for example, an S3 bucket).
3. Clear the local snapshots (nodetool clearsnapshot).
➤ Restore node(s):
1. Make sure the schema exists.
2. Truncate the table.
3. Copy the most recent snapshot files to a directory whose path ends in "keyspace/tablename", then run:
sstableloader -d <nodeip> keyspace/tablename
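The directory preparation for sstableloader can be sketched as a small helper. This is a hedged illustration, not an official tool: the function name `stage_snapshot` and all of its parameters are hypothetical, and it assumes snapshots are stored under `<data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/`, which should be checked against the actual installation.

```python
import shutil
from pathlib import Path


def stage_snapshot(data_dir, keyspace, table, tag, staging):
    """Copy one table's snapshot files into <staging>/<keyspace>/<table>/.

    Assumption: snapshots live under
    <data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/.
    """
    matches = sorted((Path(data_dir) / keyspace).glob(f"{table}-*/snapshots/{tag}"))
    if not matches:
        raise FileNotFoundError(f"no snapshot '{tag}' found for {keyspace}.{table}")
    # If several table directories match, this takes the last one by name;
    # pick the right one explicitly in practice.
    source = matches[-1]
    target = Path(staging) / keyspace / table
    target.mkdir(parents=True, exist_ok=True)
    for item in source.iterdir():
        if item.is_file():
            shutil.copy2(item, target / item.name)
    return target
```

The returned path ends in keyspace/tablename, so it can be passed as the last argument of the sstableloader command shown above.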
BUILD A BACKUP NODE
➤ Use multi-DC replication:
CREATE KEYSPACE "MyKeyspace"
WITH replication = {
'class' : 'NetworkTopologyStrategy',
'datacenter1' : 3, 'datacenter2' : 1 };
[Figure: client, cluster with RF=3, snapshots]
QUESTIONS?
Blog: www.gokhanatil.com Twitter: @gokhanatil