
K8ssandra - Apache Cassandra™ meets Kubernetes!

1
Your presenters

Aleks Volochnev
Developer Advocate at DataStax
• Apache Cassandra™ expert
• Experienced developer and educator
• Certified cloud architect
@hadesarchitect

Jeff Carpenter
Developer Adoption at DataStax
• Apache Cassandra™ author (O’Reilly)
• Experienced architect and developer
@jeffreyscarpenter
@jscarp
2
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Backup and Restore
● Wrapping up

3
Kubernetes Workshop - Homework
https://github.com/datastaxdevs/k8ssandra-workshop/wiki

4
menti.com
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

6
Cloud Application Architecture - 50,000 ft

Applications

Data

Infrastructure

7
Highly Scalable Cloud Application Architecture

Applications

Web / Mobile Apps Microservices

Data

Data Gateway NoSQL Database

Infrastructure
Container Orchestration

Cloud Infrastructure

8
Apache Cassandra™ = NoSQL Distributed Database

1 Installation = 1 NODE
✔ Capacity = ~ 2-4TB
✔ Throughput = LOTS Tx/sec/core

DataCenter | Ring

Communication:
✔ Gossiping

[Diagram: a ring of NODEs forming one DataCenter]

9
Apache Cassandra™ Scales linearly

10
Data is distributed
Country  City         Population
USA      New York      8.000.000
USA      Los Angeles   4.000.000
FR       Paris         2.230.000
DE       Berlin        3.350.000
UK       London        9.200.000
AU       Sydney        4.900.000
DE       Nuremberg       500.000
CA       Toronto       6.200.000
CA       Montreal      4.200.000
FR       Toulouse      1.100.000
JP       Tokyo        37.430.000
IN       Mumbai       20.200.000

Partition Key: Country

11
Data is Distributed
Country  City         Population

USA      New York      8.000.000
USA      Los Angeles   4.000.000

FR       Paris         2.230.000
FR       Toulouse      1.100.000

DE       Berlin        3.350.000
DE       Nuremberg       500.000

UK       London        9.200.000

JP       Tokyo        37.430.000

AU       Sydney        4.900.000

CA       Toronto       6.200.000
CA       Montreal      4.200.000

IN       Mumbai       20.200.000
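The grouping on this slide can be sketched in a few lines of Python. This is only an illustration: the `token_for` helper and the ring size of 100 are assumptions made for the example, and MD5 stands in for Cassandra's default Murmur3 partitioner. The point is that every row with the same partition key (Country) lands in the same partition and therefore on the same node.

```python
import hashlib
from collections import defaultdict

# Rows from the slide: (country, city, population). Country is the partition key.
ROWS = [
    ("USA", "New York", 8_000_000),
    ("USA", "Los Angeles", 4_000_000),
    ("FR", "Paris", 2_230_000),
    ("DE", "Berlin", 3_350_000),
    ("UK", "London", 9_200_000),
    ("AU", "Sydney", 4_900_000),
    ("DE", "Nuremberg", 500_000),
    ("CA", "Toronto", 6_200_000),
    ("CA", "Montreal", 4_200_000),
    ("FR", "Toulouse", 1_100_000),
    ("JP", "Tokyo", 37_430_000),
    ("IN", "Mumbai", 20_200_000),
]


def token_for(partition_key: str, ring_size: int = 100) -> int:
    """Hash a partition key to a token in [0, ring_size).

    Assumption: MD5 is a stand-in; Cassandra's default partitioner is Murmur3.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ring_size


def group_by_partition(rows):
    """Collect rows that share a partition key into one partition."""
    partitions = defaultdict(list)
    for country, city, population in rows:
        partitions[country].append((city, population))
    return dict(partitions)


partitions = group_by_partition(ROWS)
for country, cities in partitions.items():
    print(country, "token", token_for(country), cities)
```

Because the token depends only on the partition key, both USA rows always hash to the same token, whatever node ends up owning it.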

12
Data is Replicated

RF = 3

Replication Factor 3 means that every row is stored on 3 different nodes.

[Ring diagram: nodes at tokens 17, 33, 50, 67, 83]

13
Replication within the Ring

RF = 3

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; data item with token 59]

14
Replication within the Ring

RF = 3

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; data item with token 59]

15
Replication within the Ring

RF = 3

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; three copies of data item 59]
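A minimal sketch of how the three replicas for token 59 could be chosen, assuming one token per node and a simple clockwise walk around the ring (real clusters use virtual nodes and rack-aware placement, so treat this as a model of the slide rather than of Cassandra's implementation). The node tokens come from the slides; `replicas_for` is a name invented for this example.

```python
from bisect import bisect_left

# Node tokens from the slides, kept sorted so we can binary-search them.
RING = [0, 17, 33, 50, 67, 83]


def replicas_for(token, ring=RING, rf=3):
    """Return the rf nodes that store a row with the given token.

    The first replica is the first node whose token is >= the row's token
    (wrapping around to the start of the ring); the remaining replicas are
    the next nodes clockwise.
    """
    start = bisect_left(ring, token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(rf)]


print(replicas_for(59))  # [67, 83, 0]
```

With RF = 3 the row with token 59 is owned by node 67 and copied to nodes 83 and 0, wrapping past the end of the ring.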

16
Node Failure

RF = 3

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; one of the three replicas for data item 59 is down, so a Hint is stored for it]

17
Node Failure Recovered

RF = 3

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; the failed node is back and the Hint is replayed, so all three replicas hold data item 59]
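The failure and recovery slides describe hinted handoff. Below is a toy model of the idea, not Cassandra's actual code: the `Coordinator` class, its in-memory dictionaries, and the choice of node 83 as the failed replica are all assumptions made for illustration.

```python
class Coordinator:
    """Toy coordinator that keeps hints for replicas that are down."""

    def __init__(self, nodes):
        self.storage = {node: {} for node in nodes}  # node -> {key: value}
        self.up = {node: True for node in nodes}     # node -> availability
        self.hints = []                              # (target node, key, value)

    def write(self, replicas, key, value):
        """Write to every live replica; store a hint for each dead one."""
        for node in replicas:
            if self.up[node]:
                self.storage[node][key] = value
            else:
                self.hints.append((node, key, value))

    def recover(self, node):
        """Mark a node as up again and replay the hints addressed to it."""
        self.up[node] = True
        remaining = []
        for target, key, value in self.hints:
            if target == node:
                self.storage[node][key] = value
            else:
                remaining.append((target, key, value))
        self.hints = remaining


coordinator = Coordinator([0, 17, 33, 50, 67, 83])
coordinator.up[83] = False                      # assumed: node 83 fails
coordinator.write([67, 83, 0], 59, "data")      # two replicas succeed, one hint
coordinator.recover(83)                         # hint is replayed on recovery
print(coordinator.storage[83])
```

Hints are kept only for a limited time in a real cluster, which is why repairs (covered later in the deck) are still needed.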

18
Topology View - Data Centers and Racks

Data Center A (Region)

Rack 1 (AZ): nodes 0, 50
Rack 2 (AZ): nodes 17, 67
Rack 3 (AZ): nodes 33, 83

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83]

Data Distributed Everywhere
• Geographic Distribution
• Hybrid-Cloud and Multi-Cloud

On-premise

20
References (2019)

21
"Kubernetes is an open-source system for automating
deployment, scaling, and management of containerized
applications."

22
Why containers?

23
Why Kubernetes ?
● Batch execution
● Storage orchestration
● Automatic bin packing
● Secret and configuration management
● Self-healing
● Automated rollouts and rollbacks
● Horizontal scaling
● Service discovery and load balancing

24
K8s Infrastructure

Cluster:
Kubernetes cluster.

Master:
Kubernetes Control Plane.

Node:
Worker machine

[Diagram: Kubernetes Control Plane exposing a REST API]

25
K8s Infrastructure - Control Plane
K8s API Server
Kubernetes API (REST API)

Controller Manager
Kubernetes controller manager (controller loop)

Scheduler
In charge of placing pods (binds pod to node)

ETCD
Kubernetes’ backing store (key/value DB, single source of truth)

26
K8s Infrastructure - Worker Node

Kubelet:
The kubelet is the primary “node agent”
that runs on each node.

Kube-proxy:
The Kubernetes network proxy runs on each
node. This reflects services as defined in the
Kubernetes API on each node.

POD:
Collection of containers that can
run on a host.
This resource is created by clients
and scheduled onto hosts.

[Diagram: Kubernetes Workers, each stacking container / Containerd / Operating System]

27
Your Task: Deploy C* to K8s. You have 90 minutes. Go!

container

Containerd
Operating System

Kubernetes Workers

28
So much to do, so little time...

● Set up networking
● Set up storage
● Set up firewalls
● Install Cassandra on node1
● Install Cassandra on node2
● Install Cassandra on node3
● Install Prometheus and Grafana
● Install agents for management
● Configure backups
● Configure repairs
● Connect our app
● Install all user agents
● and SO MUCH MORE!!!!!

Just Kidding!!!

29
30
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

31
Full working scalable database with administration tools and easy data access

“Cloud native”

32
K8ssandra Components
Traefik: Kubernetes ingress for external access

Stargate: Data Gateway providing REST, GraphQL, Document APIs

Apache Cassandra: Scalable cloud-native database managed via cass-operator

Reaper, Medusa: Cassandra utilities for repair and backup/restore

Prometheus, Grafana: Metrics aggregation and visualization

Helm: Packaged and delivered via Helm charts



33
K8ssandra Components in Context

[Architecture diagram: Users and Web / Mobile Apps reach the Kubernetes Cluster through Ingress. Inside the cluster: Microservices, Data Gateway, NoSQL DB, Reaper (Repair), Medusa (Backup / Restore, backed by Object Storage), Metrics DB, Metrics UI, Operators, Deployment.]

34
Setting up K8ssandra
helm repo add k8ssandra https://helm.k8ssandra.io/stable
helm repo add traefik https://helm.traefik.io/traefik
helm repo update
helm install k8ssandra k8ssandra/k8ssandra -f k8ssandra.yaml

k8ssandra.yaml:

    cassandra:
      version: "3.11.10"
      cassandraLibDirVolume:
        storageClass: local-path
        size: 5Gi
      allowMultipleNodesPerWorker: true
      heap:
        size: 1G
        newGenSize: 1G
      resources:
        requests:
          cpu: 1000m
          memory: 2Gi
        limits:
          cpu: 1000m
          memory: 2Gi
      datacenters:
      - name: dc1
        size: 1
        racks:
        - name: default
    kube-prometheus-stack:
      grafana:
        adminUser: admin
        adminPassword: admin123
    stargate:
      enabled: true
      replicas: 1
      heapMB: 256
      cpuReqMillicores: 200
      cpuLimMillicores: 1000

https://k8ssandra.io/docs/getting-started/
https://k8ssandra.io/docs/getting-started/developer/
https://k8ssandra.io/docs/getting-started/site-engineer/

35
Demo #1
• Setting up K8ssandra

36
Kubernetes Namespaces

Namespace: Namespace provides a
scope for Names. Use of multiple
namespaces is optional.

We create resources in
namespaces that span across
nodes.

[Diagram: namespaces kube-public, default, kube-system, k8ssandra]

37
K8s Primitives : Storage
PersistentVolume: a storage
resource provisioned by an
administrator or dynamic provisioner.

PersistentVolumeClaim: a user's
request for and claim to a persistent
volume.

StorageClass: describes
the parameters for a class of storage
for which PersistentVolumes can be
dynamically provisioned.

[Diagram: storage resources in the k8ssandra namespace]

38
K8s Primitives : StatefulSet
StatefulSet: StatefulSet represents a
set of pods with consistent
identities. Identities are defined as:
network, storage.

[Diagram: cass-operator creates the StatefulSet, which creates the pods]

39
K8s Primitives : Custom Resources
Custom Resource Definition:
Extensions of the Kubernetes API.
Customization making K8s more modular

• Spec: declares the desired state of a resource
○ Configuration settings provided by the user
○ Default values expanded by the system
○ Other properties initialized by other internal
components after resource creation.

• Status: describes the object’s current, observed state.
○ Kubernetes API server provides a REST API to
clients. A Kubernetes object or resource is a REST
resource.
○ The status of a Kubernetes resource is typically
implemented as a REST subresource that can
only be modified by internal, system components

    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: cassandradatacenters.cassandra.datastax.com
    spec:
      group: cassandra.datastax.com
      names:
        kind: CassandraDatacenter
        listKind: CassandraDatacenterList
        plural: CassandraDatacenters
        shortNames:
        - cassdc
        - cassdcs
        singular: CassandraDatacenter
      scope: Namespaced
      subresources:
        status: {}
    ...

• Examples used in K8ssandra:
○ CassandraDatacenter (cass-operator)
○ CassandraBackup, CassandraRestore (cassandra-medusa)
○ Reaper (reaper)
○ GrafanaDataSource (grafana)
40
K8s Primitives : Operator
Building an application and driving an application on top
of Kubernetes, behind Kubernetes APIs

“A Kubernetes Operator helps extend the types of
applications that can run on Kubernetes by allowing
developers to provide additional knowledge to
applications that need to maintain state.” –Jonathan S.
Katz

● Reconcile CRD instances toward the state defined within
the “spec” attribute.

● Listen for events and status evolution to react
accordingly.

• Examples used in K8ssandra:
○ cass-operator (cass-operator)
○ reaper-operator (cassandra-reaper)
○ PrometheusOperator (Prometheus)

[Diagram: the Operator (cass-operator) watches states and events and reconciles]
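The reconcile idea can be sketched as a small loop that compares the desired state in the spec with the observed state in the status and takes one step at a time. This is a simplified model, not cass-operator code: the `Spec` and `Status` classes, the action names, and `run_until_converged` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    size: int  # desired number of Cassandra nodes (declared by the user)


@dataclass
class Status:
    ready_nodes: int  # observed number of ready nodes (reported by the system)


def reconcile(spec: Spec, status: Status) -> str:
    """Return the single next action that moves observed state toward desired."""
    if status.ready_nodes < spec.size:
        return "add-node"
    if status.ready_nodes > spec.size:
        return "remove-node"
    return "noop"


def run_until_converged(spec: Spec, status: Status, max_steps: int = 100):
    """Apply one action per iteration until spec and status agree."""
    actions = []
    for _ in range(max_steps):
        action = reconcile(spec, status)
        if action == "noop":
            break
        # Simulate the effect of the action on the observed state.
        status.ready_nodes += 1 if action == "add-node" else -1
        actions.append(action)
    return actions


print(run_until_converged(Spec(size=3), Status(ready_nodes=1)))
```

Taking one action per pass mirrors the "only one node bootstrapping at a time" behaviour listed among the cass-operator features later in the deck.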

41
Kubernetes Probes (readiness, liveness)
Cassandra Management API Service
https://github.com/datastax/management-api-for-apache-cassandra

CASSANDRA

mgmt-api
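A rough sketch of what a readiness or liveness check against the Management API might look like from Python. The path `/api/v0/probes/<kind>` and port 8080 are stated here from memory as assumptions; check them against the repository linked above before relying on them. The helper names are invented for this example.

```python
import urllib.error
import urllib.request


def probe_url(host: str, kind: str, port: int = 8080) -> str:
    """Build the probe URL (assumed path layout: /api/v0/probes/<kind>)."""
    if kind not in ("liveness", "readiness"):
        raise ValueError(f"unknown probe kind: {kind}")
    return f"http://{host}:{port}/api/v0/probes/{kind}"


def is_healthy(host: str, kind: str, timeout: float = 2.0) -> bool:
    """Return True if the probe endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(probe_url(host, kind), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer all count as unhealthy.
        return False


print(probe_url("localhost", "readiness"))
```

In a cluster the kubelet performs this HTTP check itself; the sketch only shows what the probe amounts to.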

42
Sample Deployment with Cass-operator
[Diagram: resources in a sample deployment]
cass-operator (Operator, WebHook)
dc1
cluster1-superuser (Security)
cluster1-dc1-default-sts
cluster1-dc1-default-sts-0
server-data-cluster1-dc1-default-sts-0
server-storage
cluster1-dc1-service
cluster1-dc1-seed-service
cluster1-dc1-all-pods-service
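The resource names on this slide follow a visible pattern built from the cluster, datacenter, and rack names. The helpers below reproduce that pattern as read off the slide; they are illustrative functions, not part of cass-operator, and the pattern is inferred from this one example.

```python
def statefulset_name(cluster: str, dc: str, rack: str) -> str:
    """One StatefulSet per rack: <cluster>-<dc>-<rack>-sts."""
    return f"{cluster}-{dc}-{rack}-sts"


def pod_name(cluster: str, dc: str, rack: str, ordinal: int) -> str:
    """StatefulSet pods get a stable ordinal suffix."""
    return f"{statefulset_name(cluster, dc, rack)}-{ordinal}"


def pvc_name(cluster: str, dc: str, rack: str, ordinal: int) -> str:
    """Each pod's data volume claim is prefixed with server-data-."""
    return f"server-data-{pod_name(cluster, dc, rack, ordinal)}"


def service_names(cluster: str, dc: str) -> list:
    """Services shown on the slide for one datacenter."""
    prefix = f"{cluster}-{dc}"
    return [f"{prefix}-service", f"{prefix}-seed-service", f"{prefix}-all-pods-service"]


print(pod_name("cluster1", "dc1", "default", 0))
print(pvc_name("cluster1", "dc1", "default", 0))
print(service_names("cluster1", "dc1"))
```

Stable pod and volume names are what let a restarted Cassandra pod come back with the same identity and the same data.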

43
Remember this?

container

Containerd
Operating System

Kubernetes Workers

44
Deploying a Cassandra Cluster in Kubernetes, Simplified

● Set up networking
● Set up storage
● Set up firewalls
● Install Cassandra on node1
● Install Cassandra on node2
● Install Cassandra on node3
● Install Prometheus and Grafana
● Install agents for management
● Configure backups
● Configure repairs
● Connect our app
● Install all user agents
● and SO MUCH MORE!!!!!

Cass operator features:

● Proper token ring initialization, with only one node
bootstrapping at a time
● Seed node management - one per rack, or three per
datacenter, whichever is more
● Server configuration integrated into the
CassandraDatacenter CRD
● Rolling reboot nodes by changing the CRD
● Store data in a rack-safe way - one replica per cloud
AZ
● Scale up racks evenly with new nodes
● Replace dead/unrecoverable nodes
● Multi DC clusters (limited to one Kubernetes
namespace)
● Scaling up and down simply

Normal K8s Deployment stuff + best practices for C* ops
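Two of the cass-operator rules listed above are simple enough to express as arithmetic. The functions below are a sketch of those rules as worded on the slide, not the operator's implementation; capping the seed count at the number of nodes is an added assumption, and both function names are invented.

```python
from typing import List


def seed_count(racks: int, nodes: int) -> int:
    """One seed per rack, or three per datacenter, whichever is more.

    Assumption: a datacenter cannot have more seeds than nodes.
    """
    return min(nodes, max(racks, 3))


def rack_sizes(total_nodes: int, racks: int) -> List[int]:
    """Spread nodes across racks so sizes differ by at most one."""
    if racks < 1:
        raise ValueError("at least one rack is required")
    base, extra = divmod(total_nodes, racks)
    return [base + (1 if i < extra else 0) for i in range(racks)]


print(seed_count(racks=3, nodes=9))   # 3
print(seed_count(racks=5, nodes=10))  # 5
print(rack_sizes(7, 3))               # [3, 2, 2]
```

Even rack sizes matter because, with one replica per rack (AZ), an unbalanced rack would hold a disproportionate share of the data.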


45
Example Deployment Across Multiple K8s Workers
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

47
Monitoring (Kube-prometheus-stack)
https://github.com/datastax/metric-collector-for-apache-cassandra

48
Monitoring

CQL Endpoint

Cass-Operator
Apache
Cassandra®

Metrics
Collector
(MCAC)
Traefik UI

Prometheus
UI
Grafana

49
Demo #2
• Monitoring Cassandra with
Prometheus and Grafana

50
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

51
Accessing the Data

Reaper (repair)
Reaper UI
Cassandra Medusa (backup/restore)

📁S3, GCP,...

CQL Endpoint

Cass-Operator
Apache
Cassandra®

Metrics
Collector
(MCAC)
Traefik UI

Prometheus
UI
Grafana

52
Demo #3
• Working with Data

53
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

54
Two ways to scale a K8ssandra Cluster

1. Config files 2. Simple command

helm upgrade demo k8ssandra/k8ssandra --set k8ssandra.size=3 --reuse-values

https://k8ssandra.io/docs/topics/provision-a-cluster/
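If the scaling step is scripted, the command above can be assembled programmatically. The helper below is a hypothetical convenience wrapper that only builds the argument list shown on the slide; it does not call Helm itself.

```python
import shlex


def scale_command(release: str, size: int, chart: str = "k8ssandra/k8ssandra") -> list:
    """Build the helm upgrade arguments from the slide for a given cluster size."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [
        "helm", "upgrade", release, chart,
        "--set", f"k8ssandra.size={size}",
        "--reuse-values",
    ]


cmd = scale_command("demo", 3)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, assuming `helm` is installed and the `demo` release exists.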

55
Demo #4
• Scaling Up and Down

56
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

57
Stargate Rationale

“Start with Why” Simon Sinek

Developers
● Do you like learning query languages?
● Do you care about how your data is stored?
● Do you like installing and running databases locally?

SREs, Operators, DBAs
● Do you allow developers to execute direct queries against your DB?
● Do you like opening port ranges like 0-65535 for apps?

Architects, CIOs
● Do you like creating dedicated projects and hiring people just to create APIs?

58
And when those 2 meet each other...

SRE Developers

59
Stargate APIs

APIs: REST, GraphQL, Document, CQL, Kafka (incubating)

Backends: Cassandra 3.11, Cassandra 4.0, DSE 6.8
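To give a feel for the REST API, here is a sketch that builds (but does not send) the two HTTP requests a client would typically make: one to obtain an auth token and one to read a row. The ports (8081 for auth, 8082 for REST), the `/v1/auth` and `/v2/keyspaces/...` paths, and the `X-Cassandra-Token` header reflect my understanding of Stargate's defaults and should be treated as assumptions; the keyspace and table names are made up.

```python
import json
import urllib.request


def auth_request(host: str, username: str, password: str) -> urllib.request.Request:
    """POST credentials to the auth endpoint to obtain a token (assumed port 8081)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:8081/v1/auth",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def row_request(host: str, token: str, keyspace: str, table: str, key: str) -> urllib.request.Request:
    """GET a row by primary key through the REST API (assumed port 8082)."""
    return urllib.request.Request(
        f"http://{host}:8082/v2/keyspaces/{keyspace}/{table}/{key}",
        headers={"X-Cassandra-Token": token},
        method="GET",
    )


req = row_request("localhost", "my-token", "demo_ks", "cities", "USA")
print(req.get_method(), req.full_url)
```

Sending either request with `urllib.request.urlopen(req)` requires a running Stargate instance; no CQL driver or query language is involved on the client side.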

60
Data Access

Reaper (repair)
Reaper UI
Cassandra Medusa (backup/restore)

GraphQL endpoint + Playground
REST/doc API + Swagger
Authentication Endpoint

Cass-Operator
Apache Cassandra®

CQL Endpoint

Metrics
Collector

Traefik UI

Prometheus
UI
Grafana

61
Demo #5
• Stargate

62
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

63
Repairs in Cassandra

● Repairs are the “last resort” to
keep data across a Cassandra
cluster in sync
● If repairs don’t happen the health
of the data will slowly degrade
over time due to entropy

“Without repairs, your database
is hopefully consistent”
- Vinay Kumar Chella, Principal
Cloud Database Architect,
Netflix

[Ring diagram: nodes at tokens 0, 17, 33, 50, 67, 83; three copies of data item 59; caption: “Hint times out… now what?”]
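What a repair achieves can be illustrated with a toy anti-entropy pass: compare replicas by digest and, where they differ, bring every replica up to the newest version of each cell. This is a deliberately simplified model (real Cassandra repair compares Merkle trees per token range); the data layout and function names are assumptions for the example.

```python
import hashlib


def digest(rows: dict) -> str:
    """Hash a replica's contents so replicas can be compared cheaply."""
    h = hashlib.sha256()
    for key in sorted(rows):
        h.update(repr((key, rows[key])).encode("utf-8"))
    return h.hexdigest()


def repair(replicas: list) -> int:
    """Synchronize replicas in place; return how many cells were rewritten.

    Each replica maps key -> (timestamp, value). The highest timestamp wins.
    """
    if len({digest(r) for r in replicas}) == 1:
        return 0  # all replicas already agree
    merged = {}
    for replica in replicas:
        for key, (ts, value) in replica.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    changed = 0
    for replica in replicas:
        for key, cell in merged.items():
            if replica.get(key) != cell:
                replica[key] = cell
                changed += 1
    return changed


node_a = {59: (2, "new")}   # has the latest write
node_b = {59: (1, "old")}   # missed the latest write
node_c = {}                 # missed the row entirely (hint timed out)
print(repair([node_a, node_b, node_c]))  # 2
```

After the pass all three replicas hold the same cell, which is the state hinted handoff alone could not guarantee once the hint expired.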
64
Repair
Reaper (repair)
Reaper UI

CQL Endpoint

Cass-Operator
Apache
Cassandra®

Metrics
Collector
(MCAC)
Traefik UI

Prometheus
UI
Grafana

65
Cassandra Reaper

66
Demo #6
• Running Repairs

67
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Backup and Restore
● Wrapping up

68
Backup and restore
Backups are a work in progress but getting close. In their current form they
are executed by kicking off a helm install job. The following is an example
command.

helm install k8ssandra-backup datastax/k8ssandra-backup --set name=backup

The first backups in development have already happened and changes will
be merged soon. If you want to try them out in beta they will be accessible
in the upcoming weeks.
Backups
Cassandra
Medusa
(backup/restore)

📁S3, GCP,...

CQL Endpoint

Cass-Operator
Apache
Cassandra®

Metrics
Collector
(MCAC)
Traefik UI

Prometheus
UI
Grafana

70
Demo #7
• Backups
K8ssandra - Apache
Cassandra™ meets Kubernetes!

● Housekeeping and Setup


● Cassandra and Kubernetes Basics
● Introducing K8ssandra
● Monitoring Cassandra
● Working with Data
● Scaling Up and Down
● API Access with Stargate
● Running Cassandra Repairs
● Wrapping up

72
Reaper (repair)
Reaper UI
Cassandra Medusa (backup/restore)

📁S3, GCP,...

GraphQL endpoint + Playground
REST/doc API + Swagger
Authentication Endpoint

Cass-Operator
Apache Cassandra®

CQL Endpoint

Metrics
Collector

Traefik UI

Prometheus
UI
Grafana

73
Kubernetes Workshop - Homework
https://github.com/datastaxdevs/k8ssandra-workshop/wiki

74
Join the K8ssandra Community!

k8ssandra.io
@k8ssandra

github.com/k8ssandra

Forums coming soon!

75
Cassandra on Kubernetes Certification!

Sign up for news and updates at https://datastax.com/dev/certifications

● Cassandra 3.x Developer Certification
● Cassandra 3.x Administrator Certification
● Cassandra on Kubernetes Certification (Coming Soon!)

76
We’d love to hear from you

Aleks Volochnev
Developer Advocate at DataStax
@hadesarchitect
https://calendly.com/aleks-volochnev

Jeff Carpenter
Developer Adoption at DataStax
@jeffreyscarpenter
@jscarp
https://calendly.com/jeffrey-carpenter

77
menti.com
Thank you!
@hadesarchitect
@jeffreyscarpenter
@jscarp
