How Chaos Engineering Assures The Resilience of Your Services
Elder Moraes
Developer Advocate
@elderjava
Chaos Engineering is the discipline of
experimenting on a system in order to
build confidence in the system’s
capability to withstand turbulent
conditions in production.
http://principlesofchaos.org/
Resilience is the adaptability of a
system when facing changes, failures
and anomalies
Create failures on purpose before they
happen unexpectedly
Find weaknesses and fix them
youtube.eldermoraes.com
applications
Examples of where to inject chaos
Chaos Engineering phases
Steady state
The usual behaviour of a service, measured through a business metric
https://medium.com/netflix-techblog/sps-the-pulse-of-netflix-streaming-ae4db0e05f8a
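A minimal sketch of how a steady state could be expressed in code, assuming the business metric is available as a list of numeric samples. The function names, the sample values and the 3-sigma tolerance are illustrative assumptions, not something taken from the talk.

```python
from statistics import mean, stdev


def steady_state_band(baseline, tolerance_sigmas=3.0):
    """Return (low, high) bounds derived from baseline samples of a business metric."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return mu - tolerance_sigmas * sigma, mu + tolerance_sigmas * sigma


def within_steady_state(observed, band):
    """True if the mean of the observed samples stays inside the band."""
    low, high = band
    return low <= mean(observed) <= high


# Hypothetical samples of a business metric (for example, orders per second).
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
band = steady_state_band(baseline)
```

During an experiment, the same check can be run on samples taken from the chaos group: if `within_steady_state` returns False, the hypothesis that the service withstands the fault is rejected.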
Speaking of metrics…
A metric is a measure used to evaluate,
control and/or select, quantitatively,
a person, an event or an
institution
Metrics & Health Check
RockBalboaService
Back to chaos
Hypothesis
What if:
• A service returns a 404
• A database stops working
• The number of requests spikes
• Latency grows by 100%
• A container is killed
• A port becomes inaccessible
• Etc…
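Some of the "what if" hypotheses above (a failing service, growing latency) can be tried at the application level with a small fault-injection wrapper. This is only a sketch: the `chaos` decorator, the `InjectedFault` exception and the `get_fighter` function are hypothetical names introduced for illustration, not part of any tool mentioned in the talk.

```python
import random
import time
from functools import wraps


class InjectedFault(Exception):
    """Raised in place of a real failure (e.g. an HTTP 404 or a dead database)."""


def chaos(failure_rate=0.0, extra_latency_s=0.0, rng=random.random, sleep=time.sleep):
    """Wrap a call so that it gets extra latency and fails with the given probability."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if extra_latency_s > 0:
                sleep(extra_latency_s)              # hypothesis: "latency grows"
            if rng() < failure_rate:
                raise InjectedFault(func.__name__)  # hypothesis: "a service fails"
            return func(*args, **kwargs)
        return wrapper
    return decorator


@chaos(failure_rate=0.1, extra_latency_s=0.0)
def get_fighter(name):
    # Hypothetical service call, used only for illustration.
    return {"name": name, "status": "ok"}
```

The `rng` and `sleep` parameters exist so that the behaviour can be made deterministic in tests; in a real experiment the defaults would be used.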
Design & Execution
Best practices:
• Start small (baby steps)
• Run as close as possible to the production environment
• Minimize impact as much as possible
• Have an emergency button
• Automate
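The practices above (start small, minimize impact, keep an emergency button, automate) could be combined in an experiment runner like the following. It is a sketch under assumptions: `inject`, `rollback` and `error_rate` are hypothetical hooks the caller would supply, and the step sizes and the 5% abort threshold are arbitrary example values.

```python
def run_experiment(inject, rollback, error_rate, steps=(1, 5, 10), abort_threshold=0.05):
    """Increase the blast radius step by step; stop as soon as errors exceed the threshold.

    inject(percent) -> starts the fault for that percentage of traffic (hypothetical hook)
    rollback()      -> the "emergency button": removes the fault entirely
    error_rate()    -> current error rate of the service, between 0.0 and 1.0
    """
    completed = []
    try:
        for percent in steps:
            inject(percent)
            if error_rate() > abort_threshold:
                return {"aborted_at": percent, "completed": completed}
            completed.append(percent)
        return {"aborted_at": None, "completed": completed}
    finally:
        rollback()  # always clean up, whether the run finished or was aborted
```

Putting `rollback()` in a `finally` block means the fault is removed even if one of the hooks raises an exception, which keeps the emergency button independent of the happy path.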
Design & Execution
[Diagram: Users → Load Balancer → Control group (1%) and Chaos group (1%)]
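One way a load balancer could split users into a 1% control group and a 1% chaos group is to hash a stable user identifier. The sketch below assumes such an identifier exists; the `assign_group` function and the group names are illustrative, not taken from the talk.

```python
import hashlib


def assign_group(user_id, chaos_percent=1.0, control_percent=1.0):
    """Deterministically place a user in 'chaos', 'control' or 'default'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 10_000 / 100.0  # value in [0.00, 99.99]
    if bucket < chaos_percent:
        return "chaos"
    if bucket < chaos_percent + control_percent:
        return "control"
    return "default"
```

Because the assignment depends only on the identifier, a given user stays in the same group for the whole experiment, and the control group gives a baseline of equal size to compare against the chaos group.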
Lessons learned
Fix
“We learn from failure, not
from success”
Dracula, Bram Stoker
Kubernetes &
Chaos Engineering
Kubernetes is perfect for Chaos Engineering
Some tools for Chaos with Kubernetes
• Istio
• Chaos Monkey
• Chaos Kong
• Kube Monkey
Istio
https://istio.io/docs/tasks/traffic-management/fault-injection/
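As an illustration of the Istio fault-injection task linked above, the following sketch builds a VirtualService manifest that delays a percentage of HTTP requests to a service. The helper name, the `ratings` host and the default values are assumptions made for the example, and the right `apiVersion` depends on the Istio release in use.

```python
import json


def delay_fault_virtual_service(host, delay="5s", percent=10.0):
    """Build an Istio VirtualService that delays a percentage of HTTP requests."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",  # may differ per Istio release
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [{
                "fault": {
                    "delay": {"fixedDelay": delay, "percentage": {"value": percent}},
                },
                "route": [{"destination": {"host": host}}],
            }],
        },
    }


# Hypothetical service name, for illustration only.
manifest = delay_fault_virtual_service("ratings", delay="5s", percent=10.0)
manifest_json = json.dumps(manifest, indent=2)
```

The resulting JSON can be saved to a file and applied to a cluster in the same way as a YAML manifest; deleting the VirtualService removes the injected delay.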
Chaos Monkey
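The idea behind this kind of tool, randomly terminating instances to verify that the service survives, can be simulated in a few lines. This is not Chaos Monkey's API; the function names and the in-memory set of instances are assumptions made purely for illustration.

```python
import random


def kill_random_instance(instances, rng=random):
    """Remove one randomly chosen instance from the pool and return its name."""
    victim = rng.choice(sorted(instances))
    instances.discard(victim)
    return victim


def service_available(instances, min_healthy=1):
    """The service counts as available while enough instances remain."""
    return len(instances) >= min_healthy


# Hypothetical pool of three replicas of the same service.
pool = {"pod-a", "pod-b", "pod-c"}
victim = kill_random_instance(pool)
still_up = service_available(pool, min_healthy=2)
```

With three replicas and a minimum of two healthy instances, losing one replica leaves the service available; a second loss would not, which is exactly the kind of weakness such an experiment is meant to reveal.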
Chaos Kong
Kube Monkey
https://github.com/asobti/kube-monkey
“Chaos doesn’t cause
problems. It reveals them.”
Nora Jones, Ex-Senior Chaos Engineer at Netflix
developer.redhat.com
Thank you!