
Workshop: Resilience Testing
Geoffrey Arij van der Tas
Mark Abrahams
About us!
About Today
• Set up our lab environment
• Theory:
• What is Resilience?
• Resilience
• How to test Resilience?
• Load Testing
• Resilience Testing
• Automated Resilience Testing
What is Resilience?
“Resilience is the ability of a system to withstand a major disruption within acceptable degradation parameters and to recover within an acceptable time and composite costs and risks.”
Becoming Resilient

Infra examples:
• Load Balancing
• Stand-by servers

Software examples:
• Re-try Pattern
• Circuit Breaker Pattern

People & Processes examples:
• Yearly Datacenter Switch-over
• Stand-by shifts
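
The Re-try and Circuit Breaker patterns named among the software examples can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `retry`, `CircuitBreaker` and `CircuitOpenError`, the thresholds, and the linear back-off are hypothetical and are not taken from the workshop application.

```python
import time


def retry(operation, attempts=3, delay_s=0.1, sleep=time.sleep):
    """Call operation() up to `attempts` times, pausing between failures."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < attempts:
                sleep(delay_s * attempt)  # simple linear back-off (assumption)
    raise last_error


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call the protected operation."""


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; allow a trial after `reset_s`."""

    def __init__(self, max_failures=3, reset_s=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_s = reset_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_s:
                raise CircuitOpenError("circuit open, failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open the circuit
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

The retry helper suits short, transient faults. The breaker suits longer outages, where failing fast protects both the caller and the struggling dependency.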
To become Resilient
• Prevent: Take measures to make sure problems do not occur
• Resist: Stand firm and withstand the problems that do occur
• Remediate: Correct the problem and make the service right again
• Recover: Restore the service as quickly as possible

(Diagram: a cycle of Prevent, Resist, Remediate and Recover)
Now it is up to you...
Our App
(Diagram: one VM hosting the Front End, API and DB)


Installing Lab environment from HDD
• Install VirtualBox
• Install JDK
• Install PuTTY or another SSH client
• Copy Gatling
• Install Notepad++ or an IDE/other Scala editor

• Make sure JAVA_HOME points to the JDK installation directory
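
A quick way to verify the JAVA_HOME step is sketched below. It assumes that a JDK can be recognised by the presence of the `javac` compiler in its `bin` directory. The helper name `check_java_home` is hypothetical.

```python
import os
from pathlib import Path


def check_java_home(env=os.environ):
    """Return (ok, message) describing whether JAVA_HOME points at a JDK."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    bin_dir = Path(java_home) / "bin"
    # A JDK ships the compiler (javac); a plain JRE does not.
    if any((bin_dir / name).is_file() for name in ("javac", "javac.exe")):
        return True, f"JAVA_HOME points at a JDK: {java_home}"
    return False, f"no javac found under {bin_dir}"


if __name__ == "__main__":
    ok, message = check_java_home()
    print(message)
```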


Set up carshare VM server
• Get the .ova file from the USB drive
• Import the .ova file by clicking it or via
VirtualBox -> File -> Import Appliance
• Make sure the network connection is set to bridged mode:
• Settings -> Network -> from the dropdown choose “Bridged Adapter”
• Start the VM and log in with user “root” and password “root”
• Run command “ifconfig” and get the IP from the “eth0” interface
• Run command “start-carshare”
• Navigate in the browser to URL “http://{{OBTAINED_IP}}:8080”
• Connect to the VM via PuTTY at {{OBTAINED_IP}}:22
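
Before moving on, it can help to confirm that both ports used above (8080 for the app, 22 for SSH) are reachable from your laptop. The sketch below makes a plain TCP connection attempt. The function names are hypothetical, and the two-second timeout is an arbitrary choice.

```python
import socket


def port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def check_vm(ip, ports=(8080, 22)):
    """Map each port the workshop uses to whether it is reachable."""
    return {port: port_open(ip, port) for port in ports}
```

Calling `check_vm` with the IP read from eth0 should return `True` for both ports once “start-carshare” has finished. If it returns `False`, the bridged network setting is a likely first suspect.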
Load Testing
“Load testing is the process of putting demand on a system and measuring its response.”

Most popular tools: JMeter, Gatling, NeoLoad & LoadRunner.

We are going to use: Gatling.
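
To make the definition concrete, here is a minimal sketch of "putting demand on a system and measuring its response". It is not Gatling and is far simpler than any of the tools above: a thread pool plays the concurrent users and each request is timed. All names (`run_load`, `timed_call`, `http_get`) and the default numbers are assumptions chosen for illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_call(request):
    """Run one request and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        request()
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


def run_load(request, users=20, requests_per_user=5):
    """Let `users` concurrent workers each issue several requests."""
    def user_session(_):
        return [timed_call(request) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        sessions = list(pool.map(user_session, range(users)))
    results = [r for session in sessions for r in session]
    times = sorted(elapsed for _, elapsed in results)
    return {
        "requests": len(results),
        "errors": sum(1 for ok, _ in results if not ok),
        "mean_s": sum(times) / len(times),
        "max_s": times[-1],
    }


def http_get(url, timeout_s=5.0):
    """Build a request function that fetches `url` once."""
    def request():
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()
    return request
```

For example, `run_load(http_get("http://OBTAINED_IP:8080"), users=20)` would fire 100 requests at the app and report the error count and response times. A real tool adds ramp-up, think time, scenarios and reporting on top of this idea.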


Forms of Performance Testing
• Load Testing
• Spike Testing
• Stress Testing
• Endurance Testing
Load testing
Baseline scripts:
• Load test
• Stress test

(Diagram: one VM hosting the Front End, API and DB)


Load Test
Script: carshareBrowse.Load
Users: 20
Runtime: 5 minutes

Go to Gatling\user-files\simulations\carshare
Open the file called load (with Notepad++ or an IDE with a Scala plugin)

Change .baseUrl("http://OBTAINED_IP:8080") so that it points at the IP of your VM


Stress Test
Script: carshareBrowse.Stress
Runtime: 5 Minutes
Users: 1000+

Go to Gatling\user-files\simulations\carshare
Open the file called stress (with Notepad++ or an IDE with a Scala plugin)

Change .baseUrl("http://OBTAINED_IP:8080") so that it points at the IP of your VM


Resilience Testing
“Resilience testing measures the ability to absorb the impact of a
problem in one or more parts of a system.”

Tools: Stress, Nstress


Resilience testing
Scripts:
• Load test

(Diagram: one VM hosting the Front End, API and DB)


CPU Spike on VM

# Stress CPU with one worker process for 20 seconds
stress -c 1 -t 20
Memory Spike on VM

# Stress memory with five worker processes for 20 seconds
stress -m 5 -t 20
IO Spike on VM

# Stress I/O with one worker process for 60 seconds
stress -i 1 -t 60
Network Failure on VM

# Short network outage
/etc/init.d/networking restart
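
While one of the spikes or the network restart above is running, you want to see how far the service degrades and how quickly it comes back. The sketch below polls a probe at a fixed interval and summarises availability and the longest outage. The names `observe` and `summarize` are hypothetical, and the probe itself (for instance an HTTP GET on port 8080) is left to the caller.

```python
import time


def observe(probe, duration_s=60.0, interval_s=1.0,
            clock=time.monotonic, sleep=time.sleep):
    """Poll probe() at a fixed interval; return a list of (t, is_up)."""
    start = clock()
    samples = []
    while clock() - start < duration_s:
        try:
            up = bool(probe())
        except Exception:
            up = False  # any error counts as "service down"
        samples.append((clock() - start, up))
        sleep(interval_s)
    return samples


def summarize(samples):
    """Report availability and the longest continuous outage seen."""
    up_count = sum(1 for _, up in samples if up)
    longest = 0.0
    outage_start = None
    for t, up in samples:
        if not up and outage_start is None:
            outage_start = t
        elif up and outage_start is not None:
            longest = max(longest, t - outage_start)
            outage_start = None
    if outage_start is not None:  # still down when observation ended
        longest = max(longest, samples[-1][0] - outage_start)
    return {
        "availability": up_count / len(samples) if samples else 0.0,
        "longest_outage_s": longest,
    }
```

The longest outage is only as precise as the polling interval, so a shorter `interval_s` gives a better estimate of recovery time at the cost of more probe traffic.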
Automated Resilience Testing
“Resilience testing measures the ability to absorb the impact of a
problem in one or more parts of a system.”

Most popular tools: Chaos Monkey, Gremlin.

We are going to use: Chaos Monkey for Kubernetes.


Test results – normal load, normal circumstances
Test app as monolith

Application
Test app as microservice (non HA)
Test results – normal load, pod failure chaos
Test app as microservice (HA)
Test results – normal load, pod failures chaos HA
Resilience testing
Why: To help you build more stable applications that perform well in real-life situations.

To avoid: Service loss, data loss, customer loss!

How: By creating chaos, testing “what if” scenarios and investigating what happens.

What: Be creative and use, for example, the following tools:


Tools you can use

Load/Stress tool:
- Nstress
- Stress

Resilience Stress tool:
- Chaos Monkey
- Gremlin

Application Load:
- Gatling
- LoadRunner
- JMeter
- NeoLoad

Monitoring:
- ELK stack
- Dynatrace
- Prometheus
Credentials:

Geoffrey van der Tas
https://www.linkedin.com/in/geoffreyvdtas/
Geoffrey.van.der.tas@ordina.nl
@gavdtas
http://geoffreyvdtas.com

Mark Abrahams
https://www.linkedin.com/in/mark-abrahams-b8218129/
Mark.abrahams@ordina.nl
Want to read more
• https://gatling.io/
• https://usersnap.com/blog/resilience-testing/
• https://www.ibm.com/cloud/blog/resilience-testing-insights-from-the-pros
• https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2
• https://medium.com/netflix-techblog
• https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice/
• http://geoffreyvdtas.com/blog (a blog with more information will follow soon)
• https://drive.google.com/open?id=19UC8m9eLyrViGLdkg7hn70IngIoqIWO1 (to download the workshop materials)