Ultimately, even with very high test coverage we’re still doing things manually and relying on
humans to interpret the output, which means there’s always a chance of a mistake. If the
networking industry is ever to transition to a DevOps operating model, a fully-fledged,
robust and reliable test automation framework is a must. The question is: what’s the right
tool for this?
Ansible was originally designed for tasks such as:
software provisioning, i.e. running a bunch of wget, yum, apt and pip commands
configuration management, i.e. creating/modifying configuration files
pushing data into a device, i.e. configuration files, binaries
running ad-hoc CLI commands over SSH
However, Ansible is very flexible and customisable, and with enough effort it can be “taught”
to do a lot of things it wasn’t originally designed to do. The problem is that at some
point those playbooks become too hard to manage and troubleshoot. This is where the
additional complexity outweighs the benefits of automation, and the obscurity of the
resulting DSL code outweighs the benefits of its readability.
Another downside of using pure Python for network testing automation is the need to write
a lot of boilerplate code to create common testing abstractions and libraries (it took 6
months and 10k lines of code to write Brigade). We should never discount the possibility of
using scripting for network testing. However, at least for this specific use case, there may
be a middle ground that offers the simplicity and readability of a DSL framework while
allowing as much customisation as necessary to extend and augment the default behaviour.
Enter the Robot.
Robot Framework
Robot is a generic test automation framework written in Python. Its DSL has a very
lightweight syntax, which makes it easy to write and read. The framework comes with a
set of standard libraries that implement typical functionality expected from a test framework
(data types and structures, conditionals and expectations, Telnet), and there are many
3rd party libraries for automated interactions (e.g. Selenium, SSH). One of the most recent
additions is AristaLibrary, a library to interact with Arista devices over eAPI. At the time of
writing, this library defines 18 new keywords that cover most typical test scenarios.
However, one of the major advantages of Robot Framework is the ability to define
your own keywords. As I will show later, we can re-use any of the existing keywords to define
our own higher-level keywords and use them in our test definitions.
$ source bin/activate
We’ll do our testing against a virtual topology built from the cEOS devices I described in
the previous post:
PUBLISH_BASE: 9000
links:
  - ["Device-A:Interface-1", "Device-B:Interface-1"]
EOF
This will create a pair of cEOS devices interconnected back-to-back with Ethernet interfaces:
+------+ +------+
|cEOS 1|et1+-----+et1|cEOS 2|
+------+ +------+
Testing
Let’s assume we’ve configured those devices with a simple BGP peering over their directly
connected interfaces and advertised their respective loopbacks into BGP. The pseudocode
for this config would look something like this:
interface Loopback0
ip address X.X.X.X/32
redistribute connected
Now we want to verify that our control plane has converged and we have reachability to the
loopback interfaces. We start by creating a simple YAML configuration file “test.yml”,
describing the device connection details:
TRANSPORT: https
PORT: 80
USERNAME: admin
PASSWORD: admin
RUNFORMAT: suite
nodes:
  SW1:
    host: localhost
    port: 9000
  SW2:
    host: localhost
    port: 9001
PROD_TAGS:
  - ignoretags
testfiles:
  - network_validation
The Arista Network Validation tool will look for test cases inside the “network_validation”
directory and execute all tests that match a particular tag (“ignoretags” will execute all of
them).
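To make the structure of this file more concrete, here is a minimal Python sketch of how such settings could be interpreted. It is not the tool’s actual code: the function names node_settings and should_run are hypothetical, and both the rule that per-node values override the global defaults and the tag-matching rule are assumptions made for illustration. Only the “ignoretags” behaviour comes from the description above.

```python
# Hypothetical in-memory copy of the settings from test.yml
# (the real tool reads the YAML file itself).
config = {
    "TRANSPORT": "https",
    "PORT": 80,
    "USERNAME": "admin",
    "PASSWORD": "admin",
    "nodes": {
        "SW1": {"host": "localhost", "port": 9000},
        "SW2": {"host": "localhost", "port": 9001},
    },
    "PROD_TAGS": ["ignoretags"],
}


def node_settings(cfg, name):
    """Build connection settings for one node.

    Assumption: values set under a node take precedence over global defaults.
    """
    defaults = {
        "transport": cfg["TRANSPORT"],
        "port": cfg["PORT"],
        "username": cfg["USERNAME"],
        "password": cfg["PASSWORD"],
    }
    return {**defaults, **cfg["nodes"][name]}


def should_run(test_tags, prod_tags):
    """Decide whether a test is selected.

    "ignoretags" selects everything; otherwise (assumed rule) a test runs
    when it shares at least one tag with prod_tags.
    """
    if "ignoretags" in prod_tags:
        return True
    return any(tag in prod_tags for tag in test_tags)


for name in config["nodes"]:
    print(name, node_settings(config, name))
```

With the values above, each switch keeps the shared credentials and transport but is reached on its own published port.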
Now it’s time to create our first test scenario. Each test case file contains a number of
sections responsible for various parts of the testing procedure. For now let’s focus on the main
section called “Test Cases”. In there we first check that our BGP peering with a neighbor is in
the “Established” state. We do that by issuing a “show ip bgp summary” command, using the “Get
Command Output” keyword, and picking apart the output until we get the “peerState”
attribute of the response. The second test case verifies that the peer loopback is reachable with a
special “Address Is Reachable” keyword, which behind the scenes issues a ping and verifies
that at least one ping request received a response.
Documentation    This test verifies control and dataplane connectivity between two BGP peers
Library AristaLibrary
Library AristaLibrary.Expect
Library Collections
${PEER_ADDRESS} 12.12.12.2
${PEER_LOOPBACK} 2.2.2.2
Controlplane verification
Dataplane verification
Connect To Switches
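For readers more comfortable with Python than with the Robot DSL, the logic behind the two checks can be sketched roughly as follows. This is an illustration, not AristaLibrary’s implementation: the nested layout of the “show ip bgp summary” reply, the sample ping line and the helper names peer_state and is_reachable are all assumptions.

```python
import re

# Hypothetical eAPI-style reply to "show ip bgp summary" (layout assumed).
bgp_summary = {
    "vrfs": {
        "default": {
            "peers": {
                "12.12.12.2": {"peerState": "Established"},
            }
        }
    }
}


def peer_state(summary, peer, vrf="default"):
    """Walk the nested reply down to the peerState attribute of one neighbor."""
    return summary["vrfs"][vrf]["peers"][peer]["peerState"]


def is_reachable(ping_output):
    """Return True if at least one ping request received a response."""
    match = re.search(r"(\d+) (?:packets )?received", ping_output)
    return match is not None and int(match.group(1)) > 0


# Made-up sample of a ping summary line.
sample_ping = "5 packets transmitted, 5 received, 0% packet loss, time 4ms"

assert peer_state(bgp_summary, "12.12.12.2") == "Established"
assert is_reachable(sample_ping)
```

The Robot keywords hide exactly this kind of plumbing, which is why the test cases themselves stay short and readable.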
Finally we can execute our test scenario and get the result:
==============================================================================
==============================================================================
Run Full Suite.1 Bgp :: This test verifies control and dataplane connectivi…
==============================================================================
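------------------------------------------------------------------------------
------------------------------------------------------------------------------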
Run Full Suite.1 Bgp :: This test verifies control and dataplane c… | PASS |
==============================================================================
==============================================================================
Now that we’ve seen how easy it is to write and read tests using standard AristaLibrary
keywords, let’s have a look at how to extend the Robot Framework by adding new high-level
keywords.
Custom keywords
Let’s assume we want to verify some internal behaviour that is not necessarily exposed
through the Arista CLI. One of the common tasks in acceptance testing is to run a
debug to record the timing of a certain event (e.g. BGP keepalive or RIP update). Normally, this
would involve some setup/teardown commands to turn the debugging on and off and a
match command to match an event signature. Instead of repeating all of these steps in every
test case, we can define our own keywords in the “Keywords” section at the bottom of a test case
file:
Run Keyword And Ignore Error    Configure    bash timeout ${BASH_TIMEOUT} sudo rm /tmp/${TRACE_FILE}
${trace_on}=    Create List    trace ${agent} setting ${setting}    trace ${agent} filename ${TRACE_FILE}
Log    ${result[0]['messages'][0]}
Run Keyword And Ignore Error    Configure    bash timeout ${BASH_TIMEOUT} sudo rm /tmp/${TRACE_FILE}
We can then make use of those keywords in the “Test Cases” section like this:
Sleep ${DEBUG_TIMEOUT}
Assuming we’ve defined the debug variables in the config YAML file like this:
DEBUG_AGENT: "Rib"
DEBUG_SETTING: "Rib::Rip*/*"
DEBUG_TIMEOUT: 35
We get all occurrences of the “RIP RECV” event recorded during a 35-second window in the
output logs:
07:57:23.247214 RIP RECV 12.12.12.2 -> 224.0.0.9 vers 2, cmd Response, length 244
07:57:53.651828 RIP RECV 12.12.12.2 -> 224.0.0.9 vers 2, cmd Response, length 244
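Since the point of the debug is to record event timing, the interval between updates can be derived from these timestamps. Below is a minimal Python sketch of that calculation. The helper name event_intervals is hypothetical, and the code assumes all events fall on the same day (it does not handle a wrap past midnight).

```python
from datetime import datetime

# The two log lines captured above.
log_lines = [
    "07:57:23.247214 RIP RECV 12.12.12.2 -> 224.0.0.9 vers 2, cmd Response, length 244",
    "07:57:53.651828 RIP RECV 12.12.12.2 -> 224.0.0.9 vers 2, cmd Response, length 244",
]


def event_intervals(lines, signature="RIP RECV"):
    """Return the gaps, in seconds, between consecutive matching events."""
    stamps = [
        datetime.strptime(line.split()[0], "%H:%M:%S.%f")
        for line in lines
        if signature in line
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


intervals = event_intervals(log_lines)
print(intervals)
```

For the two lines above the gap is about 30.4 seconds, which is in line with RIP’s default 30-second update timer.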
Further reading
Obviously, since Robot Framework has its own DSL, some learning curve is expected.
However, once you get familiar with the most common standard libraries and keywords, writing
Robot test cases becomes very easy. Thankfully Robot boasts some of the best-written
documentation of any open-source project, which, along with the Arista Network Validation
user guide, should be enough for anyone to get up to speed and start writing test cases in a
matter of hours.
Coming up
Hopefully this post has given a feel for how easily we can perform automated network
verification and validation, which brings us one step closer to our final goal: a fully
automated build and test pipeline for network devices. In the next and final post we’ll
complete our journey towards the network CI/CD nirvana by building our own network CI
server based on GitLab and creating a simple CI/CD pipeline that makes use of both
cEOS and Robot Framework to build and test all network changes.