
BROUGHT TO YOU IN PARTNERSHIP WITH

Automated Testing at Scale

JUSTIN ALBANO
SOFTWARE ENGINEER AT CATALOGIC SOFTWARE, INC.

CONTENTS

∙ Introduction
∙ Testing Fundamentals
∙ Automated Testing
  ‑ Unit Tests
  ‑ Mocking
  ‑ Integration Tests
  ‑ System Tests
  ‑ Acceptance Tests
∙ Scaling Tests
  ‑ Optimizing Tests
  ‑ Parallelizing Tests
∙ Conclusion

Test code and test tools are as critical to a software application as the application code itself. In practice, we often neglect tests because they add overhead to our development and build times. These slowdowns affect our release cadence and disrupt our short-term schedules (for long-term gains that are hard to see during crunch).

In this Refcard, we will look at the fundamentals of testing in an Agile world and how automated tests can drastically improve the quality of our applications. We will then look at two critical strategies for reducing the execution time of our automated tests to keep our builds lean.

TESTING FUNDAMENTALS
Rarely does a team outright reject tests, but in many cases, tests are relegated to second-class citizens (and sometimes even treated as a stretch goal), which comes at the price of:

• Instability
• Lack of confidence
• Hesitancy to change others' code

The benefits of first-class testing are the opposite: stability, increased confidence, and increased interaction between developers. This enables us to create new features and work on different parts of the application without worrying that we will create regressions.

Regardless of the development process we choose, we will work through a basic set of steps when building an application:

1. Eliciting requirements from the customer or stakeholders and capturing them as specifications (such as formal requirements or use cases)
2. Creating architectural and low-level designs based on our requirements
3. Writing code abiding by our designs

The testing we perform correlates to these generalized steps and is captured in the V-Model.

REFCARD | AUTOMATED TESTING AT SCALE

The V-Model pairs each of the four non-coding steps with an accompanying type of test:

1. Acceptance tests: Ensure the application meets the needs of the customer and stakeholders
2. System tests: Ensure the application meets its specification when deployed in a production-like environment
3. Integration tests: Ensure the components of the application function according to their designs
4. Unit tests: Ensure each unit that makes up the components functions according to its design

While the V-Model provides a sound basis, the Agile movement has brought some much-needed additions to this foundation.

AUTOMATED TESTING
Prior to the Agile movement, tests were commonly performed manually, with a Quality Assurance (QA) team deploying the system, injecting inputs, and inspecting outputs. While this did suffice for some time, it was:

• Tedious
• Monotonous
• Error-prone
• Difficult to repeat

This manual process changed with the introduction of an automated mindset: tests were instead created in code so that they could be executed on demand. This brought numerous benefits, including:

• Reduced execution time
• Quick feedback
• Repeatability
• Version-controlled configuration

This automation also led to the introduction of fully automated integration and deployment pipelines, called Continuous Integration (CI) and Continuous Deployment (CD) pipelines.

Each commit to the application repository can be automatically tested and deployed, which allows us to determine if each change breaks our product. As beneficial as this is, automated testing is not a silver bullet: it is only as good as the tests we create, and poor tests can lead to overconfidence and a false sense of security.

Additionally, the quality and speed of each test tend to diminish as the number of tests grows. In the following sections, we will look at some of the practices for creating quick unit, integration, system, and acceptance tests, as well as some strategies for quickly executing a large number of tests.

UNIT TESTS
A unit in testing terminology is usually the smallest element of code, such as a class in Object-Oriented Programming (OOP). Unit tests are usually white box tests, where we focus on exercising each statement in a method. These tests are commonly written using a JUnit-style framework and should be:

• Fast: Executes quickly
• Independent: Shares little to no resources
• Concise: Verifies only one logical outcome

As the number of tests grows, it can be tempting to relax these constraints, but it is important that we maintain a large number of efficient tests rather than untidy ones.

MOCKING
Since we only focus on one unit at a time, sometimes we need to mock other dependencies, that is, create dummy representations of them, called stubs.

For example, if we have a Store class, it may depend on a PaymentService:

    public class Store {
        private final PaymentService service;
        // ...constructor...
    }

In this case, we are only interested in the inputs and outputs of PaymentService, not its internal behavior. Therefore, we can use a mocking framework (such as Mockito) to do the following:

1. Create a stub
2. Set the expected outputs for methods of the stub based on expected inputs
3. Verify that the stubbed methods were called
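The three stubbing steps above can be sketched in plain Java without any framework. This is only an illustration under stated assumptions: the charge method on PaymentService, the purchase method on Store, and the StubPaymentService and StoreStubDemo classes are hypothetical names that do not appear in the Refcard. A mocking framework such as Mockito generates an equivalent stub for us instead of requiring one to be written by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: walks through the three stubbing steps.
class StoreStubDemo {
    public static void main(String[] args) {
        // Steps 1 and 2: create the stub and fix the output it returns.
        StubPaymentService stub = new StubPaymentService(true);
        Store store = new Store(stub);

        String result = store.purchase("acct-1", 1999);
        System.out.println(result);      // prints CONFIRMED

        // Step 3: verify that the stubbed method was called as expected.
        System.out.println(stub.calls);  // prints [acct-1:1999]
    }
}

// Assumed contract for the dependency; the source only names the type.
interface PaymentService {
    boolean charge(String account, int amountInCents);
}

// Unit under test. The purchase operation is an assumed example.
class Store {
    private final PaymentService service;

    Store(PaymentService service) {
        this.service = service;
    }

    String purchase(String account, int amountInCents) {
        return service.charge(account, amountInCents) ? "CONFIRMED" : "DECLINED";
    }
}

// Hand-written stub: returns a canned result and records every call.
class StubPaymentService implements PaymentService {
    private final boolean cannedResult;
    final List<String> calls = new ArrayList<>();

    StubPaymentService(boolean cannedResult) {
        this.cannedResult = cannedResult;
    }

    @Override
    public boolean charge(String account, int amountInCents) {
        calls.add(account + ":" + amountInCents);
        return cannedResult;
    }
}
```

Because each test builds its own StubPaymentService, no state is shared between test cases, which is what keeps them independent.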


This stubbing allows us to abstract the internals of our dependencies and focus our tests on the unit being exercised. Additionally, stubbing makes each test independent since a new stub can be created for each test case. As we will see, stubs can (and should) be used at all testing levels.

INTEGRATION TESTS
Integration tests are fairly similar to unit tests (and commonly utilize the same JUnit-style frameworks), but instead of testing individual units of a system, they test groups of units, called components or modules. For example, if we create a commerce system, one module may handle credit card transactions while another may handle shipping with a specific carrier. These modules can then be aggregated to form the entire system.

These tests are usually written as black box tests, where we focus on known inputs generating expected outputs, and tend to run as long as or longer than unit tests.

SYSTEM TESTS
System tests exercise the system as a whole by deploying it in a production-like environment and then interacting with it like a user would (from the outside). System tests tend to run longer than integration tests, but there are usually fewer of them.

Tests at this level commonly exercise the User Interface (UI) or Application Programming Interface (API) of the application and utilize popular frameworks such as:

• Selenium
• Serenity
• Cypress

ACCEPTANCE TESTS
Acceptance tests are similar to system tests, but they focus on the customer's and stakeholders' perspective. Usually, these tests are written in plaintext (such as in Gherkin) and use frameworks like Cucumber to translate between plaintext and code. Acceptance tests tend to focus on a workflow or entire use case, so they may take longer than system tests, but there will likely be fewer of them.

SCALING TESTS
While automated testing is essential, the number of tests can grow to the point where builds take hours and developers cannot obtain quick feedback. To remedy this situation, we must focus on scaling our tests properly by optimizing individual tests and parallelizing test execution.

OPTIMIZING TESTS
One of the most effective strategies for reducing the execution time of tests is to optimize individual test cases. As with any performance optimization, we must target the portion of the execution time that will provide us with the most reward for our effort. Generally, test execution time can be divided into three stages:

• Setup: The time required to acquire and create the resources that the test case will use and configure the environment
• Execution: The time required to run the test case
• Tear down: The time required to release and destroy any resources created during setup and leave the environment in a stable state

This three-part division of execution time also applies to an entire test phase. For example, deploying our system to a production-like environment for system tests can be thought of as the setup portion of the system test phase, and the execution of the test cases (each with their own setup, execution, and tear down parts) can be considered the execution stage.

For unit and integration tests, the most time-consuming stage will commonly be execution, as the setup and tear down phases of the test should only create simple objects or sets of objects. For system and acceptance tests, on the other hand, setup (and possibly tear down) will commonly take up as much time as the test case execution, or more.

When setup and tear down are the long poles in the tent, we can do the following to reduce their execution time:

1. Favor lightweight orchestration: Lightweight objects, such as containers (like Docker), are generally faster to deploy and


should be favored over heavier-weight objects, such as Virtual Machines (VMs). Additionally, orchestration frameworks (such as Kubernetes) can generally configure networks and file systems quicker than VM orchestration tools.

2. Stub live services when possible: Mock any external services whose behavior is known but whose implementation is not directly needed. This may be more viable for unit and integration tests, since system and acceptance tests will commonly use the actual services needed in production, but it is still possible to stub some system services. For example, a system test for a shopping application can stub the financial institution so that the payment processing portion of the application can be tested, but the payment itself is not made to a real financial institution.

3. Share startup procedures between tests: In some cases, resources such as databases can be shared between tests. This reduces the need for each test case to start up and tear down the shared resource. When possible, the interactions with shared resources should be independent (see the following section) and not cause contention (e.g., write to the same collection in a database).

In the case when the execution stage takes the longest, we have a few options:

1. Optimize the application: Many times, our tests are slow because the application itself is slow. We should consult our performance tests to see where the hotspots in our code are, but sometimes our functional tests can provide some useful insight into the areas where improvements are needed.

2. Parameterize timeouts: Some logic in our application may use timeouts or other temporal mechanisms that can drastically increase execution time. When possible, these values should be easy to configure and should be set to the lowest feasible value during testing (context will dictate a reasonable value).

3. Use asynchronous logic: Instead of polling to see if a result is available, we should favor asynchronous transfers, ensuring that a result is found as soon as it is available. Polling may be necessary in some cases, but it runs the risk of wasted execution time. For example, if we use a polling period of 10 seconds, and a result becomes available in 11 seconds, we have to wait to the end of the second period to see the result, wasting 9 seconds.

Regardless of the test phase, it is important that we target the part of the test case that takes the longest and will provide the most reward for our effort. Tweaking a test case to the point where its purpose is obfuscated to reduce 2 seconds of execution time should be avoided when a change to the startup components can save us a minute.

PARALLELIZING TESTS
While optimizing individual tests can greatly reduce execution time, there are instances where further optimizations are not worth the effort or when each test runs quickly, but the number of test cases causes the aggregate execution to grow beyond a reasonable level.

This is often true with unit and integration tests whose individual execution times may be small, but multiplied over thousands or more tests, start to grow (e.g., 0.01 seconds per test for 40,000 test cases takes close to 7 minutes to execute). To remedy this situation, we can execute our tests in parallel in one of two ways:

1. Parallelizing test cases: Executing individual test cases concurrently or executing sets of test cases (such as test classes) concurrently.
2. Parallelizing test phases: Executing a test phase (e.g., unit tests) at the same time as another phase.

Parallelizing individual test cases often produces a higher return for the effort than parallelizing test phases. In most CI/CD pipelines, test phases are executed in series so that a failure in an early test phase does not cause later test phases (which require more resources) to start. For example, if any unit test fails, there is no need to execute the system tests.

Parallelizing test cases, on the other hand, usually produces a significant reduction in execution time, especially for unit tests. If we write our unit tests (and integration tests) properly, where each test runs quickly and is independent of the others, we can easily execute them in parallel. The maximum reduction in execution time can be calculated according to Amdahl's law:

$$S_{execution} = (1 - p) + \frac{p}{T}$$

where:

• S_execution is the relative execution time
• p is the proportion of test cases that can be run in parallel (assuming a relatively even execution time per test)
• T is the number of threads that can be used to execute the test cases in parallel

As p and T increase, the execution time decreases. To increase parallelization, we must ensure that our test cases are independent


of one another. For example, if a test case depends on another test case to execute first or if a test case depends on a shared resource, it can be very difficult to parallelize it. To increase the independence of each test case, we can do the following:

1. Mock shared resources: Instead of depending on shared resources, such as databases, we should use stubs. This allows each test case to create its own stub and interact with it independently of any other test case.

2. Remove artificial ordering: Some test cases are artificially ordered, where the test case itself does not require ordering, but we, as test authors, create an arbitrary ordering. For example, we may artificially desire that we test the add and subtract operations of a calculator before testing multiplication and division.

The thread count we select will depend on the nature of the tests, the resources available in our test environment, and the number of test cases that can be parallelized. For example, if we can only find 10 test cases that can be run in parallel, using 15 threads may not produce any noticeable speed-up. In most cases, we should tune the thread count and then sample execution time until we approximate a maximum speed-up.

Once we have parallelized our test cases and have determined a starting thread count, we can configure our build system to run our tests in parallel. For example, we can configure Maven using the following build configuration:

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-surefire-plugin</artifactId>
          ...
          <configuration>
            <parallel>all</parallel>
            <threadCount>10</threadCount>
            ...
          </configuration>
        </plugin>
      </plugins>
    </build>

Maven can be configured to execute test cases (test methods) or test classes, as well as test suites, in parallel. In the case of the configuration above, we selected all, which allows both test methods and test classes to be executed in parallel. When possible, we should use the all option to ensure the greatest flexibility.

While parallelization can produce significant reductions in execution time, there are some considerations to keep in mind:

• Output is interlaced: When a test case fails, it can be difficult to see the output from the failed test case since it will be interlaced with the output from other tests. It may be beneficial to execute the failed test again by itself (since the test case should be independent) to see the relevant output.

• Execution order will change: Some reporting or analysis tools may display test cases in the order in which they were executed. This ordering can (and likely will) change between executions and should not be depended upon.

CONCLUSION
Testing is often relegated to an inferior position behind application code, but this demotion can lead to a serious lack of quality. While automated testing has reduced the burden of testing, it is not a final solution. Instead, we need to combine efficient tests and parallelization to ensure we balance sufficient test coverage and reasonable build times.

WRITTEN BY JUSTIN ALBANO,
SOFTWARE ENGINEER, CATALOGIC SOFTWARE, INC.

Justin Albano is a Software Engineer at Catalogic Software, Inc. responsible for building distributed catalog, backup, and recovery solutions for Fortune 50 clients, focusing on Spring-based REST API and MongoDB development. When not working or writing, he can be found playing or watching hockey, practicing Brazilian Jiu-jitsu, drawing, or reading.

DZone, a Devada Media Property, is the resource software developers, engineers, and architects turn to time and again to learn new skills, solve software development problems, and share their expertise. Every day, hundreds of thousands of developers come to DZone to read about the latest technologies, methodologies, and best practices. That makes DZone the ideal place for developer marketers to build product and brand awareness and drive sales. DZone clients include some of the most innovative technology and tech-enabled companies in the world including Red Hat, Cloud Elements, Sensu, and Sauce Labs.

Devada, Inc.
600 Park Offices Drive
Suite 150
Research Triangle Park, NC 27709
888.678.0399 919.678.0300

Copyright © 2020 Devada, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means of electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
