You are on page 1of 7

07/05/2020 How to test software in the age of machine learning

App Dev & Testing DevOps Enterprise IT Security

SUBSCRIBE CONFERENCES & REPORTS

Find articles, contributors or resources 


Home / App Dev & Testing / Testing

So unpredictable

Moving targets: Testing software in


the age of machine learning
1

Peter Varhol, Principal, Technology Strategy Research

Testing systems that don’t always return the same answers require new
approaches. This is especially true when testing systems whose responses
adapt to what they have learned from previous transactions.

Software testing, in theory, is a fairly straightforward activity. For every


input, there should be a de ned and known output. We enter values, make

https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 1/7
07/05/2020 How to test software in the age of machine learning

selections, or navigate an application and compare the actual result with the
App Dev & Testing DevOps Enterprise IT Security
expected one. If they match, we nod and move on. If they don’t, we possibly
have a bug.
SUBSCRIBE CONFERENCES & REPORTS

The point is, we already know what the


Find output
articles, is supposed
contributors to be. Granted,
or resources 
sometimes an output is not well de ned, there is some ambiguity, and you
get disagreements on whether or not a particular result represents a bug.

But there is a type of software where having a de ned output is no longer the
case. Actually, two types. One is machine learning systems; the second is
predictive analytics.

[ Learn best practices for reducing software defects with TechBeacon's Guide.
Plus: Get the report "Agile and DevOps Reduces Volume, Cost, and Impact of
Production Defects" ]

Difference between machine learning and


predictive analytics
There is a difference between the two. Most machine learning systems are
based on neural networks. A neural network is a set of layered algorithms
whose variables can be adjusted via a learning process. The learning process
involves using known data inputs to create outputs that are then compared
with known results. When the algorithms re ect the known results with the
desired degree of accuracy, the algebraic coef cients are frozen and
production code is generated.

Today, this comprises much of what we understand as arti cial intelligence.

By contrast, predictive analytics makes adjustments to the algorithms in


production, based on results fed back into the software. In other words, the
application better understands how to apply its rules based on how those
rules have worked in the past.

Both of these types of systems have things in common. For one thing,
neither produces an “exact” result. In fact, sometimes they can produce an

https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 2/7
07/05/2020 How to test software in the age of machine learning

obviously incorrect result. But they are extremely useful in a number of


App Dev & Testing DevOps Enterprise IT Security
situations when data already exists on the relationship between recorded
inputs and intended results.
SUBSCRIBE CONFERENCES & REPORTS

Let me give you an example, basedFind


onarticles,
my own experiences.
contributors or resources 

[ Understand what your team needs to know to take advantage of test


automation with TechBeacon's Guide. Plus: Get the Buyer's Guide For Software
Test Automation Tools ]

Developing a few guidelines


Years ago, I devised a neural network that worked as a part of an electronic
wind sensor. This worked though the wind cooling the electronic sensor
based on its decrease in temperature at speci c speeds and directions. My
neural network had three layers of algebraic equations, each with four to ve
separate equations in individual nodes, computing in parallel. They would
use starting coef cients, then adjust those coef cients based on a
comparison.

A neural network is a set of layered algorithms whose variables can be


adjusted via a learning process. 

I then trained it. I had over 500 data points regarding known wind speed and
direction and the extent to which the sensor cooled. The network I created
passed each input into its equations, through the multiple layers, and
produced an answer. At rst, the answer from the network wasn't that close
to the known correct answer. But after multiple iterations with the training
data, the coef cients homed in on accurate and consistent results.

How do you test this? The product is actually tested during the training
process, which takes time. Training either brings convergence to accurate
results or it diverges. The question is, How do you evaluate the quality of the
network?

https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 3/7
07/05/2020 How to test software in the age of machine learning

Predictive analytics systems continue to adapt after deployment, using


App Dev & Testing DevOps Enterprise IT Security
a feedback loop to adjust variables and coef cients within the
algorithm.
SUBSCRIBE CONFERENCES & REPORTS

Here are the guidelines I used.


Find articles, contributors or resources 

1. Have objective acceptance criteria. Know the amount of error you


and your users are willing to accept.

2. Test with new data. Once you’ve trained the network and frozen
the architecture and coef cients, use fresh inputs and outputs to
verify its accuracy.

3. Don’t count on all results being accurate. That’s just the nature of
the beast. And you may have to recommend throwing out the
entire network architecture and starting over.

4. Understand the architecture of the network as a part of the testing


process. Few if any will be able to actually follow a set of inputs
through the network of algorithms, but understanding how the
network is constructed will help testers determine if another
architecture might produce better results.

5. Communicate the level of con dence you have in the results to


management and users. Machine learning systems offer you a
unique opportunity to describe con dence in statistical terms, so
use them.

One important thing to note is that the training data itself could well contain
inaccuracies. In this case, because of measurement error, the recorded wind
speed and direction could be off or ambiguous. In other cases, the cooling of
the lament likely has some error in its measurement.

What about predictive analytics?


Another type of network might be used for predictive analytics. These
systems continue to adapt after deployment, using a feedback loop to adjust
variables and coef cients within the algorithm.
https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 4/7
07/05/2020 How to test software in the age of machine learning

One example is a system under development in the UK to implement


App Dev & Testing DevOps Enterprise IT Security
demand-based pricing for train service. Its goal is to encourage riders to use
the train during nonpeak times, so it dynamically adjusts pricing to make it
SUBSCRIBE CONFERENCES & REPORTS
nancially attractive for riders to consider riding when the trains aren’t as
crowded. Find articles, contributors or resources 

This type of application experiments with different pricing strategies and


tries to optimize two different things: a balance of the ridership throughout
the day, and acceptable revenue from the ridership. A true mathematical
optimization isn’t possible, but the goal is to reach a state of spread-out
ridership and revenue that at least covers costs.

How would you test this type of application? Once again, you need objective
acceptance criteria before even starting the development effort. In this case,
those acceptance criteria should involve some acceptable level of both
ridership averages and revenue.

What testers need to know


Here are some other important considerations.

1. You need test scenarios. By that I mean ridership, when people are
willing to ride, and what they are willing to pay. Because people
don’t know until they are actually placed in the position of
deciding, you are going to have to build data models. Three may be
suf cient, to represent expected best case, average case, and worst
case.

2. You will not reach mathematical optimization of either ridership


or revenue. We are, after all, working with algorithms that produce
approximations, not exact results. Determine what level of
outcomes is acceptable for each scenario.

3. Defects will be re ected in the inability of the model to achieve


the goals of ridership and revenue.

https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 5/7
07/05/2020 How to test software in the age of machine learning

Note that in both the machine learning and analytics application


App Dev & Testing DevOps Enterprise IT Security
examples, the acceptance criteria aren’t expressed in terms of defect number,
type, or severity. In fact, in most cases they are expressed in terms of the
SUBSCRIBE CONFERENCES & REPORTS
statistical likelihood of coming within a certain range. 
Find articles, contributors or resources 

Best practices for testing nondeterministic


applications
That evaluation of quality and risk isn’t a staple with most development and
testing projects, and it is one that testers may be ill equipped to consider.
How can testers provide better feedback on their efforts on such
applications? First, evaluate the application according to the acceptance
criteria. Second, be prepared to support those assertions in statistical terms;
for example, be 95 percent con dent that the application will produce an
answer within a given range. Last, have a high-level understanding of the
underpinnings of the application, so that any de ciencies might be able to be
ascribed to a particular application component.

Both testing practices and results have to change to accommodate


applications that don’t behave the same as traditional software. If you nd
yourself working on machine learning and predictive applications, these
suggestions represent a good start in that direction.

Image source: Flickr

[ Practice quality-driven development with best practices from QA practitioners


in TechBeacon's Guide. Plus: Download the World Quality Report 2019-20 ]

Topics: App Dev & Testing, Testing

1

More from TechBeacon


Shift security left with your continuous testing: 3 key focus areas
https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 6/7
07/05/2020 How to test software in the age of machine learning

Performance reality check: 4 ways to keep up with unexpected surges


App Dev & Testing DevOps Enterprise IT Security
How positive behavioral metrics can boost your software team

SUBSCRIBE CONFERENCES & REPORTS

Find articles, contributors or resources 

    
Brought to you by

Topics TechBeacon

App Dev & Testing About

DevOps Our Contributors

Enterprise IT Terms of Use

Security Privacy

© Copyright 2015 – 2020 Micro Focus or one of its affiliates

https://techbeacon.com/enterprise-it/moving-targets-testing-software-age-machine-learning 7/7

You might also like