
SOFTWARE QUALITY ENGINEERING

A. Concepts
Software Quality Engineering (SQE) is a process that evaluates,
assesses, and improves the quality of software. Software quality is
often defined as the degree to which software meets requirements
for reliability, maintainability, transportability, etc., as contrasted
with functional, performance, and interface requirements that are
satisfied as a result of software engineering.

Quality must be built into a software product during its development to satisfy quality requirements established for it. SQE ensures that
the process of incorporating quality in the software is done properly,
and that the resulting software product meets the quality
requirements. The degree of conformance to quality requirements
usually must be determined by analysis, while functional
requirements are demonstrated by testing. SQE performs a function
complementary to software development engineering. Their
common goal is to ensure that a safe, reliable, and quality
engineered software product is developed.

B. Software Qualities
Qualities for which an SQE evaluation is to be done must first be
selected and requirements set for them. Some commonly used
qualities are reliability, maintainability, transportability,
interoperability, testability, usability, reusability, traceability,
sustainability, and efficiency.
Some of the key ones are discussed below.

1. Reliability
Hardware reliability is often defined in terms of the Mean-Time-To-
Failure, or MTTF, of a given set of equipment. An analogous notion
is useful for software, although the failure mechanisms are different
and the mathematical predictions used for hardware have not yet
been usefully applied to software. Software reliability is often
defined as the extent to which a program can be expected to
perform intended functions with required precision over a given
period of time. Software reliability engineering is concerned with
the detection and correction of errors in the software; even more, it
is concerned with techniques to
compensate for unknown software errors and for problems in the
hardware and data environments in which the software must
operate.

2. Maintainability
Software maintainability is defined as the ease of finding and
correcting errors in the software. It is analogous to the hardware
quality of Mean-Time-To-Repair, or MTTR. While
there is as yet no way to directly measure or predict software
maintainability, there is a significant body of knowledge about
software attributes that make software easier to maintain. These
include modularity, self (internal) documentation, code readability,
and structured coding techniques. These same attributes also
improve sustainability, the ability to make improvements to the
software.

3. Transportability
Transportability is defined as the ease of transporting a given set of
software to a new hardware and/or operating system environment.

4. Interoperability
Software interoperability is the ability of two or more software
systems to exchange information and to mutually use the
exchanged information.

5. Efficiency
Efficiency is the extent to which software uses minimum hardware
resources to perform its functions.

There are many other software qualities. Some of them will not be
important to a specific software system, thus no activities will be
performed to assess or improve them. Maximizing some qualities
may cause others to be decreased. For example, increasing the
efficiency of a piece of software may require writing parts of it in
assembly language. This will decrease the transportability and
maintainability of the software.

C. Metrics
Metrics are quantitative values, usually computed from the design
or code, that measure the quality in question, or some attribute of
the software related to the quality. Many metrics have been
invented, and a number have been successfully used in specific
environments, but none has gained widespread acceptance.
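
To make this concrete, here is a minimal sketch (not from the text) of one such metric: comment density, a simple code attribute sometimes used as a rough proxy for internal documentation and hence maintainability. The metric, the threshold-free reporting, and the file name are illustrative assumptions, not a standard.

    def comment_density(source: str) -> float:
        """Fraction of non-blank lines that are comments (illustrative metric)."""
        lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
        if not lines:
            return 0.0
        comments = sum(1 for ln in lines if ln.startswith("#"))
        return comments / len(lines)

    if __name__ == "__main__":
        # "example_module.py" is a hypothetical file under evaluation.
        with open("example_module.py") as f:
            print(f"comment density: {comment_density(f.read()):.2f}")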

D. A Software Quality Engineering Program


The two software qualities which command the most attention are
reliability and maintainability. Some practical programs and
techniques have been developed to improve the reliability and
maintainability of software, even if they are not measurable or
predictable. The types of activities that might be included in an SQE
program are described here in terms of these two qualities. These
activities could be used as a model for the SQE activities for
additional qualities.

1. Qualities and Attributes


An initial step in laying out an SQE program is to select the qualities
that are important in the context of the use of the software that is
being developed. For example, the highest priority qualities for
flight software are usually reliability and efficiency. If revised flight
software can be up-linked during flight, maintainability may be of
interest, but considerations like transportability will not drive the
design or implementation. On the other hand, the use of science
analysis software might require ease of change and maintainability,
with reliability a concern and efficiency not a driver at all.

After the software qualities are selected and ranked, specific attributes of the software that help to increase those qualities should be identified. For example, modularity is an attribute that tends to increase both reliability and
maintainability. Modular software is designed to result in code that
is apportioned into small, self-contained, functionally unique
components or units. Modular code is easier to maintain, because
the interactions between units of code are easily understood, and
low level functions are contained in few units of code. Modular code
is also more reliable, because it is easier to completely test a small,
self contained unit. Not all software qualities are so simply related to
measurable design and code attributes, and no quality is so simple
that it can be easily measured. The idea is to select or devise
measurable, analyzable, or testable design and
code attributes that will increase the desired qualities. Attributes like
information hiding, strength, cohesion, and coupling should be
considered.

2. Quality Evaluations
Once some decisions have been made about the quality objectives
and software attributes, quality evaluations can be done. The intent
in an evaluation is to measure the effectiveness of a standard or
procedure in promoting the desired attributes of the software
product. For example, the design and coding standards should
undergo a quality evaluation. If modularity is desired, the standards
should clearly say so and should set standards for the size of units
or components. Since internal documentation is linked to
maintainability, the documentation standards should be clear and
require good internal documentation.

Quality of designs and code should also be evaluated. This can be done as a part of the walkthrough or inspection process, or a quality
audit can be done. In either case, the implementation is evaluated
against the standard and against the evaluator's knowledge of good
software engineering practices, and examples of poor quality in the
product are identified for possible correction.

3. Nonconformance Analysis
One very useful SQE activity is an analysis of a project's
nonconformance records. The nonconformances should be analyzed
for unexpectedly high numbers of events in specific
sections or modules of code. If areas of code are found that have
had an unusually high error count (assuming it is not because the
code in question has been tested more thoroughly), then the code
should be examined. The high error count may be due to poor
quality code, an inappropriate design, or requirements that are not
well understood or defined. In any case, the analysis may indicate
changes and rework that can improve the reliability of the
completed software. In addition to code problems, the analysis may
also reveal software development or maintenance processes that
allow or cause a high proportion of errors to be introduced into the
software. If so, an evaluation of the procedures may lead to
changes, or an audit may discover that the procedures are not being
followed.
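
As an illustration only, the following sketch tallies nonconformance reports per module to flag unusually error-prone areas of code. The record format and the threshold are assumptions for the example, not part of the text.

    from collections import Counter

    def high_error_modules(records, threshold):
        """records: iterable of dicts such as {"module": "guidance.c", "id": 42}."""
        counts = Counter(rec["module"] for rec in records)
        return {module: n for module, n in counts.items() if n >= threshold}

    # Made-up example data:
    records = [{"module": "guidance.c"}, {"module": "guidance.c"}, {"module": "io.c"}]
    print(high_error_modules(records, threshold=2))   # {'guidance.c': 2}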

4. Fault Tolerance Engineering


For software that must be of high reliability, a fault tolerance activity
should be established. It should identify software which provides
and accomplishes critical functions and requirements. For this
software, the engineering activity should determine and develop
techniques which will ensure that the needed reliability or fault
tolerance will be attained. Some of the techniques that have been
developed for high reliability environments include:

Input data checking and error tolerance. For example, if out-of-range or missing input data can affect reliability, then sophisticated error checking and data interpolation/extrapolation schemes may significantly improve reliability.
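
A minimal sketch of this idea, assuming numeric sensor samples with an illustrative valid range: out-of-range or missing values are replaced by interpolating between valid neighbours so that one bad reading cannot corrupt later processing.

    def clean_samples(samples, low=-100.0, high=100.0):
        """Replace out-of-range or missing (None) values with the average of
        their nearest valid neighbours."""
        cleaned = list(samples)
        for i, value in enumerate(cleaned):
            if value is None or not (low <= value <= high):
                left = next((v for v in reversed(cleaned[:i])
                             if v is not None and low <= v <= high), None)
                right = next((v for v in cleaned[i + 1:]
                              if v is not None and low <= v <= high), None)
                neighbours = [v for v in (left, right) if v is not None]
                cleaned[i] = sum(neighbours) / len(neighbours) if neighbours else 0.0
        return cleaned

    print(clean_samples([1.0, None, 3.0, 999.0, 5.0]))  # [1.0, 2.0, 3.0, 4.0, 5.0]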

Proof of correctness. For limited amounts of code, formal "proof of correctness" methods may be able to demonstrate that no errors exist.

N-Item voting. This is a design and implementation scheme where a number of independent sets of software and hardware operate on the same input. Some comparison (voting) scheme is used to determine which output to use. This is especially effective where subtle timing or hardware errors may be present.
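
A minimal sketch of majority voting over independently produced outputs. The three "versions" below are stand-ins for independently developed implementations or hardware channels; the seeded fault is purely illustrative.

    from collections import Counter

    def vote(outputs):
        """Return the majority output; raise if no value wins a strict majority."""
        value, count = Counter(outputs).most_common(1)[0]
        if count <= len(outputs) // 2:
            raise RuntimeError("no majority - treat as a detected failure")
        return value

    def version_a(x): return x * x
    def version_b(x): return x ** 2
    def version_c(x): return x * x + (1 if x == 7 else 0)   # contains a seeded fault

    x = 7
    print(vote([version_a(x), version_b(x), version_c(x)]))  # 49; the faulty channel is outvoted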

Independent development. In this scheme, one or more of the N-items are independently developed units of software. This helps prevent the simultaneous failure of all items due to a common coding error.

E. Techniques and Tools


Some of the useful fault-tolerance techniques are described under
subsection D, above. Standard statistical techniques can be used to
manipulate nonconformance data. In addition, there is considerable
experimentation with the Failure Modes and Effects Analysis (FMEA)
technique adapted from hardware reliability engineering. In
particular, the FMEA can be used to identify failure modes or other
assumable (hardware) system states which can then lead the
quality engineer to an analysis of the software that controls the
system as it assumes those states.

There are also tools that are useful for quality engineering. They
include system and software simulators, which allow the modeling
of system behavior; dynamic analyzers, which detect the portions of
the code that are used most intensively; software tools that are
used to compute metrics from code or designs; and a host of special
purpose tools that can, for example, detect all system calls to help
decide on portability limits.

White Box Testing

Testing of a function with knowledge of the internal structure of the program.

Also known as glass box, structural, clear box, and open box testing. A software testing technique whereby explicit knowledge of the internal workings of the item being tested is used to select the test data. Unlike black box testing, white box testing uses specific knowledge of the programming code to examine outputs. The test is accurate only if the tester knows what the program is supposed to do. He or she can then see if the program diverges from its intended goal. White box testing does not account for errors caused by omission, and all visible code must also be readable.
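
A minimal sketch of white-box test selection: knowing the internal branch structure of the function under test, the test data is chosen so that every branch executes. The function and its thresholds are illustrative assumptions.

    def classify_temperature(celsius):
        if celsius < 0:
            return "freezing"
        elif celsius < 30:
            return "moderate"
        else:
            return "hot"

    # One test input per branch, chosen by reading the code:
    assert classify_temperature(-5) == "freezing"   # exercises the first branch
    assert classify_temperature(10) == "moderate"   # exercises the second branch
    assert classify_temperature(35) == "hot"        # exercises the final branch
    print("all branches covered")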

Black Box Testing

Testing of a function without knowledge of the internal structure of the program.

Black-box and white-box are test design methods. Black-box test design
treats the system as a "black-box", so it doesn't explicitly use knowledge
of the internal structure. Black-box test design is usually described as
focusing on testing functional requirements. Synonyms for black-box
include: behavioral, functional, opaque-box, and closed-box. White-box
test design allows one to peek inside the "box", and it focuses specifically
on using internal knowledge of the software to guide the selection of test
data. Synonyms for white-box include: structural, glass-box and clear-box.
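
As a contrast with the white-box sketch above, here is a minimal black-box example: the cases are derived only from a stated requirement ("a valid PIN is exactly four digits"), not from the code. The function is_valid_pin() is a hypothetical unit under test.

    def is_valid_pin(pin: str) -> bool:     # implementation details are irrelevant to the test design
        return len(pin) == 4 and pin.isdigit()

    # Requirement-based cases, including boundary values:
    assert is_valid_pin("1234") is True      # nominal case
    assert is_valid_pin("123") is False      # too short
    assert is_valid_pin("12345") is False    # too long
    assert is_valid_pin("12a4") is False     # non-digit character
    print("specification cases pass")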

While black-box and white-box are terms that are still in popular use,
many people prefer the terms "behavioral" and "structural". Behavioral
test design is slightly different from black-box test design because the use
of internal knowledge isn't strictly forbidden, but it's still discouraged. In
practice, it hasn't proven useful to use a single test design method. One
has to use a mixture of different methods so that they aren't hindered by
the limitations of a particular one. Some call this "gray-box" or
"translucent-box" test design, but others wish we'd stop talking about
boxes altogether.

It is important to understand that these methods are used during the test
design phase, and their influence is hard to see in the tests once they're
implemented. Note that any level of testing (unit testing, system testing,
etc.) can use any test design methods. Unit testing is usually associated
with structural test design, but this is because testers usually don't have
well-defined requirements at the unit level to validate.

Unit Testing

In computer programming, a unit test is a method of testing the correctness of a particular module of source code. The idea is to write test cases for every non-trivial function or method in the module so that each test case is separate from the others if possible. This type of testing is mostly done by the developers.
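
A minimal unit-test sketch using Python's standard unittest module: one independent test case per non-trivial behaviour of a small unit. The word_count() function is an illustrative example, not from the text.

    import unittest

    def word_count(text: str) -> int:
        """The unit under test (illustrative)."""
        return len(text.split())

    class WordCountTest(unittest.TestCase):
        def test_counts_simple_sentence(self):
            self.assertEqual(word_count("quality must be built in"), 5)

        def test_empty_string_is_zero(self):
            self.assertEqual(word_count(""), 0)

        def test_ignores_extra_whitespace(self):
            self.assertEqual(word_count("  two   words  "), 2)

    if __name__ == "__main__":
        unittest.main()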

Benefits

The goal of unit testing is to isolate each part of the program and show
that the individual parts are correct. It provides a written contract that the
piece must satisfy. This isolated testing provides four main benefits:

Encourages change

Unit testing allows the programmer to refactor code at a later date, and
make sure the module still works correctly (regression testing). This
provides the benefit of encouraging programmers to make changes to the
code since it is easy for the programmer to check if the piece is still
working properly.

Simplifies Integration

Unit testing helps eliminate uncertainty in the pieces themselves and can be used in a bottom-up testing style approach. Testing the parts of a program first and then testing the sum of its parts makes integration testing easier.

Documents the code

Unit testing provides a sort of "living document" for the class being tested.
Clients looking to learn how to use the class can look at the unit tests to
determine how to use the class to fit their needs.

Separation of Interface from Implementation

Because some classes may have references to other classes, testing a class can frequently spill over into testing another class. A common
example of this is classes that depend on a database; in order to test the
class, the tester finds herself writing code that interacts with the
database. This is a mistake, because a unit test should never go outside of
its own class boundary. As a result, the software developer abstracts an
interface around the database connection, and then implements that
interface with their own Mock Object. This results in loosely coupled code,
thus minimizing dependencies in the system.
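
A minimal sketch of the approach described above, using Python's standard unittest.mock: the database dependency is hidden behind an interface and a mock stands in for it during the test. The class and method names are illustrative assumptions.

    import unittest
    from unittest import mock

    class UserRepository:
        """Interface the production code depends on (the real version talks to a DB)."""
        def find_email(self, user_id: int) -> str:
            raise NotImplementedError

    def build_greeting(repo: UserRepository, user_id: int) -> str:
        return f"Hello, {repo.find_email(user_id)}"

    class GreetingTest(unittest.TestCase):
        def test_greeting_uses_repository(self):
            fake_repo = mock.Mock(spec=UserRepository)
            fake_repo.find_email.return_value = "ada@example.com"
            self.assertEqual(build_greeting(fake_repo, 7), "Hello, ada@example.com")
            fake_repo.find_email.assert_called_once_with(7)

    if __name__ == "__main__":
        unittest.main()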

Limitations

It is important to realize that unit-testing will not catch every error in the
program. By definition, it only tests the functionality of the units
themselves. Therefore, it will not catch integration errors, performance
problems and any other system-wide issues. In addition, it may not be
trivial to anticipate all special cases of input the program unit under study
may receive in reality. Unit testing is only effective if it is used in
conjunction with other software testing activities.

Integration Testing

It is the phase of software testing in which individual software modules are combined and tested as a group. It follows unit testing and precedes system testing. Integration testing takes as its input modules that have been checked out by unit testing, groups them in larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing.

Purpose

The purpose of integration testing is to verify the functional, performance, and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using black box testing, with success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested, and individual subsystems are exercised through their input interfaces. All test cases are constructed to test that all components within assemblages interact correctly, for example, across procedure calls or process activations.

The overall idea is the "building block" approach, in which verified assemblages are added to a verified base which is then used to support the integration testing of further assemblages.
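
A minimal sketch of an integration test over two units that are assumed to have already passed unit testing; the pair is exercised together through its interface as one assemblage. Both functions are illustrative assumptions.

    def parse_record(line: str) -> dict:
        name, price = line.split(",")
        return {"name": name.strip(), "price": float(price)}

    def apply_discount(record: dict, percent: float) -> dict:
        return {**record, "price": round(record["price"] * (1 - percent / 100), 2)}

    def test_parse_then_discount():
        # Exercise the pair across their interface, as an integration test would.
        record = apply_discount(parse_record("widget, 10.00"), 25)
        assert record == {"name": "widget", "price": 7.5}

    test_parse_then_discount()
    print("assemblage behaves correctly")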

Performance testing

In software engineering, performance testing is testing that is performed to determine how fast some aspect of a system performs under a particular workload.

Performance testing can serve different purposes. It can demonstrate that the system meets performance criteria. It can compare two systems to find which performs better. Or it can measure what parts of the system or workload cause the system to perform badly. In the diagnostic case, software engineers use tools such as profilers to measure what parts of a device or software contribute most to the poor performance, or to establish throughput levels (and thresholds) for maintaining acceptable response time.

In performance testing, it is often crucial (and often difficult to arrange) for the test conditions to be similar to the expected actual use.

Technology

Performance testing technology employs one or more PCs to act as injectors – each emulating the presence of a number of users and each running an automated sequence of interactions (recorded as a script, or as a series of scripts to emulate different types of user interaction) with the host whose performance is being tested. Usually, a separate PC acts as a test conductor, coordinating and gathering metrics from each of the injectors and collating performance data for reporting purposes. The usual sequence is to ramp up the load – starting with a small number of virtual users and increasing the number over a period to some maximum.
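
A minimal sketch of the injector idea, assuming a placeholder target URL: virtual users are simulated with threads, the load is ramped up, and response times are recorded. Real tools add script playback, coordination, and reporting on top of this.

    import threading, time, urllib.request

    TARGET = "http://localhost:8080/"     # hypothetical system under test
    results = []
    lock = threading.Lock()

    def virtual_user(requests_per_user: int):
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                urllib.request.urlopen(TARGET, timeout=10).read()
            except OSError:
                pass                       # a real harness would count failures separately
            elapsed = time.perf_counter() - start
            with lock:
                results.append(elapsed)

    for users in (1, 5, 10):               # ramp up the number of virtual users
        threads = [threading.Thread(target=virtual_user, args=(20,)) for _ in range(users)]
        for t in threads: t.start()
        for t in threads: t.join()
        print(f"{users} users: mean response {sum(results) / len(results):.3f}s")
        results.clear()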

The test result shows how the performance varies with the load, given as
number of users vs response time. Various tools, including Compuware
Corporation's QACenter Performance Edition, are available to perform such
tests. Tools in this category usually execute a suite of tests which will
emulate real users against the system. Sometimes the results can reveal
oddities, e.g., that while the average response time might be acceptable,
there are outliers of a few key transactions that take considerably longer
to complete – something that might be caused by inefficient database
queries, etc.

Performance testing can be combined with stress testing, in order to see what happens when an acceptable load is exceeded – does the system crash? How long does it take to recover if a large load is reduced? Does it fail in a way that causes collateral damage?

Performance specifications

Performance testing is frequently not performed against a specification, i.e. no one will have expressed what the maximum acceptable response time is for a given population of users. However, performance testing is
time for a given population of users. However, performance testing is
frequently used as part of the process of performance profile tuning. The
idea is to identify the “weakest link” – there is inevitably a part of the
system which, if it is made to respond faster, will result in the overall
system running faster. It is sometimes a difficult task to identify which
part of the system represents this critical path, and some test tools come
provided with (or can have add-ons that provide) instrumentation that
runs on the server and reports transaction times, database access times,
network overhead, etc. which can be analysed together with the raw
performance statistics. Without such instrumentation one might have to
have someone crouched over Windows Task Manager at the server to see
how much CPU load the performance tests are generating. There is an
apocryphal story of a company that spent a large amount optimising their
software without having performed a proper analysis of the problem. They
ended up rewriting the system’s ‘idle loop’, where they had found the
system spent most of its time, but even having the most efficient idle loop
in the world obviously didn’t improve overall performance one iota!

Performance testing almost invariably identifies that it is parts of the software (rather than hardware) that contribute most to delays in processing users’ requests. Performance testing can be performed across the web, and even done in different parts of the country, since it is known that the response times of the internet itself vary regionally. It can also be done in-house, although routers would then need to be configured to introduce the lag that would typically occur on public networks.

It is always helpful to have a statement of the likely peak numbers of users that might be expected to use the system at peak times. If there can also be a statement of what constitutes the maximum allowable 95th percentile response time, then an injector configuration could be used to test whether the proposed system met that specification.
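
A minimal sketch of checking such a 95th-percentile requirement against measured response times. The measured values and the 2-second threshold are made-up assumptions for illustration.

    def percentile_95(response_times):
        ordered = sorted(response_times)
        index = max(0, int(round(0.95 * len(ordered))) - 1)
        return ordered[index]

    measured = [0.8, 1.1, 0.9, 1.4, 3.2, 1.0, 1.2, 0.7, 1.3, 1.1]   # seconds, made up
    requirement = 2.0
    p95 = percentile_95(measured)
    print(f"95th percentile = {p95:.1f}s ->", "PASS" if p95 <= requirement else "FAIL")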

Tasks to undertake

Tasks to perform such a test would include:

* Analysis of the types of interaction that should be emulated and the production of scripts to do those emulations.

* Decision whether to use internal or external resources to perform the tests.

* Set-up of a configuration of injectors/controller.

* Set-up of the test configuration (ideally identical hardware to the production platform), router configuration, quiet network (we don’t want results upset by other users), deployment of server instrumentation.

* Running the tests – probably repeatedly in order to see whether any unaccounted-for factor might affect the results.

* Analysing the results, either pass/fail or investigation of critical path and recommendation of corrective action.

Stress Testing

Stress testing is a form of testing that is used to determine the stability of a given system or entity. It involves testing beyond normal operational capacity, often to a breaking point, in order to observe the results. For example, a web server may be stress tested using scripts, bots, and various denial of service tools to observe the performance of a web site during peak loads. Stress testing is a subset of load testing. Also see testing, software testing, performance testing.

Security Testing

Application vulnerabilities leave your system open to attacks, downtime, data theft, data corruption, and application defacement. Security within an application or web service is crucial to avoid such vulnerabilities and new threats.

While automated tools can help to eliminate many generic security issues,
the detection of application vulnerabilities requires independent evaluation
of your specific application's features and functions by experts. An
external security vulnerability review by Third Eye Testing will give you the
best possible confidence that your application is as secure as possible.

Installation Testing

Installation testing (in software engineering) can simply be defined as any testing that occurs outside of the development environment. Such testing will frequently occur on the computer system the software product will eventually be installed on.

Whilst the ideal installation might simply appear to be to run a setup program, the generation of that setup program itself and its efficacy in a variety of machine and operating system environments can require extensive testing before it can be used with confidence.

Alpha Testing

In software development, testing is usually required before release to the general public. In-house developers often test the software in what is known as 'ALPHA' testing, which is often performed under a debugger or with hardware-assisted debugging to catch bugs quickly.

It can then be handed over to testing staff for additional inspection in an environment similar to how it was intended to be used. This technique is known as black box testing. This is often known as the second stage of alpha testing.

In distributed systems, particularly where software is to be released into an already live target environment (such as an operational web site), installation (or deployment, as it is sometimes called) can involve database
schema changes as well as the installation of new software. Deployment
plans in such circumstances may include back-out procedures whose use
is intended to roll the target environment back in the event that the
deployment is unsuccessful. Ideally, the deployment plan itself should be
tested in an environment that is a replica of the live environment. A factor
that can increase the organisational requirements of such an exercise is
the need to synchronize the data in the test deployment environment with
that in the live environment with minimum disruption to live operation.

Usability Testing

Usability testing is a means for measuring how well people can use some
human-made object (such as a web page, a computer interface, a
document, or a device) for its intended purpose, i.e. usability testing
measures the usability of the object. Usability testing focuses on a
particular object or a small set of objects, whereas general human-
computer interaction studies attempt to formulate universal principles.

If usability testing uncovers difficulties, such as people having difficulty understanding instructions, manipulating parts, or interpreting feedback,
then developers should improve the design and test it again. During
usability testing, the aim is to observe people using the product in as
realistic a situation as possible, to discover errors and areas of
improvement. Designers commonly focus excessively on creating designs
that look "cool", compromising usability and functionality. This is often
caused by pressure from the people in charge, forcing designers to
develop systems based on management expectations instead of people's
needs. A designer's primary function should be more than appearance, including making things work with people.

"Caution: simply gathering opinions is not usability testing -- you must


arrange an experiment that measures a subject's ability to use your
document."

Rather than showing users a rough draft and asking, "Do you understand
this?", usability testing involves watching people trying to use something
for its intended purpose. For example, when testing instructions for
assembling a toy, the test subjects should be given the instructions and a
box of parts. Instruction phrasing, illustration quality, and the toy's design
all affect the assembly process.

Setting up a usability test involves carefully creating a scenario, or realistic situation, wherein the person performs a list of tasks using the
product being tested while observers watch and take notes. Several other
test instruments such as scripted instructions, paper prototypes, and pre-
and post-test questionnaires are also used to gather feedback on the
product being tested. For example, to test the attachment function of an
e-mail program, a scenario would describe a situation where a person
needs to send an e-mail attachment, and ask him or her to undertake this
task. The aim is to observe how people function in a realistic manner, so
that developers can see problem areas, and what people like. The
technique popularly used to gather data during a usability test is called a
think aloud protocol.

Beta Testing

In software development, beta testing follows alpha testing. Versions of the software, known as beta versions, are released to a limited audience of users outside the developing organization. These users exercise the software in real environments and report the defects and usability problems they encounter, so that the problems can be corrected before general release.

Because beta testers exercise the product without knowledge of its internal structure, beta testing is a form of black box testing, and it is often described as the second stage of pre-release testing.

Product Testing

Software product development companies face unique challenges in testing. Only a suitably organized and executed test process can contribute to the success of a software product.

Product testing experts design the test process to take advantage of the
economies of scope and scale that are present in a software product.
These activities are sequenced and scheduled so that a test activity occurs
immediately following the construction activity whose output the test is
intended to validate.

Stability Testing

In software testing, stability testing is an attempt to determine whether an application will crash.

In the pharmaceutical field, it refers to a period of time during which a multi-dose product retains its quality after the container is opened.

Acceptance Testing

User acceptance testing (UAT) is one of the final stages of a software project and will often occur before the customer accepts a new system.

Users of the system will perform these tests which, ideally, developers
have derived from the User Requirements Specification, to which the
system should conform.

Test designers will draw up a formal test plan and devise a range of
severity levels. The focus in this type of testing is less on simple problems (spelling mistakes, cosmetic problems) and show stoppers (major problems like the software crashing or not running at all).
Developers should have worked out these issues during unit testing and
integration testing. Rather, the focus is on a final verification of the
required business function and flow of the system. The test scripts will
emulate real-world usage of the system. The idea is that if the software
works as intended and without issues during a simulation of normal use, it
will work just the same in production.
Results of these tests will allow both the customers and the developers to
be confident that the system will work as intended.

System Testing

According to the IEEE Standard Computer Dictionary, system testing is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements.

System testing falls within the scope of Black box testing, and as such,
should require no knowledge of the inner design of the code or logic
(IEEE. IEEE Standard Computer Dictionary: A Compilation of IEEE
Standard Computer Glossaries. New York, NY. 1990.).

Alpha testing and beta testing are sub-categories of system testing.

As a rule, system testing takes, as its input, all of the "integrated" software components that have successfully passed integration testing, and also the software system itself integrated with any applicable hardware system(s). The purpose of integration testing is to detect any inconsistencies between the software units that are integrated together (called assemblages) or between any of the assemblages and the hardware. System testing is a more limiting type of testing: it seeks to detect defects both within the "inter-assemblages" and within the system as a whole.

Regression Testing

According to the IEEE Standard Computer Dictionary, regression testing is the selective retesting of a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements (IEEE. IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries. New York, NY. 1990.).

Regression testing generally falls within the scope of black box testing: test cases that the software previously passed are re-run against the modified version, so no new knowledge of the inner design of the code or logic is required.

Regression testing can be performed at any level (unit, integration, or system testing) whenever the software is changed.

As a rule, regression testing takes, as its input, the existing test cases and test data used to verify earlier versions of the software, together with the modified software itself integrated with any applicable hardware. Because the same tests must be repeated after every significant change, regression test suites are strong candidates for automation.

Compatibility Testing

One of the challenges of software development is ensuring that the application works properly on the different platforms and operating systems on the market and also with the applications and devices in its environment.

A compatibility testing service aims at locating application problems by running the application in real environments, thus ensuring that it is compatible with various hardware, operating system, and browser versions.

Fuzz testing

Fuzz testing is a software testing technique. The basic idea is to attach the
inputs of a program to a source of random data. If the program fails (for
example, by crashing, or by failing in-built code assertions), then there
are defects to correct.

The great advantage of fuzz testing is that the test design is extremely
simple, and free of preconceptions about system behavior.

Uses

Fuzz testing is often used in large software development projects that perform black box testing. These usually have a budget to develop test tools, and fuzz testing is one of the techniques which offers a high benefit-to-cost ratio.

Fuzz testing is also used as a gross measurement of a large software system's quality. The advantage here is that the cost of generating the tests is relatively low. For example, third party testers have used fuzz testing to evaluate the relative merits of different operating systems and application programs.

Fuzz testing is thought to enhance software security and software safety because it often finds odd oversights and defects which human testers would fail to find, and even careful human test designers would fail to create tests for.

However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software handles exceptions without crashing, rather than behaving correctly. Thus, fuzz testing can only be regarded as a proxy for program correctness, rather than a direct measure, with fuzz test failures actually being more useful as a bug-finding tool than fuzz test passes as an assurance of quality.

Fuzz testing methods

As a practical matter, developers need to reproduce errors in order to fix them. For this reason, almost all fuzz testing makes a record of the data it manufactures, usually before applying it to the software, so that if the computer fails dramatically, the test data is preserved.
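
A minimal sketch of simple fuzz testing along these lines: random byte strings are saved to disk before being fed to the function under test, so a failing case can be replayed later. parse_config() is a hypothetical target with a deliberately seeded defect; the file layout and iteration count are assumptions.

    import os, random

    def parse_config(data: bytes) -> dict:
        """Hypothetical parser under test; assumes ASCII 'key=value' lines."""
        text = data.decode("ascii")            # seeded defect: blows up on non-ASCII input
        return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

    random.seed(1234)
    os.makedirs("fuzz_cases", exist_ok=True)
    for i in range(1000):
        blob = bytes(random.randrange(256) for _ in range(random.randrange(1, 512)))
        with open(f"fuzz_cases/case_{i:04d}.bin", "wb") as f:
            f.write(blob)                      # record the input before using it
        try:
            parse_config(blob)
        except Exception as exc:               # any uncaught exception is a defect report
            print(f"case {i} triggered {type(exc).__name__}: saved for replay")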

Modern software has several different types of inputs:

* Event driven inputs are usually from a graphical user interface, or possibly from a mechanism in an embedded system.

* Character driven inputs are from files, or data streams.

* Database inputs are from tabular data, such as relational databases.

There are at least two different forms of fuzz testing:

* Valid fuzz attempts to assure that the random input is reasonable, or conforms to actual production data.

* Simple fuzz usually uses a pseudo random number generator to provide input.

* A combined approach uses valid test data with some proportion of totally random input injected.

By using all of these techniques in combination, fuzz-generated randomness can test the un-designed behavior surrounding a wider range of designed system states.

Fuzz testing may use tools to simulate all of these domains.

Event-driven fuzz

Normally this is provided as a queue of data structures. The queue is filled with data structures that have random values.

The most common problem with an event-driven program is that it will often simply use the data in the queue, without even crude validation. To succeed in a fuzz-tested environment, software must validate all fields of every queue entry, decode every possible binary value, and then ignore impossible requests.

One of the more interesting issues with real-time event handling is that if
error reporting is too verbose, simply providing error status can cause
resource problems or a crash. Robust error detection systems will report
only the most significant, or most recent error over a period of time.

Character-driven fuzz

Normally this is provided as a stream of random data. The classic source in UNIX is the random data generator.

One common problem with a character driven program is a buffer overrun, when the character data exceeds the available buffer space. This problem tends to recur in every instance in which a string or number is parsed from the data stream and placed in a limited-size area.

Another is that decode tables or logic may be incomplete, not handling every possible binary value.

Database fuzz

The standard database schema is usually filled with fuzz that is random data of random sizes. Some IT shops use software tools to migrate and manipulate such databases. Often the same schema descriptions can be used to automatically generate fuzz databases.

Database fuzz is controversial, because input and comparison constraints reduce the invalid data in a database. However, often the database is more tolerant of odd data than its client software, and a general-purpose interface is available to users. Since major customer and enterprise management software is starting to be open-source, database-based security attacks are becoming more credible.
A common problem with fuzz databases is buffer overrun. A common data dictionary, with some form of automated enforcement, is quite helpful and entirely possible. To enforce this, normally all the database clients need to be recompiled and retested at the same time. Another common problem is that database clients may not understand the binary possibilities of the database field type, or legacy software might have been ported to a new database system with different possible binary values. A normal, inexpensive solution is to have each program validate database inputs in the same fashion as user inputs. The normal way to achieve this is to periodically "clean" production databases with automated verifiers.
