
Random Testing / Fuzzing

11 August 2023 09:17

Theories behind why random testing works.


Applications in emerging domains.
Testing tools: Monkey (Google) and Cuzz (Microsoft).

Fuzzing - We feed a program a set of random inputs and observe whether it behaves correctly on each.
Fuzzing vs Mutation
Fuzzing    Randomly perturbs a specific part of a program
Mutation   Randomly perturbs arbitrary aspects of a program
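
The fuzzing loop described above can be sketched in a few lines of Python. This is only an illustration: the function under test (`parse_percent`), its planted bug, and the oracle (a `ValueError` is the only acceptable way to reject bad input) are all hypothetical and not taken from any real tool.

```python
import random
import string

def parse_percent(text):
    """Hypothetical function under test: parse strings such as '42%'."""
    if not text.endswith("%"):
        raise ValueError("missing percent sign")
    digits = text[:-1]
    # Planted bug: an empty digit string is not rejected, so the indexing
    # below raises IndexError for the input "%".
    if digits[0] == "-":
        raise ValueError("negative percentages are not allowed")
    return int(digits)

def fuzz(target, runs, seed):
    """Feed `target` random strings; return the inputs that misbehave."""
    rng = random.Random(seed)
    alphabet = string.digits + "%-"  # assumed input distribution
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 4)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(text)
        except ValueError:
            pass  # documented rejection of bad input: correct behaviour
        except Exception as exc:  # any other exception is treated as a bug
            failures.append((text, type(exc).__name__))
    return failures
```

Calling `fuzz(parse_percent, 2000, seed=0)` returns the failing inputs together with the exception type. The small alphabet is a design choice: inputs drawn from characters the parser actually cares about reach the buggy branch far more often than arbitrary bytes would.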

Motivation for random testing: the Infinite Monkey Theorem.

Case Studies
For random testing to be effective, the test inputs must be generated from a reasonable distribution
specific to the given program or class of programs.
Class of programs:
• UNIX utility programs -
A fuzzer for command-line and GUI apps tests the reliability of UNIX programs by bombarding them
with random data. The errors detected were crashes (core dumps) and hangs (infinite loops).
Many errors recurred, largely because of poor input sanitization, which left developers with
more issues to sort out.
One vulnerability open to attack is the gets() function in C, which takes no length parameter.
C doesn't check array bounds, so a large amount of input data causes a buffer overflow. The fix
is to use fgets() instead, which limits the maximum length of the input data.
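
A minimal sketch of such a harness in Python, assuming a POSIX system where a negative return code means the process was killed by a signal. The helper names `run_once` and `random_bytes`, the timeout, and the input size limit are illustrative choices, not part of the original fuzz tool.

```python
import random
import subprocess

def random_bytes(rng, max_len=1024):
    """Produce a random byte string of random length (0..max_len)."""
    return bytes(rng.randrange(256) for _ in range(rng.randint(0, max_len)))

def run_once(cmd, data, timeout=2.0):
    """Run `cmd` with `data` on stdin and classify the outcome."""
    try:
        proc = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"  # no exit within the time limit
    # Assumption: POSIX, where a negative code means death by signal.
    return "crash" if proc.returncode < 0 else "ok"

def fuzz_utility(cmd, runs, seed):
    """Bombard `cmd` with random data and count each kind of outcome."""
    rng = random.Random(seed)
    counts = {"ok": 0, "crash": 0, "hang": 0}
    for _ in range(runs):
        counts[run_once(cmd, random_bytes(rng))] += 1
    return counts
```

For example, `fuzz_utility(["sort"], 100, seed=0)` would feed 100 random inputs to whatever `sort` is on the path. A non-zero exit status alone is counted as "ok" here, since a utility that rejects garbage cleanly is behaving reliably.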

• Mobile apps -
Fuzz testing for mobile apps uses the Monkey tool. An onClick handler is attached to a target,
and the associated action is taken when the target is clicked. Testing uses GUI events and pixel
coordinates: TOUCH(x, y). Monkey can also simulate sophisticated events such as incoming calls.
It needs the app's package name. The Monkey tool generates a sequence of events with a timed
delay between them. Gestures - down, move, up - simulate a drag, and follow a particular grammar.
Monkey options - basic config, operational constraints, event types & frequencies, and debugging.
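
The event grammar can be illustrated with a small Python generator. This is a sketch and not Monkey's implementation: the screen size, the fixed delay, and the assumption that a touch is a DOWN followed by an UP at the same coordinates are all choices made for the example.

```python
import random

WIDTH, HEIGHT = 480, 800  # hypothetical screen size in pixels

def point(rng):
    """Pick a random pixel coordinate on the screen."""
    return rng.randrange(WIDTH), rng.randrange(HEIGHT)

def touch(rng):
    """A tap: DOWN and UP at the same coordinates (assumed encoding)."""
    x, y = point(rng)
    return [("DOWN", x, y), ("UP", x, y)]

def drag(rng, max_moves=5):
    """A drag gesture: DOWN, one or more MOVEs, then UP."""
    events = [("DOWN", *point(rng))]
    for _ in range(rng.randint(1, max_moves)):
        events.append(("MOVE", *point(rng)))
    events.append(("UP", *point(rng)))
    return events

def event_sequence(count, seed, delay_ms=100):
    """Generate `count` gestures, each paired with the delay before it."""
    rng = random.Random(seed)
    return [(delay_ms, rng.choice([touch, drag])(rng)) for _ in range(count)]
```

The real tool is driven from the command line, for instance `adb shell monkey -p <package> --throttle 100 -s 42 -v 500`, where `-p` names the package, `--throttle` sets the delay in milliseconds, `-s` fixes the random seed, and the final number is the event count.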

• Concurrent programs -
A bug is triggered only under specific inputs, and can also depend on the thread schedule.
Inserting sleep calls delays threads and so forces different thread schedules.
Cuzz automates the insertion of sleep calls, which would otherwise be done by hand. It gives a
worst-case probabilistic guarantee on finding bugs.
Bug depth = no. of ordering constraints a schedule has to satisfy to find the bug. An ordering
constraint is a requirement on the order between two statements in different threads.
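
Bug depth can be made concrete with a toy simulation in Python. This is not how Cuzz works internally: instead of inserting sleep calls into real threads, the sketch interleaves named steps at random and checks whether the schedule satisfies the bug's ordering constraints. The thread names, step names, and the depth-1 bug are hypothetical.

```python
import random

# Two hypothetical threads, each a list of named steps in program order.
THREADS = {
    "T1": ["t1.alloc", "t1.init"],
    "T2": ["t2.work", "t2.read"],
}

# The bug shows up only if t2.read runs before t1.init.
# One ordering constraint, so the bug depth is 1.
BUG = [("t2.read", "t1.init")]

def random_schedule(threads, rng):
    """Interleave the steps by picking a runnable thread uniformly at random."""
    pending = {name: list(steps) for name, steps in threads.items()}
    schedule = []
    while pending:
        name = rng.choice(sorted(pending))
        schedule.append(pending[name].pop(0))
        if not pending[name]:
            del pending[name]
    return schedule

def satisfies(schedule, constraints):
    """True if every (before, after) pair appears in that order."""
    position = {step: i for i, step in enumerate(schedule)}
    return all(position[a] < position[b] for a, b in constraints)

def bug_hits(runs, seed):
    """Count how many random schedules trigger the bug."""
    rng = random.Random(seed)
    return sum(satisfies(random_schedule(THREADS, rng), BUG) for _ in range(runs))
```

Adding a second pair to `BUG` would make it a depth-2 bug, and fewer random schedules would satisfy both constraints at once.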

Probabilistic guarantee
Measured vs Worst-case probability
- The worst-case guarantee is for the hardest-to-find bug of a given depth. Increasing the number
of threads leads to more ways of triggering a bug.
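
The notes do not state the guarantee itself. As an assumption drawn from the published description of the scheduling algorithm behind Cuzz, the worst-case probability of finding a bug of depth d in a program with n threads and k total steps is at least 1/(n * k^(d-1)) per run. The Python sketch below only evaluates that expression; the function and parameter names are illustrative.

```python
from fractions import Fraction

def worst_case_bound(n_threads, k_steps, depth):
    """Lower bound 1 / (n * k**(d - 1)) on the chance that one run finds the bug."""
    if n_threads < 1 or k_steps < 1 or depth < 1:
        raise ValueError("n_threads, k_steps and depth must all be >= 1")
    return Fraction(1, n_threads * k_steps ** (depth - 1))
```

For a depth-1 bug the bound depends only on the number of threads, so `worst_case_bound(2, 10, 1)` is 1/2, while each extra ordering constraint divides the bound by k.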

Case study of Cuzz - under stress testing the bug was triggered about once a day (238820 runs),
vs about 12 times a minute with Cuzz.
Key Takeaways
- Bug depth
- Systematic randomization
- Cuzz can do whatever stress testing can, and do it better - flushing out bugs, scaling to a
large number of threads, low adoption barrier.
Random testing
Pros                                      Cons
Easy to implement                         Inefficient
Good coverage with enough tests           Might find unimportant bugs
Works with any format                     Poor coverage in practice
Appealing for security vulnerabilities

Random Testing Coverage


Compiler - lexer, parser and backend.
With random testing, the lexer sees every input, but only a small percentage of inputs reach the
parser and very few reach the backend.
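
The drop-off across layers can be seen with a toy pipeline in Python. The lexer, the grammar (`digit (operator digit)*`), and both input distributions are hypothetical stand-ins for a real compiler front end.

```python
import random
import string

DIGITS = set(string.digits)
OPERATORS = set("+-*/")

def lex(text):
    """Toy lexer: reject any character that is not a digit or an operator."""
    if any(ch not in DIGITS | OPERATORS for ch in text):
        return None
    return list(text)

def parse(tokens):
    """Toy parser: accept only digit (operator digit)*."""
    if len(tokens) % 2 == 0:
        return None
    for i, tok in enumerate(tokens):
        # Even positions must hold digits, odd positions operators.
        if (tok in DIGITS) != (i % 2 == 0):
            return None
    return tokens

def stages_reached(inputs):
    """Count how many inputs reach each stage of the toy compiler."""
    reached = {"lexer": 0, "parser": 0, "backend": 0}
    for text in inputs:
        reached["lexer"] += 1
        tokens = lex(text)
        if tokens is None:
            continue
        reached["parser"] += 1
        if parse(tokens) is not None:
            reached["backend"] += 1
    return reached

def random_strings(alphabet, runs, seed, max_len=5):
    """Draw `runs` random strings of length 1..max_len from `alphabet`."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(rng.randint(1, max_len)))
            for _ in range(runs)]
```

Comparing `stages_reached(random_strings(string.printable, 2000, seed=0))` with `stages_reached(random_strings(string.digits + "+-*/", 2000, seed=0))` shows the effect of the input distribution: strings built from the language's own characters always get past the lexer, whereas arbitrary printable characters mostly stop there.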

What We've Learned


Random Testing
- Effective for security, mobile apps, and concurrency
- Complements but doesn't replace systematic, formal testing
- Generate test inputs from reasonable distribution to be effective
- May be less effective for multi-layered systems
