Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a
list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective
holders.
Patents
MathWorks products are protected by one or more U.S. patents. Please see www.mathworks.com/patents for more
information.
Revision History
September 2013 New for Version 2.2 (Applies to Release 2013b)
March 2014 Revised for Version 2.3 (Applies to Release 2014a)
October 2014 Revised for Version 2.4 (Applies to Release 2014b)
March 2015 Revised for Version 2.5 (Applies to Release 2015a)
Contents
1 Introduction
2 Initial Requirements
3 High-Level Semantics of Programming Languages
    Definition 1. (Operational Semantics)
    Definition 2. (Kripke Structures with Single Failure)
    Definition 3. (Run-Time Errors)
    Definition 4. (Strongest Invariant at k)
    Definition 5. (Run-Time Error Modalities/Colors)
    Proposition 6. (Run-Time Error Modalities Are Noncomputable)
    Definition 7. (Admissible Check Modalities)
    Proposition 8. (Semantics of C#(k))
4 High-Level Specification of Polyspace Code Prover Outputs
    Requirement HLR-1. (Soundness)
    Requirement HLR-2. (Run-Time Errors Yield Checks)
    Requirement HLR-3. (Check Colors Are Sound)
    Requirement HLR-4. (Call Graphs)
    Requirement HLR-5. (Call Graphs Are Sound)
    Requirement HLR-6. (Data Dictionaries)
    Requirement HLR-7. (Data Accesses Referenced in Dictionaries Are Sound)
    Requirement HLR-8. (Shared Statuses Referenced in Dictionaries Are Sound)
    Requirement HLR-12. (Compliance with coding standard)
    Requirement HLR-13. (Coding metrics)
5 High-Level Specification of Polyspace Code Prover Outputs ─ Independence
    Requirement HLR-10. (Component Independence)
    Requirement HLR-11. (Behavior Independence)
6 References
    6.1 Reference Documents
1 Introduction
This document describes the Theoretical Foundation for the Polyspace® Code Prover™
verification tool. It is intended for use in the DO-178C tool qualification process for verification
tools.
This document comprises the Tool Operational Requirements (reference DO-330 Section
10.3.1) for the following verification tools:
Polyspace® Bug Finder™
Polyspace Code Prover
The Tool Operational Requirements are defined as High-Level Requirements (HLRs) in this
document. The Tool Requirements are defined as Operational Requirements (ORs) and
Language Specific Requirements (LSRs) in the Tool Operational Requirements documents. To
comply with DO-330, Polyspace Bug Finder and Polyspace Code Prover Tool Requirements
trace to HLRs.
The following table summarizes the documents in which the Tool Operational Requirements and
Tool Requirements are defined. The table also provides the name of the requirement traceability
matrices.
This document describes the formal, theoretical background of Polyspace Code Prover
technology. Automatically and exhaustively detecting run-time errors in general programs is
very complex. The seminal idea, introduced by Ben Wegbreit in 1974 and 1975, is to perform
approximate computations in which the direction of approximation is controlled. These
approximations are formalized by closure operators on algebraic structures called complete
lattices, starting with a mathematical model of program execution via operational semantics and
Kripke structures.
Having described the theoretical framework, this document then describes high-level
requirements for the outputs of Polyspace Code Prover for ANSI C and ISO C++, as well as the
independence of Polyspace Code Prover outputs with respect to tools to which it is coupled.
These requirements are linked to operational requirements which can be found in accompanying
documents. These requirements apply to the core of Polyspace Code Prover and do not apply to
its peripherals, such as user interfaces that involve launching or exploitation interfaces.
Polyspace® Bug Finder™ identifies run-time errors in C and C++ embedded software. Polyspace
Bug Finder does not prove the absence of run-time errors. Polyspace Bug Finder uses the same
theoretical foundation as Polyspace Code Prover, but it is not irrefutable with respect to
identification of run-time errors.
2 Initial Requirements
Designing a bridge, choosing the trajectory for the launch of a communication satellite,
optimizing the shape of a plane wing, estimating the multiple echo effects of urban buildings in
cellular phone communications: what is common among these industrial activities is high-speed
processors and applied mathematics. The central paradigm is to model a physical world system
as a set of mathematical equations, solve these equations using high-speed processors, and
finally use the solutions to these equations to predict the behavior of the physical system.
The software industry has not yet really leveraged this paradigm to optimize its own verification
and validation processes. Polyspace Code Prover brings to the software industry the power of
applied mathematics and high-speed modern processors. The Polyspace Code Prover software
aims at helping users simultaneously:
Run-time errors are an important cause of software defects. The study of Sullivan and
Chillarege¹, conducted at Berkeley and IBM® Watson, found that many software defects
addressed during a four-year maintenance phase on large IBM codes are due to run-time errors.
Memory allocation errors, array out of bounds, uninitialized pointers, and pointer management
errors accounted for 26% of all observed software faults and more than 57% of the highest
severity faults, causing system outage or major disruption.
¹ M. SULLIVAN AND R. CHILLAREGE, Software defects and their impact on system availability, Proc. 21st International
Symposium on Fault-Tolerant Computing (FTCS-21), Montreal, 1991, 2-9, IEEE Press.
The Polyspace Code Prover software applies the mathematical modeling paradigm to
run-time errors. Polyspace Code Prover addresses two essential needs:
Static verification: statically predicting specific classes of run-time errors and sources of
nondeterminism
Semantic browsing: statically computing data and control flow to ease program
understanding, verification, or qualification
Given a source program, P, written in source programming language L, you want to compute
statically (without specific input data) and automatically a conservative model of the future
dynamic, run-time behavior of P. You also want to extract from this model predictions about the
possible occurrences of run-time errors and sources of nondeterminism (for static verification),
as well as data and control flow information (for semantic browsing).
This document serves as a reference for the design of Polyspace Code Prover and as a criterion
for functional validation testing. MathWorks uses an established tool life cycle process to
address tool development and verification activities. Hardware errors, coding errors, testing
errors, documentation errors, or other unforeseen circumstances may cause significant
deviations between expected behavior and actual behavior of the software tool. Therefore, this
document and the associated documents do not imply that MathWorks explicitly or implicitly
guarantees that Polyspace Code Prover is fully compliant with the specification, that it always
delivers correct results, or that it conforms to the user needs.
3 High-Level Semantics of Programming Languages
Program behavior and run-time errors are formalized to provide a firm basis for the specification
of Polyspace Code Prover outputs.
Definition 1. (Operational Semantics)
The operational semantics of a program, P, written in programming language L, consists of the
set of finite and infinite execution traces $O[P] \in 2^{\mathit{Trace}}$. An execution trace is a time-evolving
sequence of states defined as $\mathit{Trace} = \mathbb{N} \to \mathit{State}$. Each trace $\sigma \in \mathit{Trace}$ is a function from
non-negative integers to states. These integers represent the discrete computation time measured as the
number of elementary language constructs executed since program start.
The formal behavior of program P consists of the set of all possible runs of P, where each run is
represented by a possibly infinite sequence of states. States can be chosen according to the
programming language. Consider a simple flowchart programming language consisting of
integer variables, integers, arithmetic operations, assignments, conditionals, and loops. States
can be defined as pairs consisting of an integer representing the current flowchart instruction to
be executed, and a vector of integers in an n-dimensional space, where n is the number of
variables in the flowchart program, P, under consideration.
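The state representation described above can be sketched in a few lines. This is an illustration only: the five-instruction flowchart program, the `step` function, and the `trace` helper are hypothetical and not part of the specification.

```python
from typing import List, Optional, Tuple

# A state is (current flowchart instruction, values of the n integer variables).
State = Tuple[int, Tuple[int, ...]]

# Hypothetical flowchart program with a single variable x (n = 1):
#   1: x = 3
#   2: if x == 0 goto 5
#   3: x = x - 1
#   4: goto 2
#   5: halt
def step(state: State) -> Optional[State]:
    """Return the successor state, or None when the program has terminated."""
    pc, (x,) = state
    if pc == 1:
        return (2, (3,))
    if pc == 2:
        return (5, (x,)) if x == 0 else (3, (x,))
    if pc == 3:
        return (4, (x - 1,))
    if pc == 4:
        return (2, (x,))
    return None  # pc == 5: final state, no successor


def trace(initial: State, max_steps: int = 1000) -> List[State]:
    """Build an execution trace: element t is the state at discrete time t."""
    states = [initial]
    while len(states) <= max_steps:
        nxt = step(states[-1])
        if nxt is None:
            break
        states.append(nxt)
    return states
```

Calling `trace((1, (0,)))` yields one finite trace of this program; a program with an unbounded loop would instead be cut off at `max_steps`, which is a limitation of the sketch rather than of the definition.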
Definition 2. (Kripke Structures with Single Failure)
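The Kripke structure² with single failure associated to program P is a triple $(\mathit{State}, \mathit{succ}, \pi)$, where:
$\mathit{succ} \in \mathit{State} \to 2^{\mathit{State}}$ is a transition function that relates each state to its successors;
$\pi \in \mathit{State} \to 2^{A}$ is a valuation that associates each state with the set of atomic formulas true
in this state. $A$ contains the distinguished elements error, final, initial and the set $\{at_1, at_2, \ldots\}$;
$\forall s \in \mathit{State},\ \mathit{error} \in \pi(s) \Rightarrow \mathit{succ}(s) = \emptyset$;
$\forall s \in \mathit{State},\ \mathit{succ}(s) = \emptyset \Rightarrow \pi(s) \cap \{\mathit{error}, \mathit{final}\} \neq \emptyset$.
The operational semantics of P is then:
$O[P] = \{\, \sigma \in \mathit{Trace} \mid \mathit{initial} \in \pi(\sigma(0)),\ \sigma(1) \in \mathit{succ}(\sigma(0)),\ \ldots,\ \sigma(n+1) \in \mathit{succ}(\sigma(n)),\ \ldots \,\}$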
The transition function associates zero, one, or several successors with a given state: a state with
no successors is either an error state or a final state (corresponding to the nominal termination of
program P); a state with one successor is the ordinary case; and a state with several successors
corresponds to nondeterminism. Nondeterminism can be the result of task interleaving or can
occur as a model of input/output from the world external to the computer under
consideration.
No further hypotheses are made on how the transition function and the set of states are defined.
Defining them is the subject of the semantics of programming languages. For further information,
see the Handbook of Theoretical Computer Science by Van Leeuwen³.
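A small finite instance may help make Definition 2 concrete. The sketch below is illustrative: the four states, their atoms, and the helper names `is_single_failure` and `maximal_traces` are assumptions, and the trace enumeration only terminates for acyclic structures such as this one.

```python
from typing import Dict, List, Set

# Hypothetical four-state structure; state names and atoms are illustrative.
succ: Dict[str, Set[str]] = {
    "s0": {"s1", "s2"},   # several successors: nondeterminism
    "s1": {"s3"},         # one successor: the ordinary case
    "s2": set(),          # no successor: error state
    "s3": set(),          # no successor: final state
}
valuation: Dict[str, Set[str]] = {
    "s0": {"initial", "at1"},
    "s1": {"at2"},
    "s2": {"error", "at2"},
    "s3": {"final", "at3"},
}


def is_single_failure(succ, valuation) -> bool:
    """Check the two side conditions on error states and stuck states."""
    for s in succ:
        if "error" in valuation[s] and succ[s]:
            return False  # an error state must have no successor
        if not succ[s] and not (valuation[s] & {"error", "final"}):
            return False  # a state without successors must be error or final
    return True


def maximal_traces(succ, valuation) -> List[List[str]]:
    """Enumerate traces from each initial state until no successor remains."""
    result: List[List[str]] = []

    def extend(path: List[str]) -> None:
        nexts = succ[path[-1]]
        if not nexts:
            result.append(path)
        for n in sorted(nexts):
            extend(path + [n])

    for s in sorted(succ):
        if "initial" in valuation[s]:
            extend([s])
    return result
```

Here the structure has two maximal traces: one ends in the final state `s3`, the other in the error state `s2`.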
Definition 3. (Run-Time Errors)
A run-time error occurs in state $s \in \mathit{State}$ if and only if $\mathit{error} \in \pi(s)$.
What is the connection with actual programming languages? The ANSI C and C++ standards
have an immediate notion of run-time error, as each standard gives an
informal but precise definition of the cases where a run-time error occurs. Examples of run-time
errors include indexing an array out of its bounds, dividing by zero, referencing an illegal field
of a structure, or dereferencing a dangling pointer.
Definition 4. (Strongest Invariant at k)
The strongest invariant at k in program P, SGI(k), is the set of all possible states that are at
point k and reachable in program P. It can be
equivalently formulated by translating the source program, P, to a system of equations that has
one equation per program point⁴ ⁵ ⁶. For each program point k, the solution yields the invariant
SGI(k).
⁴ R. FLOYD, Assigning meaning to programs. In Mathematical Aspects of Computer Science, Proc. of Symposia on Applied
Mathematics, American Mathematical Society, 19-32, Providence, 1967.
⁵ D. PARK, Fixpoint induction and proofs of program properties, in Machine Intelligence, Edinburgh Univ. Press, 5: 59-78, 1969.
⁶ E. CLARKE, Program invariants as fixedpoints, Computing 21: 273-294, 1979.
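One way to write the set described in Definition 4 is sketched below. It assumes that $\sigma$ names a trace in $O[P]$, that $\pi$ names the valuation of Definition 2, and that the atom $at_k$ marks the states located at program point k. These symbol names are assumptions of this sketch.

```latex
\mathit{SGI}(k) = \{\, s \in \mathit{State} \mid at_k \in \pi(s) \ \wedge\ \exists \sigma \in O[P],\ \exists n \in \mathbb{N} :\ \sigma(n) = s \,\}
```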
Definition 5. (Run-Time Error Modalities/Colors)
In some instances, deviation from the reference workflow explained in this document might
occur.
This formal definition of the (semantic, exact) color (modality) associated with program point k
is the cornerstone of the Polyspace Code Prover specification. Polyspace Code Prover does not
rely on a binary partition of cases (correct versus incorrect) but on a more expressive set of four
modalities. Having more than two modalities is a common phenomenon in modal or temporal
logics.
A program point associated with gray or green will not raise run-time errors during execution. A
program point associated with red triggers a run-time error if executed. Gray identifies a
program point that cannot be executed (dead code) and orange is associated with a program
point that can intermittently execute correctly or incorrectly.
The color C(k) is a prediction of the future behavior of program P. If you compute for each
program point k of a program P the modality C(k), then there is a powerful means of verifying
the absence of run-time errors. First fix the red errors in P and then fix the orange program
points by either inserting protection code or correcting the cause of the underlying red case. At
the end, there is a program containing only gray and green program points that is completely
free of run-time errors in any future execution.
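The four modalities described above can be illustrated with a minimal sketch. It assumes the set of states reaching point k is given explicitly and that a hypothetical predicate `fails(s)` tells whether the operation at k raises a run-time error from state `s`. Neither is available in general, as the next proposition shows.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def color(states_at_k: Iterable[S], fails: Callable[[S], bool]) -> str:
    """Classify program point k from the set of states that reach it.

    `fails(s)` is a hypothetical predicate: True when executing the
    operation at k from state s raises a run-time error.
    """
    outcomes = {fails(s) for s in states_at_k}
    if not outcomes:
        return "gray"    # no state reaches k: dead code
    if outcomes == {False}:
        return "green"   # every execution of k is correct
    if outcomes == {True}:
        return "red"     # every execution of k fails
    return "orange"      # some executions fail, others do not
```

For example, with the operation `100 // x` at k and `fails = lambda x: x == 0`, the reachable values `{1, 2}` give green, `{0}` gives red, and `{0, 1}` gives orange.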
Proposition 6. (Run-Time Error Modalities Are Noncomputable)
Given an arbitrary program, P, written in programming language L, the run-time error
modalities for P are noncomputable in finite time by any initially established means.
Observe that the halting problem (deciding if a program stops) is reducible to the run-time error
problem. As the halting problem has been shown to be undecidable by Church and Turing⁷ in
the general case, and for computer programs by Hoare and Allison⁸, it follows that the
computation of C(k) is undecidable as well, which in turn implies noncomputability in finite
time.
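The reduction can be pictured with a short sketch. It is an informal illustration, not the proof: `with_error_after` is a hypothetical helper, and the wrapped program is assumed to be deterministic and free of run-time errors of its own.

```python
from typing import Callable


def with_error_after(program: Callable[[], None]) -> Callable[[], None]:
    """Build a program that runs `program`, then performs a failing operation.

    The division below stands for program point k. It is reached if and
    only if `program` halts, so its color is red when `program` halts and
    gray when it never does.
    """
    def wrapped() -> None:
        program()
        _ = 1 // 0  # point k
    return wrapped
```

A procedure able to compute the exact color of point k for every such wrapped program would therefore decide whether `program` halts.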
What is the practical significance of this theoretical fact? Does it mean that run-time errors are a
problem for which theoretical computer science cannot help? When confronted with a problem
whose computational complexity is too high, you can resort to approximate methods, if
they provide close enough solutions within acceptable time and memory usage.
Probabilistic or statistical methods give results of the form “C(k) is green with a 95%
confidence interval”. These results raise questions that are difficult to answer, such as: What is the
underlying notion of distribution? Such probabilistic information cannot validate highly
critical systems such as power plant control systems or fly-by-wire software.
Algebra, lattice theory⁹, and logics introduce another approximation that is based on
implication and partial orderings. This approximation obtains results of the form: “C(k) is not
orange or red”. This is an approximate property; it could mean that C(k) is actually either
gray or green. Despite the approximation, it can be directly used because it implies that the
operation at k is never wrong.
This document pursues this second approach, following the pioneering work of Wegbreit¹⁰,
further extended by Karr, Cousot and Cousot, Halbwachs, Jones, Sharir and Pnueli, and others to
more complex language constructs and more expressive properties.
The key idea of Wegbreit’s work is that while the exact program property may not be
computable, a weaker property implied by the exact one may be computable. He also devises a
means to compute these approximate properties by replacing the exact invariant propagation of a
Floyd-style equation system by its image in a weaker space through an approximation operator
$\rho$. If properly designed, the approximated system is solvable and its solution, called SGI#(k) at a
particular program point k, is related to the strongest global invariant by:
$\rho(\mathit{SGI}(k)) \subseteq \mathit{SGI}^{\#}(k)$.
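As an illustration of such an approximation, the sketch below replaces a set of integer values by the smallest interval containing it. Intervals are an assumption chosen for the example; this document does not state which approximations the tool uses. The names `to_interval` and `contains` are hypothetical.

```python
from typing import Optional, Set, Tuple

# An interval (lo, hi) stands for all integers lo..hi; None is the empty set.
Interval = Optional[Tuple[int, int]]


def to_interval(values: Set[int]) -> Interval:
    """Approximate a set of integers by the smallest enclosing interval."""
    return (min(values), max(values)) if values else None


def contains(iv: Interval, v: int) -> bool:
    """Membership test in the set of integers that the interval stands for."""
    return iv is not None and iv[0] <= v <= iv[1]
```

If the exact values of `x` at point k are `{-2, 3, 5}`, the interval is `(-2, 5)`. It contains every exact value, so it is a superset, but it also contains 0, which the exact set does not. A division by `x` at k is therefore correct for every exact value, yet cannot be shown correct from the interval alone.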
This approximate invariant can in turn be used in the computation of run-time error modalities
(colors) according to Definition 5. Replacing the exact invariant SGI(k) at point k in this
definition by its supersets yields an approximate check color C#(k) that is related to the exact
color C(k) by the following relation.
Definition 7. (Admissible Check Modalities)
An approximate check color C#(k) is admissible with respect to the exact check color C(k) if,
and only if:
$C(k) \sqsubseteq C^{\#}(k)$
The following lattice diagram defines the ordering between check colors.
In turn, this induces the following meanings for the color defined by C#(k).
Proposition 8. (Semantics of C#(k))
If C#(k) is computed with an approximate invariant that is a superset of SGI(k), then it is related
to the exact color C(k), as follows:
Knowing the value of the computable term C#(k) gives partial but useful information about the
actual value of the term of interest C(k) that is not computable:
If C#(k) = green, then C(k) is neither red nor orange; the operation at k is correct.
Conversely, if C(k) = orange or C(k) = red, then C#(k) is either red or orange.
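These two consequences can be checked by brute force on a small example. The sketch below assumes an ordering in which gray lies below green and red, and orange lies above both. That ordering is an assumption, chosen to be consistent with the statements above. The sketch also assumes a hypothetical predicate `fails(s)` marking the states from which the operation at k fails.

```python
from itertools import combinations

# Assumed ordering between colors: gray at the bottom, orange at the top.
LEQ = {
    ("gray", "gray"), ("gray", "green"), ("gray", "red"), ("gray", "orange"),
    ("green", "green"), ("green", "orange"),
    ("red", "red"), ("red", "orange"),
    ("orange", "orange"),
}


def color(states, fails):
    """Color of a program point given the states reaching it."""
    outcomes = {fails(s) for s in states}
    if not outcomes:
        return "gray"
    if outcomes == {False}:
        return "green"
    if outcomes == {True}:
        return "red"
    return "orange"


def subsets(universe):
    """Yield every subset of a finite set as a frozenset."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def admissible_everywhere(universe, fails) -> bool:
    """Check that the exact color is below the approximate color
    for every pair exact <= approx <= universe."""
    for approx in subsets(universe):
        for exact in subsets(approx):
            if (color(exact, fails), color(approx, fails)) not in LEQ:
                return False
    return True
```

With the universe `{-1, 0, 1, 2}` and `fails = lambda x: x == 0`, every superset yields a color at or above the exact one. In particular, a green result from the superset leaves only gray or green as the exact color.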
4 High-Level Specification of Polyspace Code Prover Outputs
Requirement HLR-1. (Soundness)
The outputs generated by Polyspace Code Prover shall be irrefutable (sound) with respect to
run-time errors, call trees, and data dictionaries, as specified by the applicable language standard.
This applies to programs that meet applicable language standards at compile-time: programs that
are syntactically correct and for which the context conditions prescribed by the standard are
satisfied (which includes type checking). Language-specific requirements and/or restrictions
may apply. The semantics of the corresponding programming language may be parameterized
by options passed to the Polyspace Code Prover software that describe the target processor and
the target environment, by options that change specific parts of the semantic model, or by
options that favor either analysis time or precision.
Requirement HLR-7. (Data Accesses Referenced in Dictionaries Are Sound)
If a read or write access can be dynamically issued to a global variable, then it shall appear in
the data dictionary.
¹¹ A. BERNSTEIN, Analysis of Programs for Parallel Processing, IEEE Trans. on Computers, EC 15: 5, 757-763, 1966.
5 High-Level Specification of Polyspace Code Prover Outputs ─ Independence
Requirement HLR-10. (Component Independence)
Polyspace Code Prover core components shall be specified, developed, and tested independently
from MathWorks code generators. As a result, core components developed by one team are not
reused by another team.
6 References
6.1 Reference Documents
Floating point arithmetic standard IEEE 754