
Int J Softw Tools Technol Transfer (2013) 15:53–63
DOI 10.1007/s10009-012-0255-9

ALL-TIMES

Practical experiences of applying source-level WCET flow analysis to industrial code

Björn Lisper · Andreas Ermedahl · Dietmar Schreiner · Jens Knoop · Peter Gliwa

Published online: 24 July 2012
© Springer-Verlag 2012

Abstract  Code-level timing analysis, such as worst-case execution time (WCET) analysis, usually takes place at the binary level. However, many program properties that are important for the analysis, such as constraints on possible program flows, are easier to derive at the source code level, since this code contains much more information. Therefore, various source-level analyses can provide valuable support for timing analysis. However, source-level analysis is not always smoothly applicable in industrial settings. In this paper, we report on the experiences of applying source-level analysis to industrial code in the ALL-TIMES project: the promises, the pitfalls, and the workarounds that were developed. We also discuss various approaches to how the difficulties that were encountered can be tackled.

Keywords  Static analysis · WCET analysis · Source-level analysis · Embedded software

B. Lisper (B) · A. Ermedahl
School of Innovation, Design, and Engineering, Mälardalen University, 721 23 Västerås, Sweden
e-mail: bjorn.lisper@mdh.se

A. Ermedahl
e-mail: andreas.ermedahl@mdh.se

D. Schreiner · J. Knoop
Institute of Computer Languages, Vienna University of Technology, 1040 Vienna, Austria
e-mail: schreiner@complang.tuwien.ac.at

J. Knoop
e-mail: knoop@complang.tuwien.ac.at

P. Gliwa
GLIWA GmbH embedded systems, Blütenstr. 20, 82362 Weilheim i.OB., Germany
e-mail: peter@gliwa.com

1 Introduction

The purpose of timing analysis is to find out whether the timing constraints of a system are met. It is usually divided into system-level analysis, dealing with the combined execution of different programs ("tasks"), and code-level analysis, dealing with the timing of individual tasks. The most important entity to estimate on code level is the worst-case execution time (WCET) of a task. WCET estimates are needed for many system-level analyses. For these to yield reliable results, the WCET estimates must be safe, meaning that they will never underestimate the true WCETs. Unsafe WCET estimates can also be useful, in situations like early design phases where they can help to dimension the system.

WCET analysis assumes that the code runs to completion in isolation, without being interrupted. The two main kinds of analysis are measurements and static analysis. In general, measurements cannot give safe estimates and are thus suitable for less time-critical software. For time-critical software, where the WCET must be safely estimated, static analysis is preferable. Hybrid analysis combines elements of static and measurement-based analysis.

Static WCET analysis derives a WCET estimate from mathematical models of the code and the hardware. If the models are correct, then a static WCET analysis will always derive a safe WCET estimate. The static analysis is usually divided into three phases: a flow analysis, where information about the possible program execution paths is derived; a low-level analysis, where the execution times for atomic parts of the code, like basic blocks, are determined from a performance model of the target architecture; and a final calculation, where the flow and timing information is combined to derive a WCET estimate.

The flow analysis must handle a possibly very large number of paths. Thus, approximations are needed.


If the approximations are safe (the representations include at least the possible paths), then a WCET analysis is still safe, although it might become pessimistic. Flow analysis research has mostly focused on loop bound analysis [12], since upper bounds on the number of loop iterations must be known in order to derive finite WCET estimates. Automatic methods to find these exist, but often some loops must still be bounded manually. Flow analysis can also identify other types of program flow constraints, like infeasible paths [11], constraining the number of times different program parts can be executed together.

To be accurate, flow analysis should be performed on the binary level, for the code actually being executed. However, much information is often missing in this code. Therefore, flow analysis is sometimes done on source code. An advantage is that there is much more high-level information in the code, which can help to obtain precise flow analysis results. Also, source-level analysis is potentially more portable, as the source code format should be independent of the target platform. (However, this is not always true: see Sect. 7.2.)

As a downside, the program flows in the source and the binary are sometimes not exactly the same, especially if the compiler applies heavy optimisations. Program flow constraints derived from the source code can then be unsafe for the program flows of the binary. But there are still many situations where this is tolerable, for instance when making rough timing estimates in early design phases. Also, the technique is applicable for safety-critical applications where safety standards demand that compiler optimisations are turned off.

The purpose of the ALL-TIMES project was to create methodologies and toolchains for combinations of different timing tools and techniques, including source-level flow analysis. The techniques were finally tried out on an industrial case study from the automotive area. In this paper, we report on some practical experiences of source-level flow analysis in the project. We ran into a number of problems when attempting to analyse the source code for the case study. We describe these problems, and the workarounds that we had to make to be able to proceed. The problems that we encountered seem to be quite general for source-level analysis of this kind of code, and we discuss a number of possible ways to tackle them.

The rest of this article is organised as follows: in Sect. 2 we give an account of related work. Section 3 briefly describes the role of source-level analysis in the ALL-TIMES project. In Sect. 4, we describe the target system to which we applied source-level flow analysis. Section 5 describes how the results of the source-level analysis were used, and which aspects were tested. Section 6 describes our analysis tool SWEET, which was used in the case study, and its flow analysis. Section 7 reports on the results and experiences of the case study. Section 8, finally, wraps up and discusses possible directions for future work.

2 Related work

There have been a number of studies of WCET analysis of industrial code. There are some reports on using commercial static analysis WCET tools to analyze code for space applications [14,15,24] and in the avionics industry [8,21,29]. In [4,26], time-critical parts of the commercial real-time operating system Enea OSE were analysed. Experiences from WCET analysis of real-time communication software for automotive systems were reported in [3]. All these case studies concern analysis of binary code only. A problem detected in several of them is the need to provide manual annotations for program flow properties at binary level, which was often found to be cumbersome and error-prone. Binary-level annotations were necessary either because the source code was not available, or because the optimizations performed by the compiler would change the control flow too much. Although research has been done on how to transform program flow annotations through code transformations [16,23], such transformations are not done by any production compilers.

Sehlberg et al. [28] report on WCET analysis of a number of tasks in the transmission control of an articulated hauler. This analysis was done on binary level. Also in this study, a problem found was the sometimes large effort needed to provide manual program flow annotations in order to obtain tight WCET estimates. For one task, several weeks were needed to provide the appropriate annotations. In a follow-up study [1], the program flow analysis was done on source level, and the results were passed to the binary-level analysis by hand. The experiment was successful in the sense that almost all program flow constraints on the binary level could be automatically derived on source level. Compiler optimisations changing the control structure did not pose any big problems for the translation. What did pose a problem was the fact that the tasks communicate through data stored in intricate data structures. Value annotations had to be provided for some of these structures, which proved to be difficult for two reasons: the inability of the annotation language to express accurately the value fields in the data structures, and the difficulty of finding proper value ranges for data stored and updated by several tasks. For this case study, there was access to all the source code. The code was, with some exceptions, written in a way adhering to existing C standards, and parsing did not pose any major problems.

There are a number of general-purpose academic or commercial tools for source-level static analysis. Some examples are ASTRÉE [6], Coverity [2], Polyspace, and Klocwork. These tools can check for a variety of possible errors, such as division by zero, array index out of range, or potential memory leaks, to name a few. A comparative evaluation of Coverity, Polyspace, and Klocwork is found in [7].

In [2], a number of practical problems for source-level static analysis tools, when applied to industrial code, were reported. Among the experiences reported were the need to handle different environments for program development, problems parsing code accepted by the customers' compilers, and the need to reduce the ratio of false to true positives. Clearly, these experiences apply to the source-level analyses that support WCET analysis as well, and they are in line with what we report here.


Fig. 1 Source-level analysis tools and tool connections in ALL-TIMES [figure not reproduced]

3 Source-level analysis in the ALL-TIMES project

One particular goal of ALL-TIMES has been to provide support for timing analysis from source-level analyses. Figure 1 shows the tool chains created in the ALL-TIMES project: the parts that concern source-level analysis are the ones that are not greyed out. Two kinds of program flow information were found to be interesting to export to the WCET analysis tools aiT and RapiTime: information about possible values of function pointers in different program points, and traditional program flow information such as bounds on the number of loop iterations and infeasible path constraints. These analyses are provided by the tools SATIrE and SWEET, which have been briefly described in [19]. As mentioned there, SATIrE is based on the Rose open source compiler infrastructure [25]. In particular, it uses the C/C++ frontend of Rose to parse C source code before analysing it. It can deliver function pointer values to aiT and RapiTime using each tool's native format for manual source-level annotations. SATIrE also has an experimental loop bounds analysis (see [27]). SWEET can compute a number of program flow constraints, ranging from simple loop bounds to complex constraints for infeasible paths, and can export these to aiT and RapiTime using the same annotation formats as SATIrE. SWEET's flow analysis analyses the ALF code format (see Sect. 6). SATIrE can convert Rose's internal program representation to ALF through the "melmac" tool, and can thus act as a C frontend to SWEET. Thus, the source-level analysis of SWEET uses SATIrE as a frontend.

The results of ALL-TIMES were validated using industrial code from the automotive domain (see Sect. 4). The validation is described in detail in [20].

4 The target system

The target system that was used for the final validation in ALL-TIMES is a fuel cell control system from Daimler, for their experimental "F-Cell" fuel-cell powered version of the Mercedes-Benz A-Class. This control system consists of three ECUs. The source-level analysis in ALL-TIMES was validated using the VECU (Vehicle Control Unit), which calculates the torque necessary to fulfil the driver's request in the current situation. It also handles the buffer battery. For this particular unit, the project could obtain access to the source code, which was crucial for the validation of the source-level parts of the toolchains. (It should be mentioned that it is usually not easy for people outside the company in question to have access to industrial source code. This can make it hard to try out source-level analysis techniques on realistic codes.)

The VECU unit has a Freescale MPC 564 processor. It has no hardware trace facilities, and it uses the ETAS ERCOSEK operating system.


The source code consists of ca. 685k lines of code, divided into 1297 C files (.c), 768 C header files (.h), and 28 assembly files (.a). The C files range in size from less than 1 KB to 1997 KB (22910 lines of code), the largest .h file is 611 KB (12930 lines of code), and the largest .a file is 1298 KB. The source code is compiled with the Windriver compiler (Diab Data).

In addition to the source code, the project had access to a development environment including compiler, debugger, and hardware. The target system is described in some more detail in [20].

5 Source code analysis validation

The source code analyses developed within ALL-TIMES were validated as parts of larger tool chains, involving also the WCET analysis and system-level timing analysis tools in the project (see [19,20]). The tool chains were validated using two scenarios, in the following way. The first validation scenario was an "early design exploration" scenario, where TimingExplorer [13] is used to select an appropriately upgraded hardware platform for an existing application. For this purpose, a set of approximate WCETs of the tasks was computed for each investigated hardware configuration and subsequently sent to SymTA/S for a schedulability analysis. Computing bounded WCETs requires that all loops in the code are bounded. TimingExplorer does perform an automatic loop bounds analysis: however, this analysis is done on the binary code, and so, for the target code of VECU, TimingExplorer was unable to bound 102 loops. The source-level analysis validation for this scenario was to see how many loops SWEET could additionally bound by a source-level analysis, and pass the results to TimingExplorer to avoid the time-consuming process of manually annotating the loop bounds.

The second scenario was a "late stage verification" scenario, where SATIrE, SWEET, and RapiTime are used to analyse/measure the code and determine the WCETs of the tasks in the target system. RapiTime produces traces that are visualised using one of the so-called "trace viewers" provided with T1 or SymTA/S. SymTA/S also performs a scheduling analysis to verify schedulability under worst-case conditions. Here, it was tested how well SATIrE could determine function pointer values. RapiTime needs these values to cut down the set of possible execution paths when calling a function that is indirectly referenced through such a pointer. SWEET's loop analysis was used in a different way than for the first scenario. RapiTime measures execution times for program fragments and combines this information into a WCET estimate using program flow constraints. VECU has no hardware tracing facilities, and thus the code had to be instrumented for measuring execution times. The instrumentation had to be sparse, since otherwise buffers would overflow and data would be lost. Therefore, it was interesting to identify parts of the code with little variability in execution time, since such parts are prime candidates not to instrument. Loops that always execute the same number of times are likely to have low execution time variability. Since SWEET can compute both upper and lower iteration bounds for loops, it was tested how many loops SWEET could find where these bounds were equal (and thus the loop always executes the same number of times). This information could help reduce a time-consuming inspection of the code to find those parts of the code where instrumentation could be reduced.


6 SWEET, and its flow analysis

The SWEdish Execution Time tool (SWEET, http://www.mrtc.mdh.se/projects/wcet/sweet.html) is a WCET analysis research tool from Mälardalen University. It has the standard WCET analysis tool architecture, with a flow analysis, a low-level analysis, and a final WCET estimate calculation.

The flow analysis of SWEET is the part currently being most actively maintained and developed, and this is the part of SWEET that has been of concern for the ALL-TIMES project. The analysis is able to produce a number of program flow constraints, ranging from simple upper and lower loop iteration bounds to complex infeasible flow constraints. The analysis is context-sensitive as well as input-sensitive, the latter meaning that the analysis can take restrictions on input values into account when computing the program flow constraints.

The main analysis method of SWEET is called abstract execution (AE) [11]. AE can be seen as a fully context-sensitive abstract interpretation, where each loop iteration (or recursive function call) is analysed separately. It is therefore a "deep" analysis method, based on the semantics of the program. Alternatively, AE can be seen as a kind of symbolic execution where program variables store abstract values, and where operators and functions are given their interpretations in the abstract value domain. Figure 2 illustrates how AE works for a simple loop. Note the input-sensitivity: the restricting interval for the initial value of i will directly influence the resulting iteration bounds for the loop. A simpler, non-input-sensitive analysis would not be able to bound this loop. The current implementation of AE in SWEET uses the abstract interval domain for numerical values, but AE can in principle use any abstract domain.

AE uses abstract states, which consist of a program point and a representation of a set of possible memory states, currently in the form of a function from program variables to intervals. As the analysis proceeds, the abstract state is updated according to the abstract semantics of the current statement.

Fig. 2 Example of abstract execution: (a)–(c) [figure not reproduced]
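Since Figure 2 is not reproduced here, the following C fragment is our own sketch of the kind of loop the discussion refers to, assuming the initial value of i is only known to lie in the interval [0, 4]; the variable names and the read_input function are hypothetical:

/* Sketch only: a loop whose iteration count depends on the input.
   With i initially in [0, 4], the loop body executes between 3 times
   (i = 4) and 5 times (i = 0), so AE can derive the bounds 3 and 5
   for the execution counter of the body. After three iterations the
   interval for i straddles the bound 10, so the loop test becomes
   uncertain and AE splits the abstract state (iterations 4 and 5). */
int i = read_input();   /* hypothetical input, assumed in [0, 4] */
while (i < 10) {
    /* loop body: program point p */
    i = i + 2;
}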

Notably, several abstract states can be spawned when executing conditions: see Fig. 2 for an example, where this happens for iterations 4 and 5. Thus, AE must be able to handle several concurrent abstract states. To keep the number of abstract states down, they can be merged into new abstract states that safely overapproximate the union of the concrete states involved. Merging may reduce the precision of the analysis, so it should be used only when necessary. SWEET offers a choice of user-selectable strategies for placing the points where to merge, thereby providing several tradeoffs between precision and analysis complexity: the current strategies are given in Table 1. Of these strategies, "no merge" gives the highest precision but may cause the number of abstract states to grow exponentially, "all" will give the largest reduction of the number of abstract states but will also give the lowest precision, and for the other strategies the impact will depend on the properties of the code under analysis.

Table 1 SWEET's different merge strategies

Merge strategy               Explanation
No merge                     Never merge any abstract states
Each function entry point    Merge abstract states before making function calls
Each function return point   Merge before returning from function calls
Each loop exit edge          Merge at each point where a loop can be exited
Each back-edge               Merge at each back-edge of each loop, going back to its header node
Each point of joining edges  Merge where program flows join, like after conditionals
All                          Apply all the strategies above

As mentioned, AE is context-sensitive, which implies that different function calls will be treated separately by the analysis. This will in general increase the precision of the analysis, at the cost of possibly analysing the same function several times. Reconsidering the example in Fig. 2, let us assume that the code is included in the body of a function taking i as a parameter. Depending on the possible values of i for different function calls, the loop may iterate a different number of times, and a precise WCET analysis should take this into account. AE will analyse the function anew for each call, using the actual interval for i, and will thus be able to derive loop bounds that are specific for each call.
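To make the context-sensitivity concrete, here is a small sketch of our own (not code from the case study), continuing the loop shape assumed above; AE analyses f once per call, with the actual interval for the parameter:

/* Sketch: AE analyses f separately for each call context. */
void f(int i)
{
    while (i < 10) {    /* loop body = program point p */
        i = i + 2;
    }
}

void caller(void)
{
    f(0);   /* i = [0, 0]: AE derives #p = 5 for this call */
    f(8);   /* i = [8, 8]: AE derives #p = 1 for this call */
}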
The computed flow constraints are arithmetic constraints on so-called execution counters, virtual variables that count the number of executions in different program points. (In Fig. 2, "#p" is such a counter.) SWEET has a general method to collect information about possible executions into different kinds of program flow constraints. The resulting constraints can be directly used in the final calculation of the WCET estimate. Below we show some simple program flow constraints: an upper and a lower loop bound constraint restricting the number of executions of program point p to be between 3 and 5 (like in the example in Fig. 2), and a constraint specifying that the program points c1 and c2 can be executed at most 100 times together:

#p <= 5
#p >= 3
#c1 + #c2 <= 100

SWEET has the ability to calculate these kinds of constraints, as well as considerably more complex ones [11]. In particular, SWEET can calculate linear inequalities expressing more complex infeasible path information that involves several program points. SWEET can also generate context-sensitive flow constraints for different call strings. Each flow constraint is either a "foreach" constraint, valid for individual iterations of an outer loop, or a "total" constraint restricting the total number of executions of program points. See [9] for details.

The analysis method of SWEET is general and not a priori tied to any language or instruction set. To take advantage of this generality, SWEET's flow analysis analyses the code format ALF [10]. ALF is an intermediate format designed to be able to faithfully represent code on different levels, from source to binary level. It therefore contains high-level concepts, such as functions and function calls, as well as low-level concepts like dynamic jumps. Thus, SWEET can analyse code both on binary and source level, provided that a translator into ALF is present. As mentioned, the melmac tool is such a translator that maps the internal format of Rose into ALF, thereby enabling SATIrE to be used as a C frontend to SWEET.


The flow constraints generated by SWEET are by default given in SWEET's own "flow fact language". This language expresses flow constraints for ALF code. If the ALF code is generated from C code by melmac, then SWEET can use information provided by melmac to relate the flow constraints to the C source code. This enables SWEET to generate annotations in the AIS format for aiT [13], and in RapiTime's annotation format for flow constraints.

7 Results and experiences

We now describe our experiences from doing source-level analysis on the validator (automotive ECU) code in the ALL-TIMES project.

We discuss the problems encountered for the source-level program flow analysis performed by SWEET. Most of the problems are common to SATIrE's analyses as well, and indeed they should be common to "deep", semantics-based source-level analyses in general. In our experiments, SWEET was run under Windows XP whereas SATIrE was run under Linux: however, all conclusions to be drawn should be independent of the environment used.

The steps necessary to perform SWEET's flow analysis are basically the following. This scheme should hold for source-level analysis in general:

1. identify the source files that are needed for the analysis,
2. convert the files to a format suitable for the static analyses to be applied,
3. "link" the converted files to enable inter-procedural analyses across file borders,
4. perform the program analyses, and
5. map the results back to the source code, in a format suitable for other tools.

We now describe our experiences for each step in turn.

7.1 Step 1: identify needed source files

Academic benchmarks for WCET analysis research tend to consist of small, self-contained programs where each program to analyse resides in a single source file. Not so in real projects: they may consist of thousands of files, where the code that needs to be analysed is scattered over many different files.

The first step is to identify the entry points for the code to analyse. For WCET analysis, this typically amounts to finding the start functions for the tasks, or runnables, in the system. If the application uses an operating system, then the tasks can usually be found from some of its configuration files. This is the method that we used. It requires knowledge of how the specific operating system used is configured. We identified 18 start functions for tasks in the VECU system from the configuration files for the ETAS ERCOSEK operating system.

Second, the files containing the code that is needed for the analysis are to be identified. This may sound trivial: a brute-force approach that should seemingly work would be to include all source files in the project. But this requires knowledge of where the source files are located, which typically has to come from the build tool (which, if unlucky, may be some set of obscure scripts). A complication is that some source files may include assembly code, or that they are simply missing (only binaries available): more on the consequences of this in Sect. 7.4.

A further complication, which we encountered in the project, is late binding. Functions with the same name may be defined in different compilation units, and only at link time is it selected which of these functions to actually call, by including the corresponding object file (see the sketch at the end of this subsection). This is a simple way to implement a "product line software architecture", and it is helpful in situations where variants of the same software are used in different contexts. Consequently, late binding seems to be very common for embedded codes. With late binding, the source code simply does not contain sufficient information about which source files to include; information from the build tool is needed.

To summarize: Step 1 in general requires information both from the operating system configuration and the build tool, and a source-level analysis tool must somehow acquire it.
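A minimal sketch of the late-binding pattern (hypothetical file, function, and task names, not from the VECU code): both compilation units define torque_limit, and only the link step, driven by the build tool, decides which variant a task actually calls:

/* variant_a.c -- variant for product configuration A */
int torque_limit(void) { return 100; }

/* variant_b.c -- variant for product configuration B */
int torque_limit(void) { return 250; }

/* task.c -- nothing here reveals which variant is used; the build
   scripts link in either variant_a.o or variant_b.o */
extern int torque_limit(void);

void task_10ms(void)
{
    int limit = torque_limit();
    /* ... control code using limit ... */
    (void)limit;
}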


7.2 Step 2: converting source files

For SWEET's flow analysis, the conversion is done in two steps: first, the source files are parsed into the internal format of Rose, and then a translation to ALF is done by the melmac tool.

Both these steps presented problems. The Rose compiler framework uses a well-known commercial C/C++ frontend from Edison Design Group (EDG). This frontend is distributed with Rose as a binary, so it cannot be changed. Much to our surprise, it turned out that the parser could not process some of the target system's source files. The VECU target system is compiled with the Windriver Diab compiler, which also uses the EDG frontend, so we believed it would be compatible with the frontend for Rose. However, the Diab version of the frontend was configured to parse a different dialect of C. Fundamental data types are customised for an embedded platform in the Windriver compiler, while Rose's frontend targets 64-bit Unix systems. Furthermore, pragmas, preprocessor directives, and especially packing options were encoded in a format that was not understood by Rose's version of the EDG frontend.

Since we could not change the Rose frontend, we had to rewrite some source files by hand to get them through the parser. There were also a total of 46 occurrences of inlined assembly code, located in 25 source files, which the frontend could not process either. We dealt with this by simply commenting out the assembly code. The inlined assembly snippets were mainly single instructions, or directives allocating buffer areas in memory. Commenting them out is not a strictly safe solution, since the assembly code could potentially affect the program flow: however, this seems very unlikely for the inlined assembly that we encountered. Overall, the problems encountered were very similar to those reported in [2].

It also turned out that Rose itself had problems handling some features of the target system's C code. For instance, it failed to recognize equivalent typedefs as being aliased, it was not able to handle all types of inline assembly code, and it was unable to represent static initialization of function pointer arrays: the latter was especially problematic, since real-time operating systems like ETAS ERCOSEK often use static arrays of function pointers to provide the start functions for the tasks in the system. We had to rely on the Rose developers for fixing the problems, which took time.

The C code of the target system would sometimes stretch the limits of the C standard. While being handled by Rose, this would trigger bugs in the melmac tool, causing it to generate invalid ALF code. These bugs had gone unnoticed during the test phase of melmac, since the C code used for testing did not contain these "features". Examples of coding that triggered melmac errors are functions being called with a different number of arguments than declared, functions returning no value while not being declared to be of type void, and the same function being declared and imported at the same time. Since these bugs were discovered late in the project, we dealt with them by allowing SWEET to accept ALF code that does not adhere to the ALF specification, rather than by correcting the bugs in melmac (which would have taken more time).
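For illustration, constructs of roughly the following shape (our reconstruction, not code from the target system) are accepted by many pre-C99 embedded compilers and are the kind of thing that tripped up melmac:

/* Reconstruction of melmac-triggering constructs, not actual VECU code. */

status()              /* no return type: defaults to int under pre-C99 rules */
{
    /* falls off the end without returning a value,
       yet is not declared void */
}

extern int filter();  /* old-style declaration with unspecified parameters */

int sample(void)
{
    filter(1, 2, 3);  /* called with a different number of arguments
                         than the definition actually takes */
    return 0;
}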
After selecting files according to the late binding, there were 1234 C files to convert. Of these, eight would not pass Rose's frontend. For five files, the translation failed due to missing declarations/include files, and for 21 files there were problems with the generated ALF code. Thus, in the end there were only 1200 files for which correct ALF could be generated.

The moral of this is that an industrial-strength source analysis tool, at least for C, must be able to adapt to the variety of C dialects defined by the C compilers that are in use. It is hard to accomplish this without being in full control of the translation chain.

7.3 Step 3: link the converted files

When analysing code that spans several files or compilation units, a global namespace has to be created where symbols have an unambiguous meaning. This process can be seen as a kind of linking [18], but on source-code level.

Source-level linkers exist. One example is cilly, from the CIL framework [5], which merges a number of C files into one large C file. However, cilly changes the line numbers of statements in a way that cannot be easily traced. SWEET exports program flow constraints to aiT as source-code annotations (tool connection S in Fig. 1). The original line numbers are required for flow constraint annotations in the annotation format for aiT, so if they are lost then such annotations cannot be generated. For this and other reasons, we decided to create our own ALF linker instead.

Our ALF linker works similarly to cilly, and merges all the ALF files into one big file. However, it also maintains the relation between program code locations in the C and ALF files by creating a global map file from the merged ALF files. To distinguish between local entities with the same name, the ALF linker has to perform some "name mangling" to create unique names. This is done essentially in the same way as by conventional linkers [18].

A problem, somewhat similar to the late binding problems described in Sect. 7.1, was caused by the fact that some header (.h) files contain full function definitions. The melmac tool would simply include .h files, and generate ALF code for them as well, when they were included in a C file. If such a .h file was included in several C files, which was frequently the case, then several copies (up to hundreds) of the same function definition would result. The ALF linker, rightly, did not accept this. Our solution was to change the linker to eliminate all but one of the identical copies of a function definition.
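The underlying C pattern is easy to reproduce (a sketch with hypothetical names, not VECU code): a function defined, rather than merely declared, in a header is duplicated into every translation unit that includes it:

/* util.h -- contains a full function definition */
int clamp(int x, int lo, int hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* a.c */
#include "util.h"   /* first copy of clamp's definition */

/* b.c */
#include "util.h"   /* second copy: a conventional linker would reject
                       the duplicate symbol, and so did the ALF linker,
                       until it was changed to keep only one copy */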
The final linked ALF file for the target system became quite large. This was inconvenient at times: for instance, just the time to parse it became significant. When analysing a certain start function, it would have been nice to be able to do "smart linking", where only the entities required to analyse that function would be included. However, this turned out to be hard: there is no way to identify in which source file a declaration resides short of scanning them all, and building some kind of symbol table telling in which file each entity is declared. Also, the aforementioned problems caused by late binding would have to be solved by a smart linker. Thus, we abandoned the idea of having a smart linker. Instead, we linked all ALF files, as described above, and then extracted from the resulting ALF file those declarations necessary to analyse a certain start function.

7.4 Step 4: performing the flow analysis

The linked ALF file should in principle be ready for analysis. Alas, a successful flow analysis for real industrial codes typically requires additional information not present in the available source code. This information must then somehow be provided, or approximated.


When analysing the possible program flows of a task, typically some bounds must be known for the inputs that may affect the program flow (see Fig. 2 for an example). Inputs can be either arguments to the start function of the task, if any, or values of global variables when the start function begins executing.

It is especially important to restrict the possible values of input pointers. If the abstract value of a pointer is TOP (any address), and an assignment is done using the pointer value as target, then the analysis must assume that the written value may be written to any data address. This typically destroys the information needed to identify any interesting program flow constraints.
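A small sketch of our own showing why an unrestricted pointer is so damaging: after the store through p, an interval analysis must assume every variable, including the loop bound n, may have changed:

/* Sketch: an unknown pointer destroys flow information. */
void task(int *p)          /* abstract value of p is TOP: any address */
{
    int n = 8;
    int i;
    *p = 0;                /* may write to ANY data address, so the
                              analysis must also assume n is now TOP */
    for (i = 0; i < n; i++) {
        /* loop can no longer be bounded */
    }
}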
SWEET provides value annotations for arguments and variables, which allow the user to provide simple interval bounds for these. Inputs that are not annotated will assume the abstract value TOP (any value is possible). However, the problem remains of how to identify the important inputs and find proper value ranges for them.

Since we had little time to make elaborate annotations in the ALL-TIMES project, and little knowledge of the code, we mostly let input values default to TOP. We suspect that this affected the precision of the analysis adversely in many cases. Somebody more familiar with the code would probably have been able to do a better job.

For systems that allow parallel or pre-emptive execution of tasks accessing shared data areas, the additional complication arises that, during the execution of a task, a variable can be reassigned a new value by some other task. Such shared variables must be treated differently in the analysis. If there is no knowledge about the possible preemption points, then the value of such a variable must be considered liable to change, due to the actions of some other task in the system, at any program point. An analysis that does not take this into account is potentially unsafe.

C has a keyword "volatile", which is a hint to the compiler that the variable might be changed by some other activity in the system. A variable might be marked volatile for many other reasons, though, and conversely any "non-volatile" global variable could potentially be shared between tasks, since neither the compiler nor the linker will make any attempt to check that only variables declared as volatile can be updated by one task and read by another. Analyses that identify shared data and compute safe value bounds for these are conceivable: however, we had no time to develop such analyses in the project. SWEET currently deals with the problem by forcing declared volatiles to assume the abstract value TOP throughout the program, but as explained, this is potentially unsafe, since there may be variables not declared volatile that still behave as such.
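The two uses of volatile discussed here and below can be sketched as follows (hypothetical declarations, not from the VECU code):

/* Two typical reasons for a volatile declaration (hypothetical examples). */

/* 1. Genuinely changed from outside: a memory-mapped sensor input.
      A value annotation such as an interval [0, 1023] (a 10-bit ADC)
      would let the analysis keep useful bounds instead of TOP. */
volatile unsigned int speed_raw;

/* 2. Declared volatile only to prevent compiler optimisation, e.g. so
      a busy-wait delay loop is not optimised away. Treating this one
      as an ordinary variable is precise and, here, harmless. */
volatile unsigned int delay_counter;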
The second problem is that vital source code is often missing. For the validator, the big problem turned out to be that important parts of the code came from a subcontractor, and we had no access to its source code. Apparently, this is a common situation in the automotive domain, and the trend towards componentising the software and using third-party code seems to be getting stronger. Thus, it is likely that missing source code will become an increasingly significant problem for source-level analysis in the future.

We tackled these problems in three ways. First, we added a mode to SWEET where it will assume a certain default semantics for unknown entities. Unknown variables will be set to TOP, which is a safe overapproximation but may lead to very imprecise results, especially if the variable holds pointer values. The default semantics for unknown functions is that their return value is TOP, and that they do not affect any global variables (no side effects). The latter assumption is clearly unsafe, but without knowledge of the function, a safe analysis would have to assume that it can set any global variable to anything, which would render the resulting analysis extremely imprecise.

Second, we extended SWEET's annotation language to give it some limited capability to express the semantics of unknown functions. These annotations make it possible to specify bounds on return values, and possible side effects on given global variables. When the abstract execution reaches a call to an unknown function, it will simply skip the function call, and update the abstract state according to the annotations. However, the current annotation language cannot express relations between different values, and thus has strong limitations for expressing the semantics of functions.

Third, we introduced value annotations for volatiles. Such an annotation specifies an interval restricting the possible values of the volatile variable. This is useful if such restrictions are known, e.g., if the value is read from a sensor with a certain signal range. We also added a mode for SWEET, which the user can select, where SWEET simply treats volatile variables as ordinary ones. This is potentially unsafe, but useful in situations where variables are declared volatile for some other reason than that their values might be changed from outside. An example is to prevent compiler optimisations from taking place. This use of volatile declarations is not uncommon.

Missing code, and uncertainty about the nature of volatile-declared variables, posed a problem for the validation of the source-level analysis. As mentioned, TimingExplorer failed to bound 102 loops, located in 50 different C functions. Of these, only 19 functions could be translated into ALF (the other ones failed mostly due to missing source code). 24 of the loops that could not be bounded are present in those functions. Two of these 19 functions refer to unknown entities, and 18 functions refer to volatiles: see Table 2. Of the total number of functions that could be translated into ALF, about 49% contain neither undefined entities nor volatiles (and should thus be possible to analyse with safe results, assuming that no non-volatile globals can be affected from outside the analysed task).


Table 2 The 19 functions containing loops to be bounded

Function                                             Unknown    Unknown    Volatiles  # of
                                                     variables  functions             arguments
ATC_AVCONSCALCULATOR_IMPL_calc                       0          0          Yes        3
ATC_RECUPPROTOCOLIMPL_IMPL_calc                      0          0          Yes        4
ATC_VECTORCONTENT_IMPL_MoveAndInsertFront            0          0          Yes        4
ATC_VECTORCONTENT_IMPL_SNA_Array                     0          0          Yes        2
ATC_VECTORCONTENT_IMPL_ZeroArray                     0          0          Yes        2
ATC_VECTORCONTENT_Q05_IMPL_MoveAndInsertFront        0          0          Yes        2
ATC_VECTORCONTENT_Q05_IMPL_ZeroArray                 0          0          Yes        1
CAN_CONF_EMU_HWE296_IMPL_CAN_TX_UserMessages_Calc    0          1          Yes        0
CRC_IMPL_calc                                        0          0          Yes        2
CcpCan_RxMsg                                         2          0          No         2
EMU_BATTERY_GET_IMAX_IMPL_calc                       0          0          Yes        3
EMU_BATTERY_GET_IMAX_IMPL_init                       0          0          Yes        1
EMU_FEHLERBEHANDLUNG_IMPL_Fehlerbehandlung           0          0          Yes        1
distab_A                                             0          0          Yes        0
roelFindQueueEntry                                   0          0          Yes        1
search_s16                                           0          0          Yes        4
search_s8                                            0          0          Yes        4
search_u16                                           0          0          Yes        4
search_u8                                            0          0          Yes        4

There are 18 start functions for tasks in the system. Of these, 10 refer to unknown variables in the final, linked ALF code, and 16 to unknown functions. See Table 3, which shows the number of undefined variables and functions, respectively, that are reachable from each start function. Nine start functions contain loops for which we had ALF code, and which TimingExplorer could not bound. For a number of reasons, we could not obtain any bounds for any of these loops by analysing the start functions. This was due to imprecision from unknown entities (especially unknown pointers), but also to the presence of volatiles whose safe treatment required them to be set to TOP, and to infinite for(;;); loops, which pose problems for the abstract execution algorithm used by SWEET.

Table 3 The 18 VECU start functions

Start function           Unknown    Unknown
                         variables  functions
ar_100ms                 0          1
ar_10ms                  0          12
ar_Background_run        1          3
CanInterrupt             0          3
CAN_TX_mode_10ms         0          1
CAN_TX_mode_2ms          0          3
dr_100ms                 4          8
dr_10ms                  4          30
dr_50ms                  0          3
dr_Background            1          5
dr_EMU_10ms              5          5
dr_ErrorManager_100ms    3          0
dr_VCU_1000ms            1          0
dr_VCU_100ms_delay2      2          1
dr_VCU_10ms              9          18
dr_VCU_20ms              2          2
MoCMEM_Cyclic            0          1
SciInterrupt             0          1

Thus, we resorted to analysing functions deeper in the call tree. This had the advantage of avoiding much problematic code. First, we tried to analyse directly the 19 functions containing problematic loops that are listed in Table 2. However, we were still not able to bound any loops. As seen from the table, all but two of these functions take arguments. So although these functions refer to very few undefined entities, information about the calling context was missing. We thus had to set argument values to TOP in the analysis, which is the main reason why the analysis failed for these functions.

We finally managed to bound some loops by analysing a set of functions in between the start functions and the functions containing the loops in the call tree. These functions take no arguments, but since they appear deeper in the call tree than the start functions, they refer to fewer undefined functions. By analysing these functions, we were able to bound three of the previously unbounded loops by a safe analysis (w.r.t. undefined entities and volatiles), and two more by a potentially unsafe analysis making default assumptions about undefined entities and treating volatiles as "ordinary" variables. We believe that the low number of interesting loops that we were able to bound is mainly due to imprecision from undefined entities.

7.5 Step 5: map results back to source code

The last step is to map flow constraints back to TimingExplorer or RapiTime, using their native source-level annotation formats, in order to support a subsequent WCET estimation.


As mentioned in Sect. 6, SWEET can generate flow constraints in these formats. Also here, we encountered some practical problems. One problem is that some precision was lost in the translation to these formats: for instance, SWEET can generate highly context-sensitive flow constraints [9] which cannot be expressed to the same degree in the formats of TimingExplorer and RapiTime. Another issue is that SWEET derives loop bounds as bounds on the number of times the loop header executes, while both annotation formats use the number of times that the loop body executes. This forced us to make adjustments in the generated loop annotations.
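The adjustment is a simple off-by-one conversion; a minimal sketch of our own:

/* Sketch of the header/body count difference: for this loop the
   header test i < n executes n + 1 times (n true outcomes plus the
   final false one that exits), while the body executes only n times.
   A header bound B therefore becomes the body bound B - 1 in the
   TimingExplorer/RapiTime annotation formats. */
int i;
for (i = 0; i < n; i++) {
    /* body: executes n times */
}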
8 Conclusions

We have identified a number of problems with source-level program flow analysis. Most problems should be common to deep, semantics-based source-level analysis in general. Thus, solutions to these problems should be important not only for WCET analysis but for source-level analysis in general.

Obviously, standardization is desirable, as it will relieve tool providers from the burden of creating a large number of versions of their tools. Weak language standards, allowing different compilers to define different dialects, will inevitably create problems for source-level analysis tools: we experienced a fair amount of this. However, also the information for build processes and tools, like linkers, should preferably be standardized, since this information is important for source-level analysis. For instance, it is needed to resolve late binding. On the operating system level, clean standards for task communication are very desirable. It should be possible to provide metadata, in standardised formats, that describe the task communication as well as the task entry points. Also, other information about the execution environment of the code, such as value ranges for input data, should be possible to provide in a standardized way.

For the latter, the natural place to put the information is in software components, where it can be shipped together with the code. Components providing functions should also preferably carry some information about the semantics of these, like which side effects they might have. Component models already provide some of the above: for instance, AUTOSAR provides means to specify input ranges, but components would need to carry more information. Here, the static analysis community should cooperate with industry to define new standards for component metadata carrying the relevant information.

WCET analysis tools need specific information, in particular regarding program flow constraints. The current situation is that each tool has its own annotation format for this. This creates problems with portability between tools, as well as for third-party tools producing program flow information. Again, it would be helpful to have an expressive, standardized format. This need was identified already in [17], where an "annotation language challenge" was suggested to stimulate the development of such formats, but little has happened since then.

It is also possible to develop supporting analyses that can find some of the information that is needed but not readily available in the local code. For instance, type-based static analyses are known that can detect possible side effects of functions [22], and in particular safely detect when a function is free from side effects. This information can be used for two things: to restrict possible side effects from imported, unknown functions, and to find global data that is written by some tasks and read by others (and thus ought to be declared as volatile in the latter). The side effects are captured as enriched types, and are therefore suitable to provide as metadata for modules or components. Another interesting possibility is to develop mixed source/binary-level analyses that can be used for code with inlined assembler, or when only binaries are available for some parts of the code. Using a multilevel language such as ALF for the analysis, which allows translations from both source-level and binary-level code, is then clearly an advantage.

Effective use of static analysis tools is immensely helped by knowing the code well. Many of our difficulties arose from lack of knowledge of the code: not knowing the intention behind volatile declarations, or which variables might be shared among tasks, or the possible side effects of called functions, etc. On the other hand, successful application of static analysis also requires good knowledge of how tools and methods work. Thus, to successfully apply static analysis methods in the development of production code, we believe that software developers must have proper training in these techniques. This puts an obligation on universities to move these topics into their curricula, where they are rarely found today. Another conclusion is that the people who perform the verification should be the same as the ones developing the code.

However, even if advances are made as outlined above, static analysis will in practice often rely on assumptions that have to be made to obtain a reasonably precise analysis when information is lacking. The results are then valid only as far as the assumptions are, which may render the analysis unsound if the assumptions are not fulfilled. Static analysis will thus sometimes not provide a safe guarantee that bugs are absent (or that the WCET is safely overestimated); it will do so only with a certain confidence level. This has to be kept in mind, but it will in no way render static analysis useless. For instance, when certifying safety-critical systems, static analysis can provide an additional, independent source of evidence, which will then provide increased confidence in the result of the verification.


Finally, note that quite a few of the problems described above will not arise for a binary-level analysis of a statically linked executable. All the code will then be available, and all ambiguities will have been resolved by the compiler and linker. However, binary-level analysis is difficult, since it requires a fair amount of reverse engineering, recreating structure that is readily available in the source code. This reverse engineering can be hard to carry out, especially for heavily optimized embedded code. We are thus convinced that source-level analysis has its place, both in general and specifically to support timing analysis.

References

1. Barkah, D., Ermedahl, A., Gustafsson, J., Lisper, B., Sandberg, C.: Evaluation of automatic flow analysis for WCET calculation on industrial real-time system code. In: Burns, A. (ed.) Proceedings of 20th Euromicro Conference on Real-Time Systems, pp. 331–340 (2008)
2. Bessey, A., Block, K., Chelf, B., Chou, A., Fulton, B., Hallem, S., Henri-Gros, C., Kamsky, A., McPeak, S., Engler, D.: A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM 53(2) (2010)
3. Byhlin, S., Ermedahl, A., Gustafsson, J., Lisper, B.: Applying static WCET analysis to automotive communication software. In: Proceedings of 17th Euromicro Conference on Real-Time Systems (ECRTS'05) (2005)
4. Carlsson, M., Engblom, J., Ermedahl, A., Lindblad, J., Lisper, B.: Worst-case execution time analysis of disable interrupt regions in a commercial real-time operating system. In: Pettersson, P., Yi, W. (eds.) Proceedings of 2nd International Workshop on Real-Time Tools, Copenhagen (2002)
5. CIL infrastructure for C program analysis and transformation. http://manju.cs.berkeley.edu/cil (2010)
6. Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: The ASTRÉE analyzer. In: Sagiv, M. (ed.) Proceedings of 14th European Symposium on Programming. Lecture Notes in Computer Science, vol. 3444, pp. 21–30. Springer, Berlin (2005)
7. Emanuelsson, P., Nilsson, U.: A comparative study of industrial static analysis tools (extended version). Technical report, Linköping University (2008)
8. Ferdinand, C., Heckmann, R., Langenbach, M., Martin, F., Schmidt, M., Theiling, H., Thesing, S., Wilhelm, R.: Reliable and precise WCET determination for a real-life processor. In: Proceedings of 1st International Workshop on Embedded Software (EMSOFT 2001), LNCS, vol. 2211 (2001)
9. Gustafsson, J.: SWEET manual (2011). http://www.mrtc.mdh.se/projects/wcet/sweet/manual/
10. Gustafsson, J., Ermedahl, A., Lisper, B., Sandberg, C., Källberg, L.: ALF—a language for WCET flow analysis. In: Holsti, N. (ed.) Proceedings of 9th International Workshop on Worst-Case Execution Time Analysis (WCET'2009), pp. 1–11, Dublin, Ireland (2009)
11. Gustafsson, J., Ermedahl, A., Sandberg, C., Lisper, B.: Automatic derivation of loop bounds and infeasible paths for WCET analysis using abstract execution. In: Proceedings of 27th IEEE Real-Time Systems Symposium (RTSS'06), pp. 57–66, Rio de Janeiro, Brazil (2006). IEEE Computer Society
12. Healy, C., Sjödin, M., Rustagi, V., Whalley, D., van Engelen, R.: Supporting timing analysis by automatic bounding of loop iterations. J. Real-Time Syst. 18(2–3), 129–156 (2000)
13. Heckmann, R., Ferdinand, C., Kästner, D., Nenova, S.: Architecture exploration and timing estimation during early design phases (this volume)
14. Holsti, N., Långbacka, T., Saarinen, S.: Using a worst-case execution-time tool for real-time verification of the DEBIE software. In: Proceedings of DASIA 2000 Conference (Data Systems in Aerospace 2000, ESA SP-457) (2000)
15. Holsti, N., Långbacka, T., Saarinen, S.: Worst-case execution-time analysis for digital signal processors. In: Proceedings of EUSIPCO 2000 Conference (X European Signal Processing Conference) (2000)
16. Kirner, R.: Extending Optimising Compilation to Support Worst-Case Execution Time Analysis. PhD thesis, Technische Universität Wien, Vienna, Austria (2003)
17. Kirner, R., Knoop, J., Prantl, A., Schordan, M., Wenzel, I.: WCET analysis: the annotation language challenge. In: Rochange, C. (ed.) Proceedings of 7th International Workshop on Worst-Case Execution Time Analysis (WCET'2007), pp. 83–99, Pisa, Italy (2007)
18. Levine, J.: Linkers and Loaders. Morgan Kaufmann (2000). ISBN 1-55860-496-0
19. Lisper, B.: The ALL-TIMES project: introduction and overview (this volume)
20. Merriam, N., Lisper, B.: Estimation of productivity increase for timing analysis tool chains (this volume)
21. Montag, P., Goerzig, S., Levi, P.: Challenges of timing verification tools in the automotive domain. In: Margaria, T., Philippou, A., Steffen, B. (eds.) Proceedings of 2nd International Symposium on Leveraging Applications of Formal Methods (ISOLA'06), Paphos, Cyprus (2006)
22. Nielson, F., Nielson, H.R., Hankin, C.: Principles of Program Analysis, 2nd edn. Springer, Berlin (2005). ISBN 3-540-65410-0
23. Prantl, A., Schordan, M., Knoop, J.: TuBound—a conceptually new tool for worst-case execution time analysis. In: Kirner, R. (ed.) Proceedings of 8th International Workshop on Worst-Case Execution Time Analysis (WCET'2008), pp. 141–148, Prague, Czech Republic (2008)
24. Rodriguez, M., Silva, N., Esteves, J., Henriques, L., Costa, D., Holsti, N., Hjortnaes, K.: Challenges in calculating the WCET of a complex on-board satellite application. In: Proceedings of 3rd International Workshop on Worst-Case Execution Time Analysis (WCET'2003), Porto (2003)
25. Rose. http://www.rosecompiler.org (2012)
26. Sandell, D., Ermedahl, A., Gustafsson, J., Lisper, B.: Static timing analysis of real-time operating system code. In: Proceedings of 1st International Symposium on Leveraging Applications of Formal Methods (ISOLA'04) (2004)
27. Schreiner, D., Barany, G., Schordan, M., Knoop, J.: Comparison of type-based vs. alias-based component recognition for interface-level timing annotations (this volume)
28. Sehlberg, D., Ermedahl, A., Gustafsson, J., Lisper, B., Wiegratz, S.: Static WCET analysis of real-time task-oriented code in vehicle control systems. In: Margaria, T., Philippou, A., Steffen, B. (eds.) Proceedings of 2nd International Symposium on Leveraging Applications of Formal Methods (ISOLA'06), Paphos, Cyprus (2006)
29. Thesing, S., Souyris, J., Heckmann, R., Randimbivololona, F., Langenbach, M., Wilhelm, R., Ferdinand, C.: An abstract interpretation-based timing validation of hard real-time avionics software. In: Proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN-2003) (2003)
