You are on page 1of 10

Automatic Test Data Generation using Constraint Solving Techniques

Arnaud Gotlieb Michel Rueher
Dassault Electronique Bernard Botella Universite de Nice - Sophia
55 quai Marcel Dassault Dassault Electronique Antipolis
92214 Saint Cloud, France 55 quai Marcel Dassault I3S-CNRS Route des colles,
and also at BP 145
Universite de Nice - Sophia 92214 Saint Cloud, France
Antipolis 06903 Sophia Antipolis, France

Abstract commonly accepted as minimum requirements. One of
the diculties of the testing process is to generate test
Automatic test data generation leads to identify input data meeting these criteria.
values on which a selected point in a procedure is ex- From the procedure structure alone, it is only possible
ecuted. This paper introduces a new method for this to generate input data. The correctness of the output
problem based on constraint solving techniques. First, of the execution has to be checked out by an \oracle".
we statically transform a procedure into a constraint Two di erent approaches have been proposed for auto-
system by using well-known \Static Single Assignment" matic test data generation in this context. The initial
form and control-dependencies. Second, we solve this one, called path-oriented approach
4, 7, 16, 20, 3], in-
system to check whether at least one feasible control cludes two steps which are : 
ow path going through the selected point exists and
to generate test data that correspond to one of these to identify a set of control ow paths that covers
paths. all statements (resp. branches) in the procedure 
The key point of our approach is to take advantage of to generate input test data which execute every se-
current advances in constraint techniques when solving lected path.
the generated constraint system. Global constraints are
used in a preliminary step to detect some of the non fea- Among all the selected paths, a non-negligeable amount
sible paths. Partial consistency techniques are employed is generally non-feasible
24], i.e. there is no input data
to reduce the domains of possible values of the test data. for which such paths can be executed. The static identi-
A prototype implementation has been developped on a cation of non-feasible paths is an undecidable problem
restricted subset of the C language. Advantages of our in the general case
1]. Thus, a second approach called
approach are illustrated on a non-trivial example. goal-oriented
19] has been proposed. Its two main steps
Keywords are :
Automatic test data generation, structural testing, con- to identify a set of statements (resp. branches) the
straint solving techniques, global constraints covering of which implies covering the criterion 
to generate input test data that execute every se-
1 INTRODUCTION lected statement (resp. branch).
Structural testing techniques are widely used in unit or Assuming that every statement (resp. branch) is reach-
module testing process of software. Among the struc- able, there is at least one feasible control ow path going
tural criteria, both statement and branch coverages are through the selected statement (resp. branch). The goal
of the data generation process is then to identify input
data on which one such path is executed.
For these approaches, existing generation methods are
based either on symbolic execution
18, 4, 16, 7, 10], or
on the so called \dynamic method"
20, 19, 11, 21].

scribes the prototype implementation while the fth sec- Using symbolic execution corresponds to an exhaustive tion provides a detailed analysis of a non-trivial example exploration of all paths going through a selected point. The proposed method operates in devoted to the extension of our method to these types. The fourth section de- 13. dynamic allocated structures. and procedures calls 21]. func- dynamic method takes into account some of the prob. Furthermore. these techniques o er a exible way to eters by symbolic values and in statically evaluating dene and to solve new constraints on values of possible the statements along a control ow path. the procedure p. Korel proposes in 20] to base the test data generation 2 GENERATION OF THE CON- process on actual executions of programs. Its method STRAINT SYSTEM is called the \dynamic method". this may be unacceptable for programs con- taining a large number of paths. straint system. two steps : The generation of the constraint system Kset is done in three steps : 1. 8]). 19] presents an extension of the ments such as \goto-statement" are not handled in our dynamic method to the goal-oriented approach. symbolic execution is to identify the constraints (either A prototype implementation of this method has been equalities or inequalities) called \path conditions" on developped on a restricted subset of the C language. of loops and backward control ow. tion's pointer involve dicult problems to solve in the lems encountered with symbolic execution. not presented. called cKset(n). Of course. If an undesirable path is observed during the execution ow monitoring. point. test data Kset(p n) def = pKset(p)  cKset(n) corresponding to one of these paths are generated. called pKset(p)  dure  3. generation of Kset while the third section is devoted culty to deal with arrays (although some solutions exist to the resolution techniques. The treatement of This paper introduces a new method to identify auto. Generation of a set of constraints corresponding the constraints that are specic to the selected to the control-dependencies of a selected point n. Search methods based on A procedure control ow graph (V E e s) 1] is a con- the combination of both enumeration and inference pro. tures. In this paper. then Application of our method is limited to a structured a function minimization technique is used to \correct" subset of a procedural language. although the Pointer aliasing. paper. form  dependencies 12]. nected oriented graph composed by a set of vertices V . that has been successfully treated with our method. global constraints are used in a preliminary step to detect some of the non-feasible parts of the control structures and partial consistency 2. dynamic struc. 2. frame of a static analysis. basic types such as char and oating point numbers is matically test data on which a selected point in the pro. let us introduce some basics used in the rest of the current constraint techniques to solve the generated con.Symbolic execution consists in replacing input param. Unstructured state- the input variables. 9] and control. a set of edges E and two particular nodes. A few words in the fourth section are cedure is executed. cesses are used in the nal step to identify test data. The result of this step is a set of constraints | called Kset | which is formed of : 2. This method leads to several problems : the Outline of the paper : the second section presents the growth of intermediate algebraic expressions. The key point of this method is to take advantages of Now. symbolic input values under which a selected path is executed. e the unique . However. and the aliasing problem for pointer analysis. In particular. Generation of the \Static Single Assignment" Assignment" (SSA) form 23. the di.1 Basics techniques are employed to reduce the domains of pos- sible values of the test data. Generation of a set of constraints corresponding to the constraints generated for the whole proce. it may re. The goal of test data. The constraint system Kset is solved to check whether at least one feasible path which goes Kset is dened as : through the selected point exists. programs avoid such constructions. Finally. This framework because they introduce non-controled exits method is designed to handle arrays. The procedure is statically transformed into a constraint system by the use of \Static Single 1. 2. we assume that quire a great number of executions of the program.

return j  2 Figure 1: Example 1 3 entry node. The SSA form of a lin. j2 := (j0  j1 )  1c. then i := 2   6. otherwise it is non-feasible. the control ow path < 1 2 4 5 6 > in the CFG of example 1 is non-feasible. 1  od  4. and its control ow 6 graph (CFG) shown in gure 2.! i0  i . For instance. i := i . i2 := (i0  i1 )  2. if ( j = 2 ) 1 5. SSA form introduces special assign- 1 For all the examples throughout the paper. A control ow path is a path < vi  : : :  vj > in the CFG. while (i2 6= 0 ) do 2. A path is feasible if there exists at least one test datum on which the path is executed. int f(int i) int j  1. j := j  i  3b. 1b. Initially proposed 6. 4 sent the basic blocks which are sets of statements exe- cuted without branching and edges represent the possi- ble branching between basic blocks.2 SSA Form 3a. then i3 := 2  program variable as a logical variable. a clear abstract syntax is used to indicate that our method is not designed to a particular language . For instance. j1 := j2  i2  3b. con- 5 sider the procedure1 given in gure 1. j := 1  2. A node v2 is control-dependent 12] on v1 i 1) there exists a path P from v1 to v2 in (V E e s) with any int f(int i0 ) v in P n fv1 v2 g post-dominated by v2  2) v1 is not int j0  1a. /* Heading */ dependent on block 4 in the CFG of example 1. A point is either a node or an edge in the CFG. For example. block 5 is control. A path Figure 2: Control ow graph of example 1 is a sequence < vi  : : :  vj > of consecutive nodes (edge connected) in (V E e s). Figure 3: SSA Form of example 1 ear sequence of code is obtained by a simple renam- ing (i . if (j2 = 2) of variables  this leads to the impossibility to treat a 5.  i4 := (i3  i2 )  for the optimisation of compilers 2. 1  Most procedural languages allow destructive updating od 4. j0 := 1  post-dominated by v2 . For the control structures. the \Static return (j2 )  Single Assignment" form 9] is a semantically equiv- alent version of a procedure on which every variable has a unique denition and every use of a variable is reached by this denition. while ( i 6= 0 ) do 3a. A node v1 is post-dominated 12] by a node v2 if every path from v1 to s in (V E e s) (not including v1 ) con- tains v2. i1 := i2 . Nodes repre. which is designed to compute the factorial function. 23]. where vi = e and vj = s.! i1  : : :) of the variables. and s the unique exit node.

this constraint is driven by the syntax.3 Conditional Statement in gure 3).and the else- pKset(p) is a set of both atomic and global constraints parts of the conditional.3. The decision of statement 2 generates i2 6= 0.3. global constraint element/3 2 : 2. The loop statement while is also translated in a global Let us now present how pKset is generated. in the junction nodes of the 3a in example 1 generates the constraint j1 = j2  i2. Max) ables referenced inside and outside the while-statement. presents the genera. Each variable x which has a the while-statement. maximum) value depending on Note here that the -assignments are only used to iden- the current implementation. The conditional statement if then else is translated grams in one parsing step. Consider the if-statement of sic block is translated into a conjunction of such con- the SSA form of example 1 in gure 3  the -function straints. v~1 is the vector of variables dened basic type declaration. explicit in section 3. Note that -assignments are associated with a procedure p. translated in simple equality constraints. A -function returns one of its arguments depend. translated into a list of variables of the same type while a record is translated into a list of variables of di erent The operational semantics of the constraint w/5 will types. signed to handle more eciently set of atomic con- straints. an atomic constraint is a relation tional semantic of the constraint ITE/3 will be made between logical variables. Each subsection. An array declaration is tify the vectors of variables. then-part of the statement. Informally speaking. the while statement are true for the required data. Parameters and global variables are The generated constraint requires three vectors of vari- considered as input variables while the other variables ables v~0  v~1 v~2. the assignment of statement a special expression 9] : access. such as assignments and ex. which is de. The method constraint W/5.3.1 Declaration pKset(v~2 := (~v0  v~1) while d do s od) = w(pKset(d) v~0  v~1  v~2  pKset(s)) The variables of a procedure are either input variables or local variables. For instance. statements 3a and 3b generate of statement 6 returns i3 if the ow goes through the j1 = j2  i2 ^ i1 = i2 . CFG.ments. which is designed to treat structured pro. therefore we only present the generation of pKset on Elementary statements. also be given in section 3.3. 2. For instance. is the minimum (resp.k) statement is the kth element of a noted v.4 Loop Statement element(k L v) constraints the kth argument of the list L to be equal to v. is devoted to the output value of the procedure. the voted to a particular construction. pressions in the decisions are transformed into atomic Reference of an array is provided in the SSA Form by constraints. 1. constraints generated for the body (fth argument) of tion technique. The evaluation of ac- 2 where /3 denotes the arity of the constraint cess(a. states that as long as its rst argument is true.5 Array and Record 2. For example. . SSA Form is built by using the algorithm given in 5]. A ba- ing on the control ow. the -functions are introduced in a special heading of the loop (as in the while-statement 2. into global constraint ITE/3 in the following way : For convenience.2. Global constraints are de. i2 otherwise. arrays. A specic variable. a list of -assignments will be written with a single statement : pKset(if d then s1 else s2  v~1 := (~v2  v~3)) = x2 := (x1 x0) : : :  z2 := (z1  z0 ) () v~2 := (v~1  v~0) ite(pKset(d) pKset(s1) ^ v~1 = v~2 pKset(s2) ^ v~1 = v~3 ) 2. called -functions. The opera- Informally speaking. named \OUT".3 Generation of pKset This constraint denotes a relation between the decision and the constraints generated for the then. is translated in atomic constraint inside the body of the loop and v~2 is the vector of vari- of the form : x 2 Min Max] where Min (resp. 2.2.2 Assignment and Decision Both arrays and records are treated as list of variables.3. For some more complex structures. v~0 is a vector of variables dened before are considered as local.

if there exists at least one variable assign- cKset(n) is a set of constraints associated with a point ment which respects all the constraints. node 5 is control-dependant . the special expression by one constraint to others constraints. i. These algo- update is used 9]. update(a.For the denition of an array. Informally speaking. the domain of the variables. It represents the necessary conditions a set of constraints  is called a store and the store is under which a selected point is executed.j. S 1 := update(a j w)) = i6=j felement(i a v) ^ element(i a1 v)g Let us rst introduce some basics notations on con-  felement(j a1 w)g straint programming required in the rest of the paper.e. 2. it is the constraint element/3 : usually necessary to combine the resolution process with a search method. For example. Altough these approxi- Both expressions access and update are treated with mation algorithms sometimes decide inconsistency.5 Example x . cKset(n) is then the set of constraints of the statements and the branches on which n is control. 2. These notations are extracted from 15].w) evaluates to an array rithms are usually called partial consistency algorithms a1 which has the same size as a and which has the same because they remove part of inconsistent values from elements as a. where (9) denotes the existential closure of the formula dependent.4 Generation of cKset A constraint system is consistent if it has at least one solution. on node 4 then : cKset(5) = fj2 = 2g Entailment test checks out the implication of a con- straint by a store. n in the CFG. search methods pKset( v:= access(a k)) = felement(k a v)g are intelligent enumeration process. For example. see 14] and 17]. For a survey on Constraint Solving and Constraint Logic pKset(a Programming. These con. consistent if : ditions are precisely the control-dependencies on the j= (9) selected point. except for j where value is w. More formally.

straint solver. Inference is based Precise denitions of these local consistencies are now on algorithms which propagate the information given given : . when the domains contain rial search problems. plied. gation. j1 = j2  i2 ^ i1 = i2 . consistency and entailment. For this reason. 1) Both consistency and entailment tests are NP-complete ite(j2 = 2 i3 = 2 ^ i4 = i3  i4 = i2 ) problems in the general case. Interval-consistency is applied on large domains. variables (x1 : : :  xn) which denotes a subset of ZZ n . Constraint systems are inference a small number of values. domain-consistency is ap- systems based on such operations as constraint propa. Constraint programming has emerged in the last decade Both are applied in a subtle combination by the con- as a new tool to address various classes of combinato. implemen- OUT = j2 g tations of these tests are based on two approximations : domain-consistency and interval-consistency. cKset(5) = fj2 = 2g Kset(f 5) = pKset(f)  cKset(5) 3. A constraint c(x1  : : :  xn) is a n-ary relation between 3 SOLVING THE CONSTRAINT SYS. the following sets are obtained : pKset(f) = j= (8)( =) c) f j0 = 1 w(i2 6= 0 (i0 j0) (i1  j1) (i2  j2) where (8) denotes the universal closure of . TEM AND GENERATION OF TEST Domain-consistency also called arc-consistency removes DATA values from the domains and Interval-consistency only reduces the lower and upper bounds on the domains. 0 is entailed by fx = y2 g The entailment test of the constraint c by the store  is For the procedure given in gure 1 and the statement noted : 5. Intuitivelly.1 Local consistency Associated with each input variable xi is both a domain Di 2 ZZ and an interval Di = min(Di ) max(Di )].

a xpoint is a state of the domains constraints added to the store.2 ITE/3 Consider the problem of automatic test data generation for statement 4. sistency in a preliminary step of the resolution process. The inference engine propagates the reductions pro. the section 2.2. domain. The operational semantic of the user-dened global con- vided by this algorithm on the other constraints. (x0 = 2 y0 = 3) corresponds to the unique test then 3a. ment respecting both the store and the negation of the The corresponding algorithm is able to check out both constraint. Infor. then . t := t . Denition 3. t := 2  x  2.1 Entailment Test Implementation variable xi and value vi 2 fmin(Di ) max(Di )g there exist values v1  : : :  vi. The global holds.consistency and the inference engine may reduce the domains of possible values of test data on the example 2 given in gure 4.and interval. The implementation of entailment test may holds. The key point of our exists values v1  : : :  vi.2. (domain-consistency) 15] local treatment is directly implemented in the constraint A constraint c is domain-consistent if for solver. it is each variable xi and value vi 2 Di there necessary to provide the algorithm. the Denition 1. We have introduced in where no more prunnings can be performed. The straints is designed with properties which are \guarded" propagation iterates until a xpoint is reached. by entailment tests.3 two global constraints : ite/3 and w/5.. Let us give now their denitions. c is domain-consistent.1 vi+1 : : :  vn in The entailment test is used to construct these global D1  : : :  Di. y0 leads to y0 2 f3 4g and t0 2 f4 5g int z  t0 = 2  x0 leads to x0 2 f2g and t0 2 f4g int t  1a..2 Global Constraints Denitions For atomic constraints and some global constraints. 4. y  datum on which statement 4 in the program of gure 4 3b. A store  is domain-consistent if for every constraints are used to propagate information on incon- constraint c in .consistencies for this constraint. z0 = x0  y0 leads to x0 2 2 8] and y0 2 0 4] The last ones are added to identify non-feasible parts t0 = 2  x0 leads to t0 2 4 16] formed by one of the two branches of the control . A constraint is proved to be entailed by a store if there is no variable assign- A local treatment is associated to each constraint. if (z  8) Finally. be done as a proof by refutation. y ) t1 = t0 . Such properties are expressed by mally speaking. 3. Let us illustrate how interval-. Figure 4: Example 2 3. domain. z := x  y  t1 = t0 .   D  : : :  D such that c(v  : : :  v ) 1 i+1 n 1 n constraints.1 vi+1  : : :  vn in approach resides in the use of such global constraints to D1  : : :  Di. (ite/3) Parameters are of non-negative integer type. int g(int x. for user-dened global constraint. if (t = 1 ^ x > 1 ) can be executed. Denition 2. y0 leads to y0 2 f3g 1b. (interval-consistency) 15] A constraint c is interval-consistent if for each 3. The ite(c fc1 ^ : : : ^ cp g fc01 ^ : : : ^ c0q g) following set is provided : if j= (8)( =) c) then  :=   fc1 ^ : : : ^ cp g Kset(g 4) = fx0 y0 2 0 Max] z0 = x0  y0  t0 = if j= (8)( =) :c) then  :=   fc01 ^ : : : ^ c0q g 2  x0 z0  8 t1 = t0 . y0  t1 = 1 x0 > 1g if j= (8)( =) :(c1 ^ : : : ^ cp )) then and the following resolution process is performed :  :=   f:c ^ c01 ^ : : : ^ c0q g z0 = x0  y0 leads to z0 2 0 Max] if j= (8)( =) :(c01 ^ : : : ^ c0q )) then t0 = 2  x0 leads to t0 2 0 Max]  :=   fc ^ c1 ^ : : : ^ cp g z0  8 leads to z0 2 0 8] t1 = 1 leads to t1 2 f1g The rst two features of this denition express the op- x0 > 1 leads to x0 2 2 Max] erational semantic of the control structure if then else.1 Di+1  : : :  Dn such that c(v1 : : :  vn) treat the control structures of the program.

meaning that the else-part of the statement is non feasible. the fourth feature is applied meaning that the body of the loop is ite(i0 6= 0 i1 = i0 . 1) the store. In the opposite. the w(i2 6= 0 (i0  j0) (i1  j1) (i2 j2 ) constraints i0 6= 0 ^ i1 = i0 . (i0 = 2) is obtained. the fourth feature is applied twice and then gives 3. Hence w/5 behaves as a constraint gener.1 i1 = i0 . the output vector of variables v~2 is x y z 2 f0 1g  = fx 6= y y 6= z g then equated with the input vector v~0 . and yields to i0 = 2. When evaluating w/5. it is necessary to allow the gen- eration of new constraints and new variables. local consistencies are incomplete constraint the other features identify non-feasible part of the struc. A substi- tution subs(~v1  v~2  c) is a mechanism which generates 3. generating the follow- if j= (8)( =) subs(~v2  v~0  :c)) then ing store :  :=   fv~2 = v~0 g fj0 = 1 i0 6= 0 i1 6= 0 i2 = 0 i2 = i1 . (w/5) Kset(f 5) = pKset(f)  cKset(5) = w(c v~0 v~1 v~2 c1 ^ : : : ^ cp ) f j0 = 1 w(i2 6= 0 (i0 j0 ) (i1 j1) (i2  j2) if j= (8)( =) subs(~v2  v~0  c)) then j1 = j2  i2 ^ i1 = i2 . The third one is applied if it can be proved that be domain-consistent though there is no solution in the the constraints of the body of the loop are inconsistent domains (i. The store of constraints can ture. Of course. Finally. This means the body cannot be example of a classical pitfall of these techniques : executed even once. Testing j= (8)( =) (x = z)) fails because the store . solving techniques 22].3 W/5 the following store : The while-statement combines looping and destructive fj0 = 1 j2 = 2 j2 = j1  i1  i2 = i1 . j1 = j0  i0  i1 = i0 . As for the ite/3 constraint. Let us illustrate the treatment of w/5 on the while- Suppose that the store contains i2 = 0  when statement of example 1 : applying the fourth feature of the ite constraint we Suppose that the store contains fj0 = 1 j2 = 2g  when have to consider the consistency of the following set : testing the consistency of fi2 = 0g  fi2 = 1g It is inconsistant.structure. 1 assignments.3 Complete Resolution of the Example a new constraint having the same structure as c but where variables vector v~1 has been replaced by vector Consider again the example of gure 1 and the problem v~2 . 1)  :=   fsubs(~v2  v~0  c1 ^ : : : ^ cp ) ^ ite(j2 = 2 i3 = j2 ^ i4 = i3  i4 = i2 ) w(c v~1 v~3 v~2 OUT = j2 g  fj2 = 2g subs(~v1  v~3  c1 ^ : : : ^ cp ))g The loop is executed twice. Consider for example : if v~2 = v~0 is inconsistent in the current store.2.4 Search Process The rst two features represent the operational seman- tic of the while-statement. subs(~v2  v~0  c1 ^ : : : ^ cp ) ^ w(c v~1 v~3 v~2 subs(~ v1  v~3  c1 ^ : : : ^ cp ))g 3.1 j2 = if j= (8)( =) subs(~v2  v~0  :(c1 ^ : : : ^ cp ))) j1  i1  j1 = j0  i0  j2 = 2 i3 = j2 i4 = i3  OUT = j2g then  :=   fsubs(~v2  v~0  :c) ^ v~2 = v~0 g Interval consistency is applied to solve the sys- if j= (8)( =) v~2 6= v~0) then tem. This is the unique test data  :=   fsubs(~v2  v~0  c) ^ on which statement 5 may be executed. Then. 1 ^ i2 = i1  i2 = 1) executed at least once. the store is inconsistent). 1 i2 = 0g ation program. The Kset provided by subs(~v1  v~2  x1 + y1 = 3) is (x2 + y2 = 3) the rst step of our method is : w/5 is now formally dened : Denition 4. 1 ^ i2 = i1 are added to j1 = j2  i2 ^ i1 = i2 . Let us give an with the current store.e. The following example illustrates this mechanism : of generating a test data on which a feasible path going if v~1 = (x1  y1) and v~2 = (x2 y2 ) then through statement 5 is executed.

Experiments are made on into (Dx = a a + b=2] or Dx = a + b=2 b]) ). 15].5 4 IMPLEMENTATION First experiments concern the search of solutions with- out adding any kind of constraints on input data. In order to obtain a solution. This process is re. Consider the problem of automatic test data generation to reach node 13. As claimed in the Introduc. The search process can be user. whereas in the latter we must backtrack and try another value (x = w) until A generator of SSA form and control-dependencies Dx =  . these additional constraints may be used to insure that the 5 EXAMPLE generated input data are \realistic". When a value v is chosen in in 17]. The extension to the rst solution and all solutions of the problem. References on these solvers can be found This process is incremental. The following set of constraints on domains are to select the variable with the smallest domain added : (rst-fail principle)  a 1] a 2] a 3] b 1] b 2] b 3]target 2 1 9] to select the most constrained variable  Table 1 reports only the results of the constraint solver to bissect the domains ( x 2 a b] is transformed and search process module. These constraints brary of Sicstus Prolog 3.fx 6= y y 6= z x 6= z g is domain-consistent. structures.5 6]. Resolution of the constraint system is therefore more ate the possible values in the restricted domains 22.3 17])  gram given in gure 5. A constraints solver tion. we have considered that the user wants the input introduce new diculties in the constraints generation data to satisfy the additional constraint : process. peated until either the domain of all variables is reduced INKA includes 5 modules : to a single value or the domain of some variable becomes empty. The INKA. The extension of our method to pointer variables the domain Dx of the variable x. It is also possible to guide the search process with some well-known heuristics. we obtain a solution of the A C Parser test data generation problem. it is written in the abstract syntax used in this paper.13) constraint system. are propagated by the inference engine as soon as they induce a reduction on the domains. there are many test data on which a se- lected point is executed. A generator of Kset In general. For example : INKA has generated the Kset(SAMPLE. a prototype implementation has been developed line 1 of the table 1 indicates the time required to obtain on a structured subset of language C. Floating point numbers do not Then. This may reduce aliasing problem and the analysis of dynamic allocated the domains of the other variables. For the sake of simplicity. it is of course not possible to a 3]2 = a 1]2 + a 2]2 . A search process module directed by adding new constraints on the input vari- ables of the procedure. enumerate all the values of a oating point variable. Our framework provides an ele. The control structures such as do-while and switch statement exact test data is provided in the former case while the is straightforward. strictly greater than another one x0 ). it is necessary to enumer. Furthermore. Characters are handled in the same number of solutions is only provided in the later one. way as integer variables. but they require another solver. Although the domains remain nite. problematic. In the former case. The constraint solver is provided by the CLP(FD) li- gant way to handle such constraints. a Sun Sparc 5 workstation under Solaris 2. the constraint (x = v) falls into two classical problems of static analysis : the is added to the store and propagated. constraint solving techniques provide a exible way to choose test data. Size constraints between variables (for example y0 > x0 of array have been reduced to 3 for improving the pre- meaning that a parameter y0 of a procedure is sentation. They may have one of the two following forms : We present now the results of our method on a non- constraints on domains (for example x0 2 trivial example adapted from 11] : the SAMPLE pro- .

0s 1953 solu. Test data are given in vector form example made with a prototype implementation tend (a 1] a 2] a 3] b 1] b 2] b 3]target) and CPU time is to show the exibility of our method.5. Figure 5: Program SAMPLE 6 CONCLUSION Second line reports the results of generation when the In this paper. further experiments are needed to show 1] A.5) 9.5.5.T. do   (4.5.CPU time All solutions CPU time 1c. Aho.3.5.4. 7a. 287s 3. R.4.3. int target) int i fa fb out  Table 1: Results 1a. 13.3.5. On the con.4.4. if (fb = 1) (4.R. 11.3s (3.4.4. the added constraint able help on preliminary ideas to design the global con- has been checked out after the enumeration step. i := i + 1  (3. and J.1.3. Compilers Prin- the e ectiveness of our approach and to compare the ciples.4.3.4). if (ai] = target) tions 4. int sample(int a3].5. In both cases.  (3.1. Ullman. int b3]. i := 1  (4.4). fb := 0  tion 2. of this approach is the early detection of some of the straint is added to the current store and propagated.5. of table 1 show an important improvement factor due This research is part of the software testing project DE- to the use of the additional constraint in the resolution VISOR of Dassault Electronique. automatic test data generation problem. trary. Addison-Wesley Pub- . This straints introduced.5. ACKNOWLEDGEMENTS proximatively the same to obtain all the solutions. Of course. Sethi.5). The key point cess and the third line reports the results when the con. while (i  3) do ( fa := 0  First solu. Thanks to Xavier Moulin for its illustrate a generate and test approach.3). 1. First experiments on a non-trivial these experiments. the CPU time elapsed in the rst and second experiments are ap.4.4.4. od 6. Patrick Taillibert and Serge Varennes gave us invalu- ference is that. the additional constraint is used to prune the domains and thus the time elapsed in References the search process module is dramatically reduced.3) 53s (3. process.4.5.4.N.3.3).3).5. non-feasible paths by the global constraints and thus the reduction of the number of trials required for the gen- A rst fail enumeration heuristic has been used for eration of test data.1. techniques and tools. bers  an experimental validation on real applications is also forseen. i := i + 1  (4. In the third case. Note be devoted to the extension of this method to pointer that a complete enumeration stage would involve to try variables and experimentations with oating point num- 97 = 4782969 values. then out := 1 10. return out  method with other approaches. The only dif.5. else out := 0   15. 292s 5. These experiments are intended to show what we have called the exible use of constraints.3.3). First.3.3.4). 7b fb := 1  8. then fa := 1   (3. while (i  3) (4.5.5).5. do if (bi] 6= target) ( helpful comments on earlier drafts of this paper. then fb := 0  (3.4.5. the search process has enumerated all the possible values in the reduced domains.3.5. i := 1  1b. we have presented a new method for the additional constraint is checked out after the search pro.1.3.3. then (4.3) 1.3. Future work will the time elapsed in the constraint solving phase.4).3. note that the results presented in the third line This work is partially supported by A. in the second case.4. 2.5. if (fa = 1) (3.

Florida. Boyer. 22] A. Rosen. and Y. Decem. in Computer Science. 19(7):385{394. M. SE-3(4):266{278. Automated Software Test Data Genera- tions on Software Engineering. In LNCS 910. neering. October 23] B. Clarke. man. Gi ord. 18] J. 5(1):63{86. and F. of the Interna- 2] B. In Proc. 4] R. pages 293{ ber 1994. Symbolic Execution and Program Test- November 1994. Hamlet. A Dynamic Approach of Test Data Gen- gramming over Finite Domains. volume 21(3). Ferrante. 13] D. Korel. Zadeck. Proc. September 1991. May 1993. 7] L. 21] B. Ottenstein. Hentenryck. Computer Lan. M. Analysis. Elspas. 1986. IEEE Transactions sign. Constraint-Based tions. Warren. May 1996. Saraswat. J. of Symposium on Software Testing. N. implementation. F. tional Conference on Software Engineering. Levitt. SIGPLAN Notices on Software Engi- 9] R. N. Hentenryck and V. Articial Intelligence. IEEE Transac. 16(6):1684{1698. and F. and K. of Symposium on Principles of Pro- Automatic Test Data Generation. Korel. Zadeck. Single-Pass 17] J. Wegman. The Program Dependence Graph and its use in optimization. Yates and N. Rosen. K. gramming Languages. The Chaining Approach Of Infeasible Paths In Branch Testing. July 1987. Ferguson and B. ACM Trans. January 1996. San Diego. J. Cytron. N. IEEE Transactions on Software Engineering. 16(8):870{879. Ferrante. Programs with Procedures. V. 5] M. 20(19):503{581. 12] J. 1977. straint language cc(fd). Alpern. B. In Proc. 1997. Exploring Dataow Testing of Arrays. J. Zadeck. Korel. SELECT . lishing Company. Automatic Generation of Path Covers Based on the Control Flow Anal. 18(3):197{216. volume 14(8) of Software En- actions on Software Engineering and Methodology. 1977. gineering Notes. for Software Test Data Generation". 24] D. Constraints. M. SICStus Prolog User's Manual. Inc. New York. In Proc. 1991. uary 1988. 16] W. pages 12{27. Wegman. of ISSTA'96. Coen-Porisini and F. Bertolino and M. august 1990. Global Value Numbers and Redundant Computa- 10] R. November 1990. Deville. Pro. 14] P. DeMillo and A. and F. Transactions on Program. Symbolic Evaluation System. June 1975. A System to Generate Test Data and Symbolically Execute Programs. and evaluation of the con- on Software Engineering. Automated Test Data Generation for sentation in Symbolic Execution. and Verication (TAV3). 8] A. Carlsson. Journal of Logic Program- Structured Languages. Marr!e. SE-2(3):215{222. 1997. A. Malevris. Symbolic Testing and the DISSECT A formal system for testing and debugging pro. D. New York. K. B. July 10(6):234{245. Commun. 19] B. Ja ar and M. tion. 13(4):451{490. ACM. K. January 1988. ACM. 8(1):99{118. ming Languages and Systems. on Software Engineering. pages 1{11. Transactions on Programming lations. pages 209{215. of Symposium on Principles of Programming Languages. 20] B. Programming : Strategic Directions. 9-3:319{349. Weg. Springer Verlag. 1994. Array Repre. K. Nikolik. IEEE Transactions grams by symbolic execution. San Diego. Howden. Transactions on Programming Lan- guages and Systems. Swedish Institute eration. In Conference on Software Maintenance. Mackworth. IEEE. Baltimore. 316. In Proc. 6] M. Maher. SIGPLAN Notices. Consistency in Networks of Re- pendence Graph. ing. M}ossenb}ock. pages Detecting Equality of Variables in Programs. CA. IEEE Transac. M. pages 311{317. September 1976. Reducing The E ects 11] R. K. Constraints ACM. C. Korel. B. Key West. V. . Jan- tions on Software Engineering. 1995. de Paoli. V. Ecently Computing Static Single Assignment Form and the Control De. 1993. K. ming. De- ysis of Computer Programs. Constraint Logic Pro- Generation of Static Single-Assignment Form for gramming : A Survey. ACM. December 1989. Languages and Systems. 2(1):7{34. O ut. and J. Saraswat. CA. SE-17(9):900{910. 15] P. J. King. 3] A. In 118{129. pages 48{54. guages. K. Brandis and H. and B. July 1976. 20(12):885{899. IEEE.