Professional Documents
Culture Documents
Árpád Beszédes, Tamás Gergely, Zsolt Mihály Szabó, János Csirik and Tibor Gyimóthy
Research Group on Artificial Intelligence, University of Szeged & HAS
Aradi vértanuk tere 1., H-6720 Szeged, Hungary, +36 62 544126
fbeszedes,gertom,szabozs,csirik,gyimig@cc.u-szeged.hu
Figure 1. (a) A simple program. (b) The framed statements give the dynamic slice
i: h(d1 : U1 ); (d2 : U2 ); : : : i source level and object-code level. In our algorithm source level instru-
mentation was chosen because of ease of portability to different platforms
and of mapping the slice results to the original source code. However, some
This is needed because in a C instruction (i.e. expression) may argue that object level instrumentation could result in faster execution
several l-values can be assigned new values. Note that the of instrumented code and also system and library calls could be handled
sequence order is important, since d values of a previous more completely [14].
i: hd : U i
#include <stdio.h>
int a, b;
void main() {
int s, *p;
8. s = 0; s : ;
9. scanf("%d", &a); a : ;
10. scanf("%d", &b); b : ;
11. p = &b; p : ;
12. while (*p < 10) { p12 fp; d1g
:
13. s += f(3,4); arg (f; 1) : fp12g; arg (f; 2) : fp12g; s : fs; ret(f ); p12g
14. s += g(3); arg (g; 1) : fp12g; s : fs; ret(g ); p12g
}
15. printf("%d", *p); o15 : fd2g
16. printf("%d", s); o16 : fsg
}
the relevant points in the program with certain dump-actions the corresponding d 0 and u0k 2 U 0 “dynamic dependences”
which will create a dump-file, the so-called T RAC E of the (i.e. memory locations and/or variables and predicates) and
program. The instrumentation is performed in such a way, derive a set DynS lice(d0 ) that contains all the statements,
that after compiling the instrumented program it will behave which affect d0 when instruction i has been executed and
identically to the original code. T RAC E contains all the 0
LS (d ) by the equations below:
After we have determined the D/U representation of the Note, that this computation order is strict as seen in the pre-
program and the T RAC E for a given execution we can de- vious section.
rive the dynamic slice with respect to the given slicing cri- In the case of function calls the actual D/U sequence can-
terion. We process the T RAC E from its beginning and not be processed in a single iteration of the algorithm by
perform different actions for each element. If the actual processing a single E H element “on the fly”. In these cases
T RAC E element is an “administrative” element, then the the remaining sequence positions should be stacked and af-
necessary computations are made for the internal represen- ter the function has returned the remaining sequence items
tations (such as maintaining a stack for the visible variables can be processed (see the formalized algorithm below).
in block-scopes). If the element is an E H element then The d0 and u0k values are determined from the corre-
the corresponding i: hd : U i sequence is processed as fol- sponding d and u k (statically computed) values as follows
lows. For each D/U item in this sequence we determine (the same conversions apply to u k as well):
if d is a scalar variable then d 0 will be its memory program DynamicSliceForC
location (this can be determined based on T RAC E ), begin
Initialize LS and DynS lice sets
ConstructD/U
if d is a dereference variable then d 0 will be the value ConstructTRACE
(also a memory location) of the corresponding pointer actual D/U item = nil
(also from T RAC E ), for all lines of T RAC E
case the current line in T RAC E of
if d is a predicate variable then d 0 is determined by function begin mark:
supplementing the predicate variable pn with a “depth- push(actual D/U item)
level” which corresponds to the depth of the function- actual D/U item = nil
call stack, denoted by pn(k ) (this is needed in the function end mark:
case of recursive functions because the same predicate pop(actual D/U item)
should be considered as a different one in different in- EH element:
stances of the same function), the current action in E H is i j
actual D/U item = the first item in i: hd : U i
other:
in all other cases d and d 0 are the same. resolve the unresolved memory address
references in actual D/U item
Now we can give a formalization of the dynamic slice endcase
algorithm for complete C programs in Figure 5. while actual D/U item can be processed
compute d0 and U 0 based on actual D/U item
We illustrate the above method by applying it on our 0) =
DynS lice(d
example program shown in Figure 4 for the slicing crite- S 0 ) [ fLS (u0 )g
rion (ha=2; b=6i; 1514 ; *p). On this input we get the fol- uk 2U
0 0 DynS lice(u
k k
LS (d
0) =
lowing execution history: h8; 9; 10; 11; 12; 13=1; 2; 3; 4=13; i
14=5; 6; 7=14; 12; 15; 16i. The “dual” E H elements 13=1, actual D/U item = the next item in i: hd : U i
4=13, 14=5 and 7=14 correspond to the function call/return endwhile
and parameter passing “virtual statements”. Parameter endfor
passing can be treated as two statements (but as one action), Output LS and DynS lice sets for the last definition
since first the caller puts the parameter on the function call of all memory locations
stack and then the called function takes that value (returning end
a value can be interpreted similarly). The values computed This is true if: according to the static D/U there are no
during the execution are shown in Figure 6. (The numbers
function calls at the actual D/U item position and item 6=
of the form $xxxxxxx are memory addresses supplied by
nil and item does not have any unresolved memory address
T RAC E .)
references.
The final slice of the program can be obtained as the
union of DynS lice(o15) and fLS (o15)g, while the actual Figure 5. Dynamic slice algorithm for C pro-
result is depicted in Figure 7. The slice contains the lines grams
marked with the bullets in the first column. We can observe
that the dynamic slice contains those statements, which in-
fluenced the value of memory location $4347828 pointed
by p (which is, in fact, the value of the scalar variable b). 5. Related work
As a point of interest, we can observe also that the dy-
namic slice for criterion (ha=2; b=6i; 16 15 ; s) (the second Our method for the computation of dynamic slices sig-
column of bulleted lines) contains only those statements, nificantly differs from the approach presented in [1]. This
which influence the value of scalar s, i.e. those statements, method takes into account that different occurrences of a
which influence the values of the two globals are not in- statement may be affected by different set of statements.
cluded, since s uses only the return values of the functions f This approach uses a Dynamic Dependence Graph. Using
and g (which are dependent only on constant values). Note, this graph precise dynamic slice can be computed, but as
that this computation does not necessitates a separate exe- mentioned earlier, the size of the DDGs may be unbounded.
cution of the algorithm, i.e. our method is “global” for a Agrawal and Horgan also proposed a reduced DDG method.
specific input (and T RAC E ) and slices of all variables at The size of reduced graphs is bounded by the number of
their last definition point can be determined simultaneously. different dynamic slices (see Section 2 for details about the
Action d
0 U
0 DynS lice(d
0) LS (d
0)
81 $6684144 ; ; 8
92 $4347824 ; ; 9
3
10 $4347828 ; ; 10
11
4 $6684148 ; ; 11
12
5 p12(1) f$6684148; $4347828g f10,11g 12
13
6 arg (f; 1) fp12(1)g f10,11,12g 13
13
6 arg (f; 2) fp12(1)g f10,11,12g 13
16 $6684060 farg (f; 1)g f10,11,12,13g 1
16 $6684064 farg (f; 2)g f10,11,12,13g 1
27 $4347824 f$4347824; $6684060g f1,9,10,11,12,13g 2
38 $4347828 f$4347828; $6684064g f1,10,11,12,13g 3
49 ret(f ) f$6684060g f1,10,11,12,13g 4
13
9 $6684144 f$6684144; ret(f ); p12(1)g f1,4,8,10,11,12,13g 13
14
10 arg (g; 1) fp12(1)g f10,11,12g 14
510 $6684064 farg (g; 1)g f10,11,12,14g 5
611 $4347824 f$4347824; $6684064g f1,2,5,9,10,11,12,13,14g 6
712 ret(g ) f$6684064g f5,10,11,12,14g 7
12
14 $6684144 f$6684144; ret(g ); p12(1)g f1,4,5,7,8,10,11,12,13,14g 14
12
13 p12(1) f$6684148; $4347828g f1,3,10,11,12,13g 12
15
14 o15 f$4347828g f1,3,10,11,12,13g 15
16
15 o16 f$6684144g f1,4,5,7,8,10,11,12,13,14g 16
DDG and reduced DDG methods). In [13] a simple pro- the dynamic slice of a program can be expected to be less
gram is presented (called Q n ) which has O(2n ) different than 50% of the executed nodes and to be well within 20%
dynamic slices, where n is the number of statements in the of the entire program.
program. This example shows that even this reduced DDG There have been several methods for dynamic slicing in-
may be very huge for some programs. troduced in the literature, but most of them use the inter-
In [11] Korel and Yalamanchili introduced a forward nal representation of the execution of the program with dy-
method for determining dynamic program slices. Their al- namic dependences called the Dynamic Dependence Graph
gorithm computes executable program slices. In many cases (DDG). The main disadvantage of these methods is that the
these slices are less accurate then those computed by our size of the DDGs is unbounded, since it includes a distinct
forward dynamic slicing algorithm. (Executable dynamic vertex for each instance of a statement execution.
slices may produce inaccurate results in the presence of In this paper we have introduced a new forward global
loops [13].) The method of Korel and Yalamanchili is based method for computing backward dynamic slices of C pro-
on the notion of removable blocks. The idea of this ap- grams. In parallel to the program execution the algorithm
proach is that during program execution, on each exit from a determines the dynamic slices for any program instruction.
block the algorithm determines whether the executed block We have also proposed a solution for some problems spe-
should be included in the dynamic slice or not. cific to the C language (such as pointers and function calls).
Excellent comparison of different dynamic slicing meth- The main advantage of our algorithm is that it can be ap-
ods can be found e.g in [13], [9]. plied to real size C programs, because its memory require-
ments are proportional to the number of different memory
6. Summary locations used by the program (which is in most cases far
smaller than the size of the execution history—which is, in
Different program slicing methods are used for mainte- fact, the absolute upper bound of our algorithm).
nance, reverse engineering, testing and debugging. Slicing We have already developed a prototype system, where
algorithms can be classified as static slicing and dynamic we implemented our forward dynamic slicing algorithm for
slicing methods. In several applications the computation of the C language. Our assumptions about the memory re-
dynamic slices is more preferable since it can produce more quirements of the algorithm turned out to be true according
precise results. Experimental results from [14] show that to our preliminary test results.
*p s Acknowledgements
#include <stdio.h>
int a, b; The authors wish to acknowledge the work of Csaba
Faragó including the implementation and testing of the pre-
1. int f(int x,int y) { sented algorithm.
2. a += x; This work was supported by the grants OTKA 23307 and
3. b += y; T25721.
4. return x+2;
} References
5. int g(int y) { [1] H. Agrawal and J. Horgan. Dynamic program slicing. In
6. a += y; SIGPLAN Notices No. 6, pages 246–256, 1990.
7. return y+1; [2] H. Agrawal, J. R. Horgan, E. W. Krauser, and S. A. London.
} Incremental regression testing. In Proceedings of the IEEE
Conference on Software Maintenance, Montreal, Canada,
void main() { 1993.
int s, *p; [3] J. Beck. Program and interface slicing for reverse engineer-
8. s = 0; ing. In Proceeding of the Fifteenth International Conference
on Software Engineering, 1993. Also in Proceedings of the
9. scanf("%d", &a);
Working Conference on Reverse Engineering.
10. scanf("%d", &b); [4] Á. Beszédes, T. Gergely, Zs. M. Szabó, Cs. Faragó, and
11. p = &b; T. Gyimóthy. Forward computation of dynamic slices of C
12. while (*p < 10) { programs. Technical Report TR-2000-001, RGAI, 2000.
13. s += f(3,4); [5] D. Binkley and K. Gallagher. Program slicing. Advances in
14. s += g(3); Computers, 43, 1996. Marvin Zelkowitz, Editor, Academic
} Press San Diego, CA.
15. printf("%d", *p); [6] K. B. Gallagher and J. R. Lyle. Using program slicing in
16. printf("%d", s); software maintenance. IEEE Transactions on Software En-
gineering, 17(8):751–761, 1991.
}
[7] T. Gyimóthy, Á. Beszédes, and I. Forgács. An efficient rele-
vant slicing method for debugging. In Proceedings of the 7th
Figure 7. The dynamic slices computed for
European Software Engineering Conference (ESEC), pages
the input a=2 and b=6 303–321, Toulouse, France, Sept. 1999. LNCS 1687.
[8] S. Horwitz, T. Reps, and D. Binkley. Interprocedural slicing
using dependence graphs. ACM Transactions on Program-
ming Languages and Systems, 12(1):26–61, 1990.
[9] M. Kamkar. An overview and comparative classification of
We tested our system on several input programs (more program slicing techniques. J. Systems Software, 31:197–
than twenty) and several hundred separate executions for 214, 1995.
different slicing criterions. The average number of different [10] B. Korel and J. Laski. Dynamic slicing in computer pro-
grams. The Journal of Systems and Software, 13(3):187–
memory locations was less than 10% of the size of execu-
195, 1990.
tion histories. We computed a ratio for each test execution [11] B. Korel and S. Yalamanchili. Forward computation of dy-
and after that we computed the average value (for large ex- namic program slices. In Proceedings of the 1994 Interna-
ecution histories the ratios were less than 3%). The largest tional Symposium on Software Testing and Analysis (ISSTA),
execution history consisted of more than 1 million actions. Seattle, Washington, Aug. 1994.
[12] G. Rothermer and M. J. Harrold. Selecting tests and identi-
Our future goal is to obtain empirical data on the size of fying test coverage requirements for modified software. In
Proceedings of ISSTA’94, pages 169–183, Seattle, Washing-
dynamic slices in large C programs. We selected a set of
ton, Aug. 1994.
C programs and a large number of test cases for each pro- [13] F. Tip. A survey of program slicing techniques. Journal of
gram. We are going to construct a union of dynamic slices Programming Languages, 3(3):121–189, Sept. 1995.
for each variable in the program over all test runs to ob- [14] G. A. Venkatesh. Experimental results from dynamic slicing
tain an approximation of decomposition slices. As we men- of C programs. ACM Transactions on Programming Lan-
tioned earlier, decomposition slices are very useful in soft- guages and Systems, 17(2):197–216, Mar. 1995.
ware maintenance and reverse engineering. We also wish [15] M. Weiser. Program slicing. IEEE Transactions on Software
to extend our approach to compute dynamic slices of C++ Engineering, SE-10(4):352–357, 1984.
programs.