Empirical Validation of Variable Based Test Case Prioritization/Selection Technique

Article in International Journal of Digital Content Technology and its Applications · January 2009
DOI: 10.4156/jdcta.vol3.issue3.16 · Source: DBLP
Yogesh Singh*1, Arvinder Kaur*2, Bharti Suri*3
*1 Professor, University School of Information Technology, G.G.S. Indraprastha University, Delhi, India
*2 Reader, University School of Information Technology, G.G.S. Indraprastha University, Delhi, India
*3 Corresponding author, Lecturer, University School of Information Technology, G.G.S. Indraprastha University, Delhi, India
ys66@rediffmail.com, arvinderkaurtakkar@yahoo.com, bhartisuri@gmail.com
doi: 10.4156/jdcta.vol3.issue3.15
International Journal of Digital Content Technology and its Applications
Volume 3, Number 3, September 2009

Abstract

Regression test case prioritization optimizes the ordering of test cases to be executed so as to meet criteria such as maximum code coverage or a high rate of fault detection. In prior work we prioritized test cases with a hybrid, variable based technique that combined both selection and prioritization. That approach assumed that variables are a vital source of changes in a program, and that test cases should be prioritized according to the variables of any changed statement and the variables computed from the variables of changed statements. In support of our prioritization approach, we extend that work to validate the effectiveness of the prioritized test cases with respect to the data flow technique. This paper reports an experimental study investigating the effectiveness of the prioritization approach given in [2] on a set of programs. The results obtained are encouraging and support the validity of the prioritization technique with respect to the DU/DC paths of the data flow testing technique.

Keywords: Regression testing, prioritization, data flow testing

1. Introduction

Test case prioritization optimizes the ordering of test cases to be executed so as to meet criteria such as maximum code coverage or an improved rate of fault detection. Although test case prioritization can also be used in development testing, it is primarily used during regression testing. Regression testing, being a maintenance activity, is very costly and can account for almost half of the maintenance budget [1].

Due to the time and cost constraints of the maintenance phase, rerunning all the test cases is neither feasible nor optimal. Regression test case prioritization techniques prioritize the test cases from the test suite used during development testing so that software testers may test the modified code effectively, efficiently and inexpensively.

Various techniques have been proposed for prioritizing test cases for regression testing. In our previous work [2], we proposed a hybrid approach using variables that combines both selection and prioritization. It considered source code changes and coverage information with respect to each test case. Variables are a vital source of changes in the program, and this method captured the effect of changes in terms of variable computation.

In this work, we extend our previous work by considering 'C' programs. As our work primarily focused on the uses and computation of variables, changes in variables may induce new def-use associations. These def-use (DU) pairs may not be def-clear (DC), which can be very problematic, as such pairs may be subtle sources of errors. This paper addresses the coverage of the def-use paths that are not def-clear, using the test cases obtained after applying the technique.

More specifically, we validate our previous work using the data flow technique. We show that the test cases prioritized according to the approach proposed in [2] are sufficient to cover all the def-use paths that are not def-clear.

Section 2 covers background and related work. Section 3 covers the validation of the technique using the data flow technique and illustrates it with an example. The observation and
analysis of the results are presented in Section 4, and conclusions are given in Section 5.

2. Background and Related Work

Various techniques for test case prioritization have been proposed in the research literature. These techniques address test case prioritization according to rate of fault detection or code coverage capability and have been evaluated through various empirical studies [3, 4, 5, 6, 7, 8, 9]. Rummel, Kapfhammer, and Thall [10] suggest that a test suite can be prioritized according to all DUs with minimal time and space overhead. Hutchins et al. [11] showed that both control flow and data flow testing can be very useful at instigating the generation of high yield test cases that might otherwise be omitted. Frankl et al. [12] suggest that mutation based criteria are better than all-DU criteria when the desired code coverage level is high.

Rothermel et al. [7] developed 18 different test case prioritization techniques, which are further subdivided into statement level and function level techniques. Many search techniques for test case prioritization are being developed by various researchers in the field [13]. Further, in the recent past many techniques and studies have been proposed [4, 6, 7, 9, 14, 15, 16, 17, 18, 19] for the prioritization of test cases for regression testing.

The hybrid approach combines both regression test selection and test case prioritization. A number of techniques and approaches based on this concept have evolved in the last decade. For example: 1) the test selection algorithm proposed by Aggarwal et al. [20]; 2) the hybrid technique proposed by Wong et al., which combines minimization, modification and prioritization based selection using test history [9]; 3) the hybrid technique proposed by Rajiv et al. [21], based on regression test selection using slicing and prioritizing the selected def-use associations; 4) the variable based hybrid approach by Yogesh et al. [2].

3. Accessing Dataflow Information for Validation

For our work we considered six programs written in the 'C' language. We wanted to validate that the prioritized test cases obtained from our hybrid approach are efficient in covering all the non def-clear paths. We constructed the CFG of each program and identified the nodes for variable definitions and uses. In our proposed approach we identified two sets of variables, primary and computed. We asserted that if any statement of the code changes and some primary or computed variable is part of that statement, then the test cases must be prioritized according to those variables. The test cases related to the variables in the changed line are assigned the highest priority, and those related to variables computed from them are given the second highest priority. Following these criteria we obtained a hierarchy of prioritized test cases.

The basic idea is that if any variable is perturbed by a change, it causes a ripple effect throughout the code, because the same variable may be used in the computation of other variables and can thus reach deep into the code and change its course of execution. So we must prioritize the test cases according to the variable.

Data flow testing monitors the life cycle of a piece of data and looks for inappropriate usage of data during definition, use in predicates, and computations. Various studies have shown that data flow testing strategies lead to richer test suites that concentrate on improper use of data due to coding errors [22].

In this work, we identify DU paths, DC paths, and paths that are not definition clear. We assert that after a change in any statement, variables may be redefined, which can introduce unforeseen errors. Thus, regression testing is of utmost importance to check these newly introduced DU paths, which may or may not be definition clear. We constructed CFGs of the programs to identify the defining and usage nodes for the variables of each program. In the light of these observations, we determined whether the test cases prioritized according to the variable based hybrid approach are good enough to execute these paths.

Various studies have shown that the all-DU criterion is more effective at revealing defects than other CFG based criteria [10]. Our emphasis, however, was on checking whether the selected and prioritized test cases exercise those def-use paths which are not definition clear. The validation is illustrated with the following example:

#include<stdio.h>
#include<conio.h>
#include<math.h>

 1. int main( )
 2. {
 3.     int a, b, c, validinput = 0, d;
 4.     double D;
 5.     printf(" Enter a");
 6.     scanf("%d", &a);
 7.     printf(" Enter b");
 8.     scanf("%d", &b);
 9.     printf(" Enter c");
10.     scanf("%d", &c);
11.     if((a>=0) && (a<=100) && (b>=0) && (b<=100) && (c>=0) && (c<=100)) {
12.         validinput = 1;
13.         if(a==0) {
14.             validinput= -1;
15.         }
16.     }
17.     if(validinput == 1) {
18.         d = b*b - 4*a*c;
19.         if(d == 0) {
20.             printf("The roots are equal and are r1=r2=%f \n", -b/(2*(float) a));
21.         }
22.         else if( d > 0) {
23.             D = sqrt(d);
24.             printf("The roots are real and are r1 = %f and r2 = %f \n", (-b-D)/(2*a), (-b+D)/(2*a));
25.         }
26.         else {
27.             D = sqrt(-d)/(2*a);
28.             printf("The roots are imaginary and are r1 = (%f, %f) and r2 = (%f, %f) \n", -b/(2.0*a), D, -b/(2.0*a), -D);
29.         }
30.     }
31.     else if(validinput == -1) {
32.         printf("The value do not constitute a Quadratic Equation:");
33.     }
34.     else {
35.         printf("The inputs belongs to invalid range");
36.     }
37.     getch();
38.     return 1;
39. }

Figure 1: Quadratic Example

Figure 1 shows the example code used to illustrate our validation. The bold lines are the lines that are changed or modified. Table 1 lists the variable definitions and uses in the example. The first column in Table 2 specifies the variable for which DU and DC paths are found. The second column lists the DU paths corresponding to the variable; DU paths are identified and named by their beginning and ending nodes. The third column indicates whether the DU path is DC or not. There are 26 DU paths, of which four are not DC paths. Two paths are impossible paths. The next two columns specify whether the path passes through a change and whether the DU path is both not DC and passing through a change. Table 3 lists the test suite for the example. The selected and prioritized test suite (T') obtained from our technique [2] for regression testing is given in Table 4.

Table 1: Defined/Used nodes for all the variables

Variable     Defined at node   Used at node
a            6                 11, 13, 18, 20, 24, 27, 28
b            8                 11, 18, 20, 24, 28
c            10                11, 18
d            18                19, 22, 23, 27
D            23, 27            24, 28
validinput   3, 12, 14         17, 31

Table 2: List of DU and DC paths

Variable     DU path   DC?            Passing through   Not DC and passing
                                      a change          through a change
a            6, 11     yes            no                --
             6, 13     yes            no                --
             6, 18     yes            yes               --
             6, 20     yes            yes               --
             6, 24     yes            yes               --
             6, 27     yes            yes               --
             6, 28     yes            yes               --
b            8, 11     yes                              --
             8, 18     yes            yes               --
             8, 20     yes            yes               --
             8, 24     yes            yes               --
             8, 28     yes            yes               --
c            10, 11    yes            no                --
             10, 18    yes            yes               --
d            18, 19    yes            yes               --
             18, 22    yes            yes               --
             18, 23    yes            yes               --
             18, 27    yes            yes               --
D            23, 24    yes            no                --
             23, 28    not possible   no                --
             27, 24    not possible   no                --
             27, 28    yes            no                --
validinput   3, 17     no             no                No
             3, 31     no             yes               Yes
             12, 17    no             no                No
             12, 31    no             yes               Yes
             14, 17    yes            no                --
             14, 31    yes            yes               --

Table 3: Test suite (T) for quadratic program

Test Case ID   Variable Involved   a     b     c     Actual output
T01            a                   -1    50    50    The value does not belong to valid range
T02            a                   0     50    50    The value does not constitute a quadratic equation
T03            a                   1     50    50    The roots are real and are r1=-48.98 and r2=-1.02
T04            a                   99    50    50    The roots are imaginary and are r1=(-0.50, 1.32) and r2=(-0.50, -1.32)
T05            a                   100   50    50    The roots are imaginary and are r1=(-0.25, 0.66) and r2=(-0.25, -0.66)
T06            a                   101   50    50    The value does not belong to valid range
T07            a                   201   50    50    The value does not belong to valid range
T08            b                   50    -1    50    The value does not belong to valid range
T09            b                   50    0     50    The roots are imaginary and are r1=(0.00, 1.00) and r2=(0.00, -1.00)
T10            b                   50    1     50    The roots are imaginary and are r1=(-0.01, 1.00) and r2=(-0.01, -1.00)
T11            b                   50    99    50    The roots are imaginary and are r1=(-0.99, 0.14) and r2=(-0.99, -0.14)
T12            b                   50    100   50    The roots are equal and are r1=r2=-1.00
T13            b                   50    101   50    The value does not belong to valid range
T14            c                   50    201   -1    The value does not belong to valid range
T15            c                   50    50    0     The roots are real and are r1=-1.00 and r2=0.00
T16            c                   50    50    1     The roots are real and are r1=-0.98 and r2=-0.02
T17            c                   50    50    99    The roots are imaginary and are r1=(-0.50, 1.32) and r2=(-0.50, -1.32)
T18            c                   50    50    100   The roots are imaginary and are r1=(-0.50, 1.32) and r2=(-0.50, -1.32)
T19            c                   -1    50    101   The value does not belong to valid range
T20            valida              -1    50    50    The value does not belong to valid range
T21            valida              102   50    50    The value does not belong to valid range
T22            validb              50    -1    50    The value does not belong to valid range
T23            validb              50    102   51    The value does not belong to valid range
T24            validc              50    50    101   The value does not belong to valid range
T25            validc              50    50    -1    The value does not belong to valid range
T26            validInput          -2    50    50    The value does not belong to valid range
T27            validInput          50    100   70    The roots are imaginary and are r1=(-1.00, 0.63) and r2=(-1.00, -0.63)
T28            validInput          0     50    50    The value does not constitute a quadratic equation
T29            D                   60    70    70    The roots are imaginary and are r1=(-0.58, 0.91) and r2=(-0.58, -0.91)
T30            Dval                79    56    57    The roots are imaginary and are r1=(-0.35, 0.77) and r2=(-0.35, -0.77)

Table 4: Selected and prioritized test suite (T')

Test case ID   Priority1   Priority2
T1-T7          1           1
T14-T19        1           2
T30            2           1
T29            2           2
T20, T21       2           2
T24, T25       2           2

All the DU paths that are not DC are covered by the prioritized test suite by the 7th test case. Also, the DU paths that are not DC and pass through a change are covered by the 1st test case.

4. Observations and Analysis

The results are shown in the following tables and graphs:

Figure 2. Paths passing from change.

Table 5: Percentage of paths passing through change

Program Name         % DU paths that are        % DU paths that are not DC and
                     passing through a change   passing through a change (Q)
Quadratic            61.54                      12.50
Counter Control      54.55                      16.67
Gross Salary         35.00                      28.57
Cost of Publishing   80.77                      66.67
Pay Bill             69.23                      16.67
Triangle             86.36                      21.05
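As a rough illustration of how the percentages in Table 5 relate to per-path flags such as those in Table 2, the following C sketch counts paths and converts the counts to percentages. The struct and function names are hypothetical, and the reading that Q is taken relative to the paths passing through a change (rather than to all DU paths) is an assumption; for the Quadratic program it is consistent with 2 of 16 paths giving 12.50%, and with 16 of 26 paths giving 61.54%.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a Table 2 style listing; the field names are illustrative. */
struct du_path {
    int def_node;        /* node where the variable is defined */
    int use_node;        /* node where the variable is used */
    int def_clear;       /* 1 if the path is def-clear (DC), 0 if not */
    int through_change;  /* 1 if the path passes through a changed line */
};

/* Number of paths that pass through a change. */
size_t count_through_change(const struct du_path *paths, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (paths[i].through_change)
            count++;
    return count;
}

/* Number of paths that are not def-clear and pass through a change. */
size_t count_not_dc_through_change(const struct du_path *paths, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!paths[i].def_clear && paths[i].through_change)
            count++;
    return count;
}

/* Percentage of `part` in `whole`; `whole` must be non-zero. */
double percent(size_t part, size_t whole)
{
    assert(whole > 0);
    return 100.0 * (double)part / (double)whole;
}
```

Applied to the six validinput rows of Table 2, the two counting functions return 3 and 2 respectively; `percent(2, 16)` then yields the 12.50 reported for Quadratic under the assumption stated above.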
Figure 3. Path Coverage for DU, DC and for DU paths that are not DC

Table 6: Number of paths covered by computed test suite (T')

Program Name         Total DU    Total DC    No. of DU paths        No. of P
                     paths (m)   paths (n)   that are not DC (P)    covered by T'
Quadratic            26          22          4                      4
Counter Control      11          10          1                      1
Gross Salary         20          16          4                      4
Cost of Publishing   26          9           17                     17
Pay Bill             26          20          6                      6
Triangle             22          18          4                      4

Figure 4. Percentage of test cases for path coverage with respect to T'

Table 7: Test cases needed for 'P' and 'Q' coverage

Program Name         % of selected and        % of test cases needed   % of test cases needed
                     prioritized test cases   for 'P' coverage         for 'Q' coverage
Quadratic            67.86                    14.28                    3.57
Counter Control      43.48                    8.70                     8.70
Gross Salary         42.31                    42.31                    3.85
Cost of Publishing   100.00                   87.50                    87.50
Pay Bill             66.67                    44.44                    37.04
Triangle             44.83                    10.34                    10.34

Our experiments focused on determining the test suite's effectiveness in exercising def-use paths that are not def-clear, for six C programs. Figure 2 shows the percentage of def-use paths that pass through a change and the percentage of def-use paths that are not def-clear and pass through the change. This data was collected to gain insight into program behavior after modification of any statement of the code.

Figure 3 shows the total number of DU paths, the total number of DC paths, the number of DU paths that are not def-clear (P), and the number of these paths (P) covered by the test suite (T') produced by our technique. It shows that the resulting test cases were effective enough to exercise all the paths that are not DC, for all of the programs.
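The notion of a DU path that is not def-clear can be made concrete with a small check. The following C sketch is illustrative only and is not the tooling used in the study: it assumes a path is given as a sequence of node numbers from the defining node to the using node, and treats the path as def-clear when no intermediate node redefines the variable. The function names and the path representation are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `node` is one of the nodes that define the variable. */
static int is_def_node(int node, const int *defs, size_t ndefs)
{
    for (size_t i = 0; i < ndefs; i++)
        if (defs[i] == node)
            return 1;
    return 0;
}

/*
 * path[0] is the defining node and path[len - 1] is the using node.
 * The path is def-clear when no node strictly between them
 * redefines the variable.
 */
int is_def_clear(const int *path, size_t len, const int *defs, size_t ndefs)
{
    assert(len >= 2);
    for (size_t i = 1; i + 1 < len; i++)
        if (is_def_node(path[i], defs, ndefs))
            return 0;
    return 1;
}
```

For validinput in Figure 1 (defined at nodes 3, 12 and 14), a path from node 3 to node 17 that executes node 12 is reported as not def-clear, while the direct path from node 14 to node 17 is def-clear, in line with the corresponding rows of Table 2.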
Finally, Figure 4 provides the main results of our experimentation: it shows the percentage of selected and prioritized test cases, the percentage of test cases needed for exercising def-use paths that are not def-clear, and the percentage of test cases needed to execute the paths that are not definition clear and pass through the change. We observe that the test suite obtained from our technique meets the data flow coverage criteria required for DU and DC paths. The average percentage of test cases needed to cover 'P' paths is 34.59%. In most cases it is less than 45%. The only case where this percentage is 87.5 is when the total number of variables in the whole program is equal to the total number of variables affected by the change. The number of test cases required for covering 'Q' paths is lower still than that required for 'P' coverage.

For three programs (Counter Control, Cost of Publishing and Triangle), the percentage of test cases required for covering paths that are not def-clear and the percentage required for paths that are not def-clear and pass through a change are equal.

5. Conclusion

In this paper, we validated the technique proposed in [2] using the DU and DC paths of data flow testing. The results are encouraging in that 1) the test suite selection and prioritization technique reduces the size of the test suite, and 2) the number of test cases needed for DU and DC coverage is smaller still than the number obtained by our technique. The ordered test suite potentially enables us to discover faults earlier. Our studies validate that the selection and prioritization technique based on variables is efficient and effective in terms of data flow criteria. We infer that the variable based prioritization of test cases for regression testing conforms to the data flow testing strategy. As no research work is complete without further study and experimentation, we intend to extend our work to a larger set of programs.

6. References

[1] B. Beizer, "Software testing techniques", New York: Van Nostrand Reinhold, 1990.
[2] Y. Singh, A. Kaur and B. Suri, "Regression Test Selection and Prioritization Using Variables - Analysis and Experimentation", Software Quality Professional, 2009, vol. 11, no. 2, pp. 38-51.
[3] H. Do, G. Rothermel, and A. Kinneer, "Empirical studies of test case prioritization in a JUnit testing environment", In Proceedings of International Symposium on Software Reliability Engineering, 2004, pp. 113-124.
[4] S. Elbaum, D. Gable, and G. Rothermel, "Understanding and measuring the sources of variation in the prioritization of regression test suites", In Proceedings of International Symposium on Software Metrics, 2001, pp. 169-179.
[5] S. Elbaum, A. Malishevsky, and G. Rothermel, "Incorporating varying test costs and fault severities into test case prioritization", In Proceedings of International Conference on Software Engineering, 2001, pp. 329-338.
[6] A. Srivastava and J. Thiagarajan, "Effectively prioritizing tests in development environment", In Proceedings of the International Symposium on Software Testing and Analysis, Rome, Italy, 2002, pp. 97-106.
[7] S. Elbaum, A. G. Malishevsky, and G. Rothermel, "Test case prioritization: A family of empirical studies", IEEE Transactions on Software Engineering, 2002, vol. 28, no. 2, pp. 159-182.
[8] G. Rothermel, R. Untch, C. Chu, and M. J. Harrold, "Prioritizing test cases for regression testing", IEEE Transactions on Software Engineering, 2001, vol. 27, no. 10, pp. 929-948.
[9] W. Wong, J. Horgan, S. London, and H. Agrawal, "A study of effective regression testing in practice", In Proceedings of International Symposium on Software Reliability Engineering, 1997, pp. 230-238.
[10] M. J. Rummel, G. M. Kapfhammer and A. Thall, "Towards the Prioritization of Regression Test Suites with Data Flow Information", ACM Symposium on Applied Computing, 2005.
[11] M. Hutchins, H. Foster, T. Goradia, and T. Ostrand, "Experiments of the effectiveness of dataflow- and control flow-based test adequacy criteria", In Proceedings of the 16th International Conference on Software Engineering, IEEE Computer Society Press, 1994, pp. 191-200.
[12] P. G. Frankl, S. N. Weiss and C. Hu, "All-uses vs mutation testing: an experimental comparison of effectiveness", Journal of Systems and Software, 1997, vol. 38, no. 3, pp. 235-253.
[13] Z. Li, M. Harman, and R. M. Hierons, "Search algorithms for regression test case prioritization", IEEE Transactions on Software Engineering, 2007, vol. 33, no. 4, pp. 225-237.
[14] G. Rothermel and M. J. Harrold, "Empirical studies of safe regression test selection technique", IEEE Transactions on Software Engineering, 1998, vol. 24, no. 6, pp. 401-419.
[15] S. Elbaum, G. Rothermel, S. Kanduri, and A. G. Malishevsky, "Selecting a cost-effective test case prioritization technique", Software Quality Journal, 2004, vol. 12, no. 3, pp. 185-210.
[16] J. A. Jones and M. J. Harrold, "Test-suite reduction and prioritization for modified condition/decision coverage", In Proceedings of the International Conference on Software Maintenance, Florence, Italy, 2001, pp. 92-101.
[17] J. Kim and A. Porter, "A history-based test prioritization technique for regression testing in resource constrained environments", In Proceedings of the 24th International Conference on Software Engineering, Orlando, Fla., 2002, pp. 119-129.
[18] D. Jeffrey and N. Gupta, "Test case prioritization using relevant slices", In Proceedings of Computer Software and Applications (COMPSAC'06), Chicago, 2006, pp. 411-420.
[19] H. Srikanth, "Requirements-based test case prioritization", Student Research Forum in 12th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Newport Beach, Calif., 2004.
[20] K. K. Aggarwal, Y. Singh, and A. Kaur, "Code coverage based technique for prioritizing test cases for regression testing", ACM SIGSOFT Software Engineering Notes, 2004, vol. 29, no. 5.
[21] R. Gupta and M. L. Soffa, "Priority based data flow testing", IEEE ICSM, 1995, pp. 348-357.
[22] J. Badlaney, R. Ghatol and R. Jadhwani, "An Introduction to Data-Flow Testing", Department of Computer Science, North Carolina State University, NCSU CSC TR-2006-22.