Sibley Panel Editor
Software engineering is a discipline in search of objective measures for factors that contribute to software quality. NPATH, which counts the acyclic execution paths through a function, is an objective measure of software complexity related to the ease with which software can be comprehensively tested.
NPATH: A MEASURE OF EXECUTION PATH COMPLEXITY AND ITS APPLICATIONS
Perhaps the “biggest bang for the buck” in software testing comes from assuring quality at the function or unit level. This is because detecting and correcting unit-level defects during integration and system testing can be very costly. Assuring quality at the function level depends, at least in part, on how thoroughly a function can be tested. Thus, the thoroughness with which functions can be tested is an important design concept. Designers and developers can increase the quality of their software systems by making sure the functions that constitute their systems can be comprehensively tested. In addition to factors such as modularity and input space size that affect the testability of a software system, an equally important factor that affects testability is the number of execution paths through its functions. Functions with more execution paths are more difficult to test than functions with fewer execution paths.

The number of execution paths in a function has received considerable attention in software measurement research [16, 17]. A difficulty of this research is that functions that contain any looping construct can have an infinite number of execution paths. Any meaningful measure of the number of paths in a function must be based on some finite subset of the (usually) infinite set of execution paths. A major criticism of existing software complexity metrics based on the number of acyclic execution paths is that there is a poor relationship between the finite subsets of paths selected by various measures and the set of all execution paths. Thus, the accurate measurement of the acyclic execution path complexity for functions must be addressed.

SHORTCOMINGS OF MCCABE'S MEASURE OF PATH COMPLEXITY
Perhaps the most widely discussed complexity metric is McCabe's cyclomatic complexity metric [11]. McCabe's metric “attempts to determine the number of execution paths in a function” [11, p. 309]. The cyclomatic complexity number, V(G), is the number of logical conditions in the function plus 1. McCabe argued that V(G) represents the number of fundamental circuits in the flow-graph representation of a function. (Evangelist points out that this formula is correct only when each predicate vertex in the flow graph has outdegree 2.)

The author is currently a senior member of the technical staff at the Software Productivity Consortium. The author's present address is B. Nejmeh, SPC, 1880 North Campus Commons Drive, Reston, VA 22091.
© 1988 ACM 0001-0782/88/0200-0188 $1.50
Communications of the ACM
Volume 31 Number 2
There are several problems with McCabe's metric. First, the number of acyclic paths in a flow graph varies from a linear to an exponential function of V(G). Thus, to assert that V(G) is a reasonable estimate of the number of execution paths through a function is unjustified. Moreover, McCabe argued that the number of fundamental circuits acts as an index of testing effort. Finally, the number of acyclic execution paths that may not be tested by a methodology based on McCabe's metric varies from 0 to 2^N, where N is the number of vertices in the flow graph. The poor relationship between the number of acyclic execution paths and the number of execution paths tested, based on McCabe's metric, suggests that McCabe's assumption that total testing effort is proportional to V(G) should not be accepted.

A second problem with McCabe's metric is that it fails to distinguish between different kinds of control flow structures (i.e., the measure treats the if, while, for, etc., structures the same). Certain control flow structures, however, are more difficult to understand and use properly. Finally, Curtis [2] argues that McCabe's metric does not consider the level of nesting of various control structures (e.g., three for-loops in succession will result in the same metric value as three nested for-loops). Curtis argues that nesting may influence the psychological complexity of the function. Moreover, many researchers argue that psychological complexity has a large impact on software quality.

In light of the problems with McCabe's metric, a new and intuitively more appealing measure of software complexity has been developed. The new metric of software complexity, NPATH, overcomes the shortcomings associated with McCabe's metric. It is a count of the number of acyclic execution paths through a function.

BACKGROUND OF THE NPATH MEASURE
To keep the execution path measure finite and eliminate redundant information, a measure of execution path complexity should not reflect every possible iteration of a loop. An appealing property to determine the loop control vertices in a control flow graph is that any execution path through a function can be constructed once the loop vertices are known. This happens when zero or more of the cycles that comprise the loop for a vertex are substituted until the entire execution path is constructed. This suggests that a good characterization of the number of execution paths in a function should only count a single iteration of each loop. Such a metric counts all paths where a loop is not iterated more than twice. This approach has initiated the development of NPATH, a metric that counts the number of execution paths through functions written in the C programming language. Although the NPATH metric is defined for the C programming language, the NPATH approach is applicable to other programming languages. Most of the control structures available in C are similar to the control structures in other high-level languages (e.g., Pascal, PL/1, Fortran).

BACKGROUND DEFINITIONS
The following definitions are pertinent to the discussion of NPATH.
• A control flow graph is a graph in which each vertex represents either a basic block of code (statement sequence that contains no branches) or a branch point in the function, and each edge represents possible flow of control. More formally, the control flow graph of a function can be represented as a directed graph four-tuple (V, E, Ventry, Vexit), where
  - V is a set of vertices representing basic blocks of code or branch points in the function,
  - E is a set of edges representing flow of control in the function,
  - Ventry, an element of V, is the unique function entry vertex, and
  - Vexit, an element of V, is the unique function exit vertex.
• A path P in a control flow graph G is a sequence of vertices (V0, V1, V2, ..., Vk) such that there is an edge from Vi to Vi+1 for i = 0, 1, ..., k - 1.
• A possible execution path is any path from Ventry to Vexit in a flow graph.
• An elementary cycle is any path P from vertices V1, V2, ..., Vk such that V1 = Vk and Vi <> Vj for 1 < i < j <= k. In short, an elementary cycle is a cycle that contains no other cycles within it.
• A loop control vertex is a vertex V with the following two properties: (1) V has an out-edge that lies on at least one elementary cycle that begins and ends at V, and (2) V has a second out-edge that lies on a path leading out of the loop.
• A loop is a cycle that begins and ends at a given loop control vertex.
• A range of a statement V is the set of statements whose execution may be determined by the truth value of the expression in statement V.

THE EXECUTION PATH COMPLEXITY OF C CONTROL FLOW STRUCTURES
The acyclic execution path complexity expressions for each of the control flow structures in the C programming language are defined in the following subsections.

if Statement (Figure 1)
Syntax
    if ((expr))
        (if-range)
    S;
The semantics of the if statement are as follows: If the expression (expr) is True, then the statement comprising the (if-range) is executed; otherwise, the statement following the if statement is executed. The acyclic execution path complexity (NP) for the if statement is
    NP(if) = NP((if-range)) + NP((expr)) + 1.
This expression is derived from the flow-graph representation of the statement. In particular, the number of acyclic execution paths through the if statement is the number of paths through the (if-range) plus 1 for the case when the (expr) is False. The complexity of the logical expression (expr) is also added to the complexity of the if statement. A definition for the complexity of the logical expression (expr) appears below. The same reasoning applies to each of the following acyclic execution path complexity expressions.

FIGURE 1. Flow Graph for the if Statement

if-else Statement (Figure 2)
Syntax
    if ((expr))
        (if-range)
    else
        (else-range)
    S;
The semantics of the if-else statement are as follows: If the expression (expr) is True, then the statement comprising the (if-range) is executed; otherwise, the statement comprising the (else-range) is executed. The acyclic execution path complexity for the if-else statement is
    NP(if-else) = NP((if-range)) + NP((else-range)) + NP((expr)).

FIGURE 2. Flow Graph for the if-else Statement

while Statement (Figure 3)
Syntax
    while ((expr))
        (while-range)
    S;
The semantics of the while statement are as follows: If the expression (expr) is True, then the statement comprising the (while-range) is executed, and control branches back to the (expr) logical evaluation; otherwise, the statement following the while statement is executed. The acyclic execution path complexity for the while statement is
    NP(while) = NP((while-range)) + NP((expr)) + 1.

FIGURE 3. Flow Graph for the while Statement
do while Statement (Figure 4)
Syntax
    do
        (do-range)
    while ((expr));
The semantics of the do while statement are as follows: The statement comprising the (do-range) is executed, and then, if the expression (expr) is True, control branches back, and the statement comprising the (do-range) is reexecuted; otherwise, the statement following the do while statement is executed. The acyclic execution path complexity for the do while statement is
    NP(do) = NP((do-range)) + NP((expr)) + 1.

FIGURE 4. Flow Graph for the do while Statement

for Statement (Figure 5)
Syntax
    for ((expr1); (expr2); (expr3))
        (for-range)
    S;
The semantics of the for statement are as follows: The expression (expr1) is used to initialize a loop control variable, (expr2) is used as the termination condition for the loop, and (expr3) is used as the increment/decrement value for the loop variable upon each iteration of the loop. The sequence of statements denoted by the (for-range) is executed as long as the expression (expr2) is True. The acyclic execution path complexity for the for statement is
    NP(for) = NP((for-range)) + NP((expr1)) + NP((expr2)) + NP((expr3)) + 1.

FIGURE 5. Flow Graph for the for Statement

switch Statement (Figure 6)
Syntax
    switch ((expr))
        (case-range_1)
        (case-range_2)
        ...
        (case-range_N)
        (default-range)
The semantics of the switch statement are as follows: The switch statement transfers control to one of several statements depending on the value of the expression (expr). When the switch statement is executed, (expr) is evaluated and compared with the value of each case. If a case value is equal to the value of (expr), then control is transferred to the statement following the matched case value. If there is neither a case match nor a default, then the statements in the switch are not executed. Note that a (case-range) is delimited by either another (case-range) or a break statement. The acyclic execution path complexity for the switch statement is
    $NP(switch) = NP((expr)) + NP((default-range)) + \sum_{i=1}^{N} NP((case-range_i))$.
In the case of a null (case-range_i), a situation where the case statement falls through to the next (case-range_(i+1)), the complexity of (case-range_i) is 1.

? Operator (Figure 7)
Syntax
    (expr1) ? (expr2) : (expr3)
The semantics of the ? statement are as follows: The expression (expr1) is evaluated first. If (expr1) is nonzero (True), then the expression (expr2) is evaluated and returned as the result of the expression; otherwise, (expr3) is evaluated. Only one of (expr2) and (expr3) is evaluated.
For our purposes, the ? operator can be treated similarly to the if-else statement. The acyclic execution path complexity for the ? operator is
    NP(?) = NP((expr1)) + NP((expr2)) + NP((expr3)) + 2.
The 2 that is included in the NP(?) expression reflects the execution path complexity resulting from this statement (i.e., one path is traversed if (expr1) is True, and another path traversed if (expr1) is False).

FIGURE 6. Flow Graph for the switch Statement

FIGURE 7. Flow Graph for the ? Operator

goto Statement
When the statement goto label is executed, transfer of control goes to the “labeled” statement, where program execution continues. A goto statement is referred to as forward referencing when the “labeled” statement being referenced appears textually after the goto statement. Similarly, a goto statement is referred to as backward referencing when the “labeled” statement being referenced appears textually before the goto statement. The acyclic execution path complexity expression for the goto statement is difficult to define. In the case of a forward referencing goto, accounting for the complexity of the code beginning at the target of the goto may overstate the complexity of the code between the goto statement and the target statement. On the other hand, a backward referencing goto would create a cycle in the program flow graph; it would thereby enable the execution path complexity to be infinite. Moreover, the use of the goto statement is generally considered poor programming practice. Although the number of acyclic paths resulting from the use of the goto statement could be significant in theory, in practice the goto statement is rarely used. Given the inherent ambiguity and difficulty in accounting for the execution path complexity created by the goto statement, our path complexity metric does not account for the execution path complexity introduced by the goto statement.

break Statement
A break statement causes exit from the innermost enclosing loop (while, do, for) or switch statement in which it appears. If and when the break statement is reached, it ends the execution of statements within the basic block of code where it occurs. In the context of execution path complexity analysis, the break statement can be thought of as the last statement on the execution path containing the basic block of code in which it occurs. As such, the execution path complexity of the break statement is 1.

Expressions
The syntax for a logical expression is as follows:
    (expr1) op1 (expr2) op2 ... op(N-1) (exprN),
where (expr1), (expr2), ..., (exprN) are expressions and op1, op2, ..., op(N-1) are any one of the logical operators and (&&) or or (||).

The complexity of logical expressions can have a tremendous impact on the number of execution paths in a function. This is because of the way logical expressions are evaluated in C. In particular, logical expressions are evaluated only until the final truth value of the expression can be determined. Consider the two logical expression operators && (and) and || (or). In the case of the and operator, the truth value of the logical expression (expr1) && (expr2) is determined as follows: If (expr1) is False, then the value of the entire logical expression is False, and the evaluation of the logical expression is terminated; otherwise, (expr2) is evaluated. If (expr2) is True, then the value of the entire logical expression is True; otherwise, the value of the logical expression is False. In the case of the or operator, the truth value of the logical expression (expr1) || (expr2) is determined as follows: If (expr1) is True, then the value of the entire logical expression is True, and the evaluation of the logical expression is terminated; otherwise, (expr2) is evaluated. If (expr2) is True, then the value of the entire logical expression is True; otherwise, the value of the logical expression is False.

Thus, every expression within a logical expression may have to be evaluated in order to determine the truth value of the entire logical expression. The number of expressions that may conditionally be executed in a logical expression grows linearly with the number of && and || operators in the logical expression. That is, the number of acyclic execution paths added as a result of each logical operator in a logical expression is 1. Therefore, the acyclic execution path complexity for any logical expression is
    NP(expression) = number of && and || operators in the expression.

To illustrate, consider the following function segment, as well as a more explicit but logically equivalent form of the function segment (also see Figure 8):
    if ((A && B) && C)
        S1;
    else
        S2;
which is equivalent to
    if (A)
        if (B)
            if (C)
                S1;
            else
                S2;
        else
            S2;
    else
        S2;
The flow-graph representation of the statement indicates that there are four different acyclic execution paths through this flow graph (assuming S1 and S2 are sequential statements). The path complexity expressions defined in this article lead to the same conclusion about the number of acyclic execution paths in this statement. That is, the complexity of the if-else statement, NP(if-else), has been previously defined to be NP((if-range)) + NP((else-range)) + NP((expr)). In the above case, NP((if-range)) and NP((else-range)) are each 1 since both S1 and S2 are sequential statements. The complexity of the logical expression (A && B) && C is 2 (the number of && and || operators in the expression). Thus, NP(if-else) = 1 + 1 + 2 = 4.

FIGURE 8. A Logical Expression and Its Corresponding Flow Graph

continue Statement
The continue statement forces the next iteration of an enclosing loop (for, while, do) to begin. Thus, the continue statement represents a back edge in the control flow graph of a function. Therefore, NPATH does not account for the complexity of this construct.

return Statement
The return statement terminates the execution of a C function. A return statement can also contain an expression. Thus, the complexity of the return statement is NP((expr)).
Sequential Statements and Function Calls
The execution path complexity for the sequential statement is 1 because there is only one path created by consecutive sequential statements. Note that function calls are treated as sequential statements. That is, it is assumed that the code within the function being called has been unit tested and is functioning properly. Therefore, the execution path complexity of function calls is also 1.

THE EXECUTION PATH COMPLEXITY OF C FUNCTIONS
The composite acyclic execution path complexity for a C function, NPATH, is
    $NPATH = \prod_{i=1}^{N} NP(Statement_i)$,
where N denotes the number of statements in the body of the function and NP(Statement_i) denotes the acyclic execution path complexity of statement i. Note that the complexity of any statement range is the product of complexities of the statements in the range. Although the number of acyclic paths resulting from the use of the goto statement could be significant in theory, in practice the use of the goto statement is minimal and generally not thought to be good programming practice. Thus, the NPATH measure differs from the actual number of acyclic execution paths by the number of acyclic execution paths resulting from goto statements. The complete algorithm to compute NPATH is listed in Appendix A.

An Example of NPATH
A segment of C source code and its corresponding NPATH measure follows (also see Figure 9):
    if (ch == 'a') {
        actr++;
        Func_a();
    }
    if (ch == 'b') {
        bctr++;
        Func_b();
    }
    if (ch == 'c') {
        cctr++;
        Func_c();
    }
    if (ch == 'd') {
        dctr++;
        Func_d();
    }
    NPATH = 16.
The NPATH value of 16 is obtained as follows: NP((if)) = NP((if-range)) + NP((expr)) + 1. In the above example, NP((expr)) = 0 for each if statement. Also note that (if-range) for each of the above if statements is a sequential statement. Thus, NP((if-range)) = 1 for each if statement. Therefore, NP((if)) = 1 + 0 + 1 = 2 for each if statement. Thus, NP((code segment)) = 2 × 2 × 2 × 2 = 16.

FIGURE 9. Example C Code Segment with NPATH = 16

Characteristics of NPATH
We now demonstrate that NPATH overcomes the shortcomings of McCabe's measure. It was noted earlier that the number of acyclic execution paths in a function varies from a linear to an exponential function of V(G). NPATH, on the other hand, is a measure that is more closely related to the number of acyclic execution paths through a function. McCabe's measure fails to distinguish between different kinds of control flow structures. NPATH, on the other hand, is based on unique expressions of acyclic execution path complexity for each C control flow structure. Therefore, the NPATH measure clearly distinguishes between different kinds of control flow structures. Another criticism of McCabe's measure is that it does not account for nesting levels within a function, whereas the NPATH measure does. In particular, acyclic execution path complexity is multiplicative if statements are consecutive; acyclic execution path complexity is additive if one statement is nested within another. Thus, the number of acyclic execution paths through a function is dependent, in part, on the level of nesting among statements in the function.

Anomaly of the NPATH Measure
An anomaly arises in the NPATH definition when a certain class of control flow structure sequences appears in a function. Given any two logical expressions (expr1) and (expr2) governing the execution of two different sequential sequences of code, if (expr1) is identical to (expr2) or (expr1) is logically equivalent to (expr2), and the values of the control variables in (expr1) and (expr2) are the same, then the NPATH measure overstates the acyclic execution path complexity by a factor of 2. Consider the following segment of C source code and its corresponding flow graph (also see Figure 10):
    if (A == B)
        S0;
    if (A == B)
        S1;
The NPATH measure for this segment of code is 4. In the above segment of code, however, S1 is executed if and only if S0 is executed. Therefore, there are only two unique execution paths possible through the first code segment. That is, the above segment of code is equivalent to the following segment of code provided S0 does not alter A or B (also see Figure 11):
    if (A == B) {
        S0;
        S1;
    }
The NPATH measure for this segment of code is 2. The NPATH definition, however, does not detect this anomaly.

FIGURE 10. Multiple if Flow Graph with NPATH = 4

FIGURE 11. Single if Flow Graph with NPATH = 2

Comparing NCSL, TOKENS, V(G), and NPATH
In order to assess whether traditional measures of software complexity are closely related to execution path complexity, we computed the following measures of software complexity for 821 functions in a UNIX C software application. (UNIX is a registered trademark of AT&T Bell Laboratories.)
• NCSL is the number of noncommentary source lines of code in a function, that is, any line of program text that is not a blank or comment.
• TOKENS is the number of lexical tokens in a function. TOKENS for the C programming language include
  - keywords (e.g., while and if),
  - identifiers (e.g., X and Msg),
  - operator symbols (e.g., + and <=), and
  - punctuation symbols (e.g., ( and )).
  The TOKENS metric is the basis for the Halstead collection of metrics referred to as Software Science.
• V(G), McCabe's [11] cyclomatic complexity number, represents the number of fundamental circuits in the flow-graph representation of a function. It is the number of logical conditions (if, while, for, case, default, &&, ||, ?) in a function plus 1.
• NPATH is the number of acyclic execution paths through a function.
The correlation matrix shown in Table I summarizes the R² correlations among the metrics. R² represents the percentage of variance in one variable that is explained by the other variable.

TABLE I. R² Correlation Matrix for NCSL, TOKENS, V(G), and NPATH
            NCSL    TOKENS  V(G)    NPATH
    NCSL    1.00    0.99    0.97    0.53
    TOKENS  0.99    1.00    0.97    0.57
    V(G)    0.97    0.97    1.00    0.56
    NPATH   0.53    0.57    0.56    1.00

The correlation measures show that NCSL, TOKENS, and V(G) are highly correlated. In short, the correlation between these metrics is at least 0.97. Thus, NCSL, TOKENS, and V(G) appear to be measuring the same thing, namely, lexical complexity. Also, when NPATH is correlated to NCSL, TOKENS, and V(G), the resulting correlations are 0.53, 0.57, and 0.56, respectively. These correlations show that NPATH is somewhat independent of the NCSL, TOKENS, and V(G) measures, with at least 40 percent of the variance in NPATH not accounted for by any one of the other measures. These correlations do not suggest that any of the measures is superior to the others, but they do show that NPATH is measuring different factors of complexity than the other measures. Because NPATH is measuring different factors than those measured by NCSL, TOKENS, and V(G), the lower correlations between NPATH and the other measures are expected.

Note that the three metrics NCSL, TOKENS, and V(G), which are measuring the lexical content, are not particularly sensitive to the number of execution paths through a function. These three measures do not measure the semantic content of code. Thus, if we accept the premise that an important property of software to be measured is the number of execution paths through a function, then these data highlight the importance of the NPATH measure. Note that NCSL, TOKENS, and V(G) do not capture this property of a function, as evidenced by the correlation matrix.

PRACTICAL USES OF NPATH
Many software complexity metrics lack practical value. Measures are often designed without any particular use in mind [8]. Such is not the case with NPATH. Software developers are using NPATH to
• select functions for thorough walk-through/inspection,
• allocate functional testing resources, and
• define module design criteria.

Walk-throughs/Inspections
Code walk-throughs and inspections have become an integral part of the software development process. To make good use of limited resources, it is important to identify the functions that would benefit most from thorough walk-throughs and inspections. When schedule and resource constraints preclude the comprehensive review of all functions, NPATH is aiding in the process of deciding which functions should be thoroughly reviewed and/or inspected. Since NPATH measures the functional complexity and testability of code, it follows that NPATH can be used in determining the level of review/inspection of a function. Functions with high (i.e., in the top 25 percent) NPATH values are candidates for thorough review and/or inspection.

Testing
Software development organizations have limited testing resources. A widely used criterion in deciding how to apportion testing resources among functions in a software system is the number of NCSL in each function: the larger the NCSL count of a function, the greater the resources allocated to it. Testing a function in proportion to its NCSL count, however, could lead to an inappropriate use of testing resources. This is because there is a weak relationship between the NCSL of a function and the testability of the function. A more appropriate criterion of relevance when apportioning testing resources is the number of unique execution paths through the function. In short, we agree with others that the extent to which a complexity measure can be used as a guide in testing effort depends on how well the measure specifies what is contributing to the complexity of a program. NPATH is used as a guide in the testing process because it characterizes a significant factor contributing to the complexity of functional testing: the number of acyclic execution paths through a function. Thus, testing effort might best be allocated to functions proportional to the testability of the function. The amount of testing resources allocated to a function is proportional to the number of acyclic execution paths through the function. A relatively large number suggests that a relatively large proportion of testing resources should be allocated to the function. The rank order of the functions based on NPATH is being used by developers to allocate testing resources using this criterion.
Software quality can be increased by designing software that requires manageable levels of functional testing to assure its correctness. Thus, the testability of each function in a software system is an important design criterion. Along these lines, an NPATH threshold value of 200 has been established for a function, to define a functional design criterion and identify candidate functions for redesign. The value 200 is based on studies done at AT&T Bell Laboratories. For functions that exceed the threshold value of 200, methods to reduce NPATH complexity are provided to developers.

Additional Considerations
Any decision to allocate inspection, testing, or design effort based on NPATH must also take into account the criticality of the function: whereas even moderately high NPATH values in a heavily used function would identify the function for thorough inspection and testing, as well as possibly redesign, a noncritical function of similar NPATH complexity might not warrant the same level of attention. Factors other than complexity also affect software quality: requirements volatility, software development environment, reuse, developer experience, and the use of code generators all have an impact. Thus, the use of NPATH cannot provide absolute principles for software development, but is a useful adjunct to traditional and intuitive measures of software complexity.

METHODS TO REDUCE COMPLEXITY
If a method is to be useful in controlling software complexity, then it must index a function's complexity level, as well as suggest ways to reduce complexity. Such is the case with NPATH. Many strategies to reduce the NPATH complexity of functions are being used by software developers. Some of the most effective methods of reducing the NPATH value include

• distributing functionality,
• implementing multiple if statements as a switch statement, and
• creating a separate function for logical expressions with a high count of and (&&) and or (||) operators.

Distributing Functionality
To reduce NPATH for a function, divide the function into blocks of code that logically belong together. Create a new function for each block of code. Then, replace each block of code with a call to the appropriate newly created function. The original functionality is thus distributed, reducing the NPATH value for the original function because function calls are treated as sequential statements. Generally, the more functions defined, the greater the reduction in NPATH. This is not true in all cases: for example, in the case of sequential code, NPATH is not changed by making the sequential code a function and then calling the function.

Multiple if Statements
Another way to reduce NPATH for a function is to implement multiple if statements via the switch and case statements. The following simple example illustrates this strategy. The original sequence of if statements

    if ( c == 'a' ) ca++;
    if ( c == 'b' ) cb++;
    if ( c == 'c' ) cc++;
    if ( c == 'C' ) cc++;
    if ( c != 'a' && c != 'b' && c != 'c' && c != 'C' ) cOther++;

has an NPATH value of 80 (2 × 2 × 2 × 2 × (2 + 3)). An equivalent, less complex case statement implementation of the same sequence

    switch ( c ) {
    case 'a': ca++; break;
    case 'b': cb++; break;
    case 'c': cc++; break;
    case 'C': cc++; break;
    default: cOther++; break;
    }

has an NPATH value of 5 (1 + 1 + 1 + 1 + 1).

Operators per Logical Expression
NPATH can be reduced for a function with a high count of and (&&) and or (||) operators in a logical expression by creating a separate function for the logical expression. Suppose the following logical expression occurs several times in a function:

    if ( (v1 && v2) || ((v3 || v4) && (v5 && v6)) )

The new separate function for the logical expression looks like the following:

    int v_check(void)
    /* assuming v1, v2, v3, v4, v5, v6 are global to module */
    {
        return ( (v1 && v2) || ((v3 || v4) && (v5 && v6)) );
    }

and each occurrence of the expression becomes

    if ( v_check() )
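A minimal sketch of the distributing-functionality strategy, under stated assumptions: the `tally` structure, the four conditions, and every function name below are hypothetical and do not come from the paper. The NPATH values in the comments are worked by hand from the rules summarized in Appendix B (an if statement contributes NP(if-range) + NP(expr) + 1, a function call counts as a sequential statement, and the NPATH of a function is the product over its statements).

```c
#include <assert.h>

/* Hypothetical counters, invented for this illustration only. */
struct tally { int negative, large, even, odd; };

/* Before: four sequential if statements in one function.
 * Each if contributes NP(if-range) + NP(expr) + 1 = 1 + 0 + 1 = 2,
 * so NPATH = 2 * 2 * 2 * 2 = 16. */
void classify_before(int x, struct tally *t)
{
    if (x < 0)      t->negative++;
    if (x > 100)    t->large++;
    if (x % 2 == 0) t->even++;
    if (x % 2 != 0) t->odd++;
}

/* After: each block of related code becomes its own function.
 * Each helper holds two if statements, so NPATH = 2 * 2 = 4. */
static void classify_range(int x, struct tally *t)
{
    if (x < 0)   t->negative++;
    if (x > 100) t->large++;
}

static void classify_parity(int x, struct tally *t)
{
    if (x % 2 == 0) t->even++;
    if (x % 2 != 0) t->odd++;
}

/* The caller is now two function calls; each call counts as a
 * sequential statement, so NPATH = 1 * 1 = 1. */
void classify_after(int x, struct tally *t)
{
    classify_range(x, t);
    classify_parity(x, t);
}

/* Asserts that both versions update the counters identically
 * for every integer in [lo, hi]. */
void check_same_behavior(int lo, int hi)
{
    struct tally a = {0, 0, 0, 0}, b = {0, 0, 0, 0};
    int x;

    for (x = lo; x <= hi; x++) {
        classify_before(x, &a);
        classify_after(x, &b);
    }
    assert(a.negative == b.negative && a.large == b.large);
    assert(a.even == b.even && a.odd == b.odd);
}
```

Calling `check_same_behavior(-50, 200)` from a test driver exercises both versions over the same inputs. The refactoring does not remove paths from the program as a whole; it moves them into smaller functions (4, 4, and 1 instead of 16), each of which is easier to test on its own.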
Although strategies to reduce the NPATH complexity of functions are important, care must be taken not to distort the logical clarity of the software by applying a strategy to reduce the complexity of functions. Obviously, there is a point of diminishing return beyond which a further attempt at reduction of complexity distorts the logical clarity of the system structure.

FUTURE DIRECTIONS AND SUMMARY
NPATH counts the acyclic execution paths through a C function. The practical applications of NPATH and methods to reduce NPATH have been discussed. The success of NPATH suggests other possible applications.

First, monitoring the coverage of code during the testing process has proved to be an effective method of improving and verifying testing. Most coverage monitors report either (1) the percentage of NCSL executed during code execution, or (2) the percentage of branches executed during code execution. In practice, branch coverage is much more difficult to achieve than code coverage. Although both measures of code coverage are useful, systems that have close to 100 percent NCSL and branch coverage may not be adequately tested. The reason for this is that, although every line of code and branch point in a program might be executed, there could be a substantial number of execution paths in a program that have not been executed. Path coverage is much more difficult to achieve than branch coverage. Monitoring path coverage would more accurately reveal the completeness of software testing. Future work on developing an NPATH-based coverage monitor is an important next step of this research.

Another potentially useful application of NPATH is in the area of software reliability modeling. A vast majority of software reliability models require a priori estimates for the model parameters T0 and R. T0 is the MTTF (mean time to failure) at the start of testing. R is the rate at which MTTF is assumed to increase over time. To date, no acceptable means of estimating T0 and R prior to entering system test has been established. NPATH could be used to develop initial approximations for T0 and R. The basic approach would be to define T0 complexity classes based on NPATH. Table II illustrates this point. Note that, as NPATH increases, T0 would decrease. NPATH ranges and T0 values need to be found through empirical study.

TABLE II. T0 Complexity Classes Based on NPATH

    NPATH range    T0
    1-1000         100
    1001-2500      75
    2501-5000      60

The MTTF rate (R) changes over time. Therefore, reliability models need to take into account the dynamic nature of MTTF rates. NPATH and NPATH-coverage monitors could be used to estimate the way MTTF changes over time. For example,

• Let NPtotal denote the total NPATH for a system.
• Let NPexec denote the number of unique acyclic execution paths executed thus far.
• Let NPfail denote the total number of failures discovered thus far.
• Let NPest_fail denote the estimated number of failures remaining in the software.

The path failure rate (PFR) at time T is defined as

    PFR = NPfail / NPexec

then

    NPest_fail = PFR × (NPtotal - NPexec)

Both PFR and NPest_fail could be extremely valuable in predicting how MTTF changes over time. That is, as (NPtotal - NPexec) approaches 0, the software is more completely tested, and the reliability of the system should behave more consistently (i.e., increase or decrease in a near monotonic fashion).

There are also several useful extensions to the NPATH measure. The NPATH measure considers any call to a function as a sequential statement. Thus, function calls do not add complexity. An extension to the current model of software complexity would be to capture the complexity of subsystems and entire systems by accounting for the acyclic execution path complexity of the calling sequences within each function making up the system. Another extension to the model would be to apply the proposed notion of acyclic execution path complexity to hardware, and to define the notion of system complexity as a function of the number of hardware and software execution paths through a system.

Finally, there is little known about measuring the complexity of requirements and designs, in comparison to code. This is partially because programming languages provide a formal notation on which to base measures, whereas formal notations for expressing requirements and designs have only emerged recently. Formal notations such as Structure Charts, Booch Diagrams [1], and Data Flow Diagrams now provide computational models and notations on which to base measures of requirement and design complexity. Future software measurement research should focus on defining and analyzing measures of requirements and design complexity based on such notations.
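The path failure rate and the estimate of remaining failures defined in the reliability discussion can be sketched in a few lines of C. This is an illustration under assumptions, not an implementation from the paper: the function names are hypothetical, the formulas are taken as PFR = NPfail / NPexec and NPest_fail = PFR × (NPtotal - NPexec), and the T0 lookup simply encodes the illustrative ranges of Table II, which the paper says would still need to be established through empirical study.

```c
#include <assert.h>

/* PFR = NPfail / NPexec.
 * Assumption: PFR is only meaningful once at least one path has run. */
double path_failure_rate(long np_fail, long np_exec)
{
    assert(np_exec > 0);
    return (double)np_fail / (double)np_exec;
}

/* NPest_fail = PFR * (NPtotal - NPexec): the failure rate observed so
 * far, projected onto the acyclic paths that have not been executed. */
double estimated_remaining_failures(long np_total, long np_exec, long np_fail)
{
    assert(np_total >= np_exec);
    return path_failure_rate(np_fail, np_exec) * (double)(np_total - np_exec);
}

/* Initial MTTF (T0) by NPATH class, using the illustrative values of
 * Table II. Returns -1 for NPATH values outside the listed ranges. */
int initial_mttf_class(long npath)
{
    if (npath >= 1 && npath <= 1000)
        return 100;
    if (npath > 1000 && npath <= 2500)
        return 75;
    if (npath > 2500 && npath <= 5000)
        return 60;
    return -1;
}
```

For instance, with a hypothetical system of 1000 acyclic paths, 400 of them executed and 10 failures found, the sketch gives a PFR of 10/400 and projects that rate onto the 600 unexecuted paths. As NPexec approaches NPtotal, the projected number of remaining failures shrinks toward zero, which matches the paper's remark that the software is then more completely tested.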
APPENDIX A. NPATH Algorithm

• Next V is either (1) the first statement in the compound statement that follows V, or (2) LAST if it is the last statement in some compound statement.
• Bool-Comp of V is the complexity of the expressions in statement V.

    NPATH ( V )
    statement V;
    {
        if ( V is LAST )
            return ( 1 );
        else
            switch ( statement type of V ) {
            case IFST:    /* if statement */
                return ( (NPATH(if-range of V) + Bool-Comp of V + 1)
                         * NPATH(Next V) );
            case IFEST:   /* if-else statement */
                return ( (NPATH(if-range of V) + NPATH(else-range of V) + Bool-Comp of V)
                         * NPATH(Next V) );
            case WHST:    /* while statement */
                return ( (NPATH(while-range of V) + Bool-Comp of V + 1)
                         * NPATH(Next V) );
            case DOST:    /* do statement */
                return ( (NPATH(do-range of V) + Bool-Comp of V + 1)
                         * NPATH(Next V) );
            case FORST:   /* for statement */
                return ( (NPATH(for-range of V) + Bool-Comp of V + 1)
                         * NPATH(Next V) );
            case SWST:    /* switch statement */
                CompSW = Bool-Comp of V;
                for ( each case and default range in switch )
                    CompSW = CompSW + NPATH(case-range);
                return ( CompSW * NPATH(Next V) );
            case QUESST:  /* ? statement */
                return ( (Bool-Comp of V + 2) * NPATH(Next V) );
            case GOTO:    /* goto statement */
                /* skip to next statement */
                return ( NPATH(Next V) );
            case RET:     /* return or exit statement */
            case BRST:    /* break statement */
            case CONTST:  /* continue statement */
                if ( Bool-Comp > 0 )
                    return ( Bool-Comp );
                else
                    return ( 1 );
            case SEQ:     /* sequential statement */
                if ( Bool-Comp > 0 )
                    return ( Bool-Comp * NPATH(Next V) );
                else
                    return ( NPATH(Next V) );
            } /* end of switch statement */
    } /* end of NPATH function */
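The algorithm above can be exercised with a small runnable sketch. The `stmt` node layout, the `kind` names, and the helper functions are assumptions made for this illustration; the paper does not define a concrete data structure, and only a subset of statement types is covered (while, do, and for are folded into one `LOOP` kind because they share the same form in the algorithm). The two example builders encode the if sequence and the switch statement from the Multiple if Statements discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Statement kinds handled by this sketch (a subset of Appendix A). */
enum kind { SEQ, IF, IF_ELSE, LOOP, SWITCH, JUMP };

/* Hypothetical statement node; the paper does not define one. */
struct stmt {
    enum kind kind;
    int bool_comp;                     /* count of && and || operators */
    const struct stmt *body;           /* if-range or loop range */
    const struct stmt *other;          /* else-range (IF_ELSE only) */
    const struct stmt *const *ranges;  /* case and default ranges (SWITCH) */
    size_t n_ranges;
    const struct stmt *next;           /* Next V; NULL stands for LAST */
};

long npath(const struct stmt *v)
{
    long comp;
    size_t i;

    if (v == NULL)                     /* V is LAST */
        return 1;
    switch (v->kind) {
    case IF:
    case LOOP:                         /* while, do, and for share one form */
        return (npath(v->body) + v->bool_comp + 1) * npath(v->next);
    case IF_ELSE:
        return (npath(v->body) + npath(v->other) + v->bool_comp) * npath(v->next);
    case SWITCH:
        comp = v->bool_comp;
        for (i = 0; i < v->n_ranges; i++)
            comp += npath(v->ranges[i]);
        return comp * npath(v->next);
    case JUMP:                         /* return, break, continue */
        return v->bool_comp > 0 ? v->bool_comp : 1;
    default:                           /* SEQ: sequential statement */
        return (v->bool_comp > 0 ? v->bool_comp : 1) * npath(v->next);
    }
}

/* Four simple if statements followed by one whose condition
 * contains three && operators. */
long npath_if_chain(void)
{
    static const struct stmt inc = { .kind = SEQ };
    static const struct stmt if5 = { .kind = IF, .bool_comp = 3, .body = &inc };
    static const struct stmt if4 = { .kind = IF, .body = &inc, .next = &if5 };
    static const struct stmt if3 = { .kind = IF, .body = &inc, .next = &if4 };
    static const struct stmt if2 = { .kind = IF, .body = &inc, .next = &if3 };
    static const struct stmt if1 = { .kind = IF, .body = &inc, .next = &if2 };
    return npath(&if1);
}

/* A switch with four case ranges and a default range, each of the
 * form "counter++; break;". */
long npath_switch(void)
{
    static const struct stmt brk = { .kind = JUMP };
    static const struct stmt inc = { .kind = SEQ, .next = &brk };
    static const struct stmt *const ranges[] = { &inc, &inc, &inc, &inc, &inc };
    static const struct stmt sw = { .kind = SWITCH, .ranges = ranges, .n_ranges = 5 };
    return npath(&sw);
}

/* A single if statement whose condition has the given operator count. */
long npath_single_if(int operators)
{
    static const struct stmt body = { .kind = SEQ };
    struct stmt s = { .kind = IF, .bool_comp = operators, .body = &body };
    return npath(&s);
}

/* Asserts the two NPATH values stated in the paper's example. */
void check_paper_examples(void)
{
    assert(npath_if_chain() == 80);
    assert(npath_switch() == 5);
}
```

Under these rules `npath_single_if(5)` models an if statement whose condition holds five && and || operators, as in the v_check expression, and `npath_single_if(0)` models the same statement after the expression has been moved into a separate function. The recursion follows the Next V chain, so the value for a sequence is the product of the values of its statements, which is how the 2 × 2 × 2 × 2 × (2 + 3) figure arises.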
APPENDIX B. A Summary of Execution Path Expressions

    Structure        Complexity expression
    if               NP(<if-range>) + NP(<expr>) + 1
    if-else          NP(<if-range>) + NP(<else-range>) + NP(<expr>)
    while            NP(<while-range>) + NP(<expr>) + 1
    do while         NP(<do-range>) + NP(<expr>) + 1
    for              NP(<for-range>) + NP(<expr1>) + NP(<expr2>) + NP(<expr3>) + 1
    switch           NP(<expr>) + Σ(i=1..n) NP(<case-range_i>) + NP(<default-range>)
    ?                NP(<expr1>) + NP(<expr2>) + NP(<expr3>) + 2
    goto label       1
    break            1
    Expressions      Number of && and || operators in expression
    continue         1
    return           1
    sequential       1
    Function call    1
    C function       Π(i=1..n) NP(<statement_i>)

Acknowledgments. The author is grateful for the many stimulating conversations he had with Christopher Fox of the Quality Software Technology Group at AT&T Bell Laboratories about NPATH.

REFERENCES
1. Booch, G. Software Engineering with Ada. Benjamin/Cummings, Menlo Park, Calif.
2. Curtis, B., et al. Measuring the psychological complexity of software maintenance tasks with the Halstead and McCabe metrics. IEEE Trans. Softw. Eng. SE-5 (Mar. 1979), 96-104.
3. Dijkstra, E.W. GO TO statement considered harmful. Commun. ACM 11, 3 (Mar. 1968), 147-148.
4. Dunn, R. Software Defect Removal. McGraw-Hill, New York, 1984.
5. Evangelist, W.M. An analysis of control flow complexity. In Proceedings of IEEE COMPSAC '84, 1984, pp. 388-396.
6. Halstead, M.H. Elements of Software Science. Elsevier North-Holland, New York, 1977.
7. Johnson, S.C. YACC: Yet Another Compiler Compiler. Bell Laboratories, Murray Hill, N.J., 1975.
8. Kearney, J.K., Sedlmeyer, R.L., Thompson, W.B., Gray, M.A., and Adler, M.A. Software complexity measurement. Commun. ACM 29, 11 (Nov. 1986), 1044-1050.
9. Kernighan, B.W., and Ritchie, D.M. The C Programming Language. Prentice-Hall, Englewood Cliffs, N.J., 1978.
10. Lesk, M.E., and Schmidt, E. LEX: A Lexical Analysis Generator. Bell Laboratories, Murray Hill, N.J., 1975.
11. McCabe, T.J. A complexity measure. IEEE Trans. Softw. Eng. SE-2, 4 (1976), 308-320.
12. Musa, J.D. Software reliability modeling. In Handbook of Software Engineering, C.R. Vick and C.V. Ramamoorthy, Eds. Van Nostrand Reinhold, New York, 1984.
13. Nejmeh, B.A., and Dunsmore, H.E. A survey of program design languages (PDLs). In Proceedings of IEEE COMPSAC '86, 1986.
14. Nejmeh, B.A. Software complexity metrics study summary. Tech Memo, AT&T Bell Laboratories, Holmdel, N.J.
15. Paige, M. An analytical approach to program testing. In Proceedings of IEEE COMPSAC '80, 1980.
16. Rapps, S., and Weyuker, E.J. Selecting software test data using data flow information. IEEE Trans. Softw. Eng. SE-11, 4 (Apr. 1985), 367-375.
17. Stevens, W.P. Using Structured Design. Wiley-Interscience, New York, 1981.
18. Ward, P.T., and Mellor, S.J. Structured Development for Real-Time Systems. Yourdon Press, New York.

CR Categories and Subject Descriptors: D.2.2 [Software Engineering]: Tools and Techniques-modules and interfaces; D.2.4 [Software Engineering]: Program Verification-reliability; D.2.5 [Software Engineering]: Testing and Debugging-monitors; D.2.7 [Software Engineering]: Distribution and Maintenance-restructuring; D.2.8 [Software Engineering]: Metrics-complexity measures; D.2.9 [Software Engineering]: Management-software quality assurance (SQA); F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs-control primitives
General Terms: Algorithms, Design, Management, Measurement, Reliability
Additional Key Words and Phrases: Execution path complexity, NPATH, software testing

Author's Present Address: Brian A. Nejmeh, SPC, 1880 North Campus Commons Drive, Reston, VA 22091.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.