BACHELOR OF TECHNOLOGY
In
Computer Science & Engineering
By
Name of Guide
Mr. Prashant Kumar Mishra
This is to certify that Shristy Sharma, Swati Nigam and Varsha Mishra have
carried out the B. Tech. Final Year Project work presented in this report entitled
“Software Metrics Calculator Tool” for the award of Bachelor of Technology
from Dr. A.P.J. Abdul Kalam Technical University, Lucknow under my
supervision. The project embodies the results of original work and studies carried out
by the students themselves, and the contents of the project do not form the basis for the
award of any other degree to the candidates or to anybody else.
Date:
DECLARATION
This is to certify that the report entitled “Software Metrics Calculator Tool”, which
is submitted by Shristy Sharma, Swati Nigam and Varsha Mishra in partial
fulfillment of the award of Bachelor of Technology in Computer Science to the
Department of Computer Science and Engineering, KRISHNA INSTITUTE OF
TECHNOLOGY, KANPUR, comprises our original work, and due
acknowledgement has been made in the text to all other materials used.
Last but not least, we are highly indebted to our teachers, family and
friends, whose wishes strengthened us to survive the stormy winds.
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
1. INTRODUCTION
1.1. OBJECTIVES
1.2. MOTIVATION AND THE RELEVANCE OF THE WORK
2. LITERATURE SURVEY
2.1. SOFTWARE METRICS
2.1.1. The scope of Software Metrics
2.1.2. Different Types of Software Metrics
2.2. PRODUCT METRIC
2.2.1. Source Lines Of Code
2.2.2. Cyclomatic Complexity
2.2.3. Halstead Software Science
2.3. CYCLOMATIC COMPLEXITY
2.3.1. Advantages of Cyclomatic Complexity
2.4. HALSTEAD'S SOFTWARE SCIENCE
2.4.1. Basic Entities: Operators and Operands
2.4.2. The Theory of Software Science
3. METHODOLOGY
3.1. INTRODUCTION
3.2. ADJOINED METRICS
4. IMPLEMENTATION
5. RESULTS AND VALIDATIONS
6. CONCLUSIONS AND FUTURE WORK
7. APPENDICES
APPENDIX I: SNAPSHOTS OF THE TOOL
APPENDIX II: CODE SNIPPETS
8. REFERENCES
LIST OF TABLES
LIST OF FIGURES
CC Cyclomatic Complexity
CMMI Capability Maturity Model Integration
SEI Software Engineering Institute
ISO International Organization for Standardization
SLOC Source Lines of Code
AST Abstract Syntax Trees
CHAPTER 1
INTRODUCTION
1.1 OBJECTIVES
1. Cost and effort estimation- Various models work to predict the cost
and time to complete a Project
2. Productivity models and measures
3. Data collection- Consistent, meaningful data collection is difficult, and
made more difficult by trying to collect data across diverse projects.
4. Quality models and measures- Bad code is not worth much, even if lots
of it is written quickly.
5. Reliability modeling- Just how reliable is the software? When can we
expect the next failure?
6. Performance evaluation- How well does the system perform? Response
and completion time, transactions processed, etc.
2. Product Metric:
These metrics focus on measuring key characteristics of the software
product. There are many product metrics applicable to analysis, design,
coding, and testing. Commonly used product metrics include:
3. Project Metric:
Project metrics are tactical and related to project characteristics and
execution. They often contribute to the development of process metrics. The
indicators derived from project metrics are utilized by project managers and
software developers to adjust project workflow and technical activities. The
first application of project metrics often occurs during cost and effort
estimation. Metrics collected from past projects are used as a basis
from which effort and time estimates are made for new projects. During the
project, measured efforts and expended time are compared to original
estimates to help track how accurate the project estimates were. When the
technical work starts, other project metrics begin to have significance for
different measures, such as production rates in terms of models created,
review hours, function points, and delivered source code lines.
Some common examples of software metrics are:
a) Source lines of code.
b) Cyclomatic complexity, used to measure code complexity.
c) Function point analysis (FPA), used to measure the size
(functions) of software.
d) Bugs per line of code.
e) Halstead software science metrics.
f) Code coverage, which measures the code lines that are executed for a
given set of software tests.
g) Cohesion, which measures how well the source code in a given module
works together to provide a single function.
h) Coupling, which measures how strongly two software components are
related through data, i.e. how independent they are.
The above list is only a small set of software metrics. The important
points to note are:
a) They are all measurable, that is they can be quantified.
b) They are all related to one or more software quality characteristics.
Product metrics are metrics that can be calculated from the document
independent of how it was produced. Generally, these are concerned with the
structure of the source code. Product metrics could be defined for other documents.
For example, the number of paragraphs in a requirements specification would be a
product metric.
Source Lines of Code (SLOC) is perhaps the oldest of software metrics and
still a benchmark for evaluating new ones. There are many different ways to count
lines of source code. The definition may be as simple as the number of newline
characters in the file. Often comments are excluded from the count of lines.
Sometimes blank lines or lines with only delimiters are excluded.
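The counting variants described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name SlocCounter and both method names are hypothetical, only single-line `//` comments are recognised (block comments are not handled), and "lines with only delimiters" is taken to mean lines made up solely of braces, parentheses and semicolons.

```java
import java.util.List;

public class SlocCounter {
    /** Counts physical lines: every line of the input, blank or not. */
    public static int physicalLines(List<String> lines) {
        return lines.size();
    }

    /** Counts lines that are not blank, not a pure // comment, and not only delimiters. */
    public static int effectiveLines(List<String> lines) {
        int count = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;             // blank line
            if (line.startsWith("//")) continue;      // comment-only line (assumption: // style only)
            if (line.matches("[{}();]+")) continue;   // line holding only delimiters
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> src = List.of(
            "// add two numbers",
            "int add(int a, int b)",
            "{",
            "",
            "    return a + b;",
            "}");
        System.out.println(physicalLines(src));   // prints 6
        System.out.println(effectiveLines(src));  // prints 2
    }
}
```

The same six-line input yields two different counts, which is why a SLOC figure is only comparable across projects when the counting rule is stated.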
2.2.2. Cyclomatic Complexity
CC = e - n + 2p
Where,
e = Number of edges
n= Number of nodes
p = Number of connected components (which is normally 1)
Cyclomatic complexity measures the amount of decision logic in a single
software module. It gives the number of recommended tests for the software.
Cyclomatic complexity CC, for a flow graph, G, is also defined as
CC = P + 1
Where, P is number of predicate nodes contained in the flow graph G.
Example:
C code module with complexity six.
#include <stdio.h>

void complexity6( int i, int j ) {
    if ( i>0 && j>0 )
    {
        while ( i>j ) {
            if ( i % 2 && j % 2 )
                printf( "%d\n", i );
            else
                printf( "%d\n", j );
            i--;
        }
    }
}
Starting with 1, each of the two "if" statements adds 1, the "while" statement adds 1
and each of the two "&&" operators adds 1, for a total of six.
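The two definitions of cyclomatic complexity given above can be checked with a short Java sketch. The class name CyclomaticSketch, the method names and the set of tokens treated as decisions are assumptions made for illustration; the sketch counts tokens in raw text and does not skip comments or string literals, so it is not the rule set used by the tool itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CyclomaticSketch {
    // Tokens counted as decisions in this sketch (an assumption, not the tool's exact rules).
    private static final Pattern DECISION =
        Pattern.compile("\\b(if|while|for|case)\\b|&&|\\|\\||\\?");

    /** CC = e - n + 2p for a flow graph with e edges, n nodes and p connected components. */
    public static int fromGraph(int edges, int nodes, int components) {
        return edges - nodes + 2 * components;
    }

    /** CC = P + 1, where P is the number of decision tokens found in the source text. */
    public static int fromSource(String source) {
        Matcher m = DECISION.matcher(source);
        int predicates = 0;
        while (m.find()) {
            predicates++;
        }
        return predicates + 1;
    }

    public static void main(String[] args) {
        String complexity6 = String.join("\n",
            "void complexity6(int i, int j) {",
            "  if (i > 0 && j > 0) {",
            "    while (i > j) {",
            "      if (i % 2 && j % 2) printf(\"%d\\n\", i);",
            "      else printf(\"%d\\n\", j);",
            "      i--;",
            "    }",
            "  }",
            "}");
        System.out.println(fromSource(complexity6)); // prints 6
        // A single if/else has nodes {test, then, else, join} and 4 edges.
        System.out.println(fromGraph(4, 4, 1));      // prints 2
    }
}
```

For the module above, the token count finds two "if", one "while" and two "&&", giving P = 5 and CC = 6, which agrees with the manual count.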
Maurice Halstead was one of the first researchers in software metrics. He did
his work in the late 1960s and 1970s. His goal was to identify what contributed to
the complexity in software. He empirically looked for measures of intrinsic size.
After finding what he felt were good measures and prediction formulas, he tried to
develop a coherent theory.
The basic approach that gave Halstead good results was to consider any
program to be a collection of tokens, which he classified as either operators or
operands. Operands were tokens that had a value. Typically, variables and
constants were operands. Everything else was considered an operator. Thus,
commas, parentheses, arithmetic operators, brackets, and so forth were all
considered operators. All tokens that always appear as a pair, triple, and so on are
counted together as one token. For example, a left parenthesis and a right
parenthesis are counted as one occurrence of the parenthesis token. A
language that has an if-then construction is considered to have an if-then token.
No standard has been accepted for deciding ambiguous situations. The good
news is that as long as an organization is consistent, it doesn’t matter.
The author recommends a syntax-based approach where all operands are
user defined tokens and all operators are the tokens defined by the syntax of the
language.
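A minimal Java sketch of the syntax-based rule is given here: tokens defined by the language are counted as operators and every other token as an operand. The class TokenCounts, its fields and the small OPERATORS set are hypothetical; a real tool would take the full token set from the language grammar, and the input is assumed to be already tokenized with paired tokens such as "()" merged into one token.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenCounts {
    // Hypothetical subset of language-defined tokens treated as operators.
    private static final Set<String> OPERATORS = Set.of(
        "=", "+", "-", "*", "/", ";", ",", "()", "{}", "if", "while", "return");

    public final Set<String> distinctOperators = new HashSet<>(); // size gives n1
    public final Set<String> distinctOperands = new HashSet<>();  // size gives n2
    public int totalOperators = 0; // N1
    public int totalOperands = 0;  // N2

    /** Classifies each token: language-defined tokens are operators, all others operands. */
    public static TokenCounts count(List<String> tokens) {
        TokenCounts c = new TokenCounts();
        for (String t : tokens) {
            if (OPERATORS.contains(t)) {
                c.totalOperators++;
                c.distinctOperators.add(t);
            } else {
                c.totalOperands++;
                c.distinctOperands.add(t);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // Tokens of the two statements: x = x + 1 ; y = x ;
        TokenCounts c = count(List.of("x", "=", "x", "+", "1", ";", "y", "=", "x", ";"));
        System.out.println("n1=" + c.distinctOperators.size()
            + " n2=" + c.distinctOperands.size()
            + " N1=" + c.totalOperators
            + " N2=" + c.totalOperands);
        // prints n1=3 n2=3 N1=5 N2=5
    }
}
```

Whatever classification rule an organization picks, applying the same rule to every program is what keeps the resulting counts comparable.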
IDENTIFIER All identifiers that are not reserved words
TYPENAME (type specifiers) Reserved words that specify type: int, float,
char, double, long, short, signed, unsigned, void. This class
also includes some compiler specific nonstandard keywords.
CONSTANT Character, numeric or string constants.
Halstead's theory of software science is one of "the best known and most
thoroughly studied composite measures of (software) complexity". Software
science proposed the first analytical "laws" for computer software.
a) Vocabulary of a program ( n )
The size of the vocabulary of a program, which consists of the number of
unique tokens used to build a program is defined as:
n = n1 + n2
Here -
n = vocabulary of a program
n1 = number of unique operators
n2 = number of unique operands
b) Potential Operands, ( n2 *)
Halstead wanted to consider and compare different implementations
of algorithms. He developed the concept of potential operands that
represents the minimal set of values needed for any implementation of the
given algorithm. This is usually calculated by counting all the values that are
not initially set within the algorithm. It includes values read in, parameters
passed in, and global values accessed within the algorithm.
c) Length of a program ( N )
The length of the program in terms of the total number of
tokens used is:
N = N1 + N2
Here -
N = length of the program
N1 = total number of operator occurrences
N2 = total number of operand occurrences
d) Estimated Length ( Ne )
Ne = n1 * log2(n1) + n2 * log2(n2)
e) Volume (V )
Halstead thought of volume as a 3D measure, when it is really related
to the number of bits it would take to encode the program being measured.
The unit of measurement of volume is the common unit for
size “bits”. It is the actual size of a program if a uniform binary encoding
for the vocabulary is used.
V = N * log2n
f) Potential Volume (V* )
The potential volume is the minimal size of a solution to the problem,
solved in any language. Halstead assumes that in the minimal
implementation, there would only be two operators: the name of the
function and a grouping operator. The minimal number of operands is n2*.
g) Program Level (L )
Since we have the actual volume and the minimal volume, it is natural
to take a ratio. Halstead divides the potential volume by the actual. This
relates to how close the current implementation is to the minimal
implementation as measured by the potential volume. The implementation
level is unit less.
L = V* / V
The value of L ranges between zero and one, with L=1 representing
a program whose volume equals its potential volume, i.e. a minimal implementation.
h) Program Difficulty (D )
As the volume of an implementation of a program increases, the
program level decreases and the difficulty increases.
D=1/L
i) Effort (E )
Halstead wanted to estimate how much time (effort) was
needed to implement an algorithm, so he used a notion of elementary
mental discriminations (emd).
E = V/L = D*V
The units are elementary mental discriminations (emd). Halstead’s
effort is not monotonic—in other words, there are programs such that if you
add statements, the calculated effort decreases.
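The measures defined in this section can be worked through numerically with the following Java sketch. The class name HalsteadSketch and the sample counts are hypothetical. The potential volume is computed with the standard Halstead formula V* = (2 + n2*) * log2(2 + n2*), which is an assumption here because the text above describes V* without stating a formula.

```java
import java.util.Locale;

public class HalsteadSketch {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    public static int vocabulary(int n1, int n2) { return n1 + n2; }          // n = n1 + n2
    public static int length(int bigN1, int bigN2) { return bigN1 + bigN2; }  // N = N1 + N2

    public static double estimatedLength(int n1, int n2) {                    // Ne
        return n1 * log2(n1) + n2 * log2(n2);
    }

    public static double volume(int bigN, int n) { return bigN * log2(n); }   // V = N * log2(n)

    // Assumed standard form: two operators plus n2* potential operands.
    public static double potentialVolume(int n2Star) {
        return (2 + n2Star) * log2(2 + n2Star);
    }

    public static double level(double vStar, double v) { return vStar / v; }  // L = V* / V
    public static double difficulty(double l) { return 1.0 / l; }             // D = 1 / L
    public static double effort(double v, double l) { return v / l; }         // E = V / L

    public static void main(String[] args) {
        // Hypothetical counts: n1 = 4, n2 = 4, N1 = 8, N2 = 8, n2* = 2.
        int n = vocabulary(4, 4);
        int bigN = length(8, 8);
        double v = volume(bigN, n);
        double vStar = potentialVolume(2);
        double l = level(vStar, v);
        System.out.println(String.format(Locale.ROOT,
            "n=%d N=%d Ne=%.2f V=%.2f V*=%.2f L=%.4f D=%.2f E=%.2f",
            n, bigN, estimatedLength(4, 4), v, vStar, l, difficulty(l), effort(v, l)));
        // prints n=8 N=16 Ne=16.00 V=48.00 V*=8.00 L=0.1667 D=6.00 E=288.00
    }
}
```

With these counts the program is six times larger than its potential volume, so D = 6 and E = D * V = 288 emd.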
CHAPTER 3
METHODOLOGY
3.1 INTRODUCTION
This project work has been done in two phases -
The objective is to assign adjoins so that the length, volume, and effort
measures can detect complexity produced by non-sequential control structures.
In our development of these adjoined measures, we have limited ourselves to
programs which have the following properties:
1) each module has a single entry and a single exit;
2) each program control graph (with the extra added arc) is strongly
connected;
3) the only control structures used are SEQUENCE, IF THEN ELSE, DO WHILE,
and DO UNTIL;
To help us present clearly the formulas for our Adjoined measure, we have
introduced some notation. Given a module, we define T to be the set of all
operators in the module.
Each operator must appear in T as many times as it occurs in the module. Since sets
by definition do not allow repetition, each occurrence of an operator must be altered
so that it is distinct before it can be put into T.
Using a lower case ‘a’ to stand for an adjoined measure, we have the
following formulas:
In general, the values of the adjoined measures equal the values of the
corresponding unadjoined measures if and only if the cyclomatic number of each
program module is 1.
CHAPTER 4
IMPLEMENTATION
Ten pairs of sample C programs have been taken (some of the programs
are program segments and not complete programs). In each pair, the cyclomatic
complexity of the second program is lower than that of the first program.
For each pair of sample C programs, the values of the following eight measures
are evaluated -
a) Halstead program length (N)
b) Halstead vocabulary size (n)
c) Halstead effort (E)
d) Cyclomatic number (CC)
e) Adjoined length (Na)
f) Adjoined effort (Ea)
g) Constant level adjoined length (N’a)
h) Constant level adjoined effort (E’a).
It can easily be seen that, for each pair of sample programs, the Halstead
software science metrics are unable to detect the corresponding difference in
cyclomatic complexity, whereas our adjoined metrics detect the difference.
CHAPTER 5
RESULTS AND VALIDATIONS
S. No.   Program   CC   N   n   E   Na   Ea
Future work should include extending our adjoined metric concept to address
some existing limitations of this project work. The adjoined metric should be
enhanced by determining how adjoins can be added for CASE statements, GOTO
statements and multiple-test predicates, and the adjoined measures should be tested on
larger programs.
APPENDICES
import java.util.*;
}
else
{
// if there was no arguments .. print error message
System.out.println();
System.out.println("ERROR: Argument missing.");
System.out.println("Program requires path to input C file as an argument.");
System.out.println("example : java PRKM \"d:\\sample.c\"");
}
}
catch(Exception e)
{
// In case there is any exception print it
System.out.println(e.toString());
}
}
/* ParsingSymbol.java
* these are the various symbol which can be present in C code file
*/
public enum ParsingSymbol {
NONE,
MAIN,
LP,
RP,
BEGIN,
END,
SEMICOLON,
COLON,
PLUS,
MUL,
MINUS,
INCREMENT,
DECREMENT,
INT,
WHILE,
FOR,
IF,
IDENTIFIER,
COMMA,
CONDITIONALOP,
RELOP,
LOP,
BREAK,
ELSE,
EQUAL,
DIV,
CHAR,
VOID,
FLOAT,
DOUBLE,
SWITCH,
CASE,
MODIFIER,
ARLP,
ARRP,
EXTERN,
QUALIFIER,
ENUM,
CONTINUE,
DEFAULT,
LITERAL,
STRUCT,
RETURN,
REMAINDER
};
/* SymbolInfo.java
* These are like columns in Symbol table or say a row in symbol table
* Lexeme, token, SymbolType, Symbol
*/
public class SymbolInfo {
public String lexeme;
public String token;
public SymbolType symType;
public ParsingSymbol symbol;
}
/* SymbolType.java
* This is an enum to categorize the symbol.
* basically to explain what type of symbol this is
*/
public enum SymbolType {
NONE,
CONDITION,
ASSIGNMENT,
OPERATOR,
SCOPE,
ENDOFSTATEMENT,
SEPARATOR,
MAINMETHOD,
FLOW,
DATATYPE,
QUANTIFIER,
LOOP,
ARRAYRP,
ARRAYLP,
IDENTIFIER,
LITERAL,
METHODRP,
METHODLP,
RETURN
}
// Cyclomatic.java
import java.util.*;
/*
* this is the class to calculate cyclomatic complexity
*/
public class Cyclomatic implements IComplexity{
// contain decision symbols for result printout
private Map<String, Integer> decisionSymbols = new HashMap<String,
Integer>();
// number of decision
long numberofDecisions;
// symbol table
private ArrayList<SymbolInfo> symTable;
// constructor
public Cyclomatic(ArrayList<SymbolInfo> symbolTable)
{
this.symTable = symbolTable;
this.numberofDecisions = 0;
this.decisionSymbols = new HashMap<String, Integer>();
}
// this method does the calculation
public void Calculate() {
int count = symTable.size();
SymbolInfo symInfo;
SymbolInfo previousSymInfo = null;
// read symbols one by one from Symbol table
for(int i=0;i<count;i++)
{
symInfo = (SymbolInfo)symTable.get(i);
// if the symbol type satisfies any of the following conditions, there is a decision
if(symInfo.symType == SymbolType.LOOP || (symInfo.symType
== SymbolType.FLOW && symInfo.symbol == ParsingSymbol.CASE &&
previousSymInfo != null &&
previousSymInfo.symbol != ParsingSymbol.COLON ) ||
(symInfo.symType == SymbolType.FLOW &&
symInfo.symbol != ParsingSymbol.CASE && symInfo.symbol !=
ParsingSymbol.ELSE
&& symInfo.symbol != ParsingSymbol.DEFAULT
&& symInfo.symbol != ParsingSymbol.BREAK &&
symInfo.symbol != ParsingSymbol.CONTINUE &&
symInfo.symbol != ParsingSymbol.SWITCH) ||
(symInfo.symType == SymbolType.CONDITION &&
symInfo.symbol == ParsingSymbol.CONDITIONALOP))
{
if(this.decisionSymbols.containsKey(symInfo.lexeme))
{
this.decisionSymbols.put(symInfo.lexeme,
this.decisionSymbols.get(symInfo.lexeme) + 1);
}
else
{
this.decisionSymbols.put(symInfo.lexeme, 1);
}
this.numberofDecisions++;
}
previousSymInfo = symInfo;
}
}
System.out.println("==================================================================");
}
}
// Halstead.java
import java.util.*;
/*
* this is the class to calculate halstead complexity.
*/
public class Halstead implements IComplexity{
// distinct operand
private ArrayList<String> distOperand;
// distinct operator
private ArrayList<String> distOperator;
scopeCounter++;
if(!isFunction)
{
AddtoOperand(symInfo.lexeme);
}
}
}
previousSymInfo = symInfo;
}
}
// function to process operand
private void AddtoOperand(String lexeme)
{
// increase N2
this.N2++;
// check if this is already in n2 list
if(!distinctOperandList.contains(lexeme))
{
// increase n2
this.n2++;
// add to n2 list
distOperand.add(lexeme);
distinctOperandList.add(lexeme);
}
}
// function to process operator
private void AddtoOperator(String lexeme)
{
// increase N1
this.N1++;
// check if this is already in n1 list
if(!distinctOperatorList.contains(lexeme))
{
// increase n1
this.n1++;
// add to n1 list
distOperator.add(lexeme);
distinctOperatorList.add(lexeme);
}
}
// check if we are currently processing a function call
private boolean CheckForFunction(int index)
{
SymbolInfo sym = (SymbolInfo)symTable.get(index-1);
boolean found = false;
if(sym.symType == SymbolType.METHODRP){
// now check whether this is actually a function call or a construct like in a for loop
for(int i=index-1 ;i>0 ; i--)
{
sym = (SymbolInfo)symTable.get(i);
if(sym.symType == SymbolType.ENDOFSTATEMENT ||
sym.symType == SymbolType.ASSIGNMENT)
{
sym = (SymbolInfo)symTable.get(i+1);
found = true;
break;
}
}
// this is a function call: add the function name to the operator list
if(found && sym.symType == SymbolType.IDENTIFIER)
{
AddtoOperator(sym.lexeme);
}
}
return found;
}
System.out.println("==================================================================");
}
}
// AdjoinedHalstead.java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
/*
* this is the class to calculate adjoined halstead complexity.
*/
public class AdjoinedHalstead implements IComplexity{
// distinct operand
private ArrayList<String> distOperand;
// distinct operator
private ArrayList<String> distOperator;
// will keep adjoin constant if a value is assigned
private Integer constantAdjoin = 0;
// n1 , n2, N1 and N2 as per definition of adjoined Halstead
private long n1 = 0, n2 = 0, N1 = 0, N2 = 0;
// contain decision symbols for result printout
private Map<String, String> adjoinOperands;
// contain decision symbols for result printout
private Map<String, String> adjoinOperators;
// symbol table
private ArrayList<SymbolInfo> symTable;
// distinct operand
private ArrayList<String> distinctOperandList;
// distinct operator
private ArrayList<String> distinctOperatorList;
// constructor
public AdjoinedHalstead(ArrayList<SymbolInfo> symbolTable)
{
this.symTable = symbolTable;
this.distinctOperatorList = new ArrayList<String>();
this.distinctOperandList = new ArrayList<String>();
this.distOperand = new ArrayList<String>();
this.distOperator = new ArrayList<String>();
this.adjoinOperands = new HashMap<String, String>();
this.adjoinOperators = new HashMap<String, String>();
}
scopeCounter++;
// check for start of function
if(!isFloworLoop && previousSymInfo!=null &&
previousSymInfo.symType == SymbolType.METHODRP)
{
funcCount++;
// this is to be reset for each module
adjoin = 0;
nesting = 0;
adjoinControlStructure = false;
distinctOperandList.clear();
//distinctOperatorList.clear();
}
}
else if(symInfo.symbol == ParsingSymbol.END)
{
scopeCounter--;
if(scopeCounter == 0)
{
isFloworLoop = false;
}
}
continue;
}
// if code block is inside a valid scope
if(scopeCounter>0 && funcCount>0)
{
if(symInfo.symType == SymbolType.FLOW ||
symInfo.symType == SymbolType.LOOP)
{
isFloworLoop = true;
}
// if the adjoin control structure flag is ON and the symbol is a closing
// parenthesis, which marks the end of the structure,
// we should reset the adjoin and increase overall nesting by one
if(adjoinControlStructure && symInfo.symType ==
SymbolType.METHODRP)
{
adjoinControlStructure = false;
adjoin = 0;
nesting++;
}
// check and set new adjoin properly
// as per definition this is applicable only to if, case and loop
// else, default and switch should not be counted as they do not generate a new path
if((symInfo.symType == SymbolType.FLOW &&
symInfo.symbol != ParsingSymbol.SWITCH && symInfo.symbol !=
ParsingSymbol.ELSE &&
symInfo.symbol != ParsingSymbol.DEFAULT)
|| symInfo.symType == SymbolType.LOOP)
{
// new adjoin will be one more than current nesting
adjoin = nesting + 1;
if(this.constantAdjoin > 0)
{
adjoin = this.constantAdjoin;
}
adjoinControlStructure = true;
}
// if this is an operator
if(IsOperator(symInfo.symType))
{
boolean isFunction = false;
if(symInfo.symType ==
SymbolType.ENDOFSTATEMENT)
{
isFunction = CheckForFunction(i, adjoin);
}
// else, default and switch should not be counted as they do not generate a new path
if(!isFunction && symInfo.symbol !=
ParsingSymbol.ELSE && symInfo.symbol != ParsingSymbol.SWITCH &&
symInfo.symbol !=
ParsingSymbol.DEFAULT)
{
AddtoOperator(symInfo.lexeme, adjoin);
}
}
// if this is an operand
else if(IsOperand(symInfo.symType))
{
boolean isFunction = false;
if(symInfo.symType == SymbolType.IDENTIFIER)
{
SymbolInfo sym =
(SymbolInfo)symTable.get(i+1);
if(sym.symType == SymbolType.METHODLP)
{
isFunction = true;
}
}
// if this is not a function add to operand list along with its adjoin
if(!isFunction)
{
AddtoOperand(symInfo.lexeme, adjoin);
}
}
}
previousSymInfo = symInfo;
}
}
System.out.println("=================================================================");
}
}
REFERENCES
5. John Michura & Miriam A. M. Capretz (2005), “Metrics Suite for Class
Complexity”, International Conference on Information Technology: Coding and
Computing, 4-6 April 2005, Vol. 2, pp. 404-409.