Software Metrics
Within the software development process, there are many metrics that are all connected. Software metrics are related to the four functions of management: planning, organization, control, and improvement.
1. Product Metrics: These are the measures of various characteristics of the software
product. The two important software characteristics are its size and complexity, and its
quality and reliability.
2. Process Metrics: These are the measures of various characteristics of the software
development process. For example, the efficiency of fault detection. They are used
to measure the characteristics of methods, techniques, and tools that are used for
developing software.
Types of Metrics
Internal metrics: Internal metrics are the metrics used for measuring properties that are
viewed to be of greater importance to a software developer. For example, Lines of Code
(LOC) measure.
External metrics: External metrics are the metrics used for measuring properties that
are viewed to be of greater importance to the user, e.g., portability, reliability,
functionality, usability, etc.
Hybrid metrics: Hybrid metrics are the metrics that combine product, process, and
resource metrics. For example, cost per FP where FP stands for Function Point Metric.
Project metrics: Project metrics are the metrics used by the project manager to check
the project's progress. Data from the past projects are used to collect various metrics,
like time and cost; these estimates are used as a base of new software. Note that as the
project proceeds, the project manager will check its progress from time-to-time and will
compare the effort, cost, and time with the original effort, cost and time. Also
understand that these metrics are used to decrease the development costs, time efforts
and risks. The project quality can also be improved. As quality improves, the number of
errors and time, as well as cost required, is also reduced.
Based on the LOC/KLOC count of software, many other metrics can be computed:
a. Errors/KLOC.
b. $/ KLOC.
c. Defects/KLOC.
d. Pages of documentation/KLOC.
e. Errors/PM.
f. Productivity = KLOC/PM (effort is measured in person-months).
g. $/ Page of documentation.
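For instance (figures assumed for illustration): a 20 KLOC product built with 10 person-months of effort gives Productivity = 20/10 = 2 KLOC/PM, and 40 defects found would give a defect density of 40/20 = 2 Defects/KLOC.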
Advantages of LOC
1. Simple to measure
Disadvantages of LOC
1. It is defined on the code. For example, it cannot measure the size of the
specification.
2. It characterizes only one specific view of size, namely length; it takes no account
of functionality or complexity.
3. Bad software design may cause an excessive number of lines of code.
4. It is language dependent.
5. Users cannot easily understand it.
Objectives of FPA
The basic and primary purpose of function point analysis is to measure and provide the
functional size of the software application to the client, customer, and stakeholder on
request. Further, it is used to measure software project development and maintenance
consistently throughout the project, irrespective of the tools and technologies used.
1. The FP count of an application is found by counting the number and types of functions
used in the application. The various functions used in an application can be put into five
types, as shown in the table:
Types of FP Attributes
Measurement Parameters: Number of External Inputs (EI), Number of External Outputs (EO), Number of External Inquiries (EQ), Number of Internal Logical Files (ILF), and Number of External Interface Files (EIF).
2. FP characterizes the complexity of the software system and hence can be used to
estimate the project time and the manpower requirement.
3. The effort required to develop the project depends on what the software does.
5. The FP method is used for data processing systems and business systems like
information systems.
6. The five parameters mentioned above are also known as information domain
characteristics.
7. All the parameters mentioned above are assigned weights that have been
experimentally determined, as shown in the table (the standard FPA weights):
Measurement Parameter | Simple | Average | Complex
External Inputs (EI) | 3 | 4 | 6
External Outputs (EO) | 4 | 5 | 7
External Inquiries (EQ) | 3 | 4 | 6
Internal Logical Files (ILF) | 7 | 10 | 15
External Interface Files (EIF) | 5 | 7 | 10
The functional complexities are multiplied with the corresponding weights against each
function, and the values are added up to determine the UFP (Unadjusted Function Point)
of the subsystem.
Here that weighing factor will be simple, average, or complex for a measurement
parameter type.
The Function Point (FP) is thus calculated with the following formula:
FP = UFP * CAF
where CAF = 0.65 + 0.01 * ∑(fi), and ∑(fi) is the sum of all 14 questionnaires and gives the
complexity adjustment value/factor (CAF), where i ranges from 1 to 14. Usually, a student is
provided with the value of ∑(fi).
Based on the FP measure of software, many other metrics can be computed:
a. Errors/FP
b. $/FP.
c. Defects/FP
d. Pages of documentation/FP
e. Errors/PM.
f. Productivity = FP/PM (effort is measured in person-months).
g. $/Page of Documentation.
8. LOCs of an application can be estimated from FPs. That is, they are
interconvertible. This process is known as backfiring. For example, 1 FP is equal to
about 100 lines of COBOL code.
9. The FP metric is used mostly for measuring the size of Management Information System
(MIS) software.
10. But the function points obtained above are unadjusted function points (UFPs). These
(UFPs) of a subsystem are further adjusted by considering some more General System
Characteristics (GSCs). It is a set of 14 GSCs that need to be considered. The procedure
for adjusting UFPs is as follows:
Step 1: Rate each of the 14 GSCs on a scale of 0 to 5, giving its Degree of Influence (DI).
Step 2: Add the 14 ratings to obtain the Total Degree of Influence (TDI), which ranges from 0 to 70.
Step 3: Compute the Value Adjustment Factor: VAF = 0.65 + 0.01 * TDI.
Remember that the value of VAF lies within 0.65 to 1.35 because:
a. When TDI = 0, VAF = 0.65
b. When TDI = 70, VAF = 1.35
c. VAF is then multiplied with the UFP to get the final FP count: FP = VAF * UFP
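This adjustment is simple to mechanize. The following is a minimal C sketch (the function name compute_fp and the sample figures are ours, for illustration only):

#include <stdio.h>

/* Adjusted Function Point count from Unadjusted Function Points (UFP)
   and the Total Degree of Influence (TDI, 0..70). */
double compute_fp(double ufp, int tdi)
{
    double vaf = 0.65 + 0.01 * tdi;  /* VAF lies in [0.65, 1.35] */
    return vaf * ufp;                /* FP = VAF * UFP */
}

int main(void)
{
    /* Example: UFP = 100 and all 14 GSCs rated 3, so TDI = 42 */
    printf("FP = %.2f\n", compute_fp(100.0, 42));  /* prints FP = 107.00 */
    return 0;
}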
Token Count
In these metrics, a computer program is considered to be a collection of tokens, which
may be classified as either operators or operands. All software science metrics can be
defined in terms of these basic symbols, which are called tokens.
In terms of the total tokens used, the size of the program can be expressed as N = N1 + N2,
where N1 is the total number of occurrences of operators and N2 is the total number of
occurrences of operands.
Program Volume (V)
The unit of measurement of volume is the standard unit for size, "bits." It is the actual
size of a program if a uniform binary encoding for the vocabulary is used.
V = N * log2(n)
Program Level (L)
The value of L ranges between zero and one, with L = 1 representing a program written
at the highest possible level (i.e., with minimum size).
L = V* / V
Program Difficulty
The difficulty level or error-proneness (D) of the program is proportional to the number
of unique operators in the program.
D = (n1 / 2) * (N2 / n2)
Program Effort (E) is the product of difficulty and volume:
E = V / L = D * V
Estimated Program Length
According to Halstead, the first hypothesis of software science is that the length of a
well-structured program is a function only of the number of unique operators and
operands.
N = N1 + N2
N^ = n1 * log2(n1) + n2 * log2(n2)
The following alternate expressions have been published to estimate program length:
NJ = log2(n1!) + log2(n2!)
NB = n1 * log2(n2) + n2 * log2(n1)
NC = n1 * sqrt(n1) + n2 * sqrt(n2)
NS = (n * log2(n)) / 2
The potential minimum volume V* is defined as the volume of the shortest program
in which a problem can be coded:
V* = (2 + n2*) * log2(2 + n2*)
where n2* is the count of unique input and output parameters.
The size of the vocabulary of a program, which consists of the number of unique tokens
used to build a program, is defined as:
n=n1+n2
where
n=vocabulary of a program
n1=number of unique operators
n2=number of unique operands
Language Level - Shows the algorithm implementation program language level. The
same algorithm demands additional effort if it is written in a low-level program
language. For example, it is easier to program in Pascal than in Assembler.
λ = L * V* = L² * V (equivalently, λ = V / D², since L = 1 / D)
Language levels
Language | Language level λ | Variance
PASCAL | 2.54 | -
APL | 2.42 | -
C | 0.857 | 0.445
Example: Consider the sorting program shown below (the full listing appears later in this
section). List out the operators and operands and also calculate the values of software
science measures like n, N, V, E, λ, etc.
Operators Occurrences Operands Occurrences
int 4 sort 1
() 5 x 7
, 4 n 3
[] 7 i 8
if 2 j 7
< 2 save 3
; 11 im1 3
for 2 2 2
= 6 1 3
- 1 0 1
<= 2 - -
++ 2 - -
return 2 - -
{} 3 - -
Here n1 = 14, N1 = 53, n2 = 10, N2 = 38, so n = n1 + n2 = 24 and N = N1 + N2 = 91.
N^ = 14 * log2(14) + 10 * log2(10)
= 14 * 3.81 + 10 * 3.32
= 53.34 + 33.2 = 86.54
V = N * log2(n) = 91 * log2(24) ≈ 91 * 4.585 ≈ 417.23 bits
n2* = 3 {x: the array holding the integers to be sorted, used as both input and output;
the other two I/O parameters are presumably n and the returned status}
V* = (2 + n2*) * log2(2 + n2*) = 5 * log2(5) ≈ 11.61
Since L = V*/V = 11.61 / 417.23 ≈ 0.0278
E = V / L = 417.23 / 0.0278 ≈ 15,000 elementary mental discriminations
T = E / (f * S) = 15,000 / (60 * 18) ≈ 14 minutes
This is probably a reasonable time to produce the program, which is very simple.
Halstead’s Software metrics are a set of measures proposed by Maurice Halstead to
evaluate the complexity of a software program. These metrics are based on the number of
distinct operators and operands in the program and are used to estimate the effort required
to develop and maintain the program.
Field of Halstead Metrics
Program length (N): This is the total number of operator and operand occurrences in
the program.
Vocabulary size (n): This is the total number of distinct operators and operands in
the program.
Program volume (V): This is the product of program length (N) and the logarithm of
vocabulary size (n), i.e., V = N*log2(n).
Program level (L): This is the ratio of the potential minimum volume to the actual
program volume, i.e., L = V*/V; it indicates how close the implementation is to the
most succinct possible one.
Program difficulty (D): This is the reciprocal of the program level, estimated as
D = (n1/2) * (N2/n2), where n1 is the number of distinct operators, N2 is the total
number of operand occurrences, and n2 is the number of distinct operands.
Program effort (E): This is the product of program volume (V) and program
difficulty (D), i.e., E = V*D.
Time to implement (T): This is the estimated time required to implement the
program, based on the program effort (E) and a constant value that depends on the
programming language and development environment.
Halstead’s software metrics can be used to estimate the size, complexity, and effort
required to develop and maintain a software program. However, they have some
limitations, such as the assumption that all operators and operands are equally important,
and the assumption that the same set of metrics can be used for different programming
languages and development environments.
Overall, Halstead’s software metrics can be a useful tool for software developers and
project managers to estimate the effort required to develop and maintain software
programs.
n1 = Number of distinct operators.
n2 = Number of distinct operands.
N1 = Total number of occurrences of operators.
N2 = Total number of occurrences of operands.
Halstead Metrics
Halstead metrics are:
Halstead Program Length: The total number of operator occurrences and the total
number of operand occurrences.
N = N1 + N2
And the estimated program length is N^ = n1 * log2(n1) + n2 * log2(n2)
The following alternate expressions have been published to estimate program length:
NJ = log2(n1!) + log2(n2!)
NB = n1 * log2n2 + n2 * log2n1
NC = n1 * sqrt(n1) + n2 * sqrt(n2)
NS = (n * log2n) / 2
Halstead Vocabulary: The total number of unique operators and unique operands used
to build the program.
n = n1 + n2
Program Volume: Proportional to program size, represents the size, in bits, of space
necessary for storing the program. This parameter is dependent on specific algorithm
implementation. The properties V, N, and the number of lines in the code are shown
to be linearly connected and equally valid for measuring relative program size.
V = Size * (log2 vocabulary) = N * log2(n)
The unit of measurement of volume is the common unit for size, "bits". It is the actual
size of a program if a uniform binary encoding for the vocabulary is used. The estimated
number of delivered errors is: B = Volume / 3000
Potential Minimum Volume: The potential minimum volume V* is defined as the
volume of the most succinct program in which a problem can be coded.
V* = (2 + n2*) * log2(2 + n2*)
Here, n2* is the count of unique input and output parameters
Program Level: To rank the programming languages, the level of abstraction
provided by the programming language, Program Level (L) is considered. The higher
the level of a language, the less effort it takes to develop a program using that
language.
L = V* / V
The value of L ranges between zero and one, with L=1 representing a program written
at the highest possible level (i.e., with minimum size).
And the estimated program level is L^ = (2 * n2) / (n1 * N2)
Program Difficulty: This parameter shows how difficult the program is to handle.
D = (n1 / 2) * (N2 / n2)
D = 1 / L
As the volume of the implementation of a program increases, the program level
decreases and the difficulty increases. Thus, programming practices such as redundant
usage of operands, or the failure to use higher-level control constructs will tend to
increase the volume as well as the difficulty.
Programming Effort: Measures the amount of mental activity needed to translate the
existing algorithm into implementation in the specified program language.
E = V / L = D * V = Difficulty * Volume
Language Level: Shows the algorithm implementation program language level. The
same algorithm demands additional effort if it is written in a low-level program
language. For example, it is easier to program in Pascal than in Assembler.
λ = L * V* = L² * V (equivalently, λ = V / D², since L = 1 / D)
Intelligence Content: Determines the amount of intelligence presented (stated) in the
program. This parameter provides a measurement of program complexity,
independently of the programming language in which it was implemented.
I=V/D
Programming Time: Shows time (in minutes) needed to translate the existing
algorithm into implementation in the specified program language.
T = E / (f * S)
The concept of the processing rate of the human brain, developed by psychologist
John Stroud, is also used. Stroud defined a moment as the time required by the human
brain to carry out the most elementary decision. The Stroud number S is therefore the
number of Stroud moments per second, with 5 <= S <= 20. Halstead uses S = 18, a value
developed empirically from psychological reasoning, and its recommended value for
programming applications is 18.
Stroud number S = 18 moments / second
seconds-to-minutes factor f = 60
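Putting these formulas together, the following minimal C sketch computes the main Halstead measures from the four basic counts (the function name halstead_report is ours; the sample counts are taken from the sorting example below):

#include <stdio.h>
#include <math.h>

/* Main Halstead measures from the four basic counts:
   n1, n2 = distinct operators/operands; N1, N2 = total occurrences. */
void halstead_report(double n1, double n2, double N1, double N2)
{
    double n = n1 + n2;                           /* vocabulary       */
    double N = N1 + N2;                           /* program length   */
    double Nhat = n1 * log2(n1) + n2 * log2(n2);  /* estimated length */
    double V = N * log2(n);                       /* volume (bits)    */
    double D = (n1 / 2.0) * (N2 / n2);            /* difficulty       */
    double E = D * V;                             /* effort           */
    double T = E / (60.0 * 18.0);                 /* minutes, S = 18  */
    printf("n=%.0f N=%.0f N^=%.1f V=%.1f D=%.1f E=%.0f T=%.1f min\n",
           n, N, Nhat, V, D, E, T);
}

int main(void)
{
    halstead_report(14, 10, 53, 38);  /* counts from the sort example */
    return 0;
}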
Counting Rules for C Language
1. Comments are not considered.
2. The identifier and function declarations are not considered
3. All the variables and constants are considered operands.
4. Global variables used in different modules of the same program are counted as
multiple occurrences of the same variable.
5. Local variables with the same name in different functions are counted as unique
operands.
6. Function calls are considered operators.
7. All looping statements e.g., do {…} while ( ), while ( ) {…}, for ( ) {…}, all control
statements e.g., if ( ) {…}, if ( ) {…} else {…}, etc. are considered as operators.
8. In control construct switch ( ) {case:…}, switch as well as all the case statements are
considered as operators.
9. Reserved words like return, default, continue, break, sizeof, etc., are considered
operators.
10. All the brackets, commas, and terminators are considered operators.
11. GOTO is counted as an operator and the label is counted as an operand.
12. The unary and binary occurrences of “+” and “-” are dealt with separately. Similarly
“*” (multiplication operator) is dealt with separately.
13. In the array variables such as “array-name [index]” “array-name” and “index” are
considered as operands and [ ] is considered as operator.
14. In structure variables such as "struct-name.member-name" or "struct-name ->
member-name", struct-name and member-name are taken as operands, and '.', '->'
are taken as operators. The names of member elements in different structure
variables are counted as unique operands.
15. All the hash directives are ignored.
Example: List out the operators and operands and also calculate the values of software
science measures like n, N, V, E, λ, etc., for the following program:
int sort(int x[], int n)
{
    int i, j, save, im1;
    /* This function sorts array x in ascending order */
    if (n < 2) return 1;
    for (i = 2; i <= n; i++)
    {
        im1 = i - 1;
        for (j = 1; j <= im1; j++)
            if (x[i] < x[j])
            {
                save = x[i];
                x[i] = x[j];
                x[j] = save;
            }
    }
    return 0;
}
Explanation
Operators Occurrences Operands Occurrences
int 4 sort 1
() 5 x 7
, 4 n 3
[] 7 i 8
if 2 j 7
< 2 save 3
; 11 im1 3
for 2 2 2
= 6 1 3
- 1 0 1
<= 2 - -
++ 2 - -
return 2 - -
{} 3 - -
Therefore, n1 = 14, N1 = 53, n2 = 10, N2 = 38.
Example:
Program | Data Input | Internal Data | Data Output
Payroll | Name / Social Security No. / Pay rate / Number of hours worked | Withholding rates, Overtime Factors, Insurance Premium Rates | Gross Pay, Withholding, Net Pay, Pay Ledgers
Software Planner | Program Size / No. of developers on team | Model Parameter Constants, Coefficients | Est. project effort, Est. project duration
This is why an important set of metrics captures the amount of data input to, processed
in, and output from the software. A count of this data structure usage is called Data
Structure Metrics. Here the concentration is on variables (and given constants) within
each module, and input-output dependencies are ignored.
There are several data structure metrics used to compute the effort and time required to
complete the project. These metrics are:
1. The Amount of Data: To measure the amount of data, the following metrics are used:
o Number of variables (VARS): In this metric, the number of variables used in the
program is counted.
o Number of operands (η2): In this metric, the number of operands used in the program
is counted.
η2 = VARS + Constants + Labels
o Total number of occurrences of variables (N2): In this metric, the total number of
occurrences of the variables is computed.
2. The Usage of Data within a Module: To measure this metric, the average number
of live variables is computed. A variable is live from its first to its last reference within
the procedure.
For example, if we want to characterize the average number of live variables for a
program having m modules, we can use this equation:
LVavg = (Σ LVi) / m
where LVi is the average live variable metric computed from the ith module. A similar
equation computes the average span size (SP) for a program of n spans:
SPavg = (Σ SPi) / n
where SPi is the size of the ith span.
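As a small illustration (figures assumed): if a program's three modules have average live-variable counts of 4, 6, and 8, then LVavg = (4 + 6 + 8) / 3 = 6.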
3. Sharing of Data among Modules: As the data sharing between modules increases
(higher coupling), the parameter passing between modules also increases; as a result,
more effort and time are required to complete the project. So sharing of data among
modules is an important metric for calculating effort and time.
Information Flow Metrics
The other set of metrics we would like to consider is known as Information Flow
Metrics. The basis of information flow metrics is founded upon the following concept:
the simplest system consists of components, and it is the work that these components
do and how they are fitted together that identifies the complexity of the system. The
following are the working definitions that are used in information flow:
Coupling: The term used to describe the degree of linkage between one component and
others in the same system.
Information Flow metrics deal with this type of complexity by observing the flow of
information among system components or modules. These metrics were given by Henry
and Kafura, so they are also known as Henry and Kafura's metrics.
They are based on the measurement of the information flow among system modules and
are sensitive to the complexity due to interconnection among system components. In this
measure, the complexity of a software module is defined as the sum of the complexities
of the procedures included in the module. A procedure contributes complexity due to the
following two factors.
FAN-IN: FAN-IN of a procedure is the number of local flows into that procedure plus
the number of data structures from which this procedure retrieves information.
FAN-OUT: FAN-OUT of a procedure is the number of local flows from that procedure
plus the number of data structures which that procedure updates.
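Henry and Kafura combine these counts with the module's size; a commonly cited form of their measure is Complexity = Length * (FAN-IN * FAN-OUT)². A minimal C sketch (the function name hk_complexity and the sample figures are ours):

#include <stdio.h>

/* Henry-Kafura information flow complexity:
   complexity = length * (fan_in * fan_out)^2 */
long hk_complexity(long length, long fan_in, long fan_out)
{
    long flow = fan_in * fan_out;
    return length * flow * flow;
}

int main(void)
{
    /* Assumed example: a 100-line procedure with FAN-IN 3 and FAN-OUT 4 */
    printf("complexity = %ld\n", hk_complexity(100, 3, 4));  /* 14400 */
    return 0;
}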
Software project planning starts before technical work starts. The various steps of
planning activities include size estimation, cost estimation, development time estimation,
resource requirements, and project scheduling. The size is the crucial parameter for the
estimation of the other activities. Resource requirements are estimated based on cost and
development time. The project schedule may prove to be very useful for controlling and
monitoring the progress of the project; it depends on resources and development time.
Static, Single Variable Models: When a model makes use of a single variable to
calculate desired values such as cost, time, effort, etc., it is said to be a single-variable
model. The most common equation is:
C = a * L^b
Where C = cost,
L = size,
and a and b are constants.
The Software Engineering Laboratory established a model called SEL model, for
estimating its software production. This model is an example of the static, single variable
model.
E = 1.4 * L^0.93
DOC = 30.4 * L^0.90
D = 4.6 * L^0.26
where E is the effort in person-months, DOC is the documentation in pages, D is the
duration in months, and L is the size in KLOC.
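For example (size assumed for illustration), a 10 KLOC product gives E = 1.4 * 10^0.93 ≈ 11.9 PM, DOC = 30.4 * 10^0.90 ≈ 241 pages, and D = 4.6 * 10^0.26 ≈ 8.4 months.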
Static, Multivariable Models: These models are based on method (1); they depend on
several variables describing various aspects of the software development environment.
In some models, several variables are needed to describe the software development
process, and a selected equation combines these variables to give an estimate of time
and cost. These models are called multivariable models.
WALSTON and FELIX developed their models at IBM. The following equation gives the
relationship between lines of source code and effort:
E = 5.2 * L^0.91
and the duration of development is given by:
D = 4.1 * L^0.36
The productivity index uses 29 variables that were found to be highly correlated with
productivity, as follows:
I = ∑ Wi * Xi (summed over i = 1 to 29)
where Wi is the weight factor for the ith variable and Xi ∈ {-1, 0, +1}; the estimator
gives Xi one of the values -1, 0, or +1 depending on whether the variable decreases,
has no effect on, or increases the productivity.
Example: Compare the Walston-Felix Model with the SEL model on a software
development expected to involve 8 person-years of effort.
Solution: The effort is E = 8 person-years = 96 person-months (PM). Then:
(d) Average manning is the average number of persons required per month in the
project.
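One way to carry out the comparison under this assumption (values rounded):
(a) Size: the SEL model gives L = (96/1.4)^(1/0.93) ≈ 94 KLOC, while the Walston-Felix model gives L = (96/5.2)^(1/0.91) ≈ 25 KLOC.
(b) Duration: the SEL model gives D = 4.6 * (94)^0.26 ≈ 15 months, while the Walston-Felix model gives D = 4.1 * (25)^0.36 ≈ 13 months.
(c) Productivity: roughly 94,000/96 ≈ 980 LOC/PM (SEL) versus 25,000/96 ≈ 260 LOC/PM (Walston-Felix).
(d) Average manning: 96/15 ≈ 6.4 persons per month (SEL) versus 96/13 ≈ 7.4 persons per month (Walston-Felix).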
COCOMO Model
Boehm proposed COCOMO (Constructive Cost Estimation Model) in [Link] is
one of the most generally used software estimation models in the world. COCOMO
predicts the efforts and schedule of a software product based on the size of the
software.
The initial estimate (also called nominal estimate) is determined by an equation of the
form used in the static single variable models, using KDLOC as the measure of the size.
To determine the initial effort Ei in person-months, the equation used is of the type
shown below:
Ei = a * (KDLOC)^b
The values of the constants a and b depend on the project type. In COCOMO, projects
are categorized into three types:
1. Organic
2. Semidetached
3. Embedded
1. Organic: A development project can be treated as of the organic type if the project
deals with developing a well-understood application program, the size of the
development team is reasonably small, and the team members are experienced in
developing similar types of projects. Examples of this type of project are simple
business systems, simple inventory management systems, and data processing
systems.
2. Semidetached: A development project can be treated as of the semidetached type if
the development team consists of a mixture of experienced and inexperienced staff
and the system has some unfamiliar aspects.
3. Embedded: A development project is of the embedded type if the software being
developed is strongly coupled to complex hardware, or if stringent regulations on the
operational procedures exist, as in ATM systems.
For the three product categories, Boehm provides a different set of expressions to predict
the effort (in units of person-months) and the development time from the size estimate
given in KLOC (Kilo Lines of Code). The effort estimation takes into account the
productivity loss due to holidays, weekly offs, coffee breaks, etc.
According to Boehm, software cost estimation should be done through three stages:
1. Basic Model
2. Intermediate Model
3. Detailed Model
1. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of
the project parameters. The following expressions give the basic COCOMO estimation
model:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Where
KLOC is the estimated size of the software product expressed in Kilo Lines of Code,
a1, a2, b1, b2 are constants for each category of software product,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person-
months (PMs).
Estimation of development effort
For the three classes of software products, the formulas for estimating the effort based
on the code size are shown below:
Organic: Effort = 2.4(KLOC)^1.05 PM
Semidetached: Effort = 3.0(KLOC)^1.12 PM
Embedded: Effort = 3.6(KLOC)^1.20 PM
Estimation of development time
For the three classes of software products, the formulas for estimating the development
time based on the effort are given below:
Organic: Tdev = 2.5(Effort)^0.38 Months
Semidetached: Tdev = 2.5(Effort)^0.35 Months
Embedded: Tdev = 2.5(Effort)^0.32 Months
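These expressions are easy to mechanize; the following minimal C sketch applies them (the function name cocomo_basic is ours):

#include <stdio.h>
#include <math.h>

/* Basic COCOMO: effort (PM) and development time (months) from KLOC.
   mode: 0 = organic, 1 = semidetached, 2 = embedded. */
void cocomo_basic(int mode, double kloc)
{
    const double a[] = {2.4, 3.0, 3.6};
    const double b[] = {1.05, 1.12, 1.20};
    const double d[] = {0.38, 0.35, 0.32};

    double effort = a[mode] * pow(kloc, b[mode]);  /* person-months */
    double tdev   = 2.5 * pow(effort, d[mode]);    /* months        */
    printf("Effort = %.2f PM, Tdev = %.2f Months\n", effort, tdev);
}

int main(void)
{
    cocomo_basic(1, 400.0);  /* semidetached, 400 KLOC (see Example 1) */
    return 0;
}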
Some insight into the basic COCOMO model can be obtained by plotting the estimated
characteristics for different software sizes. Fig shows a plot of estimated effort versus
product size. From the figure, we can observe that the effort is somewhat superlinear in the size
of the software product. Thus, the effort required to develop a product increases very
rapidly with project size.
The development time versus the product size in KLOC is plotted in the figure, from which it can
be observed that the development time is a sublinear function of the size of the
product, i.e. when the size of the product increases by two times, the time to develop
the product does not double but rises moderately. This can be explained by the fact that
for larger products, a larger number of activities which can be carried out concurrently
can be identified. The parallel activities can be carried out simultaneously by the
engineers. This reduces the time to complete the project. Further, from fig, it can be
observed that the development time is roughly the same for all three categories of
products. For example, a 60 KLOC program can be developed in approximately 18
months, regardless of whether it is of organic, semidetached, or embedded type.
From the effort estimation, the project cost can be obtained by multiplying the required
effort by the manpower cost per month. But, implicit in this project cost computation is
the assumption that the entire project cost is incurred on account of the manpower cost
alone. In addition to manpower cost, a project would incur costs due to hardware and
software required for the project and the company overheads for administration, office
space, etc.
It is important to note that the effort and the duration estimations obtained using the
COCOMO model are called a nominal effort estimate and nominal duration estimate.
The term nominal implies that if anyone tries to complete the project in a time shorter
than the estimated duration, then the cost will increase drastically. But, if anyone
completes the project over a longer period of time than the estimated, then there is
almost no decrease in the estimated cost value.
Example 1: Suppose a project was estimated to be 400 KLOC. Calculate the effort and
development time for each of the three modes, i.e., organic, semidetached, and
embedded.
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Estimated Size of project= 400 KLOC
(i) Organic Mode
E = 2.4 * (400)^1.05 = 1295.31 PM
D = 2.5 * (1295.31)^0.38 = 38.07 Months
(ii) Semidetached Mode
E = 3.0 * (400)^1.12 = 2462.79 PM
D = 2.5 * (2462.79)^0.35 = 38.45 Months
(iii) Embedded Mode
E = 3.6 * (400)^1.20 = 4772.81 PM
D = 2.5 * (4772.81)^0.32 ≈ 37.6 Months
Example 2: A project of size 200 KLOC is to be developed; the development team has
average experience on similar types of projects, and the schedule is not tight. Calculate
the effort, the development time, and the productivity.
Solution: The semidetached mode is the most appropriate mode, keeping in view the
size, the schedule, and the experience of the development team.
Hence E = 3.0 * (200)^1.12 = 1133.12 PM
D = 2.5 * (1133.12)^0.35 = 29.3 Months
P = 176 LOC/PM
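The productivity figure follows directly: P = 200,000 LOC / 1133.12 PM ≈ 176 LOC/PM, and the average staff size would be E/D = 1133.12 / 29.3 ≈ 38.7 persons.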
2. Intermediate Model: The basic COCOMO model assumes that the effort is only a
function of the number of lines of code and some constants calculated according to the
various software systems. In reality, however, no system's effort and schedule can be
computed solely from lines of code. The intermediate COCOMO model recognizes this
fact and refines the initial estimate obtained through the basic COCOMO model by using
a set of 15 cost drivers based on various attributes of software engineering.
Product attributes -
o Required software reliability
o Size of the application database
o Complexity of the product
Hardware attributes -
o Run-time performance constraints
o Memory constraints
o Volatility of the virtual machine environment
o Required turnaround time
Personnel attributes -
o Analyst capability
o Software engineering capability
o Applications experience
o Virtual machine experience
o Programming language experience
Project attributes -
o Use of software tools
o Application of software engineering methods
o Required development schedule
The effort is determined as a function of the program size estimate, and a set of cost
drivers is applied according to each phase of the software lifecycle.
Software Design
Software design is a mechanism to transform user requirements into some suitable
form, which helps the programmer in software coding and implementation. It deals with
translating the client's requirements, as described in the SRS (Software Requirement
Specification) document, into a form that is easily implementable using a programming
language.
The software design phase is the first step in the SDLC (Software Development Life Cycle)
that moves the concentration from the problem domain to the solution domain. In software
design, we consider the system to be a set of components or modules with clearly
defined behaviors and boundaries.
For software design, the goal is to divide the problem into manageable pieces.
These pieces cannot be entirely independent of each other as they together form the
system. They have to cooperate and communicate to solve the problem. This
communication adds complexity.
Note: As the number of partitions increases, the cost of partitioning and the complexity increase.
Abstraction
An abstraction is a tool that enables a designer to consider a component at an abstract
level without bothering about the internal details of the implementation. Abstraction
can be used for existing elements as well as for the component being designed. There
are two common abstraction mechanisms:
1. Functional Abstraction
2. Data Abstraction
Functional Abstraction
i. A module is specified by the function it performs.
ii. The details of the algorithm to accomplish the functions are not visible to the user of the
function.
Functional abstraction forms the basis for Function oriented design approaches.
Data Abstraction
Details of the data elements are not visible to the users of data. Data Abstraction forms
the basis for Object Oriented design approaches.
Modularity
Modularity refers to the division of software into separate modules that are
differently named and addressed and are integrated later to obtain the completely
functional software. It is the only property that allows a program to be intellectually
manageable. Single large programs are difficult to understand and read due to the large
number of reference variables, control paths, global variables, etc.
o Each module is a well-defined system that can be used with other applications.
o Each module has single specified objectives.
o Modules can be separately compiled and saved in the library.
o Modules should be easier to use than to build.
o Modules are simpler from outside than inside.
Advantages of Modularity
Disadvantages of Modularity
Modular Design
Modular design reduces the design complexity and results in easier and faster
implementation by allowing parallel development of various parts of a system. We
discuss the different aspects of modular design in detail in this section:
The use of information hiding as a design criterion for modular systems provides the most
significant benefits when modifications are required during testing and later during
software maintenance. This is because, as most data and procedures are hidden from
other parts of the software, inadvertent errors introduced during modifications are less
likely to propagate to other locations within the software.
Strategy of Design
A good system design strategy is to organize the program modules in such a way that
they are easy to develop and, later, to change. Structured design methods help
developers to deal with the size and complexity of programs. Analysts generate
instructions for the developers about how code should be composed and how pieces of
code should fit together to form a program. There are two strategies:
1. Top-down Approach
2. Bottom-up Approach
1. Top-down Approach: This approach starts with the identification of the main
components and then decomposing them into their more detailed sub-components.
2. Bottom-up Approach: A bottom-up approach begins with the lower details and
moves towards up the hierarchy, as shown in fig. This approach is suitable in case of an
existing system.
Coupling and Cohesion
Module Coupling
In software engineering, the coupling is the degree of interdependence between
software modules. Two modules that are tightly coupled are strongly dependent on
each other. However, two modules that are loosely coupled are not dependent on each
other. Uncoupled modules have no interdependence at all between them.
A good design is one that has low coupling. Coupling is measured by the number of
relations between the modules: coupling increases as the number of calls between
modules increases or as the amount of shared data grows. Thus, it can be said that
a design with high coupling will have more errors.
Types of Module Coupling
1. No Direct Coupling: There is no direct coupling between the two modules; they are
independent of each other.
2. Data Coupling: When data of one module is passed to another module, this is called
data coupling.
3. Stamp Coupling: Two modules are stamp coupled if they communicate using
composite data items such as structure, objects, etc. When the module passes non-
global data structure or entire structure to another module, they are said to be stamp
coupled. For example, passing structure variable in C or object in C++ language to a
module.
4. Control Coupling: Control Coupling exists among two modules if data from one
module is used to direct the structure of instruction execution in another.
5. External Coupling: External Coupling arises when two modules share an externally
imposed data format, communication protocols, or device interface. This is related to
communication to external tools and devices.
6. Common Coupling: Two modules are common coupled if they share information
through some global data items.
7. Content Coupling: Content Coupling exists among two modules if they share code,
e.g., a branch from one module into another module.
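To make a few of these categories concrete, here is a small illustrative C sketch (the names are ours, chosen for illustration): passing elementary data is data coupling, passing a whole structure is stamp coupling, and passing a flag that steers the callee's logic is control coupling.

#include <stdio.h>

struct employee { char name[32]; double base_pay; double overtime; };

/* Data coupling: only the elementary values needed are passed. */
double net_pay(double gross, double tax) { return gross - tax; }

/* Stamp coupling: a composite structure is passed, exposing all fields. */
double gross_pay(const struct employee *e) { return e->base_pay + e->overtime; }

/* Control coupling: the flag directs which branch the callee executes. */
void print_pay(double amount, int verbose)
{
    if (verbose)
        printf("Net pay due: %.2f\n", amount);
    else
        printf("%.2f\n", amount);
}

int main(void)
{
    struct employee e = {"A. Jones", 2000.0, 150.0};
    double gross = gross_pay(&e);        /* stamp coupling   */
    double net = net_pay(gross, 430.0);  /* data coupling    */
    print_pay(net, 1);                   /* control coupling */
    return 0;
}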
Module Cohesion
In computer programming, cohesion refers to the degree to which the elements of a
module belong together. Thus, cohesion measures the strength of the relationships
between pieces of functionality within a given module. For example, in highly cohesive
systems, functionality is strongly related.
Coupling | Cohesion
Coupling is also called Inter-Module Binding. | Cohesion is also called Intra-Module Binding.
Coupling shows the relationships between modules. | Cohesion shows the relationship within the module.
Coupling shows the relative independence between the modules. | Cohesion shows the module's relative functional strength.
While creating, you should aim for low coupling, i.e., dependency among modules should be less. | While creating, you should aim for high cohesion, i.e., a module should focus on a single function, with minimum interaction with other modules of the system.
In coupling, modules are linked to the other modules. | In cohesion, the module focuses on a single thing.
Function Oriented Design
Function Oriented design is a method to software design where the model is
decomposed into a set of interacting units or modules where each unit or module has a
clearly defined function. Thus, the system is designed from a functional viewpoint.
Design Notations
Design Notations are primarily meant to be used during the process of design and are
used to represent design or design decisions. For a function-oriented design, the design
can be represented graphically or mathematically by the following:
Data-flow diagrams are a useful and intuitive way of describing a system. They are
generally understandable without specialized training, notably if control information is
excluded. They show end-to-end processing. That is the flow of processing from when
data enters the system to where it leaves the system can be traced.
Data-flow design is an integral part of several design methods, and most CASE tools
support data-flow diagram creation. Different notations may use different icons to represent
data-flow diagram entities, but their meanings are similar.
Data Dictionaries
A data dictionary lists all data elements appearing in the DFD model of a system. The
data items listed include all data flows and the contents of all data stores appearing in
the DFDs of the DFD model of the system.
A data dictionary lists the purpose of all data items and the definition of all composite
data elements in terms of their component data items. For example, a data dictionary
entry may state that the data grossPay consists of the
components regularPay and overtimePay.
grossPay = regularPay + overtimePay
For the smallest units of data elements, the data dictionary lists their name and their
type.
A data dictionary plays a significant role in any software development process because
of the following reasons:
o A Data dictionary provides a standard language for all relevant information for use by
engineers working in a project. A consistent vocabulary for data items is essential since,
in large projects, different engineers of the project tend to use different terms to refer to
the same data, which unnecessarily causes confusion.
o The data dictionary provides the analyst with a means to determine the definition of
various data structures in terms of their component elements.
Structured Charts
It partitions a system into black boxes. A black box is a system whose functionality is
known to the user without knowledge of its internal design.
Pseudo-code
Pseudo-code notations can be used in both the preliminary and detailed design phases.
Using pseudo-code, the designer describes system characteristics using short, concise
English-language phrases that are structured by keywords such as If-Then-Else, While-
Do, and End.
Object-Oriented Design
In the object-oriented design method, the system is viewed as a collection of objects
(i.e., entities). The state is distributed among the objects, and each object handles its
state data. For example, in a Library Automation Software, each library representative
may be a separate object with its own data and functions to operate on that data. The
tasks defined for one object cannot refer to or change the data of other objects. Objects
have their own internal data, which represents their state. Similar objects form a class.
In other words,
each object is a member of some class. Classes may inherit features from the superclass.
1. Objects: All entities involved in the solution design are known as objects. For
example, person, banks, company, and users are considered as objects. Every
entity has some attributes associated with it and has some methods to perform
on the attributes.
2. Classes: A class is a generalized description of an object. An object is an instance
of a class. A class defines all the attributes, which an object can have and
methods, which represents the functionality of the object.
3. Messages: Objects communicate by message passing. Messages consist of the
identity of the target object, the name of the requested operation, and any other
information needed to perform the function. Messages are often implemented as
procedure or function calls.
4. Abstraction: In object-oriented design, complexity is handled using abstraction.
Abstraction is the removal of the irrelevant and the amplification of the essentials.
5. Encapsulation: Encapsulation is also called an information hiding concept. The
data and operations are linked to a single unit. Encapsulation not only bundles
essential information of an object together but also restricts access to the data
and methods from the outside world.
6. Inheritance: OOD allows similar classes to stack up in a hierarchical manner
where the lower or sub-classes can import, implement, and re-use allowed
variables and functions from their immediate parents. This property of OOD
is called inheritance. This makes it easier to define a specific class and to
create generalized classes from specific ones.
7. Polymorphism: OOD languages provide a mechanism where methods
performing similar tasks but varying in arguments can be assigned the same name.
This is known as polymorphism, which allows a single interface to perform
functions for different types. Depending upon how the service is invoked, the
respective portion of the code gets executed.
Command Line Interface (CLI): The user interacts by typing commands.
Advantages
o More customization options, and easier to customize.
o Typically capable of carrying out more powerful tasks.
Disadvantages
o Relies heavily on recall rather than recognition.
o Navigation is often more difficult.
Graphical User Interface (GUI): GUI relies much more heavily on the mouse. A typical
example of this type of interface is any versions of the Windows operating systems.
GUI Characteristics
Characteristics Descriptions
Windows: Multiple windows allow different information to be displayed simultaneously on the user's screen.
Icons: Icons represent different types of information. On some systems, icons represent files; on others, icons represent processes.
Menus: Commands are selected from a menu rather than typed in a command language.
Pointing: A pointing device such as a mouse is used for selecting choices from a menu or indicating items of interest in a window.
Graphics: Graphics elements can be mixed with text on the same display.
Advantages
o Less expert knowledge is required to use it.
o Easier to navigate; one can look through folders quickly in a guess-and-check manner.
o The user may switch quickly from one task to another and can interact with several
different applications.
Disadvantages
o Typically fewer options.
o Usually less customizable; it is not easy to use one button for many different variations.
UI Design Principles
Structure: The design should organize the user interface purposefully, in meaningful
and useful ways based on precise, consistent models that are apparent and recognizable to
users, putting related things together and separating unrelated things, differentiating
dissimilar things, and making similar things resemble one another. The structure
principle is concerned with the overall user interface architecture.
Simplicity: The design should make simple, common tasks easy, communicating
clearly and directly in the user's language and providing good shortcuts that are
meaningfully related to longer procedures.
Visibility: The design should make all required options and materials for a given
function visible without distracting the user with extraneous or redundant data.
Feedback: The design should keep users informed of actions or interpretations, changes
of state or condition, and errors or exceptions that are relevant and of interest to the user,
through clear, concise, and unambiguous language familiar to users.
Tolerance: The design should be flexible and tolerant, decreasing the cost of mistakes and
misuse by allowing undoing and redoing, while also preventing errors wherever possible
by tolerating varied inputs and sequences and by interpreting all reasonable actions.
Coding
Coding is the process of transforming the design of a system into a computer
language format. This phase of software development is concerned with
translating the design specification into source code. It is necessary to write
source code and internal documentation so that conformance of the code to its
specification can be easily verified.
Coding is done by coders or programmers, who may be different people from the
designers. The goal is not only to reduce the effort and cost of the coding phase, but
also to cut the cost of later stages. The cost of testing and maintenance can be
significantly reduced with efficient coding.
Goals of Coding
1. To translate the design of the system into a computer language format: Coding
is the process of transforming the design of a system into a computer
language format that can be executed by a computer and that performs the tasks
specified by the design produced during the design phase.
2. To reduce the cost of later phases: The cost of testing and maintenance can be
significantly reduced with efficient coding.
3. Making the program more readable: A program should be easy to read and
understand. Having readability and understandability as a clear objective of the
coding activity can itself help in producing more maintainable software.
For implementing our design into code, we require a high-level language. A
programming language should have the following characteristics:
Brevity: The language should have the ability to implement the algorithm with a smaller
amount of code. Programs written in high-level languages are often significantly shorter
than their low-level equivalents.
A coding standard lists several rules to be followed during coding, such as the way
variables are to be named, the way the code is to be laid out, error return conventions,
etc.
Coding Standards
General coding standards refer to how the developer writes code; here we will
discuss some essential standards regardless of the programming language being used.
Coding Guidelines
General coding guidelines provide the programmer with a set of best practices
that can be used to make programs easier to read and maintain. Most of
the examples use C syntax, but the guidelines can be applied to all
languages.
2. Spacing: The appropriate use of spaces within a line of code can improve readability.
Example:
Bad:
cost=price+(price*sales_tax);
fprintf(stdout,"The total cost is %5.2f\n",cost);
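Good (the same statements with proper spacing):
cost = price + (price * sales_tax);
fprintf(stdout, "The total cost is %5.2f\n", cost);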
4. The length of any function should not exceed 10 source lines: A very lengthy
function is generally very difficult to understand, as it probably carries out many different
functions. For the same reason, lengthy functions are likely to have a
disproportionately larger number of bugs.
5. Do not use goto statements: Use of goto statements makes a program unstructured
and very tough to understand.
Programming Style
Programming style refers to the technique used in writing the source code for a
computer program. Most programming styles are designed to help programmers
quickly read and understand the program as well as avoid making errors. (Older
programming styles also focused on conserving screen space.) A good coding style can
overcome the many deficiencies of a first programming language, while poor style can
defeat the intent of an excellent language.
2. Naming: In a program, you are required to name modules, processes, variables,
and so on. Care should be taken that the naming style is not cryptic or
non-representative.
3. Control Constructs: It is desirable that, as much as possible, single-entry and
single-exit constructs are used.
4. Information hiding: The information held in the data structures should be hidden
from the rest of the system where possible. Information hiding can decrease the
coupling between modules and make the system more maintainable.
5. Nesting: Deep nesting of loops and conditions greatly harms the static and dynamic
behavior of a program. It also makes the program logic difficult to understand, so it is
desirable to avoid deep nesting.
6. User-defined types: Make heavy use of user-defined data types like enum, class,
structure, and union. These data types make your program code easy to write and easy
to understand.
7. Module size: The module size should be uniform. The size of the module should not
be too big or too small. If the module size is too large, it is not generally functionally
cohesive. If the module size is too small, it leads to unnecessary overheads.
Structured Programming
In structured programming, we sub-divide the whole program into small modules so
that the program becomes easy to understand. The purpose of structured programming
is to linearize control flow through a computer program so that the execution sequence
follows the sequence in which the code is written. The dynamic structure of the program
then resembles the static structure of the program. This enhances the readability,
testability, and modifiability of the program. This linear flow of control can be managed
by restricting the set of allowed control constructs to single-entry, single-exit
formats.
Rule 2 of Structured Programming: A sequence of two or more code blocks is itself
structured, as shown in the figure.
Structured Rule Three: Alternation
If-then-else is frequently called alternation (because there are alternative options). In
structured programming, each choice is a code block. If alternation is organized as in
the flowchart at right, then there is one entry point (at the top) and one exit point (at the
bottom). The structure should be coded so that if the entry conditions are fulfilled, then
the exit conditions are satisfied (just like a code block).
Rule 5 of Structured Programming: A structure (of any size) that has a single entry
point and a single exit point is equivalent to a code block. For example, we are
designing a program to go through a list of signed integers calculating the absolute
value of each one. We may (1) first regard the program as one block, then (2) sketch in
the iteration required, and finally (3) put in the details of the loop body, as shown in the
figure.
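A minimal C sketch of the finished loop body (array and length assumed as inputs):

/* Replace each element of a[] with its absolute value. The loop is a
   single-entry, single-exit structured block (Rule 5). */
void absolute_values(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] < 0)
            a[i] = -a[i];
}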
The other control structures, such as case, do-until, do-while, and for, are not strictly
needed. However, they are sometimes convenient and are usually regarded as part of
structured programming; in assembly language, they add little convenience.
Halstead’s metrics facilitate software maintenance prediction by providing quantitative measures of a program's complexity and effort. By assessing program length, vocabulary, volume, and difficulty, these metrics estimate the work required for maintenance activities. A higher estimated program effort indicates a potentially more complex maintenance task, necessitating more detailed planning and resources. Additionally, the metrics' simplicity and applicability to various languages assist in standardizing maintenance efforts across different projects, promoting better organization and resource allocation for maintenance tasks.
Halstead's potential minimum volume (V*) represents the smallest volume a program can achieve for a given problem, calculated as V* = (2 + n2*) * log2(2 + n2*), with n2* being the count of unique input and output parameters. V* is used to assess the theoretical efficiency of a program; a higher program level (L) indicates a closer alignment between actual and potential volume, denoting a more efficient implementation. This metric helps developers gauge the extent to which a program's code can be optimized, encouraging the creation of more concise and efficient programs.
Halstead's Metrics calculate programming effort (E) using the formula E = V * D, where V is the program volume and D is the program difficulty. Program volume (V) is determined by the formula V = N * log2(n), where N is the program length (total operator and operand occurrences) and n is the vocabulary size (unique operators and operands). Difficulty (D) is calculated as (n1/2) * (N2/n2), with n1 as the number of unique operators, N2 as the total operand occurrences, and n2 as the number of unique operands. These metrics provide a quantitative measure of the complexity and effort required to develop a program.
Halstead Metrics differ from traditional software metrics in that they focus on the number of operators and operands, using these to derive measures like program length, vocabulary, volume, difficulty, and effort. Traditional metrics, such as Lines of Code or cyclomatic complexity, often measure code size or control flow complexity. Halstead Metrics provide a more abstract, language-independent view of complexity, offering insights into the cognitive effort required for software development. This abstract approach allows Halstead Metrics to quantifiably predict error rates and maintenance needs, complementing traditional metrics that often focus on structural aspects.
Halstead’s Software Metrics have several limitations: they assume all operators and operands are equally important and suitable for different programming languages and environments, which may not be the case. They focus solely on code complexity and effort, overlooking other factors like reliability and maintainability. The metrics rely on specific assumptions, possibly limiting their accuracy in diverse contexts, such as highly interactive or real-time systems. Consequently, while useful, Halstead's Metrics may not fully represent all relevant aspects of software development and require complementary approaches for more comprehensive assessments.
The COCOMO model categorizes software projects into three types: organic, semidetached, and embedded. Organic projects involve well-understood applications with small, experienced teams (e.g., business systems), while semidetached projects consist of mixed-experience teams working on somewhat unfamiliar systems (e.g., a new OS). Embedded projects involve software tightly coupled with complex hardware and stringent regulations (e.g., ATM systems). For each category, COCOMO uses project size expressed in Kilo Lines of Code (KLOC) alongside constants specific to each category to estimate effort (person-months) and development time (months).
Using both internal and external software metrics is essential for a comprehensive software evaluation. Internal metrics, such as Lines of Code, focus on development aspects, helping developers assess complexity and internal quality. External metrics, targeting user-facing features like portability, reliability, and usability, address end-user concerns and usability. Combining these metrics provides a holistic understanding of software performance, balancing development demands with user requirements. This integrated evaluation facilitates informed decision-making in the development and deployment stages, optimizing overall software quality and stakeholder satisfaction.
Software metrics are classified into product metrics, process metrics, internal metrics, external metrics, and hybrid metrics. Product metrics measure characteristics such as size, complexity, quality, and reliability of the software product itself. Process metrics focus on the software development process, evaluating the efficiency of methods and tools, such as the effectiveness of fault detection. Internal metrics are important to software developers, like Lines of Code (LOC), while external metrics are significant to end-users, encompassing attributes like portability and usability. Hybrid metrics incorporate aspects of product, process, and resources, such as cost per Function Point. These distinctions allow organizations to address different aspects of software development and maintenance.
Data structure metrics help understand software development effort by focusing on data input, processing, and output within a program or module. They measure the amount of data handled and its internal usage, identifying program weaknesses and data sharing among modules. These insights guide developers in optimizing data handling strategies to reduce effort and time required for completion. Emphasizing variables and constants allows these metrics to offer a detailed view of the data's role in a system, aiding in efficient resource allocation and effort estimation.
Software metrics are vital for project management as they aid in planning, controlling, and improving software projects. Metrics such as project metrics help managers track progress by comparing current effort, cost, and time against original estimates. They inform decisions, such as resource allocation, reducing development costs, and identifying risks early. Additionally, metrics facilitate the comparative study of design methodologies, programming languages, and staff productivity, which aids in strategic planning and optimizing resource utilization. As projects progress, metrics are crucial for quality improvement, reducing errors, and guiding design trade-offs between development and maintenance costs.