Software Metrics
Within the software development process, there are many metrics that are all connected. Software metrics are related to the four functions of management: planning, organization, control, and improvement.
1. Product Metrics: These are the measures of various characteristics of the software
product. The two important software characteristics are its size and complexity, and its
quality and reliability.
2. Process Metrics: These are the measures of various characteristics of the software
development process. For example, the efficiency of fault detection. They are used
to measure the characteristics of methods, techniques, and tools that are used for
developing software.
Types of Metrics
Internal metrics: Internal metrics are the metrics used for measuring properties that are
viewed to be of greater importance to a software developer. For example, Lines of Code
(LOC) measure.
External metrics: External metrics are the metrics used for measuring properties that
are viewed to be of greater importance to the user, e.g., portability, reliability,
functionality, usability, etc.
Hybrid metrics: Hybrid metrics are the metrics that combine product, process, and
resource metrics. For example, cost per FP where FP stands for Function Point Metric.
Project metrics: Project metrics are the metrics used by the project manager to check
the project's progress. Data from the past projects are used to collect various metrics,
like time and cost; these estimates are used as a base of new software. Note that as the
project proceeds, the project manager will check its progress from time-to-time and will
compare the effort, cost, and time with the original effort, cost and time. Also
understand that these metrics are used to decrease the development costs, time efforts
and risks. The project quality can also be improved. As quality improves, the number of
errors and time, as well as cost required, is also reduced.
Based on the LOC/KLOC count of software, many other metrics can be computed:
a. Errors/KLOC.
b. $/ KLOC.
c. Defects/KLOC.
d. Pages of documentation/KLOC.
e. Errors/PM.
f. Productivity = KLOC/PM (effort is measured in person-months).
g. $/ Page of documentation.
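For instance (figures assumed for illustration): a 20 KLOC product built with 10 person-months of effort gives Productivity = 20/10 = 2 KLOC/PM, and 40 defects found would give a defect density of 40/20 = 2 Defects/KLOC.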
Advantages of LOC
1. Simple to measure
Disadvantages of LOC
1. It is defined on the code. For example, it cannot measure the size of the
specification.
2. It characterizes only one specific view of size, namely length; it takes no account
of functionality or complexity.
3. Bad software design may cause an excessive number of lines of code.
4. It is language dependent.
5. Users cannot easily understand it.
Objectives of FPA
The basic and primary purpose of function point analysis is to measure and provide the
functional size of the software application to the client, customer, and stakeholder on
request. Further, it is used to measure software project development and maintenance
consistently throughout the project, irrespective of the tools and technologies used.
1. The FP count of an application is found by counting the number and types of functions
used in the application. The various functions used in an application can be put into five
types, as shown in the table:
Types of FP Attributes
Measurement Parameters: Number of External Inputs (EI), Number of External Outputs (EO), Number of External Inquiries (EQ), Number of Internal Logical Files (ILF), and Number of External Interface Files (EIF).
2. FP characterizes the complexity of the software system and hence can be used to
estimate the project time and the manpower requirement.
3. The effort required to develop the project depends on what the software does.
5. The FP method is used for data processing systems and business systems like
information systems.
6. The five parameters mentioned above are also known as information domain
characteristics.
7. All the parameters mentioned above are assigned weights that have been
experimentally determined, as shown in the table (the standard FPA weights):
Measurement Parameter | Simple | Average | Complex
External Inputs (EI) | 3 | 4 | 6
External Outputs (EO) | 4 | 5 | 7
External Inquiries (EQ) | 3 | 4 | 6
Internal Logical Files (ILF) | 7 | 10 | 15
External Interface Files (EIF) | 5 | 7 | 10
The functional complexities are multiplied with the corresponding weights against each
function, and the values are added up to determine the UFP (Unadjusted Function Point)
of the subsystem.
Here that weighing factor will be simple, average, or complex for a measurement
parameter type.
The Function Point (FP) is thus calculated with the following formula:
FP = UFP * CAF
where CAF = 0.65 + 0.01 * ∑(fi), and ∑(fi) is the sum of all 14 questionnaires and gives the
complexity adjustment value/factor (CAF), where i ranges from 1 to 14. Usually, a student is
provided with the value of ∑(fi).
Based on the FP measure of software, many other metrics can be computed:
a. Errors/FP
b. $/FP.
c. Defects/FP
d. Pages of documentation/FP
e. Errors/PM.
f. Productivity = FP/PM (effort is measured in person-months).
g. $/Page of Documentation.
8. LOCs of an application can be estimated from FPs. That is, they are
interconvertible. This process is known as backfiring. For example, 1 FP is equal to
about 100 lines of COBOL code.
9. The FP metric is used mostly for measuring the size of Management Information System
(MIS) software.
10. But the function points obtained above are unadjusted function points (UFPs). These
(UFPs) of a subsystem are further adjusted by considering some more General System
Characteristics (GSCs). It is a set of 14 GSCs that need to be considered. The procedure
for adjusting UFPs is as follows:
Step 1: Rate each of the 14 GSCs on a scale of 0 to 5, giving its Degree of Influence (DI).
Step 2: Add the 14 ratings to obtain the Total Degree of Influence (TDI), which ranges from 0 to 70.
Step 3: Compute the Value Adjustment Factor: VAF = 0.65 + 0.01 * TDI.
Remember that the value of VAF lies within 0.65 to 1.35 because:
a. When TDI = 0, VAF = 0.65
b. When TDI = 70, VAF = 1.35
c. VAF is then multiplied with the UFP to get the final FP count: FP = VAF * UFP
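This adjustment is simple to mechanize. The following is a minimal C sketch (the function name compute_fp and the sample figures are ours, for illustration only):

#include <stdio.h>

/* Adjusted Function Point count from Unadjusted Function Points (UFP)
   and the Total Degree of Influence (TDI, 0..70). */
double compute_fp(double ufp, int tdi)
{
    double vaf = 0.65 + 0.01 * tdi;  /* VAF lies in [0.65, 1.35] */
    return vaf * ufp;                /* FP = VAF * UFP */
}

int main(void)
{
    /* Example: UFP = 100 and all 14 GSCs rated 3, so TDI = 42 */
    printf("FP = %.2f\n", compute_fp(100.0, 42));  /* prints FP = 107.00 */
    return 0;
}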
Token Count
In these metrics, a computer program is considered to be a collection of tokens, which
may be classified as either operators or operands. All software science metrics can be
defined in terms of these basic symbols, which are called tokens.
In terms of the total tokens used, the size of the program can be expressed as N = N1 + N2,
where N1 is the total number of occurrences of operators and N2 is the total number of
occurrences of operands.
Program Volume (V)
The unit of measurement of volume is the standard unit for size, "bits." It is the actual
size of a program if a uniform binary encoding for the vocabulary is used.
V = N * log2(n)
Program Level (L)
The value of L ranges between zero and one, with L = 1 representing a program written
at the highest possible level (i.e., with minimum size).
L = V* / V
Program Difficulty
The difficulty level or error-proneness (D) of the program is proportional to the number
of unique operators in the program.
D = (n1 / 2) * (N2 / n2)
Program Effort (E) is the product of difficulty and volume:
E = V / L = D * V
Estimated Program Length
According to Halstead, the first hypothesis of software science is that the length of a
well-structured program is a function only of the number of unique operators and
operands.
N = N1 + N2
N^ = n1 * log2(n1) + n2 * log2(n2)
The following alternate expressions have been published to estimate program length:
NJ = log2(n1!) + log2(n2!)
NB = n1 * log2(n2) + n2 * log2(n1)
NC = n1 * sqrt(n1) + n2 * sqrt(n2)
NS = (n * log2(n)) / 2
The potential minimum volume V* is defined as the volume of the shortest program
in which a problem can be coded:
V* = (2 + n2*) * log2(2 + n2*)
where n2* is the count of unique input and output parameters.
The size of the vocabulary of a program, which consists of the number of unique tokens
used to build a program, is defined as:
n=n1+n2
where
n=vocabulary of a program
n1=number of unique operators
n2=number of unique operands
Language Level - Shows the algorithm implementation program language level. The
same algorithm demands additional effort if it is written in a low-level program
language. For example, it is easier to program in Pascal than in Assembler.
λ = L * V* = L² * V (equivalently, λ = V / D², since L = 1 / D)
Language levels
Language | Language level λ | Variance
PASCAL | 2.54 | -
APL | 2.42 | -
C | 0.857 | 0.445
Example: Consider the sorting program shown below (the full listing appears later in this
section). List out the operators and operands and also calculate the values of software
science measures like n, N, V, E, λ, etc.
Operators Occurrences Operands Occurrences
int 4 sort 1
() 5 x 7
, 4 n 3
[] 7 i 8
if 2 j 7
< 2 save 3
; 11 im1 3
for 2 2 2
= 6 1 3
- 1 0 1
<= 2 - -
++ 2 - -
return 2 - -
{} 3 - -
Here n1 = 14, N1 = 53, n2 = 10, N2 = 38, so n = n1 + n2 = 24 and N = N1 + N2 = 91.
N^ = 14 * log2(14) + 10 * log2(10)
= 14 * 3.81 + 10 * 3.32
= 53.34 + 33.2 = 86.54
V = N * log2(n) = 91 * log2(24) ≈ 91 * 4.585 ≈ 417.23 bits
n2* = 3 {x: the array holding the integers to be sorted, used as both input and output;
the other two I/O parameters are presumably n and the returned status}
V* = (2 + n2*) * log2(2 + n2*) = 5 * log2(5) ≈ 11.61
Since L = V*/V = 11.61 / 417.23 ≈ 0.0278
E = V / L = 417.23 / 0.0278 ≈ 15,000 elementary mental discriminations
T = E / (f * S) = 15,000 / (60 * 18) ≈ 14 minutes
This is probably a reasonable time to produce the program, which is very simple.
Halstead’s Software metrics are a set of measures proposed by Maurice Halstead to
evaluate the complexity of a software program. These metrics are based on the number of
distinct operators and operands in the program and are used to estimate the effort required
to develop and maintain the program.
Field of Halstead Metrics
Program length (N): This is the total number of operator and operand occurrences in
the program.
Vocabulary size (n): This is the total number of distinct operators and operands in
the program.
Program volume (V): This is the product of program length (N) and the logarithm of
vocabulary size (n), i.e., V = N*log2(n).
Program level (L): This is the ratio of the potential minimum volume to the actual
program volume, i.e., L = V*/V; it indicates how close the implementation is to the
most succinct possible one.
Program difficulty (D): This is the reciprocal of the program level, estimated as
D = (n1/2) * (N2/n2), where n1 is the number of distinct operators, N2 is the total
number of operand occurrences, and n2 is the number of distinct operands.
Program effort (E): This is the product of program volume (V) and program
difficulty (D), i.e., E = V*D.
Time to implement (T): This is the estimated time required to implement the
program, based on the program effort (E) and a constant value that depends on the
programming language and development environment.
Halstead’s software metrics can be used to estimate the size, complexity, and effort
required to develop and maintain a software program. However, they have some
limitations, such as the assumption that all operators and operands are equally important,
and the assumption that the same set of metrics can be used for different programming
languages and development environments.
Overall, Halstead’s software metrics can be a useful tool for software developers and
project managers to estimate the effort required to develop and maintain software
programs.
n1 = Number of distinct operators.
n2 = Number of distinct operands.
N1 = Total number of occurrences of operators.
N2 = Total number of occurrences of operands.
Halstead Metrics
Halstead metrics are:
Halstead Program Length: The total number of operator occurrences and the total
number of operand occurrences.
N = N1 + N2
And the estimated program length is N^ = n1 * log2(n1) + n2 * log2(n2)
The following alternate expressions have been published to estimate program length:
NJ = log2(n1!) + log2(n2!)
NB = n1 * log2n2 + n2 * log2n1
NC = n1 * sqrt(n1) + n2 * sqrt(n2)
NS = (n * log2n) / 2
Halstead Vocabulary: The total number of unique operators and unique operands used
to build the program.
n = n1 + n2
Program Volume: Proportional to program size, represents the size, in bits, of space
necessary for storing the program. This parameter is dependent on specific algorithm
implementation. The properties V, N, and the number of lines in the code are shown
to be linearly connected and equally valid for measuring relative program size.
V = Size * (log2 vocabulary) = N * log2(n)
The unit of measurement of volume is the common unit for size, "bits". It is the actual
size of a program if a uniform binary encoding for the vocabulary is used. The estimated
number of delivered errors is: B = Volume / 3000
Potential Minimum Volume: The potential minimum volume V* is defined as the
volume of the most succinct program in which a problem can be coded.
V* = (2 + n2*) * log2(2 + n2*)
Here, n2* is the count of unique input and output parameters
Program Level: To rank the programming languages, the level of abstraction
provided by the programming language, Program Level (L) is considered. The higher
the level of a language, the less effort it takes to develop a program using that
language.
L = V* / V
The value of L ranges between zero and one, with L=1 representing a program written
at the highest possible level (i.e., with minimum size).
And the estimated program level is L^ = (2 * n2) / (n1 * N2)
Program Difficulty: This parameter shows how difficult the program is to handle.
D = (n1 / 2) * (N2 / n2)
D = 1 / L
As the volume of the implementation of a program increases, the program level
decreases and the difficulty increases. Thus, programming practices such as redundant
usage of operands, or the failure to use higher-level control constructs will tend to
increase the volume as well as the difficulty.
Programming Effort: Measures the amount of mental activity needed to translate the
existing algorithm into implementation in the specified program language.
E = V / L = D * V = Difficulty * Volume
Language Level: Shows the algorithm implementation program language level. The
same algorithm demands additional effort if it is written in a low-level program
language. For example, it is easier to program in Pascal than in Assembler.
λ = L * V* = L² * V (equivalently, λ = V / D², since L = 1 / D)
Intelligence Content: Determines the amount of intelligence presented (stated) in the
program. This parameter provides a measurement of program complexity,
independently of the programming language in which it was implemented.
I=V/D
Programming Time: Shows time (in minutes) needed to translate the existing
algorithm into implementation in the specified program language.
T = E / (f * S)
The concept of the processing rate of the human brain, developed by psychologist
John Stroud, is also used. Stroud defined a moment as the time required by the human
brain to carry out the most elementary decision. The Stroud number S is therefore the
number of Stroud moments per second, with 5 <= S <= 20. Halstead uses S = 18, a value
developed empirically from psychological reasoning, and its recommended value for
programming applications is 18.
Stroud number S = 18 moments / second
seconds-to-minutes factor f = 60
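Putting these formulas together, the following minimal C sketch computes the main Halstead measures from the four basic counts (the function name halstead_report is ours; the sample counts are taken from the sorting example below):

#include <stdio.h>
#include <math.h>

/* Main Halstead measures from the four basic counts:
   n1, n2 = distinct operators/operands; N1, N2 = total occurrences. */
void halstead_report(double n1, double n2, double N1, double N2)
{
    double n = n1 + n2;                           /* vocabulary       */
    double N = N1 + N2;                           /* program length   */
    double Nhat = n1 * log2(n1) + n2 * log2(n2);  /* estimated length */
    double V = N * log2(n);                       /* volume (bits)    */
    double D = (n1 / 2.0) * (N2 / n2);            /* difficulty       */
    double E = D * V;                             /* effort           */
    double T = E / (60.0 * 18.0);                 /* minutes, S = 18  */
    printf("n=%.0f N=%.0f N^=%.1f V=%.1f D=%.1f E=%.0f T=%.1f min\n",
           n, N, Nhat, V, D, E, T);
}

int main(void)
{
    halstead_report(14, 10, 53, 38);  /* counts from the sort example */
    return 0;
}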
Counting Rules for C Language
1. Comments are not considered.
2. The identifier and function declarations are not considered
3. All the variables and constants are considered operands.
4. Global variables used in different modules of the same program are counted as
multiple occurrences of the same variable.
5. Local variables with the same name in different functions are counted as unique
operands.
6. Function calls are considered operators.
7. All looping statements e.g., do {…} while ( ), while ( ) {…}, for ( ) {…}, all control
statements e.g., if ( ) {…}, if ( ) {…} else {…}, etc. are considered as operators.
8. In control construct switch ( ) {case:…}, switch as well as all the case statements are
considered as operators.
9. Reserved words like return, default, continue, break, sizeof, etc., are considered
operators.
10. All the brackets, commas, and terminators are considered operators.
11. GOTO is counted as an operator and the label is counted as an operand.
12. The unary and binary occurrences of “+” and “-” are dealt with separately. Similarly
“*” (multiplication operator) is dealt with separately.
13. In the array variables such as “array-name [index]” “array-name” and “index” are
considered as operands and [ ] is considered as operator.
14. In structure variables such as "struct-name.member-name" or "struct-name ->
member-name", struct-name and member-name are taken as operands, and '.', '->'
are taken as operators. The names of member elements in different structure
variables are counted as unique operands.
15. All the hash directives are ignored.
Example: List out the operators and operands and also calculate the values of software
science measures like n, N, V, E, λ, etc., for the following program:
int sort(int x[], int n)
{
    int i, j, save, im1;
    /* This function sorts array x in ascending order */
    if (n < 2) return 1;
    for (i = 2; i <= n; i++)
    {
        im1 = i - 1;
        for (j = 1; j <= im1; j++)
            if (x[i] < x[j])
            {
                save = x[i];
                x[i] = x[j];
                x[j] = save;
            }
    }
    return 0;
}
Explanation
Operators Occurrences Operands Occurrences
int 4 sort 1
() 5 x 7
, 4 n 3
[] 7 i 8
if 2 j 7
< 2 save 3
; 11 im1 3
for 2 2 2
= 6 1 3
- 1 0 1
<= 2 - -
++ 2 - -
return 2 - -
{} 3 - -
Therefore, n1 = 14, N1 = 53, n2 = 10, N2 = 38.
Example:
Program | Data Input | Internal Data | Data Output
Payroll | Name / Social Security No. / Pay rate / Number of hours worked | Withholding rates, Overtime Factors, Insurance Premium Rates | Gross Pay, Withholding, Net Pay, Pay Ledgers
Software Planner | Program Size / No. of developers on team | Model Parameter Constants, Coefficients | Est. project effort, Est. project duration
This is why an important set of metrics captures the amount of data input to, processed
in, and output from the software. A count of this data structure usage is called Data
Structure Metrics. Here the concentration is on variables (and given constants) within
each module, and input-output dependencies are ignored.
There are several data structure metrics used to compute the effort and time required to
complete the project. These metrics are:
1. The Amount of Data: To measure the amount of data, the following metrics are used:
o Number of variables (VARS): In this metric, the number of variables used in the
program is counted.
o Number of operands (η2): In this metric, the number of operands used in the program
is counted.
η2 = VARS + Constants + Labels
o Total number of occurrences of variables (N2): In this metric, the total number of
occurrences of the variables is computed.
2. The Usage of Data within a Module: To measure this metric, the average number
of live variables is computed. A variable is live from its first to its last reference within
the procedure.
For example, if we want to characterize the average number of live variables for a
program having m modules, we can use this equation:
LVavg = (Σ LVi) / m
where LVi is the average live variable metric computed from the ith module. A similar
equation computes the average span size (SP) for a program of n spans:
SPavg = (Σ SPi) / n
where SPi is the size of the ith span.
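As a small illustration (figures assumed): if a program's three modules have average live-variable counts of 4, 6, and 8, then LVavg = (4 + 6 + 8) / 3 = 6.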
3. Sharing of Data among Modules: As the data sharing between modules increases
(higher coupling), the parameter passing between modules also increases; as a result,
more effort and time are required to complete the project. So sharing of data among
modules is an important metric for calculating effort and time.
Information Flow Metrics
The other set of metrics we would like to consider is known as Information Flow
Metrics. The basis of information flow metrics is founded upon the following concept:
the simplest system consists of components, and it is the work that these components
do and how they are fitted together that identifies the complexity of the system. The
following are the working definitions that are used in information flow:
Coupling: The term used to describe the degree of linkage between one component and
others in the same system.
Information Flow metrics deal with this type of complexity by observing the flow of
information among system components or modules. These metrics were given by Henry
and Kafura, so they are also known as Henry and Kafura's metrics.
They are based on the measurement of the information flow among system modules and
are sensitive to the complexity due to interconnection among system components. In this
measure, the complexity of a software module is defined as the sum of the complexities
of the procedures included in the module. A procedure contributes complexity due to the
following two factors.
FAN-IN: FAN-IN of a procedure is the number of local flows into that procedure plus
the number of data structures from which this procedure retrieves information.
FAN-OUT: FAN-OUT of a procedure is the number of local flows from that procedure
plus the number of data structures which that procedure updates.
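Henry and Kafura combine these counts with the module's size; a commonly cited form of their measure is Complexity = Length * (FAN-IN * FAN-OUT)². A minimal C sketch (the function name hk_complexity and the sample figures are ours):

#include <stdio.h>

/* Henry-Kafura information flow complexity:
   complexity = length * (fan_in * fan_out)^2 */
long hk_complexity(long length, long fan_in, long fan_out)
{
    long flow = fan_in * fan_out;
    return length * flow * flow;
}

int main(void)
{
    /* Assumed example: a 100-line procedure with FAN-IN 3 and FAN-OUT 4 */
    printf("complexity = %ld\n", hk_complexity(100, 3, 4));  /* 14400 */
    return 0;
}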
Software project planning starts before technical work starts. The various steps of
planning activities include size estimation, cost estimation, development time estimation,
resource requirements, and project scheduling. The size is the crucial parameter for the
estimation of the other activities. Resource requirements are estimated based on cost and
development time. The project schedule may prove to be very useful for controlling and
monitoring the progress of the project; it depends on resources and development time.
Static, Single Variable Models: When a model makes use of a single variable to
calculate desired values such as cost, time, effort, etc., it is said to be a single-variable
model. The most common equation is:
C = a * L^b
Where C = cost,
L = size,
and a and b are constants.
The Software Engineering Laboratory established a model called SEL model, for
estimating its software production. This model is an example of the static, single variable
model.
E = 1.4 * L^0.93
DOC = 30.4 * L^0.90
D = 4.6 * L^0.26
where E is the effort in person-months, DOC is the documentation in pages, D is the
duration in months, and L is the size in KLOC.
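For example (size assumed for illustration), a 10 KLOC product gives E = 1.4 * 10^0.93 ≈ 11.9 PM, DOC = 30.4 * 10^0.90 ≈ 241 pages, and D = 4.6 * 10^0.26 ≈ 8.4 months.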
Static, Multivariable Models: These models are based on method (1); they depend on
several variables describing various aspects of the software development environment.
In some models, several variables are needed to describe the software development
process, and a selected equation combines these variables to give an estimate of time
and cost. These models are called multivariable models.
WALSTON and FELIX developed their models at IBM. The following equation gives the
relationship between lines of source code and effort:
E = 5.2 * L^0.91
and the duration of development is given by:
D = 4.1 * L^0.36
The productivity index uses 29 variables that were found to be highly correlated with
productivity, as follows:
I = ∑ Wi * Xi (summed over i = 1 to 29)
where Wi is the weight factor for the ith variable and Xi ∈ {-1, 0, +1}; the estimator
gives Xi one of the values -1, 0, or +1 depending on whether the variable decreases,
has no effect on, or increases the productivity.
Example: Compare the Walston-Felix Model with the SEL model on a software
development expected to involve 8 person-years of effort.
Solution: The effort is E = 8 person-years = 96 person-months (PM). Then:
(d) Average manning is the average number of persons required per month in the
project.
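One way to carry out the comparison under this assumption (values rounded):
(a) Size: the SEL model gives L = (96/1.4)^(1/0.93) ≈ 94 KLOC, while the Walston-Felix model gives L = (96/5.2)^(1/0.91) ≈ 25 KLOC.
(b) Duration: the SEL model gives D = 4.6 * (94)^0.26 ≈ 15 months, while the Walston-Felix model gives D = 4.1 * (25)^0.36 ≈ 13 months.
(c) Productivity: roughly 94,000/96 ≈ 980 LOC/PM (SEL) versus 25,000/96 ≈ 260 LOC/PM (Walston-Felix).
(d) Average manning: 96/15 ≈ 6.4 persons per month (SEL) versus 96/13 ≈ 7.4 persons per month (Walston-Felix).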
COCOMO Model
Boehm proposed COCOMO (Constructive Cost Estimation Model) in [Link] is
one of the most generally used software estimation models in the world. COCOMO
predicts the efforts and schedule of a software product based on the size of the
software.
The initial estimate (also called nominal estimate) is determined by an equation of the
form used in the static single variable models, using KDLOC as the measure of the size.
To determine the initial effort Ei in person-months, the equation used is of the type
shown below:
Ei = a * (KDLOC)^b
The values of the constants a and b depend on the project type. In COCOMO, projects
are categorized into three types:
1. Organic
2. Semidetached
3. Embedded
1. Organic: A development project can be treated as of the organic type if the project
deals with developing a well-understood application program, the size of the
development team is reasonably small, and the team members are experienced in
developing similar types of projects. Examples of this type of project are simple
business systems, simple inventory management systems, and data processing
systems.
2. Semidetached: A development project can be treated as of the semidetached type if
the development team consists of a mixture of experienced and inexperienced staff
and the system has some unfamiliar aspects.
3. Embedded: A development project is of the embedded type if the software being
developed is strongly coupled to complex hardware, or if stringent regulations on the
operational procedures exist, as in ATM systems.
For the three product categories, Boehm provides a different set of expressions to predict
the effort (in units of person-months) and the development time from the size estimate
given in KLOC (Kilo Lines of Code). The effort estimation takes into account the
productivity loss due to holidays, weekly offs, coffee breaks, etc.
According to Boehm, software cost estimation should be done through three stages:
1. Basic Model
2. Intermediate Model
3. Detailed Model
1. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of
the project parameters. The following expressions give the basic COCOMO estimation
model:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Where
KLOC is the estimated size of the software product expressed in Kilo Lines of Code,
a1, a2, b1, b2 are constants for each category of software product,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person-
months (PMs).
Estimation of development effort
For the three classes of software products, the formulas for estimating the effort based
on the code size are shown below:
Organic: Effort = 2.4(KLOC)^1.05 PM
Semidetached: Effort = 3.0(KLOC)^1.12 PM
Embedded: Effort = 3.6(KLOC)^1.20 PM
Estimation of development time
For the three classes of software products, the formulas for estimating the development
time based on the effort are given below:
Organic: Tdev = 2.5(Effort)^0.38 Months
Semidetached: Tdev = 2.5(Effort)^0.35 Months
Embedded: Tdev = 2.5(Effort)^0.32 Months
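These expressions are easy to mechanize; the following minimal C sketch applies them (the function name cocomo_basic is ours):

#include <stdio.h>
#include <math.h>

/* Basic COCOMO: effort (PM) and development time (months) from KLOC.
   mode: 0 = organic, 1 = semidetached, 2 = embedded. */
void cocomo_basic(int mode, double kloc)
{
    const double a[] = {2.4, 3.0, 3.6};
    const double b[] = {1.05, 1.12, 1.20};
    const double d[] = {0.38, 0.35, 0.32};

    double effort = a[mode] * pow(kloc, b[mode]);  /* person-months */
    double tdev   = 2.5 * pow(effort, d[mode]);    /* months        */
    printf("Effort = %.2f PM, Tdev = %.2f Months\n", effort, tdev);
}

int main(void)
{
    cocomo_basic(1, 400.0);  /* semidetached, 400 KLOC (see Example 1) */
    return 0;
}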
Some insight into the basic COCOMO model can be obtained by plotting the estimated
characteristics for different software sizes. Fig shows a plot of estimated effort versus
product size. From the figure, we can observe that the effort is somewhat superlinear in the size
of the software product. Thus, the effort required to develop a product increases very
rapidly with project size.
The development time versus the product size in KLOC is plotted in the figure, from which it can
be observed that the development time is a sublinear function of the size of the
product, i.e. when the size of the product increases by two times, the time to develop
the product does not double but rises moderately. This can be explained by the fact that
for larger products, a larger number of activities which can be carried out concurrently
can be identified. The parallel activities can be carried out simultaneously by the
engineers. This reduces the time to complete the project. Further, from fig, it can be
observed that the development time is roughly the same for all three categories of
products. For example, a 60 KLOC program can be developed in approximately 18
months, regardless of whether it is of organic, semidetached, or embedded type.
From the effort estimation, the project cost can be obtained by multiplying the required
effort by the manpower cost per month. But, implicit in this project cost computation is
the assumption that the entire project cost is incurred on account of the manpower cost
alone. In addition to manpower cost, a project would incur costs due to hardware and
software required for the project and the company overheads for administration, office
space, etc.
It is important to note that the effort and the duration estimations obtained using the
COCOMO model are called a nominal effort estimate and nominal duration estimate.
The term nominal implies that if anyone tries to complete the project in a time shorter
than the estimated duration, then the cost will increase drastically. But, if anyone
completes the project over a longer period of time than the estimated, then there is
almost no decrease in the estimated cost value.
Example 1: Suppose a project was estimated to be 400 KLOC. Calculate the effort and
development time for each of the three modes, i.e., organic, semidetached, and
embedded.
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Estimated Size of project= 400 KLOC
(i) Organic Mode
E = 2.4 * (400)^1.05 = 1295.31 PM
D = 2.5 * (1295.31)^0.38 = 38.07 Months
(ii) Semidetached Mode
E = 3.0 * (400)^1.12 = 2462.79 PM
D = 2.5 * (2462.79)^0.35 = 38.45 Months
(iii) Embedded Mode
E = 3.6 * (400)^1.20 = 4772.81 PM
D = 2.5 * (4772.81)^0.32 ≈ 37.6 Months
Example 2: A project of size 200 KLOC is to be developed; the development team has
average experience on similar types of projects, and the schedule is not tight. Calculate
the effort, the development time, and the productivity.
Solution: The semidetached mode is the most appropriate mode, keeping in view the
size, the schedule, and the experience of the development team.
Hence E = 3.0 * (200)^1.12 = 1133.12 PM
D = 2.5 * (1133.12)^0.35 = 29.3 Months
P = 176 LOC/PM
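The productivity figure follows directly: P = 200,000 LOC / 1133.12 PM ≈ 176 LOC/PM, and the average staff size would be E/D = 1133.12 / 29.3 ≈ 38.7 persons.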
2. Intermediate Model: The basic COCOMO model assumes that the effort is only a
function of the number of lines of code and some constants calculated according to the
various software systems. In reality, however, no system's effort and schedule can be
computed solely from lines of code. The intermediate COCOMO model recognizes this
fact and refines the initial estimate obtained through the basic COCOMO model by using
a set of 15 cost drivers based on various attributes of software engineering.
Product attributes -
o Required software reliability
o Size of the application database
o Complexity of the product
Hardware attributes -
o Run-time performance constraints
o Memory constraints
o Volatility of the virtual machine environment
o Required turnaround time
Personnel attributes -
o Analyst capability
o Software engineering capability
o Applications experience
o Virtual machine experience
o Programming language experience
Project attributes -
o Use of software tools
o Application of software engineering methods
o Required development schedule
The effort is determined as a function of the program size estimate, and a set of cost
drivers is applied according to each phase of the software lifecycle.
Software Design
Software design is a mechanism to transform user requirements into some suitable
form, which helps the programmer in software coding and implementation. It deals with
translating the client's requirements, as described in the SRS (Software Requirement
Specification) document, into a form that is easily implementable using a programming
language.
The software design phase is the first step in the SDLC (Software Development Life Cycle)
that moves the concentration from the problem domain to the solution domain. In software
design, we consider the system to be a set of components or modules with clearly
defined behaviors and boundaries.
For software design, the goal is to divide the problem into manageable pieces.
These pieces cannot be entirely independent of each other as they together form the
system. They have to cooperate and communicate to solve the problem. This
communication adds complexity.
Note: As the number of partitions increases, the cost of partitioning and the complexity increase.
Abstraction
An abstraction is a tool that enables a designer to consider a component at an abstract
level without bothering about the internal details of the implementation. Abstraction
can be used for existing elements as well as for the component being designed. There
are two common abstraction mechanisms:
1. Functional Abstraction
2. Data Abstraction
Functional Abstraction
i. A module is specified by the function it performs.
ii. The details of the algorithm to accomplish the functions are not visible to the user of the
function.
Functional abstraction forms the basis for Function oriented design approaches.
Data Abstraction
Details of the data elements are not visible to the users of data. Data Abstraction forms
the basis for Object Oriented design approaches.
Modularity
Modularity refers to the division of software into separate modules that are
differently named and addressed and are integrated later to obtain the completely
functional software. It is the only property that allows a program to be intellectually
manageable. Single large programs are difficult to understand and read due to the large
number of reference variables, control paths, global variables, etc.
o Each module is a well-defined system that can be used with other applications.
o Each module has single specified objectives.
o Modules can be separately compiled and saved in the library.
o Modules should be easier to use than to build.
o Modules are simpler from outside than inside.
Advantages of Modularity
Disadvantages of Modularity
Modular Design
Modular design reduces the design complexity and results in easier and faster
implementation by allowing parallel development of various parts of a system. We
discuss the different aspects of modular design in detail in this section:
The use of information hiding as a design criterion for modular systems provides the most
significant benefits when modifications are required during testing and later during
software maintenance. This is because, as most data and procedures are hidden from
other parts of the software, inadvertent errors introduced during modifications are less
likely to propagate to other locations within the software.
Strategy of Design
A good system design strategy is to organize the program modules in such a way that
they are easy to develop and, later, to change. Structured design methods help
developers to deal with the size and complexity of programs. Analysts generate
instructions for the developers about how code should be composed and how pieces of
code should fit together to form a program. There are two strategies:
1. Top-down Approach
2. Bottom-up Approach
1. Top-down Approach: This approach starts with the identification of the main
components and then decomposing them into their more detailed sub-components.
2. Bottom-up Approach: A bottom-up approach begins with the lower details and
moves towards up the hierarchy, as shown in fig. This approach is suitable in case of an
existing system.
Coupling and Cohesion
Module Coupling
In software engineering, the coupling is the degree of interdependence between
software modules. Two modules that are tightly coupled are strongly dependent on
each other. However, two modules that are loosely coupled are not dependent on each
other. Uncoupled modules have no interdependence at all between them.
A good design is one that has low coupling. Coupling is measured by the number of
relations between the modules: coupling increases as the number of calls between
modules increases or as the amount of shared data grows. Thus, it can be said that
a design with high coupling will have more errors.
Types of Module Coupling
1. No Direct Coupling: There is no direct coupling between the two modules; they are
independent of each other.
2. Data Coupling: When data of one module is passed to another module, this is called
data coupling.
3. Stamp Coupling: Two modules are stamp coupled if they communicate using
composite data items such as structure, objects, etc. When the module passes non-
global data structure or entire structure to another module, they are said to be stamp
coupled. For example, passing structure variable in C or object in C++ language to a
module.
4. Control Coupling: Control Coupling exists among two modules if data from one
module is used to direct the structure of instruction execution in another.
5. External Coupling: External Coupling arises when two modules share an externally
imposed data format, communication protocols, or device interface. This is related to
communication to external tools and devices.
6. Common Coupling: Two modules are common coupled if they share information
through some global data items.
7. Content Coupling: Content Coupling exists among two modules if they share code,
e.g., a branch from one module into another module.
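To make a few of these categories concrete, here is a small illustrative C sketch (the names are ours, chosen for illustration): passing elementary data is data coupling, passing a whole structure is stamp coupling, and passing a flag that steers the callee's logic is control coupling.

#include <stdio.h>

struct employee { char name[32]; double base_pay; double overtime; };

/* Data coupling: only the elementary values needed are passed. */
double net_pay(double gross, double tax) { return gross - tax; }

/* Stamp coupling: a composite structure is passed, exposing all fields. */
double gross_pay(const struct employee *e) { return e->base_pay + e->overtime; }

/* Control coupling: the flag directs which branch the callee executes. */
void print_pay(double amount, int verbose)
{
    if (verbose)
        printf("Net pay due: %.2f\n", amount);
    else
        printf("%.2f\n", amount);
}

int main(void)
{
    struct employee e = {"A. Jones", 2000.0, 150.0};
    double gross = gross_pay(&e);        /* stamp coupling   */
    double net = net_pay(gross, 430.0);  /* data coupling    */
    print_pay(net, 1);                   /* control coupling */
    return 0;
}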
Module Cohesion
In computer programming, cohesion refers to the degree to which the elements of a
module belong together. Thus, cohesion measures the strength of the relationships
between pieces of functionality within a given module. For example, in highly cohesive
systems, functionality is strongly related.
Coupling | Cohesion
Coupling is also called Inter-Module Binding. | Cohesion is also called Intra-Module Binding.
Coupling shows the relationships between modules. | Cohesion shows the relationship within the module.
Coupling shows the relative independence between the modules. | Cohesion shows the module's relative functional strength.
While creating, you should aim for low coupling, i.e., dependency among modules should be less. | While creating, you should aim for high cohesion, i.e., a module should focus on a single function, with minimum interaction with other modules of the system.
In coupling, modules are linked to the other modules. | In cohesion, the module focuses on a single thing.
Function Oriented Design
Function Oriented design is a method to software design where the model is
decomposed into a set of interacting units or modules where each unit or module has a
clearly defined function. Thus, the system is designed from a functional viewpoint.
Design Notations
Design Notations are primarily meant to be used during the process of design and are
used to represent design or design decisions. For a function-oriented design, the design
can be represented graphically or mathematically by the following:
Data-flow diagrams are a useful and intuitive way of describing a system. They are
generally understandable without specialized training, notably if control information is
excluded. They show end-to-end processing. That is the flow of processing from when
data enters the system to where it leaves the system can be traced.
Data-flow design is an integral part of several design methods, and most CASE tools
support data-flow diagram creation. Different notations may use different icons to represent
data-flow diagram entities, but their meanings are similar.
Data Dictionaries
A data dictionary lists all data elements appearing in the DFD model of a system. The
data items listed include all data flows and the contents of all data stores appearing in
the DFDs of the DFD model of the system.
A data dictionary lists the purpose of all data items and the definition of all composite
data elements in terms of their component data items. For example, a data dictionary
entry may state that the data grossPay consists of the
components regularPay and overtimePay.
grossPay = regularPay + overtimePay
For the smallest units of data elements, the data dictionary lists their name and their
type.
A data dictionary plays a significant role in any software development process because
of the following reasons:
o A Data dictionary provides a standard language for all relevant information for use by
engineers working in a project. A consistent vocabulary for data items is essential since,
in large projects, different engineers of the project tend to use different terms to refer to
the same data, which unnecessarily causes confusion.
o The data dictionary provides the analyst with a means to determine the definition of
various data structures in terms of their component elements.
Structured Charts
It partitions a system into black boxes. A black box is a system whose functionality is
known to the user without knowledge of its internal design.
Pseudo-code
Pseudo-code notations can be used in both the preliminary and detailed design phases.
Using pseudo-code, the designer describes system characteristics using short, concise
English-language phrases that are structured by keywords such as If-Then-Else, While-
Do, and End.
Object-Oriented Design
In the object-oriented design method, the system is viewed as a collection of objects
(i.e., entities). The state is distributed among the objects, and each object handles its
state data. For example, in a Library Automation Software, each library representative
may be a separate object with its own data and functions to operate on that data. The
tasks defined for one object cannot refer to or change the data of other objects. Objects
have their own internal data, which represents their state. Similar objects form a class.
In other words,
each object is a member of some class. Classes may inherit features from the superclass.
1. Objects: All entities involved in the solution design are known as objects. For
example, person, banks, company, and users are considered as objects. Every
entity has some attributes associated with it and has some methods to perform
on the attributes.
2. Classes: A class is a generalized description of an object. An object is an instance
of a class. A class defines all the attributes, which an object can have and
methods, which represents the functionality of the object.
3. Messages: Objects communicate by message passing. Messages consist of the
identity of the target object, the name of the requested operation, and any other
information needed to perform the function. Messages are often implemented as
procedure or function calls.
4. Abstraction: In object-oriented design, complexity is handled using abstraction.
Abstraction is the removal of the irrelevant and the amplification of the essentials.
5. Encapsulation: Encapsulation is also called an information hiding concept. The
data and operations are linked to a single unit. Encapsulation not only bundles
essential information of an object together but also restricts access to the data
and methods from the outside world.
6. Inheritance: OOD allows similar classes to stack up in a hierarchical manner
where the lower or sub-classes can import, implement, and re-use allowed
variables and functions from their immediate parents. This property of OOD
is called inheritance. This makes it easier to define a specific class and to
create generalized classes from specific ones.
7. Polymorphism: OOD languages provide a mechanism where methods
performing similar tasks but varying in arguments can be assigned the same name.
This is known as polymorphism, which allows a single interface to perform
functions for different types. Depending upon how the service is invoked, the
respective portion of the code gets executed.
Command Line Interface (CLI): The user interacts by typing commands.
Advantages
o More customization options, and easier to customize.
o Typically capable of carrying out more powerful tasks.
Disadvantages
o Relies heavily on recall rather than recognition.
o Navigation is often more difficult.
Graphical User Interface (GUI): GUI relies much more heavily on the mouse. A typical
example of this type of interface is any versions of the Windows operating systems.
GUI Characteristics
Characteristics Descriptions
Windows: Multiple windows allow different information to be displayed simultaneously on the user's screen.
Icons: Icons represent different types of information. On some systems, icons represent files; on others, icons represent processes.
Menus: Commands are selected from a menu rather than typed in a command language.
Pointing: A pointing device such as a mouse is used for selecting choices from a menu or indicating items of interest in a window.
Graphics: Graphics elements can be mixed with text on the same display.
Advantages
o Less expert knowledge is required to use it.
o Easier to navigate; one can look through folders quickly in a guess-and-check manner.
o The user may switch quickly from one task to another and can interact with several
different applications.
Disadvantages
o Typically fewer options.
o Usually less customizable; it is not easy to use one button for many different variations.
UI Design Principles
Structure: The design should organize the user interface purposefully, in meaningful
and useful ways based on precise, consistent models that are apparent and recognizable to
users, putting related things together and separating unrelated things, differentiating
dissimilar things, and making similar things resemble one another. The structure
principle is concerned with the overall user interface architecture.
Simplicity: The design should make simple, common tasks easy, communicating
clearly and directly in the user's language and providing good shortcuts that are
meaningfully related to longer procedures.
Visibility: The design should make all required options and materials for a given
function visible without distracting the user with extraneous or redundant data.
Feedback: The design should keep users informed of actions or interpretations, changes
of state or condition, and errors or exceptions that are relevant and of interest to the user,
through clear, concise, and unambiguous language familiar to users.
Tolerance: The design should be flexible and tolerant, decreasing the cost of mistakes and
misuse by allowing undoing and redoing, while also preventing errors wherever possible
by tolerating varied inputs and sequences and by interpreting all reasonable actions.
Coding
Coding is the process of transforming the design of a system into a computer
language format. This phase of software development is concerned with
translating the design specification into source code. It is necessary to write
source code and internal documentation so that conformance of the code to its
specification can be easily verified.
Coding is done by coders or programmers, who may be different people from the
designers. The goal is not only to reduce the effort and cost of the coding phase, but
also to cut the cost of later stages. The cost of testing and maintenance can be
significantly reduced with efficient coding.
Goals of Coding
1. To translate the design of the system into a computer language format: Coding
is the process of transforming the design of a system into a computer
language format that can be executed by a computer and that performs the tasks
specified by the design produced during the design phase.
2. To reduce the cost of later phases: The cost of testing and maintenance can be
significantly reduced with efficient coding.
3. Making the program more readable: A program should be easy to read and
understand. Having readability and understandability as a clear objective of the
coding activity can itself help in producing more maintainable software.
For implementing our design into code, we require a high-level language. A
programming language should have the following characteristics:
Brevity: The language should have the ability to implement the algorithm with a smaller
amount of code. Programs written in high-level languages are often significantly shorter
than their low-level equivalents.
A coding standard lists several rules to be followed during coding, such as the way
variables are to be named, the way the code is to be laid out, error return conventions,
etc.
Coding Standards
General coding standards refer to how the developer writes code; here we will
discuss some essential standards regardless of the programming language being used.
Coding Guidelines
General coding guidelines provide the programmer with a set of best practices
that can be used to make programs easier to read and maintain. Most of
the examples use C syntax, but the guidelines can be applied to all
languages.
2. Spacing: The appropriate use of spaces within a line of code can improve readability.
Example:
Bad:
cost=price+(price*sales_tax);
fprintf(stdout,"The total cost is %5.2f\n",cost);
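Good (the same statements with proper spacing):
cost = price + (price * sales_tax);
fprintf(stdout, "The total cost is %5.2f\n", cost);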
4. The length of any function should not exceed 10 source lines: A very lengthy
function is generally very difficult to understand, as it probably carries out many different
functions. For the same reason, lengthy functions are likely to have a
disproportionately larger number of bugs.
5. Do not use goto statements: Use of goto statements makes a program unstructured
and very tough to understand.
Programming Style
Programming style refers to the technique used in writing the source code for a
computer program. Most programming styles are designed to help programmers
quickly read and understand the program as well as avoid making errors. (Older
programming styles also focused on conserving screen space.) A good coding style can
overcome the many deficiencies of a first programming language, while poor style can
defeat the intent of an excellent language.
2. Naming: In a program, you are required to name modules, processes, variables,
and so on. Care should be taken that the naming style is not cryptic or
non-representative.
3. Control Constructs: It is desirable that, as much as possible, single-entry and
single-exit constructs are used.
4. Information hiding: The information held in the data structures should be hidden
from the rest of the system where possible. Information hiding can decrease the
coupling between modules and make the system more maintainable.
5. Nesting: Deep nesting of loops and conditions greatly harms the static and dynamic
behavior of a program. It also makes the program logic difficult to understand, so it is
desirable to avoid deep nesting.
6. User-defined types: Make heavy use of user-defined data types like enum, class,
structure, and union. These data types make your program code easy to write and easy
to understand.
7. Module size: The module size should be uniform. The size of the module should not
be too big or too small. If the module size is too large, it is not generally functionally
cohesive. If the module size is too small, it leads to unnecessary overheads.
Structured Programming
In structured programming, we sub-divide the whole program into small modules so
that the program becomes easy to understand. The purpose of structured programming
is to linearize control flow through a computer program so that the execution sequence
follows the sequence in which the code is written. The dynamic structure of the program
then resembles the static structure of the program. This enhances the readability,
testability, and modifiability of the program. This linear flow of control can be managed
by restricting the set of allowed control constructs to single-entry, single-exit
formats.
Rule 2 of Structured Programming: A sequence of two or more code blocks is itself
structured, as shown in the figure.
Structured Rule Three: Alternation
If-then-else is frequently called alternation (because there are alternative options). In
structured programming, each choice is a code block. If alternation is organized as in
the flowchart at right, then there is one entry point (at the top) and one exit point (at the
bottom). The structure should be coded so that if the entry conditions are fulfilled, then
the exit conditions are satisfied (just like a code block).
Rule 5 of Structured Programming: A structure (of any size) that has a single entry
point and a single exit point is equivalent to a code block. For example, we are
designing a program to go through a list of signed integers calculating the absolute
value of each one. We may (1) first regard the program as one block, then (2) sketch in
the iteration required, and finally (3) put in the details of the loop body, as shown in the
figure.
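A minimal C sketch of the finished loop body (array and length assumed as inputs):

/* Replace each element of a[] with its absolute value. The loop is a
   single-entry, single-exit structured block (Rule 5). */
void absolute_values(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] < 0)
            a[i] = -a[i];
}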
The other control structures, such as case, do-until, do-while, and for, are not strictly
needed. However, they are sometimes convenient and are usually regarded as part of
structured programming; in assembly language, they add little convenience.
Halstead’s metrics facilitate software maintenance prediction by providing quantitative measures of a program's complexity and effort. By assessing program length, vocabulary, volume, and difficulty, these metrics estimate the work required for maintenance activities. A higher estimated program effort indicates a potentially more complex maintenance task, necessitating more detailed planning and resources. Additionally, the metrics' simplicity and applicability to various languages assist in standardizing maintenance efforts across different projects, promoting better organization and resource allocation for maintenance tasks.
Halstead's potential minimum volume (V*) represents the smallest volume a program can achieve for a given problem, calculated as V* = (2 + n2*) * log2(2 + n2*), with n2* being the count of unique input and output parameters. V* is used to assess the theoretical efficiency of a program; a higher program level (L) indicates a closer alignment between actual and potential volume, denoting a more efficient implementation. This metric helps developers gauge the extent to which a program's code can be optimized, encouraging the creation of more concise and efficient programs.
Halstead's Metrics calculate programming effort (E) using the formula E = V * D, where V is the program volume and D is the program difficulty. Program volume (V) is determined by the formula V = N * log2(n), where N is the program length (total operator and operand occurrences) and n is the vocabulary size (unique operators and operands). Difficulty (D) is calculated as (n1/2) * (N2/n2), with n1 as the number of unique operators, N2 as the total operand occurrences, and n2 as the number of unique operands. These metrics provide a quantitative measure of the complexity and effort required to develop a program.
Halstead Metrics differ from traditional software metrics in that they focus on the number of operators and operands, using these to derive measures like program length, vocabulary, volume, difficulty, and effort. Traditional metrics, such as Lines of Code or cyclomatic complexity, often measure code size or control flow complexity. Halstead Metrics provide a more abstract, language-independent view of complexity, offering insights into the cognitive effort required for software development. This abstract approach allows Halstead Metrics to quantifiably predict error rates and maintenance needs, complementing traditional metrics that often focus on structural aspects.
Halstead’s Software Metrics have several limitations: they assume all operators and operands are equally important and suitable for different programming languages and environments, which may not be the case. They focus solely on code complexity and effort, overlooking other factors like reliability and maintainability. The metrics rely on specific assumptions, possibly limiting their accuracy in diverse contexts, such as highly interactive or real-time systems. Consequently, while useful, Halstead's Metrics may not fully represent all relevant aspects of software development and require complementary approaches for more comprehensive assessments.
The COCOMO model categorizes software projects into three types: organic, semidetached, and embedded. Organic projects involve well-understood applications with small, experienced teams (e.g., business systems), while semidetached projects consist of mixed-experience teams working on somewhat unfamiliar systems (e.g., a new OS). Embedded projects involve software tightly coupled with complex hardware and stringent regulations (e.g., ATM systems). For each category, COCOMO uses project size expressed in Kilo Lines of Code (KLOC) alongside constants specific to each category to estimate effort (person-months) and development time (months).
Using both internal and external software metrics is essential for a comprehensive software evaluation. Internal metrics, such as Lines of Code, focus on development aspects, helping developers assess complexity and internal quality. External metrics, targeting user-facing features like portability, reliability, and usability, address end-user concerns and usability. Combining these metrics provides a holistic understanding of software performance, balancing development demands with user requirements. This integrated evaluation facilitates informed decision-making in the development and deployment stages, optimizing overall software quality and stakeholder satisfaction.
Software metrics are classified into product metrics, process metrics, internal metrics, external metrics, and hybrid metrics. Product metrics measure characteristics such as size, complexity, quality, and reliability of the software product itself. Process metrics focus on the software development process, evaluating the efficiency of methods and tools, such as the effectiveness of fault detection. Internal metrics are important to software developers, like Lines of Code (LOC), while external metrics are significant to end-users, encompassing attributes like portability and usability. Hybrid metrics incorporate aspects of product, process, and resources, such as cost per Function Point. These distinctions allow organizations to address different aspects of software development and maintenance.
Data structure metrics help understand software development effort by focusing on data input, processing, and output within a program or module. They measure the amount of data handled and its internal usage, identifying program weaknesses and data sharing among modules. These insights guide developers in optimizing data handling strategies to reduce effort and time required for completion. Emphasizing variables and constants allows these metrics to offer a detailed view of the data's role in a system, aiding in efficient resource allocation and effort estimation.
Software metrics are vital for project management as they aid in planning, controlling, and improving software projects. Metrics such as project metrics help managers track progress by comparing current effort, cost, and time against original estimates. They inform decisions, such as resource allocation, reducing development costs, and identifying risks early. Additionally, metrics facilitate the comparative study of design methodologies, programming languages, and staff productivity, which aids in strategic planning and optimizing resource utilization. As projects progress, metrics are crucial for quality improvement, reducing errors, and guiding design trade-offs between development and maintenance costs.