
S/W Estimation

Cost Estimation Models in Software Engineering

 Cost estimation refers to the techniques used to predict what a software project will cost.

 The cost estimate is the financial outlay required for the effort to develop and test the
software.

 Cost estimation models are mathematical algorithms or parametric equations used to
estimate the cost of a product or a project.
Various techniques or models are available for cost estimation, also known as
Cost Estimation Models, as shown below:
Empirical Estimation Technique –

 Empirical estimation is a technique or model in which empirically derived formulas are used to
predict the data required in the software project planning step.
 These techniques are based on data collected from previous projects, together with guesses,
assumptions, and prior experience with the development of similar projects.
 It uses the size of the software to estimate the effort.
 In this technique, an educated guess of project parameters is made.
 Hence, these models are based on common sense.
 However, since many activities are involved, empirical estimation techniques have over time been
formalized.
 Examples include the Delphi technique and the Expert Judgement technique; a sketch of a
Delphi-style round is given below.
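As a rough illustration, the hypothetical helper below (not a standard tool) summarizes one round of a Delphi-style estimate: experts submit effort estimates anonymously, the coordinator feeds back the minimum, median, and maximum, and rounds continue until the spread of estimates is acceptably small. All numbers are illustrative.

    # Minimal sketch of one Delphi-style estimation round (hypothetical helper).
    # Experts submit effort estimates (e.g., in person-months) anonymously; the
    # coordinator feeds back a summary and rounds continue until the estimates
    # agree within a chosen tolerance.
    from statistics import median

    def delphi_round(estimates, tolerance=0.15):
        """Summarize one round and report whether consensus is reached."""
        mid = median(estimates)
        spread = (max(estimates) - min(estimates)) / mid
        summary = {"min": min(estimates), "median": mid, "max": max(estimates)}
        return summary, spread <= tolerance

    # Example: three rounds of estimates converging on about 24 person-months.
    rounds = [[15, 30, 40], [20, 25, 32], [23, 24, 25]]
    for i, estimates in enumerate(rounds, start=1):
        summary, done = delphi_round(estimates)
        print(f"Round {i}: {summary}, consensus={done}")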
Heuristic Technique –

 The word “heuristic” is derived from a Greek word meaning “to discover”.
 A heuristic technique is a practical method for solving problems, learning, or discovery that is
good enough for achieving immediate goals.
 These techniques are flexible and simple, allowing quick decisions through shortcuts and
good-enough calculations, especially when working with complex data.
 However, the decisions made using this technique are not necessarily optimal.
 In this technique, the relationship among different project parameters is expressed using
mathematical equations.
 The most popular heuristic technique is the Constructive Cost Model (COCOMO); a sketch of its
basic equations is given after this list.
 This technique is also used to speed up analysis and investment decisions.
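As an illustration, the Basic COCOMO equations relate effort E (in person-months) and development time T (in months) to the estimated size in KLOC as E = a(KLOC)^b and T = c(E)^d, with constants depending on the project mode. The sketch below uses the standard Basic COCOMO coefficients; the 32 KLOC figure in the example is illustrative only.

    # Minimal sketch of the Basic COCOMO equations: effort and development time
    # are expressed as simple functions of estimated size (KLOC).
    # Coefficients are the standard Basic COCOMO values for each project mode.
    COCOMO_MODES = {
        #                a     b     c     d
        "organic":      (2.4, 1.05, 2.5, 0.38),
        "semidetached": (3.0, 1.12, 2.5, 0.35),
        "embedded":     (3.6, 1.20, 2.5, 0.32),
    }

    def basic_cocomo(kloc, mode="organic"):
        """Return (effort in person-months, development time in months)."""
        a, b, c, d = COCOMO_MODES[mode]
        effort = a * kloc ** b          # E = a * (KLOC)^b
        time = c * effort ** d          # T = c * E^d
        return effort, time

    # Example: a 32 KLOC organic-mode project.
    effort, time = basic_cocomo(32, "organic")
    print(f"Effort = {effort:.1f} person-months, schedule = {time:.1f} months")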
Analytical Estimation Technique –

 Analytical estimation is a technique used to measure work.

 In this technique, the task is first broken down into its basic component operations or elements
for analysis.
 Second, if standard times are available from other sources, they are applied to each element or
component of work.
 Third, if no such times are available, the work is estimated from experience with similar work.
 In this technique, results are derived by making certain basic assumptions about the project.
 Hence, the analytical estimation technique has some scientific basis; a sketch of the
element-by-element calculation is given below.
 Halstead’s software science is an example of an analytical estimation technique.
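A minimal sketch of that element-by-element calculation follows; the operation names, standard times, and counts are illustrative only and not taken from any published table.

    # Minimal sketch of analytical (work-measurement) estimation.
    # Each component operation gets either a standard time from past records
    # or, if none exists, an experience-based estimate.  Numbers are illustrative.
    standard_times = {          # hours per occurrence, from past measurements
        "design form": 6.0,
        "write validation rule": 1.5,
        "write unit test": 1.0,
    }

    task_elements = [           # (operation, count, fallback estimate in hours)
        ("design form", 4, None),
        ("write validation rule", 10, None),
        ("write unit test", 25, None),
        ("integrate payment gateway", 1, 16.0),   # no standard time available
    ]

    total = 0.0
    for operation, count, fallback in task_elements:
        per_unit = standard_times.get(operation, fallback)
        total += count * per_unit
    print(f"Estimated work: {total:.1f} hours")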
Project size estimation techniques

Estimation of the size of the software is an essential part of Software Project Management.
It helps the project manager to further predict the effort and time which will be needed to build
the project.
Various measures are used in project size estimation.
Some of these are: 
 Lines of Code
 Number of entities in ER diagram
 Total number of processes in detailed data flow diagram
 Function points
1. Lines of Code (LOC):

As the name suggests, LOC counts the total number of lines of source code in a project.
The units of LOC are:
 KLOC - Thousand lines of code
 NLOC - Non-comment lines of code
 KDSI - Thousands of delivered source instructions
 The size is estimated by comparing the project with existing systems of the same kind.
 Experts use it to predict the required size of the various components of the software and then
add them to get the total size.

Advantages:

•Universally accepted and used in many models like COCOMO.
•Estimation is closer to the developer’s perspective.
•Simple to use.

Disadvantages:

•The same functionality requires a different number of lines in different programming languages.
•No proper industry standard exists for this technique.
•It is difficult to estimate the size using this technique in the early stages of the
project.
2. Number of entities in ER diagram:

 The ER model provides a static view of the project.

 It describes the entities and their relationships.
 The number of entities in the ER model can be used to estimate the size of the project.
 The size of the project grows with the number of entities, because more entities require more
classes/structures and therefore more code.

Advantages:

•Size estimation can be done during the initial stages of planning.
•The number of entities is independent of the programming technologies used.

Disadvantages:

•No fixed standards exist; some entities contribute more to project size than others.
•It is rarely used directly in cost estimation models, so the entity count must first be converted
to LOC, as sketched below.
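A hedged sketch of that conversion is given below; the calibration value (average LOC contributed per entity) is purely hypothetical and would have to be derived from an organization's own completed projects.

    # Minimal sketch: converting an ER-model entity count into a LOC estimate.
    # avg_loc_per_entity is a hypothetical calibration value that would have to
    # be derived from an organization's own completed projects.
    def size_from_entities(entity_count, avg_loc_per_entity=400):
        """Rough LOC estimate from the number of entities in the ER diagram."""
        return entity_count * avg_loc_per_entity

    # Example: an ER diagram with 18 entities.
    print(size_from_entities(18))   # -> 7200 (estimated LOC)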
3. Total number of processes in detailed data flow diagram: 

A Data Flow Diagram (DFD) represents the functional view of the software.

 The model depicts the main processes/functions involved in the software and the flow of data
between them.
 The number of processes/functions in the DFD is used to predict the software size.
 Existing processes of a similar type are studied and used to estimate the size of each process.
 The sum of the estimated sizes of the individual processes gives the final estimated size, as
sketched below.
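A minimal sketch of that process-by-process summation is shown below; the process names and the per-process sizes borrowed from similar past processes are illustrative only.

    # Minimal sketch: estimating software size from the processes in a DFD.
    # Each process is sized by analogy with similar, already-built processes;
    # the per-process figures below are illustrative only.
    estimated_process_sizes = {     # process in the DFD -> estimated LOC
        "validate order": 350,
        "update inventory": 500,
        "generate invoice": 420,
        "notify customer": 180,
    }

    total_size = sum(estimated_process_sizes.values())
    print(f"Estimated size: {total_size} LOC")   # 1450 LOC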

Advantages:

•It is independent of the programming language.
•Each major process can be decomposed into smaller processes, which increases the accuracy of
the estimate.

Disadvantages:

•Studying similar kinds of processes to estimate size takes additional time and effort.
•Not all software projects require the construction of a DFD.
4. Function Point Analysis:

 The function point is a "unit of measurement" that expresses the amount of business functionality an
information system (as a product) provides to a user. Function points are used to compute a functional size
measurement (FSM) of software. The cost (in dollars or hours) of a single unit is calculated from past projects.
 In this method, the number and type of functions supported by the software are used to find the FPC (function
point count).
The steps in function point analysis are:
• Count the number of functions of each proposed type.
• Compute the Unadjusted Function Points (UFP).
• Each function identified above is further classified as simple, average, or
complex, and a weight is given to each. The sum of the weights quantifies the
size of the information processing and is referred to as the Unadjusted
Function Points.
• Find the Total Degree of Influence (TDI).
• The application is rated on 14 GSCs (General System Characteristics); GSCs are
user business constraints independent of technology, and each characteristic has
associated descriptions to determine its degree of influence. The sum of the
values of the 14 GSCs is termed the Total Degree of Influence (TDI).
• Compute the Value Adjustment Factor (VAF).
• The Value Adjustment Factor (VAF) is based on the 14 GSC ratings and adjusts the
UFP for the general functionality of the application being counted.
• Find the Function Point Count (FPC).
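These steps can be tied together with the standard relations UFP = sum of (count x weight) over the five function types, VAF = 0.65 + 0.01 x TDI, and FPC = UFP x VAF. The sketch below uses the standard complexity weights; the function counts and GSC ratings in the example are illustrative only.

    # Minimal sketch of Function Point Analysis using the standard complexity
    # weights.  The function counts and the 14 GSC ratings are illustrative.
    WEIGHTS = {                      # (simple, average, complex)
        "EI":  (3, 4, 6),            # External Inputs
        "EO":  (4, 5, 7),            # External Outputs
        "EQ":  (3, 4, 6),            # External Inquiries
        "ILF": (7, 10, 15),          # Internal Logical Files
        "EIF": (5, 7, 10),           # External Interface Files
    }
    LEVEL = {"simple": 0, "average": 1, "complex": 2}

    def function_point_count(counts, gsc_ratings):
        """counts: {(type, level): number of functions}; gsc_ratings: 14 values, 0-5 each."""
        ufp = sum(n * WEIGHTS[ftype][LEVEL[level]]
                  for (ftype, level), n in counts.items())
        tdi = sum(gsc_ratings)                 # Total Degree of Influence
        vaf = 0.65 + 0.01 * tdi                # Value Adjustment Factor
        return round(ufp * vaf, 2)             # FPC = UFP x VAF

    counts = {("EI", "average"): 10, ("EO", "simple"): 8,
              ("EQ", "average"): 5, ("ILF", "complex"): 3, ("EIF", "average"): 2}
    gsc = [3] * 14                             # all 14 GSCs rated 3 -> TDI = 42
    print(function_point_count(counts, gsc))   # -> 161.57 (UFP = 151, VAF = 1.07)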
Size Metrics

 To solve different problems on a computer, programs are developed, written, and implemented by
different programmers.
 To achieve different objectives, programs are written in different programming languages.
 Some programs are written in C or C++, a few in Pascal and FORTRAN, some in COBOL, and others in
VB, VC++, Java, Ada, and so on.
 Some programs are of good quality, well documented, and written with the latest software engineering
techniques, while others are written in a "quick-and-dirty" way with no comments and no planning at all.
 Despite all this, there is one common feature that all programs share: all have size.

 Size is a very simple and important metric for the software industry. It has many useful characteristics:
 It is very easy to calculate once the program is completed.
 It is one of the most important parameters for many software development models, such as cost and
effort estimation models.
 Productivity is also expressed in terms of the size measure.
 Memory requirements can also be decided on the basis of the size measure.
 The principal size measures, which have received more attention than others, are:
 1. Lines of Code (LOC)
 2. Token count
 3. Function count
Lines of Code

 It is one of the earliest and simplest metrics for calculating the size of a computer program.
 It is generally used in calculating and comparing the productivity of programmers.
 Productivity is commonly expressed as LOC produced per unit of effort (for example, per person-month).
 Among researchers, there is no general agreement on what constitutes a line of code.
 Due to the lack of a standard and precise definition of the LOC measure, different workers may obtain
different counts for the same program.
 Further, it gives equal weight to each line of code.
 But, in fact, some statements of a program are more difficult to code and comprehend than others.
 Despite all this, the metric continues to be popular and useful in the software industry because of its
simplicity.
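For example (figures illustrative only), a team that delivers 10,000 LOC in 20 person-months has a productivity of 10,000 / 20 = 500 LOC per person-month under this metric.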

 The most important requirement for this metric is a precise and standard definition.
 There is general agreement among researchers that this measure should not include comment and blank
lines, because these are used only for internal documentation of the program.
 Their presence or absence does not affect the functionality or efficiency of the program.
 Some observers are also of the view that only executable statements should be included in the count,
because only these support the functions of the program.
 The predominant definition of the LOC measure used today by software personnel is:
 “Any line of program text that is not a comment or blank line, regardless of the number of statements or
parts of statements on the line, is considered a line of code (LOC). This specifically includes all lines
containing program headers, declarations, and executable and non-executable statements.”
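As a concrete illustration of such a counting rule, here is a minimal sketch for Python source files, where a full-line comment starts with “#”. Real counting tools must also handle block comments, strings containing “#”, and other language-specific cases, so this is only an approximation.

    # Minimal sketch of a LOC counter that ignores blank lines and full-line
    # comments (Python-style '#' comments).
    def count_loc(path):
        loc = 0
        with open(path, encoding="utf-8") as src:
            for line in src:
                stripped = line.strip()
                if stripped and not stripped.startswith("#"):
                    loc += 1                    # non-blank, non-comment line
        return loc

    # Example usage (hypothetical file name):
    # print(count_loc("payroll.py"))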
Token Count

 The drawback of the LOC size measure treating all lines alike can be addressed by giving more weight to
lines that are difficult to code and carry more “stuff”.
 One natural solution is to count the basic symbols used in a line instead of the lines themselves.
 These basic symbols are called “tokens”.
 Such a scheme was used by Halstead in his theory of software science.
 In this theory, a computer program is considered to be a collection of tokens, which may be classified as
either operators or operands.
 All software science metrics can be defined in terms of these basic symbols.
 The basic measures are:
 n1 = count of unique operators
 n2 = count of unique operands
 N1 = count of total occurrences of operators
 N2 = count of total occurrences of operands

 An operator can be defined as a symbol or keyword that specifies an action.

 Operators include arithmetic and relational symbols, punctuation marks, special symbols (like braces
and =), reserved words/keywords (like WHILE, DO, READ), and function names like printf(), scanf(), etc.
 A token that receives the action and is used to represent the data is called an operand.
 Operands include variables, constants, and even labels.
 In terms of the total tokens used, the size of the program can be expressed as N = N1 + N2.
 At present, there is no general agreement among researchers on counting rules for the classification of
these tokens.
 These rules are made by the programmer for his/her convenience.
 The counting rules also depend upon the programming language.
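Since counting rules vary, the minimal sketch below takes already-classified operator and operand streams as input and computes the basic counts together with the program length N = N1 + N2; the vocabulary n = n1 + n2 and the volume V = N log2 n are standard software-science measures included here for completeness.

    # Minimal sketch of Halstead's basic token measures.  Token classification
    # rules vary by language and convention, so the (already classified)
    # operator and operand streams are taken as input.
    from math import log2

    def halstead_basic(operators, operands):
        n1 = len(set(operators))     # unique operators
        n2 = len(set(operands))      # unique operands
        N1 = len(operators)          # total operator occurrences
        N2 = len(operands)           # total operand occurrences
        N = N1 + N2                  # program length
        n = n1 + n2                  # program vocabulary
        V = N * log2(n)              # volume (standard software-science measure)
        return {"n1": n1, "n2": n2, "N1": N1, "N2": N2, "N": N, "V": round(V, 1)}

    # Tokens of the statement  a = b + c * b  (illustrative classification):
    operators = ["=", "+", "*"]
    operands = ["a", "b", "c", "b"]
    print(halstead_basic(operators, operands))
    # {'n1': 3, 'n2': 3, 'N1': 3, 'N2': 4, 'N': 7, 'V': 18.1}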
Function Count

 The size of a large software product can be estimated better through a larger unit, called a module,
than through the LOC measure.
 A module can be defined as a segment of code that may be compiled independently.
 For large software systems, it is easier to predict the number of modules than the number of lines of code.
 For example, let a software product require n modules. It is generally agreed that the size of a module
should be about 50-60 lines of code. Therefore, the size estimate of this software product is about n x 60
lines of code (a small worked instance is given after this list). But this metric requires precise and strict
rules for dividing a program into modules; in the absence of such rules, the metric may not be very useful.
 A module may consist of one or more functions. In a program, a function may be defined as a group of
executable statements that performs a definite task. The number of lines of code for a function should not
be very large, because human memory is limited and a programmer cannot perform a task efficiently if the
amount of information to be manipulated is large.
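For example (the module count is illustrative only), a product expected to require about 40 modules would be estimated at roughly 40 x 60 = 2,400 lines of code, and that LOC figure could then feed a cost model such as COCOMO.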
