Effort Estimation

 As software has grown in size from small to medium and large systems, the need for precision and correctness in software estimation has also grown.

 Effective prediction is influenced by several known or unknown factors: imprecise and drifting requirements; newness (of the project, the technology, or both); pressure to fit estimates to the available time and budget; impractical plans or heavy changes of plan during project execution; the type and size of the software; programming languages; team capability; and the stage of development at which the estimation is conducted.
Estimation Methods
 Intuition Based/Experience based

 Algorithmic Models

 Managerial based

 Soft computing amalgamated


Estimating Principle
 Project Size × Project Attributes yields the estimates in the form of
 Effort
 Cost
 Schedule
 Deliverables
Affecting Factors
 Rate of Change
 Experience of development team
 Process model
 Project Size
 Development Languages
 Reusability
 Delivery dates
 And many more….
What needs to be estimated

 Size of the project


 Effort required for the project
 Delivery time for the project/product
 Staff required for completion
Intuition Based
 Expert Judgment
 Expert judgment, sometimes known as the Delphi technique, is one of the most widely used methods; it is often described as an educated-guessing technique based on the intuition of experts.
 The different expert opinions are then analyzed to predict estimates.
 Human experts provide cost and schedule estimates based on their experience. Usually they have knowledge of similar development efforts, and the degree of similarity is relative to their understanding of the proposed project.
 The guess is a sophisticated judgment supported by a variety of tools. The whole estimation process can be a group discussion that ends in an effort estimate all parties agree on.
Shortcomings of Expert Judgment

① Subjective in nature
② For the same problem, different estimators will produce different estimates.
③ The estimator's experience level affects the estimate.
④ Unstructured process
⑤ Hard to convince customer
⑥ Difficulty in validating estimate.
Intuition Based cont…
Analogy
 Used to estimate effort for a new problem by analyzing the solutions that were applied to similar old problems. The analogy method usually follows a three-step process.

① Selection of relevant cost attributes


 The relevant cost attributes may be determined by searching for the best possible combination of variables, for example with analogy-based estimation tools such as ANGEL, ACE, and ESTOR.

② Selection of suitable similarity/distance functions


 Similarity (distance) functions are defined in order to compare the datasets of different projects.

③ Number of analogues to consider for prediction


 For prediction, one may use the effort value of the most similar analogue. When considering more than one analogue, the simple mean (e.g. the mean effort of the two most similar retrieved projects) or a weighted mean may be used.
Shortcoming of Analogy
1) Availability of appropriate analogue.
2) A sound strategy to select analogue.
3) Accuracy of the data used for both the
analogue and the new project.
Managerial Based
 Work breakdown structure (WBS)
I. Teams work on different tasks.
II. The top-down approach suggests that total effort can be estimated without decomposing the project into smaller activities or parts.
III. The bottom-up approach suggests that the work should first be broken down into a number of activities, which are then estimated individually.
IV. Discuss any idea.
Software Metric
 Quantifiable measures that could be used to measure
characteristics of a software system or the software
development process
 Managers need quantifiable information, and not subjective
information
 Measure: Quantitative indication of the extent, amount,
dimension, or size of some attribute of a product or process.
 Indicators: A combination of metrics that provides insight into
the software process, project or product
 Direct Metrics: Immediately measurable attributes (e.g. lines of code, execution speed, defects reported)
 Indirect Metrics: Aspects that are not immediately quantifiable
(e.g. functionality, reliability)
Software Metric cont…
 Types of Metrics

 Product metrics
◦ quantify characteristics of the product being
developed
 size, cost
 Process metrics
◦ quantify characteristics of the process being
used to develop the software
 efficiency of fault detection
Cont…
 A Few Well-Known Metrics

 Line of Code (LOC)


 Halstead Equation
 McCabe’s Cyclomatic Complexity
 Function Points
Line of Code (LOC)
 perhaps the simplest:
◦ The lines of code measures are the most traditional measures used to
quantify software complexity and for estimating software development
effort
◦ count the number of lines of code (LOC) and use as a measure of
program complexity.
◦ Simple to use:
 if errors occur at 2% per line, a 5000 line program should have about 100
errors.

Some common Measures

◦ productivity KLOC/person-month
◦ quality faults/KLOC
◦ cost $$/KLOC
◦ documentation doc_pages/KLOC
LOC cont…
 Why used?
◦ early systems emphasis on coding

 Criticisms
◦ cross-language inconsistencies
◦ within language counting variations
◦ change in program structure can affect count
◦ stimulates programmers to write lots of code
◦ system-oriented, not user-oriented
How many Lines of Code in this program?

#define LOWER 0    /* lower limit of table */
#define UPPER 300  /* upper limit */
#define STEP  20   /* step size */

main()  /* print a Fahrenheit-Celsius conversion table */
{
    int fahr;
    for (fahr = LOWER; fahr <= UPPER; fahr = fahr + STEP)
        printf("%4d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
}
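The answer depends on the counting rule, which is exactly the "counting variations" criticism. A rough sketch under assumed rules (physical lines versus lines that still contain code after stripping blank lines and `/* ... */` comments; the comment stripping is deliberately crude and handles single-line comments only):

```python
import re

# The conversion-table program from the slide, as one source string.
c_source = """\
#define LOWER 0    /* lower limit of table */
#define UPPER 300  /* upper limit */
#define STEP  20   /* step size */

main()  /* print a Fahrenheit-Celsius conversion table */
{
    int fahr;
    for (fahr = LOWER; fahr <= UPPER; fahr = fahr + STEP)
        printf("%4d %6.1f\\n", fahr, (5.0/9.0)*(fahr-32));
}
"""

physical = c_source.splitlines()

def is_code(line):
    # A line counts as code if anything remains after removing
    # /* ... */ comments and surrounding whitespace.
    return bool(re.sub(r"/\*.*?\*/", "", line).strip())

print(len(physical))                            # physical lines
print(sum(1 for l in physical if is_code(l)))   # non-blank code lines
```

The two rules already disagree (10 versus 9 here); across real programs and languages the gap is far larger.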
Halstead Equation
 Give more weight to lines that are more complex.
 metrics of the software should reflect the implementation or expression of algorithms
in different languages, but be independent of their execution on a specific platform.
These metrics are therefore computed statically from the code.
 In order to estimate the code length, volume, complexity and effort, software science
suggests the use of operators and operands.

 The following equations are used for computing the estimates:
 N  = Observed Program Length = N1 + N2
 N* = Estimated Program Length = n1·log2(n1) + n2·log2(n2)
 n  = Program Vocabulary = n1 + n2
 V  = Program Volume (the information content of the program, measured in mathematical bits) = N·log2(n)
 D  = Program Difficulty = (n1/2) · (N2/n2)
 E  = Effort = D · V

 Where
 n1 = number of distinct operators in a program
 n2 = number of distinct operands in a program
 N1 = number of occurrences of operators in a program
 N2 = number of occurrences of operands in a program
Halstead’s Example
if (k < 2)
{
if (k > 3)
x = x*k;
}

 Distinct operators: if ( ) { } > < = * ;
 Distinct operands: k 2 3 x
 n1 = 8
 n2 = 4
 N1 = 10
 N2 = 7
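Plugging these counts into the equations above yields the remaining metrics; a small sketch of the arithmetic:

```python
import math

# Counts from the slide's example.
n1, n2 = 8, 4    # distinct operators, distinct operands
N1, N2 = 10, 7   # operator occurrences, operand occurrences

N = N1 + N2                                       # observed program length
N_est = n1 * math.log2(n1) + n2 * math.log2(n2)   # estimated program length N*
n = n1 + n2                                       # program vocabulary
V = N * math.log2(n)                              # program volume (bits)
D = (n1 / 2) * (N2 / n2)                          # program difficulty
E = D * V                                         # effort

print(N, n, N_est, round(V, 1), D, round(E, 1))  # 17 12 32.0 60.9 7.0 426.6
```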
Known weaknesses:
◦ Call depth is not taken into account:
 a program with a sequence of 10 successive calls is treated as no more complex than one with 10 nested calls.
◦ An if-then-else sequence is given the same weight as a loop structure.
◦ The added complexity of nesting if-then-else constructs or loops is not taken into account, etc.
McCabe’s Complexity Measures
 It simply measures the amount of decision logic in a program module. Cyclomatic complexity gives the number of linearly independent paths through the module, and is often referred to as McCabe's complexity.
 It is important to testers because it provides an indication
of the amount of testing (including reviews) necessary to
practically avoid defects
 McCabe's complexity used to define minimum number of
test cases required for a module
 McCabe’s metrics are based on a control flow
representation of the program.
 A program graph is used to depict control flow.
 Nodes represent processing tasks (one or more code
statements)
 Edges represent control flow between nodes
Flow Graph Notation
[Figure: flow-graph notation for the sequence, if-then-else, while, and until constructs]
Cyclomatic Complexity
 Set of independent paths through the graph
(basis set)

 V(G) = E – N + 2
◦ E is the number of flow graph edges
◦ N is the number of nodes

 V(G) = P + 1
◦ P is the number of predicate nodes
Meaning
 V(G) is the number of (enclosed) regions/areas of the planar graph.
 V(G), the cyclomatic number, is a measure of the complexity of a function and is correlated with difficulty in testing. The standard value is between 1 and 10.
 A value of 1 means the code has no branching.
 The number of regions increases with the number of decision paths and loops.
 It is a quantitative measure of testing difficulty and an indication of ultimate reliability.
 Experimental data shows the value of V(G) should be no more than 10; testing is very difficult above this value.
Example
i = 0;
while (i < n-1) do
    j = i + 1;
    while (j < n) do
        if A[i] < A[j] then
            swap(A[i], A[j]);
    end do;
    i = i + 1;
end do;
Flow Graph
[Figure: flow graph for the example above — 7 nodes (1–7) and 9 edges]
Computing V(G)
 V(G) = 9 – 7 + 2 = 4
 V(G) = 3 + 1 = 4
 Basis Set
◦ 1, 7
◦ 1, 2, 6, 1, 7
◦ 1, 2, 3, 4, 5, 2, 6, 1, 7
◦ 1, 2, 3, 5, 2, 6, 1, 7
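The same number can be checked mechanically from an edge list using V(G) = E − N + 2. The edges below are reconstructed from the basis-set paths, so treat the exact list as an assumption:

```python
# Flow-graph edges inferred from the basis-set paths above.
edges = [(1, 7), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5),
         (5, 2), (2, 6), (6, 1)]
nodes = {n for edge in edges for n in edge}

v_g = len(edges) - len(nodes) + 2  # 9 - 7 + 2
print(v_g)  # 4, matching V(G) = P + 1 = 3 + 1
```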
Another Example
[Figure: a second example flow graph]
What is V(G)?
Function Points
 The function point measure emerged to quantify the size of a system from its functionality and usability.

 History
 Non-code oriented size measure
 Developed by IBM (A.Albrecht) in 1979, 1983
 Now in use by more than 500 organizations
world-wide

 What are they?


 5 weighted functionality types
 14 complexity factors
Processing Complexity Adjustment

1) data communications
2) distributed functions
3) performance
4) heavily used configuration
5) transaction rate
6) on-line data entry
7) end user efficiency
8) on-line update
9) complex processing
10) reusability
11) installation ease
12) operational ease
13) multiple sites
14) facilitates change

Each factor is rated on a scale equivalent to the following:
Not present = 0
Incidental influence = 1
Moderate influence = 2
Average influence = 3
Significant influence = 4
Strong influence = 5
Function Point Calculation

Function Counts:  FC = Σ (i = 1..5) Σ (j = 1..3) x_ij · w_j

Function Points:  FP = FC × (0.65 + 0.01 × Σ (k = 1..14) c_k)

where
x_ij = count of functionality type i at complexity level j
w_j = weight of complexity level j
c_k = complexity factor k
Computing Function Points
1. Analyze the information domain of the application and develop counts: establish a count for each input domain item and system interface.
2. Weight each count by assessing complexity: assign a level of complexity (simple, average, complex), i.e. a weight, to each count.
3. Assess the influence of global factors that affect the application: grade the significance of external factors F_i, such as reuse, concurrency, OS, ...
4. Compute function points:
   FP = SUM(count × weight) × C
   where the complexity multiplier C = (0.65 + 0.01 × N) and the degree of influence N = SUM(F_i)
Analyzing the Information Domain

measurement parameter       count     weighting factor (simple / avg. / complex)
number of user inputs        ___  ×   3 / 4 / 6    = ___
number of user outputs       ___  ×   4 / 5 / 7    = ___
number of user inquiries     ___  ×   3 / 4 / 6    = ___
number of files              ___  ×   7 / 10 / 15  = ___
number of ext. interfaces    ___  ×   5 / 7 / 10   = ___
count-total                                        = ___
× complexity multiplier
= function points
Example: SafeHome Functionality
[Figure: SafeHome system context — user interactions (password, panic button, (de)activate, zone setting, zone inquiry, sensor inquiry), system outputs (messages, sensor status), test sensor, sensors, system config data, and the monitor-and-response system receiving the alarm alert]
Example: SafeHome FP Calc
measurement parameter       count     weight (simple)   result
number of user inputs         3   ×   3             =    9
number of user outputs        2   ×   4             =    8
number of user inquiries      2   ×   3             =    6
number of files               1   ×   7             =    7
number of ext. interfaces     2   ×   5             =   10
count-total                                         =   40
complexity multiplier: 1.11
function points: 40 × 1.11 ≈ 44
Attempt

 Compute the function point value for a project with the following information domain characteristics:
◦ Number of user inputs: 32
◦ Number of user outputs: 60
◦ Number of user enquiries: 24
◦ Number of files: 8
◦ Number of external interfaces: 2
◦ Assume that weights are average and external complexity
adjustment values are not important.
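A sketch of this exercise, assuming "not important" means every adjustment factor c_k is taken as 0, so the complexity multiplier is 0.65 + 0.01 × 0 = 0.65 (if it is instead taken to mean the multiplier is 1, the answer is simply the raw function count):

```python
# Information-domain counts from the exercise, with the "average" weights.
counts  = {"inputs": 32, "outputs": 60, "inquiries": 24,
           "files": 8, "interfaces": 2}
weights = {"inputs": 4, "outputs": 5, "inquiries": 4,
           "files": 10, "interfaces": 7}

fc = sum(counts[k] * weights[k] for k in counts)  # function count
fp = fc * (0.65 + 0.01 * 0)                       # all c_k assumed 0
print(fc, round(fp, 1))  # 618 401.7
```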
COCOMO (COnstructive COst MOdel)
 It was developed by Barry W. Boehm in 1981.
 It is an algorithmic cost model.
 It is based on size of the project.
 The size of the project may vary depending upon
the function points .
 Basic COCOMO
 used for relatively smaller projects.
 team size is considered to be small.
 Cost drivers depend upon the size of the project.

 Effort: E = a × (KDSI)^b × EAF

where KDSI is the number of thousands of delivered source instructions, and a and b are constants that may vary depending on the size of the project.

 Schedule: S = c × (E)^d, where E is the effort and c and d are constants.

 EAF is the Effort Adjustment Factor, which is 1 for basic COCOMO; this value may vary from 1 to 15.
Each of the 15 attributes receives a rating on a six-point scale that ranges from "very low" to "extra
high" (in importance or value). An effort multiplier from the table below applies to the rating. The
product of all effort multipliers results in an effort adjustment factor (EAF)
 Intermediate COCOMO
 It is used for medium sized projects.
 Cost drivers depend upon product reliability, database
size, execution and storage.
 Team size is medium.

 Advanced COCOMO
 It is used for large sized projects.
 The cost drivers depend upon requirements, analysis,
design, testing and maintenance.
 Team size is large.
 Organic mode: projects with relatively small teams, developed in a familiar environment.
E = 2.4(KDSI)^1.05, E in person-months; S = 2.5(E)^0.38

 Semidetached mode: projects lying between organic and embedded mode in terms of team size, with a mix of experienced and inexperienced staff. Team members are unfamiliar with the system under development.
E = 3.0(KDSI)^1.12, E in person-months; S = 2.5(E)^0.35

 Embedded mode: projects whose environment is complex. Team members are highly skilled and familiar with the system under development.
E = 3.6(KDSI)^1.20, E in person-months; S = 2.5(E)^0.32



COCOMO Models

[Figure: effort (person-months, 0–7000) versus project size (thousands of lines of code, 0–600) for the organic, semidetached, and embedded modes; the embedded curve rises fastest]
Solve
 Assume that the size of an organic type
software product has been estimated to
be 32,000 lines of source code. Assume
that the average salary of software
engineers be Rs. 15,000/- per month.
Determine the effort required to develop
the software product and the nominal
development time
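A sketch of the exercise using the organic-mode equations given earlier; the salary figure only converts effort into cost:

```python
# Organic-mode basic COCOMO for a 32 KDSI (32,000 LOC) product.
kdsi = 32
effort = 2.4 * kdsi ** 1.05        # effort, person-months
schedule = 2.5 * effort ** 0.38    # nominal development time, months
cost = effort * 15000              # rupees, at Rs. 15,000 per person-month

print(round(effort, 1), round(schedule, 1), round(cost))  # 91.3 13.9 1369967
```

So the product needs roughly 91 person-months of effort over a nominal schedule of about 14 months.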
Issues in Algorithmic Models
 Specific Input, Specific Output
 Parameter dependent
 Dependent on regression over historical data
Sizing source code volumes
 On the basis of studies, conversion between LOC and function points is possible.
