Professional Documents
Culture Documents
F J Heemstra
the estimation and control of software development WHAT MAKES SOFTWARE COST
projects in 598 Dutch organizations. The most remark-
ESTIMATION SO DIFFICULT?
able conclusions are:
The main question, when confronting the above-
mentioned problems, is what it is that makes software
• 35% of the participating organizations do not make
cost estimation so difficult. There are many reasons
an estimate
and, without going into detail, some can be listed as
• 50% of the responding organizations record no data
follows:
on an ongoing project
• 57% do not use cost-accounting
(1) There is a lack of data on completed software
• 80% of the projects executed by the participating
projects. This kind of data can support project
organizations have overruns of budgets and duration
management in making estimates.
• the mean overruns of budgets and duration are 50%
(2) Estimates are often done hurriedly, without an
appreciation for the effort required to do a credible
Van Lierop et al. 2 measured extensively whether job. In addition, too often it is the case that an
development activities were executed according to estimate is needed before clear specifications of the
plan. They investigated the reasons for the differences system requirements have been produced. There-
between plan and reality, and overall 80 development fore, a typical situation is that estimators are being
activities were measured. For all these activities 3203 pressured to write an estimate too quickly for a
hours were planned but 3838 hours were used, which system that they do not fully understand.
means an overshoot of 20% on average of the planned (3) Clear, complete and reliable specifications are
number of hours. The duration of the activities (in difficult to formulate, especially at the start of a
days) proved to be 28% longer on average than project. Changes, adaptations and additions are
planned. For all the activities 406 days of duration were more the rule than the exception: as a consequence
planned, while the actual number of days proved to be plans and budgets must be adapted too.
526. (4) Characteristics of software and software develop-
In the literature the impression is given, mistakenly, ment make estimating difficult. For example, the
that software development without overshoots of level of abstraction, complexity, measurability of
plans and budgets is not possible. This impression product and process, innovative aspects, etc.
is inaccurate, and other measurements confirm this 3. (5) A great number of factors have an influence on the
These show that 6% of all the activities had a shorter effort and time to develop software. These factors
duration than planned and 58% were executed are called 'cost drivers'. Examples are size and
according to plan and were ready exactly on time. complexity of the software, commitment and par-
With regard to the development effort, it appeared that ticipation of the user organization, experience of
25% of the activities needed less effort than estimated the development team. In general these cost drivers
and 30% needed precisely the estimated effort. The are difficult to determine in operation.
reasons for the differences between plan and reality (6) Rapid changes in information technology (IT) and
prove to be very specific for the development situation. the methodology of software development are a
In the organization where the measurements were problem for a stabilization of the estimation pro-
taken the reasons were mainly related to things under- cess. For example, it is difficult to predict the
estimation of the quantity of work, underestimation influence of new workbenches, fourth and fifth
of the complexity of the application, and specifications generation languages, prototyping strategies, and
which proved to be unrealistic from a technical point so on.
of view. In other organizations, where similar measure- (7) An estimator (mostly the project manager) cannot
ments were taken, other reasons were discovered. As have much experience in developing estimates, es-
a result, other control actions are, of course, necessary. pecially for large projects. How many 'large'
This conclusion fits well with the results of research projects can someone manage in, for example, 10
carried out by Beers 4. Thirty experienced software years?
developers, project managers, and others, were asked (8) An apparent bias of software developers towards
to give the reasons for unsuccessful software projects. underestimation. An estimator is likely to consider
The answers can be summarized briefly as 'many minds, how long a certain portion of the software would
many thoughts'. It was not possible to indicate just take and then to extrapolate this estimate to the rest
one reason. A long list of all kinds of reasons were of the system, ignoring the non-linear aspects of
given. software development, for example co-ordination
It is alarming that it is so difficult for organizations to and management.
control the development of software. This is sufficient (9) The estimator estimates the time it would take to
reason to emphasize that software development perform the task personally, ignoring the fact that
cost estimation and control should take its place as a lot of work will be done by less experienced
a fully fledged branch within discipline of software people, and junior staff with a lower productivity
development. rate.
Insight in the characteristics of: Definition There is a lack of clear and accepted defi-
• the product (software) that has to WHAT nitions for drivers, such as size, quality, complexity,
be developed experience, etc.
• the production means WITH WHAT
• the production personnel WHO Quantification The majority of the cost drivers are
• the organization of the production HOW hard to quantify. Often one has to use measures such as
• the user/user organization FOR W H O M many, moderate, few, etc.
Objectivity Subjectivity is a potential risk factor. What individual estimated costs are summed to get the
may be complex for developer A is not complex for overall cost estimate of the project.
developer B.
The reliability of estimates based on expert judgement
Correlation It is difficult to consider one driver by (1) depends a great deal to the degree in which a new
itself. A change in the value of driver A may have project conforms with the experience and the ability of
consequences in the values of several other cost drivers. the expert to remember facts of historical projects.
This is a difficulty from the viewpoint of measurability. Mostly the estimates are qualitative and not objective.
An important problem in using this method is that it is
Relation between driver and effort For estimation it is difficult for someone else to reproduce and use the
important to predict the relation between, for example, knowledge and experience of an expert. This can lead to
software size and the required effort, a specified quality misleading situations where the rules of thumb of an
level and required effort, etc. From the literature we expert are becoming general rules and used in inapplic-
know that there is little clarity about these relations. able situations. Despite the disadvantages, this technique
is usually used in situations where a first indication of
effort and time is needed, especially in the first phases of
Calibration It is impossible to talk about "the most
software development in which the specifications of the
important' cost drivers in isolation. It differs from
product are vague and continually adapted.
situation to situation.
The foundation of a cost estimation technique based
on reasoning by analogy (2) is an analysed database of
Effectivity and efficiency There is conflict between similar historical projects or similar project parts or
effectivity and efficiency. From an effectivity perspective
modules. To find a similarity between a new project and
it is worthwhile to pay a lot of attention to, for example,
one or more completed projects it is necessary to collect
user participation. For the efficiency of a project it is
and record data and characteristics of old projects.
justifiable to avoid user involvement.
The Price-to-Win (3) technique can hardly be called an
SCE technique. Primarily commercial motives play an
Human factors Almost all research agrees on the dom- important part in using this approach. It is remarkable
inating influence of cost drivers, such as experience and that the estimates of organizations which use Price-to-
quality of the personnel. This means that investment in Win are no less accurate than organizations which use
'good' developers is important. other methods7.
The basis of the estimation method which regards
Reuse In many studies reuse is regarded as (one of) the SCE as a capacity (4) problem is the availability of
most important factors to increase productivitys-~°. means, especially of personnel. An example is: 'Regard-
ing our capacity planning, three men are available for
the new project over the next four months. So the
SOFTWARE COST ESTIMATION: planned effort will be 12 man months'. If the specifica-
TECHNIQUES AND TOOLS tions of the software are not clear, this method can be
successful. An unfavourable side-effect is that in situ-
In the literature you can find a great number of tech-
ations of overestimation the planned effort will be used
niques for estimating software development costs. Most
completely. This effect is based on Parkinson's law that
of them are a combination of the following primary
'Work expands to fill the available volume'.
techniquesn:
In parametric models (5) the development time and
effort is estimated as a function of a number of variables.
(1) Estimates made by an expert. These variables represent the most important cost driv-
(2) Estimates based on reasoning by analogy. ers. The nucleus of an estimation model is a number of
(3) Estimates based on Price-to-Win. algorithms and parameters. The values of the parameters
(4) Estimates based on available capacity. and the kind of algorithms are, to a significant extent,
(5) Estimates based on the use of parametric models. based on the contents of a database of completed
projects. In the next section a more comprehensive
Furthermore two main approaches can be distinguished: explanation of estimation models is given.
As mentioned earlier only 65% of the organizations
(1) Top-down which participated on the field study estimate a software
In the top-down approach the estimation of the project. Table 2 shows the frequency of use of the
overall project is derived from the global character- different techniques. The figures show that most organ-
istics of the product. The total estimated cost is then izations make use of data from past projects in some
split up among the various components. way. Obviously this works on an informal basis, because
(2) Bottom-up only 50% of the participating organizations record data
In the bottom-up approach the cost of each individ- from completed projects. Estimates based on expert
ual component is estimated by the person who will judgement and the capacity method prove to be quite
be responsible for developing the component. The popular despite the disadvantages of these methods.
Table 2. Use of cost estimation techniques (an organization can cerning the specific characteristics of the software-
use more than one technique) product, the way the software-product will be developed
Use(%) and the production means, a number of cost influencing
Expert judgement 25.5 factors (cost drivers) are added to the model. The effect
Analogy method 60.8 of these cost drivers must be estimated. This effect is
Price-to-Win 8.9 often called a productivity adjustment factor. Appli-
Capacity problem 20.8 cation of this correction factor to the nominal estimation
Parametric models 13.7
of effort provides a more realistic estimate.
Some models, like FPA 16, are focused more on the
sizing stage. Others, like the well-known COCOMO
The next sections of this paper focus on the use of SCE
model" on the productivity stage and some tools, such
models. There was a rapid growth of models in the
as Before You Leap 17combine two models to cover both
1970s. In the 1980s and the 1990s, however, few new
stages. Figure 1 shows the two stages in SCE models.
models have been developed despite the increasing im-
Figure 2 shows the sizing and the productivity stages
portance of controlling and estimating software develop-
in the context of general cost estimation. In Figure 2 five
ment. Most of the 1970 models are of no interest to
components of the general cost estimation structure are
present industrial practitioners. There is a tendency
shown. Besides the sizing and productivity components,
towards automated versions (tools) of (combinations or
a phase distribution and sensitivity/risk analysis com-
refinements) existing models. An important question is
ponent are distinguished. In the phase distribution com-
whether this kind of model can solve all of the problems
ponent the total effort and duration is split up over the
discussed above.
phases and activities of a project. This division has to be
based on empirical data of past projects. The sensitivity
SOFTWARE COST ESTIMATION and risk analysis phase supports project m a n a g e m e n t -
especially at the start of a project when the uncertainty
MODELS is great - - in determining the risk factors of a project and
In this section, one estimation technique, namely SCE the sensitivity of the estimates to the cost drivers settings.
models, will be discussed and the principles of SCE Again data on past projects provide an important input
models described, making a distinction between sizing for this component. Before using a model for the first
and productivity models. The characteristics of some time validation is necessary, and it may also be necessary
well-known models will also be given. to calibrate the model. Mostly the environment in
which the SCE model has been developed and the
database of completed projects on which the model
The principles of SCE models is based will differ from the project characteristics of
Most models found nowadays are two-stage models 7. the environment(s) in which the model is to be used.
The first stage is a sizer and the second stage provides To make validation and calibration possible, data on
a productivity adjustment factor. historical projects have to be available in an organiz-
In the first stage an estimate regarding the size of the ation. As already mentioned, this information is often
product to be developed is obtained. In practice several lacking.
sizing techniques are used. The most well-known sizers Most of the tools implementing SCE models do not
nowadays are function points 12 and lines of code II. But support project management in all of these steps. The
other sizing techniques like 'software science '13 and seven steps are:
DeMarco's Bang method ~4,15, have been defined. The (1) Creation of database of completed projects.
result of a sizing model is the size/volume of the software (2) Size estimation.
to be developed, expressed as the number of lines of (3) Productivity estimation.
source code, number of statements, or the number of (4) Phase distribution.
functions points. (5) Sensitivity and risk analysis.
In the second stage it is estimated how much time and (6) Validation.
effort it will cost to develop the software of the estimated (7) Calibration.
size. First, the estimate of the size is converted into an
estimate in nominal man-months of effort. As this Calibration and risk and sensitivity analysis are es-
nominal effort takes no advantage of knowledge con- pecially lacking.
distribution
I Phasedistributionof development
Ceffort,timeand resources
Sensitivity I
and risk
analysis
An overview of SCE models determined to be very high, the nominal effort has to be
multiplied by 1.40. Furthermore COCOMO provides
In the past 10 years a number of SCE models have been tables to apportion the adjusted estimated effort and
developed. This section does not give an exhaustive development over the project phases and, in the detailed
treatment of all the models: the overview is limited to version of the model, to refine the adjustment for each
one example of a sizing model, one productivity model, phase. For example: the quality of the programmer has
some models which are relevant from an historical point less influence in the feasibility phase than in the design
of view, well documented and within the experience of phase. Thus phase dependent adjustment factors are
the author, and some models which introduce new ideas. used in the detailed model.
Table 4. The COCOMO cost drivers and their influence on the nominal effort
Value of the cost drivers
Very Very Extra
Cost drivers low Low Average High high high
Required reliability 0.75 0.88 1.00 1.15 ! .40
Database size 0.94 1.00 1.08 1.16
Complexity software 0.70 0.85 1.00 1.15 1.30 1.65
Constraints execution time 1.00 1.11 1.30 1.66
Memory constraints 1.00 1.06 1.21 1.56
Hardware volatility 0.87 1.00 1.15 !.30
Response time constraints 0.87 1.00 1.07 1.15
Quality analysts 1.46 1.19 1.00 0.86 0.71
Experience with application 1.29 1.13 1.00 0.91 0.82
Quality programmers 1.42 1.17 1.00 0.86 0.70
Hardware experience 1.21 1.10 1.00 0.90
Programming language experience 1.14 1.07 1.00 0.95
Use modern programming techniques 1.24 1.10 1.00 0.91 0.82
Use software tools 1.24 1.10 1.00 0.91 0.83
Project duration constraints 1.23 1.08 1.00 1.04 1.10
related to the types of data the software uses and projects at IBM, Norden plotted frequency distributions,
generates. Within FPA the software is characterized by in which he showed how many people were allocated
the five functions: to the development and maintenance of a software
product during the life-cycle. The curves he made fitted
• the external input type very well with the Rayleigh curves. His findings were
• the external output type merely empirical. He found no explanations for the
• the external inquiry type shape of the effort curve. On the assumptions of Norden,
• the logical internal file type Putnam formulated his model. There is not enough space
• the external interface file type in this paper to explain the principles of the model and
the reader is referred to Putnam 23'24, Putnam and
For each of these five types the number of simple, Fitzsimmons z5 and Londeix 26.
average and complex occurrences that are expected in
the software is estimated. By weighting each number
Before You Leap (BYL)
with an appropriate weight a number is obtained, the
BYL is a commercial package based on a link-up
unadjusted number of function points. This indication
between FPA and C O C O M O 17. BYL starts with a
for nominal size is then adjusted, using 14 technical
calculation of the amount of net function points.
characteristics. Figure 3 gives an overview of function
This amount is then translated into source lines of
point analysis.
code, taking in account the language used. For
Cobol, for instance, one function point is equal to
PRICE-S 105 SLOC, for LISP 64, etc. This estimate of the size
The PRICE-S model (Programming Review of in SLOC is precisely the necessary input for
Information Costing and E v a l u a t i o n - - S o f t w a r e ) is C O C O M O and the C O C O M O part of BYL, taking
developed and supported by RCA PRICE Systems. into account the influence on effort of the 15 COCOMO
An important disadvantage with regard to C O C O M O cost drivers, calculates the estimates of costs and time-
and FPA is that the underlying concepts and ideas scale.
are not publicly defined and the users are presented
with the model as a black box. The user of PRICE
sends the input to a time-sharing computer in the Estimaes
USA, UK, or France and gets back his estimates Estimacs has been developed by H. R u b i n 27-29 and
immediately. Despite this disadvantage and the high Computer Associates 3°, and is available as a software
rental price, there are many users, especially in America. package. The model consists of nine modules: a function
There is, however, an important motivation for point module; a risk module; an effort module (to
American companies to use the model. The US Depart- estimate development and maintenance effort), etc. The
ment of Defense demands a PRICE estimate for all most important and extensive module is Effort. The user
quotations for a software project. PRICE has separate has to answer 25 input questions. These questions are
sizer and productivity function. partly related to the complexity of the user-organization
and partly to the complexity and size of the software to
The P U T N A M model be developed. The way Estimacs translates the input to
This SCE model was developed by Putnam in 197423. He an estimation of effort is not clear. Like many other
based his model on the work of Norden 34. For many models, Estimacs is a 'closed model'.
evaluate the eight m o d e l s d e s c r i b e d above. The results o f e v a l u a t e d directly using a questionnaire, a n d the tests
that e v a l u a t i o n are presented in T a b l e 7 a n d described e n d e d with a discussion session. T h e results are pre-
in m o r e detail in H e e m s t r a 7. F r o m the table it can be sented in T a b l e 8.
seen t h a t there are o n l y few plusses. T h e conclusion is The real effort a n d d u r a t i o n were eight m a n - m o n t h s
that the quality o f the m o d e l s is p o o r a n d m u c h i m p r o v e - a n d six m o n t h s . The m a i n conclusions o f the e x p e r i m e n t
m e n t is necessary. T h e a c c u r a c y o f the e s t i m a t i o n s were were t h a t on the basis o f the differences f o u n d between
e v a l u a t e d b y several tests. T h e w a y the tests were the estimates a n d reality, it has n o t been shown t h a t the
executed a n d the results o b t a i n e d will be described. T h e selected m o d e l s can be used for a reliable e s t i m a t i o n tool
objectives o f the tests were: at an early stage o f software development. All in all, the
project leaders were n o t wildly enthusiastic a b o u t these
• to d e t e r m i n e the a c c u r a c y o f the estimate using S C E tools, b u t they were, nevertheless, felt to be a c c e p t a b l e
m o d e l s in a semi-realistic s i t u a t i o n as a check-list a n d as a m e a n s o f c o m m u n i c a t i o n . It
• to d e t e r m i n e w h e t h e r these m o d e l s will be accepted by should be m e n t i o n e d t h a t the selected project is small.
project management M o s t m o d e l s are c a l i b r a t e d on d a t a f r o m m e d i u m / l a r g e
projects.
A f t e r a severe selection p r o c e d u r e only two S C E m o d e l s K e m e r e r 33 shows t h a t estimates o f different m o d e l s
r e m a i n e d . These were the B Y L a n d E s t i m a c s models. can differ considerably. F o r each m o d e l he investigated
D u r i n g the tests 14 experienced p r o j e c t leaders were the difference between actual a n d e s t i m a t e d n u m b e r o f
a s k e d to m a k e a n u m b e r o f estimates for a project t h a t m a n - m o n t h s . H e used C O C O M O , Estimacs, F P A a n d
h a d actually been carried out. The p r o j e c t was described P u t n a m ' s m o d e l to estimate the required effort o f 15
as if it was at the start o f the project. The project leaders a l r e a d y realized projects. F r o m T a b l e 9 it can be seen
h a d to m a k e three estimates. T h e first e s t i m a t e o f effort t h a t for b o t h C O C O M O a n d P u t n a m ' s m o d e l there were
a n d d u r a t i o n (the ' m a n u a l ' estimate) was m a d e on the s h a r p overestimations. F P A a n d Estimacs gave distinctly
basis o f the p r o j e c t l e a d e r s ' k n o w l e d g e a n d experience. better results with o v e r s h o o t s o f 100% a n d 85%, re-
Next, two estimates were m a d e using the m o d e l s se- spectively. A similar study was carried o u t by R u b i n 29.
lected. I n conclusion, a final e s t i m a t e was m a d e on the A project d e s c r i p t i o n was sent to Jensen (Jensen's
basis o f the project leaders' k n o w l e d g e a n d experience model), G r e e n e ( P u t n a m ' s m o d e l S L I M ) a n d R o o k
t o g e t h e r with the m o d e l estimates. E a c h estimate was ( G E C O M O ) a n d to himself ( R u b i n ' s m o d e l Estimacs).
Table 8. Some results of the tests. Duration is given in months, Table 9. Estimates of the actual and estimated number of
effort in man-months man-months using four different models
Variable /~ ~r Averages for all projects
Effort Actual Estimated (Estimated
Manual estimate 28.4 18.3 number number divided by
BYL estimate 27.7 14.0 Models of MM of MM actual) * 100%
Estimacs estimate 48.5 13.9
GECOMO 219.25 1291.75 607.85
Final estimate 27.7 12.8 Putnam 219.25 2060.17 771.87
Duration FPA 260.30 533.23 167.29
Manual estimate 11.2 3.7 Estimacs 287.97 354.77 85.48
BYL estimate 8.5 2.4
Final estimate 12.1 3.4
definitions and standards. This results in agreements to use data collected from other organizations. The
such as: relevant data are different for each organization.
2 Lierop van, F L G, Voikers, R S A, Genuchten, M van and 18 Boehm, B W 'Software engineering economics' IEEE Trans.
Heemstra, F J 'Has someone seen the software?' Informatie Soft. Eng. Vol 10 No 1 (January 1984)
Vol 33 No 3 (1991) (In Dutch) 19 Rudolph, E E 'Productivity in computer application devel-
3 Genuchten, van M I J M 'Towards a software factory' PhD opment, Department of Management Studies' Working
Thesis, University of Technology Eindhoven (1991) paper No 9 University of Auckland (March 1983)
4 Beers 'Problems, planning and knowledge, a study of the 20 Rudolph, E E 'Function point analyses, cookbook' own
processes behind success and failure of an automation edition from Rudolph (March 1983)
project' PhD Series in general management, No 1 Faculty 21 Symons, C R 'Function point analysis: difficulties and
Industrial Engineering/Rotterdam School of Management, improvements' IEEE Trans. Soft. Eng. Vol 14 No l (Janu-
Erasmus University Rotterdam (1991) (In Dutch) ary 1988)
5 Brooks, F B The mythical manmonth. Essays on software 22 Symons, C R Software sizing and estimating---MARK H
engineering Addison-Wesley (1975) FPA Wiley (1991)
6 Noth, T and Kretzsclunar, M Estimation of software devel- 23 Putnam, L H 'A general empirical solution to the macro
opment projects Springer-Verlag (1984) (In German) software sizing and estimating problem' IEEE Trans. Soft.
7 Hecmstra, F J How expensive is software? Estimation and Eng. SE-4, 4 (1978)
control of software-development Kluwer (1989) (In Dutch) 24 Putnam, L 'Software costing estimating and life cycle
8 Druffel, L E 'Strategies for a DoD Software initiative' CSS control' IEEE Computer Society Press (1980)
DUSD(RAT) Washington, DC (1982) 25 Putnam, L H and Fitzsimmons, A 'Estimating software
9 Conte, S D, Dunsmore, H F and Shen, V Y Software costs' Datamation (Sept. Oct. Nov. 1979)
engineering metrics and models Benjamin Cummins (1986) 26 Londeix, B Cost estimation for software development
10 Reifer, D J 'The economics of software reuse' Proc. 14th Addison-Wesley (1987)
Annual ISPA Conf., New Orleans (May 1991) 27 Rubin, H A 'Interactive macro-estimation of software life
11 Boehm, B W Software engineering economics Prentice-Hall cycle parameters via personal computer: a technique for
(1981) improving customer/developer communication' Proc.
12 Albrecht, A J and Gaffney, J E 'Software function, source Symp. on application & assessment of automated tools for
lines of code, and development effort prediction: a software software development, IEEE, San Francisco (1983)
science validation' IEEE Trans. Soft. Eng. Vol SE-9 No 6 28 Rubin, H A 'Macro and micro-estimation of maintenance
(1983) effort: the estimacs maintenance models' IEEE (1984)
13 Halstead, M H Elements of software science North-Holland 29 Rubin, H A 'A comparison of cost estimation tools' Proc.
(1977) 8th Int. Conf. Soft. Eng. IEEE (1985)
14 DeMareo, T Controlling software projects: management, 30 Computer Associates CA-Estimacs User Guide Release 5.0
measurement and estimation Yourdon Press, New York (July 1986)
(1982) 31 Jones, C Programming productivity McGraw-Hill (1986)
15 DeMareo, T 'An algorithm for sizing software products' 32 BIS/Estimatnr User manual version 4.4. BIS Applied Sys-
Performance Evaluation Review 12 pp 13-22 (1984) tem Ltd. (1987)
16 Albrecht, A J 'Measuring application development pro- 33 Kemerer, C F 'An empirical validation of software cost
ductivity' Proc. Joint SHARE~GUIDE~IBM application estimation models' Communications of the ACM Vol 30
development syrup. (October 1979) No 5 (May 1987)
17 Gordon 'Before You Leap' User's Guide Gordon Group 34 Norden, P V Useful tools for project management (Oper-
(1986) ations research in research and development) Wiley (1963)