
Examining the Software Genome
Mapping the DNA of Software Projects
to Develop Predictable Applications

White Paper
November 2002

Summary

Predictability is a pressing need in the software industry. Business
experience, sometimes undermined by client pressure, has not been able to
regularly provide adequate accuracy in cost and schedule estimates. The
problem is compounded by the complexities and interdependencies of
software development, which operate outside the human genetic disposition
towards linear thinking.

The solution to providing real predictability lies in the data and evidence
available from recently completed software projects. Direct examination of
software data can reveal the “DNA” of software development environments.
Mapping this DNA and extracting the elements impacting delivery can
surmount the inherent complexities and non-linear behavior. This is the key
issue to predictability and accuracy in software planning.

This paper presents a mechanism to examine and leverage software data to
foster predictability in planning and support long-term process
improvement for the software industry.

Scientific Model

The scientific model for the Software Genome Council is our near
namesake, the Human Genome Project. The HGP advances are based on
the discovery that the hereditary material of all multi-cellular organisms is
the double helix of deoxyribonucleic acid, which contains all genes.

In addition to mapping the human genome, “The HGP also sponsors efforts
to characterize the entire genome of several other organisms used
extensively in biological research, such as mice, fruit flies and flatworms.
These efforts support each other, because most organisms have many
similar, or “homologous” genes with similar functions. Therefore, the
identification of the sequence or function of a gene in a model organism,
has the potential to explain a homologous gene in human beings, …”
(National Human Genome Research Institute)

Likewise, the double helix of software projects, as endorsed by the Software
Engineering Institute (SEI), has been identified as size, time, effort and
defects. (Interestingly, DNA is also composed of just four components
called bases.) A substantial database of these building blocks, or software
DNA, allows us to understand and predict the behavior of software projects.
Variance in performance is attributable primarily to environmental
differences and can be identified by trained observers.

Growing Significance of Predictability to IT

Predictability in software planning has always been a quest for IT
management and it follows that the absence of predictability grows in
significance as investments increase.

L. Achuthan, Managing Director, Economic Cycle Research Institute, has
reported that IT expenses now account for 50% of all corporate capital
expenses. This explains why IT has come under increased scrutiny and
significant pressure to deliver as promised.

IT management is aware of this problem, but current methods of prediction
have inherent flaws as demonstrated by the failure rates of software
projects. As with all complex, non-intuitive processes, the answer is to rely
on historical reference to model future behavior.

The Challenge of Predictability

The “failure” of software organizations to meet budgets and schedules is not
due to lack of effort or experience. To the contrary, software organizations
are filled with some of the most dedicated and creative individuals.

However, there is an inherent genetic disconnect in the business of software
planning. Specifically, the human brain excels at linear projections. But
the complexities and interdependencies of software projects are beyond the
scope of linear projections.

Linear thinking is natural and easy to observe in everyday life. If you
double the amount of gas in your automobile you’ll likely go twice as far.
If you cut your speed in half it will likely take you twice as long to reach
your destination. Of course there are variables in any example like this, but
for estimation of simple repeatable behaviors, linear thinking will probably
get you statistically close enough. However, as the complexity of systems
rises, the ability to use linear approximations decreases rapidly.

This fact is more simply demonstrated in the following example based on
the behavior (or DNA) of several thousand completed software projects*.
For a project of average size using a team with a peak staff of 8 people, the
graph below shows the expected schedule, cost, and quality (this project has
a complete lifecycle from requirements through delivery and warranty).

Linear thinking would support the notion that increasing the resources (peak
staff), by say 100%, to 16 people, would perhaps impact schedule, cost, and
quality by nearly the same percentage.

Figure 1. Resource Profile with 8 People at Peak Staff.



In fact, doubling the staff produces a result that is decidedly non-linear.
The 16-person plan (Figure 2) has a modest impact on schedule (-15%) and
a dramatic impact on cost (+75%) and quality/bugs (+69%).

Figure 2. Resource Profile with 16 People at Peak Staff.
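
The gap between the linear expectation and the reported outcome can be made concrete with a little arithmetic. The sketch below uses only the percentages quoted above. The linear baseline (twice the staff finishes in half the time at the same cost) and the idea that cost is roughly proportional to average staff multiplied by duration are simplifying assumptions made for illustration, not part of the underlying industry models.

```python
# Reported effect of doubling peak staff from 8 to 16 (figures quoted in the text).
REPORTED = {"schedule": -0.15, "cost": +0.75, "defects": +0.69}

# Naive linear expectation (an assumption for illustration only): twice the
# staff finishes in half the time, at the same total cost and defect count.
LINEAR = {"schedule": -0.50, "cost": 0.0, "defects": 0.0}


def ratio(change):
    """Convert a fractional change (e.g. -0.15) into a multiplier (0.85)."""
    return 1.0 + change


def implied_average_staff_ratio(schedule_change, cost_change):
    """Assuming cost ~ average staff x duration, return the staffing multiplier."""
    return ratio(cost_change) / ratio(schedule_change)


if __name__ == "__main__":
    for measure in REPORTED:
        print(f"{measure:9s} linear {LINEAR[measure]:+.0%}  "
              f"reported {REPORTED[measure]:+.0%}")
    staff = implied_average_staff_ratio(REPORTED["schedule"], REPORTED["cost"])
    print(f"implied average staffing multiplier: {staff:.2f}")
```

Under these assumptions the reported figures are internally consistent: a 75% cost increase spread over a 15% shorter schedule implies roughly twice the average staffing, which is what doubling the peak staff would suggest. The extra people buy very little calendar time.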


While this example may not shock the majority of those living in the
“software trenches”, the ability to capture the magnitude of this
non-linearity has eluded many of the best IT organizations. In the absence of
data, most will fall back to rules of thumb that, to some degree, rely on our
genetic disposition for linear thinking.

What Does the Data Reveal?

The above example is derived from real data. This industry database of
more than 6,000 projects* reveals and explicitly quantifies many key
relationships in bottom line software measures. One of the most important
and fundamental of these relationships is that of size/functionality to
schedule, cost, and quality. As size increases, schedule increases rapidly
while cost and defects rise geometrically (read non-linearly).

Figure 3 shows an example of this behavior based on a sample of 700
projects from various IT sectors. The key interpretation here is that changes
in size result in disproportionate changes in schedule. Think of mechanical
levers. You can exert a small force on a lever or flywheel and produce large
changes in an adjacent device. The same is true for the key software
business drivers of schedule, cost, quality, functionality, resources, and
complexity. Their relationship is complex, and small changes in one or
more dimensions (size and complexity, for example) can produce large changes
in one or more of the other factors (cost, schedule, and quality, for example).

Figure 3. Data Showing Schedule Varying with Size.
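
One common way to quantify a disproportionate relationship like the one in Figure 3 is to fit a power law, schedule = a * size^b, by least squares on log-log axes. The sketch below is an assumption about how such a fit might be done, not the method used to produce the figure. The function names are hypothetical and the sample points are synthetic, generated only to exercise the code.

```python
import math


def fit_power_law(sizes, schedules):
    """Least-squares fit of schedule = a * size**b on log-log axes.

    Returns (a, b). An exponent b different from 1 means schedule does not
    change in direct proportion to size.
    """
    if len(sizes) != len(schedules) or len(sizes) < 2:
        raise ValueError("need at least two (size, schedule) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in schedules]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope on log-log axes = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept on log-log axes = log(a)
    return a, b


def predict_schedule(a, b, size):
    """Schedule predicted by the fitted power law for a given size."""
    return a * size ** b


if __name__ == "__main__":
    # Synthetic points (NOT industry data), in arbitrary units.
    sizes = [10.0, 20.0, 40.0, 80.0]
    schedules = [2.0 * s ** 0.4 for s in sizes]
    a, b = fit_power_law(sizes, schedules)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

With an organization's own completed projects in place of the synthetic points, the fitted exponent gives a single number for how far the size-to-schedule relationship departs from a straight line.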

This is essential knowledge, as the great majority of software organizations
struggle to contain the size and complexity of their applications from
inception to delivery. It is also at the heart of the challenge of linear
projections in the complex world of software management.

Predictability: Revealed in the DNA of the Software Organization

The fundamental relationships revealed in the aggregate industry data
provide a framework or blueprint to support statistically sound planning.
To leverage this at the business level, individual organizations can uncover
the key elements influencing their software behavior: think of it as the
DNA of the software organization. To extend the metaphor, consider that
each software organization shares some common characteristics with the
industry at large (e.g. similar resource sets, application types, integration
issues). However, each also exhibits unique behavior that defines the
organization as an “individual” within the landscape of the software
industry (e.g. specific methods, management, tools, clients).

Extracting this DNA is key to incremental improvement in the software
organization and by extension, the software industry. While this is not a
trivial exercise, existing data documents the benefits of this approach, and
establishes a roadmap for gathering the fundamental elements of software
DNA.

Both quantitative and qualitative information must be captured. Any
organization with modest discipline in software development will have this
information (with varying degrees of consistency and accuracy). The
quantitative elements are consistent with the bottom line of the business and
industry standards from the Software Engineering Institute (SEI). They
include schedule, cost, staffing, and quality. The qualitative elements
reveal the environment and complexity within which the product was
delivered (e.g. platform, skills, tools, methods, interfaces).
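
The two kinds of information described above map naturally onto a simple per-project record. The following is a minimal sketch of such a record. The class name, field names, and units are hypothetical choices made for illustration. Only the categories themselves (size, schedule, cost, staffing, and quality on the quantitative side; platform, skills, tools, methods, and interfaces on the qualitative side) come from the text.

```python
from dataclasses import dataclass, field

# Qualitative attributes named in the text; the key spellings are assumptions.
REQUIRED_QUALITATIVE = ("platform", "skills", "tools", "methods", "interfaces")


@dataclass
class ProjectRecord:
    """One completed project: quantitative core plus qualitative context."""
    name: str
    # Quantitative elements (units are assumptions of this sketch).
    size: float                  # e.g. function points or lines of code
    schedule_months: float
    effort_person_months: float
    cost: float
    peak_staff: int
    defects: int
    # Qualitative elements describing the delivery environment.
    environment: dict = field(default_factory=dict)

    def missing_qualitative(self):
        """Return the qualitative attributes not yet captured for this project."""
        return [key for key in REQUIRED_QUALITATIVE if key not in self.environment]


if __name__ == "__main__":
    record = ProjectRecord(
        name="demo project",  # hypothetical values throughout
        size=100.0,
        schedule_months=6.0,
        effort_person_months=30.0,
        cost=300000.0,
        peak_staff=8,
        defects=12,
        environment={"platform": "web", "tools": "IDE"},
    )
    print(record.missing_qualitative())
```

A completeness check like `missing_qualitative` is one way to handle the varying consistency mentioned above: a project can be recorded with whatever is known and the gaps flagged for follow-up.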

Consolidating this information from an adequate sample of projects is the
first step in defining the DNA of an organization and mapping those
elements that support exceptional performance.

DNA and Business Value

The notion of software DNA goes well beyond the realm of scientific
measurement theorems or simulation models for planning. It is
fundamental to assessing real business value when prioritizing or approving
applications within an IT budget, making “process investments” to boost the
throughput of the organization, or when molding a strategy for cost
containment.

Each of these business value decisions ultimately relates to return on
investment. But how can you adequately determine ROI without the DNA
to model or quantify the potential investment for applications development
(or enhancement, conversion, maintenance, etc.) or the savings from
changes in the environment?

Of course, in the absence of data, you’ll have to rely on linear thinking or
a best guess from experience. However, with a fair sample of data, your
ROI model will be a fact-based catalyst for decision-making.

Case Study: Return on Investment

Figure 4 depicts a before and after study of data collected from a large U.S.
company. The CIO of this organization wanted to measure the results from
improvement initiatives that went directly to the bottom line. In this
scenario, the “bottom line” was the number of projects outside of budget
that were eating away at the CIO’s ability to fund new strategic initiatives.

Figure 4. Budget performance before and after process investments.

The investments made in this organization were based on a “DNA
Benchmark” of both qualitative and quantitative information. A major
component of this initiative was the systematizing of requirements
management and change control.

The return in the first year of this program was significant, with a doubling
of the number of projects under budget from 13 to 27, a reduction in the
percentage of overrun, and a dramatic improvement and consistency in data
reporting. Most critically, the overall pattern of data in Figure 4 shows a
shift to a “tighter bell curve” in the bottom “after” scenario, implying fewer
projects on the ends of the bell curve, far from plan. This translated
directly to reduced costs and supported the goal of increased spending
on strategic initiatives. Most importantly, the organization could build a
business case linking directly to the value of understanding their software
DNA.
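
The three outcomes described above (more projects within budget, a smaller percentage of overrun, and a tighter distribution) can each be expressed as a simple statistic over per-project budget variance. The sketch below shows one way such a before-and-after comparison might be computed. The function name and the definition of variance as actual cost divided by planned cost, minus one, are assumptions of this sketch, and the sample values are made up purely to run the code; they are not the case-study data.

```python
from statistics import mean, pstdev


def budget_summary(variances):
    """Summarize per-project budget variance.

    Each value is (actual cost / planned cost) - 1, so 0.10 means 10% over
    budget and -0.05 means 5% under. This definition is an assumption of the
    sketch, not taken from the case study.
    """
    if not variances:
        raise ValueError("need at least one project")
    overruns = [v for v in variances if v > 0]
    return {
        "projects": len(variances),
        "within_budget": len(variances) - len(overruns),
        "mean_overrun": mean(overruns) if overruns else 0.0,
        # Population standard deviation: a smaller value corresponds to a
        # "tighter bell curve" around the plan.
        "spread": pstdev(variances),
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    before = [-0.10, 0.50, 0.30, -0.20]
    after = [-0.05, 0.10, 0.00, -0.10]
    print("before:", budget_summary(before))
    print("after: ", budget_summary(after))
```

Tracking the spread alongside the count of projects within budget matters because a portfolio can improve its average while still carrying a few projects far from plan.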

Examining the Software Genome: Implications for Business

Examining the software genome is a comprehensive approach to
understanding the inherent attributes that contribute to the successes and
failures of a software organization. It is a serious undertaking that succeeds
in an environment of management commitment. The focus must be on the
overall behavior patterns and long-term trends rather than individual
performances.

The payback can be significant and is multi-faceted. Immediate returns are
generally evident in increased predictability, validation of software plans,
and demonstrable evidence of process bottlenecks. Longer-term benefits
are commonly tied to realizing and quantifying real process improvements
that support corporate investment goals.

The Software Genome Council: A Partnership of Emergeon and InformationWeek

The Software Genome Council is a fee-based membership organization
composed of IT executives committed to on-target performance in software
development. The Council’s charter is to expand an existing industry
databank of real software applications in order to uncover the “DNA” of
successful systems and to leverage that information for improved
predictability across member organizations.

The Software Genome Council has been launched with comprehensive
metrics from more than 6,000 completed software applications*. With
members contributing metrics from 3 to 10 completed projects per year, a
mechanism exists for keeping the databank growing and contemporary.

* The Software Genome Council is powered by QSM, Inc. and their
world-class SLIM industry repository and analysis models.

To Learn More

For information on the Software Genome Council go to
softwaregenomecouncil.com or contact Emergeon at 888-868-7216.

About the Author

Ira Grossman is President and founder of Emergeon, LLC and creator of the
Software Genome Council. Ira has over seventeen years of experience helping
companies manage their software development projects. Emergeon is a
professional services firm specializing in helping technology organizations
manage and deliver software on time and on budget. Ira can be reached at
ira.grossman@emergeon.com.

Copyright Emergeon, LLC. 2002

Emergeon
250 Jordan Road,
Troy, NY 12180
U.S.A.

All Rights Reserved.

The Emergeon logo is the registered trademark of Emergeon, LLC.
