
Guidelines to Spreadsheet Engineering
Additional Study Material
ePGP
Agenda
Guidelines on Spreadsheet Engineering
• Designing a Spreadsheet
• Building a Spreadsheet
• Testing a Spreadsheet
Planning is Key
• Extra time spent in planning can actually reduce the overall time
required to perform a spreadsheet analysis.
• Without good planning, the need for design improvements is
detected only after implementation has begun, and much of the
implementation effort is wasted.
• In addition to speeding up the design process, advance planning can
help the designer avoid critical errors in a design.
Guidelines on Spreadsheet Engineering
The guidelines for spreadsheet engineering are organized around the
following phases:
• Designing
• Building
• Testing
Designing a Spreadsheet
• The essential first step in developing any spreadsheet model is
to design it.
• The following slides present some tips on good design practices for
single-worksheet models.
Guidelines
The eight guidelines for designing a worksheet are:
• Sketch the spreadsheet
• Organize the spreadsheet into modules
• Start small
• Isolate input parameters
• Design for use
• Keep it simple
• Design for communication
• Document important data and formulae
Sketch the Spreadsheet

• Plan carefully to avoid mistakes and rework.
• Turn the computer off and think for a while before hitting any keys.
• Modelers who are relatively new to spreadsheets should begin with a sketch of
the spreadsheet before entering anything into the computer. This is also a good
first step for experienced modelers.
Sketch the Spreadsheet
• A sketch should show the physical layout of major elements and should contain at
least a rough indication of the flow of calculations.
• We use variable names to indicate how calculations will be performed.
• For example, we might write: Profit = Total Revenue − Total Cost. In order to
show the logic for calculating unsold goods, we might write: IF(Stock >
Demand, Stock − Demand, 0).
• The acid test for whether a sketch is sufficiently detailed is whether someone else
could build a spreadsheet from it without any significant redesign.
• An Influence Chart and a Blackbox Diagram provide a useful starting point for a
spreadsheet design. While an Influence Chart provides an overview of the
relationship among all the elements of the model, a Blackbox Diagram
summarizes all inputs and outputs of the model.
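The two sketch formulas above can be written out as a minimal Python sketch. The function and argument names are illustrative assumptions, not part of the original material.

```python
def profit(total_revenue, total_cost):
    # Profit = Total Revenue - Total Cost
    return total_revenue - total_cost


def unsold_goods(stock, demand):
    # Mirrors the sketch logic IF(Stock > Demand, Stock - Demand, 0)
    return stock - demand if stock > demand else 0


print(profit(1000, 600))       # 400
print(unsold_goods(120, 100))  # 20
print(unsold_goods(80, 100))   # 0
```

Writing the logic with variable names first, as here, is what lets someone else build the spreadsheet from the sketch.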
Sketch the Spreadsheet
• Generally, we will place all inputs (parameters/data and decision variables) at the
beginning, followed by key performance measures/key output(s) and then
calculations.
• This is because:
• We expect to vary some of the decision variables, and we will want to know
the consequences for the output measure. Since we want to see the effects of
varying decisions on the output, it makes sense to place these items close
together in the spreadsheet.
• In addition, we may also want to alter one or more of the input parameters
and revisit the relationship between decisions and output. Therefore, it also
makes sense to have inputs in close proximity to the output and decisions.
• The one part of the spreadsheet we will not be examining or altering, once
we have tested it, is the set of calculations. Therefore, we place the
calculations in a secondary location.
Organize the Spreadsheet into Modules
• Modules bring together groups of similar items, and they separate unlike items.
Modularization is a basic principle of good design and a useful first step in
organizing information.
• In spreadsheets, this means separating data, decision variables, outcome
measures, and detailed calculations. If an Influence Chart and a Blackbox Diagram
have been constructed, then a lot of this work will already have been done.
• Models that are more complex may, of course, contain many more modules and
require layouts that are more elaborate.
• Along with grouping and separating, the next step is to consider the flow of
information in the model—that is, to specify which information will need to pass
from one group to another. Data/parameters and decisions will serve as inputs to
some of the calculations, and a small number of the calculations will eventually
be highlighted as outputs. After these key linkages are identified, the additional
development of one module can go on somewhat independently of modifications
in other modules. Keep in mind that formulas should generally reference cells
located above and to the left.
Start Small
• Do not attempt to build a complex spreadsheet all at once. Isolate
one part of the problem or one module of the spreadsheet; then
design, build, and test that one part. Once that part of the model is in
good shape, go on to the next one.
• By making (and correcting) many small mistakes within each module, we
keep any mistakes local and avoid the large, complex mistakes that
require much more effort to detect and correct.
• For example, if we were building a complex model to cover 12
months, we would start by building a model for the first month; then
we would expand it to 2 months and ultimately to all 12.
Isolate Input Parameters
• Place the numerical values of key parameters in a single location and separate them from
calculations. This means that formulas contain only cell references, not numerical values.
It also means that a parameter contained in several formulas appears only once as a
numerical value in the spreadsheet, although it may appear several times as a cell
reference in a formula.
• Parameterization offers several advantages. First, placing parameters in a separate
location makes it easy to identify them and change them. It also makes a particular
scenario immediately visible. Parameterization ensures that changing a numerical value
in one cell is sufficient to induce a change throughout the entire model. In addition,
parameterization is required for effective sensitivity analysis. Finally, it is relatively easy
to document the assumptions behind parameters, or the sources from which they were
derived, if those parameters appear in a single location.
• Burying parameters in cell formulas and replicating the same parameter in multiple cells
make identifying parameters difficult, because they are not immediately visible. It is also
difficult to know whether all numerical values of a parameter have been changed each
time an update is required. By contrast, the habit of using a single and separate location
considerably streamlines the building and debugging of a spreadsheet.
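The idea of parameterization can be illustrated with a short Python sketch. The parameter names and values below are hypothetical; the point is only that every number appears once, in one place, and the calculations refer to it by name.

```python
# All numerical inputs live in one place (hypothetical values).
PARAMETERS = {
    "price": 10.0,
    "unit_cost": 6.0,
    "units_sold": 100,
}


def revenue(p):
    # Formulas contain only references to parameters, never raw numbers.
    return p["price"] * p["units_sold"]


def total_cost(p):
    return p["unit_cost"] * p["units_sold"]


def profit(p):
    return revenue(p) - total_cost(p)


base = profit(PARAMETERS)
# Changing one parameter in one place updates every dependent result.
scenario = dict(PARAMETERS, price=12.0)
print(base, profit(scenario))  # 400.0 600.0
```

Because the price is stored only once, the alternative scenario needs a single change, which is also what makes sensitivity analysis practical.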
Design for Use
• While designing a spreadsheet, try to anticipate who will use it and
what kinds of questions they will want to address.
• Make it easy to change parameters that can be expected to change
often.
• Make it easy to find key outputs by collecting them in one place.
Include graphs of the outputs to make it easier to learn from the
spreadsheet.
• In a large model, the outputs may be scattered over many locations. It
is very helpful to gather them together and place them near the
inputs so that the details of the model itself do not interfere with the
process of analysis.
Keep it Simple
• Just as good models should be simple, good spreadsheets should be as
simple as possible while still getting the job done.
• Complex spreadsheets require more time and effort to build than simple
ones, and they are much more difficult to debug.
• Some of the earlier guidelines, such as modularization and
parameterization, help keep models simple.
• Long formulae are a common symptom of overly complex spreadsheets.
• It is better to decompose a complex calculation into its intermediate steps
and to display each step in a separate cell.
• This makes it easier to spot errors in the logic and to explain the
spreadsheet calculations to others. Overall, it is a more efficient use of the
combined human–computer team.
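As a rough illustration of decomposing a long formula, the Python sketch below computes the same result twice: once in a single expression and once through named intermediate steps, each of which would occupy its own cell in a spreadsheet. The model and its inputs are assumptions made up for this example.

```python
def net_profit_one_line(price, units, discount, fixed_cost, unit_cost):
    # One long expression: compact, but hard to audit.
    return price * units * (1 - discount) - (fixed_cost + unit_cost * units)


def net_profit_steps(price, units, discount, fixed_cost, unit_cost):
    # Each intermediate result gets its own name (its own "cell").
    gross_revenue = price * units
    net_revenue = gross_revenue * (1 - discount)
    variable_cost = unit_cost * units
    total_cost = fixed_cost + variable_cost
    return {
        "gross_revenue": gross_revenue,
        "net_revenue": net_revenue,
        "variable_cost": variable_cost,
        "total_cost": total_cost,
        "net_profit": net_revenue - total_cost,
    }


steps = net_profit_steps(10, 100, 0.25, 200, 3)
print(steps["net_profit"])                          # 250.0
print(net_profit_one_line(10, 100, 0.25, 200, 3))   # 250.0
```

With the stepwise version, a wrong intermediate value (say, total cost) is visible directly instead of being hidden inside the final number.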
Design for Communication
• Spreadsheets are often used long after the builder ever thought they would be,
and frequently by people who are not familiar with them.
• Logical design helps users understand what the spreadsheet is intended to
accomplish and how to work with it effectively.
• The look and the layout of a spreadsheet often determine whether its developer
or another user can understand it several months or years after it was built.
• Visual cues that reinforce the model's logic also pay dividends when the
spreadsheet gets routine use in a decision-support role.
Design for Communication
• The use of informative labels and the incorporation of blank spaces can go a long
way toward conveying the organization of a spreadsheet.
• The specialized formatting options in Excel (outlines, color, bold font, and so on)
can also be used to highlight certain cell entries or ranges for quick visual
recognition. This facilitates navigating around the spreadsheet, both when building
the spreadsheet and when using it.
• However, formatting tools should be applied with care. If used to excess,
formatting can confuse, obscure, and annoy rather than help.
• It also helps to use these formatting tools consistently.
Document Important Data and Formulae
• We can provide documentation within individual cells by inserting cell
comments. The command Review>Comments>New Comment brings
up a small window in which we can describe the contents of the cell
where the cursor is located.
• It may be worth creating a separate module to list the assumptions in
the model, particularly the structural simplifications adopted at the
outset of the model-building process. Assumptions may not be
obvious to another user of the model but may significantly influence
the results. Thus, they should be noted on the spreadsheet itself.
Building a Spreadsheet
• A well-designed spreadsheet should be easy and quick to build.
However, speed is not the only criterion here.
• Most bugs in spreadsheets are introduced during the building
process.
• Therefore, learning to build spreadsheets without introducing errors
is also vital.
• The guidelines for this phase are designed to make the building
process routine, repeatable, and error-free.
Guidelines
• Follow a plan
• Build one module/section at a time
• Predict the outcome of each formula
• Copy and paste formulae correctly
• Use relative and absolute referencing to simplify copying
• Use range names to make formulae easy to read
• Choose input data to make errors stand out
Follow a Plan
• Having gone to the trouble of sketching the spreadsheet, we should
follow the sketch when building.
• With a sufficiently detailed sketch, the building process itself becomes
largely mechanical and therefore less prone to mistakes.
Build One Module/Section at a Time
• Rather than trying to build an entire worksheet in one pass, it is
usually more efficient to build a single module/section and test it out
before proceeding.
• As we build the first module, we may discover that the design itself
can be improved, so it is best to have made a limited investment in
the original design before revising it.
• Another rationale for this advice is to localize the potential effects of
an error. If we make an error, its effects are likely to be limited mainly
to the module we are building. By staying focused on that module, we
can fix errors early, before they infect other modules that we build
later.
Predict the Outcome of Each Formula
• For each formula entered, predict the numerical value expected from it
before pressing the Enter key.
• Ask what order of magnitude to expect in the result and give some thought
to any outcomes that do not correspond to predictions.
• This discipline helps to uncover bugs: without a prediction, every numerical
outcome tends to look plausible.
• At the same time, a prediction that is orders of magnitude different from
the calculated number provides an opportunity to catch an error.
• For example, if we predict $100,000 for annual revenue, and the calculated
value comes to $100,000,000, then there is a flaw either in our intuition or
in our formula. Either way, we can benefit: our intuition may be sharpened,
or we may detect an error in need of fixing.
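The prediction discipline can be made concrete with a small Python sketch that compares a predicted value with a calculated one by order of magnitude. The function names and the one-order-of-magnitude tolerance are assumptions for illustration, and the check only applies to positive values.

```python
import math


def magnitude_gap(predicted, calculated):
    """Orders of magnitude separating two positive values."""
    return abs(math.log10(calculated) - math.log10(predicted))


def needs_review(predicted, calculated, tolerance=1.0):
    # Flag the result when it is off by at least `tolerance` orders of magnitude.
    return magnitude_gap(predicted, calculated) >= tolerance


# The revenue example from the text: predicted $100,000, calculated $100,000,000.
print(needs_review(100_000, 100_000_000))  # True
print(needs_review(100_000, 120_000))      # False
```

A flagged result does not say where the flaw is; it only signals that either the intuition or the formula deserves a second look.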
Copy and Paste Formulae Correctly
• The Copy-and-Paste commands in Excel are not simply time-savers; they
are also helpful in avoiding bugs.
• Instead of entering structurally similar formulas several times, we copy and
paste a formula.
• Repetition can be a source of errors, and copying formulas can diminish the
potential for this type of error.
• However, careless copying is itself a source of bugs.
• One of the most common errors is to select the wrong range for copying—
for example, selecting one cell too few in copying a formula across a row.
• Recognizing this problem keeps us alert to the possibility of a copying error,
and we are therefore more likely to avoid it.
Use Relative and Absolute Referencing to Simplify
Copying
• Efficient copying depends on skillful use of relative and absolute
addressing.
• When we include a cell with a relative reference in a formula and
then copy the formula, the cell address changes to preserve the
relative position between the highlighted cell and the input cell.
• When a formula with an absolute reference is copied, the address is
copied unchanged. Absolute addresses are usually used when
referring to a parameter, because the location of the parameter is
fixed.
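The copying behaviour described above can be imitated with a simplified Python sketch. It assumes Excel's A1 notation, in which a `$` before the column letter or row number marks that part of the address as absolute. The function `copy_formula` is a hypothetical helper that handles only plain cell references and non-negative offsets; it is not how Excel itself is implemented.

```python
import re

# Optional "$", column letters, optional "$", row digits (for example B5 or $B$1).
CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    n = 0
    for ch in col:
        n = n * 26 + ord(ch) - ord("A") + 1
    return n


def num_to_col(n):
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_formula(formula, d_rows, d_cols):
    """Shift relative references by the copy offset; leave $-anchored parts alone."""
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        if not col_abs:
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_abs:
            row = str(int(row) + d_rows)
        return f"{col_abs}{col}{row_abs}{row}"
    return CELL_REF.sub(shift, formula)


print(copy_formula("=B5*$B$1", 1, 0))  # =B6*$B$1
print(copy_formula("=B5*$B$1", 0, 1))  # =C5*$B$1
```

In both copies the relative reference B5 moves with the formula, while the parameter reference $B$1 stays fixed.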
Use Range Names to Make Formulae Easy to Read
• Any cell or range of cells in a spreadsheet can be given a name. This name
can then be used in formulas to refer to the contents of the cell.
• It is easier to understand a formula that uses range names than one that
uses cell addresses.
• Formulas containing descriptive names are easier for the developer to
debug and easier for new users to understand.
• Range names require extra work to enter and maintain, so they may not be
worth the effort in simple spreadsheets destined for one-time use.
• But in a spreadsheet that will become a permanent tool or be used by
other analysts after the designer has moved on, it is a good idea to use
range names to help subsequent users understand the details.
Choose Input Data to Make Errors Stand Out
• Most modelers naturally use realistic values for input parameters as they
build their spreadsheets. This has the advantage that the results look
plausible, but it has the disadvantage that the results are difficult to check.
• For example, if the expected price is $25.99 and unit sales are 126,475,
revenues will be calculated as $3,287,085.25. We could check this with a
calculator, but it is not easy to check by eye.
• However, if we input arbitrary values of $10 for price and 100 for unit sales,
we can easily check that our formula for revenue is correct if it shows a
result of $1,000.
• Generally speaking, it saves time in the long run to input arbitrary but
simple values for the input parameters (for example, 1, 10, and 100) during
the initial building sequence.
• Once the spreadsheet has been debugged with these arbitrary values, it is
then a simple matter to replace them with the actual input values.
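The contrast between realistic and simple inputs can be seen in a short Python sketch. The `revenue` function is an illustrative stand-in for a spreadsheet formula, and `Decimal` is used only so that the currency arithmetic is exact.

```python
from decimal import Decimal


def revenue(price, unit_sales):
    return price * unit_sales


# Simple inputs: the expected answer is obvious by eye.
print(revenue(Decimal("10"), 100))        # 1000

# Realistic inputs: correct, but hard to verify without a calculator.
print(revenue(Decimal("25.99"), 126475))  # 3287085.25
```

Once the formula has been confirmed with the simple values, the realistic values can be substituted with more confidence.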
Testing a Spreadsheet
• Even a carefully designed and built spreadsheet may contain errors.
• Errors can arise from incorrect references in formulas, from inaccurate
copying and pasting, from lack of parameterization, and from a host of
other sources.
• There is no recipe to follow for finding all bugs.
• When a bug is found late in the analysis phase, the user must backtrack to
fix the bug and to repeat most or all of the previous analysis. This can be
avoided by carefully testing the spreadsheet before using it for analysis.
• The next few slides provide some guidelines that can help an end user test
whether a model is correct. However, one of the most effective ways to
find errors in a model is to give it to an outsider to test.
Guidelines
• Check that numerical results look plausible
• Check that formulae are correct
• Test that model performance is plausible
Check that Numerical Results Look Plausible
• The most important tool for keeping a spreadsheet error-free is a
skeptical attitude.
• As we build the spreadsheet, we transform input parameters into a
set of intermediate results that eventually lead to final outcomes. As
these numbers gradually appear, it is important to check that they
look reasonable.
• Three distinct ways to accomplish this are:
• Make rough estimates
• Check with a calculator
• Test extreme cases
Check that Numerical Results Look Plausible
• Make Rough Estimates: Predicting the rough magnitude of the result of each
formula before pressing Enter helps catch errors as they are made. Similarly, it is a
good idea to scan the completed spreadsheet and to check that critical results are
of the correct order of magnitude.
• Check with a Calculator: A more formal approach to error detection is to check
some portion of the spreadsheet on a calculator. Pick a typical column or row and
check the entire sequence of calculations. Errors often occur in the last row or
column due to problems in Copy-and-Paste operations, so check these areas, too.
• Test Extreme Cases: If the logic behind a spreadsheet is correct, it should give
logical results even with unrealistic assumptions. For example, if we set Price to
$0, we should have zero revenues. Extreme cases such as this are useful for
debugging because the correct results are easy to predict. Note, however, that
just because a spreadsheet gives zero revenues when we set Price to $0 does not
guarantee that the logic will be correct for all cases.
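The caveat about extreme cases can be shown with a minimal Python sketch. The second function contains a deliberate bug, invented for this example, that the zero-price test cannot detect.

```python
def revenue_correct(price, units):
    return price * units


def revenue_buggy(price, units):
    # Deliberate bug for illustration: units is multiplied in twice.
    return price * units * units


# Extreme case: a price of 0 should give zero revenue. Both versions pass.
print(revenue_correct(0, 100), revenue_buggy(0, 100))    # 0 0

# A second, easily predicted case exposes the bug.
print(revenue_correct(10, 100), revenue_buggy(10, 100))  # 1000 100000
```

Extreme cases are therefore a useful filter, but they need to be combined with other checks.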
Check that Formulae are Correct
• Most spreadsheet errors occur in formulas.
• We can reduce the possibility of errors by making formulas short and
using multiple cells to calculate a complex result.
• We can also reduce errors by using recursive formulas wherever
possible, so that successive formulas in a row or column have the
same form.
• Yet another good idea is to design the spreadsheet so that formulas
use as inputs only cells that are above and to the left and are as close
as possible.
Check that Formulae are Correct
• Having taken all these precautions, we still need to test that formulas
are correct before beginning the analysis. A few ways to perform this
testing are:
• Check visually (highlighting each cell in a row or column in sequence and
visually auditing the formula)
• Display individual cell references (using the cell-edit capability, invoked either
by pressing the F2 key or by double-clicking on the cell of interest)
• Display all formulas (using Ctrl + `, the key that also carries ~, or Formulas>Show Formulas)
• Use the Excel Auditing Tools (using Formulas>Trace Precedents and
Formulas>Trace Dependents which reveal the pattern of information flow in
the model)
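What tracing precedents and dependents reveals can be approximated with a small Python sketch. The worksheet below is hypothetical, and the reference parsing is a simplification that only recognizes plain cell addresses; it is meant to show the idea of information flow, not to reproduce the Excel tools.

```python
import re

# Hypothetical worksheet: cell address -> value or formula text.
cells = {
    "B1": "10",      # price
    "B2": "100",     # units sold
    "B3": "6",       # unit cost
    "B5": "=B1*B2",  # revenue
    "B6": "=B3*B2",  # total cost
    "B7": "=B5-B6",  # profit
}


def precedents(address):
    """Cells that the formula in `address` reads from."""
    content = cells[address]
    if not content.startswith("="):
        return set()
    return set(re.findall(r"[A-Z]+\d+", content.replace("$", "")))


def dependents(address):
    """Cells whose formulas read from `address`."""
    return {other for other in cells if address in precedents(other)}


print(sorted(precedents("B7")))  # ['B5', 'B6']
print(sorted(dependents("B2")))  # ['B5', 'B6']
```

A reference that shows up in an unexpected precedent or dependent set is a sign that a formula points at the wrong cell.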
Test that Model Performance is Plausible
• If a spreadsheet model is logically sound and built without errors, it
should react in a plausible manner to a range of input values.
• Sensitivity testing can be a powerful way to uncover logical and
mechanical errors.
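A simple form of sensitivity testing can be sketched in Python by sweeping one input and checking that the output responds as expected. The profit model and its default values are assumptions for illustration.

```python
def profit(price, units=100, unit_cost=6.0):
    # Hypothetical model: margin per unit times units sold.
    return (price - unit_cost) * units


prices = [0, 5, 10, 15, 20]
profits = [profit(p) for p in prices]
print(profits)  # [-600.0, -100.0, 400.0, 900.0, 1400.0]

# With units and unit cost held fixed, profit should rise with price.
is_increasing = all(a < b for a, b in zip(profits, profits[1:]))
print(is_increasing)  # True
```

If the output fell, stayed flat, or jumped erratically as the price rose, that would point to a logical or mechanical error worth investigating.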
References
• Powell, S. G., & Baker, K. R. (2013). Management Science: The Art of
Modeling with Spreadsheets. John Wiley & Sons.
• Leong, T. Y., & Cheong, M. L. F. (2008). Business Modeling with
Spreadsheets: Problems, Principles, and Practice. McGraw Hill.
• Read, N., & Batson, J. (1999). Spreadsheet Modeling Best
Practice. http://www.eusprig.org/smbp.pdf
• Conway, D. G., & Ragsdale, C. T. (1997). Modeling Optimization
Problems in the Unstructured World of Spreadsheets. Omega, 25, 313–322.
• Howard, R. A., & Matheson, J. E. (2005). Influence diagram
retrospective. Decision Analysis, 2(3), 144–147.
