OLD:
Usually a project needs to exist for a few years before it accumulates enough complexity to become really difficult to maintain.
It will also go through a number of generations of maintainers.
With each of these handoffs, knowledge about the original design of the system and the intentions of the previous maintainer is lost.
LARGE:
INHERITED:
POORLY DOCUMENTED:
Keeping accurate and thorough documentation is important to a project's long-term survival.
If there’s one thing that developers enjoy less than writing documentation, it’s keeping that
documentation up to date.
So any technical documents that do exist must invariably be taken with a pinch of salt.
The system had an API that allowed you to retrieve a list of the most popular threads,
along with a few of the most recently posted messages in each of those threads.
The API looked something like the following listing.
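A minimal sketch of what such an API might look like in Java. All type and method names here (`DiscussionService`, `ForumThread`, `Message`) are invented for illustration, not the system's actual code:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the forum API described above; names are assumptions.
interface DiscussionService {

    /** The most popular threads, most popular first. */
    List<ForumThread> getPopularThreads(int maxThreads);

    /** The most recently posted messages in one of those threads. */
    List<Message> getRecentMessages(long threadId, int maxMessages);
}

class ForumThread {
    final long id;
    final String title;
    ForumThread(long id, String title) { this.id = id; this.title = title; }
}

class Message {
    final long threadId;
    final String body;
    Message(long threadId, String body) { this.threadId = threadId; this.body = body; }
}

// Tiny in-memory stub, only to show the shape of a call.
class StubDiscussionService implements DiscussionService {
    public List<ForumThread> getPopularThreads(int maxThreads) {
        List<ForumThread> all = Arrays.asList(new ForumThread(1, "Welcome"));
        return all.subList(0, Math.min(all.size(), maxThreads));
    }
    public List<Message> getRecentMessages(long threadId, int maxMessages) {
        return Arrays.asList(new Message(threadId, "First post"));
    }
}
```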
Exceptions to the rule
Just because a project fulfills some of the preceding criteria doesn't necessarily mean it should be treated as a legacy project.
Legacy Code:
The most important part of any software project, especially for an engineer, is the code itself.
In this section we’ll look at a few common characteristics usually seen in legacy code.
A good test suite can function as the de facto documentation for a project.
In fact, tests can even be more useful than documentation, because they’re more likely to be
kept in sync with the actual behavior of the system.
A socially responsible developer will take care to fix any tests that were broken by their changes
to production code.
Many legacy projects have almost no tests. Not only this, but the projects are usually written
without testing in mind, so retroactively adding tests is extremely difficult.
Inflexible Code:
A common problem with legacy code is that implementing new features or changing existing behavior is inordinately difficult.
What seems like a minor change can involve editing code in a lot of places.
To make matters worse, each one of those edits also needs to be tested, often manually.
Imagine your application defines two types of users: Admins and Normal users.
Admins are allowed to do anything, whereas the actions of Normal users are restricted.
The authorization checks are implemented simply as if statements, spread throughout the codebase.
It’s a large and complex application, so there are a few hundred of these checks in total, each one
looking like the following.
    public void deleteThread(long threadId) throws NotAuthorizedException {
        if (!loggedInUser.isAdmin()) {
            throw new NotAuthorizedException();
        }
        // ... perform the privileged action ...
    }
One day you're asked to add a new user type called Power User.
These users can do more than Normal users but are not as powerful as Admins.
So for every action that Power Users are allowed to perform, you'll have to search through the
codebase, find the corresponding if statement, and update it.
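One hedged sketch of a way out of this trap: centralize authorization in a single place, so adding a Power User means one edit instead of hundreds. The names here (`Role`, `Action`, `Permissions`) are assumptions for illustration, not the system's actual code:

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Sketch: one central authorization decision point. Names are assumptions.
enum Role { NORMAL, POWER_USER, ADMIN }

enum Action { READ_THREAD, POST_MESSAGE, DELETE_THREAD }

final class Permissions {
    // The ONLY place that maps user types to allowed actions.
    private static final Map<Role, Set<Action>> ALLOWED = Map.of(
        Role.NORMAL,     EnumSet.of(Action.READ_THREAD),
        Role.POWER_USER, EnumSet.of(Action.READ_THREAD, Action.POST_MESSAGE),
        Role.ADMIN,      EnumSet.allOf(Action.class)
    );

    static boolean isAllowed(Role role, Action action) {
        return ALLOWED.get(role).contains(action);
    }

    private Permissions() {}
}
```

Each scattered `if (!loggedInUser.isAdmin())` check then becomes a call like `Permissions.isAllowed(user.getRole(), Action.DELETE_THREAD)`, and a new user type needs only one edit.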
Legacy Infrastructure
The quality of the code is a major factor in the maintainability of any legacy project, but just looking
at the code doesn't show you the whole picture.
Most software depends on an assortment of tools and infrastructure in order to run, and the
quality of these tools can also have a dramatic effect on a team’s productivity.
In part 3 we’ll look at ways to make improvements in this area.
Development Environment
Think about the last time you took an existing project and set it up on your development
machine.
Approximately how long did it take, from first checking the code out of version control to
reaching a state where you could do the following?
View and edit the code in your IDE
Run the unit and integration tests
Run the application on your local machine
When it comes to legacy projects, the time taken to set up the development environment can
often be measured in days.
Setting up a legacy project often involves a combination of:
Downloading, installing, and learning how to run whatever arcane build tool the project uses
Running the mysterious and unmaintained scripts you found in the project's /bin folder
Taking a large number of manual steps, listed on an invariably out-of-date wiki page
Nearly any software project has some dependencies on third-party software. For example, if it's a Java
servlet web application, it will depend on Java. It also needs to run inside a servlet container such as
Tomcat, and may also use a web server such as Apache. Most likely it will also use various Java libraries
such as Apache Commons.
Heterogeneous Environments
Using hardware and system software from different vendors. Organizations often use computers,
operating systems, and databases from a variety of vendors. Contrast with a homogeneous environment.
• The number and names of the environments are not set in stone, but the process usually goes
something like this:
1. The software is built and tested on each developer's local machine, in the development environment.
2. It is deployed to a test environment, where it is checked by QA.
3. It is deployed to a staging environment that mirrors production as closely as possible, for final verification.
4. It is released and deployed to the production environment (or, in the case of packaged software,
it's shipped to the customer).
Legacy culture
• The term legacy culture is perhaps a little contentious (nobody wants to think of themselves and
their culture as legacy), but I've noticed that many software teams who spend too much time
maintaining legacy projects share certain characteristics with respect to how they develop
software and communicate among themselves.
Fear of change
• Many legacy projects are so complex and poorly documented that even the teams charged with
maintaining them don’t understand everything about them.
• For example
• Which features are no longer in use and are thus safe to delete?
• Which bugs are safe to fix? (Some users of the software may be depending on the bug and
treating it as a feature.)
• Inspecting your codebase using FindBugs, PMD, and Checkstyle
• I want you to choose one piece of legacy software that you have experience maintaining.
• Think hard about this software and try to remember everything you can about it.
• Every bug you fixed, every new feature you added, every release you made.
Fear
• I have worked on code in the past that has caused me to think like this:
• I should get in and get out without touching anything I don’t absolutely have to handle.
HELP IS AT HAND
• When performing exploratory refactoring, always remember that you have strong allies you can
rely on to protect you.
• If your refactoring gets out of hand and you don’t feel confident that the code still works
correctly, all it takes is one command to revert your changes and put the code back into a
known, safe state.
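For example, with Git (assuming the failed refactoring has not yet been committed), a single command restores the last known-good state. The sketch below builds a throwaway repository so it is self-contained:

```shell
# Demo in a throwaway repository (mktemp keeps this sketch self-contained).
repo=$(mktemp -d) && cd "$repo"
git init -q
echo "known good code" > Foo.java
git add Foo.java
git -c user.name=demo -c user.email=demo@example.com commit -qm "safe state"

# A refactoring goes wrong...
echo "half-finished refactoring" > Foo.java

# ...one command puts the code back into a known, safe state:
git checkout -- Foo.java

cat Foo.java   # prints "known good code"
```

If the bad refactoring was already committed, `git reset --hard HEAD~1` steps back one commit instead.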
The IDE
• Modern IDEs such as Eclipse and IntelliJ offer powerful refactoring functionality.
• They are capable of performing a wide range of standard refactorings, and they can do in
milliseconds what would take a human minutes or hours of tedious editing to achieve.
The compiler
• Assuming you’re working with a statically compiled language such as Java, the compiler can help
you discover the effects of your changes quickly.
• After each refactoring step, run the compiler and check that you haven’t introduced any
compilation errors. Using the compiler to obtain quick feedback like this is often called leaning
on the compiler.
Other developers
• Everybody makes mistakes, so it’s always a good idea to have a coworker review the changes
you’ve made.
• One developer (the “navigator”) can check for mistakes and provide advice while the other
developer (the “driver”) focuses on performing the refactoring.
CHARACTERIZATION TESTS
• As a complement to exploratory refactoring, you could also try adding characterization tests (a
term coined by Michael Feathers).
• These are tests that demonstrate how a given part of the system currently behaves.
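A characterization test pins down whatever the code does today, right or wrong, so a refactoring that changes behavior fails the test. A sketch, where `LegacyPriceCalculator` is an invented stand-in for some real legacy class:

```java
// Invented legacy class standing in for real production code.
class LegacyPriceCalculator {
    double totalWithShipping(double subtotal) {
        // Nobody remembers why shipping is waived at exactly 50.0, but the
        // characterization test below locks this behavior in before refactoring.
        if (subtotal >= 50.0) {
            return subtotal;
        }
        return subtotal + 4.99;
    }
}

// Plain checks are used to stay self-contained; in a real project
// this would be a JUnit test class.
class LegacyPriceCalculatorCharacterization {
    public static void main(String[] args) {
        LegacyPriceCalculator calc = new LegacyPriceCalculator();
        // Expected values recorded by RUNNING the code, not from any spec:
        check(calc.totalWithShipping(60.0), 60.0);
        check(calc.totalWithShipping(10.0), 10.0 + 4.99);
    }

    static void check(double actual, double expected) {
        if (Math.abs(actual - expected) > 1e-9) {
            throw new AssertionError(actual + " != " + expected);
        }
    }
}
```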
Frustration
• Here are some more negative and unhelpful thoughts that legacy software has provoked in me
in the past:
• I wouldn’t even know where to begin fixing this big ball of mud.
• Every class is as bad as the previous (or next) one. Let’s pick classes at random and start
refactoring.
• I've had it with this WidgetManagerFactoryImpl! I'm not doing anything else until I've rewritten
it.
LOSS OF MOTIVATION
• You give up your refactoring efforts as hopeless, and resign yourself to a future of legacy
drudgery.
DESPERATE MEASURES
• Until a few weeks later when enough frustration builds up and forces you to go through the
same cycle again.
• We want to gather metrics about legacy software in order to help us answer the following
questions:
• What state is the code in to start with? Is it really in as bad shape as you think?
• What should be your next target for refactoring at any given time?
• How much progress are you making with your refactoring? Are you improving the quality fast
enough to keep up with the entropy introduced by new changes?
• Static analysis tools can analyze a codebase and detect possible bugs or poorly written code.
• Static analysis involves looking through the code (either the human-readable source code or the
machine-readable compiled code) and flagging any pieces of code that match a predefined set of
patterns or rules.
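As an invented example of the kind of pattern such tools match, the method below compiles cleanly but contains a classic bug that rule-based tools like FindBugs flag:

```java
// Invented example of a bug static analysis tools catch: comparing String
// contents with == checks reference identity, not character equality.
class CouponChecker {
    static boolean isWinner(String code) {
        String winning = new String("JACKPOT");
        // BUG: winning is a freshly allocated object, so this comparison
        // is always false. The fix is code.equals(winning).
        return code == winning;
    }
}
```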
Performance
• One goal of your refactoring may be to improve the performance of a legacy system.
• If you wanted to test the performance of the Audit component, you could start with the
following test:
1. Prepare a realistic quantity of test data.
2. Pass the data to the Audit component for processing.
3. Time how long it takes to process the data and generate an audit report.
• If your software is a web application, it’s easy to collect performance data from the production
system.
• Any decent web server will be able to output the processing time of every request to a log file.
• You could write a simple script to aggregate this data to calculate percentile response times per
hour, per day, and so on.
• For example, assuming your web server outputs one access log file per day and the request
processing time is the final column of the tab-separated file, the following shell snippet would
output the 99th percentile response time for a given day’s accesses.
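A minimal sketch of such a snippet, wrapped in a function. It assumes, as stated above, a tab-separated log whose final column is the processing time; the log file path in the usage comment is a placeholder:

```shell
# Sketch: 99th-percentile response time for one day's accesses.
# Assumes a tab-separated access log whose FINAL column is the
# request processing time.
p99() {
  awk -F'\t' '{ print $NF }' "$1" \
    | sort -n \
    | awk '{ t[NR] = $1 }
           END { i = int(NR * 0.99); if (i < 1) i = 1; print t[i] }'
}

# Usage (path is a placeholder):
# p99 /var/log/httpd/access.log
```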
• You could run this script every night and email the results to the development team.
Error counts
• Measuring performance is all well and good, but it doesn’t matter how fast your code runs if it’s
not doing its job correctly (giving users the results they expect and not throwing any errors).
• A count of the number of errors happening in production is a simple but useful indicator of the
quality of your software, as seen from the end-user’s perspective.
• For example, you could introduce an automatic error-reporting feature into your product that
contacts your server whenever an exception occurs.
• A more low-tech solution would be to simply count the support requests you receive from angry
customers.
• We are planning to improve the software and its development process as a whole, not just the
code, so metrics such as the following may be useful.
• Every time a new member joins your team, ask them to time how long it takes to get a fully
functional version of the software and all relevant development tools running on their local
machine.
• If creating a new release is taking a long time, it may be a sign that the process has too many
manual steps.
• The process of releasing software is inherently amenable to automation, and automating the
process will both speed it up and reduce the probability of human error.
• Making the release process easier and faster will encourage more frequent releases, which in
turn leads to more stable software.
• Often a developer will spend days tracking down a bug, only to find out later that another team
member had seen a similar problem before and could have fixed the issue within minutes.
• If bugs are getting fixed more quickly, it’s likely that the members of your team are
communicating well and sharing valuable information.
• Knowing which files in your project are edited most often can be very useful when choosing
your next refactoring target.
• If one particular class is edited by developers very often, it’s an ideal target for refactoring.
• Note that this is slightly different from the other metrics because it’s not a measure of project
quality, but it’s still useful data.
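If the project lives in Git, the edit counts can be pulled straight from history. A sketch, wrapped in a function to be run inside the repository you want to analyze:

```shell
# Sketch: list the ten most frequently edited files in a Git repository.
# Each commit's file list is extracted, then counted per file.
most_edited() {
  git log --name-only --pretty=format: \
    | sed '/^$/d' \
    | sort \
    | uniq -c \
    | sort -rn \
    | head -n 10
}
```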
• When it comes to defining and measuring metrics, the possibilities are endless.
• A quick brainstorming session with your team will no doubt provide you with plenty of other
ideas for metrics to measure.
• You could measure the number of Zs in your codebase, the average number of fingers on a
developer’s hand, or the distance between the production server and the moon, but it’s hard to
see how these relate to quality!
• It’s always better to have too much information than not enough.
• As you and your team work with the data, you’ll gradually discover which metrics are most
suitable for your particular needs.
• If a given metric isn’t working for you, feel free to drop it.
• Another popular static analysis tool for Java is PMD. Unlike FindBugs, PMD works by analyzing
Java source code rather than the compiled bytecode.
• This means it can search for some categories of problems that FindBugs can’t.
• For example, PMD can check for readability issues, such as inconsistent use of braces, or code
cleanliness issues such as duplicate import statements.
CYCLOMATIC COMPLEXITY:
• Cyclomatic complexity is a count of the number of different paths your program can take
inside a given method.
• Usually it’s defined as the sum of the number of if statements, loops, case statements, and
catch statements in the method, plus 1 for the method itself.
• In general, the higher the cyclomatic complexity, the more difficult a method is to read and
maintain.
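As an invented illustration of that definition, the method below has a cyclomatic complexity of 4: one loop, one if, and one catch, plus 1 for the method itself:

```java
// Invented example: cyclomatic complexity = 4
// (1 loop + 1 if + 1 catch, plus 1 for the method itself).
class ComplexityExample {
    static int sumPositive(String[] values) {
        int sum = 0;
        for (String v : values) {                // +1 (loop)
            try {
                int n = Integer.parseInt(v);
                if (n > 0) {                     // +1 (if)
                    sum += n;
                }
            } catch (NumberFormatException e) {  // +1 (catch)
                // skip anything that isn't a number
            }
        }
        return sum;                              // +1 (the method itself)
    }
}
```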
• When run with the default settings, PMD can be very noisy.
• For example, PMD wants you to add a final modifier to method arguments whenever possible,
which can result in thousands of warnings for a typical codebase.
• If PMD spits out myriad warnings the first time you run it, don’t despair. It probably means you
need to tune your PMD ruleset to disable some of the more noisy rules.
• You’ll probably need to tune PMD’s rule set until you find the rules that are right for your code
and your team’s coding style.
• The PMD website includes excellent documentation on the rule set, with examples for most
rules, so take some time to read through the docs and decide which rules you want to apply.
SUPPRESSING WARNINGS:
• Just like FindBugs, PMD allows you to suppress warnings at the level of individual fields or
methods using Java annotations.
• PMD uses the standard @java.lang.SuppressWarnings annotation, so there's no need to add
any dependencies to your project.
• You can also disable PMD for a specific line of code by adding a NOPMD comment.
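Both mechanisms might look like the sketch below. The class is invented; the rule name "PMD.EmptyCatchBlock" follows PMD's standard `PMD.RuleName` convention, but verify it against your own ruleset:

```java
// Illustrative example of PMD warning suppression; class is invented.
class LegacyInputReader {

    // Suppress one PMD rule for this method only.
    @SuppressWarnings("PMD.EmptyCatchBlock")
    int readOrZero(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException ignored) { // NOPMD - bad input means 0
        }
        return 0;
    }
}
```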
• This is a really useful way to inspect a project, but it has a couple of drawbacks:
• Results are only visible to that developer. It’s difficult to share this information with the rest of
the team.
• Ideally we want a system that will automatically monitor the quality of the code, without any
effort on the developer’s part, and make this information available to all members of the team.
• Here are just some of the things I’ve successfully automated with Jenkins:
• Generating documentation
• One of the most important features of Jenkins is its integration with your version control system
(VCS).
• Every time you check in some code, Jenkins should automatically start building the code to
check that there were no unexpected side effects to your change.
• If it finds something wrong, Jenkins can provide feedback by failing the build and perhaps
sending you an angry email.
BACKUP
• For many teams, once they start using Jenkins, it quickly becomes a pillar of their development
infrastructure.
• Just like any other piece of infrastructure, it’s important to plan for failure and take regular
backups.
REST API
• Jenkins comes with a powerful REST API built in, which can be very useful.
• I once had to make a very similar change to the configurations of a large number of jobs.
• This would have been tedious to do using the UI, so I wrote a script to do all the grunt work,
making use of the REST API to update the job configurations.
LECTURE 11
Disciplined refactoring:
• You can refactor all day long without fear of introducing any bugs.
• If you're not careful, what started out as a simple refactoring can rapidly spiral out of control.
• Should you give up and revert half a day’s work, or keep going and try to get the project back
into a compiling state?
• Even if you can reach the other side, you have no guarantee that the software still works as it’s
supposed to.
• The life of a software developer is stressful enough without having to make decisions like that.
• Often you realize that a piece of code is ripe for refactoring when you’re actually trying to do
something else.
• You may have opened the file in your editor in order to fix a bug or add a new feature, and then
decided that you may as well refactor the code at the same time.
• Modern IDEs provide good support for the most popular refactoring patterns.
• Safer—They’re not infallible, but IDEs make far fewer mistakes than humans when updating
code.
• More thorough—The IDE often takes care of things that I hadn’t even thought of, such as
updating references inside code comments and renaming test classes to match the
corresponding production classes.
• More efficient—A lot of refactoring is boring work, which is what computers were designed to
do.
• Leave it to the IDE while you, rock star programmer that you are, get on with more important
things!
• With this in mind, it’s best not to trust the IDE unconditionally:
• Any time you feel like your refactoring is getting out of control, you have the freedom to back
out by hitting the Undo button.