Re Engineering

Legacy Project
• We label as legacy any existing project that’s difficult to maintain or extend.

• As developers, we tend to focus on the code, but a project encompasses many other aspects:
• Build tools and scripts
• Dependencies on other systems
• The infrastructure on which the software runs
• Project documentation
• Methods of communication, such as between developers, or between developers and stakeholders
Features of legacy projects
OLD:
• Usually a project needs to exist for a few years before it accumulates enough disorder (entropy) to become really difficult to maintain.
• In that time it will also go through a number of generations of maintainers.
• With each of these handoffs, knowledge about the original design of the system and the intentions of the previous maintainer is lost.
LARGE:
• The larger the project, the more difficult it is to maintain.

• There is more code to understand, a larger number of existing bugs (if we assume a constant defect rate in software, more code = more bugs), and a higher probability of a new change causing a regression, because there is more existing code that it can potentially affect.
INHERITED:
• These projects are usually inherited from a previous developer or team.

• The people who originally wrote the code and those who now maintain it are not the same individuals, and they may even be separated by several intermediate generations of developers.
POORLY DOCUMENTED:
• Keeping accurate and thorough documentation is important to a project’s long-term survival.
• If there’s one thing that developers enjoy less than writing documentation, it’s keeping that documentation up to date.
• So any technical documents that do exist must invariably be taken with a pinch of salt.
• For example, one system had an API that allowed you to retrieve a list of the most popular threads, along with a few of the most recently posted messages in each of those threads.
• The API looked something like the following listing.
Exceptions to the rule
• Just because a project fulfills some of the preceding criteria doesn’t necessarily mean it should be treated as a legacy project.
Legacy Code:
• The most important part of any software project, especially for an engineer, is the code itself.
• In this section we’ll look at a few common characteristics usually seen in legacy code.
Untested, untestable code
• A good test suite can function as the de facto documentation for a project.
• In fact, tests can even be more useful than documentation, because they’re more likely to be kept in sync with the actual behavior of the system.
• A socially responsible developer will take care to fix any tests that were broken by their changes to production code.
• Many legacy projects have almost no tests. Not only this, but the projects are usually written without testing in mind, so retroactively adding tests is extremely difficult.
Inflexible Code:
A common problem with legacy code is that implementing new features or changing existing behavior
is inordinately difficult.
What seems like a minor change can involve editing code in a lot of places.
To make matters worse, each one of those edits also needs to be tested, often manually.
Imagine your application defines two types of users: Admins and Normal users.
Admins are allowed to do anything, whereas the actions of Normal users are restricted.
The authorization checks are implemented simply as if statements, spread throughout the codebase.
It’s a large and complex application, so there are a few hundred of these checks in total, each one
looking like the following.
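public void deleteWibble(Wibble wibble)
        throws NotAuthorizedException {
    // Authorization check repeated, in slightly different forms, all over the codebase
    if (!loggedInUser.isAdmin()) {
        throw new NotAuthorizedException(
            "Only Admins are allowed to delete wibbles");
    }
    // ... rest of the method ...
}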
One day you’re asked to add a new user type called Power User.
These users can do more than Normal users but are not as powerful as Admins.
So for every action that Power Users are allowed to perform, you’ll have to search through the
codebase, find the corresponding if statement, and update it.
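One possible way to make this kind of change cheaper is to move the permission rules into a single place. The following is only a minimal sketch under that assumption, not something prescribed by these notes; Action, Role and Authorizer are hypothetical names introduced for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical actions; a real application would list its own.
enum Action { VIEW_WIBBLE, EDIT_WIBBLE, DELETE_WIBBLE }

// Each role declares the actions it may perform. Adding Power User
// means adding one line here instead of editing hundreds of if statements.
enum Role {
    NORMAL(EnumSet.of(Action.VIEW_WIBBLE)),
    POWER_USER(EnumSet.of(Action.VIEW_WIBBLE, Action.EDIT_WIBBLE)),
    ADMIN(EnumSet.allOf(Action.class));

    private final Set<Action> allowed;

    Role(Set<Action> allowed) {
        this.allowed = allowed;
    }

    boolean can(Action action) {
        return allowed.contains(action);
    }
}

class NotAuthorizedException extends Exception {
    NotAuthorizedException(String message) {
        super(message);
    }
}

class Authorizer {
    // The single place where an authorization decision is made.
    static void require(Role role, Action action) throws NotAuthorizedException {
        if (!role.can(action)) {
            throw new NotAuthorizedException(
                role + " is not allowed to perform " + action);
        }
    }
}
```

With something like this in place, a method such as deleteWibble would start with a call like Authorizer.require(role, Action.DELETE_WIBBLE), where role is however the application exposes the logged-in user's role (an assumption here).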
Legacy Infrastructure
• Although the quality of the code is a major factor in the maintainability of any legacy project, just looking at the code doesn’t show you the whole picture.
• Most software depends on an assortment of tools and infrastructure in order to run, and the quality of these tools can also have a dramatic effect on a team’s productivity.
• In part 3 we’ll look at ways to make improvements in this area.
Development Environment
• Think about the last time you took an existing project and set it up on your development machine.
• Approximately how long did it take, from first checking the code out of version control to reaching a state where you could do the following?
• View and edit the code in your IDE
• Run the unit and integration tests
• Run the application on your local machine
• When it comes to legacy projects, the time taken to set up the development environment can often be measured in days.
• Setting up a legacy project often involves a combination of the following:
• Downloading, installing, and learning how to run whatever arcane build tool the project uses
• Running the mysterious and unmaintained scripts you found in the project’s /bin folder
• Taking a large number of manual steps, listed on an invariably out-of-date wiki page
Outdated Dependencies
Nearly any software project has some dependencies on third-party software. For example, if it’s a Java
servlet web application, it will depend on Java. It also needs to run inside a servlet container such as
Tomcat, and may also use a web server such as Apache. Most likely it will also use various Java libraries
such as Apache Commons.
Heterogeneous Environments
A heterogeneous environment uses hardware and system software from different vendors. Organizations often use computers,
operating systems and databases from a variety of vendors. Contrast this with a homogeneous environment.
• Most software will be run in a number of environments during its lifetime.
• The number and names of the environments are not set in stone, but the process usually goes
something like this:
1. Developers run the software on their local machines.
2. They deploy it to a test environment for automatic and manual testing.
3. It is deployed to a staging environment that mimics production as closely as possible.
4. It is released and deployed to the production environment (or, in the case of packaged software,
it’s shipped to the customer).
Legacy culture
• The term legacy culture is perhaps a little contentious; nobody wants to think of themselves and their culture as legacy.
• But I’ve noticed that many software teams who spend too much time maintaining legacy projects share certain characteristics with respect to how they develop software and communicate among themselves.
Fear of change
• Many legacy projects are so complex and poorly documented that even the teams charged with
maintaining them don’t understand everything about them.
• For example
• Which features are no longer in use and are thus safe to delete?
• Which bugs are safe to fix? (Some users of the software may be depending on the bug and
treating it as a feature.)
• Which users need to be consulted before making changes to behavior?

Changes, benefits and risks for a legacy project.
Finding your starting point
• Deciding where to focus your refactoring efforts
• Thinking more positively about your legacy software
• Measuring the quality of software
• Inspecting your codebase using FindBugs, PMD, and Checkstyle
• Using Jenkins for continuous inspection
Overcoming feelings of fear and frustration
• I want you to choose one piece of legacy software that you have experience maintaining.
• Think hard about this software and try to remember everything you can about it.
• Every bug you fixed, every new feature you added, every release you made.
Fear
• I have worked on code in the past that has caused me to think like this:
• Every time I touch a line of code, I break something completely unrelated.
• I should get in and get out without touching anything I don’t absolutely have to handle.
HELP IS AT HAND
• When performing exploratory refactoring, always remember that you have strong allies you can
rely on to protect you.
The version control system
• If your refactoring gets out of hand and you don’t feel confident that the code still works
correctly, all it takes is one command to revert your changes and put the code back into a
known, safe state.
The IDE
• Modern IDEs such as Eclipse and IntelliJ offer powerful refactoring functionality.
• They are capable of performing a wide range of standard refactorings, and they can do in
milliseconds what would take a human minutes or hours of tedious editing to achieve.
The compiler
• Assuming you’re working with a statically typed, compiled language such as Java, the compiler can help
you discover the effects of your changes quickly.
• After each refactoring step, run the compiler and check that you haven’t introduced any
compilation errors. Using the compiler to obtain quick feedback like this is often called leaning
on the compiler.
Other developers
• Everybody makes mistakes, so it’s always a good idea to have a coworker review the changes
you’ve made.
• You can also go a step further and pair up while refactoring: one developer (the “navigator”) can check for mistakes and provide advice while the other
developer (the “driver”) focuses on performing the refactoring.
CHARACTERIZATION TESTS
• As a complement to exploratory refactoring, you could also try adding characterization tests (a
term coined by Michael Feathers).
• These are tests that demonstrate how a given part of the system currently behaves.
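As an illustration, here is a minimal sketch of a characterization test written in plain Java, with no test framework assumed. LegacyPriceCalculator and its pricing rules are invented for the example; the point is that the expected values are whatever the code returns today, not what a specification says it should return.

```java
// Hypothetical legacy code whose behavior we want to pin down before refactoring.
class LegacyPriceCalculator {
    static int priceInCents(int quantity) {
        int total = quantity * 250;
        if (quantity > 10) {
            total -= 300; // undocumented bulk discount discovered while exploring
        }
        return total;
    }
}

class PriceCharacterizationTest {
    // Fails loudly if the current behavior changes after a refactoring step.
    static void check(int quantity, int expected) {
        int actual = LegacyPriceCalculator.priceInCents(quantity);
        if (actual != expected) {
            throw new AssertionError(
                "quantity " + quantity + ": expected " + expected + " but got " + actual);
        }
    }

    // Expected values were recorded by running the existing code.
    static void run() {
        check(0, 0);
        check(1, 250);
        check(10, 2500);
        check(11, 2450); // the discount applies from 11 items upward
    }
}
```

Running PriceCharacterizationTest.run() after each refactoring step gives quick feedback on whether the observable behavior has changed.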
Frustration
• Here are some more negative and unhelpful thoughts that legacy software has provoked in me
in the past:
• I wouldn’t even know where to begin fixing this big ball of mud.
• Every class is as bad as the previous (or next) one. Let’s pick classes at random and start
refactoring.
• I’ve had it with this WidgetManagerFactoryImpl! I’m not doing anything else until I’ve rewritten
it.
LOSS OF MOTIVATION
• You give up your refactoring efforts as hopeless, and resign yourself to a future of legacy
drudgery.
• The software seems to be doomed, and there’s no way to save it.
DESPERATE MEASURES
• At this point, you go back to doing “real” work.
• Until a few weeks later when enough frustration builds up and forces you to go through the
same cycle again.
Gathering useful data about your software
• We want to gather metrics about legacy software in order to help us answer the following
questions:
• What state is the code in to start with? Is it really in as bad shape as you think?
• What should be your next target for refactoring at any given time?
• How much progress are you making with your refactoring? Are you improving the quality fast
enough to keep up with the entropy introduced by new changes?
Bugs and coding standard violations
• Static analysis tools can analyze a codebase and detect possible bugs or poorly written code.
• Static analysis involves looking through the code (either the human-readable source code or the
machine-readable compiled code) and flagging any pieces of code that match a predefined set of patterns or rules.
Performance
• One goal of your refactoring may be to improve the performance of a legacy system.
• If that’s the case, you’ll need to measure that performance.

PERFORMANCE TESTS
• If you wanted to test the performance of the Audit component, you could start with the
following test:
1 Start the system in a known state.
2 Feed it 1 million lines of dummy log data.
3 Time how long it takes to process the data and generate an audit report.
4 Shut down the system and clean up.
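The four steps above could be sketched roughly as follows. The AuditComponent class here is a hypothetical stand-in, since the real component's API is not shown in these notes, and the line count is a parameter so that the sketch can be tried with fewer than 1 million lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real Audit component.
class AuditComponent {
    private final List<String> processed = new ArrayList<>();

    void start() {
        processed.clear();
    }

    void process(String logLine) {
        processed.add(logLine.toUpperCase());
    }

    // Returns the number of entries in the generated report.
    int generateReport() {
        return processed.size();
    }

    void shutdown() {
        processed.clear();
    }
}

class AuditPerformanceTest {
    // Returns the elapsed time in milliseconds for processing lineCount lines.
    static long run(int lineCount) {
        AuditComponent audit = new AuditComponent();
        audit.start();                                  // 1. known state

        long begin = System.nanoTime();
        for (int i = 0; i < lineCount; i++) {           // 2. feed dummy log data
            audit.process("dummy log line " + i);
        }
        int reported = audit.generateReport();          // 3. generate the report
        long elapsedMillis = (System.nanoTime() - begin) / 1_000_000;

        audit.shutdown();                               // 4. clean up

        if (reported != lineCount) {
            throw new IllegalStateException("report is missing entries");
        }
        return elapsedMillis;
    }
}
```

A call such as AuditPerformanceTest.run(1_000_000) would then give one timing sample; repeating the run several times and comparing results before and after a refactoring is more informative than a single measurement.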
MONITORING PERFORMANCE IN PRODUCTION
• If your software is a web application, it’s easy to collect performance data from the production
system.
• Any decent web server will be able to output the processing time of every request to a log file.
• You could write a simple script to aggregate this data to calculate percentile response times per
hour, per day, and so on.
• For example, assuming your web server outputs one access log file per day and the request
processing time is the final column of the tab-separated file, the following shell snippet would
output the 99th percentile response time for a given day’s accesses.
• You could run this script every night and email the results to the development team.
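The percentile calculation described above can also be sketched in Java rather than shell. This is an illustrative sketch only: it assumes, as the text does, that the processing time is the final tab-separated column, it assumes the value is a whole number (for example, milliseconds), and it uses the nearest-rank definition of a percentile, which is one of several common definitions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ResponseTimePercentile {
    // accessLogLines: the lines of one day's access log.
    // percent: the percentile to compute, for example 99.
    static long percentile(List<String> accessLogLines, int percent) {
        List<Long> times = new ArrayList<>();
        for (String line : accessLogLines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] columns = line.split("\t");
            times.add(Long.parseLong(columns[columns.length - 1].trim()));
        }
        if (times.isEmpty()) {
            throw new IllegalArgumentException("no requests in log");
        }
        Collections.sort(times);

        // Nearest-rank method: rank = ceil(percent / 100 * n), using integer arithmetic.
        int rank = (percent * times.size() + 99) / 100;
        return times.get(Math.max(rank, 1) - 1);
    }
}
```

The lines could be loaded with java.nio.file.Files.readAllLines and the result passed on to whatever reporting or email mechanism the team already uses.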
Error counts
• Measuring performance is all well and good, but it doesn’t matter how fast your code runs if it’s
not doing its job correctly (giving users the results they expect and not throwing any errors).
• A count of the number of errors happening in production is a simple but useful indicator of the
quality of your software, as seen from the end-user’s perspective.
• For example, you could introduce an automatic error-reporting feature into your product that
contacts your server whenever an exception occurs.
• A more low-tech solution would be to simply count the support requests you receive from angry
customers.
Time common tasks
• If you are planning to improve the software and its development process as a whole, not just the code, metrics such as the following may be useful.
TIME TO SET UP THE DEVELOPMENT ENVIRONMENT FROM SCRATCH:
• Every time a new member joins your team, ask them to time how long it takes to get a fully
functional version of the software and all relevant development tools running on their local
machine.
TIME TAKEN TO RELEASE OR DEPLOY THE PROJECT
• If creating a new release is taking a long time, it may be a sign that the process has too many
manual steps.
• The process of releasing software is inherently amenable to automation, and automating the
process will both speed it up and reduce the probability of human error.
• Making the release process easier and faster will encourage more frequent releases, which in
turn leads to more stable software.
AVERAGE TIME TO FIX A BUG
• This metric can be a good indicator of communication between team members.
• Often a developer will spend days tracking down a bug, only to find out later that another team
member had seen a similar problem before and could have fixed the issue within minutes.
• If bugs are getting fixed more quickly, it’s likely that the members of your team are
communicating well and sharing valuable information.
Commonly used files
• Knowing which files in your project are edited most often can be very useful when choosing
your next refactoring target.
• If one particular class is edited by developers very often, it’s an ideal target for refactoring.
• Note that this is slightly different from the other metrics because it’s not a measure of project
quality, but it’s still useful data.
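One way to collect this data is to count how often each path appears in the version control history. The sketch below assumes the input is a list of changed file paths, one per line, such as the output of `git log --name-only --pretty=format:`; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EditFrequency {
    // Returns up to `limit` paths, most frequently edited first.
    static List<Map.Entry<String, Integer>> mostEdited(List<String> changedPaths, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String path : changedPaths) {
            String trimmed = path.trim();
            if (!trimmed.isEmpty()) {
                counts.merge(trimmed, 1, Integer::sum);
            }
        }

        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        // Sort by edit count (descending), then by path so the order is stable.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }
}
```

The files at the top of the resulting list are candidates worth a closer look when choosing what to refactor next.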
Measure everything you can
• When it comes to defining and measuring metrics, the possibilities are endless.
• A quick brainstorming session with your team will no doubt provide you with plenty of other
ideas for metrics to measure.
• You could measure the number of Zs in your codebase, the average number of fingers on a
developer’s hand, or the distance between the production server and the moon, but it’s hard to
see how these relate to quality!
• It’s always better to have too much information than not enough.
• A good rule of thumb is, if in doubt, measure it.
• As you and your team work with the data, you’ll gradually discover which metrics are most
suitable for your particular needs.
• If a given metric isn’t working for you, feel free to drop it.
PMD and Checkstyle
• Another popular static analysis tool for Java is PMD. Unlike FindBugs, PMD works by analyzing
Java source code rather than the compiled bytecode.
• This means it can search for some categories of problems that FindBugs can’t.
• For example, PMD can check for readability issues, such as inconsistent use of braces, or code
cleanliness issues such as duplicate import statements.
CYCLOMATIC COMPLEXITY :
• Cyclomatic complexity is a count of the number of different paths your program can take
inside a given method.
• Usually it’s defined as the sum of the number of if statements, loops, case statements, and
catch statements in the method, plus 1 for the method itself.
• In general, the higher the cyclomatic complexity, the more difficult a method is to read and
maintain.
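To make the definition concrete, here is a deliberately naive sketch that applies it to the text of a single method by counting keywords. It is not how PMD or any real tool computes the metric: real tools parse the code, whereas this version would also count keywords that appear in comments or string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveCyclomaticComplexity {
    // Decision points from the definition above: if statements, loops,
    // case statements, and catch statements.
    private static final Pattern DECISION_POINT =
            Pattern.compile("\\b(if|for|while|case|catch)\\b");

    static int of(String methodSource) {
        Matcher matcher = DECISION_POINT.matcher(methodSource);
        int complexity = 1; // plus 1 for the method itself
        while (matcher.find()) {
            complexity++;
        }
        return complexity;
    }
}
```

A method containing one loop and two if statements, for example, would score 1 + 3 under this definition.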
PMD IS NOISY BY DEFAULT:
• When run with the default settings, PMD can be very noisy.
• For example, PMD wants you to add a final modifier to method arguments whenever possible,
which can result in thousands of warnings for a typical codebase.
• If PMD spits out myriad warnings the first time you run it, don’t despair. It probably means you
need to tune your PMD ruleset to disable some of the more noisy rules.
CUSTOMIZING YOUR RULESET
• You’ll probably need to tune PMD’s rule set until you find the rules that are right for your code
and your team’s coding style.
• The PMD website includes excellent documentation on the rule set, with examples for most rules, so take some time to read through the docs and decide which
rules you want to apply.
SUPPRESSING WARNINGS:
• Just like FindBugs, PMD allows you to suppress warnings at the level of individual fields or
methods using Java annotations.
• PMD uses the standard @java.lang.SuppressWarnings annotation, so there’s no need to add
any dependencies to your project.
• You can also disable PMD for a specific line of code by adding a NOPMD comment.
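The suppression mechanisms described above might look like this in practice. WibbleCache is a hypothetical class, and the rule name used in the annotation is only an example; it has to match a rule that is actually enabled in your ruleset.

```java
class WibbleCache {
    // Suppresses one named PMD rule for this field only.
    @SuppressWarnings("PMD.UnusedPrivateField")
    private final String legacyRegion = "default";

    // Suppresses all PMD warnings for this method.
    @SuppressWarnings("PMD")
    int capacity() {
        return 16;
    }

    int doubled(int value) {
        int result = value * 2; // NOPMD - intermediate variable kept for debugging
        return result;
    }
}
```

Suppressing a named rule is usually preferable to suppressing everything, because other warnings on the same element remain visible.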
Continuous inspection using Jenkins
• Running these tools on a developer’s own machine is a really useful way to inspect a project, but it has a couple of drawbacks:
• Results are only visible to that developer. It’s difficult to share this information with the rest of
the team.
• The process is dependent on developers remembering to run the tools regularly.
Continuous integration and continuous inspection:
• Ideally we want a system that will automatically monitor the quality of the code, without any
effort on the developer’s part, and make this information available to all members of the team.
What else can we use Jenkins for?
• Here are just some of the things I’ve successfully automated with Jenkins:
• Running unit tests
• Executing end-to-end UI tests by using Selenium
• Generating documentation
• Deploying software to staging environments
• Publishing packages to a Maven repository
• Running complex multistage performance tests
• Building and publishing auto-generated API clients

VERSION CONTROL HOOKS
• One of the most important features of Jenkins is its integration with your version control system
(VCS).
• Every time you check in some code, Jenkins should automatically start building the code to
check that there were no unexpected side effects to your change.
• If it finds something wrong, Jenkins can provide feedback by failing the build and perhaps
sending you an angry email.
BACKUP
• For many teams, once they start using Jenkins, it quickly becomes a pillar of their development
infrastructure.
• Just like any other piece of infrastructure, it’s important to plan for failure and take regular
backups.
REST API
• Jenkins comes with a powerful REST API built in, which can be very useful.
• I once had to make a very similar change to the configurations of a large number of jobs.
• This would have been tedious to do using the UI, so I wrote a script to do all the grunt work,
making use of the REST API to update the job configurations.
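A script of that kind could be sketched as follows. This is not the author's script: it assumes Java 11 or later, assumes that a job's configuration is updated by POSTing XML to the job's config.xml URL, and leaves out authentication and CSRF handling, which a real Jenkins server will normally require. The base URL, job name, and the particular change being applied are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class JenkinsJobConfigUpdate {
    // The change to apply to each job's config.xml. Re-enabling a disabled
    // job is used here purely as an example of a small, repetitive edit.
    static String applyChange(String configXml) {
        return configXml.replace("<disabled>true</disabled>", "<disabled>false</disabled>");
    }

    // Builds (but does not send) the request that would upload the new configuration.
    // Assumes jobName contains no characters that need URL encoding.
    static HttpRequest buildUpdateRequest(String jenkinsBaseUrl, String jobName, String newConfigXml) {
        URI uri = URI.create(jenkinsBaseUrl + "/job/" + jobName + "/config.xml");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/xml")
                .POST(HttpRequest.BodyPublishers.ofString(newConfigXml))
                .build();
    }
}
```

Looping over a list of job names, fetching each configuration, applying the change, and sending the request with java.net.http.HttpClient would complete the script.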
LECTURE 11
Disciplined refactoring:
• In principle, refactoring does not change behavior, so you might expect to be able to refactor all day long without fear of introducing any bugs.
• But if you’re not careful, what started out as a simple refactoring can rapidly spiral out of control: before you know it, you’ve edited half the files in the project and you’re staring at an IDE full of red crosses.
• At this point you have to make a heart-wrenching decision:
• Should you give up and revert half a day’s work, or keep going and try to get the project back
into a compiling state?
• Even if you can reach the other side, you have no guarantee that the software still works as it’s
supposed to.
• The life of a software developer is stressful enough without having to make decisions like that, so let’s look at some ways to avoid getting into that situation.
Separate refactoring from other work:
• Often you realize that a piece of code is ripe for refactoring when you’re actually trying to do
something else.
• You may have opened the file in your editor in order to fix a bug or add a new feature.
• Then you decide that you may as well refactor the code at the same time. Resist that urge, and keep the refactoring separate from the bug fix or feature work.
Lean on the IDE:
• Modern IDEs provide good support for the most popular refactoring patterns.
• They can perform refactorings automatically, which has a number of benefits over performing them by hand:
• Faster—The IDE can update hundreds of classes in milliseconds.
• Doing this by hand would take a prohibitively long time.
• Safer—They’re not infallible, but IDEs make far fewer mistakes than humans when updating
code.
• More thorough—The IDE often takes care of things that I hadn’t even thought of, such as
updating references inside code comments and renaming test classes to match the
corresponding production classes.
• More efficient—A lot of refactoring is boring work, which is what computers were designed to
do.
• Leave it to the IDE while you, rock star programmer that you are, get on with more important
things!
IDEs make mistakes too:
• With this in mind, it’s best not to trust the IDE unconditionally:
• If the IDE offers a preview of a refactoring, inspect it carefully.
• Check that the project still compiles afterward,
• Run automated tests if you have them.
• Use tools like grep to get a second opinion.

Lean on the VCS:
• I assume that you’re using a version control system (VCS).
• Any time you feel like your refactoring is getting out of control,
• You have the freedom to back out by hitting the Undo button.
