Reverse engineering, also called back engineering, is the process of extracting knowledge or design information from anything man-made and reproducing it, or reproducing anything based on the extracted information.[1]:3 The process often involves disassembling something (a mechanical device, electronic component, computer program, or biological, chemical, or organic matter) and analyzing its components and workings in detail.
The reasons and goals for obtaining such information vary widely, from everyday or socially beneficial actions to criminal actions, depending upon the situation. Often no intellectual property rights are breached, such as when a person or business cannot recollect how something was done, or what something does, and needs to reverse engineer it to work it out for themselves. Reverse engineering is also beneficial in crime prevention, where suspected malware is reverse engineered to understand what it does and how to detect and remove it. It is also used to allow computers and devices to work together ("interoperate") and to allow saved files on obsolete systems to be used in newer systems. By contrast, reverse engineering can also be used to "crack" software and media to remove their copy protection,[1]:5 or to create a (possibly improved) copy or even a knockoff; this is usually the goal of a competitor.[1]:4
Reverse engineering has its origins in the analysis of hardware for commercial or military advantage.[2]:13 However, the reverse engineering process in itself is not concerned with creating a copy or changing the artifact in some way; it is only an analysis in order to deduce design features from products with little or no additional knowledge about the procedures involved in their original production.[2]:15 In some cases, the goal of the reverse engineering process can simply be a redocumentation of legacy systems.[2]:15[3] Even when the product reverse engineered is that of a competitor, the goal may not be to copy it, but to perform competitor analysis.[4] Reverse engineering may also be used to create interoperable products; despite some narrowly tailored US and EU legislation, the legality of using specific reverse engineering techniques for this purpose has been hotly contested in courts worldwide for more than two decades.[5]
Contents
1 Motivation
2 Common situations
2.1 Reverse engineering of machines
2.2 Reverse engineering of software
2.2.1 Binary software
2.2.1.1 Binary software techniques
2.2.2 Software classification
2.3 Source code
2.4 Reverse engineering of protocols
2.5 Reverse engineering of integrated circuits/smart cards
2.6 Reverse engineering for military applications
2.7 Overlap with patent law
3 Legality
3.1 United States
3.2 European Union
4 See also
5 References
6 Further reading
Motivation
Reasons for reverse engineering:
Interfacing. Reverse engineering can be used when a system must interface with another system and the way the two systems negotiate has to be established. Such requirements typically exist for interoperability.
Military or commercial espionage. Learning about an enemy's or competitor's latest research by stealing or capturing a prototype and dismantling it. This may result in the development of a similar product, or of better countermeasures against it.
Improving documentation shortcomings. Reverse engineering can be done when the documentation of a system's design, production, operation or maintenance has shortcomings and the original designers are not available to improve it. Reverse engineering of software can provide the most current documentation necessary for understanding the current state of a software system.
Obsolescence. Integrated circuits are often designed on proprietary systems and built on production lines that become obsolete in only a few years. When systems using these parts can no longer be maintained (since the parts are no longer made), the only way to incorporate the functionality into new technology is to reverse-engineer the existing chip and then redesign it using newer tools, using the understanding gained as a guide. Another obsolescence-related problem that reverse engineering can solve is the need to support (maintain and supply for continuous operation) existing legacy devices that are no longer supported by their original equipment manufacturer (OEM). This problem is particularly critical in military operations.
Software modernization. Knowledge is often lost over time, which can prevent updates and improvements. Reverse engineering is generally needed to understand the 'as is' state of existing or legacy software in order to properly estimate the effort required to migrate system knowledge into a 'to be' state. Much of this may be driven by changing functional, compliance or security requirements.
Product security analysis. To examine how a product works, determine the specifications of its components, estimate costs and identify potential patent infringement. It may also involve acquiring sensitive data by disassembling and analysing the design of a system component.[6] Another intent may be to remove copy protection or to circumvent access restrictions.
Bug fixing. To fix (or sometimes to enhance) legacy software which is no longer supported by its creators (e.g. abandonware).
Creation of unlicensed/unapproved duplicates. Such duplicates are sometimes called clones in the computing domain.
Academic/learning purposes. Reverse engineering for learning purposes may aim to understand the key issues of an unsuccessful design and subsequently improve the design.
Competitive technical intelligence. Understanding what one's competitor is actually doing, versus what they say they are doing.
Saving money. Finding out what a piece of electronics is capable of can spare a user from purchasing a separate product.
Repurposing. Items that are otherwise obsolete can be given a new use as part of a larger system.
Common situations
Reverse engineering of machines
As computer-aided design (CAD) has become more popular, reverse engineering has become a viable method to create a 3D virtual model of an existing physical part for use in 3D CAD, CAM, CAE or other software.[7] The reverse-engineering process involves measuring an object and then reconstructing it as a 3D model. The physical object can be measured using 3D scanning technologies such as coordinate measuring machines (CMMs), laser scanners, structured light digitizers, or industrial CT scanning (computed tomography). The measured data alone, usually represented as a point cloud, lacks topological information and is therefore often processed and modeled into a more usable format such as a triangular-faced mesh, a set of NURBS surfaces, or a CAD model.[8]
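To make the point about missing topology concrete, here is a minimal sketch in Python. The coordinates, the two-triangle mesh and the helper name mesh_edges are invented for illustration and do not come from any particular scanner or CAD package. A point cloud is only a list of coordinates; a mesh adds faces that refer to points by index, and connectivity such as shared edges can then be derived from those faces.

```python
# A point cloud: bare coordinates with no connectivity (hypothetical data).
point_cloud = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]

# A triangular mesh reuses the same points as vertices and adds faces,
# each face being a triple of vertex indices (hypothetical triangulation).
faces = [(0, 1, 2), (0, 2, 3)]


def mesh_edges(faces):
    """Map each undirected edge to the list of faces that use it."""
    edges = {}
    for face_index, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            key = (min(u, v), max(u, v))
            edges.setdefault(key, []).append(face_index)
    return edges


edges = mesh_edges(faces)
# Edges used by two faces are interior; edges used by one face lie on the boundary.
shared = sorted(e for e, fs in edges.items() if len(fs) == 2)
boundary = sorted(e for e, fs in edges.items() if len(fs) == 1)
print("shared edges:", shared)
print("boundary edges:", boundary)
```

Nothing in point_cloud alone says which points are neighbours; that information exists only once the faces are defined, which is why scanned data is usually converted before further use.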
Hybrid modelling is a commonly used term when NURBS and parametric modelling are implemented together. Using a combination of geometric and freeform surfaces can provide a powerful method of 3D modelling. Areas of freeform data can be combined with exact geometric surfaces to create a hybrid model. A typical example of this would be the reverse engineering of a cylinder head, which includes freeform cast features, such as water jackets, and high-tolerance machined areas.[9]
Reverse engineering is also used by businesses to bring existing physical geometry into digital product development environments, to make a digital 3D record of their own products, or to assess competitors' products. It is used to analyse, for instance, how a product works, what it does, and what components it consists of, and to estimate costs and identify potential patent infringement.
Value engineering is a related activity also used by businesses. It involves deconstructing and analysing products, but the objective is to find opportunities for cost cutting.
Reverse engineering of software
The term reverse engineering as applied to software means different things to different people, prompting Chikofsky and Cross to write a paper researching the various uses and defining a taxonomy. Their paper states, "Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction."[10] It can also be seen as "going backwards through the development cycle".[11] In this model, the output of the implementation phase (in source code form) is reverse-engineered back to the analysis phase, in an inversion of the traditional waterfall model. Another term for this technique is program comprehension.[3]
Reverse engineering is a process of examination only: the software system under consideration is not modified (which would make it re-engineering). Software anti-tamper technology like obfuscation is used to deter both reverse engineering and re-engineering of proprietary software and software-powered systems. In practice, two main types of reverse engineering emerge. In the first case, source code is already available for the software, but higher-level aspects of the program, perhaps poorly documented or documented but no longer valid, are discovered. In the second case, there is no source code available for the software, and any efforts towards discovering one possible source code for the software are regarded as reverse engineering. This second usage of the term is the one most people are familiar with. Reverse engineering of software can make use of the clean room design technique to avoid copyright infringement.
On a related note, black box testing in software engineering has a lot in common with reverse engineering. The tester usually has the API, but their goals are to find bugs and undocumented features by bashing the product from outside.[12]
Other purposes of reverse engineering include security auditing, removal of copy protection ("cracking"), circumvention of access restrictions often present in consumer electronics, customization of embedded systems (such as engine management systems), in-house repairs or retrofits, enabling of additional features on low-cost "crippled" hardware (such as some graphics card chip-sets), or even mere satisfaction of curiosity.
Binary software
This process is sometimes termed Reverse Code Engineering, or RCE.[13] As an example, decompilation of binaries for the Java platform can be accomplished using Jad. One famous case of reverse engineering was the first non-IBM implementation of the PC BIOS, which launched the historic IBM PC compatible industry that has been the overwhelmingly dominant computer hardware platform for many years. Reverse engineering of software is protected in the U.S. by the fair use exception in copyright law.[14] The Samba software, which allows systems that are not running Microsoft Windows to share files with systems that are, is a classic example of software reverse engineering,[15] since the Samba project had to reverse-engineer unpublished information about how Windows file sharing worked, so that non-Windows computers could emulate it. The Wine project does the same thing for the Windows API, and OpenOffice.org is one party doing this for the Microsoft Office file formats. The ReactOS project is even more ambitious in its goals, as it strives to provide binary (ABI and API) compatibility with the current Windows OSes of the NT branch, allowing software and drivers written for Windows to run on a clean-room reverse-engineered Free Software (GPL) counterpart. WindowsSCOPE allows for reverse-engineering the full contents of a Windows system's live memory, including a binary-level, graphical reverse engineering of all running processes.
Another classic, if not well-known, example is that in 1987 Bell Laboratories reverse-engineered the Mac OS System 4.1, originally running on the Apple Macintosh SE, so they could run it on RISC machines of their own.[16]
Binary software techniques
Reverse engineering of software can be accomplished by various methods. The three main groups of software reverse engineering are:
Analysis through observation of information exchange, most prevalent in protocol reverse engineering, which involves using bus analyzers and packet sniffers, for example, for accessing a computer bus or computer network connection and revealing the traffic data thereon. Bus or network behavior can then be analyzed to produce a stand-alone implementation that mimics that behavior. This is especially useful for reverse engineering device drivers. Sometimes, reverse engineering on embedded systems is greatly assisted by tools deliberately introduced by the manufacturer, such as JTAG ports or other debugging means. In Microsoft Windows, low-level debuggers such as SoftICE are popular.
Disassembly using a disassembler, meaning the raw machine language of the program is read and understood in its own terms, only with the aid of machine-language mnemonics. This works on any computer program but can take quite some time, especially for someone not used to machine code. The Interactive Disassembler is a particularly popular tool.
Decompilation using a decompiler, a process that tries, with varying results, to recreate the source code in some high-level language for a program only available in machine code or bytecode.
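As a small, hedged illustration of the disassembly idea, the sketch below uses Python's standard dis module to list the bytecode instructions of a compiled function as mnemonics. This is a bytecode analogue rather than native machine-code disassembly, and the checksum routine and the list_opcodes helper are made up for the example. The exact instruction names printed depend on the Python version.

```python
import dis


def checksum(data, seed):
    """Toy routine standing in for code whose source is unavailable."""
    total = seed
    for value in data:
        total = (total * 31 + value) % 65521
    return total


def list_opcodes(func):
    """Return (offset, opname, argrepr) for each bytecode instruction."""
    return [
        (ins.offset, ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
    ]


listing = list_opcodes(checksum)
for offset, opname, argrepr in listing:
    # One line per instruction: byte offset, mnemonic, human-readable argument.
    print(f"{offset:4d}  {opname:<24} {argrepr}")
```

Reading such a listing by hand, instruction by instruction, is what the disassembly approach amounts to; a decompiler would instead try to rebuild something resembling the original loop and arithmetic.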
Software classification
Software classification is the process of identifying similarities between different software binaries (for example, two different versions of the same binary) in order to detect code relations between software samples. This task was traditionally done manually for several reasons (such as patch analysis for vulnerability detection and copyright infringement) but nowadays can be done somewhat automatically for large numbers of samples.
This method is used mostly for long and thorough reverse engineering tasks (complete analysis of a complex algorithm or big piece of software). In general, statistical classification is considered to be a hard problem, and this is also true for software classification, so there are few solutions or tools that handle this task well.
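One very simple way to put a number on the similarity between two binaries is to compare their sets of byte n-grams. The sketch below is only an illustration of that idea, not the method used by any particular classification tool; the function names, the choice of n = 4 and the sample byte strings are all assumptions made for the example.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard index of the two n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # inputs too short to contain any n-gram are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical "binaries": the second is the first with one byte patched.
original = bytes(range(64))
patched = original[:30] + b"\xff" + original[31:]
unrelated = bytes(range(128, 192))

print(jaccard_similarity(original, original))
print(jaccard_similarity(original, patched))
print(jaccard_similarity(original, unrelated))
```

A single patched byte changes only the few n-grams that overlap it, so the score stays high, while unrelated data shares nothing. Real binaries are harder: recompilation can shift or reorder code, so a raw byte comparison like this one is at best a first filter.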
Source code
A number of UML tools refer to the process of importing and analysing source code to generate UML diagrams as "reverse engineering". See List of UML tools.
Although UML is one approach to providing "reverse engineering", more recent advances in international standards activities have resulted in the development of the Knowledge Discovery Metamodel (KDM). This standard delivers an ontology for the intermediate (or abstracted) representation of programming language constructs and their interrelationships. An Object Management Group standard (on its way to becoming an ISO standard as well), KDM has started to take hold in industry with the development of tools and analysis environments which can deliver the extraction and analysis of source, binary, and byte code. For source code analysis, KDM's granular standards architecture enables the extraction of software system flows (data, control, and call maps), architectures, and business layer knowledge (rules, terms, process). The standard enables the use of a common data format (XMI), enabling the correlation of the various layers of system knowledge for either detailed analysis (e.g. root cause, impact) or derived analysis (e.g. business process extraction). Although efforts to represent language constructs can be never-ending given the number of languages, the continuous evolution of software languages and the development of new languages, the standard does allow for the use of extensions to support the broad language set as well as evolution. KDM is compatible with UML, BPMN, RDF and other standards, enabling migration into other environments and thus the leveraging of system knowledge for efforts such as software system transformation and enterprise business layer analysis.
Reverse engineering of protocols
Protocols are sets of rules that describe message formats and how messages are exchanged (i.e., the protocol state-machine). Accordingly, the problem of protocol reverse-engineering can be partitioned into two subproblems: message format and state-machine reverse-engineering.
The message formats have traditionally been reverse-engineered through a tedious manual process, which involved analysis of how protocol implementations process messages, but recent research proposed a number of automatic solutions.[17][18][19] Typically, these automatic approaches either group observed messages into clusters using various clustering analyses, or emulate the protocol implementation, tracing the message processing.
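The clustering style of approach can be sketched in a deliberately naive form: group captured messages by length, then look for byte offsets whose value never changes within a group, which are candidates for fixed header fields. This is not the algorithm of any of the cited works; the function names and the captured messages (a 2-byte magic value, a 1-byte type and a payload) are hypothetical.

```python
from collections import defaultdict


def group_by_length(messages):
    """Naive clustering step: bucket observed messages by their length."""
    clusters = defaultdict(list)
    for msg in messages:
        clusters[len(msg)].append(msg)
    return dict(clusters)


def constant_offsets(cluster):
    """Offsets at which every message in the cluster carries the same byte."""
    first = cluster[0]
    return [
        i for i in range(len(first))
        if all(m[i] == first[i] for m in cluster)
    ]


# Hypothetical captured traffic: 2-byte magic, 1-byte type, then payload.
captured = [
    b"\xca\xfe\x01AAAA",
    b"\xca\xfe\x01BBBB",
    b"\xca\xfe\x01CACA",
    b"\xca\xfe\x02hello!",
    b"\xca\xfe\x02world!",
]

for length, cluster in sorted(group_by_length(captured).items()):
    print(length, constant_offsets(cluster))
```

With only two samples in the second cluster, payload bytes that happen to coincide are also reported as constant, which shows why such inference needs many observed messages and why real tools use richer clustering features than length alone.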
There has been less work on reverse-engineering of state-machines of protocols. In general, the protocol state-machines can be learned either through a process of offline learning, which passively observes communication and attempts to build the most general state-machine accepting all observed sequences of messages, or through online learning, which allows interactive generation of probing sequences of messages and listening to responses to those probing sequences. In general, offline learning of small state-machines is known to be NP-complete,[20] while online learning can be done in polynomial time.[21] An automatic offline approach has been demonstrated by Comparetti et al.[19] and an online approach more recently by Cho et al.[22]
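As a rough sketch of the offline setting, the code below builds a prefix-tree acceptor from passively observed sessions. This is only the usual starting point of state-merging learners: the tree accepts exactly the observed sequences, and the generalization step (merging compatible states, which is where the hardness lies) is not shown. The message type names and helper functions are hypothetical, and the sketch is not the method of the cited papers.

```python
def build_prefix_tree(traces):
    """Build a prefix-tree acceptor from observed message-type sequences.

    States are integers and state 0 is the start state. Returns the
    transition table and the set of states where an observed trace ended.
    """
    transitions = {0: {}}
    accepting = set()
    for trace in traces:
        state = 0
        for symbol in trace:
            if symbol not in transitions[state]:
                new_state = len(transitions)
                transitions[state][symbol] = new_state
                transitions[new_state] = {}
            state = transitions[state][symbol]
        accepting.add(state)
    return transitions, accepting


def accepts(transitions, accepting, trace):
    """Check whether the automaton accepts the given sequence."""
    state = 0
    for symbol in trace:
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return state in accepting


# Hypothetical passively observed sessions, already reduced to message types.
observed = [
    ["HELLO", "AUTH", "DATA", "BYE"],
    ["HELLO", "AUTH", "BYE"],
    ["HELLO", "BYE"],
]
transitions, accepting = build_prefix_tree(observed)
print(len(transitions), "states")
print(accepts(transitions, accepting, ["HELLO", "AUTH", "BYE"]))
print(accepts(transitions, accepting, ["HELLO", "AUTH", "DATA", "DATA", "BYE"]))
```

The last query is rejected because two consecutive DATA messages were never observed. A learner that generalizes would have to decide whether the state after DATA can be merged with the state before it, and an online learner could settle the question by sending that probe to the implementation.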
Other components of typical protocols, like encryption and hash functions, can be reverse-engineered automatically as well. Typically, the automatic approaches trace the execution of protocol implementations and try to detect buffers in memory holding unencrypted packets.[23]
Reverse engineering of integrated circuits/smart cards
Reverse engineering is an invasive and destructive form of analyzing a smart card. The attacker grinds away layer after layer of the smart card and takes pictures with an electron microscope. With this technique, it is possible to reveal the complete hardware and software part of the smart card. The major problem for the attacker is to bring everything into the right order to find out how everything works. The makers of the card try to hide keys and operations by mixing up memory positions, for example, bus scrambling.[24][25] In some cases, it is even possible to attach a probe to measure voltages while the smart card is still operational. The makers of the card employ sensors to detect and prevent this attack.[26] This attack is not very common because it requires a large investment in effort and special equipment that is generally only available to large chip manufacturers. Furthermore, the payoff from this attack is low since other security techniques are often employed, such as shadow accounts.
Reverse engineering for military applications
Reverse engineering is often used by people in order to copy other nations' technologies, devices, or information that have been obtained by regular troops in the field or by intelligence operations. It was often used during the Second World War and the Cold War. Well-known examples from WWII and later include:
Jerry can: British and American forces noticed that the Germans had gasoline cans with an excellent design. They reverse-engineered copies of those cans. The cans were popularly known as "Jerry cans".
Panzerschreck: The Germans captured an American Bazooka during World War II, and reverse engineered it to create the larger Panzerschreck.
Tupolev Tu-4: In 1944, three American B-29 bombers on missions over Japan were forced to land in the USSR. The Soviets, who did not have a similar strategic bomber, decided to copy the B-29. Within three years, they had developed the Tu-4, a near-perfect copy.
SCR-584 radar: copied by the USSR after the Second World War. Known in the form of a few modifications - СЦР-584, Бинокль-Д.
V-2 rocket: Technical documents for the V-2 and related technologies were captured by the Western Allies at the end of the war. The American side focused their reverse engineering efforts via Operation Paperclip, which led to the development of the PGM-11 Redstone rocket.[27] The Soviet side used captured German engineers to reproduce technical documents and plans, and worked from captured hardware in order to make their clone of the rocket, the R-1. Thus began the postwar Soviet rocket program that led to the R-7 and the beginning of the space race.
K-13/R-3S missile (NATO reporting name AA-2 Atoll): a Soviet reverse-engineered copy of the AIM-9 Sidewinder, made possible after a Taiwanese AIM-9B hit a Chinese MiG-17 without exploding in September 1958.[28] The missile became lodged within the airframe, and the pilot returned to base with what Russian scientists would describe as a university course in missile development.
BGM-71 TOW missile: In May 1975, negotiations between Iran and Hughes Missile Systems on co-production of the TOW and Maverick missiles stalled over disagreements in the pricing structure, and the subsequent 1979 revolution ended all plans for such co-production. Iran was later successful in reverse-engineering the missile and currently produces its own copy, the Toophan.
China has reverse-engineered many examples of Western and Russian hardware, from fighter aircraft to missiles and HMMWV cars.
During the Second World War, Polish and British cryptographers studied captured German "Enigma" message encryption machines for weaknesses. Their operation was then simulated on electro-mechanical devices called "Bombes" that tried all the possible scrambler settings of the "Enigma" machines to help break the coded messages sent by the Germans.
Also during the Second World War, British scientists analyzed and defeated a series of increasingly sophisticated radio navigation systems being used by the German Luftwaffe to perform guided bombing missions at night. The British countermeasures to this system were so effective that in some cases German aircraft were led by signals to land at RAF bases, believing they were back in German territory.
Overlap with patent law
Reverse engineering applies primarily to gaining understanding of a process or artifact, where the manner of its construction, use, or internal processes is not made clear by its creator.
Patented items do not of themselves have to be reverse-engineered to be studied, since the essence of a patent is that the inventor provides detailed public disclosure themselves, and in return receives legal protection of the invention involved. However, an item produced under one or more patents could also include other technology that is not patented and not disclosed. Indeed, one common motivation of reverse engineering is to determine whether a competitor's product contains patent infringements or copyright infringements.
Legality
United States
In the United States, even if an artifact or process is protected by trade secrets, reverse-engineering the artifact or process is often lawful as long as it has been legitimately obtained.[29]
Reverse engineering of computer software in the US often falls under contract law as a breach of contract, as well as under any other relevant laws. This is because most EULAs (end-user license agreements) specifically prohibit it, and U.S. courts have ruled that if such terms are present, they override the copyright law which expressly permits it (see Bowers v. Baystate Technologies[30][31]).
Sec. 103(f) of the DMCA (17 U.S.C. 1201 (f)) says that a person who is in legal possession of a program is permitted to reverse-engineer and circumvent its protection if this is necessary in order to achieve "interoperability" - a term broadly covering other devices and programs being able to interact with it, make use of it, and use and transfer data to and from it, in useful ways. A limited exemption exists that allows the knowledge thus gained to be shared and used for interoperability purposes. The section states:
(f) Reverse Engineering.
(1) Notwithstanding the provisions of subsection (a)(1)(A), a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure that effectively controls access to a particular portion of that program for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs, and that have not previously been readily available to the person engaging in the circumvention, to the extent any such acts of identification and analysis do not constitute infringement under this title.
(2) Notwithstanding the provisions of subsections (a)(2) and (b), a person may develop and employ technological means to circumvent a technological measure, or to circumvent protection afforded by a technological measure, in order to enable the identification and analysis under paragraph (1), or for the purpose of enabling interoperability of an independently created computer program with other programs, if such means are necessary to achieve such interoperability, to the extent that doing so does not constitute infringement under this title.
(3) The information acquired through the acts permitted under paragraph (1), and the means permitted under paragraph (2), may be made available to others if the person referred to in paragraph (1) or (2), as the case may be, provides such information or means solely for the purpose of enabling interoperability of an independently created computer program with other programs, and to the extent that doing so does not constitute infringement under this title or violate applicable law other than this section.
(4) For purposes of this subsection, the term "interoperability" means the ability of computer programs to exchange information, and of such programs mutually to use the information which has been exchanged.
European Union
EU Directive 2009/24, on the legal protection of computer programs, governs reverse engineering in the European Union. The directive states:[32]
(15) The unauthorised reproduction, translation, adaptation or transformation of the form of the code in which a copy of a computer program has been made available constitutes an infringement of the exclusive rights of the author. Nevertheless, circumstances may exist when such a reproduction of the code and translation of its form are indispensable to obtain the necessary information to achieve the interoperability of an independently created program with other programs. It has therefore to be considered that, in these limited circumstances only, performance of the acts of reproduction and translation by or on behalf of a person having a right to use a copy of the program is legitimate and compatible with fair practice and must therefore be deemed not to require the authorisation of the rightholder. An objective of this exception is to make it possible to connect all components of a computer system, including those of different manufacturers, so that they can work together. Such an exception to the author's exclusive rights may not be used in a way which prejudices the legitimate interests of the rightholder or which conflicts with a normal exploitation of the program.