Patrick Graydon
Qiuhua Cao

Viruses
Anti-Viruses
Discussion

A virus is "a program that can 'infect' other programs by modifying them to include a possibly evolved copy of itself." - Fred Cohen
Fred Cohen seems to have been the first to define the term virus, but the concept had been discussed earlier and there were some viruses out in the wild before he began his research.
  Link to virus history

In his Turing Award lecture to the ACM (published in 1984 as "Reflections on Trusting Trust"), Ken Thompson related the story of how he modified the C compiler to insert a backdoor into the UNIX login program and to insert his modifications into any C compiler compiled using his modified compiler.
  Slick: no trace of the backdoor remains in any source code!

The WM.Nuclear Microsoft Word macro virus infects Word documents during opening, saving, and printing by adding a set of macros to them. On April 5th it attempts to overwrite critical system files, and it occasionally adds the text "STOP ALL FRENCH NUCLEAR TESTING IN THE PACIFIC!" to the current document. (Information from Symantec's security bulletin.)

The VBS.SST@mm "Anna Kournikova" malware is a worm, not a virus, because it e-mails copies of itself but does not infect any other documents. (Information about VBS.SST@mm from Symantec's security bulletin.)

We found a web site listing 56 different terms related to viruses and malware, including:
  Backdoor
  Boot sector virus
  Encrypted virus
  Hoax
  Macro virus
  …

Here are some statistics from 2000 we found on the web:
  Over 85% of all the known viruses are for Microsoft platforms (nearly all the self-propagating worms are as well)
  Slightly fewer than 52,000 are viruses for DOS/Windows/NT platforms
    - about 6000 of these are Word macro viruses
    - about 150-200 of these are known to be widespread "in the wild"
    - in 1999, approximately 650 new viruses were reported each month (more than 20 a day)

More statistics from the same site:
  A few hundred are for JavaScript, HyperCard, Perl, and other scripting languages. Few of these can spread beyond a few machines without the active support of users.
  150 are for the Atari
  31 are native to the Macintosh, and only two of them are known to exist anymore
  2 or 3 are viruses native to OS/2

More statistics from the same site:
  About 5 are for Linux/Unix/etc., but none have been found in quantity "in the wild", nor would they be likely to spread very far if they were "loose"
  None are for BeOS, ErOS, or other small-population systems.
Question: can we reduce the risk of getting a virus infection by not using Microsoft products?

Fred Cohen's example virus:
  program virus := {
    1234567;
    subroutine infect-executable := {
      loop: file = get-random-executable-file;
      if first-line-of-file = 1234567 then goto loop;
      prepend virus to file;
    }
    subroutine do-damage := { whatever damage is to be done }
    subroutine trigger-pulled := { return true if some condition holds }
    main-program := {
      infect-executable;
      if trigger-pulled then do-damage;
      goto next;
    }
    next:
  }

Viruses aren't necessarily hard to write
  Cohen reports that his first virus took only 8 hours for an experienced programmer to write.
Viruses aren't necessarily big
  Cohen reports on a UNIX shell script virus that was only 7 lines long.

Cohen describes a hypothetical virus that compresses executables to conserve disk space.

Virus payloads could:
  Carry out a denial of service attack
  Crash the machine
  Randomly destroy data
  Install a trojan horse program
  Perform password cracking
  … and basically any other nasty thing you can think of.

Virus payloads may not trigger immediately.
  If a virus has few detectable side effects, it could spread without notice and become widespread before the payload is triggered.
  Question: is it possible that there are viruses in the wild today that have infected large numbers of systems but have gone unnoticed because they have few if any side effects and have not yet triggered their destructive payloads?

One way to protect against infection is to isolate systems, users, and/or information to make it difficult or impossible for a virus to spread widely.
Total isolation is a sure cure.
  Total isolation probably isn't practical for most users…
  Imagine life without Google … without BitTorrent … without Amazon.com …

If we can't isolate systems and users from each other completely, maybe we can erect partitions to limit the spread of malware.
It was thought that the Bell-LaPadula model might help limit the spread of viruses, but Cohen reports that "viruses demonstrated the ability to cross users boundaries and move from a given security level to a higher security level."

According to Cohen, the Biba and Bell-LaPadula models, if combined, would tend to create partitions.
  Unfortunately: "When we mix the Biba and Bell-LaPadula models, we find that the resulting isolationism secures us from viruses, but doesn't permit any user to write programs that can be used throughout the system." - Cohen
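
A minimal sketch of why the combination partitions the system, assuming the textbook forms of both models: Bell-LaPadula forbids reading up and writing down, and Biba forbids reading down and writing up. The integer levels and function names are hypothetical, and a single level stands in for both the confidentiality and the integrity label.

```python
# Toy model: one integer level per subject and object serves as both the
# confidentiality label (Bell-LaPadula) and the integrity label (Biba).
# All names are illustrative; none come from Cohen's paper.

def blp_allows(subject: int, obj: int, op: str) -> bool:
    """Bell-LaPadula: no read up, no write down."""
    return subject >= obj if op == "read" else subject <= obj

def biba_allows(subject: int, obj: int, op: str) -> bool:
    """Biba: no read down, no write up."""
    return subject <= obj if op == "read" else subject >= obj

def combined_allows(subject: int, obj: int, op: str) -> bool:
    """An access must satisfy both models at once."""
    return blp_allows(subject, obj, op) and biba_allows(subject, obj, op)

def allowed_pairs(levels, op):
    """List every (subject level, object level) pair the combination permits."""
    return [(s, o) for s in levels for o in levels if combined_allows(s, o, op)]

levels = [0, 1, 2]
print(allowed_pairs(levels, "read"))   # [(0, 0), (1, 1), (2, 2)]
print(allowed_pairs(levels, "write"))  # [(0, 0), (1, 1), (2, 2)]
```

Only same-level accesses survive, so each level becomes its own partition: nothing written at one level can be read or executed at another.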

Transitivity is a problem:
  "If there is a path from user A to user B, and there is a path from user B to user C, then there is a path from user A to user C with the witting or unwitting cooperation of user B." - Cohen
The military uses a category system in which users can only access information needed for their current duties. But some users have simultaneous access to multiple categories…
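
Transitivity amounts to reachability in a directed sharing graph. A rough sketch, using a made-up sharing relation and a hypothetical helper named `reachable`:

```python
from collections import deque

def reachable(shares: dict[str, set[str]], start: str) -> set[str]:
    """Return every user that information from `start` can eventually reach."""
    seen = set()
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for nxt in shares.get(user, set()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical sharing relation: A shares with B, and B shares with C.
shares = {"A": {"B"}, "B": {"C"}, "C": set()}
print(sorted(reachable(shares, "A")))  # ['B', 'C']
```

A never shares with C directly, yet C is reachable from A through B. A user who belongs to two categories plays the role of B between them.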

According to Cohen, "a precise system for integrity is NP-complete" and "any non-NP complete solution must tend toward isolationism."
  If a system restricts users' actions unnecessarily, it will be unpopular…

Cohen notes that flow distance and flow list models may limit virus spread.
  Flow distance restrictions limit how far information can travel.
  Flow lists allow more arbitrary expressions for accessibility based on the list of users who have had an effect on an object.
  BUT: "tracing exact information flow requires NP-complete time, and maintaining markings requires large amounts of space."
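
A rough sketch of a flow distance restriction, under three assumptions of ours: sharing adds one to an item's distance, derived output takes the largest distance among its inputs, and a threshold caps how far information may travel. The class, the helper names, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass

THRESHOLD = 1  # hypothetical policy: information may cross at most one sharing hop

@dataclass
class Item:
    owner: str
    distance: int = 0  # number of sharing hops this information has made

def derive(owner: str, *inputs: Item) -> Item:
    """Output built from several inputs inherits the largest input distance."""
    return Item(owner, max((i.distance for i in inputs), default=0))

def share(item: Item, recipient: str) -> Item:
    """Sharing adds one hop; refuse once the threshold would be exceeded."""
    if item.distance + 1 > THRESHOLD:
        raise PermissionError(f"flow distance limit reached for {recipient}")
    return Item(recipient, item.distance + 1)

a = Item("A")
b = share(a, "B")                # A -> B: distance 1, allowed
b2 = derive("B", b, Item("B"))   # B's output built from A's data: still distance 1
try:
    share(b2, "C")               # B -> C would be distance 2: blocked
    blocked = False
except PermissionError:
    blocked = True
print(b.distance, b2.distance, blocked)  # 1 1 True
```

The marking on every item is the space cost the quote refers to. Note that this sketch tracks only a coarse upper bound, not the exact flow.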

Couldn't we just make it against the law?
  "By simply telling users not to launch attacks, little is accomplished; users who can be trusted will not launch attacks; but users who would do damage cannot be trusted, so only legitimate work is blocked." - Cohen

If a given document is interpreted, and the interpreter lacks commands like "write file," it may be impossible for the document to carry a virus
  Graphics files are probably immune
    Except AnnaKournikova.jpg.vbs
  Documents that can hold scripts probably aren't
    Word documents can contain macro viruses such as WM.Nuclear

If we can't limit the spread of a virus, maybe we can find it and quarantine infected files…
  Unfortunately, no general algorithm for detecting virus behavior is possible.
    Cohen argues this by proposing a virus that infects only when the detection algorithm thinks it isn't a virus.
    Anti-virus programs must make do with more limited solutions, such as scanning for a virus signature.
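
The shape of that argument can be sketched in a few lines. This is only an illustration of the contradiction, not Cohen's formal proof: `detector` stands for any candidate decision procedure that inspects a program without running it, and infection is merely simulated by returning a string. All names are hypothetical.

```python
def make_contrary(detector):
    """Build a program that does the opposite of what `detector` predicts.

    `detector(program)` returns True when it classifies `program` as a virus.
    """
    def contrary():
        if detector(contrary):
            return "does nothing"        # flagged as a virus, so it stays benign
        return "infects (simulated)"     # judged clean, so it behaves virally
    return contrary

def detector_is_correct(detector) -> bool:
    """Compare the detector's verdict with what the contrary program does."""
    program = make_contrary(detector)
    verdict = detector(program)
    actually_viral = program() == "infects (simulated)"
    return verdict == actually_viral

# Two trivial candidate detectors; each is wrong about its contrary program.
print(detector_is_correct(lambda p: True))   # False
print(detector_is_correct(lambda p: False))  # False
```

Whatever verdict a deterministic detector gives, the contrary program behaves the other way, so no detector can be right about every program.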

According to Cohen, the following are undecidable:
  Detection of a virus by its appearance
  Detection of a virus by its behavior
  Detection of an evolution of a known virus
  Detection of a triggering mechanism by its appearance
  Detection of a triggering mechanism by its behavior
  Detection of an evolution of a known triggering mechanism
  Detection of a virus detector by its appearance
  Detection of a virus detector by its behavior
  Detection of an evolution of a known viral detector

Rather than implement a general solution, virus scanners look for virus signatures.
  These signatures could be as small as a few bytes or as large as the entire virus code.
  If a virus scanner uses the whole virus code as a signature, it may not be able to find simple variants of a virus.
  However, if a virus scanner uses a very small signature, it may incorrectly report infections that aren't there.
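
A minimal sketch of the signature-length trade-off, assuming a scanner that simply searches for byte substrings. The `scan` helper and all byte strings are made-up stand-ins; none of them are real virus code or a real product's signature format.

```python
def scan(data: bytes, signatures: dict[str, bytes]) -> list[str]:
    """Return the names of all signatures that occur somewhere in `data`."""
    return [name for name, sig in signatures.items() if sig in data]

# Stand-in byte strings for illustration only.
virus_body = b"\x90\x90INFECT-ROUTINE-v1\xcd\x21"
variant    = b"\x90\x90INFECT-ROUTINE-v2\xcd\x21"   # trivially modified copy
clean_file = b"print report\xcd\x21 and exit"

whole_body = {"toy-virus": virus_body}   # signature = the entire virus
tiny       = {"toy-virus": b"\xcd\x21"}  # signature = two bytes

print(scan(virus_body, whole_body))  # ['toy-virus'] -> original found
print(scan(variant, whole_body))     # []            -> variant missed
print(scan(clean_file, tiny))        # ['toy-virus'] -> false positive
```

A practical signature sits between the two extremes: long enough to be rare in clean files, short enough to survive small edits.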

Anti-virus companies must release new signatures each time a new virus is discovered
  A virus's spread is unimpeded for a while…
  According to Andreas Marx of AV-Test.org, it took Symantec 25h 5m to release an updated signature file in response to the W32/Sober.C worm attack.


In order to make it hard for virus scanners to detect their viruses, virus writers can add morphing behavior to their creations:
  "A polymorphic virus 'morphs' itself in order to evade detection. … Metamorphic viruses attempt to evade heuristic detection techniques by using more complex obfuscations." - Christodorescu and Jha

Cohen argues that no general solution for proving the equivalence of two programs is possible.
  His argument follows the same form as his argument against a general algorithm for virus detection: he proposes a virus in which two different infection instances will behave differently when a watching antivirus program believes they are the same.

A virus may morph itself by:
  Encrypting part of itself using a different key for each infection
  Changing variable names (in a script virus)
  Using binary obfuscation techniques (more on this later)
Polymorphic virus examples:
  Chameleon -- first polymorphic virus, '90s
  A partial list of the viruses that can be called 100 percent polymorphic (late 1993): Bootache, CivilWar (four versions), Crusher, Dudley, Fly, Freddy, Ginger, Grog, Haifa, Moctezuma (two versions), MVF, Necros, Nukehard, PcFly (three versions), Predator, Satanbug, Sandra, Shoker, Todor, Tremor, Trigger, Uruguay (eight versions). - at link Virus-Scan-Software
If a virus author knew what the anti-virus programs look for, he or she could design a virus that they wouldn't find…
  Example: in the early '90s there were a few MS-DOS "stealth" viruses that could intercept a virus-scanning program's attempt to read the boot record and show it a clean version rather than what was really there.
    See Symantec's description of the Stealth_boot virus.
    "Frodo.4096" virus, first stealth virus
    "Beast.512" stealth virus, less than a year after Frodo.4096
    More on this at Virus-Scan-Software

Christodorescu and Jha report on a technique for extracting the signature used by a given antivirus program.
  Basically they obfuscate parts of the program and determine what has to remain unobfuscated for the antivirus program to find the virus.
    FYI: there is a typo in the paper: the conditions on the loop in the SignatureExtraction function cause it to never execute…
  They say it "was successful in many cases."

The goal of binary obfuscation is to make it difficult to obtain an assembly-language description of a program from its raw bytes
  You need to turn raw bytes back into assembly code before you can decompile
  You can obfuscate by:
    Garbage insertion (more in a minute)
    Variable renaming
    Code reordering
    Encapsulating/encrypting code or data
    …

If you create unused regions in the executable and fill them with garbage bytes, the variable-length nature of the x86 instruction set can cause disassemblers to think that the legitimate instructions following the garbage are in fact operands.
You can use a conditional branch instruction to do an unconditional jump; disassemblers assume there are no garbage bytes at the target address or following the branch instruction.

Linn and Debray describe obfuscation using a branch function
  This function in turn branches to another target depending on where it is called from.
    This makes it difficult to determine which parts of the program are real by following the branch instructions.
    The function can return to an instruction one or more bytes after the usual return point, opening up a region to insert more garbage bytes into.

Kruegel, Robertson, Valeur and Vigna describe a disassembler that is able to correctly disassemble most instructions from a program obfuscated by the obfuscator Linn and Debray describe.

Static analysis techniques
  Linear sweep
    GNU's objdump uses linear sweep
    Gets confused by garbage bytes in unreachable areas
  Recursive traversal following control flow
    Drawback: indirect jumps
    Doesn't always "see" the whole binary
  Speculative disassembly
    Hybrid approach
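
The difference between the first two techniques can be shown on a toy example. The four-opcode, variable-length instruction set below is invented for illustration (it is not x86, and this is not how objdump is implemented). The sample program jumps over two garbage bytes.

```python
# Toy variable-length instruction set (hypothetical, not x86):
#   0x00 HALT (1 byte)        0x01 NOP (1 byte)
#   0x02 JMP addr (2 bytes)   0x03 LOAD imm16 (3 bytes)
LENGTHS = {0x00: 1, 0x01: 1, 0x02: 2, 0x03: 3}
NAMES = {0x00: "HALT", 0x01: "NOP", 0x02: "JMP", 0x03: "LOAD"}

def decode(code: bytes, pc: int):
    """Decode one instruction at `pc`; return (text, length) or None."""
    op = code[pc]
    if op not in LENGTHS or pc + LENGTHS[op] > len(code):
        return None
    operands = code[pc + 1 : pc + LENGTHS[op]]
    text = NAMES[op] + (" " + operands.hex() if operands else "")
    return text, LENGTHS[op]

def linear_sweep(code: bytes):
    """Decode byte after byte from the start, ignoring control flow."""
    out, pc = [], 0
    while pc < len(code):
        decoded = decode(code, pc)
        if decoded is None:
            pc += 1  # skip an undecodable byte
            continue
        out.append((pc, decoded[0]))
        pc += decoded[1]
    return out

def recursive_traversal(code: bytes, entry: int = 0):
    """Decode only what is reachable by following control flow from `entry`."""
    out, seen, work = [], set(), [entry]
    while work:
        pc = work.pop()
        while pc < len(code) and pc not in seen:
            decoded = decode(code, pc)
            if decoded is None:
                break
            seen.add(pc)
            out.append((pc, decoded[0]))
            op = code[pc]
            if op == 0x00:        # HALT ends this path
                break
            if op == 0x02:        # JMP: continue at the target only
                work.append(code[pc + 1])
                break
            pc += decoded[1]
    return sorted(out)

# JMP 4, two garbage bytes (the first looks like LOAD), then NOP and HALT.
program = bytes([0x02, 0x04, 0x03, 0xFF, 0x01, 0x00])
print(linear_sweep(program))         # [(0, 'JMP 04'), (2, 'LOAD ff01'), (5, 'HALT')]
print(recursive_traversal(program))  # [(0, 'JMP 04'), (4, 'NOP'), (5, 'HALT')]
```

Linear sweep swallows the real NOP at address 4 as an operand of the garbage LOAD, while recursive traversal never visits the garbage. The toy has no indirect jumps, which is exactly where recursive traversal would lose track of the code.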

"This arms race is usually in favor of the de-obfuscator. The obfuscator has to devise techniques that transform the program without seriously impacting the run-time performance or increasing the binary's size or memory footprint while there are no such constraints for the de-obfuscator." - Kruegel et al.

Christodorescu and Jha claim "the state of the art for malware detectors is dismal!"
  They propose a testing technique and then use it to show that the tested virus scanners were not generally able to identify the sampled viruses when they were obfuscated by code reordering or encapsulation.

This doesn't mean that these products aren't capable of detecting morphing viruses; the viruses in the sample set did not perform these morphs in the wild.
This does mean that in order to protect against a new virus that is just a simple modification of one of these existing viruses, the AV companies would have to release a new signature file.

Some virus detection techniques require you to start from a clean system.
  DOS users used clean boot disks to defeat stealth viruses…
  But is it always possible to get to a known clean state?
    What if every UNIX vendor had been infected with Ken Thompson's C compiler virus? Even their "clean" distribution media would be infected…

Obfuscation vs. deobfuscation: who can win?

Can anti-virus win in the future?