Computer virus

A computer virus is a computer program that can
copy itself and infect a computer without the
permission or knowledge of the owner. The term
"virus" is also commonly but erroneously used to
refer to other types of malware, adware, and spyware
programs that do not have the reproductive ability. A
true virus can only spread from one computer to
another (in some form of executable code) when its
host is taken to the target computer; for instance
because a user sent it over a network or the Internet,
or carried it on a removable medium such as a floppy
disk, CD, DVD, or USB drive. Viruses can increase
their chances of spreading to other computers by
infecting files on a network file system or a file
system that is accessed by another computer.[1][2]
The term "computer virus" is sometimes used as a
catch-all phrase to include all types of malware.
Malware includes computer viruses, worms, Trojan
horses, most rootkits, spyware, dishonest adware,
crimeware, and other malicious and unwanted
software. Viruses are
sometimes confused with computer worms and
Trojan horses, which are technically different. A
worm can exploit security vulnerabilities to spread
itself to other computers without needing to be
transferred as part of a host, and a Trojan horse is a
program that appears harmless but has a hidden
agenda. Worms and Trojans, like viruses, may cause
harm to either a computer system's hosted data,
functional performance, or networking throughput,
when they are executed. Some viruses and other
malware have symptoms noticeable to the computer
user, but many are surreptitious.
Most personal computers are now connected to the
Internet and to local area networks, facilitating the
spread of malicious code. Today's viruses may also
take advantage of network services such as the World
Wide Web, e-mail, Instant Messaging, and file
sharing systems to spread.

History
The Creeper virus was first detected on ARPANET,
the forerunner of the Internet, in the early 1970s.[3]
Creeper was an experimental self-replicating program
written by Bob Thomas at BBN in 1971.[4] Creeper
used the ARPANET to infect DEC PDP-10
computers running the TENEX operating system.
Creeper gained access via the ARPANET and copied
itself to the remote system where the message, "I'm
the creeper, catch me if you can!" was displayed. The
Reaper program was created to delete Creeper.[5]
A program called "Elk Cloner" was the first computer
virus to appear "in the wild", that is, outside the
single computer or lab where it was created.[citation needed]
Written in 1981 by Richard Skrenta, it attached
itself to the Apple DOS 3.3 operating system and
spread via floppy disk.[6] This virus was created as a
practical joke when Richard Skrenta was still in high
school. It was injected into a game on a floppy disk. On
its 50th use the Elk Cloner virus would be activated,
infecting the computer and displaying a short poem
beginning "Elk Cloner: The program with a
personality."
The first PC virus in the wild was a boot sector virus
dubbed (c)Brain[7], created in 1986 by the Farooq
Alvi Brothers, operating out of Lahore, Pakistan,
reportedly to deter piracy of the software they had
written[citation needed]. However, analysts have claimed
that the Ashar virus, a variant of Brain, possibly
predated it based on code within the virus.[original research?]

Before computer networks became widespread, most
viruses spread on removable media, particularly
floppy disks. In the early days of the personal
computer, many users regularly exchanged
information and programs on floppies. Some viruses
spread by infecting programs stored on these disks,
while others installed themselves into the disk boot
sector, ensuring that they would be run when the user
booted the computer from the disk, usually
inadvertently. PCs of the era would attempt to boot
first from a floppy if one had been left in the drive.
Until floppy disks fell out of use, this was the most
successful infection strategy and boot sector viruses
were the most common in the wild for many years.[8]
Traditional computer viruses emerged in the 1980s,
driven by the spread of personal computers and the
resultant increase in BBS, modem use, and software
sharing. Bulletin board-driven software sharing
contributed directly to the spread of Trojan horse
programs, and viruses were written to infect
popularly traded software. Shareware and bootleg
software were equally common vectors for viruses on
BBSs.[citation needed] Within the "pirate scene" of
hobbyists trading illicit copies of retail software,
traders in a hurry to obtain the latest applications
were easy targets for viruses.[original research?]
Macro viruses have become common since the mid-
1990s. Most of these viruses are written in the
scripting languages for Microsoft programs such as
Word and Excel and spread throughout Microsoft
Office by infecting documents and spreadsheets.
Since Word and Excel were also available for Mac
OS, most could also spread to Macintosh computers.
Although most of these viruses did not have the
ability to send infected e-mail, those viruses which
did took advantage of the Microsoft Outlook COM
interface.[citation needed]
Some old versions of Microsoft Word allow macros
to replicate themselves with additional blank lines. If
two macro viruses simultaneously infect a document,
the combination of the two, if also self-replicating,
can appear as a "mating" of the two and would likely
be detected as a virus unique from the "parents."[9]
A virus may also send a web address link as an
instant message to all the contacts on an infected
machine. If the recipient, thinking the link is from a
friend (a trusted source) follows the link to the
website, the virus hosted at the site may be able to
infect this new computer and continue propagating.
Cross-site scripting viruses emerged recently, and
were academically demonstrated in 2005.[10] Since
2005 there have been multiple instances of the cross-
site scripting viruses in the wild, exploiting websites
such as MySpace and Yahoo.

Infection strategies
In order to replicate itself, a virus must be permitted
to execute code and write to memory. For this reason,
many viruses attach themselves to executable files
that may be part of legitimate programs. If a user attempts
to launch an infected program, the virus's code may be
executed simultaneously. Viruses can be divided into
two types based on their behavior when they are
executed. Nonresident viruses immediately search for
other hosts that can be infected, infect those targets,
and finally transfer control to the application program
they infected. Resident viruses do not search for hosts
when they are started. Instead, a resident virus loads
itself into memory on execution and transfers control
to the host program. The virus stays active in the
background and infects new hosts when those files
are accessed by other programs or the operating
system itself.

Nonresident viruses
Nonresident viruses can be thought of as consisting
of a finder module and a replication module. The
finder module is responsible for finding new files to
infect. For each new executable file the finder
module encounters, it calls the replication module to
infect that file.[11]

Resident viruses
Resident viruses contain a replication module that is
similar to the one that is employed by nonresident
viruses. This module, however, is not called by a
finder module. The virus loads the replication module
into memory when it is executed instead and ensures
that this module is executed each time the operating
system is called to perform a certain operation. The
replication module can be called, for example, each
time the operating system executes a file. In this case
the virus infects every suitable program that is
executed on the computer.
Resident viruses are sometimes subdivided into a
category of fast infectors and a category of slow
infectors. Fast infectors are designed to infect as
many files as possible. A fast infector, for instance,
can infect every potential host file that is accessed.
This poses a special problem when using anti-virus
software, since a virus scanner will access every
potential host file on a computer when it performs a
system-wide scan. If the virus scanner fails to notice
that such a virus is present in memory the virus can
"piggy-back" on the virus scanner and in this way
infect all files that are scanned. Fast infectors rely on
their fast infection rate to spread. The disadvantage of
this method is that infecting many files may make
detection more likely, because the virus may slow
down a computer or perform many suspicious actions
that can be noticed by anti-virus software. Slow
infectors, on the other hand, are designed to infect
hosts infrequently. Some slow infectors, for instance,
only infect files when they are copied. Slow infectors
are designed to avoid detection by limiting their
actions: they are less likely to slow down a computer
noticeably and will, at most, infrequently trigger anti-
virus software that detects suspicious behavior by
programs. The slow infector approach, however, does
not seem very successful.

Vectors and hosts


Viruses have targeted various types of transmission
media or hosts. This list is not exhaustive:
• Binary executable files (such as COM files and
EXE files in MS-DOS, Portable Executable files
in Microsoft Windows, and ELF files in Linux)
• Volume Boot Records of floppy disks and hard
disk partitions
• The master boot record (MBR) of a hard disk
• General-purpose script files (such as batch files
in MS-DOS and Microsoft Windows, VBScript
files, and shell script files on Unix-like
platforms).
• Application-specific script files (such as Telix-
scripts)
• System-specific autorun script files (such as the
Autorun.inf file needed by Windows to
automatically run software stored on USB
Memory Storage Devices).
• Documents that can contain macros (such as
Microsoft Word documents, Microsoft Excel
spreadsheets, AmiPro documents, and Microsoft
Access database files)
• Cross-site scripting vulnerabilities in web
applications
• Arbitrary computer files. An exploitable buffer
overflow, format string, race condition or other
exploitable bug in a program which reads the file
could be used to trigger the execution of code
hidden within it. Most bugs of this type can be
made more difficult to exploit in computer
architectures with protection features such as an
execute disable bit and/or address space layout
randomization.
PDFs, like HTML, may link to malicious code.[citation needed]
PDFs can also be infected with malicious code.
In operating systems that use file extensions to
determine program associations (such as Microsoft
Windows), the extensions may be hidden from the
user by default. This makes it possible to create a file
that is of a different type than it appears to the user.
For example, an executable may be created named
"picture.png.exe", in which the user sees only
"picture.png" and therefore assumes that this file is
an image and most likely is safe.
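
The hidden-extension trick described above can be checked for mechanically. The following is a minimal sketch, not a complete defence: the function name looks_disguised and the extension list are assumptions made for illustration, and the list is far from exhaustive.

```python
from pathlib import PurePath

# Assumed, non-exhaustive set of extensions that Windows treats as runnable.
EXECUTABLE_EXTENSIONS = {".exe", ".com", ".bat", ".cmd", ".scr", ".pif", ".vbs"}


def looks_disguised(filename: str) -> bool:
    """Return True if the name ends in an executable extension that is
    preceded by another extension, as in "picture.png.exe"."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-1] in EXECUTABLE_EXTENSIONS


if __name__ == "__main__":
    for name in ("picture.png.exe", "picture.png", "setup.exe"):
        print(name, looks_disguised(name))
```

A check like this also flags harmless names that merely contain an extra dot (for example "my.report.exe"), so it is better suited to raising a warning than to blocking a file outright.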
An additional method is to generate the virus code
from parts of existing operating system files by using
the CRC16/CRC32 data. The initial code can be quite
small (tens of bytes) and unpack a fairly large virus.
This is analogous to a biological "prion" in the way it
works but is vulnerable to signature based detection.
This attack has not yet been seen "in the wild".

Methods to avoid detection


In order to avoid detection by users, some viruses employ different kinds of deception.
Some old viruses, especially on the MS-DOS platform, make sure that the "last modified"
date of a host file stays the same when the file is infected by the virus. This approach
does not fool anti-virus software, however, especially software that maintains and dates
cyclic redundancy checks on file changes.
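
To see why preserving the "last modified" date does not defeat a checksum comparison, consider the sketch below. It records a CRC-32 of a file, changes the contents, restores the original timestamps, and then compares again. The helper names crc32_of_file and changed_since are assumptions for illustration; real products keep their baselines in a protected database.

```python
import os
import tempfile
import zlib


def crc32_of_file(path: str) -> int:
    """Compute the CRC-32 of a file's contents, reading in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def changed_since(path: str, baseline_crc: int) -> bool:
    """Return True if the file's contents no longer match the baseline."""
    return crc32_of_file(path) != baseline_crc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "host.bin")
        with open(path, "wb") as handle:
            handle.write(b"original contents")
        baseline = crc32_of_file(path)
        stamp = os.stat(path)

        # Change the contents, then put the original timestamps back.
        with open(path, "wb") as handle:
            handle.write(b"modified contents")
        os.utime(path, ns=(stamp.st_atime_ns, stamp.st_mtime_ns))

        print("content changed:", changed_since(path, baseline))
```

CRC-32 is not collision-resistant, so a checker written today would more likely use a cryptographic hash; CRC is used here only because it is the mechanism the text names.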

Some viruses can infect files without increasing their sizes or damaging the files. They
accomplish this by overwriting unused areas of executable files. These are called cavity
viruses. For example the CIH virus, or Chernobyl Virus, infects Portable Executable files.
Because those files have many empty gaps, the virus, which was 1 KB in length, did not
add to the size of the file.

Some viruses try to avoid detection by killing the tasks associated with antivirus software
before it can detect them.

As computers and operating systems grow larger and more complex, old hiding
techniques need to be updated or replaced. Defending a computer against viruses may
demand that a file system migrate towards detailed and explicit permission for every kind
of file access.

Avoiding bait files and other undesirable hosts


A virus needs to infect hosts in order to spread further. In some cases, it might be a bad
idea to infect a host program. For example, many anti-virus programs perform an
integrity check of their own code. Infecting such programs will therefore increase the
likelihood that the virus is detected. For this reason, some viruses are programmed not to
infect programs that are known to be part of anti-virus software. Another type of host that
viruses sometimes avoid is bait files. Bait files (or goat files) are files that are specially
created by anti-virus software, or by anti-virus professionals themselves, to be infected by
a virus. These files can be created for various reasons, all of which are related to the
detection of the virus:

• Anti-virus professionals can use bait files to take a sample of a virus (i.e. a copy
of a program file that is infected by the virus). It is more practical to store and
exchange a small, infected bait file, than to exchange a large application program
that has been infected by the virus.
• Anti-virus professionals can use bait files to study the behavior of a virus and
evaluate detection methods. This is especially useful when the virus is
polymorphic. In this case, the virus can be made to infect a large number of bait
files. The infected files can be used to test whether a virus scanner detects all
versions of the virus.
• Some anti-virus software employs bait files that are accessed regularly. When
these files are modified, the anti-virus software warns the user that a virus is
probably active on the system.

Since bait files are used to detect the virus, or to make detection possible, a virus can
benefit from not infecting them. Viruses typically do this by avoiding suspicious
programs, such as small program files or programs that contain certain patterns of
'garbage instructions'.

A related strategy to make baiting difficult is sparse infection. Sometimes, sparse
infectors do not infect a host file that would be a suitable candidate for infection in other
circumstances. For example, a virus can decide on a random basis whether to infect a file
or not, or a virus can only infect host files on particular days of the week.

Stealth

Some viruses try to trick anti-virus software by intercepting its requests to the operating
system. A virus can hide itself by intercepting the anti-virus software’s request to read the
file and passing the request to the virus, instead of the OS. The virus can then return an
uninfected version of the file to the anti-virus software, so that it seems that the file is
"clean". Modern anti-virus software employs various techniques to counter stealth
mechanisms of viruses. The only completely reliable method to avoid stealth is to boot
from a medium that is known to be clean.

Self-modification
Most modern antivirus programs try to find virus-patterns inside ordinary programs by
scanning them for so-called virus signatures. A signature is a characteristic byte-pattern
that is part of a certain virus or family of viruses. If a virus scanner finds such a pattern in
a file, it notifies the user that the file is infected. The user can then delete, or (in some
cases) "clean" or "heal" the infected file. Some viruses employ techniques that make
detection by means of signatures difficult but probably not impossible. These viruses
modify their code on each infection. That is, each infected file contains a different variant
of the virus.
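
Signature scanning as described above amounts to searching file contents for known byte patterns. The sketch below shows the idea in its simplest form. The signature names and patterns are invented for illustration and match no real malware, and real scanners use much faster multi-pattern matching and wildcard signatures.

```python
from typing import Dict, List

# Hypothetical signature database: names and byte patterns are made up
# for illustration and do not correspond to any real malware.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("deadbeef0badf00d"),
    "Example.B": b"HARMLESS-TEST-PATTERN",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def scan_file(path: str) -> List[str]:
    """Read a whole file into memory and scan it (fine for a sketch)."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read())


if __name__ == "__main__":
    sample = b"header" + bytes.fromhex("deadbeef0badf00d") + b"trailer"
    print(scan_bytes(sample))
    print(scan_bytes(b"nothing to see here"))
```

Because the match is an exact substring test, any change to the pattern bytes between infections defeats it, which is the weakness the self-modifying techniques in this section exploit.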

Encryption with a variable key

A more advanced method is the use of simple encryption to encipher the virus. In this
case, the virus consists of a small decrypting module and an encrypted copy of the virus
code. If the virus is encrypted with a different key for each infected file, the only part of
the virus that remains constant is the decrypting module, which would (for example) be
appended to the end. In this case, a virus scanner cannot directly detect the virus using
signatures, but it can still detect the decrypting module, which still makes indirect
detection of the virus possible. Since these would be symmetric keys, stored on the
infected host, it is in fact entirely possible to decrypt the final virus, but this is probably
not required, since self-modifying code is such a rarity that it may be reason for virus
scanners to at least flag the file as suspicious.

An old, but compact, encryption involves XORing each byte in a virus with a constant, so
that the exclusive-or operation had only to be repeated for decryption. It is suspicious
code that modifies itself, so the code to do the encryption/decryption may be part of the
signature in many virus definitions.
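
The XOR scheme is easy to reason about from the defender's side. XORing with a constant is its own inverse, and a one-byte key has only 256 possible values, so a scanner can simply try them all and look for a known pattern. The sketch below demonstrates both points on harmless data; the function names and the marker string are assumptions chosen for illustration.

```python
from typing import Optional


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key (0-255)."""
    return bytes(b ^ key for b in data)


def find_xor_key(data: bytes, marker: bytes) -> Optional[int]:
    """Try all 256 keys; return the first one that reveals marker, else None."""
    for key in range(256):
        if marker in xor_bytes(data, key):
            return key
    return None


if __name__ == "__main__":
    marker = b"HARMLESS-TEST-PATTERN"  # stand-in for a known signature
    encoded = xor_bytes(marker, 0x5A)

    # Applying the same operation twice restores the original bytes.
    print(xor_bytes(encoded, 0x5A) == marker)

    # A scanner that knows the marker can recover the key by brute force.
    print(find_xor_key(encoded, marker))
```

This is why constant-key XOR offers little protection against a scanner that already holds a signature for the unencoded body, independently of whether the decoding routine itself is detected.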

Polymorphic code

Polymorphic code was the first technique that posed a serious threat to virus scanners.
Just like regular encrypted viruses, a polymorphic virus infects files with an encrypted
copy of itself, which is decoded by a decryption module. In the case of polymorphic
viruses, however, this decryption module is also modified on each infection. A well-
written polymorphic virus therefore has no parts which remain identical between
infections, making it very difficult to detect directly using signatures. Anti-virus software
can detect it by decrypting the viruses using an emulator, or by statistical pattern analysis
of the encrypted virus body. To enable polymorphic code, the virus has to have a
polymorphic engine (also called mutating engine or mutation engine) somewhere in its
encrypted body. See Polymorphic code for technical detail on how such engines
operate.[12]
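
One simple statistic that can feed the kind of statistical analysis mentioned above is the Shannon entropy of a file's bytes: well-encrypted or compressed data looks close to random, while ordinary code and text do not. The text does not say which statistics scanners actually use, so this is an assumed example rather than a description of any product, and the function name is hypothetical.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print("constant bytes:", shannon_entropy(b"A" * 4096))
    print("English text:  ", shannon_entropy(b"the quick brown fox " * 200))
    print("random bytes:  ", shannon_entropy(os.urandom(4096)))
```

High entropy alone proves nothing, since legitimate compressed archives and media files score just as high, so a measure like this can only raise suspicion. It also cannot see constant-key XOR encoding, because XORing every byte with the same value merely relabels the byte values and leaves the entropy unchanged.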

Some viruses employ polymorphic code in a way that constrains the mutation rate of the
virus significantly. For example, a virus can be programmed to mutate only slightly over
time, or it can be programmed to refrain from mutating when it infects a file on a
computer that already contains copies of the virus. The advantage of using such slow
polymorphic code is that it makes it more difficult for anti-virus professionals to obtain
representative samples of the virus, because bait files that are infected in one run will
typically contain identical or similar samples of the virus. This will make it more likely
that the detection by the virus scanner will be unreliable, and that some instances of the
virus may be able to avoid detection.

Metamorphic code

To avoid being detected by emulation, some viruses rewrite themselves completely each
time they are to infect new executables. Viruses that use this technique are said to be
metamorphic. To enable metamorphism, a metamorphic engine is needed. A
metamorphic virus is usually very large and complex. For example, W32/Simile
consisted of over 14000 lines of Assembly language code, 90% of which is part of the
metamorphic engine.[13][14]

Vulnerability and countermeasures


The vulnerability of operating systems to viruses

Just as genetic diversity in a population decreases the chance of a single disease wiping
out a population, the diversity of software systems on a network similarly limits the
destructive potential of viruses.

This became a particular concern in the 1990s, when Microsoft gained market dominance
in desktop operating systems and office suites. The users of Microsoft software
(especially networking software such as Microsoft Outlook and Internet Explorer) are
especially vulnerable to the spread of viruses. Microsoft software is targeted by virus
writers due to its desktop dominance, and is often criticized for including many errors
and holes for virus writers to exploit. Integrated and non-integrated Microsoft
applications (such as Microsoft Office), applications with scripting languages with
access to the file system (for example Visual Basic Script (VBS)), and applications with
networking features are also particularly vulnerable.

Although Windows is by far the most popular operating system for virus writers, some
viruses also exist on other platforms. Any operating system that allows third-party
programs to run can theoretically run viruses. Some operating systems are less secure
than others. Unix-based operating systems (and NTFS-aware applications on Windows NT based
platforms) only allow their users to run executables within their own protected memory
space.

Internet-based research has revealed cases in which people willingly pressed
a particular button to download a virus. Security analyst Didier Stevens ran a half-year
advertising campaign on Google AdWords which said "Is your PC virus-free? Get it
infected here!". The result was 409 clicks.[15][16]

As of 2006, there are relatively few security exploits targeting Mac OS X (with a Unix-
based file system and kernel).[17] The number of viruses for the older Apple operating
systems, known as Mac OS Classic, varies greatly from source to source, with Apple
stating that there are only four known viruses, and independent sources stating there are
as many as 63 viruses. Virus vulnerability between Macs and Windows is a chief selling
point, one that Apple uses in their Get a Mac advertising.[18] In January 2009, Symantec
announced discovery of a trojan that targets Macs.[19] This discovery did not gain much
coverage until April 2009.[19]

While Linux, and Unix in general, has always natively blocked normal users from having
access to make changes to the operating system environment, Windows has generally not
restricted its users in this way. This difference has continued partly due to the widespread use of
administrator accounts in contemporary versions like XP. In 1997, when a virus for Linux
was released – known as "Bliss" – leading antivirus vendors issued warnings that Unix-
like systems could fall prey to viruses just like Windows.[20] The Bliss virus may be
considered characteristic of viruses – as opposed to worms – on Unix systems. Bliss
requires that the user run it explicitly (so it is a trojan), and it can only infect programs
that the user has the access to modify. Unlike Windows users, most Unix users do not log
in as an administrator user except to install or configure software; as a result, even if a
user ran the virus, it could not harm their operating system. The Bliss virus never became
widespread, and remains chiefly a research curiosity. Its creator later posted the source
code to Usenet, allowing researchers to see how it worked.[21]

The role of software development

Because software is often designed with security features to prevent unauthorized use of
system resources, many viruses must exploit software bugs in a system or application to
spread. Software development strategies that produce large numbers of bugs will
generally also produce potential exploits.

Anti-virus software and other preventive measures

Many users install anti-virus software that can detect and eliminate known viruses after
the computer downloads or runs the executable. There are two common methods that an
anti-virus software application uses to detect viruses. The first, and by far the most
common method of virus detection is using a list of virus signature definitions. This
works by examining the content of the computer's memory (its RAM, and boot sectors)
and the files stored on fixed or removable drives (hard drives, floppy drives), and
comparing those files against a database of known virus "signatures". The disadvantage
of this detection method is that users are only protected from viruses that pre-date their
last virus definition update. The second method is to use a heuristic algorithm to find
viruses based on common behaviors. This method has the ability to detect viruses that
anti-virus security firms have yet to create a signature for.

Some anti-virus programs are able to scan opened files in addition to sent and received e-
mails 'on the fly' in a similar manner. This practice is known as "on-access scanning."
Anti-virus software does not change the underlying capability of host software to transmit
viruses. Users must update their software regularly to patch security holes. Anti-virus
software also needs to be regularly updated in order to prevent the latest threats.

One may also minimise the damage done by viruses by making regular backups of data
(and the operating systems) on different media, that are either kept unconnected to the
system (most of the time), read-only or not accessible for other reasons, such as using
different file systems. This way, if data is lost through a virus, one can start again using
the backup (which should preferably be recent).

If a backup session on optical media like CD and DVD is closed, it becomes read-only
and can no longer be affected by a virus (so long as a virus or infected file was not copied
onto the CD/DVD). Likewise, an operating system on a bootable CD can be used to start
the computer if the installed operating systems become unusable. Backups on removable
media must be carefully inspected before restoration. The Gammima virus, for example,
propagates via removable flash drives.[22][23]

Recovery methods

Once a computer has been compromised by a virus, it is usually unsafe to continue using
the same computer without completely reinstalling the operating system. However, a
number of recovery options exist once a computer has a virus. Which option applies
depends on the type and severity of the virus.

Virus removal

One possibility on Windows Me, Windows XP and Windows Vista is a tool known as
System Restore, which restores the registry and critical system files to a previous
checkpoint. Often a virus will cause a system to hang, and a subsequent hard reboot will
render a system restore point from the same day corrupt. Restore points from previous
days should work provided the virus is not designed to corrupt the restore files or also
exists in previous restore points.[24] Some viruses, however, disable system restore and
other important tools such as Task Manager and Command Prompt. An example of a
virus that does this is CiaDoor.

Administrators have the option to disable such tools from limited users for various
reasons (for example, to reduce potential damage from and the spread of viruses). The
virus modifies the registry to do the same, except, when the Administrator is controlling
the computer, it blocks all users from accessing the tools. When an infected tool activates
it gives the message "Task Manager has been disabled by your administrator.", even if the
user trying to open the program is the administrator.[citation needed]

Users running a Microsoft operating system can access Microsoft's website to run a free
scan, provided they have their 20-digit registration number.

Operating system reinstallation


Reinstalling the operating system is another approach to virus removal. It involves simply
reformatting the OS partition and installing the OS from its original media, or imaging
the partition with a clean backup image (Taken with Ghost or Acronis for example).

This method has the benefits of being simple to do, being faster than running multiple
antivirus scans, and being guaranteed to remove any malware. Downsides include having to
reinstall all other software, reconfigure it, and restore user preferences. User data can be
backed up by booting off of a Live CD or putting the hard drive into another computer
and booting from the other computer's operating system (though care must be taken not
to transfer the virus to the new computer).

Adware

Adware or advertising-supported software is any software package which
automatically plays, displays, or downloads advertisements to a computer after the
software is installed on it or while the application is being used. Some types of adware
are also spyware and can be classified as privacy-invasive software.

Application
Advertising functions are integrated into or bundled with the software, which is often
designed to note what Internet sites the user visits and to present advertising pertinent to
the types of goods or services featured there. Adware is usually seen by the developer as
a way to recover development costs, and in some cases it may allow the software to be
provided to the user free of charge or at a reduced price. The income derived from
presenting advertisements to the user may allow or motivate the developer to continue to
develop, maintain and upgrade the software product. Conversely, the advertisements may
be seen by the user as interruptions or annoyances, or as distractions from the task at
hand.

Some adware is also shareware, and so the word may be used as a term of distinction to
differentiate between types of shareware software. What differentiates adware from other
shareware is that it is primarily advertising-supported. Users may also be given the option
to pay for a "registered" or "licensed" copy to do away with the advertisements.

Adware can also download and install Spyware.

Well-known adware programs/programs distributed with adware
• 123 Messenger
• 180SearchAssistant
• 888bar
• Adssite Toolbar
• AOL Instant Messenger
• Ask.com Toolbar (Toolbar is automatically installed with many different
programs, even after you uncheck Ask.com during the installation process.)
• Bearshare
• Bonzi Buddy
• BlockChecker
• Burn4Free
• ClipGenie
• Comet Cursor
• Crazy Girls
• Cydoor
• Daemon Tools - (Software comes bundled with the "Daemon Tools WhenUSave
Toolbar" but can be unchecked during installation)
• DivX
• DollarRevenue
• Ebates MoneyMaker
• ErrorSafe
• ErrorSweeper
• Evernote
• Ezula
• FaceGame.exe
• FlashGet
• Gamespy
• Gamevance
• Gator
• Gool.exe
• Kazaa
• Limewire (on some music downloads)
• Messenger Plus! Live - (Software comes bundled with adware, but can be
unchecked during installation)
• MessengerSkinner
• Mirar Toolbar
• Oemji Toolbar
• PornDigger!
• Quicken2008
• RealPlayer
• Smiley Central
• Spotify - (A subscription can be paid to remove ads.)
• TagASaurus
• TopMoxie
• Tribal Fusion
• Videothang
• Viewpoint Media Player
• VirusProtectPro
• Vuze
• WeatherBug
• WhenU
• WinAce (now with MeMedia AdVantage)
• Windows Live Messenger
• Winzix
• XXX Shop online
• XXX Toy
• Zango
• Zango Toolbar
• Zwinky

The Eudora e-mail client is a popular example of an adware "mode" in a program. After a
trial period during which all program features are available, the user is offered a choice: a
free (but feature-limited) mode, an ad-supported mode with all the features enabled, or a paid
mode that enables all features and turns off the ads.

Prevention and detection


Programs have been developed to detect, quarantine, and remove spyware. As there are
many examples of adware software that are also spyware or malware, many of these
detection programs have been developed to detect, quarantine, and remove adware as
well. Among the more prominent of these applications are Ad-Aware, Malwarebytes'
Anti-Malware and Spybot - Search & Destroy. These programs are designed specifically
for spyware detection and will not detect viruses.

Almost all commercial antivirus software currently detects adware and spyware, or offers a
separate spyware detection package. The reluctance to add adware and spyware detection
to commercial antivirus products was fueled by a fear of lawsuits. Kaspersky, for
example, was sued by Zango for blocking the installation of their products. Zango
software and components are almost universally detected as adware nowadays.

E-mail spam
From Wikipedia, the free encyclopedia


An email box folder filled with spam messages.

E-mail spam, also known as junk e-mail, is a subset of spam that involves nearly
identical messages sent to numerous recipients by e-mail. A common synonym for spam
is unsolicited bulk e-mail (UBE). Definitions of spam usually include the aspects that
email is unsolicited and sent in bulk.[1][2][3][4][5] "UCE" refers specifically to unsolicited
commercial e-mail.

E-mail spam has grown steadily, even exponentially, since the early 1990s to several
billion messages a day. Spam has frustrated, confused, and annoyed e-mail users. The
total volume of spam (over 100 billion emails per day as of April 2008) has leveled off
slightly in recent years, and is no longer growing exponentially. The amount received by
most e-mail users has decreased, mostly because of better filtering. About 80% of all
spam is sent by fewer than 200 spammers. Botnets, networks of virus-infected computers,
are used to send about 80% of spam. Since the cost of the spam is borne mostly by the
recipient,[6] it is effectively postage-due advertising.

The legal status of spam varies from one jurisdiction to another. In the United States,
spam was declared to be legal by the CAN-SPAM Act of 2003 provided the message
adheres to certain specifications. ISPs have attempted to recover the cost of spam through
lawsuits against spammers, although they have been mostly unsuccessful in collecting
damages despite winning in court.[7][8]

Spammers collect e-mail addresses from chatrooms, websites, customer lists,
newsgroups, and viruses which harvest users' address books; the collected addresses are
also sold to other spammers. Much of spam is sent to invalid e-mail addresses. Spam
averages 94% of all e-mail sent.[9]

Overview
From the beginning of the Internet (the ARPANET), sending of junk e-mail has been
prohibited,[10] enforced by the Terms of Service/Acceptable Use Policy (ToS/AUP) of
internet service providers (ISPs) and peer pressure. Even with a thousand users junk e-
mail for advertising is not tenable, and with a million users it is not only impractical,[11]
but also expensive.[12] It is estimated that spam cost businesses on the order of $100
billion in 2007.[13] As the scale of the spam problem has grown, ISPs and the public have
turned to government for relief from spam, which has failed to materialize.[14]

Types
Spam has several definitions, varying by the source.

• Unsolicited bulk e-mail (UBE)—unsolicited e-mail, sent in large quantities.


• Unsolicited commercial e-mail (UCE)—this more restrictive definition is used by
regulators whose mandate is to regulate commerce, such as the U.S. Federal Trade
Commission.

Spamvertised sites

Many spam e-mails contain URLs to a website or websites. According to a Commtouch
report in June 2004, "only five countries are hosting 99.68% of the global spammer
websites", of which the foremost is China, hosting 73.58% of all web sites referred to
within spam.[15]

Most common products advertised

According to information compiled by Spam-Filter-Review.com, e-mail spam for 2006
can be broken down as follows.[16]

E-Mail Spam by Category

Category     Share
Products     25%
Financial    20%
Adult        19%
Scams         9%
Health        7%
Internet      7%
Leisure       6%
Spiritual     4%
Other         3%

"Pills, porn and poker" sums up the most common products advertised in spam. Others
include replica watches.[17][18]

419 scams

Main article: Advance fee fraud

Advance fee fraud spam such as the Nigerian "419" scam may be sent by a single
individual from a cyber cafe in a developing country. Organized "spam gangs" operating
from Russia or eastern Europe share many features in common with other forms of
organized crime, including turf battles and revenge killings.[19]

Phishing

Main article: Phishing

Spam is also a medium for fraudsters to trick users into entering personal information on
fake Web sites, using e-mail forged to look as if it comes from a bank or other organization
such as PayPal. This is known as phishing. Spear-phishing is targeted phishing that uses
known information about the recipient, for example by making the message look as if it
comes from their employer.[20]

Spam techniques

Appending

Main article: E-mail appending

If a marketer has one database containing names, addresses, and telephone numbers of
prospective customers, they can pay to have their database matched against an external
database containing email addresses. The company then has the means to send email to
persons who have not requested email, which may include persons who have deliberately
withheld their email address.[21]

Image spam


Main article: Image spam

Image spam is an obfuscating method in which the text of the message is stored as a GIF
or JPEG image and displayed in the email. This prevents text based spam filters from
detecting and blocking spam messages. Image spam is currently used largely to advertise
"pump and dump" stocks.[22]

Often, image spam contains nonsensical, computer-generated text which simply annoys
the reader. However, some newer filtering programs try to read the images by
attempting to find text in them. They are not very accurate, and sometimes filter
out innocent images, such as a picture of a product box that has words on it.

A newer technique, however, is to use an animated GIF image that does not contain clear
text in its initial frame, or to contort the shapes of letters in the image (as in CAPTCHA)
to avoid detection by OCR tools.

Blank spam

Blank spam is spam lacking a payload advertisement. Often the message body is missing
altogether, as well as the subject line. Still, it fits the definition of spam because of its
nature as bulk and unsolicited email.

Blank spam may originate in different ways, either intentionally or unintentionally:

1. Blank spam may have been sent in a directory harvest attack, a form of dictionary
attack for gathering valid addresses from an email service provider. Since the goal
in such an attack is to use the bounces to separate invalid addresses from the valid
ones, the spammer may dispense with most elements of the header and the entire
message body, and still accomplish his or her goals.
2. Blank spam may also occur when a spammer forgets or otherwise fails to add the
payload when he or she sets up the spam run.
3. Often blank spam headers appear truncated, suggesting that computer glitches
may have contributed to this problem—from poorly-written spam software to
shoddy relay servers, or any problems that may truncate header lines from the
message body.
4. Some spam may appear to be blank when in fact it is not. An example of this is
the VBS.Davinia.B email worm, which propagates through messages that have no
subject line and appear blank, when in fact they use HTML code to download other
files.

Backscatter spam

Main article: Backscatter (e-mail)

Backscatter is a side-effect of e-mail spam, viruses and worms, where email servers
receiving spam and other mail send bounce messages to an innocent party. This occurs
because the original message's envelope sender is forged to contain the e-mail address of
the victim. A very large proportion of such e-mail is sent with a forged From: header,
matching the envelope sender.

Since these messages were not solicited by the recipients, are substantially similar to each
other, and are delivered in bulk quantities, they qualify as unsolicited bulk email or spam.
As such, systems that generate e-mail backscatter can end up being listed on various
DNSBLs and be in violation of internet service providers' Terms of Service.

Legality
See also: E-mail spam legislation by country

Sending spam violates the Acceptable Use Policy (AUP) of almost all Internet Service
Providers. Providers vary in their willingness or ability to enforce their AUP. Some
actively enforce their terms and terminate spammers' accounts without warning. Some
ISPs lack adequate personnel or technical skills for enforcement, while others may be
reluctant to enforce restrictive terms against profitable customers.

As the recipient directly bears the cost of delivery, storage, and processing, one could
regard spam as the electronic equivalent of "postage-due" junk mail.[6][23] Due to the low
cost of sending unsolicited e-mail and the potential profit entailed, some believe that only
strict legal enforcement can stop junk e-mail. The Coalition Against Unsolicited
Commercial Email (CAUCE) argues "Today, much of the spam volume is sent by career
criminals and malicious hackers who won't stop until they're all rounded up and put in
jail."[24]

Canada

The Government of Canada has introduced anti-spam legislation called the Electronic
Commerce Protection Act in the House of Commons to fight spam.[25]

European Union and Australia

Several countries have passed laws that specifically target spam, notably Australia and all
the countries of the European Union.

Article 13 of the European Union Directive on Privacy and Electronic Communications
(2002/58/EC) provides that the EU member states shall take appropriate measures to
ensure that unsolicited communications for the purposes of direct marketing are not
allowed either without the consent of the subscribers concerned or in respect of
subscribers who do not wish to receive these communications, the choice between these
options to be determined by national legislation.

In Australia, the relevant legislation is the Spam Act 2003, which covers some types of
e-mail and phone spam and took effect on 11 April 2004. The Spam Act provides that
"Unsolicited commercial electronic messages must not be sent," which is an opt-in
requirement. This contrasts with the U.S. CAN-SPAM Act, which is opt-out (i.e.,
companies are free to send spam until the recipient directs the sender not to). Penalties
are up to 10,000 penalty units, or 2,000 penalty units for a person other than a body
corporate.

United States

In the United States, most states enacted anti-spam laws during the late 1990s and early
2000s. These have since been pre-empted by the less restrictive CAN-SPAM Act of 2003.

Spam is legally permissible according to the CAN-SPAM Act of 2003 provided it follows
certain criteria: a truthful subject line, no false information in the technical headers or
sender address, and other minor requirements. If the spam fails to comply with any of
these requirements it is illegal. Aggravated or accelerated penalties apply if the spammer
harvested the email addresses using methods described earlier.

A review of the effectiveness of CAN-SPAM in 2005 by the Federal Trade Commission
(the agency charged with CAN-SPAM enforcement) stated that the amount of sexually
explicit spam had significantly decreased since 2003 and the total volume had begun to
level off.[26] Senator Conrad Burns, a principal sponsor, noted that "Enforcement is key
regarding the CAN-SPAM legislation." In 2004 less than 1% of spam complied with the
CAN-SPAM Act of 2003.[27]

Effectiveness

Legislative efforts to curb spam have been ineffective or counter-productive. For
example, the CAN-SPAM Act of 2003 requires that each message include a means to
"opt out" (i.e., decline future e-mail from the same source). It is widely believed that
responding to opt-out requests is unwise, as this merely confirms to the spammer that
they have reached an active e-mail account. To the extent this is true, the CAN-SPAM
Act's opt-out provisions are counter-productive in two ways: first, recipients who are
aware of the potential risks of opting out will decline to do so; second, attempts to opt-out
will provide spammers with useful information on their targets. A 2002 study by the
Center for Democracy and Technology found that about 16% of web sites tested with opt-
out requests continued to spam.[28]

Other laws

Accessing privately owned computer resources without the owner's permission counts as
illegal under computer crime statutes in most nations. Deliberate spreading of computer
viruses is also illegal in the United States and elsewhere. Thus, some common behaviors
of spammers are criminal regardless of the legality of spamming per se. Even before the
advent of laws specifically banning or regulating spamming, spammers were successfully
prosecuted under computer fraud and abuse laws for wrongfully using others' computers.
The use of botnets can be perceived as theft. The spammer consumes a zombie owner's
bandwidth and resources without any cost. In addition, spam is perceived as theft of
services. The receiving SMTP servers consume significant amounts of system resources
dealing with this unwanted traffic. As a result, service providers have to spend large
amounts of money to make their systems capable of handling these amounts of email.
Such costs are inevitably passed on to the service providers' customers.[29]

Other laws, not only those related to spam, have been used to prosecute alleged
spammers. For example, Alan Ralsky was indicted on stock fraud charges in January
2008, and Robert Soloway pleaded guilty to charges of mail fraud, fraud in connection with
electronic mail, and failing to file a tax return in March 2008.[30]

Deception and fraud


Spammers may engage in deliberate fraud to send out their messages. Spammers often
use false names, addresses, phone numbers, and other contact information to set up
"disposable" accounts at various Internet service providers. They also often use falsified
or stolen credit card numbers to pay for these accounts. This allows them to move quickly
from one account to the next as the host ISPs discover and shut down each one.

Senders may go to great lengths to conceal the origin of their messages. Large companies
may hire another firm to send their messages so that complaints or blocking of email falls
on a third party. Others engage in spoofing of e-mail addresses (much easier than IP
address spoofing). The e-mail protocol (SMTP) has no authentication by default, so the
spammer can pretend to originate a message apparently from any e-mail address. To
prevent this, some ISPs and domains require the use of SMTP-AUTH, allowing positive
identification of the specific account from which an e-mail originates.

Senders cannot completely spoof e-mail delivery chains (the 'Received' header), since the
receiving mailserver records the actual connection from the last mailserver's IP address.
To counter this, some spammers forge additional delivery headers to make it appear as if
the e-mail had previously traversed many legitimate servers.
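
As a rough illustration of the delivery chain described above, the following Python sketch lists the Received headers of a message using the standard email module. The sample message, host names and addresses are invented for illustration (the addresses come from documentation ranges). Only the topmost header, which the recipient's own server adds, can be relied on; anything below it may have been supplied by the sender.

```python
from email import message_from_string

# Hypothetical message: the lower Received line could have been forged by the
# sender; the top one was added by the receiving server itself.
RAW = (
    "Received: from mail.example.net (unknown [203.0.113.7])\n"
    "\tby mx.recipient.example; Mon, 1 Jun 2009 10:00:02 +0000\n"
    "Received: from trusted-bank.example ([192.0.2.1])\n"
    "\tby mail.example.net; Mon, 1 Jun 2009 10:00:00 +0000\n"
    "From: someone@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

def received_chain(raw):
    """Return the Received headers, newest (most trustworthy) first."""
    msg = message_from_string(raw)
    # Collapse folded header whitespace so each hop is a single line.
    return [" ".join(h.split()) for h in msg.get_all("Received", [])]

chain = received_chain(RAW)
print("recorded by our server:", chain[0])
for hop in chain[1:]:
    print("unverified hop:", hop)
```

A real filter would compare the connecting IP address in the top header against the claims made further down the chain.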

Spoofing can have serious consequences for legitimate e-mail users. Not only can their
e-mail inboxes get clogged up with "undeliverable" e-mails in addition to volumes of spam,
but they can also be mistakenly identified as spammers. Not only may they receive irate e-mail
from spam victims, but (if spam victims report the e-mail address owner to the ISP, for
example) a naive ISP may terminate their service for spamming.

Theft of service


Spammers frequently seek out and make use of vulnerable third-party systems such as
open mail relays and open proxy servers. SMTP forwards mail from one server to another
—mail servers that ISPs run commonly require some form of authentication to ensure
that the user is a customer of that ISP. Open relays, however, do not properly check who
is using the mail server and pass all mail to the destination address, making it harder to
track down spammers.

Increasingly, spammers use networks of malware-infected PCs (zombies) to send their
spam. Zombie networks are also known as botnets (such zombifying malware is known
as a bot, short for robot). In June 2006, an estimated 80% of e-mail spam was sent by
zombie PCs, an increase of 30% from the prior year. An estimated 55 billion spam e-mails
were sent each day in June 2006, an increase of 25 billion per day from June 2005.[31]

Statistics and estimates

The growth of e-mail spam

Spam is growing, with no signs of abating. The amount of spam users see in their
mailboxes is just the tip of the iceberg, since spammers' lists often contain a large
percentage of invalid addresses and many spam filters simply delete or reject "obvious
spam".

In absolute numbers

• 1978 - An e-mail spam advertising a DEC product presentation is sent by Gary
Thuerk to 600 addresses, which was every ARPANET user at the time, though
software limitations meant only slightly more than half of the intended recipients
actually received it.[32]
• 2002 - 2.4 billion per day[33]
• 2004 - 11 billion per day[34]
• 2005 - (June) 30 billion per day[31]
• 2006 - (June) 55 billion per day[31]
• 2007 - (February) 90 billion per day
• 2007 - (June) 100 billion per day[35]
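
To put the daily volumes listed above in proportion, the short Python sketch below computes the ratio between successive figures. The numbers are copied from the list; the labels and the function name are chosen here for illustration. The intervals between measurements are uneven, so the ratios are not annual growth rates.

```python
# Daily spam volume in billions of messages, as listed above.
volumes = [
    ("2002", 2.4),
    ("2004", 11),
    ("2005-06", 30),
    ("2006-06", 55),
    ("2007-02", 90),
    ("2007-06", 100),
]

def growth_factors(series):
    """Ratio of each figure to the one before it."""
    return [
        (later[0], later[1] / earlier[1])
        for earlier, later in zip(series, series[1:])
    ]

for label, factor in growth_factors(volumes):
    print(f"{label}: x{factor:.2f} over the previous figure")

overall = volumes[-1][1] / volumes[0][1]
print(f"2002 to mid-2007: roughly x{overall:.0f}")
```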

As a percentage of the total volume of e-mail

More than 97% of all e-mails sent over the net are unwanted, according to a Microsoft
security report.[36]

MAAWG estimates that 85% of incoming mail is "abusive email", as of the second half
of 2007. The sample size for the MAAWG's study was over 100 million
mailboxes.[37][38][39]

Spamhaus estimates that 90% of incoming email traffic is spam in North America,
Europe or Australasia.[40] By June 2008 96.5% of e-mail received by businesses was
spam.[20]

Highest amount of spam received


According to Steve Ballmer, Microsoft founder Bill Gates receives four million e-mails
per year, most of them spam.[41] (This was originally incorrectly reported as "per day".[42])

At the same time Jef Poskanzer, owner of the domain name acme.com, was receiving
over one million spam emails per day.[43]

Cost of spam

A 2004 survey estimated that lost productivity costs Internet users in the United States
$21.58 billion annually, while another reported the cost at $17 billion, up from $11 billion
in 2003. In 2004, the worldwide productivity cost of spam was estimated to be $50
billion in 2005.[44] An estimate of the percentage cost borne by the sender of marketing
junk mail (snail mail) is 88%, whereas in 2001 one spam was estimated to cost $0.10 for
the receiver and $0.00001 (0.01% of the cost) for the sender.[6]
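
The 0.01% figure can be checked with a short calculation. The sketch below uses the two per-message costs quoted above; the variable names are chosen here for illustration. It suggests that the percentage is the sender's cost relative to the receiver's cost, which for numbers this lopsided is almost the same as the sender's share of the combined cost.

```python
from fractions import Fraction

receiver_cost = Fraction("0.10")     # dollars per message, receiver side
sender_cost = Fraction("0.00001")    # dollars per message, sender side

# Sender cost relative to receiver cost, and as a share of the combined cost.
ratio = sender_cost / receiver_cost
sender_share = sender_cost / (sender_cost + receiver_cost)

print(f"sender cost relative to receiver cost: {float(ratio):.2%}")
print(f"sender share of the combined cost: {float(sender_share):.4%}")
```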

Origin of spam

Origin or source of spam refers to the geographical location of the computer from which
the spam is sent; it is not the country where the spammer resides, nor the country that
hosts the spamvertised site. Due to the international nature of spam, the spammer, the
hijacked spam-sending computer, the spamvertised server, and the user target of the spam
are all often located in different countries. As much as 80% of spam received by Internet
users in North America and Europe can be traced to fewer than 200 spammers.[45]

In terms of volume of spam: According to Sophos, the major sources of spam in the
fourth quarter of 2008 (October to December) were:[20][46][47][48][49][50][51][52][53][54]

• The United States (the origin of 19.8% of spam messages, up from 18.9% in Q3)
• China (9.9%, up from 5.4%)
• Russia (6.4%, down from 8.3%)
• Brazil (6.3%, up from 4.5%)
• Turkey (4.4%, down from 8.2%)

When grouped by continents, spam comes mostly from:

• Asia (37.8%, down from 39.8%)
• North America (23.6%, up from 21.8%)
• Europe (23.4%, down from 23.9%)
• South America (12.9%, down from 13.2%)

In terms of number of IP addresses: The Spamhaus Project (which measures spam
sources in terms of number of IP addresses used for spamming, rather than volume of
spam sent) ranks the top three as the United States, China, and Russia,[55] followed by
Japan, Canada, and South Korea.

In terms of networks: As of 5 June 2007, the three networks hosting the most spammers
are Verizon, AT&T, and VSNL International.[55] Verizon inherited many of these spam
sources from its acquisition of MCI, specifically through the UUNet subsidiary of MCI,
which Verizon subsequently renamed Verizon Business.

Spam in culture

The often rambling and incomprehensible nature of spam has led to an underground
culture, with video tributes on the video sharing service YouTube, cartoons based on spam
titles (Spamusement!) as well as spam blogs such as My Pet Spam, Delightful Spam and
The Spam Hunter Diaries.

Anti-spam techniques


Main article: Anti-spam techniques (e-mail)

The U.S. Department of Energy Computer Incident Advisory Capability (CIAC) has
provided specific countermeasures against electronic mail spamming.[56]

Some popular methods for filtering and refusing spam include e-mail filtering based on
the content of the e-mail, DNS-based blackhole lists (DNSBL), greylisting, spamtraps,
enforcing the technical requirements of e-mail (SMTP), checksumming systems to detect
bulk email, and putting some sort of cost on the sender via a proof-of-work system or
a micropayment. Each method has strengths and weaknesses and each is controversial
due to its weaknesses. For example, one company offers a service for "removing some spamtrap
and honeypot addresses" from email lists, defeating the ability of those methods to
identify spammers.
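
One of the methods listed above, checksumming to detect bulk email, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the normalization rule, the threshold of three copies and the names fuzzy_checksum and BulkDetector are all invented for this example.

```python
import hashlib
import re
from collections import Counter

def fuzzy_checksum(body):
    """Hash a message body after removing details that vary between copies."""
    text = body.lower()
    # Keep letters only: digits, punctuation and spacing differences vanish.
    text = re.sub(r"[^a-z]+", " ", text).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class BulkDetector:
    """Counts how often each checksum has been seen."""

    def __init__(self, threshold=3):   # threshold is an arbitrary assumption
        self.threshold = threshold
        self.seen = Counter()

    def is_bulk(self, body):
        digest = fuzzy_checksum(body)
        self.seen[digest] += 1
        return self.seen[digest] >= self.threshold

detector = BulkDetector(threshold=3)
copies = [f"Dear user {i}, buy cheap watches now! Ref #{i * 7}" for i in range(4)]
flags = [detector.is_bulk(c) for c in copies]
print(flags)
```

The weakness mentioned in the text applies here as well: a sender who varies the wording of each copy, rather than just a reference number, produces different checksums.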

Anti-spam techniques should not be employed on abuse email addresses, as is commonly
the case. The result of this is that when people attempt to report spam to a host, the spam
message is caught in the spam filter and the host remains unaware that their network is
being exploited by spammers.

How spammers operate

Gathering of addresses

Main article: E-mail address harvesting

In order to send spam, spammers need to obtain the e-mail addresses of the intended
recipients. To this end, both spammers themselves and list merchants gather huge lists of
potential e-mail addresses. Since spam is, by definition, unsolicited, this address
harvesting is done without the consent (and sometimes against the expressed will) of the
address owners. As a consequence, spammers' address lists are inaccurate. A single spam
run may target tens of millions of possible addresses — many of which are invalid,
malformed, or undeliverable.

Sometimes, if the sent spam is "bounced" or sent back to the sender by various programs
that eliminate spam, or if the recipient clicks on an unsubscribe link, that may cause that
email address to be marked as "valid", which is interpreted by the spammer as "send me
more".

Delivering spam messages

Main article: Spam email delivery

Obfuscating message content

Many spam-filtering techniques work by searching for patterns in the headers or bodies
of messages. For instance, a user may decide that all e-mail they receive with the word
"Viagra" in the subject line is spam, and instruct their mail program to automatically
delete all such messages. To defeat such filters, the spammer may intentionally misspell
commonly-filtered words or insert other characters, often in a style similar to leetspeak,
as in the following examples: V1agra, Via'gra, Vi@graa, vi*gra, \/iagra. This also
allows for many different ways to express a given word, making identifying them all
more difficult for filter software. For example, using most common variations, it is
possible to spell "Viagra" in over 1.3 × 10^21 different ways.[57]
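
A filter can respond to such misspellings by matching a tolerant pattern rather than a literal word. The Python sketch below is one possible approach, offered only as an illustration: the look-alike table is a small invented sample, and the function name obfuscation_pattern is hypothetical.

```python
import re

# Hypothetical look-alike table; real filters use far larger ones.
LOOKALIKES = {
    "a": "a@4*",
    "g": "g9",
    "i": "i1!|l",
}

def obfuscation_pattern(word):
    """Build a regex matching common disguises of `word`."""
    parts = []
    for ch in word.lower():
        alternatives = [re.escape(c) for c in LOOKALIKES.get(ch, ch)]
        if ch == "v":
            alternatives.append(re.escape("\\/"))   # "\/" drawn as a V
        # One or more look-alike characters, then optional punctuation.
        parts.append("(?:" + "|".join(alternatives) + ")+" + r"[\W_]*")
    return re.compile("".join(parts), re.IGNORECASE)

pattern = obfuscation_pattern("viagra")
samples = ["V1agra", "Via'gra", "Vi@graa", "vi*gra", "\\/iagra", "village"]
for s in samples:
    print(s, bool(pattern.search(s)))
```

The five disguised spellings from the text match and the ordinary word "village" does not. A pattern this loose would still need tuning against legitimate mail before use.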

The principle of this method is to leave the word readable to humans (who can easily
recognize the intended word for such misspellings), but not likely to be recognized by a
literal computer program. This is only somewhat effective, because modern filter patterns
have been designed to recognize blacklisted terms in the various iterations of misspelling.
Other filters target the actual obfuscation methods, such as the non-standard use of
punctuation or numerals in unusual places. Similarly, HTML-based e-mail gives the
spammer more tools to obfuscate text. Inserting HTML comments between letters can
foil some filters, as can including text made invisible by setting the font color to white on
a white background, or shrinking the font size to the smallest fine print. Another common
ploy involves presenting the text as an image, which is either sent along or loaded from a
remote server. This can be foiled by not permitting an e-mail program to load images.
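
The HTML comment trick is defeated when a filter looks at the text a reader would actually see. The Python sketch below uses the standard html.parser module to collect only text nodes; the class and function names and the sample string are invented for illustration. It does not address white-on-white or tiny text, which would require inspecting styles as well.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects text nodes; comments and tags never reach handle_data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)

# Comments split the word in the raw source but not in the rendered text.
sample = "Buy Vi<!-- x9f -->ag<!-- q -->ra <b>now</b>"
print(visible_text(sample))
```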

As Bayesian filtering has become popular as a spam-filtering technique, spammers have
started using methods to weaken it. To a rough approximation, Bayesian filters rely on
word probabilities. If a message contains many words which are only used in spam, and
few which are never used in spam, it is likely to be spam. To weaken Bayesian filters,
some spammers, alongside the sales pitch, now include lines of irrelevant, random words,
in a technique known as Bayesian poisoning. A variant on this tactic may be borrowed
from the Usenet abuser known as "Hipcrime": including passages from books taken
from Project Gutenberg, or nonsense sentences generated with "dissociated press"
algorithms. Randomly generated phrases can create spoetry (spam poetry) or spam art.
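
The effect of padding on word probabilities can be seen in a toy model. The Python sketch below is a simplified naive Bayes scorer with add-one smoothing and equal priors, trained on a tiny invented data set; it is not the algorithm of any particular filter. The padding in the example happens to consist of words seen in the legitimate training mail, which is the case in which poisoning is most effective.

```python
import math
from collections import Counter

def train(messages):
    """Count in how many messages each word appears."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Naive Bayes estimate with add-one smoothing and equal priors."""
    log_odds = 0.0
    for word in set(text.lower().split()):
        p_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_ham = (ham_counts[word] + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Tiny invented training set, for illustration only.
spam = ["cheap pills online", "cheap watches online", "win poker money"]
ham = ["meeting notes attached", "lunch on friday", "project notes and budget"]
spam_counts, ham_counts = train(spam), train(ham)

pitch = "cheap pills online"
padded = pitch + " meeting notes lunch friday project budget"
p_pitch = spam_probability(pitch, spam_counts, ham_counts, len(spam), len(ham))
p_padded = spam_probability(padded, spam_counts, ham_counts, len(spam), len(ham))
print(f"pitch alone: {p_pitch:.2f}, with padding: {p_padded:.2f}")
```

With this training set the bare pitch scores well above one half and the padded version falls below it.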

Another method used to masquerade spam as legitimate messages is the use of
autogenerated sender names in the From: field, ranging from realistic ones such as
"Jackie F. Bird" to (either by mistake or intentionally) bizarre attention-grabbing names
such as "Sloppiest U. Epiglottis" or "Attentively E. Behavioral". Return addresses are
also routinely auto-generated, often using unsuspecting domain owners' legitimate
domain names, leading some users to blame the innocent domain owners. Blocking lists
use IP addresses rather than sender domain names, as these are more accurate. A mail
purporting to be from example.com can be seen to be faked by looking for the
originating IP address in the email's headers; also Sender Policy Framework, for example,
helps by stating that a certain domain will only send email from certain IP addresses.
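
The idea behind such a policy check can be sketched with Python's standard ipaddress module. This is a simplification: a real Sender Policy Framework policy is published in DNS and supports more mechanisms than a list of networks. The table ALLOWED, its contents and the function name are assumptions made for this example.

```python
import ipaddress

# Hypothetical published policy: networks allowed to send for each domain.
ALLOWED = {
    "example.com": ["192.0.2.0/24", "198.51.100.10/32"],
}

def sender_permitted(domain, client_ip):
    """True if client_ip falls inside a network listed for domain."""
    ip = ipaddress.ip_address(client_ip)
    networks = ALLOWED.get(domain, [])
    return any(ip in ipaddress.ip_network(net) for net in networks)

print(sender_permitted("example.com", "192.0.2.25"))
print(sender_permitted("example.com", "203.0.113.7"))
```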

Spam can also be hidden inside a fake "Undelivered mail notification" which looks like
the failure notices sent by a mail transfer agent (a "MAILER-DAEMON") when it
encounters an error.

Spam-support services

A number of other online activities and business practices are considered by anti-spam
activists to be connected to spamming. These are sometimes termed spam-support
services: business services, other than the actual sending of spam itself, which permit the
spammer to continue operating. Spam-support services can include processing orders for
goods advertised in spam, hosting Web sites or DNS records referenced in spam
messages, or a number of specific services as follows:

Some Internet hosting firms advertise bulk-friendly or bulletproof hosting. This means
that, unlike most ISPs, they will not terminate a customer for spamming. These hosting
firms operate as clients of larger ISPs, and many have eventually been taken offline by
these larger ISPs as a result of complaints regarding spam activity. Thus, while a firm
may advertise bulletproof hosting, it is ultimately unable to deliver without the
connivance of its upstream ISP. However, some spammers have managed to get what is
called a pink contract (see below) — a contract with the ISP that allows them to spam
without being disconnected.

A few companies produce spamware, or software designed for spammers. Spamware
varies widely, but may include the ability to import thousands of addresses, to generate
random addresses, to insert fraudulent headers into messages, to use dozens or hundreds
of mail servers simultaneously, and to make use of open relays. The sale of spamware is
illegal in eight U.S. states.[58][59][60]

So-called millions CDs are commonly advertised in spam. These are CD-ROMs
purportedly containing lists of e-mail addresses, for use in sending spam to these
addresses. Such lists are also sold directly online, frequently with the false claim that the
owners of the listed addresses have requested (or "opted in") to be included. Such lists
often contain invalid addresses. In recent years, these have fallen almost entirely out of
use due to the low-quality e-mail addresses available on them, and because some e-mail
lists exceed 20GB in size, far more than a single CD can hold.

A number of DNS blacklists (DNSBLs), including the MAPS RBL, Spamhaus SBL,
SORBS and SPEWS, target the providers of spam-support services as well as spammers.
DNSBLs blacklist IPs or ranges of IPs to persuade ISPs to terminate services with known
customers who are spammers or resell to spammers.
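
Mail servers consult a DNSBL through an ordinary DNS lookup. By convention, the octets of the IPv4 address are reversed and prepended to the list's zone name; an address record in the answer means the address is listed. The Python sketch below only builds the query name and performs no network lookup. The zone dnsbl.example is a placeholder, not a real list, and the function name is chosen here for illustration.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name a client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

# "dnsbl.example" is a placeholder zone, not a real list.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
```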

Related vocabulary


Unsolicited bulk e-mail (UBE)
A synonym for e-mail spam.
Unsolicited commercial e-mail (UCE)
Spam promoting a commercial service or product. This is the most common type
of spam, but it excludes spam which are hoaxes (e.g. virus warnings), political
advocacy, religious messages and chain letters sent by a person to many other
people. The term UCE may be most common in the USA.[61]
Pink contract
A pink contract is a service contract offered by an ISP which offers bulk e-mail
service to spamming clients, in violation of that ISP's publicly posted acceptable
use policy.
Spamvertising
Spamvertising is advertising through the medium of spam.
Opt-in, confirmed opt-in, double opt-in, opt-out
Opt-in, confirmed opt-in, double opt-in, opt-out refers to whether the people on a
mailing list are given the option to be put in, or taken out, of the list. Confirmation
(and "double", in marketing speak) refers to an email address transmitted, e.g.,
through a web form being confirmed to actually request joining a mailing list,
instead of being added to the list without verification.
Final, Ultimate Solution for the Spam Problem (FUSSP)
An ironic reference to naïve developers who believe they have invented the
perfect spam filter, which will stop all spam from reaching users' inboxes while
accidentally deleting no legitimate email.[62][63]
Bacn
Bacn is a rarely used term to refer to e-mail sent to a user who at one time
subscribed to a mailing list - not unsolicited, but also not personal.

Malware
From Wikipedia, the free encyclopedia

Malware

Malware, short for malicious software, is software designed to infiltrate a computer
without the owner's informed consent. The expression is a general term used by computer
professionals to mean a variety of forms of hostile, intrusive, or annoying software or
program code.[1] The term "computer virus" is sometimes used as a catch-all phrase to
include all types of malware, including true viruses.

Software is considered malware based on the perceived intent of the creator rather than
any particular features. Malware includes computer viruses, worms, trojan horses, most
rootkits, spyware, dishonest adware, crimeware and other malicious and unwanted
software. In law, malware is sometimes known as a computer contaminant, for instance
in the legal codes of several U.S. states, including California and West Virginia.[2][3]

Malware is not the same as defective software, that is, software which has a legitimate
purpose but contains harmful bugs.

Preliminary results from Symantec published in 2008 suggested that "the release rate of
malicious code and other unwanted programs may be exceeding that of legitimate
software applications."[4] According to F-Secure, "As much malware [was] produced in
2007 as in the previous 20 years altogether."[5] Malware's most common pathway from
criminals to users is through the Internet: primarily by e-mail and the World Wide Web.[6]

The prevalence of malware as a vehicle for organized Internet crime, along with the
general inability of traditional anti-malware protection platforms to protect against the
continuous stream of unique and newly produced professional malware, has seen the
adoption of a new mindset for businesses operating on the Internet - the acknowledgment
that some sizable percentage of Internet customers will always be infected for some
reason or other, and that they need to continue doing business with infected customers.
The result is a greater emphasis on back-office systems designed to spot fraudulent
activities associated with advanced malware operating on customers' computers.[7]

Purposes
Many early infectious programs, including the first Internet Worm and a number of MS-
DOS viruses, were written as experiments or pranks generally intended to be harmless or
merely annoying rather than to cause serious damage to computers. In some cases the
perpetrators did not realize how much harm their creations could do. Young programmers
learning about viruses and their techniques wrote them simply to prove that they could,
or to see how far they would spread. As late as 1999, widespread viruses such as the Melissa
virus appear to have been written chiefly as pranks.

Hostile intent related to vandalism can be found in programs designed to cause harm or
data loss. Many DOS viruses, and the Windows ExploreZip worm, were designed to
destroy files on a hard disk, or to corrupt the file system by writing invalid data.
Network-borne worms such as the 2001 Code Red worm or the Ramen worm fall into the
same category. Designed to vandalize web pages, these worms may seem like the online
equivalent to graffiti tagging, with the author's alias or affinity group appearing
everywhere the worm goes.

However, since the rise of widespread broadband Internet access, malicious software has
come to be designed for a profit motive, either more or less legal (forced advertising) or
criminal. For instance, since 2003, the majority of widespread viruses and worms have
been designed to take control of users' computers for black-market exploitation.[citation needed]
Infected "zombie computers" are used to send email spam, to host contraband data such
as child pornography[8], or to engage in distributed denial-of-service attacks as a form of
extortion.

Another strictly for-profit category of malware has emerged in spyware -- programs
designed to monitor users' web browsing, display unsolicited advertisements, or redirect
affiliate marketing revenues to the spyware creator. Spyware programs do not spread like
viruses; they are generally installed by exploiting security holes or are packaged with
user-installed software, such as peer-to-peer applications.

Infectious malware: viruses and worms

Main articles: Computer virus and Computer worm

The best-known types of malware, viruses and worms, are known for the manner in
which they spread, rather than any other particular behavior. The term computer virus is
used for a program which has infected some executable software and which causes that
software, when run, to spread the virus to other executable software. Viruses may also
contain a payload which performs other actions, often malicious. A worm, on the other
hand, is a program which actively transmits itself over a network to infect other
computers. It too may carry a payload.

These definitions lead to the observation that a virus requires user intervention to spread,
whereas a worm spreads automatically. Using this distinction, infections transmitted by
email or Microsoft Word documents, which rely on the recipient opening a file or email
to infect the system, would be classified as viruses rather than worms.
Some writers in the trade and popular press appear to misunderstand this distinction, and
use the terms interchangeably.

Capsule history of viruses and worms

Before Internet access became widespread, viruses spread on personal computers by
infecting programs or the executable boot sectors of floppy disks. By inserting a copy of
itself into the machine code instructions in these executables, a virus causes itself to be
run whenever the program is run or the disk is booted. Early computer viruses were
written for the Apple II and Macintosh, but they became more widespread with the
dominance of the IBM PC and MS-DOS system. Executable-infecting viruses are
dependent on users exchanging software or boot floppies, so they spread heavily in
computer hobbyist circles.

The first worms, network-borne infectious programs, originated not on personal
computers, but on multitasking Unix systems. The first well-known worm was the
Internet Worm of 1988, which infected SunOS and VAX BSD systems. Unlike a virus,
this worm did not insert itself into other programs. Instead, it exploited security holes in
network server programs and started itself running as a separate process. This same
behavior is used by today's worms as well.

With the rise of the Microsoft Windows platform in the 1990s, and the flexible macro
systems of its applications, it became possible to write infectious code in the macro
language of Microsoft Word and similar programs. These macro viruses infect documents
and templates rather than applications, but rely on the fact that macros in a Word
document are a form of executable code.

Today, worms are most commonly written for the Windows OS, although a small number
are also written for Linux and Unix systems. Worms today work in the same basic way as
1988's Internet Worm: they scan the network and leverage vulnerable computers to
replicate.

Concealment: Trojan horses, rootkits, and backdoors
Main articles: Trojan horse (computing), Rootkit, and Backdoor (computing)

Trojan horses

For a malicious program to accomplish its goals, it must be able to run without being
shut down or deleted by the user or administrator of the computer on which it is running.
Concealment can also help get the malware installed in the first place. When a malicious
program is disguised as something innocuous or desirable, users may be tempted to
install it without knowing what it does. This is the technique of the Trojan horse or
trojan.
Broadly speaking, a Trojan horse is any program that invites the user to run it, concealing
a harmful or malicious payload. The payload may take effect immediately and can lead to
many undesirable effects, such as deleting the user's files or further installing malicious
or undesirable software. Trojan horses known as droppers are used to start off a worm
outbreak, by injecting the worm into users' local networks.

One of the most common ways that spyware is distributed is as a Trojan horse, bundled
with a piece of desirable software that the user downloads from the Internet. When the
user installs the software, the spyware is installed alongside. Spyware authors who
attempt to act in a legal fashion may include an end-user license agreement which states
the behavior of the spyware in loose terms, and which the users are unlikely to read or
understand.

Rootkits

Once a malicious program is installed on a system, it is essential that it stay concealed, to
avoid detection and disinfection. The same is true when a human attacker breaks into a
computer directly. Techniques known as rootkits allow this concealment, by modifying
the host operating system so that the malware is hidden from the user. Rootkits can
prevent a malicious process from being visible in the system's list of processes, or keep
its files from being read. Originally, a rootkit was a set of tools installed by a human
attacker on a Unix system where the attacker had gained administrator (root) access.
Today, the term is used more generally for concealment routines in a malicious program.

Some malicious programs contain routines to defend against removal: not merely to hide
themselves, but to repel attempts to remove them. An early example of this behavior is
recorded in the Jargon File tale of a pair of programs infesting a Xerox CP-V timesharing
system:

Each ghost-job would detect the fact that the other had been killed, and would
start a new copy of the recently slain program within a few milliseconds. The only
way to kill both ghosts was to kill them simultaneously (very difficult) or to
deliberately crash the system.[9]

Similar techniques are used by some modern malware, wherein the malware starts a
number of processes which monitor and restore one another as needed.

Backdoors

A backdoor is a method of bypassing normal authentication procedures. Once a system
has been compromised (by one of the above methods, or in some other way), one or more
backdoors may be installed to allow easier access in the future. Backdoors may also be
installed prior to malicious software, to allow attackers entry.

The idea has often been suggested that computer manufacturers preinstall backdoors on
their systems to provide technical support for customers, but this has never been reliably
verified. Crackers typically use backdoors to secure remote access to a computer, while
attempting to remain hidden from casual inspection. To install backdoors, crackers may
use Trojan horses, worms, or other methods.

Malware for profit: spyware, botnets, keystroke loggers, and dialers
Main articles: Spyware, Botnet, Keystroke logging, Web threats, and Dialer

During the 1980s and 1990s, it was usually taken for granted that malicious programs
were created as a form of vandalism or prank. More recently, the greater share of
malware programs have been written with a financial or profit motive in mind. This can
be taken as the malware authors' choice to monetize their control over infected systems:
to turn that control into a source of revenue.

Spyware programs are commercially produced for the purpose of gathering information
about computer users, showing them pop-up ads, or altering web-browser behavior for
the financial benefit of the spyware creator. For instance, some spyware programs
redirect search engine results to paid advertisements. Others, often called "stealware" by
the media, overwrite affiliate marketing codes so that revenue is redirected to the spyware
creator rather than the intended recipient.

Spyware programs are sometimes installed as Trojan horses of one sort or another. They
differ in that their creators present themselves openly as businesses, for instance by
selling advertising space on the pop-ups created by the malware. Most such programs
present the user with an end-user license agreement which purportedly protects the
creator from prosecution under computer contaminant laws. However, spyware EULAs
have not yet been upheld in court.

Another way that financially-motivated malware creators can profit from their infections
is to directly use the infected computers to do work for the creator. The infected
computers are used as proxies to send out spam messages. The advantage to spammers of
using infected computers is that they provide anonymity, protecting the spammer from
prosecution. Spammers have also used infected PCs to target anti-spam organizations
with distributed denial-of-service attacks.

In order to coordinate the activity of many infected computers, attackers have used
coordinating systems known as botnets. In a botnet, the malware or malbot logs in to an
Internet Relay Chat channel or other chat system. The attacker can then give instructions
to all the infected systems simultaneously. Botnets can also be used to push upgraded
malware to the infected systems, keeping them resistant to antivirus software or other
security measures.

It is possible for a malware creator to profit by stealing sensitive information from a
victim. Some malware programs install a key logger, which intercepts the user's
keystrokes when entering a password, credit card number, or other information that may
be exploited. This is then transmitted to the malware creator automatically, enabling
credit card fraud and other theft. Similarly, malware may copy the CD key or password
for online games, allowing the creator to steal accounts or virtual items.

Another way of stealing money from the infected PC owner is to take control of a dial-up
modem and dial an expensive toll call. Dialer (or porn dialer) software dials up a
premium-rate telephone number such as a U.S. "900 number" and leaves the line open,
charging the toll to the infected user.

Data-stealing malware

Data-stealing malware is a web threat that divests victims of personal and proprietary
information with the intent of monetizing stolen data through direct use or underground
distribution. Content security threats that fall under this umbrella include keyloggers,
screen scrapers, spyware, adware, backdoors, and bots. The term does not refer to
activities such as spam, phishing, DNS poisoning, SEO abuse, etc. However, when these
threats result in file download or direct installation, as most hybrid attacks do, files that
act as agents to proxy information will fall into the data-stealing malware category.

Characteristics of data-stealing malware

Does not leave traces of the event

• The malware is typically stored in a cache which is routinely flushed
• The malware may be installed via a drive-by-download process
• The website hosting the malware, as well as the malware itself, is generally temporary or
rogue

Frequently changes and extends its functions

• It is difficult for antivirus software to detect final payload attributes due to the
combinations of malware components
• The malware uses multiple file encryption levels

Thwarts Intrusion Detection Systems (IDS) after successful installation

• There are no perceivable network anomalies
• The malware hides in web traffic
• The malware is stealthier in terms of traffic and resource use

Thwarts disk encryption

• Data is stolen during decryption and display
• The malware can record keystrokes, passwords, and screenshots

Thwarts Data Loss Prevention (DLP)

• Leakage protection hinges on metadata tagging, and not everything is tagged
• Miscreants can use encryption to port data

Examples of data-stealing malware

• Bancos, an info stealer that waits for the user to access banking websites then
spoofs pages of the bank website to steal sensitive information
• Gator, spyware that covertly monitors web-surfing habits, uploads data to a server
for analysis then serves targeted pop-up ads
• LegMir, spyware that steals personal information such as account names and
passwords related to online games
• Qhost, a Trojan that modifies the Hosts file to point to a different DNS server
when banking sites are accessed then opens a spoofed login page to steal login
credentials for those financial institutions
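
The Qhost entry above describes tampering with the Hosts file. As a rough illustration of how such tampering might be spotted, the following sketch parses hosts-file text and reports any watched hostname that is mapped to a non-loopback address. The function name, the watched hostname, and the sample addresses are hypothetical and chosen only for illustration; a real tool would need a maintained list of sensitive hostnames.

```python
import ipaddress


def suspicious_hosts_entries(hosts_text, watched_domains):
    """Return (hostname, address) pairs that redirect a watched hostname.

    watched_domains is a hypothetical caller-supplied list of hostnames
    (for example, banking sites) that should never appear in a hosts file.
    """
    watched = {d.lower() for d in watched_domains}
    findings = []
    for raw_line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # Malformed line: first field is not an IP address.
        if ip.is_loopback:
            continue  # Loopback entries (127.0.0.1, ::1) are normal.
        for name in names:
            if name.lower() in watched:
                findings.append((name.lower(), address))
    return findings


# Invented sample content; the addresses come from documentation-only ranges.
SAMPLE = """\
127.0.0.1   localhost
# the entry below is an invented example of a redirect
203.0.113.7 www.examplebank.test login.examplebank.test
198.51.100.2 intranet.example
"""

print(suspicious_hosts_entries(SAMPLE, ["www.examplebank.test"]))
```

The check is deliberately narrow: it only flags hostnames on the watch list, so an unlisted site that has been redirected would pass unnoticed.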

Data-stealing malware incidents

• Albert Gonzalez is accused of masterminding a ring to use malware to steal and
sell more than 170 million credit card numbers in 2006 and 2007 -- the largest
computer fraud in history. Among the firms targeted were BJ’s Wholesale Club,
TJX, DSW Shoe, OfficeMax, Barnes & Noble, Boston Market, Sports Authority
and Forever 21. [10]
• A Trojan horse program stole more than 1.6 million records belonging to several
hundred thousand people from Monster Worldwide Inc’s job search service. The
data was used by cybercriminals to craft phishing emails targeted at Monster.com
users to plant additional malware on users’ PCs. [11]
• Customers of Hannaford Bros. Co, a supermarket chain based in Maine, were
victims of a data security breach involving the potential compromise of 4.2
million debit and credit cards. The company was hit by several class-action
lawsuits. [12]
• The Torpig Trojan has compromised and stolen login credentials from
approximately 250,000 online bank accounts as well as a similar number of credit
and debit cards. Other information, such as email and FTP accounts from
numerous websites, has also been compromised and stolen. [13]

Vulnerability to malware

Main article: Vulnerability (computing)

In this context, as throughout, it should be borne in mind that the “system” under attack
may be of various types, e.g. a single computer and operating system, a network or an
application.
Various factors make a system more vulnerable to malware:

• Homogeneity – e.g. when all computers in a network run the same OS, if you can
exploit that OS, you can break into any computer running it.
• Defects – malware leveraging defects in the OS design.
• Unconfirmed code – code from a floppy disk, CD-ROM or USB device may be
executed without the user’s agreement.
• Over-privileged users – some systems allow all users to modify their internal
structures.
• Over-privileged code – some systems allow code executed by a user to access all
rights of that user.

An oft-cited cause of vulnerability of networks is homogeneity or software
monoculture.[14] For example, Microsoft Windows or Apple Mac have such a large share
of the market that concentrating on either could enable a cracker to subvert a large
number of systems, but any total monoculture is a problem. Introducing
inhomogeneity (diversity) purely for the sake of robustness could increase short-term
costs for training and maintenance. However, having a few diverse nodes would deter
total shutdown of the network, and allow those nodes to help with recovery of the
infected nodes. Such separate, functional redundancy would avoid the cost of a total
shutdown and the "all eggs in one basket" problem of homogeneity.

Most systems contain bugs, or loopholes, which may be exploited by malware. A typical
example is the buffer-overrun weakness, in which an interface designed to store data, in a
small area of memory, allows the caller to supply more data than will fit. This extra data
then overwrites the interface's own executable structure (past the end of the buffer and
other data). In this manner, malware can force the system to execute malicious code, by
replacing legitimate code with its own payload of instructions (or data values) copied into
live memory, outside the buffer area.
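
The buffer-overrun mechanism described above can be sketched with a toy model. Python itself checks bounds, so the example below only simulates a flat memory region in which an 8-byte buffer is followed directly by a 4-byte control value standing in for the "executable structure" mentioned in the text. The layout, the names, and the addresses are assumptions made for illustration and do not describe any real system.

```python
import struct

BUF_START, BUF_SIZE = 0, 8           # the small data buffer
RET_OFFSET = BUF_START + BUF_SIZE    # 4-byte control value stored right after it


def new_memory(return_address=0x1000):
    """Model a tiny memory region: an 8-byte buffer followed by a control value."""
    mem = bytearray(BUF_SIZE + 4)
    struct.pack_into("<I", mem, RET_OFFSET, return_address)
    return mem


def unchecked_copy(mem, data):
    """Copy without a length check, as a vulnerable interface would.

    The model only covers inputs of up to len(mem) bytes.
    """
    mem[BUF_START:BUF_START + len(data)] = data


def checked_copy(mem, data):
    """Copy only if the data fits inside the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    mem[BUF_START:BUF_START + len(data)] = data


def return_address(mem):
    """Read back the control value stored after the buffer."""
    return struct.unpack_from("<I", mem, RET_OFFSET)[0]


# Eight filler bytes fill the buffer; the next four overwrite the control value.
payload = b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF)

mem = new_memory()
unchecked_copy(mem, payload)
print(hex(return_address(mem)))      # control value now chosen by the caller

mem = new_memory()
try:
    checked_copy(mem, payload)
except ValueError as error:
    print("rejected:", error)
print(hex(return_address(mem)))      # control value unchanged
```

The only difference between the two copy routines is the length check, which is the safeguard missing from the vulnerable interface.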

Originally, PCs had to be booted from floppy disks, and until recently it was common for
this to be the default boot device. This meant that a corrupt floppy disk could subvert the
computer during booting, and the same applies to CDs. Although that is now less
common, it is still possible to forget that one has changed the default, and rare that a
BIOS makes one confirm a boot from removable media.

In some systems, non-administrator users are over-privileged by design, in the sense that
they are allowed to modify internal structures of the system. In some environments, users
are over-privileged because they have been inappropriately granted administrator or
equivalent status. This is primarily a configuration decision, but on Microsoft Windows
systems the default configuration is to over-privilege the user. This situation exists due to
decisions made by Microsoft to prioritize compatibility with older systems above security
configuration in newer systems[citation needed] and because typical applications were
developed without under-privileged users in mind. As privilege escalation exploits
have increased, this priority is shifting for the release of Microsoft Windows Vista. As a
result, many existing applications that require excess privilege (over-privileged code)
may have compatibility problems with Vista. However, Vista's User Account Control
feature attempts to accommodate applications not designed for under-privileged users through
virtualization, acting as a crutch to resolve the privileged access problem inherent in
legacy applications.

Malware, running as over-privileged code, can use this privilege to subvert the system.
Almost all currently popular operating systems, and also many scripting applications,
allow code too many privileges, usually in the sense that when a user executes code, the
system allows that code all rights of that user. This makes users vulnerable to malware in
the form of e-mail attachments, which may or may not be disguised.

Given this state of affairs, users are warned to open only attachments they trust, and to be
wary of code received from untrusted sources. It is also common for operating systems to
be designed so that device drivers need escalated privileges, while they are supplied by
more and more hardware manufacturers.

Eliminating over-privileged code

Over-privileged code dates from the time when most programs were either delivered with
a computer or written in-house, and repairing it would at a stroke render most antivirus
software almost redundant. It would, however, have appreciable consequences for the
user interface and system management.

The system would have to maintain privilege profiles, and know which to apply for each
user and program. In the case of newly installed software, an administrator would need to
set up default profiles for the new code.
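
As a minimal sketch of the privilege-profile idea described above, the following code keeps one profile per (user, program) pair and denies anything not explicitly granted. The user names, program names, and right names are hypothetical; the source does not specify how such profiles would be stored or enforced.

```python
DEFAULT_PROFILE = frozenset()  # default deny: unknown code gets no rights

# Hypothetical profiles keyed by (user, program).
PROFILES = {
    ("alice", "text_editor"): frozenset({"read_documents", "write_documents"}),
    ("alice", "mail_client"): frozenset({"network", "read_mailbox"}),
}


def allowed(user, program, action, profiles=PROFILES):
    """Return True only if the profile for this user and program grants the action."""
    return action in profiles.get((user, program), DEFAULT_PROFILE)


def install_program(user, program, default_rights, profiles=PROFILES):
    """Model an administrator setting up a default profile for newly installed code."""
    profiles[(user, program)] = frozenset(default_rights)


# The mail client may use the network but may not modify the user's documents,
# even though the user running it could.
print(allowed("alice", "mail_client", "network"))
print(allowed("alice", "mail_client", "write_documents"))

# An attachment with no profile gets no rights at all.
print(allowed("alice", "attachment.exe", "network"))
```

The point of the design is that code no longer inherits every right of the user who runs it; each program is limited to what its own profile grants.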

Eliminating vulnerability to rogue device drivers is probably harder than for arbitrary
rogue executables. Two techniques, used in VMS, that can help are memory mapping
only the registers of the device in question and a system interface associating the driver
with interrupts from the device.

Other approaches are:

• Various forms of virtualization, allowing the code unlimited access only to virtual
resources
• Various forms of sandbox or jail
• The security functions of Java, in java.security

Such approaches, however, if not fully integrated with the operating system, would
reduplicate effort and not be universally applied, both of which would be detrimental to
security.

Anti-malware programs

As malware attacks become more frequent, attention has begun to shift from virus and
spyware protection to malware protection in general, and programs have been developed
specifically to combat malware.

Anti-malware programs can combat malware in two ways:

1. They can provide real-time protection against the installation of malware
on a computer. This type of protection works the same way as
antivirus protection in that the anti-malware software scans all incoming network
data for malware and blocks any threats it comes across.
2. Anti-malware programs can be used solely for detection and removal of
malware that has already been installed on a computer. This type of
malware protection is normally much easier to use and more popular[citation needed].
This type of anti-malware software scans the contents of the Windows registry,
operating system files, and installed programs on a computer and provides a
list of any threats found, allowing the user to choose which files to delete or keep,
or to compare this list to a list of known malware components, removing files that
match.
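
The second approach, comparing installed files against a list of known malware components, might be sketched as follows. This is a simplified illustration that matches whole files by SHA-256 hash, which is an assumption; real products use richer signatures and also examine the registry and running programs. The file names and contents are invented stand-ins.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_directory(root, known_bad_hashes):
    """Return sorted paths of files under root whose hash is on the list."""
    matches = []
    for path in Path(root).rglob("*"):
        if path.is_file() and sha256_of(path) in known_bad_hashes:
            matches.append(path)
    return sorted(matches)


# Demonstration on a temporary directory with invented file contents.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes.txt").write_bytes(b"harmless text")
    (Path(tmp) / "dropper.bin").write_bytes(b"stand-in for a known-bad file")
    blocklist = {hashlib.sha256(b"stand-in for a known-bad file").hexdigest()}
    flagged = [p.name for p in scan_directory(tmp, blocklist)]

print(flagged)
```

The scan only reports matches; whether to delete or keep each file is left to the user, as in the description above.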

Real-time protection from malware works identically to real-time antivirus protection: the
software scans disk files at download time, and blocks the activity of components known
to represent malware. In some cases, it may also intercept attempts to install start-up
items or to modify browser settings. Because many malware components are installed as
a result of browser exploits or user error, using security software (some of which is
anti-malware, though much is not) to "sandbox" browsers (essentially supervising the user
and their browser) can also be effective in helping to restrict any damage done.

Academic research on malware: a brief overview

The notion of a self-reproducing computer program can be traced back to 1949, when
John von Neumann presented lectures that encompassed the theory and organization of
complicated automata.[15] Von Neumann showed that in theory a program could reproduce
itself. This constituted a plausibility result in computability theory. Fred Cohen
experimented with computer viruses and confirmed von Neumann's postulate. He also
investigated other properties of malware (detectability, self-obfuscating programs that
used rudimentary encryption that he called "evolutionary", and so on). His 1988 doctoral
dissertation was on the subject of computer viruses.[16]

Cohen's faculty advisor, Leonard Adleman (the A in RSA), presented a rigorous proof that,
in the general case, algorithmically determining whether a virus is or is not present is
Turing undecidable.[17] This problem must not be mistaken for that of determining, within
a broad class of programs, that a virus is not present; the latter problem differs in that
it does not require the ability to recognize all viruses. Adleman's proof is perhaps the
deepest result in malware computability theory to date and it relies on Cantor's diagonal
argument as well as the halting problem.

Ironically, it was later shown by Young and Yung that Adleman's work in cryptography is
ideal in constructing a virus that is highly resistant to reverse-engineering by
presenting the notion of a cryptovirus.[18] A cryptovirus is a virus that contains and
uses a public key and a randomly generated symmetric cipher initialization vector (IV)
and session key (SK). In the cryptoviral extortion attack, the virus hybrid encrypts
plaintext data on the victim's machine using the randomly generated IV and SK. The IV+SK
are then encrypted using the virus writer's public key. In theory the victim must
negotiate with the virus writer to get the IV+SK back in order to decrypt the ciphertext
(assuming there are no backups). Analysis of the virus reveals the public key, not the IV
and SK needed for decryption, or the private key needed to recover the IV and SK. This
result was the first to show that computational complexity theory can be used to devise
malware that is robust against reverse-engineering.

Another growing area of computer virus research is the mathematical modeling of the
infection behavior of worms using models such as the Lotka–Volterra equations, which have
been applied in the study of biological viruses. Researchers have studied various
propagation scenarios, such as the propagation of computer viruses, fighting a virus with
virus-like predator codes,[19][20] and the effectiveness of patching.
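
For readers unfamiliar with the Lotka–Volterra model, the sketch below integrates the classic two-variable predator-prey equations with a simple forward-Euler step. Mapping the "prey" to infected hosts and the "predator" to predator-code instances is an assumption made here for illustration, as are the parameter values; the cited studies may use different formulations.

```python
def lotka_volterra_step(x, y, alpha, beta, delta, gamma, dt):
    """Advance the predator-prey system by one forward-Euler step.

    dx/dt = alpha*x - beta*x*y    (x: infected hosts, the "prey")
    dy/dt = delta*x*y - gamma*y   (y: predator-code instances)
    """
    dx = alpha * x - beta * x * y
    dy = delta * x * y - gamma * y
    return x + dt * dx, y + dt * dy


def simulate(x0, y0, steps, dt=0.01, alpha=1.0, beta=0.1, delta=0.05, gamma=0.5):
    """Return the list of (x, y) states, starting with the initial state."""
    states = [(x0, y0)]
    for _ in range(steps):
        states.append(lotka_volterra_step(*states[-1], alpha, beta, delta, gamma, dt))
    return states


# With these illustrative parameters the equilibrium is x = gamma/delta = 10
# and y = alpha/beta = 10; starting elsewhere makes both populations oscillate.
trajectory = simulate(x0=20.0, y0=5.0, steps=2000)
peak_infected = max(x for x, _ in trajectory)
print(f"peak infected hosts: {peak_infected:.1f}")
```

Forward Euler is used only for brevity. It drifts over long runs, so a higher-order integrator would be preferable for quantitative work.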

Grayware
Grayware[21] (or greyware) is a general term sometimes used as a classification for
applications that behave in a manner that is annoying or undesirable, and yet less serious
or troublesome than malware.[22] Grayware encompasses spyware, adware, dialers, joke
programs, remote access tools, and any other unwelcome files and programs apart from
viruses that are designed to harm the performance of computers on a network. The
term has been in use since at least as early as September 2004.[23]

Grayware refers to applications or files that are not classified as viruses or trojan horse
programs, but can still negatively affect the performance of the computers on a
network and introduce significant security risks to an organization.[24] Often grayware
performs a variety of undesired actions such as irritating users with pop-up windows,
tracking user habits and unnecessarily exposing computer vulnerabilities to attack.

• Spyware is software that installs components on a computer for the purpose of
recording Web surfing habits (primarily for marketing purposes). Spyware sends
this information to its author or to other interested parties when the computer is
online. Spyware often downloads with items identified as 'free downloads' and
does not notify the user of its existence or ask for permission to install the
components. The information spyware components gather can include user
keystrokes, which means that private information such as login names, passwords,
credit card numbers, and other confidential data is vulnerable to theft and may
be transmitted to third parties.
• Adware is software that displays advertising banners on Web browsers such as
Internet Explorer and Mozilla Firefox. While not categorized as malware, many
users consider adware invasive. Adware programs often create unwanted effects
on a system, such as annoying popup ads and a general degradation in either
network connection or system performance. Adware programs are typically
installed as separate programs that are bundled with certain free software. Many
users inadvertently agree to installing adware by accepting the End User License
Agreement (EULA) on the free software. Adware is also often installed in
tandem with spyware programs. Both programs feed off each other's
functionalities: spyware programs profile users' Internet behavior, while adware
programs display targeted ads that correspond to the gathered user profile.

Web and spam

<iframe src="http://example.net/out.php?s_id=11" width=0 height=0 />

The World Wide Web is a criminals' preferred pathway for spreading malware. Today's
web threats use combinations of malware to create infection chains. About one in ten
Web pages may contain malicious code.[26]
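
A zero-size iframe like the one shown at the start of this section is invisible to the visitor but still loads the remote page. As a rough illustration of how a page could be checked for such elements, the sketch below uses Python's standard html.parser module to list iframes whose width or height attribute is zero. The class and function names are hypothetical, and the check is deliberately naive: it ignores iframes hidden through CSS or inserted by script.

```python
from html.parser import HTMLParser


class HiddenIframeFinder(HTMLParser):
    """Collect the src of every iframe whose width or height attribute is zero."""

    def __init__(self):
        super().__init__()
        self.hidden_sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <iframe ... /> also reach this method,
        # because the default handle_startendtag calls handle_starttag.
        if tag != "iframe":
            return
        attributes = dict(attrs)
        zero_sized = any(
            (attributes.get(name) or "").strip() in {"0", "0px"}
            for name in ("width", "height")
        )
        if zero_sized:
            self.hidden_sources.append(attributes.get("src"))


def find_hidden_iframes(html_text):
    """Return the src values of zero-sized iframes found in html_text."""
    finder = HiddenIframeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hidden_sources


page = (
    '<p>Latest news</p>'
    '<iframe src="video.html" width="300" height="200"></iframe>'
    '<iframe src="http://example.net/out.php?s_id=11" width=0 height=0 />'
)
print(find_hidden_iframes(page))
```

A finding is only a hint that a page may have been tampered with; some legitimate pages use zero-size frames for benign purposes.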

Wikis and blogs

Innocuous wikis and blogs are not immune to hijacking. It has been reported that the
German edition of Wikipedia has recently been used in an attempt to spread infection.
Through a form of social engineering, users with ill intent have added links to web pages
that contain malicious software with the claim that the web page would provide
detections and remedies, when in fact it was a lure to infect.[27]

Targeted SMTP threats

Targeted SMTP threats also represent an emerging attack vector through which malware
is propagated. As users adapt to widespread spam attacks, cybercriminals distribute
crimeware to target one specific organization or industry, often for financial gain.[28]

HTTP and FTP

Infections via "drive-by" download are spread through the Web over HTTP and FTP
when resources containing spurious keywords are indexed by legitimate search engines,
as well as when JavaScript is surreptitiously added to legitimate websites and advertising
networks.

Antivirus software

ClamTk 3.08 free antivirus software running on Ubuntu 8.04 Hardy Heron

Antivirus (or anti-virus) software is used to prevent, detect, and remove malware,
including computer viruses, worms, and trojan horses. Such programs may also prevent
and remove adware, spyware, and other forms of malware.

A variety of strategies are typically employed. Signature-based detection involves
searching for known malicious patterns in executable code. However, it is possible for a
user to be infected with new malware for which no signature exists yet. To counter such
so-called zero-day threats, heuristics can be used. One type of heuristic approach, generic
signatures, can identify new viruses or variants of existing viruses by looking for known
malicious code (or slight variations of such code) in files. Some antivirus software can
also predict what a file will do if opened or run by emulating it in a sandbox and analyzing
what it does to see if it performs any malicious actions. If it does, this could mean the file
is malicious.
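
The difference between an exact signature and a generic signature can be illustrated with a short sketch. The byte patterns and signature names below are invented for the example and match no real malware; production scanners use far larger databases and more elaborate matching.

```python
import re

# Invented byte patterns for illustration only.
EXACT_SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0badf00d"),
}

# A generic signature tolerates small variations: here any two bytes may
# appear between a fixed prefix and a fixed suffix.
GENERIC_SIGNATURES = {
    "Example.Gen": re.compile(rb"\xde\xad.{2}\x0b\xad", re.DOTALL),
}


def scan_bytes(data):
    """Return the sorted names of all signatures found in data."""
    hits = {name for name, pattern in EXACT_SIGNATURES.items() if pattern in data}
    hits |= {name for name, regex in GENERIC_SIGNATURES.items() if regex.search(data)}
    return sorted(hits)


original = b"header" + bytes.fromhex("deadbeef0badf00d") + b"tail"
variant = b"header" + bytes.fromhex("dead12340badf00d") + b"tail"

print(scan_bytes(original))      # matched by both signatures
print(scan_bytes(variant))       # only the generic signature still matches
print(scan_bytes(b"plain text")) # no match
```

The variant differs from the original in two bytes, which is enough to defeat the exact signature but not the generic one. The looser the generic pattern, the more variants it catches and the greater the risk of flagging harmless files.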

However useful antivirus software is, it has drawbacks.
Antivirus software can degrade computer performance if it is not designed efficiently.
Inexperienced users may have trouble understanding the prompts and decisions that
antivirus software presents them with. An incorrect decision may lead to a security
breach. If the antivirus software employs heuristic detection (of any kind), its success
depends on whether it achieves the right balance between false positives and
false negatives. False positives can be as destructive as false negatives. In one case, a
faulty virus signature issued by Symantec mistakenly removed essential operating system
files, leaving thousands of PCs unable to boot.[1] Finally, antivirus software generally runs
at the highly trusted kernel level of the operating system, creating a potential avenue of
attack.[2]

In addition to the drawbacks mentioned above, the effectiveness of antivirus software has
also been researched and debated. One study found that the detection success of major
antivirus software dropped over a one-year period.[3]

History
See also: Timeline of notable computer viruses and worms

There are competing claims about who created the first antivirus product. Possibly the
first publicly documented removal of a computer virus in the wild was performed by
Bernd Fix in 1987.[4][5]

ClamTk 4.08 virus scanner running on Ubuntu 9.04

An antivirus program to counter the Polish MKS vir was released in 1987. Dr. Solomon's
Anti-Virus Toolkit, AIDSTEST and AntiVir were released in 1988. Dr. Ahn Chul Soo
(Charles Ahn, founder of AhnLab Inc) in South Korea also released the antivirus software
called 'Vaccine Ⅰ' on June 10, 1988. By late 1990, nineteen separate antivirus
products were available, including Norton AntiVirus and McAfee VirusScan.
Early contributors to work on computer viruses and countermeasures included Fred
Cohen, Peter Tippett, John McAfee and Ahn Chul Soo.

Before Internet connectivity was widespread, viruses were typically spread by infected
floppy disks. Antivirus software came into use, but was updated relatively infrequently.
During this time, virus checkers essentially had to check executable files and the boot
sectors of floppy and hard disks. However, as internet usage became common, initially
through the use of modems, viruses spread throughout the Internet.[6]

Powerful macros used in word processor applications, such as Microsoft Word, presented
a further risk. Virus writers started using the macros to write viruses embedded within
documents. This meant that computers could now also be at risk from infection by
documents containing hidden macros.[7]

Later email programs, in particular Microsoft Outlook Express and Outlook, were
vulnerable to viruses embedded in the email body itself. Now, a user's computer could be
infected by just opening or previewing a message. This meant that virus checkers had to
check many more types of files. As always-on broadband connections became the norm
and more and more viruses were released, it became essential to update virus checkers
more and more frequently. Even then, a new zero-day virus could become widespread
before antivirus companies released an update to protect against it.[8]

Identification methods

ClamWin 0.95.1 running on Windows XP

There are several methods which antivirus software can use to identify malware.

Signature-based detection is the most common method. To identify viruses and other
malware, antivirus software compares the contents of a file to a dictionary of virus
signatures. Because viruses can embed themselves in existing files, the entire file is
searched, not just as a whole, but also in pieces.[9]
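
The dictionary comparison described above can be illustrated with a short Python sketch. It is a minimal illustration only: the signature names and byte patterns are invented for the example, and a real scanner would use a far more efficient multi-pattern search than a loop over `bytes.find`.

```python
# Hypothetical signature dictionary: name -> byte pattern (made-up values).
SIGNATURES = {
    "Example.A": b"\xde\xad\xbe\xef\x13\x37",
    "Example.B": b"FAKE-VIRUS-BODY",
}


def scan_bytes(data: bytes) -> list[tuple[str, int]]:
    """Return (signature name, offset) for every signature found in data."""
    hits = []
    for name, pattern in SIGNATURES.items():
        # Search the whole content, so a pattern embedded anywhere
        # inside an otherwise legitimate host file is still found.
        offset = data.find(pattern)
        if offset != -1:
            hits.append((name, offset))
    return hits


clean = b"ordinary program bytes"
infected = b"header" + SIGNATURES["Example.B"] + b"rest of host file"
print(scan_bytes(clean))     # []
print(scan_bytes(infected))  # [('Example.B', 6)]
```

Because the pattern is searched for at every offset, the match succeeds even though the signature sits in the middle of the host file rather than at its start.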

Malicious activity detection is another approach used to identify malware. In this
approach, antivirus software monitors the system for suspicious program behavior. If
suspicious behavior is detected, the suspect program may be further investigated, using
signature-based detection or another method listed in this section. This type of detection
can be used to identify unknown viruses or variants of existing viruses.

Heuristic-based detection, like malicious activity detection, can be used to identify
unknown viruses. This can be accomplished in one of two ways: file analysis and file
emulation.

File analysis is the process of searching a suspect file for virus-like instructions. For
example, if a program has instructions to reformat the C drive, the antivirus software
might further investigate the file. One downside of this feature is the large amount of
computer resources needed to analyse every file, resulting in slow operation.
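
As a rough illustration of file analysis, the following Python sketch scores a file by the virus-like instructions it contains. The indicator strings, weights and threshold are assumptions chosen for the example; real products analyse decoded machine instructions rather than plain text tokens.

```python
# Hypothetical indicators and weights (assumptions for illustration only).
SUSPICIOUS_TOKENS = {
    b"format c:": 5,           # reformatting the C drive
    b"del /f /q": 3,           # forced, silent deletion
    b"writeprocessmemory": 2,  # writing into another process
}
THRESHOLD = 5  # assumed cut-off for "investigate further"


def heuristic_score(data: bytes) -> int:
    """Sum the weights of every suspicious token present in data."""
    lowered = data.lower()
    return sum(
        weight for token, weight in SUSPICIOUS_TOKENS.items() if token in lowered
    )


def needs_investigation(data: bytes) -> bool:
    """True when the accumulated score reaches the threshold."""
    return heuristic_score(data) >= THRESHOLD


print(needs_investigation(b"@echo off\necho hello"))    # False
print(needs_investigation(b"@echo off\nFORMAT C: /y"))  # True
```

Every file has to be read and searched in full, which reflects the resource cost mentioned above.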

File emulation is another heuristic approach. File emulation involves executing a
program in a virtual environment and logging what actions the program performs.
Depending on the actions logged, the antivirus software can determine if the program is
malicious or not and then carry out the appropriate disinfection actions.[10]

Signature-based detection

A command-line virus scanner, Clam AV 0.95.2, running a virus signature definition
update, scanning a file and identifying a Trojan

Traditionally, antivirus software heavily relied upon signatures to identify malware. This
can be very effective, but cannot defend against malware unless samples have already
been obtained and signatures created. Because of this, signature-based approaches are not
effective against new, unknown viruses.

When antivirus software scans a file for viruses, it checks the contents of the file against a
dictionary of virus signatures. A virus signature is a characteristic sequence of bytes taken
from the viral code. If a virus signature is found in a file, the antivirus software can resort
to some combination of quarantine, repair or deletion. Quarantining a file makes it
inaccessible, and is usually the first action antivirus software will take if a malicious file
is found. Encrypting the file is a good quarantining technique because it renders the file
useless without the encryption key.
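
A minimal Python sketch of quarantining by encoding is given below. It is an assumption-laden illustration: the repeating-key XOR is only a stand-in for real encryption, and the function names, the vault directory and the `.quarantined` suffix are hypothetical.

```python
from pathlib import Path


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (a stand-in for real encryption)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def quarantine(path: Path, vault: Path, key: bytes) -> Path:
    """Store an encoded copy of path in vault and delete the original."""
    vault.mkdir(parents=True, exist_ok=True)
    target = vault / (path.name + ".quarantined")
    target.write_bytes(xor_bytes(path.read_bytes(), key))
    path.unlink()
    return target


def restore(stored: Path, destination: Path, key: bytes) -> None:
    """Reverse quarantine(), for example after a false positive."""
    destination.write_bytes(xor_bytes(stored.read_bytes(), key))
    stored.unlink()
```

A key could be generated with `secrets.token_bytes(32)` from the standard library. Since the stored copy is no longer a valid executable or document, it cannot be run by accident, yet it can still be restored if the detection turns out to be wrong.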

Sometimes a user wants to keep the content of an infected file, because viruses can
embed themselves in existing files (a technique called code injection) and the file may be
essential to normal operation. In that case, antivirus software will attempt to repair the
file by removing the viral code from it. Unfortunately, some viruses damage the file upon
injection.

If a file repair operation fails, the best option is usually to delete the file.
Deletion is necessary if the entire file is infected, which may be the case with
infected ZIP files or similar "packed" files.

Because new viruses are being created each day, the signature-based detection approach
requires frequent updates of the virus signature dictionary. To assist the antivirus software
companies, the software may allow the user to upload new viruses or variants to the
company, allowing the virus to be analyzed and the signature added to the dictionary.[9]

Signature-based antivirus software typically examines files when the computer's
operating system creates, opens, closes, or e-mails them. In this way it can detect a
known virus immediately upon receipt. System administrators can schedule antivirus
software to scan all files on the computer's hard disk at a set time and date.

Although the signature-based approach can effectively contain virus outbreaks, virus
authors have tried to stay a step ahead of such software by writing "oligomorphic",
"polymorphic" and, more recently, "metamorphic" viruses, which encrypt parts of
themselves or otherwise modify themselves as a method of disguise, so as to not match
virus signatures in the dictionary.[11]

An emerging technique to deal with malware in general is whitelisting. Rather than
looking for only known bad software, this technique prevents execution of all computer
code except that which has been previously identified as trustworthy by the system
administrator. By following this "default deny" approach, the limitations inherent in
keeping virus signatures up to date are avoided. Additionally, computer applications that
are unwanted by the system administrator are prevented from executing since they are not
on the whitelist. Since organizations often have large quantities of trusted applications,
the limitations of adopting this technique rest with the system administrators' ability to
properly inventory and maintain the whitelist of trusted applications. Viable
implementations of this technique include tools for automating the inventory and
whitelist maintenance processes.
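
The "default deny" idea can be sketched in Python as follows. This is an illustrative assumption about how such a list might be kept, here as a set of SHA-256 digests; the `Allowlist` class and its method names are hypothetical and do not describe any particular product.

```python
import hashlib


def sha256_of(program: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of a program's contents."""
    return hashlib.sha256(program).hexdigest()


class Allowlist:
    """Default deny: only programs whose digest was explicitly trusted may run."""

    def __init__(self) -> None:
        self._trusted: set[str] = set()

    def trust(self, program: bytes) -> None:
        """Record a program as approved by the administrator."""
        self._trusted.add(sha256_of(program))

    def may_execute(self, program: bytes) -> bool:
        """Anything not on the list is refused, known-bad or not."""
        return sha256_of(program) in self._trusted


allowlist = Allowlist()
allowlist.trust(b"approved application bytes")
print(allowlist.may_execute(b"approved application bytes"))  # True
print(allowlist.may_execute(b"anything else"))               # False
```

Note that even a one-byte change to an approved program produces a different digest, which is why maintaining the inventory across software updates is the main administrative burden.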

Suspicious behavior monitoring


The suspicious behavior approach does not attempt to identify known viruses, but instead
monitors the behavior of all programs. If one program tries to write data to an executable
program, for example, the antivirus software can flag this suspicious behavior, alert a
user and ask what to do.
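
A simple rule of this kind can be sketched in Python. The event structure, the list of executable suffixes and the function names are assumptions made for the example; a real monitor would receive such events from operating system hooks.

```python
from dataclasses import dataclass

# Assumed list of file suffixes that identify executable programs.
EXECUTABLE_SUFFIXES = (".exe", ".com", ".dll", ".sys")


@dataclass(frozen=True)
class Event:
    process: str  # name of the program performing the action
    action: str   # for example "read" or "write"
    target: str   # path the action is applied to


def is_suspicious(event: Event) -> bool:
    """Flag any attempt to write data to an executable file."""
    return event.action == "write" and event.target.lower().endswith(
        EXECUTABLE_SUFFIXES
    )


def review(events: list[Event]) -> list[Event]:
    """Return the events that should be raised with the user."""
    return [event for event in events if is_suspicious(event)]


events = [
    Event("editor", "write", "C:/docs/report.txt"),
    Event("unknown", "write", "C:/Windows/notepad.EXE"),
    Event("unknown", "read", "C:/Windows/notepad.exe"),
]
for event in review(events):
    print(f"{event.process} tried to modify {event.target}")
```

A rule this broad would also flag legitimate installers and updaters, which is one source of the false positives associated with this approach.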

The suspicious behavior approach provides protection against zero-day viruses that are
not yet in the dictionary. However, it can also raise a large number of false positives, and
users may become desensitized to the warnings. This problem has worsened since 1997,
since many more non-malicious program designs came to modify other executables
without regard to this false positive issue. In recent years, however, sophisticated
behaviour analysis has emerged, which analyzes processes and calls to the kernel in
context before making a decision, which gives it a lower false positive rate than
rules-based behavior monitoring.

Heuristics

Some more sophisticated antivirus software uses heuristic analysis to identify new
malware or variants of known malware. Three methods are used: file analysis, file
emulation, and generic signatures.

File analysis is the process by which antivirus software will analyze the instructions of a
program. Based on the instructions, the software can determine whether or not the
program is malicious. For example, if the file contains instructions to delete important
system files, the file might be flagged as a virus. While this method is useful for
identifying new viruses and variants, it can trigger many false positives.

The second heuristic approach is file emulation, which runs the target file in a virtual
system environment, separate from the real system environment. The antivirus software
would then log what actions the file takes in the virtual environment. If the actions are
found to be damaging or malicious, the file may be marked as a virus. But again, this
method can trigger false positives.
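
The emulation approach can be illustrated with a toy Python model. Everything here is an assumption made for the example: the "program" is just a list of (operation, path) pairs, the file system is a dictionary, and the set of protected paths is invented. A real emulator interprets machine code inside a virtual CPU and operating system.

```python
# Hypothetical set of files whose modification counts as damaging.
PROTECTED = {"/boot/kernel", "/etc/passwd"}


def emulate(program, files):
    """Run a toy program against a copy of files and return the action log."""
    sandbox = dict(files)  # work on a copy; the caller's mapping is untouched
    log = []
    for op, path in program:
        log.append((op, path))
        if op == "delete":
            sandbox.pop(path, None)
        elif op == "write":
            sandbox[path] = b"modified"
    return log


def looks_malicious(log):
    """Judge the logged actions: touching a protected file is damaging."""
    return any(
        op in ("delete", "write") and path in PROTECTED for op, path in log
    )


real_files = {"/boot/kernel": b"kernel", "/home/a.txt": b"notes"}
sample = [("read", "/home/a.txt"), ("delete", "/boot/kernel")]

log = emulate(sample, real_files)
print(looks_malicious(log))          # True
print("/boot/kernel" in real_files)  # True: the real system was not altered
```

The verdict is derived entirely from the log, while the real file mapping remains unchanged, which mirrors the separation between the virtual and the real environment described above.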

Another type of heuristic is the generic signature.

Many viruses start as a single infection and, through either mutation or refinements by
other attackers, can grow into dozens of slightly different strains, called variants. Generic
detection refers to the detection and removal of multiple threats using a single virus
definition.[12]

For example, the Vundo trojan has several family members, depending on the antivirus
vendor's classification. Symantec classifies members of the Vundo family into two
distinct categories, Trojan.Vundo and Trojan.Vundo.B.[13][14]

While it may be advantageous to identify a specific virus, it can be quicker to detect a
virus family through a generic signature or through an inexact match to an existing
signature. Virus researchers find common areas that all viruses in a family share uniquely
and can thus create a single generic signature. These signatures often contain
non-contiguous code, using wildcard characters where differences lie. These wildcards allow
the scanner to detect viruses even if they are padded with extra, meaningless code.[15]
Padded code is used to confuse the scanner so it can't recognize the threat.

A detection that uses this method is said to be "heuristic detection."
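
A generic signature with wildcards can be approximated in Python with a regular expression over bytes. This is a sketch under stated assumptions: the code fragments are invented, the `compile_generic` helper and its `max_gap` parameter are hypothetical, and real scanners use their own signature formats rather than Python's `re` module.

```python
import re


def compile_generic(fragments, max_gap=16):
    """Match the fragments in order, allowing up to max_gap arbitrary
    bytes (the wildcard region) between consecutive fragments."""
    gap = b".{0,%d}" % max_gap
    pattern = gap.join(re.escape(fragment) for fragment in fragments)
    return re.compile(pattern, re.DOTALL)  # DOTALL lets "." match any byte


# Made-up fragments shared by every member of a fictional virus family.
FAMILY = compile_generic([b"\x90\x90LOAD", b"DECRYPT", b"JUMP"])

variant_a = b"\x90\x90LOAD" + b"DECRYPT" + b"JUMP"
variant_b = b"\x90\x90LOAD" + b"\x00" * 5 + b"DECRYPT" + b"junk" + b"JUMP"
unrelated = b"LOAD something else entirely"

print(FAMILY.search(variant_a) is not None)  # True
print(FAMILY.search(variant_b) is not None)  # True, despite the padding
print(FAMILY.search(unrelated) is not None)  # False
```

A single definition therefore covers both variants. The bounded gap is a design choice: allowing unlimited padding would match more variants but would also increase the risk of false positives.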

Virus removal tools



A virus removal tool is software for removing specific viruses from infected computers.
Unlike complete antivirus scanners, such tools are usually not intended to detect and remove an
extensive list of viruses; rather, they are designed to remove specific viruses, usually more
effectively than normal antivirus software. Sometimes they are also designed to run in
places that regular antivirus software cannot, which is useful in the case of a severely
infected computer. Examples of these tools include McAfee Stinger and the Microsoft
Windows Malicious Software Removal Tool (which is run automatically by Windows
Update).

Issues of concern


Performance

Some antivirus software can considerably reduce performance. Users may disable the
antivirus protection to overcome the performance loss, thus increasing the risk of
infection. For maximum protection, the antivirus software needs to be enabled all the
time, often at the cost of slower performance (see also software bloat).

Security

Antivirus programs can in themselves pose a security risk, as they often run at the
'System' level of privileges and may hook the kernel. Both of these are necessary for
the software to do its job effectively; however, exploitation of the antivirus program itself
could lead to privilege escalation and create a severe security threat. Arguably, measured
against the principle of least privilege, the use of antivirus software is largely ineffective
once the ramifications of the added software are taken into account.

Unexpected renewal costs

Some commercial antivirus software end-user license agreements include a clause that
the subscription will be automatically renewed, and the purchaser's credit card
automatically billed, at the renewal time without explicit approval. For example, McAfee
requires users to unsubscribe at least 60 days before the expiration of the present
subscription[16] while BitDefender sends notifications to unsubscribe 30 days before the
renewal.[17] Norton Antivirus also renews subscriptions automatically by default.[18]

Open source and free software applications, such as Clam AV, provide both the scanner
application and updates free of charge and so there is no subscription to renew.[19]

Privacy


Some antivirus programs may be configured to automatically upload infected or
suspicious files to the developer for further analysis. Care should be taken when
deploying antivirus software to ensure that documents containing confidential or
proprietary information are not sent to the product's developer without prompting the
user.

Rogue security applications

Some antivirus programs are actually malware masquerading as antivirus software, such
as WinFixer and MS Antivirus.[20]

False positives


If an antivirus program is configured to immediately delete or quarantine infected files
(or does this by default), false positives in essential files can render the operating system
or some applications unusable.[21]

System-related issues

Running multiple antivirus programs concurrently can degrade performance and create
conflicts.[22] It is sometimes necessary to temporarily disable virus protection when
installing major updates such as Windows Service Packs or updating graphics card
drivers.[23] Active antivirus protection may partially or completely prevent the installation
of a major update.

Mobile devices



Viruses from the desktop and laptop world have either migrated to mobile devices or are
assisted in their dispersal by them. Antivirus vendors are beginning to offer solutions for
mobile handsets. These devices present significant challenges for antivirus software, such as
microprocessor constraints, memory constraints and the delivery of new signature updates
to the handsets.

Effectiveness

Studies in December 2007 showed that the effectiveness of antivirus software was
much reduced from what it had been a few years earlier, particularly against unknown or
zero-day attacks. The German computer magazine c't found that detection rates for these
threats had dropped from 40-50% in 2006 to 20-30% in 2007. At that time, the only
exception was the NOD32 antivirus, which managed a detection rate of 68 percent.[3]

The problem is magnified by the changing intent of virus authors. Some years ago it was
obvious when a virus infection was present. The viruses of the day, written by amateurs,
exhibited destructive behavior or pop-ups. Modern viruses are often written by
professionals, financed by criminal organizations.[24] It is not in their interests to make
their viruses or crimeware evident, because their purpose is to create botnets or steal
information for as long as possible without the user realizing. If an infected user has a
less-than-effective antivirus product that says the computer is clean, then the virus may
go undetected. Nowadays, viruses generally do not attempt to overwhelm the Internet by
flooding. Instead, viruses take a more controlled approach, as damaging the vector of
infection does not result in financial gain.

Traditional antivirus software solutions run virus scanners on schedule and on demand, and
some run scans in real time. If a virus or other malware is located, the suspect file is usually
placed into quarantine to terminate its chances of disrupting the system. Traditional
antivirus solutions scan and compare against a publicised and regularly updated
dictionary of malware, otherwise known as a blacklist. Some antivirus solutions have
additional options that employ a heuristic engine, which further examines the file to see
if it is behaving in a similar manner to previous examples of malware. A new technology
utilized by a few antivirus solutions is whitelisting: this technology first checks whether the
file is trusted and questions only those that are not.[25] With the addition of wisdom of
crowds, antivirus solutions back up other antivirus techniques by harnessing the
intelligence and advice of a community of trusted users to protect each other. By
providing these multiple layers of malware protection and combining them with other
security software, it is possible to have more effective protection from the latest zero-day
attack and the latest crimeware than was previously the case with just one layer of
protection.

Cloud antivirus


In current antivirus software a new document or program is scanned with only one virus
detector at a time. CloudAV would be able to send programs or documents to a network
cloud, where multiple antivirus and behavioural detection engines are applied
simultaneously. It is more thorough and also has the ability to check the new document's or
program's access history.[26]

CloudAV is a cloud computing antivirus developed by scientists at the
University of Michigan. Each time a computer or device receives a new document or
program, that item is automatically detected and sent to the antivirus cloud for analysis.
The CloudAV system uses 12 different detectors that act together to tell the PC whether
the item is safe to open.[26][27][28]
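
The idea of several detectors acting together can be sketched in Python. This is not the CloudAV implementation: the three toy detectors, the `analyse` function and the voting threshold are assumptions used only to show how independent verdicts might be combined into a single answer.

```python
from typing import Callable

# A detector takes the item's bytes and returns True when it looks malicious.
Detector = Callable[[bytes], bool]


def marker_detector(item: bytes) -> bool:
    return b"FAKE-VIRUS-BODY" in item  # invented signature


def macro_detector(item: bytes) -> bool:
    return b"AutoOpen" in item  # assumed indicator of an auto-run macro


def size_detector(item: bytes) -> bool:
    return len(item) > 1_000_000  # arbitrary size limit for the example


def analyse(item: bytes, detectors: list[Detector], threshold: int = 1) -> dict:
    """Run every detector and report the item as safe only when fewer
    than threshold detectors flag it."""
    flags = sum(1 for detector in detectors if detector(item))
    return {"flags": flags, "safe": flags < threshold}


DETECTORS = [marker_detector, macro_detector, size_detector]
print(analyse(b"hello", DETECTORS))
# {'flags': 0, 'safe': True}
print(analyse(b"xxFAKE-VIRUS-BODYxx AutoOpen", DETECTORS))
# {'flags': 2, 'safe': False}
```

With a threshold of 1, a single positive verdict is enough to block the item, which favours detection over a low false positive rate; a higher threshold shifts that balance.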

Other computer protection methods



Besides antivirus software, virus infection prevention can be assisted by other means such
as implementing a network firewall, or utilizing system virtualization.

Antivirus Card

This method was used in the early 1990s by DOS users and involves the installation of an
ISA interface card which takes over the DOS interrupt and monitors the WRITE
operation.

Network Firewall

Network firewalls prevent unknown programs and Internet processes from accessing the
protected system. However, they are not antivirus systems as such and thus make no
attempt to identify or remove anything. They may protect against infection from outside
the protected computer or LAN, and limit the activity of any malicious software which is
present by blocking incoming or outgoing requests on certain TCP/IP ports. A firewall is
designed to deal with broader system threats that come from network connections into the
system and is not an alternative to a virus protection system.
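
Port-based blocking can be illustrated with a small Python sketch. The `Rule` structure, the example rules and the default-deny fallback are assumptions for the example and do not correspond to any particular firewall product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str  # "in" for incoming, "out" for outgoing requests
    port: int       # TCP/IP port number
    allow: bool     # True to permit, False to block


def permitted(direction: str, port: int, rules: list[Rule],
              default: bool = False) -> bool:
    """Return the decision of the first matching rule, else the default."""
    for rule in rules:
        if rule.direction == direction and rule.port == port:
            return rule.allow
    return default


RULES = [
    Rule("in", 443, True),   # allow incoming HTTPS
    Rule("out", 25, False),  # block outgoing SMTP, e.g. from a spam bot
]

print(permitted("in", 443, RULES))  # True
print(permitted("out", 25, RULES))  # False
print(permitted("in", 23, RULES))   # False (no rule, default deny)
```

The decision depends only on the direction and port of the request, never on the content of any file, which is why a firewall of this kind cannot identify or remove a virus.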

System Virtualization

This method virtualizes the working system, which prevents the actual system from being
altered by a virus, because any attempted alteration is confined to the virtualized
system. Although this is generally the case, an infection may spread to the non-virtual
system if the virus is designed to break out of the virtualization environment, after
infecting the virtual system, by exploiting a vulnerability in that environment.

In general, without any antivirus software the virtual system can still be infected and
suffer damage or malicious action, but as soon as the system is shut down and restarted,
all the changes and damage previously done to the virtual system will be reset. This way,
the system is protected and the virus is removed. However, any damage to unprotected
or unvirtualized data will remain, as will the malicious effects the virus has caused, such as
data theft.
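
The reset-on-restart behaviour can be modelled in Python as a read-only base image with a disposable overlay. The `VirtualDisk` class and its methods are hypothetical and greatly simplified; they illustrate the principle rather than the design of any virtualization product.

```python
class VirtualDisk:
    """A read-only base image plus an overlay that is discarded on restart."""

    def __init__(self, base_image: dict) -> None:
        self._base = dict(base_image)  # never modified after construction
        self._overlay = {}             # files written during this session
        self._deleted = set()          # files deleted during this session

    def write(self, path: str, data: bytes) -> None:
        self._overlay[path] = data
        self._deleted.discard(path)

    def delete(self, path: str) -> None:
        self._overlay.pop(path, None)
        self._deleted.add(path)

    def read(self, path: str) -> bytes:
        if path in self._deleted:
            raise FileNotFoundError(path)
        if path in self._overlay:
            return self._overlay[path]
        return self._base[path]

    def restart(self) -> None:
        """Drop every change made since the base image was loaded."""
        self._overlay.clear()
        self._deleted.clear()


disk = VirtualDisk({"/sys/config": b"original"})
disk.write("/sys/config", b"infected")
print(disk.read("/sys/config"))  # b'infected'
disk.restart()
print(disk.read("/sys/config"))  # b'original'
```

The model also shows the limit of the approach: anything the virus read or sent elsewhere before the restart is outside the overlay and is not undone.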

Not all virtualization software loads the virtual computer from a standard
(unchangeable) boot image. In such cases the virtual computer behaves like any real
computer with its own (virtual) hard disk: an infection has to be cured in order to
disinfect the system, or the virtual system has to be destroyed (deleted) in order to
get rid of the damage. This is the case for VMware and VirtualPC when a virtual computer
is set up with its own virtual hard disk. Virtualization software like Sandboxie
may prevent infection from spreading to the main (non-virtual) system, in which case
the only damage the virus can do is data or identity theft, i.e. stealing the data that
the non-virtual system makes available to the virtual browsing or document-processing
environment. As mentioned before, if Sandboxie has exploitable vulnerabilities, a virus
may use them to infect the non-virtual environment.

Online detection


Some online sites provide scanning of files uploaded by users. These online sites use
multiple virus scanners and provide a report to the user about the uploaded file. Examples
of online scanners include Jotti's malware scan,[29] COMODO Automated Analysis
System[30] and VirusTotal.com.[31]