ARTIFICIAL IMMUNE SYSTEM
Bachelor of Engineering In Computer Engineering
Ms Amrita Saraswat Mr. Vikas Sharma VIII Sem. Seminar Coordinators (Lecturer CE)
Nakul Chawla IV B.E Computer Engineering Sec-A, Roll no. 61
DEPARTMENT OF COMPUTER ENGINEERING JAIPUR ENGINEERING COLLEGE & RESEARCH CENTER
UNIVERSITY OF RAJASTHAN 2007-2008
The beatitude, bliss and euphoria that accompany the successful completion of any task would not be complete without an expression of appreciation to the people who made it possible. First and foremost, I would like to thank my advisors, Ms. Amrita Saraswat and Mr. Vikas Sharma. Their attitude toward excellence, helping nature and enthusiasm have been a source of constant inspiration. I am grateful to them for all the advice, encouragement and support they have given me during my work with them. I don't think this task would be as structured as it is now without their valuable guidance. I would also like to thank my friends for their unflinching support during my work. They were the true driving force behind me throughout, constantly encouraging me to do my best and inspiring me to aim higher.
CONTENTS

Seminar Abstract
1. Introduction
2. Virus - An Artificial Life
   2.1 Computer Virus
   2.2 Replication Property of Virus
   2.3 How Virus Affects A Computer
   2.4 Signature of Virus
   2.5 Types of Virus
3. Anti-Virus
   3.1 Anti-virus techniques
   3.2 What happens if current Anti-virus is failed?
       3.2.1 Development of new Anti-virus
       3.2.2 Drawbacks of this approach
   3.3 Failure of current Anti-virus
       3.3.1 Viral influx
       3.3.2 Inter-connectivity among computers
4. Computer's Immune System
5. Recognize Known Intruders
6. Elimination of Intruders
7. Handling Unknown Intruders
   7.1 Concept of self and non-self
   7.2 Check for new software against virus
       7.2.1 Integrity monitors
       7.2.2 Activity monitors
       7.2.3 Scanner
   7.3 Use of decoy program
   7.4 Signature extractor
       7.4.1 Problems faced by signature extractor
       7.4.2 Algorithm used by signature extractor
   7.5 Adding signature to database
8. Self Replication
   8.1 Kill Signal
   8.2 Algorithm used by the neighbor
9. Conclusion
Bibliography

ABSTRACT

Computer viruses are the first and only form of artificial life to have a measurable impact on society. Currently they are a relatively manageable nuisance. However, two alarming trends are likely to make computer viruses a much greater threat. Firstly, the rate at which new viruses are being written is high, and accelerating. Secondly, the trend towards increasing interconnectivity and interoperability among computers will enable computer viruses and worms to spread much more rapidly than they do today.

To solve these problems, there is an immune system for computers and computer networks that takes much of its inspiration from nature and is closely related to the biological immune system. Like the vertebrate immune system, our system develops antibodies to previously unencountered computer viruses or worms and remembers them so as to recognize and respond to them more quickly in the future. It also employs nature's technique of fighting self-replication with self-replication. Care must be taken to minimize the risk of an auto-immune response, in which the immune system mistakenly identifies legitimate software as being undesirable. This artificial immune system is to be implemented within a couple of years at the IBM anti-virus laboratory.
INTRODUCTION

Computer viruses are forms of artificial life. They are written by authors and have established themselves pervasively throughout the world's computing environment. Computer viruses have found a niche on all of the world's continents, including Antarctica, and in most of its countries. Of the roughly 100 to 200 million PC and Macintosh users in the world, at least several hundred thousand, and perhaps over a million, have been afflicted at one time or another. During any particular moment, it is typical for a few viruses to be increasing in prevalence, and other formerly prevalent ones to be on the decline.

A sufficiently amoral artificial life enthusiast might view the success of these artificial creatures in the real world as amazing, amusing, and admirable, but most responsible citizens regard computer viruses (and those who write them) with abhorrence. Even though just a small minority of viruses is intentionally harmful, the vast majority of them are poorly-written, poorly-tested, buggy pieces of software that create problems that are often time-consuming to diagnose. According to a Dataquest survey and spokesmen for several different insurance companies, a virus spreading among several PCs in a company costs (on average) several thousands of dollars in down-time and data losses. At least one insurer offers a $100,000/year policy for damage due to computer virus infection.

Computer viruses are serious business. They have engendered an entire anti-virus industry, consisting of hundreds of researchers and developers who are employed by dozens of companies around the world. Currently, the arms race between virus authors and anti-virus developers is roughly even. However, two alarming trends threaten to turn the balance in favor of virus authors:
• The rate at which new viruses are being written is quite high, and appears to be accelerating. Human experts who analyze and find cures for viruses are already swamped, and their ability to keep pace with the large influx of new viruses is being questioned.
• The continuing increase in interconnectivity and interoperability among the world's computers enhances the ability of any particular virus to spread, and the rapidity with which it does so. The current strategy of periodically distributing updates to anti-virus software from a central source will be orders of magnitude too slow to keep up with the spread of a new virus.

In the near future, computers will somehow need to automatically recognize and remove previously unknown viruses on the spot soon after they are discovered. Fortunately for us, Nature has already invented a remarkably effective mechanism for recognizing and responding rapidly to viruses and other undesired intruders, even in cases where the intruder has never been seen before: the Vertebrate/Biological Immune System. The success of the vertebrate immune system in protecting its host from a wide array of viruses and other undesirables that are continually mutating and evolving has inspired us to design and implement an immune system for computers that is founded on similar principles. This type of immune system is called Artificial Immune System. Various components of the immune system are already being used to automate the task of computer virus analysis in the laboratory. Over the next year or two, the immune system will be phased gradually into IBM's anti-virus software.
VIRUS – AN ARTIFICIAL LIFE

Before going to the Artificial Immune System, I would like to explain the enemy of computers, i.e. the Computer Virus.

COMPUTER VIRUS

A computer virus is a malicious executable program, typically a poorly written, poorly tested and buggy piece of software, that creates problems. Only a small minority of viruses are intentionally harmful. Depending on the nature of the virus, it may damage hard disk contents and/or interfere with the normal operation of the computer.

REPLICATION PROPERTY OF VIRUS

By definition a virus program is able to replicate itself. This means that a virus multiplies on a computer by making copies of itself. This replication is intentional; it is part of the virus program. In most cases, if a file containing a virus is executed or copied on to another computer, then that computer will also be "infected" by the same virus.

In order to replicate itself, a virus must be permitted to execute code and write to memory. For this reason, many viruses attach themselves to executable files that may be part of legitimate programs. If a user tries to start an infected program, the virus' code may be executed first.

Viruses can be divided into two types, on the basis of their behavior when they are executed. Nonresident viruses immediately search for other hosts that can be infected, infect these targets, and finally transfer control to the application program they infected. Resident viruses do not search for hosts when they are started. Instead, a resident virus loads itself into memory on execution and transfers control to the host program. The virus stays active in the background and infects new hosts when those files are accessed by other programs or the operating system itself.
Nonresident viruses

Nonresident viruses can be thought of as consisting of a finder module and a replication module. The finder module is responsible for finding new files to infect. For each new executable file the finder module encounters, it calls the replication module to infect that file.

Resident viruses

Resident viruses contain a replication module that is similar to the one that is employed by nonresident viruses. However, this module is not called by a finder module. Instead, the virus loads the replication module into memory when it is executed and ensures that this module is executed each time the operating system is called to perform a certain operation. For example, the replication module can be called each time the operating system executes a file. In this case, the virus infects every suitable program that is executed on the computer.

Resident viruses are sometimes subdivided into a category of fast infectors and a category of slow infectors. Fast infectors are designed to infect as many files as possible. For instance, a fast infector can infect every potential host file that is accessed. This poses a special problem to anti-virus software, since a virus scanner will access every potential host file on a computer when it performs a system-wide scan. If the virus scanner fails to notice that such a virus is present in memory, the virus can "piggy-back" on the virus scanner and in this way infect all files that are scanned. Fast infectors rely on their fast infection rate to spread. The disadvantage of this method is that infecting many files may make detection more likely, because the virus may slow down a computer or perform many suspicious actions that can be noticed by anti-virus software.

Slow infectors, on the other hand, are designed to infect hosts infrequently. For instance, some slow infectors only infect files when they are copied. Slow infectors are designed to avoid detection by limiting their actions: they are less likely to slow down a computer noticeably, and will at most infrequently trigger anti-virus software that detects suspicious behavior by programs.
HOW VIRUS AFFECTS A COMPUTER

A virus can be introduced to a computer system along with any software program. For Internet users, this threat can come from downloading files through FTP (file transfer protocol), or referencing email attachments. When a virus is introduced to a computer system, it can attach itself to, or sometimes even replace, an existing program. Thus when the user runs the program in question, the virus is also executed. This usually happens without the user being aware of it.

A virus program contains instructions to initiate some sort of "event" that affects the infected computer. Each virus has a unique event associated with it. These events and their effects can range from harmless to devastating. For example:
• An annoying message appearing on the computer screen.
• Reduced memory or disk space.
• Modification of data.
• Files overwritten or damaged.
• Hard drive erased.

SIGNATURE OF VIRUS

As we know, a virus is just a malicious program; it can be written in C, C++, Java, etc. Because it is a program, it contains code and data. The data comprises character strings, constants, etc. Virus writers often simply take the code of a previous virus and change its data portion to produce a new instance of the virus. This is called mutation of the virus. However, there is some fixed byte sequence, normally 16 to 24 bytes long, which is guaranteed to be found in each instance of the virus and not in any uninfected software. This is called the signature. It is found only in the code portion of the virus.

Depending on the type of scanner being used, the signature may be a static hash which, in its simplest form, is a calculated numerical value of a snippet of code unique to the virus. Or, less commonly, the algorithm may be behavior-based, i.e. if this file tries to do X, Y, Z, flag it as suspicious and prompt the user for a decision.
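The two static forms of signature described above (a fixed byte sequence, and a hash of a code snippet) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names SIGNATURES, scan_bytes and snippet_hash are hypothetical, the signature bytes are invented, and SHA-256 is merely one possible choice of hash; it is not how any particular scanner is implemented.

```python
import hashlib

# Hypothetical signature database: virus name -> fixed byte sequence.
# The 16 bytes below are invented purely for illustration.
SIGNATURES = {
    "demo-virus-a": bytes.fromhex("deadbeef00112233445566778899aabb"),
}


def scan_bytes(data, signatures):
    """Return a list of (name, offset) for every signature occurrence in data."""
    hits = []
    for name, sig in signatures.items():
        offset = data.find(sig)
        while offset != -1:
            hits.append((name, offset))
            offset = data.find(sig, offset + 1)
    return hits


def snippet_hash(data, offset, length):
    """Static-hash variant: hash a fixed-length snippet of the file contents."""
    return hashlib.sha256(data[offset:offset + length]).hexdigest()


if __name__ == "__main__":
    sig = SIGNATURES["demo-virus-a"]
    clean = b"\x90" * 64                          # stand-in for an uninfected file
    infected = clean[:10] + sig + clean[10:]      # same file with the signature embedded
    print(scan_bytes(clean, SIGNATURES))
    print(scan_bytes(infected, SIGNATURES))
```

A byte-sequence scan reports where the signature occurs, which is what a repair step would need; a hash only says whether a given snippet is identical to a known one.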
A single signature may be consistent among a large number of viruses. This allows the scanner to detect a brand new virus it has never even seen before. This ability is commonly referred to as either heuristics or generic detection. Generic detection is less likely to be effective against completely new viruses and more effective at detecting new members of an already known virus "family" (a collection of viruses that share many of the same characteristics and some of the same code). The ability to detect heuristically or generically is significant, given that most scanners now include in excess of 250k signatures and the number of new viruses being discovered continues to increase dramatically year after year.

TYPES OF VIRUS

Viruses can be categorized into 8 types according to the file types they infect. These types include:
1. System Sector Viruses: They infect control information on the disk itself.
2. Boot Sector Viruses: These viruses infect floppy and hard drives. These programs load before the Operating System.
3. File Viruses: They infect program (COM or EXE) files.
4. Macro Viruses: These infect files that we might think of as data files. But, because they contain macro programs, they can be infected.
5. Cluster Viruses: A special type of virus that infects through the disk directory.
6. Source Code Viruses: These add code to actual program source code.
7. Batch File Viruses: These use text batch files to infect.
8. Virus Hoax: Although there are thousands of viruses discovered each year, there are still some which only exist in the imaginations of the public and the press, known as virus hoaxes. These virus hoaxes do not exist, despite the rumors of their creation and distribution.
Viruses can also be categorized into two types according to their notoriety:
1. Known Virus: These are viruses which are known to us, or for which there is an anti-virus available to handle them. They can also be handled by the Artificial Immune System.
2. Unknown Virus: These are viruses which are not known to us, which were not discovered or encountered before, or for which there is no anti-virus to handle them. They can only be handled by the Artificial Immune System.

So the Artificial Immune System can handle both known and unknown viruses, as shown in Figure 1.

[Figure 1: known viruses handled by anti-virus; unknown viruses handled by AIS]
ANTI-VIRUS

Just as the biological immune system has antibodies that act against antigens, the immune system of a computer has the concept of anti-virus for fighting against computer viruses.

ANTI-VIRUS TECHNIQUES

There are four techniques present in a typical anti-virus. These are:

Activity monitors: They alert users to system activity that is commonly associated with viruses, but only rarely associated with the behavior of normal, legitimate programs.

Integrity management systems: They warn the user of suspicious changes that have been made to files.

These two methods are quite generic, and can be used to detect the presence of hitherto unknown viruses in the system. However, they are not often able to pinpoint the nature or even the location of the infecting agent, and they often flag or prevent legitimate activity, and so can disrupt normal work or lead the user to ignore their warnings altogether.

Virus scanners: They search files, boot records, memory, and other locations where executable code can be stored for characteristic byte patterns that occur in one or more known viruses. Scanners are essential for establishing the identity and location of a virus.

Virus repairers: Armed with the very specific knowledge that scanners provide, repairers, which restore infected programs to their original uninfected state, can be brought into play.

Scanners and repairers tend to be substantially less prone to false positives than activity monitors and integrity management systems. Their drawback is that they can only be applied to known viruses, or variants of them; this requires that scanners and repairers be updated frequently.

Each time a new virus is discovered that is not detectable by an existing signature, or may be detectable but cannot be properly removed because its behavior is not totally consistent with previously known threats, a new signature must be created. After the new signature has been created and tested by the antivirus vendor, it is pushed out to the customer in the form of signature updates. These updates add the detection capability to the scan engine. In some cases, a previously provided signature might be removed or replaced with a new signature to offer better overall detection or disinfection capabilities. Depending on the scanning vendor, updates may be offered hourly, daily, or sometimes even weekly. Much of the need to provide signatures varies with the type of scanner, i.e. with what that scanner is charged with detecting.

WHAT HAPPENS IF CURRENT ANTI-VIRUS IS FAILED?

Whenever a new virus is discovered, it means that the current anti-virus was not able to detect the virus, i.e. it has failed. The virus is very quickly distributed among an informal, international group of virus collectors who exchange samples among themselves. Many such collectors are in the anti-virus software business, and they set out to obtain information about the virus which enables:
1. Detection of the virus whenever it is present in a host program, and
2. Restoration of an infected host program to its original uninfected state (which is usually possible).

DEVELOPMENT OF NEW ANTI-VIRUS

Typically, a human expert obtains this information by disassembling the virus and then analyzing the assembler code to determine the virus's behavior and the method that it uses to attach itself to host programs. Then, the expert selects a "signature" (a sequence of perhaps 16 to 32 bytes) that represents a sequence of instructions that is guaranteed to be found in each instance of the virus, and which (in the expert's estimation) is unlikely to be found in legitimate programs. This "signature" can then be encoded into the scanner, and the knowledge of the attachment method can be encoded into the repairer.

DRAWBACKS OF THIS APPROACH
Such an analysis is tedious and time-consuming, sometimes taking several hours or days. Even the best experts have been known to select poor signatures: ones that cause the scanner to report false positives on legitimate programs.

FAILURE OF CURRENT ANTI-VIRUS

There are two reasons which have led to the failure of current anti-virus and our motivation towards the Artificial Immune System:
1. Viral Influx
2. Interconnectivity among computers

VIRAL INFLUX

One reason why current anti-virus techniques can be expected to fail within the next few years is the rapid, accelerating influx of new computer viruses. The number of different known DOS viruses over the last several years can be fit remarkably well by an exponential curve. Currently, it is approximately 100,000, with two or three new ones appearing each day, a rate which already taxes to the limit the ability of anti-virus vendors to develop detectors and cures for them. Of course, curve extrapolation of a phenomenon that depends largely on human sociology and psychology should be regarded very skeptically, but it is not impossible that virus writers could be so prolific. To do so, they would have to automate both the writing and the distribution of viruses. Already, the beginnings of a trend towards automated virus-writing are evinced by the Virus Creation Laboratory, a menu-driven virus toolkit circulating among virus writers' bulletin boards. Even if the rate at which new viruses appear were to suddenly plateau at a level not much higher than what it is today, the number of different DOS viruses could easily reach the hundreds of thousands by the year 2005, and the burden on current antivirus techniques to detect and eradicate so many viruses would be severe.

INTER-CONNECTIVITY AMONG COMPUTERS
It is unfortunate, but hardly surprising, that increased interconnectivity and interoperability among computers (designed to facilitate the flow of desirable information) also facilitates the flow of computer viruses. One can expect increased networking to be reflected in increases in two important epidemiological parameters:

• The overall rate at which a given infected individual computer spreads a virus. If the average rate at which infection can spread from one individual to another is sufficiently low, widespread infection is impossible. Above a well-defined critical threshold, however, epidemics can occur. As a simple way of explaining the existence of a sharp threshold, imagine that an individual has the flu. If, during that individual's period of contagion, he or she can be expected to infect 0.9 other people, the strain of flu will sooner or later die out. However, if that individual can be expected to infect 1.1 other people, there is likely to be a flu epidemic.
• The number of partners with which that individual has potentially infectious contacts. A topology in which each individual has several "neighbors" to which it can spread infection is more conducive to epidemics than one which is sparsely connected, even when the infection rate along each link is adjusted so as to keep the total the same in the two cases.

Thus, to the extent that technological advances will increase the contact rate and promiscuity among computers, we can expect computer virus epidemics to become more likely, to spread faster, and to affect more computers.

While it is true that updates might be made somewhat more frequently, this would not solve the problem. The updates must be distributed to customers, and the customers must install them. Given the time, money, and effort involved, it is not surprising that many customers blissfully continue to use anti-virus software that is more than a year out of date.

COMPUTER'S IMMUNE SYSTEM
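Before turning to the immune system itself, the sharp threshold in the flu example of the previous section can be checked with a little arithmetic. The sketch below assumes a very simple model that is not part of the original argument: every infected individual infects on average r others, so the expected number of new cases in generation n is r^n, and for r < 1 the expected total converges to 1/(1 - r). The function names are hypothetical.

```python
def expected_cases(r, generations):
    """Expected new cases in generations 0..generations, starting from one case."""
    return [r ** n for n in range(generations + 1)]


def cumulative_cases(r, generations):
    """Expected total number of cases up to and including the given generation."""
    return sum(expected_cases(r, generations))


if __name__ == "__main__":
    for r in (0.9, 1.1):
        for g in (10, 50, 100):
            print("r =", r, "generations =", g, "expected total =", cumulative_cases(r, g))
```

With r = 0.9 the total levels off just below 1/(1 - 0.9) = 10 cases however long the outbreak runs, whereas with r = 1.1 it keeps growing with every generation. Under this assumed model, a small change in the spreading rate around 1 separates an outbreak that dies out from an epidemic.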
Rather than relying on a central authority to protect them from all ills, humans and other vertebrates carry around their own individual immune systems. The vertebrate immune system exhibits some remarkable properties, including:
1. Recognition of known intruders.
2. Elimination/neutralization of intruders.
3. Ability to learn about previously unknown intruders:
   • Determine that the intruder doesn't belong.
   • Figure out how to recognize it.
   • Remember how to recognize it.
4. Use of selective proliferation and self-replication for quick recognition and response.

Phrased in this way, it is evident that these fundamental properties are desirable for computers as well.

RECOGNIZE KNOWN INTRUDERS

The vertebrate immune system recognizes particular antigens (viruses and other undesirable foreign substances) by means of antibodies and immune cell receptors which bind to epitopes (small portions of the antigen, consisting of at least 4 to 6 amino acids). It is interesting to note that an exact match to the entire antigen is not attempted; in fact, it is almost certainly a physical impossibility. No antibody molecule or immune-cell receptor could be perfectly specific to a given antigen because matching occurs at surfaces, not throughout volumes. T cell receptors can see the inner portions of an antigen, but only after the antigen has been consumed by a macrophage or other cell, which then presents pieces of the antigen on its surface, where they can be seen by other cells.
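The idea of binding to a short epitope rather than to the whole antigen can be pictured with a toy string model. Everything in this sketch is an assumption made for illustration: antigens and receptors are represented as strings of letters, "binding" is defined as a window of the antigen differing from the receptor in at most a given number of positions, and the names hamming and binds are hypothetical. It is not a model of real immunochemistry.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def binds(receptor, antigen, max_mismatches=1):
    """True if some window of the antigen matches the receptor closely enough."""
    k = len(receptor)
    return any(
        hamming(receptor, antigen[i:i + k]) <= max_mismatches
        for i in range(len(antigen) - k + 1)
    )


if __name__ == "__main__":
    receptor = "AYIAKQ"              # a 6-letter "epitope" pattern (invented)
    antigen = "MKTAYIAKQRQISFVK"     # contains the pattern exactly
    variant = "MKTAYLAKQRQISFVK"     # same antigen with one letter changed
    unrelated = "GGGGGGGGGGGG"
    for name, seq in (("antigen", antigen), ("variant", variant), ("unrelated", unrelated)):
        print(name, binds(receptor, seq))
```

Because only a short window has to match, and a small number of mismatches is tolerated, the same receptor recognizes both the original sequence and a close variant, while an unrelated sequence is ignored.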
Similarly, in the computer immune system, a particular virus is not recognized via an exact match; rather, it is recognized via an exact or fuzzy match to a relatively short sequence of bytes occurring in the virus (a "signature", as described in section 2). Although matching to a small portion of the virus is not necessitated in this case by the laws of chemistry, it has some important advantages:
1. It is more efficient in time and memory, and
2. It enables the system to recognize variants.

The issues of efficiency and variant recognition are relevant for biology as well. For biological and computer immune systems, an ability to recognize variants is essential because viruses tend to mutate frequently. If an exact match were required, immunity to one variant of a virus would confer no protection against a slightly different variant. Similarly, vaccines would not work, because they rely on the biological immune system's ability to synthesize antibodies to tamed or killed viruses that are similar in form to the more virulent one that the individual is being immunized against.

ELIMINATION OF INTRUDERS

In the biological immune system, if an antibody meets up with an antigen, the two bind together, and the antigen is effectively neutralized. Thus recognition and neutralization of the intruder occur simultaneously. Alternatively, a killer T cell may encounter a cell that exhibits signs of being infected with a particular infecting agent, whereupon it kills the host cell. This is a perfectly sensible course of action. A biological virus co-opts its host cell's machinery, matter and energy into synthesizing viral proteins that are assembled into copies of the virus. Eventually, the host's cell wall is ruptured, resulting in the death of the host and the release of hundreds or thousands of viruses into the intercellular medium. By killing an infected host cell, a killer T cell is merely hastening the execution of a cell that was slated to die anyway, and it prevents the virus from completing the replication process.

If the computer immune system were to find an exact or fuzzy match to a signature for a known virus, it could take the analogous step of erasing or otherwise inactivating the executable file containing the virus. This is a valid approach. However, an important difference between computer viruses and biological viruses raises the possibility of a much gentler alternative.

From the body's point of view, cells are an easily-replenished resource. Even if biological viruses didn't destroy infected cells, an infected host cell would hardly be worth the trouble of saving; there are plenty of other cells around that can serve the same function. In contrast, each of the applications run by a typical computer user is unique in function and irreplaceable (unless backups have been kept, of course). A user would be likely to notice any malfunction. Consequently, it would be suicidal for a computer virus to destroy its host program, because the ensuing investigation would surely lead to its discovery and eradication. For this reason, all but the most ill-conceived computer viruses attach themselves to their host in such a way that they do not destroy its function. The fact that host information is merely rearranged, not destroyed, allows one to construct repair algorithms for a large class of non-destructive viruses for which one has a precise knowledge of the attachment method.

HANDLING UNKNOWN INTRUDERS

When the biological immune system encounters an intruder that it has never seen before, it can immediately recognize the intruder as non-self, and attack it on that basis. Over the course of days or weeks, through a process of mutation and selective proliferation (see the next subsection), it "learns" to fabricate antibodies and B- and T-cell receptors capable of recognizing that particular intruder very efficiently. By some unknown means, the immune system is able to "remember" the antigen (i.e., it retains immune cells with the proper receptors for recognizing that antigen) for decades after the initial encounter, and thus it is ready to respond much more quickly the next time that antigen is encountered.

To be effective, an antibody or receptor for a particular antigen must bind to that antigen (or close variants of that antigen) with high efficiency, and it must not bind to self proteins; otherwise, the host would be likely to suffer from an auto-immune disease. The biological immune system reduces the chances of recognizing self by subjecting immature immune cells to a training period in the thymus, during which those possessing self-recognizing receptors are eliminated.

CONCEPT OF 'SELF' AND 'NON-SELF'

The actual problem that both the vertebrate and the computer immune system must solve is to distinguish between harmful and benign entities. Due to the high degree of stability of body chemistry in individual vertebrates during their lifespan, their immune systems can replace the difficult problem of distinguishing between benign and harmful entities by the much simpler one of distinguishing self from non-self. This is a nice hack, because "self" is much easier to define and recognize than "benign". The immune system can simply implement the strategy "know thyself" (and reject all else). Although this errs on the side of false positives (i.e. falsely rejecting benign entities), rejection of foreign benign entities is generally not harmful.

Unfortunately, the notion of "self" in computers is somewhat problematic. We cannot simply regard the "self" as the set of software that was pre-loaded when the computer was first purchased. Computer users are continually updating and adding new software. It would be unacceptable if the computer immune system were to reject all such modifications and additions out of hand on the basis that they were different from anything else that happened to be on the system already.

By contrast with the biological case, false rejection of legitimate software is extremely harmful. It worries users unnecessarily, and can cause them to erase perfectly legitimate programs, leading to hours or days of lost productivity. After such an experience, users are often tempted to stop using anti-virus software, leaving themselves completely unprotected. Thus a false positive identification of a virus may be much more harmful than the virus itself. For this reason, self/non-self discrimination is not by itself an adequate means for distinguishing between harmful and non-harmful software. While the biological immune system can usually get away with presuming the guilt of anything unfamiliar, the computer immune system must presume that new software is innocent until it can prove that it is guilty of containing a virus.

CHECK FOR NEW SOFTWARE AGAINST VIRUS
The process by which the proposed computer immune system establishes whether new software contains a virus has several stages.

Integrity monitors: An integrity monitor computes a checksum of each file's contents and stores it in a database. The next time a file is opened, the software recomputes the checksum and compares it to the one in the database to see if the file has changed. If not, the file is considered virus free. If it has, the program scans the file for viruses. Since most files are virus free, this method is faster, because recomputing a checksum is considerably faster than comparing the file with all the binary signatures.

Activity Monitors: Evidence of a non-self entity is not, however, by itself enough to trigger an immune response. Mechanisms that employ the complementary strategy of "know thine enemy" are also brought into play. Among these are activity monitors, which have a sense of what dynamic behaviors are typical of viruses, and various heuristics, which examine the static nature of any modifications that have occurred to see if they have a viral flavor. In the computer immune system, integrity monitors and generic know-thine-enemy heuristics are periodically or continually on the lookout for any indications that a virus is present in the system.

Heuristic techniques are used to find unknown viruses and threats that have not yet been cataloged with signatures. They are also used to find known viruses that do not lend themselves to signatures, like some of the new metamorphic viruses that can obscure their entry points, have obfuscated code structures (that can shrink or expand themselves through their metamorphic engines), and are often encrypted as well. Heuristics look at characteristics of a file, such as size or architecture, as well as behaviors of its code, to determine the likelihood of an infection. A big plus for heuristics is the ability to detect viruses in files and boot records before they have a chance to run and infect your computer. Heuristics can sometimes find and stop many new viruses before they execute. Just as with a standard signature scanner, the user can initiate an on-demand heuristic scan of a new program or diskette before it is used.
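The integrity-monitor check described above can be illustrated with a minimal Python sketch. The class name IntegrityMonitor, its methods, and the choice of SHA-256 as the checksum are assumptions made for illustration; the report does not say which checksum function or database layout is used.

```python
import hashlib
from pathlib import Path
from typing import Dict


def checksum(path: Path) -> str:
    """Return a SHA-256 digest of the file's contents (assumed checksum function)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class IntegrityMonitor:
    """Hypothetical integrity monitor: remembers one checksum per file."""

    def __init__(self) -> None:
        # Maps a file path (as text) to the checksum recorded for it.
        self.database: Dict[str, str] = {}

    def register(self, path: Path) -> None:
        """Store the current checksum of a file in the database."""
        self.database[str(path)] = checksum(path)

    def needs_scan(self, path: Path) -> bool:
        """True if the file is new or has changed since it was registered.

        Unchanged files are considered virus free and skip the slower
        signature scan; new or changed files are handed to the scanner.
        """
        return self.database.get(str(path)) != checksum(path)
```

A file would be registered once and checked with needs_scan each time it is opened, so that only new or changed files pay the cost of a full scan.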
Users who run an on-access anti-virus program with heuristic scanning technology can detect a high percentage of new viruses as they are downloaded from the Internet or saved from an email attachment.

The typical heuristic scanner has at least two phases of operation when scanning an executable file for viruses. In the first phase, the goal of the heuristic scanner is to catalog what behaviors the program is capable of exhibiting. The heuristic scanner starts by determining the most likely location where a virus would attach itself to the executable file, if the file were infected. This is an important step because some executable files are many hundreds of kilobytes or even megabytes in length. Performing detailed heuristic analysis on such a large program would be excruciatingly slow. Given that most DOS-based computer viruses are only a few kilobytes in length, a well designed heuristic scanner can significantly limit those regions of the file to be scrutinized. Most often, this region will be the first and last few kilobytes of the file.

Anti-virus researchers have investigated two competing heuristic scanning architectures: static heuristics and dynamic heuristics. The primary difference between these two schemes is whether the heuristic scanner employs CPU emulation to search for virus-like behavior.

Scanner: If one of the virus-detection heuristics is triggered, the immune system runs the scanner to determine whether the anomaly can be attributed to a known virus. If so, the virus is located and removed in the usual way.

Virus scanners work in several ways. The most common is pattern matching, where the scanner looks inside files for a string of bytes which matches one in its database of known viruses. This will find the majority of viruses. To detect viruses which use techniques such as polymorphism to change their code slightly each time they infect, scanners also have to use more sophisticated inspection techniques. These include processing files in a real-time emulation mode to watch for the polymorphic engine to decrypt the virus. Most virus-scanner developers constantly improve their scanners to keep up with the growth of viruses and the ever-expanding range of infectable objects to inspect.

If the anomaly cannot be attributed to a known virus, either the generic virus-detection heuristics yielded a false alarm, or a previously unknown virus is at large in the system.
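The pattern-matching step of the scanner can be sketched in a few lines of Python. The signature names and byte strings in KNOWN_SIGNATURES are invented placeholders, not real virus signatures, and the functions scan and triage are hypothetical names; emulation-based detection of polymorphic viruses is not covered.

```python
from typing import Dict, List

# Hypothetical signature database: virus name -> byte string to look for.
KNOWN_SIGNATURES: Dict[str, bytes] = {
    "example-virus-a": bytes.fromhex("deadbeef0badf00d"),
    "example-virus-b": b"\x90\x90\xeb\xfe\xcd\x21",
}


def scan(data: bytes, signatures: Dict[str, bytes] = KNOWN_SIGNATURES) -> List[str]:
    """Return the names of all known viruses whose signature occurs in data."""
    return [name for name, signature in signatures.items() if signature in data]


def triage(data: bytes) -> str:
    """Decide what to do after a virus-detection heuristic has been triggered."""
    matches = scan(data)
    if matches:
        return "known virus: " + ", ".join(matches)
    # No signature matched: a false alarm or a previously unknown virus.
    return "unknown anomaly"
```

When triage reports an unknown anomaly, the immune system cannot yet tell a false alarm from a new virus, which is why the next stage tries to capture a sample.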
Decoy Programs: At this point, the computer immune system tries to lure any virus that might be present in the system to infect a diverse suite of "decoy" programs. A decoy program's sole purpose in life is to become infected. To increase the chances of success in this noble, selfless endeavor, decoys are designed to be as attractive as possible to those types of viruses that spread most successfully. A good strategy for a virus to follow is to infect programs that are touched by the operating system in some way. Such programs are most likely to be executed by the user, and thus serve as the most successful vehicle for further spread. Therefore, the immune system entices a putative virus to infect the decoy programs by executing, reading, writing to, copying, or otherwise manipulating each of them. Such activity tends to attract the attention of many viruses that remain active in memory even after they have returned control to their host. To catch viruses that do not remain active in memory, the decoys are placed in places where the most commonly used programs in the system are typically located, such as the root directory, the current directory, and other directories in the path. The next time the infected file is run, it is very likely to select one of the decoys as its victim.

From time to time, each of the decoy programs is examined to see if it has been modified. If one or more have been modified, it is almost certain that an unknown virus is loose in the system, and each of the modified decoys contains a sample of that virus. These virus samples are stored in such a way that they will not be executed accidentally.

The capture of a virus sample by the decoy programs is somewhat analogous to the ingestion of antigen by macrophages or B cells. It allows the intruder to be processed into a standard format that can be parsed by some other component of the immune system, and provides a standard location where information on the intruder can be found. In the biological immune system, the T cells that recognize the antigen are selected according to their ability to bind to fragments of the antigen that are presented on the surface of cells that have ingested (or been infected by) the antigen. Likewise, in the computer immune system, the infected decoys are then processed by another component of the immune system, the signature extractor, so as to develop a recognizer for the virus. The computer immune system has an additional task that is not shared by its biological analog: it must attempt to extract from the decoys information about how the virus attaches to its host, so that infected hosts can be repaired (if possible).
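The decoy mechanism described above can be sketched as follows. This is a simplified illustration under stated assumptions: the decoys here are files of random bytes rather than real executable programs, they are only read rather than executed, and the function names plant_decoys, exercise_decoys and infected_decoys are hypothetical.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, List


def digest(path: Path) -> str:
    """Checksum used to notice that a decoy has been modified."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plant_decoys(directories: List[Path], per_directory: int = 3) -> Dict[Path, str]:
    """Create decoys of varying size in each directory and record their checksums."""
    baseline: Dict[Path, str] = {}
    for directory in directories:
        for i in range(per_directory):
            decoy = directory / f"decoy_{i}.bin"
            decoy.write_bytes(os.urandom(64 * (i + 1)))  # sizes differ for diversity
            baseline[decoy] = digest(decoy)
    return baseline


def exercise_decoys(baseline: Dict[Path, str]) -> None:
    """Touch every decoy (here: read it) to attract memory-resident viruses."""
    for decoy in baseline:
        decoy.read_bytes()


def infected_decoys(baseline: Dict[Path, str]) -> List[Path]:
    """Return the decoys that are missing or whose contents have changed."""
    return [path for path, recorded in baseline.items()
            if not path.exists() or digest(path) != recorded]
```

Any path returned by infected_decoys would hold a captured sample, to be stored safely and passed on for signature extraction.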
SIGNATURE EXTRACTOR

In the computer immune system, the infected decoys are processed by another component of the immune system, the signature extractor, so as to develop a recognizer for the virus. The signature extractor must select a virus signature from among the byte sequences produced by the attachment derivation step.

PROBLEMS FACED BY THE SIGNATURE EXTRACTOR

The signature must be well-chosen, such that it avoids both false negatives and false positives. In other words, the signature must be found in each instance of the virus, and it must be very unlikely to be found in uninfected programs, so that it avoids both of the problems explained below:

• False negative problem: The samples captured by the decoys may not represent the full range of variable appearance of which the virus is capable. As a general rule, non-executable "data" portions of programs, which can include representations of numerical constants, character strings, work areas for computations, etc., are inherently more likely to vary from one instance of the virus to another than are "code" portions, which represent machine instructions. The origin of the variation may be internal to the virus (e.g. it could depend on a date). Alternatively, a virus hacker might deliberately change a few data bytes in an effort to elude virus scanners. To be conservative, "data" areas are excluded from consideration as possible signatures. Although the task of separating code from data is in principle somewhat ill-defined, there are a variety of methods, such as running the virus through a debugger or virtual interpreter, which perform reasonably well.
• False positive problem: In the biological immune system, false positives that accidentally recognize self cause auto-immune diseases. In both traditional antivirus software and the proposed computer immune system, false positives are particularly annoying to customers, and so infuriating to vendors of falsely accused software that the problem has led to at least one lawsuit against a major anti-virus software vendor.

ALGORITHM USED BY THE SIGNATURE-EXTRACTOR

The automatic signature extractor examines each sequence of S contiguous bytes (referred to as "candidate signatures") in the set of invariant-code byte sequences that have been presented to it, and for each it estimates the probability for that S-byte sequence to be found in the collection of normal, uninfected "self" programs. Typically, S is chosen to be 16 or 24. The probability estimate is made by

1. Forming a list of all n-grams (sequences of n bytes, 1 ≤ n ≤ nmax) contained in the input data (nmax is typically 5 or 8).
2. Calculating the frequency of each such n-gram in the "self" collection (in the case of signatures that are to be distributed worldwide, we use a half-gigabyte corpus of ordinary, uninfected programs).
3. Using a simple formula to combine the n-gram frequencies into a probability estimate for each candidate signature to be found in a set of programs similar in size and statistical character to the corpus.
4. Selecting the signature with the lowest estimated false-positive probability.

The selected sequence is taken as the signature of the virus and its variants.

ADDING SIGNATURE TO THE DATABASE

Having automatically developed both a recognizer and a repair algorithm appropriate to the virus, the information can be added to the corresponding databases.
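The four extraction steps and the database update can be sketched in Python. The report only says that "a simple formula" combines the n-gram frequencies, so the formula below is an assumption: a bigram chain estimate (nmax = 2) with add-one smoothing, computed in log space. The names ngram_counts, log_probability, extract_signature and signature_db are hypothetical, and a toy corpus stands in for the half-gigabyte collection of uninfected programs.

```python
import math
from collections import Counter
from typing import Dict, Tuple


def ngram_counts(corpus: bytes, n: int) -> Counter:
    """Steps 1 and 2: count every n-byte sequence in the 'self' corpus."""
    return Counter(corpus[i:i + n] for i in range(len(corpus) - n + 1))


def log_probability(candidate: bytes, uni: Counter, bi: Counter, total: int) -> float:
    """Step 3 (assumed formula): bigram chain estimate with add-one smoothing."""
    result = math.log((uni[candidate[:1]] + 1) / (total + 256))
    for i in range(1, len(candidate)):
        pair, previous = candidate[i - 1:i + 1], candidate[i - 1:i]
        result += math.log((bi[pair] + 1) / (uni[previous] + 256))
    return result


def extract_signature(invariant_code: bytes, corpus: bytes, s: int = 16) -> Tuple[bytes, float]:
    """Step 4: pick the S-byte candidate least likely to occur in the corpus."""
    uni, bi = ngram_counts(corpus, 1), ngram_counts(corpus, 2)
    candidates = {invariant_code[i:i + s] for i in range(len(invariant_code) - s + 1)}
    # Ties are broken by byte order so that the result is deterministic.
    best = min(candidates, key=lambda c: (log_probability(c, uni, bi, len(corpus)), c))
    return best, log_probability(best, uni, bi, len(corpus))


# Hypothetical signature database: virus name -> chosen signature.
signature_db: Dict[str, bytes] = {}


def add_to_database(name: str, signature: bytes) -> None:
    """Record the extracted signature so that later scans recognize the virus."""
    signature_db[name] = signature
```

Working in log space avoids numerical underflow when S is 16 or 24, since the product of that many small probabilities quickly becomes too small to represent directly.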
If the virus is ever encountered again, the immune system will recognize it immediately as a known virus. A computer with an immune system could be thought of as "ill" during its first encounter with a virus, since a considerable amount of time and energy (or CPU cycles) would be expended to analyze the virus. However, on subsequent encounters, detection and elimination of the virus would occur much more quickly: the computer could be thought of as "immune" to the virus.

SELF REPLICATION

In the biological immune system, immune cells with receptors that happen to match a given antigen reasonably well are stimulated to reproduce themselves. This provides a very strong selective pressure for good recognizers, and by bringing a degree of mutation into play, the immune system is generally able to come up with immune cells that are extremely well-matched to the antigen in question. One can view this as a case in which self-replication is being used to fight a self-replicator (the virus) in a very effective manner.

KILL SIGNAL

It is proposed to use a similar mechanism, which we call the "kill signal", to quell viral spread in computer networks. When a computer discovers that it is infected, it can send a signal to neighboring machines. The signal conveys to the recipient the fact that the transmitter was infected, plus any signature or repair information that might be of use in detecting and eradicating the virus.

ALGORITHM USED BY THE NEIGHBOUR
• If the recipient finds that it is infected, it sends the signal to its neighbors, and so on, and immunizes itself using the signature and repair information passed by the neighbor.
• If the recipient is not infected, it does not pass along the signal, but at least it has received the database updates, effectively immunizing it against that virus (see Fig. 2).
Figure 2
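The neighbour algorithm for the kill signal can be simulated with a short breadth-first traversal. This is a sketch under simplifying assumptions: the network is a fixed adjacency map, the set of infected machines is known in advance, and the function name propagate_kill_signal and the machine names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def propagate_kill_signal(neighbours: Dict[str, List[str]],
                          infected: Set[str],
                          origin: str) -> Set[str]:
    """Return every machine that ends up holding the signature and repair data.

    The origin is the machine that discovered its own infection. A recipient
    forwards the signal only if it finds that it is infected too; an uninfected
    recipient keeps the database update but stays silent.
    """
    immunized = {origin}
    queue = deque([origin])
    while queue:
        sender = queue.popleft()
        for recipient in neighbours.get(sender, []):
            if recipient in immunized:
                continue  # already holds the update
            immunized.add(recipient)  # receives signature and repair information
            if recipient in infected:
                queue.append(recipient)  # infected recipients pass the signal on
    return immunized
```

On a chain of machines a, b, c, d, e in which a and b are infected, a signal starting at a reaches b and c and then stops, leaving a ring of immunized machines just beyond the infected ones.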
Figure 3: Overview of Artificial Immune System
(Flowchart boxes, in extracted order: Detects virus; Scan for known virus; Known virus; Remove virus; Unknown virus; Capture sample using decoy; Send signal to neighboring machines; Segregate code/data; Algorithmic virus analysis; Extract signature; Add signature to database; Add removal info in database.)
CONCLUSION

An immune system for computers is desirable and feasible. Most of the necessary components are already in use in one form or another. Some already exist in IBM AntiVirus itself. Others are presently in use in the virus laboratory, for the purpose of updating the databases employed by IBM AntiVirus to recognize viruses and repair infected files.

One of the technical issues that remain to be explored further is the kill signal. Further simulation will help to establish the exact circumstances under which a node should send signals to its neighbors, and for what length of time these signals should be sent. Further analysis and simulation must be conducted to assess the effectiveness of various fail-safe mechanisms that have been proposed to deal with the propagation of erroneous kill signals, which could result from false positives, software bugs, or intentional subversion by malicious users. The biological immune system has invented various inhibitory mechanisms which may turn out to be of use to us.