
Modeling and Detection of Camouflaging Worm

2012

CHAPTER 4

SYSTEM DESIGN
Systems design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It implies a systematic and rigorous approach to design, an approach demanded by the scale and complexity of many systems problems. The purpose of System Design is to create a technical solution that satisfies the functional requirements for the system. At this point in the project lifecycle there should be a Functional Specification, written primarily in business terminology, containing a complete description of the operational needs of the various organizational entities that will use the new system. The challenge is to translate all of this information into Technical Specifications that accurately describe the design of the system and that can be used as input to System Construction. The Functional Specification produced during System Requirements Analysis is transformed into a physical architecture. System components are distributed across the physical architecture, usable interfaces are designed and prototyped, and Technical Specifications are created for the Application Developers, enabling them to build and test the system.
System design comprises logical design and physical design. Logical design describes the structure and characteristics of the system, such as its outputs, inputs, files, databases, and procedures. Physical design, which follows logical design, produces the actual software and a working system. The design is subject to constraints such as hardware, software, cost, time, and interfaces.

Department of Computer Science & Engg, SaIT

System Design involves the analysis, design, and configuration of the necessary hardware and software components to support the solution's architecture. The major components of System Design include the Information Model, Community Model, Security/Permission Model, System Integration, Workflow, and Technical Architecture. A System Design typically provides the following benefits:
- Improved system performance: individually tailored configuration advice demonstrates where improvement is necessary and how to improve the system to regain lost performance.
- Customers gain a detailed understanding of how their users use their system. This Usage Profile can be leveraged to develop future architecture changes.
- Potential to learn of future concerns, allowing customers to take proactive measures to avoid problems.
- A baseline performance level is established against which benefits can be compared and changes to the system predicted.

System design is the process of working out the overall functionality and approach that the system will include. It starts at a high level and then drills down into great detail, and normally ends with the production of a technical specification. Design determines exactly how the specifications are to be implemented. Analysis and design are very important in the whole development cycle: any fault in the design could affect the product or could be very expensive to fix in a later stage of software development. System Design is the activity of proceeding from an identified set of requirements for a system to a design that meets those requirements. A distinction is sometimes drawn between high-level or architectural design, which is concerned with the main components of the system and their roles and interrelationships, and detailed design, which is concerned with the internal structure and operation of individual components. The term system design is sometimes used to cover just the high-level design activity.

System design is broadly classified into two categories: high level design and low level design.

High Level Design
A high-level design provides an overview of a solution, platform, system, product, service, or process. Such an overview is important in a multi-project development to make sure that each supporting component design will be compatible with its neighboring designs and with the big picture. The highest level solution design should briefly describe all platforms, systems, products, services and processes that it depends upon and include any important changes that need to be made to them. A high-level design document will usually include a high-level architecture diagram depicting the components, interfaces and networks that need to be further specified or developed.

The high-level design defines the project level architecture of the system. This architecture defines the sub-systems to be built, the internal and external interfaces to be developed, and the interface standards to be used. The high level design is where the subsystem requirements are developed. It also identifies the major candidate off-the-shelf products that might be used in the system. High-level design is the transitional step between what the system does [requirements for subsystems] and how the system will be implemented [architecture and interfaces] to meet the system requirements. This process includes the decomposition of system requirements into alternative project architectures and then the evaluation of these project architectures for optimum performance, functionality, cost, and other issues [technical and non-technical]. Stakeholder involvement is critical for this activity. In this step, internal and external interfaces are identified along with the needed industry standards. These interfaces are then managed throughout the development process.

The following uses ramp metering as an example for the two key decomposition activities:
- Functional decomposition is breaking a function down into its smallest parts. [E.g., ramp metering includes the sub-functions of detection, meter rate control, main line metering, ramp queuing, time of day, and communications].
- Physical decomposition defines the physical elements needed to carry out the function. [E.g., ramp metering decomposition includes loops, controller clock, fiber or twisted pair for communications, 2070 controllers, host computers, cabinets, and conduits].

Finally, allocating these sub-functions to the physical elements of the system forms the complete project architecture. This step also defines the integration and verification activities needed when the system elements are developed. The high-level design of a software system is a collection of module and subroutine interfaces related to each other by means of USES and IS_COMPONENT_OF relationships.

The High Level Design Document is an important document for a project, covering at a high level the overall design of the solution. A very succinct summary of its contents could be:
- Detailed use case scenarios of key process flows of the application
- The class model and relationships
- The sequence diagrams which outline key use case scenarios
- The data/object model with relational table design
- User interface style and design

After the requirements definition, the high level design is the most important document and provides the blueprint for the further stages of a project, including the detailed design and implementation stages. By not getting the high level design right, organisations run the risk of creating problems which could be extremely expensive to remedy at a later stage. The purpose of this High Level Design (HLD) Document is to add the necessary detail to the current project description to represent a suitable model for coding. This document is also intended to help detect contradictions prior to coding, and can be used as a reference manual for how the modules interact at a high level. The HLD documentation presents the structure of the system, such as the database architecture, application architecture (layers), application flow (navigation), and technology architecture. The HLD uses non-technical to mildly-technical terms which should be understandable to the administrators of the system.

The document may also depict or otherwise refer to work flows and/or data flows between component systems. In addition, there should be brief consideration of all significant commercial, legal, environmental, security, safety and technical risks, issues and assumptions. The idea is to mention every work area briefly, clearly delegating the ownership of more detailed design activity whilst also encouraging effective collaboration between the various project teams.

Today, most high-level designs require contributions from a number of experts, representing many distinct professional disciplines. Finally, every type of end-user should be identified in the high-level design and each contributing design should give due consideration to customer experience. The functioning of the high level design can be explained with the help of an architecture diagram, a class diagram and a sequence diagram.

Architecture Diagram
An architecture diagram in system architecture typically depicts a technological set-up: either various computer components working together, or steps in a software process working towards a specific end result.

FIG. 4.1 Architecture diagram of camouflaging worm

Fig. 4.1 shows a centralized C-Worm detection system along with its modules. The components include the pure random scan, the worm detection list, and a system scan. The system scan is performed by selecting System Volume Information.

Class Diagram
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, and the relationships between the classes. The class diagram is the main building block of object oriented modeling. It is used both for general conceptual modeling of the systematics of the application, and for detailed modeling, translating the models into programming code. Class diagrams can also be used for data modeling. The classes in a class diagram represent both the main objects and/or interactions in the application and the objects to be programmed. In the class diagram these classes are represented with boxes which contain three parts:
- The upper part holds the name of the class
- The middle part contains the attributes of the class
- The bottom part gives the methods or operations the class can take or undertake

In the design of a system, a number of classes are identified and grouped together in a class diagram, which helps to determine the static relations between those objects. With detailed modeling, the classes of the conceptual design are often split into a number of subclasses.
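The three parts of a class box map directly onto code. The following is a minimal sketch in Python; the class ScanReport and its members are hypothetical names chosen only for illustration and are not taken from Fig. 4.2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanReport:  # upper part: the name of the class (hypothetical)
    # middle part: the attributes of the class
    drive: str
    infected_files: List[str] = field(default_factory=list)

    # bottom part: the operations the class offers
    def add_infected(self, path: str) -> None:
        """Record the path of one infected file."""
        self.infected_files.append(path)

    def is_clean(self) -> bool:
        """Return True when no infected file has been recorded."""
        return not self.infected_files


report = ScanReport(drive="C:")
report.add_infected("C:/tmp/sample.exe")
```

In a class diagram, this class would appear as a single box with "ScanReport" in the upper part, the two attributes in the middle part, and the two operations in the bottom part.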

FIG. 4.2 Class diagram of camouflaging worm


Sequence Diagram

A sequence diagram in the Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. A sequence diagram shows object interactions arranged in time sequence. It depicts the objects and classes involved in the scenario and the sequence of messages exchanged between the objects needed to carry out the functionality of the scenario. Sequence diagrams are typically associated with use case realizations in the Logical View of the system under development. Sequence diagrams are sometimes called event diagrams, event scenarios, or timing diagrams. A sequence diagram shows, as parallel vertical lines (lifelines), different processes or objects that live simultaneously, and, as horizontal arrows, the messages exchanged between them, in the order in which they occur. This allows the specification of simple runtime scenarios in a graphical manner.
[Sequence diagram labels: Pure Random Scan (PRS); C-Worm Detection; Select Any Input Drive; Check Task Manager; Scan File Node; Start C-Worm Scan; Check Worm Detection Status; Check Worm Virus in Local Drive; If File Infected by Worm; Detected the Worm Virus; Analysis of the Worm Detection Time]

FIG. 4.3 Sequence diagram of camouflaging worm

Main Modules
The different modules included in this project are:

1. C-Worm Detection Module
The C-Worm has a self-propagating behavior similar to traditional worms, i.e., it intends to rapidly infect as many vulnerable computers as possible. However, the C-Worm differs from traditional worms in that it camouflages any noticeable trends in the number of infected computers over time. The camouflage is achieved by manipulating the scan traffic volume of worm-infected computers. Such manipulation of the scan traffic volume prevents the exhibition of any exponentially increasing trends, or even the crossing of thresholds, that are tracked by existing detection schemes. The worm attempts to remain hidden by sleeping (suspending scans) when it suspects it is under detection. Worms that adopt such smart attack strategies could exhibit overall scan traffic patterns different from those of traditional worms. Since the existing worm detection schemes will not be able to detect such scan traffic patterns, it is very important to understand such smart worms and develop new countermeasures to defend against them.

2. Anomaly Detection Module
Worms are malicious programs that execute on end-host computers, so analyzing the behavior of worm executables plays an important role in host-based detection systems. Many detection schemes fall under this category. In contrast, network-based detection systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic (messages to identify vulnerable computers) generated by worm attacks. Many detection schemes fall under this category as well. Ideally, security vulnerabilities must be prevented to begin with, a problem which must be addressed by the programming language community. However, while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus on network-based detection, as this project does, to detect widely spreading worms.

Anomaly detection, also referred to as outlier detection, refers to detecting patterns in a given data set that do not conform to an established normal behavior.[2] The patterns thus detected are called anomalies and often translate to critical and actionable information in several application domains. Anomalies are also referred to as outliers, changes, deviations, surprises, aberrations, peculiarities, intrusions, etc.

In particular, in the context of abuse and network intrusion detection, the interesting objects are often not rare objects, but unexpected bursts in activity. This pattern does not adhere to the common statistical definition of an outlier as a rare object, and many outlier detection methods (in particular unsupervised methods) will fail on such data unless it has been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the micro clusters formed by these patterns.

Three broad categories of anomaly detection techniques exist:
- Unsupervised anomaly detection techniques detect anomalies in an unlabeled test data set, under the assumption that the majority of the instances in the data set are normal, by looking for instances that seem to fit least to the remainder of the data set.
- Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involve training a classifier (the key difference from many other statistical classification problems is the inherently unbalanced nature of outlier detection).
- Semi-supervised anomaly detection techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood that a test instance was generated by the learnt model.
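The aggregation step and the semi-supervised category can be illustrated together. The following is a minimal sketch, not the detection scheme of this project: it assumes that events are aggregated into fixed time windows and that normal behavior is modeled simply by the mean and standard deviation of the per-window counts. All function names, the threshold, and the data are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def window_counts(timestamps, window, n_windows):
    """Aggregate event timestamps (seconds) into per-window event counts."""
    counts = Counter(int(t // window) for t in timestamps)
    return [counts.get(i, 0) for i in range(n_windows)]


def fit_normal(counts):
    """Model normal behaviour as mean and standard deviation of the counts."""
    return mean(counts), pstdev(counts)


def flag_bursts(counts, model, z_threshold=3.0):
    """Return indices of windows whose count lies far above the normal mean."""
    mu, sigma = model
    sigma = sigma or 1.0  # avoid division by zero for a constant baseline
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > z_threshold]


# Hypothetical data: a quiet training trace, then a test trace with one burst.
normal = window_counts([0.5, 1.5, 2.5, 3.5, 4.5, 5.5], window=1.0, n_windows=6)
test = window_counts([0.5, 1.5, 2.1, 2.2, 2.3, 2.4, 2.5, 3.5],
                     window=1.0, n_windows=4)
bursts = flag_bursts(test, fit_normal(normal))
```

No single event in the test trace is unusual on its own; only after aggregation does the third window stand out, and it is the only window that the sketch flags.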

3. Pure Random Scan (PRS) Module
The C-Worm can be extended to defeat other newly developed detection schemes, such as destination distribution-based detection. Recall that attack target distribution-based schemes analyze the distribution of attack targets (the scanned destination IP addresses) as basic detection data to capture a fundamental feature of worm propagation, i.e., that worms continuously scan different targets.
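Why the destination distribution is a useful detection feature can be sketched as follows. The example assumes, as the module name suggests, that a pure random scan draws every destination IPv4 address uniformly at random; the function names, the benign comparison trace, and the ratio used as a feature are illustrative assumptions, not part of the source.

```python
import ipaddress
import random


def pure_random_scan(n_scans, rng):
    """Draw n_scans destination IPv4 addresses uniformly at random."""
    return [ipaddress.IPv4Address(rng.getrandbits(32)) for _ in range(n_scans)]


def distinct_ratio(targets):
    """Fraction of contacted destinations that are distinct."""
    return len(set(targets)) / len(targets)


rng = random.Random(0)  # fixed seed so that the sketch is repeatable

# Worm-like trace: every scan goes to a freshly drawn random address.
worm_targets = pure_random_scan(1000, rng)

# Hypothetical benign trace: a host that keeps talking to five servers.
benign_targets = [ipaddress.IPv4Address("192.0.2.%d" % rng.randint(1, 5))
                  for _ in range(1000)]

worm_ratio = distinct_ratio(worm_targets)
benign_ratio = distinct_ratio(benign_targets)
```

The randomly scanning host contacts almost exclusively new destinations, so its ratio stays close to 1, whereas the benign host's ratio is at most 5/1000.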

4. Worm Propagation Module
Worm scan traffic volume in an open-loop control system has a much higher probability of showing an increasing trend as worm propagation progresses: as more and more computers get infected, they in turn take part in scanning other computers. Hence, we consider the C-Worm as a worst-case attack scenario that uses closed-loop control to regulate the propagation speed based on feedback about the propagation status.
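The difference between open-loop and closed-loop scanning can be sketched with a small simulation. This is a simplified illustration under stated assumptions, not the propagation model of this project: infections follow a basic discrete epidemic update, and in the closed-loop case the worm is assumed to know the number of infected hosts exactly and to lower its attack probability so that the overall scan volume stays at a target value. All parameter values are hypothetical.

```python
def simulate(steps, n_hosts, scan_rate, infect_rate, target_volume=None):
    """Return the overall scan traffic volume for each time step.

    target_volume=None models open-loop scanning (every infected host
    always scans). Otherwise the attack probability is reduced as the
    infection grows, capping the volume at target_volume (closed loop).
    """
    infected = 1.0
    volumes = []
    for _ in range(steps):
        full_volume = scan_rate * infected
        if target_volume is None:
            p = 1.0
        else:
            # feedback: scale scanning down according to propagation status
            p = min(1.0, target_volume / full_volume)
        volumes.append(p * full_volume)
        # simple epidemic update: new infections scale with active scanners
        infected += infect_rate * p * infected * (1.0 - infected / n_hosts)
    return volumes


open_loop = simulate(steps=60, n_hosts=10_000, scan_rate=10.0, infect_rate=0.3)
closed_loop = simulate(steps=60, n_hosts=10_000, scan_rate=10.0,
                       infect_rate=0.3, target_volume=200.0)
```

In the open-loop run the volume grows with the number of infected hosts, which is the increasing trend that existing detectors look for. In the closed-loop run the volume never exceeds the target, at the price of slower propagation.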

Low Level Design


Low Level Design (LLD) details the HLD. It defines the actual logic for each and every component of the system. Class diagrams with all the methods and relations between classes come under LLD, as do program specifications. The LLD describes each and every module in an elaborate manner so that the programmer can directly code the program based on it. There will be at least one document for each module, and there may be more. The LLD will contain:
- detailed functional logic of the module in pseudocode
- database tables with all elements, including their type and size
- all interface details with complete API references (both requests and responses)
- all dependency issues
- error message listings
- complete inputs and outputs for a module

The low level design document for a project should provide a complete and detailed specification of the design for the software that will be developed in the project, including the classes, member and non-member functions, and associations between classes that are involved. By the end of the Low Level Design stage, the code should be "all but written". The low level design document should contain a listing of the declarations of all the classes, non-member functions, and class member functions that will be defined during the implementation stage, along with the associations between those classes and any other details of those classes (such as member variables) that are firmly determined by the low level design stage. The low level design document should also describe the classes, function signatures, associations, and any other appropriate details which will be involved in testing and evaluating the project according to the evaluation plan defined in the project's requirements document.
More importantly, each project's low level design document should provide a narrative describing (and comments in the declaration and definition files should point out) how the high level design is mapped into its detailed low-level design, which is just a step away from the implementation itself. This should be an English description of how the technical diagrams (and text descriptions) found in the high level design were converted into appropriate class and function declarations in the low level design. Particular care should be taken to explain how the class roles and their methods were combined in the low level design, and any changes that were made in combining and refining them. During the detailed phase, the view of the application developed during the high level design is broken down into modules and programs.
Logic design is done for every program and then documented as program specifications. For every program, a unit test plan is created. The entry criterion for this phase is the HLD document, and the exit criteria are the program specification and unit test plan (LLD). The Low Level Design Document gives the design of the actual program code, based on the High Level Design Document. It defines the internal logic of each sub-module; designers prepare individual LLDs and map them to every module. A good Low Level Design Document makes the program easy to develop: if proper analysis is done and the document is prepared carefully, developers can write the code directly from it with minimal debugging and testing effort. The Low Level Design is explained by the Data Flow Diagram and the Activity Diagram.

Data Flow Diagram
A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system, modeling its process aspects. DFDs are often a preliminary step used to create an overview of the system which can later be elaborated.[2] DFDs can also be used for the visualization of data processing (structured design). Data flow diagrams are one of the three essential perspectives of the structured-systems analysis and design method SSADM. The sponsor of a project and the end users will need to be briefed and consulted throughout all stages of a system's evolution. With a data flow diagram, users are able to visualize how the system will operate, what the system will accomplish, and how the system will be implemented. The old system's data flow diagrams can be drawn up and compared with the new system's data flow diagrams in order to implement a more efficient system. Data flow diagrams can be used to give the end user a physical idea of where the data they input ultimately has an effect upon the structure of the whole system, from order to dispatch to report. How any system is developed can be determined through a data flow diagram. A DFD shows what kinds of data will be input to and output from the system, where the data will come from and go to, and where the data will be stored. It does not show information about the timing of processes, or about whether processes will operate in sequence or in parallel (which is shown on a flowchart).

[Data flow diagram labels: Worm-Detection; Pure Random Scan (PRS); Select Drive (no); Scan Selected Drive or Folder, File; If Selected Drive (yes); C-Worm Detection; Stored Worm Scan Details Log File; Traffic and Non-Worm Traffic; View File Log; Check Status; Worm Detection List; Worm Analysis; End]

FIG. 4.4 Data Flow diagram of camouflaging worm
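The data flow of Fig. 4.4 (select a drive or folder, scan its files, store the scan details in a log file, then view the log) can be sketched as a small pipeline. This is only an illustration of how data moves between the steps: the function names are hypothetical, and the detector that flags file names ending in ".worm" is a stand-in assumption, not the detection logic of this project.

```python
import os
import tempfile


def scan_folder(folder, is_infected):
    """Walk the selected folder and return (path, infected) for each file."""
    results = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            path = os.path.join(root, name)
            results.append((path, is_infected(path)))
    return results


def write_log(results, log_path):
    """Store the scan details in a log file, one line per scanned file."""
    with open(log_path, "w", encoding="utf-8") as log:
        for path, infected in results:
            log.write(f"{'WORM' if infected else 'CLEAN'}\t{path}\n")


def read_detection_list(log_path):
    """View the log and return only the paths flagged as worm-infected."""
    with open(log_path, encoding="utf-8") as log:
        rows = [line.rstrip("\n").split("\t", 1) for line in log]
    return [path for status, path in rows if status == "WORM"]


def demo_detector(path):
    """Hypothetical stand-in detector: flags names ending in '.worm'."""
    return path.endswith(".worm")


# A temporary directory plays the role of the selected drive.
with tempfile.TemporaryDirectory() as drive:
    for name in ("a.txt", "b.worm"):
        open(os.path.join(drive, name), "w").close()
    log_path = os.path.join(drive, "scan.log")
    write_log(scan_folder(drive, demo_detector), log_path)
    detected = [os.path.basename(p) for p in read_detection_list(log_path)]
```

As in a DFD, the sketch states where data comes from, where it goes, and where it is stored, while leaving the detection process itself as a replaceable box.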

Activity Diagram

Activity diagrams are graphical representations of workflows of stepwise activities and actions with support for choice, iteration and concurrency. In the Unified Modeling Language, activity diagrams can be used to describe the business and operational step-by-step workflows of components in a system. An activity diagram shows the overall flow of control. Activity diagrams are constructed from a limited number of shapes, connected with arrows. The most important shape types are:
- Rounded rectangles represent activities.
- Diamonds represent decisions.
- Bars represent the start (split) or end (join) of concurrent activities.
- A black circle represents the start (initial state) of the workflow.
- An encircled black circle represents the end (final state).

Arrows run from the start towards the end and represent the order in which activities happen. Hence activity diagrams can be regarded as a form of flowchart. Typical flowchart techniques lack constructs for expressing concurrency. However, the join and split symbols in activity diagrams only resolve this for simple cases; the meaning of the model is not clear when they are arbitrarily combined with decisions or loops.

An activity diagram is essentially a flow chart that represents the flow from one activity to another. An activity is a particular operation of the system, so the control flow is drawn from one operation to another. This flow can be sequential, branched or concurrent, and activity diagrams deal with all types of flow control by using different elements such as fork and join. Activity diagrams are used not only for visualizing the dynamic nature of a system but also for constructing the executable system by using forward and reverse engineering techniques. The only thing missing in an activity diagram is the message part: it does not show any message flow from one activity to another. Although an activity diagram looks like an ordinary flow chart, it goes beyond one, because it shows different kinds of flow: parallel, branched, concurrent and single.

[Activity diagram labels: C-Worm Detection; Select System Volume Drive (Yes/No); Start C-Worm Scan; Non Propagation; Infected File; Worm Scan File List; Detected Worm; Stored Log File; Worm Detection Status; Worm Analysis]

FIG. 4.5 Activity diagram of camouflaging worm

Use Case Diagram
In software and systems engineering, a use case is a list of steps, typically defining interactions between a role (known in UML as an "actor") and a system, to achieve a goal. The actor can be a human or an external system. In systems engineering, use cases are used at a higher level than within software engineering, often representing missions or stakeholder goals. The detailed requirements may then be captured in SysML or as contractual statements. A use case defines the interactions between external actors and the system under consideration to accomplish a goal. Actors must be able to make decisions, but need not be human: "An actor might be a person, a company or organization, a computer program, or a computer system: hardware, software, or both." Actors are always stakeholders, but many stakeholders are not actors, since they "never interact directly with the system, even though they have the right to care how the system behaves." For example, "the owners of the system, the company's board of directors, and regulatory bodies such as the Internal Revenue Service and the Department of Insurance" could all be stakeholders but are unlikely to be actors.

Similarly, a person using a system may be represented as different actors because he is playing different roles. For example, user "Joe" could be playing the role of a Customer when using an Automated Teller Machine to withdraw cash from his own account, or playing the role of a Bank Teller when using the system to restock the cash drawer on behalf of the bank. Actors are often working on behalf of someone else. A stakeholder may play both an active and an inactive role: for example, a Consumer is both a "mass-market purchaser" (not interacting with the system) and a User (an actor, actively interacting with the purchased product).[13] In turn, a User is both a "normal operator" (an actor using the system for its intended purpose) and a "functional beneficiary" (a stakeholder who benefits from the use of the system).[13] For example, when user "Joe" withdraws cash from his account, he is operating the Automated Teller Machine and obtaining a result on his own behalf.

Conceptual modelling refers to specifying, visualizing, and documenting models of, for instance, the context of use, a business model, or a software system. The perspective of the terms in this category is rather technical.

Context of use refers to the characteristics of the users, tasks, and the organizational and physical environment. It may also describe the cognitive, motivational and emotional characteristics of the different users, as well as cooperative behavior and articulation work. The context of use is established from observations of real work and from interviews, including the reflexive point of view of actors on their own context of use. Such an analysis examines the possible conflicts of interest or need between different types of actors, and tries to anticipate the different ways in which a new tool or method could affect the content of the observed tasks and activities, including the network and collaborative behavior. It covers both norms and practices.

FIG 4.6 Use Case Diagram of Camouflaging Worm Detection System



Low Level Design Of The Modules

1. C-Worm detection Module
The C-Worm has a self-propagating behavior similar to traditional worms, i.e., it intends to rapidly infect as many vulnerable computers as possible. However, the C-Worm differs from traditional worms in that it camouflages any noticeable trends in the number of infected computers over time. The camouflage is achieved by manipulating the scan traffic volume of worm-infected computers. Such manipulation prevents the scan traffic from exhibiting any exponentially increasing trend, or even from crossing the thresholds that are tracked by existing detection schemes.
Worm detection has been intensively studied in the past and can be generally classified into two categories: host-based detection and network-based detection. Host-based detection systems detect worms by monitoring, collecting, and analyzing worm behaviors on end hosts. Since worms are malicious programs that execute on these computers, analyzing the behavior of worm executables plays an important role in host-based detection systems. Many detection schemes fall under this category [37]. In contrast, network-based detection systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic (messages to identify vulnerable computers) generated by worm attacks. Many detection schemes fall under this category [19].
Ideally, security vulnerabilities would be prevented to begin with, a problem that must be addressed by the programming language community. However, while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus on network-based detection, as this paper does, to detect wide-spreading worms. In order to rapidly and accurately detect Internet-wide, large-scale propagation of active worms, it is imperative to monitor and analyze traffic at multiple locations over the Internet to detect suspicious traffic generated by worms.
The widely adopted worm detection framework consists of multiple distributed monitors and a worm detection center that controls them [41]. This framework is similar to other existing worm detection systems, such as the Cyber Center for Disease Controller [11], Internet motion sensor [42], SANS ISC [23], Internet sink [41], and network telescope [43]. The monitors are distributed across the Internet and can be deployed at end hosts, routers, or firewalls. Each monitor passively records irregular port-scan traffic, such as connection attempts to a range of void IP addresses (IP addresses not being used) and to restricted service ports. Periodically, the monitors send traffic logs to the detection center. The detection center analyzes the traffic logs and determines whether or not there are suspicious scans to restricted ports or to invalid IP addresses.
Network-based detection schemes commonly analyze the collected scan traffic data by applying certain decision rules for detecting worm propagation. For example, Venkataraman et al. [20] and Wu et al. [21] proposed schemes that examine statistics of scan traffic volume; Zou et al. presented a trend-based detection scheme that examines the exponential increase pattern of scan traffic [19]; and Lakhina et al. [40] proposed schemes that examine other features of scan traffic, such as the distribution of destination addresses. Other works study worms that attempt to take on new patterns to avoid detection [39].
Besides the above schemes, which detect anomalous behavior in globally monitored scan traffic, there are other worm detection and defense schemes, such as sequential hypothesis testing for detecting worm-infected computers [44] and payload-based worm signature detection [45]. In addition, Cai et al. [46] presented both theoretical modeling and experimental results on a collaborative worm signature generation system that employs distributed fingerprint filtering and aggregation over multiple edge networks. Dantu et al. [47] presented a state-space feedback control model that detects and controls the spread of viruses or worms by measuring the velocity of the number of new connections an infected computer makes. Despite the different approaches described above, we believe that detecting widespread anomalous scanning behavior continues to be a useful weapon against worms, and that, in practice, a multifaceted defense has advantages.
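The monitor and detection-center workflow described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `ScanRecord` schema, the function names, and the fixed threshold are hypothetical and do not come from any of the cited systems. The rule shown is a plain volume threshold, which is exactly the kind of time-domain rule the C-Worm is designed to stay under.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    """One suspicious connection attempt logged by a monitor (hypothetical schema)."""
    interval: int   # index of the reporting period in which the scan was seen
    src_ip: str
    dst_ip: str     # a void IP address or an address with a restricted port
    dst_port: int


def aggregate_scan_volume(monitor_logs):
    """Detection-center step: merge the logs of all monitors into a
    per-interval count of suspicious scans."""
    volume = Counter()
    for log in monitor_logs:
        for record in log:
            volume[record.interval] += 1
    return volume


def volume_alerts(volume, threshold):
    """Illustrative decision rule: report every interval whose scan count
    exceeds the (assumed) threshold, in chronological order."""
    return sorted(t for t, count in volume.items() if count > threshold)


if __name__ == "__main__":
    monitor_a = [ScanRecord(0, "198.51.100.4", "192.0.2.10", 445),
                 ScanRecord(1, "198.51.100.4", "192.0.2.11", 445),
                 ScanRecord(1, "198.51.100.9", "192.0.2.12", 445)]
    monitor_b = [ScanRecord(1, "198.51.100.4", "203.0.113.5", 445)]
    volume = aggregate_scan_volume([monitor_a, monitor_b])
    print(volume_alerts(volume, threshold=2))
```

A worm that keeps its per-interval scan count below `threshold` produces no alert here, which motivates looking beyond the raw volume.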

2. Worms are malicious: Detection Module OR Anomaly Detection
Since worms are malicious programs that execute on end hosts, analyzing the behavior of worm executables plays an important role in host-based detection systems. Many detection schemes fall under this category. In contrast, network-based detection systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic (messages to identify vulnerable computers) generated by worm attacks. Many detection schemes fall under this category. Ideally, security vulnerabilities would be prevented to begin with, a problem that must be addressed by the programming language community. However, while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus on network-based detection, as this paper does, to detect wide-spreading worms.
In this section, we develop a novel spectrum-based detection scheme. Recall that the C-Worm goes undetected by detection schemes that try to determine the worm propagation only in the time domain. Our detection scheme captures the distinct pattern of the C-Worm in the frequency domain, and thereby has the potential of effectively detecting the C-Worm propagation. In order to identify the C-Worm propagation in the frequency domain, we use the distribution of the Power Spectral Density (PSD) of the scan traffic and its corresponding Spectral Flatness Measure (SFM). The PSD describes how the power of a time series is distributed in the frequency domain. Mathematically, it is defined as the Fourier transform of the autocorrelation of a time series. In our case, the time series corresponds to the changes in the number of worm instances that actively conduct scans over time. The SFM of the PSD is defined as the ratio of the geometric mean to the arithmetic mean of the PSD coefficients. SFM values lie in the range [0,1]; a larger SFM value implies a flatter PSD distribution, and a smaller value a more concentrated one.
Notice that frequency-domain analysis requires more samples than time-domain analysis, since a frequency-domain technique such as the Fourier transform needs to derive the power spectrum amplitude for different frequencies. In order to generate an accurate spectrum amplitude for relatively high frequencies, a high granularity of data sampling is required. In our case, we rely on Internet Threat Monitoring (ITM) systems to collect traffic traces from monitors (motion sensors) in a timely manner. As a matter of fact, other existing detection schemes based on the scan traffic rate [20], variance [21], or trend [19] also demand a high sampling frequency from ITM systems in order to accurately detect worm attacks. Enabling the ITM system with timely data collection will benefit worm detection in real time.
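To make the two quantities concrete, the following is a minimal sketch of how a PSD estimate and its SFM could be computed from a sampled scan-traffic series. It assumes a periodogram (squared magnitude of the discrete Fourier transform, which corresponds to the Fourier transform of the sample autocorrelation) as the PSD estimator; the function names and the small `eps` guard are choices made for this illustration, not part of the scheme described above.

```python
import cmath
import math


def power_spectral_density(series):
    """Periodogram estimate of the PSD: |DFT|^2 / n for the positive
    frequency bins. The mean is removed first so that the constant
    component does not dominate the spectrum."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    psd = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        psd.append(abs(coeff) ** 2 / n)
    return psd


def spectral_flatness(psd, eps=1e-12):
    """SFM = geometric mean / arithmetic mean of the PSD coefficients.
    eps is added to every coefficient so the logarithm is defined for
    empty bins; the result stays within [0, 1]."""
    values = [p + eps for p in psd]
    log_mean = sum(math.log(v) for v in values) / len(values)
    arithmetic_mean = sum(values) / len(values)
    return math.exp(log_mean) / arithmetic_mean


if __name__ == "__main__":
    # A series dominated by one slow oscillation: power sits in a single bin.
    oscillating = [100 + 20 * math.sin(2 * math.pi * t / 16) for t in range(128)]
    print(spectral_flatness(power_spectral_density(oscillating)))
```

A series whose power is concentrated in one narrow band gives an SFM close to 0, while a perfectly flat spectrum gives exactly 1. The direct DFT used here costs O(n²) operations; an FFT routine would be the natural replacement for long traces.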

3. Pure Random Scan (PRS) Module
The C-Worm can be extended to defeat other newly developed detection schemes, such as destination distribution-based detection. Recall that attack target distribution-based schemes analyze the distribution of attack targets (the scanned destination IP addresses) as basic detection data to capture a fundamental feature of worm propagation, i.e., that worms continuously scan different targets.


Pure Random Scan Strategy:
The worm propagator can randomly select computers in cyberspace to identify whether a computer is vulnerable. For example, the pure random scan (PRS) worm randomly scans the entire IPv4 address space [1, 19]. In this model, worm-infected hosts do not have any prior vulnerability knowledge or active/inactive information about other hosts. The worm-infected host randomly selects the IP addresses of victims from the global IP address space and launches the attack on those addresses. When a new host is infected, it continues to attack the network via the same method.
The main shortcoming of this approach is that many IP addresses in the network are not being used by any valid host. Thus, many scans are wasted on nonexistent hosts. To address this issue, improvements on random scan have been proposed that launch selective scans using knowledge of network address allocation. For example, some blocks of IP addresses are used by organizations or enterprises, and thus are more likely to be well maintained and less vulnerable. Other IP addresses are more likely to be occupied by personal computers, and thus have a higher probability of being vulnerable [33]. Also, computers in the same subnetwork are more likely to use similar system settings and may share the same vulnerabilities. Such network topology-related information can be obtained through routing tables and DNS, and can improve the probability of successful identification by up to three times [34].
We describe a generic random scan algorithm by a sequence of iterates {X_k} on iteration k = 0, 1, . . . which may depend on previous points and algorithmic parameters. The current iterate X_k may represent a single point, or a collection of points, to include population-based algorithms. The iterates are capitalized to denote that they are random variables, reflecting the probabilistic nature of the random search algorithm.
Generic Random Scan Algorithm
Step 0. Initialize algorithm parameters Θ_0, initial points X_0 ⊂ S and iteration index k = 0.
Step 1. Generate a collection of candidate points V_{k+1} ⊂ S according to a specific generator and associated sampling distribution.
Step 2. Update X_{k+1} based on the candidate points V_{k+1}, previous iterates and algorithmic parameters. Also update algorithm parameters Θ_{k+1}.
Step 3. If a stopping criterion is met, stop. Otherwise increment k and return to Step 1.
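The four steps above describe a general random search loop. A minimal sketch of its simplest instance is given below, under assumptions that are not in the text: the iterate is a single point, candidates are drawn independently from a fixed sampling distribution (so there are no algorithm parameters to update), the update rule keeps the better of the current point and the candidate, and the stopping criterion is a fixed iteration budget. The names `objective` and `sample_candidate` are hypothetical.

```python
import random


def generic_random_search(objective, sample_candidate, n_iterations, seed=0):
    """Random search that minimizes objective over the set S from which
    sample_candidate draws points."""
    rng = random.Random(seed)

    # Step 0: initial point X_0 and iteration index k = 0.
    current = sample_candidate(rng)
    current_value = objective(current)

    # Step 3: the stopping criterion here is simply the iteration budget.
    for _ in range(n_iterations):
        # Step 1: generate a candidate point V_{k+1} in S.
        candidate = sample_candidate(rng)
        candidate_value = objective(candidate)

        # Step 2: update X_{k+1}; keep the candidate only if it improves.
        if candidate_value < current_value:
            current, current_value = candidate, candidate_value

    return current, current_value


if __name__ == "__main__":
    best, value = generic_random_search(
        objective=lambda x: (x - 3.0) ** 2,
        sample_candidate=lambda rng: rng.uniform(-10.0, 10.0),
        n_iterations=2000,
    )
    print(best, value)
```

More elaborate variants change only Step 1 (for example, sampling near the current point) or Step 2 (for example, keeping a population of points); the loop structure stays the same.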


4. Worm propagation Module
In an open-loop control system, the worm scan traffic volume has a much higher probability of showing an increasing trend as the worm propagation progresses: as more and more computers get infected, they, in turn, take part in scanning other computers. Hence, we consider the C-Worm as a worst-case attack scenario that uses closed-loop control to regulate the propagation speed based on feedback about the propagation status.
To analyze the C-Worm, we adopt the epidemic dynamic model for disease propagation, which has been extensively used for worm propagation modeling [2]. Based on existing results [12], this model matches the dynamics of real worm propagation over the Internet quite well. For this reason, similar to other publications, we adopt this model in our paper as well. Since the C-Worm we investigate is a novel attack, we modified the original epidemic dynamic formula to model its propagation by introducing P(t), the attack probability that a worm-infected computer participates in worm propagation at time t. We note that there is wide scope to improve our modified model in the future to reflect several characteristics that are relevant in real-world practice. In particular, the epidemic dynamic model assumes that any given computer is in one of the following states: immune, vulnerable, or infected. An immune computer is one that cannot be infected by a worm; a vulnerable computer is one that has the potential of being infected by a worm; an infected computer is one that has been infected by a worm.
Algorithm for worm propagation:
Step 1. Collect traffic in local network
Step 2. Create suspicious list from outbound traffic
Step 3. foreach (record in suspicious list) do
Step 4. if (destination addresses have sequential distribution)
Step 5. then worm alert
Step 6. else if (destination addresses contain unused IP addresses)
Step 7. then worm alert
Step 8. else if (the number of distinct addresses of inbound traffic with related port is large)
Step 9. then worm alert
Step 10. else the record is normal activity
Step 11. End For.
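The loop in Steps 3 to 11 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this project: the record layout (a dictionary with `src`, `dst_ips` and `port`), the run length that counts as a "sequential distribution", and the fan-in limit that counts as "large" are all hypothetical choices.

```python
from ipaddress import ip_address


def has_sequential_destinations(dst_ips, min_run=4):
    """True if the destinations contain at least min_run consecutive
    addresses (min_run >= 2 is assumed)."""
    values = sorted({int(ip_address(ip)) for ip in dst_ips})
    run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev + 1 else 1
        if run >= min_run:
            return True
    return False


def classify_record(record, unused_ips, inbound_sources_by_port, fan_in_limit=50):
    """Apply the three rules of Steps 4 to 10 to one suspicious record."""
    dst_ips = record["dst_ips"]
    if has_sequential_destinations(dst_ips):
        return "worm alert"
    if any(ip in unused_ips for ip in dst_ips):
        return "worm alert"
    inbound_sources = inbound_sources_by_port.get(record["port"], set())
    if len(inbound_sources) > fan_in_limit:
        return "worm alert"
    return "normal activity"


def detect(suspicious_list, unused_ips, inbound_sources_by_port, fan_in_limit=50):
    """Steps 3 and 11: classify every record, keyed by its source host."""
    return {
        record["src"]: classify_record(
            record, unused_ips, inbound_sources_by_port, fan_in_limit)
        for record in suspicious_list
    }


if __name__ == "__main__":
    suspicious = [
        {"src": "10.0.0.5", "port": 445,
         "dst_ips": ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]},
        {"src": "10.0.0.8", "port": 80,
         "dst_ips": ["198.51.100.20", "203.0.113.40"]},
    ]
    print(detect(suspicious, unused_ips={"203.0.113.99"},
                 inbound_sources_by_port={}))
```

Keying the result by source host gives the list of local hosts that triggered an alert, which is the information needed to act on infected machines.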



We believe our algorithm can effectively detect random-scan worms, sequential-scan worms, and other intelligent worms such as selective random scan worms. It also identifies the infected hosts in the local network, so that proper action can be taken against them. In addition, the algorithm can be applied to a real network that still contains many worms that have not been removed: it detects not only the appearance of a new worm but also already existing worms.
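The epidemic dynamic model adopted in the worm propagation module can also be explored numerically. The sketch below assumes the simple epidemic form dM/dt = β · P(t) · M(t) · (N − M(t)), where M(t) is the number of infected computers, N the number of vulnerable computers, β the pairwise infection rate, and the attack probability enters as a multiplicative factor. That placement of P(t), the Euler integration, and the example feedback rule are assumptions made for illustration; the exact modified formula is not stated in this chapter.

```python
def simulate_infections(n_vulnerable, beta, attack_probability,
                        m0=1.0, steps=200, dt=1.0):
    """Euler integration of dM/dt = beta * P * M * (N - M).
    attack_probability(t, m) returns P in [0, 1]; passing the current
    infection count m allows closed-loop (feedback) control."""
    infected = [m0]
    for step in range(steps):
        m = infected[-1]
        p = attack_probability(step * dt, m)
        growth = beta * p * m * (n_vulnerable - m)
        infected.append(min(float(n_vulnerable), m + growth * dt))
    return infected


def always_attack(t, m):
    """Open-loop worm: every infected computer scans all the time."""
    return 1.0


def capped_scanners(limit):
    """Hypothetical closed-loop rule: keep the expected number of actively
    scanning computers, P * M, at or below limit."""
    return lambda t, m: min(1.0, limit / m)


if __name__ == "__main__":
    open_loop = simulate_infections(1000, 1e-4, always_attack)
    closed_loop = simulate_infections(1000, 1e-4, capped_scanners(50.0))
    print(round(open_loop[100]), round(closed_loop[100]))
```

With P fixed at 1 the curve is the familiar logistic growth of a traditional worm. With the capped rule, the number of active scanners stops growing once the cap is reached, so the scan volume seen by monitors shows no increasing trend even though infections continue, at a slower pace.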
