
Anomaly Detection in Mobile ad-hoc Network through Feature Selection

1Sadeep Sharma, 2Ravi Arora

1M.Tech-CSE, IV SEM, 2M.Tech-IT, III SEM
Shekhwati Engineering College, Dundlod, Rajasthan; Chandra IT Education Society, Alwar
mcasandy2006@gmail.com, raviarora250785@yahoo.com

Abstract: Mobile ad hoc networks have recently been the topic of extensive research. The interest in such networks stems from their ability to provide temporary and instant wireless networking in situations where cellular infrastructure is lacking and is expensive or infeasible to deploy. Despite their desirable characteristics, vital problems concerning their security must be solved in order to realize their full potential. Various security controls, such as encryption and authentication techniques, have been proposed to help reduce the risk of intrusion. However, since such risks cannot be completely eliminated, there is a strong need for intrusion detection systems for ad hoc network security. Among intrusion detection techniques, anomaly detection may prove to be more economical in its use of resources, which makes it better suited to resource-constrained ad hoc networks. Therefore, in this paper we present a survey on anomaly detection in ad hoc networks.

Keywords: Ad hoc networks, anomaly detection, intrusion detection, MANET.
I. INTRODUCTION

A mobile ad hoc network (MANET) is a collection of mobile hosts that can communicate with each other without any pre-established infrastructure. Each node in the MANET can act as a router as well as a host. To maintain connectivity in a mobile ad hoc network, all participating nodes have to route network traffic, so successful communication depends heavily on the cooperation of other nodes. A MANET can therefore be deployed rapidly without infrastructure and without a centralized controller, which makes it convenient in many environments: soldiers relaying information for situational awareness on the battlefield; business associates sharing information during a meeting; attendees using laptop computers to participate in an interactive conference; and emergency disaster relief personnel coordinating efforts after a fire, hurricane, or earthquake. Other possible applications include personal area and home networking, location-based services, and sensor networks. With the availability of wireless technologies such as Bluetooth and IEEE 802.11 WLAN, and the development of next-generation networks, civilian applications such as personal area networks, sensor networks, and disaster area networks are being envisioned.

Intrusion detection techniques can be categorized into misuse detection and anomaly detection. In misuse detection, decisions are made on the basis of knowledge of a model of the intrusive process and of the traces it may leave in the observed system. Legal or illegal behavior can be defined, and observed behavior is compared against that definition. Such a system tries to detect evidence of intrusive activity irrespective of any knowledge of the background traffic (i.e., the normal behavior of the system). Misuse detection systems, e.g., IDIOT [4] and STAT [5], use patterns of well-known attacks or weak spots of the system to match and identify known intrusions. A typical misuse detection system is shown in Figure 1.
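To make the pattern-matching idea behind misuse detection concrete, the following is a minimal Python sketch. It is not the method used by IDIOT or STAT. The event names (`RREQ`, `LOGIN_FAIL`), the `SIGNATURES` table, and both function names are hypothetical and chosen only for illustration, and a signature is assumed here to be a fixed, contiguous sequence of audit events.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical signature table: attack name -> ordered pattern of audit events.
SIGNATURES: Dict[str, Tuple[str, ...]] = {
    "route_request_flood": ("RREQ", "RREQ", "RREQ", "RREQ"),
    "login_brute_force": ("LOGIN_FAIL", "LOGIN_FAIL", "LOGIN_FAIL"),
}


def contains_pattern(events: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs as a contiguous run inside `events`."""
    m = len(pattern)
    return any(
        tuple(events[i:i + m]) == tuple(pattern)
        for i in range(len(events) - m + 1)
    )


def match_signatures(
    events: Sequence[str],
    signatures: Dict[str, Tuple[str, ...]] = SIGNATURES,
) -> List[str]:
    """Return the sorted names of all known attacks whose pattern appears."""
    return sorted(
        name for name, pattern in signatures.items()
        if contains_pattern(events, pattern)
    )
```

A trace that contains only previously unseen behavior returns an empty list, because the detector can report only what is already in its signature table.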
The main advantage of misuse detection is that it can accurately and efficiently detect instances of known attacks. The main disadvantage is that it cannot detect newly invented attacks.

In anomaly detection, for example the anomaly detector in IDES [6], a baseline profile of normal system activity is created. Any system activity that deviates from the baseline is treated as an anomaly, i.e., a possible intrusion. A typical anomaly detection system is shown in Figure 2. Although this method suffers from a high false positive rate, it has the advantage of not requiring prior knowledge of intrusions and can thus detect new intrusions. Another advantage, which makes it suitable for an ad hoc network, is that there is no need to store a database of attack profiles [3].

Figure 1: An example of a misuse detection system

Figure 2: An example of an anomaly detection system

II. INTRUSION DETECTION SYSTEM

The main task of the intrusion detection system (IDS) is to discover intrusions from the network packet data or system audit data. One of the major problems the IDS may face is that the volume of packet data or system audit data can be overwhelming. Some of
the features of the audit data may be redundant or contribute little to the detection process, so the size of the data set needs to be reduced. To perform the reduction, two feature selection methods are proposed: Markov blanket discovery and a genetic algorithm. The intrusion detection system is distributed in nature, so each node of the mobile ad hoc network is equipped with an IDS. The system architecture of the IDS comprises four components:
i. Data collection module
ii. Profile module
iii. Feature selection module
iv. Detection and response module

Figure 3: An IDS architecture

A. Data collection module: This module collects audit data for each node. The proposed system considers unknown attacks, so the IDS needs both the normal behavior of the system (the normal profile) and violations of normal behavior (the attack profile). The normal profile is created from data collected during the normal scenario. The attack profile is created by simulating attacks.

B. Profile module: In this module the audit data is transformed into a format appropriate for the detection process. From the attack data, a training data set is created to train the Bayesian classifier. The training data consists of events labeled as either normal or attack. Test data is collected in a simulated attack environment and given to the Bayesian classifier, which identifies each event as an attack or as normal.

C. Feature selection module: Feature selection is the process of selecting the important features from the large data set. The selected features are those relevant to the detection process. Two feature selection methods are proposed for this operation.
i) Markov blanket discovery: The selection of the Markov blanket is based on the d-separation rule of the Bayesian network. Given a specific attribute, represented as a node in the Bayesian network, its Markov blanket is discovered. The Markov blanket is the set of nodes composed of the attribute's parents, its children, and its children's parents in the Bayesian network. When a Bayesian classifier is used on complete data, the Markov blanket of the class node forms the selected feature set, and all features outside the Markov blanket are deleted from the Bayesian network.
ii) Genetic algorithm: The GA-based feature selection algorithm is based on the wrapper model. In the adapted algorithm, the search component is a GA and the evaluation component is a Bayesian network. The initial population is randomly generated. Every individual of the population is represented by genes, each of which represents a feature. If the gene value is 1, the feature is used in constructing the Bayesian network; if it is 0, the feature is not used.

D. Intrusion detection and response module: This module detects deviations from the norm. A Bayesian classifier is used to detect the anomalies. The classifier is trained on the training data, and the test data is then given as input to the trained Bayesian classifier. Any deviation beyond the threshold level is considered an anomaly. Once the attacks are identified, a notification is sent to all the nodes in the ad hoc environment.

III. INTRUSION DETECTION IN MANET

Intrusion detection systems have performed well for fixed networks, but in ad hoc networks they face many difficulties for the following reasons:
1. There is no central point to control all the activities in the network.
2. The network topology and behavior change dynamically.
3. Mobile devices have limited power.

The basic requirements for an IDS to be implemented in ad hoc networks are [1]:
i.) The IDS should not introduce a new weakness in the MANET. That is, the IDS itself should not make a node any weaker than it already is.
ii.) The IDS should run continuously and remain transparent to the system and users.
iii.) The IDS should use as few system resources as possible to detect and prevent intrusions. IDSs that require excessive communication among nodes or run complex algorithms are not desirable.
iv.) The IDS must be fault-tolerant in the sense that it must be able to recover from system crashes, ideally to the previous state, and resume the operations in progress before the crash.
v.) Apart from detecting and responding to intrusions, the IDS should also resist subversion. It should monitor itself and detect whether it has been compromised by an attacker.
vi.) The IDS should have a proper response. In other words, it should not only detect but also respond to detected intrusions, preferably without human intervention.
vii.) Accuracy of the IDS is another major factor in MANETs. Few false positives and few false negatives are desired.

IV. FEATURE SELECTION

The main task of the intrusion detection system is to discover intrusions from the network packet data or system audit data, and one of the major problems it may face is that the volume of this data can be overwhelming. For wireless networks, because of the limited capacity of wireless devices, choosing the features that best characterize the behavior of the network is very important. In a wireless network, nodes communicate by exchanging messages. Therefore, a node's behavior can be obtained by monitoring the network traffic. Each node monitors the network traffic of its neighboring nodes and builds a profile during offline training. The profile is then used as a threshold to detect abnormal behavior in the network. A system could select all possible features as the object of monitoring. That is suitable for a small wireless network with only a few nodes, but it requires a large amount of capacity for a very large network. Therefore, a feature selection method is used. One of the best techniques for feature selection is Markov blanket feature selection; another is the genetic algorithm.

A. MARKOV BLANKET FEATURE SELECTION TECHNIQUE

The Markov blanket (MB) is a novel idea for selecting significant features in large data sets. It is used in conjunction with Bayesian networks. Before Markov blanket discovery, the best Bayesian network structure is obtained using all the features. The best features are then selected by MB discovery. The Markov blanket of a node n is the union of n's parents, n's children, and the parents of n's children.

B. GENETIC ALGORITHM

The method explores the space of possible feature subsets to obtain the set of features that maximizes predictive accuracy and minimizes irrelevant attributes. The GA uses a fitness function based on multiple correlations to evaluate the fitness of each feature subset with respect to the relationships in its domain. The key design issue is an adequate fitness function, one that guides the search for the best feature subset under multiple objectives and is general enough for the feature selection problem domain. If necessary features are left out, the resulting lack of information can limit accuracy. To avoid this and to be successful in the application, the fitness function must measure and maximize the performance of a feature subset in discriminating between classes of the pattern space while minimizing the size of the subset.
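The two feature selection strategies described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the dictionary-of-parents representation of the Bayesian network, the GA parameters (population size, generations, mutation rate, binary tournament selection, one-point crossover, elitism), and in particular `toy_fitness` are all hypothetical. The paper's evaluation component is a Bayesian network; the toy fitness below only imitates the stated goal of rewarding discriminating features while penalizing subset size.

```python
import random
from typing import Callable, Dict, List, Set, Tuple

Mask = Tuple[int, ...]  # one gene per feature: 1 = use the feature, 0 = drop it


def markov_blanket(parents: Dict[str, Set[str]], node: str) -> Set[str]:
    """Return the parents, children, and children's other parents of `node`.

    `parents` maps every node of the DAG to the set of its parent nodes.
    """
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


def ga_select(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    p_mutation: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, List[float]]:
    """Search feature masks with a simple GA; requires n_features >= 2."""
    rng = random.Random(seed)
    population: List[Mask] = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best_per_generation: List[float] = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        best_per_generation.append(fitness(elite))
        next_population = [elite]  # elitism: the best mask always survives
        while len(next_population) < pop_size:
            # Binary tournament selection of two parents.
            mother = max(rng.sample(population, 2), key=fitness)
            father = max(rng.sample(population, 2), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = mother[:cut] + father[cut:]
            child = tuple(
                bit ^ 1 if rng.random() < p_mutation else bit for bit in child
            )
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    best_per_generation.append(fitness(best))
    return best, best_per_generation


# Toy stand-in for the evaluation component. Features 0, 2 and 5 are simply
# ASSUMED to be the informative ones, and each selected feature costs 0.1.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask: Mask) -> float:
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)
```

For example, with a hypothetical network `{"class": set(), "f1": {"class"}, "f2": {"class", "f3"}, "f3": set(), "f4": {"f3"}}`, calling `markov_blanket` on `"class"` keeps `f1`, `f2`, and `f3` and drops `f4`. Calling `ga_select(8, toy_fitness)` returns the best mask found together with the best fitness per generation, which never decreases because the elite individual is always carried over.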
C. BAYESIAN CLASSIFIER

Classification is a basic task in data analysis and pattern recognition that requires the construction of a classifier, that is, a function that assigns a class label to instances described by a set of attributes. The induction of classifiers from data sets of preclassified instances is a central problem in machine learning. Numerous approaches to this problem are based on various functional representations such as decision trees, decision lists, neural networks, decision graphs, and rules. One of the most effective classifiers, in the sense that its predictive performance is competitive with state-of-the-art classifiers, is the so-called naive Bayesian classifier. Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.

V. CONCLUSION AND FUTURE WORK

In this work, an anomaly detection method is applied to mobile ad hoc networks to detect intrusions. The method uses network layer data to characterize the behavior of mobile nodes. Audit data is collected from all the mobile nodes under various scenarios to classify the events. The normal profile is created in the absence of attacks, and the attack profile is created by simulating attacks such as a flooding attack. After collection, the audit data is converted into an appropriate form for the detection process. During the detection process, the attack profile is compared with the normal profile. If there is any deviation from normal behavior, the event is labeled as an attack. In future work, this is to be extended by reducing the size of the audit data through feature selection. Two methods, Markov blanket discovery and a genetic algorithm, are used to perform the feature selection. To perform Markov blanket discovery, a Bayesian network is constructed using the collected features. Then the minimum description length score is calculated, and features are selected based on the score. The genetic feature selection method has two components, a search component and an evaluation component. The GA is used as the search component and a Bayesian tree built from the collected features as the evaluation component. A fitness function is calculated for the Bayesian tree, and features are selected based on its value. The performance of genetic feature selection is contrasted with that of Markov blanket discovery on the basis of detection rate and false alarm rate.

VI. REFERENCES

[1] Y. Hu, A. Perrig, and D. B. Johnson,
Ariadne: A secure on-demand routing protocol for ad hoc networks, in Proceedings of the Eighth Annual International Conference on Mobile Computing and Networking (MobiCom 2002), September 2002.
[2] P. Papadimitratos and Z. J. Haas, Secure routing for mobile ad hoc networks, in SCS Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS), San Antonio, TX, January 2002.
[3] A. Mishra, K. Nadkarni, and A. Patcha, Intrusion detection in wireless ad hoc networks, IEEE Wireless Communications, vol. 11, no. 1, Feb 2004, pp. 48-60.
[4] S. Kumar and E. H. Spafford, A software architecture to support misuse intrusion detection, in Proceedings of the 18th National Information Security Conference, pp. 194-204, 1995.
[5] K. Ilgun, R. A. Kemmerer, and P. A. Porras, State transition analysis: A rule-based intrusion detection approach, IEEE Transactions on Software Engineering, pp. 181-199, March 1995.
[6] Dharma Prakash Agrawal and Qing-An Zeng, Introduction to Wireless and Mobile Systems. Nelson, a division of Thomson Canada Limited, http://www.nelson.com.
[7] Kavitha Kumar, Intrusion Detection in Mobile Adhoc Networks.
[8] Marianne A. Azer, Sherif M. El-Kassas, and Magdy S. El-Soudani, A survey on anomaly detection methods for ad hoc networks.
[9] Peyman Kabiri and Mehran Aghaei, Feature analysis for intrusion detection in mobile ad-hoc networks.
[10] R. Nallusamy, K. Jayarajan, and Dr. K. Duraiswamy, Intrusion Detection in Mobile Ad Hoc Networks Using GA Based Feature Selection.