
Using Neural based hybrid Classification methods

by M. Govindarajan & RM. Chandrasekaran

Assoc. Prof. Abdülhamit SUBAŞI, BSc. Hakan ŞAHİN

Introduction

Due to the increasing incidence of cyber attacks, building effective intrusion detection systems is essential for protecting information systems security, yet it remains an elusive goal and a great challenge. This paper presents two classification methods, multilayer perceptron and radial basis function, together with ensembles built from them. A hybrid architecture involving ensemble and base classifiers is proposed for intrusion detection systems. The analysis of results shows that the proposed method outperforms the individual use of existing classification methods such as multilayer perceptron and radial basis function. Additionally, the ensemble of multilayer perceptrons is superior to the ensemble of radial basis function classifiers for normal behavior, and the reverse holds for abnormal behavior. The proposed method provides a significant improvement in prediction accuracy for intrusion detection.

What Is Intrusion Detection?

Intrusions are activities that violate the security policy of a system. Intrusion detection is the process used to identify intrusions.

THE TOP 10 INTERNET THREATS

(Top 10 from the SANS Institute)

1.BIND weaknesses
2.Vulnerable CGI programs and extensions on web servers
3.Remote Procedure Calls (NFS and remote execution)
4.IIS Remote Data Services (for example .htr files)
5.Sendmail buffer overflow
6.Solaris sadmind and mountd
7.IMAP/POP buffer overflow or incorrect configuration
8.Default SNMP community strings set to "public" and "private"
9.Global file sharing (NetBIOS, Macintosh web sharing, UNIX NFS)
10.Use of weak passwords or no password on user IDs

http://www.sans.org/top20/2000

Types of Intrusion Detection System(1)


Based on the source of the audit information used by each IDS, IDSs may be classified into:

o Host-based IDSs
o Network-based IDSs

Types of Intrusion Detection System(2)

Host-based IDSs

o Get audit data from host audit trails.
o Detect attacks against a single host.

Network-based IDSs

o Use network traffic as the audit data source, relieving the burden on the hosts that usually provide normal computing services.
o Detect attacks by examining the data trail left by users and searching for abnormal user behavior.

ID TECHNOLOGY LANDSCAPE

[Figure: ID technology landscape; labels: Preventive, Real time]

Using machine learning for network intrusion detection

1.Classification
2.Multi Layer Perceptron
3.Radial Basis Function
4.Case Based Reasoning

1.Classification

In machine learning, a classification task takes each instance and assigns it to a particular class. For example, in a machine vision application, the task might involve analyzing images of objects on a conveyor belt and classifying them as nuts, bolts, or other components of some object being assembled. In optical character recognition, the instances are images of characters, and each is classified according to the character it represents. In examples, for the sake of simplicity if nothing else, often just two classes are used, sometimes called positive and negative.

2.Multi Layer Perceptron

A type of feedforward neural network that is an extension of the perceptron in that it has at least one hidden layer of neurons. Layers are updated by starting at the inputs and ending with the outputs. Each neuron computes a weighted sum of the incoming signals, to yield a net input, and passes this value through its sigmoidal activation function to yield the neuron's activation value. Unlike the perceptron, an MLP can solve linearly inseparable problems.
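As an illustration of the last point, the sketch below is a minimal MLP with one hidden layer that computes XOR, a classic linearly inseparable problem. The weights and biases are hand-picked for this example (they are assumptions, not learned values and not taken from the slides), and the bias terms are an addition that the description above does not mention.

```python
import math

def sigmoid(x):
    """Logistic activation used by every neuron in this sketch."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """Weighted sum of incoming signals (net input) passed through the sigmoid."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(net)

def xor_mlp(x1, x2):
    """Two-input MLP with one hidden layer; weights are hand-picked, not learned."""
    h_or = neuron([x1, x2], [20.0, 20.0], -10.0)        # fires if x1 OR x2
    h_nand = neuron([x1, x2], [-20.0, -20.0], 30.0)     # fires unless x1 AND x2
    return neuron([h_or, h_nand], [20.0, 20.0], -30.0)  # fires if both hidden units fire

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_mlp(a, b)))
```

A single perceptron cannot produce this mapping because no straight line separates the two classes; the hidden layer supplies the two intermediate features that make the problem separable.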


3.Radial Basis Function


4.Case Based Reasoning

Case-based reasoning is a problem solving paradigm that in many respects is fundamentally different from other major AI approaches. Instead of relying solely on general knowledge of a problem domain, or making associations along generalized relationships between problem descriptors and conclusions, CBR is able to utilize the specific knowledge of previously experienced, concrete problem situations (cases). A new problem is solved by finding a similar past case, and reusing it in the new problem situation. A second important difference is that CBR also is an approach to incremental, sustained learning, since a new experience is retained each time a problem has been solved, making it immediately available for future problems.
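The retrieve, reuse and retain cycle described above can be sketched as follows. This is a toy illustration only: the `CaseBase` class, the feature vectors, and the use of Euclidean distance as the similarity measure are all assumptions made for the example. A practical CBR system would normally also adapt and verify the reused solution before retaining it.

```python
import math

class CaseBase:
    """Toy case base: each case pairs a feature vector with its known solution."""

    def __init__(self):
        self.cases = []

    def retain(self, features, solution):
        """Store a solved problem so it is available for future problems."""
        self.cases.append((list(features), solution))

    def retrieve(self, features):
        """Return the stored case closest to the new problem (Euclidean distance)."""
        return min(self.cases, key=lambda case: math.dist(case[0], features))

    def solve(self, features):
        """Reuse the solution of the most similar past case, then retain the new case."""
        _, solution = self.retrieve(features)
        self.retain(features, solution)
        return solution

cb = CaseBase()
cb.retain([0.1, 0.2], "normal")
cb.retain([0.9, 0.8], "intrusion")
print(cb.solve([0.85, 0.9]))  # the closest stored case is the "intrusion" one
```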

Intrusion Detection Techniques

Anomaly detection

o Detect any action that significantly deviates from the normal behavior.

Misuse detection

o Catch intrusions in terms of the characteristics of known attacks or system vulnerabilities.


Anomaly Detection

Anomaly detection is based on the normal behavior of a subject. It sometimes assumes that the training audit data does not include intrusion data. Any action that significantly deviates from the normal behavior is considered an intrusion.
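A minimal sketch of this idea, assuming a single numeric feature and a simple "k standard deviations from the mean" rule. Both the feature (system calls per second) and the threshold rule are hypothetical choices for illustration and are not the method used in this study.

```python
import statistics

def fit_normal_profile(normal_values):
    """Summarise attack-free training data by its mean and standard deviation."""
    return statistics.mean(normal_values), statistics.pstdev(normal_values)

def is_anomalous(value, profile, k=3.0):
    """Flag a value that lies more than k standard deviations from the normal mean."""
    mean, std = profile
    return abs(value - mean) > k * std

# Hypothetical feature: system calls per second observed during normal operation.
normal_rates = [10, 11, 9, 10, 10, 12, 8]
profile = fit_normal_profile(normal_rates)
print(is_anomalous(10, profile))  # a typical value
print(is_anomalous(30, profile))  # a value far from the normal profile
```

The profile is only as good as the training data: if intrusions are present in `normal_rates`, they widen the profile and may go undetected later.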


Misuse Detection vs. Anomaly Detection


Misuse Detection

o Advantage: Accurate and generates far fewer false alarms.
o Disadvantage: Cannot detect novel or unknown attacks.

Anomaly Detection

o Advantage: Able to detect unknown attacks based on audit data.
o Disadvantage: High false-alarm rate; limited by training data.

Scientific contributions

o Artificial Neural Network
o Support Vector Machine (SVM)
o Hidden Markov Model
o Rule Learning
o Outlier Detection Scheme
o Neuro-Fuzzy Computing
o Multivariate Adaptive Regression Splines
o Linear Genetic Programming

Literature Review

Name of Researchers

o F. Coenen, G. Swinnen, K. Vanhoof
o R. Li, Z. Wang
o M.L. Wong, S.Y. Lee, K.S. Leung
o P.L. Hsu, R. Lai, C.C. Chui, C.I. Hsu
o Chen
o Versace
o Lin and McClean
o Suh
o Conversano
o Hansen and Salamon
o Indurkhya and Weiss
o Kuncheva

Used Model

o rule-induction and case-based reasoning
o rough sets and neural networks
o Bayesian networks are generated by a cooperative coevolution genetic algorithm (GA)
o algorithms and genetic algorithm for tree induction
o fuzzy theory embedded in a SOM (self-organized map)
o artificial neural networks and a genetic algorithm
o general multivariable statistics analysis with an artificial intelligence technique
o combine classifiers generated by RFM (recency, frequency, monetary), logistic regression, and neural networks
o (regression analysis, discriminant analysis, non-parametric statistical method, classification and regression trees)
o ensemble a number of neural networks
o multiple re-sampling of decision tree induction methods and their combination using the voting method
o RFM, neural networks, and logistic regression models

Aim

o to improve the response rate of direct mailing
o that can improve the effectiveness of final classification rules
o a hybrid approach to discover Bayesian networks from data
o course scheduling problems
o Textual Classification in Data Mining
o to tackle such hard problem using a multi-faceted solution
o a study on predicting the probability of enterprise failure
o the low correlation coefficient does not always ensure improved performance
o a mixture model to improve performance
o the generalization ability of a neural network system can be significantly improved
o improvement of predicted gain values of the final nodes in decision trees
o prediction accuracy was improved using hybrid models


Classification methods

o Multilayer perceptron neural network
o Radial basis function neural network


Multi layer perceptron neural network


Multi layer neural network learning method BACKPROPAGATION


  
Phase 1: Propagation

Each propagation involves the following steps:

o Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
o Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

Phase 2: Weight update

For each weight-synapse:

o Multiply its output delta and input activation to get the gradient of the weight.
o Move the weight in the opposite direction of the gradient by subtracting a fraction of the gradient (the learning rate) from the weight.
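The two phases can be sketched for a tiny 2-2-1 network as follows. This is a minimal illustration under stated assumptions: sigmoid activations, squared error E = 0.5 * (output - target)^2, no bias terms, and arbitrary starting weights and learning rate. None of these values come from the study.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Forward propagation: returns hidden activations and the output activation."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    """One propagation phase followed by one weight-update phase."""
    hidden, output = forward(x, w_hidden, w_out)
    # Phase 1 (backward): deltas for squared error E = 0.5 * (output - target)^2.
    delta_out = (output - target) * output * (1.0 - output)
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
    # Phase 2: gradient = delta * input activation; subtract a fraction (lr) of it.
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * delta_hidden[j] * xi for w, xi in zip(row, x)]
                    for j, row in enumerate(w_hidden)]
    return new_w_hidden, new_w_out

x, target = [1.0, 0.0], 1.0
w_hidden = [[0.1, -0.2], [0.4, 0.3]]   # arbitrary starting weights
w_out = [0.2, -0.3]
_, before = forward(x, w_hidden, w_out)
w_hidden, w_out = backprop_step(x, target, w_hidden, w_out)
_, after = forward(x, w_hidden, w_out)
print(before, after)  # the output should move toward the target
```

In training, this step is repeated over all training patterns for many epochs until the error stops decreasing.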


Radial basis function neural network(1)


(1) RBF networks have three layers of nodes: input layer, hidden layer, and output layer.
(2) Feed-forward connections exist between input and hidden layers, between input and output layers (shortcut connections), and between hidden and output layers. Additionally, there are connections between a bias node and each output node. A scalar weight is associated with the connection between nodes.
(3) The activation of each input node (fan-out) is equal to its external input.
(4) Each hidden node (neuron) determines the Euclidean distance between its own weight vector and the activations of the input nodes, i.e., the external input vector. The distance is used as the input of an RBF in order to determine the activation of the node. Here, Gaussian functions are employed. The parameter of the node is the radius of the basis function; its weight vector is the center.


Radial basis function neural network(2)


Each output node (neuron) computes its activation as a weighted sum. The external output vector of the network consists of the activations of the output nodes. The activation of a hidden node is high if the current input vector of the network is similar (depending on the value of the radius) to the center of its basis function. The center of a basis function can, therefore, be regarded as a prototype of a hyperspherical cluster in the input space of the network. The radius of the cluster is given by the value of the radius parameter. In the literature, some variants of this network structure can be found, some of which do not contain shortcut connections or bias neurons. The parameters (centers, radii, and weights) of an RBF network must be determined by means of a set of training patterns, each pairing an input vector with a target vector (supervised training). For a given input, the network is expected to produce an external output.
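A minimal sketch of the RBF computation for a single output node. It assumes the common Gaussian form exp(-d^2 / (2 r^2)), which the slides do not spell out, and it omits the shortcut connections between input and output layers. The centers, radii and weights are made-up values; in practice they would be determined by training.

```python
import math

def rbf_hidden_activation(x, center, radius):
    """Gaussian basis function of the Euclidean distance between input and center."""
    distance = math.dist(x, center)
    return math.exp(-(distance ** 2) / (2.0 * radius ** 2))

def rbf_output(x, centers, radii, weights, bias=0.0):
    """Output activation: weighted sum of hidden activations plus a bias term."""
    hidden = [rbf_hidden_activation(x, c, r) for c, r in zip(centers, radii)]
    return bias + sum(w * h for w, h in zip(weights, hidden))

centers = [[0.0, 0.0], [1.0, 1.0]]   # prototypes of two clusters (illustrative)
radii = [1.0, 1.0]
weights = [1.0, -1.0]
print(rbf_hidden_activation([0.0, 0.0], centers[0], radii[0]))  # input equals the center
print(rbf_output([0.0, 0.0], centers, radii, weights))
```

The hidden activation is largest when the input coincides with the center and decays as the input moves away, at a rate set by the radius.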


Hybrid model for intrusion detection systems

The main purpose of the hybrid method using error pattern models is to apply each method to the data cases it suits best, thereby enhancing prediction accuracy. Voting is a simple and popular hybrid model for combining the results of several methods. In the case of classification, for a tiebreak, the prediction probabilities of each method are calculated and considered to make final predictions. Bagging (bootstrap aggregation) and boosting are commonly used techniques for combined models.
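Voting with a probability-based tiebreak can be sketched as below. The `vote` function and its input format are hypothetical. The source says only that prediction probabilities are "calculated and considered" on a tie, so summing them per tied label is one possible reading, not necessarily the authors' rule.

```python
from collections import Counter, defaultdict

def vote(predictions):
    """Combine (label, probability) pairs by majority vote.

    Ties are broken by the summed prediction probability of each tied label.
    """
    counts = Counter(label for label, _ in predictions)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    prob_sum = defaultdict(float)
    for label, prob in predictions:
        if label in tied:
            prob_sum[label] += prob
    return max(tied, key=lambda label: prob_sum[label])

print(vote([("normal", 0.9), ("abnormal", 0.6), ("normal", 0.7)]))  # clear majority
print(vote([("normal", 0.55), ("abnormal", 0.95)]))                 # tie, decided by probability
```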

Bagging

Bagging generates multiple training data sets by bootstrapping (resampling randomly with replacement) and combines the results of the models built on each separate set. Breiman reports that prediction accuracy can be improved from 57% to 94% by applying Bagging to the CART algorithm. In summary, Bagging improves prediction performance by deriving multiple models from a data set rather than a single one, combining them, and compensating for the portion that individual models misclassify.

Procedures of hybrid modeling using bagging classifiers

Algorithm: bagging.

The bagging algorithm creates an ensemble of models (classifiers or predictors) for a learning scheme where each model gives an equally weighted prediction.

Input:
    D, a set of d training tuples;
    k, the number of models in the ensemble;
    a learning scheme (e.g., decision tree algorithm, backpropagation).
Output: A composite model, M*.
Method:
(1) for i = 1 to k do // create k models:
(2)     create bootstrap sample, Di, by sampling D with replacement;
(3)     use Di to derive a model, Mi;
(4) endfor
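The procedure above can be sketched in Python as follows. The nearest-centroid base learner, the toy one-dimensional data, and the majority-vote combination are assumptions chosen to keep the example self-contained. The study itself uses MLP and RBF networks as the learning scheme.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Step (2): draw len(data) tuples from data, sampling with replacement."""
    return [rng.choice(data) for _ in data]

def train_centroids(sample):
    """Hypothetical base learner: one mean value per class (nearest-centroid)."""
    by_label = {}
    for value, label in sample:
        by_label.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in by_label.items()}

def predict_centroids(model, value):
    """Predict the class whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(model[label] - value))

def bagging(data, k, rng):
    """Steps (1)-(4): derive k models, each from its own bootstrap sample."""
    return [train_centroids(bootstrap_sample(data, rng)) for _ in range(k)]

def predict_ensemble(models, value):
    """Equally weighted prediction: every model casts one vote."""
    votes = Counter(predict_centroids(m, value) for m in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Toy data: "normal" values lie in [0, 1), "abnormal" values in [10, 11).
data = ([(v / 10, "normal") for v in range(10)]
        + [(10 + v / 10, "abnormal") for v in range(10)])
models = bagging(data, k=25, rng=rng)
print(predict_ensemble(models, 0.5), predict_ensemble(models, 10.5))
```

Because each bootstrap sample omits some tuples and repeats others, the k models differ slightly, and the vote averages out their individual errors.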

Experimental

The data used in this study is based on an immune system developed at the University of New Mexico. It covers one privileged program, sendmail. The data includes both normal and abnormal traces. The normal trace is a trace of the sendmail daemon and several invocations of the sendmail program. During the period in which these traces were collected, no intrusions or suspicious activities took place. The abnormal traces contain several traces that include intrusions exploiting well-known problems in Unix systems. For example, sunsendmailcp (SSCP) is a script that sendmail uses to append an email message to a file, but when it is used on a file such as /.rhosts, a local user may obtain root access. The syslog attack uses the syslog interface to overflow a buffer in sendmail.


Experimental design
While the primary objective of this paper is to show that ensembles of MLP and RBF classifiers are superior to the base classifiers for intrusion detection in terms of prediction accuracy, we are also interested in comparing the performance of the individual classifiers.

Table 1 Properties of dataset
System call    Instances    Attributes
Normal         2000         2
Abnormal       373          2


Comparison of each model's accuracy

Table 2 Performance of MLP.
Dataset     Accuracy (%)
Normal      98.81
Abnormal    93.93

Table 3 Performance of RBF.
Dataset     Accuracy (%)
Normal      94.20
Abnormal    99.02

Table 4 Performance of ensemble approach.
Dataset     Multilayer perceptron Accuracy (%)    Radial basis function Accuracy (%)
Normal      98.88                                 94.21
Abnormal    94.31                                 99.03
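The gain of each ensemble over its base classifier follows directly from Tables 2, 3 and 4. The short sketch below computes it; the dictionary layout and the `improvement` helper are illustrative names, while the accuracy values are copied from the tables.

```python
# Accuracy (%) values copied from Tables 2, 3 and 4.
base = {"MLP": {"Normal": 98.81, "Abnormal": 93.93},
        "RBF": {"Normal": 94.20, "Abnormal": 99.02}}
ensemble = {"MLP": {"Normal": 98.88, "Abnormal": 94.31},
            "RBF": {"Normal": 94.21, "Abnormal": 99.03}}

def improvement(method, dataset):
    """Ensemble accuracy minus base accuracy, in percentage points."""
    return round(ensemble[method][dataset] - base[method][dataset], 2)

for method in base:
    for dataset in base[method]:
        print(method, dataset, improvement(method, dataset))
```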


Ensemble Approach for IDS


Experimental Results and Discussion


We investigated a new technique for intrusion detection and evaluated its performance on the normal and abnormal intrusion datasets. The run time and error rate are estimated using the comparative cross validation method for the base classifier. It is shown that, compared to the earlier k-NN technique, the run time is reduced by up to 0.01% and 0.06% while error rates are lowered by up to 0.002% and 0.03% for normal and abnormal behavior, respectively. We described simultaneous feature selection and model selection for support vector regression (SVR). It is shown that, compared to the earlier SVR technique, the run time is reduced by up to 0.07 s and 0.26 s while error rates are lowered by up to 0.01% and 1.84% for normal and abnormal behavior, respectively. We estimated accuracy using the 10-fold cross validation method for the base classifiers. Following this, we explored the general MLP and RBF as intrusion detection models. The results indicated that the data mining method had a significant impact on classification accuracy. This study provides opportunities for exploring new directions for future research.
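For readers unfamiliar with 10-fold cross validation, the sketch below shows how the instances can be split into folds and how per-fold accuracies are averaged. The function names are hypothetical, the folds are contiguous for simplicity (a real experiment would usually shuffle or stratify first), and the sample size of 2000 is borrowed from the normal dataset in Table 1 purely as an example.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_validated_accuracy(n_samples, evaluate_fold, k=10):
    """Average the accuracy returned by evaluate_fold(train, test) over the k folds."""
    scores = [evaluate_fold(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

folds = list(k_fold_indices(2000, k=10))
print([len(test) for _, test in folds])
```

Each instance is used for testing exactly once and for training in the other nine folds, so the averaged accuracy uses all of the data without testing on training instances.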

Conclusion

Finally, we proposed a hybrid architecture involving ensemble and base classifiers for intrusion detection. From the empirical results, the base classifiers detect the normal and abnormal intrusion datasets with 98.81% and 93.93% accuracy for MLP and 94.20% and 99.02% accuracy for RBF, respectively. The proposed hybrid method improves prediction accuracy over the base classifiers: the gains are 0.07 and 0.38 percentage points with respect to MLP and 0.01 and 0.01 percentage points with respect to the RBF classifier, for normal and abnormal behavior respectively. This means that the hybrid method is more accurate than the individual methods, and that the ensemble of MLP is superior to the ensemble of RBF classifiers for normal behavior, while the reverse is the case for abnormal behavior.
