JOURNAL OF COMPUTING, VOLUME 3, ISSUE 9, SEPTEMBER 2011, ISSN 2151-9617 HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING WWW.JOURNALOFCOMPUTING.ORG

A Hybrid Model Based on Feature Extraction for Network Intrusion Detection
Saeed Khazaee and Mohammad Saniee Abadeh
Abstract: Intrusion detection is one of the most important ways to increase computer network security. In this paper, we explore the feasibility of applying feature extraction methods to the misuse detection task. The feature extraction stage is performed by several classifiers, which we call FE-Classifiers. In this study, to improve recognition accuracy, a number of subclasses are considered instead of the usual five classes, so that there are 11 classes rather than 5. The proposed method is evaluated on the KDDCup99 data set. Our experimental results indicate that an intrusion detection system with feature extraction performs better than one without feature extraction in terms of classification rate, detection rate and false alarm rate.
Index Terms: Feature extraction, Classification, Intrusion detection, Sampling, Preprocessing.


1 INTRODUCTION
Security of computer networks, with their daily expansion, faces many challenges. The threat of misuse of data and important resources has led researchers to look for different ways of preventing or detecting intrusions. Enterprises commonly adopt a firewall as the first line of defense for internet safety, but the main function of a firewall is to supervise access behavior, and it has limited capacity for detecting internet attacks. Therefore, an Intrusion Detection System is usually applied to detect intrusions and to improve the protective capacity of the network [1]. Research on methods for designing an intrusion detection system that can detect intrusions in the network at an appropriate rate is undoubtedly essential. By type of detection, intrusion detection systems are divided into two kinds: (i) misuse detection and (ii) anomaly detection. Misuse detection is used to identify intrusions that match known attack scenarios, whereas anomaly detection is an attempt to search for malicious behavior that deviates from established normal patterns [2]. In this paper our interest is in misuse detection. Up to now, several studies and various methods of intrusion detection have been developed, and there is a growing interest in the intrusion detection community in applying machine learning techniques in this field. Considering this trend and the extensive amount of data involved in the intrusion detection problem, data mining approaches seem appropriate for this purpose [1-5]. To evaluate the performance of machine learning methods, the KDD Cup 99 dataset is used. This dataset is a common benchmark for the evaluation of intrusion detection techniques, and many studies, such as [1-5], have applied data mining methods to the classification of the KDD Cup 99 dataset.

However, in most of the studies mentioned above, classification algorithms were applied directly to the raw data, which may have a negative effect on the accuracy of the classifier. Feature analysis is therefore an important preprocessing step for improving intrusion detection performance. Feature extraction is a main kind of feature analysis technique [6]. Very little research has applied feature extraction to misuse detection. In this paper, the possibility of using feature extraction to build a misuse detection system is investigated. Because of the high volume of data involved in intrusion detection, classic classification methods cannot easily achieve the desired aims. Dimension reduction techniques greatly reduce the complexity of the problem and make classification algorithms more efficient, so dimension reduction is used in data preprocessing. A method that can select an appropriate sample from all of the data is also provided. In the intrusion detection problem, attacks on a computer network are divided into four categories: DoS, Probe, U2R and R2L. Together with the normal class, each input belongs to one of five classes. Each of these classes itself comprises several known attacks; for example, DoS includes attacks such as Neptune, Smurf, Back, etc. Most approaches to this problem classify connections into these 5 classes. It seems, however, that treating individual attacks as sub-classes increases the accuracy of intrusion detection and allows a more appropriate response to intrusions.

In this study, there are 11 different classes in the intrusion detection problem (10 types of connection are attacks and one is normal), and training and classification are carried out accordingly, as examined in the following sections. Data mining techniques such as methods based on feature selection [7], artificial neural networks [8], genetic algorithms [9], association rules [10], etc. have improved classification and the detection process. However, a method based on feature extraction, which adds new features to the problem and is independent of the type of classifier, leads to a further improvement in classification that has not been seen in previous research. Section 2 describes the data preprocessing, which comprises four items: sampling, conversion-normalization, feature selection and feature extraction. Section 3 explains the proposed method, which is based on feature extraction. With the proposed method, new datasets are created for the training data and the test data, to which several new features have been added. Creating new features in the training set is treated as a preprocessing phase. In the test phase, new features are added to the data before the final classification identifies the type of connection. In Section 4 the proposed method is compared with some other methods and its performance is evaluated. Finally, Section 5 draws conclusions.

• Saeed Khazaee is with the Department of Electrical and Computer Engineering, Islamic Azad University, Qazvin Branch, Iran.
• Mohammad Saniee Abadeh is with the Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.

© 2011 Journal of Computing Press, NY, USA, ISSN 2151-9617 http://sites.google.com/site/journalofcomputing/

2 DATA PREPROCESSING
The KDD Cup 99 dataset [11] includes a very large number of training samples. This dataset is a common benchmark for the evaluation of intrusion detection techniques. KDD Cup 99 consists of several components, two of which are used in this work. The dataset contains a number of connection records, where each connection is a sequence of packets described by the values of 41 features. Attack types in this dataset fall into four main categories: denial of service (DoS), probe, user to root (U2R), and remote to local (R2L). Feature values in this data set are discrete, continuous, or symbolic, and the range of values for some of these features is very large and diverse. Because the training data has 41 different features, the intrusion detection problem is, as usual, a high-dimensional problem. The data in its current form is therefore not well suited to efficient classification, and the training data must be preprocessed. Here, data preprocessing takes place in four main stages: first, appropriate sampling; second, conversion of all symbolic values to real values, and normalization; third, selection of effective attributes; and last, feature extraction, a stage that has not been carried out in previous works. This last phase of preprocessing is discussed separately in the proposed framework.

2.1 Sampling
Sampling is the main method utilized for data selection. As the number of samples in the training set is very high, the training data is sampled. In this research, instead of the usual five classes (Normal, DoS, Probe, R2L, U2R), a number of sub-classes of these classes are targeted for detection. The 11 selected classes are shown in Table 1. As the number of samples of the classes Smurf, Satan and Normal is very high, they are sampled randomly with a suitable distribution. Also, 40 percent of the samples of class Warezclient, which exist in the training data but not in the test data, are separated and added to the test data. Conversely, unlike samples of class Warezclient, samples of class Warezmaster are available in the test data but do not exist at all in the training data. Note that the goal of this study is the design of a misuse detection system, which is intended to detect attacks whose behavior has already been studied by the system. Hence, 60 percent of the samples of class Warezmaster are placed in the training data and the remainder in the test data. Finally, because the other classes have far fewer instances than these, all of their samples remain in the training set.

TABLE 1
DETAILS OF CLASSES AND NUMBER OF SAMPLES

Class Number   Class Name        Group    Number of Samples
                                          Train    Test
1              buffer_overflow   U2R      30       22
2              Ipsweep           Probe    1247     306
3              Loadmodule        U2R      9        2
4              Neptune           DoS      7648     2374
5              Portsweep         Probe    1040     354
6              Rootkit           U2R      10       13
7              Satan             Probe    1589     554
8              Smurf             DoS      8450     4545
9              Warezclient       R2L      674      346
10             Warezmaster       R2L      634      423
11             Normal            Normal   6618     2796

2.2 Conversions and Normalization
As mentioned earlier, features in the KDD datasets have different forms (continuous, discrete, and symbolic) with significantly varying resolution and ranges. Most pattern classification methods are not able to process data in such a format; hence, preprocessing is required.
Symbolic-valued features, such as protocol_type (3 different symbols), service (70 different symbols) and flag (11 different symbols), are mapped to integer values ranging from 0 to S-1, where S is the number of symbols. Continuous features with smaller integer value ranges, such as wrong_fragment [0,3], urgent [0,14], hot [0,101], num_failed_logins [0,5], num_compromised [0,9], num_root [0,7468], num_file_creations [0,100], num_shells [0,5], num_access_files [0,9], count [0,511], srv_count [0,511], dst_host_count [0,255] and dst_host_srv_count [0,255], are scaled linearly to the range [0,1]. Logarithmic scaling (base 10) is applied to three features that span a very large integer range, i.e. duration [0,58329], src_bytes [0,1.3 billion] and dst_bytes [0,1.3 billion], to reduce their ranges to [0,4.77] and [0,9.11], respectively. Other features are either Boolean (e.g. logged_in), having binary values, or continuous in the range [0,1] (e.g. diff_srv_rate), and no scaling is needed for them. Finally, each of the mapped features is linearly scaled to the range [0,1] [12].
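The conversion and normalization step of Section 2.2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and the protocol symbol list are hypothetical, and log10(value + 1) is an assumption made here so that a zero value maps to zero.

```python
import math

# Hypothetical symbol table for protocol_type (3 symbols, as stated in the text).
PROTOCOLS = ["icmp", "tcp", "udp"]


def encode_symbol(value, symbols):
    """Map a symbolic value to an integer in 0..S-1, then scale it to [0, 1]."""
    index = symbols.index(value)
    return index / (len(symbols) - 1)


def scale_linear(value, upper):
    """Linearly scale a value from the range [0, upper] to [0, 1]."""
    return value / upper


def scale_log10(value, log_upper):
    """Apply base-10 logarithmic scaling, then scale the result to [0, 1].

    The +1 inside the logarithm is an assumption of this sketch; the result
    is clipped at 1.0 in case a value slightly exceeds the stated range.
    """
    return min(math.log10(value + 1) / log_upper, 1.0)


# Example on one partial connection record (feature values are made up).
record = {"protocol_type": "tcp", "count": 511, "duration": 58329}
normalized = {
    "protocol_type": encode_symbol(record["protocol_type"], PROTOCOLS),
    "count": scale_linear(record["count"], 511),
    "duration": scale_log10(record["duration"], 4.77),
}
```

The upper bounds 511 and 4.77 are the ranges quoted in the text for count and for the log-scaled duration.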


2.3 Feature Selection
In detection approaches based on data mining methods, different variations of features are utilized. These features are gained through static and dynamic analysis [13]. The performance of a pattern recognition system depends strongly on the employed feature-selection method [14]. Since the computational cost of a system increases with the number of features, it is necessary to design and implement systems with the smallest possible number of features. In most high-dimensional problems, selecting the influential features and removing the others can greatly raise the accuracy of classification and reduce the complexity of data processing at the different stages. The KDD Cup 99 data includes 41 different features and one label as a class. In most previous work, feature selection techniques have been used to enhance performance and reduce the dimension of the classification problem [15], [16]. In this paper, because dimension reduction is necessary, the features with the highest rank are selected by a ranking-based method. Here, instead of using the names of the attributes, the 14 selected features are called F1 to F14. The ranking is done by "Chi-Squared Feature Evaluation". The chi-square test statistic (χ2) is a widely used method for testing independence and/or correlation. In our proposed technique, it is used for testing similar behavior between attributes. Essentially, the χ2 test is based on the comparison of observed frequencies with the corresponding expected frequencies. In other words, χ2 is used to test the significance of the deviation from the expected values. Let f_0 be an observed frequency and f the corresponding expected frequency. The χ2 value is defined as:

\[ \chi^2 = \sum \frac{(f_0 - f)^2}{f} \]

A χ2 value of 0 implies that the attributes are statistically independent. If the value is higher than a certain threshold, we reject the independence assumption [17]. After studying and comparing the other attribute selection methods available in Weka, we found this method to be well suited to effective feature selection in this research.
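As an illustration of the chi-squared ranking described above, the statistic can be computed from a contingency table of observed frequencies (feature values by class). The sketch below is an assumption-laden example rather than the Weka procedure used in the paper; the function names and the toy tables are hypothetical.

```python
def chi_squared(observed):
    """Chi-squared statistic of a contingency table given as a list of rows.

    Each expected frequency is (row total * column total) / grand total,
    which is the frequency expected if rows and columns were independent.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / total
            if f_exp > 0:
                statistic += (f_obs - f_exp) ** 2 / f_exp
    return statistic


def rank_features(tables):
    """Sort feature names by decreasing chi-squared value."""
    return sorted(tables, key=lambda name: chi_squared(tables[name]), reverse=True)


# Toy tables: rows are feature values, columns are classes (made-up counts).
tables = {
    "independent_feature": [[10, 20], [20, 40]],
    "informative_feature": [[30, 0], [0, 30]],
}
ranking = rank_features(tables)
```

In the first toy table every observed count equals its expected count, so the statistic is 0; in the second the feature value determines the class, so the statistic is large and the feature ranks first.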

2.4 Feature Extraction
Feature extraction is the process of obtaining, from the original data set, a group of features with the characteristics we need. It usually uses a transform to obtain a group of features in a single computation. Feature extraction techniques often involve non-linear transformation. For instance, in [18] the authors transformed features non-linearly using a neural network that is discriminatively trained on phonetically labeled training data. In [19] various non-linear transformation methods, such as folding, gauge coordinate transformation, and non-linear diffusion, were explored for feature extraction. Linear discriminant analysis (LDA) [20] and principal components analysis (PCA) [21] are two popular techniques for feature extraction. Non-linear transformation methods are good at approximation and robust in dealing with practical non-linear problems. Feature extraction creates new features whose meanings are difficult to interpret. Feature extraction can also be used to enhance the speed and effectiveness of supervised learning. In this study, the proposed method is based on non-linear feature extraction, which is explained in detail in Section 3.

3 PROPOSED FRAMEWORK
In recent years, data mining, also known as knowledge discovery in databases, has established its position as a prominent and important research area. In previous work on intrusion detection using the KDD Cup 99 dataset, much attention has been paid to data preprocessing. For example, in [7], [15], [16] classification was improved greatly through effective selection of attributes. Beyond the preprocessing mentioned above, however, feature extraction has not been proposed as a fundamental solution for better classification. This study shows that, using the proposed method based on feature extraction, an intrusion detection system provides better performance owing to the new features, independently of the type of classifier.

3.1 Feature Extraction For Training Data
At this stage, which is considered data preprocessing, the training set is divided into two partitions, TrD1 and TrD2 (Figure 1). TrD1 is used for training multiple classifiers called "Feature-Extraction classifiers" or "FE-Classifiers", and TrD2 is used in the final step of feature extraction and in the training phase. TrD1 is then divided into several smaller parts with roughly the same distribution of the different classes (TrD1-1 to TrD1-M), where M is the number of FE-Classifiers. This raises the question of the minimum number of these classifiers. We first assume that, in the test phase, the majority vote of the answers of these classifiers specifies the connection type and the final answer. With this assumption and the pigeonhole principle, n+1 classifiers are needed for n classes: even when the answers of the classifiers differ as much as possible, at least two classifiers give the same answer. In the majority vote, one of the types of attack or the normal connection is selected. Since this problem has 11 classes (10 attack connections of various types and the normal connection), there are 12 datasets for training the FE-Classifiers.

Fig. 1. Division of the training dataset into two parts, TrD1 and TrD2, with equal distribution
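The partitioning of TrD1 and the majority-vote argument can be sketched as below. This is only one possible way to obtain parts with a roughly equal class distribution (round-robin dealing per class); the paper does not state how the split was performed, and both function names are hypothetical.

```python
from collections import Counter, defaultdict


def stratified_parts(samples, labels, m):
    """Split labeled samples into m parts with a similar class distribution.

    Samples of each class are dealt to the parts in round-robin order, so
    every part receives nearly the same number of samples of every class.
    """
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append((sample, label))
    parts = [[] for _ in range(m)]
    next_part = 0
    for label in sorted(by_class):
        for item in by_class[label]:
            parts[next_part % m].append(item)
            next_part += 1
    return parts


def majority_vote(answers):
    """Return the most frequent answer among the classifiers."""
    return Counter(answers).most_common(1)[0][0]


# With 11 classes and 12 classifiers, at least two answers must coincide
# (pigeonhole principle), so some answer always has two or more votes.
parts = stratified_parts(list(range(24)), ["smurf"] * 12 + ["normal"] * 12, m=12)
```

Here each of the 12 parts ends up with one smurf sample and one normal sample.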

Here 12 multilayer perceptrons (MLPs) are used as FE-Classifiers, and the inputs of these networks have also been normalized in the first stage of preprocessing. All nominal data are converted to numerical data, and once all values are in the range [0,1], the subsequent stages of processing are applied to the data. After appropriate sampling and effective feature selection, each classifier has 15 inputs: 14 input values are the attributes of each connection, and the label, which corresponds to one of the classes, is used as the target for the classifier (Figure 2).
Fig. 2. Training of FE-Classifiers (each partition TrD1-1 to TrD1-12 trains the corresponding FE-Classifier 1 to FE-Classifier 12)

In the next phase, the other part of the training data, TrD2, is used for two important purposes:
• Using TrD2 for the main stage of feature extraction: At this point, each sample in TrD2, regardless of its label, is tested with each of the FE-Classifiers, and according to the answers of the classifiers it is placed in 12 identical or different classes. The answer of each classifier for a sample becomes a new feature for this problem. The values of these attributes may be Neptune, Smurf, Rootkit, any other kind of attack, or normal connection. In this way 12 new features are added to the problem. Figure 3 shows how the new feature set is built for a sample A. In this figure, Ans i is the answer of FE-Classifier i and consequently the value of a new feature for sample A. F1 to F14 are the previous features and F15 to F26 are the newly extracted features. The new features may not all rank highly alongside the old features, and adding them may change the influence of the existing features on the problem. Therefore, effective features are selected once again in this phase, by the Chi-Squared Feature Evaluation method over all previous and new features. The result shows that all new features rank higher in effectiveness than all previous features (Table 2), so it is not unexpected that extracting the new features yields a significant improvement in intrusion detection. Because the ranks of features F7, F10 and F14 are markedly lower than those of the other features (Table 2), these features are removed. After the values of Ans 1 to Ans 12 have been determined by the FE-Classifiers and the low-ranked features have been removed, the New-TrD2 dataset is built. This dataset contains all the data of TrD2, except that the new data have newer features than the previous data.
• Using New-TrD2 for training of the final classifier: After the extraction of new features and the preprocessing of the data, the final classifier is trained on the New-TrD2 dataset. Because the proposed method is based on data processing, the improvement in system performance is almost independent of the final classifier. For the final classification, the Fuzzy ARTMAP artificial neural network is used, and its performance is examined in the next section. A more detailed description of this artificial neural network can be found in [22], [23].

TABLE 2
RANKING OF NEW-OLD FEATURES

Feature   Rank
F21       2.14165
F19       2.11885
F26       2.11184
F20       2.11068
F18       2.08331
F25       2.0622
F22       2.04458
F24       2.01759
F16       1.98643
F15       1.92231
F17       1.91862
F23       1.897
F2        1.89226
F4        1.83072
F8        1.62099
F12       1.28416
F3        1.11476
F13       1.11465
F9        1.10889
F11       1.07104
F1        0.92272
F6        0.88135
F5        0.86871
F14       0.51477
F10       0.3147
F7        0.1323
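The feature-extraction step of Section 3.1 (appending the 12 FE-Classifier answers to the 14 original features, then removing the low-ranked features) can be sketched as follows. The stub classifiers below are hypothetical stand-ins defined only for this example; the paper uses trained MLPs, and the helper names are assumptions.

```python
def make_stub_classifier(threshold):
    """Hypothetical stand-in for a trained FE-Classifier (the paper uses MLPs)."""
    def classify(sample):
        return "smurf" if sample[0] > threshold else "normal"
    return classify


def extract_features(sample, fe_classifiers):
    """Extend a sample with one new feature per FE-Classifier answer."""
    answers = [classify(sample) for classify in fe_classifiers]
    return list(sample) + answers


def drop_features(row, removed_indices):
    """Remove the features at the given zero-based positions."""
    removed = set(removed_indices)
    return [value for index, value in enumerate(row) if index not in removed]


fe_classifiers = [make_stub_classifier(i / 12) for i in range(12)]
sample_a = [0.5] * 14                       # F1..F14, already normalized
extended = extract_features(sample_a, fe_classifiers)   # F1..F26
# F7, F10 and F14 are the low-ranked features named in the text
# (zero-based positions 6, 9 and 13).
selected = drop_features(extended, [6, 9, 13])
```

A row of New-TrD2 would then consist of the selected features together with the original class label of the sample.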

3.2 Feature Extraction For Test Data
After the final classifier has been trained with New-TrD2, it should be evaluated on test data, and this test data must have the same features as the training data. Therefore, regardless of the label of each sample, the new features must be computed for it and added to the test dataset. This is done by the FE-Classifiers. It should be noted that none of the test data is used during the training of these classifiers. At first sight, extracting the new features seems to increase the detection time greatly. However, in the proposed method the value of each extracted feature is determined independently of the other features and can be computed in parallel, so the extraction rate can be raised with parallel processing: 12 copies of a sample are given to the FE-Classifiers in parallel and their results are recorded. Finally, these results are taken as the values of the new features for that sample.
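Because each extracted feature depends only on the sample and on one FE-Classifier, the test-time extraction can be written as a parallel map. The sketch below uses a thread pool from the Python standard library purely to show the structure; it is not the authors' MATLAB implementation, the stub classifiers are hypothetical, and for CPU-bound classifiers in CPython separate processes or machines would likely be needed for an actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def make_stub_classifier(threshold):
    """Hypothetical stand-in for a trained FE-Classifier."""
    def classify(sample):
        return "neptune" if sample[0] > threshold else "normal"
    return classify


def extract_features_parallel(sample, fe_classifiers, max_workers=12):
    """Give the same sample to every FE-Classifier concurrently.

    pool.map returns the answers in classifier order, so the new features
    keep the same positions as in a sequential extraction.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        answers = list(pool.map(lambda classify: classify(sample), fe_classifiers))
    return list(sample) + answers


fe_classifiers = [make_stub_classifier(i / 12) for i in range(12)]
test_sample = [0.25] * 14
parallel_row = extract_features_parallel(test_sample, fe_classifiers)
sequential_row = test_sample + [classify(test_sample) for classify in fe_classifiers]
```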


Fig. 3. Proposed framework. In the feature extraction phase, each sample A of TrD2 with 14 features (old features F1 to F14) is given to FE-Classifier 1 to FE-Classifier 12, and their answers Ans 1 to Ans 12 become the new features F15 to F26. Re-feature selection over all new and old features removes the features with low rank, and sample A with the selected features is stored in New-TrD2.

4 EXPERIMENTS AND RESULTS
4.1 Datasets
As mentioned before, the KDD dataset is used to evaluate the proposed framework for intrusion detection. This database contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment [24]. In all experiments described below, the '10% KDD' dataset is used for training and the 'Corrected' dataset is used as the test set. Several novel, never-before-seen attacks have been included in 'Corrected KDD' in order to assess the generalization ability of IDSs. Statistical details of the two KDD components used here are summarized in Table 3.

TABLE 3
CHARACTERISTICS OF KDD'99 COMPONENTS USED FOR TRAIN AND TEST

Dataset     Total Attack Patterns   Total Normal Patterns   Total
10% KDD     396,743                 97,278                  494,021
Corrected   250,436                 60,593                  311,029


4.2 Evaluation Criteria
Before discussing the results of the experiments, it is necessary to mention the standard metrics that have been developed for evaluating an IDS. Detection rate (DR) and false alarm rate (FAR) are the two most common metrics. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while FAR is computed as the ratio between the number of normal connections that are incorrectly classified as attacks and the total number of normal connections. Another metric used here is the classification rate (CR). The classification rate for each class of data is computed as the ratio between the number of test instances of that class that are correctly classified and the total number of test instances of that class.

4.3 Experiments Setup and Results
The experiments of this study were conducted in Microsoft Windows 7 Ultimate on an IBM-compatible computer with an Intel(R) Core(TM) 2 Duo 2.4 GHz CPU and 2 GB of RAM. The proposed method was coded in MATLAB R2010a. Before evaluating the system, we determined the best values of the important parameters of the neural network. After determining the appropriate structure and parameter values for Fuzzy ARTMAP, the performance of the proposed framework was evaluated in terms of classification rate (CR), detection rate (DR), correctly classified (CC) and false alarm rate (FAR). The performance of the proposed method with and without the feature extraction module is reported in Table 4, which also compares the proposed method with some other Weka methods. As shown in this table, the proposed system has a higher classification rate for almost all of the attack classes. The system also performs better in terms of DR and FAR than some other Weka methods. It can therefore be inferred that the proposed approach effectively increases the detection rate and decreases the false alarm rate. It is interesting to note that, as expected, the capabilities of this prototype IDS reveal the effectiveness of data mining techniques. The increase in CR obtained by the proposed feature extraction with Fuzzy ARTMAP is shown separately in Figure 4. Because the classes selected in this study differ from those of previous works, the performance of the proposed framework is compared with some previous work in terms of correctly classified, detection rate and false alarm rate. Table 5 shows these comparisons.
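The three metrics defined in Section 4.2 can be computed from lists of true and predicted labels as sketched below. One assumption is made and should be read as such: an attack counts as detected when it is predicted as any attack class, not necessarily the right one; the paper does not spell this out. The function names and the toy labels are hypothetical.

```python
def detection_rate(true_labels, predicted, normal="normal"):
    """Share of attack connections that are flagged as some attack."""
    attacks = [p for t, p in zip(true_labels, predicted) if t != normal]
    return sum(p != normal for p in attacks) / len(attacks)


def false_alarm_rate(true_labels, predicted, normal="normal"):
    """Share of normal connections that are wrongly flagged as an attack."""
    normals = [p for t, p in zip(true_labels, predicted) if t == normal]
    return sum(p != normal for p in normals) / len(normals)


def classification_rate(true_labels, predicted, cls):
    """Share of test instances of one class that are classified correctly."""
    of_class = [p for t, p in zip(true_labels, predicted) if t == cls]
    return sum(p == cls for p in of_class) / len(of_class)


# Toy example with made-up labels.
true_labels = ["normal", "normal", "smurf", "smurf", "satan"]
predicted = ["normal", "smurf", "smurf", "normal", "smurf"]
dr = detection_rate(true_labels, predicted)                 # 2 of 3 attacks flagged
far = false_alarm_rate(true_labels, predicted)              # 1 of 2 normals flagged
cr_smurf = classification_rate(true_labels, predicted, "smurf")
```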

TABLE 4
PERFORMANCE OF PROPOSED IDS FRAMEWORK AS COMPARED TO MODEL WITHOUT FEATURE EXTRACTION
(FS: only feature selection; FS+FE: feature selection and feature extraction)

Metric                        Ridor     Ridor      FuzzyARTMAP   FuzzyARTMAP
                              (FS)      (FS+FE)    (FS)          (FS+FE)
Classification Rate (%)
  Buffer_overflow             68.2      68.2       4.5           77.3
  Ipsweep                     96.7      98.7       97.7          98.4
  Loadmodule                  0.0       50.0       0.0           0.0
  Neptune                     97.9      100        64.25         100
  Portsweep                   88.1      99.7       83.3          99.7
  Rootkit                     30.8      15.4       0.0           23.1
  Satan                       36.1      99.8       99.6          99.5
  Smurf                       99.3      100        100           100
  Warezclient                 48.6      99.1       89.0          99.1
  Warezmaster                 96.8      99.3       98.1          99.6
  Normal                      92.1      99.5       97.3          99.6
Detection Rate (%)            92.50     99.56      89.47         99.67
Correctly Classified (%)      92.40     99.61      91.24         99.63
False Alarm Rate (%)          7.91      0.52       2.83          0.36


Fig. 4. Comparison of the FuzzyARTMAP classifier with feature extraction and without feature extraction in classification rate (CR)

TABLE 5
PERFORMANCE OF PROPOSED IDS FRAMEWORK AS COMPARED TO OTHER METHODS IN CC, DR AND FAR

Method                                                Correctly        Detection       False Alarm
                                                      Classified (%)   Rate (%)        Rate (%)
PNrule [25]                                           Not Reported     91.1            0.4
FC-ANN [2]                                            96.71            Not Reported    Not Reported
GA-optimized FARM-based FeatureSelector
  + FuzzyARTMAP [12]                                  Not Reported     97.2            0.17
Hierarchical Clustering and support vector
  machines [5]                                        95.7             Not Reported    0.71
HM-FE (Proposed)                                      99.63            99.67           0.36

5 CONCLUSION
In this research, an intrusion detection framework based on feature extraction was proposed in which, to improve recognition accuracy, a number of subclasses are considered instead of the usual five classes. Feature extraction is a main kind of feature creation technique that is able to improve recognition accuracy. After sampling and conversion-normalization, the proposed system was developed in two main stages. The Chi-squared feature evaluation method was used to find the most important features. In this way, the dimension of the input feature space was reduced from 41 to 14. Finally, new features were extracted using the previous features, and these new features had a positive impact for almost every type of classifier. This shows that the proposed method is largely independent of the type of classifier. Experimental results showed that the proposed hybrid model performed better in terms of CR, DR and FAR than some other Weka methods. In addition, considering 11 classes instead of the usual five classes (DoS, Probe, R2L, U2R, Normal) can make warnings more accurate and allow a more appropriate response to be taken against these attacks.


REFERENCES
[1] S.Y. Wu and E. Yen, "Data mining-based intrusion detectors", Expert Systems with Applications, Volume 36, Issue 3, Part 1, pp. 5605-5612, 2009.
[2] G. Wang, J. Hao, J. Ma and L. Huang, "A new approach to intrusion detection using artificial neural networks and fuzzy clustering", Expert Systems with Applications, Volume 37, Issue 9, pp. 6225-6232, 2010.
[3] C.M. Chen, Y.L. Chen and H.C. Lin, "An efficient network intrusion detection", Computer Communications, Volume 33, Issue 4, pp. 477-484, 2010.
[4] C.F. Tsai, Y.F. Hsu, C.Y. Lin and W.Y. Lin, "Intrusion detection by machine learning: A review", Expert Systems with Applications, Volume 36, Issue 10, pp. 11994-12000, 2009.
[5] S.J. Horng, M.Y. Su, Y.H. Chen, T.W. Kao, R.J. Chen, J.L. Lai and C.D. Perkasa, "A novel intrusion detection system based on hierarchical clustering and support vector machines", Expert Systems with Applications, Volume 38, Issue 1, pp. 306-313, 2011.
[6] H.H. Gao, H.H. Yang and X.Y. Wang, "Principal component neural networks based intrusion feature extraction and detection using SVM", Lecture Notes in Computer Science, Advances in Neural Computation, Volume 3611, pp. 21-27, 2005.
[7] W. Li, J.L. Wang, Z.H. Tian, T.B. Lu and C. Young, "Building lightweight intrusion detection system using wrapper-based feature selection mechanisms", Computers & Security, Volume 28, Issue 6, pp. 466-475, 2009.
[8] D. Fisch, A. Hofmann and B. Sick, "On the versatility of radial basis function neural networks: A case study in the field of intrusion detection", Information Sciences, Volume 180, Issue 12, pp. 2421-2439, 2010.
[9] M. Saniee Abadeh, J. Habibi and C. Lucas, "Intrusion detection using a fuzzy genetics-based learning algorithm", Network and Computer Applications, Volume 30, Issue 1, pp. 414-428, 2007.
[10] A. Tajbakhsh, M. Rahmati and A. Mirzaei, "Intrusion detection using fuzzy association rules", Applied Soft Computing, Volume 9, Issue 2, pp. 462-469, 2009.
[11] 1999 KDD Cup Competition (Available on http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html).
[12] M. Sheikhan and M. Sharifi Rad, "Misuse detection based on feature selection by fuzzy association rule mining", World Applied Sciences Journal, 10 (Special Issue of Computer & Electrical Engineering), pp. 32-40, 2010.
[13] M. Eskandari, B. Hosseini, S. Hashemi and A. Salajegheh, "Malware detection using OWA measure", Journal of Computing, ISSN 2151-9617, Volume 2, Issue 12, pp. 15-19, 2010.
[14] H.T. Nguyen, K. Franke and S. Petrović, "Towards a generic feature-selection measure for intrusion detection", International Conference on Pattern Recognition, ISSN: 1051-4651, pp. 1529-1532, 2010.
[15] A. Zainal, M.A. Maarof and S.M. Shamsuddin, "Feature selection using Rough-DPSO in anomaly intrusion detection", Lecture Notes in Computer Science, Computational Science and its Applications, Volume 4705, Part I, pp. 512-524, 2007.
[16] I. Porto-Díaz, D. Martínez-Rego, A. Alonso-Betanzos and O. Fontenla-Romero, "Combining feature selection and local modelling in the KDD Cup 99 dataset", Lecture Notes in Computer Science, Artificial Neural Networks, Volume 5768, pp. 824-833, 2009.
[17] Z. Farzanyar, M. Kangavari and S. Hashemi, "Effect of similar behaving attributes in mining of fuzzy association rules in the large databases", Lecture Notes in Computer Science, Computational Science and its Applications, Volume 3980, pp. 1100-1109, 2006.
[18] S. Sharma, D. Ellis, S. Kajarekar, P. Jain and H. Hermansky, "Feature extraction using non-linear transformation for robust speech recognition on the aurora database", International Conference on Acoustics, Speech, and Signal Processing, pp. 1117-1120, 2000.
[19] J.M. Coggins, "Non-Linear feature space transformations", IEE Colloquium on Applied Statistical Pattern Recognition (Ref. No. 1999/063), pp. 17/1-17/5, 1999.
[20] C.J. Liu and H. Wechsler, "Enhanced fisher linear discriminant models for face recognition", Fourteenth International Conference on Pattern Recognition, Volume 2, pp. 1368-1372, 1998.
[21] S. Chen, "Regularised OLS algorithm with fast implementation for training multi-output radial basis function networks", Fourth International Conference on Artificial Neural Networks, pp. 290-294, 1995.
[22] G.A. Carpenter, S. Grossberg, N. Markuzon, J.H. Reynolds and D.B. Rosen, "Fuzzy ARTMAP: a neural network for incremental supervised learning of analog multidimensional maps", IEEE Transactions on Neural Networks, Volume 3, Issue 5, pp. 689-713, 1992.
[23] G.A. Carpenter, "Default ARTMAP", In Proceedings of the International Joint Conference on Neural Networks, Volume 2, pp. 1396-1401, 2003.
[24] H.O. Alanazi, R. Md Noor, B.B. Zaidan and A.A. Zaidan, "Intrusion Detection System: Overview", Journal of Computing, ISSN 2151-9617, Volume 2, Issue 2, pp. 130-133, 2010.
[25] R. Agrawal and M.V. Joshi, "PNrule: A new framework for learning classifier models in data mining (a case-study in network intrusion detection)", IBM Research Division, Technical Report TR 00-015, Report No. RC-21719, Department of Computer Science, University of Minnesota, 2000.

Saeed Khazaee: Received his B.Sc. in Software Engineering in 2006 and M.Sc. in Software Engineering from the Islamic Azad University of Qazvin. His interests include data mining, machine learning, operating systems, databases and parallel algorithms.
