
PCA-based Distributed Approach for Intrusion Detection in WSNs

Abstract
Wireless Sensor Networks (WSNs) are used in various applications, ranging from
military to civilian fields. These networks are often deployed in hostile and unprotected
environments. Hence, security issues are of significant importance.
Here, a novel distributed intrusion detection approach, called PCADID, for detecting
attacks in WSNs is presented. In this approach, a WSN is partitioned into groups of sensor nodes.
In each group, some nodes are selected as monitor nodes, which cooperate with each
other to compose a global normal profile.
Using Principal Component Analysis (PCA), each monitor node establishes a sub-profile of its
normal network traffic and sends it to the other monitor nodes.
As the normal network behavior changes over time, the global normal profile is updated.
It is demonstrated how PCADID achieves a high detection rate with a low false alarm rate, while
minimizing the communication overhead and energy consumption in the network.

Department of Information Science & Engineering, RNSIT


Chapter 1
Introduction
Wireless Sensor Networks (WSNs) have become a growing area of research and
development over the past few years. A wireless sensor network consists of spatially
distributed autonomous sensors that monitor physical or environmental conditions, such as
temperature, sound, and pressure, and pass their data through the network. The
development of wireless sensor networks was motivated by military applications such as
battlefield surveillance. Today, such networks are used in many industrial and consumer
applications, such as industrial process monitoring and control and machine health
monitoring.
Due to the critical nature of such applications, security issues are of significant
importance. WSNs are vulnerable to different types of attacks, launched either for
financial gain or for malicious and illegal purposes, because they are often deployed in
hostile and unprotected environments. WSNs can play a critical role in detecting these
attacks, and thus can themselves become a target for attacks.

A WSN is built of nodes, from a few to several hundreds or even thousands, where
each node is connected to one (or sometimes several) sensors. These sensor nodes use
ad hoc communications and have limitations in power supply [1], memory, and
computational capabilities. WSNs offer a new monitoring and control solution for various
applications such as wildlife monitoring, disaster monitoring, traffic monitoring,
building monitoring, military surveillance, and industrial quality control [2]. Due to
the resource limitations of sensor nodes in power supply, memory, and processing power,
WSNs are more vulnerable to attacks. Many different types of attacks against these
networks have been identified, including sinkhole, selective forwarding, wormhole,
blackhole, and hello flooding attacks [3].

All the security techniques proposed for WSNs can be divided into two main categories: prevention and detection. Prevention techniques, such as secure routing protocols, are usually considered the first line of defense against attacks. However, we cannot completely rely on them since they pose many difficult technical problems. Detection techniques can come into play once prevention techniques have failed. Generally, there are two types of detection techniques: misuse detection and anomaly detection. A misuse detection technique compares current behavior with known attack signatures and generates an alert if there is a match. An anomaly detection technique detects abnormal behaviors that deviate significantly from a pre-established normal profile. The advantage of anomaly detection techniques is that they do not require attack signatures and can thus detect novel attacks (i.e., attacks whose nature is unknown).

The resource limitations of sensor nodes, especially in terms of power supply, memory, and processing power, require a novel and cooperative approach for intrusion detection in WSNs. Principal Component Analysis (PCA) is a powerful technique for analyzing and identifying patterns in data [5]. Basically, it finds the most important axis to express the scattering of data. The first principal component reflects the approximate distribution of data [6]. PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by any projection of the data comes to lie on the first coordinate (called the first component), the second greatest variance on the second coordinate, and so on.

Here, a PCA-based Centralized Approach, called PCACID, and a PCA-based Distributed Approach, called PCADID, for Intrusion Detection in WSNs are described. We partition a WSN into groups of sensor nodes. In each group, some nodes are selected as monitor nodes. In PCACID, every monitor node independently establishes a profile of its own normal network traffic using PCA and uses it to detect anomalous network traffic. In PCADID, every monitor node establishes a sub-profile of its own normal network traffic using PCA and sends it to other monitor nodes. Every monitor node composes a global normal profile based upon all received sub-profiles and uses it to detect anomalous network traffic. In fact, PCADID reduces the memory and energy consumption in the network by distributing the process of establishing and updating the global normal profile among all monitor nodes.

We conduct WSN simulations using the NS-2 simulator [7] and consider scenarios for detecting two types of routing attacks. The NS-2 simulator is a framework that permits simulations to be written and modified in an interpreted environment without having to recompile the simulator each time a structural change is made. This avoids many time-consuming recompilations and potentially allows an easier scripting syntax.

Chapter 2
Related Work
Over time, many approaches and theories have been proposed for intrusion detection in traditional networks. However, the limitations and constraints of WSNs make it impossible to apply them directly. Some approaches proposed for WSNs are discussed below.

Loo, Ng, Palaniswami, and Leckie [8] presented a clustering-based approach for intrusion detection in WSNs. Every sensor node uses a fixed-width clustering algorithm to establish a profile of its own normal network traffic and then uses this profile to detect routing attacks. The problem with this approach is that it assumes that every sensor node has sufficient power and resources to perform the computation required for intrusion detection. However, this assumption may not be applicable to all WSNs.

Su, Chang, and Kuo [9] presented an energy-efficient hybrid intrusion prohibition system, called eHIP, which combines intrusion prevention and intrusion detection to provide a secure cluster-based WSN. The eHIP system consists of an authentication-based intrusion detection subsystem and a collaboration-based intrusion detection subsystem. Both subsystems provide heterogeneous mechanisms for different demands of security levels in cluster-based WSNs to improve energy efficiency.

Li, He, and Fu [10] presented a distributed group-based intrusion detection approach that combines the Mahalanobis distance measurement with the OGK estimators to take into account the inter-attribute dependencies of multi-dimensional sensed data. In this approach, they first partition the sensor nodes in a WSN into a number of groups such that the nodes in a group are physically close to each other and their sensed data are similar enough. The monitor nodes supervise the sensed data in each group in turn, to average the power consumption among the group members. If an obvious deviation between monitored sensed data is found, an alarm is issued.

Wang, Liang, and Zhao [11] presented an anomaly detection technique based on fuzzy C-means clustering (FCM) that can be used to detect routing attacks in WSNs.

Chapter 3
Proposed Solution
Consider a WSN composed of a large number of small sensor nodes set up in the target field (that is, in the network). A sensor node is a node in a wireless sensor network that is capable of performing some processing, gathering sensory information, and communicating with other connected nodes in the network. Now, partition the network into a number of groups. The partitioning of the network could be static or dynamic. In static partitioning, the network is fixed and does not vary with any altering conditions. However, in dynamic partitioning, the network may be dynamically rearranged periodically if the environmental conditions change. The sensor nodes within the same group are physically close to each other and use a suitable routing protocol so that they can route messages among themselves. In each group $G$, some sensor nodes are selected as monitor nodes.

The main aim is to detect routing attacks launched by compromised malicious nodes. These attacks are detected by identifying anomalous feature vectors, that is, feature vectors whose values have been modified or harmed by an external user. A feature vector is an n-dimensional vector of numerical features that represent some object. Each feature vector is comprised of a set of attributes or features.

Let time be divided into fixed-length time intervals. At each time interval, every monitor node $m_i \in G$ extracts a feature vector from its own network traffic. At each time window $t$, the monitor node $m_i$ collects a matrix $X_i(t)$ of feature vectors from its own network traffic. It is given by:

$$X_i(t) = \left[ x_1^i(t), x_2^i(t), \ldots, x_n^i(t) \right]^T \qquad (1)$$

where $n$ is the number of feature vectors.
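The way per-interval feature vectors accumulate into one matrix per time window can be sketched as follows. This is a minimal illustration rather than the method's actual implementation: the function name `group_into_windows`, the two example features, and the choice to drop an incomplete trailing window are all assumptions made here.

```python
def group_into_windows(feature_vectors, vectors_per_window):
    """Split a chronological list of feature vectors into window matrices.

    Each matrix is a list of rows (one row per time interval). An incomplete
    trailing window is dropped; that policy is an assumption of this sketch.
    """
    windows = []
    last_start = len(feature_vectors) - vectors_per_window
    for start in range(0, last_start + 1, vectors_per_window):
        windows.append(feature_vectors[start:start + vectors_per_window])
    return windows


# Hypothetical two-feature vectors, one per time interval
# (for example, counts of two kinds of routing packets seen in that interval).
observed = [[3, 1], [4, 0], [2, 2], [5, 1], [3, 3], [4, 1], [6, 0]]
window_matrices = group_into_windows(observed, vectors_per_window=3)
```

Each element of `window_matrices` plays the role of one matrix of feature vectors for one time window.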

Chapter 4
Centralized Approach
The centralized approach is abbreviated as PCACID, which stands for Principal Component Analysis based Centralized Approach for Intrusion Detection in WSNs. This approach consists of two phases: a training phase and a detection phase.

A. Training Phase
The training phase involves establishing a profile of normal network traffic. Let $X_i(0)$ be an $n \times d$ matrix of feature vectors collected by a monitor node $m_i \in G$ from its own normal network traffic. Each feature vector $x_k^i(0) \in X_i(0)$ is comprised of a set of features:

$$x_k^i(0) = \left( x_{k1}^i(0), x_{k2}^i(0), \ldots, x_{kd}^i(0) \right) \qquad (2)$$

where $d$ is the number of features. Note that the length of each feature vector is the same for all monitor nodes. $m_i$ first normalizes $X_i(0)$ to the range [0, 1] and then computes its column-centered matrix [12]:

$$\hat{X}_i(0) = \left( I - \tfrac{1}{n} e e^T \right) X_i(0) = X_i(0) - e\,\mu_i(0) \qquad (3)$$

where $\mu_i(0)$ is the column mean of $X_i(0)$ and $e \equiv [1, 1, \ldots, 1]^T$ is a vector with the length $n$. The principal components (PCs) of $X_i(0)$ are given by the singular value decomposition (SVD) [13] of $\hat{X}_i(0)$:

$$\hat{X}_i(0) = U_i(0)\,\Lambda_i(0)\,V_i^T(0) \qquad (4)$$

where $V_i(0)$ is the matrix of PCs of $X_i(0)$ and $\Lambda_i(0)$ is the diagonal matrix of singular values (the square roots of the eigenvalues of $\hat{X}_i^T(0)\hat{X}_i(0)$), ordered from largest to smallest. The first PC of $X_i(0)$ is denoted as $\phi_i(0)$. Then, $m_i$ calculates the projection distance of each feature vector $x_k^i(0)$ from $\phi_i(0)$ (see figure 1):

$$d_p\left( x_k^i(0), \phi_i(0) \right) = \left\| \hat{x}_k^i(0) - \left( \hat{x}_k^i(0) \cdot \phi_i(0) \right) \phi_i(0) \right\| \qquad (5)$$

where $\hat{x}_k^i(0) = x_k^i(0) - \mu_i(0)$ is the $k$-th row of $\hat{X}_i(0)$.

The maximum projection distance of a feature vector $x_k^i(0)$ from $\phi_i(0)$ is calculated as:

$$d_i^{\max}(0) = \max_k \left\{ d_p\left( x_k^i(0), \phi_i(0) \right) \right\}, \quad k = 1, \ldots, n \qquad (6)$$

Finally, $m_i$ uses the triple $\left( \mu_i(0), \phi_i(0), d_i^{\max}(0) \right)$ to establish the normal profile.

B. Detection Phase
The detection phase involves identifying anomalous feature vectors. Let $X_i(t)$ be the matrix of feature vectors collected by the monitor node $m_i$ from its own network traffic at time window $t$. $m_i$ detects anomalous feature vectors based upon the normal profile established in the training phase. To do this, it first calculates the projection distance of each feature vector $x_k^i(t) \in X_i(t)$ from $\phi_i(t-1)$ and then classifies $x_k^i(t)$ as an anomaly if the calculated projection distance is greater than $d_i^{\max}(t-1)$:

$$x_k^i(t) \text{ is } \begin{cases} \text{anomalous} & \text{if } d_p\left( x_k^i(t), \phi_i(t-1) \right) > d_i^{\max}(t-1) \\ \text{normal} & \text{otherwise} \end{cases} \qquad (7)$$

Since the normal network behavior may change over time, the simple use of a predefined normal profile will not be efficient. Hence, it is necessary for every monitor node to update its normal profile. Let $t$ be the current time window and $X_i^N(t)$ be the set of normal feature vectors collected during the previous $w$ time windows (see figure 2):

$$X_i^N(t) = \bigcup_{\tau = t-w+1}^{t} N\left( X_i(\tau) \right) \qquad (8)$$

where $N\left( X_i(\tau) \right)$ is the set of feature vectors classified as normal at time window $\tau$.
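The training and detection steps of the centralized approach can be sketched with NumPy as follows. This is a hedged illustration under stated assumptions, not the report's implementation: the function names are hypothetical, the input is assumed to be already normalized to [0, 1], and the projection distance is taken to be the orthogonal distance from the line through the column mean along the first PC.

```python
import numpy as np


def projection_distances(vectors, mean, first_pc):
    """Distance of each row from the line through `mean` along `first_pc`."""
    centered = vectors - mean
    along = centered @ first_pc  # signed length of the projection on the first PC
    residual = centered - np.outer(along, first_pc)
    return np.linalg.norm(residual, axis=1)


def train_profile(normal_vectors):
    """Build the triple (column mean, first PC, maximum projection distance)."""
    mean = normal_vectors.mean(axis=0)
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(normal_vectors - mean, full_matrices=False)
    first_pc = vt[0]
    d_max = projection_distances(normal_vectors, mean, first_pc).max()
    return mean, first_pc, d_max


def classify(vectors, profile):
    """Return a boolean array: True where a vector is flagged as anomalous."""
    mean, first_pc, d_max = profile
    return projection_distances(vectors, mean, first_pc) > d_max


# Toy training data (an assumption), already scaled to [0, 1], close to a line.
training = np.array([[0.1, 0.1], [0.3, 0.32], [0.5, 0.48], [0.7, 0.71], [0.9, 0.9]])
profile = train_profile(training)
flags = classify(np.array([[0.4, 0.41], [0.9, 0.1]]), profile)
```

In this toy example the first test vector follows the trend of the training data, while the second lies far from the first PC and exceeds the maximum projection distance.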

To update the normal profile, $m_i$ first calculates the first PC $\phi_i(t)$ of $X_i^N(t)$, and then calculates the maximum projection distance of all normal feature vectors in $X_i^N(t)$ from $\phi_i(t)$:

$$d_i^{\max}(t) = \max_k \left\{ d_p\left( x_k, \phi_i(t) \right) \right\}, \quad x_k \in X_i^N(t), \quad k = 1, \ldots, \left| X_i^N(t) \right| \qquad (9)$$

Finally, $m_i$ uses the triple $\left( \mu_i(t), \phi_i(t), d_i^{\max}(t) \right)$ to update the normal profile, where $\mu_i(t)$ is the column mean of $X_i^N(t)$.
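A sliding-window update of the normal profile could look like the sketch below. Several details are assumptions made only for illustration: the class name `UpdatingProfile`, the use of a `deque` holding the normal vectors of the most recent `w` windows, the exclusion of the original training vectors from later updates, and the rule of skipping an update when fewer than two normal vectors are available.

```python
from collections import deque

import numpy as np


def distances(vectors, mean, first_pc):
    """Orthogonal distance of each row from the line along the first PC."""
    centered = vectors - mean
    along = centered @ first_pc
    return np.linalg.norm(centered - np.outer(along, first_pc), axis=1)


def build_profile(vectors):
    """Return (column mean, first PC, maximum projection distance)."""
    mean = vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(vectors - mean, full_matrices=False)
    first_pc = vt[0]
    return mean, first_pc, distances(vectors, mean, first_pc).max()


class UpdatingProfile:
    """Normal profile refreshed from the normal vectors of the last w windows."""

    def __init__(self, training_vectors, w):
        self.profile = build_profile(training_vectors)
        self.recent_normal = deque(maxlen=w)  # oldest window drops out automatically

    def process_window(self, vectors):
        """Classify one window with the previous profile, then update the profile."""
        mean, first_pc, d_max = self.profile
        anomalous = distances(vectors, mean, first_pc) > d_max
        self.recent_normal.append(vectors[~anomalous])
        pooled = np.vstack(self.recent_normal)
        if len(pooled) >= 2:  # at least two vectors are needed to define a direction
            self.profile = build_profile(pooled)
        return anomalous
```

Note that each window is classified against the profile from the previous window before the profile is refreshed, which mirrors the use of the previous window's first PC and threshold in the detection rule.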

Chapter 5
Distributed Approach
The resource limitations of sensor nodes, especially in terms of power supply, memory, and processing power, require a novel and cooperative approach for intrusion detection in WSNs. In this section, a distributed intrusion detection approach called PCADID is presented, which reduces the memory and energy consumption in the network by distributing the process of establishing and updating the normal profile among all monitor nodes. The approach consists of two phases: training and detection.

A. Training Phase
In the training phase, every monitor node $m_i \in G$ establishes a sub profile of its own normal network traffic and cooperates with other monitor nodes to compose a global normal profile. To do this, we divide the set of network traffic features $F$ into a number of subsets and assign each subset $F_i$ to a monitor node $m_i$:

$$F = \bigcup_{i} F_i \qquad (10)$$

Let $F_i$ be the subset of features assigned to $m_i$ and $X_i(0)$ be an $n \times d_i$ matrix of feature vectors collected by $m_i$ from its own normal network traffic, where $d_i = |F_i|$ is the number of features assigned to $m_i$. Each feature vector is comprised of a set of features:

$$x_k^i(0) = \left( x_{k1}^i(0), x_{k2}^i(0), \ldots, x_{k d_i}^i(0) \right) \qquad (11)$$

$m_i$ first normalizes $X_i(0)$ to the range [0, 1] and then uses (3)-(6) to establish a normal sub profile $SP_i(0)$ [12]:

$$SP_i(0) = \left\{ \mu_i(0), \phi_i(0), d_i^{\max}(0) \right\} \qquad (12)$$

where $\mu_i(0)$ and $\phi_i(0)$ are the column means and first PC of $X_i(0)$, and $d_i^{\max}(0)$ is the maximum projection distance of all feature vectors from $\phi_i(0)$. Then, $m_i$ sends $SP_i(0)$ to its one-hop neighbor monitor node. After a monitor node has received normal sub profiles from its neighbor monitor node, it sends them with its own normal sub profile to other nodes along a logical ring, as illustrated in fig. 3.

Finally, $m_i$ composes the global normal profile $GP(0)$ based upon all normal sub profiles:

$$GP(0) = \bigcup_{i} SP_i(0) = \bigcup_{i} \left\{ \mu_i(0), \phi_i(0), d_i^{\max}(0) \right\} \qquad (13)$$

B. Detection Phase
In the detection phase, during each time window $t$, every monitor node $m_i \in G$ collects an $n \times d$ matrix $X_i(t)$ of feature vectors from its own network traffic and detects anomalous feature vectors based upon the global normal profile established in the training phase. Each feature vector is divided into a number of sub-feature vectors:

$$x_k^i(t) = \bigcup_{j} x_k^{ij}(t) \qquad (14)$$

where $x_k^{ij}(t)$ is a sub-feature vector corresponding to the subset of features $F_j$. Then, $m_i$ calculates the projection distance of each sub-feature vector $x_k^{ij}(t) \in x_k^i(t)$ from $\phi_j(t-1) \in GP(t-1)$ and classifies $x_k^i(t)$ as an anomaly only if the calculated projection distance is greater than $d_j^{\max}(t-1)$.

Since the normal network behavior may change over time, a predefined global normal profile will not be sufficiently representative for future anomaly detection. Hence, at the end of each time window $t$, every monitor node $m_i \in G$ updates its sub profile and cooperates with other monitor nodes to update the global normal profile $GP(t)$.
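The composition of a global normal profile from per-node sub profiles, and its use in detection, can be sketched as below. The sketch rests on assumptions that are not stated in the text: a global profile is represented as a plain list of dictionaries, the ring-based exchange between monitor nodes is omitted, the toy data and feature indices are invented, and a feature vector is flagged as soon as any one of its sub-feature vectors exceeds the corresponding threshold.

```python
import numpy as np


def _distance(vectors, mean, first_pc):
    """Orthogonal distance of each row from the line along the first PC."""
    centered = np.atleast_2d(vectors) - mean
    along = centered @ first_pc
    return np.linalg.norm(centered - np.outer(along, first_pc), axis=1)


def sub_profile(normal_vectors, feature_idx):
    """Sub profile of one monitor node over its assigned feature subset.

    `normal_vectors` holds only the assigned features, one column each.
    """
    mean = normal_vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(normal_vectors - mean, full_matrices=False)
    first_pc = vt[0]
    d_max = _distance(normal_vectors, mean, first_pc).max()
    return {"features": list(feature_idx), "mean": mean,
            "first_pc": first_pc, "d_max": d_max}


def is_anomalous(vector, global_profile):
    """Assumed rule: flag the vector if ANY sub-feature vector exceeds its d_max."""
    for sp in global_profile:
        sub_vector = vector[sp["features"]]
        if _distance(sub_vector, sp["mean"], sp["first_pc"])[0] > sp["d_max"]:
            return True
    return False


# Two hypothetical monitor nodes, each trained on its own two features.
node_a = np.array([[0.1, 0.1], [0.3, 0.32], [0.5, 0.48], [0.7, 0.71], [0.9, 0.9]])
node_b = np.array([[0.1, 0.9], [0.3, 0.71], [0.5, 0.5], [0.7, 0.28], [0.9, 0.1]])
global_profile = [sub_profile(node_a, [0, 1]), sub_profile(node_b, [2, 3])]

normal_flag = is_anomalous(np.array([0.4, 0.41, 0.4, 0.6]), global_profile)
attack_flag = is_anomalous(np.array([0.4, 0.41, 0.9, 0.9]), global_profile)
```

Each node only ever runs PCA on its own, smaller feature subset, which is the source of the memory and energy savings described above.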

Chapter 6
Experiments
A. Simulation Environment
To evaluate the performance of PCACID and PCADID, some attacks on WSNs were simulated and tested. The simulation was based on the sensor network package from the Naval Research Laboratory [14], running on the NS-2 simulator [7]. The simulated WSN consisted of 25 sensor nodes, one base station, and one phenomenon node. The sensor network was deployed over a 500 (m) × 500 (m) field.

IEEE 802.11 was used as the MAC (Media Access Control) protocol and Ad hoc On-Demand Distance Vector (AODV) Routing as the routing protocol. IEEE 802.11 is a set of standards for implementing Wireless Local Area Network (WLAN) computer communication in the 2.4, 3.6, 5, and 60 GHz frequency bands. The MAC sub layer provides addressing and channel access control mechanisms that make it possible for several terminals or network nodes to communicate within a multiple access network that incorporates a shared medium, e.g. Ethernet. AODV is a routing protocol for mobile ad hoc networks (MANETs) and other wireless ad hoc networks.

The simulation package was extended to implement two types of attacks: the active sinkhole attack and the passive sinkhole attack. The sensor nodes were also partitioned into three groups, and in each group, two sensor nodes were selected as monitor nodes (see table I).

The next task is to identify suitable traffic features that are useful for detecting routing attacks, while attempting to have as few features as possible. This is because more features mean more computation time and more energy consumption for the sensor nodes. Hence, fourteen features were used as presented in [6] and [8], and they were divided into two subsets. The first subset consists of ten features and the second one consists of six features.

B. Simulated Attacks
The routing attacks that were simulated in the experiments are described below:

1) Active Sinkhole Attack
A malicious node attracts all network traffic from sensor nodes in a particular area towards itself. When the malicious node receives a broadcast RREQ packet for a route to the destination, it immediately sends a false RREP packet, which contains the maximum destination sequence number and minimum hop count. So, the neighboring nodes assume that the malicious node has the best route towards the destination. An RREP message can simply back-trace the way the RREQ message took and simultaneously allow all hosts it passes to record a complementary route back to where it came from.

2) Passive Sinkhole Attack
This attack is similar to the active sinkhole attack. The only difference is that instead of sending a false RREP packet, the malicious node starts the attack by broadcasting a false RREQ packet.

C. Experimental Results
The simulation time was set to 10,000 (s). The length of the training phase was set to 1000 (s) and the collected feature vectors were used to establish the normal profile. The length of each time interval was set to 5 (s) and the length of each time window was set to 250 (s). A malicious node launched the active sinkhole attack from 5000 (s) to 7000 (s) and the passive sinkhole attack from 3500 (s) to 6000 (s).

Cumulative Percent Variance (CPV) [15] is a measure of the percent variance captured by the first few PCs. It can be used to evaluate the importance of each PC.
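The timing parameters above imply some simple counts, worked out in the sketch below. It assumes one feature vector per monitor node per time interval, and detection windows that follow each other back to back starting right after the training phase; the helper `windows_overlapping` and the 0-based window numbering are illustrative choices, not part of the reported setup.

```python
SIMULATION_TIME_S = 10_000
TRAINING_TIME_S = 1_000
INTERVAL_S = 5
WINDOW_S = 250

vectors_per_window = WINDOW_S // INTERVAL_S        # feature vectors in one window
training_vectors = TRAINING_TIME_S // INTERVAL_S   # feature vectors in the training phase
detection_windows = (SIMULATION_TIME_S - TRAINING_TIME_S) // WINDOW_S


def windows_overlapping(start_s, end_s):
    """Detection windows (0-based, counted from the end of training)
    that overlap the half-open time span [start_s, end_s)."""
    first = (start_s - TRAINING_TIME_S) // WINDOW_S
    last = (end_s - 1 - TRAINING_TIME_S) // WINDOW_S
    return range(first, last + 1)


active_attack_windows = windows_overlapping(5_000, 7_000)
passive_attack_windows = windows_overlapping(3_500, 6_000)
```

Under these assumptions each monitor node gathers 50 feature vectors per window and 200 during training, and 36 detection windows follow the training phase.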

Fig. 4 shows the percent variance captured by each PC in PCACID. As shown in the figure, the first PC explains only 45.61 % of the total variance.
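The percent variance per PC, and the cumulative percent variance of the first k PCs, can be computed from the singular values of the column-centered data. The sketch below uses the usual definition, in which the variance along a PC is proportional to its squared singular value; the function names and the four-point toy data set are assumptions for illustration and are unrelated to the simulated traffic.

```python
import numpy as np


def percent_variance(data):
    """Percent of the total variance captured by each PC, largest first."""
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2  # proportional to the variance along each PC
    return 100.0 * variances / variances.sum()


def cumulative_percent_variance(data, k):
    """Percent of the total variance captured by the first k PCs."""
    return percent_variance(data)[:k].sum()


# Toy data: four times as much squared spread along x as along y.
toy = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
shares = percent_variance(toy)
```

For the toy data the first PC captures 80 % of the variance and the second the remaining 20 %.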

Figure 5 shows the percent variance captured by each PC in PCADID for the two subsets of traffic features.

As shown in Figure 5, the first PCs of the two subsets explain 80.96 % and 61.08 % of the total variance, respectively. This shows that the first PC in PCADID can express the scattering of data better than the first PC in PCACID and thus can establish a better profile of normal network traffic.

Routing attacks degrade the performance of a WSN by injecting false routes into the network. Fig. 6 shows the impact of the active sinkhole attack on the average end-to-end delay as the performance parameter. End-to-end delay refers to the time taken for a packet to be transmitted across a network from the source node to the destination node.

Two performance measures were mainly used: detection rate (DR) and false alarm rate (FAR). The detection rate is defined as the percentage of anomalous feature vectors that are successfully detected. The false alarm rate is defined as the percentage of normal feature vectors that are incorrectly detected as anomalies.
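The two measures follow directly from their definitions, as in this short sketch. The function name, the argument layout (two parallel lists of booleans), and the convention of returning 0.0 when a class is empty are assumptions made for illustration.

```python
def detection_and_false_alarm_rates(is_attack, flagged):
    """Return (DR, FAR) in percent.

    is_attack[i] is True if feature vector i is truly anomalous;
    flagged[i] is True if the detector classified it as an anomaly.
    """
    attacks = sum(1 for a in is_attack if a)
    normals = len(is_attack) - attacks
    detected = sum(1 for a, f in zip(is_attack, flagged) if a and f)
    false_alarms = sum(1 for a, f in zip(is_attack, flagged) if not a and f)
    dr = 100.0 * detected / attacks if attacks else 0.0
    far = 100.0 * false_alarms / normals if normals else 0.0
    return dr, far


# Toy labels: 4 anomalous vectors (3 detected) and 10 normal ones (1 false alarm).
truth = [True] * 4 + [False] * 10
alerts = [True, True, True, False] + [True] + [False] * 9
dr, far = detection_and_false_alarm_rates(truth, alerts)
```

A good detector pushes DR towards 100 % while keeping FAR close to 0 %.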

Fig. 7 compares the average detection and false alarm rates of PCACID and PCADID for different numbers of previous time windows, $w$ = 10, 8, 6, 4, and 2. As the figure shows, if the number of previous time windows is decreased when updating the normal profile, the false alarm rate increases. This shows that it is necessary to consider the previous normal feature vectors to keep the normal profile from being too sensitive to sudden changes in the network.

these rates are 77. PCACID and PCADID achieve a better performance when we keep the normal profile updated.43 %.63 % and 1.08 % and 1. Hence.78 % and 0. In the experiment.31 % respectively. In addition. these rates are 85. the average detection and false alarm rates for PCACID with updating are 80. As it can be seen in this table. RNSIT Page 17 .46 % respectively. respectively l while for PCADID without updating.39 % respectively. the average detection and false alarm rated for PCADID with updating are 91. the parameter are set to 8 Department of Information Science & Engineering. while for PCACID without updating.81 % and 0.PCA-based Distributed Approach for Intrusion Detection in WSNs Table II compares the performance of PCACID and PCADID with and without updating of the normal profile.

Chapter 7
Conclusion
In this paper, a PCA-based centralized approach, called PCACID, and a PCA-based distributed approach, called PCADID, for intrusion detection in WSNs were presented. A WSN is partitioned into groups of sensor nodes. In each group, some nodes are selected as monitor nodes. In PCACID, every monitor node independently establishes a profile of its own normal network traffic using PCA and uses it to detect anomalous network traffic. In PCADID, every monitor node establishes a sub profile of its own normal network traffic using PCA and sends it to other monitor nodes. Every monitor node composes a global normal profile based upon all received sub profiles and uses it to detect anomalous network traffic. In fact, PCADID reduces the memory and energy consumption in the network by distributing the process of establishing and updating the global normal profile among all monitor nodes.

WSN simulations were conducted using the NS-2 simulator, with scenarios for detecting two different types of sinkhole attacks. The simulation results showed that PCADID achieves a better performance than PCACID, while minimizing the memory and energy consumption in the network. Also, PCADID with updating performs significantly better than PCADID without updating in terms of detection and false alarm rates.

References
[1] J. Yick, B. Mukherjee, and D. Ghosal, "Wireless sensor network survey", Computer Networks, vol. 52, no. 12, pp. 2292-2330, August 2008.
[2] C. Y. Chong and S. P. Kumar, "Sensor networks: evolution, opportunities, and challenges", Proceedings of the IEEE, vol. 91, no. 8, pp. 1247-1256, August 2003.
[3] C. Karlof and D. Wagner, "Secure routing in wireless sensor networks: attacks and countermeasures", in Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications, AK, USA, October 2002, pp. 113-127.
[4] J. Undercoffer, S. Avancha, A. Joshi, and J. Pinkston, "Security for sensor networks", in Proceedings of the CADIP Research Symposium, Baltimore, MD, USA, October 2002.
[5] M. Ahmadi Livani and M. Abadi, "Distributed PCA-based anomaly detection in wireless sensor networks", in Proceedings of the 5th International Conference for Internet Technology and Secured Transactions, London, UK, November 2010.
[6] H. Nakayama, S. Kurosawa, A. Jamalipour, Y. Nemoto, and N. Kato, "Dynamic anomaly detection scheme for AODV-based mobile ad hoc networks", IEEE Transactions on Vehicular Technology, vol. 58, no. 5, pp. 2471-2481, June 2009.
[7] NS-2: The Network Simulator, http://www.isi.edu/nsnam/ns/
[8] C. Loo, M. Y. Ng, C. Leckie, and M. Palaniswami, "Intrusion detection for routing attacks in sensor networks", International Journal of Distributed Sensor Networks, vol. 2, no. 4, pp. 313-332, December 2006.
[9] W. Su, K. Chang, and Y. Kuo, "eHIP: an energy-efficient hybrid intrusion prohibition system for cluster-based wireless sensor networks", Computer Networks, vol. 51, no. 4, pp. 1151-1168, March 2007.
[10] G. Li, J. He, and Y. Fu, "Group-based intrusion detection system in wireless sensor networks", Computer Communications, vol. 31, no. 18, pp. 4324-4332, December 2008.
[11] T. Wang, Z. Liang, and C. Zhao, "A detection method for routing attacks of wireless sensor network based on fuzzy C-means clustering", in Proceedings of the 6th International Conference on Fuzzy Systems and Knowledge Discovery, Tianjin, China, August 2009, pp. 445-449.
[12] M. Ahmadi Livani and M. Abadi, "An energy-efficient anomaly detection approach for wireless sensor networks", in Proceedings of the 5th International Symposium on Telecommunication, Tehran, Iran, December 2010.
[13] G. H. Golub and C. F. Van Loan, Matrix Computations, Third Edition, Johns Hopkins University Press, 1996.
[14] I. Downard, "Simulating sensor networks in NS-2", Technical Report, Naval Research Laboratory, Washington DC, USA, May 2004.
[15] I. T. Jolliffe, Principal Component Analysis, Second Edition, New York: Springer-Verlag, 2002.