net/publication/337603875
3 authors:
Matthew Adigun
University of Zululand
All content following this page was uploaded by Skhumbuzo Zwane on 28 November 2019.
Abstract—Tactical Mobile Ad hoc Networks (TMANETs) have received much attention in recent years due to their ability to provide network connectivity to tactical mobile nodes on a battlefield without any infrastructure. Additionally, they provide decentralization, self-organization, robustness, and scalability to tactical nodes. Research in the tactical network domain indicates that network security remains a critical and continuous issue that needs attention. Researchers have identified the lack of a central management and control unit as one of the limitations in such networks. That is, there is no single management system by which the network and security services can be configured and provisioned. This paper proposes a flow-based Intrusion Detection System (IDS) framework for TMANETs. The proposed flow-based IDS framework leverages Machine Learning (ML) and software-defined networking (SDN) to achieve anomaly detection in TMANETs. The paper also discusses the envisaged deployment scenarios and the capabilities of the proposed system.
Keywords—Tactical MANET, Flow-based IDS, Machine learning, Software Defined Networks, Network Security
I. INTRODUCTION
Mobile ad hoc networks (MANETs) have been adopted in modern military communication systems due to their ability to support tactical field operations in different areas without infrastructure. MANETs are decentralized and have self-organizing capabilities, which allow them to be robust and scalable [1]. MANETs are usually deployed at the edge of the tactical network to provide connectivity to individual nodes active in tactical field operations. In that sense, a Tactical MANET (TMANET) is a military communication network supporting operations on the battlefield.
The rapid growth in technology renders security a critical subject in networks [2], [3], especially in TMANETs, because any security breach can have detrimental effects, such as soldiers being killed or manipulated into engaging in unnecessary gunfights. Over the years, researchers in the security domain have investigated how network intrusion detection systems can be effective in detecting and mitigating network security violations. The goal of intrusion detection is to monitor network assets or devices to detect anomalous behavior and misuse [4]. However, due to the network structure and operational environment, deploying security mechanisms to secure TMANETs is very challenging [5].
Modern military communication systems present network operators with many different and challenging sets of problems [6]. Network operators have to contend with the heavy reliance on wireless bearers, which often offer a fraction of the required capacity, the constant changes in quality of service (QoS) requirements, and the possible takedown, move, or reinstallation of the network at short notice. For example, [6] argued that most of the issues in the tactical network arise because there is no single management system by which network services can be configured and provisioned. The lack of a central management unit makes it difficult to manage, deploy, and redeploy information services, which limits the agility and flexibility of a military force [6]. Hence, the lack of central management limits not only force ability but also network security applications.
Moreover, modern military equipment is gradually being embedded with ubiquitous sensing and computing devices [7]. Such devices are capable of collecting operational context data and communication information that can be used for the task of intrusion detection. For example, flow data has become easy to acquire, with no significant resource requirements on the network devices. In this work, we propose an intrusion detection system that uses network flow data to detect anomalies.
The proposed flow-based intrusion detection system framework leverages Machine Learning (ML) and software-defined networking (SDN) for anomaly detection in TMANETs. For the proposed framework, we describe the architectural components, how SDN is leveraged, and the flow of events to accomplish intrusion detection tasks. The paper further discusses the envisaged deployment scenario with the advantages and drawbacks of the proposed system. Preliminary results indicate that flow-based detection achieves satisfactory results when compared to packet-based detection.
The rest of the paper is organized as follows. Section II presents the background and literature review, which focuses on the SDN paradigm, the adoption of SDN for tactical networks, and flow-based intrusion detection methods. Section III describes the proposed framework, providing an overview of the stages of the proposed method, the application scenario, and the benefits achieved. Section IV presents a performance study of machine learning methods. Finally, Section V concludes the paper and outlines future work.
II. BACKGROUND AND LITERATURE REVIEW
This section gives a general overview of SDN, SDN in tactical networks, flow-based IDS, and ML approaches used for intrusion detection.
A. Software Defined Networking (SDN)
In Software Defined Networks (SDN), the network intelligence is logically centralized in software-based controllers, known as the Control Plane (CP), and network
devices become simple packet forwarding devices, known as the Data Plane (DP), that can be programmed via an open interface such as OpenFlow [8]. The motivation for the development of SDN was the lack of flexibility in traditional networks. In traditional networks, the task of configuring or updating network devices is very challenging because the task is done manually. For example, configuring 100 network devices located in different places could take several days or weeks depending on the professionals available. Changes to the network cannot be applied quickly or accurately enough. Additionally, this approach is prone to errors. Another issue faced with traditional networks is the large number of different network device vendors/manufacturers, which makes it difficult to find the right professionals to scale up the infrastructure when needed.
A survey conducted by [6] presented some of the most common challenges experienced in tactical networks. Among these, the critical problems reported are the heavy reliance on wireless bearers, the critical dependency of the commander on timely access to information, and the moving, reinstalling, and taking down of the network at short notice. The authors proposed the adoption of SDN to address problems experienced in tactical networks. Other researchers have also proposed different frameworks and SDN-based architectures for tactical communication networks [1], [9]. The work of [9] presented a practical implementation of an SDN mobile ad hoc network. This work is one of a few studies in the wireless network community to demonstrate the advantages of SDN in device-to-device (D2D) data transmission and the flexibility introduced by centralized network management. Another recent study [1] proposed several novel architecture designs for SDN-enabled mobile ad hoc networks in the tactical field.
B. Intrusion Detection using Network Flow Data
Flow data have been used over the years in a number of applications, which include billing, network traffic analysis, network visibility, congestion control, and more recently intrusion detection [10]. The importance of flow data and its applications has resulted in major vendors offering built-in flow collection and export support in their network devices. Examples include sFlow, IPFIX, and Cisco’s NetFlow, which is the most popular.
In recent years, flow-based intrusion detection methods have attracted many researchers in the network security domain due to the advantages they offer over the traditional packet-based intrusion detection methods [10]. Key advantages of flow-based IDS (FIDS) include:
• The amount of data processed by FIDS is smaller, making it better suited for the protection of backbone links, where processing the total network traffic is computationally demanding.
• FIDS is an appropriate choice for intrusion detection where network applications use end-to-end encryption, because no packet payload scanning is required.
• FIDS raises fewer privacy concerns because user information is protected from any intermediate scans.
• Because modern network hardware offers built-in flow collection support, flow data for FIDS can be collected at multiple locations across the network without any additional cost.
Generally, FIDS has a near real-time response, low deployment cost, and the ability to operate on high-speed backbone network links [10]. Networks with a low energy budget, high confidentiality, and real-time security monitoring requirements can benefit from FIDS.
C. Machine Learning enabled IDS
The term Machine Learning (ML) was originally coined in 1959 by Arthur Lee Samuel. He defined ML as “a field of study that gives computers the ability to learn without being explicitly programmed”. ML is commonly used for classification and prediction problems based on known properties previously learned from training data. Although ML techniques have been around for years, the emergence of new computing technologies and the availability of data have allowed ML methods to be used more efficiently and in real time [11]. Recently, researchers in the intrusion detection field [3], [12]–[14] have been attracted to ML techniques for the task of intrusion detection. However, since ML is not a new field, there are many different ML algorithms that can be used for intrusion detection applications.
The most popular ML category for intrusion detection is classification. Usually, standalone, hybrid, and ensemble classifiers are used, depending on the requirements and resources available for the IDS. Different researchers in the field of intrusion detection have conducted performance analyses of machine learning methods for intrusion detection. A recent study [15] reported that ensemble learning methods performed better than single learning methods in terms of accuracy. This suggests that ensemble methods could help address the high false-positive rate in TMANETs reported by [5]. On the other hand, [15] also reported that single learning methods tend to be quicker when building and testing the model.
D. ML and SDN based IDS
SDN offers built-in information gathering, flexibility, programmability, and a global view of the network; thus, it is regarded as one of the best options for network data collection and analysis. Machine learning, for its part, has gained popularity in the network security domain [14], [16], [17] due to more network-enabled devices getting connected, malicious activities becoming stealthier, and the emergence of new technologies, such as SDN [3]. Over recent years, many techniques have been proposed that use SDN and machine learning for data collection, analysis, and traffic classification.
In [18], a simple architecture for data collection in both SDN networks and legacy networks using OpenFlow is proposed. The method is solely based on OpenFlow and can be implemented as an SDN application in the controller. The authors deployed a single OpenFlow switch in a non-SDN enterprise production network. An HP E3800 OpenFlow-enabled switch and an HP VAN SDN controller were used to run their traffic monitoring application. The authors argued that their setup was very lightweight and did not interfere with the normal flow of traffic in the monitored network. The extracted data were classified using ensemble machine learning methods, such as random forest (RF) and two variations of gradient boosting classifiers, namely Stochastic Gradient Boosting (SGB) and Extreme Gradient Boosting (EGB). Their initial results indicated that supervised learning algorithms can be used with their architecture and the collected data to achieve high accuracy levels.
The work in [19] proposed an SDN-based secure IoT
framework called SoftThings to detect abnormal behaviors
and attacks and to mitigate them as appropriate. The framework
used a Support Vector Machine (SVM) to classify traffic. The study
applied both linear and non-linear (RBF) kernels. The authors
conducted their experiments on the Mininet network emulator
and were able to achieve 98% precision in attack
detection. In [20], application-aware multipath flow
routing using machine learning in SDN is proposed. This
method uses the C4.5 decision tree algorithm on a 40-feature
dataset. A DDoS attack detection and mitigation system for SDN
is proposed in [21]. This method employs entropy-based
feature extraction and SVM for attack detection. The authors of [22]
proposed traffic classification deployed at the access point using
the C5.0 decision tree algorithm. In [23], eleven ML models were made
available in a library, allowing developers to
quickly build network security applications that can
perform real-time detection and response.
However, although integrating these two technologies
has been an active topic in recent years, little attention has
been paid to the gap between the initial work and practical
real-world deployments. In this work, OpenFlow and sFlow
are utilized to construct an intrusion detection system capable
of feedback control that optimizes performance and
automatically adapts the network to meet challenging
demands.
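Since the framework in the next section operates on flow records rather than individual packets, the following minimal Python sketch illustrates what a flow record is: packets sharing the same 5-tuple are aggregated into one record with packet and byte counters. The field names (`src`, `dst`, `sport`, `dport`, `proto`, `size`, `ts`), the `FlowRecord` class, and the sample packets are illustrative assumptions, not part of the proposed system or of any flow export standard.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """Hypothetical flow record: counters plus first/last timestamps."""
    packets: int = 0
    octets: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0


def aggregate_flows(packets):
    """Group packet headers into flow records keyed by the 5-tuple."""
    flows = {}
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        rec = flows.get(key)
        if rec is None:
            # First packet of a new flow: initialise both timestamps.
            rec = FlowRecord(first_seen=pkt["ts"], last_seen=pkt["ts"])
            flows[key] = rec
        rec.packets += 1
        rec.octets += pkt["size"]
        rec.last_seen = max(rec.last_seen, pkt["ts"])
    return flows


# Illustrative packet headers (made-up values).
sample_packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 5000, "dport": 80,
     "proto": "tcp", "size": 60, "ts": 0.0},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "sport": 5001, "dport": 53,
     "proto": "udp", "size": 80, "ts": 0.1},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 5000, "dport": 80,
     "proto": "tcp", "size": 1500, "ts": 0.2},
]

flows = aggregate_flows(sample_packets)
for key, rec in flows.items():
    print(key, rec)
```

Only header fields are consulted, which is why flow-based analysis remains applicable when payloads are encrypted. A real exporter would additionally expire idle flows and serialize the records in a format such as IPFIX or NetFlow.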
Fig. 1: Flow-based IDS model
III. PROPOSED FRAMEWORK
This section presents and describes the proposed flow-based intrusion detection system (FIDS) model, how it leverages SDN, and the flow of events.
A. Proposed FIDS model
The proposed model employs flow sampling techniques to acquire network flow data from an SDN-enabled network and uses ML to analyze the flow data for anomaly detection, as shown in Figure 1. The model is composed of four essential components/stages, namely Packet Observation, Flow Metering and Export, Data Collection, and Data Analysis.
1) Flow Metering and Export
In this stage, packets are aggregated into flows and flow records are exported. Packet aggregation is performed through a metering process, which is based on Information Elements that define the layout of a flow. Information Elements are fields that can be exported in flow records. After the metering process, flow record sampling and filtering functions are performed. In contrast with the packet sampling and filtering performed in the packet observation stage, flow sampling and filtering work on flow records instead of packets. Flow records are packaged into a specific message format depending on the protocol used, for example, the IPFIX format or the NetFlow format. After the message is constructed, it is exported to the flow collector. The most widely implemented and deployed transport protocol for exporting flows is the User Datagram Protocol (UDP) [24].
2) Data Collection and Preparation
Flow records are exported to the flow collector, which receives, stores, and pre-processes flow data from one or more flow exporters in the network. The flow collector also conducts feature extraction, which includes the task of picking the optimal features that will be used by the model to successfully classify the records. The collector then exports the data for storage and passes it on to the data pre-processing module. Data pre-processing is the process of converting flow records into a specific format that is acceptable to the detection algorithm used. This phase can include data cleaning, fixing missing values, data encoding, and normalization. In this component, all the features of each flow record from the data collector are encoded and scaled. This allows the data analysis to be consistent while using less processing power.
3) Data Analysis
In this stage, the results of all the previous stages come together. In the data analysis stage, different data analysis methods can be applied, for example, flow analysis and reporting, threat detection, and performance monitoring [24]. Our proposed framework performs threat detection by employing machine learning to model network behavior and detect anomalies. In order to use machine learning for intrusion detection, a machine learning model is required. In the proposed framework, a machine learning classifier is built using the pre-processed data. The ML decision engine uses the constructed machine learning classifier to classify new instances as malicious or normal. If an instance is classified as malicious then an alert is generated, else the instance is
dropped. In addition, the decision engine logs all the instances and decisions in a log file. Such log files can be exported to a log management and analysis tool to further analyze and visualize the generated alerts.
B. Leveraging SDN architecture
SDN plays a significant role in the proposed machine learning intrusion detection framework described above. This subsection presents an overview of how our method leverages SDN.
• Data Plane (Packet Observation, Flow Metering and Export Stage): All the network devices in the data plane are embedded with collector agents, as shown in Figure 1. The agents sample and send flow records to the centralized collector. The devices are configured to collect specific flow metrics and export them to the collector. Built-in flow collection and export support is already offered by major vendors, such as Cisco.
• Control Plane (Data Collection and Preparation Stage): The data collector residing in the CP module collects network flow records. It then filters the data and conducts feature extraction. Thus, the collector generates the different datasets required by the adopted ML technique. Data sources are all network devices capable of communicating with the OpenFlow controller.
• Application Plane (Data Analysis Stage): The machine learning model is constructed and implemented as an SDN application. Different ML methods or algorithms can be applied for different purposes as SDN applications using different datasets generated by the flow collector. Different applications powered by ML models can be constructed to influence the functioning of the network. Examples include incident handling applications, such as rule or policy enforcement, and path selection applications. In our case, an ML model is built and used as an SDN intrusion detection application.
It is envisaged that the application will be capable of classifying network flows as malicious or normal. The discussed process is depicted in Figure 2.
C. Flow of Events
The proposed flow-based intrusion detection system can be divided into three modules: the flow sampling module, the data collector module, and the ML IDS application. The flow of events in each of these components is depicted in Figure 3.
1) The Sampling agents
The sampling agents wait for packets to be transmitted and collect flow information. Each agent verifies whether the packets meet specified criteria, which is important for filtering out network control messages, as depicted in Figure 3. If the packets meet the specified criteria, filtering is conducted by specifying a threshold. The threshold specifies and manages flows by inspecting the packet header. Flows are then packaged and sent to the collector.
2) The Data Collector
The collector module is responsible for collecting the data from the different sampling agents embedded in each network device. Its task is to collect the flow data, perform feature extraction, and then pass the data with the appropriate features or attributes to the SDN application for data pre-processing and cleaning.
3) The IDS Application
The IDS application periodically queries or retrieves flow records from the collector. It waits with a timeout and repeats the process. For each new flow record, pre-processing methods are applied. After the flow record has been converted into an input format acceptable to the ML model, the model is used to classify it as normal or malicious. If the record is normal, the IDS application moves on to the next flow record retrieved. However, if the record is malicious, the application takes a snapshot of the flow record’s data and inserts the information in a log file. Logging detected malicious incidents could then be helpful for visualization and security incident handling.
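The flow of events of the IDS application can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the protocol encoding table, the scaling bounds, the record fields, and in particular `placeholder_model` (a fixed threshold standing in for a trained ML classifier) are all hypothetical.

```python
import json

# Hypothetical integer encoding for the categorical protocol field.
PROTO_CODES = {"tcp": 0, "udp": 1, "icmp": 2}


def preprocess(record, max_packets=1000.0, max_octets=1_000_000.0):
    """Encode categorical fields and scale numeric fields into [0, 1]."""
    return [
        PROTO_CODES.get(record["proto"], len(PROTO_CODES)),
        min(record["packets"] / max_packets, 1.0),
        min(record["octets"] / max_octets, 1.0),
    ]


def placeholder_model(features):
    """Stand-in for a trained classifier (assumption): flags flows whose
    scaled packet count exceeds an arbitrary threshold."""
    return "malicious" if features[1] > 0.5 else "normal"


def run_ids_cycle(fetch_records, classify, log):
    """One polling cycle: retrieve, pre-process, classify, log malicious flows."""
    alerts = 0
    for record in fetch_records():
        label = classify(preprocess(record))
        if label == "malicious":
            # Snapshot the flow record so it can be inspected later.
            log.append(json.dumps({"flow": record, "label": label}))
            alerts += 1
    return alerts


# Illustrative flow records as they might be returned by a collector.
pending = [
    {"proto": "tcp", "packets": 12, "octets": 9000},
    {"proto": "udp", "packets": 900, "octets": 54000},
]
log_entries = []
alerts = run_ids_cycle(lambda: pending, placeholder_model, log_entries)
print(alerts, log_entries)
```

In a deployment, `run_ids_cycle` would be called repeatedly with a wait between polls, `fetch_records` would query the flow collector, and `log` would be a file or a log management tool rather than an in-memory list.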