September 2013 Gondar, Ethiopia
SCIENCE
By:-SURAFEAL ENGDAW
UNIVERSITY OF GONDAR
I declare that the thesis is my original work and has not been presented for a
degree in any other university.
_________________________________________
Date
________________________________________
Advisor
Contents
CHAPTER ONE
INTRODUCTION
1.1. BACKGROUND OF THE STUDY
1.1.2. How Network IDS works
1.1.3. Data Mining Tasks for Intrusion Detection
1.1.1: Background Information about UOG and the network
1.2. STATEMENT OF THE PROBLEM
1.3. OBJECTIVE OF THE STUDY
1.3.1. General objective
1.3.2. Specific objective
1.4. SCOPE AND LIMITATION OF THE STUDY
METHODOLOGY OF THE STUDY
1.5.1. Study design
1.5. SIGNIFICANCE OF THE STUDY
CHAPTER TWO
LITERATURE REVIEW
2.1: Intrusion detection
2.2: Data mining models
2.2.1: The knowledge discovery model (KDD)
2.2.2: The CRISP-DM Process
2.2.3: Hybrid Data mining Model
2.3: Data Mining Tasks
2.3.1. Clustering
2.3.2. Classification
2.3.4: Association Rule
2.4: Feature selection and intrusion detection
Chapter Three
Intrusion detection system framework
3.1: Tasks to be performed
3.2: Structure and architecture of intrusion detection systems
3.3.: University of Gondar (UoG) Network Architecture
3.3.1. Network Service Descriptions
3.3.2: Design Considerations
3.3.3: LAN Architecture
3.3.4: General Server Farm Network Design
3.3.5: WAN Block Network Design
3.3.6: Internet Block Network Design
3.3.7: ACS - Access control server
3.3.8: CS-MARS
3.3.9: Cisco Security Manager
CHAPTER FOUR
4.1 DATASET PREPARATION
4.2 Problem Domain Understanding
4.3 Data Understanding
4.4 Attributes selection for knowledge discovery
4.5 Evaluation Metrics
4.5.2. Performance Measure
4.5.2.1. Error Rate
4.5.2.2. Accuracy
4.5.2.3 Detection Accuracy
4.5.2.4. False Positive rate
4.5.2.5. Precision and Recall
CHAPTER FIVE
EXPERIMENTATION
5.2. J48 decision tree modeling
Experimentation I
5.2.2.2. Experimentation II
5.2.3. Naive Bayes modeling using all features
5.1. Experimentation III:
5.2. Experimentation IV:
5.2.4. Naive Bayes modeling with feature selection
Experimentation V:
5.2.4.2. Experimentation VI:
5.2.5. Comparison of J48 decision tree and Naive Bayes model
Chapter six
Conclusion and Recommendation
6.1 Conclusion
6.2. Recommendations
REFERENCES
Appendix
DURATION
BUDGET
List of Figures
Figure 1 The sophistication of hackers' tools over time
Figure 2 The KDD process
Figure 3 The CRISP data mining model
Figure 4 Categories of prediction as well as description
Figure 5 Data mining operations and techniques
Figure 6 Four key steps of feature selection
Figure 7 Intrusion detection system activities
Figure 8 IDS components for data collection
Figure 9 Campus LAN Design
Figure 10 Logical Server Farm
Figure 11 WAN Block Network Design
Figure 12 Internet Firewalls
Figure 13 CS-MARS Attack Path
Figure 14 CSM Security Topology
Figure 15 Comparison of Accuracy of the J48 and Naïve Bayes Algorithms
Figure 16 TP graph
Figure 17 FP graph
List of Tables
Table 1 CRISP-DM phases and tasks
Table 2 Distribution of the dataset
Table 3 Summary of the distribution of attacks
Table 4 The 5X5 cost matrix used for the KDD 1999 winner result
Table 5 Standard metrics for evaluations of Intrusions (attacks)
Table 6 J48 algorithm parameters and their default values
Table 7 Detailed Accuracy by Class using Supervised J48 algorithm parameters with their default values - 10 fold cross validation
Table 8 Detailed Accuracy by Class using Supervised J48 algorithm parameters with Other confidence factor - pruned with cf 0.2
Table 9 Confusion Matrix using pruned J48 algorithm parameters
Table 10 Detailed Accuracy by Class using J48 algorithm parameters with percentage-split set to 70%
Table 11 Detailed Accuracy by Class using J48 algorithm parameters with percentage-split set to 75%
Table 12 Confusion Matrix using J48 algorithm parameters with percentage-split set to 75%
Table 13 Detailed Accuracy by Class using Naïve Bayes classifier with 10 fold cross validation
Table 14 Classification accuracy of the model
Table 15 Confusion Matrix using Naïve Bayes classifiers with 10 fold cross validation
Table 16 Detailed Accuracy by Class using Naïve Bayes classifier with Feature selection
Table 17 Detailed Accuracy by Class using Naïve Bayes classifier with Feature selection with 25%-75% percentage split
Table 18 Confusion Matrix using Naïve Bayes classifiers with Feature selection
Table 19 Comparison of the confusion matrix result for J48 and Naïve Bayes Algorithms
Table 20 Algorithm for TP and FP
List of Acronyms
DOS: Denial of Service
FP: False Positive
IP: Intrusion Prevention
Dedication
This thesis is dedicated to the memory of Ato Taddesse Engdaw and Ato Dagnew
Engdaw, who passed away on July 8/2004 E.C in a car accident. My brothers always
look over us and shine on us from heaven. You are always with all of us.
Acknowledgment
Above all, I would like to glorify the almighty GOD for giving me the ability to
be where I am. Secondly, I am very grateful to my advisor, Dr. Million Meshesha,
for his constructive comments and overall guidance. His encouragement and
support from the initial to the final level enabled me to develop and
understand the subject.
I would like to thank Ato Tigabu Dagne Akal for his special contribution to
the success of my study.
My special thanks go to my wife for the constant assistance and
encouragement she has rendered to me since the time of my admission to the
postgraduate program.
Lastly, I offer my regards and blessings to my family and to all those who
supported me in any respect during the completion of the research.
Abstract
Intrusion detection is an important technology in the business sector as well as
an active area of research. It is also an important tool for information
security. A Network Intrusion Detection System is used to monitor networks for
attacks or intrusions. Network intrusion detection systems have become a
standard component in security infrastructures. Unfortunately, current systems
are poor at detecting novel attacks without an unacceptable level of false alarms.
In this study the researcher used the available intrusion detection data sets
from the University of Gondar data center. The researcher took 7345 records,
which are labeled as Normal, DOS, U2R, R2L and Probe. For supervised
modeling, 6461 records were taken. To build a predictive model for
intrusion detection, the J48 decision tree and the Naïve Bayes algorithms were
tested as classification approaches, with and without feature selection.
The model created using the J48 decision tree algorithm with the default
parameter values and 10-fold cross validation showed the best classification
accuracy, 94.40% on the training datasets, in classifying new instances into
the normal, DOS, U2R, R2L and probe classes. The findings of this study show
that data mining methods generate interesting rules that are crucial for
intrusion detection in the networking industry. Future research directions are
put forward for developing an applicable system in the area of the study.
CHAPTER ONE
INTRODUCTION
For several years now, society has been dependent on information technology
(IT). With the rise of the Internet and e-commerce, this is truer now
than ever [1]. People rely on computer networks to provide them with news,
stock prices, e-mail and online shopping. People's credit card details, medical
records and other personal information are stored on computer systems.
Many companies have a web presence as an essential part of their business.
The research community uses computer systems to undertake research and to
disseminate findings. Computers control national infrastructure components
such as the power grid. With the rise of high-speed Internet access, the
integrity and availability of all these systems are becoming vulnerable to
potential cyber attacks, such as network intrusions. Amateur hackers, rival
corporations, terrorists and even foreign governments have the motive and
capability to carry out sophisticated attacks against computer systems [1].
(middleware) applications that manage the information system. Attacks that
come from these external origins are called outsider attacks. Insider attacks
involve unauthorized internal users attempting to gain and misuse
non-authorized access privileges.
services, floods and brute force attacks [4]. An intrusion prevention system
(IPS) is a software or hardware device that has all the capabilities of an IDS
and can also attempt to stop possible incidents.
Data mining (DM) based methods are one paradigm for building IDSs. According
to Lee et al. [6], DM generally refers to the process of extracting useful
patterns and knowledge from large stores of data. The recent rapid development
of DM has made it possible to apply a wide variety of descriptive and
predictive DM algorithms to network intrusion detection problems.
1.1.2. How Network IDS works
In theory, an intrusion detection system looks like a defense tool that every
e-organization needs. However, organizations face some challenges when
deploying an intrusion detection system [9].
IDS technology itself is undergoing a lot of enhancements. It is therefore
very important for organizations to clearly define their expectations of
the IDS implementation. IDS technology has not reached a level where it
requires no human intervention. Of course, today's IDS technology
offers some automation, such as notifying the administrator when
malicious activity is detected, shunning the malicious connection for a
configurable period of time, and dynamically modifying a router's access
control list in order to stop a malicious connection. But it is still very
important to monitor the IDS logs regularly to stay on top of events as
they occur. Monitoring the logs on a daily basis is required to
analyze the kinds of malicious activity detected by the IDS over a period of
time. Today's IDS has not yet reached the level where it can give a historical
analysis of the intrusions detected over a period of time; this is still a
manual activity. It is therefore important for an organization to have a
well-defined incident handling and response plan in case an intrusion is
detected and reported by the IDS. The organization should also have
skilled security personnel to handle this kind of scenario.
IDS technology is still reactive rather than proactive [10]. It works on
attack signatures, which are the patterns of previous attacks. The signature
database needs to be updated whenever a new kind of attack is detected and
a fix for it is available. The frequency of signature updates varies from
vendor to vendor.
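The reactive nature of signature matching can be illustrated with a minimal Python sketch. It is not the matching engine of any particular IDS product: the signature names, the byte patterns and the helper `match_signatures` are hypothetical and chosen only for illustration.

```python
# Hypothetical signature database: signature name -> byte pattern taken
# from a previously observed attack (illustrative values only).
SIGNATURES = {
    "directory-traversal": b"../..",
    "sql-injection": b"' OR '1'='1",
}


def match_signatures(payload, signatures):
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


if __name__ == "__main__":
    known = b"GET /../../etc/passwd HTTP/1.1"
    novel = b"GET /cgi-bin/new-exploit HTTP/1.1"

    print(match_signatures(known, SIGNATURES))   # a known pattern is reported
    print(match_signatures(novel, SIGNATURES))   # a novel attack passes unnoticed

    # Only after the signature database is updated is the new attack detected.
    updated = dict(SIGNATURES, **{"new-exploit": b"/cgi-bin/new-exploit"})
    print(match_signatures(novel, updated))
```

The second call returns an empty list, which is the weakness described above: an attack without a stored signature is not reported until the database is updated.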
When deploying a network-based IDS solution, it is important to keep in
mind one aspect of network-based IDS in a switched environment. In a
hub-based network, a host on one port can see traffic in and out of every
other port on the hub. In a switched network, however, traffic in and out
of one port cannot be seen by a host on another port, because they are in
different collision domains. A network-based IDS sensor needs to see
traffic in and out of a port to detect any malicious traffic. In a
switched environment, port mirroring or spanning is required to achieve
this. One entire VLAN can be spanned to the port on which the
network-based IDS sensor is installed.
Classification maps a data item into one of several pre-defined categories [11].
These algorithms normally output "classifiers", for example, in the form of
decision trees or rules. An ideal application in intrusion detection is to gather
sufficient "normal" and "abnormal" audit data for a user or a program, then
apply a classification algorithm to learn a classifier that labels (future)
audit data as belonging to the normal class or the abnormal class [11]. Many
types of classifiers are available, such as decision trees, Bayesian networks,
neural networks and rule induction. The basic aim of a classifier is to predict
the appropriate class.
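As a rough illustration of learning a classifier from labeled audit data, the sketch below implements a small categorical Naive Bayes classifier with Laplace smoothing in plain Python. It is only a sketch under stated assumptions: the two features (protocol and connection flag), the toy records and the function names are hypothetical, and it is not the WEKA implementation used in the experiments.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count classes and per-class feature values in labeled audit records."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def classify(record, model):
    """Return the class with the highest (log) posterior for one record."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)               # class prior
        for i, value in enumerate(record):
            # Laplace smoothing avoids zero probabilities for unseen pairs.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy audit records (protocol, connection flag); values are illustrative only.
RECORDS = [("tcp", "SF"), ("udp", "SF"), ("tcp", "SF"),
           ("tcp", "S0"), ("tcp", "S0"), ("icmp", "S0")]
LABELS = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]

if __name__ == "__main__":
    model = train_naive_bayes(RECORDS, LABELS)
    print(classify(("tcp", "S0"), model))
    print(classify(("udp", "SF"), model))
```

The same train-then-classify pattern applies to decision trees or rule learners; only the form of the learned model changes.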
Clustering
Clustering is a data mining technique where data points are clustered together
based on their feature values and a similarity metric. There are many clustering
methods available, and each of them may give a different grouping of a dataset
[12]. The choice of a particular method will depend on the type of output
desired. In general, clustering methods may be divided into two categories
based on the cluster structure which they produce.
The non-hierarchical methods divide a dataset of N objects into M clusters,
with or without overlap. These methods are sometimes divided into
partitioning methods, in which the classes are mutually exclusive, and the less
common clumping methods, in which overlap is allowed. Each object is a
member of the cluster to which it is most similar; however, the threshold of
similarity has to be defined.
The hierarchical methods produce a set of nested clusters in which each pair
of objects or clusters is progressively nested in a larger cluster until only one
cluster remains. The hierarchical methods can be further divided into
agglomerative and divisive methods. In agglomerative methods, the hierarchy is
built up in a series of N-1 agglomerations, or fusions, of pairs of objects,
beginning with the un-clustered dataset. The less common divisive methods
begin with all objects in a single cluster and at each of N-1 steps divide one
cluster into two smaller clusters, until each object resides in its own cluster.
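The agglomerative procedure can be sketched in a few lines of Python. This is a minimal illustration that assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest members); the text above does not prescribe a particular similarity measure, and the sample points are hypothetical.

```python
import math


def single_linkage(cluster_a, cluster_b):
    """Distance between two clusters = distance of their closest members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerate(points):
    """Fuse the two closest clusters N-1 times and return the merge history."""
    clusters = [[p] for p in points]      # start from the un-clustered dataset
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((a, b) for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        merges.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
    for step, merged in enumerate(agglomerate(sample), start=1):
        print(step, merged)
```

For N points the loop runs exactly N-1 times, and the last merge contains every point, matching the nested structure described above. A divisive method would run the same hierarchy in the opposite direction.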
The University of Gondar strategic vision for its ICT services is [94]: "To
provide high quality, efficient and effective knowledge management and
information systems that support the University's scholars, students, lecturers
and administrators."
The University has undergone changes since its foundation to enhance the
service it delivers to society. To cope with the growing needs of the
students and society, and possibly to lead the growth, it has carried out an
extensive study of its processes and has re-engineered parts of its work
processes. It is clear that information technology plays the biggest role in
implementing the re-engineered business processes. The University
worked mostly on the network infrastructure in the previous fiscal year
and has completed the interconnection of all the campuses and the offices in
them. The data centre is almost in its final phase of setup and configuration
[94]. The Information and Communication Technology (ICT) Strategy sets out
the technical direction for technology-based activities and services at the
University of Gondar. The ICT Strategy embodies the principles and priorities
contained in the University Strategic Plan. The overriding aim of the ICT
Strategy is to provide a central resource for all and a platform onto which
faculties can build their specialized hardware and software. The University
recognizes the need to balance academics' specialized computing needs within
their faculties with the provision of a cost-effective, efficient and reliable ICT
infrastructure and centralized administration and management information
systems [94].
Given the level and nature of modern network security threats, the question
for security professionals should not be whether to use intrusion detection,
but which intrusion detection features and capabilities to use. IDSs have
gained acceptance as a necessary addition to every organization's security
infrastructure. In recent years, DM-based IDSs have demonstrated high
accuracy, good generalization to novel types of intrusion, and robust behavior
in a changing environment [13]. Still, significant challenges exist in the
design and implementation of production-quality IDSs [13].
using controls such as routers, firewalls, public key infrastructures, virtual
private networks, and virus scanners.
Towards this end, the present study intends to answer the following
research questions.
Which data mining algorithm is more suitable for constructing a network
intrusion detection model?
To what degree can the NIDS correctly classify network traffic as either
malicious or normal?
How can useful information be provided about intrusions that do take
place, allowing improved diagnosis, recovery, and correction of
causative factors?
This study attempts to apply data mining to the analysis of network data. A
classification model is constructed for network intrusion detection at the
University of Gondar. Different classification algorithms are used for mining
and for creating a model that predicts the type of network intrusion.
Because the datasets were taken from the University of Gondar data center, this
research did not include data from other network security organizations in
Ethiopia. Further research therefore needs to be conducted that includes data
from these organizations. Because of time and financial limits, this research
also focused mainly on how to detect attacks effectively, not on how to prevent
them. The IDS model constructed in this thesis only notifies the administrators
after detecting an attack, and the administrators have to take proper action
manually.
1.5.5. Modeling:
In this step, various data mining methods are used to extract patterns and
hidden knowledge for constructing a predictive model.
The study used the WEKA data mining tool for mining the data. The reason for
using WEKA is that it is powerful, user-friendly and freely available
for noncommercial purposes [17].
For performance evaluation, the full testing dataset taken from the University of
Gondar is passed through the developed model to detect intrusions and to find the
detection error rate, precision, recall, false positive rate, average
misclassification cost and accuracy of the detection models. For comparison of
models, misclassification cost, false positive rate and detection accuracy are used
as the major performance measures.
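The measures listed above follow directly from the counts in a confusion matrix. The Python sketch below shows the standard definitions for the two-class case (attack versus normal) and an average misclassification cost for a general cost matrix. The function names, the example counts and the cost values are hypothetical and are not results from this study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": tp / (tp + fp),            # flagged records that are attacks
        "recall": tp / (tp + fn),               # attacks that were flagged
        "false_positive_rate": fp / (fp + tn),  # normal records flagged as attacks
    }


def average_cost(confusion, cost):
    """Average misclassification cost: sum of count * cost over all cells / N.

    confusion[i][j] is the number of records of actual class i predicted as
    class j; cost[i][j] is the penalty assigned to that outcome.
    """
    total = sum(sum(row) for row in confusion)
    weighted = sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return weighted / total


if __name__ == "__main__":
    # Illustrative counts only: rows are actual (normal, attack),
    # columns are predicted (normal, attack).
    print(binary_metrics(tp=90, fp=10, tn=880, fn=20))
    print(average_cost([[880, 10], [20, 90]], [[0, 1], [2, 0]]))
```

In this example a missed attack is given twice the cost of a false alarm, which is one way a cost matrix can express that the two kinds of error are not equally serious.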
This research will enhance the effectiveness and efficiency of the Network
Intrusion Detection System at the University of Gondar by proposing efficient
model construction techniques.
This thesis is structured into six chapters. The first chapter discusses the
background to the study problem area, the statement of the problem, the
objective of the study, the scope of the study, the research methodology and the
application of the study.
The third chapter deals with the physical and logical design of intrusion
detection.
The fourth chapter presents the tasks performed towards understanding the data:
data cleaning, data transformation, feature selection and evaluation metrics.
CHAPTER TWO
LITERATURE REVIEW
[22] [23] [24] [25] [26] but the growth of intrusion detection has been
such that many new projects have appeared in the meantime. The data fusion
and correlation capabilities of commercial intrusion detection systems span
a wide range. A few products are specifically designed to do centralized
alarm collection and correlation, for example RealSecure SiteProtector, which
claims to do "advanced data correlation and analysis" by interoperating with
the other products in ISS's RealSecure line [27]. Some products, such as
Symantec ManHunt and nSecure nPatrol, integrate the means to collect
alarms and the ability to apply multiple statistical measures to the data that
they collect directly into the IDS itself [28][29].
Most IDSs, such as the Cisco IDS or Network Flight Recorder (NFR), provide
the means to do centralized sensor configuration and alarm collection [30]
[31]. NFR provides the notion of "central stations" for this task, although
Singh and Kandula [31] note that it was developed "as an after-thought," as
each sensor was only designed to interoperate with a single Intrusion
Detection Appliance, and that NFR doesn't support distributed pattern
matching.
The problem with all of these IDSs is that they are designed more for
prioritizing what conventional intrusion (misuse) detection systems already
detect than for finding new threats. Other products, such as Computer
Associates' eTrust Intrusion Detection Log View and Net Secure Log, are more
focused on capturing log information to a database and doing basic analysis on
it. Such an approach seems to be oriented more towards ensuring the integrity
of the audit trail (itself an important activity in an enterprise environment)
than towards data correlation and analysis [32] [33]. Though public awareness
of the whole area of "intrusion detection" seems to be more recent, it is
certainly not a new area of inquiry. In fact, it has been an area of concern for
most of the history of what we know as "modern" computers. There have been a
number of important milestones in the brief history of Intrusion Detection
Systems.
1960s: The emergence of time-sharing systems demonstrated the
need to control access to computer resources [33].
1970s: The DOD Ware Report pointed out the need for computer
security.
1970s (mid to late): A number of systems were designed and
implemented using security kernel architectures [34].
1980: Anderson [35] first proposed that audit trails should be used to
monitor threats. The importance of such data had not been
comprehended at that time, and all the available system security
procedures were focused on denying access to sensitive data from an
unauthorized source.
1983: The Department of Defense Trusted Computer System
Evaluation Criteria (the "Orange Book") was published and provided a
set of criteria for evaluating computer security control effectiveness
[36].
1987: Denning [19] presented an abstract model of an Intrusion
Detection Expert System (IDES). Her paper was the first to propose the
concept of intrusion detection as a solution to the problem of providing
a sense of security in computer systems.
1988: The Internet Worm program of 1988, which infected thousands
of machines and disrupted normal activities for several days, was
detected primarily through manual means. [37] refined the intrusion
detection model proposed by Denning and created the IDES prototype
system. This system was designed to detect intrusion attempts with
adaptation to gradual changes in behavior to minimize false alarms.
[38] developed MIDAS (Multics Intrusion Detection and Alerting System)
to monitor the National Computer Security Center Dockmaster system. In
order to assist Air Force Security Officers in detecting misuse of the
mainframes used at Air Force Bases, [39] developed the Haystack system.
1989: Wisdom and Sense from the Los Alamos National Laboratory,
and Information Security Officer's Assistant (ISOA) from Planning
Research Corporation [40].
1990: A new concept was introduced in 1990 with NSM (Network
Security Monitor, now called Network Intrusion Detector or NID):
instead of examining the audit trails of a host computer system,
suspicious behavior was detected by passively monitoring the network
traffic in a LAN, Heberlein
1991: A different idea was introduced with NADIR (Network Anomaly
Detection and Intrusion Reporter) and DIDS (Distributed Intrusion
Detection System): the audit data from multiple hosts were collected
and aggregated in order to detect coordinated attacks against a set of
hosts [41].
1994: [42] suggested the use of autonomous agents
in order to improve the scalability, maintainability, efficiency and fault
tolerance of an IDS. This idea fit well with the ongoing research on
software agents in other areas of computer science.
1995: An improved version of IDES was developed in 1995, called
NIDES (Next-generation Intrusion Detection Expert System) [43].
1996: The design and implementation of GrIDS addressed the
scalability deficiencies in most contemporary intrusion detection
systems. This system facilitates the detection of large-scale automated
or coordinated attacks, which may even span multiple administrative
domains [44].
1998: [45] offered an innovative approach to
intrusion detection by incorporating information retrieval techniques
into intrusion detection tools.
Intrusion is defined by [46] as "any set of actions that attempts to
compromise the integrity, confidentiality or availability of a resource". He
also notes that an intrusion is "the act of a person or proxy attempting to
break into or misuse one's system in violation of an established policy." [47]
noted that an intrusion threat is the potential possibility of a deliberate
unauthorized attempt to access information, manipulate information, or render a
system unreliable or unusable. With this perspective, Sundaram also noted
that there are different aspects to an intrusion, each of which is
significant to a full analysis and response. These aspects include:
Risk: Accidental or unpredictable exposure of information, or violation of
operations integrity due to the malfunction of hardware or
incomplete or incorrect software design.
Vulnerability: A known or suspected flaw in the hardware or software or
operation of a system that exposes the system to penetration or its
information to accidental disclosure.
Attack: A specific formulation or execution of a plan to carry
out a threat.
Penetration: A successful attack, that is, the ability to obtain
unauthorized (undetected) access to files and programs or the control
state of a computer system.
Intrusions can be classified into two major classifications. [46] categorized
intrusions into the following classes:
Misuse intrusions are well-defined attacks against known system
vulnerabilities. They can be detected by watching for specific actions
being performed on specific objects.
Anomaly intrusions are based on activities that are deviations from
normal system usage patterns. They are detected by building a profile
of the system or users being monitored, and detecting significant
deviations from this profile.
One significant contribution to the subject of intrusion classification was
made by [48], who noted that previous work directed at intrusion classification
was less than adequate as a basis for research. Classifications that
focused on the intruders and their methods (that is, the threat or
intrusion technique) tended to focus on the exploitation, but did so in
terms of the technique used. Classifications that stressed the
characteristics of the computer system that make the intrusion
possible (that is, the vulnerability or security flaw) frequently did not
account for the exploitation of known flaws. Lindquist and Jonsson
[48] believe that proper intrusion classification is essential for the
following reasons:
In general, categorizing a phenomenon makes systematic studies
possible.
An established taxonomy would be useful when reporting
incidents to incident response teams, such as the CERT Coordination
Center.
If the taxonomy included a grading of the severity impact of
the intrusion, system owners and administrators would be helped in
prioritizing their efforts.
[48] also cited the work of [49], which was based on an analysis of 3,000
computer-abuse cases over a 20-year period. The Neumann and Parker
classification is summarized in the following
Another issue for [48] was the question of intrusion consequences. They
considered both the immediate result of the breach and what the
intruder did after the initial breach to be important. According to Lindquist and
Jonsson [48], a taxonomy should have the following properties:
The categories of the taxonomy should be mutually exclusive (every specimen
should fit in at most one category) and collectively exhaustive (every specimen
should fit in at least one category).
Every category should be accompanied by clear and
unambiguous classification criteria defining what specimens are to be put in
that category.
The taxonomy should be comprehensible and useful not only to
experts in security but also to users and administrators with less knowledge
and experience of security.
The terminology of the taxonomy should comply
with the established security terminology.
They also took into account the properties that had been previously identified
by [34], which include:
Completeness. The taxonomy should encompass all possible attacks on
the target system.
Appropriateness. The selected taxonomy should appropriately
characterize the attacks to the target system; that is any constraints on
the taxonomy or on the system should be specified and considered
before application. Attack taxonomy should differentiate attacks that
require insider access to a system from those that can be initiated by
external intruders who may not have gained access to the system.
Based on their research and their analysis of previous attempts to
develop classifications, [48] decided to use the traditional aspects of
computer security, Confidentiality, Integrity and Availability (CIA), as
a basis for their model. From this, they developed two classification
schemes, as noted in the tables below. One classification focused on
intrusion technique, the other on intrusion result.
Lodin [46] classifies potential intruders into two types, outside intruders
and inside intruders:
Outside intruders are the most publicized form of intruder
and receive the bulk of attention during security implementations.
Typical terms used to identify an outside intruder are hacker and cracker.
Inside intruders: studies by the Computer Security Institute in
conjunction with the FBI have revealed that most intrusions and
attacks come from within an organization and result from an
authorized user maliciously invoking an authorized process or
manipulating a known vulnerability. This type of intrusion has the
potential for causing the greatest damage to the organization.
Finally, [47] believes that it is important to also consider the type of
intrusion, regardless of the source. He divides intrusions into six main types:
Attempted break-ins, which are detected by atypical behavior profiles or
violations of security constraints.
Masquerade attacks, which are detected by atypical behavior profiles or
violations of security constraints.
Penetration of the security control system, which is detected by
monitoring for specific patterns of activity.
Leakage, which is detected by atypical use of system resources.
Denial of service, which is detected by atypical use of system resources.
Malicious use, which is detected by atypical behavior profiles,
violations of security constraints, or use of special privileges.
automatically tailored to a company's profile of operating systems and
network hardware. Vulnerabilities are also showing up in security
equipment, such as firewalls and even IDS equipment.
Finally, hackers are getting smarter: hackers can use port scanners to
attempt to connect to a target machine on every port and build a list of
potential active ports. Modern port scanners include operating system
identification, can target entire ranges of IP addresses and even send in
decoy scans to make it more difficult for the target to identify who the
scanner source really is.
2.2: Data mining models
Historically, the notion of finding useful patterns in data has been given a
variety of names, including data mining, knowledge extraction, information
discovery, information harvesting, data archaeology, and data pattern
processing [52]. The term data mining has mostly been used by statisticians,
data analysts, and the management information systems (MIS) communities.
It has also gained popularity in the database field. The phrase knowledge
discovery in databases was coined at the first KDD workshop in 1989 [52] to
emphasize that knowledge is the end product of a data-driven discovery. It has
been popularized in the AI and machine-learning fields. In our view, KDD
refers to the overall process of discovering useful knowledge from data, and
data mining refers to a particular step in this process. Data mining is the
application of specific algorithms for extracting patterns from data.
The additional steps in the KDD process, such as data preparation, data
selection, data cleaning, incorporation of appropriate prior knowledge, and
proper interpretation of the results of mining, are essential to ensure that
useful knowledge is derived from the data. Blind application of data-mining
methods (rightly criticized as data dredging in the statistical literature) can be
a dangerous activity, easily leading to the discovery of meaningless and invalid
patterns.
The KDD process is interactive and iterative, involving numerous steps with
many decisions made by the user. Brachman and Anand [53] give a practical
view of the KDD process, emphasizing its interactive nature.
Some of its basic steps are broadly outlined here:
The first step is developing an understanding of the application
domain and the relevant prior knowledge and identifying the goal of
the KDD process from the customer's viewpoint. This is followed by
creating a target data set: selecting a data set, or focusing on a subset
of variables or data samples, on which discovery is to be performed.
The third step is data cleaning and preprocessing. Basic operations
include removing noise if appropriate, collecting the necessary
information to model or account for noise, deciding on strategies for
handling missing data fields, and accounting for time-sequence
information and known changes.
The fourth step is data reduction and projection: finding useful
features to represent the data depending on the goal of the task. With
dimensionality reduction or transformation methods, the effective
number of variables under consideration can be reduced, or invariant
representations for the data can be found.
models and parameters might be appropriate (for example, models of
categorical data are different from models of vectors over the reals) and
matching a particular data-mining method with the overall criteria of
the KDD process (for example, the end user might be more interested
in understanding the model than its predictive capabilities).
The seventh step is data mining, that is, searching for patterns of interest
in a particular representational form or a set of such representations,
including classification rules or trees, regression, and clustering. The
user can significantly aid the data-mining method by correctly
performing the preceding steps. The next step is interpreting mined
patterns, possibly returning to any of steps 1 through 7 for further
iteration. This step can also involve visualization of the extracted
patterns and models or visualization of the data given the extracted
models.
The final step is acting on the discovered knowledge: using the
knowledge directly, incorporating the knowledge into another system
for further action, or simply documenting it and reporting it to
interested parties.
This process also includes checking for and resolving potential conflicts with
previously believed (or extracted) knowledge. The KDD process can involve
significant iteration and can contain loops between any two steps. The basic
flow of steps (although not the potential multitude of iterations and loops) is
illustrated in figure 2.2. Most previous work on KDD has focused on step 7,
the data mining. However, the other steps are as important (and probably
more so) for the successful application of KDD in practice. Having defined the
basic notions and introduced the KDD process, we now focus on the data-
mining component, which has, by far, received the most attention in the
literature.
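As a minimal sketch of how the steps above chain together, the Python fragment below runs a toy pass through target-set selection, cleaning, reduction, mining and interpretation. The record fields (`duration`, `bytes`, `proto`, `label`), the threshold rule used as the mining step and all function names are hypothetical and chosen only for illustration; they are not taken from the data set used in this thesis.

```python
from statistics import mean

# Hypothetical raw connection records; None marks a missing field.
RAW = [
    {"duration": 2, "bytes": 300, "proto": "tcp", "label": "normal"},
    {"duration": 1, "bytes": None, "proto": "tcp", "label": "normal"},
    {"duration": 40, "bytes": 9000, "proto": "udp", "label": "attack"},
    {"duration": 3, "bytes": 450, "proto": "tcp", "label": "normal"},
    {"duration": 55, "bytes": 12000, "proto": "tcp", "label": "attack"},
]

def select_target(records, fields):
    """Step 2: create the target data set by keeping a subset of variables."""
    return [{f: r[f] for f in fields} for r in records]

def clean(records):
    """Step 3: one simple missing-data strategy, dropping incomplete records."""
    return [r for r in records if all(v is not None for v in r.values())]

def reduce_features(records, keep):
    """Step 4: data reduction, keeping only the features useful for the goal."""
    return [{f: r[f] for f in keep} for r in records]

def mine(records, feature):
    """Step 7: a toy mining step that learns a threshold on one feature,
    placed halfway between the two class means."""
    normal = [r[feature] for r in records if r["label"] == "normal"]
    attack = [r[feature] for r in records if r["label"] == "attack"]
    return (mean(normal) + mean(attack)) / 2

def interpret(threshold, feature):
    """Step 8: express the mined pattern in a form a user can read."""
    return f"IF {feature} > {threshold:g} THEN attack ELSE normal"

target = select_target(RAW, ["duration", "bytes", "label"])
cleaned = clean(target)
reduced = reduce_features(cleaned, ["bytes", "label"])
threshold = mine(reduced, "bytes")
rule = interpret(threshold, "bytes")
print(rule)
```

In practice each function would be revisited several times, which is the iteration between steps that the KDD process describes.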
Cross Industry Standard Process for Data Mining (CRISP-DM) is the most
widely used methodology for developing DM projects [54]. After analyzing the
problems of DM and KD projects, a group of prominent enterprises (Teradata,
SPSS-ISL, Daimler-Chrysler and OHRA) proposed a reference guide called CRISP-DM
[55]. CRISP-DM is vendor-independent, so it can be used with any DM tool and
applied to any DM problem. CRISP-DM defines the phases to
be carried out in a DM project and, for each phase, the
tasks and the deliverables for each task. CRISP-DM is divided into six phases,
shown in Figure 3 (the CRISP data mining model) [55].
Table 1 CRISP-DM phases and tasks
Data preparation: The data preparation phase covers all the activities
required to construct the final dataset from the initial raw data. Data
preparation tasks are likely to be performed repeatedly and not in any
prescribed order.
Evaluation: By this stage, models that appear to be of high quality from a
data analysis perspective will have been built. Before proceeding to final
model deployment, it is important to evaluate the model more thoroughly and
review the steps taken to build it, to be certain that it properly achieves the
business objectives. At the end of this phase, a decision should be reached on
how to use the DM results.
Hybrid models combine aspects of both academic and industrial models,
resulting in a more general, research-oriented description of the steps.
One such model is the six-step KDP model of Cios et al. [16], which consists of
the following steps:
sampling, correlation and significance tests and data cleaning. The (cleaned)
data may also be processed by derivation of new attributes or summarization
of data.
Data mining: Where various data mining methods are used to derive
knowledge from the preprocessed data.
Predictive models can also be descriptive (to the degree that they are
understandable), and descriptive models can be used for prediction.
The categories of prediction and description are associated with the five
basic operations presented in Figure 4 (categories of prediction and
description).
2.3.1. Clustering
Clustering is the grouping of similar objects into different groups, or more
precisely, the partitioning of a data set into subsets (clusters), so that the data
in each subset (ideally) share some common trait, often proximity according to
some defined distance measure [58]. Machine learning typically regards data
clustering as a form of unsupervised learning, which can detect
intrusions that have not been previously learned. Examples of such algorithms
include K-means clustering and the Self-Organizing feature Map (SOM).
Clustering is useful in intrusion detection because malicious activity should
cluster together, separating itself from non-malicious activity. Clustering
provides a significant advantage over classification techniques: it does not
require a labeled data set for training. This matters because the amount of
available network audit data is large, and human labeling is time consuming and
expensive. Clustering can also be used for labeling data and assigning it to
groups. Clustering algorithms can group new data instances into similar
groups, and these groups can be used to increase the performance of existing
classifiers [58] [59]. This can be achieved by first clustering the traffic data
into different groups and then passing the result to the classifier.
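To make the idea concrete, the following is a minimal sketch of K-means clustering on unlabeled records, written in plain Python. The two-dimensional records (connections per second, error rate), the choice of two clusters and the initial centroids are assumptions made for illustration only.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, assignment

# Hypothetical, unlabeled traffic records: (connections per second, error rate).
traffic = [(1.0, 0.1), (1.2, 0.0), (0.9, 0.2), (9.0, 0.9), (8.5, 1.0), (9.5, 0.8)]
centroids, assignment = kmeans(traffic, centroids=[traffic[0], traffic[3]])
print(assignment)
```

The resulting cluster index of each record could then be passed to a classifier as an additional attribute, which is one way of combining clustering with classification as described above.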
2.3.2. Classification
Classification is similar to clustering in that it also partitions records into
distinct segments called classes [60]. But unlike clustering, classification
analysis requires that the end-user know ahead of time how classes are
defined. It is necessary that each record in the dataset used to build the
classifier already have a value for the attribute used to define classes. As each
record has a value for the attribute used to define the classes, and because the
end-user decides on the attribute to use, classification is much less
investigative than clustering [60].
Decision Trees: Decision trees use a simple knowledge representation to
classify examples into a finite number of classes. In a typical setting, the tree
nodes represent the attributes, the edges represent the possible values for a
particular attribute, and the leaves are assigned class labels. Classifying a
test sample is straightforward once a decision tree has been constructed. An
instance is classified by following a path from the root node through the tree,
taking the edges corresponding to the values of its attributes, until a leaf
is reached. Some of the popular tree algorithms include ID3
[63] [64] and CART [65]. A decision tree classifier is built in two phases:
tree building and tree pruning. In tree building, the decision tree model is
built by recursively splitting the training data set based on a locally optimal
criterion until all or most of the records belonging to each of the partitions
bear the same class label. After building the decision tree, a tree pruning step
is performed to reduce the size of the decision tree. Decision trees that are too
large are prone to overfitting. Pruning attempts to improve the
generalization capability of a decision tree by trimming the branches of the
initial tree. The tree pruning approach is error based: start from the bottom of
the tree and examine each non-leaf subtree. If replacing this subtree with a
leaf, or with its most frequently used branch, would lead to a lower predicted
error rate, then prune the tree accordingly [64].
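The tree building phase and the classification of a test sample can be sketched as follows. This is an ID3-style illustration that uses information gain as the locally optimal splitting criterion and omits pruning; the attribute names (`proto`, `flag`) and the five training records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Reduction in entropy obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def build_tree(rows, attrs, target):
    """Recursively split until a partition is pure or no attributes remain."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # leaf: a class label
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    branches = {
        value: build_tree([r for r in rows if r[best] == value], rest, target)
        for value in {r[best] for r in rows}
    }
    return {"attr": best, "branches": branches, "default": majority}

def classify(tree, record):
    """Follow the edges matching the record's attribute values until a leaf."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(record[tree["attr"]], tree["default"])
    return tree

# Hypothetical labeled connection records.
DATA = [
    {"proto": "tcp", "flag": "SF", "label": "normal"},
    {"proto": "tcp", "flag": "S0", "label": "attack"},
    {"proto": "udp", "flag": "SF", "label": "normal"},
    {"proto": "udp", "flag": "S0", "label": "attack"},
    {"proto": "icmp", "flag": "SF", "label": "normal"},
]
tree = build_tree(DATA, ["proto", "flag"], "label")
print(tree["attr"], classify(tree, {"proto": "tcp", "flag": "S0"}))
```

On this toy data the `flag` attribute separates the classes completely, so it yields the higher information gain and is chosen as the root. An attribute value never seen in training falls back to the majority class of the node, which is a design choice of this sketch rather than part of any cited algorithm.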
Bayesian Classification: Bayesian classification is based on inference
in probabilistic graphical models, which specify the probabilistic dependencies
underlying a particular model using a graph structure [67]. In its simplest
form, a probabilistic graphical model is a graph in which nodes stand for
random variables and the arcs represent conditional dependence
assumptions. Hence it provides a compact representation of joint
probability distributions.
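The compactness of such a representation can be illustrated with a small sketch. The three-node graph below (Attack -> HighTraffic, Attack -> FailedLogins) and every probability value in it are hypothetical, chosen only to show how the joint distribution factorizes along the arcs and how a posterior class probability follows from Bayes' rule.

```python
from itertools import product

# Hypothetical three-node graph: Attack -> HighTraffic, Attack -> FailedLogins.
P_ATTACK = 0.1                            # P(Attack=True)
P_HIGH_GIVEN = {True: 0.8, False: 0.2}    # P(HighTraffic=True | Attack)
P_FAIL_GIVEN = {True: 0.6, False: 0.1}    # P(FailedLogins=True | Attack)

def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

def joint(attack, high, fail):
    """Joint probability, factorized along the arcs of the graph."""
    return (bernoulli(P_ATTACK, attack)
            * bernoulli(P_HIGH_GIVEN[attack], high)
            * bernoulli(P_FAIL_GIVEN[attack], fail))

def posterior_attack(high, fail):
    """P(Attack=True | evidence), by Bayes' rule over the two class values."""
    numerator = joint(True, high, fail)
    return numerator / (numerator + joint(False, high, fail))

# The factorized joint still sums to one over all eight assignments.
total = sum(joint(a, h, f) for a, h, f in product([True, False], repeat=3))
print(round(total, 6), round(posterior_attack(True, True), 4))
```

Five numbers specify this model, whereas a full joint table over three binary variables needs seven independent entries; the saving grows quickly as more variables are added.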
development since the 1970s in statistical pattern recognition [72], machine
learning [73] and data mining [74], and has been widely applied to many fields
such as text categorization [75], image retrieval [76], customer relationship
management [77], intrusion detection [6] and genomic analysis [78].
Feature selection is a process that selects a subset of original features. The
optimality of a feature subset is measured by an evaluation criterion. As the
dimensionality of a domain expands, the number of features N increases.
Finding an optimal feature subset is usually intractable [79] and many
problems related to feature selection have been shown to be NP-hard [80].
A typical feature selection process consists of four basic steps (shown in Figure
2.6), namely, subset generation, subset evaluation, stopping criterion, and
result validation [81]. Subset generation is a search procedure [82] that
produces candidate feature subsets for evaluation based on a certain search
strategy. Each candidate subset is evaluated and compared with the previous
best one according to a certain evaluation criterion. If the new subset turns
out to be better, it replaces the previous best subset. The process of subset
generation and evaluation is repeated until a given stopping criterion is
satisfied. Then the selected best subset usually needs to be validated by prior
knowledge or different tests via synthetic and/or real-world data sets. Feature
selection can be found in many areas of data mining such as classification,
clustering, association rules, regression. For example
Feature selection is called subset or variable selection in Statistics [83]. A
number of approaches to variable selection and coefficient shrinkage for
regression are summarized in [84]. Early research efforts mainly focus on
feature selection for supervised classification with labeled data [85]
(supervised feature selection) where class information is available.
Recent developments, however, show that the above general procedure can be
readily adapted to feature selection for clustering with unlabeled data [81]
(unsupervised feature selection). Feature selection
algorithms designed with different evaluation criteria broadly fall into three
categories: the filter model [81], the wrapper model [86], and the hybrid model
[87]. The filter model relies on general characteristics of the data to evaluate
and select feature subsets without involving any mining algorithm. The
wrapper model requires one predetermined mining algorithm and uses its
performance as the evaluation criterion. It searches for features better suited
to the mining algorithm aiming to improve mining performance, but it also
tends to be more computationally expensive than the filter model [88],[79].
The hybrid model attempts to take advantage of the two models by exploiting
their different evaluation criteria in different search stages.
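The general procedure (subset generation, subset evaluation and a stopping criterion) can be sketched as a greedy forward search. The example below plugs in a wrapper-style criterion, the leave-one-out accuracy of a 1-nearest-neighbour classifier; the search strategy, the classifier, the feature names and the six records are all assumptions made for illustration. A filter model would replace `evaluate` with a measure computed from the data alone.

```python
def forward_selection(features, evaluate):
    """Greedy search: subset generation, subset evaluation, stopping criterion."""
    best_subset, best_score = [], evaluate([])
    while True:
        # Subset generation: extend the current best subset by one feature.
        candidates = [best_subset + [f] for f in features if f not in best_subset]
        if not candidates:
            break
        # Subset evaluation: score every candidate with the chosen criterion.
        score, subset = max((evaluate(c), c) for c in candidates)
        # Stopping criterion: stop when no candidate improves on the best so far.
        if score <= best_score:
            break
        best_subset, best_score = subset, score
    return best_subset, best_score

def loo_accuracy(rows, subset):
    """Wrapper criterion: leave-one-out accuracy of a 1-nearest-neighbour
    classifier restricted to the features in `subset`."""
    if not subset:
        return 0.0
    correct = 0
    for i, row in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        nearest = min(others, key=lambda o: sum((o[f] - row[f]) ** 2 for f in subset))
        correct += nearest["label"] == row["label"]
    return correct / len(rows)

# Hypothetical normalized records; "noise" carries no class information.
ROWS = [
    {"bytes": 0.10, "duration": 0.2, "noise": 0.90, "label": "normal"},
    {"bytes": 0.20, "duration": 0.1, "noise": 0.10, "label": "normal"},
    {"bytes": 0.15, "duration": 0.3, "noise": 0.50, "label": "normal"},
    {"bytes": 0.90, "duration": 0.8, "noise": 0.80, "label": "attack"},
    {"bytes": 0.80, "duration": 0.9, "noise": 0.20, "label": "attack"},
    {"bytes": 0.85, "duration": 0.7, "noise": 0.55, "label": "attack"},
]
FEATURES = ["bytes", "duration", "noise"]
subset, score = forward_selection(FEATURES, lambda s: loo_accuracy(ROWS, s))
print(subset, score)
```

Because the wrapper criterion retrains and tests the classifier for every candidate subset, its cost grows with both the number of features and the number of records, which reflects the higher computational expense noted above.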
The Evolution of Intrusion Detection Systems
His insight into audit data and
its importance led to tremendous improvements in the auditing subsystems of
virtually every operating system.
Anderson's conjecture also provided the foundation for future intrusion
detection system design and development. His work was the start of host-
based intrusion detection and IDS in general.
In 1984, SRI also developed a means of tracking and analyzing audit data
containing authentication information of users on ARPANET, the original
Internet. Soon after, SRI completed a Navy SPAWAR contract with the
realization of the first functional intrusion detection system, IDES. Using her
research and development work at SRI, Dr. Denning published the decisive
work, An Intrusion Detection Model, which revealed the necessary
information for commercial intrusion detection system development. Her
paper is the basis for most of the work in IDS that followed. Meanwhile, there
were other significant advances occurring at University of California Davis'
Lawrence Livermore Laboratories.
Lane (2000) tested his host-based IDS using, in part, an instance-based learner, a
type of exemplar clustering approach. While not all clustering
techniques are applicable to the intrusion detection domain, the wealth of
techniques that Berkhin (2002) presented easily leaves the impression that
there is a great deal of potential for further research in the application of
clustering techniques to network intrusion detection. Dewan and Mohammad
(2010) present an alert classification approach to reduce false positives in IDS
using an improved self-adaptive Bayesian algorithm (ISABA). It is applied to the
security domain of anomaly-based network intrusion detection. Sathyabama et
al. (2011) used clustering techniques to group users' behavior
according to similarity and to detect differing behaviors, which are flagged
as outliers. Amir et al. (2011) used SOM to classify IDS alerts in order to
reduce false positive alerts. Alert filtering and cluster merging algorithms are
used to improve the accuracy of the system, and SOM is used to find correlations
between alerts. Alan et al. (2002) developed a NIDS using classifying
self-organizing maps for data clustering.
MLP neural network is an efficient way of creating uniform, grouped input for
detection when a dynamic number of inputs are present. The ensemble
approach of Srinivas et al. (2004) helps to indirectly combine the synergistic
and complementary features of the different learning paradigms without any
complex hybridization. The ensemble approach outperforms SVMs,
MARS and ANNs used individually. SVMs outperform MARS and ANNs in respect of
scalability, training time, running time and prediction accuracy. Shilendra et
al. (2011) focused on dimensionality reduction using feature selection. Their
rough set support vector machine (RSSVM) approach uses Johnson's (2002)
algorithm and the genetic algorithm of rough set theory to find the reduct sets,
which are then sent to the SVM to classify any new behavior as either normal or
attack.
Taeshik and Jong (2007) proposed an enhanced SVM framework for
detecting and classifying novel attacks in network traffic. The overall
framework consists of an enhanced SVM-based anomaly detection engine and
its supplementary components, such as packet profiling using SOFM, packet
filtering using PTF, field selection using a genetic algorithm and packet
flow-based data preprocessing. SOFM clustering was used for normal profiling.
The SVM approach provides a false positive rate similar to that of real NIDSs.
Sadiq (2011) showed that genetic algorithms can be effectively used for
formulating decision rules in intrusion detection, although the more common
attacks can be detected more accurately. Norouzian and Merati (2011) used a
Multi-Layer Perceptron (MLP) with two hidden layers of neurons to design and
implement a system that detects attacks and classifies them into six groups.
Host-based intrusion detection is used to
trace system calls. Normal and intrusive behavior are collected through
system calls, and analysis is done through DM and fuzzy techniques. To the
knowledge of the researcher, only three attempts have been made so far in our
country to apply DM to intrusion
detection. Adamu (2010) studied a machine learning intrusion
detection system that investigates the application of cost-sensitive learning
using a data mining approach to network intrusion detection. Adamu's
proposed approach was carried out using
cost-sensitive learning techniques by testing a decision tree algorithm on
labeled records. The other study, undertaken by Zewdie (2010), tried to
develop a predictive model for network intrusion detection using information
gain values for feature selection. He proposed optimal feature selection for
network intrusion detection using an indirect cost-sensitive feature selection
approach with a decision tree as the classification technique. Tigabu also tried
to construct a semi-supervised intrusion detection model and compared the
performance of different classification techniques. His
study also compared the results with feature selection algorithms that
incorporate costs.
Both Adamu and Zewdie tested data mining applications for
intrusion detection using the supervised classification technique of the J48
decision tree algorithm, while Tigabu used semi-supervised classification
techniques. None of them validated the resulting
model with real-life data. Hence this study contributes by applying
DM technology to enhance security in the networking
environment with a real-life data set taken from the University of Gondar data
center. It considers labeled records, which shows
how the results look on a real-life data set. The IDS model
constructed in this thesis notifies the administrators after detecting an
attack, so that the administrators can take proper action.
Chapter Three
Once an intrusion has been detected, the IDS issues alerts notifying
administrators of this fact. The next step is undertaken either by the
administrators or by the IDS itself, by taking advantage of additional
countermeasures (specific block functions to terminate sessions, backup
systems, routing connections to a system trap, legal infrastructure etc.),
following the organization's security policy (Fig 3.2). An IDS is an element of
the security policy.
Among the various IDS tasks, intruder identification is one of the fundamental
ones. It can be useful in the forensic investigation of incidents and in installing
appropriate patches to enable the detection of future attack attempts targeted
at specific persons or resources.
An intrusion detection system always has a core element, a sensor (an
analysis engine), that is responsible for detecting intrusions. This sensor
contains decision-making mechanisms regarding intrusions. Sensors receive
raw data from three major information sources (as shown in Fig 3.3): the IDS's
own knowledge base, syslog and audit trails. The syslog may include, for example,
the configuration of the file system, user authorizations etc. This information
creates the basis for a further decision-making process.
The sensor is integrated with the component responsible for data collection
(Fig 3.4), an event generator. The collection manner is determined by the
event generator policy, which defines the filtering mode of event notification
information. The event generator (operating system, network, application)
produces a policy-consistent set of events that may be a log (or audit) of
system events, or network packets. This set, along with the policy information,
can be stored either in the protected system or outside. In certain cases, no
data storage is employed, for example, when event data streams are
transferred directly to the analyzer. This concerns network packets in
particular.
Figure 8 IDS components for data collection [93]
The role of the sensor is to filter information and discard any irrelevant data
obtained from the event set associated with the protected system, thereby
detecting suspicious activities. The analyzer uses the detection policy database
for this purpose. The latter comprises the following elements: attack
signatures, normal behavior profiles, necessary parameters (for example,
thresholds). In addition, the database holds IDS configuration parameters,
including modes of communication with the response module. The sensor also
has its own database containing the dynamic history of potential complex
intrusions (composed from multiple actions).
across multiple physical locations. In addition, agents can be specifically
devoted to detect certain known attack signatures. This is a decisive factor
when introducing protection means associated with new types of attacks [95].
IDS agent-based solutions also use less sophisticated mechanisms for
response policy updating [96].
Basically, there are two approaches to intrusion detection design, based on
the detection technique used: knowledge-based and behavior-based
intrusion detection. Knowledge-based intrusion detection is also called misuse
detection. It is typically realized by modeling known attack
behavior with prior understanding of specific attacks and system
vulnerabilities. The intrusion detection system then compares the observed
network traffic data with well-defined attack patterns to identify
possible penetrations of the system. When the data matches one of the
explicitly defined attack patterns, an alarm is raised. The defined attack
patterns are frequently referred to as the signatures of intrusions. A
signature could be a static string or a sequence of events. While
knowledge-based intrusion detection is achieved by modeling known attack
behavior, behavior-based intrusion detection, also known as anomaly
detection, models the normal or expected behavior of computer users. It looks for
malicious activities by comparing the observed data with these acceptable
behaviors. If the data diverges from the learned normal behavior, an alarm is
raised. In other words, anything is suspected to be an attack if its behavior
deviates from the previously learned behaviors. To develop intrusion
detection systems, a large amount of traffic data must be
collected in advance for analysis using misuse detection or anomaly
detection approaches. Based on the collected network audit trail, misuse
detection techniques specify well-defined attack signatures and anomaly
detection techniques establish acceptable usage profiles to differentiate
intrusions from normal activities in a future network traffic data stream.
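The contrast between the two approaches can be sketched in a few lines. In the example below, the signature strings, the requests-per-minute profile and the three-standard-deviation tolerance are hypothetical choices made for illustration; real systems use far richer signatures and profiles.

```python
from statistics import mean, stdev

# Hypothetical attack signatures: static strings looked for in a payload.
SIGNATURES = {"/etc/passwd": "path traversal", "' OR 1=1": "SQL injection"}

def misuse_detect(payload):
    """Knowledge-based detection: alarm when the data matches a known signature."""
    return [name for pattern, name in SIGNATURES.items() if pattern in payload]

class AnomalyDetector:
    """Behavior-based detection: learn a normal profile, alarm on divergence."""

    def __init__(self, normal_values, tolerance=3.0):
        self.mean = mean(normal_values)
        self.std = stdev(normal_values)
        self.tolerance = tolerance  # allowed deviation, in standard deviations

    def is_anomalous(self, value):
        return abs(value - self.mean) > self.tolerance * self.std

# Hypothetical training data: requests per minute observed during normal use.
profile = AnomalyDetector([20, 22, 19, 21, 23, 20, 18, 22])
print(misuse_detect("GET /../../etc/passwd"), profile.is_anomalous(95))
```

The misuse detector can only raise alarms for patterns already listed in `SIGNATURES`, while the anomaly detector flags any sufficiently unusual value, including previously unseen attacks, at the cost of possible false alarms when legitimate behavior changes.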
3.3: University of Gondar (UoG) Network Architecture
Network Architecture
Figure 3.6 below provides a complete view of the UoG LAN logical network
design.
[Figure 3.6: UoG LAN logical network design. Labels recoverable from the diagram: Internet, WAN, Server Farm, IronPort, Web Server, LMS, QPM, Cs-Mars, Classrooms, President Bldg, Lecture, Temp, Library Bldg, and multiple Catalyst 3560 Series PoE-24 switches.]
RPS SPEED 1 2
RPS RPS RPS
POE
2X 12X 14X 24X
STAT STAT
DUPLX DUPLX
Catalyst 3560 SERIES PoE-24 SPEED 1 2
SPEED 1 2
POE POE
2X 12X 14X 24X 2X 12X 14X 24X
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1X 11X 13X 23X MODE MODE
SYST
RPS
STAT
DUPLX
SPEED 1 2
Catalyst 3560 SERIES PoE-24
POE
2X 12X 14X 24X Catalyst 3560 SERIES PoE-24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
MODE
1X 11X 13X 23X
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
SYST
1X 11X 13X 23X
SYST RPS
RPS
STAT
STAT
DUPLX
DUPLX
SPEED
Catalyst 3560 SERIES PoE-24 Catalyst 3560 SERIES PoE-24 Catalyst 3560 SERIES PoE-24
SPEED 1 2
1 2 POE
POE 2X 12X 14X 24X
2X 12X 14X 24X
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1X 11X 13X 23X 1X 11X 13X 23X 1X 11X 13X 23X
MODE MODE SYST SYST SYST
RPS RPS RPS
STAT STAT
STAT
DUPLX DUPLX
DUPLX
SPEED SPEED 1 2
SPEED 1 2
1 2
POE POE POE
2X 12X 14X 24X 2X 12X 14X 24X
2X 12X 14X 24X
STAT
DUPLX
SPEED 1 2
POE
2X 12X 14X 24X
MODE
Design Objectives
Provide redundancy, resilience, and economy in the system to
minimize failure points and risks.
Naming Conventions
ASAs exist only in the Internet block, so the ASA hostname is
UoG-TED-Int-ASA. The university has two ASAs in the Internet block; both
share the name UoG-TED-Int-ASA because they operate in Active/Standby mode.
Firewalls Hostnames
3.3.3: LAN Architecture
Campus Design
The UoG network design relies on a multilayer building-block design. The
multilayer campus design consists of a number of building blocks connected
across a campus backbone. There are three characteristic layers: access,
distribution, and core. Layer 3 switching is used in the access layer and in
the distribution layer, as well as in the Core-VSS, Serverfarm-VSS, WAN-VSS,
and the MARAKI and FASIL campuses.
Following recent changes, the MARAKI and FASIL campuses have been added to
the Core-VSS over Layer 3 EtherChannels. The Medical campus is now
connected to the Tewedros campus at the WAN-VSS over a Layer 3 physical
link.
MARAKI runs OSPF area 3 and FASIL runs OSPF area 4; both are connected to
the UOG Core-VSS, which is in area 0, over Layer 3 TenGig connections. The
Medical campus is connected to the WAN-VSS in OSPF area 1.
The most flexible and scalable campus backbone consists of Layer 3 switches,
as shown in figure 3.7. The backbone switches are connected by routed 10
Gigabit EtherChannel links.
[Figure 3.7: campus backbone diagram; the image was not recoverable. Labels
that survive: Internet, WAN, Server Farm, IronPort, Web Server, LMS, QPM,
Cs-Mars, Classrooms, President Bldg, Lecture, Temp, Library, Bldg. The access
switches are drawn as Catalyst 3560 Series PoE-24 units.]
Each building block has its own VLANs and VRRP groups, which exist only within
that block and are independent of every other building block. Layer 2 flooding
is therefore confined to a single building block and cannot spread to the rest
of the campus.
The Internet block has two redundant Internet links. It was designed to host
two redundant failover ASA firewalls that can serve DMZ switches hosting
servers publicly accessible from the Internet.
The core of the network consists of two 6504 switches acting as a VSS, with L3
connections over 10 Gigabit Ethernet to the distribution switches and a 2 Gig
bundle to the WAN VSS and the Server Farm VSS. The WAN layer also consists of
dual 6504 switches, joined by a 10 Gigabit Ethernet VSL link; the VSS at the
WAN hosts two Cisco firewall blades. The distribution layer consists of two
4506 switches with 10 Gigabit Ethernet connections to the VSS at the core. The
same applies to the Server Farm, where two Catalyst 6509 switches acting as a
VSS host two Cisco FWSM modules, but with a 2 Gig bundle to the VSS at the
core.
High availability is achieved by using redundant switches and connections at
all layers: access, distribution, core, and server farm. The access layer
consists of Catalyst 3560 switches with dual Gigabit fiber uplinks to the
distribution switches for maximum redundancy.
The Server Farm has two Catalyst 6509 switches acting as a VSS, with 2 Gigabit
Ethernet connections to the VSS at the core defined as a multi-chassis
EtherChannel.
OSPF runs in area 0 between the Core-VSS, Dist-1-TED, Dist-2-TED,
Management-Switch, Server-Farm-VSS, and the WAN-VSS. OSPF was configured to
run in area 2 between FWSM-ServerFarm-VSS and ServerFarm-VSS.
VLANs in the server farm were created to host the servers in different subnets
based on security and performance requirements.
The server farm hosts FWSM blades, which were configured to protect the entire
Server Farm zone and were therefore placed on the outside of the Server Farm.
The outside of the FWSM communicates with the MSFC, while the inside is
connected to the various corporate servers.
Figure 3.8 shows the logical design of the 6509 switches hosting two firewall
blades. All the VLANs are linked to the firewall blade; some VLANs, such as
VLANs 4 and 5, are linked only to the FWSM and thus bypass the MSFC.
The MSFC VLANs can still be used to restrict connectivity between VLANs, since
all routing for those VLANs is controlled by the MSFC engine on the 6500
switch.
Figure 3.7 shows the logical design of the server farm. In the figure, VLANs 6
and 7 are routed through the MSFC, while VLANs 4 and 5 are directly protected
by the FWSM firewall.
[Figure: server farm logical design; the image was not recoverable. Labels
that survive: VSS-Core; FWSM failover link 10.139.245.0/29; FWSM stateful link
10.139.244.0/29.]
3.3.5: WAN Block Network Design
The WAN block has two Catalyst 6504 switches acting as a VSS, with 2 Gigabit
Ethernet connections to the VSS at the core defined as a multi-chassis
EtherChannel.
The VSS at the Core is also equipped with advanced services such as
Cisco firewall blades; these FWSMs will protect the network from the new
infrastructure to be built, the Ethiopian Education and Research Network
(EthERNet).
Figure 5 shows the logical design of the 6504 switches hosting two firewall
blades. The FWSM at the WAN will be connected to ETC, and eBGP will run
between the FWSM at the WAN and ETC. This connection has not been established
yet because the link was not ready. Below is a sample design showing the AS
numbers to be configured if eBGP runs with ETC; otherwise, static routes can
be used to connect UoG to other campuses.
[Figure: sample WAN block design; the image was not recoverable. Labels that
survive, in extraction order: eBGP AS 65009, Internet, eBGP AS 65010,
EthERNet, ASA, Med Campus, VSS-WAN, OSPF Area 0, Core-VSS, Internet Firewalls,
IronPort, WAN-VSS, Web Server.]
3.3.6: Internet Block Network Design
Figure 3.11 shows the Internet block. Two Cisco ASA firewalls are connected to
the Internet redundantly over Gigabit Ethernet connections; the network also
contains DMZ and outside zones.
The DMZ can host publicly accessible servers such as web servers and IronPort
mail filters. The DMZ interface of the ASA firewalls is connected to the 6504
WAN VSS, where the DMZ devices are placed.
The Internet block is protected by two ASA firewalls configured for
Active/Standby failover to increase the availability of the network.
[Figure 3.11: Internet block; the image was not recoverable. Labels that
survive: Internet, OSPF Area 1, VSS-WAN.]
The Internet block has two ASA firewalls with two links to the Internet
service provider.
The two ASA firewalls were configured as an active/standby pair for maximum
availability. If one of the ASA firewalls fails, the standby firewall assumes
the primary role and starts forwarding traffic to the Internet. The two ASA
firewalls run OSPF with the WAN switches and learn the internal routes through
OSPF. A default route was configured on the ASA firewalls to forward traffic
towards the Internet.
The IP addresses assigned to the outside interfaces of the ASA firewalls will
be coordinated with the ISP.
To meet new demands for access control management and compliance, and to
support the increasingly complex policies that this requires, UOG needs the
Cisco Secure Access Control System. This next-generation policy platform is a
core policy component of the Cisco TrustSec solution.
The University of Gondar can gain the following benefits from the Cisco Secure
Access Control System:
Deploy it easily with other Cisco TrustSec components for a comprehensive
access control and confidentiality solution.
Receive support for two distinct protocols: RADIUS for network access control
and TACACS+ for network device access control.
Use multiple databases concurrently for maximum flexibility in enforcing
access policy.
Enjoy increased power and flexibility for access control policies that may
include authentication for various access requirements.
Get integrated monitoring, reporting, and troubleshooting components,
accessible through an easy-to-use, web-based GUI.
AAA is an architectural framework for configuring a set of three independent
security functions in a consistent manner. AAA provides a modular way of
performing the following services: authentication, authorization, and
accounting.
3.3.8: Cs-MARS
The devices configured to be monitored by MARS are the LAN switches in the
campus, the FWSM modules, the ASA firewalls, and the IDSM-2 modules at the
Server Farm.
At this point, SNMP and SYSLOG have already been defined for access and
reporting to MARS on the FWSM, the IDS modules, and the LAN switches.
Figure 13 CSMARS Attack Path
Overview
The device-centric view provides a simplified interface for adding devices and
for editing and deploying security policies.
Figure 14 CSM Security Topology
System design for this research
A hybrid data mining model is followed to explore the application of data
mining to network intrusion detection, that is, to predict whether a record is
normal or an attack. WEKA 3.6.9 data mining tools, techniques, and expertise
are used to address the research problem. The overall design is represented
diagrammatically in the figure below.
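[Figure: hybrid data mining process model; the image was not recoverable.
Labels that survive: Problem understanding, Target data, Pre-processing,
Modeling, Selected model, Discover knowledge, Selected patterns, Knowledge,
Use of the discovered knowledge, Next Domain, User interface.]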
Problem Understanding
This initial step aims to understand the driving forces behind intrusion
detection systems. To this end, several tasks were performed: working closely
with domain experts to define the problem and determine the research goals,
identifying key people, learning about current solutions to the problem,
learning domain-specific terminology, and preparing a description of the
problem.
Data understanding
Once the data is organized, a selection process takes place in which a subset
of the data becomes the target data on which further analysis is performed.
When creating this target data, it is important that the data analyst
understands the domain, the end users' needs, and what the data mining task
might be.
Pre-processing
Data is sometimes collected in an ad hoc manner. Data entry mistakes can
occur, and the data may have missing or unknown entries. During the data
cleaning and preprocessing stage, noise is removed from the data. Outliers and
anomalies can pose special problems for the data analyst during cleaning, and
care must be taken not to remove them. This step can be the most time
consuming in the process.
Transformation
The data reduction and coding step uses transformation techniques to reduce
the number of variables in the data by finding useful features with which to
represent it. The transformed data is used in the data mining step.
Modeling
In this step the actual search for patterns of interest is performed. The
search is carried out within the context of the data mining task and the
representational model under which the analysis is performed. The data mining
task itself can be classification, linear regression analysis, rule formation,
or cluster analysis.
The interpretation step takes the reported results and turns them into
knowledge. Interpretation may require resolving conflicts with previously
discovered knowledge, since new knowledge may even contradict what was
believed before the process began. When this is done to the user's
satisfaction, the knowledge is documented and reported to any interested
parties. This again may involve visualization.
CHAPTER FOUR
4.1 DATASET PREPARATION
This chapter describes how the data was organized for the experiment. As
stated in the methodology section of chapter one, the DM process model
selected is the hybrid data mining model, which starts with understanding the
business and then proceeds to the selection of data.
The data for this research was collected from the Cisco Mars Appliance at the
University of Gondar data center, in comma-separated values (CSV) format. The
dataset initially had 20 attributes and 7345 records; after the preprocessing
stage it was reduced to 16 attributes and 6461 records for building the
predictive model. The data was exported to Microsoft Excel for preprocessing.
Almost 50% of the time and effort in this research project was spent on
cleaning and preparing the data for predictive modeling.
Han and Kamber [99] note that cleaning data for knowledge mining should not be
neglected, because real-world data is highly susceptible to noise,
inconsistency, and incompleteness. They add that the larger the data and the
more numerous and heterogeneous its sources, the lower the predictive
performance of a model.
Network attacks are classified according to the activities carried out by the
attacker. Each attack type falls under one of the following four main
categories [14]; normal connections form a fifth class:
1. Probe. Probing is a class of attacks in which an attacker scans a network
to gather information for the purpose of exploiting known vulnerabilities. An
attacker with a map of the machines and services available on a network can
use that information to look for exploits. There are different types of
probes: some abuse the computer's legitimate features; others use social
engineering techniques. This class of attacks is the most common and requires
little technical expertise.
2. Denial of Service Attacks. Denial of Service (DoS) is a class of attacks
in which an attacker makes some computing or memory resource too busy or too
full to handle legitimate requests, thus denying legitimate users access to a
system. There are different ways to launch DoS attacks: by abusing the
computer's legitimate features, by targeting implementation bugs, or by
exploiting the system's misconfigurations. DoS attacks are usually classified
by the service(s) that the attacker renders unavailable to legitimate users.
3. User to Root Attacks. User to root or user to super-user (U2Su) exploits
are a class of attacks in which an attacker starts out with access to a normal
user account on the system and then exploits a vulnerability to gain root
access. The most common exploits in this class are regular buffer overflows,
which are caused by common programming mistakes and incorrect environment
assumptions.
4. Remote to User Attacks. Remote to local (R2L) is a class of attacks in
which an attacker sends packets to a machine over a network and then exploits
the system's vulnerability to illegally gain local access as a user. There are
different types of R2L attacks; the most common attacks in this class use
social engineering.
5. Normal connections (Normal) are produced by simulating daily user
behavior such as downloading files and visiting web pages.
In this study the researcher used the available intrusion detection data from
the University of Gondar data center. The researcher took 7345 labeled
records, of which 6461 were used for supervised modeling. The distribution of
the data is shown in Table 4.1:
Table 3 Summary of the distribution of attacks
Irrelevant and redundant data were removed before the actual data mining
process. The sample includes records from each class in proportion to the size
of that class. After removing irrelevant and unnecessary data, only 6,461
records were used for this study.
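The removal of incomplete and redundant records described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thesis performed this step in Microsoft Excel, and the column names other than `protocol_kind`, the `?` missing-value marker, and the sample rows are all hypothetical.

```python
import csv
import io


def clean_records(rows, required_fields):
    """Drop records with missing required values and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Treat empty strings and '?' as missing entries (an assumption).
        if any((row.get(f) or "").strip() in ("", "?") for f in required_fields):
            continue
        key = tuple(row[f].strip() for f in required_fields)
        if key in seen:
            continue  # redundant record, already kept once
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical sample in CSV form; not taken from the UoG data set.
SAMPLE_CSV = """protocol_kind,src_port,dst_port,label
TCP,1025,80,normal
TCP,1025,80,normal
UDP,,53,probe
ICMP,0,0,dos
"""

FIELDS = ["protocol_kind", "src_port", "dst_port", "label"]
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_records(rows, FIELDS)
print(len(rows), "records before cleaning,", len(cleaned), "after")
```

Keeping the first occurrence of each duplicate, rather than discarding both copies, preserves the class proportions as far as possible.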
five classes, {NORMAL, PROBE, DOS, U2R, R2L}, and therefore the matrix has
dimensions of 5×5. An entry at row i and column j, C(i, j), represents the
non-negative cost of misclassifying a pattern belonging to class i into class
j. The cost matrix values employed for the dataset are defined in [94]. These
values were also used for evaluating the results of the data set computation.
The magnitude of these values is directly proportional to the impact on the
computing platform under attack if a test record is placed in the wrong
category. A confusion matrix (CM) is similarly defined as a 5×5 matrix for the
dataset, with one row and one column per class. An entry at row i and column
j, CM(i, j), represents the number of patterns that originally belong to class
i but were identified as members of class j; for i ≠ j these are the
misclassified patterns.
The form of the cost matrix C depends on the actual application. In general,
it is reasonable to set the diagonal entries to zero, i.e. C(i, j) = 0 for
i = j, since correct classification normally incurs no cost, as shown in table
4.2. In addition, the size of the cost matrix should be the same as that of
the confusion matrix.
          Normal   Probe   DOS   U2R   R2L
Normal      0        2      2     2     2
Probe       2        0      2     2     2
DOS         2        2      0     2     2
U2R         2        2      2     0     2
R2L         2        2      2     2     0
Table 4 The 5X5 cost matrix used for the KDD 1999 winner result
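To make the role of the cost matrix concrete, the sketch below computes an average misclassification cost from a confusion matrix. It assumes the usual definition, the sum of CM(i, j) × C(i, j) over all cells divided by the number of test records, which the text does not spell out. The confusion matrix counts are hypothetical and are not results from this study.

```python
CLASSES = ["normal", "probe", "dos", "u2r", "r2l"]

# Cost matrix from the table above: 0 on the diagonal, 2 everywhere else.
COST = [[0 if i == j else 2 for j in range(5)] for i in range(5)]


def average_misclassification_cost(confusion, cost):
    """Sum of CM(i, j) * C(i, j) divided by the total number of records."""
    if len(confusion) != len(cost):
        raise ValueError("cost and confusion matrices must have the same size")
    total = sum(sum(row) for row in confusion)
    weighted = sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return weighted / total


# Hypothetical counts: rows are actual classes, columns are predicted classes.
EXAMPLE_CM = [
    [90, 5, 5, 0, 0],
    [2, 48, 0, 0, 0],
    [1, 0, 29, 0, 0],
    [0, 0, 0, 9, 1],
    [3, 0, 0, 0, 7],
]

amc = average_misclassification_cost(EXAMPLE_CM, COST)
print(f"AMC = {amc:.3f}")
```

With a uniform off-diagonal cost of 2, this value is simply twice the error rate; a non-uniform matrix would penalize some confusions (for example, an attack labeled as normal) more heavily than others.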
To evaluate the approach, the four standard metrics used for network intrusion
detection were applied: true positive, true negative, false positive, and
false negative. Table 4.4 shows these standard metrics.
True Positive (TP), True Negative (TN), False Positive (FP), and False
Negative (FN) are defined as follows:
True Positive (TP): The number of malicious records that are correctly
identified.
True Negative (TN): The number of legitimate records that are correctly
classified.
False Positive (FP): The number of records that are incorrectly identified
as attacks when in fact they are legitimate activities.
False Negative (FN): The number of records that are incorrectly classified
as legitimate activities when in fact they are malicious.
4.5.2. Performance Measure
The general performance of intrusion detection systems is measured in terms of
the number of selected features and the classification accuracy of the machine
learning algorithms that give the best classification results. As discussed in
[91], there are different techniques for measuring the performance of an IDS.
A good IDS requires a high detection rate, a low false alarm rate, and a low
average misclassification cost [91]. Thus, when developing an IDS, the overall
classification accuracy (OCA), detection rate (DR), false positive rate (FPR),
average misclassification cost (AMC), error rate, and training and testing
time are considered.
4.5.2.2. Accuracy
Overall classification accuracy (OCA) is the most essential measure of the
performance of a classifier. It is the proportion of correctly classified
examples relative to the total number of examples in the test set, i.e. the
ratio of true positives and true negatives to the total number of examples. In
terms of the confusion matrix, accuracy is the percentage of correctly
classified instances (TP and TN) over the total number of instances in the
test dataset, and can be defined as follows [101]:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall and precision are two widely used metrics in applications where
successful detection of one class is considered more significant than
detection of the other classes [92].
Precision
Precision is the number of class members classified correctly over the total
number of instances classified as class members. In IDS terms, it reflects
cases where an attack has occurred and the IDS detects it correctly [92]:
Precision = TP / (TP + FP)
Recall
Recall, also called the true positive rate (TPR), measures the number of
correctly classified examples relative to the total number of positive
examples; in other words, the number of class members classified correctly
over the total number of class members [92]:
Recall = TP / (TP + FN)
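The measures defined in this section can be computed directly from the four counts. The following is a minimal sketch; the function name and the example counts are hypothetical, and the false positive rate is taken as FP / (FP + TN), its standard definition.

```python
def binary_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and false positive rate."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted
        # or never present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=80, tn=100, fp=5, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a five-class problem such as this one, the same function can be applied per class by treating that class as positive and all other classes as negative.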
CHAPTER FIVE
EXPERIMENTATION
In this study, different experiments were conducted using various data mining
methods to derive knowledge from the preprocessed data and to predict unseen
network attacks.
Data analysis and classification were carried out in the Weka software
environment. The data set collected from the University of Gondar information
communication data center consisted of 6,461 records; this is the number of
records obtained after the class labels were balanced, and it was used for the
study. Fifteen features deemed pertinent to the study were identified by the
researcher. The experiments used the Weka software with the J48 decision tree
and Naïve Bayes classifiers. Weka provides three options for partitioning the
dataset into training and test data: supplying distinct files for the training
and test datasets; cross validation with a configurable number of folds (the
default is 10); and percentage split. 10-fold cross validation was used for
this research, with the intention of avoiding bias when partitioning the
dataset for training and testing. In cross validation mode, the data is
divided into a number of partitions, in this case 10 of approximately equal
size, and each in turn is used for testing while the remainder is used for
training. The process repeats 10 times, so that at the end every instance has
been used exactly once for testing. Finally, the average result over the 10
folds is reported [89].
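The partitioning scheme described above can be sketched as follows. This is a plain, unstratified illustration written for this text, not Weka's own code; Weka's implementation may additionally stratify the folds by class, which this sketch omits. The function name and the seed are assumptions.

```python
import random


def k_fold_indices(n, k=10, seed=1):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of approximately equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 6,461 is the number of records used in the study.
N_RECORDS = 6461
splits = list(k_fold_indices(N_RECORDS, k=10))

# Each record should appear in exactly one test fold.
tested = sorted(idx for _, test in splits for idx in test)
print(len(splits), "folds; every record tested exactly once:",
      tested == list(range(N_RECORDS)))
```

A classifier would be trained on `train` and scored on `test` in each round, and the ten scores averaged.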
5.1 Experimental setup
All experiments were performed on a computer with an Intel(R) Core(I5) 2 CPU
at 2.5 GHz and 4 GB of RAM, running Microsoft Windows 7. WEKA (the latest
stable Windows version, 3.6.9) was used. Weka is a collection of machine
learning algorithms for data mining tasks, with facilities for data
preprocessing, classification, regression, clustering, association rules, and
visualization. The default WEKA maxheap value of 128m was increased to 1 GB
because of heap memory problems during the experiments.
J48 is an open source Java implementation of the C4.5 algorithm in the Weka
data mining tool. C4.5 is a program that creates a decision tree from a set of
labeled input data. The algorithm was developed by Ross Quinlan [96]. The
decision trees generated by C4.5 can be used for classification. The J48
algorithm has parameters that can be changed to further improve classification
accuracy. Initially, the classification model was built with the default
parameter values of the J48 algorithm. Table 5.1 summarizes the default
parameters and their values for the J48 decision tree algorithm.
Parameter          Description                                      Default Value
Binary Splits      Making a binary tree                             False
ConfidenceFactor   The confidence factor used for pruning           0.25
                   (smaller values incur more pruning)
Sub-tree raising   Whether to consider the sub tree raising         True
                   operation in post pruning.
minNumObj          The minimum number of instances per leaf         2
Unpruned           Whether pruning is performed                     False
Subtreeraising     Whether sub tree information is hidden or        True
                   expanded
Debug                                                               False
Several test options can be used with the J48 decision tree. Of these, cross
validation and percentage split are the best known, and both have been applied
to the J48 decision tree in this study.
Experimentation I
In experiment one, the J48 decision tree was tested with 10-fold cross
validation and the default parameter values, that is, a pruned tree with a
confidence factor of 0.25. With these defaults the generated tree has 72 leaves
and a size of 84. The experimental result showed that Protocol_kind is the root
node of the tree, which indicates that Protocol_kind is the most determinant
factor in identifying the class of a given intrusion. The accuracy of the model
is shown in the following table 5.2.
Table 7 Detailed Accuracy by Class using Supervised J48 algorithm parameters with
their default values- 10 fold cross validation
As shown in table 5.2, the J48 learning algorithm scored an accuracy of
94.40%. That is, 94.40% of the records in the dataset were correctly
classified.
With the un-pruned option and a confidence factor of 0.2, the result changed.
The number of leaves and the size of the tree increased to 143 and 166
respectively compared with the previous result, and the performance of the
model is lower than that of the pruned one. The accuracy of the model is shown
in the following table 5.3.
Class              TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
Normal             0.896     0.008     0.991       0.896    0.941       0.99
R2L                0.934     0.056     0.567       0.934    0.706       0.957
DOS                1         0         1           1        1           1
Probe              1         0         0.998       1        0.999       1
U2R                0.625     0.001     0.769       0.625    0.69        0.921
Weighted Average   0.943     0.008     0.962       0.943    0.948       0.992
Table 8 Detailed Accuracy by Class using Supervised J48 algorithm parameters with
other confidence factor (pruned with cf 0.2)
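The F-Measure column of the table above can be cross-checked from the Precision and Recall columns, because the F-measure is the harmonic mean of the two. The following Python sketch is illustrative only and was not part of the WEKA experiments; the function name f_measure is an assumption.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F-measure) per class, copied from the table above.
rows = {
    "Normal": (0.991, 0.896, 0.941),
    "R2L": (0.567, 0.934, 0.706),
    "DOS": (1.0, 1.0, 1.0),
    "Probe": (0.998, 1.0, 0.999),
    "U2R": (0.769, 0.625, 0.69),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f_measure(p, r):.3f}, reported {reported}")
```

For every class the computed value agrees with the reported one to three decimal places, which suggests the table was transcribed consistently.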
As shown in table 5.3, the J48 learning algorithm scored an accuracy of 94.26%:
94.26% of the records in the dataset were correctly classified and 5.74% were
not.
In summary, applying the J48 decision tree with the un-pruned parameter
decreased the performance of the model. The pruned J48 tree gives the better
model, with an accuracy of 94.40%, 72 leaves and a tree size of 84, so this
model is selected for further analysis. Table 5.4 depicts the confusion matrix
for the pruned J48 decision tree algorithm.
Classified as   Normal   R2L   DOS    Probe   U2R   Sum
Normal          2816     326   0      0       3     3,145
R2L             16       450   0      1       5     472
DOS             0        0     2377   0       0     2377
Probe           0        0     0      435     0     435
U2R             3        7     0      1       21    32
Sum                                                 6461
The confusion matrix in table 9 shows that, out of the total 6,461 records
provided to the program, 6,099 (94.40%) were classified correctly and the
remaining 362 (5.60%) were classified incorrectly. Rows give the actual class
and columns the predicted class. Of the records whose actual class is Normal,
326 were classified as R2L and 3 as U2R, while 16, 1 and 5 records whose actual
class is R2L were wrongly classified as Normal, Probe and U2R respectively. No
DOS or Probe records were misclassified. In addition, 11 records that are
actually U2R were wrongly classified as Normal, R2L or Probe. Most of the
misclassifications therefore involve the Normal and U2R classes. The reason is
that the dataset is unbalanced: the Normal class has a large number of records
(3,145) while U2R has very few (32).
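The overall accuracy quoted above follows directly from the confusion matrix: the correctly classified records lie on the diagonal. The Python sketch below reproduces that arithmetic as an illustration; it assumes that rows are actual classes and columns are predicted classes, and the helper names are not taken from WEKA.

```python
labels = ["Normal", "R2L", "DOS", "Probe", "U2R"]
# Rows are actual classes, columns are predicted classes (values from the matrix above).
matrix = [
    [2816, 326, 0, 0, 3],
    [16, 450, 0, 1, 5],
    [0, 0, 2377, 0, 0],
    [0, 0, 0, 435, 0],
    [3, 7, 0, 1, 21],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / total

def recall(i):
    """Share of the actual class i records that were predicted as class i."""
    return matrix[i][i] / sum(matrix[i])

def precision(i):
    """Share of the records predicted as class i that really belong to class i."""
    return matrix[i][i] / sum(row[i] for row in matrix)

print(f"{correct} of {total} correct, accuracy {accuracy:.4%}")
for i, name in enumerate(labels):
    print(f"{name}: recall {recall(i):.3f}, precision {precision(i):.3f}")
```

The per-class recall makes the effect of the class imbalance visible: DOS and Probe are recovered completely, while U2R, with only 32 records, has the lowest recall.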
5.2.2.2. Experimentation II
In experiment II a percentage split was applied, using 70% of the dataset for
training and the remaining 30% for testing the model. The size of the tree and
the number of leaves were the same as in the previous experiment, the pruned
J48 decision tree with 10-fold cross validation. The experimental result again
showed that Protocol_kind is the root node of the tree, which indicates that
Protocol_kind is the most determinant factor in identifying the class of a
given intrusion.
The accuracy of the model is shown in the following table 5.5.
Table 10 Detailed Accuracy by Class using J48 algorithm parameters with percentage-
split set to 70%
The size of the tree and the number of leaves produced from this training were
84 and 72 respectively. In this experiment, out of the 6,461 total records,
4,523 (70%) are used for training while 1,938 (30%) are used for testing. As
the confusion matrix of the model developed with this proportion shows, 94.12%
of the 1,938 testing records are correctly classified.
Another experiment was conducted by changing the percentage split from 70% to
75%, which means that 75% of the dataset was used for training the model and
the remaining 25% for testing it. The size of the tree and the number of leaves
are the same as in the previous percentage split experiment, while the
performance of the model varied. The accuracy of the model is shown in the
following table 5.6.
In the above experiment, out of the 6,461 total records, 4,846 (75%) are used
for training while 1,615 (25%) are used for testing. As the confusion matrix of
the model developed with this proportion shows, 94.2415% of the 1,615 testing
records are correctly classified.
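The training and testing sizes reported for the two percentage splits can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: rounding the training size to the nearest integer is an assumption chosen because it matches the reported figures, and split_sizes is a hypothetical helper name.

```python
def split_sizes(n_instances, train_percent):
    """Return (train, test) sizes for a percentage split of n_instances."""
    train = round(n_instances * train_percent / 100)
    return train, n_instances - train

for pct in (70, 75):
    train, test = split_sizes(6461, pct)
    print(f"{pct}% split: {train} training records, {test} testing records")
```

For the 6461 records of this study the sketch gives 4523 and 1938 records at 70%, and 4846 and 1615 records at 75%, in agreement with the text.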
The two percentage split experiments showed that when the share of training
data increases, the performance of the algorithm in predicting newly arriving
instances also increases (94.12% at 70% against 94.24% at 75%). However,
varying the sizes of the training and testing datasets did not improve the
accuracy beyond that obtained with 10-fold cross validation.
As a result, among the percentage split models, the one created using the
pruned J48 decision tree with the percentage split set to 75 percent is
selected for further analysis. Table 5.7 depicts the confusion matrix for this
model.
Classified as   Normal   R2L   DOS   Probe   U2R   Sum
Normal          712      86    0     0       1     799
R2L             4        95    0     0       0     99
DOS             0        0     608   0       0     608
Probe           0        0     0     100     0     100
U2R             1        0     0     1       7     9
Sum                                                1615
Table 12 Confusion Matrix using J48 algorithm parameters with percentage-split set
to 75%
Of all the above experiments, the J48 decision tree with 10-fold cross
validation and default parameter values scored the best result, and it is
selected as the predictive model.
representing, using, and learning probabilistic knowledge. Impressive results can
be achieved using it. It has often been shown that Naïve Bayes rivals, and indeed
outperforms, more sophisticated classifiers on many datasets.
Using all 16 attributes (features), the accuracy of the Naïve Bayes classifier
is 93.6%. In this experiment the 10-fold cross validation method with default
values was applied.
The detailed classification accuracy is shown in the following table 5.7:
Table 13 Detailed Accuracy by Class using Naïve Bayes classifier with 10 fold cross
validation
As shown in the above table, of the 6,461 records in the dataset, 6,046
(93.5768%) were correctly classified and 415 (6.4232%) were incorrectly
classified.
5.2. Experimentation IV:
The second experiment with the Naïve Bayes classifier using all features is the
percentage split method. A 75%-25% split was applied, which means that 75% of
the total dataset was used for training the model and the remaining 25% for
testing it. The classification accuracy of the model is shown in the following
table 14.
The above table shows that 1,615 records were used as the testing dataset. Of
these, 1,503 (93.065%) were correctly classified and the remaining 112 (6.935%)
were incorrectly classified.
Of the two experiments using the Naïve Bayes classifier, the 10-fold cross
validation achieved the better result.
Comparing the classification accuracy of the J48 decision tree with 10-fold
cross validation against that of Naïve Bayes with 10-fold cross validation, the
former gives the better classification result. Table 5.9 below depicts the
confusion matrix for the Naïve Bayes classifier.
Classified as   Normal   R2L   DOS    Probe   U2R   Sum
Normal          2763     325   0      1       56    3145
R2L             1        446   0      0       25    472
DOS             0        0     2377   0       0     2377
Probe           0        0     0      429     6     435
U2R             0        0     0      1       31    32
Sum                                                 6461
Table 15 Confusion Matrix using Naïve Bayes classifiers with 10 fold cross validation
Experimentation V:
This experiment applies the Naïve Bayes classifier with feature selection. From
the 6,461 instances and 16 attributes, 7 features were selected for use with
the Naïve Bayes classifier: protocol_kind, flag, source_bytes, count, service
count, service and class.
The accuracy of the model is shown in the following table 5.10.
Table 16 Detailed Accuracy by Class using Naïve Bayes classifier with Feature
selection.
The above table shows that, of the 6,461 records in the dataset, 6,080 (94.1%)
were correctly classified and the remaining 381 (5.9%) were incorrectly
classified.
The classification accuracy of the model is shown in the following table 5.11.
Table 17 Detailed Accuracy by Class using Naïve Bayes classifier with Feature
selection with 75%-25% percentage split.
The above table shows that 1,615 records were used as the testing dataset. Of
these, 1,510 (93.5%) were correctly classified and the remaining 105 (6.5%)
were misclassified.
Comparing the classification accuracy of Naïve Bayes with selected features
under the 75%-25% percentage split against Naïve Bayes with selected features
under 10-fold cross validation, the cross validation model with selected
features gives the better classification result.
Classified as   Normal   R2L   DOS    Probe   U2R   Sum
Normal          2798     384   0      0       23
R2L             6        448   0      0       18
DOS             0        0     2377   0       0
Probe           0        7     0      428     0
U2R             3        0     0      0       29
Sum
Table 18 Confusion Matrix using Naïve Bayes classifiers with Feature selection.
The detailed classification accuracy of the algorithms applied in this research
is shown in table 5.13 below.
Table 19 Comparison of the confusion matrix result for J48 and Naïve Bayes
Algorithms
As shown in table 5.13, of all eight experiments, the J48 decision tree with
10-fold cross validation achieved the best classification accuracy in
identifying traffic as either normal or attack (DOS, U2R, R2L and Probe).
The reason the J48 decision tree performs better than Naïve Bayes is the linear
nature of the dataset. This means there is a clear segregation point that the
algorithm can define to predict the class of a particular network intrusion.
The other reason Naïve Bayes scores a lower accuracy than the J48 decision tree
is that the class conditional independence assumption may not hold for some
attributes, which causes a loss of accuracy. In addition, the J48 decision tree
is easier to interpret and implement because it is more self-explanatory. It
can handle a large number of features and generates rules that can be converted
into simple, easy to understand if-then-else classification rules.
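To make the class conditional independence assumption concrete, the following Python sketch scores each class as the product of the class prior and one conditional probability per attribute, treating the attributes as independent given the class. It is a minimal illustration and not WEKA's Naïve Bayes implementation; the toy records, the Laplace smoothing and all names are assumptions made for the example.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies for each class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen

def predict(model, row):
    """Pick the class that maximizes log P(c) + sum over j of log P(x_j | c)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for j, value in enumerate(row):
            # Laplace smoothing keeps an unseen value from zeroing the product.
            numerator = value_counts[(label, j)][value] + 1
            denominator = count + len(values_seen[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy records of (protocol_kind, flag); the values are made up for illustration.
rows = [("tcp", "SF"), ("tcp", "SF"), ("udp", "SF"), ("tcp", "REJ"), ("tcp", "REJ")]
labels = ["normal", "normal", "normal", "attack", "attack"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("udp", "SF")), predict(model, ("tcp", "REJ")))
```

Because each attribute contributes its own independent factor, two strongly correlated attributes are effectively counted twice, which is one way the assumption can cost accuracy.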
Figure 16 TP graph
Figure 17 FP graph
Evaluation of the discovered knowledge.
Rule Generated
In this section the model selected from the 8 experiments conducted in this
study is evaluated. As discussed before, one model achieved the best
classification performance: the J48 decision tree algorithm with 10-fold cross
validation gives the best accuracy in predicting the class category of newly
arriving intrusions. Some of the rules generated from the selected model are
the following.
Rule 6. If protocol_kind = tcp and flag = SF and Duration = '(4.5-inf)' and
destination_host_service_different_host_rate = '(-inf-0.005]' and
destination_host_Service Error_rate = '(-inf-0.005]' and service = telnet and
source_bytes = '(146.5-1030.5]': then R2L (8.0/3.0). (In this case the packet is
72.7 % R2L one).
Rule 7. If Protocol_kind = udp and source_bytes = '(36.5-102]': then normal
(239.0). (In this case the packet is 100 % normal one).
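Rules of this form translate directly into if-then code. The Python sketch below encodes Rule 6 and Rule 7 as an illustration of how such rules could be applied to a packet record; the dictionary keys are normalized versions of the attribute names (an assumption), and the functions are hypothetical, not output of WEKA.

```python
def rule_6(packet):
    """Rule 6: long telnet session over tcp with flag SF and low error rates is R2L."""
    if (
        packet["protocol_kind"] == "tcp"
        and packet["flag"] == "SF"
        and packet["duration"] > 4.5
        and packet["destination_host_service_different_host_rate"] <= 0.005
        and packet["destination_host_service_error_rate"] <= 0.005
        and packet["service"] == "telnet"
        and 146.5 < packet["source_bytes"] <= 1030.5
    ):
        return "R2L"
    return None  # the rule does not apply

def rule_7(packet):
    """Rule 7: udp traffic with source_bytes in (36.5, 102] is normal."""
    if packet["protocol_kind"] == "udp" and 36.5 < packet["source_bytes"] <= 102:
        return "normal"
    return None  # the rule does not apply

def classify(packet, rules=(rule_6, rule_7)):
    """Return the label of the first rule that fires, or None if no rule applies."""
    for rule in rules:
        label = rule(packet)
        if label is not None:
            return label
    return None

# Hypothetical packet records used only to exercise the two rules.
telnet_packet = {
    "protocol_kind": "tcp", "flag": "SF", "duration": 12,
    "destination_host_service_different_host_rate": 0.0,
    "destination_host_service_error_rate": 0.0,
    "service": "telnet", "source_bytes": 300,
}
udp_packet = {"protocol_kind": "udp", "source_bytes": 50}
print(classify(telnet_packet), classify(udp_packet))
```

The interval bounds follow the discretized ranges in the rules: a round bracket excludes the bound and a square bracket includes it.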
Among the sample rules above, some are prevailing rules (that is, already known
rules) and some are interesting rules (that is, new rules). To decide which is
which, the researcher consulted domain experts among the University of Gondar
network engineers. The researcher also referred to the Cisco IOS firewall IDS
(Cisco, 2012) and the IronPort security device. Rules 1, 4 and 7 are prevailing
rules while the rest are interesting rules.
Chapter six
Intrusion detection systems are security management systems that are used to
discover inappropriate, incorrect, or anomalous activities within computers or
networks. With the rapid growth of the Internet, malicious behaviors are
increasing at a fast pace and can easily cause millions of dollars in damage to
an organization. Hence, the development of intrusion detection systems has been
given the highest priority by governments, research institutes and commercial
corporations. Over the past years, intrusion detection systems have taken a
variety of approaches to the task of detecting intruders' activities. To
develop these systems, data are collected and then fed into the overall design
process. However, these data sources have problems such as irrelevant and
redundant features, uncertainty, and ambiguity. These problems not only slow
down detection but also degrade the detection performance of intrusion
detection systems. Data mining aims to identify valid, novel, potentially
useful, and ultimately understandable patterns in massive data. There is
therefore strong demand for applying data mining techniques to detect various
intrusions.
In the last several years, some exciting and important advances have been made
in intrusion detection using data mining techniques. Research results have been
published and some prototype systems have been established. Inspired by the
huge demands from applications, the interactions and collaborations between the
communities of security and data mining have been boosted substantially.
In this study, an attempt has been made to use DM technology to detect and
predict intrusions in the networking industry. The study undertakes a
retrospective data analysis following the Hybrid DM model. Hybrid models
combine aspects of both academic and industrial models, resulting in a more
general, research-oriented description of the steps. The dataset in this study
is taken from the University of Gondar data center network appliance. After
collection, the data was preprocessed. The major preprocessing activities
include filling in missing values, removing outliers and resolving
inconsistencies.
The model created using the J48 decision tree algorithm with the default
parameter values and 10-fold cross validation showed the best classification
accuracy. The model has a prediction accuracy of 94.3972% on the training
dataset in classifying instances as normal, DOS, U2R, R2L and probe classes.
The findings of this study have shown that data mining methods generate
interesting rules that are crucial for intrusion detection and prevention in
the networking industry.
To sum up, the results of this study can contribute towards improving the
network security of the University of Gondar. The study has shown that it is
promising to identify network traffic as either normal or attack (DOS, U2R,
Probe and R2L) and to put forward concrete mechanisms for detecting and
preventing attacks using appropriate data mining approaches.
6.2. Recommendations
1. In this research the results show that the J48 algorithm with 10-fold
cross validation gives a better result than the other classification
algorithms.
2. This study was carried out using classification algorithms such as the J48
decision tree and Naïve Bayes. Further investigation needs to be done
using other classification algorithms, such as Neural Networks and
Support Vector Machines, to explore to what extent the performance of the
NIDS can be improved.
3. Conduct similar research on larger datasets.
4. Conduct similar IDS research at Ethiopian governmental and non-
governmental organizations, such as INSA and universities, in order to
generalize about IDS in the country.
5. Further investigation should be done to turn network intrusion
detection into an intrusion prevention system.
6. I recommend that the University of Gondar implement the design below.
Internet Firewall (ASA 5520): is located at the outermost edge of the network
and is configured to handle L4 traffic from/to the ISP network. The internet
firewall contains a user-predefined access-list to filter and NAT packets
forwarded to the appliance. In addition, a DMZ (De-Militarized Zone) is
configured and enabled on the internet firewall to serve the university's web
related services.
The internet firewall cannot detect intrusion attacks from the outside world,
so the network is vulnerable to attacks of this kind. Without an Intrusion
Detection System, our Edge-VSS, Core-VSS and, last but not least, our
Serverfarm-VSS are at high risk of being attacked by intruders.
The IDS will have two physical interfaces and one virtual interface to perform
the above-mentioned functions. The first physical port, which is connected to
the Internet firewall, will receive/forward packets to be inspected from/to the
ISP. The second physical interface, which is connected to the Edge-VSS, will
receive/forward packets to be inspected from/to the UoG internal network. The
third port, which is virtual, uses a different segment of the network to alert
the network administrator via the management switch.
The Core VSS: is located at the center of the UoG network and interconnects all
the distribution switches from all campuses. The core is a high-end switch
capable of routing and forwarding the traffic of over 40,000 users without any
performance degradation. The Core-VSS is also connected to the Server Farm-VSS
to connect users with the University's enterprise servers.
The Server Farm VSS: is located next to the Core switch and interconnects all
the University's enterprise servers to the Core VSS. The server farm has
modular FWSM and IDSM modules to prevent attacks before they reach the servers.
REFERENCES
[1] Cordesman, Anthony, Cordesman, and G Justin, "Cyber-threats, Information Warfare, and
Critical Infrastructure Protection," 2002.
[2] APEC, "PEC Strategy to Ensure Trusted, Secure and Sustainable Online Environment," 2005.
[4] Eleazar E, Andrew A, Michael P, Leonid P, and Sal S, "A Geometric Framework for
Unsupervised Anomaly Detection: Detecting Intrusion in Unlabeled Data," Columbia
University, 2002.
[5] Aikaterini M and Christos D, "DoS attacks and defense mechanisms: classification and state-
of-the-art of Computer Networks," The International Journal of Computer and
Telecommunications Networking, vol. 44, no. 5, pp. 643-666, 2004.
[6] Lee W, Stolfo S, and Mok K, "Data Mining Framework for Building Intrusion Detection
Model," IEEE , pp. 120-132, 1999.
[7] Hershkop S et al., "A data mining approach to host based intrusion detection," CUCS
Technical Report, 2007.
[8] Rebecca B and Peter M, "Special Publication On Intrusion Detection System," NIST National
Institute of Standards and Technology, USA, p. 48 , 2000.
[9] Marcus and J. Ranum, "Intrusion Detection: Challenges and myths," SANS Institute Reading
Room site, pp. 8-9, 2000.
[10] Lappas T, "Data Mining Techniques for (Network) Intrusion Detection System," International
Journal of Computer Science and Engineering ( IJCSE ) , vol. 1, no. 1, 2007.
[11] Dewan Md et al., "Attacks Classification in Adaptive Intrusion Detection using Decision Tree
," World Academy of Science, Engineering and Technology, 2010.
[12] Shruti Nagpal, "Classification, Clustering And Intrusion Detection System," International
Journal of Engineering Research and Applications, vol. 2, no. 2, pp. 961-964, 2012.
[13] Marcos M, Campos Boriana, and Milenova L, Creation and Deployment of Data Mining-Based
Intrusion Detection Systems in Oracle Database 10g , Oracle Data Mining Technologies. USA,
2005.
[14] Zewdie Mossie, "Optimal feature selection for Network Intrusion Detection: A Data Mining
Approach," Addis Ababa University, 2011.
[15] Tigabu Dagne, "Constructing Predictive Model for Network Intrusion Detection," Addis
Ababa University, 2012.
[16] Cios K, Pedrycz W, Swiniarski R, and Kurgan L, "Data Mining - A Knowledge Discovery
Approach," pp. 17-18, 2007.
[17] Pakkur T, Avadhani, Vishal, and Prudhvi, "Approaches and Data Processing Techniques for
Intrusion Detection Systems," vol. 181-186, no. 12, p. 9, 2009.
[18] Sterry Brugger, "Data Mining Methods for Network Intrusion Detection," University of
California, 2004.
[19] Denning D, "An Intrusion Detection Model," Transactions on Software Engineering, IEEE
Communication Magazine, no. 13, pp. 222-232, 1987.
[21] James C and Jay H, "A comparative analysis of current
intrusion detection technologies," in Proc. 4th Technology for Information Security
Conference (TISC'96), Houston, 1996.
[22] Mansour E, Safavi-Naini, and Josef Pieprzyk, "Computer intrusion detection: A comparative
survey Center for Computer Security Research," University of Wollongong, Australia,
Technical Report 95-07 NSW 2522, 1995.
[23] Jeremy Frank, "Artificial intelligence and intrusion detection: Current and future directions,"
in Proc. 17th National Computer Security Conference, Baltimore, 1994.
[24] Gunar Liepins and H.S. Vaccaro, "Anomaly detection: Purpose
and framework," in Proc. 12th National Computer Security Conference, 1989, pp. 495-504.
[25] Teresa F. Lunt, "Automated audit trail analysis and intrusion detection: A survey," in Proc.
11th National Computer Security Conference, Baltimore, 1988.
[26] McAuliffe N et al., "Is your computer being misused? A survey of current intrusion detection
system technology," in Proc. 6th Annual Computer Security Applications Conference, 1990.
[27] Allen J et al., "State of the Practice of Intrusion Detection Technologies," Carnegie Mellon
University, 2000.
[29] Anderson, "Computer Security Threat Monitoring and Surveillance.," Washington, 1980.
[30] Anderson D, Frivold T, and Valdes A, "Next generation Intrusion Detection Expert System
(NIDES)," SRI International, SRI-CSL-95-07, 1995.
[31] Lunt T, "IDES: A progress Report," in Proc. 6th Annual Computer Security Applications
Conference, 1990.
[32] Sebring M, Shellhouse E, Hanna M, and Whitehurst R, "Expert Systems in Intrusion Detection,"
in Proceedings of the 11th National Computer Security Conference, 1988.
[34] Vaccaro H and Liepins G, "Detection of anomalous computer session activity," in Proceedings
Symposium on Research in Security and Privacy, Oakland, 1989.
[35] Jackson K, DuBois D, and Stallings C, "A Phased Approach to Network Intrusion Detection," in
Proceedings of the United States Department of Energy Computer Group Conference, 1991.
[36] Crosbie M and Spafford E, "Defending a Computer System Using Autonomous Agents," in
Proceedings of the Eighteenth National Information Systems Security Conference, Baltimore,
1995.
[37] Javitz H and Valdes A, "The SRI IDES Statistical Anomaly Detector," in IEEE Symposium on
Security and Privacy, Oakland, 1991.
[38] Cheung S and Levitt k, "Protecting Routing Infrastructures from Denial of Service Using
Cooperative Intrusion Detection," in Proceedings New Security Paradigms Workshop,
Cumbria, 1997.
[39] Anderson R and Khattak A, "The Use of Information Retrieval Techniques for Intrusion
Detection," in Proceedings of RA ID, Louvain-la-Neuve, 1998.
[40] Lodin S. (2013, May) Intrusion Detection Product Evaluation Criteria. [Online].
http://www.docshow.net/ids.html
[41] Sundaram A, "An Introduction to Intrusion Detection," The ACM Student Magazine, vol. 2, no.
4, 1996.
[42] Lindquist U and Jonsson E, "How to Systematically Classify Computer Security Intrusions," in
Proceedings IEEE Symposium Research in Security and Privacy, Oakland, 1997.
[43] Neumann P and Porras A, "Experience with EMERALD to Date," in 1st USENIX Workshop on
Intrusion Detection and Network Monitoring, California, 1999, pp. 73-80.
[44] Enterasys Networks, "Intrusion Detection System: Hackers Are Getting Smarter," in Enterasys
Networks, 2001.
[45] Fayyad U, Piatetsky G, and Smyth P, "The KDD process for Extracting Useful Knowledge from
Volumes of Data," Communications of the ACM, vol. 39, pp. 27-34, 1996.
[46] Piatetsky S and Frawley W, "Knowledge Discovery in Databases," AAAI/ MIT Press, 1991.
[48] Chapman P et al., "CRISP-DM 1.0 step-by-step data mining guide," 2003, CRISP-DM.
[49] Chang-Tien L, Arnold P, and Prajwal M, "Exploiting Efficient Data Mining Techniques to
Enhance Intrusion Detection Systems," IEEE, pp. 512-517, 2005.
[50] Witten I and Frank E, Data Mining: Practical Machine Learning Tools and Techniques.
Massachusetts, 2005.
[51] Jose F, "Data clustering for anomaly detection in Network Intrusion detection," Research
Alliance in Math and Science, pp. 1-12, 2009.
[52] Meng J, "The Application on Intrusion Detection Based on K-Means Cluster Algorithm,"
International Forum on Information Technology and Application, no. 15-17, pp. 150-152,
2009.
[53] Meera G, Gandhi, and Srivatsa S, "Adaptive Machine Learning Algorithm (AMLA) Using J48
Classifier for an NIDS Environment," Advances in Computational Sciences and Technology,
vol. 3, pp. 291-304, 2010.
[54] Hendrickx I and Vanden B, "Hybrid Algorithms with Instance-based Classification," Tilburg
University, pp. 158-169, 2005.
[55] Kayacik H, ZincirHeywood A, and Heywood M, "On the Capability of an SOM Based Intrusion
Detection System," vol. 3, pp. 1808-1813, Jul 2003.
[56] Quinlan J, "Induction of decision trees. ," Machine Learning, no. 1, pp. 81-106, 1986.
[58] Breiman L, Friedman, Olshen R, and Stone C, "Classification and Regression Trees," Chapman
and Hall/CRC, 1984.
[61] Witten I and Frank E, Data Mining: Practical Machine Learning Tools and Techniques, 2nd
ed. Massachusetts, 2005.
[62] Agrawal R, Imielinski T, and Swami A, "Database mining: A performance perspective," IEEE
Transactions on Knowledge and Data Engineering, vol. 6, no. 5, pp. 914-925, 1993.
[63] Liu H and Motoda, Feature Selection for Knowledge Discovery and Data Mining. Boston:
Kluwer Academic Publishers, 1998.
[64] Blum A and Langley P, "Selection of relevant features and examples in machine learning,"
Artificial Intelligence, vol. 97, pp. 245-271, 1997.
[65] M. Ben-Bassat.(1982), , Krishnaiah P and Kanal L, Eds. North Holland., Pattern recognition
and reduction of dimensionality.
[66] John G, Kohavi R, and Pfleger K, "Irrelevant feature and the subset selection problem," in
Proceedings of the Eleventh International Conference on Machine Learning, 1994, pp. 121-
129.
[67] Dash K, Choi P, and Scheuermann H, "Feature selection for clustering - a filter solution," in
Proceedings of the Second International Conference on Data Mining, 2002, pp. 115-122.
[68] Leopold E and Kindermann J, "Text categorization with support vector machines. how to
represent texts in input space? ," Machine Learning, vol. 46, pp. 423-444, 2002.
[69] Ru A, i Huang T, and Chang S, "Image retrieval: Current techniques, promising directions and
open issues," Visual Communication and Image Representation, vol. 10, pp. 39–62, 1999.
[70] Ng K and Liu H, "Customer retention via data mining. AI Review," vol. 14, pp. 569 – 590,
2000.
[71] Xing, Jordan M, and Karp R, "Feature selection for high-dimensional genomic microarray
data," in Proceedings of the Eighteenth International Conference on Machine Learning,
2001, pp. 601-608.
[72] Kohavi R and John G, "Wrappers for feature subset selection," Artificial Intelligence, vol. 1,
pp. 273–324, 1997.
[73] Blum A and Rivest R, "Training a 3-node neural networks is NP-complete. Neural Networks ,"
vol. 5, pp. 117 – 127, 1992.
[74] Dash M and Liu H, "Feature selection for classification," Intelligent Data Analysis: An
International Journal, vol. 1, pp. 131-156, 1997.
[75] Langley P, "Selection of relevant features in machine learning," in Proceedings of the AAAI
Fall Symposium on Relevance, 1994, pp. 140-144.
[76] Miller A, Subset Selection in Regression, 2nd ed.: Chapman & Hall/CRC, 2002.
[77] Hastie T and Tibshirani R, The Elements of Statistical Learning.: Friedman, 2001.
[78] Doak J, "An evaluation of feature selection methods and their application to computer
security," University of California Department of Computer Science, CA, 1992.
[79] Caruana R and Freitag D, "Greedy attribute selection," in Proceedings of the Eleventh
International Conference on Machine Learning, 1994, pp. 28-36.
[80] Das S, "Wrappers and a boosting-based hybrid for feature selection," in Proceedings of the
Eighteenth International Conference on Machine Learning, 2001, pp. 74-81.
[81] Narendra M and Fukunaga K, "A branch and bound algorithm for feature subset selection,"
IEEE Trans. on Computer, vol. 26, pp. 917-922, 1977.
[82] Liu H and Motoda H, "Feature Selection for Knowledge Discovery and Data Mining," Kluwer
Academic Publishers, 1998.
[84] Almuallim H and Dietterich T, "Learning with many irrelevant features," in Proceedings of
the Ninth National Conference on Artificial Intelligence, 1991, pp. 547-552.
[85] Dy J and Brodley C, "Feature subset selection and order identification for unsupervised
learning," in Proceedings of the Seventeenth International Conference on Machine
Learning, 2000, pp. 247-254.
[87] Krügel C and Toth T, "Distributed Pattern Detection for Intrusion Detection," in Conference
Proceedings of the Network and Distributed System Security Symposium NDSS '02, 2002.
[89] Ragsdale D, Carver C, Humphries J, and Pooh U, "Adaptation techniques for intrusion
detection and intrusion response systems ," in Proceedings of the IEEE International
Conference on Systems, Man and Cybernetics, 2000, pp. 2344-2349.
[90] Spafford E and Zamboni D, "Intrusion detection using autonomous agents, Computer
Networks ," , 2000, pp. 547-570.
[91] Cios J, Pedrycz W, Swiniarski R, Kurgan W, and Lukasz A, Data Mining: A knowledge Discovery
approach.: New York: Springer-Verlag Science Business Media, 2001.
[92] Han J and Kamber M, Data Mining: Concepts and Techniques. New York. USA: Morgan
Kaufmann, 2001.
[93] Witten I et al., Practical Machine Learning Tools and Techniques with Java Implementations.,
200.
http://www.uog.edu.et
Appendix
DURATION
The following series of activities will be performed to accomplish the proposed
model for the Network Intrusion Detection System (NIDS):
No.  Task                                      Start time          End time
1    Proposal Preparation                      October 1, 2012     November 31, 2012
2    Intensive reading of secondary            December 1, 2012    January 25, 2013
     resource and literatures
3    Dataset preparation and processing        January 26, 2012    March 17, 2013
4    Architectural Model construction          November 18, 2012   December 01, 2012
5    Applying basic features and model         December 02, 2012   December 12, 2012
     construction for NIDS
6    Developing prototypes                     December 13, 2012   December 25, 2012
7    Evaluating system performance             December 26, 2012   January 06, 2013
8    Prototype Evaluation                      January 07, 2012    January 20, 2013
9    Preparation and submission of thesis      January 21, 2013    January 25, 2013
     draft document
10   Final thesis submission                   January 26, 2013    February 15, 2013
BUDGET
Budget Summary
No.  Item Description               Amount (Birr)
1    Stationery materials           4,275.00
2    Printing service               4,075.00
3    Secretarial Service            1210.50
4    Internet Service and Books     3,579.5
5    Communication                  4,689.56
6    Contingency cost (15 %)        2,674.43
     Grand Total                    20,503.99
Stationery materials
No  Item Description              Unit     Qty   Unit cost(Birr)   Total Cost(Birr)   Remark
1   Computer paper                Reams    6     110.00            660.00
2   External hard disk (500 GB)   Pieces   1     3,200.00          3,200.00
3   Note pad                      "        2     70.00             140.00
4   Compact Disk (CD-RW)          "        8     30.00             240.00
5   Pens                          "        7     5.00              35
    Total                                                          4,275.00
Printing service
No  Item Description   Unit    Qty     Unit cost(Birr)   Total Cost(Birr)   Remark
1   Journal articles   Pages   2000    1.50              3,000.00
2   Proposal           "       17      1.50              25
3   thesis             "       140*5   1.50              1,050.00
    Total                                                4,075.00
Secretarial service
No  Item Description   Unit    Qty   Unit cost(Birr)   Total Cost(Birr)   Remark
1   Proposal writing   Pages   17    3.00              51
2   Draft writing      "       170   3.00              510.00
3   Binding            set     6     14.75             649.50
    Total                                              1210.50
Communications
No  Item Description                   Unit     Qty    Unit cost(Birr)   Total Cost(Birr)   Remark
1   Telephone                          minute   2341   .91               2,130.31
2   Transportation for communicating   KM       1765   1.45              2,559.25
    advisor and other concerned
    bodies
    Total                                                                4,689.56