
Intrusion detection system

(CS6442)
SECURITY GOALS

Figure 1. Taxonomy of security goals


ATTACKS

Figure 2. Taxonomy of attacks with relation to security goals
Passive Versus Active Attacks
Security Services
Figure 3. Security services
Security Mechanism
Figure 4. Security mechanisms
Relation between Services and Mechanisms
Table 1. Relation between security services and mechanisms
Cryptography

Cryptography, a word with Greek origins, means “secret writing.” However, we use the term to
refer to the science and art of transforming messages to make them secure and immune to
attacks.
Symmetric key ciphers

Figure 5. General idea of symmetric-key cipher


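• If P is the plaintext, C is the ciphertext, and K is the key, then
Encryption: C = Ek(P)        Decryption: P = Dk(C)
where Dk(Ek(x)) = Ek(Dk(x)) = x.

• We assume that Bob creates P1; we prove that P1 = P:
Alice: C = Ek(P)        Bob: P1 = Dk(C) = Dk(Ek(P)) = P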
Kerckhoffs’s Principle

Based on Kerckhoffs’s principle, one should always assume that the adversary, Eve, knows
the encryption/decryption algorithm. The resistance of the cipher to attack must be based
only on the secrecy of the key.
Cryptanalysis

• As cryptography is the science and art of creating secret codes, cryptanalysis is the
science and art of breaking those codes.

Figure 6. Cryptanalysis attacks


Substitution ciphers

A substitution cipher replaces one symbol with another. Substitution ciphers can be
categorized as:

• Monoalphabetic ciphers

• Polyalphabetic ciphers.
Monoalphabetic Ciphers

In monoalphabetic substitution, the relationship between a symbol in the plaintext and a
symbol in the ciphertext is always one-to-one.

• Additive cipher
• Multiplicative cipher
• Affine cipher
Additive cipher

• When the cipher is additive, the plaintext, ciphertext, and key are integers in Z26.
Example
Use the additive cipher with key = 15 to encrypt the message
“hello”.
Solution
We apply the encryption algorithm to the plaintext, character by
character:
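The worked example above can be reproduced with a short sketch. It assumes the usual additive-cipher rule C = (P + k) mod 26 with a = 0, ..., z = 25, lowercase plaintext and uppercase ciphertext; the function names are illustrative and do not come from the slides.

```python
import string

ALPHABET = string.ascii_lowercase  # Z26: a=0, b=1, ..., z=25


def additive_encrypt(plaintext: str, key: int) -> str:
    """Encrypt letters with C = (P + key) mod 26; ciphertext is uppercase."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)
            out.append(ALPHABET[(p + key) % 26].upper())
        else:
            out.append(ch)  # leave non-letters unchanged (assumption)
    return "".join(out)


def additive_decrypt(ciphertext: str, key: int) -> str:
    """Decrypt letters with P = (C - key) mod 26; plaintext is lowercase."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            c = ALPHABET.index(ch)
            out.append(ALPHABET[(c - key) % 26])
        else:
            out.append(ch)
    return "".join(out)


print(additive_encrypt("hello", 15))  # WTAAD
print(additive_decrypt("WTAAD", 15))  # hello
```

Character by character: h (7) becomes 22 (W), e (4) becomes 19 (T), l (11) becomes 26 mod 26 = 0 (A), and o (14) becomes 29 mod 26 = 3 (D).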
Monoalphabetic Substitution Cipher

• Because additive, multiplicative, and affine ciphers have small key domains, they are
very vulnerable to brute-force attack.
• A better solution is to create a mapping between each plaintext
character and the corresponding ciphertext character. Alice and
Bob can agree on a table showing the mapping for each
character.

Figure 7. An example key for monoalphabetic substitution cipher


Polyalphabetic Ciphers

• In polyalphabetic substitution, each occurrence of a character may have a different
substitute. The relationship between a character in the plaintext and a character in the
ciphertext is one-to-many.

• Autokey Cipher
Example
• Assume that Alice and Bob agreed to use an autokey cipher
with initial key value k1 = 12. Now Alice wants to send Bob
the message “Attack is today”. Enciphering is done character
by character.
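A minimal sketch of this example, assuming the standard autokey construction: the first key value is k1, every later key value is the previous plaintext character, and each ciphertext character is (P + k) mod 26. Spaces are dropped before enciphering; the function names are illustrative.

```python
def _letters(text: str) -> list[int]:
    """Map a-z (case-insensitive) to 0-25 and drop everything else."""
    return [ord(ch) - ord("a") for ch in text.lower() if "a" <= ch <= "z"]


def autokey_encrypt(plaintext: str, k1: int) -> str:
    """Key stream is k1 followed by the plaintext itself, shifted by one."""
    p = _letters(plaintext)
    key_stream = [k1] + p[:-1]
    return "".join(chr((pi + ki) % 26 + ord("A")) for pi, ki in zip(p, key_stream))


def autokey_decrypt(ciphertext: str, k1: int) -> str:
    """Each recovered plaintext character becomes the next key value."""
    key = k1
    out = []
    for ci in _letters(ciphertext):
        pi = (ci - key) % 26
        out.append(chr(pi + ord("a")))
        key = pi
    return "".join(out)


print(autokey_encrypt("Attack is today", 12))  # MTMTCMSALHRDY
print(autokey_decrypt("MTMTCMSALHRDY", 12))    # attackistoday
```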
Playfair Cipher
• Let us encrypt the plaintext “hello” using the key in the following
table.
What is Intrusion?
• Intrusions: attempts to compromise the confidentiality, integrity, or availability of a
computer system or network, or to bypass its security mechanisms (illegal access).

• Intruders are classified into two groups.

– External intruders (Masquerader) do not have any authorized access to the system.

– Internal intruders (Misfeasor) have at least some authorized access to the system. They
misuse their privileges or attempt to gain additional privileges for which they are not
authorized.
Solutions to Intrusions
• Firewall: A firewall filters network traffic between your network and the outside
network based on rules configured by the administrator.

• Intrusion Detection System: Traffic that passes the firewall may still contain malicious
data. An IDS is software that monitors the events occurring in a computer system or
network and analyzes them for signs of possible intrusions (incidents).

• Intrusion Prevention System: An IPS is software that has all the capabilities of an
intrusion detection system and can also attempt to stop possible incidents.
Firewall
• Interconnects two networks with differing trust levels.
• Inspects traffic passing between the two networks.
• Allows or blocks traffic based on rules.
• It works like a security guard.
• We specify rules based on the following information:
• IP source and destination addresses
• Transport protocol (TCP, UDP, or ICMP)
• TCP/UDP source and destination ports
• ICMP message type
• Packet options (Fragment Size etc.)

• Actions Available
– Allow the packet to go through
– Drop the packet
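The rule fields and actions listed above can be sketched as a tiny packet filter. This is an illustration only: the `Packet` and `Rule` types, the first-match-wins ordering, and the default-drop policy are assumptions, and the addresses come from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str                   # "tcp", "udp", or "icmp"
    dst_port: Optional[int] = None  # None for ICMP


@dataclass(frozen=True)
class Rule:
    action: str                     # "allow" or "drop"
    src_net: str = "0.0.0.0/0"      # default matches any source
    dst_net: str = "0.0.0.0/0"      # default matches any destination
    protocol: Optional[str] = None  # None matches any protocol
    dst_port: Optional[int] = None  # None matches any port

    def matches(self, pkt: Packet) -> bool:
        return (
            ip_address(pkt.src_ip) in ip_network(self.src_net)
            and ip_address(pkt.dst_ip) in ip_network(self.dst_net)
            and self.protocol in (None, pkt.protocol)
            and self.dst_port in (None, pkt.dst_port)
        )


def filter_packet(pkt: Packet, rules: list[Rule], default: str = "drop") -> str:
    """Return the action of the first matching rule (assumed policy)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


RULES = [
    Rule("drop", src_net="203.0.113.0/24"),  # block a hostile network
    Rule("allow", dst_net="192.0.2.10/32", protocol="tcp", dst_port=80),
]

print(filter_packet(Packet("198.51.100.7", "192.0.2.10", "tcp", 80), RULES))  # allow
print(filter_packet(Packet("203.0.113.9", "192.0.2.10", "tcp", 80), RULES))   # drop
```

Note that every decision uses header fields only, which is why such a filter cannot see a malicious payload.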
What a firewall can and cannot do

• A firewall cannot protect against insider attacks.

• A firewall can take action on traffic that passes through it, but it cannot do anything
about traffic that does not pass through it.

• A firewall cannot protect you against completely new attacks.

• A firewall can make decisions based on source IP address, destination IP address, source
port number, and destination port number, but it cannot inspect the payload or data
inside the packet, which may contain a virus.
Intrusion Detection System (IDS)
• Intrusion detection system (IDS) monitors the network traffic and system-level
applications to detect malicious activities in the network

• An IDS detects attacks as soon as possible and takes appropriate action.

• An IDS does not usually take preventive measures when an attack is detected.

• It is a reactive rather than a pro-active agent.

• It can deal with insider and outsider attacks.

• The most popular way to detect intrusions has been using the audit data generated
by the operating system.

• An audit trail is a record of activities on a system that are logged to a file in
chronological order.
IDS Requirements

• Run continually with minimal human supervision


• Be fault tolerant
• Resist subversion
• Minimal overhead on system
• Scalable, to serve a large number of users
• Provide graceful degradation of service
• Configured according to system security policies
• Allow dynamic reconfiguration
IDS Architecture
The roles performed by, and relationships among, machines, devices,
applications, and processes, including the conventions used for
communication between them, define the architecture of an IDS for a
big organization.

• Single-Tiered Architecture
– The most basic one.
– Components of the IDS or IPS collect and process data themselves.
– Simple system with low cost.
– Not efficient for sophisticated functionality.
– An example is a host-based IDS tool that takes the output of system logs (utmp
and wtmp files on Unix systems) and compares it to known patterns of
attack.
• Multi-Tiered Architecture
• Peer-to-Peer Architecture
Multi-Tiered Architecture
A multi-tiered architecture involves multiple components passing information to each
other. It consists of three primary components: sensors, agents, and a manager.

• Sensor or agent: like a logger; it gathers data for analysis.

• Management server: like an analyzer; it analyzes data obtained from the sensors
according to its internal rules.
– These are configured to run in the particular operating environment in which they are placed.
– Normally specialized to perform one and only one function.
– Example: TCP traffic examination, FTP connections and connection attempts; third-party tools
such as network-monitoring and connection-tracing tools can be used.

• Database server: a repository for event information recorded by sensors or the
management server.

• Console: obtains results from agents and takes some action
– May simply notify the security officer
– May reconfigure agents or the director to alter collection and analysis methods
– May activate a response mechanism
– May retrieve additional information relevant to the incident
– May send commands to a firewall or router to change its access control list
Connection among IDPS components
• IDPS components can be connected to each other through regular networks or a separate
network designed for security software management known as a management network.

• If a management network is used, each sensor or agent host has an additional network
interface, known as a management interface, that connects to the management network.

• The hosts are configured so that they cannot pass any traffic between management interfaces
and other network interfaces.

• The management servers, database servers, and consoles are attached to the management
network only.

• This architecture effectively isolates the management network from the production networks,
concealing the IDPS from attackers and ensuring that the IDPS has adequate bandwidth to
function under adverse conditions (e.g. worm attack, DDoS).

• This involves additional cost for networking equipment and other hardware, and it is
inconvenient for IDPS administrators, who must use separate computers for IDPS management
and monitoring.

• An alternative to a separate management network is a VLAN.


Sensors
• Obtains information and sends it to agents.
• It can gather information from network interfaces, system logs, personal
firewalls, etc.
• Two types:
– Network-based (programs or devices) and host-based sensors

• May put information into another form
– Pre-processing of records to extract relevant parts

• May delete unneeded information.

• A central collection point allows for greater ease in analyzing logs, as all
log information is available at one location.

• However, it is advisable to write log data to a different system from the one that
produced it.
Example

• The IDS uses failed login attempts in its analysis.

• The sensor scans the login log every 5 minutes and sends the agent, for each
new failed login attempt:
– Time of failed login
– Account name and entered password
• The agent requests all login records (failed or not) for a particular
user
– Suspecting a brute-force cracking attempt
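The sensor and agent roles in this example might be sketched as follows. The record layout, the threshold of five failures, and the function names are assumptions made for illustration; the entered password is deliberately left out of the record in this sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginRecord:
    timestamp: float  # seconds since some epoch
    account: str
    success: bool


def sensor_scan(log: list[LoginRecord], last_scan: float, now: float) -> list[LoginRecord]:
    """Sensor: report failed logins recorded since the previous scan."""
    return [r for r in log if not r.success and last_scan < r.timestamp <= now]


def agent_analyze(reported: list[LoginRecord], full_log: list[LoginRecord],
                  threshold: int = 5) -> list[str]:
    """Agent: for each reported account, pull all of its login records and
    flag it when the number of failures reaches the (assumed) threshold."""
    suspects = []
    for account in sorted({r.account for r in reported}):
        history = [r for r in full_log if r.account == account]
        failures = Counter(r.success for r in history)[False]
        if failures >= threshold:
            suspects.append(account)
    return suspects


log = [LoginRecord(10.0 * i, "alice", False) for i in range(1, 7)]
log += [LoginRecord(15.0, "bob", False), LoginRecord(20.0, "bob", True)]

reported = sensor_scan(log, last_scan=0.0, now=300.0)  # one 5-minute window
print(agent_analyze(reported, log))  # ['alice']
```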
Network-based Sensors
• Deployed as programs or network devices.

• Capture data from packets traversing a local Ethernet or token ring, or a network
switching point, in promiscuous mode.

• One sensor might be used to monitor all traffic coming into and out of a network (cost
effective), but it may miss some amount of data.

• Need not burden the network if configured properly (two network interfaces, one for
monitoring and the other for management).

• IP addresses are normally not assigned to the sensor network interfaces used to monitor
traffic.
• tcpdump (application) and libpcap (library) are examples of
programs used in IDS & IPS tools for capturing data from
network packets.
• tcpdump captures packets and prints the headers of those that
match a given Boolean filter expression.
• Useful packet parameters are time, source and destination
address, source and destination port, TCP flags, initial
sequence number from the source IP for the initial connection,
ending sequence number, number of bytes, and window size.
• libpcap is a library called by an application.
• It gathers packet data from the kernel of the OS and passes it to
the various applications of the IDS & IPS.
Host-based Sensors
• The network interface captures data packets in the same way as a network-based sensor.

• However, the network interface on each host is set to capture only data sent to that
particular host.
Agents
• Reduces information from sensors
– Eliminates unnecessary, redundant records

• Analyzes the remaining information to determine if an attack is under way
– The analysis engine can use a number of techniques, discussed
before, to do this

• Usually runs on a separate system
– Does not impact performance of monitored systems
– Rules and profiles are not available to ordinary users
Manager
• Accepts information from agents

• Takes appropriate action
– Notifies the system security officer
– Responds to the attack
– In case of intrusion, remotely changes policies and
parameters, and erases log files after they are archived
• Often GUIs
– Well-designed ones use visualization to convey information
Multi-Tiered Architecture: Pros & Cons
• Pros:
– Greater efficiency and depth of analysis, as each component of
the architecture performs the function it is designed for.
– Far more efficient than a single-tiered architecture.
– Able to define the security condition of an organization’s
network and its hosts better than a single-tiered
architecture.

• Cons:
– Increased complexity in setting up the components, interfaces,
and communication methods among them.
– Increased cost of day-to-day maintenance and more troubleshooting
challenges.
Peer-to-Peer Architecture
• Involves exchanging intrusion detection and prevention information between peer
components that perform the same type of function.
• Often used by cooperating firewalls.
• One firewall obtains information about events as they occur and passes this
information to another, causing a change in an access control list or the addition of
a restriction on proxied connections.
• Neither firewall acts as the central server or master repository of
information.
• Simpler to implement than a multi-tiered architecture.
• Any peer can participate and benefit from information gained by
other peers.
• Lacks specialized functionality compared to multi-tiered, due to the absence
of specialized components, but is better than a single-tiered architecture.
• Well suited for organizations that have invested enough to obtain and
deploy firewalls capable of cooperating with each other, but not in IDS &
IPS.
Types of IDS
• Monitoring Environment
– Host based, vs. network based

• Detection Model
– signature detection vs. anomaly detection

• Operation
– Off-line vs. real-time

• Architecture
– Centralized vs. distributed
Network-based IDPS
• A network-based IDPS monitors and analyzes network traffic
for particular network segments or devices to identify
suspicious activity.
• The IDPS network interface cards that will be performing
monitoring are placed into promiscuous mode so that they
accept all packets that they see, regardless of their intended
destinations.
• An appropriate network for the components (management
network/VLAN) needs to be decided.
• The sensor placement needs to be decided:
– Inline
– Passive
Inline sensor deployment
• An inline sensor is deployed so that the traffic it monitors passes through it.
• Some inline sensors are hybrid firewall/IDPS devices.
• The primary motivation for deploying sensors inline is to stop attacks by blocking traffic.
• Most techniques for having a sensor prevent intrusions require that the sensor be
deployed in inline mode.
Passive Sensor deployment
• A passive sensor is deployed so that it monitors a copy of the actual traffic; no traffic
passes through the sensor.
• Passive sensors can monitor traffic through various methods, including a switch spanning
port, a network tap, and an IDS load balancer.
• Passive techniques typically provide no reliable way for a sensor to block traffic.
• Spanning port:
– It can see all network traffic going through a switch.
– Connecting sensor to a spanning port can allow it to monitor traffic going to and
from many hosts.
– Easy and inexpensive monitoring method.
– Can be problematic if the switch is configured incorrectly, because the spanning port might
not be able to see all the traffic.
– If the switch is under heavy load, its spanning port might not be able to see all traffic
or might be disabled temporarily.
• Network Tap:
– It is a direct connection between a sensor and the physical network media itself,
such as a fiber optic cable.
• IDS load balancer:
– Aggregates and directs network traffic to monitoring systems based on a set of
rules. It can:
• Send all traffic to multiple IDPS sensors.
• Dynamically split the traffic among multiple IDPS sensors based on volume.
• Split the traffic among multiple IDPS sensors based on IP address, protocol, or other
characteristics.
– Diverting traffic to multiple IDPS sensors may reduce detection
accuracy if related events of a single attack are seen by different sensors.
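One way to split traffic by IP address while keeping related events on the same sensor is to hash the address pair. The sketch below is an illustration of that idea, not a description of any particular load balancer; the function name and the choice of SHA-256 are assumptions.

```python
import hashlib


def pick_sensor(src_ip: str, dst_ip: str, n_sensors: int) -> int:
    """Map an IP pair to a sensor index in a direction-independent way,
    so both directions of a conversation reach the same sensor."""
    key = "|".join(sorted((src_ip, dst_ip)))
    digest = hashlib.sha256(key.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % n_sensors


a = pick_sensor("198.51.100.7", "192.0.2.10", 4)
b = pick_sensor("192.0.2.10", "198.51.100.7", 4)
print(a == b)  # True: request and reply land on the same sensor
```

A volume-based split balances load better, but it can separate the packets of one attack across sensors, which is the accuracy risk noted above.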
Security capability
Network-based IDPSs provide a wide variety of security capabilities,
divided into four categories:
• Information gathering
– Identifying hosts (a list of hosts on the organization’s network, with IP and MAC addresses,
is prepared so that any new host can be detected)
– Identifying operating systems
• Identifies the OSs and their versions on the hosts through techniques such as checking the ports used on the
various hosts
• Analyzes packet headers for certain unusual characteristics
– Identifying applications
• Determines application versions by noting the port numbers used or through the communication traces
between application server and client. The noted version can be used to identify potentially vulnerable
applications and unauthorized use.
– Identifying network characteristics
• Collects information about network traffic, such as the configuration of network devices and hosts, to detect
changes.

• Logging
– IDPSs perform extensive logging of data related to detected events in order to confirm
the validity of alerts, investigate incidents, and correlate events between the
IDPS and other logging sources.
– Logged fields include timestamp, connection or session ID, network, transport, and application protocols,
IP addresses, port numbers, etc.
• Detection
• Prevention
• Detection:
– A NIDS offers broad detection capabilities using a combination of
techniques (signature-based, anomaly-based, stateful protocol analysis)

• Types of event detected:


– Application layer reconnaissance and attack (buffer overflows,
password guessing, malware transmission). Identify by analyzing
various application protocols like DNS, FTP, HTTP, IMAP, SMTP etc.
– Transport layer reconnaissance and attack (port scanning, unusual
packet fragmentation, SYN flood)
– Network layer reconnaissance and attack (spoofed IP address,
illegal IP header values)
– Policy violations (use of inappropriate web sites, use of forbidden
application protocol)

• Detection Accuracy:
– Has a high rate of false positives and false negatives.
– Requires considerable tuning and customization (thresholds for port
scans, blacklists and whitelists for host IP addresses, and alert settings)
according to the monitored environment.
• Technology limitations:
– Cannot analyze encrypted network traffic
• To ensure that sufficient analysis is performed on payloads within
encrypted traffic, IDPSs can be deployed to analyze the payloads
before they are encrypted or after they are decrypted.
– Handling high traffic loads
• For inline IDPS sensors, dropping packets also causes disruptions
in network availability, and delays in processing packets could
cause unacceptable latency.
• Under high load, a sensor can either pass certain types of traffic without
performing full analysis or drop low-priority traffic.
– Withstanding attacks against the IDPS itself
• Attackers can generate large volumes of traffic (such as DDoS
attacks), attempt blinding, or produce other anomalous activity aimed at the IDPS.
Types of IDS: Network-based
• PROS:
– Protect the whole network and detect network based attacks (like DOS)
– Broad in scope (watches all network activities)
– Easier setup: Easy to deploy
– Better for detecting attacks from the outside
– Less expensive to implement
– Detection is based on what can be recorded on the entire network
– Examines packet headers
– Near real-time response
– OS-independent
– Detects unsuccessful attack attempts

• CONS:
– Requires all traffic information.
– Generates an enormous amount of data to be analyzed
– Cannot monitor all traffic at high network traffic rates
– Cannot deal with encrypted network traffic
– Cannot detect system-specific attacks (like a trojan)
Host-based IDPS deployment architecture
Deployment options:

– Agents can be appliances or software installed on individual hosts.
– Appliances are deployed inline to monitor network traffic, as a NIDS is.
– However, they are more specialized than a standard NIDS because they monitor activity
for only one specific type of application (web server or database server,
FTP and DNS servers, e-commerce database servers, etc.).
– Agents are therefore deployed to protect critical hosts (a server, a client host, or
an application server).
– Cost, the OS of the appliance, the importance of host data, and infrastructure support
(bandwidth) are a few other considerations.
– Components communicate over the organization’s network in encrypted form.
– Agents are generally deployed using a shim (a layer of code).
– The shim intercepts data (network traffic, filesystem activity, system calls, e-mail
and web applications), analyzes it, and decides whether it should be allowed or not.
– Agents deployed without a shim are less effective at detecting threats and
cannot perform prevention activities.
Security capability
• Logging capability
– Timestamp, event type, event information (IP address, port
information, application info)
• Detection capability
– Types of event detected:
• Code analysis
– Code behavior analysis
– Buffer overflow detection
– System call monitoring
– Application list
• Network traffic analysis
• Network traffic filtering
• Filesystem monitoring
– File integrity checking
– File attribute checking
– File access attempts
• Log analysis
– Detection accuracy
• Accurate detection is more challenging because many detection
techniques have no knowledge of the context in which an event occurred.
Technology limitation
• Alert generation delays
• Centralized reporting delay
• Host resource usage
• Conflict with existing security control
• Rebooting hosts.
Prevention capability
• Code analysis
• Network traffic analysis
• Network traffic filtering
• Filesystem monitoring
Types of IDS: host-based
• PROS:
– No additional hardware
– Less volume of traffic so less overhead
– Better for detecting attacks from the inside
– Near real-time detection and response
– System specific activities
– Encrypted traffic is also available for analysis
– Lower entry cost

• CONS:
– Detection is based on what any single host can record
– Narrow in scope (watches only specific host activities)
– Reduces performance of the host system.
– Vulnerable when the host operating system itself is compromised
– More expensive to implement.
– Deployment is challenging
– OS dependent
– Does not see packet headers.
Evaluation criteria
• Accuracy
• Performance
• Completeness
• Timely response
• Adaptation and cost sensitivity
• Intrusion tolerance and attack resistance
Accuracy
• How correctly an IDS works.

• It is a measure of the percentage of detections and failures, as well as
the number of false alarms that the system produces.

• There are two target classes (normal and abnormal/intrusion).

• The actual percentage of abnormal data is much smaller than the
percentage of normal data, so intrusions are harder to detect.

• Excessive false alarms are the biggest problem facing IDSs.


False positive and negative
In intrusion detection, a positive instance is attack data, while a
negative instance is normal data.

Thus, the aim of an IDS is to produce as many TP and TN as possible, while trying
to reduce the number of both FP and FN.
• The big circle defines the space of the whole data (i.e., normal
and intrusive data)
• The small ellipse defines the space of all predicted intrusions
by the classifier.
– Thus, it will be shared by both TP and FP.
• The ratio between the real normal data and the intrusions is
graphically represented by the use of a horizontal line.
Confusion Matrix
• It is a ranking method applied to any kind of classification
problem. The size of the matrix is determined by the number
of distinct classes that are to be detected.

• A Confusion Matrix for intrusion detection is defined as a 2-by-2 matrix
whose four cells count the TP, FN, FP, and TN.
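As a small illustration, the four cells of the 2-by-2 matrix can be counted directly from actual and predicted labels. The sketch assumes labels are encoded as 1 for intrusion (positive) and 0 for normal (negative); the function name is illustrative.

```python
def confusion_matrix(actual: list[int], predicted: list[int]) -> dict[str, int]:
    """Count TP, FN, FP, TN for a two-class (intrusion vs. normal) problem."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["TP"] += 1  # intrusion correctly flagged
        elif a and not p:
            counts["FN"] += 1  # intrusion missed
        elif not a and p:
            counts["FP"] += 1  # false alarm
        else:
            counts["TN"] += 1  # normal data correctly passed
    return counts


actual = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1]
print(confusion_matrix(actual, predicted))
# {'TP': 2, 'FN': 1, 'FP': 1, 'TN': 2}
```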
Precision, Recall, and F-Measure
Under normal operating conditions there is a big difference
between the rate of normal and intrusion data. The Precision,
Recall, and F-Measure metrics ignore the normal data that has
been correctly classified by the IDS (TN), and focus on both the
intrusion data (TP+FN) and FP (also known as False Alarms) that
are generated by the IDS
Precision
• It is a metric defined with respect to the intrusion class. It shows how many
examples, predicted by an IDS as being intrusive, are the actual intrusions
• The aim of an IDS is to obtain a high Precision, meaning that the number
of false alarms is minimized.
precision = TP / (TP + FP), where precision ∈ [0,1]

• The main disadvantage of the metric is that it cannot express the
percentage of predicted intrusions relative to all the real intrusions that exist in the
data.
Recall
• This metric measures the missing part from the Precision;
namely, the percentage from the real intrusions covered by the
classifier. Consequently, it is desired for a classifier to have a
high recall value.

recall = TP / (TP + FN), where recall ∈ [0,1]

• This metric does not take into consideration the number of
False Alarms. Thus, a classifier can have at the same time both
good Recall and a high False Alarm rate.
The disadvantage of using only recall as a metric: The figure shows two
classifiers (IDSs) that have almost the same recall (i.e., very good detection
rate) but different precisions. While in the first case (a) the precision is low
(because of the high number of false alarms), in the second case (b), even
though the recall is a little lower, the number of false alarms is reduced.

• Furthermore, a classifier that blindly predicts all the data as being intrusive
will have a 100% Recall (but a very low precision).
F-Measure
• The F-Measure mixes the properties of the previous two
metrics, being defined as the harmonic mean of precision and
recall:
F-Measure = 2 · precision · recall / (precision + recall)

• F-Measure is preferred when only one accuracy metric is
desired as an evaluation criterion.
• Note that when Precision and Recall reach 100%, the F-Measure
is maximum (i.e., 1), meaning that the classifier has
0% false alarms and detects 100% of the attacks.
• Thus, the F-Measure of a classifier is desired to be as high as
possible.
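The three metrics can be computed from the TP, FP, and FN counts as in the sketch below. Returning 0.0 when a denominator is zero is a convention chosen here, not something the slides specify, and the counts in the examples are made up for illustration.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of raised alarms that are real intrusions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of real intrusions that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# A reasonable classifier: 80 detected, 20 missed, 20 false alarms.
p, r = precision(80, 20), recall(80, 20)
print(p, r, f_measure(p, r))

# A classifier that flags everything: 100 intrusions among 10,000 records.
p_all, r_all = precision(100, 9900), recall(100, 0)
print(p_all, r_all, f_measure(p_all, r_all))
```

The second case shows why recall alone is misleading: recall is 1.0, but precision is 0.01 and the F-Measure is below 0.02.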
ROC Curves
• In intrusion detection, ROC curves are used on the one hand to
visualize the relation between the TP and FP rate of a certain
classifier while tuning it, and on the other hand, to compare the
accuracy of two or more classifiers.
• The lower-left point (0,0) characterizes an IDS that classifies all the data as
normal all the time. Obviously in such situation, the classifier will have a
zero false alarm rate, but at the same time will not be able to detect
anything.

• The upper-right point (1,1) characterizes an IDS that generates an alarm for
each data that is encountered. Consequently, it will have a 100% detection
rate and a 100% false alarm rate as well.

• The line defined by connecting the two previous points represents any
classifier that uses a randomized decision engine for detecting intrusions.
Any point on this line can be obtained by a linear combination of the two
previously mentioned strategies. Thus, the ROC curve of a useful IDS will
always reside above this diagonal.

• The upper-left point (0,1) represents the ideal case when there is a 100%
detection rate while having a 0% false alarm rate. Thus the closer a point in
the ROC space is to the ideal case, the more efficient the classifier is.
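ROC points can be generated by sweeping a decision threshold over the classifier's anomaly scores. The following is a sketch under assumptions: each record has a numeric score and a 0/1 label (1 = intrusion), an alarm is raised when the score is at or above the threshold, and both classes are present. The area-under-curve helper is a common summary that is not discussed in the slides.

```python
def roc_points(scores: list[float], labels: list[int]) -> list[tuple[float, float]]:
    """Return (FP rate, TP rate) pairs, from (0, 0) up to (1, 1)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and not l)
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points: list[tuple[float, float]]) -> float:
    """Trapezoidal area under the ROC curve (1.0 is ideal, 0.5 is random)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
print(pts[0], pts[-1])        # (0.0, 0.0) (1.0, 1.0)
print(area_under_curve(pts))  # about 0.889
```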
Performance
• The quality of a NIDS is described by the percentage of true attacks
detected combined with the number of false alerts. However, even a
high-quality NIDS algorithm is not effective if its processing cost is
too high, since the resulting loss of packets increases the probability
that an attack is not detected.

• Besides the IDS configuration, a number of architectural and system parameters,
such as operating system structure, main memory bandwidth and
latency, and processor microarchitecture, contribute to a
system’s suitability.

• Performance depends not only on the processor, but to
a large extent also on the memory system.

• For a NIDS, performance can be evaluated as the system’s ability
to process traffic on a high-speed link with minimum packet loss
while working in real time.
• Schaelicke et al. [20] propose a methodology to measure the performance
of rule-based NIDSs. This study measures and compares two major
components of the NIDS processing cost on a number of diverse systems to
pinpoint performance bottlenecks and to determine the impact of operating
system and architecture differences.

• Given a fixed traffic load, the processing capability of a NIDS
depends on the type of the rules (header & payload) and the packet size.

• Since the size of the header is generally fixed, the overall processing cost of
applying header rules depends on the number of packets to be processed.

• For payload rules, the overall processing cost is determined by the size of the
packets.
• This example demonstrates that for small numbers of rules, nearly
no packets are lost, but when the number of rules exceeds the
maximum processing capability of the system the number of
dropped packets increases drastically.

• The measurement carried out by Schaelicke et al. was performed for
four different packet payload sizes: 64 bytes, 512 bytes, 1000 bytes,
and 1452 bytes. The NIDS being measured ran on a 100 Mbit link,
which was nearly saturated during the evaluation.
• According to this paper, the hardware platform is very important for NIDS
performance. General-purpose systems are
generally inadequate as a hardware platform for a NIDS, even on
moderate-speed networks, since the maximum number of rules they support is
much smaller than the total number of applicable rules.

• Different system parameters contribute to performance to different
degrees. Memory bandwidth and latency are the most
significant contributors, whereas the CPU is not a suitable predictor of NIDS
performance.

• The operating system also affects the performance of a NIDS. The experimental
results presented in this paper show that Linux significantly outperformed
FreeBSD because of its efficient interrupt handling.
Completeness
• It represents the space of the vulnerabilities and attacks that
can be covered by an IDS.

• This criterion is very hard to assess because having a global
knowledge about attacks or abuses of privileges is impossible.
Timely response
• An IDS that performs its analysis as quickly as possible will
enable the security officer or the response engine to promptly
react before much damage is done.
• However, there will always be a delay between the actual moment of the
attack and the response of the system (i.e., total delay) due to
computational time (data capture, feature extraction).
• There is no point in having a good detection rate if the
detection time takes hours or days.
Evaluation of Anomaly Detection Systems (Dokas et al.)
• The first derived metric corresponds to the surface area between the real
attack curve and the predicted attack curve. The smaller this surface area,
the better the intrusion detection algorithm.
• Burst detection rate (bdr) is defined for each burst. It is the
ratio between the number of intrusive network connections ndi within the
bursty attack whose score is higher than a pre-specified threshold
and the total number of intrusive network connections within the attack
interval.
• Response time represents the time elapsed from the beginning of the attack until the
moment when the first network connection has a score value higher than the
pre-specified threshold (tresponse).
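The burst detection rate and response time can be sketched as below, assuming each intrusive connection in a burst is available as a (timestamp, score) pair. The data layout, function names, and sample values are illustrative assumptions.

```python
from typing import Optional

Connection = tuple[float, float]  # (timestamp in seconds, anomaly score)


def burst_detection_rate(burst: list[Connection], threshold: float) -> float:
    """bdr = (intrusive connections scoring above the threshold)
             / (all intrusive connections in the burst)."""
    if not burst:
        raise ValueError("burst must contain at least one connection")
    detected = sum(1 for _, score in burst if score > threshold)
    return detected / len(burst)


def response_time(burst: list[Connection], attack_start: float,
                  threshold: float) -> Optional[float]:
    """Time from the start of the attack to the first connection whose score
    exceeds the threshold; None if the burst is never detected."""
    for ts, score in sorted(burst):
        if ts >= attack_start and score > threshold:
            return ts - attack_start
    return None


burst = [(10.0, 0.2), (11.0, 0.4), (12.0, 0.9), (13.0, 0.8), (14.0, 0.3)]
print(burst_detection_rate(burst, threshold=0.5))  # 0.4
print(response_time(burst, attack_start=10.0, threshold=0.5))  # 2.0
```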
Adaptation and cost sensitivity (Lee et al.)
• Intrusion detection systems (IDSs) must maximize the realization of
security goals while minimizing costs.
• The major cost factors associated with an IDS
– Damage cost (the amount of damage to the targeted resource)
– Response cost (the cost of responding to an attack)
– Operational cost (the cost of analyzing events using an IDS)
– Confidence metric
– Consequential cost (the cost associated with the prediction of an IDS). For
example, the false negative detection cost of the event e is equal to the
damage cost associated with event e.
• Detection of an attack is pointless if the operational cost of detecting
it is larger than its damage cost. Likewise, an intrusion with a higher
response cost than damage cost would not be responded to.
• The total cost of intrusion detection is defined as the sum of the
consequential and operational costs.
• A cost model is needed to estimate the total expected cost of
intrusion detection and to show the trade-off among all relevant cost
factors.
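A minimal sketch of such a cost model follows. Only two rules come from the slides: a false negative costs the damage cost, and an intrusion whose response cost exceeds its damage cost is not responded to. The costs assigned to the other outcomes (a successful response prevents all damage, a false positive triggers a response only when it looks worthwhile, a true negative costs nothing) are simplifying assumptions made here for illustration.

```python
def consequential_cost(outcome, damage, response):
    """Cost of one IDS prediction. outcome is 'FN', 'TP', 'FP' or 'TN'."""
    if outcome == "FN":
        return damage                      # missed attack: full damage (from the slides)
    if outcome == "TP":
        # respond only if responding is cheaper than the damage;
        # assumption: a response prevents all damage
        return response if response <= damage else damage
    if outcome == "FP":
        # assumption: a needless response is launched only if it looked worthwhile
        return response if response <= damage else 0
    if outcome == "TN":
        return 0                           # assumption: correct silence is free
    raise ValueError(f"unknown outcome: {outcome}")


def total_cost(events):
    """events: iterable of (outcome, damage, response, operational) tuples.
    Total cost = sum of consequential and operational costs."""
    return sum(consequential_cost(o, d, r) + op for o, d, r, op in events)


events = [
    ("FN", 100, 20, 1),
    ("TP", 100, 20, 1),
    ("TP", 10, 50, 1),   # response costs more than the damage: not responded to
    ("FP", 100, 20, 1),
    ("TN", 0, 0, 1),
]
print(total_cost(events))
```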
Taxonomy of anomaly Detection
Anomaly detection models
• Manually
– Time consuming
– Done by experts
– High cost (development and updating)
• Machine learning
– Constructs the model automatically
– Dataset contains instances described by attributes with labels
– Labels (continuous/categorical) (binary/types of attacks)
– Based on the label, three operating models are defined (supervised,
semi-supervised, unsupervised)
Classification of AIDS methods
Statistical Anomaly-based technique
• A Statistical Anomaly-based Intrusion Detection System (SABIDS) builds a
model of the normal behaviour profile, then detects low-probability events and
flags them as potential intrusions.
• Rather than inspecting data traffic, each packet is monitored, which
signifies the fingerprint of the flow.
• It finds out what normal network activity looks like (what sort of bandwidth
is generally used, what protocols are used, what ports and devices generally
connect to each other) and alerts the administrator or user when anomalous
traffic is detected.
• In SABIDS, malicious behavior is differentiated from normal behavior by
using statistical properties such as the mean and variance of normal
activities, and statistical tests that determine the deviation of activities
from normal behavior.
• A scoring mechanism is used to score an activity; when the calculated score
exceeds a certain threshold value, an alarm is generated.
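As a rough illustration of the scoring idea, the sketch below builds a profile from the mean and standard deviation of one measure and scores new observations by their distance from the mean. The use of a z-score, the threshold of 3 and the sample bandwidth values are assumptions chosen for the example; the slides do not prescribe a specific statistical test.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Normal-behaviour profile: (mean, standard deviation) of one measure."""
    return mean(normal_values), stdev(normal_values)


def anomaly_score(value, profile):
    """Distance of an observation from the normal mean, in standard deviations."""
    mu, sigma = profile
    if sigma == 0:
        raise ValueError("profile has zero variance; z-score is undefined")
    return abs(value - mu) / sigma


def is_alarm(value, profile, threshold=3.0):
    """Raise an alarm when the score exceeds the threshold."""
    return anomaly_score(value, profile) > threshold


# Hypothetical bandwidth samples (Mbps) observed during normal operation
profile = build_profile([100, 102, 98, 101, 99])
print(is_alarm(101, profile))  # close to the normal mean
print(is_alarm(110, profile))  # far from the normal mean
```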
• Univariate: “Uni” means “one”, so the data has only one variable. This
technique is used when a statistical normal profile is created for only one measure
of behaviour in computer systems. A univariate IDS looks for abnormalities in each
individual metric.
• However, intrusions often affect multiple measures of activities collectively.
Anomalies resulting from intrusions may cause deviations on multiple measures in a
collective manner rather than through separate manifestations on individual
measures.

• Multivariate: It is based on relationships among two or more measures in order to
understand the relationships between variables. This model is valuable if
experimental data show that better classification can be achieved from
combinations of correlated measures than from analysing them separately.

• The main challenge for multivariate statistical IDSs is that it is difficult to
estimate distributions for high-dimensional data.

• Time series model: A time series is a series of observations made over a certain
time interval. A new observation is abnormal if its probability of occurring at that
time is too low.
Univariate Vs Multivariate
• Univariate anomaly detection looks for anomalies in each individual
metric, while multivariate anomaly detection learns a single model for all
the metrics in the system.

• Univariate methods are simpler, so they are easier to scale to many metrics
and large datasets. However, someone would then need to unravel the
causal relationships between the anomalies in the resulting alert storm.

• Multivariate approaches, on the other hand, detect anomalies as complete
incidents, yet are difficult to scale both in terms of computation and
accuracy of the models.

• This approach also produces anomaly alerts that are hard to interpret,
because all the metrics are inputs that generate a single output from the
anomaly detection system.
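The difference between the two approaches can be seen on a small two-metric example. The sketch below is an illustration only: it uses per-metric z-scores for the univariate view and the Mahalanobis distance for the multivariate view, and the sample data are invented. Neither choice comes from the slides.

```python
from math import sqrt


def fit_2d(points):
    """Sample means, variances and covariance of two correlated metrics."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return mx, my, sxx, syy, sxy


def univariate_z(point, model):
    """Univariate view: one z-score per metric, each checked on its own."""
    mx, my, sxx, syy, _ = model
    return abs(point[0] - mx) / sqrt(sxx), abs(point[1] - my) / sqrt(syy)


def mahalanobis(point, model):
    """Multivariate view: one distance that accounts for the correlation."""
    mx, my, sxx, syy, sxy = model
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is not invertible")
    # d^T * inverse(covariance) * d, written out for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(max(d2, 0.0))


# Invented training data: the two metrics normally rise and fall together
normal = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
model = fit_2d(normal)

suspect = (1, 5)  # each value is ordinary on its own, but the combination is not
print(univariate_z(suspect, model))
print(mahalanobis(suspect, model))
```

Each metric of the suspect point lies within two standard deviations of its own mean, so a univariate check with that threshold stays silent, while the joint distance is large because the usual correlation is broken.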
Knowledge-based techniques
• This group of techniques is also referred to as an expert system method.
This approach requires creating a knowledge base which reflects the
legitimate traffic profile.

• Unlike the other classes of AIDS, the standard profile model is normally
created based on human knowledge, in terms of a set of rules that try to
define normal system activity.

• The main benefit of knowledge-based techniques is the capability to reduce
false-positive alarms, since the system has knowledge about all the normal
behaviors.

• However, in a dynamically changing computing environment, this kind of
IDS needs regular updates to its knowledge of the expected normal behavior.
This is a time-consuming task, as gathering information about all normal
behaviors is very difficult.
• Finite state machine (FSM): An FSM is a computation model used to
represent and control execution flow.
– Typically, the model is represented in the form of states, transitions, and
activities. A state records the history data: any variation in the input is
noted, and a transition occurs based on the detected variation. An FSM can
represent legitimate system behaviour, and any observed deviation from
this FSM is regarded as an attack.

• Description Language: A description language defines the syntax of
rules which can be used to specify the characteristics of a defined
attack. Rules could be built by description languages such as
N-grammars and UML.

• Expert System: An expert system comprises a number of rules that
define attacks. In an expert system, the rules are usually manually
defined by a knowledge engineer working in collaboration with a
domain expert.
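The FSM idea can be illustrated with a small sketch in which the legitimate behaviour of a login session is written as a transition table and any event with no matching transition is reported as a deviation. The states, events and function name are hypothetical and chosen only for this example.

```python
# Legitimate behaviour as (current state, event) -> next state.
# States and events are hypothetical, for illustration only.
TRANSITIONS = {
    ("idle", "connect"): "connected",
    ("connected", "login_fail"): "connected",
    ("connected", "login_ok"): "authenticated",
    ("connected", "disconnect"): "idle",
    ("authenticated", "command"): "authenticated",
    ("authenticated", "logout"): "idle",
}


def first_deviation(events, start="idle"):
    """Return the index of the first event the FSM does not allow, or None."""
    state = start
    for index, event in enumerate(events):
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            return index          # deviation from legitimate behaviour
        state = next_state
    return None


print(first_deviation(["connect", "login_ok", "command", "logout"]))
print(first_deviation(["connect", "command"]))  # command issued before login
```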
Intro to Snort
• What is Snort?
– Snort is a multi-mode packet analysis tool
• Sniffer
• Packet Logger
• Forensic Data Analysis tool
• Network Intrusion Detection System
• Where did it come from?
– Developed out of the evolving need to perform
network traffic analysis both in real time and for
forensic post-processing
Snort “Metrics”
• Small (~800k source download)
• Portable (Linux, Windows, MacOS X, Solaris,
BSD, IRIX, Tru64, HP-UX, etc)
• Fast (High probability of detection for a given
attack on 100Mbps networks)
• Configurable (Easy rules language, many
reporting/logging options)
• Free (GPL/Open Source Software)
Snort Design
• Packet sniffing “lightweight” network
intrusion detection system
• Libpcap-based sniffing interface
• Rules-based detection engine
• Plug-in system allows endless flexibility
Detection Engine
• Rules form “signatures”
• Modular detection elements are combined to
form these signatures
• Wide range of detection capabilities
– Stealth scans, OS fingerprinting, buffer overflows,
back doors, CGI exploits, etc.
• Rules system is very flexible, and creation of
new rules is relatively simple
Plug-Ins
• Preprocessor
– Packets are examined/manipulated before being
handed to the detection engine
• Detection
– Perform single, simple tests on a single
aspect/field of the packet
• Output
– Report results from the other plug-ins
Using Snort
• Three main operational modes
– Sniffer Mode
– Packet Logger Mode
– NIDS Mode
– (Forensic Data Analysis Mode)
• Operational modes are configured via command line
switches
– Snort automatically tries to go into NIDS mode if no
command line switches are given, and looks for the snort.conf
configuration file in /etc
Using Snort – Sniffer Mode
• Works much like tcpdump
• Decodes packets and dumps them to stdout
• BPF filtering interface available to shape
displayed network traffic
What Do The Packet Dumps Look Like?

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

11/09-11:12:02.954779 10.1.1.6:1032 -> 10.1.1.8:23
TCP TTL:128 TOS:0x0 ID:31237 IpLen:20 DgmLen:59 DF
***AP*** Seq: 0x16B6DA Ack: 0x1AF156C2 Win: 0x2217 TcpLen: 20
FF FC 23 FF FC 27 FF FC 24 FF FA 18 00 41 4E 53 ..#..'..$....ANS
49 FF F0 I..

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

11/09-11:12:02.956582 10.1.1.8:23 -> 10.1.1.6:1032
TCP TTL:255 TOS:0x0 ID:49900 IpLen:20 DgmLen:61 DF
***AP*** Seq: 0x1AF156C2 Ack: 0x16B6ED Win: 0x2238 TcpLen: 20
0D 0A 0D 0A 53 75 6E 4F 53 20 35 2E 37 0D 0A 0D ....SunOS 5.7...
00 0D 0A 0D 00 .....

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Packet Logger Mode
• Gee, it sure would be nice if I could save those
packets to disk…
• Multi-mode packet logging options available
– Flat ASCII, tcpdump, XML, database, etc.
• Log all data and post-process to look for
anomalous activity
NIDS Mode
• Wide variety of rules available for signature
engine (~1300 as of June 2001, growing to ~2900
by May 2005)
• Multiple detection modes available via rules
and plug-ins
– Rules/signature
– Statistical anomaly
– Protocol verification
Snort Rules
• Snort rules are extremely flexible and are easy to
modify, unlike many commercial NIDS
• Sample rule to detect SubSeven trojan:
alert tcp $EXTERNAL_NET 27374 -> $HOME_NET any (msg:"BACKDOOR
subseven 22"; flags: A+; content: "|0d0a5b52504c5d3030320d0a|";
reference:arachnids,485; reference:url,www.hackfix.org/subseven/;
sid:103; classtype:misc-activity; rev:4;)

• Elements before parentheses comprise ‘rule header’


• Elements in parentheses are ‘rule options’
Snort Rules
alert tcp $EXTERNAL_NET 27374 -> $HOME_NET any (msg:"BACKDOOR
subseven 22"; flags: A+; content:
"|0d0a5b52504c5d3030320d0a|"; reference:arachnids,485;
reference:url,www.hackfix.org/subseven/; sid:103;
classtype:misc-activity; rev:4;)

• alert: action to take; also log, pass, activate, dynamic
• tcp: protocol; also udp, icmp, ip
• $EXTERNAL_NET: source address; this is a variable, but a specific IP is ok
• 27374: source port; also any, negation (!21), range (1:1024)
• ->: direction; best not to change this, although <> is allowed
• $HOME_NET: destination address; this is also a variable here
• any: destination port
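To make the header/options split concrete, the sketch below takes the SubSeven rule shown above and breaks it into its seven header fields and its list of options. It is a simplified illustration, not Snort's own parser: the function name and dictionary keys are invented here, and the splitting assumes that option values contain no semicolons or parentheses.

```python
RULE = (
    'alert tcp $EXTERNAL_NET 27374 -> $HOME_NET any (msg:"BACKDOOR subseven 22"; '
    'flags: A+; content: "|0d0a5b52504c5d3030320d0a|"; '
    'reference:arachnids,485; reference:url,www.hackfix.org/subseven/; '
    'sid:103; classtype:misc-activity; rev:4;)'
)


def parse_rule(rule):
    """Split a rule into (header dict, list of (option, value) pairs)."""
    header_text, _, rest = rule.partition("(")
    options_text = rest.rsplit(")", 1)[0]

    # Rule header: the seven elements before the parentheses
    action, protocol, src, src_port, direction, dst, dst_port = header_text.split()
    header = {
        "action": action, "protocol": protocol,
        "src": src, "src_port": src_port,
        "direction": direction,
        "dst": dst, "dst_port": dst_port,
    }

    # Rule options: 'keyword:value;' pairs; a list is used because
    # keywords such as reference may appear more than once
    options = []
    for item in options_text.split(";"):
        item = item.strip()
        if not item:
            continue
        keyword, _, value = item.partition(":")
        options.append((keyword.strip(), value.strip().strip('"')))
    return header, options


header, options = parse_rule(RULE)
print(header)
for keyword, value in options:
    print(keyword, "=", value)
```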
Snort Architecture
[Figure: Snort data flow. Packet Stream → Sniffing → Packet Decoder →
Preprocessor (Plug-ins) → Detection Engine (Plug-ins) → Output Stage
(Plug-ins) → Alerts/Logs]
Detection Engine: Rules

Rule Header Rule Options

alert tcp 1.1.1.1 any -> 2.2.2.2 any (flags: SF; msg: "SYN-FIN Scan";)
alert tcp 1.1.1.1 any -> 2.2.2.2 any (flags: S12; msg: "Queso Scan";)
alert tcp 1.1.1.1 any -> 2.2.2.2 any (flags: F; msg: "FIN Scan";)
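A toy version of how a detection engine might test the flags option of these three rules is sketched below. It assumes that a flags option written without a modifier matches only when exactly the listed flags are set, and it leaves out the address and port checks of the rule header. The data layout and function name are invented for the illustration.

```python
# (required TCP flags, alert message) for the three rules above.
# '1' and '2' stand for the two reserved bits used by the Queso scan rule.
RULES = [
    ({"S", "F"}, "SYN-FIN Scan"),
    ({"S", "1", "2"}, "Queso Scan"),
    ({"F"}, "FIN Scan"),
]


def match_flags(packet_flags):
    """Return the messages of all rules whose flag set equals the packet's.

    packet_flags -- string of flag letters set in the packet, e.g. "SF".
    Assumption: exact match, i.e. no extra flags may be set.
    """
    flags = set(packet_flags)
    return [message for rule_flags, message in RULES if flags == rule_flags]


print(match_flags("SF"))   # SYN and FIN together
print(match_flags("F"))    # FIN only
print(match_flags("SA"))   # ordinary SYN-ACK, matches none of the rules
```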
Conclusion
• Snort is a powerful tool, but maximizing its
usefulness requires a trained operator
• Becoming proficient with network intrusion
detection takes 12 months; “expert” 24-36?
• Snort is considered a superior NIDS when
compared to most commercial systems
• Managed network security providers should
collect enough information to make decisions
without calling clients to ask what happened
Intrusion alert correlation
Correlation can address some of the weaknesses of IDSs
• Alert flooding
– IDSs generate a large number of alerts
• Contexts
– Related attacks are not grouped
• False alerts
– IDSs generate false negatives and false positives
• Scalability
– Large-scale deployment is difficult to achieve
• Correlation can capture high level view of
attack activity on the target network without
losing security relevant information.
Correlation process
• Pre-process
– Data normalization
– Data reduction
• Process (Alert correlation technique)

• Post-process
Pre-processing
The aim of pre-processing is to convert alerts to a generic format
and reduce the number of alerts to be correlated.

1. Normalization: alerts generated in different formats
are translated into the standard format IDMEF (Intrusion
Detection Message Exchange Format) for sharing
information of interest to IDSs and management
systems.
Two types of IDMEF messages
1. Heartbeat: sent by the analyzer at regular intervals to indicate
that it is running.
2. Alert: contains nine aggregate classes.
• Analyzer: Identification information for the analyzer that generated the alert.

• Create Time: The time when the alert was generated.

• Detect Time: The time when the event(s) leading up to the alert was detected. This could be
different from the CreateTime in some circumstances.

• Analyzer Time: Current time on the analyzer.

• Source: The source that triggers the alert. It is also composed of four aggregate classes, namely,
Node, User, Process and Service.

• Target: The target of the alert. It has the same aggregate classes as Source, with one additional class
named File List.

• Classification: Information that describes the alert.

• Assessment: Information about the impact of the event, actions in response to it, and confidence in
the evaluation; and

• Additional Data: Additional information that does not fit into the data model.

• The Alert class attribute Message Id uniquely identifies the alert. There are three subclasses of the Alert class:

– The Tool Alert class specifies the attacking tool used by the attacker.
– The Overflow Alert class contains information about buffer overflow attacks, such as the size of the contents
in the buffer and the content itself.
– The Correlation Alert class provides a means to group alerts together.
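The normalization step can be pictured as mapping each sensor's native fields onto one common record. The sketch below is a deliberate simplification: it uses a plain dictionary keyed by a few of the Alert aggregate class names listed above rather than a real IDMEF message, and the two input formats ("sensor_a", "sensor_b") and their field names are hypothetical.

```python
def normalize(raw, sensor_type):
    """Translate a sensor-specific alert into one common, IDMEF-like record."""
    if sensor_type == "sensor_a":          # hypothetical native format A
        return {
            "Analyzer": raw["sensor"],
            "CreateTime": raw["ts"],
            "Source": raw["src"],
            "Target": raw["dst"],
            "Classification": raw["sig"],
        }
    if sensor_type == "sensor_b":          # hypothetical native format B
        return {
            "Analyzer": raw["device_id"],
            "CreateTime": raw["time"],
            "Source": raw["attacker_ip"],
            "Target": raw["victim_ip"],
            "Classification": raw["alert_name"],
        }
    raise ValueError(f"unknown sensor type: {sensor_type}")


alert_a = normalize(
    {"sensor": "ids-1", "ts": "2001-11-09T11:12:02", "src": "10.1.1.6",
     "dst": "10.1.1.8", "sig": "telnet login"},
    "sensor_a",
)
alert_b = normalize(
    {"device_id": "ids-2", "time": "2001-11-09T11:12:03",
     "attacker_ip": "10.1.1.6", "victim_ip": "10.1.1.8",
     "alert_name": "telnet login"},
    "sensor_b",
)
print(alert_a.keys() == alert_b.keys())
```

Once both alerts share the same field names, later steps such as aggregation can compare them attribute by attribute.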
2. Data reduction: a process to reduce the number of alerts without
losing important information.

2.1 Alert aggregation:

– It reduces the redundancy of alerts by grouping duplicate alerts and merging
them into a single one. Alerts are aggregated in terms of their
attributes. These attributes include timestamp, source IP, target IP, port(s),
user name, process name, attack class and sensor ID.

– The duplicate relationships between alerts are defined in a duplicate definition
file. Based on these definitions, the attributes of new
alerts are compared with those of previous alerts for aggregation.

– In addition to aggregation, alert compression is another simple technique
for dealing with duplicate alerts: a recurring sequence of alerts is
simply replaced by a single alert and a run count.
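Both reduction steps can be sketched briefly. In this illustration, which is an assumption rather than the method of any particular tool, two alerts count as duplicates when they agree on source, target, port and attack class (a stand-in for the duplicate definition file), and compression is done by run-length encoding of consecutive identical alerts.

```python
from itertools import groupby


def aggregate(alerts, keys=("src", "dst", "port", "attack_class")):
    """Merge alerts that agree on all `keys` into one record with a count."""
    merged = {}
    for alert in alerts:
        key = tuple(alert[k] for k in keys)
        if key not in merged:
            merged[key] = dict(zip(keys, key), first_seen=alert["time"],
                               last_seen=alert["time"], count=0)
        entry = merged[key]
        entry["count"] += 1
        entry["first_seen"] = min(entry["first_seen"], alert["time"])
        entry["last_seen"] = max(entry["last_seen"], alert["time"])
    return list(merged.values())


def compress(alert_names):
    """Replace each run of identical consecutive alerts by (alert, run count)."""
    return [(name, len(list(run))) for name, run in groupby(alert_names)]


alerts = [
    {"time": 1, "src": "10.1.1.6", "dst": "10.1.1.8", "port": 23, "attack_class": "scan"},
    {"time": 5, "src": "10.1.1.6", "dst": "10.1.1.8", "port": 23, "attack_class": "scan"},
    {"time": 3, "src": "10.1.1.7", "dst": "10.1.1.8", "port": 23, "attack_class": "scan"},
]
print(aggregate(alerts))
print(compress(["scan", "scan", "scan", "login", "scan"]))
```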
2.2 Alert filtering
• It filters out low-interest alert classes and some
known false alerts.
• These alerts are normally predefined by administrators.
• To give an example, consider the alerts raised by Snort
flagging that a critical file has been changed. If this is
caused by Syslog doing garbage collection, the
corresponding alerts should be filtered out.
• Knowledge about the topology of the network is also
important for identifying the low interest alerts.
• Static filtering is only efficient when dealing with the
known situation, and moreover, it is time-consuming.
• The adaptive filtering technique uses a learning algorithm to
update Syslog filters.

• The idea is to train a Naive Bayesian text classifier to
classify messages by checking the words that appear in
alert messages.

• Some examples labelled as interesting or not interesting
are presented to the classifier in the training process. Later
on, a knowledge base is built up and brought online.

• The alert messages receive different scores after being
classified, and only the top-scoring messages are selected
while the others are filtered out.

• To be adaptive, the filter is launched on a periodic basis or
on demand, and the feedback from human experts is used
to optimize the classifier.
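A very small Naive Bayes text classifier along these lines is sketched below. It is an illustration under simple assumptions (whitespace tokenization, add-one smoothing, a log-odds score), and the class name, labels and training messages are invented for the example.

```python
from collections import Counter
from math import log

LABELS = ("interesting", "not_interesting")


class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, message, label):
        """Add one labelled example; also used later for expert feedback."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(message.lower().split())

    def score(self, message):
        """Log-odds that the message is interesting (higher = more interesting)."""
        vocab = set()
        for label in LABELS:
            vocab.update(self.word_counts[label])
        total_docs = sum(self.doc_counts.values())
        log_prob = {}
        for label in LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # smoothed class prior
            lp = log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
            for word in message.lower().split():
                # add-one smoothing; the extra +1 reserves mass for unseen words
                lp += log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_prob[label] = lp
        return log_prob["interesting"] - log_prob["not_interesting"]

    def top_alerts(self, messages, k):
        """Keep only the k highest-scoring messages; the rest are filtered out."""
        return sorted(messages, key=self.score, reverse=True)[:k]


nb = NaiveBayesFilter()
nb.train("failed root login from unknown host", "interesting")
nb.train("root password changed", "interesting")
nb.train("cron job completed", "not_interesting")
nb.train("log rotation completed", "not_interesting")

print(nb.top_alerts(["cron job completed", "failed root login"], 1))
```

Calling train again with messages that a human expert has relabelled is one simple way to feed expert feedback back into the classifier.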
2.3 Reducing false alerts
• A false positive is the result of an IDS raising alerts for
legitimate activities.
• IDSs can easily trigger thousands of alerts per day, and up
to 99% of them are false positives.
• However, it is hard to significantly reduce false positives by
only improving intrusion detection techniques.
• The Adaptive Learner for Alert Classification (ALAC)
framework classifies alerts as true positives or false
positives in real time.
• It incorporates some background knowledge (Network
Topology, Alert Context, Alert Semantics and Installed
Software) in its real-time machine learning techniques.
Correlation techniques
• Based on feature similarity
• Based on known scenarios
• Based on prerequisite and consequence
relationship
Post-process
• Alert prioritization
