Detecting Abnormal Traffic
Jennifer Rexford
Fall 2010 (TTh 1:30-2:50 in COS 302)
COS 561: Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall10/cos561/
Intrusions
Intrusions
Actions that attempt to bypass security mechanisms
E.g., unauthorized access, inflicting harm, etc.
Example intrusions
Denial-of-service attacks
Scans
Worms and viruses
Host compromises
Intrusion detection
Monitoring and analyzing traffic
Identifying abnormal activities
Assessing severity and raising alarms
Where to Detect Intrusions?
End host: OS or application
Includes logins, file I/O, program executions, etc.
Can work with encrypted traffic and at lower speeds
Avoid extra packet reassembly and ambiguity
Network: at enterprise edge
Single location for detecting and blocking attacks
Avoid reliance on the end host, OS, user, ..
Reduce overhead on the end host and network
Network: in the backbone
Multiple locations for detecting correlations in traffic
Too expensive to inspect the contents of every packet
What to Do After Detection?
Passively log
Collect, analyze, and record data
Report results to system/network administrator
Allow human to drive any response
Slower response, but better handling of false positives
Actively defend
Detect problems in real time
Automatically generate a response
E.g., drop the traffic, engage the adversary, fight back
Faster response, but worse handling of false positives
How to Detect Intrusions?
The challenge
We don't know all the bad things that could happen
And telling good from bad is fundamentally hard

Anomaly detection
What is usual, is known
What is unusual, is bad
Signature detection
What is bad, is known
What is not bad, is good
How to Detect Intrusions?
Aspect | Anomaly Detection | Signature Detection
Patterns | Train to create a baseline of normal network traffic | Codify patterns of known vulnerabilities or attacks
Detection | Detect statistically significant deviations from normal | Detect matches to the patterns in the signatures
Pros | Can detect novel (zero-day) attacks | Builds on past experiences
Cons | May miss low-rate attacks; high rate of false alarms | Misses novel attacks; requires continuous updates to signatures
Anomaly Detection
Traffic volume
Detect deviations in bytes/sec or packets/sec over time
Not effective for detecting low-volume attacks
Traffic features
Detect changes in distributions of traffic characteristics
E.g., traffic distribution by IP address, port number,
packet size, TCP flags, etc.
Aids in classifying the anomaly (e.g., DoS vs. port scan)
Detection techniques
Statistical techniques
Machine learning
...
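The two kinds of anomaly detection on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the threshold of 3 standard deviations, and the sample numbers are assumptions, not part of the lecture.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def volume_zscore(baseline, sample):
    """How many standard deviations `sample` lies from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return 0.0 if sample == mu else math.inf
    return (sample - mu) / sigma

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical baseline of packets/sec over recent measurement intervals.
baseline_pps = [1000, 1040, 980, 1010, 990, 1020]
is_volume_anomaly = abs(volume_zscore(baseline_pps, 5000)) > 3  # assumed threshold

# Hypothetical destination ports: normal traffic concentrates on a few ports,
# while a port scan spreads packets over many ports and raises the entropy.
normal_ports = [80] * 90 + [443] * 10
scan_ports = list(range(1, 101))
scan_looks_different = entropy(scan_ports) > entropy(normal_ports)
```

A volume check alone would not flag the scan if its packet rate is low, which is why the slide pairs it with changes in feature distributions.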
Signature Detection
Examples
Excessive login attempts
TCP packet with both SYN and RST set
HTTP with GET /cgi-bin/phf?

Packet processing
Deep-packet inspection
Regular expression matching
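Two of the example signatures above can be written as simple checks. The sketch below assumes the monitor already has the TCP flag byte and the request payload in hand; the function names and the exact regular expression are illustrative assumptions.

```python
import re

TCP_FIN, TCP_SYN, TCP_RST = 0x01, 0x02, 0x04  # TCP header flag bits

# Assumed signature: a request line asking for /cgi-bin/phf (case-insensitive).
PHF_SIGNATURE = re.compile(rb"^GET\s+/cgi-bin/phf\?", re.IGNORECASE)

def syn_and_rst(flags):
    """True if a TCP packet has both SYN and RST set."""
    return bool(flags & TCP_SYN) and bool(flags & TCP_RST)

def matches_phf(payload):
    """True if the payload begins with a GET for /cgi-bin/phf?."""
    return PHF_SIGNATURE.search(payload) is not None
```

For instance, `matches_phf(b"GET /cgi-bin/phf?Qalias=x HTTP/1.0\r\n")` matches, while a request for `/index.html` does not.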
alert tcp $EXTERNAL any -> $HTTP_SERVERS $HTTP_PORTS
(msg:"WEB-CGI phf access"; flow:to_server,established;
uricontent:"/phf"; nocase; reference:bugtraq,629;
reference:arachnids,128; reference:cve,CVE-1999-0067;
classtype:web-application-activity; sid:886; rev:8;)
Hybrid Solution: Arbor Example
[Figure: Arbor deployment; labels: anomaly detection, scrubber]
Source: http://www.arbornetworks.com/en/stopping-ddos-attacks.html
Challenges
Accuracy
False positives
False negatives
High volume of data
Measuring the traffic at line rate
Real-time analysis of packet streams
Deep-packet inspection
Connection reassembly
Attackers gaming the monitoring system
Splitting offending content across multiple packets
Overloading the monitor
Exploiting ambiguity in which packet the receiver uses
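The "splitting offending content" evasion is easy to demonstrate. In this minimal sketch, segments are hypothetical (sequence number, payload) pairs and the signature string is an assumption borrowed from the phf example; per-packet matching misses the attack, while matching on the reassembled stream finds it.

```python
SIGNATURE = b"/cgi-bin/phf?"

def per_packet_match(segments):
    """Check each (seq, payload) segment on its own."""
    return any(SIGNATURE in payload for _, payload in segments)

def reassembled_match(segments):
    """Order segments by sequence number, join them, then check the stream."""
    stream = b"".join(payload for _, payload in sorted(segments))
    return SIGNATURE in stream

# Hypothetical segments: the offending content is split across two packets
# and delivered out of order.
segments = [(9, b"bin/phf?Qalias=x"), (0, b"GET /cgi-")]
```

The reassembly here is deliberately naive: it ignores overlapping or retransmitted segments, which is exactly where the ambiguity about which packet the receiver uses comes in.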
Discussion
Network Security
Denial-of-service mitigation
Pushback and source quenching
Secure Overlay Services
Spam mitigation
Network-based detection of spam
Protecting against bogus DNS and certificates
Comparing results across multiple clients
DNS-SEC
End-to-end encryption: SSL and IPSEC
Clean-slate approaches
Self-certifying addresses
Off by default
Path verification
Traffic Measurement
Why is Traffic Measurement Important?
Billing the customer
Measure usage on links to/from customers
Applying billing model to generate a bill
Traffic engineering and capacity planning
Measure the traffic matrix (i.e., offered load)
Tune routing protocol or add new capacity
Denial-of-service attack detection
Identify anomalies in the traffic
Configure routers to block the offending traffic
Analyze application-level issues
Evaluate benefits of deploying a Web caching proxy
Quantify fraction of traffic that is P2P file sharing
Collecting Traffic Data: SNMP
Simple Network Management Protocol
Standard Management Information Base (MIB)
Protocol for querying the MIBs
Advantage: ubiquitous
Supported on all networking equipment
Multiple products for polling and analyzing data
Disadvantages: dumb
Coarse granularity of the measurement data
E.g., number of bytes/packets per interface per 5 minutes
Cannot express complex queries on the data
Unreliable delivery of the data using UDP
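Turning polled byte counters into link utilization takes one subtraction per polling interval, with care for counter wraparound. The sketch below assumes a 32-bit counter and at most one wrap between polls; the function names and parameters are hypothetical.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two readings of a counter that wraps at 2**bits."""
    return (current - previous) % (1 << bits)

def utilization(prev_octets, curr_octets, interval_sec, link_bps, bits=32):
    """Average link utilization (0..1) over one polling interval."""
    delta_bytes = counter_delta(prev_octets, curr_octets, bits)
    return (delta_bytes * 8) / (interval_sec * link_bps)
```

The result is a single average over the whole interval, which illustrates the coarse granularity noted above. On a fast link a 32-bit byte counter can wrap more than once within 5 minutes, in which case this calculation silently undercounts.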
Collecting Traffic Data: Packet Monitoring
Packet monitoring
Passively collecting IP packets on a link
Recording IP, TCP/UDP, or application-layer traces
Advantages: details
Fine-grain timing information
E.g., can analyze the burstiness of the traffic
Fine-grain packet contents
Addresses, port numbers, TCP flags, URLs, etc.
Disadvantages: overhead
Hard to keep up with high-speed links
Often requires a separate monitoring device
Collecting Traffic Data: Flow Statistics
Flow monitoring (e.g., Cisco Netflow)
Statistics about groups of related packets (e.g., same
IP/TCP headers and close in time)
Recording header information, counts, and time
Advantages: detail with less overhead
Almost as good as packet monitoring, except no fine-grain timing information or packet contents
Often implemented directly on the interface card
Disadvantages: trade-off between detail and overhead
Less detail than packet monitoring
Less ubiquitous than SNMP statistics
Using the Traffic Data in Network Operations
SNMP byte/packet counts: everywhere
Tracking link utilizations and detecting anomalies
Generating bills for traffic on customer links
Inference of the offered load (i.e., traffic matrix)
Packet monitoring: selected locations
Analyzing the small time-scale behavior of traffic
Troubleshooting specific problems on demand
Flow monitoring: selective, e.g., network edge
Tracking the application mix
Direct computation of the traffic matrix
Input to denial-of-service attack detection
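As a rough sketch of "direct computation of the traffic matrix", flow records collected at the network edge can be summed by entry and exit point. The record fields `ingress`, `egress`, and `bytes` are hypothetical names chosen for illustration.

```python
from collections import defaultdict

def traffic_matrix(flow_records):
    """Sum bytes per (ingress, egress) pair from flow records."""
    matrix = defaultdict(int)
    for record in flow_records:
        matrix[(record["ingress"], record["egress"])] += record["bytes"]
    return dict(matrix)

# Hypothetical flow records from edge routers A, B, and C.
records = [
    {"ingress": "A", "egress": "B", "bytes": 1500},
    {"ingress": "A", "egress": "C", "bytes": 400},
    {"ingress": "A", "egress": "B", "bytes": 1000},
]
```

With SNMP link counts alone, the same matrix would have to be inferred rather than computed, since the counters do not say where traffic entered or left the network.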
Flow Measurement
Flow Measurement: Outline
Definition
Passively collecting statistics about groups of packets
Group packets based on headers and spacing in time
Essentially a way to aggregate packet measurement data
Scope
Medium-grain information about user behavior
Passively monitoring the link or the interface/router
Helpful in characterizing, detecting, diagnosing, and fixing
Outline
Definition of an IP flow (sequence of related packets)
Flow measurement data and its applications
Mechanics of collecting flow-level measurements
Reducing the overheads of flow-level measurement
IP Flows
[Figure: packets on a timeline grouped into flow 1, flow 2, flow 3, and flow 4]
Set of packets that belong together
Source/destination IP addresses and port numbers
Same protocol, ToS bits, ...
Same input/output interfaces at a router (if known)
Packets that are close together in time
Maximum spacing between packets (e.g., 30 sec)
Example: flows 2 and 4 are different flows due to time
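The flow definition above can be sketched as an offline grouping of a packet trace. This is a simplified illustration: the `Packet` fields are assumed names, the key uses only addresses, ports, and protocol (ToS bits and interfaces are left out), and the 30-second gap follows the slide's example.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "time src dst sport dport proto size")

def split_into_flows(packets, max_gap=30.0):
    """Group packets by header key; start a new flow after a gap > max_gap."""
    flows = []       # each flow is a list of packets
    open_flow = {}   # key -> index in `flows` of that key's current flow
    for pkt in sorted(packets, key=lambda p: p.time):
        key = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
        idx = open_flow.get(key)
        if idx is None or pkt.time - flows[idx][-1].time > max_gap:
            flows.append([pkt])
            open_flow[key] = len(flows) - 1
        else:
            flows[idx].append(pkt)
    return flows
```

Two packets with identical headers that arrive 40 seconds apart end up in different flows, which is the situation of flows 2 and 4 in the example.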
Flow Abstraction
A flow is not exactly the same as a session
Sequence of related packets may be multiple flows (due to the "close together in time" requirement)
Sequence of related packets may not follow the same links (due to changes in IP routing)
Motivation for this abstraction
As close to a session as possible from inside the
network
Flow switching paradigm from IP-over-ATM technology
Router optimization for forwarding/ACL decisions
(cache the result after the first packet in a flow)
might as well throw in a few counters
Recording Traffic Statistics (e.g., Netflow)
Packet header information (same for every packet)
Source and destination IP addresses
Source and destination TCP/UDP port numbers
Other IP/TCP/UDP header fields (protocol, ToS bits, etc.)
Aggregate traffic information (summary of traffic)
Start/finish time of the flow (time of first & last packet)
Total number of bytes and number of packets in the flow
TCP flags (e.g., logical OR over sequence of packets)
[Figure: a flow of 4 packets (SYN, ACK, ACK, FIN) from start to finish, recorded as 4 packets, 1436 bytes, and flags SYN, ACK, & FIN]
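The aggregate part of a flow record can be sketched as below. The `FlowRecord` class and its field names are assumptions for illustration, not the Netflow record layout, and the individual packet sizes are invented so that they total the 1436 bytes shown in the figure.

```python
from dataclasses import dataclass

FIN, SYN, ACK = 0x01, 0x02, 0x10  # TCP header flag bits

@dataclass
class FlowRecord:
    start: float
    finish: float
    packets: int = 0
    byte_count: int = 0
    tcp_flags: int = 0

def summarize(packets):
    """Build one record from (time, size, flags) tuples of a single flow."""
    times = [t for t, _, _ in packets]
    record = FlowRecord(start=min(times), finish=max(times))
    for _, size, flags in packets:
        record.packets += 1
        record.byte_count += size
        record.tcp_flags |= flags  # logical OR over the sequence of packets
    return record

# Hypothetical packets: SYN, ACK, ACK, FIN.
flow = [(0.0, 40, SYN), (0.1, 40, ACK), (0.2, 1316, ACK), (0.5, 40, FIN)]
record = summarize(flow)
```

Because the flags are ORed together, the record shows that a SYN, an ACK, and a FIN were seen, but not in which order or how many times.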
Recording Routing Info (e.g., Netflow)
Input and output interfaces
Input interface is where the packets entered the router
Output interface is the next hop in the forwarding table
Source and destination IP prefix (mask length)
Longest prefix match on the src and dest IP addresses
Source and destination AS numbers
Origin AS for src/dest prefix in the BGP routing table
[Figure: router architecture with line cards, switching fabric, and processor; labels: BGP table, forwarding table]
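Longest prefix match, used above to fill in the prefix and origin AS of a flow, can be sketched with Python's standard `ipaddress` module. The table contents are made up (documentation address ranges and AS numbers), and the linear scan is only for clarity; routers use specialized lookup structures.

```python
import ipaddress

def longest_prefix_match(address, table):
    """Return (prefix, value) for the most specific prefix containing address."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, value)
    return best

# Hypothetical table mapping prefixes to origin AS numbers.
bgp_table = {
    "192.0.2.0/24": 64500,
    "192.0.2.128/25": 64501,
    "198.51.100.0/24": 64502,
}
```

Looking up `192.0.2.200` matches both `192.0.2.0/24` and `192.0.2.128/25`; the /25 wins because it is the longer prefix.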
Measuring Traffic as it Flows By
[Figure: traffic crossing a router from an input to an output interface, labeled with source, source prefix, source AS, intermediate AS, dest AS, dest prefix, and dest]
Source and destination: IP header
Source and dest prefix: forwarding table or BGP table
Source and destination AS: BGP table
Packet vs. Flow Measurement
Basic statistics (available from both techniques)
Traffic mix by IP addresses, port numbers, and protocol
Average packet size
Traffic over time
Both: traffic volumes on a medium-to-large time scale
Packet: burstiness of the traffic on a small time scale
Statistics per TCP connection
Both: number of packets & bytes transferred over the link
Packet: frequency of lost or out-of-order packets, and the number of
application-level bytes delivered
Per-packet info (available only from packet traces)
TCP seq/ack #s, receiver window, per-packet flags, ...
Probability distribution of packet sizes
Application-level header and body (full packet contents)
Collecting Flow Measurements
Route CPU that generates flow records: may degrade forwarding performance
Line card that generates flow records: more efficient to support measurement in each line card
Packet monitor that generates flow records: third-party device, shown between Router A and Router B
Router Collecting Flow Measurement
Advantages
No need for separate measurement device(s)
Monitor traffic over all links in/out of router (parallelism)
Ease of providing routing information for each flow
Disadvantages
Requirement for support in the router product(s)
Danger of competing with other 1st-order router features
Possible degradation of the throughput of the router
Difficulty of online analysis/aggregation of data on router
Practical application
View from multiple vantage points (e.g., all edge links)
Packet Monitor Collecting Flow Records
Advantages
No performance impact on packet forwarding
No dependence on support by router vendor
Possibility of customizing the thinning of the data
Disadvantages
Overhead/cost of tapping a link & reconstructing packets
Cost of buying, deploying, and managing extra equipment
No access to routing info (input/output link, IP prefix, etc.)
Practical application
Selective monitoring of a small number of links
Deployment in front of particular services or sites
Packet monitor vendors support flow-level output
Mechanics: Flow Cache
Maintain a cache of active flows
Storage of byte/packet counts, timestamps, etc.
Compute a key per incoming packet
Concatenation of source, destination, port #s, etc.
Index into the flow cache based on the key
Creation or updating of an entry in the flow cache
[Figure: a key computed from the packet header indexes into the flow cache; each entry holds #bytes, #packets, start, finish]
Mechanics: Evicting Cache Entries
Flow timeout
Remove flows that have not received a packet recently
Periodic sequencing through the cache to time out flows
New packet triggers the creation of a new flow
Cache replacement
Remove flow(s) when the flow cache is full
Evict existing flow(s) upon creating a new cache entry
Apply eviction policy (LRU, random flow, etc.)
Long-lived flows
Remove flow(s) that persist for a long time (e.g., 30 min)
otherwise flow statistics don't become available
and the byte and packet counters might overflow
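The flow cache and its three eviction rules can be sketched together. This is a minimal illustration under assumptions: the class name, the default capacity and timeouts, and the choice of LRU as the replacement policy are all made up for the example, and evicted entries are simply appended to a list in place of being exported.

```python
from collections import OrderedDict

class FlowCache:
    def __init__(self, capacity=1000, idle_timeout=30.0, max_lifetime=1800.0):
        self.capacity = capacity
        self.idle_timeout = idle_timeout
        self.max_lifetime = max_lifetime
        self.entries = OrderedDict()  # key -> stats, least recently used first
        self.exported = []            # evicted (key, stats) flow records

    def _export(self, key):
        self.exported.append((key, self.entries.pop(key)))

    def expire(self, now):
        """Periodic pass: export idle flows and flows that lived too long."""
        for key, e in list(self.entries.items()):
            idle = now - e["finish"] > self.idle_timeout
            old = now - e["start"] > self.max_lifetime
            if idle or old:
                self._export(key)

    def update(self, key, size, now):
        """Create or update the cache entry for a packet with this key."""
        e = self.entries.get(key)
        if e is not None and now - e["finish"] > self.idle_timeout:
            self._export(key)  # stale entry: the new packet starts a new flow
            e = None
        if e is None:
            if len(self.entries) >= self.capacity:
                self._export(next(iter(self.entries)))  # evict the LRU entry
            e = {"bytes": 0, "packets": 0, "start": now, "finish": now}
            self.entries[key] = e
        e["bytes"] += size
        e["packets"] += 1
        e["finish"] = now
        self.entries.move_to_end(key)  # mark as most recently used
```

A caller would invoke `update` for every packet and `expire` on a timer; the key would be the concatenation of header fields described on the previous slide.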
Sampling: Packet Sampling
Packet sampling before flow creation (Sampled Netflow)
1-out-of-m sampling of individual packets (e.g., m=100)
Creation of flow records over the sampled packets
Reducing overhead
Avoid per-packet overhead on (m-1)/m packets
Avoid creating records for a large number of small flows
Increasing overhead (in some cases)
May split some long transfers into multiple flow records
due to larger time gaps between successive packets
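[Figure: timeline of one transfer's packets; the packets not sampled leave a gap longer than the timeout, so the sampled packets are recorded as two flows]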
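Both effects of packet sampling can be seen in a small sketch. It assumes deterministic 1-out-of-m sampling (every m-th packet) and a 30-second flow timeout; the function names are hypothetical, and a real implementation might pick packets at random instead.

```python
def sample_one_in_m(packets, m):
    """Deterministic 1-out-of-m sampling: keep every m-th packet."""
    return packets[m - 1::m]

def estimate_totals(sampled_sizes, m):
    """Scale sampled counts by m to estimate original packet and byte totals."""
    return len(sampled_sizes) * m, sum(sampled_sizes) * m

def count_flows(times, timeout=30.0):
    """Number of flows one packet stream splits into under a timeout."""
    if not times:
        return 0
    return 1 + sum(1 for a, b in zip(times, times[1:]) if b - a > timeout)

# Hypothetical transfer: one packet per second for 100 seconds.
times = list(range(100))
sampled_times = sample_one_in_m(times, 50)
```

Unsampled, the transfer is a single flow. With m=50 the two surviving packets are 50 seconds apart, so they are recorded as two flows, which is the splitting shown in the figure.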
Conclusions
Flow measurement
Medium-grain view of traffic on one or more links
Advantages
Lower measurement volume than full packet traces
Available on high-end line cards (Cisco Netflow)
Control over overhead via aggregation and sampling
Disadvantages
Computation and memory requirements for flow cache
Loss of fine-grain timing and per-packet information
Not uniformly supported by router vendors