“Security Event Analysis Through Correlation”
Anton Chuvakin, Ph.D., GCIA, GCIH
WRITTEN: 2002-2004
Contents
Contents
Abstract
Introduction to security data analysis
Types of correlation
  Rule-based correlation
  Statistical correlation
  Challenges with correlation
Maximizing benefits of correlation
Correlation Rule Examples
  Probes followed by an attack
  Login guessing
Conclusion
DISCLAIMER: Security is a rapidly changing field of human endeavor. Threats we face literally change every day; moreover, many security professionals consider the rate of change to be accelerating. On top of that, to be able to stay in touch with such ever-changing reality, one has to evolve with the space as well. Thus, even though I hope that this document will be useful to my readers, please keep in mind that it was possibly written years ago. Also, keep in mind that some of the URLs might have gone 404; please Google around.
Abstract
This paper covers several of the security event correlation methods utilized by Security Information Management (SIM) solutions for better attack and misuse detection. We describe these correlation methods, show their corresponding advantages and disadvantages and explain how they work together for maximum security.
Introduction to security data analysis
The security spending survey by “Information Security Magazine” (http://www.infosecuritymag.com/2003/may/coverstory.pdf) and recent research by the Forrester analyst firm indicate that deployment rates of many security technologies will soar in the next three years. According to some estimates, security budgets (and thus technology purchases) will double by 2006. Almost every Internet-connected organization now has a firewall included as part of its network infrastructure; most Windows networks have an anti-virus solution. Intrusion Detection Systems (IDSs) are slowly but surely gaining wider acceptance, and intrusion prevention is starting to show more promise, despite the obvious hurdles. New types of application security products, such as web application firewalls, are starting to be deployed by security-conscious organizations. This buying trend is further enhanced by the growing popularity of so-called "appliance" security systems, which are very easy to install and manage. Appliances combine software and hardware in one package and usually have much lower installation and maintenance costs, thus facilitating their adoption.
All the above devices, whether aimed at prevention or detection of attacks, usually generate huge volumes of audit data. Firewalls, routers, switches and other devices recording network connection information are especially guilty of producing vast oceans of data. There are other problems induced by this log deluge, turning its analysis into a pursuit few dare to undertake. Many diverse data formats and representations, some binary [1], obscure and undocumented, are used for those log files and audit trails. Also, a percentage of events generated by network Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) are false alarms: they do not map to real threats, or they map to threats that have no chance of causing loss. To further confuse the issue, different devices might report on the same things happening on the network, but in a different way, with no apparent way of determining how the reports are related. For example, a UNIX log file might contain an FTP connection message. The same connection will also be recorded by the firewall as 'connection allowed to TCP port 21'. A network IDS might also generate an alert, warning that an FTP session with no password has occurred. All three messages refer to the same event and a human analyst will recognize them as such. However, programming a system to do that is much more challenging, especially for a broad spectrum of messages.
Thus, there is a definite need for a consistent analysis framework to identify various network threats, prioritize them and learn their impact on the target organization. This needs to be done as fast as possible (preferably in real time) for attack identification and also over the long term for threat trending and risk analysis.
To understand the meaning of the piling logs, the data in them may be categorized in several ways. It should be noted that before the data can be intelligently categorized, it should be normalized to a common schema. The normalization process involves extracting the parts of the log records that serve a common purpose and assigning them to specific fields in the common schema. For example, both firewall and network IDS log records will usually contain the source and destination IP addresses. If you see both firewall and IDS logs referring to the same source and destination at about the same time, they are likely to be related.
Log categorization helps to make the similarity between different log records stand out.
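The normalization step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two raw log lines, the regular expressions, the field names (`time`, `src_ip`, `dst_ip`, `detail`) and the five-second matching window are all hypothetical and do not come from any particular firewall, IDS or SIM product.

```python
import re
from datetime import datetime, timedelta

# Hypothetical raw records; real firewall and IDS formats differ by vendor.
FIREWALL_LINE = "2003-05-01 10:15:02 ALLOW tcp 10.0.0.5:40112 -> 192.168.1.20:21"
IDS_LINE = "05/01/2003-10:15:03 [ALERT] FTP login without password 10.0.0.5 -> 192.168.1.20"

FIREWALL_RE = re.compile(
    r"(?P<ts>\S+ \S+) (?P<action>\w+) (?P<proto>\w+) "
    r"(?P<src>[\d.]+):\d+ -> (?P<dst>[\d.]+):(?P<dport>\d+)"
)
IDS_RE = re.compile(
    r"(?P<ts>\S+) \[ALERT\] (?P<msg>.+) (?P<src>[\d.]+) -> (?P<dst>[\d.]+)$"
)


def normalize_firewall(line):
    """Map a (hypothetical) firewall record onto the common schema."""
    m = FIREWALL_RE.match(line)
    return {
        "time": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"),
        "device": "firewall",
        "src_ip": m["src"],
        "dst_ip": m["dst"],
        "detail": f"{m['action'].lower()} {m['proto']}/{m['dport']}",
    }


def normalize_ids(line):
    """Map a (hypothetical) network IDS alert onto the same schema."""
    m = IDS_RE.match(line)
    return {
        "time": datetime.strptime(m["ts"], "%m/%d/%Y-%H:%M:%S"),
        "device": "ids",
        "src_ip": m["src"],
        "dst_ip": m["dst"],
        "detail": m["msg"],
    }


def likely_related(a, b, window=timedelta(seconds=5)):
    """Same source and destination at about the same time."""
    return (
        a["src_ip"] == b["src_ip"]
        and a["dst_ip"] == b["dst_ip"]
        and abs(a["time"] - b["time"]) <= window
    )


fw = normalize_firewall(FIREWALL_LINE)
ids = normalize_ids(IDS_LINE)
print(likely_related(fw, ids))
```

Once both records share the same field names, the comparison no longer needs to know which device produced them, which is the point of normalizing before categorizing.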
For example, the log data generated across many security devices, hosts and applications might be related to:
Device performance data
Network traffic
Known attacks
Known network/system problems
Anomalous/suspicious network/host activity
Access control decisions
Software failures
Hardware errors
System changes
Evidence of malicious agents
Site-specific AUP [2] violations
[1] Binary = here, data that does not contain human-readable text
[2] AUP = Acceptable Use Policy
 
Each of the above types of events presents unique analysis challenges. For example, some are produced in much higher numbers (network access control, worm events), while others are often not what they seem at first (such as network IDS “false positives”). Moreover, sometimes the threat can only be identified and rated by cross-device and cross-category analysis of the above events.
Many questions arise upon seeing the above data. How to turn that flood of data into useful and actionable information? How to find what is really relevant for the organization at the moment and for the near future? How to tell normal log records, produced in the course of business, from the anomalous and malicious ones, produced by attackers or misbehaving software?
Correlation performed by SIM (Security Information Management) software is believed to be the solution to those challenges. Correlation is defined in the dictionary as establishing or finding relationships between entities. However, a good security-specific definition is lacking. In security, “event correlation” may be defined as improving the threat identification and assessment process by looking not only at individual events, but also at sets of events bound by some common parameter (“related” events).
Types of correlation
Security-specific correlation can be loosely categorized into rule-based and statistical (or algorithmic). Rule-based correlation needs some pre-existing knowledge of the attack (“the rule”) and is able to define what it actually detected in precise terms (“Successful Shopping Cart Web Application Attack”). Such attack knowledge is used to relate events and analyze them together in a broader context.
On the other hand, statistical correlation does not employ any pre-existing knowledge of the “bad” activity (at least, not as a primary detection vehicle), but instead relies upon the knowledge of normal activities, accumulated over time. Ongoing events are then rated by the built-in algorithm and are additionally compared to the accumulated activity patterns.
This distinction is somewhat similar to signature vs anomaly IDS and makes a SIM solution a kind of meta-IDS, operating on higher-level data (not packets, but log records). Both of those correlation methods combined can help to sift through the large volume of diverse data and identify high-severity threats.
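To make the statistical idea concrete, here is a minimal sketch that compares a current event count against accumulated "normal" activity. The paper does not say which algorithm a SIM uses; the z-score (distance from the baseline mean in standard deviations), the threshold of 3 and the sample counts are assumptions chosen purely for illustration.

```python
from statistics import mean, stdev


def anomaly_score(history, current):
    """Standard deviations by which `current` exceeds the baseline mean.

    `history` is the accumulated record of normal activity, for example
    hourly counts of failed logins (hypothetical data).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any deviation is maximally unusual.
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def is_anomalous(history, current, threshold=3.0):
    """Flag counts far above the accumulated baseline (threshold is an assumption)."""
    return anomaly_score(history, current) > threshold


# Hypothetical baseline: failed logins per hour over eight normal hours.
baseline = [4, 6, 5, 7, 5, 6, 4, 5]
print(is_anomalous(baseline, 6), is_anomalous(baseline, 40))
```

Note that nothing in this sketch describes what an attack looks like; it only knows what normal looks like, which is the contrast with rule-based correlation drawn above.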
Rule-based correlation
Rule-based correlation uses some pre-existing knowledge of an attack (a rule), which is essentially a scenario that an attack must follow to be detected. Such a scenario might be encoded in the form of “if this, then that, therefore some action is needed”.
Rule-based correlation deals with states, conditions, timeouts and actions. Let us define those important terms. A state is a stationary situation that the correlation rule might be in. A state might contain various conditions, such as matching incoming events by the source IP address, protocol, port, event type, producing security device type, username and other components of the event. It should be noted that although such data components vary by device, the SIM solution normalizes them using the cross-device event schema without incurring information loss.
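The terms above (states, conditions, timeouts and actions) can be illustrated with a small two-state rule for "a probe followed by an attack from the same source". This is a hedged sketch, not how any particular SIM engine is implemented: the class name, the event fields (`src_ip`, `time`, `type`), the 300-second timeout and the alert text are all hypothetical, and events are assumed to be already normalized.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeThenAttackRule:
    # Timeout: seconds the rule may stay in the "probe seen" state per source.
    timeout: float = 300.0
    # State: source IP -> time at which a probe from that source was last seen.
    probe_seen_at: dict = field(default_factory=dict)

    def process(self, event):
        """Feed one normalized event; return an alert string or None."""
        src, now = event["src_ip"], event["time"]

        # Timeout handling: forget a probe that is too old.
        seen = self.probe_seen_at.get(src)
        if seen is not None and now - seen > self.timeout:
            del self.probe_seen_at[src]
            seen = None

        # Condition for entering the "probe seen" state.
        if event["type"] == "probe":
            self.probe_seen_at[src] = now
            return None

        # Condition for firing: an attack from a source that probed recently.
        if event["type"] == "attack" and seen is not None:
            del self.probe_seen_at[src]
            # Action: here just an alert string; a real system might notify or block.
            return f"probe followed by attack from {src}"

        return None


rule = ProbeThenAttackRule(timeout=300.0)
rule.process({"src_ip": "10.0.0.5", "time": 0.0, "type": "probe"})
print(rule.process({"src_ip": "10.0.0.5", "time": 120.0, "type": "attack"}))
```

Event times are passed in explicitly (as seconds) rather than read from a clock, so the same rule can be replayed over stored logs as well as applied to live events.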
Timeout defines how long the rule will be in a certain state. If the correlation engine has to