Proceedings of the 2008 IEEE International Conference on Information and Automation
June 20-23, 2008, Zhangjiajie, China

Acquisition and Visualization of Sensitive Security
Audit Events
Baoyun Wang, Yingjie Yang
Institute of Electronic Technology
Information Engineering University
Zhengzhou, Henan Province, 450004, China
wangbaoyun303@163.com, yangyj2006@vip.163.com
Abstract- Audit data analysis plays a critical role in the field
of information security. Acquiring sensitive security audit events
(SSAE) and visualizing their correlations is an important and very
difficult task of audit data analysis. In this paper, we propose
an approach to acquire SSAE and present their correlations in the
form of graphs. Firstly, we use DWT (discrete wavelet
transformation) to get sensitive security audit event objects, and
then use DBSCAN (a clustering algorithm of KDD) and a database
query technique to obtain the SSAE related to the sensitive
objects. Secondly, a security audit event visualization model
based on the theory of Colored Petri-net is presented to visualize
correlations of SSAE, and the acquisition process of causal
relationships among audit events is given. Lastly, we carry out an
experiment, which shows that the proposed approach makes it more
convenient for security auditors to browse and analyse audit data.

I. INTRODUCTION
Security audit is an important part of information security
management, and event logs play an important role in it. However,
when a security auditor faces a large amount of logs in the form
of records, it is impractical to browse and analyze them one by
one. Acquiring sensitive security audit events (SSAE) and
visualizing their correlations is therefore important for security
audit.
There are several difficulties in achieving the above goal.
Firstly, audit events are huge in number and of different kinds,
and some of them contain little useful information. Secondly,
audit events generally have a property named urgent level, whose
value can be warning, mistake, information and so on, but this
property often cannot reflect potential threats, so we cannot
acquire SSAE by that event property alone. Thirdly, to visualize
audit event correlations, it is better to have an event
visualization model theory, but there are currently no related
works.
To overcome the above problems, in section three we start from
the selection of sensitive event objects, and then acquire SSAE.
In section four, we give a security audit event visualization
model. Section five gives the acquisition process of causal
relationships among events. Section six gives an experiment
example, and we conclude in section seven.
II. RELATED WORKS
Currently, there are some log analysis tools and techniques
[1]. Their event correlation methods are mainly used for
network fault management, and some of them have been applied
to security management and intrusion detection. But they
mainly focus on log correlation, and few of them consider the
acquisition of SSAE. Meanwhile, some log visualization systems
[2,3,4,5,6] have been available for some time. They apply
information visualization techniques to the log monitoring area.
They visualize user actions to some extent, but their
visualization is limited to network data, and some of them only
visualize statistics of user actions. They cannot handle a huge
amount of log messages, because they have no filter mechanism and
cannot find SSAE. One of the most serious problems in related
works is that the visualization cannot present the relationships
among events in terms of time, space and logic, and therefore
cannot help the security auditor with deep analysis.
III. ACQUISITION OF SENSITIVE SECURITY AUDIT EVENTS
In this section, we give the acquisition process, starting with
file operation events. Audit events are huge in number and involve
lots of files, but different files differ in importance, so we
first find the sensitive object files, and then find the SSAE.
The main process is shown in figure 1.

Figure 1. Acquisition of sensitive security audit events

In figure 1, the words on the broken lines denote the methods we use. The process is as follows:
1) Use the DWT technique to select sensitive object files in the object set; we get sensitive object files A, B, C…
2) Use the DBSCAN algorithm to cluster events by their occurrence time; the events in the same cluster are strongly correlated in terms of time. The events whose objects are sensitive object files fall into different clusters. We get the sensitive security audit event sets Cluster(A), Cluster(B, C)… in the aspect of time.
3) Query the events database for the users that accessed sensitive files A, B, C… and their actions; we get the sensitive security audit event sets S(A), S(B), S(C)… in the aspect of logic.
4) Unite the events obtained from the second step and the third step to get the sensitive security audit events S.
We discuss the DWT and DBSCAN techniques in detail. We omit the explanation of the database query process because it is easy to understand.
A. Acquisition of sensitive object files based on DWT
The object files that we choose are the ones whose changes may be abnormal. In terms of system security, the files updated on a single host are usually user files, corresponding to user applications, so what we are interested in are the files in the network system. In a network environment, the system files in the same path are identified as the same file. The changes of files can be described as a discrete signal. An abnormal file update is often relevant to a noise signal. As DWT is very popular in the field of signal processing, we use DWT to accomplish this file selection task. DWT has also been used in anomaly detection [7] [8]. The wavelet that we use is the Haar wavelet, which is the simplest wavelet.
Some definitions:
1) $S_i(n)$: the number of hosts that update file i on day n.
2) $cA_i$: $S_i$ can be decomposed into a low frequency signal and a high frequency signal; $cA_i$ is the first, reflecting the long term update trend.
3) $cD_i$: $cD_i$ is the high frequency signal of $S_i$, reflecting the day-to-day variation from the long term trend.
4) $cA_i(j-1)$: the signal value of $cA_i$ on day j-1.
5) $R_i(j)$: $R_i(j) = S_i(j) - cA_i(j-1)$, the residual signal value.
The process is as follows:
For file i:
1) $S_i = cA_i + cD_i$
2) $R_i(j) = S_i(j) - cA_i(j-1)$
3) if $R_i(j) > \alpha$
If $R_i(j)$ exceeds the preset threshold $\alpha$, which is a parameter based on the statistical distribution of historical residual values, then the actual number of hosts that have updated file i on day j is significantly larger or smaller than the prediction $cA_i(j-1)$ based on the long term trend. Therefore, file i is selected.
B. Acquisition of Time correlated SSAE based on DBSCAN
DBSCAN (density-based spatial clustering of application with noise) is a clustering algorithm of KDD (Knowledge Discovery and Data Mining). It clusters data on spatial distance and density, which suits clusters of any shape in planar space. Our application is based on time distance, so our method is to use the time dimension in place of the space dimension.
Some definitions [9]:
1) For any point p, the area whose center is p and whose radius is $\varepsilon$ is named the adjacent area of p, marked as $P = p(\varepsilon)$.
2) For point p and radius $\varepsilon$, if there exist points $q_1, q_2, \ldots, q_m \in p(\varepsilon)$, where m is the minimum number that must be met, then p is a kernel.
3) If $q \in p(\varepsilon)$ and p is a kernel, then q is directly density reachable from p.
4) Points p and q are density reachable if and only if there exist $p_1, p_2, \ldots, p_n$ such that p and $p_1$ are directly density reachable, $p_i$ and $p_{i+1}$ ($1 \le i \le n-1$) are directly density reachable, and $p_n$ and q are directly density reachable, as in figure 2.

Figure 2. DBSCAN algorithm (labels: Reachability-distance (p, q1) = d (p, q1); Reachability-distance (p, q2) = d (p, q2))

The main process is as follows:
1) Preprocess the security audit events data set; every event can be marked as (ID, T), where T is the event time. We call the preprocessed dataset TE (time event). The time event set that is directly related to sensitive objects is named M, where M is a subset of TE.
2) Fix the time radius r and the minimum number minpts.
3) For any object p in M, search the entire data set TE; if p is a kernel object, then find all the objects density reachable from p based on r and minpts, create a new cluster and attach a cluster tag.
4) If p is not a kernel object, consider it as a noise point temporarily.
5) Process another object until all the objects are finished.
So far, we have got SSAE from the above method. In order to visualize their correlation, we give a security audit event visualization model in the next section.
IV. SECURITY AUDIT EVENT VISUALIZATION MODEL
A security audit event usually has some required properties, such as subject, action and object. Our security audit event visualization model integrates these properties and some other relation sets. The model is named SAVM. It is an extension to Colored Petri-net and it is a variation of HCPN [10].
$SAVM = (C, P, T, F, G, E, O, M_0, V)$, where:
1) C (color set): $C = C_u \cup C_p$. C is a non-empty finite set of colors, where $C_u$ and $C_p$ correspond to event subjects, which are user and process.
2) P (place set):
$P = File \cup DB \cup Device \cup Networkresource \cup Certificate$. P is a finite set of event objects; it mainly contains file, database, device, network resource and certificate.
3) T (transition set): $T = T_{ob} \cup T_{sv} \cup T_{at}$
$T_{ob}$ = {read, write, execute, delete, add, change, replace, logon…}
$T_{sv}$ = {http, ftp, telnet, pop3}
$T_{at}$ = {scan, prob, dos, overflow, cheat, privilege, trojanhorse, backdoor}
T is a finite set of event actions. $T_{ob}$ is the single step action set related to file, database, and device etc. $T_{sv}$ is the action set related to network services. $T_{at}$ is the action set related to attacks.
4) F (flow set): $F = F_1 \cup F_2 \cup F_3$, $F_1 \subseteq (P \times T)$, $F_2 \subseteq (T \times P)$, $F_3 \subseteq (T \times T)$. F is a set of relations. $F_1$ and $F_2$ are relations between objects and actions; $F_3$ is relations between actions and actions.
5) G (guard function set): $G = \{g : F_1 \to C_{MS}\}$. G is a set of guard functions associated with $F_1$, describing the conditions to be met before an action can be conducted by a subject. $C_{MS}$ represents the multi-set of color set C.
6) E (effect function set): $E = \{e : F_2 \to C_{MS}\}$. E is a set of functions associated with $F_2$, describing the result of the relationship change due to the action.
7) O (pre-order function set): $O = \{o : F_3 \to \Delta T\}$. O is a set of functions associated with $F_3$; $\Delta T$ is time distance. Pre-order functions describe the time sequence between two actions.
8) $M_0$ (initial marking distribution): $M_0$ is the subject-object ownership, describing the beginning event before correlating events.
9) V (mapping rules): V is a set of mapping rules. It aims at mapping audit events to graphs, as follows: Place set P maps to circle nodes. Transition set T maps to rectangle nodes. Flow set F maps to the direction of arcs. Color set C dyes circle nodes and rectangle nodes. Guard function set G, effect function set E and pre-order function set O map to arcs.
An example of visualization of audit events with SAVM is shown in figure 3. It describes the process of a local to root attack (L2R) [11]. Circle nodes represent resources (event objects), and the rectangle nodes represent actions conducted by a user (subject). In SAVM, if a subject accesses an object or has an action, then the circle node (object) or rectangle node (action) is dyed to the corresponding color (subject). In figure 3 there is only one kind of color, denoting that all the actions were conducted by the same user. There are three kinds of arcs in SAVM, which are object to action, action to object, and action to action (they are derived from F, G, E, O).

Figure 3. local to root attack described in SAVM

V. ACQUISITION OF CAUSAL RELATIONSHIP
In the above model SAVM, G, E, O reflect the causal relationships of audit events. We acquire these sets from two aspects. The first we call static rules. The second we call dynamic rules.
Static rules contain access control rules and attack condition rules. Access control rules can be described as Subject(certificate)->Actions(object). Privileges are described by role, and actions are what the role could conduct. For simplicity, some examples of access control rules can be seen in table 1.

TABLE I STATIC RULES OF ACCESS CONTROL
Subject(Certificate)    Actions(objects)
Root(key)               Read (A, B), write(A, B)
User(passwords)         Read (A, B), write(B)
Anonymous( )            Read (A, B)

Attack condition rules state, when an attack occurs, what prerequisite it should have and what consequence it will give. They can be described as the following sequence [11]: prerequisite → actions → consequence. Some examples of attack condition rules can be seen in table 2.

TABLE II STATIC RULES OF ATTACK CONDITION
Action            Prerequisite                             Consequence
IPSweep           Access(SrcIP)                            IsAlive(DestIP)
PortScan          Access(SrcIP) and IsAlive(DestIP)        OpenPort(DestIP, Port)
Chaos query       ExistService(DestIP, ‘DNS’)              DNSVersion (DestIP, Version)
TSIG overflow     DNSVersion (DestIP, ‘ISC BIND 8.2.2’)    Access(DestIP)
FTP               IsAlive(DestIP)                          FTPVersion (DestIP, Version)
Wuftp Overflow    DNSVersion(DestIP, ‘wuftp2.6.2’)         Access(DestIP)

For the second aspect, dynamic rules, we use a sequence mining algorithm to get the dynamic causal relationship. When the frequent sequences are found, we consider that the events in a frequent sequence pattern have a causal relationship. In SAVM, an audit event can be pre-treated as e = (subject, action, object). The events set is $E = \{e_1, e_2, \ldots, e_n\}$. The relevant definition in sequence mining is support. For a sequence pattern P, assuming its length is l, the computing formula of support is $Support(P) = N_p / N_l$, where $N_p$ is the count of occurrences of sequence pattern P in the events set, and $N_l$ is the count of sequences whose length is l in the events set.
We use a sliding window algorithm. We assume that the maximum length of the sliding window is MaxLen. The minimum support is MinSup. The main process is as follows.
1) First, set the breadth of the sliding window to 1 (len=1), and then look up the sequences of length 1 whose support
exceeds MinSup in E.
2) Increase the sliding window (len=len+1), make the left boundary of the sliding window overlap $e_1$, and get the subsequence $Sub_1 = \{e_1, e_2, \ldots, e_{len}\}$.
3) Slide the window until the right boundary of the sliding window overlaps $e_n$; after every sliding step, we get a subsequence of length len, $Sub_i = \{e_i, e_{i+1}, \ldots, e_{i+len-1}\}$ ($i = 1, 2, \ldots, n-len+1$). When the sliding process is finished, we have n-len+1 sub-sequences of length len; compute the support of these n-len+1 sub-sequences, and get the sequences that meet MinSup.
4) Repeat the second and the third step until len=MaxLen.
Some examples of dynamic rules can be seen in table 3. The sequence pattern shows the user’s command habit.

TABLE III DYNAMIC RULES ACQUIRED BY SEQUENCE MINING
MinSup    Sequence pattern
30%       su->tcsh->ls
40%       ls->ls/etc
15%       ls->mail->su->tcsh->ls->df
20%       ls->cat/etc/passwd

From the above static and dynamic rules, we get the causal relationship among audit events. In fact, after all the sensitive security audit events are acquired, in order to see clearly the spatial relationship of audit events, we part them into different sets based on the statistics of the event destination IP address. The visualization process is as in figure 4. We give the experiment in section 6.

Figure 4. Visualization process

VI. EXPERIMENTS
Our experiment environment is a network in which security monitoring software has been installed. The topology is as in figure 5. As in figure 5, in this network environment, there are 30 terminal machines, every user machine has the monitoring software installed, and only one machine can connect to the Internet. There is an FTP server and the software is Serv-U. So the audit data sources are the Network Monitoring Server log, Firewall log, FTP server log and operating system (Windows Server 2000) log. We use the acquisition and visualization methods proposed in the paper to process the audit data of a month.

Figure 5. Experiment environment topology

All sensitive object files were acquired by DWT; from these sensitive objects and DBSCAN we get SSAE. We acquire about 20 sensitive object files. For example, the file C:\WINNT\system32\wbem\Logs is acquired as in figure 6. In the experiment, the threshold is 10. From the figure, we can see that on day 9 and day 23 the file’s changes are sensitive.

Figure 6. DWT Process Result

The DBSCAN graph is in figure 7. In figure 7, different colors represent different clusters, the bigger cross is an outlier, the smaller cross represents an event containing sensitive objects which starts a cluster, and the circle point is an event added to the cluster. Two typical clusters are marked by ellipses.

Figure 7. DBSCAN Process Result

We get the causal relationships in table 4 and table 5, which contain static rules and dynamic rules.

TABLE IV STATIC RULES USED IN EXPERIMENT
Access Control Rules
administrator(key, passwords)--->set policy for operator (domain, functions).
administrator(key, passwords)--->change policy for operator (domain, functions).
administrator(key, passwords)--->shutdown server.
administrator(key, passwords)--->log backup.
operator(key, passwords)--->set policy for each function(device, network, file, logon…)
operator (key, passwords)--->change policy for each
function(device, network, file, logon…).
operator(key, passwords)--->uninstall agent.
auditor(key, passwords)--->query and analysis logs of each role(administrator, operator, auditor).
user(key, passwords) --->logon system.
user(network address available) --->access network.
user(email service available) --->send email.
user(ftp service available) --->transport files.
user(disk writeable) --->write files on disk.
user(mobile device writeable) --->write files on mobile device.
(key represents the password or credential for a user to log on to the system).
Attack Condition Rules
Access(SrcIP)-> IPSweep-> IsAlive(DestIP).
IsAlive(DestIP)->PortScan-> OpenPort(DestIP, Port).
IsAlive(DestIP)->FTP-> FTPVersion(DestIP, Version).
FTPVersion(DestIP, Version)->FTP(user)->Promote (DestIP).
Promote(DestIP)->Administrator(DestIP)->Anyaction.

TABLE V DYNAMIC RULES ACQUIRED IN EXPERIMENT
Columns: User, Sequence pattern, MinSup.
User: Wu, Liangz, pj, yongw, kk, jingyu, yuhan, baby, sweet
MinSup: 10%, 15%, 15%, 10%, 12%, 15%, 10%, 8%, 18%
Each sequence pattern is a chain of (action, object) events joined by ->, with actions such as network_http, network_load, network_email, app, file_new, file_rename, file_copy, file_read, log_in, policy_mod, policy_check, device_add and device_del.

Using the above results, we visualized two event scenarios. The graphs are as in figure 8 and figure 9. In the figures, the S axis represents space (IP address) and the T axis represents time.
Some symbols explanation:
P0: System file C:\WINNT\system32\config\SAM, which is relevant to user account and password.
P1: a file which contains a Trojan horse program that can change system files.
P3: C\Program files
P4: C:\WINNT\Debug\oakley.log.sav
P5: C:\WINNT\system32\wbem\Logs
Figure 8 displays an information leak process. The meaning of figure 8: An inner user leaks security information with three identities (three colors). Firstly, he logged on to the control center on IP1 in the role of administrator and changed the security policy on mobile devices on his machine, allowing writing on the mobile device. When the policy was sent, it immediately caused system file P3 to be updated. The user added device P5 and copied file P6; the file was owned by another user. At the same time, he logged on to the Internet on IP3 and transmitted the file using email. Objects P3 and P5 were selected by DWT; other objects and events were got by the DBSCAN algorithm and event query based on sensitive objects.
Figure 9 describes an attack and intrusion process. An attacker scanned IPs and ports in the network system using some tools, and he found that port 21 on IP2 was open and that the software edition was Serv-U. The attacker acquired the FTP administrator password (the dark frame means lots of attempt logs), and the privilege of the attacker was promoted because of a software vulnerability. Then the attacker entered the computer system in the role of administrator, uploaded file P1 and executed it. As a result, P0 was changed, and system files P2, P3, P4 and P5 were changed immediately. P0, P3, P4 and P5 were selected by DWT (because of the transmission of the Trojan horse program in the FTP server, these files were changed on lots of hosts), and other events were acquired by the DBSCAN algorithm and database query. We must point out that the action of Privilege Promote was not recorded in the log files; we add it because this can clearly demonstrate the entire intrusion process. Of course, in order to record this action, we need monitoring tools for system call sequences. But the task of adding an appropriate action for a scenario that is not very clear is also a challenging problem.
VII. CONCLUSIONS AND FUTURE WORK
In this paper, we describe an approach to acquire and visualize SSAE. Our approach has the following features:
1) It effectively acquires sensitive object files using DWT and effectively acquires SSAE using DBSCAN and a database query technique.
2) The audit event visualization model gives a framework to visualize audit events, which is very convenient for auditors to browse or analyse audit events, and they can clearly know what events have happened.
3) It acquires causal relationships among audit events in terms of static rules and dynamic rules, and dynamic rules are acquired by a sequence mining algorithm.
Our approach can be improved in the following two areas:
1) To acquire SSAE we start from and focus on the sensitive files and other objects; we plan to expand our method to suit events which do not involve file objects.
2) For dynamic rules in SAVM, we use a sequence mining algorithm to acquire them, but this method requires that the occurrence frequency of the audit events meets the support. If a sensitive event occurred only once, then we cannot acquire the rules. In future work we can apply Correlation Rules (a method in KDD) to solve this problem.
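To make the support computation of section V concrete, the following is a minimal Python sketch of the sliding window procedure, not the implementation used in the experiment. It assumes that N_l equals the number of windows of length l (n - l + 1), that events are already reduced to hashable labels (plain command names here instead of (subject, action, object) triples), and that a pattern qualifies when its support is at least MinSup. The function name mine_frequent_sequences and the sample event list are hypothetical.

```python
from collections import Counter


def mine_frequent_sequences(events, max_len, min_sup):
    """Return {pattern: support} for contiguous patterns of length 1..max_len."""
    n = len(events)
    frequent = {}
    for length in range(1, min(max_len, n) + 1):
        # Slide a window of the current length from e1 to en:
        # this yields n - length + 1 sub-sequences.
        windows = [tuple(events[i:i + length]) for i in range(n - length + 1)]
        total = len(windows)  # assumed meaning of N_l
        for pattern, count in Counter(windows).items():
            support = count / total  # Support(P) = N_p / N_l
            if support >= min_sup:   # assumption: "meets MinSup" means >=
                frequent[pattern] = support
    return frequent


# Made-up command log in the style of Table III.
events = ["su", "tcsh", "ls", "mail", "su", "tcsh", "ls", "df"]
rules = mine_frequent_sequences(events, max_len=3, min_sup=0.25)
for pattern, support in sorted(rules.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print("->".join(pattern), round(support, 3))
```

In this made-up log, mail occurs only once, so no pattern containing it reaches the threshold, which illustrates the limitation on rarely occurring events noted in the conclusions.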

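The DWT-based file selection of section III.A can be illustrated in the same spirit. The sketch below is only an approximation under stated assumptions: the low frequency signal cA is taken as the reconstruction from Haar approximation coefficients, which for aligned blocks equals the block mean over 2^levels samples; an incomplete trailing block is simply averaged over the samples it has; days are indexed from 0. The names haar_trend and sensitive_days and the host counts are hypothetical, and the value alpha=10 merely mirrors the threshold quoted in the experiment.

```python
def haar_trend(signal, levels=1):
    """Return cA: block means over 2**levels samples, same length as signal."""
    block = 2 ** levels
    trend = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        # Every sample of a block is replaced by the block mean.
        trend.extend([sum(chunk) / len(chunk)] * len(chunk))
    return trend


def sensitive_days(signal, alpha, levels=1):
    """Return the days j where R(j) = S(j) - cA(j-1) exceeds alpha."""
    cA = haar_trend(signal, levels)
    cD = [s - a for s, a in zip(signal, cA)]  # high frequency part
    # Sanity check of the decomposition S = cA + cD.
    assert all(abs(a + d - s) < 1e-9 for a, d, s in zip(cA, cD, signal))
    return [j for j in range(1, len(signal)) if signal[j] - cA[j - 1] > alpha]


# Made-up number of hosts updating one file on each day.
hosts_per_day = [2, 3, 2, 2, 3, 2, 2, 3, 2, 30, 2, 3]
flagged = sensitive_days(hosts_per_day, alpha=10)
print(flagged)
```

Because cA(j-1) can fall in the same block as day j, part of a spike is absorbed into the trend in this simplified version; a one-sided test is used because the selection rule is written as R_i(j) > α, and a two-sided test on the absolute residual would be a variation that the text does not specify.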
Figure 8. An information leak process

Figure 9. An Attack and Intrusion process

ACKNOWLEDGMENT
Financial supports from China 863 project are highly appreciated. The helpful comments from reviewers are also gratefully acknowledged.
REFERENCES
[1] Risto Vaarandi. SEC – a Lightweight Event Correlation Tool. Proceedings of the 2002 IEEE Workshop on IP Operations and Management, pp. 111-115, 2002.
[2] Takada T., Koike H. Tudumi: information visualization system for monitoring and auditing computer logs. Proceedings of the Sixth International Conference on Information Visualization (IV'02), 2002.
[3] S.G. Eick and P.J. Lucas: Displaying trace files. Software Practice and Experience, Vol. 26, No. 4, pp. 399-409, 1996.
[4] Becker, R.A., Eick, S.G. and Wilks, A.R.: Visualizing Network Data. IEEE Trans. Visualization and Computer Graphics, Vol. 1, No. 1, pp. 16-28, 1995.
[5] Girardin L., Brodbeck D. A visual approach for monitoring logs. Proceedings of the Twelfth Systems Administration Conference (LISA'98), pp. 299-308, 1998.
[6] Zhi Guo. Research on Visualization in Intrusion Detection. Tsinghua University, 2004.
[7] J. Zhang, F. Tsui, M. Wagner, and W.R. Hogan. Detection of Outbreaks from Time Series Data Using Wavelet Transform. In AMIA Fall Symp., Omni Press CD, pp. 748–752, October 2003.
[8] Yinglian Xie. A Spatiotemporal Event Correlation Approach to Computer Security. School of Computer Science, Carnegie Mellon University, 2005.
[9] Jun Qian, Chao Xu, and Meilin Shi. The Implementation of Alert Aggregation and Dataset Testing. Journal of Computer Research and Development, 43(4), pp. 627-632, 2006.
[10] Dong Yu, Deborah Frincke. A Novel Framework for Alert Correlation and Understanding. International Conference on Applied Cryptography and Network Security (ACNS), Springer-Verlag, pp. 452-466, 2004.
[11] Jingmin Zhou. Modeling Network Intrusion Detection Alerts for Correlation. ACM Transactions on Information and System Security, 10(1), pp. 1-31, 2007.