during the comparison of session data and user's profile. This type of system is well suited to the detection of previously unknown attacks. Its main disadvantage is that it may not be able to describe what the attack is, and it may sometimes have a high false positive rate. In contrast, a misuse detection model makes its decision by comparing the user's session or commands with the rules or signatures of attacks previously used by attackers.

We present an unsupervised machine learning approach to intrusion detection in databases that use a role based access control (RBAC) mechanism. That is, a number of roles have been defined and assigned to the users of the database system, and appropriate privileges have been assigned to these roles with database security in view.

The rest of this paper is organized as follows. In section 2, we discuss related background. In section 3, a detailed overview of our approach is given. In section 4, the analysis and results of our approach are presented. Finally, in section 5 we conclude, with the references at the end.

2. Related Work

Application of machine learning techniques to database security is an emerging area of research. Various approaches use machine learning and data mining techniques to enhance the traditional security mechanisms of databases. Bertino et al. [6] have proposed a framework based on anomaly detection techniques to detect malicious behavior of database application programs. Association rule mining techniques are used to determine the normal behavior of application programs, and query traces from database logs are used for this purpose. This scheme may suffer from high detection overhead when there is a large number of distinct template queries, i.e. the number of association rules to be maintained will be large.

DEMIDS is a misuse-detection system tailored for relational database systems [7]. It uses audit log data to derive profiles describing typical patterns of access by database users. The main drawback of the approach presented in [7] is a lack of implementation and experimentation. The approach has only been described theoretically, and no empirical evidence has been presented of its performance as a detection mechanism.

Yi Hu and Brajendra Panda proposed a data mining approach [8] for intrusion detection in database systems. This approach determines the data dependencies among the data items in the database system. Read and write dependency rules are generated to detect intrusion. The approach is novel, but its scope is limited to detecting malicious behavior in user transactions. Even within that scope, it is limited to user transactions that conform to the read-write patterns assumed by the authors. Also, the system is not able to detect malicious behavior in individual read-write commands, and its false alarm rate may be higher. It also does not hold good for different access roles.

Srivastava et al. [9] have presented an approach for extracting dependencies among attributes of a database using weighted sequence mining. They take the sensitivity of data items into consideration in the form of weights. The advantage of this approach is that more rules are generated than with the approach presented in [8], and the additional rules reduce false alarms. However, it is also not well suited to role based database access.

Kamra et al. [10] have proposed a role based approach for detecting malicious behavior in RBAC (role based access control) administered databases. A classification technique is used to deduce role profiles of normal user behavior. An alarm is raised if the role estimated by classification for a given user differs from the actual role of that user. The approach is well suited to databases which employ a role based access control mechanism, and it addresses the insider threat scenario directly. Its limitation is that it is a query-based approach and cannot extract correlation among the queries in a transaction.

3. Our Approach

The approach we present is a transaction level approach. Attributes referred to together for read and write operations in transactions play an important role in defining the normal behavior of a user's activities.

For example, consider the following transaction:

Begin transaction
select a1,a2,a3 from t1 where a1 = 25;
update t2 set a4 = a2 + 1.2(a3);
End transaction

Where t1 and t2 are tables of the database, a1, a2, a3 are the attributes of table t1, and a4, a5 are the attributes of table t2.

This example shows the correlation between the two queries of the transaction. It states that after issuing the select query, the update query should also be issued by the same user and in the same transaction. The approach presented in [10] can easily detect the attributes which are to be referred to together, but it cannot detect the queries which are to be executed together. Our approach extracts this correlation among the queries of the transaction.

In this approach, the database log is read to extract the list of tables accessed by each transaction and the lists of attributes read and written by it. The extracted information is represented in the following structure format:

(Read, TB-Acc[ ], Attr-Acc[ ][ ], Write, TB-Acc[ ], Attr-Acc[ ][ ])

Where Read and Write are binary fields, TB-Acc[ ] is a binary vector of size equal to the number of relations in the database, and Attr-Acc[ ][ ] is a vector of N vectors, where N is equal to the number of relations in the database. If the transaction contains a select query then Read is equal to 1, otherwise it is 0. Similarly, if the transaction contains an update or insert query then Write is equal to 1, otherwise it is 0. Element TB-Acc[i] = 1 if the SQL command at hand accesses the i-th table and 0 otherwise. Element Attr-Acc[i][j] = 1 if the SQL command at hand accesses the j-th attribute of the i-th table and 0 otherwise. Table 1 shows the representation of the example transaction given above using this format.
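The structure format can be illustrated with a minimal Python sketch. It assumes the log has already been parsed into the tables and attributes read and written by a transaction. The names `SCHEMA`, `encode_half` and `encode_transaction` are hypothetical and not taken from the paper, and the schema is the one from the running example (t1 with a1, a2, a3 and t2 with a4, a5).

```python
# Hypothetical schema taken from the running example: t1(a1, a2, a3), t2(a4, a5).
SCHEMA = {"t1": ["a1", "a2", "a3"], "t2": ["a4", "a5"]}


def encode_half(access):
    """Encode one access type (read or write) as [flag, TB-Acc..., Attr-Acc...].

    `access` maps each accessed table name to the attributes accessed in it.
    Attr-Acc is flattened table by table, in the same order as Table 1.
    """
    flag = [int(bool(access))]
    tb_acc = [int(table in access) for table in SCHEMA]
    attr_acc = [
        int(attr in access.get(table, ()))
        for table, attrs in SCHEMA.items()
        for attr in attrs
    ]
    return flag + tb_acc + attr_acc


def encode_transaction(read, write):
    """Concatenate the read half and the write half of the structure."""
    return encode_half(read) + encode_half(write)


# The example transaction: it reads a1, a2, a3 of t1 and writes a4 of t2.
example = encode_transaction(
    read={"t1": ["a1", "a2", "a3"]},
    write={"t2": ["a4"]},
)
print("".join(str(bit) for bit in example))  # 1101110010100010
```

The first eight bits are the read half (Rd, t1, t2, a1..a5) and the last eight are the write half (Wt, t1, t2, a1..a5), which agrees with the two rows of Table 1. Extracting the read and write sets from real SQL text is outside the scope of this sketch.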
(IJCNS) International Journal of Computer and Network Security, 73
Vol. 2, No. 6, June 2010
Table 1: Representation of example transaction

Rd  t1  t2  a1  a2  a3  a4  a5
1   1   0   1   1   1   0   0

Table 1: (Continued)

Wt  t1  t2  a1  a2  a3  a4  a5
1   0   1   0   0   0   1   0

Where Rd = Read and Wt = Write.

The values of the fields of the above structure form the normal behavior of the transaction to be issued by the user. A violation of this behavior is detected as anomalous. The overall approach is depicted in figure 1.

[Figure 1: overall approach. Block labels: Clustering (Learning Phase), Clusters (Role Profiles), Current Session, User transaction, Comparison (Detection Phase), Outlier, Raise Alarm, Update, New DB Log]

into number of groups, we have used the k-means clustering algorithm. K-means is among the fastest of the partitioning clustering algorithms. Training tuples generated from the database log have binary data fields, so similarity measures for binary variables can be used to cluster them. The similarity measure between two tuples used by the clustering algorithm of our approach is as follows.

$$\mathrm{simm}(t_1, t_2) = \frac{\mathit{ncount}_{11}}{\mathit{ncount}_{11} + \mathit{ncount}_{10} + \mathit{ncount}_{01}}$$

Where ncount11 is the number of binary fields that have value 1 in both tuples t1 and t2. By the same naming convention, ncount10 and ncount01 count the fields that have value 1 in only t1 and in only t2, respectively.

For example, consider the following transaction:

Transaction tr1
Begin Transaction
select a1,a2 from t1;
update t2 set a4;
End Transaction

Corresponding bit pattern:
1101100010100010
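The similarity measure above has the form of the Jaccard coefficient for binary vectors. The following Python sketch computes it for two bit patterns. The function name `simm` follows the formula, while the handling of two all-zero tuples is an assumption made here, since the text does not define that case. The second pattern is the example transaction as encoded in Table 1.

```python
def simm(t1, t2):
    """Similarity of two equal-length binary tuples:
    ncount11 / (ncount11 + ncount10 + ncount01)."""
    if len(t1) != len(t2):
        raise ValueError("tuples must have the same number of fields")
    ncount11 = sum(1 for x, y in zip(t1, t2) if x == 1 and y == 1)
    ncount10 = sum(1 for x, y in zip(t1, t2) if x == 1 and y == 0)
    ncount01 = sum(1 for x, y in zip(t1, t2) if x == 0 and y == 1)
    denominator = ncount11 + ncount10 + ncount01
    # Assumption: two all-zero tuples are treated as identical (similarity 1.0).
    return ncount11 / denominator if denominator else 1.0


def bits(pattern):
    """Turn a string such as '1101' into a list of integer bits."""
    return [int(ch) for ch in pattern]


tr1 = bits("1101100010100010")            # transaction tr1 above
table1_example = bits("1101110010100010")  # example transaction from Table 1

print(simm(tr1, table1_example))  # 0.875
```

The two patterns differ only in the read bit for a3, so ncount11 = 7, ncount10 = 0 and ncount01 = 1, which gives 7/8. How this measure is combined with the k-means centroid update is not stated in the text, so the sketch covers only the pairwise similarity.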
role are grouped into same cluster. The approach is also well suited to users with more than one role. Only the detection phase needs to be generalized.

4. Result and Analysis

To verify our approach, we generated a number of database tables, each with a number of attributes. We defined a number of roles and generated a number of transactions for these roles. Based on these transactions, we also generated a large number of tuples as a training dataset. For detection, we generated a number of valid as well as invalid transactions. We tested our approach by supplying both kinds of transactions, and it classified all of them correctly. We considered all the possible ways of generating valid and invalid transactions and obtained the proper result in all cases. In these tests the approach detected the correlations among the commands of the transactions. When we issued a valid transaction with one of its SQL commands eliminated, it was detected as an invalid transaction. When we issued the transaction with all the desired SQL commands, it was detected as a valid transaction. Training time also varied linearly with the number of training tuples, as expected. Figure 2 shows the nature of training time vs number of training tuples.

Figure 2. Training Time Vs Training Data

5. Conclusion

In this paper we have proposed a new unsupervised machine learning approach to database intrusion detection for databases in which a role based access control (RBAC) mechanism is enabled. It considers the correlations among the queries of a transaction and detects violations of them. It does not require role information to be logged in the database log. The clusters of transactions generated can also provide guidelines to the database administrator for role definitions.

References

[1] Fredrik Valeur, Darren Mutz, and Giovanni Vigna, "A learning-based approach to the detection of SQL attacks," In Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), pages 123-140, 2003.
[2] V. C. S. Lee, J. A. Stankovic, and S. H. Son, "Intrusion Detection in Real-time Database Systems Via Time Signatures," In Proceedings of the Sixth IEEE Real Time Technology and Applications Symposium, pages 121-128, 2000.
[3] Marco Vieira and Henrique Madeira, "Detection of Malicious Transactions in DBMS," IEEE Proceedings - 11th Pacific Rim International Symposium on Dependable Computing, pp. 8, Dec 12-14, 2005.
[4] Ashish Kamra, Elisa Bertino, and Evimaria Terzi, "Detecting anomalous access patterns in relational databases," The International Journal on Very Large Data Bases (VLDB), 2008.
[5] Wai Lup Low, Joseph Lee, and Peter Teoh, "DIDAFIT: Detecting intrusions in databases through fingerprinting transactions," ICEIS 2002 - Databases and Information Systems Integration, pages 121-127, 2002.
[6] Elisa Bertino, Ashish Kamra, and James Early, "Profiling database application to detect SQL injection attacks," IEEE International Performance, Computing, and Communications Conference (IPCCC) 2007, pages 449-458, April 2007.
[7] C. Y. Chung, M. Gertz, and K. Levitt, "DEMIDS: a misuse detection system for database systems," In Integrity and Internal Control in Information Systems: Strategic Views on the Need for Control. IFIP TC11 WG11.5 Third Working Conference, pages 159-178, 2000.
[8] Yi Hu and Brajendra Panda, "A data mining approach for database intrusion detection," In SAC '04: Proceedings of the 2004 ACM Symposium on Applied Computing, pages 711-716, New York, NY, USA, 2004.
[9] Abhinav Srivastava, Shamik Sural, and A. K. Majumdar, "Database intrusion detection using weighted sequence mining," Journal of Computers, Vol. 1, No. 4, pages 8-12, July 2006.
[10] Elisa Bertino, Ashish Kamra, and Evimaria Terzi, "Intrusion detection in RBAC-administered databases," In Proceedings of the Annual Computer Security Applications Conference (ACSAC), 2005.