International Journal of Computer Science & Information Security

IJCSIS Vol. 10 No. 9, September 2012
ISSN 1947-5500
Message from Managing Editor
The International Journal of Computer Science and Information Security (IJCSIS) has, since May 2009, published research articles in the emerging areas of computer applications and practices and the latest advances in cloud computing, information security, green IT, etc. The research theme focuses mainly on innovative developments and research issues/solutions in computer science and related technologies. IJCSIS is a well-established journal for disseminating high-quality research papers, as recognised by various universities, international professional bodies and Google Scholar citations. The IJCSIS editorial board solicits high-calibre researchers and scholars to contribute to the journal by submitting articles that illustrate research results, projects, survey works and industrial experiences. The aim is also to allow academics to promptly publish research work to sustain or further their careers.

The IJCSIS publication archives, abstracting/indexing, editorial board and other important information are available online on the journal homepage. IJCSIS appreciates all the insights and advice from authors and reviewers. The journal is indexed by the following international agencies and institutions: Google Scholar, Bielefeld Academic Search Engine (BASE), CiteSeerX, SCIRUS, Cornell University Library, EI, Scopus, DBLP, DOI, ProQuest, EBSCO.

Google Scholar reports a large number of citations to papers published in IJCSIS. We will continue to encourage authors and reviewers to cite papers published by the journal. Considering the growing interest of academics worldwide in publishing in IJCSIS, we invite universities and institutions to partner with us to further encourage open-access publications.

We look forward to further collaboration. If you have further questions, please do not hesitate to contact us. Our team is committed to providing quick and supportive service throughout the publication process.

A complete list of journals can be found at:
http://sites.google.com/site/ijcsis/
IJCSIS Vol. 10, No. 9, September 2012 Edition
ISSN 1947-5500 © IJCSIS, USA.

Journal Indexed by (among others):


Dr. Yong Li
School of Electronic and Information Engineering, Beijing Jiaotong University, P. R. China

Prof. Hamid Reza Naji
Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran

Dr. Sanjay Jasola
Professor and Dean, School of Information and Communication Technology, Gautam Buddha University

Dr. Riktesh Srivastava
Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE

Dr. Siddhivinayak Kulkarni
University of Ballarat, Ballarat, Victoria, Australia

Professor (Dr.) Mokhtar Beldjehem
Sainte-Anne University, Halifax, NS, Canada

Dr. Alex Pappachen James (Research Fellow)
Queensland Micro-nanotechnology Center, Griffith University, Australia

Dr. T. C. Manjunath
HKBK College of Engg., Bangalore, India

Prof. Elboukhari Mohamed
Department of Computer Science, University Mohammed First, Oujda, Morocco

1. Paper 22081201: Additive Update Algorithm for Nonnegative Matrix Factorization (pp. 1-7)

Tran Dang Hien, Vietnam National University
Do Van Tuan, Hanoi College of Commerce and Tourism, Hanoi – Vietnam
Pham Van At, Hanoi University of Communications and Transport

2. Paper 28081215: QoS Adoption And Protect It Against DoS Attack (pp. 8-17)

Dr. Manar Y. Kashmola, Computer Sciences Department, Computer Sciences and Mathematics College, Mosul
University, Mosul, Iraq
Rasha Saadallah Gargees, Software Engineering Department, Computer Sciences and Mathematics College, Mosul
University, Mosul, Iraq

3. Paper 29081218: Investigation of Hill Cipher Modifications Based on Permutation and Iteration (pp. 18-24)

Mina Farmanbar & Alexander G. Chefranov, Dept. of Computer Engineering, Eastern Mediterranean University,
Famagusta T.R. North Cyprus via Mersin 10, Turkey

4. Paper 29081220: Underwater Acoustic Channel Modeling: A Simulation Study (pp. 25-28)

Pallavi Kamal, Electrical Engineering, University Teknologi Mara, 40450 Shah Alam, Selangor Darul, Ehsan,
Taussif Khanna, Institute of Information Technology, Kohat University of Science and Technology, Pakistan

5. Paper 31081224: A Fast Accurate Network Intrusion Detection System (pp. 29-35)

Ahmed A. Elngar, Sinai University, El-Arish ,Egypt
Dowlat A. El A. Mohamed & Fayed F. M. Ghaleb, Ain-Shams University, Cairo, Egypt

6. Paper 31081231: Quality of Service Support on High Level Petri-Net Based Model for Dynamic
Configuration of Web Service Composition (pp. 36-45)

Sabri MTIBAA & Moncef TAGINA
LI3 Laboratory / University of Manouba, National School of Computer Sciences, 2010 Manouba, Tunisia

7. Paper 31081233: Contextual Ontology for Delivering Learning Material in an Adaptive E-learning System
(pp. 46-51)

Kalla. Madhu Sudhana, Research Scholar, Dept of Computer Science, St. Peter’s University, Chennai, India
Dr V. Cyril Raj, Head, Dept of Computer Science, Dr M.G.R University, Chennai, India

8. Paper 31081234: Comparison of Supervised Learning Techniques for Binary Text Classification (pp. 52-

Hetal Doshi, Dept of Electronics and Telecommunication, KJSCE, Vidyavihar, Mumbai - 400077, India.
Maruti Zalte, Dept of Electronics and Telecommunication, KJSCE, Vidyavihar, Mumbai - 400077, India

9. Paper 31081241: Creation of Digital Test Form for Prepress Department (pp. 60-64)

Jaswinder Singh Dilawari, Research Scholar, Pacific University, Udaipur,Rajasthan, India
Dr. Ravinder Khanna, Sachdeva Engineering College for Girls, Mohali, Punjab, India

10. Paper 31081247: Web Test Integration and Performance Evaluation of E-Commerce web sites (pp. 65-69)

Md. Safaet Hossain, Department of Electrical Engineering and Computer Science, North South University, Dhaka
Md. Shazzad Hosain, Department of Electrical Engineering and Computer Science, North South University, Dhaka

11. Paper 29081219: Performance Evaluation of Some Grid Programming Models (pp. 70-78)

W. A. Awad, Mathematics & Computer Science Department, Faculty of Science, Port Said University, Egypt.
Scientific Research Group in Egypt (SRGE)

12. Paper 31031281: Comprehensive Analysis of π Base Exponential Functions as a Window (pp. 79-85)

Mahdi Nouri, Sepideh Lahooti, Sepideh Vatanpour, and Negar Besharatmehr
Dept. of Electrical Engineering, Iran University of Science & Technology, A.B.A Institute of Higher Education,

13. Paper 31081237: Enhanced techniques for PDF Image Segmentation and Text Extraction (pp. 86-90)

D. Sasirekha, Research Scholar, Computer Science, Karpagam University, Coimbatore, Tamilnadu, India
Dr. E. Chandra, Director, Dept of Computer Science, SNS Rajalakshmi College of Arts and Science,

14. Paper 31081222: Towards an Ontology based integrated Framework for Semantic Web (pp. 91-99)

Nora Y. Ibrahim, Computer and System Department, Electronic Research Institute, Cairo, Egypt
Sahar A. Mokhtar, Computer and System Department, Electronic Research Institute, Cairo, Egypt
Hany M. Harb, Computer and Systems Engineering, Department, Faculty of Engineering, Al-Azhar University,
Cairo, Egypt

15. Paper 31081238: Face Recognition By Fusing Subband Images And PCA (pp. 100-105)

T. Mohandoss, R. Thiyagarajan, S. Arulselvi
Annamalai University

16. Paper 31081228: Identifying Critical Features For Network Forensics Investigation Perspectives (pp. 106-

Ikuesan R. Adeyemi, Shukor Abd Razak, Nor Amira Nor Azhan
Department of Computer System and Communications, Faculty of Computer Science and Information Systems,
Universiti Teknologi Malaysia

17. Paper 31081229: Hybrid Model of Rough Sets & Neural Network for Classifying Renal stones Patient (pp.

Shahinda Mohamed Al Kholy, Ahmed Abo Al Fetoh Saleh, Aziza Asem, Khaled Z. Sheir, M.D.
Department of IS, Faculty of Computers and Information Science, and Urology and Nephrology Center, Mansoura
University, Mansoura, Egypt 

18. Paper 23071202: Cloned Agent based data computed in Homogeneous sensor networks (pp. 136-140)

S. Karthikeyan, Sathyabama university, Chennai-600119, Tamil Nadu, India
S.Jayashri, Adhiparasakthi engineering college, Melmaruvathur – 603319, Kanchipuram District, Tamil Nadu,


Additive Update Algorithm for Nonnegative Matrix Factorization
Tran Dang Hien
Vietnam National University

Do Van Tuan
Hanoi College of Commerce and Tourism
Hanoi – Vietnam

Pham Van At
Hanoi University of Communications
and Transport

Abstract—Nonnegative matrix factorization (NMF) is an emerging technique with a wide spectrum of potential applications in data analysis. Mathematically, NMF can be formulated as a minimization problem with nonnegative constraints. This problem is currently attracting much attention from researchers, both for theoretical reasons and for potential applications. Currently, the most popular approach to solving NMF is the multiplicative update algorithm proposed by D. D. Lee and H. S. Seung. In this paper, we propose an additive update algorithm that has a faster computational speed than the algorithm of D. D. Lee and H. S. Seung.
Keywords - nonnegative matrix factorization; Karush-Kuhn-Tucker optimality conditions; stationary point; updating an element of a matrix; updating matrices
Nonnegative matrix factorization (NMF) is an approximate representation of a given nonnegative matrix V \in \mathbb{R}_{+}^{n \times m} by a product of two nonnegative matrices W \in \mathbb{R}_{+}^{n \times r} and H \in \mathbb{R}_{+}^{r \times m}:

V \approx W H    (1.1)
Because r is usually chosen to be a small number, the sizes of the matrices W and H are much smaller than that of V. If V is a data matrix of some object, then W and H can be viewed as an approximate representation of V. Thus NMF can be considered an effective technique for representing and reducing data. Although this technique appeared only recently, it has found wide application, for example in document clustering [7, 11], data mining [8], object recognition [5] and forgery detection [10, 12].
To measure the approximation in (1.1), one often uses the squared Frobenius norm of the difference matrix:

f(W, H) = \|V - WH\|_F^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} (V_{ij} - (WH)_{ij})^2    (1.2)
Thus NMF can be formulated as an optimization problem with nonnegativity constraints:

\min_{W \ge 0, \, H \ge 0} f(W, H)    (1.3)
Since the objective function is not convex, most methods fail to find the global optimal solution of the problem and only reach a stationary point, i.e. a matrix pair (W, H) satisfying the Karush-Kuhn-Tucker (KKT) optimality conditions [2]:

W_{ia} \ge 0,  H_{bj} \ge 0,
((WH - V)H^T)_{ia} \ge 0,  (W^T(WH - V))_{bj} \ge 0,
W_{ia} \cdot ((WH - V)H^T)_{ia} = 0,  H_{bj} \cdot (W^T(WH - V))_{bj} = 0,  \forall i, a, b, j    (1.4)

where (WH - V)H^T and W^T(WH - V) are, up to a constant factor, the partial derivatives \partial f(W, H)/\partial W and \partial f(W, H)/\partial H.
To update H or W (with the other kept fixed), one typically moves in the direction opposite to the gradient with an appropriately chosen step size, so that the objective function decreases while H and W remain nonnegative. Among the known algorithms for solving (1.3), the LS algorithm (D. D. Lee and H. S. Seung [6]) must be mentioned. This algorithm has a simple calculation scheme, is easy to implement and gives quite good results, so it remains one of the most commonly used algorithms [10, 12]. The LS algorithm updates H and W using the following formulas:

ij ij
ij ij ij
) (
÷ + ÷


÷ =


ij ij
ij ij ij
~ ~ ~
÷ + ÷


÷ =


By selecting \alpha_{ij} and \beta_{ij} by the formulas

\alpha_{ij} = H_{ij} / (W^T W H)_{ij},   \beta_{ij} = W_{ij} / (W \tilde{H} \tilde{H}^T)_{ij},

the formulas (1.5) and (1.6) become:

H_{ij} \leftarrow H_{ij} \, (W^T V)_{ij} / (W^T W H)_{ij}

W_{ij} \leftarrow W_{ij} \, (V \tilde{H}^T)_{ij} / (W \tilde{H} \tilde{H}^T)_{ij}
Because the adjustment formulas use multiplication, this algorithm is called the multiplicative update method. With this adjustment, the nonnegativity of W and H is guaranteed. In [6] the monotonic decrease of the objective function after each update is proved:

f(\tilde{W}, \tilde{H}) \le f(W, H)

The LS algorithm has the advantage of being simple and easy to implement on a computer. However, the coefficients \alpha_{ij} and \beta_{ij} are selected in a special way that does not reach the minimum at each adjustment. This limits the convergence speed of the LS algorithm.
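For illustration, the multiplicative form of (1.5)-(1.6) can be expressed in a few lines of NumPy (the paper's own experiments are in MATLAB; the function name, the small epsilon guard against division by zero and the random test data below are our additions, not part of the original description):

import numpy as np

def ls_multiplicative_update(V, W, H, eps=1e-9):
    """One sweep of the Lee-Seung multiplicative updates for min ||V - WH||_F^2.
    V: (n, m) nonnegative data matrix; W: (n, r); H: (r, m).
    eps guards against division by zero."""
    # Update H with W fixed: H <- H * (W^T V) / (W^T W H)
    H = H * (W.T @ V) / (W.T @ W @ H + eps)
    # Update W with the new H fixed: W <- W * (V H^T) / (W H H^T)
    W = W * (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Example usage on random data
rng = np.random.default_rng(0)
V = rng.uniform(0, 500, size=(200, 100))
W = rng.uniform(0, 5, size=(200, 10))
H = rng.uniform(0, 5, size=(10, 100))
for _ in range(100):
    W, H = ls_multiplicative_update(V, W, H)
print(np.linalg.norm(V - W @ H, "fro") ** 2)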
To improve the convergence speed, E. F. Gonzalez and Y. Zhang improved the LS algorithm by using one coefficient for each column of H and one coefficient for each row of W. In other words, instead of (1.5) and (1.6), the following formulas are used:

H_{ij} \leftarrow H_{ij} + \alpha_j (W^T V - W^T W H)_{ij}

W_{ij} \leftarrow W_{ij} + \beta_i (V \tilde{H}^T - W \tilde{H} \tilde{H}^T)_{ij}
The coefficients \alpha_j and \beta_i are calculated through a function \gamma = g(A, b, x), where A is a matrix and b and x are vectors. This function is defined as follows:

q = A^T (b - Ax),   p = (x ./ (A^T A x)) \circ q,

where the symbols "./" and "\circ" denote component-wise division and multiplication, respectively. Then \gamma = g(A, b, x) is calculated by the formula:

\gamma = \min\{ p^T q / (p^T A^T A p), \; 0.99 \cdot \max\{\gamma : x + \gamma p \ge 0\} \}

The coefficients \alpha_j and \beta_i are determined by the function g(A, b, x) as follows:

\alpha_j = g(W, V_{\cdot j}, H_{\cdot j}),  j = 1, \ldots, m;   \beta_i = g(\tilde{H}^T, (V_{i \cdot})^T, (W_{i \cdot})^T),  i = 1, \ldots, n.

However, experiments showed that the improvement of E. F. Gonzalez and Y. Zhang does not really bring an obvious effect. Also, as remarked in [3], the LS algorithm and the GZ algorithm (E. F. Gonzalez and Y. Zhang [3]) are not guaranteed to converge to a stationary point.
In this paper we propose a new algorithm that updates each element of the matrices W and H in turn, based on the idea of the nonlinear Gauss-Seidel method [4]. Because the new algorithm uses addition for updating, it is called an additive update algorithm. Under some assumptions, the proposed algorithm is guaranteed to reach a stationary point (Theorem 2, subsection III.B). Experiments show that the proposed algorithm converges faster than the LS and GZ algorithms.
The content of the paper is organized as follows. In section 2, we present an algorithm to update one element of the matrix W or H. This algorithm is used in section 3 to construct a new algorithm for NMF (1.3); we also consider some convergence properties of the new algorithm. Section 4 presents a scheme for implementing the new algorithm on a computer. In section 5, we present experimental results comparing the computation speed of the algorithms. Finally, some conclusions are given in section 6.
A. Updating an element of matrix W
In this section, we consider the algorithm for updating an
element of W, while retaining the remaining elements of W and
H. Suppose the element W_{ij} is adjusted by adding a parameter \delta:

\tilde{W}_{ij} = W_{ij} + \delta    (2.1)

If \tilde{W} is the resulting matrix, then by some simple matrix operations we have:

(\tilde{W}H)_{ab} = (WH)_{ab},  a \ne i,  b = 1, \ldots, m
(\tilde{W}H)_{ib} = (WH)_{ib} + \delta H_{jb},  b = 1, \ldots, m

So from (1.2) it follows:

f(\tilde{W}, H) = f(W, H) + g(\delta)    (2.2)

where

g(\delta) = p \delta^2 + 2 q \delta    (2.3)

p = \sum_{b=1}^{m} H_{jb}^2    (2.4)

q = ((WH - V)H^T)_{ij} = \sum_{b=1}^{m} (WH - V)_{ib} H_{jb}    (2.5)

To minimize f(\tilde{W}, H), one needs to define \delta so that g(\delta) achieves its minimum value under the condition \tilde{W}_{ij} = W_{ij} + \delta \ge 0. Because g(\delta) is a quadratic function, \delta can be defined as follows:

\delta = 0 if q = 0;   \delta = \max(-q/p, \, -W_{ij}) if q > 0;   \delta = -q/p if q < 0    (2.6)

Formula (2.6) is always well defined, because if q \ne 0 then by (2.4) and (2.5) we have p > 0.
From (2.3) and (2.6), we get:

g(\delta) = 0, if (q = 0) or (q > 0 and W_{ij} = 0)    (2.7.a)
g(\delta) < 0, otherwise    (2.7.b)
By using the update formulas (2.1) and (2.6), the monotonic decrease of the objective function f(W, H) is confirmed in the following lemma.

LEMMA 1: If the KKT conditions (1.4) are not satisfied at W_{ij}, then:

f(\tilde{W}, H) < f(W, H)    (2.8)

Otherwise, \tilde{W}_{ij} = W_{ij}.

Proof. From (2.4) and (2.5) it follows that q = ((WH - V)H^T)_{ij}. Therefore, if the KKT conditions (1.4) are not satisfied at W_{ij}, then the properties

W_{ij} \ge 0,   q \ge 0,   W_{ij} \cdot q = 0

cannot hold simultaneously. From this, and because W_{ij} \ge 0, it follows that case (2.7.a) cannot happen. So case (2.7.b) must occur and we have g(\delta) < 0. Therefore, from (2.2) we obtain

f(\tilde{W}, H) < f(W, H)

Conversely, if the KKT conditions are satisfied at W_{ij}, then q = 0, or q > 0 and W_{ij} = 0. So from (2.6) it follows that \delta = 0, and therefore, by (2.1), \tilde{W}_{ij} = W_{ij}.
Thus the lemma is proved.
B. Updating an element of matrix H
Let \tilde{H} be the matrix obtained from the update rule:

\tilde{H}_{ij} = H_{ij} + \delta    (2.9)

where \delta is defined by the formulas:

u = \sum_{a=1}^{n} W_{ai}^2    (2.10)

v = (W^T(WH - V))_{ij} = \sum_{a=1}^{n} W_{ai} (WH - V)_{aj}    (2.11)

\delta = 0 if v = 0;   \delta = \max(-v/u, \, -H_{ij}) if v > 0;   \delta = -v/u if v < 0    (2.12)

By the same arguments as used in Lemma 1, we have

LEMMA 2: If the KKT conditions are not satisfied at H_{ij}, then:

f(W, \tilde{H}) < f(W, H)

Otherwise, \tilde{H}_{ij} = H_{ij}.
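A small NumPy sketch of the element update rules as reconstructed in (2.1), (2.4)-(2.6) and (2.9)-(2.12) above (the function names are ours, and this version recomputes WH directly; the cheaper residual-based variant appears in section IV):

import numpy as np

def delta_step(p, q, w):
    """Step delta from (2.6)/(2.12): minimize p*d^2 + 2*q*d subject to w + d >= 0.
    p >= 0 is the quadratic coefficient, q the linear one, w the current
    nonnegative entry (W_ij or H_ij)."""
    if q == 0:
        return 0.0
    if q > 0:
        return max(-q / p, -w)   # p > 0 is guaranteed when q != 0
    return -q / p

def update_W_element(V, W, H, i, j):
    """Update a single entry W[i, j] by (2.1), (2.4)-(2.6); returns the new W."""
    p = np.sum(H[j, :] ** 2)                  # (2.4)
    q = np.dot((W @ H - V)[i, :], H[j, :])    # (2.5)
    W = W.copy()
    W[i, j] += delta_step(p, q, W[i, j])      # (2.1) with delta from (2.6)
    return W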
A. Updating matrices W and H
In this section we consider the transformation T from (W, H) to (\tilde{W}, \tilde{H}), defined as follows:
- modify the elements of W as in subsection II.A;
- modify the elements of H as in subsection II.B.
In other words, the transformation (\tilde{W}, \tilde{H}) = T(W, H) is carried out as follows:

Step 1: Initialise \tilde{W} = W, \tilde{H} = H.
Step 2: Update the elements of \tilde{W}:
For j = 1, ..., r and i = 1, ..., n:
    \tilde{W}_{ij} \leftarrow \tilde{W}_{ij} + \delta
where \delta is computed by (2.4)-(2.6).
Step 3: Update the elements of \tilde{H}:
For i = 1, ..., r and j = 1, ..., m:
    \tilde{H}_{ij} \leftarrow \tilde{H}_{ij} + \delta
where \delta is computed by (2.10)-(2.12).
From Lemmas 1 and 2, we easily obtain the following important property of the transformation T.

LEMMA 3: If the solution (W, H) does not satisfy the KKT conditions (1.4), then

f(\tilde{W}, \tilde{H}) = f(T(W, H)) < f(W, H)

In the contrary case, (\tilde{W}, \tilde{H}) = (W, H).

The following property is directly obtained from Lemma 3.

COROLLARY 1: For any (W, H) \ge 0, if we set (\tilde{W}, \tilde{H}) = T(W, H), then either (\tilde{W}, \tilde{H}) = (W, H) or f(\tilde{W}, \tilde{H}) < f(W, H).
B. Algorithm for NMF (1.3)
The algorithm is described through the transformation T as follows:

Step 1. Initialize W = W^0 \ge 0, H = H^0 \ge 0.
Step 2. For k = 1, 2, ...:
    (W^{k+1}, H^{k+1}) = T(W^k, H^k)

From Corollary 1, we obtain the following important property of the above algorithm.

THEOREM 1. Suppose (W^k, H^k) is the sequence of solutions created by Algorithm III.B. Then the sequence of objective function values f(W^k, H^k) strictly decreases:

f(W^{k+1}, H^{k+1}) < f(W^k, H^k),   \forall k \ge 1

Moreover, the sequence f(W^k, H^k) is bounded below by zero, so Theorem 1 implies the following corollary.

COROLLARY 2. The sequence f(W^k, H^k) is convergent. In other words, there exists a nonnegative value f^* such that:

\lim_{k \to \infty} f(W^k, H^k) = f^*
Now we consider another convergence property of Algorithm III.B.

THEOREM 2. Suppose (\bar{W}, \bar{H}) is a limit point of the sequence (W^k, H^k) and

\bar{H}_{j \cdot} \ne 0,  j = 1, \ldots, r    (3.1)
\bar{W}_{\cdot i} \ne 0,  i = 1, \ldots, r    (3.2)

i.e. no row of \bar{H} and no column of \bar{W} is identically zero. Then (\bar{W}, \bar{H}) is a stationary point of the problem (1.3).
Proof. By assumption, (\bar{W}, \bar{H}) is the limit of some subsequence (W^{t_k}, H^{t_k}) of the sequence (W^k, H^k):

\lim_{k \to \infty} (W^{t_k}, H^{t_k}) = (\bar{W}, \bar{H})    (3.3)

By conditions (3.1) and (3.2), the transformation T is continuous at (\bar{W}, \bar{H}). Therefore from (3.3) we get:

\lim_{k \to \infty} T(W^{t_k}, H^{t_k}) = T(\bar{W}, \bar{H})    (3.4)

Moreover, since T(W^{t_k}, H^{t_k}) = (W^{t_k + 1}, H^{t_k + 1}), we have:

\lim_{k \to \infty} (W^{t_k + 1}, H^{t_k + 1}) = T(\bar{W}, \bar{H})

Using the continuity of the objective function f(W, H), from (3.3) and (3.4) we obtain

\lim_{k \to \infty} f(W^{t_k}, H^{t_k}) = f(\bar{W}, \bar{H})

\lim_{k \to \infty} f(W^{t_k + 1}, H^{t_k + 1}) = f(T(\bar{W}, \bar{H}))

On the other hand, by Corollary 2 the sequence f(W^k, H^k) is convergent, so it follows that:

f(T(\bar{W}, \bar{H})) = f(\bar{W}, \bar{H})

Therefore, by Lemma 3, (\bar{W}, \bar{H}) must be a stationary point of problem (1.3).
Thus the theorem is proved.
In this section we provide some variations of the algorithm in subsection III.B (Algorithm III.B) that reduce the volume of calculations and make it more convenient to implement on a computer.
A. Evaluation of computational complexity
To update an element W_{ij} by formulas (2.1), (2.4), (2.5) and (2.6), we need m multiplications for calculating p and m(n \cdot m \cdot r) multiplications for calculating q. Similarly, to update an element H_{ij} by (2.9), (2.10), (2.11) and (2.12), we need n multiplications for computing u and n(n \cdot m \cdot r) multiplications for computing v. It follows that the number of multiplications needed for one loop ((\tilde{W}, \tilde{H}) = T(W, H)) of Algorithm III.B is

2 n m r (1 + n m r)    (4.1)
B. Some variations for updating W and H
1) Updating W_{ij}
If we set

D = WH - V    (4.2)

then the formula (2.5) for q becomes:

q = \sum_{b=1}^{m} D_{ib} H_{jb}    (4.3)

If D is considered as known, the calculation of q in (4.3) needs only m multiplications. After updating W_{ij} by formula (2.1), we need to recalculate the matrix \tilde{D} = \tilde{W}H - V to be used for the adjustment of the other elements of W. From (2.1) and (4.2), it is seen that \tilde{D} is determined from D by the formula:

\tilde{D}_{ab} = D_{ab},  a \ne i,  b = 1, \ldots, m
\tilde{D}_{ib} = D_{ib} + \delta H_{jb},  b = 1, \ldots, m    (4.4)

So we only need to adjust the i-th row of D, which requires m multiplications.
2) Scheme for updating matrix W
For j = 1 To r
    p = \sum_{b=1}^{m} H_{jb}^2
    For i = 1 To n
        q = \sum_{b=1}^{m} D_{ib} H_{jb}
        compute \delta from p, q and W_{ij} by (2.6)
        W_{ij} \leftarrow W_{ij} + \delta
        D_{ib} \leftarrow D_{ib} + \delta H_{jb},  b = 1, \ldots, m
    End For i
End For j
The total number of multiplications used to adjust the matrix W is 2 n m r + m r.
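A minimal NumPy sketch of this W-updating scheme, assuming the residual matrix D = WH - V of (4.2) is maintained in place as described (the function name is ours):

import numpy as np

def update_W_scheme(V, W, H, D):
    """Scheme IV.B.2 (as reconstructed above): element-wise additive update of W
    using the residual D = W @ H - V, which is kept consistent in place."""
    n, r = W.shape
    for j in range(r):
        p = np.dot(H[j, :], H[j, :])            # p = sum_b H_jb^2
        for i in range(n):
            q = np.dot(D[i, :], H[j, :])        # (4.3)
            if q == 0:
                delta = 0.0
            elif q > 0:
                delta = max(-q / p, -W[i, j])   # (2.6); p > 0 whenever q != 0
            else:
                delta = -q / p
            W[i, j] += delta                    # (2.1)
            D[i, :] += delta * H[j, :]          # adjust row i of D, cf. (4.4)
    return W, D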
3) Updating H_{ij}
Similarly, the formula (2.11) for v becomes

v = \sum_{a=1}^{n} W_{ai} D_{aj}    (4.5)

According to this formula, only n multiplications are needed to calculate v. After adjusting H_{ij} by formula (2.9), we need to recalculate the matrix D by the following formula:

\tilde{D}_{ab} = D_{ab},  b \ne j,  a = 1, \ldots, n
\tilde{D}_{aj} = D_{aj} + \delta W_{ai},  a = 1, \ldots, n    (4.6)

So we only need to adjust the j-th column of D, which requires n multiplications.
From formulas (2.9), (2.10), (2.12), (4.5) and (4.6), we have
a new scheme for updating matrix H as follows.
4) Scheme for updating matrix H
For i = 1 To r
    u = \sum_{a=1}^{n} W_{ai}^2
    For j = 1 To m
        v = \sum_{a=1}^{n} W_{ai} D_{aj}
        compute \delta from u, v and H_{ij} by (2.12)
        H_{ij} \leftarrow H_{ij} + \delta
        D_{aj} \leftarrow D_{aj} + \delta W_{ai},  a = 1, \ldots, n
    End For j
End For i
The total number of multiplications used to adjust the matrix H is 2 n m r + n r.
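The symmetric NumPy sketch for the H-updating scheme, with the same conventions as the update_W_scheme sketch above:

import numpy as np

def update_H_scheme(V, W, H, D):
    """Scheme IV.B.4 (as reconstructed above): element-wise additive update of H,
    keeping the residual D = W @ H - V consistent column by column."""
    r, m = H.shape
    for i in range(r):
        u = np.dot(W[:, i], W[:, i])            # u = sum_a W_ai^2
        for j in range(m):
            v = np.dot(W[:, i], D[:, j])        # (4.5)
            if v == 0:
                delta = 0.0
            elif v > 0:
                delta = max(-v / u, -H[i, j])   # (2.12); u > 0 whenever v != 0
            else:
                delta = -v / u
            H[i, j] += delta                    # (2.9)
            D[:, j] += delta * W[:, i]          # adjust column j of D, cf. (4.6)
    return H, D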
Using the above results together, we can construct a new calculating scheme for Algorithm III.B as follows.
C. New calculating scheme for Algorithm III.B
1. Initialize W = W^0 \ge 0, H = H^0 \ge 0, and compute D = WH - V.
2. For k = 1, 2, ...:
- update W using the scheme of subsection IV.B.2;
- update H using the scheme of subsection IV.B.4.
The computational complexity of this scheme is as follows:
- the initialization step needs n m r multiplications for computing D;
- each loop needs 4 n m r + r(n + m) multiplications (the sum of the counts in IV.B.2 and IV.B.4).
Compared with (4.1), the number of operations is greatly reduced.
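To give a sense of the reduction: for the data set (n, m, r) = (200, 100, 10) used in Experiment 1 below, (4.1) amounts to roughly 8·10^10 multiplications per sweep, whereas one sweep of this scheme needs well under 10^6. A minimal driver sketch of the overall iteration, reusing the update_W_scheme and update_H_scheme sketches above (the initial factors and the iteration count are arbitrary choices of ours):

import numpy as np

def new_nmf(V, r, n_iter=100, seed=0):
    """Outer loop of the additive update algorithm (scheme IV.C as reconstructed):
    initialize W, H and the residual D = W @ H - V, then alternate the
    element-wise W and H sweeps, which keep D up to date as they go."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.uniform(0, 1, size=(n, r))
    H = rng.uniform(0, 1, size=(r, m))
    D = W @ H - V                      # initialization: n*m*r multiplications
    for _ in range(n_iter):
        W, D = update_W_scheme(V, W, H, D)
        H, D = update_H_scheme(V, W, H, D)
    return W, H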
In this section, we present results of 2 experiments on the
algorithms: New NMF (new proposed additive update
algorithm), GZ and LS. The programs are written in MATLAB
and run on a machine with configurations: Intel Pentium Core
2 P6100 2.0 GHz, RAM 3GB. New NMF is built according to
the schema in subsection IV.C.
A. Experiment 1
This experiment compares the speed of convergence of the algorithms to a stationary point. First of all, the KKT condition (1.4) is equivalent to the condition h(W, H) = 0, where

h(W, H) = \sum_{i=1}^{n} \sum_{a=1}^{r} | \min(W_{ia}, ((WH - V)H^T)_{ia}) | + \sum_{b=1}^{r} \sum_{j=1}^{m} | \min(H_{bj}, (W^T(WH - V))_{bj}) |

Thus, the smaller h(W, H) is, the closer (W, H) is to a stationary point of the problem (1.3). To get a quantity independent of the size of W and H, we use the following measure:

\Delta(W, H) = h(W, H) / (\gamma_W + \gamma_H)

in which \gamma_W is the number of elements of the set

\{ (i, a) \; | \; \min(W_{ia}, ((WH - V)H^T)_{ia}) \ne 0, \; i = 1, \ldots, n, \; a = 1, \ldots, r \}

and \gamma_H is the number of elements of the set

\{ (b, j) \; | \; \min(H_{bj}, (W^T(WH - V))_{bj}) \ne 0, \; b = 1, \ldots, r, \; j = 1, \ldots, m \}
\Delta(W, H) is called a normalized KKT residual. Table 1 presents the value \Delta(W, H) of the solution (W, H) obtained by each algorithm within given time periods, on a data set of size (n, m, r) = (200, 100, 10), in which V, W^0 and H^0 are generated randomly with V_{ij} \in [0, 500], W^0_{ij} \in [0, 5] and H^0_{ij} \in [0, 5].
Table 1. Normalized KKT residual \Delta(W, H) after given running times

Time (sec)   New NMF    GZ           LS
60           3.6450     3700.4892    3576.0937
120          1.5523     3718.2967    3539.8986
180          0.1514     3708.6043    3534.6358
240          0.0260     3706.4059    3524.6715
300          0.0029     3696.7690    3508.3239
The results in Table 1 show that the two algorithms GZ and LS do not converge to a stationary point (the value \Delta(W, H) remains large). Meanwhile, the New NMF algorithm does converge towards a stationary point, because the value \Delta(W, H) becomes approximately equal to 0.
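For reference, the normalized KKT residual as reconstructed above can be computed with a few NumPy lines (the function name is ours):

import numpy as np

def kkt_residual(V, W, H):
    """Normalized KKT residual Delta(W, H): the sum of |min(variable, gradient)|
    over all entries of W and H, divided by the number of nonzero such terms."""
    GW = (W @ H - V) @ H.T          # gradient part w.r.t. W (up to a constant factor)
    GH = W.T @ (W @ H - V)          # gradient part w.r.t. H
    MW = np.minimum(W, GW)
    MH = np.minimum(H, GH)
    h = np.abs(MW).sum() + np.abs(MH).sum()
    count = np.count_nonzero(MW) + np.count_nonzero(MH)
    return h / count if count > 0 else 0.0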
B. Experiment 2
This experiment compares the speed of convergence of the algorithms to the minimum value of the objective function f(W, H), within given time periods, on a data set of size (n, m, r) = (500, 100, 20), in which V, W^0 and H^0 are generated randomly with V_{ij} \in [0, 500], W^0_{ij} \in [0, 1] and H^0_{ij} \in [0, 1]. The algorithms were run 5 times with 5 different pairs W^0, H^0 generated randomly in the interval [0, 1]. The average values of the objective function obtained after the 5 runs of each algorithm, for each given time period, are presented in Table 2.

Table 2. Average objective function values f(W, H) after given running times

Time (sec)   New NMF    GZ         LS
60           57.054     359.128    285.011
120          21.896     319.674    273.564
180          18.116     299.812    267.631
240          17.220     290.789    264.632
300          16.684     284.866    262.865
360          16.458     281.511    261.914

The results in Table 2 show that the objective function values of the solutions generated by the two algorithms GZ and LS are quite large, while the objective function value reached by the New NMF algorithm is much smaller.
This paper proposed a new additive update algorithm for solving the nonnegative matrix factorization problem. Experiments show that the proposed algorithm converges faster than the LS and GZ algorithms. The proposed algorithm also has a simple calculation scheme, so it is easy to implement and use in practice.
[1] D. P. Bertsekas. On the Goldstein-Levitin-Polyak gradient projection
method. IEEE Transactions on Automatic Control, 21 (1976), pp. 174-
[2] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont,
MA 02178-9998, second edition, 1999.
[3] E. F. Gonzalez and Y. Zhang, Accelerating the Lee-Seung algorithm for
non-negative matrix factorization, tech. report, Tech Report, Department
of Computational and Applied Mathematics, Rice University, 2005.
[4] L. Grippo and M. Sciandrone. On the convergence of the block
nonlinear gauss-seidel method under convex constraints. Operations
Research Letters, 26 (2000), pp. 127-136.
[5] D. D. Lee and H. S. Seung, Learning the parts of objects by non-
negative matrix factorization, Nature, 401 (1999), pp. 788-791.
[6] D. D. Lee and H. S. Seung, Algorithms for non-negative matrix
factorization, in Advances in Neural Information Processing Systems 13,
MIT Press, 2001, pp. 556-562.
[7] V. P. Pauca, J. Piper, and R. J. Plemmons. Nonnegative matrix
factorization for spectral data analysis, Linear Algebra and Its
Applications, 416 (2006), pp. 29-47.
[8] V. P. Pauca, F. Shahnaz, M. W. Berry, and R. J. Plemmons, Text mining
using non-negative matrix factorizations, In Proceedings of the 2004
SIAM International Conference on Data Mining, 2004.
[9] L. F. Portugal, J. J. Judice, and L. N. Vicente, A comparison of block
pivoting and interior- point algorithms for linear least squares problems
with nonnegative variables, Mathematics of Computation, 63 (1994),
pp. 625–643.
[10] Z. Tang, S. Wang, W. Wei, and S. Su, Robust image hashing for tamper
detection using non-negative matrix factorization, Journal of Ubiquitous
Convergence and Technology, vol. 2(1), 2008, pp. 18-26.
[11] W. Xu, X. Liu, and Y. Gong. Document clustering based on non-
negative matrix factorization. In SIGIR ’03: Proceedings of the 26th
annual international ACM SIGIR conference on Research and
development in informaion retrieval, New York, NY, USA, 2003, ACM
Press, pp. 267-273.
[12] H. Yao, T. Qiao, Z. Tang, Y. Zhao, H. Mao, Detecting copy-move
forgery using non-negative matrix factorization, IEEE Third
International Conference on Multimedia Information Networking and
Security, China, 2011, pp. 591-594.

QoS Adoption And Protect It Against DoS Attack

Dr. Manar Y. Kashmola
Computer Sciences Department
Computer Sciences and Mathematics College
Mosul University
Mosul, Iraq

Rasha Saadallah Gargees
Software Engineering Department
Computer Sciences and Mathematics College
Mosul University
Mosul, Iraq

Abstract— The enormous growth of the Internet and the variety of needs of its applications have resulted in great interest in Quality of Service (QoS) in recent years. Since QoS must be met under all circumstances, another challenge has emerged that hinders achieving it: the emergence of some types of DoS attacks that aim at exhausting the bandwidth and eventually violating QoS agreements.
In this research a system was constructed to achieve QoS based on the DiffServ technology: the bandwidth is distributed among the various applications according to the specifications and requirements of each application, giving priority to certain applications as well as protecting them from DoS attacks. The Anomaly Detection model was adopted to detect attacks, and a detected attack is then blocked by dropping the attack flow.
The system proved efficient in improving the QoS of applications with critical requirements, as shown by measuring a set of factors that affect QoS, and in halting DoS attacks, as shown by the available bandwidth, thereby preserving the bandwidth in the presence of such attacks.
Keywords: QoS, DoS, Bandwidth, DiffServ, attack.
The Internet was initially designed for providing best-effort delivery of application data, since average performance guarantees were sufficient for the initial types of applications [1].
But the widespread growth of the Internet and the
development of streaming applications, and the advance of
technologies in multimedia compression, have guided the
Internet society to focus on the design and development of
architectures and protocols, that would guarantee a level of
Quality of Service. QoS is defined as the collective effect of the service performance that determines the degree of satisfaction of a user of the service, or a measure of how good a service is as presented to the user; it manifests itself in a number of parameters, all of which have either subjective or objective values [2][3]. We can also define it as a set of techniques to manage network resources in a manner that enables the network to differentiate and handle traffic based on policy. This means providing consistent, predictable data delivery to the users or applications supported within the network [4]. Quality of service will be of central importance in modern domestic infrastructures, crossed by multiple digital streams for many kinds of user services [5].
Guaranteeing QoS means providing the requested QoS under
all circumstances, including the most difficult ones. Among
the most difficult circumstances are denial of service (DoS)
attacks. Because of this, protection against DoS is a defining
characteristic for guaranteed QoS mechanisms[6].
Denial of service (DoS) attacks pose many threats to the
networking infrastructure. They consume network resources
such as network bandwidth and router CPU cycles with the
malicious objective of preventing or severely degrading
service to legitimate users [7].
The Denial-of-Service attack (DOS attack) is an attempt from
the attacker to prevent legitimate users from accessing system
resources. DOS attack has been one of the most serious and
successful methods of attacking computer networks [8].
Our aim is to develop a system, implemented on the Linux platform, that achieves QoS by distinguishing between different types of network services and giving high priority and bandwidth to certain services, depending on their requirements, at the expense of other, less important services; this bandwidth management is to be invisible to the user and requires no increase in the overall bandwidth of the network. We also aim to protect QoS from DoS attacks that drain bandwidth, by classifying traffic into normal and abnormal and establishing an intrusion detection and prevention system that is lightweight and quick, detecting DoS attacks and preventing them in real time without the need to access and analyse the contents of packets.
This paper is organized as follows: section 2 refers to related work; section 3 describes the major QoS frameworks, functions and parameters; the effect of DoS attacks on QoS, attack scenarios, intrusion detection systems and intrusion prevention systems are described in section 4; in section 5 our system model is presented; section 6 evaluates the performance of our system; section 7 is the conclusion and future work.
In [9] Myung-Sup Kim et al. presented a flow-based abnormal network traffic detection method and its system prototype. The method is efficient, since it can reduce system overhead in the processing of packet data by aggregating packets into flows.
In [10] Wen-Shyang Hwang and Pei-Chen Tseng proposed a QoS-aware Residential Gateway (QRG) with real-time traffic monitoring and a QoS mechanism, in order to initiate DiffServ-QoS bandwidth management during network congestion.
In [11] the authors proposed a secure and adaptive multimedia transmission framework to maintain the quality of service (QoS) of multimedia streams during Denial-of-Service (DoS) attacks. The proposed framework consists of two components: intrusion detection and adaptive transmission management. The results of preliminary simulations in NS2 show that the quality of the multimedia stream can still be maintained during an attack.
In [5] the authors investigated QoS issues in such a scenario, considering the delivery of a digital terrestrial television transport stream for home entertainment in the presence of video surveillance, automation data and Internet data streams. They verified that the introduction of a quality of service router permits effective regulation of the priority and bandwidth assigned to each service, through the definition of proper QoS rules.
In [3] the authors presented an Optimal Smooth Quality Adaptation (OS-QA) strategy which gracefully adapts to network bandwidth fluctuations to protect service quality with relatively consistent QoS. They set up a mathematical model and derive the optimal conditions to maximize overall system resource utilization and minimize the average QoS variance of the requests from their ideal QoS requirements under the resource constraints. Results show that OS-QA is effective in providing QoS spacing for different quality classes and in adapting the QoS smoothly to reduce perceived QoS variation.
In [12] the authors propose a system for lightweight detection of DoS attacks, called LD2. The system detects attack activities by observing flow behaviours, matching them with graphlets for each attack type, and defining appropriate threshold levels for each DoS attack. The proposed system is lightweight because it analyses neither packet content nor packet statistics. The system is implemented based on the concept of flows.
In [8] the authors propose a new mechanism to guarantee QoS during DoS attacks for IPTV networks. They introduce the concept of "video stream handoff", analogous to the "soft handoff" done in cellular networks: the idea is to initiate a selective video handoff procedure, either from the server side, when the DoS attack is detected, or from the user side, when QoS degradation occurs. All of this should happen without interrupting the user (i.e. while the video is playing). They make use of the SIP protocol stack for signalling, QoS negotiation and session management.
In [13] the authors present a virtual inline technique based on the technique of the Man-in-the-Middle attack (MITM); it combines NIDS and NIPS in providing all-round protection to networks. This technique integrates the advantages of both IDSs and IPSs and avoids their shortcomings; it also avoids the problems that baffle researchers in this area.
A. Major QoS Frameworks
The development of the IP QoS architecture began with the IntServ concept, and the scalability problem then led to the design and introduction of the DiffServ architecture [14].
 Integrated Services
Integrated Services (IntServ) works at the granularity of the individual application or flow. It involves path setup and resource reservation (RSVP) when the application starts. This preliminary dialogue between the sender and receiver nodes ensures trouble-free communication for the session [15]. Typically, applications (such as a VoIP gateway, for example) originate RSVP messages; intermediate routers process the messages, reserve resources, and accept or reject the flow [16].
While this is an ideal solution, capable of providing rigorous QoS guarantees, it is very complex and places a substantial processing burden on intermediate routers. Scalability becomes a problem with an increasing number of flows, and incremental deployment is virtually impossible. Work is in progress to extend RSVP to allow flow aggregation, explicit route setup and QoS negotiation [15].
RSVP messages take the same path that IP packets take, which is determined by the routing tables in the IP routers. RSVP provides several reservation styles [17].
 Differentiated Services
Enabling thousands of reservations via multi-field classification means that a table of active end-to-end flows, with several table entries per flow, must be kept. Memory is limited, and so is the number of flows that can be supported in such a way. In addition, maintaining the state in this table is another major difficulty. The only way out of this dilemma appeared to be aggregation of the state: the Differentiated Services (DiffServ) architecture [16].
Unlike IntServ, where RSVP signalling is used to reserve bandwidth along the path, QoS in DiffServ is provided by provisioning rather than reservation [18].
The primary goal of the Differentiated Services (DS) architecture is to provide a simple, efficient, and thus scalable mechanism that allows for better than best-effort services in the Internet. It involves a more coarse-grained approach, grouping IP packets into a relatively small number of classes. This option has always been available (though seldom used) in the ToS field of the IPv4 header. The DiffServ approach formalises this by defining a set of packet forwarding criteria (Per Hop Behaviours - PHB) based on the DSCP (Differentiated Services Code Point). Thus a variety of classes can be defined, providing a priority scheme, but not at the level of individual applications [15].
DiffServ pushes the flow-based traffic classification and conditioning to the edge router of a network domain. The core of that domain only has the responsibility of forwarding the packets according to the PHB associated with each traffic class.
B. Network QoS Functions
To provide QoS over the IP network, the network must perform the following two basic tasks [18]:

Figure. 1: IP QoS generic functional requirements

C. Network QoS Parameters
The most important metrics that characterise the performance of an IP network, and that are the most significant factors influencing the end-to-end quality of an application, are the following:
1) Delay
Network delay corresponds to the time it takes for application
data units to be carried by the network to the destination.
Network delay is caused by the combination of network
propagation delay, processing delays and variable queuing
delays at the intermediate routers on the path to the destination
2) Delay variation (jitter)
Delay variation is usually caused by the buffers built up on
routers during periods of increased traffic, and less often by
changes of routing due to failures or routing table updates.
3) Packet loss
Packet loss is typically the result of excessive congestion in the network. It is defined as the fraction (or percentage) of IP data packets lost, out of the total number of transmitted packets.
4) Throughput
This signifies the portion of the available capacity of an end-to-end network path that is accessible to the application or data flow. Consequently, the number of bits injected into the network by the various flows of an application has to be adjusted accordingly.
A. Effect of DoS Attacks on QoS
In the framework of [11], an adaptive transmission management component is designed to improve the QoS of video via efficient utilization of the network resources: once DoS attacks are detected, the bandwidth occupied by the attacks can be reclaimed and protected for video transmission [11]. The most common DoS attacks target the computer network's bandwidth or connectivity [22]. Since a DoS attack injects a large amount of traffic into the network and occupies the bandwidth resources, a key issue is how to maintain the quality of service (QoS) of the servers during the DoS attack [11].
Denial of Service (DoS) attacks are thus more effective in a guaranteed multi-service network than in the "old" best-effort Internet. Indeed, with best-effort services, a DoS attack has to completely prevent the target of the attack from communicating. With a multi-service network, it is sufficient to make the network violate the SLA (Service Level Agreement) committed to clients, which is easier and can be achieved using simple flooding attacks [23].
B. Attack Scenarios
The first attack scenario targets storage and processing resources. This is an attack that mainly targets the memory, storage space, or CPU of the service provider [24].
The second attack scenario targets bandwidth. It is designed to flood the victim network with unwanted traffic that prevents legitimate traffic from reaching the primary victim [25]. Consider the case where an attacker located between multiple communicating nodes wants to waste the network bandwidth and disrupt connectivity. The malicious node can continuously send packets with bogus source IP addresses of other nodes, thereby overloading the network. This consumes the resources of all neighbours that communicate, overloads the network, and results in performance degradation [24]. Bandwidth attacks may be caused by traffic that looks entirely normal except for its high volume [26].
C. Intrusion Detection System
An Intrusion Detection System (IDS) is an entity devoted to the detection of both non-authorized uses and misuses of a system. Usually, it does not attempt to stop an intrusion upon its detection, but rather alerts some other system component [27]. Depending on their source of input, IDSs can be classified into Host-based Intrusion Detection Systems (HIDS), Network-based Intrusion Detection Systems (NIDS) and Hybrid Intrusion Detection Systems [28].
IDS analysis: according to the detection model, IDS techniques can be classified into:
 Signature-based detection
The signature approach to intrusion detection, which traces back to the early 1990s [29] and is also called misuse-based or pattern detection, stores the signatures of known attacks in a database. The current traffic is then compared with the database to find matching patterns. The obvious drawback of misuse detection approaches is that they can only detect known attack patterns and are not suitable for detecting new attacks that do not match stored patterns [30]. Signatures are almost useless in network-based IDSs when network traffic is encrypted. Moreover, some attacks do not have a single distinguishing signature, but rather a wide range of possible variations. Each variation could conceivably be incorporated into a signature set, but doing so inflates the number of signatures, potentially hurting IDS performance.
 Anomaly-based detection
Anomaly detection approaches build models from normal data, and any deviation from the normal model in new data is detected as an anomaly. Anomaly detection has the advantage of detecting new types of attacks, while suffering from a high false alarm rate [31]. Anomaly detectors construct profiles representing the normal behaviour of users, hosts, or network connections. These profiles are constructed from historical data collected over a period of normal operation [32].
D. Intrusion Prevention System
The majority of current IDSs stop at flagging alarms and rely on a manual response by the security or system administrator. This results in delays between the detection of the intrusion and the response, which may range from minutes to months. Intrusion Prevention Systems (IPSs) try to solve this problem. IPS solutions are designed to examine all traffic that passes through them in order to detect and stop undesired access, malicious content and inappropriate transaction rates from penetrating or adversely affecting the availability of critical IT resources [13]. An intrusion prevention system (IPS) is software that has all the capabilities of an intrusion detection system and can also attempt to stop possible incidents [33].
IPSs work inline; Network-based IPSs (NIPS) are typically deployed at the border of the intranet, and Host-based IPSs (HIPS) are typically installed on endpoints [13].
A. General structure of the quality of service system
The system was implemented on the Linux platform to achieve quality of service based on the DiffServ concept, and it passes through a set of stages. First, the incoming packets are read from the Network Interface Card (NIC) and the packet headers are analysed. The packets are then classified according to the type of application they belong to, based on the protocol type and port number, and each application is given a particular priority by changing the ToS field in the IP packet header, according to the definition of the ToS field. Finally, the data is distributed into queues and each queue is given a certain percentage of the bandwidth according to the CBQ algorithm. Figure (2) shows the overall structure of the system.

Figure 2: General structure of the quality of service system
The process of giving precedence to packets is done by marking them: a particular value is encoded in the ToS field of the IPv4 header. Table (1) shows the type of encoding used for each application, together with the protocol type and port number. Figure (3) shows the steps of the proposed QoS algorithm.
Table 1: ToS field values and port numbers
Application Type   Protocol   Port   Coded type
Audio              UDP        1071   EF
Video              UDP        2979   AF31
Telnet             TCP        23     CS4
Ping               ICMP       -      CS6
Other              -          -      BE
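As an illustration of the classification and marking step summarized in Table 1, the sketch below maps a packet's protocol and port to a DSCP class and the corresponding ToS byte value. The dictionaries and function name are our own; the numeric codepoints are the standard values for EF, AF31, CS4 and CS6 (the paper's system performs this marking on the Linux platform, so this sketch is only illustrative):

# Map (protocol, port) from Table 1 to a DSCP class; BE (best effort) is the default.
DSCP_CLASSES = {
    ("UDP", 1071): "EF",     # audio
    ("UDP", 2979): "AF31",   # video
    ("TCP", 23):   "CS4",    # telnet
    ("ICMP", None): "CS6",   # ping
}

# Standard DSCP codepoints; the ToS byte carries the DSCP in its upper six bits.
DSCP_VALUES = {"EF": 46, "AF31": 26, "CS4": 32, "CS6": 48, "BE": 0}

def classify(protocol, port=None):
    """Return (dscp_class, tos_byte) for a packet, following Table 1."""
    key = (protocol, None) if protocol == "ICMP" else (protocol, port)
    dscp_class = DSCP_CLASSES.get(key, "BE")
    return dscp_class, DSCP_VALUES[dscp_class] << 2

print(classify("UDP", 1071))   # ('EF', 184)
print(classify("TCP", 80))     # ('BE', 0)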




Figure. 3: proposed QoS algorithm
B. DoS Prevention System
1. DoS Attack
Bandwidth-consuming Denial of Service attacks have a direct impact on the quality of service that can be provided to users, so the system was designed to protect the quality of service from such attacks. Much research effort has gone into finding new and effective techniques to detect and prevent such attacks; however, most studies, such as [34], [26], [22] and [31], were conducted using offline data, either from readily available databases or by simulation. Only a few studies have examined the survivability of a server exposed to DoS attacks and measured, under real conditions, the effectiveness of filtering such malicious traffic, because capturing and analysing a real attack as it occurs (on the fly) is a difficult task. This research therefore focuses on the types of DoS attack that consume bandwidth and thus affect the quality of service. The system works online and was designed to be fast and lightweight, so that it does not constitute a burden on the network, capturing packets on the fly. The types of attack addressed in this research are:
 UDP Flood Attack
 ICMP Flood Attack
 SYN Flood Attack
2. DoS Attack Detection and Prevention System
We designed a DoS attack detection system on the Linux platform based on the Anomaly Detection model, in which a detected attack is prevented by dropping the attack flow. The system consists of six units: Packet Sniffer Unit, Packet Analysis Unit, Training Unit, Intrusion Detection Unit, Intrusion Prevention Unit, and Reports Generator Unit.
The Packet Sniffer Unit reads the network packets in real time and sends them to the Packet Analysis Unit, which analyses the packet headers and extracts information from them. Packets are then aggregated into flows based on five fields (source address, destination address, protocol type, source port, destination port); each flow is identified by these five fields. The Training Unit finds the appropriate threshold values for each of the three protocols and stores them in a text file. The system also includes an Intrusion Detection Unit that can detect a DoS attack, depending on the threshold values obtained from the Training Unit; when a DoS attack is detected, it is prevented by the Intrusion Prevention Unit, which drops all the flows of the attack and then reports that the attack has been prevented. The Reports Generator Unit issues a report on the attacks that have occurred and some of their details, as will be mentioned later. All of this is lightweight and fast, so it causes no delay or burden on the network.

Figure. 4: DoS system



Figure. 5: DoS Attack Detection and Prevention Algorithm
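As a rough illustration of the detection-and-prevention flow of Figure 5, the sketch below aggregates packets into flows by their five fields, counts packets per flow within a time window, and drops the packets of flows whose count exceeds the per-protocol threshold. The packet representation, the window length and all names are our own assumptions, not the authors' implementation:

from collections import defaultdict

def detect_and_filter(packets, thresholds, period=10.0):
    """Per time window of `period` seconds, count packets per flow
    (src, dst, proto, sport, dport); flows whose count exceeds the
    threshold for their protocol are marked as attacks and dropped.
    `packets` is an iterable of dicts with keys: time, src, dst,
    proto ('UDP'/'ICMP'/'SYN'), sport, dport. Returns forwarded packets."""
    counts = defaultdict(int)
    attack_flows = set()
    forwarded = []
    window_start = None
    for pkt in packets:
        if window_start is None:
            window_start = pkt["time"]
        if pkt["time"] - window_start > period:
            # close the window: flag flows above threshold, reset counters
            for flow, n in counts.items():
                if n > thresholds.get(flow[2], float("inf")):
                    attack_flows.add(flow)
            counts.clear()
            window_start = pkt["time"]
        flow = (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
        counts[flow] += 1
        if flow in attack_flows:
            continue                      # prevention: drop packets of attack flows
        forwarded.append(pkt)
    return forwarded, attack_flows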

We propose a training algorithm to obtain appropriate threshold values for each of the three protocols UDP, SYN and ICMP, as these values will vary depending on the network size and the type of traffic passing through it. Figure (6) shows the steps of the proposed training algorithm.

Figure. 6: Training Algorithm
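A minimal sketch of the training step of Figure 6, under our assumption that the threshold for each protocol is derived from the largest per-flow packet count observed in attack-free traffic over the training windows (the safety margin is also our assumption):

from collections import defaultdict

def train_thresholds(packets, period=10.0, margin=1.2):
    """Learn a per-protocol threshold from attack-free traffic: the maximum
    number of packets seen for any single flow within a window of `period`
    seconds, scaled by a safety `margin`."""
    max_per_proto = defaultdict(int)
    counts = defaultdict(int)
    window_start = None
    for pkt in packets:
        if window_start is None:
            window_start = pkt["time"]
        if pkt["time"] - window_start > period:
            for (_, _, proto, _, _), n in counts.items():
                max_per_proto[proto] = max(max_per_proto[proto], n)
            counts.clear()
            window_start = pkt["time"]
        flow = (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
        counts[flow] += 1
    for (_, _, proto, _, _), n in counts.items():
        max_per_proto[proto] = max(max_per_proto[proto], n)
    return {proto: int(n * margin) for proto, n in max_per_proto.items()}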
The Report Generator Unit generates a report for the administrator that illustrates the attacks that took place, based on the information obtained from the detection unit. Figure (7) shows a sample attack report; the report includes the IP address used by the attacker, the IP address of the victim, the source port, the destination port, the type of protocol, and the date and time of the attack. The report is arranged automatically by the date and time of the attack.

Figure. 7: Model of the attack report
A. Test 1
In this test, an AVI video was sent for 5 minutes, together with an HTTP flow, from the server to a normal computer, with the following video specifications:
Frame rate = 24 frames/second
Frame width = 240 pixels
Frame height = 136 pixels
The test was run twice, with and without QoS, and the outcomes were compared as follows:
The average delay decreased: without QoS the delay was 11.0 ms, while with QoS it became 10.3 ms. Figure (8a) and Figure (8b) show the delay in the video packets without and with QoS.

Figure (8): delay in the video packet (a) without QoS (b) with QoS

The percentage of data loss without QoS was 6.33%, and after applying QoS it became 0.41%, which makes the video appear clearer at the recipient. Figure (9a) and Figure (9b) show snapshots of the video taken without and with the quality of service system.

(a) (b)
Figure. (9): snapshot of the video (a) without the QoS (b) with the QoS
From Figures (10a) and (10b), the bandwidth used by the video without QoS was 994040.53 bits/sec and for HTTP was 811308.82 bits/sec, while they became 1056430.58 bits/sec for the video and 513732.26 bits/sec for HTTP when QoS was applied.


Figure. (10): used bandwidth (a) without the QoS (b) with the QoS
B. Test 2
The impact of an ICMP Flood attack on a normal Ping flow was tested. The ICMP Flood attack was launched from the attacking computer to the server by sending a group of ICMP packets of different sizes to the server, while at the same time 50 ping requests were sent as a normal flow through the network. The results were observed in the presence of the attack and after preventing it, as shown in Figures (11a) and (11b): the rate of data loss during the attack was 56%, whereas after preventing the attack it was 0%; the Round Trip Time (RTT) also decreased, from 115 ms before the attack was prevented to 7 ms after it was stopped. Figures (12a) and (12b) show the drop in the response time of the Ping flow before and after stopping the ICMP Flood attack.
Figure. (11): Impact of ICMP flood attack on ICMP flow (a) number of sent, received and lost packets (b) RTT


Figure. (12): Response Time (a) in presence of attack (b) after stop attack
C. Test 3
A UDP Flood attack was sent for two minutes from the attacking PC to the server, and the impact of the attack on a video sent through the network was measured: the data loss was 10.94% and the delay rate was 11.5 ms, while after preventing the attack the data loss became 0% and the delay rate was 10.3 ms. The packet delay in the video is shown in Figures (13a, 13b), and Figures (14a, 14b) show the effect on the video snapshot.


Figure. (13): delay in the video packet (a) in presence of UDP flood (b)
after stop UDP flood

(a) (b)
Figure. (14): snapshot of the video (a) in presence of UDP flood (b) after stop
UDP flood
The used bandwidth was measured, together with the bit rate of the video, as shown in Figures (15a, 15b). The bit rate of the video was 941103.12 bits/sec in the presence of the attack, and 1056052.25 bits/sec after preventing the attack.

(a) (b)

Figure. (15): used bandwidth (a) in presence of UDP flood (b) after stop UDP
D. Test 4
The last test measured the impact of a SYN Flood attack on normal HTTP traffic. A SYN Flood attack was sent for 3 minutes, with an HTTP flow as the normal flow, and the throughput and Round Trip Time (RTT) of the HTTP flow were measured. As shown in figures (16a, 16b) and (17a, 17b), in the presence of the attack the throughput was scattered between 10000 and 70000 B/s and the highest value of the Round Trip Time was 1 sec, while after stopping the attack the throughput was between 25000 and 70000 B/s, almost in a straight line, and the highest value of the Round Trip Time was 0.5 sec.


Figure. (16): HTTP Throughput (a) in presence of SYN flood (b) after stop
SYN flood


Figure. (17): RTT (a) in presence of SYN flood (b) after stop SYN flood
Figures (18a) and (18b) show the used bandwidth. The average bandwidth of the video with the attack was 322220.65 bits/sec, but it was 497897.17 bits/sec after stopping the attack.


Figure. (18): used bandwidth (a) in presence of SYN flood (b) after stop SYN
Because of the widespread growth of the Internet and the development of streaming applications, quality of service will be of primary importance in IP-based networks. In this paper a system was constructed to achieve quality of service based on the DiffServ technology, giving priority to certain applications as well as protecting them from DoS attacks. The Anomaly Detection model was adopted to detect attacks, and a detected attack is then blocked by dropping the attack flow. From the tests we verified that the system effectively achieves QoS in an IP-based network and succeeds in guaranteeing QoS during DoS attacks. Our future work will use a cross-platform language and develop a system to detect and prevent distributed DoS attacks and other types of attacks.

Investigation of Hill Cipher
Modifications Based on Permutation and Iteration

Mina Farmanbar
Dept. of Computer Engineering
Eastern Mediterranean University
Famagusta T.R. North Cyprus via Mersin 10, Turkey
Alexander G. Chefranov
Dept. of Computer Engineering
Eastern Mediterranean University
Famagusta T.R. North Cyprus via Mersin 10, Turkey

Abstract—Two recent Hill cipher modifications which iteratively use interweaving and interlacing are considered. We show that the strength of these ciphers is due to the non-linear transformation used in them (bit-level permutations). The impact of the number of iterations on the avalanche effect is investigated. We propose two Hill cipher modifications using column swapping and an arbitrary permutation with significantly lower computational complexity (2 iterations are used versus 16). The proposed modifications decrease encryption time while keeping the strength of the ciphers. Numerical experiments for the two proposed ciphers indicate that they can provide a substantial avalanche effect.
Keywords : Hill cipher, non-linear transformation, avalanche
effect, permutation, iteration.
In the Hill cipher [1], ciphertext C is obtained by multiplication of a plaintext vector P by a key matrix, K, i.e., by a linear transformation. Encryption is given by:

C = KP (mod N),    (1)

and decryption by:

P = K^(-1) C (mod N),    (2)

where K^(-1) is the modular arithmetic inverse of K, N > 1. It can
be broken by a known plaintext-ciphertext attack due to its linearity. Several cryptosystems [2, 3, 4, 5, 6, 7] have been developed to modify the Hill cipher and achieve higher security. In them, the Hill cipher is modified by including interweaving, interlacing, and iteration. They have a significant avalanche effect and are supposed to resist cryptanalytic attacks. The strength of these ciphers is supposed to come from the nonlinearity of the m times applied matrix multiplication followed by interlacing or interweaving, as is mentioned explicitly or implicitly in [2, 3, 4, 5, 6, 8]. Only in [8] is nonlinearity related to the number of iterations m, which defines the order of the system of non-linear equations with respect to the elements of the key matrix; the role of the used permutations (interlacing, interweaving) as a source of nonlinearity is not mentioned at all. Even if no permutation is used, non-linear equations for the key matrix elements are obtained after m iterations. However, the resulting transformation is still linear, it may be represented by some matrix, and there is no need to solve non-linear equations to find the elements of the original key matrix; to break the cipher, it is sufficient to recover just the matrix resulting after the several iterative multiplications. In all the papers mentioned above, the role of the used permutations in generating non-linearity is not shown, and the number of iterations m=16 used in all the ciphers is selected, we guess, on the basis of the discussion in [8]: “If we continue the process of iteration and take m=16, then we get 112 nonlinear equation of degree 16. As it is totally impossible to solve such a system of 112 non-linear equations, breaking the cipher is completely ruled out. Thus the cipher cannot be broken by the known plaintext attack.” It is not discussed why interweaving and interlacing strengthen the Hill cipher.
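For concreteness, the following Python sketch (not from the papers under discussion; the key, modulus and plaintext values are illustrative) shows plain Hill encryption and decryption per (1) and (2), and why the linearity enables the known plaintext-ciphertext attack.

from sympy import Matrix

N = 26                                  # modulus (illustrative)
mod = lambda M: M.applyfunc(lambda x: x % N)

K = Matrix([[3, 3], [2, 5]])            # key matrix, invertible mod N
P = Matrix([[7], [4]])                  # plaintext column vector

C = mod(K * P)                          # encryption: C = K P (mod N)
P_back = mod(K.inv_mod(N) * C)          # decryption: P = K^(-1) C (mod N)
assert P_back == P

# Linearity enables the known plaintext-ciphertext attack: with enough
# (P, C) pairs stacked into matrices, K = C_block * P_block^(-1) (mod N).
P_block = Matrix([[7, 0], [4, 19]])     # two known plaintext columns
C_block = mod(K * P_block)
K_recovered = mod(C_block * P_block.inv_mod(N))
assert K_recovered == K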
In the present paper, we show that the strength of the Hill cipher modifications using interlacing, HCML [3], and interweaving, HCMW [5], is due to the non-linear transformation used in them (bit-level permutations: interweaving and interlacing), investigate the impact of the number of iterations on the avalanche effect, and propose generalizations of the ciphers from [3, 5]. Then we present two new Hill cipher modifications which use bit-level permutations and only 1 or 2 iterations. We show that a substantial avalanche effect is achieved by performing a bit-level permutation that swaps arbitrarily selected bits, even just two bits.
The rest of the paper is organized as follows. First, a review of the two Hill cipher modifications is given. Next, the investigation of the number of iterations, the experimental analysis and the results for different numbers of iterations are presented. Then, two ciphers, the column_swapping Hill cipher (CSHC) and the arbitrary permutation Hill cipher (APHC), are proposed and their statistical analysis is conducted and discussed. Finally, we conclude the study. The Appendix contains a proof of the non-linearity of bit-level permutations.

Hill cipher modifications HCML [3] and HCMW [5] use, respectively, interlacing and interweaving (transposition of the binary bits of the plaintext letters) and iteration. They are described as follows.

A plaintext of 2n 7-bit ASCII characters is arranged as an n×2 matrix

P = (P_{k,j}), k = 1..n, j = 1, 2,    (3)

and a key matrix K is an n×n matrix

K = (K_{i,j}), i, j = 1..n,    (4)

such that each of its entries is less than 64 in HCML [3] and less than 128 in HCMW [5].

HCML/HCMW encryption (N=128):

1. P^(0) = P.    (5)
2. For i = 1 to m, where m = 16, do the following:
   compute P^(i) = K P^(i-1) (mod N);
   then P^(i) = interlace(P^(i)) as used in HCML [3], or P^(i) = interweave(P^(i)) as used in HCMW [5].



Algorithm for interlace(P):

1. Divide P into two binary n×7 matrices, B and D, where B_k is the 7-bit binary representation of P_{k,1} and D_k is that of P_{k,2}, with bits B_{k,j}, D_{k,j}, k = 1 to n, j = 1 to 7.
2. Mix B and D to get two new binary n×7 matrices, so that each bit B_{k,j} lies adjacent to its corresponding bit D_{k,j}.
3. Construct P from the mixed matrices, converting the rows back to decimal form, j = 1 to n.
Algorithm for interweave(P):

1. Convert P into a binary n×14 matrix
   B = (b_{k,j}), k = 1..n, j = 1..14.
2. Rotate circularly upward the j-th column of B, so that the new column is (b_{2,j}, b_{3,j}, ..., b_{n,j}, b_{1,j})^T, where j = 1, 3, 5, ...
3. Similarly, rotate circularly leftward the j-th row of B, where j = 2, 4, 6, ...
4. Construct P from B using the first 7 bits of the j-th row for P_{j,1} and the last 7 bits for P_{j,2}, j = 1, 2, ..., n.
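As an illustration of (5) and the iteration above, the following Python sketch gives our own reading of the interweave step and the encryption loop; the 2×2 key and the 2-character-per-column plaintext block are illustrative values, not the data from [3, 5].

import numpy as np

N = 128  # modulus used in HCML/HCMW

def to_bits(P):
    # n x 2 matrix of 7-bit ASCII codes -> n x 14 binary matrix B
    n = P.shape[0]
    B = np.zeros((n, 14), dtype=int)
    for k in range(n):
        B[k, :7] = [int(x) for x in format(int(P[k, 0]), '07b')]
        B[k, 7:] = [int(x) for x in format(int(P[k, 1]), '07b')]
    return B

def from_bits(B):
    n = B.shape[0]
    P = np.zeros((n, 2), dtype=int)
    for k in range(n):
        P[k, 0] = int(''.join(map(str, B[k, :7])), 2)
        P[k, 1] = int(''.join(map(str, B[k, 7:])), 2)
    return P

def interweave(P):
    B = to_bits(P)
    B[:, 0::2] = np.roll(B[:, 0::2], -1, axis=0)   # columns 1,3,5,...: rotate up
    B[1::2, :] = np.roll(B[1::2, :], -1, axis=1)   # rows 2,4,6,...: rotate left
    return from_bits(B)

def encrypt(P, K, m=16):
    for _ in range(m):
        P = (K @ P) % N          # matrix multiplication modulo N
        P = interweave(P)        # bit-level permutation
    return P

K = np.array([[53, 62], [45, 12]])      # illustrative 2x2 key
P = np.array([[84, 101], [104, 32]])    # "The " as 7-bit ASCII, column-wise
print(encrypt(P, K))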
In these algorithms, both interweaving and interlacing are types of bit-level permutation, which makes the total transformation non-linear and thereby defines the strength of these ciphers. A proof of the non-linearity of a transformation represented by a bit-level permutation is given in the Appendix. Let us consider an example in which a bit-level permutation is used after matrix multiplication, showing that the known plaintext-ciphertext attack is not applicable even in the case of a trivial bit-level permutation that just swaps two bits.
In the example below we use the modulus N = 26, a 2×2 key matrix K, and a pair of known plaintext-ciphertext matrices (P_i, Y_i), i = 1, 2, where Y_i is the ciphertext matrix obtained for P_i. The matrix

    | 2  1 |
    | 3 11 |

is the result of a bit-level permutation that swaps two bits, b_i and b_j with i = 2, j = 1, of the entries of Y, i.e. the permutation is P = (4 3 1 2 0) over the five bits; this permuted ciphertext is considered as a new plaintext block. If the overall transformation were linear, the opponent could obtain the key by setting up a linear system from the known pairs, with the permuted ciphertext as a new plaintext and its image as the corresponding ciphertext, and solving it. But evaluating both sides modulo N gives

    | 11  2 |        | 11 12 |
    | 12 24 |   and  |  4 13 |,

which are not equal, so the linear relation does not hold.
In HCML/HCMW, m=16 iterations are used to ensure security and provide a good avalanche effect, i.e. changing one bit of the plaintext or one bit of the key should change many bits of the ciphertext. The number of iterations m is taken to be 16 [8] because in that case a non-linear system of equations of 16-th order is obtained, but actually this is

not the source of the non-linearity of the used transformations. The non-linearity of the transformations used in the ciphers under consideration comes from the use of bit-level permutations (their non-linearity is proved in the Appendix). Hence, the avalanche effect may still be good with a smaller number of iterations. We examine the avalanche effect of these ciphers using the examples of plaintext and key from [3, 5] for different numbers of iterations.
The plaintext, given by (7):

"The World Bank h",    (7)

and the key, given by (8), are from [3]:

        | 53 62 24 33 49 18 17 43 |
        | 45 12 63 29 60 35 58 11 |
        |  8 41 46 30 48 32  5 51 |
K =     | 47  9 38 42  2 59 27 61 |    (8)
        | 57 20  6 31 16 26 22 25 |
        | 56 37 13 52  3 54 15 21 |
        | 36 40 44 10 19 39 55  4 |
        | 14  1 23 50 34  0  7 28 |

and the plaintext (9):

"The development",    (9)

and the key (10) are from [5].

There are some problems in the example from [5] illustrating the avalanche effect. The plaintext (9) in ASCII code should have the letter "l" represented by 108, which in [5] is shown as 109. The correct ASCII code representation of (9) is given in (11):

        |  84 108 |
        | 104 111 |
        | 101 112 |
        |  32 109 |    (11)
        | 100 101 |
        | 101 110 |
        | 118 116 |
        | 101  32 |

The correct result of the multiplication, taking (11) into account, is given by (12).


Table 1 shows comparison results that were obtained by changing the first character of the plaintext (7) from "T" to "U" and the 9th character of the plaintext (9) from "l" to "m" for different numbers of iterations ranging from 1 to 100. We also change the key (8) element from 46 to 47 and the key (10) element from 32 to 33. From Table 1, we can see that the avalanche effect is approximately the same for all numbers of iterations. Hence, the number of iterations equal to 16 used in HCML/HCMW is not distinguished, and a smaller number of iterations may be used.
Table 1. Avalanche effect (number of ciphertext bits that differ) for different numbers of iterations m.

        Change in plaintext             Change in key
 m    plaintext (7)  plaintext (9)    key (8)   key (10)
 1         56            64             30         51
 2         52            59             55         61
 3         53            54             57         59
 4         56            53             58         55
 5         53            40             56         56
 6         62            61             58         56
 7         57            59             59         48
 8         61            54             62         61
 9         44            63             61         62
 10        62            62             47         60
 11        53            64             51         54
 12        56            60             60         56
 13        57            50             49         66
 14        52            54             57         64
 15        60            62             61         57
 16        65            43             55         57
 17        51            60             66         56
 18        51            60             53         62
 19        68            53             62         50
 20        59            59             57         53
 50        58            63             56         49
 100       59            53             58         61
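The avalanche numbers in Tables 1-3 are counts of differing ciphertext bits. A minimal sketch of such a count (our own helper, not taken from the paper) is:

def differing_bits(C1, C2):
    # Count how many bit positions differ between two ciphertext matrices,
    # comparing each pair of symbols via XOR and counting set bits.
    return sum(bin(a ^ b).count("1")
               for row1, row2 in zip(C1, C2)
               for a, b in zip(row1, row2))

# Example: two 2x2 ciphertext blocks (illustrative values).
print(differing_bits([[11, 2], [12, 24]], [[11, 12], [4, 13]]))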

We introduce the Column_swapping Hill cipher (CSHC). It swaps columns of the binary bits of the plaintext characters instead of interlacing and interweaving as in [3, 5]. We also introduce the arbitrary permutation Hill cipher (APHC), which uses an arbitrary permutation, not known to an opponent and shared between the two communicating parties, instead of a fixed permutation (interweaving or interlacing). In CSHC and APHC, 1 or 2 iterations are used instead of the 16 iterations used in [3, 5]. Cipher inputs are the same as used in HCML/HCMW, but there are some additional inputs:
- Number of iterations m, taken as m ∈ {1, 2} instead of 16.

- Permutation, which is a vector of the same length as P (i.e., L = n×14) with integer components from {1,…,L}. All values from 1,…,L are represented in Permutation in some order. For example, if L=4 and Permutation=(4,1,3,2), then applying Permutation to (v1, v2, v3, v4) we get (v4, v1, v3, v2).
- Additional_multiplication (AD), which has two values, True/False, and defines whether the last multiplication in the algorithms is to be applied.
Algorithm for Column_swapping(P):

1. Divide P into two binary n×7 matrices, E and F, where E_k is the 7-bit binary representation of P_{k,1} and F_k is that of P_{k,2}, with bits E_{k,j}, F_{k,j}, k = 1 to n, j = 1 to 7:

   E = (e_{k,j}), F = (f_{k,j}), k = 1..n, j = 1..7.

2. Swap the j-th column of E and the j-th column of F, where j = 2, 4, 6, as shown below for n = 8; the resulting rows are

   E'_k = (e_{k1}, f_{k2}, e_{k3}, f_{k4}, e_{k5}, f_{k6}, e_{k7}),
   F'_k = (f_{k1}, e_{k2}, f_{k3}, e_{k4}, f_{k5}, e_{k6}, f_{k7}),  k = 1..8.

3. Set P_{j,1} from E'_j and P_{j,2} from F'_j (converted back to decimal), where j = 1 to n.
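A compact sketch of this column swap on the bit level (our own Python rendering of the steps above, not the authors' code) is:

import numpy as np

def column_swapping(P):
    # P: n x 2 matrix of 7-bit values. E, F hold the bits of the two columns.
    E = np.array([[int(x) for x in format(int(v), '07b')] for v in P[:, 0]])
    F = np.array([[int(x) for x in format(int(v), '07b')] for v in P[:, 1]])
    # Swap columns 2, 4, 6 (1-based), i.e. indices 1, 3, 5.
    for j in (1, 3, 5):
        E[:, j], F[:, j] = F[:, j].copy(), E[:, j].copy()
    out = np.empty_like(P)
    out[:, 0] = [int(''.join(map(str, row)), 2) for row in E]
    out[:, 1] = [int(''.join(map(str, row)), 2) for row in F]
    return out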

Algorithm for APHC(Permutation, P):

1. Convert P into a binary n×14 matrix
   B = (b_{k,j}), k = 1..n, j = 1..14.
2. Apply Permutation to the bits of B, considered as a row-vector (v_1, v_2, …, v_{n×14}) obtained in row-major order.
3. Construct P from B using the first 7 bits of the j-th row for P_{j,1} and the last 7 bits for P_{j,2}, where j = 1 to n.


The CSHC and APHC ciphers are shown as a diagram in Fig. 1.

Figure 1. Schematic diagram of the CSHC and APHC. Here, m denotes the
number of iterations, and me{1,2}.

For the proposed ciphers, in the case of CSHC with AD=False and m=1, the ciphertext is C = Column_swapping(KP mod N). If an opponent applies to C the inverse of the Column_swapping permutation, he gets KP mod N, and hence the key K of the algorithm can be disclosed by the opponent through the known plaintext-ciphertext attack. In the case of AD=True or m=2, such an attack is not possible. In the case of APHC, the number of iterations may be taken as m=1 with AD=False, since the permutation applied in it is kept secret and thus cannot be inverted without enumerating the possible permutations, whose number grows exponentially with the size L of the permuted vector. Hence, the key space of APHC is L! times greater than that of CSHC and HC.
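Putting the pieces together, a sketch of the full CSHC/APHC encryption loop with the extra inputs m and AD (our own reading of the scheme; column_swapping is the helper sketched above, and any secret permutation routine would play the same role for APHC) might look like:

import numpy as np

def encrypt(P, K, permute, m=1, AD=False, N=128):
    # permute: the bit-level permutation step (e.g. column_swapping for CSHC,
    # or a secret arbitrary permutation of the n*14 bit vector for APHC).
    C = P.copy()
    for _ in range(m):
        C = (K @ C) % N        # matrix multiplication modulo N
        C = permute(C)         # bit-level permutation (the non-linear step)
    if AD:
        C = (K @ C) % N        # optional additional multiplication
    return C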
Let us illustrate the CSHC algorithm after multiplying the plaintext (9) by the key (10) and obtaining (12). After dividing (12) into two binary matrices, we get:

Now, we show the process of CSHC:

The transformed plaintext, after the first iteration, is as follows:

To illustrate APHC, consider a 3-bit permutation that rearranges three bits of the 7-bit binary ASCII code. After converting (12) into a binary matrix, the permutation P = (6,5,4,3,0,2,1) is applied to the bits e_{i,j} with i = 1, j = 1 to 7. The transformed
plaintext matrix, after the first iteration, is as follows:

        |  29 112 |
        |  17  83 |
        |  83 113 |
        | 108  41 |
        |  37  25 |
        |  38  86 |
        |  59  61 |
        | 127  11 |

To test the strength of CSHC and APHC we examine changes of elements in both the plaintext and the key. Table 2 shows the avalanche effect of CSHC when changing the first character of the plaintext (9) from "T" to "U", which differ by one bit, then changing the second character from "h" to "i", and so on, where m ∈ {1, 2} and the additional multiplication AD is True. We also change the key (10) element from 32 to 33. From Table 2, we can see for CSHC that the avalanche effect is more or less the same for m ∈ {1, 2}. Hence, one iteration can be sufficient, i.e., m = 1.
Table 3 shows the average avalanche effect over 17 samples for APHC, which permutes selected z bits, when changing the plaintext (9) ("T" to "U") and the key (10) (32 to 33), for m ∈ {1, 2} and AD = True/False, in order to determine how the number of permuted bits z affects the avalanche effect, where z = 2 to 7.

Table 2. Avalanche effect of CSHC (number of ciphertext bits that differ).

 Change              Original key               Changed key
              m=1, AD=True  m=2, AD=True  m=1, AD=True  m=2, AD=True
 "T" to "U"        44            44            46            64
 "h" to "i"        42            55            60            61
 "e" to "f"        40            55            59            49
 "d" to "e"        60            60            61            66
 "e" to "f"        56            45            51            61
 "v" to "w"        55            50            55            52
 "e" to "f"        56            45            49            49
 "l" to "m"        51            51            56            62
 "o" to "p"        48            51            56            56
 "p" to "q"        49            47            64            65
 "m" to "n"        51            58            53            57
 "e" to "f"        47            54            60            57
 "n" to "o"        53            44            65            55
 "t" to "u"        45            42            63            50

Table 3. Average avalanche effect of APHC over 17 samples (number of ciphertext bits that differ).

        Change in plaintext               Change in key
 z   m=1, AD=  m=1, AD=  m=2, AD=    m=1, AD=  m=1, AD=  m=2, AD=
 2     36.6      45.1      50.1        9.41      57.5      58.8
 3     35.8      45.5      50.8       10.6       56.7      60.1
 4     36.0      39.8      49.8       11.3       55.2      63.1
 5     37.1      38.0      51.1       11.0       55.1      63.1
 6     36.8      43.0      48.3       10.7       60.5      55.5
 7     36.6      44.2      49.2       14.8       55.2      56.5

Table 4. Permutations applied for each number of permuted bits z and the plaintext element indices to which they were applied.

 z   Permutation   Element indices
 2   6453210       P11, P31
 3   6543021       P32, P71
 4   6234510       P72, P12
 5   2345160       P72, P52
 6   1234560       P62, P41
 7   0123456       P32, P12

Table 4 displays the number of permuted bits z and the permutations which were applied, when obtaining the avalanche effect averages of the samples in Table 3, to the plaintext characters represented as 7-bit binary values. For example, two bits are swapped in the permutation (6,4,5,3,2,1,0), which is applied to both elements P_{i,j} of the plaintext matrix with i = 1, 3 and j = 1.
We have seen that even a small change in the plaintext or key results in changing approximately half of the ciphertext bits. From Table 3, we found that any simple bit-level permutation can provide a substantial avalanche effect, the same as the more complicated fixed permutations which have been used in HCML and HCMW.
The Hill cipher is susceptible to the known plaintext-ciphertext attack due to its linearity. In this study, we generalized two Hill cipher modifications [3, 5] which use bit-level permutation and 16 iterations. In both cases, the Hill cipher has been made secure against the attack. We proved that the strength of the ciphers is due to the non-linear transformation used in them (bit-level permutations), and we found that the avalanche effect is approximately the same for numbers of iterations from 1 to 100. Hence, the use of 16 iterations is not justified, and a smaller number of iterations may be used instead. We proposed two new Hill cipher modifications, CSHC and APHC, which also use bit-level permutation and only one or two iterations. Results of statistical tests examining the strength of CSHC and APHC are given, which indicate that any bit-level permutation can provide a substantial avalanche effect.
Here we show the non-linearity of the bit-level transposition P swapping the i-th and j-th bits in the binary representation of a number b:

b = (b_n b_{n-1} .. b_{i+1} b_i b_{i-1} .. b_{j+1} b_j b_{j-1} .. b_1 b_0),
b' = P(b) = (b_n b_{n-1} .. b_{i+1} b_j b_{i-1} .. b_{j+1} b_i b_{j-1} .. b_1 b_0).

A linear transformation T satisfies the following:

T(a_1 X + a_2 Y) = a_1 T(X) + a_2 T(Y),    (13)

where a_1, a_2 are any scalars, and X, Y are any two objects to which the transformation T is applicable. Let us show that the binary permutation P does not meet (13) for a_1 = a_2 = 1 and some two binary numbers

b^1 = (b^1_n .. b^1_{i+1} b^1_i b^1_{i-1} .. b^1_{j+1} b^1_j b^1_{j-1} .. b^1_0),
b^2 = (b^2_n .. b^2_{i+1} b^2_i b^2_{i-1} .. b^2_{j+1} b^2_j b^2_{j-1} .. b^2_0),

where these numbers are selected so that the carries produced by the addition b^1 + b^2 interact with the swapped bit positions i and j. Computing both sides for such numbers gives

P(b^1 + b^2) ≠ P(b^1) + P(b^2).
The last inequality proves that the transposition P(b) swapping the i-th and j-th bits in the binary representation of the number b is a non-linear transformation, because for any transposition we can construct two binary numbers such that (13) is violated for them and the transposition.
For example, let n = 4, i = 3, j = 2, b^1 = (10101) = 21, b^2 = (00101) = 5. Then

b^1 + b^2 = (21 + 5) mod 32 = 26 = (11010),
P(b^1 + b^2) = P(11010) = (10110) = 22,
P(b^1) = P(10101) = (11001) = 25,
P(b^2) = P(00101) = (01001) = 9,
P(b^1) + P(b^2) = (25 + 9) mod 32 = 2 = (00010) ≠ 22 = P(b^1 + b^2).

Since any permutation can be represented as a product of transpositions, we have proved that any bit-level permutation is a non-linear transformation.
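The numeric example above can be checked directly; the following small Python sketch (our own verification helper) confirms the inequality for the transposition of bits i = 3 and j = 2 modulo 32.

def swap_bits(b, i, j):
    # Transposition P: swap the i-th and j-th bits of b (bit 0 is least significant).
    bi, bj = (b >> i) & 1, (b >> j) & 1
    b &= ~((1 << i) | (1 << j))      # clear both positions
    return b | (bj << i) | (bi << j)

b1, b2, N = 21, 5, 32                 # values from the example above
lhs = swap_bits((b1 + b2) % N, 3, 2)  # P(b1 + b2) = 22
rhs = (swap_bits(b1, 3, 2) + swap_bits(b2, 3, 2)) % N  # 25 + 9 = 34 mod 32 = 2
print(lhs, rhs, lhs != rhs)           # 22 2 True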
[1] W. Stallings, "Cryptography and Network Security: Principles and Practices", 4th Ed., Pearson Education India, ISBN-10: 8177587749.
[2] S.U. Kumar, V.U.K. Sastry, A. Vinaya babu, "An Iterative Process Involving Interlacing and Decomposition in the Development of a Block Cipher", Int. J. Comp. Sci. Network Sec., Vol. 6, No. 10, 236-245, 2006.
[3] V.U.K. Sastry and N.R. Shankar, "Modified Hill cipher with interlacing and iteration", J. Comput. Sci., 3: 854-859, 2007.
[4] V.U.K. Sastry, N.R. Shankar, "Modified Hill Cipher for a Large Block of Plaintext with Interlacing and Iteration", J. Comput. Sci., Vol. 4, No. 1, 15-20, 2008.
[5] V.U.K. Sastry, N.R. Shankar and S.D. Bhavani, "A modified Hill cipher involving interweaving and iteration", Int. J. Network Secu., 10: 210-215, 2010a.
[6] V.U.K. Sastry, A. Varanasi and S.U.D. Kumar, "A modified Hill cipher involving a pair of keys and a permutation", Int. J. Comput. Sci. Network Secu., Vol. 10, No. 3, 210-215, 2010b.
[7] V.U.K. Sastry and N.R. Shankar, "Modified Hill cipher for a large block of plaintext with interlacing and iteration", J. Comput. Sci., 4: 15-20, DOI: 10.3844/jcssp.2008.15.20, 2008.
[8] S.U. Kumar, V.U.K. Sastry, A. Vinaya babu, "A Block Cipher Involving Interlacing and Decomposition", Information Technology Journal, Vol. 6, No. 3, 396-404, 2007.


Underwater Acoustic Channel Modeling:
A Simulation Study

Pallavi Kamal
Electrical Engineering
University Teknologi Mara
40450 Shah Alam, Selangor Darul
Ehsan, Malaysia
Taussif Khanna
Institute of Information Technology
Kohat University of Science and Technology

Abstract—Underwater acoustic (UWA) communications have been regarded as one of the most challenging wireless communications due to the unique and complicated properties of underwater acoustic environments, such as severe multipath delay, large Doppler shift and fast environmental changes. Therefore, accurate UWA channel modeling is crucial for achieving high performance UWA communication systems. Since there is no generalized UWA channel model, most existing channel models are either empirical measurement-based or simulation-based. In this paper, we propose a study of simulation-based UWA channel modeling, which is based on the Bellhop algorithm and the Time Variable Acoustic Propagation Model (TVAPM) platform. Our study is capable of handling almost any type of steady and unsteady environmental motion except the modeling of breaking waves.
Keywords-underwater acoustic channel modeling; Bellhop
algorithm; VIRTEX platform ; environmental motion
Underwater Acoustic (UWA) communications have
been widely used in military and civilian applications for a
long time, such as monitoring of underwater environments,
and unmanned underwater vehicle (UUV) communications.
However, how to achieve high performance and reliable
UWA communications is a challenging issue due to the
complicated underwater environments (e.g., shallow water).
One of the driving factors in the performance of certain
UWA communication systems is the Doppler spread, which
is often generated by sea-surface movement. The time-
varying nature of the sea surface adds complexity and often
leads to a statistical description for the variations in the
received signals. Severe multipath propagation and large
Doppler shift due to source/receiver motion are another two
factors that determine the performance of UWA
communications. The available bandwidth of the UWA
channel is limited and it highly depends on both
transmission range and frequency. These characteristics restrict reliable and high-performance UWA communications.
Many works have already treated the problem of UWA channel modeling, based either on empirical measurement or on software simulation [1-5]. However, these efforts may not be suitable for contexts in which a high-level model is required. In other words, how to accurately model UWA channels for complicated UWA environments is still a challenging issue that has not been well resolved.
The Bellhop algorithm [6-7] is a beam tracing model for predicting acoustic pressure fields in ocean environments. The beam tracing structure leads to a particularly simple algorithm. Bellhop can produce a variety of useful outputs including transmission loss, eigenrays, arrivals, and received time-series. It allows for range-dependence in the top and bottom boundaries (altimetry and bathymetry), as well as in the sound speed profile. [8-9] adopted the Bellhop algorithm to calculate the transmission-loss term of the signal-to-noise ratio (SNR). [10] used the Bellhop Gaussian beam tracing program for the acoustic field module. [11] adopted the Bellhop algorithm to simulate the channel impulse response in the presence of the effect of wind-generated bubbles.
The Time Variable Acoustic Propagation Model
(TVAPM) platform [12] aims at generating time variable
simulated acoustic channel responses between moving
sources and receivers in a realistically modeled
environment. The main objective of this simulator is to
properly take into account the Doppler effects induced by
source-receiver relative motion as well as the effects of that
motion that propagate through the acoustic channel.
In this paper, based on the Bellhop algorithm and the TVAPM platform, we propose a study of UWA channel modeling in complicated underwater environments. The content includes, but is not limited to, the following aspects: (1) source/array initial localization; (2) Doppler shift (in frequency) estimation; (3) multi-path time delay estimation; (4) TVAPM platform-based channel modeling with sea-surface motion (e.g., wind speed). Our paper will provide a useful reference for scientists and engineers who want to simulate complicated UWA channels.

The rest of this paper is organized as follows: Section II
is the introduction of the Bellhop algorithm and the
TVAPM platform. Section III is the study of Bellhop
algorithm-based investigation of acoustic signal
propagation and UWA channel properties; Section IV
concludes this paper.
A. Bellhop Algorithm
The overall structure of the BELLHOP algorithm is shown in Figure 1. In order to describe the environment and the geometry of sources and receivers, various files must be provided. In the simplest and most general case, which is also typical, there is only one such file. It is referred to as an environmental file (.env) and includes the sound speed profile, as well as information about the ocean bottom. However, if there is a range-dependent bottom, then a bathymetry file (.bty) with range-depth pairs defining the water depth must be added. Similarly, if there is a range-dependent ocean sound speed, an SSP file with the sound speed tabulated on a regular grid should be provided. Further, if one wants to specify an arbitrary bottom reflection coefficient to characterize the bottom, then one must provide a bottom reflection coefficient file with angle-reflection coefficient pairs defining the reflectivity. Similar capabilities are implemented for the surface; thus there is the option of providing a top reflection coefficient and a top shape (.aty file) [13].
Usually one assumes the acoustic source is omni-directional; however, if there is a source beam pattern, then one must provide a source beam pattern file with angle-amplitude pairs defining it. BELLHOP reads these files depending on options selected within the main environmental file. Plot programs (plotssp, plotbty, plotbrc, etc.) are provided to display each of the input files [13].
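As a small illustration of this file layout, the sketch below gathers the optional input files that may accompany a given case before a Bellhop run. The .env, .bty and .aty extensions are named in the text; the .ssp and .brc extensions for the sound-speed and bottom-reflection files, and the case name, are assumptions made for illustration only.

import os

def collect_bellhop_inputs(case_name):
    # Which optional Bellhop input files are present for this case?
    # (.env, .bty, .aty from the text; .ssp, .brc are assumed extensions.)
    optional = {
        ".env": "environment (SSP, bottom info, options)",
        ".bty": "range-dependent bathymetry",
        ".ssp": "range-dependent sound speed grid",
        ".brc": "bottom reflection coefficient",
        ".aty": "top shape (altimetry)",
    }
    return {ext: desc for ext, desc in optional.items()
            if os.path.exists(case_name + ext)}

print(collect_bellhop_inputs("shallow_case"))   # hypothetical case name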

Figure 1. Structure of Bellhop algorithm [13]
B. TVAPM platform
The TV-APM is a valuable tool to discuss arrival scattering due to the propagation of wind-driven sea surface waves or relative motion between a source and an array. Modeling in the future can be oriented along the following guidelines:
- Accounting for range-dependent bottom properties, shear included.
- Accounting for sound speed fields and sound speed temporal variability, in particular using empirical orthogonal functions.
- Considering the statistical distribution of travel times and amplitudes.
- Considering the inclusion of additional acoustic
Figures 2-5 provide results of a case study of underwater
acoustic channel properties. Parameter settings for the selected
case can be found in Appendix.


Figure 2. Result of Static Source

Figure 3. Result of Static Array

Figure 4. Result of Source/Array initial positioning
Figure 5. Results of simulated channel properties: t-tau, t-f, d-tau and d-f representations.
The Bellhop algorithm, as well as its associated TVAPM platform, has been validated to be a powerful tool for modeling underwater acoustic channels, and is especially convenient for investigating the properties of underwater acoustic channels under time-varying conditions. In addition, this channel modeling platform can be combined with other underwater acoustic communication and network simulation tools (e.g., ns2) to evaluate the performance of a whole system.

Appendix: Parameter settings of the case study

source_x = [1900 900 5];
source_v = [0 0 0];
source_nrays = 2001;
source_aperture = 60;
source_ray_step = 10;
fc = 15000; % frequency (Hz)
fs_x = 5000; % input baseband signal sampling frequency (Hz)
bottom_properties = [1465 1.5 0.06];
%bottom_properties(1) = compressional speed (m/s)
%bottom_properties(2) = bottom density (g/cm3)
%bottom_properties(3) = bottom attenuation (dB/wavelength)
array_x = [2800 400 0];
array_v = [0 0 0];
first_hyd = 30;
last_hyd = 60;
delta_hyd = 10;
% Wind induced sea surface wave
U = 10; % wind speed
theta = 20; % direction of propagation in degrees
spreading = 'none';


[1] A. F. Harris and M. Zorzi, "Modeling the underwater acoustic channel in ns2", ACM Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools, pp. 1-8, 2007.
[2] S. H. Byun, S. M. Kim, Y. K. Lim and W. Seong, "Time-varying underwater acoustic channel modeling for moving platform", IEEE Oceans Conference, pp. 1-4, 2007.
[3] R. Su, R. Venkatesan, and C. Li, "A Review of Channel Modelling Techniques for Underwater Acoustic Communications", in Proc. of 19th IEEE Newfoundland Electrical and Computer Engineering Conference (NECEC'10), St. John's, Nov. 2010.
[4] M. Chitre, "A high-frequency warm shallow water acoustic communications channel model and measurements", J. Acoust. Soc. Am., vol. 122, no. 5, pp. 2580-2586, 2007.
[5] T. C. Yang, "Properties of underwater acoustic communication channels in shallow water", J. Acoust. Soc. Am., vol. 131, no. 1, pp. 129-145, Jan.
[6] BELLHOP Gaussian beam/finite element beam code. [Available]
[7] O. C. Rodriguez, "General description of the Bellhop ray tracing".
[8] X. Huang and V. B. Lawrence, "Capacity Criterion-Based Bit and Power Loading for Shallow Water Acoustic OFDM System with Limited Feedback", in Proceedings of the IEEE 73rd VTC Conference, pp. 1-5, 2011.
[9] X. Huang and V. B. Lawrence, "Bandwidth-Efficient Bit and Power Loading for Underwater Acoustic OFDM Communication System with Limited Feedback", in Proceedings of the IEEE 73rd VTC Conference, pp. 1-5.
[10] Michael B. Porter, "The VirTEX code for modeling Doppler effects due to platform and ocean dynamics on broadband waveforms", Office of Naval Research workshop on High Fidelity Active Sonar Training, Applied Research Laboratories of the University of Texas, Austin, TX, August 25-26, 2009.
[11] X. Huang and V. B. Lawrence, "Effect of Wind-Generated Bubbles on OFDM Power Loading for Time-Varying Shallow Water Acoustic Channels with Limited Feedback", in Proceedings of the IEEE Oceans Conference, pp. 1-6, 2011.
[12] A. J. Silva and O. Rodriguez, "Time-variable acoustic propagation model". [Available]
[13] Bellhop algorithm user's guide. [Available]

A Fast Accurate Network Intrusion Detection System
Ahmed A. Elngar
Computer Science Department
Information & Computer science Faculty
Sinai University
El-Arish ,Egypt
Dowlat A. El A. Mohamed
Math & Computer Science Department
Science Faculty
Ain-Shams University
Cairo, Egypt
Fayed F. M. Ghaleb
Math & Computer Science Department
Science Faculty
Ain-Shams University
Cairo, Egypt
Abstract—The Intrusion Detection System (IDS) is a valuable tool for the defense-in-depth of computer networks. However, intrusion detection systems face a number of challenges. One of the important challenges is that the input data to be classified lie in a high-dimensional feature space. In this paper, we propose the PSO-DT intrusion detection system, where Particle Swarm Optimization (PSO) is used as a feature selection algorithm to maximize the detection accuracy of the C4.5 Decision Tree classifier and minimize the detection time. To evaluate the performance of the proposed PSO-DT IDS, several experiments on the NSL-KDD benchmark network intrusion detection dataset are conducted. The results obtained demonstrate the effectiveness of reducing the number of features from 41 to 11, which increases the detection performance to 99.17% and reduces the model building time to 11.65 sec.
Keywords-Network Security; Intrusion Detection System; Feature Selection; Particle Swarm Optimization; Genetic Algorithm; Decision Tree.
Reliance on the Internet and online procedures has increased the potential of attacks launched over the Internet. Therefore, network security needs attention in order to provide secure information channels. The concept of Intrusion Detection (ID) was proposed by Anderson in 1980 [1]. ID is based on the assumption that the behavior of intruders is different from that of a legal user [2]. The Intrusion Detection System (IDS) has become an essential component of computer network security. An IDS aims to identify unusual access or attacks in order to secure internal networks [3], by looking for potentially malicious activities in network traffic, and raises an alarm whenever a suspicious activity is detected.
IDS techniques can be categorized into two classes: misuse detection and anomaly detection [4]. Misuse detection uses well-defined patterns of attacks (attack signatures) to identify known intrusions [5], while anomaly detection creates a normal behavior profile and identifies intrusion traffic based on significant deviations from this normal profile [6]. Anomaly detection techniques have the advantage of identifying unknown attacks [7].
Several pattern classification techniques have been proposed in the literature for the development of IDS, including Fuzzy Logic (FL) [7], Neural Networks (NN) [8], Support Vector Machines (SVM) [5], [7] and Decision Trees (DT) [9]. One of the important problems for IDS is dealing with data containing a high number of features. High-dimensional data may decrease the predictive accuracy of the IDS. Therefore, feature selection can serve as a pre-processing tool for high-dimensional data before solving the classification problem. The purpose of feature selection is to reduce the number of irrelevant and redundant features. Different feature selection methods have been proposed to increase the performance of IDS [10], including the Genetic Algorithm (GA) [11], Principal Component Analysis (PCA) [12] and Information Gain (IG) [13].
In this paper, we propose an anomaly intrusion detection system using Particle Swarm Optimization (PSO) to implement feature selection, followed by a C4.5 decision tree classifier. The effectiveness of the proposed PSO-DT IDS is evaluated by conducting several experiments on the NSL-KDD network intrusion dataset. The results reveal that our proposed PSO feature-selection-based IDS increases the accuracy and speeds up the detection time compared to other well-known feature selection methods. The rest of this paper is organized as follows: Section II presents an overview of the used methods, including the Genetic Algorithm, Particle Swarm Optimization and Decision Trees. Section III describes the NSL-KDD network intrusion dataset. Section IV introduces the proposed PSO-DT IDS system. Section V gives the implementation results and analysis. Finally, Section VI contains the concluding remarks.
This section gives an overview of Feature Selection, the Genetic Algorithm (GA), the PSO Algorithm and the Decision Tree (DT).
A. Feature Selection
Feature selection is one of the important techniques used for data preprocessing in IDS [14]. It aims to improve the detection performance through the removal of irrelevant, noisy and redundant features. Feature selection can be
achieved by two different methods: filter methods [15] and wrapper methods [16]. Filter methods rely on the general characteristics of the data to evaluate the relevance of the features, without depending on any machine learning algorithm to select the new set of features [17]. Wrapper methods, in contrast, exploit a machine learning algorithm and use the classification performance to evaluate the goodness of features [18]. The Genetic algorithm [19] and the PSO algorithm [20] are chosen for this study, with the aim of employing wrapper feature selection. A brief description of the Genetic algorithm and the PSO algorithm is given below.
B. Genetic Algorithm (GA)
Holland [21] introduced the Genetic Algorithm (GA), which has been successfully applied to solve search and optimization problems. GA is a computational model that simulates the evolutionary processes of nature [22].
The basic idea of a GA is to search a hypothesis space of individuals to find the best individuals. Each individual is called a chromosome and is composed of a number of genes. The GA procedure starts by generating an initial population of random chromosomes. Then the population is evolved for a number of generations, where the goodness of the chromosomes is gradually improved, as measured by the increasing value of the fitness function. Each generation of GA includes three fundamental operators: selection, crossover and mutation.
1) Selection operation: A population is created with a group of random chromosomes. Based on a fitness function, the chromosomes in the population are evaluated and selected for the next generation.
2) Crossover operation: crossover randomly chooses a point in pairs of the selected chromosomes and exchanges the remaining segments of them to create new chromosomes.
3) Mutation operation: mutation randomly changes one or more components of a selected chromosome [22].
These three operations continue until a suitable solution has been found or a certain number of generations have passed. Since GA can find a global optimum solution, it is well suited to feature selection problems. Algorithm 1 shows the structure of a simple Genetic Algorithm (GA).
Algorithm 1 GA algorithm
1: Initialize a population of random individuals.
2: Evaluate population members based on the fitness function.
3: while termination condition or maximum number of generations not reached do
4:   Select parents from the current population.
5:   Apply crossover and mutation to the selected parents.
6:   Evaluate offspring.
7:   Set the current population equal to the offspring.
8: end while
9: Return the best individuals.
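As an illustration of the crossover and mutation operators on binary feature-mask chromosomes (a sketch under our own assumptions, not the authors' implementation), consider:

import random

def crossover(parent1, parent2):
    # One-point crossover: exchange the tails of two chromosomes.
    point = random.randrange(1, len(parent1))
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])

def mutate(chromosome, rate=0.05):
    # Bit-flip mutation: each gene flips with a small probability.
    return [1 - g if random.random() < rate else g for g in chromosome]

# Chromosomes are 41-bit masks: gene k = 1 keeps NSL-KDD feature k.
p1 = [random.randint(0, 1) for _ in range(41)]
p2 = [random.randint(0, 1) for _ in range(41)]
c1, c2 = crossover(p1, p2)
c1 = mutate(c1)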
C. Particle Swarm Optimization (PSO)
Particle Swarm Optimization (PSO) is an evolutionary computation technique developed by Kennedy and Eberhart in 1995 [23]. PSO simulates the social behavior of organisms, such as bird flocking. PSO is initialized with a random population (swarm) of individuals (particles), where each particle of the swarm represents a candidate solution in the d-dimensional search space. To discover the best solution, each particle changes its searching direction according to: the best previous position (the position with the best fitness value) in its individual memory (pbest), represented by P_i = (p_i1, p_i2, ..., p_id); and the global best position gained by the swarm (gbest), G = (g_1, g_2, ..., g_d).
The d-dimensional position of particle i at iteration t can be represented as:

X_i^t = (x_i1^t, x_i2^t, ..., x_id^t),    (1)

while the velocity (the rate of position change) of particle i at iteration t is given by:

V_i^t = (v_i1^t, v_i2^t, ..., v_id^t).    (2)
All of the particles have fitness values, which are evaluated based on a fitness function:

Fitness = α · γ_R(D) + β · (|C| − |R|) / |C|,    (3)

where γ_R(D) is the classification quality of the condition attribute set R relative to the decision D, |R| is the length of the selected feature subset, and |C| is the total number of features. The parameters α and β correspond to the importance of classification quality and subset length, with α ∈ [0, 1] and β = 1 − α.
The particle updates its velocity according to:

v_id^(t+1) = w·v_id^t + c_1·r_1·(p_id − x_id^t) + c_2·r_2·(g_d − x_id^t),  d = 1, 2, ..., D,    (4)
where w is the inertia weight and r_1 and r_2 are random numbers distributed in the range [0, 1]. The positive constants c_1 and c_2 denote the cognition learning factor (the private thinking of the particle itself) and the social learning factor (the collaboration among the particles). p_id denotes the best previous position found so far by the i-th particle and g_d denotes the global best position found so far [25].
Each particle then moves to a new potential position based on the following equation:

x_id^(t+1) = x_id^t + v_id^(t+1),  d = 1, 2, ..., D.    (5)
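A minimal sketch of the velocity and position updates (4)-(5) for one particle, with illustrative values of w, c1 and c2 (the paper does not state its parameter settings here), is:

import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
    # One PSO iteration for a single particle, per equations (4) and (5).
    new_v, new_x = [], []
    for d in range(len(x)):
        r1, r2 = random.random(), random.random()
        vd = w * v[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v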
D. Decision Tree (DT)
The decision tree (DT), introduced by Quinlan [26], is a powerful data mining algorithm for decision-making and classification problems. DT classifiers can be built from large volumes of data with many attributes, because the tree size is independent of the dataset size.
A DT consists of three main components: nodes, leaves, and edges. Each node specifies a feature of the dataset by which the data are to be partitioned. Each node has a number of edges, which are labeled according to the possible values of the feature in the parent node. An edge connects either two nodes or a node and a leaf [27]. The process of constructing a decision tree is basically a divide-and-conquer process [26]: construction starts from the root node, dividing the dataset into subsets at each node, and follows the edges down until a leaf node representing the class is reached. This process terminates when all the data in the current subset belong to the same class. The C4.5 algorithm [26] uses the Gain Ratio measure to choose the best attribute for each decision node during the building of the decision tree: at each dividing step, C4.5 chooses an attribute which provides the maximum information gain while reducing, by normalization, the bias in favor of tests with many outcomes.
Given probabilities p_1, p_2, ..., p_s for the different classes in a dataset D, the entropy is calculated by:

H(p_1, p_2, ..., p_s) = − Σ_i p_i log2(p_i).    (6)

H(D) measures the amount of entropy in the class-based subsets of the dataset. The dataset is split into s new subsets S = {D_1, D_2, ..., D_s} using some attribute, where a subset does not need any further split if all examples in it belong to the same class. The ID3 algorithm calculates the information gain of a split and chooses the split which provides maximum information gain:

Gain(D, S) = H(D) − Σ_{i=1..s} P(D_i) H(D_i).    (7)

The C4.5 algorithm improves the ID3 algorithm by using the highest Gain Ratio, which ensures a larger-than-average information gain for the splitting purpose [28]:

GainRatio(D, S) = Gain(D, S) / H(|D_1|/|D|, ..., |D_s|/|D|).    (8)
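A short sketch of equations (6)-(8) applied to label counts (our own illustration, not the authors' implementation) is:

from math import log2
from collections import Counter

def entropy(labels):
    # Equation (6): H over the class probabilities of a list of labels.
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def gain_ratio(parent_labels, subsets):
    # Equations (7)-(8): information gain normalized by the split entropy.
    total = len(parent_labels)
    gain = entropy(parent_labels) - sum(len(s) / total * entropy(s) for s in subsets)
    split_info = entropy([i for i, s in enumerate(subsets) for _ in s])
    return gain / split_info if split_info else 0.0

# Example: a split of 6 samples into two pure subsets.
print(gain_ratio(["dos", "dos", "normal", "normal", "dos", "normal"],
                 [["dos", "dos", "dos"], ["normal", "normal", "normal"]]))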
For the evaluation of research on network intrusion detection systems, MIT Lincoln Laboratory has collected and distributed the DARPA benchmark datasets [29]. The KDD'99 dataset is a subset of the DARPA benchmark dataset prepared by Sal Stofo and Wenke Lee [30]. The KDD'99 training dataset consists of about five million records of compressed binary TCP dump data from seven weeks of network traffic, where each KDD'99 training record contains 41 features (e.g., protocol type, service, and flag) and is labeled as either normal or a specific attack type. The training set contains a total of 22 training attack types, with an additional 17 attack types in the testing set only. The attacks belong to four categories:
1) DoS (Denial of Service), e.g. Neptune, Smurf and Pod
2) U2R (user-to-root: unauthorized access to root privileges), e.g. Buffer-overflow, Load-module, Perl and Spy
3) R2L (remote-to-local: unauthorized access to local from a remote machine), e.g. Guess-password, Ftp-write, Imap and Phf
4) Probe (probing: information gathering attacks), e.g. Port-sweep, IP-sweep, Nmap and Satan.
Leung and Leckie [31] reported two problems with the KDD'99 dataset which affect the evaluation of results of intrusion detection systems:
1) The 10% portion of the full KDD'99 dataset contains only two types of DoS attacks (Smurf and Neptune). These two types constitute over 71% of the testing dataset, which completely affects the evaluation.
2) Since these attacks consume large volumes of traffic, they are easily detectable by other means and there is no need to use anomaly detection systems to find them.
To solve these problems, a new dataset, NSL-KDD [32], was suggested. The NSL-KDD dataset consists of selected records of the complete KDD'99 dataset.
The proposed hybrid anomaly intrusion detection system uses the advantages of PSO feature selection in conjunction with the C4.5 DT classifier to detect and classify network intrusions into five outcomes: normal and the four categories of intrusions. It consists of the following three fundamental building phases: (1) preprocessing, (2) feature selection based on PSO, and (3) classification using the C4.5 DT. Figure 1 shows the overall architecture of the proposed PSO-DT intrusion detection system.

Figure 1. The overall architecture of the proposed PSO-DT intrusion detection system: the NSL-KDD dataset (41 features) is preprocessed (attack names converted, symbolic features converted to numeric values), the preprocessed dataset (41 features) is passed to PSO feature selection, and the reduced dataset (11 features) is classified by the C4.5 classifier to produce the classified dataset.
Preprocessing phase: The following three pre-processing stages have been applied to the NSL-KDD dataset:
1) Symbolic features are converted to numeric values.
2) Each attack name is converted to its category: 0 for Normal, 1 for DoS (Denial of Service), 2 for U2R (user-to-root), 3 for R2L (remote-to-local), and 4 for Probe.
3) Normalization is applied, since the data have significantly varying resolutions and ranges. The feature values are scaled to be within the range [0, 1] using the following equation:

X_norm = (X − X_min) / (X_max − X_min),    (9)

where X_min and X_max are the minimum and maximum values of a specific feature and X_norm is the normalized value.
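A compact sketch of these preprocessing steps (with illustrative column names and a shortened attack-name mapping; not the authors' exact code) could be:

import pandas as pd

def preprocess(df, label_col="label"):
    df = df.copy()
    # 1) Symbolic features -> numeric codes (illustrative approach).
    for col in df.select_dtypes(include="object").columns:
        if col != label_col:
            df[col] = df[col].astype("category").cat.codes
    # 2) Attack names -> category indices (mapping shortened for illustration).
    categories = {"normal": 0, "neptune": 1, "smurf": 1, "buffer_overflow": 2,
                  "guess_passwd": 3, "ipsweep": 4, "nmap": 4}
    y = df[label_col].map(categories).fillna(0).astype(int)
    # 3) Min-max normalization of every feature into [0, 1], as in (9).
    X = df.drop(columns=[label_col]).astype(float)
    X = (X - X.min()) / (X.max() - X.min() + 1e-12)
    return X.values, y.values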
PSO Feature Selection Phase: In this paper, the PSO algorithm [23] has been used as a feature selection method to reduce the dimensionality of the NSL-KDD dataset. PSO efficiently reduces the NSL-KDD dataset from 41 features to 11 features, which removes 73.1% of the feature dimensions. At every iteration of the PSO algorithm, each particle X_i is updated by the two best values pbest and gbest, where pbest denotes the best solution the particle X_i has achieved so far, and gbest denotes the global best position found so far. Algorithm 2 shows the main steps of the PSO algorithm-based feature selection.
Algorithm 2 PSO algorithm-based feature selection
Input:
  m: the swarm size.
  c_1, c_2: positive acceleration constants.
  w: inertia weight.
  MaxGen: maximum generation.
  MaxFit: fitness threshold.
Output:
  Global best position (best features of the NSL-KDD dataset).
1: Initialize a population of particles with random positions and velocities on the d = 1,...,41 NSL-KDD feature dimensions; pbest_i = 0, Gbest = 0, Iter = 0.
2: while Iter < MaxGen or gbest < MaxFit do
3:   for i = 1 to number of particles m do
4:     Fitness(i) = Evaluate(i)
5:     if Fitness(i) > fitness(pbest_i) then
6:       fitness(pbest_i) = Fitness(i)
7:       Update p_i = x_i
8:     end if
9:     if Fitness(i) > Gbest then
10:      Gbest = Fitness(i)
11:      Update gbest = i
12:    end if
13:    for each dimension d do
14:      Update the velocity vector.
15:      Update the particle position.
16:    end for
17:  end for
18:  Iter = Iter + 1
19: end while
20: Return the Global best position.
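The Evaluate(i) step is the wrapper part: a candidate feature mask is scored by the classification performance of the induced decision tree. A hedged sketch using scikit-learn's decision tree as a stand-in for C4.5 (the paper's experiments were run in a Java environment, so this is purely illustrative) is:

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def evaluate(mask, X, y, alpha=0.9):
    # Fitness per (3): weighted classification quality plus a reward for
    # dropping features; gamma is approximated by cross-validated accuracy.
    selected = np.flatnonzero(mask)
    if selected.size == 0:
        return 0.0
    clf = DecisionTreeClassifier()          # CART here; the paper uses C4.5
    gamma = cross_val_score(clf, X[:, selected], y, cv=10).mean()
    beta = 1.0 - alpha
    return alpha * gamma + beta * (X.shape[1] - selected.size) / X.shape[1]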
C4.5 DT Classification Phase: A decision tree classifier is built using the C4.5 algorithm [26]. The 11 features output by the PSO selection are then passed to the C4.5 decision tree classifier, and each record is classified into one of the five categories: Normal, DoS, U2R, R2L and Probe.
The proposed PSO-DT intrusion detection system is evaluated using the NSL-KDD dataset, where 59586 records are randomly taken. All experiments have been performed on an Intel Core i3 2.13 GHz processor with 2 GB of RAM. The experiments have been implemented in a Java language environment with ten-fold cross-validation.
A. Performance evaluation
The detection effectiveness of the proposed PSO-DT IDS is measured in terms of TP Rate, FP Rate and F-measure, which are calculated from the confusion matrix. The confusion matrix is a square matrix where columns correspond to the predicted class, while rows correspond to the actual class. Table I gives the confusion matrix, which shows the four possible prediction outcomes [33].
Table I. Confusion matrix.
                      Predicted Class
 Actual Class      Normal      Attack
 Normal              TN          FP
 Attack              FN          TP
True negatives (TN): the number of normal events successfully labeled as normal.
False positives (FP): the number of normal events wrongly predicted as attacks.
False negatives (FN): the number of attack events incorrectly predicted as normal.
True positives (TP): the number of attack events correctly predicted as attacks.
TP Rate = TP / (TP + FN),
FP Rate = FP / (FP + TN),
F-measure = 2·TP / (2·TP + FP + FN).
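These three measures can be computed directly from the confusion-matrix counts; a small sketch (our own helper, with illustrative counts) is:

def detection_metrics(tp, tn, fp, fn):
    # TP rate, FP rate and F-measure from confusion-matrix counts.
    tp_rate = tp / (tp + fn)
    fp_rate = fp / (fp + tn)
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return tp_rate, fp_rate, f_measure

print(detection_metrics(tp=950, tn=980, fp=20, fn=50))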
B. Experiments and analysis
The classification performance measurements are shown in Tables II and III. Table II shows the accuracy measurements achieved by the C4.5 classifier using the full-dimension data (41 features), while Table III gives the accuracy measurements of the proposed anomaly PSO-DT network intrusion detection system with the reduced feature set (11 features).
Table II. Accuracy measurements of the C4.5 classifier using all 41 features
Class name TP Rate FP Rate F-Measure
Normal 0.982 0.012 0.983
DoS 0.998 0.002 0.997
U2R 0.967 0.003 0.958
R2L 0.932 0.003 0.935
Probe 0.983 0.002 0.985
Table III. Accuracy measurements of the proposed PSO-DT IDS using the 11 selected features
Class name TP Rate FP Rate F-Measure
Normal 0.989 0.006 0.991
DoS 0.999 0.002 0.998
U2R 0.99 0 0.992
R2L 0.963 0.003 0.954
Probe 0.993 0.001 0.994
From Tables II and III, it is clear that the classification accuracy achieved using PSO as a feature selection method with the C4.5 classifier is higher than that achieved using C4.5 as a standalone classifier.
We compared the PSO feature selection method with a well-known feature selection method, the genetic algorithm (GA). Table IV shows the classification accuracy obtained by applying the GA feature selection algorithm with the C4.5 classifier.
Table IV. Accuracy measurements of GA feature selection with the C4.5 classifier
Class name TP Rate FP Rate F-Measure
Normal 0.99 0.01 0.988
DoS 0.999 0.003 0.997
U2R 0.985 0.001 0.985
R2L 0.917 0.001 0.943
Probe 0.991 0.001 0.992
Table V compares the detection accuracy, number of features and model building time of the C4.5, GA-DT and proposed PSO-DT intrusion detection systems. Table V illustrates that the proposed PSO-DT IDS gives better detection performance (99.17%) than the C4.5 and GA-DT IDS. The proposed PSO-DT IDS also reduces the feature space from 41 to 11 features and lowers the model building time to 11.65 sec, which is important for real-time network applications.
Table V. Comparison of detection accuracy, number of features and model building time
System Test accuracy Features number Model building Time
C4.5 DT 98.45% 41 64.71 sec.
GA-DT 98.92% 12 12.26 sec.
Proposed PSO-DT 99.17% 11 11.65 sec.
In this paper we proposed a fast and accurate anomaly network intrusion detection system (PSO-DT), in which the PSO algorithm is used as a feature selection method and the reduced data are then classified by a C4.5 decision tree classifier. The NSL-KDD network intrusion benchmark was used to conduct several experiments testing the effectiveness of the proposed PSO-DT network intrusion detection system. A comparative study applying GA feature selection with the C4.5 decision tree classifier was also carried out. The results obtained showed the adequacy of the proposed PSO-DT IDS in reducing the number of features from 41
to 11, which improves the detection accuracy to 99.17% and reduces the model building time to 11.65 sec.
[1] J.P. Anderson, ”Computer security threat monitoring and
surveillance”,Technical Report, James P. Anderson Co., Fort
Washington, PA, April 1980.
[2] W. Stallings, ”Cryptography and network security principles
and practices”, USA, Prentice Hall, 2006.
[3] C. Tsai , Y. Hsu, C. Lin and W. Lin, ”Intrusion detection by
machine learning: A review”, Expert Systems with Applica-
tions, vol. 36, pp.11994-12000, 2009.
[4] T. Verwoerd and R. Hunt, ”Intrusion detection techniques and
approaches”, Computer Communications, vol. 25, pp.1356-
1365, 2002.
[5] S. Mukkamala, G. Janoski and A.Sung, ”Intrusion detection:
support vector machines and neural networks”, In Proc. of
the IEEE International Joint Conference on Neural Networks
(ANNIE), St. Louis, MO, pp. 1702-1707, 2002.
[6] E. Lundin and E. Jonsson, ”Anomaly-based intrusion de-
tection: privacy concerns and other problems”, Computer
Networks, vol. 34, pp. 623-640, 2002.
[7] S. X. Wu and W. Banzhaf,”The use of computational intel-
ligence in intrusion detection systems: A review”, Applied
Soft Computing, vol .10, pp. 1-35, 2010.
[8] G. Wang, J. Hao, J. Ma and L. Huang, ”A new approach
to intrusion detection using Artificial Neural Networks and
fuzzy clustering”, Expert Systems with Applications, vol. 37,
pp.6225-6232, 2010.
[9] T. Abbes, A. Bouhoula and M. Rusinowitch, ”Protocol
analysis in intrusion detection using decision tree”, Inform.
Technol. Coding Comput. vol.1,pp. 404-408, 2004.
[10] C. Tsang, S. Kwong and H. Wang, ”Genetic-fuzzy rule min-
ing approach and evaluation of feature selection techniques
for anomaly intrusion detection”, Pattern Recognition, vol.
40, pp. 2373-2391, 2007.
[11] K.Y. Chan, C.K. Kwong, Y.C. Tsim, M.E. Aydin and T.C.
Fogarty, ”A new orthogonal array based crossover, with
analysis of gene interactions, for evolutionary algorithms
and its application to car door design”, Expert Systems with
Applications, vol. 37, pp. 3853-3862, 2010.
[12] R. B. Dubey , M.Hanmandlu and S. K. Gupta, ”An Advanced
Technique for Volumetric Analysis” International Journal of
Computer Applications, vol. 1, pp. 91-98 , 2010.
[13] M. Ben-Bassat, ”Pattern recognition and reduction of dimen-
sionality,” Handbook of Statistics II, vol. 1, North-Holland,
Amsterdam, 1982.
[14] H. Liu and H. Motoda,” Feature Extraction, Construction and
Selection: A Data Mining Perspective”, Kluwer Academic,
second printing, Boston, 2001.
[15] L. Yu and H. Liu,” Feature selection for high-dimensional
data: a fast correlation-based filter solution”, In Proc. of the
Twentieth International Conference on Machine Learning,
pp. 856-863, 2003.
[16] Y. Kim, W. Street and F. Menczer ”Feature selection for
unsupervised learning via evolutionary search”, In Proc.
of the Sixth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pp. 365-369, 2000.
[17] H. Almuallim and T.G. Dietterich, ”Learning Boolean Con-
cepts in the Presence of Many Irrelevant Features”, Artificial
Intelligence, vol. 69, pp. 279-305, 1994.
[18] H. F. Eid, M. Salama, A. Hassanien and T. Kim, ”Bi-Layer
Behavioral-based Feature Selection Approach for Network
Intrusion Classification”, In Proc. The International Confer-
ence on Security Technology (SecTech), Korea, December 8
10, pp. 195-203, 2011.
[19] C. Yang, L. Chuang and C. Hong Yang, ”IG-GA: A Hybrid
Filter/Wrapper Method for Feature Selection of Microarray
Data”, Journal of Medical and Biological Engineering, vol.
30, pp. 23-28, 2009.
[20] L. Chuang, C. Ke and C. Yang, ”A Hybrid Both Filter and
Wrapper Feature Selection Method for Microarray Classifi-
cation”, In Proc. of the International Multi Conference of
Engineers and Computer Scientists (IMECS), Hong Kong,
March, volI, pp. 19-21, 2008.
[21] J. H.Holland, ” Adaptation in Natural and Artificial Sys-
tems”. University of Michigan Press, Ann Arbor, MI., 1975.
[22] B.Jiang, X. Ding, L. Ma, Y. He, T. Wang and W. Xie, ”A
Hybrid Feature Selection Algorithm:Combination of Sym-
metrical Uncertainty and Genetic Algorithms”, In Proc. The
Second International Symposium on Optimization and Sys-
tems Biology (OSB’08), China, pp. 152-157, 2008.
[23] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory", In Proc. of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 1995.
[24] G. Venter and J. Sobieszczanski-Sobieski, ”Particle Swarm
Optimization,” AIAA Journal, vol. 41, pp. 1583-1589, 2003.
[25] Y. Liu, G. Wang, H. Chen, and H. Dong, ”An improved
particle swarm optimization for feature selection”, Journal
of Bionic Engineering, vol.8, pp.191-200, 2011.
[26] J. R. Quinlan, ”C4.5 Programs for Machine Learning”,
Morgan Kaufmann San Mateo Ca, 1993.
[27] Y. Kuo-Ching, L. Shih-Wei, L. Chou-Yuan and L. Zne-Jung,
”An intelligent algorithm with feature selection and decision
rules applied to anomaly intrusion detection”, Applied Soft
Computing, In press, 2012.
[28] D. Farid and M. Rahman, "Anomaly Network Intrusion Detection Based on Improved Self Adaptive Bayesian Algorithm", Journal of Computers, vol. 5, pp. 23-31.
[29] MIT Lincoln Laboratory, DARPA Intrusion Detection Evaluation, MA, USA, July 2010.
[30] KDD'99 dataset, Irvine, CA, USA, July 2010.
[31] K. Leung and C. Leckie, ”Unsupervised anomaly detection
in network intrusion detection using clusters”, In Proc. of
the Twenty-eighth Australasian conference on Computer
Science, vol. 38, pp. 333- 342, 2005.
[32] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, "A Detailed Analysis of the KDD CUP 99 Data Set", In Proc. of the 2009 IEEE Symposium on Computational Intelligence in Security and Defense Applications (CISDA), 2009.
[33] R. O. Duda, P. E. Hart, and D. G. Stork, ”Pattern Classifica-
tion”, JohnWiley & Sons, USA, 2nd edition, 2001.
Quality of Service Support on High Level Petri-Net
Based Model for Dynamic Configuration of Web
Service Composition
Sabri Mtibaa
LI3 Laboratory / University of Manouba
National School of Computer Sciences
2010 Manouba, Tunisia

Moncef Tagina
LI3 Laboratory / University of Manouba
National School of Computer Sciences
2010 Manouba, Tunisia
Abstract— Web services are widely used thanks to their universal interoperability between software assets, platform independence and loose coupling. Web services composition is one of the most challenging topics in the service computing area. In this paper, an approach based on a High Level Petri-Net model as a dynamic configuration schema of web services composition is proposed to achieve self-adaptation to the run-time environment and self-management of composite web services. For composite-service based applications, quality of service properties should be considered in addition to functional requirements. This paper presents and proves some quality of service formulas in the context of web service composition. Based on this model and the quality of service properties, a suitable configuration with optimal quality of service can be selected dynamically to reach the goal of automatic service composition. The correctness of the approach is demonstrated by simulation results and corresponding analysis.
Keywords— Web services composition, High Level Petri-Net,
dynamic configuration, quality of service
With the evolution of network and service infrastructure towards Service-Oriented Architectures (SOA), an important requirement for new application components is to present a high-level interface that allows developers to use and re-use such components, known as Web Services (WS), in new applications. WS, being easily accessible from any point of the Internet, are suitable for rapidly building on-demand applications.
From the point of view of their internal complexity, WS can be divided into two categories: elementary WS and composite WS.
Elementary WS offer a basic service, like simple libraries,
and involve a low level of data transformation; translation services, for example, are elementary WS. On the contrary, composite WS have more complex patterns and more powerful functions. Such a composite service, resulting from the composition of several logically assembled processes, can be called an orchestrated service. Examples are converged services that combine reusable components based on network capabilities such as calling services, messaging services and Internet services.
Although Web services technologies and BPEL (Business
Process Execution Language)-based orchestration engines are
powerful tools for developing blended applications, they are
just examples of the tools available in this space. These tools
are well suited for developing capabilities that are abstracted
at a fairly high level and have less stringent latency and real-
time requirements. For example, a service sending a time-
sensitive message needs to consider all aspects of delays and
latencies involved in the overall service flow. If a location-
based advert reaches the person after the fact, the advert loses
its meaning.
In the Web services composition context, there often exist a number of alternative component systems which have the same functionality but differ in QoS (quality of service). In recent academic research [1, 2], the problem of web service selection has been deeply studied through performance evaluation and estimation methodologies. For this purpose, some methods and tools to capture and analyze the performance of WS have been developed [3]. In general, the proposed approaches for evaluating the QoS capabilities of WS differ considerably from one another. Each one focuses on a different set of QoS metrics and can be applied at run-time or at system design time.
In actual applications, functional dependency relationships exist between services. A service selection very probably influences the next service selection and, consequently, the Web services execution flow. The web services composition must therefore have the ability of dynamic reconfiguration in order to adapt to changes in the run-time environment [4].
Reconfigurability is an important feature of self-adaptive systems, especially high-assurance systems. San-Yih Hwang et al. formulated the dynamic WS selection procedure in a dynamic, failure-prone environment. They proposed using an FSM (Finite State Machine) to invoke the operations of a service in order [4]; in their work, the Web services are selected dynamically at run time. In [5], a composite configuration schema is modeled by a Petri net. These works
have done a good job of modeling service dependency relationships. In self-configuring approaches, like those presented in [6] and [7], service selection is performed by searching for an optimal configuration of components based upon the initial constraints.
However, these solutions are complex to integrate with available service orchestration engines, mainly because the orchestration process of the composite service is not considered in their models. The expected benefits of such integration are better productivity and the ability to be much more reactive to requests. Without considering the orchestration process, it is difficult to properly evaluate whether the user's QoS requirements are satisfied or not.
Many formal theories, such as finite state machines, Pi-calculus and Petri nets, have been used for the description of web services composition. Their main concern is model mapping: modeling through translating a composition plan or language into a model [8, 9]. In [10], an approach oriented towards QoS assurance for the evaluation of different design alternatives using discrete-event modeling is presented. However, it cannot satisfy the requirements of adaptive and automatic system management. In order to construct a self-adaptive composite service, different techniques, such as simulated annealing and stochastic Petri nets [11], have been proposed to accelerate global service selection.
In this paper, we focus on modeling the composite service configuration schema and the service orchestration process. The dependency relationships and the possible orchestration processes are reflected by the high level Petri net presented in this study. The supported QoS attributes and the properties calculated under different configurations help to select the best, optimal configuration dynamically. This final configuration and orchestration process can be used directly by the current service engine.
The remainder of this paper is organized as follows. In
Section II, we present a web services overview. Section III
introduces the dynamic configuration model using hierarchical
Petri net. In Section IV, we present our QoS calculation
method applied on the Petri-net based model. A selection of
dynamic configuration is described in Section V. Section VI
presents a case study to illustrate our work. Finally, we
conclude in section VII.
SOA is an architecture in which functions are defined as WSs. According to [10], WSs are self-contained, modular applications that can be described, published, located, and invoked over a network, generally the World Wide Web.
The SOA is described through three different roles: service
provider, service requester and service registry.
The key idea of SOA is the following: a service provider
publishes services in a service registry [12]. The service
requester searches for a service in the registry. He finds one or
more services by browsing or querying the registry. The service requester then uses the service description to bind to the service. These ideas are shown in Fig. 1.
Figure 1. Service-Oriented Architecture
The composition mechanism leads us from the elementary components to the final new service.
Orchestration is the term used to describe the creation of a "business process" (or a workflow) using Web services. A business process is an aggregation of services whose operations, i.e. the processes, are logically linked together in order to reach a given objective [1].
Aggregating services to build an added-value service can be done in many ways, depending on the chosen environment. For Web services, the orchestration is usually expressed with a specific language like BPEL that describes the interactions between the services.
A business process is itself deployed as a service, so it can be used by other processes. A business process language describes the behaviour of business processes based on Web services, i.e.:
Control flow (sequences, loops, conditions,
parallelism …)
Variables, exceptions, timeout management.

SOA is known to bring many advantages in software development, management and deployment. A product is a set of SOA services focused on solving a business problem (see Fig. 2). The focus is, for example, on the following points [13]:
• Modularity/re-usability: Individual components of a
solution can be managed independently of each other,
allowing components to follow different release
cycles and frequencies of release, with minimal
regression across them.
• Maintainability/evolution: Components have “hard
edge” APIs, so can be substituted as required to
address issues such as performance, scalability and
stability on a more granular, case-by-case basis.
• Loosely coupled: Reduces the risk of “domino
effect” total service outages, as components and
processes are less tightly bound to one another.
• Flexibility: Enables more flexible distribution and
placement of components across multiple system
resources in n-tier service model, whilst not imposing
a static nor a rigid deployment model.
• Scalability: Service capacity requirements can be
managed in an easy, sustainable, repeatable and
consistent manner.
Figure 2. Products as a composition of SOA services
In our research, we chose to adopt Petri nets due to their combination of (1) rich computational semantics, (2) the ability to formally model systems, especially those with concurrent, asynchronous, distributed, parallel, nondeterministic and stochastic properties, and (3) the availability of graphical simulation tools.
Petri nets also have a natural representation of changes and concurrency, which can be used to establish a distributed and executable operational semantics of Web services [15]. In addition, Petri nets can address offline analysis tasks such as Web services static composition, as well as online execution tasks such as deadlock determination and resource satisfaction. Furthermore, Petri nets possess a natural way of addressing resource sharing and transportation, which is imperative for the Web services paradigm.
A. Model Definition
Definition 1: (HTPPN) The algebraic structure of a Hierarchical Timed Predicate Petri Net, named HTPPN, is defined if the following conditions hold:
• P ∩ T = ∅, P ∪ T ≠ ∅; P is a set of places.
• T = T_D ∪ T_C ∪ T_R, with T_D ∩ T_C = ∅, T_C ∩ T_R = ∅ and T_D ∩ T_R = ∅, where T_D is a set of dummy transitions, T_C is a set of concrete transitions and T_R is a set of refinable transitions. T is a finite set of transitions which represents the activities of the web service. For t ∈ T_D, t can be associated with a choice probability. For t ∈ T_R, t is itself a HTPPN with a unique input transition and a unique output transition.
• F ⊆ (P × T) ∪ (T × P) is a finite set of arcs; F is called the web services action flow.
• Name represents the name of the web services composition and indicates that the operation of the web service is not an actual execution but a structural simulation.
• C is a capability function, N+ being the set of positive integers.
• W is a weight function.
• Q is a QoS function, R+ being a set of positive real numbers.
A marking in a HTPPN is a function M that maps every place into a natural number. M0 is called the initial marking.
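A minimal Python rendering of this structure is sketched below. The partition of transitions into dummy, concrete and refinable sets follows the definition above, while the class and field names are assumptions introduced only for illustration.

from dataclasses import dataclass, field
from typing import Dict, Set, Tuple, Optional

@dataclass
class Transition:
    name: str
    kind: str                              # "dummy", "concrete" or "refinable"
    choice_prob: Optional[float] = None    # only meaningful for dummy transitions
    qos: Optional[float] = None            # Q(t), only for concrete transitions
    sub_model: Optional["HTPPN"] = None    # a refinable transition is refined by a nested HTPPN

@dataclass
class HTPPN:
    places: Set[str]
    transitions: Dict[str, Transition]
    arcs: Set[Tuple[str, str]]                                          # flow relation F
    weight: Dict[Tuple[str, str], int] = field(default_factory=dict)   # W, arc weights (all 1 here)
    marking: Dict[str, int] = field(default_factory=dict)              # M0, initial marking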
The execution model is defined with the following basic temporal types: time points, durations and interval constraints. They are defined as follows:
• TEB(j) is the sum of the message delay time and the waiting time of the activity j of a process.
• [TEB_min(j), TEB_max(j)] denotes the time period during which the activity j is enabled after the immediately preceding activity of the activity j.
• [TE_min(j), TE_max(j)] denotes the time period during which the activity j can be executed after it is enabled.
TC = TC(p) ∪ TC(t), where TC(p) is the set of all place time pairs, TC(p) = {(TC_min(p), TC_max(p)) | TC_min(p) < TC_max(p), p ∈ P}, and TC(t) is the set of all transition time pairs, TC(t) = {(TC_min(t), TC_max(t)) | TC_min(t) < TC_max(t), t ∈ T}. TD is a set of time durations.
In addition, the model satisfies the following conditions:
• The model has a unique input place P_in and a unique output place P_out.
• The model is also an ordinary Petri net, i.e., all of its arc weights are 1.
• |T_R| ≥ 1, i.e., the model contains at least one refinable transition.
• Each concrete transition is associated with a component service.
• Each refinable transition model is constructed by a deferred choice pattern.
• ∀p ∈ P \ {P_in}, M0(p) = 0.
Definition 2: (Transition firing / transition firing duration) For t ∈ T, if TC(t) = (TC_min(t), TC_max(t)), then a transition t can fire at the earliest after TC_min(t) time units once it is enabled at marking M; during this period, the transition t must fire at the latest after TC_max(t) time units if no other enabled transition changes the marking and disables the transition t.
Definition 3: (Transition enabled interval / token arrival) The interval (TC_min(P), TC_max(P)) presents the time period during
which the succeeding transitions of the place P are enabled after a token arrives at the place P.
Definition 4: (Time consistency of composite WSs) For a service, its execution time span is mapped to the earliest and latest enable times of the transition, [TEB_min(j), TEB_max(j)], while the execution time is denoted by the firing duration of the transition, TE(j). A transition is then considered schedulable if it is a candidate to fire and can finish its firing successfully, i.e., TEB_max(j) − TEB_min(j) > 0.
A marking M_n is said to be reachable in TPPN modeling if there is a firing sequence (M_0, M_1, ..., M_i, ..., M_n) that transforms M_0 into M_n. In a TPPN, if every transition is schedulable, the service can be successfully completed within its time constraints.
B. Model Description

The HTPPN is a model representing the service selection as well as the orchestration process. Places are designed for the system states and the execution conditions.
A web service behavior is basically a partially ordered set of activities. For instance, Figure 3 shows a typical Petri net model that represents the relationship between places (i.e., the states of the service) and transitions (i.e., the activities).

The logical control activities in the composition process (pre-conditions and post-conditions) are represented by dummy transitions. No component service is associated with a dummy transition. The activities of component services are represented by concrete transitions. Only one fixed component service is associated with a concrete transition.
There are several configurations in a HTPPN. A feasible configuration can be obtained after one selection branch is retained and the others are eliminated in every refinable transition. Since different configurations may include different numbers of component services, the corresponding orchestration processes can also be different. In this work, the HTPPN reflects not only the possible service selections but also the corresponding orchestration processes.
Service selection means the process of associating every activity of the composite service with a component service. According to whether a service selection is related to another, service selections can be classified into two families: free selection and restricted selection.
Figures 4 and 5 depict the two kinds of selections, modeled using a split-join concurrency pattern based Petri net.
In Figure 4, every selection branch only includes one transition; therefore, the transitions can be selected freely. On the contrary, every selection branch in Figure 5 includes a set of transitions. The service dependency relationships are reflected by the flow pattern of the branch. The transitions in a branch must be selected as a whole: for example, the set {T_i, T_j} or {T_k, T_l} is selected.

Figure 4. A modeling example of dynamic configuration using
HTPPN (without dependent relationship)

Figure 5. Example of HTPPN model representing the service
selection with dependent relationship
The activities of service selection are encapsulated in the refinable transitions. A refinable transition indicates that an abstract sub-function of the composite service can be accomplished by a number of optional component services or service modules. Each branch of the split-join concurrency pattern in a refinable transition represents a service selection (see Figure 6). The QoS attribute is associated with the concrete transitions.

Figure 6. A split-join concurrency pattern
In this section, a QoS calculation method for composite services is presented first. Several QoS attributes, such as response time, throughput, cost, reliability, availability, security and accessibility, have been proposed to evaluate a Web service. Each component service may have several QoS attributes. QoS attributes such as throughput, reliability, availability, security and accessibility are better when their values are larger, but for attributes like response time and cost, the smaller the value, the better the QoS.
In this study, the model is simplified by considering a smaller value as better QoS. For larger-is-better attributes, the values are automatically negated. If a component service is unavailable or disabled, its QoS value is set to +∞.
Because a selection branch in a refinable transition following the split-join pattern must be selected as a whole, the QoS of the whole branch needs to be evaluated, and different branch patterns need to be considered. A HTPPN is usually composed of four kinds of basic Petri net patterns. The parameter α is a choice probability associated with a dummy transition. For different QoS attributes, the QoS calculation formulas of the four kinds of patterns are different.
We propose some feasible estimation formulas to calculate the QoS of the basic model patterns. R(t), C(t), A(t) and T(t) are respectively the response time, configuration cost, availability (or reliability) and throughput of a component service. The configuration cost indicates the charge paid for third-party component services; it can simply be the sum of the costs of all component services. The formulas are easy to understand and verify, except the QoS formulas of the loop pattern presented in [9].

A. QoS Calculation Formulas

We consider some useful patterns defined by the Workflow Management Coalition (WFMC):
Sequential pattern:
The sequence construct allows the definition of a collection of activities to be performed sequentially in lexical order (see Figure 7). A sequence activity contains one or more activities that are performed sequentially.

Figure 7. Petri-Net based sequential pattern
The processes i and j execute in sequential order; the QoS calculation formulas are:
R(i→j) = TEB(i) + TEB(j) + TEC(P)
C(i→j) = C(i) + C(j)
A(i→j) = A(i) · A(j)
T(i→j) = Min(T(i), T(j))
Parallel pattern :
The flow construct allows two or more activities to be executed in parallel, giving rise to multiple threads of control. Transition T (see Figure 8) is a point where a single thread of control splits into two or more threads that are executed in parallel, allowing multiple activities to be executed concurrently.
Figure 8. Petri-Net based parallel pattern
The processes i and j execute concurrently; the QoS calculation formulas are:
R(i||j) = Max(TEB(i), TEB(j))
C(i||j) = C(i) + C(j)
A(i||j) = A(i) · A(j)
T(i||j) = Min(T(i), T(j))
Conditional pattern:
The flow construct is used to select exactly one branch of execution from a set of choices. It supports conditional routing between activities. Place P (see Figure 9) is a point where a single thread of control makes a decision upon which branch to take when encountered with multiple alternative activities.
Figure 9. Petri-Net based conditional pattern
Suppose that the processes i and j execute alternatively; the QoS calculation formulas are:
R(i∨j) = Max(TEB(i) + TEC(P), TEB(j) + TEC(P))
C(i∨j) = C(i) + C(j)
A(i∨j) = α1 · A(i) + α2 · A(j)
T(i∨j) = α1 · T(i) + α2 · T(j)
Loop pattern :
The loop construct (see Figure 10) is used to indicate that an activity is to be repeated until a certain success criterion has been met. A while activity supports repeated performance of an activity in a structured loop, that is, a loop with one entry and one exit point.
Figure 10. Petri-Net based Loop pattern

If we suppose that the process i executes k times, the QoS calculation formulas are:
R(k·i) = (k · TEB_min(j), k · TEB_max(j))
C(k·i) = C(i) + C(j)
A(k·i) = ((1 − α) · A(i)) / (1 − α · A(i) · A(j))
T(k·i) = (1 − α) · Min(T(i), T(j))
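To illustrate how these aggregation rules can be applied, the Python sketch below composes the (R, C, A, T) tuples of two component services according to the sequential, parallel and conditional patterns reconstructed above (the loop rule is omitted, as its exact form depends on [9]). It is an illustrative reading of the formulas, not the authors' code, and it uses plain response times in place of the TEB/TEC time terms.

def seq(q1, q2):
    # Sequential composition of (R, C, A, T) tuples.
    r1, c1, a1, t1 = q1
    r2, c2, a2, t2 = q2
    return (r1 + r2, c1 + c2, a1 * a2, min(t1, t2))

def par(q1, q2):
    # Parallel (AND split/join) composition.
    r1, c1, a1, t1 = q1
    r2, c2, a2, t2 = q2
    return (max(r1, r2), c1 + c2, a1 * a2, min(t1, t2))

def cond(q1, q2, alpha1, alpha2):
    # Conditional (exclusive choice) composition with branch probabilities.
    r1, c1, a1, t1 = q1
    r2, c2, a2, t2 = q2
    return (max(r1, r2), c1 + c2,
            alpha1 * a1 + alpha2 * a2, alpha1 * t1 + alpha2 * t2)

# Example: two services described by (response time, cost, availability, throughput)
print(seq((2.0, 5.0, 0.99, 100), (3.0, 4.0, 0.95, 80)))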
B. Dynamic Configuration Model

A service orchestration model with static configuration can
be transformed to one with dynamic configuration. For
example, in Figure 11, if there is a number of component
services with identical function, either of which can
accomplishes the function of T
or T
, then T
is transformed
to a refinable transition (modeled as a selection pattern shown
in Figure 4).
Some important modeling trips need to be highlighted and
• In Figure 11, one selection branch depicted by refinable
transition T
can include different flow pattern and different
quantities of transitions in contrast with another, as long as
each branch can accomplish the same function. This enables
the model to reflect the various service orchestration processes.
•The refinable transitions can also exist in hierarchical
manner, i.e., may exist other refinable transitions in a refinable
transition. A complex composite service configuration schema
can be compactly reflected by the N-hierarchical model.
•The service configuration schema and optional
orchestration processes are synthetically reflected in a HTPPN.
But, only one selection branch in every refinable transition
should be kept in the final model.

Figure 11. Example of HTPPN model representing a dynamic configuration
An optimal configuration selection algorithm is designed to select the configuration with the best QoS from the service configuration schema, as shown in Figure 12.
The algorithm, presented in Figure 12, takes a HTPPN as input and provides as output an optimal-QoS configuration. It is an iterative algorithm that considers the different levels of the hierarchical model.
In each iteration, the current selection branch is picked and the calculateQosForEachBranch method is called. This method calculates the QoS attributes using the formulas presented above for the Petri net patterns. Then the selectBranchWithMinimalQoS method is executed in order to select the best configuration, i.e., the one having the smallest attribute values.
The EntryModel is the main model of the HTPPN and is the first level in the hierarchy. Each branch represents a selection alternative that will be considered in the orchestration.
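As a rough illustration of this selection procedure, the recursive Python sketch below picks, at every refinable transition, the branch with the minimal aggregated QoS. The dictionary-based model representation and the single scalar QoS per transition are simplifying assumptions; the sketch does not reproduce the exact algorithm of Figure 12.

def branch_qos(branch):
    # Aggregate the QoS of one selection branch: sum of its transitions' QoS
    # values, resolving nested refinable transitions recursively.
    total = 0.0
    for t in branch["transitions"]:
        if "branches" in t:                       # nested refinable transition
            _, q = select_optimal(t)
            total += q
        else:
            total += t["qos"]
    return total

def select_optimal(refinable):
    # Return (best branch, its QoS) for a refinable transition.
    scored = [(b, branch_qos(b)) for b in refinable["branches"]]
    return min(scored, key=lambda pair: pair[1])  # smaller value means better QoS

# Toy entry model: one refinable transition with two alternative branches
entry = {"branches": [
    {"transitions": [{"qos": 3.0}, {"qos": 1.5}]},
    {"transitions": [{"qos": 2.0}]},
]}
best, qos = select_optimal(entry)
print(qos)        # 2.0 -> the second branch is selected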
Figure 12. Optimal selection algorithm
A. Description

The Healthcare Service Platform (HSP) presented in [16] focuses on the delivery of healthcare services. It is an end-to-end reference architecture that focuses on meeting the needs of citizens, patients and professionals. Its architectural diagram is given in Figure 14.
We distinguish three main components, i.e. body sensor
networks (BSN), IaaS cloud, healthcare delivery environment.
BSN: according to circumstances and personalized
needs, appropriate health information collection
terminals (i.e. sensors) are configured for different
individuals. BSN is used to provide long term and
continuous monitoring of patients under their natural
physiological states. It performs the multi-mode
acquisition, integration and real-time transmission of
personal health information anywhere.
IaaS cloud: this component achieves the rapid storage, management, retrieval, and analysis of massive health data. It mainly includes an Electronic Medical Record (EMR) repository. It also considers personal health data acquired from the BSN.
Healthcare delivery environment: it includes a personal
health information management system. It replaces
expensive in-patient acute care with preventative,
chronic care, offers disease management and remote
patient monitoring and ensures health
education/wellness programs.

Figure 14. HSP architecture

The HSP adopts the SOA design philosophy and Web service technology for its design and implementation. The majority of its functional modules are developed and packaged in the form of services. Here, we overview some of them as follows [17].
PhyInfoService: this service can acquire some general
physiological signals such as body temperature, blood
pressure, and saturation of blood oxygen,
electrocardiogram, and some special physiological
signals according to different sensor deployment for
different users. User’s ID number is required.
EnvInfoService: for a unique ID number, this service
can acquire temperature, humidity, air pressure and
other environmental information for this user.
SubjFeelAcqService: it can acquire the user subjective
feelings, food intake, etc., and the information is often
provided by the user from the terminal.
TempInfoService: it can return the external temperature
in the patient’s environment.
HealthGuideAssService: this service can assess the
knowledge of the patient’s health risk based on specific
EMRService: this service can output the user’s medical
history information.
GeoInfoService: it can return the user’s location.
EmerAlarmService: it can raise an alarm to the user in
case of illness.
HealthGuideService: it can provide the patient with
preventive measures especially items that need
RealTimeWarmService: it can warn the patient about the signs of certain diseases.
We present the HTPPN model of a healthcare scenario. We highlight that a transition bearing the symbol H represents a refinable transition.

Figure 15. An example of HTPPN modeling a Healthcare scenario
In our model, HTPPN–HSP mapping is as described in Table
I. HTPPN represents the EntryModel (see Figure 15).
Table I. Semantic Map
T3 ThirdService
T4 HealthCareService
T6 MedicalAnalysisService
T8 AssesmentService

Figure 16. The model of refinable transition T3
Table II presents the T3–HSP mapping associated with the refinable transition T3 (see Figure 16).
Table II. Semantic Map
T31 FinancialService
T32 InsuranceService

Figure 17. The model of refinable transition T4
Table III depicts the T4–HSP mapping related to the refinable transition T4 (see Figure 17).
Table III. Semantic Map
T41 EnvInfoService
T42 PhyInfoService
T311 GeoInfoService
T432 TempInfoService
T441 SubjFeelAcqService
T442 EMRService

Figure 18. The model of refinable transition T6
Table IV illustrates the T6–HSP mapping, representing the model associated with the refinable transition T6 (see Figure 18).
Table IV. Semantic Map
T602 HealthRiskAssService
T603 HealthGuideService
T604 RealTimeWarmService
T605 EmerAlarmService
T606 HealthGuideAssService

The QoS calculation parameters are calculated and stored in order to be used by the optimal configuration selection algorithm. They can also be used to estimate or predict the QoS of the composite service.
B. Experimentation

To demonstrate the efficiency of our model, we developed a prototype of an Eclipse plug-in called PetriNetWorkbench. It aims to help the designer build Petri net models for simulation purposes. Figure 13 shows an overview of the four steps of the web services composition analysis phase implemented in this tool:
Domain analysis: it concerns the concepts identified in the textual description. These key concepts form a unified vocabulary that will be reusable for the description of user requirements.
Specification: the unified vocabulary resulting from the earlier step is used to specify coherent and synthetic rules of the user requirements.
Algorithms generation: automatic generation of detection algorithms from the specification to check the validity of the system model.
Detection: this step feeds the specifications to the developed tool and returns the conformity to the user-requirement specification, especially regarding the QoS attributes.

Figure 13. Analysis steps

Figure 20 depicts the Petri-net based model defined by the designer (only the refinable transition T6 is shown).

Figure 20. HTPPN model associated to transition T6
The result of the model analysis is returned after a simulation based on two steps: checking that the provided requirements card respects a pre-defined grammar, and then comparing the simulation result with the expected QoS requirement parameters defined by the user.

In this paper, we presented a novel approach using a high level Petri net with a rapid QoS calculation strategy. By taking advantage of the global dynamic configuration, the composite service can adapt to dynamic QoS changes of the component services and to a failure-prone runtime environment. The QoS requirements of users are thus satisfied to the greatest possible extent.
Further research is nevertheless needed. Firstly, the consideration of a configuration balanced across different user QoS requirements is important in a real deployment context of web-services based applications. Secondly, we will extend the model to support dynamic QoS properties. Finally, we will further the development of PetriNetWorkbench to support complex scenarios.


[1] S. Mtibaa, and M. Tagina, “An Automated Petri-Net Based Approach
for Change Management in Distributed Telemedicine Environment,”
Journal of Telecommunications, Vol. 15, No. 1, 2012, pp. 1-9.
[2] W.M.P. van der Aalst, A.H.M. ter Hofstede, B. Kiepuszewski and A.P.
Barros “Workflow Patterns", Distributed and Parallel Databases, Vol.
14, No. 1, 2003, pp. 5-51.
[3] M. Marzolla and R. Mirandola “QoS Analysis for Web Service
Applications: a Survey of Performance-oriented Approaches from an
Architectural Viewpoint", Technical Report UBLCS-2010-05, 2010.
[4] S. Silas, K. Ezra and E. B. Rajsingh, "A novel fault tolerant service
selection framework for pervasive computing", Silas et al. Human-
centric Computing and Information Science, Vol. 2, No. 1, 2012, pp. 1
- 14.
[5] S.-Y. Hwang, E.-P. Lim, C.-H. Lee and C.-H. Chen, "Dynamic Web Service Selection for Reliable Web Service Composition", IEEE Transactions on Services Computing, Vol. 1, No. 2, 2008, pp. 104-
[6] L. Ge and B. Zhang, "A Modeling Approach on Self-Adaptive
Composite Services ", in International Conference on Multimedia
Information Networking and Security, 2010, pp. 240 - 244.
[7] P. Châtel, J. Malenfant and I. Truck, "QoS-based Late-Binding of
Service Invocations in Adaptive Business Processes", in Proceedings
of IEEE International Conference on Web Services, 2010, pp. 227-234.
[8] B. Li, Y. Xu, J. Wu and J. Zhu, “A Petri-net and QoS Based Model for
Automatic Web Service Composition", Journal of Software, Vol. 7,
No. 1, 2012, pp. 149-155.
[9] S. Mtibaa and M. Tagina, “A Petri-Net Model based Timing
Constraints Specification for E-Learning System”, in International
Conference on Education and E-Learning Innovations, 2012, pp. 73-
[10] Y. Jamoussi, M. Driss, J. M. Jézéquel and H. Hajjami Ben Ghézala,
“QoS Assurance for Service-Based Applications Using Discrete-Event
Simulation ", International Journal of Computer Science Issues, Vol.
7, No. 6, 2010, pp. 1-11.
[11] Wang, K. and N. Tian, "Performance Modelling of Composite Web
Services ", in Proceedings of Pacific-Asia Conference on Circuits,
Communications and System, 2009, pp. 563 - 566.
[12] R. Calinescu, L. Grunske, M. Kwiatkowska, R. Mirandola and G.
Tamburrelli "Dynamic QoS Management and Optimization in
Service-Based Systems", IEEE Transactions on Software Engineering,
Vol. 37, No. 3, 2011, pp. 387 - 409.
[13] F. H. Khan, M.Y. Javed, S. Bashi, A. Khan and M. S. H. Khiyal, “QoS
Based Dynamic Web Services Composition & Execution",
International Journal of Computer Science and Information Security,
Vol. 7, No. 2, 2010, pp. 147-152.
[14] P. Xiong, Y. Fan and M. Zhou, "A Petri Net Approach to Analysis and
Composition of Web Services", IEEE Transactions on Systems, Man,
and Cybernetics, Part A: Systems and Humans, Vol. 40, No. 2, 2010,
pp 376 - 387.
[15] D. Petrova-Antonova and A. Dimov, “Towards a Taxonomy of Web
Service Composition Approaches", W3C Workshop on Frameworks
for Semantics in Web Services, Vol. 12, No. 4, 2011, pp. 377-384.
[16] S. Mtibaa, and M. Tagina, “A Petri Nets-based Conceptual
Framework for Web Service Composition in a Healthcare Service
Platform,” Journal of Telecommunications, Vol. 2, No. 4, 2012, pp.
[17] S. Mtibaa, and M. Tagina, “Managing Changes in Citizen-Centric
Healthcare Service Platform using High Level Petri Net," International
Journal of Advanced Computer Science and Applications, Vol. 3, No.
8, 2012, pp. 73-81.


Sabri Mtibaa is currently a Ph.D. student at the National School of Computer Sciences of Tunis, Tunisia (ENSI). He received the master degree from the Higher School of Communication of Tunis, University of Carthage, Tunisia (Sup'Com) in 2008. His current research interests include web service composition using Petri nets as well as system verification and QoS awareness.

Moncef Tagina is a professor of Computer Science at the National
School for Computer Sciences of Tunis, Tunisia (ENSI). He received the
Ph.D. in Industrial Computer Science from Central School of Lille,
France, in 1995. He heads research activities at LI3 Laboratory in
Tunisia (Laboratoire d'Ingénierie Informatique Intelligente) on
Metaheuristics, Diagnostic, Production, Scheduling and Robotics.
Contextual Ontology for Delivering Learning
Material in an Adaptive E-learning System

Kalla. Madhu Sudhana
Research Scholar, Dept of Computer Science
St. Peter’s University
Chennai, India
Dr V. Cyril Raj
Head, Dept of Computer Science
Dr M.G.R University
Chennai, India

Abstract— The rapid growth of Internet technology and the explosion of learning material in the educational domain are leading to the next generation of E-learning applications that exploit user contextual information to provide a richer experience. One of the activities to perform during the development of these context-aware E-learning applications is to define a model to represent and manage context information. In this work, a model for a context-aware and adaptive learning system is proposed, and a context ontology is introduced to model context-related knowledge that allows the system to deliver learning material adapted to the learner context.

Keywords— Context-aware e-learning; Adaptive delivery of learning material; Ontology-based context model


The explosion of learning material in the educational domain has led to the development of E-learning applications, services, agents and recommender systems that aim to improve the quality of E-learning. Such systems are used to provide facilities during the learning process and to help learners learn more accurately. This forces any E-learning application developed under the ambient intelligence paradigm to be aware of contextual information and to be able to automatically adapt to the learner context.
The development of context-aware E-learning applications
should be supported by adequate context modeling and
reasoning techniques [1]. Modeling context knowledge is a
crucial task to support the delivery of the right information at
each moment. The context of the learner and learning
environment should be extracted for adaptation, personalization
and anticipation of learning material that is suitable for learner.
Current E-learning solutions are not sufficiently aware of the context of the learner, that is, the individual's characteristics and the organizational context such as work processes and tasks. Traditional E-learning systems provide adaptation based only on user preferences. To improve performance, it is necessary to incorporate learning-environment context information, such as the device or network context, to determine the appropriate presentation method along with the user preferences.
Here we discuss the general notion of context as well as how it can be specified and modeled in the E-learning domain. The architecture of the context-aware and adaptive learning system is discussed along with the context ontology used to model context-related knowledge.
This article is organized as follows. In the second and third sections, we study the background concepts and the works related to this paper. In section four, the need for the proposed system is presented. In sections five and six, we describe the architecture of the proposed adaptive learning system and the ontology-based context model for the adaptive delivery of learning material.
A. Context
Context is a multifaceted concept that has been studied in
multiple disciplines, each discipline tends to take its own
idiosyncratic view that is somewhat different from other
disciplines and is more specific than the standard generic
dictionary definition of context as “conditions or circumstances
which affect something” [2].
B. Learning Context
The term learning context is used to describe the current
situation of a learner related to a learning activity. In addition
to attributes relying on the physical world model, like time and
location, a variety of attributes described implicitly or
explicitly might be added to the context. When using an
appropriate context-modeling technique, the current situation
might be compared with the requirements of any specific
learning activity.
C. Ontology
According to the Semantic Web initiative led by the W3C (World Wide Web Consortium), an ontology is a way to describe knowledge systematically: a typical and explicit specification of concepts and conceptualization; that is, it defines the concepts
and relations required to describe meaning and information [3].
D. Contextual Ontology
Ontologies are one of the most functional means for
representing contextual data. They map three basic concepts in
a context model (classes, relationships and attributes) to the
existing things in a domain [5]. The formalism of choice in
ontology-based models of context information is typically
OWL-DL [6] or some of its variations, since it is becoming a
de-facto standard in various application domains, and it is
supported by a number of reasoning services. By means of
OWL-DL it is possible to model a particular domain by
defining classes, properties and relations between individuals.
Particularly in mobile and pervasive environments there are
different heterogeneous and distributed entities that must
interact for exchanging users’ context information in order to
provide adaptive services. To this end, various OWL
ontologies have been proposed for representing shared
descriptions of context data. Among the most prominent
proposals are the SOUPA [7] ontology for modelling context in
pervasive environments, and the CONON [8] ontology for
smart home environments.
Schmidt and Winterhalter [9] are using context to retrieve
relevant learning object for a given user. The matching service
computes a similarity measure between the current user context
abstraction and the ontological metadata of each learning object
and then can present a ranked list of relevant learning objects.
It is a kind of active use of context intending to reconfigure
available services (learning objects).
Bomsdorf [10] developed a system prototype by allowing
learning materials to be selected depending on a given situation
– this takes into account learner profiles such as their location,
time available for learning, concentration level and frequency
of disruptions.
Bouzeghoub et al. [11] proposed a situation-aware
framework/mechanism which takes into account time, place,
user knowledge, user activity, user environment and device
capacity for adaptation to user.
Lee et al. [12] developed a Java Learning Object Ontology
for an adaptive learning tool to facilitate different learning
strategies/paths for students, which can be chosen dynamically.
Jane Yau and Mike Joy [13] described the architecture of
Context-aware and Adaptive Learning Schedule (CALS) tool.
This tool is able to automatically determine the contextual
features such as the location and available time. The
appropriate learning materials are selected for the students
according to, firstly, the learner preferences, and secondly the
contextual features.
A contextually aware environment aims to aid in this task by presenting the right information to the user. In order to achieve this, a system must have a thorough understanding of its environment and of the preferences and devices that exist within it; the system must be able to identify where, and under what context, each person is working.
Our approach relies heavily on semantic modeling of the learner's environment. For this purpose, we make use of an ontology to model the contextual knowledge of the learning environment and use it during the context-aware adaptation process. Protégé 4.1 is used to create the ontology modeling the contextual knowledge of the learner's environment.
The proposed context-aware and adaptive delivery system
can be more usefully constructed in a fashion that is tailored
specifically to academic e-learning environment for adaptive
delivery of learning material. This may be achieved through
integration of different contextual situations of academic e-
learning environment.
The context aggregator collects all contextual information
supplied by different context sources and provides an
aggregated knowledge view. The representation and reasoning
of contextual information in knowledge base is performed by
means of Ontology represented in OWL format. The
knowledge acquired from the ontological reasoner enables the
system to suggest appropriate learning material to be delivered
to the learner.
In the proposed system, the basic context-aware and adaptive delivery process consists of three steps, as shown in "Fig. 1".

Figure 1. Basic elements of context-aware and adaptive delivery process
A. Context Acquisition
Before building the user context model, the most important point in context-aware applications is the acquisition of context information. There is no single way of determining a user's context in E-learning. In the proposed system, it mainly depends on three strategies: which learning device is used by the learner, what are the basic
details of the learner, and what are the personal preferences of the learner. Therefore, in the proposed system, context information acquisition includes three approaches that allow plugging in different context sources, as shown in "Fig. 2". These context perspectives are then integrated into a single context abstraction. Context sources can be:

• Learner profile: the first category is the information obtained from the learner's profile, such as location, qualification, organization, etc. The learners are required to fill in these details before they participate in the learning process.
• Context detection service: the information obtained through the device context detection service provides the details about the device being used by the learner.
• User interface: in the E-learning domain, different users may prefer different learning orientations, learning modes, subject areas and so on. Once basic material is provided to the learner, the user interface provides an environment to obtain the personal preferences of the user, based on which the system will deliver the preferred material.

Figure 2. “Context Acquisition- Modeling- Adaptation” Scenario in
Adaptive System
B. Context Modeling
In general, the context data may come from the learner, the learning environment, the educational strategy and so on. The specification of all contextual entities and the relations between these entities is needed to describe the context as a whole. A context model is also a system of concepts (entities) and relations, so an ontology is a possible means for context modeling to specify the representation of contextual knowledge. An ontology is formally defined, which makes it possible for a computer to interpret it, e.g. for reasoning purposes, and rules can then be used to implement context reasoning. In the proposed system, the ontology is formally represented in the OWL format.
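As a small, hypothetical illustration, the Python sketch below uses rdflib to declare the three top-level context classes of the proposed model programmatically; the namespace URI and variable names are invented for the example, and the paper itself builds the ontology in Protégé 4.1 rather than in code.

from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/learner-context#")   # illustrative namespace
g = Graph()
g.bind("ex", EX)

# Top-level class and its three sub-classes, mirroring Fig. 5
g.add((EX["Learner-Context"], RDF.type, OWL.Class))
for sub in ("Personal", "Device", "Preference"):
    g.add((EX[sub], RDF.type, OWL.Class))
    g.add((EX[sub], RDFS.subClassOf, EX["Learner-Context"]))

print(g.serialize(format="xml"))    # emits the OWL/RDF-XML class declarations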
C. Adaptation Mechanism
In E-learning environments, we may provide learning content that is adaptive not only to the learner but also to the learning environment. The learning environment may vary based on the learning device, domain, learner preference, etc., so incorporating contextual knowledge into the adaptive mechanism of E-learning systems will make them more effective.

The context-based adaptive process creates suitable content for learners according to contextual and situational data. Then, the content adaptation process recodes the original content into adapted content according to the adaptation suggestions from the adaptive process. The proposed context-aware adaptive content delivery model is shown in "Fig. 3".

Figure 3. Proposed context ontology based Learning content delivery model
We used an ontology-based context model for context representation. This model adopts OWL as the representation language to enable expressive context description and data interoperability with third-party services and applications; OWL is a W3C recommendation that employs web standards for information representation such as RDF and XML Schema.

Because context ontologies have explicit representations of
semantics, they can be reasoned by the available logic
inference engines. Systems with the ability to reason about
context can detect and resolve inconsistent context knowledge
that often results from imperfect sensing.
Here, we consider three categories of contextual information for the proposed system. They are described below, are particularly important for adaptive E-learning systems, and are the basis on which the proposed system delivers the relevant learning material to the learner.
Our ontology context model describes a context-aware learning environment and is built with OWL. It consists of three top-level classes and twelve sub-classes, and contains fifteen main properties which describe the relations between individuals of the top-level classes and their sub-classes. "Fig. 4" shows that we comply with XML, RDF Schema and OWL as part of the context model and gives the definition of the three top-level classes.
xmlns:owl =""
xmlns:rdf =""
xmlns:xsd ="">
<owl:Ontology rdf:about="">
<rdfs:comment>Learner OWL ontology</rdfs:comment>
<rdfs:label>Learner Context Ontology</rdfs:label>
<owl:Class rdf:ID="Personal">
<rdfs:subClassOf rdf:resource="#Learner-Context"/>
<owl:Class rdf:ID="Device">
<rdfs:subClassOf rdf:resource="#Learner-Context"/>
<owl:Class rdf:ID="Preference">
<rdfs:subClassOf rdf:resource="#Learner-Context"/>
Figure 4. A part of ontology expressions in context model
A. Learner-Context class
This context class is the super class for all the contexts in
Context Aware Learning environment. Any instance of the
context class represents a conceptual context. Different
contexts can be indexed hierarchically based on class
hierarchy, such as Personal, Device and Preference as shown in
“Fig. 5”.

Figure 5. Classes and subclasses relationships in context ontology
OWL defines the vocabulary of the context model. It provides a mechanism to define adaptation-specific properties and the classes of context to which those properties can be applied, using a set of basic modeling primitives (class, subclass, property, domain, range, type). The context model can be specified using OWL encoding; Fig. 6(a) and Fig. 6(b) show that each statement is essentially a relation between an object (a class), an attribute (a property), and a value (a resource or free text). Fig. 6(c) shows an example OWL code fragment for a small part of our proposed ontology.

xmlns:owl =""
xmlns:rdf =""
xmlns:xsd ="">
<owl:Ontology rdf:about="">
<rdfs:comment>Learner OWL ontology</rdfs:comment>
<rdfs:label>Learner Context Ontology</rdfs:label>
<owl:Class rdf:ID="Identity">
<rdfs:subClassOf rdf:resource="#Personal"/>
<owl:Class rdf:ID="Personal">
<rdfs:subClassOf rdf:resource="#Learner-Context"/>
<owl:ObjectProperty rdf:ID="hasIdentity">
<rdfs:domain rdf:resource="#Personal"/>
<rdfs:range rdf:resource="#Identity"/>
<owl:ObjectProperty rdf:ID="hasPersonalInfo">
<rdfs:domain rdf:resource="#Learner-Context"/>
<rdfs:range rdf:resource="#Personal"/>
<owl:DatatypeProperty rdf:about="ID">
<rdfs:domain rdf:resource="#Identity"/>
<owl:DatatypeProperty rdf:about="UserName">
<rdfs:domain rdf:resource="#Identity"/>
<owl:DatatypeProperty rdf:about="Password">
<rdfs:domain rdf:resource="#Identity"/>

Figure 6. (a) Few specifications of model, (b) The equivalent directed
semantic graph, and (c) An example of OWL code.
• Personal: This ontology class contains a wide range of details provided by the learner in the Learner Profile. It was created in order to facilitate the extraction of the user's personal information; the user is requested to register and fill in a few forms with personal information.
1. Identity (e.g.: ID, Name or Registration-Number)
2. Organization (e.g.: Technical Institute, University or Research Organization)
3. Location (e.g.: City, State or Country name)
4. Role (e.g.: Student, Lecturer or Professor)
5. Goal (e.g.: Research, Survey, Quick Reference, Basic Introduction or Seminar)
6. Grade (e.g.: Beginner, Practitioner or Expert)
7. Qualification (e.g.: Bachelor, Master or Researcher)
8. Domain (e.g.: Computer Science, Agriculture etc.)

• Device: To be able to cover the device and software
heterogeneities in a learning environment, we have
included device context along with its sub-classes such
as Hardware, Software and Network-Connectivity. It
models knowledge about the different devices that are
being used by the learner.
1. Hardware (e.g.: Mobile, PC, Laptop or PDA)
2. Software (e.g.: Operating system, browser or audio and
video encoding software)
3. Network-Connectivity (e.g.: Wired or Wireless)

• Preference: In an e-learning environment the category of learning material is an important context, reflecting the learner's needs and interests under the heading of personalization. The learner's preferences are used to select and deliver a suitable type of material based on Subject-Area, Mode-of-Learning (material format) and Learning Orientation. The user is requested to enter this information while interacting with the system for learning material.
1. Subject-Area (e.g.: Data-Structure, Embedded Systems, Neurology or Dental)
2. Mode-of-Learning (e.g.: Video, audio, textual, …)
3. Orientation-of-Learning (e.g.: Case-Study, Example-Oriented, Problem-Oriented or Conceptual)
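As an illustration of how such contextual knowledge could be exploited at run time, the following minimal sketch (not part of the proposed prototype) queries an OWL/RDF export of the learner-context model with rdflib and SPARQL to pick a delivery mode for learners working on mobile hardware. The file name, the namespace URI and the hasDevice, hasHardware and prefersMode property names are illustrative assumptions; the paper itself only names hasPersonalInfo and hasIdentity explicitly.

# Minimal sketch (not the authors' implementation): query the learner-context
# ontology with rdflib/SPARQL to choose a delivery mode for mobile learners.
# File name, namespace and the hasDevice/hasHardware/prefersMode properties
# are illustrative assumptions.
from rdflib import Graph

g = Graph()
g.parse("learner_context.owl", format="xml")   # hypothetical export of the context model

query = """
PREFIX ctx: <http://example.org/learner-context#>
SELECT ?learner ?mode
WHERE {
    ?learner  ctx:hasDevice    ?device .
    ?device   ctx:hasHardware  "Mobile" .
    ?learner  ctx:prefersMode  ?mode .
}
"""

for learner, mode in g.query(query):
    # e.g. deliver audio or short-text material to learners on mobile hardware
    print(f"{learner} -> deliver material in mode: {mode}")

A full system would additionally run an OWL reasoner over the ontology and apply adaptation rules, as outlined above; the query is only meant to show how the context classes relate to content selection.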

We have described our proposed model for a context-aware and adaptive learning system and introduced a context ontology for E-Learning that delivers learning material by adapting to the learner context. We are currently designing the system prototype, which will be implemented and evaluated; to evaluate the system a small number of students will be employed to work with it and to provide us with qualitative results.

We believe that the primary advantage of our ontology-based context model is that it contains a hierarchical content structure and semantic relationships between concepts, so it can provide related and useful semantics-based context information for searching learning material in context-based e-learning.
[1] M Poveda-Villalon, M C Suárez-Figueroa, R García-Castro. (2010) A
Context Ontology for Mobile Environments-
[2] Webster, N., (1980) Webster’s new twentieth century dictionary of the
English language. Springfield, MA: Merriam-Webster, Inc.
[3] T. Berners-Lee, J. Hendler, and O. Lassila, (2001) “The Semantic Web,”
Scientific American, May:17 2001, pp. 28-37.
[4] T. Gruber, (1995) “Toward Principles for the Design of Ontologies Used for Knowledge Sharing,” International Journal of Human-Computer Studies, Vol. 43.
[5] De Almeida, et al. (2006) Using Ontologies in Context-Aware
Applications. Proc. of Database and Expert Systems, Poland.
[6] Horrocks, P. F. Patel-Schneider, F. van Harmelen. (2003) From SHIQ
and RDF to OWL: The making of a web ontology language, Journal of
Web Semantics 1 (1) 7–26.
[7] H. Chen, F. Perich, T. W. Finin, A. Joshi. (2004) SOUPA: Standard
Ontology for Ubiquitous and Pervasive Applications, in: 1st Annual
International Conference on Mobile and Ubiquitous Systems, IEEE
Computer Society, 2004.
[8] D. Zhang, T. Gu, X.Wang. (2005) Enabling Context-aware Smart Home
with Semantic Technology, International Journal of Human-friendly
Welfare Robotic Systems 6 (4), pp. 12–20.
[9] Schmidt A., C. Winterhalter (2004) “User Context Aware Delivery of E-
Learning Material: Approach and Architecture”, Journal of Universal
Computer Science (JUCS), Vol. 10(1) pp. 28-36.
[10] Bomsdorf, B. (2005) Adaptation of Learning Spaces: Supporting
Ubiquitous Learning in Higher Distance Education, Dagstuhl Seminar
Proceedings 05181: Mobile Computing and Ambient Intelligence: The
Challenge of Multimedia.
[11] Bouzeghoub, A.Do, K. and Lecocq, C. (2007) Contextual Adaptation of
Learning Resources, IADIS International Conference Mobile Learning,
pp. 41-48.
[12] Lee, M., Ye, D. and Wang, T. (2005) Java Learning Object Ontology,
International Conference on Advanced Learning Technologies, pp. 538-
[13] Jane Yau and Mike Joy. (ICALT 2007) Architecture of a Context-aware
and Adaptive Learning Schedule for Learning Java.


Dr. V. Cyril Raj received
Bachelor degree in Electronics
and Communication, Master
degree in Computer Science and
Engineering and PhD from
Jadavpur University. He is
currently Head of the Department
of Computer Science and
Engineering, Dr. MGR
University, Chennai, India. He
has published a number of papers in national and international conferences, seminars and journals and is the author of many text books. At present many research scholars are working under his guidance in different areas.
His research interests include Bioinformatics, Semantic-
Web, Computer Networks and Data Mining.

Kalla.Madhu Sudhana received
Bachelor degree in Computer
Science and Engineering from
Visvesvaraya Technological
University, Bangalore and Master
degree in Computer Science and
Engineering from Dr. MGR University, Chennai. He
worked as Assistant Professor in many Engineering
Colleges. Currently he is a research scholar in Department
of Computer Science and Engineering, St. Peter's
University, Chennai, India. His research interests are
Ontology, Semantic-Web and E-learning.

Comparison of Supervised Learning Techniques
for Binary Text Classification
Hetal Doshi
Dept of Electronics and Telecommunication
KJSCE, Vidyavihar
Mumbai - 400077, India

Maruti Zalte
Dept of Electronics and Telecommunication
KJSCE, Vidyavihar
Mumbai - 400077, India
Abstract — An automated text classifier is a useful aid in information management. In this paper, supervised learning techniques such as Naïve Bayes, Support Vector Machine (SVM) and K Nearest Neighbour (KNN) are implemented for classifying certain categories from the 20 Newsgroup and WebKB datasets. Two weighting schemes for representing documents are employed and compared. Results show that the effectiveness of a weighting scheme depends on the nature of the dataset and the modeling approach adopted. The accuracy of the classifiers can be improved by using more training documents. Naïve Bayes mostly performs better than SVM and KNN when the number of training documents is small, while the average improvement of SVM with more training documents is larger than that of Naïve Bayes and KNN. The accuracy of KNN is lower than that of Naïve Bayes and SVM. A procedure to select the optimum classifier for a given dataset using cross-validation is verified, and a procedure for identifying probably misclassified documents is developed.

Keywords- Naïve Bayes, SVM, KNN, Supervised learning and
text classification.
Manually organizing large set of electronic documents
into required categories/classes can be extremely taxing,
time consuming, expensive and is often not feasible. Text
classification also known as text categorization deals with
the assignment of text to a particular category from a
predefined set of categories based on the words of the text.
Text classification combines the concepts of Machine
learning and Information Retrieval. Machine Learning is a
field of Artificial Intelligence (AI) that deals with the
development of techniques or algorithms that let computers understand and extract patterns from the given data.
Various applications of machine learning in the field of
speech recognition, computer vision, robot control etc are
discussed in [1]. Text classification finds applications in
various domains like Knowledge Management, Human
Resource Management, sorting of online information,
emails, information technology and internet [2]. Text
classification can be implemented using various supervised
and unsupervised machine learning techniques [3]. Various
performance parameters for binary text classification
evaluation are discussed in [4]. Accuracy is the evaluation
parameter for classifiers implemented in this paper.
A binary text classifier is a function that maps input feature vectors x to output class/category labels y ∈ {1, 0}. The aim is to learn the function f from an available labeled training set of N input-output pairs (x_i, y_i), i = 1…N [5]. This is called supervised learning, as opposed to unsupervised learning, which does not use a labeled training set. There are two ways of implementing a classifier model. In a discriminative model, the aim is to learn a function that computes the class posterior p(y|x) directly; it thus discriminates between the different classes given the input. In a generative model, the aim is to learn the class-conditional density p(x|y) for each value of y together with the class priors p(y), and then to compute the class posterior by applying Bayes' rule [5]:

p(y \mid x) = \frac{p(x \mid y)\, p(y)}{\sum_{y'} p(x \mid y')\, p(y')}

This is known as a generative model because it specifies a way to generate the feature vector x for each possible class y.
Naïve Bayes classifier is an example of generative model
while SVM is an example of discriminative model. KNN
adopts a different approach than Naïve Bayes and SVM. In
KNN, calculations are deferred till actual classification and
model building using training examples is not performed.
In this paper, section II explains the Naïve Bayes and TF
(Term Frequency) & TF*IDF (Term Frequency * Inverse
Document Frequency) weighting schemes. Section III
explains the SVM and its Quadratic programming
optimization problem. Section IV describes the KNN
classifier and distance computation method. Section V
provides the implementation steps for binary text classifier.
Section VI discusses result analysis followed by conclusions
in section VII.
Naïve Bayes (NB) is based on a probabilistic model that uses a collection of labeled training documents to estimate the parameters of the model; every new document is then classified using Bayes' rule by selecting the category that is most likely to have generated the new example [8]. The principles of Bayes' theorem and its application in developing a Naïve Bayes classifier are discussed in [6]. Naïve Bayes takes a simple approach in its training and classification phases [7]. The Naïve Bayes model assumes that all the attributes of the training documents are independent of each other given the context of the class. Reference [8]
describes the differences and details of the two models of
Naïve Bayes – Bernoulli Multivariate model and
Multinomial model. Its results show that accuracy achieved
by multinomial model is better than that achieved by
Bernoulli multivariate model for large vocabulary.
Multinomial model of Naïve Bayes is selected for
implementation in this paper and its procedure is described
in [9] and [10].
If a document D_i (representing the input feature vector x) is to be classified (i = 1…N), the learning algorithm should be able to assign it to the required category C_k (the output class y). The category can be either C_1, the category with label 1, or C_0, the category with label 0. In the Multinomial model, the document feature vector captures word frequency information and not just word presence or absence. In this model a biased V-sided dice is considered, where each side of the dice represents a word W_t with probability p(W_t), t = 1…V. At each position in the document the dice is rolled and a word is inserted. A document is thus generated as a bag of words, which records which words are present in the document and their frequency of occurrence.
Mathematically this can be achieved by defining the multinomial-model feature vector for the i-th document D_i as

x_i = (x_{i1}, x_{i2}, \ldots, x_{iV}),   (1)

where x_{it} is the frequency with which word W_t occurs in document D_i and n_i = \sum_t x_{it} is the total number of words in D_i. The vocabulary size V is defined as the number of unique words found in the documents. The training documents are scanned to obtain the following counts:

N: the number of documents
N_k: the number of documents of class C_k, for both categories

from which the likelihoods p(W_t | C_k) and the priors p(C_k) are estimated. Let z_{ik} = 1 when D_i has class C_k and z_{ik} = 0 otherwise. With N the total number of documents, the likelihoods are estimated as [9]

\hat{p}(W_t \mid C_k) = \frac{\sum_{i=1}^{N} x_{it}\, z_{ik}}{\sum_{s=1}^{V} \sum_{i=1}^{N} x_{is}\, z_{ik}}.   (2)

If a particular word does not appear in a category, the probability calculated by (2) becomes zero. To avoid this problem, Laplace smoothing is applied [9]:

\hat{p}(W_t \mid C_k) = \frac{1 + \sum_{i=1}^{N} x_{it}\, z_{ik}}{V + \sum_{s=1}^{V} \sum_{i=1}^{N} x_{is}\, z_{ik}}.   (3)

The priors are estimated as

\hat{p}(C_k) = \frac{N_k}{N}.   (4)

After training is performed and the parameters are ready, the posterior probability of each category for every new unlabelled document D_j is estimated as [9]

p(C_k \mid D_j) \propto \hat{p}(C_k) \prod_{t=1}^{V} \hat{p}(W_t \mid C_k)^{x_{jt}}.   (5)

The calculation above is done for both categories and the results are compared; the label of the category with the larger value is assigned to the testing document. The assigned labels are compared with the true labels of each testing document to evaluate accuracy.
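The training and classification steps described above can be summarised in a short sketch. This is an illustrative Python implementation of equations (3), (4) and (5), not the MATLAB code used in the paper, and the toy matrices are assumptions.

# Minimal sketch of multinomial Naive Bayes with Laplace smoothing (eqs. 3-5).
import numpy as np

def train_nb(X, y):
    """X: (N, V) term-frequency matrix, y: (N,) labels in {0, 1}."""
    priors, likelihoods = {}, {}
    for c in (0, 1):
        Xc = X[y == c]
        priors[c] = Xc.shape[0] / X.shape[0]                              # p(C_k) = N_k / N   (4)
        likelihoods[c] = (1 + Xc.sum(axis=0)) / (X.shape[1] + Xc.sum())   # Laplace smoothing  (3)
    return priors, likelihoods

def classify_nb(x, priors, likelihoods):
    """Return the label whose log-posterior (5) is larger for document vector x."""
    scores = {c: np.log(priors[c]) + (x * np.log(likelihoods[c])).sum() for c in (0, 1)}
    return max(scores, key=scores.get)

# toy usage: 3 documents over a 4-word vocabulary
X = np.array([[2, 0, 1, 0], [0, 3, 0, 1], [1, 0, 2, 0]])
y = np.array([1, 0, 1])
priors, likelihoods = train_nb(X, y)
print(classify_nb(np.array([1, 0, 1, 0]), priors, likelihoods))   # expected: 1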
The term count x_{it} in (1) corresponds to the Term Frequency (TF) representation, in which the frequency of occurrence of a particular word W_t in a given document D_i is captured (local information). The TF representation, however, has the problem that it scales up frequent terms and scales down rare terms, which are usually more informative than high-frequency terms. The basic intuition is that a word occurring frequently in many documents is not a good discriminator. A weighting scheme can help to solve this problem. TF*IDF provides information about how important a word is to a document in a collection. The TF*IDF weighting scheme does this by incorporating both local and global information, because it considers not only the isolated term but also the term within the document collection [4]. Let NF_t be the document frequency, i.e. the number of documents containing term t, and N the number of documents. Then NF_t / N is the probability of selecting a document containing the queried term from the collection, and

IDF_t = \log\!\left(\frac{N}{1 + NF_t}\right)

is the Inverse Document Frequency, which represents global information; 1 is added to NF_t to avoid division by zero in some cases.
Combining local and global information, the weight of a term is TF * IDF; this is commonly referred to as TF*IDF weighting. Since longer documents with more terms and higher term frequencies tend to get larger dot products than shorter documents, which can skew and bias the similarity measures, normalization is recommended. A very common normalization method is to divide the weights by the L2 norm of the document; the L2 norm of the vector representing a document is simply the square root of the dot product of the document vector with itself.
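A minimal sketch of the TF*IDF weighting with L2 normalisation described above, using IDF_t = log(N / (1 + NF_t)); again this is an illustrative Python version, not the paper's MATLAB implementation, and the toy matrix is an assumption.

# Minimal sketch: raw term frequencies -> L2-normalised TF*IDF matrix.
import numpy as np

def tfidf_l2(tf):
    """tf: (N_docs, V) raw term-frequency matrix."""
    n_docs = tf.shape[0]
    nf = (tf > 0).sum(axis=0)                              # NF_t: documents containing term t
    idf = np.log(n_docs / (1 + nf))                        # global information
    w = tf * idf                                           # weight of a term = TF * IDF
    norms = np.sqrt((w * w).sum(axis=1, keepdims=True))    # L2 norm of each document vector
    norms[norms == 0] = 1.0                                # guard against empty documents
    return w / norms

tf = np.array([[2, 0, 1, 0], [0, 3, 0, 1], [1, 0, 2, 0]], dtype=float)
print(tfidf_l2(tf))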
An SVM model is a representation of training
documents as points in space, mapped so that the examples
of the separate categories are divided by a clear gap that is
as wide as possible. New examples are then mapped into
that same space and predicted to belong to a category based
on which side of the gap they fall on. Certain properties of
text like high dimensional feature spaces, few irrelevant
features i. e. dense concept vector and sparse document
vectors are well handled by SVM making it suitable for the
application of document classification [11].
The Support Vector Machine classification algorithm is based on a maximum-margin training algorithm. It finds a decision function D(x) for pattern vectors x of dimension V belonging to either of the two categories, 1 and 0 (-1). The input to the training algorithm is a set of N examples x_i with labels y_i, i.e. (x_1, y_1), (x_2, y_2), …, (x_N, y_N).
Fig. 1 shows the two dimensional feature space with
vectors belonging to one of the two categories. Two
dimensional space is considered for simplicity in
understanding of maximum margin training algorithm.
Objective is to have maximum margin M as wide as
possible which separates the two categories. The SVM maximizes the margin around the separating hyper-plane. The decision
function is fully specified by a subset of training samples
(support vectors).

Figure 1: Maximum margin solution in two dimensional space
To obtain the classifier boundary in terms of w and b, two hyper-planes are defined: the plus hyper-plane, w \cdot x + b = +1, and the minus hyper-plane, w \cdot x + b = -1, which are the borders of the maximum margin. The distance between the plus and minus hyper-planes is called the margin M, which is to be maximized. The margin 2/\lVert w \rVert is to be maximized, subject to y_i (w \cdot x_i + b) \ge 1 for all i.

Sometimes vectors are not linearly separable as indicated
in the fig. 2 below

Figure 2: Non- linearly separable vector points
Hence there is a need to soften the constraint that these data points lie on the correct side of the plus and minus hyper-planes, i.e. some data points are allowed to violate these constraints, preferably by a small amount. This approach also helps to improve generalization and is called soft-margin SVM. In this approach slack variables \xi_i are introduced, giving the quadratic programming problem

\min_{w, b, \xi} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i   (8)

subject to

y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1 \ldots N.
The value of C is a regularization parameter which trades off how large a margin is preferred against the number of training set examples that violate this margin and by what amount [12]. The optimum value of C is obtained using the cross-validation process. The usual way of proceeding with the mathematical analysis is to convert the soft-margin SVM problem (8) into an equivalent Lagrangian dual problem, which is maximized with respect to the Lagrange multipliers \alpha_i [12], [15]:

\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)

subject to the constraints

0 \le \alpha_i \le C, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0.
The bias b is obtained by applying the decision function to two arbitrary supporting patterns: x_1 belonging to C_1 and x_2 belonging to C_0, giving b = -\frac{1}{2}(w \cdot x_1 + w \cdot x_2).
LIBSVM is a library for Support Vector Machine
(SVM) used for implementing SVM for text classification in
this paper [14]. Its objective is to assist the users to easily
apply SVM to their respective applications. Reference [17]
provides the practical aspects involved in implementing
SVM algorithm and using linear kernel for text
classification which involves large number of features.
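The following sketch shows how a soft-margin linear SVM of the kind described above can be trained and applied; it uses scikit-learn's SVC (which wraps LIBSVM) as an illustrative substitute for the MATLAB/LIBSVM setup used in the paper, and the tiny TF*IDF vectors are assumptions.

# Minimal sketch: linear soft-margin SVM for binary text classification.
import numpy as np
from sklearn.svm import SVC

# rows: L2-normalised TF*IDF document vectors, labels in {0, 1}
X_train = np.array([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.3], [0.0, 0.8, 0.5]])
y_train = np.array([1, 1, 0, 0])

clf = SVC(kernel="linear", C=1.0)   # C trades margin width against slack penalties
clf.fit(X_train, y_train)

X_test = np.array([[0.7, 0.2, 0.1]])
print(clf.predict(X_test))          # expected: [1]
print(clf.support_vectors_)         # the training points that define the hyper-plane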
K nearest neighbour is one of the pattern recognition
techniques employed for classification based on the
technique of evaluating closest training examples in the
multidimensional feature space. It is a type of instance based
or lazy learning where the decision function is approximated
locally and all the computations are deferred until
classification. The document is classified by a majority vote
of its neighbours with document being assigned to the
category most common amongst its K nearest neighbour.
Reference [16] explains how K Nearest Neighbour can be
applied to text classification application and importance of
K value for a given problem. KNN is based on calculating
distance of the query document from the training documents

[18]. Cosine similarity is selected for the distance measurement, since documents are represented as vectors in the multidimensional feature space; the similarity between two document vectors d_1 and d_2 is calculated as

\cos\theta = \frac{d_1 \cdot d_2}{\lVert d_1 \rVert \, \lVert d_2 \rVert}.
For improving the accuracy of KNN classifier, optimum
value of K is important. K is the parameter which indicates
the number of nearest neighbours to be considered for label
computation. The accuracy of KNN classifier is severely
affected by the presence of noisy and irrelevant features.
The best value of K is data dependent. The larger value of K
reduces the effect of noise and results in smoother, less
locally sensitive decision function but it makes the
boundaries between the classes less distinct resulting in
misclassification. Evaluating the optimum value of K is
achieved by cross-validation.
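A minimal sketch of KNN classification with cosine similarity as the closeness measure, as described above; the training vectors and the value of K are illustrative, and this is not the paper's MATLAB code.

# Minimal sketch: KNN with cosine similarity and majority voting.
import numpy as np

def knn_predict(X_train, y_train, x_query, k):
    # cosine similarity between the query document and every training document
    sims = (X_train @ x_query) / (np.linalg.norm(X_train, axis=1) * np.linalg.norm(x_query))
    nearest = np.argsort(sims)[-k:]          # indices of the k most similar documents
    votes = y_train[nearest]
    return int(np.round(votes.mean()))       # majority vote for binary labels {0, 1}

X_train = np.array([[1.0, 0.0, 0.2], [0.9, 0.1, 0.0], [0.1, 1.0, 0.4], [0.0, 0.9, 0.6]])
y_train = np.array([1, 1, 0, 0])
print(knn_predict(X_train, y_train, np.array([0.8, 0.2, 0.1]), k=3))   # expected: 1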
There are primarily three steps in implementing a binary
text classifier using MATLAB
as a tool,
1. Feature extraction: In this step, the text documents, comprising the words on the basis of which classification is performed, are converted into a matrix format capturing properties of the words found in the documents. This can be done in two ways: TF or TF*IDF. A matrix is created, where
the number of rows is equal to number of documents and
number of columns represents number of words in the
dictionary defined for a given classification task and
individual element of matrix represents TF or TF*IDF
weights in the respective documents.
2. Training the classifier: During the training phase
the classifier is provided with training documents along with
the labels of training documents. Classifier develops the
model representing the pattern with which training
documents are related to their labels on the basis of the
words appearing in the document. Parameter tuning is
performed using cross-validation process.
3. Testing the classifier: On the basis of the model
developed, classifier predicts labels for the testing
documents. Accuracy of classification is assessed by
comparing the predicted labels with the true labels.
Accuracy = Number of correctly classified testing
documents / total number of testing documents.
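The three steps can be illustrated end to end with a short sketch; it uses scikit-learn in place of the MATLAB implementation, and the in-line corpus is purely illustrative.

# Minimal end-to-end sketch of the three steps: feature extraction, training, testing.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

train_docs = ["cpu and motherboard for sale", "new graphics card benchmark",
              "circuit design and resistors", "voltage regulator schematic"]
train_labels = [1, 1, 0, 0]
test_docs = ["cheap motherboard benchmark", "resistor and voltage divider"]
test_labels = [1, 0]

# 1. Feature extraction: documents -> TF*IDF matrix (rows = documents, columns = words)
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_docs)
X_test = vec.transform(test_docs)

# 2. Training the classifier on the labelled documents
clf = MultinomialNB()
clf.fit(X_train, train_labels)

# 3. Testing: accuracy = correctly classified / total number of testing documents
predicted = clf.predict(X_test)
print(accuracy_score(test_labels, predicted))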
For implementation and evaluation of the text classifiers, the 20 Newsgroup and WebKB datasets available at [20] are used; the differences in the nature of the datasets are described in [19]. Within each dataset two groups of categories are used. For 20 Newsgroup, group 1 and group 2 each consist of a pair of newsgroups (group 2 includes ‘sci.electronics’); for the WebKB dataset, group 1 is Faculty and Course and group 2 is Student and Course. Accuracy evaluation is performed for all four groups using the TF and TF*IDF document representations.
A. Comparison of TF and TF*IDF weighting scheme
Total accuracy for TF and TF*IDF representation is
obtained after performing 10 – fold cross-validation process
for different values of C for SVM classifier and K for KNN
classifier. In 10 fold cross-validation process, the entire
training set for a given group is divided into 10 subsets of
almost equal size. Sequentially one subset is tested using the
classifier trained on remaining 9 subsets. Thus the process is
repeated 10 times. Total accuracy is evaluated. As the
division of the entire training set for a given group into
almost equal 10 parts is performed randomly, three
iterations are performed and average is calculated to obtain
optimum C value for SVM and K value for KNN which
provides highest cross-validation accuracy. SVM and KNN
are implemented using these optimum values of C and K.
Comparison of the TF and TF*IDF representations is shown in Table I for 20 Newsgroup group 1, in Table II for 20 Newsgroup group 2, in Table III for WebKB group 1 and in Table IV for WebKB group 2.
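The parameter search described above can be sketched as follows; the sketch uses scikit-learn rather than the paper's MATLAB code, and X, y stand for the training TF*IDF matrix and labels of one group (they are not defined here).

# Minimal sketch: choose C (SVM) and K (KNN) by 10-fold cross-validation,
# averaged over three random splits, as described in the text.
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def best_param(make_clf, values, X, y, repeats=3, folds=10):
    scores = {}
    for v in values:
        runs = []
        for seed in range(repeats):
            cv = KFold(n_splits=folds, shuffle=True, random_state=seed)
            runs.append(cross_val_score(make_clf(v), X, y, cv=cv).mean())
        scores[v] = np.mean(runs)            # average CV accuracy over the repeats
    return max(scores, key=scores.get)

# usage (X, y assumed to be defined elsewhere):
# best_C = best_param(lambda c: SVC(kernel="linear", C=c), [0.01, 0.1, 1, 10], X, y)
# best_K = best_param(lambda k: KNeighborsClassifier(n_neighbors=k, metric="cosine"),
#                     [1, 10, 30, 45], X, y)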
TABLE I. 20 NEWSGROUP GROUP 1 (Training docs: 1197, Testing docs: 796); accuracy in %
Representation (optimal parameters)   Naïve Bayes   SVM      KNN
TF      (C=1, K=1)                    97.236        93.844   85.302
TF*IDF  (C=1, K=30)                   97.613        97.362   96.231

TABLE II. 20 NEWSGROUP GROUP 2 (Training docs: 1185, Testing docs: 789); accuracy in %
Representation (optimal parameters)   Naïve Bayes   SVM      KNN
TF      (C=1, K=1)                    96.831        92.269   79.214
TF*IDF  (C=1, K=10)                   96.451        96.198   93.156

TABLE III. WEBKB GROUP 1 (Training docs: 1358, Testing docs: 684); accuracy in %
Representation (optimal parameters)   Naïve Bayes   SVM      KNN
TF      (C=0.01, K=10)                97.368        97.807   94.298
TF*IDF  (C=1, K=45)                   97.222        98.538   94.152

TABLE IV. WEBKB GROUP 2 (Training docs: 1697, Testing docs: 854); accuracy in %
Representation (optimal parameters)   Naïve Bayes   SVM      KNN
TF      (C=0.1, K=10)                 97.658        97.19    94.262
TF*IDF  (C=1, K=40)                   96.956        98.009   94.496

TF*IDF emphasizes the weight of low-frequency terms. For SVM and KNN, the difference in accuracy between the TF*IDF and TF representations is more pronounced in the 20 Newsgroup dataset than in the WebKB dataset, as seen from Tables I-IV. This results from the fact that the contribution of low-frequency words to text categorization is more significant in the 20 Newsgroup dataset than in WebKB [19]. SVM and KNN rely on the spatial distribution of documents in the multidimensional feature space; these techniques attempt to solve the classification problem by spatial means. SVM tries to find a hyper-plane in that space separating the categories, and its classification model depends on the support vectors, while KNN computes which K training examples are closest to the testing document. TF*IDF weighting strongly affects this spatial arrangement, helping SVM and KNN to perform better.
It is observed that for the Naive Bayes classifier the performance in terms of accuracy does not vary much between the TF and TF*IDF representations. Naive Bayes builds a classifier based on the probability of words and their relative occurrences in the different categories, and all the training documents are used for building the model for both TF and TF*IDF. Hence its performance is already high for the TF representation and does not change much for TF*IDF. The contribution of low-frequency words to text categorization is less significant in WebKB than in 20 Newsgroup [19]; hence for the WebKB dataset the performance difference between the TF and TF*IDF representations is not significant for any of the three classifiers.
B. Comparison over increasing training documents
Comparison of all the classifiers (TF*IDF
representation) for increasing number of training documents
is shown below in fig. 3 for 20 Newsgroup group 1, fig. 4
for 20 Newsgroup group 2, fig. 5 for WebKB group 1 and
fig. 6 for WebKB group 2.

Figure 3: Comparison of Naïve Bayes, SVM and KNN for 20
Newsgroup Group 1

Figure 4: Comparison of Naïve Bayes, SVM and KNN for 20
Newsgroup Group 2

Figure 5: Comparison of Naïve Bayes, SVM and KNN for WebKB
Group 1


Figure 6: Comparison of Naïve Bayes, SVM and KNN for WebKB
Group 2
Results show that the accuracy of the classifier is
dependent on the number of training documents and larger
number of training documents can increase the accuracy of
classification task. In case of Naïve Bayes, increase in the
classification accuracy with larger training size is the result
of improvement in accuracy of probability estimation with
more training documents as larger possibilities are covered.
In case of SVM, increase in the classification accuracy
with larger training size is the result of obtaining the hyper-
plane which provides a more generalized solution thus
avoiding over – fitted solution.
In case of KNN, increase in the classification accuracy
with larger training size results due to the fact that with large
number of training documents, effect of noisy training
example on the classification accuracy reduces. Also it
reduces the locally sensitive nature of KNN classifier.
Results show that the accuracy of the Naïve Bayes classifier is usually better than the SVM and KNN classification accuracy when the number of training documents is small. But as the training set size increases, the SVM classification accuracy becomes comparable to Naïve Bayes and in certain cases becomes better. KNN is observed to have lower accuracy than Naïve Bayes and SVM. Also, the average amount of improvement in SVM with more training documents is larger than that of KNN and Naïve Bayes, as seen from Fig. 3, Fig. 4, Fig. 5 and Fig. 6.
Naive Bayes builds a classifier based on the probability of words and their relative occurrences in the different categories; this inherently requires less data than SVM. SVM builds a hyper-plane that separates the two categories with maximum margin, so it needs more training documents, especially training documents close to the hyper-plane, to develop its accuracy. Thus NB performs better for datasets with a small number of training documents.
SVM requires building a hyper – plane which best
separates the two categories. With more training documents,
it acquires more data around supporting hyper-plane to
provide better solution. Thus, SVM learns the data better as
more training documents are provided; especially training
data close to the hyper plane. These training documents are
the support vectors.
Naive Bayes uses word counts/frequencies as a feature
to distinguish between data sets. Each word provides a
probability that the document is in a particular class. The
individual probabilities are combined to arrive at a final
decision. It is expected that adding more words (as a result
of adding more training documents) would not drastically
change the performance level of Naive Bayes, and this is
what is observed. As a result with more training documents
the average performance of SVM improves more, relative to
the improvement in Naive Bayes.
KNN on the other hand considers the entire multi
dimensional feature space as a whole and obtains the labels
for testing documents on the basis of nearest neighbour
concept. Hence classification is done not on the basis of
model building and is dependent on local information as
emphasis is given to K nearest neighbour for label
computation. Thus its accuracy is mediocre compared to Naive Bayes and SVM.
C. Selection of classifier
Performance of the classifier can be predicted using
cross – validation results which is shown in Table V for TF
representation and Table VI for TF *IDF representation.
Cross-validation (CV) results for Naïve Bayes, SVM and
KNN classifier and classification accuracy results for all
three classifiers are provided below in Table V and VI.
TABLE V. CROSS-VALIDATION (CV) AND TEST ACCURACY, TF REPRESENTATION
                      Naïve Bayes           SVM                   KNN
                      Accuracy %  CV %      Accuracy %  CV %      Accuracy %  CV %
20 Newsgroup group 1  97.236      98.747    93.844      96.825    85.302      94.627
20 Newsgroup group 2  96.831      98.819    92.269      97.258    79.214      91.702
WebKB group 1         97.368      96.533    97.807      97.619    94.298      93.813
WebKB group 2         97.658      97.346    97.19       96.778    94.262      95.129

TABLE VI. CROSS-VALIDATION (CV) AND TEST ACCURACY, TF*IDF REPRESENTATION
                      Naïve Bayes           SVM                   KNN
                      Accuracy %  CV %      Accuracy %  CV %      Accuracy %  CV %
20 Newsgroup group 1  97.613      98.218    97.362      99.249    96.231      97.882
20 Newsgroup group 2  96.451      98.677    96.198      98.790    93.156      97.720
WebKB group 1         97.222      96.392    98.538      97.913    94.152      94.035
WebKB group 2         96.956      97.584    98.009      98.035    94.496      96.719

In a real-life application, the true labels of the testing documents are not available. To ensure that the classifier selection is optimum for a given dataset, it is advisable to perform cross-validation on the training dataset using different classifiers and to select the classifier which gives the best cross-validation results for the text classification task. It is seen from Table V and Table VI that, in most cases, whenever a classifier gives relatively better cross-validation performance it also gives better accuracy results on the testing data. Cross-validation helps in estimating how accurately a predictive model of a classifier will generalize to a testing dataset compared to other classifiers.
D. Identification of misclassified documents
In a real-life application, labels for query documents are not available. SVM may give the best performance, but identifying the misclassified documents is not possible with one classifier alone. After the classification operation is performed by all three classifiers, their results are added and ranks are given to the testing documents, so every testing document is assigned a rank of 0, 1, 2 or 3. Rank 0 means all three classifiers have assigned label 0 to that testing document, and rank 3 means all three classifiers have assigned label 1. Rank 1 means two out of the three classifiers have assigned label 0 to the testing document, and rank 2 means two out of the three classifiers have assigned label 1. Hence a discrepancy between the classifier results is observed when a testing document is assigned rank 1 or rank 2.
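The ranking scheme can be sketched as follows; the label arrays are illustrative, not results from the paper, and in the paper SVM's labels are kept as the final prediction for flagged documents.

# Minimal sketch: sum the binary labels of the three classifiers and flag
# documents whose rank is 1 or 2 (i.e. the classifiers disagree).
import numpy as np

nb_labels  = np.array([1, 0, 1, 1, 0])
svm_labels = np.array([1, 0, 0, 1, 0])
knn_labels = np.array([1, 1, 0, 1, 0])

ranks = nb_labels + svm_labels + knn_labels          # 0, 1, 2 or 3 per document
flagged = np.where((ranks == 1) | (ranks == 2))[0]   # possible misclassifications

print("ranks:", ranks)                               # here: [3 1 1 3 0]
print("possibly misclassified:", flagged, "- predicted labels:", svm_labels[flagged])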
The documents with rank 1 and rank 2 are indicated to the user; this indication alerts the user to documents which may be misclassified. For the final labels, the SVM-predicted labels are considered. Table VII summarizes this procedure for the identification of misclassified documents.

TABLE VII. IDENTIFICATION OF PROBABLY MISCLASSIFIED DOCUMENTS
Group                 % of documents flagged as    Documents misclassified   Misclassified documents
                      probably misclassified       by SVM                    identified
20 Newsgroup group 1  3.89                         21                        11
20 Newsgroup group 2  4.94                         30                        9
WebKB group 1         5.11                         10                        2
WebKB group 2         4.91                         17                        5

It is seen from Table VII that it is possible to identify some of the misclassified documents using the combination of the results of the three classifiers. Around 5% of documents are flagged as probably misclassified. As the actual labels are available for the testing documents, it is seen that, out of the total documents misclassified by SVM, some are identified. Not all are identified, because the remaining ones were ranked 0 or 3, indicating that none of the classifiers could assign the correct labels to those documents.
Implementation and evaluation of Naïve Bayes, SVM and KNN on categories from the 20 Newsgroup and WebKB datasets using two different weighting schemes led to several conclusions. The effectiveness of the weighting scheme used to represent documents depends on the nature of the dataset and also on the modeling approach adopted by the classifier. The classification accuracy can be improved by using more training documents, which helps in obtaining a more generalized solution covering a larger range of possibilities. Naïve Bayes mostly performs better than SVM and KNN when the number of training documents is small. The average amount of improvement in SVM with more training documents is larger than that of KNN and Naïve Bayes. Parameter tuning for SVM and KNN using cross-validation assists in achieving a generalized solution suitable for a given dataset. Classification in KNN is not based on model building and depends only on local information, so its classification accuracy is lower
than that of Naïve Bayes and SVM. The procedure to evaluate a suitable classifier for a given dataset using the cross-validation process is verified. The procedure for identifying probably misclassified documents is developed by combining the results of the three classifiers, since they adopt different approaches to the text classification problem.
[1] M. Mitchell “The Discipline of Machine Learning” Machine Learning
Department, School of Computer Science, Carnegie Mellon University,
Pittsburgh, PA, USA, Technical report CMU-ML-06-108.
[2] Vishal Gupta, Gurpreet S. Lehal, “ A survey of Text Mining Techniques
and Applications”, in Journal Of Emerging Technologies in Web
Intelligence, Vol 1, No 1, August 2009, pp. 60 -76.
[3] George Tzanis, Ioannis Katakis, Ioannis Partalas, Ioannis Vlahavas, “
Modern Applications of Machine Learning”, in Proceedings of the 1st
Annual SEERC Doctoral Student Conference – DSC 2006. pp. 1-10.
[4] Fabrizio Sebastiani, “Text categorization”, In: Alessandro Zanasi
(ed.), Text mining and its Applications, WIT Press, Southampton, UK,
2005, pp. 109-129.
[5] Kevin P. Murphy, “Naïve Bayes classifier”, Technical report,
Department of Computer Science, University of British Columbia, 2006.
[6] Haiyi Zhang, DiLi, “Naïve Bayes Text Classifier”, in 2007 IEEE
International Conference on Granular Computing, pp. 708-711.
[7] S.L. Ting, W.H. Ip, Albert H.C. Tsang, “Is Naïve Bayes a Good Classifier
for Document Classification?”, In: International Journal of Software
Engineering and Its Applications Vol. 5, No. 3, July, 2011, pp. 37-46.
[8] Andrew McCallum and Kamal Nigam, “A Comparison of Event Models
for Naïve Bayes Text Classification”, In: Learning for Text
Categorization: Papers from the AAAI workshop, AAAI pressc(1998) 41
– 48 Technical report Ws – 98 – 05.
[9] “Text Classification using Naïve Bayes”, Steve Renals, Learning and
Data lecture 7, Informatics 2B. Available online:
[10] “Generative learning algorithm”, lecture notes2 for CS229, Department
of Computer Science, University of Stanford. Available online:
[11] T. Joachims, “Text Categorization with Support Vector Machines:
Learning with Many Relevant Features”, In Proceedings of the European
Conference on Machine Learning (ECML), Springer, 1998.
[12] Brian C. Lovell and Christian J. Walder, “ Support Vector Machines for
Business Applications” , Business Applications and Computational
Intelligence, Idea Group Publishers, 2006.
[13] Bernhard E. Boser, Isabelle M. Guyon, Vladimir N. Vapnik, “A Training
Algorithm for Optimal Margin Classifiers” In proceedings of the Fifth
Annual Workshop on Computational Learning Theory, ACM press, 1992,
pp. 144-152.
[14] Chih – Chung Chang and Chih – Jen Lin, “ LIBSVM: A library for
Support Vector Machines”, Department of Computer Science, National
Taiwan University, Taipei, Taiwan, 2001.
[15] Nello Cristianini , John Shawe-Taylor, An Introduction to support Vector
Machines: and other kernel-based learning methods, Cambridge
University Press, 2000.
[16]KNN classification details available online at
[17] Hsu, C.-W., Chang, C.-C., and Lin, C.-J. “A practical guide to support
vector classification.”, Technical report, Department of Computer
Science, National Taiwan University, 2003.
[18]Zhijie Liu, Xueqiang Lv, Kun Liu, Shuicai Shi, “Study on SVM
compared with the other Text Classification Methods” in 2010 Second
International Workshop on Education Technology and Computer Science,
pp. 219 – 222.
[19] R. Bekkerman, R. El-Yaniv, N. Tishby, and Y. Winter, “Distributional
word clusters vs. words for text categorization”, in Journal of Machine
Learning Research, Volume 3, 2003. pp. 1183–1208.
[20]Dataset used in this paper available online at

Hetal Doshi has received B. E. (Electronics) degree in 2003 from University
of Mumbai and is currently pursuing her M. E. (Electronics and
Telecommunication) from K. J. Somaiya College of Engineering (KJSCE),
Vidyavihar, Mumbai, India. She is in the teaching profession for last 8 years
and is working as Assistant Professor at KJSCE. Her area of interest is
Education Technology, Text Mining and Signal Processing.
Maruti Zalte has received M. E. (Electronics and Telecommunication) degree
in 2006 from Govt. College of Engineering, Pune. He is in the teaching
profession for last 9 years and is working as Associate Professor at KJSCE.
His area of interest is Digital Signal Processing and VLSI technology. He is
currently holding the post of Dean, Students Affairs, KJSCE.


Creation of Digital Test Form for Prepress
Jaswinder Singh Dilawari, PhD Research Scholar, Pacific University, Udaipur, Rajasthan, India
Dr. Ravinder Khanna, Principal, Sachdeva Engineering College for Girls, Mohali, Punjab, India

Abstract: The main problems in colour management in the prepress department are the lack of available literature on colour management and the knowledge gap between the prepress department and the press department. A digital test form has therefore been created with Adobe Photoshop to analyse the ICC profile and to create a new profile, and the analysed data are used to study the grey scales of RGB and CMYK images. This helps in the conversion of images from RGB to CMYK in the prepress department.
Keywords: IT8 Test Chart, Digital Test Form, Characterisation of Scanners, ISO 12641-1997, Calibration of Scanners
In the prepress process before printing there is always a need to reproduce images accurately and according to the requirements of the printing department. For this the ICC has defined a common color profile format, and every printing operation either develops its own printer profile with the ICC profile as a reference or hires consultants. The biggest problem in creating one's own profiles is the lack of literature available on color management. Printers and prepress houses have difficulties in adjusting specific parameter settings in the profile because of insufficient color management skills. If a flatbed scanner could scan each color patch colorimetrically correctly, a scanner profile (ICC) would not be necessary; each color that is scanned incorrectly with respect to its colorimetric value needs a color correction when being converted from the source profile (scanner) to the destination profile (the RGB color working space). The input image handled by the prepress department is in RGB mode (Red, Green, Blue) and for printing it has to be converted to CMYK mode (Cyan, Magenta, Yellow, Black) [1][2]. This color conversion is today done with ICC profiles. The profiles contain information about separation, black start, black width and total ink coverage. GCR (Gray Component Replacement) and UCR (Under Color Removal) are the two main color separation techniques used to control the amounts of black, cyan, magenta and yellow needed to produce the different tones, since black ink can replace equal amounts of cyan, magenta and yellow. To produce a similar tone, UCR and GCR replace equal amounts of cyan, magenta and yellow in neutral tones; GCR also replaces some CMY colors in tertiary colors [3]. Before converting an image from RGB to CMYK, the color profiles first have to be understood. For this a digital test form is created using the most common software on the market, Adobe Photoshop. This digital test form remedies the lack of literature on creating one's own profile. Many printers use external consultants for their color management, and color profiles are created based on the information provided by them, so a specification is required to improve the communication between the external consultant and the printer as well as between the prepress and press departments.
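To make the role of black generation concrete, the following minimal sketch shows a naive RGB to CMYK separation with 100 % Under Color Removal, in which black replaces equal amounts of cyan, magenta and yellow in the neutral component. It is only an illustration of the principle, not the ICC-profile-based conversion discussed in this paper.

# Minimal sketch (assumption: naive separation, no ICC profile involved).
def rgb_to_cmyk_ucr(r, g, b):
    """r, g, b in 0..255; returns C, M, Y, K in 0..1."""
    c = 1 - r / 255.0
    m = 1 - g / 255.0
    y = 1 - b / 255.0
    k = min(c, m, y)                  # gray component that black ink can replace
    if k == 1.0:                      # pure black
        return 0.0, 0.0, 0.0, 1.0
    # remove the replaced gray component from C, M, Y and rescale
    return tuple(round((x - k) / (1 - k), 3) for x in (c, m, y)) + (round(k, 3),)

print(rgb_to_cmyk_ucr(128, 100, 60))  # a brownish tone: some C, M, Y plus black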
Each scanning device has to be characterized against the common ICC color profile. To check their characterization, a high-chroma image is used and tested using three different IT8.7/2 test targets [4]. Besides the established IT8 targets from the major color chart vendors, a new IT8 target was created for the tests. The four test charts are named A, B, C and D in the study. Reference color values such as lightness and chroma coordinates were read from the test targets using a spectrophotometer. The fourth test chart (D) is created using the analysis data of the three test charts A, B and C. These test charts are compared with the ISO standard ISO 12641-1997, and their profile spaces are also compared and analyzed. The result of this analysis is used to create the new test chart, and its effect on the color gamut is also studied.
A. Creation of New IT-8 Test Chart
The new test chart follows the ISO standard LCH layout (ISO 12641-1997). The scanner target consists of a total of 264 colors, as shown in Figure 1. The target design is a uniform mapping and is defined in detail in the ANSI standard IT8.7/2 for reflection material (ISO 12641-1997).

Figure 1: The scanner target consists of a total of 264
colors. The red frames show the standardized values.
As shown in the figure above, three luminance levels are defined with 12 hue angles. Each hue angle at each luminance level has four chrominance values, and the highest chrominance value is unalterable [5]. A further 84 patches provide additional tone scales which are not defined by any ISO standard. Seven tone scales are defined for the colors cyan, magenta, yellow, red, green and blue (no ISO standard defined). Each tone scale is built up in twelve steps, starting from the lowest chrominance value and keeping the hue angle stable. Each vendor has defined an optimal tone scale for its own specific output media [6]. The last three columns are vendor specific; here the vendor manufacturing a target was allowed to add any feature it deemed worthwhile. Keeping the above specifications in mind, the IT-8 test chart shown in Figure 2 was developed.

Figure 2: The customized IT.8 target for scanners
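To make the structure of the standardised part of the target concrete, the following sketch generates LCh patch coordinates for three lightness levels, twelve hue angles and four chroma steps per hue and converts them to CIE L*a*b*. The numeric lightness and chroma values are assumptions, and the additional tone-scale and vendor-specific patches of the real chart are not generated here.

# Minimal sketch: enumerate the standardised LCh patches and convert to L*a*b*.
import math

def lch_to_lab(L, C, h_deg):
    h = math.radians(h_deg)
    return L, C * math.cos(h), C * math.sin(h)

lightness_levels = [25, 50, 75]                 # assumed example values
hue_angles = [i * 30 for i in range(12)]        # 12 hue angles, 30 degrees apart
chroma_steps = [15, 30, 45, 60]                 # 4 chroma values per hue (assumed)

patches = [lch_to_lab(L, C, h)
           for L in lightness_levels
           for h in hue_angles
           for C in chroma_steps]

print(len(patches), "standardised colour patches")   # 3 * 12 * 4 = 144
print(patches[0])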

The analysis of the problems in scanner calibration involves two different studies performed at different points in time.
A. First Study
The first study was performed in 2000 when ICC-profiles
were used by only a minority of Swedish printers. Color
separation, at that time, was performed directly in image
scanners or in imaging applications (i.e. Adobe Photoshop)
using color look up tables. A total of 120 companies, both
printers with prepress departments and dedicated prepress
houses, participated in the study. The companies are all
located in Sweden, with an even geographical spread
throughout the nation. The printers and prepress houses
were also chosen on the basis of the size of the company,
but only companies with two or more employees were
included in the survey. Semi-structured interviews were
conducted with prepress representatives, normally by
telephone or by e-mail. Ten company visits were made. A
number of questions concerning the different separation
techniques were asked in order to be able to assess the
general level of competence [7].
B. Outcome of Study
After the study it was observed that there is poor communication between the prepress and press departments, a lack of knowledge of image separation, and no proper image separation. Moreover, there is a need for guidelines written in an understandable form. It was observed that only 20 % of printers and prepress houses have a good knowledge of the conversion of images from RGB to CMYK.
C. Second Study
The second study was performed in 2003. Eighty sheet-fed
offset printers and thirty four newspaper printers, evenly
geographically spread over Sweden, participated in this
study. Companies with only one employee were not
included. As in the first study, semi-structured interviews
were conducted with prepress representatives for each
printer or prepress house either through a visit or by e-
mail. A structured web questionnaire was also used. The
questions asked concerned the use, creation and
implementation of ICC-profiles [8]. Approximately 50
percent of the printers/prepress houses participating in this
study were also involved in the first study. In order to
verify the findings and clarify the results, nine independent
color consultants were contacted and interviewed.
D. Outcome of Study

The second study showed that a majority (70%) of the
commercial printers nationwide in Sweden are using ICC-
profiles for color reproduction, particularly in the
newspaper industry (83%). The majority of the
participants in the survey felt that there was a lack of
communication or non-existing communication in all
process directions. There is normally no dedicated time for
quality meetings [9]. The newspapers have a better know-
how than commercial printers concerning color
management. Few companies set a strategy for their color
management implementation and they therefore may not
use the consultant in the right way. Terminology confusion
is common in the graphic arts industry. The study shows
that many pre-press staff members use the terms
incorrectly or mix them up. The survey indicates that
external consultants play an important role in the creation
of ICC-profiles.
E. Where is the problem?
The problem lies at the ground level. Calibration is difficult because of the lack of knowledge and the communication gap between the various departments. The lack of knowledge stems from the scarce literature available on color management instructions which could help printers to better understand the technology [10]. The communication problem is due to the lack of a common language, caused mainly by the different backgrounds and experiences of the people involved. The creation of a new ICC profile is not an easy process; the main problems a respondent faces are:
1) Device calibration
2) Misunderstood profiling set-up options
3) Lack of understanding of the profiling process
4) Inappropriate test target
5) Inappropriate profiling software
The quality can be improved by removing these problems
during building of color management software.
Printing personnel generally do not attend to the maintenance of production quality because of lack of time and the pressure to complete work on time. Reading manuals to learn more about certain software has been shown to be a poor alternative, as manuals are often written in a difficult way, and often in a foreign language. The graphic arts industry therefore needs a tool which the user can apply on his or her own [11][12][13]. The user can explore its settings to improve image quality and later use these settings for better production. For this purpose digital test forms are created using Adobe Photoshop. The created test form can provide information to the user about many settings in the profile. The test form helps to show the differences between the settings already in the RGB color mode and to avoid misunderstandings after printing. The layout of the test form facilitates practical understanding by showing the result of a color conversion from RGB to CMYK using a profile.
The digital test form gives information about:
• ICC-profiles in MacOS and Windows
• Different color gamuts
• RGB Gray balance
• Rendering intents
• Gamut warning
• Separation
• CMYK gray balance
• Chroma shift
• Gamut mapping
• Skin tones
• Total Ink Coverage
The image in Figure 2 can be analyzed with respect to changes in the profile with the help of the digital test form.

Figure 3: The digital test form for the evaluation of ICC-profiles

Figure 4: The lightness circles in five levels for
evaluation. To better understand how the settings affect
the result the user can see the changes in two directions -
the vertical direction and the horizontal direction (the
lightness circles on the digital test form).
Before designing a new ICC profile, its specifications should be clearly stated. An understanding of these specifications will facilitate internal communication within a printing company, and also between printing companies and the external consultants creating ICC systems [14]. It has been observed that none of the printing companies has clear specifications for the construction of ICC profiles. In the printing workflow every process (prepress, press, post-press) should communicate with the others, and to communicate properly the input and output specifications of these processes should be properly understood. Improved communication gives a better process understanding and thereby a better production quality [15][16]. For better communication between processes these five points should be considered:
1) General demands/specifications
This part of the list contains objectives, implementation and specifications. Before a profile-based production can be considered, comprehensive objectives need to be set. The purpose of the process change must be explained to the personnel directly involved in the production process, and general information about profile implementation needs to be given [17]. To avoid common misunderstandings and improve internal communication, written process instructions should be followed. Each process should be defined and described, with regard to responsibility and …
2) Test form specifications
This part describes responsibility distribution according
to the creation and content of the test form.
3) RIP (Raster Image Processor) specifications
This part of the communication list describes initial
demands - linearization and setting of the RIP.
4) Output profile specifications
This part describes responsibility distribution according
to settings in the profile.
5) Printing specifications
This part describes initial demands, general facts and
standard demands.
Using above methods if specifications are clearly
understood then communication between processes will
be proper and will result in proper color management.
Each step in a color management set-up must be
documented so that a later profiling update can be
established with the same set-up [18][19][20]. The
communication list deals with specification demands
which are of importance in the development of profiles
and different responsibility distributions in the
development of these profiles. Two scenarios are
described: the first situation is when the printing company
creates its own profiles without the involvement of an
external consultant, and the second scenario describes the
situation when the printing company needs external help to
create the profiles.
After talking with various printing presses and their correspondents, it has been observed that the lack of literature available on color management is the main problem. When different scanning devices, or the prepress and press departments, have to communicate, there is also a lack of communication because of missing knowledge of color management. After studying the problem, digital test forms have been created using Adobe Photoshop which give information about the chrominance values and hue angles in an image. By studying this digital form the communication gap between the various departments can be covered. Moreover, the problem of literature being available only in other languages is also overcome, because the digital test form presents the color test chart in a user-readable format. The digital test form also gives information about RGB gray balance and CMYK gray balance, so the conversion of RGB to CMYK in the prepress department is eased by the digital test form.

[1] G. Sharma, Ed., Digital Color Imaging Handbook. Boca Raton, FL: CRC, 2003.
[2] H. J. Trussell, M. J. Vrhel, and E. Saber, "Overview of the color image processing issue," IEEE Signal Processing Mag., vol. 22, no. 1, pp. 14–22, 2005.
[3] R. Ramnath, W. E. Snyder, Y. F. Foo, and M. S. Drew, "Color image processing," IEEE Signal Processing Mag., vol. 22, no. 1, pp. 34–43, 2005.
[4] R. Bala, "Device characterization," in Digital Color Imaging Handbook, G. Sharma, Ed. Boca Raton, FL: CRC, 2003, ch. 5.
[5] Z. Fan, G. Sharma, and Shen-ge Wang, "Error-diffusion robust to mis-registration in multi-pass printing," in Proc. PICS, 2003, pp. 376–.
[6] R. Bala and R. V. Klassen, "Efficient color transformation implementation," in Digital Color Imaging Handbook, G. Sharma, Ed. Boca Raton, FL: CRC, 2003, ch. 11.
[7] P. C. Hung, "Colorimetric calibration for scanners and media," Proc. SPIE, vol. 1448, pp. 164–174, 1991.
[8] G. Sharma, "Methods and apparatus for identifying marking process and modifying image data based on image spatial characteristics," US Patent 6,353,675, Mar. 05, 2002.
[9] J. Morovic and Y. Wang, "Influence of test image choice on experimental results," in 11th IS&T/SID Color Imaging Conference, pp. 143–148, 2003.
[10] M. J. Vrhel and H. J. Trussell, "Color Device Calibration: A Mathematical Formulation," IEEE Transactions on Image Processing, vol. 8, no. 12, Dec. 1999.
[11] M. J. Vrhel and H. J. Trussell, "Color Printer Characterization in
[12] G. Sharma, "Target-less scanner color calibration," J. Imaging Sci. and Tech., vol. 44, no. 4, pp. 301–307, Jul./Aug. 2000.
[13] G. Sharma and C. E. Rodríguez-Pardo, "The dark side of CIELAB," in Proc. SPIE: Color Imaging XVII: Displaying, Hardcopy, Processing, and Applications, vol. 8292, paper 8292–12, pp. 1–9, Jan. 2012.
[14] R. Storn, "System design by constraint adaptation and differential evolution," IEEE Trans. Evol. Comput., vol. 3, no. 1, pp. 22–34, 1999.
[15] G. W. Meyer and D. P. Greenberg, "Color education and color synthesis in computer graphics," Color Res. Appl., vol. 11, Suppl. (June 1986), pp. S39–S44.
[16] M. C. Stone, W. B. Cowan, and J. C. Beatty, "Color Gamut Mapping and the Printing of Digital Color Images," ACM Transactions on Graphics, vol. 7, no. 4, October 1988, pp. 249–292.
[17] J. Tajima, H. Haneishi, N. Ojima, and M. Tsukada, "Representative data selection for standard object colour database (SOCS)," in Proc. IS&T/SID Tenth Color Imaging Conference: Color Science, Systems and Applications, pp. 155–160, 12–15 Nov. 2002.
[18] R. Storn, "System design by constraint adaptation and differential evolution," IEEE Trans. Evol. Comput., vol. 3, no. 1, pp. 22–34, 1999.
[19] G. Sharma, "Methods and apparatus for identifying marking process and modifying image data based on image spatial characteristics," US Patent 6,353,675, Mar. 05, 2002.
[20] E. J. Giorgianni and T. E. Madden, Digital Color Management: Encoding Solutions. Reading, MA: Addison Wesley, 1998.

Jaswinder Singh Dilawari is working as an Associate Professor at Geeta
Engineering College, Panipat, Haryana, India. He has teaching experience of 12
years. His areas of interest include Computer Graphics, Computer
Architecture, Software Engineering, Fuzzy Logic and Artificial Intelligence. He is a
life member of the Indian Society for Technical Education (ISTE).

Born in 1948, Dr. Ravinder Khanna graduated in Electrical
Engineering from the Indian Institute of Technology (IIT) Delhi in 1970 and
completed his Masters and Ph.D. degrees in Electronics and Communications
Engineering from the same Institute in 1981 and 1990, respectively. He worked as
an Electronics Engineer in the Indian Defence Forces for 24 years, where he
was involved in teaching, research and project management of some high-tech
weapon systems. Since 1996 he has switched full time to academics. He has
worked in many premier technical institutes in India and abroad.
Currently he is the Principal of Sachdeva Engineering College for Girls,
Mohali, Punjab (India). He is active in the general areas of Computer
Networks, Image Processing and Natural Language Processing.

Web Test Integration and Performance
Evaluation of E-Commerce Web Sites

Md. Safaet Hossain
Department of Electrical Engineering and Computer Science
North South University, Dhaka Bangladesh

Md. Shazzad Hosain
Department of Electrical Engineering and Computer Science
North South University, Dhaka Bangladesh

Abstract— Web applications are becoming progressively more
complex and imperative for companies. The e-commerce web
sites have been serving to accelerate and disseminate more widely
changes that are already under way in the economy. Their
development, including analysis, design and testing, needs to be
approached by means of support tools, while their correctness
and reliability are often crucial to the success of businesses and
organizations. There are some tools provided to support analysis
and design. However, few tools are provided to directly support
the software testing on Web-based applications. In this paper, an
automated online website evaluation tool, hosted online, is proposed to support the
automated testing of web-based applications. Testers can
evaluate the performance of a site against other websites,
precisely assess existing websites, and find out what
modifications are required. The tool elevates the automation level of
functional testing for web applications to a new height.

Keywords: Web based application testing, performance testing,
functional testing, test methods integration, e-commerce.
We need the internet in almost every field of life, and we use it
mostly in the form of web applications, for paying utility bills,
social networking, email, online transactions, etc. Online shopping
has become progressively more widespread over the years.
E-commerce sales in the U.S. grew
from 72 billion U.S. dollars in 2002 to 228 billion U.S. dollars
in 2010. The leading portions of online revenues were
generated by the retail shopping websites, which earned 142
billion U.S. dollars in 2010. A 2011 e-commerce market
projection predicted that online retail revenues alone would
reach 269 billion U.S. dollars by 2015. Simultaneously, the
number of online shoppers in the U.S. is expected to grow
from 140 million in 2010 to 170 million in 2015 according to
eMarketer estimates [14].

In recent years, web applications have become important for
many companies, as being a convenient and inexpensive way
to provide information and services on-line. Since a
malfunctioning web application could interrupt an entire
business and cost millions of dollars, there is a strong demand
for methodologies, models and tools that can improve the
quality and reliability of web sites [1].

A study by Glenn A. Stout [13], Senior Functional
Specialist at The Revere Group, demonstrates that poorly
operating websites can severely affect an online business.
The study also showed that when errors are
found on an e-commerce website, 28% of the people stopped
shopping at that site, 23% stopped buying from the site, and
6% of the people were so upset that they stopped buying at e-
commerce sites altogether [13]. One can only surmise that customers
feel that if a company cannot provide a quality website, then
it may not be able to sell a quality product from its stores.

To make an online site popular, effective, and competitive in
business, whether it is an e-commerce site, a social
networking site or any other site, there is no alternative to
building a good quality site in terms of performance and
reliability. Before launching any online site, it is thus
imperative to test the site for high performance and
reliability against World Wide Web Consortium standards. At the
same time, the site needs to be compared with other online
sites to be competitive in the e-commerce market.

There are many different techniques or Web test methods that
can be applied for performance evaluation and for error-free
sites. Tools such as HTML validators, Bobby and Netcraft allow
static analysis of sites, e.g. HTML errors, link errors, etc. The
following table lists tools that provide test results
such as page size, performance time, etc.
Web Site Measurement Tools:
• HTML Validator: checks the markup validity of Web documents in HTML, XHTML, SMIL, MathML, etc.
• Links Validator: the Link Checker analyzes anchors (hyperlinks) in an HTML/XHTML document; useful to find broken links.
• Functional Accessibility Evaluator: evaluates the functional accessibility of a web site.
• HERA: checks the accessibility of Web pages according to the Web Content Accessibility Guidelines (WCAG 1.0) specification.
Any Web site that requires evaluation against the above mentioned
criteria would require accessing these Web applications
separately, obtaining test results for a specific functionality, and merging
and evaluating the results manually for a proper decision. This
increases the time as well as the cost of application testing.
Testing of web-based applications in particular deserves
further examination, because for economic reasons and
resource constraints companies often choose not to test.
However, if we could integrate different online Web test tools
or applications, it would make Web testing much faster
and easier and thus help in building robust e-commerce sites.
Application integration is not entirely a new idea. Recently
this issue has attracted much interest, especially in the Bioinformatics
application domain, as found in [17]. Thus our idea is to
use existing Web testing applications and integrate them
for faster and more reliable Web testing. The idea is to
develop a test tool where users put the URL of the Web site
under test; the tool then automatically crawls different existing test
sites, submits queries for test purposes, gathers the test results, and
displays them to the user in an integrated view. Thus in this
research our main contributions are:

1. Integrate different online Web test applications.
2. Evaluate performance of Web applications.
3. Compare different web sites performance to make an
e-commerce business competitive.

The paper is organized as the following. Section II provides
related works, section III presents Web test models, section IV
describes Web evaluation methodologies and finally section V
draws the conclusion.
A good quality Web application is one of the important criteria
for successful e-commerce sites. An investigation by Forrester
Research [6] found that consumers expect pages to load in two
seconds or less, and that after three seconds, up to 40 percent of users
will abandon the site. In another study [16] the authors also
pointed out that users perceive the page response as uninterrupted if
it is less than 1.0 second, even though they notice the
delay. However, there are still no industry standards for
acceptable application response time. According to Gomez
benchmarking [15], the comparison between different
websites can be made by evaluating the average response time
of all the pages of a Web site.

The user view of quality e-commerce site can be assessed
mainly in terms of functionality and usability. World Wide
Web Consortium (W3C) [18] defines a set of guidelines for
quality Web designing and testing. These guidelines cover a
wide range of development standards including HTML tags,
CSS, web accessibility, HTTP/1.1: Status Code Definitions
etc. Every guideline provides a technique for accessing the
content of Website. The qualitative measures [19], [20] such
as text formatting, link formatting, page formatting, graphics
element, page performance and site architecture are used to
achieve quality of website.
There are different sites or Web applications for evaluating e-
commerce sites that can check web pages against the web
standards [18]. The HTML Validator
checks HTML, XHTML or CSS documents and returns a list
of warnings and errors according to the standard. It also helps
to eliminate website problems that cause visitors to abandon
websites. The Links Validator (/checklink) reads an HTML or
XHTML document or a CSS style sheet and extracts a list of anchors
and links so that no anchor is defined twice. It then checks the status of every page
link. The W3C Link Checker accepts the URL address of a Web
page and parses each and every hyperlink to find broken links
in the page. The Functional Accessibility Evaluator analyzes web pages for markup that
is consistent with the W3C standard. It analyzes the web pages
based on navigation & orientation, text equivalents, scripting,
styling, etc. HERA is a tool to check the accessibility of
Web pages according to the specification of the Web Content
Accessibility Guidelines [18]. HERA performs a preliminary set of tests on the page and
identifies any automatically detectable errors or checkpoints
met, and which checkpoints need further manual verification.
The Web Page Size Checker tool gives the page size of the
specified URL. Page size determines how long it will usually take
users to open the web page; for example, 10 KB is
approximately a small page size, which means the loading
speed is quicker. In this paper we also demonstrate an
additional feature included in the integrated web test model:
by traversing all the web page links of an e-commerce web site,
it provides the response time and page size of all links along
with the URL (uniform resource locator).
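To make the page-size and response-time checks described above concrete, the short Python sketch below fetches a single URL and reports its download time and size in kilobytes. It is only an illustration of the kind of measurement these tools perform; the function name and the example URL are ours and not part of the paper's released tool.

    import time
    import urllib.request

    def measure_page(url, timeout=10):
        """Fetch a page once and return (response time in seconds, page size in KB, HTTP status)."""
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()                      # download the full page body
            elapsed = time.perf_counter() - start   # end-to-end time for request plus download
        return elapsed, len(body) / 1024.0, resp.status

    if __name__ == "__main__":
        # Hypothetical example URL; replace with the site under test.
        t, size_kb, status = measure_page("http://example.com/")
        print("status=%d  response_time=%.2fs  page_size=%.2f KB" % (status, t, size_kb))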


The present web test model, in simple terms, checks the
web application at a specified URL for potential bugs
before it is made live or before the code is moved into the
production environment. During this stage users need to visit
several sites, put the URL into each of these sites and gather reports from
all the sites for the test results. This increases the time and cost of web
testing. The model is shown in figure 1.

Figure 1: Present Web test model
In contrast to the present web test model, the integrated web test
model that we propose in this paper, shown in figure 2,
integrates existing test sites into a single platform. In this
approach, the user puts the web URL under test into the integrated
Web test tool; the test tool then automatically crawls other
existing Web test tools, submits the URL for test purposes, gathers
results from the different sites and merges them, so that the user can
view all the testing results at a glance. The integrated model also
provides some new web testing features such as link status
checking, page response time for every link, and a page size checker for
the specified URL in bytes and kilobytes. These additional
results relieve the user of manually measuring the response time of all the
links of the e-commerce site, and thus provide an average response
time and average page size.

Figure 2: Integrated Web test model
In the integrated web testing view we get errors and
warnings about the HTML code, page link reports, page link
status code definitions, the total number of good and bad links,
a list of all URL links, and a table listing each link number with its
page access date & time, web page link response time and web
page size for every webpage link of the e-commerce site.
Getting all this information would otherwise require accessing at least
five different Web test sites.
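As a rough illustration of this integrated view, the Python sketch below submits a site URL to several test services and merges their raw responses into a single report. The endpoint URLs and query formats here are placeholders of our own; the actual services the tool crawls, and how it parses their reports, are not specified at this level of the paper.

    import json
    import urllib.parse
    import urllib.request

    # Placeholder endpoints standing in for the external test services the tool crawls.
    TEST_SERVICES = {
        "html_validation": "http://validator.example/check?uri={url}",
        "link_check":      "http://linkchecker.example/check?uri={url}",
        "accessibility":   "http://accessibility.example/evaluate?uri={url}",
    }

    def integrated_report(site_url, timeout=15):
        """Submit the site URL to each test service and merge the raw responses into one report."""
        report = {"site": site_url, "results": {}}
        for name, template in TEST_SERVICES.items():
            query = template.format(url=urllib.parse.quote(site_url, safe=""))
            try:
                with urllib.request.urlopen(query, timeout=timeout) as resp:
                    report["results"][name] = resp.read().decode("utf-8", errors="replace")
            except OSError as exc:                  # network errors, timeouts, HTTP error codes
                report["results"][name] = "error: %s" % exc
        return report

    if __name__ == "__main__":
        print(json.dumps(integrated_report("http://example.com/"), indent=2)[:500])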
Like any complex piece of software, there is no single, all-inclusive
performance indicator that fully characterizes a Web
site. Different fault types define different problems. For
example, HTML head tag errors, font tag errors and body tag
errors identify problems in the text elements of a web page,
so text formatting measures are to be evaluated. Image
tag errors and image load errors identify errors in the display of
graphics elements and link tags. Script tag errors, server connectivity errors,
download time of the Website and broken link errors indicate
the need for Website architecture redesign. However, in this
research the following fault types are investigated:
• Web page faults: web page faults according to the Web Content Accessibility Guidelines and the World Wide Web Consortium.
• HTML faults: HTML tag opening and closing errors.
• Link errors: page link errors.
• Page link status: page link status according to the benchmark.
• Response time testing: the response time of each page link.

Based on the above criteria we evaluated four Bangladeshi
e-commerce sites (Hutbazar, Bazarsodai, ClickBD and Cellbazaar), as elaborated below.
Evaluation of Good and Bad Links of a Web Site
Broken hyperlinks on websites are not just annoying; their
existence may cause real damage to an e-commerce online
business as well as to its reputation. Search engines might stop
crawling an e-commerce site if broken links are found. Our
integrated tool traverses all the hyperlinks and finds the page
statuses that are shown in table 2. The corresponding graphs are
shown in figures 3 and 4.
Table 2: Good and bad links of the four e-commerce sites
Site          Bad Links   Good Links   Total Links   Bad (%)   Good (%)
Hutbazar           1           55            56         1.79     98.21
Bazarsodai         5          101           106         4.72     95.28
ClickBD            0          121           121         0.00    100.00
Cellbazaar       331           14           345        95.94      4.06

Figure 3: Number of good vs. bad links of four sites

Figure 4: Percentage of good and bad links of four sites
Web Page Link Response Time
The end-to-end time elapsed in responding to a web page link is, in
other words, the time elapsed from the client's HTTP request until the
response page is rendered in the client's browser. We traversed
all the page links of a Web site and collected the page size (KB)
and response time (seconds) to evaluate the site. It was
observed that response time increases as the page size
increases, which is expected. The result for one such site is shown in figure 5.

Figure 5: Website Evaluation with all page links
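The per-link traversal can be sketched in Python as below: collect the anchors of the start page, fetch each linked page once, record its response time and size, and count good and bad links by whether the request succeeds. This is our simplified reading of the procedure (single pass, anchor tags only), not the released tool itself.

    import time
    import urllib.error
    import urllib.parse
    import urllib.request
    from html.parser import HTMLParser

    class LinkCollector(HTMLParser):
        """Collect href targets from anchor tags on a page."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl_site(start_url, timeout=10):
        """Measure response time (s) and page size (KB) of every page linked from start_url."""
        with urllib.request.urlopen(start_url, timeout=timeout) as resp:
            parser = LinkCollector()
            parser.feed(resp.read().decode("utf-8", errors="replace"))
        rows, good, bad = [], 0, 0
        for href in parser.links:
            link = urllib.parse.urljoin(start_url, href)    # resolve relative links
            start = time.perf_counter()
            try:
                with urllib.request.urlopen(link, timeout=timeout) as page:
                    body = page.read()
                rows.append((link, time.perf_counter() - start, len(body) / 1024.0))
                good += 1
            except OSError:
                bad += 1                                    # broken or unreachable link
        avg_time = sum(r[1] for r in rows) / len(rows) if rows else 0.0
        avg_size = sum(r[2] for r in rows) / len(rows) if rows else 0.0
        return rows, good, bad, avg_time, avg_size

The averages returned at the end correspond to the per-site figures reported in table 3.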

Website performance (avg. response time vs. avg. page size)
Web page link response time alone does not give much information
about a Web site, but knowing the average page response time
gives insight into the performance of an e-commerce site.
Moreover, if we want to compare the performance of different Web
sites, then average response time versus average page size
provides information about the quality of the e-commerce sites.
For example, if an e-commerce site has a smaller average page
size, and thus a lower response time, than other similar e-commerce
sites, then the site is good, and vice versa. Thus we calculated
the average response time and average page size of the four e-
commerce sites, as shown in table 3.
Table 3: Average response time and average page size of the four e-commerce sites
E-commerce Site   Avg. Response Time   Avg. Page Size
Hutbazar          2.56 seconds         1.85 KB
Bazarsodai        5.34 seconds         4.64 KB
CellBazar         6.40 seconds         6.39 KB
ClickBD           2.84 seconds         2.84 KB

Figure 6 shows the average page response time and figure 7 shows
the average page size. According to these graphs it is evident which of these four sites performs best.
This paper describes how automated web test integration,
according to the benchmarking of e-commerce web
sites [15][16], can provide a useful service for communities and
identifies areas in which additional automated performance
analysis is needed. However, the use of a number of different
techniques or web test methods has shown that there are
inconsistencies in the way they operate, which can result
in inconsistencies and difficulties in producing results. Our future
line of work is to test different benchmark suites [18]
for different kinds of tools. We can further extend this work to
identify other components of web site design for quality
assessment, which would further enable improving the design
as a part of the end user experience, emphasizing the
continuous improvement of the design aspect and promoting a
culture of performance excellence in web design.
Some points that are not addressed here, as they are beyond
the scope of this paper, are the new breed of Web applications
that utilize Ajax (Asynchronous JavaScript and XML)
instead of a page-centric view, and pages that change their
structure depending on the input. The next step is to investigate
the more changeable issues in this new breed of web applications
in order to improve the proposed test path generation approach and
to develop a prototype tool to execute the web testing model.

Figure 6: Average Page Response time

Figure 7: Average Page Size
This paper investigated various measures required for quality
Website design. A focused approach was taken to identify
page link errors, page size and page response time in
developing and testing e-commerce websites. This enables
us to judge the quality of the Web design of the
various sites and indicates where improvement
in the design of a Website is necessary. The integrated web testing tool
demonstrated an effective testing methodology for web applications
and improved the performance testing of web applications.

Using a series of online diagnostic tools, we examined many
dimensions of quality, and each dimension was measured by a
specific online test. To obtain results on the quality of a Web site,
we measured sample data extracted from different web sites
and calculated response time, page size, number of items,
load test, tag validation, broken links and number of links.
Moreover, because the ultimate determinant of a quality website
is its users, future directions for this research also involve the
objective and subjective views of the website from the user's
perspective. Finally, the practical experiment of applying
our methodology has been described. We believe that this
experiment provides encouraging results concerning the
validity, correctness and agility of the method.

The tool has been released online for public use. More
information about the tool can be found at the following


[1] C. Kallepalli, J. Tian, “Measuring and Modeling Usage and
Reliability for Statistical Web Testing”, IEEE Trans Software
Engineering, 2001,27(11), pp. 1023-1036.
[2] L. Xu, B. W. Xu, and Z.Q. Chen, “Survey of Web Testing”,
Computer Science (in Chinese), 2003, 30(3), pp. 100-104.
[3] Xu L, Xu BW, Chen HW. “Website Evolution based on Statistic
Data”, Proceedings of the Ninth IEEE International Workshop on
FutureTrends of Distributed Computing Systems (FTDCS 2003),
pp. 301-306.
[4] J. Gao, C. Chen, Y. Toyoshima and D. Leung, “Engineering on the
Internet for Global Software Production”, IEEE Computer, 1999,
32(5), pp. 38-47.
[5] F. Ricca and P. Tonella, “Web Site Analysis: Structure and
Evolution”, Proc. of International Conference on Software
Maintenance (ICSM'2000), 2000, pp. 76-86.
[6] Forrester Consulting, “eCommerce Web Site Performance Today:
An Updated Look At Consumer Reaction To A Poor Online Shop-
ping Experience” A commissioned study conducted on behalf of
Akamai Technologies, August 17, 2009
[7] Bo Song and Huaikou Miao, “Modeling Web Applications and
Generating Tests: “A Combination and Interactions-guided
Approach”, IEEE Computer, 2009, DOI 10.1109/TASE.2009, 54,
pp. 174-181.
[8] K. Y. Cai, “Optimal software testing and adaptive software testing
in the context of software cybernetics”, Information and Software
Technology, 2002, 44, pp. 841-855.
[9] D. Dhyani, W. K. Ng, and S. S. Bhowmick, “A survey of Web
metrics”, ACM Computing Surveys, 2002, 34(4), pp. 469-503.
[10] P. Warren, C. Boldyreff, and M. Munro, “The Evolution of
Websites”, Proc. of the Int. Workshop on Program
Comprehension, 1999, pp. 178-185.
[11] P. Warren, C. Gaskell, C. Boldyreff. Preparing the ground for
website metrics research. Proc of the 3rd International Workshop
on Web Site Evolution, 2001, pp. 78-85.
[12], last access on
January 15, 2012
[13] Gerrard, P. (2000a). Risk-Based E-Business Testing: Part 1 – Risks
and Test Strategy. Retrieved June 15, 2001, from the World Wide
Web: EBTestingPart1.pdf
[14], last
accessed on June 22, 2012
[15] 10 Best Practices for Benchmarking Web and Mobile Site
Performance, white paper: Web Performance Management,
Compuware Corporation World Headquarters, Detroit, MI 48226-
5099, © 2011 Compuware Corporation.
[16] Mario Milicevic, Krunoslav Zubrinic and Ivona Zakarija
“Dynamic Approach to the Construction of Progress Indicator for a
Long Running SQL Queries”, International Journal of Computers
Issue 4, Volume 2, 2008, pp. 489-496
[17] Turker C, Akal F, Schlapbach R, “Life sciences data and
application integration with B-fabric” Journal of Integrative
Bioinformatics, Volume 8, Issue 2, July 2011.
[18] Techniques for Web Content Accessibility Guidelines by W3C,
[19] G. Sreedhar and A.A. Chari, “An experimental Study to Identify
Qualitative Measures for Website Design”, Global Journal of
Computer Science and Technology, University ofWisconsin, USA,
September, 2009.
[20] G. Sreedhar, A.A. Chari and V. V. Venkata Ramana, “Evaluating
Qualitative Measures forEffective Website Design”, International
Journal on Computer Science and Engineering,vol.02, No.01S,
2010, pp.61-68.
[21] Ali Azad, “Elements of Effective Web Page Design”, Global
Competitiveness, January, 2001.
[22] L. Page, S. Brin, R. Motwani and T. Winograd, “The Page Rank
Citation Ranking: Bring Order to the Web”, Technical Report,
Stanford University, 1998.
[23] E. Glover, K. Tsioutisiouliklis, S. Lawrence, D. Pennock, G. Flake,
“Using Web Structure for Classifying and Describing Web Pages”,
in Proceedings of WWW2002, Hawaii, May2002.


Md. Safaet Hossain is a Master's student in the
Department of Electrical Engineering and Computer
Science, North South University, Bangladesh. He is currently
doing his thesis in the area of Software Engineering.
His interests are in software engineering, Web
engineering, software quality assurance and Web security
related problems.

Dr. Shazzad Hosain is an Assistant Professor in the
department of Electrical Engineering and Computer
Science (EECS) at North South University (NSU),
Bangladesh. His interests are in software testing, Web
data integration, Semantic Web, knowledge
representation, business and scientific workflow
systems, Web security, and bioinformatics related
problems. He is also interested in developing
microcontroller based systems that interface different
devices as well as small/heavy industries, Scientific
Computing, SCILAB, etc.

Performance Evaluation of Some Grid Programming Models
W. A. Awad
Mathematics & Computer Science Department, Faculty of Science, Port Said University, Egypt.
Scientific Research Group in Egypt (SRGE),
Abstract- Grid programming has the
properties and capabilities needed for managing
computation in a distributed
environment which is typically
heterogeneous and dynamic, in addition
to a deepening memory and
bandwidth/latency hierarchy. Grid
applications need to be heterogeneous
and dynamic to run on different types
of resources whose configuration may
change during run-time. Grid
programming models are high-level
programming paradigms that shield
users from the low-level
implementation details. Implementing
applications with Grid programming
paradigms is a big challenge: it must
achieve efficiency while overcoming the
time-consuming work that comes from
dealing with excessive low-level details
of the provided APIs. In this paper three
programming models are presented and
discussed: P-Grade,
JavaSymphony, and OpenGR. Finally,
an evaluation between the models is
carried out to find the best performance while
saving consumed time.
Keywords: Grid Computing, Grid
Programming, P-Grade, JavaSymphony
and OpenGR.
1. Introduction
Nowadays distributed
resources have become an important and
powerful computation platform, needed
owing to the increasing number of
high-bandwidth networks and the need
to decrease the cost of computing by
bringing powerful and cheap computing
power into the hands of more
individuals in the form of Commodity-
Off-The-Shelf (COTS) desktops and
servers [14]. A new paradigm of
distributed computing is Grid
computing; it provides easy access to
large, geographically shared computing
resources which are provided to a
large virtual organization; these
resources are geographically apart
while appearing to users as one
device [1]. These shared computing
resources include traditionally local
resources such as memory, storage and
CPUs. All these resources are
connected through the Internet and a
middleware software layer that provides
basic services for security, monitoring,
accessing information about
components, etc. [15]. Grid computing
is also one of many services
provided in Grid technology.
It provides secure services for
executing application jobs on
distributed computational resources,
individually or collectively. Some
examples of Grid computing are the NASA
IPG [9], the World Wide Grid [3], and
the NSF TeraGrid [19].
There are various types of Grid
computing to address multiple
application requirements which can be
summarized as:
Computational: A computational grid
is focused on setting aside resources
specifically for computing power. In
this type of grid most of the machines
are high-performance servers.
Scavenging: A scavenging grid is most
commonly used with large numbers of
desktop machines. Machines are
scavenged for available CPU cycles, and
owners control when their resources are available to
participate in the grid.
Data grid: A data grid is responsible
for housing and providing access to
data across multiple organizations.
Users are not concerned with where this
data is located as long as they have
access to the data. For example, you
may have two universities doing life
science research, each with unique data.
A data grid would allow them to share
their data, manage the data, and manage
security issues such as who has access
to what data[8].
As mentioned above, the Grid creates
a computing environment whose basic
characteristics are heterogeneity and
distribution, so the development of
applications needs a good
programming environment to avoid
time-consuming work [2].
2. Grid Programming Models
The main goal of Grid
programming is to manage computing
environments that are parallel,
distributed, heterogeneous and dynamic
in nature. In addition, grid
applications need to allocate resources
flexibly to services across those dynamic
environments. While it may
be possible to build grid applications
using established programming tools,
these tools are not particularly well-suited to
effectively manage flexible
composition or to deal with heterogeneous
hierarchies of machines, data and
networks with heterogeneous
performance [16]. A grid programmer
will have to map a computation onto
this environment, and will also have to
design the interaction between remote
services, data sources and hardware
resources.
Grid programming models consist
of tools, conventions, protocols,
language constructs and a set of
libraries that encapsulate useful
functionality [5]. Building effective
applications for the grid is a challenge
in the grid programming environment,
where the high-level programming
language paradigm has to shield users
from the low-level details
of each resource. A programming
model can be presented in many
different forms, e.g., a language, a
library API, or a tool with extensible
functionality. The main property of
successful programming models is
enabling both high performance and the
flexible composition and management
of resources. Programming models also
influence the entire software lifecycle:
design, implementation, debugging,
operation, maintenance, etc. Hence,
successful programming models should
also facilitate the effective use of all
manner of development tools, e.g.,
compilers, debuggers, performance
monitors, etc. [16]. Some examples of
models currently in use include
MPI for message passing, Condor for
high throughput computing, and HPF for
data-parallel programming.
2.1 Grid Programming Models
Some properties that grid programming
model environments should provide can be summarized
as follows:
1. Usability: Grid programming tools
and environments should support a wide
range of programming concepts and
paradigms, and, for the users, there
should be a low barrier to acceptance of
the tools and environment.
2. Dynamic adaptability: A grid
programmer should be able to adapt to
the dynamicity of the architecture or
configurations of the grid.
3. Portability: Grid programming
models and languages should allow grid
codes greater software portability.
4. Interoperability: Having an open
and extensible architecture, a grid may
support many different kinds of
protocols, services, application
programming interfaces, and software
development kits, and they should be
interoperable when appropriate.
5. Reliability: A grid user should be able
to check, recover from, or react to faults
of the system with the help of the
programming environment.
3. Related Work
In this section some related
research that focuses on grid
programming environments is
introduced. Wang et al. [21]
presented a high-level programming
environment for non-professional
end users called GSML (Grid Service
Markup Language); they also defined a
lower-level service-oriented
programming language for Grid
developers called Abacus. Currently
most Grid programming models
have evolved from the traditional parallel
and distributed programming models.
Foster and Karonis introduced
MPICH-G2 [22], a grid-enabled
implementation of MPI that
uses the Globus services (e.g., job
startup, security), automatically
converts data in messages sent
between machines of different
architectures, and supports
multiprotocol communication by
automatically selecting TCP for inter-
machine messaging and vendor-
supplied MPI for intra-machine messaging.
Tejedor and Badia [4]
implemented COMPSs, which provides a
superscalar model implementation
based on the Grid Component Model. As a
result, the runtime of COMPSs has
gained features such as
reusability, flexibility and separation of
concerns, which come from the
component-based programming model.
4. Grid Programming Models
This section handles three
programming models as examples of
Grid programming models, which are
P-Grade, JavaSymphony, and OpenGR.
We will present and discuss the
models and focus on some of their
important features.
1- P-Grade
P-GRADE is a high-level graphical
environment provided to
develop parallel applications for both
parallel systems and the Grid. It is
based on the interactive
execution of parallel programs in the
Grid. PVM and MPI code can be
generated according to the parallel
application's execution in the Grid
environment [13]. The main facility
supported by P-GRADE is multi-job
workflow execution for the Grid. This
workflow management provides
parallel execution at both the inter-job and
intra-job level. P-GRADE provides an
automatic checkpoint mechanism
that enables fault-tolerant workflow
execution.
Migration between different grid
sites is an important feature of PVM
applications generated by P-GRADE.
It guarantees reliable, fault-
tolerant parallel program execution in
the Grid system. These Grid sites have
to be monitored when the application
migrates among them; the
Mercury/GRM/PROVE Grid
applications play the role of monitoring
any parallel application launched and
analyzed by P-GRADE at run time.
In other words, P-GRADE is a
graphical programming environment.
Its major goal is to provide an easy-to-
use, integrated set of programming
tools for the development of generic
message-passing (MP) applications to
be run on both homogeneous and
heterogeneous distributed computing
systems such as supercomputers,
clusters and Grid systems.
From the viewpoint of the
researchers, some features of P-GRADE
can be summarized as follows:
1- It can support each stage of the
parallel program development life-
cycle through an integrated graphical
environment.
2- It can generate either PVM or MPI
code for execution on
supercomputers, clusters and in the
Grid.
3- Its applications do not need any
changes in the resources they run
on during run-time [13].
2- JavaSymphony
JavaSymphony is a high-level,
simple programming model that allows
the programmer to control parallelism,
load balancing, and locality at a high
level of abstraction in Grid
applications, while hiding the low-level
implementation details (i.e. RMI, sockets,
and threads) and error-prone tasks from the
programmer. JavaSymphony
Applications (JSAs) can be designed with
the JavaSymphony API. An application registers with
the JavaSymphony Runtime System
(JRS), allocates resources (machines
where JRS is active), distributes code as
Java objects among the resources, and
remotely invokes methods of these
objects. After all these usages, the
application un-registers and is removed
from the JRS's list of active
applications. JRS is implemented as an
agent-based system, with agents
running on each computing resource to
be used by the JSAs [11, 12]. Among the
most important services that the JS
programming paradigm introduces are
the dynamic virtual distributed
architectures (VAs), which allow the
developer to manage the resources in JS
applications, and JavaSymphony
remote objects (JS objects), which are
used to distribute data and code
between the resources. On the other
hand, there is an important feature that
is not supported by other grid
applications: the free
communication of the distributed
components with each other in the Grid.
Finally, the JS programming paradigm
handles other main features of grid
applications such as migration, a distributed
event mechanism, and distributed
synchronization mechanisms [7].
3- OpenGR
OpenGR is a new grid
programming environment for Remote
Procedure Call (RPC) based master-worker
programming. This environment is realized through
the use of a set of compiler directives,
and is implemented as a parallel
execution mechanism.
The Remote Procedure Call
(RPC) facility is widely used for
distributed computing. Several grid
computing systems also adopt RPC
(grid-enabled RPC) as the basic model
of computation [20]. The RPC-style
system is particularly useful in that it
provides an easy-to-use, intuitive
programming interface, allowing users
of the grid system to build grid-enabled
applications easily [10].
The OpenGR directives enable
existing sequential applications to
be readily adapted to the grid
environment as master-worker parallel
programs using the RPC architecture.
Although grid RPC provides a
suitable programming model for a grid
environment, additional tools are
required for implementing these grid
applications. For example, the functions
that will be executed on the server side
have to be decided, and modifying them
requires the grid RPC
application programming interface
(API). Another problem in the grid
environment is the heterogeneous
collection of resources at many sites; it
is necessary for the programmer to
check and know the hosting environment of
each and every server before the server
stubs can be deployed. This poses
obvious problems when a large number
of servers (or remote hosts) are involved.
OpenGR directives for C/C++ are
presented to solve the above problems,
providing an incremental
programming environment for grid
RPC programming with minimal source
code modification, and providing
hybrid parallelization for describing
parallelism. All of these methods
reduce the complexity and the work
of server stub shipping, and a
semi-automated server stub
install facility is provided as a part of the compiler.
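The master-worker execution structure that OpenGR's directives target can be illustrated with a minimal local sketch. The Python code below is not OpenGR and uses no grid RPC; it simply stands in for the pattern, with a local process pool playing the role of remote servers, so that the master dispatches independent calls and gathers their results.

    from concurrent.futures import ProcessPoolExecutor, as_completed

    def worker_task(x):
        """Stand-in for a function that would be executed remotely via one RPC call."""
        return x * x

    def master(inputs, max_workers=4):
        """The master farms out independent tasks and collects results as they complete."""
        results = {}
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(worker_task, x): x for x in inputs}
            for fut in as_completed(futures):
                results[futures[fut]] = fut.result()
        return results

    if __name__ == "__main__":
        print(master(range(10)))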
5. Evaluation of Programming Models
Nowadays, programming models
are becoming increasingly important and
need to be easier, more abstract,
and closer to the user. This paper
tries to find the best programming
model among the available models to
help both developers and end users.
Evaluation of the programming models is a
way to suggest the best one.
The runtime is one of the
important criteria that has to be
evaluated for grid models, because we
need to achieve a faster execution of the
application with these models. So the
evaluation needs to be applied to
the execution of different applications.
The speed-up of the executed
applications depends, on one side, on the
independence of and parallelism between
the tasks of the application, and, on the
other side, on the response time and
network bandwidth.
Based on the programming
models presented in the above
section, the evaluation needs a
simulation in order to measure the
performance of each model. The
models, i.e. P-Grade, OpenGR and
JavaSymphony, are evaluated by using a
similar workload simulation
environment of Job Sharing and Multi-
site. The comparison between the
performance of the above models is
done through a Java simulation that
simulates different grid workload
environments. This comparison is
summarized in table 1. The simulated
workloads contain ten CPU nodes
connected by 100 Mbps Ethernet. Each
node contains a 2 GHz processing unit
and 128 MB RAM, and runs the Linux
operating system.
For each node, the efficiency ratio
can be calculated by equation 1:

Efficiency (%) = ( t_s / (N · t_p) ) × 100        eq. (1) [17]

where t_s is the sequential run time
for the application, t_p is the parallel
run time, and N is the number of CPUs.
Table 1 - Comparison between the three programming models
(t_s = sequential run time in seconds, t_p = parallel run time in seconds, Eff. = efficiency)

                Single CPU             5 CPUs                 10 CPUs
Model           t_s   t_p   Eff.       t_s    t_p   Eff.      t_s    t_p   Eff.
OpenGR          1.5   1.5   100%       7.5    1.6   93.8%     16.0   1.9   78.9%
JavaSymphony    2.3   2.3   100%       11.5   2.5   92%       23.6   3.0   76.7%
P-Grade         2.8   2.8   100%       14.0   3.3   84.8%     27.5   4.1   68.2%
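As a quick check of equation 1, the short Python snippet below recomputes the efficiency for the (t_s, t_p) pairs reported in the middle column of Table 1, assuming (as reconstructed above) that this column corresponds to five CPUs.

    def efficiency(t_seq, t_par, n_cpus):
        """Efficiency (%) = t_seq / (n_cpus * t_par) * 100, as in eq. (1)."""
        return 100.0 * t_seq / (n_cpus * t_par)

    # (t_seq, t_par) pairs taken from the 5-CPU column of Table 1.
    cases = {
        "OpenGR":       (7.5, 1.6),
        "JavaSymphony": (11.5, 2.5),
        "P-Grade":      (14.0, 3.3),
    }
    for model, (ts, tp) in cases.items():
        print("%-13s %.1f%%" % (model, efficiency(ts, tp, 5)))
    # Prints 93.8%, 92.0% and 84.8%, matching the efficiencies reported in Table 1.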

As shown in the comparison in
table 1, we can see that the efficiency of
the three models did not change when
one resource is used, but when the
distribution of resources increases, the
time and the efficiency are affected. For
example, it is clear that the more the
resources are distributed, the more
time is consumed, and this is greatly
reflected in the sequential time. The
importance of resource distribution
becomes clear, however, when we follow the parallel
time, where the efficiency does not change
as sharply. Finally,
OpenGR is found to be the best
programming model in efficiency
and in saving time. The following
figures show the comparison charts
between the programming models.
Fig. 1- Comparison between Efficiency of the Programming Models

Fig. 2- Comparison between Sequential Run-Time of the Programming Models

Fig. 3- Comparison between Parallel Run-Time of the Programming Models
6. Conclusion
End users of Grid computing
systems do not need to deal with the
low-level implementation details of
applications. They need high-level
programming models that shield
them from these details; these
models are the Grid programming
models. The big challenge faced
by Grid programming paradigms
is achieving efficiency and
overcoming the time-consuming work
that comes from dealing with
excessive low-level details of the
provided APIs. In this paper we
presented three models as examples of
grid programming models: P-Grade,
JavaSymphony, and OpenGR. Evaluation
of these programming models is a way to
suggest the best one of them. The
runtime is one of the important
criteria that has to be evaluated,
because we need to achieve a faster
execution of the application. After
evaluation, OpenGR is found to be
the best programming model in
efficiency and in saving time, with
fixed bandwidth and Internet
connection speed.
[1] Azzedin; F, Mahewaran; M,
Integrating trust into grid
resource management systems,
in: Proceedings of ICPP 2002,
[2] Bai; X., et. al, Grid Programming
visited on 28 August 2012, 11
[3] Buyya; R, The World-Wide Grid.
Available from:
/ecogrid/wwg/, visited on 25
August 2012, 10 Am.
[4] Chu; X, et. al, A Novel Approach
for Realising Superscalar
Programming Model on Global
Grids, Proceedings of the 10th
International Conference on
High-Performance Computing
in Asia-Pacific Region (HPC
Asia 2009), published on 02
May 2009.
[5] Govindaraju; M, et. al, XCAT
2.0: A Component-Based
Programming Model for Grid
Web Services, I n
Grid 2002,3rd International
Workshop on Grid
Computing , 2002,
u/ xcat, visited on 30 July 2012,
02 PM.
[6] He; G, et al., Research on User
Programming Environment in
Grid, Grid and Cooperative
Computing, pp. 778-785, 2004.
[7] Hirano; M, et al., OpenGR: A
directive-based grid
programming environment,
Parallel Computing, Vol. 31,
Issue 10, pp. 1140-1154, 2005.
[8] Jacob; B., et al., Introduction to
Grid Computing, International
Technical Support
Organization, IBM Redbooks ,
December 2005.
[9] Johnston; W, et. al, Grids as
production computing
environments: The engineering
aspects of NASA_s information
power grid, in: Proceedings of
8th IEEE International
Symposium on High
Performance Distributed
Computing, Redondo Beach,
CA, 1999.
[10] Jugravu; A, Fahringer; T,
JavaSymphony, a programming
model for the Grid, Future
Generation Computer Systems,
Vol. 21, Issue. 1, pp. 239
[11] Jugravu; A, Fahringer; T,
Javasymphony: New Directives
to control and synchronize
locality, parallelism, and load
balancing for Cluster and Grid-
Computing, Proceeding in JGI
'02 Proceedings of the 2002
joint ACM-ISCOPE conference
on Java Grande, 2002, pp. 8-17.
[12] Jugravu; A, Fahringer; T, On
the implementation of
JavaSymphony, Proceedings of
IPDPS '03, the 17th International
Symposium on Parallel and
Distributed Processing, IEEE
Computer Society, Washington,
DC, USA, pp. 34-43, 2003.
[13] Kacsuk; P, et al., P-GRADE:
A Grid Programming
Environment, Journal of Grid
Computing, Vol. 1, No. 2, pp.
171-197, 2003.
[14] Khoo; B, et al., A multi-
dimensional scheduling scheme
in a Grid computing
environment, J. Parallel Distrib.
Comput., Vol. 67, Issue 6, pp.
673, 2007.
[15] Kielmann; T, Programming
Models for Grid Applications
and Systems, proceeding in
Modern Computing, 2006. JVA
'06. IEEE John Vincent
Atanasoff 2006 International
Symposium, Vrije Universiteit,
Amsterdam, 3-6 October 2006,
[16] Lee; C, Talia; T, Grid
Programming Models: Current
Tools, Issues and Directions,
Proceeding in PDCAT'04
Proceedings of the 5th
international conference on
Parallel and Distributed
Computing: applications and
Technologies, pp. 868-871,
[17] Nieuwpoort; R, et. al., Satin:
Simple and Efficient Java-
based Grid Programming,
Scientific International Journal
for Parallel and Distributed
Computing, Vol. 6, No. 3, pp.
19-32, 2005.
[18] Shu; C, et al., Towards an End-
User Programming
Environment for the Grid,
Lecture Notes in Computer
Science, Vol. 3795, pp. 345-356,
2005.
[19] Tangpongprasit; S, et al., A
time-to-live based reservation
algorithm on fully decentralized
resource discovery in Grid
computing, Parallel Computing,
Vol. 31, Issue 6, pp. 529-543.
[20] Tanaka; Y, et al.,
Implementation and evaluation
of GridRPC based on Globus,
Parallel Computing, Vol. 31,
Issue 10, pp. 165-170.
[21] Wang, L., et al., Abacus: A
Service-Oriented Programming
Language for Grid
Applications, Proceedings of
SCC '05, the 2005 IEEE
International Conference on
Services Computing, Vol. 1,
pp. 225-232.
[22] Karonis; N, et al., MPICH-G2:
A Grid-Enabled Implementation
of the Message Passing
Interface, Journal of Parallel
and Distributed Computing,
Special issue on computational
grids, Vol. 63, Issue 5, pp. 551-
563, 2003.


Comprehensive Analysis of π Base Exponential
Functions as a Window

Mahdi Nouri, Sepideh Lahooti, Sepideh Vatanpour, and Negar Besharatmehr
Dept. of Electrical Engineering, Iran University of Science & Technology / A.B.A Institute of Higher Education, Abeyek, Iran

Abstract—A new simple-form window, with application to FIR
filter design, based on the π-base exponential function is proposed
in this article. The improved window has a closed simple
formula, is symmetric, and improves the ripple ratio in
comparison with the Kaiser and cosine hyperbolic windows. The
proposed window has been derived in the same way as the Kaiser
window, but its advantage is that it has no power series expansion in its
time domain representation. Simulation results show that the
proposed window provides better ripple ratio characteristics,
which are important for some applications. A comparison with
the Kaiser window shows that the proposed window reduces the ripple
ratio by about 6.4 dB for the same mainlobe width. Moreover, in
comparison to the cosine hyperbolic window, the proposed window
decreases the ripple ratio by about 6.5 dB. The proposed
window can realize different criteria of optimization and has a
lower cost of computation than its competitors.
Keywords: Window functions; Kaiser window; FIR filter design; Cosine hyperbolic window
FIR filters are particularly useful for applications where
exact linear phase response is required. The FIR filter is
generally implemented in a non-recursive way which
guarantees a stable filter. FIR filter design essentially consists
of two parts, approximation problem and realization problem.
The approximation stage takes the specification and gives a
transfer function through four steps [1,2]. They are as follows:
1) A desired or ideal response is chosen, usually in the
frequency domain.
2) An allowed class of filters is chosen (e.g. the length N
for a FIR filters).
3) A measure of the quality of approximation is chosen.
4) A method or algorithm is selected to find the best
filter transfer function.
The realization part deals with choosing the structure to
implement the transfer function which may be in the form of
a circuit diagram or in the form of a program. The three
essential, well-known methods for FIR filter design are the
window method, the frequency sampling technique and
optimal filter design methods [2]. The basic idea behind the
window method is to choose a proper ideal frequency-selective filter,
which always has a noncausal, infinite-duration impulse
response h_d[n], and then truncate (or window) this impulse response
to obtain a linear-phase and causal FIR filter [3]:

h[n] = h_d[n] · w[n],   where w[n] = w(n) for 0 ≤ n ≤ M and w[n] = 0 otherwise        (1)

Here w(n) is a function of n, (M+1) is the window length, and h[n] is
represented as the product of the desired response h_d[n] and a
finite-duration "window" w[n]. So the Fourier transform of
h[n], H(e^jω), is the periodic convolution of the desired
frequency response H_d(e^jω) with the Fourier transform of the
window, W(e^jω). Thus, H(e^jω) will be a spread version of
H_d(e^jω). Fourier transforms of windows can be expressed as sums
of frequency-shifted Fourier transforms of the rectangular
window. Two desirable specifications for a window function
are a small main-lobe width and good side-lobe rejection
(small ripple ratio). However, these two requirements conflict,
since for a given length a window with a narrow
main lobe has poor side-lobe rejection, and vice versa. The
rectangular window has the narrowest mainlobe; it yields the
sharpest transition of H(e^jω) at a discontinuity of H_d(e^jω).
By tapering the window smoothly to zero, side lobes are
greatly reduced in amplitude [2]. By increasing M, W(e^jω)
becomes narrower, and the smoothing provided by W(e^jω) is
reduced. The large sidelobes of W(e^jω) result in
undesirable ringing effects in the FIR frequency response
H(e^jω), and also in relatively larger sidelobes in H(e^jω). So
windows that do not contain abrupt discontinuities in
their time-domain characteristics, and that have correspondingly
low sidelobes in their frequency-domain characteristics, are
required [3].
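As an illustration of Eq. (1), the sketch below designs a lowpass FIR filter of length M+1 by multiplying the (delayed) ideal sinc impulse response by a window. The cutoff frequency and the choice of a Hamming window here are ours, purely for illustration.

    import numpy as np

    def windowed_lowpass(M, wc, window=None):
        """Truncate the ideal lowpass impulse response h_d[n] with a window w[n], per Eq. (1)."""
        n = np.arange(M + 1)
        # Ideal lowpass impulse response, delayed by M/2 so the result is causal and linear phase.
        hd = wc / np.pi * np.sinc(wc * (n - M / 2) / np.pi)
        w = np.hamming(M + 1) if window is None else window   # any length M+1 window
        return hd * w

    h = windowed_lowpass(M=50, wc=0.4 * np.pi)
    print(h.shape, round(h.sum(), 3))   # 51 taps; DC gain close to 1 for a lowpass design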
There are different kinds of windows, and the best one
depends on the required application. Windows can be
categorized as fixed or adjustable [9]. Fixed windows have
only one independent parameter, namely the window length,
which controls the main-lobe width. Adjustable windows have
two or more independent parameters, namely the window
length, as in fixed windows, and one or more additional
parameters that can control other window characteristics.
The Kaiser window is a kind of two-parameter window that
has maximum energy concentration in the mainlobe; its parameters
control the mainlobe width and ripple ratio [4,8,9]. In this
paper an improved two-parameter window based on the
exponential function is proposed, which gives a better ripple
ratio and a lower sidelobe level (by 6.42 dB) compared to the Kaiser
and cosine hyperbolic windows, while having equal mainlobe
width. Its computation is also reduced because it has no
power series.
Windows are time-domain weighting functions that result
from the truncation of a Fourier series. They are
utilized in a variety of signal processing applications
including power spectral estimation, beamforming, signal
analysis and estimation, digital filter design and speech
processing. In spite of their maturity, window functions (or
windows for short) continue to find new roles in the
applications of today. The best window depends on the
application. Very recently, windows have been used to facilitate
the detection of irregular and abnormal
heartbeat patterns in patients in electrocardiograms [1].
Medical imaging systems, such as ultrasound, have also
shown enhanced performance when windows are used to
improve the contrast resolution of the system [2]. Windows
have also been utilized to aid in the classification of cosmic
data [3, 4] and to improve the consistency of weather
prediction models [5]. All windows have an independent parameter,
namely the window length, which controls the main-lobe width;
adjustable windows have one or more additional parameters that can control other
window characteristics [6, 7, 8, 9, 12]. The Dolph-Chebyshev
window [10] has two parameters and produces the minimum
main-lobe width for a specified maximum side-lobe level. The
Kaiser window [8, 9] has two parameters and achieves close
approximations to the discrete prolate spheroidal functions that have
maximum energy concentration in the mainlobe. The Kaiser
and Dolph-Chebyshev windows can control the amplitude of the
sidelobes relative to that of the mainlobe. The Kaiser window is a
well-known flexible window, extensively used for FIR
filter design and spectrum analysis applications [2], since it
achieves a close approximation to the discrete prolate spheroidal
functions that have maximum energy concentration in the mainlobe
by adjusting its two independent parameters. Among adjustable
windows, the Kaiser window has a better sidelobe roll-off
characteristic than the other well-known adjustable windows.
The paper is organized as follows: Section II presents the
characterization of windows used to assess window
performance, and introduces the Cosh and Kaiser windows.
Section III introduces the proposed window, presents
numerical simulations and discusses the final results. Section
IV shows the time required to compute the window coefficients
for the Cosh, Kaiser and proposed windows. Section V gives
a numerical comparison example for filters designed using the
proposed and Kaiser windows. Finally, the conclusion is given in
Section VI, after which the paper is equipped with the related references.

Figure 1. A typical window’s normalized amplitude spectrum
A window, w(nT), with a length of N is a time-domain
function which is defined by:

w(nT) ≠ 0 for |n| ≤ (N − 1)/2,   w(nT) = 0 otherwise        (2)
Windows are generally compared and classified in terms of
their spectral characteristics. The frequency spectrum of w(nT)
can be introduced as [7]:

W(e^jωT) = w(0) + 2 · Σ_{n=1}^{(N−1)/2} w(nT) · cos(ωnT)        (3)

where W(e^jωT) is called the amplitude function, N is the
window length, and T is the time spacing between samples.
Two parameters of windows in general are the null-to-null
width B_N and the main-lobe width B_M. These quantities are
defined as B_N = 2ω_N and B_M = 2ω_M, where ω_N and ω_M are the
half null-to-null and half mainlobe widths, respectively, as
shown in Fig. 1. An important window parameter is the ripple
ratio r, which is defined as

r = (maximum side-lobe amplitude) / (main-lobe amplitude)        (4)

Since r is a small quantity less than unity, it is convenient to work with
the reciprocal of r expressed in dB, which is

R = 20 · log10(1 / r)        (5)


R is interpreted as the minimum side-lobe attenuation relative to
the main lobe, and −R is the ripple ratio in dB. The side-lobe
roll-off ratio s is defined as

s = S1 / SL        (6)

where S1 is the amplitude of the largest side lobe and SL is the amplitude
of the lowest side lobe, the one furthest from the main lobe. If S is the
side-lobe roll-off ratio in dB, then s is given by

s = 10^(S/20)        (7)

These spectral characteristics are important performance
measures for windows.
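These quantities can be estimated numerically from a zero-padded FFT of the window coefficients: take the mainlobe peak, find the first spectral null, and compare the largest value beyond it, per Eqs. (4) and (5). The sketch below is our own illustration, not taken from the paper.

    import numpy as np

    def ripple_ratio_db(w, nfft=8192):
        """Return R = 20*log10(mainlobe peak / largest sidelobe peak) for a window w."""
        W = np.abs(np.fft.rfft(w, nfft))       # one-sided amplitude spectrum
        main_peak = W[0]                       # symmetric positive window: peak at omega = 0
        k = 0
        while k < len(W) - 1 and W[k + 1] < W[k]:
            k += 1                             # walk down to the first spectral null
        sidelobe_peak = W[k:].max()            # largest amplitude beyond the first null
        return 20.0 * np.log10(main_peak / sidelobe_peak)

    print(round(ripple_ratio_db(np.hamming(51)), 1))   # roughly 41-43 dB for a Hamming window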

A. Kaiser window
The Kaiser window is one of the most useful and optimum
windows. It is optimum in the sense of providing the smallest
mainlobe width for a given stopband attenuation, which implies
the sharpest transition width [1]. The trade-off between the
mainlobe width and the sidelobe area is quantified by seeking the
window function that is maximally concentrated around ω = 0 in
the frequency domain [2]. The modified Bessel function of the
first kind and order zero is given by the power series

I0(x) = 1 + Σ_{k=1}^{∞} [ (1/k!) · (x/2)^k ]²        (8)

In the discrete-time domain, the Kaiser window is defined by [5]:

w(nT) = I0( α · sqrt(1 − (2n/(N−1))²) ) / I0(α)   for |n| ≤ (N − 1)/2, and 0 otherwise        (9)

where α is the shape parameter, N is the length of the window
and I0(x) is the modified Bessel function of the first kind of
order zero.
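A direct numerical transcription of Eq. (9), using numpy's modified Bessel function i0, might look as follows (symmetric index n running from -(N-1)/2 to (N-1)/2, assuming odd N so there is a sample at n = 0).

    import numpy as np

    def kaiser_window(N, alpha):
        """Kaiser window of odd length N with shape parameter alpha, per Eq. (9)."""
        n = np.arange(N) - (N - 1) / 2                  # symmetric index -(N-1)/2 ... (N-1)/2
        arg = alpha * np.sqrt(1.0 - (2.0 * n / (N - 1)) ** 2)
        return np.i0(arg) / np.i0(alpha)

    w = kaiser_window(51, alpha=6.0)
    print(np.allclose(w, np.kaiser(51, 6.0)))           # matches numpy's built-in Kaiser window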

B. Cosh Window
The hyperbolic cosine of x is expressed as:

() ∑


Fig. 2, shows that the functions Cosine hyperbolic(x) and
(x) have the same Fourier series characteristics [7]. Cosine-
hyperbolic window is proposed as:

() {





This window provides a better sidelobe roll-off ratio, but a
worse ripple ratio, than the Kaiser window for the same window
length and mainlobe width. It has the advantage of having no
power series expansion in its time domain function, so the
cosine hyperbolic window requires less computation than the
Kaiser window.
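A corresponding sketch (again not from the paper) of the cosine hyperbolic window; unlike the Kaiser window it needs no Bessel-series evaluation.

import numpy as np

def cosh_window(N, alpha):
    """Cosine hyperbolic window of length N with shape parameter alpha."""
    n = np.arange(N) - (N - 1) / 2.0
    return np.cosh(alpha * np.sqrt(1.0 - (2.0 * n / (N - 1)) ** 2)) / np.cosh(alpha)

w = cosh_window(51, alpha=6.0)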

In some applications of FIR filters it is necessary to reduce
the level of the sidelobes below −45 dB. The goal of this work is
to find a window with a simple closed-form formula, having an
equal mainlobe width and a smaller sidelobe peak compared to
the other windows. The proposed window, given by Eqs. (13)
and (14), is obtained by replacing the modified Bessel function
of the Kaiser window with an exponential function with base π,
in terms of the window length N and an adjustable shape
parameter α_p. The FIR filter is designed with the new window
to evaluate its efficiency. The value of α_p that gives a better
ripple ratio for the proposed window is given by Eq. (15).

From Fig. 2 it can easily be seen that, for the proposed
window, increasing α_p increases the mainlobe width and
decreases the ripple ratio. Fig. 3 shows the relationship between
the shape parameter and the ripple ratio for the proposed
window. From this figure, the ripple ratio remains almost
constant for a change in the window length. For some
applications, such as spectrum analysis, design equations which
define the window parameters in terms of the spectrum
parameters are required. From Fig. 3, an approximate
relationship for the adjustable shape parameter α_p can be found
in terms of the ripple ratio (R) by using the curve fitting method,
as given by Eq. (16).
The approximation model for the adjustable shape parameter
given by Eq. (16) is plotted in Fig. 4. It can be seen that the
model provides a good approximation, with the error plotted in
Fig. 4. The largest deviation in α_p is 0.1, which corresponds to
an error of 0.4 dB in the actual ripple ratio.

Figure 2. Proposed window spectrum in dB for α_p = 0 and 2, with N = 50

Figure 3. The relation between α_p and R for the proposed window
For the Kaiser model, the largest deviation in
α is 0.07, but this corresponds to an error of 0.44 dB in the
actual ripple ratio. More accurate results can be obtained by
restricting the range but, like the Kaiser model, the proposed
model is adequate for most applications.
The second design equation is the relation between the
window length and the ripple ratio. To predict the window
length (N) for given values of the ripple ratio (R) and half
mainlobe width (ω_M), the normalized width D = 2ω_M(N − 1) is
used. The relation between the normalized width and the ripple
ratio for the proposed window with N = 51 is given in Fig. 5. By
using the curve fitting method on Fig. 5, an approximate design
relationship between the normalized width (D) and the ripple
ratio (R) can be established, as given by Eq. (17).
The approximation model for the normalized width given by
Eq. (17) is plotted in Fig. 5. The relative error of the
approximated normalized width in percent versus the ripple ratio
for N = 50 is also plotted. The percentage error in the model
changes between 0.2 and −0.25.

Figure 4. Error curve of the approximated α_p for N = 51

Figure 5. Relation between the ripple ratio and D for the cosine hyperbolic and
proposed windows with N = 50
This error range satisfies the error criterion which states
that the predicted error in the normalized width must be smaller
than 1%. An integer value of the window length (N) can then be
predicted from Eq. (18).
To find a suitable window which satisfies a given
prescribed filter specification, it is necessary to obtain the
relation between the window parameters and the filter
parameters. Fig. 6 shows the relation between the window
adjustable parameter (α_p) and the minimum stop band
attenuation (A_s) for N = 50. It is seen that as the window
parameter increases, the minimum stop band attenuation also
increases. By using the curve fitting method, an approximate
expression can be found as a first filter design equation, Eq. (19).









Figure 6. Relation between α_p and the minimum stop band attenuation for the
proposed window with N = 50

Figure 7. Error curve of the approximated α_p given by Eq. (19) versus A_s for
N = 50
The approximation model for the adjustable shape
parameter given by Eq. (19) is plotted in Fig. 6. It is seen that the
model provides a good approximation, with the error plotted in
Fig. 7.

A. Kaiser window
In the discrete time domain, the Kaiser window is defined by

w(nT) = \begin{cases} \dfrac{I_0\left(\alpha\sqrt{1-\left(\frac{2n}{N-1}\right)^{2}}\right)}{I_0(\alpha)}, & |n| \le \frac{N-1}{2} \\ 0, & \text{otherwise} \end{cases}

where \alpha is the adjustable parameter and I_0(x) is the modified
Bessel function of the first kind of order zero, which can be
described by the power series expansion given in Section II.

Figure 8. Comparison between the proposed and Kaiser windows with N=50




As known from fixed windows such as the rectangular
and Hamming windows, as the window length increases the
mainlobe width decreases but the ripple ratio remains almost
constant. As for the adjustable parameter, a larger value of α
results in a wider mainlobe width and a smaller ripple ratio.
Fig. 8 shows the comparison of the proposed and Kaiser
windows for N = 51 and α = 6. It can be observed that both
windows have almost the same ripple ratio, but the proposed
window has a narrower mainlobe width. By decreasing the
shape parameter to 5.6, both windows have the same mainlobe
width and the proposed window has a smaller ripple ratio.

Figure 9. Comparison between the proposed and cosine hyperbolic
windows with N = 50

B. Cosine Hyperbolic Window
The cosine hyperbolic window is defined by

w(nT) = \begin{cases} \dfrac{\cosh\left(\alpha\sqrt{1-\left(\frac{2n}{N-1}\right)^{2}}\right)}{\cosh(\alpha)}, & |n| \le \frac{N-1}{2} \\ 0, & \text{otherwise} \end{cases}

From Fig. 8, the functions cosh(x) and I_0(x) have the same
shape characteristic. Fig. 9 shows the comparison of the
proposed window with the cosh window for N = 50. It can be
observed that both windows have almost the same mainlobe,
but the proposed window has a smaller ripple ratio.

FIR filter design is almost entirely restricted to discrete time
implementations. The design techniques for FIR filters are
based on directly approximating the desired frequency response
of the discrete time system [2]. In order to show the efficiency
of the proposed window and compare the results with the other
windows, an example of designing an FIR low pass filter by
windowing the ideal low pass impulse response is considered.
For a cut-off frequency of ω_c, the impulse response of an ideal
low pass filter is:

h_d(n) = \begin{cases} \dfrac{\sin(\omega_c n)}{\pi n}, & n \ne 0 \\ \dfrac{\omega_c}{\pi}, & n = 0 \end{cases}
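A small sketch (not taken from the paper) of the window method itself: multiply this ideal impulse response, centred for a length-N filter, by the chosen window. NumPy's Kaiser window stands in here because the closed form of the proposed window is not reproduced above, and the cutoff value is an arbitrary illustration.

import numpy as np

def ideal_lowpass(N, wc):
    """Ideal lowpass impulse response (cutoff wc rad/sample), centred for length N."""
    n = np.arange(N) - (N - 1) / 2.0
    return (wc / np.pi) * np.sinc(wc * n / np.pi)   # equals sin(wc*n)/(pi*n), wc/pi at n = 0

def fir_by_windowing(N, wc, window):
    """Window-method FIR design: weight the truncated ideal response by the window."""
    return ideal_lowpass(N, wc) * window

N, wc = 50, 0.3 * np.pi                              # illustrative values
h = fir_by_windowing(N, wc, np.kaiser(N, 6.0))
H_db = 20 * np.log10(np.abs(np.fft.rfft(h, 4096)) + 1e-12)   # magnitude response in dB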




Figure 10. The filters designed by the proposed and Kaiser windows
for ω = 0.4 and ω = 0.2 rad/sample with N = 50
In this paper an improved class of window family based on
an exponential function with base π is proposed. The proposed
window has been derived in the same way as the Kaiser
window, but it has the advantage of having no power series
expansion in its time domain function. The spectrum
comparisons with the Kaiser window for the same window
length and normalized width show that the proposed window
provides a better ripple ratio than the Kaiser window and a
larger sidelobe roll-off ratio, which may be useful for some
applications. The last spectrum comparison is performed with
the cosine hyperbolic window, and two specific examples
indicate a narrower mainlobe width and a smaller ripple ratio for
the proposed window. Moreover, the paper presents the
application of the proposed window in the area of FIR filter
design. The filter design equations for the proposed window to
meet a given lowpass filter specification are established and the
comparison with the Kaiser window is discussed. The simulation
results show that the filters designed by the proposed window
provide better minimum stopband attenuation and also
significantly better maximum stopband attenuation than the
filters designed with the other windows.

[1] P. Shukla, V. Soni, and M. Kumar, “Nonrecursive Digital FIR Filter Design
by 3-Parameter Hyperbolic Cosine Window: A High Quality Low
Order Filter Design,” International Conference on Communications and
Signal Processing (ICCSP), 2011.
[2] S. R. Seydnejad and R. I. Kitney, ―Real-time heart rate variability
extraction using the Kaiser window,‖ IEEE Trans. On Biomedical
Engineering, vol. 44, no. 10, pp. 990–1005, 1997.
[3] R. M. Rangayyan, Biomedical Signal Analysis: A Case-Study Approach,
Wiley-IEEE Press, New York, NY, USA, 2002.
[4] S. He and J.-Y. Lu, ―Sidelobe reduction of limited diffraction beams
with Chebyshev aperture apodization,‖ Journal of the Acoustical Society
of America, vol. 107, no. 6, pp. 3556–3559, 2000.
[5] E. Torbet, M. J. Devlin, W. B. Dorwart, et al., “A measurement of the
angular power spectrum of the microwave background made from the
high Chilean Andes,” The Astrophysical Journal, vol. 521, pp. L79–L82,
1999.
[6] B. Picard, E. Anterrieu, G. Caudal, and P. Waldteufel, ―Improved
windowing functions for Y-shaped synthetic aperture imaging
radiometers,‖ in Proc. IEEE International Geoscience and Remote
Sensing Symposium (IGARSS ’02), vol.5, pp. 2756–2758, Toronto, Ont,
Canada, June 2002.
[7] P. Lynch, ―The Dolph-Chebyshev window: a simple optimal filter,‖
Monthly Weather Review, vol. 125, pp. 655–660, 1997.
[8] J. F. Kaiser, ―Nonrecursive digital filter design using I0-sinh window
function.,‖ in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS ’74),
pp. 20–23, San Francisco, Calif, USA, April 1974.
[9] T. Saramäki, “A class of window functions with nearly minimum
sidelobe energy for designing FIR filters,‖ in Proc. IEEE Int. Symp.
Circuits and Systems (ISCAS ’89), vol. 1, pp. 359– 362, Portland, Ore,
USA, May 1989.
[10] F. J. Harris, ―On the use of windows for harmonic analysis with the
discrete Fourier transform,‖ Proceedings of the IEEE,
vol. 66, no. 1, pp. 51–83, 1978.
[11] R. L. Streit, ―A two-parameter family of weights for nonrecursive digital
filters and antennas,‖ IEEE Trans. Acoustics, speech, and Signal
Processing, vol. 32, no. 1, pp. 108–118, 1984.
[12] A. G. Deczky, ―Unispherical windows,‖ in Proc. IEEE Int. Symp.
Circuits and Systems (ISCAS ’01), vol. 2, pp. 85–88, Sydney, NSW,
Australia, May 2001.

Enhanced Techniques for PDF Image
Segmentation and Text Extraction

D. Sasirekha
Research Scholar, Computer Science
Karpagam University
Coimbatore, Tamilnadu, India

Dr. E. Chandra
Director, Dept. of Computer Science
Dr. SNS Rajalakshmi College of Arts and Science,
Coimbatore, Tamilnadu, India

Abstract— Extracting text objects from PDF images is a
challenging problem. The text data present in PDF images
contain useful information for automatic annotation, indexing,
etc. However, variations of the text due to differences in text
style, font, size, orientation and alignment, as well as complex
document structure, make the problem of automatic text
extraction extremely difficult and challenging. This paper
presents two techniques under block-based classification. After
a brief introduction to the classification methods, the two
methods are enhanced and their results are evaluated. The
performance metrics for segmentation and the time consumption
are tested for both models.
Keywords- Block based segmentation, Histogram based, AC
Coefficient based.

With the drastic advancement in computer and communication
technology, modern society is entering the information age.
Moving away from the traditional paper document system,
people now widely use electronic documents (such as the PDF
format) for communication and storage. For complex documents,
however, it is difficult to accurately identify the required
information directly from the document image, so the document
is preprocessed before further use. Image segmentation has
therefore become an important part of digital image processing.

In document image processing, segmentation is an important
research topic: it is the essential link between document image
pre-processing and higher-level character recognition. The
relatively effective and commonly used methods for document
image segmentation and classification include thresholding,
geometric analysis and other categories.

After segmentation, the text part is detected and extracted
for further processing. Earlier text extraction techniques were
developed only for monochrome documents [1]. These
techniques can be classified as bottom-up, top-down and hybrid.
Later, with the increasing need for color documents, further
techniques [2] have been proposed.

Segmentation techniques such as block based image
segmentation [3] are used extensively in practice. Under block
based segmentation, this paper compares (i) the AC-Coefficient
based technique and (ii) the Histogram based technique.

This paper is organized as follows: Section II gives a brief
introduction to PDF images. Section III reviews block based
segmentation. Section IV discusses in detail text extraction
using the proposed techniques. Section V discusses the
experimental results of the two models. Finally, Section VI
concludes the paper.

The PDF file is converted into images using available
commercial software, so that each PDF page is converted into
an image. From that image the text parts are segmented and
extracted for further processing.

The goal of segmentation is to simplify and/or change
the representation of an image into something that is more
meaningful and easier to analyze. Image segmentation is
typically used to locate objects and boundaries (lines, curves,
etc.) in images. More precisely, image segmentation is the
process of assigning a label to every pixel in an image such
that pixels with the same label share certain visual
characteristics.
Most recent research in this field is based on either layer
based or block based approaches. The block based segmentation
approach divides an image into blocks of regions (Fig. 1). Each
region follows approximate object boundaries and is made of
rectangular blocks. The size of the blocks may vary within the
same region to better approximate the actual object boundaries.

Block-based segmentation algorithms have been developed
mostly for grayscale or color compound images. For example,
in [4] text and line graphics are extracted from check images,
[5] proposed a block based clustering algorithm, and [6]
proposes a classification algorithm based on a threshold on the
number of colors in each block. Other approaches include a
document analysis system [7], local texture features [8],
detection using masks [9], and block classification for efficient
coding by thresholding DCT energy [10].

Figure 1: Block Based Segmentation
The following sub sections discuss the two techniques
(A) AC Coefficient based technique & (B) Histogram based
technique in block based segmentation

A. AC Coefficient based technique

The first model uses the AC coefficients introduced
during the Discrete Cosine Transform (DCT) to segment the
image into three block types: background, text and image
blocks [11][12][14][15]. The background block contains the
smooth regions of the image, the text/graphics block has a high
density of sharp edge regions, and the image block contains the
non-smooth part of the PDF image (Fig. 2). The AC energy is
calculated from the AC coefficients and is compared with a
user-defined threshold value to identify the background blocks
initially. The AC energy of a block s is calculated using
Equation (1):

E_s = \sum_{i=1}^{63} (Y_{s,i})^2        (1)

where Y_{s,i} is the estimate of the i-th DCT coefficient of the
block s, produced by JPEG decompression. When the E_s value
thus calculated is less than a threshold T_1, the block is grouped
as a smooth region; otherwise it is grouped as a non-smooth
region. After much experimentation with different images, the
thresholds T_1 and T_2 set to 20 and 70 respectively resulted in
better segmentation and were used in the further experiments.

Figure 2: AC Coefficient based segmentation
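A minimal sketch (not from the paper) of this first stage: computing the AC energy of each 8×8 DCT block and thresholding it into smooth/non-smooth blocks. The use of scipy.fftpack.dct and the default threshold value are assumptions made purely for illustration.

import numpy as np
from scipy.fftpack import dct

def block_ac_energy(block):
    """AC energy of an 8x8 pixel block: sum of squared AC (non-DC) DCT coefficients."""
    Y = dct(dct(block.astype(float), axis=0, norm='ortho'), axis=1, norm='ortho')
    Y[0, 0] = 0.0                      # discard the DC coefficient
    return np.sum(Y ** 2)

def classify_smooth(image, T1=20.0, block=8):
    """Return a boolean map: True where a block is smooth (background), per Eq. (1)."""
    h, w = image.shape
    smooth = np.zeros((h // block, w // block), dtype=bool)
    for r in range(0, h - h % block, block):
        for c in range(0, w - w % block, block):
            e = block_ac_energy(image[r:r + block, c:c + block])
            smooth[r // block, c // block] = e < T1
    return smooth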

To further identify the image and text regions of the compound
image, the non-smooth blocks are considered and a 2-D feature
vector is computed from the luminance channel. Two features,
D_{s,1} and D_{s,2}, are determined.
It was reported by Konstantinides and Tretter (2000) [17]
that the code lengths of text blocks after entropy encoding tend
to be longer than those of non-text blocks, due to the higher
level of high-frequency content in these blocks. Thus, the first
feature, D_{s,1}, estimates the encoding length using Equation (3):

D_{s,1} = f(Y_{s,0} - Y_{s-1,0}) + \sum_{i=1}^{63} f(Y_{s,i})        (3)

where f(x) = \log(|x|) + 4 if |x| \ge 1, and f(x) = 0 otherwise.

The second feature, D_{s,2}, is a measure of how close a block is to
a two-colored block. For each block s, a two-color projection is
performed on the luminance channel. Each block is clustered
into two groups using the k-means clustering algorithm, with
means denoted by θ_{s,1} and θ_{s,2}. The two-color projection is
formed by clipping the luminance of each pixel to the mean of
the cluster to which that luminance value belongs. The l_2
distance between the luminance of the block and its two-color
projection is then a measure of how closely the block resembles
a two-color block. This projection error is normalized by the
square of the difference of the two estimated means,
|θ_{s,1} − θ_{s,2}|^2, so that a high contrast block has a higher chance
of being classified as a text block. The second feature is then
calculated as

D_{s,2} = \frac{\sum_i (X_{s,i} - X'_{s,i})^2}{|\theta_{s,1} - \theta_{s,2}|^2}

where X_{s,i} is the estimate of the i-th pixel of the block s,
produced by JPEG decompression, and X'_{s,i} is the value of the
i-th pixel of the two-color projection. If θ_{s,1} = θ_{s,2}, then
D_{s,2} = 0.
In the norm used for k-means clustering, the contributions of the
two features are weighted differently. Specifically, for a vector
D = [D_1, D_2], the norm is calculated by

||D|| = \sqrt{D_1^2 + \gamma D_2^2}

where γ = 15. All clusters whose mean value lies between the
two preset thresholds are grouped as text blocks; the rest of the
blocks are termed picture blocks.
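An illustrative sketch (not from the paper) of the two-color projection behind D_{s,2}: a simple 1-D 2-means clustering of the block's luminance values followed by the normalized projection error. The iteration count, initialization and epsilon guard are assumptions.

import numpy as np

def two_color_feature(block, iters=20, eps=1e-6):
    """D2: squared error between a block and its two-color projection,
    normalized by the squared difference of the two cluster means."""
    x = block.astype(float).ravel()
    t1, t2 = x.min(), x.max()                      # simple initialization of the two means
    for _ in range(iters):
        assign = np.abs(x - t1) <= np.abs(x - t2)  # nearest-mean assignment
        if assign.all() or not assign.any():       # degenerate: one cluster is empty
            break
        t1, t2 = x[assign].mean(), x[~assign].mean()
    assign = np.abs(x - t1) <= np.abs(x - t2)
    proj = np.where(assign, t1, t2)                # two-color projection of the block
    denom = (t1 - t2) ** 2
    return float(np.sum((x - proj) ** 2) / denom) if denom > eps else 0.0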
Each cluster is fitted with a Gaussian mixture model. The two
models are then employed within the SMAP segmentation
algorithm (Bouman and Shapiro, 1994) [18] in order to classify
each non-background block as either a text block or an
image block. The SMAP algorithm is shown in Fig. 3.

Figure 3: SMAP Algorithm
1. Set the initial parameter values: for all n, u_{n,0} = 1 and u_{L-1,1} = 0.5.
2. Compute the likelihood functions and the parameters u_{n,0}.
3. Compute x^{(L)} = arg max_{1<=k<=M} l^{(L)}(k).
4. For scales n = L-1 down to n = 0:
   a) use the EM algorithm to iteratively compute u_{n,1} and T, subsampling by P when computing T;
   b) compute u_{n,0};
   c) compute x^{(n)} = arg max_{1<=k<=M} [ l^{(n)}(k) + log p_{x^{(n)}|x^{(n+1)}}(k | x^{(n+1)}) ];
   d) update u_{n-1,1} from u_{n,1}.
5. Repeat steps 2 through 4.

B. Histogram Based Technique

The second block-based segmentation model follows a
histogram-based threshold approach [13][16]. In this technique
the image is segmented using a series of rules. The
segmentation process involves a series of decision rules, from
the block type with the highest priority to the block type with
the lowest priority. The decision for smooth and text blocks is
relatively straightforward: the histogram of smooth or text
blocks is typically dominated by one or two intensity values
(modes). Separating the text and image blocks of the PDF
image is more challenging.

Here an intensity value is defined as a mode if its frequency
satisfies two conditions:
(i) it is a local maximum, and
(ii) the cumulative probability around it is above a pre-selected
threshold T.
The algorithm begins by calculating the probability of intensity
value i, where i = 0, ..., 255, using Equation (2):

p_i = freq(i) / B        (2)

where B is the block size and a value of 16 is used in the
experiments. Then the modes (m_1, ..., m_k) are calculated, and
the cumulative probability around a mode m is computed using
Equation (6):

C_m = \sum_{i=m-A}^{m+A} p_i        (6)
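A small sketch (not from the paper) of this mode detection on a single block: local maxima of the intensity histogram whose surrounding cumulative probability exceeds a threshold T. The neighbourhood half-width A and the default threshold are assumptions.

import numpy as np

def find_modes(block, T=0.1, A=8):
    """Return intensity values that are modes of a block's histogram:
    local maxima whose cumulative probability within +/- A exceeds T."""
    freq = np.bincount(block.astype(np.uint8).ravel(), minlength=256)
    p = freq / block.size                       # p_i = freq(i) / (number of pixels)
    modes = []
    for i in range(256):
        left = p[i - 1] if i > 0 else 0.0
        right = p[i + 1] if i < 255 else 0.0
        if p[i] > 0 and p[i] >= left and p[i] >= right:        # local maximum
            if p[max(0, i - A):min(256, i + A + 1)].sum() > T:
                modes.append(i)
    return modes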

The decision rules used are given in Figure 4.
Figure 4: Decision Rules
Rule 1: If N = 1 and c1 > T1 → background block
Rule 2: If N = 2 and c1 + c2 > T1 and |c1 − c2| > T2 → text block
Rule 3: If N ≤ 4 and c1 + c2 + c3 + c4 > T1 → graphics block
Rule 4: If N > 4 and c1 + c2 + c3 + c4 < T3 → picture block
In the above decision rules, after many tests, the thresholds
T1, T2 and T3 were set to 30, 45 and 70 respectively, which gave
better results.
The following Section IV deals with the text extraction techniques.


By applying the two techniques of the block based method, the
image is segmented into:
1. Smooth region (background)
2. Non-smooth region
   i. Text regions
   ii. Image regions

The first technique (AC coefficient based) identifies the
background of the PDF image as the smooth blocks. In the
foreground (non-smooth blocks), the text and image blocks are
separated using the k-means algorithm, and thus the text part is
extracted from the PDF image.
In the second technique (histogram based), the PDF image is
segmented into 16 × 16 blocks, and a histogram distribution is
computed for each segmented block. Pixels are grouped into
low-, mid- and high-gradient pixels, and threshold values are
then used to identify the text blocks and image blocks:

if (high-gradient pixels + low-gradient pixels) < T1 then
    the block is an image block
else if (high-gradient pixels < T2 and low-gradient pixels < T3) then
    if number of colour levels < T4 then
        the block is a text block
    else
        the block is an image block
(where T1 = 50, T2 = 45, T3 = 10, T4 = 4)
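A sketch (not from the paper) of how such a rule could be applied to a block; the gradient operator and the low/high gradient cut-offs g_lo and g_hi are assumptions, since the paper does not specify them.

import numpy as np

def classify_block(block, T1=50, T2=45, T3=10, T4=4, g_lo=10.0, g_hi=60.0):
    """Classify a 16x16 block as 'text' or 'image' from gradient-pixel counts."""
    gy, gx = np.gradient(block.astype(float))
    g = np.hypot(gx, gy)                       # gradient magnitude per pixel
    low = int((g < g_lo).sum())                # low-gradient pixel count
    high = int((g > g_hi).sum())               # high-gradient pixel count
    colours = len(np.unique(block))            # number of distinct intensity levels
    if high + low < T1:
        return "image"
    if high < T2 and low < T3:
        return "text" if colours < T4 else "image"
    return "image"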


The following Fig. 5 shows the combination of PDF images used
for testing:
(i) single column PDF file with no figures,
(ii) double column PDF file with no figures,
(iii) single column PDF file with figures,
(iv) double column PDF file with figures.
Fig 5: sample PDF files used for testing

Total No. of PDF Files used for testing – 100 PDF files

20– Single Column Files with no Figures
20 – Double column Files with no Figures
30 – Single Column Files with Figures
30 – Double Column Files with Figures
Evaluation Method: 10-fold cross validation technique

Table 1 shows the comparison of the two proposed methods; for
each document type, the first column gives the result of the
AC-Coefficient based technique and the second that of the
Histogram based technique.

                     Single column,    Double column,    Single column,    Double column,
                     no figures        no figures        with figures      with figures
                     AC      Hist.     AC      Hist.     AC      Hist.     AC      Hist.
Accuracy (%)         94.33   93.87     92.66   91.87     93.51   92.44     90.19   91.67
False positive (%)    5.67    6.13      7.34    8.13      6.49    7.56      9.81    8.33
Time (seconds)       20.71   14.91     22.57   13.57     26.64   13.06     21.02   13.10

Table 1: comparison of the two proposed methods.

The above table shows that the accuracy rate is better with the
AC-Coefficient based technique, but its time consumption is
higher; the Histogram based technique consumes less time with
a slightly lower accuracy rate.

Considering the advantages and disadvantages of both
algorithms and the performance analysis: if the user is willing
to trade a little time for better accuracy, then the AC-Coefficient
based technique is suitable. However, if the user requires quick
retrieval and is willing to tolerate a slightly less reliable
outcome, then the Histogram based technique is more appropriate.

[1] V. Wu, R. Manmatha, E.M. Riseman, TextFinder: an automatic system to
detect and recognize text in images, IEEE Trans. Pattern Anal. Mach. Intell. 12
(1999) 1224–1229.

[2] C. Strouthopoulos, N. Papamarkos, A.E. Atsalakis, “Text extraction in
complex color documents”, Pattern Recognition 35 (2002) 1743–1758.

[3] D. Maheswari, Dr. V. Radha, “Improved Block Based Segmentation and
Compression Techniques for Compound Images”, International Journal of
Computer Science & Engineering Technology (IJCSET), 2011.

[4] Huang, J., Wang, Y. and Wong. E. K. Check image compression using a
layered coding method. Journal of Electronic Imaging,7(3):426-442, July 1998.

[5] Bing-Fei Wu1, Yen-Lin Chen, Chung-Cheng Chiu and Chorng-Yann Su, “A
Novel Image Segmentation Method for complex document images”, 16th IPPR
Conference on Computer Vision, Graphics and Image Processing (CVGIP 2003)

[6] Lin .T and Hao .P,“Compound Image Compression for Real Time Computer
Screen Image Transmission,” IEEE Trans. on ImageProcessing, Vol.14, pp. 993-
1005, Aug. 2003.

[7] Wong, K.Y., Casey, R.G. and Wahl, F.M. (1982) Document analysis system.
IBM J. of Res. And Develop., Vol. 26, No. 6, Pp. 647-656.

[8] Mohammad Faizal Ahmad Fauzi, Paul H. Lewis, “Block-based Against
Segmentation-based Texture Image Retrieval”, Journal of Universal Computer
Science, vol. 16, no. 3 (2010), 402-423.

[9] Vikas Reddy, Conrad Sanderson, Brian C. Lovell,” Improved Foreground
Detection via Block-based Classifier Cascade with Probabilistic Decision
Integration”, IEEE Transactions on Circuits And Systems For Video
Technology, Vol. XX, No. XX (2012).

[10] S.Ebenezer Juliet, D.Jemi Florinabel, Dr.V.Sadasivam,” Simplified DCT
based segmentation with efficient coding of computer screen images”,
ICIMCS’09, November 23–25, 2009, Kunming, Yunnan, China. Copyright 2009
ACM 978-1-60558-840-7/09/11.

[11]K. Veeraswamy, S. Srinivas Kumar,” Adaptive AC-Coefficient Prediction
for Image
Compression and Blind Watermarking”,Journal of Multimedia, VOL. 3, NO. 1,
MAY 2008.

[12] Wong, T.S., Bouman, C.A. and Pollak, I. (2007) Improved JPEG
decompression of document images based on image segmentation, Proceedings
of the 2007 IEEE/SP 14th Workshop on Statistical Signal Processing, IEEE
Computer Society, pp. 488-492.

[13] Li, X. and Lei, S. (2001) Block-based segmentation and adaptive coding for
visually lossless compression of scanned documents, Proc. ICIP, vol. III.
[14] Asha D, Dr. Shivaprakash Koliwad, Jayanth J., “A Comparative Study of
Segmentation in Mixed-Mode Images”, International Journal of Computer
Applications (0975 – 8887), Volume 31, No. 3, October 2011.

[15] S. A. Angadi, M. M. Kodabagi, “A Texture Based Methodology for Text
Region Extraction from Low Resolution Natural Scene Images”, International
Journal of Image Processing (IJIP) Vol (3), Issue(5)-2009.

[16] D. Maheswari, Dr. V.Radha,” Enhanced Hybrid Compound Image
Compression Algorithm Combining Block and Layer-based Segmentation”, The
International Journal of Multimedia & Its Applications (IJMA) Vol.3, No.4,
November 2011.

[17] Konstantinos Konstantinides, Daniel Tretter: A JPEG variable quantization
method for compound documents. IEEE Transactions on Image Processing 9(7):
1282-1287 (2000)

[18] Charles A. Bouman, Michael Shapiro, “A Multiscale Random Field Model
for Bayesian Image Segmentation”, IEEE Trans. on Image Processing, vol. 3,
no. 2, pp. 162-177, March 1994.


D. Sasirekha completed her B.Sc. (CS) in 2003 at Avinashilingam
University for Women, Coimbatore, and her M.Sc. (CS) in 2005 at
Annamalai University. She is currently pursuing a Ph.D. (part-time)
in Computer Science at Karpagam University, Coimbatore, and
working at Avinashilingam University for Women, Coimbatore, India.

Dr.E.Chandra received her B.Sc., from Bharathiar University,
Coimbatore in 1992 and received M.Sc., from Avinashilingam
University ,Coimbatore in 1994. She obtained her M.Phil., in the
area of Neural Networks from Bharathiar University, in 1999. She
obtained her PhD degree in the area of Speech recognition system
from Alagappa University, Karaikudi, in 2007. She has a total of 16 years
of experience in teaching including 6 months in the industry. At
present she is working as Director, School of Computer Studies in
Dr.SNS Rajalakshmi College of Arts & Science, Coimbatore. She
has published more than 30 research papers in National,
International journals and conferences in India and abroad. She has
guided more than 20 M.Phil., Research Scholars. At present 3
M.Phil Scholars and 8 Ph.D Scholars are working under her
guidance. She has delivered lectures to various Colleges in Tamil
Nadu & Kerala. She is a Board of studies member at various
colleges. Her research interest lies in the area of Neural networks,
speech recognition systems, fuzzy logic and Machine Learning
Techniques. She is a Life member of CSI, Society of Statistics and
Computer Applications. Currently Management Committee member
of CSI Coimbatore chapter.
Towards an Ontology based integrated Framework for Semantic
Nora Y. Ibrahim
Computer and System Department,
Electronic Research Institute
Cairo, Egypt
Sahar A. Mokhtar
Computer and System Department,
Electronic Research Institute
Cairo, Egypt
Hany M. Harb
Computer and Systems Engineering
Department, Faculty of Engineering,
Al-Azhar University
Cairo, Egypt

Abstract—Ontologies are widely used as a means for
solving the information heterogeneity problems on the web
because of their capability to provide explicit meaning to the
information. They become an efficient tool for knowledge
representation in a structured manner. There is always more
than one ontology for the same domain. Furthermore, there is no
standard method for building ontologies, and there are many
ontology building tools using different ontology languages.
Because of these reasons, interoperability between the ontologies
is very low. Current ontology tools mostly use functions to build,
edit and inference the ontology. Methods for merging
heterogeneous domain ontologies are not included in most tools.
This paper presents an ontology merging methodology for building a
single global ontology from heterogeneous eXtensible Markup
Language (XML) data sources, to capture and maintain all the
knowledge that the XML data sources contain.
Keywords-Ontologies; Ontology management; ontology
mapping; ontology merging.
Ontologies have been realized as the key technology for
shaping and exploiting information for the effective
management of knowledge. The study of ontologies and their
use is no longer just one of the fields in the Artificial
Intelligence. Ontologies are now ubiquitous in many
information-systems enterprises: they constitute the backbone
for the Semantic web, they are used in E-commerce, and in
various application fields such as E-science, digital libraries,
bioinformatics and medicine. As a result, developers are
designing a large number of ontologies using different tools
and different languages. These ontologies cover unrelated or
overlapping domains, at different levels of detail and
granularity. Multiple ontologies need to be accessed from
several applications. Such wide-spread use of ontologies
inevitably produces an ontology-management problem.
Ontology management is the whole set of methods,
methodologies, and techniques that is necessary to efficiently
use multiple variants of ontologies from possibly different
sources for different tasks. Ontology management includes
operations such as mapping, alignment, matching, integration
and merging. Ontology mapping aims to find semantic
correspondences between similar elements of different
ontologies [1]. Ontology matching is the process of detecting
links between entities in heterogeneous ontologies. Ontology
alignment is the task of creating links between two original
ontologies; it is performed when the sources are made consistent
with each other but are kept separate [2], usually because they
have complementary domains. Ontology integration is the process of building an
ontology in one subject reusing one or more ontologies in
different subjects [3]. Ontology merging is the process of
generating a single, coherent ontology from two or more
existing and different ontologies related to the same subject [3].
A merged single coherent ontology includes information from
all source ontologies but is more or less unchanged. The
original ontologies have similar or overlapping domains but
they are unique and not revisions of the same ontology [4]. The
merging process can be performed in a number of ways,
manually, semi automatically, or automatically. Manual
ontology merging is a difficult, time-consuming and error
prone task due to the continuous growth in both the size and
number of ontologies. Therefore, several automatic and semi-
automatic ontology merging frameworks have recently been
proposed. These frameworks aim to find semantic
correspondences between the concepts of the ontologies for a
specific domain through exploiting syntactic- and/or semantic-
based techniques. A new method is presented for generation of
OWL ontologies (local ontologies) from heterogeneous XML
data sources [5]. This paper illustrates the process of merging
these local ontologies to create a global ontology which is the
union of the source ontologies. The merged ontology captures
all the knowledge from the original ontologies. The challenge
in ontology merging is to ensure that all correspondences and
differences between the ontologies are reflected in the merged
ontology. Merging of local ontologies is performed using
protégé. Protégé is a free, open-source ontology editor which
supports two ways of modeling ontologies, namely Protégé-
Frames and Protégé OWL [6] where Prompt plug-in/tab is used
for merging. Section 2 discusses the most common methodologies
and tools used for ontology management. Section 3 illustrates
the framework used to generate a global ontology from
heterogeneous XML data sources. Section 4 describes the
methodology for the ontology merging process; this merging
methodology is considered to be an integration with the system
presented in [5]. Section 5 focuses on the experiment and results
of the merging process. Section 6 contains the conclusion and
future work.
This section introduces the current research in ontology
management field. Section 2.1 discusses most common
methodologies of ontology management while section 2.2
presents ontology management tools used in ontology merging
A. Ontology Management Methodologies
Recently, there has been considerable interest in ontology
management methodologies and techniques to assist in a
variety of ontology management operations, e.g., mapping,
merging and alignment. The following methodologies represent
the current research in ontology merging field.
1) Miyoung Cho and Pankoo Kim [7]
Miyoung Cho, Hanil Kim and Pankoo Kim proposed an
ontology merging method using vertical and horizontal
approaches based on WordNet. They presented the problem of
proximity between two ontologies as a choice between
alignment and merging: alignment is limited to establishing
links between ontologies, while merging creates a new single
ontology. The two approaches (i.e., horizontal and vertical) have
different characteristics. The horizontal approach is used to
analyze the mapping between ontologies by integrating similar
concepts at the same level. The horizontal
approach checks the relationships between the concepts of the
same level in the two ontologies and merges or ties them as
defined by WordNet. The vertical approach completes the
merging operation for concepts with different levels, but
belonging to the same branch of the tree. In this case they fill
the resulting ontology with concepts from both ontologies. A
similarity measure is calculated in order to define the hierarchy
between these concepts in the resulting tree. While this method
doesn't provide an adequate solution to automation, it provides
a purely semantic approach to the merging solution.
2) C.R. Rene Robin and G.V. Uma [8]
C.R. Rene Robin and G.V. Uma proposed an algorithm
used for merging of ontologies automatically using a hybrid
strategy. It consists of four sub strategies such as Lexical
Matching, Semantic Matching, Similarity Check and Heuristics
Functions. The user’s only job is to give the OWL files as input
and a merged file will be produced as output. Merging process
starts from the top in one owl file and from bottom in other.
They consider that ontologies completely differ if the leaf node
has no similarity with the supermost class. For comparing
class names, Lexical and Semantic Analysis are used. Semantic
Analysis uses WordNet as a database to identify synonyms of
the class names. If a match is found, the classes are considered
the same, and after a Similarity Check the Heuristic function is
called. If Lexical and Semantic matching fail, it checks every
class of OWLfile1 against the classes of OWLfile2 and saves the
values of the classes and an intermediate value (the ratio of
similarity between two classes). Similarity Checking of
properties takes two classes as input; each of their properties is
stored in an array, and every property of one class is compared
with those of the other. To perform the comparison, lexical and
semantic analysis is again used; if a lexical and semantic match
is found, the heuristic function is called. This process is repeated
for every class of OWLfile2. In this process the output file,
OWLfile3, is initialized with OWLfile1, and any addition from
OWLfile2 is made in OWLfile3, since the merged file should
contain all values that are in both OWL files. Finally, OWLfile3
is returned as the merged file.
3) N. Maiz and M. Fahad [9]
N. Maiz and M. Fahad presented a strategy for ontology
merging in context of data warehousing by mediation that aims
at building analysis contexts on-the-fly. Their methodology is
based on the combination of the statistical aspect represented
by the hierarchical clustering technique and the inference
mechanism. Their approach to ontology merging is twofold.
First, the semantic-based ontology merger, the Disjoint
Knowledge Preservation based Ontology Merger (DKP-OM)
system, follows the hybrid approach and uses various
inconsistency detection algorithms on the initial mappings found
in the first steps [10].
all possible mappings, and semantic validation of mappings
gives very promising final results by ignoring the incorrect
correspondences. In this approach the methodology starts by
aligning the local ontologies to find similar entities belonging
to different ones, the similarity between entities based on
Wordnet thesaurus. Then, the result of the ontology alignment
is used to merge local ontologies automatically. It generates the
global ontology by four steps. First, it builds classes of
equivalent entities of different categories (concepts, properties,
instances) by applying a hierarchical clustering algorithm.
Secondly, it makes inference on detected classes to discover
new axioms representing the new relationships between entities
in the same class or between different classes of the same
category, and solves synonymy and homonymy conflicts. This
step also consists of generating sets of concept pairs from
ontology hierarchies, such as the first component subsumes the
second one. Third, it merges different sets together, and uses
classes of synonyms and sets of concept pairs to solve semantic
conflicts in the global set of concept pairs. Finally, it
transforms this set to a new hierarchy, which represents the
global ontology. In this approach, it requires human
intervention for the validation of mappings, and semantic
inconsistencies are neglected during the generation of the merged ontology.
4) M. Fahad and N. Molla [11]
M. Fahad and N. Moalla proposed an approach that minimizes
human involvement one step further during the ontology
merging process, and presented a novel methodology for the
detection of semantic inconsistencies in the early stages of
ontology merging. Disjoint knowledge analysis and
preservation in ontology merging helps to identify the
conceptualization mismatches between heterogeneous
ontologies and provides more accuracy to the process of
mapping. This results in global merged ontology free from
‘circulatory error in class/property hierarchy’, ‘common
class/instance between disjoint classes error’, ‘redundancy of
subclass/ sub property relations’, ‘redundancy of disjoint
relations’ and other types of ‘semantic inconsistency’ errors. In
this way, their methodology saves time and cost of traversing
local ontologies for the validation of mappings, improves
performance by producing only consistent accurate mappings,

and reduces the user dependability for ensuring the
satisfiability and consistency of merged ontology. This
approach presented Disjoint Knowledge Preservation based
automatic Ontology Merger (DKP-AOM) semantic-based
ontology merger which extends the methodology of semi-
automatic DKP-OM system [10] to encounter more structural
and semantic conflicts, and to provide more optimized fully
automatic solution for accessing and resolving semantic
consistencies in an ontology merging. The newer developed
DKP-AOM system employs various algorithms to detect
inconsistent mappings from the initial list of mappings, and
saves time and resources for the generation of hidden
intermediate global ontology.
5) F. Freire de Araújo and F. Lígia Lopes [12]
F. Freire de Araújo and F. Lígia Lopes proposed an
approach for automatic merging of multiple ontologies, called
MeMO, which uses clustering techniques [13] in order to help
the identification of the most similar ontologies. They consider
that two important tasks must be executed to produce the
global ontology: the similarity matrix building and the
progressive ontology combination. MeMO approach produces
a binary tree whose leaf nodes denote the source ontologies,
and the root node represents the global ontology. The
intermediary nodes represent the integrated ontologies, and are
obtained during the merging process. One important aspect of
MeMO is that it produces a global ontology which is really
close to the ideal one.
6) Salvatore Raunich and E. Rahm [14]
Salvatore Raunich and E. Rahm demonstrated a new
automatic approach for merging large taxonomies based on an
equivalence matching between a source and a target taxonomy.
It is target-driven, i.e. it preserves the structure
of the target taxonomy as much as possible. Furthermore, this
approach can utilize additional relationships between source
and target concepts to semantically improve the merging.
They implemented the Automatic Target-Driven Ontology Merging
(ATOM) system, which is a new approach for taxonomy
merging. It generates a default solution in a fully automatic
way that may interactively be adapted by users if needed.
ATOM base algorithm takes as input two taxonomies and an
equivalence matching between concepts.
B. Ontology Management Tools/Systems
This section provides an overview of the main ontology
management tools that have been developed in the last years.
These tools usually provide a graphical user interface for
merging ontologies. They usually need the participation of the
user to obtain the definitive result of the merging process. At
the end of this section a comparison of these tools is presented.
1) FCA-Merge [15]
FCA-Merge is a semi automatic tool for ontology merging
based on Ganter and Wille’s formal concept analysis [16],
lattice exploration, and instances of ontologies to be merged.
FCA-Merge employs a bottom-up approach. The overall
process of ontology merging consists of three steps:
a) Linguistic processing of source documents for
extraction of instances and formal context.
b) Derivation of common context and computation of
concept lattice.
c) The non-automatic generation of the merged
ontology, with human interaction, based on the
concept lattice.
A disadvantage of the technique is that it does not consider
whether the ontologies have the same, similar, complementary
or orthogonal subjects [8].
2) PROMPT [17]
The PROMPT suite consists of a set of tools that have had an
important impact in the area of merging, aligning and
versioning of ontologies. The suite includes an ontology
merging tool (iPROMPT, formerly known as PROMPT), an
ontology alignment tool (Anchor PROMPT), an ontology
versioning tool (PROMPT Diff) and a tool for factoring out
semantically complete sub-ontologies (PROMPTFactor). The
PROMPT ontology merging algorithm begins with the
linguistic-similarity matches for the initial comparison,
generates a list of suggestions for the user based on linguistic
and structural knowledge and then points the user to possible
effects of these changes. This tool guides the user to remove
inconsistencies in ontologies by determining the conflicts in
the ontologies and suggests solutions.
3) Chimaera [18]
Chimaera is an interactive ontology merging and diagnosis
tool. It is used to create, browse, edit, merge and diagnose
ontologies. This application builds on a system called
Ontolingua. It lets users influence the merging process at any
point. Chimaera allows the users to map
ontologies by suggesting terms which are possible candidates
in the ontologies to be merged or have taxonomic relationships
which are to be included in the merged ontology. In fact, it is
similar to PROMPT, as both are embedded in ontology editing
environments and offer the user interactive suggestions. It
solves mismatches at the terminological and concept-scope
levels, and it helps alignment by providing possible edit points,
but it is not repeatable. Moreover, it is not automatic, which
means that everything requires user interaction.
4) SAMBO (System for Aligning and Merging Biomedical
Ontologies) [19]
SAMBO is a web-based ontology alignment and merging
tool system, developed at Linköpings universitet in Sweden.
SAMBO supports ontologies, which are represented in
DAML+OIL and OWL, and it is designed to allow two merging modes:
• Suggestion Merge: Suggestions for possible merges are
created by SAMBO by comparing the names and
synonyms of slots and classes in the two ontologies but
the user chooses which suggestions to merge or not.
• Manual Merge: The user can choose which slots and
classes to merge without any suggestions from
SAMBO and each merged item is added individually
to the new ontology. The slot and class definitions that
are not merged are copied to the new ontology.
SAMBO provides a number of reasoning services such

as consistency, satisfiability and equivalence checking
by using the FaCT reasoner.
5) HCONE [20]
The goal is to validate the mapping between ontologies and
to find a minimum set of axioms for the new merged ontology.
Linguistic and structural knowledge about ontologies are
exploited by the Latent Semantics Indexing method (LSI - a
technique for information retrieval and indexing) [21] for
associating concepts to their informal, human-oriented intended
interpretations realized by WordNet senses. Using concept
intended semantics; the proposed method translates formal
concept definitions to a common vocabulary and exploits the
translated definitions by means of description logics reasoning
services. The HCONE approach is not completely automated;
user involvements are placed at the early stages of the
mapping/ merging process.
6) iMerge (interactive merge) [22]
iMerge aims to support the analytic comparing and
merging process. It provides tightly linked and integrated
techniques and views for visualization. iMerge provides a
method for the user to accept or reject a suggested mapping,
allows access to full definitions of ontology terms, and provides
progress feedback on the overall mapping process. iMerge
proposes some mappings but does not consider possible
conflicts which may occur if the concepts are merged. The
mapping and merging strategies in iMerge exploit a linguistic
approach. With the EditDistance method [23], string similarity
is computed from the number of edit operations (insertions,
deletions and substitutions of single characters) necessary to
transform one string into another. Strings are also compared
according to their sets of n-grams [24], i.e., sequences of n
characters. The similarity between concepts is also based on
their terminological relationships such as synonymy, hypernymy
and hyponymy. This tool requires the use of auxiliary sources,
such as documents or annotations.
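As a purely illustrative sketch (not part of iMerge or this paper), the two string-similarity measures mentioned above could be computed as follows; the Dice coefficient over trigram sets is one common way to turn n-gram overlap into a similarity score.

def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions and
    substitutions of single characters needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def ngram_similarity(a, b, n=3):
    """Dice coefficient over the sets of character n-grams of the two strings."""
    grams = lambda s: {s[i:i + n] for i in range(len(s) - n + 1)}
    ga, gb = grams(a.lower()), grams(b.lower())
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

print(edit_distance("Gender", "Sex"), ngram_similarity("PersonName", "person_name"))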
7) MOA [25]
MOA is an OWL ontology merging and alignment tool for the
semantic web based on linguistic information. Linguistic
information is usually used to detect (dis)similarities between
concepts. Many such techniques are based on syntactic and
semantic heuristics such as concept name matching (e.g., exact,
prefix or suffix matching); this works in most cases, but not in
more complex ones. In MoA, the correlations between the
source ontologies are saved to a text file, which can be viewed
and edited in an editor, so the users can accept or reject the
result by editing the file. The intermediate output of MoA is a
set of articulation rules between two ontologies; these rules
define what the (dis)similarities are [26]. The final output of
MoA (i.e., the new merged ontology) is similar to that of
Chimaera and PROMPT. MoA refers not only to classes but
also to slots and value restrictions. The merging algorithm of
MoA accepts three inputs: the two source ontologies and the
semantic bridge ontology. The MoA system is composed of four
main components, namely the MoA engine (the core module of
the architecture), the Bossam inference engine, the Ontomo
editor, and the Shell; they provide querying, editing, and user
interfaces. Table 1 summarizes the different tools used to
manage ontologies.

Table 1. Comparison of ontology management tools
(input / output / operation / mapping strategy or algorithm / automation / user interaction / auxiliary source)

FCA-Merge — Input: two ontologies and a set of documents with instances of the concepts. Strategy: linguistic analysis and the TITANIC algorithm for computing the pruned concept lattice. User interaction: generating the merged ontology requires human interaction of a domain expert with background knowledge.

PROMPT — Input: two ontologies. Output: merged ontology. Operation: merging and alignment. Strategy: heuristic based. User interaction: the user can accept, reject or adjust the system's suggestions.

Chimaera — Input: RDF and DAML ontologies. Operation: merging and alignment. Strategy: linguistic matcher. Automation: semi-automatic. User interaction: the user can map ontologies by suggesting terms which are possible candidates in the ontologies to be merged.

SAMBO — Input: two ontologies. Output: merged ontology. Operation: merging and alignment. Strategy: linguistic matcher. Automation: semi-automatic. User interaction: the user can choose among the merge suggestions of the system. Auxiliary source: WordNet.

HCONE — Input: two ontologies. Output: merged ontology. Operation: merging and mapping. Strategy: LSI. Automation: semi-automatic. Auxiliary source: WordNet.

iMerge — Input: ontologies and a set of documents linked with the concepts. Strategy: linguistic approach with the EditDistance and n-gram methods. User interaction: the user can accept or reject a suggested mapping.

MOA — Input: two ontologies and semantic bridges. Operation: merging and alignment. Strategy: linguistic analysis. Automation: semi-automatic. User interaction: users can accept or reject the result by editing the output file.


The next section will discuss ontology merging methodology
for building a single global ontology from heterogeneous XML
data sources.
Because of the scattered nature of the Web, information
sources can be scattered across different XML data sources.
Each information source should be mapped to its own local
ontology. For this reason, the existence of an integrated
framework for storing, querying and managing distributed
XML data sources is of great importance. Towards an ontology
based integrated framework, two modules are implemented,
each of which performs a specific task (Fig. 1):
1) Automatic generation of local OWL ontologies
module: it is used to specify XML-to-OWL mappings
for automatically generating local OWL ontologies from
heterogeneous XML data sources. This module is
implemented and described in [5]. It is composed of
Jena, Trang, XSOM and JUNG as shown in Fig. 1.
2) Merging Module: it uses PROMPT framework for
merging the local ontologies which are generated
from previous module to build a global ontology
covering the domain knowledge presented in XML
data sources. This module will be described in details
in the next section.
An ontology based integrated framework is written in Java and
it uses several online-available APIs such as, Jena, Trang,
XSOM, JUNG and PROMPT as shown in Fig. 1.
Figure 1. Ontology based integrated framework
This section presents the process of merging different local
ontologies for a certain domain. These local ontologies are
generated from heterogeneous XML data sources using the
system in [5]. The merging process is done through Protégé-OWL
and Protégé-Frames, where Prompt is a plug-in/tab. The Prompt
framework includes a tool called iPrompt, which is responsible
for merging ontologies.
A. The IPROMPT Algorithm
The IPROMPT algorithm [17, 2] takes as input two
ontologies and guides the user in the creation of one coherent
merged ontology as output. Fig. 2 illustrates the iPrompt
ontology-merging algorithm. First IPROMPT creates an initial
list of matches based on lexical similarity of class names then
the process goes through the following cycle (Fig. 2):
(1) The user triggers an operation by either selecting one
of IPROMPT’s suggestions from the list or by using
an ontology-editing environment to specify the
desired operation directly.
(2) IPROMPT performs the operation, automatically
executes additional changes based on the type of the
operation, generates a list of suggestions for the user
based on the structure of the ontology around the
arguments to the last operation, and determines
inconsistencies and potential problems that the last
operation introduced in the ontology and finds
possible solutions for those problems.

Figure 2. The workflow of IPROMPT algorithm. The colored boxes indicate
the actions performed by IPROMPT. The white box indicates the action
performed by the user.
B. The IPROMPT Ontology Merging Operations
The set of ontology-merging operations includes both the
operations that are normally performed during traditional
ontology editing of a single ontology and the operations
specific to merging and alignment, such as merging classes,
merging slots, merging instances, performing a deep or a
shallow copy of a class from one ontology to another [17]. In
the descriptions of the operations below, O_m is the merged
ontology.
Merge classes: to merge two classes A and B, create a new
class M in O_m. For each class C that is a subclass or a superclass
of A or B, if there is an image C_m of C in O_m, make C_m a
subclass or a superclass of M, respectively. For each slot S
attached to A or B, if there is no image of S in O_m, copy S to
O_m. For each image S_m of S, attach S_m to the class M. If either A
or B was already in O_m prior to the operation, all references to
it in O_m become references to M and the original frame is
deleted.
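A toy sketch (not the actual PROMPT implementation) of this merge-classes operation, assuming frames are represented simply as Python dictionaries holding 'superclasses' and 'slots' lists.

def merge_classes(om, a, b, merged_name):
    """Merge classes a and b of the merged ontology om (dict of frames) into one class."""
    fa, fb = om[a], om[b]
    om[merged_name] = {
        "superclasses": sorted(set(fa["superclasses"]) | set(fb["superclasses"])),
        "slots": sorted(set(fa["slots"]) | set(fb["slots"])),
    }
    del om[a], om[b]
    # redirect all remaining superclass references to the old frames onto the merged class
    for frame in om.values():
        frame["superclasses"] = [merged_name if s in (a, b) else s
                                 for s in frame["superclasses"]]
    return om

ontology = {
    "Gender": {"superclasses": [], "slots": ["label"]},
    "Sex":    {"superclasses": [], "slots": ["code"]},
    "Person": {"superclasses": [], "slots": ["sex"]},
}
merge_classes(ontology, "Gender", "Sex", "Gender_Sex")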

Merge slots: to merge two slots S_1 and S_2, create a new slot
S_m in O_m. For each class C in the domain and range of S_1 or S_2,
if there is an image C_m of C in O_m, add C_m to the domain or
range of S_m, respectively. If either S_1 or S_2 was in O_m prior to
the operation, all references to it become references to S_m and
the original slot is deleted. Note that, as a result of the last step,
if there was a class in O_m that had both S_1 and S_2 attached to it,
after the operation it will have only S_m attached to it. In
addition, IPROMPT suggests that the user merge the classes in
the domain (and range) of S_m that came from different source
ontologies. For instance, suppose that in one of the source
ontologies a slot sex had the class Gender as its range, and in
the other a slot sex had the class Sex as its range. If the user
merges the two sex slots, IPROMPT will suggest merging the
classes Gender and Sex.
Merge instances: IPROMPT does the following when
merging two instances I_1 and I_2 to create a new instance I_m. If
the classes C_1 and C_2, which are the types of I_1 and I_2, have no
images in O_m, copy them to O_m (see the operation perform a
shallow copy of a class). If C_1 and C_2 already have images in
O_m and the images are different frames, merge the images (the
user must confirm this operation). Note that, as a result, the
merged instance I_m will have the same slots, or their images,
that I_1 and I_2 had. For each value V of each slot S attached to I_1
or I_2, do the following:
- If V is a primitive value (string, number, etc.), add V to the
value of the image of S for I_m.
- If V is a frame and there is an image V_m of V in O_m, add V_m
to the value of the image of S for I_m.
For each slot S of I_m that has values that are images of frames
coming from different sources, IPROMPT suggests merging
these frames. Note that adding images of all the values at the
sources can create violations of range and cardinality constraints
for the merged instance. This inconsistency is one of the
inconsistencies that IPROMPT checks for in the next step of the
algorithm - finding inconsistencies and potential problems.
Perform a shallow copy of a class: copy a class from a
source ontology to another. When copying a class C, create a
new class Cm in Om. For each slot S directly attached to C, if
there is no image of S in Om, copy S to Om. Then attach images
of all the slots of C to Cm.
Perform a deep copy of a class: copy a class from one
ontology to another, copying all the parents of the class up to
the root of the hierarchy. To perform a deep copy of a class C,
perform its shallow copy, and then perform a deep copy of each
of its superclasses.
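As a minimal Python sketch of these operations (an assumption-level model in which frames are plain dictionaries, not the PROMPT code), the merge-classes and shallow-copy steps described above can be mimicked as follows:

# Frames modeled as dicts; Om is the merged ontology being built.
def shallow_copy_class(C, source, Om):
    # Copy class C into Om, copying any directly attached slot with no image yet.
    Om["classes"].setdefault(C, {"slots": set()})
    for S in source["classes"][C]["slots"]:
        Om["slots"].setdefault(S, dict(source["slots"][S]))
        Om["classes"][C]["slots"].add(S)

def merge_classes(A, src_a, B, src_b, Om, merged_name):
    # Create the merged class M in Om and attach images of all slots of A and B.
    M = Om["classes"].setdefault(merged_name, {"slots": set()})
    for C, src in ((A, src_a), (B, src_b)):
        for S in src["classes"][C]["slots"]:
            Om["slots"].setdefault(S, dict(src["slots"][S]))
            M["slots"].add(S)

# Hypothetical example: merging the two author classes used in the experiment.
ruby = {"classes": {"author": {"slots": {"firstname", "surname"}}},
        "slots": {"firstname": {"range": "string"}, "surname": {"range": "string"}}}
niagara = {"classes": {"author": {"slots": {"firstname", "lastname"}}},
           "slots": {"firstname": {"range": "string"}, "lastname": {"range": "string"}}}
Om = {"classes": {}, "slots": {}}
merge_classes("author", ruby, "author", niagara, Om, "author")
print(Om["classes"]["author"]["slots"])   # {'firstname', 'surname', 'lastname'}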
In terms of user support, the IPROMPT tool has the following
features [17]:
1- Setting the preferred ontology: It often happens that
the source ontologies are not equally important or
stable, and that the user would like to resolve all the
conflicts in favor of one of the source ontologies.
IPROMPT allows the user to designate one of the
ontologies as preferred. When there is a conflict
between values, instead of presenting the conflict to
the user for resolution, the system resolves the
conflict automatically.
2- Maintaining the user’s focus: Suppose a user is
merging two large ontologies and is currently
working in one content area of the ontology.
IPROMPT maintains the user’s focus by rearranging
its lists of suggestions and conflicts and presenting
first the items that include frames related to the
arguments of the latest operations.
3- Providing feedback to the user: For each of its
suggestions, IPROMPT presents a series of
explanations, starting with why it suggested the
operation in the first place. If IPROMPT later
changes the operation placement in the suggestions
list, it augments the explanation with the information
on why it moved the operation.

The next section describes the experimental results of the
proposed ontology-based integrated framework after applying the
merging process to the local ontologies generated as described
in [5].
This section describes the experiments carried out in the
ontology merging environment.
Fig. 3 shows the two local ontologies used for the
experiment. The two local ontologies have been generated from
heterogeneous XML data sources using the automatic
generating of local OWL ontologies module which is described
in [5]. Both ontologies represent the structure of scientific
publications. They were encoded in the OWL-Description Logic
(DL) format [27]. Fig. 3-a shows the "Ruby_bibliography"
ontology, which is composed of four OWL classes, named:
bibliography, biblioentry, author and publisher. Fig. 3-b
shows the "Niagara_bib" ontology, which is composed of four OWL
classes named: bib, vendor, book and author. Various
screenshots are shown to discuss how the two ontologies are
merged using the PROMPT tab in Protégé.

Figure 3. The two local ontologies. (a) Ruby_bibliography Ontology; (b)
Niagara_bib Ontology.
In order to proceed with merging the two local ontologies
presented above, the PROMPT tab in Protégé is used. After the
source ontologies (local ontologies) are loaded, IPROMPT
performs an initial analysis of class names and then displays
the results of its analysis in the Suggestions tab. The
suggestions screen obtained is shown in Fig. 4.
Figure 4. Suggestions shown by the PROMPT tab for merging the two local ontologies.
As discussed in section 3.2, in order to merge two classes,
the user can, for example, follow IPROMPT's suggestion list and
merge the two author classes. The user can also define a new
merge-classes operation to merge the "bibliography" and "bib"
classes in the Sources window, as shown in Fig. 5. IPROMPT
automatically creates a new class named "bibliography" in the
GlobalOntology.
Figure 5. Source classes of the two ontologies being merged in a newly defined merge operation.
After copying the remaining classes of the two source
ontologies from the IPROMPT suggestion list, such as
biblioentry, publisher, book and vendor, the resulting classes
of the merged ontology (GlobalOntology) are obtained as shown
in Fig. 6. Note that some of the classes in the merged ontology
have the suffix "Ruby_bibliography", some of them have the
suffix "Niagara_bib" and the rest have no suffix. The classes
that have no suffix are the shared classes between the two
ontologies, created by merging the similar classes in the two
ontologies. The classes having the suffix "Ruby_bibliography"
represent the classes that are copied from the
"Ruby_bibliography" ontology and do not exist in the other
ontology, and in the same way the classes that have the suffix
"Niagara_bib" represent the classes that are copied from the
"Niagara_bib" ontology and do not exist in the other ontology.
Figure 6. The result classes of the merged ontology.
As discussed in section 3.2, slots are merged in a similar way.
In our experiment, the system suggests merging the two firstname
slots. We also define a new operation to merge the two slots
othername and lastname, obtaining a new merged slot named
lastname. The remaining slots, such as id, pubdate, name,
publishername, publisher, email, phone, title, year, price,
hasbiblioentry, hasbook and hasvendor, are then copied.
Protégé allows us to modify the ontology structure. So, we
can rename or remove ontology terms, or change the domain
and the range of a property, or create new classes. In our case,
the two classes “bibliography” and “author” are extended, with
Publication and Person being their superclasses, respectively.
Fig. 7 shows the new merged ontology after merging the two
ontologies and restructuring the new merged ontology.
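As a minimal sketch of this restructuring step (assuming the rdflib library and a hypothetical namespace; the authors performed this in Protégé, not in code), the new superclass links can be expressed as OWL/RDFS triples:

# Add Publication and Person as superclasses of the merged classes.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS, OWL

EX = Namespace("http://example.org/GlobalOntology#")   # hypothetical namespace
g = Graph()
for cls in ("Publication", "Person", "bibliography", "author"):
    g.add((EX[cls], RDF.type, OWL.Class))
g.add((EX["bibliography"], RDFS.subClassOf, EX["Publication"]))
g.add((EX["author"], RDFS.subClassOf, EX["Person"]))
print(g.serialize(format="turtle"))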

Figure 7. Classes hierarchy of the new merged ontology.
Tables 2 and 3 show the object properties and the datatype
properties of the new merged ontology, respectively.

Table 2. Object properties of the merged ontology

Object Property   Domain         Range
hasbiblioentry    bibliography   biblioentry
hasvendor         bibliography   vendor
hasbook           vendor         book
hasauthor         book           author
hasauthor         biblioentry    author
haspublisher      biblioentry    publisher
hasissue          SigmodRecord   issue
hasarticles       Issue          articles
hasarticle        Articles       article
hasauthors        Article        authors
hasauthor         Authors        author

Table 3. Datatype properties of the merged ontology

Datatype Property   Domain        Range
id                  bibliography
name                vendor        xsd:NCName
email               vendor        xsd:string
phone               vendor        xsd:NMTOKEN
title               book          xsd:string
publisher           book          xsd:string
year                book          xsd:integer
price               book          xsd:decimal
title               biblioentry   xsd:string
pubdate             biblioentry   xsd:integer
volume              issue
number              issue         xsd:integer
title               article
initPage            article
endPage             article       xsd:integer
position            author
firstname           author
surname             author        xsd:NCName
lastname            author
In this paper, some issues of ontology management process
like methodologies and tools have been highlighted. It
illustrates with an example, the merging process, which is the
second module of ontology based integrated framework.
IPROMPT algorithm identifies potential merge candidates
based on class-name similarities. The result is presented to the
user as a list of potential merge operations then the user
chooses one of the suggested operations from the list or
specifies the operation directly. The system performs the
requested action and automatically executes additional
changes derived from the action. It then makes a new list of
suggested actions for the user based on the new structure of
the ontology, determines conflicts introduced by the last action,
finds possible solutions to these conflicts and displays these to
the user. From this example, it is clear that the merging
process requires human intervention to be completed without
conflicts, which means that full automation of the merging
process is almost impossible: it requires good knowledge of the
domain, understanding of each ontology's point of view, and even
the use of negotiation strategies between the designers of the
different ontologies in order to make proposals, discuss them
and reach an agreement. In this paper, data instances are not
materialized at the local ontologies. The global ontology only
contains the concepts and properties but not the instances,
which stay in the sources and are retrieved and translated as
needed in response to user queries. For this reason, subsequent
work will focus on the query translation process within this
system for XML data sources.
[1] N. Noy, Semantic integration: A survey of ontology - based approaches,
SIGMOD Record, 33(4) - 2004: pp. 65-70.
[2] N. Noy and M. Musen, “PROMPT: Algorithm and Tool for Automated
Ontology Merging and Alignment.” Proceedings of the National
Conference on Artificial Intelligence (AAAI), 2000.
[3] H. Sofia Pinto, A. Gomez-Perez, J. P. Martins, “Some Issues on
Ontology Integration”, In Proc. Of IJCAI99’s Workshop on Ontologies
and Problem Solving Methods: Lessons Learned and Future Trends, 1999.
[4] Helena Sofia Pinto, Joao P. Martins, ”A Methodology for Ontology
Integration”, Proceedings of the International Conference on Knowledge
Capture, Technical papers, ACM Press, pp. 131-138, 2001.
[5] Nora Yahia, Sahar Mokhtar., AbdAl Wahab A. “Automatic generation
of OWL ontology from XML data source.” International Journal of
Computer Science Issues, Volume 9, Issue 2, March 2012, pp. 77-83.
[7] Cho, M., Kim, H., and Kim, P. A new method for ontology merging
based on concept using wordnet. In Advanced Communication
Technology, the 8th International Conference, volume 3, pages 1573-
1576, February 2006.
[8] C.R.Rene Robin, G.V.Uma, "A Novel Algorithm for Fully Automated
Ontology Merging Using Hybrid Strategy", European Journal of
Scientific Research, Vol. 47 No. 1 (2010), pp. 74-81.
[9] Nora Maiz, Muhammad Fahad, Omar Boussaid, Fadila Bentayeb,
“Automatic Ontology Merging by Hierarchical Clustering and Inference
Mechanisms.” Proceedings of I-KNOW 2010, 1-3 September 2010,
Graz, Austria, pp. 81-93.
[10] Fahad, M., Qadir, M.A., Noshairwan, M.W., Iftakhir, N.: DKP-OM: A
Semantic Based Ontology Merger, In Proc. 3rd International Conference
I-Semantics 2007, Graz, Austria, pp. 313-322, 2007.
[11] Muhammad Fahada, Nejib Moallaa, Abdelaziz Bourasa, “Towards
ensuring Satisfiability of Merged Ontology.” International Conference
on Computational Science, ICCS 2011/ Procedia Computer Science 4
(2011), pp.2216–2225.
[12] Fabiana Freire de Araujo, Fernanda Lígia R. Lopes, and Bernadette
Farias Lóscio, “MeMO: A Clustering-based Approach for Merging
Multiple Ontologies.” In: DEXA Workshops IEEE Computer Society
(2010), pp. 176-180.
[13] M. R. Anderberg, “Cluster analysis for applications”. New York:
Academic Press, 1993.
[14] Salvatore Raunich, Erhard Rahm, "ATOM: Automatic target-driven
ontology merging." IEEE 27th International Conference on Data
Engineering, pp. 1276-1279, 2011.

[15] Gerd Stumme, Alexander Maedche, “FCA-Merge: Bottom-Up Merging
of Ontologies", In Proceedings of the International Joint Conference on
Artificial Intelligence IJCAI'01, Seattle, USA, 2001.
[16] Ganter B., Wille R., "Formal Concept Analysis: Mathematical
Foundations", Springer, 1999.
[17] Noy N.F. and M.A. Musen, "The PROMPT suite: Interactive tools for
Ontology merging and mapping”, Int. J. Human Comput. Stud, 59: 983–
1024, 2003.
[18] D.L. McGuinness, R. Fikes, J. Rice, S. Wilder, The Chimaera Ontology
Environment, in: 17th National Conference on Artificial Intelligence
(AAAI_00), Austin, 2000.
[19] Bassam Abdulahad and Georgios Lounis, A user interface for the
ontology merging tool SAMBO. PhD thesis, Linköpings universitet,
December 2007.
[20] K. Kotis and G. A. Vouros, “The HCONE Approach to Ontology
Merging”, In ESWS, LNCS 3053, Springer, pp. 137-151, 2004.
[21] H. Chalupsky, OntoMorph: a translation system for symbolic
knowledge, in: Seventh International Conference on Principles of
Knowledge Representation and Reasoning (KR2000), Morgan
Kaufmann, San Francisco, 2000, pp. 471–482.
[22] El Jerroudi, Z. and Ziegler, J, iMERGE: Interactive Ontology Merging,
EKAW 2008 – 16th International Conference on Knowledge
Engineering and Knowledge Management Knowledge Patterns.
Acitrezza, Italy: Springer.
[23] Levenshtein, V.I.: Binary codes capable of correcting deletions,
insertions, and reversals. Technical Report 8 (1966)
[24] Damashek, M. Gauging similarity with n-grams: Language-independent
categorization of text. Science 267(5199), February 1995, pp. 843-848.
[25] Jaehong Kim, Minsu Jang, Young-Guk Ha, Joo-Chan Sohn1, and Sang
Jo Lee: MoA: OWL Ontology Merging and Alignment Tool for the
Semantic Web, 18th International Conference on Industrial and
Engineering Applications of Artificial Intelligence and Expert Systems,
IEA/AIE 2005, Bari, Italy, Springer-Verlag Berlin Heidelberg 2005,
LNAI 3533, pp. 722 – 731.
[26] Noy, N.F and Musen, M.A. Evaluating Ontology-Mapping Tools:
Requirements and Experience. 2002.
[27] World Wide Web Consortium. “OWL Web Ontology Language
Reference”. W3C Recommendation 10 Feb, 2004.

Dept. of E&I, Annamalai University

Abstract - Face recognition is one of the challenging issues of image processing. It has become an important issue in many applications such as security, credit card verification and criminal identification. A robust face recognition algorithm should possess the ability to recognize identity despite many variations in pose, illumination and expression. Principal Component Analysis (PCA) has been frequently used as a potential algorithm for dimension reduction. However, it has its limitations, such as poor discriminatory power and large computational load. In view of these limitations, the proposed work combines PCA and different types of wavelet for a better feature representation. In this method, the wavelet transform is used to decompose an image into different frequency sub bands as a pre-processing step, followed by PCA to reduce the image matrix into a feature matrix. The Euclidean distance measure is used for classification. The proposed method gives a better recognition rate and discriminatory power, and reduces the computational load significantly even when the image database is large. This paper details the design and implementation of the proposed method, and results with the standard AT&T face database. The effectiveness of the proposed work is justified by FAR (False Acceptance Rate), FRR (False Rejection Rate) and ROC (Receiver Operating Characteristic) curves. Significant improvements in terms of both FRR and FAR are observed when compared with conventional methods.

Keywords - Human face recognition, Principal Component Analysis, Sub band, Wavelet transform.

I. INTRODUCTION

Human face detection and recognition is an active area of research spanning several disciplines such as image processing, pattern recognition and computer vision, with a wide range of applications such as personal identity verification, video-surveillance, facial expression extraction, and advanced human and computer interaction. The wide range of variation of human faces due to pose, illumination and expression results in a highly complex distribution and deteriorates the recognition performance. Hence there is a need to develop robust face recognition systems.

The block diagram of a typical face recognition system is shown in Fig. 1. In the pre-processing stage, the frontal face images are normalized or gray converted in order to prepare the image in a standard format for further processing. Classification is usually performed with one of the standard methods such as a minimum distance classifier, neural networks, etc. Feature extraction is the area in which methods tend to differ. This paper addresses face feature extraction by applying PCA on sub band images.

Fig. 1 Block diagram of a typical face recognition system

A good survey of face recognition systems is found in [1]. The methods for face recognition can be divided into two different classes: geometrical feature matching and template matching. In the first class, some geometrical measures about distinctive facial features such as eyes, mouth, nose and chin are extracted. In the second class, the face image is represented as a two-dimensional array of intensity values and is compared to a single template or several templates representing a whole face; this is also called the image based approach. In the image based approach, the information that best describes a face is derived from the entire face image. Based on the Karhunen-Loeve expansion in pattern recognition, M. Kirby and L. Sirovich [2, 7] have shown that any particular face could be economically represented in terms of a best coordinate system that they termed "eigenfaces"; these are the eigenfunctions of the covariance of the ensemble of faces. Later, M. Turk and A. Pentland [3] proposed a face recognition method based on the eigenfaces approach. Such common PCA-based methods, however, suffer from two limitations, namely, poor discriminatory power and large computational load, even though it is well known that PCA gives a very good representation of the face.
In view of the limitations existing in PCA-
based approach, in this paper, a new approach based
on PCA – applying PCA on wavelet subband is
addressed. In the proposed method, an image is
decomposed into a number of subbands with
different frequency components using the wavelet
transform (WT). The subband image with reduced
dimension is selected to compute the representational
bases. Also the proposed work takes the face
images with changes in illumination and pose. The
proposed method works on lower resolution, instead
of the original image resolution of 92x112.
Therefore, the proposed method reduces the
computational complexity significantly when the
number of training images is large, which is expected
to be the case for a number of real-world applications.
The paper is organized in the following manner:
section 2 reviews the background of wavelets.
Section 3 gives details about PCA and eigenfaces. The
proposed method is reported in section 4.
Experimental results drawn from the research are given
in section 5 and finally conclusions are given in
section 6.
Wavelets are functions that satisfy certain
mathematical requirements and are used in
presenting data or other functions, similar to sines
and cosines in the fourier transform. However, it
represents data at different scales or resolutions,
which distinguishes it from the fourier transform.

The main characteristic of wavelets is the
possibility to provide a multi- resolution analysis of
the image in the form of coefficient matrices. Strong
arguments for the use of multi resolution
decomposition can be found in psycho visual
research, which offers evidence that the human
visual system processes the images in a multi-scale
way. Moreover, wavelets provide a spatial and a
frequential decomposition of the image at the same time.
In the proposed system, WT is chosen to be used
in image frequency analysis and image
decomposition because by decomposing an image
using WT, the resolutions of the sub band images are
reduced. In turn, the computational complexity will
be reduced dramatically by working on a lower
resolution image. Also Wavelet decomposition
provides local information in both space domain and
frequency domain.
In 2-D case, the wavelet transform is usually
performed by applying a filter bank to the image.
Typically, a low pass filter and a band pass filter are
used. The convolution with the low pass filter results
in an approximation images and the convolutions
with the band pass filter in specific directions result
in detail images. The wavelet decomposition of a 2-D
image can be obtained by performing the filtering
consecutively along horizontal and vertical
directions. Wavelet coefficients are organized into
wavelet blocks as shown in Fig.2 and 3, where h, v,
and d correspond to horizontal, vertical, and diagonal
sub images.
Using the two dimensional wavelet transform, an image
f(x, y) can be represented as:

f(x, y) = S_J(x, y) + Σ_{j≤J} D_j^h(x, y) + Σ_{j≤J} D_j^v(x, y) + Σ_{j≤J} D_j^d(x, y)   (1)
        = S_J + Σ_{j≤J} D_j^h + Σ_{j≤J} D_j^v + Σ_{j≤J} D_j^d   (2)

where the two dimensional wavelets are the tensor
product of the one dimensional wavelets as below:

Φ(x, y) = Φ(x) × Φ(y)   (3)
Ψ^h(x, y) = Φ(x) × Ψ(y)   (4)
Ψ^v(x, y) = Ψ(x) × Φ(y)   (5)
Ψ^d(x, y) = Ψ(x) × Ψ(y)   (6)

The energy of the original image concentrates
within the approximation image. Images showing the
most significant components obtained using four stages of
wavelet decomposition are shown in Fig. 2 and 3.

Fig. 2 Original image and 2 level decomposed image.
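For illustration, a minimal Python sketch of this decomposition (assuming the PyWavelets library and a 92x112 grayscale face stored as a NumPy array; not the authors' code) is:

import numpy as np
import pywt

face = np.random.rand(112, 92)                 # stand-in for one AT&T face image
coeffs = pywt.wavedec2(face, wavelet="coif1", level=3)
cA = coeffs[0]                                 # approximation sub band (kept for features)
cH, cV, cD = coeffs[1]                         # horizontal, vertical, diagonal details
print(face.shape, "->", cA.shape)              # the resolution drops at each level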


Fig. 3 Decomposition at third and fourth level.
Here, the approximation components are selected
because they produce a better recognition rate compared
to the other components. The feature extraction method
uses the approximation component of the wavelet
coefficients as the image A in the principal component
analysis.
The Principal Component Analysis (PCA) is
one of the most successful techniques that have
been used in image recognition and compression.
The purpose of PCA is to reduce the large
dimensionality of the data space (observed variables)
to the smaller intrinsic dimensionality of feature
space (independent variables), which are needed to
describe the data economically [8]. This is the case
when there is a strong correlation between the observed variables.
By means of PCA one can transform each
original image of the training set into a corresponding
eigenface. The relevant information in a face image is
extracted, encoded as efficiently as possible, and then
compared with a database of models encoded
similarly. A simple approach to extracting the
information contained in an image of a face is to
somehow capture the variation in a collection of face
images, independent of any judgment of features, and
use this information to encode and compare
individual face images. In mathematical terms, what is
sought are the principal components of the distribution of
faces, or the eigenvectors of the covariance matrix of the
set of face images, treating an image as a point (or
vector) in a very high dimensional space. The
eigenvectors are ordered, each one accounting for a
different amount of the variation among the face images.
These eigenvectors can be thought of as a set of
features that together characterize the variation
between face images. Each image location
contributes more or less to each eigenvector, so that
it is possible to display these eigenvectors as a sort of
ghostly face image which is called an "eigenface".
Each eigenface deviates from uniform gray
where some facial feature differs among the set of
training faces. Eigenfaces can be viewed as a sort of
map of the variations between faces. Each individual
face can be represented exactly in terms of a linear
combination of the eigenfaces. Each face can also be
approximated using only the "best" eigenfaces that
have the largest eigenvalues and therefore account for
the most variance within the set of face images. The
eigenvalues drop very quickly, which means that one can
represent the faces with a relatively small number of
eigenfaces. The best M eigenfaces span an M-dimensional
subspace, called the face space, of all possible images.
So, in order to reconstruct the original image
from the eigenfaces, building a kind of weighted
sum of all eigenfaces is required. The face images
can be reconstructed by weighted sum of a small
collection of characteristic features or eigenpictures.
Therefore, each individual is characterized by a small
set of feature or eigenpicture weights (eigenvectors)
needed to describe and reconstruct them.
For practical applications, only a certain part of the
eigenfaces is used, so the reconstructed image is an
approximation of the original image. However, losses due
to omitting some of the eigenfaces can be minimized by
choosing the eigenfaces with the largest eigenvalues and
the corresponding eigenvectors.
A. Mathematics of PCA
A 2-D facial image can be represented as 1-D
vector by concatenating each row (or column) into a
long thin vector.
We assume the training set of face images to be Γ_1,
Γ_2, ..., Γ_M, with each image I(x, y), where (x, y) is the
size of the image.
• Convert each image into a vector and form the
full-size matrix of size M × p, where M is the
number of training images and p = x × y is the
size of each image.
• The average of the set is then defined by

Ψ = (1/M) Σ_{i=1}^{M} Γ_i   (8)

• Calculate how each face differs from the average
by the vector

Φ_i = Γ_i − Ψ,   i = 1, 2, ..., M   (9)

so that A = [Φ_1, Φ_2, ..., Φ_M] is the
mean-subtracted matrix.
• By implementing the matrix transformations, the
vector matrix is reduced as

C = (1/M) Σ_{n=1}^{M} Φ_n Φ_n^T = A A^T   (10)

where C is the covariance matrix.
• Find the eigenvectors V_i and the eigenvalues of
the C matrix and order the eigenvectors by highest
eigenvalue.
• These vectors determine linear combinations of the
M training set face images to form the eigenfaces:

u_l = Σ_{k=1}^{M} v_lk Φ_k   (11)

• With this analysis, the calculations are greatly
reduced, from the order of the number of pixels in
the images (x × y) to the order of the number of
images in the training set (M). In practice, this
forms the eigenspace of size M (M << N^2), and the
calculations become quite manageable.
• Based on the eigenfaces, each image has its face
vector given by

w_k = u_k^T (Γ − Ψ),   k = 1, 2, ..., N'   (12)

where N' is the total number of classes used for
training.
• The weights form a feature vector

Ω^T = [w_1, w_2, ..., w_N']   (13)

• The unknown face is taken as input for the
recognition stage and its normalized form is
obtained by

Φ = Γ − Ψ   (14)

• The normalized face is projected on the eigenspace
to reconstruct the face image

Φ_f = Σ_i w_i u_i   (15)

and is represented as

Ω^T = [w_1, w_2, ..., w_N']   (16)

• A measure of similarity is computed with the
Euclidean distance between the weights of the test
image and the weights from the training set.
• The index of the distance with minimum weight
represents the face from the library that most
closely matches the test image.
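A minimal NumPy sketch of these steps (assumed array shapes; not the authors' implementation) is given below; the rows of X are the vectorized (sub band) training images:

import numpy as np

def train_eigenfaces(X, num_components):
    # X: (M, p) matrix of M training images; returns mean face, eigenfaces, weights.
    mean = X.mean(axis=0)                       # average face, Eq. (8)
    A = X - mean                                # mean-subtracted faces, Eq. (9)
    L = A @ A.T                                 # small M x M matrix instead of p x p covariance
    eigvals, V = np.linalg.eigh(L)
    order = np.argsort(eigvals)[::-1][:num_components]
    U = A.T @ V[:, order]                       # eigenfaces, Eq. (11)
    U /= np.linalg.norm(U, axis=0)              # normalize each eigenface
    W = A @ U                                   # training weight vectors, Eq. (12)-(13)
    return mean, U, W

def project(face, mean, U):
    # Project a probe image onto the face space, Eq. (14)-(16).
    return (face - mean) @ U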

The sample images are taken from AT&T
database. This database has 40 different classes, with
10 images for each class totalling 400 face images.
The image size is 92 x 112. A random index selector
is used in the proposed work. From this random index
selector, images are grouped into train and test sets.
The system has two phases, namely the training and the
testing phase. Fig. 4 shows the block diagram of the
proposed face recognition system. In the training stage, a
wavelet (Haar, Daubechies or Coiflet filter) is used
in the preprocessing stage. Using different wavelets
allows the face image to be split into various
subbands. This preprocessing results in a face
recognition system which is robust to changes in
illumination and pose, and compresses the image size.
According to the filter, the image size will change and it
gives the four different subband images
(approximation, vertical, horizontal and diagonal components).

Fig. 4 Block diagram of the proposed work
The approximation coefficient is taken as the
input to the next stage. This stage uses PCA for
reducing the dimension of the image space into the
feature space and extracts the principal features. Here
the eigenvectors are calculated and sorted, and the top
eigenvectors are used for representation of the principal components.
In the recognition stage, the test samples were
taken and the initial preprocessing is done as done for
the training frontal images. The classification stage
uses Euclidean distance method to find the minimum
distance between the train set and the test image. For
a given test feature vector, the distance between this
test feature vector and all the train feature vectors is
calculated. From the distance measure, the index with
minimum distance represents the recognized face
from the training set.
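A minimal sketch of this minimum-distance classification stage (assumed shapes; not the authors' code):

import numpy as np

def classify(test_weights, train_weights):
    # Return the index of the closest training image and the Euclidean distance.
    dists = np.linalg.norm(train_weights - test_weights, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])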
Table 1. Performance comparison of various methods (AT&T database)

Method                       Recognition Rate (%)
PCA                          89
Wavelet (Db)                 92.5
Wavelet (Coiflet)            94
Wavelet with PCA (Db)        93.5
Wavelet with PCA (Coiflet)   95.17
Table 2. False acceptance and rejection rates (%) at various thresholds

Threshold   PCA              Wavelet          Wavelet (Coiflet) + PCA
0.30        23.397   2.5     41.602   0       33.589   0
0.35        11.538   7.5     30.384   0       20.897   2.5
0.40        4.487    12.5    21.153   2.5     11.474   2.5
0.45        1.282    30      12.564   7.5     4.679    10
0.50        0.513    45      5.961    12.5    1.154    22.5
0.55        0.064    65      1.666    20      0.128    35
0.60        0        90      0.064    35      0        52.5
0.65        0        97.5    0        52.5    0        75

The proposed work is tested with AT&T
database. This frontal face database has 40 classes
with 10 images for each class totalling 400 images.
All the images are gray scaled with 256 gray scale
levels. Each image is of the size 92 X 112. The
random index generator selects train and test data
from the whole database and this division is done
using the index of each image. This results in the
grouping of train and test data separately. The next
stage is the pre-processing stage, where different
wavelets are used to process or filter the frontal face
images. The different wavelets used for this stage are
‘Haar’, ‘Daubechies’, and ‘Coiflet’. Each of these
wavelets is used with various layers at different
levels. When using these wavelets, the filtered image
reduces to the size 12 X 14, 14 X 16, and 15 X 18
respectively. These subband images are concatenated
to form the image matrix which is taken as the input
to the PCA. By applying PCA algorithm, the image
space is reduced to feature space. Classification is
done using the distance measure method between this
feature space and the probe image. Table 1 shows the
recognition rate for the proposed system along with
other methods. The Table 2 shows the FAR and FRR
rate for the methods tested in the work with various
threshold values. Table 3 shows the threshold value
at equal error rate (EER).
Table 3. Threshold value at EER

Method               Threshold   FAR (%)   FRR (%)
PCA + ED             0.3768      7.5       7.5
Wavelet + ED         0.4623      10.641    10
Wavelet + PCA + ED   0.422       7.5       7.5
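As a minimal sketch of how such FAR/FRR curves and the EER operating point can be obtained (assumed inputs and acceptance rule; not the authors' code):

import numpy as np

def far_frr(genuine, impostor, thresholds, accept=lambda score, t: score <= t):
    # genuine/impostor: arrays of match scores; 'accept' is the assumed acceptance rule.
    far = np.array([accept(impostor, t).mean() for t in thresholds])    # impostors accepted
    frr = np.array([(~accept(genuine, t)).mean() for t in thresholds])  # genuine rejected
    eer_index = int(np.argmin(np.abs(far - frr)))
    return far, frr, thresholds[eer_index]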

The plots of the ROC for the various methods are shown
in Fig. 6. These results show that by taking the
subband images, the misclassification can be reduced.

Fig. 6 shows the ROC curve for the various methods.
The experimental results are encouraging, illustrating that
both combination strategies lead to more accurate
face recognition than the method without a
pre-processing step. Also, the proposed algorithm
has the minimum FAR and FRR when compared to the
other methods. The threshold value for this minimum
FAR and FRR lies in between those of the other two methods.

Fig 5. FAR and FRR curve for the proposed methods


Fig.6 ROC curve for various methods

In the field of biometric feature recognition,
statistical modelling techniques in combination with a
standard pre-processing technique are now receiving
increasing attention and interest among researchers. The
high dimensionality of gallery images in a practical face
database is a research difficulty in biometrics
because it often leads to unsatisfactory
recognition performance in real-world applications. A
facial feature extraction algorithm is presented based
on eigen features and multi-resolution images.
Wavelets have been successfully used in image
processing. Their ability to capture localized spatial-
frequency information of an image motivates their use
for feature extraction. From the experimentation it is
found that a small transform of the face, including
translation, small rotation and illumination changes,
leaves the face recognition performance relatively unaffected. With the application of PCA to the wavelets, the recognition rate is greatly influenced by reducing the dimension of the data. For the AT&T database, good recognition rates of 96.0% are obtained. Thus, the wavelet transform proved to provide an excellent image representation and description. Our proposed method is comparable in the classification rate while much more efficient in terms of EER.

[1] W. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld, "Face recognition: A literature survey," ACM Computing Surveys, vol. 35, no. 4, pp. 399-458, December 2003.
[2] Sirovich, L., and Kirby, M., "Low-dimensional procedure for the characterization of human faces", J. Opt. Soc. Am. A, 4, 3, pp. 519-524, (1987).
[3] Turk, M., and Pentland, A., "Eigenfaces for recognition", Journal of Cognitive Neuroscience, Vol. 3, pp. 71-86, (1991).
[4] Dao-Qing Dai and Hong Yan, "Wavelets and Face Recognition", in Face Recognition, book edited by Kresimir Delac and Mislav Grgic, I-Tech, Vienna, Austria, June 2007, pp. 558.
[5] Bing Luo, Yun Zhang, Yun-Hong Pan, "Face Recognition Based on Wavelet Transform and SVM", International Conference on Information Acquisition, Hong Kong, China, pp. 373-377, 2005.
[6] Ali, M., "Fast Discrete Wavelet Transformation Using FPGAs and Distributed Arithmetic", International Journal of Applied Science and Engineering, 1, 2: 160-171, 2003.
[7] Kirby, M., and Sirovich, L., "Application of the Karhunen-Loeve procedure for the characterization of human faces", IEEE Trans. PAMI, Vol. 12, pp. 103-108, (1991).
[8] Ai-Ling Zhang, Haihong Zhang, and Shuzhi Sam Ge, "Face Recognition by Applying Wavelet Subband Representation and Kernel Associative Memory", IEEE Trans. Neural Networks, vol. 15, no. 1, pp. 166-177, Jan 2004.
[9] Rioul, O. and Vetterli, M., "Wavelets and signal processing", IEEE Signal Processing Magazine, 8, 4: 14-38, 1991.
[10] Beylkin, G., Coifman, R., and Rokhlin, V., "Wavelets in Numerical Analysis", in Wavelets and Their Applications, New York: Jones and Bartlett, pp. 181-210, 1992.

...ohandoss received his Bachelor's degree in Electronics and Communication Engineering from Annamalai University in 2008. He is currently pursuing the M.E. in Process Control and Instrumentation Engineering. His research interests include statistics, data modelling methods and sparse representations in image processing.

...n received his B.E. degree in Electronics and Instrumentation Engineering in 1998 and M.E. degree in Process Control and Instrumentation in 2005. Presently he is working as an Assistant Professor at Annamalai University. He is currently pursuing the Ph.D. degree at Annamalai University. His research interests include statistical modelling methods, image processing, machine vision and control. He is a member of IEEE.

... received her B.E. in Electronics from Govt. College of Technology, Coimbatore in 1988, M.E. in Control and Instrumentation from Anna University in 1998 and Ph.D. in Power Electronics from Anna University in 2007. Currently she is working as a Professor in the Dept. of Instrumentation Engg., Annamalai University. Her areas of interest include image processing, power electronics and control. She is a member of ISA and IEEE.
Ikuesan R. Adeyemi Shukor Abd Razak Nor Amira Nor Azhan
Department of Computer System and Communications,
Faculty of Computer Science and Information Systems,
Universiti Teknologi Malaysia

Research in the field of network forensics is gradually expanding to help in adjudicating, curbing and
apprehending the exponentially growing volume of cyber crime. However, investigating cyber crime
differs depending on the perspective of investigation. There is therefore the need for a comprehensive
model, containing the relevant critical features required for a thorough investigation from each
perspective, which can be adopted by investigators. This paper therefore presents the findings on the
critical features for each perspective, as well as their characteristics. The paper also presents a review of
existing frameworks on network forensics. Furthermore, the paper discusses an illustrative methodological
process for each perspective encompassing the relevant critical features. These illustrations present a
procedure for thorough investigation in network forensics.
Key words: Network Forensics Investigation, Model, Framework, Perspective, Military, Law
Enforcement, Industries, Investigator.
Investigating how an incident occurred and who was involved, with respect to computer networks is
usually referred to as network forensics. Various definitions of network forensics have trailed the network
forensics community. In [2], a network forensics definition is given from the military perspective.
Similarly, [3] presented network forensics in an industry paradigm. Moreover, the generally accepted
description of network forensics is given in the Digital Forensics Research Workshop (DFRWS).
However, in this study, we define network forensics as the study of the underlying aim, action, source and
result of an attack or any incident defined to contravene organization policy, or sets of commands that can
result in the compromise of a system, such as botnets and malware. The inception of a system compromise
or network attack is usually designed as a silent and unnoticeable process, which is often overlooked by
system experts, and consequently progresses into a fully-fledged attack. Such techniques are developed
over time and usually emerge within the scope of most academic syllabi on engineering and computer
science (examples include the digital forensics curriculum) [12, 13].
The academia thus plays a pivotal role in the challenges rocking the digital world. Ironically, the
mitigation of these challenges also resides within the confines of the academia. For effective
investigation, a thorough understanding of the underlying perspective is undeniably required to answer
questions relating to 'who will be involved', 'what are the requirements', 'what resources are available
and in what capacity', and 'to what end', in order to reach a decisive, wholesome and reliable conclusion. The academia
initiates the background knowledge required for this purpose [14, 15]. One could therefore think of the
academia as the pivot upon which every aspect of network forensics is developed, without which network
forensics could go astray.
Network forensics can be viewed from various perspectives, but the prominent ones are the military, law
enforcement, civil litigation, and the network security professional. These perspectives can however, be
generally classified into three
[1, 37]
; ‘law enforcement’, ‘industries’ and ‘military’. The law enforcement
perspective includes personnel in the legal technical institutions, policing systems (examples include first
responder units), and government agencies. Industries refer to personnel in private sectors such as cyber
security specialist, and organization devoted to the provision of forensic capabilities. The military
perspective on the other hand refers to government military arsenal, military research institutes, as well as
other military academic institutions. Moreover, each of these perspectives shares similarities in varying
degrees of personnel, personnel qualification and responsibilities. Figure 1 gives a descriptive analysis of
the generic perspectives in network forensics.

Each of these personnel (researchers, developers, and investigators), shown in Figure 1, though inter-
related in a loop-like relationship, distinctly constitute the composition of network forensics.
Researchers are personnel who undertake findings relevant to promote the existence of network forensics.
Developers on the other hand are personnel who develop relevant softwares and hardware devices,
needed for investigation. Investigators are personnel who engage in investigation. However, in
application, each of these distinct components varies in its objective, methodology, as well as content
scope. Scoping each of these perspectives to provide quantitative insight into the field of network
forensics is therefore imminent, and requires urgent formulation if the network forensics discipline is to
meet its design attributes. Table 1 gives an overview of existing investigative frameworks for digital forensics.
Figure 1: Perspectives of network forensics. It embodies researchers, developers, and investigators but in varying degree of
scope, relevance and priority. In network forensics generic perspective, the personnel are required in almost equal proportion.

Table 1. Review of existing network/digital forensic frameworks

Author(s) / Year         Framework / Model
1995                     Cyberspace model
2001                     Metal model
Ashcroft, 2001           First responders guide
Reith & colleagues       Abstract model
Carrier & Spafford       Event-based scene investigation
                         Model for information security
                         Network forensics readiness
Beebe & Clark            Hierarchical objective-based
                         Augmented waterfall
Forrester & Irwin        Industrial organization model
                         Field triage process model
                         Network forensics readiness
                         ID theft investigation framework
Ray & colleagues
Selamat & colleagues     Investigation framework
                         Country-based investigation
Shakeel & colleagues     Law enforcement framework
Pilli & colleagues       Generic framework
                         Cybercrime investigation
Yussof & colleagues      Common phase investigation
Agarwal & colleagues     Systematic investigation
Ademu & colleagues       Activity-based
Ma & colleagues          Data fusion-based
Moreover, various models and frameworks have been developed to provide insight into the network
forensics perspectives, as shown in Table 1. Though myriads of frameworks from different perspectives
have trailed the network forensics community, there is as yet no single framework that distinctly addresses
the cogent features for the military perspective, the law enforcement perspective, and the industrial
perspective. Thus, this paper details exclusively the critical features required for a thorough network
forensics investigation from the law enforcement, military, and industries perspectives. The rest of the
paper is organized as follows: section 2 details existing frameworks and models for network forensics
perspectives. Section 3 elucidates the analysis of network forensics perspectives, cueing from the various
personnel. In Section 4, we present our illustrative methodological models for network forensics
perspectives. The conclusion is given in section 5.

In [1], the first step on network forensics framework, relevant lexicon and research needs is presented.
Academic researchers, military warfare, critical infrastructure protection and civil litigation paradigm
were identified as the nucleus of network forensics. [2] discussed the challenges militating against
network forensics in military network environments. They identified the information systems of military
organizations as the primary victim of attack. Consequently, network forensics (in the military investigation
process paradigm) is described as the arsenal that provides a conclusive description of all cyber attack
scenes with intent to restore critical information infrastructure, as well as to strengthen the confidence in
the investigative process. However, the use of network forensics simulation tools in military cyber warfare
depends on the specific requirements and desired aim of the organization. The military perspective of
network forensics is usually targeted at a near-real-time investigation process; thus, network forensics
in this paradigm primarily includes the need for physical location detection and behavior-based
algorithm research, to reduce the level of cyber anonymity. [6] further illustrated that the military
environment suffers most of the cyber attacks on critical infrastructures.
[7] proposed a 3-phased law enforcement investigation framework from law enforcement perspective.
They elucidated a review of the cyber law of the “Republic of Maldives”. Similarly, [6] researched on
threat mitigation for cyber investigation. In law enforcement paradigm however, traditional crime
solvability is not necessarily applicable to cyber crime investigation
, but could be applicable to threat
elimination through security hardening, and crime prosecution
. Regardless of the level of
technological improvement, investigation is human-centric (criminals, tool developers, researchers,
prosecutors, investigators, and victims are human); hence a need for awareness maintenance
[4, 9]
cannot be overemphasized. Furthermore, [10] argued that an efficient law-enforcement
investigation process is one which can contextualize any cyber crime into a
behavioral pattern, as well as quantify the network technology for quick examination. Moreover, in
[11] an extended cybercrime investigation model, for efficient cyber investigative practice in law
enforcement community was proposed. In [38], a 5-phased industrial paradigm of investigation is
presented. The phases include readiness, deployment, securing physical scene, securing digital scene and
review phase. The readiness phase is the bedrock upon which investigation is vetted in conformance with
stated organizational policy. An at-scene investigative model is developed in [34]. Furthermore, timeliness
in investigation was considered essentially important, through the introduction of investigation triage (a
medical terminology for prioritization) and a chronology timeline. An overview of existing frameworks is
presented in Table 1. Additionally, Table 2 gives a substantive synopsis of the perspectives in network
forensics, while Table 3 gives an elucidatory description of the various features constituting network
forensics frameworks.
As shown in Table 2, the three perspectives of network forensics can be described distinctly with their
characteristics, technicalities demand, critical focus, critical framework features and distinction.

2.1. Characteristics of Network Forensics Perspectives
Investigating network forensics differs in scope and objective from one perspective to the other. However,
the scope and objective of an investigation usually depict its characteristic features. A brief description of
the characteristics of the three identified perspectives is thus presented in this section.

Table 2. Overview of Network Forensics Perspectives

Characteristics:
- Military: pro-active and defensive investigation; near real time analysis; target of attack; readily available resources.
- Law enforcement: post-mortem, off-line investigation; investigates the target of attack.
- Industries: pro-active investigation; training and certification; near real time analysis; target of attack as well as investigator of the target of attack.
Similarities (all perspectives): investigation covering evidence identification, collection, fusion, analysis and documentation.

Technicalities demand:
- Military: usually near real time investigation; heavy-tailed traffic type; non-jurisdiction bound; inter-nation relationships; low level of legal requirement; 24/7 monitoring and analysis with strictly coordinated, hierarchical investigation; up-to-date technologies and updated software; high level of technological sophistication; highly skilled and experienced personnel; large network environment with a variety of homogeneous (manufacturer) network devices.
- Law enforcement: post-mortem investigation; lightweight traffic type (usually); requires jurisdiction justification; civil litigation; high dependency on legal protocol; occasional, case-specific investigation process; trusted software and approved technological tools; low level of technological sophistication; highly experienced personnel; relatively smaller network environment with a variety of heterogeneous (manufacturer) network devices.
- Industries: near real time as well as post-mortem investigation; heavy-tailed and lightweight traffic types; requires jurisdiction justification; inter-city and inter-nation relationships; high dependency on legal protocol; 24/7 monitoring and analysis; up-to-date technology, enhanced software and self-automated applications; high level of technological sophistication; highly skilled and highly trained personnel; large network environment with a variety of homogeneous (manufacturer) network devices.

Critical focus:
- Military: research centric operation; administrative investigation provision.
- Law enforcement: investigation centric operation; litigation provision.
- Industries: developer and training centric; administrative investigation provision.

Critical framework features:
- Military: Hypothesis, Event reconstruction, Analysis, Awareness, Readiness, Incident response, Approach strategy, Investigation initiation, Modeling and behavior profiling, Risk assessment, Protection, Analysis evaluation, Documentation, Reporting.
- Law enforcement: Chain of custody, Collection, Event reconstruction, Documentation, Analysis, Preservation, Examination, Acquisition, Identification, Digital crime scene, Physical crime scene.
- Industries: Documentation, Analysis, Preparation, Modeling and behavior prediction, Risk assessment, Protection, Design, Implementation, Reporting, Deployment, Examination, Chain of custody.
2.1.1. The Military Perspective
Network forensic investigation in the military looks beyond reactive and tactical cyber defense, to a
proactive strategic cyber investigation. Military leaders have therefore begun the process of cyber
investigation policy, amongst which are international military deterrence, the establishment of a
Distance Early Warning Line (DEWL), and the capability to select from a range of investigative arsenals.
As shown in Table 2, the military perspective of network forensic investigation includes:
• Proactive investigation: this type of investigation process involves the integration of expertise
(expert hackers, script kiddies), motivation (financial gain, selfish aggrandizement, political
achievement, personal/corporate/national vendetta, destruction), and attack vectors [49, 51],
together with the network event analysis procedure, into modus operandi prediction models. Proactive
investigations therefore tend to predict an event before its full incubation, by studying the
underlying network traffic pattern and intelligent correlation. This is essentially relevant for
military investigation as it covers near real time investigation as well as ensures the
readiness of resources. Additionally, such investigative paradigms are built upon the backdrop
that most successful attacks on military networks are heavily sponsored and could cause
irredeemable catastrophic damage if successful.
• Reactive and defensive investigation: defensive investigation involves identifying network
vulnerabilities and implementing the necessary remedy to forestall the exploitation of such
loopholes. Such investigation covers a wide range of information security management systems and
healthy network defense practices. It also involves preventing further incident occurrence
through traffic filtering and network isolation of infected hosts. On the other hand, a reactive
investigation involves investigating network devices and traffic with the aim of responding to
breaches, either directly or counteractively against the intrusion source. Such investigation is
defined by accuracy in identifying the intrusion source, environment and underlying circumstances,
as well as detailed logistical information, which are reliant on the level of preparedness,
situational awareness, and technical expertise [14, 49]. The DOD 1998 Solar Sunrise incident is an
example of such. Attacks such as Moonlight Maze, the Brazilian power outage, and Titan Rain
explicated in [54, 55] are fractions of the myriad range of threats/attacks on national
infrastructure and military networks.
2.1.2 The Law Enforcement Perspective
This perspective of investigation is carried out after an incident has occurred; a post-mortem scavenging
process of network device and network related artifacts, to uncover facts substantial enough for criminal
prosecution. Law enforcement investigation
can also include the military but for the sake of this
research, we refer to law enforcement as government agencies saddled with the judicial responsibility of
investigating cyber related incidents, so as to provide evidence otherwise termed hidden or lost, for cyber
crime related cases. Therefore, the primary responsibility of this perspective is criminal apprehension.
Moreover, deterrence becomes the consequence of the investigation. Being a post-mortem investigation,
it is usually an off-line or passive network evidence collection, identification, analysis, documentation,
and presentation of evidence contravening stipulated law, to court of competent jurisdiction. Additionally,
it exhibits reasonable expectation of prejudice
(a real, substantial and convincing grounds for
investigation must exist before the commencement of investigation).

2.1.3 Industries Perspective
This perspective of investigation is relatively similar to that of the military in the areas of proactive and
defensive investigation. Like the military, it can also be the target of an attack. More unique to this
perspective, however, is the training and certification capacity it also provides. Competent forensic
investigators are usually forged in this perspective before they are deployed or employed in other
perspectives. The industries can also be described as an outsourcing unit for investigators, especially for
law enforcement.
2.2 Distinction in Network Forensics Perspective
The unique features that constitute network forensics for each of the perspectives are presented in this section.

2.2.1. Military perspective
As identified in Table 2, network forensics in the military perspective is characterized by a stochastic
heavy-tailed probability distribution (in [58], Fischer and Fowler identified FTP transfers, page requests,
page reading time, session duration, session size, TCP connections, and packet inter-arrival time as
exhibiting heavy-tail distributions), which is due to real time or near real time analysis. Hence, most
forensic tools developed in this perspective are heavy-tail inclined. Moreover, investigation in this
perspective functions autonomously of jurisdictional boundaries, and does not require any special court
order to react, defend, or initiate investigation. However, monitoring and event analysis are strictly
coordinated and usually follow a hierarchical model of clearance level evaluation, such as the Bell-LaPadula model.

2.2.2 Law Enforcement Perspective
This perspective is case specific and adheres strictly to legal regulation. Since it has to do with evidence
integrity and admissibility in a court of competent jurisdiction, the law enforcement perspective requires
jurisdictional justification, an approved search and seizure warrant, a well documented chain of custody note
(see Table 3), and a transparent investigative process. The strict observance of legal protocol is a cardinal
part of law enforcement investigation.

2.2.3 Industries Perspective
Investigation in this perspective derives its uniqueness from both the military and law enforcement. It is relatively similar to the military as well as the law enforcement perspective in terms of investigation type (near real time or offline), autonomous investigative process, and inter-city and international boundaries. However, this perspective can grow beyond the capacity of any military or law enforcement body, or both. Thus, an industrial perspective can be more complex to describe, but it maintains certain unique features. The various technicalities demanded by each perspective, as well as the critical focus, are presented in Table 2. The critical framework features (see Table 3) are further discussed in the succeeding section.

The criticality of a network forensic feature depends largely on the perspective, size, topology, and expertise of the investigator. The choice of features to include in an investigation also describes the expected thoroughness of the investigation. In this section, we present the features that are critical for network forensic investigation in the three perspectives.
Moreover, a concise descriptive definition of the features used in network forensics is presented in Table 3. These features are derived from existing frameworks on digital forensics investigation. The term 'Ff' is an abbreviation for framework feature. As noted in Table 3, some features are essential for all perspectives irrespective of the crime scene involved. However, some are unique to certain perspectives, which, when included in the investigative process of other perspectives, could result in higher overhead running cost (in terms of resources and efficiency) and redundancy of service.

Table 3: Framework Feature Description

Ff1 - Chain of custody: Chain of custody is a concept, usually a written material, that contains all processes carried out before, during and after an investigation on 'what was done', 'why it was done', 'who did it', and 'when it was done' [17, 19]; a documentation proving the integrity of evidence. (Critical perspective: All network)
Ff2 - Hypothesis: A supposition or proposition put forward by an investigator, as an explanation for an occurrence, to initiate an investigation based on evidence examination [20, 21]. A hypothesis usually follows the SMART (specificity, measurability, attainability, realistic, and timeliness) ideology. (Critical perspective: Industries, Military)
Ff3 - Event reconstruction: The process of reconstructing the sequence of network traffic, from captured traffic accumulated and/or network device logs and other related devices, in order to establish an occurrence and its supporting artifacts. The use of NFAT in the network forensics community today has made this process easier, but it still requires more consolidated and efficient techniques for an undisputable evidence analysis process. (Critical perspective: All network)
Ff4 - Authorization: Investigation authorization involves the granting of legal permission to the effect of commencing the investigation process. This could also involve the acquisition of a search and/or seizure warrant from a court of competent jurisdiction.
Ff5 - Incident closure: The process of closing a particular network investigation exercise, usually after an appropriate satisfactory status. It is preceded by a thorough review of the entire investigation process, a well-articulated chain of custody, documentation and expert review consideration. (Critical perspective: Law enforcement)
Ff6 - Digital crime scene: Securing the digital crime scene involves strict adherence to safe digital procedures for evidence acquisition and preservation. It describes the ethics of first responders and the computer emergency response team (CERT) at a digital crime scene, owing to the fragility and volatility of network forensics evidence. (Critical perspective: Law enforcement)
Ff7 - Physical crime scene: Securing the physical crime scene involves the practice of due caution and professionalism in safeguarding the crime scene, and the use of appropriate signage. It generally describes the responsibility of first responders. (Critical perspective: Law enforcement)
Ff8 - Awareness: Usually associated with staff training on updated knowledge in network forensics. Staff include CERT and organization IT staff. (Critical perspective: Military, Industries)
Ff9 - Readiness: The act of being prepared for investigation at any given time. It combines sections of the organs of an organization for preparedness in the event of an emergency, as well as an anticipated event of network intrusion. (Critical perspective: Industries, Military)
Ff10 - Collection: The process of collecting network traffic information for investigation purposes. It usually takes a reasonable period and occurs in a pre-event-occurrence process. Due to network traffic volatility, evidence collection involves a combination of both network hardware and software [16, 26]. (Critical perspective: All network)
Ff11 - Documentation: The process of taking account of every process and activity carried out during an investigation and the reason why it was done in such a manner. It is the heart of the investigation and contains a strictly articulated write-up of the entire investigation procedure. Documentation also serves as the expert review, the examiner's notes, and a source for future event investigation [17, 16, 25]. (Critical perspective: All network)
Ff12 - Examination: The process of scavenging network traffic for clues or samples of relevant incriminating evidence. Devices to be examined include, but are not limited to, network devices. Examination can be a static/manual process or an automated process. (Critical perspective: Law enforcement)
Ff13 - Analysis: Analysis is sometimes categorized as examination. According to [27], it is the "process of interpreting extracted data, to ascertain the level of relevance or significance to ongoing investigation process". Network forensics analysis tools (NFAT) are usually adopted for this phase (time-framing analysis, data hiding/steganography analysis) of network forensics. It is also the application of validated techniques to discovering or uncovering significant data. (Critical perspective: All network)
Ff14 - Evaluation: Evaluation can occur prior to evidence analysis, in which case it reviews the facts required for examination; during evidence analysis, in which case it determines the accuracy and thorough objectivity of the investigation, as well as conformity to stated priorities; or post event analysis, which involves the review of resultant artifacts against the proposed hypothesis or other related undisputable facts. It is the process of deciding whether to accept or reject the facts uncovered. (Critical perspective: All network)
Ff15 - Preservation: The act as well as the process of ensuring that the state of a particular piece of network traffic evidence is not altered before, during or after event analysis. This is crucial to investigations requiring further analysis or other independent investigation. Preservation is a major factor for evidence admissibility in civil litigation. (Critical perspective: Law enforcement)
Ff16 - Returning of evidence: The process of ensuring that all evidence collected during an investigation is safely returned to its supposed owner, in the same or almost the same condition as at the seizure and acquisition state. (Critical perspective: Law enforcement)
Ff17 - Investigation history: This includes history from previously investigated cases. Initial investigation is the process of gathering relevant artifacts about a particular investigation process, building a predefined network-traffic-behaviour database to ease (with respect to time, resources, and methodology) the investigation process. It marks the beginning of, or call for, investigation. (Critical perspective: Industries, Law enforcement)
Ff18 - Acquisition: The process of gathering or gaining possession of network traffic artifacts for or during investigation. (Critical perspective: Law enforcement)
Ff19 - Deployment: This involves putting in place the respective forensics measures for the proper conduct of an investigation. According to [21, 30], deployment can be initiated after thorough evaluation of inputs from network security agents. (Critical perspective: Industries, Military)
Ff20 - Presentation: The act of presenting, authoritatively, the investigated facts to the relevant constituted authority. It is usually carried out as the last stage of the network forensics investigation phases. (Critical perspective: Law enforcement)
Ff21 - Identification: The process of pinpointing or locating relevant network forensics evidence from a database of network traffic or from a stream of traffic flow. An adequate and precise identification process goes a long way in influencing the amount of resources, the duration of the investigation, and the weight of the evidence. (Critical perspective: All network)
Ff22 - Decision: The process of attributing certain parameters and artifacts of evidence and concluding on the result of the analysis from the investigation. This stage is the most critical phase of investigation, and it requires a thorough review of the entire process, expert counsel, and experience where necessary. (Critical perspective: All network)
Ff23 - Approach strategy: This describes the designed process adopted for the investigation flow: a choice of which phases to carry out, in what sequence, with what resources and in what manner. Approach strategy is a decision-making process which usually involves expert input, in line with organization policies. (Critical perspective: Military)
Ff24 - Preparation: The process of organizing the necessary network forensics requirements and processes for investigation. It also involves the timely dissemination of investigation procedures and schedules to affected parties. (Critical perspective: Industries, Military)
Ff25 - Transportation: The process of moving collected network-related evidence from one place to another (usually a network forensics laboratory) through a secure channel and procedure, in a well-documented order, and duly appended in the chain of custody. (Critical perspective: Law enforcement)
Ff26 - Interaction: The process of communicating relevant investigation processes or results to the constituted authority, with the view of sharing ideas, developing a better evidence decision process, and/or demonstrating the level of investigation success.
Ff27 - Storage: The process of storing network-related artifacts. This process usually involves a well-established storage and retrieval mechanism, with a proper write/read blocker. (Critical perspective: All network)
Ff28 - Search and seizure: Usually attributed to the legal warrant obtained for the commencement of investigation. It involves permission from the constituted legal authority to carry out a search on the victim or suspect system for relevant or incriminating evidence and, when necessary, to seize the evidence source for thorough investigation. (Critical perspective: Law enforcement)
Ff29 - Admission: This involves the taking-in of particular network traffic data as part of the sources of network forensics evidence. Admitting evidence in the network investigation process also involves acknowledging and accepting the evidence as authentic and genuine. (Critical perspective: Law enforcement)
Ff30 - Defense: The process of preventing alteration of network evidence in order to maintain its integrity. Evidence defense also encompasses ensuring that a thorough explanatory analysis is provided to back up the supposition and result of the analysis. (Critical perspective: Law enforcement)
Ff31 - Design and implementation: The process of establishing a workable network forensics investigation pattern and methodology for a particular investigation process. It usually stems from organization policies and the investigator's experience from previously investigated scenes. (Critical perspective: Military, Industries)
Ff32 - Protection: The process of preventing network traffic alteration before, during or after investigation. It is also the practice of ensuring the integrity and validity of evidence for future use or reference. (Critical perspective: All network)
Ff33 - Risk assessment: The act as well as the process of taking into consideration the various factors involved in a network forensics investigation, so as to understand the risk at stake before initiating an investigation. Furthermore, risk assessment is the critical examination of an organization's assets to identify assets that can justify legal redress when deliberately compromised.
Ff34 - Modeling and behavior prediction: This process involves the mathematical or analytical procedures for forecasting the possibilities of event occurrence, to accelerate the investigator's decision-making process. Network forensics modeling and behavior prediction is a complex process that, when properly carried out, can improve the efficiency of network analysis. (Critical perspective: Military, Industries)
Ff35 - Data aggregation: Data aggregation in network forensics is the process of clustering independent, but similar-featured, network traffic. This is executed in a coherent and methodological procedure to speed up investigation time. The process of significant feature identification for data aggregation is given in [35]. (Critical perspective: All network)
Ff36 - Triage: Network forensics triage is the process of sorting and prioritizing the methodology and investigative process in order to increase the overall efficiency of the analysis, evaluation and decision-making process. In [34], a field triage model was defined to shorten the period required for investigation. (Critical perspective: Law enforcement)
In Table 2, a detailed overview of the characteristics, similarities, technicalities, and focus of each of the perspectives is described. In this section, we present the proposed models for each of the perspectives.
Table 4: Critical Features for Network Forensics Perspectives

Perspective        Critical features
Military           Ff1 + Ff2 + Ff3 + Ff4 + Ff8 + Ff9 + Ff10 + Ff11 + Ff13 + Ff14 + Ff19 + Ff21 + Ff22 + Ff23 + Ff24 + Ff26 + Ff27 + Ff30 + Ff31 + Ff32 + Ff34 + Ff35 + Ff36
Law enforcement    Ff1 + Ff3 + Ff5 + Ff6 + Ff7 + Ff10 + Ff11 + Ff12 + Ff13 + Ff14 + Ff15 + Ff16 + Ff17 + Ff18 + Ff20 + Ff21 + Ff22 + Ff25 + Ff27 + Ff28 + Ff29 + Ff30 + Ff32 + Ff35 + Ff36
Industries         Ff1 + Ff2 + Ff3 + Ff8 + Ff9 + Ff10 + Ff11 + Ff13 + Ff14 + Ff17 + Ff19 + Ff21 + Ff22 + Ff24 + Ff27 + Ff31 + Ff32 + Ff33 + Ff34 + Ff35 + Ff36

Investigation process (applies across the perspectives):
• Thorough understanding of the unique investigation scenarios in each perspective
• Selection of features using a sequential methodology, as appropriate for the investigation development life cycle of each perspective
• Acceptable definition and scope of each feature/phase, based on organization policy, the legislation/Acts of the country, and international laws

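Because Table 4 expresses each perspective as a set of Ff features, the overlap between perspectives can be inspected with simple set operations; the following Python sketch is only an illustration of that idea, with the feature sets transcribed from Table 4.

```python
# Illustrative sketch: the critical feature sets of Table 4 as Python sets,
# used to see which framework features are shared across perspectives.
military = {1, 2, 3, 4, 8, 9, 10, 11, 13, 14, 19, 21, 22, 23, 24, 26, 27, 30, 31, 32, 34, 35, 36}
law_enforcement = {1, 3, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21, 22, 25, 27, 28, 29, 30, 32, 35, 36}
industries = {1, 2, 3, 8, 9, 10, 11, 13, 14, 17, 19, 21, 22, 24, 27, 31, 32, 33, 34, 35, 36}

common = military & law_enforcement & industries
print("Ff critical to all perspectives  :", sorted(common))
print("Ff unique to the military view   :", sorted(military - law_enforcement - industries))
print("Ff unique to law enforcement view:", sorted(law_enforcement - military - industries))
```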
4.1 Military perspective illustration
The military perspective highlighted in Table 2 reveals that network forensics in this paradigm requires updated real-time validation. However, before any action can be taken from a real-time analysis, a thorough investigation must be presented in a manner consistent with the military combative methodology. Hence, Table 4 presents the detailed critical features for in-depth investigation. Features such as Ff1, Ff11, and Ff8 are primarily critical for decision defense in the military paradigm of investigation; hence, they cut across the entire phases of the investigation procedure presented in Figure 2.

Figure 2 is a 19-phase (with an additional 3 phases attached to each phase) investigation illustration for network forensics. The MP1 procedure can be further translated into the following sequential procedures:
• Ff2 + Ff4 + Ff10[(Ff14 + Ff35) + (Ff12 + Ff23)] + Ff13 + [Ff19 + Ff22] + Ff26 + Ff31(Ff32 + Ff30) + Ff34 + Ff… (1)
• Ff2 + Ff21 + Ff27 + Ff3 + Ff23 + Ff13(Ff35 + Ff14) + [Ff19 + Ff22] + Ff26 + Ff31(Ff32 + Ff30) + Ff34 + Ff2 … (2)
• Ff2 + Ff24 + [Ff30 + (Ff23 + Ff13 + (Ff19 + Ff22) + Ff26 + Ff32)] + Ff31 + Ff34 + Ff2 … (3)
In contrast to other existing models, this illustration adopts a recursive iteration procedure that can help to reduce the possibility of human error, as well as overlooked facts. Additionally, it reduces the investigation overhead accumulated due to feature-clustered phases.
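As a rough, hypothetical illustration of the recursive character of these procedures, the sketch below models the phase chain of equation (2) as an ordered list that is re-entered until the investigator closes the case; the executor function and closure flag are assumptions of this sketch, not part of the proposed model.

```python
# Hedged sketch: the phase chain of equation (2) run recursively until closure.
# Phase names are the Ff identifiers; execute_phase is a placeholder only.
MP2 = ["Ff2", "Ff21", "Ff27", "Ff3", "Ff23", "Ff13", "Ff35", "Ff14",
       "Ff19", "Ff22", "Ff26", "Ff31", "Ff32", "Ff30", "Ff34"]

def execute_phase(phase: str, case: dict) -> None:
    # Placeholder: record that the phase ran; a real tool would do the work here.
    case.setdefault("log", []).append(phase)

def run_recursive(procedure, case, max_iterations=3):
    """Re-enter the whole chain (the trailing Ff2 in eq. 2) until the case is closed."""
    for _ in range(max_iterations):
        for phase in procedure:
            execute_phase(phase, case)
        if case.get("closed"):          # investigator's closure decision
            break
    return case

print(run_recursive(MP2, {"closed": False})["log"][:5])
```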

4.2 Law Enforcement Perspective Illustration
In Table 2, the law enforcement paradigm of network forensics investigation is characterized as a post-event-occurrence process. Thus, an in-depth post-mortem scavenging of network devices and stored databases is required for a network forensics investigation. Moreover, the investigation procedure differs from one crime scene to another and usually depends on the discretion of the investigator. Hence, Figure 3 depicts an illustrative methodology for network forensic investigation.
Figure 2: Network forensics military investigative perspective illustration.

This illustration involves a 23-phase investigative procedure, which is further translated into the following:
• Ff28 + Ff17 + [Ff6 + Ff7 + (Ff8 + Ff10)] + Ff25 + [Ff27 + Ff15] + Ff3 + Ff21 + Ff12 + (Ff13 + Ff35) + Ff36 + Ff14 + Ff22 + Ff30 + Ff20 + Ff16 + Ff5 … (4)
• Ff29 + Ff17 + [Ff6 + Ff7 + (Ff8 + Ff10)] + Ff25 + [Ff27 + Ff15] + Ff3 + Ff21 + Ff12 + (Ff13 + Ff35) + Ff36 + Ff14 + Ff22 + Ff30 + Ff20 + Ff16 + Ff5 … (5)
Irrespective of the procedure of choice, this example can be seen as a non-recursive investigative process. The translated procedures in (4) and (5) above terminate on the same features (Ff36 + Ff14 + Ff22 + Ff30 + Ff20 + Ff16 + Ff5), further indicating that the law enforcement paradigm of investigation can be termed a project-like investigation.

4.3 Industry Perspective Illustration
Figure 4 depicts an investigative illustration for industries. However, depending on the organizational management policy, some features could be skipped. It involves an 18-phase (with an additional two for each phase) forensics procedure, which can be translated as follows:

Figure 4: An industry perspective illustration of network forensics investigation
• Ff8 + Ff17 + Ff24 + [Ff32 + (Ff2 + Ff21 + Ff19)] + Ff10(Ff32) + [Ff3(Ff14 + Ff15) + Ff27] + Ff13 + Ff22 … (6)
• Ff8 + Ff17 + Ff24 + Ff2 + Ff21 + [Ff10(Ff19)] + Ff10(Ff32) + Ff27 + Ff31 + [Ff34(Ff33)] + Ff8 … (7)
• Ff8 + Ff9 + Ff2 + Ff21 + [Ff19 + Ff10(Ff32)] + Ff27 + Ff31 + [Ff34(Ff33)] + Ff8 … (8)
• Ff8 + Ff9 + Ff2 + Ff21 + [Ff19 + Ff10(Ff32)] + [Ff3(Ff14 + Ff15) + Ff27] + Ff13 + Ff22 … (9)
Each of the above translations distinctly forms a pattern thorough enough for investigation. However, the combination of the features defined in IWP can yield a more thorough investigation result.

Each of the illustrations can be further translated into the dimensions highlighted in equations 1 to 9. In Figure 2, 'IWP' comprises the 'IP' combined with Ff1, Ff11 and Ff8. The Ff8 feature is considered critical due to the need for constant awareness of the latest attack patterns, evolving network malware, and an up-to-date network defense arsenal. The Ff1 feature is critical for every network forensics procedure, as it forms the reservoir of knowledge on evidence detail for every event and process carried out before, during, and after investigation. Similarly, Ff11 serves as the knowledge deposit for event procedure, as well as a resource for proper investigation evaluation and the expert witness note. The integration of the critical features Ff1, Ff11 and Ff8 into the translated procedures in equations 1, 2 and 3 provides investigators a vantage view of the investigation. Additionally, in Figure 3, 'LEWP' represents the entire investigation procedure for the model; the Ff1 and Ff11 features are integrated into every step of the model. Moreover, in Figure 4, 'IWP' integrates Ff1 and Ff11 into each step of the investigative model. With this illustration, network forensics can thoroughly scavenge network devices in a methodological procedure. The GCFIM model proposed in [42] by Yusoff, Ismail and Hassan (2011) identified presentation, preservation, planning, identification, examination, collection and analysis, with values of 7, 4, 3, 6, 5, 6 and 7 respectively, as the common features for investigation from a survey of 14 frameworks. However, they did not identify any specific perspective of application for their 5-phased framework. Moreover, with the description and analysis from this research, a thorough analysis and choice of the features deemed critical to the relevant investigation process can be selected/adopted. Furthermore, a logical, sequential and/or iterative methodological principle can be applied.

In this paper, we discussed the existing network forensics frameworks. Special attention was directed towards the three major perspectives (as identified in most research works, particularly [1] and [37]) of network forensics. Furthermore, we identified the critical features required for thorough investigation, and we synthesized extensively the various perspectives of network forensics. Based on the identified features, we demonstrated illustrative procedures that can be used to integrate these critical features for each perspective.
We hope to conduct extensive experimental work on these illustrations in our research on network forensics analysis and on insider misuse prevention. Additionally, we hope to fully integrate these illustrations into an automated investigative process useful to the cyber policing community, as well as the research community, thus limiting investigators' prerogative in investigation.
[1] Report, D. T. (2001). A Road Map for Digital Forensic Research. Utica, New York: The MITRE
[2] Joseph Giordano, a. C. (2002). Cyber Forensics: A Military Operations Perspective. International
Journal of Digital Evidence , 1 (2).
[3] Wei, R. (n.d.). On A Network Forensics Model For Information Security. 229-234.
[4] Barbara Endicott-Popovsky, D. A. (MAY 2007). A Theoretical Framework for Organizational
Network Forensic Readiness. Journal of Computers, VOL. 2, NO. 3, , 1-11.
[5] Report, D. o. (November 2011). Department of Defense Cyberspace Policy Report. Pursuant to
Section 934 of the NDAA of FY2011.
[6] Lemieux, F. (2011). Investigating Cyber Security Threats: Exploring National Security and Law
Enforcement Perspectives. Washington D.C: The George Washington University.
[7] Ibrahim Shakeel, A. D. (2011). A Framework for Digital Law Enforcement in Maldives. Second
International Conference on Computer Research and Development (pp. 146-159). IEEE
computer society.
[8] Xiu-yu, Z. (2010). A Model of Online Attack Detection for Computer Forensics. International
Conference on Computer Application and System Modeling (ICCASM 2010) (pp. V8/533-
V8/537). IEEE.
[9] Fang Lan, W. C. (2010). A Framework for Network Security Situation Awareness Based on
Knowledge discovery. 2nd International Conference on Computer Engineering and Technology
(pp. 226-231). IEEE.
[10] Hunton, P. (2011 ). A rigorous approach to formalising the technical investigation stages of
cybercrime and criminality within a UK law enforcement environment. Digital investigation ,
[11] Hunton, P. (2011). The stages of cybercrime investigations: Bridging the gap between technology
examination and law enforcement investigation. Computer Law & Security Review, 61-67.
[12] William Figg, a. Z. (2007). A Computer Forensics Minor Curriculum Proposal. CCSC: Central
Plains Conference (pp. 32-38). Consortium for Computing Sciences in Colleges.
[13] Larry Gottschalk, J. L. (2005). Computer Forensics Programs in Higher Education: A Preliminary
Study. SIGCSE'05, February 23-27, (pp. 147-151). St. Louis, Missouri, USA.: ACM.
[14] Harjinder Singh Lallie, An overview of the digital forensic investigation infrastructure of India,
Digital Investigation, Available online 1 March 2012, ISSN 1742-2876,
[15] Mennell, J. ((2006) ). The future of forensic and crime scene science Part II. A UK perspective on
forensic science education. Forensic Science International 157S , S13–S20.
[16] Garfinkel, S. L. (2010). Digital forensics research: The next 10 years. Digital Investigation 7, S64-S73.
[17] SOLUTIONS, M. L. Maintaining the Chain of Custody in Civil Litigation.
[18] Chain of Custody. (2012). Retrieved June, 01:15am 09, , 2012, from California Peace Officers
Legal Sourcebook:
[19] Forentech. (2005). Forensic Lifecycle. Retrieved 06 09. 01:15am., 2012, from Forensic Lifecycle
White Paper:
[20] Ciardhuáin, S. Ó. (2004). An Extended Model of Cybercrime Investigations. International
Journal of Digital Evidence .
[21] Brian D. Carrier, a. E. (2004). An Event-Based Digital Forensic Investigation Framework. IEEE .
[22] Sundararaman Jeyaraman, a. M. (2006). An Empirical Study Of Automatic Event Reconstruction
Systems. Purdue University, West Lafayette, IN 47907-2086.
[23] André Årnes, P. H. (2006). Digital Forensic Reconstruction and the Virtual Security Testbed
ViSe. Retrieved 06 09, 2012, from
[24] Richard A. Caralli, J. H. (2010). Resilience Management Model, Incident Management and
Control (IMC).
[25] Computer Crime Investigation & Computer Forensics. (n.d.). Retrieved 06 09, 2012, from
[26] Ren, W. (2004). On the Novel Network Forensics Perspective of Enhanced E-Business Security.
The Fourth International Conference on Electronic Business, (pp. 1355-1360). Beijing.
[27] Sarah V. Hart, U. D. (2004). Forensic Examination of Digital Evidence: A Guide for Law
Enforcement. National Institute of Justice.
[28] Emmanuel S. Pilli, R. J. (2010). Network forensic frameworks: Survey and research challenges.
digital investigation , 14-27.
[29] Angelopoulou, O. (2007). ID Theft: A Computer Forensics’ Investigation Framework. Edith
Cowan University.
[30] Ph.D, R. R. (2004). A Ten Step Process for Forensic Readiness. International Journal of Digital
Evidence .
[31] Brian Hay, a. K. (n.d.). Forensics Examination of Volatile System Data Using Virtual
Introspection. 74-82.
[32] Siti Rahayu Selamat, R. Y. (October 2008). Mapping Process of Digital Forensic Investigation
Framework. IJCSNS International Journal of Computer Science and Network Security , 163-169.
[33] Barbara Endicott-Popovsky, D. A. (MAY 2007). A Theoretical Framework for Organizational
Network Forensic Readiness. Journal of Computers, VOL. 2, NO. 3, , 1-11.
[34] Marcus K. Rogers, J. G. (2007). Computer Forensics Field Triage Process Model. Conference on
Digital Forensics, Security and Law, 2006, (pp. 27-40).
[35] Sung, S. M. (2003). Identifying Significant Features for Network Forensic Analysis Using
Artificial Intelligent Techniques. International Journal of Digital Evidence Winter , 1-17.
[36] Mark M. Pollitt, M. (1995). Computer Forensics: an approach to evidence in cyberspace.
[37] Mark M. Pollitt, M. (2007). An Ad Hoc Review of Digital Forensic Models. Proceedings of the
Second International Workshop on Systematic Approaches to Digital Forensic Engineering.
IEEE computer society.
[38] Jock Forrester, a. B. A Digital Forensic Investigative Model For Business Organisations.
Distributed Multimedia Centre of Excellence at Rhodes University.
[39] Nicole Lang Beebe, a. J. (2005). A Hierarchical, Objectives-Based Framework for the Digital
Investigations Process. Digital Investigation , 146-166.
[40] Hunton, P. (2011). The stages of cybercrime investigations: Bridging the gap between technology
examination and law enforcement investigation. Computer Law & Security Review, 61-67.
[41] Mark Reith, C. C. (2002). An Examination of Digital Forensic Models. International Journal of
Digital Evidence , 1-12.
[42] Yunus Yusoff, R. I. (2011). Common Phases Of Computer Forensics Investigation Models.
International Journal of Computer Science & Information Technology (IJCSIT), , 17-31
[43] Guofu Ma, C. S. (August 19-22, 2011). Study on Digital Forensics Model Based on Data Fusion.
International Conference on Mechatronic Science, Electric Engineering and Computer (pp. 898-
901). Jilin, China: IEEE.
[44] Perumal, S. (2009). Digital Forensic Model Based On Malaysian Investigation process. IJCSNS
International Journal of Computer Science and Network Security , 38-44.
[45] Daniel A. Ray, a. P. Models of Models: Digital Forensics and Domain-Specific Languages.
Department of Computer Science, The University of Alabama, Tuscaloosa, AL.
[46] Mr. Ankit Agarwal, M. M. (2011). Systematic Digital Forensic Investigation Model.
International Journal of Computer Science and Security (IJCSS) , 118-131.
[47] Inikpi O. Ademu, D. C. (2011). A New Approach of Digital Forensic Model for Digital Forensic
Investigation. (IJACSA) International Journal of Advanced Computer Science and Applications, ,
[48] Kenneth Geers, The challenge of cyber attack deterrence, Computer Law & Security Review,
Volume 26, Issue 3, May 2010, Pages 298-303, ISSN 0267-3649, 10.1016/j.clsr.2010.03.003.
[49] Will Gragido, John Pirc, 7 - Cyber X: Criminal Syndicates, Nation States, Subnational Entities,
and Beyond, Cybercrime and Espionage, Syngress, Boston, 2011, Pages 115-133, ISBN
9781597496131, 10.1016/B978-1-59749-613-1.00007-8.
[50] Jason Andress, Steve Winterfeld, Chapter 3 - Cyber Doctrine, Cyber Warfare, Syngress, Boston,
2011, Pages 37-59, ISBN 9781597496377, 10.1016/B978-1-59749-637-7.00003-4.
[51] Siebert, E. (2010). The Case for Security Information and Event Management (SIEM) in
Proactive Network Defense. SolarWinds.
[52] Eoghan Casey, Christopher Daywalt, Andy Johnston, Chapter 4 - Intrusion Investigation, In:
Eoghan Casey, Editor(s), Handbook of Digital Forensics and Investigation, Academic Press, San
Diego, 2010, Pages 135-206, ISBN 9780123742674, 10.1016/B978-0-12-374267-4.00004-5.
[53] Natale Fusaro, Erratum to “The role of the expert, of the technical consultant and of the
consultant for the defensive investigations in the criminal trial” [Forensic Sci. Int. 146 (2004)
S219–S220], Forensic Science International, Volume 153, Issues 2–3, 29 October 2005, Pages
277-278, ISSN 0379-0738, 10.1016/j.forsciint.2005.03.004.
[54] Gelinas, R. R. (2010). Cyberdeterrence And The Problem Of Attribution. Washington DC:
Graduate School of Arts and Sciences of Georgetown University.
[55] Command Five Pty Ltd (2011). Advanced Persistent Threats: A Decade in Review.
[56] 3-19.13, F. (2005, 01). Law Enforcement Investigations, Computer Crimes. Retrieved 6 19, 2012,
from Law Enforcement Investigations :
[57] Concepts, F. (2006, 10 05). Law enforcement investigations. Retrieved 06 19, 2012, from
[58] Amarjit Budhiraja, Xin Liu, Multiscale diffusion approximations for stochastic networks in heavy
traffic, Stochastic Processes and their Applications, Volume 121, Issue 3, March 2011, Pages
630-656, ISSN 0304-4149, 10.1016/
[59] Rushby, J. (1986, 06 20). The Bell and La Padula Security Model. Retrieved 06 20, 2012, from

Hybrid Model of Rough Sets & Neural Network for
Classifying Renal Stones Patients

Department of IS, Faculty of Computers and Information Science, and Urology and Nephrology Center,
Mansoura University, Mansoura, Egypt
Abstract— Despite advances in diagnosis and therapy, renal stone disease remains a significant health problem. A considerable amount of time and effort has been expended researching and developing systems capable of correctly classifying renal stone patients. This paper proposes a new approach to this classification, using a hybrid model of Rough Sets and ANN to predict optimum renal stone fragmentation in patients being managed by extracorporeal shock wave lithotripsy (ESWL). Rough Sets are used to reduce the input factors and detect the most important ones, which are then used as input vectors for the ANN. The result (of treatment and of the model) is either Free (success in removing the stone completely) or Not Free.

Keywords— Renal Stones, Rough Sets, Artificial Neural Network (ANN), Fragmentation, Extracorporeal Shock Wave Lithotripsy (ESWL).

Patients & Methods— We reviewed the records of patients who underwent ESWL as monotherapy for renal stones at the Center for Kidney and Urology in Mansoura from 1998 to 2010. Data included clinical characteristics, the stone-free rate and its relationship to stone size and location, lithotripter and

Two decades ago open surgery was the main treatment for symptomatic renal stones, but it is increasingly being replaced by less invasive therapies. At present, ESWL, percutaneous nephrolithotomy and other endoscopic methods of stone retrieval are being used to treat more than 90% of patients with renal stones. Open surgical procedures are now infrequent for treating renal stones, and thus the incidence of morbidity associated with stone surgery has also markedly decreased. Since its introduction in 1980 [1], ESWL has considerably changed the management of renal stones and has become the therapeutic procedure of choice in most cases.
The management of urolithiasis is a clinical challenge worldwide; it may result in difficulty in diagnosis, treatment and prevention of recurrence, especially with regard to the choice of procedure. Interventional options include extracorporeal shock wave lithotripsy (ESWL), ureteroscopic lithotripsy and percutaneous nephrolithotomy (PCN). Clinical challenges include the decision to treat and the choice of procedure, whose outcome is dependent on multiple pre-determined variables. At present [2], decision-making is based on clinical expertise, and statistical models such as matched-pair and multivariate regression are often confounded by the non-linearity and high variability of medical data, termed the inherent
Rough sets, developed by Zdzislaw Pawlak in the early 1980s, deal with the classification of data tables and focus on structural relationships in data sets. Rough Set theory constitutes a framework for inducing minimal decision rules; these rules in turn can be used to perform a classification task. The main goal of rough set analysis is to search large databases for meaningful decision rules and finally acquire new knowledge [3].
In this paper, rough sets are used to synthesize approximations of concepts from the acquired data. The starting point of Rough Set theory is the observation that objects having the same description are indiscernible (similar) with respect to the available information (the defined attribute values).
Previous work
In recent years, various approaches to predicting useful results have been proposed.
One of the first reports of ANNs in urolithiasis was by Michaels et al. [4], who compared standard computational methods (linear and quadratic discriminant analysis) and ANNs in the prediction of stone growth after ESWL. A three-layer feed-forward ANN with EBP was used to analyze a data set of 98 patients: a training set of 65 and a test set of 33 patients. An ANN was used to predict increased stone volume using variables including pre-existing metabolic abnormalities, infection and stone size. The ANN showed that no single variable per se was predictive of continued stone formation. Comparing linear and quadratic discriminant function analysis with the ANN revealed sensitivities of 100% and 91% and specificities of 0% and 91%, respectively. They concluded that ANNs can accurately predict future stone activity after ESWL.
Poulakis et al. [5] further used a feed-forward ANN with EBP to evaluate variables shown to affect lower pole calculi clearance after ESWL, using a data set of 680 patients: 101 kidneys for training and 600 for testing. The overall stone clearance rate was reported at 68%, with 26.1% of cases requiring further intervention in the form of further ESWL, ureteroscopy or PCN. The most influential prognostic variable for clearance was pathological urinary transport [19]; the ANN was shown to have 92% predictive accuracy for lower pole stone clearance.
Gomha et al. [6] compared an ANN with a logistic regression model to predict stone-free status after ESWL. The logistic regression model was constructed using backward likelihood-ratio selection, and the ANN used a three-layer feed-forward model with EBP. Both models were trained on 688 cases and tested on 296 cases. Comparing logistic regression with the ANN revealed sensitivities of 100% and 77.9%, specificities of 0% and 75%, positive predictive values of 93.2% and 97.2%, and overall accuracies of 93.2% and 77.7%, respectively. They concluded that ANNs were better at identifying cases that were unlikely to respond to ESWL and in which further treatment would be needed.
Hamid et al. [1] assessed the ability of a feed-forward ANN with EBP to predict optimum renal stone fragmentation after ESWL using data from 82 patients: 60 cases for training and 22 for testing. The ANN identified stone size as the most influential variable, followed by the total number of shocks given and 24-h urinary volume. The ANN accurately predicted optimal fragmentation in 17 of 22 patients, and identified five patients in whom fragmentation did not occur irrespective of the number of shocks given. A 75% correlation was found between the number of shocks given and the number predicted by the ANN.
Cummings et al. [7] developed an ANN for calculating the probability of spontaneous ureteric stone passage; the model was trained on 125 patients and was accurate in 69% of the cases.
Sonke et al. [8] used several noninvasive variables, including the IPSS, and produced an MLP network to predict the outcome of urinary pressure-flow studies. Their model was trained and tested on 1903 patients and yielded 69% specificity at 71% sensitivity.
Bertrand et al. [9] compared discriminant analysis, logistic regression analysis and an ANN for assessing the risk of urinary calcium stones among men. The models were trained and tested on 215 cases. Comparing discriminant analysis/logistic regression with the ANN revealed sensitivities of 66.4% and 62.2%, specificities of 87.5% and 89.8%, and positive predictive values of 75.8% and 74.4%.
Neeraj K. et al. [10] compared the accuracy of ANN analysis and multivariate regression analysis (MVRA) for renal stone fragmentation by ESWL. Data from 196 patients were used for training the BP network and the MVRA model. The predictive ability of the trained ANN and MVRA was tested on 80 subsequent patients, giving a sensitivity (prediction of the number of shocks) of 57.26% for MVRA and 93.29% for the ANN.
Studies have examined the role of ANNs in the prediction of stone presence and composition, spontaneous passage, clearance and regrowth after treatment. This paper suggests that ANNs can identify important predictive variables and accurately predict treatment outcome when used with RS.
All of these studies try to arrive at a good classification using neural networks, and some of them already achieve good tools, but only by taking all patient attributes. ANNs can be useful for assessing several urological diseases and provide an 'intelligent' means of predicting useful outcomes with greater accuracy and efficiency; with the use of Rough Sets, the model also offers medical advantages, i.e., no need for prolonged anti-inflammatory treatment and no need for pyelography (because diagnosis is also based on KUB, echography and helical CT only when necessary), anaesthesia or systematic hospitalization.
1. The paper is organized as follows: Section 2 gives an overview of the preliminaries of Artificial Neural Networks (ANN) and Rough Sets, the two concepts used in this paper. Section 3 presents the proposed hybrid model that predicts optimum renal stone fragmentation based on these preliminaries. Section 4 shows simple experimental results of the proposed model, and Section 5 concludes this paper.
2. Preliminaries
2.1 Rough Set Theory
Rough set theory, proposed by Pawlak, is an effective approach to imprecision, vagueness, and uncertainty. Rough set theory overlaps with many other theories, such as fuzzy sets, evidence theory, and statistics. From a practical point of view, it is a good tool for data analysis. The main goal of rough set analysis is to synthesize approximations of concepts from acquired data. The starting point of rough set theory is the observation that objects having the same description are indiscernible (similar) with respect to the available information. The indiscernibility relation is a fundamental concept of rough set theory used in complete information systems. The starting point of rough-set-based data analysis is a data set called an information system (IS). An IS is a data table whose columns are labeled by attributes, whose rows are labeled by objects or cases, and whose entries are the attribute values. Formally, IS = (U, AT), where U and AT are nonempty finite sets called "the universe" and "the set of attributes," respectively. Every attribute a ∈ AT has a set Va of values called the "domain of a". Any information table defines a function ρ that maps the direct product U × AT into the set of all values assigned to each attribute. The concept of the indiscernibility relation is essential in rough set theory and is used to distinguish objects described by a set of attributes in complete information systems. Each subset A ⊆ AT defines an indiscernibility relation as follows:

IND(A) = {(x, y) ∈ U × U : ρ(x, a) = ρ(y, a) ∀ a ∈ A, A ⊆ AT}    (1)

Obviously, IND(A) is an equivalence relation; the family of all equivalence classes of IND(A), that is, the partition determined by A, is denoted by U/IND(A) or U/A [11]. Obviously IND(A) is an equivalence relation and:

IND(A) = ∩ IND(a), where a ∈ A    (2)

A fundamental problem discussed in rough sets is whether the whole knowledge extracted from data sets is always necessary to classify objects in the universe; this problem arises in many practical applications and is referred to as knowledge reduction. The two fundamental concepts used in knowledge reduction are the core and the reduct. Intuitively, a reduct of knowledge is its essential part, which suffices to define all basic classifications occurring in the considered knowledge, whereas the core is, in a certain sense, its most important part.
A reduct can be thought of as a sufficient set of features
– sufficient, that is, to represent the category structure.
The reduct of an information system is not unique: there
may be many subsets of attributes which preserve the
equivalence-class structure (i.e., the knowledge)
expressed in the information system. The set of
attributes which is common to all reducts is called the
Core: the core is the set of attributes which is possessed
by every legitimate reduct, and therefore consists of
attributes which cannot be removed from the
information system without causing collapse of the
equivalence-class structure. The core may be thought of
as the set of necessary attributes – necessary, that is, for
the category structure to be represented.[11]
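As a minimal illustration of the indiscernibility relation defined in equations (1) and (2), the following Python sketch groups the objects of a small, invented information system by their attribute-value tuples to obtain U/IND(A); the data and helper names are assumptions of this sketch, not taken from the paper.

```python
from collections import defaultdict

# Hedged sketch: compute U/IND(A) for a toy information system.
# Rows are objects, columns are attribute values (invented example data).
table = {
    "O1": {"a1": 1, "a2": 0, "a3": 1},
    "O2": {"a1": 1, "a2": 0, "a3": 1},
    "O3": {"a1": 0, "a2": 1, "a3": 1},
}

def ind_partition(table, attrs):
    """Group objects whose values agree on every attribute in `attrs`."""
    classes = defaultdict(list)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes[key].append(obj)
    return list(classes.values())

print(ind_partition(table, ["a1", "a2", "a3"]))   # [['O1', 'O2'], ['O3']]
```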
2.2 Artificial Neural Networks
Artificial neural networks are relatively crude electronic
networks of "neurons" based on the neural structure of
the brain. They process records one at a time, and
"learn" by comparing their classification of the record
(which, at the outset, is largely arbitrary) with the known
actual classification of the record. The errors from the
initial classification of the first record is fed back into
the network, and used to modify the networks algorithm
the second time around, and so on for many
iterations. ANNs have been shown to make accurate
predictions in many aspects of medical practice by
pattern recognition and learning .In medicine, artificial
neural networks (ANNs) are the most widely described
form of artificial intelligence (AI),a branch of computer
science concerned with the emulation of complex human
thought processes such as adaptive learning,
optimization, reasoning and decision-making. ANNs are
inspired by, and loosely modeled upon the structure and
the function of biological nervous systems, being
composed of a series of interconnecting parallel non-
linear processing elements (nodes) with a limited
number of inputs and outputs. Medical practice requires
human acquisition, analysis and application of a vast
amount of information in a variety of complex clinical
scenarios. At present, there is no adequate substitute for
the expertise of an experienced clinician. However,
computers can process and analyze large quantities of
data rapidly and efficiently, and so in theory, AI could
facilitate clinical decision-making processes: diagnosis,
treatment and prediction of outcome.[2]
3. Hybrid model of Rough Sets and ANN
The hybrid model discussed in this paper is a combination system that contains two intelligent methodologies: Rough Sets and ANN.

3.1 Attributes Reduction Using Rough Sets
The hybrid combinational model is divided into two main parts. The first part consists of the RS steps used to reduce the input attributes and detect the most influential ones. It receives N objects with multi-valued attributes as input, as shown in the sample information system of Figure 2.
Object P1 P2 P3 P4 P5
O1 1 2 0 1 1
O2 1 2 0 1 1
O3 2 0 0 1 0
O4 0 0 1 2 1
O5 2 1 0 2 1
O6 0 0 1 2 2
O7 2 0 0 1 0
O8 0 1 2 2 1
O9 2 1 0 2 2
O10 2 0 0 1 0
Figure 2: Sample Information System
When the full set of attributes P = {P1, P2, P3, P4, P5} is considered, we have the following seven equivalence classes:

{O1, O2}, {O3, O7, O10}, {O4}, {O5}, {O6}, {O8}, {O9}
Thus, the two objects within the first equivalence class,
{O1,O2}, cannot be distinguished from one another based on
the available attributes, and the three objects within the second
equivalence class, {O3,O7,O10}, cannot be distinguished from
one another based on the available attributes. The remaining
five objects are each discernible from all other objects. The
equivalence classes of the P-indiscernibility relation are
denoted [x]P.
It is apparent that different attribute subset selections will in general lead to different indiscernibility classes. For example, if attribute P = {P1} alone is selected, we obtain the following, much coarser, equivalence-class structure:

{O1, O2}, {O3, O5, O7, O9, O10}, {O4, O6, O8}
Let X ⊆ U be a target set that we wish to represent using attribute subset P; that is, we are told that an arbitrary set of objects X comprises a single class, and we wish to express this class (i.e., this subset) using the equivalence classes induced by attribute subset P. In general, X cannot be expressed exactly, because the set may include and exclude objects which are indistinguishable on the basis of the attributes P. [11]
For example, consider the target set X = {O1, O2, O3, O4}, and let the attribute subset P = {P1, P2, P3, P4, P5}, the full available set of features. The set X cannot be expressed exactly, because in [x]P the objects {O3, O7, O10} are indiscernible. Thus, there is no way to represent any set X which includes O3 but excludes objects O7 and O10.
A reduct can be thought of as a sufficient set of features – sufficient, that is, to represent the category structure. In the example table above, attribute set {P3, P4, P5} is a reduct – the information system projected onto just these attributes possesses the same equivalence-class structure as that expressed by the full attribute set:

{O1, O2}, {O3, O7, O10}, {O4}, {O5}, {O6}, {O8}, {O9}
Attribute set {P3, P4, P5} is a legitimate reduct because eliminating any of these attributes causes a collapse of the equivalence-class structure: the projection onto the remaining attributes no longer yields the same partition as [x]P. The reduct of an information system is
not unique: there may be many subsets of attributes which
preserve the equivalence-class structure (i.e., the knowledge)
expressed in the information system. In the example
information system above, another reduct is {P1, P2, P5},
producing the same equivalence-class structure as [x]P.The set
of attributes which is common to all reducts is called the Core:
the core is the set of attributes which is possessed by every
legitimate reduct, and therefore consists of attributes which
cannot be removed from the information system without
causing collapse of the equivalence-class structure. The core
may be thought of as the set of necessary attributes – necessary,
that is, for the category structure to be represented. In the
example, the only such attribute is {P5}; any one of the other
attributes can be removed singly without damaging the
equivalence-class structure, and hence these are all dispensable.
However, removing {P5} by itself does change the
equivalence-class structure, and thus {P5} is the indispensable
attribute of this information system, and hence the core.[11]
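The reduct and core of the sample information system in Figure 2 can also be checked by brute force; the following Python sketch (an illustration only, not the ROSETTA procedure used later) enumerates attribute subsets, keeps the minimal ones that preserve the full partition, and intersects them to obtain the core.

```python
from collections import defaultdict
from itertools import combinations

# Hedged sketch: brute-force reducts and core for the sample information
# system of Figure 2 (values transcribed from the table above).
data = {
    "O1": (1, 2, 0, 1, 1), "O2": (1, 2, 0, 1, 1), "O3": (2, 0, 0, 1, 0),
    "O4": (0, 0, 1, 2, 1), "O5": (2, 1, 0, 2, 1), "O6": (0, 0, 1, 2, 2),
    "O7": (2, 0, 0, 1, 0), "O8": (0, 1, 2, 2, 1), "O9": (2, 1, 0, 2, 2),
    "O10": (2, 0, 0, 1, 0),
}
attrs = ["P1", "P2", "P3", "P4", "P5"]

def partition(subset):
    """Equivalence classes of IND(subset), as frozensets of object names."""
    groups = defaultdict(set)
    for obj, row in data.items():
        groups[tuple(row[attrs.index(a)] for a in subset)].add(obj)
    return set(map(frozenset, groups.values()))

full = partition(attrs)
reducts = []
for size in range(1, len(attrs) + 1):
    for subset in combinations(attrs, size):
        if partition(subset) == full and not any(set(r) <= set(subset) for r in reducts):
            reducts.append(subset)

print("Reducts:", reducts)                               # includes ('P1','P2','P5') and ('P3','P4','P5')
print("Core   :", set.intersection(*map(set, reducts)))  # {'P5'}
```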
In order to reduce the input attributes and detect the core in the manner described above, we use the ROSETTA application v1.4.41, which supports the reduction algorithms. We select the Johnson Reducer algorithm, which has a natural bias towards finding a single prime implicant of minimal length. The reduct B is found by executing the algorithm outlined below, where S denotes the set of sets corresponding to the discernibility function, and w(S) denotes a weight for a set S in S that is automatically computed from the data.

Algorithm 1: Basic Johnson Reducer Algorithm
1. Let B = Φ.
2. Let a denote the attribute that maximizes Σ w(S), where the sum is taken over all sets S in S that contain a. Currently, ties are resolved arbitrarily.
3. Add a to B.
4. Remove all sets S from S that contain a.
5. If S = Φ, return B. Otherwise, go to step 2.
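A compact Python sketch of this greedy strategy is given below; the discernibility sets and weights are invented for illustration, and the function is an assumption of this sketch rather than the ROSETTA implementation.

```python
from collections import Counter

def johnson_reducer(discernibility_sets, weight=None):
    """Greedy Johnson strategy: repeatedly pick the attribute hitting the
    largest (weighted) number of remaining discernibility sets."""
    weight = weight or (lambda s: 1)            # unit weights unless supplied
    remaining = [set(s) for s in discernibility_sets]
    reduct = set()
    while remaining:
        scores = Counter()
        for s in remaining:
            for attr in s:
                scores[attr] += weight(frozenset(s))
        best = max(scores, key=scores.get)      # ties resolved arbitrarily
        reduct.add(best)
        remaining = [s for s in remaining if best not in s]
    return reduct

# Invented example: each inner set lists attributes that discern a pair of objects.
sets = [{"age", "length"}, {"length", "site"}, {"sex", "length"}, {"opacity"}]
print(johnson_reducer(sets))                    # {'length', 'opacity'}
```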
Support for computing approximate solutions is provided by aborting the loop when "enough" sets have been removed from S, instead of requiring that S be fully emptied. The support count associated with the computed reduct equals the reduct's hitting fraction multiplied by 100, i.e., the percentage of sets in S with which B has a non-empty intersection [12]. The application receives an Excel file containing all patient and stone characteristics (feature extraction record), as listed in Table 1; the variable definitions are listed in Table 2.
TABLE 1: ESWL treatment parameters
serialNo age sex side number length opacity nature solitary morpholo ager free anatomy2 jjstent morph3 comp2 postpr site2 lengthm2 sessn Class
1 61 2 2 1 10 1 1 N 1 2 1 0 0 1 0 0 1 1 1 1
2 50 1 1 1 15 1 1 N 1 2 1 0 0 1 0 0 1 1 1 1
3 45 1 1 1 25 1 1 N 3 2 1 0 0 3 0 0 1 2 1 1
4 46 2 1 1 15 1 2 N 3 2 1 0 0 3 0 0 1 1 2 1
5 39 2 1 1 10 1 1 N 1 1 1 0 0 1 0 0 1 1 2 1
6 56 1 2 2 19 1 1 N 3 2 1 0 0 3 0 0 1 2 2 1
7 42 1 2 1 18 1 1 N 1 2 1 1 0 1 0 0 1 2 2 1
8 29 1 1 2 18 1 2 N 3 1 1 0 0 3 0 0 1 2 1 1
Table 2: Variables Definition
Variable      Value   Label
Sex           1
Side          1
Number        1
Opacity       1       OPAQUE
              2       LUCENT
Nature        1
Morpholo      1
Ager          1       > 40
Anatomy       0
JJstent       0
Morph3        1
Comp          0
postpr        0
Site          1
Length        1       <= 15 mm / > 15 mm
Sessn         1
Free (o/p)    1       Not Free
It applies the algorithm as described above and generates the reduced attribute set, which has 8 variables {age, sex, side, number, length, opacity, morphology, site} instead of 20.
3.2 ANN for predicting optimum renal stone fragmentation
The second part of our model consists of an ANN to classify renal stone patients into Free (success in removing the stone completely) or Not Free. This may be achieved by constructing a predictive model, taking into account all the reduced variables affecting stone-free status resulting from the Rough Set model. We trained several three-layer, feed-forward neural networks with the back-propagation of error algorithm to predict stone-free status.
Feed forward, Back-Propagation
The feed-forward, back-propagation architecture was developed in the early 1970s by several independent sources (Werbos; Parker; Rumelhart, Hinton and Williams). This independent co-development was the result of a proliferation of articles and talks at various conferences, which stimulated the entire industry. Currently, this synergistically developed back-propagation architecture is the most popular, effective, and easy-to-learn model for complex, multi-layered networks. Its greatest strength lies in non-linear solutions to ill-defined problems. The typical back-propagation network has an input layer, an output layer, and at least one hidden layer. There is no theoretical limit on the number of hidden layers, but typically there are just one or two. Some work indicates that a maximum of five layers (one input layer, three hidden layers and an output layer) is required to solve problems of any complexity. Each layer is fully connected to the succeeding layer.
As noted above, the training process normally uses some
variant of the Delta Rule, which starts with the calculated
difference between the actual outputs and the desired outputs.
Using this error, connection weights are increased in proportion
to the error times a scaling factor for global accuracy. Doing
this for an individual node means that the inputs, the output,
and the desired output all have to be present at the same
processing element. Training inputs are applied to the input
layer of the network, and desired outputs are compared at the
output layer. During the learning process, a forward sweep is
made through the network, and the output of each element is
computed layer by layer. The difference between the output of
the final layer and the desired output is back-propagated to the
previous layer(s), usually modified by the derivative of the
transfer function, and the connection weights are normally
adjusted using the Delta Rule. This process proceeds for the
previous layer(s) until the input layer is reached.[12]
Architecture of the Feed-Forward Back Propagation

Figure 1 : Feed-Forward Back Propagation architecture
Basic Back Propagation Learning Algorithm
The actual algorithm for a 3-layer network (only one hidden layer) is:

Step 1: Initialize the weights in the network (often randomly).
For each example e in the training set:
    O = neural-net-output(network, e)   ; forward pass
    T = teacher output for e
Step 2: Calculate the error (T - O) at the output units.
Step 3: Compute delta_wh for all weights from the hidden layer to the output layer.   ; backward pass
Step 4: Compute delta_wi for all weights from the input layer to the hidden layer.    ; backward pass continued
Update the weights in the network.
Repeat until all examples are classified correctly or a stopping criterion is satisfied.
Return the network.

Algorithm 2: Basic Back Propagation Learning Algorithm [13]
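The following NumPy sketch illustrates this delta-rule update for a small three-layer network with sigmoid units; the data, learning rate and epoch count are assumed for illustration and do not reproduce the XLMiner model described below.

```python
import numpy as np

# Hedged sketch: 3-layer (one hidden layer) network trained with
# back-propagation of error and the delta rule, on toy binary data.
rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = rng.integers(0, 2, size=(100, 8)).astype(float)   # 8 binary inputs (toy data)
y = (X.sum(axis=1) > 4).astype(float).reshape(-1, 1)  # toy target label

n_hidden, lr = 23, 0.5                                 # 23 hidden nodes, as in the text
W1 = rng.normal(scale=0.5, size=(8, n_hidden))
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))

for epoch in range(2000):
    # forward pass
    H = sigmoid(X @ W1)
    O = sigmoid(H @ W2)
    # backward pass: delta-rule gradients using the sigmoid derivative
    delta_out = (O - y) * O * (1 - O)
    delta_hid = (delta_out @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ delta_out / len(X)
    W1 -= lr * X.T @ delta_hid / len(X)

pred = (sigmoid(sigmoid(X @ W1) @ W2) > 0.5).astype(int)   # decision threshold 0.5
print("training accuracy:", (pred == y.astype(int)).mean())
```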

In the constructed ANN the input layer had 25 neurons (Table 3). An input neuron was assigned to each categorical value of a categorical variable, with a value of 1 when the category was present and 0 otherwise. The output layer consisted of 2 neurons, giving the class value 1 for stone-free status and the class value 0 for the not-free status. The network output was actually between 0 and 1, and was then converted according to a decision threshold to class 0 (when the output was equal to or less than the decision threshold) or class 1 (when the output was greater than the decision threshold). The number of hidden nodes (23) was chosen by the best performance on a separate test set through a cascade learning paradigm.
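A small sketch of the "one input neuron per category" encoding and the decision threshold described above is shown below; the category labels follow Table 3 for a subset of the reduced variables, while the helper functions themselves are assumptions of this illustration, not the paper's implementation.

```python
# Hedged sketch: one-hot encoding of a subset of the reduced variables and a
# threshold decision. Category lists follow Table 3 (subset); names are illustrative.
CATEGORIES = {
    "age": ["<=40", ">40"], "sex": ["male", "female"], "side": ["right", "left"],
    "number": ["single", "multiple"], "length": ["<=15mm", ">15mm"],
    "opacity": ["opaque", "lucent"],
    "site": ["upper calyx", "middle calyx", "renal pelvis", "multiple calyx", "lower calyx"],
}

def one_hot(patient):
    vec = []
    for var, cats in CATEGORIES.items():
        vec += [1 if patient[var] == c else 0 for c in cats]
    return vec

patient = {"age": ">40", "sex": "male", "side": "left", "number": "single",
           "length": "<=15mm", "opacity": "opaque", "site": "lower calyx"}
print(one_hot(patient))

def classify(network_output, threshold=0.5):
    return 1 if network_output > threshold else 0   # 1 = stone free, 0 = not free

print(classify(0.83))
```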

Using these values a random set of weights is initially assigned
to the connections between the layers. The first output of the
network is determined using these weights. The output thus
obtained is compared with the actual output of the pattern pair
and the mean square error calculated. An error optimization
algorithm then minimizes this error. A feed-forward back-
propagation ANN system has the property of self-optimization
of the error during training. Thus, the final weight of a
particular variable is decided by the system itself, determined
precisely by the relative impact of the variable in the dataset in
relation to the actual output variable.
The network was trained using the XL Miner software system;
the working code for the ANN was constructed so that it was
compatible with the analysis and processing of the input data.
The ANN was trained using a single-layer feed-forward back-
propagation network. To assess the training status of the ANN,
480 randomly selected records from the training set were used
for validation (the validation set), and once validation
performance was satisfactory, further training was stopped.
After the ANN was considered to be reasonably trained, input
variables (similar to those used for training) from subsequent
patients were serially fed into the trained ANN, and the
optimum fragmentation of each patient, as predicted by the
ANN via the output data, was recorded. Using the usual
protocol these patients subsequently underwent ESWL and the
results (observed values) were recorded. The predicted and the observed values
were then compared.

Variable                Categorical values
Age                     40 or younger; Older than 40
Sex                     Male; Female
Side                    Right; Left
Stone number            Single; Multiple
Stone size              15 mm or less; Greater than 15 mm
Opacity                 Opaque; Lucent
Renal status            Perfect; Hydronephrotic; Pyelonephritic
Stone location          Upper calyx; Middle calyx; Renal pelvis; Multiple calyx; Lower calyx
Stone-free status       1 = Stone free; 0 = Not free
Step 1: Initialize the weights in the network (often to small random values)
Repeat
    For each example e in the training set
        O = neural-net-output(network, e)   ; forward pass
        T = teacher output for e
        Step 2: Calculate error (T - O) at the output units
        Step 3: Compute delta_wh for all weights from hidden layer to output layer   ; backward pass
        Step 4: Compute delta_wi for all weights from input layer to hidden layer    ; backward pass continued
        Update the weights in the network
Until all examples are classified correctly or a stopping criterion is satisfied
Return the network
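A compact, runnable sketch of this training loop for a single hidden layer is shown below; the 25-23-2 layer sizes follow the network described above, while the sigmoid transfer function, the learning rate and the NumPy implementation are assumptions made only for illustration.

import numpy as np

# Sketch of the back-propagation loop of Algorithm 2 for one hidden layer.
# Layer sizes (25-23-2), sigmoid activation and the learning rate are
# illustrative assumptions.
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (25, 23))    # input -> hidden weights (random init)
W2 = rng.normal(0.0, 0.1, (23, 2))     # hidden -> output weights (random init)
learning_rate = 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_epoch(examples):
    """examples: iterable of (x, t) pairs, x of shape (25,), t of shape (2,)."""
    global W1, W2
    for x, t in examples:
        h = sigmoid(x @ W1)                   # forward pass: hidden layer
        o = sigmoid(h @ W2)                   # forward pass: output layer
        err_o = (t - o) * o * (1 - o)         # output error times derivative of transfer fn
        err_h = (err_o @ W2.T) * h * (1 - h)  # error back-propagated to hidden layer
        W2 += learning_rate * np.outer(h, err_o)  # Delta Rule update, hidden -> output
        W1 += learning_rate * np.outer(x, err_h)  # Delta Rule update, input -> hidden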

In this study we showed the ability of an ANN combined with
an RS model to predict stone-free status in patients with
ureteral stones treated with ESWL: 449 patients (93.5%) were
free of stones, while the remaining 31 (6.4%) required other
treatment modalities due to inadequate stone disintegration.
Evaluating the performance of the model on the test set
revealed a sensitivity of 93.754% when the number of hidden
neurons was 23, with a processing time of 4 seconds, while
using the whole set of input variables the ANN correctly
classified only 78% of the participants.
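For reference, the sensitivity reported above is the fraction of actually stone-free cases that the model labels as stone free; a small sketch with placeholder counts (not the study's confusion matrix) is:

def sensitivity(true_positive, false_negative):
    # Sensitivity = TP / (TP + FN); the counts passed below are placeholders.
    return true_positive / (true_positive + false_negative)

print(round(100 * sensitivity(true_positive=90, false_negative=10), 3))  # 90.0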

While the ANN tool provides accurate results for predicting
renal stone disease, using Rough Sets makes the ANN
prediction better, as RS reduces the time consumed in the
univariate analysis to detect which variables are the most
effective. This model provides a new way to study stone
disease by combining the Rough Set technique with an ANN.
[1] Hamid A., Dwivedi U.S., Singh T.N., "Artificial neural networks in predicting optimum renal stone fragmentation by extracorporeal shock wave lithotripsy: a preliminary study," BJU Int., vol. 91, pp. 821–824, 2003.

[2] Prabhakar Rajan and David A. Tolley, "Artificial neural networks in urolithiasis," Department of Urology, The Scottish Lithotriptor Centre, Western General Hospital, Crewe Road South, Edinburgh, UK, Current Opinion in Urology, vol. 14, pp. 133–137, 2005.

[3] Zdzislaw Pawlak, "Rough Sets: Theoretical Aspects of Reasoning about Data," Institute of Computer Science, Warsaw University of Technology, Kluwer Academic Publishers, Australia, 1991.

[4] Michaels E.K., Niederberger C.S., Golden R.M., "Use of a neural network to predict stone growth after shock wave lithotripsy," Urology, vol. 51, pp. 335–338, 1998.

[5] Poulakis V., Dahm P., Witzsch U., "Prediction of lower pole stone clearance after shock wave lithotripsy using an artificial neural network," J Urol, vol. 169, pp. 1250–1256, 2003.

[6] Gomha M.A., Sheir K.Z., Showky S., "Can we improve the prediction of stone-free status after extracorporeal shock wave lithotripsy for ureteral stones? A neural network or a statistical model?," J Urol, vol. 172, pp. 175–179, 2004.

[7] Cummings J.M., Boullier J.A., Izenberg S.D., Kitchens D.M., Kothandapani R.V., "Prediction of spontaneous ureteral calculous passage by an artificial neural network," J Urol, vol. 164, pp. 326–438, 2000.

[8] Sonke G.S., Heskes T., Verbeek A.L., de la Rosette J.J., Kiemeny L.A., "Prediction of bladder outlet obstruction in men with lower urinary tract symptoms using artificial neural networks," J Urol, vol. 163, pp. 300–305, 2000.

[9] Bertrand Dussol, Jean-Michel Verdier, "Artificial neural networks for assessing the risk of urinary calcium stone among men," Urol Res, vol. 34, pp. 17–25, 2006.

[10] Neeraj K. Goyal, Abhay Kumar et al., "A Comparative Study of Artificial Neural Network and Multivariate Regression Analysis to Analyze Optimum Renal Stone Fragmentation by Extracorporeal Shock Wave Lithotripsy," Department of Urology, Institute of Medical Science, Hindu University, India, vol. 1, 2010.

[11] 5-4-2011

[12] A. Øhrn, ROSETTA Technical Reference Manual, Department of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway, p. 28.


Authors Profile

Teaching Assistant, IS Dept., Faculty of Computers and Information Science, Mansoura University, Egypt. Phone: 00201069493073.

Cloned Agent based data computed in Homogeneous
sensor networks
S. Karthikeyan, Research Scholar, Sathyabama University, Tamil Nadu, India.
S. Jayashri, Adhiparasakthi Engineering College, Melmaruvathur – 603319, Kanchipuram, Tamil Nadu, India.

Abstract- Innovation in the world of wireless communication
has led to the progressive use of tiny, multifunctional sensor
nodes. These nodes, despite their small size, can sense the
environment, compute data and communicate over a large range.
The energy constraint is one of the major limitations of sensor
nodes and is therefore given major focus. The use of multiple
mobile agents, proposed by the mobile agent based framework,
offers flexible and robust data collection within WSNs and
provides diverse solutions to energy constraint problems in
WSNs. This paper proposes multi-agent system based data
gathering from the WSN. The mobile agent created in the sink is
cloned n-1 times, where n is the number of clusters. The cloned
mobile agents travel to their corresponding clusters in a parallel
fashion and compute the data for which the event occurs. Finally
the computed results are transferred to the sink, which sends an
alert for the occurred event to a mobile device. The mobile agent
computes and transfers only the result, minimizing energy
consumption. The approach of transferring only the
event-occurred data also conserves node energy. In the proposed
scheme the sink monitors any application and, when an event is
sensed, notifies the user via message and call.
Keywords: Mobile agent, Wireless sensor networks, energy
consumption, lifetime.


The WSNs are intended to detect events, receive data from
the environment and compute the received data and finally
transmit the sensed information to interested users. The
information sensed and transmitted by the nodes describes the
condition of its surroundings in which the network is
deployed. The information may include the temperature,
pressure, humidity, heat, light, electricity etc of the
environment. The WSN also routes sensor data, at times
aggregated and summarized, to users who have requested it or
are expected to utilize the information [1]. The users interact
with the sink node, which gathers and holds the result. Thus
information processing and routing are two fundamental
operations in sensor networks.
The sensor network possesses many challenging features.
They are composed of self-organized nodes with controlling
capabilities, cooperating and interacting intelligently with
other nodes in the network. There are limitations in energy
consumption for transmission, computation and reception of
data. Multi-hop routing increases energy efficiency. Dense
deployment of the nodes helps in improving the signal-to-noise
ratio, and frequently changing topologies improve the
adaptability of the network. The sensor nodes have a finite
sensing and communication range and can combine
information from multiple sources.
Network nodes are equipped with wireless transmitters and
receivers using antennas that may be omni directional
(isotropic radiation), highly directional (point-to-point),
possibly steerable, or some combination [3]. A limited,
non-rechargeable battery and limited network communication
bandwidth are the most challenging issues in sensor networks.
Energy consumed in sensor networks is mainly for the purpose
of data transmission, signal processing, and hardware
operation [2]. The different states of the node like
transmission, reception, listening and sleeping itself drain
battery power. The reception and transmission encompasses
all the processing activities of the network [3].
Many research efforts aim at improving energy efficiency
from different aspects by pioneering energy-efficient
processing techniques that reduce the power consumption of
all operations of the sensor network. Cross-layer
optimization is widely considered an efficient technique to
ameliorate this concern. The three system knobs that can be
used in a cross-layer approach are voltage scaling, rate
adaptation and tunable compression [4].
Applications of sensor networks are wide ranging and can
vary significantly in application requirements, modes of
deployment (e.g., ad hoc versus instrumented environment),
sensing modality, or means of power supply (e.g., battery
versus wall-socket) [1]. Sensor networks are mainly used in
applications whose intention is to collect, process and transport
large volumes of complex information from the environment. Duty-cycling is a
technique used to reduce energy consumption and extend
network lifetime. Nodes may enter a sleep state when their
presence is not necessary to maintain the functionality of the
system, e.g., when no event occurs in the sensor’s vicinity or
when no message is routed through the sensor [5].
Efficient energy consumption is the critical design challenge
that can be addressed to some extent during hardware design.
For instance, protecting data contents could be tuned to special
needs of sensor networks: relatively weak mechanisms could
be implemented directly in hardware so that data are encoded
and decoded fast and almost no communication overhead rises
[6]. Energy scavenging capability, cost and size are three
important metrics that indicate the measure of a node’s
“obtrusiveness” [7].

A novel architecture is introduced using mobile agents to
meet the new challenges of the current Distributed sensor
Networks (DSN's), such as large data volume, low
communication bandwidth, and unreliable environment [8]. In
a traditional network based on client-server architecture, all the
data collected by the leaf nodes is transmitted to the processing
element. But in an MA based distributed sensor network, the
computation is distributed among the participating leaf nodes.
Thus, this approach significantly reduces the power consumed
for communication and the bandwidth used.
In a Mobile Agent based Distributed Sensor Network
(MADSN), the overall task is divided into many subtasks, and
for each subtask a mobile agent carries the execution code for
computing the data. Finding the optimal path for agent routing
influences the overall performance of a MADSN
implementation, because communication cost and detection
accuracy depend on the order and the number of nodes to be
visited [9]. In a MADSN,
mobile agents migrate among sensor nodes to collect data and
execute an overlap function of partial integration, whose
results are accumulated into a final version upon the arrival of
all mobile agents. MAs act in the interest of an entity,
migrating between different network locations, executing tasks
locally and continuing their execution at the point where they
stopped before migrating [10]. MAs have the ability and
intelligence to cooperate and communicate. The advantages of
MA fall into three different categories [11], among others:
• Bandwidth and delay savings because computation is moved
to the data.
• Flexibility because agents do not require the availability of
specific code.
• Suitability for mobile computing because agents do not
require continuous network connections.
The MA offers many benefits to the application in which it is
deployed. It reduces the traffic and latency of the sensor
network, reacts immediately to changes in the executing
environment or application, and executes tasks independently
in a distributed and asynchronous manner [12].
The mobile agents offer resource efficiency in wireless sensor
networks. The critical problem in WSNs is power
consumption of data transmission collected through the sensor
nodes. The multi agent system is composed of multiple agents
interacting and communicating intelligently. Intelligence may
include some methodic, functional, procedural or algorithmic
search, find and processing approach [13]. Multi agent system
provides computational capabilities across a network of
interconnected agents.


A mobile agent consists of the program code and the
program execution state (the current values of variables, next
instruction to be executed, etc.) [14]. The home machine is the
computer where the mobile agent actually resides. The mobile
agent host otherwise called a mobile agent platform or server
is a remote computer where the agent is dispatched to execute.
The mobile agent transfers from home computer to ‘n’ number
of hosts in the network and executes on several machines.
When a mobile agent is dispatched, the entire code of the
mobile agent and the execution state of the mobile agent is
transferred to the host [15]. The host provides the appropriate
execution environment for the mobile agent to execute on it,
also permitting the use of its resources (CPU, memory etc.).
On completing its task on a particular host, the mobile agent
moves to another computer. Instead of restarting its execution
on the node to which it migrated, the mobile agent is capable
of resuming its execution from where it left off in the previous
host, with the aid of the state information that is transferred as
the agent migrates. This continues until the mobile agent
returns to its home machine after completing execution on the
last machine in its itinerary [16, 17].

Figure 1. Life cycle of cloned mobile Agent

The steps for working of a mobile agent:
1. The mobile agent is created in a computer called the Home Machine.
2. The mobile agent is then dispatched to the Host Machine A
for execution.
3. The agent performs its task on Host Machine A.
4. After completing its task, the agent is replicated into two
copies. One copy is dispatched to Host Machine B and the
other is dispatched to Host Machine C.
5. The cloned copies perform their task on their respective
hosts shown in Figure 1.
6. After execution, Host Machine B and C send the mobile
agent back to the Home Machine.
7. The Home Machine extracts the data brought by the agents.
8. The agents are then disposed [18].
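Below is a language-agnostic sketch of this clone-and-dispatch cycle, written in Python for illustration rather than in the agent platform used in this work; the host names, the summing task and the thread-based stand-in for migration are assumptions.

import copy
from concurrent.futures import ThreadPoolExecutor

class MobileAgent:
    """Illustrative stand-in for a mobile agent: it carries code and state."""
    def __init__(self, name):
        self.name = name
        self.state = {"visited": [], "result": None}

    def execute_on(self, host):
        # Steps 3 and 5: perform the task on the current host, keeping state.
        self.state["visited"].append(host.name)
        self.state["result"] = host.compute()
        return self

    def clone(self, new_name):
        # Step 4: replicate the agent together with its execution state.
        twin = copy.deepcopy(self)
        twin.name = new_name
        return twin

class Host:
    def __init__(self, name, data):
        self.name, self.data = name, data
    def compute(self):
        return sum(self.data)                 # placeholder local task

# Steps 1-8: create at Home, run on Host A, clone to Hosts B and C,
# collect the results back at Home, then dispose of the agents.
host_a, host_b, host_c = Host("A", [1, 2]), Host("B", [3, 4]), Host("C", [5, 6])
agent = MobileAgent("agent").execute_on(host_a)
clones = [agent.clone("agent_1"), agent.clone("agent_2")]
home_results = []
with ThreadPoolExecutor() as pool:
    for returned in pool.map(lambda pair: pair[0].execute_on(pair[1]),
                             zip(clones, [host_b, host_c])):
        home_results.append(returned.state)   # Home extracts the brought data
del agent, clones                             # agents are disposed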


Energy conservation is considered to be significant while
designing the Wireless Sensor Network. The sensor nodes are
deployed in an area where they are uniformly distributed.
These nodes are divided into 3 clusters with each cluster
enclosing more than 70 nodes. Since there are more nodes in a
cluster, there will be more data sensed by these nodes, which
needs to be transmitted to the cluster head. A single cluster
head will not be sufficient to collect the sensed information
from all these nodes.
Hence, in this proposed work each cluster (as portrayed in
Figure 2) consists of two cluster heads, termed Sub cluster
heads (SCH0 and SCH1).

Figure 2. Cluster with Master Section Head

The sub cluster head is elected based on the residual energy of
sensor nodes. Based on the signal strength, the nodes in the
cluster will identify their Sub cluster head to broadcast their
information. The transmitted information is gathered in both
the sub cluster heads. The sub cluster heads transmit the data
to the Master Section Head, which is managed by an event
driven mechanism. In this mechanism, the data is transferred
from sub cluster head when an event occurs. The event arises
when the sensed data is higher than the set value of that
parameter, which is the predefined value based on the
application. The mobile agents collect the data from a master
section head instead of collecting it from the two sub cluster heads.
The energy consumption of the network is high in order to
transmit data from cluster head to the sink through multi hop
or single hop routing. To reduce this energy consumption of
the network, a mobile agent based architecture is adopted. The
mobile agent is created in the sink and then cloned into
multiple agents, thereby forming a multi-agent system. The
MA is cloned n-1 times, where n is the number of clusters; in
this paper n = 3.
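A small sketch of the sub cluster head election and the event-driven forwarding just described is given below; the node ids, residual-energy values and the event threshold are made-up parameters, not values from the deployment.

def elect_sub_cluster_head(nodes):
    """Return the id of the node with the highest residual energy."""
    return max(nodes, key=lambda n: n["residual_energy"])["id"]

def event_driven_forward(readings, threshold):
    """Forward only the readings whose sensed value exceeds the set value."""
    return [(node_id, value) for node_id, value in readings if value > threshold]

sub_group = [{"id": 7, "residual_energy": 0.82},
             {"id": 12, "residual_energy": 0.64}]
sch = elect_sub_cluster_head(sub_group)                    # node 7 becomes SCH
events = event_driven_forward([(7, 51.0), (12, 18.5)], threshold=40.0)
n_clusters = 3
n_clones = n_clusters - 1                                  # the MA is cloned n-1 times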

Figure 3. GSM modem Transmission

The sink dispatches all the 3 mobile agents to their
corresponding cluster in a parallel manner. Each agent
computes the data collected from its Master section head and
saves the process output. Then the MA transmits the saved
data to the sink. The saved data contains the table of nodes and
their clusters. The result gives the list of nodes whose
parameter value is higher than the set value, along with their
corresponding clusters. Finally all the MAs are disposed of in the sink.
Based on this output, the sink gives an alert to a mobile
device. As represented in figure 3 the Global System for
Mobile communication (GSM) modem is used as an interface
to the sink for the transmission of both SMS and call to the
mobile device which alerts the mobile device user. GSM
modem acts as a communication path device. GSM Subscriber
Identity Module (SIM) card is inserted into the GSM modem
and the modem is connected to the computer that acts as sink
via a serial port. At the receiving end the mobile device itself
acts as a GSM modem. An SMS alone would be sufficient for
alerting the user to an event, but a call is also placed to inform
the user that an SMS has been received on the mobile device.
This extra check ensures that the user is informed of the fault
immediately and can respond without delay. This proposed
work is suitable for any
homogeneous network application.
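A minimal sketch of this alerting path is given below, assuming a serial-attached GSM modem driven with standard AT commands through the pyserial package; the port name, phone number and delays are illustrative assumptions rather than the configuration used in this work.

import time
import serial  # pyserial

def send_alert(port, number, text):
    """Send an SMS and then place a call through a serial-attached GSM modem."""
    with serial.Serial(port, baudrate=9600, timeout=2) as modem:
        modem.write(b"AT+CMGF=1\r")                        # put the modem in SMS text mode
        time.sleep(0.5)
        modem.write(('AT+CMGS="%s"\r' % number).encode())  # start an SMS to the number
        time.sleep(0.5)
        modem.write(text.encode() + b"\x1a")               # message body + Ctrl+Z sends it
        time.sleep(2)
        modem.write(("ATD%s;\r" % number).encode())        # dial the number to notify the user
        time.sleep(10)
        modem.write(b"ATH\r")                              # hang up

# Example (hypothetical port and number):
# send_alert("/dev/ttyUSB0", "+911234567890", "Event detected in cluster 2")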


In this simulation, there are three clusters as shown in
Figure 4. Each cluster is partitioned into 2 sub groups, named
slave node groups. These sub groups are headed by a Master
Node, which is the sub cluster head.

Figure 4. Cluster partition
The sub cluster head transmits the data for which the event has
occurred to the master section head shown as Main Head data
storage node.

Figure 5. Event occurred in First Master section head.

Figure 6. Event occurred in third Master section head.

The program is executed for all the clusters, and the results
showing the event-occurred nodes are displayed in the above
tables. The event-occurred data is stored in the master section
head. Figures 5 and 6 show the node id and its corresponding
cluster id. The results are stored in an Oracle database.

Figure 7. Mobile Agent creations and cloning

The mobile agent is created in the main container and it is
cloned twice. The Main container contains the actual created
Mobile Agent and the cloned mobile agents represented as
mobileagent_1 and mobileagent_2 as shown in the figure 7.
The Agent name, parameters and clone information are
provided during agent creation.

Figure 8. Agent Platform

The created and the cloned mobile agents are transmitted from
the main container to their respective clusters. The containers
holding each mobile agent are shown in Figure 8. The mobile
agents compute the event-occurred data from the master
section head. The main container is in the sink, where the
computed data is stored.

Figure 9. Mobile Agent is moved to Container-1.

Figure 10. Mobile Agent returns to the Main-Container after completing
computation in Container-1.

A GUI (Graphical User Interface) is created to display
information about the containers, such as name, protocol and
address. The agent computes the data of the master section
head within the container and displays the available and
visited locations. The agents travel in parallel to containers 1,
2 and 3, compute the data and store the results in the main
container, as shown in Figure 9 and Figure 10. With this
parallel computation of the clusters by the mobile agents, the
processing is faster than with serial computation.

Figure 11. Final results in the Main container

The list of nodes and their respective cluster for which the
event has occurred is shown in figure 11. The computed
results from container 1,2 and 3 are fetched in a parallel
manner by the corresponding mobile Agents and carried to the
Main container. This result is stored in the sink in the table
format as shown. A GSM modem is used to convey an alert
for the occurred event to a mobile device.


The cloned mobile agents travel to their corresponding
clusters and compute the data for which the event occurred.
The computed data is then transferred to the sink by the
mobile agents. This data is shown in the simulation results.
The sink sends an alert in the form of both a message and a
call to the destination mobile device through a GSM modem.
It is not possible to recharge or replace the limited batteries
of the sensor nodes. Therefore the nodes should keep power
consumption to a minimum in order to increase the overall
lifetime of the network. In WSN, the data
transmission from the entire cluster to the sink consumes more
energy. The usage of multiple mobile agents consumes less
energy by computing the data in all the clusters in a parallel
manner and transmitting only the results to the sink. The
mobile agents transmit the data of only the event occurred
node which again minimizes the energy consumption.

[1]. Feng Zhao, Leonidas J. Guibas, Wireless Sensor Networks: An
Information Processing Approach, Palo Alto, California January 2004.
[2]. A.J. Goldsmith and S.B. Wicker, Design challenges for energy-
constrained ad-hoc wireless networks, IEEE Wireless Communication., 9, 8–
27, Aug. 2002.
[3]. Mohammad Ilyas and Imad Mahgoub, Handbook of sensor nodes:
Compact wireless and wired sensing system, © 2005 by CRC Press LLC.
[3]. Yang Yu, Viktor K. Prasanna, Bhaskar Krishnamachari, Information processing and routing in wireless sensor networks, University of Southern California, USA, World Scientific Publishing Co. Pvt. Ltd., 2006.
[4]. Ananthram Swami, Qing Zhao, Yao-win Hong, Lang Tong, Wireless
Sensor networks signal processing and communication perspectives,
University of California at Davis, USA, Cornell University, USA @2007 John
Wiley & sons.
[5]. Ananthram Swami,Qing Zhao,Yao-Win Hong,Lang Tong, “Wireless
sensor networks signal processing and communication perspectives”, @2007
by John Wiley &Sons.
[6]. Mirosław Kutyłowski, Jacek Cichoń, Przemysław Kubiak, Algorithmic Aspects of Wireless Sensor Networks: Third International Workshop, ALGOSENSORS 2007, Wrocław, Poland, July 14, 2007.
[7]. Brian Otis and Jan Rabaey, Ultra-low power wireless technologies for
sensor networks, University of California, Berkeley, Springer 2007.
[8]. H. Qi, S.S.Iyengar, K. Chakrabarty, Multiresolution data integration
using mobile agents in distributed sensor networks, IEEE Trans. Syst., Man,
Cybernetics Part C: Application. Rev., 31(3), 383–391, August, 2001.
[9]. Q. Wu, S.S. Iyengar, N.S.V. Rao, J. Barhen, V.K. Vaishnavi, H. Qi, and
K. Chakrabarty, On computing the route of a mobile agent for data fusion in a
distributed sensor network, submitted to IEEE Trans. Knowledge Data Eng.,
16(6), June 2004.
[10].G.Sundari, P.E. Sankarnarayanan, “Multi Agent systems Using JADE in
Industrial Application,” International Journal of Computer Information
[11]. Cabri G., Leonardi L., and Zamponelli F., MARS: a programmable coordination architecture for mobile agents, IEEE Internet Computing, 4(4), 26–35, Jul.–Aug. 2000.
[12].Aiello,F., Carbone, A., Fortino, G. and Galzarano, S., “java-based Mobile
Agent Platforms for wireless sensor networks,” Proc. Of the International
Multi conference on computer science and Information Technology,
[13].Chen, B., Cheng, H.H. and Palen, J., “Integrating mobile agent
technology with multi-agent system for distributed traffic detection and
management system,” transportation research partc: Emerging Technologies
17(1), 1-10, 2009.
[14] Francesco Aiello et al., “A Java-Based Agent Platform for Programming
Wireless Sensor Networks”, The Computer Journal (2010) Doi: 10.1093.
[15] Abdelkader Outtagarts, “Mobile Agent based Applications -A Survey”,
International Journal of Computer Science and Network Security, Vol 9,
No.11, November 2009.
[16] Sajid Hussain et al., “Agent-based system architecture for wireless
sensor networks”, 20th International conference on Advanced Information
Networking and Applications, April18-20, 2006.
[17] Osborne and K. Shah, "Performance Analysis of Mobile Agents in Wireless Internet Applications using Simulation", ACTA Press (a scientific and technical publishing company).
[18]. Parineeth M Reddy, “Mobile Agents Intelligent Assistants on the
Internet”, RESONANCE , July 2002,pp.35-43.
IJCSIS Reviewers List
Assist Prof (Dr.) M. Emre Celebi, Louisiana State University in Shreveport, USA
Dr. Lam Hong Lee, Universiti Tunku Abdul Rahman, Malaysia
Dr. Shimon K. Modi, Director of Research BSPA Labs, Purdue University, USA
Dr. Jianguo Ding, Norwegian University of Science and Technology (NTNU), Norway
Assoc. Prof. N. Jaisankar, VIT University, Vellore,Tamilnadu, India
Dr. Amogh Kavimandan, The Mathworks Inc., USA
Dr. Ramasamy Mariappan, Vinayaka Missions University, India
Dr. Yong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, P.R. China
Assist. Prof. Sugam Sharma, NIET, India / Iowa State University, USA
Dr. Jorge A. Ruiz-Vanoye, Universidad Autónoma del Estado de Morelos, Mexico
Dr. Neeraj Kumar, SMVD University, Katra (J&K), India
Dr Genge Bela, "Petru Maior" University of Targu Mures, Romania
Dr. Junjie Peng, Shanghai University, P. R. China
Dr. Ilhem LENGLIZ, HANA Group - CRISTAL Laboratory, Tunisia
Prof. Dr. Durgesh Kumar Mishra, Acropolis Institute of Technology and Research, Indore, MP, India
Jorge L. Hernández-Ardieta, University Carlos III of Madrid, Spain
Prof. Dr.C.Suresh Gnana Dhas, Anna University, India
Mrs Li Fang, Nanyang Technological University, Singapore
Prof. Pijush Biswas, RCC Institute of Information Technology, India
Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
Dr. A. Arul Lawrence, Royal College of Engineering & Technology, India
Mr. Wongyos Keardsri, Chulalongkorn University, Bangkok, Thailand
Mr. Somesh Kumar Dewangan, CSVTU Bhilai (C.G.)/ Dimat Raipur, India
Mr. Hayder N. Jasem, University Putra Malaysia, Malaysia
Mr. A.V.Senthil Kumar, C. M. S. College of Science and Commerce, India
Mr. R. S. Karthik, C. M. S. College of Science and Commerce, India
Mr. P. Vasant, University Technology Petronas, Malaysia
Mr. Wong Kok Seng, Soongsil University, Seoul, South Korea
Mr. Praveen Ranjan Srivastava, BITS PILANI, India
Mr. Kong Sang Kelvin, Leong, The Hong Kong Polytechnic University, Hong Kong
Mr. Mohd Nazri Ismail, Universiti Kuala Lumpur, Malaysia
Dr. Rami J. Matarneh, Al-isra Private University, Amman, Jordan
Dr Ojesanmi Olusegun Ayodeji, Ajayi Crowther University, Oyo, Nigeria
Dr. Riktesh Srivastava, Skyline University, UAE
Dr. Oras F. Baker, UCSI University - Kuala Lumpur, Malaysia
Dr. Ahmed S. Ghiduk, Faculty of Science, Beni-Suef University, Egypt
and Department of Computer science, Taif University, Saudi Arabia
Mr. Tirthankar Gayen, IIT Kharagpur, India
Ms. Huei-Ru Tseng, National Chiao Tung University, Taiwan
Prof. Ning Xu, Wuhan University of Technology, China
Mr Mohammed Salem Binwahlan, Hadhramout University of Science and Technology, Yemen
& Universiti Teknologi Malaysia, Malaysia.
Dr. Aruna Ranganath, Bhoj Reddy Engineering College for Women, India
Mr. Hafeezullah Amin, Institute of Information Technology, KUST, Kohat, Pakistan
Prof. Syed S. Rizvi, University of Bridgeport, USA
Mr. Shahbaz Pervez Chattha, University of Engineering and Technology Taxila, Pakistan
Dr. Shishir Kumar, Jaypee University of Information Technology, Wakanaghat (HP), India
Mr. Shahid Mumtaz, Portugal Telecommunication, Instituto de Telecomunicações (IT) , Aveiro, Portugal
Mr. Rajesh K Shukla, Corporate Institute of Science & Technology Bhopal M P
Dr. Poonam Garg, Institute of Management Technology, India
Mr. S. Mehta, Inha University, Korea
Mr. Dilip Kumar S.M, University Visvesvaraya College of Engineering (UVCE), Bangalore University,
Prof. Malik Sikander Hayat Khiyal, Fatima Jinnah Women University, Rawalpindi, Pakistan
Dr. Virendra Gomase , Department of Bioinformatics, Padmashree Dr. D.Y. Patil University
Dr. Irraivan Elamvazuthi, University Technology PETRONAS, Malaysia
Mr. Saqib Saeed, University of Siegen, Germany
Mr. Pavan Kumar Gorakavi, IPMA-USA [YC]
Dr. Ahmed Nabih Zaki Rashed, Menoufia University, Egypt
Prof. Shishir K. Shandilya, Rukmani Devi Institute of Science & Technology, India
Mrs.J.Komala Lakshmi, SNR Sons College, Computer Science, India
Mr. Muhammad Sohail, KUST, Pakistan
Dr. Manjaiah D.H, Mangalore University, India
Dr. S Santhosh Baboo, D.G.Vaishnav College, Chennai, India
Prof. Dr. Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada
Dr. Deepak Laxmi Narasimha, Faculty of Computer Science and Information Technology, University of
Malaya, Malaysia
Prof. Dr. Arunkumar Thangavelu, Vellore Institute Of Technology, India
Mr. M. Azath, Anna University, India
Mr. Md. Rabiul Islam, Rajshahi University of Engineering & Technology (RUET), Bangladesh
Mr. Aos Alaa Zaidan Ansaef, Multimedia University, Malaysia
Dr Suresh Jain, Professor (on leave), Institute of Engineering & Technology, Devi Ahilya University, Indore
(MP) India,
Dr. Mohammed M. Kadhum, Universiti Utara Malaysia
Mr. Hanumanthappa. J. University of Mysore, India
Mr. Syed Ishtiaque Ahmed, Bangladesh University of Engineering and Technology (BUET)
Mr Akinola Solomon Olalekan, University of Ibadan, Ibadan, Nigeria
Mr. Santosh K. Pandey, Department of Information Technology, The Institute of Chartered Accountants of
Dr. P. Vasant, Power Control Optimization, Malaysia
Dr. Petr Ivankov, Automatika - S, Russian Federation
Dr. Utkarsh Seetha, Data Infosys Limited, India
Mrs. Priti Maheshwary, Maulana Azad National Institute of Technology, Bhopal
Dr. (Mrs) Padmavathi Ganapathi, Avinashilingam University for Women, Coimbatore
Assist. Prof. A. Neela madheswari, Anna university, India
Prof. Ganesan Ramachandra Rao, PSG College of Arts and Science, India
Mr. Kamanashis Biswas, Daffodil International University, Bangladesh
Dr. Atul Gonsai, Saurashtra University, Gujarat, India
Mr. Angkoon Phinyomark, Prince of Songkla University, Thailand
Mrs. G. Nalini Priya, Anna University, Chennai
Dr. P. Subashini, Avinashilingam University for Women, India
Assoc. Prof. Vijay Kumar Chakka, Dhirubhai Ambani IICT, Gandhinagar ,Gujarat
Mr. Jitendra Agrawal, Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal
Mr. Vishal Goyal, Department of Computer Science, Punjabi University, India
Dr. R. Baskaran, Department of Computer Science and Engineering, Anna University, Chennai
Assist. Prof, Kanwalvir Singh Dhindsa, B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India
Dr. Jamal Ahmad Dargham, School of Engineering and Information Technology, Universiti Malaysia Sabah
Mr. Nitin Bhatia, DAV College, India
Dr. Dhavachelvan Ponnurangam, Pondicherry Central University, India
Dr. Mohd Faizal Abdollah, University of Technical Malaysia, Malaysia
Assist. Prof. Sonal Chawla, Panjab University, India
Dr. Abdul Wahid, AKG Engg. College, Ghaziabad, India
Mr. Arash Habibi Lashkari, University of Malaya (UM), Malaysia
Mr. Md. Rajibul Islam, Ibnu Sina Institute, University Technology Malaysia
Professor Dr. Sabu M. Thampi, L.B.S Institute of Technology for Women, Kerala University, India
Mr. Noor Muhammed Nayeem, Université Lumière Lyon 2, 69007 Lyon, France
Dr. Himanshu Aggarwal, Department of Computer Engineering, Punjabi University, India
Prof R. Naidoo, Dept of Mathematics/Center for Advanced Computer Modelling, Durban University of
Technology, Durban,South Africa
Prof. Mydhili K Nair, M S Ramaiah Institute of Technology(M.S.R.I.T), Affliliated to Visweswaraiah
Technological University, Bangalore, India
M. Prabu, Adhiyamaan College of Engineering/Anna University, India
Mr. Swakkhar Shatabda, Department of Computer Science and Engineering, United International University,
Dr. Abdur Rashid Khan, ICIT, Gomal University, Dera Ismail Khan, Pakistan
Mr. H. Abdul Shabeer, I-Nautix Technologies,Chennai, India
Dr. M. Aramudhan, Perunthalaivar Kamarajar Institute of Engineering and Technology, India
Dr. M. P. Thapliyal, Department of Computer Science, HNB Garhwal University (Central University), India
Dr. Shahaboddin Shamshirband, Islamic Azad University, Iran
Mr. Zeashan Hameed Khan, Université de Grenoble, France
Prof. Anil K Ahlawat, Ajay Kumar Garg Engineering College, Ghaziabad, UP Technical University, Lucknow
Mr. Longe Olumide Babatope, University Of Ibadan, Nigeria
Associate Prof. Raman Maini, University College of Engineering, Punjabi University, India
Dr. Maslin Masrom, University Technology Malaysia, Malaysia
Sudipta Chattopadhyay, Jadavpur University, Kolkata, India
Dr. Dang Tuan NGUYEN, University of Information Technology, Vietnam National University - Ho Chi Minh
Dr. Mary Lourde R., BITS-PILANI Dubai , UAE
Dr. Abdul Aziz, University of Central Punjab, Pakistan
Mr. Karan Singh, Gautam Budtha University, India
Mr. Avinash Pokhriyal, Uttar Pradesh Technical University, Lucknow, India
Associate Prof Dr Zuraini Ismail, University Technology Malaysia, Malaysia
Assistant Prof. Yasser M. Alginahi, College of Computer Science and Engineering, Taibah University,
Madinah Munawwarrah, KSA
Mr. Dakshina Ranjan Kisku, West Bengal University of Technology, India
Mr. Raman Kumar, Dr B R Ambedkar National Institute of Technology, Jalandhar, Punjab, India
Associate Prof. Samir B. Patel, Institute of Technology, Nirma University, India
Dr. M.Munir Ahamed Rabbani, B. S. Abdur Rahman University, India
Asst. Prof. Koushik Majumder, West Bengal University of Technology, India
Dr. Alex Pappachen James, Queensland Micro-nanotechnology center, Griffith University, Australia
Assistant Prof. S. Hariharan, B.S. Abdur Rahman University, India
Asst Prof. Jasmine. K. S, R.V.College of Engineering, India
Mr Naushad Ali Mamode Khan, Ministry of Education and Human Resources, Mauritius
Prof. Mahesh Goyani, G H Patel Collge of Engg. & Tech, V.V.N, Anand, Gujarat, India
Dr. Mana Mohammed, University of Tlemcen, Algeria
Prof. Jatinder Singh, Universal Institution of Engg. & Tech. CHD, India
Mrs. M. Anandhavalli Gauthaman, Sikkim Manipal Institute of Technology, Majitar, East Sikkim
Dr. Bin Guo, Institute Telecom SudParis, France
Mrs. Maleika Mehr Nigar Mohamed Heenaye-Mamode Khan, University of Mauritius
Prof. Pijush Biswas, RCC Institute of Information Technology, India
Mr. V. Bala Dhandayuthapani, Mekelle University, Ethiopia
Dr. Irfan Syamsuddin, State Polytechnic of Ujung Pandang, Indonesia
Mr. Kavi Kumar Khedo, University of Mauritius, Mauritius
Mr. Ravi Chandiran, Zagro Singapore Pte Ltd. Singapore
Mr. Milindkumar V. Sarode, Jawaharlal Darda Institute of Engineering and Technology, India
Dr. Shamimul Qamar, KSJ Institute of Engineering & Technology, India
Dr. C. Arun, Anna University, India
Assist. Prof. M.N.Birje, Basaveshwar Engineering College, India
Prof. Hamid Reza Naji, Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran
Assist. Prof. Debasis Giri, Department of Computer Science and Engineering, Haldia Institute of Technology
Subhabrata Barman, Haldia Institute of Technology, West Bengal
Mr. M. I. Lali, COMSATS Institute of Information Technology, Islamabad, Pakistan
Dr. Feroz Khan, Central Institute of Medicinal and Aromatic Plants, Lucknow, India
Mr. R. Nagendran, Institute of Technology, Coimbatore, Tamilnadu, India
Mr. Amnach Khawne, King Mongkut’s Institute of Technology Ladkrabang, Ladkrabang, Bangkok, Thailand
Dr. P. Chakrabarti, Sir Padampat Singhania University, Udaipur, India
Mr. Nafiz Imtiaz Bin Hamid, Islamic University of Technology (IUT), Bangladesh.
Shahab-A. Shamshirband, Islamic Azad University, Chalous, Iran
Prof. B. Priestly Shan, Anna University, Tamilnadu, India
Venkatramreddy Velma, Dept. of Bioinformatics, University of Mississippi Medical Center, Jackson MS USA
Akshi Kumar, Dept. of Computer Engineering, Delhi Technological University, India
Dr. Umesh Kumar Singh, Vikram University, Ujjain, India
Mr. Serguei A. Mokhov, Concordia University, Canada
Mr. Lai Khin Wee, Universiti Teknologi Malaysia, Malaysia
Dr. Awadhesh Kumar Sharma, Madan Mohan Malviya Engineering College, India
Mr. Syed R. Rizvi, Analytical Services & Materials, Inc., USA
Dr. S. Karthik, SNS Collegeof Technology, India
Mr. Syed Qasim Bukhari, CIMET (Universidad de Granada), Spain
Mr. A.D.Potgantwar, Pune University, India
Dr. Himanshu Aggarwal, Punjabi University, India
Mr. Rajesh Ramachandran, Naipunya Institute of Management and Information Technology, India
Dr. K.L. Shunmuganathan, R.M.K Engg College , Kavaraipettai ,Chennai
Dr. Prasant Kumar Pattnaik, KIST, India.
Dr. Ch. Aswani Kumar, VIT University, India
Mr. Ijaz Ali Shoukat, King Saud University, Riyadh KSA
Mr. Arun Kumar, Sir Padam Pat Singhania University, Udaipur, Rajasthan
Mr. Muhammad Imran Khan, Universiti Teknologi PETRONAS, Malaysia
Dr. Natarajan Meghanathan, Jackson State University, Jackson, MS, USA
Mr. Mohd Zaki Bin Mas'ud, Universiti Teknikal Malaysia Melaka (UTeM), Malaysia
Prof. Dr. R. Geetharamani, Dept. of Computer Science and Eng., Rajalakshmi Engineering College, India
Dr. Smita Rajpal, Institute of Technology and Management, Gurgaon, India
Dr. S. Abdul Khader Jilani, University of Tabuk, Tabuk, Saudi Arabia
Mr. Syed Jamal Haider Zaidi, Bahria University, Pakistan
Dr. N. Devarajan, Government College of Technology,Coimbatore, Tamilnadu, INDIA
Mr. R. Jagadeesh Kannan, RMK Engineering College, India
Mr. Deo Prakash, Shri Mata Vaishno Devi University, India
Mr. Mohammad Abu Naser, Dept. of EEE, IUT, Gazipur, Bangladesh
Assist. Prof. Prasun Ghosal, Bengal Engineering and Science University, India
Mr. Md. Golam Kaosar, School of Engineering and Science, Victoria University, Melbourne City, Australia
Mr. R. Mahammad Shafi, Madanapalle Institute of Technology & Science, India
Dr. F.Sagayaraj Francis, Pondicherry Engineering College,India
Dr. Ajay Goel, HIET , Kaithal, India
Mr. Nayak Sunil Kashibarao, Bahirji Smarak Mahavidyalaya, India
Mr. Suhas J Manangi, Microsoft India
Dr. Kalyankar N. V., Yeshwant Mahavidyalaya, Nanded , India
Dr. K.D. Verma, S.V. College of Post graduate studies & Research, India
Dr. Amjad Rehman, University Technology Malaysia, Malaysia
Mr. Rachit Garg, L K College, Jalandhar, Punjab
Mr. J. William, M.A.M college of Engineering, Trichy, Tamilnadu,India
Prof. Jue-Sam Chou, Nanhua University, College of Science and Technology, Taiwan
Dr. Thorat S.B., Institute of Technology and Management, India
Mr. Ajay Prasad, Sir Padampat Singhania University, Udaipur, India
Dr. Kamaljit I. Lakhtaria, Atmiya Institute of Technology & Science, India
Mr. Syed Rafiul Hussain, Ahsanullah University of Science and Technology, Bangladesh
Mrs Fazeela Tunnisa, Najran University, Kingdom of Saudi Arabia
Mrs Kavita Taneja, Maharishi Markandeshwar University, Haryana, India
Mr. Maniyar Shiraz Ahmed, Najran University, Najran, KSA
Mr. Anand Kumar, AMC Engineering College, Bangalore
Dr. Rakesh Chandra Gangwar, Beant College of Engg. & Tech., Gurdaspur (Punjab) India
Dr. V V Rama Prasad, Sree Vidyanikethan Engineering College, India
Assist. Prof. Neetesh Kumar Gupta, Technocrats Institute of Technology, Bhopal (M.P.), India
Mr. Ashish Seth, Uttar Pradesh Technical University, Lucknow ,UP India
Dr. V V S S S Balaram, Sreenidhi Institute of Science and Technology, India
Mr Rahul Bhatia, Lingaya's Institute of Management and Technology, India
Prof. Niranjan Reddy. P, KITS , Warangal, India
Prof. Rakesh. Lingappa, Vijetha Institute of Technology, Bangalore, India
Dr. Mohammed Ali Hussain, Nimra College of Engineering & Technology, Vijayawada, A.P., India
Dr. A.Srinivasan, MNM Jain Engineering College, Rajiv Gandhi Salai, Thorapakkam, Chennai
Mr. Rakesh Kumar, M.M. University, Mullana, Ambala, India
Dr. Lena Khaled, Zarqa Private University, Aman, Jordon
Ms. Supriya Kapoor, Patni/Lingaya's Institute of Management and Tech., India
Dr. Tossapon Boongoen , Aberystwyth University, UK
Dr . Bilal Alatas, Firat University, Turkey
Assist. Prof. Jyoti Praaksh Singh , Academy of Technology, India
Dr. Ritu Soni, GNG College, India
Dr . Mahendra Kumar , Sagar Institute of Research & Technology, Bhopal, India.
Dr. Binod Kumar, Lakshmi Narayan College of Tech.(LNCT)Bhopal India
Dr. Muzhir Shaban Al-Ani, Amman Arab University Amman – Jordan
Dr. T.C. Manjunath , ATRIA Institute of Tech, India
Mr. Muhammad Zakarya, COMSATS Institute of Information Technology (CIIT), Pakistan
Assist. Prof. Harmunish Taneja, M. M. University, India
Dr. Chitra Dhawale , SICSR, Model Colony, Pune, India
Mrs Sankari Muthukaruppan, Nehru Institute of Engineering and Technology, Anna University, India
Mr. Aaqif Afzaal Abbasi, National University Of Sciences And Technology, Islamabad
Prof. Ashutosh Kumar Dubey, Trinity Institute of Technology and Research Bhopal, India
Mr. G. Appasami, Dr. Pauls Engineering College, India
Mr. M Yasin, National University of Science and Tech, karachi (NUST), Pakistan
Mr. Yaser Miaji, University Utara Malaysia, Malaysia
Mr. Shah Ahsanul Haque, International Islamic University Chittagong (IIUC), Bangladesh
Prof. (Dr) Syed Abdul Sattar, Royal Institute of Technology & Science, India
Dr. S. Sasikumar, Roever Engineering College
Assist. Prof. Monit Kapoor, Maharishi Markandeshwar University, India
Mr. Nwaocha Vivian O, National Open University of Nigeria
Dr. M. S. Vijaya, GR Govindarajulu School of Applied Computer Technology, India
Assist. Prof. Chakresh Kumar, Manav Rachna International University, India
Mr. Kunal Chadha , R&D Software Engineer, Gemalto, Singapore
Mr. Mueen Uddin, Universiti Teknologi Malaysia, UTM , Malaysia
Dr. Dhuha Basheer abdullah, Mosul university, Iraq
Mr. S. Audithan, Annamalai University, India
Prof. Vijay K Chaudhari, Technocrats Institute of Technology , India
Associate Prof. Mohd Ilyas Khan, Technocrats Institute of Technology , India
Dr. Vu Thanh Nguyen, University of Information Technology, HoChiMinh City, VietNam
Assist. Prof. Anand Sharma, MITS, Lakshmangarh, Sikar, Rajasthan, India
Prof. T V Narayana Rao, HITAM Engineering college, Hyderabad
Mr. Deepak Gour, Sir Padampat Singhania University, India
Assist. Prof. Amutharaj Joyson, Kalasalingam University, India
Mr. Ali Balador, Islamic Azad University, Iran
Mr. Mohit Jain, Maharaja Surajmal Institute of Technology, India
Mr. Dilip Kumar Sharma, GLA Institute of Technology & Management, India
Dr. Debojyoti Mitra, Sir padampat Singhania University, India
Dr. Ali Dehghantanha, Asia-Pacific University College of Technology and Innovation, Malaysia
Mr. Zhao Zhang, City University of Hong Kong, China
Prof. S.P. Setty, A.U. College of Engineering, India
Prof. Patel Rakeshkumar Kantilal, Sankalchand Patel College of Engineering, India
Mr. Biswajit Bhowmik, Bengal College of Engineering & Technology, India
Mr. Manoj Gupta, Apex Institute of Engineering & Technology, India
Assist. Prof. Ajay Sharma, Raj Kumar Goel Institute Of Technology, India
Assist. Prof. Ramveer Singh, Raj Kumar Goel Institute of Technology, India
Dr. Hanan Elazhary, Electronics Research Institute, Egypt
Dr. Hosam I. Faiq, USM, Malaysia
Prof. Dipti D. Patil, MAEER’s MIT College of Engg. & Tech, Pune, India
Assist. Prof. Devendra Chack, BCT Kumaon engineering College Dwarahat Almora, India
Prof. Manpreet Singh, M. M. Engg. College, M. M. University, India
Assist. Prof. M. Sadiq ali Khan, University of Karachi, Pakistan
Mr. Prasad S. Halgaonkar, MIT - College of Engineering, Pune, India
Dr. Imran Ghani, Universiti Teknologi Malaysia, Malaysia
Prof. Varun Kumar Kakar, Kumaon Engineering College, Dwarahat, India
Assist. Prof. Nisheeth Joshi, Apaji Institute, Banasthali University, Rajasthan, India
Associate Prof. Kunwar S. Vaisla, VCT Kumaon Engineering College, India
Prof Anupam Choudhary, Bhilai School Of Engg.,Bhilai (C.G.),India
Mr. Divya Prakash Shrivastava, Al Jabal Al garbi University, Zawya, Libya
Associate Prof. Dr. V. Radha, Avinashilingam Deemed university for women, Coimbatore.
Dr. Kasarapu Ramani, JNT University, Anantapur, India
Dr. Anuraag Awasthi, Jayoti Vidyapeeth Womens University, India
Dr. C G Ravichandran, R V S College of Engineering and Technology, India
Dr. Mohamed A. Deriche, King Fahd University of Petroleum and Minerals, Saudi Arabia
Mr. Abbas Karimi, Universiti Putra Malaysia, Malaysia
Mr. Amit Kumar, Jaypee University of Engg. and Tech., India
Dr. Nikolai Stoianov, Defense Institute, Bulgaria
Assist. Prof. S. Ranichandra, KSR College of Arts and Science, Tiruchencode
Mr. T.K.P. Rajagopal, Diamond Horse International Pvt Ltd, India
Dr. Md. Ekramul Hamid, Rajshahi University, Bangladesh
Mr. Hemanta Kumar Kalita , TATA Consultancy Services (TCS), India
Dr. Messaouda Azzouzi, Ziane Achour University of Djelfa, Algeria
Prof. (Dr.) Juan Jose Martinez Castillo, "Gran Mariscal de Ayacucho" University and Acantelys research
Group, Venezuela
Dr. Jatinderkumar R. Saini, Narmada College of Computer Application, India
Dr. Babak Bashari Rad, University Technology of Malaysia, Malaysia
Dr. Nighat Mir, Effat University, Saudi Arabia
Prof. (Dr.) G.M.Nasira, Sasurie College of Engineering, India
Mr. Varun Mittal, Gemalto Pte Ltd, Singapore
Assist. Prof. Mrs P. Banumathi, Kathir College Of Engineering, Coimbatore
Assist. Prof. Quan Yuan, University of Wisconsin-Stevens Point, US
Dr. Pranam Paul, Narula Institute of Technology, Agarpara, West Bengal, India
Assist. Prof. J. Ramkumar, V.L.B Janakiammal college of Arts & Science, India
Mr. P. Sivakumar, Anna university, Chennai, India
Mr. Md. Humayun Kabir Biswas, King Khalid University, Kingdom of Saudi Arabia
Mr. Mayank Singh, J.P. Institute of Engg & Technology, Meerut, India
HJ. Kamaruzaman Jusoff, Universiti Putra Malaysia
Mr. Nikhil Patrick Lobo, CADES, India
Dr. Amit Wason, Rayat-Bahra Institute of Engineering & Boi-Technology, India
Dr. Rajesh Shrivastava, Govt. Benazir Science & Commerce College, Bhopal, India
Assist. Prof. Vishal Bharti, DCE, Gurgaon
Mrs. Sunita Bansal, Birla Institute of Technology & Science, India
Dr. R. Sudhakar, Dr.Mahalingam college of Engineering and Technology, India
Dr. Amit Kumar Garg, Shri Mata Vaishno Devi University, Katra(J&K), India
Assist. Prof. Raj Gaurang Tiwari, AZAD Institute of Engineering and Technology, India
Mr. Hamed Taherdoost, Tehran, Iran
Mr. Amin Daneshmand Malayeri, YRC, IAU, Malayer Branch, Iran
Mr. Shantanu Pal, University of Calcutta, India
Dr. Terry H. Walcott, E-Promag Consultancy Group, United Kingdom
Dr. Ezekiel U OKIKE, University of Ibadan, Nigeria
Mr. P. Mahalingam, Caledonian College of Engineering, Oman
Dr. Mahmoud M. A. Abd Ellatif, Mansoura University, Egypt
Prof. Kunwar S. Vaisla, BCT Kumaon Engineering College, India
Prof. Mahesh H. Panchal, Kalol Institute of Technology & Research Centre, India
Mr. Muhammad Asad, Technical University of Munich, Germany
Mr. AliReza Shams Shafigh, Azad Islamic university, Iran
Prof. S. V. Nagaraj, RMK Engineering College, India
Mr. Ashikali M Hasan, Senior Researcher, CelNet security, India
Dr. Adnan Shahid Khan, University Technology Malaysia, Malaysia
Mr. Prakash Gajanan Burade, Nagpur University/ITM college of engg, Nagpur, India
Dr. Jagdish B.Helonde, Nagpur University/ITM college of engg, Nagpur, India
Professor, Doctor BOUHORMA Mohammed, Univertsity Abdelmalek Essaadi, Morocco
Mr. K. Thirumalaivasan, Pondicherry Engg. College, India
Mr. Umbarkar Anantkumar Janardan, Walchand College of Engineering, India
Mr. Ashish Chaurasia, Gyan Ganga Institute of Technology & Sciences, India
Mr. Sunil Taneja, Kurukshetra University, India
Mr. Fauzi Adi Rafrastara, Dian Nuswantoro University, Indonesia
Dr. Yaduvir Singh, Thapar University, India
Dr. Ioannis V. Koskosas, University of Western Macedonia, Greece
Dr. Vasantha Kalyani David, Avinashilingam University for women, Coimbatore
Dr. Ahmed Mansour Manasrah, Universiti Sains Malaysia, Malaysia
Miss. Nazanin Sadat Kazazi, University Technology Malaysia, Malaysia
Mr. Saeed Rasouli Heikalabad, Islamic Azad University - Tabriz Branch, Iran
Assoc. Prof. Dhirendra Mishra, SVKM's NMIMS University, India
Prof. Shapoor Zarei, UAE Inventors Association, UAE
Prof. B.Raja Sarath Kumar, Lenora College of Engineering, India
Dr. Bashir Alam, Jamia millia Islamia, Delhi, India
Prof. Anant J Umbarkar, Walchand College of Engg., India
Assist. Prof. B. Bharathi, Sathyabama University, India
Dr. Fokrul Alom Mazarbhuiya, King Khalid University, Saudi Arabia
Prof. T.S.Jeyali Laseeth, Anna University of Technology, Tirunelveli, India
Dr. M. Balraju, Jawahar Lal Nehru Technological University Hyderabad, India
Dr. Vijayalakshmi M. N., R.V.College of Engineering, Bangalore
Prof. Walid Moudani, Lebanese University, Lebanon
Dr. Saurabh Pal, VBS Purvanchal University, Jaunpur, India
Associate Prof. Suneet Chaudhary, Dehradun Institute of Technology, India
Associate Prof. Dr. Manuj Darbari, BBD University, India
Ms. Prema Selvaraj, K.S.R College of Arts and Science, India
Assist. Prof. Ms.S.Sasikala, KSR College of Arts & Science, India
Mr. Sukhvinder Singh Deora, NC Institute of Computer Sciences, India
Dr. Abhay Bansal, Amity School of Engineering & Technology, India
Ms. Sumita Mishra, Amity School of Engineering and Technology, India
Professor S. Viswanadha Raju, JNT University Hyderabad, India
Mr. Asghar Shahrzad Khashandarag, Islamic Azad University Tabriz Branch, Iran
Mr. Manoj Sharma, Panipat Institute of Engg. & Technology, India
Mr. Shakeel Ahmed, King Faisal University, Saudi Arabia
Dr. Mohamed Ali Mahjoub, Institute of Engineer of Monastir, Tunisia
Mr. Adri Jovin J.J., SriGuru Institute of Technology, India
Dr. Sukumar Senthilkumar, Universiti Sains Malaysia, Malaysia
Mr. Rakesh Bharati, Dehradun Institute of Technology Dehradun, India
Mr. Shervan Fekri Ershad, Shiraz International University, Iran
Mr. Md. Safiqul Islam, Daffodil International University, Bangladesh
Mr. Mahmudul Hasan, Daffodil International University, Bangladesh
Prof. Mandakini Tayade, UIT, RGTU, Bhopal, India
Ms. Sarla More, UIT, RGTU, Bhopal, India
Mr. Tushar Hrishikesh Jaware, R.C. Patel Institute of Technology, Shirpur, India
Ms. C. Divya, Dr G R Damodaran College of Science, Coimbatore, India
Mr. Fahimuddin Shaik, Annamacharya Institute of Technology & Sciences, India
Dr. M. N. Giri Prasad, JNTUCE,Pulivendula, A.P., India
Assist. Prof. Chintan M Bhatt, Charotar University of Science And Technology, India
Prof. Sahista Machchhar, Marwadi Education Foundation's Group of institutions, India
Assist. Prof. Navnish Goel, S. D. College Of Enginnering & Technology, India
Mr. Khaja Kamaluddin, Sirt University, Sirt, Libya
Mr. Mohammad Zaidul Karim, Daffodil International, Bangladesh
Mr. M. Vijayakumar, KSR College of Engineering, Tiruchengode, India
Mr. S. A. Ahsan Rajon, Khulna University, Bangladesh
Dr. Muhammad Mohsin Nazir, LCW University Lahore, Pakistan
Mr. Mohammad Asadul Hoque, University of Alabama, USA
Mr. P.V.Sarathchand, Indur Institute of Engineering and Technology, India
Mr. Durgesh Samadhiya, Chung Hua University, Taiwan
Dr Venu Kuthadi, University of Johannesburg, Johannesburg, RSA
Dr. (Er) Jasvir Singh, Guru Nanak Dev University, Amritsar, Punjab, India
Mr. Jasmin Cosic, Min. of the Interior of Una-sana canton, B&H, Bosnia and Herzegovina
Dr S. Rajalakshmi, Botho College, South Africa
Dr. Mohamed Sarrab, De Montfort University, UK
Mr. Basappa B. Kodada, Canara Engineering College, India
Assist. Prof. K. Ramana, Annamacharya Institute of Technology and Sciences, India
Dr. Ashu Gupta, Apeejay Institute of Management, Jalandhar, India
Assist. Prof. Shaik Rasool, Shadan College of Engineering & Technology, India
Assist. Prof. K. Suresh, Annamacharya Institute of Tech & Sci. Rajampet, AP, India
Dr . G. Singaravel, K.S.R. College of Engineering, India
Dr B. G. Geetha, K.S.R. College of Engineering, India
Assist. Prof. Kavita Choudhary, ITM University, Gurgaon
Dr. Mehrdad Jalali, Azad University, Mashhad, Iran
Megha Goel, Shamli Institute of Engineering and Technology, Shamli, India
Mr. Chi-Hua Chen, Institute of Information Management, National Chiao-Tung University, Taiwan (R.O.C.)
Assoc. Prof. A. Rajendran, RVS College of Engineering and Technology, India
Assist. Prof. S. Jaganathan, RVS College of Engineering and Technology, India
Assoc. Prof. A S N Chakravarthy, Sri Aditya Engineering College, India
Assist. Prof. Deepshikha Patel, Technocrat Institute of Technology, India
Assist. Prof. Maram Balajee, GMRIT, India
Assist. Prof. Monika Bhatnagar, TIT, India
Prof. Gaurang Panchal, Charotar University of Science & Technology, India
Prof. Anand K. Tripathi, Computer Society of India
Prof. Jyoti Chaudhary, High Performance Computing Research Lab, India
Assist. Prof. Supriya Raheja, ITM University, India
Dr. Pankaj Gupta, Microsoft Corporation, U.S.A.
Assist. Prof. Panchamukesh Chandaka, Hyderabad Institute of Tech. & Management, India
Prof. Mohan H.S, SJB Institute Of Technology, India
Mr. Hossein Malekinezhad, Islamic Azad University, Iran
Mr. Zatin Gupta, Universti Malaysia, Malaysia
Assist. Prof. Amit Chauhan, Phonics Group of Institutions, India
Assist. Prof. Ajal A. J., METS School Of Engineering, India
Mrs. Omowunmi Omobola Adeyemo, University of Ibadan, Nigeria
Dr. Bharat Bhushan Agarwal, I.F.T.M. University, India
Md. Nazrul Islam, University of Western Ontario, Canada
Tushar Kanti, L.N.C.T, Bhopal, India
Er. Aumreesh Kumar Saxena, SIRTs College Bhopal, India
Mr. Mohammad Monirul Islam, Daffodil International University, Bangladesh
Dr. Kashif Nisar, University Utara Malaysia, Malaysia
Dr. Wei Zheng, Rutgers Univ/ A10 Networks, USA
Associate Prof. Rituraj Jain, Vyas Institute of Engg & Tech, Jodhpur – Rajasthan
Assist. Prof. Apoorvi Sood, I.T.M. University, India
Dr. Kayhan Zrar Ghafoor, University Technology Malaysia, Malaysia
Mr. Swapnil Soner, Truba Institute College of Engineering & Technology, Indore, India
Ms. Yogita Gigras, I.T.M. University, India
Associate Prof. Neelima Sadineni, Pydha Engineering College, India
Assist. Prof. K. Deepika Rani, HITAM, Hyderabad
Ms. Shikha Maheshwari, Jaipur Engineering College & Research Centre, India
Prof. Dr V S Giridhar Akula, Avanthi's Scientific Tech. & Research Academy, Hyderabad
Prof. Dr.S.Saravanan, Muthayammal Engineering College, India
Mr. Mehdi Golsorkhatabar Amiri, Islamic Azad University, Iran
Prof. Amit Sadanand Savyanavar, MITCOE, Pune, India
Assist. Prof. P. Oliver Jayaprakash, Anna University, Chennai
Assist. Prof. Ms. Sujata, ITM University, Gurgaon, India
Dr. Asoke Nath, St. Xavier's College, India
Mr. Masoud Rafighi, Islamic Azad University, Iran
Assist. Prof. RamBabu Pemula, NIMRA College of Engineering & Technology, India
Assist. Prof. Ms Rita Chhikara, ITM University, Gurgaon, India
Mr. Sandeep Maan, Government Post Graduate College, India
Prof. Dr. S. Muralidharan, Mepco Schlenk Engineering College, India
Associate Prof. T.V.Sai Krishna, QIS College of Engineering and Technology, India
Mr. R. Balu, Bharathiar University, Coimbatore, India
Assist. Prof. Shekhar. R, Dr.SM College of Engineering, India
Prof. P. Senthilkumar, Vivekanandha Institute of Engineering and Technology for Women, India
Mr. M. Kamarajan, PSNA College of Engineering & Technology, India
Dr. Angajala Srinivasa Rao, Jawaharlal Nehru Technical University, India
Assist. Prof. C. Venkatesh, A.I.T.S, Rajampet, India
Mr. Afshin Rezakhani Roozbahani, Ayatollah Boroujerdi University, Iran
Mr. Laxmi chand, SCTL, Noida, India
Dr. Abdul Hannan, Vivekanand College, Aurangabad
Prof. Mahesh Panchal, KITRC, Gujarat
Dr. A. Subramani, K.S.R. College of Engineering, Tiruchengode
Assist. Prof. Prakash M, Rajalakshmi Engineering College, Chennai, India
Assist. Prof. Akhilesh K Sharma, Sir Padampat Singhania University, India
Ms. Varsha Sahni, Guru Nanak Dev Engineering College, Ludhiana, India
Associate Prof. Trilochan Rout, NM Institute of Engineering and Technology, India
Mr. Srikanta Kumar Mohapatra, NMIET, Orissa, India
Mr. Waqas Haider Bangyal, Iqra University Islamabad, Pakistan
Dr. S. Vijayaragavan, Christ College of Engineering and Technology, Pondicherry, India
Prof. Elboukhari Mohamed, University Mohammed First, Oujda, Morocco
Dr. Muhammad Asif Khan, King Faisal University, Saudi Arabia
Dr. Nagy Ramadan Darwish Omran, Cairo University, Egypt.
Assistant Prof. Anand Nayyar, KCL Institute of Management and Technology, India
Mr. G. Premsankar, Ericsson, India
Assist. Prof. T. Hemalatha, VELS University, India
Prof. Tejaswini Apte, University of Pune, India
Dr. Edmund Ng Giap Weng, Universiti Malaysia Sarawak, Malaysia
Mr. Mahdi Nouri, Iran University of Science and Technology, Iran
Associate Prof. S. Asif Hussain, Annamacharya Institute of Technology & Sciences, India
Mrs. Kavita Pabreja, Maharaja Surajmal Institute (an affiliate of GGSIP University), India
Mr. Vorugunti Chandra Sekhar, DA-IICT, India
Mr. Muhammad Najmi Ahmad Zabidi, Universiti Teknologi Malaysia, Malaysia
Dr. Aderemi A. Atayero, Covenant University, Nigeria
Assist. Prof. Osama Sohaib, Balochistan University of Information Technology, Pakistan
Assist. Prof. K. Suresh, Annamacharya Institute of Technology and Sciences, India
Mr. Hassen Mohammed Abduallah Alsafi, International Islamic University Malaysia (IIUM) Malaysia
Mr. Robail Yasrab, Virtual University of Pakistan, Pakistan
Prof. Anand Nayyar, KCL Institute of Management and Technology, Jalandhar
Assoc. Prof. Vivek S Deshpande, MIT College of Engineering, India
Prof. K. Saravanan, Anna University Coimbatore, India
Dr. Ravendra Singh, MJP Rohilkhand University, Bareilly, India
Mr. V. Mathivanan, IBRA College of Technology, Sultanate of Oman
Assoc. Prof. S. Asif Hussain, AITS, India
Assist. Prof. C. Venkatesh, AITS, India
Mr. Sami Ulhaq, SZABIST Islamabad, Pakistan
Dr. B. Justus Rabi, Institute of Science & Technology, India
Mr. Anuj Kumar Yadav, Dehradun Institute of technology, India
Mr. Alejandro Mosquera, University of Alicante, Spain
Assist. Prof. Arjun Singh, Sir Padampat Singhania University (SPSU), Udaipur, India
Dr. Smriti Agrawal, JB Institute of Engineering and Technology, Hyderabad
Assist. Prof. Swathi Sambangi, Visakha Institute of Engineering and Technology, India
Ms. Prabhjot Kaur, Guru Gobind Singh Indraprastha University, India
Mrs. Samaher AL-Hothali, Yanbu University College, Saudi Arabia
Prof. Rajneeshkaur Bedi, MIT College of Engineering, Pune, India
Dr. Wei Zhang, Seattle, WA, USA
Mr. B. Santhosh Kumar, C S I College of Engineering, Tamil Nadu
Dr. K. Reji Kumar, N S S College, Pandalam, India
Assoc. Prof. K. Seshadri Sastry, EIILM University, India
Mr. Kai Pan, UNC Charlotte, USA
Mr. Ruikar Sachin, SGGSIET, India
Prof. (Dr.) Vinodani Katiyar, Sri Ramswaroop Memorial University, India
Assoc. Prof. M. Giri, Sreenivasa Institute of Technology and Management Studies, India
Assoc. Prof. Labib Francis Gergis, Misr Academy for Engineering and Technology (MET), Egypt
Assist. Prof. Amanpreet Kaur, ITM University, India
Assist. Prof. Anand Singh Rajawat, Shri Vaishnav Institute of Technology & Science, Indore
Mrs. Hadeel Saleh Haj Aliwi, Universiti Sains Malaysia (USM), Malaysia
Dr. Abhay Bansal, Amity University, India
Dr. Mohammad A. Mezher, Fahad Bin Sultan University, KSA
Assist. Prof. Nidhi Arora, M.C.A. Institute, India
Prof. Dr. P. Suresh, Karpagam College of Engineering, Coimbatore, India
Dr. Kannan Balasubramanian, Mepco Schlenk Engineering College, India
Dr. S. Sankara Gomathi, Panimalar Engineering college, India
Prof. Anil kumar Suthar, Gujarat Technological University, L.C. Institute of Technology, India
Assist. Prof. R. Hubert Rajan, Noorul Islam University, India
Assist. Prof. Dr. Jyoti Mahajan, College of Engineering & Technology
Assist. Prof. Homam Reda El-Taj, College of Network Engineering, Saudi Arabia & Malaysia
Mr. Bijan Paul, Shahjalal University of Science & Technology, Bangladesh
Assoc. Prof. Dr. Ch V Phani Krishna, KL University, India
Dr. Vishal Bhatnagar, Ambedkar Institute of Advanced Communication Technologies & Research, India
Dr. Lamri LAOUAMER, Al Qassim University, Dept. Info. Systems & European University of Brittany, Dept.
Computer Science, UBO, Brest, France
Prof. Ashish Babanrao Sasankar, G.H.Raisoni Institute Of Information Technology, India
Prof. Pawan Kumar Goel, Shamli Institute of Engineering and Technology, India
Mr. Ram Kumar Singh, S.V Subharti University, India
Assistant Prof. Sunish Kumar O S, Amaljyothi College of Engineering, India
Dr Sanjay Bhargava, Banasthali University, India
Mr. Pankaj S. Kulkarni, AVEW's Shatabdi Institute of Technology, India
Mr. Roohollah Etemadi, Islamic Azad University, Iran
Mr. Oloruntoyin Sefiu Taiwo, Emmanuel Alayande College Of Education, Nigeria
Mr. Sumit Goyal, National Dairy Research Institute, India
Mr Jaswinder Singh Dilawari, Geeta Engineering College, India
Prof. Raghuraj Singh, Harcourt Butler Technological Institute, Kanpur
Dr. S.K. Mahendran, Anna University, Chennai, India
Dr. Amit Wason, Hindustan Institute of Technology & Management, Punjab
Dr. Ashu Gupta, Apeejay Institute of Management, India
Assist. Prof. D. Asir Antony Gnana Singh, M.I.E.T Engineering College, India

International Journal of Computer Science and Information Security
January - December
ISSN: 1947-5500

The International Journal of Computer Science and Information Security (IJCSIS) is the premier
scholarly venue in the areas of computer science and information security. IJCSIS provides a high-profile,
leading-edge platform for researchers and engineers alike to publish state-of-the-art research in the
respective fields of information technology and communication security. The journal features a diverse
mixture of articles covering both core and applied computer science topics.

Authors are solicited to contribute to the special issue by submitting articles that illustrate research results,
projects, survey works and industrial experiences that describe significant advances in the following
areas (though submissions are not limited to them). Submissions may span a broad range of topics, e.g.:

Track A: Security

Access control, Anonymity, Audit and audit reduction & Authentication and authorization, Applied
cryptography, Cryptanalysis, Digital Signatures, Biometric security, Boundary control devices,
Certification and accreditation, Cross-layer design for security, Security & Network Management, Data and
system integrity, Database security, Defensive information warfare, Denial of service protection, Intrusion
Detection, Anti-malware, Distributed systems security, Electronic commerce, E-mail security, Spam,
Phishing, E-mail fraud, Virus, worm and Trojan protection, Grid security, Information hiding and
watermarking & Information survivability, Insider threat protection, Integrity,
Intellectual property protection, Internet/Intranet Security, Key management and key recovery, Language-
based security, Mobile and wireless security, Mobile, Ad Hoc and Sensor Network Security, Monitoring
and surveillance, Multimedia security, Operating system security, Peer-to-peer security, Performance
Evaluations of Protocols & Security Application, Privacy and data protection, Product evaluation criteria
and compliance, Risk evaluation and security certification, Risk/vulnerability assessment, Security &
Network Management, Security Models & protocols, Security threats & countermeasures (DDoS, MiM,
Session Hijacking, Replay attack, etc.), Trusted computing, Ubiquitous Computing Security, Virtualization
security, VoIP security, Web 2.0 security, Active Defense Systems, Adaptive
Defense Systems, Benchmark, Analysis and Evaluation of Security Systems, Distributed Access Control
and Trust Management, Distributed Attack Systems and Mechanisms, Distributed Intrusion
Detection/Prevention Systems, Denial-of-Service Attacks and Countermeasures, High Performance
Security Systems, Identity Management and Authentication, Implementation, Deployment and
Management of Security Systems, Intelligent Defense Systems, Internet and Network Forensics, Large-
scale Attacks and Defense, RFID Security and Privacy, Security Architectures in Distributed Network
Systems, Security for Critical Infrastructures, Security for P2P systems and Grid Systems, Security in E-
Commerce, Security and Privacy in Wireless Networks, Secure Mobile Agents and Mobile Code, Security
Protocols, Security Simulation and Tools, Security Theory and Tools, Standards and Assurance Methods,
Trusted Computing, Viruses, Worms, and Other Malicious Code, World Wide Web Security, Novel and
emerging secure architecture, Study of attack strategies, attack modeling, Case studies and analysis of
actual attacks, Continuity of Operations during an attack, Key management, Trust management, Intrusion
detection techniques, Intrusion response, alarm management, and correlation analysis, Study of tradeoffs
between security and system performance, Intrusion tolerance systems, Secure protocols, Security in
wireless networks (e.g. mesh networks, sensor networks, etc.), Cryptography and Secure Communications,
Computer Forensics, Recovery and Healing, Security Visualization, Formal Methods in Security, Principles
for Designing a Secure Computing System, Autonomic Security, Internet Security, Security in Health Care
Systems, Security Solutions Using Reconfigurable Computing, Adaptive and Intelligent Defense Systems,
Authentication and Access control, Denial of service attacks and countermeasures, Identity, Route and
Location Anonymity schemes, Intrusion detection and prevention techniques, Cryptography, encryption
algorithms and Key management schemes, Secure routing schemes, Secure neighbor discovery and
localization, Trust establishment and maintenance, Confidentiality and data integrity, Security architectures,
deployments and solutions, Emerging threats to cloud-based services, Security model for new services,
Cloud-aware web service security, Information hiding in Cloud Computing, Securing distributed data
storage in cloud, Security, privacy and trust in mobile computing systems and applications, Middleware
security & Security features: middleware software is an asset on
its own and has to be protected, interaction between security-specific and other middleware features, e.g.,
context-awareness, Middleware-level security monitoring and measurement: metrics and mechanisms
for quantification and evaluation of security enforced by the middleware, Security co-design: trade-off and
co-design between application-based and middleware-based security, Policy-based management:
innovative support for policy-based definition and enforcement of security concerns, Identification and
authentication mechanisms: Means to capture application specific constraints in defining and enforcing
access control rules, Middleware-oriented security patterns: identification of patterns for sound, reusable
security, Security in aspect-based middleware: mechanisms for isolating and enforcing security aspects,
Security in agent-based platforms: protection for mobile code and platforms, Smart Devices: Biometrics,
National ID cards, Embedded Systems Security and TPMs, RFID Systems Security, Smart Card Security,
Pervasive Systems: Digital Rights Management (DRM) in pervasive environments, Intrusion Detection and
Information Filtering, Localization Systems Security (Tracking of People and Goods), Mobile Commerce
Security, Privacy Enhancing Technologies, Security Protocols (for Identification and Authentication,
Confidentiality and Privacy, and Integrity), Ubiquitous Networks: Ad Hoc Networks Security, Delay-
Tolerant Network Security, Domestic Network Security, Peer-to-Peer Networks Security, Security Issues
in Mobile and Ubiquitous Networks, Security of GSM/GPRS/UMTS Systems, Sensor Networks Security,
Vehicular Network Security, Wireless Communication Security: Bluetooth, NFC, WiFi, WiMAX,
WiMedia, others

This Track will emphasize the design, implementation, management and applications of computer
communications, networks and services. Topics of a mostly theoretical nature are also welcome, provided
there is clear practical potential in applying the results of such work.

Track B: Computer Science

Broadband wireless technologies: LTE, WiMAX, WiRAN, HSDPA, HSUPA, Resource allocation and
interference management, Quality of service and scheduling methods, Capacity planning and dimensioning,
Cross-layer design and Physical layer based issue, Interworking architecture and interoperability, Relay
assisted and cooperative communications, Location and provisioning and mobility management, Call
admission and flow/congestion control, Performance optimization, Channel capacity modeling and analysis,
Middleware Issues: Event-based, publish/subscribe, and message-oriented middleware, Reconfigurable,
adaptable, and reflective middleware approaches, Middleware solutions for reliability, fault tolerance, and
quality-of-service, Scalability of middleware, Context-aware middleware, Autonomic and self-managing
middleware, Evaluation techniques for middleware solutions, Formal methods and tools for designing,
verifying, and evaluating middleware, Software engineering techniques for middleware, Service oriented
middleware, Agent-based middleware, Security middleware, Network Applications: Network-based
automation, Cloud applications, Ubiquitous and pervasive applications, Collaborative applications, RFID
and sensor network applications, Mobile applications, Smart home applications, Infrastructure monitoring
and control applications, Remote health monitoring, GPS and location-based applications, Networked
vehicles applications, Alert applications, Embedded Computer Systems, Advanced Control Systems, and
Intelligent Control: Advanced control and measurement, computer and microprocessor-based control,
signal processing, estimation and identification techniques, application-specific ICs, nonlinear and
adaptive control, optimal and robust control, intelligent control, evolutionary computing, and intelligent
systems, instrumentation subject to critical conditions, automotive, marine and aero-space control and all
other control applications, Intelligent Control System, Wiring/Wireless Sensor, Signal Control System.
Sensors, Actuators and Systems Integration : Intelligent sensors and actuators, multisensor fusion, sensor
array and multi-channel processing, micro/nano technology, microsensors and microactuators,
instrumentation electronics, MEMS and system integration, wireless sensor, Network Sensor, Hybrid
Sensor, Distributed Sensor Networks. Signal and Image Processing : Digital signal processing theory,
methods, DSP implementation, speech processing, image and multidimensional signal processing, Image
analysis and processing, Image and Multimedia applications, Real-time multimedia signal processing,
Computer vision, Emerging signal processing areas, Remote Sensing, Signal processing in education.
Industrial Informatics: Industrial applications of neural networks, fuzzy algorithms, Neuro-Fuzzy
application, bioInformatics, real-time computer control, real-time information systems, human-machine
interfaces, CAD/CAM/CAT/CIM, virtual reality, industrial communications, flexible manufacturing
systems, industrial automated process, Data Storage Management, Hard disk control, Supply Chain
Management, Logistics applications, Power plant automation, Drives automation. Information Technology,
Management of Information System : Management information systems, Information Management,
Nursing information management, Information System, Information Technology and their application, Data
retrieval, Data Base Management, Decision analysis methods, Information processing, Operations research,
E-Business, E-Commerce, E-Government, Computer Business, Security and risk management, Medical
imaging, Biotechnology, Bio-Medicine, Computer-based information systems in health care, Changing
Access to Patient Information, Healthcare Management Information Technology.
Communication/Computer Network, Transportation Application : On-board diagnostics, Active safety
systems, Communication systems, Wireless technology, Communication application, Navigation and
Guidance, Vision-based applications, Speech interface, Sensor fusion, Networking theory and technologies,
Transportation information, Autonomous vehicle, Vehicle application of affective computing, Advanced
Computing technology and their applications: Broadband and intelligent networks, Data Mining, Data
fusion, Computational intelligence, Information and data security, Information indexing and retrieval,
Information processing, Information systems and applications, Internet applications and performances,
Knowledge based systems, Knowledge management, Software Engineering, Decision making, Mobile
networks and services, Network management and services, Neural Network, Fuzzy logics, Neuro-Fuzzy,
Expert approaches, Innovation Technology and Management : Innovation and product development,
Emerging advances in business and its applications, Creativity in Internet management and retailing, B2B
and B2C management, Electronic transceiver device for Retail Marketing Industries, Facilities planning
and management, Innovative pervasive computing applications, Programming paradigms for pervasive
systems, Software evolution and maintenance in pervasive systems, Middleware services and agent
technologies, Adaptive, autonomic and context-aware computing, Mobile/Wireless computing systems and
services in pervasive computing, Energy-efficient and green pervasive computing, Communication
architectures for pervasive computing, Ad hoc networks for pervasive communications, Pervasive
opportunistic communications and applications, Enabling technologies for pervasive systems (e.g., wireless
BAN, PAN), Positioning and tracking technologies, Sensors and RFID in pervasive systems, Multimodal
sensing and context for pervasive applications, Pervasive sensing, perception and semantic interpretation,
Smart devices and intelligent environments, Trust, security and privacy issues in pervasive systems, User
interfaces and interaction models, Virtual immersive communications, Wearable computers, Standards and
interfaces for pervasive computing environments, Social and economic models for pervasive systems,
Active and Programmable Networks, Ad Hoc & Sensor Network, Congestion and/or Flow Control, Content
Distribution, Grid Networking, High-speed Network Architectures, Internet Services and Applications,
Optical Networks, Mobile and Wireless Networks, Network Modeling and Simulation, Multicast,
Multimedia Communications, Network Control and Management, Network Protocols, Network
Performance, Network Measurement, Peer to Peer and Overlay Networks, Quality of Service and Quality
of Experience, Ubiquitous Networks, Crosscutting Themes – Internet Technologies, Infrastructure,
Services and Applications; Open Source Tools, Open Models and Architectures; Security, Privacy and
Trust; Navigation Systems, Location Based Services; Social Networks and Online Communities; ICT
Convergence, Digital Economy and Digital Divide, Neural Networks, Pattern Recognition, Computer
Vision, Advanced Computing Architectures and New Programming Models, Visualization and Virtual
Reality as Applied to Computational Science, Computer Architecture and Embedded Systems, Technology
in Education, Theoretical Computer Science, Computing Ethics, Computing Practices & Applications

Authors are invited to submit papers through e-mail. Submissions must be original
and must not have been published previously or be under consideration for publication elsewhere while
being evaluated by IJCSIS. Before submission, authors should carefully read the journal's Author
Guidelines, which are located at .

ISSN 1947-5500
